Statistics

THRESHOLD FOR HANDLING SEVERITY OF OVERDISPERSION IN SOME COUNT DATA MODELS USING A FUZZY SET APPROACH

2023-08-01T00:00:00Z

OYALADE, Abidemi Damaris

Overdispersion, often associated with count data, is difficult to handle with a single-parameter regression model such as the Poisson regression model. Previous attempts to modify the Poisson regression model with additional parameters did not take cognisance of the different levels of overdispersion, although modification may not always be needed. Modification done without need distorts the standard errors, leading to wrong conclusions. Therefore, this study was aimed at determining the threshold for modification in some count data models when the problem of overdispersion is unavoidable. Fuzzy 𝑐-partition was used to classify the degree of overdispersion severity into not severe, moderate, severe, and very severe. A membership function was constructed for each class over its range of the fuzzy dispersion percentage (𝑑): 0 for not severe with 𝑑 ≤ 10, (4𝑑 − 40)/210 for moderate with 10 < 𝑑 ≤ 40, 𝑑/70 for severe with 40 < 𝑑 ≤ 70, and 1 for very severe with 𝑑 > 70. The universal set of the dispersion percentage was 𝐷 = ((𝑣 − 𝑚)/𝑚) × 100%, where 𝑣 is the variance and 𝑚 the mean. Four models, Poisson (PO), Negative Binomial (NB), Com-Poisson (CP) and Generalised Poisson (GP), were used to simulate the benchmark for modification. Different random sample sizes, including 𝑛 = 20 for small samples and 𝑛 = 5000 for large samples, were used with mean (µ) = 0.01, 0.05, 1.00, 2.00 and variance (σ²) = 0.05, 0.50, 1.50, 2.50, respectively. The ratio of the residual deviance of PO (the simplest model) to its degrees of freedom was used to detect the presence of overdispersion in the count data. The averaging method was used to determine the threshold (𝐷̅).
The models were validated with monthly road crash data from the Federal Road Safety Corps for the 36 states and the Federal Capital Territory of Nigeria between 2014 and 2018, and the Akaike Information Criterion (AIC) was used for model selection. For 𝑛 = 20, the thresholds 𝐷̅ for PO, NB, CP and GP at the four (µ, σ²) settings were 24.2, 69.4, 34.8 and 32.6%; 26.6, 73.6, 26.5 and 27.1%; 23.1, 75.2, 25.1 and 37.1%; 30.4, 77.5, 54.9 and 24.5%, respectively. The highest 𝐷̅ across the values of µ and σ² for PO, NB, CP and GP when 𝑛 = 20 were 30.4, 77.5, 54.9 and 37.1%, respectively. For 𝑛 = 5000, the thresholds 𝐷̅ were 27.7, 74.9, 22.1 and 28.3%; 27.6, 74.5, 22.2 and 28.9%; 27.9, 38.2, 22.2 and 29.2%; 28.2, 29.1, 22.2 and 28.3%, respectively. The highest 𝐷̅ across the values of µ and σ² for PO, NB, CP and GP when 𝑛 = 5000 were 28.2, 74.9, 22.2 and 29.2%, respectively, indicating the points for modification. For the road crash data, the ratio of the residual deviance of PO to its degrees of freedom was 42.0, flagging very severe overdispersion (95.5%) with a membership function value of 1. The AIC values for PO, NB, CP and GP were 8826.7, 8657.6, 2211.0 and 2205.4, respectively, implying that GP was the best model. The thresholds for modification of severity of overdispersion for the Poisson, Negative Binomial, Com-Poisson and Generalised Poisson models were determined. The determined thresholds could be used to minimise wrong conclusions arising from defective standard errors.
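The piecewise membership function described in the abstract can be sketched as follows. This is an illustrative reading, not the author's code: the moderate branch is taken as (4𝑑 − 40)/210, which makes the function continuous at 𝑑 = 40, and the function names `membership` and `severity_class` are hypothetical.

```python
def membership(d: float) -> float:
    """Membership value for a dispersion percentage d (in percent).

    Assumption: the moderate branch is (4d - 40) / 210, so that it meets
    the severe branch d / 70 at d = 40.
    """
    if d <= 10:
        return 0.0                  # not severe
    if d <= 40:
        return (4 * d - 40) / 210   # moderate
    if d <= 70:
        return d / 70               # severe
    return 1.0                      # very severe


def severity_class(d: float) -> str:
    """Severity label for a dispersion percentage d, using the stated ranges."""
    if d <= 10:
        return "not severe"
    if d <= 40:
        return "moderate"
    if d <= 70:
        return "severe"
    return "very severe"


if __name__ == "__main__":
    # 95.5 is the dispersion percentage reported for the road crash data.
    for d in (5.0, 24.2, 40.0, 55.0, 95.5):
        print(f"d = {d:5.1f}%  {severity_class(d):12s}  membership = {membership(d):.3f}")
```

Under this reading, the road crash data (𝑑 = 95.5) fall in the very severe class with membership 1, in line with the value reported in the abstract.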

OVER-FITTING AMELIORATION IN BAYESIAN NEURAL NETWORK MODEL ESTIMATION USING HETEROGENEOUS ACTIVATION FUNCTIONS

2023-08-16T00:00:00Z

OGUNDUNMADE, Tayo Peter

A Neural Network (NN) allows complex nonlinear relationships between the response variable and its predictors. Deep NNs have made notable contributions across computer vision, reinforcement learning, speech recognition and natural language processing. Previous studies have obtained the parameters of NNs through the classical approach using Homogeneous Activation Functions (HOMAFs). However, a major setback of NNs estimated by the classical approach is their tendency to over-fit. Therefore, this study was aimed at developing a Bayesian NN (BNN) model to ameliorate over-fitting using Heterogeneous Activation Functions (HETAFs). A BNN model was developed with a Gaussian error distribution for the likelihood function and inverse gamma and inverse Wishart priors for the parameters, to obtain the BNN estimators. The HOMAFs (Rectified Linear Unit (ReLU), Sigmoid and Hyperbolic Tangent Sigmoid (TANSIG)) and HETAFs (Symmetric Saturated Linear Hyperbolic Tangent (SSLHT) and Symmetric Saturated Linear Hyperbolic Tangent Sigmoid (SSLHTS)) were used to activate the model parameters. The Bayesian approach was used to ameliorate the problem of over-fitting, while the Posterior Mean (PM), Posterior Standard Deviation (PSD) and Numerical Standard Error (NSE) were used to determine the estimators' sensitivity. The performance of the Bayesian estimators from each of the activation functions was evaluated in a Monte Carlo experiment using the Mean Square Error (MSE), Mean Absolute Error (MAE) and training error as metrics. The proximity of the MSE and training error values was used to assess the problem of over-fitting. The derived Bayesian estimators were β ∼ N(K_β, H_β) and γ ∼ exp(−½{F_γ + M_γ}), where K_β is the derived mean of β, H_β is the derived standard deviation of β, and F_γ and M_γ are the derived posteriors of γ.
For ReLU, the PM, PSD and NSE values for β and γ were 0.4755, 0.0646, 0.0020; and 0.2370, 0.0642, 0.0020, respectively; for Sigmoid: 0.4476, 0.2734, 0.0087; and 1.0269, 0.2732, 0.0086, respectively; for TANSIG: 0.4718, 0.0826, 0.0026; and 1.0239, 0.0822, 0.0026, respectively. For SSLHT, the PM, PSD and NSE values for β and γ were 0.8344, 0.0567, 0.0018; and 1.0242, 0.0566, 0.0016, respectively; and for SSLHTS: 0.89825, 0.01278, 0.0004; and 1.0236, 0.0127, 0.0003, respectively. The MSE, MAE and training error values for the performance of the activation functions were ReLU: 0.1631, 0.2465, 0.1522; Sigmoid: 0.1834, 0.2074, 0.1862; TANSIG: 0.1943, 0.269, 0.1813; SSLHT: 0.0714, 0.0131, 0.0667; and SSLHTS: 0.0322, 0.0339, 0.0328, respectively. The HETAFs showed closer proximity between MSE and training error, implying amelioration of over-fitting, and lower error values than the HOMAFs. The derived Bayesian neural network estimators ameliorated the problem of over-fitting, with close values of Mean Square Error and training error, thus making them more appropriate for handling Neural Network models. They could be used in solving problems in machine learning.
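The proximity between MSE and training error can be tabulated directly from the figures reported above. The sketch below is only an illustration: the abstract does not say how proximity was measured, so the absolute difference used here is an assumption, and the names `reported` and `generalisation_gap` are hypothetical.

```python
# (MSE, MAE, training error) as reported in the abstract for each activation function.
reported = {
    "ReLU": (0.1631, 0.2465, 0.1522),
    "Sigmoid": (0.1834, 0.2074, 0.1862),
    "TANSIG": (0.1943, 0.269, 0.1813),
    "SSLHT": (0.0714, 0.0131, 0.0667),
    "SSLHTS": (0.0322, 0.0339, 0.0328),
}


def generalisation_gap(mse: float, training_error: float) -> float:
    """Assumed proximity measure: absolute difference between MSE and training error."""
    return abs(mse - training_error)


gaps = {
    name: generalisation_gap(mse, train)
    for name, (mse, _mae, train) in reported.items()
}

if __name__ == "__main__":
    # List the activation functions from the smallest gap to the largest.
    for name, gap in sorted(gaps.items(), key=lambda item: item[1]):
        mse = reported[name][0]
        print(f"{name:8s} MSE = {mse:.4f}  |MSE - training error| = {gap:.4f}")
```

On this measure SSLHTS has the smallest gap as well as the smallest MSE among the five activation functions.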

GENERALISED MULTIVARIATE MIXTURE REGRESSION ESTIMATORS FOR THE POPULATION MEAN WITH MULTI – AUXILIARY CHARACTERISTICS IN MULTI-PHASE SAMPLING

2022-01-01T00:00:00Z

OLOGUNLEKO, EMMANUEL FEMI

Generalised Multivariate Regression Estimators (GMREs) with multi-auxiliary quantitative variables in multi-phase sampling have been used over time to estimate the population mean. These estimators are structurally complex and make use of multi-auxiliary quantitative variables only to produce minimum Mean Square Errors (MSEs). The minimum MSEs can be further reduced with the inclusion of multi-auxiliary qualitative variables. However, the existing estimators do not accommodate multi-auxiliary qualitative variables. Therefore, this study was designed to improve the efficiency of the estimators with multi-auxiliary characteristics in multi-phase sampling and to simplify the structurally complex estimators. A population of 𝑁 units, having 𝑌1, 𝑌2, … , 𝑌𝑝 study variables, with 𝑋1, 𝑋2, … , 𝑋𝑡 auxiliary variables and 𝑃1, 𝑃2, … , 𝑃𝑞 auxiliary attributes, was considered. The sample sizes of the ℎth and 𝑘th phases are 𝑛ₕ and 𝑛ₖ (𝑛ₖ < 𝑛ₕ), respectively. Different combinations of auxiliary attributes and variables were introduced into the generalised multivariate mixture regression, giving the Full Information Case (FIC), No Information Case (NIC), Partial Information Case-I (PIC-I), Partial Information Case-II (PIC-II) and Partial Information Case-III (PIC-III). The Improved Estimator Schema (IES) was introduced for the five estimators in order to simplify the structurally complex estimators. The analytical comparison of the MSEs in five sampling phases was used for the computation of the Percentage Relative Efficiency (PRE) of the estimators. Random deviates of size 𝑁 = 10000 following a normal distribution were used to study the asymptotic behaviour of the estimators.
Five samples of sizes 𝑛1, 𝑛2, 𝑛3, 𝑛4 and 𝑛5, with intervals (1233 ≤ 𝑛1 ≤ 3333), (542 ≤ 𝑛2 ≤ 1667), (361 ≤ 𝑛3 ≤ 1111), (271 ≤ 𝑛4 ≤ 833) and (45 ≤ 𝑛5 ≤ 139), respectively, were considered for the simulated populations. The estimators obtained for FIC, NIC, PIC-I, PIC-II and PIC-III were 𝑡39(1×𝑝), 𝑡40(1×𝑝), 𝑡41(1×𝑝), 𝑡42(1×𝑝) and 𝑡43(1×𝑝), respectively, which were the estimated population means for the multivariate mixture regression estimators in multi-phase sampling with (1 × 𝑝) dimensions. The existing GMREs produced three estimators, which were 𝑡36(1×𝑝), 𝑡37(1×𝑝) and 𝑡38(1×𝑝). The IES obtained for the FIC, NIC, PIC-I, PIC-II and PIC-III estimators, which simplified the structurally complex estimators for the multivariate mixture regression estimators in multi-phase sampling, were 𝛾𝑡39(1×𝑝), 𝛾𝑡40(1×𝑝), 𝛾𝑡41(1×𝑝), 𝛾𝑡42(1×𝑝) and 𝛾𝑡43(1×𝑝), respectively. The corresponding minimised MSEs for the estimators were 𝑀𝑆𝐸(𝑡39)𝑚𝑖𝑛 = 1.9556327, 𝑀𝑆𝐸(𝑡40)𝑚𝑖𝑛 = 2.2219481, 𝑀𝑆𝐸(𝑡41)𝑚𝑖𝑛 = 2.0966104, 𝑀𝑆𝐸(𝑡42)𝑚𝑖𝑛 = 2.1493192 and 𝑀𝑆𝐸(𝑡43)𝑚𝑖𝑛 = 2.2049730, while the corresponding minimised MSEs for the existing estimators were 𝑀𝑆𝐸(𝑡36)𝑚𝑖𝑛 = 1.9714285, 𝑀𝑆𝐸(𝑡37)𝑚𝑖𝑛 = 2.3846115 and 𝑀𝑆𝐸(𝑡38)𝑚𝑖𝑛 = 2.2130263. The proposed estimators had PRE values of 100.8%, 105.2%, 105.5%, 102.9% and 100.3% relative to the existing estimators, indicating that the proposed estimators were more efficient than the existing estimators. The FIC estimator was the most efficient estimator, while the NIC estimator was the least efficient. Among the partial information case estimators, the PIC-I estimator was conditionally more efficient than the PIC-II and PIC-III estimators, while the PIC-II estimator was more efficient than the PIC-III estimator. It was observed that the proposed FIC, NIC, PIC-I, PIC-II and PIC-III estimators were asymptotically more efficient.
The developed generalised multivariate mixture regression estimators with multi-auxiliary characteristics in multi-phase sampling were more efficient in the estimation of the population mean. The structurally complex estimators were simplified by the improved estimator schema.
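As a worked illustration of the efficiency comparison, the sketch below computes a Percentage Relative Efficiency from two of the reported minimum MSEs. It assumes the usual definition, PRE = 100 × MSE(existing) / MSE(proposed), and assumes that the FIC estimator 𝑡39 is compared with the existing estimator 𝑡36; the abstract does not state either explicitly, and the function name is hypothetical.

```python
def percentage_relative_efficiency(mse_existing: float, mse_proposed: float) -> float:
    """PRE of a proposed estimator over an existing one (assumed standard definition)."""
    if mse_existing <= 0 or mse_proposed <= 0:
        raise ValueError("mean square errors must be positive")
    return 100.0 * mse_existing / mse_proposed


# Minimum MSEs reported in the abstract.
mse_t36 = 1.9714285  # existing estimator
mse_t39 = 1.9556327  # proposed FIC estimator

# Assumed pairing: t39 (FIC) against t36.
pre_fic = percentage_relative_efficiency(mse_t36, mse_t39)

if __name__ == "__main__":
    print(f"PRE of t39 over t36: {pre_fic:.1f}%")
```

A PRE above 100% means the proposed estimator has the smaller MSE; under the assumed pairing the result agrees with the first PRE value reported above.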

FITTING AUTOREGRESSIVE INTEGRATED MOVING AVERAGE WITH EXOGENOUS VARIABLES MODEL ASSUMING LOGNORMAL ERROR TERM

2021-08-01T00:00:00Z

BELLO, ANDREW OJUTOMORI

The conventional Autoregressive Integrated Moving Average with Exogenous Variables (ARIMAX) model with normal error term and Multiple Linear Regression (MLR) require stringent assumptions of normality of the error term and stationarity of the series. These models have found widespread application in modelling multidimensional relationships among economic variables; however, these assumptions are often violated in practice, leading to spurious regression models with poor forecast performance. Thus, this study was designed to develop an ARIMAX model with lognormal error term capable of analysing time series data with reasonable forecast performance even when the assumptions are violated. The conventional ARIMAX (1, 0, 1) with normal error term was defined in terms of the lag operator B, where B𝑦ₜ = 𝑦ₜ₋₁; φ1 was the coefficient of the Autoregressive (AR) part, θ1 was the coefficient of the Moving Average (MA) part, β0 was the intercept and β1 was the slope of the regression part of the model. The proposed model was obtained by modifying the ARIMAX (p, d, q) to have a lognormal error term, where p is the order of the AR part, d is the order of differencing and q is the order of the MA part of the mixed model. The parameters were estimated using the maximum likelihood method. The choice of the lognormal error term was based on its asymmetry, which overcomes non-normality, and on its long tail and positive limit values, which overcome non-stationarity. The datasets used were monthly External Reserves (Million USD), Official Exchange Rate (Naira to USD), Crude Oil Export (Million Barrels per Day) and Crude Oil Price (USD per Barrel). One hundred and twenty (120) observations were used for the modelling process. The proposed ARIMAX (1, 0, 1) with lognormal error term ameliorated the violations of the normality and stationarity assumptions.
The performance of the proposed model was compared with that of the conventional ARIMAX (1, 1, 1) with normal error term and the MLR model. The Box-Jenkins time series procedure was used to model the ARIMAX (1, 1, 1) with normal error, and the Least Squares Estimator (LSE) technique was used for the MLR. The performance of the proposed model was assessed using the Akaike Information Criterion (AIC), Mean Square Forecast Error (MSFE) and loglikelihood (Loglik) values. The non-normal error function and the corresponding loglikelihood function, in which σ² is the variance, were obtained. All the series were found to be non-stationary and non-normally distributed. The Loglik values of the MLR, the conventional ARIMAX (1, 1, 1) with normal error and the proposed ARIMAX (1, 0, 1) with lognormal error term were -317.41, -240.23 and 1344.47, respectively; the AIC values were 5.36, 490.45 and -0.41, while the MSFE values were 12.41, 12.48 and 1.77. The proposed model had the highest Loglik value and the smallest AIC and MSFE values when compared with the conventional ARIMAX (1, 1, 1) with normal error and the MLR model. Hence, the proposed model was considered better. The autoregressive integrated moving average with exogenous variables model assuming a lognormal error term improved the capability of modelling time series data, with better forecast performance even when the assumptions of normality of the error term and stationarity of the series were violated.
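For orientation, the loglikelihood of independent lognormal errors can be sketched as below. This uses the textbook lognormal density with log-scale parameters µ and σ; it is not the loglikelihood function derived in the thesis, which is not reproduced in this abstract, and the function name `lognormal_loglik` and the sample values are hypothetical.

```python
import math
from typing import Iterable


def lognormal_loglik(errors: Iterable[float], mu: float, sigma: float) -> float:
    """Loglikelihood of i.i.d. lognormal errors (standard density, assumed form).

    Each error e > 0 contributes
        -ln(e) - ln(sigma) - 0.5 * ln(2 * pi) - (ln(e) - mu)^2 / (2 * sigma^2).
    """
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    total = 0.0
    for e in errors:
        if e <= 0:
            raise ValueError("lognormal errors must be strictly positive")
        z = (math.log(e) - mu) / sigma
        total += -math.log(e) - math.log(sigma) - 0.5 * math.log(2 * math.pi) - 0.5 * z * z
    return total


if __name__ == "__main__":
    # Hypothetical positive residuals, for illustration only.
    sample_errors = [0.5, 1.0, 2.0]
    print(lognormal_loglik(sample_errors, mu=0.0, sigma=1.0))
```

In a maximum likelihood fit, a function of this kind would be maximised jointly over the model coefficients and σ, with the errors computed from the fitted model at each step.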