<?xml version="1.0" encoding="UTF-8"?>
<rdf:RDF xmlns="http://purl.org/rss/1.0/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:dc="http://purl.org/dc/elements/1.1/">
<channel rdf:about="http://hdl.handle.net/123456789/117">
<title>Statistics</title>
<link>http://hdl.handle.net/123456789/117</link>
<description/>
<items>
<rdf:Seq>
<rdf:li rdf:resource="http://hdl.handle.net/123456789/2153"/>
<rdf:li rdf:resource="http://hdl.handle.net/123456789/2151"/>
<rdf:li rdf:resource="http://hdl.handle.net/123456789/1863"/>
<rdf:li rdf:resource="http://hdl.handle.net/123456789/1800"/>
</rdf:Seq>
</items>
<dc:date>2026-04-04T12:02:38Z</dc:date>
</channel>
<item rdf:about="http://hdl.handle.net/123456789/2153">
<title>THRESHOLD FOR HANDLING SEVERITY OF OVERDISPERSION IN SOME COUNT DATA MODELS USING A FUZZY SET APPROACH</title>
<link>http://hdl.handle.net/123456789/2153</link>
<description>THRESHOLD FOR HANDLING SEVERITY OF OVERDISPERSION IN SOME COUNT DATA MODELS USING A FUZZY SET APPROACH
OYALADE, Abidemi Damaris
Overdispersion, often associated with count data, is difficult to handle with a single-parameter&#13;
regression model such as the Poisson regression model. Previous attempts to&#13;
modify the Poisson regression model with additional parameters did not take&#13;
cognisance of the different levels of overdispersion, even though modification is&#13;
not always needed. Unnecessary modification distorts the standard errors,&#13;
leading to wrong conclusions. Therefore, this study was aimed at determining the&#13;
threshold for modification in some count data models when the problem of&#13;
overdispersion is unavoidable.&#13;
Fuzzy &#119888;-partition was used to classify the degree of overdispersion severity into not&#13;
severe, moderate, severe, and very severe. A membership function was constructed for&#13;
each of the classes over its range of the fuzzy dispersion percentage (&#119889;): 0 for not severe with&#13;
&#119889; ≤ 10, (4&#119889; − 40)/210 for moderate with 10 &lt; &#119889; ≤ 40, &#119889;/70 for severe with 40 &lt; &#119889; ≤ 70,&#13;
and 1 for very severe with &#119889; &gt; 70. The universal set of the dispersion percentage was&#13;
&#119863; = ((&#119907; − &#119898;)/&#119898;) × 100%, where &#119907; is the variance and &#119898; the mean. Four models: Poisson&#13;
(PO), Negative Binomial (NB), Com-Poisson (CP), and Generalised Poisson (GP)&#13;
were used to simulate the benchmark for modification. Different random sample sizes,&#13;
including &#119899; = 20 for a small sample and &#119899; = 5000 for a large sample, were used with&#13;
mean (µ) = 0.01, 0.05, 1.00, 2.00 and variance (σ²) = 0.05, 0.50, 1.50, 2.50,&#13;
respectively. The ratio of the residual deviance of PO (the simplest model) to its degrees of&#13;
freedom was used to detect the presence of overdispersion in the count data. The&#13;
averaging method was used to determine the threshold (&#119863;̅). The models were&#13;
validated with monthly road crash data from the Federal Road Safety Corps for the 36&#13;
states and the Federal Capital Territory of Nigeria between 2014 and 2018, and the Akaike&#13;
Information Criterion (AIC) was used for model selection.&#13;
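The severity classes and membership values described above can be sketched in Python as follows. This is a minimal illustration, not the study's code: the function names dispersion_percentage and severity_membership are hypothetical, and the form D = ((v - m)/m) x 100% is assumed from the definition given in the text.

```python
def dispersion_percentage(mean: float, variance: float) -> float:
    """Dispersion percentage D = ((v - m) / m) * 100 (assumed form)."""
    if not mean > 0:
        raise ValueError("mean must be positive")
    return (variance - mean) / mean * 100.0


def severity_membership(d: float) -> tuple[str, float]:
    """Return the severity class and membership value for dispersion percentage d."""
    if d > 70:
        return ("very severe", 1.0)
    if d > 40:
        return ("severe", d / 70.0)
    if d > 10:
        return ("moderate", (4.0 * d - 40.0) / 210.0)
    return ("not severe", 0.0)


if __name__ == "__main__":
    # Hypothetical inputs: one of the (mean, variance) pairs listed above.
    d = dispersion_percentage(mean=2.0, variance=2.5)
    print(d, severity_membership(d))
```

The two middle branches meet at d = 40, where both (4d - 40)/210 and d/70 equal 4/7, so the membership value is continuous across the class boundaries.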
The thresholds &#119863;̅ for models PO, NB, CP and GP, given that &#119899; = 20, were 24.2, 69.4,&#13;
34.8 and 32.6%; 26.6, 73.6, 26.5 and 27.1%; 23.1, 75.2, 25.1 and 37.1%; 30.4, 77.5,&#13;
54.9 and 24.5%, respectively. The highest &#119863;̅ across the values of µ and σ² for PO,&#13;
NB, CP and GP when &#119899; = 20 were 30.4, 77.5, 54.9 and 37.1%, respectively. For &#119899; =&#13;
5000, &#119863;̅ were 27.7, 74.9, 22.1 and 28.3%; 27.6, 74.5, 22.2 and 28.9%; 27.9, 38.2,&#13;
22.2 and 29.2%; 28.2, 29.1, 22.2 and 28.3%, respectively. The highest &#119863;̅ across the&#13;
values of µ and σ² for PO, NB, CP and GP when &#119899; = 5000 were 28.2, 74.9, 22.2&#13;
and 29.2%, respectively, indicating points for modifications. For the road crash data, the ratio of the residual&#13;
deviance of PO to its degrees of freedom was 42.0, flagging very severe overdispersion&#13;
(95.5%) with a membership value of 1. The AIC values for PO, NB, CP&#13;
and GP were 8826.7, 8657.6, 2211.0 and 2205.4, respectively. This implies that GP is&#13;
the best model.&#13;
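The overdispersion check based on the ratio of the residual deviance to its degrees of freedom can be sketched as follows. This is a simplified illustration under a stated assumption: an intercept-only Poisson model is used, so the fitted mean is the sample mean and the degrees of freedom are n - 1. The function name poisson_dispersion_ratio and the sample data are hypothetical.

```python
import math


def poisson_dispersion_ratio(counts: list[int]) -> float:
    """Residual deviance / degrees of freedom for an intercept-only Poisson fit."""
    n = len(counts)
    if not n > 1:
        raise ValueError("need at least two observations")
    mu = sum(counts) / n  # fitted mean under the intercept-only assumption
    if not mu > 0:
        raise ValueError("sample mean must be positive")
    deviance = 0.0
    for y in counts:
        # The y * log(y / mu) term is taken as 0 when y == 0.
        log_term = y * math.log(y / mu) if y > 0 else 0.0
        deviance += 2.0 * (log_term - (y - mu))
    return deviance / (n - 1)


if __name__ == "__main__":
    # Hypothetical counts; a ratio well above 1 suggests overdispersion.
    print(poisson_dispersion_ratio([0, 0, 0, 12]))
```

A ratio near 1 is what the Poisson assumption of equal mean and variance would produce; values far above 1, such as the 42.0 reported for the road crash data, point to overdispersion.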
The thresholds for modification of severity of overdispersion for Poisson, Negative&#13;
Binomial, Com-Poisson, and Generalised Poisson models were determined. The&#13;
determined thresholds could be used to minimise wrong conclusions arising from&#13;
defective standard errors.
</description>
<dc:date>2023-08-01T00:00:00Z</dc:date>
</item>
<item rdf:about="http://hdl.handle.net/123456789/2151">
<title>OVER-FITTING AMELIORATION IN BAYESIAN NEURAL NETWORK MODEL ESTIMATION USING HETEROGENEOUS ACTIVATION FUNCTIONS</title>
<link>http://hdl.handle.net/123456789/2151</link>
<description>OVER-FITTING AMELIORATION IN BAYESIAN NEURAL NETWORK MODEL ESTIMATION USING HETEROGENEOUS ACTIVATION FUNCTIONS
OGUNDUNMADE, Tayo Peter
A Neural Network (NN) can model complex nonlinear relationships between the response&#13;
variable and its predictors. Deep NNs have made notable contributions across&#13;
computer vision, reinforcement learning, speech recognition and natural language&#13;
processing. Previous studies have obtained the parameters of NN through the classical approach using Homogeneous Activation Functions (HOMAFs). However, a&#13;
major setback of NN using the classical approach is its tendency to over-fit. Therefore, this study was aimed at developing a Bayesian NN (BNN) model to ameliorate&#13;
over-fitting using Heterogeneous Activation Functions (HETAFs).&#13;
A BNN model was developed with Gaussian error distribution for the likelihood&#13;
function; inverse gamma and inverse Wishart priors for the parameters, to obtain&#13;
the BNN estimators. The HOMAFs (Rectified Linear Unit (ReLU), Sigmoid and&#13;
Hyperbolic Tangent Sigmoid (TANSIG)) and HETAFs (Symmetric Saturated Linear Hyperbolic Tangent (SSLHT) and Symmetric Saturated Linear Hyperbolic Tangent Sigmoid (SSLHTS)) were used to activate the model parameters. The Bayesian&#13;
approach was used to ameliorate the problem of over-fitting, while the Posterior&#13;
Mean (PM), Posterior Standard Deviation (PSD) and Numerical Standard Error&#13;
(NSE) were used to determine the estimators’ sensitivity. The performance of the&#13;
Bayesian estimators from each of the activation functions was evaluated in the&#13;
Monte Carlo experiment using the Mean Square Error (MSE), Mean Absolute Error (MAE) and training error as metrics. The proximity of the MSE and training error&#13;
values was used to assess the extent of over-fitting.&#13;
The derived Bayesian estimators were β ∼ N(K_β, H_β) and γ ∼ exp(−½{F_γ + M_γ}),&#13;
where K_β is the derived mean of β and H_β is the derived standard deviation of β; F_γ and&#13;
M_γ are the derived posteriors of γ. For ReLU, the PM, PSD and NSE values for&#13;
β and γ were 0.4755, 0.0646, 0.0020; and 0.2370, 0.0642, 0.0020, respectively; for&#13;
Sigmoid: 0.4476, 0.2734, 0.0087; and 1.0269, 0.2732, 0.0086, respectively; for TANSIG: 0.4718, 0.0826, 0.0026, and 1.0239, 0.0822, 0.0026, respectively. For SSLHT,&#13;
the PM, PSD and NSE values for β and γ were 0.8344, 0.0567, 0.0018; and 1.0242,&#13;
0.0566, 0.0016, respectively; and for SSLHTS: 0.89825, 0.01278, 0.0004; and 1.0236,&#13;
0.0127, 0.0003, respectively. The MSE, MAE and training error values for the performance of the activation functions were ReLU: 0.1631, 0.2465, 0.1522; Sigmoid:&#13;
0.1834, 0.2074, 0.1862; TANSIG: 0.1943, 0.269, 0.1813; SSLHT: 0.0714, 0.0131,&#13;
0.0667; and SSLHTS: 0.0322, 0.0339, 0.0328, respectively. The HETAFs showed&#13;
closer proximity between MSE and training error, implying amelioration of over-fitting, and minimum error values compared with the HOMAFs.&#13;
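The proximity comparison can be illustrated with a short Python sketch that computes the absolute gap between the MSE and the training error for each activation function, using only the values reported above. The name generalisation_gaps is hypothetical, and the absolute difference is an assumed way of measuring proximity; the study may have used a different criterion.

```python
# (MSE, training error) pairs as reported in the abstract.
reported = {
    "ReLU": (0.1631, 0.1522),
    "Sigmoid": (0.1834, 0.1862),
    "TANSIG": (0.1943, 0.1813),
    "SSLHT": (0.0714, 0.0667),
    "SSLHTS": (0.0322, 0.0328),
}


def generalisation_gaps(results: dict[str, tuple[float, float]]) -> dict[str, float]:
    """Absolute difference between MSE and training error per activation function."""
    return {name: abs(mse - train) for name, (mse, train) in results.items()}


gaps = generalisation_gaps(reported)
# List the activation functions from the smallest gap to the largest.
for name, gap in sorted(gaps.items(), key=lambda item: item[1]):
    print(f"{name}: {gap:.4f}")
```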
The derived Bayesian neural network estimators ameliorated the problem of over-fitting with close values of Mean Square Error and training error, thus making&#13;
them more appropriate in handling Neural Network models. They could be used&#13;
in solving problems in machine learning.
</description>
<dc:date>2023-08-16T00:00:00Z</dc:date>
</item>
<item rdf:about="http://hdl.handle.net/123456789/1863">
<title>GENERALISED MULTIVARIATE MIXTURE REGRESSION ESTIMATORS FOR THE POPULATION MEAN WITH MULTI – AUXILIARY CHARACTERISTICS IN MULTI-PHASE SAMPLING</title>
<link>http://hdl.handle.net/123456789/1863</link>
<description>GENERALISED MULTIVARIATE MIXTURE REGRESSION ESTIMATORS FOR THE POPULATION MEAN WITH MULTI – AUXILIARY CHARACTERISTICS IN MULTI-PHASE SAMPLING
OLOGUNLEKO, EMMANUEL FEMI
Generalised Multivariate Regression Estimators (GMREs) with multi-auxiliary&#13;
quantitative variables in multi-phase sampling have been used over time to estimate&#13;
the population mean. These estimators are structurally complex and make use of multi-auxiliary quantitative variables only to produce minimum Mean Square Errors&#13;
(MSEs). The minimum MSEs can be further reduced with the inclusion of multi-auxiliary qualitative variables. However, the existing estimators do not accommodate&#13;
multi-auxiliary qualitative variables. Therefore, this study was designed to improve the&#13;
efficiency of the estimators with multi-auxiliary characteristics in multi-phase&#13;
sampling and to simplify the structurally complex estimators.&#13;
A population of &#119873; units, having &#119884;1, &#119884;2, … , &#119884;&#119901; study variables, with &#119883;1, &#119883;2, … , &#119883;&#119905;&#13;
auxiliary variables and &#119875;1, &#119875;2, … , &#119875;&#119902; auxiliary attributes was considered. The &#119899;ℎ and&#13;
&#119899;&#119896; (&#119899;&#119896; &lt; &#119899;ℎ) are the sample sizes of the ℎ&#119905;ℎ and &#119896;&#119905;ℎ phases, respectively. Different&#13;
auxiliary attributes and variables were introduced to the generalised multivariate&#13;
mixture regression which included Full Information Case (FIC), No Information Case&#13;
(NIC), Partial Information Case-I (PIC-I), Partial Information Case-II (PIC-II) and&#13;
Partial Information Case-III (PIC-III). The Improved Estimator Schema (IES) was&#13;
introduced for the five estimators, in order to simplify the structurally complex&#13;
estimators. The analytical comparison of the MSEs in five sampling phases was used&#13;
for the computation of the Percentage Relative Efficiency (PRE) of the estimators.&#13;
Random deviates of size &#119873; = 10000 following a normal distribution were used to study&#13;
the asymptotic behaviour of the estimators. Five samples of&#13;
sizes &#119899;1, &#119899;2, &#119899;3, &#119899;4 and &#119899;5, with intervals (1233 ≤ &#119899;1 ≤ 3333), (542 ≤ &#119899;2 ≤&#13;
1667), (361 ≤ &#119899;3 ≤ 1111), (271 ≤ &#119899;4 ≤ 833) and (45 ≤ &#119899;5 ≤ 139), were&#13;
considered for the simulated populations, respectively.&#13;
The estimators obtained for FIC, NIC, PIC-I, PIC-II, and PIC-III were &#119905;39(1×&#119901;),&#13;
&#119905;40(1×&#119901;), &#119905;41(1×&#119901;), &#119905;42(1×&#119901;) and &#119905;43(1×&#119901;), respectively, which were the estimated&#13;
population means for the multivariate mixture regression estimators in multi-phase&#13;
sampling with (1 × &#119901;) dimensions. The existing GMREs produced three estimators,&#13;
which were &#119905;36(1×&#119901;), &#119905;37(1×&#119901;) and &#119905;38(1×&#119901;). The IES obtained for FIC, NIC, PIC-I,&#13;
PIC-II, and PIC-III estimators which simplified the structurally complex estimators for&#13;
the multivariate mixture regression estimators in multi-phase sampling were &#120574;&#119905;39(1×&#119901;),&#13;
&#120574;&#119905;40(1×&#119901;), &#120574;&#119905;41(1×&#119901;), &#120574;&#119905;42(1×&#119901;) and &#120574;&#119905;43(1×&#119901;), respectively. The corresponding minimised&#13;
MSEs for the estimators were &#119872;&#119878;&#119864;(&#119905;39)&#119898;&#119894;&#119899; = 1.9556327, &#119872;&#119878;&#119864;(&#119905;40)&#119898;&#119894;&#119899; =&#13;
2.2219481, &#119872;&#119878;&#119864;(&#119905;41)&#119898;&#119894;&#119899; = 2.0966104, &#119872;&#119878;&#119864;(&#119905;42)&#119898;&#119894;&#119899; = 2.1493192 and&#13;
&#119872;&#119878;&#119864;(&#119905;43)&#119898;&#119894;&#119899; = 2.2049730, while the corresponding minimised MSEs for the&#13;
existing estimators were &#119872;&#119878;&#119864;(&#119905;36)&#119898;&#119894;&#119899; = 1.9714285, &#119872;&#119878;&#119864;(&#119905;37)&#119898;&#119894;&#119899; = 2.3846115&#13;
and &#119872;&#119878;&#119864;(&#119905;38)&#119898;&#119894;&#119899; = 2.2130263. The proposed estimators have 100.8%, 105.2%,&#13;
105.5%, 102.9%, and 100.3% PRE values over the existing estimators, indicating&#13;
that the proposed estimators were more efficient than the existing estimators. The FIC&#13;
estimator was the most efficient estimator, while the NIC estimator was the least&#13;
efficient estimator. Among the partial information case estimators, the PIC-I estimator&#13;
was conditionally more efficient than PIC-II estimator and PIC-III estimator, while the&#13;
PIC-II estimator was more efficient than PIC-III estimator. It was observed that the&#13;
proposed FIC, NIC, PIC-I, PIC-II and PIC-III estimators were asymptotically more&#13;
efficient.&#13;
The developed generalised multivariate mixture regression estimators with multi-auxiliary characteristics in multi-phase sampling were more efficient in the estimation&#13;
of the population mean. The structurally complex estimators were simplified by the&#13;
improved estimator schema.
</description>
<dc:date>2022-01-01T00:00:00Z</dc:date>
</item>
<item rdf:about="http://hdl.handle.net/123456789/1800">
<title>FITTING AUTOREGRESSIVE INTEGRATED MOVING AVERAGE WITH EXOGENOUS VARIABLES MODEL ASSUMING LOGNORMAL ERROR TERM</title>
<link>http://hdl.handle.net/123456789/1800</link>
<description>FITTING AUTOREGRESSIVE INTEGRATED MOVING AVERAGE WITH EXOGENOUS VARIABLES MODEL ASSUMING LOGNORMAL ERROR TERM
BELLO, ANDREW OJUTOMORI
The conventional Autoregressive Integrated Moving Average with Exogenous Variables&#13;
(arimax) model with Normal Error term and Multiple Linear Regression (MLR) require&#13;
stringent assumptions of normality of the error term and stationarity of the series. These models&#13;
are widely applied to multidimensional relationships among economic&#13;
variables; however, these assumptions are often violated in practice, leading to spurious regression&#13;
models with poor forecast performance. Thus, this study was designed to develop an arimax&#13;
model with Lognormal Error term capable of analysing time series data with reasonable forecast&#13;
performance even when the assumptions are violated.&#13;
The conventional arimax (1, 0, 1) model with normal error term was defined using the lag operator B, where By_t = y_{t−1}; the parameter φ1 was the&#13;
coefficient of the Autoregressive model (AR), θ1 was the coefficient of Moving Average&#13;
(MA), β0 was the intercept and β1 was the slope of the Regression part of the model. The&#13;
proposed model was obtained by modifying the arimax (p, d, q) model to have a lognormal error term,&#13;
where p is the order of the AR part, d is the order of differencing and q is the order of the MA part of the mixed&#13;
model. The parameters were estimated using the maximum likelihood method. The&#13;
lognormal error term was chosen because its asymmetry overcomes non-normality, while&#13;
its long tail and positive limit values overcome non-stationarity. The datasets used&#13;
were monthly External Reserves (Million USD), Official Exchange Rate (Naira to USD),&#13;
Crude Oil Export (Million Barrel per Day) and Crude Oil Price (USD per Barrel). One&#13;
hundred and twenty (120) observations were used for the modeling process. The proposed&#13;
arimax (1, 0, 1) model with lognormal error term ameliorated violations of the normality and stationarity&#13;
assumptions. Its performance was compared with that of the conventional arimax (1, 1,&#13;
1) model with normal error term and the MLR model. The Box-Jenkins time series procedure was used to&#13;
fit the arimax (1, 1, 1) model with normal error, and the Least Squares Estimator (LSE) technique was used to&#13;
fit the MLR model. The performance of the proposed model was assessed using the Akaike Information&#13;
Criterion (AIC), Mean Square Forecast Error (MSFE) and Loglikelihood (Loglik) values.&#13;
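Two of the comparison criteria named above can be sketched in Python using their standard definitions: AIC = 2k - 2 x loglikelihood, and MSFE as the mean of the squared forecast errors. The function names aic and msfe are hypothetical, and the number of estimated parameters k is an input that the abstract does not state.

```python
def aic(loglik: float, n_params: int) -> float:
    """Akaike Information Criterion: 2k - 2 * loglikelihood."""
    return 2.0 * n_params - 2.0 * loglik


def msfe(actual: list[float], forecast: list[float]) -> float:
    """Mean Square Forecast Error over paired actual and forecast values."""
    if len(actual) != len(forecast) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(a - f) ** 2 for a, f in zip(actual, forecast)]
    return sum(squared_errors) / len(squared_errors)


if __name__ == "__main__":
    # Hypothetical hold-out values and forecasts, for illustration only.
    print(msfe([1.0, 2.0, 3.0], [1.0, 2.0, 5.0]))
    # Hypothetical parameter count k = 5; the abstract does not report k.
    print(aic(loglik=-240.23, n_params=5))
```

Under both criteria a smaller value indicates a better model, whereas a larger loglikelihood is preferred.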
The non-normal error function and the corresponding loglikelihood function were obtained,&#13;
where σ² is the variance. All the series were found to be non-stationary and non-normally&#13;
distributed. The Loglik values of MLR, conventional arimax (1, 1, 1) with normal error and&#13;
proposed arimax (1, 0, 1) with lognormal error term were -317.41, -240.23 and 1344.47; AIC&#13;
values were 5.36, 490.45 and -0.41 while MSFE values were 12.41, 12.48 and 1.77. The&#13;
proposed model has the highest Loglik value, smallest AIC and smallest MSFE values when&#13;
compared with conventional arimax (1, 1, 1) with normal error and MLR model. Hence, the&#13;
proposed model was considered better.&#13;
The autoregressive integrated moving average with exogenous variables assuming lognormal&#13;
error term improved the capability of modeling time series data with better forecast&#13;
performance even when the assumptions of normality of error term and stationarity of series&#13;
were violated.
</description>
<dc:date>2021-08-01T00:00:00Z</dc:date>
</item>
</rdf:RDF>
