<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:dc="http://purl.org/dc/elements/1.1/" version="2.0">
<channel>
<title>OVER-FITTING AMELIORATION IN BAYESIAN NEURAL NETWORK MODEL ESTIMATION USING HETEROGENEOUS ACTIVATION FUNCTIONS</title>
<link>http://hdl.handle.net/123456789/2150</link>
<description/>
<pubDate>Sat, 04 Apr 2026 22:51:55 GMT</pubDate>
<dc:date>2026-04-04T22:51:55Z</dc:date>
<item>
<title>OVER-FITTING AMELIORATION IN BAYESIAN NEURAL NETWORK MODEL ESTIMATION USING HETEROGENEOUS ACTIVATION FUNCTIONS</title>
<link>http://hdl.handle.net/123456789/2151</link>
<description>OVER-FITTING AMELIORATION IN BAYESIAN NEURAL NETWORK MODEL ESTIMATION USING HETEROGENEOUS ACTIVATION FUNCTIONS
OGUNDUNMADE, Tayo Peter
Neural Networks (NNs) allow complex nonlinear relationships between the response
variable and its predictors. Deep NNs have made notable contributions across
computer vision, reinforcement learning, speech recognition and natural language
processing. Previous studies have obtained the parameters of NNs through the classical approach using Homogeneous Activation Functions (HOMAFs). However, a
major setback of the classical approach to NNs is its tendency to over-fit. Therefore, this study aimed to develop a Bayesian NN (BNN) model that ameliorates
over-fitting using Heterogeneous Activation Functions (HETAFs).
A BNN model was developed with a Gaussian error distribution for the likelihood
function, and inverse gamma and inverse Wishart priors for the parameters, to obtain
the BNN estimators. The HOMAFs (Rectified Linear Unit (ReLU), Sigmoid and
Hyperbolic Tangent Sigmoid (TANSIG)) and the HETAFs (Symmetric Saturated Linear Hyperbolic Tangent (SSLHT) and Symmetric Saturated Linear Hyperbolic Tangent Sigmoid (SSLHTS)) were used to activate the model parameters.
The Bayesian approach was used to ameliorate the problem of over-fitting, while the Posterior
Mean (PM), Posterior Standard Deviation (PSD) and Numerical Standard Error
(NSE) were used to assess the estimators’ sensitivity. The performance of the
Bayesian estimators from each activation function was evaluated in a
Monte Carlo experiment using the Mean Square Error (MSE), Mean Absolute Error (MAE) and training error as metrics. The proximity of the MSE and training error
values was used to generalise about the problem of over-fitting.
The derived Bayesian estimators were β ∼ N(Kβ, Hβ) and γ ∼ exp(−½{Fγ + Mγ}),
where Kβ is the derived mean of β, Hβ is the derived standard deviation of β, and
Fγ and Mγ are the derived posteriors of γ. For ReLU, the PM, PSD and NSE values for
β and γ were 0.4755, 0.0646, 0.0020; and 0.2370, 0.0642, 0.0020, respectively; for&#13;
Sigmoid: 0.4476, 0.2734, 0.0087; and 1.0269, 0.2732, 0.0086, respectively; for TANSIG: 0.4718, 0.0826, 0.0026, and 1.0239, 0.0822, 0.0026, respectively. For SSLHT,&#13;
the PM, PSD and NSE values for β and γ were 0.8344, 0.0567, 0.0018; and 1.0242,&#13;
0.0566, 0.0016, respectively; and for SSLHTS: 0.89825, 0.01278, 0.0004; and 1.0236,&#13;
0.0127, 0.0003, respectively. The MSE, MAE and training error values for the performance of the activation functions were ReLU: 0.1631, 0.2465, 0.1522; Sigmoid:
0.1834, 0.2074, 0.1862; TANSIG: 0.1943, 0.269, 0.1813; SSLHT: 0.0714, 0.0131,&#13;
0.0667; and SSLHTS: 0.0322, 0.0339, 0.0328, respectively. The HETAFs showed
closer proximity between the MSE and training error values, implying amelioration of over-fitting, and lower error values than the HOMAFs.
The derived Bayesian neural network estimators ameliorated the problem of over-fitting, with close values of Mean Square Error and training error, thus making
them more appropriate for handling Neural Network models. They could be used
in solving machine learning problems.
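The over-fitting criterion used above, the proximity of MSE to training error, can be sketched alongside the named HOMAFs. The exact SSLHT/SSLHTS forms are not given in this abstract, so the heterogeneous function below is an illustrative assumption only, not the study's definition:

```python
import numpy as np

# Standard homogeneous activation functions (HOMAFs) named in the abstract.
def relu(x):
    return np.maximum(0.0, x)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def tansig(x):
    # Hyperbolic Tangent Sigmoid (TANSIG).
    return np.tanh(x)

# ASSUMPTION: a hypothetical saturated-linear/tanh composition standing in
# for SSLHT, whose exact form the abstract does not specify.
def sslht_sketch(x):
    return np.clip(np.tanh(x), -1.0, 1.0)

def overfit_gap(mse, training_error):
    # The abstract judges over-fitting by the proximity of MSE and
    # training error: a smaller gap suggests less over-fitting.
    return abs(mse - training_error)

# Gaps computed from the reported ReLU (HOMAF) and SSLHTS (HETAF) results.
print(round(overfit_gap(0.1631, 0.1522), 4))  # ReLU gap: 0.0109
print(round(overfit_gap(0.0322, 0.0328), 4))  # SSLHTS gap: 0.0006
```

Under this reading, the HETAF gap is an order of magnitude smaller than the ReLU gap, which is the sense in which "closer proximity" implies amelioration of over-fitting.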
</description>
<pubDate>Wed, 16 Aug 2023 00:00:00 GMT</pubDate>
<guid isPermaLink="false">http://hdl.handle.net/123456789/2151</guid>
<dc:date>2023-08-16T00:00:00Z</dc:date>
</item>
</channel>
</rss>
