Abstract
A Bayesian selective combination method is proposed for combining multiple neural networks in nonlinear dynamic process modelling. Instead of using fixed combination weights, the probability of a particular network being the true model is used as the combination weight for combining that network. The prior probability is calculated using the sum of squared errors of individual networks on a sliding window covering the most recent sampling times. A nearest neighbour method is used for estimating the network error for a given input data point, which is then used in calculating the combination weights for individual networks. Forward selection and backward elimination are used to select the individual networks to be combined. In forward selection, individual networks are gradually added into the aggregated network until the aggregated network error on the original training and testing data sets cannot be further reduced. In backward elimination, all the individual networks are initially aggregated and some of the individual networks are then gradually eliminated until the aggregated network error on the original training and testing data sets cannot be further reduced. Application results demonstrate that the proposed techniques can significantly improve model generalisation and perform better than aggregating all the individual networks.
Acknowledgements
This work was supported by University Science Malaysia (for Z. Ahmad) and UK EPSRC through the Grant GR/R10875 (for J. Zhang). The authors also thank the anonymous reviewers for their constructive comments which helped to improve the quality and presentation of this paper.
Cite this article
Ahmad, Z., Zhang, J. Bayesian selective combination of multiple neural networks for improving long-range predictions in nonlinear process modelling. Neural Comput & Applic 14, 78–87 (2005). https://doi.org/10.1007/s00521-004-0451-y