Abstract
Combining models learned from multiple batches of data provides an alternative to the common practice of learning one model from all the available data (i.e., the data combination approach). This paper empirically examines the baseline behaviour of the model combination approach in this multiple-data-batches scenario. We find that model combination can lead to better performance even if the disjoint batches of data are drawn randomly from a larger sample, and we relate the relative performance of the two approaches to the learning curve of the classifier used.
The practical implication of our results is that one should consider using model combination rather than data combination, especially when multiple batches of data for the same task are readily available.
We also show empirically that the near-asymptotic performance of a single model on some classification tasks can be significantly improved by combining multiple models (derived from the same algorithm), provided the constituent models are substantially different and there is some regularity in the models for the combination method to exploit. Comparisons with known theoretical results are also provided.
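The contrast between the two approaches can be sketched in a few lines of code. The sketch below is illustrative only: it uses a toy one-dimensional nearest-class-mean classifier (not the learning algorithms evaluated in the paper) to compare data combination (one model trained on all the data) against model combination (one model per disjoint batch, combined by majority vote).

```python
import random
from collections import Counter

def train_nearest_mean(batch):
    """Toy classifier: store the mean feature value of each class."""
    sums, counts = {}, {}
    for x, y in batch:
        sums[y] = sums.get(y, 0.0) + x
        counts[y] = counts.get(y, 0) + 1
    return {y: sums[y] / counts[y] for y in sums}

def predict(model, x):
    # Assign x to the class whose mean is closest.
    return min(model, key=lambda y: abs(model[y] - x))

random.seed(0)
# Synthetic two-class data: class means 0 and 1, Gaussian noise.
data = [(random.gauss(c, 0.5), c) for c in (0, 1) for _ in range(300)]
random.shuffle(data)
train, test = data[:400], data[400:]

# Data combination: a single model learned from all training data.
single = train_nearest_mean(train)

# Model combination: one model per disjoint batch, majority vote.
batches = [train[i::4] for i in range(4)]
models = [train_nearest_mean(b) for b in batches]

def vote(x):
    return Counter(predict(m, x) for m in models).most_common(1)[0][0]

acc_single = sum(predict(single, x) == y for x, y in test) / len(test)
acc_vote = sum(vote(x) == y for x, y in test) / len(test)
print(f"data combination: {acc_single:.2f}  model combination: {acc_vote:.2f}")
```

For this simple classifier the two accuracies are close; the paper's point is that the gap between them depends on where the batch size falls on the classifier's learning curve, and that voting over sufficiently diverse models can exceed the single-model accuracy.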
Copyright information
© 1997 Springer-Verlag Berlin Heidelberg
Cite this paper
Ting, K.M., Low, B.T. (1997). Model combination in the multiple-data-batches scenario. In: van Someren, M., Widmer, G. (eds) Machine Learning: ECML-97. ECML 1997. Lecture Notes in Computer Science, vol 1224. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-62858-4_90
Print ISBN: 978-3-540-62858-3
Online ISBN: 978-3-540-68708-5