Abstract
InfoBoost is a boosting algorithm that improves the performance of the master hypothesis whenever each weak hypothesis brings non-zero mutual information about the target. We make the somewhat surprising observation that InfoBoost can be viewed as an algorithm for growing a branching program that repeatedly divides and merges the domain. We generalize the merging process and propose a new class of boosting algorithms, called BP.InfoBoost, parameterized by the merging scheme. BP.InfoBoost assigns to each node a weight as well as a weak hypothesis, and the master hypothesis is a threshold function of the sum of the weights over the path induced by a given instance. InfoBoost is the BP.InfoBoost with the extreme scheme that merges all nodes in each round. The other extreme, which merges no nodes, yields an algorithm for growing a decision tree; we call this version DT.InfoBoost. We give evidence that DT.InfoBoost improves the master hypothesis very efficiently, but it carries a risk of overfitting because the size of the master hypothesis may grow exponentially. We propose a merging scheme between these two extremes that improves the master hypothesis nearly as fast as the scheme without merging while keeping the branching program at a moderate size.
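To make the construction concrete, the following is a minimal sketch (in Python, not from the paper) of how a BP.InfoBoost master hypothesis could be evaluated: follow the path that a given instance induces through the branching program, sum the weights stored at the visited nodes, and threshold the sum. The class and function names (Node, master_hypothesis) and the representation of instances and branch labels are illustrative assumptions, not the authors' notation.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional


@dataclass
class Node:
    """One node of the branching program (illustrative representation)."""
    weight: float                                                # weight assigned to this node
    weak_hypothesis: Optional[Callable[[dict], int]] = None      # maps an instance to a branch label
    children: Dict[int, "Node"] = field(default_factory=dict)    # branch label -> child node


def master_hypothesis(root: Node, instance: dict, threshold: float = 0.0) -> int:
    """Threshold the sum of node weights along the path induced by `instance`."""
    total, node = 0.0, root
    while node is not None:
        total += node.weight                         # accumulate the weight of the visited node
        if node.weak_hypothesis is None or not node.children:
            break                                    # reached a sink of the branching program
        branch = node.weak_hypothesis(instance)      # the node's weak hypothesis routes the instance
        node = node.children.get(branch)             # merged nodes would share a child here
    return 1 if total >= threshold else -1
```

Under this reading, merging all nodes in a round collapses each level to a single node (recovering InfoBoost's additive form), while merging none of them makes the program a decision tree whose node count can grow exponentially with the number of rounds.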
Cite this paper
Takimoto, E., Koya, S., Maruoka, A. (2004). Boosting Based on Divide and Merge. In: Ben-David, S., Case, J., Maruoka, A. (eds) Algorithmic Learning Theory. ALT 2004. Lecture Notes in Computer Science, vol 3244. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-30215-5_11
Print ISBN: 978-3-540-23356-5
Online ISBN: 978-3-540-30215-5