
Quality Diversity Genetic Programming for Learning Decision Tree Ensembles

  • Conference paper
Genetic Programming (EuroGP 2021)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 12691)


Abstract

Quality Diversity (QD) algorithms are a class of population-based evolutionary algorithms designed to generate sets of solutions that are both fit and diverse. In this paper, we describe a strategy for applying QD concepts to the generation of decision tree ensembles by optimizing collections of trees for both individually accurate and collectively diverse predictive behavior. We compare three variants of this QD strategy with two existing ensemble generation strategies over several classification data sets, and briefly highlight the effect of the evolutionary algorithm at the core of the strategy. The examined algorithms generate ensembles with distinct predictive behaviors as measured by classification accuracy and intrinsic diversity, and the plotted behaviors hint at highly data-dependent relationships between these two metrics. We suggest QD-based strategies as a means of optimizing classifier ensembles along this performance curve and outline further directions for future work.
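To make the quality/diversity trade-off in the abstract concrete, the sketch below builds a pool of decision trees and greedily selects an ensemble by scoring each candidate on its individual training accuracy (quality) and its mean prediction disagreement with the trees already selected (diversity). This is a minimal illustrative stand-in, not the QD strategy evaluated in the paper: the bootstrap pool, the greedy selection rule, the alpha weight, and the use of scikit-learn trees on a placeholder data set are all assumptions made for the example.

```python
# Minimal sketch (not the authors' algorithm): score candidate decision trees on
# individual accuracy ("quality") and on prediction disagreement with the trees
# already chosen ("diversity"), then vote with the selected ensemble.
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
X, y = load_breast_cancer(return_X_y=True)          # placeholder binary data set
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

# 1. Generate a pool of candidate trees from bootstrap samples with random feature subsets.
pool = []
for seed in range(50):
    idx = rng.integers(0, len(X_tr), size=len(X_tr))
    tree = DecisionTreeClassifier(max_depth=5, max_features="sqrt", random_state=seed)
    pool.append(tree.fit(X_tr[idx], y_tr[idx]))

preds = np.array([t.predict(X_tr) for t in pool])    # pool behavior on training data
quality = (preds == y_tr).mean(axis=1)               # per-tree training accuracy

def diversity(i, chosen):
    """Mean pairwise disagreement between candidate i and already-selected trees."""
    if not chosen:
        return 1.0
    return float(np.mean([(preds[i] != preds[j]).mean() for j in chosen]))

# 2. Greedy selection: alpha weights quality against diversity (illustrative choice).
alpha, ensemble = 0.5, []
for _ in range(10):
    scores = [alpha * quality[i] + (1 - alpha) * diversity(i, ensemble)
              if i not in ensemble else -np.inf
              for i in range(len(pool))]
    ensemble.append(int(np.argmax(scores)))

# 3. Majority vote of the selected trees on held-out data.
votes = np.array([pool[i].predict(X_te) for i in ensemble])
majority = (votes.mean(axis=0) >= 0.5).astype(int)   # labels are 0/1 here
print("ensemble test accuracy:", (majority == y_te).mean())
```

Sweeping alpha between 0 and 1 traces one version of the accuracy-versus-diversity behavior the abstract describes; the paper's QD strategies instead evolve the trees themselves with genetic programming rather than selecting from a fixed pool.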



Author information


Correspondence to Stephen Boisvert or John W. Sheppard.



Copyright information

© 2021 Springer Nature Switzerland AG

About this paper


Cite this paper

Boisvert, S., Sheppard, J.W. (2021). Quality Diversity Genetic Programming for Learning Decision Tree Ensembles. In: Hu, T., Lourenço, N., Medvet, E. (eds) Genetic Programming. EuroGP 2021. Lecture Notes in Computer Science, vol 12691. Springer, Cham. https://doi.org/10.1007/978-3-030-72812-0_1

Download citation

  • DOI: https://doi.org/10.1007/978-3-030-72812-0_1


  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-72811-3

  • Online ISBN: 978-3-030-72812-0

  • eBook Packages: Computer Science, Computer Science (R0)
