Characterizing the Effects of Random Subsampling on Lexicase Selection

Ferguson, Austin J.; Hernandez, Jose Guadalupe; Junghans, Daniel; Lalejini, Alexander; Dolson, Emily; Ofria, Charles

doi:10.1007/978-3-030-39958-0_1

Part of the book series: Genetic and Evolutionary Computation (GEVO)

Abstract

Lexicase selection is a proven parent-selection algorithm designed for genetic programming problems, especially for uncompromising test-based problems where many distinct test cases must all be passed. Previous work has shown that random subsampling techniques can improve lexicase selection’s problem-solving success; here, we investigate why. We test two types of random subsampling lexicase variants: down-sampled lexicase, which uses a random subset of all training cases each generation; and cohort lexicase, which collects candidate solutions and training cases into small groups for testing, reshuffling those groups each generation. We show that both of these subsampling lexicase variants improve problem-solving success by facilitating deeper evolutionary searches; that is, they allow populations to evolve for more generations (relative to standard lexicase) given a fixed number of test-case evaluations. We also demonstrate that the subsampled variants require less computational effort to find solutions, even though subsampling hinders lexicase’s ability to preserve specialists. Contrary to our expectations, we did not find any evidence of systematic loss of phenotypic diversity maintenance due to subsampling, though we did find evidence that cohort lexicase is significantly better at preserving phylogenetic diversity than down-sampled lexicase.
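The two subsampling variants described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the `error(program, case)` callback (lower is better), the `rate` and `num_cohorts` parameters, and the choice to pair program cohorts with case cohorts one-to-one are all hypothetical choices made for this example.

```python
import random


def lexicase_select(population, cases, error, rng):
    """Select one parent with standard lexicase selection.

    error(program, case) is an assumed callback; lower values are better.
    """
    candidates = list(population)
    order = list(cases)
    rng.shuffle(order)  # a fresh random case ordering for every selection event
    for case in order:
        best = min(error(p, case) for p in candidates)
        # Keep only the candidates that are elite on this case.
        candidates = [p for p in candidates if error(p, case) == best]
        if len(candidates) == 1:
            break
    return rng.choice(candidates)  # break any remaining ties at random


def down_sampled_generation(population, cases, error, rate, rng):
    """Down-sampled lexicase: one random subset of cases per generation."""
    sample_size = max(1, int(len(cases) * rate))
    sample = rng.sample(list(cases), sample_size)
    return [lexicase_select(population, sample, error, rng) for _ in population]


def cohort_generation(population, cases, error, num_cohorts, rng):
    """Cohort lexicase: reshuffle programs and cases into paired cohorts."""
    programs = list(population)
    shuffled_cases = list(cases)
    rng.shuffle(programs)
    rng.shuffle(shuffled_cases)
    parents = []
    for i in range(num_cohorts):
        program_cohort = programs[i::num_cohorts]
        case_cohort = shuffled_cases[i::num_cohorts]
        # Each program cohort is tested only on its paired case cohort
        # and fills as many parent slots as it has members.
        parents.extend(
            lexicase_select(program_cohort, case_cohort, error, rng)
            for _ in program_cohort
        )
    return parents
```

As a usage example, programs could be represented as tuples of per-case errors with `error = lambda p, c: p[c]`. Calling `down_sampled_generation(population, cases, error, 0.5, random.Random(0))` then returns one selected parent per population slot, with each program evaluated on only half of the cases.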

Notes

1. Evaluating a single program on a single test case is one test case evaluation.
2. Choosing when to measure diversity in evolutionary computation is an interesting problem. In evolutionary computation, diversity maintenance is often viewed as a mechanism to avoid premature convergence on suboptimal solutions. If our goal is to compare how well different selection schemes maintain diversity, when should we measure diversity? Measuring diversity after a global solution is found is not particularly meaningful, as finding the solution often causes the population to converge, decreasing diversity. We measured diversity at the time the solution was found to mitigate this problem. However, this approach only partially addresses the underlying problem: the process of evolution often involves many selective sweeps and subsequent divergences, and we cannot know where in this cycle our measurements occurred.
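Using the definition of a test case evaluation in note 1, the trade-off between evaluations per generation and number of generations can be worked through with simple arithmetic. The sketch below is illustrative only: the population size, number of test cases, evaluation budget, and subsampling levels are hypothetical values chosen for round numbers, not the settings used in the chapter, and it assumes that each program in a cohort is evaluated only on its paired cohort of cases.

```python
def generations_for_budget(budget, pop_size, cases_per_program):
    """Generations affordable when every program runs on cases_per_program cases."""
    evals_per_generation = pop_size * cases_per_program
    return budget // evals_per_generation


# Hypothetical values, chosen only to make the arithmetic easy to follow.
pop_size = 1000
num_cases = 100
budget = 30_000_000  # total test case evaluations allowed

# Standard lexicase: every program may be run on every case.
standard = generations_for_budget(budget, pop_size, num_cases)

# Down-sampled lexicase using 10% of the cases each generation.
down_sampled = generations_for_budget(budget, pop_size, num_cases // 10)

# Cohort lexicase with 10 cohorts: each program sees num_cases / 10 cases.
cohort = generations_for_budget(budget, pop_size, num_cases // 10)

print(standard, down_sampled, cohort)
```

With these assumed numbers, both subsampled variants can run ten times as many generations as standard lexicase on the same evaluation budget.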

Acknowledgements

This research was supported by the National Science Foundation through the BEACON Center (Coop. Agreement No. DBI-0939454), a Graduate Research Fellowship to AL (Grant No. DGE-1424871), and Grant No. DEB-1655715 to CO. Michigan State University provided computational resources through the Institute for Cyber-Enabled Research.

Author information

Authors and Affiliations

The BEACON Center for the Study of Evolution in Action, Michigan State University, East Lansing, MI, USA
Austin J. Ferguson, Jose Guadalupe Hernandez, Daniel Junghans, Alexander Lalejini & Charles Ofria
Department of Translational Hematology and Oncology Research, Cleveland Clinic, Cleveland, OH, USA
Emily Dolson

Corresponding author

Correspondence to Austin J. Ferguson.

Editor information

Editors and Affiliations

Computer Science and Engineering, John R. Koza Chair, Michigan State University, East Lansing, MI, USA
Wolfgang Banzhaf
BEACON Center, Michigan State University, East Lansing, MI, USA
Erik Goodman
Department of Computer Science and Engineering, Michigan State University, Okemos, MI, USA
Leigh Sheneman
Depto Ingenieria en Electronic Electrica Tecnológico Nacional de México/ IT, Tijuana, Baja California, Mexico
Leonardo Trujillo
Evolution Enterprises, Ann Arbor, MI, USA
Bill Worzel

About this chapter

Cite this chapter

Ferguson, A.J., Hernandez, J.G., Junghans, D., Lalejini, A., Dolson, E., Ofria, C. (2020). Characterizing the Effects of Random Subsampling on Lexicase Selection. In: Banzhaf, W., Goodman, E., Sheneman, L., Trujillo, L., Worzel, B. (eds) Genetic Programming Theory and Practice XVII. Genetic and Evolutionary Computation. Springer, Cham. https://doi.org/10.1007/978-3-030-39958-0_1

DOI: https://doi.org/10.1007/978-3-030-39958-0_1
Published: 08 May 2020
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-39957-3
Online ISBN: 978-3-030-39958-0
eBook Packages: Computer Science, Computer Science (R0)
