Characterizing the Effects of Random Subsampling on Lexicase Selection

Genetic Programming Theory and Practice XVII

Abstract

Lexicase selection is a proven parent-selection algorithm designed for genetic programming problems, especially for uncompromising test-based problems where many distinct test cases must all be passed. Previous work has shown that random subsampling techniques can improve lexicase selection’s problem-solving success; here, we investigate why. We test two types of random subsampling lexicase variants: down-sampled lexicase, which uses a random subset of all training cases each generation; and cohort lexicase, which collects candidate solutions and training cases into small groups for testing, reshuffling those groups each generation. We show that both of these subsampling lexicase variants improve problem-solving success by facilitating deeper evolutionary searches; that is, they allow populations to evolve for more generations (relative to standard lexicase) given a fixed number of test-case evaluations. We also demonstrate that the subsampled variants require less computational effort to find solutions, even though subsampling hinders lexicase’s ability to preserve specialists. Contrary to our expectations, we did not find any evidence of systematic loss of phenotypic diversity maintenance due to subsampling, though we did find evidence that cohort lexicase is significantly better at preserving phylogenetic diversity than down-sampled lexicase.
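The three selection schemes described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the `errors[i][c]` lookup (error of candidate `i` on case `c`, lower is better), the `rate` and `n_cohorts` parameters, and the budget figures in the usage note are all assumptions made for this example. The cohort sketch further assumes that `n_cohorts` is no larger than either the population size or the number of training cases.

```python
import random


def lexicase_select(candidates, errors, cases, rng):
    """Standard lexicase: filter candidates case by case in random order."""
    pool = list(candidates)
    order = list(cases)
    rng.shuffle(order)
    for c in order:
        best = min(errors[i][c] for i in pool)
        pool = [i for i in pool if errors[i][c] == best]
        if len(pool) == 1:
            break
    # Any remaining ties are broken at random.
    return rng.choice(pool)


def down_sampled_select(population, errors, all_cases, rate, n_parents, rng):
    """Down-sampled lexicase: one random subset of cases per generation,
    shared by every selection event in that generation."""
    cases = list(all_cases)
    k = max(1, int(len(cases) * rate))
    sample = rng.sample(cases, k)
    return [lexicase_select(population, errors, sample, rng)
            for _ in range(n_parents)]


def cohort_select(population, errors, all_cases, n_cohorts, rng):
    """Cohort lexicase: shuffle candidates and cases, deal both into
    n_cohorts groups, and run lexicase within each paired group."""
    pop = list(population)
    cases = list(all_cases)
    rng.shuffle(pop)
    rng.shuffle(cases)
    parents = []
    for g in range(n_cohorts):
        cand_group = pop[g::n_cohorts]
        case_group = cases[g::n_cohorts]
        # Each candidate cohort contributes as many parents as it has members.
        parents += [lexicase_select(cand_group, errors, case_group, rng)
                    for _ in cand_group]
    return parents


def generations_for_budget(budget, pop_size, cases_per_candidate):
    """Generations affordable under a fixed test-case evaluation budget."""
    return budget // (pop_size * cases_per_candidate)
```

With hypothetical numbers (a budget of 30,000,000 evaluations, 1,000 candidates, 100 training cases), `generations_for_budget(30_000_000, 1000, 100)` gives 300 generations for standard lexicase, while evaluating each candidate on only 10 cases gives `generations_for_budget(30_000_000, 1000, 10)`, or 3,000 generations. This is the sense in which subsampling permits a deeper search for the same evaluation budget.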

Notes

  1. Evaluating a single program on a single test case is one test case evaluation.

  2. Choosing when to measure diversity in evolutionary computation is an interesting problem. In evolutionary computation, diversity maintenance is often viewed as a mechanism to avoid premature convergence on suboptimal solutions. If our goal is to compare how well different selection schemes maintain diversity, when should we measure diversity? Measuring diversity after a global solution is found is not particularly meaningful, as finding the solution often causes the population to converge, decreasing diversity. We measured diversity at the time the solution was found to mitigate this problem. However, this approach only partially addresses the underlying problem: the process of evolution often involves many selective sweeps and subsequent divergences, and we cannot know where in this cycle our measurements occurred.

Acknowledgements

This research was supported by the National Science Foundation through the BEACON Center (Coop. Agreement No. DBI-0939454), a Graduate Research Fellowship to AL (Grant No. DGE-1424871), and Grant No. DEB-1655715 to CO. Michigan State University provided computational resources through the Institute for Cyber-Enabled Research.

Author information

Corresponding author

Correspondence to Austin J. Ferguson.

Copyright information

© 2020 Springer Nature Switzerland AG

About this chapter

Cite this chapter

Ferguson, A.J., Hernandez, J.G., Junghans, D., Lalejini, A., Dolson, E., Ofria, C. (2020). Characterizing the Effects of Random Subsampling on Lexicase Selection. In: Banzhaf, W., Goodman, E., Sheneman, L., Trujillo, L., Worzel, B. (eds) Genetic Programming Theory and Practice XVII. Genetic and Evolutionary Computation. Springer, Cham. https://doi.org/10.1007/978-3-030-39958-0_1

  • DOI: https://doi.org/10.1007/978-3-030-39958-0_1

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-39957-3

  • Online ISBN: 978-3-030-39958-0

  • eBook Packages: Computer Science (R0)
