Abstract
Multi-objective optimization has played a major role in solving problems where two or more conflicting objectives need to be simultaneously optimized. This paper presents a multi-objective grammar-based genetic programming (MOGGP) system that automatically evolves complete rule induction algorithms, which in turn produce rule models that are both accurate and compact. The system was compared with a single-objective grammar-based genetic programming (GGP) system and three other rule induction algorithms. In total, 20 UCI data sets were used to generate and test generic rule induction algorithms, which can now be applied to any classification data set. Experiments showed that, in general, the proposed MOGGP finds rule induction algorithms with competitive predictive accuracies and more compact models than the algorithms it was compared with.
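To make the trade-off between accuracy and compactness concrete, the following is a minimal sketch of Pareto dominance over two objectives. It assumes a dominance-based comparison, which the abstract does not specify, so it should not be read as the selection scheme the paper uses. The `Candidate` type, the `dominates` and `pareto_front` helpers, and all numbers are hypothetical and chosen only for illustration.

```python
from typing import List, NamedTuple


class Candidate(NamedTuple):
    """Hypothetical summary of one evolved rule induction algorithm."""
    name: str
    accuracy: float  # predictive accuracy, higher is better
    model_size: int  # e.g. number of rule conditions, lower is better


def dominates(a: Candidate, b: Candidate) -> bool:
    """Return True if a is no worse than b on both objectives
    and strictly better on at least one of them."""
    no_worse = a.accuracy >= b.accuracy and a.model_size <= b.model_size
    strictly_better = a.accuracy > b.accuracy or a.model_size < b.model_size
    return no_worse and strictly_better


def pareto_front(candidates: List[Candidate]) -> List[Candidate]:
    """Keep only the candidates that no other candidate dominates."""
    return [c for c in candidates
            if not any(dominates(other, c) for other in candidates)]


# Made-up values for illustration only; not results from the paper.
candidates = [
    Candidate("A", 0.90, 40),
    Candidate("B", 0.88, 12),
    Candidate("C", 0.85, 30),
    Candidate("D", 0.90, 55),
]

front = pareto_front(candidates)
print([c.name for c in front])
```

In this toy data, A and B remain on the front because neither is better than the other on both objectives, while C and D are each beaten on one objective and matched or beaten on the other. A single-objective approach would instead have to fold accuracy and size into one score, which requires choosing a weighting in advance.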
About this article
Cite this article
Pappa, G.L., Freitas, A.A. Evolving rule induction algorithms with multi-objective grammar-based genetic programming. Knowl Inf Syst 19, 283–309 (2009). https://doi.org/10.1007/s10115-008-0171-1
DOI: https://doi.org/10.1007/s10115-008-0171-1