Abstract
Medical data have a number of characteristics that make their classification a complex task. Yet the societal significance of the subject and the computational challenge it presents have made the classification of medical datasets a popular research area. A new hybrid metaheuristic is presented for classifying medical datasets. The hybrid ant–bee colonies (HColonies) algorithm consists of two phases: an ant colony optimization (ACO) phase and an artificial bee colony (ABC) phase. The food sources of ABC are initialized as decision lists constructed during the ACO phase using different subsets of the training data. The task of the ABC phase is to optimize the obtained decision lists. New variants of the ABC operators are proposed to suit the classification task. Results on a number of real-world benchmark medical datasets show the usefulness of the proposed approach: the classification models obtained have good predictive accuracy and relatively small model size.
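The two-phase structure described above can be illustrated with a minimal Python sketch. It is not the authors' implementation. The ACO phase is replaced here by a simple greedy sequential-covering learner (`build_list`), the ABC operators are reduced to a rule-pruning and rule-swapping move (`neighbour`), and every function name, parameter, and default value (`hcolonies_sketch`, `n_sources`, `cycles`, `limit`) is a hypothetical choice made for illustration only.

```python
import random
from collections import Counter

def predict(dlist, x):
    """A decision list is (rules, default); the first matching rule fires."""
    rules, default = dlist
    for attr, value, label in rules:
        if x[attr] == value:
            return label
    return default

def accuracy(dlist, data):
    return sum(predict(dlist, x) == y for x, y in data) / len(data)

def build_list(data, max_rules=5):
    """ASSUMPTION: greedy stand-in for the ACO phase (one-term rules)."""
    rules, remaining = [], list(data)
    while remaining and len(rules) < max_rules:
        stats = {}  # (attribute, value) -> class counts among uncovered cases
        for x, y in remaining:
            for a, v in enumerate(x):
                stats.setdefault((a, v), Counter())[y] += 1
        # Pick the term whose majority class covers the most cases.
        (a, v), counts = max(stats.items(),
                             key=lambda kv: kv[1].most_common(1)[0][1])
        rules.append((a, v, counts.most_common(1)[0][0]))
        remaining = [(x, y) for x, y in remaining if x[a] != v]
    default = Counter(y for _, y in data).most_common(1)[0][0]
    return rules, default

def neighbour(dlist, rng):
    """ASSUMPTION: simplified ABC move that prunes a rule or swaps two rules."""
    rules, default = list(dlist[0]), dlist[1]
    if len(rules) > 1:
        if rng.random() < 0.5:
            del rules[rng.randrange(len(rules))]
        else:
            i = rng.randrange(len(rules) - 1)
            rules[i], rules[i + 1] = rules[i + 1], rules[i]
    return rules, default

def hcolonies_sketch(train, n_sources=5, cycles=20, limit=5, seed=0):
    rng = random.Random(seed)

    def new_source():
        # Each food source is learned from a different bootstrap subset.
        return build_list(rng.choices(train, k=len(train)))

    def quality(s):
        # Prefer higher training accuracy, then fewer rules.
        return (accuracy(s, train), -len(s[0]))

    def try_improve(i):
        cand = neighbour(sources[i], rng)
        if quality(cand) > quality(sources[i]):  # greedy selection
            sources[i], trials[i] = cand, 0
        else:
            trials[i] += 1

    sources = [new_source() for _ in range(n_sources)]
    trials = [0] * n_sources
    best = max(sources, key=quality)
    for _ in range(cycles):
        for i in range(n_sources):  # employed bees
            try_improve(i)
        weights = [accuracy(s, train) + 1e-9 for s in sources]
        for i in rng.choices(range(n_sources), weights=weights, k=n_sources):
            try_improve(i)  # onlooker bees favour fitter sources
        best = max(sources + [best], key=quality)
        for i in range(n_sources):  # scouts replace exhausted sources
            if trials[i] > limit:
                sources[i], trials[i] = new_source(), 0
    return max(sources + [best], key=quality)
```

Calling `hcolonies_sketch(train)` on a list of `(attribute_tuple, label)` pairs returns a `(rules, default)` decision list. The quality measure rewards both accuracy and short lists, which mirrors the two properties the abstract reports, but quality is measured on the training data here purely to keep the sketch short.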
Notes
IBM Corp. Released 2011. IBM SPSS Statistics for Windows, Version 21.0. Armonk, NY: IBM Corp.
Acknowledgements
This research project has been supported by a grant from the “Research Center of the Center for Female Scientific and Medical Colleges”, Deanship of Scientific Research, King Saud University. The authors would like to thank the anonymous reviewers for their valuable and constructive comments. Special thanks to Dr. Joaquin Derrac Rus from Cardiff University, UK, for his assistance in using KEEL. Also, an honorable mention goes to Dr. Pedro J. García Laencina from Universidad Politécnica de Cartagena, Spain; Dr. Iñaki Inza from University of the Basque Country, Spain; Dr. Bahriye Basturk Akay from Erciyes University, Turkey; Dr. Pat Langley from Stanford University, California; and Dr. Sotos Kotsiantis from University of Patras, Greece, for providing useful information for the project.
Cite this article
AlMuhaideb, S., Menai, M.E.B. HColonies: a new hybrid metaheuristic for medical data classification. Appl Intell 41, 282–298 (2014). https://doi.org/10.1007/s10489-014-0519-z
DOI: https://doi.org/10.1007/s10489-014-0519-z