Cancer Prediction Using Diversity-Based Ensemble Genetic Programming

Hong, Jin-Hyuk; Cho, Sung-Bae

doi:10.1007/11526018_29

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 3558)

Included in the following conference series:

International Conference on Modeling Decisions for Artificial Intelligence

Abstract

Combining a set of classifiers is a common way to improve classification performance. A good ensemble classifier requires base classifiers that are both accurate and diverse, so estimating diversity among classifiers has been widely investigated. This paper presents an ensemble approach that combines a set of diverse rules obtained by genetic programming. Genetic programming generates interpretable classification rules, so the diversity among them can be estimated directly. Several diverse rules are then combined by a fusion method to produce the final decision. The proposed method is applied to cancer classification using gene expression profiles, an important problem in bioinformatics. Experiments on several popular cancer datasets demonstrate the usability of the method: it achieves high performance, and accuracy increases with the diversity among the base classification rules.
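
The pipeline summarized in the abstract (generate rules, estimate their diversity, fuse a diverse subset) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the four hand-written rules stand in for rules that genetic programming would evolve, diversity is approximated by how often two rules disagree on a sample set (the abstract does not specify the measure the authors use), and majority voting stands in for the unspecified fusion method. The feature values are made up and are not gene expression data from the paper.

```python
from itertools import combinations
from typing import Callable, List, Sequence

# A rule maps a feature vector (e.g. expression levels) to 1 or 0.
Rule = Callable[[Sequence[float]], int]


# Hypothetical hand-written rules standing in for GP-evolved ones.
def rule_a(x: Sequence[float]) -> int:
    return 1 if x[0] - x[1] > 0 else 0


def rule_b(x: Sequence[float]) -> int:
    return 1 if x[0] - x[1] > 0.1 else 0  # close variant of rule_a


def rule_c(x: Sequence[float]) -> int:
    return 1 if x[2] > 0.5 else 0


def rule_d(x: Sequence[float]) -> int:
    return 1 if x[1] + x[3] < 1.0 else 0


def disagreement(r1: Rule, r2: Rule, samples: List[Sequence[float]]) -> float:
    """Fraction of samples on which two rules give different labels
    (an assumed, output-based stand-in for a diversity measure)."""
    return sum(r1(x) != r2(x) for x in samples) / len(samples)


def mean_pairwise_diversity(rules: Sequence[Rule],
                            samples: List[Sequence[float]]) -> float:
    """Average disagreement over all pairs of rules; 0.0 if fewer than two."""
    pairs = list(combinations(rules, 2))
    if not pairs:
        return 0.0
    return sum(disagreement(a, b, samples) for a, b in pairs) / len(pairs)


def select_diverse(rules: Sequence[Rule], samples: List[Sequence[float]],
                   k: int) -> List[Rule]:
    """Exhaustively pick the k-rule subset with the highest mean diversity.
    Fine for a handful of rules; too slow for large rule pools."""
    best = max(combinations(rules, k),
               key=lambda subset: mean_pairwise_diversity(subset, samples))
    return list(best)


def majority_vote(rules: Sequence[Rule], x: Sequence[float]) -> int:
    """Fuse rule outputs: predict 1 if more than half of the rules say 1."""
    votes = sum(r(x) for r in rules)
    return 1 if 2 * votes > len(rules) else 0


# Made-up feature vectors, for illustration only.
SAMPLES: List[Sequence[float]] = [
    [0.9, 0.2, 0.8, 0.3],
    [0.3, 0.7, 0.2, 0.6],
    [0.6, 0.55, 0.1, 0.2],
    [0.2, 0.4, 0.9, 0.9],
]

if __name__ == "__main__":
    ensemble = select_diverse([rule_a, rule_b, rule_c, rule_d], SAMPLES, k=3)
    print([majority_vote(ensemble, x) for x in SAMPLES])
```

An odd ensemble size avoids ties in the vote. In a real setting, diversity would be estimated on training data only and the selected rules would also be filtered for accuracy, since the abstract states that both properties are needed.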

Author information

Authors and Affiliations

Dept. of Computer Science, Yonsei University, 134 Sinchon-dong, Sudaemoon-ku, Seoul, 120-749, Korea
Jin-Hyuk Hong & Sung-Bae Cho


Editor information

Editors and Affiliations

IIIA, Artificial Intelligence Research Institute CSIC, Spanish National Research Council, Campus UAB s/n, 08193, Bellaterra, Catalonia, Spain
Vicenç Torra
Toho Gakuen, 3-1-10 Naka, Kunitachi, 186-0004, Tokyo, Japan
Yasuo Narukawa
Department of Risk Engineering, School of Systems and Information Engineering, University of Tsukuba, 305-8573, Ibaraki, Japan
Sadaaki Miyamoto


Cite this paper

Hong, JH., Cho, SB. (2005). Cancer Prediction Using Diversity-Based Ensemble Genetic Programming. In: Torra, V., Narukawa, Y., Miyamoto, S. (eds) Modeling Decisions for Artificial Intelligence. MDAI 2005. Lecture Notes in Computer Science, vol 3558. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11526018_29

DOI: https://doi.org/10.1007/11526018_29
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-27871-9
Online ISBN: 978-3-540-31883-5
eBook Packages: Computer Science, Computer Science (R0)
