An Empirical Study of Combined Classifiers for Knowledge Discovery on Medical Data Bases

Lopes, Lucelene; Scalabrin, Edson Emilio; Fernandes, Paulo

doi:10.1007/978-3-540-89376-9_11

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 4977)

Included in the following conference series:

Asia-Pacific Web Conference

Abstract

This paper compares the accuracy of combined classifiers on medical data bases with the accuracy of the same knowledge discovery techniques applied to generic data bases. Specifically, we apply the Bagging and Boosting methods to 16 medical and 16 generic data bases and compare their accuracy with that of a more traditional approach (the C4.5 algorithm). Bagging and Boosting are applied using different numbers of classifiers, and accuracy is computed using cross-validation. The main contributions of this paper are a recommendation of the most accurate method and a possible parameterization for medical data bases, and an initial identification of some characteristics that distinguish medical data bases from generic ones.
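
The experimental setup described in the abstract can be illustrated with a minimal sketch. It is not the authors' code: it assumes a one-level decision tree (a decision stump) as a simple stand-in for C4.5, a small synthetic two-class data set in place of the 16 medical and 16 generic data bases, and 5-fold cross-validation. All function names (`train_stump`, `train_bagging`, `cross_val_accuracy`, `make_data`) are hypothetical, and Boosting is omitted for brevity.

```python
import random
from collections import Counter


def train_stump(rows):
    """Fit a one-level decision tree (a stand-in for C4.5, not C4.5 itself)."""
    best = None
    for f in range(len(rows[0][0])):
        for threshold in sorted({x[f] for x, _ in rows}):
            left = [y for x, y in rows if x[f] <= threshold]
            right = [y for x, y in rows if x[f] > threshold]
            if not left or not right:
                continue
            left_label = Counter(left).most_common(1)[0][0]
            right_label = Counter(right).most_common(1)[0][0]
            errors = (sum(y != left_label for y in left)
                      + sum(y != right_label for y in right))
            if best is None or errors < best[0]:
                best = (errors, f, threshold, left_label, right_label)
    if best is None:  # no usable split: always predict the majority class
        majority = Counter(y for _, y in rows).most_common(1)[0][0]
        return lambda x: majority
    _, feat, thr, ll, rl = best
    return lambda x: ll if x[feat] <= thr else rl


def train_bagging(rows, n_classifiers, rng):
    """Bagging: train each base classifier on a bootstrap sample, then vote."""
    models = []
    for _ in range(n_classifiers):
        sample = [rng.choice(rows) for _ in rows]  # draw with replacement
        models.append(train_stump(sample))

    def predict(x):
        return Counter(m(x) for m in models).most_common(1)[0][0]

    return predict


def cross_val_accuracy(rows, train_fn, k=5, seed=0):
    """Estimate accuracy with k-fold cross-validation."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    folds = [shuffled[i::k] for i in range(k)]
    correct = 0
    for i, test in enumerate(folds):
        train = [r for j, fold in enumerate(folds) if j != i for r in fold]
        model = train_fn(train)
        correct += sum(model(x) == y for x, y in test)
    return correct / len(rows)


def make_data(n=120, seed=1):
    """Synthetic two-class data with 10% label noise (illustration only)."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = (rng.random(), rng.random())
        y = int(x[0] + x[1] > 1.0)
        if rng.random() < 0.1:
            y = 1 - y
        rows.append((x, y))
    return rows


data = make_data()
results = {"single": cross_val_accuracy(data, train_stump)}
for n in (5, 10, 20):  # different numbers of combined classifiers
    rng = random.Random(42)
    results[f"bagging-{n}"] = cross_val_accuracy(
        data, lambda rows, n=n, rng=rng: train_bagging(rows, n, rng))

for name, acc in results.items():
    print(f"{name}: {acc:.3f}")
```

Running the script prints one cross-validated accuracy for the single classifier and one per ensemble size, which is the shape of the comparison the abstract describes. The numbers themselves say nothing about the paper's results, since both the base learner and the data are placeholders.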

Author information

Authors and Affiliations

PPGTS, PUCPR, Curitiba, Brazil
Lucelene Lopes & Edson Emilio Scalabrin
PPGCC, PUCRS, Porto Alegre, Brazil (on sabbatical at LFCS, Univ. of Edinburgh, Edinburgh, UK; CAPES grant 1341/07-3)
Paulo Fernandes

Editor information

Editors and Affiliations

Nagoya University, Nagoya, Japan
Yoshiharu Ishikawa
CAS Research Center on Data Technology and Knowledge Economy, Beijing, China
Jing He & Yong Shi
Victoria University, Melbourne, Australia
Guandong Xu
Institute of Software, Chinese Academy of Sciences, Beijing, China
Guangyan Huang
CSIRO ICT Centre, Brisbane, QLD, Australia
Chaoyi Pang & Qing Zhang
Northeastern University, Shenyang, China
Guoren Wang

About this paper

Cite this paper

Lopes, L., Scalabrin, E.E., Fernandes, P. (2008). An Empirical Study of Combined Classifiers for Knowledge Discovery on Medical Data Bases. In: Ishikawa, Y., et al. Advanced Web and Network Technologies, and Applications. APWeb 2008. Lecture Notes in Computer Science, vol 4977. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-89376-9_11

DOI: https://doi.org/10.1007/978-3-540-89376-9_11
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-89375-2
Online ISBN: 978-3-540-89376-9
eBook Packages: Computer Science, Computer Science (R0)
