Abstract
This paper compares the accuracy of combined classifiers on medical data bases with that of the same knowledge discovery techniques applied to generic data bases. Specifically, we apply Bagging and Boosting methods to 16 medical and 16 generic data bases and compare their accuracy with that of a more traditional approach (the C4.5 algorithm). Bagging and Boosting are applied using different numbers of classifiers, and accuracy is computed using cross-validation. The paper's main contributions are a recommendation of the most accurate method and a possible parameterization for medical data bases, and an initial identification of some characteristics that distinguish medical data bases from generic ones.
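The experimental protocol summarized in the abstract (an ensemble trained with a varying number of classifiers, scored by cross-validation, and compared against a single base learner) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration, not the authors' setup: the base learner is a one-feature decision stump standing in for C4.5, the data are synthetic, the fold count `k=5` and the ensemble sizes `(1, 5, 10)` are arbitrary, and only Bagging is shown. All function names are hypothetical.

```python
import random
from collections import Counter

def train_stump(rows):
    """Fit a one-feature threshold classifier (a simple stand-in for C4.5)."""
    best = None
    for f in range(len(rows[0][0])):
        for x, _ in rows:
            t = x[f]  # candidate threshold taken from the data
            left = [y for xx, y in rows if xx[f] <= t]
            right = [y for xx, y in rows if xx[f] > t]
            l_lab = Counter(left).most_common(1)[0][0]
            r_lab = Counter(right).most_common(1)[0][0] if right else l_lab
            errors = sum(y != l_lab for y in left) + sum(y != r_lab for y in right)
            if best is None or errors < best[0]:
                best = (errors, f, t, l_lab, r_lab)
    _, f, t, l_lab, r_lab = best
    return lambda x: l_lab if x[f] <= t else r_lab

def train_bagging(rows, n_classifiers, rng):
    """Train n_classifiers stumps on bootstrap samples; predict by majority vote."""
    models = []
    for _ in range(n_classifiers):
        sample = [rng.choice(rows) for _ in rows]  # sample with replacement
        models.append(train_stump(sample))
    def predict(x):
        return Counter(m(x) for m in models).most_common(1)[0][0]
    return predict

def cross_val_accuracy(rows, train_fn, k=5):
    """Mean accuracy over k folds; k=5 is an assumption, not the paper's value."""
    folds = [rows[i::k] for i in range(k)]
    correct = 0
    for i, test in enumerate(folds):
        train = [r for j, fold in enumerate(folds) if j != i for r in fold]
        model = train_fn(train)
        correct += sum(model(x) == y for x, y in test)
    return correct / len(rows)

def make_data(n, rng):
    """Synthetic two-feature, two-class data (purely illustrative)."""
    rows = []
    for _ in range(n):
        x = (rng.random(), rng.random())
        rows.append((x, int(x[0] + x[1] > 1.0)))
    return rows

rng = random.Random(0)
data = make_data(60, rng)

# Single base learner versus Bagging with different ensemble sizes.
baseline = cross_val_accuracy(data, train_stump)
results = {
    n: cross_val_accuracy(data, lambda tr, n=n: train_bagging(tr, n, rng))
    for n in (1, 5, 10)
}
print("single stump:", baseline)
for n, acc in results.items():
    print(f"bagging x{n}:", acc)
```

Repeating this loop over each data base and each ensemble size yields the kind of accuracy table on which a comparison like the one described above would rest. A Boosting variant would slot into the same `cross_val_accuracy` call as another `train_fn`.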
Copyright information
© 2008 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Lopes, L., Scalabrin, E.E., Fernandes, P. (2008). An Empirical Study of Combined Classifiers for Knowledge Discovery on Medical Data Bases. In: Ishikawa, Y., et al. Advanced Web and Network Technologies, and Applications. APWeb 2008. Lecture Notes in Computer Science, vol 4977. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-89376-9_11
DOI: https://doi.org/10.1007/978-3-540-89376-9_11
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-89375-2
Online ISBN: 978-3-540-89376-9
eBook Packages: Computer Science, Computer Science (R0)