Abstract
This chapter considers how to compare the performance of alternative classifiers across a range of datasets. The commonly used paired t-test is described and illustrated with worked examples, leading to the use of confidence intervals when the predictive accuracies of two classifiers are found to be significantly different.
Pitfalls involved in comparing classifiers are discussed, leading to alternative ways of comparing their performance that do not rely on comparisons of predictive accuracy.
Notes
1. For those not familiar with this notation, which uses the Greek letter ∑ (pronounced ‘sigma’) to denote summation, it is explained in Appendix A.1.1. The simplified variant used here leaves out the subscripts, as the values to be added are obvious. \(\sum{}z\) (read as ‘sigma z’) denotes the sum of all values of z, which here is 7; \(\sum{}z^{2}\) (read as ‘sigma z squared’) denotes the sum of all the values of \(z^{2}\), which here is 437. The latter is not to be confused with \((\sum{}z)^{2}\), which is the square of \(\sum{}z\), i.e. 49.
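The distinction between the three quantities in this note can be checked with a short sketch. The values of z below are hypothetical and are not the chapter's data (so the sums differ from 7, 437 and 49), and the variable names are illustrative only. The last three lines show one standard way such sums feed into a paired t statistic, using the textbook identity for the sample variance; this is an assumption about how the sums are used, not a reproduction of the chapter's worked example.

```python
import math

# Hypothetical paired differences in accuracy between two classifiers,
# one value per dataset. These are NOT the values used in the chapter.
z = [2, -1, 3, 0, 1]
n = len(z)

sum_z = sum(z)                          # sigma z: add the values
sum_z_squared = sum(v * v for v in z)   # sigma z squared: square each value, then add
square_of_sum = sum_z ** 2              # (sigma z) squared: add first, then square

# Textbook identity for the sample variance in terms of the two sums.
sample_variance = (sum_z_squared - square_of_sum / n) / (n - 1)
standard_error = math.sqrt(sample_variance / n)

# Paired t statistic: mean difference divided by its standard error.
t_statistic = (sum_z / n) / standard_error

print(sum_z, sum_z_squared, square_of_sum)
print(sample_variance, t_statistic)
```

For these hypothetical values the sum is 5, the sum of squares is 15 and the square of the sum is 25, which shows that the last two are different quantities.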
References
Blake, C. L., & Merz, C. J. (1998). UCI repository of machine learning databases. Irvine: University of California, Department of Information and Computer Science. http://www.ics.uci.edu/~mlearn/MLRepository.html
Salzberg, S. L. (1997). On comparing classifiers: pitfalls to avoid and a recommended approach. Data Mining and Knowledge Discovery, 1, 317–327. Kluwer.
Copyright information
© 2016 Springer-Verlag London Ltd.
About this chapter
Cite this chapter
Bramer, M. (2016). Comparing Classifiers. In: Principles of Data Mining. Undergraduate Topics in Computer Science. Springer, London. https://doi.org/10.1007/978-1-4471-7307-6_15
DOI: https://doi.org/10.1007/978-1-4471-7307-6_15
Publisher Name: Springer, London
Print ISBN: 978-1-4471-7306-9
Online ISBN: 978-1-4471-7307-6
eBook Packages: Computer Science, Computer Science (R0)