
Comparing Classifiers

Principles of Data Mining

Part of the book series: Undergraduate Topics in Computer Science (UTICS)

Abstract

This chapter considers how to compare the performance of alternative classifiers across a range of datasets. The commonly used paired t-test is described and illustrated with worked examples, leading to the use of confidence intervals when the predictive accuracies of two classifiers are found to be significantly different.

Pitfalls involved in comparing classifiers are discussed, leading to alternative ways of comparing their performance that do not rely on comparisons of predictive accuracy.
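As a rough illustration of the paired t-test and confidence interval mentioned above, the following is a minimal sketch rather than the chapter's own worked example. The function names, the accuracy figures and the choice of a 95% two-tailed critical value are all assumptions made for illustration; the critical value would normally be looked up in a t-table for n − 1 degrees of freedom.

```python
import math


def paired_t(acc_a, acc_b):
    """Return (mean difference, standard error, t statistic).

    acc_a and acc_b are the accuracies of two classifiers on the
    same datasets, in the same order (hypothetical helper).
    """
    if len(acc_a) != len(acc_b) or len(acc_a) < 2:
        raise ValueError("need two equal-length lists with at least 2 values")
    z = [a - b for a, b in zip(acc_a, acc_b)]   # per-dataset differences
    n = len(z)
    sum_z = sum(z)                              # sigma z
    sum_z2 = sum(v * v for v in z)              # sigma z squared
    mean = sum_z / n
    # Sample variance via the computational formula.
    variance = (sum_z2 - sum_z ** 2 / n) / (n - 1)
    if variance == 0:
        raise ValueError("all differences are identical; t is undefined")
    std_err = math.sqrt(variance / n)
    return mean, std_err, mean / std_err


def confidence_interval(mean, std_err, t_crit):
    """Interval for the true mean difference, given a critical t value."""
    return mean - t_crit * std_err, mean + t_crit * std_err


# Hypothetical accuracies (%) on five datasets, not taken from the chapter.
acc_a = [80, 75, 90, 85, 70]
acc_b = [78, 76, 85, 84, 66]
mean, std_err, t = paired_t(acc_a, acc_b)

# 2.776 is the two-tailed 5% critical value of t for 4 degrees of freedom.
T_CRIT = 2.776
significant = abs(t) > T_CRIT
low, high = confidence_interval(mean, std_err, T_CRIT)
```

With these made-up figures the t statistic falls below the critical value, so the difference would not be judged significant and the interval straddles zero.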


Notes

  1. This notation, which uses the Greek letter ∑ (pronounced ‘sigma’) to denote summation, is explained in Appendix A.1.1 for those not familiar with it. The simplified variant used here leaves out the subscripts, as the values to be added are obvious. \(\sum{}z\) (read as ‘sigma z’) denotes the sum of all values of z, which here is 7; \(\sum{}z^{2}\) (read as ‘sigma z squared’) denotes the sum of all the values of \(z^{2}\), which is 437. The latter is not to be confused with \((\sum{}z)^{2}\), which is the square of \(\sum{}z\), i.e. 49.
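One common place where this distinction matters is the computational formula for the sample variance, given here as a general identity rather than as the chapter's own derivation. The symbol \(n\) (the number of values of z) and \(s^{2}\) (the sample variance) are introduced for illustration; \(n\) is not stated in this note, so it is left symbolic.

\[
s^{2} = \frac{\sum{}z^{2} - (\sum{}z)^{2}/n}{n-1} = \frac{437 - 49/n}{n-1}
\]

Substituting \((\sum{}z)^{2}\) where \(\sum{}z^{2}\) belongs would replace 437 by 49 and give a meaningless result.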

References

  1. Blake, C. L., & Merz, C. J. (1998). UCI repository of machine learning databases. Irvine: University of California, Department of Information and Computer Science. http://www.ics.uci.edu/~mlearn/MLRepository.html

  2. Salzberg, S. L. (1997). On comparing classifiers: pitfalls to avoid and a recommended approach. Data Mining and Knowledge Discovery, 1, 317–327. Kluwer.



Copyright information

© 2016 Springer-Verlag London Ltd.

About this chapter

Cite this chapter

Bramer, M. (2016). Comparing Classifiers. In: Principles of Data Mining. Undergraduate Topics in Computer Science. Springer, London. https://doi.org/10.1007/978-1-4471-7307-6_15


  • DOI: https://doi.org/10.1007/978-1-4471-7307-6_15

  • Publisher Name: Springer, London

  • Print ISBN: 978-1-4471-7306-9

  • Online ISBN: 978-1-4471-7307-6

  • eBook Packages: Computer Science (R0)
