Mining Supervised Classification Performance Studies: A Meta-Analytic Investigation

Journal of Classification

Abstract

There have been many comparative studies of classification methods in which real datasets are used as a gauge to assess the relative performance of the methods. Since these comparisons often yield inconclusive or limited results on how methods perform, it is often believed that a broader approach combining these studies would shed some light on this difficult question. This paper describes such an attempt: we have sampled the available literature and created a dataset of 5807 classification results. We show that one of the possible ways to analyze the resulting data is an overall assessment of the classification methods, and we present methods for that particular aim. The merits and demerits of such an approach are discussed, and conclusions are drawn which may assist future research: we argue that the current state of the literature hardly allows large-scale investigations.


Author information

Corresponding author

Correspondence to Adrien Jamain.

Additional information

This work was sponsored by the MOD Corporate Research Programme, CISP, as part of a larger project on technology assessment. We would like to express our appreciation to Andrew Webb for his support throughout the entire project, and to Wojtek Krzanowski for valuable comments on a draft of this paper. We would also like to thank the anonymous referees for some very interesting comments, some of which we hope to pursue in future work.

About this article

Cite this article

Jamain, A., Hand, D.J. Mining Supervised Classification Performance Studies: A Meta-Analytic Investigation. J Classif 25, 87–112 (2008). https://doi.org/10.1007/s00357-008-9003-y

  • DOI: https://doi.org/10.1007/s00357-008-9003-y
