Mining Supervised Classification Performance Studies: A Meta-Analytic Investigation

Jamain, Adrien; Hand, David J.

doi:10.1007/s00357-008-9003-y

Published: 26 June 2008

Volume 25, pages 87–112 (2008)
Journal of Classification

Abstract

There have been many comparative studies of classification methods in which real datasets are used as a gauge to assess the relative performance of the methods. Since these comparisons often yield inconclusive or limited results on how methods perform, it is often believed that a broader approach combining these studies would shed some light on this difficult question. This paper describes such an attempt: we have sampled the available literature and created a dataset of 5807 classification results. We show that one of the possible ways to analyze the resulting data is an overall assessment of the classification methods, and we present methods for that particular aim. The merits and demerits of such an approach are discussed, and conclusions are drawn which may assist future research: we argue that the current state of the literature hardly allows large-scale investigations.

Author information

Authors and Affiliations

BNP-Paribas, 10 Harewood Avenue, London, NW1 6AA, UK
Adrien Jamain
Department of Mathematics, and Institute for Mathematical Sciences, Imperial College, London, SW7 2AZ, UK
David J. Hand

Corresponding author

Correspondence to Adrien Jamain.

Additional information

This work was sponsored by the MOD Corporate Research Programme, CISP, as part of a larger project on technology assessment. We would like to express our appreciation to Andrew Webb for his support throughout the entire project, and to Wojtek Krzanowski for valuable comments on a draft of this paper. We would also like to thank the anonymous referees for some very interesting comments, some of which we hope to pursue in future work.

About this article

Cite this article

Jamain, A., Hand, D.J. Mining Supervised Classification Performance Studies: A Meta-Analytic Investigation. J Classif 25, 87–112 (2008). https://doi.org/10.1007/s00357-008-9003-y

Issue Date: June 2008
