On Measuring the Complexity of Classification Problems

  • Conference paper
Neural Information Processing (ICONIP 2015)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 9489)

Abstract

Interest in characterizing the difficulty of classification problems has been growing. Such knowledge can support, among other things, better-grounded decisions about data pre-processing and the development of new data-driven pattern recognition techniques. A variety of measures can be extracted from a training data set to estimate the intrinsic complexity of a classification problem. This paper presents some of these measures and analyzes them theoretically.

A.C. Lorena—Acknowledgements to the Brazilian Research Agencies FAPESP and CNPq.

Author information

Corresponding author

Correspondence to Ana Carolina Lorena.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Lorena, A.C., de Souto, M.C.P. (2015). On Measuring the Complexity of Classification Problems. In: Arik, S., Huang, T., Lai, W., Liu, Q. (eds) Neural Information Processing. ICONIP 2015. Lecture Notes in Computer Science, vol 9489. Springer, Cham. https://doi.org/10.1007/978-3-319-26532-2_18

  • DOI: https://doi.org/10.1007/978-3-319-26532-2_18

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-26531-5

  • Online ISBN: 978-3-319-26532-2

  • eBook Packages: Computer Science, Computer Science (R0)
