Reliability of Cross-Validation for SVMs in High-Dimensional, Low Sample Size Scenarios

  • Conference paper
Artificial Neural Networks - ICANN 2008 (ICANN 2008)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 5163)

Abstract

Given two-class data, a support vector machine (SVM) learns a classifier that aims for good generalisation by maximising the minimal margin between the two classes. Its performance can be evaluated using cross-validation. For low sample size data, however, high dimensionality can cause strong side effects that significantly bias the estimated performance of the classifier. On simulated data, we illustrate the effects of high dimensionality on cross-validation of both hard- and soft-margin SVMs. Based on theoretical proofs in the limit towards infinity, we derive heuristics that can easily be used to check whether or not a given data set is subject to these effects.
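To make the evaluation setting concrete, the following is a minimal sketch of cross-validation on simulated high-dimensional, low sample size data. It is not the authors' experimental setup. The sample size `n = 20`, the dimension `d = 500`, the Gaussian noise features and all function names are assumptions chosen for illustration. To stay free of dependencies, a nearest-centroid classifier stands in for the SVM, so the numbers it produces say nothing about the paper's results. Because the labels are independent of the features, no classifier can truly do better than chance here, which makes any systematic deviation of the estimate easy to spot.

```python
import random

def k_fold_indices(n, k, rng):
    """Split range(n) into k disjoint folds of near-equal size."""
    idx = list(range(n))
    rng.shuffle(idx)
    return [idx[i::k] for i in range(k)]

def centroid(rows):
    """Component-wise mean of a list of equal-length vectors."""
    d = len(rows[0])
    return [sum(r[j] for r in rows) / len(rows) for j in range(d)]

def fit_nearest_centroid(X, y):
    """Stand-in classifier (NOT an SVM): one centroid per class label."""
    return {c: centroid([x for x, t in zip(X, y) if t == c]) for c in set(y)}

def predict(model, x):
    """Return the label whose centroid is closest to x."""
    def sqdist(c):
        return sum((a - b) ** 2 for a, b in zip(x, model[c]))
    return min(model, key=sqdist)

def cross_val_accuracy(X, y, k, rng):
    """Fraction of samples classified correctly while held out."""
    correct = 0
    for fold in k_fold_indices(len(X), k, rng):
        held_out = set(fold)
        train = [i for i in range(len(X)) if i not in held_out]
        model = fit_nearest_centroid([X[i] for i in train],
                                     [y[i] for i in train])
        correct += sum(predict(model, X[i]) == y[i] for i in fold)
    return correct / len(X)

# Assumed toy setting: few samples, many pure-noise features.
rng = random.Random(0)
n, d = 20, 500
X = [[rng.gauss(0.0, 1.0) for _ in range(d)] for _ in range(n)]
y = [i % 2 for i in range(n)]  # balanced labels, independent of X

acc = cross_val_accuracy(X, y, k=n, rng=rng)  # k = n is leave-one-out
print(f"leave-one-out accuracy estimate: {acc:.2f}")
```

Setting `k=n` gives leave-one-out cross-validation; a smaller `k` gives ordinary k-fold. Re-running with different seeds or dimensions shows how much the estimate moves when the sample is small relative to the dimension.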

Editor information

Véra Kůrková, Roman Neruda, Jan Koutník

Copyright information

© 2008 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Klement, S., Madany Mamlouk, A., Martinetz, T. (2008). Reliability of Cross-Validation for SVMs in High-Dimensional, Low Sample Size Scenarios. In: Kůrková, V., Neruda, R., Koutník, J. (eds) Artificial Neural Networks - ICANN 2008. ICANN 2008. Lecture Notes in Computer Science, vol 5163. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-87536-9_5

  • DOI: https://doi.org/10.1007/978-3-540-87536-9_5

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-87535-2

  • Online ISBN: 978-3-540-87536-9

  • eBook Packages: Computer Science, Computer Science (R0)
