A DC Programming Approach for Sparse Optimal Scoring

  • Conference paper
Advances in Knowledge Discovery and Data Mining (PAKDD 2015)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 9078)

Abstract

We consider the supervised classification problem in the high-dimensional setting. High dimensionality makes most classification methods difficult to apply. We present a novel approach to sparse linear discriminant analysis (LDA) based on its optimal scoring interpretation and the zero-norm. The difficulty of handling the zero-norm is overcome by using an appropriate continuous approximation, so that the resulting problem can be formulated as a DC (Difference of Convex functions) program, for which DCA (DC Algorithms) is investigated. Computational results on both simulated data and real microarray cancer data show the efficiency of the proposed algorithm in feature selection as well as in classification.
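The abstract does not say which continuous approximation of the zero-norm is used, nor how the resulting problem is split into a difference of convex functions. As a rough illustration of the general idea only, the sketch below applies the generic DCA scheme (linearize the concave part at the current point, then solve the remaining convex problem) to a toy sparse least-squares problem with a capped-l1 surrogate of the zero-norm. The surrogate, the DC split, the proximal-gradient inner solver, the toy data, and all names (`dca`, `soft`, `lam`, `alpha`) are assumptions made for this example, not the paper's sparse optimal scoring formulation.

```python
# Illustrative DCA sketch on a toy sparse least-squares problem.
# ASSUMPTION: the capped-l1 surrogate and this DC split are chosen for
# illustration; they are not taken from the paper.
#
#   f(x) = 0.5*||Ax - b||^2 + lam * sum_i min(1, alpha*|x_i|)
#        = g(x) - h(x), with both g and h convex:
#   g(x) = 0.5*||Ax - b||^2 + lam*alpha*||x||_1
#   h(x) = lam * sum_i max(alpha*|x_i| - 1, 0)


def matvec(A, x):
    """Dense matrix-vector product for a list-of-rows matrix."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]


def objective(A, b, x, lam, alpha):
    """Value of f(x): squared-error fit plus capped-l1 penalty."""
    residual = [p - bi for p, bi in zip(matvec(A, x), b)]
    fit = 0.5 * sum(r * r for r in residual)
    penalty = lam * sum(min(1.0, alpha * abs(xi)) for xi in x)
    return fit + penalty


def soft(t, tau):
    """Soft-thresholding: proximal operator of tau*|.|."""
    if t > tau:
        return t - tau
    if t < -tau:
        return t + tau
    return 0.0


def dca(A, b, lam=0.1, alpha=2.0, outer=20, inner=200):
    m, n = len(A), len(A[0])
    # The squared Frobenius norm upper-bounds the largest eigenvalue of
    # A^T A, so 1/L is a safe step size for the inner loop.
    L = sum(a * a for row in A for a in row)
    x = [0.0] * n
    history = [objective(A, b, x, lam, alpha)]
    for _ in range(outer):
        # DCA step 1: pick a subgradient y of h at the current iterate.
        y = [
            lam * alpha * (1.0 if xi > 0 else -1.0)
            if alpha * abs(xi) > 1.0 else 0.0
            for xi in x
        ]
        # DCA step 2: approximately minimize the convex surrogate
        # g(x) - <y, x> by proximal gradient steps (warm-started at x).
        for _ in range(inner):
            residual = [p - bi for p, bi in zip(matvec(A, x), b)]
            grad = [
                sum(A[i][j] * residual[i] for i in range(m)) - y[j]
                for j in range(n)
            ]
            x = [soft(xj - gj / L, lam * alpha / L)
                 for xj, gj in zip(x, grad)]
        history.append(objective(A, b, x, lam, alpha))
    return x, history


# Toy data (hypothetical): b is generated from the sparse vector [2, 0, 0].
A = [[1.0, 0.0, 0.5], [0.0, 1.0, 0.5], [1.0, 1.0, 0.0], [0.5, 0.0, 1.0]]
b = [2.0, 0.0, 2.0, 1.0]
x_hat, history = dca(A, b)
```

Because the convex surrogate lies above `f` everywhere and touches it at the current iterate, and each inner step does not increase the surrogate, the values stored in `history` are non-increasing. DCA in general guarantees only this descent behaviour and convergence to a critical point, not a global minimum.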

Author information

Correspondence to Hoai An Le Thi.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Thi, H.A.L., Phan, D.N. (2015). A DC Programming Approach for Sparse Optimal Scoring. In: Cao, T., Lim, EP., Zhou, ZH., Ho, TB., Cheung, D., Motoda, H. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2015. Lecture Notes in Computer Science, vol 9078. Springer, Cham. https://doi.org/10.1007/978-3-319-18032-8_34

  • DOI: https://doi.org/10.1007/978-3-319-18032-8_34

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-18031-1

  • Online ISBN: 978-3-319-18032-8

  • eBook Packages: Computer Science, Computer Science (R0)
