An Elementary Approach to the Problem of Column Selection in a Rectangular Matrix

  • Conference paper
Machine Learning, Optimization, and Big Data (MOD 2017)

Part of the book series: Lecture Notes in Computer Science ((LNISA,volume 10710))

Abstract

The problem of extracting a well-conditioned submatrix from any rectangular matrix (with, e.g., normalized columns) has been the subject of extensive research, with applications to machine learning (rank-revealing factorization, sparse solutions to least squares regression problems, clustering, \(\ldots\)) and optimisation (low-stretch spanning trees, \(\ldots\)); it is also connected with problems in functional and harmonic analysis (the Bourgain-Tzafriri restricted invertibility problem).

In this paper, we provide a deterministic algorithm which extracts a submatrix \(X_S\) from any matrix \(X\), with guaranteed individual lower and upper bounds on each singular value of \(X_S\). As an immediate by-product, we also deduce a slightly weaker version (up to a \(\log\) factor) of the Bourgain-Tzafriri theorem.
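
To make the notion of a column submatrix \(X_S\) with controlled singular values concrete, the following is a minimal sketch of a naive greedy heuristic. It is not the algorithm of the paper, whose details are not given in this abstract, and it carries no theoretical guarantee. The function names `normalize_columns` and `greedy_column_subset` are hypothetical and chosen only for illustration.

```python
import numpy as np


def normalize_columns(X):
    """Scale every column of X to unit Euclidean norm."""
    norms = np.linalg.norm(X, axis=0)
    if np.any(norms == 0):
        raise ValueError("X contains a zero column")
    return X / norms


def greedy_column_subset(X, k):
    """Illustrative greedy heuristic (an assumption, not the paper's method).

    Starting from column 0, repeatedly add the column that maximizes the
    smallest singular value of the selected submatrix X_S.
    Returns the selected indices S and the singular values of X_S.
    """
    Xn = normalize_columns(np.asarray(X, dtype=float))
    n_rows, n_cols = Xn.shape
    if not 1 <= k <= min(n_rows, n_cols):
        raise ValueError("k must lie between 1 and min(X.shape)")

    selected = [0]
    while len(selected) < k:
        best_j, best_smin = None, -1.0
        for j in range(n_cols):
            if j in selected:
                continue
            # Singular values are returned in descending order.
            s = np.linalg.svd(Xn[:, selected + [j]], compute_uv=False)
            if s[-1] > best_smin:
                best_j, best_smin = j, s[-1]
        selected.append(best_j)

    singular_values = np.linalg.svd(Xn[:, selected], compute_uv=False)
    return selected, singular_values


# Small example: column 1 duplicates column 0, so it should be avoided.
X_demo = np.array([[1.0, 1.0, 0.0, 0.0],
                   [0.0, 0.0, 1.0, 0.0],
                   [0.0, 0.0, 0.0, 1.0]])
S_demo, sv_demo = greedy_column_subset(X_demo, 3)
```

Each greedy step recomputes a full SVD for every candidate column, so this sketch is only practical for small matrices. Its purpose is to show what individual lower and upper bounds on the singular values of \(X_S\) refer to, not to reproduce the numerical efficiency discussed in the paper.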

We end the paper with a description of how our method applies to the analysis of a large data set and how its numerical efficiency compares with that of the method of Spielman and Srivastava.

Notes

  1. http://www.cad.zju.edu.cn/home/dengcai/Data/Yale/Yale_64x64.mat

References

  1. Arthur, D., Vassilvitskii, S.: k-means++: the advantages of careful seeding. In: Proceedings of the Eighteenth Annual ACM-SIAM Symposium on Discrete Algorithms, pp. 1027–1035. Society for Industrial and Applied Mathematics (2007)

  2. Avron, H., Boutsidis, C.: Faster subset selection for matrices and applications. SIAM J. Matrix Anal. Appl. 34(4), 1464–1499 (2013)

  3. Bourgain, J., Tzafriri, L.: Invertibility of “large” submatrices with applications to the geometry of Banach spaces and harmonic analysis. Israel J. Math. 57(2), 137–224 (1987)

  4. Boutsidis, C., Drineas, P., Magdon-Ismail, M.: Near-optimal column-based matrix reconstruction. SIAM J. Comput. 43(2), 687–717 (2014)

  5. Boutsidis, C., Drineas, P., Mahoney, M.: On selecting exactly k columns from a matrix (2008, in press)

  6. d’Aspremont, A., Ghaoui, L.E., Jordan, M.I., Lanckriet, G.R.: A direct formulation for sparse PCA using semidefinite programming. In: Advances in Neural Information Processing Systems, pp. 41–48 (2005)

  7. Farahat, A.K., Elgohary, A., Ghodsi, A., Kamel, M.S.: Greedy column subset selection for large-scale data sets. Knowl. Inform. Syst. 45(1), 1–34 (2015)

  8. Mallat, S.: Group invariant scattering. Commun. Pure Appl. Math. 65(10), 1331–1398 (2012)

  9. Bruna, J., Mallat, S.: Invariant scattering convolution networks. IEEE Trans. Pattern Anal. Mach. Intell. 35(8), 1872–1886 (2013)

  10. Naor, A.: Sparse quadratic forms and their geometric applications [following Batson, Spielman and Srivastava]. Séminaire Bourbaki: Vol. 2010/2011. Exposés 1027–1042. Astérisque No. 348 (2012), Exp. No. 1033, viii, 189–217

  11. Nikolov, A.: Randomized rounding for the largest simplex problem. In: Proceedings of the Forty-Seventh Annual ACM on Symposium on Theory of Computing, pp. 861–870 (2015)

  12. Nikolov, A., Singh, M.: Maximizing determinants under partition constraints. In: STOC 2016, pp. 192–201 (2016)

  13. Spielman, D.A., Srivastava, N.: An elementary proof of the restricted invertibility theorem. Israel J. Math. 190, 83–91 (2012)

  14. Tropp, J.A.: The random paving property for uniformly bounded matrices. Studia Math. 185(1), 67–82 (2008)

  15. Tropp, J.A.: Norms of random submatrices and sparse approximation. C. R. Acad. Sci. Paris, Ser. I 346, 1271–1274 (2008)

  16. Tropp, J.A.: Column subset selection, matrix factorization, and eigenvalue optimization. In: Proceedings of the Twentieth Annual ACM-SIAM Symposium on Discrete Algorithms, pp. 978–986. Society for Industrial and Applied Mathematics (2009)

  17. Vershynin, R.: John’s decompositions: selecting a large part. Israel J. Math. 122, 253–277 (2001)

  18. Youssef, P.: A note on column subset selection. Int. Math. Res. Not. IMRN 23, 6431–6447 (2014)

Author information

Corresponding author

Correspondence to Stéphane Chrétien.

Copyright information

© 2018 Springer International Publishing AG

About this paper

Cite this paper

Chrétien, S., Darses, S. (2018). An Elementary Approach to the Problem of Column Selection in a Rectangular Matrix. In: Nicosia, G., Pardalos, P., Giuffrida, G., Umeton, R. (eds) Machine Learning, Optimization, and Big Data. MOD 2017. Lecture Notes in Computer Science, vol 10710. Springer, Cham. https://doi.org/10.1007/978-3-319-72926-8_20

  • DOI: https://doi.org/10.1007/978-3-319-72926-8_20

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-72925-1

  • Online ISBN: 978-3-319-72926-8

  • eBook Packages: Computer Science, Computer Science (R0)
