Estimation in High Dimensions: A Geometric Perspective

Chapter in the book Sampling Theory, a Renaissance

Part of the book series: Applied and Numerical Harmonic Analysis (ANHA)

Abstract

This tutorial provides an exposition of a flexible geometric framework for high-dimensional estimation problems with constraints. The tutorial develops geometric intuition about high-dimensional sets, justifies it with some results of asymptotic convex geometry, and demonstrates connections between geometric results and estimation problems. The theory is illustrated with applications to sparse recovery, matrix completion, quantization, linear and logistic regression, and generalized linear models.


Notes

  1. Partially supported by NSF grant DMS 1265782 and USAF Grant FA9550-14-1-0009.

  2. Here \(a_{n} \asymp b_{n}\) means that there exist positive absolute constants c and C such that \(c a_{n} \leq b_{n} \leq C a_{n}\) for all n.
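As a concrete illustration of this notation (the sequences are hypothetical, chosen only for the example), \(a_n = n\) and \(b_n = 2n + \sin n\) satisfy \(a_n \asymp b_n\) with absolute constants \(c = 1\) and \(C = 3\):

```python
import math

# Hypothetical example sequences: a_n = n and b_n = 2n + sin(n).
# Since 2n + sin(n) lies in [2n - 1, 2n + 1], we have n <= b_n <= 3n
# for all n >= 1, i.e. a_n ≍ b_n with c = 1 and C = 3.
def a(n: int) -> float:
    return float(n)

def b(n: int) -> float:
    return 2.0 * n + math.sin(n)

c, C = 1.0, 3.0
assert all(c * a(n) <= b(n) <= C * a(n) for n in range(1, 10_001))
```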

  3. This intuition is a good approximation to the truth, but it should be corrected. While concentration of volume tells us that the bulk is contained in a certain Euclidean ball (and even in a thin spherical shell), it is not always true that the bulk is a Euclidean ball (or shell); a counterexample is the unit cube \([-1,1]^{n}\). In fact, the cube is the worst convex set in the Dvoretzky theorem, which we are about to state.
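The concentration of volume mentioned above is easy to observe numerically. The sketch below (an illustration, not from the chapter) samples points uniformly from the cube \([-1,1]^n\); since \(\mathbb{E}\,x_i^2 = 1/3\), the Euclidean norms cluster tightly around \(\sqrt{n/3}\), i.e. the bulk of the volume lies in a thin spherical shell:

```python
import math
import random

# Monte Carlo sketch: points drawn uniformly from the cube [-1, 1]^n have
# E x_i^2 = 1/3, so ||x||_2 concentrates near sqrt(n/3) with relative
# fluctuations of order 1/sqrt(n).
random.seed(0)
n, trials = 10_000, 200
norms = []
for _ in range(trials):
    x = [random.uniform(-1.0, 1.0) for _ in range(n)]
    norms.append(math.sqrt(sum(t * t for t in x)))

expected = math.sqrt(n / 3.0)
# Every sampled norm falls within 5% of the predicted shell radius.
assert all(abs(r - expected) / expected < 0.05 for r in norms)
```

Note that this does not contradict the footnote: the shell contains the bulk of the volume, but the bulk itself need not look like a Euclidean ball.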

  4. Conclusion (1.10) is stated with the convention that \(\sup_{\boldsymbol{u}\in T_{\varepsilon}}\|\boldsymbol{u}\|_{2} = 0\) whenever \(T_{\varepsilon} = \emptyset\).

  5. We can assume T to be finite to avoid measurability complications and then proceed by approximation; see, e.g., [43, Section 2.2].

  6. The increment comparison may look better if we replace the \(L_{2}\) norm on the right-hand side by the \(\psi_{2}\) norm. Indeed, it is easy to see that \(\|G(\boldsymbol{s}) - G(\boldsymbol{t})\|_{\psi_{2}} \asymp (\mathbb{E}\|G(\boldsymbol{s}) - G(\boldsymbol{t})\|_{2}^{2})^{1/2}\).

  7. We should mention that a reverse inequality also holds: by isotropy, one has \(\mathbb{E}_{\boldsymbol{a}}\vert\left\langle \boldsymbol{a},\boldsymbol{u}\right\rangle\vert \leq (\mathbb{E}_{\boldsymbol{a}}\left\langle \boldsymbol{a},\boldsymbol{u}\right\rangle^{2})^{1/2} = \|\boldsymbol{u}\|_{2}\). However, this inequality will not be used in the proof.
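This inequality is just Jensen (or Cauchy–Schwarz) applied to \(\left\langle \boldsymbol{a},\boldsymbol{u}\right\rangle\), and a quick numerical check makes it tangible. The sketch below (illustrative, not from the chapter) uses a standard Gaussian vector, which is isotropic, so \(\mathbb{E}\left\langle \boldsymbol{a},\boldsymbol{u}\right\rangle^{2} = \|\boldsymbol{u}\|_{2}^{2}\):

```python
import math
import random

# Numerical sanity check: for an isotropic random vector a (here, independent
# standard Gaussian coordinates), E|<a,u>| <= (E<a,u>^2)^{1/2} = ||u||_2.
# For the Gaussian case, E|<a,u>| equals sqrt(2/pi) * ||u||_2 exactly.
random.seed(1)
u = [3.0, -4.0]                       # ||u||_2 = 5
norm_u = math.hypot(*u)
samples = [sum(random.gauss(0.0, 1.0) * ui for ui in u)
           for _ in range(100_000)]
mean_abs = sum(abs(s) for s in samples) / len(samples)
assert mean_abs <= norm_u             # the inequality from the footnote
```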

  8. In definition (1.28), we adopt the convention that 0∕0 = 0.

  9. The only (minor) difference with our former definition (1.3) of the mean width is that we take the supremum over S instead of \(S - S\), so \(\bar{w}(S)\) is a smaller quantity. The reason we do not need to consider \(S - S\) is that we have already subtracted \(\boldsymbol{x}\) in the definition of the descent cone.

  10. Formally, consider the singular value decomposition \(p^{-1}Y = \sum_{i} s_{i}\boldsymbol{u}_{i}\boldsymbol{v}_{i}^{\mathsf{T}}\) with nonincreasing singular values \(s_{i}\). We define \(\hat{X}\) by retaining the r leading terms of this decomposition, i.e., \(\hat{X} = \sum_{i=1}^{r} s_{i}\boldsymbol{u}_{i}\boldsymbol{v}_{i}^{\mathsf{T}}\).
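The truncation step described above can be sketched in a few lines of NumPy (function and variable names here are illustrative, not from the chapter):

```python
import numpy as np

# Rank-r truncation of the rescaled observation matrix p^{-1} Y: keep the
# r leading terms of its singular value decomposition. numpy returns the
# singular values in nonincreasing order, matching the footnote.
def truncate_rank(Y_over_p: np.ndarray, r: int) -> np.ndarray:
    U, s, Vt = np.linalg.svd(Y_over_p, full_matrices=False)
    return U[:, :r] @ np.diag(s[:r]) @ Vt[:r, :]

# Sanity check: truncating a rank-5 matrix at r = 5 recovers it exactly
# (up to floating-point error), by the Eckart-Young theorem.
rng = np.random.default_rng(0)
X = rng.standard_normal((20, 5)) @ rng.standard_normal((5, 15))  # rank 5
X_hat = truncate_rank(X, 5)
assert np.allclose(X, X_hat)
```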

  11. A high-probability version of Theorem 11.2 was proved in [59]. Namely, denoting by δ the right-hand side of (1.44), we have \(\max_{\mathcal{C}}\mathop{\mathrm{diam}}\nolimits(K \cap \mathcal{C}) \leq \delta\) with probability at least \(1 - 2\exp(-c\delta^{2}m)\), as long as \(m \geq C\delta^{-6} w(K)^{2}\). The reader will easily deduce the statement of Theorem 11.2 from this.

References

  1. R. Adamczak, R. Latala, A. Litvak, K. Oleszkiewicz, A. Pajor, N. Tomczak-Jaegermann, A short proof of Paouris’ inequality. Can. Math. Bull. 57, 3–8 (2014)

  2. A. Ai, A. Lapanowski, Y. Plan, R. Vershynin, One-bit compressed sensing with non-Gaussian measurements. Linear Algebra Appl. 441, 222–239 (2014)

  3. D. Amelunxen, M. Lotz, M. McCoy, J.A. Tropp, Living on the edge: a geometric theory of phase transitions in convex optimization. Inf. Inference 3, 224–294 (2014)

  4. S. Artstein-Avidan, A. Giannopoulos, V. Milman, Asymptotic Geometric Analysis, Part I. AMS Mathematical Surveys and Monographs (2015)

  5. F. Bach, R. Jenatton, J. Mairal, G. Obozinski, Structured sparsity through convex optimization. Stat. Sci. 27, 450–468 (2012)

  6. K. Ball, An elementary introduction to modern convex geometry, in Flavors of Geometry. Mathematical Sciences Research Institute Publications, vol. 31 (Cambridge University Press, Cambridge, 1997), pp. 1–58

  7. H. Bauschke, P. Combettes, Convex Analysis and Monotone Operator Theory in Hilbert Spaces. CMS Books in Mathematics/Ouvrages de Mathématiques de la SMC (Springer, New York, 2011)

  8. A. Ben-Tal, A. Nemirovski, Lectures on Modern Convex Optimization. Analysis, Algorithms, and Engineering Applications. MPS/SIAM Series on Optimization (Society for Industrial and Applied Mathematics (SIAM)/Mathematical Programming Society (MPS), Philadelphia, 2001)

  9. S. Boucheron, O. Bousquet, G. Lugosi, Concentration inequalities, in Advanced Lectures in Machine Learning, ed. by O. Bousquet, U. Luxburg, G. Rätsch (Springer, Berlin, 2004), pp. 208–240

  10. P. Boufounos, R. Baraniuk, 1-bit compressive sensing, in Conference on Information Sciences and Systems (CISS), March 2008 (Princeton, New Jersey, 2008)

  11. P. Bühlmann, S. van de Geer, Statistics for High-Dimensional Data. Methods, Theory and Applications. Springer Series in Statistics (Springer, Heidelberg, 2011)

  12. E. Candès, B. Recht, Exact matrix completion via convex optimization. Found. Comput. Math. 9, 717–772 (2009)

  13. E. Candès, T. Tao, The power of convex relaxation: near-optimal matrix completion. IEEE Trans. Inf. Theory 56, 2053–2080 (2010)

  14. E. Candès, X. Li, Y. Ma, J. Wright, Robust principal component analysis? J. ACM 58(3), Art. 11, 37 pp. (2011)

  15. D. Chafaï, O. Guédon, G. Lecué, A. Pajor, Interactions Between Compressed Sensing Random Matrices and High Dimensional Geometry. Panoramas et Synthèses, vol. 37 (Société Mathématique de France, Paris, 2012)

  16. V. Chandrasekaran, B. Recht, P. Parrilo, A. Willsky, The convex geometry of linear inverse problems. Found. Comput. Math. 12, 805–849 (2012)

  17. S. Chatterjee, A new perspective on least squares under convex constraint. Ann. Stat. 42, 2340–2381 (2014)

  18. S. Chen, D. Donoho, M. Saunders, Atomic decomposition by Basis Pursuit. SIAM J. Sci. Comput. 20, 33–61 (1998)

  19. M. Davenport, M. Duarte, Y. Eldar, G. Kutyniok, Introduction to compressed sensing, in Compressed Sensing (Cambridge University Press, Cambridge, 2012), pp. 1–64

  20. M. Davenport, D. Needell, M. Wakin, Signal space CoSaMP for sparse recovery with redundant dictionaries. IEEE Trans. Inf. Theory 59, 6820–6829 (2013)

  21. D. Donoho, M. Elad, Optimally sparse representation in general (nonorthogonal) dictionaries via ℓ1 minimization. Proc. Natl. Acad. Sci. USA 100, 2197–2202 (2003)

  22. D. Donoho, J. Tanner, Counting faces of randomly projected polytopes when the projection radically lowers dimension. J. Am. Math. Soc. 22, 1–53 (2009)

  23. A. Dvoretzky, A theorem on convex bodies and applications to Banach spaces. Proc. Natl. Acad. Sci. USA 45, 223–226 (1959)

  24. A. Dvoretzky, Some results on convex bodies and Banach spaces, in Proceedings of the International Symposium on Linear Spaces (Jerusalem, 1961), pp. 123–161

  25. X. Fernique, Régularité des trajectoires des fonctions aléatoires gaussiennes, in École d’Été de Probabilités de Saint-Flour, IV-1974. Lecture Notes in Mathematics, vol. 480 (Springer, Berlin, 1975), pp. 1–96

  26. S. Foucart, H. Rauhut, A Mathematical Introduction to Compressive Sensing. Applied and Numerical Harmonic Analysis (Birkhäuser/Springer, New York, 2013)

  27. A. Giannopoulos, V. Milman, How small can the intersection of a few rotations of a symmetric convex body be? C. R. Acad. Sci. Paris Sér. I Math. 325, 389–394 (1997)

  28. A. Giannopoulos, V. Milman, On the diameter of proportional sections of a symmetric convex body. Int. Math. Res. Not. 1, 5–19 (1997)

  29. A. Giannopoulos, V.D. Milman, Mean width and diameter of proportional sections of a symmetric convex body. J. Reine Angew. Math. 497, 113–139 (1998)

  30. A. Giannopoulos, V. Milman, Asymptotic convex geometry: short overview, in Different Faces of Geometry. International Mathematical Series (NY), vol. 3 (Kluwer/Plenum, New York, 2004), pp. 87–162

  31. A. Giannopoulos, V.D. Milman, Asymptotic formulas for the diameter of sections of symmetric convex bodies. J. Funct. Anal. 223, 86–108 (2005)

  32. A. Giannopoulos, S. Brazitikos, P. Valettas, B.-H. Vritsiou, Geometry of Isotropic Convex Bodies. Mathematical Surveys and Monographs, vol. 196 (American Mathematical Society, Providence, 2014)

  33. Y. Gordon, On Milman’s inequality and random subspaces which escape through a mesh in \(\mathbb{R}^{n}\), in Geometric Aspects of Functional Analysis. Israel Seminar 1986–1987. Lecture Notes in Mathematics, vol. 1317 (Springer, Berlin, 1988), pp. 84–106

  34. D. Gross, Recovering low-rank matrices from few coefficients in any basis. IEEE Trans. Inf. Theory 57, 1548–1566 (2011)

  35. O. Guédon, E. Milman, Interpolating thin-shell and sharp large-deviation estimates for isotropic log-concave measures. Geom. Funct. Anal. 21, 1043–1068 (2011)

  36. L. Jacques, J. Laska, P. Boufounos, R. Baraniuk, Robust 1-bit compressive sensing via binary stable embeddings of sparse vectors. IEEE Trans. Inf. Theory 59(4), 2082–2102 (2013)

  37. R. Keshavan, A. Montanari, S. Oh, Matrix completion from a few entries. IEEE Trans. Inf. Theory 56, 2980–2998 (2010)

  38. B. Klartag, Power-law estimates for the central limit theorem for convex sets. J. Funct. Anal. 245, 284–310 (2007)

  39. G. Kutyniok, Theory and applications of compressed sensing. GAMM-Mitt. 36, 79–101 (2013)

  40. G. Lecué, S. Mendelson, Sparse recovery under weak moment assumptions. J. Eur. Math. Soc. (2015, to appear)

  41. G. Lecué, S. Mendelson, Necessary moment conditions for exact reconstruction via Basis Pursuit (submitted)

  42. M. Ledoux, The Concentration of Measure Phenomenon. Mathematical Surveys and Monographs, vol. 89 (American Mathematical Society, Providence, 2001)

  43. M. Ledoux, M. Talagrand, Probability in Banach Spaces. Isoperimetry and Processes [Reprint of the 1991 Edition]. Classics in Mathematics (Springer, Berlin, 2011)

  44. S. Mendelson, A few notes on statistical learning theory, in Advanced Lectures in Machine Learning, ed. by S. Mendelson, A.J. Smola. Lecture Notes in Computer Science, vol. 2600 (Springer, Berlin, 2003), pp. 1–40

  45. S. Mendelson, Geometric parameters in learning theory, in Geometric Aspects of Functional Analysis. Lecture Notes in Mathematics, vol. 1850 (Springer, Berlin, 2004), pp. 193–235

  46. S. Mendelson, A remark on the diameter of random sections of convex bodies, in Geometric Aspects of Functional Analysis (GAFA Seminar Notes). Lecture Notes in Mathematics, vol. 2116 (2014), pp. 395–404

  47. S. Mendelson, A. Pajor, N. Tomczak-Jaegermann, Reconstruction and subgaussian operators in asymptotic geometric analysis. Geom. Funct. Anal. 17, 1248–1282 (2007)

  48. V. Milman, New proof of the theorem of Dvoretzky on sections of convex bodies. Funct. Anal. Appl. 5, 28–37 (1971)

  49. V. Milman, Geometrical inequalities and mixed volumes in the local theory of Banach spaces. Astérisque 131, 373–400 (1985)

  50. V. Milman, Random Subspaces of Proportional Dimension of Finite Dimensional Normed Spaces: Approach Through the Isoperimetric Inequality. Lecture Notes in Mathematics, vol. 1166 (Springer, 1985), pp. 106–115

  51. V. Milman, Surprising geometric phenomena in high-dimensional convexity theory, in European Congress of Mathematics, vol. II (Budapest, 1996). Progress in Mathematics, vol. 169 (Birkhäuser, Basel, 1998), pp. 73–91

  52. V. Milman, G. Schechtman, Asymptotic Theory of Finite-Dimensional Normed Spaces. With an Appendix by M. Gromov. Lecture Notes in Mathematics, vol. 1200 (Springer, Berlin, 1986)

  53. S. Oymak, B. Hassibi, New null space results and recovery thresholds for matrix rank minimization. Available at arxiv.org/abs/1011.6326 (2010)

  54. G. Paouris, Concentration of mass on convex bodies. Geom. Funct. Anal. 16, 1021–1049 (2006)

  55. A. Pajor, N. Tomczak-Jaegermann, Subspaces of small codimension of finite dimensional Banach spaces. Proc. Am. Math. Soc. 97, 637–642 (1986)

  56. G. Pisier, The Volume of Convex Bodies and Banach Space Geometry. Cambridge Tracts in Mathematics, vol. 94 (Cambridge University Press, Cambridge, 1989)

  57. Y. Plan, R. Vershynin, One-bit compressed sensing by linear programming. Commun. Pure Appl. Math. 66, 1275–1297 (2013)

  58. Y. Plan, R. Vershynin, Robust 1-bit compressed sensing and sparse logistic regression: a convex programming approach. IEEE Trans. Inf. Theory 59, 482–494 (2013)

  59. Y. Plan, R. Vershynin, Dimension reduction by random hyperplane tessellations. Discret. Comput. Geom. 51, 438–461 (2014)

  60. Y. Plan, R. Vershynin, E. Yudovina, High-dimensional estimation with geometric constraints (submitted) [arXiv:1404.3749]

  61. N. Rao, B. Recht, R. Nowak, Tight measurement bounds for exact recovery of structured sparse signals, in Proceedings of AISTATS (2012)

  62. H. Rauhut, K. Schnass, P. Vandergheynst, Compressed sensing and redundant dictionaries. IEEE Trans. Inf. Theory 54, 2210–2219 (2008)

  63. B. Recht, A simpler approach to matrix completion. J. Mach. Learn. Res. 12, 3413–3430 (2011)

  64. R. Rockafellar, Convex Analysis. Princeton Mathematical Series, vol. 28 (Princeton University Press, Princeton, 1970)

  65. M. Rudelson, R. Vershynin, Combinatorics of random processes and sections of convex bodies. Ann. Math. 164, 603–648 (2006)

  66. M. Rudelson, R. Vershynin, On sparse reconstruction from Fourier and Gaussian measurements. Commun. Pure Appl. Math. 61, 1025–1045 (2008)

  67. Y. Seginer, The expected norm of random matrices. Comb. Probab. Comput. 9, 149–166 (2000)

  68. N. Srebro, N. Alon, T. Jaakkola, Generalization error bounds for collaborative prediction with low-rank matrices, in Proceedings of Advances in Neural Information Processing Systems (NIPS), vol. 17 (2005)

  69. M. Stojnic, Various thresholds for ℓ1-optimization in compressed sensing (2009) [arXiv:0907.3666]

  70. M. Talagrand, Regularity of Gaussian processes. Acta Math. 159, 99–149 (1987)

  71. M. Talagrand, The Generic Chaining. Upper and Lower Bounds of Stochastic Processes. Springer Monographs in Mathematics (Springer, Berlin, 2005)

  72. J. Tropp, Convex recovery of a structured signal from independent random linear measurements, in Sampling Theory, a Renaissance (to appear)

  73. R. Vershynin, Introduction to the non-asymptotic analysis of random matrices, in Compressed Sensing (Cambridge University Press, Cambridge, 2012), pp. 210–268

  74. M. Wainwright, Structured regularizers for high-dimensional problems: statistical and computational issues. Ann. Rev. Stat. Appl. 1, 233–253 (2014)


Correspondence to Roman Vershynin.


Copyright information

© 2015 Springer International Publishing Switzerland

Cite this chapter

Vershynin, R. (2015). Estimation in High Dimensions: A Geometric Perspective. In: Pfander, G. (ed.) Sampling Theory, a Renaissance. Applied and Numerical Harmonic Analysis. Birkhäuser, Cham. https://doi.org/10.1007/978-3-319-19749-4_1
