The Effect of Random Projection on Local Intrinsic Dimensionality

  • Conference paper
Similarity Search and Applications (SISAP 2021)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 13058)

Abstract

Much attention has been given in the research literature to the study of distance-preserving random projections of discrete data sets, the limitations of which are established by the classical Johnson-Lindenstrauss existence lemma. In this theoretical paper, we analyze the effect of random projection on a natural measure of the local intrinsic dimensionality (LID) of smooth distance distributions in the Euclidean setting. The main contribution of the paper consists of upper and lower bounds on the LID in the vicinity of a reference point after random projection. The bounds depend only on the LID in the original data domain and the target dimension of the projection; as the difference between the target and intrinsic dimensionalities grows, these bounds converge to the LID of the original domain. The paper concludes with a brief discussion of the implications for applications in databases, machine learning and data mining.
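The setting described in the abstract can be explored numerically. The following is a minimal sketch, not the paper's analysis: it assumes a dense Gaussian projection matrix, uses the maximum-likelihood (Hill-type) estimator that is common in the LID literature, and takes the origin as the reference point. The function names `lid_mle` and `run_demo`, and all parameter values, are hypothetical choices made for illustration.

```python
import numpy as np


def lid_mle(distances, k):
    """Maximum-likelihood (Hill-type) LID estimate from the k smallest distances."""
    d = np.sort(np.asarray(distances, dtype=float))[:k]
    w = d[-1]  # radius of the k-nearest-neighbour ball around the reference point
    return -1.0 / np.mean(np.log(d / w))


def run_demo(n=20000, ambient_dim=100, intrinsic_dim=5, target_dim=50, k=200, seed=0):
    rng = np.random.default_rng(seed)

    # Uniform sample from a unit ball of dimension intrinsic_dim ...
    z = rng.standard_normal((n, intrinsic_dim))
    z /= np.linalg.norm(z, axis=1, keepdims=True)
    z *= rng.random((n, 1)) ** (1.0 / intrinsic_dim)

    # ... embedded isometrically in the ambient space via an orthonormal basis.
    basis, _ = np.linalg.qr(rng.standard_normal((ambient_dim, intrinsic_dim)))
    x = z @ basis.T

    # Assumed projection: Gaussian matrix scaled so that squared lengths
    # are preserved in expectation.
    r = rng.standard_normal((ambient_dim, target_dim)) / np.sqrt(target_dim)
    y = x @ r

    # The reference point is the origin, which the linear map sends to itself.
    lid_before = lid_mle(np.linalg.norm(x, axis=1), k)
    lid_after = lid_mle(np.linalg.norm(y, axis=1), k)
    return lid_before, lid_after


if __name__ == "__main__":
    before, after = run_demo()
    print(f"LID estimate before projection: {before:.2f}")
    print(f"LID estimate after projection:  {after:.2f}")
```

With the target dimension well above the intrinsic dimension, both estimates should land near the intrinsic dimension of the sample, which is in line with the convergence behaviour the abstract describes. Lowering `target_dim` toward `intrinsic_dim` is one way to observe how the estimate after projection starts to drift.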

Acknowledgments

Michael E. Houle acknowledges the financial support of JSPS Kakenhi Kiban (B) Research Grant 18H03296. Ken-ichi Kawarabayashi is supported by JSPS Kakenhi Research Grant JP18H05291.

Author information

Corresponding author

Correspondence to Michael E. Houle.

Copyright information

© 2021 Springer Nature Switzerland AG

About this paper

Cite this paper

Houle, M.E., Kawarabayashi, Ki. (2021). The Effect of Random Projection on Local Intrinsic Dimensionality. In: Reyes, N., et al. Similarity Search and Applications. SISAP 2021. Lecture Notes in Computer Science, vol 13058. Springer, Cham. https://doi.org/10.1007/978-3-030-89657-7_16

  • DOI: https://doi.org/10.1007/978-3-030-89657-7_16

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-89656-0

  • Online ISBN: 978-3-030-89657-7

  • eBook Packages: Computer Science (R0)
