Intrinsic Dimension Induced Similarity Measure for Clustering

  • Conference paper
Advanced Data Mining and Applications (ADMA 2011)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 7121)

Abstract

The goal of clustering is to partition data points into clusters such that the points in the same cluster are similar. The similarity measure is therefore one of the most critical issues in clustering. In this paper, we present a novel similarity measure based on intrinsic dimension: the local intrinsic dimension of each data point is treated as a new feature describing that point, which leads to a new type of similarity measure that combines the new feature with the original features. The main idea is that data points in the same cluster are expected to share the same intrinsic dimension as well as similar values of the traditional features. The proposed method is evaluated on several artificial data sets, and the experimental results illustrate the effectiveness of the proposed similarity measure. Moreover, segmentation results on natural images based on the proposed similarity measure suggest that the intrinsic dimension is worth considering as a new feature of the data points in further applications.
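
The idea in the abstract, treating a local intrinsic dimension estimate as an extra feature inside a pairwise similarity, can be illustrated with a minimal sketch. The abstract does not give the paper's actual estimator or formula, so everything here is an assumption: the estimator follows the Levina-Bickel maximum likelihood form on k nearest neighbours, the similarity is a Gaussian kernel over the original features plus a weighted squared difference of the estimated dimensions, and the names `local_intrinsic_dimension`, `combined_similarity`, `k`, `sigma`, and `weight` are hypothetical.

```python
import math


def local_intrinsic_dimension(points, k=5):
    """Estimate a local intrinsic dimension for every point.

    Assumption: uses the Levina-Bickel maximum likelihood form on the
    k nearest neighbours. Requires distinct points, and the k nearest
    distances of a point must not all be equal (the log-sum would be 0).
    """
    dims = []
    for i, p in enumerate(points):
        # Distances from p to every other point, nearest first.
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        knn = dists[:k]
        # Sum of log-ratios between the farthest kept distance and each nearer one.
        log_sum = sum(math.log(knn[-1] / d) for d in knn[:-1])
        dims.append((len(knn) - 1) / log_sum)
    return dims


def combined_similarity(points, dims, sigma=1.0, weight=1.0):
    """Gaussian similarity over original features plus the dimension feature.

    Assumption: the two terms are combined additively inside one kernel;
    `weight` (hypothetical) controls how strongly a dimension mismatch
    reduces the similarity.
    """
    n = len(points)
    sim = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            feat_sq = math.dist(points[i], points[j]) ** 2
            dim_sq = (dims[i] - dims[j]) ** 2
            sim[i][j] = math.exp(-(feat_sq + weight * dim_sq) / (2 * sigma ** 2))
    return sim


if __name__ == "__main__":
    # Toy data: six points along a line plus two points off the line.
    pts = [(0.0, 0.0), (1.0, 0.0), (3.0, 0.0), (6.0, 0.0),
           (10.0, 0.0), (15.0, 0.0), (0.0, 2.0), (1.0, 5.0)]
    dims = local_intrinsic_dimension(pts, k=3)
    sim = combined_similarity(pts, dims, sigma=2.0, weight=1.0)
    print(dims)
    print(sim[0][1])
```

The resulting matrix could be passed as a precomputed affinity to a similarity-based clustering method. Setting `weight=0.0` recovers a plain Gaussian similarity on the original features, which makes it easy to compare clusterings with and without the dimension feature.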

Copyright information

© 2011 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Xiao, Y., Yu, J., Gong, S. (2011). Intrinsic Dimension Induced Similarity Measure for Clustering. In: Tang, J., King, I., Chen, L., Wang, J. (eds) Advanced Data Mining and Applications. ADMA 2011. Lecture Notes in Computer Science, vol 7121. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-25856-5_9

  • DOI: https://doi.org/10.1007/978-3-642-25856-5_9

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-25855-8

  • Online ISBN: 978-3-642-25856-5

  • eBook Packages: Computer Science, Computer Science (R0)
