Abstract
The goal of clustering is to partition data points into clusters such that the points in the same cluster are similar. The similarity measure is therefore one of the most critical issues in clustering. In this paper, we present a novel similarity measure based on intrinsic dimension: the local intrinsic dimension of each data point is treated as a new feature describing that point, and the similarity measure combines this new feature with the original features. The main idea is that data points in the same cluster are expected to have the same intrinsic dimension as well as similar values of the traditional features. The proposed method is evaluated on several artificial data sets, and the experimental results illustrate the effectiveness of the proposed similarity measure. Moreover, natural-image segmentation results based on the proposed similarity measure suggest that intrinsic dimension is worth considering as a new feature of the data points in further applications.
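The abstract does not state how the local intrinsic dimension is estimated or how it is combined with the original features, so the following is only a minimal sketch under stated assumptions, not the authors' method. It assumes a k-nearest-neighbour maximum likelihood estimate of local dimension (in the style of Levina and Bickel) and a product of two Gaussian kernels, one on the original features and one on the estimated dimensions. The function names and the parameters `k`, `sigma_x` and `sigma_d` are hypothetical.

```python
import numpy as np

def local_intrinsic_dimension(X, k=10, eps=1e-12):
    """Per-point kNN maximum likelihood estimate of intrinsic dimension.

    Assumption: a Levina-Bickel style estimator; the abstract does not
    say which estimator the paper uses.
    """
    # Pairwise Euclidean distances (O(n^2) memory, fine for small n).
    diff = X[:, None, :] - X[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    # Sort each row; column 0 is the zero distance of a point to itself.
    knn = np.sort(dist, axis=1)[:, 1:k + 1]
    knn = np.maximum(knn, eps)  # guard against duplicate points
    # Sum over j = 1..k-1 of log(T_k / T_j), where T_j is the j-th NN distance.
    log_ratio = np.log(knn[:, -1][:, None] / knn[:, :-1])
    return (k - 1) / log_ratio.sum(axis=1)

def combined_similarity(X, dims, sigma_x=1.0, sigma_d=1.0):
    """Similarity that is high only when both the original features and
    the estimated local dimensions are close (assumed product form)."""
    sq_feat = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
    sq_dim = (dims[:, None] - dims[None, :]) ** 2
    return np.exp(-sq_feat / (2 * sigma_x ** 2)) * np.exp(-sq_dim / (2 * sigma_d ** 2))

# Toy data: a 1-D segment and a 2-D sheet, both embedded in 3-D.
rng = np.random.default_rng(0)
line = np.outer(rng.uniform(size=200), [1.0, 1.0, 1.0])
plane = np.column_stack([rng.uniform(size=(200, 2)), np.full(200, 3.0)])
X = np.vstack([line, plane])

dims = local_intrinsic_dimension(X, k=10)
S = combined_similarity(X, dims)
```

The resulting matrix `S` could be passed to any clustering method that accepts a precomputed affinity, such as spectral clustering. The product form means two points that are close in feature space but lie on structures of different dimension receive a reduced similarity, which is the intuition the abstract describes.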
Copyright information
© 2011 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Xiao, Y., Yu, J., Gong, S. (2011). Intrinsic Dimension Induced Similarity Measure for Clustering. In: Tang, J., King, I., Chen, L., Wang, J. (eds) Advanced Data Mining and Applications. ADMA 2011. Lecture Notes in Computer Science, vol 7121. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-25856-5_9
DOI: https://doi.org/10.1007/978-3-642-25856-5_9
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-25855-8
Online ISBN: 978-3-642-25856-5
eBook Packages: Computer Science, Computer Science (R0)