Abstract
We present a density-based clustering method. The clusters are determined by splitting a similarity graph of the data into connected components. The splitting is accomplished by removing vertices of the graph at which an estimated density function of the data evaluates to values below a threshold. The density function is approximated on a sparse grid in order to make the method feasible in higher-dimensional settings and scalable in the number of data points. With benchmark examples we show that our method is competitive with other modern clustering methods. Furthermore, we consider a real-world example where we cluster nodes of a finite element model of a Chevrolet pick-up truck with respect to the displacements of the nodes during a frontal crash.
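The pipeline described in the abstract (estimate a density, remove low-density vertices from a similarity graph, read clusters off the connected components) can be illustrated with a minimal sketch. It rests on several assumptions that are not taken from the paper: a plain Gaussian kernel density estimate stands in for the sparse-grid density approximation, the similarity graph is a brute-force symmetric k-nearest-neighbour graph, and the function names (`gaussian_kde`, `knn_graph`, `density_clusters`) and all parameter values are hypothetical and chosen only for this toy data set.

```python
import math
from collections import deque


def gaussian_kde(points, bandwidth):
    """Return a Gaussian kernel density estimate built from `points`.

    Assumption: this is only a stand-in for the sparse-grid density
    approximation mentioned in the abstract.
    """
    dim = len(points[0])
    norm = len(points) * (math.sqrt(2.0 * math.pi) * bandwidth) ** dim

    def density(x):
        total = 0.0
        for p in points:
            sq_dist = sum((a - b) ** 2 for a, b in zip(x, p))
            total += math.exp(-sq_dist / (2.0 * bandwidth ** 2))
        return total / norm

    return density


def knn_graph(points, k):
    """Build a symmetric k-nearest-neighbour graph as a list of adjacency sets."""
    n = len(points)
    adj = [set() for _ in range(n)]
    for i in range(n):
        others = [j for j in range(n) if j != i]
        others.sort(key=lambda j: math.dist(points[i], points[j]))
        for j in others[:k]:
            adj[i].add(j)
            adj[j].add(i)
    return adj


def density_clusters(points, k, bandwidth, threshold):
    """Return one label per point; -1 marks a vertex removed for low density."""
    density = gaussian_kde(points, bandwidth)
    adj = knn_graph(points, k)
    keep = [density(p) >= threshold for p in points]

    labels = [-1] * len(points)
    next_label = 0
    for start in range(len(points)):
        if not keep[start] or labels[start] != -1:
            continue
        # Breadth-first search over the vertices that survived the threshold.
        labels[start] = next_label
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if keep[v] and labels[v] == -1:
                    labels[v] = next_label
                    queue.append(v)
        next_label += 1
    return labels


# Toy data: two tight groups plus one isolated point lying between them.
points = [
    (0.0, 0.0), (0.2, 0.0), (0.0, 0.2), (0.2, 0.2), (0.1, 0.1),
    (3.0, 0.0), (3.2, 0.0), (3.0, 0.2), (3.2, 0.2), (3.1, 0.1),
    (1.6, 0.1),
]
labels = density_clusters(points, k=4, bandwidth=0.3, threshold=0.3)
print(labels)
```

In this toy setup the isolated middle point is linked to both groups in the similarity graph, so without a threshold the graph has a single connected component. Once the low-density vertex is removed, the two groups separate. The cost of the brute-force graph and the kernel sum grows quadratically with the number of points, which is the kind of scaling problem the grid-based density approximation in the abstract is meant to avoid.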
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Peherstorfer, B., Pflüger, D., Bungartz, H.-J. (2012). Clustering Based on Density Estimation with Sparse Grids. In: Glimm, B., Krüger, A. (eds) KI 2012: Advances in Artificial Intelligence. KI 2012. Lecture Notes in Computer Science, vol 7526. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-33347-7_12
DOI: https://doi.org/10.1007/978-3-642-33347-7_12
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-33346-0
Online ISBN: 978-3-642-33347-7
eBook Packages: Computer Science (R0)