Abstract

This paper proposes a new data clustering algorithm based on data depth. In the proposed algorithm, the centroids of the K clusters are calculated using the Mahalanobis data depth. The algorithm, called the K-Data Depth Based Clustering Algorithm (K-DBCA), is evaluated in R on datasets from the mlbench package and from the UCI Machine Learning Repository; it yields good clustering results and is robust to outliers. In addition, it is invariant to affine transformations, and it is also tested on face recognition, where it yields better accuracy.
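The abstract does not spell out the update rule, so the following is only a minimal sketch of one plausible reading, not the authors' implementation. It assumes the standard Mahalanobis depth, 1 / (1 + (x - mean)^T S^{-1} (x - mean)), with the sample mean and sample covariance S, and it assumes that each centroid is replaced by the deepest observation of its cluster inside a k-means-style loop. The function names, the restriction to 2-D points, the Euclidean assignment step, and the fixed iteration count are all hypothetical choices made for illustration.

```python
from statistics import fmean


def mahalanobis_depth(x, points):
    """Mahalanobis depth of a 2-D point x with respect to a 2-D sample."""
    n = len(points)
    mx = fmean(p[0] for p in points)
    my = fmean(p[1] for p in points)
    # Entries of the sample covariance matrix (n - 1 denominator).
    sxx = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    det = sxx * syy - sxy ** 2
    if det <= 0:
        raise ValueError("covariance matrix is singular")
    dx, dy = x[0] - mx, x[1] - my
    # Squared Mahalanobis distance via the closed-form 2x2 inverse.
    d2 = (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det
    return 1.0 / (1.0 + d2)


def deepest_point(points):
    """Return the observation with the largest Mahalanobis depth."""
    return max(points, key=lambda p: mahalanobis_depth(p, points))


def squared_distance(a, b):
    return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2


def depth_based_clustering(points, centroids, iterations=10):
    """Assumed loop: assign to nearest centroid, then move each centroid
    to the deepest point of its cluster."""
    centroids = list(centroids)
    clusters = [[] for _ in centroids]
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: squared_distance(p, centroids[i]))
            clusters[nearest].append(p)
        # Clusters with fewer than 3 points keep their previous centroid,
        # because their 2-D covariance matrix would be singular.
        centroids = [deepest_point(c) if len(c) >= 3 else centroids[i]
                     for i, c in enumerate(clusters)]
    return centroids, clusters
```

Under these assumptions each centroid is always an actual observation rather than a coordinate-wise mean. Calling `depth_based_clustering(points, initial_centroids)` returns the final centroids together with the points assigned to each of them. The sketch recomputes the mean and covariance for every depth evaluation, which keeps it short at the cost of quadratic work per cluster.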

Corresponding author

Correspondence to Ishwar Baidari.

Copyright information

© 2019 Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Baidari, I., Patil, C. (2019). K-Data Depth Based Clustering Algorithm. In: Verma, N., Ghosh, A. (eds) Computational Intelligence: Theories, Applications and Future Directions - Volume I. Advances in Intelligent Systems and Computing, vol 798. Springer, Singapore. https://doi.org/10.1007/978-981-13-1132-1_2
