Symmetry-Based Automatic Clustering

  • Chapter
Unsupervised Classification

Abstract

This chapter deals with automatic clustering techniques, in which the number of clusters need not be fixed a priori. First, some recently developed genetic algorithm-based automatic clustering techniques are described briefly. Thereafter, a recently developed point symmetry-based automatic genetic clustering technique, VGAPS, is described in detail; it uses Sym-index to compute the fitness of the chromosomes. In VGAPS clustering, a point is assigned to a cluster based on the point symmetry distance rather than the Euclidean distance when the point is indeed symmetric with respect to a center. Moreover, the use of adaptive mutation and crossover probabilities helps VGAPS clustering converge faster. The global convergence property of VGAPS clustering is also established. Experimental results show that VGAPS clustering is well suited to detecting the number of clusters and the proper partitioning in data sets whose clusters have widely varying characteristics, irrespective of their convexity, overlap, or size, as long as they possess the property of symmetry. Next, a variable string length genetic point symmetry-based fuzzy clustering technique, Fuzzy-VGAPS, is described. It utilizes the fuzzy version of Sym-index. Fuzzy-VGAPS can detect clusters of any shape (e.g., hyperspherical, linear, ellipsoidal, ring shaped) and size (a mixture of small and large clusters, i.e., clusters of unequal sizes) as long as they satisfy the property of point symmetry. As part of the experiments, a real-life application of Fuzzy-VGAPS to the automatic segmentation of magnetic resonance brain images with multiple sclerosis lesions is demonstrated.
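The assignment rule mentioned in the abstract can be illustrated with a minimal sketch. It is not the chapter's implementation; it assumes the form of the point symmetry distance commonly given in the GAPS/VGAPS literature: reflect the point about the center, take the mean distance from the reflected point to its `knear` nearest data points (the symmetry term), and multiply by the Euclidean distance to the center. The function names, the default `knear=2`, and the threshold `theta` are assumptions made for illustration.

```python
import numpy as np

def ps_distance(x, center, data, knear=2):
    """Sketch of a point symmetry distance of x with respect to center.

    Returns (d_ps, d_sym, d_e). The formula is an assumed form, not
    taken from the chapter text itself.
    """
    reflected = 2.0 * center - x                      # mirror image of x about the center
    dists = np.linalg.norm(data - reflected, axis=1)  # distances from the mirror image to every data point
    d_sym = np.sort(dists)[:knear].mean()             # mean distance to the knear nearest data points
    d_e = np.linalg.norm(x - center)                  # ordinary Euclidean distance to the center
    return d_sym * d_e, d_sym, d_e

def assign(x, centers, data, theta, knear=2):
    """Assign x to a cluster index (hypothetical helper).

    Uses the point symmetry distance when x is sufficiently symmetric
    about the best center (d_sym <= theta); otherwise falls back to the
    nearest center by Euclidean distance.
    """
    results = [ps_distance(x, c, data, knear) for c in centers]
    best = min(range(len(centers)), key=lambda k: results[k][0])
    if results[best][1] <= theta:
        return best
    return min(range(len(centers)), key=lambda k: results[k][2])
```

In this sketch, `theta` plays the role of a symmetry threshold: a point whose mirror image lands far from any data point is treated as not symmetric about that center, and the Euclidean rule applies instead.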

Copyright information

© 2013 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Bandyopadhyay, S., Saha, S. (2013). Symmetry-Based Automatic Clustering. In: Unsupervised Classification. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-32451-2_7

  • DOI: https://doi.org/10.1007/978-3-642-32451-2_7

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-32450-5

  • Online ISBN: 978-3-642-32451-2

  • eBook Packages: Computer Science, Computer Science (R0)
