Abstract
This chapter covers automatic clustering techniques, in which the number of clusters need not be fixed a priori. It first briefly describes some recently developed genetic algorithm-based automatic clustering techniques. It then describes in detail VGAPS, a recently developed point symmetry-based automatic genetic clustering technique that uses the Sym-index to compute the fitness of the chromosomes. In VGAPS clustering, a point is assigned to a cluster based on the point symmetry distance rather than the Euclidean distance, provided the point is indeed symmetric with respect to a cluster center. Moreover, adaptive mutation and crossover probabilities help VGAPS clustering converge faster. The global convergence property of VGAPS clustering is also established. Experimental results indicate that VGAPS clustering is well suited to detecting the number of clusters and the proper partitioning in data sets whose clusters have widely varying characteristics, irrespective of their convexity, overlap, or size, as long as they possess the property of symmetry. The chapter then describes Fuzzy-VGAPS, a variable string length genetic point symmetry-based fuzzy clustering technique that uses the fuzzy version of the Sym-index. Fuzzy-VGAPS can detect clusters of any shape (e.g., hyperspherical, linear, ellipsoidal, ring shaped) and size (a mixture of small and large clusters, i.e., clusters of unequal sizes) as long as they satisfy the property of point symmetry. As part of the experiments, a real-life application of Fuzzy-VGAPS to the automatic segmentation of magnetic resonance brain images with multiple sclerosis lesions is demonstrated.
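The assignment rule mentioned in the abstract can be illustrated with a minimal sketch. It is not the chapter's implementation, and it rests on assumptions the abstract does not state: the point symmetry distance is taken as the mean distance from the reflected point 2c - x to its `knear` nearest data points, multiplied by the Euclidean distance to the center, and a point counts as "indeed symmetric" when that mean is at most a threshold `theta`. The function names, `knear=2`, and the value of `theta` are hypothetical choices for illustration.

```python
import math

def sym_term(x, center, data, knear=2):
    """Mean distance from the reflection of x about center to its
    knear nearest data points (assumed form of the symmetry measure)."""
    reflected = [2 * c - xi for xi, c in zip(x, center)]
    dists = sorted(math.dist(reflected, p) for p in data)
    return sum(dists[:knear]) / knear

def ps_distance(x, center, data, knear=2):
    """Assumed point symmetry distance: symmetry term times Euclidean distance."""
    return sym_term(x, center, data, knear) * math.dist(x, center)

def assign(x, centers, data, theta, knear=2):
    """Assign x by point symmetry distance if x is symmetric enough
    with respect to the best center, otherwise by Euclidean distance."""
    best = min(range(len(centers)),
               key=lambda k: ps_distance(x, centers[k], data, knear))
    if sym_term(x, centers[best], data, knear) <= theta:
        return best
    # Fallback: plain nearest-center assignment.
    return min(range(len(centers)), key=lambda k: math.dist(x, centers[k]))

# Toy data: two small symmetric groups around (0, 0) and (5, 0).
data = [(1.0, 0.0), (-1.0, 0.0), (0.0, 1.0), (0.0, -1.0),
        (6.0, 0.0), (4.0, 0.0), (5.0, 1.0), (5.0, -1.0)]
centers = [(0.0, 0.0), (5.0, 0.0)]
labels = [assign(p, centers, data, theta=1.0) for p in data]
print(labels)
```

The threshold keeps the symmetry-based rule from attaching a point to a distant center whose reflected neighborhood merely happens to be populated; when no center passes the check, the sketch falls back to the Euclidean rule.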
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Bandyopadhyay, S., Saha, S. (2013). Symmetry-Based Automatic Clustering. In: Unsupervised Classification. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-32451-2_7
DOI: https://doi.org/10.1007/978-3-642-32451-2_7
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-32450-5
Online ISBN: 978-3-642-32451-2
eBook Packages: Computer Science, Computer Science (R0)