Abstract
In order to improve the performance of a clustering on a data set, a number of primary partitions are generated and stored in an ensemble and their aggregated consensus partition is used as their clustering. It is widely accepted that the consensus partition outperforms the primary partitions. In this paper, an ensemble clustering method called multi-level consensus clustering (MLCC) is proposed. To construct the MLCC, a cluster–cluster similarity matrix which is achieved by an innovative similarity metric is first generated. The mentioned cluster–cluster similarity matrix is based on a multi-level similarity metric. In fact, it can be computed in a new defined multi-level space. Then, a point–point similarity matrix which is boosted using the mentioned cluster–cluster similarity matrix is generated. The new consensus function applies an average linkage hierarchical clusterer algorithm on the mentioned point–point similarity matrix to make consensus partition. MLCC is better than traditional clustering ensembles and simple versions of clustering ensembles on traditional cluster–cluster similarity matrix. Its computational cost is not very bad too. Accuracy and robustness of the proposed method are compared with those of the modern clustering algorithms through the experimental tests. Also, time analysis is presented in the experimental results.
Similar content being viewed by others
References
Abbasi S, Nejatian S et al (2019) Clustering ensemble selection considering quality and diversity. Artif Intell Rev 52:1311–1340
Akbari E, Mohamed Dahlan H, Ibrahim R, Alizadeh H (2015) Hierarchical cluster ensemble selection. Eng Appl AI 39:146–156
AleAhmad A, Amiri H, Darrudi E, Rahgozar M, Oroumchian F (2009) Hamshahri: a standard Persian text collection. J Knowl-Based Syst 22(5):382–387
Alishvandi H, Gouraki GH, Parvin H (2016) An enhanced dynamic detection of possible invariants based on best permutation of test cases. Comput Syst Sci Eng 31(1):53–61
Alizadeh H, Minaei-Bidgoli B, Parvin H (2011a) A new criterion for clusters validation. In: Artificial intelligence applications and innovations (AIAI 2011), IFIP, Springer, Heidelberg, Part I, pp 240–246
Alizadeh H, Minaei-Bidgoli B, Parvin H, Moshki M (2011b) An asymmetric criterion for cluster validation, developing concepts in applied intelligence. Stud Comput Intell 363:1–14
Alizadeh H, Minaei-Bidgoli B, Parvin H (2013) Optimizing fuzzy cluster ensemble in string representation. Int J Pattern Recognit Artif Intell 27(2):1350005
Alizadeh H, Minaei-Bidgoli B, Parvin H (2014a) To improve the quality of cluster ensembles by selecting a subset of base clusters. J Exp Theor Artif Intell 26(1):127–150
Alizadeh H, Minaei-Bidgoli B, Parvin H (2014b) Cluster ensemble selection based on a new cluster stability measure. Intell Data Anal 18(3):389–408
Alizadeh H, Yousefnezhad M, Minaei-Bidgoli B (2015) Wisdom of crowds cluster ensemble. Intell Data Anal 19(3):485–503
Alqurashi T, Wang W (2014) Object-neighborhood clustering ensemble method. In Intelligent data engineering and automated learning (IDEAL), Springer, pp 142–149
Alqurashi T, Wang W (2015) A new consensus function based on dual-similarity measurements for clustering ensemble. In: International conference on data science and advanced analytics (DSAA), IEEE/ACM, pp 149–155
Ayad HG, Kamel MS (2008) Cumulative Voting Consensus Method for Partitions with a Variable Number of Clusters. IEEE Trans Pattern Anal Mach Intell 30(1):160–173
Bagherinia A, Minaei-Bidgoli B, Hossinzadeh M, Parvin H (2019) Elite fuzzy clustering ensemble based on clustering diversity and quality measures. Appl Intell 49(5):1724–1747
Bai L, Cheng X, Liang J, Guo Y (2017) Fast graph clustering with a new description model for community detection. Inf Sci 388–389:37–47
Breiman L (1996) Bagging predictors. Mach Learn 24:123–140
Dimitriadou E, Weingessel A, Hornik K (2002) A combination scheme for fuzzy clustering. Int J Pattern Recognit Artif Intell 16(07):901–912
Domeniconi C, Al-Razgan M (2009) Weighted cluster ensembles: methods and analysis. ACM Trans Knowl Disc Data (TKDD) 2(4):1–42
Dueck D (2009) Affinity propagation: clustering data by passing messages. Ph.D. dissertation, University of Toronto
Faceli K, Marcilio CP, Souto D (2006) Multi-objective clustering ensemble. In: Proceedings of the sixth international conference on hybrid intelligent systems
X. Z. Fern and C. E. Brodley, “Random projection for high dimensional data clustering: A cluster ensemble approach”, In: Proceedings of the 20th International Conference on Machine Learning, (2003), pp. 186–193.
Fern XZ, Brodley CE (2004) Solving cluster ensemble problems by bipartite graph partitioning. In: Proceedings of the 21st international conference on machine learning, ACM, p 36
Franek L, Jiang X (2014) Ensemble clustering by means of clustering embedding in vector spaces. Pattern Recogn 47(2):833–842
Fred A, Jain AK (2002) Data clustering using evidence accumulation. In: Intl. conf. on pattern recognition, ICPR02, Quebec City, pp 276–280
Fred A, Jain AK (2005) Combining multiple clustering’s using evidence accumulation. IEEE Trans Pattern Anal Mach Intell 27(6):835–850
Freund Y, Schapire RE (1995) A decision-theoretic generalization of on-line learning and an application to boosting. Comput Learn Theory 55:119–139
Friedman JH (2001) Greedy function approximation: a gradient boosting machine. Ann Stat 29(5):1189–1232
Ghaemi R, ben Sulaiman N, Ibrahim H, Mustapha N (2011) A review: accuracy optimization in clustering ensembles using genetic algorithms. Artif Intell Rev 35(4):287–318
Ghosh J, Acharya A (2011) Cluster ensembles. Data Min Knowl Disc 1(4):305–315
Gionis A, Mannila H, Tsaparas P (2007) Clustering aggregation. ACM Trans Knowl Discov Data 1(1):4
Hanczar B, Nadif M (2012) Ensemble methods for biclustering tasks. Pattern Recognit 45(11):3938–3949
Ho TK (1995) Random decision forests. In: Proceedings of 3rd international conference on document analysis and recognition, vol. 1, pp 278–282. https://doi.org/10.1109/ICDAR.1995.598994
Hong Y, Kwong S, Chang Y, Ren Q (2008) Unsupervised feature selection using clustering ensembles and population based incremental learning algorithm. Pattern Recogn 41(9):2742–2756
Hosseinpoor MJ, Parvin H, Nejatian S, Rezaie V (2019) Gene regulatory elements extraction in breast cancer by Hi-C data using a meta-heuristic method. Russ J Genet 55(9):1152–1164
Huang D, Lai JH, Wang CD (2015) Combining multiple clusterings via crowd agreement estimation and multi-granularity link analysis. Neurocomputing 170:240–250
Huang D, Lai J, Wang CD (2016) Ensemble clustering using factor graph. Pattern Recogn 50:131–142
Huang D, Lai J, Wang CD (2016b) Robust ensemble clustering using probability trajectories. The IEEE Trans Knowl Data Eng
Huang D, Wang CD, Lai JH (2017) Locally weighted ensemble clustering. IEEE Trans Cybern 99:1–14. https://doi.org/10.1109/TCYB.2017.2702343
N. Iam-On, T. Boongoen and S. M. Garrett, “Refining Pairwise Similarity Matrix for Cluster Ensemble Problem with Cluster Relations”, Discovery Science, (2008), 222–233.
Iam-On N, Boongoen T, Garrett S (2010) LCE: a link-based cluster ensemble method for improved gene expression data analysis. Bioinformatics 26(12):1513–1519
Iam-On N, Boongoen T, Garrett S, Price C (2011) A link based approach to the cluster ensemble problem. IEEE Trans Pattern Anal Mach Intell 33(12):2396–2409
Iam-On N, Boongeon T, Garrett S, Price C (2012) A link based cluster ensemble approach for categorical data clustering. IEEE Trans Knowl Data Eng 24(3):413–425
Jamalinia H, Khalouei S, Rezaie V, Nejatian S, Bagheri-Fard K, Parvin H (2018) Diverse classifier ensemble creation based on heuristic dataset modification. J Appl Stat 45(7):1209–1226
Jenghara MM, Ebrahimpour-Komleh H, Parvin H (2018a) Dynamic protein–protein interaction networks construction using firefly algorithm. Pattern Anal Appl 21(4):1067–1081
Jenghara MM, Ebrahimpour-Komleh H, Rezaie V, Nejatian S, Parvin H, Syed-Yusof SK (2018b) Imputing missing value through ensemble concept based on statistical measures. Knowl Inf Syst 56(1):123–139
Jiang Y, Chung FL, Wang S, Deng Z, Wang J, Qian P (2015) Collaborative fuzzy clustering from multiple weighted views. IEEE Trans Cybern 45(4):688–701
Mimaroglu S, Aksehirli E (2012) DICLENS: Divisive clustering ensemble with automatic cluster number. IEEE/ACM Trans Comput Biol Bioinf 9(2):408–420
Minaei-Bidgoli B, Topchy A, Punch WF (2004) Ensembles of partitions via data resampling. In: Intl. conf. on information technology, ITCC 04, Las Vegas, pp 188–192
Minaei-Bidgoli B, Parvin H, Alinejad-Rokny H, Alizadeh H, Punch WE (2014) Effects of resampling method and adaptation on clustering ensemble efficacy. Artif Intell Rev 41(1):27–48
Mirzaei A, Rahmati M (2010) A Novel hierarchical-clustering-combination scheme based on fuzzy-similarity relations. IEEE Trans Fuzzy Syst 18(1):27–39
Mojarad M, Parvin H, Nejatian S, Rezaie V (2019a) Consensus function based on clusters clustering and iterative fusion of base clusters. Int J Uncertain Fuzziness Knowl-Based Syst 27(1):97–120
Mojarad M, Nejatian S, Parvin H, Mohammadpoor M (2019b) A fuzzy clustering ensemble based on cluster clustering and iterative Fusion of base clusters. Appl Intell 49(7):2567–2581
Moradi M, Nejatian S, Parvin H, Rezaie V (2018) CMCABC: Clustering and memory-based chaotic artificial bee colony dynamic optimization algorithm. Int J Inf Technol Decis Mak 17(04):1007–1046
Naldi MC, De Carvalho ACM, Campello RJ (2013) Cluster ensemble selection based on relative validity indexes. Data Min Knowl Disc 27(2):259–289
Nazari A, Dehghan A, Nejatian S, Rezaie V, Parvin H (2019) A comprehensive study of clustering ensemble weighting based on cluster quality and diversity. Pattern Anal Appl 22(1):133–145
Nejatian S, Parvin H, Faraji E (2018) Using sub-sampling and ensemble clustering techniques to improve performance of imbalanced classification. Neurocomputing 276:55–66
Nejatian S, Rezaie V, Parvin H, Pirbonyeh M, Bagherifard K, Yusof SKS (2019) An innovative linear unsupervised space adjustment by keeping low-level spatial data structure. Knowl Inf Syst 59(2):437–464
Newman CBDJ, Hettich SS, Merz C (1998) UCI repository of machine learning databases. http://www.ics.uci.edu/˜mlearn/MLSummary.html.
Omidvar MN, Nejatian S, Parvin H, Rezaie V (2018) A new natural-inspired continuous optimization approach, Journal of Intelligent & Fuzzy Systems, 1–17,
Partabian J, Rafe V, Parvin H, Nejatian S (2020) An approach based on knowledge exploration for state space management in checking reachability of complex software systems. Soft Comput 24(10):7181–7196
H. Parvin, B. Minaei-Bidgoli, A clustering ensemble framework based on elite selection of weighted clusters, Advances in Data Analysis and Classification (2013) 1–28.
Parvin H, Minaei-Bidgoli B (2015) A clustering ensemble framework based on selection of fuzzy weighted clusters in a locally adaptive clustering algorithm. Pattern Anal Appl 18(1):87–112
Parvin H, Beigi A, Mozayani N (2012) A clustering ensemble learning method based on the ant colony clustering algorithm. Int J Appl Comput Math 11(2):286–302
Parvin H, Minaei-Bidgoli B, Alinejad-Rokny H, Punch WF (2013) Data weighing mechanisms for clustering ensembles. Comput Electr Eng 39(5):1433–1450
Parvin H, Nejatian S, Mohamadpour M (2018) Explicit memory based ABC with a clustering strategy for updating and retrieval of memory in dynamic environments. Appl Intell 48(11):4317–4337
Pirbonyeh A, Rezaie V, Parvin H, Nejatian S, Mehrabi M (2019) A linear unsupervised transfer learning by preservation of cluster-and-neighborhood data organization. Pattern Anal Appl 22(3):1149–1160
Rafiee G, Dlay SS, Woo WL (2013) Region-of-interest extraction in low depth of field images using ensemble clustering and difference of Gaussian approaches. Pattern Recognit 46(10):2685–2699
Rashidi F, Nejatian S, Parvin H, Rezaie V (2019) Diversity based cluster weighting in cluster ensemble: an information theory approach. Artif Intell Rev 52(2):1341–1368
Ren Y, Zhang G, Domeniconi C, Yu G (2013) Weighted object ensemble clustering. In Proceedings of the IEEE 13th international conference on data mining (ICDM), IEEE, pp 627–636
Roth V, Lange T, Braun M, Buhmann J (2002) A resampling approach to cluster validation. Intl. conf. on computational statistics, COMPSTAT
Shabaniyan T, Parsaei H, Aminsharifi A, Movahedi MM, Jahromi AT, Pouyesh S, Parvin H (2019) An artificial intelligence-based clinical decision support system for large kidney stone treatment. Australas Phys Eng Sci Med 42(3):771–779
Shahriari A, Parvin H, Monajati A (2015) Exploring weights of hierarchical and equivalency relationship in general Persian texts. EANN Workshops 7(1):7
Soto V, Garcia-Moratilla S, Martinez-Munoz G, Hernandez- Lobato D, Suarez A (2014) A double pruning scheme for boosting ensembles. IEEE Trans Cybern 44(12):2682–2695
Strehl A, Ghosh J (2000) Value-based customer grouping from large retail data sets. In AeroSense, International Society for Optics and Photonics, pp 33–42
Strehl A, Ghosh J (2003) Cluster ensembles—a knowledge reuse framework for multiple partitions. J Mach Learn Res 3:583–617
Szetoa PM, Parvin H, Mahmoudi MR, Tuan BA, Pho KH (2020) Deep neural network as deep feature learner. J Intell Fuzzy Syst. https://doi.org/10.3233/JIFS-191292
Topchy AP, Jain AK, Punch WF (2003) Combining multiple weak clusterings. In: IEEE international conference on data mining, pp 331–338
Topchy A, Jain AK, Punch W (2005) A mixture model of clustering ensembles. Proc SIAM Int Conf Data Min, Citeseer 27(12):1866–1881
N. X. Vinh and M. E. Houle, “A set correlation model for partitional clustering”, In: Advances in Knowledge Discovery and Data Mining, Springer, (2010) pp. 4–15.
Yang Y, Jiang J (2016) Hybrid sampling-based clustering ensemble with global and local constitutions. IEEE Trans Neural Netw Learn Syst 27(5):952–965
Yasrebi M, Eskandar-Baghban A, Parvin H, Mohammadpour M (2018) Optimisation inspiring from behaviour of raining in nature: droplet optimisation algorithm. Int J Bio-Inspired Comput 12(3):152–163
Yi J, Yang T, Jin R, Jain AK, Mahdavi M (2012) Robust ensemble clustering by matrix completion. In: Proceedings of the IEEE 12th international conference on data mining (ICDM), IEEE, pp 1176–1181
Yousefnezhad M, Huang SJ, Zhang D (2018) WoCE: a framework for clustering ensemble by exploiting the wisdom of crowds theory. IEEE Trans Cybernetics 48(2):486–499
Yu Z, Wong HS, You J, Yang Q, Liao H (2011) Knowledge based cluster ensemble for cancer discovery from biomolecular data. IEEE Trans Nanobiosci 10(2):76–85
Yu Z, You J, Wong HS, Han G (2012) From cluster ensemble to structure ensemble. Inf Sci 198:81–99
Yu Z, Chen H, You J, Han G, Li L (2013) Hybrid fuzzy cluster ensemble framework for tumor clustering from biomolecular data. IEEE/ACM Trans Comput Biol Bioinf 10(3):657–670
Yu Z, Li L, Liu J, Han G (2015) Hybrid Adaptive Classifier Ensemble. IEEE Transactions on Cybernetics 45(2):177–190
Yu Z, Zhu X, Wong HS, You J, Zhang J, Han G (2016a) Distribution-based cluster structure selection. IEEE Trans Cybern 99:1–14. https://doi.org/10.1109/TCYB.2016.2569529
Yu Z, Chen H, Liu J, You J, Leung H, Han G (2016b) Hybrid k-nearest neighbor classifier. IEEE Trans Cybern 46(6):1263–1275
Yu Z, Lu Y, Zhang J, You J, Wong HS, Wang Y, Han G (2017) Progressive semisupervised learning of multiple classifiers. IEEE Trans Cybern 99:1–14
Zhang S, Wong HS, Shen Y (2012) Generalized adjusted rand indices for cluster ensembles. Pattern Recognit 45(6):2214–2226
Zhao X, Liang J, Dang C (2017) Clustering ensemble selection for categorical data based on internal validity indices. Pattern Recognit 69:150–168
Zhong C, Yue X, Zhang Z, Lei J (2015) A clustering ensemble: two-level-refined co-association matrix with path-based transformation. Pattern Recogn 48(8):2699–2709
Author information
Authors and Affiliations
Corresponding authors
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Ethical approval
This article does not contain any studies with human participants or animals performed by any of the authors.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
About this article
Cite this article
Pho, KH., Akbarzadeh, H., Parvin, H. et al. A multi-level consensus function clustering ensemble. Soft Comput 25, 13147–13165 (2021). https://doi.org/10.1007/s00500-021-06092-7
Accepted:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s00500-021-06092-7