Semi-Naive Mixture Model for Consensus Clustering

Moltisanti, Marco; Farinella, Giovanni Maria; Battiato, Sebastiano

doi:10.1007/978-3-319-27926-8_30

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 9432)

Included in the following conference series:

International Workshop on Machine Learning, Optimization and Big Data

Abstract

Consensus clustering is a powerful method for combining multiple partitions obtained from different runs of clustering algorithms. The goal is to achieve a robust and stable partition of the space through a consensus procedure that exploits the diversity of multiple clustering outputs. Several methods have been proposed to tackle the consensus clustering problem. Among them, the algorithm that models the problem as a mixture of multivariate multinomial distributions in the space of cluster labels has gained considerable attention in the literature. However, to make the problem tractable, its theoretical formulation adopts a Naive Bayesian conditional independence assumption over the components of the vector space in which the consensus function acts (i.e., the conditional probability over a \(d\)-dimensional vector space is represented as the product of conditional probabilities in one-dimensional feature spaces). In this paper we propose to relax this assumption, moving to a Semi-Naive approach that models some of the dependencies among the components of the vector space when generating the final consensus partition. The Semi-Naive approach consists of randomly grouping the components of the label space and modeling the conditional density term in the maximum-likelihood estimation formulation as the product of the conditional densities of the resulting finite set of groups of label-space components. Experiments are performed to assess the proposed approach.
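
The factorization described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`random_groups`, `encode_group`, `semi_naive_log_conditional`), the fixed group size, and the simplifying assumption that every base clustering uses the same number of labels are all hypothetical choices made here for illustration. Each random group of label components is treated as a single joint categorical variable, and the conditional log-probability of a label vector is the sum of the per-group joint log-probabilities. With a group size of 1 the sketch reduces to the Naive factorization.

```python
import numpy as np

def random_groups(n_components, group_size, rng):
    """Randomly partition component indices 0..n_components-1 into groups."""
    perm = rng.permutation(n_components)
    return [perm[i:i + group_size] for i in range(0, n_components, group_size)]

def encode_group(labels, group, n_labels):
    """Map each row's tuple of labels within one group to one joint symbol.

    labels: (N, H) integer array with entries in {0, ..., n_labels - 1}
    (assumption: all H base clusterings share the same number of labels).
    Returns an (N,) array with values in {0, ..., n_labels**len(group) - 1}.
    """
    joint = np.zeros(labels.shape[0], dtype=np.int64)
    for j in group:
        joint = joint * n_labels + labels[:, j]
    return joint

def semi_naive_log_conditional(labels, groups, group_probs, n_labels):
    """Log conditional probability of each label vector under each mixture component.

    group_probs[s] is an (M, n_labels**len(groups[s])) array whose rows sum
    to 1: one joint categorical distribution per component and per group.
    Returns an (N, M) array: the sum over groups of joint log-probabilities.
    """
    n_mix = group_probs[0].shape[0]
    log_p = np.zeros((labels.shape[0], n_mix))
    for group, probs in zip(groups, group_probs):
        joint = encode_group(labels, group, n_labels)
        log_p += np.log(probs[:, joint]).T
    return log_p

# Toy usage: 6 points, 4 base clusterings with 3 labels each, 2 mixture components.
rng = np.random.default_rng(0)
labels = rng.integers(0, 3, size=(6, 4))
groups = random_groups(4, 2, rng)
# Uniform joint distributions, used only to keep the example easy to check.
group_probs = [np.full((2, 3 ** len(g)), 1.0 / 3 ** len(g)) for g in groups]
log_p = semi_naive_log_conditional(labels, groups, group_probs, n_labels=3)
```

In a full mixture model these conditional terms would be combined with mixing weights and re-estimated iteratively, for example with expectation-maximization; that step is omitted here. Note that the number of parameters per group grows as `n_labels**len(group)`, which is one reason to keep groups small.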

Notes

1. http://www.cs.cmu.edu/Groups/AI/areas/neural/bench/cmu/

Author information

Authors and Affiliations

Image Processing Laboratory – Dipartimento di Matematica e Informatica, Università degli Studi di Catania, Catania, Italy
Marco Moltisanti, Giovanni Maria Farinella & Sebastiano Battiato

Corresponding author

Correspondence to Marco Moltisanti.

Editor information

Editors and Affiliations

University of Florida, Gainesville, Florida, USA
Panos Pardalos
University of Catania, Catania, Italy
Mario Pavone
University of Catania, Catania, Italy
Giovanni Maria Farinella
University of Catania, Catania, Italy
Vincenzo Cutello

About this paper

Cite this paper

Moltisanti, M., Farinella, G.M., Battiato, S. (2015). Semi-Naive Mixture Model for Consensus Clustering. In: Pardalos, P., Pavone, M., Farinella, G., Cutello, V. (eds) Machine Learning, Optimization, and Big Data. MOD 2015. Lecture Notes in Computer Science, vol 9432. Springer, Cham. https://doi.org/10.1007/978-3-319-27926-8_30

DOI: https://doi.org/10.1007/978-3-319-27926-8_30
Published: 06 January 2016
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-27925-1
Online ISBN: 978-3-319-27926-8
eBook Packages: Computer Science, Computer Science (R0)