Abstract
Consider the problem of fitting a finite Gaussian mixture, with an unknown number of components, to observed data. This paper proposes a new minimum description length (MDL) type criterion, termed MMDL (for mixture MDL), to select the number of components of the model. MMDL is based on the identification of an "equivalent sample size" for each component, which does not coincide with the full sample size. We also introduce an algorithm based on the standard expectation-maximization (EM) approach together with a new agglomerative step, called agglomerative EM (AEM). The experiments reported here show that MMDL outperforms existing criteria of comparable computational cost. The good behavior of AEM, namely its robustness with respect to initialization, is also illustrated experimentally.
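The abstract does not state the MMDL formula, so the sketch below is an illustration only, not the paper's definition: it fits 1-D Gaussian mixtures by plain EM (without the paper's agglomerative step) and scores each candidate number of components with an MDL-style criterion in which each component's parameters are penalized using its equivalent sample size n·w_m (the expected number of points that component explains) rather than the full sample size n. The function names and the exact penalty form are assumptions.

```python
import numpy as np

def em_gmm_1d(x, k, iters=200):
    """Basic EM for a k-component 1-D Gaussian mixture (illustrative)."""
    n = x.size
    mu = np.quantile(x, np.linspace(0.1, 0.9, k))   # deterministic init
    var = np.full(k, x.var())
    w = np.full(k, 1.0 / k)
    for _ in range(iters):
        # E-step: responsibilities r[i, m] = P(component m | x_i)
        dens = w * np.exp(-0.5 * (x[:, None] - mu) ** 2 / var) / np.sqrt(2 * np.pi * var)
        r = dens / dens.sum(axis=1, keepdims=True)
        # M-step: weighted maximum-likelihood updates
        nk = r.sum(axis=0)                          # per-component "equivalent sample sizes"
        w, mu = nk / n, (r * x[:, None]).sum(axis=0) / nk
        var = (r * (x[:, None] - mu) ** 2).sum(axis=0) / nk + 1e-9
    # recompute densities with the final parameters for the log-likelihood
    dens = w * np.exp(-0.5 * (x[:, None] - mu) ** 2 / var) / np.sqrt(2 * np.pi * var)
    return w, mu, var, np.log(dens.sum(axis=1)).sum()

def mmdl_like_score(loglik, w, n, params_per_comp=2):
    """MDL-style cost: code length of the data plus the parameters.
    Each component's parameters (here mean and variance) are charged
    with log(n * w_m) -- its equivalent sample size -- instead of the
    full log(n); the k-1 free mixing weights are charged with log(n)."""
    k = w.size
    penalty = 0.5 * params_per_comp * np.log(n * w).sum() + 0.5 * (k - 1) * np.log(n)
    return -loglik + penalty

# Example (assumed usage): fit several candidate k and keep the cheapest.
rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(-5, 1, 250), rng.normal(5, 1, 250)])
fits = {k: em_gmm_1d(x, k) for k in (1, 2, 3)}
scores = {k: mmdl_like_score(f[3], f[0], x.size) for k, f in fits.items()}
best_k = min(scores, key=scores.get)
```

Penalizing each component with its own equivalent sample size (rather than n) lowers the cost of small-weight components relative to BIC-style criteria, which is the intuition the abstract's "equivalent sample size" remark points at; the exact trade-off is what the paper's MMDL criterion makes precise.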
© 1999 Springer-Verlag Berlin Heidelberg
Cite this paper
Figueiredo, M.A.T., Leitão, J.M.N., Jain, A.K. (1999). On Fitting Mixture Models. In: Hancock, E.R., Pelillo, M. (eds) Energy Minimization Methods in Computer Vision and Pattern Recognition. EMMCVPR 1999. Lecture Notes in Computer Science, vol 1654. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-48432-9_5
Print ISBN: 978-3-540-66294-5
Online ISBN: 978-3-540-48432-5