Abstract
Mixtures of multivariate t distributions provide a robust parametric extension of normal mixtures for fitting data. In the presence of a noise component, potential outliers, or data with longer-than-normal tails, the model can be broadened by using t distributions. In this framework, the degrees of freedom act as a robustness parameter, tuning the heaviness of the tails and downweighting the effect of outliers on parameter estimation. The aim of this paper is to extend to mixtures of multivariate elliptical distributions some theoretical results about likelihood maximization on constrained parameter spaces. Further, a constrained monotone algorithm implementing maximum likelihood mixture decomposition of multivariate t distributions is proposed, to achieve improved convergence capabilities and robustness. Monte Carlo numerical simulations and a real data study illustrate the improved performance of the algorithm relative to earlier proposals.
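To make the downweighting role of the degrees of freedom concrete, the following is a minimal sketch of the standard E-step weight used when fitting a multivariate t distribution by EM, u_i = (ν + p) / (ν + d_i), where d_i is the squared Mahalanobis distance of observation i and p is the dimension. It covers a single component only and is not the constrained algorithm proposed in the paper; the function name `t_weights` and the toy data are hypothetical, chosen for illustration.

```python
import numpy as np

def t_weights(X, mu, Sigma, nu):
    """Return u_i = (nu + p) / (nu + d_i) for each row x_i of X, where
    d_i is the squared Mahalanobis distance of x_i from mu under the
    scale matrix Sigma. (Illustrative helper, not from the paper.)"""
    X = np.asarray(X, dtype=float)
    p = X.shape[1]
    diff = X - mu
    # Solve Sigma z = diff^T instead of forming the inverse explicitly.
    sol = np.linalg.solve(Sigma, diff.T)      # shape (p, n)
    d = np.einsum("ij,ji->i", diff, sol)      # squared Mahalanobis distances
    return (nu + p) / (nu + d)

# Toy data (assumed): two points near the centre and one gross outlier.
X = np.array([[0.1, -0.2], [0.3, 0.1], [8.0, 8.0]])
mu = np.zeros(2)
Sigma = np.eye(2)

w_heavy = t_weights(X, mu, Sigma, nu=3.0)   # heavy tails: outlier gets a small weight
w_light = t_weights(X, mu, Sigma, nu=1e6)   # near-normal: all weights close to 1
```

With a small ν the outlying point receives a much smaller weight than the central points, so it contributes little to the weighted mean and scale updates; as ν grows, all weights tend to 1 and the fit behaves like a normal model.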
Cite this article
Greselin, F., Ingrassia, S. Constrained monotone EM algorithms for mixtures of multivariate t distributions. Stat Comput 20, 9–22 (2010). https://doi.org/10.1007/s11222-008-9112-9
DOI: https://doi.org/10.1007/s11222-008-9112-9