Abstract
The complete-data model that underlies an Expectation-Maximization (EM) algorithm must have a parameter space that coincides with the parameter space of the observed-data model. Otherwise, maximization of the observed-data log-likelihood will be carried out over a space that does not coincide with the desired parameter space. In some contexts, however, a natural complete-data model may be defined only for parameter values within a subset of the observed-data parameter space. In this paper we discuss situations where this can still be useful if the complete-data model can be viewed as a member of a finite family of complete-data models that have parameter spaces which collectively cover the observed-data parameter space. Such a family of complete-data models defines a family of EM algorithms which together lead to a finite collection of constrained maxima of the observed-data log-likelihood. Maximization of the log-likelihood function over the full parameter space then involves identifying the constrained maximum that achieves the greatest log-likelihood value. Since optimization over a finite collection of candidates is referred to as combinatorial optimization, we refer to such a family of EM algorithms as a combinatorial EM (CEM) algorithm. As well as discussing the theoretical concepts behind CEM algorithms, we discuss strategies for improving the computational efficiency when the number of complete-data models is large. Various applications of CEM algorithms are also discussed, ranging from simple examples that illustrate the concepts, to more substantive examples that demonstrate the usefulness of CEM algorithms in practice.
Similar content being viewed by others
References
Aitkin, M.: Modelling variance heterogeneity in normal regression using GLIM. Appl. Stat. 36, 332–339 (1987)
ASSENT-2 Investigators: Single-bolus tenecteplase compared with front-loaded alteplase in acute myocardial infarction: the ASSENT-2 double-blind randomised trial. Lancet 354, 716–722 (1999)
Barlow, R.E., Bartholomew, D.J., Bremner, J.M., Brunk, H.D.: Statistical Inference Under Order Restrictions: The Theory and Applications of Isotonic Regression. Wiley, New York (1972)
Berlinet, A., Roland, C.: Parabolic acceleration of the EM algorithm. Stat. Comput. 19, 35–47 (2009)
Bobb, J.F., Zhao, H., Varadhan, R.: turboEM: a suite of convergence acceleration schemes for EM and MM algorithms, R package version 2012.2-1 (2012). http://CRAN.R-project.org/package=turboEM
Dempster, A.P., Laird, N.M., Rubin, D.B.: Maximum likelihood for incomplete data via the EM algorithm. J. R. Stat. Soc. B 39, 1–38 (1977)
Fessler, J.A., Hero, A.O.: Space-alternating generalized expectation-maximization algorithm. IEEE Trans. Signal Process. 42, 2664–2677 (1994)
Kuritzkes, D.R., Marschner, I.C., Johnson, V.A., Bassett, R., Eron, J.J., Fischl, M.A., Murphy, R.L., Fife, K., Maenza, J., Rosandich, M.E., Bell, D., Wood, K., Sommadossi, J.-P., Pettinelli, C.: Lamivudine in combination with zidovudine, stavudine, or didanosine in patients with HIV-1 infection: a randomized, double-blind, placebo-controlled trial. AIDS 13, 685–694 (1999)
Lange, K.: Numerical Analysis for Statisticians, 2nd edn. Springer, New York (2010)
Liu, C., Rubin, D.B.: The ECME algorithm: a simple extension of EM and ECM with faster monotone convergence. Biometrika 81, 633–684 (1994)
Liu, C., Rubin, D.B., Wu, Y.N.: Parameter expansion to accelerate EM: the PX-EM algorithm. Biometrika 85, 755–770 (1998)
Marschner, I.C.: Stable computation of maximum likelihood estimates in identity link Poisson regression. J. Comput. Graph. Stat. 19, 666–683 (2010)
Marschner, I.C.: glm2: fitting generalized linear models with convergence problems. R J. 3, 12–15 (2011)
Marschner, I.C., Gillett, A.C.: Relative risk regression: reliable and flexible methods for log-binomial models. Biostatistics 13, 179–192 (2012)
Marschner, I.C., Gillett, A.C., O’Connell, R.L.: Stratified additive Poisson models: computational methods and applications in clinical epidemiology. Comput. Stat. Data Anal. 56, 1115–1130 (2012)
McLachlan, G.J., Krishnan, T.: The EM Algorithm and Extensions, 2nd edn. Wiley, Hoboken (2008)
Meng, X.L., Rubin, D.B.: Maximum likelihood via the ECM algorithm: a general framework. Biometrika 80, 267–278 (1993)
Nettleton, D.: Convergence properties of the EM algorithm in constrained parameter spaces. Can. J. Stat. 27, 639–648 (1999)
Schrijver, A.: Combinatorial Optimization. Springer, Berlin (2003)
Smyth, G.K.: An efficient algorithm for REML in heteroscedastic regression. J. Comput. Graph. Stat. 11, 836–847 (2002)
Varadhan, R., Roland, C.: Simple and globally convergent methods for accelerating the convergence of any EM algorithm. Scand. J. Stat. 35, 335–353 (2008)
Verbyla, A.P.: Modelling variance heterogeneity: residual maximum likelihood and diagnostics. J. R. Stat. Soc. B 55, 493–508 (1993)
Wu, C.F.J.: On the convergence properties of the EM algorithm. Ann. Stat. 11, 95–103 (1983)
Acknowledgements
This paper was written while the author was a Visiting Fellow at I-BioStat, Katholieke Universiteit Leuven, Belgium. This research was supported by the Australian Research Council (DP110101254).
Author information
Authors and Affiliations
Corresponding author
Rights and permissions
About this article
Cite this article
Marschner, I.C. Combinatorial EM algorithms. Stat Comput 24, 921–940 (2014). https://doi.org/10.1007/s11222-013-9411-7
Received:
Accepted:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s11222-013-9411-7