Properties of the stochastic approximation EM algorithm with mini-batch sampling

Statistics and Computing

Abstract

To deal with very large datasets, a mini-batch version of the Markov chain Monte Carlo stochastic approximation expectation–maximization (MCMC-SAEM) algorithm for general latent variable models is proposed. For exponential models, the algorithm is shown to converge under classical conditions as the number of iterations increases. Numerical experiments illustrate the performance of the mini-batch algorithm in various models. In particular, we highlight that mini-batch sampling substantially speeds up the convergence of the sequence of estimators generated by the algorithm. Moreover, insights into the effect of the mini-batch size on the limit distribution are presented. Finally, we illustrate how to use mini-batch sampling in practice to improve results when the computing time is constrained.
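
To make the idea concrete, the following is a minimal sketch of a mini-batch MCMC-SAEM iteration on a toy model. It is not the authors' implementation. Everything specific here is an assumption made for illustration: the Gaussian model y_i = z_i + eps_i with z_i ~ N(theta, 1) and eps_i ~ N(0, 1), the function name `minibatch_saem`, the inclusion probability `alpha`, the random-walk Metropolis kernel with unit proposal scale, and the step-size schedule (constant during `burn_in`, then decreasing as 1/k).

```python
import numpy as np


def minibatch_saem(y, alpha=0.2, n_iter=1500, burn_in=500, seed=0):
    """Toy mini-batch MCMC-SAEM sketch (illustrative assumptions throughout).

    Assumed model: y_i = z_i + eps_i, z_i ~ N(theta, 1), eps_i ~ N(0, 1),
    with latent z_i. The complete-data sufficient statistic is mean(z),
    and the M-step is theta = s.
    """
    rng = np.random.default_rng(seed)
    y = np.asarray(y, dtype=float)
    n = y.size
    z = np.zeros(n)   # current latent values
    z_sum = 0.0       # running sum of z, updated incrementally
    s = 0.0           # stochastic approximation of the sufficient statistic
    theta = 0.0       # current parameter estimate

    def log_target(z_i, y_i, th):
        # log p(z_i | y_i, theta) up to an additive constant
        return -0.5 * (y_i - z_i) ** 2 - 0.5 * (z_i - th) ** 2

    for k in range(1, n_iter + 1):
        # Mini-batch selection: each index is included with probability alpha.
        idx = np.flatnonzero(rng.random(n) < alpha)

        # Simulation step: one random-walk Metropolis move for the selected
        # latent variables only; all others keep their previous value.
        prop = z[idx] + rng.normal(size=idx.size)
        log_ratio = (log_target(prop, y[idx], theta)
                     - log_target(z[idx], y[idx], theta))
        accept = np.log(rng.random(idx.size)) < log_ratio
        new_z = np.where(accept, prop, z[idx])
        z_sum += np.sum(new_z - z[idx])
        z[idx] = new_z

        # Stochastic approximation step with step size gamma_k.
        gamma = 1.0 if k <= burn_in else 1.0 / (k - burn_in)
        s += gamma * (z_sum / n - s)

        # Maximization step: closed form for this toy model.
        theta = s
    return theta
```

In this toy model the observations are marginally N(theta, 2), so the maximum likelihood estimate is the sample mean of `y`, which gives a simple check on the output. Only the selected latent variables are touched at each iteration, so the cost of the simulation step scales with roughly `alpha * n` rather than `n`.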

Acknowledgements

This work was partly supported by grant ANR-18-CE02-0010 of the French National Research Agency (ANR), project EcoNet.

Author information

Corresponding author

Correspondence to Tabea Rebafka.

About this article

Cite this article

Kuhn, E., Matias, C. & Rebafka, T. Properties of the stochastic approximation EM algorithm with mini-batch sampling. Stat Comput 30, 1725–1739 (2020). https://doi.org/10.1007/s11222-020-09968-0

  • DOI: https://doi.org/10.1007/s11222-020-09968-0
