Abstract
In many statistical learning problems, optimization is performed on a target function that is highly non-convex. A large body of research has been devoted either to approximating the target function by a related convex function, such as replacing the L0 norm with the L1 norm in regression models, or to designing algorithms that find a good local optimum, such as the Expectation-Maximization algorithm for clustering. The task of analyzing the non-convex structure of a target function has received much less attention. In this chapter, inspired by successful visualizations of landscapes for molecular systems [2] and spin-glass models [40], we compute Energy Landscape Maps (ELMs) in high-dimensional spaces. The first half of the chapter explores and visualizes the model space (i.e., the hypothesis space in the machine learning literature) for clustering, bi-clustering, and grammar learning. The second half of the chapter introduces a novel MCMC method for identifying macroscopic structures in locally noisy energy landscapes. The technique is applied to explore the formation of stable concepts in deep network models of images.
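The core idea behind an ELM can be illustrated in one dimension. The following sketch (an illustrative toy, not the chapter's high-dimensional method) samples a two-basin energy with plain Metropolis MCMC, assigns each sample to a basin by gradient descent, and tallies the probability mass per basin; the energy function, step sizes, and rounding thresholds are all assumptions chosen for this example:

```python
import math
import random

def energy(x):
    # Illustrative 1-D non-convex energy with two basins; the left well
    # (near x = -1) is deeper because of the linear tilt term.
    return (x ** 2 - 1.0) ** 2 + 0.2 * x

def local_minimum(x, step=5e-3, iters=5000):
    # Assign x to a basin of attraction via gradient descent,
    # using a finite-difference gradient of the energy.
    for _ in range(iters):
        g = (energy(x + 1e-6) - energy(x - 1e-6)) / 2e-6
        x -= step * g
    return round(x, 1)  # the two minima are well separated, so 1 decimal suffices

def metropolis(n_samples=20000, temperature=0.5, seed=0):
    # Plain Metropolis sampling of the Gibbs distribution exp(-E(x)/T).
    rng = random.Random(seed)
    x, samples = 0.0, []
    for _ in range(n_samples):
        y = x + rng.gauss(0.0, 1.0)
        if rng.random() < math.exp(min(0.0, -(energy(y) - energy(x)) / temperature)):
            x = y
        samples.append(x)
    return samples

# Map each sample to its basin and tally the mass per basin --
# a crude one-dimensional analogue of an Energy Landscape Map.
cache, counts = {}, {}
for s in metropolis():
    key = round(s, 2)          # cache descents from nearby starting points
    if key not in cache:
        cache[key] = local_minimum(key)
    counts[cache[key]] = counts.get(cache[key], 0) + 1
print({m: c for m, c in sorted(counts.items())})
```

With the tilted double well, the sampler visits both basins and assigns more mass to the deeper one near x = -1; in high dimensions the same two ingredients (a sampler that crosses barriers and a descent rule that labels basins) are what an ELM scales up.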
“By visualizing information we turn it into a landscape that you can explore with your eyes: a sort of information map. And when you’re lost in information, an information map is kind of useful.”
– David McCandless
References
Barbu A, Zhu S-C (2005) Generalizing Swendsen-Wang to sampling arbitrary posterior probabilities. IEEE Trans Pattern Anal Mach Intell 27(8):1239–1253
Becker OM, Karplus M (1997) The topology of multidimensional potential energy surfaces: theory and application to peptide structure and kinetics. J Chem Phys 106(4):1495–1517
Bengio Y, Louradour J, Collobert R, Weston J (2009) Curriculum learning. In: ICML, pp 41–48
Blake CL, Merz CJ (1998) UCI repository of machine learning databases. University of California, Irvine
Bovier A, den Hollander F (2006) Metastability: a potential theoretic approach. Int Cong Math 3:499–518
Brooks SP, Gelman A (1998) General methods for monitoring convergence of iterative simulations. J Comput Graph Stat 7(4):434–455
Charniak E (2001) Immediate-head parsing for language models. In: Proceedings of the 39th annual meeting on association for computational linguistics, pp 124–131
Collins M (1999) Head-driven statistical models for natural language parsing. PhD thesis, University of Pennsylvania
Dasgupta S, Schulman LJ (2000) A two-round variant of EM for Gaussian mixtures. In: Proceedings of the sixteenth conference on uncertainty in artificial intelligence (UAI’00), pp 152–159
Dempster AP, Laird NM, Rubin DB (1977) Maximum likelihood from incomplete data via the EM algorithm. J R Stat Soc Ser B (Methodol) 39(1):1–38
Elman JL (1993) Learning and development in neural networks: the importance of starting small. Cognition 48(1):71–99
Ganchev K, Graça J, Gillenwater J, Taskar B (2010) Posterior regularization for structured latent variable models. J Mach Learn Res 11:2001–2049
Gelman A, Rubin DB (1992) Inference from iterative simulation using multiple sequences. Stat Sci 7(4):457–472
Geyer CJ, Thompson EA (1995) Annealing Markov chain Monte Carlo with applications to ancestral inference. J Am Stat Assoc 90(431):909–920
Headden WP III, Johnson M, McClosky D (2009) Improving unsupervised dependency parsing with richer contexts and smoothing. In: Proceedings of human language technologies: the 2009 annual conference of the North American chapter of the association for computational linguistics, pp 101–109
Hill M, Nijkamp E, Zhu S-C (2019) Building a telescope to look into high-dimensional image spaces. Q Appl Math 77(2):269–321
Julesz B (1962) Visual pattern discrimination. IRE Trans Inf Theory 8(2):84–92
Julesz B (1981) Textons, the elements of texture perception, and their interactions. Nature 290:91–97
Klein D, Manning CD (2004) Corpus-based induction of syntactic structure: models of dependency and constituency. In: Proceedings of the 42nd annual meeting on association for computational linguistics, p 478
Kübler S, McDonald R, Nivre J (2009) Dependency parsing. Synth Lect Hum Lang Technol 1(1):1–127
Liang F (2005) A generalized Wang-Landau algorithm for Monte Carlo computation. J Am Stat Assoc 100(472):1311–1327
Liang F, Liu C, Carroll RJ (2007) Stochastic approximation in Monte Carlo computation. J Am Stat Assoc 102(477):305–320
Marinari E, Parisi G (1992) Simulated tempering: a new Monte Carlo scheme. EPL (Europhys Lett) 19(6):451
Mel’čuk IA (1988) Dependency syntax: theory and practice. SUNY Press, New York
Onuchic JN, Luthey-Schulten Z, Wolynes PG (1997) Theory of protein folding: the energy landscape perspective. Ann Rev Phys Chem 48(1):545–600
Pavlovskaia M (2014) Mapping highly nonconvex energy landscapes in clustering, grammatical and curriculum learning. PhD thesis, UCLA
Pavlovskaia M, Tu K, Zhu S-C (2015) Mapping the energy landscape of non-convex optimization problems. In: International workshop on energy minimization methods in computer vision and pattern recognition. Springer, pp 421–435
Rohde DLT, Plaut DC (1999) Language acquisition in the absence of explicit negative evidence: how important is starting small? Cognition 72(1):67–109
Samdani R, Chang M-W, Roth D (2012) Unified expectation maximization. In: Proceedings of the 2012 conference of the North American chapter of the association for computational linguistics: human language technologies. Association for Computational Linguistics, pp 688–698
Spitkovsky VI, Alshawi H, Jurafsky D (2010) From baby steps to leapfrog: how “less is more” in unsupervised dependency parsing. In: NAACL
Swendsen RH, Wang J-S (1987) Nonuniversal critical dynamics in Monte Carlo simulations. Phys Rev Lett 58(2):86–88
Tu K, Honavar V (2011) On the utility of curricula in unsupervised learning of probabilistic grammars. In: IJCAI proceedings-international joint conference on artificial intelligence, vol 22, p 1523
Tu K, Honavar V (2012) Unambiguity regularization for unsupervised learning of probabilistic grammars. In: Proceedings of the 2012 conference on empirical methods in natural language processing and natural language learning (EMNLP-CoNLL 2012)
Wales DJ, Doye JPK (1997) Global optimization by basin-hopping and the lowest energy structures of Lennard-Jones clusters containing up to 110 atoms. J Phys Chem A 101(28):5111–5116
Wales DJ, Trygubenko SA (2004) A doubly nudged elastic band method for finding transition states. J Chem Phys 120:2082–2094
Wang F, Landau DP (2001) Efficient, multiple-range random walk algorithm to calculate the density of states. Phys Rev Lett 86(10):2050
Wu YN, Guo C-E, Zhu S-C (2007) From information scaling of natural images to regimes of statistical models. Q Appl Math 66(1):81–122
Xie J, Lu Y, Wu YN (2018) Cooperative learning of energy-based model and latent variable model via MCMC teaching. In: AAAI
Zhou Q (2011) Multi-domain sampling with applications to structural inference of Bayesian networks. J Am Stat Assoc 106(496):1317–1330
Zhou Q (2011) Random walk over basins of attraction to construct Ising energy landscapes. Phys Rev Lett 106(18):180602
Zhou Q, Wong WH (2008) Reconstructing the energy landscape of a distribution from Monte Carlo samples. Ann Appl Stat 2:1307–1331
© 2020 Springer Nature Singapore Pte Ltd.
Cite this chapter
Barbu, A., Zhu, SC. (2020). Mapping the Energy Landscape. In: Monte Carlo Methods. Springer, Singapore. https://doi.org/10.1007/978-981-13-2971-5_11
Publisher Name: Springer, Singapore
Print ISBN: 978-981-13-2970-8
Online ISBN: 978-981-13-2971-5
eBook Packages: Mathematics and Statistics (R0)