Unsupervised separation of dynamics from pixels

Chiappa, Silvia; Paquet, Ulrich

doi:10.1007/s40300-019-00155-4

Published: 20 July 2019

METRON, volume 77, pages 119–135 (2019)

Abstract

We present an approach to learn the dynamics of multiple objects from image sequences in an unsupervised way. We introduce a probabilistic model that first generates noisy positions for each object through a separate linear state-space model, and then renders the positions of all objects in the same image through a highly non-linear process. Such a linear representation of the dynamics enables us to propose an inference method that uses exact and efficient inference tools and that can be deployed to query the model in different ways without retraining.
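As a rough illustration of the first stage described in the abstract, the sketch below samples noisy positions for one object from a linear Gaussian state-space model and then recovers the hidden state with a Kalman filter, a standard example of an exact inference tool for such models. Everything specific here is an assumption made for illustration and is not taken from the article: the constant-velocity transition matrix `A`, the emission matrix `C`, the noise levels `Q` and `R`, and the function names. The non-linear rendering of positions into images and the learning procedure are not reproduced.

```python
import numpy as np

def simulate_lgssm(T, A, C, Q, R, s0, rng):
    """Sample hidden states s_t and noisy positions a_t for one object."""
    s = s0
    states, positions = [], []
    for _ in range(T):
        # Linear dynamics with Gaussian transition noise.
        s = A @ s + rng.multivariate_normal(np.zeros(A.shape[0]), Q)
        # Noisy position emitted from the current hidden state.
        a = C @ s + rng.multivariate_normal(np.zeros(C.shape[0]), R)
        states.append(s)
        positions.append(a)
    return np.array(states), np.array(positions)

def kalman_filter(positions, A, C, Q, R, mu0, P0):
    """Exact filtered means of the hidden state given positions so far."""
    mu, P = mu0, P0
    means = []
    for a in positions:
        # Predict step: propagate mean and covariance through the dynamics.
        mu = A @ mu
        P = A @ P @ A.T + Q
        # Update step: correct the prediction with the observed position.
        S = C @ P @ C.T + R
        K = P @ C.T @ np.linalg.inv(S)
        mu = mu + K @ (a - C @ mu)
        P = (np.eye(len(mu)) - K @ C) @ P
        means.append(mu)
    return np.array(means)

# Hypothetical constant-velocity model: state = [x, y, vx, vy].
A = np.array([[1.0, 0.0, 1.0, 0.0],
              [0.0, 1.0, 0.0, 1.0],
              [0.0, 0.0, 1.0, 0.0],
              [0.0, 0.0, 0.0, 1.0]])
C = np.array([[1.0, 0.0, 0.0, 0.0],
              [0.0, 1.0, 0.0, 0.0]])
Q = 1e-4 * np.eye(4)   # small transition noise (assumed)
R = 1.0 * np.eye(2)    # larger position noise (assumed)

rng = np.random.default_rng(0)
s0 = np.array([0.0, 0.0, 0.1, -0.05])
states, positions = simulate_lgssm(200, A, C, Q, R, s0, rng)
filtered = kalman_filter(positions, A, C, Q, R, mu0=s0, P0=np.eye(4))

# Compare position error of the raw noisy positions and the filtered estimate.
true_xy = states[:, :2]
mse_raw = np.mean(np.sum((positions - true_xy) ** 2, axis=1))
mse_filtered = np.mean(np.sum((filtered[:, :2] - true_xy) ** 2, axis=1))
```

With several objects, one such model would be run per object, each with its own state sequence. Because the dynamics are linear and Gaussian, the filtering recursion is exact rather than approximate, which is the property the abstract refers to.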

Notes

Whilst in practice we need to consider all observed sequences in the KL, to simplify the notation we focus the exposition on one sequence only.
In practice, as the state \(s_0^n\) encodes which way we can interrogate \(v_1\) to infer \(a_1^n\), we have obtained better results by learning separate \(\phi _{s_0^n}\) that depend on the number of objects N in the image.

Author information

Silvia Chiappa and Ulrich Paquet contributed equally.

Authors and Affiliations

DeepMind, London, UK
Silvia Chiappa & Ulrich Paquet

Corresponding author

Correspondence to Silvia Chiappa.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix: Multi-step ahead generation of images and inference using past and future observations

See Figs. 9, 10, 11 and 12.

About this article

Cite this article

Chiappa, S., Paquet, U. Unsupervised separation of dynamics from pixels. METRON 77, 119–135 (2019). https://doi.org/10.1007/s40300-019-00155-4

Received: 30 November 2018
Accepted: 02 July 2019
Published: 20 July 2019
Issue Date: 01 August 2019
DOI: https://doi.org/10.1007/s40300-019-00155-4
