Abstract
When comparing human and artificial intelligence, one major difference is apparent: humans can generalize very broadly from sparse data because they are able to recombine and reintegrate data components in a compositional manner. To investigate differences in learning efficiency, Joshua B. Tenenbaum and colleagues developed the character challenge: first, an algorithm is trained to generate handwritten characters; then, a single example of a new character type is presented. An efficient learning algorithm is expected to re-generate this new character, to identify similar versions of it, to generate new variants of it, and to create completely new character types. In the past, the character challenge was met only by complex algorithms that were provided with stochastic primitives. Here, we tackle the challenge without providing primitives. We apply a minimal recurrent neural network (RNN) model with one feedforward layer and one LSTM layer and train it to generate sequential handwritten character trajectories from one-hot encoded inputs. To enable the re-generation of untrained characters from only one example, we introduce a one-shot inference mechanism: the gradient signal is backpropagated to the feedforward layer weights only, leaving the LSTM layer untouched. We show that our model is able to meet the character challenge by recombining previously learned dynamic substructures, which are visible in the hidden LSTM states. Exploiting the compositional abilities of RNNs in this way may be an important step towards bridging the gap between human and artificial intelligence.
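The abstract names a concrete mechanism: a feedforward embedding of a one-hot class code drives an LSTM that unrolls into a pen trajectory, and one-shot inference re-fits only the feedforward weights while the learned LSTM dynamics stay frozen. The following is a minimal sketch of that setup in PyTorch, not the authors' implementation; the hidden size, trajectory length, (dx, dy) readout layer, loss function, and optimizer settings are assumptions (Adam is used here simply because it appears in the reference list).

# Hypothetical sketch: one-hot class -> feedforward layer -> LSTM -> trajectory,
# plus a one-shot adaptation step that updates the feedforward weights only.
import torch
import torch.nn as nn

class CharGenerator(nn.Module):
    def __init__(self, num_classes, hidden_size=64, steps=60):
        super().__init__()
        self.steps = steps                                  # trajectory length (assumed)
        self.embed = nn.Linear(num_classes, hidden_size)    # feedforward layer
        self.lstm = nn.LSTM(hidden_size, hidden_size, batch_first=True)
        self.readout = nn.Linear(hidden_size, 2)            # (dx, dy) pen offsets (assumed)

    def forward(self, one_hot):
        # Feed the same class embedding at every time step and unroll the LSTM.
        x = self.embed(one_hot).unsqueeze(1).repeat(1, self.steps, 1)
        h, _ = self.lstm(x)
        return self.readout(h)                              # (batch, steps, 2) trajectory

def one_shot_adapt(model, one_hot, target_traj, lr=1e-2, iters=200):
    """Fit a single novel character by backpropagating into the feedforward layer only."""
    for p in model.lstm.parameters():                       # keep learned dynamics frozen
        p.requires_grad_(False)
    for p in model.readout.parameters():
        p.requires_grad_(False)
    opt = torch.optim.Adam(model.embed.parameters(), lr=lr)
    for _ in range(iters):
        opt.zero_grad()
        loss = nn.functional.mse_loss(model(one_hot), target_traj)
        loss.backward()                                     # gradient reaches embed only
        opt.step()
    return loss.item()

Note that setting requires_grad to False on the LSTM (and readout) parameters prevents their update but still lets the error gradient flow through the recurrent dynamics back to the embedding, which is the behaviour the one-shot mechanism in the abstract describes.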
References
Battaglia, P.W., et al.: Relational inductive biases, deep learning, and graph networks. arXiv preprint arXiv:1806.01261 (2018)
Geirhos, R., Temme, C.R.M., Rauber, J., Schütt, H.H., Bethge, M., Wichmann, F.A.: Generalisation in humans and deep neural networks. In: Advances in Neural Information Processing Systems (NeurIPS), pp. 7538–7550. Curran Associates, Inc. (2018)
Hassabis, D., Kumaran, D., Summerfield, C., Botvinick, M.: Neuroscience-inspired artificial intelligence. Neuron 95(2), 245–258 (2017)
Hofstadter, D.: Metamagical Themas: Questing for the Essence of Mind and Pattern. Basic Books, New York (1985)
Kingma, D.P., Ba, J.L.: Adam: a method for stochastic optimization. In: 3rd International Conference for Learning Representations (2015)
Lake, B., Baroni, M.: Still not systematic after all these years: on the compositional skills of sequence-to-sequence recurrent networks. arXiv preprint arXiv:1711.00350 (2018)
Lake, B.M., Salakhutdinov, R., Tenenbaum, J.B.: Human-level concept learning through probabilistic program induction. Science 350(6266), 1332–1338 (2015)
Lake, B.M., Salakhutdinov, R., Tenenbaum, J.B.: The Omniglot challenge: a 3-year progress report. Curr. Opin. Behav. Sci. 29, 97–104 (2019)
Lake, B.M., Ullman, T.D., Tenenbaum, J.B., Gershman, S.J.: Building machines that learn and think like people. Behav. Brain Sci. 40, e253 (2017)
LeCun, Y., Bengio, Y., Hinton, G.: Deep learning. Nature 521, 436–444 (2015)
Marcus, G.: Deep learning: a critical appraisal. arXiv preprint arXiv:1801.00631 (2018)
Nguyen, A., Yosinski, J., Clune, J.: Deep neural networks are easily fooled: high confidence predictions for unrecognizable images. In: IEEE Conference on Computer Vision and Pattern Recognition, pp. 427–436 (2015)
Otte, S., Rubisch, P., Butz, M.V.: Gradient-based learning of compositional dynamics with modular RNNs. In: Tetko, I.V., Kurková, V., Karpov, P., Theis, F. (eds.) ICANN 2019. LNCS, vol. 11727, pp. 484–496. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-30487-4_38
Werbos, P.J.: Backpropagation through time: what it does and how to do it. Proc. IEEE 78(10), 1550–1560 (1990)
Acknowledgements
The results of this work were produced with the help of the GPU cluster of the BMBF-funded project Training Center for Machine Learning (TCML) at the Eberhard Karls Universität Tübingen, administered by the Cognitive Systems group. We especially thank Maximus Mutschler, who is responsible for maintaining the cluster.
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Fabi, S., Otte, S., Wiese, J.G., Butz, M.V. (2020). Investigating Efficient Learning and Compositionality in Generative LSTM Networks. In: Farkaš, I., Masulli, P., Wermter, S. (eds) Artificial Neural Networks and Machine Learning – ICANN 2020. ICANN 2020. Lecture Notes in Computer Science(), vol 12396. Springer, Cham. https://doi.org/10.1007/978-3-030-61609-0_12
DOI: https://doi.org/10.1007/978-3-030-61609-0_12
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-61608-3
Online ISBN: 978-3-030-61609-0