Learning Context Sensitive Languages with LSTM Trained with Kalman Filters

  • Conference paper
  • In: Artificial Neural Networks — ICANN 2002 (ICANN 2002)
  • Part of the book series: Lecture Notes in Computer Science (LNCS, volume 2415)

Abstract

Unlike traditional recurrent neural networks, the Long Short-Term Memory (LSTM) model generalizes well when presented with training sequences derived from regular and also simple nonregular languages. Our novel combination of LSTM and the decoupled extended Kalman filter, however, learns even faster and generalizes even better, requiring only the 10 shortest exemplars (n ≤ 10) of the context-sensitive language a^n b^n c^n to deal correctly with values of n up to 1000 and more. Even when the relatively high update complexity per timestep is taken into account, the hybrid in many cases learns faster than LSTM by itself.
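
The abstract compresses two concrete ingredients: the training data (the ten shortest a^n b^n c^n strings, which in this line of work are usually presented as next-symbol prediction tasks) and the decoupled extended Kalman filter (DEKF) weight update. The sketch below is a minimal illustration under stated assumptions, not the authors' implementation: it assumes numpy, uses the textbook DEKF formulation (per-group error covariances P_i, a shared scaling matrix, Kalman-gain weight updates), and every function and variable name (make_anbncn, dekf_step, r, q) is hypothetical.

```python
import numpy as np

def make_anbncn(n_max=10):
    """The n_max shortest strings of the context-sensitive language
    a^n b^n c^n (n = 1..n_max); per the abstract, training uses n <= 10."""
    return ["a" * n + "b" * n + "c" * n for n in range(1, n_max + 1)]

def dekf_step(weights, P, H, innovation, r=1.0, q=1e-4):
    """One decoupled extended Kalman filter update.

    weights    -- list of weight-group vectors w_i
    P          -- list of error covariances P_i, one per group
    H          -- list of Jacobians H_i = d(output)/d(w_i), shape (m, n_i)
    innovation -- target minus network output, shape (m,)
    r, q       -- assumed measurement- and process-noise scalars
    """
    m = innovation.shape[0]
    # Shared scaling matrix A = [R + sum_i H_i P_i H_i^T]^(-1).
    S = r * np.eye(m)
    for Hi, Pi in zip(H, P):
        S += Hi @ Pi @ Hi.T
    A = np.linalg.inv(S)
    # Per-group Kalman gain, weight update, and covariance update.
    for i, (wi, Pi, Hi) in enumerate(zip(weights, P, H)):
        K = Pi @ Hi.T @ A                                   # K_i = P_i H_i^T A
        weights[i] = wi + K @ innovation                    # w_i += K_i (d - y)
        P[i] = Pi - K @ Hi @ Pi + q * np.eye(Pi.shape[0])   # covariance update
    return weights, P

# Toy usage: two decoupled groups (3 and 2 weights), scalar output (m = 1).
rng = np.random.default_rng(0)
weights = [rng.normal(size=3), rng.normal(size=2)]
P = [0.1 * np.eye(3), 0.1 * np.eye(2)]
H = [rng.normal(size=(1, 3)), rng.normal(size=(1, 2))]
weights, P = dekf_step(weights, P, H, innovation=np.array([0.5]))
```

Decoupling keeps the joint covariance block-diagonal, so the per-timestep cost scales with the square of the largest weight group rather than of the whole network; this is the "relatively high update complexity per timestep" that the abstract weighs against the faster convergence.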

Work supported by SNF grant 2100-49'144.96, Spanish Comisión Interministerial de Ciencia y Tecnología grant TIC2000-1599-C02-02, and Generalitat Valenciana grant FPI-99-14-268.

Copyright information

© 2002 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Gers, F.A., Pérez-Ortiz, J.A., Eck, D., Schmidhuber, J. (2002). Learning Context Sensitive Languages with LSTM Trained with Kalman Filters. In: Dorronsoro, J.R. (eds) Artificial Neural Networks — ICANN 2002. ICANN 2002. Lecture Notes in Computer Science, vol 2415. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-46084-5_107

  • DOI: https://doi.org/10.1007/3-540-46084-5_107

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-44074-1

  • Online ISBN: 978-3-540-46084-8

  • eBook Packages: Springer Book Archive
