Vapnik-Chervonenkis dimension of recurrent neural networks

  • Conference paper
Computational Learning Theory (EuroCOLT 1997)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 1208)

Abstract

Most of the work on the Vapnik-Chervonenkis dimension of neural networks has been focused on feedforward networks. However, recurrent networks are also widely used in learning applications, in particular when time is a relevant parameter. This paper provides lower and upper bounds for the VC dimension of such networks. Several types of activation functions are discussed, including threshold, polynomial, piecewise-polynomial and sigmoidal functions. The bounds depend on two independent parameters: the number w of weights in the network, and the length k of the input sequence. In contrast, for feedforward networks, VC dimension bounds can be expressed as a function of w only. An important difference between recurrent and feedforward nets is that a fixed recurrent net can receive inputs of arbitrary length. Therefore we are particularly interested in the case k≫w. Ignoring multiplicative constants, the main results say roughly the following:

  • For architectures with activation σ = any fixed nonlinear polynomial, the VC dimension is ≈ wk.

  • For architectures with activation σ = any fixed piecewise polynomial, the VC dimension is between wk and w²k.

  • For architectures with activation σ = H (threshold nets), the VC dimension is between w log(k/w) and min{wk log(wk), w² + w log(wk)}.

  • For the standard sigmoid σ(x) = 1/(1+e⁻ˣ), the VC dimension is between wk and w⁴k².
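To make the two parameters concrete, here is a minimal illustrative sketch (not taken from the paper): a one-layer sigmoidal recurrent network that updates its state once per input symbol and classifies the sequence by thresholding a linear readout of the final state. The names and dimensions (recurrent_classifier, A, B, c, n) are assumptions chosen for illustration; w counts the weight entries of the fixed architecture, while k is the input-sequence length and can grow without changing w.

```python
import numpy as np

def sigmoid(z):
    # Standard sigmoid activation: sigma(x) = 1 / (1 + e^(-x)).
    return 1.0 / (1.0 + np.exp(-z))

def recurrent_classifier(A, B, c, u_seq):
    """Hypothetical sketch: iterate x_{t+1} = sigma(A x_t + B u_t)
    over a length-k input sequence, then classify by the sign of a
    linear readout of the final state."""
    x = np.zeros(A.shape[0])            # initial state x_0 = 0
    for u in u_seq:                     # k recurrence steps
        x = sigmoid(A @ x + B @ u)
    return int(c @ x >= 0)              # binary label

# Example: n = 3 state units, scalar inputs, sequence length k = 10.
rng = np.random.default_rng(0)
n, k = 3, 10
A = rng.standard_normal((n, n))         # n^2 recurrent weights
B = rng.standard_normal((n, 1))         # n input weights
c = rng.standard_normal(n)              # n readout weights
w = A.size + B.size + c.size            # total weight count: w = n^2 + 2n
u_seq = rng.standard_normal((k, 1))     # one input sequence of length k
print(recurrent_classifier(A, B, c, u_seq), f"(w = {w}, k = {k})")
```

In this sigmoidal case the bounds above say the VC dimension of the resulting class of sequence classifiers lies between wk and w⁴k², up to constants: holding the architecture (hence w) fixed while lengthening the input sequences already increases capacity, which is why the bounds must involve k as well as w.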

This research was carried out in part while visiting DIMACS and the Rutgers Center for Systems and Control (SYCON) at Rutgers University.

This research was supported in part by US Air Force Grant AFOSR-94-0293.

Author information

Authors

Pascal Koiran and Eduardo D. Sontag

Editor information

Shai Ben-David

Copyright information

© 1997 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Koiran, P., Sontag, E.D. (1997). Vapnik-Chervonenkis dimension of recurrent neural networks. In: Ben-David, S. (eds) Computational Learning Theory. EuroCOLT 1997. Lecture Notes in Computer Science, vol 1208. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-62685-9_19

  • DOI: https://doi.org/10.1007/3-540-62685-9_19

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-62685-5

  • Online ISBN: 978-3-540-68431-2
