Abstract
Having described the structure, the operation, and the training of (artificial) neural networks in a general fashion in the preceding chapter, we turn in this and the subsequent chapters to specific forms of (artificial) neural networks. We start with the best-known and most widely used form, the so-called multi-layer perceptron (MLP), which is closely related to the networks of threshold logic units studied in a previous chapter. Multi-layer perceptrons exhibit a strictly layered structure and may employ activation functions other than a step function at a crisp threshold.
Notes
- 1.
Conservative logic is a mathematical model for computations and computational powers of computers, in which the fundamental physical principles that govern computing machines are explicitly taken into account. Among these principles are, for instance, that the speed with which information can travel as well as the amount of information that can be stored in the state of a finite system are both finite (Fredkin and Toffoli 1982).
- 2.
In the following we assume implicitly that the output function of all neurons is the identity. Only the activation functions are exchanged.
- 3.
Note that this approach is not easily transferred to functions with multiple arguments. For this to be possible, the influences of the individual inputs have to be independent in a certain sense.
- 4.
Note, however, that with this approach the sum of squared errors is minimized in the transformed space (coordinates x′=lnx and y′=lny), but this does not imply that it is also minimized in the original space (coordinates x and y). Nevertheless this approach usually yields very good results or at least an initial solution that may then be improved by other means.
- 5.
Note again that with this procedure the sum of squared errors is minimized in the transformed space (coordinates x and \(z = \ln (\frac{Y-y}{y} )\)), but this does not imply that it is also minimized in the original space (coordinates x and y), cf. the preceding footnote.
- 6.
This holds unless the output function is not differentiable. However, we usually assume (implicitly) that the output function is the identity and thus does not introduce any problems.
- 7.
In order to avoid this factor right from the start, the error of an output neuron is sometimes defined as \(e_{u}^{(l)} = \frac{1}{2} (o_{u}^{(l)} - \operatorname{out}_{u}^{(l)} )^{2}\). In this way the factor 2 simply cancels in the derivation.
- 8.
Note that the bias value \(\theta_{u}\) is already contained in the extended weight vector.
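The transformation described in footnotes 4 and 5 can be illustrated with a small sketch: fitting a logistic function \(y = Y / (1 + e^{a + bx})\) by applying the logit transform \(z = \ln(\frac{Y-y}{y})\), which turns the model into the linear relation \(z = a + bx\), and then solving an ordinary least-squares problem in the transformed space. The data values below are hypothetical and chosen only for illustration; the least-squares fit itself is standard linear regression.

```python
import numpy as np

# Hypothetical data generated from a logistic curve
# y = Y / (1 + exp(a + b*x)) with Y = 6, a = 4, b = -2.
Y = 6.0
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0, 5.0])
y = Y / (1.0 + np.exp(4.0 - 2.0 * x))

# Logit transform: z = ln((Y - y) / y) linearizes the model,
# so that z = a + b*x holds exactly for noise-free data.
z = np.log((Y - y) / y)

# Ordinary least squares on (x, z); np.polyfit with degree 1
# returns the coefficients [slope, intercept] = [b, a].
b, a = np.polyfit(x, z, 1)

# Map the fitted parameters back to the original space.
y_fit = Y / (1.0 + np.exp(a + b * x))
```

As the footnotes caution, the sum of squared errors is minimized in the coordinates (x, z), not in the original coordinates (x, y); for noisy data the result is therefore typically used as an initial solution that is refined by other means (e.g., gradient descent in the original space).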
References
S.E. Fahlman. An Empirical Study of Learning Speed in Backpropagation Networks. In: Touretzky et al. (1988)
E. Fredkin and T. Toffoli. Conservative Logic. International Journal of Theoretical Physics 21(3/4):219–253. Plenum Press, New York, NY, USA, 1982
R.A. Jacobs. Increased Rates of Convergence Through Learning Rate Adaptation. Neural Networks 1:295–307. Pergamon Press, Oxford, United Kingdom, 1988
A. Pinkus. Approximation Theory of the MLP Model in Neural Networks. Acta Numerica 8:143–196. Cambridge University Press, Cambridge, United Kingdom, 1999
M. Riedmiller and H. Braun. Rprop—A Fast Adaptive Learning Algorithm. Technical Report, University of Karlsruhe, Karlsruhe, Germany, 1992
M. Riedmiller and H. Braun. A Direct Adaptive Method for Faster Backpropagation Learning: The RPROP Algorithm. Int. Conf. on Neural Networks (ICNN-93, San Francisco, CA), 586–591. IEEE Press, Piscataway, NJ, USA, 1993
D.E. Rumelhart, G.E. Hinton and R.J. Williams. Learning Representations by Back-Propagating Errors. Nature 323:533–536, 1986
T. Tollenaere. SuperSAB: Fast Adaptive Backpropagation with Good Scaling Properties. Neural Networks 3:561–573, 1990
D. Touretzky, G. Hinton and T. Sejnowski (eds.) Proc. of the Connectionist Models Summer School (Carnegie Mellon University). Morgan Kaufmann, San Mateo, CA, USA, 1988
P.J. Werbos. Beyond Regression: New Tools for Prediction and Analysis in the Behavioral Sciences. Ph.D. Thesis, Harvard University, Cambridge, MA, USA, 1974
Copyright information
© 2013 Springer-Verlag London
Cite this chapter
Kruse, R., Borgelt, C., Klawonn, F., Moewes, C., Steinbrecher, M., Held, P. (2013). Multi-Layer Perceptrons. In: Computational Intelligence. Texts in Computer Science. Springer, London. https://doi.org/10.1007/978-1-4471-5013-8_5
DOI: https://doi.org/10.1007/978-1-4471-5013-8_5
Publisher Name: Springer, London
Print ISBN: 978-1-4471-5012-1
Online ISBN: 978-1-4471-5013-8