Abstract
Having described the structure, the operation, and the training of (artificial) neural networks in a general fashion in the preceding chapter, we turn in this and the subsequent chapters to specific forms of (artificial) neural networks. We start with the best-known and most widely used form, the so-called multi-layer perceptron (MLP), which is closely related to the networks of threshold logic units studied in a previous chapter. Multi-layer perceptrons exhibit a strictly layered structure and may employ activation functions other than a step function at a crisp threshold.
Notes
- 1.
Conservative logic is a mathematical model for computations and computational powers of computers, in which the fundamental physical principles that govern computing machines are explicitly taken into account. Among these principles are, for instance, that the speed with which information can travel as well as the amount of information that can be stored in the state of a finite system are both finite (Fredkin and Toffoli 1982).
- 2.
In the following we assume implicitly that the output function of all neurons is the identity. Only the activation functions are exchanged.
- 3.
Note that this approach is not easily transferred to functions with multiple arguments. For this to be possible, the influences of the individual inputs have to be independent in a certain sense.
- 4.
Note, however, that with this approach the sum of squared errors is minimized in the transformed space (coordinates x′=lnx and y′=lny), but this does not imply that it is also minimized in the original space (coordinates x and y). Nevertheless this approach usually yields very good results or at least an initial solution that may then be improved by other means.
- 5.
Note again that with this procedure the sum of squared errors is minimized in the transformed space (coordinates x and \(z = \ln (\frac{Y-y}{y} )\)), but this does not imply that it is also minimized in the original space (coordinates x and y), cf. the preceding footnote.
- 6.
This holds unless the output function is not differentiable. However, we usually assume (implicitly) that the output function is the identity and thus does not introduce any problems.
- 7.
In order to avoid this factor right from the start, the error of an output neuron is sometimes defined as \(e_{u}^{(l)} = \frac{1}{2} (o_{u}^{(l)} - \operatorname{out}_{u}^{(l)} )^{2}\). In this way the factor 2 simply cancels in the derivation.
- 8.
Note that the bias value \(\theta_{u}\) is already contained in the extended weight vector.
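The transformation described in footnotes 4 and 5 can be illustrated with a small sketch: fitting a logistic function \(y = Y / (1 + e^{a + bx})\) by applying the logit transform \(z = \ln(\frac{Y-y}{y})\), which turns the model into the linear relation \(z = a + bx\), and then solving an ordinary least-squares problem in the transformed space. The data values below are hypothetical and chosen only for illustration; the least-squares fit itself is standard linear regression.

```python
import numpy as np

# Hypothetical data generated from a logistic curve
# y = Y / (1 + exp(a + b*x)) with Y = 6, a = 4, b = -2.
Y = 6.0
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0, 5.0])
y = Y / (1.0 + np.exp(4.0 - 2.0 * x))

# Logit transform: z = ln((Y - y) / y) linearizes the model,
# so that z = a + b*x holds exactly for noise-free data.
z = np.log((Y - y) / y)

# Ordinary least squares on (x, z); np.polyfit with degree 1
# returns the coefficients [slope, intercept] = [b, a].
b, a = np.polyfit(x, z, 1)

# Map the fitted parameters back to the original space.
y_fit = Y / (1.0 + np.exp(a + b * x))
```

As the footnotes caution, the sum of squared errors is minimized in the coordinates (x, z), not in the original coordinates (x, y); for noisy data the result is therefore typically used as an initial solution that is refined by other means (e.g., gradient descent in the original space).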
References
S.E. Fahlman. An Empirical Study of Learning Speed in Backpropagation Networks. In: Touretzky et al. (1988)
E. Fredkin and T. Toffoli. Conservative Logic. International Journal of Theoretical Physics 21(3/4):219–253. Plenum Press, New York, NY, USA, 1982
R.A. Jacobs. Increased Rates of Convergence Through Learning Rate Adaptation. Neural Networks 1:295–307. Pergamon Press, Oxford, United Kingdom, 1988
A. Pinkus. Approximation Theory of the MLP Model in Neural Networks. Acta Numerica 8:143–196. Cambridge University Press, Cambridge, United Kingdom, 1999
M. Riedmiller and H. Braun. Rprop—A Fast Adaptive Learning Algorithm. Technical Report, University of Karlsruhe, Karlsruhe, Germany, 1992
M. Riedmiller and H. Braun. A Direct Adaptive Method for Faster Backpropagation Learning: The RPROP Algorithm. Int. Conf. on Neural Networks (ICNN-93, San Francisco, CA), 586–591. IEEE Press, Piscataway, NJ, USA, 1993
D.E. Rumelhart, G.E. Hinton and R.J. Williams. Learning Representations by Back-Propagating Errors. Nature 323:533–536, 1986
T. Tollenaere. SuperSAB: Fast Adaptive Backpropagation with Good Scaling Properties. Neural Networks 3:561–573, 1990
D. Touretzky, G. Hinton and T. Sejnowski (eds.) Proc. of the Connectionist Models Summer School (Carnegie Mellon University). Morgan Kaufmann, San Mateo, CA, USA, 1988
P.J. Werbos. Beyond Regression: New Tools for Prediction and Analysis in the Behavioral Sciences. Ph.D. Thesis, Harvard University, Cambridge, MA, USA, 1974
Copyright information
© 2013 Springer-Verlag London
Cite this chapter
Kruse, R., Borgelt, C., Klawonn, F., Moewes, C., Steinbrecher, M., Held, P. (2013). Multi-Layer Perceptrons. In: Computational Intelligence. Texts in Computer Science. Springer, London. https://doi.org/10.1007/978-1-4471-5013-8_5
DOI: https://doi.org/10.1007/978-1-4471-5013-8_5
Publisher Name: Springer, London
Print ISBN: 978-1-4471-5012-1
Online ISBN: 978-1-4471-5013-8