Abstract
Artificial neural networks are popular machine learning techniques that simulate the mechanism of learning in biological organisms. The human nervous system contains cells called neurons, which are connected to one another by axons and dendrites; the regions where axons and dendrites connect are referred to as synapses. These connections are illustrated in Figure 1.1(a). The strengths of synaptic connections often change in response to external stimuli, and it is through such changes that learning takes place in living organisms.
“Thou shalt not make a machine to counterfeit a human mind.” —Frank Herbert
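The analogy between synaptic strengths and learned weights can be made concrete with a single artificial neuron. The following minimal sketch (a generic perceptron-style update in Python, not code from the chapter; all names and data are illustrative) adjusts a weight vector whenever a training example, the computational analogue of an external stimulus, is misclassified.

```python
import numpy as np

def train_perceptron(X, y, epochs=10, lr=0.1):
    """Learn weights for a single neuron; labels y are -1 or +1."""
    w = np.zeros(X.shape[1])  # one weight per input connection ("synapse")
    b = 0.0
    for _ in range(epochs):
        for x_i, y_i in zip(X, y):
            y_hat = 1.0 if (w @ x_i + b) > 0 else -1.0  # current prediction
            if y_hat != y_i:             # a stimulus the neuron got wrong
                w += lr * y_i * x_i      # strengthen/weaken connections
                b += lr * y_i
    return w, b

# Example: a linearly separable AND-style pattern.
X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
y = np.array([-1., -1., -1., 1.])
w, b = train_perceptron(X, y)
print(w, b, np.where(X @ w + b > 0, 1.0, -1.0))
```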
Notes
1. The ReLU shows asymmetric saturation; a short numerical sketch follows these notes.
2.
3. Weight decay is generally used with other loss functions in single-layer models and in all multi-layer models with a large number of parameters; a minimal gradient-step sketch also follows these notes.
4. This is an overloading of the terminology used in convolutional neural networks. The meaning of the word “depth” is inferred from the context in which it is used.
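To see what the asymmetric saturation of note 1 means, compare the derivative of the ReLU with that of the sigmoid: the ReLU's gradient is exactly zero for every negative pre-activation (one-sided saturation) but exactly one for every positive one, whereas the sigmoid's gradient vanishes at both extremes. A minimal numerical sketch (illustrative, not from the chapter):

```python
import numpy as np

relu_grad = lambda x: float(x > 0)                # derivative of max(0, x)
sigmoid = lambda x: 1.0 / (1.0 + np.exp(-x))
sigmoid_grad = lambda x: sigmoid(x) * (1.0 - sigmoid(x))

for x in [-10.0, -1.0, 1.0, 10.0]:
    print(f"x={x:+5.1f}  relu'={relu_grad(x):.3f}  sigmoid'={sigmoid_grad(x):.6f}")
# relu' is 0.000 for every negative x (saturated on one side only) and
# 1.000 for every positive x, while sigmoid' vanishes at BOTH extremes.
```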
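The weight decay of note 3 corresponds to adding an L2 penalty (lambda/2)*||w||^2 to whatever loss is being minimized; the penalty contributes lambda*w to the gradient, so every step shrinks the weights slightly in addition to fitting the data. A minimal single-layer sketch (illustrative; the learning rate, decay constant, and data are made up):

```python
import numpy as np

def sgd_step(w, x, y, lr=0.01, lam=0.1):
    y_hat = w @ x                 # single-layer (linear) prediction
    grad_loss = (y_hat - y) * x   # gradient of the squared loss 0.5*(y_hat - y)**2
    grad_decay = lam * w          # gradient of the penalty (lam/2)*||w||**2
    return w - lr * (grad_loss + grad_decay)

w = np.array([2.0, -3.0])         # hypothetical initial weights
x = np.array([1.0, 0.5])          # hypothetical training example
y = 1.0
for _ in range(5):
    w = sgd_step(w, x, y)
print(w)  # weights are pulled toward zero in addition to fitting y
```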
Copyright information
Ā© 2018 Springer International Publishing AG, part of Springer Nature
About this chapter
Cite this chapter
Aggarwal, C.C. (2018). An Introduction to Neural Networks. In: Neural Networks and Deep Learning. Springer, Cham. https://doi.org/10.1007/978-3-319-94463-0_1
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-94462-3
Online ISBN: 978-3-319-94463-0
eBook Packages: Computer Science; Computer Science (R0)