Abstract
The main objective of this chapter is to discuss various supervised learning models in detail. A supervised learning model provides a parametrized mapping that projects a data domain into a response set, and thus helps extract knowledge (the known) from data (the unknown). In simple form, these learning models can be grouped into predictive models and classification models. First, the predictive models, namely standard regression, ridge regression, lasso regression, and elastic-net regression, are discussed in detail with their mathematical and visual interpretations using simple examples. Second, the classification models are discussed and grouped into three categories: mathematical models, such as logistic regression and the support vector machine; hierarchical models, such as the decision tree and the random forest; and layered models, such as deep learning. They are discussed only from the modeling point of view; the modeling and the algorithms are discussed together in detail in separate chapters later in the book.
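As a rough illustration of how the four predictive models named above differ, the following minimal sketch fits a single slope with no intercept, assuming the objective sum((y_i - w*x_i)^2) + l1*|w| + l2*w^2. The function names, the penalty parameters `l1` and `l2`, and the toy data are hypothetical choices made for this illustration and are not taken from the chapter.

```python
def soft_threshold(z, t):
    """Shrink z toward zero by t; return 0.0 when |z| <= t."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def fit_slope(x, y, l1=0.0, l2=0.0):
    """Minimise sum((y_i - w*x_i)^2) + l1*|w| + l2*w^2 over one slope w.

    l1 = 0, l2 = 0  -> standard least-squares regression
    l1 = 0, l2 > 0  -> ridge regression
    l1 > 0, l2 = 0  -> lasso regression
    l1 > 0, l2 > 0  -> elastic-net regression
    """
    sxy = sum(a * b for a, b in zip(x, y))  # sum of x_i * y_i
    sxx = sum(a * a for a in x)             # sum of x_i^2
    # Setting the (sub)gradient to zero gives this closed form (one feature only).
    return soft_threshold(sxy, l1 / 2.0) / (sxx + l2)


# Toy data lying exactly on the line y = 2x (illustrative values only).
x = [1.0, 2.0, 3.0]
y = [2.0, 4.0, 6.0]

w_ols = fit_slope(x, y)                       # no penalty
w_ridge = fit_slope(x, y, l2=14.0)            # squared penalty shrinks the slope
w_lasso = fit_slope(x, y, l1=28.0)            # absolute penalty shrinks the slope
w_enet = fit_slope(x, y, l1=28.0, l2=14.0)    # both penalties combined
w_lasso_zero = fit_slope(x, y, l1=56.0)       # a large l1 sets the slope to zero

print(w_ols, w_ridge, w_lasso, w_enet, w_lasso_zero)
```

In this one-feature setting the ridge penalty only rescales the slope, whereas the lasso penalty can set it exactly to zero, which is the usual intuition behind lasso-based variable selection. With several features the lasso and elastic-net solutions generally have no closed form and are computed iteratively.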
Acknowledgements
I would like to thank the authors/owners of the LaTeX materials posted at https://jcnts.wordpress.com/ (for formatting the optimization problems, last accessed on April 20th, 2015) and http://www.tex.ac.uk/CTAN/info/symbols/comprehensive/symbols-a4.pdf (for the LaTeX symbols, last accessed on April 20th, 2015). These materials helped with the formatting of several mathematical equations in this book.
Copyright information
© 2016 Springer Science+Business Media New York
About this chapter
Cite this chapter
Suthaharan, S. (2016). Supervised Learning Models. In: Machine Learning Models and Algorithms for Big Data Classification. Integrated Series in Information Systems, vol 36. Springer, Boston, MA. https://doi.org/10.1007/978-1-4899-7641-3_7
DOI: https://doi.org/10.1007/978-1-4899-7641-3_7
Publisher Name: Springer, Boston, MA
Print ISBN: 978-1-4899-7640-6
Online ISBN: 978-1-4899-7641-3
eBook Packages: Business and Management (R0)