Part of the book series: Integrated Series in Information Systems (ISIS, volume 36)

Abstract

The main objective of this chapter is to discuss various supervised learning models in detail. A supervised learning model provides a parametrized mapping that projects a data domain onto a response set, and thus helps extract knowledge (the known) from data (the unknown). In their simplest form, these models can be grouped into predictive models and classification models. First, the predictive models, namely standard regression, ridge regression, lasso regression, and elastic-net regression, are discussed in detail with mathematical and visual interpretations using simple examples. Second, the classification models are discussed and grouped into three families: mathematical models, such as logistic regression and the support vector machine; hierarchical models, such as the decision tree and the random forest; and layered models, such as deep learning. The classification models are treated here from the modeling point of view only; their modeling and algorithms are covered together in detail in separate chapters later in the book.
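
To make the four predictive models concrete, the following is a minimal sketch, not taken from the chapter, that fits them to the same synthetic data using scikit-learn; the dataset, the sparse ground-truth coefficients, and the regularization strengths (alpha, l1_ratio) are all illustrative assumptions.

```python
# Minimal sketch (assumed scikit-learn API, illustrative hyperparameters):
# standard, ridge (L2 penalty), lasso (L1 penalty), and elastic-net
# (blended L1/L2 penalty) regression fitted to the same synthetic data.
import numpy as np
from sklearn.linear_model import ElasticNet, Lasso, LinearRegression, Ridge

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 10))                       # 100 samples, 10 features
true_coef = np.array([3.0, -2.0, 1.5] + [0.0] * 7)   # sparse ground truth
y = X @ true_coef + rng.normal(scale=0.5, size=100)  # noisy responses

models = {
    "standard":    LinearRegression(),
    "ridge":       Ridge(alpha=1.0),
    "lasso":       Lasso(alpha=0.1),
    "elastic-net": ElasticNet(alpha=0.1, l1_ratio=0.5),
}
for name, model in models.items():
    model.fit(X, y)
    n_zero = int(np.sum(np.isclose(model.coef_, 0.0)))
    print(f"{name:11s} coefficients set to zero: {n_zero}")
```

The point of the comparison is the behavior the chapter formalizes: the L1 term in lasso and elastic-net drives some coefficients exactly to zero (variable selection), while ridge only shrinks them toward zero.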

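The three families of classification models can be sketched the same way. The comparison below is again an illustrative assumption rather than the chapter's own code, with a small multilayer perceptron standing in for the layered (deep learning) model.

```python
# Minimal sketch (assumed scikit-learn API, illustrative hyperparameters):
# mathematical models (logistic regression, SVM), hierarchical models
# (decision tree, random forest), and a layered model (a small MLP)
# trained on a toy two-class problem.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=500, n_features=8, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

classifiers = {
    "logistic regression": LogisticRegression(max_iter=1000),
    "support vector machine": SVC(kernel="rbf"),
    "decision tree": DecisionTreeClassifier(max_depth=5),
    "random forest": RandomForestClassifier(n_estimators=100),
    "layered (MLP)": MLPClassifier(hidden_layer_sizes=(32, 32),
                                   max_iter=1000, random_state=0),
}
for name, clf in classifiers.items():
    clf.fit(X_tr, y_tr)
    print(f"{name:23s} test accuracy: {clf.score(X_te, y_te):.3f}")
```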

Acknowledgements

I would like to thank the authors/owners of the LaTeX materials posted at https://jcnts.wordpress.com/ (for formatting the optimization problems, last accessed on April 20th, 2015) and http://www.tex.ac.uk/CTAN/info/symbols/comprehensive/symbols-a4.pdf (for the LaTeX symbols, last accessed on April 20th, 2015). These materials helped with the formatting of several mathematical equations in this book.

Copyright information

© 2016 Springer Science+Business Media New York

About this chapter

Cite this chapter

Suthaharan, S. (2016). Supervised Learning Models. In: Machine Learning Models and Algorithms for Big Data Classification. Integrated Series in Information Systems, vol 36. Springer, Boston, MA. https://doi.org/10.1007/978-1-4899-7641-3_7
