Parallel Fitting of Additive Models for Regression

  • Conference paper
KI 2014: Advances in Artificial Intelligence (KI 2014)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 8736)

Abstract

To solve big data problems which occur in modern data mining applications, a comprehensive approach is required that combines a flexible model and an optimisation algorithm with fast convergence and a potential for efficient parallelisation both in the number of data points and the number of features.

In this paper we present an algorithm for fitting additive models based on the basis expansion principle. The classical backfitting algorithm that solves the underlying normal equations cannot be properly parallelised due to inherent data dependencies and leads to a limited error reduction under certain circumstances. Instead, we suggest a modified BiCGStab method adapted to suit the special block structure of the problem. The new method demonstrates superior convergence speed and promising parallel scalability.

We discuss the convergence properties of the method and further investigate its convergence behaviour and scalability on a set of benchmark problems.
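The contrast drawn in the abstract can be illustrated with a minimal sketch. It is not the authors' modified method: it uses a textbook, unpreconditioned BiCGStab on the normal equations and compares it with classical backfitting written as a block Gauss-Seidel sweep. The basis (centred monomials), the synthetic data, and all function names are assumptions made only for this illustration.

```python
import numpy as np

def basis_block(x, degree=3):
    """Hypothetical basis expansion of one feature: centred monomials x, x^2, ..., x^degree."""
    B = np.column_stack([x ** k for k in range(1, degree + 1)])
    return B - B.mean(axis=0)

def backfitting(blocks, y, sweeps=50):
    """Classical backfitting as block Gauss-Seidel: block j uses the freshly
    updated coefficients of the other blocks, so the inner loop is sequential."""
    coefs = [np.zeros(B.shape[1]) for B in blocks]
    for _ in range(sweeps):
        for j, B in enumerate(blocks):
            others = sum(Bk @ ck for k, (Bk, ck) in enumerate(zip(blocks, coefs)) if k != j)
            coefs[j] = np.linalg.lstsq(B, y - others, rcond=None)[0]
    return np.concatenate(coefs)

def bicgstab(matvec, b, tol=1e-10, maxiter=500):
    """Textbook unpreconditioned BiCGStab; only needs products with the system matrix."""
    x = np.zeros_like(b)
    r = b - matvec(x)
    r_hat = r.copy()
    rho = alpha = omega = 1.0
    v = np.zeros_like(b)
    p = np.zeros_like(b)
    stop = tol * np.linalg.norm(b)
    for _ in range(maxiter):
        if np.linalg.norm(r) <= stop:
            break
        rho_new = r_hat @ r
        beta = (rho_new / rho) * (alpha / omega)
        p = r + beta * (p - omega * v)
        v = matvec(p)
        alpha = rho_new / (r_hat @ v)
        s = r - alpha * v
        if np.linalg.norm(s) <= stop:
            x = x + alpha * p
            break
        t = matvec(s)
        omega = (t @ s) / (t @ t)
        x = x + alpha * p + omega * s
        r = s - omega * t
        rho = rho_new
    return x

# Synthetic additive regression problem (assumed data, for illustration only).
rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(500, 3))
y = np.sin(np.pi * X[:, 0]) + X[:, 1] ** 2 + 0.5 * X[:, 2] + 0.1 * rng.standard_normal(500)
y = y - y.mean()  # centring removes the intercept

blocks = [basis_block(X[:, j]) for j in range(X.shape[1])]
B = np.hstack(blocks)  # one column block per feature

c_direct = np.linalg.lstsq(B, y, rcond=None)[0]              # reference solution
c_backfit = backfitting(blocks, y)                            # sequential block updates
c_krylov = bicgstab(lambda c: B.T @ (B @ c), B.T @ y)         # normal equations B^T B c = B^T y

print("max |backfitting - direct|:", np.max(np.abs(c_backfit - c_direct)))
print("max |BiCGStab    - direct|:", np.max(np.abs(c_krylov - c_direct)))
```

In this sketch the Krylov solver touches the data only through products with B and its transpose, which can be split over rows (data points) or column blocks (features). Each backfitting update, by contrast, depends on the result of the previous block.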

Copyright information

© 2014 Springer International Publishing Switzerland

About this paper

Cite this paper

Khakhutskyy, V., Hegland, M. (2014). Parallel Fitting of Additive Models for Regression. In: Lutz, C., Thielscher, M. (eds) KI 2014: Advances in Artificial Intelligence. KI 2014. Lecture Notes in Computer Science, vol 8736. Springer, Cham. https://doi.org/10.1007/978-3-319-11206-0_24

  • DOI: https://doi.org/10.1007/978-3-319-11206-0_24

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-11205-3

  • Online ISBN: 978-3-319-11206-0

  • eBook Packages: Computer Science, Computer Science (R0)
