Parallel Fitting of Additive Models for Regression

Khakhutskyy, Valeriy; Hegland, Markus

doi:10.1007/978-3-319-11206-0_24

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 8736)

Included in the following conference series:

Joint German/Austrian Conference on Artificial Intelligence (Künstliche Intelligenz)

Abstract

To solve big data problems which occur in modern data mining applications, a comprehensive approach is required that combines a flexible model and an optimisation algorithm with fast convergence and a potential for efficient parallelisation both in the number of data points and the number of features.

In this paper we present an algorithm for fitting additive models based on the basis expansion principle. The classical backfitting algorithm that solves the underlying normal equations cannot be properly parallelised due to inherent data dependencies and leads to a limited error reduction under certain circumstances. Instead, we suggest a modified BiCGStab method adapted to suit the special block structure of the problem. The new method demonstrates superior convergence speed and promising parallel scalability.

We discuss the convergence properties of the method and investigate its convergence and scalability further using a set of benchmark problems.
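
The data dependency of backfitting mentioned above can be illustrated with a small sketch. It is not the authors' implementation and does not reproduce their modified BiCGStab method. It assumes a hypothetical basis of centred monomials per feature, a ridge parameter `lam`, and synthetic data; the function names `basis`, `backfit` and `direct` are illustrative only. Backfitting is written here as block Gauss-Seidel on the regularised normal equations and compared with a direct solve of the same system.

```python
import numpy as np


def basis(x, degree=3):
    """Hypothetical per-feature basis: centred monomials x, x^2, ..., x^degree."""
    B = np.column_stack([x ** k for k in range(1, degree + 1)])
    return B - B.mean(axis=0)


def backfit(blocks, y, lam=1e-3, sweeps=50):
    """Backfitting, i.e. block Gauss-Seidel on (B^T B + lam*I) c = B^T y."""
    coefs = [np.zeros(B.shape[1]) for B in blocks]
    fits = [np.zeros_like(y) for _ in blocks]
    for _ in range(sweeps):
        for j, B in enumerate(blocks):
            # The partial residual needs the newest fits of all other
            # components, so the updates for j = 0, 1, ... must run in order.
            residual = y - (sum(fits) - fits[j])
            gram = B.T @ B + lam * np.eye(B.shape[1])
            coefs[j] = np.linalg.solve(gram, B.T @ residual)
            fits[j] = B @ coefs[j]
    return np.concatenate(coefs)


def direct(blocks, y, lam=1e-3):
    """Reference solution: solve the full normal equations in one step."""
    B = np.hstack(blocks)
    return np.linalg.solve(B.T @ B + lam * np.eye(B.shape[1]), B.T @ y)


# Synthetic additive target (an assumption made for this illustration only).
rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(200, 3))
y = np.sin(np.pi * X[:, 0]) + X[:, 1] ** 2 + 0.5 * X[:, 2]
y = y + 0.05 * rng.standard_normal(200)
y = y - y.mean()  # the intercept is handled by centring

blocks = [basis(X[:, j]) for j in range(X.shape[1])]
c_backfit = backfit(blocks, y)
c_direct = direct(blocks, y)
print("max coefficient difference:", np.max(np.abs(c_backfit - c_direct)))
```

In this sketch the inner loop over `j` is inherently sequential. A Krylov method such as standard BiCGStab (available, for example, as `scipy.sparse.linalg.bicgstab`) accesses the system only through matrix-vector products, which is the kind of operation that can be distributed over data points and features.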

Author information

Authors and Affiliations

Institute for Advanced Study, Technische Universität München, Lichtenbergstrasse 2a, D-85748, Garching, Germany
Valeriy Khakhutskyy
Centre for Mathematics and Its Applications, Mathematical Sciences Institute, Australian National University, Canberra, ACT, 0200, Australia
Markus Hegland

Editor information

Editors and Affiliations

Universität Bremen, Germany
Carsten Lutz
University of New South Wales, 2052, Sydney, NSW, Australia
Michael Thielscher

About this paper

Cite this paper

Khakhutskyy, V., Hegland, M. (2014). Parallel Fitting of Additive Models for Regression. In: Lutz, C., Thielscher, M. (eds) KI 2014: Advances in Artificial Intelligence. KI 2014. Lecture Notes in Computer Science, vol 8736. Springer, Cham. https://doi.org/10.1007/978-3-319-11206-0_24

DOI: https://doi.org/10.1007/978-3-319-11206-0_24
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-11205-3
Online ISBN: 978-3-319-11206-0
eBook Packages: Computer Science, Computer Science (R0)
