
Moving Beyond Linearity

Chapter in An Introduction to Statistical Learning

Part of the book series: Springer Texts in Statistics (STS, volume 103)

Abstract

So far in this book, we have mostly focused on linear models. Linear models are relatively simple to describe and implement, and have advantages over other approaches in terms of interpretation and inference. However, standard linear regression can have significant limitations in terms of predictive power. This is because the linearity assumption is almost always an approximation, and sometimes a poor one. In Chapter 6 we see that we can improve upon least squares using ridge regression, the lasso, principal components regression, and other techniques. In that setting, the improvement is obtained by reducing the complexity of the linear model, and hence the variance of the estimates. But we are still using a linear model, which can only be improved so far! In this chapter we relax the linearity assumption while still attempting to maintain as much interpretability as possible. We do this by examining very simple extensions of linear models like polynomial regression and step functions, as well as more sophisticated approaches such as splines, local regression, and generalized additive models.
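
As a taste of the simplest of these extensions, here is a minimal sketch (my own illustration on simulated data, not an example from the book) of polynomial regression: the model is still linear in its coefficients, so ordinary least squares applies unchanged once the predictor is expanded into polynomial features.

```python
import numpy as np

# Simulated data with a non-linear signal (assumption for illustration)
rng = np.random.default_rng(0)
x = rng.uniform(-3, 3, size=200)
y = np.sin(x) + rng.normal(scale=0.3, size=x.size)

# Polynomial regression = ordinary least squares on an expanded design matrix
X_lin = np.column_stack([np.ones_like(x), x])        # straight-line fit
X_poly = np.vander(x, N=5, increasing=True)          # columns 1, x, x^2, x^3, x^4

beta_lin, *_ = np.linalg.lstsq(X_lin, y, rcond=None)
beta_poly, *_ = np.linalg.lstsq(X_poly, y, rcond=None)

mse = lambda X, b: np.mean((y - X @ b) ** 2)
print("linear fit MSE:    ", mse(X_lin, beta_lin))
print("degree-4 poly MSE: ", mse(X_poly, beta_poly))  # lower: it captures the curvature
```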


Notes

  1. If \(\hat{\mathbf{C}}\) is the \(5 \times 5\) covariance matrix of the \(\hat{\beta}_{j}\), and if \(\boldsymbol{\ell}_{0}^{T} = (1, x_{0}, x_{0}^{2}, x_{0}^{3}, x_{0}^{4})\), then \(\mathrm{Var}[\hat{f}(x_{0})] = \boldsymbol{\ell}_{0}^{T}\hat{\mathbf{C}}\boldsymbol{\ell}_{0}\).
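
     To make the formula concrete, here is a minimal numpy sketch (not from the text) that fits a degree-4 polynomial by least squares and evaluates \(\boldsymbol{\ell}_{0}^{T}\hat{\mathbf{C}}\boldsymbol{\ell}_{0}\) at a query point; the simulated data and the plug-in estimate \(\hat{\mathbf{C}} = \hat{\sigma}^{2}(X^{T}X)^{-1}\) are assumptions for the example.

```python
import numpy as np

# Simulated data (assumption, not from the text)
rng = np.random.default_rng(0)
x = rng.uniform(-2, 2, size=200)
y = np.sin(x) + rng.normal(scale=0.3, size=x.size)

# Degree-4 polynomial design matrix: columns 1, x, x^2, x^3, x^4
X = np.vander(x, N=5, increasing=True)

# Least-squares fit and the plug-in covariance C_hat = sigma2_hat * (X^T X)^{-1}
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ beta_hat
sigma2_hat = resid @ resid / (len(y) - X.shape[1])
C_hat = sigma2_hat * np.linalg.inv(X.T @ X)

# Pointwise variance of f_hat(x0) = ell0^T beta_hat is ell0^T C_hat ell0
x0 = 1.0
ell0 = np.array([1.0, x0, x0**2, x0**3, x0**4])
var_f_x0 = ell0 @ C_hat @ ell0
print("estimated SE of f_hat(x0):", np.sqrt(var_f_x0))
```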

  2. We exclude \(C_{0}(X)\) as a predictor in (7.5) because it is redundant with the intercept. This is similar to the fact that we need only two dummy variables to code a qualitative variable with three levels, provided that the model will contain an intercept. The decision to exclude \(C_{0}(X)\) instead of some other \(C_{k}(X)\) in (7.5) is arbitrary. Alternatively, we could include \(C_{0}(X), C_{1}(X), \ldots, C_{K}(X)\) and exclude the intercept.
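
     For concreteness, a minimal numpy sketch (mine, not the book's code) of this coding choice; the cutpoints are made up, and the first indicator \(C_{0}(X)\) is dropped so that the design matrix with an intercept keeps full rank.

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.uniform(0, 10, size=100)

# Hypothetical cutpoints c1 < c2 < c3 defining the step-function bins
cuts = np.array([2.5, 5.0, 7.5])
bins = np.digitize(x, cuts)               # 0, 1, 2, 3: which interval each x falls in

# Full set of indicators C_0(X), ..., C_3(X): exactly one equals 1 per observation
C = (bins[:, None] == np.arange(4)[None, :]).astype(float)

# With an intercept column, keeping all of C_0..C_3 would be rank deficient,
# so we drop C_0 and keep the intercept plus C_1, C_2, C_3.
intercept = np.ones((x.size, 1))
X_design = np.hstack([intercept, C[:, 1:]])
print(X_design.shape, np.linalg.matrix_rank(X_design))   # (100, 4), rank 4
```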

  3. Cubic splines are popular because most human eyes cannot detect the discontinuity at the knots.

  4. There are actually five knots, including the two boundary knots. A cubic spline with five knots would have nine degrees of freedom. But natural cubic splines have two additional natural constraints at each boundary to enforce linearity, resulting in \(9 - 4 = 5\) degrees of freedom. Since this includes a constant, which is absorbed in the intercept, we count it as four degrees of freedom.
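
     The count of nine can be seen from the truncated power basis for a cubic spline: \(1, x, x^{2}, x^{3}\) plus one truncated cubic \((x-\xi_{k})_{+}^{3}\) per knot, i.e. \(K+4\) functions for \(K\) knots. Below is a minimal numpy sketch (the knot locations are made up) that builds this basis and checks its dimension.

```python
import numpy as np

def cubic_spline_basis(x, knots):
    """Truncated power basis for a cubic spline: 1, x, x^2, x^3, (x - knot)^3_+."""
    cols = [np.ones_like(x), x, x**2, x**3]
    cols += [np.clip(x - k, 0, None) ** 3 for k in knots]
    return np.column_stack(cols)

x = np.linspace(0, 1, 200)
knots = np.array([0.1, 0.3, 0.5, 0.7, 0.9])   # five knots, incl. the two boundary knots

B = cubic_spline_basis(x, knots)
print(B.shape[1], np.linalg.matrix_rank(B))   # 9 = K + 4 basis functions, full rank
# A natural cubic spline adds 2 linearity constraints at each boundary: 9 - 4 = 5,
# and one of those 5 is the constant, which the intercept absorbs -> 4 fitted df.
```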

  5. The exact formulas for computing \(\hat{g}(x_{i})\) and \(\mathbf{S}_{\lambda}\) are very technical; however, efficient algorithms are available for computing these quantities.
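
     For reference, one standard way to write the fit (not stated in this chapter's notes, so take it as a sketch): if \(\mathbf{N}\) is a natural cubic spline basis with a knot at each \(x_{i}\), evaluated at the observed points, then
     \[
     \hat{\mathbf{g}} \;=\; \mathbf{N}\bigl(\mathbf{N}^{T}\mathbf{N} + \lambda\,\boldsymbol{\Omega}_{N}\bigr)^{-1}\mathbf{N}^{T}\mathbf{y} \;=\; \mathbf{S}_{\lambda}\,\mathbf{y},
     \qquad
     \{\boldsymbol{\Omega}_{N}\}_{jk} \;=\; \int N_{j}''(t)\,N_{k}''(t)\,dt,
     \]
     so \(\mathbf{S}_{\lambda}\) is a ridge-type hat matrix, and the effective degrees of freedom used in the chapter is \(\mathrm{df}_{\lambda} = \mathrm{tr}(\mathbf{S}_{\lambda})\), the sum of its diagonal elements.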

  6. A partial residual for \(X_{3}\), for example, has the form \(r_{i} = y_{i} - f_{1}(x_{i1}) - f_{2}(x_{i2})\). If we know \(f_{1}\) and \(f_{2}\), then we can fit \(f_{3}\) by treating this residual as a response in a non-linear regression on \(X_{3}\).
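
     This is the idea behind backfitting for an additive model. Below is a minimal sketch (my illustration, not the book's algorithm as printed) that cycles over the predictors, each time smoothing the partial residual against one coordinate; the simulated data and the stand-in cubic-polynomial smoother are assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)
n, p = 300, 3
X = rng.uniform(-2, 2, size=(n, p))
y = np.sin(X[:, 0]) + X[:, 1]**2 + 0.5 * X[:, 2] + rng.normal(scale=0.2, size=n)

def smooth(x, r):
    """Stand-in smoother: fit a cubic polynomial of x to the partial residual r."""
    coef = np.polyfit(x, r, deg=3)
    return np.polyval(coef, x)

# Backfitting: cycle over predictors, re-fitting f_j to its partial residual
alpha = y.mean()
f = np.zeros((n, p))                              # current estimates f_j(x_ij)
for _ in range(20):                               # a few sweeps usually suffice
    for j in range(p):
        r = y - alpha - f.sum(axis=1) + f[:, j]   # partial residual, leaving out f_j
        f[:, j] = smooth(X[:, j], r)
        f[:, j] -= f[:, j].mean()                 # center f_j to keep the intercept identifiable

fitted = alpha + f.sum(axis=1)
print("training RSS:", np.sum((y - fitted) ** 2))
```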

Copyright information

© 2013 Springer Science+Business Media New York

About this chapter

Cite this chapter

James, G., Witten, D., Hastie, T., Tibshirani, R. (2013). Moving Beyond Linearity. In: An Introduction to Statistical Learning. Springer Texts in Statistics, vol 103. Springer, New York, NY. https://doi.org/10.1007/978-1-4614-7138-7_7
