On linear regression models in infinite dimensional spaces with scalar response

Ghiglietti, Andrea; Ieva, Francesca; Paganoni, Anna Maria; Aletti, Giacomo

doi:10.1007/s00362-015-0710-2

Regular Article
Published: 28 August 2015

Volume 58, pages 527–548, (2017)
Statistical Papers

Andrea Ghiglietti¹,
Francesca Ieva¹,
Anna Maria Paganoni² &
…
Giacomo Aletti¹

Abstract

In functional linear regression, parameter estimation involves solving a problem that is not necessarily well-posed and that has points of contact with a range of methodologies, including statistical smoothing, deconvolution and projection on finite-dimensional subspaces. We discuss the standard approach, which is based explicitly on functional principal components analysis; nevertheless, the choice of the number of basis components remains somewhat subjective and is not always properly discussed and justified. In this work we discuss inferential properties of least squares estimation in this context under different choices of projection subspaces, and we study the asymptotic behaviour as the dimension of the subspaces increases.

References

Bache K, Lichman M (2013) UCI machine learning repository. https://archive.ics.uci.edu/ml/index.html. Accessed 27 Aug 2015
Cardot H, Ferraty F, Sarda P (2003) Spline estimators for the functional linear model. Stat Sin 13:571–591
Chiou JM, Müller HG, Wang JL, Carey JR (2003) A functional multiplicative effects model for longitudinal data, with application to reproductive histories of female medflies. Stat Sin 13:1119–1133
Cuevas A, Febrero M, Fraiman R (2002) Linear functional regression: the case of fixed design and functional response. Can J Stat 30(2):285–300
Frank IE, Friedman JH (1993) A statistical view of some chemometrics regression tools. Technometrics 35:109–148
Hastie T, Mallows C (1993) A discussion of A statistical view of some chemometrics regression tools by I. E. Frank and J. H. Friedman. Technometrics 35:140–143
Hawkins T (1977) Weierstrass and the theory of matrices. Arch Hist Exact Sci 17(2):119–163
Horváth L, Kokoszka P (2012) Inference for functional data with applications. Springer, New York
Jacques J, Preda C (2014) Model-based clustering for multivariate functional data. Comput Stat Data Anal 71:92–106
Koch I, Hoffman P, Marron JS (2013) Proteomics profiles from mass spectrometry. Electron J Stat 8(2):1703–1713
Larsen F, van den Berg F, Engelsen S (2006) An exploratory chemometric study of NMR spectra of table wines. J Chemom 20(5):198–208
Marx BD, Eilers PH (1996) Generalized linear regression on sampled signals with penalized likelihood. In: Forcina A, Marchetti GM, Hatzinger R, Galmacci G (eds) Statistical modelling. Proceedings of the 11th international workshop on statistical modelling, Orvieto
Melas V, Pepelyshev A, Shpilev P, Salmaso L, Corain L, Arboretti R (2014) On the optimal choice of the number of empirical Fourier coefficients for comparison of regression curves. Stat Pap. doi:10.1007/s00362-014-0619-1
Osborne BG, Fearn T, Miller AR, Douglas S (1984) Application of near infrared reflectance spectroscopy to the compositional analysis of biscuits and biscuit dough. J Sci Food Agric 35:99–105
R Development Core Team (2009) R: a language and environment for statistical computing. R Foundation for Statistical Computing, Vienna. http://www.R-project.org. Accessed 27 Aug 2015
Ramsay JO, Silverman BW (2005) Functional data analysis, 2nd edn. Springer, New York
Wang G, Zhou J, Wu W, Chen M (2015) Robust functional sliced inverse regression. Stat Pap. doi:10.1007/s00362-015-0695-x

Acknowledgments

The authors wish to thank Piercesare Secchi for stimulating and essential discussions about topics covered by this paper.

Author information

Authors and Affiliations

ADAMSS Center & Department of Mathematics “F. Enriques”, Università degli Studi di Milano, Via Saldini 50, 20133, Milan, Italy
Andrea Ghiglietti, Francesca Ieva & Giacomo Aletti
MOX - Department of Mathematics, Politecnico di Milano, Via Bonardi 9, 20133, Milan, Italy
Anna Maria Paganoni

Corresponding author

Correspondence to Anna Maria Paganoni.

Appendices

Appendix 1: Formal characterization of the sub-space E

This section focuses on explicitly computing the following quantities introduced in Sect. 3:

(1) the orthonormal basis of $E{\text {:}}\,\{\varphi _{k}^{E};\,k=1,\ldots ,d\};$
(2) the multivariate projection matrix $P{\text {:}}\,\mathbb {R}^{d}\rightarrow \mathbb {R}^{d}$ that transforms the basis coefficients of elements in D into the basis coefficients of elements in E;
(3) the functional projection operator $\pi {\text {:}}\,D\rightarrow E\subseteq S$ of D onto S.

Let us consider point (1). First, project the basis of $D\,(\{\varphi _{k}^{D};\,k=1,\ldots ,d\})$ on S, thus obtaining a $\dim (S)\times d$ matrix A, where $[A]_{ij}=\langle \varphi _{i}^{S},\,\varphi _{j}^{D}\rangle .$ Note that A may have infinitely many rows if $\dim (S)=\infty .$ Then, the basis of D projected on S generates d linearly independent functions given by $A^{T}\varvec{\varphi ^{S}(t)},$ which form a basis for E. It is easy to show that the functions $A^{T}\varvec{\varphi ^{S}(t)}$ are linearly independent, since $\varphi _{1}^{D},\ldots ,\varphi _{d}^{D}$ are and $D\cap S^{\perp }=\{0\}.$ To turn $A^{T}\varvec{\varphi ^{S}(t)}$ into an orthonormal basis for E, we orthonormalize it through the eigen-decomposition of $A^{T}A,$ obtaining:

$$\begin{aligned} \varvec{\varphi ^{E}(t)}=V_{S}D_{D}^{-1/2}V_{D}^{T}A^{T}\varvec{\varphi ^{S}(t)}, \end{aligned}$$

(26)

where $D_{D}$ and $V_{D}$ represent the eigen-structure of $A^{T}A$ (that is, $A^{T}AV_{D}=V_{D}D_{D}$) and $V_{S}$ is an arbitrary $d\times d$ orthonormal matrix that allows the basis of E to be changed; without loss of generality, we can take $V_{S}=I_{d}.$ Note that, except for $V_{S},$ the basis $\varvec{\varphi ^{E}(t)}$ is independent of the choice of the bases $\varvec{\varphi ^{D}(t)}$ and $\varvec{\varphi ^{S}(t)}.$ The eigenvalues in $D_{D}$ are all strictly positive because $A^{T}A$ has full rank, which follows from the linear independence of the d functions $A^{T}\varvec{\varphi ^{S}(t)}.$ Moreover, the eigenvalues in $D_{D}$ are all less than or equal to one, since A represents a projection.

Now, consider point (2). From (26) the projection matrix P from D to E can be defined as

$$\begin{aligned} P := \left\langle \varvec{\varphi ^{E}(t)},\,\left( \varvec{\varphi ^{D}(t)}\right) ^{T}\right\rangle =V_{S}D_{D}^{-1/2}V_{D}^{T}A^{T}\left\langle \varvec{\varphi ^{S}(t)},\,\left( \varvec{\varphi ^{D}(t)}\right) ^{T}\right\rangle , \end{aligned}$$

since $\langle \varvec{\varphi ^{S}(t)},\,(\varvec{\varphi ^{D}(t)})^{T}\rangle =A$ and $V_{D}^{T}A^{T}A=D_{D}V_{D}^{T},$ we obtain

$$\begin{aligned} P=V_{S}D_{D}^{1/2}V_{D}^{T}. \end{aligned}$$

(27)

Note that, using (27) we can rewrite (26) as

$$\begin{aligned} \varvec{\varphi ^{E}(t)}=\left( P^{-1}\right) ^{T}A^{T}\varvec{\varphi ^{S}(t)}. \end{aligned}$$

Then, from the vectorial estimate in E given by (9), we can obtain the vectorial estimate in D with $\varvec{\widehat{\beta }^{D}_{n}}=P^{-1}(\varvec{\widehat{\beta }^{E}_{n}}),$ and finally compute the functional estimate $\widehat{\beta }^{D}_{n}(t)=(\varvec{\widehat{\beta }^{D}_{n}})^{T}\varvec{\varphi ^{D}(t)}.$ This coincides with the solution of (7).
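As an illustration of formulas (26) and (27), the following sketch works in a finite-dimensional surrogate. It assumes that functions are represented by vectors in $\mathbb {R}^{m}$ and that orthonormal bases of S and D are stored as the columns of the matrices `Q_S` and `Q_D`. These names, the dimensions and the randomly drawn subspaces are illustrative assumptions and do not come from the paper; $V_{S}=I_{d}$ is used throughout.

```python
import numpy as np

def basis_of_E(Q_S, Q_D):
    """Finite-dimensional sketch of (26)-(27) with V_S = I_d.

    Q_S, Q_D: orthonormal bases of S and D, stored as matrix columns
    (hypothetical representation chosen for this illustration).
    """
    A = Q_S.T @ Q_D                        # [A]_ij = <phi_i^S, phi_j^D>
    evals, V_D = np.linalg.eigh(A.T @ A)   # A^T A V_D = V_D D_D
    # Columns of Phi_E are the functions phi_k^E of formula (26).
    Phi_E = Q_S @ A @ V_D @ np.diag(evals ** -0.5)
    P = np.diag(evals ** 0.5) @ V_D.T      # formula (27)
    return Phi_E, P, evals

rng = np.random.default_rng(0)
m, s, d = 12, 6, 3                         # ambient, dim(S), dim(D): arbitrary choices
Q_S, _ = np.linalg.qr(rng.standard_normal((m, s)))
Q_D, _ = np.linalg.qr(rng.standard_normal((m, d)))
Phi_E, P, evals = basis_of_E(Q_S, Q_D)

print("basis of E orthonormal:", np.allclose(Phi_E.T @ Phi_E, np.eye(d)))
print("P equals <phi^E, phi^D>:", np.allclose(P, Phi_E.T @ Q_D))
print("eigenvalues of A^T A:", evals)
```

In this surrogate the eigenvalues of $A^{T}A$ are the squared cosines of the principal angles between D and S, which is one way to read the bounds stated above.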

Finally, consider point (3). Using the projection matrix P we can define the functional operator $\pi $ as follows

$$\begin{aligned} \pi (g)=\left( P\left\langle g,\,\varvec{\varphi ^{D}(t)}\right\rangle \right) ^{T}\varvec{\varphi ^{E}(t)}, \end{aligned}$$

for any $g\in D.$ Then, using (27) we can easily obtain

$$\begin{aligned} \pi (\cdot )=\left( \left\langle \cdot ,\,\varvec{\varphi ^{D}(t)}\right\rangle \right) ^{T}A^{T}\varvec{\varphi ^{S}(t)}. \end{aligned}$$

(28)

Note that $\pi $ is independent of any choice of basis of $S,\,D$ and E. Using (28), once we get the vectorial estimate in E from (9), we can immediately compute the functional estimate $\widehat{\beta }^{E}_{n}(t)=(\varvec{\widehat{\beta }^{E}_{n}})^{T}\varvec{\varphi ^{E}(t)},$ and then obtain the functional estimate in D, i.e., $\widehat{\beta }^{D}_{n}=(\pi )^{-1}(\widehat{\beta }^{E}_{n}).$
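The operator in (28) and the inversion step can be checked numerically in the same spirit. The sketch below assumes a finite-dimensional setting in which S and D are random subspaces of $\mathbb {R}^{m}$ with orthonormal bases in the columns of `Q_S` and `Q_D`; all names and sizes are hypothetical and serve only as an illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
m, s, d = 10, 5, 2                                    # arbitrary illustrative sizes
Q_S, _ = np.linalg.qr(rng.standard_normal((m, s)))    # orthonormal basis of S
Q_D, _ = np.linalg.qr(rng.standard_normal((m, d)))    # orthonormal basis of D
A = Q_S.T @ Q_D                                       # [A]_ij = <phi_i^S, phi_j^D>

def pi(g):
    """Formula (28): pi(g) = <g, phi^D>^T A^T phi^S, for g in D."""
    return Q_S @ (A @ (Q_D.T @ g))

# Orthonormal basis of E and projection matrix P, with V_S = I_d.
evals, V_D = np.linalg.eigh(A.T @ A)
Phi_E = Q_S @ A @ V_D @ np.diag(1.0 / np.sqrt(evals))
P = np.diag(np.sqrt(evals)) @ V_D.T

b_D = np.array([1.5, -0.7])            # coefficients of some g in D
g = Q_D @ b_D
b_E = Phi_E.T @ pi(g)                  # coefficients of pi(g) in the basis of E
b_D_back = np.linalg.solve(P, b_E)     # inversion step: b_D = P^{-1} b_E

print("pi(g) is the orthogonal projection of g on S:",
      np.allclose(pi(g), Q_S @ (Q_S.T @ g)))
print("coefficients in E equal P b_D:", np.allclose(b_E, P @ b_D))
print("b_D recovered from b_E:", np.allclose(b_D_back, b_D))
```

Solving the linear system with `np.linalg.solve` instead of forming $P^{-1}$ explicitly is a numerical convenience and not something prescribed by the text.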

Appendix 2: Increasing information property

In this section, we discuss an interesting property concerning the behavior of the eigenvalues of the covariance matrix when its dimension increases.

Let $\{M^{(n)}=[m^{(n)}_{ij}],\,n\ge 1\}$ be a sequence of symmetric matrices such that, for each $n\ge 1,\,M^{(n)}$ is an $n\times n$ matrix with $m^{(n)}_{ij}=m^{(n-1)}_{ij}$ for any $i,\,j\le n-1.$ In other words, $M^{(n-1)}$ is obtained from $M^{(n)}$ by deleting the last row and column. The eigenvalues are real, and they are ordered according to the following general result proved by Cauchy.

Theorem 1

(See Hawkins 1977, p. 125) For the nested sequence $(M^{(n)})_{n}$ of matrices given above, denote by $\{\lambda ^{n}_{k};\,k=1,\ldots ,n\}$ the eigenvalues of $M^{(n)}$ in decreasing order. Then, for any $n\ge 1,$

$$\begin{aligned} \lambda ^{n+1}_{1} \ge \lambda ^{n}_{1} \ge \lambda ^{n+1}_{2} \ge \lambda ^{n}_{2} \ge \lambda ^{n+1}_{3} \ge \cdots \ge \lambda ^{n}_{n} \ge \lambda ^{n+1}_{n+1}. \end{aligned}$$

A direct consequence of the previous theorem is

$$\begin{aligned} \lambda _{i}^{k}\le \lambda _{i}^{d}, \quad \lambda _{k}^{k} \le \lambda _{i}^{i}, \quad \forall i\le k\le d. \end{aligned}$$

(29)

This result is applied in Sect. 4.2, where $M^{(n)}$ represents the covariance matrix of the random vector $(\langle X,\,\varphi _{1}\rangle ,\ldots ,\langle X,\,\varphi _{n}\rangle ).$ In this context, a direct interpretation of (29) is that the variance of X projected onto a subspace does not decrease when further components are added.
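The interlacing chain of Theorem 1 can be checked numerically on the leading principal blocks of a sample covariance matrix. The sketch below is only an illustration: the matrix is built from synthetic Gaussian scores, and the names `scores`, `ordered_eigs` and the tolerance `tol` are assumptions introduced here.

```python
import numpy as np

rng = np.random.default_rng(2)
d = 6
# Synthetic correlated scores standing in for (<X, phi_1>, ..., <X, phi_d>).
scores = rng.standard_normal((200, d)) @ rng.standard_normal((d, d))
M = np.cov(scores, rowvar=False)           # d x d symmetric matrix M^(d)

def ordered_eigs(n):
    """Eigenvalues of the leading n x n block M^(n), in decreasing order."""
    return np.linalg.eigvalsh(M[:n, :n])[::-1]

tol = 1e-10                                # slack for floating-point error
interlaced = all(
    np.all(ordered_eigs(n + 1)[:-1] >= ordered_eigs(n) - tol)
    and np.all(ordered_eigs(n) >= ordered_eigs(n + 1)[1:] - tol)
    for n in range(1, d)
)
largest = [ordered_eigs(n)[0] for n in range(1, d + 1)]
smallest = [ordered_eigs(n)[-1] for n in range(1, d + 1)]
total_var = [np.trace(M[:n, :n]) for n in range(1, d + 1)]

print("interlacing holds:", interlaced)
print("largest eigenvalue by dimension:", np.round(largest, 3))
print("smallest eigenvalue by dimension:", np.round(smallest, 3))
print("total variance by dimension:", np.round(total_var, 3))
```

Read along the chain, the largest eigenvalue cannot decrease and the smallest eigenvalue cannot increase as the dimension grows, while the trace (total projected variance) accumulates.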

Appendix 3: Simulation settings

The settings of the simulation study presented in Sect. 3 are the following.

The data $x_{i}(t)$ and the regression coefficient $\beta (t)$ belong to the Hilbert space $L^{2}(T),$ where $T= [-1,\,1]$ is a closed interval.

For each $i = 1,\ldots ,n$ where n is the sample size (in our examples $n=500$),

$$\begin{aligned} X_{i}(t) = \sum \limits _{j \in J_{i}} \alpha _{j} \eta _{j} \theta ^{X}_{j}(t), \end{aligned}$$

where $\{\theta _{k}^{X}(t)\} \equiv \{1/\sqrt{2}\} \bigcup \{\cos {(\pi k t)},\,k = 1,\ldots \},$ the $\alpha _{j}$ are randomly sampled from a uniform distribution $\text {Unif}_{[-10,\,10]},$ $\eta _{1} = 0.01,\,\eta _{j} = 1/j$ for $j > 1,$ and $J_{i}$ is a subset of size Z of the integers from 1 to $2Z,$ with Z a Poisson random variable, $Z \sim {\mathcal {P}}(\lambda ).$ We set $\lambda = 10.$

Given a function $\beta (t) \in L^{2}(T),$ the scalar responses $y_{1},\ldots ,y_{n}$ are generated as $y_{i} = \int \nolimits _{T} \beta (t)X_{i}(t) dt + \epsilon _{i},$ where $\epsilon _{i} \sim {\mathcal {N}}(0,\,1).$ We repeat the estimation procedure $M = 100$ times.
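A minimal sketch of this data-generating mechanism is given below. Several details are assumptions made for illustration only: the index convention ($j=1$ is mapped to the constant $1/\sqrt{2}$ and $j>1$ to $\cos (\pi (j-1)t)$), the independent redrawing of Z and of the $\alpha _{j}$ for every curve, the evaluation grid, the trapezoidal rule for the integral, and the function name `simulate`.

```python
import numpy as np

def simulate(n=500, lam=10.0, beta=lambda t: t ** 4, n_grid=401, seed=0):
    """Generate curves X_i on a grid of T = [-1, 1] and scalar responses y_i."""
    rng = np.random.default_rng(seed)
    t = np.linspace(-1.0, 1.0, n_grid)
    X = np.zeros((n, n_grid))
    for i in range(n):
        Z = rng.poisson(lam)                       # size of the index set J_i
        if Z == 0:
            continue                               # empty J_i: curve is identically zero
        J = rng.choice(np.arange(1, 2 * Z + 1), size=Z, replace=False)
        alpha = rng.uniform(-10.0, 10.0, size=Z)   # alpha_j ~ Unif[-10, 10]
        for j, a in zip(J, alpha):
            eta = 0.01 if j == 1 else 1.0 / j
            # Assumed index convention: j = 1 is the constant, j > 1 a cosine.
            theta = (np.full_like(t, 1.0 / np.sqrt(2.0)) if j == 1
                     else np.cos(np.pi * (j - 1) * t))
            X[i] += a * eta * theta
    f = X * beta(t)                                # integrand beta(t) X_i(t)
    dt = t[1] - t[0]
    integral = np.sum(0.5 * (f[:, 1:] + f[:, :-1]), axis=1) * dt  # trapezoidal rule
    y = integral + rng.standard_normal(n)          # eps_i ~ N(0, 1)
    return t, X, y

t, X, y = simulate(n=50)
print(X.shape, y.shape)
print("curves are even functions:", np.allclose(X, X[:, ::-1]))
```

Because every basis function is even, each simulated curve is symmetric about zero on the grid, whatever index convention is adopted.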

In this setting the space S where the data are generated is the space of the even functions of $L^{2}(T),$ and its orthogonal complement $S^{\perp }$ consists of the odd functions of $L^{2}(T).$ By definition, E is the projection of a sub-space D on S, and hence E consists of even functions. In particular, we define E as the space of the even polynomials of degree at most 4, i.e., $E=\mathrm{{Span}}\{1,\,t^{2},\,t^{4}\}.$ For computational convenience, we adopt an equivalent orthonormal basis given by the Legendre polynomials, i.e., $E=\text {Span}\{\phi _{0},\,\phi _{2},\,\phi _{4}\},$ where

$$\begin{aligned} \phi _{0}=1/\sqrt{2},\quad \phi _{2}=\sqrt{5/8}\left( 3t^{2}-1\right) ,\quad \phi _{4}=\sqrt{9/128} \left( 35 t^{4} - 30 t^{2} +3\right) . \end{aligned}$$

Figure 1 shows the behavior of the estimator $\widehat{\beta }^{D}_{n}$ for different choices of D that maintain the same projected space E. Thus, we introduce a parameter $\theta \in [0,\,2\pi )$ and we define

$$\begin{aligned} D_{\theta } :=\text {Span} \left\{ \cos (\theta )\phi _{0}+\sin (\theta )\phi _{1},\phi _{2}, \phi _{4}\right\} , \end{aligned}$$

where $\phi _{1}=\sqrt{3/2}\,t$ is the normalized Legendre polynomial of degree 1. Note that E is the projection of $D_{\theta }$ on S for any $\theta \in [0,\,2\pi )$ with $\cos (\theta )\ne 0.$ In Fig. 1, we set $\beta (t)=t^{2}+2t+1/3,$ so that $\beta (t)\in D_{\pi /3}.$
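The orthonormality of $\phi _{0},\,\phi _{1},\,\phi _{2},\,\phi _{4}$ and the membership $\beta (t)\in D_{\pi /3}$ can be verified with a few lines of code. The sketch below uses Gauss-Legendre quadrature, which is exact for the polynomial degrees involved; the helper `inner` and the variable names are introduced here for illustration.

```python
import numpy as np

# 10-point Gauss-Legendre rule on [-1, 1]: exact for polynomials of degree <= 19.
x, w = np.polynomial.legendre.leggauss(10)

phi = [
    lambda t: np.full_like(t, 1.0 / np.sqrt(2.0)),                  # phi_0
    lambda t: np.sqrt(3.0 / 2.0) * t,                               # phi_1
    lambda t: np.sqrt(5.0 / 8.0) * (3 * t**2 - 1),                  # phi_2
    lambda t: np.sqrt(9.0 / 128.0) * (35 * t**4 - 30 * t**2 + 3),   # phi_4
]

def inner(f, g):
    """L2 inner product on T = [-1, 1], computed by quadrature."""
    return np.sum(w * f(x) * g(x))

gram = np.array([[inner(f, g) for g in phi] for f in phi])

beta = lambda t: t**2 + 2 * t + 1.0 / 3.0
c0, c1, c2, c4 = (inner(beta, f) for f in phi)
theta = np.arctan2(c1, c0)   # angle of the component of beta in Span{phi_0, phi_1}

print("Gram matrix is the identity:", np.allclose(gram, np.eye(4)))
print("theta / pi:", theta / np.pi)
print("coefficient on phi_4:", c4)
```

Since $\beta $ has degree 2, its component along $\phi _{4}$ vanishes, and the ratio of the coefficients on $\phi _{1}$ and $\phi _{0}$ equals $\tan (\pi /3)=\sqrt{3}.$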

In Figs. 2 and 3 we are not interested in the bias on $S^{\perp },$ and hence we take $D\equiv E=\text {Span}\{1,\,t^{2},\,t^{4}\}.$ Figure 2 is dedicated to the study of the bias $\gamma (t),$ and hence we choose a $\beta (t)$ that does not lie in D; in particular, $\beta (t)=\mathbf {1}_{[-0.5,0.5]}(t).$ Figure 3 illustrates the bias–variance trade-off between D and the sub-space generated by the FPCs. Hence, we set a true $\beta (t)$ that lies in D; in particular, $\beta (t)=t^{4}.$

About this article

Cite this article

Ghiglietti, A., Ieva, F., Paganoni, A.M. et al. On linear regression models in infinite dimensional spaces with scalar response. Stat Papers 58, 527–548 (2017). https://doi.org/10.1007/s00362-015-0710-2

Received: 09 February 2015
Revised: 31 July 2015
Published: 28 August 2015
Issue Date: June 2017
DOI: https://doi.org/10.1007/s00362-015-0710-2
