Abstract
This paper studies local polynomial estimation of expectile regression. Expectiles and quantiles both provide a full characterization of a (conditional) distribution function, but each have their own merits and drawbacks. Local polynomial fitting as a smoothing technique has the major advantage of being simple, allowing for explicit expressions and hence advantages when developing inference theory. The aim of this paper is twofold: to study in detail the use of local polynomial fitting in the context of expectile regression and to contribute to the important issue of bandwidth selection, from theoretical and practical points of view. We discuss local polynomial expectile regression estimators and establish an asymptotic normality result for them. The finite-sample performance of the estimators, combined with various bandwidth selectors, is investigated in a simulation study. Some illustrations with real data examples are given.
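As a concrete illustration of the asymmetric squared-error loss underlying expectiles, the following minimal sketch (illustrative only; not the paper's implementation, function names are ours) computes a sample \(\omega\)-expectile by iteratively reweighted least squares, the standard computational device (cf. Wolke and Schwetlick 1988):

```python
import numpy as np

def expectile(y, omega, tol=1e-10, max_iter=100):
    """omega-expectile of a sample: the tau minimizing
    sum_i |omega - 1{y_i < tau}| * (y_i - tau)^2,
    computed by iteratively reweighted least squares."""
    tau = np.mean(y)  # the 0.5-expectile is the sample mean
    for _ in range(max_iter):
        w = np.where(y > tau, omega, 1.0 - omega)  # asymmetric weights
        tau_new = np.sum(w * y) / np.sum(w)        # weighted-mean update
        if abs(tau_new - tau) < tol:
            break
        tau = tau_new
    return tau

y = np.array([1.0, 2.0, 3.0, 10.0])
print(expectile(y, 0.5))  # 4.0, the sample mean
print(expectile(y, 0.9))  # 8.0, an upper-tail location measure
```

Like quantiles, expectiles are monotone in \(\omega\), so a grid of \(\omega\)-values traces out the whole distribution.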
References
Bellini, F., Di Bernardino, E. (2017). Risk management with expectiles. The European Journal of Finance, 23(6), 487–506.
Bellini, F., Klar, B., Müller, A., Rosazza Gianin, E. (2014). Generalized quantiles as risk measures. Insurance: Mathematics and Economics, 54, 41–48.
Breckling, J., Chambers, R. (1988). M-quantiles. Biometrika, 75(4), 761–771.
Chen, J., Shao, J. (1993). Iterative weighted least squares estimators. The Annals of Statistics, 21(2), 1071–1092.
De Rossi, G., Harvey, A. (2009). Quantiles, expectiles and splines. Journal of Econometrics, 152, 179–185.
Efron, B. (1991). Regression percentiles using asymmetric squared error loss. Statistica Sinica, 1, 93–125.
Fan, J., Gijbels, I. (1995). Adaptive order polynomial fitting: bandwidth robustification and bias reduction. Journal of Computational and Graphical Statistics, 4(3), 213–227.
Fan, J., Gijbels, I. (1996). Local polynomial modelling and its applications. Monographs on Statistics and Applied Probability 66. London: Chapman & Hall.
Fan, J., Hu, T., Truong, Y. (1994). Robust non-parametric function estimation. Scandinavian Journal of Statistics, 21(4), 433–446.
Fredriks, A., van Buuren, S., Burgmeijer, R., Meulmeester, J., Beuker, R., Brugman, E., et al. (2000). Continuing positive secular growth change in the Netherlands 1955–1997. Pediatric Research, 47(3), 316–323.
Gijbels, I., Karim, R., Verhasselt, A. (2019). On quantile-based asymmetric family of distributions: Properties and inference. International Statistical Review, 87(3), 471–504.
Härdle, W. (1990). Applied nonparametric regression. Cambridge: Cambridge University Press.
Huber, P., Ronchetti, E. (2009). Robust statistics (2nd ed.). New Jersey: Wiley.
Jones, M. (1994). Expectiles and M-quantiles are quantiles. Statistics and Probability Letters, 20(2), 149–153.
Koenker, R. (2005). Quantile regression (Vol. 38). Cambridge: Cambridge University Press.
Koenker, R., Bassett, G. (1978). Regression quantiles. Econometrica, 46(1), 33–50.
Krätschmer, V., Zähle, H. (2017). Statistical inference for expectile-based risk measures. Scandinavian Journal of Statistics, 44(2), 425–454.
Newey, W., Powell, J. (1987). Asymmetric least squares estimation and testing. Econometrica, 55(4), 819–847.
Pollard, D. (1991). Asymptotics for least absolute deviation regression estimators. Econometric Theory, 7(2), 186–199.
Remillard, B., Abdous, B. (1995). Relating quantiles and expectiles under weighted-symmetry. Annals of the Institute of Statistical Mathematics, 47, 371–384.
Schnabel, S., Eilers, P. (2009). Optimal expectile smoothing. Computational Statistics & Data Analysis, 53(12), 4168–4177.
Schulze Waltrup, L., Sobotka, F., Kneib, T., Kauermann, G. (2015). Expectile and quantile regression-David and Goliath? Statistical Modelling, 15(5), 433–456.
Taylor, J. (2008). Estimating value at risk and expected shortfall using expectiles. Journal of Financial Econometrics, 6(2), 231–252.
Wand, M., Jones, M. (1995). Kernel smoothing. London: Chapman and Hall.
Wolke, R., Schwetlick, H. (1988). Iteratively reweighted least squares: Algorithms, convergence analysis, and numerical comparisons. SIAM Journal on Scientific and Statistical Computing, 9(5), 907–921.
Yang, Y., Zou, H. (2015). Nonparametric multiple expectile regression via ER-Boost. Journal of Statistical Computation and Simulation, 85(7), 1442–1458.
Yao, Q., Tong, H. (1996). Asymmetric least squares regression estimation: A nonparametric approach. Journal of Nonparametric Statistics, 6(2), 273–292.
Yu, K., Jones, M. (1998). Local linear quantile regression. Journal of the American Statistical Association, 93(441), 228–237.
Zhang, L., Mei, C. (2008). Testing heteroscedasticity in nonparametric regression models based on residual analysis. Applied Mathematics, 23, 265–272.
Ziegel, J. (2016). Coherence and elicitability. Mathematical Finance, 26(4), 901–918.
Acknowledgements
The authors are grateful to an Associate Editor and two reviewers for the very valuable comments which led to an improvement of the paper. The authors gratefully acknowledge support of Research Grant FWO G0D6619N from the Flemish Science Foundation and of GOA/12/014 and C16/20/002 projects from the Research Fund KU Leuven.
Appendix
A.1 Proof of Theorem 1
The proof of this theorem is similar in setup to the one provided by Fan et al. (1994) to study nonparametric regression based on i.i.d. observations. The main idea of the proof is to approximate the quantity to be minimized in (11) by a quadratic function whose minimizer is asymptotically normal, and then to show that \((\widehat{\tau }_\omega (x),\widehat{\tau }_\omega ^{(1)} (x), \ldots , \widehat{\tau }_\omega ^{(p)} (x) )^{\text{ T }}\) lies close enough to that minimizer to share the latter's asymptotic behaviour. The convexity lemma (Pollard 1991) plays a key role in this approximation. We give the details of the proof below.
Recall that, for x a given point, \(\beta _0=\tau _\omega (x), \beta _1=\tau _\omega ^{(1)}(x),\cdots , \beta _p=\frac{\tau _\omega ^{(p)}(x)}{p!}\) and \(\widehat{\beta }_0=\widehat{\tau }_\omega (x),\widehat{\beta }_1=\widehat{\tau }_\omega ^{(1)}(x),\cdots ,\widehat{\beta }_p=\frac{\widehat{\tau }_\omega ^{(p)}(x)}{p!}\) with \((\widehat{\beta }_0,\cdots ,\widehat{\beta }_p)\) minimizing
Let
For \((\theta _0,\ldots ,\theta _p)^{\text{ T }} =\varvec{\theta }\in \mathbb {R}^{p+1}\), \(\widehat{\varvec{\theta }}\) minimizes the function
Note that the function \(G_n(\varvec{\theta })\) is convex in \(\varvec{\theta }\) (the second derivative is \(\ge 0\) for all \(\varvec{\theta }\)). It is sufficient to prove that this function converges pointwise to its conditional expectation, since it follows from the convexity lemma of Pollard (1991) that the convergence is also uniform on any compact set of \(\varvec{\theta }\).
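For concreteness, the minimization defining \((\widehat{\beta }_0,\ldots ,\widehat{\beta }_p)\) can be sketched numerically as kernel-weighted iteratively reweighted least squares (a minimal illustration under assumptions of ours: Epanechnikov kernel, local linear fit, enough design points in the smoothing window):

```python
import numpy as np

def local_poly_expectile(X, Y, x, omega, h, p=1, n_iter=50):
    """Sketch of the local polynomial expectile fit at a point x: minimize
    sum_i |omega - 1{res_i < 0}| * (Y_i - sum_j b_j (X_i - x)^j)^2 * K((X_i - x)/h)
    by iteratively reweighted, kernel-weighted least squares.
    Returns (b_0, ..., b_p); b_0 estimates tau_omega(x), j! * b_j its j-th derivative."""
    Z = np.vander(X - x, N=p + 1, increasing=True)       # design columns (X_i - x)^j
    u = (X - x) / h
    k = np.where(np.abs(u) <= 1, 0.75 * (1 - u**2), 0.0)  # Epanechnikov kernel weights
    beta = np.zeros(p + 1)
    for _ in range(n_iter):
        res = Y - Z @ beta
        w = k * np.where(res > 0, omega, 1.0 - omega)     # kernel * asymmetric weights
        WZ = Z * w[:, None]
        beta = np.linalg.solve(Z.T @ WZ, WZ.T @ Y)        # weighted least squares step
    return beta

# Exact linear data: the local linear fit recovers intercept and slope
X = np.linspace(0.0, 2.0, 201)
Y = 2.0 + 3.0 * (X - 1.0)
print(local_poly_expectile(X, Y, 1.0, 0.7, h=0.5, p=1))  # ~ [2. 3.]
```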
We next approximate \(G_n(\cdot )\) by a quadratic function whose minimizing value has an asymptotic normal distribution. Two terms contribute to the approximation. One is a quadratic function obtained via a Taylor expansion of the expected value, and the other term is random and linear in \(\varvec{\theta }\). Write
with
Let M be a real number such that the interval \([-M,M]\) contains the support of K. By Taylor expansion,
with \(\xi _{n,i}=o_P\left( |X_i-x|^{p+1}\right) =o_P(h^{p+1})\) uniformly as \(X_i\rightarrow x\), i.e. \( \max _{\{i:|X_i-x|\le Mh\}}||\xi _{n,i}||_{\infty }=o_P(h^{p+1})\), since \(\tau _{\omega }(\cdot )\) has a continuous \((p+2)\)th derivative.
We have
Moreover,
It follows that
Thus, we have
Denoting \( \widetilde{S}_{n,j}=\frac{1}{nh}\sum _{i=1}^n\gamma (\omega ,X_i)\left( \frac{X_i-x}{h}\right) ^jK\left( \frac{X_i-x}{h}\right) \), for \(j=0,1,\ldots ,2p\), it follows from the fact that K has bounded support (see e.g. Fan and Gijbels 1996) that
where the last equality follows from the dominated convergence theorem, using that \(h\rightarrow 0\) and that \(f_X(\cdot )\) is continuous in a neighbourhood of x.
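This limit can be checked in a small Monte Carlo experiment (a sketch under assumptions of ours: \(\gamma (\omega ,\cdot )\equiv 1\), Epanechnikov kernel, standard normal design, so the limit is \(f_X(x)\,\mu _j\) with \(\mu _j=\int u^jK(u)\,\mathrm {d}u\)):

```python
import numpy as np

rng = np.random.default_rng(0)

def S_tilde(X, x, h, j):
    """(1/(nh)) * sum_i ((X_i - x)/h)^j K((X_i - x)/h), with gamma(omega, .) set
    to 1 for simplicity and K the Epanechnikov kernel on [-1, 1]."""
    u = (X - x) / h
    k = np.where(np.abs(u) <= 1, 0.75 * (1 - u**2), 0.0)
    return np.mean(u**j * k) / h

# For X ~ N(0, 1) and x = 0: S_tilde -> f_X(0) * mu_j as n grows, h -> 0
n, h, x = 200_000, 0.05, 0.0
X = rng.standard_normal(n)
print(S_tilde(X, x, h, 0))  # close to f_X(0) * mu_0 = 0.3989...
print(S_tilde(X, x, h, 1))  # close to 0 (odd kernel moment)
print(S_tilde(X, x, h, 2))  # close to f_X(0) * mu_2 = 0.3989... / 5
```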
A similar argument leads to
With this result and the definition of the matrix \(\mathbf {S}\), we have
We then obtain that
Next we show that \(R_n(\varvec{\theta })=o_P(1)\) (for the definition of \(R_n(\varvec{\theta })\) see (A.2)). We start by rewriting and approximating this quantity as follows:
By using Assumption (A3), we obtain
with \(\mathbf {z}=\left( 1,\frac{v-x}{h},\left( \frac{v-x}{h}\right) ^2,\cdots ,\left( \frac{v-x}{h}\right) ^p\right) ^{\text{ T }} \) and \(y_v^*=y-\tau _\omega (x)-\tau _\omega ^{(1)}(x)(v-x)-\cdots -\frac{\tau _\omega ^{(p)}(x)}{p!}(v-x)^p\).
It follows from the definition of \(R_n(\varvec{\theta })\) (in (A.2)) that for any \(\varvec{\theta }\in \mathbb {R}^{p+1}\), \(\omega \in (0,1)\) and \(x \in \mathbb {R}\), \(\mathrm {E}_{X,Y}[R_n(\varvec{\theta })]= 0\). Therefore \(R_n(\varvec{\theta })=o_P(1)\). Indeed, for any constant \(\epsilon >0\), by Chebyshev's inequality,
For the quantity \(G_n(\varvec{\theta })\) in (A.1), we thus obtain
with \(r_n(\varvec{\theta })=o_P(1)\) for each fixed \(\varvec{\theta }\) and
It is easy to see that \(\mathbf {W}_n\) has a bounded second moment and hence is stochastically bounded. For \(c>0\) and by Assumption (A1) (\(\varphi (t|z)\) is bounded), we have, with \(\varvec{z}_u=\left( 1,u,u^2,\ldots ,u^p\right) ^{\text{ T }} \) and \(y_u^*=y-\tau _\omega (x)-\tau _\omega ^{(1)}(x)(hu)-\cdots -\frac{\tau _\omega ^{(p)}(x)}{p!}(hu)^p\), using Hölder's inequality and Assumption (A3),
which also implies that \(\mathrm {E}_{Y,X}[\varvec{W}_n]=O(1)\) as a result of Jensen’s inequality.
Note that
is a convex function of \(\varvec{\theta }\) which converges in probability to the convex function \(\frac{1}{2}\varvec{\theta }^{\text{ T }} \gamma (\omega ,x)f_X(x)\mathbf {S}\varvec{\theta }\).
By the convexity lemma of Pollard (1991), for any compact subset \(\varLambda \subset \mathbb {R}^{p+1}\),
So the quadratic approximation to the convex function \(G_n(\varvec{\theta })\) holds uniformly for \(\varvec{\theta }\) in any compact set. Then, using convexity again, the minimizer \(\widehat{\varvec{\theta }}\) of \(G_n(\varvec{\theta })\) converges in probability to the minimizer \(-(\gamma (\omega ,x)f_X(x)\mathbf {S})^{-1}\mathbf {W}_n\) of this quadratic function.
In matrix notation, we have
with
and hence
So
The \((j+1)\)th component (for \(j=0,1,\ldots ,p\)) of the above equality is
where \(V_{n,j}=\frac{U_{n,j}}{\gamma (\omega ,x)f_X(x)\text {det}(\mathbf {S})}\) and \(U_{n,j}=2(nh)^{-1}\sum _{i=1}^nL_\omega (Y_i^*)(\text {adj}(\mathbf {S})\mathbf {Z}_{i})_{j+1}K_i\), with \(\text {det}(\mathbf {S})\) the determinant of \(\mathbf {S}\) and \(\text {adj}(\mathbf {S})\) its adjugate matrix.
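To illustrate \(\text {det}(\mathbf {S})\) and \(\text {adj}(\mathbf {S})\) concretely: for a local linear fit (\(p=1\)), \(\mathbf {S}\) is the \(2\times 2\) kernel-moment matrix with entries \(\mu _{j+k}=\int u^{j+k}K(u)\,\mathrm {d}u\). A small sketch (Epanechnikov kernel assumed for this illustration):

```python
import numpy as np

# Kernel moments mu_j = int u^j K(u) du for the Epanechnikov kernel on [-1, 1],
# approximated by a Riemann sum; S has entries S[j, k] = mu_{j+k} (here p = 1).
u = np.linspace(-1.0, 1.0, 200_001)
du = u[1] - u[0]
K = 0.75 * (1.0 - u**2)
mu = [float(np.sum(u**j * K) * du) for j in range(3)]
S = np.array([[mu[0], mu[1]], [mu[1], mu[2]]])

# Adjugate of a nonsingular matrix: adj(S) = det(S) * S^{-1}
adjS = np.linalg.det(S) * np.linalg.inv(S)
print(np.round(S, 3))     # entries ~ [[1, 0], [0, 0.2]]
print(np.round(adjS, 3))  # entries ~ [[0.2, 0], [0, 1]]
```

With \(\mu _1=0\) (symmetric kernel), \(\mathbf {S}\) is diagonal, which is why the leading term of \(V_{n,0}\) depends on the kernel only through \(\mu _2\) and \(\int K^2\).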
Equivalently, for any \(\epsilon >0\), we have
This implies that
Hence, the conditional asymptotic normality follows from that of \(U_{n,j}\), which is established with the help of Lemmas 1 and 2 stated in Section A.2. The proofs of these lemmas are provided in Section S6 of the Supplementary Material. \(\square \)
A.2 Two lemmas
Lemma 1
Under the assumptions of Theorem 1, we have
where \( U_{n,j}=2(nh)^{-1}\sum _{i=1}^nL_\omega (Y_i^*)(\text {adj}(\mathbf {S})\mathbf {Z}_{i})_{j+1}K_i\),
with \(\mathbf {z}_v=\left( 1,v,v^2,\cdots ,v^p\right) ^{\text{ T }} \).
Lemma 2
Under Assumptions (A1)–(A5), we have
Adam, C., Gijbels, I. Local polynomial expectile regression. Ann Inst Stat Math 74, 341–378 (2022). https://doi.org/10.1007/s10463-021-00799-y