Abstract
It is well known that inference for the generalized Pareto distribution (GPD) is a difficult problem, since the GPD violates the classical regularity conditions of the maximum likelihood method. Most existing estimation methods perform satisfactorily only over a limited range of the parameters. Furthermore, interval estimation and hypothesis testing for the GPD have not been well studied in the literature. In this article, we develop a novel framework for inference for the GPD which works successfully for all values of the shape parameter k. Specifically, we propose a new method of parameter estimation and derive some of its asymptotic properties. Based on these asymptotic properties, we then develop new confidence intervals and hypothesis tests for the GPD. Numerical results are provided to show that the proposed inferential procedures perform well for all choices of k.
References
Andrews, D. F., Herzberg, A. M. (1985). Data: A Collection of Problems from Many Fields for the Student and Research Worker. New York: Springer.
Beirlant, J., Goegebeur, Y., Segers, J., Teugels, J. (2004). Statistics of Extremes: Theory and Applications. Chichester, England: Wiley.
Billingsley, P. (1994). Probability and Measure 3rd ed. New York: Wiley.
Castillo, E., Hadi, A. S. (1997). Fitting the generalized Pareto distribution to data. Journal of the American Statistical Association, 92, 1609–1620.
Castillo, E., Hadi, A. S., Balakrishnan, N., Sarabia, J. M. (2004). Extreme Value and Related Models with Applications in Engineering and Science. Hoboken, New Jersey: Wiley.
Chen, P., Ye, Z., Zhao, X. (2017). Minimum distance estimation for the generalized Pareto distribution. Technometrics, 59, 528–541.
Coles, S. (2001). An Introduction to Statistical Modeling of Extreme Values. London: Springer.
Davison, A. C., Smith, R. L. (1990). Models for exceedances over high thresholds. Journal of the Royal Statistical Society, Series B, 52, 393–442.
de Haan, L., Ferreira, A. (2006). Extreme Value Theory: An Introduction. New York: Springer.
de Zea Bermudez, P., Kotz, S. (2010a). Parameter estimation of the generalized Pareto distribution—Part I. Journal of Statistical Planning and Inference, 140, 1353–1373.
de Zea Bermudez, P., Kotz, S. (2010b). Parameter estimation of the generalized Pareto distribution—Part II. Journal of Statistical Planning and Inference, 140, 1374–1388.
del Castillo, J., Serra, I. (2015). Likelihood inference for generalized Pareto distribution. Computational Statistics & Data Analysis, 83, 116–128.
Giles, D. E., Feng, H., Godwin, R. T. (2016). Bias-corrected maximum likelihood estimation of the parameters of the generalized Pareto distribution. Communications in Statistics—Theory and Methods, 45, 2465–2483.
Grimshaw, S. D. (1993). Computing maximum likelihood estimates for the generalized Pareto distribution. Technometrics, 35, 185–191.
Hill, B. M. (1975). A simple general approach to inference about the tail of a distribution. The Annals of Statistics, 3, 1163–1174.
Hosking, J. R. M. (1990). L-moments: Analysis and estimation of distributions using linear combinations of order statistics. Journal of the Royal Statistical Society, Series B, 52, 105–124.
Hosking, J. R. M., Wallis, J. R. (1987). Parameter and quantile estimation for the generalized Pareto distribution. Technometrics, 29, 339–349.
Iliopoulos, G., Balakrishnan, N. (2009). Conditional independence of blocked ordered data. Statistics & Probability Letters, 79, 1008–1015.
Lehmann, E. L., Casella, G. (1998). Theory of Point Estimation 2nd ed. New York: Springer.
Pickands, J. (1975). Statistical inference using extreme order statistics. The Annals of Statistics, 3, 119–131.
Salvadori, G., De Michele, C., Kottegoda, N. T., Rosso, R. (2007). Extremes in Nature: An Approach Using Copulas. Dordrecht: Springer.
Smith, R. L. (1984). Threshold methods for sample extremes. In J. Tiago de Oliveira (Ed.), Statistical Extremes and Applications, pp. 621–638. Dordrecht: Springer.
Smith, R. L. (1985). Maximum likelihood estimation in a class of nonregular cases. Biometrika, 72, 67–90.
Song, J., Song, S. (2012). A quantile estimation for massive data with generalized Pareto distribution. Computational Statistics & Data Analysis, 56, 143–150.
Zhang, J. (2010). Improving on estimation for the generalized Pareto distribution. Technometrics, 52, 335–339.
Zhang, J., Stephens, M. A. (2009). A new and efficient estimation method for the generalized Pareto distribution. Technometrics, 51, 316–325.
Acknowledgements
The authors thank the Associate Editor and two referees for their incisive comments and suggestions which led to a great improvement in the paper. Hideki Nagatsuka was partially supported by the Grant-in-Aid for Scientific Research (C) 19K04890, Japan Society for the Promotion of Science, and Chuo University Grant for Special Research, while N. Balakrishnan was supported by an Individual Discovery Grant (RGPIN-2020-06733) from the Natural Sciences and Engineering Research Council of Canada.
Appendices
Propositions for derivatives of the likelihood function of k
Proposition 8
For \(k \in {\mathbb {R}}\) and any given \(\varvec{s}_n^{(j)}\), where j, \(1\le j\le n\), is fixed, the derivative \(l'(k ; \varvec{s}_n^{(j)})=(\partial /\partial k)l(k ; \varvec{s}_n^{(j)})\) is given by
Proposition 9
For \(k \in {\mathbb {R}}\) and any given \(\varvec{s}_n^{(j)}\), where j, \(1\le j\le n\), is fixed, the second derivative \(l''(k ; \varvec{s}_n^{(j)})=(\partial ^2/\partial k^2)l(k ; \varvec{s}_n^{(j)})\) is given by
Proposition 10
For \(k \in {\mathbb {R}}\) and any given \(\varvec{s}_n^{(j)}\), where j, \(1\le j\le n\), is fixed, the third derivative \(l'''(k ; \varvec{s}_n^{(j)})=(\partial ^3/\partial k^3)l(k ; \varvec{s}_n^{(j)})\) is given by
and
where \(s_1\le \cdots \le s_{j-1} \le 1 \le s_{j+1} \le \cdots \le s_n\), and \(s_j=1\).
Proofs
1.1 Proof of Proposition 1
Denote the cdf and pdf of the GPD with \(\sigma =1\), \(F(\cdot ;\, k, 1)\) and \(f(\cdot ;\, k, 1)\), by \(G(\cdot \,;k)\) and \(g(\cdot \,;k)\), respectively, for simplicity. Suppose \(Z_{i}\), \(i=1, \ldots ,n\), are n independent random variables from this standard GPD with shape parameter k. For \(i=1, \ldots ,n\), let \(Z_{i:n}\) denote the i-th order statistic among \(Z_{1}, \ldots ,Z_{n}\).
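As a quick numerical companion to this setup, standard GPD variates (and hence the order statistics \(Z_{i:n}\)) can be simulated by inverting the cdf. This sketch assumes the parameterization \(F(z;\,k,1)=1-(1-kz)^{1/k}\), which is consistent with the support sets used in these proofs:

```python
import numpy as np

def rgpd_std(n, k, seed=None):
    """Draw n i.i.d. variates from the standard (sigma = 1) GPD with
    cdf F(z; k) = 1 - (1 - k z)^(1/k), by inversion:
    z = (1 - (1 - u)^k) / k, with the exponential limit at k = 0.
    Support is (0, 1/k) for k > 0 and (0, inf) for k <= 0."""
    rng = np.random.default_rng(seed)
    u = rng.uniform(size=n)
    if k == 0.0:
        return -np.log1p(-u)          # standard exponential limit
    return (1.0 - np.power(1.0 - u, k)) / k

# the order statistics Z_{1:n} <= ... <= Z_{n:n} are just the sorted sample
z_sorted = np.sort(rgpd_std(10, -0.3, seed=1))
```

Sorting the simulated sample yields the order statistics used throughout the proof.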
First, we assume that \(k\ne 0\). Define \(\varvec{i_n^{(j)}}=\left\{ i|i=1, \ldots ,j-1, j+1, \ldots , n\right\} \). For a fixed positive integer value j, and for any \(n-1\) real values \(s_1\le \cdots \le s_{j-1} \le 1 \le s_{j+1} \le \cdots \le s_n\), we consider
where \(h_j\left( \cdot ;\, k\right) \) is the pdf of \(Z_{j:n}\).
We note that the integrand in Eq. (8) has its partial derivative with respect to \(s_i\), \(i \in \varvec{i_n^{(j)}}\), as \(n!g(u\,;k)\prod _{i \in \varvec{i_n^{(j)}}}u\,g(u s_i\,;k)\), and further that
is bounded above. From the boundedness of (9), we have
where \(C_0\) is a positive constant, and
Then, upon using Part (ii) of Theorem 16.8 of Billingsley (1994), we can interchange differentiation and integration in (8), so that the partial derivative of (8) with respect to \(s_i\), \(i \in \varvec{i_n^{(j)}}\), is obtained as
The result for \(k=0\) can be obtained by letting \(k \rightarrow 0\) in the result for \(k\ne 0\). After some simple algebra, the proof of Proposition 1 is complete. \(\square \)
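The limiting step \(k \rightarrow 0\) can be checked numerically: under the assumed parameterization, the standard GPD density \(g(x;\,k)=(1-kx)^{1/k-1}\) converges to the exponential density \(e^{-x}\) as \(k \rightarrow 0\). A small sketch (the grid and the values of k are arbitrary):

```python
import numpy as np

def g(x, k):
    """Standard GPD pdf under the assumed parameterization; the grid
    below keeps 1 - k*x > 0, so no support check is needed here."""
    if k == 0.0:
        return np.exp(-x)
    return (1.0 - k * x) ** (1.0 / k - 1.0)

x = np.linspace(0.0, 3.0, 7)
for k in (1e-2, 1e-4, 1e-6):
    # sup-distance to the exponential limit shrinks roughly linearly in k
    print(k, np.max(np.abs(g(x, k) - np.exp(-x))))
```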
1.2 Proof of Theorem 3
First, we shall show that the likelihood equation has at least one solution. Given \(\varvec{s}_n^{(j)}\), the derivative of the likelihood function in (5) for \(k \ne 0\) can be rewritten as
where \(\eta \left( k, u\right) =-\frac{n}{k}-\frac{\sum _{i=1}^n \log \left( 1-u s_i\right) }{k^2}\), \(\Lambda \left( k, u\right) =\frac{1}{\left| k \right| }\left( \frac{u}{k}\right) ^{n-1}\prod _{i=1}^n \left( 1- u s_i\right) ^{1/k-1}\) and \(s_j=1\). Since \(\eta \left( k, u\right) =-\frac{1}{k}\left( n+\frac{\sum _{i=1}^n \log \left( 1-u s_i\right) }{k}\right) >0\) for all sufficiently small \(k\in {\mathbb {R}}\), \(\eta \left( k, u\right) <0\) for all sufficiently large \(k\in {\mathbb {R}}\), and \(\Lambda \left( k, u\right) >0\) for every \(k\in {\mathbb {R}}\) and every \(u\in \chi _k\), there exist real values \(\delta _1\) and \(\delta _2\) such that \(l'(k ; \varvec{s}_n^{(j)})>0\) for every \(k<\delta _1\) and \(l'(k ; \varvec{s}_n^{(j)})<0\) for every \(k>\delta _2\). In addition, we see from Proposition 8 that \(l'(k ; \varvec{s}_n^{(j)})\) is continuous in \(k\in {\mathbb {R}}\). Thus, \(l'(k ; \varvec{s}_n^{(j)})=0\) has at least one solution.
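The sign behavior of \(\eta\) driving this argument is easy to verify numerically. In this sketch the values \(s_i\) (with \(s_j=1\)) and u are hypothetical; with \(u<0\), \(1-us_i>1\) holds for every \(s_i>0\), so \(\eta(k,u)\) is well defined for all \(k\ne 0\):

```python
import numpy as np

def eta(k, u, s):
    """eta(k, u) = -n/k - sum_i log(1 - u*s_i) / k^2, as in the proof."""
    s = np.asarray(s, dtype=float)
    return -s.size / k - np.sum(np.log(1.0 - u * s)) / k**2

# hypothetical s_n^(j): sorted values straddling s_j = 1
s = np.array([0.2, 0.5, 0.8, 1.0, 1.4, 2.0, 3.1])
u = -0.5

print(eta(-50.0, u, s) > 0)   # positive for k far below 0
print(eta(50.0, u, s) < 0)    # negative for k far above 0
```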
Next, we shall show that the number of solutions is exactly one. Let \(k^*\) be one of the solutions of \(l'(k ; \varvec{s}_n^{(j)})=0\). We see that \(\eta \left( k^*, u\right) \) is strictly increasing in u and takes on values over \((-\infty , -n/k^*)\) for \(k^*<0\). Thus, there exists a unique value of u such that \(\eta \left( k^*, u\right) =0\), which we denote by \(u_0\). We also see that \(\eta \left( k^*, u\right) <0\) for \(u<u_0\) and \(\eta \left( k^*, u\right) >0\) for \(u>u_0\).
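The existence and uniqueness of \(u_0\) for \(k^*<0\) can likewise be illustrated numerically: \(\eta(k^*,\cdot)\) is strictly increasing on \((-\infty, 0)\) and crosses zero exactly once. The values below are hypothetical:

```python
import numpy as np
from scipy.optimize import brentq

def eta(k, u, s):
    """eta(k, u) = -n/k - sum_i log(1 - u*s_i) / k^2, as in the proof."""
    s = np.asarray(s, dtype=float)
    return -s.size / k - np.sum(np.log(1.0 - u * s)) / k**2

k_star = -0.5
s = np.array([0.2, 0.5, 0.8, 1.0, 1.4, 2.0, 3.1])  # hypothetical s_n^(j), s_j = 1

# eta(k*, .) runs from -inf (as u -> -inf) up to -n/k* > 0 (as u -> 0-),
# so Brent's method brackets its unique zero u_0 on (-inf, 0)
u0 = brentq(lambda u: eta(k_star, u, s), -1e6, -1e-9)

# monotonicity in u: negative below u_0, positive above it
print(u0, eta(k_star, 2.0 * u0, s) < 0, eta(k_star, 0.5 * u0, s) > 0)
```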
We have, for \(k^*<0\) and sufficiently small \(\Delta k>0\) such that \(k^* + \Delta k<0\),
where \(u_0-=\lim _{u\uparrow u_0}u\), and the inequality follows from the facts that
is greater than \(\left( 1+\Delta k/k^*\right) ^{-2}\) for \(u\in (-\infty , u_0)\), is less than \(\left( 1+\Delta k/k^*\right) ^{-2}\) for \(u\in (u_0, 0)\), and \(\eta \left( k^*+\Delta k, u_0\right) <0 =\eta \left( k^*, u_0\right) \left( 1+\Delta k/k^*\right) ^{-2}\).
We further note that
is strictly increasing in u and takes on values over \(\left( 0, (1+\Delta k/k^*)^n\right) \). Then, it follows from (10) and the mean value theorem that
where \(M=\Lambda \left( k^*+\Delta k, u'\right) /\Lambda \left( k^*, u'\right) \in \left( 0, (1+\Delta k/k^*)^n\right) \), for \(u'\in (-\infty , 0)\).
We can also obtain the same results for \(k^*\ge 0\). The proofs are very similar to that for \(k^*< 0\) (for \(k^*= 0\), using the fact that \(l'(0 ; \varvec{s}_n^{(j)})=\lim _{k\rightarrow 0}l'(k ; \varvec{s}_n^{(j)})\) together with Lebesgue’s dominated convergence theorem) and are therefore omitted here. The fact that \(l'(k^*+\Delta k ; \varvec{s}_n^{(j)})<0\) for every \(k^* \ne 0\) clearly implies that \(l'(k ; \varvec{s}_n^{(j)})\) changes sign only once with respect to k.
From the above arguments, \(l'(k ; \varvec{s}_n^{(j)})=0\) always has a unique solution with respect to k, and the proof of Theorem 3 is thus complete.\(\square \)
1.3 Proof of Lemma 1
Let \(\varvec{S}_{n,1}^{(j)}=\left( S_{1:n}^{(j)}, \ldots ,S_{j-1:n}^{(j)}\right) \) and \(\varvec{S}_{n,2}^{(j)}=\left( S_{j+1:n}^{(j)}, \ldots ,S_{n:n}^{(j)}\right) \). Then, by Theorem 2 of Iliopoulos and Balakrishnan (2009), conditional on \(Z_{j:n}=u\in \lambda _{k}\), where \(Z_{j:n}=X_{j:n}/\sigma _0\) and \(\lambda _{k}=\{u:0< u< \infty , {\text{ if }} k < 0,\) or \(0< u< 1/k, {\text{ if }} k>0\}\) as defined in the proof of Proposition 1, we see that \(\varvec{S}_{n,1}^{(j)}\) are distributed exactly as order statistics from a sample of size \(j-1\) from the distribution with density \(\psi _1(s;\, k_0, u)=u\, g(u\,s;\, k_0)/G(u;\, k_0)\), \(0\le s\le 1\), and \(\varvec{S}_{n,2}^{(j)}\) are distributed exactly as order statistics from a sample of size \(n-j\) from the distribution with density \(\psi _2(s;\, k_0, u)=u\, g(u\,s;\, k_0)/(1-G(u;\, k_0))\), \(s\ge 1\), where \(g(\cdot ;\, k)=f(\cdot ;\, k, 1)\) and \(G(\cdot ;\, k)=F(\cdot ;\, k, 1)\). We also have \(\varvec{S}_{n,1}^{(j)}\) and \(\varvec{S}_{n,2}^{(j)}\) to be conditionally independent. Hence, under the condition that \(Z_{j:n}=u \in \lambda _{k}\), we have the joint density function of \(\varvec{S}_{n,1}^{(j)}\) and \(\varvec{S}_{n,2}^{(j)}\) to be
denoted by \(l_u\left( k_0;\,\varvec{s}_{n}^{(j)}\right) \), where \(\varvec{s}_n^{(j)}=(s_1, \ldots , s_{j-1}, s_{j+1}, \ldots , s_n)\), for \(0\le s_1\le \cdots \le s_{j-1} \le 1 \le s_{j+1}\le \cdots \le s_{n}\). Equation (11) implies that, given \(Z_{j:n}=u\), \(\varvec{S}_{1*}^{(j)}=(S_1^{(j)}, \ldots ,S_{j-1}^{(j)})\), the random variables corresponding to \(\varvec{S}_{n,1}^{(j)}=(S_{1:n}^{(j)}, \ldots ,S_{j-1:n}^{(j)})\), are i.i.d. with conditional density \(\psi _1\), and \(\varvec{S}_{2*}^{(j)}=(S_{j+1}^{(j)}, \ldots ,S_{n}^{(j)})\), the random variables corresponding to \(\varvec{S}_{n,2}^{(j)}=(S_{j+1:n}^{(j)}, \ldots ,S_{n:n}^{(j)})\), are i.i.d. with conditional density \(\psi _2\).
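As a sanity check on this conditional-density factorization, \(\psi_1\) and \(\psi_2\) should each integrate to one over their respective supports \([0,1]\) and \([1,\infty)\). A sketch with hypothetical values \(k_0=-0.3\) and \(u=1.5\), assuming the parameterization \(G(x;\,k)=1-(1-kx)^{1/k}\):

```python
import numpy as np
from scipy.integrate import quad

def g(x, k):   # standard GPD pdf (assumed parameterization)
    z = 1.0 - k * x
    return z ** (1.0 / k - 1.0) if z > 0 else 0.0

def G(x, k):   # standard GPD cdf
    z = max(1.0 - k * x, 0.0)
    return 1.0 - z ** (1.0 / k)

k0, u = -0.3, 1.5   # hypothetical shape and conditioning value u in lambda_k
psi1 = lambda s: u * g(u * s, k0) / G(u, k0)          # density on [0, 1]
psi2 = lambda s: u * g(u * s, k0) / (1.0 - G(u, k0))  # density on [1, inf)

print(quad(psi1, 0.0, 1.0)[0])     # ~1.0
print(quad(psi2, 1.0, np.inf)[0])  # ~1.0
```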
Let \(Z'_{j:n}=X'_{j:n}/\sigma \), where \(X'_{j:n}\) is the jth-order statistic from the GPD with parameters \(k\ne k_0\) and \(\sigma \ne \sigma _0\). Then, for any fixed \(u\in \lambda _{k}\) and \(u'\in \lambda _{k}\), and any \(k\ne k_0\), conditional on \(Z_{j:n}=u\) and \(Z'_{j:n}=u'\), it follows that
where \(\varvec{S}_{n*}^{(j)}=\,(\varvec{S}_{1*}^{(j)}, \varvec{S}_{2*}^{(j)})=\,(S_{1}^{(j)}, \ldots , S_{j-1}^{(j)}, S_{j+1}^{(j)},\ldots , S_{n}^{(j)})\). By the weak law of large numbers, (12) converges in probability to
where \(S_1\) and \(S_2\) are random variables which are distributed with the conditional density functions \(\psi _1(x;\, k_0, u)\) and \(\psi _2(x;\, k_0, u)\), given \(Z_{j:n}=u\), respectively, and \(p=\lim _{n\rightarrow \infty } j/n\) (\(0\le p\le 1\)). \(E_1\) and \(E_2\) denote the conditional expectations with respect to \(\psi _1\) and \(\psi _2\), given \(Z_{j:n}=u\), respectively. By Jensen’s inequality, we have
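The Jensen step here is the usual Kullback–Leibler argument: for a misspecified conditional density \(\tilde{\psi }_1\), \(E_1[\log (\tilde{\psi }_1/\psi _1)] \le \log E_1[\tilde{\psi }_1/\psi _1] \le 0\), with strict inequality when the densities differ. A Monte Carlo sketch, with hypothetical "true" configuration \((k_0, u)\) and a competing configuration \((k_1, u_1)\), again assuming the parameterization \(G(x;\,k)=1-(1-kx)^{1/k}\):

```python
import numpy as np

def g(x, k):    # standard GPD pdf (assumed parameterization), vectorized
    z = np.maximum(1.0 - k * x, 1e-300)
    return np.where(1.0 - k * x > 0, z ** (1.0 / k - 1.0), 0.0)

def G(x, k):    # standard GPD cdf
    return 1.0 - np.maximum(1.0 - k * x, 0.0) ** (1.0 / k)

def Ginv(p, k): # quantile function of the standard GPD
    return (1.0 - (1.0 - p) ** k) / k

rng = np.random.default_rng(1)
k0, u = -0.3, 1.5    # "true" configuration (hypothetical)
k1, u1 = 0.2, 1.2    # competing misspecified configuration (hypothetical)

# draw from psi_1(s; k0, u) = u g(u s; k0) / G(u; k0) on [0, 1]
# by inverting its cdf G(u s; k0) / G(u; k0)
v = rng.uniform(size=200_000)
s = Ginv(v * G(u, k0), k0) / u

# average log-ratio of the competing conditional density to the true one
r = np.log(u1 * g(u1 * s, k1) / G(u1, k1)) - np.log(u * g(u * s, k0) / G(u, k0))
print(r.mean() < 0.0)   # Jensen: strictly negative when the densities differ
```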
Hence, for any fixed u, \(u'\in \lambda _{k}\), we have
or
Now, the density of \(Z_{j:n}\), with \(k_0 \in {\mathbb {R}}\), is given by
and thus we see that
since \(P\left( l_{u'}\left( k;\,\varvec{S}_{n}^{(j)}\right) <l_u\left( k_0;\,\varvec{S}_{n}^{(j)}\right) \,|\, Z_{j:n}=u,\, Z'_{j:n}=u'\right) \) is bounded by 1. Then, by applying the dominated convergence theorem, we have
which completes the proof of Lemma 1.\(\square \)
1.4 Proof of Theorem 5
Here, we shall use the shorthand notation \(L\left( k;\, \varvec{S}_{n}^{(j)} \right) \) for the log-likelihood function based on \(\varvec{S}_{n}^{(j)}\), \(\log l\left( k;\, \varvec{S}_{n}^{(j)} \right) \), and \(L'\left( k;\, \varvec{S}_{n}^{(j)} \right) \) and \(L''\left( k;\, \varvec{S}_{n}^{(j)} \right) \) for its derivatives with respect to k.
First, we assume that \(k\ne 0\) and \(k^*\ne 0\). We then have
where \(S_{j:n}^{(j)}=S_{j}^{(j)}=1\), \(C_1\left( k, v\right) =-\frac{1}{k}-\frac{\log \left( 1-k\, v\right) }{k^2}\), \(\delta (\cdot )\) is the Dirac delta function, \(\psi _{n,j}(\varvec{S}_{n}^{(j)}, k, u)=(j-1)!\prod _{i=1}^{j-1}\psi _1(S_{i:n};\, k, u)\times (n-j)!\prod _{i=j+1}^{n}\psi _2(S_{i:n};\, k, u)\), and \(\psi _1\), \(\psi _2\), \(h_j\), \(\varvec{S}_{1*}^{(j)}=(S_1^{(j)}, \ldots ,S_{j-1}^{(j)})\) and \(\varvec{S}_{2*}^{(j)}=(S_{j+1}^{(j)}, \ldots ,S_{n}^{(j)})\) are all as defined in the proof of Lemma 1.
As with the likelihood function under regularity conditions (see Lemma 5.3 of Lehmann and Casella 1998), we obtain
Hence, it follows from (13)–(15), with the use of the central limit theorem, that
We can obtain the same results when \(k=0\) or \(k^*=0\), by noting that \(L'(0 ; \varvec{S}_n^{(j)})=\lim _{k\rightarrow 0}L'(k ; \varvec{S}_n^{(j)})\) and \(L''(0 ; \varvec{S}_n^{(j)})=\lim _{k\rightarrow 0}L''(k ; \varvec{S}_n^{(j)})\), and by Lebesgue’s dominated convergence theorem. These details are therefore omitted for the sake of brevity.\(\square \)
1.5 Proof of Theorem 6
Here, we shall use the shorthand notation \(L\left( k;\, \varvec{S}_{n}^{(j)} \right) \) for the log-likelihood function based on \(\varvec{S}_{n}^{(j)}\), \(\log l\left( k;\, \varvec{S}_{n}^{(j)} \right) \), and \(L'\left( k;\, \varvec{S}_{n}^{(j)} \right) \), \(L''\left( k;\, \varvec{S}_{n}^{(j)} \right) \) and \(L'''\left( k;\, \varvec{S}_{n}^{(j)} \right) \) for its derivatives with respect to k. By a Taylor expansion of \(L'\left( {\hat{k}};\, \varvec{S}_{n}^{(j)} \right) \) around k, we obtain
where \(k^*\) lies between k and \({\hat{k}}\).
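For orientation, a second-order expansion of this type usually takes the following standard form; this is a sketch consistent with the quantities \(L'\), \(L''\) and \(L'''\) defined above, not necessarily the paper's exact display:

```latex
0 = L'\left({\hat{k}};\, \varvec{S}_{n}^{(j)}\right)
  = L'\left(k;\, \varvec{S}_{n}^{(j)}\right)
    + \left({\hat{k}}-k\right) L''\left(k;\, \varvec{S}_{n}^{(j)}\right)
    + \frac{1}{2}\left({\hat{k}}-k\right)^{2} L'''\left(k^{*};\, \varvec{S}_{n}^{(j)}\right),
\qquad\text{so that}\qquad
\sqrt{n}\left({\hat{k}}-k\right)
  = \frac{n^{-1/2}\, L'\left(k;\, \varvec{S}_{n}^{(j)}\right)}
         {-\,n^{-1} L''\left(k;\, \varvec{S}_{n}^{(j)}\right)
          - \frac{1}{2}\, n^{-1} \left({\hat{k}}-k\right) L'''\left(k^{*};\, \varvec{S}_{n}^{(j)}\right)}.
```

The limits (17)–(19) below then control the numerator and the two denominator terms, respectively.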
By Theorem 1, we can take any \(j\in \left\{ 1,\ldots ,n\right\} \) to treat \({\hat{k}}\), the MLE based on \(\varvec{s}_n^{(j)}\), without loss of generality. Here, we take j such that \(Z_{j:n} {\mathop {\rightarrow }\limits ^{{\mathscr {P}}}} v\in \lambda _k\) as \(n\rightarrow \infty \), where \({\mathop {\rightarrow }\limits ^{{\mathscr {P}}}}\) denotes convergence in probability and \(\lambda _{k}=\{u:0< u< \infty , {\text{ if }} k < 0,\) or \(0< u< 1/k, {\text{ if }} k>0\}\) is as defined in the proof of Proposition 1, and let \(p=\lim _{n\rightarrow \infty } j/n\).
From here, we shall show the following facts:
as \(n \rightarrow \infty \).
We first note that (17) holds by Theorem 5. We shall now show (18), for which we assume that \(k\ne 0\) and \(k^*\ne 0\).
As in (13), we have
The last convergence follows by the weak law of large numbers.
Next, (19) holds since
where \(C_2\left( k^{*}, v\right) =-\frac{2}{{k^{*}}^3}-\frac{6\, \log \left( 1-k^*\, v\right) }{{k^{*}}^4}\), and \(E_1\) and \(E_2\) are conditional expectations with respect to \(\psi _1\) and \(\psi _2\), given \(Z_{j:n}=v\), respectively, as defined in the proof of Lemma 1.
The interchanges of integration, differentiation and limits in the proofs of (17), (18) and (19) can be justified by Lebesgue’s dominated convergence theorem; the arguments are quite similar to the proof of the interchange of differentiation and integration in Proposition 1 and are therefore omitted. We can further obtain the same results when \(k=0\) or \(k^*=0\), by noting that \(L''(0 ; \varvec{S}_n^{(j)})=\lim _{k\rightarrow 0}L''(k ; \varvec{S}_n^{(j)})\) and \(L'''(0 ; \varvec{S}_n^{(j)})=\lim _{k^*\rightarrow 0}L'''(k^* ; \varvec{S}_n^{(j)})\), together with Lebesgue’s dominated convergence theorem; these details are omitted for the sake of brevity. Thus, the proof of Theorem 6 is complete.\(\square \)
Nagatsuka, H., Balakrishnan, N. Efficient likelihood-based inference for the generalized Pareto distribution. Ann Inst Stat Math 73, 1153–1185 (2021). https://doi.org/10.1007/s10463-020-00782-z