Abstract
The one-parameter Bell family of distributions, introduced by Castellares et al. (Appl Math Model 56:172–185, 2018), is useful for modeling count data. This paper proposes and studies a goodness-of-fit test for this distribution, which is consistent against fixed alternatives. The finite sample performance of the proposed test is investigated by means of several Monte Carlo simulation experiments, and the test is also compared with related competitors. Real data applications are considered for illustrative purposes.
References
Baringhaus L, Henze N (1992) A goodness of fit test for the Poisson distribution based on the empirical generating function. Stat Probab Lett 13:269–274
Castellares F, Ferrari SLP, Lemonte AJ (2018) On the Bell distribution and its associated regression model for count data. Appl Math Model 56:172–185
Corless RM, Gonnet GH, Hare DEG, Jeffrey DJ, Knuth DE (1996) On the Lambert W function. Adv Comput Math 5:329–359
Epps TW (1995) A test of fit for lattice distributions. Commun Stat Theory Methods 24:1455–1479
Giacomini R, Politis DN, White H (2013) A warp-speed method for conducting Monte Carlo experiments involving bootstrap estimators. Econ Theory 29:567–589
Goerg GM (2016) LambertW: probabilistic models to analyze and Gaussianize heavy-tailed, skewed data. R package version 0.6.4
Gürtler N, Henze N (2000) Recent and classical goodness-of-fit tests for the Poisson distribution. J Stat Plan Inference 90:207–225
Janssen A (2000) Global power functions of goodness of fit tests. Ann Stat 28:239–253
Jiménez-Gamero MD, Alba-Fernández MV (2019) Testing for the Poisson–Tweedie distribution. Math Comput Simul 164:146–162
Jiménez-Gamero MD, Batsidis A (2017) Minimum distance estimators for count data based on the probability generating function with applications. Metrika 80:503–545
Kemp AW (1992) Heine–Euler extensions of the Poisson distribution. Commun Stat Theory Methods 21:571–588
Klugman S, Panjer H, Willmot G (1998) Loss Models: From Data to Decisions. Wiley, New York
Kocherlakota S, Kocherlakota K (1986) Goodness of fit test for discrete distributions. Commun Stat Theory Methods 15:815–829
Kundu S, Majumdar S, Mukherjee K (2000) Central limit theorems revisited. Stat Probab Lett 47:265–275
Meintanis S, Bassiakos Y (2005) Goodness-of-fit tests for additively closed count models with an application to the generalized Hermite distribution. Sankhya A 67:538–552
Meintanis S (2008) New inference procedures for generalized Poisson distributions. J Appl Stat 35:751–762
Nakamura M, Pérez-Abreu V (1993a) Empirical probability generating function: an overview. Insur Math Econ 12:287–295
Nakamura M, Pérez-Abreu V (1993b) Use of an empirical probability generating function for testing a Poisson model. Can J Stat 21:149–156
Novoa Muñoz F, Jiménez-Gamero MD (2014) Testing for the bivariate Poisson distribution. Metrika 77:771–793
Novoa Muñoz F, Jiménez-Gamero MD (2016) A goodness-of-fit test for the multivariate Poisson distribution. Sort 40:1–26
Olver FWJ, Lozier DW, Boisvert RF, Clark CW (2010) NIST Handbook of Mathematical Functions. Cambridge University Press, Cambridge
R Core Team (2018) R: a language and environment for statistical computing. R Foundation for Statistical Computing, Vienna
Rashid A, Ahmad Z, Jan TR (2016) A new count data model with application in genetics and ecology. Electron J Appl Stat Anal 9:213–226
Rueda R, O’Reilly F (1999) Tests of fit for discrete distributions based on the probability generating function. Commun Stat Simul Comput 28:259–274
Rueda R, Pérez Abreu V, O’Reilly F (1991) Goodness of fit for the Poisson distribution based on the probability generating function. Commun Stat Theory Methods 20:3093–3110
Sichel HS (1951) The estimation of the parameters of a negative binomial distribution with special reference to psychological data. Psychometrika 16:107–127
Acknowledgements
The authors thank the Editor, the Associate Editor and two anonymous referees for their constructive comments and suggestions which helped to improve the presentation. M.D. Jiménez-Gamero has been partially supported by Grant MTM2017-89422-P of the Spanish Ministry of Economy, Industry and Competitiveness, the State Agency of Investigation, the European Regional Development Fund. Artur J. Lemonte acknowledges the financial support of the Brazilian agency CNPq (Grant 301808/2016–3).
Ethics declarations
Conflict of interest
The authors declare that no conflict of interest (financial or otherwise) exists in the submission of this manuscript, and that the manuscript is approved by all authors for publication.
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Appendix
Proofs
Here we prove the results given in the previous sections.
Proof of Proposition 1
It can be checked that the PGF of \(X \sim \mathrm{Bell}(\theta )\) given in (1) satisfies the differential equation given in (2). Next, we prove that it is the only PGF in G satisfying such a differential equation. It is well known that the solution of a first-order linear differential equation of the form \(y^{\prime }+p(t) y=0\), where \(y=y(t)\), \(y'=\frac{\partial }{\partial t}y(t)\) and p(t) is a continuous function of t, is given by \(y=C \exp (-\int p(t) dt)\), where C is an arbitrary constant. Since the differential equation (2) is of this form, we have that
Taking into account that g is a PGF, it must satisfy \(g(1)=1\), implying that \(C=\exp \left( -e^{\theta }\right) \) and hence the desired result is obtained. \(\square \)
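As a quick numerical sanity check of this argument, one can verify that the Bell PGF \(g(t)=\exp (e^{\theta t}-e^{\theta })\) (Castellares et al. 2018) satisfies a first-order linear differential equation of the stated form, namely \(g'(t)=\theta e^{\theta t}g(t)\), together with the boundary condition \(g(1)=1\). The sketch below (in Python; the values of \(\theta \) and the evaluation point are arbitrary choices for illustration) compares a central finite difference of g with the right-hand side of the differential equation.

```python
import math

def bell_pgf(t, theta):
    # PGF of X ~ Bell(theta): g(t) = exp(e^{theta*t} - e^{theta})
    return math.exp(math.exp(theta * t) - math.exp(theta))

theta, t, h = 1.3, 0.7, 1e-6

# central finite difference approximating g'(t)
deriv = (bell_pgf(t + h, theta) - bell_pgf(t - h, theta)) / (2.0 * h)

# the differential equation reads g'(t) - theta * e^{theta*t} * g(t) = 0
rhs = theta * math.exp(theta * t) * bell_pgf(t, theta)

print(abs(deriv - rhs))      # tiny, up to finite-difference error
print(bell_pgf(1.0, theta))  # boundary condition: g(1) = exp(0) = 1
```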
Let \(\phi (x;\theta )=(\phi (x; 0,\theta ),\phi (x; 1,\theta ), \ldots )\) and
We have the following lemmas.
Lemma 1
Let \(X_1, \ldots , X_n\) be independent and identically distributed from X, a random variable taking values in \(\mathbb {N}_{0}\) with probability mass function \(p(k)=\Pr (X=k)\), \(k\in \mathbb {N}_{0}\), so that \(E(X^2)<\infty \), then
Proof
By definition,
and thus
Taking into account that
the result follows.\(\square \)
Lemma 2
Let \(X_1, \ldots , X_n\) be independent and identically distributed from X, a random variable taking values in \(\mathbb {N}_{0}\), then
Proof
We have that
By interchanging the order of the sums, one gets
Taking into account that
the result follows.\(\square \)
Lemma 3
Let \(X_1, \ldots , X_n\) be independent and identically distributed from X, a random variable taking values in \(\mathbb {N}_{0}\). Assume that \(\widehat{\theta } {\mathop {\longrightarrow }\limits ^{a.s.(P)}} \theta \), for some \(\theta >0\). For each \(k\in \mathbb {N}_{0}\), let \(\theta _k=\alpha _{k}\theta +(1-\alpha _{k})\widehat{\theta }\), for some \(\alpha _{k} \in [0,1]\). Then,
Proof
Let \(\widetilde{\theta }=\max \{ \widehat{\theta }, \theta \}\). Proceeding as in the proof of Lemma 2, we get that
Since \(f_1(\widetilde{\theta })^2\) is a continuous function of \(\widehat{\theta }\), we have that
and the result follows.\(\square \)
Lemma 4
Let \(X_1, \ldots , X_n\) be independent and identically distributed from X, a random variable taking values in \(\mathbb {N}_{0}\). Assume that \(\widehat{\theta } {\mathop {\longrightarrow }\limits ^{a.s.(P)}} \theta \), for some \(\theta >0\). Given the data, let \(X_1^*,\ldots , X_n^*\) be independent and identically distributed from \(X^* \sim \mathrm{Bell}(\widehat{\theta })\). Let \(\widehat{d}^*(k;\theta )\) be defined as \(\widehat{d}(k;\theta )\) with \(\widehat{p}(k)\) replaced by
Then,
- (I)
\(\displaystyle \sum _{k \ge 0} \left[ \frac{\partial }{\partial \theta }\widehat{d}^*(k;\widehat{\theta })-\mu (k; \widehat{\theta }) \right] ^2 {\mathop {\longrightarrow }\limits ^{P_*}} 0\), a.s.(P),
- (II)
\(\displaystyle \sum _{k \ge 0} \left[ \mu (k; {\theta })-\mu (k; \widehat{\theta }) \right] ^2 \rightarrow 0\), a.s.(P).
Proof
(I) We have that
Since \(f_1(\widehat{\theta })^2\) is a continuous function of \(\widehat{\theta }\), we have that
We also have that
Therefore,
and the result in part (I) follows.
(II) We have that
where
We first deal with \(\varDelta _1\). We have that
To deal with the second term on the right-hand side of (13), note that since
there exists \(k_1=k_1(\theta ) \ge 1\) such that \(\frac{\partial }{\partial \theta } p(k; \theta ) >0\), \(\forall \, k \ge k_1\). We are assuming that \(\widehat{\theta } {\mathop {\longrightarrow }\limits ^{a.s.(P)}} \theta \), for some \(\theta >0\), and therefore for any \(\varepsilon >0\) there exists \(n_0 \in \mathbb {N}\) (depending on \(\theta \), \(\varepsilon \) and the sequence \(X_1, X_2, \ldots \)) such that \(\widehat{\theta } \le \theta +\varepsilon \) a.s.(P) \(\forall \, n \ge n_0\). For any \(k_0 \ge k_1\), we have that
Since \(\delta _1(\widehat{\theta })\) is a continuous function of \(\widehat{\theta }\), it follows that
As for \(\delta _2(\widehat{\theta })\), because \(k_0 \ge k_1\), we have that
and the right-hand side of the above expression is as small as desired for large enough \(k_0\). Therefore, we have shown that
This fact together with (12) shows that \(\varDelta _1 {\mathop {\longrightarrow }\limits ^{a.s.(P)}} 0\).
We have that
with
Notice that \(0 \le |M_2(u,v)| \le 1\), \(\forall u,\, v \ge 0\). By applying the mean value theorem,
where \(\widetilde{\theta }_v=\alpha _v\widehat{\theta }+(1-\alpha _v)\theta \), for some \(\alpha _v \in (0,1)\). As in the proof of Lemma 3, let \(\widetilde{\theta }=\max \{\theta , \widehat{\theta }\}\). Note that \(\widetilde{\theta }_v \le \widetilde{\theta }\), \(\forall \, v \ge 1\). From the above considerations, we have that
Since the right-hand side of the above expression is a continuous function of \(\theta \), it follows that
and thus \(\varDelta _2 {\mathop {\longrightarrow }\limits ^{a.s.(P)}} 0\).
Finally,
with
By applying the mean value theorem (as done when studying \(\varDelta _2\)), we get
Since the right-hand side of the above expression is a continuous function of \(\theta \), it follows that
and thus \(\varDelta _3 {\mathop {\longrightarrow }\limits ^{a.s.(P)}} 0\).\(\square \)
Lemma 5
Let \(X_1, \ldots , X_n\) be independent and identically distributed from X, a random variable taking values in \(\mathbb {N}_{0}\). Assume that \(\widehat{\theta } {\mathop {\longrightarrow }\limits ^{a.s.(P)}} \theta \), for some \(\theta >0\). For each \(k\in \mathbb {N}_{0}\), let \(\theta _k= \alpha _{k}\theta +(1-\alpha _{k})\widehat{\theta }\), for some \(\alpha _{k} \in [0,1]\). Then,
Proof
The proof is parallel to that of \(\varDelta _3 {\mathop {\longrightarrow }\limits ^{a.s.(P)}} 0\) in the proof of Lemma 4.\(\square \)
Proof of Theorem 1
By applying the mean value theorem, we get, for each \(k \in \mathbb {N}_{0}\), that
with \(\theta _k=\alpha _{k}\theta +(1-\alpha _{k})\widehat{\theta }\), for some \(\alpha _{k} \in (0,1)\). From Lemma 1, \(E(\Vert \phi (X;\theta )\Vert _2^2)<\infty \) and thus by the SLLN in Hilbert spaces and the continuous mapping theorem, it follows that
Finally, the result follows from (14), (15) and Lemma 3. \(\square \)
Proof of Theorem 2
From expansion (14),
with \(\theta _k=\alpha _{k}\theta +(1-\alpha _{k})\widehat{\theta }\), for some \(\alpha _{k} \in (0,1)\). Assumption 1 and Lemmas 2 and 4 imply that
with \(\Vert r_1\Vert _2=o_P(1)\). Now, by applying the SLLN in Hilbert spaces and Assumption 1, we get
with \(\Vert r_2\Vert _2=o_P(1)\). By the central limit theorem in Hilbert spaces,
where \(Y(X; \cdot , \theta ) =(Y(X;0, \theta ), Y(X;1, \theta ), \ldots )\). The result follows from (16)–(18) and the continuous mapping theorem. \(\square \)
Proof of Theorem 3
Proceeding as in the proof of Theorem 2, we have that
with \(\Vert r^*_1\Vert _2=o_{P_*}(1)\) a.s.(P). Let
By applying Lemma 4 and Assumption 2, we get
with \(\Vert r_2^*\Vert _2=o_{P_*}(1)\) a.s.(P). To prove the result, we derive the asymptotic distribution of \(Y_n^*\), showing that it coincides with the asymptotic distribution of \(S_{n}(\widehat{\theta } )\) when the data come from \(X\sim \mathrm{Bell}(\theta )\). With this aim, we apply Theorem 1.1 in Kundu et al. (2000); specifically, we show that conditions (i)–(iii) in that theorem hold. For \(k \ge 0\), let \(e_k(j)= I(k=j)\); \(\{e_k\}_{k\ge 0}\) is an orthonormal basis of \(l^2\). We have that \(E_*\{\langle Y(X^*;\cdot , \widehat{\theta }),e_k \rangle _2\}= E_*\{Y(X^*;k, \widehat{\theta })\}=0\), \(\forall \, k \ge 0\), and by Lemma 1 and Assumption 2, \(E_*\{\Vert Y(X^*;\cdot , \widehat{\theta })\Vert _2^2\}<\infty \).
Let \(\mathcal {C}\) denote the operator defined in (11) and let \(\mathcal {C}_n\) be similarly defined by replacing \(\varrho (k,r)=Cov_{\theta }\{Y(X;k,\theta ),Y(X;r,\theta )\}\) with \(\varrho _n(k,r)=Cov_*\{Y(X^*; k,\widehat{\theta }),Y(X^*; r,\widehat{\theta })\}\), \(k\in \mathbb {N}_{0}\), \(r \in \mathbb {N}_{0}\). Assumption 2 and Lemma 4 imply that
Setting \(a_{k,r}=\langle \mathcal {C} e_k,e_r\rangle _2\) in the aforementioned Theorem 1.1, this proves that condition (i) holds. Similarly, condition (ii) holds since
and \(\sum _{k\ge 0}a_{kk}<\infty \). Finally, condition (iii) readily follows from Assumption 2. \(\square \)
Practical issues
Next, we describe some computational issues related to the calculation of the test statistics considered in the simulation study of Sect. 4. The test statistics \(S_{n}(\widehat{\theta })\), \(R_{n,w}(\widehat{\theta })\) and \(M_{n,w}(\widehat{\theta })\) are defined by means of infinite sums. However, these sums have to be truncated at some finite value, say M; that is,
From the numerical results, we have noted that taking \(M=20\) yields sufficiently precise values of these statistics. Finally, note that
with
and, therefore,
can be recursively calculated as follows: \({\textit{coef}}(0;\theta )=\theta \), and \({\textit{coef}}(u;\theta )={\textit{coef}}(u-1;\theta )\theta /u\) for \(u\ge 1\).
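As a minimal sketch (not the authors' code), this recursion can be implemented directly. Unwinding it also gives the closed form \({\textit{coef}}(u;\theta )=\theta ^{u+1}/u!\), which the snippet checks against:

```python
import math

def coef(u, theta):
    # coef(0; theta) = theta; coef(u; theta) = coef(u-1; theta) * theta / u
    c = theta
    for j in range(1, u + 1):
        c = c * theta / j
    return c

# unwinding the recursion gives coef(u; theta) = theta^(u+1) / u!
print(coef(3, 2.0))                  # recursive value
print(2.0 ** 4 / math.factorial(3))  # closed form, same value
```

The recursive form avoids computing large powers and factorials separately, which keeps the evaluation numerically stable for moderate u.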
Calculation of the bootstrap p-value
Let T denote any of the three test statistics and let \(T_{obs}\) stand for the observed value of that statistic. The bootstrap p-value \(\hat{p}=P_*(T \ge T_{obs})\) cannot be computed exactly. Nevertheless, it can be approximated as follows.
- 1.
Calculate the observed value of the test statistics for the available dataset \(X_1,\ldots , X_n\), say \(S_{obs}(\widehat{\theta })\), \(M_{obs}(\widehat{\theta })\) and \(R_{obs}(\widehat{\theta })\).
- 2.
Generate B bootstrap samples \(X_1^{*b},\ldots , X_n^{*b}\) from \(X^*\sim \mathrm{Bell}(\widehat{\theta })\), for \(b = 1,\ldots , B\).
- 3.
Calculate the test statistics \(S_n(\widehat{\theta })\), \(M_{n,w}(\widehat{\theta })\) and \(R_{n,w}(\widehat{\theta })\) for each bootstrap sample and denote them, respectively, by \(S_b^*\), \(M_b^*\) and \(R_b^*\) for \(b = 1,\ldots , B\).
- 4.
Compute the p-values of the tests based on the statistics \(S_n(\widehat{\theta })\), \(M_{n,w}(\widehat{\theta })\) and \(R_{n,w}(\widehat{\theta })\) by means, respectively, of the expressions
$$\begin{aligned} \widehat{p}_S =\frac{\#\{S_b^*\ge S_{obs}(\widehat{\theta })\}}{B},\quad \widehat{p}_M =\frac{\#\{M_b^*\ge M_{obs}(\widehat{\theta })\}}{B},\quad \widehat{p}_R =\frac{\#\{R_b^*\ge R_{obs}(\widehat{\theta })\}}{B}. \end{aligned}$$
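Steps 1–4 above can be sketched in code. The snippet below is a hypothetical, self-contained illustration, not the authors' implementation: it uses the Bell probability mass function \(p(k;\theta )=\theta ^{k}e^{1-e^{\theta }}B_{k}/k!\) (with \(B_k\) the Bell numbers), the maximum likelihood estimator \(\widehat{\theta }=W(\bar{X})\) obtained by solving \(\bar{X}=\theta e^{\theta }\) via the Lambert W function (Corless et al. 1996), a naive inverse-CDF sampler over a truncated support, and a simple stand-in distance statistic in place of \(S_n(\widehat{\theta })\), \(M_{n,w}(\widehat{\theta })\) and \(R_{n,w}(\widehat{\theta })\).

```python
import math
import random

def bell_numbers(m):
    # Bell numbers B_0, ..., B_m via the Bell triangle
    row, bells = [1], [1]
    for _ in range(m):
        new = [row[-1]]
        for x in row:
            new.append(new[-1] + x)
        row = new
        bells.append(row[0])
    return bells

def bell_pmf(k, theta, bells):
    # Bell(theta) pmf: p(k) = theta^k * exp(1 - e^theta) * B_k / k!
    return theta ** k * math.exp(1.0 - math.exp(theta)) * bells[k] / math.factorial(k)

def lambert_w(x, iters=50):
    # Newton iteration for w * e^w = x (principal branch, x >= 0)
    w = math.log(1.0 + x)
    for _ in range(iters):
        ew = math.exp(w)
        w -= (w * ew - x) / (ew * (1.0 + w))
    return w

def mle_theta(sample):
    # solving sample mean = theta * e^theta gives theta_hat = W(mean)
    return lambert_w(sum(sample) / len(sample))

def sample_bell(n, theta, bells, rng):
    # naive inverse-CDF sampling over the truncated support 0..M
    M = len(bells) - 1
    probs = [bell_pmf(k, theta, bells) for k in range(M + 1)]
    out = []
    for _ in range(n):
        u, c, k = rng.random(), 0.0, M
        for j, p in enumerate(probs):
            c += p
            if u <= c:
                k = j
                break
        out.append(k)
    return out

def statistic(sample, theta, bells, M=20):
    # hypothetical distance between empirical and fitted pmfs
    # (a stand-in for the statistics S_n, M_{n,w} and R_{n,w})
    n = len(sample)
    return n * sum((sample.count(k) / n - bell_pmf(k, theta, bells)) ** 2
                   for k in range(M + 1))

def bootstrap_pvalue(sample, B=200, seed=1):
    rng = random.Random(seed)
    bells = bell_numbers(25)
    theta_hat = mle_theta(sample)            # step 1: estimate and observed value
    t_obs = statistic(sample, theta_hat, bells)
    count = 0
    for _ in range(B):                       # steps 2-3: resample and recompute
        boot = sample_bell(len(sample), theta_hat, bells, rng)
        count += statistic(boot, mle_theta(boot), bells) >= t_obs
    return count / B                         # step 4: proportion exceeding t_obs
```

In practice one would replace `statistic` with the actual test statistic and, as suggested by Giacomini et al. (2013), could use the warp-speed method to reduce the cost of the Monte Carlo study.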
Batsidis, A., Jiménez-Gamero, M.D. & Lemonte, A.J. On goodness-of-fit tests for the Bell distribution. Metrika 83, 297–319 (2020). https://doi.org/10.1007/s00184-019-00733-6