Estimating the number of shared species by a jackknife procedure

Chuang, Chia-Jui; Shen, Tsung-Jen; Hwang, Wen-Han

doi:10.1007/s10651-015-0318-7

Published: 07 May 2015

Environmental and Ecological Statistics, volume 22, pages 759–778 (2015)

Chia-Jui Chuang¹,², Tsung-Jen Shen¹ & Wen-Han Hwang¹

Abstract

A sequence of jackknife estimators is developed to estimate the number of species shared by two communities. The estimators have simple, explicit formulae. A sequential testing criterion is also developed to select an appropriate order for these jackknife estimators. The performance of the estimators is evaluated using simulated data and empirical data from two forests in Malaysia, in which 209 species are present in both forests. Results for the empirical data and the simulated scenarios (with sampling fractions ranging from 0.5 to 20%) show that the jackknife estimator, compared with existing estimators, has a smaller bias and provides more reliable interval estimation in most cases. Additionally, two avian datasets from Taiwan and Hong Kong are used to demonstrate the proposed method. To extend the proposed method to three communities, we also list the first six orders of the jackknife estimators explicitly.

Acknowledgments

The authors are grateful to Professor Fangliang He for valuable discussions and for providing the Lambir forest plot data. The authors thank the referees and editor for their useful comments. We also thank Roman Gulati for his generous editing assistance. This work was supported by the Ministry of Science and Technology of Taiwan.

Author information

Authors and Affiliations

Department of Applied Mathematics and Institute of Statistics, National Chung Hsing University, Taichung, Taiwan
Chia-Jui Chuang, Tsung-Jen Shen & Wen-Han Hwang
Center for Biomedical Resources, National Health Research Institutes, Zhunan, Taiwan
Chia-Jui Chuang

Corresponding author

Correspondence to Wen-Han Hwang.

Additional information

Handling Editor: Pierre Dutilleul.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (docx 1065 KB)

Appendix: A general result for the jackknife estimators $\hat{S}_k$

Define a 2-dimensional array of coefficients $d_{t,u} $ as:

$$\begin{aligned} \left\{ {{\begin{array}{ll} {d_{1,1} =1} &{} \\ {d_{t,t} =-td_{t-1,t-1}} &{} {\forall t\ge 2;} \\ {d_{t,1} =2^{t}-1} &{} {\forall t\ge 2;} \\ {d_{t,u} =d_{t-1,u} +u\left( {d_{t-1,u} -d_{t-1,u-1}} \right) } &{} {\forall t\ge 2\hbox { and }2\le u<t} \\ {d_{t,u} =0} &{} {\hbox {otherwise}.} \\ \end{array}}} \right. \end{aligned}$$
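The recursion above can be evaluated mechanically. The following is a minimal Python sketch, not the authors' code; the function name `d` and the use of memoisation are illustrative choices, and any index pair not covered by the first four rules is treated as zero, as the last rule states.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def d(t: int, u: int) -> int:
    """Coefficient d_{t,u} from the recursive definition above."""
    if t < 1 or u < 1 or u > t:
        return 0                      # "otherwise" rule
    if t == 1:
        return 1                      # d_{1,1} = 1
    if u == t:
        return -t * d(t - 1, t - 1)   # diagonal rule, t >= 2
    if u == 1:
        return 2 ** t - 1             # first-column rule, t >= 2
    # interior rule, t >= 2 and 2 <= u < t
    return d(t - 1, u) + u * (d(t - 1, u) - d(t - 1, u - 1))


for t in range(1, 4):
    print([d(t, u) for u in range(1, t + 1)])
# [1]
# [3, -2]
# [7, -12, 6]
```

The second and third rows are the coefficients that multiply $f_{t+}$ and $f_{+u}$ in the fourth- and sixth-order estimators.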

These coefficients simplify the expressions of the jackknife estimators. The resulting formulae are summarized in the following theorem.

Theorem 1

For each nonnegative integer $\nu$, we have:

$$\begin{aligned} \hat{S}_{2\nu ,X} &= D+\sum _{t=1}^{\nu +1} d_{\nu +1,t} \frac{n_1 -t}{n_1}f_{t+} +\sum _{u=1}^{\nu } d_{\nu ,u} \frac{n_2 -u}{n_2}f_{+u} \\ &\quad +\sum _{t=1}^{\nu +1} \sum _{u=1}^{\nu } d_{\nu +1,t}\, d_{\nu ,u} \frac{(n_1 -t)(n_2 -u)}{n_1 n_2}f_{tu} \end{aligned} \tag{7}$$

and

$$\begin{aligned} \hat{S}_{2\nu ,Y} &= D+\sum _{t=1}^{\nu } d_{\nu ,t} \frac{n_1 -t}{n_1}f_{t+} +\sum _{u=1}^{\nu +1} d_{\nu +1,u} \frac{n_2 -u}{n_2}f_{+u} \\ &\quad +\sum _{t=1}^{\nu } \sum _{u=1}^{\nu +1} d_{\nu ,t}\, d_{\nu +1,u} \frac{(n_1 -t)(n_2 -u)}{n_1 n_2}f_{tu} . \end{aligned} \tag{8}$$

Therefore, $\hat{S}_{2\nu +1} =(n_1 \hat{S}_{2\nu ,X} +n_2 \hat{S}_{2\nu ,Y})/(n_1 +n_2)$ is a linear combination of the frequencies $f_{tu}$. Furthermore, the $(2\nu +2)$-th order jackknife estimator is:

$$\begin{aligned} \hat{S}_{2\nu +2} &= D+\sum _{t=1}^{\nu +1} d_{\nu +1,t} \frac{n_1 -t}{n_1}f_{t+} +\sum _{u=1}^{\nu +1} d_{\nu +1,u} \frac{n_2 -u}{n_2}f_{+u} \\ &\quad +\sum _{t=1}^{\nu +1} \sum _{u=1}^{\nu +1} d_{\nu +1,t}\, d_{\nu +1,u} \frac{(n_1 -t)(n_2 -u)}{n_1 n_2}f_{tu} . \end{aligned} \tag{9}$$
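Equations (7)–(9) share one pattern and differ only in the upper limits of the sums, so they can be evaluated by a single routine. The sketch below is an illustration under stated assumptions rather than the authors' implementation: the function names are hypothetical, the counts $f_{t+}$, $f_{+u}$ and $f_{tu}$ are passed as dictionaries (missing keys count as zero), their definitions are taken from the main text of the paper and are not restated in this appendix, an empty sum is taken to be zero, and the example counts are invented.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def d(t: int, u: int) -> int:
    """Coefficient d_{t,u} from the recursion defined in this appendix."""
    if t < 1 or u < 1 or u > t:
        return 0
    if t == 1:
        return 1
    if u == t:
        return -t * d(t - 1, t - 1)
    if u == 1:
        return 2 ** t - 1
    return d(t - 1, u) + u * (d(t - 1, u) - d(t - 1, u - 1))


def _combine(a, b, D, n1, n2, f_row, f_col, f):
    """Common form of Eqs. (7)-(9): sample-1 sums run to a, sample-2 sums to b."""
    w1 = {t: d(a, t) * (n1 - t) / n1 for t in range(1, a + 1)}
    w2 = {u: d(b, u) * (n2 - u) / n2 for u in range(1, b + 1)}
    total = D
    total += sum(w * f_row.get(t, 0) for t, w in w1.items())
    total += sum(w * f_col.get(u, 0) for u, w in w2.items())
    total += sum(w1[t] * w2[u] * f.get((t, u), 0) for t in w1 for u in w2)
    return total


def jackknife_shared(k, D, n1, n2, f_row, f_col, f):
    """k-th order estimate (k >= 1) computed from Theorem 1."""
    if k < 1:
        raise ValueError("order k must be a positive integer")
    m, odd = divmod(k, 2)
    if not odd:
        # Even order k = 2*nu + 2: Eq. (9), both sums run to nu + 1 = k / 2.
        return _combine(m, m, D, n1, n2, f_row, f_col, f)
    # Odd order k = 2*nu + 1 with nu = m: weighted mean of Eqs. (7) and (8).
    s_x = _combine(m + 1, m, D, n1, n2, f_row, f_col, f)
    s_y = _combine(m, m + 1, D, n1, n2, f_row, f_col, f)
    return (n1 * s_x + n2 * s_y) / (n1 + n2)


# Invented toy counts, for illustration only.
f = {(1, 1): 2, (1, 2): 1, (1, 3): 1, (2, 1): 1, (2, 2): 1}
f_row = {1: 4, 2: 2}          # f_{t+}
f_col = {1: 3, 2: 2, 3: 1}    # f_{+u}
for k in range(1, 5):
    print(k, jackknife_shared(k, D=10, n1=100, n2=200,
                              f_row=f_row, f_col=f_col, f=f))
```

Passing the marginal counts separately, rather than deriving them from `f`, keeps the sketch independent of how $f_{t+}$ and $f_{+u}$ are defined in the main text.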

The proof proceeds by mathematical induction; because the algebra is lengthy, it is given in the Supplementary Materials. The formulae simplify further, as stated in the following corollary.

Corollary 1

When the sample sizes $n_1 $ and $n_2 $ are sufficiently large, define $\lambda _j =(n_j -h)/(n_1 +n_2)$ for any finite number $h$ and $j=1,2$. Asymptotically, the explicit forms of the jackknife estimators $\hat{S}_k $ for $k=1,\ldots ,6$, are as follows:

$$\begin{aligned} \hat{S}_1 &= D+\lambda _1 f_{1+} +\lambda _2 f_{+1} ; \\ \hat{S}_2 &= D+f_{1+} +f_{+1} +f_{11} ; \\ \hat{S}_3 &= D+(1+2\lambda _1)f_{1+} -2\lambda _1 f_{2+} +(1+2\lambda _2)f_{+1} -2\lambda _2 f_{+2} \\ &\quad +3f_{11} -2\lambda _1 f_{21} -2\lambda _2 f_{12} ; \\ \hat{S}_4 &= D+3f_{1+} -2f_{2+} +3f_{+1} -2f_{+2} +9f_{11} -6f_{12} -6f_{21} +4f_{22} ; \\ \hat{S}_5 &= D+(3+4\lambda _1)f_{1+} -2(1+5\lambda _1)f_{2+} +6\lambda _1 f_{3+} \\ &\quad +(3+4\lambda _2)f_{+1} -2(1+5\lambda _2)f_{+2} +6\lambda _2 f_{+3} \\ &\quad +21f_{11} +(22\lambda _1 -36)f_{12} +(22\lambda _2 -36)f_{21} +24f_{22} \\ &\quad +18\lambda _1 f_{31} +18\lambda _2 f_{13} -12\lambda _1 f_{32} -12\lambda _2 f_{23} ; \\ \hat{S}_6 &= D+7f_{1+} -12f_{2+} +6f_{3+} +7f_{+1} -12f_{+2} +6f_{+3} \\ &\quad +49f_{11} -84f_{12} +42f_{13} -84f_{21} +144f_{22} -72f_{23} \\ &\quad +42f_{31} -72f_{32} +36f_{33} . \end{aligned}$$
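As a check on the first line of the corollary, the case $\nu =0$ of Theorem 1 can be worked out directly. This short derivation assumes that a sum whose upper limit is zero is empty and contributes nothing. With $d_{1,1}=1$, Eqs. (7) and (8) reduce to single-term expressions, and their weighted mean gives:

$$\begin{aligned} \hat{S}_{0,X} &= D+\frac{n_1 -1}{n_1}f_{1+} , \qquad \hat{S}_{0,Y} = D+\frac{n_2 -1}{n_2}f_{+1} , \\ \hat{S}_1 &= \frac{n_1 \hat{S}_{0,X} +n_2 \hat{S}_{0,Y}}{n_1 +n_2} = D+\frac{n_1 -1}{n_1 +n_2}f_{1+} +\frac{n_2 -1}{n_1 +n_2}f_{+1} . \end{aligned}$$

The two fractions are $\lambda _1$ and $\lambda _2$ with $h=1$, so for this order the stated form of $\hat{S}_1$ holds exactly, not only asymptotically.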

About this article

Cite this article

Chuang, CJ., Shen, TJ. & Hwang, WH. Estimating the number of shared species by a jackknife procedure. Environ Ecol Stat 22, 759–778 (2015). https://doi.org/10.1007/s10651-015-0318-7

Received: 12 September 2014
Revised: 13 April 2015
Published: 07 May 2015
Issue Date: December 2015
DOI: https://doi.org/10.1007/s10651-015-0318-7
