
Mixture models for capture-recapture count data

  • Statistical Methods
Statistical Methods and Applications

Abstract

This contribution investigates the problem of estimating the size of a population, also known as the missing cases problem. Suppose a registration system aims to identify all cases having a certain characteristic, such as a specific disease (cancer, heart disease, ...), a disease-related condition (HIV, heroin use, ...), or a specific behavior (driving a car without a license). Every case in such a registration system has a notification history: it has been identified at least once and possibly several times, which can be understood as a particular capture-recapture situation. Typically, cases that have never been listed on any occasion are left out, and it is the frequency of these cases that one wants to estimate. In this paper, modelling concentrates on the counting distribution, i.e. the distribution of the variable that counts how often a given case has been identified by the registration system. Besides very simple models such as the binomial or Poisson distribution, finite (nonparametric) mixtures of these are considered, providing rather flexible modelling tools. Estimation is done by maximum likelihood by means of the EM algorithm. A case study on heroin users in Bangkok in the year 2001 completes the contribution.
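
The estimation idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a finite mixture of Poisson distributions observed only for counts of one or more, treats the unseen zero-count frequency as missing data in the EM algorithm, and converts the fitted zero probability into a population size via n / (1 - p0), an estimator assumed here for illustration. The function names, starting values, and the frequency table are hypothetical and are not taken from the Bangkok case study.

```python
import math


def poisson_pmf(x, lam):
    """Poisson probability of count x with mean lam (lam > 0)."""
    return math.exp(-lam + x * math.log(lam) - math.lgamma(x + 1))


def fit_truncated_poisson_mixture(freq, lambdas, weights, n_iter=2000):
    """EM sketch for a zero-truncated finite Poisson mixture.

    freq:    dict mapping a count x >= 1 to the number of cases seen x times
    lambdas: starting values for the component means (all > 0)
    weights: starting values for the component weights (summing to 1)
    Returns (lambdas, weights, estimated population size).
    """
    lambdas, weights = list(lambdas), list(weights)
    n = sum(freq.values())  # number of observed (registered) cases
    k = len(lambdas)
    for _ in range(n_iter):
        # E-step, part 1: expected number of never-registered cases
        p0 = sum(w * math.exp(-l) for w, l in zip(weights, lambdas))
        f0 = n * p0 / (1.0 - p0)
        full = dict(freq)
        full[0] = f0
        # E-step, part 2: split each frequency across the components
        mass = [0.0] * k   # expected number of cases per component
        total = [0.0] * k  # expected sum of counts per component
        for x, f in full.items():
            joint = [w * poisson_pmf(x, l) for w, l in zip(weights, lambdas)]
            denom = sum(joint)
            for j in range(k):
                r = f * joint[j] / denom
                mass[j] += r
                total[j] += r * x
        # M-step: weighted proportions and weighted means
        weights = [m / (n + f0) for m in mass]
        lambdas = [t / m for t, m in zip(total, mass)]
    p0 = sum(w * math.exp(-l) for w, l in zip(weights, lambdas))
    return lambdas, weights, n / (1.0 - p0)


# Hypothetical frequency table: 50 cases seen once, 30 twice, and so on.
example_freq = {1: 50, 2: 30, 3: 15, 4: 5}
lams, wts, n_hat = fit_truncated_poisson_mixture(example_freq, [1.0], [1.0])
print(lams, wts, n_hat)
```

With a single component this reduces to the simple zero-truncated Poisson model. Passing two or more starting values, for example `[0.5, 2.0]` with weights `[0.5, 0.5]`, fits a mixture, although the number of components and the starting values are choices the sketch leaves to the user.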


Cite this article

Böhning, D., Dietz, E., Kuhnert, R. et al. Mixture models for capture-recapture count data. Statistical Methods & Applications 14, 29–43 (2005). https://doi.org/10.1007/BF02511573

  • DOI: https://doi.org/10.1007/BF02511573
