Abstract
This contribution investigates the problem of estimating the size of a population, also known as the missing cases problem. Suppose a registration system aims to identify all cases having a certain characteristic, such as a specific disease (cancer, heart disease, ...), a disease-related condition (HIV, heroin use, ...) or a specific behavior (driving a car without a license). Every case in such a registration system has a notification history: it may have been identified several times (at least once), which can be understood as a particular capture-recapture situation. Typically, cases that have never been listed on any occasion are left out, and it is the frequency of these unobserved cases that one wants to estimate. In this paper, modelling concentrates on the counting distribution, i.e. the distribution of the variable that counts how often a given case has been identified by the registration system. Besides very simple models such as the binomial or Poisson distribution, finite (nonparametric) mixtures of these are considered, providing rather flexible modelling tools. Estimation is done by maximum likelihood by means of the EM algorithm. A case study on heroin users in Bangkok in the year 2001 completes the contribution.
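To make the estimation idea concrete, the following is a minimal sketch of the simplest case named in the abstract: a homogeneous Poisson counting distribution whose zero class is unobserved, fitted by EM. It is an illustration under stated assumptions, not the paper's implementation. The function name `fit_truncated_poisson_em` and the frequency counts are hypothetical (they are not the Bangkok data), and the finite mixture models considered in the paper are not covered here.

```python
import math


def fit_truncated_poisson_em(freq, tol=1e-10, max_iter=10000):
    """EM for a homogeneous Poisson model with an unobserved zero class.

    freq maps k (number of identifications, k >= 1) to the number of
    cases that were identified exactly k times.
    Returns (lambda_hat, n_hat), where n_hat is the estimated population
    size including the never-identified cases.
    """
    n = sum(freq.values())                        # observed cases
    total = sum(k * f for k, f in freq.items())   # total identifications
    lam = total / n                               # starting value
    for _ in range(max_iter):
        p0 = math.exp(-lam)                       # P(count = 0) under Poisson
        f0 = n * p0 / (1.0 - p0)                  # E-step: expected missing cases
        new_lam = total / (n + f0)                # M-step: complete-data Poisson MLE
        converged = abs(new_lam - lam) < tol
        lam = new_lam
        if converged:
            break
    n_hat = n / (1.0 - math.exp(-lam))            # observed plus estimated missing
    return lam, n_hat


# Hypothetical frequencies of counts: 60 cases seen once, 25 twice, and so on.
counts = {1: 60, 2: 25, 3: 10, 4: 5}
lam_hat, n_hat = fit_truncated_poisson_em(counts)
print(f"lambda_hat = {lam_hat:.4f}, estimated population size = {n_hat:.1f}")
```

At convergence the estimate satisfies the zero-truncated Poisson likelihood equation, in which the mean of the observed counts equals lambda / (1 - exp(-lambda)). The estimated number of missing cases is the difference between `n_hat` and the number of observed cases.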
Böhning, D., Dietz, E., Kuhnert, R. et al. Mixture models for capture-recapture count data. Statistical Methods & Applications 14, 29–43 (2005). https://doi.org/10.1007/BF02511573
DOI: https://doi.org/10.1007/BF02511573