Abstract
Count outcomes are common in health research, and they often contain a large number of zeros. Various modifications of the Poisson regression model have been proposed to account for these extra zeros. Lambert (Lambert, D., Technometrics 34, 1–14, 1992) described a zero-inflated Poisson (ZIP) model based on a mixture of a binary distribution (π_i) degenerate at zero and a Poisson distribution (λ_i). Depending on the relationship between π_i and λ_i, she described two variants: a ZIP model and a ZIP(τ) model. In this paper, we extend these models to clustered data (e.g., patients observed within hospitals) and describe random-effects ZIP and ZIP(τ) models, which are appropriate for the analysis of clustered Poisson count data with extra zeros. The random effects are assumed to be normally distributed, and maximum marginal likelihood is used to estimate the model parameters. We applied these models to data on patients who underwent colon operations at 123 Veterans Affairs Medical Centers in the National VA Surgical Quality Improvement Program.
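As a rough illustration of the mixture described in the abstract, the sketch below evaluates a ZIP probability for a single observation: with probability π the count is a structural zero, and otherwise it is drawn from a Poisson distribution with mean λ. The function names `zip_pmf` and `zip_tau_pi` are hypothetical and chosen only for this example. The ZIP(τ) link used here, logit(π) = −τ log(λ), follows Lambert (1992) and is not spelled out in the abstract. Random effects and the marginal likelihood are deliberately left out.

```python
import math


def zip_pmf(y, pi, lam):
    """P(Y = y) for a zero-inflated Poisson observation.

    pi  : probability of a structural (extra) zero, 0 <= pi < 1
    lam : mean of the Poisson component, lam > 0
    """
    # Poisson probability computed on the log scale for numerical stability.
    poisson = math.exp(-lam + y * math.log(lam) - math.lgamma(y + 1))
    if y == 0:
        # A zero can come from either mixture component.
        return pi + (1.0 - pi) * poisson
    return (1.0 - pi) * poisson


def zip_tau_pi(lam, tau):
    """Zero-inflation probability under the ZIP(tau) parameterisation.

    Assumes logit(pi) = -tau * log(lam), as in Lambert (1992), so that
    pi = 1 / (1 + lam ** tau).
    """
    return 1.0 / (1.0 + lam ** tau)


if __name__ == "__main__":
    lam = 2.0
    pi = zip_tau_pi(lam, tau=1.5)
    # The probabilities over a long range of counts should sum to about 1.
    total = sum(zip_pmf(y, pi, lam) for y in range(60))
    print(pi, zip_pmf(0, pi, lam), total)
```

In the ZIP variant, π and λ would each get their own regression equation; in ZIP(τ), π is tied to λ through the single extra parameter τ, which is what `zip_tau_pi` mimics.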
References
Albert, J., “A Bayesian analysis of a Poisson random effects model for home run hitters,” The American Statistician 46, 246–253, 1992.
Berndt, B., Hall, E., Hall, R., and Hausman, J., “Estimation and inference in nonlinear structural models,” Annals of Economic and Social Measurement 3, 653–666, 1974.
Bock, R.D., Multilevel analysis of educational data, Academic Press, New York, 1989.
Breslow, N.E., “Extra Poisson variation in log-linear models,” Applied Statistics 33, 38–44, 1984.
Bryk, A.S. and Raudenbush, S.W., Hierarchical linear models: Applications and data analysis methods, Sage, London, 1992.
Cameron, A. and Trivedi, P., “Econometric models based on count data: Comparisons and applications of some estimators and tests,” Journal of Applied Econometrics 1, 29–53, 1986.
Cohen, A., “Estimation of the Poisson parameter from truncated samples and from censored samples,” Journal of the American Statistical Association 49, 158–168, 1954.
Daley, J., Khuri, S.F., Henderson, W.G., Hur, K. et al., “Risk adjustment of the postoperative morbidity rate for the comparative assessment of the quality of surgical care,” Journal of the American College of Surgeons 185(4), 328–340, 1997.
Dunlop, D., “Regression for longitudinal data: A bridge from least squares regression,” The American Statistician 48(4), 299–303, 1994.
Gibbons, R. and Hedeker, D., “Application of random effects probit regression models,” Journal of Consulting and Clinical Psychology 62, 285–296, 1994.
Gibbons, R. and Hedeker, D., “Random effects probit and logistic regression models for three-level data,” Biometrics 53, 1527–1537, 1997.
Greene, W.H., “Accounting for excess zeros and sample selection in Poisson and negative binomial regression models,” Working paper, Department of Economics, Stern School of Business, New York University, New York, 1994.
Greene, W.H., LIMDEP Version 7.0 user's manual, rev. edn., Econometric Software, Inc., Plainview, NY, 1998.
Gupta, P., Gupta, R., and Tripathi, R., “Analysis of zero-adjusted count data,” Computational Statistics and Data Analysis 23, 207–218, 1996.
Hall, D.B., “Zero-inflated Poisson and binomial regression with random effects: A case study,” Biometrics 56, 1030–1039, 2000.
Hedeker, D., MIXPREG: A computer program for mixed-effects Poisson regression. Technical Report, School of Public Health, University of Illinois at Chicago, Chicago, 1998.
Hedeker, D. and Gibbons, R., “A random effects ordinal regression model for multilevel analysis,” Biometrics 50, 933–944, 1994.
Hedeker, D., Siddiqui, O., and Hu, F., “Random effects regression analysis of correlated grouped-time survival data,” Statistical Methods in Medical Research 9, 161–179, 2000.
Heilbron, D., “Generalized linear models for altered zero probabilities and over dispersion in count data,” Technical Report, Department of Epidemiology and Biostatistics, University of California, San Francisco, 1989.
Heilbron, D., “Zero-altered and other regression models for count data with added zeros,” Biometrical Journal 36, 531–547, 1994.
Johnson, D. and Kotz, S., Distributions in statistics: Discrete distributions, John Wiley and Sons, New York, 1969.
Khuri, S.F., Daley, J., Henderson, W.G., Hur, K. et al., “Risk adjustment of the postoperative mortality rate for the comparative assessment of the quality of surgical care: Results of the National Veterans Affairs Surgical Risk Study,” Journal of the American College of Surgeons 185(4), 315–327, 1997.
Khuri, S.F., Daley, J., Henderson, W.G., Hur, K. et al., “The Department of Veterans Affairs NSQIP: The first national validated, outcome-based, risk-adjusted, and peer-controlled program for the measurement and enhancement of the quality of care,” Annals of Surgery 228(4), 491–507, 1998.
King, G., “Event count models for international relations: Generalizations and applications,” International Studies Quarterly 33, 123–147, 1989.
Lambert, D., “Zero-inflated Poisson regression with an application to defects in manufacturing,” Technometrics 34, 1–14, 1992.
Lawless, J., “Negative binomial and mixed Poisson regression,” Canadian Journal of Statistics 15, 209–225, 1987.
Lee, Y. and Nelder, J.A., “Hierarchical generalized linear models (with discussion),” Journal of the Royal Statistical Society B 58, 619–678, 1996.
Long, S., Regression models for categorical and limited dependent variables, Sage, London, 1997.
Mullahy, J., “Specification and testing of some modified count data models,” Journal of Econometrics 33, 341–365, 1986.
Neuhaus, J.M. and Jewell, N., “Some comments on Rosner's multiple logistic model for clustered data,” Biometrics 46, 523–534, 1990.
Neuhaus, J.M., Kalbfleisch, J.D., and Hauck, W.W., “A comparison of cluster-specific and population-averaged approaches for analyzing correlated binary data,” International Statistical Review 59, 25–35, 1991.
Normand, S.L.T., Glickman, M.E. et al., “Using admission characteristics to predict short-term mortality from myocardial infarction in elderly patients,” JAMA 275, 1322–1328, 1996.
Preisler, H.K., “Analysis of a toxicological experiment using a generalized linear model with nested random effects,” International Statistical Review 57, 145–159, 1989.
Prentice, R., “Correlated binary regression with covariates specific to each binary observation,” Biometrics 44, 1033–1048, 1988.
Rosner, B., “Multivariate methods for clustered binary data with more than one level of nesting,” Journal of the American Statistical Association 84, 373–380, 1989.
Siddiqui, O., “Modeling clustered count and survival data with an application to a school-based smoking prevention study,” PhD Dissertation, University of Illinois at Chicago, 1996.
Snijders, T. and Bosker, R., Multilevel analysis: An introduction to basic and advanced multilevel modeling, Sage, Thousand Oaks, CA, 1999.
Stroud, A.H. and Sechrest, D., Gaussian quadrature formulas, Prentice Hall, Englewood Cliffs, NJ, 1966.
Ten Have, T., Landis, R., and Hartzel, J., “Population-averaged and cluster-specific models for clustered ordinal response data,” Statistics in Medicine 15, 2573–2588, 1996.
Thall, P.F., “Mixed Poisson likelihood regression models for longitudinal interval count data,” Biometrics 44, 197–209, 1988.
Vach, W. and Blettner, M., “Missing data in epidemiologic studies,” Encyclopedia of Biostatistics 4, 2641–2654, 1998.
Vuong, Q., “Likelihood ratio tests for model selection and non-nested hypotheses,” Econometrica 57(2), 307–333, 1989.
Yau, K. and Lee, A., “Zero-inflated Poisson regression with random effects to evaluate an occupational injury prevention programme,” Statistics in Medicine 20, 2907–2920, 2001.
Cite this article
Hur, K., Hedeker, D., Henderson, W. et al. Modeling Clustered Count Data with Excess Zeros in Health Care Outcomes Research. Health Services & Outcomes Research Methodology 3, 5–20 (2002). https://doi.org/10.1023/A:1021594923546