Abstract
Over the past decade, diagnostic classification models (DCMs) have become an active area of psychometric research. Despite their use, the reliability of examinee estimates in DCM applications has seldom been reported. In this paper, a reliability measure for the categorical latent variables of DCMs is defined. Using theory- and simulation-based results, we show that DCMs uniformly provide greater examinee estimate reliability than IRT models for tests of the same length, a result that follows from the smaller range of values the latent variable estimates can take in DCMs. We demonstrate this result by comparing DCM and IRT reliability for a series of models estimated with data from an end-of-grade test, and conclude with a discussion of how DCMs can change the character of large-scale testing, either by shortening tests that measure examinees unidimensionally or by providing more reliable multidimensional measurement for tests of the same length.
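The intuition behind the abstract's claim can be illustrated with a small parallel-forms simulation. The sketch below is not the authors' estimator; it is a rough, self-contained illustration under assumed item parameters (slip and guess rates of .25 for a single-attribute mastery model, Rasch difficulties spaced over [-2, 2] for IRT). Reliability is computed in both cases as the correlation between examinee estimates on two parallel forms of the same length: classifications for the DCM-like model, EAP trait estimates for the Rasch model. All function names here are illustrative, not from the paper.

```python
import math
import random

random.seed(0)
N, J = 2000, 15  # examinees, items per form

# ---- DCM side: one binary attribute, mastery-type items (slip = guess = .25) ----
slip, guess = 0.25, 0.25
alpha = [random.random() < 0.5 for _ in range(N)]  # true mastery status

def dcm_form():
    """Simulate one parallel form: masters answer correctly w.p. 1-slip, others w.p. guess."""
    return [[random.random() < ((1 - slip) if a else guess) for _ in range(J)]
            for a in alpha]

def classify(form):
    """MAP mastery classification with known item parameters and a flat prior."""
    out = []
    for resp in form:
        ll1 = sum(math.log(1 - slip) if x else math.log(slip) for x in resp)
        ll0 = sum(math.log(guess) if x else math.log(1 - guess) for x in resp)
        out.append(ll1 > ll0)
    return out

def pearson(u, v):
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    su = math.sqrt(sum((a - mu) ** 2 for a in u))
    sv = math.sqrt(sum((b - mv) ** 2 for b in v))
    return cov / (su * sv)

# Phi coefficient between classifications on two parallel forms
dcm_rel = pearson([float(c) for c in classify(dcm_form())],
                  [float(c) for c in classify(dcm_form())])

# ---- IRT side: Rasch model, same test length, EAP scoring over a grid ----
theta = [random.gauss(0, 1) for _ in range(N)]
b = [-2 + 4 * j / (J - 1) for j in range(J)]  # item difficulties

def irt_form():
    return [[random.random() < 1 / (1 + math.exp(-(t - bj))) for bj in b]
            for t in theta]

grid = [-4 + 8 * g / 40 for g in range(41)]  # quadrature points

def eap(resp):
    """Posterior mean of theta over the grid with an N(0,1) prior."""
    num = den = 0.0
    for q in grid:
        like = 1.0
        for x, bj in zip(resp, b):
            p = 1 / (1 + math.exp(-(q - bj)))
            like *= p if x else 1 - p
        w = like * math.exp(-q * q / 2)
        num += w * q
        den += w
    return num / den

irt_rel = pearson([eap(r) for r in irt_form()], [eap(r) for r in irt_form()])

print(f"DCM parallel-forms reliability (phi): {dcm_rel:.3f}")
print(f"IRT parallel-forms reliability (r):   {irt_rel:.3f}")
```

With these assumed parameters the binary classification agrees across forms far more often than the continuous trait estimates correlate, mirroring the abstract's point that estimates with a smaller range of possible values can be recovered more reliably from the same number of items.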
Acknowledgments
We would like to thank Terry Ackerman, Allan Cohen, Jeff Douglas, Robert Henson, John Poggio, and John Willse for their helpful comments and critiques of the concepts and text presented in this paper. Complete syntax for running all analyses herein and resulting program output are available at the first author’s website.
This research was funded by National Science Foundation grants DRL-0822064, SES-0750859, and SES-1030337. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.
Cite this article
Templin, J., Bradshaw, L. Measuring the Reliability of Diagnostic Classification Model Examinee Estimates. J Classif 30, 251–275 (2013). https://doi.org/10.1007/s00357-013-9129-4