Abstract
In this paper, I review some aspects of psychometric projects that I have been involved in, emphasizing the nature of the work of the psychometricians involved, especially the balance between the statistical and scientific elements of that work. The intent is to understand where psychometrics, as a discipline, has been and where it might be headed, at least in part by considering one particular journey (my own). In contemplating this, I also look to psychometrics journals to see how psychometricians represent themselves to themselves and, in a complementary way, to substantive journals to see how psychometrics is represented there (or perhaps not represented, as the case may be). I present a series of questions to consider what the appropriate foci of the psychometric discipline are. As an example, I present one recent project at the end, in which the roles of the psychometricians and the substantive researchers have had to become intertwined in order to make satisfactory progress. In the conclusion, I discuss the consequences of such a view for the future of psychometrics.
Notes
This system, called the BEAR Assessment System (BAS), is described in Wilson (2005).
This level of CoS is summarized as: “Consider statistics as measures of characteristics of a sample distribution.”
References
Acton, G.S., Kunz, J.D., Wilson, M., & Hall, S.M. (2005). The construct of internalization: conceptualization, measurement, and prediction of smoking treatment outcome. Psychological Medicine, 35, 395–408.
Adams, R.J., Wilson, M., & Wu, M. (1997a). Multilevel item response models: an approach to errors in variables regression. Journal of Educational and Behavioral Statistics, 22(1), 47–76.
Adams, R.J., Wilson, M., & Wang, W.C. (1997b). The multidimensional random coefficients multinomial logit model. Applied Psychological Measurement, 21, 1–23.
Adams, R.J., Wu, M., & Wilson, M. (2012). ConQuest 3.0 [computer program]. Hawthorn, Australia: ACER.
American Educational Research Association, American Psychological Association, National Council on Measurement in Education (AERA, APA, NCME) (1999). Standards for educational and psychological testing. Washington: American Educational Research Association.
American Institutes for Research (2000). Voluntary national test, cognitive laboratory report, year 2. Palo Alto: American Institutes for Research.
Biggs, J.B., & Collis, K.F. (1982). Evaluating the quality of learning: the SOLO taxonomy. New York: Academic Press.
Borsboom, D. (2006). The attack of the psychometricians. Psychometrika, 71(3), 425–440.
Brown, N.J.S., & Wilson, M. (2011). Model of cognition: the missing cornerstone of assessment. Educational Psychology Review, 23(2), 221–234.
Corcoran, T., Mosher, F.A., & Rogat, A. (2009). Learning progressions in science: an evidence-based approach to reform (CPRE Research Report #RR-63). New York: Center on Continuous Instructional Improvement, Teachers College—Columbia University.
De Boeck, P., Wilson, M., & Acton, G.S. (2005). A conceptual and psychometric framework for distinguishing categories and dimensions. Psychological Review, 112(1), 129–158.
Demetriou, A., & Efklides, A. (1989). The person’s conception of the structures of developing intellect: early adolescence to middle age. Genetic, Social, and General Psychology Monographs, 115, 371–423.
Demetriou, A., & Kyriakides, L. (2006). The functional and developmental organization of cognitive developmental sequences. British Journal of Educational Psychology, 76(2), 209–242.
Dempster, A.P., Laird, N.M., & Rubin, D.B. (1977). Maximum likelihood from incomplete data via the EM algorithm. Journal of the Royal Statistical Society. Series B, 39, 1–38.
Diakow, R., & Irribarra, D.T. (2011). Developing assessments of data modeling and mapping a learning progression using a structured constructs model. Paper presented at the international meeting of the psychometric society, Hong Kong, July 2011.
Diakow, R., Irribarra, D.T., & Wilson, M. (2011). Analyzing the complex structure of a learning progression: structured construct models. Paper presented at the annual meeting of the national council on measurement in education, New Orleans, LA, April 2011.
Diakow, R., Irribarra, D.T., & Wilson, M. (2012a). Analyzing the complex structure of a learning progression: structured construct models. Paper presented at the national council on measurement in education annual meeting, Vancouver, Canada, April 2012.
Diakow, R., Irribarra, D.T., & Wilson, M. (2012b). Evaluating the impact of alternative models for between and within construct relations. Paper presented at the international meeting of the psychometric society, Lincoln, Nebraska, July 2012.
Draney, K. (1996). The polytomous saltus model: a mixture model approach to the diagnosis of developmental differences. Unpublished doctoral dissertation, University of California, Berkeley.
Draney, K., & Jeon, M. (2011). Investigating the saltus model as a tool for setting standards. Psychological Test and Assessment Modeling, 53(4), 486–498.
Draney, K., & Wilson, M. (2004). Application of the polytomous saltus model to stage-like data. In A. van der Ark, M. Croon, & K. Sijtsma (Eds.), New developments in categorical data analysis for the social and behavioral sciences. Mahwah: Erlbaum.
Falmagne, J.-C., & Doignon, J.-P. (2011). Learning spaces. Heidelberg: Springer.
Fischer, K.W., Pipp, S.L., & Bullock, D. (1984). Detecting discontinuities in development: methods and measurement. In R.N. Emde & R. Harmon (Eds.), Continuities and discontinuities in development. Norwood: Ablex.
Irribarra, D.T., Diakow, R., & Wilson, M. (2012). Alternative specifications for structured construct models. Paper presented at the IOMW 2012 conference, Vancouver, April 2012.
Lehrer, R., Kim, M.-J., Ayers, E., & Wilson, M. (2013, in press). Toward establishing a learning progression to support the development of statistical reasoning. In J. Confrey & A. Maloney (Eds.), Learning over time: learning trajectories in mathematics education. Charlotte: Information Age Publishers.
Marton, F. (1981). Phenomenography: describing conceptions of the world around us. Instructional Science, 10, 177–200.
Marton, F. (1986). Phenomenography—a research approach to investigating different understandings of reality. Journal of Thought, 21, 29–49.
Marton, F. (1988). Phenomenography—exploring different conceptions of reality. In D. Fetterman (Ed.), Qualitative approaches to evaluation in education (pp. 176–205). New York: Praeger.
Masters, G.N. (1982). A Rasch model for partial credit scoring. Psychometrika, 47, 149–174.
Mislevy, R.J., Steinberg, L.S., & Almond, R.G. (2003). On the structure of educational assessments. Measurement: Interdisciplinary Research and Perspectives, 1, 3–67.
Mislevy, R.J., & Wilson, M. (1996). Marginal maximum likelihood estimation for a psychometric model of discontinuous development. Psychometrika, 61, 41–71.
National Research Council (2001). Knowing what students know: the science and design of educational assessment. Committee on the Foundations of Assessment, J. Pellegrino, N. Chudowsky, & R. Glaser (Eds.), Washington: National Academy Press.
Nunnally, J.C., & Bernstein, I.H. (1994). Psychometric theory (3rd ed.). New York: McGraw-Hill.
Patton, M.Q. (1980). Qualitative evaluation methods. Beverly Hills: Sage.
Pirolli, P., & Wilson, M. (1998). A theory of the measurement of knowledge content, access, and learning. Psychological Review, 105(1), 58–82.
Rasch, G. (1961). On general laws and the meaning of measurement in psychology. In Proceedings of the fourth Berkeley symposium on mathematical statistics and probability (Vol. 4, pp. 321–334).
Rasch, G. (1980). Probabilistic models for some intelligence and attainment tests. Chicago: University of Chicago Press (original work published 1960).
Rost, J. (1990). Rasch models in latent classes: an integration of two approaches to item analysis. Applied Psychological Measurement, 14, 271–282.
Rupp, A.A., Templin, J., & Henson, R. (2010). Diagnostic measurement: theory, methods, and applications. New York: The Guilford Press.
Scalise, K., & Gifford, B.R. (2008). Innovative item types: intermediate constraint questions and tasks for computer-based testing. Paper presented at the national council on measurement in education (NCME), session on ‘Building adaptive and other computer-based tests’, in New York, May 2008.
Schwartz, R., Ayers, E., & Wilson, M. (2010). Modeling a multi-dimensional learning progression. Paper presented at the annual meeting of the American educational research association, Denver, CO, April 2010.
Siegler, R.S. (1981). Developmental sequences within and between concepts. Monographs of the Society for Research in Child Development, 46(2, Serial No. 189).
Spiegelhalter, D.J., Best, N.G., Carlin, B.P., & van der Linde, A. (2002). Bayesian measures of model complexity and fit. Journal of the Royal Statistical Society, Series B, 64, 583–616.
Vermunt, J.K., & Magidson, J. (2007). Latent GOLD 4.5 syntax module (computer program). Belmont, MA: Statistical Innovations.
Wilson, M. (1989). Saltus: a psychometric model of discontinuity in cognitive development. Psychological Bulletin, 105(2), 276–289.
Wilson, M. (2005). Constructing measures: an item response modeling approach. Mahwah: Lawrence Erlbaum Associates.
Wilson, M. (2009). Measuring progressions: assessment structures underlying a learning progression. Journal of Research in Science Teaching, 46(6), 716–730.
Wilson, M. (2012). Responding to a challenge that learning progressions pose to measurement practice: hypothesized links between dimensions of the outcome progression. In A.C. Alonzo & A.W. Gotwals (Eds.), Learning progressions in science. Rotterdam: Sense Publishers.
Acknowledgements
Many colleagues have contributed to the thoughts and ideas presented in this paper; unfortunately, I cannot acknowledge all of you. Hence, I restrict my acknowledgements to two groups. First, those who commented on drafts of the text: Ronli Diakow, Paul De Boeck, Karen Draney, Andy Maul, Roger Millsap, and David Torres Irribarra. Second, those who worked directly on the examples used in the text: for the saltus example, Karen Draney and Bob Mislevy; for the ADM example, Beth Ayers, Kristen Burmester, Tzur Karelitz, Rich Lehrer, David Torres Irribarra, Kavita Seeratan, and Bob Schwartz; and for the SCM example, Ronli Diakow and David Torres Irribarra. Any errors or omissions are, of course, the responsibility of the author.
Appendix: Publications Related to the Saltus Model (in Chronological Order)
21. Draney, K., & Jeon, M. (2011). Investigating the saltus model as a tool for setting standards. Psychological Test and Assessment Modeling, 53(4), 486–498.
20. Draney, K., Wilson, M., Gluck, J., & Spiel, C. (2008). Mixture models in a developmental context. In G.R. Hancock & K.M. Samuelson (Eds.), Advances in latent variable mixture models (pp. 199–216). Charlotte: Information Age Publishing.
19. Draney, K., & Wilson, M. (2007). Application of the saltus model to stage-like data: some applications and current developments. In M. von Davier & C. Carstensen (Eds.), Multivariate and mixture distribution Rasch models (pp. 119–130). New York: Springer.
18. Draney, K. (2007). Understanding Rasch measurement: the saltus model applied to proportional reasoning data. Journal of Applied Measurement, 8.
17. Demetriou, A., & Kyriakides, L. (2006). The functional and developmental organization of cognitive developmental sequences. British Journal of Educational Psychology, 76(2), 209–242.
16. Acton, G.S., Kunz, J.D., Wilson, M., & Hall, S.M. (2005). The construct of internalization: conceptualization, measurement, and prediction of smoking treatment outcome. Psychological Medicine, 35, 395–408.
15. De Boeck, P., Wilson, M., & Acton, G.S. (2005). A conceptual and psychometric framework for distinguishing categories and dimensions. Psychological Review, 112(1), 129–158.
14. Draney, K., & Wilson, M. (2004). Application of the polytomous saltus model to stage-like data. In A. van der Ark, M. Croon, & K. Sijtsma (Eds.), New developments in categorical data analysis for the social and behavioral sciences. Mahwah: Erlbaum.
13. Fieuws, S., Spiessens, B., & Draney, K. (2004). Mixture models. In P. De Boeck & M. Wilson (Eds.), Explanatory item response models: a generalized linear and nonlinear approach (pp. 317–340). New York: Springer.
12. Pirolli, P., & Wilson, M. (1998). A theory of the measurement of knowledge content, access, and learning. Psychological Review, 105(1), 58–82.
11. Wilson, M., & Draney, K. (1997). Partial credit in a developmental context: the case for adopting a mixture model approach. In M. Wilson, G. Engelhard, & K. Draney (Eds.), Objective measurement IV: theory into practice (pp. 333–350). Norwood: Ablex.
10. Draney, K.L., & Wilson, M. (1997). PC-saltus [computer program]. BEAR Center Research Report, UC Berkeley.
9. Mislevy, R.J., & Wilson, M. (1996). Marginal maximum likelihood estimation for a psychometric model of discontinuous development. Psychometrika, 61(1), 41–71.
8. Draney, K.L. (1996). The polytomous saltus model: a mixture model approach to the diagnosis of developmental differences. Unpublished doctoral dissertation, UC Berkeley.
7. Wilson, M. (1994). Measurement of developmental levels. In T. Husen & T.N. Postlethwaite (Eds.), International encyclopedia of education (2nd ed., pp. 1508–1514). Oxford: Pergamon Press.
6. Wilson, M. (1993). The “Saltus model” misunderstood. Methodika, 7, 1–4.
5. Wilson, M. (1990). Measurement of developmental levels. In T. Husen & T.N. Postlethwaite (Eds.), International encyclopedia of education: research and studies. Supplementary volume 2. Oxford: Pergamon Press.
4. Demetriou, A., & Efklides, A. (1989). The person’s conception of the structures of developing intellect: early adolescence to middle age. Genetic, Social, and General Psychology Monographs, 115, 371–423.
3. Wilson, M. (1989). Saltus: a psychometric model of discontinuity in cognitive development. Psychological Bulletin, 105(2), 276–289.
2. Wilson, M. (1985). Measuring stages of growth, ACER occasional paper, No. 19. Melbourne, Australia: ACER.
1. Wilson, M. (1984). A psychometric model of hierarchical development. Unpublished doctoral dissertation, University of Chicago.
Cite this article
Wilson, M. Seeking a Balance Between the Statistical and Scientific Elements in Psychometrics. Psychometrika 78, 211–236 (2013). https://doi.org/10.1007/s11336-013-9327-3
DOI: https://doi.org/10.1007/s11336-013-9327-3