Differential Item Functioning Depending on General Covariates

Essays on Item Response Theory

Part of the book series: Lecture Notes in Statistics ((LNS,volume 157))

Abstract

Item response theory (IRT) is a powerful tool for the detection of differential item functioning (DIF). It is shown that the class of IRT models with manifest predictors is a comprehensive framework for the detection of DIF. These models also support the investigation of the causes of DIF. In principle, the responses to every item in a test can be subject to DIF, and traditional IRT-based detection methods require one or more estimation runs for every single item. Therefore, Glas (1998) proposed an alternative procedure that can be performed using only a single estimate of the item parameters. This procedure is based on the Lagrange multiplier test or the equivalent Rao efficient score test. In this chapter, the procedure is generalized in various directions, the most important one being the possibility of conditioning on general covariates. A small simulation study is presented to give an impression of the power of the test. In an example using real data it is shown how the method can be applied to the identification of main and interaction effects in DIF.
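
The idea behind a score-type test can be illustrated with a deliberately simplified sketch. It is not the chapter's procedure: the abilities `theta` are treated as known rather than latent, a single Rasch-type item is tested, and the function name `lm_dif_test`, the simulated data, and all parameter values are assumptions made for illustration. The point it shows is that only the null model (no covariate effect) is fitted, and the statistic is built from the score and information evaluated at that single estimate.

```python
import numpy as np

def lm_dif_test(y, theta, X, n_iter=25):
    """Rao score (Lagrange multiplier) statistic for H0: no covariate effect.

    Simplifying assumption: theta is known. Null model for one item:
    logit P(y = 1) = theta + beta0. Alternative adds X @ delta to the logit.
    """
    y = np.asarray(y, dtype=float)
    theta = np.asarray(theta, dtype=float)
    X = np.asarray(X, dtype=float).reshape(len(y), -1)

    # Fit only the null model: Newton-Raphson for the item intercept beta0.
    beta0 = 0.0
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(theta + beta0)))
        w = p * (1.0 - p)
        beta0 += np.sum(y - p) / np.sum(w)

    p = 1.0 / (1.0 + np.exp(-(theta + beta0)))
    w = p * (1.0 - p)

    # Score for the covariate effects delta, evaluated at delta = 0.
    h = X.T @ (y - p)

    # Information blocks; W is the information for delta adjusted for beta0.
    i11 = np.sum(w)
    i12 = X.T @ w
    i22 = X.T @ (X * w[:, None])
    W = i22 - np.outer(i12, i12) / i11

    stat = float(h @ np.linalg.solve(W, h))
    return stat, X.shape[1]  # statistic and degrees of freedom

# Hypothetical simulated data: one item, one binary covariate (group).
rng = np.random.default_rng(0)
n = 2000
theta = rng.normal(size=n)
group = rng.integers(0, 2, size=n)

def simulate(delta, b=0.5):
    logits = theta - b + delta * group
    return (rng.random(n) < 1.0 / (1.0 + np.exp(-logits))).astype(float)

stat_null, df = lm_dif_test(simulate(delta=0.0), theta, group)
stat_dif, _ = lm_dif_test(simulate(delta=1.0), theta, group)
print(f"no DIF:   LM = {stat_null:.2f} (df = {df})")
print(f"with DIF: LM = {stat_dif:.2f} (df = {df})")
```

Under the null hypothesis the statistic is asymptotically chi-square distributed with as many degrees of freedom as there are covariate effects tested, so with one covariate it would be compared with 3.841 at the 5% level. Because `X` may have several columns, continuous covariates or interaction terms can be passed in the same way.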

References

  • Aitchison, J., & Silvey, S.D. (1958). Maximum likelihood estimation of parameters subject to restraints. Annals of Mathematical Statistics 29, 813–828.

  • Bock, R.D., & Aitkin, M. (1981). Marginal maximum likelihood estimation of item parameters: An application of an EM-algorithm. Psychometrika, 46, 443–459.

  • Bock, R.D., & Zimowski, M.F. (1997). Multiple group IRT. In W.J. van der Linden & R.K. Hambleton (Eds.), Handbook of modern item response theory (pp. 433–448). New York: Springer-Verlag.

  • Camilli, G., & Shepard, L.A. (1994). Methods for identifying biased test items. Thousand Oaks, CA: Sage.

  • Cox, D.R., & Hinkley, D.V. (1974). Theoretical statistics. London: Chapman and Hall.

  • Cressie, N., & Holland, P.W. (1983). Characterizing the manifest probabilities of latent trait models. Psychometrika, 48, 129–141.

  • Efron, B. (1977). Discussion on maximum likelihood from incomplete data via the EM algorithm (by A. Dempster, N. Laird, and D. Rubin). Journal of the Royal Statistical Society, Series B, 39, 1–38.

  • Fischer, G.H. (1993). Notes on the Mantel-Haenszel procedure and another chi-square test for the assessment of DIF. Methodika, 7, 88–100.

  • Fischer, G.H. (1995). Some neglected problems in IRT. Psychometrika, 60, 459–487.

  • Glas, C.A.W. (1998). Detection of differential item functioning using Lagrange multiplier tests. Statistica Sinica, 8, 647–667.

  • Glas, C.A.W. (1999). Modification indices for the 2-PL and the nominal response model. Psychometrika, 64, 273–294.

  • Glas, C.A.W., & Verhelst, N.D. (1995). Tests of fit for polytomous Rasch models. In G.H. Fischer & I.W. Molenaar (Eds.), Rasch models: Foundations, recent developments, and applications (pp. 325–352). New York: Springer-Verlag.

  • Hambleton, R.K., & Rogers, H.J. (1989). Detecting potentially biased test items: Comparison of IRT area and Mantel-Haenszel methods. Applied Measurement in Education, 2, 313–334.

  • Holland, P.W., & Thayer, D.T. (1988). Differential item functioning and the Mantel-Haenszel procedure. In H. Wainer & H.I. Braun (Eds.), Test validity (pp. 129–145). Hillsdale, NJ: Erlbaum.

  • Holland, P.W., & Wainer, H. (1993). Differential item functioning. Hillsdale, NJ: Erlbaum.

  • Jansen, M.G.H., & Glas, C.A.W. (2001). Statistical tests for differential test functioning in Rasch’s model for speed tests. In A. Boomsma, M.A.J. van Duijn, & T.A.B. Snijders (Eds.), Essays on item response theory (pp. 149–162). New York: Springer-Verlag.

  • Kelderman, H. (1989). Item bias detection using loglinear IRT. Psychometrika, 54, 681–697.

  • Kok, F.G., Mellenbergh, G.J., & van der Flier, H. (1985). Detecting experimentally induced item bias using the iterative logit method. Journal of Educational Measurement, 22, 295–303.

  • Lord, F.M. (1980). Applications of item response theory to practical testing problems. Hillsdale, NJ: Erlbaum.

  • Louis, T.A. (1982). Finding the observed information matrix when using the EM algorithm. Journal of the Royal Statistical Society, Series B, 44, 226–233.

  • Mantel, N., & Haenszel, W. (1959). Statistical aspects of the analysis of data from retrospective studies of disease. Journal of the National Cancer Institute, 22, 719–748.

  • McCullagh, P., & Nelder, J. (1989). Generalized linear models (2nd ed.). London: Chapman and Hall.

  • Meijer, R.R., & Van Krimpen-Stoop, E.M.L.A. (2001). Person fit across subgroups: An achievement testing example. In A. Boomsma, M.A.J. van Duijn, & T.A.B. Snijders (Eds.), Essays on item response theory (pp. 377–390). New York: Springer-Verlag.

  • Meredith, W., & Millsap, R.E. (1992). On the misuse of manifest variables in the detection of measurement bias. Psychometrika, 57, 289–311.

  • Mislevy, R.J. (1984). Estimating latent distributions. Psychometrika, 49, 359–381.

  • Mislevy, R.J. (1986). Bayes modal estimation in item response models. Psychometrika, 51, 177–195.

  • Muraki, E., & Bock, R.D. (1991). PARSCALE: Parameter scaling of rating data [Computer software]. Chicago: Scientific Software.

  • Rao, C.R. (1947). Large sample tests of statistical hypotheses concerning several parameters with applications to problems of estimation. Proceedings of the Cambridge Philosophical Society, 44, 50–57.

  • Rao, C.R. (1973). Linear statistical inference and its applications. New York: Wiley.

  • Rigdon, S.E., & Tsutakawa, R.K. (1983). Parameter estimation in latent trait models. Psychometrika, 48, 567–574.

  • Rogers, H.J., Swaminathan, H., & Egan, K. (1999, April). A multi-level approach for investigating differential item functioning. Paper presented at the Annual Meeting of the NCME, Montreal.

  • Swaminathan, H., & Rogers, H.J. (1990). Detecting differential item functioning using logistic regression procedures. Journal of Educational Measurement, 27, 361–370.

  • Thissen, D., Steinberg, L., & Wainer, H. (1993). Detection of differential item functioning using the parameters of IRT models. In P.W. Holland & H. Wainer (Eds.), Differential item functioning (pp. 67–113). Hillsdale, NJ: Erlbaum.

  • Wang, W.-C. (in press). Modeling effects of differential item functioning in polytomous items. Journal of Outcome Measurement.

  • Zieky, M. (1993). Practical questions in the use of DIF statistics in test development. In P.W. Holland & H. Wainer (Eds.), Differential item functioning (pp. 337–347). Hillsdale, NJ: Erlbaum.

  • Zimowski, M.F., Muraki, E., Mislevy, R.J., & Bock, R.D. (1996). BILOG-MG: Multiple-group IRT analysis and test maintenance for binary items [Computer software]. Chicago: Scientific Software.

  • Zwinderman, A.H. (1991). A generalized Rasch model for manifest predictors. Psychometrika, 56, 589–600.

  • Zwinderman, A.H. (1997). Response models with manifest predictors. In W.J. van der Linden & R.K. Hambleton (Eds.), Handbook of modern item response theory (pp. 245–256). New York: Springer-Verlag.

Copyright information

© 2001 Springer Science+Business Media New York

About this chapter

Cite this chapter

Glas, C.A.W. (2001). Differential Item Functioning Depending on General Covariates. In: Boomsma, A., van Duijn, M.A.J., Snijders, T.A.B. (eds) Essays on Item Response Theory. Lecture Notes in Statistics, vol 157. Springer, New York, NY. https://doi.org/10.1007/978-1-4613-0169-1_7

  • DOI: https://doi.org/10.1007/978-1-4613-0169-1_7

  • Publisher Name: Springer, New York, NY

  • Print ISBN: 978-0-387-95147-8

  • Online ISBN: 978-1-4613-0169-1

  • eBook Packages: Springer Book Archive
