More flexible response functions for the PROMIS physical functioning item bank by application of a monotonic polynomial approach

Falk, Carl F.; Fischer, Felix

doi:10.1007/s11136-021-02873-7

Special Section: Non-parametric IRT
Published: 27 May 2021

Quality of Life Research, Volume 31, pages 37–47 (2022)

Abstract

Purpose

In developing item banks for patient-reported outcomes (PROs), nonparametric techniques are often used to investigate empirical item response curves, whereas final banks usually rely on parsimonious parametric models. A flexible approach based on monotonic polynomials (MPs) provides a compromise by modeling items with both complex and simpler response curves. This paper investigates the suitability of MPs for PRO data.
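
To make the idea of a monotonic polynomial response curve concrete, the following is a minimal sketch, not the authors' implementation. It assumes one parameterization found in the MP literature, in which the derivative of the polynomial is a positive constant times a quadratic with no real roots, and it plugs the resulting cubic into graded-response-style cumulative logits. The function names, parameter names (`xi`, `lam`, `alpha`, `tau`), and numeric values are hypothetical.

```python
import math


def monotonic_poly(theta, xi, lam, alpha, tau):
    """Cubic monotonic polynomial m(theta) (illustrative parameterization).

    Its derivative is lam * (1 - 2*alpha*theta + (alpha**2 + exp(tau)) * theta**2).
    The quadratic has discriminant -4*exp(tau) < 0, so it is always positive
    and m is strictly increasing whenever lam > 0.
    """
    beta = math.exp(tau)
    return xi + lam * (theta - alpha * theta**2 + (alpha**2 + beta) * theta**3 / 3.0)


def logistic(x):
    return 1.0 / (1.0 + math.exp(-x))


def category_probs(theta, intercepts, lam, alpha, tau):
    """Category probabilities with m(theta) replacing the usual linear predictor.

    `intercepts` must be strictly decreasing so that the cumulative
    probabilities P(X >= j | theta) decrease in j.
    """
    m = monotonic_poly(theta, 0.0, lam, alpha, tau)
    cumulative = [1.0] + [logistic(c + m) for c in intercepts] + [0.0]
    return [cumulative[j] - cumulative[j + 1] for j in range(len(cumulative) - 1)]


# Hypothetical five-category item evaluated at theta = 0.
probs = category_probs(0.0, [2.0, 0.5, -1.0, -2.5], lam=1.2, alpha=0.3, tau=-1.0)
print([round(p, 3) for p in probs])
```

Under this assumed form, a degree-one polynomial (no quadratic factor in the derivative) gives back the usual linear predictor, which is why the approach can cover both simple and complex response curves.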

Method

Using PROMIS Wave 1 data (N = 15,725) for Physical Function, we fitted an MP model and the graded response model (GRM). We compared both models in terms of overall model fit, latent trait estimates, and item/test information. We quantified possible GRM item misfit using approaches that compute discrepancies with the MP. Through simulations, we investigated the ability of the MP to perform well versus the GRM under identical data collection conditions.
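
One generic way to quantify how far a parametric response curve sits from a more flexible one is a density-weighted root mean squared difference between the two expected item score curves. The sketch below illustrates that idea only; it is not necessarily the statistic used in the paper, and the item parameters, the grid settings, and the function names `expected_score` and `weighted_rmsd` are hypothetical.

```python
import math


def logistic(x):
    return 1.0 / (1.0 + math.exp(-x))


def expected_score(theta, intercepts, predictor):
    """Expected item score for categories scored 0..K-1.

    Equals the sum over j = 1..K-1 of P(X >= j | theta).
    """
    return sum(logistic(c + predictor(theta)) for c in intercepts)


def weighted_rmsd(curve_a, curve_b, lo=-6.0, hi=6.0, n=241):
    """Root mean squared difference of two curves, weighted by a N(0, 1) density.

    Uses an equally spaced grid; the weights are normalized to sum to one.
    """
    step = (hi - lo) / (n - 1)
    nodes = [lo + i * step for i in range(n)]
    weights = [math.exp(-0.5 * t * t) for t in nodes]
    total = sum(weights)
    squared = sum(w * (curve_a(t) - curve_b(t)) ** 2 for w, t in zip(weights, nodes))
    return math.sqrt(squared / total)


# Hypothetical item: same intercepts, linear versus cubic monotonic predictor.
intercepts = [2.0, 0.5, -1.0, -2.5]


def linear(t):
    return 1.2 * t


def cubic(t):
    return 1.2 * (t - 0.3 * t**2 + (0.3**2 + math.exp(-1.0)) * t**3 / 3.0)


rmsd = weighted_rmsd(
    lambda t: expected_score(t, intercepts, linear),
    lambda t: expected_score(t, intercepts, cubic),
)
print(round(rmsd, 4))
```

Weighting by the latent density keeps the comparison focused on trait levels where respondents are actually expected, so large differences in the far tails contribute little.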

Results

A likelihood ratio test (p < 0.001) and AIC (but not BIC) indicated better fit for the MP. Latent trait estimates and expected test scores were comparable between models, but we observed higher information for the MP in the lower range of physical functioning. Many items were flagged as possibly misfitting, and simulations supported the performance of the MP. Yet discrepancies between the MP and GRM were small.
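
A likelihood ratio test and AIC can favor the larger model while BIC does not, because the BIC penalty per parameter is ln(N), roughly 9.66 for N = 15,725, against 2 for AIC. The sketch below shows the arithmetic with made-up log-likelihoods and parameter counts; only the sample size comes from the abstract. The closed-form chi-square tail used here is valid for even degrees of freedom only, which is an assumption of this example.

```python
import math


def aic(loglik, n_params):
    return 2.0 * n_params - 2.0 * loglik


def bic(loglik, n_params, n_obs):
    return math.log(n_obs) * n_params - 2.0 * loglik


def lrt_statistic(loglik_restricted, loglik_general):
    return 2.0 * (loglik_general - loglik_restricted)


def chi2_sf_even_df(x, df):
    """Upper-tail chi-square probability; closed form for even df only."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("df must be a positive even integer")
    half = x / 2.0
    term, total = 1.0, 1.0
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total


# Hypothetical fit results (NOT the values reported in the paper).
n_obs = 15725                         # sample size stated in the abstract
ll_grm, k_grm = -1000.0, 100          # restricted model
ll_mp, k_mp = -950.0, 120             # more general model

stat = lrt_statistic(ll_grm, ll_mp)
p_value = chi2_sf_even_df(stat, k_mp - k_grm)

print("LRT statistic:", stat, "p < 0.001:", p_value < 0.001)
print("AIC prefers MP:", aic(ll_mp, k_mp) < aic(ll_grm, k_grm))
print("BIC prefers MP:", bic(ll_mp, k_mp, n_obs) < bic(ll_grm, k_grm, n_obs))
```

With these illustrative numbers the likelihood ratio test and AIC prefer the more general model, while BIC prefers the restricted one, matching the pattern described above.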

Conclusion

The MP approach allows inclusion of items with complex response curves into PRO item banks. Information for the physical functioning item bank may be greater than originally thought for low levels of physical functioning. This may translate into small improvements if an MP approach is used.

Data availability

As described in the method section, data used in this manuscript are available in the public domain.

Code availability

Examples of estimation of the monotonic polynomial model are available in Supplementary Materials.

Notes

For a recent discussion on the merits of collapsing categories, see Harel and Steele [25].
Estimation options were changed slightly to increase computational speed and are described in Supplementary Materials.

References

Lord, F. M., & Novick, M. R. (1968). Statistical theories of mental test scores. Addison-Wesley.
Embretson, S. E., & Reise, S. P. (2000). Item response theory for psychologists. Lawrence Erlbaum Associates.
Fries, J. F., Bruce, B., & Cella, D. (2005). The promise of PROMIS: Using item response theory to improve assessment of patient-reported outcomes. Clinical and Experimental Rheumatology, 23(5 Suppl 39), S53–S57.
Choi, S. W., Schalet, B., Cook, K. F., & Cella, D. (2014). Establishing a common metric for depressive symptoms: Linking the BDI-II, CES-D, and PHQ-9 to PROMIS depression. Psychological Assessment, 26, 513–527. https://doi.org/10.1037/a0035768
Samejima, F. (1969). Estimation of latent ability using a response pattern of graded scores. Psychometric Monographs. https://doi.org/10.1002/j.2333-8504.1968.tb00153.x
Samejima, F. (1972). A general model of free-response data. Psychometric Monographs No. 18. Psychometric Society.
Samejima, F. (2010). The general graded response model. In M. Nering & R. Ostini (Eds.), Handbook of polytomous item response theory models: Developments and applications (pp. 77–107). Taylor & Francis.
Rose, M., Bjorner, J. B., Gandek, B., Bruce, B., Fries, J. F., & Ware, J. E. (2014). The PROMIS physical function item bank was calibrated to a standardized metric and shown to improve measurement efficiency. Journal of Clinical Epidemiology, 67, 516–526. https://doi.org/10.1016/j.jclinepi.2013.10.024
Meijer, R. R., & Baneke, J. J. (2004). Analyzing psychopathology items: A case for nonparametric item response theory modeling. Psychological Methods, 9, 354–368. https://doi.org/10.1037/1082-989X.9.3.354
Patient-Reported Outcomes Measurement Information System. (2013). PROMIS instrument development and validation scientific standards version 2.0. Retrieved from http://www.healthmeasures.net/images/PROMIS/PROMISStandards_Vers2.0_Final.pdf
Falk, C. F., & Cai, L. (2016). Semi-parametric item response functions in the context of guessing. Journal of Educational Measurement, 53, 229–247. https://doi.org/10.1111/jedm.12111
Wells, C. S., & Bolt, D. M. (2008). Investigation of a nonparametric procedure for assessing goodness-of-fit in item response theory. Applied Measurement in Education, 21, 22–40. https://doi.org/10.1080/08957340701796464
Falk, C. F. (2019). Model selection for monotonic polynomial item response models. Quantitative psychology: The 83rd Annual Meeting of the Psychometric Society, New York, NY, 2018 (pp. 75–85). Springer. https://doi.org/10.1007/978-3-030-01310-3_7
Falk, C. F. (2020). The monotonic polynomial graded response model: Implementation and a comparative study. Applied Psychological Measurement, 44, 465–481. https://doi.org/10.1177/0146621620909897
Falk, C. F., & Cai, L. (2016). Maximum marginal likelihood estimation of a monotonic polynomial generalized partial credit model with applications to multiple group analysis. Psychometrika, 81, 434–460. https://doi.org/10.1007/s11336-014-9428-7
Liang, L., & Browne, M. W. (2015). A quasi-parametric method for fitting flexible item response functions. Journal of Educational and Behavioral Statistics, 40, 5–34. https://doi.org/10.3102/1076998614556816
Cohen, J., Cohen, P., West, S. G., & Aiken, L. S. (2003). Applied multiple regression/correlation analysis for the behavioral sciences (3rd ed.). Lawrence Erlbaum Associates.
Feuerstahler, L. M. (2016). Exploring alternate latent trait metrics with filtered monotonic polynomial IRT models (PhD thesis). Department of Psychology, University of Minnesota.
Bock, R. D., & Aitkin, M. (1981). Marginal maximum likelihood estimation of item parameters: Application of an EM algorithm. Psychometrika, 46, 443–459. https://doi.org/10.1007/BF02293801
Mislevy, R. J. (1986). Bayes modal estimation in item response models. Psychometrika, 51, 177–195. https://doi.org/10.1007/BF02293979
Feuerstahler, L. M. (2019). Metric transformations and the filtered monotonic polynomial item response model. Psychometrika, 84, 105–123. https://doi.org/10.1007/s11336-018-9642-9
Choi, S. W., Reise, S. P., Pilkonis, P., Hays, R. D., & Cella, D. (2010). Efficiency of static and computer adaptive short forms compared to full-length measures of depressive symptoms. Quality of Life Research, 19, 125–136. https://doi.org/10.1007/s11136-009-9560-5
Cella, D. (2015). PROMIS 1 wave 1. Harvard Dataverse. https://doi.org/10.7910/DVN/0NGAKG.
Liu, H. H., Cella, D., Gershon, R., Shen, J., Morales, L. S., Riley, W., & Hays, R. D. (2010). Representativeness of the PROMIS internet panel. Journal of Clinical Epidemiology, 63, 1169–1178. https://doi.org/10.1016/j.jclinepi.2009.11.021
Harel, D., & Steele, R. J. (2018). An information matrix test for the collapsing of categories under the partial credit model. Journal of Educational and Behavioral Statistics, 43, 721–750.
Santor, D. A., Ramsay, J. O., & Zuroff, D. C. (1994). Nonparametric item analyses of the Beck depression inventory: Evaluating gender item bias and response option weights. Psychological Assessment, 6, 255–270. https://doi.org/10.1037/1040-3590.6.3.255
Rose, M., Bjorner, J. B., Becker, J., Fries, J. F., & Ware, J. E. (2008). Evaluation of a preliminary physical function item bank supported the expected advantages of the patient-reported outcomes measurement information system (PROMIS). Journal of Clinical Epidemiology, 61, 17–33. https://doi.org/10.1016/j.jclinepi.2006.06.025
Sijtsma, K., & van der Ark, L. A. (2003). Investigation and treatment of missing item scores in test and questionnaire data. Multivariate Behavioral Research, 38, 505–528. https://doi.org/10.1207/s15327906mbr3804_4
van der Ark, L. A., & Sijtsma, K. (2005). The effect of missing data imputation on Mokken scale analysis. In L. A. van der Ark, M. A. Croon, & K. Sijtsma (Eds.), New developments in categorical data analysis for the social and behavioral sciences (pp. 147–166). Lawrence Erlbaum.
van Ginkel, J. R., van der Ark, L. A., & Sijtsma, K. (2007). Multiple imputation of item scores in test and questionnaire data, and influence on psychometric results. Multivariate Behavioral Research, 42, 387–414. https://doi.org/10.1080/00273170701360803
Wind, S. A., & Patil, Y. J. (2018). Exploring incomplete rating designs with Mokken scale analysis. Educational and Psychological Measurement, 78, 319–342. https://doi.org/10.1177/0013164416675393
Neale, M. C., Hunter, M. D., Pritikin, J. N., Zahery, M., Brick, T. R., Kirkpatrick, R. M., Estabrook, R., Bates, T. C., Maes, H. H., & Boker, S. M. (2016). OpenMx 2.0: Extended structural equation and statistical modeling. Psychometrika, 81, 535–549. https://doi.org/10.1007/s11336-014-9435-8
Pritikin, J. N., Hunter, M. D., & Boker, S. M. (2015). Modular open-source software for item factor analysis. Educational and Psychological Measurement, 75, 458–475. https://doi.org/10.1177/0013164414554615
Pritikin, J. N. (2016). Rpf: Response probability functions. Retrieved from https://CRAN.R-project.org/package=rpf
Bock, R. D., & Mislevy, R. J. (1982). Adaptive EAP estimation of ability in a microcomputer environment. Applied Psychological Measurement, 6, 431–444. https://doi.org/10.1177/014662168200600405
Chalmers, R. P. (2018). Model-based measures for detecting and quantifying response bias. Psychometrika, 83, 696–732. https://doi.org/10.1007/s11336-018-9626-9
Chalmers, R. P., Counsell, A., & Flora, D. B. (2016). It might not make a big DIF: Improved differential test functioning statistics that account for sampling variability. Educational and Psychological Measurement, 76, 114–140. https://doi.org/10.1177/0013164415584576
Edelen, M. O., Stucky, B. D., & Chandra, A. (2015). Quantifying “problematic” DIF within an IRT framework: Application to a cancer stigma index. Quality of Life Research, 24, 95–103. https://doi.org/10.1007/s11136-013-0540-4
Organization for Economic Cooperation and Development. (2017). PISA 2015 technical report. Organization for Economic Cooperation and Development.
Waller, N. G., & Feuerstahler, L. (2017). Bayesian modal estimation of the four-parameter item response model in real, realistic, and idealized data sets. Multivariate Behavioral Research, 52, 350–370. https://doi.org/10.1080/00273171.2017.1292893
Feuerstahler, L. M. (2018). Sources of error in IRT trait estimation. Applied Psychological Measurement, 42, 359–375. https://doi.org/10.1177/0146621617733955
Bolt, D. M. (2002). A Monte Carlo comparison of parametric and nonparametric polytomous DIF detection methods. Applied Measurement in Education, 15, 113–141. https://doi.org/10.1207/S15324818AME1502_01
Douglas, J., & Cohen, A. (2001). Nonparametric item response function estimation for assessing parametric model fit. Applied Psychological Measurement, 25, 234–243. https://doi.org/10.1177/01466210122032046
Liang, T., & Wells, C. S. (2009). A model fit statistic for generalized partial credit model. Educational and Psychological Measurement, 69, 913–928. https://doi.org/10.1177/0013164409332222
Liang, T., & Wells, C. S. (2015). A nonparametric approach for assessing goodness-of-fit of IRT models in a mixed format test. Applied Measurement in Education, 28, 115–129. https://doi.org/10.1080/08957347.2014.1002918
Maydeu-Olivares, A. (2005). Further empirical results on parametric versus nonparametric IRT modeling of Likert-type personality data. Multivariate Behavioral Research, 40, 261–279.
R Core Team. (2020). R: A language and environment for statistical computing. R Foundation for Statistical Computing.

Acknowledgements

We acknowledge the support of a research grant from the Fonds de recherche du Québec – Nature et technologies [2019-NC-255344] to the first author.

Author information

Authors and Affiliations

Department of Psychology, McGill University, 2001 McGill College, 7th Floor, Montreal, QC, H3A 1G1, Canada
Carl F. Falk
Charité – Universitätsmedizin Berlin, Corporate Member of Freie Universität Berlin and Humboldt-Universität zu Berlin, Department for Psychosomatic Medicine, Charitéplatz 1, 10117, Berlin, Germany
Felix Fischer
Berlin Institute of Health at Charité – Universitätsmedizin Berlin, Clinical Study Center, German PROMIS® National Center, Charitéplatz 1, 10117, Berlin, Germany
Felix Fischer

Corresponding author

Correspondence to Carl F. Falk.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Below is the link to the electronic supplementary material.

Electronic supplementary material 1 (PDF 799 kb)

Electronic supplementary material 2 (R 17 kb)

Electronic supplementary material 3 (R 4 kb)

Electronic supplementary material 4 (R 6 kb)

About this article

Cite this article

Falk, C.F., Fischer, F. More flexible response functions for the PROMIS physical functioning item bank by application of a monotonic polynomial approach. Qual Life Res 31, 37–47 (2022). https://doi.org/10.1007/s11136-021-02873-7

Accepted: 04 May 2021
Published: 27 May 2021
Issue Date: January 2022
DOI: https://doi.org/10.1007/s11136-021-02873-7
