Problems in the estimation and interpretation of the reliability of survey data

Quality and Quantity

Abstract

In this paper I discuss several of the difficulties involved in estimating the reliability of survey measurement. Reliability is defined on the basis of classical true-score theory, as the correlational consistency of multiple measures of the same construct, net of true change. This concept is presented within the framework of a theoretical discussion of the sources of error in survey data and the design requirements for separating response variation into components representing such response consistency and measurement errors. Discussion focuses on the potential sources of random and nonrandom errors, including “invalidity” of measurement, the term frequently used to refer to components of method variance. Problems with the estimation of these components are enumerated and discussed with respect to both cross-sectional and panel designs. Empirical examples are given of the estimation of the quantities of interest, which are the basis of a discussion of the interpretational difficulties encountered in reliability estimation. Data are drawn from the ISR's Quality of Life surveys, the National Election Studies and the NORC's General Social Surveys. The general conclusion is that both cross-sectional and panel estimates of measurement reliability are desirable, but for the purposes of isolating the random component of error, panel designs are probably the most advantageous.
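The definition of reliability used in the abstract can be written out as a short worked derivation. This is a minimal sketch under the standard assumptions of classical true-score theory, not a reproduction of the article's own notation: the symbols $y$, $\tau$, $\varepsilon$, $\rho$, $s$ and $r$ are chosen here for illustration. The last line is the usual three-wave panel result, which additionally assumes equal reliability at each wave, uncorrelated errors, and lag-1 (Markov) change in the true scores.

```latex
\begin{align}
  % Observed response = true score + random error, with the two uncorrelated
  y &= \tau + \varepsilon, \qquad \operatorname{Cov}(\tau, \varepsilon) = 0 \\
  % Observed variance therefore splits into true and error components
  \operatorname{Var}(y) &= \operatorname{Var}(\tau) + \operatorname{Var}(\varepsilon) \\
  % Reliability: the share of observed variance due to true scores
  \rho &= \frac{\operatorname{Var}(\tau)}{\operatorname{Var}(y)}
        = 1 - \frac{\operatorname{Var}(\varepsilon)}{\operatorname{Var}(y)} \\
  % Two parallel measures (same true score, equal error variance,
  % uncorrelated errors) correlate at exactly the reliability
  \operatorname{Corr}(y_1, y_2) &= \rho \\
  % Three-wave panel: r_{jk} = \rho\, s_{jk} and s_{13} = s_{12} s_{23},
  % where s_{jk} is the true-score correlation between waves j and k
  \rho &= \frac{r_{12}\, r_{23}}{r_{13}}
\end{align}
```

The panel expression shows why repeated measurement can separate unreliability from true change: the stability terms $s_{12}$ and $s_{23}$ cancel in the ratio, so the estimate of $\rho$ is net of true change as long as the stated assumptions hold.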

Cite this article

Alwin, D.F. Problems in the estimation and interpretation of the reliability of survey data. Qual Quant 23, 277–331 (1989). https://doi.org/10.1007/BF00172447
