Abstract
Null Hypothesis Significance Testing (NHST) has been a mainstay of the social sciences for empirically examining hypothesized relationships, and the main approach for establishing the importance of empirical results. NHST is the foundation of classical or frequentist statistics. The approach is designed to assess the probability of generating the observed data if no relationship exists between the dependent and independent variables of interest, recognizing that the results will vary from sample to sample. This paper evaluates the state of the criminological and criminal justice literature with respect to the correct application of NHST. We apply a modified version of the instrument used in two reviews of the economics literature by McCloskey and Ziliak to code 82 articles in criminology and criminal justice. The articles come from three sources: Criminology, Justice Quarterly, and a recent review of experiments in criminal justice by Farrington and Welsh. We find that most researchers provide the basic information necessary to understand effect sizes and analytical significance in tables that include descriptive statistics and some standardized measure of size (e.g., betas, odds ratios). On the other hand, few of the articles mention statistical power, and even fewer discuss the standards by which a finding would be considered large or small. Moreover, fewer than half of the articles distinguish between analytical significance and statistical significance, and most articles use the term ‘significance’ in ambiguous ways.
References (Asterisk indicates papers in sample.)
*Agnew, R. (2002). Experienced, vicarious, and anticipated strain: An exploratory study on physical victimization and delinquency. Justice Quarterly 19, 603–632.
*Agnew, R., Brezina, T., Wright, J. P. & Cullen, F. T. (2002). Strain, personality traits, and delinquency: Extending general strain theory. Criminology 40, 43–72.
*Alpert, G. P. & MacDonald, J. M. (2001). Police use of force: An analysis of organizational characteristics. Justice Quarterly 18, 393–409.
Anderson, D. R., Burnham, K. P. & Thompson, W. L. (2000). Null hypothesis testing: Problems, prevalence, and an alternative. Journal of Wildlife Management 64, 912–923.
APA (2001). Publication manual of the American Psychological Association, (5th edition), Washington, DC: American Psychological Association.
APA Task Force on Statistical Inference (1996, December). Task Force on Statistical Inference initial report. Washington, DC: American Psychological Association. Available: http://www.apa.org/science/tfsi.html.
*Armstrong, T. A. (2003). The effect of moral reconation therapy on the recidivism of youthful offenders: A randomized experiment. Criminal Justice and Behavior 30, 668–687.
Arrow, K. J. (1959). Decision theory and the choice of a level of significance for the t-test. In I. Olkin, S. G. Ghurye, W. Hoeffding, W. G. Madow & H. B. Mann (Eds.), Contributions to probability and statistics: Essays in honor of Harold Hotelling (pp. 70–78). Stanford, CA: Stanford University Press.
*Baller, R. D., Anselin, L., Messner, S. F., Deane, G. & Hawkins, D. F. (2001). Structural covariates of U.S. county homicide rates: Incorporating spatial effects. Criminology 39, 561–590.
*Baumer, E. P. (2002). Neighborhood disadvantage and police notification by victims of violence. Criminology 40, 579–616.
Berkson, J. (1938). Some difficulties of interpretation encountered in the application of the chi-square test. Journal of the American Statistical Association 33, 526–536.
*Bernburg, J. G. & Thorlindsson, T. (2001). Routine activities in social context: A closer look at the role of opportunity in delinquent behavior. Justice Quarterly 18, 543–568.
*Borduin, C. M., Mann, B. J., Cone, L. T., Henggeler, S. W., Fucci, B. R., Blaske, D. M. & Williams, R. A. (1995). Multisystemic treatment of serious juvenile offenders: Long-term prevention of criminality and violence. Journal of Consulting and Clinical Psychology 63, 569–578.
Boring, E. G. (1919). Mathematical vs. scientific importance. Psychological Bulletin 16, 335–338.
*Braga, A. A., Weisburd, D. L., Waring, E. J., Mazerolle, L. G., Spelman, W. & Gajewski, F. (1999). Problem-oriented policing in violent crime places: A randomized controlled experiment. Criminology 37, 541–580.
Brame, R., Paternoster, R., Mazerolle, P. & Piquero, A. (1998). Testing for the equality of maximum-likelihood regression coefficients between two independent equations. Journal of Quantitative Criminology 14, 245–261.
*Broidy, L. M. (2001). A test of general strain theory. Criminology 39, 9–36.
*Burruss, G. M. Jr., & Kempf-Leonard, K. (2002). The questionable advantage of defense counsel in juvenile court. Justice Quarterly 19, 37–68.
*Campbell, F. A., Ramey, C. T., Pungello, E., Sparling, J. & Miller-Johnson, S. (2002). Early childhood education: Young adult outcomes from the Abecedarian project. Applied Developmental Science 6, 42–57.
*Cernkovich, S. A., & Giordano, P. C. (2001). Stability and change in antisocial behavior: The transition from adolescence to early adulthood. Criminology 39, 371–410.
*Chermak, S., McGarrell, E. F. & Weiss, A. (2001). Citizens' perceptions of aggressive traffic enforcement strategies. Justice Quarterly 18, 365–392.
Cook, T. D., Gruder, C. L., Hennigan, K. M. & Flay, B. R. (1979). The history of the sleeper effect: Some logical pitfalls in accepting the null hypothesis. Psychological Bulletin 86, 662–679.
*Copes, H., Kerley, K. R., Mason, K. A. & Van Wyk, J. (2001). Reporting behavior of fraud victims and Black's theory of law: An empirical assessment. Justice Quarterly 18, 343–364.
Cumming, G. & Finch, S. (2001). A primer on the understanding, use and calculation of confidence intervals that are based on central and non-central distributions. Educational and Psychological Measurement 61, 532–575.
*Curry, G. D., Decker, S. H. & Egley, A. Jr. (2002). Gang involvement and delinquency in a middle school population. Justice Quarterly 19, 275–292.
*Dawson, M. & Dinovitzer, R. (2001). Victim cooperation and the prosecution of domestic violence in a specialized court. Justice Quarterly 18, 593–622.
*DeJong, C., Mastrofski, S. D. & Parks, R. B. (2001). Patrol officers and problem solving: An application of expectancy theory. Justice Quarterly 18, 31–62.
*Dugan, J. R. & Everett, R. S. (1998). An experimental test of chemical dependency therapy for jail inmates. International Journal of Offender Therapy and Comparative Criminology 42, 360–368.
*Dunford, F. W. (2000). The San Diego Navy Experiment: An assessment of interventions for men who assault their wives. Journal of Consulting and Clinical Psychology 68, 468–476.
Elliott, G. & Granger, C. W. J. (2004). Evaluating significance: Comments on “size matters”. The Journal of Socio-Economics 33, 547–550.
*Engel, R. S. & Silver, E. (2001). Policing mentally disordered suspects: A reexamination of the criminalization hypothesis. Criminology 39, 225–252.
Engen, R. L. & Gainey, R. R. (2000). Modeling the effects of legally relevant and extralegal factors under sentencing guidelines: The rules have changed. Criminology 38, 1207–1230.
*Exum, M. L. (2002). The application and robustness of the rational choice perspective in the study of intoxicated and angry intentions to aggress. Criminology 40, 933–966.
Farrington, D. P. & Welsh, B. C. (2005). Randomized experiments in criminology: What have we learned in the last two decades? Journal of Experimental Criminology 1, 9–38.
*Feder, L. & Dugan, L. (2002). A test of the efficacy of court-mandated counseling for domestic offenders: The Broward experiment. Justice Quarterly 19, 343–376.
*Felson, R. B. & Ackerman, J. (2001). Arrest for domestic and other assaults. Criminology 39, 655–676.
*Felson, R. B. & Haynie, D. L. (2002). Pubertal development, social factors, and delinquency among adolescent boys. Criminology 40, 967–988.
*Felson, R. B., Messner, S. F., Hoskin, A. W. & Deane, G. (2002). Reasons for reporting and not reporting domestic violence to the police. Criminology 40, 617–648.
Fidler, F. (2002). The fifth edition of the APA Publication manual: Why its statistics recommendations are so controversial. Educational and Psychological Measurement 62, 749–770.
*Finn, M. A. & Muirhead-Steves, S. (2002). The effectiveness of electronic monitoring with violent male parolees. Justice Quarterly 19, 293–312.
Fisher, R. A. (1935). The design of experiments. Edinburgh, Scotland: Oliver and Boyd.
Freedman, D. A. (1983). A note on screening regression equations. The American Statistician 37, 152–155.
*Garner, J. H., Maxwell, C. D. & Heraux, C. G. (2002). Characteristics associated with the prevalence and severity of force used by the police. Justice Quarterly 19, 705–746.
Gigerenzer, G. (1987). Probabilistic thinking and the fight against subjectivity. In L. Kruger, G. Gigerenzer & M. S. Morgan (Eds.), The probabilistic revolution. Vol. II: Ideas in the Sciences (pp. 11–33). Cambridge, MA: MIT Press.
*Golub, A., Johnson, B. D., Taylor, A. & Liberty, H. J. (2002). The validity of arrestees' self-reports: Variations across questions and persons. Justice Quarterly 19, 477–502.
*Gottfredson, D. C., Najaka, S. S. & Kearly, B. (2003). Effectiveness of drug treatment courts: Evidence from a randomized trial. Criminology and Public Policy 2, 171–196.
*Greenberg, D. F. & West, V. (2001). State prison populations and their growth, 1971–1991. Criminology 39, 615–654.
Greene, W. H. (2003). Econometric analysis, (5th edition). Upper Saddle River, NJ: Prentice-Hall.
Harlow, L. L., Mulaik, S. A. & Steiger, J. H. (Eds.) (1997). What if there were no significance tests? Mahwah, NJ: Lawrence Erlbaum Associates.
*Harmon, T. R. (2001). Predictors of miscarriages of justice in capital cases. Justice Quarterly 18, 949–968.
*Hay, C. (2001). Parenting, self-control, and delinquency: A test of self-control theory. Criminology 39, 707–736.
*Henggeler, S. W., Melton, G. B., Brondino, M. J., Scherer, D. G. & Hanley, J. H. (1997). Multisystemic therapy with violent and chronic juvenile offenders and their families: The role of treatment fidelity in successful dissemination. Journal of Consulting and Clinical Psychology 65, 821–833.
*Hennigan, K. M., Maxson, C. L., Sloane, D. & Ranney, M. (2002). Community views on crime and policing: Survey mode effects on bias in community surveys. Justice Quarterly 19, 565–587.
Hoenig, J. M. & Heisey, D. M. (2001). The abuse of power: The pervasive fallacy of power calculations for data analysis. The American Statistician 55, 19–24.
*Inciardi, J. A., Martin, S. S., Butzin, C. A., Hopper, R. M. & Harrison, L. D. (1997). An effective model of prison-based treatment for drug-involved offenders. Journal of Drug Issues 27, 261–278.
*Ireland, T. O., Smith, C. A. & Thornberry, T. P. (2002). Developmental issues in the impact of child maltreatment on later delinquency and drug use. Criminology 40, 359–400.
Johnson, D. H. (1999). The insignificance of statistical significance testing. Journal of Wildlife Management 63, 763–772.
*Kaminski, R. J. & Marvell, T. B. (2002). A comparison of changes in police and general homicides: 1930–1998. Criminology 40, 171–190.
*Kautt, P. & Spohn, C. (2002). Cracking down on black drug offenders? Testing for interactions among offenders' race, drug type, and sentencing strategy in federal drug sentences. Justice Quarterly 19, 1–36.
*Kempf-Leonard, K., Tracy, P. E. & Howell, J. C. (2001). Serious, violent, and chronic juvenile offenders: The relationship of delinquency career types to adult criminality. Justice Quarterly 18, 449–478.
*Killias, M., Aebi, M. & Ribeaud, D. (2000). Does community service rehabilitate better than short-term imprisonment? Results of a controlled experiment. Howard Journal 39, 40–57.
*Kingsnorth, R. F., MacIntosh, R. C. & Sutherland, S. (2002). Criminal charge or probation violation? Prosecutorial discretion and implications for research in criminal court processing. Criminology 40, 553–578.
*Kleck, G. & Chiricos, T. (2002). Unemployment and property crime: A target-specific assessment of opportunity and motivation as mediating factors. Criminology 40, 649–679.
*Koons-Witt, B. A. (2002). The effect of gender on the decision to incarcerate before and after the introduction of sentencing guidelines. Criminology 40, 297–328.
*Kramer, J. H. & Ulmer, J. T. (2002). Downward departures for serious violent offenders: Local court “corrections” to Pennsylvania sentencing guidelines. Criminology 40, 897–932.
Leamer, E. E. (2004). Are the roads red? Comments on “size matters.” The Journal of Socio-Economics 33, 555–557.
Lipsey, M. W. (1990). Design sensitivity: Statistical power for experimental research. Newbury Park, CA: Sage.
Lipsey, M. W., Crosse, S., Dunkle, J., Pollard, J. & Stobart, G. (1985). Evaluation: The state of the art and the sorry state of the science. In D. S. Cordray (Ed.), Utilizing prior research in evaluation planning (New Directions for Program Evaluation, No. 27, pp. 7–28). San Francisco: Jossey-Bass.
Lunt, P. (2004). The significance of the significance test controversy: comments on ‘size matters.’ The Journal of Socio-Economics 33, 559–564.
*Maguire, E. R. & Katz, C. M. (2002). Community policing, loose coupling, and sensemaking in American police agencies. Justice Quarterly 19, 503–536.
Maltz, M. D. (1994). Deviating from the mean: The declining significance of significance. Journal of Research in Crime and Delinquency 31, 434–463.
Marks, H. M. (1997). The progress of experiment: Science and therapeutic reform in the United States 1900–1990. Cambridge, UK: Cambridge University Press.
*Marlowe, D. B., Festinger, D. S., Lee, P. A., Schepise, M. M., Hazzard, J. E. R., Merrill, J. C., Mulvaney, F. D. & McLellan, A. T. (2003). Are judicial status hearings a key component of drug court? During-treatment data from a randomized trial. Criminal Justice and Behavior 30, 141–162.
*Marquart, J. W., Barnhill, M. B. & Balshaw-Biddle, K. (2001). Fatal attraction: An analysis of employee boundary violations in a southern prison system, 1995–1998. Justice Quarterly 18, 877–910.
*Mastrofski, S. D., Reisig, M. D. & McCluskey, J. D. (2002). Police disrespect toward the public: An encounter-based analysis. Criminology 40, 519–552.
*McCarthy, B., Hagan, J. & Martin, M. J. (2002). In and out of harm's way: Violent victimization and the social capital of fictive street families. Criminology 40, 831–865.
McCloskey, D. N. & Ziliak, S. T. (1996). The standard error of regressions. Journal of Economic Literature 34, 97–114.
*McNulty, T. L. (2001). Assessing the race–violence relationship at the macro level: The assumption of racial invariance and the problem of restricted distributions. Criminology 39, 467–490.
*Meehan, A. J. & Ponder, M. C. (2002). Race & place: The ecology of racial profiling African American motorists. Justice Quarterly 19, 399–430.
*Menard, S., Mihalic, S. & Huizinga, D. (2001). Drugs and crime revisited. Justice Quarterly 18, 269–300.
*Mills, P. E., Cole, K. N., Jenkins, J. R. & Dale, P. S. (2002). Early exposure to direct instruction and subsequent juvenile delinquency: A prospective examination. Exceptional Children 69, 85–96.
*Ortmann, R. (2000). The effectiveness of social therapy in prison: A randomized experiment. Crime and Delinquency 46, 214–232.
*Peterson, D., Miller, J. & Esbensen, F.-A. (2001). The impact of sex composition on gangs and gang member delinquency. Criminology 39, 411–440.
Petrosino, A. (2005). From Martinson to meta-analysis: Research reviews and the US offender treatment debate. Evidence & Policy: A Journal of Research, Debate and Practice 1, 149–172.
*Piquero, A. R. & Brezina, T. (2001). Testing Moffitt's account of adolescent-limited delinquency. Criminology 39, 353–370.
*Pogarsky, G. (2002). Identifying “deterrable” offenders: Implications for research on deterrence. Justice Quarterly 19, 431–452.
*Rebellon, C. J. (2002). Reconsidering the broken homes/delinquency relationship and exploring its mediating mechanism(s). Criminology 40, 103–136.
*Rhodes, W. & Gross, M. (1997). Case management reduces drug use and criminality among drug-involved arrestees: An experimental study of an HIV prevention intervention. Washington, DC: National Institute of Justice.
*Richards, H. J., Casey, J. O. & Lucente, S. W. (2003). Psychopathy and treatment response in incarcerated female substance abusers. Criminal Justice and Behavior 30, 251–276.
Rosenthal, R. & Rubin, D. B. (1994). The counternull value of an effect size: A new statistic. Psychological Science 5, 329–334.
Rozeboom, W. W. (1960). The fallacy of the null hypothesis significance test. Psychological Bulletin 57, 416–428.
Rozeboom, W. W. (1997). Good science is abductive, not hypothetico-deductive. In L. L. Harlow, S. A. Mulaik & J. H. Steiger (Eds.), What if there were no significance tests? (pp. 335–392). Mahwah, NJ: Lawrence Erlbaum Associates.
*Scheider, M. C. (2001). Deterrence and the base rate fallacy: An examination of perceived certainty. Justice Quarterly 18, 63–86.
*Schnebly, S. M. (2002). An examination of the impact of victim, offenders, and situational attributes on the deterrent effect of defensive gun use: A research note. Justice Quarterly 19, 377–398.
*Schwartz, M. D., DeKeseredy, W. S., Tait, D. & Alvi, S. (2001). Male peer support and a feminist routine activities theory: Understanding sexual assault on the college campus. Justice Quarterly 18, 623–650.
Sherman, L. W., Gottfredson, D., MacKenzie, D., Eck, J., Reuter, P. & Bushway, S. (1997). Preventing crime: What works, what doesn't, what's promising: A report to the United States Congress. Washington, DC: National Institute of Justice.
*Silver, E. (2002). Mental disorder and violent victimization: The mediating role of involvement in conflicted relationships. Criminology 40, 191–212.
*Simons, R. L., Stewart, E., Gordon, L. C., Conger, R. D. & Elder, G., Jr. (2002). A test of life-course explanations for stability and change in antisocial behavior from adolescence to young adulthood. Criminology 40, 401–434.
*Spohn, C. & Holleran, D. (2001). Prosecuting sexual assault: A comparison of charging decisions in sexual assault cases involving strangers, acquaintances, and intimate partners. Justice Quarterly 18, 651–688.
*Spohn, C. & Holleran, D. (2002). The effect of imprisonment on recidivism rates of felony offenders: A focus on drug offenders. Criminology 40, 329–358.
*Steffensmeier, D. & Demuth, S. (2001). Ethnicity and judges' sentencing decisions: Hispanic–Black–White comparisons. Criminology 39, 145–178.
*Stewart, E. A., Simons, R. L. & Conger, R. D. (2002). Assessing neighborhood and social psychological influences on childhood violence in an African-American sample. Criminology 40, 801–829.
*Swanson, J. W., Borum, R., Swartz, M. S., Hiday, V. A., Wagner, H. R. & Burns, B. J. (2001). Can involuntary outpatient commitment reduce arrests among persons with severe mental illness? Criminal Justice and Behavior 28, 156–189.
*Taylor, B. G., Davis, R. C. & Maxwell, C. D. (2001). The effects of a group batterer treatment program: A randomized experiment in Brooklyn. Justice Quarterly 18, 171–201.
*Terrill, W. & Mastrofski, S. D. (2002). Situational and officer-based determinants of police coercion. Justice Quarterly 19, 215–248.
Thompson, B. (2004). The “significance” crisis in psychology and education. The Journal of Socio-Economics 33, 607–613.
*van Voorhis, P., Spruance, L. M., Ritchey, P. N., Listwan, S. J. & Seabrook, R. (2004). The Georgia cognitive skills experiment: A replication of Reasoning and Rehabilitation. Criminal Justice and Behavior 31, 282–305.
*Velez, M. B. (2001). The role of public social control in urban neighborhoods: A multi-level analysis of victimization risk. Criminology 39, 837–864.
*Vogel, B. L. & Meeker, J. W. (2001). Perceptions of crime seriousness in eight African-American communities: The influence of individual, environmental, and crime-based factors. Justice Quarterly 18, 301–321.
Weisburd, D., Lum, C. M. & Yang, S.-M. (2003). When can we conclude that treatments or programs “don't work”? The Annals of the American Academy of Political and Social Science 574, 31–48.
*Weitzer, R. & Tuch, S. A. (2002). Perceptions of racial profiling: Race, class, and personal experience. Criminology 40, 435–456.
Wellford, C. (1989). Towards an integrated theory of criminal behavior. In S. Messner, M. M. Krohn & A. Liska (Eds.), Theoretical integration in the study of deviance and crime: Problems and prospects (pp. 119–128). Albany, NY: State University of New York.
*Wells, L. E. & Weisheit, R. A. (2001). Gang problems in nonmetropolitan areas: A longitudinal assessment. Justice Quarterly 18, 791–824.
*Welsh, W. N. (2001). Effects of student and school factors on five measures of school disorder. Justice Quarterly 18, 911–948.
*Wexler, H. K., Melnick, G., Lowe, L. & Peters, J. (1999). Three-year reincarceration outcomes for Amity in-prison therapeutic community and aftercare in California. Prison Journal 79, 321–336.
Wilson, D. B. (2001). Meta-analytic methods for criminology. Annals of the American Academy of Political and Social Science 578, 71–89.
Wooldridge, J. M. (2004). Statistical significance is okay, too: Comment on “size matters.” The Journal of Socio-Economics 33, 577–579.
*Wright, B. R. E., Caspi, A., Moffitt, T. E. & Silva, P. A. (2001). The effects of social ties on crime vary by criminal propensity: A life-course model of interdependence. Criminology 39, 321–352.
*Wright, J. P., Cullen, F. T., Agnew, R. S. & Brezina, T. (2001). “The root of all evil”? An exploratory study of money and delinquent involvement. Justice Quarterly 18, 239–268.
Zellner, A. (2004). To test or not to test and if so, how? Comments on “size matters?” The Journal of Socio-Economics 33, 581–586.
Ziliak, S. T. & McCloskey, D. N. (2004). Size matters: The standard error of regressions in the American Economic Review. The Journal of Socio-Economics 33, 527–546.
Cite this article
Bushway, S.D., Sweeten, G. & Wilson, D.B. Size matters: Standard errors in the application of null hypothesis significance testing in criminology and criminal justice. J Exp Criminol 2, 1–22 (2006). https://doi.org/10.1007/s11292-005-5129-7