Abstract
Barnard-Brak, Richman, Little, and Yang (Behaviour Research and Therapy, 102, 8–15, 2018) developed a structured-criteria metric, fail-safe k, which quantifies the stability of data series within single-case experimental designs (SCEDs) using published baseline and treatment data. Fail-safe k suggests the optimal point in time to change phases (e.g., move from Phase B to Phase C, or reverse back to Phase A). However, this tool has not been tested with clinical data obtained in the course of care. Thus, the purpose of the current study was to replicate the procedures described by Barnard-Brak et al. with clinical data. We also evaluated the correspondence between the fail-safe k metric and outcomes obtained via the dual-criteria (DC) and conservative dual-criteria (CDC) methods, which are empirically supported methods for evaluating data-series trends within SCEDs. Our results provide some support for the use of this approach as a research tool with clinical data, particularly when evaluating small or medium treatment effect sizes, but further research is needed before it can be used widely by practitioners.
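For readers unfamiliar with the comparison methods, the dual-criteria decision rule lends itself to a brief sketch. The Python function below is a minimal illustration of the DC/CDC logic described by Fisher, Kelley, and Lomas (2003), not the analysis template used in this study; the function name, parameter defaults, and direct binomial computation (in place of Fisher et al.'s lookup table) are our own simplifying assumptions.

```python
import math
import numpy as np

def cdc_test(baseline, treatment, decrease=True, shift_sd=0.25, alpha=0.05):
    """Conservative dual-criteria (CDC) check in the spirit of Fisher,
    Kelley, and Lomas (2003). With shift_sd=0 this reduces to the
    ordinary dual-criteria (DC) method."""
    base = np.asarray(baseline, dtype=float)
    treat = np.asarray(treatment, dtype=float)

    # Fit the baseline mean line and least-squares trend line, then
    # project both into the treatment phase.
    slope, intercept = np.polyfit(np.arange(len(base)), base, 1)
    x_treat = np.arange(len(base), len(base) + len(treat))
    trend_line = slope * x_treat + intercept
    mean_line = np.full(len(treat), base.mean())

    # CDC shifts both lines by 0.25 baseline SDs in the therapeutic direction.
    shift = shift_sd * base.std(ddof=1)
    if decrease:
        hits = np.sum((treat < trend_line - shift) & (treat < mean_line - shift))
    else:
        hits = np.sum((treat > trend_line + shift) & (treat > mean_line + shift))

    # Binomial criterion: is this many conforming points unlikely by
    # chance (p = .5 per point)?
    n = len(treat)
    p_value = sum(math.comb(n, k) for k in range(int(hits), n + 1)) / 2 ** n
    return int(hits), p_value < alpha

# Hypothetical data: did responding decrease reliably in treatment?
print(cdc_test([8, 9, 7, 10, 9], [5, 4, 6, 3, 4, 2]))  # -> (6, True)
```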
Notes
Baseline data from a subset of these individuals were previously analyzed by Falligant et al. (2019).
Independent data collectors calculated interrater agreement by comparing the DC, CDC, and fail-safe k results generated via the template with the output recorded by each data recorder. That is, each data collector verified that the other data collector correctly recorded and interpreted the results from each analysis. We calculated interrater agreement for 100% of applications; initial results yielded 95% (55 of 58 applications) agreement, and after reaching consensus on the remaining three applications, the final agreement for output and interpretation of results across all evaluations was 100%.
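As a point of reference, the agreement arithmetic reported in this note is simply the proportion of applications on which both data collectors' results matched:

```python
# Trial-by-trial interrater agreement: matched applications / total applications.
agreements, applications = 55, 58
print(f"{100 * agreements / applications:.0f}% agreement")  # -> 95% agreement
```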
See Tarlow and Brossart (2018) for a detailed overview of this method. In essence, a null-effect model and an experimental-effect model are generated from these parameter estimates; each model produces hypothetical data sets, and the resulting null-effect and experimental-effect distributions of time-series data are compared to calculate a standardized mean difference effect size (d). Interrupted time-series analyses can be implemented in behavior-analytic research and should produce high levels of agreement with visual analysis (see Harrington & Velicer, 2015).
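The full ITSSIM procedure involves considerably more machinery (parameter estimation and trend and autocorrelation modeling), but the core simulate-and-compare logic can be sketched as follows. The AR(1) generator, the invented parameter values, and the way d is computed here are simplifying assumptions for illustration, not Tarlow and Brossart's actual implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate_ar1(n, level, phi, sd):
    """Generate an AR(1) time series around a given level with lag-1
    autocorrelation phi and residual standard deviation sd."""
    y = np.empty(n)
    y[0] = level + rng.normal(0, sd)
    for t in range(1, n):
        y[t] = level + phi * (y[t - 1] - level) + rng.normal(0, sd)
    return y

n_base, n_treat, reps = 10, 10, 5000

# Null-effect model: baseline parameters govern the entire series.
# Experimental-effect model: the level shifts in the treatment phase.
null_b = np.array([simulate_ar1(n_base + n_treat, 10.0, 0.2, 2.0)[n_base:].mean()
                   for _ in range(reps)])
effect_b = np.array([simulate_ar1(n_treat, 6.0, 0.2, 2.0).mean()
                     for _ in range(reps)])

# Compare the two simulated distributions of treatment-phase means to
# obtain a standardized mean difference (negative = behavior decreased).
pooled_sd = np.sqrt((null_b.var(ddof=1) + effect_b.var(ddof=1)) / 2)
d = (effect_b.mean() - null_b.mean()) / pooled_sd
print(f"d = {d:.2f}")
```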
PND reflects the percentage of sessions within a treatment phase that fall outside the baseline range (for an overview of PND and related methods, see Wolery et al., 2010; for examples of the application of PND, see Carr et al., 2009, and Sham & Smith, 2014).
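PND is straightforward to compute. A minimal sketch, assuming a behavior-reduction target by default (the function and example data are ours, for illustration only):

```python
def pnd(baseline, treatment, decrease=True):
    """Percentage of nonoverlapping data: the share of treatment-phase
    sessions falling beyond the most extreme baseline value in the
    therapeutic direction."""
    if decrease:
        floor = min(baseline)
        outside = sum(1 for y in treatment if y < floor)
    else:
        ceiling = max(baseline)
        outside = sum(1 for y in treatment if y > ceiling)
    return 100 * outside / len(treatment)

# Four of five treatment points fall below the lowest baseline point.
print(pnd(baseline=[8, 9, 7, 10], treatment=[5, 6, 7, 4, 3]))  # -> 80.0
```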
Note that the more negative the d₀ value, the more robust the effect size (because in these cases, behavior should decrease in the B phase relative to the A phase). For example, an effect size of d₀ = -2.0 would be a “larger” effect size than d₀ = -1.5.
This represents the base rate of agreement (i.e., 50%) within the sample.
References
Baer, D. M. (1977). Perhaps it would be better not to know everything. Journal of Applied Behavior Analysis, 10, 167–172. https://doi.org/10.1901/jaba.1977.10-167.
Barnard-Brak, L., Richman, D. M., Little, T. D., & Yang, Z. (2018). Development of an in-vivo metric to aid visual inspection of single-case design data: Do we need to run more sessions? Behaviour Research and Therapy, 102, 8–15. https://doi.org/10.1016/j.brat.2017.12.003.
Brossart, D. F., Parker, R. I., Olson, E. A., & Mahadevan, L. (2006). The relationship between visual analysis and five statistical analyses in a simple AB single-case research design. Behavior Modification, 30, 531–563. https://doi.org/10.1177/0145445503261167.
Carr, J. E., Coriaty, S., Wilder, D. A., Gaunt, B. T., Dozier, C. L., Britton, L. N., et al. (2000). A review of “noncontingent” reinforcement as treatment for the aberrant behavior of individuals with developmental disabilities. Research in Developmental Disabilities, 21, 377–391. https://doi.org/10.1016/S0891-4222(00)00050-0.
Carr, J. E., Severtson, J. M., & Lepper, T. L. (2009). Noncontingent reinforcement is an empirically supported treatment for problem behavior exhibited by individuals with developmental disabilities. Research in Developmental Disabilities, 30, 44–57. https://doi.org/10.1016/j.ridd.2008.03.002.
Falligant, J. M., McNulty, M. K., Hausman, N. L., & Rooker, G. W. (2019). Using dual-criteria methods to supplement visual inspection: Replication and extension. Journal of Applied Behavior Analysis. Advance online publication.
Fisher, W. W., Kelley, M. E., & Lomas, J. E. (2003). Visual aids and structured criteria for improving visual inspection and interpretation of single-case designs. Journal of Applied Behavior Analysis, 36, 387–406. https://doi.org/10.1901/jaba.2003.36-387.
Fisher, W. W., & Lerman, D. C. (2014). It has been said that, “There are three degrees of falsehoods: Lies, damn lies, and statistics.” Journal of School Psychology, 52, 243–248. https://doi.org/10.1016/j.jsp.2014.01.001.
Gage, N. A., & Lewis, T. J. (2014). Hierarchical linear modeling meta-analysis of single-subject design research. Journal of Special Education, 48, 3–16. https://doi.org/10.1177/0022466912443894.
Greer, B. D., Fisher, W. W., Saini, V., Owen, T. M., & Jones, J. K. (2016). Functional communication training during reinforcement schedule thinning: An analysis of 25 applications. Journal of Applied Behavior Analysis, 49, 105–121. https://doi.org/10.1002/jaba.265.
Hagopian, L. P., Fisher, W. W., Sullivan, M. T., Acquisto, J., & LeBlanc, L. A. (1998). Effectiveness of functional communication training with and without extinction and punishment: A summary of 21 inpatient cases. Journal of Applied Behavior Analysis, 31, 211–235. https://doi.org/10.1901/jaba.1998.31-211.
Hagopian, L. P., Fisher, W. W., Thompson, R. H., Owen-DeSchryver, J., Iwata, B. A., & Wacker, D. P. (1997). Toward the development of structured criteria for interpretation of functional analysis data. Journal of Applied Behavior Analysis, 30, 313–326. https://doi.org/10.1901/jaba.1997.30-313.
Hagopian, L. P., Rooker, G. W., & Yenokyan, G. (2018). Identifying predictive behavioral markers: A demonstration using automatically reinforced self-injurious behavior. Journal of Applied Behavior Analysis, 51, 443–465. https://doi.org/10.1002/jaba.477.
Harrington, M., & Velicer, W. F. (2015). Comparing visual and statistical analysis in single-case studies using published studies. Multivariate Behavioral Research, 50, 162–183. https://doi.org/10.1080/00273171.2014.973989.
Kyonka, E. G., Mitchell, S. H., & Bizo, L. A. (2019). Beyond inference by eye: Statistical and graphing practices in JEAB, 1992-2017. Journal of the Experimental Analysis of Behavior, 111, 155–165. https://doi.org/10.1002/jeab.509.
Lanovaz, M. J., Huxley, S. C., & Dufour, M. M. (2017). Using the dual-criteria methods to supplement visual inspection: An analysis of nonsimulated data. Journal of Applied Behavior Analysis, 50, 662–667. https://doi.org/10.1002/jaba.394.
Lenz, A. S. (2013). Calculating effect size in single-case research: A comparison of nonoverlap methods. Measurement & Evaluation in Counseling & Development, 46, 64–73. https://doi.org/10.1177/0748175612456401.
Manolov, R., & Solanas, A. (2008). Comparing N = 1 effect size indices in presence of autocorrelation. Behavior Modification, 32, 860–875. https://doi.org/10.1177/0145445508318866.
Ninci, J., Vannest, K. J., Willson, V., & Zhang, N. (2015). Interrater agreement between visual analysts of single-case data: A meta-analysis. Behavior Modification, 39, 510–541. https://doi.org/10.1177/0145445515581327.
Parker, R. I., Vannest, K. J., & Davis, J. L. (2011). Effect size in single-case research: A review of nine nonoverlap techniques. Behavior Modification, 35, 303–322. https://doi.org/10.1177/0145445511399147.
Phillips, C. L., Iannaccone, J. A., Rooker, G. W., & Hagopian, L. P. (2017). Noncontingent reinforcement for the treatment of severe problem behavior: An analysis of 27 consecutive applications. Journal of Applied Behavior Analysis, 50, 357–376. https://doi.org/10.1002/jaba.376.
Roane, H. S., Fisher, W. W., Kelley, M. E., Mevers, J. L., & Bouxsein, K. J. (2013). Using modified visual-inspection criteria to interpret functional analysis outcomes. Journal of Applied Behavior Analysis, 46, 130–146. https://doi.org/10.1002/jaba.13.
Sham, E., & Smith, T. (2014). Publication bias in studies of an applied behavior-analytic intervention: An initial analysis. Journal of Applied Behavior Analysis, 47, 663–678. https://doi.org/10.1002/jaba.146.
Smith, J. D. (2012). Single-case experimental designs: A systematic review of published research and current standards. Psychological Methods, 17, 510–550. https://doi.org/10.1037/a0029312.
Stewart, K. K., Carr, J. E., Brandt, C. W., & McHenry, M. M. (2007). An evaluation of the conservative dual-criterion method for teaching university students to visually inspect AB-design graphs. Journal of Applied Behavior Analysis, 40, 713–718. https://doi.org/10.1901/jaba.2007.713-718.
Tarlow, K. R., & Brossart, D. F. (2018). A comprehensive method of single-case data analysis: Interrupted Time-Series Simulation (ITSSIM). School Psychology Quarterly, 33, 590–603. https://doi.org/10.1037/spq0000273.
Thompson, R. H., Iwata, B. A., Hanley, G. P., Dozier, C. L., & Samaha, A. L. (2003). The effects of extinction, noncontingent reinforcement, and differential reinforcement of other behavior as control procedures. Journal of Applied Behavior Analysis, 36, 221–238.
Wolery, M., Busick, M., Reichow, B., & Barton, E. E. (2010). Comparison of overlap methods for quantitatively synthesizing single-subject data. Journal of Special Education, 44, 18–28. https://doi.org/10.1177/0022466908328009.
Wolfe, K., Seaman, M. A., Drasgow, E., & Sherlock, P. (2018). An evaluation of the agreement between the conservative dual-criterion method and expert visual analysis. Journal of Applied Behavior Analysis, 51, 345–351. https://doi.org/10.1002/jaba.453.
Acknowledgments
The authors wish to acknowledge Lucy Barnard-Brak and David Richman, as well as the editor and reviewers, for their helpful comments during preparation of this manuscript.
Cite this article
Falligant, J.M., Kranak, M.P., Schmidt, J.D. et al. Correspondence between Fail-Safe k and Dual-Criteria Methods: Analysis of Data Series Stability. Perspect Behav Sci 43, 303–319 (2020). https://doi.org/10.1007/s40614-020-00255-x