Correspondence between Fail-Safe k and Dual-Criteria Methods: Analysis of Data Series Stability
Fail-safe k is a handy cross-check for phase-change timing, but it is not yet ready for stand-alone clinical use.
01 Research in Context
What this study did
Falligant et al. (2020) asked a simple question: does the newer fail-safe k metric line up with the established dual-criteria (and conservative dual-criteria) rules when we eyeball single-case graphs? They ran the methods on real clinical data series collected in the course of care to see whether they flag the same moment to change phases. No new experiment, just a head-to-head check of existing decision tools.
What they found
The methods mostly agreed, but only 'somewhat.' Fail-safe k caught some shifts the dual-criteria rule missed and vice versa, and the correspondence was best for small and medium treatment effect sizes. The match was good enough to call fail-safe k 'promising' as a research tool, yet shaky enough to warn 'not ready for live cases.'
How this fits with other research
Carlin et al. (2022) and Costello et al. (2022) also tested SCED decision aids. Both teams favored Tau and RD effect sizes over plain visual calls, echoing the same warning: eyes alone are risky. Lanovaz et al. (2020) went further, showing machine learning beats the dual-criteria method outright. Falligant et al.'s more measured praise for fail-safe k fits the pattern: new stats help, but each tool needs its own vetting before you trade your graph paper for code.
Why it matters
If you write up SCED graphs, add a second metric before you claim victory. Try fail-safe k as a backup, but pair it with Tau or RD until more studies replicate the fit. One extra number in your Excel sheet can save you from a reviewer’s ‘visual-inspection-only’ complaint.
Try it: run fail-safe k on your last SCED graph and compare the suggested split to your visual call; note any points that don't line up.
02 At a glance
03 Original abstract
Barnard-Brak, Richman, Little, and Yang (Behaviour Research and Therapy, 102, 8–15, 2018) developed a structured-criteria metric, fail-safe k, which quantifies the stability of data series within single-case experimental designs (SCEDs) using published baseline and treatment data. Fail-safe k suggests the optimal point in time to change phases (e.g., move from Phase B to Phase C, reverse back to Phase A). However, this tool has not been tested with clinical data obtained in the course of care. Thus, the purpose of the current study was to replicate the procedures described by Barnard-Brak et al. with clinical data. We also evaluated the correspondence between the fail-safe k metric and outcomes obtained via dual-criteria and conservative dual-criteria methods, which are empirically supported methods for evaluating data-series trends within SCEDs. Our results provide some degree of support for use of this approach as a research tool with clinical data, in particular when evaluating small or medium treatment effect sizes, but further research is needed before this can be used widely by practitioners.
Perspectives on Behavior Science, 2020 · doi:10.1007/s40614-020-00255-x
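For readers who want to try the comparison themselves, the dual-criteria logic the abstract references can be sketched in Python. This is a minimal sketch based on the standard description of the method in the SCED literature (Fisher, Kelley, & Lomas, 2003), not the study authors' implementation; the function name and the 0.25-SD conservative shift are assumptions drawn from that general literature.

```python
# Minimal sketch of the dual-criteria (DC) and conservative dual-criteria
# (CDC) checks, based on the general description in the SCED literature
# (Fisher, Kelley, & Lomas, 2003) -- not the study authors' own code.
from math import comb, sqrt

def dual_criteria(baseline, treatment, direction="increase",
                  alpha=0.05, conservative=False):
    """Return True if the treatment phase departs reliably from baseline.

    Fits a mean line and a least-squares trend line to the baseline,
    projects both into the treatment phase, counts treatment points that
    fall beyond BOTH lines, and compares that count to a binomial
    criterion (each point has p = .5 of falling beyond by chance).
    The conservative variant shifts both lines by 0.25 baseline SDs.
    """
    n_base = len(baseline)
    mean = sum(baseline) / n_base

    # Least-squares slope and intercept for the baseline trend line.
    x_mean = (n_base - 1) / 2
    sxx = sum((x - x_mean) ** 2 for x in range(n_base))
    sxy = sum((x - x_mean) * (y - mean) for x, y in enumerate(baseline))
    slope = sxy / sxx if sxx else 0.0
    intercept = mean - slope * x_mean

    shift = 0.0
    if conservative and n_base > 1:
        sd = sqrt(sum((y - mean) ** 2 for y in baseline) / (n_base - 1))
        shift = 0.25 * sd

    hits = 0
    for i, y in enumerate(treatment, start=n_base):
        trend = intercept + slope * i
        if direction == "increase":
            hits += y > mean + shift and y > trend + shift
        else:
            hits += y < mean - shift and y < trend - shift

    # One-tailed binomial tail: P(X >= hits | n, p = .5) < alpha.
    n_treat = len(treatment)
    p_tail = sum(comb(n_treat, k)
                 for k in range(hits, n_treat + 1)) / 2 ** n_treat
    return p_tail < alpha
```

With a flat, stable baseline (e.g., [2, 3, 2, 3, 2]) and a clearly elevated treatment phase, the function flags a change; when the two phases overlap, it does not. Comparing its verdict against where fail-safe k suggests a phase change is the kind of correspondence check the study performed.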