Correspondence between Fail-Safe k and Dual-Criteria Methods: Analysis of Data Series Stability
Fail-safe k is a handy cross-check for phase-change timing, but it is not yet ready for stand-alone clinical use.
01 Research in Context
What this study did
Falligant et al. (2020) asked a simple question: does the newer fail-safe k metric line up with the established dual-criteria (and conservative dual-criteria) rules when we eyeball single-case graphs? They ran the methods on real clinical data series collected in the course of care to see whether they flag the same moment to change phases. No new experiment, just a head-to-head check of existing decision tools.
What they found
The methods mostly agreed, but only 'somewhat.' Fail-safe k caught some shifts the dual-criteria rule missed and vice versa, and the correspondence was best for small and medium treatment effect sizes. The match was good enough to call fail-safe k 'promising' as a research tool, yet shaky enough to warn 'not ready for live cases.'
How this fits with other research
Carlin et al. (2022) and Costello et al. (2022) also tested SCED decision aids. Both teams favored Tau and RD effect sizes over plain visual calls, echoing the same warning: eyes alone are risky. Lanovaz et al. (2020) went further, showing machine learning beats the dual-criteria method outright. Falligant et al.'s more measured praise for fail-safe k fits the pattern: new stats help, but each tool needs its own vetting before you trade your graph paper for code.
Why it matters
If you write up SCED graphs, add a second metric before you claim victory. Try fail-safe k as a backup, but pair it with Tau or RD until more studies replicate the fit. One extra number in your Excel sheet can save you from a reviewer’s ‘visual-inspection-only’ complaint.
Try it: run fail-safe k on your last SCED graph and compare the suggested split to your visual call; note any points that don't line up.
02 At a glance
03 Original abstract
Barnard-Brak, Richman, Little, and Yang (Behaviour Research and Therapy, 102, 8–15, 2018) developed a structured-criteria metric, fail-safe k, which quantifies the stability of data series within single-case experimental designs (SCEDs) using published baseline and treatment data. Fail-safe k suggests the optimal point in time to change phases (e.g., move from Phase B to Phase C, reverse back to Phase A). However, this tool has not been tested with clinical data obtained in the course of care. Thus, the purpose of the current study was to replicate the procedures described by Barnard-Brak et al. with clinical data. We also evaluated the correspondence between the fail-safe k metric and outcomes obtained via dual-criteria and conservative dual-criteria methods, which are empirically supported methods for evaluating data-series trends within SCEDs. Our results provide some degree of support for use of this approach as a research tool with clinical data, in particular when evaluating small or medium treatment effect sizes, but further research is needed before this can be used widely by practitioners.
Perspectives on Behavior Science, 2020 · doi:10.1007/s40614-020-00255-x
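For readers who want to try the comparison themselves, the dual-criteria logic the abstract references can be sketched in Python. This is a minimal sketch based on the standard description of the method in the SCED literature (Fisher, Kelley, & Lomas, 2003), not the study authors' implementation; the function name and the 0.25-SD conservative shift are assumptions drawn from that general literature.

```python
# Minimal sketch of the dual-criteria (DC) and conservative dual-criteria
# (CDC) checks, based on the general description in the SCED literature
# (Fisher, Kelley, & Lomas, 2003) -- not the study authors' own code.
from math import comb, sqrt

def dual_criteria(baseline, treatment, direction="increase",
                  alpha=0.05, conservative=False):
    """Return True if the treatment phase departs reliably from baseline.

    Fits a mean line and a least-squares trend line to the baseline,
    projects both into the treatment phase, counts treatment points that
    fall beyond BOTH lines, and compares that count to a binomial
    criterion (each point has p = .5 of falling beyond by chance).
    The conservative variant shifts both lines by 0.25 baseline SDs.
    """
    n_base = len(baseline)
    mean = sum(baseline) / n_base

    # Least-squares slope and intercept for the baseline trend line.
    x_mean = (n_base - 1) / 2
    sxx = sum((x - x_mean) ** 2 for x in range(n_base))
    sxy = sum((x - x_mean) * (y - mean) for x, y in enumerate(baseline))
    slope = sxy / sxx if sxx else 0.0
    intercept = mean - slope * x_mean

    shift = 0.0
    if conservative and n_base > 1:
        sd = sqrt(sum((y - mean) ** 2 for y in baseline) / (n_base - 1))
        shift = 0.25 * sd

    hits = 0
    for i, y in enumerate(treatment, start=n_base):
        trend = intercept + slope * i
        if direction == "increase":
            hits += y > mean + shift and y > trend + shift
        else:
            hits += y < mean - shift and y < trend - shift

    # One-tailed binomial tail: P(X >= hits | n, p = .5) < alpha.
    n_treat = len(treatment)
    p_tail = sum(comb(n_treat, k)
                 for k in range(hits, n_treat + 1)) / 2 ** n_treat
    return p_tail < alpha
```

With a flat, stable baseline (e.g., [2, 3, 2, 3, 2]) and a clearly elevated treatment phase, the function flags a change; when the two phases overlap, it does not. Comparing its verdict against where fail-safe k suggests a phase change is the kind of correspondence check the study performed.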