
Statistical Decision-Making Accuracies for Some Overlap- and Distance-based Measures for Single-Case Experimental Designs

Carlin et al. (2022) · Perspectives on Behavior Science

★ The Verdict

Use Tau to decide whether a treatment effect is present, then use RD or g to measure its size.

✓ Read this if you are a BCBA who publishes or reviews single-case data.

✗ Skip if you are a clinician who only runs standardized norm-referenced tests.

01 Research in Context

What this study did

Carlin et al. (2022) ran computer simulations to see which statistics best detect true treatment effects in single-case data, using designs with 3 to 10 observations in each phase.

They compared overlap tools like Tau with distance tools like RD and g.

The goal was to tell BCBAs which statistic to trust for yes-or-no decisions and for sizing effects.
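To make the overlap-versus-distance contrast concrete, here is a minimal Python sketch of one measure from each family. It assumes a simple A-B comparison where higher scores mean improvement, and the data are made up for illustration. The g shown is the textbook Hedges' g (pooled-SD mean difference with a small-sample correction); the paper's exact SCED version may differ. RD is left out because this summary does not give its formula.

```python
from itertools import product
from math import sqrt
from statistics import mean, variance


def tau_ab(baseline, treatment):
    """Overlap-based: (improving pairs - worsening pairs) / all pairs."""
    pairs = list(product(baseline, treatment))
    improving = sum(1 for a, b in pairs if b > a)
    worsening = sum(1 for a, b in pairs if b < a)
    return (improving - worsening) / len(pairs)


def hedges_g(baseline, treatment):
    """Distance-based: mean difference in pooled-SD units, bias-corrected."""
    n_a, n_b = len(baseline), len(treatment)
    df = n_a + n_b - 2
    pooled_sd = sqrt(
        ((n_a - 1) * variance(baseline) + (n_b - 1) * variance(treatment)) / df
    )
    d = (mean(treatment) - mean(baseline)) / pooled_sd
    correction = 1 - 3 / (4 * df - 1)  # common approximation for small samples
    return d * correction


# Hypothetical session data, not taken from the paper.
baseline = [2, 3, 1, 4, 2]
treatment = [5, 6, 4, 7, 6]

print("Tau:", tau_ab(baseline, treatment))  # 0.96
print("g:  ", round(hedges_g(baseline, treatment), 2))
```

Tau only asks how often a treatment point beats a baseline point, so it tops out at 1 no matter how far apart the phases are. The g value keeps growing as the gap widens, which is why a distance measure is the better fit for describing effect size.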

What they found

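Tau, RD, and g made the most accurate decisions, judged by Type I error rate (crying “effect” when none is there) and power (catching an effect that is real). The authors recommend Tau for the yes-or-no call.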

RD and g gave the clearest picture of how big a change was, not just if it happened.

Other overlap-based measures, such as PND and the dual-criterion method, did not perform as well, so picking the right number matters.
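Decision accuracy in a study like this comes down to two rates: how often a rule says “effect” when nothing changed (Type I error) and how often it catches a real change (power). The sketch below estimates both for a Tau-based rule. It is not the authors' simulation: the normally distributed, independent data, the one-sided exact permutation test, the 0.05 cutoff, and every function name are assumptions made for illustration.

```python
import random
from itertools import combinations


def tau_ab(baseline, treatment):
    """(improving pairs - worsening pairs) / all baseline-treatment pairs."""
    improving = sum(b > a for a in baseline for b in treatment)
    worsening = sum(b < a for a in baseline for b in treatment)
    return (improving - worsening) / (len(baseline) * len(treatment))


def permutation_p_value(baseline, treatment):
    """One-sided exact p: share of phase relabelings with Tau >= observed."""
    observed = tau_ab(baseline, treatment)
    pooled = baseline + treatment
    indices = range(len(pooled))
    total = extreme = 0
    for a_idx in combinations(indices, len(baseline)):
        chosen = set(a_idx)
        a = [pooled[i] for i in chosen]
        b = [pooled[i] for i in indices if i not in chosen]
        total += 1
        extreme += tau_ab(a, b) >= observed
    return extreme / total


def rejection_rate(n_per_phase, shift, sims=300, alpha=0.05, seed=0):
    """Share of simulated A-B data sets where the rule says 'effect'.

    shift = 0 estimates the Type I error rate; shift > 0 estimates power.
    """
    rng = random.Random(seed)
    rejections = 0
    for _ in range(sims):
        baseline = [rng.gauss(0, 1) for _ in range(n_per_phase)]
        treatment = [rng.gauss(shift, 1) for _ in range(n_per_phase)]
        rejections += permutation_p_value(baseline, treatment) <= alpha
    return rejections / sims


if __name__ == "__main__":
    print("Type I error, n=4:", rejection_rate(4, shift=0))
    print("Power, n=4, shift=2 SD:", rejection_rate(4, shift=2))
```

Real single-case data are often autocorrelated, which this sketch ignores, so treat the numbers it prints as a teaching aid and not as a replication of the paper's results.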

How this fits with other research

Costello et al. (2022) ran a similar test and found that Tau and RD beat visual inspection alone.

Dowdy et al. (2021) had already warned that each effect-size index carries hidden rules; Carlin’s work now shows which index to pick in practice.

Manolov et al. (2025) focus on picking the right multilevel model, not the right measure; together the two papers give you a full checklist for solid SCED stats.

Why it matters

Next time you graph a client’s data, run Tau first. If Tau says “effect,” add RD or g to show the size. This two-step habit speeds up team decisions and makes your write-ups easier to defend in review.

→ Action — try this Monday

Open last week’s graph, calculate Tau in free SCED software, and note the RD value next to it.

02 At a glance

Intervention: not applicable

Design: methodology paper

Finding: not reported

03 Original abstract

Selecting a quantitative measure to guide decision making in single-case experimental designs (SCEDs) is complicated. Many measures exist and all have been rightly criticized. The two general classes of measure are overlap-based (e.g., percentage nonoverlapping data) and distance-based (e.g., Cohen’s d). We compare several measures from each category for Type I error rate and power across a range of designs using equal numbers of observations (i.e., 3–10) in each phase. Results showed that Tau and the distance-based measures (i.e., RD and g) provided the highest decision accuracies. Other overlap-based measures (e.g., PND, dual-criterion method) did not perform as well. It is recommended that Tau be used to guide decision making about the presence/absence of a treatment effect, and RD or g be used to quantify the magnitude of the treatment effect. The online version contains supplementary material available at 10.1007/s40614-021-00317-8.

Perspectives on Behavior Science, 2022 · doi:10.1007/s40614-021-00317-8