Statistical comparison of four effect sizes for single-subject designs.
Four common effect-size indices can call the same graph weak or strong, so report the formula you use and treat meta-analytic averages with caution.
01 Research in Context
What this study did
Campbell (2004) lined up four popular ways to turn single-case graphs into numbers: mean baseline reduction (MBLR), percentage of nonoverlapping data (PND), percentage of zero data (PZD), and a regression-based d. The goal was to see whether the four indices told the same story about the same treatments.
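The three nonregression indices are simple enough to sketch. The functions below use common textbook formulations (exact definitions vary across authors, and PZD in particular has several variants); the data are invented for illustration.

```python
def pnd(baseline, treatment):
    """Percentage of nonoverlapping data: share of treatment points
    below the lowest baseline point (for a behavior-reduction target)."""
    floor = min(baseline)
    return 100.0 * sum(t < floor for t in treatment) / len(treatment)

def pzd(baseline, treatment):
    """Percentage of zero data: from the first zero in treatment onward,
    the share of points that stay at zero (one common formulation)."""
    if 0 not in treatment:
        return 0.0
    tail = treatment[treatment.index(0):]
    return 100.0 * sum(t == 0 for t in tail) / len(tail)

def mblr(baseline, treatment):
    """Mean baseline reduction: percentage drop of the treatment-phase
    mean relative to the baseline mean."""
    mb = sum(baseline) / len(baseline)
    mt = sum(treatment) / len(treatment)
    return 100.0 * (mb - mt) / mb

# Hypothetical rates of a problem behavior across two phases.
baseline  = [8, 6, 7, 9, 6]
treatment = [4, 2, 0, 1, 0, 0]

print(pnd(baseline, treatment))   # 100.0 — every treatment point beats the baseline floor
print(pzd(baseline, treatment))   # 75.0  — behavior hit zero but did not stay there
print(mblr(baseline, treatment))  # ~83.8 — large mean reduction
```

Even on this toy graph the indices disagree: PND calls the effect perfect while PZD does not, which is exactly the kind of divergence the study documents.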
What they found
All four indices agreed the treatments worked. But the indices did not agree on how well they worked. Only PZD spotted differences tied to moderators such as age or setting.
How this fits with other research
Sen (2022) extends the warning: five newer regression formulas can swing Cohen's d from 0.003 to 3.47 on the same data. The numbers look precise, but they are not interchangeable.
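A small demonstration of why regression-style d values diverge: even before trend adjustments, the choice of standardizing denominator alone changes d on the same data. The two variants below are hypothetical stand-ins, not Sen's (2022) formulas, and the data are invented.

```python
import statistics as st

# Same toy phases as above; same raw mean difference.
baseline  = [8, 6, 7, 9, 6]
treatment = [4, 2, 0, 1, 0, 0]

diff = st.mean(baseline) - st.mean(treatment)

# Variant 1: standardize by the baseline SD only.
d_baseline_sd = diff / st.stdev(baseline)

# Variant 2: standardize by the pooled SD of both phases.
pooled_var = ((len(baseline) - 1) * st.variance(baseline)
              + (len(treatment) - 1) * st.variance(treatment)) \
             / (len(baseline) + len(treatment) - 2)
d_pooled = diff / pooled_var ** 0.5

print(round(d_baseline_sd, 2))  # ~4.63
print(round(d_pooled, 2))       # ~4.08
```

One data set, one mean difference, two different "Cohen's d" values; formulas that also model trend or autocorrelation spread the range much further, which is the point of Sen's warning.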
Cohn et al. (2007) reads as a successor. Their field test of 165 AB graphs crowned IRD the best nonoverlap index, pushing it past the PND that Campbell (2004) still included.
Carter (2013) reframes the debate. The paper says overlap indices are not wrong, just misread. Use them to judge control, not size. This softens the apparent contradiction without erasing it.
Why it matters
When you write up a single-case study, pick one index and stick with it. State the formula in your method section so readers can compare across studies. If you need to hunt for moderators, try PZD first.
Add one sentence to your next report that names the exact effect-size formula you used.
02 At a glance
03 Original abstract
Controversy exists regarding appropriate methods for summarizing treatment outcomes for single-subject designs. Nonregression- and regression-based methods have been proposed to summarize the efficacy of single-subject interventions with proponents of both methods arguing for the superiority of their respective approaches. To compare findings for different single-subject effect sizes, 117 articles that targeted the reduction of problematic behaviors in 181 individuals diagnosed with autism were examined. Four effect sizes were calculated for each article: mean baseline reduction (MBLR), percentage of nonoverlapping data (PND), percentage of zero data (PZD), and one regression-based d statistic. Although each effect size indicated that behavioral treatment was effective, moderating variables were detected by the PZD effect size only. Pearson product-moment correlations indicated that effect sizes differed in statistical relationships to one another. In the present review, the regression-based d effect size did not improve the understanding of single-subject treatment outcomes when compared to nonregression effect sizes.
Behavior Modification, 2004 · doi:10.1177/0145445503259264