Comparing Visual and Statistical Analysis of Multiple Baseline Design Graphs.
IRD and BC-SMD are the stats that most often agree with expert visual inspection of multiple-baseline graphs.
01 Research in Context
What this study did
Howard et al. (2019) asked a simple question. Which numbers match what our eyes already see?
They took graphs from real multiple-baseline studies and ran four quick stats on each one.
Experts judged the same graphs by eye. The team then checked which stat agreed best with the pros.
What they found
Two stats won. IRD and BC-SMD lined up closest to the visual calls. Tau-U looked strong on raw values but slipped once those values were boiled down to a yes/no call on a functional relation.
The naked eye was usually tougher. Visual judges said "no effect" more often than any stat did.
How this fits with other research
Lanovaz et al. (2017) set the table first. They showed you need at least three baseline (A) points and five intervention (B) points before you can even trust the simpler dual-criteria rule. Howard's team built on that by testing fancier stats on the same kind of graphs.
Falligant et al. (2020) double-checked the dual-criteria method with simulated data and found it keeps false alarms low. Howard et al. now give you two more tools, IRD and BC-SMD, that do the same job with tighter agreement with expert eyes.
Manolov (2019) ran a sister simulation on alternating-treatment designs and praised ALIV plus randomization. The takeaway across both 2019 papers: pair your eyes with one solid stat, but pick the stat that matches your design—ALIV for ATD, IRD or BC-SMD for multiple baseline.
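The dual-criteria rule that Lanovaz and Falligant examined can be sketched in a few lines. This is a minimal illustration of the commonly described procedure (project the baseline mean line and OLS trend line into the treatment phase, then count treatment points that beat both, judged against a one-sided binomial criterion at chance = .5), not the exact implementation any of these papers used; the function name and the p < .05 cutoff are my assumptions.

```python
import math

def fit_line(ys):
    """Ordinary least-squares slope and intercept for y over x = 0..n-1."""
    n = len(ys)
    xs = range(n)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return slope, my - slope * mx

def dual_criteria(baseline, treatment, increase=True):
    """Return (hits, needed): treatment points beating BOTH projected
    baseline lines, and the count required by a one-sided binomial
    test at p < .05 assuming each point beats chance with p = .5."""
    slope, intercept = fit_line(baseline)
    mean_level = sum(baseline) / len(baseline)
    n0 = len(baseline)
    hits = 0
    for i, y in enumerate(treatment):
        trend = slope * (n0 + i) + intercept
        if increase:
            hits += (y > mean_level and y > trend)
        else:
            hits += (y < mean_level and y < trend)
    n = len(treatment)
    # smallest k with P(X >= k) < .05 under Binomial(n, 0.5)
    needed = next(k for k in range(n + 1)
                  if sum(math.comb(n, j) for j in range(k, n + 1)) / 2 ** n < 0.05)
    return hits, needed
```

With a flat baseline of [2, 3, 2, 3, 2] and treatment data of [6, 7, 8, 7, 6], all five treatment points clear both lines, which meets the binomial criterion for five points, so the rule would call an effect. Note the Lanovaz et al. caveat above: with fewer than roughly three A and five B points, this rule is not trustworthy.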
Why it matters
You no longer have to guess if a graph "looks good enough." Slap IRD or BC-SMD onto your next multiple-baseline project and write the number right under the visual claim. If the stat agrees with your eye, you can sign off with confidence. If they clash, collect more data or tweak the intervention before you call it a win.
Try it yourself: open your last multiple-baseline graph, compute IRD with a free online calculator, and compare the number to your visual judgment.
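If you'd rather script the IRD calculation than use an online calculator, here is a minimal sketch of IRD as commonly defined (Parker et al.'s formulation: remove the fewest data points needed to eliminate all overlap between phases, then difference the resulting improvement rates). It assumes higher values mean improvement and is not the exact procedure from the paper under discussion.

```python
def ird(baseline, treatment):
    """Improvement rate difference (minimal sketch).

    Finds the fewest points whose removal leaves every remaining
    treatment point above every remaining baseline point (ties count
    as overlap). Removed baseline points count as 'improved' baseline;
    surviving treatment points count as 'improved' treatment.
    """
    values = sorted(set(baseline) | set(treatment))
    # cut c keeps baseline points <= c and treatment points > c
    cuts = [min(values) - 1] + values
    best = min(cuts, key=lambda c: sum(a > c for a in baseline)
                                   + sum(b <= c for b in treatment))
    removed_a = sum(a > best for a in baseline)
    removed_b = sum(b <= best for b in treatment)
    ir_treatment = (len(treatment) - removed_b) / len(treatment)
    ir_baseline = removed_a / len(baseline)
    return ir_treatment - ir_baseline
```

For fully separated phases such as baseline [2, 3, 2] and treatment [5, 6, 7], IRD is 1.0; one overlapping baseline point (e.g., baseline [2, 3, 6]) pulls it down to about 0.67. If your target behavior should decrease, negate the data before calling the function.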
02 At a glance
03 Original abstract
A growing number of statistical analyses are being developed for single-case research. One important factor in evaluating these methods is the extent to which each corresponds to visual analysis. Few studies have compared statistical and visual analysis, and information about more recently developed statistics is scarce. Therefore, our purpose was to evaluate the agreement between visual analysis and four statistical analyses: improvement rate difference (IRD); Tau-U; Hedges, Pustejovsky, Shadish (HPS) effect size; and between-case standardized mean difference (BC-SMD). Results indicate that IRD and BC-SMD had the strongest overall agreement with visual analysis. Although Tau-U had strong agreement with visual analysis on raw values, it had poorer agreement when those values were dichotomized to represent the presence or absence of a functional relation. Overall, visual analysis appeared to be more conservative than statistical analysis, but further research is needed to evaluate the nature of these disagreements.
Behavior Modification, 2019 · doi:10.1177/0145445518768723