Assessment & Research
How Many Tiers Do We Need? Type I Errors and Power in Multiple Baseline Designs
★ The Verdict
Accept two clear changes out of three or more tiers and move on: in this study, that rule kept the Type I error rate below .05 and power above .80.
✓ Read this if you are a BCBA who runs or reviews multiple-baseline studies.
✗ Skip if you are a practitioner who only uses group designs.
01 Research in Context
What this study did
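The authors generated 10,000 multiple baseline graphs, scored each tier with the dual-criteria method, and then had two raters visually inspect 300 graphs to replicate the analysis. With at least three tiers and two or more showing a clear change, the Type I error rate stayed below .05 and power stayed above .80. Requiring every tier to show a clear change pushed power unacceptably low. So stop scrapping your study when the third tier wobbles. Next time you plan a multiple baseline, aim for at least three tiers and be content when two show a clear change.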
→ Action: try this Monday
Keep your three-tier study even if tier three wobbles — two clear changes are enough to claim an effect.
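To see why "two of three" can hold up better than "three of three", here is a small arithmetic sketch. It assumes tiers behave independently and uses made-up per-tier rates (`false_positive_per_tier` and `detection_per_tier` are hypothetical values picked for illustration, not numbers from the study). The helper `prob_at_least` is likewise just an illustrative name.

```python
from math import comb

def prob_at_least(k, n, p):
    """Probability that at least k of n independent tiers show a clear change,
    when each tier shows one with probability p."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))

# Hypothetical per-tier rates, for illustration only (not taken from the study).
false_positive_per_tier = 0.10  # chance a tier "changes" with no real effect
detection_per_tier = 0.80       # chance a tier changes when the effect is real

for k in (2, 3):
    type1 = prob_at_least(k, 3, false_positive_per_tier)
    power = prob_at_least(k, 3, detection_per_tier)
    print(f"require {k} of 3 tiers: Type I error = {type1:.3f}, power = {power:.3f}")
# require 2 of 3 tiers: Type I error = 0.028, power = 0.896
# require 3 of 3 tiers: Type I error = 0.001, power = 0.512
```

Under these assumed rates, demanding all three tiers barely lowers an already small error rate but cuts power nearly in half. Real tiers are not perfectly independent, so treat this as intuition rather than a replacement for the study's simulations.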
02 At a glance
Intervention: not applicable
Design: methodology paper
Finding: not reported
03 Original abstract
Design quality guidelines typically recommend that multiple baseline designs include at least three demonstrations of effects. Despite its widespread adoption, this recommendation does not appear grounded in empirical evidence. The main purpose of our study was to address this issue by assessing Type I error rate and power in multiple baseline designs. First, we generated 10,000 multiple baseline graphs, applied the dual-criteria method to each tier, and computed Type I error rate and power for different number of tiers showing a clear change. Second, two raters categorized the tiers for 300 multiple baseline graphs to replicate our analyses using visual inspection. When multiple baseline designs had at least three tiers and two or more of these tiers showed a clear change, the Type I error rate remained adequate (< .05) while power also reached acceptable levels (> .80). In contrast, requiring all tiers to show a clear change resulted in overly stringent conclusions (i.e., unacceptably low power). Therefore, our results suggest that researchers and practitioners should carefully consider limitations in power when requiring all tiers of a multiple baseline design to show a clear change in their analyses.
Perspectives on Behavior Science, 2020 · doi:10.1007/s40614-020-00263-x
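The simulation logic described in the abstract can be sketched roughly as follows. This is not the authors' code, and its numbers will not match the published ones. Several pieces are assumptions made only to keep the example short and runnable: independent normal noise, five points per phase, an effect of three standard deviations, a least-squares trend line, and a cutoff computed from a binomial distribution with probability 0.5 rather than taken from a published lookup table. The function names are hypothetical.

```python
import random
from math import comb

def binomial_threshold(n, alpha=0.05):
    """Smallest count k with P(X >= k) < alpha for X ~ Binomial(n, 0.5)."""
    for k in range(n + 1):
        tail = sum(comb(n, i) for i in range(k, n + 1)) / 2**n
        if tail < alpha:
            return k
    return n + 1  # no count is extreme enough, so the tier can never pass

def tier_shows_change(baseline, treatment, alpha=0.05):
    """Simplified dual-criteria-style check for an expected increase:
    count treatment points above both the baseline mean line and the
    baseline trend line extended into the treatment phase."""
    n = len(baseline)
    mean_y = sum(baseline) / n
    mean_x = (n - 1) / 2
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    slope = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(baseline)) / sxx
    intercept = mean_y - slope * mean_x
    above_both = 0
    for j, y in enumerate(treatment):
        trend = intercept + slope * (n + j)
        if y > mean_y and y > trend:
            above_both += 1
    return above_both >= binomial_threshold(len(treatment), alpha)

def simulate(n_graphs, n_tiers, effect, rng, n_base=5, n_treat=5):
    """Share of simulated graphs with at least k tiers showing a change,
    returned as a list indexed by k (0..n_tiers)."""
    counts = [0] * (n_tiers + 1)
    for _ in range(n_graphs):
        changed = 0
        for _ in range(n_tiers):
            baseline = [rng.gauss(0, 1) for _ in range(n_base)]
            treatment = [rng.gauss(effect, 1) for _ in range(n_treat)]
            changed += tier_shows_change(baseline, treatment)
        for k in range(changed + 1):
            counts[k] += 1
    return [c / n_graphs for c in counts]

rng = random.Random(0)  # fixed seed so reruns give the same numbers
type1 = simulate(2000, 3, effect=0.0, rng=rng)  # no true effect
power = simulate(2000, 3, effect=3.0, rng=rng)  # assumed true effect
for k in (1, 2, 3):
    print(f"at least {k} of 3 tiers: Type I error = {type1[k]:.3f}, power = {power[k]:.3f}")
```

Running it with `effect=0.0` estimates how often chance alone produces the required number of clear changes, and running it with a nonzero effect estimates power. Changing `n_tiers` or the required `k` gives a feel for the trade-off the abstract describes.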