Assessment & Research

Replicability and randomization test logic in behavior analysis

Jacobs (2019) · Journal of the Experimental Analysis of Behavior
★ The Verdict

Randomization tests give you a p-value for single-case data without assuming normal distributions—use them instead of t-tests next time you analyze an AB design.

✓ Read this if: you're a BCBA who analyzes or publishes single-case data.
✗ Skip if: you're a clinician who only reads group studies.

01 · Research in Context

01

What this study did

Jacobs wrote a how-to paper, not an experiment. He looked at how we usually analyze single-case data and asked: why are we still using t-tests that assume large, normally distributed samples?

He explained randomization tests. These tests count up what actually happened versus what could have happened if the treatment order were shuffled. No need for normal curves or large N.

02

What they found

The paper says randomization tests fit single-case logic better. You get a clean p-value for one learner without pretending the data are normal.

Jacobs showed the math is simple: list every possible order of A and B phases, compute the stat for each order, and see where your real result falls.
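That counting logic can be sketched in a few lines. The data below are made up, and the test statistic (difference in phase means) is one common choice among several; this is a minimal illustration of the shuffle-and-count idea, not Jacobs' exact procedure.

```python
from itertools import combinations
from statistics import mean

# Hypothetical scores: 4 baseline (A) sessions and 4 treatment (B) sessions.
a = [2, 3, 2, 4]
b = [7, 8, 6, 9]

data = a + b
observed = mean(b) - mean(a)  # test statistic: shift in phase means

# Enumerate every way these 8 scores could have been split 4 vs. 4,
# and count the splits whose statistic is at least as large as the real one.
splits = list(combinations(range(len(data)), len(b)))
extreme = 0
for idx in splits:
    b_like = [data[i] for i in idx]
    a_like = [data[i] for i in range(len(data)) if i not in idx]
    if mean(b_like) - mean(a_like) >= observed:
        extreme += 1

p = extreme / len(splits)  # one-tailed p-value
print(f"observed shift = {observed}, p = {p:.4f}")  # only the real split is this extreme: p = 1/70
```

The p-value is just a proportion: how many of the possible arrangements look at least as extreme as the one you actually observed. No normal curve is invoked anywhere.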

03

How this fits with other research

Manolov et al. (2022) built a free visual tool that uses the same shuffle logic. You upload your data and get a modified Brinley plot that tells you if the effect replicates across kids. It turns Jacobs’ idea into a one-click website.

Iversen (2025) goes smaller, not bigger. He says each single trial inside a session is already a mini-replication. You graph moment-to-moment responses to see stimulus control appear or vanish. Same goal as Jacobs: stop waiting for big groups and use the data you have.

Solanas et al. (2010) look like a rival at first. They offer new slope and level estimators instead of randomization tests. But the two methods answer different questions: their tool describes the size of a change, while Jacobs' tool tests whether the change is unlikely under chance. You can run both on the same data set.

04

Why it matters

Next time you run an AB design, swap your t-test for a randomization test. Free online calculators now exist, so you can still get a p-value while staying true to single-case philosophy. Your graph stays small-N, but your inference gets stronger.

→ Action — try this Monday

Plug your last AB graph into a free randomization-test website and compare the new p-value to the t-test you used before.
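If you'd rather do the comparison in a script than a website, here is a hedged sketch with hypothetical data. It pits SciPy's conventional two-sample t-test against an exact two-tailed randomization test on the same scores; the numbers are invented, and SciPy is assumed to be installed.

```python
from itertools import combinations
from statistics import mean
from scipy import stats  # for the conventional t-test

a = [2, 3, 2, 4]  # hypothetical baseline (A) scores
b = [7, 8, 6, 9]  # hypothetical treatment (B) scores

# Conventional two-sample t-test (assumes normality -- shaky with n = 4 per phase).
t_p = stats.ttest_ind(b, a).pvalue

# Exact two-tailed randomization test: how often does a shuffled 4-vs-4 split
# produce a mean shift at least as extreme as the observed one?
data = a + b
obs = abs(mean(b) - mean(a))
splits = list(combinations(range(len(data)), len(b)))
extreme = sum(
    abs(mean([data[i] for i in idx])
        - mean([data[i] for i in range(len(data)) if i not in idx])) >= obs
    for idx in splits
)
rand_p = extreme / len(splits)

print(f"t-test p = {t_p:.4f}, randomization p = {rand_p:.4f}")
```

Both p-values may land below .05 here, but only the randomization p means exactly what it says for these eight observations: the proportion of possible splits at least this extreme.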

02 · At a glance

Intervention: not applicable
Design: theoretical
Finding: not reported

03 · Original abstract

Randomization tests are a class of nonparametric statistics that determine the significance of treatment effects. Unlike parametric statistics, randomization tests do not assume a random sample, or make any of the distributional assumptions that often preclude statistical inferences about single-case data. A feature that randomization tests share with parametric statistics, however, is the derivation of a p-value. P-values are notoriously misinterpreted and are partly responsible for the putative "replication crisis." Behavior analysts might question the utility of adding such a controversial index of statistical significance to their methods, so it is the aim of this paper to describe the randomization test logic and its potentially beneficial consequences. In doing so, this paper will: (1) address the replication crisis as a behavior analyst views it, (2) differentiate the problematic p-values of parametric statistics from the, arguably, more useful p-values of randomization tests, and (3) review the logic of randomization tests and their unique fit within the behavior analytic tradition of studying behavioral processes that cut across species.

Journal of the Experimental Analysis of Behavior, 2019 · doi:10.1002/jeab.501