Assessment & Research

Statistical inference in behavior analysis: Some things significance testing does and does not do.

Branch (1999) · The Behavior Analyst, 1999
★ The Verdict

Let the second curve be your p-value—replicate, don’t calculate.

✓ Read this if you're a BCBA who writes reports with asterisks next to your graphs.
✗ Skip if you're an RBT who only collects data and never writes summaries.

01 Research in Context

01

What this study did

Branch (1999) wrote a plain-language essay. He asked one question: do p-values help or hurt behavior analysts?

He read psychology journals and spotted a pattern. Authors used big statistical tests to claim their graphs were “real.”

He argued this habit hides the actual data. The paper has no new numbers—just a warning.

02

What they found

The author found that p-values can fool us. A tiny p does not mean the behavior changed in a useful way; it does not even tell us how likely the result is to repeat.

He says the fix is simple: show the effect again in a new subject, a new room, a new day. If it repeats, it is real.
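Branch's first point can be seen in a quick simulation (a hypothetical sketch, not from the paper): with enough observations, even a clinically trivial difference produces a "significant" t statistic. The numbers below (a 0.2-point shift on a 100-point scale, 200,000 scores per condition) are invented for illustration.

```python
# Hypothetical sketch: a "significant" result from a trivially small effect.
# Two simulated score distributions differ by only 0.2 points on a
# 100-point scale -- clinically negligible -- yet with n = 200,000 per
# group the two-sample t statistic clears the .05 cutoff easily.
import random
import statistics

random.seed(1)
n = 200_000
baseline = [random.gauss(50.0, 10.0) for _ in range(n)]
treatment = [random.gauss(50.2, 10.0) for _ in range(n)]  # shift = 0.2 points

mean_diff = statistics.mean(treatment) - statistics.mean(baseline)
# Standard error of the difference between two independent means
se = (statistics.variance(baseline) / n + statistics.variance(treatment) / n) ** 0.5
t = mean_diff / se

print(f"mean difference: {mean_diff:.2f} points")   # about 0.2 -- tiny
print(f"t statistic:     {t:.1f}")                  # large anyway
print("significant at .05:", abs(t) > 1.96)         # normal approximation
```

The point is the gap between the two printouts: the effect stays negligible while the t statistic grows with sample size, which is exactly why a small p cannot certify that a behavior change matters.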

03

How this fits with other research

Lemons et al. (2015) counted every article in JEAB for 55 years. They saw inferential stats keep rising, just like Branch (1999) feared.

Busch et al. (2010) went the other way. They taught college kids to love t-tests with three short lessons. Branch (1999) would call this teaching the wrong skill.

The two papers seem to clash, but they talk past each other. Lemons et al. (2015) describe what we do; Branch (1999) tells us what we should do. Busch et al. (2010) show we can teach stats fast—yet that speed does not make the stats useful for single-case work.

04

Why it matters

You run sessions, not t-tests. Next time a graph looks good, skip the p-value. Run one more reversal or replicate with a new client. That second curve is your proof. Share that picture in your report and you follow Branch's advice: let the behavior speak, not the statistics.

→ Action — try this Monday

Pick one behavior, run one extra reversal, and plot it—no stats needed.

02 At a glance

Intervention
Not applicable
Design
Theoretical (conceptual paper)
Finding
Not reported

03 Original abstract

Significance testing plays a prominent role in behavioral science, but its value is frequently overestimated. It does not estimate the reliability of a finding, it does not yield a probability that results are due to chance, nor does it usually answer an important question. In behavioral science it can limit the reasons for doing experiments, reduce scientific responsibility, and emphasize population parameters at the expense of behavior. It can, and usually does, lead to a poor approach to theory testing, and it can also, in behavior-analytic experiments, discount reliability of data. At best, statistical significance is an ancillary aspect of a set of data, and therefore should play a relatively minor role in advancing a science of behavior.

The Behavior Analyst, 1999 · doi:10.1007/BF03391984