Predict, Control, and Replicate to Understand: How Statistics Can Foster the Fundamental Goals of Science
Swap p-values for Bayesian predictions and you can spot replicable effects before you collect the first data point.
Research in Context
What this study did
Killeen (2019) wrote a think-piece. He says p-values are broken. He wants scientists to forecast results before they run studies.
The paper is conceptual, with no new data. It maps out how Bayesian and predictive metrics can outperform null-hypothesis significance tests.
What they found
The big idea: replace p with prediction. State what you expect, collect data, then check how close you were.
This swap makes replication part of the design, not an afterthought.
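To see what "predict, then check" looks like in numbers, here is a minimal sketch (Python, not from the paper) of Killeen's p_rep statistic: the estimated probability that a same-sized replication would find an effect in the same direction. It assumes the common one-tailed normal-approximation form, and the p-values fed into it are made-up examples.

```python
# A minimal sketch of Killeen's p_rep ("probability of replication"),
# using the normal-approximation form: p_rep = Phi(z / sqrt(2)),
# where z is the z-score implied by a one-tailed p-value.
# The example p-values below are hypothetical, not from the paper.
from scipy.stats import norm

def p_rep(p_one_tailed: float) -> float:
    """Estimated probability that a same-sized replication
    finds an effect in the same direction."""
    z = norm.ppf(1 - p_one_tailed)   # z-score for the observed effect
    return norm.cdf(z / 2 ** 0.5)    # a replication doubles the sampling variance

for p in (0.005, 0.025, 0.05, 0.20):
    print(f"one-tailed p = {p:.3f}  ->  p_rep ~= {p_rep(p):.2f}")
```

Under this approximation a marginal p = .05 corresponds to roughly an 88% chance of seeing the same direction of effect again, which is exactly the kind of forecast Killeen wants stated up front rather than inferred after the fact.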
How this fits with other research
Branch (2019) agrees p-values hurt replication, but pushes behavior-analytic replication instead of Bayesian math. Both want the same fix—more built-in checks—through different doors.
Franck et al. (2019) show the Bayesian door in action. They re-analyzed delay-discounting data with credible intervals, giving the concrete recipe Killeen only describes.
Bacon et al. (1998) beat everyone to the punch. They told behavior analysts to drop inferential stats entirely and trust visual analysis. Killeen (2019) updates that call by offering Bayesian tools rather than none at all.
Why it matters
If you write or review studies, stop letting p-values do the thinking. Spell out your predicted effect size, plug it into a Bayes factor or credible interval, then run the study. This habit tells you, and every reader, how likely the finding is to repeat. Start small: add one Bayesian contrast to your next single-case design and compare the forecast to what the graph shows.
Write a numeric prediction for your next client target, then mark on the graph whether the data land inside your forecast range.
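Here is one way that "start small" might look in practice. This is a hedged sketch rather than Killeen's own procedure: a conjugate normal model for a baseline-versus-intervention difference, a 95% credible interval, a check of whether your pre-study forecast lands inside it, and a Savage-Dickey Bayes factor against "no change". The data, prior settings, and forecast value are all invented placeholders.

```python
# A sketch of one Bayesian contrast for a single-case-style comparison.
# All numbers (data, prior, forecast) are hypothetical placeholders.
import numpy as np
from scipy.stats import norm

baseline     = np.array([4, 5, 3, 4, 5], dtype=float)   # hypothetical session scores
intervention = np.array([6, 7, 5, 6, 7], dtype=float)

# Observed difference in means and its standard error.
diff = intervention.mean() - baseline.mean()
se   = np.sqrt(baseline.var(ddof=1) / len(baseline)
               + intervention.var(ddof=1) / len(intervention))

# Conjugate normal prior on the true difference (mean 0, generous spread).
prior_mu, prior_sd = 0.0, 5.0

# Normal-normal posterior for the true difference.
post_var = 1 / (1 / prior_sd**2 + 1 / se**2)
post_mu  = post_var * (prior_mu / prior_sd**2 + diff / se**2)
post_sd  = np.sqrt(post_var)

lo, hi = norm.ppf([0.025, 0.975], loc=post_mu, scale=post_sd)
print(f"posterior difference: {post_mu:.2f}  (95% credible interval {lo:.2f} to {hi:.2f})")

# Forecast made before the study: did it land inside the credible interval?
forecast = 2.5
print("forecast inside interval:", lo <= forecast <= hi)

# Savage-Dickey Bayes factor for 'some change' vs. 'no change':
# ratio of prior to posterior density at a difference of zero.
bf_10 = norm.pdf(0, prior_mu, prior_sd) / norm.pdf(0, post_mu, post_sd)
print(f"Bayes factor (change vs. no change): {bf_10:.1f}")
```

Swap in your own prior and forecast; the point is that the comparison between forecast and interval is written down before the study, not decided after the graph is drawn.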
Original abstract
Scientists abstract hypotheses from observations of the world, which they then deploy to test their reliability. The best way to test reliability is to predict an effect before it occurs. If we can manipulate the independent variables (the efficient causes) that make it occur, then ability to predict makes it possible to control. Such control helps to isolate the relevant variables. Control also refers to a comparison condition, conducted to see what would have happened if we had not deployed the key ingredient of the hypothesis: scientific knowledge only accrues when we compare what happens in one condition against what happens in another. When the results of such comparisons are not definitive, metrics of the degree of efficacy of the manipulation are required. Many of those derive from statistical inference, and many of those poorly serve the purpose of the cumulation of knowledge. Without ability to replicate an effect, the utility of the principle used to predict or control is dubious. Traditional models of statistical inference are weak guides to replicability and utility of results. Several alternatives to null hypothesis testing are sketched: Bayesian, model comparison, and predictive inference (p_rep). Predictive inference shows, for example, that the failure to replicate most results in the Open Science Project was predictable. Replicability is but one aspect of scientific understanding: it establishes the reliability of our data and the predictive ability of our formal models. It is a necessary aspect of scientific progress, even if not by itself sufficient for understanding.
Perspectives on Behavior Science, 2019 · doi:10.1007/s40614-018-0171-8