How I Learned to Stop Worrying and Love Replication Failures
Treat a failed replication as a flashlight, not a stop sign—use it to find where the procedure breaks.
Research in Context
What this study did
Perone (2019) wrote a think-piece, not an experiment. He asked one question: why do we panic when a replication fails?
He says we should treat a failed replication as a signal, not a source of shame. It tells us where our experimental controls are thin and where our boundary conditions hide.
What they found
The paper reports no new data. Instead it argues that behavior analysis's standards of within- and between-subject replication already guard the field against false positives.
When a finding does not replicate, the field should chase the mismatch, not bury it. Follow-up experiments sharpen the next study.
How this fits with other research
Lutzker et al. (1979) replicated the PASS teacher-training program in two states and got the same gains. Perone cheers this kind of result: clean replications show that a procedure travels.
Andronis et al. (1997) likewise replicated a procedure for increasing happiness; three of four adults with profound disabilities smiled more. Again, the replication held and the field moved on.
Wilder et al. (2025) ran a small replication inside a single study: they swapped high-probability (high-p) instruction sequences for medium-probability ones and still raised cooperation. Perone would call this a boundary test that tightens our understanding of a procedure's limits.
Eggleston et al. (2018) labeled their work a 'conceptual replication' yet changed both the measure and the sample. Perone warns that such loose replications can look like failures when they are simply answering different questions.
Why it matters
Next time your clinic pilot fails to match the journal study, do not trash the procedure. Run one quick check: did you keep the same measurement window, the same prompt level, the same reinforcer magnitude? Log the mismatch, adjust one variable, and test again. Turn the 'failure' into a boundary map that protects your future clients from wasted hours.
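If you already log pilots in a spreadsheet or script, a quick diff of procedural parameters makes that "run one quick check" step concrete. Below is a minimal Python sketch of the idea, assuming you record each procedure as a plain dict; the parameter names (measurement_window_s, prompt_level, reinforcer_magnitude) are hypothetical illustrations, not a tool from Perone (2019).

```python
# Hypothetical mismatch log: compare the published procedure with your pilot,
# parameter by parameter, so you can realign one variable at a time.
# All parameter names and values below are invented for illustration.

published = {
    "measurement_window_s": 300,   # 5-min observation window in the journal study
    "prompt_level": "gestural",
    "reinforcer_magnitude": 2,     # e.g., tokens delivered per correct response
}

clinic_pilot = {
    "measurement_window_s": 120,   # your pilot used a shorter window
    "prompt_level": "gestural",
    "reinforcer_magnitude": 1,
}

def procedural_mismatches(original: dict, replication: dict) -> list[tuple]:
    """Return (parameter, original_value, replication_value) for every difference."""
    keys = sorted(set(original) | set(replication))
    return [
        (k, original.get(k), replication.get(k))
        for k in keys
        if original.get(k) != replication.get(k)
    ]

# Adjust one variable at a time: pick a mismatch, realign it, re-run the session.
for param, study_val, pilot_val in procedural_mismatches(published, clinic_pilot):
    print(f"{param}: study={study_val!r} vs. pilot={pilot_val!r}")
```

Run on this example, the script flags the two mismatches (measurement_window_s and reinforcer_magnitude). Realign one, re-run the session, and log the outcome before touching the second; that is the boundary map taking shape.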
Try it: pick one past pilot that 'didn't work,' list three procedural differences from the original study, and test the most likely one next session.
Original abstract
Worries about the reproducibility of experiments in the behavioral and social sciences arise from evidence that many published reports contain false positive results. Misunderstanding and misuse of statistical procedures are key sources of false positives. In behavior analysis, however, statistical procedures have not been used much. Instead, the investigator must show that the behavior of an individual is consistent over time within an experimental condition, that the behavior changes systematically across conditions, and that these changes can be reproduced – and then the whole pattern must be shown in additional individuals. These high standards of within- and between-subject replication protect behavior analysis from the publication of false positive findings. When a properly designed and executed experiment fails to replicate a previously published finding, the failure exposes flaws in our understanding of the phenomenon under study – perhaps in recognizing the boundary conditions of the phenomenon, identifying the relevant variables, or bringing the variables under sufficient control. We must accept the contradictory findings as valid and pursue an experimental analysis of the possible reasons. In this way, we resolve the contradiction and advance our science. To illustrate, two research programs are described, each initiated because of a replication failure.
Perspectives on Behavior Science, 2019 · doi:10.1007/s40614-018-0153-x