Assessment & Research

Effects of serial dependency on the agreement between visual and statistical inference.

Jones et al. (1978) · Journal of Applied Behavior Analysis
★ The Verdict

High autocorrelation makes graphs and stats tell different stories—check it before you decide.

✓ Read this if: you're a BCBA who uses visual inspection to make phase-change calls.
✗ Skip if: you're a practitioner who relies only on statistical software packages.

01Research in Context

01

What this study did

The authors generated simulated data sets that mimicked typical single-case graphs.

Each set had different levels of autocorrelation—high, medium, or none.

They asked both human judges and a statistical time-series analysis to decide whether the data showed a real change.

Then they counted how often the two answers matched.
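The usual way to build series like these is a first-order autoregressive (AR(1)) process, where each point partly "drags along" the one before it. A minimal sketch of that idea — the function name, `phi` values, and series length are illustrative assumptions, not the paper's actual generator:

```python
import random

def simulate_ar1(n, phi, seed=0):
    """Generate an AR(1) series: each new point equals phi times the
    previous point plus fresh Gaussian noise, so a higher phi means
    stronger serial dependency (autocorrelation)."""
    rng = random.Random(seed)
    values = [rng.gauss(0, 1)]
    for _ in range(n - 1):
        values.append(phi * values[-1] + rng.gauss(0, 1))
    return values

# Hypothetical stand-ins for the study's three conditions
high_dep = simulate_ar1(30, phi=0.8)  # high autocorrelation
med_dep = simulate_ar1(30, phi=0.4)   # medium
no_dep = simulate_ar1(30, phi=0.0)    # none (independent points)
```

With `phi=0.0` every point is independent noise; as `phi` rises toward 1, neighboring points track each other more and more tightly, which is exactly the condition under which the study found visual and statistical calls diverging.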

02

What they found

When autocorrelation was high, the visual and statistical verdicts disagreed most of the time.

With no autocorrelation, the two methods usually said the same thing.

The paper warns: if your data points drag each other along, visual and statistical calls can clash.

03

How this fits with other research

A year later, Annable et al. (1979) showed that even trained visual raters agree with one another only 61% of the time.

Together the two papers tell a story: autocorrelation hurts visual-stat agreement, and even eyeball-to-eyeball agreement is shaky.

Branch (2019) picks up the thread forty years on, arguing that built-in replication—not p values—keeps behavior analysis honest.

All three papers push the same fix: tighten your method, don’t just trust the graph or the number.

04

Why it matters

Before you call a phase change real, run a quick autocorrelation check in Excel or R. If the lag-1 r is above 0.3, show the team both visual and statistical views and explain the mismatch. This small step saves you from later replication headaches.

→ Action — try this Monday

Run an autocorrelation test on your last five graphs; flag any with r > 0.3 for team review.
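The lag-1 check above needs no special package. A minimal sketch in Python (the `phase_data` values are hypothetical, and the 0.3 cutoff is the rule of thumb from this article, not a universal standard):

```python
def lag1_autocorrelation(data):
    """Lag-1 autocorrelation: how strongly each data point
    correlates with the point immediately before it."""
    n = len(data)
    mean = sum(data) / n
    numerator = sum((data[t] - mean) * (data[t - 1] - mean)
                    for t in range(1, n))
    denominator = sum((x - mean) ** 2 for x in data)
    return numerator / denominator

# Hypothetical session data from one phase of a graph
phase_data = [3, 4, 4, 5, 6, 6, 7, 8, 8, 9]
r1 = lag1_autocorrelation(phase_data)  # 24/36 ≈ 0.67
if r1 > 0.3:
    print(f"Flag for team review: lag-1 r = {r1:.2f}")
```

A steadily trending series like `phase_data` scores well above 0.3, so it would be flagged: exactly the kind of graph where a visual call and a statistical test are most likely to clash.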

02At a glance

Intervention: not applicable
Design: methodology paper
Finding: not reported

03Original abstract

Comparisons between visual and time-series inferences from behavioral data show that serial dependency in scores is likely to disrupt agreement between the two methods of analysis. If researchers follow an earlier recommendation that time-series analysis be used to supplement or confirm visual analysis, this study's findings suggest that the two methods will disagree most often when the data contain high levels of autocorrelation and when reliable behavioral changes are indicated by time-series analysis.

Journal of Applied Behavior Analysis, 1978 · doi:10.1901/jaba.1978.11-277