Assessment & Research

Machine learning to analyze single‐case graphs: A comparison to visual inspection

Lanovaz et al. (2021) · Journal of Applied Behavior Analysis 2021
★ The Verdict

Machine-learning models judged 1,480 single-case graphs more consistently and with better error balance than expert visual inspection.

✓ Read this if BCBAs who publish single-case research or supervise thesis projects.
✗ Skip if Clinicians who only treat and never graph data.

01Research in Context

01

What this study did

Lanovaz and colleagues built computer models that read single-case graphs. They trained the models on 1,024 fake AB graphs that had known answers.

Next they asked 16 board-certified behavior analysts to judge the same graphs by eye. The team then compared who got more calls right—people or code.

02

What they found

The machine-learning models made fewer wild guesses. They balanced false alarms and missed effects better than the human experts did.

Even the best visual inspectors disagreed with each other. The code gave the same verdict every time.

03

How this fits with other research

Ferron et al. (2017) already showed that masked visual analysis keeps error rates low. Lanovaz adds a new layer: let a computer do the masking for you.

Adams et al. (2024) pushed the same idea into functional-analysis data. Their simple script now agrees with experts 89 % of the time—up from 81 %—showing the field is moving fast.

Wolfe et al. (2026) looked at masked versus traditional visual analysis and found so-so agreement. The 2021 machine approach may solve that reliability problem altogether.

04

Why it matters

If you run single-case studies, you can start testing free machine-learning tools on your own graphs. Upload an AB plot, let the model vote, then compare its call with yours. Over time you will see whether the code saves you from false positives and from long team debates about ‘Do you see an effect?’

Free CEUs

Want CEUs on This Topic?

The ABA Clubhouse has 60+ free CEUs — live every Wednesday. Ethics, supervision & clinical topics.

Join Free →
→ Action — try this Monday

Feed one of your recent AB graphs into an open-source decision tool and compare the model’s verdict with your visual call.

02At a glance

Intervention
not applicable
Design
other
Finding
positive

03Original abstract

Behavior analysts commonly use visual inspection to analyze single‐case graphs, but studies on its reliability have produced mixed results. To examine this issue, we compared the Type I error rate and power of visual inspection with a novel approach—machine learning. Five expert visual raters analyzed 1,024 simulated AB graphs, which differed on number of points per phase, autocorrelation, trend, variability, and effect size. The ratings were compared to those obtained by the conservative dual‐criteria method and two models derived from machine learning. On average, visual raters agreed with each other on only 75% of graphs. In contrast, both models derived from machine learning showed the best balance between Type I error rate and power while producing more consistent results across different graph characteristics. The results suggest that machine learning may support researchers and practitioners in making fewer errors when analyzing single‐case graphs, but replications remain necessary.

Journal of Applied Behavior Analysis, 2021 · doi:10.1002/jaba.863