Assessment & Research

A survey evaluation of the reliability of visual inspection and functional analysis graphs.

Danov et al. (2008) · Behavior modification 2008
★ The Verdict

Visual inspection of FA graphs is unreliable—always get a second opinion or use statistical aids.

✓ Read this if BCBAs who run or interpret functional analyses in any setting.
✗ Skip if RBTs who do not interpret FA data.

01Research in Context

01

What this study did

Lancioni et al. (2008) asked 43 behavior analysts to look at functional analysis graphs.

Each rater judged the same 30 graphs using only their eyes.

The team then checked how often the raters agreed with each other.

02

What they found

Agreement was only moderate to low.

Different BCBAs saw different functions in the same data.

Your eyes alone can mislead you when reading FA graphs.

03

How this fits with other research

This finding echoes Matson et al. (1989), who also found poor agreement when raters had to interpret meaning from data.

Chiviacowsky et al. (2013) adds another layer: even structured tools like the MAS and QABF show weak item-level agreement.

Together these papers show a pattern—subjective judgment, whether from eyes or rating scales, needs backup data.

04

Why it matters

Stop trusting visual inspection alone. Pair every FA graph review with a second rater or use statistical aids like the Conservative Dual-Criterion. This simple step cuts the risk of picking the wrong intervention based on shaky data.

Free CEUs

Want CEUs on This Topic?

The ABA Clubhouse has 60+ free CEUs — live every Wednesday. Ethics, supervision & clinical topics.

Join Free →
→ Action — try this Monday

Have a colleague blind-review your last three FA graphs and compare conclusions.

02At a glance

Intervention
not applicable
Design
survey
Sample size
43
Population
not specified
Finding
negative

03Original abstract

Visual inspection is the primary method used to analyze graphed behavioral data produced by functional analyses of problem behavior. The purpose of this study was to examine rater reliability of functional analysis graphs using visual inspection. Forty-three participants responded to a one-time anonymous survey (N = 454) mailed to graduate programs accredited in applied behavior analysis (N = 11). Respondents were instructed to classify single-subject data arrays depicted in multielement graphs from published studies. Classification was based on function categories by Hagopian, Fisher, Thompson, and Owen-DeSchryver. Three indices based on overall inter-rater agreement, the consistency of individual raters across graphs, and the aggregate performance of raters per graph were calculated and compared. Results for all three indices of rater performance were moderate to low. Results confirm prior preliminary work indicating relatively low levels of reliability and are discussed in relation to research, training, and practice issues.

Behavior modification, 2008 · doi:10.1177/0145445508318606