Assessment & Research

Observer reliability as a function of circumstances of assessment.

Kent et al. (1977) · Journal of Applied Behavior Analysis

★ The Verdict

Reliability checks can be gamed—blind your observers, monitor sessions, and calculate reliability across independent pairs to get honest IOA.

✓ Read this if you are a BCBA who collects IOA in clinics, schools, or home programs.

✗ Skip if you are a practitioner who only uses automated data, such as electronic timing.

01 Research in Context

What this study did

The team watched how observers act when they know someone is checking their data.

They varied three things: whether observers knew when and by whom they were being checked, whether an experimenter or monitor was in the room, and whether reliability was computed within an observer group or between groups.

Then they looked at how high the IOA numbers looked under each setup.

What they found

IOA jumped when observers knew they were being checked that day.

Scores also rose when no experimenter or monitor watched the session and when reliability was computed within an observer group rather than between groups.

In short, the same video could give very different reliability scores depending on the check setup.
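
The abstract reports "occurrence reliability" without giving the formula, so the sketch below assumes the common scored-interval definition: agreements on occurrence divided by the intervals in which at least one observer scored the behavior. The function names and the ten-interval records are made up for illustration and are not data from the study. It shows why occurrence agreement is a stricter number than total agreement on the same records.

```python
def occurrence_agreement(obs_a, obs_b):
    """Percent agreement over intervals where at least one observer scored the behavior.

    Assumed definition (scored-interval IOA); the study's exact computation may differ.
    Returns None when neither observer scored any interval.
    """
    if len(obs_a) != len(obs_b):
        raise ValueError("records must cover the same intervals")
    both_scored = sum(1 for a, b in zip(obs_a, obs_b) if a and b)
    either_scored = sum(1 for a, b in zip(obs_a, obs_b) if a or b)
    if either_scored == 0:
        return None
    return 100.0 * both_scored / either_scored


def total_agreement(obs_a, obs_b):
    """Percent of all intervals on which the two records match."""
    if len(obs_a) != len(obs_b):
        raise ValueError("records must cover the same intervals")
    matches = sum(1 for a, b in zip(obs_a, obs_b) if bool(a) == bool(b))
    return 100.0 * matches / len(obs_a)


# Hypothetical interval records: 1 = behavior scored, 0 = not scored.
observer_a = [1, 1, 0, 0, 1, 0, 1, 0, 0, 0]
observer_b = [1, 0, 0, 0, 1, 0, 1, 1, 0, 0]

print(occurrence_agreement(observer_a, observer_b))  # 60.0
print(total_agreement(observer_a, observer_b))       # 80.0
```

In this made-up record, the many intervals where neither observer scored anything lift total agreement to 80, while occurrence agreement on the same data is 60.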

How this fits with other research

Normand et al. (2023) warn that researchers who wear both clinician and scientist hats can bias data; Branch et al. (1977) show even well-meaning observers can tilt numbers if the check system is loose.

Critchfield et al. (2003) found that a “reward” can secretly punish; here, a “check” can secretly inflate. Both papers scream the same message: tiny procedural details swing results.

Schmidt et al. (1969) got clean data by using clear, reset-based rules in class; Kent et al. (1977) show you need equally clear, blind, cross-pair rules to get honest IOA.

Why it matters

You can’t trust high IOA if your observers know today is check day, work unmonitored, and swap sheets with a friend. Build in blind spot-checks, rotate outside pairs, and drop in unannounced. Honest data starts with honest measurement.
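
One way to put "rotate outside pairs" and "drop in unannounced" into practice is sketched below. It is a minimal illustration, not the study's procedure: the observer IDs, group labels, and function names are hypothetical, and the group sizes simply mirror the three groups of four observers and the 22 days described in the abstract.

```python
import itertools
import random


def between_group_pairs(groups):
    """Return every observer pair whose members come from different groups."""
    pairs = []
    for members_1, members_2 in itertools.combinations(groups.values(), 2):
        pairs.extend(itertools.product(members_1, members_2))
    return pairs


def within_group_pairs(groups):
    """Return every observer pair whose members share a group."""
    pairs = []
    for members in groups.values():
        pairs.extend(itertools.combinations(members, 2))
    return pairs


def pick_check_days(days, k, seed=None):
    """Draw k check days at random so observers cannot predict them."""
    rng = random.Random(seed)
    return sorted(rng.sample(list(days), k))


# Hypothetical roster: three groups of four observers.
groups = {
    "group_1": ["A1", "A2", "A3", "A4"],
    "group_2": ["B1", "B2", "B3", "B4"],
    "group_3": ["C1", "C2", "C3", "C4"],
}

outside_pairs = between_group_pairs(groups)
inside_pairs = within_group_pairs(groups)
print(len(outside_pairs), len(inside_pairs))  # 48 18

# Keep this list private; the point is that observers do not see it.
check_days = pick_check_days(range(1, 23), k=5, seed=2024)
```

Computing IOA only from `outside_pairs`, on days drawn by `pick_check_days`, keeps partners who trained together from checking each other on a day they can anticipate.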

→ Action — try this Monday

Pick one client and have a second staff member quietly score IOA from a video recording that the first observer did not know was saved.

02 At a glance

Intervention

not applicable

Design

other

Sample size

Finding

not reported

03 Original abstract

Three factors characteristic of experimental settings were hypothesized to inflate artifactually the reliability of observational recordings: (a) knowledge by observers of when and by whom their reliability is being assessed, (b) the absence of the experimenter or a monitor to prevent cheating, and (c) computation of reliability within- (versus between-) observer group. Three groups of four observers used a standard nine-category observational code for disruptive behavior in recording from videotapes of a classroom for 22 days. Analyses revealed considerable increases in average occurrence reliability as a function of the main effects of each of the experimental factors. The specific increases in reliability associated with each of the 12 combinations of the experimental factors are presented for each category of behavior. The possible role of observer-training procedures and behavioral definitions as determiners of nonartifactual reliability is discussed.

Journal of Applied Behavior Analysis, 1977 · doi:10.1901/jaba.1977.10-317