Assessment & Research

Assessing observer accuracy in continuous recording of rate and duration: three algorithms compared.

Mudford et al. (2009) · Journal of Applied Behavior Analysis, 2009
★ The Verdict

Pick time-window IOA for high-rate or long-duration behaviors until automated sensors take over.

✓ Read this if you're a BCBA who trains staff to collect rate or duration data in clinic or classroom settings.
✗ Skip if you're already using motion sensors or AI to track behavior.

01 Research in Context

01

What this study did

The team pitted three ways to check observer agreement against each other.

They used the same video clips and ran exact agreement, block-by-block, and time-window formulas.

The goal was to see which formula best detects when two observers press their stopwatches at nearly the same time.
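The three formulas can be sketched in plain Python. This is a minimal illustration of how each is commonly defined, not the study's exact procedure; the 10-s intervals and ±2-s tolerance in the examples are illustrative choices.

```python
def exact_agreement_ioa(counts_a, counts_b):
    """Exact count-per-interval IOA: an interval agrees only if both
    observers recorded exactly the same count in it."""
    agreements = sum(1 for a, b in zip(counts_a, counts_b) if a == b)
    return 100.0 * agreements / len(counts_a)

def block_by_block_ioa(counts_a, counts_b):
    """Block-by-block IOA: each interval scores smaller/larger count
    (1.0 when both counts match, including both zero), averaged."""
    scores = []
    for a, b in zip(counts_a, counts_b):
        scores.append(1.0 if a == b else min(a, b) / max(a, b))
    return 100.0 * sum(scores) / len(scores)

def time_window_ioa(times_a, times_b, tolerance=2.0):
    """Time-window IOA on timestamped responses: a response by one
    observer is an agreement if the other observer logged one within
    +/- tolerance seconds; each timestamp can match only once."""
    unmatched_b = list(times_b)
    agreements = 0
    for t in times_a:
        match = next((u for u in unmatched_b if abs(u - t) <= tolerance), None)
        if match is not None:
            agreements += 1
            unmatched_b.remove(match)
    disagreements = (len(times_a) - agreements) + len(unmatched_b)
    return 100.0 * agreements / (agreements + disagreements)

# Two observers counting the same behavior in three 10-s intervals:
counts_a = [2, 0, 4]
counts_b = [2, 1, 3]
print(round(exact_agreement_ioa(counts_a, counts_b), 1))  # 33.3
print(round(block_by_block_ioa(counts_a, counts_b), 1))   # 58.3

# The same idea on raw stopwatch timestamps (seconds into session):
print(time_window_ioa([1.0, 15.2, 30.0], [1.5, 16.0, 45.0]))  # 50.0
```

Note why exact agreement is harsh on fast behaviors: a timing slip of a fraction of a second can push a count across an interval boundary, turning two near-identical records into a zero-agreement interval.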

02

What they found

No formula won every round.

Exact agreement punished fast or long behaviors too hard.

Block-by-block inflated agreement at low rates and durations, while time-window inflated it at high ones.

03

How this fits with other research

A review by Bigby et al. (2009) shows most papers pick one of these three formulas without testing it first.

Gilchrist et al. (2018) and Maharaj et al. (2020) skip the math fight by letting accelerometers and Kinect count for us.

Their gadgets hit over 90% match with humans, hinting that good hardware may soon make the old formula debate moot.

04

Why it matters

You still need to pick an IOA formula for today’s session.

If the behavior is quick or lasts a long time, start with time-window; it is the least likely to cry foul.

While you wait for cheap wearables to land in your clinic, keep reporting which formula you used so later teams can compare.

→ Action — try this Monday

Switch your next high-rate behavior sheet to 10-s time-window IOA and note the difference.

02 At a glance

Intervention: not applicable
Design: other
Sample size: 12
Population: not specified
Finding: inconclusive

03 Original abstract

The three algorithms most frequently selected by behavior-analytic researchers to compute interobserver agreement with continuous recording were used to assess the accuracy of data recorded from video samples on handheld computers by 12 observers. Rate and duration of responding were recorded for three samples each. Data files were compared with criterion records to determine observer accuracy. Block-by-block and exact agreement algorithms were susceptible to inflated agreement and accuracy estimates at lower rates and durations. The exact agreement method appeared to be overly stringent for recording responding at higher rates (23.5 responses per minute) and for higher relative duration (72% of session). Time-window analysis appeared to inflate accuracy assessment at relatively high but not at low response rate and duration (4.8 responses per minute and 8% of session, respectively).

Journal of Applied Behavior Analysis, 2009 · doi:10.1901/jaba.2009.42-527