Assessing observer accuracy in continuous recording of rate and duration: three algorithms compared.
Pick time-window IOA for high-rate or long-duration behaviors until automated sensors take over.
01 Research in Context
What this study did
The team pitted three ways of checking observer agreement against each other.
Twelve observers scored the same video clips on handheld computers, and the team ran exact agreement, block-by-block, and time-window formulas on their records.
The goal was to see which formula best judges how closely each observer's rate and duration data match a criterion record.
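To make the three formulas concrete for rate data, here is a minimal Python sketch. It is an illustration, not the study's code. The 10-s interval width, the 2-s tolerance, the choice to score intervals that are empty in both records as agreements, and every function name and timestamp below are assumptions made for the example.

```python
import math

def interval_counts(times, session_s, width_s=10.0):
    """Count events in each consecutive interval of width_s seconds."""
    n = math.ceil(session_s / width_s)
    counts = [0] * n
    for t in times:
        counts[min(int(t // width_s), n - 1)] += 1
    return counts

def exact_agreement(a, b, session_s, width_s=10.0):
    """Percent of intervals in which both records hold the same count.
    Assumption: intervals empty in both records count as agreements."""
    ca = interval_counts(a, session_s, width_s)
    cb = interval_counts(b, session_s, width_s)
    agree = sum(1 for x, y in zip(ca, cb) if x == y)
    return 100.0 * agree / len(ca)

def block_by_block(a, b, session_s, width_s=10.0):
    """Mean of smaller/larger count per interval (1.0 when counts match)."""
    ca = interval_counts(a, session_s, width_s)
    cb = interval_counts(b, session_s, width_s)
    scores = [1.0 if x == y else min(x, y) / max(x, y) for x, y in zip(ca, cb)]
    return 100.0 * sum(scores) / len(scores)

def time_window(a, b, tolerance_s=2.0):
    """Pair events that fall within tolerance_s of each other.
    Agreements are matched pairs; disagreements are unmatched events."""
    a, b = sorted(a), sorted(b)
    i = j = matched = 0
    while i < len(a) and j < len(b):
        if abs(a[i] - b[j]) <= tolerance_s:
            matched += 1
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1
        else:
            j += 1
    unmatched = len(a) + len(b) - 2 * matched
    if matched + unmatched == 0:
        return 100.0  # assumption: two empty records agree fully
    return 100.0 * matched / (matched + unmatched)

# Hypothetical event times in seconds for a 40-s clip.
observer = [1.0, 9.8, 15.2, 31.0]
criterion = [1.4, 10.3, 15.0, 22.0]

exact = exact_agreement(observer, criterion, 40)   # 0.0
block = block_by_block(observer, criterion, 40)    # 25.0
window = time_window(observer, criterion)          # 60.0
```

With these made-up timestamps, three of the four events line up within 2 s, yet exact agreement comes out at 0% because no 10-s interval holds the same count in both records. Block-by-block gives 25% and time-window gives 60%.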
What they found
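No formula won every round.
Exact agreement punished fast or long behaviors too hard (23.5 responses per minute, or 72% of the session).
Block-by-block and exact agreement looked better than they really were at low rates and durations, and time-window did the same at relatively high ones.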
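For duration data, the same three ideas can be sketched in Python by turning each record into a second-by-second on/off list. This is an illustration under assumptions, not the study's procedure: whole-second onsets and offsets, 10-s intervals, a second-by-second comparison standing in for the time-window formula, and all names and bout times are hypothetical.

```python
def seconds_on(bouts, session_s):
    """Per-second 0/1 list; second s is on if a bout covers [s, s+1)."""
    on = [0] * session_s
    for onset, offset in bouts:
        for s in range(max(0, onset), min(session_s, offset)):
            on[s] = 1
    return on

def interval_totals(on, width_s=10):
    """Seconds of behavior recorded in each consecutive interval."""
    return [sum(on[i:i + width_s]) for i in range(0, len(on), width_s)]

def exact_agreement_duration(on_a, on_b, width_s=10):
    """Percent of intervals with identical seconds of behavior."""
    ta, tb = interval_totals(on_a, width_s), interval_totals(on_b, width_s)
    agree = sum(1 for x, y in zip(ta, tb) if x == y)
    return 100.0 * agree / len(ta)

def block_by_block_duration(on_a, on_b, width_s=10):
    """Mean of smaller/larger duration per interval (1.0 when equal)."""
    ta, tb = interval_totals(on_a, width_s), interval_totals(on_b, width_s)
    scores = [1.0 if x == y else min(x, y) / max(x, y) for x, y in zip(ta, tb)]
    return 100.0 * sum(scores) / len(scores)

def second_by_second(on_a, on_b):
    """Seconds both records mark as on, over seconds either marks as on."""
    both = sum(1 for x, y in zip(on_a, on_b) if x and y)
    either = sum(1 for x, y in zip(on_a, on_b) if x or y)
    return 100.0 if either == 0 else 100.0 * both / either

# Hypothetical (onset, offset) bouts in whole seconds for a 60-s clip.
criterion = seconds_on([(2, 20), (25, 48)], 60)
observer = seconds_on([(3, 21), (26, 48)], 60)

exact_d = exact_agreement_duration(observer, criterion)
block_d = block_by_block_duration(observer, criterion)
window_d = second_by_second(observer, criterion)
```

In this made-up clip the observer is about a second late on most onsets and offsets. Exact agreement lands near 83%, block-by-block near 98%, and the second-by-second comparison near 93%, so the same record can look quite different depending on the formula.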
How this fits with other research
Bigby et al.'s (2009) review shows that most papers pick one of these three formulas without testing them first.
Gilchrist et al. (2018) and Maharaj et al. (2020) skip the math fight by letting accelerometers and Kinect count for us.
Their gadgets hit over 90% match with humans, hinting that good hardware may soon make the old formula debate moot.
Why it matters
You still need to pick an IOA formula for today’s session.
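If the behavior is quick or lasts a long time, start with time-window; it is the least likely to cry foul over small timing differences, though the abstract notes it can overstate accuracy at relatively high rates and durations.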
While you wait for cheap wearables to land in your clinic, keep reporting which formula you used so later teams can compare.
Switch your next high-rate behavior sheet to 10-s time-window IOA and note the difference.
02 At a glance
03 Original abstract
The three algorithms most frequently selected by behavior-analytic researchers to compute interobserver agreement with continuous recording were used to assess the accuracy of data recorded from video samples on handheld computers by 12 observers. Rate and duration of responding were recorded for three samples each. Data files were compared with criterion records to determine observer accuracy. Block-by-block and exact agreement algorithms were susceptible to inflated agreement and accuracy estimates at lower rates and durations. The exact agreement method appeared to be overly stringent for recording responding at higher rates (23.5 responses per minute) and for higher relative duration (72% of session). Time-window analysis appeared to inflate accuracy assessment at relatively high but not at low response rate and duration (4.8 responses per minute and 8% of session, respectively).
Journal of Applied Behavior Analysis, 2009 · doi:10.1901/jaba.2009.42-527