Evaluating interobserver reliability of interval data.
Compare your interval IOA to the chance-level table before you call your data reliable.
01 Research in Context
What this study did
Hopkins et al. (1977) wrote a math paper for behavior analysts. They built formulas and graphs that show what interobserver agreement should look like if two observers were just guessing.
The team focused on interval recording. They wanted a quick way to know if your 80% IOA is real or just luck.
What they found
The paper gives ready-made formulas and graphs. You plug in the per cent of intervals in which the behavior was scored. The graph tells you the exact IOA you would get by chance alone.
If your real number beats the table number, your data is probably solid. If not, you need more training or better definitions.
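Those chance levels can be reproduced from the random-chance model the abstract describes. Below is a minimal sketch assuming both observers independently score each interval with the same probability p (the paper's treatment is more general); the function names are mine, not the paper's notation.

```python
# Chance-level IOA under a simple random-chance model: each observer
# independently marks an interval as "occurrence" with probability p
# (the per cent of scored intervals, divided by 100).

def chance_overall(p: float) -> float:
    # Both score occurrence (p^2) or both score nonoccurrence ((1-p)^2).
    return p * p + (1 - p) ** 2

def chance_occurrence(p: float) -> float:
    # Intervals both score occurrence / intervals at least one scores occurrence:
    # p^2 / (2p - p^2), which simplifies to p / (2 - p).
    return p / (2 - p) if p > 0 else 0.0

def chance_nonoccurrence(p: float) -> float:
    # Mirror image of occurrence agreement, using q = 1 - p.
    q = 1 - p
    return q / (2 - q) if q > 0 else 0.0
```

If your obtained occurrence, nonoccurrence, or overall IOA sits above the matching chance value, observers are agreeing beyond what guessing would produce.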
How this fits with other research
Kangas et al. (2011) turned the 1977 tables into a free Excel file. You now click instead of doing longhand division.
Friedling et al. (1979) added a visual twist. They plotted the chance band as a gray zone on a graph so you can see at a glance if your IOA clears the bar.
Romani et al. (2018) took the idea onto a busy kids' unit. They showed that clickers and simple sheets can push real-world IOA above the chance line for children with intellectual and developmental disabilities (ID/DD).
Why it matters
Stop saying "IOA must be 80%." Check the chance table first. A 70% occurrence IOA can be excellent if the behavior is rare; a 90% overall IOA can be lousy if the behavior is common. Use the formula, the Excel tool, or the graph, whichever is fastest for you, and only trust data that beats chance.
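A quick check of those claims under the same random-chance model (both observers independently scoring at rate p). This is an illustrative sketch, not the paper's printed tables; the numbers in the comments are approximate.

```python
# Chance agreement at two behavior rates, under the random-chance model.

def chance_overall(p):
    return p * p + (1 - p) ** 2

def chance_occurrence(p):
    return p / (2 - p)

p_rare = 0.05    # behavior scored in 5% of intervals
p_common = 0.90  # behavior scored in 90% of intervals

print(round(chance_overall(p_rare), 3))     # ~0.905: 90% overall IOA is no better than chance
print(round(chance_occurrence(p_rare), 3))  # ~0.026: 70% occurrence IOA beats chance by a mile
print(round(chance_overall(p_common), 3))   # ~0.82: 90% overall IOA barely clears chance
```

The same obtained percentage means very different things depending on the behavior rate and on which index you report, which is exactly why a fixed 80% rule misleads.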
Download the Excel tool from Kangas et al. (2011), paste this week's partial-interval data, and check if your IOA beats the chance value the sheet spits out.
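If you would rather not open Excel, the same check takes a few lines of code. A sketch with made-up interval records; the variable names and data are illustrative, not from the tool.

```python
# Obtained occurrence IOA from two observers' partial-interval records,
# compared against its chance level under the random-chance model.

def occurrence_ioa(obs1, obs2):
    # Intervals both scored / intervals at least one scored.
    both = sum(a and b for a, b in zip(obs1, obs2))
    either = sum(a or b for a, b in zip(obs1, obs2))
    return both / either if either else 1.0

def chance_occurrence(p):
    return p / (2 - p) if p > 0 else 0.0

obs1 = [1, 0, 0, 1, 0, 0, 0, 1, 0, 0]  # observer A, 10 intervals
obs2 = [1, 0, 0, 1, 0, 1, 0, 1, 0, 0]  # observer B, same session

obtained = occurrence_ioa(obs1, obs2)          # 3 agreements / 4 scored = 0.75
p = (sum(obs1) + sum(obs2)) / (2 * len(obs1))  # pooled scoring rate = 0.35
print(obtained > chance_occurrence(p))         # True: this IOA beats chance
```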
02 At a glance
03 Original abstract
Previous recommendations to employ occurrence, nonoccurrence, and overall estimates of interobserver reliability for interval data are reviewed. A rationale for comparing obtained reliability to reliability that would result from a random-chance model is explained. Formulae and graphic functions are presented to allow for the determination of chance agreement for each of the three indices, given any obtained per cent of intervals in which a response is recorded to occur. All indices are interpretable throughout the range of possible obtained values for the per cent of intervals in which a response is recorded. The level of chance agreement simply changes with changing values. Statistical procedures that could be used to determine whether obtained reliability is significantly superior to chance reliability are reviewed. These procedures are rejected because they yield significance levels that are partly a function of sample sizes and because there are no general rules to govern acceptable significance levels depending on the sizes of samples employed.
Journal of Applied Behavior Analysis, 1977 · doi:10.1901/jaba.1977.10-121