Reliability of the ADI-R: multiple examiners evaluate a single case.
Seven independent raters evaluated the same toddler with the ADI-R and agreed on 94-96% of item scores, suggesting strong inter-rater reliability in this single case.
01 Research in Context
What this study did
Seven clinicians each evaluated the same toddler, a three-and-a-half-year-old girl suspected of being on the autism spectrum, using the Autism Diagnostic Interview-Revised (ADI-R). They worked alone and never talked to each other.
The team then compared every score to see how often the raters agreed.
What they found
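Agreement was high: the seven raters matched on 94-96% of item scores, with weighted kappa values of 0.80-0.88, indicating strong reliability. Agreement was not uniform, though. Raters agreed completely on 74% of items but showed poor agreement on about 10%.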
Even with only one child, the ADI-R held together well across independent examiners.
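To make the two statistics concrete, here is a minimal Python sketch of percent agreement and a linearly weighted Cohen's kappa, averaged over every pair of raters. It is an illustration under stated assumptions, not the authors' procedure: the summary does not say which weighting scheme or multi-rater method the study used, and the function names, the 0-3 code range, and the item scores below are all hypothetical.

```python
from itertools import combinations


def percent_agreement(a, b):
    """Share of items on which two raters gave the identical code."""
    if len(a) != len(b) or not a:
        raise ValueError("both raters must score the same non-empty item list")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def weighted_kappa(a, b, categories):
    """Cohen's weighted kappa for two raters, using linear disagreement weights.

    `categories` lists the ordered codes (assumed here, not taken from the study).
    """
    if len(a) != len(b) or not a:
        raise ValueError("both raters must score the same non-empty item list")
    if len(categories) < 2:
        raise ValueError("need at least two ordered categories")
    k = len(categories)
    index = {c: i for i, c in enumerate(categories)}
    n = len(a)

    # Observed joint proportions: observed[i][j] = share of items coded i by a, j by b.
    observed = [[0.0] * k for _ in range(k)]
    for x, y in zip(a, b):
        observed[index[x]][index[y]] += 1.0 / n

    row = [sum(observed[i]) for i in range(k)]
    col = [sum(observed[i][j] for i in range(k)) for j in range(k)]

    # Weighted disagreement actually observed vs. expected by chance.
    d_obs = d_exp = 0.0
    for i in range(k):
        for j in range(k):
            w = abs(i - j) / (k - 1)  # 0 for a match, 1 for the widest gap
            d_obs += w * observed[i][j]
            d_exp += w * row[i] * col[j]

    if d_exp == 0:
        return 1.0  # both raters used one and the same category throughout
    return 1.0 - d_obs / d_exp


def mean_pairwise(raters, stat):
    """Average a two-rater statistic over every pair of raters."""
    pairs = list(combinations(raters, 2))
    return sum(stat(a, b) for a, b in pairs) / len(pairs)


# Hypothetical data: three raters, eight items, codes 0-3 (not from the study).
CODES = [0, 1, 2, 3]
raters = [
    [0, 1, 2, 3, 0, 1, 2, 0],
    [0, 1, 2, 3, 0, 1, 2, 1],
    [0, 1, 2, 2, 0, 1, 2, 0],
]

print("mean percent agreement:", mean_pairwise(raters, percent_agreement))
print("mean weighted kappa:",
      mean_pairwise(raters, lambda a, b: weighted_kappa(a, b, CODES)))
```

Percent agreement counts only exact matches, while weighted kappa also corrects for chance and gives partial credit to near misses. That is why a study can report both numbers and why the kappa range sits lower than the raw agreement range.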
How this fits with other research
Tassé et al. (2013) later repeated the test with 3- to 18-year-olds in Japan and still found solid reliability, so the tool travels beyond toddlers.
de Bildt et al. (2013) looked at 1,204 Dutch children and showed that different ADI-R cut-off sets change sensitivity and specificity. High rater agreement does not guarantee perfect diagnosis; you still need to pick the right algorithm.
Constantino et al. (2003) offered a 15-minute parent scale that correlates about 0.70 with the full ADI-R. The brief form saves time, but the deep interview remains the gold standard when you can spare the hour.
Why it matters
The ADI-R can yield consistent scores when several clinicians independently rate the same case. Train new team members with this approach: have them watch a master interview, then score the recording and compare answers. Aim for the same 94% match before they fly solo. If you need a quick screen, start with the short Social Responsiveness Scale (SRS), but keep the full ADI-R in your pocket for the final diagnostic meeting.
Have your new rater watch a recorded ADI-R, score it alone, then compare item-by-item with the expert key until you hit 90% agreement.
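The item-by-item comparison can be sketched in a few lines of Python. This is only an illustration: the item labels, the codes, and the function name `agreement_with_key` are hypothetical, and the threshold is a parameter so a team can set whichever criterion it has adopted.

```python
def agreement_with_key(trainee, key):
    """Compare a trainee's item codes with an expert key.

    Both arguments map item label -> code. Returns the share of key items
    the trainee matched and the list of items that still differ.
    """
    if not key:
        raise ValueError("expert key is empty")
    missing = [item for item in key if item not in trainee]
    if missing:
        raise ValueError(f"trainee did not score: {missing}")
    mismatched = [item for item in key if trainee[item] != key[item]]
    rate = (len(key) - len(mismatched)) / len(key)
    return rate, mismatched


# Hypothetical ten-item excerpt; real ADI-R item labels and codes are not shown here.
expert_key = {f"item_{i:02d}": code
              for i, code in enumerate([0, 1, 2, 0, 3, 1, 0, 2, 1, 0], start=1)}
trainee_scores = dict(expert_key, item_03=1)  # trainee disagrees on one item

THRESHOLD = 0.90  # criterion taken from the tip above; adjust as needed
rate, to_review = agreement_with_key(trainee_scores, expert_key)
print(f"agreement: {rate:.0%}, ready: {rate >= THRESHOLD}, review: {to_review}")
```

Returning the mismatched items, not just the rate, gives the supervisor a concrete list to discuss with the trainee before the next practice recording.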
02 At a glance
03 Original abstract
The authors assessed the reliability of the Autism Diagnostic Interview (ADI-R). Seven Clinical Examiners evaluated a three and one half year old female toddler suspected of being on the Autism Spectrum. Examiners showed agreement levels of 94-96% across all items, with weighted kappa (K(w)) between .80 and .88. They were in 100% agreement on 74% of the items; in excellent agreement on 6% of the items (93-96%, with K(w) between .78 and .85); in good agreement on 7% (89-90%, with K(w) between .62 and .68); and in fair agreement on 3% (82-84%, with K(w) between .40 and .47). For the remaining 10% of ADI-R items, examiners showed poor agreement (50-81%, with K(w) between -.67 and .37).
Journal of autism and developmental disorders, 2008 · doi:10.1007/s10803-007-0448-3