The discriminative ability and diagnostic utility of the ADOS-G, ADI-R, and GARS for children in a clinical setting.

Mazefsky et al. (2006) · Autism : the international journal of research and practice

★ The Verdict

ADOS-G and ADI-R agree with clinic team diagnoses about 75% of the time, while the original GARS underestimates autism too often to trust on its own.

✓ Read this if you are a BCBA who conducts or interprets autism assessments in clinic or school settings.

✗ Skip if you already rely only on the ADOS-G/ADI-R or you work outside of diagnostic roles.

01 Research in Context

What this study did

The team compared three autism assessment instruments head-to-head in a specialty diagnostic clinic. They used the ADOS-G, ADI-R, and GARS with children referred for evaluation over a three-year span. Then they checked each tool against the final team diagnosis.

This was a real-world test, not a lab study. Kids came in for ordinary assessments and the tools had to prove their worth on the spot.

What they found

ADOS-G and ADI-R matched the clinic team about three times out of four. Most mismatches were false positives: the tools flagged autism when the team concluded otherwise.

GARS consistently underestimated the likelihood of autism. Its scores did little to separate children with different team diagnoses, making it the weakest choice of the three.
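
To make the agreement figure concrete, here is a minimal Python sketch of how agreement, sensitivity, and specificity fall out of a 2×2 table of tool result versus team diagnosis. The counts are hypothetical, chosen only so that agreement comes to 75% with most errors being false positives; they are not data from Mazefsky et al. (2006). The names `ToolVsTeam`, `agreement`, `sensitivity`, and `specificity` are illustrative, not taken from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToolVsTeam:
    """2x2 table of a tool's classification against the team diagnosis."""
    true_pos: int   # tool says autism, team says autism
    false_pos: int  # tool says autism, team says not autism
    false_neg: int  # tool says not autism, team says autism
    true_neg: int   # tool says not autism, team says not autism

    @property
    def total(self) -> int:
        return self.true_pos + self.false_pos + self.false_neg + self.true_neg


def agreement(t: ToolVsTeam) -> float:
    """Share of children where tool and team reach the same conclusion."""
    return (t.true_pos + t.true_neg) / t.total


def sensitivity(t: ToolVsTeam) -> float:
    """Share of team-diagnosed autism cases that the tool also flags."""
    return t.true_pos / (t.true_pos + t.false_neg)


def specificity(t: ToolVsTeam) -> float:
    """Share of team-diagnosed non-autism cases that the tool also clears."""
    return t.true_neg / (t.true_neg + t.false_pos)


# HYPOTHETICAL counts for illustration only; NOT data from the study.
example = ToolVsTeam(true_pos=40, false_pos=20, false_neg=5, true_neg=35)

print(f"agreement   = {agreement(example):.2f}")    # 0.75
print(f"sensitivity = {sensitivity(example):.2f}")  # 0.89
print(f"specificity = {specificity(example):.2f}")  # 0.64
```

In this made-up table the tool agrees with the team three times out of four, yet its specificity is much lower than its sensitivity because the errors are mostly false positives. Agreement alone does not show which way a tool's errors run, so it helps to look at all three numbers.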

How this fits with other research

Lecavalier (2005) saw the same GARS problem a year earlier. That study also found low sensitivity and shaky reliability, so the poor showing is no surprise.

Pandolfi et al. (2010) later showed the GARS-2 still had the same structural flaws. The subscales simply do not hold together as the manual claims.

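Yang et al. (2026) offer a brighter view, but only for the newer GARS-3. Their Chinese version reached 86–89% accuracy, which suggests the revision may have addressed some of the old faults. A single study of one translated version cannot settle the question.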

Why it matters

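If you screen or diagnose in clinic, lean on the ADOS-G or ADI-R and do not rely on the original GARS alone. Remember that even the stronger tools disagreed with the team about a quarter of the time, mostly by over-calling autism. Keep an eye on GARS-3 data as it emerges, but for now treat any GARS score as a prompt for a second tool, not as proof for or against autism.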

→ Action — try this Monday

Pull your GARS protocols and double-check any autism decision, including rule-outs, that rests solely on those scores.

02 At a glance

Intervention: not applicable

Design: other

Population: mixed clinical

Finding: mixed

03 Original abstract

Recent years have seen a surge of interest in assessment instruments for diagnosing autism in children. Instruments have generally been developed and evaluated from a research perspective. The Autism Diagnostic Observation Schedule-Generic (ADOS-G), Autism Diagnostic Interview-Revised (ADI-R), and Gilliam Autism Rating Scale (GARS) have received considerable attention and are widely used. The objective of this study was to explore the diagnostic utility and discriminative ability of these tools using a clinical population of children referred to a specialty diagnostic clinic over a 3 year time span. The results indicated that the ADOS-G and ADI-R led to approximately 75 percent agreement with team diagnoses, with most inconsistencies being false positive diagnoses based on the measures. The GARS was generally ineffective at discriminating between children with various team diagnoses and consistently underestimated the likelihood of autism. The findings have important implications for the use of these measures in both research and clinical practice.

Autism : the international journal of research and practice, 2006 · doi:10.1177/1362361306068505