Assessment & Research

The Under-Identification of Autism in Females: A Review and Analysis of Sex-Based Scoring Differences Observed in Autism Diagnostic Observation Schedule (ADOS) Module 3.

Yu et al. (2026) · Journal of Autism and Developmental Disorders
★ The Verdict

Use the rank-based anchor method to spot sex-biased ADOS-3 items before you trust the scores.

✓ Read this if you are a BCBA who administers ADOS-3 evaluations in clinic or school settings.
✗ Skip if you are an RBT who does not take part in diagnostic assessments.

01 Research in Context

01

What this study did

Yu et al. (2026) built a simulation model that tests whether ADOS-3 items measure boys and girls the same way.

They proposed a new, rank-based way to pick anchor items: score every item for bias, then keep the most invariant ones as the matching set. This method is meant to catch sex bias more cleanly than treating all other items as anchors.
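The rank-based idea can be sketched in a few lines: give every item a DIF statistic, then keep the items that look most group-invariant as anchors. Below is a minimal Python sketch, assuming dichotomous items and using a Mantel–Haenszel log-odds-ratio over rest-score strata as a stand-in for the paper's DIF statistic; the function names are illustrative, not from the authors' materials.

```python
import numpy as np

def mh_dif_stat(responses, group, item, n_strata=4):
    """Crude DIF statistic for one dichotomous item: absolute
    Mantel-Haenszel log-odds-ratio, stratified by the rest score
    (total score excluding the studied item). 0 = no apparent DIF."""
    rest = responses.sum(axis=1) - responses[:, item]
    # split examinees into roughly equal rest-score strata
    edges = np.quantile(rest, np.linspace(0, 1, n_strata + 1)[1:-1])
    strata = np.digitize(rest, edges)
    num = den = 0.0
    for s in np.unique(strata):
        m = strata == s
        a = np.sum(responses[m & (group == 0), item])      # reference, 1
        b = np.sum(1 - responses[m & (group == 0), item])  # reference, 0
        c = np.sum(responses[m & (group == 1), item])      # focal, 1
        d = np.sum(1 - responses[m & (group == 1), item])  # focal, 0
        t = a + b + c + d
        if t == 0:
            continue
        num += a * d / t
        den += b * c / t
    return abs(np.log((num + 0.5) / (den + 0.5)))  # +0.5 smoothing

def rank_based_anchors(responses, group, n_anchors=4):
    """Rank items by their DIF statistic and keep the most
    invariant ones as designated anchors."""
    stats = [mh_dif_stat(responses, group, j)
             for j in range(responses.shape[1])]
    return sorted(np.argsort(stats)[:n_anchors].tolist())
```

In a simulated dataset where one item is made much harder for the focal group, that item gets a large DIF statistic and is excluded from the anchor set, which is the behavior the rank-based strategy relies on.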

02

What they found

The new anchor method gave more accurate DIF results. DIF (differential item functioning) means an item is harder for one sex even when underlying ability is the same.

No real kids were tested; the study is a pure simulation.

03

How this fits with other research

Bottini et al. (2025) asked clinicians to score the same vignette. When the child was labeled female, clinicians saw more severe autism. This shows bias lives in people, not just test items.

Kopp et al. (2011) added girl-typical items like "avoids demands" to the ASSQ. Their tool caught girls earlier, proving the test itself can be fixed.

Backer van Ommeren et al. (2017) found girls with autism show better back-and-forth talk than boys. If ADOS still uses the same social cut-offs for both sexes, it will keep missing girls.

Kamp-Becker et al. (2013) updated the ADOS-3 algorithm for high-functioning youth. Yu et al.’s method could now check whether those new rules are also sex-fair.

04

Why it matters

Before you trust any ADOS score, run the rank-based DIF check in the spreadsheet the authors share. If an item shows sex bias, weight or skip it. Pair this with clinician self-audit tools like those Summer et al. suggest. You will cut the chance that quiet, socially masked girls walk away undiagnosed.
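As a toy illustration of the "weight or skip" step, the helper below drops flagged items before totaling; it is a hypothetical sketch, not the authors' spreadsheet or macro.

```python
import numpy as np

def adjusted_totals(responses, dif_flags):
    """Recompute per-child totals with sex-biased items dropped.
    `dif_flags[j]` is True when item j showed DIF in your check."""
    keep = ~np.asarray(dif_flags)
    return responses[:, keep].sum(axis=1)

scores = np.array([[1, 2, 0, 1],
                   [2, 2, 1, 0]])
flags = [False, True, False, False]   # item 2 flagged as biased
print(adjusted_totals(scores, flags))  # prints [2 3]
```

Down-weighting instead of skipping works the same way: multiply the flagged columns by a weight below 1 before summing.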

→ Action — try this Monday

Download the authors’ DIF macro and rerun last month’s ADOS-3 raw scores to see if any items favored one sex.

02 At a glance

Intervention: not applicable
Design: methodology paper (simulation study)
Finding: rank-based anchors produced more accurate likelihood-ratio DIF results than using all other items as anchors

03 Original abstract

Differential item functioning (DIF) occurs when items on a test or questionnaire have different measurement properties for one group of people versus another, irrespective of group-mean differences on the construct. Methods for testing DIF require matching members of different groups on an estimate of the construct. Preferably, the estimate is based on a subset of group-invariant items called designated anchors. In this research, a quick and easy strategy for empirically selecting designated anchors is proposed and evaluated in simulations. Although the proposed rank-based approach is applicable to any method for DIF testing, this article focuses on likelihood-ratio (LR) comparisons between nested two-group item response models. The rank-based strategy frequently identified a group-invariant designated anchor set that produced more accurate LR test results than those using all other items as anchors. Group-invariant anchors were more difficult to identify as the percentage of differentially functioning items increased. Advice for practitioners is offered.

Journal of Autism and Developmental Disorders, 2026 · doi:10.1177/0146621607314044