Assessment & Research

Combining voice and language features improves automated autism detection.

MacFarlane et al. (2022) · Autism Research: Official Journal of the International Society for Autism Research
★ The Verdict

Blend language and voice features from routine ADOS-2 recordings into one classifier and it separates autistic from non-autistic kids with an AUC of 0.92.

✓ Read this if you're a BCBA who conducts ADOS-2 assessments on clinic or school teams.
✗ Skip if you serve only bilingual populations where tonal languages dominate.

01 Research in Context

01

What this study did

The team ran the transcripts and audio from a single ADOS-2 task through automated analysis. The software pulled two kinds of data: how the kids talked (seven language measures covering words, pauses, and grammar) and how they sounded (ten voice measures covering pitch, tone, and speed).

Eighty-eight autistic and seventy non-autistic youths aged seven to seventeen took part. No extra tests were needed: just the same clinic recordings staff already make.
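To make the "how the kids talked" side concrete, here is a toy sketch of transcript-derived features. These are illustrative stand-ins I've invented for this summary, not the seven automated language measures the study actually computed:

```python
# Illustrative only: toy stand-ins for automated language measures,
# NOT the measures used by MacFarlane et al.
def toy_language_features(transcript):
    """Compute simple features from a list of utterance strings."""
    words = [w for utt in transcript for w in utt.split()]
    # Average number of words per utterance
    mean_utt_len = len(words) / len(transcript)
    # Type-token ratio: unique words / total words (rough lexical diversity)
    ttr = len(set(w.lower() for w in words)) / len(words)
    return {"mean_utterance_length": mean_utt_len, "type_token_ratio": ttr}

print(toy_language_features(["I like trains", "trains go fast"]))
```

Real pipelines derive measures like these from SALT-style transcripts; the point is simply that each child becomes a fixed-length vector of numbers a classifier can use.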

02

What they found

Blending language and voice clues let the classifier separate autism from non-autism with an AUC of 0.92. Voice alone scored 0.78 and language alone 0.87, so the combination is key.
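The 0.92 figure is an area under the ROC curve (AUC), not raw accuracy. A minimal sketch of how AUC is computed from classifier scores, using made-up numbers rather than the study's data:

```python
def roc_auc(labels, scores):
    """AUC = probability a random positive case scores higher than a
    random negative case (ties count half); the Mann-Whitney statistic
    normalized by the number of positive-negative pairs."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Perfect separation: every positive outranks every negative -> AUC 1.0
print(roc_auc([1, 1, 0, 0], [0.9, 0.8, 0.3, 0.4]))  # 1.0
```

An AUC of 0.92 therefore means that, for a randomly drawn autistic and non-autistic pair, the combined model ranked the autistic child higher about 92% of the time.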

03

How this fits with other research

Maes et al. (2023) looked deeper. They used only voice sounds in preschoolers and found five language profiles, not two. Their work extends this study downward: voice data stays useful, but younger kids need finer categories.

Zhao et al. (2022) swapped voice for head-movement data and still reached strong autism detection. Both papers show cheap camera or mic data can flag autism; pick the sensor that fits your clinic.

Ni et al. (2025) seems to clash. They found autistic kids miss tonal cues in a second language, hinting voice markers might fail in bilinguals. The difference is task: Ni et al. tested novel tonal words; MacFarlane et al. used native casual speech. Voice-based classification still holds for English-speaking kids.

04

Why it matters

You already record ADOS-2 sessions. Running MacFarlane et al.'s pipeline on those files gives a fast second opinion without extra child stress. If the combined score edges near the cutoff, flag the case for closer clinical review before you call a diagnosis.

Free CEUs

Want CEUs on This Topic?

The ABA Clubhouse has 60+ free CEUs — live every Wednesday. Ethics, supervision & clinical topics.

Join Free →
→ Action — try this Monday

Upload last week’s ADOS-2 video to the open-source extractor and flag any kid whose combined voice-language score sits near the autism threshold for a second clinician review.
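The flagging step above can be sketched in a few lines. The cutoff and margin values here are hypothetical placeholders, since the paper does not publish a clinical decision threshold:

```python
def flag_for_review(score, cutoff=0.5, margin=0.1):
    """Return True when a combined voice-language score sits close
    enough to the cutoff that a second clinician should review it.
    cutoff and margin are hypothetical; calibrate locally."""
    return abs(score - cutoff) < margin

# Scores near the (assumed) 0.5 cutoff get flagged; clear-cut ones don't.
print(flag_for_review(0.55))  # True
print(flag_for_review(0.92))  # False
```

The design intent: automated scores triage, they never diagnose. Borderline cases go back to a human, and clear cases still get clinical confirmation.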

02 At a glance

Intervention
not applicable
Design
other
Sample size
158
Population
autism spectrum disorder, neurotypical
Finding
positive
Magnitude
large

03 Original abstract

Variability in expressive and receptive language, difficulty with pragmatic language, and prosodic difficulties are all features of autism spectrum disorder (ASD). Quantifying language and voice characteristics is an important step for measuring outcomes for autistic people, yet clinical measurement is cumbersome and costly. Using natural language processing (NLP) methods and a harmonic model of speech, we analyzed language transcripts and audio recordings to automatically classify individuals as ASD or non-ASD. One-hundred fifty-eight participants (88 ASD, 70 non-ASD) ages 7 to 17 were evaluated with the autism diagnostic observation schedule (ADOS-2), module 3. The ADOS-2 was transcribed following modified SALT guidelines. Seven automated language measures (ALMs) and 10 automated voice measures (AVMs) for each participant were generated from the transcripts and audio of one ADOS-2 task. The measures were analyzed using support vector machine (SVM; a binary classifier) and receiver operating characteristic (ROC). The AVM model resulted in an ROC area under the curve (AUC) of 0.7800, the ALM model an AUC of 0.8748, and the combined model a significantly improved AUC of 0.9205. The ALM model better detected ASD participants who were younger and had lower language skills and shorter activity time. ASD participants detected by the AVM model had better language profiles than those detected by the language model. In combination, automated measurement of language and voice characteristics successfully differentiated children with and without autism. This methodology could help design robust outcome measures for future research. LAY SUMMARY: People with autism often struggle with communication differences which traditional clinical measures and language tests cannot fully capture. Using language transcripts and audio recordings from 158 children ages 7 to 17, we showed that automated, objective language and voice measurements successfully predict the child's diagnosis. 
This methodology could help design improved outcome measures for research.

Autism Research: Official Journal of the International Society for Autism Research, 2022