Early diagnosis of autism across developmental stages through scalable and interpretable ensemble model.
A four-model ensemble built on caregiver questionnaire answers reports about 97% to 99.9% accuracy in classifying autism from toddlers to adults.
01 Research in Context
What this study did
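The team built a stacked ensemble of four models. Random Forest, Extra Tree, and CatBoost make the first-pass predictions, and a neural network combines them. The ensemble reads caregiver answers from a short questionnaire.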
They trained it on toddler, child, teen, and adult data. Then they tested if it could spot autism at each age.
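As a rough illustration of the four-model design (per the abstract, three tree-based base classifiers feeding a neural-network meta-classifier), here is a minimal sketch using scikit-learn. It is not the authors' code. The data are synthetic yes/no answers with a made-up labelling rule, `GradientBoostingClassifier` stands in for CatBoost to avoid an extra dependency, and every hyperparameter is an illustrative assumption. The study's preprocessing steps (Safe-Level SMOTE, PCA, feature selection) are left out.

```python
import numpy as np
from sklearn.ensemble import (
    ExtraTreesClassifier,
    GradientBoostingClassifier,
    RandomForestClassifier,
    StackingClassifier,
)
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

# Synthetic stand-in for questionnaire data: 10 yes/no items per respondent.
rng = np.random.default_rng(0)
X = rng.integers(0, 2, size=(600, 10))
# Hypothetical labelling rule: flag respondents with 6 or more "yes" answers.
y = (X.sum(axis=1) >= 6).astype(int)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

base_models = [
    ("rf", RandomForestClassifier(n_estimators=100, random_state=0)),
    ("et", ExtraTreesClassifier(n_estimators=100, random_state=0)),
    # Stand-in for CatBoost (an assumption, to keep the sketch dependency-light).
    ("gb", GradientBoostingClassifier(random_state=0)),
]

# The meta-classifier is trained on cross-validated base-model probabilities.
stack = StackingClassifier(
    estimators=base_models,
    final_estimator=MLPClassifier(
        hidden_layer_sizes=(16,), max_iter=2000, random_state=0
    ),
    stack_method="predict_proba",
    cv=5,
)
stack.fit(X_train, y_train)

test_accuracy = stack.score(X_test, y_test)
print(f"held-out accuracy on synthetic data: {test_accuracy:.3f}")
```

The accuracy printed here says nothing about real screening performance. It only shows how the base models and the meta-classifier fit together.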
What they found
The ensemble reached about 98% to 99.9% accuracy in each age group and about 97% on the merged dataset. It beat any single model used alone.
The tool also shows which questions drove each decision. That helps clinicians explain results to families.
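To make "which questions drove each decision" concrete: the abstract names SHAP, which builds on Shapley values. The sketch below computes exact Shapley values by brute force for a toy three-item scoring rule. The scoring function, the item names `Q1` to `Q3`, and the all-"no" baseline are hypothetical and are not the study's model. Real SHAP libraries use faster approximations than this enumeration.

```python
from itertools import combinations
from math import factorial

ITEMS = ["Q1", "Q2", "Q3"]  # hypothetical questionnaire items


def risk_score(answers):
    """Toy model: weighted sum plus one interaction term (illustrative only)."""
    a, b, c = answers
    return 0.5 * a + 0.3 * b + 0.2 * a * c


def shapley_values(model, x, baseline):
    """Exact Shapley values of model(x) relative to model(baseline)."""
    n = len(x)

    def value(subset):
        # Items in `subset` take the respondent's answer; the rest take the baseline.
        mixed = [x[i] if i in subset else baseline[i] for i in range(n)]
        return model(mixed)

    phis = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        phi = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                phi += weight * (value(s | {i}) - value(s))
        phis.append(phi)
    return phis


respondent = [1, 1, 1]
baseline = [0, 0, 0]
contributions = shapley_values(risk_score, respondent, baseline)
for name, phi in zip(ITEMS, contributions):
    print(f"{name}: {phi:+.3f}")
```

The contributions add up to the gap between the respondent's score and the baseline score. That additivity is what lets a clinician say how much each answer moved a result.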
How this fits with other research
Older screeners like M-CHAT and SCQ miss many higher-functioning kids (C et al. 2006, W et al. 2007). The new ensemble keeps high sensitivity while pushing reported accuracy to near-perfect, so it could complement or eventually replace those forms if the results hold up in clinical use.
Gur et al. (2024) used only routine baby-checkup data and reached accuracy of 63% and up. The 2025 study lifts that idea into questionnaire space and reports markedly higher accuracy, suggesting ML screening improves once richer items are added.
Kremkow et al. (2022) reviewed digital toddler tools and said most were still proofs-of-concept. This paper answers that call with a ready-to-scale screener that works from toddlerhood to adulthood.
Why it matters
You can now screen every age group with one short form that reports very high accuracy. Use it while families wait for a full evaluation, not in place of one. The built-in explanations support transparency, which helps you meet ethical and cultural guidelines. If results replicate in your clinic, you could cut waitlists and start intervention months earlier.
Pilot the 20-item ensemble form with your next five intake families and compare its risk flag to your usual screener.
02 At a glance
03 Original abstract
Autism Spectrum Disorder (ASD) is a multifaceted neurodevelopmental condition that challenges early diagnosis due to its diverse manifestations across different developmental stages. Timely and accurate detection is essential to enable interventions that significantly enhance developmental outcomes. This study introduces a robust and interpretable machine learning framework to diagnose ASD using questionnaire data. The proposed framework leverages a stacked ensemble model, combining Random Forest (RF), Extra Tree (ET), and CatBoost (CB) as base classifiers, with an Artificial Neural Network (ANN) serving as the meta-classifier. The methodology addresses class imbalance using Safe-Level SMOTE, dimensionality reduction via Principal Component Analysis (PCA), and feature selection using Mutual Information and Pearson correlation. Evaluation on publicly available datasets representing toddlers, children, adolescents, adults, and a merged dataset (Combining children, adolescents, and adults dataset) demonstrates high diagnostic accuracy, achieving 99.86%, 99.68%, 98.17%, 99.89%, and 96.96%, respectively. Comparative analysis with standard machine learning models underscores the superior performance of the proposed framework. SHapley Additive exPlanations (SHAP) were used to interpret feature importance, while Monte Carlo Dropout (MCD) quantified uncertainty in predictions. This framework provides a scalable, interpretable, and reliable solution for ASD screening across diverse populations and developmental stages.
2025 · doi:10.3389/frai.2025.1507922
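One preprocessing step named in the abstract, feature selection by mutual information, can be illustrated on toy data. The sketch below rests on assumptions (made-up answers and labels, a plain-Python plug-in estimator) and is not the authors' pipeline. It scores how much each yes/no item tells you about the label, in bits, then ranks the items.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Plug-in estimate of I(X; Y) in bits for two discrete sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    px = Counter(xs)
    py = Counter(ys)
    mi = 0.0
    for (x, y), count in joint.items():
        p_xy = count / n
        mi += p_xy * log2(p_xy / ((px[x] / n) * (py[y] / n)))
    return mi


# Made-up data: 8 respondents, label 1 = screened positive.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
items = {
    "item_a": [1, 1, 1, 1, 0, 0, 0, 0],  # matches the label exactly
    "item_b": [1, 1, 1, 0, 0, 0, 0, 1],  # agrees on 6 of 8 respondents
    "item_c": [1, 0, 1, 0, 1, 0, 1, 0],  # unrelated to the label
}

scores = {name: mutual_information(values, labels) for name, values in items.items()}
ranked = sorted(scores, key=scores.get, reverse=True)
for name in ranked:
    print(f"{name}: {scores[name]:.3f} bits")
```

A selection step would keep the top-ranked items and drop those near zero. Where the cutoff sits is a design choice that the abstract does not specify.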