Re-evaluating preterm infants with the Bayley-III: patterns and predictors of change.
Bayley-III scores in preterm toddlers often shift enough to change delay labels, so always retest before big service decisions.
01 Research in Context
What this study did
Capio et al. (2013) gave the Bayley-III to 131 preterm infants twice: first at 8 months corrected age, then again at 20 months.
They wanted to see if the scores stayed the same or moved enough to change the "delay" label.
What they found
Cognitive, receptive language, and fine-motor scores dropped. Gross-motor scores stayed flat.
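About one in six infants moved from not delayed at 8 months to delayed at 20 months on the Language Index. About one in ten moved the other way on the Gross Motor Subscale, from delayed to not delayed. Test-retest reliability across the two visits ranged from small/fair to moderate.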
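To make those three findings concrete, here is a minimal Python sketch of how mean change, test-retest correlation, and label flips can be computed from two visits. The scores are made up for illustration and are not study data. The cutoff of 85 is an assumption, since the abstract does not state the delay criterion the authors used, and the function names are hypothetical.

```python
from math import sqrt

# Hypothetical composite scores for the same eight infants (NOT study data).
scores_8mo = [95, 100, 88, 110, 82, 90, 105, 79]
scores_20mo = [84, 97, 90, 101, 80, 83, 99, 88]

# Assumed cutoff for illustration only; the study's delay criterion may differ.
DELAY_CUTOFF = 85


def is_delayed(score, cutoff=DELAY_CUTOFF):
    """Return True when a score falls below the assumed delay cutoff."""
    return score < cutoff


def classification_changes(first, second, cutoff=DELAY_CUTOFF):
    """Count infants whose delay label differs between the two visits."""
    became_delayed = 0
    no_longer_delayed = 0
    for a, b in zip(first, second):
        if not is_delayed(a, cutoff) and is_delayed(b, cutoff):
            became_delayed += 1
        elif is_delayed(a, cutoff) and not is_delayed(b, cutoff):
            no_longer_delayed += 1
    return became_delayed, no_longer_delayed


def pearson_r(x, y):
    """Pearson correlation between paired scores (test-retest coefficient)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


became, resolved = classification_changes(scores_8mo, scores_20mo)
mean_change = sum(b - a for a, b in zip(scores_8mo, scores_20mo)) / len(scores_8mo)
retest_r = pearson_r(scores_8mo, scores_20mo)

print(f"Mean change from 8 to 20 months: {mean_change:+.2f} points")
print(f"Test-retest correlation: {retest_r:.2f}")
print(f"Not delayed -> delayed: {became} of {len(scores_8mo)}")
print(f"Delayed -> not delayed: {resolved} of {len(scores_8mo)}")
```

With these made-up numbers, two infants cross into the delayed range and one crosses out. The group mean hides that movement, which is why the flips are counted separately in each direction. On Python 3.10 or later, `statistics.correlation` can replace the hand-written `pearson_r`.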
How this fits with other research
Velikos et al. (2015) saw the same low Bayley-III scores at a single time point. Together the two papers show preterm infants usually score below average, and those scores can slide lower over time.
Yaari et al. (2016) swapped the Bayley for autism screeners and still found labels flipping. Their message matches: rescreen before you lock in a diagnosis.
Kuang et al. (2025) flipped the script. They used Bayley-III trajectories to predict later autism and got positive results. Their success shows the same shaky scores can still be useful if you watch the pattern, not just one snapshot.
Why it matters
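If you assess a preterm infant once and treat the label as final, you can be wrong in either direction. In this study, about one in six infants who were not delayed on language at 8 months were delayed by 20 months, and about one in ten who were delayed on gross motor at 8 months no longer were. Retest and look at the slope before you recommend intensive services. Use the second Bayley, or a different tool, to confirm direction. Share the uncertainty with families so they understand early scores are snapshots, not final verdicts.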
Before you lock in the IFSP, schedule a repeat Bayley-III (or an alternate assessment) for any preterm child whose delay label comes from testing before twelve months.
02 At a glance
03 Original abstract
This study investigates the Third Edition of the Bayley Scales of Infant and Toddler Development (Bayley-III) and (1) mean difference scores, (2) test-retest correlation coefficients, (3) changes in rates of delay and classification from "delayed" to "not delayed," and (4) infant birth, neonatal and sociodemographic predictors of change in scores from the first to second year of life among 131 preterm infants. Cognitive, Receptive Language and Fine Motor Subscale scores decrease and mean Gross Motor Subscale scores remain consistent from the first to second year of life. Bayley-III test-retest reliability ranged from small/fair to moderate from 8 to 20 months corrected age. Classification of delay is not stable over the first two years of life. One in 6 infants' Language Index scores changed from a classification of not delayed at 8 months to delayed at 20 months. One in 10 infants' Gross Motor Subscale scores changed from a classification of delayed at 8 months to not delayed at 20 months. Small for gestational age status predicts improved to nearly consistent Bayley Language Index and Receptive Subscale scores. Public insurance and history of sepsis predict decline in Bayley Language Index and Receptive Subscale scores from 8 to 20 months. Lower gestational age, race, and history of necrotizing enterocolitis and/or intestinal perforation also predict decline in Bayley Cognitive Index from 8 to 20 months. Predictors of decline in performance confirm known neonatal risk factors, are consistent with emerging evidence of detrimental immune related processes, and highlight the importance of inclusion of sociodemographic variables in understanding development in preterm infants.
Research in Developmental Disabilities, 2013 · doi:10.1016/j.ridd.2013.04.001