Problems and limitations in studies on screening for language delay.
Studies of language screeners keep repeating the same six design flaws. Check for small samples, dropouts, and verification bias before you trust one.
01 Research in Context
What this study did
The authors reviewed 11 recent studies of tools that claim to spot language delay early.
They looked for the same six problems that keep showing up.
They also flagged three trade-offs that no study can avoid.
What they found
Every paper had at least two of the six flaws.
The big ones were tiny samples, kids dropping out, and verification bias: only the kids who already screened positive ever got the full diagnostic work-up.
These flaws make new screeners look better than they really are; the sketch below shows how.
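Verification bias is the sneakiest of the three, so here is a toy simulation of it. Every number is invented (an 80%-sensitive, 90%-specific screener and 10% prevalence; nothing here comes from the paper), but the mechanism is real: if only screen positives get the gold-standard test, every missed child is invisible, so sensitivity looks perfect.

```python
import numpy as np

rng = np.random.default_rng(seed=0)

# Hypothetical numbers, not from the paper, chosen only to show the mechanism.
n = 100_000
prevalence = 0.10                      # true rate of language delay
sensitivity, specificity = 0.80, 0.90  # the screener's real accuracy

delayed = rng.random(n) < prevalence
screen_pos = np.where(
    delayed,
    rng.random(n) < sensitivity,        # delayed kids flagged correctly
    rng.random(n) < (1 - specificity),  # typical kids flagged by mistake
)

tp = int((screen_pos & delayed).sum())    # flagged and truly delayed
fn = int((~screen_pos & delayed).sum())   # delayed kids the screener missed

# Unbiased design: every child gets the gold-standard test.
true_sens = tp / (tp + fn)

# Verification bias: only screen positives get the gold-standard test,
# so the misses are never recorded and the observed false-negative count is 0.
observed_fn = 0
apparent_sens = tp / (tp + observed_fn)

print(f"true sensitivity:     {true_sens:.2f}")      # about 0.80
print(f"apparent sensitivity: {apparent_sens:.2f}")  # exactly 1.00
```

The same design also makes it impossible to estimate how many delayed kids the screener misses in the general population, because those kids never reach the diagnostic test at all.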
How this fits with other research
Canal-Bedia et al. (2011) ran into a related trap when they validated the Spanish M-CHAT.
Low autism prevalence in their sample dragged down the tool's positive predictive value, making it look worse than it is (the quick calculation below shows why).
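That prevalence effect is just Bayes' rule at work: positive predictive value collapses as a condition gets rarer, even when the screener itself never changes. A minimal sketch with invented accuracy figures (not the M-CHAT's published numbers):

```python
def ppv(sens: float, spec: float, prevalence: float) -> float:
    """Positive predictive value via Bayes' rule."""
    true_pos = sens * prevalence
    false_pos = (1 - spec) * (1 - prevalence)
    return true_pos / (true_pos + false_pos)

# Invented accuracy figures, not the M-CHAT's published numbers.
sens, spec = 0.85, 0.95
for prev in (0.10, 0.01, 0.005):
    print(f"prevalence {prev:5.1%} -> PPV {ppv(sens, spec, prev):5.1%}")
# prevalence 10.0% -> PPV 65.4%
# prevalence  1.0% -> PPV 14.7%
# prevalence  0.5% -> PPV  7.9%
```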
Madhesh (2024) found the same mess in quality-of-life studies for deaf teens.
Different tools gave opposite answers because each team measured quality of life differently.
Hawley et al. (2004) saw the same thin evidence when they looked at rate-building claims.
All four papers say the same thing: check the methods before you trust the numbers.
Why it matters
Before you buy or use any new language screener, flip to the methods page.
If the sample is small, half the kids quit, or only high-risk kids got the gold-standard test, stay skeptical (the sketch below shows how little a small sample can prove).
Tell families the tool is promising but not proven.
Push the publisher for better data before you bet therapy time on it.
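Why be so strict about sample size even when the headline number looks good? Because with only a couple dozen truly delayed children in the validation, the confidence interval around that number is enormous. A sketch using made-up counts and the standard Wilson score interval:

```python
from math import sqrt

def wilson_ci(successes: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """95% Wilson score interval for a proportion."""
    p = successes / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half = z * sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half, center + half

# Made-up validation result: 18 of 20 truly delayed kids were flagged.
lo, hi = wilson_ci(18, 20)
print(f"apparent sensitivity 0.90, 95% CI ({lo:.2f}, {hi:.2f})")
# apparent sensitivity 0.90, 95% CI (0.70, 0.97)
```

A tool whose true sensitivity could plausibly be anywhere from 0.70 to 0.97 is not yet a tool you can plan therapy around.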
Open the last language screener you used and circle the sample size, the attrition rate, and whether all kids got the gold-standard test. If any of those lines worries you, pick a different tool.
02 At a glance
03 Original abstract
This study discusses six common methodological limitations in screening for language delay (LD) as illustrated in 11 recent studies. The limitations are (1) whether the studies define a target population, (2) whether the recruitment procedure is unbiased, (3) attrition, (4) verification bias, (5) small sample size and (6) inconsistencies in choice of "gold standard". It is suggested that failures to specify a target population, high attrition (both at screening and in succeeding validation), small sample sizes and verification bias in validations are often caused by a misguided focus on screen positives (SPs). Other limitations are results of conflicting methodological goals. We identified three such conflicts. One consists of a dilemma between unbiased recruitment and attrition, another between the comprehensiveness of the applied gold standard and sample size in validation and the third between the specificity of the gold standard and the risk of not identifying co-morbid conditions.
Research in Developmental Disabilities, 2010 · doi:10.1016/j.ridd.2010.04.019