The behavior equivalence problem in within-subject treatment comparisons.
Match task difficulty before you compare treatments within one learner so the results show method effects, not task drift.
01 Research in Context
What this study did
Stoddard et al. (1988) wrote a how-to paper, not an experiment. They asked: how can we compare two teaching methods on the same worker if the jobs are different?
Their fix is to build small groups of tasks that are roughly equal in difficulty. They call these groups item cohorts. For each method you want to test, you pick one task from each cohort.
What they found
The paper gives a step-by-step recipe. First, rank every task by how many correct responses it takes to finish. Next, bundle tasks with close scores into cohorts. Last, draw one task per cohort for each treatment arm.
This keeps difficulty the same across treatments, so any change you see is more likely from the teaching method, not from the job being easier or harder.
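The recipe above can be sketched in a few lines of code. This is a minimal Python illustration, not the authors' procedure: the function name, the dictionary of difficulty scores, and the cohort-size choice (one task per treatment arm) are all assumptions made for the example.

```python
import random

def build_cohorts(task_scores, n_treatments, seed=0):
    """Sketch of the item-cohort recipe: rank tasks by difficulty,
    bundle adjacent scores into cohorts, and draw one task per cohort
    for each treatment arm. Names and structure are illustrative."""
    # Step 1: rank tasks from easiest to hardest by difficulty score
    # (e.g., the number of correct responses required to finish).
    ranked = sorted(task_scores, key=task_scores.get)
    # Step 2: bundle adjacent tasks into cohorts, one per treatment arm.
    cohorts = [ranked[i:i + n_treatments]
               for i in range(0, len(ranked), n_treatments)]
    # Drop a leftover cohort that cannot supply every arm.
    cohorts = [c for c in cohorts if len(c) == n_treatments]
    # Step 3: randomly assign one task from each cohort to each arm,
    # so no arm systematically gets the easier task of the pair.
    rng = random.Random(seed)
    arms = {arm: [] for arm in range(n_treatments)}
    for cohort in cohorts:
        rng.shuffle(cohort)
        for arm, task in enumerate(cohort):
            arms[arm].append(task)
    return cohorts, arms
```

With six vocational tasks and two treatments, this yields three cohorts and two difficulty-matched task lists of three tasks each.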
How this fits with other research
McMillan (1973) warned that sequence effects can fool you in within-subject designs. Stoddard et al. answer that warning with a practical tool: item cohorts.
McLennan et al. (2008) later showed that kids shift to easier work even when rewards stay equal. Their data confirm the very problem Stoddard et al.'s cohorts are designed to control.
Cariveau et al. (2021) scanned 30 years of adapted alternating treatments studies and found that most papers skip any difficulty check. They echo Stoddard et al.'s call and extend it to academic targets like sight words.
Why it matters
Next time you run an alternating treatments design, take 15 minutes to build item cohorts. List your targets, score them for difficulty, then match them across conditions. This tiny step blocks a major threat to internal validity and makes your data cleaner without extra sessions.
A quick start: rank your current targets by baseline accuracy and pair the closest scores across your two treatment sets.
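That pairing step can also be sketched in code. This is an illustrative snippet, not a published procedure; the function name and the accuracy values are assumptions for the example. It relies on a simple fact: after sorting, adjacent ranks have the closest scores.

```python
def pair_targets(baseline_accuracy):
    """Split targets into two difficulty-matched treatment sets by
    pairing the closest baseline-accuracy scores. Illustrative only;
    if the target count is odd, the first set gets the extra target."""
    # Rank targets by baseline accuracy; adjacent ranks are closest.
    ranked = sorted(baseline_accuracy, key=baseline_accuracy.get)
    set_a = ranked[0::2]  # one target from each adjacent pair
    set_b = ranked[1::2]  # its closest-scoring partner
    return set_a, set_b
```

Each pair (one target from set A, one from set B) then functions like a two-item cohort: assign one member to each treatment condition.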
02 At a glance
03 Original abstract
Within-subject comparisons of multiple treatment effects raise a variety of issues for applied researchers. They include potential nonreversibility of behaviors, practice or habituation effects resulting from repeated presentations of the same stimulus, and the possibility of multiple-treatment interference. It has recently been suggested that the use of item cohorts with equivalent behavioral difficulty addresses those problems. In order to meet the needs of researchers whose primary interest is in domestic, vocational, or other nonacademic skills, a procedure is described for estimating equivalent difficulty for different vocational preparation tasks.
Research in Developmental Disabilities, 1988 · doi:10.1016/0891-4222(88)90007-8