Effects of two sorting formats and four test criteria on equivalence class formation
Let learners stack items into piles and accept scattered correct responses to turn more trainees into equivalence-class passers.
01 Research in Context
What this study did
The team compared two ways to run a sorting test after match-to-sample training.
One way is stacking: the learner puts all A items together, then all B items, then all C items.
The other way is clustering: the learner makes little piles of A-B-C sets.
They also scored the same data under four passing rules, from strict (class-indicative sorts on the first two sorting tests in a row) to loose (two class-indicative sorting tests that need not be consecutive).
Adults without disabilities served as participants.
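To make "class-indicative sort" concrete, here is a minimal Python sketch of how one sort might be checked against the three 5-member classes named in the abstract. The stimulus labels (`A1` through `E3`, where the digit is the class), the function name, and the pile representation are all assumptions for illustration, not the authors' scoring procedure.

```python
from typing import Iterable


def is_class_indicative(
    piles: Iterable[Iterable[str]],
    n_classes: int = 3,
    class_size: int = 5,
) -> bool:
    """Return True when each pile holds exactly one full class.

    Assumption: labels look like 'A1', 'B3', where the last character
    is the experimenter-defined class. This labeling is hypothetical.
    """
    pile_sets = [set(p) for p in piles]
    if len(pile_sets) != n_classes:
        return False

    seen_classes = set()
    for pile in pile_sets:
        class_ids = {label[-1] for label in pile}
        # A pile must contain all members of one class and nothing else.
        if len(pile) != class_size or len(class_ids) != 1:
            return False
        seen_classes |= class_ids

    # Every class must appear in exactly one pile.
    return len(seen_classes) == n_classes


# One pile per class: counts as class-indicative.
by_class = [[f"{letter}{k}" for letter in "ABCDE"] for k in "123"]

# One pile per letter (five piles of three): does not count.
by_letter = [[f"{letter}{k}" for k in "123"] for letter in "ABCDE"]
```

Under these assumptions, `is_class_indicative(by_class)` is true and `is_class_indicative(by_letter)` is false, which matches the abstract's distinction between class-indicative sorts and participant-defined three-member groupings.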
What they found
Stacking plus looser rules produced the most equivalence classes.
Clustering plus strict rules produced the fewest.
In short, how you ask people to sort, and how perfect they must be, changes who passes.
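The effect of the passing rule can be sketched in a few lines of Python. The example below encodes three of the criteria as the original abstract words them (the abstract does not spell out the fourth) and applies them to one participant's four sorting tests. The function names and the reading of "nonconsecutive" as "any two tests, adjacent or not" are assumptions, so treat this as an illustration rather than the study's scoring code.

```python
from typing import Sequence


def first_two_consecutive(results: Sequence[bool]) -> bool:
    """Strict rule: the first two sorting tests are both class-indicative."""
    return len(results) >= 2 and bool(results[0]) and bool(results[1])


def any_two_successive(results: Sequence[bool]) -> bool:
    """Middle rule: any two back-to-back tests are class-indicative."""
    return any(a and b for a, b in zip(results, results[1:]))


def any_two_total(results: Sequence[bool]) -> bool:
    """Loose rule (assumed reading): any two class-indicative tests,
    whether or not they are adjacent."""
    return sum(bool(r) for r in results) >= 2


# Hypothetical participant: class-indicative on tests 1 and 3 only.
tests = [True, False, True, False]

print(first_two_consecutive(tests))  # False
print(any_two_successive(tests))     # False
print(any_two_total(tests))          # True
```

The same four results fail the two stricter rules and pass the loosest one, which is how one data set can yield different class-formation percentages.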
How this fits with other research
Arntzen et al. (2018) already showed that a tiny warm-up with delayed identity matching lifts class yields.
The new study keeps the same lab set-up but swaps the warm-up for easier test rules, giving you two levers instead of one.
Perez et al. (2020) found that blocking the view of the correct comparison hurts transitivity.
That seems opposite, yet both papers say the same thing: small test details swing the final score.
Where Perez et al. (2020) guarded against false negatives, Arntzen et al. (2025) show how to create more true positives.
Why it matters
If you run equivalence-based reading or social-skills programs, consider switching your post-test to stacking sorts and counting class-indicative sorts even when they are not consecutive.
In this study that combination let more participants pass without extra training, which could save drill time.
Change your equivalence post-test sheet from cluster boxes to three big columns labeled A, B, C, and count a pass when two sorting tests are class-indicative, even if they are not back to back.
02 At a glance
03 Original abstract
The likelihood of forming equivalence classes was influenced by the format used in sorting tests and by four different test criteria applied to the same data set. After 30 participants learned 12 conditional discriminations, MTS tests evaluated the emergence of three 5-member equivalence classes. These tests were followed by sorting tests that were conducted in clustering or stacking formats. After training, 20% of participants formed the classes. Of the 75% who did not, classes emerged for 36% and 15% of participants during stacking and clustering, respectively, with a criterion of consecutive class-indicative sorts in the first two sorting tests, and by 45% and 15% of participants during stacking and clustering, respectively, with a criterion of two successive class-indicative sorts in any of the four sorting tests. Overall, a somewhat higher percentage of participants formed classes during stacking than during clustering, sometimes on a delayed basis. Finally, even higher yields were obtained when criterion was defined as two nonconsecutive class-indicative sorting tests. When classes did not form, clustering rather than stacking tests generated larger proportions of stereotyped, participant-defined, three-member classes and two-term relations but stacking generated more one-stimulus "groupings." Thus, class formation was influenced by sorting format and the criteria used to define class emergence. Also, sorting influenced performances even during failed class formation.
Journal of the Experimental Analysis of Behavior, 2025 · doi:10.1002/jeab.70017