ABA Fundamentals

Effects of two sorting formats and four test criteria on equivalence class formation

Arntzen et al. (2025) · Journal of the Experimental Analysis of Behavior

★ The Verdict

Let learners stack items into piles and count nonconsecutive class-indicative sorts, and more of them will meet the criterion for equivalence class formation.

✓ Read this if you are a BCBA who uses stimulus-equivalence protocols to teach language or academics.

✗ Skip if you are a clinician working solely with intraverbal or motor programs.

01 Research in Context

What this study did

The team compared two ways to run a sorting test after match-to-sample training.

One way is stacking: the learner puts all A items together, then all B items, then all C items.

The other way is clustering: the learner makes little piles of A-B-C sets.

They also applied four passing criteria to the same data, ranging from strict (class-indicative sorts on both of the first two sorting tests) to loose (two class-indicative sorts that need not be consecutive).

Adults without disabilities served as participants.

What they found

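Stacking plus looser rules produced the most equivalence classes: among participants who had not formed classes after training, 45% did so under stacking versus 15% under clustering when two successive class-indicative sorts in any of the four sorting tests counted as a pass.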

Clustering plus strict rules produced the fewest.

In short, how you ask people to sort, and how perfect they must be, changes who passes.
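To make the effect of the scoring rule concrete, here is a minimal sketch in Python. It assumes each sorting test is recorded as simply class-indicative (True) or not (False), and it paraphrases three of the criteria as worded in the abstract. The function names and the example record are hypothetical, and reading "two nonconsecutive" as "any two tests, adjacent or not" is an assumption rather than the authors' stated definition.

```python
from typing import Dict, Sequence


def first_two_consecutive(sorts: Sequence[bool]) -> bool:
    """Strict rule: sorting tests 1 and 2 are both class-indicative."""
    return len(sorts) >= 2 and bool(sorts[0]) and bool(sorts[1])


def any_two_successive(sorts: Sequence[bool]) -> bool:
    """Middle rule: any two back-to-back tests are both class-indicative."""
    return any(a and b for a, b in zip(sorts, sorts[1:]))


def any_two_total(sorts: Sequence[bool]) -> bool:
    """Loose rule (assumed reading): at least two class-indicative tests,
    whether or not they are adjacent."""
    return sum(bool(s) for s in sorts) >= 2


def score(sorts: Sequence[bool]) -> Dict[str, bool]:
    """Apply all three rules to the same record of sorting tests."""
    return {
        "first_two_consecutive": first_two_consecutive(sorts),
        "any_two_successive": any_two_successive(sorts),
        "any_two_total": any_two_total(sorts),
    }


if __name__ == "__main__":
    # Hypothetical learner: class-indicative on tests 1 and 3 only.
    print(score([True, False, True, False]))
```

The same record fails both stricter rules and passes the loose one, which is the sense in which the criterion, and not only the learner's behavior, decides who counts as having formed the classes.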

How this fits with other research

Arntzen et al. (2018) already showed that a tiny warm-up with delayed identity matching lifts class yields.

The new study keeps the same lab set-up but swaps the warm-up for easier test rules, giving you two levers instead of one.

Perez et al. (2020) found that blocking the view of the correct comparison hurts transitivity.

That seems opposite, yet both papers say the same thing: small test details swing the final score.

Where Perez guarded against false negatives, Arntzen et al. (2025) show how to create more true positives.

Why it matters

If you run equivalence-based reading or social-skills programs, switch your post-test to stacking sorts and allow non-consecutive correct answers.

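In the study, that combination let more participants meet criterion without extra training, which could save drill time; note that the participants were adults without disabilities, so results with clinical learners may differ.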

→ Action — try this Monday

Change your equivalence post-test sheet from cluster boxes to three big columns labeled A, B, C and score nine correct placements total.

02 At a glance

Intervention: stimulus equivalence training

Design: single case other

Sample size: 30

Population: neurotypical

Finding: positive

Magnitude: medium

03 Original abstract

The likelihood of forming equivalence classes was influenced by the format used in sorting tests and by four different test criteria applied to the same data set. After 30 participants learned 12 conditional discriminations, MTS tests evaluated the emergence of three 5-member equivalence classes. These tests were followed by sorting tests that were conducted in clustering or stacking formats. After training, 20% of participants formed the classes. Of the 75% who did not, classes emerged for 36% and 15% of participants during stacking and clustering, respectively, with a criterion of consecutive class-indicative sorts in the first two sorting tests, and by 45% and 15% of participants during stacking and clustering, respectively, with a criterion of two successive class-indicative sorts in any of the four sorting tests. Overall, a somewhat higher percentage of participants formed classes during stacking than during clustering, sometimes on a delayed basis. Finally, even higher yields were obtained when criterion was defined as two nonconsecutive class-indicative sorting tests. When classes did not form, clustering rather than stacking tests generated larger proportions of stereotyped, participant-defined, three-member classes and two-term relations but stacking generated more one-stimulus "groupings." Thus, class formation was influenced by sorting format and the criteria used to define class emergence. Also, sorting influenced performances even during failed class formation.

Journal of the Experimental Analysis of Behavior, 2025 · doi:10.1002/jeab.70017