ABA Fundamentals

Saving the best for last? A cross-species analysis of choices between reinforcer sequences.

Andrade et al. (2012) · Journal of the Experimental Analysis of Behavior

★ The Verdict

When reinforcers can be claimed right away, both people and pigeons choose the sequence that lands the first reward fastest.

✓ Read this if you are a BCBA running token economies or delay-discounting protocols in clinics or classrooms.

✗ Skip if you work only with immediate reinforcement and no token exchange.

01 Research in Context

What this study did

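Andrade et al. (2012) let pigeons and humans choose between sequences of token reinforcers. The tokens could be exchanged for food (pigeons) or video access (humans). The three sequence types delivered the same overall reinforcement rate but arranged the delays differently. In some conditions the tokens could be traded in right away; in others they accumulated and could be exchanged only at the end of the sequence.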

The birds pecked keys; the humans clicked boxes on a screen. Researchers recorded which sequence each species chose most often.

What they found

Both pigeons and humans picked the sequence that got the first reward to them fastest when exchange was immediate. They did not care which reward came last.

When exchange was delayed until the end of the sequence, the pattern fell apart. Choices became less orderly and no longer lined up well with delay-discounting models.

How this fits with other research

Gowen et al. (2013) later saw a similar shift inside a single session: pigeons started by taking the small-quick reward, then switched to the large-late one as waits grew longer. The two studies line up: delay structure, not reward size, steers the choice.

Van der Miesen et al. (2024) looks like it disagrees. They found that pigeons failed a two-event sequence task and did not appear to learn the order of events. The difference is the task. The birds in that study could win by just pecking the last color they saw. The birds in Andrade et al. had to pick whole token sequences; no shortcut worked. Same species, different rule, so there is no real conflict.

Delano (2007) and Green et al. (2004) set the stage by showing hyperbolic decay handles pigeon choices when only one delay is in play. Andrade et al. extend that work by showing that choices stay broadly consistent with a discounting account when several delays are chained together, at least when tokens can be exchanged immediately.
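To make the discounting idea concrete, here is a minimal sketch of a parallel hyperbolic model, in which each reinforcer in a sequence is discounted by its own delay from the choice point and the values are summed. The delays, the discount rate k = 0.2, and the function names are illustrative assumptions, not values or code from the study.

```python
def hyperbolic_value(amount, delay, k):
    """Discounted value of one reinforcer: V = A / (1 + k * D)."""
    return amount / (1.0 + k * delay)


def sequence_value(delays, k=0.2, amount=1.0):
    """Sum the discounted values of every reinforcer in a sequence.

    Each delay is timed from the moment of choice, so all reinforcers
    act "in parallel" on the choice. k and amount are assumed values.
    """
    return sum(hyperbolic_value(amount, d, k) for d in delays)


# Hypothetical sequences: three reinforcers each, all finished by 30 s,
# so the overall reinforcement rate is the same for both.
early_first = [2, 16, 30]   # first reinforcer arrives quickly
late_first = [14, 22, 30]   # first reinforcer arrives late

for name, delays in [("early_first", early_first), ("late_first", late_first)]:
    print(name, round(sequence_value(delays), 3))
```

Under these assumptions the sequence with the short first delay gets the higher summed value, which matches the direction of the preference reported for the immediate-exchange conditions.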

Why it matters

If a client can cash in tokens right away, expect them to pick the schedule that delivers the first reinforcer fastest, however the later reinforcers are arranged. Making clients wait before exchange changed that pattern in both birds and humans, so a forced wait may be worth testing in self-control programs. It costs no extra tokens or larger prizes. Because choices under delayed exchange were less orderly, check each client's data before assuming the wait builds patience.

→ Action — try this Monday

Add a 30-second hold before a client can trade tokens; check if they now pick larger-later rewards more often.
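One simple way to run that check is to compare the share of larger-later picks before and after the hold is added. The sketch below assumes each trial is logged as "LL" (larger-later) or "SS" (smaller-sooner); the coding scheme, the function name, and the sample data are all made up for illustration.

```python
def proportion_larger_later(choices):
    """Return the fraction of trials on which the larger-later option was picked."""
    if not choices:
        raise ValueError("no choices recorded")
    return sum(1 for c in choices if c == "LL") / len(choices)


# Made-up session logs, one entry per choice trial.
baseline = ["SS", "SS", "LL", "SS", "SS", "LL", "SS", "SS"]
with_hold = ["SS", "LL", "LL", "SS", "LL", "LL", "SS", "LL"]

print("baseline:", proportion_larger_later(baseline))
print("with hold:", proportion_larger_later(with_hold))
```

A single pair of sessions is only a first look; repeating the comparison across several sessions gives a steadier picture of whether the hold changes anything for that client.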

02 At a glance

Intervention

not applicable

Design

other

Population

neurotypical

Finding

not reported

03 Original abstract

Two experiments were conducted to compare choices between sequences of reinforcers in pigeon (Experiment 1) and human (Experiment 2) subjects, using functionally analogous procedures. The subjects made pairwise choices among 3 sequence types, all of which provided the same overall reinforcement rate, but differed in their temporal patterning. Token reinforcement schedules were used in both experiments and the type of exchange schedule varied across blocks of sessions. Some conditions permitted immediate exchange of tokens for consumable reinforcers (food for pigeons, video access for humans); in other conditions, tokens accumulated and were exchanged for consumable reinforcers only at the end of the sequence. Choice patterns in the immediate-exchange conditions were generally similar across species, with both pigeons and humans preferring sequences with the shortest delay to the initial reinforcer in the series. The results are broadly consistent with models of temporal discounting expanded to include the impact of sequences of delayed reinforcers acting in parallel from the time of the choice. Preferences were less consistent with discounting models in the delayed exchange conditions. Questionnaire data gathered at the end of the experiment were consistent with prior results of questionnaire studies, but showed no straightforward relation to the observed choice patterns, urging caution in the extrapolation of results from one decision-making domain to the other.

Journal of the Experimental Analysis of Behavior, 2012 · doi:10.1901/jeab.2012.98-45