ABA Fundamentals

Choice in a variable environment: every reinforcer counts.

Davison et al. (2000) · Journal of the Experimental Analysis of Behavior
★ The Verdict

One reinforcer from the other choice immediately pulls behavior, but extra reinforcers from the same choice help less after the first handful.

✓ Read this if: you're a BCBA designing concurrent-choice programs in clinics or classrooms.
✗ Skip if: you run only single-schedule teaching and never offer simultaneous alternatives.

01 Research in Context

01

What this study did

The team used pigeons in a two-key setup. Each key paid off on its own variable-interval schedule.

Within every session, the relative payoff of the two keys shifted several times without warning (seven unsignaled components per session), so the birds faced a constantly changing environment. Computers logged every peck and reinforcer.
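The "sensitivity" the paper tracks is the slope of the generalized matching law, log(B1/B2) = a·log(R1/R2) + log c, fitted to response and reinforcer ratios. A minimal sketch of that fit, using made-up peck and reinforcer counts (not data from the study):

```python
import math

# Hypothetical (left pecks, right pecks, left reinforcers, right reinforcers)
# counts across three components with different programmed reinforcer ratios:
components = [
    (450, 150, 40, 10),
    (300, 300, 25, 25),
    (150, 450, 10, 40),
]

# Generalized matching law: log(B1/B2) = a * log(R1/R2) + log c,
# where slope a is "sensitivity" and log c is bias toward one key.
xs = [math.log10(rl / rr) for _, _, rl, rr in components]
ys = [math.log10(bl / br) for bl, br, _, _ in components]

# Ordinary least-squares slope and intercept:
n = len(xs)
mx, my = sum(xs) / n, sum(ys) / n
a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / \
    sum((x - mx) ** 2 for x in xs)
log_c = my - a * mx
print(f"sensitivity a = {a:.2f}, log bias = {log_c:.2f}")
```

With these illustrative counts the slope comes out near 0.8, i.e. "undermatching," the moderately high plateau the paper reports rather than perfect (a = 1.0) matching.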

02

What they found

After about eight reinforcers from the same key, extra grains barely nudged preference. One grain from the other key, however, instantly pulled the next peck to that side.

The birds acted as if each side had a short memory window: long runs helped less, but a single 'surprise' reinforcer reset the balance.
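That short-memory pattern can be illustrated with a toy "leaky integrator" (my sketch, not the authors' model): each confirming reinforcer moves preference a smaller step toward its asymptote, while one disconfirming reinforcer produces a large jump back.

```python
def update_preference(pref, side, decay=0.7):
    """Move preference (-1 = all-left, +1 = all-right) toward `side`.

    The decay term makes each same-side reinforcer add less than the
    one before (diminishing returns), while a reinforcer from the
    other side produces a proportionally large swing.
    """
    return decay * pref + (1 - decay) * side

pref = 0.0  # indifferent at the start of a component
# Eight confirming reinforcers from the right key (+1):
for n in range(8):
    new = update_preference(pref, +1)
    print(f"reinforcer {n + 1}: pref {pref:.3f} -> {new:.3f} "
          f"(gain {new - pref:+.3f})")
    pref = new

# One disconfirming reinforcer from the left key (-1):
new = update_preference(pref, -1)
print(f"disconfirming:  pref {pref:.3f} -> {new:.3f} (gain {new - pref:+.3f})")
```

Running this shows the same qualitative shape as the data: the eighth confirming gain is tiny, but the single disconfirming reinforcer moves preference more than the last several confirming ones combined.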

03

How this fits with other research

Landon et al. (2003) ran the same birds with uneven payoff ratios and saw the same pattern—streaks matter, breaks matter more. That study extends the 2000 result by showing the rule holds even when one side is twice as rich.

Boutros et al. (2011) unpacked why one reinforcer swings choice: the grain acts both as a tiny 'cue' for the next response and as a slow builder of long-term bias. Their finding extends the original by giving the single-reinforcer effect a dual job.

Beeby et al. (2017) added a third and fourth key. With more choices, the first post-reinforcer peck still chased the richest key, but the simple 'eight-reinforcer plateau' vanished. The pattern does not scale up neatly—an important boundary condition.

04

Why it matters

In therapy, kids often face 'two-key' moments: work or break, blue token or red token. This paper warns that a single reinforcer from the 'other' side can instantly reset momentum, so keep the desired side paying off early and often. It also shows that long unbroken streaks give diminishing returns, which makes the end of a streak a good moment to splice in novelty or praise from a new source to re-energize responding.

→ Action — try this Monday

When using two side-by-side tasks, deliver the first few reinforcers quickly from the target task to lock in responding, then thin gradually.

02 At a glance

Intervention · not applicable
Design · single-case (other)
Sample size · 6
Population · not specified
Finding · not reported

03 Original abstract

Six pigeons were trained in sessions composed of seven components, each arranged with a different concurrent-schedule reinforcer ratio. These components occurred in an irregular order with equal frequency, separated by 10-s blackouts. No signals differentiated the different reinforcer ratios. Conditions lasted 50 sessions, and data were collected from the last 35 sessions. In Part 1, the arranged overall reinforcer rate was 2.22 reinforcers per minute. Over conditions, number of reinforcers per component was varied from 4 to 12. In Part 2, the overall reinforcer rate was six per minute, with both 4 and 12 reinforcers per component. Within components, log response-allocation ratios adjusted rapidly as more reinforcers were delivered in the component, and the slope of the choice relation (sensitivity) leveled off at moderately high levels after only about eight reinforcers. When the carryover from previous components was taken into account, the number of reinforcers in the components appeared to have no systematic effect on the speed at which behavior changed after a component started. Consequently, sensitivity values at each reinforcer delivery were superimposable. However, adjustment to changing reinforcer ratios was faster, and reached greater sensitivity values, when overall reinforcer rate was higher. Within a component, each successive reinforcer from the same alternative ("confirming") had a smaller effect than the one before, but single reinforcers from the other alternative ("disconfirming") always had a large effect. Choice in the prior component carried over into the next component, and its effects could be discerned even after five or six reinforcers. An analysis in terms of the effects of both reinforcement and nonreinforcement is suggested.

Journal of the Experimental Analysis of Behavior, 2000 · doi:10.1901/jeab.2000.74-1