Choice with probabilistic reinforcement: effects of delay and conditioned reinforcers.
Stimuli that fill a delay can flip preference between certain and uncertain reinforcers.
Research in Context
What this study did
Pigeons pecked two keys. One key gave food every time but after a long wait. The other key gave food only sometimes but right away.
During the wait, lights blinked in different patterns. The researcher asked: do these lights change which key the birds prefer?
What they found
The blinking lights did change preference. When no-food waits were signaled by a different color than food waits, less signaled time counted against the risky key, and the birds chose the risky key more.
A math model that tracked the total time spent with the risky key's signals per food delivery fit the data best.
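The arithmetic behind that model is simple: with a payoff probability p, the bird sits through about 1/p delays for each food delivery, so the signaled time per reinforcer is roughly delay/p. A minimal sketch of that idea, assuming a Mazur-style hyperbolic discounting form V = A/(1 + kD) with illustrative values for k and A (not the paper's fitted parameters):

```python
def cumulative_time_per_reinforcer(delay_s, p):
    # On average 1/p trials per food delivery, each adding delay_s
    # of signaled stimulus time -> delay_s / p seconds per reinforcer.
    return delay_s / p

def hyperbolic_value(total_delay_s, amount=1.0, k=0.2):
    # Mazur-style hyperbolic discounting; amount and k are illustrative.
    return amount / (1.0 + k * total_delay_s)

# The paper's risky key: 5-s delay, p = .2
D = cumulative_time_per_reinforcer(5.0, 0.2)
print(D)                    # 25.0 seconds of signaled time per food
print(hyperbolic_value(D))  # ~0.167, the discounted value of that food
```

On this account, changing the lights on no-food waits changes which seconds get counted in D, which is exactly the manipulation the study ran.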
How this fits with other research
Sponheim (1996) ran a nearly identical test. At first, about half the birds preferred the risky key; when the color changes were removed, that preference vanished. Together, the two papers show that signals during the wait, not the wait itself, drive the choice.
Bailey et al. (1990) came first, showing that birds quickly learn which side pays more. The 1991 study adds that even after learning, moment-to-moment cues can still flip the choice.
Mazur (2014) swapped pigeons for rats and used token lights. Tokens stacked up like delay signals and steered choice the same way. The effect crosses species and tasks.
Why it matters
Your client also waits. A timer, a spinning icon, or your voice can act like the pigeon's blinking light. Pick signals that make the wait feel short and valuable. Change the signal, not just the reinforcer, when you want to shift choice.
Add a preferred blinking light or song to the wait period before a sure reinforcer and watch if your client picks that option more often.
Original abstract
Two experiments measured pigeons' choices between probabilistic reinforcers and certain but delayed reinforcers. In Experiment 1, a peck on a red key led to a 5-s delay and then a possible reinforcer (with a probability of .2). A peck on a green key led to a certain reinforcer after an adjusting delay. This delay was adjusted over trials so as to estimate an indifference point, or a duration at which the two alternatives were chosen about equally often. In all conditions, red houselights were present during the 5-s delay on reinforced trials with the probabilistic alternative, but the houselight colors on nonreinforced trials differed across conditions. Subjects showed a stronger preference for the probabilistic alternative when the houselights were a different color (white or blue) during the delay on nonreinforced trials than when they were red on both reinforced and nonreinforced trials. These results supported the hypothesis that the value or effectiveness of a probabilistic reinforcer is inversely related to the cumulative time per reinforcer spent in the presence of stimuli associated with the probabilistic alternative. Experiment 2 tested some quantitative versions of this hypothesis by varying the delay for the probabilistic alternative (either 0 s or 2 s) and the probability of reinforcement (from .1 to 1.0). The results were best described by an equation that took into account both the cumulative durations of stimuli associated with the probabilistic reinforcer and the variability in these durations from one reinforcer to the next.
Journal of the Experimental Analysis of Behavior, 1991 · doi:10.1901/jeab.1991.55-63
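The adjusting-delay procedure described in the abstract can be sketched as a simple staircase. This is an illustrative simulation, not the paper's code: the "subject" here is an ideal hyperbolic discounter, and k, the step size, and the trial count are assumed values.

```python
def titrate_indifference(prob_delay=5.0, p=0.2, k=0.2,
                         start=5.0, step=0.5, trials=500):
    """Staircase estimate of the indifference delay for the certain key.
    While the certain option is the more valuable one, its delay grows;
    otherwise it shrinks - settling where the two values match."""
    # Value of the risky key under the cumulative-time model:
    # signaled time per reinforcer is prob_delay / p.
    v_prob = 1.0 / (1.0 + k * prob_delay / p)
    d = start
    for _ in range(trials):
        v_certain = 1.0 / (1.0 + k * d)
        d += step if v_certain > v_prob else -step
        d = max(0.0, d)
    return d

# The cumulative-time model predicts indifference near prob_delay / p = 25 s.
print(titrate_indifference())  # converges to about 25.0
```

With these assumed parameters the staircase settles within one step of 25 s, matching the simple prediction that a .2-probability reinforcer behind a 5-s delay is worth a certain reinforcer behind a 25-s delay.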