Choice with probabilistic reinforcement: effects of delay and conditioned reinforcers.
Stimuli that fill a delay can flip preference between certain and uncertain reinforcers.
Research in Context
What this study did
Pigeons pecked two keys. One key gave food every time but after a long wait. The other key gave food only sometimes but right away.
During the wait, lights blinked in different patterns. The researcher asked: do these lights change which key the birds prefer?
What they found
The blinking lights did change preference. When no-food waits were signaled by a different color than food waits, less signaled time counted against the risky key, and the birds chose the risky key more.
A math model that tracked the total time spent with the risky key's signals per food delivery fit the data best.
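The arithmetic behind that model is simple: with a payoff probability p, the bird sits through about 1/p delays for each food delivery, so the signaled time per reinforcer is roughly delay/p. A minimal sketch of that idea, assuming a Mazur-style hyperbolic discounting form V = A/(1 + kD) with illustrative values for k and A (not the paper's fitted parameters):

```python
def cumulative_time_per_reinforcer(delay_s, p):
    # On average 1/p trials per food delivery, each adding delay_s
    # of signaled stimulus time -> delay_s / p seconds per reinforcer.
    return delay_s / p

def hyperbolic_value(total_delay_s, amount=1.0, k=0.2):
    # Mazur-style hyperbolic discounting; amount and k are illustrative.
    return amount / (1.0 + k * total_delay_s)

# The paper's risky key: 5-s delay, p = .2
D = cumulative_time_per_reinforcer(5.0, 0.2)
print(D)                    # 25.0 seconds of signaled time per food
print(hyperbolic_value(D))  # ~0.167, the discounted value of that food
```

On this account, changing the lights on no-food waits changes which seconds get counted in D, which is exactly the manipulation the study ran.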
How this fits with other research
Sponheim (1996) ran a nearly identical test. At first, about half the birds preferred the risky key; when the color changes were removed, that preference vanished. Together, the two papers show that signals during the wait, not the wait itself, drive the choice.
Bailey et al. (1990) came first, showing that birds quickly learn which side pays more. The 1991 study adds that even after learning, moment-to-moment cues can still flip the choice.
Mazur (2014) swapped pigeons for rats and used token lights. Tokens stacked up like delay signals and steered choice the same way. The effect crosses species and tasks.
Why it matters
Your client also waits. A timer, a spinning icon, or your voice can act like the pigeon's blinking light. Pick signals that make the wait feel short and valuable. Change the signal, not just the reinforcer, when you want to shift choice.
Add a preferred blinking light or song to the wait period before a sure reinforcer and watch if your client picks that option more often.
Original abstract
Two experiments measured pigeons' choices between probabilistic reinforcers and certain but delayed reinforcers. In Experiment 1, a peck on a red key led to a 5-s delay and then a possible reinforcer (with a probability of .2). A peck on a green key led to a certain reinforcer after an adjusting delay. This delay was adjusted over trials so as to estimate an indifference point, or a duration at which the two alternatives were chosen about equally often. In all conditions, red houselights were present during the 5-s delay on reinforced trials with the probabilistic alternative, but the houselight colors on nonreinforced trials differed across conditions. Subjects showed a stronger preference for the probabilistic alternative when the houselights were a different color (white or blue) during the delay on nonreinforced trials than when they were red on both reinforced and nonreinforced trials. These results supported the hypothesis that the value or effectiveness of a probabilistic reinforcer is inversely related to the cumulative time per reinforcer spent in the presence of stimuli associated with the probabilistic alternative. Experiment 2 tested some quantitative versions of this hypothesis by varying the delay for the probabilistic alternative (either 0 s or 2 s) and the probability of reinforcement (from .1 to 1.0). The results were best described by an equation that took into account both the cumulative durations of stimuli associated with the probabilistic reinforcer and the variability in these durations from one reinforcer to the next.
Journal of the Experimental Analysis of Behavior, 1991 · doi:10.1901/jeab.1991.55-63
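The adjusting-delay procedure described in the abstract can be sketched as a simple staircase. This is an illustrative simulation, not the paper's code: the "subject" here is an ideal hyperbolic discounter, and k, the step size, and the trial count are assumed values.

```python
def titrate_indifference(prob_delay=5.0, p=0.2, k=0.2,
                         start=5.0, step=0.5, trials=500):
    """Staircase estimate of the indifference delay for the certain key.
    While the certain option is the more valuable one, its delay grows;
    otherwise it shrinks - settling where the two values match."""
    # Value of the risky key under the cumulative-time model:
    # signaled time per reinforcer is prob_delay / p.
    v_prob = 1.0 / (1.0 + k * prob_delay / p)
    d = start
    for _ in range(trials):
        v_certain = 1.0 / (1.0 + k * d)
        d += step if v_certain > v_prob else -step
        d = max(0.0, d)
    return d

# The cumulative-time model predicts indifference near prob_delay / p = 25 s.
print(titrate_indifference())  # converges to about 25.0
```

With these assumed parameters the staircase settles within one step of 25 s, matching the simple prediction that a .2-probability reinforcer behind a 5-s delay is worth a certain reinforcer behind a 25-s delay.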