Probability learning as a function of momentary reinforcement probability.
Keep reinforcement probability at 0.80 or higher and use two clear cues so learners quickly lock in the best choice rule.
01 Research in Context
What this study did
Wright (1972) worked with pigeons in a small chamber.
The birds pecked left or right keys.
Color and position cues told them which key paid off.
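Each key paid off half the time overall, but the momentary payoff chance depended on the outcome of the previous trial: repeating the just-reinforced response paid off with probability 0.80 or 0.65.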
The team watched how fast the birds learned the best rule: stay after a win, shift after a loss.
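The schedule can be sketched as a short simulation. This is a minimal illustration under a loudly stated assumption, not Wright's actual procedure: exactly one key is correct on each trial, and the correct key repeats on the next trial with probability `p_repeat` (0.80 or 0.65). Under that assumption each key is correct on about half of all trials, which matches the overall 0.50 in the original abstract. The function and strategy names are hypothetical.

```python
import random


def other(key):
    """Return the opposite key."""
    return "R" if key == "L" else "L"


def win_stay_lose_shift(choice, won, rng):
    # Repeat the response after a reinforced trial, switch after a miss.
    return choice if won else other(choice)


def always_left(choice, won, rng):
    # Position preference: ignores the previous outcome entirely.
    return "L"


def simulate(strategy, p_repeat, n_trials=100_000, seed=0):
    """Return the fraction of reinforced trials for one strategy.

    ASSUMED schedule (illustrative only): one key is correct per trial,
    and the correct key repeats on the next trial with probability p_repeat.
    """
    rng = random.Random(seed)
    correct = rng.choice("LR")
    choice = rng.choice("LR")
    wins = 0
    for _ in range(n_trials):
        won = choice == correct
        wins += won
        # The subject picks its next response from the last outcome.
        choice = strategy(choice, won, rng)
        # The schedule then keeps or switches the correct key.
        if rng.random() >= p_repeat:
            correct = other(correct)
    return wins / n_trials


if __name__ == "__main__":
    for p in (0.80, 0.65):
        print(p, simulate(win_stay_lose_shift, p), simulate(always_left, p))
```

Under this assumed schedule, win-stay, lose-shift earns reinforcement on roughly `p_repeat` of trials, while a fixed key preference earns about 0.50. The advantage of the optimal rule over a position preference is therefore about 0.30 at 0.80 but only about 0.15 at 0.65.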
What they found
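At 0.80, with both position and color as cues, the birds learned the win-stay, lose-shift rule.
When the chance dropped to 0.65, they did not learn it.
With color as the only relevant cue, learning was much slower and appeared only in birds trained on large fixed-ratio requirements.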
Multiple clear cues plus high payoff gave the fastest learning.
How this fits with other research
Joyce et al. (1988) saw a similar drop: lower overall payoff weakens choice control.
Their concurrent VI data match Wright’s 0.65 dip.
McSweeney et al. (1993) warn that responding drifts within a session.
Wright’s moment-to-moment changes fit that drift picture.
Together the three papers say: keep payoff high and watch for within-session slide.
Why it matters
When you thin reinforcement, check that the reinforcement probability stays at 0.80 or higher.
Add two salient cues—color plus place—to protect learning.
Track within-session data; if accuracy falls late, bump the rate or add cues.
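One way to track within-session data is to split each session into consecutive blocks of trials and compare the first block with the last. The sketch below assumes trial outcomes are recorded as 1 (correct) or 0 (incorrect). The function names, the four-block split, and the 0.15 drop threshold are hypothetical choices for illustration, not values from any of the cited studies.

```python
def block_accuracy(trials, n_blocks=4):
    """Return the proportion correct in each consecutive block of a session.

    trials: list of 1 (correct) / 0 (incorrect) in the order they occurred.
    Any leftover trials from uneven division go into the last block.
    """
    if n_blocks < 1 or len(trials) < n_blocks:
        raise ValueError("need at least one trial per block")
    size = len(trials) // n_blocks
    blocks = [trials[i * size:(i + 1) * size] for i in range(n_blocks - 1)]
    blocks.append(trials[(n_blocks - 1) * size:])
    return [sum(block) / len(block) for block in blocks]


def late_session_drop(trials, n_blocks=4, threshold=0.15):
    """Flag a session whose last block is at least `threshold` below its first.

    The threshold is an arbitrary illustrative value; set it from your own
    baseline data.
    """
    accuracy = block_accuracy(trials, n_blocks)
    return (accuracy[0] - accuracy[-1]) >= threshold


if __name__ == "__main__":
    session = [1] * 20 + [1, 0] * 5 + [0] * 10  # accuracy falls late
    print(block_accuracy(session))
    print(late_session_drop(session))
```

A flagged session is only a prompt to look closer. Whether to raise the reinforcement rate or add cues remains a clinical judgment.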
Add a second visual cue next to the target and raise the reinforcement rate to 80% for the first week of a new discrimination task.
02 At a glance
03 Original abstract
Pigeons were trained on a probability learning task where the overall reinforcement probability was 0.50 for each response alternative but where the momentary reinforcement probability differed and depended upon the outcome of the preceding trial. In all cases, the maximum reinforcement occurred with a "win-stay, lose-shift" response pattern. When both position and color were relevant cues, the optimal response pattern was learned when the reinforcement probability for repeating the just-reinforced response was 0.80 but not when the probability was 0.65. When only color was relevant, learning occurred much more slowly, and only for subjects trained on large fixed ratio requirements.
Journal of the Experimental Analysis of Behavior, 1972 · doi:10.1901/jeab.1972.17-363