Choice in a variable environment: effects of unequal reinforcer distributions.
A single break in a reinforcement streak can erase built-up preference, so plan your reinforcers like a dynamic playlist, not a fixed loop.
Research in Context
What this study did
Landon et al. (2003) worked with pigeons in a lab.
The birds pecked two keys that gave grain on different schedules.
Within each session, the odds of grain shifted between the keys, and overall one key paid off more than the other.
The team tracked every peck to see how past wins guided next choices.
What they found
A single grain delivery on the other key after a long streak (a discontinuation) reset preference hard, back toward the session's overall reinforcer ratio.
Long runs of wins on one side built a slow, steady bias.
The birds acted as if the last surprise wiped the scoreboard clean.
How this fits with other research
Mueller et al. (2000) showed each lone reinforcer shifts choice right away.
Landon et al. (2003) add that the shift grows if it ends a long run.
DeRoma et al. (2004) zoomed in on visit patterns the following year.
They found that pigeons' visits flip quickly after each grain delivery.
Together, the papers trace effects at nested time scales: a single reinforcer, then a streak and its break, then the visit pattern that follows.
Beeby et al. (2017) widened the task to three and four keys.
Their birds still went to the richest alternative first, suggesting the pattern is not limited to two options.
Why it matters
Your client’s reinforcement history is alive in each moment.
If you move where the payoff comes from, plan for a quick shift in preference.
Watch for streaks: they quietly raise bias until one surprise swings it back.
Use dense reward runs to build momentum, then insert an odd win elsewhere to keep responding flexible.
After three quick reinforcers on one task, deliver the next reinforcer on the alternate task to keep the learner's allocation fresh.
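The rule of thumb above can be written down as a tiny decision helper. This is a hypothetical sketch, not a published protocol: the function name, the 'A'/'B' task labels, and the streak limit of three are all assumptions made here for illustration.

```python
def next_reinforcer_target(reinforcer_log, streak_limit=3):
    """Return the task to reinforce next, or None when no switch is needed.

    reinforcer_log: task labels ('A' or 'B') in delivery order.
    After `streak_limit` consecutive reinforcers on one task, recommend
    the alternate task, to break the streak before preference hardens.
    (Illustrative rule of thumb only.)
    """
    recent = reinforcer_log[-streak_limit:]
    if len(recent) == streak_limit and len(set(recent)) == 1:
        # A streak at the limit: suggest a win on the other task.
        return 'B' if recent[-1] == 'A' else 'A'
    return None  # no streak at the limit: reinforce as planned
```

For example, `next_reinforcer_target(['A', 'A', 'A'])` recommends `'B'`, while a mixed recent history returns `None`, leaving delivery unchanged.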
Original abstract
Six pigeons were trained in a procedure in which sessions included seven unsignaled components, each offering two pecking keys, and each providing a potentially different reinforcer ratio between the two keys. Across conditions, various combinations of reinforcer ratios and reinforcer-magnitude ratios were used to create unequal reinforcer distributions between the two alternatives when averaged across a session. The results extended previous research using the same basic procedure that had included only reinforcer distributions symmetrical around 1:1. Data analyses suggested that the variables controlling choice operated at a number of levels: First, individual reinforcers had local effects on choice; second, sequences of successive reinforcers obtained at the same alternative (continuations) had cumulative effects; and, third, when these sequences themselves occurred with greater frequency, their effects further cumulated. A reinforcer obtained at the other alternative following a sequence of continuations (a discontinuation) had a large effect and apparently reset choice to levels approximating the sessional reinforcer ratio.
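The abstract describes three levels of control: a local effect of each reinforcer, a cumulative effect of continuations, and a reset to roughly the sessional reinforcer ratio after a discontinuation. A toy simulation can make those moving parts concrete. This is a minimal sketch under assumptions made here, not the authors' model: the preference scale (a log left/right ratio), the parameter values, and the reset-then-pulse update rule are all illustrative choices.

```python
def simulate_preference(reinforcers, local=0.1, growth=0.05,
                        sessional_log_ratio=0.0):
    """Toy model of choice under Landon et al.'s three levels (illustrative).

    reinforcers: sequence of 'L'/'R' marking which key each reinforcer
    was obtained at. Returns the preference (log L/R ratio) after each one.
    """
    pref = sessional_log_ratio  # start at the session-level ratio
    run_key, run_len = None, 0
    history = []
    for key in reinforcers:
        sign = 1.0 if key == 'L' else -1.0
        if key == run_key:
            # Continuation: local effect plus a bonus growing with run length.
            run_len += 1
            pref += sign * (local + growth * run_len)
        else:
            if run_len >= 2:
                # Discontinuation after a run: reset toward sessional ratio.
                pref = sessional_log_ratio
            pref += sign * local  # local effect of this reinforcer
            run_key, run_len = key, 1
        history.append(pref)
    return history
```

Running it on a streak followed by one win on the other key, e.g. `simulate_preference("LLLLR")`, shows preference climbing faster and faster during the run, then snapping back near the sessional ratio after the discontinuation, which mirrors the abstract's description qualitatively.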
Journal of the Experimental Analysis of Behavior, 2003 · doi:10.1901/jeab.2003.80-187