Hill-climbing by pigeons.
Animals go where the next reinforcer is most likely, so place your rewards to make the target response the obvious choice.
01 Research in Context
What this study did
Researchers put pigeons in a box with two keys. Each key paid off on a different schedule.
The birds could hop between keys. The team asked: do pigeons pick the key that is most likely to pay right now?
What they found
The birds almost always pecked the key with the higher chance of food at that moment.
Their choices tracked the best momentary odds somewhat more closely, and more consistently, when both keys ran variable-interval (VI) schedules than when one key ran a variable-ratio (VR) schedule.
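To make "higher chance of food at that moment" concrete, here is a minimal sketch of the choice rule. It is an illustration, not the study's own analysis. It assumes the VI schedules arm at random (exponentially distributed) times and hold the reinforcer until it is collected, so the chance that food is waiting on a key grows with the time since that key was last pecked. On a VR key the chance per peck is assumed constant. The function names and the example schedule values are hypothetical.

```python
import math


def vi_probability(seconds_since_last_peck, mean_interval):
    """Chance that a VI key has food waiting.

    Assumption: arming times are exponential with the given mean,
    and an armed reinforcer is held until the next peck on that key.
    """
    return 1.0 - math.exp(-seconds_since_last_peck / mean_interval)


def vr_probability(mean_ratio):
    """Chance that any single peck on a VR key pays off (assumed constant)."""
    return 1.0 / mean_ratio


def momentary_maximizing_choice(probabilities):
    """Return the key whose momentary reinforcement probability is highest."""
    return max(probabilities, key=probabilities.get)


# Hypothetical moment in a session: the rich key (VI 30 s) was pecked
# 2 s ago, while the lean key (VI 90 s) has been ignored for 60 s.
probs = {
    "rich": vi_probability(2, 30),
    "lean": vi_probability(60, 90),
}
choice = momentary_maximizing_choice(probs)
print(probs, choice)
```

Under these assumptions the lean key wins at this moment, even though it pays less overall, because its reinforcer has had a full minute to set up. That is the sense in which a bird following this rule switches keys to chase the best local odds.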
How this fits with other research
Rilling et al. (1969) showed pigeons match their time to the overall payoff ratio. The 1983 study zooms in on the exact second-to-second choice.
Kydd et al. (1982) modeled how often birds switch keys. The 1983 data suggest birds switch to chase the best local odds rather than simply follow a fixed rule.
LeBlanc et al. (2003) later found richer schedules make behavior both faster and harder to disrupt. That is consistent with the idea that moment-to-moment payoff guides long-term strength.
Why it matters
If your client has two tasks, watch which one looks like it will pay off right now. Shift the richer reward to the task you want to grow. The bird data say behavior flows toward the best immediate odds, so keep those odds on your target response.
Put the stronger reinforcer on the target key for the next five trials and watch the learner stay there longer.
02 At a glance
03 Original abstract
Pigeons were exposed to two types of concurrent operant-reinforcement schedules in order to determine what choice rules determine behavior on these schedules. In the first set of experiments, concurrent variable-interval, variable-interval schedules, key-peck responses to either of two alternative schedules produced food reinforcement after a random time interval. The frequency of food-reinforcement availability for the two schedules was varied over different ranges for different birds. In the second series of experiments, concurrent variable-ratio, variable-interval schedules, key-peck responses to one schedule produced food reinforcement after a random time interval, whereas food reinforcement occurred for an alternative schedule only after a random number of responses. Results from both experiments showed that pigeons consistently follow a behavioral strategy in which the alternative schedule chosen at any time is the one which offers the highest momentary reinforcement probability (momentary maximizing). The quality of momentary maximizing was somewhat higher and more consistent when both alternative reinforcement schedules were time-based than when one schedule was time-based and the alternative response-count based. Previous attempts to provide evidence for the existence of momentary maximizing were shown to be based upon faulty assumptions about the behavior implied by momentary maximizing and resultant inappropriate measures of behavior.
Journal of the Experimental Analysis of Behavior, 1983 · doi:10.1901/jeab.1983.39-25