Choice between single and multiple delayed reinforcers.
Each delayed reinforcer adds value roughly in proportion to 1/delay, so staggered rewards all help, but the ones delivered soonest count the most.
01 Research in Context
What this study did
The researcher let pigeons choose between two keys. A peck on the red key produced one to three food reinforcers, each after a fixed delay of 5 s to 30 s. A peck on the green key produced a single reinforcer after an adjustable delay.
The green-key delay was raised or lowered many times per session, depending on the bird's recent choices, until the bird chose each key about half the time. That indifference point showed how each bird valued multiple delayed rewards.
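The adjusting-delay idea can be sketched in a few lines of Python. This is only an illustration under loud assumptions: the abstract says the adjustable delay went up or down depending on previous choices, but the one-step-per-trial rule, the deterministic "bird", and the names `bundle_value` and `titrate` are all hypothetical, not the study's actual procedure.

```python
def bundle_value(delays_s):
    # Simplified rule (an assumption): each reinforcer contributes 1/delay.
    return sum(1.0 / d for d in delays_s)


def titrate(fixed_delays_s, start_delay_s=2.0, step_s=1.0, trials=200, tail=50):
    """Estimate an indifference point with a simple up-down adjusting-delay rule.

    Hypothetical procedure: one adjustment per trial, and a simulated chooser
    that always picks whichever alternative has the higher value.
    """
    fixed_value = bundle_value(fixed_delays_s)
    delay = start_delay_s
    history = []
    for _ in range(trials):
        history.append(delay)
        chose_adjusting = (1.0 / delay) > fixed_value
        if chose_adjusting:
            # Adjusting key was preferred: lengthen its delay.
            delay += step_s
        else:
            # Fixed key was preferred: shorten the adjusting delay (kept positive).
            delay = max(step_s, delay - step_s)
    # Average the last trials, where the delay oscillates around indifference.
    return sum(history[-tail:]) / tail


# Hypothetical fixed alternative: reinforcers 15 s and 30 s after the choice.
print(titrate([15, 30]))
```

The estimate settles within one step of the delay at which both alternatives have equal value. A real bird chooses with some variability, so a live procedure would need more trials and a stability criterion.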
What they found
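As more reinforcers were added and their delays shrank, the indifference point dropped: the birds would tolerate only a shorter wait for the single reward before switching to the bundle. The data fit a simple rule: each reinforcer adds value that is inversely related to its delay. In its simplest form, value is one divided by each delay, added up across reinforcers.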
In plain words, every future reward pulls some weight, but distant ones pull less.
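To make the rule concrete, here is a minimal Python sketch of the simplified "one over delay, then add" reading. The abstract says only that a reinforcer's effect is inversely related to its delay, so the pure 1/delay form, the function names, and the example delays are assumptions for illustration.

```python
def bundle_value(delays_s):
    """Sum of 1/delay over every reinforcer in the bundle (delays in seconds)."""
    if any(d <= 0 for d in delays_s):
        raise ValueError("delays must be positive")
    return sum(1.0 / d for d in delays_s)


def indifference_delay(delays_s):
    """Delay at which one reinforcer matches the bundle: 1/D = sum(1/d_i)."""
    return 1.0 / bundle_value(delays_s)


# Hypothetical bundle: two reinforcers, 15 s and 30 s after the choice.
print(round(indifference_delay([15, 30]), 2))  # 1/15 + 1/30 = 1/10, so about 10 s
print(round(indifference_delay([15]), 2))      # a lone reinforcer at 15 s
```

Adding the second, later reinforcer pulls the predicted indifference delay from about 15 s down to about 10 s, which matches the direction of the reported effect: more reinforcers, shorter indifference point.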
How this fits with other research
Fraley (1998) swapped food delays for delayed work. Pigeons still followed the same math, showing the rule holds for effort as well as payoff.
Aman et al. (1987) looked at one reward whose duration changed. Longer delays made the birds notice duration more, adding a magnitude twist to the basic delay rule.
Eisenmajer et al. (1998) found that a three-second unsignaled delay crashed preference and staying power. Together these papers say: count every second, and if you must wait, signal it.
Why it matters
When you set up a token board or stagger praise across several tasks, remember that each small reinforcer is weighted by roughly 1/delay. A point delivered after 30 s is worth about half as much as the same point delivered after 15 s, so deliver reinforcers sooner or make them larger. Signal every wait and keep the last reward close; spread-out reinforcers still help, but each one counts less the longer the learner waits for it.
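A quick way to check the arithmetic for a token schedule is sketched below. The two schedules, the helper name `token_values`, and the pure 1/delay weighting are hypothetical choices for illustration, not values from the study.

```python
def token_values(delays_s):
    """Weight of each token under the simplified 1/delay rule (delays in seconds)."""
    return [1.0 / d for d in delays_s]


# Hypothetical token schedules: seconds from the response to each token.
clustered = [5, 10, 15]
spread = [5, 30, 60]

for name, schedule in (("clustered", clustered), ("spread", spread)):
    values = token_values(schedule)
    total = sum(values)
    last_share = values[-1] / total
    print(f"{name}: total={total:.3f}, last token share={last_share:.0%}")

# The same point at 30 s versus 15 s carries half the weight.
ratio = token_values([30])[0] / token_values([15])[0]
print(ratio)
```

Under these assumptions the spread schedule still earns something from its later tokens, just less in total than the clustered one.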
Put the final token within five seconds of the big reward; earlier tokens can be slightly farther but never unsignaled.
02 At a glance
03 Original abstract
Pigeons chose between alternatives that differed in the number of reinforcers and in the delay to each reinforcer. A peck on a red key produced the same consequences on every trial within a condition, but between conditions the number of reinforcers varied from one to three and the reinforcer delays varied between 5 s and 30 s. A peck on a green key produced a delay of adjustable duration and then a single reinforcer. The green-key delay was increased or decreased many times per session, depending on a subject's previous choices, which permitted estimation of an indifference point, or a delay at which a subject chose each alternative about equally often. The indifference points decreased systematically with more red-key reinforcers and with shorter red-key delays. The results did not support the suggestion of Moore (1979) that multiple delayed reinforcers have no effect on preference unless they are closely grouped. The results were well described in quantitative detail by a simple model stating that each of a series of reinforcers increases preference, but that a reinforcer's effect is inversely related to its delay. The success of this model, which considers only delay of reinforcement, suggested that the overall rate of reinforcement for each alternative had no effect on choice between those alternatives.
Journal of the Experimental Analysis of Behavior, 1986 · doi:10.1901/jeab.1986.46-67