Response acquisition under targeted percentile schedules: a continuing quandary for molar models of operant behavior.
Local contingencies can trump overall payoff, so check what the client gets right after the response.
01 Research in Context
What this study did
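Scientists placed rats in a chamber with two levers.
Each trial was a run of presses on the left lever, ended by one press on the right lever.
At first, food followed a trial about one time in three (p = .33) no matter how long the run was, and runs averaged about three presses.
Then a targeted percentile schedule paid off only the runs that came closer to a target of 12 presses than most of the rat's recent runs.
The twist: the chance of food per trial stayed at .33, so longer runs meant more presses per pellet and, in most cases, less food overall.
The team wanted to see if the animals would still learn the long runs.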
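The original abstract states the rule this way: a run is reinforced if it is nearer the target than 67% of the runs in the last 24 trials that fell on the same side of the target. A minimal Python sketch of that check follows. The function name, the handling of ties, the treatment of runs exactly on target, and the no-history case are all assumptions made for illustration; the abstract does not spell them out.

```python
from collections import deque

TARGET = 12       # target run length (from the abstract)
MEMORY = 24       # number of recent trials consulted (from the abstract)
CRITERION = 0.67  # share of same-side runs the current run must beat

def meets_criterion(run, recent_runs, target=TARGET, criterion=CRITERION):
    """Return True if `run` is nearer the target than `criterion` of the
    recent runs that fall on the same side of the target."""
    if run == target:
        return True  # assumption: a run exactly on target always qualifies
    # Keep only recent runs on the same side of the target as the current run.
    # Assumption: past runs exactly on target belong to neither side.
    same_side = [r for r in recent_runs
                 if r != target and (r < target) == (run < target)]
    if not same_side:
        return False  # assumption: nothing to compare against, no food
    # Count the recent runs that were strictly farther from the target.
    farther = sum(abs(r - target) > abs(run - target) for r in same_side)
    return farther / len(same_side) >= criterion

# Hypothetical history of short runs, capped at the last 24 trials.
history = deque([3, 2, 4, 3, 5, 2, 3, 4, 6, 3], maxlen=MEMORY)
print(meets_criterion(6, history))  # run of 6 beats 9 of 10 recent runs
print(meets_criterion(3, history))  # run of 3 beats only 2 of 10
```

Because the bar is set by the animal's own recent runs, it rises as runs get longer. That is how the schedule can keep the payoff per trial near .33 while still pulling behavior toward the target.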
What they found
Twenty-nine of thirty rats learned to make long runs anyway; the group's mean run rose from about 3 presses to about 10.
They kept pressing even though it cut their total food.
The local rule (“keep pressing”) beat the big payoff (“more food”).
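A rough calculation shows the cost of the longer runs. It uses the figures reported in the abstract (reinforcement probability of .33 per trial, mean runs of about 3 and about 10) and assumes each trial costs the run plus one trial-ending right-lever press. The function names are illustrative only.

```python
P_TRIAL = 0.33  # reinforcement probability per trial (from the abstract)

def presses_per_pellet(run_length, p_trial=P_TRIAL):
    # Assumption: a trial costs `run_length` left presses plus 1 right press.
    return (run_length + 1) / p_trial

def p_per_press(run_length, p_trial=P_TRIAL):
    # Chance that any single press is followed by food, same assumption.
    return p_trial / (run_length + 1)

for label, run in [("baseline", 3), ("percentile schedule", 10)]:
    print(f"{label}: {presses_per_pellet(run):.1f} presses per pellet, "
          f"p(food per press) = {p_per_press(run):.3f}")
```

Under these assumptions the work per pellet rises from roughly 12 presses to roughly 33, which is the sense in which long runs were a worse deal overall.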
How this fits with other research
Rachlin (1978) said animals work to maximize total reward.
Aman et al. (1993) show this is not always true.
The rats picked the rule they could feel right now, not the final score.
Cohen et al. (1993) saw the same split: a reinforcer briefly lengthens the stay, yet the long-term choice curve stays flat.
Together, the two 1993 rat papers say momentary contingencies drive learning, even when they hurt the final tally.
Why it matters
When you write a program, look at what the client gets right after the response, not just the total reinforcers at the end of the day.
If the immediate rule clashes with the long-term gain, plan extra teaching or add cues so the better payoff also wins in the moment.
Watch one client’s last 5 responses: does the immediate consequence support the goal you want by Friday?
02 At a glance
03 Original abstract
The number of responses rats made in a "run" of consecutive left-lever presses, prior to a trial-ending right-lever press, was differentiated using a targeted percentile procedure. Under the nondifferential baseline, reinforcement was provided with a probability of .33 at the end of a trial, irrespective of the run on that trial. Most of the 30 subjects made short runs under these conditions, with the mean for the group around three. A targeted percentile schedule was next used to differentiate run length around the target value of 12. The current run was reinforced if it was nearer the target than 67% of those runs in the last 24 trials that were on the same side of the target as the current run. Programming reinforcement in this way held overall reinforcement probability per trial constant at .33 while providing reinforcement differentially with respect to runs more closely approximating the target of 12. The mean run for the group under this procedure increased to approximately 10. Runs approaching the target length were acquired even though differentiated responding produced the same probability of reinforcement per trial, decreased the probability of reinforcement per response, did not increase overall reinforcement rate, and generally substantially reduced it (i.e., in only a few instances did response rate increase sufficiently to compensate for the increase in the number of responses per trial). Models of behavior predicated solely on molar reinforcement contingencies all predict that runs should remain short throughout this experiment, because such runs promote both the most frequent reinforcement and the greatest reinforcement per press. To the contrary, 29 of 30 subjects emitted runs in the vicinity of the target, driving down reinforcement rate while greatly increasing the number of presses per pellet. 
These results illustrate the powerful effects of local reinforcement contingencies in changing behavior, and in doing so underscore a need for more dynamic quantitative formulations of operant behavior to supplement or supplant the currently prevalent static ones.
Journal of the Experimental Analysis of Behavior, 1993 · doi:10.1901/jeab.1993.60-171