Operant conditioning of behavioral variability using a percentile reinforcement schedule.
A percentile reinforcement schedule gives you a simple dial to raise or lower how varied a learner’s responses are.
01 Research in Context
What this study did
The researchers worked with pigeons in a lab.
They wanted to see if a special rule could make the birds peck in more or less varied patterns.
The rule was called a percentile reinforcement schedule.
It only paid off if the bird’s recent sequence was more (or less) varied than most of its past sequences.
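The rule above can be sketched in a few lines of Python. This is an illustrative simulation, not the paper's exact algorithm: the "subject" is a random peck generator, and the window size and adjustment step are assumptions. The key idea from the study is there, though: a response counts as criterional if it differs from each of the N most recent patterns, and N is continually adjusted so that, judged against the subject's own recent behavior, the criterion stays at a roughly constant percentile.

```python
import random

def differs_from_last_n(history, pattern, n):
    """Criterional response: pattern differs from each of the n most recent patterns."""
    return all(pattern != past for past in history[-n:])

def run_percentile_schedule(trials=200, target_p=0.5, mu=1.0, window=20):
    """Sketch of a percentile variability schedule (illustrative assumptions throughout)."""
    history = []
    n = 1            # current variability criterion
    reinforced = 0
    for _ in range(trials):
        # Stand-in "subject": a random 4-peck left/right sequence.
        pattern = tuple(random.choice("LR") for _ in range(4))
        if history and differs_from_last_n(history, pattern, n):
            if random.random() < mu:   # criterional responses pay off with probability mu
                reinforced += 1
        history.append(pattern)
        # Re-estimate how often recent behavior would have met the criterion...
        recent = history[-window:]
        met = sum(
            differs_from_last_n(recent[:i], recent[i], n)
            for i in range(1, len(recent))
        )
        est = met / max(len(recent) - 1, 1)
        # ...then tighten or loosen n to hold that probability near target_p.
        if est > target_p:
            n += 1
        elif est < target_p and n > 1:
            n -= 1
    return reinforced
```

Because n floats while the criterional probability stays fixed, the experimenter can demand more (or less) variety without changing how often reinforcement is available, which is exactly the dissociation the study needed.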
What they found
The schedule worked like a dial.
When the rule paid for high variety, the birds pecked in wild new orders.
When it paid for low variety, the birds settled into almost identical loops.
Plain reward rate did not drive the change; the percentile rule did.
How this fits with other research
Doughty et al. (2015) later showed the same thing: pigeons varied more only when the rule truly required it, not when it just asked for any change.
Galizio et al. (2018) then proved that this varied behavior acts like a true operant—it extinguishes and resurges, just like a reinforced lever press.
A review by Nergaard et al. (2020) warns that "reinforced variability" might really be extinction of old patterns plus reinforcement of new ones.
The 1989 data still hold, but the review reminds you to watch for extinction effects when you run these schedules.
Why it matters
You now have a dial for variability.
Use a percentile or lag schedule when you want a learner to be more flexible—like varying play actions, vocal sounds, or problem-solving steps.
Start with a low bar (lag-1) and tighten only when variety grows.
Pair the schedule with clear signals; Reed (2023) shows that signals boost the effect.
Count the last five response forms, set a lag-1 rule (reinforce only if the next response differs from the previous one), and watch variety grow.
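A lag-1 rule is simple enough to write out directly. Here is a minimal sketch; the response names are made up for illustration:

```python
def lag1_reinforce(previous, current):
    """Lag-1 rule: reinforce only when the current response differs from the previous one."""
    return previous is not None and current != previous

# Example session: varied forms earn reinforcement, repeats do not.
responses = ["clap", "clap", "wave", "tap", "tap", "spin"]
prev = None
earned = []
for r in responses:
    earned.append(lag1_reinforce(prev, r))
    prev = r
print(earned)  # [False, False, True, True, False, True]
```

Tightening the bar later (lag-2, lag-3, or a percentile criterion) just means comparing the current response against a longer slice of recent history instead of the single previous response.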
02 At a glance
03 Original abstract
The present investigation developed and tested a new percentile reinforcement schedule suited to study pattern variability, whose main feature was the relative dissociation it provided between the variability requirement defining criterional responses and overall probability of reinforcement. In a discrete-trials procedure, pigeons produced patterns of four pecks on two response keys. If the pattern emitted on the current trial differed from the N preceding patterns, reinforcement was delivered with probability μ. The schedule continuously adjusted the criterion N such that the probability of a criterional response, estimated from the subject's recent behavior, was always constant. In these circumstances, the criterion corresponded to an invariant percentile in the distribution of recent responses. Using a between-subjects design, Experiment 1 manipulated the variability requirement (the percentile) while keeping overall reinforcement probability constant. The degree of variability varied directly with the requirement. In addition, an inverse relationship existed between the requirement and within-group variance. Experiment 2 manipulated probability of reinforcement while maintaining the variability requirement constant. No consistent relationship was found between variability and reinforcement probability. A tentative hypothesis was advanced ascribing the operant conditioning of behavioral variability to a process of probability-dependent selection.
Journal of the Experimental Analysis of Behavior, 1989 · doi:10.1901/jeab.1989.52-155