Probability and rate of reinforcement in negative prediction error learning.
Reinforcement rate and probability are not the same knob—rate drives behavior during the cue, probability keeps it alive afterward.
01Research in Context
What this study did
Byiers et al. (2025) tested how two numbers shape learning. One number was how often food arrived (rate). The other was the odds that food would arrive at all on a given trial (probability).
They used mice in a classic Pavlovian setup. A sound played. Sometimes food followed. They tracked when the mice started to nose-poke during and after the sound.
What they found
Rate ruled the waiting game. Faster food arrival made mice poke sooner and more during the sound.
Probability ruled the after-game. Higher chance of food made mice keep poking after the sound ended. The two numbers lived in different time zones of behavior.
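The distinction is easy to make concrete: rate is reinforcers per unit of cue time, probability is reinforced trials over total trials. The study held rate constant while varying probability; here is a minimal sketch of how the two numbers come apart (the function name and the example schedules are hypothetical, not from the paper):

```python
# Compute reinforcement rate vs. reinforcement probability
# from a list of Pavlovian trials (hypothetical data).

def rate_and_probability(trials):
    """Each trial is (cs_duration_seconds, reinforced: bool)."""
    total_cs_time = sum(duration for duration, _ in trials)
    n_reinforced = sum(1 for _, reinforced in trials if reinforced)
    rate = n_reinforced / total_cs_time       # reinforcers per second of cue
    probability = n_reinforced / len(trials)  # reinforced trials / all trials
    return rate, probability

# Two schedules can share a rate but differ in probability:
# A: every 10-s cue ends in food   -> p = 1.0, rate = 0.1/s
# B: half of the 5-s cues end in food -> p = 0.5, same rate = 0.1/s
sched_a = [(10.0, True)] * 8
sched_b = [(5.0, True), (5.0, False)] * 4
print(rate_and_probability(sched_a))  # (0.1, 1.0)
print(rate_and_probability(sched_b))  # (0.1, 0.5)
```

Schedules A and B match on rate, so the paper predicts matched responding during the cue; they differ on probability, so they should differ in responding after the cue ends.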
How this fits with other research
Eckerman (1969) showed pigeons learn a discrimination faster when the odds of payoff are high. Byiers et al. now show those same odds only bite after the cue ends, stitching 56 years of probability work together.
Malouff et al. (1985) found pigeons switch between using rate or probability depending on how easy the stimuli are to tell apart. Byiers et al. keep the two variables separate by letting them control different moments in time, a clean replication with a new twist.
Kaplan-Kahn et al. (2026) show that learning speed is a single mathematical function of CS-US timing. Byiers et al. add that once learning is done, rate keeps acting inside the cue while probability acts later, refining the same rate-based model.
Why it matters
When you shape behavior, decide what you want. Want faster responding during the SD? Keep the rate of payoff high; you can reinforce fewer trials (lower probability) without losing within-cue responding. Want the learner to stay engaged after the cue ends? Raise the probability that reinforcement can still happen. You can now tune two levers instead of one.
Graph your client's response timing: count responses during the cue versus after it; try raising reinforcer rate if within-cue behavior is weak, or raising probability if post-cue behavior drops off.
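That counting step can be sketched in a few lines. A minimal, hypothetical example (the function name, the 10-second post-cue window, and the timestamps are illustrative assumptions, not part of the study):

```python
# Split timestamped responses into within-cue and post-cue counts
# (hypothetical sketch; times are in seconds from trial start).

def split_responses(response_times, cue_onset, cue_offset, post_window=10.0):
    """Count responses during the cue and in a window after cue offset."""
    within = sum(1 for t in response_times if cue_onset <= t < cue_offset)
    post = sum(1 for t in response_times
               if cue_offset <= t < cue_offset + post_window)
    return within, post

# One trial: cue runs from t=0 to t=10; responses at various times.
responses = [1.2, 3.5, 8.9, 10.4, 12.0, 25.0]
print(split_responses(responses, 0.0, 10.0))  # (3, 2)
```

Graphing the two counts across sessions makes it easy to see which lever (rate or probability) each side of the cue boundary needs.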
02At a glance
03Original abstract
Trial-based theories of associative learning propose that learning is sensitive to the probability of reinforcement signaled by a conditioned stimulus (CS). Learning, however, is often sensitive to reinforcement rate rather than probability of reinforcement per trial, suggesting that temporal properties of cues may be more important than trial-based properties. In four experiments, the role of probability of reinforcement per trial was examined in appetitive Pavlovian conditioning in mice under conditions in which reinforcement rate was controlled. Experiments 1 and 2 examined the loss of conditioned responding caused by overexpectation of reinforcement. The probability of reinforcement per trial failed to affect acquisition and summation of conditioned responding and failed to affect overexpectation. It also failed to affect extinction of conditioned responding in Experiments 3 and 4. Experiments 2-4 contained nonreinforced trials in which responding at the offset of the CS could be measured. These probe trials did reveal an effect of probability of reinforcement per trial. Cues associated with 100% reinforcement elicited greater post-CS responding than cues associated with 50% reinforcement. The effect was also evident in summation trials (in Experiment 2) in which two 100% or 50% reinforced cues were presented in compound. The results show that mice learn about rate and probability information, but reinforcement rate determines anticipatory responding during the CS. The probability of reinforcement determines responding at the expected time of reinforcement. Thus, learning occurs continuously over the duration of experience and per episode of experience independent of duration. (PsycInfo Database Record (c) 2025 APA, all rights reserved).
2025 · doi:10.1037/xan0000396