Probability and rate of reinforcement in negative prediction error learning.
Reinforcement rate and probability are not the same knob—rate drives behavior during the cue, probability keeps it alive afterward.
01Research in Context
What this study did
Byiers et al. (2025) tested how two numbers shape learning. One number was how often food arrived (rate). The other was the odds that food would arrive at all on a given trial (probability).
They used mice in a classic Pavlovian setup. A sound played. Sometimes food followed. They tracked when the mice started to nose-poke during and after the sound.
What they found
Rate ruled the waiting game. Faster food arrival made mice poke sooner and more during the sound.
Probability ruled the after-game. Higher chance of food made mice keep poking after the sound ended. The two numbers lived in different time zones of behavior.
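The distinction is easy to make concrete: rate is reinforcers per unit of cue time, probability is reinforced trials over total trials. The study held rate constant while varying probability; here is a minimal sketch of how the two numbers come apart (the function name and the example schedules are hypothetical, not from the paper):

```python
# Compute reinforcement rate vs. reinforcement probability
# from a list of Pavlovian trials (hypothetical data).

def rate_and_probability(trials):
    """Each trial is (cs_duration_seconds, reinforced: bool)."""
    total_cs_time = sum(duration for duration, _ in trials)
    n_reinforced = sum(1 for _, reinforced in trials if reinforced)
    rate = n_reinforced / total_cs_time       # reinforcers per second of cue
    probability = n_reinforced / len(trials)  # reinforced trials / all trials
    return rate, probability

# Two schedules can share a rate but differ in probability:
# A: every 10-s cue ends in food   -> p = 1.0, rate = 0.1/s
# B: half of the 5-s cues end in food -> p = 0.5, same rate = 0.1/s
sched_a = [(10.0, True)] * 8
sched_b = [(5.0, True), (5.0, False)] * 4
print(rate_and_probability(sched_a))  # (0.1, 1.0)
print(rate_and_probability(sched_b))  # (0.1, 0.5)
```

Schedules A and B match on rate, so the paper predicts matched responding during the cue; they differ on probability, so they should differ in responding after the cue ends.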
How this fits with other research
Eckerman (1969) showed pigeons learn a discrimination faster when the odds of payoff are high. Byiers et al. now show those same odds only bite after the cue ends, stitching 56 years of probability work together.
Malouff et al. (1985) found pigeons switch between using rate or probability depending on how easy the stimuli are to tell apart. Byiers et al. keep the two variables separate by letting them control different moments in time, a clean replication with a new twist.
Kaplan-Kahn et al. (2026) show that learning speed is a single mathematical function of CS-US timing. Byiers et al. add that once learning is done, rate keeps acting inside the cue while probability acts later, refining the same rate-based model.
Why it matters
When you shape behavior, decide what you want. Want faster responding during the SD? Keep the rate of payoff high; you can reinforce fewer trials (lower probability) without losing within-cue responding. Want the learner to stay engaged after the cue ends? Raise the probability that reinforcement can still happen. You can now tune two levers instead of one.
Graph your client's response timing: count responses during the cue versus after it; try raising reinforcer rate if within-cue behavior is weak, or raising probability if post-cue behavior drops off.
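That counting step can be sketched in a few lines. A minimal, hypothetical example (the function name, the 10-second post-cue window, and the timestamps are illustrative assumptions, not part of the study):

```python
# Split timestamped responses into within-cue and post-cue counts
# (hypothetical sketch; times are in seconds from trial start).

def split_responses(response_times, cue_onset, cue_offset, post_window=10.0):
    """Count responses during the cue and in a window after cue offset."""
    within = sum(1 for t in response_times if cue_onset <= t < cue_offset)
    post = sum(1 for t in response_times
               if cue_offset <= t < cue_offset + post_window)
    return within, post

# One trial: cue runs from t=0 to t=10; responses at various times.
responses = [1.2, 3.5, 8.9, 10.4, 12.0, 25.0]
print(split_responses(responses, 0.0, 10.0))  # (3, 2)
```

Graphing the two counts across sessions makes it easy to see which lever (rate or probability) each side of the cue boundary needs.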
02At a glance
03Original abstract
Trial-based theories of associative learning propose that learning is sensitive to the probability of reinforcement signaled by a conditioned stimulus (CS). Learning, however, is often sensitive to reinforcement rate rather than probability of reinforcement per trial, suggesting that temporal properties of cues may be more important than trial-based properties. In four experiments, the role of probability of reinforcement per trial was examined in appetitive Pavlovian conditioning in mice under conditions in which reinforcement rate was controlled. Experiments 1 and 2 examined the loss of conditioned responding caused by overexpectation of reinforcement. The probability of reinforcement per trial failed to affect acquisition and summation of conditioned responding and failed to affect overexpectation. It also failed to affect extinction of conditioned responding in Experiments 3 and 4. Experiments 2-4 contained nonreinforced trials in which responding at the offset of the CS could be measured. These probe trials did reveal an effect of probability of reinforcement per trial. Cues associated with 100% reinforcement elicited greater post-CS responding than cues associated with 50% reinforcement. The effect was also evident in summation trials (in Experiment 2) in which two 100% or 50% reinforced cues were presented in compound. The results show that mice learn about rate and probability information, but reinforcement rate determines anticipatory responding during the CS. The probability of reinforcement determines responding at the expected time of reinforcement. Thus, learning occurs continuously over the duration of experience and per episode of experience independent of duration. (PsycInfo Database Record (c) 2025 APA, all rights reserved).
2025 · doi:10.1037/xan0000396