How Reinforcement Really Works: 222 Studies Reviewed

Research Synthesis

What the research says

Reinforcement is not just giving someone something nice after they do the right thing. It is a set of rules about what follows what, and those rules shape every behavior you see. Research across hundreds of studies shows that behavior is always part of a bigger stream. When you reinforce one thing, you change the whole pattern — not just the one target.

Scientists have found that Pavlovian (or respondent) learning is already happening inside your operant procedures, whether you plan for it or not. The cues around your sessions become signals. Those signals influence how long a skill sticks and whether it shows up in new places. Newer frameworks like the PERCS model show that persistence itself has several parts: effort, endurance, resistance to extinction, consistency, and sequence stability. You cannot measure persistence with one number.

Key Findings

What 222 articles tell us

Pavlovian learning happens inside operant procedures — plan for it to help skills last longer and show up in more places.
Persistence is multidimensional; measure effort, endurance, extinction resistance, consistency, and sequence stability separately, not as one score.
Reinforcers belong to consequence classes — their function comes from the schedules and contexts they are part of, not just their physical properties.
Response disequilibrium theory lets you predict whether a reward will strengthen behavior before you run the program.
Reinforcement does not just strengthen one target behavior — it reorganizes the entire pattern of what a person does.

Deeper Dive

What else the research shows

Studies also show that reinforcers are not just items. They belong to classes, and their function comes from the schedules they are part of. Response disequilibrium theory gives you a practical tool. First, measure how much the person does the target behavior and the reward activity when both are freely available. Then set the contingency so that the person can reach their usual level of the reward activity only by doing more of the target behavior than they usually would. Under that condition, the target behavior should increase. This predicts outcomes more reliably than just asking 'what does this person like?'
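
The disequilibrium check can be written as a simple ratio comparison. The sketch below is an illustration only, not a tool from the reviewed studies. It assumes you have free-access baseline amounts for both activities, measured in the same units you use in the contingency (for example, minutes per session). The names Schedule and predict_effect are hypothetical. The rule it applies is the standard disequilibrium condition: if the schedule ratio (target required per unit of reward earned) is larger than the baseline ratio, the reward activity is held below its usual level and the target behavior is predicted to increase.

```python
from dataclasses import dataclass


@dataclass
class Schedule:
    """Hypothetical contingency: `instrumental` units of the target
    behavior earn `contingent` units of the reward activity."""
    instrumental: float  # I: amount of target behavior required
    contingent: float    # C: amount of reward activity earned


def predict_effect(baseline_target: float,
                   baseline_reward: float,
                   schedule: Schedule) -> str:
    """Compare the schedule ratio I/C with the free-baseline ratio Oi/Oc.

    Assumes all amounts share the same units (e.g., minutes per session).
    """
    if baseline_target < 0:
        raise ValueError("baseline_target cannot be negative")
    if min(baseline_reward, schedule.instrumental, schedule.contingent) <= 0:
        raise ValueError("baseline_reward and schedule amounts must be positive")

    # Cross-multiply instead of dividing: I/C > Oi/Oc  <=>  I*Oc > Oi*C
    lhs = schedule.instrumental * baseline_reward
    rhs = baseline_target * schedule.contingent

    if lhs > rhs:
        # Reward activity is held below its baseline level (response deficit).
        return "increase"
    if lhs < rhs:
        # Schedule pushes the reward activity above baseline (response excess).
        return "decrease"
    return "no change"


# Made-up numbers: 2 min of homework and 30 min of tablet at free baseline.
print(predict_effect(2, 30, Schedule(instrumental=5, contingent=5)))   # increase
print(predict_effect(2, 30, Schedule(instrumental=1, contingent=30)))  # decrease
```

The second call shows why a generous schedule can backfire: one minute of homework for thirty minutes of tablet asks for less homework than the person already does, so no increase is predicted.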

Researchers disagree on some fine points. Some argue that contiguity (closeness in time) drives learning most. Others argue that informativeness, meaning how much a cue tells you about what comes next, matters more. What most researchers agree on is this: reinforcement reorganizes the whole behavioral stream, habits and goal-directed actions develop side by side, and small design choices in your program have big downstream effects.

Monday Morning Actions

How to apply these findings

When you design a program, think beyond the reward item itself. Ask what schedule the reward sits in, what cues signal it, and what other behaviors are happening at the same time. Research shows that those details shape whether the skill builds quickly, persists under pressure, and shows up outside your therapy room. If you are surprised when a skill fades at home, check whether the home context has any of the same cues and schedules you used in sessions.

Use response disequilibrium thinking when a client seems hard to motivate. You do not need to find a stronger treat. Instead, observe what the person already does a lot. Then make access to that activity contingent on the target behavior, and keep the earned amount below what the person would choose if access were free. This approach is grounded in the research on response-contingent probabilities. It works even when preference assessments give unclear results, because it relies on what the person already chooses, not what you guess they might like.

When data show a skill slipping, check all five PERCS dimensions before deciding what to do. Is the person not trying as hard (effort)? Not lasting as long (endurance)? Giving up as soon as you thin the schedule (extinction resistance)? Performing inconsistently across days (consistency)? Losing the sequence under pressure (sequence stability)? Each of those calls for a different fix. Treating them all as one 'problem' leads to wasted time and confused programs.
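
Keeping the five dimensions apart is easier if your data sheet stores them as separate fields. The sketch below is one possible way to do that, not part of the PERCS model itself. The class name PersistenceProfile, the function slipping_dimensions, the 10 percent tolerance, and the example numbers are all assumptions for illustration. It also assumes each measure is scored so that a higher number means better performance.

```python
from dataclasses import dataclass, fields


@dataclass
class PersistenceProfile:
    """One score per dimension; units are up to you (hypothetical)."""
    effort: float
    endurance: float
    extinction_resistance: float
    consistency: float
    sequence_stability: float


def slipping_dimensions(baseline: PersistenceProfile,
                        current: PersistenceProfile,
                        tolerance: float = 0.10) -> list[str]:
    """Return the dimensions that fell more than `tolerance`
    (as a fraction) below their baseline value."""
    flagged = []
    for f in fields(PersistenceProfile):
        before = getattr(baseline, f.name)
        now = getattr(current, f.name)
        # Skip dimensions with no usable baseline to compare against.
        if before > 0 and now < before * (1 - tolerance):
            flagged.append(f.name)
    return flagged


# Made-up example: only endurance has dropped meaningfully.
baseline = PersistenceProfile(10, 20, 15, 0.90, 0.95)
current = PersistenceProfile(10, 12, 15, 0.88, 0.95)
print(slipping_dimensions(baseline, current))  # ['endurance']
```

Because the function returns a list of names rather than a single score, the output points you toward which of the five questions above to ask first.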

Frequently Asked Questions

Common questions from BCBAs and RBTs

The home environment likely lacks the same cues and schedule that were present during training. Research shows that Pavlovian signals embedded in your sessions become part of what controls the behavior. Without those signals, the behavior weakens. Build in multiple settings and people from the start.

Use response disequilibrium logic. Observe what your client already does often. If you can make access to that activity depend on the target behavior, the target behavior should increase. This approach gives you a way to predict reinforcer effectiveness based on what the person already chooses.

Habits and goal-directed systems develop in parallel and compete with each other. Research shows that past reinforcement history can quietly remain active and reassert itself when conditions change. Distinguishing a brief re-emergence (lapse) from a full return to the old pattern (relapse) helps you respond correctly and not overreact.

Research using the PERCS model shows that persistence has at least five separate dimensions: effort, endurance, extinction resistance, consistency, and sequence stability. These can split apart — a person may work hard (effort) but not last long (endurance). Measure each one to know what actually needs fixing.

Not necessarily. The research shows that reinforcement reorganizes the whole behavioral stream — other behaviors may increase, decrease, or compete. Bigger or more frequent rewards can have effects you did not plan for. Monitor the full pattern of behavior, not just the target behavior, when you increase reinforcement.