Influence of reinforcement and its omission on trial‐by‐trial changes of response bias in perceptual decision making
Each reinforcer, delivered or omitted, shifts response bias on the next trial; a trial-level matching model tracks those shifts better than fixed-step models.
01 Research in Context
What this study did
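Stüttgen et al. (2024) built a trial-by-trial version of a signal detection model based on the generalized matching law. It tracks how each reinforcer, and each omitted reinforcer, changes response bias on the next trial.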
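They compared this model to three older fixed-step models. Those shift the decision criterion by a fixed amount after each reinforced response, after each nonreinforced response, or after both.
Rats did a two-choice auditory discrimination task. Reinforcement odds for the two responses were unequal and shifted without warning. The team tracked response bias from trial to trial.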
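A minimal sketch of the three fixed-step criterion models, following the description in the original abstract (a fixed criterion shift after each reinforced response, after each nonreinforced response, or after both). The class name, the response labels "A" and "B", the step sizes, and the direction of each shift are assumptions made for illustration; the abstract does not specify them.

```python
from dataclasses import dataclass


@dataclass
class FixedStepCriterion:
    """Fixed-step criterion learner (illustrative, not the authors' code).

    Assumed sign convention: response "A" is chosen when the sensory
    evidence exceeds c, so lowering c biases choices toward "A".
    """
    step_reinforced: float = 0.0     # step applied after a reinforced response
    step_nonreinforced: float = 0.0  # step applied after a nonreinforced response
    c: float = 0.0                   # current decision criterion

    def update(self, response: str, reinforced: bool) -> float:
        # Direction that makes `response` MORE likely on the next trial.
        favor = -1.0 if response == "A" else 1.0
        if reinforced:
            # Assumed: reinforcement pulls the criterion toward the response.
            self.c += favor * self.step_reinforced
        else:
            # Assumed: nonreinforcement pushes the criterion away from it.
            self.c -= favor * self.step_nonreinforced
        return self.c


# The three variants differ only in which step is nonzero.
model_1 = FixedStepCriterion(step_reinforced=0.1)                           # reinforced only
model_2 = FixedStepCriterion(step_nonreinforced=0.1)                        # nonreinforced only
model_3 = FixedStepCriterion(step_reinforced=0.1, step_nonreinforced=0.1)   # both
```

Because every step has the same size regardless of reinforcement history, these learners can only move the criterion in equal increments, which is the property the trial-level matching model is compared against.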
What they found
The trial-level matching law won. It predicted moment-to-moment bias shifts better than any fixed-step model.
When a reinforcer was skipped, the model caught the very next swing toward the other option.
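For orientation, the generalized matching law that the winning model builds on is usually written in its steady-state form as below. This is the standard textbook equation, not the trial-based version from the paper, whose exact update rule is not given in this summary. The symbols are the conventional ones: B_1 and B_2 are the counts of the two responses, R_1 and R_2 the reinforcers obtained for each, a the sensitivity to reinforcement, and b a constant bias.

```latex
\log\!\left(\frac{B_1}{B_2}\right) = a \,\log\!\left(\frac{R_1}{R_2}\right) + \log b
```

With a = 1 and b = 1 this reduces to strict matching: the response ratio equals the reinforcer ratio.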
How this fits with other research
Sayers et al. (1995) argued that matching law matters for people, not just pigeons. The 2024 paper gives them the tool they asked for.
Frazier et al. (2023) added a time weight to explain extinction bursts. Stüttgen drops the time weight and still tracks fast swings, showing bursts and bias can be two views of the same trial pulse.
Sailor (1971) showed that one skipped reinforcer spikes the next response rate. The new model captures the same skip, but predicts where the spike will aim, not just that it happens.
Why it matters
You can now treat each trial as data. If a client stalls or jumps after a missed reinforcer, the trial-level model tells you which way bias will tilt next. Use that hint to place the next reinforcer and shorten extinction bursts.
Count the last five reinforcers and non-reinforcers; feed those hits and misses into a simple +1/-1 bias tally and place the next trial on the side that shows the lower tally to stay ahead of the swing.
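The tally described above can be sketched as follows. This is one reading of the tip, not a procedure from the paper: it assumes each trial is logged as a (side, reinforced) pair, that the sides are labelled "left" and "right", and that a tie means no recommendation. All function and label names are hypothetical.

```python
from collections import deque


def bias_tally(history, window=5):
    """Score each side over the last `window` trials.

    history: iterable of (side, reinforced) pairs, oldest first.
    A reinforced trial adds +1 to its side, a nonreinforced trial adds -1.
    """
    recent = deque(history, maxlen=window)  # keeps only the newest trials
    tally = {"left": 0, "right": 0}
    for side, reinforced in recent:
        tally[side] += 1 if reinforced else -1
    return tally


def next_side(history, window=5):
    """Return the side with the lower tally, or None on a tie."""
    tally = bias_tally(history, window)
    if tally["left"] == tally["right"]:
        return None
    return min(tally, key=tally.get)


trials = [
    ("left", True),
    ("left", True),
    ("right", False),
    ("left", True),
    ("right", True),
]
print(bias_tally(trials))  # {'left': 3, 'right': 0}
print(next_side(trials))   # right
```

In this example the left side has collected three reinforcers while the right side has one hit and one miss, so the tally points to the right side for the next trial.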
02 At a glance
03 Original abstract
Discrimination performance in perceptual choice tasks is known to reflect both sensory discriminability and nonsensory response bias. In the framework of signal detection theory, these aspects of discrimination performance are quantified through separate measures, sensitivity (d') for sensory discriminability and decision criterion (c) for response bias. However, it is unknown how response bias (i.e., criterion) changes at the single-trial level as a consequence of reinforcement history. We subjected rats to a two-stimulus two-response conditional discrimination task with auditory stimuli and induced response bias through unequal reinforcement probabilities for the two responses. We compared three signal-detection-theory-based criterion learning models with respect to their ability to fit experimentally observed fluctuations of response bias on a trial-by-trial level. These models shift the criterion by a fixed step (1) after each reinforced response or (2) after each nonreinforced response or (3) after both. We find that all three models fail to capture essential aspects of the data. Prompted by the observation that steady-state criterion values conformed well to a behavioral model of signal detection based on the generalized matching law, we constructed a trial-based version of this model and find that it provides a superior account of response bias fluctuations under changing reinforcement contingencies.
Journal of the Experimental Analysis of Behavior, 2024 · doi:10.1002/jeab.908
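The two signal-detection measures named in the abstract, sensitivity (d') and criterion (c), can be computed from response proportions with the standard equal-variance Gaussian formulas. The sketch below is illustrative only. It assumes a "hit" is response 1 to stimulus 1 and a "false alarm" is response 1 to stimulus 2; the function name is hypothetical and none of this is taken from the paper's analysis code.

```python
from statistics import NormalDist


def sdt_measures(hit_rate, false_alarm_rate):
    """Return (d_prime, criterion) under the equal-variance Gaussian model.

    d' = z(H) - z(F)
    c  = -(z(H) + z(F)) / 2
    Rates must lie strictly between 0 and 1; inv_cdf raises
    statistics.StatisticsError otherwise.
    """
    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    z_hit = z(hit_rate)
    z_fa = z(false_alarm_rate)
    d_prime = z_hit - z_fa
    criterion = -0.5 * (z_hit + z_fa)
    return d_prime, criterion


# Symmetric performance: good discrimination, no bias (c is zero).
d_prime, criterion = sdt_measures(0.8, 0.2)

# Same stimuli but more "response 1" overall: c becomes negative.
d_biased, c_biased = sdt_measures(0.9, 0.4)
```

Keeping the two measures separate is what lets the criterion-learning models above move c from trial to trial without changing d'.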