Preference pulses and the win-stay, fix-and-sample model of choice.

Hachiga et al. (2015) · Journal of the Experimental Analysis of Behavior
★ The Verdict

Preference pulses appeared only under the extinction schedule, and a simulation fit them far better once a win-stay tendency was added.

✓ Read this if you are a BCBA who runs concurrent schedules or studies choice under extinction.
✗ Skip if you are a clinician who works only on skill acquisition with continuous reinforcement.

01 Research in Context

01 What this study did

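Hachiga et al. (2015) trained two groups of six rats to press two levers for food. One group worked on a concurrent variable-ratio 20 extinction schedule, the other on a concurrent variable-interval 27-s extinction schedule. The authors tracked how choice shifted in the moments after each reinforcer, then added a "win-stay" rule to a computer simulation to see whether it reproduced the rats' patterns.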

02 What they found

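The brief post-reinforcer shifts in choice, called preference pulses, appeared only during the extinction schedule. A pulse-as-artifact simulation left a large residual against the real data, and adding a win-stay tendency (a propensity to stay on the just-reinforced lever) greatly reduced it. This suggests the pulse reflects a real reinforcer effect, not just a counting artifact.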
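One way to look for a pulse in your own data is to tabulate, for each response position after a reinforcer, how often the response went to the just-reinforced alternative. The sketch below is illustrative only and is not the authors' analysis: it indexes by response count instead of time since reinforcement, the `events` format is hypothetical, and the 0.5 added to each count is an assumed correction to avoid dividing by zero.

```python
import math
from collections import defaultdict


def preference_pulse(events, max_lag=10):
    """Log10 ratio of responses to the just-reinforced vs. the other
    alternative, indexed by response position after a reinforcer.

    events: list of (side, reinforced) tuples in session order
            (hypothetical format; side is any hashable label).
    """
    stay = defaultdict(int)    # responses on the just-reinforced side
    switch = defaultdict(int)  # responses on the other side
    last_side = None           # side of the most recent reinforcer
    lag = 0                    # responses since that reinforcer
    for side, reinforced in events:
        if last_side is not None and lag < max_lag:
            lag += 1
            if side == last_side:
                stay[lag] += 1
            else:
                switch[lag] += 1
        if reinforced:
            last_side = side
            lag = 0
    lags = sorted(set(stay) | set(switch))
    # 0.5 is an assumed correction so empty cells do not break the ratio.
    return {k: math.log10((stay[k] + 0.5) / (switch[k] + 0.5)) for k in lags}


session = [("L", True), ("L", False), ("L", False),
           ("R", False), ("R", True), ("R", False)]
pulse = preference_pulse(session)
```

A pulse shows up as values well above zero at the first few lags that settle toward the overall preference level at later lags.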

03 How this fits with other research

Sawyer et al. (2014) claimed pulses can appear without any real reinforcement effect, as a side effect of how visits are counted. Hachiga et al. answer that pulses appeared only under the extinction schedule, and that a pulse-as-artifact simulation could not match them without a win-stay tendency, so they are tied to reinforcement, not just bookkeeping.
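The artifact argument can be made concrete with a toy simulation. This is a sketch under stated assumptions, not the published pulse-as-artifact simulation or the authors' revised model: both sides pay off with the same per-response probability (unlike the study's schedules), and `p_switch`, `p_reinforce`, and `p_win_stay` are hypothetical parameters chosen for illustration. Even with `p_win_stay=0`, responses cluster on the just-reinforced side for a few lags simply because runs persist; a win-stay tendency raises that early clustering further.

```python
import random


def simulate_pulse(p_win_stay, p_switch=0.3, p_reinforce=0.05,
                   n_responses=200_000, max_lag=5, seed=0):
    """Return P(response on just-reinforced side) at lags 1..max_lag.

    All parameters are illustrative assumptions, not fitted values.
    """
    rng = random.Random(seed)
    side = 0                  # current side, 0 or 1
    reinforced_side = None    # side of the most recent reinforcer
    lag = 0                   # responses since that reinforcer
    on_reinforced = [0] * (max_lag + 1)
    total = [0] * (max_lag + 1)
    for _ in range(n_responses):
        first_after_food = reinforced_side is not None and lag == 0
        if first_after_food and rng.random() < p_win_stay:
            pass  # win-stay: repeat the just-reinforced side
        elif rng.random() < p_switch:
            side = 1 - side  # ordinary probabilistic end of a run
        if reinforced_side is not None and lag < max_lag:
            lag += 1
            on_reinforced[lag] += side == reinforced_side
            total[lag] += 1
        # Reinforcement is independent of side in this toy world.
        if rng.random() < p_reinforce:
            reinforced_side, lag = side, 0
    return [on_reinforced[k] / total[k] for k in range(1, max_lag + 1)]


artifact_only = simulate_pulse(p_win_stay=0.0)
with_win_stay = simulate_pulse(p_win_stay=0.5)
```

Comparing the two lists lag by lag shows how much of an early pulse run structure alone can produce, and how much extra a win-stay tendency adds.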

Malone (1999) showed rats stay or switch based on local payoffs. Adding win-stay to the new model keeps that local idea but explains why the rats briefly return to the just-reinforced lever after a reinforcer.

McSweeney et al. (1993) found post-reinforcement pauses track food timing. Hachiga et al. add that the next choice, not just the pause, is also controlled by what happened last.

04 Why it matters

If you run concurrent schedules and see brief jumps back to the just-reinforced alternative, do not dismiss them as artifacts. Under thinning or extinction, those pulses may reflect a win-stay tendency: the learner briefly returns to the option that last paid off. You can use this to check whether reinforcement history is still driving choice once reinforcers become scarce.

→ Action — try this Monday

During extinction probes, watch for a quick return to the last-reinforced option and code it as win-stay, not error.
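If you keep a response-by-response record, the coding step above can be sketched as follows. The record format (dictionaries with `choice` and `reinforced` keys) and the label names are hypothetical, and only the first choice after each reinforcer is labeled.

```python
def code_post_reinforcer_choices(trials):
    """Label the first choice after each reinforcer.

    trials: list of dicts with 'choice' and 'reinforced' keys, in
            session order (hypothetical record format).
    Returns one label per trial: 'win-stay', 'win-shift', or None
    when the previous response was not reinforced.
    """
    labels = []
    prev = None
    for trial in trials:
        if prev is not None and prev["reinforced"]:
            same = trial["choice"] == prev["choice"]
            labels.append("win-stay" if same else "win-shift")
        else:
            labels.append(None)
        prev = trial
    return labels


record = [
    {"choice": "A", "reinforced": True},
    {"choice": "A", "reinforced": False},
    {"choice": "B", "reinforced": True},
    {"choice": "A", "reinforced": False},
]
labels = code_post_reinforcer_choices(record)
```

The share of 'win-stay' labels among all labeled trials gives a simple session-level summary to track across probes.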

02 At a glance

Intervention: not applicable
Design: single case other
Sample size: 12
Population: neurotypical
Finding: not reported

03 Original abstract

Two groups of six rats each were trained to respond to two levers for a food reinforcer. One group was trained on concurrent variable-ratio 20 extinction schedules of reinforcement. The second group was trained on a concurrent variable-interval 27-s extinction schedule. In both groups, lever-schedule assignments changed randomly following reinforcement; a light cued the lever providing the next reinforcer. In the next condition, the light cue was removed and reinforcer assignment strictly alternated between levers. The next two conditions redetermined, in order, the first two conditions. Preference pulses, defined as a tendency for relative response rate to decline to the just-reinforced alternative with time since reinforcement, only appeared during the extinction schedule. Although the pulse's functional form was well described by a reinforcer-induction equation, there was a large residual between actual data and a pulse-as-artifact simulation (McLean, Grace, Pitts, & Hughes, 2014) used to discern reinforcer-dependent contributions to pulsing. However, if that simulation was modified to include a win-stay tendency (a propensity to stay on the just-reinforced alternative), the residual was greatly reduced. Additional modifications of the parameter values of the pulse-as-artifact simulation enabled it to accommodate the present results as well as those it originally accommodated. In its revised form, this simulation was used to create a model that describes response runs to the preferred alternative as terminating probabilistically, and runs to the unpreferred alternative as punctate with occasional perseverative response runs. After reinforcement, choices are modeled as returning briefly to the lever location that had been just reinforced. This win-stay propensity is hypothesized as due to reinforcer induction.

Journal of the Experimental Analysis of Behavior, 2015 · doi:10.1002/jeab.170