Reinforcement of least-frequent sequences of choices.
Matching can emerge even when maximizing isn’t possible—sequential reinforcement contingencies alone can produce the matching relation.
Research in Context
What this study did
Shimp (1967) tested three pigeons in an operant chamber.
Each bird chose between two keys, and every run of four successive pecks counted as a sequence.
A peck produced food only when it completed the four-peck sequence that had occurred least often.
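That contingency can be sketched in a few lines. This is a minimal illustrative simulation, not Shimp's actual procedure code, and the 50/50 random chooser is a placeholder assumption standing in for the bird's real choice rule:

```python
import random
from collections import Counter
from itertools import product

# All 16 possible four-peck sequences over two keys, L and R.
ALL_SEQS = ["".join(s) for s in product("LR", repeat=4)]

def run_session(n_pecks=5000, p_left=0.5, seed=1):
    """Reinforce a peck only if it completes the four-peck sequence
    that has occurred least often so far (ties count as least)."""
    rng = random.Random(seed)
    seq_counts = Counter({s: 0 for s in ALL_SEQS})
    history = []            # sliding window of the last four choices
    pecks = Counter()       # pecks per key
    rewards = Counter()     # reinforcers earned per key
    for _ in range(n_pecks):
        choice = "L" if rng.random() < p_left else "R"  # placeholder chooser
        pecks[choice] += 1
        history.append(choice)
        if len(history) > 4:
            history.pop(0)
        if len(history) == 4:
            seq = "".join(history)
            # Reinforce only if this sequence was (tied for) least frequent.
            if seq_counts[seq] == min(seq_counts.values()):
                rewards[choice] += 1
            seq_counts[seq] += 1
    resp_ratio = pecks["L"] / n_pecks
    reinf_ratio = rewards["L"] / max(1, sum(rewards.values()))
    return resp_ratio, reinf_ratio

resp, reinf = run_session()
```

Note the key property of the schedule: the "best-paying" sequence keeps moving, because reinforcing a sequence makes it no longer the least frequent one, so no single response pattern can be maximized.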
What they found
The birds still matched.
Their response ratios lined up with the food ratios, even though the best-paying pattern kept changing.
Matching showed up without any chance to maximize.
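"Matching" here is a simple proportion check: the share of pecks on a key approximately equals the share of food earned on that key. The counts below are hypothetical, for illustration only, not data from Shimp (1967):

```python
# Hypothetical peck and food counts (illustrative, not from the paper).
left_pecks, right_pecks = 1800, 1200
left_food, right_food = 60, 40

# Matching: relative response frequency approximately equals
# relative reinforcement frequency.
response_prop = left_pecks / (left_pecks + right_pecks)
reinforcement_prop = left_food / (left_food + right_food)
matches = abs(response_prop - reinforcement_prop) < 0.05
```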
How this fits with other research
Kunz et al. (1982) later shortened the game to two-peck strings and still saw matching.
Together, the papers show the effect holds for both shorter and longer response sequences.
Glover et al. (1976) looks like a clash: when food stopped, the birds drifted away from matching.
The gap is timing.
Shimp (1967) watched steady-state choices; Glover et al. (1976) tested what happens after the rule is turned off.
Both can be true: matching holds while the rule is on, then fades.
Why it matters
You now know matching can be built even when clients cannot pick a single best move.
If you reinforce varied play actions, social phrases, or academic responses that have been rare lately, you may see the whole response class settle into matching proportions.
Use this to boost flexibility without extra prompts.
Try it: reinforce the least-used three-step play sequence today and watch whether new patterns appear more often.
Original abstract
When a pigeon's choices between two keys are probabilistically reinforced, as in discrete trial probability learning procedures and in concurrent variable-interval schedules, the bird tends to maximize, or to choose the alternative with the higher probability of reinforcement. In concurrent variable-interval schedules, steady-state matching, which is an approximate equality between the relative frequency of a response and the relative frequency of reinforcement of that response, has previously been obtained only as a consequence of maximizing. In the present experiment, maximizing was impossible. A choice of one of two keys was reinforced only if it formed, together with the three preceding choices, the sequence of four successive choices that had occurred least often. This sequence was determined by a Bernoulli-trials process with parameter p. Each of three pigeons matched when p was 1/2 or 1/4. Therefore, steady-state matching by individual birds is not always a consequence of maximizing. Choice probability varied between successive reinforcements, and sequential statistics revealed dependencies which were adequately described by a Bernoulli-trials process with p depending on the time since the preceding reinforcement.
Journal of the Experimental Analysis of Behavior, 1967 · doi:10.1901/jeab.1967.10-57