The problem with two-event sequence learning by pigeons.
A two-choice sequence task can accidentally reward 'pick the last thing you saw,' hiding failed learning.
01 Research in Context
What this study did
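The team taught pigeons a two-event sequence game. On each trial the bird saw two colors in a row, then chose between two keys. One response was correct only after the target order (A then B); the other was correct after every other order (AA, BB, or BA).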
The catch: the same color could be first or second on different trials. To win, the bird had to notice both events and their order.
What they found
Most birds did not learn the sequences. Instead they leaned on a simple shortcut: choose based only on the last color seen.
Because the target sequence appeared three times as often as each of the other sequences, this trick paid off on five of every six trials (about 83%). It looked like learning, but it was just a last-stimulus rule.
How this fits with other research
Grindle et al. (2012) also watched pigeons choose color sequences. They found the birds like the sequence that hands over the first reward fastest. Both studies show pigeons grab the simplest cue—timing or last color—instead of deeper patterns.
Calamari et al. (1987) tracked moment-to-moment key pecks in concurrent chains. They saw birds switch strategies within a session, much like the last-stimulus rule that emerged here. The old and new data agree: pigeons follow local cues, not global logic.
Lloyd (2002) showed another schedule quirk: keeping a fixed delay gap did not keep fixed preference. Together with Tvan der Miesen et al. (2024), the message is clear—standard schedules can hide simple rules that break our theories.
Why it matters
When you build a conditional-discrimination task, check if the last stimulus always points to the right answer. If it does, add a delay, shuffle the order, or insert a mask so the learner must use the full sequence. Otherwise you may be shaping a shortcut, not true learning.
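One way to run that check before collecting data is to compute how often a learner could be reinforced while looking only at the final item of each sequence. The sketch below is a minimal illustration, not the authors' analysis: the function name `last_stimulus_ceiling` and the `(sequence, correct_response, weight)` trial format are assumptions made for this example, and the weights come from the design described in the original abstract (three AB trials for each AA, BB, and BA trial).

```python
from collections import Counter, defaultdict
from fractions import Fraction


def last_stimulus_ceiling(trials):
    """Best proportion of reinforced trials for a rule that sees only the last item.

    trials: iterable of (sequence, correct_response, weight), where weight is
    the relative number of trials of that type (hypothetical format).
    """
    # For each possible final stimulus, tally how often each response is correct.
    tallies = defaultdict(Counter)
    total = 0
    for sequence, correct_response, weight in trials:
        tallies[sequence[-1]][correct_response] += weight
        total += weight
    # A last-stimulus learner can at best pick the majority response per final stimulus.
    best = sum(max(counter.values()) for counter in tallies.values())
    return Fraction(best, total)


# Design from the abstract: AB -> R1, everything else -> R2, AB shown 3x as often.
design = [("AB", "R1", 3), ("AA", "R2", 1), ("BB", "R2", 1), ("BA", "R2", 1)]
ceiling = last_stimulus_ceiling(design)
print(f"Last-stimulus rule can earn up to {float(ceiling):.0%} of reinforcers")
```

If the ceiling sits well above chance, high accuracy alone cannot show that the learner used the full sequence. Accuracy on the trial type where the shortcut fails (BB in this design) is the more telling measure.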
Add a 2-s blackout between the last sequence item and the choice keys to block last-stimulus guessing.
03 Original abstract
Bonobos appear to show little evidence of learning to make one response (R1) to an AB sequence and a different response (R2) to sequences BB, AA, and BA (Lind et al. PLoS ONE 18(9):e0290546, 2023), yet under different conditions, pigeons can learn this (Weisman et al. Exp Psychol Anim Behav Process 6(4):312, 1980). Aspects of the bonobo procedure may have contributed to this failure. Most important, no response was required in the presence of the stimuli to encourage attention to them. Furthermore, learning to make one response to the target sequence and another to the other sequences involves a bias that allows for better than chance responding. With the two-alternative forced-choice procedure used with the bonobos, the R1 response is correct for one sequence, whereas the R2 response is correct for three sequences. To correct for this, there are three times as many AB trials as each of the other sequences. However, this correction allows a bias to develop in which reinforcement often can be obtained by using only the last stimulus seen as the basis of choice (e.g., when the last stimulus is B, respond R1; when the last stimulus is A, respond R2). This solution yields reinforcement on five out of six, or 83%, of the trials. In the present experiment with pigeons, using this two-alternative forced-choice procedure, most subjects tended to base their choice on the last-seen stimulus. This design allowed subjects to use a suboptimal but relatively effective choice strategy.
2024 · doi:10.1007/s10071-024-01906-1
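The 83% figure in the abstract follows directly from the trial weights. As a worked sketch, assume a block of six trials with three AB trials and one each of AA, BA, and BB (the weighting the abstract describes). The notation P is ours, not the authors'. Under the rule "last stimulus B, respond R1; last stimulus A, respond R2," only the BB trial goes unreinforced:

```latex
\begin{align*}
P(\text{reinforced} \mid \text{last-stimulus rule})
  &= \frac{\overbrace{3}^{AB \,\to\, R1}
         + \overbrace{1}^{AA \,\to\, R2}
         + \overbrace{1}^{BA \,\to\, R2}
         + \overbrace{0}^{BB \,\to\, R1\ (\text{wrong})}}
          {3 + 1 + 1 + 1} \\
  &= \frac{5}{6} \approx 0.83
\end{align*}
```

For comparison, with the same weights a bird that always chose R1, or always chose R2, would be reinforced on 3 of 6 trials (50%). The trial weighting removes the simple response bias, but it leaves the last-stimulus rule well above chance.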