ABA Fundamentals

A broken thread: A review of Pavlov's Legacy: How and What Animals Learn, by Robert A. Boakes

Staddon (2024) · Journal of the Experimental Analysis of Behavior
★ The Verdict

Pavlovian cues and operant consequences are one conversation—write plans that speak to both sides.

✓ Read this if: you're a BCBA who writes behavior plans or teaches graduate courses
✗ Skip if: you're a clinician looking for step-by-step skill acquisition protocols to use today

01 Research in Context

01

What this study did

Staddon (2024) read Boakes's new book on Pavlov and wrote a long review essay. The review argues that Pavlovian and operant learning are one system, not two. Staddon explains the idea and adds his own take.

The paper is a narrative review. It uses history, lab data, and theory. No new experiment was run.

02

What they found

The main point: Pavlovian cues set the stage. Operant responses are picked from that stage. Treat them as one loop, not separate boxes.

Example: A red light predicts food. The same light also makes pressing the lever more likely. You cannot split the red light into a 'Pavlovian' half and an 'operant' half.

03

How this fits with other research

Podlesnik et al. (2023) map 50 years of resurgence studies. Resurgence is the return of old responses when new ones stop paying off. It only makes sense if Pavlovian cues and operant payoffs talk to each other. The new view says they never stopped talking.

Cudré-Mauroux (2010) doubts that conditioned reinforcers truly strengthen behavior. Staddon (2024) agrees: the 'strengthening' is really the cue pulling the response out of the Pavlov scene. Both papers chip away at the old wall between Type S and Type R.

Jiménez et al. (2022) fold pivotal behavior, cusps, and traps into one CHL frame. Staddon does the same job for basic processes: he folds Pavlov and Skinner into one frame. The two reviews were published two years apart; together they tidy both the applied and the basic shelves.

04

Why it matters

If you treat cues and consequences as one system, your interventions get simpler. Put the cue in charge and let the consequence follow naturally. Next time a client relapses, look at what cues returned, not just what rewards vanished. Design one plan that handles both sides together.

→ Action — try this Monday

Pick one target behavior, list the cues that set the occasion, and tie the reinforcer to the cue so they work as a single package.

02 At a glance

Intervention: not applicable
Design: narrative review
Finding: not reported

03 Original abstract

My late friend Howard Liddell told me about an unpublished experiment he did while working as a guest in Pavlov's laboratory. It consisted simply in freeing from its harness a dog that had been conditioned to salivate at the acceleration in the beat of a metronome. The dog at once ran to the machine, wagged its tail at it, tried to jump up to it, barked, and so on; in other words, it showed as clearly as possible the whole system of behavior patterns serving, in a number of Canidae, to beg food from a conspecific. It is, in fact, this whole system that is being conditioned in the classical experiment. (p. 27)

In other words, there really aren't two separate systems. Operant learning involves a repertoire of behavior controlled by a context that is set by Pavlovian conditioning. From this Pavlovian repertoire, processes such as contiguity and behavioral competition select behavior that is called instrumental or operant. Skinner's distinction between Type R (operant) and Type S (classical/Pavlovian) conditioning is misleading; the two are complementary, not distinct. Pavlov used salivary secretion as the dependent variable in many behavioral experiments, studying things like sensory sensitivity. How much easier it would have been (especially for the dogs!) if he had used some element of the dog's food-related repertoire as a response. My guess is that the sensory threshold so obtained would be quite a bit lower than the one measured by salivation.

This book … covers a relatively brief period of about 60 or 70 years from the impact of Darwin's Origin of Species in the 1860s to the establishment of behaviorism in the 1920s. The tale is told with thoroughness, care, good humor, and, best of all, with understanding, because the author is no outsider, no professional philosopher or historian, who might tell it with scorn and the misapprehension that behaviorism is dead, but an experimenter who did his graduate work at Harvard during the 1960s. Another reviewer, in Science, praises the photographs in the 1984 book (Pavlov's Legacy also has excellent historical photographs) and approves of the coverage but nevertheless concludes, “yet there is something a bit flat about the whole thing” (Richards, 1985, p. 862). This reviewer seems to suggest that From Darwin to Behaviourism reads “in the style of a textbook” (p. 862), a comment that also applies to parts of Pavlov's Legacy.

The first chapter of the new book deals with Pavlov and others working in that tradition, including W. Horsley Gantt, Howard Liddell, and Jerzy Konorski, as well as those interested in experimental neurosis such as Joseph Wolpe. The book describes Pavlov's early career and his step up to an 1890 appointment at the Military Medical Academy. Pavlov fell into poverty during World War One and his laboratory was closed. Surprisingly, after 1917 his fortunes revived under the Bolshevik regime. Boakes comments, “Lenin wanted to show that the new communist government supported science, and Pavlov was Russia's only Nobel Laureate” (p. 7). This chapter has interesting details on the physiological aspects of Pavlov's work, his initially mistaken view about the function of the pancreas, for example. Boakes comments about Pavlov's 1897 Lectures book, “it was not readily apparent that the results reported in this book were mainly obtained from just two dogs” (p. 2), a critical comment about the number of subjects repeated several times later in different contexts.
Despite his time in the Herrnstein/Skinner lab, Boakes seems not to have abandoned an earlier preference for the group method used by those working in the Hull–Spence and Rescorla–Wagner tradition, represented in the UK most notably by Nicholas Mackintosh, whose work is discussed in several chapters of this book. Boakes speaks favorably about statistics. He does not share Skinner's “deep disdain for the use of statistics in the study of behavior,” and “despite Skinner, generalizing from the results obtained from just a few animals can be very misleading” (p. 240). Of course, generalizing from a group result to the individual can also be misleading (Staddon, 2019).

Chapter 2 is about habit. Boakes contrasts the different conclusions of John B. Watson and Edward Thorndike. Watson is often thought of as the founder of behaviorism, although, as Malone (2014) points out, he had many predecessors. According to [Thorndike's] Law of Effect [habits] were “stamped in” by any “satisfying” event that followed shortly after the response. Watson's studies of rats learning to navigate mazes convinced him, too, of the importance of habits, conceived as S-R connections. However, he rejected the idea that “satisfiers” were needed for the development of habits—this was too subjective a concept—and instead proposed that habits grew stronger as a result of repetition alone, the Law of Frequency. (p. 26)

The reference is to Watson's Behavior: An Introduction to Comparative Psychology (1914), a wonderful book with many details of comparative psychology and thoughtful discussions of neurophysiology and genetics, limited though such knowledge was at that time. In fact, Watson's reservation about “satisfiers” was completely justified; the subjective label is unnecessary. He was well aware of the role of food as an incentive, but the label “satisfier” struck him as unverifiable, thus circular. Interestingly, given the contemporary bias against aversive control, researchers at that time recognized its role. Thorndike considered punishment to be an effective way to suppress undesired animal behavior. Watson put it this way:

Factors involved in fixation.—We may confess at once that we have no new principles to offer in solving the problems involved in learning; but we hope that by stating our problems carefully and by clearing away the misconceptions referred to, we shall be able to show in a convincing way that the mechanical principles with which we are already familiar and which can experimentally be shown to act in the way we maintain are sufficient to yield the solutions of those problems. We shall call these principles (1) frequency and (2) recency. Without claiming that they are the only ones operative, let us attempt to apply them in specific cases.

Watson's recency principle is hard to distinguish from the contemporary idea of reinforcement contiguity. And frequency is useful to tease apart the exteroceptive controlling factors early in training—to type or play the piano, for example—from the kinesthetic factors operative once the habit is established.

Chapters 2, 3, and 4 of Boakes's Pavlov book are devoted to Clark Hull, Kenneth Spence, and their successors, such as Leon Kamin, Robert Rescorla, and Alan Wagner: “These led to a revolution in the way learning by animals was studied. Instead of concentrating on how their behavior changed, it took such changes as an index of what associations the animals had formed” (p. xi). Associations, then, were to be the atoms of a new mathematical science of behavior.
A number of interesting effects were identified and described mathematically, most notably blocking: initial exposure to a Pavlovian pairing of stimulus A → US (unconditioned stimulus, e.g., shock), for example, blocks learning to the compound AB → US, which would have led to some conditioning to B absent the initial A → US training. The useful ideas of contingency (conditioning occurs when a stimulus predicts the US) and surprise (an unexpected conjunction, or lack of an expected conjunction, between an event and a US will lead to conditioning) were principles established by this tradition.

[t]his paper is based on the proposition that an adequate theory of instrumental behavior must involve three types of goal event: (a) Rewarding events … (b) Punishing events … and (c) Frustrative events—the absence of or delay of a rewarding event in a situation where it had been present. (p. 40)

In an experiment by Amsel and Roussel, 18 male albino rats were trained under hunger motivation to run down Runway 1 into G1 where they found food, then leave G1 and run down the second [much longer] runway into G2 where they found food again. Their time to traverse the alley between G1 and G2 was measured during a preliminary period and, when running time had stabilized, a series of test trials were run, on half of which S was not rewarded in G1 prior to running in Runway 2. The results indicated that after the Runway-2 time had stabilized during the preliminary period, nonreward of the Runway-1 response was followed by shorter Runway-2 times (higher speeds) than those following reward of the Runway-1 response. This difference between the vigor of performance following reward as compared with nonreward has been termed the frustration effect (FE). (p. 105)

Frustration theory rests on a plausible claim: that failing to get a reward when you expect it “energizes behavior.” The door won't open to your first push, so you push it harder. When expected food fails to appear, the organism does more of whatever response it's been using to get the food.

I first learned about the frustration experiment as a very junior faculty member just trained in single-subject operant methods at Harvard but now in Amsel's very Hullian department at the University of Toronto. I was struck by the group design of Amsel's experiment: why so many animals? “Frustration” is a within-subject effect; a couple of animals should suffice to demonstrate it. But then the answer was obvious: the “frustration effect” was apparently too small to show up in a single rat. So why was the effect so small? Without seeing clearly why this was an appropriate experiment, Nancy Innis and I embarked on an operant analogue to the double runway that we thought would yield a much larger “frustration effect.” The procedure was a standard multiple schedule with two alternating 2-min fixed-interval components and pigeon subjects (Staddon & Innis, 1966). The second link was always rewarded; the first link was rewarded only half the time, and a brief timeout ended the other half. The results were clear: The large effect of nonreward (reinforcement omission) under these conditions is not excitatory; rather, food delivery (reinforcement) is inhibitory, thus reward omission is disinhibitory, not “frustrative.” It's a long story with two main points: (1) Reward in the first goalbox of the long double runway, like reward on a fixed-interval schedule, is inhibitory because it signals a delay to food that induces a pause in responding.
(2) The inhibitory effect is less in the runway than on an FI 2, simply because the delay in the runway is small: the time taken to traverse the second runway is much less than 2 min—thus, the modest size of the runway frustration effect and the need to rely on a group average. The omission and frustration effects are probably both disinhibitory (see, for example, Kello, 1972; Staddon, 1974). “Frustration theory,” which seems now to have vanished from the psychological map, provides a clear contrast between the group methods of the Hull–Spence school and Skinner's single-subject approach; it would have made an interesting addition to Boakes's broad-ranging book.

Pavlov's Legacy provides an excellent and thorough account of the theoretical developments in the Hull–Spence tradition. It is less thorough in covering the theories that emerged from the initially atheoretical single-subject Skinnerian approach. I will just discuss three: behavioral power laws, the matching law, and behavioral contrast.

In the nineteenth century research on the relationship between subjective experiences, such as the loudness of a tone, and their physical basis had led to the adoption of a mathematical [logarithmic] function known as the Weber-Fechner Law. Many experiments in the psychophysical laboratory at Harvard in the 1950s suggested that this “law” needed to be replaced by a power function, leading to Stevens' Power Law of 1957. Research focused on extending or testing this “law” continued at Harvard for at least 10 years. In this context, it was not surprising that the relationship between responding and reinforcement rates discovered by Herrnstein in 1961 became known as the Matching Law, rather than, less grandiosely, as say, the “matching function.” (p. 247)

The testy aside about Herrnstein's matching law presumably reflects the fact that it depends on procedural tweaks: the “changeover delay” (COD) that punishes switching between two choices. In the absence of a COD, subjects undermatch—response ratio is less extreme than reinforcement ratio (also a power law). What Boakes seems to have missed is a development starting with a short article by the British physicist D. M. MacKay (1963) pointing out that the power law follows from Weber–Fechner functions that affect both the input and the output of a system. In other words, there is no contradiction between Weber–Fechner and Stevens's power law. MacKay's analysis predicts that under certain conditions a set of empirical power functions should converge. This has been confirmed, most dramatically in data from an experiment on the effects of glare on human power-function scaling (Stevens, 1974) as well as data from behavioral experiments (e.g., Nevin, 1974), all summarized in Staddon (1978). This analysis nicely wraps up the Fechner vs. Stevens debate.

A pigeon's rate of key pecking during the presentation of one stimulus may be altered by changing only the schedule of reinforcement associated with a different stimulus (Reynolds, 1960). The change in behavior is called a contrast when the change in the rate of responding during the presentation of one stimulus is in a direction away from the rate of responding generated during the presentation of the other stimulus. (p. 57)

For example, the schedule of reinforcement associated with a red key is held constant while appropriate operations increase or decrease the rate of responding during the presentation of a green key.
If the rate of responding with a red key decreases when the rate with a green key increases (or increases when the other decreases), the change in rate during the presentation of red is called a contrast. (Reynolds, 1961, p. 57)

What is the cause of behavioral contrast? Herbert Terrace, in his experiments on so-called errorless learning, thought it was the unreinforced responding in the extinction component that somehow excited responding in the reinforced component (Terrace, 1963). Later work, however, has cast some doubt on this interpretation, suggesting instead that it was Terrace's careful programmed introduction of the extinction stimulus, designed to avoid responding by the pigeon in the extinction component, that may be responsible for the absence of contrast in errorless learning. The conclusion seems to be, errorless learning apart, that it is simply the absence of reinforcement in the extinction component that is responsible for contrast. Whether the bird pecks or not may be irrelevant.

The next step in this debate was a set of experiments by J. A. Nevin and his collaborators. Nevin proposed what he called momentum theory to account for contrast effects and resistance to extinction. Some of these data did not fit well with Herrnstein's scheme. For example, in one experiment, response rate was measured in two components of a multiple VI 60 (S1), VI 180 (S2) schedule as a function of the rate of concurrent free (VT) reinforcement in a third component (Nevin, 1974; Nevin & Grace, 2000, Figure 2). Nevin found that increased rates of free food in the VT component disproportionately depressed response rate in the poorer (VI 180) component, a result inconsistent with Herrnstein's equations. I was later able to show that a model assuming that interactions of this sort are a consequence of reallocation of interim activities (R0) can account for these and other discrepant data (Staddon, 2016, Chapter 12). This analysis was based not on response rate but on time allocation, time—component duration—being a variable rather neglected at the time. The basic idea was simply that reinforcements exert a “pressure” analogous to air pressure, and that component duration, time, is analogous to volume in a physical system. Boyle's law then yields predictions that are consistent with (for example) Nevin's (1974) data. It becomes obvious, from this point of view, that when the reinforcement rate on the richer component is high enough, it will “saturate”; all the available time will be filled up so that (a) a further increase in reinforcement rate will have no effect on the response rate, (b) any rate decreases caused by increases in the VT rate will at first come from the lower rate VI component, and (c) matching will only occur when both VI components are weak—for example, when the subject is not too hungry. The data confirm these predictions.

Until well into the 1960s to be a Skinnerian researcher meant employing just a few animals in an operant chamber, using their rate of responding as the only acceptable measure of behavior, reporting such data by reproducing samples of cumulative records, excluding any kind of statistical analysis and avoiding any temptation to try to “explain” the results. (p. 222)

But, of course, the great advance represented by the cumulative record is that it is real time. In the trial-by-trial world that preceded Skinner, time was essentially ignored.
Many of the most interesting advances in operant conditioning have come, with cumulative records as a starting point, from the analysis of temporal control—timing behavior—and from a recognition of component duration as a critical variable. The frustration effect analysis is one example where the delay enforced by the long second alley seems to be critical. … Pavlov's Legacy is a welcome addition to the small historical literature on this field, and its historical photographs are excellent, many of them new to this reviewer. The coverage of the Hull–Spence tradition is thorough, and Boakes is to be commended for the historical work he began with From Darwin to Behaviourism. The author reports that no human or animal subjects were used to generate the data in this article.
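The blocking effect described in the abstract above has a standard formal account in the Rescorla–Wagner model, part of the Hull–Spence/Rescorla–Wagner tradition the review discusses. The sketch below uses the textbook notation (V for associative strength, λ for the asymptote of conditioning the US will support, α and β for stimulus and US salience); it illustrates the logic only and is not a formula taken from Boakes or Staddon.

\[
\Delta V_X = \alpha_X \,\beta \left( \lambda - \sum_i V_i \right)
\]

After A → US training alone, V_A approaches λ. When the compound AB → US is then introduced, the prediction error λ − (V_A + V_B) is already near zero, so ΔV_B stays near zero on every trial: prior training on A blocks conditioning to B.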
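Two other quantitative points in the review can be stated compactly: undermatching as a power function, and MacKay's observation that Weber–Fechner transforms acting on both the input and the output of a system compose into a power law. The lines below are a sketch using symbols chosen for this summary rather than drawn from the review (B for response rates, R for reinforcement rates, b and a for bias and sensitivity, I for stimulus intensity, O for output magnitude, k1 and k2 for the two Weber–Fechner constants); both are textbook results.

\[
\frac{B_1}{B_2} = b\left(\frac{R_1}{R_2}\right)^{a}, \qquad a < 1 \ \text{(undermatching)}
\]

\[
\psi = k_1 \ln\frac{I}{I_0} \quad\text{and}\quad \psi = k_2 \ln\frac{O}{O_0} \;\Rightarrow\; \frac{O}{O_0} = \left(\frac{I}{I_0}\right)^{k_1/k_2}
\]

The observed exponent is the ratio of the two Weber–Fechner constants, which is why a logarithmic inner scale and an empirical power law are not in conflict.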

Journal of the Experimental Analysis of Behavior, 2024 · doi:10.1002/jeab.4203