Low interrater agreement on the semantic base of textual material.
Skip goals that ask you to find a text’s hidden meaning — raters can’t agree on what it is.
01 Research in Context
What this study did
The team ran three small experiments. In each one, they asked different raters to read short texts.
The raters had to pick out the “semantic base,” the underlying meaning of each text, which the original abstract treats as a cognitive construct.
Across all three experiments, agreement among raters stayed below the level we accept for good data.
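This summary does not say which agreement statistic the study used. As a rough illustration only, the sketch below assumes two raters each assign one category label per text, then computes simple percent agreement and Cohen's kappa (agreement corrected for chance). The labels, the data, and the function names are hypothetical, not taken from the paper.

```python
from collections import Counter


def percent_agreement(rater_a, rater_b):
    """Share of items on which both raters chose the same label."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and the same length")
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return matches / len(rater_a)


def cohens_kappa(rater_a, rater_b):
    """Chance-corrected agreement: (p_o - p_e) / (1 - p_e)."""
    n = len(rater_a)
    p_o = percent_agreement(rater_a, rater_b)
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    # Expected agreement if each rater labeled at random with their own base rates.
    p_e = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if p_e == 1.0:
        return 1.0  # both raters used one identical label for every item
    return (p_o - p_e) / (1 - p_e)


# Hypothetical data: each letter stands for the "meaning" a rater picked for a text.
rater_a = ["A", "B", "A", "C", "B", "A", "C", "B", "A", "C"]
rater_b = ["A", "C", "B", "C", "A", "A", "B", "B", "C", "C"]

print(f"percent agreement: {percent_agreement(rater_a, rater_b):.0%}")
print(f"Cohen's kappa: {cohens_kappa(rater_a, rater_b):.2f}")
```

With this made-up data the raters match on 5 of 10 texts, and kappa comes out near 0.25. That gap shows why raw percent agreement can overstate how well two people actually agree.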
What they found
Even trained people could not agree on what a text “really means.”
Because raters disagreed, the idea of a stable “semantic base” failed a basic reliability check.
The authors warn: if we can’t measure it reliably, we can’t use it in science.
How this fits with other research
Baer et al. (1984) already showed that most papers only talk about Skinner’s verbal operants; few test them. L et al. now give one reason why — the key unit can’t be rated reliably.
Lancioni et al. (2008) later found the same trouble with visual inspection of graphs. Low agreement keeps popping up when raters have to make open-ended judgments.
Lerman et al. (1995) look like a contradiction because they got 95% agreement on a sex-ed interview. The difference is simple: they used clear, scripted questions, not open “what does this mean?” judgments. Good agreement is possible when the item is concrete.
Why it matters
If you write a goal that hinges on identifying “meaning,” you are building on sand. Use observable verbal operants — mand, tact, intraverbal — that two people can count the same way. Pick targets you can hear and see, not ones you have to interpret.
Rewrite one vague goal like “identify main idea” into an observable operant: “Given a picture, student will tact 3 items shown for 2 consecutive sessions.”
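Once a target is countable, two observers can check each other with total count interobserver agreement (IOA): the smaller count divided by the larger count, times 100. The sketch below is a minimal illustration with made-up session counts. The function and variable names are hypothetical, and the study summarized here does not report this calculation.

```python
def total_count_ioa(count_a, count_b):
    """Total count IOA: smaller count / larger count, as a percentage."""
    if count_a < 0 or count_b < 0:
        raise ValueError("counts must be non-negative")
    if count_a == count_b:
        return 100.0  # also covers the case where both observers recorded zero
    return 100 * min(count_a, count_b) / max(count_a, count_b)


# Hypothetical session: two observers count the student's tacts independently.
observer_1_tacts = 9
observer_2_tacts = 10

print(total_count_ioa(observer_1_tacts, observer_2_tacts))  # prints 90.0
```

Total count IOA only compares totals, so two observers can reach a high score while counting different responses. Interval-based or trial-by-trial checks are stricter when that matters.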
02 At a glance
03 Original abstract
The present paper describes three experiments which were conducted to determine whether independent raters could agree upon the semantic base of textual materials. These experiments were occasioned by an earlier experiment in which the investigators reported success in increasing the ability of students to extract the semantic base from textual materials. The present paper reports our unsuccessful attempts to obtain an acceptable level of agreement among independent raters about what constituted the semantic base of a number of texts. The paper concludes by raising some doubts about the strategy of extending behavior-analytic research to verbal behavior by combining behavior-change procedures with cognitive constructs.
The Analysis of Verbal Behavior, 1989 · doi:10.1007/BF03392840