Signal-detection properties of verbal self-reports.
Swap simple accuracy for bias and discriminability indices when you judge verbal self-reports.
Research in Context
What this study did
College students sat at a computer and played a matching-to-sample game.
After each trial they said whether they were "sure" or "unsure" their answer was right.
The researcher counted hits, misses, false alarms, and correct rejections. Then he calculated two new scores: bias (B'H) and discriminability (A').
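The two indices can be computed directly from those four cell counts. A minimal Python sketch: A' follows Grier's (1971) nonparametric formula, and for bias it uses Grier's B'' as a stand-in, since the B'H index in the study is a related but not identical nonparametric bias measure.

```python
# Nonparametric signal-detection indices for "sure"/"unsure" self-reports.
# Sketch only: A' is Grier's (1971) discriminability index; the bias index
# here is Grier's B'', used as a stand-in for the study's B'H.

def sdt_indices(hits, misses, false_alarms, correct_rejections):
    """Return (hit_rate, fa_rate, a_prime, bias)."""
    H = hits / (hits + misses)                              # P("sure" | correct choice)
    F = false_alarms / (false_alarms + correct_rejections)  # P("sure" | incorrect choice)

    # Grier's A' (discriminability): 0.5 = chance, 1.0 = perfect.
    if H == F:
        a_prime = 0.5
    elif H > F:
        a_prime = 0.5 + ((H - F) * (1 + H - F)) / (4 * H * (1 - F))
    else:  # below-chance discrimination: mirror the formula
        a_prime = 0.5 - ((F - H) * (1 + F - H)) / (4 * F * (1 - H))

    # Grier's B'' (bias): negative = leaning toward "sure" (liberal),
    # positive = leaning toward "unsure" (conservative), 0 = unbiased.
    denom = H * (1 - H) + F * (1 - F)
    bias = 0.0 if denom == 0 else (H * (1 - H) - F * (1 - F)) / denom
    return H, F, a_prime, bias

# Example: a subject who says "sure" on 45 of 50 correct trials
# and on 15 of 50 incorrect trials.
rates = sdt_indices(hits=45, misses=5, false_alarms=15, correct_rejections=35)
```

In this example the subject discriminates well (A' near 0.88) yet still leans toward reporting "sure" (negative bias), exactly the pattern percent-correct alone would hide.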
What they found
Percent-correct accuracy alone missed how the task conditions shaped the students' reports.
Bias showed when the students leaned toward saying "sure" or "unsure." Discriminability showed how well they could actually tell correct responses from errors.
Together, the two numbers gave a clearer picture than percent-correct alone.
How this fits with other research
Baum (1974) first separated "bias" from "undermatching" in choice experiments. Glenn (1993) took the same idea and aimed it at self-report words instead of lever presses.
Aguilar-Mediavilla et al. (2024) later saw the same risk: children with language disorders under-reported bullying when asked directly. Their fix was to add peer reports; this study's fix was to add signal-detection math.
Ganz et al. (2004) recommend using these same indices, B'H and A', to judge whether two therapy groups are truly matched before comparing them.
Why it matters
When a client says "I did great," don't just count it as right or wrong. Plot their hits against false alarms, compute bias and discriminability, and you will see whether they are over-confident, under-confident, or just guessing. This keeps treatment decisions from being fooled by misleadingly high accuracy scores.
Start logging "sure/unsure" after each trial, then chart hits and false alarms, and calculate B'H and A' in Excel.
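The logging advice above amounts to a simple tally step. A minimal sketch, with an illustrative record layout (the field names are assumptions, not from the study):

```python
# Tally a "sure"/"unsure" trial log into the four signal-detection cells.
# Each record pairs the trial outcome (was the choice correct?) with the
# self-report (did the subject say "sure"?). The record layout is illustrative.

def tally(trials):
    """trials: iterable of (correct, reported_sure) boolean pairs."""
    counts = {"hit": 0, "miss": 0, "false_alarm": 0, "correct_rejection": 0}
    for correct, reported_sure in trials:
        if correct and reported_sure:
            counts["hit"] += 1                # correct choice, reported "sure"
        elif correct:
            counts["miss"] += 1               # correct choice, reported "unsure"
        elif reported_sure:
            counts["false_alarm"] += 1        # wrong choice, reported "sure"
        else:
            counts["correct_rejection"] += 1  # wrong choice, reported "unsure"
    return counts

# Five example trials: (correct?, reported "sure"?)
log = [(True, True), (True, False), (False, True), (False, False), (True, True)]
counts = tally(log)
```

The resulting counts feed straight into the bias and discriminability calculations, whether in Excel or code.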
Original abstract
The bias (B'H) and discriminability (A') of college students' self-reports about choices made in a delayed identity matching-to-sample task were studied as a function of characteristics of the response about which they reported. Each matching-to-sample trial consisted of two, three, or four simultaneously presented sample stimuli, a 1-s retention interval, and two, three, or four comparison stimuli. One sample stimulus was always reproduced among the comparisons, and choice of the matching comparison in less than 800 ms produced points worth chances in a drawing for money. After each choice, subjects pressed either a "yes" or a "no" button to answer a computer-generated query about whether the choice met the point contingency. The number of sample and comparison stimuli was manipulated across experimental conditions. Rates of successful matching-to-sample choices were negatively correlated with the number of matching-to-sample stimuli, regardless of whether samples or comparisons were manipulated. As in previous studies, subjects exhibited a pronounced bias for reporting successful responses. Self-report bias tended to become less pronounced as matching-to-sample success became less frequent, an outcome consistent with signal-frequency effects in psychophysical research. The bias was also resistant to change, suggesting influences other than signal frequency that remain to be identified. Self-report discriminability tended to decrease with the number of sample stimuli and increase with the number of comparison stimuli, an effect not attributable to differential effects of the two manipulations on matching-to-sample performance. Overall, bias and discriminability indices revealed effects that were not evident in self-report accuracy scores. 
The results indicate that analyses based on signal-detection theory can improve the description of correspondence between self-reports and their referents and thus contribute to the identification of environmental sources of control over verbal self-reports.
Journal of the Experimental Analysis of Behavior, 1993 · doi:10.1901/jeab.1993.60-495