Assessment & Research

It may not be worth the effort! Trained judges' global ratings as a criterion measure of social skills and anxiety.

Wallander et al. (1983) · Behavior modification

★ The Verdict

Skip pricey judge training: untrained raters give global social-skills scores that are nearly as reliable.

✓ Read this if you are a BCBA who uses peer or staff ratings to track social progress.

✗ Skip if you are a clinician who relies on fine-grained behavior coding rather than global impressions.

01 Research in Context

What this study did

Wallander et al. (1983) asked a simple question. Does training judges help them rate social skills better?

They compared trained judges with untrained college peers. Both groups watched the same videotapes: 12 men, each in a four-minute simulated conversation with a female confederate.

The judges gave global ratings of social skill and anxiety. The team then checked how much the raters agreed.
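
One common way to put a number on "how much the raters agreed" is the average correlation between every pair of judges. The sketch below is an illustration only: the summary does not say which reliability statistic Wallander et al. used, the function names `pearson` and `mean_pairwise_r` are hypothetical, and the scores are invented for the example, not data from the study.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length lists of ratings."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        # A judge who gives every video the same score has no variance,
        # so the correlation is undefined.
        raise ValueError("ratings must vary across videos")
    return cov / sqrt(var_x * var_y)


def mean_pairwise_r(ratings_by_judge):
    """Average Pearson r over all pairs of judges.

    ratings_by_judge: one list per judge, one score per video,
    with videos in the same order for every judge.
    """
    pairs = list(combinations(ratings_by_judge, 2))
    return sum(pearson(a, b) for a, b in pairs) / len(pairs)


# Invented 1-7 global skill ratings: 3 judges x 6 videos per group.
trained = [
    [2, 5, 4, 6, 3, 7],
    [3, 5, 4, 6, 2, 6],
    [2, 4, 5, 7, 3, 6],
]
untrained = [
    [3, 5, 3, 6, 4, 7],
    [2, 6, 4, 5, 2, 6],
    [3, 4, 5, 7, 4, 5],
]

print("trained   mean pairwise r:", round(mean_pairwise_r(trained), 2))
print("untrained mean pairwise r:", round(mean_pairwise_r(untrained), 2))
```

Comparing the two printed values is a rough analogue of the study's question: if the trained group's figure is only marginally higher, training bought little extra agreement. An intraclass correlation would be a more formal choice when judges are treated as a random sample.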

What they found

Training gave only a tiny boost in agreement. Trained and untrained judges scored almost the same.

Mean ratings of social skill and anxiety did not differ between the two groups, which suggests the trained judges' impressions are socially valid. In short, the costly course did little.

How this fits with other research

Segal (1987) seems to disagree. That study found computer training pushed observer accuracy above 90%. The key difference is task type: Segal taught moment-by-moment coding with instant feedback, whereas Wallander et al. used broad global ratings with no feedback.

Matson et al. (2004) lines up with the target paper. Caregivers using a motivation profile also showed only moderate agreement, showing the pattern holds across tools and populations.

Singh et al. (1993) backs the skepticism. That paper warns against leaning on broad trait ratings, just as Wallander et al. warn against over-training raters for global scores.

Why it matters

Stop pouring hours into judge boot camps. For everyday social-skills checks, a quick rubric and any willing rater will do. Save your budget for teaching the learner, not the observer.

→ Action — try this Monday

Use a one-page rubric and any available staff to rate your next social-skills video; skip the long training meeting.

02 At a glance

Intervention

not applicable

Design

other

Sample size

Population

neurotypical

Finding

null

Magnitude

negligible

03 Original abstract

Within the social skills research area, the ratings of trained judges are presumed to be of better reliability compared to those of untrained peers, but possibly at a cost in social validity since the latter directly represents the criterion. To investigate these issues, videotapes were obtained of 12 males who interacted with a female confederate in a typical four-minute simulated heterosocial situation. These were exhibited to a group of judges who had been trained to rate social skills and anxiety, and to a group that had received no training in this task. Judge types did not differ in mean levels of social skills and anxiety ratings, suggesting that trained judges' impressions are socially valid. However, the trained judges' interrater reliability was only slightly better than that of the peer judges. The latter finding was used to argue that untrained peer judges possibly can be used just as well as trained assistants to provide criterion ratings in social skills research.

Behavior modification, 1983 · doi:10.1177/01454455830072001