Adaptive algorithms for shaping behavior.
Let the learner’s success rate drive task difficulty—automatically drop back to easier steps when errors rise and advance when mastery is fast.
01 Research in Context
What this study did
The authors of this 2025 study built a computer program that decides how hard to make the next trial.
The program watches the learner's success rate: if errors jump, it drops back to easier steps; if mastery is fast, it moves forward.
They tested the code on two jobs: learning a long button sequence and guiding a robot through a maze with almost no rewards.
What they found
The algorithm hit near-perfect scores on the sequence task.
It also taught the robot to reach the goal in the sparse-reward maze—something basic shaping often fails to do.
In short, the algorithm picked better training steps than a fixed, hand-designed curriculum.
How this fits with other research
Terrace (1969) shaped lever-pulling in monkeys by hand. This 2025 study lets a Monte Carlo engine do the picking: same idea, new driver's seat.
Ninness et al. (2018) used neural nets to predict how stimulus relations form. The present authors use a different AI technique, but both show computers can forecast learning paths before real kids sit down.
Johansson (2025) modeled human-like relational framing with an AI reasoner. The present study models step-by-step skill growth. Together they show that behavioral principles can live inside code.
Why it matters
You already adjust task difficulty on the fly. This paper gives you a rule you can code into any tablet: if accuracy drops below 80%, back up one step; if the learner nails three trials in a row, jump ahead. No guesswork, no stalled programs. Try it with handwriting, tooth-brushing, or intraverbal webs, and let the data, not your gut, drive the next prompt.
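The rule above can be sketched in a few lines of Python. This is a hypothetical illustration of the thresholds described in this article, not the paper's own code; the function name and parameters are made up for the example:

```python
from collections import deque

def next_level(level, recent, lowest=0, highest=10,
               threshold=0.8, streak=3):
    """Hypothetical sketch of the adaptive rule described above.
    `recent` holds the most recent trial outcomes (True = correct)."""
    accuracy = sum(recent) / len(recent)
    if accuracy < threshold:
        return max(lowest, level - 1)               # back up one step
    if len(recent) >= streak and all(list(recent)[-streak:]):
        return min(highest, level + 1)              # jump ahead
    return level                                    # hold steady

# Track a rolling window of the five most recent responses.
window = deque(maxlen=5)
level = 3
for outcome in [True, True, False, False, False]:
    window.append(outcome)
    level = next_level(level, window)
print(level)  # → 0 (difficulty stepped down as errors accumulated)
```

The rolling window is the whole trick: a `deque` with `maxlen=5` silently discards the sixth-oldest response, so accuracy always reflects only the learner's recent performance.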
02 At a glance
Track the last five responses; if accuracy falls under 80%, move one prompt level easier; if three trials in a row are fast and correct, move one level harder.
03 Original abstract
Dogs and laboratory mice are commonly trained to perform complex tasks by guiding them through a curriculum of simpler tasks ('shaping'). What are the principles behind effective shaping strategies? Here, we propose a teacher-student framework for shaping behavior, where an autonomous teacher agent decides its student's task based on the student's transcript of successes and failures on previously assigned tasks. Using algorithms for Monte Carlo planning under uncertainty, we show that near-optimal shaping algorithms achieve a careful balance between reinforcement and extinction. Near-optimal algorithms track learning rate to adaptively alternate between simpler and harder tasks. Based on this intuition, we derive an adaptive shaping heuristic with minimal parameters, which we show is near-optimal on a sequence learning task and robustly trains deep reinforcement learning agents on navigation tasks that involve sparse, delayed rewards. Extensions to continuous curricula are explored. Our work provides a starting point towards a general computational framework for shaping behavior that applies to both animals and artificial agents.
PLOS Computational Biology, 2025 · doi:10.1371/journal.pcbi.1013454