Assessment & Research

R² should not be used to describe behavioral-economic discounting and demand models

Gelino et al. (2024) · Journal of the Experimental Analysis of Behavior
★ The Verdict

Stop reporting R² for demand or discounting curves: the metric is systematically harsher on shallow, linear-like data, so well-described data sets can look like poor fits.

✓ Read this if you are a BCBA who builds or interprets behavioral-economic graphs in clinics or labs.
✗ Skip if you only read finished assessment summaries and never touch the stats.

01 Research in Context

01

What this study did

Gelino et al. (2024) looked at how we judge the quality of behavioral-economic models.

They checked whether R², the usual “goodness-of-fit” number, is fair to data with low parameter values—that is, shallow discounting or inelastic demand curves.

The paper is a math critique, not a new experiment.

02

What they found

R² is systematically more stringent when the discounting parameter (k) or the elasticity parameter (α) is low, so shallow, linear-like data earns a worse fit score even when the model describes it well.

The authors say we should swap R² for expectation-testing algorithms instead.
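The mechanism behind the bias can be sketched numerically: R² = 1 − SS_res/SS_tot, and a flat (low-k) curve has less total variance, so the same residual noise yields a lower R². Below is a minimal pure-Python sketch with made-up delays and a fixed noise pattern (all values hypothetical, for illustration only):

```python
def hyperbolic(A, k, delays):
    """Mazur's hyperbolic discounting model: V = A / (1 + k*D)."""
    return [A / (1 + k * d) for d in delays]

def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1 - ss_res / ss_tot

delays = [1, 7, 30, 90, 180, 365]          # delays in days (hypothetical)
noise = [0.03, -0.03, 0.02, -0.02, 0.03, -0.03]  # identical error pattern for both

for k in (0.10, 0.001):  # steep vs. shallow discounter
    true_vals = hyperbolic(1.0, k, delays)
    observed = [v + e for v, e in zip(true_vals, noise)]
    print(f"k = {k}: R^2 = {r_squared(observed, true_vals):.3f}")
```

With the exact same residuals, the steep curve (k = 0.10) scores about 0.99 while the shallow curve (k = 0.001) drops to about 0.94, despite the model fitting both equally well.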

03

How this fits with other research

Kaplan et al. (2019) gave us the free beezdemand R package. That tool still prints R² for every demand curve.

Corredor et al. (2025) sidestep the whole fight. Their new F-Cap model uses plain OLS on a straight-line version of the curve.

So we now have three choices: keep R² (old way), test expectations (Gelino), or linearize and use OLS (Corredor).
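The third route, linearize and use OLS, can be illustrated generically (this is not the F-Cap model itself, and the purchase-task numbers below are made up): a constant-elasticity demand curve Q = Q0 · price^b becomes a straight line in log-log coordinates, so ordinary least squares recovers the elasticity b as the slope.

```python
import math

# Hypothetical purchase-task data (made-up values for illustration)
prices = [0.5, 1, 2, 4, 8, 16]
consumption = [10.1, 9.8, 8.7, 7.2, 5.1, 3.0]

# Linearize: log10(Q) = log10(Q0) + b * log10(price)
x = [math.log10(p) for p in prices]
y = [math.log10(q) for q in consumption]

# Ordinary least squares by hand: slope = cov(x, y) / var(x)
n = len(x)
mx, my = sum(x) / n, sum(y) / n
slope = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sum((xi - mx) ** 2 for xi in x)
intercept = my - slope * mx

print(f"elasticity (slope) = {slope:.3f}, log10(Q0) = {intercept:.3f}")
```

Because the fitted relation is linear, the usual OLS machinery (and its R²) applies without the nonlinear-model caveats the paper raises.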

04

Why it matters

If you run demand or discounting analyses, drop R² from your reports. Quote model evidence with expectation tests or linearized OLS instead. This small edit keeps your fit statistics honest and your graphs publication-ready.

→ Action — try this Monday

Open your last demand report, delete the R² line, and add an expectation-test or OLS fit note instead.
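One widely cited expectation-testing algorithm for discounting data is Johnson and Bickel's (2008) two criteria. The sketch below assumes indifference points are expressed as proportions of the larger delayed reward, so the thresholds are 0.20 and 0.10; it is a minimal illustration, not a full implementation.

```python
def is_systematic(indiff_points, bounce=0.20, min_drop=0.10):
    """Flag a discounting series as systematic per Johnson & Bickel (2008).

    Criterion 1: no indifference point may exceed the preceding one
    by more than `bounce` of the larger later reward.
    Criterion 2: the final point must lie at least `min_drop` below
    the first point (i.e., some discounting must have occurred).
    """
    c1 = all(b - a <= bounce for a, b in zip(indiff_points, indiff_points[1:]))
    c2 = (indiff_points[0] - indiff_points[-1]) >= min_drop
    return c1 and c2

print(is_systematic([0.95, 0.85, 0.60, 0.40, 0.20, 0.10]))  # orderly decay
print(is_systematic([0.95, 0.40, 0.90, 0.30, 0.85, 0.20]))  # bounces, flagged
```

Unlike R², this check asks whether the data behave the way discounting data should, independent of how steep the curve happens to be.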

02 At a glance

Intervention: not applicable
Design: theoretical
Finding: not reported

03 Original abstract

Literature concerning operant behavioral economics shows a strong preference for the coefficient of determination (R²) metric to (a) describe how well an applied model accounts for variance and (b) depict the quality of collected data. Yet R² is incompatible with nonlinear modeling. In this report, we provide an updated discussion of the concerns with R². We first review recent articles that have been published in the Journal of the Experimental Analysis of Behavior that employ nonlinear models, noting recent trends in goodness-of-fit reporting, including the continued reliance on R². We then examine the tendency for these metrics to bias against linear-like patterns via a positive correlation between goodness of fit and the primary outputs of behavioral-economic modeling. Mathematically, R² is systematically more stringent for lower values for discounting parameters (e.g., k) in discounting studies and lower values for the elasticity parameter (α) in demand analysis. The study results suggest there may be heterogeneity in how this bias emerges in data sets of varied composition and origin. There are limitations when using any goodness-of-fit measure to assess the systematic nature of data in behavioral-economic studies, and to address those we recommend the use of algorithms that test fundamental expectations of the data.

Journal of the Experimental Analysis of Behavior, 2024 · doi:10.1002/jeab.4200