Regression to the mean

Regression to the mean (also known as regression toward the mean, reversion to the mean, or reversion to mediocrity) refers to the statistical phenomenon in which extreme values, whether unusually high or unusually low, are likely to be followed by values closer to the average on subsequent measurements. It occurs whenever the system being measured includes some random variability, so that repeated measurements are not perfectly correlated.

Clarification: Regression to the mean does not imply that things “naturally return to normal” or that performance declines due to some inherent limitation. It is a predictable artifact of variability. When measurements include both a consistent signal (e.g., skill, fitness, strength) and a random component (e.g., fatigue, stress, luck), extreme values often reflect a combination of both. The next measurement is unlikely to repeat the same degree of extremity, simply because the random component is unlikely to be as extreme again in the same direction.
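The signal-plus-noise description above can be written as a short worked formula. This is a sketch under stated assumptions, not a general result: the symbols T (stable signal), E (random component), μ, σ and ρ are introduced here for illustration, the two random components are assumed independent with equal variance, and the expression is exact only if the measurements are jointly normal (otherwise it is the best linear prediction).

```latex
\begin{align*}
X_1 &= T + E_1, \qquad X_2 = T + E_2
    && \text{(same signal, fresh random component)} \\
\rho &= \frac{\sigma_T^2}{\sigma_T^2 + \sigma_E^2}
    && \text{(share of variance due to the signal)} \\
\mathbb{E}[X_2 \mid X_1 = x] &= \mu + \rho\,(x - \mu)
    && \text{(expected follow-up given a first value } x\text{)} \\
\mu = 100,\ \rho = 0.6,\ x = 130 \;&\Rightarrow\; 100 + 0.6 \times 30 = 118
    && \text{(illustrative numbers only)}
\end{align*}
```

Whenever the random component has any variance, ρ is less than 1, so the expected follow-up lies between the first value and the mean. The more extreme the first value, the larger the expected movement back toward the mean.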

Applied Example: If an athlete records a personal best sprint time, significantly faster than usual, it’s likely that on their next attempt, they’ll run closer to their average. This doesn’t mean they’ve gotten slower; rather, it’s likely that their peak performance was aided by favorable random factors (e.g., wind, adrenaline, ideal timing). Similarly, an athlete who underperforms one day is likely to improve on their next attempt. This fluctuation is expected and does not require a causal explanation.
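The sprint example can be illustrated with a small simulation. This is a minimal sketch with hypothetical numbers (a 12.0 s average time and 0.3 s of spread for both skill and day-to-day luck), not data from real athletes. Each athlete's skill is identical on both attempts; only the random component changes.

```python
import random
import statistics


def simulate_two_attempts(n_athletes=10_000, seed=0):
    """Return (first, second) sprint times: stable skill plus independent luck."""
    rng = random.Random(seed)
    attempts = []
    for _ in range(n_athletes):
        skill = rng.gauss(12.0, 0.3)         # hypothetical true average time (s)
        first = skill + rng.gauss(0.0, 0.3)  # attempt 1: skill + random factors
        second = skill + rng.gauss(0.0, 0.3) # attempt 2: same skill, new random factors
        attempts.append((first, second))
    return attempts


attempts = simulate_two_attempts()
overall_mean = statistics.mean([first for first, _ in attempts])

# Select the 5% of athletes with the fastest (lowest) first-attempt times.
attempts.sort(key=lambda pair: pair[0])
fastest = attempts[: len(attempts) // 20]

fastest_first = statistics.mean([first for first, _ in fastest])
fastest_second = statistics.mean([second for _, second in fastest])

print(f"overall mean time:          {overall_mean:.2f} s")
print(f"fastest 5%, first attempt:  {fastest_first:.2f} s")
print(f"fastest 5%, second attempt: {fastest_second:.2f} s")
```

In this sketch the selected group is still faster than average on the second attempt (their skill is real), but slower than on the first attempt, even though nothing about the athletes changed between attempts.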

Why Averages Regress to the Mean: The same random variability that produces regression to the mean is also the reason that averaging values smooths out extremes. When values are averaged, random high values and random low values tend to offset each other, pulling the overall average closer to the center of the distribution. The more repeated measures you take (or the larger the sample size), the more the average reflects the underlying signal, and the less it is influenced by noise. Strictly speaking, this convergence of sample means is described by the law of large numbers; regression to the mean is the closely related tendency of a single extreme measurement to be followed by a less extreme one.
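How quickly averaging suppresses noise can be checked with a short simulation. The sketch below assumes hypothetical measurements drawn from a normal distribution with mean 50 and standard deviation 10, and reports how much sample means vary for several sample sizes. Under these assumptions the spread of the sample mean should shrink roughly in proportion to one over the square root of the sample size.

```python
import random
import statistics


def spread_of_sample_means(sample_size, n_samples=2_000, seed=1):
    """Standard deviation of the means of many simulated samples of a given size."""
    rng = random.Random(seed)
    means = []
    for _ in range(n_samples):
        sample = [rng.gauss(50.0, 10.0) for _ in range(sample_size)]
        means.append(statistics.mean(sample))
    return statistics.stdev(means)


# Spread of the sample mean for increasing sample sizes.
spreads = {n: spread_of_sample_means(n) for n in (1, 4, 16, 64)}

for n, spread in spreads.items():
    print(f"sample size {n:>2}: spread of sample means = {spread:.2f}")
```

Each fourfold increase in sample size should roughly halve the spread, which is why a single measurement can be far from the underlying signal while an average of many measurements rarely is.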

Frequently Asked Questions (FAQ)

Is regression to the mean a statistical artifact or a real effect?

Regression to the mean is a real, observable statistical phenomenon, not just a mathematical curiosity. It occurs any time measurements are influenced by both stable traits and temporary fluctuations (e.g., mood, fatigue, measurement error). If you select extreme cases on the first measurement, follow-up measures will likely be less extreme simply because extreme values partly reflect randomness.

Does "regression to the mean" mean things are improving or getting worse?

Neither. It only means that extreme values are likely to be followed by values closer to the average, and that shift can be misinterpreted as an improvement or a decline. For example, if treatment begins when pain is at its highest during a flare-up, later measurements will tend to show less pain and may appear to indicate that the treatment is effective, even if the improvement is due to natural fluctuations alone.

How can regression to the mean bias research results?

If a study selects participants based on extreme values (e.g., high pain, low performance), those values are likely to regress toward the average over time, regardless of intervention. If a control group is not included, this natural regression may be misattributed to the intervention.

Does regression to the mean affect all research designs equally?

No. Randomized controlled trials (RCTs) help account for regression to the mean by ensuring that both the treatment and control groups experience similar natural variation. However, uncontrolled before-and-after studies, case series, or poorly matched groups are highly susceptible to regression bias.
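The contrast between an uncontrolled before-and-after comparison and a randomized comparison can be sketched in a simulation. Everything here is hypothetical: the pain scores, the enrollment threshold of 7, and the noise levels are illustrative assumptions, and the simulated treatment has no effect at all.

```python
import random
import statistics


def simulate_trial(n_screened=20_000, threshold=7.0, seed=2):
    """Enroll only high-pain participants, then randomize to two arms.

    The 'treatment' does nothing: follow-up pain is usual pain plus new noise.
    Returns the change scores (follow-up minus baseline) for each arm.
    """
    rng = random.Random(seed)
    treated, control = [], []
    for _ in range(n_screened):
        usual_pain = rng.gauss(5.0, 1.0)               # stable component
        baseline = usual_pain + rng.gauss(0.0, 1.5)    # screening measurement
        if baseline < threshold:
            continue                                   # extreme-value inclusion criterion
        follow_up = usual_pain + rng.gauss(0.0, 1.5)   # no treatment effect applied
        change = follow_up - baseline
        # Random allocation to arms, independent of pain scores.
        (treated if rng.random() < 0.5 else control).append(change)
    return treated, control


treated, control = simulate_trial()
treated_change = statistics.mean(treated)
control_change = statistics.mean(control)

print(f"mean change, treated arm: {treated_change:+.2f}")
print(f"mean change, control arm: {control_change:+.2f}")
print(f"difference between arms:  {treated_change - control_change:+.2f}")
```

Looking at the treated arm alone, pain appears to drop substantially after the intervention. The control arm shows a similar drop, so the between-arm difference is close to zero, which is the correct conclusion for a treatment with no effect.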

How is regression to the mean different from natural recovery or placebo effect?

Regression to the mean is a statistical tendency; natural recovery is a biological process; the placebo effect is a psychological phenomenon. All three can result in observed improvements unrelated to the intervention. Only rigorous experimental design, such as the use of appropriate control groups, can separate these effects from true treatment efficacy.

Brookbush Institute Perspective

Regression to the mean is one of the most misunderstood and underappreciated sources of bias in clinical research, particularly in studies with extreme-value inclusion criteria or small sample sizes, and in meta-analyses that include heterogeneous studies. The danger is not the phenomenon itself (it is mathematically inevitable whenever measurements contain random variability), but the failure to recognize when it is at play.

For example, in physical rehabilitation research, participants are often recruited during flare-ups or periods of peak dysfunction. Any intervention that follows is likely to appear effective simply because symptoms improve naturally over time. This is one reason control groups are essential. While research offers stronger evidence than opinion, it is still critical to interpret the outcomes of individual studies with caution.

Regression to the mean also helps explain why meta-analyses often fail to reject the null hypothesis despite trends clearly evident from vote-counting. When averaging across heterogeneous studies, especially those with inconsistent methods, inclusion criteria, and sample sizes, effects may be diluted. Small but consistent trends can be canceled out by variability. This is not evidence that an effect does not exist; it is the mathematical consequence of averaging noisy data. Meta-analyses, while powerful, must be interpreted with a clear understanding of this limitation.
