Sunday, February 8, 2026

Post 10: The Mathematics of Declining Research Quality—A Deep Dive into the Model

Smaldino & McElreath’s (2016) “The Natural Selection of Bad Science” is often discussed for its cultural and sociological insights (publish-or-perish, career pressure, replication failures), but fewer people have actually read the mathematics. Yet the mathematics is the engine that powers the paper’s central claim:

If incentives reward production of positive, novel results, then low-rigour research strategies will evolve automatically—even when no one intends harm.

Today’s post is a full walkthrough of the model: what it assumes, how it works, what it predicts, and why its implications are unavoidable under current scientific incentive structures.


1. Why a Mathematical Model?

Intuition is useful, but evolution does not always behave intuitively.

Sometimes selection leads to unexpected outcomes:

  • Cooperation collapses even when everyone agrees it should exist.

  • A trait that is costly (e.g., low rigour → more errors) can still spread if it brings a relative advantage.

  • Populations evolve toward states that are stable, not optimal.

Smaldino & McElreath turned to evolutionary game theory and cultural evolution to formalize these dynamics.
Their model is not about people being evil, ignorant, or lazy.
It is about strategies being selected by an environment shaped by:

  • publication counts,

  • significance thresholds,

  • novelty bias,

  • grant success metrics.

In this sense, it is no different from modelling how bacteria evolve in a petri dish.
The “nutrient agar” here is the academic career structure.


2. What Is Being Modelled?

The model simulates research labs (or research strategies), each defined by two key traits:

2.1 Effort (e)

Effort represents rigour, specifically:

  • careful experimental design

  • large sample sizes

  • proper controls

  • thorough analysis

High effort → higher replication success, slower output.
Low effort → faster output, more false positives.

2.2 Productivity (h)

Productivity is the probability that a lab starts a new study in a given time step. More studies started means more chances to publish.

In the model, effort and productivity are inversely related:

High effort → low productivity.
Low effort → high productivity.

This captures real-world lab dynamics: the fastest labs are rarely the most careful.
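The effort/productivity trade-off can be sketched in a few lines. The logarithmic form and the slope value below are my recollection of the paper's setup, so treat both as assumptions rather than a quotation; `productivity` and `ETA` are names chosen for this post.

```python
import math

ETA = 0.2  # assumed slope: how strongly extra effort slows a lab down


def productivity(effort: float, eta: float = ETA) -> float:
    """Chance that a lab starts a new study in one time step.

    Assumed form: h = 1 - eta * log10(effort), with effort in [1, 100].
    """
    if effort < 1:
        raise ValueError("effort is assumed to be at least 1")
    return 1.0 - eta * math.log10(effort)


for e in (1, 10, 100):
    print(f"effort={e:>3}  h={productivity(e):.2f}")
```

Under these assumptions a lab at minimum effort runs a study every time step, while a lab at maximum effort runs one in roughly six steps out of ten.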


3. The Core Equations

Now let’s walk through the main mathematical components.


3.1 Producing Results

Each lab attempts studies.
For each study:

  • There is a true effect with base probability b (background rate of true hypotheses).

  • The lab’s rigour (effort e) sets its false positive rate; its false negative rate follows from its statistical power.

The probability of obtaining a publishable, positive result is influenced by:

  • the prevalence of true hypotheses (b)

  • the statistical power of the lab (a lab characteristic in its own right; for a given power, more effort means fewer false positives)

  • the false positive rate (decreasing with effort)

Low effort produces many publishable false positives.
High effort produces fewer but more accurate results.
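Here is a minimal sketch of how those three ingredients combine. The formula for the false positive rate is the one I recall from the paper, and the values for b and power are illustrative assumptions; the function names are mine.

```python
def false_positive_rate(power: float, effort: float) -> float:
    """Assumed form: alpha = W / (1 + (1 - W) * e)."""
    return power / (1.0 + (1.0 - power) * effort)


def p_positive(base_rate: float, power: float, effort: float) -> float:
    """Chance that a single study yields a positive result.

    True hypotheses (share b) come out positive with probability `power`;
    false ones (share 1 - b) come out positive with probability alpha.
    """
    alpha = false_positive_rate(power, effort)
    return base_rate * power + (1.0 - base_rate) * alpha


b, W = 0.1, 0.8  # assumed base rate and power
for e in (1, 75):
    print(f"effort={e:>2}  alpha={false_positive_rate(W, e):.3f}  "
          f"P(positive)={p_positive(b, W, e):.3f}")
```

With these assumed numbers, a lab at effort 1 gets a positive result from about two thirds of its studies, against one in eight for a lab at effort 75. Most of the difference is false positives.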


3.2 Publication and Fitness

Labs gain “fitness” (academic success) through publications.

Fitness increases when:

  • many papers are produced

  • papers get published (positive results only)

  • labs obtain grants (which depend on publication counts)

Fitness is a mathematical proxy for:

  • student recruitment

  • resource access

  • prestige

  • survival through competition

Thus:

A lab’s ability to survive and reproduce (i.e., spawn new labs) is directly tied to publication output.

Quality matters only indirectly, through replication.
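To see what this payoff rule rewards, here is a rough sketch of expected fitness gained per time step when replication is ignored. It assumes each published paper is worth one unit, that negative results are published with probability `neg_pub_prob` (zero by default, matching "positive results only"), and the same functional forms and parameter values as assumed elsewhere in this post. None of the names come from the paper.

```python
import math

BASE_RATE = 0.1  # assumed share of tested hypotheses that are true
POWER = 0.8      # assumed statistical power, held fixed
ETA = 0.2        # assumed cost of effort on productivity


def expected_payoff(effort: float, neg_pub_prob: float = 0.0) -> float:
    """Expected publications per time step, each worth one unit of fitness."""
    h = 1.0 - ETA * math.log10(effort)              # chance of running a study
    alpha = POWER / (1.0 + (1.0 - POWER) * effort)  # false positive rate
    p_pos = BASE_RATE * POWER + (1.0 - BASE_RATE) * alpha
    p_published = p_pos + neg_pub_prob * (1.0 - p_pos)
    return h * p_published


for e in (1, 10, 75):
    print(f"effort={e:>2}  payoff per step={expected_payoff(e):.3f}")
```

Nothing in this payoff checks whether a result is true, which is why quality can only enter through replication. Setting `neg_pub_prob=1.0` removes the positive-results filter and leaves only the speed advantage of low effort.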


3.3 Replication as a Purifying Force

A small proportion r of studies attempt replication.

Let:

  • s = success rate of replication (dependent on original lab’s effort)

  • f = 1 − s = failure rate of replication

Replication outcomes affect fitness:

  • Successful replication → boosts reputation

  • Failed replication → decreases reputation or eliminates strategy

But here is the critical insight:

Replications occur too infrequently to counterbalance the huge productivity advantage of low-effort labs.

This is the model’s mathematical account of the replicability crisis.
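A back-of-the-envelope version of that trade-off, under loudly stated assumptions: replications are idealised (true findings replicate, false ones do not), a successful replication earns the original lab a small bonus, and a failed one costs it a penalty. The publication rates, false shares, bonus, and penalty below are made-up numbers for illustration, not the paper's values.

```python
def net_payoff(papers: float, false_share: float, r: float,
               penalty: float, bonus: float = 0.1) -> float:
    """Expected payoff per time step, including replication outcomes.

    Each paper is worth 1; with probability r it is replicated, earning
    `bonus` if the finding was true and costing `penalty` if it was false.
    """
    per_paper = 1.0 + r * ((1.0 - false_share) * bonus - false_share * penalty)
    return papers * per_paper


def break_even_penalty(n_fast: float, q_fast: float, n_care: float,
                       q_care: float, r: float, bonus: float = 0.1) -> float:
    """Penalty at which fast and careful labs earn the same expected payoff."""
    gain_gap = (n_fast - n_care) + r * bonus * (
        n_fast * (1.0 - q_fast) - n_care * (1.0 - q_care))
    exposure_gap = r * (n_fast * q_fast - n_care * q_care)
    return gain_gap / exposure_gap


# Hypothetical labs: papers per step and share of those papers that are false.
fast, careful, r = (0.6, 0.8), (0.1, 0.3), 0.01
print(net_payoff(*fast, r, penalty=1.0), net_payoff(*careful, r, penalty=1.0))
print(break_even_penalty(*fast, *careful, r))
```

With a 1% replication rate and these invented numbers, a failed replication has to cost on the order of a hundred papers' worth of payoff before the careful lab catches up on average. Even then, the many fast labs that happen not to be replicated keep their full advantage.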


4. Evolutionary Dynamics: Inheritance, Variation, Competition

Labs reproduce (spawn new labs) with a probability that rises with their fitness; in the paper, the lab with the highest payoff in a random sample is the one that reproduces.
Offspring labs inherit parental traits with mutation.

Just like biological evolution:

  • successful strategies proliferate

  • unsuccessful ones go extinct

  • small random variations introduce novelty

The model is iterated over many generations.
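The inherit, vary, compete loop can be illustrated with a deliberately simplified, deterministic sketch. It is not the paper's agent-based simulation: it tracks the population share of each effort level, uses fitness-proportional reproduction, ignores replication, and lets a fraction `mu` of offspring shift one effort level up or down. All functional forms and parameter values are assumptions.

```python
import math

BASE_RATE, POWER, ETA = 0.1, 0.8, 0.2  # assumed model parameters


def fitness(effort: int) -> float:
    """Expected positive results per time step (no replication)."""
    h = 1.0 - ETA * math.log10(effort)
    alpha = POWER / (1.0 + (1.0 - POWER) * effort)
    return h * (BASE_RATE * POWER + (1.0 - BASE_RATE) * alpha)


def evolve(generations: int = 1000, mu: float = 0.05, start: int = 75):
    """Return mean effort per generation under selection plus mutation."""
    levels = list(range(1, 101))
    share = [1.0 if e == start else 0.0 for e in levels]
    history = []
    for _ in range(generations):
        history.append(sum(e * s for e, s in zip(levels, share)))
        # Selection: each effort level grows in proportion to its fitness.
        weighted = [s * fitness(e) for e, s in zip(levels, share)]
        total = sum(weighted)
        # Mutation: a fraction mu of offspring move one level up or down,
        # staying put when that would leave the allowed range.
        nxt = [0.0] * len(levels)
        for k, w in enumerate(weighted):
            s = w / total
            nxt[k] += (1.0 - mu) * s
            nxt[max(k - 1, 0)] += 0.5 * mu * s
            nxt[min(k + 1, len(levels) - 1)] += 0.5 * mu * s
        share = nxt
    history.append(sum(e * s for e, s in zip(levels, share)))
    return history


trajectory = evolve()
print(f"mean effort: start={trajectory[0]:.1f}, end={trajectory[-1]:.1f}")
```

Because fitness here falls with every extra unit of effort, the population slides toward the lower bound even though it starts with every lab at effort 75.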


5. What the Model Predicts

The results are stark.

5.1 Effort declines over generations

Even if the initial population begins with very high rigour, the following happens:

  1. Low-effort labs publish more.

  2. They gain more funding, prestige, and visibility.

  3. They produce more “offspring” labs.

  4. They dominate the population.

Mathematically, effort e declines toward the minimum boundary of the model.

This is not moral failure.
It is adaptive optimisation under current incentives.


5.2 False positives increase

Because low-effort labs proliferate, the overall false positive rate increases dramatically.

As effort falls, the share of published positive results that are false rises. In practice that corresponds to more:

  • Type I errors

  • exaggerated effect sizes

  • non-reproducible claims

This is consistent with the replication problems reported in psychology, cancer biology, genetics, and economics.
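The link between effort and the share of false findings can be made concrete with the false discovery rate: false positives divided by all positives. The sketch below reuses the functional form assumed earlier for the false positive rate, with illustrative values for the base rate and power.

```python
def false_discovery_rate(effort: float, base_rate: float = 0.1,
                         power: float = 0.8) -> float:
    """Share of positive results that are false, for a given effort level."""
    alpha = power / (1.0 + (1.0 - power) * effort)  # assumed form
    true_pos = base_rate * power
    false_pos = (1.0 - base_rate) * alpha
    return false_pos / (true_pos + false_pos)


for e in (75, 10, 1):
    print(f"effort={e:>2}  false discovery rate={false_discovery_rate(e):.2f}")
```

Under these assumptions roughly a third of positive results are false at effort 75, three quarters at effort 10, and close to nine in ten at the minimum. The rise comes entirely from the drop in effort, since the base rate and power are held fixed.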


5.3 Replication cannot rescue the system

Even when the replication rate is increased, effort keeps declining, only more slowly.

Why?

Two reasons:

5.3.1 Replication rate is too small relative to publication volume

Low-effort labs produce so many papers that the replication system is overwhelmed.

5.3.2 Replications have low prestige

In real life, failed replications rarely cause career extinction; in the model, even a heavy penalty reaches only the few labs whose work actually gets replicated.
They mostly create noise.

Replication is like a weak immune system facing overwhelming infection.


5.4 The only way to reverse decline is to change incentives

The simulations suggest that moral encouragement or “awareness of good practice” cannot solve the problem on its own, because only payoffs decide which strategies spread.

On that logic, rigour can evolve back upward only if:

  • replication is heavily rewarded,

  • low-effort labs are severely punished, and

  • publication counts stop driving survival.

Without structural overhaul, decline is inevitable.


6. Mathematical Intuition: Why Low Rigour Wins

Let's illustrate the intuition using a simplified scenario.

Suppose two labs:

  • Careful Lab: produces 2 solid papers/year

  • Fast Lab: produces 6 weak papers/year

If careers are evaluated on paper count, then:

  • Fast Lab will obtain more grants

  • Fast Lab will attract more students

  • Fast Lab will expand faster

  • Fast Lab’s descendants will dominate

Even if 60% of Fast Lab’s papers are false:

  • They still outcompete

  • Replications catch only a small fraction

  • The ecosystem floods with noise

  • Careful labs eventually die off

The evolutionary equilibrium favours strategies that maximise short-term output, not long-term reliability.
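The two-lab scenario can be run as simple arithmetic. The sketch assumes that in each round a lab type's share of the population grows in proportion to its paper count, that all of Careful Lab's papers are sound, and that 60% of Fast Lab's are false. The growth rule is an assumption made for illustration, and the function names are mine.

```python
def fast_share(rounds: int, careful_papers: float = 2, fast_papers: float = 6,
               start: float = 0.5) -> float:
    """Population share of Fast Lab's strategy after some rounds of growth."""
    fast, careful = start, 1.0 - start
    for _ in range(rounds):
        fast *= fast_papers          # growth proportional to paper count
        careful *= careful_papers
        total = fast + careful
        fast, careful = fast / total, careful / total
    return fast


def false_share_of_literature(fast_fraction: float, careful_papers: float = 2,
                              fast_papers: float = 6,
                              fast_false_rate: float = 0.6) -> float:
    """Share of all published papers that are false, given the lab mix."""
    false_papers = fast_fraction * fast_papers * fast_false_rate
    all_papers = (fast_fraction * fast_papers
                  + (1.0 - fast_fraction) * careful_papers)
    return false_papers / all_papers


for g in (0, 1, 5):
    share = fast_share(g)
    print(f"round {g}: fast share={share:.3f}, "
          f"false share of papers={false_share_of_literature(share):.3f}")
```

Starting from an even split, the fast strategy holds three quarters of the population after one round and over 99% after five, and the false share of the literature climbs from 45% toward Fast Lab's own 60%.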

This mirrors classic tragedy-of-the-commons models.


7. Why the Model Matters

This model is not a toy.

It offers an explanation for patterns seen in real data:

  • rising retraction rates

  • inflated effect sizes

  • widespread p-hacking

  • the reversals of “established findings”

  • competitive “science bubbles” that collapse

  • the growth of high-output, low-rigour labs

It demonstrates mathematically that the crisis is not accidental.

It is the expected outcome of:

  • rewarding quantity over quality

  • rewarding novelty over verification

  • rewarding significance over accuracy

  • punishing slow, careful work

The same selection logic also explains:

  • antibiotic resistance

  • overfishing collapse

  • immune evasion in viruses

  • the evolution of cheating in social species

Where selection pressures go, evolution follows.


8. The Paper’s Most Important Mathematical Lesson

The most important insight is simple yet devastating:

If you reward scientists for publishing a lot, you will get a lot of publications.
But you will not get a lot of truth.

Quality cannot win an evolutionary game where fitness depends on quantity.

Only by changing the payoff structure can the system evolve toward a healthier equilibrium.


Conclusion: Mathematics as a Warning Signal

Smaldino & McElreath’s model is a mathematical smoke alarm.
It quantifies what many intuitively sensed: science is evolving maladaptively.

The system will not self-correct.
Culture cannot fix what incentives break.
The only remedy is to rebuild the environment so that truth—not productivity—determines academic survival.

In the next posts, we’ll explore solutions—structural, cultural, and institutional—that could reverse these evolutionary trends.
