Among all the lessons in The Natural Selection of Bad Science, one strikes particularly hard: even strong replication efforts cannot, by themselves, reverse the evolutionary decline in scientific quality.
This is deeply unintuitive. Most scientists agree that the “replication movement” — from the Open Science Framework to large-scale reproducibility projects — is one of the most important reforms of the modern academic era. And yet, according to the model by Smaldino & McElreath, replication, even when well-funded and rigorous, cannot fix the fundamental evolutionary pressures that select for low-effort research.
This article explains why replication fails as a corrective mechanism, what the math shows, and how real-world scientific history aligns with the model’s predictions. We will also examine several case studies — from priming research to fMRI social neuroscience to cancer biomarker studies — that illustrate the “replication trap” in action.
1. Why People Think Replication Should Work
Replication seems like the perfect immune system for science.
If a result is false, just replicate it.
If it doesn’t repeat, discard it.
Simple.
But this reasoning assumes:
- Replications are common.
- Failed replications lead to consequences.
- Low-quality labs suffer reputational damage.
- The system rewards trustworthy results.
Unfortunately, none of these assumptions are true.
Replication is not the default. Science has no built-in self-correcting machinery.
It has the potential to self-correct, but only under the right pressures — and those pressures are currently too weak.
Smaldino & McElreath quantify this problem and show that:
Replication has only a tiny effect on the evolutionary trajectory of scientific methods unless it is extremely punitive to low-effort labs.
Which it rarely is.
2. The Model’s Logic: Replicators Cannot Compete with Producers
In the model:
- Some labs specialize in production: quick studies, low effort, high false-positive rate.
- Other labs specialize in replication: they repeat studies to verify their truthfulness.
What happens when we simulate a population containing both?
Result: low-effort producers publish more papers and outcompete replicators.
Replicators:
- publish less frequently
- spend more time on confirmations
- cannot generate flashy findings
- rarely receive top-tier grants
- don’t produce sensational media-worthy results
Meanwhile, low-effort producers:
- publish frequently and visibly
- get grants
- train more students
- create more academic successors
- dominate institutional resources
If fitness = publication output, then:
Producers reproduce faster than replicators, and the replicator strategy is gradually driven out of the population.
This is analogous to how parasitic strategies in nature can overwhelm cooperative ones.
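The dynamic above can be sketched as a toy simulation. This is an illustration, not the published model: all rates below are assumed numbers, and the only mechanism is that labs reproduce in proportion to their publication output.

```python
import random

# Toy selection model (illustrative; rates are assumed, not from the paper):
# each generation, new labs copy a parent chosen in proportion to output.
PRODUCER_RATE = 5.0    # assumed: quick, low-effort papers per cycle
REPLICATOR_RATE = 1.0  # assumed: careful replications per cycle

def evolve(pop, generations=20, seed=0):
    """Resample the population with fitness = publication rate."""
    rng = random.Random(seed)
    rates = {"producer": PRODUCER_RATE, "replicator": REPLICATOR_RATE}
    for _ in range(generations):
        pop = rng.choices(pop, weights=[rates[lab] for lab in pop], k=len(pop))
    return pop

final = evolve(["producer"] * 50 + ["replicator"] * 50)
print("producer share:", final.count("producer") / len(final))
```

With any persistent fitness gap, producers fixate within a handful of generations; the exact numbers do not matter, only the direction of selection.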
3. Replication Has Almost No Punitive Power in the Real World
The model assumes that failed replications might harm a lab.
But in practice:
- Replication failures are rarely published in high-impact journals.
- Original authors face little consequence.
- Failed replication papers get fewer citations.
- Journals prefer novel claims over verification.
- Null results are undervalued.
- Universities don’t reward replication studies at promotion time.
Even when replication failures happen, authors often:
- invoke “hidden moderators”
- claim the field has moved on
- suggest conceptual misinterpretation
- publicly dispute the findings
Replication often becomes a public debate, not a correction.
The producer has already extracted career value from their original flashy result.
A failed replication five years later affects nothing.
Thus:
Replication does not reduce the reproductive fitness of low-effort labs.
So low-effort labs continue to grow.
4. The Mathematical Trap: Replication Pressure Is Too Slow
Another key point in the paper is about the time lag:
Low-effort labs can outrun replication
Because:
- Replications take more time than flashy original studies.
- Producers generate multiple new papers in the time it takes for one failed replication to emerge.
- Low-quality labs can pivot quickly to new topics.
- Replicators remain tied to verifying old problems.
This resembles Red Queen dynamics:
The replicators must run as fast as they can just to stay in place,
while low-effort labs sprint ahead unhindered.
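The lag can be made concrete with back-of-envelope arithmetic. All the numbers below are assumed for illustration; the point is only that whenever production outpaces verification, the backlog of unchecked claims grows without bound.

```python
# Back-of-envelope Red Queen arithmetic (all numbers assumed):
# producers mint new claims faster than replicators can check old ones,
# so the queue of unverified claims grows every single year.
CLAIMS_PER_YEAR = 6        # assumed output of one low-effort lab
REPLICATIONS_PER_YEAR = 1  # assumed verifications completed per year

backlog = 0
for year in range(10):
    backlog += CLAIMS_PER_YEAR - REPLICATIONS_PER_YEAR
print("unchecked claims after 10 years:", backlog)
```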
5. Real-World Case Study #1: Social Priming
Few fields provide a better illustration of this dynamic.
Early 2000s psychology was full of:
- very small sample sizes
- flexible analysis pipelines
- researcher degrees of freedom
- surprising “cute” findings
Classic examples:
- priming people with words related to old age makes them walk slower
- holding a warm cup makes you judge people as kinder
- thinking about money makes you less social
These studies were published because:
- they were novel
- statistically significant (p < 0.05)
- quick to run
- highly publishable in top journals
Replication attempts began years later.
By then:
- Many of the original authors had built entire careers
- The most famous results appeared in textbooks
- High-impact journals resisted null replications
- Tenure committees didn’t care about replication failures
Even after the field-wide replication crisis, many original researchers insisted the failures were due to:
- cultural differences
- subtle context shifts
- experimenter effects
- conceptual misunderstanding
This is exactly what Smaldino & McElreath’s model predicts:
the producers had already won the evolutionary race.
The immune system activated too late.
6. Case Study #2: fMRI Social Neuroscience and “Dead Salmon” Problems
In 2009, Bennett et al. famously showed that an fMRI analysis pipeline detected “brain activity” in a dead salmon.
The lesson: without rigorous correction for multiple comparisons, false positives run rampant.
Did this humiliation lead to the downfall of low-effort fMRI studies?
Not really.
- Labs kept publishing underpowered fMRI studies.
- Multiverse analysis showed high false discovery rates.
- The average fMRI sample size remained too small for years.
- Replication attempts were rare and underfunded.
Why?
Because flashy fMRI studies:
- made headlines
- generated TED Talks
- attracted major grants
- produced visually compelling brain images
Replicators — who were slower and less flashy — were selected against.
7. Case Study #3: Cancer Biomarker Research
In 2012, Begley and Ellis reported in Nature that only 6 of 53 landmark preclinical cancer studies (about 11%) could be reproduced.
And yet:
- The field did not collapse.
- Labs continued producing low-quality biomarker studies.
- Replication studies were not rewarded.
- Novel positive results dominated publication incentives.
Companies and journals prefer exciting claims:
“New blood biomarker predicts cancer risk!”
—even if statistically flawed.
This creates the exact ecological environment where low-effort labs thrive.
8. Replication is Not Evolutionary Pressure — It is Ecological Feedback
A key conceptual error many scientists make is assuming replication will automatically shape behavior.
But in evolutionary terms:
- Replication is post-hoc ecological feedback.
- Evolutionary selection is determined by reproductive success.
If failed replication does NOT affect a lab’s reproduction (its ability to secure students, grants, jobs, tenure), then:
Replication has no power as a selective force.
For replication to matter evolutionarily, two things must happen:
(1) Failed replication must be strongly punished
– loss of grants
– loss of prestige
– loss of student recruitment
– slowing of lab growth
(2) Successful replication must be rewarded
– career advancement
– grant funding
– hiring and promotion credit
– institutional prestige
But the current system does none of this.
Thus, as the paper says:
“Replication alone will have little effect unless it affects the differential reproduction of labs.”
In plainer terms:
Scientists must lose by producing bad science, not merely be embarrassed by it.
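The condition in this section can be written down directly. In the sketch below, all parameters are assumed for illustration: a producer's expected fitness is its raw output minus the expected career cost of failures that get caught, and selection flips only when both the replication rate and the penalty are large.

```python
# Sketch of the selection condition (all parameters assumed):
# replication acts as a selective force only when the expected penalty
# for an exposed failure outweighs the producer's publication advantage.
PRODUCER_OUTPUT = 5.0    # assumed papers per cycle
REPLICATOR_OUTPUT = 1.0  # assumed papers per cycle
P_FALSE = 0.8            # assumed false-positive rate of producer claims

def producer_fitness(p_checked, penalty):
    """Raw output minus expected cost of failures that get caught."""
    return PRODUCER_OUTPUT - p_checked * P_FALSE * PRODUCER_OUTPUT * penalty

for p_checked in (0.02, 0.10, 0.50):   # fraction of claims ever replicated
    for penalty in (0.5, 2.0, 5.0):    # fitness cost per exposed failure
        winner = ("producer"
                  if producer_fitness(p_checked, penalty) > REPLICATOR_OUTPUT
                  else "replicator")
        print(f"checked={p_checked:.0%} penalty={penalty}: {winner} wins")
```

At anything like today's rates (a few percent of claims ever replicated, negligible penalty), the producer wins every comparison; both levers have to move together before the inequality flips.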
9. Why Journals Defeat Replication
Even if replicators do their job perfectly, journals undermine their effect.
Replications are not glamorous
Science incentives promote “impact,” not verification.
Replication studies:
- have lower citation potential
- rarely produce new mechanisms or theories
- do not attract media coverage
- are harder to publish in top journals
Editors prefer:
- breakthroughs
- paradigm shifts
- counterintuitive findings
- novel experimental paradigms
This creates an asymmetry:
False positives have many outlets.
False negatives have few.
And asymmetry drives evolution.
10. Replication in Other Fields: A Historical View
The replication trap is not new.
It’s just more visible now.
A few examples:
Classical Anthropology
Margaret Mead’s controversial findings on Samoan adolescent sexuality were criticized by later ethnographers — but the replication attempts did not erase Mead’s influence.
Economics
- Reinhart & Rogoff’s paper on national debt thresholds was debunked by replication.
- Yet the original paper shaped global austerity policy for years.
Replication came too late.
Nutritional Epidemiology
Contradictory diet studies appear weekly.
Nobody replicates them because:
- replication is expensive
- null findings are unpublishable
- dietary questionnaires are unreliable
- flashy claims drive media coverage
The field evolves based on visibility, not reliability.
11. The Deeper Evolutionary Lesson
Replication is vital for truth — but weak for evolution.
Evolution does not reward truth-seeking.
It rewards success.
If the system rewards:
- speed
- quantity
- novelty
- media visibility
…then evolution will select for labs that maximize those traits.
Replication cannot stop this any more than the occasional predator stops a rapidly multiplying prey species — unless predation is intense and targeted.
This is the “evolutionary trap” of scientific incentives.
12. Can Replication Ever Work as a Corrective Force?
Yes — but only under certain extreme conditions:
1. Replications must be common.
(e.g., 10–20% of published studies should be replications)
2. Failed replications must have major career consequences.
(denial of grants, loss of institutional credibility)
3. Replicators must receive strong institutional and financial rewards.
4. Journals must give equal prestige to replications and novel findings.
5. Funding agencies must incentivize adversarial replication.
6. Pre-registration and transparency must be standard.
These policies would change the evolutionary calculus.
Labs that produce unreliable work would:
- lose funding
- lose recruits
- decline in prestige
- shrink
- eventually disappear
Labs that produce reliable work would:
- survive
- reproduce
- shape the next generation
Only then does replication become an evolutionary force.
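A hypothetical sketch (all parameters assumed, not taken from the paper) shows the flipped regime: in a toy selection model where half of all claims are checked and each exposed failure carries a stiff fitness cost, it is the careful strategy that fixates.

```python
import random

# Toy model of the punitive-replication regime (all parameters assumed):
# fitness = output minus a stiff cost for each failure that gets caught.
RATES = {"producer": 5.0, "replicator": 1.0}        # papers per cycle
FALSE_RATE = {"producer": 0.8, "replicator": 0.05}  # false-positive rates
P_CHECKED, PENALTY = 0.5, 3.0                       # assumed policy levers

def fitness(lab):
    expected_caught = P_CHECKED * FALSE_RATE[lab] * RATES[lab]
    return max(RATES[lab] - PENALTY * expected_caught, 0.01)

rng = random.Random(1)
pop = ["producer"] * 50 + ["replicator"] * 50
for _ in range(30):
    pop = rng.choices(pop, weights=[fitness(lab) for lab in pop], k=len(pop))
print("replicator share:", pop.count("replicator") / len(pop))
```

The same reproduction-proportional-to-fitness mechanism that let producers take over now drives them out, because the penalty term erases their raw publication advantage.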
13. Conclusion: Replication is Necessary — but Not Sufficient
Replication is essential for a healthy scientific ecosystem.
But it is not enough.
The model shows — and history affirms — that:
- Replicators cannot win an evolutionary race against low-effort producers.
- Replication pressure is slow, weak, and rarely punitive.
- The incentive structure protects flashy producers.
- Failed replications seldom harm careers.
- Replications themselves are under-incentivized.
The core insight:
Replication cleans up messes, but does not prevent them.
Only incentive reform can prevent their creation.
In the next post, we will explore how scientific fields have historically collapsed under their own incentive structures — and what they teach us about the future.