
Is psychology’s replication crisis behind us?
The Hindu
New research shows psychology has improved standards post-replication crisis, with fewer fragile p-values and rising statistical power.
As if political broadsides weren’t enough to undermine public confidence in science, a deep-seated problem surfaced from within science itself in the 2010s: the replication crisis. Researchers began to realise that many published papers, especially in psychology and medicine, contained results that couldn’t be replicated. This surfeit of bad science also undermined other work that had been built on the faulty results.
But according to a new paper published in Advances in Methods and Practices in Psychological Science, psychology at least may have learnt its lesson. Its author, Duke University postdoc Paul Bogdan, parsed 2.4 lakh (240,000) papers published between 2004 and 2024 to check whether the field had become more robust since the crisis unfolded. Bogdan focused on fragile p-values: statistical results that only barely clear the usual 0.05 cut-off for significance, falling between 0.01 and 0.05. The larger the share of such values among a field’s significant results, the shakier the evidence.
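To make the measure concrete, here is a minimal sketch of how a share of fragile p-values could be computed. It is not Bogdan’s code: the function name `fragile_share`, the choice to count the share among significant results only, the treatment of the interval boundaries, and the sample values are all assumptions made for illustration.

```python
def fragile_share(p_values, alpha=0.05, fragile_floor=0.01):
    """Share of significant p-values (p < alpha) that are 'fragile',
    i.e. that fall in the range fragile_floor <= p < alpha.

    Whether the boundaries are inclusive is an assumption here.
    """
    significant = [p for p in p_values if p < alpha]
    if not significant:
        return 0.0
    fragile = [p for p in significant if p >= fragile_floor]
    return len(fragile) / len(significant)


# Made-up p-values for illustration only; not data from the paper.
example_p_values = [0.001, 0.004, 0.02, 0.03, 0.049, 0.20, 0.008, 0.0005]
print(f"Fragile share: {fragile_share(example_p_values):.0%}")
```

In this toy list, seven results are significant and three of them sit between 0.01 and 0.05, so a little under half would count as fragile.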
According to Bogdan’s analysis, the share of fragile significant results had dropped from 32% at the start of the crisis to 26%. He also found that the downward slide appeared in every major sub-discipline, suggesting a broad cultural shift toward sturdier work.
Sample size was a key driver. The median sample size climbed rapidly from 2015 while reported effect sizes inched downward. This was likely because small studies that reach significance tend to overestimate effects, whereas bigger ones give truer but smaller estimates. Together, these trends pointed to rising statistical power across the literature.
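The link between sample size, inflated effects, and power can be illustrated with a small simulation. This is a sketch under stated assumptions, not the paper’s analysis: it assumes a two-group study with a true standardised effect of 0.3, a known standard deviation of 1, and a simple z-test at the 0.05 level; the function name `simulate_studies` and all parameter values are hypothetical.

```python
import math
import random


def simulate_studies(n_per_group, true_effect=0.3, n_studies=5000, seed=0):
    """Simulate many two-group studies and keep only the significant ones.

    Assumes unit-variance data, so the observed difference in group means
    is drawn directly from Normal(true_effect, sqrt(2 / n_per_group)).
    Returns (share of studies that are significant, mean observed effect
    among the significant studies).
    """
    rng = random.Random(seed)
    standard_error = math.sqrt(2.0 / n_per_group)
    significant_effects = []
    for _ in range(n_studies):
        observed = rng.gauss(true_effect, standard_error)
        # Two-sided z-test at the 0.05 level (critical value 1.96).
        if abs(observed / standard_error) > 1.96:
            significant_effects.append(observed)
    power = len(significant_effects) / n_studies
    if significant_effects:
        mean_significant = sum(significant_effects) / len(significant_effects)
    else:
        mean_significant = float("nan")
    return power, mean_significant


for n in (20, 200):
    power, mean_effect = simulate_studies(n)
    print(f"n per group = {n}: power ~ {power:.2f}, "
          f"mean significant effect ~ {mean_effect:.2f} (true effect 0.30)")
```

With 20 participants per group, a study only reaches significance when its observed effect happens to be about twice the true one, so the significant results overstate the effect and most studies miss it altogether. With 200 per group, most studies detect the effect and the significant estimates sit close to the true value.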
Journals with higher impact scores and papers with more citations also tended to feature fewer fragile p-values, reversing a pre-crisis pattern in which splashy outlets often published weaker but more sensational findings.
Bogdan also found one curiosity: scientists at top-ranked universities still published slightly shakier numbers. He used text-mining to explain the mismatch. Words tied to biology-heavy, clinically demanding studies were associated both with fragile results and with high-ranking institutions. The likely explanation is that such projects are expensive, labour-intensive, and often ethically constrained, making large samples difficult to gather.
In sum, psychology appears to have tightened its standards even as some better-funded corners of the field remain under-powered because they’re tackling tough questions.
