Estimating the reproducibility of psychological science

Published on Jan 1, 2015 in Science (IF: 41.063)
DOI: 10.1126/science.aac4716
Alexander A. Aarts (Estimated H-index: 1)
Joanna E. Anderson (DRDC: Defence Research and Development Canada; Estimated H-index: 3)
+ 267 Authors
Kellylynn Zuni (Adams State University; Estimated H-index: 2)
Reproducibility is a defining feature of science, but the extent to which it characterizes current research is unknown. We conducted replications of 100 experimental and correlational studies published in three psychology journals using high-powered designs and original materials when available. Replication effects were half the magnitude of original effects, representing a substantial decline. Ninety-seven percent of original studies had statistically significant results. Thirty-six percent of replications had statistically significant results; 47% of original effect sizes were in the 95% confidence interval of the replication effect size; 39% of effects were subjectively rated to have replicated the original result; and if no bias in original results is assumed, combining original and replication results left 68% with statistically significant effects. Correlational tests suggest that replication success was better predicted by the strength of original evidence than by characteristics of the original and replication teams.
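One of the abstract's replication criteria — whether the original effect size falls inside the 95% confidence interval of the replication effect — can be sketched as follows. This is a minimal illustration with hypothetical numbers (the r values and sample size are invented, not taken from the paper), using the standard Fisher z-transform for a correlation CI:

```python
import math

def correlation_ci(r, n, z_crit=1.959963984540054):
    """Two-sided 95% confidence interval for a correlation r of sample size n,
    via the Fisher z-transform (z_crit is the 97.5% normal quantile)."""
    z = math.atanh(r)                # Fisher z of the replication effect
    se = 1.0 / math.sqrt(n - 3)      # standard error of z
    return math.tanh(z - z_crit * se), math.tanh(z + z_crit * se)

# Hypothetical numbers: original r = .40; replication r = .21 with n = 120.
lo, hi = correlation_ci(0.21, 120)
replicated = lo <= 0.40 <= hi        # the CI-coverage criterion from the abstract
```

Here the original effect lies above the replication interval, so by this criterion the hypothetical replication would not count as a success — the pattern the abstract reports for about half the studies.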
  • References (36)
  • Citations (2235)
#1 Brian A. Nosek (H-Index: 61)
#2 George Alter (H-Index: 20)
Last. Tal Yarkoni (H-Index: 35)
view all 39 authors...
443 Citations · Source
#1 Uri Simonsohn (UPenn: University of Pennsylvania; H-Index: 26)
This article introduces a new approach for evaluating replication results. It combines effect-size estimation with hypothesis testing, assessing the extent to which the replication results are consistent with an effect size big enough to have been detectable in the original study. The approach is demonstrated by examining replications of three well-known findings. Its benefits include the following: (a) differentiating “unsuccessful” replication attempts (i.e., studies yielding p > .05) that are...
220 Citations · Source
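Simonsohn's criterion turns on the smallest effect the original study was well positioned to detect. A rough sketch of that threshold under a normal approximation (the 33%-power level is the one the approach above uses; the sample size here is hypothetical):

```python
import math

Z_ALPHA = 1.959963984540054    # Phi^-1(0.975), two-sided alpha = .05
Z_33 = -0.4399131656732338     # Phi^-1(0.33); negative because 0.33 < 0.5

def d_33(n_per_group):
    """Effect size a two-group study of this size had 33% power to detect.
    Normal approximation: power = Phi(d/se - z_alpha), solved for power = .33."""
    se = math.sqrt(2.0 / n_per_group)    # SE of a standardized mean difference
    return (Z_ALPHA + Z_33) * se

# Hypothetical original study with 30 participants per group:
threshold = d_33(30)   # a replication CI entirely below this value suggests
                       # the original study could not have detected the effect
```

The design choice here is pragmatic: an exact computation would use the noncentral t distribution, but the normal approximation keeps the sketch dependency-free and is close for moderate samples.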
#1 Timothy M. Errington (Center for Open Science; H-Index: 8)
#2 Elizabeth Iorns (H-Index: 15)
Last. Brian A. Nosek (UVA: University of Virginia; H-Index: 61)
view all 6 authors...
It is widely believed that research that builds upon previously published findings has reproduced the original work. However, it is rare for researchers to perform or publish direct replications of existing results. The Reproducibility Project: Cancer Biology is an open investigation of reproducibility in preclinical cancer biology research. We have identified 50 high impact cancer biology articles published in the period 2010-2012, and plan to replicate a subset of experimental results from eac...
133 Citations · Source
#1 John P. A. Ioannidis (H-Index: 151)
#2 Marcus R. Munafò (UoB: University of Bristol; H-Index: 67)
Last. Sean P. David (Stanford University; H-Index: 31)
view all 5 authors...
Recent systematic reviews and empirical evaluations of the cognitive sciences literature suggest that publication and other reporting biases are prevalent across diverse domains of cognitive science. In this review, we summarize the various forms of publication and reporting biases and other questionable research practices, and overview the available methods for probing into their existence. We discuss the available empirical evidence for the presence of such biases across the neuroimaging, anim...
194 Citations · Source
#1 Richard A. Klein (UF: University of Florida; H-Index: 7)
#2 Kate A. Ratliff (UF: University of Florida; H-Index: 13)
Last. Brian A. Nosek (UVA: University of Virginia; H-Index: 61)
view all 51 authors...
337 Citations · Source
#1 Sanford L. Braver (UCR: University of California, Riverside; H-Index: 4)
#2 Felix Thoemmes (Cornell University; H-Index: 18)
Last. Robert Rosenthal (UCR: University of California, Riverside; H-Index: 90)
view all 3 authors...
The current crisis in scientific psychology about whether our findings are irreproducible was presaged years ago by Tversky and Kahneman (1971), who noted that even sophisticated researchers believe in the fallacious Law of Small Numbers—erroneous intuitions about how imprecisely sample data reflect population phenomena. Combined with the low power of most current work, this often leads to the use of misleading criteria about whether an effect has replicated. Rosenthal (1990) suggested more appr...
138 Citations · Source
#1 Brian A. Nosek (UVA: University of Virginia; H-Index: 61)
#2 Daniel Lakens (TU/e: Eindhoven University of Technology; H-Index: 26)
Ignoring replications and negative results is bad for science. This special issue presents a novel publishing format – Registered Reports – as a partial solution. Peer review occurs prior to data collection, design and analysis plans are preregistered, and results are reported regardless of outcome. Fourteen Registered Reports of replications of important published results in social psychology are reported with strong confirmatory tests. Further, the articles demonstrate open science practices s...
181 Citations · Source
#1 Geoff Cumming (La Trobe University; H-Index: 30)
We need to make substantial changes to how we conduct research. First, in response to heightened concern that our published research literature is incomplete and untrustworthy, we need new requirements to ensure research integrity. These include prespecification of studies whenever possible, avoidance of selection and other inappropriate data- analytic practices, complete reporting, and encouragement of replication. Second, in response to renewed recognition of the severe flaws of null-hypothesi...
1,186 Citations · Source
#1 Daniel Lakens (TU/e: Eindhoven University of Technology; H-Index: 26)
Effect sizes are the most important outcome of empirical studies. Most articles on effect sizes highlight their importance to communicate the practical significance of results. For scientists themselves, effect sizes are most useful because they facilitate cumulative science. Effect sizes can be used to determine the sample size for follow-up studies, or examining effects across studies. This article aims to provide a practical primer on how to calculate and report effect sizes for t-tests and A...
1,577 Citations · Source
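The abstract's point that effect sizes feed directly into sample-size planning can be illustrated with two textbook formulas. This is a sketch with hypothetical numbers; the planning formula is the usual normal approximation, not anything specific to the article:

```python
import math

def cohens_d_from_t(t, n1, n2):
    """Cohen's d for an independent-samples t-test: d = t * sqrt(1/n1 + 1/n2)."""
    return t * math.sqrt(1.0 / n1 + 1.0 / n2)

def n_per_group(d, z_alpha=1.959963984540054, z_beta=0.8416212335729143):
    """Approximate n per group for alpha = .05 (two-sided) and 80% power:
    n = 2 * ((z_alpha + z_beta) / d) ** 2 (normal approximation)."""
    return math.ceil(2 * ((z_alpha + z_beta) / d) ** 2)

d = cohens_d_from_t(2.5, 30, 30)   # hypothetical t = 2.5 with 30 per group
n = n_per_group(0.5)               # planning a follow-up for a medium effect
```

This is the cumulative-science loop the abstract describes: the reported effect size of one study becomes the planning input for the next.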
#1 Katherine S. Button (UoB: University of Bristol; H-Index: 13)
#2 John P. A. Ioannidis (Stanford University; H-Index: 151)
Last. Marcus R. Munafò (UoB: University of Bristol; H-Index: 67)
view all 7 authors...
A study with low statistical power has a reduced chance of detecting a true effect, but it is less well appreciated that low power also reduces the likelihood that a statistically significant result reflects a true effect. Here, we show that the average statistical power of studies in the neurosciences is very low. The consequences of this include overestimates of effect size and low reproducibility of results. There are also ethical dimensions to this problem, as unreliable research is ineffici...
2,564 Citations · Source
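The abstract's less-appreciated claim — that low power also erodes the credibility of the significant results that do appear — follows from the positive predictive value of a finding. A minimal sketch (the pre-study odds value is an illustrative assumption, not a figure from the paper):

```python
def ppv(power, alpha=0.05, prior_odds=0.25):
    """P(effect is real | significant result), as in the PPV framework:
    PPV = power * R / (power * R + alpha), with R the pre-study odds
    that a tested effect is real."""
    return power * prior_odds / (power * prior_odds + alpha)

well_powered = ppv(0.80)   # at 80% power, 80% of significant results are true
low_powered  = ppv(0.20)   # at 20% power, a significant result is a coin flip
```

With the same pre-study odds, dropping power from .80 to .20 cuts the PPV from .80 to .50 — the mechanism behind the inflated effect estimates and poor reproducibility the abstract reports.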
Cited By 2235
#1 Audrey Bürki (University of Potsdam; H-Index: 10)
#2 Shereen Elbuy (University of Potsdam; H-Index: 1)
Last. Shravan Vasishth (University of Potsdam; H-Index: 23)
view all 4 authors...
Abstract When participants in an experiment have to name pictures while ignoring distractor words superimposed on the picture or presented auditorily (i.e., picture-word interference paradigm), they take more time when the word to be named (or target) and distractor words are from the same semantic category (e.g., cat-dog). This experimental effect is known as the semantic interference effect, and is probably one of the most studied in the language production literature. The functional origin of...
1 Citation · Source
#1 Simon B. Goldberg (UW: University of Wisconsin-Madison; H-Index: 17)
#2 Raymond P. Tucker (LSU: Louisiana State University; H-Index: 12)
Abstract Objectives: A recent meta-analysis reported that mindfulness-based interventions (MBIs) outperform specific active control conditions but not evidence-based treatments (EBTs) across various...
#1 Olga Boukrina (RU: Rutgers University; H-Index: 6)
#2 N. Erkut Kucukboyaci (RU: Rutgers University)
Last. Ekaterina Dobryakova (RU: Rutgers University; H-Index: 7)
view all 3 authors...
Abstract With the current emphasis on power and reproducibility, pressures are rising to increase sample sizes in psychology and neuroscience studies in order to reflect more accurate effect estimation and generalizable results. The conventional way of increasing power by enrolling more participants is less feasible in some fields of research. In particular, rehabilitation research faces considerable challenges in achieving this goal. We describe the specific challenges to increasing power by re...
Abstract A plethora of dissolution tests exists for oral dosage forms, with variations in selection of the dissolution medium, the hydrodynamics and the dissolution equipment. This work aimed at determining the influence of media composition, the type of dissolution test and the method for entering the data into a PBPK model on the ability to simulate the in vivo plasma profile of an immediate release formulation. Using two rDCS IIa substances, glibenclamide and dipyridamole, housed in immediate...
#1 Ye Sun (UofU: University of Utah; H-Index: 1)
#2 Zhongdang Pan (UW: University of Wisconsin-Madison)
#1 Oscar Lecuona (UAM: Autonomous University of Madrid)
#2 Eduardo Garcia-Garzon (UAM: Autonomous University of Madrid; H-Index: 2)
Last. Raquel Rodríguez-Carvajal (UAM: Autonomous University of Madrid; H-Index: 13)
view all 4 authors...
The Five Facets Mindfulness Questionnaire (FFMQ) is a popular tool in mindfulness research. However, its psychometric qualities and its replicability have caused controversy. This study carried out...
#1 James D. Johnson (USP: University of the South Pacific; H-Index: 3)
#2 Len Lecci (UNCW: University of North Carolina at Wilmington; H-Index: 11)
Last. John F. Dovidio (Yale University; H-Index: 91)
view all 3 authors...
Despite the public outrage in response to police violence against unarmed Black men, work on the psychological dynamics of reactions to these incidents is relatively rare. The present research exam...
1 Citation · Source
#1 Karen E. Adolph (NYU: New York University; H-Index: 37)
#1 Samuel V. Bruton (USM: University of Southern Mississippi; H-Index: 3)
#2 Mitch Brown (Fairleigh Dickinson University; H-Index: 1)
Last. Donald F. Sacco (USM: University of Southern Mississippi; H-Index: 17)
view all 3 authors...
Over the past couple of decades, the apparent widespread occurrence of Questionable Research Practices (QRPs) in scientific research has been widely discussed in the research ethics literature as a...
1 Citation · Source
#2 Christina Bergmann (MPG: Max Planck Society; H-Index: 6)
Last. Sho Tsuji (H-Index: 7)
view all 3 authors...