Earlier this year, the first set of studies from the Reproducibility Project: Cancer Biology was published. Out of the five studies, only two were independently replicated and two had inconclusive outcomes. In June of this year, two more studies were released with "important parts of the original papers" independently replicated. The current reproducibility rate therefore stands at around 60% (or 80% if the studies with inconclusive results are excluded) – a far cry from the anonymized 10–20% reproducibility rates previously reported by Amgen and Bayer.
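For concreteness, here is the arithmetic behind those rates, assuming the tallies described above (four of seven studies replicated, two inconclusive); the counts are as summarized in this post, not an official breakdown:

```python
# Tallies as described in this post (assumed, not an official breakdown).
replicated = 2 + 2    # two of the first five studies, plus the two June studies
inconclusive = 2      # first-batch studies with inconclusive outcomes
total = 5 + 2         # five initial studies plus two more in June

overall_rate = replicated / total                             # 4/7, "around 60%"
rate_excl_inconclusive = replicated / (total - inconclusive)  # 4/5 = 80%

print(f"Overall: {overall_rate:.0%}")                    # prints "Overall: 57%"
print(f"Excluding inconclusive: {rate_excl_inconclusive:.0%}")  # prints "Excluding inconclusive: 80%"
```

The 57% overall figure is what the post rounds to "around 60%".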
However, it is important to note that, unlike previous efforts, the current project defined reproducibility in broad terms that may have mirrored the definition used in psychology. Reproducibility was assessed based on whether or not the reported experimental findings supported the paper's overall conclusion.
This paper-centric definition of reproducibility is problematic because papers are complex entities that contain multiple independent experiments. Moreover, the number of experiments per paper is increasing. What if the IDH1 paper that was replicated as part of the reproducibility project had been split into two – an in vitro paper (replicated) and an in vivo paper (not replicated)? What would the reproducibility rate be in that situation?
This issue highlights the need for a clear distinction between papers and their empirical constituents (a topic I have written about before). Only when reproducibility is assessed at the level of individual experiments rather than whole papers will we be able to truly understand its extent.
The high variability in the reproducibility of specific experiments from the same papers – with animal experiments being more difficult to replicate than cell-based experiments – presents an opportunity to refine our definition of irreproducibility and identify new potential causes.
Most discussion of the causes of irreproducibility has focused on the sociological pathologies rampant within academia: increased competition, misaligned incentives, and poor training. There is no doubt that these factors either cause or exacerbate irreproducibility. However, the temptation to blame subjective bias in science may have obscured another, more objective cause: lack of robustness.
Let's assume that scientists are truly honest and report their findings with transparency. Let's also assume that all experimental observations reported in the 26 million papers already indexed in PubMed were repeated multiple times before publication. How could the irreproducibility crisis be explained in that scenario?
The only possible explanation would be that critical conditions necessary to replicate the original finding were neglected – either because the original experimenter did not include them in the methods section or, more likely, was oblivious to their existence.
If we view reproducibility as a measure of the robustness of the biological phenomenon being studied, we can discern a potential cause of irreproducibility that is independent of sociology. The reproducibility of a biological experiment may reflect both the number of conditions it requires and the completeness with which those conditions are reported. The most robust experiments are those with few, fully recognized conditions.
Reproducibility can therefore be seen as evidence that a phenomenon is robust enough to cut through molecular and cellular noise. If your experiment requires tight control to reduce interference from other systems, others may have trouble reproducing it.
This view, of course, does not dismiss the sociological aspects of the irreproducibility crisis. Scientists often publish incomplete experimental details out of sloppiness or ignorance. The competitive atmosphere has also drowned out skepticism: scientists routinely make over-the-top claims about how robust and important their experimental findings are, with little regard for objectivity.
Viewing reproducibility as a surrogate marker of the importance (or centrality, as it may be expressed in network science) of a biological phenomenon raises interesting opportunities. The ability to identify robust findings in the pre-clinical literature would enhance the ability of pharmaceutical companies to select and prioritize clinical development programs with a higher chance of success.
Exploring non-sociological causes of irreproducibility also opens the door to extracting higher-order scientific information from the published literature. Scientists could improve their understanding of the biological networks that make up our bodies and how their different components relate to each other.