Editorial: Do You See What I See?: Quality, Reliability, and Reproducibility in Biomedical Research

W Lee Kraus

doi:10.1210/me.2014-1036

editorial

Molecular Endocrinology. 2014 Mar;28(3):277–280. doi: 10.1210/me.2014-1036

PMCID: PMC3938539 PMID: 24617660

Like many other scientists, I believe that science is one of the greatest of all human enterprises. The discipline satisfies our innate curiosity, benefits the world in real and tangible ways, and unites people from diverse backgrounds and cultures. The unique self-critical and self-correcting nature of science has allowed the synthesis of the most accurate view of the natural world known to humankind, a view that has been so integrated into our daily lives that we sometimes fail to recognize its provenance. As noted by the science philosopher Karl Popper, “science is one of the very few human activities – perhaps the only one – where errors are systematically criticized and fairly often, in time, corrected” (1). Therein lies its strength. Both Science (with a capital S), “a reliable body of knowledge about how the world works” (2), and science (with a lowercase s), the process of accumulating knowledge carried out by individual scientists (2), are built upon a foundation of trust and verification: trust among scientists, who rely on each other's data and conclusions, and trust between scientists and the public, which funds much of the science and relies on Science to improve the quality of our lives.

Recent reports in the popular press, however, have revealed an erosion of this trust. For example, results from a December 2013 Huffington Post/YouGov poll suggest that only about one-third of Americans have “a lot” of trust in the accuracy and reliability of information reported by scientists, while one-half trust the information only “a little” (3). Polls like this, combined with reports in the media about the lack of reproducibility of major findings in the biomedical sciences, as well as outright fraud, create a perception that Science is unreliable and the scientific enterprise is fundamentally flawed (4–7). Even if such perceptions are not grounded in fact or are based on a small number of sensationalized outlying anecdotes, they should be of concern to all who care about the scientific enterprise. In this month's editorial, I have chosen to address this issue because I think it is a critically important one for our times, especially as funding for research has decreased and competition among scientists has increased. Although my comments here are directed toward the readers of Molecular Endocrinology, this discussion holds lessons for all disciplines within the biomedical sciences and beyond.

This issue of “flawed science” was laid out in a leader and briefing in the October 19, 2013 issue of The Economist (8, 9). In these pieces, the authors (who are anonymous per the magazine's style) highlight some of the issues facing science today, including “shoddy experiments,” poor design and poor analysis, “exaggeration and cherry picking of results,” emphasis on “flashy” rather than sound or confirmatory results, failure to publish negative results, problems with peer review, and wasted resources (readers of this editorial are directed to the articles for the details of the arguments, which will not be reiterated here).

These articles highlight criticisms from scientists at biotechnology companies who report an alarmingly low rate of reproducibility of landmark discoveries in biomedicine (4, 5), as well as problems with data analysis and peer review, but they fail to provide and substantiate an accurate view of the scope of the problem and its overall impact on the scientific enterprise. In addition, the articles do not address why published studies are not reproducible. This is an instructive question to ask, rather than simply making a definitive statement that they are not. The experimental approaches and model systems used in modern biology are numerous, complex, sophisticated, and nuanced, and exact replication of an experiment may be difficult. More important considerations are whether results are generally meaningful and if differences in results can be informative. Finally, the reporting in the articles is so broad that the reader gets little sense of how the problem may vary by discipline, study design, or methodological approach. Are the problems equally pervasive in epidemiological, clinical, preclinical, and basic research? Across disciplines? I suspect not. Are there unique issues in social science vs natural science? I suspect so. These questions are not clearly addressed in the articles. The mostly anecdotal presentation in the articles is certainly sensational, but I am not prepared to trash the entire scientific enterprise based on the information presented, especially because we have no better current means of understanding the natural world. Of course, there is always room for improvement, and there are issues that need to be addressed, but we need a scientific analysis of these problems that can be used to inform us about their breadth and how best to address them.

The articles in The Economist do propose some common-sense solutions, many of which are already being implemented by scientists, research institutions, funding agencies, and journals. Clearly, with the advent of “big data” science (eg, genomics, proteomics, high-throughput imaging), all scientists should receive training in statistics, and all projects should involve individuals with the requisite knowledge for proper experimental design and data analysis. In addition, these data sets should be freely available to the research community, which can use the data to replicate the analyses and validate the results. Funding agencies such as the National Institutes of Health (NIH) have addressed both of these issues, requiring instruction in statistics for supported graduate and postgraduate training programs, as well as data sharing plans for supported research. Likewise, major journals, including those published by the Endocrine Society, have clarified and elevated their requirements for statistical analyses and data sharing in published papers. Also, many journals have placed greater emphasis on the reporting of experimental details and the presentation of supporting data through the use of online supplements, which can help to ensure the reproducibility and validation of published results. Finally, some journals, including eLife, EMBO Journal, and those in the Public Library of Science (PLoS) family, have made efforts to improve the transparency and quality of the peer review process by publishing reviewers', authors', and readers' comments online with the article to allow a continuing dialog after publication. Can more be done to improve peer review? I believe so. Journals should place a greater emphasis on the evaluation of experimental design, rather than novelty, and should consider hiring professional staff who can perform a preliminary review of data quality, statistical analyses, and the inclusion of experimental details prior to review.

Are these measures enough? Since the publication of the articles in The Economist, I have spoken about this issue with numerous colleagues, many of whom have proposed additional solutions to the problems of data quality, reliability, and reproducibility, some of which may require alterations to the culture of science. Here are some of their suggestions, which I think warrant serious consideration:

We should hold scientists to a high standard of significance, not just productivity (ie, quality over quantity). Although significance may be difficult to assess, we should try to develop effective metrics to do so. Evaluations of individuals based on the quality and impact of past publications, not just proposed plans, ensure that their research will be scrutinized again and again at many levels over time.
We should encourage the publication of high-quality negative or confirmatory results. According to The Economist, negative results account for just 10%–30% of published papers (depending on the discipline) and have been decreasing precipitously over the past 2 decades (8). This rate seems too low to effectively support robust self-criticism and self-correction. Discussions are beginning in various circles, including the NIH and the editorial group at Molecular Endocrinology, on how best to report confirmatory or negative results; these discussions should be continued and expanded. Perhaps there is a role for independent validation of results on a fee-for-service basis, as promoted by the Reproducibility Initiative, which is supported by Science Exchange, a research outsourcing marketplace, and PLoS One. Alternatively, Master's students could carry out such validation as part of their thesis research. This is an area in need of additional creative ideas.
We should encourage an esthetic of self-confirmation during the course of one's career. As a junior faculty member at Cornell University, I admired how my colleague John Lis has continuously expanded and refined his seminal studies on the functions and generality of promoter-proximally paused RNA polymerase II in gene regulation over 4 decades, generating well-tested, experimentally sound, and internally consistent observations that have withstood the test of time. Such self-confirmation and exploration of depth can be a gratifying way to conduct a life's work, placing greater emphasis on substantiating one's claims (and one's legacy) than jumping from one “hot” topic to the next.
We should reward scientists who “get it right.” The possibilities are numerous, from recognition and accolades to enhanced grant support and personal compensation, although the latter would require careful consideration and implementation. Related to some of the previous points, journals should consider publishing peer-reviewed letters of confirmation for important findings, using them as an opportunity to highlight the previous findings and commend the original authors for the veracity of their work.
We should encourage the acknowledgment and correction of errors. In this regard, journals should willingly publish research articles or letters to the editor that provide convincing arguments refuting results or conclusions published in the journal, opening up a dialog for the community of scientists working in the area. Honest errors of interpretation are inevitable and may even stimulate scientific discourse. Fraud is unacceptable and must be countered with proper training, good examples and mentorship, intense pressure, and substantial penalties.
We should place a much higher standard of proof on extraordinary claims. As noted in The Economist, “by and large, scientists want surprising results, and so they test hypotheses that are normally pretty unlikely, and often very unlikely” (9). If so, one should expect many hypotheses to fail. Carl Sagan, an astrophysicist and popular science writer, wrote that “the extraordinary should certainly be pursued. But extraordinary claims require extraordinary evidence” (10). I suspect that many of us who serve as manuscript reviewers have experienced, at one time or another, a sense that a journal editor is willing to ignore concerns about experimental design, data analysis, or the validity of a data set in order to promote an exciting new result. This subterfuge has no benefit to the author, the reviewer, or the field. Journals need clearer standards and should, as some do, publish commentaries accompanying groundbreaking articles that discuss alternate interpretations of the data and point out potential limitations in the analyses. We must also consider the impact of for-profit journals on the promulgation of extraordinary, but poorly supported, claims. Is this really the best way to publish the findings of our research?
We should make a clear distinction between studies that are designed to reveal new biological outcomes and those that are designed to elucidate mechanisms. Both types of research have their place. During review, many high-quality descriptive studies exploring new biological outcomes are subjected to a request for elucidation of mechanism. One unfortunate outcome is the addition, under pressure, of a final figure containing a poorly designed or hastily executed “mechanistic” experiment. My sense is that if we surveyed across the spectrum of studies in biomedical sciences, the greatest preponderance of errors would be with the mechanisms, not the biological outcomes. We need a place for the publication of well-executed and provocative descriptive or exploratory studies that might serve as a launching point for future investigations.
We should reconsider the importance of “clinical relevance.” There is a great tradition of discovery for discovery's sake in biology and beyond, which has had a tremendous positive impact on human health in ways that no one could have anticipated. But, as Marc Kirschner (11), a cell biologist at Harvard Medical School, pointed out in a recent editorial in Science, “in biomedical science, there is an increasing tendency to equate significance to any form of medical relevance.” Evaluation of studies based primarily on this type of perceived impact may drive the generation of irreproducible results. Kirschner (11) asserts that, “scientists must challenge the assumption that translation, rather than fundamental understanding, is the choke point of progress in the application of science to societal problems.” This sentiment was also articulated by Huda Zoghbi (12), a neurobiologist and human geneticist at Baylor College of Medicine, who wrote in another editorial in Science that “the best way to promote discovery is to invest in talented researchers driven by curiosity and passion, whether for disease-oriented questions or the more obscure mysteries of nature.” Selling the scientific enterprise to the public on the promise of immediate and specific benefits carries the risk of raising unrealistic expectations. We must place a greater emphasis on explaining to the public the long-term merits of basic science. In this regard, the National Academies developed an educational series from 1996 to 2003 titled “Beyond Discovery: The Path from Research to Human Benefit” that traces the basic science origins of technological and medical advances. Perhaps it is time for molecular endocrinologists, as well as other biomedical scientists, to assemble a similar set of educational materials highlighting real-world applications of their basic science discoveries.

In closing, I believe there is no better path to the truth than science, and I remain deeply optimistic about the future of the enterprise. We must laud the accomplishments of science but also acknowledge its shortcomings and limitations. In this regard, I am hopeful that dialogs like this editorial will be constructive. Although science can be a deeply personal experience, it is also a community endeavor, and there can be no doubt of its dependence and impact on the public sphere. As such, we should embrace scrutiny of all aspects of our work. We should also directly acknowledge problems, be willing to address them in transparent ways, and welcome constructive input from all those who wish to contribute. Isn't that what we do, after all?

W. Lee Kraus, Ph.D.
Editor, Molecular Endocrinology

Acknowledgements

I am grateful to the many colleagues who offered their suggestions and opinions, and provided critical feedback on this piece. Molecular Endocrinology welcomes comments from its readers on this editorial.

Research in the author's lab is funded by grants from the National Institutes of Health (NIH)/National Institute of Diabetes and Digestive and Kidney Diseases (NIDDK) and Cancer Prevention Research Institute of Texas (CPRIT).

Dr Kraus holds the Cecil H. and Ida Green Distinguished Chair in Reproductive Biology Sciences.

Disclosure Summary: The author has nothing to disclose.

References

1. Popper K. Conjectures and Refutations: The Growth of Scientific Knowledge. London: Routledge; 1963:216.
2. Alberts B. Written Testimony Before the Subcommittee on Research, U.S. House of Representatives Hearing on “Scientific Integrity and Transparency.” March 5, 2013.
3. Swanson E. Americans have little faith in scientists, science journalists: poll. HuffingtonPost. December 23, 2013. http://www.huffingtonpost.com/2013/12/21/faith-in-scientists_n_4481487.html
4. Begley CG, Ellis LM. Drug development: raise standards for preclinical cancer research. Nature. 2012;483:531–533.
5. Prinz F, Schlange T, Asadullah K. Believe it or not: how much can we rely on published data on potential drug targets? Nat Rev Drug Discov. 2011;10:712.
6. Singer N. Duke suspends researcher and halts cancer studies. New York Times. July 20, 2010. http://prescriptions.blogs.nytimes.com/2010/07/20/duke-suspends-researcher-halts-cancer-studies/?_php=true&_type=blogs&_r=0
7. Anonymous. Misconduct in science: an array of errors. The Economist. September 10, 2011.
8. Anonymous. How science goes wrong. The Economist. October 19, 2013:13.
9. Anonymous. Trouble at the lab. The Economist. October 19, 2013:26–30.
10. Sagan C. Broca's Brain: Reflections on the Romance of Science. New York, NY: Random House; 1979:62.
11. Kirschner M. A perverted view of “impact.” Science. 2013;340:1265.
12. Zoghbi HY. The basics of translation. Science. 2013;339:250.
