Robust evidence is critical for informed decision-making, from clinical practice to public health policy, yet concerns have been raised that a large proportion of published science is false (Ioannidis 2005). A key test of the accuracy of studies is whether they can be replicated by independent groups. Large-scale replication studies in psychology and economics have found that replications often provide weaker evidence than the original studies or fail to reproduce the original findings and conclusions at all (Open Science Collaboration 2015; Camerer et al. 2016). Even performing replications can be challenging, since many papers do not report methods in full and authors often do not provide additional information such as data, samples, or code when requested. For example, the Reproducibility Project: Cancer Biology set out with the ambitious aim of replicating 193 experiments in preclinical cancer biology from 53 high-impact papers but eventually had to scale this back to 50 experiments from 23 papers because of barriers to obtaining the key information required for replication (Errington, Denis, et al. 2021). Notably, only 4 of the 193 experiments had publicly shared the data essential to calculate effect sizes and conduct power analyses. Using 5 different methods for assessing replicability and counting replication by 3 of the 5 methods as a success, the authors successfully replicated just 46% of the original experiments (Errington, Mathur, et al. 2021). Recently, Nature has taken a different approach by paying scientists to hunt for errors in data within published papers, with the approval of the authors (Elson 2024). This project is at an early stage, and it remains to be seen what such close scrutiny will reveal beyond what unpaid peer reviewers can achieve during the standard journal evaluation process. As we have previously noted, traditional peer review is increasingly challenging in the era of big data and complex analytics (Schwendicke et al. 2022), so alternative models to enhance the accuracy of science are to be welcomed. However, any approach to assess and improve reproducibility depends on the accuracy and transparency of reporting in the original study.
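To illustrate why shared effect sizes matter for replication, the following minimal sketch shows how a replication team might estimate the sample size needed for adequate statistical power once an original effect size is available. The effect size, significance level, and target power below are illustrative assumptions, not values from the cited studies.

```python
# Hypothetical power analysis for planning a replication (illustrative values only)
from statsmodels.stats.power import TTestIndPower

effect_size = 0.5  # assumed standardized effect (Cohen's d) reported by an original study
analysis = TTestIndPower()

# Sample size per group needed to detect the assumed effect with 80% power
# at a two-sided alpha of 0.05
n_per_group = analysis.solve_power(effect_size=effect_size, alpha=0.05,
                                    power=0.80, alternative="two-sided")
print(f"Estimated sample size per group: {n_per_group:.1f}")
```

Without access to the original data or a reported effect size, even this basic planning step cannot be carried out, which is one of the barriers the Reproducibility Project encountered.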
Inadequate description of methods, selective reporting of results, and underreporting of study outcomes are part of the problem that has been described as research “waste,” whereby extensive resources are spent on studies that are not designed or reported well enough to stand a chance of providing important and meaningful information (Chalmers and Glasziou 2009). A recent scoping review identified 71 articles addressing topics related to the design, conduct, and analysis of studies in the field of dentistry (Pandis et al. 2021). Deficiencies were identified across each of these areas in the dental research literature. Nevertheless, there was evidence that improvements have been made over time, presumably due to greater awareness of the problems and more rigorous attention to study design and reporting. One major step toward improving the quality of scientific evidence has been the introduction of reporting guidelines and checklists that are now required by many journals upon submission of a manuscript. Guidelines such as the Consolidated Standards of Reporting Trials (CONSORT) (Schulz et al. 2010) and Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) (von Elm et al. 2007) provide an essential framework for reporting that can also be useful at the design stage of a study.
In this issue of the Journal of Dental Research, and co-published in 5 other journals in the field, Best et al. (2024) present the “OHStat Guidelines,” a new checklist specifically for oral health studies. This builds upon existing guidelines and checklists and, importantly, draws attention to common issues in oral health research, including split-mouth studies, clustering of measurements from multiple teeth or sites in the oral cavity, and the unique characteristics of dental development. Following recommendations of the American Statistical Association (Wasserstein and Lazar 2016), the OHStat Guidelines call for reporting exact calculated P values rather than simply stating whether the value does or does not meet an arbitrary threshold of P < 0.05. Importantly, for many analyses, estimates and confidence intervals are preferred over P values in any case (Gardner and Altman 1986). The guidelines also recommend multivariable and multivariate statistical methods, where feasible and useful, to complement pairwise testing. Although the OHStat Guidelines integrate elements from several existing guidelines to provide a detailed checklist that will be appropriate for many studies in oral health, they do not cover all study designs, such as systematic reviews, meta-analyses, and comparative effectiveness studies. These, and many more guidelines, are provided and updated by the Enhancing the QUAlity and Transparency Of health Research (EQUATOR) Network (UK EQUATOR Centre 2024). The OHStat Guidelines also do not cover experimental laboratory research, in which sample sizes are often small and do not always support a statistical approach to hypothesis testing (Vaux et al. 2012). Other checklists are available for this purpose, such as the RoBDEMAT risk of bias checklist for preclinical studies in dental materials research (Delgado et al. 2022). Nevertheless, many aspects of the OHStat Guidelines will be relevant to all studies in oral health.
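As a hypothetical illustration of these recommendations, the sketch below fits a linear mixed-effects model to simulated tooth-level data, accounting for clustering of multiple teeth within each patient, and reports the effect estimate with its 95% confidence interval and exact P value rather than a threshold statement. The outcome, variable names, and simulated values are assumptions for illustration and are not taken from Best et al. (2024).

```python
# Illustrative analysis of clustered oral health data (simulated, not real data)
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n_patients, teeth_per_patient = 40, 6

# Simulate tooth-level probing depths: patient-level random effect plus a treatment effect
patients = np.repeat(np.arange(n_patients), teeth_per_patient)
group = patients % 2                                   # 0 = control, 1 = treatment
patient_effect = rng.normal(0, 0.5, n_patients)[patients]
depth = 4.0 - 0.4 * group + patient_effect + rng.normal(0, 0.6, len(patients))

df = pd.DataFrame({"depth": depth, "group": group, "patient": patients})

# Linear mixed model: fixed effect of treatment, random intercept per patient
# to account for clustering of teeth within patients
result = smf.mixedlm("depth ~ group", df, groups=df["patient"]).fit()

est = result.params["group"]
ci_low, ci_high = result.conf_int().loc["group"]
p_value = result.pvalues["group"]

# Report the estimate, its 95% CI, and the exact P value (not just "P < 0.05")
print(f"Mean difference: {est:.2f} mm (95% CI {ci_low:.2f} to {ci_high:.2f}), P = {p_value:.3f}")
```

Reporting the estimate and its interval alongside the exact P value conveys both the size and the precision of the effect, in line with the guidance cited above.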
The Journal of Dental Research requires authors of clinical and epidemiological studies in oral health to include a relevant checklist. The OHStat Guidelines now provide a checklist that is tuned to studies in oral health. This is an important step toward improving the rigor and reproducibility of research in the field. Nevertheless, we recognize that there is still a great deal of work needed to enhance the trustworthiness of scientific studies. Korbmacher et al. (2023) have proposed structural, procedural, and community-driven changes that can have lasting positive effects on the research environment. Improved statistical approaches are part of the procedural changes and will need to be accompanied by structural changes revolving around open science and community changes to encourage team science. Ultimately, better science and stronger evidence will lead to improved clinical interventions and public health programs that will benefit oral health for all.
Author Contributions
N.S. Jakubovics and F. Schwendicke contributed to conception, design, and data interpretation, and drafted and critically revised the manuscript. All authors gave final approval and agree to be accountable for all aspects of the work.
Footnotes
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding: The authors received no financial support for the research, authorship, and/or publication of this article.
ORCID iDs: N.S. Jakubovics https://orcid.org/0000-0001-6667-0515; F. Schwendicke https://orcid.org/0000-0003-1223-1669
References
- Best AM, Lang TA, Greenberg BL, Gunsolley JC, Ioannidou E; Task Force on Design and Analysis in Oral Health Research. 2024. The OHStat guidelines for reporting observational studies and clinical trials in oral health research: manuscript checklist. J Dent Res. Epub ahead of print 2024 Jul 13. doi:10.1177/00220345241247028
- Camerer CF, Dreber A, Forsell E, Ho T-H, Huber J, Johannesson M, Kirchler M, Almenberg J, Altmejd A, Chan T, et al. 2016. Evaluating replicability of laboratory experiments in economics. Science. 351(6280):1433–1436.
- Chalmers I, Glasziou P. 2009. Avoidable waste in the production and reporting of research evidence. Lancet. 374(9683):86–89.
- Delgado AHS, Sauro S, Lima AF, Loguercio AD, Della Bona A, Mazzoni A, Collares FM, Staxrud F, Ferracane J, Tsoi J, et al. 2022. RoBDEMAT: a risk of bias tool and guideline to support reporting of pre-clinical dental materials research and assessment of systematic reviews. J Dent. 127:104350.
- Elson M. 2024. Pay researchers to spot errors in published papers. Nature. 629(8013):730.
- Errington TM, Denis A, Perfito N, Iorns E, Nosek BA. 2021. Challenges for assessing replicability in preclinical cancer biology. eLife. 10:e67995.
- Errington TM, Mathur M, Soderberg CK, Denis A, Perfito N, Iorns E, Nosek BA. 2021. Investigating the replicability of preclinical cancer biology. eLife. 10:e71601.
- Gardner MJ, Altman DG. 1986. Confidence intervals rather than P values: estimation rather than hypothesis testing. BMJ. 292(6522):746–750.
- Ioannidis JPA. 2005. Why most published research findings are false. PLoS Med. 2(8):e124.
- Korbmacher M, Azevedo F, Pennington CR, Hartmann H, Pownall M, Schmidt K, Elsherif M, Breznau N, Robertson O, Kalandadze T, et al. 2023. The replication crisis has led to positive structural, procedural, and community changes. Commun Psychol. 1:Article 3. doi:10.1038/s44271-023-00003-2
- Open Science Collaboration. 2015. Estimating the reproducibility of psychological science. Science. 349(6251):aac4716.
- Pandis N, Fleming PS, Katsaros C, Ioannidis JPA. 2021. Dental research waste in design, analysis, and reporting: a scoping review. J Dent Res. 100(3):245–252.
- Schulz KF, Altman DG, Moher D; CONSORT Group. 2010. CONSORT 2010 statement: updated guidelines for reporting parallel group randomised trials. BMJ. 340:c332.
- Schwendicke F, Marazita ML, Jakubovics NS, Krois J. 2022. Big Data and complex data analytics: breaking peer review? J Dent Res. 101(4):369–370.
- UK EQUATOR Centre. 2024. EQUATOR Network. Search for reporting guidelines [accessed 2024 Jun 6]. https://www.equator-network.org/reporting-guidelines-study-design/clinical-practice-guidelines/.
- Vaux DL, Fidler F, Cumming G. 2012. Replicates and repeats—what is the difference and is it significant? A brief discussion of statistics and experimental design. EMBO Rep. 13(4):291–296.
- von Elm E, Altman DG, Egger M, Pocock SJ, Gøtzsche PC, Vandenbroucke JP; STROBE Initiative. 2007. The Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) Statement: guidelines for reporting observational studies. Epidemiology. 18(6):800–804.
- Wasserstein RL, Lazar NA. 2016. The ASA Statement on p-values: context, process, and purpose. Am Stat. 70(2):129–133.
