PLOS One. 2020 Nov 5;15(11):e0241826. doi: 10.1371/journal.pone.0241826

Scientific quality of COVID-19 and SARS CoV-2 publications in the highest impact medical journals during the early phase of the pandemic: A case control study

Marko Zdravkovic 1,#, Joana Berger-Estilita 2,#, Bogdan Zdravkovic 1, David Berger 3,*
Editor: Bart Ferket
PMCID: PMC7643945  PMID: 33152034

Abstract

Background

A debate about the scientific quality of COVID-19 themed research has emerged. We explored whether the quality of evidence of COVID-19 publications is lower when compared to nonCOVID-19 publications in the three highest ranked scientific medical journals.

Methods

We searched the PubMed database from March 12 to April 12, 2020, and identified 559 publications in the New England Journal of Medicine, the Journal of the American Medical Association, and The Lancet, which were divided into COVID-19 (cases, n = 204) and nonCOVID-19 (controls, n = 355) content. After exclusion of secondary studies, unauthored publications, response letters, and non-matching article types, 155 COVID-19 publications (including 13 original articles) and 130 nonCOVID-19 publications (including 52 original articles) were included in the comparative analysis. The hierarchical level of evidence was determined for each included publication and compared between cases and controls as the main outcome. A quantitative scoring of quality was carried out for the subgroup of original articles. The numbers of authors and citation rates were also compared between groups.

Results

The 130 nonCOVID-19 publications were associated with higher levels of evidence on the level of evidence pyramid, with a strong association measure (Cramer’s V: 0.452, P <0.001). The 155 COVID-19 publications were 18-fold more likely to be of lower evidence (95% confidence interval [CI] for the odds ratio, 7.0–47; P <0.001). The quantitative quality score (maximum possible score, 28) differed significantly in favor of the nonCOVID-19 publications (mean difference, 11.1; 95% CI, 8.5–13.7; P <0.001). There was a significant difference in the early citation rate of the original articles that favored the COVID-19 original articles (median [interquartile range], 45 [30–244] vs. 2 [1–4] citations; P <0.001).

Conclusions

We conclude that the quality of COVID-19 publications in the three highest ranked scientific medical journals is below the quality average of these journals. These findings need to be verified at a later stage of the pandemic.

Introduction

Coronavirus disease 2019 (COVID-19) is caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) and has spread as a rapidly evolving pandemic that is putting extraordinary stress on healthcare systems across the globe (for simplicity, we use COVID-19 in reference to both the virus and the disease). While everyone waits for a breakthrough in specific COVID-19 therapy and an effective vaccine, scientists are redirecting their efforts into COVID-19–themed research to build up our knowledge of this new disease [1]. A search for “COVID-19 or SARS-CoV2” in the PubMed database revealed 4,670 publications between January 1, 2020, and April 12, 2020. This need to publish COVID-19–related findings has been supported by many ethics committees, grant providers, and journal editors, who have ‘fast-tracked’ COVID-19 publications so that they can be processed at record speed [2–4]. However, concerns are emerging that scientific standards are not being met.

The first report of COVID-19 transmission by an asymptomatic individual [5] was later considered to have been flawed, because the patient had symptoms at the time of transmission [6]. A similar example occurred in The Lancet, where the authors retracted a publication after admitting irregularities in a first-hand account of the front-line experience of two Chinese nurses [7]. While our article was under review, two major analyses of hydroxychloroquine use and of cardiovascular mortality associated with COVID-19 were retracted by The Lancet [8] and the New England Journal of Medicine [9] because the source data could not be verified.

Such situations raise concerns as to the quality of the data, the conclusions presented by the authors, and the peer review by the editors, due to the pressure to publish highly coveted information on COVID-19. The urgency of the outbreak suddenly appears to legitimize key limitations of studies, such as small sample sizes, lack of randomization or blinding, and unvalidated surrogate endpoints [10, 11].

While clinicians and the public long for effective treatments, a debate about the quality of this surge of research and the potential violations of scientific rigor has emerged [10, 12, 13]. Despite this massive publication effort, current guidelines remain without any recommendations on core topics for patient management and care [14, 15]. The combination of clinical urgency, weak evidence, pre-print publications without prior peer review [16], and public pressure [17] might lead to inappropriate public health actions and incorrect translation into clinical practice [18], with the potential for worrying breaches in patient safety [19]. A further concern is the inflation of publication metrics, particularly in terms of journal impact factors. Citation-based metrics are used by researchers to maximize the citation potential of their articles [20]. The expectation of a high citation rate might be used by journals to publish papers of questionable scientific value on ‘trendy’ topics [21].

To date, the quality of COVID-19 publications in the top three general medical journals by impact factor (i.e., the New England Journal of Medicine, The Lancet, and the Journal of the American Medical Association, each with an impact factor above 50) has not been formally assessed. We hypothesized that the quality of recent publications on COVID-19 in these three most influential medical journals is lower than that of nonCOVID-19 articles published during the same time period. We also determined the early research impact of the COVID-19 original articles versus the nonCOVID-19 original articles.

Materials and methods

This report follows the applicable STROBE guidelines for case-control studies.

Publication selection and identification of cases and controls

For the time period of March 12 to April 12, 2020 (i.e., during the early outbreak phase of the COVID-19 pandemic), we identified all of the publications from the top three general medical journals by impact factor (the New England Journal of Medicine (NEJM), the Journal of the American Medical Association (JAMA), and The Lancet). We conducted a PubMed database search on April 17, 2020, using the following search string: ((("The New England journal of medicine"[Journal]) OR "Lancet (London, England)"[Journal]) OR "JAMA"[Journal]) AND ("2020/03/12"[Date - Publication] : "2020/04/12"[Date - Publication]). The resulting publications were stratified into COVID-19–related and nonCOVID-19–related. We matched the nonCOVID-19 publications with the COVID-19 publications according to article type within each journal, with the exclusion of nonmatching article types. Secondary studies, correspondence letters on previously published articles, unauthored publications, and specific article types not matching any of the six categories on the levels of the evidence pyramid [22–24] (e.g., infographic, erratum) were excluded (Fig 1).
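The search is reproducible programmatically through the public NCBI E-utilities API. The following Python sketch is our illustration only (it was not part of the study workflow): the query string is copied verbatim from the Methods, while the function name and result handling are assumptions of the sketch.

    import requests

    # NCBI E-utilities endpoint for PubMed searches (public API).
    EUTILS = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"

    # Search string copied verbatim from the Methods.
    QUERY = (
        '((("The New England journal of medicine"[Journal]) '
        'OR "Lancet (London, England)"[Journal]) OR "JAMA"[Journal]) '
        'AND ("2020/03/12"[Date - Publication] : "2020/04/12"[Date - Publication])'
    )

    def search_pubmed(term, retmax=600):
        """Return the list of PubMed IDs (PMIDs) matching the search term."""
        params = {"db": "pubmed", "term": term, "retmax": retmax, "retmode": "json"}
        response = requests.get(EUTILS, params=params, timeout=30)
        response.raise_for_status()
        return response.json()["esearchresult"]["idlist"]

    pmids = search_pubmed(QUERY)
    print(len(pmids), "publications found")  # 559 at the time of the study

Running the query today may return a different count, since PubMed records are revised over time.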

Fig 1. Flow chart of the processing of the publications included in this study.


The article types in the NEJM are grouped (by the publisher) into Original Research (Research Articles and Special Articles for research on economics, ethics, law and health care systems), Clinical Cases (Brief Reports and Clinical Problem Solving), Review Articles (Clinical Practice Review or Other Reviews), Commentaries (Editorials, Perspectives, Clinical Implications of Basic Research, Letters to the Editor, Images and Videos in Clinical Medicine), and other articles (Special Reports, Policy Reports, Sounding Board, Medicine and Society and Case Records of the Massachusetts General Hospital).

The JAMA articles are grouped by the publisher into Research (Original Investigation, Clinical Trials, Caring for the Critically Ill Patient, Meta-Analysis, Brief Reports and Research Letters), Clinical Review and Education (Systematic Reviews, Advances in Diagnosis and Treatment, Narrative Reviews, Special Communications, Clinical Challenges, Diagnostic Test Interpretation, Clinical Evidence Synopsis), Opinion (Viewpoints), Humanities (The Arts and Medicine, A Piece of My Mind, Poetry) and Correspondence (Letters to the Editor).

The Lancet’s articles are grouped into a Red Section (Articles and Clinical Pictures), a Blue Section (Comments, World Reports, Perspectives, Obituaries, Correspondence, Adverse Drug Reactions and Department of Error) and a Green Section (Seminars, Reviews, Therapeutics, Series, Hypothesis, Other Departments and Commissions).

Multi-step design

We performed a multi-step, 360-degree assessment of the studies. It consisted of classification according to level of evidence, a quantitative appraisal of methodological quality using a validated tool, and a narrative analysis of the strengths and weaknesses of the COVID-19 publications, as is often used in the social sciences [25]. The early citation frequencies of the original articles were also determined.

Levels of evidence

All of the publications included were assessed for number of authors and level of evidence. We used the Oxford Quality Rating Scheme for Studies and Other Evidence [22] to categorize the level of evidence, as adjusted to include animal and in-vitro research [23, 24]. The highest level is attributed to research such as randomized trials, followed by nonrandomized controlled studies and cohort trials. The lower levels are represented by descriptive studies, expert opinion, and animal or in-vitro research, and the hierarchy is commonly represented in the form of a pyramid [22, 23, 26]. For the secondary analysis, we split the six levels of evidence into upper and lower halves, reflecting higher (i.e., 1–3) and lower (i.e., 4–6) levels of evidence, respectively. The number of authors per publication was counted manually.
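As a minimal illustration of this dichotomization, the sketch below encodes the six levels (labels abridged from Tables 1 and 2) as a lookup table; the function name is ours and purely illustrative:

    # Six-level evidence hierarchy used in this study (cf. Tables 1 and 2),
    # adjusted to include animal and in-vitro research [23, 24].
    EVIDENCE_LEVELS = {
        1: "randomized controlled trial",
        2: "controlled trial without randomization; prospective comparative cohort trial",
        3: "case-control study; retrospective cohort study",
        4: "case series; cross-sectional study",
        5: "opinion papers; case reports",
        6: "animal or in-vitro research",
    }

    def evidence_group(level):
        """Dichotomize into the higher (1-3) vs. lower (4-6) halves of the pyramid."""
        if level not in EVIDENCE_LEVELS:
            raise ValueError(f"unknown evidence level: {level}")
        return "higher" if level <= 3 else "lower"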

Quantitative appraisal using the “Standard quality assessment criteria for evaluating primary research papers from a variety of fields (QUALSYST)”

After the hierarchical grading of the included publications, the original articles (i.e., those published as ‘original research articles’ in each of the journals; Fig 1) were selected for further in-depth analysis using the study quality checklist proposed by Kmet et al. [27]. This checklist is consistent with the recommendations from the Centre for Reviews and Dissemination [28, 29]. Four authors working in pairs (MZ–DB, JBE–BZ; each pair assessing one half of the publications) independently assessed the original articles on 14 quality criteria (see S1 File). The 14 items covered the research question, design, measures to reduce bias, and data reporting and interpretation, and each was scored according to the degree to which the specific criterion was met (“yes” = 2; “partial” = 1; “no” = 0; “not applicable” = n/a) with the help of a prespecified manual [27]. The total score ranged from 0 to 28. The summary percentage score was calculated for each original article by summing the scores across the applicable items, dividing by the total possible score (i.e., 28 − [number of “n/a” items] × 2), and multiplying by 100. Disagreements between the reviewers (defined as a >2-point difference in the total score, or a >10% difference in the summary percentage score) were resolved through one round of discussion within each two-author pair.
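The scoring arithmetic can be summarized in a few lines. This sketch is our illustration of the rule described above, not code from the study:

    def qualsyst_scores(items):
        """Total and summary percentage scores for the 14 QUALSYST items.

        Each item is 2 ("yes"), 1 ("partial"), 0 ("no"), or None ("n/a").
        """
        assert len(items) == 14, "QUALSYST uses 14 quality criteria"
        applicable = [s for s in items if s is not None]
        total = sum(applicable)
        n_na = len(items) - len(applicable)
        max_possible = 28 - 2 * n_na  # each "n/a" item removes 2 possible points
        return total, 100 * total / max_possible

    # Example: 10 items fully met, 2 partially met, 1 unmet, 1 not applicable.
    total, pct = qualsyst_scores([2] * 10 + [1, 1, 0, None])
    print(total, round(pct, 1))  # -> 22, 84.6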

Narrative analysis of COVID-19 original articles

The COVID-19 original research articles (n = 13) were assessed in narrative form to report on their major weaknesses, potential conflicts of interest, and likely influence on further research and clinical practice.

Citation frequencies

The early citation frequencies were tracked every 5 days from April 25 to May 25, 2020, for all of the original scientific articles through Google Scholar [30], to determine how strongly the COVID-19 original articles had impacted further publications in comparison with the nonCOVID-19 original articles. We also compared against a set of original articles from the same time frame in 2019. Citations per month were calculated to reduce lead-time bias. The Google Scholar search engine has been shown to reliably identify the most highly cited academic documents [31].

Statistical analysis

The distributions of the COVID-19 and nonCOVID-19 publications on the levels of evidence pyramid were analyzed using Pearson’s Chi-squared statistics and Cramer’s V as the measure of strength of association (weak: >0.05; moderate: >0.10; strong: >0.15; very strong: >0.25) [32]. Further effect size estimations were performed on two by two contingency tables (split by level of evidence into high and low quality groups) and are reported as odds ratios with 95% confidence intervals (CI).

The retrospectively calculated sample size for the summary percentage scores [27] to detect a 20% change from 90 (nonCOVID-19) to 72 (COVID-19), with 4:1 allocation (52:13 original articles, respectively) on a t-test, with a standard deviation of 15, 85% power, and 0.05 alpha, was 8 original articles [33, 34]. Thus, we deemed our collected data sufficient.
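This calculation can be reproduced with a standard power routine. In standardized units, the targeted difference is (90 - 72)/15 = 1.2. A sketch using statsmodels (our illustration; the study cites [33, 34] for its calculation):

    from statsmodels.stats.power import TTestIndPower

    effect_size = (90 - 72) / 15  # 20% change from 90 to 72 with SD 15 -> 1.2

    n_covid = TTestIndPower().solve_power(
        effect_size=effect_size,
        alpha=0.05,
        power=0.85,
        ratio=4,  # 4:1 nonCOVID-19:COVID-19 allocation
        alternative="two-sided",
    )
    print(round(n_covid))  # ~8 COVID-19 original articles needed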

We also planned for a secondary analysis if the comparison above resulted in a significant difference (defined as P <0.05) in the mean percentage scores between the COVID-19 and nonCOVID-19 original articles. The secondary analysis aimed to compare the 2:1 allocation of nonCOVID-19:COVID-19 original articles, for which the allocation was carried out with the 26 original articles with the lowest overall percentage scores in the nonCOVID-19 group versus all of the 13 original articles in the COVID-19 group. The threshold p-value for significance was set at P <0.025, to adjust for multiple testing.

Assessment of the original articles’ quality is reported as a two-reviewer mean score (95% CI) and was analyzed using Welch’s t-tests. Hedges’ g was used as the effect size measure based on the standardized mean difference [35] (small: 0.20; medium: 0.50; large: 0.80; very large: 1.20; huge: >2.00) [36, 37]. To confirm the reliability of the scoring, Cronbach’s alpha was calculated for the total score and the summary percentage score (internal consistency), and the Intraclass Correlation Coefficient with absolute agreement was calculated for the inter-rater reliability. The percentage agreement between the two reviewers was also calculated for each individual item (see S2 File).
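As an illustration of these effect size computations, the sketch below applies Welch’s t-test and Hedges’ g (with the usual small-sample correction) to simulated score vectors; the input data are invented for demonstration and are not the study data:

    import numpy as np
    from scipy import stats

    def hedges_g(x, y):
        """Standardized mean difference with small-sample bias correction."""
        nx, ny = len(x), len(y)
        pooled_sd = np.sqrt(((nx - 1) * np.var(x, ddof=1) + (ny - 1) * np.var(y, ddof=1))
                            / (nx + ny - 2))
        d = (np.mean(x) - np.mean(y)) / pooled_sd  # Cohen's d
        return d * (1 - 3 / (4 * (nx + ny) - 9))   # Hedges' correction factor

    rng = np.random.default_rng(0)
    noncovid = rng.normal(23.7, 3.2, 52)  # invented total scores, nonCOVID-19 group
    covid = rng.normal(12.6, 4.5, 13)     # invented total scores, COVID-19 group

    t, p = stats.ttest_ind(noncovid, covid, equal_var=False)  # Welch's t-test
    print(f"Welch t = {t:.2f}, P = {p:.2g}, Hedges' g = {hedges_g(noncovid, covid):.2f}")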

The data distributions were tested for normality with Kolmogorov-Smirnov tests, and are reported accordingly. Tests between two groups were performed with Mann-Whitney tests, and between multiple groups with Kruskal-Wallis tests. Significance was set at P <0.05 or adjusted for multiple testing. All of the tests were two-tailed. The statistical analysis was performed using SPSS Statistics 20 (IBM Inc., Armonk, NY, USA) and Prism 8 (GraphPad Software, San Diego, CA, USA).

Results

Out of 559 publication entries on PubMed for the selected journals, 155 publications on COVID-19 and 130 publications on other (nonCOVID-19) topics were included in the level of evidence analysis. The subsequent analysis of quality was performed on 13 COVID-19 original articles in comparison with 52 nonCOVID-19 original articles (Fig 1).

Levels of evidence and number of authors

The nonCOVID-19 publications were associated with higher quality on the level of evidence pyramid (P <0.001; Chi squared), with a strong association measure (Cramer’s V: 0.452, Table 1). When comparing the higher evidence group to the lower evidence group, the COVID-19 publications were 18-fold more likely (i.e., odds ratio) to be in the lower evidence group (95% CI: 7.0–47; P <0.001). When comparing only the original articles on the levels of evidence pyramid (Table 2), the nonCOVID-19 publications were also associated with higher quality (P <0.001; Chi squared), with a strong association measure (Cramer’s V: 0.641, Table 2). When comparing the higher evidence group to the lower evidence group, the COVID-19 original articles were 26-fold more likely (i.e., odds ratio) to be in the lower evidence group (95% CI: 5.4–120; P <0.001).

Table 1. Frequency distribution of the publications included on the levels of evidence pyramid [23, 24].

Study design Level Group COVID-19 (n = 155) [n (%)] nonCOVID-19 (n = 130) [n (%)]
Randomized controlled trial 1 Higher level of evidence 1 (0.6) 38 (29.2)
Well-designed controlled trial without randomization; prospective comparative cohort trial 2 0 (0) 2 (1.5)
Case-control study; retrospective cohort study 3 4 (2.6) 9 (6.9)
Case series without or with intervention; cross-sectional study 4 Lower level of evidence 19 (12.3) 10 (7.7)
Opinion papers; case reports 5 129 (83.2) 69 (53.1)
Animal or in-vitro research 6 2 (1.3) 2 (1.5)
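The dichotomized counts in Table 1 suffice to reproduce the reported odds ratio and its confidence interval, and to recompute Cramer’s V over the full table. A verification sketch (ours, not study code; the study used SPSS and Prism):

    import numpy as np
    from scipy import stats

    # Counts from Table 1, dichotomized: levels 1-3 (higher) vs. 4-6 (lower).
    covid_high, covid_low = 1 + 0 + 4, 19 + 129 + 2        # 5, 150
    noncovid_high, noncovid_low = 38 + 2 + 9, 10 + 69 + 2  # 49, 81

    # Odds ratio for a COVID-19 publication falling into the lower-evidence group.
    odds_ratio = (covid_low * noncovid_high) / (covid_high * noncovid_low)
    se_log_or = np.sqrt(1 / covid_low + 1 / covid_high + 1 / noncovid_low + 1 / noncovid_high)
    ci = np.exp(np.log(odds_ratio) + np.array([-1.96, 1.96]) * se_log_or)
    print(f"OR = {odds_ratio:.1f}, 95% CI {ci[0]:.1f}-{ci[1]:.1f}")  # OR = 18.1, CI 7.0-47.4

    # Cramer's V over the full 6 x 2 table.
    table = np.array([[1, 38], [0, 2], [4, 9], [19, 10], [129, 69], [2, 2]])
    chi2, p, dof, _ = stats.chi2_contingency(table)
    cramers_v = np.sqrt(chi2 / (table.sum() * (min(table.shape) - 1)))
    print(f"Cramer's V = {cramers_v:.3f}, P = {p:.2g}")  # ~0.45, consistent with the text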

Table 2. Frequency distribution of the original articles on the levels of evidence pyramid [23, 24].

Study design Level Group COVID-19 (n = 13) [n (%)] nonCOVID-19 (n = 52) [n (%)]
Randomized controlled trial 1 Higher level of evidence 1 (7.7) 38 (73.1)
Well-designed controlled trial without randomization; prospective comparative cohort trial 2 0 (0) 1 (1.9)
Case-control study; retrospective cohort study 3 2 (15.4) 7 (13.5)
Case series without or with intervention; cross-sectional study 4 Lower level of evidence 9 (69.2) 6 (11.5)
Opinion papers; case reports 5 1 (7.7) 0 (0)
Animal or in-vitro research 6 0 (0) 0 (0)

Numbers of authors were similar between groups (median [interquartile range]: 3 [2–6.5] versus 3 [2–13.5]; P = 0.394; Mann-Whitney). In an a posteriori subgroup analysis in the lower evidence group (adjusted threshold p-value as P <0.017), there were significantly more authors in the COVID-19 publications (median [interquartile range]: 3 [2–6]) than in the nonCOVID-19 publications (median: 2 [1–3]) (P <0.001; Mann-Whitney). Obvious outliers were a NEJM case report [38] with 35 authors, an opinion correspondence piece in The Lancet [39] with 29 authors, and a comment piece in The Lancet with 77 authors in a coalition [40].

Quantitative appraisal

Owing to a >2-point difference in the total scores or a >10% difference in the summary percentage scores, the reviewer pairs discussed 8 (of 32) and 12 (of 33) of the original articles, respectively, after the individual scoring. The internal consistency reliability of the total score was 0.987, and of the summary percentage score, 0.964 (Cronbach’s alpha) for the reviewer pair MZ–DB, and 0.988 and 0.928, respectively, for the reviewer pair JBE–BZ (P < 0.001, for all). The inter-rater reliability of the total scores was 0.975, and of the summary percentage scores, 0.930 (Intraclass Correlation Coefficient, absolute agreement) for pair MZ–DB, and 0.974 and 0.860, respectively, for pair JBE–BZ (P < 0.001, for all).

The mean total scores in the COVID-19 and nonCOVID-19 groups were 12.6 (95% CI 10.1–15.1) and 23.7 (95% CI 22.9–24.6), respectively (Fig 2A), and the mean summary percentage scores were 71.8% (95% CI 62.4–81.1) and 91.1% (95% CI 89.0–93.2), respectively (Fig 2C). The mean total scores and the mean summary percentage scores were significantly different between the groups, favoring the nonCOVID-19 original articles (P <0.001, for both; Welch’s t-test; Hedges’ g = 3.37 and 2.02, respectively). For the total scores, the difference between the means was 11.1 (95% CI 8.5–13.7; P <0.001), and for the summary percentage scores, 19.3% (95% CI 9.8%–28.8%; P <0.001). In the secondary analysis, when the COVID-19 original articles were compared to the lower quality half of the nonCOVID-19 original articles (i.e., the 26 scoring lower, instead of all 52), the differences in the mean total scores (Fig 2B; 12.6 [95% CI 10.1–15.1] vs 21.4 [95% CI 20.4–22.3] points, respectively; P = 0.008; Welch’s t-test; Hedges’ g = 2.86) and the mean summary percentage scores (Fig 2D; 71.8% [95% CI 62.4–81.1] vs 85.6% [95% CI 82.8–88.5], respectively; P <0.001; Welch’s t-test; Hedges’ g = 1.31) remained significant. For this secondary analysis, the threshold P value for significance was set at 0.025.

Fig 2. Quantitative appraisal of the quality of the COVID-19 versus nonCOVID-19 original articles.


The “Standard quality assessment criteria for evaluating primary research papers from a variety of fields” [27] was used, for a maximum total score of 28. (A, C) Primary analysis for mean total scores (A) and mean summary percentage scores (C) for all COVID-19 (n = 13) and nonCOVID-19 (n = 52) original articles. (B, D) Secondary analysis for mean total scores (B) and mean summary percentage scores (D) that included all of the COVID-19 original articles (n = 13) and the lower quality half of the nonCOVID-19 original articles (n = 26). Data are means with 95% CI. An adjusted threshold P value of 0.025 defines significance (adjusted for multiple testing; Welch’s t-tests).

In a secondary sensitivity analysis that also included research letters, the mean total scores in the COVID-19 (n = 21) and nonCOVID-19 (n = 55) groups were 12.3 (95% CI 10.6–14) and 23.3 (95% CI 22.2–24.2) respectively, and the mean summary percentage scores were 72.6% (95% CI 66.1–79.1) and 90.9% (95% CI 88.9–92.9), respectively. The mean total score and the mean summary percentage scores were significantly different between the groups, favoring the nonCOVID-19 original articles (P <0.001, for both; Welch’s t-test; Hedges’ g = 2.98, 1.87, respectively). For the total scores, the difference between the means was 11.0 (95% CI 9.1–12.9; P <0.001), and for the summary percentage scores, 18.3% (95% CI 11.6%–25.0%; P <0.001).

Citation frequency

There was a significant difference in the median number of citations according to Google Scholar at each of the seven dates tested, favoring the COVID-19 original research papers (P <0.001, for all; Mann-Whitney; Table 3). A comparison with a set of original articles from the same dates in 2019 revealed 53 (25 to 90) citations for the 2019 articles, versus 334 (222 to 1001) citations for the COVID-19 articles and 10 (4 to 18) for the nonCOVID-19 articles, as of August 2020 (P <0.002 for all comparisons). When corrected for lead time using citations per month, the 2019 articles had 4 (2 to 6) citations per month and the 2020 nonCOVID-19 articles had 2.5 (1 to 4.5), a nonsignificant difference. The 2020 COVID-19 articles had 83.5 (55 to 250) citations per month (P <0.001).
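The lead-time correction is simple arithmetic: citations divided by months in circulation. With months-in-circulation values approximated by us from the publication window (about 4 months for the 2020 articles at the August count, and about 16 months for the 2019 set), the reported rates are recovered:

    # Months in circulation are our assumptions, not values stated in the text.
    covid_2020 = 334 / 4      # -> 83.5 citations per month, as reported
    noncovid_2020 = 10 / 4    # -> 2.5 citations per month, as reported
    articles_2019 = 53 / 16   # -> ~3.3, near the reported median of 4
    print(covid_2020, noncovid_2020, round(articles_2019, 1))

The 2019 figure of 4 is a median of per-article rates rather than a ratio of medians, which explains the small discrepancy in the last line.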

Table 3. Google Scholar citations of original articles published between March 12 and April 12, 2020.

Date COVID-19 citations (n = 13) nonCOVID-19 citations (n = 52) P value*
April 25 33 (14–212) 2 (1–3) <0.001
April 30 45 (30–244) 2 (1–4) <0.001
May 5 65 (41–290) 2 (1–4) <0.001
May 10 88 (48–328) 2 (1–5) <0.001
May 15 123 (59–390) 2.5 (1–5) <0.001
May 20 139 (64–435) 3 (1.3–6) <0.001
May 25 149 (73–512) 3 (1.3–7) <0.001

Data are median (interquartile range)

* Mann-Whitney tests

Narrative appraisal

The major weaknesses of the 13 COVID-19 original research articles were assessed (Table 4). The selection included one randomized trial [41], four retrospective cohort studies or case series [42–45], five epidemiological descriptive studies [46–50], and three epidemiologic modeling studies [51–53], with most of the designs reflecting low grades of evidence [22]. Most of these studies had limitations in terms of missing data or under-reporting. The randomized trial was not blinded. Ten studies showed no apparent conflicts of interest. Two studies were based on data collected by the World Health Organization [51, 52], and in another study [54] a pharmaceutical company screened the patients for treatment, collected the data, and supported the trial financially. Two studies had a patient:author ratio <1 [43, 46], and two further studies had ratios close to 1 [55, 56]. Three studies were considered not relevant for further research [46, 48, 55], and four studies were deemed not relevant for clinical practice [43, 46, 55, 56], because the findings were neither new nor generalizable. The 13 COVID-19 original articles have already been cited in 52 sets of published guidelines.

Table 4. Narrative assessment of the quality of the COVID-19 original articles.

Reported study Major weaknesses Conflict of interest Patient:author (ratio) Should influence further research? Should influence clinical practice? Citation rate as of April 30
Bhatraju et al. Covid-19 in critically ill patients in the Seattle region—case series [55] Design implies a low grade evidence (case-series; no generalizable or representative information). Patients presented with similar respiratory symptoms and had similar mortality rate to patients described in reports from China. Incomplete documentation of symptoms and missing laboratory testing None apparent 24:18 (1.33) No. Similar data across Chinese and European cohorts. No. No new findings. Incorporated into two guideline documents 86
Cao et al. A trial of lopinavir-ritonavir in adults hospitalized with severe COVID-19 [41] Some exclusion criteria were vague (physician decision when involved in the trial as not in the best interest of the patients, presence of any condition that would not allow protocol to be followed safely). No blinding. No placebo prepared. None apparent 199:65 (3.06) Yes. Pursuing more trials with lopinavir-ritonavir not necessary. Yes. Lopinavir-ritonavir treatment added to standard supportive care not associated with clinical improvement or mortality in seriously ill patients with COVID-19, and therefore should not be used for treatment. 389
Ghinai et al. First known person-to-person transmission of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) in the USA [46] Design implies low grade evidence (case-report; no generalizable or representative information). Incomplete documentation. Epidemiological design performed before implementation of CDC guidelines (not comparable to future investigations). None apparent 2:38 (0.05) No. Epidemiological design performed before implementation of CDC guidelines (methodology not comparable to future investigations). No. Described before in another country. Incorporated into Position Paper on COVID-19 of the EASL-ESCMID 38
Gilbert et al. Preparedness and vulnerability of African countries against importations of COVID-19: a modelling study [51] Design implies low grade evidence (epidemiologic modeling study; anticipatory). Study did not state limitations. Complex analysis. Yes. WHO supported N/A Yes. Should influence public health measures and research for implementation and effectiveness Yes. Should influence public health measures. Mainly Africa-derived research 98
Grasselli et al. Baseline characteristics and outcomes of 1591 patients infected with SARS-CoV-2 admitted to ICUs of the Lombardy region, Italy [42] Design implies low grade evidence (Case-series). Data acquired telephonically. Large amounts of missing data. ICU mortality reported while 58% were still on ICU. None apparent 1591:21 (75.76) Yes. Baseline data for Europe. Yes. Representative cohort to inform clinical practice. Incorporated into a Position Paper of the German Society of Pneumology on treatment for COVID-19 and in guideline from ENT-UK for safe tracheostomy of COVID-19 patients. 51
Grein et al. Compassionate use of remdesivir for patients with severe COVID-19 [54] Design implies low grade evidence (Case-Series). No sample size calculation/ small sample size/ underpowered study. Limited number of collected laboratory measures. Missing data. No control group. Yes. Medication supplied after request to Gilead. Gilead funded trial, collected data, and decided which patients got drug 53:56 (0.94) Yes. Findings from these uncontrolled data informed by the ongoing randomized, placebo-controlled trials of remdesivir therapy for COVID-19. Currently no. Data too low quality to influence clinical practice, concerns regarding patient safety. Included in four sets of guidelines. 42
Kandel et al. Health security capacities in the context of COVID-19 outbreak: an analysis of International Health Regulations annual report data from 182 countries [52] Design implies a low grade evidence (epidemiologic modelling study; anticipatory). Study does not state limitations. Complex analysis. Yes. WHO supported N/A Yes. Should influence public health measures and research for implementation and effectiveness Yes. Should influence public health measures and research for implementation and effectiveness. 24
Leung et al. First-wave COVID-19 transmissibility and severity in China outside Hubei after control measures, and second-wave scenario planning: a modelling impact assessment [53] Design implies a low grade evidence (epidemiologic modelling study; anticipatory). Under-reporting from national sources. Complex analysis. None apparent N/A Yes. Should influence public health measures and research for implementation and effectiveness Yes. Should influence public health measures and research for implementation and effectiveness. 11
Li et al. Early transmission dynamics in Wuhan, China, of novel Coronavirus-infected pneumonia [47] Design implies a low grade evidence (epidemiologic descriptive study). Missing values, probably underreporting. None apparent 425:45 (9.44) Yes. First estimate of pandemic dynamics. Yes. Representative cohort can inform clinical practice. Included in eight sets of guidelines 2027
McMichael et al. Epidemiology of COVID-19 in a long-term care facility in King County, Washington [48] Design implies a low grade evidence (epidemiologic descriptive study). Missing values. None apparent 147:31 (4.74) No. Similar data to other cohorts, no generalizability of results. Yes. Representative cohort can inform clinical practice. Included in two societal recommendations for protecting against and mitigation of COVID-19 pandemic in long-term care facilities. 45
Pan et al. Association of public health interventions with the epidemiology of the COVID-19 outbreak in Wuhan, China [49] Design implies a low grade evidence (epidemiologic descriptive study). Missing values. Questionable findings (letter from Lipsitch et al.) [63] None apparent N/A Yes. Should influence public health measures and research for implementation and effectiveness. Yes. Should influence public health measures and research for implementation and effectiveness. 24
Pung et al. Investigation of three clusters of COVID-19 in Singapore: implications for surveillance and response measures [50] Design implies a low grade evidence (epidemiologic descriptive study). Small sample size. Missing values. Recall bias. None apparent 36:20 (1.80) Might influence public health measures to contain clusters. No. Data too low quality to influence clinical practice (no generalizability). 36
Zhou et al. Clinical course and risk factors for mortality of adult inpatients with COVID-19 in Wuhan, China: a retrospective cohort study [44] Small sample. Missing values. None apparent 191:19 (10.05) Yes. Early description of clinical course. Findings might change with ongoing pandemic and for other health systems Yes. Representative cohort can inform clinical practice. Included in 33 sets of guidelines from different societies (all continents represented). 1085

CDC: Centers for Disease Control and Prevention; N/A: not applicable

Discussion

The main finding of our study is that the COVID-19–related research in these highly ranked medical journals is of lower quality than research on other topics in the same journals over the same period, with strong effect sizes. We also demonstrated that the number of publications on COVID-19 alone was almost the same as the number of publications on all other topics combined. These findings provide evidence for the debate on the scientific value, ethics, and information overload of COVID-19 research [10, 13, 19].

There are several limitations to the present study. Even though our data were less than a month old at first submission, the results may soon become obsolete, as new COVID-19 research emerges daily. We tried to overcome potential bias with a clear search strategy and a simple analysis, making our findings highly reproducible. We chose Lander’s method because it allowed inclusion of in-vitro and animal research [23], and we refined the hierarchical grading of the level of evidence using a quantitative tool [27]. Given the vast choice of instruments [57], we chose the QUALSYST tool on the basis that it allows assessment and comparison across multiple study types [27]. Even though summary scoring might be biased for a methodological quality assessment [57], “composite quality scales can provide useful overall assessments when comparing populations of trials” [57]. The QUALSYST tool has been validated and is easy to use, which may facilitate similar studies at a later stage of the pandemic. Compared with an in-depth analysis of a study’s peer-review process prior to acceptance for publication, however, our assessment must remain superficial. We did not expand our analysis to check source data. The data scandal leading to the retraction of two major studies [8, 9] emerged while our article was under peer review; the tools we used would not have been suitable to detect this. Public data repositories and an “open science” approach may facilitate data validation [58].

The imbalance between the two cohorts in our study might stem from a lack of randomized trials and a proliferation of opinion articles and cluster descriptions among the COVID-19 publications. It can be argued that in the early phases of a pandemic, case-defining reports are mandatory for understanding the evolving dynamics of the outbreak, and that such studies will suffer from the usual limitations of initial investigations and will score lower on quality, even when they are carried out to high standards. However, in our secondary analysis, the significant quality difference remained after exclusion of the highest-quality nonCOVID-19 publications. One might argue that a comparison with a historical control group, for example from the same time frame in 2019, when there was no pandemic effect on research, would have been more appropriate. Our hypothesis was that COVID-19–related research showed lower quality than nonCOVID-19 research. A historical control group may introduce a selection bias, since the conditions for research were clearly different then. We would therefore argue that when methodological quality is assessed, the control group has to be subject to the same conditions as the test group; this may be different for other endpoints, such as total research output. In line with our results, Stefanini et al. reported, in an oral presentation at the European Society of Cardiology Congress 2020, similar findings of lower quality associated with COVID-19 in the same journals and timeframe as our work, using a historical control group from 2019. Thus, both historical and contemporary control groups lead to the same conclusion.

The COVID-19 theme per se might have attracted more readers and researchers, which will have led to more citations and greater incorporation into secondary studies, as we have also demonstrated. Such a ‘double-whammy’ of lower-quality literature and high dissemination potential can have grave consequences, as it might urge clinicians to take actions and use treatments that are compassionately based but supported by little scientific evidence. Indeed, apart from exposing patients to the potential side effects of some drugs [46, 59, 60], treatment strategies based on case reports are generally futile [61]. While multiple diagnostic, therapeutic, and preventive interventions for COVID-19 are being trialed [62], clinicians should sometimes resist the wish “to at least do something” and maintain clinical equipoise while fully gathering and evaluating the data that are available [12, 61]. This responsibility needs to be shared by the high-impact journals, which should continue to maintain the same publication standards as for other, nonCOVID-19 research. It must be acknowledged, though, that a citation is not necessarily positive for a study or author once its context (i.e., criticism or discussions about retractions and corrections) is considered; this is beyond the scope of our work.

The pandemic took a toll on all aspects of life. Clearly, journal reviewers were restricted in the time they were able to invest in their valuable, voluntary, and honorary work. To what extent their practices changed is not accessible to us, since the peer-review process was blind and confidential. Assessment of journals with open peer review during the pandemic may shed light on such phenomena, but this was beyond the scope of our study.

We also demonstrated a worrying trend toward increasingly long author lists in lower quality COVID-19 publications, with the almost ‘anecdotal’ finding that some of the publications actually had more authors than patients [38, 43, 46]. The current demand for publications appears to have led authors to send their COVID-19 findings to higher-impact journals. As the authors of the present report, we are exposed to the same allegations.

At present, we can only issue a plea to both authors and editors to maintain their ethical and moral responsibilities in terms of the International Committee of Medical Journal Editors authorship standards. Being at the forefront of medical discovery, these journals should not publish lower quality findings just to promote citations. The risk of bias and unintended consequences for patients is relevant [61], and scientific standards must not be ‘negotiable’ [10].

Conclusions

The quality of the COVID-19–related research in the top three scientific medical journals is below the quality average of these journals. Unfortunately, our numbers do not contribute to a solution as to how to preserve scientific rigor under the pressure of a pandemic.

Supporting information

S1 File. Checklist used for the assessment of the quality of the quantitative studies.

Description of data: Detailed criteria are shown for the quality assessment of the quantitative studies.

(DOCX)

S2 File. Assessor (authors MZ–DB, JBE–BZ) agreements on the qualities of the quantitative studies.

Description of data: Percentage assessor agreement after independent individual scoring and following resolution of disagreements.

(DOCX)

Acknowledgments

We would like to thank Professor Jukka Takala for revision of the manuscript draft, and Chris Berrie for manuscript editing and help with the language.

Abbreviations

CI

confidence interval

COVID-19

coronavirus disease 2019

QUALSYST

Standard quality assessment criteria for evaluating primary research papers from a variety of fields

SARS-CoV-2

severe acute respiratory syndrome coronavirus 2

Data Availability

https://figshare.com/projects/Scientific_Quality_of_COVID-19_and_SARS_CoV-2_Publications_in_the_Highest_Impact_Medical_Journals_during_the_Early_Phase_of_the_Pandemic_A_Case-Control_Study/86027.

Funding Statement

The author(s) received no specific funding for this work.

References

• 1. Hossain MM. Current Status of Global Research on Novel Coronavirus Disease (COVID-19): A Bibliometric Analysis and Knowledge Mapping. 2020. https://ssrn.com/abstract=3547824.
• 2. Brown A, Horton R. A planetary health perspective on COVID-19: a call for papers. Lancet Planet Health. 2020.
• 3. Greaves S. Sharing findings related to COVID-19 in these extraordinary times. Hindawi [accessed 19th April 2020]. https://www.hindawi.com/post/sharing-findings-related-covid-19-these-extraordinary-times/.
• 4. PLOS. A message to our community regarding COVID-19. PLOS [updated 19th March 2020; accessed 19th April 2020]. https://plos.org/blog/announcement/a-message-to-our-community-regarding-covid-19/.
• 5. Rothe C, Schunk M, Sothmann P, Bretzel G, Froeschl G, Wallrauch C, et al. Transmission of 2019-nCoV Infection from an Asymptomatic Contact in Germany. N Engl J Med. 2020;382(10):970–1. doi: 10.1056/NEJMc2001468.
• 6. Kupferschmidt K. Study claiming new coronavirus can be transmitted by people without symptoms was flawed. Science [accessed 19th April 2020]. https://www.sciencemag.org/news/2020/02/paper-non-symptomatic-patient-transmitting-coronavirus-wrong.
• 7. Zeng Y, Zhen Y. RETRACTED: Chinese medical staff request international medical assistance in fighting against COVID-19. Lancet Glob Health. 2020. doi: 10.1016/S2214-109X(20)30065-6.
• 8. Mehra MR, Ruschitzka F, Patel AN. Retraction—Hydroxychloroquine or chloroquine with or without a macrolide for treatment of COVID-19: a multinational registry analysis. Lancet. 2020;395(10240):1820.
• 9. Mehra MR, Desai SS, Kuy S, Henry TD, Patel AN. Retraction: Cardiovascular Disease, Drug Therapy, and Mortality in Covid-19. N Engl J Med. 2020;382(26):2582. doi: 10.1056/NEJMoa2007621.
• 10. London AJ, Kimmelman J. Against pandemic research exceptionalism. Science. 2020. doi: 10.1126/science.abc1731.
• 11. Kim AHJ, Sparks JA, Liew JW, Putman MS, Berenbaum F, Duarte-Garcia A, et al. A Rush to Judgment? Rapid Reporting and Dissemination of Results and Its Consequences Regarding the Use of Hydroxychloroquine for COVID-19. Ann Intern Med. 2020.
• 12. Angus DC. Optimizing the Trade-off Between Learning and Doing in a Pandemic. JAMA. 2020. doi: 10.1001/jama.2020.4984.
• 13. Kalil AC. Treating COVID-19—Off-Label Drug Use, Compassionate Use, and Randomized Clinical Trials During Pandemics. JAMA. 2020.
• 14. Alhazzani W, Moller MH, Arabi YM, Loeb M, Gong MN, Fan E, et al. Surviving Sepsis Campaign: guidelines on the management of critically ill adults with Coronavirus Disease 2019 (COVID-19). Intensive Care Med. 2020.
• 15. Lamontagne F, Angus DC. Toward Universal Deployable Guidelines for the Care of Patients With COVID-19. JAMA. 2020. doi: 10.1001/jama.2020.5110.
• 16. Johansson MA, Saderi D. Open peer-review platform for COVID-19 preprints. Nature. 2020;579(7797):29. doi: 10.1038/d41586-020-00613-4.
• 17. Adare A, Afanasiev S, Aidala C, Ajitanand NN, Akiba Y, Al-Bataineh H, et al. Enhanced production of direct photons in Au + Au collisions at square root(S(NN)) = 200 GeV and implications for the initial temperature. Phys Rev Lett. 2010;104(13):132301. doi: 10.1103/PhysRevLett.104.132301.
• 18. Ioannidis JPA. Coronavirus disease 2019: the harms of exaggerated information and non-evidence-based measures. Eur J Clin Invest. 2020:e13223.
• 19. Goodman JL, Borio L. Finding Effective Treatments for COVID-19: Scientific Integrity and Public Confidence in a Time of Crisis. JAMA. 2020.
• 20. Bradshaw CJ, Brook BW. How to Rank Journals. PLoS One. 2016;11(3):e0149852. doi: 10.1371/journal.pone.0149852.
• 21. Ioannidis JPA, Thombs BD. A user’s guide to inflated and manipulated impact factors. Eur J Clin Invest. 2019;49(9):e13151. doi: 10.1111/eci.13151.
• 22. Phillips B, Ball C, Sackett D, Badenoch D, Straus S, Haynes B, et al. Oxford Centre for Evidence-Based Medicine: Levels of Evidence (March 2009). 2009.
• 23. Lander B, Balka E. Exploring How Evidence is Used in Care Through an Organizational Ethnography of Two Teaching Hospitals. J Med Internet Res. 2019;21(3):e10769. doi: 10.2196/10769.
• 24. Djulbegovic B, Guyatt GH. Progress in evidence-based medicine: a quarter century on. Lancet. 2017;390(10092):415–23. doi: 10.1016/S0140-6736(16)31592-6.
• 25. Understanding qualitative research in health care. Drug and Therapeutics Bulletin. 2017;55(2):21. doi: 10.1136/dtb.2017.2.0457.
• 26. Concato J, Shah N, Horwitz RI. Randomized, controlled trials, observational studies, and the hierarchy of research designs. N Engl J Med. 2000;342(25):1887–92. doi: 10.1056/NEJM200006223422507.
• 27. Kmet LM, Cook LS, Lee RC. Standard quality assessment criteria for evaluating primary research papers from a variety of fields. 2004.
• 28. Khan KS, Ter Riet G, Glanville J, Sowden AJ, Kleijnen J. Undertaking systematic reviews of research on effectiveness: CRD’s guidance for carrying out or commissioning reviews. NHS Centre for Reviews and Dissemination; 2001.
• 29. Stroup DF, Berlin JA, Morton SC, Olkin I, Williamson GD, Rennie D, et al. Meta-analysis of observational studies in epidemiology: a proposal for reporting. Meta-analysis Of Observational Studies in Epidemiology (MOOSE) group. JAMA. 2000;283(15):2008–12. doi: 10.1001/jama.283.15.2008.
• 30. Falagas ME, Pitsouni EI, Malietzis GA, Pappas G. Comparison of PubMed, Scopus, Web of Science, and Google Scholar: strengths and weaknesses. FASEB J. 2008;22(2):338–42. doi: 10.1096/fj.07-9492LSF.
• 31. Martin-Martin A, Orduna-Malea E, Harzing A-W, López-Cózar ED. Can we use Google Scholar to identify highly-cited documents? J Informetr. 2017;11(1):152–63.
• 32. Zdravkovic M, Osinova D, Brull SJ, Prielipp RC, Simoes CM, Berger-Estilita J, et al. Perceptions of gender equity in departmental leadership, research opportunities, and clinical work attitudes: an international survey of 11 781 anaesthesiologists. Br J Anaesth. 2020.
• 33. Marshall KH, D’Udekem Y, Sholler GF, Opotowsky AR, Costa DS, Sharpe L, et al. Health-Related Quality of Life in Children, Adolescents, and Adults With a Fontan Circulation: A Meta-Analysis. J Am Heart Assoc. 2020;9(6):e014172.
• 34. Brennan ME, Gormally JF, Butow P, Boyle FM, Spillane AJ. Survivorship care plans in cancer: a systematic review of care plan outcomes. Br J Cancer. 2014;111(10):1899–908. doi: 10.1038/bjc.2014.505.
• 35. Durlak JA. How to Select, Calculate, and Interpret Effect Sizes. J Pediatr Psychol. 2009;34(9):917–28. doi: 10.1093/jpepsy/jsp004.
• 36. Sawilowsky SS. New effect size rules of thumb. J Mod Appl Stat Methods. 2009;8(2):26.
• 37. Glen S. Hedges’ g: Definition, Formula [accessed August 11th, 2020]. https://www.statisticshowto.com/hedges-g/.
• 38. Zhang Y, Xiao M, Zhang S, Xia P, Cao W, Jiang W, et al. Coagulopathy and Antiphospholipid Antibodies in Patients with Covid-19. N Engl J Med. 2020;382(17):e38. doi: 10.1056/NEJMc2007575.
• 39. Alwan NA, Bhopal R, Burgess RA, Colburn T, Cuevas LE, Smith GD, et al. Evidence informing the UK’s COVID-19 public health response must be transparent. Lancet. 2020;395(10229):1036–7. doi: 10.1016/S0140-6736(20)30667-X.
• 40. COVID-19 Clinical Research Coalition (nick.white@covid19crc.org). Global coalition to accelerate COVID-19 clinical research in resource-limited settings. Lancet. 2020;395(10233):1322–5. doi: 10.1016/S0140-6736(20)30798-4.
• 41. Cao B, Wang Y, Wen D, Liu W, Wang J, Fan G, et al. A Trial of Lopinavir-Ritonavir in Adults Hospitalized with Severe Covid-19. N Engl J Med. 2020. doi: 10.1056/NEJMoa2001282.
• 42. Grasselli G, Zangrillo A, Zanella A, Antonelli M, Cabrini L, Castelli A, et al. Baseline Characteristics and Outcomes of 1591 Patients Infected With SARS-CoV-2 Admitted to ICUs of the Lombardy Region, Italy. JAMA. 2020. doi: 10.1001/jama.2020.5394.
• 43. Grein J, Ohmagari N, Shin D, Diaz G, Asperges E, Castagna A, et al. Compassionate Use of Remdesivir for Patients with Severe Covid-19. N Engl J Med. 2020. doi: 10.1056/NEJMoa2007016.
• 44. Zhou F, Yu T, Du R, Fan G, Liu Y, Liu Z, et al. Clinical course and risk factors for mortality of adult inpatients with COVID-19 in Wuhan, China: a retrospective cohort study. Lancet. 2020;395(10229):1054–62. doi: 10.1016/S0140-6736(20)30566-3.
• 45. Bhatraju PK, Ghassemieh BJ, Nichols M, Kim R, Jerome KR, Nalla AK, et al. Covid-19 in Critically Ill Patients in the Seattle Region—Case Series. N Engl J Med. 2020;382(21):2012–22. doi: 10.1056/NEJMoa2004500.
• 46. Ghinai I, McPherson TD, Hunter JC, Kirking HL, Christiansen D, Joshi K, et al. First known person-to-person transmission of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) in the USA. Lancet. 2020;395(10230):1137–44.
• 47. Li Q, Guan X, Wu P, Wang X, Zhou L, Tong Y, et al. Early Transmission Dynamics in Wuhan, China, of Novel Coronavirus-Infected Pneumonia. N Engl J Med. 2020;382(13):1199–207. doi: 10.1056/NEJMoa2001316.
• 48. McMichael TM, Currie DW, Clark S, Pogosjans S, Kay M, Schwartz NG, et al. Epidemiology of Covid-19 in a Long-Term Care Facility in King County, Washington. N Engl J Med. 2020.
• 49. Pan A, Liu L, Wang C, Guo H, Hao X, Wang Q, et al. Association of Public Health Interventions With the Epidemiology of the COVID-19 Outbreak in Wuhan, China. JAMA. 2020. doi: 10.1001/jama.2020.6130.
• 50. Pung R, Chiew CJ, Young BE, Chin S, Chen MI, Clapham HE, et al. Investigation of three clusters of COVID-19 in Singapore: implications for surveillance and response measures. Lancet. 2020;395(10229):1039–46. doi: 10.1016/S0140-6736(20)30528-6.
• 51. Gilbert M, Pullano G, Pinotti F, Valdano E, Poletto C, Boelle PY, et al. Preparedness and vulnerability of African countries against importations of COVID-19: a modelling study. Lancet. 2020;395(10227):871–7. doi: 10.1016/S0140-6736(20)30411-6.
• 52. Kandel N, Chungong S, Omaar A, Xing J. Health security capacities in the context of COVID-19 outbreak: an analysis of International Health Regulations annual report data from 182 countries. Lancet. 2020;395(10229):1047–53. doi: 10.1016/S0140-6736(20)30553-5.
• 53. Leung K, Wu JT, Liu D, Leung GM. First-wave COVID-19 transmissibility and severity in China outside Hubei after control measures, and second-wave scenario planning: a modelling impact assessment. Lancet. 2020;395(10233):1382–93. doi: 10.1016/S0140-6736(20)30746-7.
• 54. Grein J, Ohmagari N, Shin D, Diaz G, Asperges E, Castagna A, et al. Compassionate Use of Remdesivir for Patients with Severe Covid-19. N Engl J Med. 2020. doi: 10.1056/NEJMoa2007016.
• 55. Bhatraju PK, Ghassemieh BJ, Nichols M, Kim R, Jerome KR, Nalla AK, et al. Covid-19 in Critically Ill Patients in the Seattle Region—Case Series. N Engl J Med. 2020. doi: 10.1056/NEJMoa2004500.
• 56. Pung R, Chiew CJ, Young BE, Chin S, Chen MI, Clapham HE, et al. Investigation of three clusters of COVID-19 in Singapore: implications for surveillance and response measures. Lancet. 2020;395(10229):1039–46. doi: 10.1016/S0140-6736(20)30528-6.
• 57. Jüni P, Witschi A, Bloch R, Egger M. The Hazards of Scoring the Quality of Clinical Trials for Meta-analysis. JAMA. 1999;282(11):1054–60. doi: 10.1001/jama.282.11.1054.
• 58. Shamoo AE. Validate the integrity of research data on COVID 19. Account Res. 2020;27(6):325–6. doi: 10.1080/08989621.2020.1787838.
• 59. Kalil AC. Treating COVID-19—Off-Label Drug Use, Compassionate Use, and Randomized Clinical Trials During Pandemics. JAMA. 2020.
• 60. Stockman LJ, Bellamy R, Garner P. SARS: systematic review of treatment effects. PLoS Med. 2006;3(9):e343. doi: 10.1371/journal.pmed.0030343.
• 61. Zagury-Orly I, Schwartzstein RM. Covid-19—A Reminder to Reason. N Engl J Med. 2020. doi: 10.1056/NEJMp2009405.
• 62. Maguire BJ, Guerin PJ. A living systematic review protocol for COVID-19 clinical trial registrations. Wellcome Open Res. 2020;5:60. doi: 10.12688/wellcomeopenres.15821.1.
• 63. Lipsitch M, Swerdlow DL, Finelli L. Defining the Epidemiology of Covid-19—Studies Needed. N Engl J Med. 2020;382(13):1194–6. doi: 10.1056/NEJMp2002125.

Decision Letter 0

Bart Ferket

Transfer Alert

This paper was transferred from another journal. As a result, its full editorial history (including decision letters, peer reviews and author responses) may not be present.

30 Jul 2020

PONE-D-20-14688

Scientific Quality of COVID-19 and SARS CoV-2 Publications in the Highest Impact Medical Journals during the Early Phase of the Pandemic: A Review and Case-Control

PLOS ONE

Dear Dr. Berger,

Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.

Could you please pay attention to the following comments made by the reviewers:

  1. Per reviewer 1's suggestion, specify the 3 journals you are looking at and/or mention that these are the top 3 regarding IF within the general medicine segment.

  2. Better explain article types of each journal by providing definitions and reasons for in/exclusion. The reader should not need to search on journal websites what each article type entails. You could also include definitions for article types in the legend of Figure 1.

  3. For the level of evidence assessment, please do an additional analysis as suggested by Reviewer 1 = just look at the original research papers and see if the level of evidence is different for these article types.

  4. Consider adding all research letters for the QUALSYST analysis (not only the JAMA research letters). I agree with Reviewer 1 that correspondence/research letters contain new findings and are currently oftentimes full manuscripts presented in shorter form. Moreover, additional tables and figures are usually added as supplemental material in NEJM, Lancet and JAMA. Please provide a rationale why this would not be possible when applicable.

  5. Make sure to only include papers for which the Levels of Evidence pyramid can be used. E.g., exclude the JAMA "Medical News and Perspectives" and "A Piece of My Mind" papers. These are not opinion papers.

  6. Address using consecutive controls vs historic controls, together with the expected difference.

  7. Reviewer 2 suggests a lot of excellent textual revisions, which should be addressed.

Please submit your revised manuscript by Sep 13 2020 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

Please include the following items when submitting your revised manuscript:

  • A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'.

  • A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'.

  • An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'.

If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter.

If applicable, we recommend that you deposit your laboratory protocols in protocols.io to enhance the reproducibility of your results. Protocols.io assigns your protocol its own identifier (DOI) so that it can be cited independently in the future. For instructions see: http://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols

We look forward to receiving your revised manuscript.

Kind regards,

Bart Ferket

Academic Editor

PLOS ONE

Journal Requirements:

When submitting your revision, we need you to address these additional requirements.

1. Please ensure that your manuscript meets PLOS ONE's style requirements, including those for file naming. The PLOS ONE style templates can be found at

https://journals.plos.org/plosone/s/file?id=wjVg/PLOSOne_formatting_sample_main_body.pdf and

https://journals.plos.org/plosone/s/file?id=ba62/PLOSOne_formatting_sample_title_authors_affiliations.pdf

2. Please provide additional information concerning the qualitative analyses performed. For example, if any rubric, theoretical framework or protocol was followed please describe this in sufficient detail for replication.

3. Please ensure that "systematic review" is incorporated into the title per PLOS ONE submission guidelines.

4. Please amend either the title on the online submission form (via Edit Submission) or the title in the manuscript so that they are identical.

5.  We note that you have indicated that data from this study are available upon request. PLOS only allows data to be available upon request if there are legal or ethical restrictions on sharing data publicly. For information on unacceptable data access restrictions, please see http://journals.plos.org/plosone/s/data-availability#loc-unacceptable-data-access-restrictions.

In your revised cover letter, please address the following prompts:

a) If there are ethical or legal restrictions on sharing a de-identified data set, please explain them in detail (e.g., data contain potentially identifying or sensitive patient information) and who has imposed them (e.g., an ethics committee). Please also provide contact information for a data access committee, ethics committee, or other institutional body to which data requests may be sent.

b) If there are no restrictions, please upload the minimal anonymized data set necessary to replicate your study findings as either Supporting Information files or to a stable, public repository and provide us with the relevant URLs, DOIs, or accession numbers. Please see http://www.bmj.com/content/340/bmj.c181.long for guidelines on how to de-identify and prepare clinical data for publication. For a list of acceptable repositories, please see http://journals.plos.org/plosone/s/data-availability#loc-recommended-repositories.

We will update your Data Availability statement on your behalf to reflect the information you provide.

6.  Thank you for stating the following in the Competing Interests section:

"Marko Zdravkovic, Bogdan Zdravkovic and Joana Berger-Estilita  have declared that no competing interests exist.

David Berger has read the journal's policy and the authors of this manuscript have the following competing interests:

The Department of Intensive Care Medicine at Inselspital has, or has had in the past, research contracts with Abionic SA, AVA AG, CSEM SA, Cube Dx GmbH, Cyto Sorbents Europe GmbH, Edwards Lifesciences LLC, GE Healthcare, ImaCor Inc., MedImmune LLC, Orion Corporation, Phagenesis Ltd. and research & development/consulting contracts with Edwards Lifesciences LLC, Nestec SA, Wyss Zurich. The money was paid into a departmental fund; Dr Berger received no personal financial gain.

The Department of Intensive Care Medicine has received unrestricted educational grants from the following organizations for organizing a quarterly postgraduate educational symposium, the Berner Forum for Intensive Care (until 2015): Abbott AG, Anandic Medical Systems, Astellas, AstraZeneca, Bard Medica SA, Baxter, B | Braun, CSL Behring, Covidien, Fresenius Kabi, GSK, Lilly, Maquet, MSD, Novartis, Nycomed, Orion Pharma, Pfizer, Pierre Fabre Pharma AG (formerly known as RobaPharm).

The Department of Intensive Care Medicine has received unrestricted educational grants from the following organizations for organizing bi-annual postgraduate courses in the fields of critical care ultrasound, management of ECMO and mechanical ventilation: Abbott AG, Anandic Medical Systems, Bard Medica SA., Bracco, Dräger Schweiz AG, Edwards Lifesciences AG, Fresenius Kabi (Schweiz) AG, Getinge Group Maquet AG, Hamilton Medical AG, Pierre Fabre Pharma AG (formerly known as RobaPharm), PanGas AG Healthcare, Pfizer AG, Orion Pharma, Teleflex Medical GmbH.".

 

Please confirm that this does not alter your adherence to all PLOS ONE policies on sharing data and materials, by including the following statement: “This does not alter our adherence to PLOS ONE policies on sharing data and materials.” (as detailed online in our guide for authors http://journals.plos.org/plosone/s/competing-interests). If there are restrictions on sharing of data and/or materials, please state these. Please note that we cannot proceed with consideration of your article until this information has been declared.

* Please include your updated Competing Interests statement in your cover letter; we will change the online submission form on your behalf.

Please know it is PLOS ONE policy for corresponding authors to declare, on behalf of all authors, all potential competing interests for the purposes of transparency. PLOS defines a competing interest as anything that interferes with, or could reasonably be perceived as interfering with, the full and objective presentation, peer review, editorial decision-making, or publication of research or non-research articles submitted to one of the journals. Competing interests can be financial or non-financial, professional, or personal. Competing interests can arise in relationship to an organization or another person. Please follow this link to our website for more details on competing interests: http://journals.plos.org/plosone/s/competing-interests


Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. Is the manuscript technically sound, and do the data support the conclusions?

The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #1: Yes

Reviewer #2: Partly

**********

2. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: Yes

Reviewer #2: Yes

**********

3. Have the authors made all data underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: No

Reviewer #2: Yes

**********

4. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: Yes

Reviewer #2: Yes

**********

5. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: Many thanks for the opportunity to review this paper. Overall, I think this is a good paper and solid analysis that supports the discussion and conclusion provided. I do, however, identify some aspects of the methods and conclusions that require attention or reconsideration.

Was the impact factor >50 prespecified, or did that just happen to fit the 3 journals the authors wanted to look at? I can’t imagine that the authors didn’t know exactly what journals would be included by setting the cutoff that high. Since it doesn’t appear this analysis was prespecified, just be upfront about the journals you wanted to look at and why.

Page 5 Lines 102-106: This is not presented clearly. For instance, the authors say they exclude correspondence but then (rightfully, as it may include original results) include some correspondence in their final sample. Please add a bit more detail about what was included and excluded, and why.

The Oxford Quality Rating Scheme is a valid choice for the purposes described, but I am wary of one aspect of its use: the inclusion of editorials, which I assume are largely categorised as the lowest level of evidence (“expert opinion”). I fear this might be skewing the findings quite a bit, and I’m not sure that top journals publishing editorials on the all-consuming topic of the moment, in its early stages when original research is going to be sparse, is indicative of compromising their standards of evidence (the authors allude to this in their discussion). I would be interested in a sensitivity analysis that excluded these, to see how it might impact the robustness of the findings.

The authors exclude JAMA Research Letters from the “Original Article” count but I think this is a mistake. These are usually original research simply presented in a shorter format for manuscripts that don’t require a full article of detail. However, the level of detail they usually describe should be enough to perform the assessments. If they didn’t, that itself might be an interesting finding on the format (although I don’t think that is the case).

Medical news and perspectives from JAMA are included in the quality assessments. This is odd to me since, as news pieces, they do not really fit as something appropriate to be examined by a “levels of evidence” tool. News is typically not held to the same standard as original research.

The QUALSYST checklist appears to be an acceptable and general enough tool for the purposes described.

Page 6 Lines 140-142: Small detail but were disagreements resolved through consensus with the full group or between each 2-author pair?

Why, for the “Qualitative analysis of COVID-19 original articles” did the authors rely on a personal subjective interpretation of article quality rather than existing methods, like say the Cochrane risk of bias tools (even if modified or slimmed down), to assess article quality? These cover many of the same areas examined and have been created for a wide array of research types.

The citation frequency analysis is the part of the study that I feel the least confident drawing any meaningful conclusions from as a reader. COVID articles are extremely prevalent and concentrated in a single area of great interest, while the rest of the corpus is likely spread out over the entire rest of the field of biomedical sciences. With a preponderance of research being published in these journals dealing with COVID specifically, a high citation rate feels natural rather than indicative of any larger issue. Similarly, more than usual, I think the speed at which the academic community is presenting hypotheses and exploratory research on an entirely novel disease, and focusing intense public scrutiny on notable findings, may be leading to a situation in which the context around a citation is crucial to understanding this research question. I could reasonably imagine that a given study or article is far more likely to be cited alongside criticism, or in some negative light, than usual in the context of COVID research. This might be something the authors could explore given the relatively small number of citations (52) included in their analysis.

Please make sure you discuss all your reported outcomes in the methodology section. For instance, you report the number of authors without ever mentioning that this was being examined in the methods section.

Would Hedges’ g be a more accurate estimator of effect size in this case than Cohen’s d, given the unequal sample sizes? Though since the “sample” is complete for articles from these journals for this time period, perhaps Cohen’s d is appropriate? I’m not entirely sure one way or the other, but it might be worth the authors justifying their use of one vs the other so it is clear to the reader.

The authors should report the actual mean total and summary percentage scores (and resulting error bars) and not just refer to Figure 2 from which it is difficult to tell the exact numbers.

It is good that the authors are willing to make their data available, however I don’t see any reason the underlying data for this study couldn’t be proactively made available on an appropriate repository like OSF or Figshare (or any other appropriate public repository) rather than from the author. I would think this would be particularly important for this research as others may want to pick up and expand on these methods as the pandemic matures. Requiring interested parties to contact the author is simply an additional barrier to the benefits of data sharing and availability. I would ask the authors to consider taking this proactive step.

The authors recognize what I believe to be the biggest issue with this paper which is timeliness. Obviously there is not much to be done to combat this other than a potential update of the results before submission but this would be a lot to ask. However, I do agree this is a major limitation of these findings in terms of overall interest to the community. That said, I think even more value can come from sharing the framework for the methods to potentially be expanded and applied at other points in the epidemic. I think there is some interesting potential for this that the authors would do well to acknowledge.

The authors appropriately acknowledge the inherent limitations of choosing any single assessment tool for paper quality and I think their choice to use QUALSYST is fine. However, the recent Surgisphere scandal brings this limitation to further light. I’m fairly certain this methodology would have assessed both the NEJM and Lancet Surgisphere papers that were retracted as high-quality research despite failing on some general quality measures. Things like the availability of data and materials are metrics that aren’t included in the scales used that may also be important indicators of quality. To be clear, I do not think it would be reasonable for the authors to have scrutinized the papers included to a level that they would have noticed the irregularities that tipped off the community to the issues with the Surgisphere Lancet paper. That level of detailed scrutiny was not within the scope of this paper. However, there are broad indicators largely related to best practice in Open Science (and I’m sure in some other areas as well) that impact their ability to assess quality. Just an interesting case study to consider in grounding this limitation.

Reviewer #2: I have provided a review of the manuscript "Scientific Quality of COVID-19 and SARS CoV-2 Publications in the Highest Impact Medical Journals during the Early Phase of the Pandemic: A Review and Case-Control”

The authors provide an interesting manuscript that can shed light on the quality of COVID-19-related publications, which could have been affected by the rapidity and high demands of this pandemic.

INTRODUCTION

Line 65 – 68. It may be worthwhile to add the example of the retraction of the papers published in The Lancet and the New England Journal of Medicine regarding the use of the COVID-19 treatments chloroquine and hydroxychloroquine: https://www.thelancet.com/journals/lancet/article/PIIS0140-6736(20)31180-6/fulltext. This information can strengthen the rationale of the study.

Lines 86-87: “We hypothesized that the quality of recent publications on COVID-19 in medical journals with impact factor >50 is lower than for nonCOVID-19 articles published during the same time period.”

I wonder whether considering the same period as the time span for the non-COVID publications is adequate. COVID-19 has affected not only the publications related to it; all the other specialties have been altered and affected as well. Priorities changed during the first 6 months of 2020, and the publication process was sped up for COVID-19 publications but slowed for the non-COVID-19 areas. Would it have been more appropriate to consider the same months for the non-COVID publications but in 2019, when COVID-19 could not have affected their quality and the publication process?

METHODS

Lines 91 – 92. “This report follows the applicable STROBE guidelines for case-control studies and PRISMA guidelines for systematic reviews”. Could you please define the study design? Is this an observational study (i.e., case-control or even cross-sectional) or a review? Why use the PRISMA guideline if this is not properly a systematic review? This is not a systematic review but a case-control study, and thus observational.

Lines 102 – 104. “The resulting publications were stratified into COVID-19–related and nonCOVID-19–related. We matched the nonCOVID-19 publications with COVID-19 publications according to article types within each journal…”.

As already discussed above, wouldn’t a historical control, i.e., non-COVID-19 publications from March 12 to April 12, 2019, be more appropriate and unbiased? The publication process for non-COVID-19 work was altered by COVID-19 priorities. For example, peer review was slower for non-COVID-19 than for COVID-19 publications, and attention was directed more to COVID-19 publications than to non-COVID ones.

Table 1. Table 1 should be included in the Results section, not in the Methods, since it reports characteristics such as the level of evidence of the included studies.

Lines 126-127. “Quantitative appraisal” and lines “Qualitative analysis”. Reading these two sections, I do not find a difference in terms of quantitative or qualitative appraisal. The checklist the authors used to evaluate methodological quality expresses a qualitative approach, not a quantitative one. Quantitative appraisal or quantitative synthesis usually refers to a meta-analysis or any intended statistical method for combining results across studies (e.g., subgroup analysis, meta-regression, sensitivity analysis), including methods for assessing heterogeneity.

In this case, it may be more informative to write about “qualitative appraisal of quantitative research” or simply report a single paragraph with the qualitative appraisal. Then, what is reported in lines 144–147 (i.e., funding or missing data) can be another way to inform qualitatively about the included research.

Line 174: “Quantitative appraisal of the quality of the original articles is..” I would change this term to “qualitative assessment of original articles”. Quality is implicitly considered in “qualitative assessment”.

Line 232: “favoring COVID-19 original research papers”. I suppose that this might be obvious considering the period of high demand for COVID-19 answers in the international scientific community. A comparison with non-COVID-19 research in a period not affected by COVID-19 could have been more appropriate for detecting the real difference in the number of citations.

Lines 239-240: “Most of these studies had limitations in terms of missing data or under-reporting. The randomized trial was not blinded.” This sentence confused me. What did the authors consider as quantitative and what as qualitative assessment? Elements considered in the “qualitative assessment” are related to the methodological quality of the study designs. For example, blinding in an RCT is an item of the Cochrane Risk of Bias tool, whose aim is to assess the internal validity of a randomized controlled trial, i.e., the risk of bias in terms of methodological quality. https://handbook-5-1.cochrane.org/chapter_8/8_assessing_risk_of_bias_in_included_studies.htm

DISCUSSION

The Discussion is too limited and needs to be enriched and expanded. Several issues can be of interest; some have already arisen in the comments above:

- Could the assessment of nonCOVID-19 publications in a different period have changed the results?

- The period is highly influenced by the changed research priorities related to COVID-19; the efforts (money, time, etc.) of the whole international scientific community have been dedicated to COVID-19.

- Could the journal peer review process have affected the quality of the published articles? In these six months of the SARS-CoV-2 pandemic, even the attention of editors and reviewers was directed at speeding up COVID-19 publications as much as possible: did the authors discuss this? Moreover, many preprints on COVID-19 exist; could these influence the publication of COVID-19 research?

**********

6. PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: Yes: Nicholas J. DeVito

Reviewer #2: No

[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.]

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email PLOS at figures@plos.org. Please note that Supporting Information files do not need this step.

PLoS One. 2020 Nov 5;15(11):e0241826. doi: 10.1371/journal.pone.0241826.r002

Author response to Decision Letter 0


21 Aug 2020

Reply to reviewers for PONE-D-20-14688-R1:

Scientific Quality of COVID-19 and SARS CoV-2 Publications in the Highest Impact Medical Journals during the Early Phase of the Pandemic: A Review and Case-Control study

Editor’s comments

Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.

Could you please pay attention to the following comments made by the reviewers:

1. Per reviewer 1's suggestion, specify the 3 journals you are looking at and/or mention that these are the top 3 regarding IF within the general medicine segment.

Please refer to our answer A-1-1

2. Better explain article types of each journal by providing definitions and reasons for in/exclusion. The reader should not need to search on journal websites what each article type entails. You could also include definitions for article types in the legend of Figure 1.

Please refer to our answer A-1-2

3. For the level of evidence assessment, please do an additional analysis as suggested by Reviewer 1: look at just the original research papers and see whether the level of evidence is different for these article types.

Please refer to our Answer A-1-3.

4. Consider adding all research letters for the QUALSYST analysis (not only the JAMA research letters). I agree with Reviewer 1 that correspondence/research letters contain new findings and are currently often full manuscripts presented in shorter form. Moreover, additional tables and figures are usually added as supplemental material in NEJM, Lancet and JAMA. Please provide a rationale why this would not be possible when applicable.

Please refer to our Answer A-1-4.

5. Make sure to only include papers for which the Levels of Evidence pyramid can be used, e.g., exclude the JAMA “medical news and perspectives” and “piece of my mind” papers. These are not opinion papers.

Please refer to our Answer A-1-5.

6. Address using consecutive controls vs historic controls, together with the expected difference.

Please refer to our Answer A-2-2. We think that selecting a different time period for assessment of quality would introduce a selection bias, since conditions are not the same. We have added a historic citation count.

7. Reviewer 2 suggests a lot of excellent textual revisions, which should be addressed.

Please refer to our last answer in the reply letter and to the expanded discussion.

Reviewer #1:

Many thanks for the opportunity to review this paper. Overall, I think this is a good paper and solid analysis that supports the discussion and conclusion provided. I do, however, identify some aspects of the methods and conclusions that require attention or reconsideration.

We thank the reviewer for the appreciation of our work.

Q- 1-1: Was the impact factor >50 prespecified or did that just happen to fit the 3 journals the authors wanted to look at? I can’t imagine that the authors didn’t know exactly what journals would be included by setting the cutoff that high. Since it doesn’t appear this analysis was prespecified just be upfront about the journals you wanted to look at and why.

A-1-1: We chose the three journals, as you rightfully say, based on their reputation and influence rather than on a prespecified impact factor.

The Introduction and Methods were adapted as follows:

“To date, the quality of COVID-19 publications in the top three general medical journals by impact factor (i. e. the New England Journal of Medicine, The Lancet and The Journal of the American Medical Association, represented by an impact factor > 50 for all) has not been formally assessed. We hypothesized that the quality of recent publications on COVID-19 in the three most influential medical journals is lower than for nonCOVID-19 articles published during the same time period.”

“For the time period of March 12 to April 12, 2020 (i.e., during the early outbreak phase of the COVID-19 pandemic), we identified all of the publications from the top three general medical journals by impact factor (the New England Journal of Medicine (NEJM), the Journal of the American Medical Association (JAMA), and The Lancet).”

Q-1-2: Page 5 Lines 102-106: This is not presented clearly. For instance the authors say they exclude correspondence but then (rightfully as they may include original results) include some correspondence in their final sample. Please add a bit more detail about what was included and excluded and why.

A-1-2: We excluded only correspondence letters on previously published papers, which we believe is now clear from the text and Figure 1.

“Secondary studies, correspondence letters on previously published articles, unauthored publications, and specific article types not matching any of the six categories on the levels of the evidence pyramid(1-3) (e.g., infographic, erratum) were excluded (Figure 1).”

Q-1-3: The Oxford Quality Rating Scheme is a valid choice for the purposes described, but I am wary of one aspect of its use: the inclusion of editorials, which I assume are largely categorised as the lowest level of evidence (“expert opinion”). I fear this might be skewing the findings quite a bit and I’m not sure top journals publishing editorials on the all-consuming topic of the moment, in its early stages when original research is going to be sparse, is indicative of compromising their standards evidence (the authors allude to this in their discussion). I would be interested in a sensitivity analysis that excluded these and seeing how it might impact the robustness of the findings.

A-1-3: This is a valid argument, but we would like to draw your attention to the numbers: there are 12 editorials in the COVID group included in the analysis (out of 155 = 7.7%), and there are 18 editorials in the nonCOVID group (out of 130 = 13.8%). Given these numbers, we think that the editorials are not likely to skew the data in the direction mentioned above. Also, we wanted to compare all the levels of evidence represented – including the opinion papers in the initial stage as evident from Table 1 and as explained in the methods section:

"We matched the nonCOVID-19 publications with COVID-19 publications according to article types within each journal, with the exclusion of nonmatching article types."

Then, in the second stage, we analysed in greater detail only the original research papers, which are supposed to be of the highest quality, and again found major differences between the two groups of papers.

We have now added the requested sensitivity analysis in Table 2, which includes only full original articles. It clearly confirms our hypothesis. The results section was adapted accordingly.

“When comparing only the original articles on the levels of evidence pyramid (Table 2), the nonCOVID-19 publications were also associated with higher quality (P <0.001; Chi squared), with a strong association measure (Cramer's V: 0.641, Table 2). When comparing the higher evidence group to the lower evidence group, the COVID-19 original articles were 26-fold more likely (i.e., odds ratio) to be in the lower evidence group (95% CI: 5.4–120; P <0.001).”
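For readers who wish to reproduce this kind of sensitivity analysis, a minimal sketch in Python is given below. The 2x2 counts are placeholders for illustration, not the study data, and the Wald-type interval for the odds ratio is one common choice; the manuscript does not state which CI method was used.

import numpy as np
from scipy.stats import chi2_contingency

# Hypothetical 2x2 table (placeholder counts, not the study data):
# rows = COVID-19, nonCOVID-19 original articles;
# columns = higher-evidence, lower-evidence group.
table = np.array([[3, 10],
                  [45, 7]])

chi2, p, dof, expected = chi2_contingency(table, correction=False)
n = table.sum()
cramers_v = np.sqrt(chi2 / (n * (min(table.shape) - 1)))  # reduces to sqrt(chi2/n) for a 2x2 table

# Odds ratio for being in the lower-evidence group (COVID-19 vs nonCOVID-19),
# with a Wald-type 95% confidence interval computed on the log scale.
a, b = table[0]
c, d = table[1]
odds_ratio = (b * c) / (a * d)
se_log_or = np.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
ci_low, ci_high = np.exp(np.log(odds_ratio) + np.array([-1.96, 1.96]) * se_log_or)
print(f"Cramer's V = {cramers_v:.3f}; OR = {odds_ratio:.1f} (95% CI {ci_low:.1f} to {ci_high:.1f}); p = {p:.2g}")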

Q-1-4: The authors exclude JAMA Research Letters from the “Original Article” count but I think this is a mistake. These are usually original research simply presented in a shorter format for manuscripts that don’t require a full article of detail. However, the level of detail they usually describe should be enough to perform the assessments. If they didn’t, that itself might be an interesting finding on the format (although I don’t think that is the case).

A-1-4: This is an interesting point. We discussed it within the group in the planning phase and decided not to include research letters in the Original papers group because: 1) these were judged by the editors/reviewers not to meet the criteria for the Original Article category (a separate article type in JAMA); 2) there were 8 COVID research letters and 3 nonCOVID research letters published in JAMA, which would clearly skew the quality of original papers; and 3) JAMA is the only journal with this category of papers, whereas all three journals have a category of “Original research”.

We have now provided, in the main text of the Results, a separate sensitivity analysis for original papers including the 11 JAMA research letters.

“The mean total scores in the COVID-19 (n=21) and nonCOVID-19 (n=55) groups were 12.3 (95% CI 10.6–14.0) and 23.3 (95% CI 22.2–24.2) respectively, and the mean summary percentage scores were 72.6% (95% CI 66.1–79.1) and 90.9% (95% CI 88.9–92.9), respectively. The mean total score and the mean summary percentage scores were significantly different between the groups, favoring the nonCOVID-19 original articles (P <0.001 for both; Welch's t-test; Hedges’ g = 2.98 and 1.87, respectively). For the total scores, the difference between the means was 11.0 (95% CI 9.1–12.9; P <0.001), and for the summary percentage scores, 18.3% (95% CI 11.6%–25.0%; P <0.001).”
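As a minimal sketch of the reported comparison, assuming two vectors of QUALSYST total scores (the values below are placeholders, not the study data), Welch's t-test and a confidence interval for the mean difference based on the Welch-Satterthwaite degrees of freedom could be computed as follows:

import numpy as np
from scipy import stats

# Placeholder QUALSYST total scores (maximum 28); not the study data.
covid = np.array([10.0, 12, 14, 11, 13, 15, 9, 12])
noncovid = np.array([23.0, 24, 22, 25, 23, 21, 24, 23, 22, 24])

t, p = stats.ttest_ind(covid, noncovid, equal_var=False)  # Welch's t-test

# 95% CI for the mean difference, using Welch-Satterthwaite degrees of freedom.
v1 = covid.var(ddof=1) / len(covid)
v2 = noncovid.var(ddof=1) / len(noncovid)
se = np.sqrt(v1 + v2)
df = (v1 + v2) ** 2 / (v1 ** 2 / (len(covid) - 1) + v2 ** 2 / (len(noncovid) - 1))
diff = noncovid.mean() - covid.mean()
ci = diff + np.array([-1.0, 1.0]) * stats.t.ppf(0.975, df) * se
print(f"mean difference = {diff:.1f} (95% CI {ci[0]:.1f} to {ci[1]:.1f}); p = {p:.2g}")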

Q-1-5: Medical news and perspectives from JAMA are included in the quality assessments. This is odd to me since, as news pieces, they do not really fit as something appropriate to be examined by a “levels of evidence” tool. News is typically not held to the same standard as original research.

A-1-5: We followed the suggestion and adapted the data according to the reviewer’s comment. Figure 1 and the appropriate positions in the text have been edited accordingly.

Q-1-6: The QUALSYST checklist appears to be an acceptable and general enough tool for the purposes described.

Page 6 Lines 140-142: Small detail but were disagreements resolved through consensus with the full group or between each 2-author pair?

A-1-6a: This has now been clarified:

“Disagreements between the reviewers (defined as a difference of >2 in the total score or >10% in the summary percentage scores) were resolved through one round of discussion within each 2-author pair.”

Why, for the “Qualitative analysis of COVID-19 original articles” did the authors rely on a personal subjective interpretation of article quality rather than existing methods, like say the Cochrane risk of bias tools (even if modified or slimmed down), to assess article quality? These cover many of the same areas examined and have been created for a wide array of research types.

A-1-6b: Thank you for this insightful comment. In fact, there seems to be a confusion between qualitative analysis as a methodology and analysis of the quality of the papers. We used the QUALSYST checklist as a tool to characterize the quality of the papers. This tool is similar (in terms of its function) to the Cochrane Risk of Bias tool. The reasons for choosing the QUALSYST have been described in the paper.

Qualitative methodology, on the other hand, uses subjective judgment to analyze a value based on non-quantifiable information. Qualitative analysis is the analysis of qualitative data, such as text data from interview transcripts. Unlike quantitative analysis, which is statistics-driven and largely independent of the researcher, qualitative analysis is heavily dependent on the researcher’s analytic and integrative skills and personal knowledge of the social context where the data are collected. The emphasis in qualitative analysis is “sense making”, or understanding a phenomenon, rather than predicting or explaining. This methodology is frequently used in the social sciences in combination with quantitative analysis, the so-called mixed-methods methodology. We decided to perform a qualitative analysis in our work to allow for data triangulation, thereby strengthening our hypothesis(4). We have adapted the method section to clarify these differences.

“Mixed methods design

We performed a multi-step 360-degree assessment of the studies. It consisted of their classification according to level of evidence, a quantitative appraisal of their methodological quality using a validated tool, and a qualitative analysis of the strengths and weaknesses of the COVID-19 publications, as is often used in the social sciences(4). Early citation frequencies of the original articles were determined.”

Please also refer to reviewer comment Q-2-5 and Q-2-6.

Q-1-7: The citation frequency analysis is the part of the study that I feel the least confident drawing any meaningful conclusions from as a reader. COVID articles are extremely prevalent and concentrated in a single area of great interest while the rest of the corpus is likely spread out over the entire rest of the field of biomedical sciences. With a preponderance of research being published in these journals dealing with COVID specifically, a high citation rate feels natural rather than indicative of any larger issue. Similarly, more than usual, I think the speed at which the academic community is presenting hypothesis and exploratory research on an entirely novel disease, and focusing intense public scrutiny on notable findings, may be leading to a situation in which the context around a citation is crucial to understanding this research question. I could reasonably imagine that a given study or article is far more likely to be cited alongside criticism, or in some negative light, than usual in the context of COVID research. This might be something the authors could explore given the relatively small amount of citations (52) included in their analysis.

A-1-7: Thank you for this comment. We have followed this up for a month and provided a new table here. Your suggestion of checking the content of the citing papers is excellent; the problem is that the total number of citations is not small: the median for the 13 COVID papers was 149 on May 25, and the total number of citations to check on that date would be 7,500. This would be an interesting study in itself, given the amount of text to be screened for the context of citations, especially for COVID publications. Please appreciate that this is beyond the capacity of our small group.

Q-1-8: Please make sure you discuss all your reported outcomes in the methodology section. For instance, you report the amount of authors without ever mentioning this was being examined in the methods section.

A-1-8: We thank you for making us aware of this omission.

The method section has been adapted accordingly:

“The number of authors per publication was counted manually.”

Q-1-9: Would Hedges’ g be a more accurate estimator of effect size in this case than Cohen’s d, given the unequal sample sizes? Though since the “sample” is complete for articles from these journals for this time period, perhaps Cohen’s d is appropriate? I’m not entirely sure one way or the other, but it might be worth the authors justifying their use of one vs the other so it is clear to the reader.

A-1-9: We thank you for this suggestion, as Hedges’ g is indeed more robust for small sample sizes. The statistical section and results have been updated accordingly.

“Hedges’ g is used as the effect size measure based on a standardized mean difference(5) (small: g = 0.20; medium: g = 0.50; large: g = 0.80; very large: g = 1.20; huge: g > 2.00)(6).”
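For clarity, a minimal sketch of the statistic itself: Hedges' g is Cohen's d computed with the pooled standard deviation, multiplied by a small-sample bias correction factor, which is why it is preferred for small or unequal group sizes. The function below is illustrative only; the study data are not reproduced here.

import numpy as np

def hedges_g(x, y):
    # Cohen's d on the pooled SD, multiplied by Hedges' approximate
    # small-sample correction factor J = 1 - 3 / (4(nx + ny) - 9).
    x, y = np.asarray(x, float), np.asarray(y, float)
    nx, ny = len(x), len(y)
    pooled_sd = np.sqrt(((nx - 1) * x.var(ddof=1) + (ny - 1) * y.var(ddof=1))
                        / (nx + ny - 2))
    d = (x.mean() - y.mean()) / pooled_sd
    return d * (1 - 3 / (4 * (nx + ny) - 9))

Applied to two QUALSYST score vectors, a call such as hedges_g(noncovid, covid) would yield the kind of g values reported above, given the actual study data.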

Q-1-10: The authors should report the actual mean total and summary percentage scores (and resulting error bars) and not just refer to Figure 2 from which it is difficult to tell the exact numbers.

A-1-10: We have provided the requested numbers in the results section.

Q-1-11: It is good that the authors are willing to make their data available, however I don’t see any reason the underlying data for this study couldn’t be proactively made available on an appropriate repository like OSF or Figshare (or any other appropriate public repository) rather than from the author. I would think this would be particularly important for this research as others may want to pick up and expand on these methods as the pandemic matures. Requiring interested parties to contact the author is simply an additional barrier to the benefits of data sharing and availability. I would ask the authors to consider taking this proactive step.

A-1-11: Thank you for this comment. We have decided to make all our data available on Figshare. https://figshare.com/projects/Scientific_Quality_of_COVID-19_and_SARS_CoV-2_Publications_in_the_Highest_Impact_Medical_Journals_during_the_Early_Phase_of_the_Pandemic_A_Case-Control_Study/86027

Q-1-12: The authors recognize what I believe to be the biggest issue with this paper which is timeliness. Obviously there is not much to be done to combat this other than a potential update of the results before submission but this would be a lot to ask. However, I do agree this is a major limitation of these findings in terms of overall interest to the community. That said, I think even more value can come from sharing the framework for the methods to potentially be expanded and applied at other points in the epidemic. I think there is some interesting potential for this that the authors would do well to acknowledge.

A-1-12: We thank you for recognizing that updating our article with additional five months of pandemic publications would simply be beyond our capabilities. The issue of timeliness was demonstrated to us clearly by the emerging Surgisphere scandal, during which our article was “hidden” in peer review. We have expanded the discussion in this direction, please also refer to our next answer.

Q-1-13: The authors appropriately acknowledge the inherent limitations of choosing any single assessment tool for paper quality and I think their choice to use QUALSYST is fine. However the recent Surgisphere scandal brings this limitation to further light. I’m fairly certain this methodology would have assessed both the NEJM and Lancet Surisphere papers that were retracted as high quality research despite failing on some general quality measures. Things like the availability of data and materials are metrics that aren’t included in the scales used that may also be important indicators of quality. To be clear, I do not think it would be reasonable for the authors to have scrutinized the papers included to a level that they would have noticed the irregularities that tipped off the community to the issues with the Surgisphere Lancet paper. That level of detailed scrutiny was not within the scope of this paper. However there are broad indicators largely related to best practice in Open Science (and I’m sure in some other areas as well) that impact their ability to assess quality. Just an interesting case study to consider in grounding this limitation.

A-1-13: We fully agree. The Surgisphere scandal emerged while our article was under review. Please also refer to the first comment of reviewer 2.

This underscores the issue of timeliness from above. We have expanded the discussion on timeliness and discuss our methodology in the light of source data and open science.

“The QUALSYST tool has been validated and is easy to use. This may facilitate additional similar studies at a later stage of the pandemic. Compared to an in-depth analysis of a study’s peer-review process prior to acceptance for publication, such an assessment must remain very superficial. We did not expand our analysis to check source data. The data scandal leading to the retraction of two major studies (7, 8) emerged while our article was under peer review. The tools we used would not have been suitable to detect this. Public data repositories and an “open science” approach may facilitate data validation(9).”

“The pandemic took a toll on all aspects of life. Clearly, journal reviewers were restricted in the time they were able to invest in their valuable, voluntary and honorary work. To what extent changes in their practices have occurred is not accessible to us, since the peer-review process was blind and confidential. Assessment of journals with open peer review during the pandemic may shed light on such phenomena, but this was not within the scope of our study.”

Reviewer #2:

I have provided a review of the manuscript "Scientific Quality of COVID-19 and SARS CoV-2 Publications in the Highest Impact Medical Journals during the Early Phase of the Pandemic: A Review and Case-Control”

The authors provide an interesting manuscript that can shed light on the quality of COVID-19-related publications, which could have been affected by the rapidity and high demands of this pandemic.

We thank the referee for the appreciation of our work.

INTRODUCTION

Q-2-1: Line 65 – 68. It may be worthwhile to add the example of the retraction of the papers published in The Lancet and the New England Journal of Medicine regarding the use of the COVID-19 treatments chloroquine and hydroxychloroquine: https://www.thelancet.com/journals/lancet/article/PIIS0140-6736(20)31180-6/fulltext. This information can strengthen the rationale of the study.

A-2-1: We thank the reviewer for the opportunity to add a bit of detail on the Surgisphere scandal, which emerged while our article was under review. We have adapted the introduction according to your suggestion. Please also refer to answer 13 for reviewer 1.

“The first report of COVID-19 transmission in asymptomatic individuals(10) was later considered to have been flawed, because the patient showed symptoms at the time of transmission(11). A similar example occurred in The Lancet, whereby the authors retracted a publication after admitting irregularities on the first-hand account of the front-line experience of two Chinese nurses(12). While our article was under review, two major analyses on the use of hydroxychloroquine and cardiovascular mortality associated with COVID-19 were retracted in the Lancet(8) and the New England Journal of Medicine(7) because source data could not be verified. “

Q-2-2: Lines 86-87: “We hypothesized that the quality of recent publications on COVID-19 in medical journals with impact factor >50 is lower than for nonCOVID-19 articles published during the same time period.”

I wonder whether considering the same period as the time span for the non-COVID publications is adequate. COVID-19 has affected not only the publications related to it; all the other specialties have been altered and affected as well. Priorities changed during the first 6 months of 2020, and the publication process was sped up for COVID-19 publications but slowed for the non-COVID-19 areas. Would it have been more appropriate to consider the same months for the non-COVID publications but in 2019, when COVID-19 could not have affected their quality and the publication process?

A-2-2: We clearly see your point and agree that the pandemic has affected all areas of research. The redirection of efforts to COVID-19 may have lowered the output of non-COVID research and slowed down the publication and review process. This was not our research question, however.

Since we put the weight of our main analysis on the methodological quality of articles, we think it is important to perform the comparison with non-COVID articles in the same time frame. Our hypothesis was that COVID-related research shows lower quality than non-COVID research. If we had chosen a historical control period, we would have introduced a selection bias, since conditions for research then were clearly different. We find it important that the control group is subject to the same conditions as the test group and hope that you can follow our reasoning. This issue is now discussed. For endpoints other than quality, such as total research output in an area or citations, a historical control may be more appropriate. For the citations, we have added data.

“A comparison to a set of original articles from the same dates in 2019 revealed 53 (25 to 90) citations for the 2019 articles vs. 334 (222 to 1001) citations for COVID articles in August 2020 and 10 (4 to 18) for nonCOVID articles (p<0.002 for all comparisons). When corrected for lead time with citations per month, the articles in 2019 had 4 (2 to 6) citations per month and the nonCOVID articles in 2020 had 2.5 (1 to 4.5), a nonsignificant difference. The COVID articles in 2020 had 83.5 (55 to 250) citations per month (p<0.001).”
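As an illustration of the lead-time correction described above, the sketch below normalizes citation counts by months online and compares the groups. The counts and lead times are placeholders, not the study data, and the Mann-Whitney U test is an assumption on our part, since the quoted text does not name the test behind the p-values.

import numpy as np
from scipy.stats import mannwhitneyu

# Placeholder citation counts and per-article lead times (months between
# publication and the August 2020 citation snapshot); not the study data.
covid_citations = np.array([334, 222, 1001, 450, 280])
covid_months = np.array([4.5, 4.0, 4.2, 4.8, 4.3])
noncovid_citations = np.array([10, 4, 18, 7, 12])
noncovid_months = np.array([4.6, 4.1, 4.4, 4.7, 4.2])

# Lead-time correction: citations per month online.
covid_rate = covid_citations / covid_months
noncovid_rate = noncovid_citations / noncovid_months

u, p = mannwhitneyu(covid_rate, noncovid_rate, alternative="two-sided")
print(f"median rates: COVID {np.median(covid_rate):.1f}/month vs "
      f"nonCOVID {np.median(noncovid_rate):.1f}/month; p = {p:.3g}")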

“One might argue that a comparison to a historical control group, for example the same time frame in 2019, when there was no pandemic effect on research, would have been more appropriate. Our hypothesis was that COVID-related research showed lower quality than non-COVID research. A historical control group may introduce a selection bias, since conditions for research then would be clearly different. The control group has to be subject to the same conditions as the test group when methodological quality is assessed. This may be different for other endpoints like total research output.”

METHODS

Q-2-3: Lines 91 – 92. “This report follows the applicable STROBE guidelines for case-control studies and PRISMA guidelines for systematic reviews”. Could you please define the study design? Is this an observational study (ie. Case-control or even cross sectional) or a review? Why use the PRISMA guideline if not properly a systematic review? This is not a systematic review but a case/control study so observational.

A-2-3: We fully agree with the reviewer. We had planned an observational study designed as a case-control study. We were then obliged by PLOS ONE to include “systematic review” in the title, because we deal with study comparisons. We would like to follow your suggestion and proceed as a case-control study, which was the initial plan, unless the editorial office overrules us.

Q-2-4: Lines 102 – 104. “The resulting publications were stratified into COVID-19–related and nonCOVID-19–related. We matched the nonCOVID-19 publications with COVID-19 publications according to article types within each journal…”.

As already discussed above, wouldn’t a historical control, i.e., non-COVID-19 publications from March 12 to April 12, 2019, be more appropriate and unbiased? The publication process for non-COVID-19 work was altered by COVID-19 priorities. For example, peer review was slower for non-COVID-19 than for COVID-19 publications, and attention was directed more to COVID-19 publications than to non-COVID ones.

A-2-4: Please refer to our answer to your comment Q-2-2

Q-2-4: Table 1. Table 1 should be included in the Results section, not in the Methods, since it reports characteristics such as the level of evidence of the included studies.

A-2-4: Thank you. This has been adapted.

Q-2-5: Lines 126-127. “Quantitative appraisal” and lines “Qualitative analysis”. Reading these two sections, I do not find a difference in terms of quantitative or qualitative appraisal. The checklist the authors used to evaluate methodological quality expresses a qualitative approach, not a quantitative one. Quantitative appraisal or quantitative synthesis usually refers to a meta-analysis or any intended statistical method for combining results across studies (e.g., subgroup analysis, meta-regression, sensitivity analysis), including methods for assessing heterogeneity.

In this case, it may be more informative to write about “qualitative appraisal of quantitative research” or simply report a single paragraph with the qualitative appraisal. Then, what is reported in lines 144–147 (i.e., funding or missing data) can be another way to inform qualitatively about the included research.

A-2-5: Thank you for this insightful comment. In fact, there seems to be a confusion between qualitative analysis as a methodology and analysis of the quality of the papers. Please also refer to question 1-6b of reviewer 1 and the adaptation in the method section.

Q-2-6: Line 174: “Quantitative appraisal of the quality of the original articles is..” I would change this terms into qualitative assessment of original articles. Quality is implicitly considered into “qualitative assessment”.

A-2-6: We have inserted the suggested changes. Please also refer to answers A-1-6b and A-2-5.

“Assessment of the original articles’ quality is reported as a two-reviewer mean score (95% CI) and was analyzed using Welch’s t-tests.”

Q-2-7: “favoring COVID-19 original research papers”. I suppose that this might be obvious considering the period of high demand for COVID-19 answers in the international scientific community. A comparison with non-COVID-19 research in a period not affected by COVID-19 could have been more appropriate for detecting the real difference in the number of citations.

A-2-7: Please refer to our answer to your comment Q-2-2

Q-2-8: “Most of these studies had limitations in terms of missing data or under-reporting. The randomized trial was not blinded.” This sentence confused me. What did the authors consider as quantitative and what as qualitative assessment? Elements considered in the “qualitative assessment” are related to the methodological quality of the study designs. For example, blinding in an RCT is an item of the Cochrane Risk of Bias tool, whose aim is to assess the internal validity of a randomized controlled trial, i.e., the risk of bias in terms of methodological quality. https://handbook-5-1.cochrane.org/chapter_8/8_assessing_risk_of_bias_in_included_studies.htm

A-2-8: Please refer to our answer for A1-6b for reviewer 1

DISCUSSION

Q-2-9: The Discussion is too limited and needs to be enriched and expanded. Several issues can be of interest; some have already arisen in the comments above:

- Could the assessment of nonCOVID-19 publications in a different period have changed the results?

Please see our response to your question Q-2-2.

- The period is highly influenced by the changed research priorities related to COVID-19; the efforts (money, time, etc.) of the whole international scientific community have been dedicated to COVID-19.

- Could the journal peer review process have affected the quality of the published articles? In these six months of the SARS-CoV-2 pandemic, even the attention of editors and reviewers was directed at speeding up COVID-19 publications as much as possible: did the authors discuss this? Moreover, many preprints on COVID-19 exist; could these influence the publication of COVID-19 research?

This is now discussed:

“The pandemic took a toll on all aspects of life. Clearly, journal reviewers were restricted in the time they were able to invest in their valuable, voluntary and honorary work. To what extent changes in their practices have occurred is not accessible to us, since the peer-review process was blind and confidential. Assessment of journals with open peer review during the pandemic may shed light on such phenomena, but this was not within the scope of our study.”

1. Phillips B, Ball C, Sackett D, Badenoch D, Straus S, Haynes B, et al. Oxford Centre for Evidence-Based Medicine: Levels of Evidence (March 2009). 2009.

2. Lander B, Balka E. Exploring How Evidence is Used in Care Through an Organizational Ethnography of Two Teaching Hospitals. Journal of medical Internet research. 2019;21(3):e10769.

3. Djulbegovic B, Guyatt GH. Progress in evidence-based medicine: a quarter century on. Lancet. 2017;390(10092):415-23.

4. Understanding qualitative research in health care. Drug and Therapeutics Bulletin. 2017;55(2):21.

5. Durlak JA. How to Select, Calculate, and Interpret Effect Sizes. Journal of Pediatric Psychology. 2009;34(9):917-28.

6. Sawilowsky SS. New effect size rules of thumb. J Mod Appl Stat Methods. 2009;8(2):26.

7. Mehra MR, Desai SS, Kuy S, Henry TD, Patel AN. Retraction: Cardiovascular Disease, Drug Therapy, and Mortality in Covid-19. N Engl J Med. 2020;382(26):2582. doi: 10.1056/NEJMoa2007621.

8. Mehra MR, Ruschitzka F, Patel AN. Retraction—Hydroxychloroquine or chloroquine with or without a macrolide for treatment of COVID-19: a multinational registry analysis. The Lancet. 2020;395(10240):1820.

9. Shamoo AE. Validate the integrity of research data on COVID 19. Accountability in Research. 2020;27(6):325-6.

10. Rothe C, Schunk M, Sothmann P, Bretzel G, Froeschl G, Wallrauch C, et al. Transmission of 2019-nCoV Infection from an Asymptomatic Contact in Germany. N Engl J Med. 2020;382(10):970–1.

11. Kupferschmidt K. Study claiming new coronavirus can be transmitted by people without symptoms was flawed. sciencemag.org, 2020 [accessed April 19, 2020]. Available from: https://www.sciencemag.org/news/2020/02/paper-non-symptomatic-patient-transmitting-coronavirus-wrong.

12. Zeng Y, Zhen Y. RETRACTED: Chinese medical staff request international medical assistance in fighting against COVID-19. Lancet Glob Health. 2020.

Attachment

Submitted filename: Response to Reviewers.docx

Decision Letter 1

Bart Ferket

14 Oct 2020

PONE-D-20-14688R1

Scientific Quality of COVID-19 and SARS CoV-2 Publications in the Highest Impact Medical Journals during the Early Phase of the Pandemic: A Case Control Study

PLOS ONE

Dear Dr. Berger,

Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.

I agree with reviewer 2 regarding the appropriateness of the current title. I also agree with changing the phrasing of the qualitative appraisal sections. Please use different terminology to describe this analysis, for example, narrative appraisal.

Please submit your revised manuscript by Nov 28 2020 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

Please include the following items when submitting your revised manuscript:

  • A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'.

  • A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'.

  • An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'.

If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter.

If applicable, we recommend that you deposit your laboratory protocols in protocols.io to enhance the reproducibility of your results. Protocols.io assigns your protocol its own identifier (DOI) so that it can be cited independently in the future. For instructions see: http://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols

We look forward to receiving your revised manuscript.

Kind regards,

Bart Ferket

Academic Editor

PLOS ONE


Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. If the authors have adequately addressed your comments raised in a previous round of review and you feel that this manuscript is now acceptable for publication, you may indicate that here to bypass the “Comments to the Author” section, enter your conflict of interest statement in the “Confidential to Editor” section, and submit your "Accept" recommendation.

Reviewer #1: (No Response)

Reviewer #2: (No Response)

**********

2. Is the manuscript technically sound, and do the data support the conclusions?

The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #1: Yes

Reviewer #2: Partly

**********

3. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: Yes

Reviewer #2: N/A

**********

4. Have the authors made all data underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: Yes

Reviewer #2: Yes

**********

5. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: Yes

Reviewer #2: Yes

**********

6. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: Many thanks for the opportunity to review the revised article. This remains good research, fit for publication, and I believe the authors have mostly addressed the issues discussed during the prior round of review. I offer one major issue and some additional brief comments and replies to the authors’ responses. Assuming these are addressed, I would recommend for publication.

Major Issues:

My biggest remaining issue is the “Qualitative evaluation.” I thank the authors for their explanation of mixed-methods research, but to claim you used a “mixed-methods methodology” that includes a “Qualitative” section, you need to actually include and describe some sort of qualitative method used to evaluate the work, e.g., thematic analysis, grounded theory, content analysis, framework analysis, etc. These are all well-established, detailed, systematic methods for conducting qualitative analysis. A one-sentence qualitative methods section that just states that the research was “assessed qualitatively” is not sufficient. What I was getting at is that the Cochrane Risk of Bias tool may have provided a deductive framework around which to categorise and report these evaluations (using something like content analysis). You should remove the word “Qualitative” from this section and replace it with something like “Narrative” or “Subjective” assessments, so as to not give the impression that a robust, systematic qualitative method was used for this evaluation. At least, based on what is reported in the paper, that does not seem to be the case.

Minor Issues:

Many thanks to the authors for addressing how they selected the journals. They may be interested to see, if they have not already, that some very similar research was presented recently at the European Society of Cardiology Congress 2020: https://www.tctmd.com/news/covid-19-blamed-weaker-research-published-top-tier-journals-2020. Notably, they looked at the same three journals, albeit using different criteria/scales and historic controls. This is an interesting check on the findings from Zdravkovic et al. and can be brought out in the discussion. I couldn’t locate a book of abstracts for that conference (it may not be available yet), but the results are described in that article. Obviously not ideal for referencing, but the similarity is notable and probably worth mentioning in relation to this research.

I am glad to see the sensitivity analysis removing editorials. I disagree that ~8% and ~14% of your sample is very small. That said, it is very good that your findings remained robust to removing these. I think the fact that your findings remained robust to the various sensitivity analyses is a strength.

I don’t think it is correct to say that the JAMA Research Letter format (or other journals’ similar formats) is indicative of the editors/reviewers believing the articles “are not good enough to meet the criteria for Original article category.” You can submit directly in the Research Letter format without being referred there by the editors. Some research simply doesn’t need 2000+ words to get the point across. That said, I think the inclusion of the sensitivity analysis is sufficient.

On the citation analysis, I agree that with the new data, it would be unreasonable to check all of these citations for context. However, I do think the levels originally reported through May could reasonably have been investigated. I understand that you feel this is out of the scope of this paper and the resources of your group, and I respect the decision not to investigate further at this time. I would simply request that, if you agree that what I stated about citation context may be true, it is mentioned as relevant to the interpretation of this finding on Page 14 and potentially as a direction for future research in the Discussion. A case study in this that has personally annoyed me quite a bit is Didier Raoult going on about how many times his hydroxychloroquine paper has been cited, as a defense of the paper, with no context of how many of those were citing it in the context of pointing out the many limitations of that research. No need to use this example, but I think it proves the point.

I would like to applaud the authors for making their data openly available.

Many thanks again for the opportunity to review this paper. If these issues can be rectified to the editor’s satisfaction I am happy to recommend this paper go forward to publication.

Reviewer #2: 1. I would like to comment on the following authors’ answer:

A-2-3: We fully agree with the reviewer. We had planned an observational study design as a case control. We were then obliged by PLOS ONE to include “systematic review” in the title, because we deal with study comparisons. We would like to follow your suggestion and present this as a case-control study, which was the initial plan, unless the editorial office overrules us.

Actually, I agree that this is not a systematic review; therefore I would avoid any reference to this study design so as not to create misunderstanding regarding the study type. Accordingly, I did not find the term “systematic review” in the title, and I agree with the removal of any reference to the PRISMA statement.

2. My major concern is still related to the quality appraisal the authors performed. I went through the references the authors reported to support the choice of the QUALSYST tool and the mixed methodology, and some issues and confusion still remain.

I agree with their comment A1-6 B, where the authors described the definition of qualitative and quantitative research, but I think something is still not clear, or there is some confusion in the description of this concept. They stated: Qualitative analysis is the analysis of qualitative data such as text data from interview transcripts. Unlike quantitative analysis, which is statistics-driven and largely independent of the researcher, qualitative analysis is heavily dependent on the researcher’s analytic and integrative skills and personal knowledge of the social context where the data is collected. The emphasis in qualitative analysis is “sense making” or understanding a phenomenon, rather than predicting or explaining. This methodology is very frequently used in social sciences in combination with quantitative analysis, the so-called mixed-methods methodology.

I am aware of the mixed-methods methodology, but I don’t think this is a case in which to adopt it: here the authors should have assessed just the methodological quality of the quantitative research study designs (i.e. experimental, observational, etc.), offering a qualitative assessment and not a quantitative one, which in the Cochrane wording refers to a quantitative, statistically driven synthesis. Moreover, the second part the authors assessed, i.e. the “qualitative analysis”, is just an assessment of the reporting characteristics of the included studies – I would not consider it a qualitative analysis. Please look at the following examples of reporting characteristics/methodological quality assessment: https://journals.plos.org/plosmedicine/article?id=10.1371/journal.pmed.0040078; https://www.sciencedirect.com/science/article/pii/S0895435616308162.

Otherwise the study design is not clearly described – the mixed methodology might describe the study design, but not the appraisal of the studies. Here is an example: https://pubmed.ncbi.nlm.nih.gov/33033951/, where both a questionnaire (quantitative analysis) and focus groups (qualitative analysis) were used.

In this case-control study, the mixed methodology did not really reflect what was done. In my opinion, it is a case-control study belonging to the quantitative research world, in which the included studies were compared using the following instruments:

1. Methodological quality through the QUALSYST tool (even though high-standard tools such as the Cochrane Risk of Bias, the Newcastle–Ottawa scale, or ROBINS-I could have been used too)

2. Assessment of reporting elements, such as those reported on page 7, line 150: “The COVID-19 original research articles (n = 13) were assessed qualitatively to report on their major weaknesses (which type of weakness, and how is this standardized across studies? Was this not already included in the methodological quality assessment for quantitative studies in the QUALSYST tool?), potential conflicts of interest, and likely influence on further research and clinical practice (in which way are these standardized, collected and reported in the assessment? Are regression analyses planned to investigate the influence?).

The second point cannot be equated with the qualitative part of a mixed methodology, where usually focus groups or interviews are used to collect qualitative data. The data/information the authors wanted to comment on are included and reported in the selected studies, in the manuscript/full text, as general characteristics (i.e. conflicts of interest), and so are simply collected from them and then discussed.

I suggest the authors revise the description of the study design performed and the qualitative/quantitative wording.

**********

7. PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: Yes: Nicholas J. DeVito

Reviewer #2: No

[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.]

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email PLOS at figures@plos.org. Please note that Supporting Information files do not need this step.

PLoS One. 2020 Nov 5;15(11):e0241826. doi: 10.1371/journal.pone.0241826.r004

Author response to Decision Letter 1


20 Oct 2020

Reply to Reviewers for PONE-D-20-14688R1:

Scientific Quality of COVID-19 and SARS CoV-2 Publications in the Highest Impact Medical Journals during the Early Phase of the Pandemic: A Case Control Study

Review Comments to the Author

Reviewer #1: Many thanks for the opportunity to review the revised article. This remains good research, fit for publication, and I believe the authors have mostly addressed the issues discussed during the prior round of review. I offer one major issue and some additional brief comments and replies to the authors’ responses. Assuming these are addressed, I would recommend for publication.

Our reply: We thank the reviewer for the appreciation of our work.

Major Issues:

My biggest remaining issue is the “Qualitative evaluation.” I thank the authors for their explanation of mixed-methods research, but to claim you used a “mixed-methods methodology” that includes a “Qualitative” section, you need to actually include and describe some sort of qualitative method used to evaluate the work, e.g., thematic analysis, grounded theory, content analysis, framework analysis, etc. These are all well-established, detailed, systematic methods for conducting qualitative analysis. A one-sentence qualitative methods section that just states that the research was “assessed qualitatively” is not sufficient. What I was getting at is that the Cochrane Risk of Bias tool may have provided a deductive framework around which to categorise and report these evaluations (using something like content analysis). You should remove the word “Qualitative” from this section and replace it with something like “Narrative” or “Subjective” assessments, so as to not give the impression that a robust, systematic qualitative method was used for this evaluation. At least, based on what is reported in the paper, that does not seem to be the case.

Our reply: Thank you for this comment. We have changed the wording “qualitative” to “narrative” throughout.

Minor Issues:

Many thanks to the authors for addressing how they selected the journals. They may be interested to see, if they have not already, that some very similar research was presented recently at the European Society of Cardiology Congress 2020: https://www.tctmd.com/news/covid-19-blamed-weaker-research-published-top-tier-journals-2020. Notably, they looked at the same three journals, albeit using different criteria/scales and historic controls. This is an interesting check on the findings from Zdravkovic et al. and can be brought out in the discussion. I couldn’t locate a book of abstracts for that conference (it may not be available yet), but the results are described in that article. Obviously not ideal for referencing, but the similarity is notable and probably worth mentioning in relation to this research.

Our reply: This is indeed very interesting; thank you for making us aware of it. We contacted Professor Stefanini, and he provided us with the slide set he presented at the congress. He states that, beyond the slides, his data do not yet exist in a published, citable form. We have included a statement in the discussion. If this does not suit your citation policy, please advise us.

“One might argue that a comparison to a historical control group, for example the same time frame in 2019, when there was no pandemic effect on research, would have been more appropriate. Our hypothesis was that COVID-related research showed lower quality than non-COVID research. A historical control group may introduce a selection bias, since conditions for research would then have been clearly different. We would therefore argue that the control group has to be subject to the same conditions as the test group when methodological quality is assessed. This may be different for other endpoints, like total research output. In line with our results, Stefanini et al. reported – in an oral presentation at the European Society of Cardiology Congress 2020 – similar findings of lower quality associated with COVID-19, in the same journals and time frame as our work, with a historical control group from 2019. Thus, both historical and contemporary control groups lead to the same conclusions.”

I am glad to see the sensitivity analysis removing editorials. I disagree that ~8% and ~14% of your sample is very small. That said, it is very good that your findings remained robust to removing these. I think the fact that your findings remained robust to the various sensitivity analyses is a strength.

Our reply: Thank you.

I don’t think it is correct to say that the JAMA Research Letter format (or other journals’ similar formats) is indicative of the editors/reviewers believing the articles “are not good enough to meet the criteria for Original article category.” You can submit directly in the Research Letter format without being referred there by the editors. Some research simply doesn’t need 2000+ words to get the point across. That said, I think the inclusion of the sensitivity analysis is sufficient.

Our reply: Thank you.

On the citation analysis, I agree that with the new data, it would be unreasonable to check all of these citations for context. However, I do think the levels originally reported through May could reasonably have been investigated. I understand that you feel this is out of the scope of this paper and the resources of your group, and I respect the decision not to investigate further at this time. I would simply request that, if you agree that what I stated about citation context may be true, it is mentioned as relevant to the interpretation of this finding on Page 14 and potentially as a direction for future research in the Discussion. A case study in this that has personally annoyed me quite a bit is Didier Raoult going on about how many times his hydroxychloroquine paper has been cited, as a defense of the paper, with no context of how many of those were citing it in the context of pointing out the many limitations of that research. No need to use this example, but I think it proves the point.

Our reply: We fully agree with your notion and have expanded the discussion:

It must be acknowledged, though, that a citation is not necessarily positive for a study or author if the context of the citations, i.e. criticism or discussions about retractions and corrections, is considered. This is beyond the scope of our work.
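As a purely illustrative aside for readers of this review record, and not part of the manuscript’s methods: the citation-context check the reviewer describes could, in principle, be sketched as below (Python). The keyword lists, the classify helper, and the example sentences are hypothetical assumptions, not a validated classification scheme.

    # Hypothetical sketch: tally whether citing sentences are critical,
    # supportive, or neutral toward the cited paper. Keyword lists are
    # illustrative assumptions only.
    from collections import Counter

    CRITICAL = ("retract", "flaw", "limitation", "concern", "criticis")
    SUPPORTIVE = ("in line with", "consistent with", "confirm", "support")

    def classify(citing_sentence):
        """Crudely label one citing sentence by keyword matching."""
        s = citing_sentence.lower()
        if any(k in s for k in CRITICAL):
            return "critical"
        if any(k in s for k in SUPPORTIVE):
            return "supportive"
        return "neutral"

    # Hypothetical citing sentences, for demonstration only.
    sentences = [
        "Our findings are in line with earlier reports of this kind.",
        "That study was later retracted over data-integrity concerns.",
    ]
    print(Counter(classify(s) for s in sentences))
    # Counter({'supportive': 1, 'critical': 1})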

I would like to applaud the authors for making their data openly available.

Our reply: Thank you.

Many thanks again for the opportunity to review this paper. If these issues can be rectified to the editor’s satisfaction I am happy to recommend this paper go forward to publication.

Our reply: We thank you for your constructive criticism and hope that we have now sufficiently addressed all the points raised.

Reviewer #2: 1. I would like to comment on the following authors’ answer:

A-2-3: We fully agree with the reviewer. We had planned an observational study design as a case control. We were then obliged by PLOS ONE to include “systematic review” in the title, because we deal with study comparisons. We would like to follow your suggestion and present this as a case-control study, which was the initial plan, unless the editorial office overrules us.

Actually, I agree that this is not a systematic review; therefore I would avoid any reference to this study design so as not to create misunderstanding regarding the study type. Accordingly, I did not find the term “systematic review” in the title, and I agree with the removal of any reference to the PRISMA statement.

Our reply: Thank you for supporting our idea. We have already changed the title according to your suggestion, and the journal editors seem to agree.

2. My major concern is still related to the quality appraisal the authors performed. I went through the references the authors reported to support the choice of the QUALSYST tool and the mixed methodology, and some issues and confusion still remain.

I agree with their comment A1-6 B, where the authors described the definition of qualitative and quantitative research, but I think something is still not clear, or there is some confusion in the description of this concept. They stated: Qualitative analysis is the analysis of qualitative data such as text data from interview transcripts. Unlike quantitative analysis, which is statistics-driven and largely independent of the researcher, qualitative analysis is heavily dependent on the researcher’s analytic and integrative skills and personal knowledge of the social context where the data is collected. The emphasis in qualitative analysis is “sense making” or understanding a phenomenon, rather than predicting or explaining. This methodology is very frequently used in social sciences in combination with quantitative analysis, the so-called mixed-methods methodology.

I am aware of the mixed-methods methodology, but I don’t think this is a case in which to adopt it: here the authors should have assessed just the methodological quality of the quantitative research study designs (i.e. experimental, observational, etc.), offering a qualitative assessment and not a quantitative one, which in the Cochrane wording refers to a quantitative, statistically driven synthesis. Moreover, the second part the authors assessed, i.e. the “qualitative analysis”, is just an assessment of the reporting characteristics of the included studies – I would not consider it a qualitative analysis. Please look at the following examples of reporting characteristics/methodological quality assessment: https://journals.plos.org/plosmedicine/article?id=10.1371/journal.pmed.0040078; https://www.sciencedirect.com/science/article/pii/S0895435616308162.

Otherwise the study design is not clearly described – the mixed methodology might describe the study design, but not the appraisal of the studies. Here is an example: https://pubmed.ncbi.nlm.nih.gov/33033951/, where both a questionnaire (quantitative analysis) and focus groups (qualitative analysis) were used.

In this case-control study, the mixed methodology did not really reflect what was done. In my opinion, it is a case-control study belonging to the quantitative research world, in which the included studies were compared using the following instruments:

1. Methodological quality through the QUALSYST tool (even though high-standard tools such as the Cochrane Risk of Bias, the Newcastle–Ottawa scale, or ROBINS-I could have been used too)

2. Assessment of reporting elements, such as those reported on page 7, line 150: “The COVID-19 original research articles (n = 13) were assessed qualitatively to report on their major weaknesses (which type of weakness, and how is this standardized across studies? Was this not already included in the methodological quality assessment for quantitative studies in the QUALSYST tool?), potential conflicts of interest, and likely influence on further research and clinical practice (in which way are these standardized, collected and reported in the assessment? Are regression analyses planned to investigate the influence?).

The second point cannot be equated with the qualitative part of a mixed methodology, where usually focus groups or interviews are used to collect qualitative data. The data/information the authors wanted to comment on are included and reported in the selected studies, in the manuscript/full text, as general characteristics (i.e. conflicts of interest), and so are simply collected from them and then discussed.

I suggest the authors revise the description of the study design performed and the qualitative/quantitative wording.

Our reply: Thank you. As suggested by reviewer 1, we changed the wording “qualitative” to “narrative” throughout the manuscript. We also changed the “Mixed-methods” wording to “multi-step”, to more accurately reflect the methodology used.
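For context on the scoring arithmetic discussed above, a minimal sketch of a QualSyst-style summary score follows, assuming the tool’s standard 14-item quantitative checklist (items scored 2/1/0, with “n/a” items removed from the achievable maximum of 28 points, per Kmet et al. 2004). This illustrates the published scoring rule only; it is not the authors’ actual scoring code, and the example scores are hypothetical.

    # Minimal sketch of a QualSyst-style summary score (assumptions:
    # 14-item quantitative checklist, two points per item, "n/a" items
    # excluded from the achievable maximum of 28).
    def qualsyst_summary(item_scores):
        """item_scores: 14 entries, each 2 (yes), 1 (partial), 0 (no), or None (n/a)."""
        if len(item_scores) != 14:
            raise ValueError("the quantitative checklist has 14 items")
        applicable = [s for s in item_scores if s is not None]
        n_na = 14 - len(applicable)
        max_possible = 28 - 2 * n_na   # two points per applicable item
        return sum(applicable) / max_possible

    # Hypothetical example with one "n/a" item: 20 of 26 achievable points.
    print(qualsyst_summary([2, 2, 1, 0, 2, 1, 2, None, 2, 1, 2, 2, 1, 2]))  # ~0.769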

Attachment

Submitted filename: Reply to Reviewers for PONE-R2.docx

Decision Letter 2

Bart Ferket

22 Oct 2020

Scientific Quality of COVID-19 and SARS CoV-2 Publications in the Highest Impact Medical Journals during the Early Phase of the Pandemic: A Case Control Study

PONE-D-20-14688R2

Dear Dr. Berger,

We’re pleased to inform you that your manuscript has been judged scientifically suitable for publication and will be formally accepted for publication once it meets all outstanding technical requirements.

Within one week, you’ll receive an e-mail detailing the required amendments. When these have been addressed, you’ll receive a formal acceptance letter and your manuscript will be scheduled for publication.

An invoice for payment will follow shortly after the formal acceptance. To ensure an efficient process, please log into Editorial Manager at http://www.editorialmanager.com/pone/, click the 'Update My Information' link at the top of the page, and double check that your user information is up-to-date. If you have any billing related questions, please contact our Author Billing department directly at authorbilling@plos.org.

If your institution or institutions have a press office, please notify them about your upcoming paper to help maximize its impact. If they’ll be preparing press materials, please inform our press team as soon as possible -- no later than 48 hours after receiving the formal acceptance. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information, please contact onepress@plos.org.

Kind regards,

Bart Ferket

Academic Editor

PLOS ONE

Additional Editor Comments (optional):

Reviewers' comments:

Acceptance letter

Bart Ferket

26 Oct 2020

PONE-D-20-14688R2

Scientific Quality of COVID-19 and SARS CoV-2 Publications in the Highest Impact Medical Journals during the Early Phase of the Pandemic: A Case Control Study

Dear Dr. Berger:

I'm pleased to inform you that your manuscript has been deemed suitable for publication in PLOS ONE. Congratulations! Your manuscript is now with our production department.

If your institution or institutions have a press office, please let them know about your upcoming paper now to help maximize its impact. If they'll be preparing press materials, please inform our press team within the next 48 hours. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information please contact onepress@plos.org.

If we can help with anything else, please email us at plosone@plos.org.

Thank you for submitting your work to PLOS ONE and supporting open access.

Kind regards,

PLOS ONE Editorial Office Staff

on behalf of

Dr. Bart Ferket

Academic Editor

PLOS ONE

Associated Data

    This section collects any data citations, data availability statements, or supplementary materials included in this article.

    Supplementary Materials

    S1 File. Checklist used for the assessment of the quality of the quantitative studies.

    Description of data: Detailed criteria are shown for the quality assessment of the quantitative studies.

    (DOCX)

    S2 File. Assessor (authors MZ–DB, JBE–BZ) agreements on the qualities of the quantitative studies.

    Description of data: Percentage assessor agreement after independent individual scoring and following resolution of disagreements.

    (DOCX)

    Attachment

    Submitted filename: Response to Reviewers.docx

    Attachment

    Submitted filename: Reply to Reviewers for PONE-R2.docx

    Data Availability Statement

    https://figshare.com/projects/Scientific_Quality_of_COVID-19_and_SARS_CoV-2_Publications_in_the_Highest_Impact_Medical_Journals_during_the_Early_Phase_of_the_Pandemic_A_Case-Control_Study/86027.

