The American Journal of Clinical Nutrition. 2021 Mar 19;113(6):1578–1592. doi: 10.1093/ajcn/nqab002

Characteristics and quality of systematic reviews and meta-analyses of observational nutritional epidemiology: a cross-sectional study

Dena Zeraatkar 1,2, Arrti Bhasin 3, Rita E Morassut 4, Isabella Churchill 5, Arnav Gupta 6, Daeria O Lawson 7, Anna Miroshnychenko 8, Emily Sirotich 9, Komal Aryal 10, David Mikhail 11, Tauseef A Khan 12,13, Vanessa Ha 14, John L Sievenpiper 15,16, Steven E Hanna 17, Joseph Beyene 18, Russell J de Souza 19,20,21,
PMCID: PMC8243916  PMID: 33740039

ABSTRACT

Background

Dietary recommendations and policies should be guided by rigorous systematic reviews. Reviews that are of poor methodological quality may be ineffective or misleading. Most of the evidence in nutrition comes from nonrandomized studies of nutritional exposures (usually referred to as nutritional epidemiology studies), but to date methodological evaluations of the quality of systematic reviews of such studies have been sparse and inconsistent.

Objectives

We aimed to investigate the quality of recently published systematic reviews and meta-analyses of nutritional epidemiology studies and to propose guidance addressing major limitations.

Methods

We searched MEDLINE (January 2018–August 2019), EMBASE (January 2018–August 2019), and the Cochrane Database of Systematic Reviews (January 2018–February 2019) for systematic reviews of nutritional epidemiology studies. We included a random sample of 150 reviews.

Results

Most reviews were published by authors from Asia (n = 49; 32.7%) or Europe (n = 43; 28.7%) and investigated foods or beverages (n = 60; 40.0%) and cancer morbidity and mortality (n = 54; 36%). Reviews often had important limitations: less than one-quarter (n = 30; 20.0%) reported preregistration of a protocol and almost one-third (n = 42; 28.0%) did not report a replicable search strategy. Suboptimal practices and errors in the synthesis of results were common: one-quarter of meta-analyses (n = 30; 26.1%) selected the meta-analytic model based on statistical indicators of heterogeneity and almost half of meta-analyses (n = 50; 43.5%) did not consider dose–response associations even when it was appropriate to do so. Only 16 (10.7%) reviews used an established system to evaluate the certainty of evidence.

Conclusions

Systematic reviews of nutritional epidemiology studies often have serious limitations. Authors can improve future reviews by involving statisticians, methodologists, and researchers with substantive knowledge in the specific area of nutrition being studied and using a rigorous and transparent system to evaluate the certainty of evidence.

Keywords: systematic reviews, nutritional epidemiology, risk of bias, quality, credibility


See corresponding editorial on page 1385.

Introduction

Owing to the challenges of conducting randomized controlled trials (RCTs) of dietary interventions, most of the evidence in nutrition comes from nonrandomized, observational studies of nutritional exposures, hereafter referred to simply as nutritional epidemiology studies (1–3). Clinicians, guideline developers, policymakers, and researchers use systematic reviews of these studies to advise patients on optimal dietary habits, formulate recommendations and policies, and plan future research, but a review that is of poor methodological quality may be ineffective or even misleading (2, 4, 5). There is empirical evidence that systematic reviews in the biomedical literature often have important limitations (6–8). There are, however, reasons to suspect that reviews of nutritional epidemiology studies may have more serious limitations than systematic reviews in other health fields (5). There are unique challenges, for example, to conducting reviews of nutritional exposures, such as the need to summarize dose–response relations and to consider how the effects of nutritional exposures may differ based on the foods or food compounds that are consumed instead of the exposure of interest (9–13). To our knowledge, there has been no evaluation of the quality of systematic reviews of nutritional epidemiology studies.

The objective of this study was to evaluate the characteristics and quality of recently published systematic reviews of nutritional epidemiology studies and, based on these findings, to propose guidance addressing major limitations. We define quality as the extent to which a review addresses a sensible research question and uses rigorous methods (including appropriate methodological safeguards against bias) to search the literature, select eligible studies and collect data, appraise the quality of studies, and synthesize and interpret findings (14, 15).

Methods

We registered a protocol for this study at the Open Science Framework (https://osf.io/p9vge).

Search strategy

A research librarian developed a search strategy to identify systematic reviews of nutritional epidemiology studies (Supplemental Tables 1, 2). We searched MEDLINE and EMBASE from January 2018 to August 2019 and the Cochrane Database of Systematic Reviews from January 2018 to February 2019.

Study selection

Systematic reviews were eligible for inclusion if they investigated the association between ≥1 nutritional exposures and health outcomes and reported on ≥1 epidemiologic studies. We defined systematic reviews as studies that explicitly described a search strategy (including at minimum the databases searched) and eligibility criteria (including at minimum the exposures and health outcomes of interest) (16); epidemiologic studies as nonrandomized, nonexperimental studies (e.g., cohort, case-control, case cohort, nested case-control, cross-sectional, excluding case series) (17); nutritional exposures as macronutrients, micronutrients, bioactive compounds, foods, beverages, dietary patterns/habits, or nonnutritive components of foods (e.g., caffeine); and health outcomes as measures of morbidity, mortality, and quality of life. Reviews using harmonized data sets to conduct analyses on multiple epidemiologic studies were also eligible for inclusion. Reviews that included RCTs were eligible if they included ≥1 epidemiologic studies. Both English and non-English reviews were eligible for inclusion but we only identified English-language reviews. We excluded reviews in which all studies included <500 participants, because such studies were primarily nonrandomized experimental studies instead of observational epidemiologic studies. We excluded other types of reviews (e.g., scoping reviews), reviews that were not systematic in their methods (i.e., narrative reviews that did not describe a search strategy including at minimum the databases searched and eligibility criteria), qualitative syntheses, and reviews of postprandial studies, supplements, and chemicals involuntarily consumed through the diet (e.g., pesticides).

Reviewers (DZ, DM) completed calibration exercises, after which they performed screening independently and in duplicate. Reviewers resolved disagreements by discussion or by third-party adjudication (RJdS). We estimated that 150 reviews would allow estimation of the prevalence of even uncommon review characteristics (i.e., prevalence in ∼5% of studies) with acceptable precision (i.e., ±3.5%) (18). Hence, we selected a random sample of 150 eligible reviews using a computer-generated random number sequence.
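The stated precision of the sample-size estimate can be reproduced with the standard normal approximation for the confidence interval of a proportion. This sketch is not code from the study; it simply verifies that a sample of 150 yields roughly ±3.5% precision for a characteristic present in ~5% of reviews:

```python
import math

def ci_half_width(p: float, n: int, z: float = 1.96) -> float:
    """95% CI half-width for a proportion, normal approximation: z * sqrt(p(1-p)/n)."""
    return z * math.sqrt(p * (1 - p) / n)

# A characteristic with ~5% prevalence, estimated from a sample of 150 reviews:
print(round(ci_half_width(0.05, 150), 3))  # 0.035, i.e., ±3.5%
```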

Data collection

Reviewers (DZ, AB, REM, IC, DOL, AM, ES, KA), after training and calibration exercises to ensure sufficient agreement, extracted the following information from each review, independently and in duplicate: research question; eligibility criteria; search strategy; methods for screening, data extraction, and assessment of risk of bias; analytic methods; results from the primary meta-analysis (if any type of meta-analysis was conducted); characteristics related to the reporting and interpretation of results; and sources of funding and conflicts of interest. Reviewers resolved disagreements by discussion or by third-party adjudication (RJdS). Items of the data collection form were drawn from established criteria for assessing the quality of systematic reviews, guidance on optimal practices for conducting systematic reviews, data collection forms of previous studies, and literature on methodological issues relevant to systematic reviews of nonrandomized studies and systematic reviews of nutritional exposures (4, 6–8, 13–15, 17, 19–21).

If a review cited a protocol, reviewers retrieved and reviewed the protocol for additional relevant information. If a review did not explicitly identify a primary meta-analysis, we assumed that the primary meta-analysis was the one for which results were first presented in the results section of the article. We took the reciprocal of relative effects <1 to produce unidirectional relative effects; for example, a relative effect of 0.5 was transformed to 2. We classified effect sizes reported in reviews as very small (RR: 1–1.1), small (RR: 1.1–1.5), moderate (RR: 1.51–2), and large (RR: >2.01). Although any categorization of effect sizes is somewhat arbitrary, our categorization of large effects is consistent with established Grading of Recommendations Assessment, Development, and Evaluation (GRADE) guidance (22–24).
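The reciprocal transformation and effect-size bands described above can be expressed as a short sketch. This is illustrative only (it is not the authors' code), and because the published bands leave boundary values ambiguous (e.g., 1.5 vs. 1.51), the use of ≤ at each cut point here is an assumption:

```python
def unidirectional(rr: float) -> float:
    """Take the reciprocal of relative effects <1 so all effects point the same way."""
    return 1 / rr if rr < 1 else rr

def categorize(rr: float) -> str:
    """Classify a relative effect using the article's approximate cut points
    (boundary handling with <= is an assumption, not stated in the article)."""
    rr = unidirectional(rr)
    if rr <= 1.1:
        return "very small"
    if rr <= 1.5:
        return "small"
    if rr <= 2.0:
        return "moderate"
    return "large"

print(unidirectional(0.5))  # 2.0, as in the article's example
print(categorize(0.8))      # 0.8 -> 1.25 -> "small"
```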

Risk of bias of systematic reviews

Reviewers (DZ, AB, REM, IC, DOL, AM, ES, KA), working independently and in duplicate, used a modified version of the ROBIS tool to assess the risk of bias of systematic reviews (the propensity for a review to present distorted results owing to systematic flaws or limitations in the design, conduct, or analysis of the review) (25). We chose the ROBIS tool rather than A MeaSurement Tool to Assess systematic Reviews (AMSTAR) because it is more comprehensive in its assessment of risk of bias and because AMSTAR includes several items that address the construct of reporting quality rather than risk of bias (26). We excluded the section on assessing the relevance of the review from the ROBIS tool because we did not apply the results of reviews to address specific questions. We used ROBIS guidance to rate each domain of the tool. For the domain on study eligibility criteria, we rated reviews at “low concern” if eligibility criteria were prespecified in a protocol, directly addressed the research question, and were unambiguous. For the domain on the identification and selection of studies, we rated reviews at “low concern” if the search strategy included ≥2 of MEDLINE/PubMed, EMBASE, and Web of Science (or other databases with similar coverage) as well as strategies to identify unpublished data, the full search strategy for ≥1 database was reported and was deemed likely to retrieve as many eligible studies as possible, all search restrictions were appropriate, and the selection of studies was conducted in duplicate. For the domain on data collection and study appraisal, we rated reviews at “low concern” if data extraction was conducted in duplicate and risk of bias was assessed using an appropriate and comprehensive tool or set of criteria.
We considered risk of bias assessments to be appropriate when they included criteria addressing biases due to confounding, selection of participants, classification of the exposure, departures from the intended exposure, measurement of the outcome, missing data, and selective reporting (27). We deemed risk of bias criteria inappropriate if they failed to address any of the aforementioned criteria or if they included criteria relevant to study precision, reporting, or generalizability (external validity). For the domain on synthesis and findings, we rated reviews at “low concern” if the synthesis was conducted using appropriate methods (i.e., random-effects dose-response meta-analysis unless compelling reasons for conducting other analyses were presented), included all relevant studies, addressed risk of bias in the synthesis of results (e.g., conducted ≥1 subgroup or sensitivity analyses based on risk of bias or presented a narrative discussion of bias), and followed predefined analytic methods. Based on ROBIS guidance, we rated reviews at “low risk of bias” overall if all domains were at “low concern” or if reviewers acknowledged all limitations and described how they may have affected results in their interpretation of their review findings.

Data synthesis and analysis

We present frequencies and percentages for categorical outcomes and medians and IQRs for continuous outcomes.

Results

Supplemental Figure 1 presents details of the selection of reviews. We retrieved a total of 4267 unique records and screened a random sample of 2273 titles and abstracts and 184 full-text articles to identify a sample of 150 eligible reviews.

General characteristics of systematic reviews

Table 1 presents general characteristics of reviews and Supplemental Table 3 presents additional details and examples. Almost half of the reviews were published in general nutrition journals by authors from Asia or Europe. Only 6 of the reviews were conducted to inform a particular guideline or policy decision or to fulfill the needs of a specific evidence user. One-third of reviews were funded by either government agencies or institutions (e.g., hospitals, universities) and a very small minority were funded by marketing/advocacy organizations or food companies. Only 10 reviews declared any conflicts of interest. Reviews most frequently reported on foods or beverages and cancer morbidity and mortality. Only a minority of reviews studied surrogate outcomes. Reviews included a median of 15 primary studies and approximately 200,000 participants.

TABLE 1.

General characteristics of systematic reviews1

Reviews (n = 150)
Journal
 General nutrition journal (journals with only a nutrition focus, e.g., The American Journal of Clinical Nutrition) 61 (40.7)
 Specialized nutrition journal (journals with a focus on nutrition and a specific disease area, e.g., Nutrition, Metabolism and Cardiovascular Diseases) 7 (4.7)
 General medical journal (e.g., Lancet) 28 (18.7)
 Specialized medical journal (e.g., Clinical Breast Cancer) 54 (36.0)
Country of primary affiliation of corresponding author
 North America 14 (9.3)
 Europe 43 (28.7)
 Oceania 13 (8.7)
 Middle East 28 (18.7)
 Asia 49 (32.7)
 South America 3 (2.0)
Was the review conducted to inform a particular guideline or policy decision or to fulfill the needs of a particular evidence user?
 Yes 6 (4.0)
 No 144 (96.0)
Funding2
 Government support 56 (37.3)
 Institutional support 34 (22.7)
 Private not-for-profit foundation 20 (13.3)
 Food marketing/advocacy organizations 4 (2.7)
 Food companies 2 (1.3)
 No funding 32 (21.3)
 Not reported 34 (22.7)
Did the authors declare any conflicts of interest?
 Yes 10 (6.7)
 No 135 (90.0)
 Not reported 5 (3.3)
Exposures2
 Micronutrient 27 (18.0)
 Macronutrient 24 (16.0)
 Bioactive compounds 15 (10.0)
 Food or beverage 60 (40.0)
 Food group 21 (14.0)
 Dietary pattern 49 (32.7)
 Nonnutritive components of foods/beverages 25 (18.7)
Outcomes2
 Cardiometabolic morbidity or mortality 26 (17.3)
 Cancer morbidity or mortality 54 (36.0)
 Diseases of the digestive system 10 (6.7)
 All-cause mortality 9 (6.0)
 Anthropometric measures 8 (5.3)
 Surrogate outcomes 17 (11.3)
 Other 55 (36.7)
Primary studies, n 15 [11–23]
Participants, n 208,117 [84,951–510,954]
1

Values are n (%) or median [IQR].

2

Each review can be classified in >1 category.

Methodological characteristics of systematic reviews

Table 2 presents methodological characteristics of reviews and Supplemental Table 4 presents additional details and examples. The majority of reviews did not report preregistration or publication of a protocol. Among those that did, nearly half had unexplained deviations from the protocol, most commonly discrepancies in subgroup analyses (n = 9; 60.0%), eligibility criteria (n = 8; 53.3%), and databases searched (n = 4; 26.7%). The majority of reviews (n = 138; 92.0%) searched ≥2 of the following high-yield databases: MEDLINE/PubMed, EMBASE, Scopus, and Web of Science. Almost one-third of reviews (n = 42; 28.0%) did not report a replicable search strategy (i.e., the search syntax) for ≥1 database and only 14 reviews searched for unpublished data (i.e., conference abstracts, dissertations, expert contact, protocol registries). Less than one-third of reviews (n = 40; 26.7%) conducted screening, data extraction, and assessment of risk of bias in duplicate. Three-quarters of reviews conducted ≥1 meta-analyses. Among reviews that did not conduct meta-analysis, only a minority presented a tabular or graphical summary of quantitative results of primary studies and less than half explained why meta-analysis was not performed.

TABLE 2.

Methodological characteristics of systematic reviews1

Reviews (n = 150)
Did the review cite a reporting guideline?2
 PRISMA 83 (55.3)
 MOOSE 23 (15.3)
 None 45 (30.0)
Did the review cite a protocol?
 Yes 30 (20.0)
 No 120 (80.0)
Among reviews with a protocol (n = 30; 20.0%), were there deviations from the methods described in the protocol?
 Yes, and deviations were explained 2 (6.7)
 Yes, but deviations were not explained 13 (43.3)
 The protocol was not publicly accessible 5 (16.7)
 No 10 (33.3)
Eligible study designs2
 Cohort 146 (97.3)
 Case-control 97 (64.7)
 Cross-sectional 80 (53.3)
 Randomized controlled trials 74 (49.3)
Databases searched, n 3 [2–3]
Databases searched2
 MEDLINE/PubMed 150 (100)
 EMBASE 98 (65.3)
 Web of Science 63 (42.0)
 Cochrane CENTRAL 52 (34.7)
 Scopus 45 (30.0)
 Google Scholar 17 (11.3)
 CINAHL 15 (10.0)
 Other 37 (24.7)
Did the review report a replicable search strategy?
 Yes, the search strategy is replicable 108 (72.0)
 No, but key terms are reported 35 (23.3)
 No 7 (4.7)
Language restrictions?
 Yes 75 (50.0)
 No 75 (50.0)
Did the review search for unpublished data (i.e., conference abstracts, dissertations, unpublished studies, partially published studies,expert solicitation)?
 Yes 14 (9.3)
 No 136 (90.7)
Method for screening of studies for eligibility
 Completed in duplicate or more 106 (70.7)
 Completed by 1 reviewer 4 (2.7)
 Completed by 1 reviewer and a subset verified by a second reviewer 3 (2.0)
 Completed by 1 reviewer with uncertainties verified by a second reviewer 1 (0.7)
 Not reported 36 (24.0)
Method for data extraction from primary studies
 Completed in duplicate 88 (58.7)
 Completed by 1 reviewer 3 (2.0)
 Completed by 1 reviewer and verified by a second reviewer 10 (6.7)
 Completed by 1 reviewer with uncertainties verified by a second reviewer 1 (0.7)
 Not reported 48 (32.0)
Method for the assessment of risk of bias among reviews that assessed risk of bias (n = 131; 87.3%)
 Completed in duplicate or more 69 (52.7)
 Completed by 1 reviewer and verified by a second reviewer 1 (0.7)
 Not reported 61 (46.7)
Method for the synthesis of results
 Meta-analysis 115 (76.7)
 Narrative 21 (14.0)
 Tabular/graphical summary of quantitative results 14 (9.3)
Among reviews without meta-analysis (n = 35; 23.3%), was the decision to not perform meta-analysis explained in the review article?
 Yes 14 (40.0)
 No 21 (60.0)
1

Values are n (%) or median [IQR]. CENTRAL, Cochrane Central Register of Controlled Trials; CINAHL, Cumulative Index to Nursing and Allied Health Literature; MOOSE, Meta-analyses Of Observational Studies in Epidemiology; PRISMA, Preferred Reporting Items for Systematic Reviews and Meta-Analyses.

2

Each review can be classified in >1 category.

Characteristics of meta-analyses and analytic results

Table 3 presents characteristics of meta-analyses and their results and Supplemental Table 5 presents additional details and examples. All meta-analyses included only aggregate study-level data and none included individual participant data. None of the meta-analyses pooled effect estimates from substitution models (i.e., models that estimate the effect of substituting one exposure for another) or joint analyses (i.e., analyses that compare outcomes between participants grouped based on their level of consumption of ≥2 exposures). Among reviews that conducted >1 type of meta-analysis (e.g., meta-analysis of extreme categories and dose-response meta-analysis) or used >1 meta-analytic model (i.e., fixed-effect and random-effects meta-analysis), only 1 review explicitly specified the primary meta-analytic method. Among reviews that did not explicitly specify a primary meta-analytic method, we assumed that the method for which results were first presented in the results section was the primary one. Based on this assumption, the primary meta-analytic method was most frequently a random-effects meta-analysis comparing extreme categories of exposure. One-quarter of reviews selected the meta-analytic model based on a test for statistical heterogeneity or the magnitude of observed heterogeneity. Among reviews for which dose-response meta-analysis would be informative, only half presented a dose-response meta-analysis. More than half of reviews (n = 67; 58.3%) did not conduct dose-response meta-analysis, and this decision was justified, either by the authors or based on the question being investigated, in only one-quarter. Almost one-fifth of reviews included multiple effect estimates from the same study population in meta-analyses (i.e., double-counting studies), and a similar proportion misestimated heterogeneity by pooling stratified data from the same study in the main meta-analysis.
More than one-quarter of meta-analyses reported very small (relative effect ≤1.1) or small (1.1 < relative effect ≤1.5), but statistically significant, effects (i.e., RR, OR, HR). Heterogeneity was moderate or substantial (I2 > 50%) in over half of meta-analyses (median I2: 60%; IQR: 31%–75%). Nearly all meta-analyses reported ≥1 subgroup analysis, but subgroup analyses were prespecified in fewer than one-fifth of reviews. The majority of meta-analyses tested for small study effects, over one-third of which found evidence of small study effects.
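The I2 statistic used throughout these results is conventionally derived from Cochran's Q. As a point of reference (this is the standard Higgins–Thompson definition, not code from the review), a minimal sketch:

```python
def i_squared(q: float, k: int) -> float:
    """I^2 (%) from Cochran's Q across k studies: the proportion of total
    variability in effect estimates attributable to between-study heterogeneity,
    truncated at 0 when Q falls below its degrees of freedom (k - 1)."""
    if q <= 0:
        return 0.0
    df = k - 1
    return max(0.0, (q - df) / q * 100)

# e.g., Q = 25 across k = 11 studies:
print(round(i_squared(25, 11)))  # 60 -- the median I2 reported in this sample
```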

TABLE 3.

Characteristics of meta-analyses and analytic results1

Meta-analyses (n = 115)
Among reviews that used >1 method (i.e., model or type) of meta-analysis (n = 49; 42.6%), was the primary meta-analytic method explicitly defined?
 Yes 1 (2.0)
 No 48 (98.0)
Primary model for meta-analysis
 Random-effects 84 (73.0)
 Fixed-effect 1 (0.9)
 Random-effects with significant or substantial heterogeneity, fixed-effect otherwise 30 (26.1)
Primary type of meta-analysis
 Meta-analysis of extreme categories of intake 87 (75.7)
 Dose-response meta-analysis 8 (7.0)
 Meta-analysis of specific dose categories 11 (9.6)
 Other 9 (7.8)
Secondary model for meta-analysis among reviews that used >1 method (i.e., model or type) for meta-analysis (n = 49; 42.6%)
 Random-effects 36 (73.5)
 Fixed-effect 3 (6.1)
 Random-effects with significant or substantial heterogeneity, fixed-effect otherwise 10 (20.4)
Secondary type for meta-analysis among reviews that used >1 method (i.e., model or type) for meta-analysis (n = 49; 42.6%)2
 Meta-analysis of extreme categories of intake 3 (6.1)
 Dose-response meta-analysis 31 (63.3)
 Meta-analysis of specific dose categories 6 (12.2)
 Other 4 (8.2)
Among reviews that did not conduct dose-response meta-analysis (n = 67; 58.3%), was the decision to not conduct dose-response meta-analysis justified, either by the authors in the report or based on the question being investigated?
 Yes 17 (25.4)
 No 50 (74.6)
Were any subgroup analyses reported?
 Yes 103 (89.6)
 No 12 (10.4)
Subgroup analyses per review among reviews with subgroup analyses (n = 103; 89.6%), n 4 [2–7]
Among reviews with subgroup analyses (n = 103; 89.6%), were subgroup analyses prespecified?
 Yes, all subgroups were prespecified 6 (5.8)
 Yes, some were prespecified and others were post hoc 8 (7.8)
 No 4 (3.9)
 Not reported 85 (82.5)
Study designs pooled in primary meta-analysis
 Cohorts 32 (27.8)
 Case-control 5 (4.3)
 Cross-sectional 7 (6.1)
 RCTs + cohorts 1 (0.9)
 RCTs + cohorts + case-control 1 (0.9)
 Cohorts + case-control 30 (26.1)
 Cohorts + cross-sectional 9 (7.8)
 Cohorts + case-control + cross-sectional 13 (11.3)
 Case-control + cross-sectional 3 (2.6)
 Not reported 14 (12.2)
Among meta-analyses that included different study designs (n = 57; 49.6%), did the review present subgroup analyses by study design?
 Yes 48 (84.2)
 No 9 (15.8)
Test for small study effects2
 Egger's test 82 (71.3)
 Visual inspection of funnel plot 72 (62.6)
 Begg's test 35 (30.4)
 No test for small study effects 15 (13.0)
Among meta-analyses that tested for small study effects (n = 100; 87.0%), was there evidence of small study effects?
 Yes 38 (38.0)
 No 60 (60.0)
 Not reported 2 (2.0)
Among meta-analyses with evidence of small study effects (n = 38; 33.0%), were results adjusted for small study effects?
 Yes, using trim and fill (28) 15 (39.5)
 Yes, a study was excluded 2 (5.3)
 No 21 (55.3)
Other analytic errors and suboptimal practices2
 Misestimation of heterogeneity due to the pooling of stratified data in the main meta-analysis 21 (18.3)
 Double-counting of studies in meta-analyses 20 (17.4)
Studies included in the primary meta-analysis, n 10 [6–14]
Effect size of the primary meta-analysis among meta-analyses with dichotomous outcomes (n = 110; 95.6%)3
 Very small or no effect (relative effect of 1.0–1.1) 32 (29.0)
 Small (relative effect of 1.1–1.5) 68 (61.8)
 Moderate (relative effect of 1.51–2.00) 8 (7.3)
 Large (relative effect > 2.01) 2 (1.8)
Was the primary meta-analysis statistically significant?
 Yes 79 (68.7)
 No 36 (31.3)
Magnitude of heterogeneity (I2) in the primary meta-analysis
 <25% 23 (20)
 25 to <50% 18 (15.7)
 50 to <75% 43 (37.4)
 75% to 100% 28 (24.3)
 Not reported 3 (2.6)
1

Values are n (%) or median [IQR]. RCT, randomized controlled trial.

2

Each review can be classified in >1 category.

3

We converted effect estimates so that increasing levels of exposure were associated with increasing risk of the outcome.

Reporting and interpretation of findings in systematic reviews

Table 4 presents characteristics related to the reporting and interpretation of findings of systematic reviews and Supplemental Table 6 presents additional details and examples. Only 5 reviews reported absolute effects. Only 1 in 10 reviews evaluated the certainty of evidence using a formal system. The most commonly used approach was GRADE, followed by NutriGrade (29, 30). Two reviews made errors in the application or interpretation of GRADE: both reviews failed to initially rate evidence from nonrandomized studies at low certainty (31, 32) and 1 review rated up the certainty of evidence for a large effect despite ORs only ranging between 0.6 and 0.8 (31). In their interpretation of findings, most reviews did not discuss risk of bias (the validity of studies and the risk that they may overestimate or underestimate the true effects), consistency (consistency of results across studies), imprecision (random error due to insufficient sample size or number of events), indirectness (differences between the populations, interventions, and outcomes of interest and those investigated in studies), or publication bias (distortion of results caused by the tendency of authors to submit, reviewers to approve, and editors to publish articles containing “positive” findings) (33). More than two-thirds of reviews concluded that the certainty of evidence was insufficient to draw meaningful conclusions regarding the effects of the exposure.

TABLE 4.

Interpretation of results in systematic reviews1

Reviews (n = 150)
Among reviews with dichotomous outcomes (n = 144; 96.0%), were absolute effects reported?
 Yes 5 (3.5)
 No 139 (96.5)
Did the review evaluate the certainty of evidence using a formal system?
 Yes, using GRADE (29) 9 (6.0)
 Yes, using NutriGrade (30) 2 (1.3)
 Yes, using SIGN (40) 1 (0.7)
 Yes, using the NHMRC FORM methodology (41) 1 (0.7)
 Yes, using a modified version of the American Diabetes Association system (42) 1 (0.7)
 Yes, using a modified version of the National Osteoporosis Foundation evidence grading system (43) 1 (0.7)
 Yes, using an ad hoc system 1 (0.7)
 No 134 (89.3)
Lowest-certainty evidence presented among reviews using GRADE (n = 9; 6.0%)
 Very low 7 (77.8)
 Low 0 (0.0)
 Moderate 1 (11.1)
 High 0 (0.0)
 Not reported 1 (11.1)
Highest-certainty evidence presented among reviews using GRADE (n = 9; 6.0%)
 Very low 2 (22.2)
 Low 1 (11.1)
 Moderate 5 (55.6)
 High 0 (0.0)
 Not reported 1 (11.1)
Did the review consider risk of bias of primary studies in the interpretation of results?
 Yes, risk of bias is acknowledged as a limitation 12 (8.0)
 Yes, bias is described as unlikely to have affected findings 13 (8.7)
 No, risk of bias is not discussed 125 (83.3)
Did the review consider consistency in the interpretation of results?
 Yes, consistency across primary studies is used to support findings 13 (8.7)
 Yes, inconsistency across primary studies is acknowledged or described as a limitation 71 (47.3)
 No, consistency is not discussed 66 (44.0)
Did the review consider directness in the interpretation of results?
 Yes, directness across primary studies is used to support findings 3 (2.0)
 Yes, indirectness across primary studies is acknowledged or described as a limitation 49 (32.7)
 No, indirectness is not discussed 98 (65.3)
Did the review consider precision in the interpretation of results?
 Yes, precise results, large sample size, or a large number of events is used to support findings 24 (16.0)
 Yes, imprecision across primary studies is acknowledged or described as a limitation 54 (36.0)
 No, imprecision is not discussed 72 (48.0)
Did the review consider the potential for publication bias in the interpretation of results?
 Yes, the potential for publication bias is acknowledged as a limitation 28 (18.7)
 Yes, publication bias is described as unlikely to have affected findings 20 (13.3)
 No, publication bias is not discussed 102 (68.0)
What is the final conclusion of the review?
 The review draws definitive conclusions regarding the effects of the nutritional exposure 11 (7.3)
 The review draws some conclusions regarding the effects of the nutritional exposure on the outcome of interest but suggests that additional evidence is still needed to draw more definitive conclusions 35 (23.3)
 The review suggests that the current evidence is of too low certainty to draw any conclusions on the effects of the nutritional exposure on the outcome of interest 104 (69.3)
1

Values are n (%). GRADE, Grading of Recommendations Assessment, Development, and Evaluation; NHMRC, National Health and Medical Research Council; SIGN, Scottish Intercollegiate Guidelines Network.

Risk of bias of systematic reviews

Table 5 presents the risk of bias of systematic reviews. All reviews had ≥1 domains at high concern. More than three-quarters of reviews had important limitations related to study eligibility criteria, primarily due to the lack of prespecification of eligibility criteria. Nearly all reviews had important limitations related to the identification and selection of studies, data collection and study appraisal, and synthesis and findings, primarily due to failure to search for unpublished data, use of inappropriate criteria for assessment of risk of bias, and lack of prespecified analyses, respectively. All reviews were rated at “high risk of bias” overall, because of the aforementioned limitations and because these limitations and their impact were not acknowledged in the interpretation of review findings.

TABLE 5.

Results from the application of the ROBIS tool to systematic reviews

Studies, n (%)
Concerns related to study eligibility criteria
 High 122 (81.3)
 Low 28 (18.7)
Concerns related to identification and selection of studies
 High 145 (96.7)
 Low 5 (3.3)
Concerns related to data collection and study appraisal
 High 150 (100.0)
 Low 0 (0.0)
Concerns related to synthesis and findings
 High 145 (96.7)
 Low 5 (3.3)

Region of corresponding author

Supplemental Tables 7–11 present results stratified by the region of the corresponding author's primary affiliation (i.e., West, including Europe, North America, and Oceania; Asia; and Middle East). We did not observe any appreciable differences in review characteristics or quality between the 3 regions, although our sample size for each region was small.

Discussion

Main findings

Our study shows that systematic reviews of nonrandomized, observational studies of the health effects of nutritional exposures, termed nutritional epidemiology studies, often have important limitations. More than half of reviews studied the effects of single foods or food compounds without considering dietary patterns (34), substitutions (35, 36), or joint effects (36). This contrasts with recent efforts to move away from the reductionist approach to nutrition research, an approach that is discouraged because it ignores the potential for the effects of a nutritional exposure to differ depending on the foods or food compounds consumed in its place (13, 35, 37).

Most reviews had important limitations related to their search strategy, selection of studies, and extraction of data. Fewer than one-quarter of reviews reported preregistration or publication of a protocol, a practice that protects against reporting bias and reviewers' methodological decisions being influenced by the observed results (38, 39). Nearly one-quarter of reviews did not report a replicable search strategy, which precludes evidence users from replicating or updating the review or assessing the comprehensiveness of the search. Only a handful of reviews attempted to search for unpublished data, which risks the perpetuation of reporting bias in the literature, a problem that is likely already highly prevalent in nutritional epidemiology owing to the lack of standard registration practices for study protocols and statistical analysis plans (44–47). Fewer than one-third of reviews conducted screening, data extraction, and assessment of risk of bias in duplicate, which empirical evidence shows is important for reducing errors (48–50).

Nearly all reviews included ≥1 suboptimal practices or errors related to the synthesis of results. One-quarter of reviews, for example, did not conduct a meta-analysis, and fewer than half of these justified that decision. Review authors may choose not to conduct a meta-analysis if they lack sufficient expertise or resources, although a narrative review of a topic with sufficient evidence for quantitative synthesis is less useful to evidence users (51). Among reviews that investigated questions for which dose-response meta-analysis would be appropriate and informative, only half conducted one. This may be because dose-response meta-analysis, compared with meta-analysis of extreme categories of exposure, requires the collection of additional data (which may not always be reported in primary studies) and greater statistical expertise (9, 10). When dose-response meta-analysis was presented, it was almost always secondary to meta-analysis comparing extreme categories of exposure. The latter is less useful and may even give misleading results, particularly when the relation between the exposure and outcome is nonlinear or when the difference in the magnitude of exposure across extreme categories is unrealistic or unattainable for patients or the public (e.g., comparing the health effects of <1 serving/d of fruits and vegetables with those of 10 servings/d) (11, 12). Despite prespecification being an important determinant of the validity of subgroup analyses (52, 53), subgroup analyses were seldom prespecified.
Other common suboptimal analytic practices included the selection of the meta-analytic model (i.e., random-effects compared with fixed-effect) based on a statistical test of heterogeneity or the observed magnitude of heterogeneity [a practice that is strongly discouraged because of the low reliability of tests and statistics of heterogeneity (54, 55)], the pooling of stratified data in the main meta-analysis [a practice that can lead to the misestimation of heterogeneity (56)], and the double-counting of studies [a practice that produces spurious precision (57)].

Reviews often had significant limitations related to the reporting and interpretation of findings. For example, reviews seldom reported absolute effects (e.g., risk differences), despite these being essential for informed decision-making (58–60). Only a handful of reviews used a formal system to evaluate the certainty of evidence, and possibly as a result, important factors in the interpretation of the evidence were often neglected (61, 62).

Finally, few systematic reviews were able to confidently draw conclusions regarding the effects of the exposure on the outcome of interest, largely owing to problems inherent in nutritional epidemiology studies (e.g., the potential for confounding and biases in the measurement of nutritional exposures).

Relation to previous work

Our findings are consistent with previous studies that have reported on the quality of systematic reviews in the general biomedical literature and in specific health fields (6–8, 63). Page et al. (6), for example, found methodological and reporting limitations to be common in a sample of systematic reviews indexed in MEDLINE. Similar to our findings, major issues included failure to prespecify methods in a protocol, failure to report a replicable search strategy, and errors in the application and interpretation of statistical analyses (6–8).

Previous studies have also reported on the scope and quality of systematic reviews in nutrition and have also identified important deficiencies (19, 20). Deficiencies were more common in our sample than in previous studies (e.g., 91% and 100% followed a priori protocols in previous studies compared with 19.3% of reviews in our study) (19, 20)—likely because previous studies have primarily evaluated Cochrane reviews, which must meet established standards before publication, and reviews of RCTs of nutritional interventions, which are generally more straightforward than reviews of epidemiologic studies (15). Overviews of reviews (“umbrella reviews”) that have assessed the methodological quality of reviews of nonrandomized studies in nutrition have generally reported reviews to be of variable quality, with an appreciable proportion of reviews including important methodological limitations, although such overviews have not found as many or as extensive methodological problems as we identified in our study (6467). Differences between the results of overviews and our study may be due to overviews not including an assessment of the quality of reviews that is as comprehensive or detailed as is presented here. Further, quality assessments presented by such overviews are only representative of reviews addressing a particular question and may not be representative of all nutritional epidemiology reviews.

Implications and recommendations

Given that reviews of nutritional epidemiology studies often have important limitations, evidence users should be cautious when interpreting and applying their results. Based on the findings of our study and based on prevailing guidance on the conduct of rigorous systematic reviews and meta-analyses (15, 68), we have compiled a list of recommendations for review authors that may improve the quality of future reviews of nutritional epidemiology studies (presented below). We encourage journal editors and peer reviewers to also be mindful of our recommendations because many of the issues we describe can be addressed at the peer review stage (e.g., the double-counting of studies can be detected from the list of studies included in meta-analyses or from forest plots).

Recommendations for authors of systematic reviews of nutritional epidemiology studies

Planning a review

1)  Consider whether a systematic review of nutritional epidemiology studies is useful. Such reviews usually only provide low certainty evidence, primarily owing to limitations inherent in the design of primary nutritional epidemiology studies (e.g., confounding, biases in the measurement of nutritional exposures, and selective reporting) (2, 29, 46, 69–73). Because of these concerns, some evidence users may consider prioritizing other types of evidence, such as evidence from RCTs, when available. Frequently, however, evidence from nutritional RCTs may also be low certainty, usually because nutritional RCTs typically only address surrogate outcomes (rather than patient-important outcomes), participants do not adhere to assigned interventions, or there is high attrition due to the need for long follow-up. When evidence from RCTs is also low certainty, reviewing evidence from nutritional epidemiology studies may be more consequential (74).

2)  Choose between conducting a systematic review and meta-analysis of the published literature or a meta-analysis of individual participant data using ≥1 harmonized data sets (e.g., the Pooling Project). The latter allows standardization of analyses across studies and reduces the effects of publication bias on results but requires accessing primary data, which may not always be possible, and usually demands more time (75–77).

3)  When the exposure of interest is a single food or food compound, to avoid overlooking potentially differential effects of the exposure of interest depending on the foods or food compounds that are consumed instead of the exposure, consider collecting and meta-analyzing effects from substitution or joint models, when such models are reported (13, 35). It is important to be mindful, however, of the potential challenges of this type of analysis. Foods may be grouped together, for example, in ways that are too different to allow meaningful pooling (e.g., unit of substitution, energy adjustment). Some of these issues may be overcome by individual participant data meta-analysis, which allows greater standardization of analyses across studies (36, 76).

4)  To avoid unnecessary duplication, search electronic databases and repositories (e.g., PROSPERO, Open Science Framework) for existing and ongoing systematic reviews that address the question of interest. Avoid undertaking a new review if rigorous and sufficiently up-to-date reviews addressing the question already exist. If an existing review is not up to date, updating it may be more efficient than conducting a review de novo (78).

5)  Register a protocol that includes a detailed account of all decisions in the review process, including the review question; study eligibility criteria; the search strategy; methods for data collection and evaluation of risk of bias; methods for the synthesis of results across studies; methods for the assessment of heterogeneity and publication bias; planned subgroup and sensitivity analyses and their justification and anticipated direction of effect; and criteria for the rating of the certainty of evidence (38, 39).

Data collection and assessment

6)  Work with a research librarian or information specialist to devise the search strategy. At minimum, search MEDLINE and EMBASE, or databases with similar coverage, such as Scopus (79). Report the full search strategy with sufficient detail to allow replication (80). Search study registries, bibliographies of included studies, and abstracts from relevant meetings and conferences, and consider soliciting investigators for unpublished data (81).

7)  Screen studies, collect data, and assess risk of bias in duplicate, using a tool that addresses all potential sources of bias in such studies (i.e., bias due to confounding, selection of participants into the study, classification of exposure, deviations from the intended exposures, missing data, measurement of outcomes, and selection of the reported results) (82–84). Resolve any discrepancies by discussion or by consultation with a third party.

Synthesis and evaluation

8)  Conduct a meta-analysis when possible (15). Teams without training in meta-analysis should enlist the help of a statistician (51). Use a random-effects model to meta-analyze results, unless there are too few studies to reliably estimate between-study heterogeneity or unless there are compelling reasons to believe that a fixed-effect model may be preferable (55). When the question of interest is the relation between the quantity of intake of a nutritional exposure and health outcomes, conduct a dose-response meta-analysis as the primary meta-analytic method (9, 10, 12). Consider whether the relation between the exposure and health outcome of interest may be nonlinear. In most cases, it is conceivable, and potentially expected, for the relation to be nonlinear (11). In such cases, consider a nonlinear dose-response meta-analysis via either splines or polynomials (9). When meta-analysis is not possible or appropriate, present effect estimates and CIs (or some other measure of variability) for all primary studies.
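The random-effects approach recommended above can be illustrated with a minimal sketch (not from the article; the function name and inputs are hypothetical) of DerSimonian-Laird pooling of study-level log relative risks:

```python
import math

def random_effects_meta(log_rrs, ses):
    """DerSimonian-Laird random-effects pooling of log relative risks.

    log_rrs: per-study log relative risks; ses: their standard errors.
    Returns (pooled log RR, its standard error, between-study variance tau^2).
    """
    k = len(log_rrs)
    w = [1.0 / se ** 2 for se in ses]  # fixed-effect (inverse-variance) weights
    fixed = sum(wi * y for wi, y in zip(w, log_rrs)) / sum(w)
    q = sum(wi * (y - fixed) ** 2 for wi, y in zip(w, log_rrs))  # Cochran's Q
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (k - 1)) / c)  # method-of-moments estimate of tau^2
    w_re = [1.0 / (se ** 2 + tau2) for se in ses]  # random-effects weights
    pooled = sum(wi * y for wi, y in zip(w_re, log_rrs)) / sum(w_re)
    return pooled, math.sqrt(1.0 / sum(w_re)), tau2
```

Exponentiating the pooled estimate and its 95% CI limits (pooled ± 1.96 × SE) yields the summary RR and CI. Note that with very few studies, tau² is estimated unreliably, which is one reason to be cautious about basing model choice on heterogeneity statistics.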

9)  Include only 1 effect estimate from each study in the meta-analysis (57). If a study reports multiple effect estimates at various points of follow-up, in most cases the estimate from the longest follow-up is preferred because it should include the greatest number of events and hence provide the greatest statistical power. If a study reports separate effect estimates corresponding to subtypes of the exposure of interest (e.g., almonds, walnuts, and pistachios when the exposure of interest is tree nuts) or subtypes of the outcome of interest (e.g., death from myocardial infarction and death from stroke when the outcome of interest is cardiovascular mortality), use a predefined rule to select only 1 effect estimate for meta-analysis. Alternatively, use more complex meta-analytic methods that can handle dependence among effect estimates; we refer the reader to other sources that describe these methods and their implementation in common statistical software (85, 86). If a study reports results stratified across ≥1 baseline characteristics (e.g., sex), first meta-analyze the effect estimates across strata using a fixed-effect model and then meta-analyze the result with the other studies; this is necessary to correctly estimate heterogeneity.
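The within-study step in recommendation 9 (fixed-effect pooling of stratum-specific estimates before the across-study meta-analysis) can be sketched as follows; this is a hypothetical helper, not code from the article:

```python
import math

def pool_strata_fixed(stratum_log_rrs, stratum_ses):
    """Fixed-effect (inverse-variance) pooling of stratum-specific estimates
    (e.g., results reported separately for men and women) so that the study
    contributes a single estimate to the across-study meta-analysis.

    Returns (pooled log RR for the study, its standard error).
    """
    w = [1.0 / se ** 2 for se in stratum_ses]  # inverse-variance weights
    pooled = sum(wi * y for wi, y in zip(w, stratum_log_rrs)) / sum(w)
    return pooled, math.sqrt(1.0 / sum(w))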

10)  Calculate, report, and interpret absolute effects (e.g., number needed to treat/harm, risk difference) (87, 88). Although review authors should typically meta-analyze relative effects because they tend to be similar across populations, optimal decision making requires knowledge of the absolute magnitude of desirable and undesirable effects (58–60).
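As a sketch of recommendation 10, converting a pooled relative risk to absolute terms requires an assumed baseline risk in the population of interest; the helper below is hypothetical, not from the article:

```python
def absolute_effect(rr, baseline_risk):
    """Convert a relative risk to an absolute risk difference and the
    number needed to treat (or harm) at an assumed baseline risk."""
    rd = baseline_risk * (rr - 1.0)  # risk difference; >0 indicates excess risk
    nnt = float("inf") if rd == 0 else 1.0 / abs(rd)
    return rd, nnt
```

For example, a pooled RR of 1.25 applied to a 10-year baseline risk of 8% corresponds to an absolute risk increase of 2 percentage points, a far more interpretable quantity for decision makers than the relative effect alone.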

11)  Use a rigorous and transparent system to evaluate the certainty of evidence (29, 89). One such system is the GRADE approach, which is based on comprehensive methodology that has been described in detail in a series of 8 BMJ publications and, thus far, 30 publications in the Journal of Clinical Epidemiology and which has been adopted by >110 international organizations (29), including the Cochrane Collaboration and the WHO, each of which regularly apply GRADE to nutritional questions. The application of the GRADE approach facilitates the consideration of important criteria that bear on the certainty of evidence, including risk of bias, inconsistency, indirectness, imprecision, and publication bias, and improves transparency in making and communicating judgments about the certainty of evidence. Review authors should keep in mind, however, that even with the application of the GRADE approach, evaluating the certainty of evidence is a subjective process.

Recently, the appropriateness of the application of the GRADE approach to nutrition has been questioned and alternative systems for evaluating the certainty of evidence have been proposed (30, 69, 90–92). Criticism of the GRADE approach has largely centered on the challenges of conducting RCTs in nutrition, because of which most of the evidence comes from nonrandomized studies that are typically rated at low certainty (91–93). We argue, however, that the challenges of conducting RCTs in nutrition should not increase our confidence in findings of nonrandomized studies and that it is preferable to acknowledge the limitations of a body of evidence, even if it is not possible to generate any higher-certainty evidence (2, 69). We also argue that there are important merits to maintaining consistent standards for evaluating the certainty of evidence across health fields, such as allowing evidence users to compare the certainty of evidence of different types of interventions for a particular clinical or public health problem (e.g., lifestyle, nutritional, or surgical interventions for obesity) (2, 69). To maintain consistency of the criteria applied to evaluate the certainty of evidence across health fields, we encourage the use of the GRADE approach. We acknowledge, however, that these criteria are not perfect, that there is room for additional methodological developments to improve these criteria, and that these criteria will not be acceptable to all review authors. GRADE may, for example, benefit from additional guidance for review authors to make judgments on when there is high-certainty evidence of no effect based on evidence from nonrandomized studies. Review authors may consider alternative systems for evaluating the certainty of evidence, such as NutriGrade, Hierarchies of Evidence Applied to Lifestyle Medicine (HEALM), or the World Cancer Research Fund criteria (30, 90, 94).

There have also been recent advances in GRADE methodology and related implications for reviews of nutritional epidemiology studies. The original GRADE guidance called for all evidence from nonrandomized studies to start at low certainty evidence, after which the certainty of evidence may have been further downgraded owing to risk of bias, inconsistency, indirectness, imprecision, and publication bias or upgraded when a credible dose-response gradient or a large effect was observed or when all plausible confounders would produce an effect in the opposite direction than the effect that is observed (23, 33). The guidance has been updated so that evidence from nonrandomized studies may start at high certainty if residual confounding is considered under the domain of risk of bias (93). With both approaches, evidence from nonrandomized studies almost always ends up being rated at low or very low certainty (93).

Reporting

12)  Use the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) (16) or Meta-analyses Of Observational Studies in Epidemiology (MOOSE) (95) reporting checklists to ensure that all critical information is reported in the review article. Both the PRISMA and MOOSE checklists recommend reporting the effect estimates (e.g., mean difference, RR) and the associated measures of variance (e.g., CIs) from each primary study, ideally within a forest plot. Describe any deviations from the original review protocol and explain why they were necessary, including those that came about during the peer review process.

13)  A final issue is the language that authors use to communicate review findings. Noting the limitations of nonrandomized studies for supporting causal inferences, journals often recommend that authors avoid causal language when describing their results (96). Often, however, the objective of nonrandomized studies addressing modifiable risk factors, including nutritional epidemiology studies and reviews of such studies, is to infer causal relations rather than associations (97). Avoiding causal language may obscure this intent.

An alternative approach is for review authors to use causal language to describe their results—language that is consistent with their causal objectives. Authors using causal language for nonrandomized studies must, however, point out what will most often be the case: that the certainty of evidence is low or very low. Use of the GRADE system, and its recommended language for communicating low- or very-low-certainty evidence, greatly facilitates this approach (98). For example, when communicating the results of a review of nonrandomized studies addressing the relation between sugar-sweetened beverages and hypertension, the authors may conclude that sugar-sweetened beverages may increase the risk of hypertension but that the certainty of evidence is low (99).

Strengths and limitations

The strengths of our study include the duplicate assessment of review eligibility and collection of data, the evaluation of systematic reviews using widely accepted indicators of methodological quality, and the consideration of issues unique to reviews of nutritional epidemiology studies.

Our study is limited by the subjectivity involved in the assessment of various aspects of review quality. We attempted to reduce subjectivity by providing reviewers with detailed instructions for each of the items in our data collection form and by conducting extensive calibration exercises.

Although we identified many deficiencies and errors in the conduct and analyses of reviews, the extent to which such issues may have affected findings is unclear. Investigating this question would require replicating a sample of reviews using optimal methodology, which we considered outside the scope of this study. Empirical evidence, however, suggests that the types of deficiencies and errors we identified have the potential to produce misleading results (e.g., 12, 48, 82, 100–102). Similarly, the extent to which methodological limitations of reviews adversely affect clinical decisions, recommendations, and policy actions is unclear. It is possible that evidence users rely on the most rigorous systematic reviews to guide decisions, recommendations, and policies, in which case the impact of these issues may be negligible beyond producing research waste.

Our assessment of quality was dependent on the reporting of reviews, which is conceptually distinct from methodological rigor, although complete reporting is necessary to optimally appraise methodology. It is possible that some reviews failed to report important aspects of their methods, such as the registration of a protocol or the prespecification of subgroup analyses. Although reviews may be rigorous despite poor reporting, lack of good reporting leaves readers unable to interpret and confidently apply results.

When a review did not explicitly identify a primary meta-analysis, we assumed that the meta-analysis for which results were first presented in the results section was the primary one. Although not random, this is the meta-analysis first encountered by readers and was almost always the meta-analysis that was most discussed in the results and discussion sections of the review.

Our sample of reviews was drawn from only 1 timepoint (2018–2019), and the characteristics and quality of reviews published at other timepoints may differ. We expect, for example, older reviews to typically suffer from a greater number of more serious methodological limitations. Further, we focused only on reviews of observational nutritional epidemiology studies and did not include reviews of RCTs or mechanistic studies, because we considered the methods and considerations in evidence synthesis for those study types to be too different from those for nutritional epidemiology studies.

Conclusions

Systematic reviews of nutritional epidemiology studies often have serious limitations. We encourage evidence users to be mindful of these limitations. Researchers can improve the quality of future reviews by performing comprehensive literature searches, including a search for unpublished data; involving statisticians, methodologists, and researchers with substantive knowledge of the nutrition topic area; and using a rigorous and transparent system to evaluate the certainty of evidence and to draw conclusions.

Supplementary Material

nqab002_Supplemental_File

Acknowledgments

We thank Dr. Bradley Johnston and Dr. Gordon Guyatt for their valuable feedback. We acknowledge that 11 of the 16 authors listed on this article are affiliated with McMaster University, the primary institution at which the GRADE criteria were developed.

The authors’ responsibilities were as follows—DZ, JLS, JB, and RJdS: designed the study; DZ, AB, REM, IC, AG, DOL, AM, ES, KA, and DM: collected the data; DZ: analyzed the data and produced the first draft of the manuscript; DZ, JLS, SEH, JB, and RJdS: interpreted the data; DZ, TAK, VH, JLS, SEH, JB, and RJdS: provided critical revision of the manuscript for important intellectual content; and all authors: read and approved the final manuscript.

TAK has received research support from the Canadian Institutes of Health Research (CIHR), the International Life Science Institute (ILSI), and National Honey Board. RJdS has served as an external resource person to the World Health Organization’s Nutrition Guidelines Advisory Group on trans fats, saturated fats, and polyunsaturated fats. The WHO paid for his travel and accommodation to attend meetings from 2012–2017 to present and discuss this work. He has also done contract research for the Canadian Institutes of Health Research’s Institute of Nutrition, Metabolism, and Diabetes, Health Canada, and the World Health Organization for which he received remuneration. He has received speaker’s fees from the University of Toronto, and McMaster Children’s Hospital. He has held grants from the Canadian Institutes of Health Research, Canadian Foundation for Dietetic Research, Population Health Research Institute, and Hamilton Health Sciences Corporation as a principal investigator, and is a co-investigator on several funded team grants from the Canadian Institutes of Health Research. He serves as a member of the Nutrition Science Advisory Committee to Health Canada (Government of Canada), and as an independent director of the Helderleigh Foundation (Canada). 
JLS has received research support from the Canadian Foundation for Innovation, Ontario Research Fund, Province of Ontario Ministry of Research and Innovation and Science, Canadian Institutes of Health Research (CIHR), Diabetes Canada, PSI Foundation, Banting and Best Diabetes Centre (BBDC), American Society for Nutrition (ASN), INC International Nut and Dried Fruit Council Foundation, National Dried Fruit Trade Association, National Honey Board, International Life Sciences Institute (ILSI), Pulse Canada, Quaker, The Tate and Lyle Nutritional Research Fund at the University of Toronto, The Glycemic Control and Cardiovascular Disease in Type 2 Diabetes Fund at the University of Toronto (a fund established by the Alberta Pulse Growers), and The Nutrition Trialists Fund at the University of Toronto (a fund established by an inaugural donation from the Calorie Control Council). He has received in-kind food donations to support a randomized controlled trial from the Almond Board of California, California Walnut Commission, American Peanut Council, Barilla, Unilever, Upfield, Unico/Primo, Loblaw Companies, Quaker, Kellogg Canada, WhiteWave Foods, and Nutrartis. He has received travel support, speaker fees and/or honoraria from Diabetes Canada, Dairy Farmers of Canada, FoodMinds LLC, International Sweeteners Association, Nestlé, Pulse Canada, Canadian Society for Endocrinology and Metabolism (CSEM), GI Foundation, Abbott, General Mills, Biofortis, ASN, Northern Ontario School of Medicine, INC Nutrition Research & Education Foundation, European Food Safety Authority (EFSA), Comité Européen des Fabricants de Sucre (CEFS), and Physicians Committee for Responsible Medicine. He has or has had ad hoc consulting arrangements with Perkins Coie LLP, Tate & Lyle, Wirtschaftliche Vereinigung Zucker e.V., Danone, and Inquis Clinical Research. He is a member of the European Fruit Juice Association Scientific Expert Panel and Soy Nutrition Institute (SNI) Scientific Advisory Committee. 
He is on the Clinical Practice Guidelines Expert Committees of Diabetes Canada, European Association for the Study of Diabetes (EASD), Canadian Cardiovascular Society (CCS), and Obesity Canada. He serves or has served as an unpaid scientific advisor for the Food, Nutrition, and Safety Program (FNSP) and the Technical Committee on Carbohydrates of ILSI North America. He is a member of the International Carbohydrate Quality Consortium (ICQC), Executive Board Member of the Diabetes and Nutrition Study Group (DNSG) of the EASD, and Director of the Toronto 3D Knowledge Synthesis and Clinical Trials foundation. His wife is an employee of AB InBev.

Notes

The authors reported no funding received for this study. DZ was supported by a Canadian Institutes of Health Research (CIHR) Doctoral Award and a CIHR Banting Fellowship. JLS was supported by a Diabetes Canada Clinician Scientist Award.

Supplemental Tables 1–11 and Supplemental Figure 1 are available from the “Supplementary data” link in the online posting of the article and from the same link in the online table of contents at https://academic.oup.com/ajcn/.

Abbreviations used: AMSTAR, A MeaSurement Tool to Assess systematic Reviews; GRADE, Grading of Recommendations Assessment, Development, and Evaluation; MOOSE, Meta-analyses Of Observational Studies in Epidemiology; PRISMA, Preferred Reporting Items for Systematic Reviews and Meta-Analyses; RCT, randomized controlled trial.

Contributor Information

Dena Zeraatkar, Department of Health Research Methods, Evidence, and Impact, McMaster University, Hamilton, Ontario, Canada; Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA.

Arrti Bhasin, Department of Health Research Methods, Evidence, and Impact, McMaster University, Hamilton, Ontario, Canada.

Rita E Morassut, Schulich School of Medicine and Dentistry, Western University, London, Ontario, Canada.

Isabella Churchill, Department of Health Research Methods, Evidence, and Impact, McMaster University, Hamilton, Ontario, Canada.

Arnav Gupta, Department of Medicine, University of Ottawa, Ottawa, Ontario, Canada.

Daeria O Lawson, Department of Health Research Methods, Evidence, and Impact, McMaster University, Hamilton, Ontario, Canada.

Anna Miroshnychenko, Department of Health Research Methods, Evidence, and Impact, McMaster University, Hamilton, Ontario, Canada.

Emily Sirotich, Department of Health Research Methods, Evidence, and Impact, McMaster University, Hamilton, Ontario, Canada.

Komal Aryal, Department of Health Research Methods, Evidence, and Impact, McMaster University, Hamilton, Ontario, Canada.

David Mikhail, Faculty of Science, McMaster University, Hamilton, Ontario, Canada.

Tauseef A Khan, Department of Nutritional Sciences, Department of Medicine, Temerty Faculty of Medicine, University of Toronto, Toronto, Ontario, Canada; 3D Knowledge Synthesis and Clinical Trials Unit, Clinical Nutrition and Risk Factor Modification Centre, Division of Endocrinology & Metabolism, St. Michael's Hospital, Toronto, Ontario, Canada.

Vanessa Ha, School of Medicine, Queen's University, Kingston, Ontario, Canada.

John L Sievenpiper, Department of Nutritional Sciences, Department of Medicine, Temerty Faculty of Medicine, University of Toronto, Toronto, Ontario, Canada; 3D Knowledge Synthesis and Clinical Trials Unit, Clinical Nutrition and Risk Factor Modification Centre, Division of Endocrinology & Metabolism, St. Michael's Hospital, Toronto, Ontario, Canada.

Steven E Hanna, Department of Health Research Methods, Evidence, and Impact, McMaster University, Hamilton, Ontario, Canada.

Joseph Beyene, Department of Health Research Methods, Evidence, and Impact, McMaster University, Hamilton, Ontario, Canada.

Russell J de Souza, Department of Health Research Methods, Evidence, and Impact, McMaster University, Hamilton, Ontario, Canada; 3D Knowledge Synthesis and Clinical Trials Unit, Clinical Nutrition and Risk Factor Modification Centre, Division of Endocrinology & Metabolism, St. Michael's Hospital, Toronto, Ontario, Canada; Population Health Research Institute, McMaster University, Hamilton, Ontario, Canada.


Supplementary Materials

nqab002_Supplemental_File
