PLOS Medicine. 2022 Apr 26;19(4):e1003972. doi: 10.1371/journal.pmed.1003972

Polygenic scores, diet quality, and type 2 diabetes risk: An observational study among 35,759 adults from 3 US cohorts

Jordi Merino 1,2,3,*,#, Marta Guasch-Ferré 4,5,#, Jun Li 4,6,#, Wonil Chung 6,7, Yang Hu 4, Baoshan Ma 6,7, Yanping Li 4, Jae H Kang 5, Peter Kraft 6,8,9, Liming Liang 6,8,9, Qi Sun 4,5,6, Paul W Franks 4,10, JoAnn E Manson 4,5,11, Walter C Willet 4,5,6, Jose C Florez 1,2,3, Frank B Hu 4,5,6
Editor: Weiping Jia 12
PMCID: PMC9041832  PMID: 35472203

Abstract

Background

Both genetic and lifestyle factors contribute to the risk of type 2 diabetes, but the extent to which there is a synergistic effect of the 2 factors is unclear. The aim of this study was to examine the joint associations of genetic risk and diet quality with incident type 2 diabetes.

Methods and findings

We analyzed data from 35,759 men and women in the United States participating in the Nurses’ Health Study (NHS) I (1986 to 2016) and II (1991 to 2017) and the Health Professionals Follow-up Study (HPFS; 1986 to 2016) with available genetic data and who did not have diabetes, cardiovascular disease, or cancer at baseline. Genetic risk was characterized using both a global polygenic score capturing overall genetic risk and pathway-specific polygenic scores denoting distinct pathophysiological mechanisms. Diet quality was assessed using the Alternate Healthy Eating Index (AHEI). Cox models were used to calculate hazard ratios (HRs) for type 2 diabetes after adjusting for potential confounders. During 902,386 person-years of follow-up, 4,433 participants were diagnosed with type 2 diabetes. The relative risk of type 2 diabetes was 1.29 (95% confidence interval [CI] 1.25, 1.32; P < 0.001) per standard deviation (SD) increase in global polygenic score and 1.13 (1.09, 1.17; P < 0.001) per 10-unit decrease in AHEI. Irrespective of genetic risk, low diet quality, as compared to high diet quality, was associated with approximately 30% increased risk of type 2 diabetes (Pinteraction = 0.69). The joint association of low diet quality and increased genetic risk was similar to the sum of the risk associated with each factor alone (Pinteraction = 0.30). Limitations of this study include the self-report of diet information and possible bias resulting from inclusion of highly educated participants with available genetic data.

Conclusions

These data provide evidence for the independent associations of genetic risk and diet quality with incident type 2 diabetes and suggest that a healthy diet is associated with lower diabetes risk across all levels of genetic risk.


In an observational study of 3 cohorts in the United States, Jordi Merino, Marta Guasch-Ferré, Jun Li, and colleagues investigate the individual and combined associations between genetic risk, diet quality, and risk of type 2 diabetes.

Author summary

Why was this study done?

  • Both genetic and lifestyle factors contribute to individual-level risk of type 2 diabetes.

  • While previous studies have shown that adherence to a healthy lifestyle is associated with reduced risk of type 2 diabetes regardless of genetic risk, the partial characterization of genetic risk and the predominant assessment of interactions on the multiplicative scale might have prevented previous studies from identifying genetic profiles interacting with dietary exposures.

  • Therefore, understanding how genetic risk and diet quality contribute to the development of type 2 diabetes is important to support evidence-based preventive interventions.

What did the researchers do and find?

  • In 3 cohort studies involving 35,759 men and women in the US, we used novel polygenic scores for type 2 diabetes to systematically evaluate the presence of additive and multiplicative interactions between genetic risk and diet quality on the development of type 2 diabetes.

  • We found that both low diet quality and increased overall or pathway-specific genetic risk were independently associated with higher risk of type 2 diabetes.

  • We documented that within any genetic risk category, high diet quality, as compared to low diet quality, was associated with a nearly 30% lower risk of type 2 diabetes.

  • Further, we showed that the risk of type 2 diabetes attributed to the combination of increased genetic risk and low diet quality was similar to the sum of the risks associated with each factor alone.

What do these findings mean?

  • Results from this study suggest that consuming a healthier diet is associated with a lower risk of type 2 diabetes regardless of genetic risk.

  • Our results underscore the value of genetic risk assessment to identify individuals at increased disease risk and their potential for risk stratification and surveillance.

  • Such knowledge can serve to inform and design future strategies to advance the prevention of type 2 diabetes.

Introduction

The burden of type 2 diabetes is not equally distributed, as susceptibility to environmental factors varies between and within human populations [1]. This observation has led many to presume that dietary and lifestyle factors may yield different effects depending on inherited genetic susceptibility, a concept often referred to as “gene × lifestyle interaction” [2–4]. To date, some studies have attempted to identify genotypes interacting with lifestyle factors on the development of type 2 diabetes, but these studies have consistently demonstrated that adherence to healthy dietary or lifestyle recommendations is associated with a lower burden of type 2 diabetes regardless of genetic risk [5–11]. Partial characterization of genetic risk, often based on polygenic scores that included a limited number of variants, the predominant assessment for interactions on the multiplicative scale alone, or the use of a single time point dietary exposure assessment and limited follow-up might have prevented previous studies from identifying genotypes interacting with lifestyle or dietary exposures.

Recent genetic discoveries and improved computational algorithms offer an unprecedented opportunity to better characterize type 2 diabetes genetic risk [12,13]. It is now possible to clump thousands of genetic variants with marginal effects into a “global” polygenic score with considerable impact on disease risk [12]. In addition, it is possible to capture the etiological heterogeneity that characterizes type 2 diabetes by generating “pathway-specific” polygenic scores with variants that share increased type 2 diabetes risk through specific pathophysiological processes such as impaired insulin secretion or different forms of insulin resistance [13]. The extent to which this knowledge is useful for identifying individuals more susceptible than others to an unhealthy diet is unknown.

Here, we analyzed longitudinal data for 35,759 participants in 3 cohorts to investigate how genetic risk and diet quality contribute to the risk of type 2 diabetes.

Methods

Study design and population

We used data collected from 3 prospective cohort studies in the US including participants in the Nurses’ Health Study (NHS), the Health Professionals Follow-up Study (HPFS), and the NHS II [14,15]. The NHS was established in 1976 when 121,700 female registered nurses aged 30 to 55 were recruited [14]. The HPFS began in 1986 and enrolled 51,529 male health professionals aged 40 to 75 years [15]. The NHS II cohort was initiated in 1989 and included 116,340 women aged 25 to 42 years [14]. The study baseline was set at 1986 for the NHS and HPFS and 1991 for the NHS II, which was when participants first completed a questionnaire on their medical history, diet, and lifestyle characteristics.

Multiple genome-wide association studies (GWASs) have been conducted within the NHS, NHS II, and HPFS nested cohorts to investigate genetic susceptibility to 12 complex diseases [16]. Participants for genetic determinations were selected to be representative of the original sample. Demographic characteristics and health status of participants with genetic information were generally similar to those without genetic information (S1 Table). Genotyping, imputation, and quality control of genome-wide genetic data have been harmonized across nested cohorts and are detailed elsewhere [16]. After quality control, genome-wide genetic data were available for 42,437 individuals. We excluded participants with a diagnosis of type 2 diabetes (n = 3,200), cardiovascular disease (including nonfatal myocardial infarction, fatal coronary heart disease, and fatal and nonfatal stroke, n = 613), or cancer (n = 721) at baseline, those who had an unusual total energy intake at baseline (<800 kcal or >4,200 kcal/day in men and <500 or >3,500 kcal/day in women, n = 581), and those who completed only the baseline questionnaire (n = 1,563). After these exclusions, 14,454 participants in the NHS, 9,417 participants in the HPFS, and 11,888 participants in the NHS II were included in this analysis. The study protocol was approved by the human research committee of Brigham and Women’s Hospital and the Harvard TH Chan School of Public Health.
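The participant flow described above can be checked with simple arithmetic. The short sketch below uses only the counts reported in this paragraph and assumes, as the totals imply, that the exclusion groups were applied sequentially and do not overlap; the variable names are illustrative.

```python
# Participant flow, using the counts reported in the paragraph above.
# Assumption: exclusion groups are mutually exclusive (applied in sequence).
genotyped = 42_437

exclusions = {
    "type 2 diabetes at baseline": 3_200,
    "cardiovascular disease at baseline": 613,
    "cancer at baseline": 721,
    "unusual total energy intake": 581,
    "baseline questionnaire only": 1_563,
}

analytic_sample = genotyped - sum(exclusions.values())

# Cohort-specific sample sizes reported in the text.
by_cohort = {"NHS": 14_454, "HPFS": 9_417, "NHS II": 11_888}

# The remaining participants should equal the sum across the 3 cohorts.
assert analytic_sample == sum(by_cohort.values())
print(analytic_sample)
```

Both routes give the same analytic sample size, which matches the 35,759 participants reported throughout the article.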

Ascertainment of type 2 diabetes

Cases of type 2 diabetes were identified by biennially mailed questionnaires and confirmed by a validated supplementary questionnaire regarding symptoms, diagnostic laboratory test results, and hypoglycemic therapy. For cases diagnosed before 1998, type 2 diabetes was documented if participants met at least 1 of the following National Diabetes Data Group criteria [17]: (a) raised glycemia (fasting plasma glucose ≥ 7.8 mmol/l, random plasma glucose ≥ 11.1 mmol/l, or plasma glucose ≥ 11.1 mmol/l after an oral glucose load) and at least 1 symptom related to diabetes (excessive thirst, hunger, polyuria, or weight loss); (b) no symptoms, but elevated glucose concentrations on 2 occasions; and (c) treatment with insulin or other hypoglycemic medication [10]. From 1998 onward, the cutoff point for elevated fasting plasma glucose concentrations was lowered to 7.0 mmol/l according to the American Diabetes Association criteria [18]. We also considered an HbA1c concentration ≥6.5% as a criterion for confirming type 2 diabetes cases identified after January 2010 [19]. Validation studies in subsamples of the NHS supported the validity of using the supplementary questionnaires to adjudicate type 2 diabetes diagnosis, showing that more than 97% of participants with self-reported type 2 diabetes were reconfirmed through medical record review [20].

Type 2 diabetes polygenic scores

We generated a global polygenic score for type 2 diabetes that captures overall genetic burden using external data from UK Biobank (S1 Text). The rationale to use external data from UK Biobank was to avoid sample overlap with a previous publicly available global polygenic score for type 2 diabetes [12]. In brief, we selected a random UK Biobank sample (n = 391,147 participants, 17,403 type 2 diabetes cases) and conducted a genome-wide association analysis for type 2 diabetes using linear mixed models implemented in BOLT-LMM [21]. Next, estimated effect sizes were reweighted and clumped using LDPred [22]. The predictive performance of the global polygenic score including approximately 850,000 independent genetic variants was tested in the remaining set of UK Biobank participants (n = 20,000 participants, 893 type 2 diabetes cases) and then applied to our study population. To calculate individual scores in our study population, each variant was coded with the expected number of associated alleles and weighted by its relative effect size on type 2 diabetes. The scores, which included the same number of genetic variants in each cohort, were then standardized.

To generate pathway-specific polygenic scores, we used data from a previous study aimed at grouping known type 2 diabetes loci based on shared physiological similarities [13]. Genetic variants to compute these polygenic scores and their respective weights are detailed in S2 Table. These pathway-specific polygenic scores capture biological processes relevant to diabetes pathophysiology including impaired insulin secretion (one polygenic score for beta-cell dysfunction and another for proinsulin synthesis) and increased insulin resistance (polygenic scores related to obesity-mediated insulin resistance, body fat distribution, and lipid/hepatic metabolism). Allocation of type 2 diabetes variants to each polygenic score is supported by tissue-specific patterns of chromatin accessibility, histone modification, and transcriptional regulation [13], indicating that the mechanistic basis of these polygenic scores is robust even though these variants may have pleiotropic effects. The significance of pathway-specific polygenic scores has been shown in previous studies indicating that individuals enriched for genetic variants defining each of the intermediate diabetogenic processes exhibited the predicted score–associated phenotypes [13,23]. The scores were generated by multiplying a variant’s genotype dosage by its respective weight and then standardized. Polygenic scores were standardized to allow comparisons across scores computed in this study with different number of genetic variants.
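As an illustration of the scoring step described above (genotype dosage multiplied by a per-variant weight, summed, then standardized), the following minimal sketch may help. The weights and dosages are hypothetical, and the choice of the population SD for standardization is an assumption, since the text does not state which SD was used.

```python
from statistics import mean, pstdev


def raw_score(dosages, weights):
    """Weighted sum of allele dosages for one participant."""
    return sum(d * w for d, w in zip(dosages, weights))


def standardize(values):
    """Rescale to mean 0 and SD 1 (population SD; an assumption here)."""
    mu, sd = mean(values), pstdev(values)
    return [(v - mu) / sd for v in values]


# Hypothetical per-allele weights for 3 variants (not taken from S2 Table).
weights = [0.10, -0.05, 0.20]

# Hypothetical expected allele counts (0 to 2) for 4 participants.
dosages = [
    [0.0, 1.0, 2.0],
    [1.0, 1.0, 0.0],
    [2.0, 0.0, 1.0],
    [0.0, 2.0, 1.9],
]

raw = [raw_score(person, weights) for person in dosages]
z_scores = standardize(raw)
```

Standardizing puts scores built from very different numbers of variants on a common scale, so that a hazard ratio "per SD increase" is comparable across the global and pathway-specific scores.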

Assessment of diet quality

Diet quality was assessed using diet information obtained from a validated 131-item semiquantitative food frequency questionnaire administered at baseline and every 4 years thereafter. To quantify overall diet quality, we calculated the Alternate Healthy Eating Index (AHEI) using food components and scoring criteria that have been described previously [24]. The AHEI score is based on 11 foods and nutrients, emphasizing higher intake of fruits, whole grains, vegetables (excluding potatoes), nuts and legumes, polyunsaturated fatty acids, and long chain (n-3) fatty acids; moderate intake of alcohol; and lower intake of red and processed meats, sugar sweetened drinks and fruit juice, sodium, and trans-fat. Each component was scored from 0 (unhealthiest) to 10 (healthiest) points, with intermediate values scored proportionally, and all component scores were summed to obtain a total score ranging from 0 (lowest diet quality) to 110 (highest diet quality) points.
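The proportional scoring rule described above can be sketched as follows. The intake thresholds in the example are hypothetical placeholders rather than the published AHEI criteria [24], and the sketch covers only monotonic components; the alcohol component, which rewards moderate intake, would need a separate rule.

```python
def component_score(intake, worst, best):
    """Score one AHEI-style component on a 0 to 10 scale.

    `worst` is the intake that earns 0 points and `best` the intake that
    earns 10; values in between are scored proportionally. Passing
    worst > best handles components where lower intake is healthier.
    """
    fraction = (intake - worst) / (best - worst)
    return 10.0 * min(1.0, max(0.0, fraction))


# Hypothetical thresholds (servings/day), for illustration only.
vegetables = component_score(intake=2.5, worst=0.0, best=5.0)   # higher is better
red_meat = component_score(intake=0.5, worst=1.5, best=0.0)     # lower is better

# With all 11 components scored this way, the total ranges from 0 to 110.
partial_total = vegetables + red_meat
```

Clipping to the 0 to 10 range keeps intakes beyond either threshold from earning negative points or extra credit.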

As an additional method to quantify diet quality, we used the Dietary Approaches to Stop Hypertension (DASH) score [25]. The DASH score was based on the DASH-style diet, which includes information from 8 foods and nutrients. Each component was scored from 1 to 5 points according to fifths of intake, with 5 being the best score for higher intake of fruits, whole grains, vegetables, nuts and legumes, and low-fat dairy products and for lower intake of red and processed meats, sugar sweetened drinks, and sodium. The total score ranged from 8 (lowest diet quality) to 40 (highest diet quality) points.

Assessment of covariates

Covariates were ascertained every 2 years using questionnaires that obtained updated information on the occurrence of diseases and on many lifestyle and personal risk factors, including age, family history of diabetes, history of hypertension, history of hypercholesterolemia, body mass index (BMI), menopausal status and postmenopausal hormone use in women, smoking status, physical activity, total energy intake, and alcohol intake. Baseline history of hypertension and hypercholesterolemia were determined through self-reporting. BMI was calculated as weight in kilograms divided by the square of the height in meters. Physical activity was repeatedly assessed using a validated questionnaire on time spent on recreational activities. We used principal component analysis in each cohort to generate ancestry-derived principal components.

Statistical analysis

We developed a prespecified protocol, including definitions of exposures, outcomes, and covariates and a statistical analysis plan, prior to data analysis (S2 Text). We summarized continuous measurements using means (standard deviation, SD) or medians (interquartile range, IQR) and present categorical observations as frequencies and percentages. Correlations between diet and polygenic scores were assessed using the Pearson correlation test. To better capture longitudinal trajectories of diet quality, we calculated and used cumulative averages of diet quality. To generate cumulative averages, we continually updated diet information throughout the duration of follow-up. Because the proportion of missing values of covariates for individuals with genetic data was below 5%, participants with missing covariate information were excluded from the analysis.
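A cumulative average, as used for diet quality above, is the running mean of all scores reported up to each questionnaire cycle. The sketch below is one plausible reading of that definition; how missed questionnaires were handled is not described here, so the example assumes a complete series of hypothetical AHEI scores.

```python
from itertools import accumulate


def cumulative_averages(scores):
    """Running mean of all scores available up to each questionnaire cycle."""
    return [total / n for n, total in enumerate(accumulate(scores), start=1)]


# Hypothetical AHEI scores from 3 consecutive food frequency questionnaires.
ahei_by_cycle = [50.0, 54.0, 58.0]
updated_exposure = cumulative_averages(ahei_by_cycle)
print(updated_exposure)  # [50.0, 52.0, 54.0]
```

Averaging across cycles dampens within-person measurement error and reflects long-term diet rather than a single report.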

Person-time for each participant was calculated from the return of the baseline questionnaire to the diagnosis of type 2 diabetes, death, loss to follow-up, or the end of the follow-up period (2016 for the NHS and the HPFS and 2017 for the NHS II), whichever came first. We used multivariable Cox proportional hazards models to calculate hazard ratios (HRs) and 95% confidence intervals (CIs) for type 2 diabetes after confirming that the proportional hazards assumption was not violated. The proportional hazards assumption was assessed using Schoenfeld residuals. We modeled polygenic scores and diet quality as continuous variables. Variables that change over time, including age, history of hypertension, history of hypercholesterolemia, menopausal status, BMI, smoking status, physical activity, and total energy intake, were modeled as time-varying variables. Family history of type 2 diabetes and ancestry-derived principal components were considered time-fixed variables at baseline. For the variables that were updated throughout the follow-up, the questionnaire year determined the time point. Cox regression models were stratified by age (in months, continuous) and adjusted for ancestry-derived principal components (1 to 4) (crude model). The multivariable-adjusted model was further adjusted for family history of diabetes (yes or no), history of hypertension (yes or no), history of hypercholesterolemia (yes or no), menopausal status (premenopausal or postmenopausal [never, past, or current menopausal hormone use], women only), BMI (quintiles of kg/m2), smoking status (current, former, and never), physical activity (quintiles of MET hours/week), and total energy intake (quintiles of total caloric intake/day). Because BMI could be on the causal pathway between diet quality and type 2 diabetes risk, we also conducted separate models without adjusting for BMI.
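The person-time rule in the first sentence above amounts to taking the earliest of the possible end points. A minimal sketch follows, assuming dates are expressed as decimal calendar years; the function and argument names are illustrative, not taken from the study's code.

```python
def follow_up_years(baseline, end_of_follow_up, diagnosis=None, death=None, lost=None):
    """Years from baseline to the earliest of diagnosis, death, loss to
    follow-up, or administrative end of follow-up (decimal years)."""
    end_points = [t for t in (diagnosis, death, lost, end_of_follow_up) if t is not None]
    return min(end_points) - baseline


# Hypothetical NHS participant diagnosed in mid-1999.
case_time = follow_up_years(baseline=1986.0, end_of_follow_up=2016.0, diagnosis=1999.5)

# Hypothetical NHS II participant followed to the end of the study period.
censored_time = follow_up_years(baseline=1991.0, end_of_follow_up=2017.0)
```

Summing these contributions over all participants gives the person-years denominator reported for each cohort.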

We conducted analyses stratified by genetic risk category (low, intermediate, and high based on thirds of genetic risk distribution) to assess the association between diet quality and type 2 diabetes risk. We also cross-classified participants according to categories of genetic risk and diet quality (9 categories based on thirds of genetic risk and diet quality score, with low genetic risk and high diet quality as reference) and conducted joint analyses to investigate the combined association of genetic risk and diet quality with the risk of type 2 diabetes.
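The cross-classification described above can be sketched as a pair of tertile labels per participant. The values below are hypothetical, and the cut points come from Python's `statistics.quantiles` with its default method, which may differ slightly from the procedure used in SAS or R for the actual analysis.

```python
from statistics import quantiles

LABELS = ("low", "intermediate", "high")


def tertile_labels(values):
    """Assign each value to a third of the distribution."""
    cut1, cut2 = quantiles(values, n=3)  # two cut points split the data in thirds
    return [LABELS[(v > cut1) + (v > cut2)] for v in values]


# Hypothetical standardized polygenic scores and AHEI scores for 9 participants.
genetic_risk = [-1.2, -0.8, -0.3, -0.1, 0.0, 0.2, 0.5, 0.9, 1.4]
diet_quality = [62, 41, 55, 48, 70, 39, 58, 44, 66]

# Each participant falls into 1 of 9 (genetic risk, diet quality) cells.
cells = list(zip(tertile_labels(genetic_risk), tertile_labels(diet_quality)))

# Reference group for the joint analysis: low genetic risk, high diet quality.
REFERENCE = ("low", "high")
```

In the joint analysis, each of the 8 non-reference cells would be compared with `REFERENCE`, giving one hazard ratio per cell.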

We evaluated whether the associations between diet quality and type 2 diabetes risk differed based on genetic susceptibility by using additive and multiplicative interaction analyses [26,27]. Power calculations for interaction analyses were conducted to determine the minimum detectable interaction on the risk ratio scale [28]. The available sample gave us 80% statistical power at α = 0.05 to detect additive and multiplicative interaction effect sizes of ≥1.04 and ≥1.10, respectively. We tested for multiplicative interactions using the log-likelihood ratio test to compare the goodness of fit of a multivariable-adjusted model with and without the cross-product interaction term [27]. For additive interaction analyses, we considered genetic risk as a continuous variable and used a binary categorical variable for diet quality based on the median of the diet quality score in each cohort. We assessed the relative excess risk due to interaction (RERI) as an index of additive interaction [26] and further examined the decomposition of the joint effect, that is, the proportions of risk due to genetic risk alone, to diet quality alone, and to their interaction [26]. CIs for each of the interaction measures were calculated using the delta method described by Hosmer and Lemeshow [29].
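For readers unfamiliar with additive interaction, the RERI and the decomposition of the joint effect can be computed from the coefficients of a model with a cross-product term. The sketch below uses the standard definition RERI = RR11 − RR10 − RR01 + 1; the coefficient values are hypothetical and the delta-method CIs are not reproduced here.

```python
from math import exp, log


def additive_interaction(b_gene, b_diet, b_product):
    """RERI and decomposition of the joint effect from log-scale coefficients.

    b_gene: log HR for genetic risk alone
    b_diet: log HR for low diet quality alone
    b_product: log HR for the cross-product term
    """
    rr_gene = exp(b_gene)
    rr_diet = exp(b_diet)
    rr_joint = exp(b_gene + b_diet + b_product)
    reri = rr_joint - rr_gene - rr_diet + 1.0
    excess = rr_joint - 1.0  # total excess risk in the doubly exposed group
    return {
        "reri": reri,
        "share_gene": (rr_gene - 1.0) / excess,
        "share_diet": (rr_diet - 1.0) / excess,
        "share_interaction": reri / excess,
    }


# Hypothetical coefficients with no multiplicative interaction (b_product = 0).
result = additive_interaction(log(2.0), log(3.0), 0.0)
```

The three shares always sum to 1. In this hypothetical example the RERI is positive even though the cross-product term is zero, which is why interaction is assessed separately on the additive and multiplicative scales.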

We conducted secondary analyses to investigate the consistency of our results. First, we used the DASH score as an additional method to quantify diet quality. For these analyses, we further adjusted our multivariable models for alcohol intake. We tested for additive and multiplicative interactions using the DASH score. Second, we conducted 3-way interaction analyses to examine whether BMI modified the joint association between increased genetic risk and low diet quality on type 2 diabetes risk.

All analyses were performed separately for each cohort and then were pooled with the use of inverse variance weighted, fixed-effects meta-analysis. Results were also combined using random-effects meta-analysis. The heterogeneity index (I2) was used to assess heterogeneity. All P values presented were 2 sided, with statistical significance determined by the Bonferroni corrected threshold of significance <0.007 (0.05/7 exposures). Data were analyzed with the use of SAS software, version 9.3 (SAS Institute) and R software, version 4.0.3 (R Foundation). This manuscript is reported as per the Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) guideline (S3 Text) [30].
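The pooling step can be illustrated with an inverse variance weighted, fixed-effects calculation. The sketch below takes the cohort-specific multivariable-adjusted HRs for the global polygenic score reported in the Results as inputs and recovers standard errors from the rounded 95% CIs, which is an approximation, so the output should only approximate the published pooled estimate and I2.

```python
from math import exp, log, sqrt

Z = 1.959964  # 97.5th percentile of the standard normal distribution


def log_hr_and_se(hr, lower, upper):
    """Log HR and its standard error, recovered from a 95% CI."""
    return log(hr), (log(upper) - log(lower)) / (2 * Z)


def fixed_effects(estimates):
    """Inverse variance weighted pooled log HR, its SE, and I2 (%)."""
    weights = [1 / se**2 for _, se in estimates]
    pooled = sum(w * b for w, (b, _) in zip(weights, estimates)) / sum(weights)
    pooled_se = sqrt(1 / sum(weights))
    q = sum(w * (b - pooled) ** 2 for w, (b, _) in zip(weights, estimates))
    df = len(estimates) - 1
    i2 = 100 * max(0.0, (q - df) / q) if q > 0 else 0.0
    return pooled, pooled_se, i2


# HR (95% CI) per SD of the global polygenic score: NHS, HPFS, NHS II.
cohorts = [(1.26, 1.20, 1.31), (1.23, 1.16, 1.31), (1.46, 1.37, 1.56)]

pooled, pooled_se, i2 = fixed_effects([log_hr_and_se(*c) for c in cohorts])
pooled_hr = exp(pooled)
pooled_ci = (exp(pooled - Z * pooled_se), exp(pooled + Z * pooled_se))

# Bonferroni-corrected significance threshold for 7 exposures.
bonferroni_threshold = 0.05 / 7
```

A large I2 such as the one obtained here signals between-cohort heterogeneity, which is one reason to also report random-effects estimates alongside the fixed-effects result.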

Results

Baseline characteristics of the 35,759 participants included in this study are shown in Table 1. The mean baseline age was 53 years in NHS, 54 years in HPFS, and 37 years in NHS II. Most participants were of European descent and had no major chronic diseases at baseline. Mean BMI ranged from 24.3 kg/m2 in NHS II to 25.5 kg/m2 in HPFS. At baseline, mean AHEI score ranged from 48.9 in NHS II to 52.6 in HPFS. The study sample was representative of each original study population with no major differences in clinical, demographic, and lifestyle characteristics (S1 Table). A total of 4,433 participants developed type 2 diabetes during 902,386 person-years of follow-up (n = 2,204 (15.2%) in NHS, n = 1,285 (13.6%) in HPFS, and n = 944 (7.9%) in NHS II).

Table 1. Baseline characteristics of the 35,759 US men and women in the NHS I, the HPFS, and the NHS II.

Characteristic | NHS (n = 14,454) | HPFS (n = 9,417) | NHS II (n = 11,888)
Person-yearsϕ | 366,719 | 239,296 | 296,371
Age, years, mean (SD) | 53 (7) | 54 (9) | 37 (4)
Self-reported race/ethnicityε | | |
    White, n (%) | 14,416 (99.7) | 9,267 (98.4) | 11,845 (99.6)
    Other, n (%) | 38 (0.3) | 150 (1.6) | 43 (0.4)
Clinical history | | |
    Hypertension, n (%) | 2,193 (15.2) | 1,833 (19.5) | 345 (2.9)
    Dyslipidemia, n (%) | 1,133 (7.8) | 1,117 (11.9) | 1,173 (9.9)
    Family history of diabetes, n (%) | 4,323 (29.9) | 2,695 (28.6) | 4,294 (36.1)
    Hormone use, n (%) | | |
    Premenopausal | 5,742 (39.7) | | 9,617 (80.9)
    Postmenopausal, never | 3,870 (26.8) | | 1,340 (11.3)
    Postmenopausal, current | 2,535 (17.5) | | 819 (6.9)
    Postmenopausal, previous | 1,902 (13.2) | |
Lifestyle habits | | |
    Current smoker, n (%) | 2,479 (17.2) | 728 (7.7) | 1,245 (10.5)
    BMI, kg/m2, mean (SD) | 25.2 (4.6) | 25.5 (3.1) | 24.3 (5)
    Physical activity, MET-h/wk, median (IQR) | 14.3 (2.9 to 19.3) | 19.1 (4.3 to 25.5) | 20.2 (5.2 to 26.0)
    Total energy intake, kcal/day, mean (SD) | 1,781 (520) | 1,988 (557) | 1,802 (535)
    AHEI score, mean (SD)* | 52.1 (11.3) | 52.6 (11.7) | 48.9 (11)
    Alcohol intake, g/day, median (IQR) | 6.5 (0 to 8.6) | 12.2 (1.1 to 16.1) | 3.2 (0 to 3.5)

Values are means (SD) or medians (IQR) for continuous variables and numbers and percentages are for categorical variables. The study baseline was set at 1986 for the NHS I and the HPFS and 1991 for the NHS II.

MET denotes metabolic equivalent tasks.

ϕPerson-years are based on the analysis for type 2 diabetes.

εRace was self-reported by the participants. Non-Hispanic white (southern European/Mediterranean, Scandinavian, and other European ancestry) and Hispanic were categorized into “White,” while Black, Asian, American Indian, or Hawaiian were categorized into “Other.” Ancestry-derived principal components were used to adjust multivariable models.

*Scores on the AHEI range from 0 to 110, with higher scores indicating a healthier diet.

AHEI, Alternate Healthy Eating Index; BMI, body mass index; HPFS, Health Professionals Follow-up Study; IQR, interquartile range; NHS, Nurses’ Health Study; SD, standard deviation.

Associations between polygenic scores and type 2 diabetes incidence

The polygenic scores were normally distributed (S1–S3 Figs). The age-adjusted HR for type 2 diabetes was 1.42 (95% CI 1.38, 1.46; I2 = 93.2%; P < 0.001) per 1 SD increase in the global polygenic score (S3 Table). In fully adjusted models, the global polygenic score was associated with higher risk of type 2 diabetes with an HR of 1.29 (95% CI 1.25, 1.33; I2 = 88.4%; P < 0.001; per SD increase; Fig 1). When analyzed in each cohort separately, the multivariable-adjusted HR for type 2 diabetes was 1.26 (95% CI 1.20, 1.31; P < 0.001) in NHS, 1.23 (95% CI 1.16, 1.31; P < 0.001) in HPFS, and 1.46 (95% CI 1.37, 1.56; P < 0.001) in NHS II. When pathway-specific polygenic scores were used to characterize genetic risk, there were consistent associations between increased genetic risk and type 2 diabetes risk. The crude estimates for pathway-specific polygenic scores are presented in S3 Table. The multivariable-adjusted HRs per 1 SD increase in pathway-specific polygenic scores ranged from 1.26 (95% CI 1.22, 1.30; I2 = 55.5%; P < 0.001) for the beta-cell dysfunction polygenic score to 1.09 (95% CI 1.05, 1.12; I2 = 49.1%; P < 0.001) for the obesity-mediated insulin resistance polygenic score (Fig 1).

Fig 1. Risk of incident type 2 diabetes associated with genetic risk.

Shown are adjusted HRs and 95% CI of the estimate for type 2 diabetes risk per SD increase in polygenic scores. Estimates are presented for each of the 3 prospective cohorts separately and in a combined analysis. Polygenic scores included in this study are described in the methods section. Cox proportional hazards models were stratified by age and adjusted for ancestry-derived principal components, family history of diabetes, history of hypertension, history of hypercholesterolemia, menopausal status (women only), BMI, smoking status, physical activity, and total energy intake. Fixed-effects inverse variance weighted meta-analysis was used to combine cohort-specific results. BMI, body mass index; CI, confidence interval; HPFS, Health Professionals Follow-up Study; HR, hazard ratio; IR, insulin resistance; NHS, Nurses’ Health Study; NHS2, Nurses’ Health Study II; SD, standard deviation; T2D, type 2 diabetes.

The correlations between polygenic scores included in this study were modest (r2 ranging from 0.07 to 0.27), supporting the notion that they capture different axes of genetic predisposition (S4 Table). The association between polygenic scores and type 2 diabetes risk was consistent in models without adjusting for BMI (S3 Table) or when random-effects meta-analyses were used to combine cohort estimates (S5 Table).

Interplay between diet quality and genetic risk on the development of type 2 diabetes

The risk of type 2 diabetes per 10-unit decrease in AHEI score was 1.13 (95% CI 1.09, 1.17; I2 = 58.6%; P < 0.001) after adjusting for potential confounders (S4 Fig). When analyzed in each cohort separately, the risk of type 2 diabetes per 10-unit decrease in AHEI score was 1.11 (95% CI 1.06, 1.17; P < 0.001) in the NHS, 1.20 (95% CI 1.12, 1.28; P < 0.001) in the HPFS, and 1.08 (95% CI 1.00, 1.16; P = 0.048) in the NHS II. The association between diet quality and type 2 diabetes risk was consistent in secondary analyses using the DASH score (pooled HR 1.13, 95% CI 1.09, 1.18; I2 = 81.7%; P < 0.001; per 5-unit lower; S5 Fig). The correlation between the AHEI and the DASH score was high (r2 > 0.6 in all 3 cohorts, P < 0.001).

When analyzed within each category of genetic risk, defined using the global polygenic score, low diet quality was consistently associated with higher type 2 diabetes risk (Table 2). The crude HR for type 2 diabetes comparing individuals in the lowest category of the diet quality score with those in the highest category was 1.74 (95% CI 1.48, 2.05; P < 0.001) among participants at low genetic risk, 1.84 (95% CI 1.59, 2.13; P < 0.001) among participants at intermediate genetic risk, and 1.72 (95% CI 1.53, 1.93; P < 0.001) among participants at high genetic risk. In the multivariable model, low diet quality, as compared to high diet quality, was associated with higher risk of type 2 diabetes with an adjusted HR of 1.31 (95% CI 1.11, 1.54; P = 0.001) among participants at low genetic risk, 1.39 (95% CI 1.19, 1.63; P < 0.001) among those at intermediate genetic risk, and 1.29 (95% CI 1.14, 1.46; P < 0.001) among those at high genetic risk. Findings were consistent in models without adjusting for BMI (Table 2). There was no evidence of significant interactions on the multiplicative scale between diet quality and genetic risk on the risk of type 2 diabetes (Pinteraction = 0.65; Table 3). The lack of significant interactions was consistent when genetic risk was characterized using pathway-specific polygenic scores (Table 3) or when the DASH score was used (S6 Table).

Table 2. Association between diet quality and type 2 diabetes risk according to categories of genetic risk.

Subgroup HR (95% CI) P value
Low genetic risk
    Crude model 1.74 (1.48, 2.05) <0.001
    Multivariable-adjusted model 1.31 (1.11, 1.54) 0.001
    Multivariable-adjusted without BMI 1.41 (1.17, 1.69) <0.001
Intermediate genetic risk
    Crude model 1.84 (1.59, 2.13) <0.001
    Multivariable-adjusted model 1.39 (1.19, 1.63) <0.001
    Multivariable-adjusted without BMI 1.50 (1.29, 1.75) <0.001
High genetic risk
    Crude model 1.72 (1.53, 1.93) <0.001
    Multivariable-adjusted model 1.29 (1.14, 1.46) <0.001
    Multivariable-adjusted without BMI 1.38 (1.22, 1.56) <0.001

HRs and 95% CI for type 2 diabetes risk for low versus high diet quality according to genetic risk categories. Cox proportional hazards models were stratified by age (in months, continuous) and adjusted for ancestry-derived principal components (1 to 4) (crude model). Multivariable-adjusted model was further adjusted for time-dependent confounders including family history of diabetes (not time-dependent, yes or no), hypertension (yes or no), hypercholesterolemia (yes or no), menopausal status (premenopausal or postmenopausal [never, past, or current menopausal hormone use], women only), BMI (quintiles of kg/m2), smoking status (current, former, and never), physical activity (quintiles of MET hours/week), and total energy intake (quintiles of total caloric intake/day). An additional model was conducted without adding BMI as a covariate. Fixed-effects inverse variance weighted meta-analysis was used to combine cohort-specific results.

BMI, body mass index; CI, confidence interval; HR, hazard ratio.

Table 3. Multiplicative interactions between diet quality and genetic risk on the risk of type 2 diabetes.

Polygenic score | Pathway group | Interaction term, coefficient | Interaction term, P value
Global polygenic score | | 0.99 (0.93, 1.05) | 0.65
Beta-cell dysfunction | Impaired insulin secretion | 0.94 (0.82, 1.00) | 0.05
Impaired insulin synthesis | Impaired insulin secretion | 0.98 (0.92, 1.04) | 0.44
Obesity-mediated insulin resistance | Impaired insulin sensitivity | 1.02 (0.96, 1.09) | 0.45
Body fat distribution | Impaired insulin sensitivity | 0.97 (0.92, 1.03) | 0.37
Lipid/hepatic metabolism | Impaired insulin sensitivity | 0.96 (0.91, 1.02) | 0.21

For each polygenic score, the combined interaction term coefficient and P value is shown. Estimates were obtained from Cox proportional hazards models with a cross-product interaction term between genetic risk and diet quality stratified by age and adjusted for ancestry-derived principal components, family history of diabetes, history of hypertension, history of hypercholesterolemia, menopausal status (women only), BMI, smoking status, physical activity, and total energy intake (methods).

BMI, body mass index.

In a joint analysis to investigate the combined association of genetic risk and diet quality with the risk of type 2 diabetes, there was a risk gradient with increasing genetic risk and decreasing diet quality (Fig 2). Age-adjusted estimates are presented in S6 Fig. Compared with individuals at low genetic risk and high diet quality, the multivariable-adjusted HR for risk of type 2 diabetes for low diet quality was 1.31 (95% CI 1.11, 1.54; P = 0.001) among those at low genetic risk, 1.53 (95% CI 1.31, 1.79; P < 0.001) among those at intermediate genetic risk, and 2.19 (95% CI 1.89, 2.54; P < 0.001) among those at high genetic risk. The joint association of diet quality and genetic risk was similar to the sum of the risk associated with each factor alone (RERI = 0.05, 95% CI −0.04, 0.13; Pinteraction = 0.30; Table 4), indicating no evidence of significant additive interactions. The proportion of excess type 2 diabetes risk was estimated to be 53.5% (95% CI 44.8, 62.2) attributable to genetic risk, 38.6% (95% CI 29.7, 47.6) to diet quality, and 7.8% (95% CI −6.5, 22.2) to their interaction. We did not find evidence of additive interactions in crude models (S7 Table). We observed the same pattern for the joint associations and nonsignificant additive interactions when genetic risk was characterized using pathway-specific polygenic scores (S7 Fig). The proportion of excess type 2 diabetes risk attributable to genetic risk ranged from 61.2% (95% CI 51.9, 70.5) for the beta-cell dysfunction polygenic score to 21.9% (95% CI 5.9, 38.0) for the obesity-mediated insulin resistance polygenic score (Table 4). Findings from additive interaction analyses were similar when the 3 cohorts were analyzed separately (S8 Table, S8 Fig) and when the DASH score was used (S9 Table, S9 and S10 Figs).

Fig 2. Risk of incident type 2 diabetes according to categories of genetic risk and diet quality.

Shown are adjusted HRs and 95% CI of the estimate for type 2 diabetes in a pooled analysis of the 3 prospective cohorts according to categories of genetic risk and diet quality score. In these comparisons, participants with low genetic risk and high diet quality served as the reference group. Cox proportional hazards models were stratified by age and adjusted for ancestry-derived principal components, family history of diabetes, history of hypertension, history of hypercholesterolemia, menopausal status (women only), BMI, smoking status, physical activity, and total energy intake. Fixed-effects inverse variance weighted meta-analysis was used to combine cohort-specific results. BMI, body mass index; CI, confidence interval; HR, hazard ratio.

Table 4. Additive interactions between diet quality and genetic risk using global and pathway-specific polygenic scores.

Pathway-specific polygenic scores are grouped as impaired insulin secretion (beta-cell dysfunction, impaired insulin synthesis) and impaired insulin sensitivity (obesity-mediated insulin resistance, body fat distribution, lipid/hepatic metabolism).

| | Global polygenic score | Beta-cell dysfunction | Impaired insulin synthesis | Obesity-mediated insulin resistance | Body fat distribution | Lipid/hepatic metabolism |
| --- | --- | --- | --- | --- | --- | --- |
| Main effects: diet quality | 1.22 (1.14, 1.30) | 1.22 (1.15, 1.31) | 1.21 (1.14, 1.29) | 1.21 (1.13, 1.29) | 1.21 (1.14, 1.30) | 1.21 (1.14, 1.29) |
| Main effects: polygenic score | 1.29 (1.25, 1.33) | 1.26 (1.22, 1.30) | 1.14 (1.10, 1.17) | 1.09 (1.05, 1.12) | 1.23 (1.19, 1.26) | 1.11 (1.07, 1.16) |
| Main effects: joint effect | 1.57 (1.50, 1.64) | 1.51 (1.43, 1.58) | 1.36 (1.29, 1.44) | 1.32 (1.25, 1.40) | 1.48 (1.41, 1.56) | 1.32 (1.25, 1.39) |
| RERI | 0.05 (−0.04, 0.13) | 0.03 (−0.06, 0.11) | −0.01 (−0.08, 0.08) | 0.04 (−0.03, 0.12) | 0.03 (−0.07, 0.10) | −0.02 (−0.09, 0.05) |
| RERI: P value | 0.300 | 0.524 | 0.958 | 0.220 | 0.758 | 0.513 |
| Attributable risk proportion, %: diet quality | 38.6 (29.7, 47.6) | 44.3 (34.9, 53.6) | 58.5 (46.7, 70.3) | 63.8 (50.8, 76.7) | 45.1 (35.4, 54.7) | 66.4 (53.2, 79.5) |
| Attributable risk proportion, %: polygenic score | 53.5 (44.8, 62.2) | 61.2 (51.9, 70.5) | 42.1 (29.7, 54.4) | 21.9 (5.9, 38.0) | 52.2 (42.7, 61.8) | 41.4 (27.5, 55.3) |
| Attributable risk proportion, %: additive interaction | 7.8 (−6.5, 22.2) | 5.4 (−11.9, 22.8) | −5.8 (−22.2, 21.1) | 14.2 (−7.1, 35.6) | 2.7 (−14.1, 19.4) | −7.8 (−32.1, 16.5) |

Multivariable-adjusted risk of type 2 diabetes estimated from Cox proportional hazards models stratified by age and adjusted for ancestry-derived principal components, family history of diabetes, history of hypertension, history of hypercholesterolemia, menopausal status (women only), BMI, smoking status, physical activity, and total energy intake.

Polygenic scores were standardized to allow comparisons across scores computed in this study with different numbers of genetic variants. Details about the variants and weights used to compute these scores are provided in the Supporting information.

Low-quality diet versus high-quality diet was defined as a categorical variable based on the median of the diet quality score distribution in each cohort.

Per SD increase in polygenic scores.

BMI, body mass index; RERI, relative excess risk due to interaction; SD, standard deviation.
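As a rough worked example of how the RERI and attributable proportions in Table 4 relate to the main and joint effects, the sketch below assumes the standard decomposition RERI = HR(joint) − HR(diet) − HR(genetic) + 1, with each proportion expressed as a share of the joint excess risk, HR(joint) − 1. The function name is hypothetical, and the inputs are the rounded global polygenic score values from Table 4, so the output only approximates the published estimates, which were presumably computed from unrounded model coefficients.

```python
def additive_decomposition(hr_diet, hr_gene, hr_joint):
    """Split the joint excess risk (hr_joint - 1) into diet, genetic, and interaction shares."""
    reri = hr_joint - hr_diet - hr_gene + 1.0  # relative excess risk due to interaction
    excess = hr_joint - 1.0                    # total excess risk in the doubly exposed group
    return {
        "reri": reri,
        "pct_diet": 100.0 * (hr_diet - 1.0) / excess,
        "pct_gene": 100.0 * (hr_gene - 1.0) / excess,
        "pct_interaction": 100.0 * reri / excess,
    }

# Rounded global polygenic score column of Table 4:
# diet quality 1.22, polygenic score 1.29, joint effect 1.57.
result = additive_decomposition(hr_diet=1.22, hr_gene=1.29, hr_joint=1.57)
for name, value in result.items():
    print(f"{name}: {value:.2f}")
```

With these rounded inputs the sketch gives an RERI of about 0.06 and shares of roughly 38.6%, 50.9%, and 10.5%, compared with the reported 0.05, 38.6%, 53.5%, and 7.8%. The three shares sum to 100% by construction.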

In a sensitivity analysis to investigate if BMI modified the joint association of increased genetic risk and low diet quality on type 2 diabetes risk, we showed that changes in BMI did not modify the risk of type 2 diabetes attributed to increased genetic risk and low diet quality (Pinteraction = 0.69; S10 Table, S11 Fig).

Discussion

In this large prospective study to investigate how genetic risk and diet quality contribute to the risk of type 2 diabetes, we found that both low diet quality and increased genetic risk were independently associated with higher risk of type 2 diabetes without evidence of significant interactions. We showed that within any genetic risk category, high diet quality compared to low diet quality was associated with a nearly 30% lower risk of type 2 diabetes and that the risk of type 2 diabetes attributed to the combination of increased genetic risk and low diet quality was similar to the sum of the risks associated with each factor alone. Taken together, results from this study are important to support evidence-based prevention strategies for type 2 diabetes.

Our study adds to knowledge on the interplay between genetic and lifestyle factors by formally investigating whether polygenic scores for type 2 diabetes capturing overall genetic risk or distinct pathophysiological mechanisms could help prioritize individuals who would benefit the most from targeted dietary recommendations. Previous studies have shown no appreciable interactions between genetic and lifestyle factors on the development of type 2 diabetes [6,9,31], indicating that genetic risk does not modify the beneficial effect of healthy lifestyle interventions. However, previous studies considered a limited number of variants to generate type 2 diabetes polygenic scores, and the lack of significant interactions reported in these studies is often attributed to the mixture of variants affecting different pathways into a single score [6,32]. The latter is particularly relevant in the context of a highly heterogeneous disease such as type 2 diabetes, in which groups of individuals are more likely to develop the disease due to alterations in specific processes. By leveraging novel polygenic scores for type 2 diabetes, our study supports the conclusion that both diet quality and overall or pathway-specific genetic risk are independently associated with risk of type 2 diabetes. These findings suggest that healthy dietary recommendations for the prevention of type 2 diabetes could be deployed across all levels of genetic risk in the population, as genetic risk does not seem to modify their effectiveness. Further, our results emphasize the value of genetic risk assessment to identify individuals at increased disease risk and its potential for risk stratification and surveillance, as those at increased genetic risk might need to incorporate other lifestyle components in addition to a healthy diet to mitigate their inherited risk.

We systematically evaluated the presence of additive and multiplicative interactions between genetic risk and diet quality on type 2 diabetes incidence. Interaction on a multiplicative scale means that the combined effect of 2 exposures is larger than the product of the individual effects of the 2 exposures, whereas interaction on an additive scale means that the combined effect of 2 exposures is larger than the sum of the individual effects [33]. While previous interaction studies have mainly tested for interactions on the multiplicative scale, the assessment of additive interactions is more suitable to identify which groups of individuals would benefit the most from a given intervention [34]. Our findings provide evidence that there is no departure from the additivity of risks attributed to each factor separately, indicating that the presence of the 2 exposures (low diet quality and increased genetic risk) does not explain a higher number of cases that could have been prevented if only one of the exposures were present. These findings suggest that if interactions between genetic and dietary factors in type 2 diabetes exist, they are likely to be small, undetectable by conventional approaches, or influenced by other factors such as socioeconomic status or changes in body weight [35].
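The distinction between the two scales can be made concrete with a small sketch. It assumes relative risks (or HRs) for each exposure alone and for both together, all measured against the doubly unexposed group. The function name and the numbers are hypothetical and chosen only to show that the absence of interaction on one scale generally implies interaction on the other when both exposures have an effect.

```python
def interaction_scales(hr_a, hr_b, hr_joint):
    """Return (multiplicative ratio, additive RERI) for two exposures.

    A ratio of 1.0 means no interaction on the multiplicative scale;
    an RERI of 0.0 means no interaction on the additive scale.
    """
    multiplicative = hr_joint / (hr_a * hr_b)
    additive = hr_joint - hr_a - hr_b + 1.0
    return multiplicative, additive

# Hypothetical case 1: the joint effect equals the product of the single effects.
mult_1, add_1 = interaction_scales(hr_a=2.0, hr_b=3.0, hr_joint=6.0)

# Hypothetical case 2: the joint excess risk equals the sum of the single excess risks.
mult_2, add_2 = interaction_scales(hr_a=2.0, hr_b=3.0, hr_joint=4.0)

print(f"Case 1: ratio={mult_1:.2f}, RERI={add_1:.2f}")
print(f"Case 2: ratio={mult_2:.2f}, RERI={add_2:.2f}")
```

In case 1 there is no multiplicative interaction but a positive additive interaction; in case 2 the risks are exactly additive, which appears as a ratio below 1 on the multiplicative scale.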

By clarifying that genetic risk and diet quality are each independently associated with the risk of type 2 diabetes and would not have an additive or multiplicative impact on the risk of the disease, our findings can yield useful clinical and public health answers as we prepare for the eventual implementation of precision nutrition. Major worldwide organizations recommend population-wide healthy dietary patterns for the prevention of metabolic diseases [36,37]. However, recent short-term multiomics feeding studies have reported large interindividual variability in response to specific foods or diets, supporting the need for more personalized approaches [38,39]. While long-term follow-up studies are needed to better appreciate the value of extremely personalized dietary recommendations for the prevention of diabetes and related metabolic diseases, findings from the present study support public health efforts that emphasize the consumption of healthy dietary patterns.

The strengths of this study include the use of new-generation polygenic scores for type 2 diabetes that capture overall genetic risk or specific pathophysiological processes, the well-validated measures of dietary factors and the use of repeated diet measurements to reduce measurement error and noise, the large number of incident type 2 diabetes cases and extended follow-up, and the consistency of our findings in sensitivity analyses. Further, we generated both global and pathway-specific polygenic scores to systematically investigate the presence of additive and multiplicative interactions between genetic risk and diet quality on the development of type 2 diabetes.

We acknowledge several limitations. Because this was an observational study and allocation to low- or high-quality diet was not randomized, we could not infer causality regarding the associations of low diet quality and increased genetic risk with the development of type 2 diabetes. A possible reason for the lack of interaction between the polygenic risk score and diet on the risk of type 2 diabetes could be imprecision in dietary intake measurement. We used cumulative averages of diet, which yield more precise dietary intake estimates than baseline intakes alone [40], but the use of more objective dietary intake assessment methods, such as smartphone applications, wearable technology, or dietary intake biomarkers, is necessary to accurately ascertain dietary intake and reduce self-report errors [39,41]. We computed global and pathway-specific polygenic scores for type 2 diabetes, but the use of aggregated scores might have missed potential interactions driven by highly penetrant single genetic variants of strong effects or variants for glycemic traits interacting with environmental exposures [42]. However, further restricting the number of genetic variants would limit the clinical and public health value of our findings, as highly penetrant variants tend to be rare or extremely rare in the population. In addition, the tails of the polygenic score distribution (i.e., top 5% or 1% of genetic risk) could be used to detect potential interactions more likely to be present among people with very high or low genetic risk. However, such analysis would have lower statistical power compared to the assessment of interaction on the continuous scale, and it might yield spurious interactions due to unbalanced covariates between groups and residual confounding [43]. We restricted our analyses to participants for whom genetic data were available, which represents a small proportion of the original sample and might have induced selection bias. However, participants for genetic determinations were selected to be representative of the original study population, and demographic characteristics and health status of participants with genetic information were generally similar to those of participants without it. The inclusion of well-informed and educated healthcare professionals without major chronic diseases at baseline might limit the generalizability of our findings to other populations. However, increased genetic risk and low diet quality have been associated with risk of type 2 diabetes in other populations [11].

In conclusion, our data provide evidence that genetic risk and diet quality are each independently associated with the risk of type 2 diabetes, without evidence of an additive or multiplicative impact on the risk of the disease. Our results suggest that the association of a healthy diet with lower risk of type 2 diabetes does not vary substantially based on the overall or pathway-specific genetic risk and highlight the potential of genetic risk assessment for future risk stratification and surveillance. Findings from this study might provide a valuable source of information for the primary prevention of type 2 diabetes.

Supporting information

S1 Fig. Distribution of polygenic scores in the NHS.

Distribution of the global and pathway polygenic scores in the NHS. NHS, Nurses’ Health Study.

(PNG)

S2 Fig. Distribution of polygenic scores in the HPFS.

Distribution of the global and pathway polygenic scores in the HPFS. HPFS, Health Professionals Follow-up Study.

(PNG)

S3 Fig. Distribution of polygenic scores in the NHS II.

Distribution of the global and pathway polygenic scores in the NHS II. NHS, Nurses’ Health Study.

(PNG)

S4 Fig. Risk of incident type 2 diabetes associated with diet quality.

Shown are adjusted HRs and 95% CI of the estimate for type 2 diabetes in each of the 3 prospective cohorts per 10-unit decrease in diet quality score assessed using the AHEI score. The diet quality score was derived from repeated measurements analyses. Cox proportional hazards models were stratified by age and adjusted for time-varying covariates including ancestry-derived principal components (not time-varying), family history of diabetes (not time-varying), history of hypertension, history of hypercholesterolemia, menopausal status (women only), BMI, smoking status, physical activity, and total energy intake. Fixed-effects inverse variance weighted meta-analysis was used to combine cohort-specific results. The heterogeneity index (I2) was used to assess heterogeneity. The P values for the association were <0.001, <0.001, and 0.049 for the NHS, the HPFS, and the NHS II, respectively. AHEI, Alternate Healthy Eating Index; BMI, body mass index; CI, confidence interval; HPFS, Health Professionals Follow-up Study; HR, hazard ratio; NHS, Nurses’ Health Study.

(TIFF)

S5 Fig. Risk of incident type 2 diabetes associated with diet quality—sensitivity analysis using the DASH score.

Shown are adjusted HRs and 95% CI of the estimate for type 2 diabetes in each of the 3 prospective cohorts per 5-unit decrease in diet quality score assessed using the DASH score. The diet quality score was derived from repeated measurements analyses. Cox proportional hazards models were stratified by age and adjusted for time-varying confounders including ancestry-derived principal components (not time-varying), family history of diabetes (not time-varying), history of hypertension, history of hypercholesterolemia, menopausal status (women only), BMI, smoking status, physical activity, total energy intake, and alcohol intake. Fixed-effects inverse variance weighted meta-analysis was used to combine cohort-specific results. The heterogeneity index (I2) was used to assess heterogeneity. The P values for the association were 0.001, <0.001, and 0.23 for the NHS, the HPFS, and the NHS II, respectively. BMI, body mass index; CI, confidence interval; DASH, Dietary Approaches to Stop Hypertension; HPFS, Health Professionals Follow-up Study; HR, hazard ratio; NHS, Nurses’ Health Study.

(TIFF)

S6 Fig. Risk of type 2 diabetes according to categories of the global polygenic scores and adherence to a healthy diet in age-adjusted secondary analyses.

Shown are age-adjusted HRs and 95% CI of the estimate for type 2 diabetes according to genetic risk and diet quality categories using the AHEI score. In these comparisons, participants with low genetic risk and high-quality diet served as the reference group. Cox proportional hazards models were stratified by age and adjusted for ancestry-derived principal components (not time-varying). A fixed-effects meta-analysis was used to combine cohort-specific results. AHEI, Alternate Healthy Eating Index; CI, confidence interval; HR, hazard ratio.

(PNG)

S7 Fig. Risk of type 2 diabetes according to categories of the 5 pathway-specific polygenic scores and diet quality.

Shown are multivariable-adjusted HRs and 95% CI of the estimate for type 2 diabetes incidence according to pathway-specific polygenic score and diet quality categories. (A) Beta-cell polygenic score, (B) proinsulin polygenic score, (C) obesity polygenic score, (D) lipodystrophy polygenic score, and (E) liver metabolism polygenic score. In these comparisons, participants at low genetic risk with high-quality diet served as the reference group. A fixed-effects meta-analysis was used to combine cohort-specific results. CI, confidence interval; HR, hazard ratio.

(PDF)

S8 Fig. Risk of incident type 2 diabetes according to genetic and diet quality risk in each cohort separately.

Shown are multivariable-adjusted HRs and 95% CI of the estimate for type 2 diabetes in (A) NHS, (B) HPFS, and (C) NHS II according to genetic risk and diet quality categories. In these comparisons, participants with low genetic risk and high-quality diet served as the reference group. CI, confidence interval; HR, hazard ratio; HPFS, Health Professionals Follow-up Study; NHS, Nurses’ Health Study.

(PDF)

S9 Fig. Risk of type 2 diabetes according to categories of the global polygenic scores and adherence to a healthy diet—sensitivity analysis using the DASH score.

Shown are multivariable-adjusted HRs and 95% CI of the estimate for type 2 diabetes according to genetic risk and diet quality categories using the DASH score. In these comparisons, participants with low genetic risk and high-quality diet served as the reference group. A fixed-effects meta-analysis was used to combine cohort-specific results. CI, confidence interval; DASH, Dietary Approaches to Stop Hypertension; HR, hazard ratio.

(PNG)

S10 Fig. Risk of type 2 diabetes according to categories of the pathway specific polygenic scores and adherence to a healthy diet—sensitivity analysis using the DASH score.

Shown are multivariable-adjusted HRs and 95% CI of the estimate for type 2 diabetes incidence according to pathway-specific polygenic score and diet quality categories using the DASH score. (A) Beta-cell polygenic score, (B) proinsulin polygenic score, (C) obesity polygenic score, (D) lipodystrophy polygenic score, and (E) liver metabolism polygenic score. In these comparisons, participants at low genetic risk with high-quality diet served as the reference group. A fixed-effects meta-analysis was used to combine cohort-specific results. CI, confidence interval; DASH, Dietary Approaches to Stop Hypertension; HR, hazard ratio.

(PDF)

S11 Fig. Interplay between diet quality and global polygenic score on type 2 diabetes risk according to changes in BMI.

Three-dimensional illustrations of type 2 diabetes risk, genetic susceptibility, and diet quality by BMI among individuals with normal weight (A), overweight (B), and obesity (C). The blue-colored region maps the lower risk area, and the red-colored region maps the higher risk area. Deciles of AHEI are inverse transformed, with 0 being good diet quality and 10 bad diet quality. Data from 3 cohorts were combined. Multivariable analyses were stratified by age and adjusted for time-varying confounders including cohort (not time-varying), ancestry-derived principal components (not time-varying), family history of diabetes (not time-varying), history of hypertension, history of hypercholesterolemia, menopausal status (women only), smoking status, physical activity, and total energy intake. P = 0.681 for 3-way interaction. AHEI, Alternate Healthy Eating Index; BMI, body mass index; SD, standard deviation; T2D, type 2 diabetes.

(PDF)

S1 Table. Differences in baseline characteristics between the sample of participants included in this study and all participants in each original cohort.

(DOCX)

S2 Table. Characteristics of genetic variants used to build the 5 different pathway-specific polygenic scores.

(DOCX)

S3 Table. Associations of global and process-specific polygenic scores with type 2 diabetes risk in secondary analyses.

(DOCX)

S4 Table. Correlation between polygenic scores included in this study.

(DOCX)

S5 Table. Associations of global and process-specific polygenic scores with type 2 diabetes risk, random-effects meta-analysis.

(DOCX)

S6 Table. Multiplicative interactions between diet quality and genetic risk using global and pathway-specific polygenic scores.

Secondary analyses using the DASH score. DASH, Dietary Approaches to Stop Hypertension.

(DOCX)

S7 Table. Additive interactions between diet quality and genetic susceptibility on type 2 diabetes risk, crude models.

(DOCX)

S8 Table. Additive interactions between diet quality and genetic susceptibility on type 2 diabetes risk in each cohort.

(DOCX)

S9 Table. Additive interactions between diet quality and genetic risk using global and pathway-specific polygenic scores.

Secondary analyses using the DASH score. DASH, Dietary Approaches to Stop Hypertension.

(DOCX)

S10 Table. Interplay between diet quality and pathway-specific polygenic scores on type 2 diabetes risk by changes in BMI.

BMI, body mass index.

(DOCX)

S1 Text. Type 2 diabetes polygenic scores.

(DOCX)

S2 Text. Prespecified analysis plan.

(DOCX)

S3 Text. STROBE checklist.

STROBE, Strengthening the Reporting of Observational Studies in Epidemiology.

(DOCX)

Abbreviations

AHEI, Alternate Healthy Eating Index
BMI, body mass index
CI, confidence interval
DASH, Dietary Approaches to Stop Hypertension
GWAS, genome-wide association study
HPFS, Health Professionals Follow-up Study
HR, hazard ratio
IQR, interquartile range
NHS, Nurses’ Health Study
RERI, relative excess risk due to interaction
SD, standard deviation
STROBE, Strengthening the Reporting of Observational Studies in Epidemiology

Data Availability

The data underlying the generation of the global polygenic score for type 2 diabetes are available from the UK Biobank project site, subject to registration and application process. Further details can be found at https://www.ukbiobank.ac.uk. This research was conducted under UK Biobank application no. 45052. Code to run the genome-wide association analysis for type 2 diabetes and generate the global polygenic score has been uploaded to GitHub (https://github.com/lab319/ps-diet-t2d). Information including the procedures to obtain and access the data and codes used in this study in the Nurses’ Health Study I and II, and the Health Professionals Follow-Up Study is described at http://www.nurseshealthstudy.org/researchers for the Nurses’ Health Study (contact: nhsaccess@channing.harvard.edu) or https://www.hsph.harvard.edu/hpfs/ for the Health Professionals Follow-up Study (contact: hpfs@hsph.harvard.edu). The scripts to analyze NHS/HPFS data presented in this manuscript are open and widely available once access to the system is granted.

Funding Statement

This study was funded by research grants from the National Institutes of Health CA186107 (J.E.M.), CA176726 (W.C.W.), CA167552 (W.C.W.), HL034594 (J.E.M.), HL035464 (Q.S.), EY015473 (F.B.H.), DK112940 (F.B.H.), DK120870 (F.B.H.), DK40561 (J.M.), and DK110550 (J.C.F.), the American Diabetes Association 1-18-PMF-029 (M.G-F.), and the National Natural Science Foundation of China 61471078 (B.M.). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

References

1. Cheng YJ, Kanaya AM, Araneta MRG, Saydah SH, Kahn HS, Gregg EW, et al. Prevalence of Diabetes by Race and Ethnicity in the United States, 2011–2016. JAMA. 2019;322(24):2389–98. doi: 10.1001/jama.2019.19365
2. Hunter DJ. Gene-environment interactions in human diseases. Nat Rev Genet. 2005;6(4):287–98. doi: 10.1038/nrg1578
3. Franks PW, McCarthy MI. Exposing the exposures responsible for type 2 diabetes and obesity. Science. 2016;354(6308):69–73. doi: 10.1126/science.aaf5094
4. McAllister K, Mechanic LE, Amos C, Aschard H, Blair IA, Chatterjee N, et al. Current Challenges and New Opportunities for Gene-Environment Interaction Studies of Complex Diseases. Am J Epidemiol. 2017;186(7):753–61. doi: 10.1093/aje/kwx227
5. Florez JC, Jablonski KA, Bayley N, Pollin TI, de Bakker PIW, Shuldiner AR, et al. TCF7L2 polymorphisms and progression to diabetes in the Diabetes Prevention Program. N Engl J Med. 2006;355(3):241–50. doi: 10.1056/NEJMoa062418
6. Langenberg C, Sharp SJ, Franks PW, Scott RA, Deloukas P, Forouhi NG, et al. Gene-Lifestyle Interaction and Type 2 Diabetes: The EPIC InterAct Case-Cohort Study. PLoS Med. 2014;11(5):e1001647. doi: 10.1371/journal.pmed.1001647
7. Wessel J, Chu AY, Willems SM, Wang S, Yaghootkar H, Brody JA, et al. Low-frequency and rare exome chip variants associate with fasting glucose and type 2 diabetes susceptibility. Nat Commun. 2015;6(1):5897. doi: 10.1038/ncomms6897
8. Said MA, Verweij N, van der Harst P. Associations of Combined Genetic and Lifestyle Risks With Incident Cardiovascular Disease and Diabetes in the UK Biobank Study. JAMA Cardiol. 2018;3(8):693–702. doi: 10.1001/jamacardio.2018.1717
9. Merino J, Guasch-Ferré M, Ellervik C, Dashti HS, Sharp SJ, Wu P, et al. Quality of dietary fat and genetic risk of type 2 diabetes: individual participant data meta-analysis. BMJ. 2019;366:l4292. doi: 10.1136/bmj.l4292
10. Merino J, Jablonski KA, Mercader JM, Kahn SE, Chen L, Harden M, et al. Interaction Between Type 2 Diabetes Prevention Strategies and Genetic Determinants of Coronary Artery Disease on Cardiometabolic Risk Factors. Diabetes. 2020;69(1):112–20. doi: 10.2337/db19-0097
11. Li H, Khor C-C, Fan J, Lv J, Yu C, Guo Y, et al. Genetic risk, adherence to a healthy lifestyle, and type 2 diabetes risk among 550,000 Chinese adults: results from 2 independent Asian cohorts. Am J Clin Nutr. 2020;111(3):698–707. doi: 10.1093/ajcn/nqz310
12. Khera AV, Chaffin M, Aragam KG, Haas ME, Roselli C, Choi SH, et al. Genome-wide polygenic scores for common diseases identify individuals with risk equivalent to monogenic mutations. Nat Genet. 2018;50(9):1219–24. doi: 10.1038/s41588-018-0183-z
13. Udler MS, Kim J, von Grotthuss M, Bonàs-Guarch S, Cole JB, Chiou J, et al. Type 2 diabetes genetic loci informed by multi-trait associations point to disease mechanisms and subtypes: A soft clustering analysis. PLoS Med. 2018;15(9):e1002654. doi: 10.1371/journal.pmed.1002654
14. Rimm EB, Giovannucci EL, Willett WC, Colditz GA, Ascherio A, Rosner B, et al. Prospective study of alcohol consumption and risk of coronary disease in men. Lancet. 1991;338(8765):464–8. doi: 10.1016/0140-6736(91)90542-w
15. Colditz GA, Manson JE, Hankinson SE. The Nurses Health Study: 20-year contribution to the understanding of health among women. J Womens Health. 1997;6(1):49–62. doi: 10.1089/jwh.1997.6.49
16. Lindström S, Loomis S, Turman C, Huang H, Huang J, Aschard H, et al. A comprehensive survey of genetic variation in 20,691 subjects from four large cohorts. PLoS ONE. 2017;12(3):e0173997. doi: 10.1371/journal.pone.0173997
17. National Diabetes Data Group. Classification and diagnosis of diabetes mellitus and other categories of glucose intolerance. Diabetes. 1979;28(12):1039–57. doi: 10.2337/diab.28.12.1039
18. American Diabetes Association. Report of the expert committee on the diagnosis and classification of diabetes mellitus. Diabetes Care. 1997;20(7):1183–97. doi: 10.2337/diacare.20.7.1183
19. Standards of medical care in diabetes—2010. Diabetes Care. 2010;33(Suppl 1):S11–S61.
20. Manson JE, Rimm EB, Stampfer MJ, Colditz GA, Willett WC, Krolewski AS, et al. Physical activity and incidence of non-insulin-dependent diabetes mellitus in women. Lancet. 1991;338(8770):774–8. doi: 10.1016/0140-6736(91)90664-b
21. Loh P-R, Tucker G, Bulik-Sullivan BK, Vilhjálmsson BJ, Finucane HK, Salem RM, et al. Efficient Bayesian mixed-model analysis increases association power in large cohorts. Nat Genet. 2015;47(3):284–90. doi: 10.1038/ng.3190
22. Vilhjálmsson BJ, Yang J, Finucane HK, Gusev A, Lindström S, Ripke S, et al. Modeling Linkage Disequilibrium Increases Accuracy of Polygenic Risk Scores. Am J Hum Genet. 2015;97(4):576–92. doi: 10.1016/j.ajhg.2015.09.001
23. Mahajan A, Taliun D, Thurner M, Robertson NR, Torres JM, Rayner NW, et al. Fine-mapping type 2 diabetes loci to single-variant resolution using high-density imputation and islet-specific epigenome maps. Nat Genet. 2018;50(11):1505–13. doi: 10.1038/s41588-018-0241-6
24. Chiuve SE, Fung TT, Rimm EB, Hu FB, McCullough ML, Wang M, et al. Alternative dietary indices both strongly predict risk of chronic disease. J Nutr. 2012;142(6):1009–18. doi: 10.3945/jn.111.157222
25. Fung TT, Chiuve SE, McCullough ML, Rexrode KM, Logroscino G, Hu FB. Adherence to a DASH-style diet and risk of coronary heart disease and stroke in women. Arch Intern Med. 2008;168(7):713–20. doi: 10.1001/archinte.168.7.713
26. VanderWeele TJ, Tchetgen Tchetgen EJ. Attributing effects to interactions. Epidemiology. 2014;25(5):711–22. doi: 10.1097/EDE.0000000000000096
27. VanderWeele TJ, Knol MJ. A tutorial on interaction. Epidemiol Methods. 2014;3(1):33–72.
28. VanderWeele TJ, Vansteelandt S. Invited commentary: Some advantages of the relative excess risk due to interaction (RERI)-Towards better estimators of additive interaction, Vol. 179. Am J Epidemiol. 2014;179(6):670–1. doi: 10.1093/aje/kwt316
29. Wood W, Rünger D. Psychology of Habit. Annu Rev Psychol. 2016;67(1):289–314. doi: 10.1146/annurev-psych-122414-033417
30. von Elm E, Altman DG, Egger M, Pocock SJ, Gøtzsche PC, Vandenbroucke JP. The Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) statement: Guidelines for reporting observational studies. PLoS Med. 2007;4(10):e296. doi: 10.1371/journal.pmed.0040296
31. He Y, Lakhani CM, Rasooly D, Manrai AK, Tzoulaki I, Patel CJ. Comparisons of Polyexposure, Polygenic, and Clinical Risk Scores in Risk Prediction of Type 2 Diabetes. Diabetes Care. 2021;44(4):935–43. doi: 10.2337/dc20-2049
32. Franks PW, Merino J. Gene-lifestyle interplay in type 2 diabetes. Curr Opin Genet Dev. 2018;50:35–40. doi: 10.1016/j.gde.2018.02.001
33. Knol MJ, VanderWeele TJ, Groenwold RHH, Klungel OH, Rovers MM, Grobbee DE. Estimating measures of interaction on an additive scale for preventive exposures. Eur J Epidemiol. 2011;26(6):433–8. doi: 10.1007/s10654-011-9554-9
34. Blot WJ, Day NE. Synergism and interaction: Are they equivalent? Am J Epidemiol. 1979;110(1):99–100. doi: 10.1093/oxfordjournals.aje.a112793
35. VanderWeele TJ, Ko Y-A, Mukherjee B. Environmental Confounding in Gene-Environment Interaction Studies. Am J Epidemiol. 2013;178(1):144–52. doi: 10.1093/aje/kws439
36. Evert AB, Dennison M, Gardner CD, Garvey WT, Lau KHK, MacLeod J, et al. Nutrition Therapy for Adults With Diabetes or Prediabetes: A Consensus Report. Diabetes Care. 2019;42(5):731–54. doi: 10.2337/dci19-0014
  • 37.U.S. Department of Health and Human Services and U.S. Department of Agriculture. 2015–2020 Dietary Guidelines for Americans. 8th Edition. December 2015. Available from: https://health.gov/dietaryguidelines/2015/guidelines/.
  • 38.Zeevi D, Korem T, Zmora N, Israeli D, Rothschild D, Weinberger A, et al. Personalized Nutrition by Prediction of Glycemic Responses. Cell. 2015;163(5):1079–94. doi: 10.1016/j.cell.2015.11.001 [DOI] [PubMed] [Google Scholar]
  • 39.Berry SE, Valdes AM, Drew DA, Asnicar F, Mazidi M, Wolf J, et al. Human postprandial responses to food and potential for precision nutrition. Nat Med. 2020;26(6):964–73. doi: 10.1038/s41591-020-0934-0 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 40.Bernstein AM, Rosner BA, Willett WC. Cereal fiber and coronary heart disease: A comparison of modeling approaches for repeated dietary measurements, intermediate outcomes, and long follow-up. Eur J Epidemiol. 2011;26(11):877–86. doi: 10.1007/s10654-011-9626-x [DOI] [PubMed] [Google Scholar]
  • 41.Savolainen O, Lind MV, Bergström G, Fagerberg B, Sandberg AS, Ross A. Biomarkers of food intake and nutrient status are associated with glucose tolerance status and development of type 2 diabetes in older Swedish women. Am J Clin Nutr. 2017;106(5):1302–10. doi: 10.3945/ajcn.117.152850 [DOI] [PubMed] [Google Scholar]
  • 42.Schnurr TM, Jørsboe E, Chadt A, Dahl-Petersen IK, Kristensen JM, Wojtaszewski JFP, et al. Physical activity attenuates postprandial hyperglycaemia in homozygous TBC1D4 loss-of-function mutation carriers. Diabetologia. 2021;64(8):1795–804. doi: 10.1007/s00125-021-05461-z [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 43.Vanderweele TJ. Epidemiologic Methods Sample Size and Power Calculations for Additive Interactions Sample Size and Power Calculations for Additive Interactions. Epidemiol Methods. 2012;1(1):159–88. doi: 10.1515/2161-962X.1010 [DOI] [PMC free article] [PubMed] [Google Scholar]

Decision Letter 0

Caitlin Moyer

2 Apr 2021

Dear Dr Merino,

Thank you for submitting your manuscript entitled "Polygenic scores, diet quality, and type 2 diabetes risk: a prospective, observational study" for consideration by PLOS Medicine.

Your manuscript has now been evaluated by the PLOS Medicine editorial staff and I am writing to let you know that we would like to send your submission out for external peer review.

However, before we can send your manuscript to reviewers, we need you to complete your submission by providing the metadata that is required for full assessment. To this end, please login to Editorial Manager where you will find the paper in the 'Submissions Needing Revisions' folder on your homepage. Please click 'Revise Submission' from the Action Links and complete all additional questions in the submission questionnaire.

Please re-submit your manuscript within two working days.

Login to Editorial Manager here: https://www.editorialmanager.com/pmedicine

Once your full submission is complete, your paper will undergo a series of checks in preparation for peer review. Once your manuscript has passed all checks it will be sent out for review.

Feel free to email us at plosmedicine@plos.org if you have any queries relating to your submission.

Kind regards,

Caitlin Moyer, Ph.D.

Associate Editor

PLOS Medicine

Decision Letter 1

Caitlin Moyer

24 Aug 2021

Dear Dr. Merino,

Thank you very much for submitting your manuscript "Polygenic scores, diet quality, and type 2 diabetes risk: a prospective, observational study" (PMEDICINE-D-21-01465R1) for consideration at PLOS Medicine.

Your paper was evaluated by a senior editor and discussed among all the editors here. It was also discussed with an academic editor with relevant expertise, and sent to four independent reviewers, including a statistical reviewer. The reviews are appended at the bottom of this email and any accompanying reviewer attachments can be seen via the link below:

[LINK]

In light of these reviews, I am afraid that we will not be able to accept the manuscript for publication in the journal in its current form, but we would like to consider a revised version that addresses the reviewers' and editors' comments. Obviously we cannot make any decision about publication until we have seen the revised manuscript and your response, and we plan to seek re-review by one or more of the reviewers.

In revising the manuscript for further consideration, your revisions should address the specific points made by each reviewer and the editors. Please also check the guidelines for revised papers at http://journals.plos.org/plosmedicine/s/revising-your-manuscript for any that apply to your paper. In your rebuttal letter you should indicate your response to the reviewers' and editors' comments, the changes you have made in the manuscript, and include either an excerpt of the revised text or the location (eg: page and line number) where each change can be found. Please submit a clean version of the paper as the main article file; a version with changes marked should be uploaded as a marked up manuscript.

In addition, we request that you upload any figures associated with your paper as individual TIF or EPS files with 300dpi resolution at resubmission; please read our figure guidelines for more information on our requirements: http://journals.plos.org/plosmedicine/s/figures. While revising your submission, please upload your figure files to the PACE digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email us at PLOSMedicine@plos.org.

We expect to receive your revised manuscript by Sep 14 2021 11:59PM. Please email us (plosmedicine@plos.org) if you have any questions or concerns.

***Please note while forming your response, if your article is accepted, you may have the opportunity to make the peer review history publicly available. The record will include editor decision letters (with reviews) and your responses to reviewer comments. If eligible, we will contact you to opt in or out.***

We ask every co-author listed on the manuscript to fill in a contributing author statement, making sure to declare all competing interests. If any of the co-authors have not filled in the statement, we will remind them to do so when the paper is revised. If all statements are not completed in a timely fashion this could hold up the re-review process. If new competing interests are declared later in the revision process, this may also hold up the submission. Should there be a problem getting one of your co-authors to fill in a statement we will be in contact. YOU MUST NOT ADD OR REMOVE AUTHORS UNLESS YOU HAVE ALERTED THE EDITOR HANDLING THE MANUSCRIPT TO THE CHANGE AND THEY SPECIFICALLY HAVE AGREED TO IT. You can see our competing interests policy here: http://journals.plos.org/plosmedicine/s/competing-interests.

Please use the following link to submit the revised manuscript:

https://www.editorialmanager.com/pmedicine/

Your article can be found in the "Submissions Needing Revision" folder.

To enhance the reproducibility of your results, we recommend that you deposit your laboratory protocols in protocols.io, where a protocol can be assigned its own identifier (DOI) such that it can be cited independently in the future. Additionally, PLOS ONE offers an option to publish peer-reviewed clinical study protocols. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols

Please ensure that the paper adheres to the PLOS Data Availability Policy (see http://journals.plos.org/plosmedicine/s/data-availability), which requires that all data underlying the study's findings be provided in a repository or as Supporting Information. For data residing with a third party, authors are required to provide instructions with contact information for obtaining the data. PLOS journals do not allow statements supported by "data not shown" or "unpublished results." For such statements, authors must provide supporting data or cite public sources that include it.

We look forward to receiving your revised manuscript.

Sincerely,

Caitlin Moyer, Ph.D.

Associate Editor

PLOS Medicine

plosmedicine.org

-----------------------------------------------------------

Requests from the editors:

1. From the Academic Editor: Please emphasize the novelty, clinical significance and interpretation of the findings of the study, as compared to the existing literature describing relationships and interactions between lifestyle and genetic risk and risk of type 2 diabetes. Please provide some discussion on the new methods used in this study.

2. Data Availability statement: PLOS Medicine requires that the de-identified data underlying the specific results in a published article be made available, without restrictions on access, in a public repository or as Supporting Information at the time of article publication, provided it is legal and ethical to do so. Please see the policy at

http://journals.plos.org/plosmedicine/s/data-availability

and FAQs at

http://journals.plos.org/plosmedicine/s/data-availability#loc-faqs-for-data-policy

The Data Availability Statement (DAS) requires revision. For each data source used in your study:

a) If the data are freely or publicly available, note this and state the location of the data: within the paper, in Supporting Information files, or in a public repository (include the DOI or accession number).

b) If the data are owned by a third party but freely available upon request, please note this and state the owner of the data set and contact information for data requests (web or email address). Note that a study author cannot be the contact person for the data.

c) If the data are not freely available, please describe briefly the ethical, legal, or contractual restriction that prevents you from sharing it. Please also include an appropriate contact (web or email address) for inquiries (again, this cannot be a study author).

3. Abstract: Please structure your abstract using the PLOS Medicine headings (Background, Methods and Findings, Conclusions).

4. Abstract Background: Provide the context of why the study is important. The final sentence should clearly state the study question.

5. Abstract: Methods and Findings: Please include some summary demographics of the cohorts, and specify the cohorts are US-based. Please quantify the main results (with 95% CIs and p values). Please include the important dependent variables that are adjusted for in the analyses.

6. Abstract: Methods and Findings: In the last sentence of the Abstract Methods and Findings section, please describe the main limitation(s) of the study's methodology.

7. Author summary: At this stage, we ask that you include a short, non-technical Author Summary of your research to make findings accessible to a wide audience that includes both scientists and non-scientists. The Author Summary should immediately follow the Abstract in your revised manuscript. This text is subject to editorial change and should be distinct from the scientific abstract. Please see our author guidelines for more information: https://journals.plos.org/plosmedicine/s/revising-your-manuscript#loc-author-summary

8. Throughout: Please place in-text reference citations within square brackets.

9. Methods: Please include the information presented in the Appendix (Description of the study population and sample selection) where the Study design and Population are described.

10. Methods: Line 108: Please change the superscript reference to [10].

11. Methods: Diet quality: Please mention how the DASH score was determined. Please mention the food components for the AHEI and DASH.

12. Methods: Line 162: Please note where a complete list of the covariates/ risk factors may be found.

13. Methods: Line 202: Please change “duet” to “due” here.

14. Methods: Please ensure that the study is reported according to the STROBE guideline, and include the completed STROBE checklist as Supporting Information. Please add the following statement, or similar, to the Methods: "This study is reported as per the Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) guideline (S1 Checklist)."

The STROBE guideline can be found here: http://www.equator-network.org/reporting-guidelines/strobe/

When completing the checklist, please use section and paragraph numbers, rather than page numbers.

15. Methods: Did your study have a prospective protocol or analysis plan? Please state this (either way) early in the Methods section.

a) If a prospective analysis plan (from your funding proposal, IRB or other ethics committee submission, study protocol, or other planning document written before analyzing the data) was used in designing the study, please include the relevant prospectively written document with your revised manuscript as a Supporting Information file to be published alongside your study, and cite it in the Methods section. A legend for this file should be included at the end of your manuscript.

b) If no such document exists, please make sure that the Methods section transparently describes when analyses were planned, and when/why any data-driven changes to analyses took place.

c) In either case, changes in the analysis-- including those made in response to peer review comments-- should be identified as such in the Methods section of the paper, with rationale.

16. Results: In the first paragraph, please report on some of the participant characteristics presented in Table 1.

17. Results: Please provide 95% CIs and p values for all the main analyses described in the text (e.g., PRS association with diabetes risk). For adjusted analyses, please note the factors adjusted for (in the text or the appropriate figure/table), and please also provide results from unadjusted analyses.

18. Results: Line 271: Please mention the confounders adjusted for in this analysis.

19. Results: Line 275: Please clarify this sentence, as it seems as if there was no evidence for an interaction “...indicating a non-significant interaction on the multiplicative scale (P =0.65 for interaction…)”

20. Discussion: Line 288: Please rename this section “Discussion”

21. Discussion: Please present and organize the Discussion as follows: a short, clear summary of the article's findings; what the study adds to existing research and where and why the results may differ from previous research; strengths and limitations of the study; implications and next steps for research, clinical practice, and/or public policy; one-paragraph conclusion.

22. Please remove “Funding” and “Duality of Interests” sections from the main text. Please be sure that all information is completely and accurately entered in the Financial Disclosures and Competing Interests sections of the manuscript submission metadata.

23. References: Please use the "Vancouver" style for reference formatting, and see our website for other reference guidelines https://journals.plos.org/plosmedicine/s/submission-guidelines#loc-references

Please check journal title abbreviations; for example, please check if this should be J Womens Health for reference 15. Please update reference 31. Please use the same formatting for the references in the Appendix.

24. Figures and Tables: Please fully define all abbreviations used within each figure and table in the legend, including for the Supporting Information tables.

25. Figures 1 and 2: Please provide results from unadjusted analyses in addition to the adjusted analyses.

26. Table 2: Please provide the unadjusted analyses, and also please provide both 95% CIs and p values.

27. Supplementary Figure 7: Please place the axis labels so it is clear which axis is represented.

Comments from the reviewers:

Reviewer #1: This manuscript describes a detailed analysis of the potential for an interaction between diet and genetic risk for diabetes in regard to diabetes incidence. The authors conclude that any such interactions must be very small, and that a healthy diet is associated with lower risk of diabetes at all levels of genetic risk.

Generally, the results are clearly presented, and the findings are robust. The findings are not surprising for two reasons. First, it is hard to see why high genetic risk would remove the effect of diet. Diet affects blood glucose levels even among people with T1 diabetes, so why would it not do so in people with a high polygenic risk for T2? Second, several other studies (as listed by the authors) have reported the same findings, although the current analysis takes a different and potentially more sophisticated approach.

The authors stopped updating dietary information after a diagnosis of CVD or cancer, and carried forward earlier cumulative averages. It is not clear why this reduces bias. If a person develops CVD, and then makes some dietary changes, it may be those exact dietary changes that explain why they didn't subsequently develop diabetes.

Abstract conclusion. 'gradients of genetic risk' is not quite correct. 'levels of genetic risk' would be better.

Table 1. The study legend refers to 'Age standardized characteristics', which may lead the reader to think that they are all standardised to the same age profile. However, the footnote indicates that each analytic sample has been standardised to the age profile of its own parent study. This is confusing. Furthermore, it's not clear what the purpose of this is, and I couldn't see where it was explained in the methods. The purpose of table 1 is to describe the precise population used in the current analysis, not to provide estimates of prevalences and means within the larger study from which the analytic sample has been drawn. Thus, it seems to me that there is no value in age-standardizing.

Figure 1. The labelling of the Impaired insulin sensitivity scores is a little confusing. For example, it looks as if score 3 is a genetic risk score for obesity. Also, I would number these 5 pathways 1-2 and 1-3. The current numbering of 1-5 obscures the fact that 3-5 come under the sub-heading of (b).

Titles for figures 1 and 2 use different terminology for what appears to be the same concept ('genetic susceptibility' vs 'genetic risk'). Please keep them consistent with each other.

Figure 2. It's not easy to tell from the legend how A and B differ from each other. For example, in 2A, the third row compares Low quality diet with High quality diet among people with low genetic risk. This seems to be the same comparison as presented in the first row of 2B. The HRs are the same (1.31), but the CIs and p-values are different.

Reviewer #2: The manuscript examines the association of genetic factors and diet quality with the incidence of type 2 diabetes. Novel insight is gained through systematic evaluation of additive and multiplicative interactions between genetic risk and diet quality, which were found to not be significant. Overall, the paper is very well-written and utilizes a number of secondary resources and metrics (e.g., DASH and AHEI scores) to ensure that results are robust and reproducible. The choice of statistical methods is appropriate, and the analysis of each cohort separately ensures that results are not confounded by cross-cohort batch effects. This is a well-executed study, and I only have one relatively minor suggestion.

In the appendix, the authors write that models were adjusted for the first 20 principal components (PCs). However, this raises a concern that if the dominant variance in the data aligns with whether individuals had type 2 diabetes, then such an adjustment would lower the signal-to-noise ratio. Furthermore, the choice of 20 appears to be arbitrary. It would be good to see that a) the top 20 PCs do in fact describe non-random variance in the data (e.g., via Horn's parallel analysis), and b) that the top 20 PCs do NOT have a significant association with type 2 diabetes.
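Point (b) could be screened in several ways. The sketch below shows one illustrative option, which is neither the authors' nor the reviewer's stated method: a permutation test of whether a single principal component differs in mean between participants with and without type 2 diabetes. The function name `permutation_p_value`, its parameters, and the toy data are hypothetical. With 20 components, a multiple-testing correction would also be needed.

```python
import random
from statistics import mean


def permutation_p_value(pc_scores, has_t2d, n_permutations=2000, seed=0):
    """Two-sided permutation p-value for a difference in mean PC score
    between cases and non-cases (hypothetical helper, illustration only)."""
    rng = random.Random(seed)

    def abs_mean_diff(labels):
        cases = [x for x, y in zip(pc_scores, labels) if y]
        controls = [x for x, y in zip(pc_scores, labels) if not y]
        return abs(mean(cases) - mean(controls))

    observed = abs_mean_diff(has_t2d)
    labels = list(has_t2d)
    exceed = 0
    for _ in range(n_permutations):
        rng.shuffle(labels)  # break any link between PC score and outcome
        if abs_mean_diff(labels) >= observed:
            exceed += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (exceed + 1) / (n_permutations + 1)


# Toy data (invented): a PC that clearly separates cases from non-cases.
toy_pc = [2.1, 2.3, 1.9, 2.2, 2.0, 2.4, 0.1, -0.2, 0.0, 0.3, -0.1, 0.2]
toy_t2d = [True] * 6 + [False] * 6
toy_p = permutation_p_value(toy_pc, toy_t2d)
print(f"permutation p-value: {toy_p:.4f}")
```

A small p-value for a component would suggest that it carries outcome-related variance, which is the situation the reviewer is concerned about.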

Minor comments:

-The Introduction section is missing its caption

-To increase reproducibility and transparency, consider releasing R and SAS code as a publicly available repository (e.g., on GitHub).

-For completeness, please list the overall correlation between DASH and AHEI scores in each cohort.

-In Figure 2 caption, "..participants with low genetic risk and high-quality diet (Fig 2A) or high quality diet (Fig 2B) served as the reference group." seems unnecessarily redundant. Consider simplifying to "..participants with low genetic risk and high-quality diet served as the reference group."

Reviewer #3: Merino et al examined the joint effects of diet quality and genetic risk on incident T2D in three large prospective cohorts followed up for 902,386 person-years with available genetic data. This is a timely topic given the emerging concept of precision prevention with the implication that individuals with high genetic risk may benefit more from intervention. The authors leveraged the comprehensive genotype and phenotype data in UK Biobank to construct and validate a global polygenic score, as well as several pathway-specific polygenic scores for T2D, which were then analyzed jointly with the diet quality scores quantified by two different methods. The authors also conducted a three-way interaction analysis to investigate whether changes in BMI could modify the joint association of genetic risk and diet quality with T2D. The authors confirmed the independent associations of incident T2D with diet quality and genetic risk, albeit without significant interaction, and the joint association was not modified by BMI. They also quantified that 50% of the proportion of diabetes risk was explained by genetic risk and 38% by high quality diet. Within the polygenic score, ~60% was attributable to beta-cell dysfunction, although sub-analysis showed similar joint associations of quality diet and different pathway-specific polygenic risk scores. Overall, this is a well-conducted study in terms of design, sample size, cohort selection, exposure and outcome definition, and sensitivity analyses, although some issues need to be addressed.

Major issues:

One of the major limitations of this study is the applicability of the results to the general population since the majority of these subjects were well informed and educated healthcare professionals and predominantly nurses. Given the importance of education on diabetes risk, this point had not been discussed. In table 1, the author should state specifically the single sex nature in the 3 cohorts including 35,759 men and women. Although the authors concluded that high quality diet was associated with reduced risk of T2D irrespective of genetic burden, they proposed the use of polygenic score for surveillance purpose but did not discuss the implication of high quality diet within this context.

Background

1) Page 4, line 71-74: the authors listed four limitations in existing studies that prevented the researchers from identifying "genotypes interacting with lifestyle or dietary exposures" but did not address them in their analysis or discussion.

* Limitation 1: Partial characterization of genetic risk due to a limited number of variants. There are genetic loci associated with other traits that may interact with diet or lifestyle on T2D risk (e.g. https://pubmed.ncbi.nlm.nih.gov/33864366/ reported gene-diet interactions that influenced HbA1c). The "partial characterization of genetic risk" is not only restricted to "limited number of variants" but also the specificity, sensitivity and effect size of these scores. In Figure 1, the "Beta-cell dysfunction" score, constructed from fewer SNPs compared with the "850,000 independent variants" used for the global polygenic score, yielded comparable hazard ratios on incident T2D. Thus, increasing the number of T2D risk loci does not necessarily provide additional variance in discriminating subjects with different T2D genetic risk. In the Appendix, the global polygenic score for T2D together with sex and age had an AUC of 0.638; what is the AUC when they are combined with the dietary scores in explaining the total variance? Given the differences in genetic architecture and many non-genetic factors across various populations and settings, emphasis on using aggregated polygenic risk score may impede our understanding of the effects of environment-personal-lifestyle-familial factors on the development of T2D.

* Limitation 2: Interactions in previous studies were predominantly assessed on the multiplicative scale alone. The authors only presented the additive Relative Excess Risk due to Interaction (RERI) in Table 2 when they should include the index estimates with confidence intervals and P values of multiplicative interactions for completeness.

* Limitation 3: Single time point dietary exposure. The cohorts had multiple diet quality scores collected at different time points. The authors should state clearly i) whether the diet quality scores were used as a time-fixed variable or a time-varying variable in the Cox model; ii) compared with the single time point diet quality score, e.g. the last one before endpoint, what was the advantage of the cumulative average? Was the cumulative average sufficient to capture the longitudinal trajectory of the score?

* Limitation 4: Limited follow-up. The three cohorts were followed up for 26 to 30 years. With such long follow-up, considerable changes in the socioeconomic and demographic characteristics and ecological factors could influence diet quality and T2D incidence. Did the authors adjust for any socioeconomic covariates in their Cox regression analyses if relevant information was available in the questionnaire, or discuss the issue if not? Did they adjust for year of data capture?
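For Limitation 2, the additive and multiplicative interaction indices can both be derived from the same three hazard ratios, taking participants with neither exposure as the reference (HR = 1). The following is a minimal sketch under that assumption. The function names are hypothetical, and the input values are invented for illustration rather than taken from the study. Confidence intervals, which the reviewer also requests, would require the fitted model's covariance matrix (for example via the delta method) and are not shown.

```python
def reri(hr_10, hr_01, hr_11):
    """Relative excess risk due to interaction (additive scale).

    hr_10: exposure A only, hr_01: exposure B only, hr_11: both,
    each relative to the doubly unexposed reference group (HR = 1).
    """
    return hr_11 - hr_10 - hr_01 + 1.0


def attributable_proportion(hr_10, hr_01, hr_11):
    """Share of the risk in the doubly exposed group due to interaction."""
    return reri(hr_10, hr_01, hr_11) / hr_11


def multiplicative_interaction(hr_10, hr_01, hr_11):
    """Ratio of the joint HR to the product of the single-exposure HRs."""
    return hr_11 / (hr_10 * hr_01)


# Invented hazard ratios, for illustration only.
example = dict(hr_10=2.0, hr_01=1.5, hr_11=3.0)
print(reri(**example))                        # 0.5
print(multiplicative_interaction(**example))  # 1.0
print(attributable_proportion(**example))
```

In this invented example the joint hazard ratio equals the product of the individual ones, so there is no multiplicative interaction, yet the RERI is positive. This illustrates why reporting both scales, as the reviewer asks, can be informative.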

Methods

2) For the global polygenic risk scores and pathway-specific risk scores, suggest

* present the distribution of these PRS in the three cohorts separately;

* show overlap between the SNPs in the global and pathway-specific PRS;

* show correlation among the global PRS and pathway-specific PRSs

* discuss AUC differences between UK Biobank validation dataset (0.723) and combined cohort under study (0.638)

3) Regarding the diet quality assessed at baseline and every 4 years thereafter,

* What was the average number of diet quality scores for a subject in the three cohorts respectively?

* Was there any correlation between the number of diet quality scores and cumulative average of the scores (i.e. did subjects with more diet quality scores have a higher cumulative average)? If so, this may be a marker of adherence and would that affect the interaction?

* What was the within-subject variance distribution in the three cohorts respectively? Is the cumulative average a valid surrogate of the long-term diet trajectory for subjects, especially in those with fluctuating diet quality and large within-subject variance?

* Was there any difference in the within-subject variance among the three cohorts?

* The authors stopped updating dietary information when the subjects first reported having cardiovascular events or cancers. What about other factors that could dramatically change subjects' diet habits (e.g. T2D education or elevated glycemic traits that the subject may be aware of)?

* Did family history (FH) of diabetes affect the diet quality scores? Genetics is only a component of FH, did the authors repeat the analysis in those with or without FH?
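The quantities raised in this list (the cumulative average of repeated diet scores, carrying the average forward once updating stops, and the within-subject variance) can be made concrete with a short sketch. It assumes one diet quality score per questionnaire cycle for a single participant. The function names, the `freeze_at` parameter, and the scores are hypothetical and are not drawn from the cohorts.

```python
from statistics import variance


def cumulative_averages(scores, freeze_at=None):
    """Cumulative average of diet scores at each questionnaire cycle.

    If freeze_at is given (index of the cycle at which an intermediate
    diagnosis was first reported), updating stops and the last cumulative
    average is carried forward from that cycle onward.
    """
    averages = []
    total = 0.0
    for cycle, score in enumerate(scores):
        if freeze_at is not None and cycle >= freeze_at and averages:
            averages.append(averages[-1])  # carry forward, no update
            continue
        total += score
        averages.append(total / (cycle + 1))
    return averages


def within_subject_variance(scores):
    """Sample variance of one participant's repeated scores."""
    return variance(scores) if len(scores) > 1 else 0.0


# Invented scores for one participant over four cycles.
scores = [40.0, 50.0, 60.0, 70.0]
print(cumulative_averages(scores))               # [40.0, 45.0, 50.0, 55.0]
print(cumulative_averages(scores, freeze_at=2))  # [40.0, 45.0, 45.0, 45.0]
print(within_subject_variance(scores))
```

A participant with a steadily changing diet, as in this invented example, has a cumulative average that lags the most recent score, which is the concern behind the reviewer's question about trajectories.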

4) Page 6, line 183: The statement "Models were adjusted for time-varying …" is very vague. Please define the time-varying covariates and the "a priori" rationale for considering them as time-varying risk factors. Similarly, what were the time-fixed covariates in the models and the values at which time-point were used? A conceptual framework including these mediators or confounders would be helpful in explaining the selection of these covariates in the model.

5) Page 6, line 185: The statement "The fully adjusted model was further adjusted …" is contradictory.

6) Page 6, line 190-191: "Because time varying BMI could be on the causal pathway between diet quality and …, we also conducted separate models without adjusting for BMI". Where were the results of these separate models? In supplementary figure 7, the results are presented in 3 categories of BMI; was this the baseline BMI rather than changes in BMI between baseline and censor point?

7) Page 10, line 200: What was the "median distribution of the diet quality score in each cohort"? The statement is confusing from a statistical aspect.

8) Page 10, line 200: The RERI could also be calculated on a multiplicative scale to provide the index estimates of multiplicative interaction.

9) Page 10, line 205: "tested for multiplicative interactions by comparing the -2 log likelihood". The statement "-2 log likelihood" was informal. In addition, what test was used?

10) Page 10, line 209: The statement "examine whether changes in BMI modified the joint association …" is also vague. Which one was used to perform the 3-way interaction, time varying BMI or changes between two BMI?
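Regarding point 9, comparing the -2 log likelihood of nested models is usually formalized as a likelihood ratio test. The sketch below assumes that the two models differ by a single interaction term, so the statistic is referred to a chi-square distribution with 1 degree of freedom. Whether this matches the authors' test is not stated in the letter. The function name and the log-likelihood values are hypothetical.

```python
import math


def likelihood_ratio_test_1df(loglik_reduced, loglik_full):
    """Likelihood ratio test for one added parameter (1 df).

    loglik_reduced: log likelihood of the model without the interaction term
    loglik_full:    log likelihood of the model with the interaction term
    Returns the test statistic and its chi-square (1 df) p-value.
    """
    statistic = max(2.0 * (loglik_full - loglik_reduced), 0.0)
    # For 1 df, P(chi2 > x) equals erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value


# Invented log likelihoods, for illustration only.
stat, p = likelihood_ratio_test_1df(-1000.0, -998.0795)
print(f"LR statistic = {stat:.3f}, p = {p:.3f}")
```

For models that differ by more than one term, the same statistic would be compared against a chi-square distribution with the corresponding degrees of freedom, which requires a general chi-square survival function rather than the 1 df shortcut used here.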

Results

11) In Table 1, please include the incidence of T2D, incidence of death, number of lost to follow up and number of cardiovascular/cancer events in the three cohorts.

12) Page 12, line 246: "there was a risk gradient with increasing …". Please perform trend analysis to demonstrate the statistical significance of such "risk gradient".

13) Page 12, line 251-252: "The available sample gave us 80% statistical power …", at what significance level?

14) Given that the three cohorts were all sex-specific (two women cohorts and one men cohort), did the authors observe any sex-specific association between the risk factors under study and T2D incidence?

15) The lack of interactions between T2D genetic risk scores and diet quality on incident T2D limited the novelty of this work. A recent study reported interaction between fruit intake and T2D genetic risk (https://pubmed.ncbi.nlm.nih.gov/33399975/). Did the authors explore the association of a subset of foods or nutrients well known for their risk-conferring or risk-reducing effect for T2D instead of using the aggregated score?

16) Have the authors performed GWAS on dietary traits or borrowed similar information from the UK Biobank analysis (https://pubmed.ncbi.nlm.nih.gov/32193382/) to investigate whether there are shared genetic factors between T2D and unhealthy diet? Other similar work such as genome-wide gene-diet interaction analysis should preferably be performed or cited to reflect the limitation of the study (https://pubmed.ncbi.nlm.nih.gov/33864366/). These analyses would complement the existing results and strengthen this study.

17) Limitations and strengths of the study should be discussed.

Minor issues:

1) Page 4, line 82: "processes such impaired insulin …" -> "processes such as impaired insulin …"

2) Page 6, line 116: "… a global polygenic score for type 2 diabetes that capture overall …" -> "that captures overall …"

3) Page 7, line 138: "transcriptional regulation (13)' indicating …" -> "transcriptional regulation (13), indicating …"

4) Page 7, line 144-145: "in this study with different the number of …" -> "in this study with different number of"

5) Page 7, line 154: "we used the DASH score (Dietary Approaches to Stop Hypertension) …" -> "we used the Dietary Approaches to Stop Hypertension (DASH) score …"

6) Page 10, line 202: "which is the proportion of risk duet to …" -> "… due to …"

7) Page 26, line 566: "… or median (IQR) …" -> "… or medians (IQR) …"

8) Page 27, line 577: "time varying confounders including age, ancestry-derived principal components, family history …". How could the genetic principal components be used as time varying covariates? As for family history, can a change in status over time influence dietary habits and thence the risk of diabetes?

Reviewer #4: The paper describes a prospective cohort study of interaction between genetic risk and diet quality on risk of type 2 diabetes. Three studies that followed up health professionals for over 900,000 person-years were studied. Association is found with polygenic risk scores and with diet quality. These appear to additively affect risk of type 2 diabetes with no evidence of interaction on the additive or multiplicative scale.

This is a well done study and the work is well described. It provides further support for the idea that lifestyle modification is important independent of genetic risk.

1. While I think the paper has useful findings, one concern I have is how much of an advance this is over the previous papers (referenced here from 5 to 11). It seems this lack of interaction between genetics and lifestyle has been well described several times in other studies, which is acknowledged in the paper. I think the main new finding claimed in this paper is the lack of additive interaction - but this seems like a limited advance over the previous studies. I would like more discussion - for a non-statistician - on the reasons why this is an important finding.

2. I also wonder why the UK Biobank study wasn't used in this study for comparison. That is also a longitudinal study and is used here to generate genetic risk scores, but I'm unsure why it also couldn't have been used for the main analysis - it would at least provide a comparison to a different population with different study selection criteria.

3. Diet quality is, by its nature, not a precise measure. The manuscript describes a good approach to assessing diet quality, but if you measure something imperfectly then this measurement error is going to make the possibility of finding interactions much harder. Do the authors think that this is a potential explanation for the lack of interactions identified?

4. The extremes of polygenic risk aren't assessed here (e.g., >5% or >1% tails of polygenic risk). While overall in the population there may be no obvious interactions, what about if someone has a very high genetic risk of diabetes? In these individuals you might imagine diet has less of an impact. Has this been tested?

Any attachments provided with reviews can be seen via the following link:

[LINK]

Attachment

Submitted filename: Comments_PRS_Diet_T2D_PLoSMed_15July2021_clean.docx

Decision Letter 2

Caitlin Moyer

13 Dec 2021

Dear Dr. Merino,

Thank you very much for submitting your manuscript "Polygenic scores, diet quality, and type 2 diabetes risk: a prospective, observational study" (PMEDICINE-D-21-01465R2) for consideration at PLOS Medicine.

Your revised paper was evaluated by a senior editor and discussed among all the editors here. It was also discussed with an academic editor with relevant expertise, and sent to four of the original reviewers, including a statistical reviewer. The reviews are appended at the bottom of this email and any accompanying reviewer attachments can be seen via the link below:

[LINK]

In light of these reviews, we will not be able to accept the manuscript for publication in the journal in its current form, but we would like to consider a revised version that addresses the reviewers' and editors' comments. Obviously we cannot make any decision about publication until we have seen the revised manuscript and your response, and we plan to seek re-review by one or more of the reviewers.

In revising the manuscript for further consideration, your revisions should address the specific points made by each reviewer and the editors. Please also check the guidelines for revised papers at http://journals.plos.org/plosmedicine/s/revising-your-manuscript for any that apply to your paper. In your rebuttal letter you should indicate your response to the reviewers' and editors' comments, the changes you have made in the manuscript, and include either an excerpt of the revised text or the location (eg: page and line number) where each change can be found. Please submit a clean version of the paper as the main article file; a version with changes marked should be uploaded as a marked up manuscript.

In addition, we request that you upload any figures associated with your paper as individual TIF or EPS files with 300dpi resolution at resubmission; please read our figure guidelines for more information on our requirements: http://journals.plos.org/plosmedicine/s/figures. While revising your submission, please upload your figure files to the PACE digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email us at PLOSMedicine@plos.org.

We expect to receive your revised manuscript by Jan 03 2022 11:59PM. Please email us (plosmedicine@plos.org) if you have any questions or concerns.

***Please note while forming your response, if your article is accepted, you may have the opportunity to make the peer review history publicly available. The record will include editor decision letters (with reviews) and your responses to reviewer comments. If eligible, we will contact you to opt in or out.***

We ask every co-author listed on the manuscript to fill in a contributing author statement, making sure to declare all competing interests. If any of the co-authors have not filled in the statement, we will remind them to do so when the paper is revised. If all statements are not completed in a timely fashion this could hold up the re-review process. If new competing interests are declared later in the revision process, this may also hold up the submission. Should there be a problem getting one of your co-authors to fill in a statement we will be in contact. YOU MUST NOT ADD OR REMOVE AUTHORS UNLESS YOU HAVE ALERTED THE EDITOR HANDLING THE MANUSCRIPT TO THE CHANGE AND THEY SPECIFICALLY HAVE AGREED TO IT. You can see our competing interests policy here: http://journals.plos.org/plosmedicine/s/competing-interests.

Please use the following link to submit the revised manuscript:

https://www.editorialmanager.com/pmedicine/

Your article can be found in the "Submissions Needing Revision" folder.

To enhance the reproducibility of your results, we recommend that you deposit your laboratory protocols in protocols.io, where a protocol can be assigned its own identifier (DOI) such that it can be cited independently in the future. Additionally, PLOS ONE offers an option to publish peer-reviewed clinical study protocols. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols

Please ensure that the paper adheres to the PLOS Data Availability Policy (see http://journals.plos.org/plosmedicine/s/data-availability), which requires that all data underlying the study's findings be provided in a repository or as Supporting Information. For data residing with a third party, authors are required to provide instructions with contact information for obtaining the data. PLOS journals do not allow statements supported by "data not shown" or "unpublished results." For such statements, authors must provide supporting data or cite public sources that include it.

We look forward to receiving your revised manuscript.

Sincerely,

Caitlin Moyer, Ph.D.

Associate Editor

PLOS Medicine

plosmedicine.org

-----------------------------------------------------------

Requests from the editors:

1. Please address the remaining points of reviewer 2, including the issue of including top principal components in the model, as well as acknowledging grant support and providing source code and all underlying data needed for supporting these analyses.

2. Title: We suggest removing the word prospective from the title. We suggest including the study setting/population in the title.

3. Data availability statement: Please revise the data statement, and please update this in the Data Availability section of the manuscript submission system. In your revision, you indicate:

“Data Availability: Information including the procedures to obtain and access data from the Nurses’ Health Study and Health Professionals Follow-Up Study is described at http://www.nurseshealthstudy.org/researchers for the Nurses’ Health Study (contact: nhsaccess@channing.harvard.edu) or https://www.hsph.harvard.edu/hpfs/hpfs_collaborators.htm for the Health Professionals Followup Study (contact: hpfs@hsph.harvard.edu). Access to statistical codes and datasets will be facilitated following the existing data sharing guidelines provided, which can be found on the study websites.”

PLOS Medicine requires that the de-identified data underlying the specific results in a published article be made available, without restrictions on access, in a public repository or as Supporting Information at the time of article publication, provided it is legal and ethical to do so. Please see the policy at

http://journals.plos.org/plosmedicine/s/data-availability

and FAQs at

http://journals.plos.org/plosmedicine/s/data-availability#loc-faqs-for-data-policy

Please describe the locations of code and datasets necessary for replication of the analyses, or contact information for how these may be obtained. If the statistical codes and datasets are not freely available, please describe briefly the ethical, legal, or contractual restriction that prevents you from sharing those. Please also include an appropriate contact (web or email address) for inquiries (please note, this cannot be a study author).

4. Abstract: Background: At line 36, we suggest removing the word “prospectively” from the sentence.

5. Methods: Lines 128-131: Please provide either data or a reference supporting these sentences: “Participants for genetic determinations were selected to represent a representative sample of the original sample. Demographic characteristics and health status of participants with genetic information were generally similar to those who did not, therefore bias due to selection are probably minimized.”

6. Discussion: Line 419: Please temper this statement with “To the best of our knowledge, this is the first long-term…”

7. Page 25: Please remove the Author Contributions, Prior Presentation, and Data Availability sections from the main text, and please be sure all relevant information is entered in the manuscript submission system.

8. References: Please check the formatting of each reference, including journal title abbreviations. For example, reference 6 should be “PLoS Med” and reference 7 should be “Nat Commun”. Please use the "Vancouver" style for reference formatting, and see our website for other reference guidelines https://journals.plos.org/plosmedicine/s/submission-guidelines#loc-references

9. Figure 1 and Figure 2: Please also provide the unadjusted results. Please define all abbreviations in the legend (e.g. NHS, HPFS).

10. STROBE Checklist: Thank you for including the checklist. Please note “Funding Statement” as the location for item 22 “Funding” on the checklist.

11. S1 Appendix: Type 2 diabetes polygenic scores: Please define NHS I, NHS II and HPFS at first use in the text. Please also clarify “Genetic variants included in each polygenic score and corresponding weights for each variant are detailed in the appendix (pp).” Please also format references using “Vancouver” style, as in the main text.

12. S1 Table: In the legend, please define all abbreviations in the table.

13. S4 Table: Please define NHS I, NHS II and HPFS in the legend.

14. S9 Figure: Please define all abbreviations used in the figure in the legend.

Comments from the reviewers:

Reviewer #1: Thank you for addressing my comments.

Reviewer #2: Thank you for the opportunity to review the revised version of this work. Unfortunately, I feel that the authors have largely ignored my concerns in favor of working to address the more extensive criticism provided by the other reviewers.

In the response, the authors write that they followed the recommendations from investigators who developed BOLT-LMM. However, these recommendations are not quoted entirely accurately. Including principal components (PCs) as covariates to control for false positives is generally recommended by the BOLT-LMM developers for linear regression models, based on their earlier work [PMID: 16862161]. For mixed-model analysis, such as the one employed here, the original work describing BOLT-LMM states that "principal component analysis is not part of BOLT-LMM; it is unnecessary to perform PCA when running mixed model association methods" [PMID: 25642633]. Their later work does suggest "including PC covariates for the purpose of accelerating convergence of [...] BOLT-LMM" [PMID: 29892013], but algorithm runtime is of low importance in a one-shot analysis, especially when the source code is not being released for future reproducibility (see below). Blindly following a recommendation to improve runtime does nothing to address the original concern that the top principal components may have real (i.e., not ancestry-confounded false positive) type 2 diabetes signal, and that adjusting for these PCs may unnecessarily lower signal-to-noise ratio.

The URLs provided by the authors do NOT address my request to improve transparency and reproducibility of the study. The first URL is broken. The page at the second URL states that "Researchers using the NHS and NHSII data are required to acknowledge the grant support received by the National Institutes of Health (NIH) listed below, as appropriate, in all publications", yet the manuscript itself (which makes extensive use of these data) does not list any grants in the Acknowledgments section. Lastly, holding back source code under the guise of data availability is not appropriate. Data and software are two separate entities, and software licensing follows its own structure that is completely independent of any data usage agreements (https://en.wikipedia.org/wiki/Software_license). The authors need to decide what software license applies to their source code. If it falls under the umbrella of open source, then the code should be released as such. If the authors prefer to use a proprietary license, then it may be good to include a brief explanation why sacrificing transparency and reproducibility is appropriate in an NIH-funded study.

Reviewer #3: The authors have addressed all comments satisfactorily and the paper is much improved

Reviewer #4: The authors have addressed my comments well.

Any attachments provided with reviews can be seen via the following link:

[LINK]

Decision Letter 3

Caitlin Moyer

22 Feb 2022

Dear Dr. Merino,

Thank you very much for re-submitting your manuscript "Polygenic scores, diet quality, and type 2 diabetes risk: an observational study among 35,759 adults from three US cohorts" (PMEDICINE-D-21-01465R3) for review by PLOS Medicine.

I have discussed the paper with my colleagues and the academic editor and it was also seen again by one of the reviewers. I am pleased to say that provided the remaining editorial and production issues are dealt with we are planning to accept the paper for publication in the journal.

The remaining issues that need to be addressed are listed at the end of this email. Any accompanying reviewer attachments can be seen via the link below. Please take these into account before resubmitting your manuscript:

[LINK]

***Please note while forming your response, if your article is accepted, you may have the opportunity to make the peer review history publicly available. The record will include editor decision letters (with reviews) and your responses to reviewer comments. If eligible, we will contact you to opt in or out.***

In revising the manuscript for further consideration here, please ensure you address the specific points made by each reviewer and the editors. In your rebuttal letter you should indicate your response to the reviewers' and editors' comments and the changes you have made in the manuscript. Please submit a clean version of the paper as the main article file. A version with changes marked must also be uploaded as a marked up manuscript file.

Please also check the guidelines for revised papers at http://journals.plos.org/plosmedicine/s/revising-your-manuscript for any that apply to your paper. If you haven't already, we ask that you provide a short, non-technical Author Summary of your research to make findings accessible to a wide audience that includes both scientists and non-scientists. The Author Summary should immediately follow the Abstract in your revised manuscript. This text is subject to editorial change and should be distinct from the scientific abstract.

We expect to receive your revised manuscript within 1 week. Please email us (plosmedicine@plos.org) if you have any questions or concerns.

We ask every co-author listed on the manuscript to fill in a contributing author statement. If any of the co-authors have not filled in the statement, we will remind them to do so when the paper is revised. If all statements are not completed in a timely fashion this could hold up the re-review process. Should there be a problem getting one of your co-authors to fill in a statement we will be in contact. YOU MUST NOT ADD OR REMOVE AUTHORS UNLESS YOU HAVE ALERTED THE EDITOR HANDLING THE MANUSCRIPT TO THE CHANGE AND THEY SPECIFICALLY HAVE AGREED TO IT.

Please ensure that the paper adheres to the PLOS Data Availability Policy (see http://journals.plos.org/plosmedicine/s/data-availability), which requires that all data underlying the study's findings be provided in a repository or as Supporting Information. For data residing with a third party, authors are required to provide instructions with contact information for obtaining the data. PLOS journals do not allow statements supported by "data not shown" or "unpublished results." For such statements, authors must provide supporting data or cite public sources that include it.

Please review your reference list to ensure that it is complete and correct. If you have cited papers that have been retracted, please include the rationale for doing so in the manuscript text, or remove these references and replace them with relevant current references. Any changes to the reference list should be mentioned in the rebuttal letter that accompanies your revised manuscript.

Please note, when your manuscript is accepted, an uncorrected proof of your manuscript will be published online ahead of the final version, unless you've already opted out via the online submission form. If, for any reason, you do not want an earlier version of your manuscript published online or are unsure if you have already indicated as such, please let the journal staff know immediately at plosmedicine@plos.org.

If you have any questions in the meantime, please contact me or the journal staff on plosmedicine@plos.org.  

We look forward to receiving the revised manuscript by Mar 01 2022 11:59PM.   

Sincerely,

Caitlin Moyer, Ph.D.

Associate Editor 

PLOS Medicine

plosmedicine.org

------------------------------------------------------------

Requests from Editors:

1. Reviewer 2 comments: As suggested, please complete the check to see how much T2D signal is present in the top PCs. Please also provide the relevant information needed to access the analysis code.

2. Title: Please capitalize the first word of the subtitle, and please update this within the manuscript as well as the submission system: “Polygenic scores, diet quality, and type 2 diabetes risk: An observational study among 35,759 adults from three US cohorts”

3. Data availability statement: As mentioned by Reviewer 2, we request that you please make available the source code needed to replicate the study's findings in a repository (such as GitHub, SourceForge or Bitbucket) or a cloud computing service (such as Code Ocean). Protection of authors’ intellectual property will not be cause for exception. Please explain in the manuscript’s Data Availability Statement how readers can access the shared code and please also include an appropriate contact (web or email address) for inquiries (please note, this cannot be a study author).

4. Abstract: Line 40-41: Please revise if this should be: “Health Professional’s Follow-up Study”

5. Abstract: Line 53: We suggest revising to: “Limitations of this study include the self-report of diet information and possible bias resulting from inclusion of highly educated participants with available genetic data.”

6. Author summary: Line 63-64: Please revise to: “...the partial characterization of genetic risk and the predominant assessment of interactions on the multiplicative scale…” or similar.

7. Author summary: Line 79-81: Please clarify if this should be: “Further, we showed that the risk of type 2 diabetes attributed to the combination of increased genetic risk and low diet quality was similar to the sum of the risks associated with each factor alone.”

8. Methods: Line 116: Please change “RESEARCH DESIGN AND METHODS” to “Methods”

9. Methods: Line 252-253: Please describe how genetic risk categories (low, intermediate, and high) were established.

10. References: Please check journal title abbreviations (for example, reference 5 should be N Engl J Med).

11. Table 1: How was race/ethnicity defined and by whom? Should the DASH scores be reported here, in addition to the AHEI?

12. Table 3: The confidence interval for Beta-cell dysfunction is incomplete.

13. Figure 1: Please note in the legend that IR is insulin resistance.

14. Figure 2: Please report p values as p<0.001 where relevant. Please report to two decimal places where p≥0.01 and please report to three decimal places where p<0.01.

15. Supporting Information File: Please provide a “clean” version of the document.

16. S1 Fig: Please increase the font size of the axis labels on the graphs, if possible.

17. S2 Fig and S3 Fig: Please provide p values for the associations for each of the three cohorts.

18. S4, S5, S6, S7, S8 Fig: Please report p values as p<0.001 where relevant. Please report to two decimal places where p>0.01 and please report to three decimal places where p<0.01.

19. STROBE Checklist: Please provide a “clean” version of the checklist.
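
The P value reporting convention requested in items 14 and 18 can be expressed as a small helper. This is a minimal Python sketch of that rule only; the function name `format_p` and the example values are assumptions for illustration, and the choice to treat a P value of exactly 0.01 as the two-decimal case follows the wording of item 14.

```python
def format_p(p):
    """Format a P value following the editors' stated convention.

    - below 0.001: reported as "p<0.001"
    - from 0.001 up to (not including) 0.01: three decimal places
    - 0.01 and above: two decimal places
    """
    if p < 0.001:
        return "p<0.001"
    if p < 0.01:
        return f"p={p:.3f}"
    return f"p={p:.2f}"


# Hypothetical P values used only to exercise each branch of the rule.
for value in (0.0004, 0.004, 0.30, 0.69):
    print(value, "->", format_p(value))
```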

Comments from Reviewers:

Reviewer #2: First of all, I apologize if I have been creating extra work for everybody involved. My original hope was that the authors would simply perform a quick check to see how much Type 2 Diabetes (T2D) signal is even present in the top principal components (PCs). Since PCs have to be computed for BOLT-LMM adjustment anyway, checking for correlation with T2D is a trivial amount of computation on top of that, and the very original comment could have been addressed with a single sentence to the effect of "To verify that adjusting for PCs did not remove a substantial amount of relevant signal, we examined stratification of patients into whether they have T2D by the top 20 PCs and found that <conclusion> (<relevant statistics here>)".

Instead, it appears that we are deep down the rabbit hole of discussing the merits of adjusting BOLT-LMM in general. To clarify, I agree with the authors that adjusting for the top PCs is a widespread practice, and that the resulting correction for possible confounders strongly outweighs any possible reduction in the signal of interest. I was simply suggesting to look at the amount of signal being lost (which was expected to be minimal), but this is a relatively minor sanity check that should not hold up the overall publication process. I also apologize if this didn't come across clearly in my earlier comments.
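
As an illustration of how lightweight the check described above could be, the following minimal Python sketch correlates a binary T2D indicator with each principal component and reports the squared correlation as a rough measure of how much T2D signal a PC carries. The reviewer does not specify a statistic, so the use of the Pearson (point-biserial) correlation is an assumption, as are the function names and the toy data; none of the numbers come from the study.

```python
import math


def pearson_r(x, y):
    """Pearson correlation; with a 0/1 outcome this is the point-biserial r."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def pc_signal(pc_columns, t2d_status):
    """Return one correlation per principal component (one list per PC)."""
    return [pearson_r(column, t2d_status) for column in pc_columns]


# Toy data (hypothetical): two PCs measured on four participants.
toy_pcs = [
    [1.0, 2.0, 3.0, 4.0],     # PC1 values per participant
    [0.5, -0.5, 0.5, -0.5],   # PC2 values per participant
]
toy_t2d = [0, 0, 1, 1]        # 1 = diagnosed with T2D, 0 = not diagnosed

for index, r in enumerate(pc_signal(toy_pcs, toy_t2d), start=1):
    print(f"PC{index}: r = {r:.3f}, r^2 = {r * r:.3f}")
```

In a real analysis the same loop would run over the top 20 PCs already computed for the BOLT-LMM adjustment, and small squared correlations would support the expectation that little relevant signal is lost.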

I can confirm that the financial disclosures statement correctly acknowledges relevant NIH grants. Unfortunately, the code is still being lumped under the data availability statement for some strange reason. I fully understand privacy concerns associated with releasing data, and the "available upon request" system is completely appropriate for it. However, computer code should not have the same constraints, and its release can help future researchers understand exactly what analyses were performed and with what parameters, thereby increasing overall reproducibility and robustness of the findings. It is a minimal amount of effort to upload existing analysis scripts to GitHub/GitLab or equivalent, add a simple README, and attach an open-source license (e.g., MIT or GPL). It is not entirely clear why the authors are so hesitant to publicly release their code, especially in an NIH-funded study, but this is not a hill that I am looking to die on with this manuscript.

Overall, I think this is a well-executed study, and the relatively minor points above should not prevent its progress towards a publication.

Any attachments provided with reviews can be seen via the following link:

[LINK]

Decision Letter 4

Caitlin Moyer

21 Mar 2022

Dear Dr Merino, 

On behalf of my colleagues and the Academic Editor, Weiping Jia, I am pleased to inform you that we have agreed to publish your manuscript "Polygenic scores, diet quality, and type 2 diabetes risk: an observational study among 35,759 adults from three US cohorts" (PMEDICINE-D-21-01465R4) in PLOS Medicine.

Before your manuscript can be formally accepted you will need to complete some formatting changes, which you will receive in a follow up email. Please be aware that it may take several days for you to receive this email; during this time no action is required by you. Once you have received these formatting requests, please note that your manuscript will not be scheduled for publication until you have made the required changes.

In the meantime, please log into Editorial Manager at http://www.editorialmanager.com/pmedicine/, click the "Update My Information" link at the top of the page, and update your user information to ensure an efficient production process. 

Please also address the following editorial requests:

1. Title: Please capitalize the first word of the subtitle, and please be sure to update this in the submission system: “Polygenic scores, diet quality, and type 2 diabetes risk: An observational study among 35,759 adults from three US cohorts”

2. Data availability statement: As discussed, please revise the Data Availability Statement to:

“The data underlying the generation of the global polygenic score for type 2 diabetes are available from the UK Biobank project site, subject to registration and application process. Further details can be found at https://www.ukbiobank.ac.uk. This research was conducted under UK Biobank application no. 45052. Code to run the genome-wide association analysis for type 2 diabetes and generate the global polygenic score has been uploaded to GitHub (https://github.com/lab319/ps-diet-t2d).

Information including the procedures to obtain and access the data and codes used in this study in the Nurses’ Health Study I and II, and the Health Professionals Follow-Up Study is described at http://www.nurseshealthstudy.org/researchers for the Nurses’ Health Study (contact: nhsaccess@channing.harvard.edu) or https://www.hsph.harvard.edu/hpfs/ for the Health Professionals Follow-up Study (contact: hpfs@hsph.harvard.edu). The scripts to analyze NHS/HPFS data presented in this manuscript are open and widely available once access to the system is granted.”

Please ensure that all relevant scripts are available for access via the GitHub link provided.

3. Results: Line 318: Please check this sentence for a typo, and change to “When analyzed in each cohort separately, the risk of type 2 diabetes per 10-unit decrease in AHEI score was…” if appropriate.

4. Table 1 Legend (Page 30, Line 613): Please change “Caucasian” to “European” or another term that conveys what is meant.

PRESS

We frequently collaborate with press offices. If your institution or institutions have a press office, please notify them about your upcoming paper at this point, to enable them to help maximise its impact. If the press office is planning to promote your findings, we would be grateful if they could coordinate with medicinepress@plos.org. If you have not yet opted out of the early version process, we ask that you notify us immediately of any press plans so that we may do so on your behalf.

We also ask that you take this opportunity to read our Embargo Policy regarding the discussion, promotion and media coverage of work that is yet to be published by PLOS. As your manuscript is not yet published, it is bound by the conditions of our Embargo Policy. Please be aware that this policy is in place both to ensure that any press coverage of your article is fully substantiated and to provide a direct link between such coverage and the published work. For full details of our Embargo Policy, please visit http://www.plos.org/about/media-inquiries/embargo-policy/.

Thank you again for submitting to PLOS Medicine. We look forward to publishing your paper. 

Sincerely, 

Caitlin Moyer, Ph.D. 

Associate Editor 

PLOS Medicine

Associated Data


    Supplementary Materials

    S1 Fig. Distribution of polygenic scores in the NHS.

    Distribution of the global and pathway polygenic scores in the NHS. NHS, Nurses’ Health Study.

    (PNG)

    S2 Fig. Distribution of polygenic scores in the HPFS.

    Distribution of the global and pathway polygenic scores in the HPFS. HPFS, Health Professionals Follow-up Study.

    (PNG)

    S3 Fig. Distribution of polygenic scores in the NHS II.

    Distribution of the global and pathway polygenic scores in the NHS II. NHS, Nurses’ Health Study.

    (PNG)

    S4 Fig. Risk of incident type 2 diabetes associated with diet quality.

    Shown are adjusted HRs and 95% CIs for type 2 diabetes in each of the 3 prospective cohorts per 10-unit decrease in diet quality score, assessed using the AHEI score. The diet quality score was derived from repeated-measurements analyses. Cox proportional hazards models were stratified by age and adjusted for ancestry-derived principal components (not time-varying), family history of diabetes (not time-varying), and the time-varying covariates history of hypertension, history of hypercholesterolemia, menopausal status (women only), BMI, smoking status, physical activity, and total energy intake. Fixed-effects inverse variance weighted meta-analysis was used to combine cohort-specific results. The heterogeneity index (I2) was used to assess heterogeneity. The P values for the association were <0.001, <0.001, and 0.049 for the NHS, the HPFS, and the NHS II, respectively. AHEI, Alternate Healthy Eating Index; BMI, body mass index; CI, confidence interval; HPFS, Health Professionals Follow-up Study; HR, hazard ratio; NHS, Nurses’ Health Study.

    (TIFF)
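
The pooling step named in the caption above can be illustrated with a minimal Python sketch of a fixed-effects inverse variance weighted meta-analysis with Cochran's Q and I2. This is not the authors' code: the function name `fixed_effect_meta` and the input numbers are hypothetical, and the standard errors are back-calculated from 95% CIs under the assumption that the intervals are symmetric on the log scale.

```python
import math

Z_95 = 1.959963984540054  # two-sided 95% normal quantile


def fixed_effect_meta(estimates):
    """Pool cohort-specific hazard ratios with inverse variance weights.

    estimates: list of (hr, ci_low, ci_high) tuples, one per cohort.
    Returns the pooled HR, its 95% CI, Cochran's Q, and I2 (percent).
    """
    log_hrs, weights = [], []
    for hr, ci_low, ci_high in estimates:
        # Assumption: the CI is symmetric on the log scale, so its width
        # equals 2 * 1.96 * SE of the log hazard ratio.
        se = (math.log(ci_high) - math.log(ci_low)) / (2 * Z_95)
        log_hrs.append(math.log(hr))
        weights.append(1.0 / se ** 2)

    w_sum = sum(weights)
    pooled = sum(w * b for w, b in zip(weights, log_hrs)) / w_sum
    pooled_se = math.sqrt(1.0 / w_sum)

    # Cochran's Q: weighted squared deviations from the pooled estimate.
    q = sum(w * (b - pooled) ** 2 for w, b in zip(weights, log_hrs))
    df = len(estimates) - 1
    # I2: share of total variation attributed to heterogeneity, floored at 0.
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0

    return {
        "hr": math.exp(pooled),
        "ci_low": math.exp(pooled - Z_95 * pooled_se),
        "ci_high": math.exp(pooled + Z_95 * pooled_se),
        "q": q,
        "i2": i2,
    }


if __name__ == "__main__":
    # Illustrative inputs only; these are NOT results from the study.
    example = [(1.15, 1.08, 1.22), (1.12, 1.03, 1.22), (1.08, 1.00, 1.17)]
    print(fixed_effect_meta(example))
```

Because each weight is the reciprocal of the squared standard error, cohorts with narrower confidence intervals contribute more to the pooled estimate. An I2 near 0 suggests that the cohort-specific estimates differ no more than expected by chance.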

    S5 Fig. Risk of incident type 2 diabetes associated with diet quality—sensitivity analysis using the DASH score.

    Shown are adjusted HRs and 95% CIs for type 2 diabetes in each of the 3 prospective cohorts per 5-unit decrease in diet quality score, assessed using the DASH score. The diet quality score was derived from repeated-measurements analyses. Cox proportional hazards models were stratified by age and adjusted for ancestry-derived principal components (not time-varying), family history of diabetes (not time-varying), and the time-varying confounders history of hypertension, history of hypercholesterolemia, menopausal status (women only), BMI, smoking status, physical activity, total energy intake, and alcohol intake. Fixed-effects inverse variance weighted meta-analysis was used to combine cohort-specific results. The heterogeneity index (I2) was used to assess heterogeneity. The P values for the association were 0.001, <0.001, and 0.23 for the NHS, the HPFS, and the NHS II, respectively. BMI, body mass index; CI, confidence interval; DASH, Dietary Approaches to Stop Hypertension; HPFS, Health Professionals Follow-up Study; HR, hazard ratio; NHS, Nurses’ Health Study.

    (TIFF)

    S6 Fig. Risk of type 2 diabetes according to categories of the global polygenic scores and adherence to a healthy diet in age-adjusted secondary analyses.

    Shown are age-adjusted HRs and 95% CI of the estimate for type 2 diabetes according to genetic risk and diet quality categories using the AHEI score. In these comparisons, participants with low genetic risk and high-quality diet served as the reference group. Cox proportional hazards models were stratified by age and adjusted for ancestry-derived principal components (not time-varying). A fixed-effects meta-analysis was used to combine cohort-specific results. AHEI, Alternate Healthy Eating Index; CI, confidence interval; HR, hazard ratio.

    (PNG)

    S7 Fig. Risk of type 2 diabetes according to categories of the 5 pathway-specific polygenic scores and diet quality.

    Shown are multivariable-adjusted HRs and 95% CI of the estimate for type 2 diabetes incidence according to pathway-specific polygenic score and diet quality categories. (A) Beta-cell polygenic score, (B) proinsulin polygenic score, (C) obesity polygenic score, (D) lipodystrophy polygenic score, and (E) liver metabolism polygenic score. In these comparisons, participants at low genetic risk with high-quality diet served as the reference group. A fixed-effects meta-analysis was used to combine cohort-specific results. CI, confidence interval; HR, hazard ratio.

    (PDF)

    S8 Fig. Risk of incident type 2 diabetes according to genetic and diet quality risk in each cohort separately.

    Shown are multivariable-adjusted HRs and 95% CI of the estimate for type 2 diabetes in (A) NHS, (B) HPFS, and (C) NHS II according to genetic risk and diet quality categories. In these comparisons, participants with low genetic risk and high-quality diet served as the reference group. CI, confidence interval; HR, hazard ratio; HPFS, Health Professionals Follow-up Study; NHS, Nurses’ Health Study.

    (PDF)

    S9 Fig. Risk of type 2 diabetes according to categories of the global polygenic scores and adherence to a healthy diet—sensitivity analysis using the DASH score.

    Shown are multivariable-adjusted HRs and 95% CI of the estimate for type 2 diabetes according to genetic risk and diet quality categories using the DASH score. In these comparisons, participants with low genetic risk and high-quality diet served as the reference group. A fixed-effects meta-analysis was used to combine cohort-specific results. CI, confidence interval; DASH, Dietary Approaches to Stop Hypertension; HR, hazard ratio.

    (PNG)

    S10 Fig. Risk of type 2 diabetes according to categories of the pathway-specific polygenic scores and adherence to a healthy diet—sensitivity analysis using the DASH score.

    Shown are multivariable-adjusted HRs and 95% CI of the estimate for type 2 diabetes incidence according to pathway-specific polygenic score and diet quality categories using the DASH score. (A) Beta-cell polygenic score, (B) proinsulin polygenic score, (C) obesity polygenic score, (D) lipodystrophy polygenic score, and (E) liver metabolism polygenic score. In these comparisons, participants at low genetic risk with high-quality diet served as the reference group. A fixed-effects meta-analysis was used to combine cohort-specific results. CI, confidence interval; DASH, Dietary Approaches to Stop Hypertension; HR, hazard ratio.

    (PDF)

    S11 Fig. Interplay between diet quality and global polygenic score on type 2 diabetes risk according to changes in BMI.

    Three-dimensional illustrations of type 2 diabetes risk, genetic susceptibility, and diet quality by BMI among individuals with normal weight (A), overweight (B), and obesity (C). The blue region marks lower risk, and the red region marks higher risk. Deciles of AHEI are inverse transformed, with 0 being good diet quality and 10 bad diet quality. Data from the 3 cohorts were combined. Multivariable analyses were stratified by age and adjusted for cohort (not time-varying), ancestry-derived principal components (not time-varying), family history of diabetes (not time-varying), and the time-varying confounders history of hypertension, history of hypercholesterolemia, menopausal status (women only), smoking status, physical activity, and total energy intake. P = 0.681 for 3-way interaction. AHEI, Alternate Healthy Eating Index; BMI, body mass index; SD, standard deviation; T2D, type 2 diabetes.

    (PDF)

    S1 Table. Differences in baseline characteristics between the sample of participants included in this study and all participants in each original cohort.

    (DOCX)

    S2 Table. Characteristics of genetic variants used to build the 5 different pathway-specific polygenic scores.

    (DOCX)

    S3 Table. Associations of global and process-specific polygenic scores with type 2 diabetes risk in secondary analyses.

    (DOCX)

    S4 Table. Correlation between polygenic scores included in this study.

    (DOCX)

    S5 Table. Associations of global and process-specific polygenic scores with type 2 diabetes risk, random-effects meta-analysis.

    (DOCX)

    S6 Table. Multiplicative interactions between diet quality and genetic risk using global and pathway-specific polygenic scores.

    Secondary analyses using the DASH score. DASH, Dietary Approaches to Stop Hypertension.

    (DOCX)

    S7 Table. Additive interactions between diet quality and genetic susceptibility on type 2 diabetes risk, crude models.

    (DOCX)

    S8 Table. Additive interactions between diet quality and genetic susceptibility on type 2 diabetes risk in each cohort.

    (DOCX)

    S9 Table. Additive interactions between diet quality and genetic risk using global and pathway-specific polygenic scores.

    Secondary analyses using the DASH score. DASH, Dietary Approaches to Stop Hypertension.

    (DOCX)

    S10 Table. Interplay between diet quality and pathway-specific polygenic scores on type 2 diabetes risk by changes in BMI.

    BMI, body mass index.

    (DOCX)

    S1 Text. Type 2 diabetes polygenic scores.

    (DOCX)

    S2 Text. Prespecified analysis plan.

    (DOCX)

    S3 Text. STROBE checklist.

    STROBE, Strengthening the Reporting of Observational Studies in Epidemiology.

    (DOCX)

    Attachment

    Submitted filename: Comments_PRS_Diet_T2D_PLoSMed_15July2021_clean.docx

    Attachment

    Submitted filename: Response.PMEDICINE-D-21-01465R1.docx

    Attachment

    Submitted filename: Response.PMEDICINE-D-21-01465R2.FINAL.docx

    Attachment

    Submitted filename: Response.PMEDICINE-D-21-01465.R3.FINAL.docx

    Data Availability Statement

    The data underlying the generation of the global polygenic score for type 2 diabetes are available from the UK Biobank project site, subject to registration and application process. Further details can be found at https://www.ukbiobank.ac.uk. This research was conducted under UK Biobank application no. 45052. Code to run the genome-wide association analysis for type 2 diabetes and generate the global polygenic score has been uploaded to GitHub (https://github.com/lab319/ps-diet-t2d). Information including the procedures to obtain and access the data and codes used in this study in the Nurses’ Health Study I and II, and the Health Professionals Follow-Up Study is described at http://www.nurseshealthstudy.org/researchers for the Nurses’ Health Study (contact: nhsaccess@channing.harvard.edu) or https://www.hsph.harvard.edu/hpfs/ for the Health Professionals Follow-up Study (contact: hpfs@hsph.harvard.edu). The scripts to analyze NHS/HPFS data presented in this manuscript are open and widely available once access to the system is granted.


    Articles from PLoS Medicine are provided here courtesy of PLOS