Author manuscript; available in PMC: 2026 Jan 1.
Published in final edited form as: Epidemiology. 2024 Sep 24;36(1):126–138. doi: 10.1097/EDE.0000000000001795

Characterization of additive gene–environment interactions for colorectal cancer risk

Claire E Thomas 1,*, Yi Lin 1,*, Michelle Kim 1, Eric S Kawaguchi 2, Conghui Qu 1, Caroline Y Um 3, Brigid M Lynch 4,5, Bethany Van Guelpen 6,7, Kostas Tsilidis 8,9, Robert Carreras-Torres 10,11, Franzel JB van Duijnhoven 12, Lori C Sakoda 13,1, Peter T Campbell 14, Yu Tian 15,16, Jenny Chang-Claude 17,18, Stéphane Bézieau 19, Arif Budiarto 20,21, Julie R Palmer 22, Polly A Newcomb 1,23, Graham Casey 24, Loic Le Marchand 25, Marios Giannakis 26,27,28, Christopher I Li 1, Andrea Gsur 29, Christina Newton 3, Mireia Obón-Santacana 30,31,32, Victor Moreno 30,31,32,33, Pavel Vodicka 34,35,36, Hermann Brenner 37,38,39, Michael Hoffmeister 37, Andrew J Pellatt 40, Robert E Schoen 41, Niki Dimou 42, Neil Murphy 42, Marc J Gunter 42, Sergi Castellví-Bel 43, Jane C Figueiredo 44,45, Andrew T Chan 46,47,48,49,50,51, Mingyang Song 52,48,53, Li Li 54, D Timothy Bishop 55, Stephen B Gruber 56, James W Baurley 57,58, Stephanie A Bien 1, David V Conti 2, Jeroen R Huyghe 1, Anshul Kundaje 59,60, Yu-Ru Su 1, Jun Wang 61,62, Temitope O Keku 63, Michael O Woods 64, Sonja I Berndt 65, Stephen J Chanock 65, Catherine M Tangen 66, Alicja Wolk 67, Andrea Burnett-Hartman 68, Anna H Wu 69, Emily White 1,70, Matthew A Devall 71, Virginia Díez-Obrero 72, David A Drew 73, Edward Giovannucci 74,75, Akihisa Hidaka 1, Andre E Kim 2, Juan Pablo Lewinger 2, John Morrison 2, Jennifer Ose 76, Nikos Papadimitriou 42, Bens Pardamean 57, Anita R Peoples 76, Edward A Ruiz-Narvaez 77, Anna Shcherbina 59,60, Mariana C Stern 45, Xuechen Chen 37, Duncan C Thomas 2, Elizabeth A Platz 78, W James Gauderman 2, Ulrike Peters 1,23, Li Hsu 1,79
PMCID: PMC12142706  NIHMSID: NIHMS2079814  PMID: 39316822

Abstract

Background:

Colorectal cancer (CRC) is a common, fatal cancer. Identifying subgroups who may benefit more from intervention is of critical public health importance. Previous studies have assessed multiplicative interaction between genetic risk scores and environmental factors, but few have assessed additive interaction, the relevant public health measure.

Methods:

Using resources from colorectal cancer consortia including 45,247 CRC cases and 52,671 controls, we assessed multiplicative and additive interaction (relative excess risk due to interaction, RERI) using logistic regression between 13 harmonized environmental factors and genetic risk score including 141 variants associated with CRC risk.

Results:

There was no evidence of multiplicative interaction between environmental factors and genetic risk score. There was additive interaction where, for individuals with high genetic susceptibility, either heavy drinking [RERI = 0.24; 95% confidence interval (CI): 0.13, 0.36], ever smoking [0.11 (0.05, 0.16)], high BMI [female 0.09 (0.05, 0.13), male 0.10 (0.05, 0.14)], or high red meat intake [highest versus lowest quartile 0.18 (0.09, 0.27)] was associated with excess CRC risk greater than that for individuals with average genetic susceptibility. Conversely, we estimate those with high genetic susceptibility may benefit more from reducing CRC risk with aspirin/NSAID use [−0.16 (−0.20, −0.11)] or higher intake of fruit, fiber, or calcium [highest quartile versus lowest quartile −0.12 (−0.18, −0.05); −0.16 (−0.23, −0.09); −0.11 (−0.18, −0.05), respectively] than those with average genetic susceptibility.

Conclusions:

Additive interaction is important to assess for identifying subgroups who may benefit from intervention. The subgroups identified in this study may help inform precision CRC prevention.

Keywords: GxE, multiplicative interaction, additive interaction, colorectal cancer, genetic epidemiology

Introduction

Colorectal cancer (CRC) is a critical public health issue, as the third most commonly diagnosed cancer and second leading cause of cancer death globally.1 Ample evidence suggests that both genetics and environmental risk factors contribute to CRC development; heritability of CRC is estimated to be 15–35%.2,3 Genome-wide association studies (GWAS) of CRC have identified important genetic risk associations, where recent GWAS studies of more than 100,000 participants have identified over 100 independent risk associations related to normal colorectal homeostasis, Hedgehog signaling, proliferation, cell adhesion, migration, immune function, long noncoding RNAs and somatic drivers, and microbial interactions.4,5 There is great interest in identifying interaction between modifiable exposures and genetic risk for precision prevention of CRC, pointing to environmental factors that could be targeted to mitigate elevated genetic risk.6

GxE interaction occurs when the effect of one exposure (environmental, E) on an outcome varies across strata of a second exposure (genetic, G). For a disease trait, these interactions are commonly described in two ways: additive and multiplicative. Additive interaction is focused on the sum of the individual effects of G and E, while multiplicative interaction is focused on the product of the individual effects. Given a logistic regression model, which is widely used to model GxE for a binary outcome from case–control studies, the departure from the multiplicative effect can be measured by the regression coefficient of the G and E product term (βG×E). The departure from the additive effect can be described with the relative excess risk due to interaction (RERI). This cannot be validly estimated directly from linear models with case–control data,7 but it can be derived from estimates from the logistic regression model when the case–control odds ratio approximates the risk ratio. It has been asserted that interaction for identification of relevant subgroups with excess risk and public health impact is best assessed through additive interaction, which is often overlooked in epidemiologic studies focused on etiology.8 Multiplicative interaction is straightforward and convenient to assess through an interaction term in a logistic regression model that is easily derived in statistical software. It is important to note that when assessing interaction, if both exposures of interest have an effect on the outcome then mathematically interaction must be present on at least one scale, where the absence of additive interaction implies multiplicative interaction, and vice versa.8–10 Given that different information can be gained from different scales, it is recommended to present both additive and multiplicative interaction in practice.8,10,11

Many previous studies of GxE in CRC have assessed multiplicative interaction with genetic risk score,12–19 but to our knowledge, only one study has included additive interaction using the RERI.20 This study assessed interaction between genetic risk score of 95 genetic variants and a healthy lifestyle score combining all environmental factors. However, it may have limited public health utility in identifying which component of a healthy lifestyle score particularly warrants intervention among those with high genetic risk. Therefore, we set out to assess GxE interaction among 97,918 participants of European descent utilizing a genetic risk score including 141 known CRC-associated genetic variants and 13 environmental risk/protective factors. Our objective was twofold: to assess GxE interactions to identify subgroups among whom intervention on one or more environmental factors may have a large impact on reducing CRC risk, and to provide an example of GxE assessment on both the additive and multiplicative scales in agreement with current recommendations8 to encourage the evaluation of additive interaction to guide precision prevention.

Methods

Study population

We included participants in the Colon Cancer Family Registry (CCFR), the Colorectal Transdisciplinary Study (CORECT), and the Genetics and Epidemiology of Colorectal Cancer Consortium (GECCO). Study details have been previously published4,21,22 and can be found in eTable 1. Cases were identified as incident invasive colorectal cancer or advanced adenoma cases and confirmed by medical records, pathology reports, or death certificate information. For cohort studies, we assembled nested case–control sets via risk-set sampling. For population-based case–control studies we used population-based controls with study-specific eligibility. Controls were matched with cases on age, sex, race, and enrollment date or trial group, when applicable. For the subset of advanced adenoma cases (N = 4,774, eTable 1), matched controls were polyp-free at endoscopy at the time of adenoma selection. All participants gave written informed consent and studies were approved by their respective Institutional Review Boards.

We limited analyses to self-reported European/non-Hispanic White participants who, as part of the quality control step for genetic association analyses, were also required to cluster with the European reference population from the 1000 Genomes Project based on genetic principal components. The final pooled sample size was 45,247 cases and 52,671 controls.

Environmental factor selection

Demographic, lifestyle, body composition, and environmental exposures, throughout the paper referred to as environmental factors, were self-reported either at in-person or telephone interviews or via structured questionnaires. For cohort studies the variables were assessed at blood collection or participant recruitment; for case–control studies variables were assessed at least 1–2 years and up to 10 years or more preceding participant recruitment. Data harmonization consisted of a multi-step procedure performed at the GECCO coordinating center (Fred Hutchinson Cancer Center).23 Briefly, we defined common data elements a priori. We examined study questionnaires and data dictionaries, and through an iterative process of communication with data contributors, we mapped these elements to common data elements. Definitions, permissible values, and standardized coding were implemented into a single database via SAS and T-SQL. We checked resulting data for errors and outlying values within and between studies.

We selected environmental factors based on their established association with CRC risk and on availability of pooled data in our consortium study.23–25 The environmental factors included were: anthropometric measurements, including body mass index (BMI) per 5 kg/m2 and height per 10 cm; ever smoked (yes/no); study-specific definitions of regular use of aspirin and nonaspirin non-steroidal anti-inflammatory drugs (NSAIDs, yes/no); history of type II diabetes (yes/no); and dietary intake. Dietary variables were measured using food frequency questionnaires or diet histories, including alcohol, fruits, vegetables, dietary fiber, red meat, processed meat, dietary and supplemental calcium, dietary and supplemental folate, and total energy intake. We created sex- and study-specific quartiles for all dietary variables except alcohol. We categorized alcohol by grams of alcohol intake per day: non-drinker (<1 g/day), 1–28 g/day (moderate drinker), and >28 g/day (heavy drinker). For dichotomous or categorical variables, the lowest category of exposure (or no use) was used as the reference, except for alcohol where the moderate drinker group was set as the reference, given the observed J-shape association.24,26

Polygenic risk score construction

Details on genotyping and quality control have been previously published,4 and genotyping platforms used are summarized in eTable 1. Briefly, we excluded genotyped single nucleotide polymorphisms (SNPs) based on call-rate (< 95–98%), departure from Hardy–Weinberg equilibrium (P < 1×10^−4), inconsistencies between self-reported and genotypic sex, and discordant genotype calls within duplicate samples. We imputed all autosomal SNPs of all studies to the Haplotype Reference Consortium r1.1 reference panel27 via the Michigan Imputation Server28 and converted them into a binary format for data management and analysis using R package BinaryDosage.29 To capture common genetic predisposition, we calculated a genetic risk score (GRS) combining estimated effects of 141 previously GWAS-identified SNPs associated with CRC risk.4,30–32 These SNPs had independent contributions to CRC risk confirmed by linkage disequilibrium and/or conditional association analyses, which accounted for the lead SNP within each region.4,30–32 The detailed characteristics of these SNPs are provided in eTable 2. We coded each SNP variable as the expected number of copies of the variant allele. The weights were determined by the marginal log-odds ratios estimated from prior studies to avoid potential bias. For the known loci identified through GECCO, CCFR, or CORECT studies, the estimates were adjusted for the winner’s curse.33 The winner’s curse phenomenon occurs when the true effect sizes are overestimated due to selection bias from using the same data to identify variants that reach a certain significance threshold.33,34 We constructed a genetic risk score for each individual by taking the weighted sum of variant alleles over all 141 SNPs, accounting for the strength of the CRC association of each SNP. We standardized the genetic risk score as (risk score − mean risk score)/SD, where mean and SD were the mean and standard deviation of the genetic risk score calculated based on all participants.
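The weighted-sum and standardization steps described above can be sketched as follows. This is a minimal illustration rather than the study's pipeline: the function names, the three-SNP toy weights, and the dosages are hypothetical, and the use of the population SD (rather than the sample SD) is an assumption, since the text does not say which was used.

```python
from statistics import mean, pstdev


def weighted_grs(dosages, weights):
    """Weighted sum of expected variant-allele counts for one participant.

    dosages: expected copies of the variant allele per SNP (0 to 2).
    weights: per-SNP log-odds ratios taken from prior studies.
    """
    if len(dosages) != len(weights):
        raise ValueError("dosages and weights must have the same length")
    return sum(d * w for d, w in zip(dosages, weights))


def standardize(scores):
    """Centre on the mean and scale by the SD computed over all participants."""
    m = mean(scores)
    sd = pstdev(scores)  # assumption: population SD; sample SD is equally plausible
    return [(s - m) / sd for s in scores]


# Hypothetical toy data: 3 SNPs and 4 participants (the study used 141 SNPs).
toy_weights = [0.10, 0.05, 0.20]
toy_dosages = [
    [0.0, 1.0, 2.0],
    [1.0, 1.0, 0.0],
    [2.0, 0.0, 1.0],
    [1.0, 2.0, 1.0],
]
raw_scores = [weighted_grs(row, toy_weights) for row in toy_dosages]
z_scores = standardize(raw_scores)
```

After standardization, a one-unit change in the score corresponds to one standard deviation, which is the unit used for G in the interaction models.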

Additive and multiplicative interaction

Consider a binary genotype G and a binary environmental exposure E. Letting Y indicate disease status, we can fit a logistic regression model with an interaction term:

$$\operatorname{logit}\{P(Y=1 \mid G,E)\} = \beta_0 + \beta_1 G + \beta_2 E + \beta_3 GE, \tag{1}$$

where $\beta_0$ is the intercept, $\beta_1$ and $\beta_2$ are the main effects of G and E, respectively, and $\beta_3$ is the multiplicative interaction effect. To measure the interaction on an additive scale, we constructed a 2×2 table (Table 1) to facilitate understanding of RERI estimation. The cells in the table represent the odds ratio of disease for the reference group when both G and E are absent ($OR_{00}=1$), when factor G is present but E is absent ($OR_{01}$), when factor G is absent but E is present ($OR_{10}$), and when both factors G and E are present ($OR_{11}$), respectively. For simplicity we have defined G with binary classifications; in practice, however, G = 0 represents the mean genetic risk score and G = 1 represents the mean genetic risk score plus one standard deviation. The RERI is then defined as:

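$$\mathrm{RERI} = (OR_{11} - OR_{01}) - (OR_{10} - 1) = \left(e^{\beta_1+\beta_2+\beta_3} - e^{\beta_1}\right) - \left(e^{\beta_2} - 1\right) \tag{2}$$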

Table 1:

2×2 table of two exposures (genetic and environmental) and corresponding odds ratios

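| | Exposure 1 (G) = 0 (mean) | Exposure 1 (G) = 1 (mean + 1 SD) |
|---|---|---|
| Exposure 2 (E) = 0 | 1.0 (Ref) | $OR_{01} = e^{\beta_1}$ |
| Exposure 2 (E) = 1 | $OR_{10} = e^{\beta_2}$ | $OR_{11} = e^{\beta_1+\beta_2+\beta_3}$ |

Model: $\operatorname{logit}\{P(Y=1 \mid G,E)\} = \beta_0 + \beta_1 G + \beta_2 E + \beta_3 GE$ (1)

Additive interaction: $\mathrm{RERI} = (OR_{11} - OR_{01}) - (OR_{10} - 1) = \left(e^{\beta_1+\beta_2+\beta_3} - e^{\beta_1}\right) - \left(e^{\beta_2} - 1\right)$ (2)

Multiplicative interaction: $OR_{11}/(OR_{01} \times OR_{10}) = e^{\beta_3}$

For a continuous genetic exposure, G = 0 denotes the mean and G = 1 denotes the mean + 1 standard deviation; the 0/1 notation is used for simplicity.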

The first part of equation (2), $\left(e^{\beta_1+\beta_2+\beta_3} - e^{\beta_1}\right)$, estimates the difference in the odds ratio of developing disease when exposed to the environmental exposure compared to when not exposed among individuals who carry variant genotypes. The second part, $\left(e^{\beta_2} - 1\right)$, estimates the difference in the odds ratio when exposed to the environmental factor compared to when not exposed among those who do not carry variant genotypes. Finally, the difference of the two parts estimates the excess odds ratio difference between being exposed and not for those who carry the high-risk genotypes relative to those who do not. In other words, RERI quantifies the excess odds ratio due to jointly being exposed and carrying the high-risk genotype, compared to the reference group, beyond the sum of the separate excess odds ratios of being exposed and of carrying the high-risk genotype in the absence of the other risk factor. Given that the odds ratio approximates the relative risk, the RERI derived from odds ratios will approximately equal the RERI derived from risk data.8 We used the delta method for estimating the variance and 95% confidence intervals (CI) of RERI.35
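As a minimal sketch of equation (2) and a first-order delta-method interval, the functions below take the three fitted coefficients and their covariance matrix. The function names and the example covariance values are hypothetical; the study's own R code is not shown in the text, so this is an illustration under the stated formula, not a reproduction of the authors' implementation.

```python
import math


def reri(b1, b2, b3):
    """RERI from logistic coefficients: (OR11 - OR01) - (OR10 - 1)."""
    or11 = math.exp(b1 + b2 + b3)
    or01 = math.exp(b1)  # G present, E absent
    or10 = math.exp(b2)  # G absent, E present
    return (or11 - or01) - (or10 - 1.0)


def reri_delta_ci(b1, b2, b3, cov, z=1.959964):
    """Point estimate and Wald-type CI for RERI via the delta method.

    cov is the 3x3 covariance matrix of (b1, b2, b3) from the fitted model.
    """
    or11 = math.exp(b1 + b2 + b3)
    # Partial derivatives of RERI with respect to b1, b2, b3.
    grad = [or11 - math.exp(b1), or11 - math.exp(b2), or11]
    var = sum(grad[i] * cov[i][j] * grad[j] for i in range(3) for j in range(3))
    se = math.sqrt(var)
    est = reri(b1, b2, b3)
    return est, est - z * se, est + z * se


# Hypothetical inputs: OR01 = 1.5, OR10 = 1.4, no multiplicative interaction,
# and an illustrative diagonal covariance matrix.
example_cov = [[0.0004, 0.0, 0.0], [0.0, 0.0009, 0.0], [0.0, 0.0, 0.0004]]
estimate, lower, upper = reri_delta_ci(math.log(1.5), math.log(1.4), 0.0, example_cov)
```

With these inputs the joint odds ratio is 1.5 × 1.4 = 2.1, so the point estimate is 2.1 − 1.5 − 1.4 + 1 = 0.2 even though the multiplicative interaction term is zero.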

The reference level of continuous exposures is of critical importance for the additive interaction RERI, because it is calculated post hoc from a multiplicative logistic regression model. Changing the reference level by adding or subtracting a constant does not change the estimate for multiplicative interaction but does so for additive interaction. This is because additive interaction also involves estimates of main effects, which change if the reference level changes. For continuous variables (e.g., genetic risk score), we used the mean as the reference level so that RERI would reflect the comparison with the mean and the mean generally is more robust than using other values as the reference level (e.g., a minimum value).36 For dichotomous or categorical variables, no use or lowest quartile was used as the reference regardless of whether the exposure was a risk or protective factor. The exception was alcohol where we set the moderate drinker group as the reference due to the observed J-shape association.24,26 More details are provided in the eAppendix.
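The dependence on the reference level can be seen by re-centring G in the logistic model with G, E, and their product term. The derivation below is a sketch in that model's notation, where c is a hypothetical shift constant and RERI' denotes the RERI for a one-unit increase from the new reference.

```latex
\begin{aligned}
G' &= G - c \\
\operatorname{logit}\{P(Y=1 \mid G', E)\}
  &= \beta_0 + \beta_1 (G' + c) + \beta_2 E + \beta_3 (G' + c) E \\
  &= (\beta_0 + \beta_1 c) + \beta_1 G' + (\beta_2 + \beta_3 c) E + \beta_3 G' E \\
\mathrm{RERI}' &= e^{\beta_1 + (\beta_2 + \beta_3 c) + \beta_3} - e^{\beta_1} - e^{\beta_2 + \beta_3 c} + 1
\end{aligned}
```

The product-term coefficient $\beta_3$ is unchanged, while the main effect of E becomes $\beta_2 + \beta_3 c$, so the RERI generally differs from its original value unless $\beta_3 c = 0$.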

Statistical analysis

We presented mean (SD) and counts (percentages) for the genetic risk score and environmental factors for CRC cases and controls separately. We used logistic regression models to calculate ORs and 95% CIs to assess the associations of the 13 environmental factors and the genetic risk score with CRC risk; for the genetic risk score we calculated the OR per 1 standard deviation increase, as is common in the literature. In our models, quartile variables (Q2, Q3, Q4) were compared to Q1 as the reference group, treating each quartile exposure variable as a categorical predictor. We also used the ordinal quartile variable (1, 2, 3, 4) to calculate a linear trend test. Our primary complete data analysis models were adjusted for age at reference time, sex, study, the first three principal components of genetic ancestry, and total energy for dietary variables only. In further models we adjusted for all other 12 environmental factors of interest, where we used study- and sex-specific mean imputation for missing values (missing rates < 5%) in adjusted covariates to maximize our sample size. In our previous work we found consistent results using mean versus multiple imputation; therefore, we used mean imputation here for simplicity.37 We also assessed the correlation between the genetic risk score and environmental factors among controls, using a linear regression model for the association of each environmental factor with the genetic risk score, adjusted for age at reference time, sex, study, the first three principal components of genetic ancestry, and total energy for dietary variables only. For BMI and height, we stratified the analyses by sex, given that BMI and height vary by sex. We additionally conducted a sensitivity analysis restricted to colon-only cancers (26,102 colon cancer cases and 48,116 controls). We displayed additive (RERI) and multiplicative interaction estimates with 95% confidence intervals in forest plots.
To account for multiple testing of 11 exposure variables and two sex-stratified exposure variables, the Holm-Bonferroni method for p-value correction was used with 0.05 as the overall significance threshold for additive and multiplicative interaction separately (presented in eTable 4). All analyses were performed using R, version 4.1.3 (R Foundation for Statistical Computing, Vienna, Austria) software.
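The Holm-Bonferroni step-down adjustment can be sketched as below. This is a generic illustration of the procedure, not the study's code; the function name and the p-values are hypothetical. In the study the correction was applied across 13 tests (11 exposure variables plus two sex-stratified variables), separately for the additive and multiplicative scales.

```python
def holm_adjust(pvalues):
    """Return Holm-Bonferroni adjusted p-values in the original order.

    The smallest p-value is multiplied by m, the next by m - 1, and so on;
    adjusted values are capped at 1 and forced to be non-decreasing in rank.
    """
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        candidate = min(1.0, (m - rank) * pvalues[idx])
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


# Hypothetical p-values for four tests.
example_adjusted = holm_adjust([0.01, 0.04, 0.03, 0.005])
significant = [p <= 0.05 for p in example_adjusted]
```

A test is declared significant when its adjusted p-value is at or below the overall threshold of 0.05.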

Results

Our study included a total of 97,918 participants with 45,247 cases and 52,671 controls (eTable 1). Heavy or no alcohol, smoking, diabetes, high BMI for both sexes, taller height among females, and high processed meat and red meat consumption were associated with higher risk of CRC (Table 2). Use of aspirin/NSAIDs, higher intakes of dietary and total calcium, fiber, dietary/total folate, vegetable, and fruit were associated with lower risk of CRC. Genetic risk score was associated with 1.5-fold higher risk of CRC per SD increase in genetic risk score (95% CI=1.5, 1.6). There was no evidence of correlation between genetic risk score and any of the environmental risk factors (eTable 3).

Table 2:

Comparison by colorectal cancer case-control status and association (odds ratio, 95% confidence interval) of genetic risk score, environmental factors, and risk of colorectal cancer

| Characteristic | Overall | Cases | Controls | OR (95% CI)^a |
|---|---|---|---|---|
| Genetic risk score, mean (std) | 9.3 (0.51) | 9.4 (0.50) | 9.2 (0.50) | 1.5 (1.5, 1.6) per 1 sd |
| **Risk factors** | | | | |
| Alcohol non-drinker (<1 g/day), N (%) | 29293 (40) | 13754 (43) | 15539 (37) | 1.2 (1.1, 1.2) |
| Alcohol 1–28 g/day, N (%) | 35635 (48) | 13979 (44) | 21656 (51) | 1 (Ref) |
| Alcohol >28 g/day, N (%) | 9162 (12) | 4141 (13) | 5021 (12) | 1.4 (1.3, 1.5) |
| Ever smoked, No, N (%) | 36947 (47) | 14975 (44) | 21972 (50) | 1 (Ref) |
| Ever smoked, Yes, N (%) | 41144 (53) | 18781 (56) | 22363 (50) | 1.2 (1.2, 1.3) |
| History of type II diabetes, No, N (%) | 67524 (91) | 27957 (89) | 39567 (92) | 1 (Ref) |
| History of type II diabetes, Yes, N (%) | 6854 (9) | 3572 (11) | 3282 (8) | 1.3 (1.3, 1.4) |
| BMI, Female, mean (std)^c | 27 kg/m2 (5.2) | 27 kg/m2 (5.4) | 27 kg/m2 (5.1) | 1.1 (1.1, 1.1) per 5 kg/m2 |
| BMI, Male, mean (std)^c | 27 kg/m2 (4.2) | 28 kg/m2 (4.3) | 27 kg/m2 (4.1) | 1.2 (1.2, 1.2) per 5 kg/m2 |
| Height, Female, mean (std)^c | 1.6 m (0.1) | 1.6 m (0.1) | 1.6 m (0.1) | 1.1 (1.0, 1.1) per 10 cm |
| Height, Male, mean (std)^c | 1.8 m (0.1) | 1.8 m (0.1) | 1.8 m (0.1) | 1.0 (1.0, 1.0) per 10 cm |
| Red meat, Quartile 1 (lowest), N (%)^b | 18665 (24) | 7482 (23) | 11183 (26) | 1 (Ref) |
| Red meat, Quartile 2, N (%) | 22204 (29) | 9243 (28) | 12961 (30) | 1.1 (1.1, 1.2) |
| Red meat, Quartile 3, N (%) | 19869 (26) | 8517 (26) | 11352 (26) | 1.2 (1.2, 1.3) |
| Red meat, Quartile 4 (highest), N (%) | 15580 (20) | 7645 (23) | 7935 (18) | 1.3 (1.3, 1.4) |
| Ptrend | | | | 6.65E-35 |
| Processed meat, Quartile 1 (lowest), N (%)^b | 12761 (18) | 5084 (17) | 7677 (19) | 1 (Ref) |
| Processed meat, Quartile 2, N (%) | 24160 (35) | 9947 (33) | 14213 (35) | 1.1 (1.0, 1.1) |
| Processed meat, Quartile 3, N (%) | 22993 (33) | 9941 (33) | 13052 (32) | 1.2 (1.2, 1.3) |
| Processed meat, Quartile 4 (highest), N (%) | 9998 (14) | 4762 (16) | 5236 (13) | 1.2 (1.1, 1.30) |
| Ptrend | | | | 5.73E-19 |
| **Protective factors** | | | | |
| Regular aspirin/NSAIDs use at reference time, No, N (%) | 45396 (62) | 20372 (66) | 25024 (60) | 1 (Ref) |
| Regular aspirin/NSAIDs use, Yes, N (%) | 27290 (38) | 10450 (34) | 16840 (40) | 0.76 (0.73, 0.79) |
| Vegetable, Quartile 1 (lowest), N (%)^b | 17528 (23) | 7393 (23) | 10135 (23) | 1 (Ref) |
| Vegetable, Quartile 2, N (%) | 23722 (31) | 11121 (34) | 12601 (29) | 0.93 (0.89, 0.97) |
| Vegetable, Quartile 3, N (%) | 18952 (25) | 7944 (24) | 11008 (25) | 0.84 (0.81, 0.88) |
| Vegetable, Quartile 4, N (%) | 15943 (21) | 6401 (19) | 9542 (22) | 0.83 (0.79, 0.87) |
| Ptrend | | | | 2.21E-18 |
| Fruit, Quartile 1 (lowest), N (%)^b | 19272 (25) | 8762 (27) | 10510 (24) | 1 (Ref) |
| Fruit, Quartile 2, N (%) | 23449 (31) | 10626 (32) | 12823 (30) | 0.85 (0.82, 0.89) |
| Fruit, Quartile 3, N (%) | 17733 (23) | 7322 (22) | 10411 (24) | 0.77 (0.73, 0.80) |
| Fruit, Quartile 4 (highest), N (%) | 15548 (20) | 6071 (19) | 9477 (22) | 0.74 (0.70, 0.77) |
| Ptrend | | | | 1.30E-43 |
| Fiber (g/day), Quartile 1 (lowest), N (%)^b | 12231 (25) | 6160 (26) | 6071 (23) | 1 (Ref) |
| Fiber, Quartile 2, N (%) | 12607 (25) | 6001 (26) | 6606 (25) | 0.84 (0.80, 0.89) |
| Fiber, Quartile 3, N (%) | 12404 (25) | 5637 (24) | 6767 (26) | 0.73 (0.69, 0.77) |
| Fiber, Quartile 4 (highest), N (%) | 12493 (25) | 5583 (24) | 6910 (26) | 0.65 (0.61, 0.69) |
| Ptrend | | | | 2.62E-47 |
| Total folate (mcg/day), Quartile 1 (lowest), N (%)^b | 14132 (24) | 6961 (25) | 7171 (23) | 1 (Ref) |
| Total folate, Quartile 2, N (%) | 15797 (27) | 7556 (27) | 8241 (27) | 0.86 (0.82, 0.90) |
| Total folate, Quartile 3, N (%) | 13566 (23) | 6443 (23) | 7123 (23) | 0.83 (0.79, 0.87) |
| Total folate, Quartile 4 (highest), N (%) | 14969 (26) | 6695 (24) | 8274 (27) | 0.75 (0.71, 0.79) |
| Ptrend | | | | 1.44E-24 |
| Dietary folate (mcg/day), Quartile 1 (lowest), N (%)^b | 12488 (24) | 5946 (25) | 6542 (24) | 1 (Ref) |
| Dietary folate, Quartile 2, N (%) | 12894 (25) | 5879 (25) | 7015 (25) | 0.88 (0.83, 0.92) |
| Dietary folate, Quartile 3, N (%) | 13157 (25) | 6010 (25) | 7147 (26) | 0.84 (0.80, 0.89) |
| Dietary folate, Quartile 4 (highest), N (%) | 13110 (25) | 5995 (25) | 7115 (26) | 0.79 (0.74, 0.84) |
| Ptrend | | | | 8.27E-13 |
| Total calcium (mg/day), Quartile 1 (lowest), N (%)^b | 14466 (19) | 7398 (22) | 7068 (16) | 1 (Ref) |
| Total calcium, Quartile 2, N (%) | 29355 (38) | 11534 (34) | 17821 (41) | 0.87 (0.83, 0.91) |
| Total calcium, Quartile 3, N (%) | 18873 (24) | 8216 (24) | 10657 (24) | 0.77 (0.73, 0.81) |
| Total calcium, Quartile 4, N (%) | 14861 (19) | 6583 (20) | 8278 (19) | 0.67 (0.64, 0.71) |
| Ptrend | | | | 1.11E-55 |
| Dietary calcium (mg/day), Quartile 1 (lowest), N (%)^b | 13860 (24) | 6993 (25) | 6867 (23) | 1 (Ref) |
| Dietary calcium, Quartile 2, N (%) | 14616 (26) | 7198 (26) | 7418 (25) | 0.91 (0.87, 0.96) |
| Dietary calcium, Quartile 3, N (%) | 14200 (25) | 6573 (24) | 7627 (26) | 0.77 (0.73, 0.81) |
| Dietary calcium, Quartile 4, N (%) | 14514 (25) | 6691 (24) | 7823 (26) | 0.72 (0.68, 0.76) |
| Ptrend | | | | 3.28E-35 |

^a Logistic regression adjusted for age, sex, study, the first three principal components of ancestry, and total energy for dietary variables only.

^b Quartiles are study- and sex-specific.

^c BMI data were complete for 36,415 cases and 48,439 controls. Height data were complete for 39,798 cases and 50,340 controls.

Interaction of environmental risk factors and GRS with CRC risk

Across all environmental risk factors assessed, there were no large magnitude multiplicative interactions with genetic risk on CRC risk (all OR estimates >0.95 and <1.05, Figure part A, right panel). In contrast, we observed large magnitude additive interaction effects for several environmental factors: heavy alcohol consumption, ever smoking status, BMI, and red meat (Figure part A, left panel, eTable 4). Specifically, the RERI (95% CI) for heavy alcohol consumption was 0.24 (0.13, 0.36); ever smoking, 0.11 (0.05, 0.16); for BMI among females 0.09 (0.05, 0.13) per 5 kg/m2 increase, among males 0.10 (0.05, 0.14) per 5 kg/m2 increase; and red meat intake, RERI Q4 (highest) versus Q1 (lowest) 0.18 (0.09, 0.27). The joint effect of each of these risk factors and high genetic risk score on CRC risk was higher than the expected sum of individual effects. We additionally observed additive interaction effects for alcohol non-drinkers and processed meat intake, where processed meat intake RERI had a relatively large estimate [RERI (95% CI) processed meat Q4 (highest) vs Q1 (lowest) = 0.15 (0.04, 0.26), Figure part A, left panel, eTable 4]. We conducted a sensitivity analysis using further adjusted models and results did not materially change (eTable 5). In the sensitivity analyses restricted to colon cancer only, results were similar and, in some cases, stronger than for CRC cases overall (eTable 6).

Figure. Estimate for additive and multiplicative interactions between genetic risk score and (a) risk factors for colorectal cancer, (b) protective factors for colorectal cancer.

Logistic regression adjusted for age, sex, study, first three principal components of ancestry, and total energy for dietary variables only.

Interaction of environmental protective factors and genetic risk score with CRC risk

Across all environmental protective factors assessed, there were no large magnitude multiplicative interactions with genetic risk score on CRC risk (all OR estimates >0.95 and <1.05, Figure part B, right panel). We observed large magnitude additive interaction for several protective factors: aspirin/NSAIDs, fruit, fiber, and total calcium (Figure part B, left panel, eTable 4). The RERIs (95% CI) for these protective factors were: use of aspirin or NSAIDs, −0.16 (−0.20, −0.11); fruit Q4 (highest) versus Q1 (lowest), −0.12 (−0.18, −0.05); fiber Q4 (highest) versus Q1 (lowest), −0.16 (−0.23, −0.09); and total calcium Q4 (highest) versus Q1 (lowest), −0.11 (−0.18, −0.05). On the additive scale, these environmental factors may be more protective against CRC among those with a high genetic risk score than among those with a lower genetic risk score. We additionally observed additive interaction for vegetable, total folate, dietary folate, and dietary calcium intake, as well as Q3 versus Q1 for total fiber and total calcium, where all RERI estimates ranged from −0.09 to −0.11 (Figure part B, left panel, eTable 4). We conducted a sensitivity analysis using further adjusted models and results did not materially change (eTable 5). In the colon-only sensitivity analyses, results were also similar and, in some cases, stronger than for CRC cases overall (eTable 6).

Discussion

In this large study from international colorectal cancer consortia, where we aimed to assess gene–environment interactions for subgroup identification, we observed additive interactions between several risk and protective environmental factors and elevated GRS on CRC risk. These results are consistent with the hypothesis that, for individuals with high genetic susceptibility, heavy drinking, ever smoking, high BMI, or high red meat intake confers excess CRC risk greater than that for individuals with average genetic susceptibility. In other words, the excess relative risk associated with genetic risk score and the respective environmental factor together is greater than the sum of the excess relative risks of genetic risk score and the respective environmental factor separately. Conversely, we estimate that those with high genetic susceptibility experience a greater reduction in CRC risk with aspirin or NSAID use or higher intake of fruit, fiber, or calcium than those with average genetic susceptibility. While environmental factors show a consistent direction of association regardless of genetic susceptibility, these results can be useful in identifying those at high genetic risk who may benefit more from environmental factor intervention than the general population, as well as in informing personalized healthcare decision-making.

Our findings are consistent with a previous study on additive interaction in colorectal cancer, which found that healthy lifestyle score and genetic risk score had an additive interaction where those with a high genetic risk score and an unhealthy lifestyle score had a relative excess risk due to interaction of 0.58 (95% CI 0.06, 1.10) compared to the sum of risk from either a high genetic risk score or an unhealthy lifestyle score alone.20 Other papers examining multiplicative interaction with CRC-related genetic risk score found, consistent with our findings, no interaction with smoking,12 NSAIDs,13 fine particulate matter,14 aspirin,16 red and processed meat intake,17 alcohol consumption,18 physical activity,19 or several environmental factors including height, BMI, smoking, alcohol, vegetable, fruit, processed meat, fiber, and others.15 It is worth noting that for GxE interaction studies focusing on single SNPs, there is little or no difference between additive and multiplicative interaction because single-SNP effect sizes are commonly weak.38 However, when GRS is used to capture overall genetic susceptibility, the effect size of the GRS is sizable and the difference between multiplicative interaction and RERI may be substantial, as shown in our study where we observed additive interactions but no multiplicative interactions for many environmental risk factors. This observation indicates the importance of assessing interaction on both additive and multiplicative scales, where the additive results, proportional to excess absolute risk, may be more useful to public health intervention.
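A short worked equation may help show why the genetic effect size matters. It is an illustrative sketch only: it sets the multiplicative interaction to exactly zero, uses the rounded Table 2 odds ratios for the genetic risk score per SD (1.5) and heavy alcohol (1.4), and assumes a hypothetical single-SNP odds ratio of 1.05.

```latex
\begin{aligned}
\mathrm{RERI}\big|_{\beta_3 = 0}
  &= e^{\beta_1 + \beta_2} - e^{\beta_1} - e^{\beta_2} + 1
   = \left(e^{\beta_1} - 1\right)\left(e^{\beta_2} - 1\right) \\
\text{GRS per SD } (e^{\beta_1} = 1.5),\ e^{\beta_2} = 1.4:\quad
  &\mathrm{RERI} = 0.5 \times 0.4 = 0.20 \\
\text{single SNP } (e^{\beta_1} = 1.05),\ e^{\beta_2} = 1.4:\quad
  &\mathrm{RERI} = 0.05 \times 0.4 = 0.02
\end{aligned}
```

Under these assumptions, an exposure with no multiplicative interaction still yields a non-trivial RERI when paired with the genetic risk score, but a near-zero RERI when paired with a weak single variant.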

Our study improves upon current findings by suggesting which environmental factors specifically have evidence of GxE interaction, as those factors may be better targets through which to intervene on the population level. Many environmental factor recommendations apply to the general population, regardless of genetic risk: less heavy alcohol consumption, no smoking, lower adiposity, less red meat, and more fiber, fruit, and calcium. However, knowing which subgroups may benefit more from healthier lifestyle factors may help inform personalized intervention and medical decision making. For example, NSAID use is inversely associated with CRC risk; however, it also carries risk from potential side effects. Identifying those who may benefit more from NSAIDs while accounting for side effects may help inform an individual’s risk and benefit decision making about the use of such drugs.

We conducted a sensitivity analysis using further adjusted models to verify that our additive interaction results were unlikely to be biased by confounding of environmental factors. Our results were broadly consistent between minimally adjusted and further adjusted models, suggesting that environmental confounding had a limited impact on our interaction findings. Under the assumptions of gene–environment independence, a rare outcome, and no interaction between unmeasured confounders and GRS, the multiplicative interaction term will be valid even in the presence of an unmeasured or incompletely adjusted environmental confounder.39 In our study, there was little evidence of dependence between GRS and environmental factors (eTable 3). However, the main effects of E may be biased by potential environmental confounding.39 Because we derive additive interaction post hoc from a multiplicative logistic regression model, the calculation of RERI uses both the main effects and the interaction term. Thus, additive interaction is more likely than multiplicative interaction to be affected by environmental confounding, even when the multiplicative interaction term itself remains valid. However, given similar results with further adjustment for all other environmental factors, confounding does not appear to play a substantial role.
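To make the dependence of RERI on the main effects concrete, the following is a minimal sketch of deriving RERI post hoc from logistic regression coefficients for a one-unit contrast in each exposure (for example, per SD of GRS). The delta-method confidence interval is an assumption for illustration; this section does not state how the intervals were computed. The function name, coefficients, and covariance matrix are all hypothetical.

```python
import math


def reri_from_logit(b_g, b_e, b_ge, cov=None, z=1.96):
    """RERI from log-odds coefficients: exp(b_g + b_e + b_ge) - exp(b_g) - exp(b_e) + 1.

    cov is an optional 3x3 covariance matrix ordered (b_g, b_e, b_ge).
    When given, a delta-method Wald interval is returned alongside the estimate.
    """
    or_g = math.exp(b_g)
    or_e = math.exp(b_e)
    or_ge = math.exp(b_g + b_e + b_ge)
    estimate = or_ge - or_g - or_e + 1.0
    if cov is None:
        return estimate, None
    # Partial derivatives of RERI with respect to (b_g, b_e, b_ge).
    grad = [or_ge - or_g, or_ge - or_e, or_ge]
    variance = sum(grad[i] * cov[i][j] * grad[j] for i in range(3) for j in range(3))
    se = math.sqrt(variance)
    return estimate, (estimate - z * se, estimate + z * se)


# Hypothetical coefficients with a null multiplicative interaction term.
b_g, b_ge = math.log(1.5), 0.0
cov = [[0.0004, 0.0, 0.0], [0.0, 0.0004, 0.0], [0.0, 0.0, 0.0004]]

reri_a, ci_a = reri_from_logit(b_g, math.log(1.3), b_ge, cov)
# Same interaction term, but a main effect of E shifted (e.g., by confounding).
reri_b, ci_b = reri_from_logit(b_g, math.log(1.4), b_ge, cov)

print(f"RERI with OR_E = 1.3: {reri_a:.3f}")
print(f"RERI with OR_E = 1.4: {reri_b:.3f}")
```

In this sketch the interaction coefficient is unchanged between the two calls, yet RERI moves with the main effect of E, which mirrors the argument that additive interaction is more exposed to environmental confounding than the multiplicative term.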

RERI is on the odds ratio scale and may be more generalizable to other populations than absolute risk. This is because absolute risk depends on baseline absolute risk, which can be challenging to estimate accurately due to population heterogeneity. On the other hand, because RERI is calculated from a logistic regression model that is often adjusted for additional covariates (e.g., age and sex) to account for confounding, interpreting the magnitude of RERI in this context requires caution. The excess risk or protective effect implied by RERI may vary across covariate values, even though the RERI itself remains constant. Nonetheless, the direction of RERI and its implications for public health relevance should remain consistent.8 It is arguable whether a logistic regression model is suitable for assessing additive interaction due to potential model misspecification.7 However, logistic regression is a well-tested approach for analyzing retrospectively collected case–control data. While caution should be taken, as is the case with any modeling, RERI derived post hoc from logistic regression may be meaningful, particularly for identifying high-risk subgroups for public health impact.

Our study has several strengths. We investigated gene–environment interactions on both the multiplicative and additive scales. Inclusion of additive interaction improves our ability to identify subgroups who may benefit the most from public health intervention.8 We also used a comprehensive genetic risk score for CRC based on known loci from prior studies, including 141 GWAS-identified variants, using a large sample size. Our study also has limitations, the chief of which is that it included only individuals of European ancestry, limiting the generalizability of our findings to other racial/ethnic groups. Further studies of GxE in diverse populations are needed to equitably identify relevant subgroups who may benefit the most from additional intervention. Of note, the effect size of the genetic risk score reflects the combined influence of multiple genetic variants and is expressed per standard deviation (SD). Interaction effects, whether additive or multiplicative, are therefore also reported per SD of the genetic risk score, providing a relative measure of the influence of genetic profile on disease risk. Genetic risk score distributions may vary across populations due to differential linkage disequilibrium and allele frequencies. As a result, the effect size of genetic risk score per SD may vary across populations. Although our findings in the European ancestry population likely hold qualitatively, it is important to assess the multiplicative and additive interactions in other racial and ethnic groups. Additionally, environmental factors were measured at a single time point and may not adequately capture their effects on CRC risk.
However, we used a standardized protocol to harmonize environmental factors, and our previous work based on this large pooled study, which included both retrospective and prospective studies, showed consistent associations of environmental factors with CRC risk,24,37,40 supporting the robustness of the results and the pooling of these studies. Some GWAS summary statistics used to create the genetic risk score overlap with GECCO studies,4 which may inflate the genetic risk score main effect. This is an example of the “winner’s curse” phenomenon, where the true effect sizes are overestimated due to selection bias from using the same data to identify variants that reach a certain significance threshold.33,34 Although the effect size estimates used to construct the genetic risk score were adjusted for winner’s curse,33 the main effect of the genetic risk score could still be overestimated because the same data were used; the (multiplicative) GxE interaction, however, may not be. Therefore, while it is possible that the magnitude of additive interaction is overestimated, it is unlikely that the conclusion is qualitatively wrong. As previously mentioned, when assessing interaction, if both exposures of interest have an effect on the outcome, interaction must be present on at least one scale.8 Given that we selected the genetic risk score and environmental factors based on their well-established associations with CRC, our finding of interaction effects is expected. However, the overall lack of multiplicative interaction is intriguing. This could be because of lower power on the multiplicative scale than on the additive scale.41 Nevertheless, we note that the estimates of multiplicative interaction were close to the null, and the 95% confidence intervals comfortably included the null. In contrast, the 95% confidence intervals for additive interaction generally excluded the null.
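The statement that two exposures with non-null effects must interact on at least one scale can be checked with a short sketch. It assumes the usual odds-ratio measures: RERI = OR11 − OR10 − OR01 + 1 for the additive scale and OR11 / (OR10 × OR01) for the multiplicative scale. The odds ratios and the function name are hypothetical and serve only as an illustration.

```python
def interaction_scales(or_g: float, or_e: float, or_ge: float):
    """Return (RERI, multiplicative interaction ratio) for a joint odds ratio."""
    additive = or_ge - or_g - or_e + 1.0
    multiplicative = or_ge / (or_g * or_e)
    return additive, multiplicative


# Hypothetical main effects, both different from the null value of 1.
OR_G, OR_E = 1.5, 1.3

# Case 1: joint effect is exactly additive, so RERI is zero.
reri_add, ratio_add = interaction_scales(OR_G, OR_E, OR_G + OR_E - 1.0)

# Case 2: joint effect is exactly multiplicative, so the ratio is one.
reri_mult, ratio_mult = interaction_scales(OR_G, OR_E, OR_G * OR_E)

print(f"additive joint effect:       RERI = {reri_add:.3f}, ratio = {ratio_add:.3f}")
print(f"multiplicative joint effect: RERI = {reri_mult:.3f}, ratio = {ratio_mult:.3f}")
```

Both measures are null at once only when OR10 + OR01 − 1 equals OR10 × OR01, that is, when (OR10 − 1)(OR01 − 1) = 0, which requires at least one exposure to have no effect.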

In conclusion, we identified additive interaction between multiple risk and protective environmental factors and genetic risk score on CRC risk in individuals of European descent. These findings may help inform which population subgroups may gain additional benefit from environmental factor intervention that uses genetic information in practice. Future work is needed on risk prediction and on intervention on modifiable risk factors among genetically susceptible high-risk subgroups.

Supplementary Material

eTables 1 and 2
appendix files (except excel files)

Acknowledgments:

ASTERISK: We are very grateful to Dr. Bruno Buecher without whom this project would not have existed. We also thank all those who agreed to participate in this study, including the patients and the healthy control persons, as well as all the physicians, technicians and students.

CCFR: The Colon CFR graciously thanks the generous contributions of their study participants, dedication of study staff, and the financial support from the U.S. National Cancer Institute, without which this important registry would not exist.

CLUE II: We thank the participants of Clue II and appreciate the continued efforts of the staff at the Johns Hopkins George W. Comstock Center for Public Health Research and Prevention in the conduct of the Clue II Cohort Study. Cancer data was provided by the Maryland Cancer Registry, Center for Cancer Prevention and Control, Maryland Department of Health, with funding from the State of Maryland and the Maryland Cigarette Restitution Fund. The collection and availability of cancer registry data is also supported by the Cooperative Agreement NU58DP006333, funded by the Centers for Disease Control and Prevention. Its contents are solely the responsibility of the authors and do not necessarily represent the official views of the Centers for Disease Control and Prevention or the Department of Health and Human Services.

CORSA: We kindly thank all individuals who agreed to participate in the CORSA study. Furthermore, we thank all cooperating physicians and students and the Biobank Graz of the Medical University of Graz.

CPS-II: The authors express sincere appreciation to all Cancer Prevention Study-II participants, and to each member of the study and biospecimen management group. The authors would like to acknowledge the contribution to this study from central cancer registries supported through the Centers for Disease Control and Prevention’s National Program of Cancer Registries and cancer registries supported by the National Cancer Institute’s Surveillance Epidemiology and End Results Program. The authors assume full responsibility for all analyses and interpretation of results. The views expressed here are those of the authors and do not necessarily represent the American Cancer Society or the American Cancer Society – Cancer Action Network.

Czech Republic CCS: We are thankful to all clinicians in major hospitals in the Czech Republic, without whom the study would not be practicable. We are also sincerely grateful to all patients participating in this study.

DACHS: We thank all participants and cooperating clinicians, and everyone who provided excellent technical assistance.

EDRN: We acknowledge all contributors to the development of the resource at University of Pittsburgh School of Medicine, Department of Gastroenterology, Department of Pathology, Hepatology and Nutrition and Biomedical Informatics.

EPIC: Where authors are identified as personnel of the International Agency for Research on Cancer/World Health Organization, the authors alone are responsible for the views expressed in this article and they do not necessarily represent the decisions, policy or views of the International Agency for Research on Cancer/World Health Organization.

EPICOLON: We are sincerely grateful to all patients participating in this study who were recruited as part of the EPICOLON project. We acknowledge the Spanish National DNA Bank, Biobank of Hospital Clínic–IDIBAPS and Biobanco Vasco for the availability of the samples. The work was carried out (in part) at the Esther Koplowitz Centre, Barcelona.

Harvard cohorts: The study protocol was approved by the institutional review boards of the Brigham and Women’s Hospital and Harvard T.H. Chan School of Public Health, and those of participating registries as required. We acknowledge Channing Division of Network Medicine, Department of Medicine, Brigham and Women’s Hospital as home of the NHS. The authors would like to acknowledge the contribution to this study from central cancer registries supported through the Centers for Disease Control and Prevention’s National Program of Cancer Registries (NPCR) and/or the National Cancer Institute’s Surveillance, Epidemiology, and End Results (SEER) Program. Central registries may also be supported by state agencies, universities, and cancer centers. Participating central cancer registries include the following: Alabama, Alaska, Arizona, Arkansas, California, Colorado, Connecticut, Delaware, Florida, Georgia, Hawaii, Idaho, Indiana, Iowa, Kentucky, Louisiana, Massachusetts, Maine, Maryland, Michigan, Mississippi, Montana, Nebraska, Nevada, New Hampshire, New Jersey, New Mexico, New York, North Carolina, North Dakota, Ohio, Oklahoma, Oregon, Pennsylvania, Puerto Rico, Rhode Island, Seattle SEER Registry, South Carolina, Tennessee, Texas, Utah, Virginia, West Virginia, Wyoming. The authors assume full responsibility for analyses and interpretation of these data.

Kentucky: We would like to acknowledge the staff at the Kentucky Cancer Registry.

LCCS: We acknowledge the contributions of Jennifer Barrett, Robin Waxman, Gillian Smith and Emma Northwood in conducting this study.

NCCCS I & II: We would like to thank the study participants, and the NC Colorectal Cancer Study staff.

PLCO: The authors thank the PLCO Cancer Screening Trial screening center investigators and the staff from Information Management Services Inc and Westat Inc. Most importantly, we thank the study participants for their contributions that made this study possible.

SEARCH: We thank the SEARCH team.

SELECT: We thank the research and clinical staff at the sites that participated in the SELECT study, without whom the trial would not have been successful. We are also grateful to the 35,533 dedicated men who participated in SELECT.

WHI: The authors thank the WHI investigators and staff for their dedication, and the study participants for making the program possible. A full listing of WHI investigators can be found at: http://www.whi.org/researchers/Documents%20%20Write%20a%20Paper/WHI%20Investigator%20Short%20List.pdf

Funding:

Genetics and Epidemiology of Colorectal Cancer Consortium (GECCO): National Cancer Institute, National Institutes of Health, U.S. Department of Health and Human Services (U01 CA137088, R01 CA059045, U01 CA164930, R21 CA191312, R01 CA244588, R01 206279, R01 201407, R01 CA488857, P20 CA252733). Genotyping/Sequencing services were provided by the Center for Inherited Disease Research (CIDR) contract numbers HHSN268201700006I and HHSN268201200008I. This research was funded in part through the NIH/NCI Cancer Center Support Grant P30 CA015704. Scientific Computing Infrastructure at Fred Hutch is funded by ORIP grant S10OD028685. C.E. Thomas is supported by L70 CA284301, T32 CA094880, and T32 CA009168. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Cancer Institute or the National Institutes of Health.

ASTERISK: a Hospital Clinical Research Program (PHRC-BRD09/C) from the University Hospital Center of Nantes (CHU de Nantes) and supported by the Regional Council of Pays de la Loire, the Groupement des Entreprises Françaises dans la Lutte contre le Cancer (GEFLUC), the Association Anne de Bretagne Génétique and the Ligue Régionale Contre le Cancer (LRCC).

The ATBC Study is supported by the Intramural Research Program of the U.S. National Cancer Institute, National Institutes of Health, Department of Health and Human Services.

BWHS: National Institutes of Health U01 CA164974

The Colon Cancer Family Registry (CCFR, www.coloncfr.org) is supported in part by funding from the National Cancer Institute (NCI), National Institutes of Health (NIH) (award U01 CA167551). Support for case ascertainment was provided in part from the Surveillance, Epidemiology, and End Results (SEER) Program and the following U.S. state cancer registries: AZ, CO, MN, NC, NH; and by the Victoria Cancer Registry (Australia) and Ontario Cancer Registry (Canada). The CCFR Set-1 (Illumina 1M/1M-Duo) was supported by NIH awards U01 CA122839 and R01 CA143237 (to GC). The CCFR Set-3 (Affymetrix Axiom CORECT Set array) was supported by NIH award U19 CA148107 and R01 CA81488 (to SBG). The CCFR Set-4 (Illumina OncoArray 600K SNP array) was supported by NIH award U19 CA148107 (to SBG) and by the Center for Inherited Disease Research (CIDR), which is funded by the NIH to the Johns Hopkins University, contract number HHSN268201200008I. The content of this manuscript does not necessarily reflect the views or policies of the NCI, NIH or any of the collaborating centers in the Colon Cancer Family Registry (CCFR), nor does mention of trade names, commercial products, or organizations imply endorsement by the US Government, any cancer registry, or the CCFR.

CLUE II funding was from the National Cancer Institute (U01 CA086308, Early Detection Research Network; P30 CA006973), National Institute on Aging (U01 AG018033), and the American Institute for Cancer Research. The content of this publication does not necessarily reflect the views or policies of the Department of Health and Human Services, nor does mention of trade names, commercial products, or organizations imply endorsement by the US government.

COLO2&3: National Institutes of Health (R01 CA060987).

ColoCare: This work was supported by the National Institutes of Health (grant numbers R01 CA189184 (Li/Ulrich), U01 CA206110 (Ulrich/Li/Siegel/Figueiredo/Colditz), 2P30CA015704-40 (Gilliland), R01 CA207371 (Ulrich/Li)), the Matthias Lackas-Foundation, the German Consortium for Translational Cancer Research, and the EU TRANSCAN initiative.

CORSA: The CORSA study was funded by Austrian Research Funding Agency (FFG) BRIDGE (grant 829675, to Andrea Gsur), the “Herzfelder’sche Familienstiftung” (grant to Andrea Gsur) and was supported by COST Action BM1206.

CPS-II: The American Cancer Society funds the creation, maintenance, and updating of the Cancer Prevention Study-II (CPS-II) cohort. The study protocol was approved by the institutional review boards of Emory University, and those of participating registries as required.

CRCGEN: Colorectal Cancer Genetics & Genomics, Spanish study was supported by Instituto de Salud Carlos III, co-funded by FEDER funds –a way to build Europe– (grants PI14-613 and PI09-1286), Agency for Management of University and Research Grants (AGAUR) of the Catalan Government (grant 2017SGR723), Junta de Castilla y León (grant LE22A10-2), the Spanish Association Against Cancer (AECC) Scientific Foundation grant GCTRA18022MORE and the Consortium for Biomedical Research in Epidemiology and Public Health (CIBERESP), action Genrisk. Sample collection of this work was supported by the Xarxa de Bancs de Tumors de Catalunya sponsored by Pla Director d’Oncología de Catalunya (XBTC), Plataforma Biobancos PT13/0010/0013 and ICOBIOBANC, sponsored by the Catalan Institute of Oncology. We thank CERCA Programme, Generalitat de Catalunya for institutional support.

Czech Republic CCS: This work was supported by the Czech Science Foundation (21-04607X, 21-27902S), by the Grant Agency of the Ministry of Health of the Czech Republic (grants AZV NU21-07-00247 and AZV NU21-03-00145), and Charles University Research Fund (Cooperation 43-Surgical disciplines).

DACHS: This work was supported by the German Research Council (BR 1704/6-1, BR 1704/6-3, BR 1704/6-4, CH 117/1-1, HO 5117/2-1, HE 5998/2-1, KL 2354/3-1, RO 2270/8-1 and BR 1704/17-1), the Interdisciplinary Research Program of the National Center for Tumor Diseases (NCT), Germany, and the German Federal Ministry of Education and Research (01KH0404, 01ER0814, 01ER0815, 01ER1505A and 01ER1505B).

DALS: National Institutes of Health (R01 CA048998 to M. L. Slattery).

EDRN: This work is funded and supported by the NCI, EDRN Grant (U01-CA152753).

EPIC: The coordination of EPIC is financially supported by International Agency for Research on Cancer (IARC) and also by the Department of Epidemiology and Biostatistics, School of Public Health, Imperial College London which has additional infrastructure support provided by the NIHR Imperial Biomedical Research Centre (BRC). The national cohorts are supported by: Danish Cancer Society (Denmark); Ligue Contre le Cancer, Institut Gustave Roussy, Mutuelle Générale de l’Education Nationale, Institut National de la Santé et de la Recherche Médicale (INSERM) (France); German Cancer Aid, German Cancer Research Center (DKFZ), German Institute of Human Nutrition Potsdam-Rehbruecke (DIfE), Federal Ministry of Education and Research (BMBF) (Germany); Associazione Italiana per la Ricerca sul Cancro-AIRC-Italy, Compagnia di SanPaolo and National Research Council (Italy); Dutch Ministry of Public Health, Welfare and Sports (VWS), Netherlands Cancer Registry (NKR), LK Research Funds, Dutch Prevention Funds, Dutch ZON (Zorg Onderzoek Nederland), World Cancer Research Fund (WCRF), Statistics Netherlands (The Netherlands); Health Research Fund (FIS) - Instituto de Salud Carlos III (ISCIII), Regional Governments of Andalucía, Asturias, Basque Country, Murcia and Navarra, and the Catalan Institute of Oncology - ICO (Spain); Swedish Cancer Society, Swedish Research Council and Region Skåne and Region Västerbotten (Sweden); Cancer Research UK (14136 to EPIC-Norfolk; C8221/A29017 to EPIC-Oxford), Medical Research Council (1000143 to EPIC-Norfolk; MR/M012190/1 to EPIC-Oxford) (United Kingdom).

EPICOLON: This work was supported by grants from Fondo de Investigación Sanitaria/FEDER (PI08/0024, PI08/1276, PS09/02368, P111/00219, PI11/00681, PI14/00173, PI14/00230, PI17/00509, 17/00878, PI20/00113, PI20/00226, Acción Transversal de Cáncer), Xunta de Galicia (PGIDIT07PXIB9101209PR), Ministerio de Economia y Competitividad (SAF07-64873, SAF 2010-19273, SAF2014-54453R), Fundación Científica de la Asociación Española contra el Cáncer (GCB13131592CAST), Beca Grupo de Trabajo “Oncología” AEG (Asociación Española de Gastroenterología), Fundación Privada Olga Torres, FP7 CHIBCHA Consortium, Agència de Gestió d’Ajuts Universitaris i de Recerca (AGAUR, Generalitat de Catalunya, 2014SGR135, 2014SGR255, 2017SGR21, 2017SGR653), Catalan Tumour Bank Network (Pla Director d’Oncologia, Generalitat de Catalunya), PERIS (SLT002/16/00398, Generalitat de Catalunya), CERCA Programme (Generalitat de Catalunya) and COST Action BM1206 and CA17118. CIBERehd is funded by the Instituto de Salud Carlos III.

ESTHER/VERDI: This work was supported by grants from the Baden-Württemberg Ministry of Science, Research and Arts and the German Cancer Aid.

Hawaii Adenoma Study: NCI grants R01 CA072520.

Harvard cohorts: HPFS is supported by the National Institutes of Health (P01 CA055075, UM1 CA167552, U01 CA167552, R01 CA137178, R01 CA151993, and R35 CA197735), NHS by the National Institutes of Health (P01 CA087969, UM1 CA186107, R01 CA137178, R01 CA151993, and R35 CA197735), and PHS by the National Institutes of Health (R01 CA042182).

Kentucky: This work was supported by the following grant support: Clinical Investigator Award from Damon Runyon Cancer Research Foundation (CI-8); NCI R01CA136726.

LCCS: The Leeds Colorectal Cancer Study was funded by the Food Standards Agency and Cancer Research UK Programme Award (C588/A19167).

MCCS cohort recruitment was funded by VicHealth and Cancer Council Victoria. The MCCS was further supported by Australian NHMRC grants 509348, 209057, 251553 and 504711 and by infrastructure provided by Cancer Council Victoria. Cases and their vital status were ascertained through the Victorian Cancer Registry (VCR) and the Australian Institute of Health and Welfare (AIHW), including the National Death Index and the Australian Cancer Database. BMLynch was supported by MCRF18005 from the Victorian Cancer Agency.

MEC: National Institutes of Health (R37 CA054281, P01 CA033619, and R01 CA063464).

MECC: This work was supported by the National Institutes of Health, U.S. Department of Health and Human Services (R01 CA081488, R01 CA197350, U19 CA148107, R01 CA242218) and a generous gift from Daniel and Maryann Fong.

NCCCS I & II: We acknowledge funding support for this project from the National Institutes of Health, R01 CA066635 and P30 DK034987.

NFCCR: This work was supported by an Interdisciplinary Health Research Team award from the Canadian Institutes of Health Research (CRT 43821); the National Institutes of Health, U.S. Department of Health and Human Services (U01 CA074783); and National Cancer Institute of Canada grants (18223 and 18226). The authors wish to acknowledge the contribution of Alexandre Belisle and the genotyping team of the McGill University and Génome Québec Innovation Centre, Montréal, Canada, for genotyping the Sequenom panel in the NFCCR samples. Funding was provided to Michael O. Woods by the Canadian Cancer Society Research Institute.

PLCO: This work was supported by the Intramural Research Program of the Division of Cancer Epidemiology and Genetics and by contracts from the Division of Cancer Prevention, National Cancer Institute, NIH, DHHS. Funding was provided by National Institutes of Health (NIH), Genes, Environment and Health Initiative (GEI) Z01 CP 010200, NIH U01 HG004446, and NIH GEI U01 HG 004438.

REACH & SMS: This work was supported by the National Cancer Institute (grant P01 CA074184 to J.D.P. and P.A.N., grants R01 CA097325, R03 CA153323, and K05 CA152715 to P.A.N.) and the National Center for Advancing Translational Sciences at the National Institutes of Health (grant KL2 TR000421 to A.N.B.-H.).

SEARCH: The University of Cambridge has received salary support in respect of PDPP from the NHS in the East of England through the Clinical Academic Reserve. Cancer Research UK (C490/A16561); the UK National Institute for Health Research Biomedical Research Centres at the University of Cambridge.

SELECT: Research reported in this publication was supported in part by the National Cancer Institute of the National Institutes of Health under Award Numbers U10 CA037429 (CD Blanke), and UM1 CA182883 (CM Tangen/IM Thompson). The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.

Swedish Mammography Cohort and Cohort of Swedish Men: This work is supported by the Swedish Research Council/Infrastructure grant, the Swedish Cancer Foundation, and the Karolinska Institutet’s Distinguished Professor Award to Alicja Wolk.

UK Biobank: This research has been conducted using the UK Biobank Resource under Application Number 8614

VITAL: National Institutes of Health (K05 CA154337).

WHI: The WHI program is funded by the National Heart, Lung, and Blood Institute, National Institutes of Health, U.S. Department of Health and Human Services through contracts HHSN268201100046C, HHSN268201100001C, HHSN268201100002C, HHSN268201100003C, HHSN268201100004C, and HHSN271201100004C.

Footnotes

Conflict of Interest:

SB Gruber: Brogent International LLC, co-founder, not related to submitted work.

JW Baurley: JWB is co-founder and employee of BioRealm LLC. BioRealm LLC offers data analysis services, unrelated to this study.

M Giannakis: Research funding from Servier and Janssen, unrelated to this study.

All other authors do not have any conflicts of interest to disclose.

Disclaimer: Where authors are identified as personnel of the International Agency for Research on Cancer/World Health Organization, the authors alone are responsible for the views expressed in this article and they do not necessarily represent the decisions, policy or views of the International Agency for Research on Cancer/World Health Organization.

Data & code availability statement:

Except for UK Biobank and CCFR 1 and CCFR 2, individual-level data are deposited in dbGaP (accession nos. phs001415.v1.p1, phs001315.v1.p1, phs001078.v1.p1, phs001903.v1.p1, phs001856.v1.p1 and phs001045.v1.p1). UK Biobank data are available through http://www.ukbiobank.ac.uk. CCFR 1 and CCFR 2 data can be requested by submitting an application for collaboration to the CCFR (forms, instructions and contact information can be located at www.coloncfr.org/collaboration). Code is available from the corresponding author upon request.

References

  • 1. Sung H, Ferlay J, Siegel RL, et al. Global Cancer Statistics 2020: GLOBOCAN Estimates of Incidence and Mortality Worldwide for 36 Cancers in 185 Countries. CA Cancer J Clin. 2021;71(3):209–249.
  • 2. Lichtenstein P, Holm NV, Verkasalo PK, et al. Environmental and Heritable Factors in the Causation of Cancer — Analyses of Cohorts of Twins from Sweden, Denmark, and Finland. N Engl J Med. 2000;343(2):78–85.
  • 3. Mucci LA, Hjelmborg JB, Harris JR, et al. Familial Risk and Heritability of Cancer Among Twins in Nordic Countries. JAMA. 2016;315(1):68–76.
  • 4. Huyghe JR, Bien SA, Harrison TA, et al. Discovery of common and rare genetic risk variants for colorectal cancer. Nat Genet. 2019;51(1):76–87.
  • 5. Fernandez-Rozadilla C, Timofeeva M, Chen Z, et al. Deciphering colorectal cancer genetics through multi-omic analysis of 100,204 cases and 154,587 controls of European and east Asian ancestries. Nat Genet. 2023;55(1):89–99.
  • 6. McAllister K, Mechanic LE, Amos C, et al. Current Challenges and New Opportunities for Gene–environment Interaction Studies of Complex Diseases. Am J Epidemiol. 2017;186(7):753–761.
  • 7. Skrondal A. Interaction as departure from additivity in case–control studies: a cautionary note. Am J Epidemiol. 2003;158(3):251–258.
  • 8. Lash TL, VanderWeele TJ, Haneuse S, Rothman KJ. Modern Epidemiology. Fourth edition. LWW; 2021.
  • 9. Greenland S, Lash TL, and Rothman KJ (2008). “Concepts of interaction,” chapter 5. In: Modern Epidemiology, Rothman KJ, Greenland S, and Lash TL (Eds.). 3rd Edition. Philadelphia, PA: Lippincott Williams and Wilkins.
  • 10. VanderWeele TJ, Knol MJ. A Tutorial on Interaction. Epidemiol Methods. 2014;3(1):33–72.
  • 11. Knol MJ, VanderWeele TJ. Recommendations for presenting analyses of effect modification and interaction. Int J Epidemiol. 2012;41(2):514–520.
  • 12. Chen X, Jansen L, Guo F, Hoffmeister M, Chang-Claude J, Brenner H. Smoking, Genetic Predisposition, and Colorectal Cancer Risk. Clin Transl Gastroenterol. 2021;12(3):e00317.
  • 13. Chen X, Guo F, Hoffmeister M, Chang-Claude J, Brenner H. Non-steroidal anti-inflammatory drugs, polygenic risk score and colorectal cancer risk. Aliment Pharmacol Ther. 2021;54(2):167–175.
  • 14. Chu H, Xin J, Yuan Q, et al. A prospective study of the associations among fine particulate matter, genetic variants, and the risk of colorectal cancer. Environ Int. 2021;147:106309.
  • 15. Yang T, Li X, Farrington SM, et al. A Systematic Analysis of Interactions between Environmental Risk Factors and Genetic Variation in Susceptibility to Colorectal Cancer. Cancer Epidemiol Biomarkers Prev. 2020;29(6):1145–1153.
  • 16. Bakshi A, Cao Y, Orchard SG, et al. Aspirin and the risk of colorectal cancer according to genetic susceptibility among older individuals. Cancer Prev Res (Phila). Published online March 29, 2022:canprevres.0011.2022.
  • 17. Chen X, Hoffmeister M, Brenner H. Red and Processed Meat Intake, Polygenic Risk Score, and Colorectal Cancer Risk. Nutrients. 2022;14(5):1077.
  • 18. Chen X, Li H, Guo F, Hoffmeister M, Brenner H. Alcohol consumption, polygenic risk score, and early- and late-onset colorectal cancer risk. EClinicalMedicine. 2022;49:101460.
  • 19. Chen X, Guo F, Chang-Claude J, Hoffmeister M, Brenner H. Physical activity, polygenic risk score, and colorectal cancer risk. Cancer Med. Published online July 26, 2022.
  • 20. Choi J, Jia G, Wen W, Shu XO, Zheng W. Healthy lifestyles, genetic modifiers, and colorectal cancer risk: a prospective cohort study in the UK Biobank. Am J Clin Nutr. 2021;113(4):810–820.
  • 21. Schmit SL, Edlund CK, Schumacher FR, et al. Novel Common Genetic Susceptibility Loci for Colorectal Cancer. J Natl Cancer Inst. 2019;111(2):146–157.
  • 22. Schumacher FR, Schmit SL, Jiao S, et al. Genome-wide association study of colorectal cancer identifies six new susceptibility loci. Nat Commun. 2015;6:7138.
  • 23. Hutter CM, Chang-Claude J, Slattery ML, et al. Characterization of Gene–environment interactions for colorectal cancer susceptibility loci. Cancer Res. 2012;72(8):2036–2044.
  • 24. McNabb S, Harrison TA, Albanes D, et al. Meta-analysis of 16 studies of the association of alcohol with colorectal cancer. Int J Cancer. 2020;146(3):861–873.
  • 25. Xia Z, Su Y, Petersen P, et al. Functional informed genome-wide interaction analysis of body mass index, diabetes and colorectal cancer risk. Cancer Med. 2020;9(10):3563–3573.
  • 26. Gong J, Hutter CM, Newcomb PA, et al. Genome-Wide Interaction Analyses between Genetic Variants and Alcohol Consumption and Smoking for Risk of Colorectal Cancer. PLoS Genet. 2016;12(10):e1006296.
  • 27. McCarthy S, Das S, Kretzschmar W, et al. A reference panel of 64,976 haplotypes for genotype imputation. Nat Genet. 2016;48(10):1279–1283.
  • 28. Das S, Forer L, Schönherr S, et al. Next-generation genotype imputation service and methods. Nat Genet. 2016;48(10):1284–1287.
  • 29. Morrison John (2020). BinaryDosage: Creates, Merges, and Reads Binary Dosage Files. R package version 1.0.0. https://CRAN.R-project.org/package=BinaryDosage.
  • 30. Archambault AN, Jeon J, Lin Y, et al. Risk Stratification for Early-Onset Colorectal Cancer Using a Combination of Genetic and Environmental Risk Scores: An International Multi-Center Study. J Natl Cancer Inst. 2022;114(4):528–539.
  • 31. Law PJ, Timofeeva M, Fernandez-Rozadilla C, et al. Association analyses identify 31 new risk loci for colorectal cancer susceptibility. Nat Commun. 2019;10(1):2154.
  • 32. Lu Y, Kweon SS, Tanikawa C, et al. Large-Scale Genome-Wide Association Study of East Asians Identifies Loci Associated With Risk for Colorectal Cancer. Gastroenterology. 2019;156(5):1455–1466.
  • 33.Zhong H, Prentice RL. Bias-reduced estimators and confidence intervals for odds ratios in genome-wide association studies. Biostat Oxf Engl. 2008;9(4):621–634. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 34.Capen EC, Clapp RV, Campbell WM. Competitive Bidding in High-Risk Situations. J Pet Technol. 1971;23(06):641–653. [Google Scholar]
  • 35.Hosmer DW, Lemeshow S. Confidence interval estimation of interaction. Epidemiol Camb Mass. 1992;3(5):452–456. [DOI] [PubMed] [Google Scholar]
  • 36.Knol MJ, VanderWeele TJ, Groenwold RHH, Klungel OH, Rovers MM, Grobbee DE. Estimating measures of interaction on an additive scale for preventive exposures. Eur J Epidemiol. 2011;26(6):433–438. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 37.Wang X, O’Connell K, Jeon J, et al. Combined effect of modifiable and non-modifiable risk factors for colorectal cancer risk in a pooled analysis of 11 population-based studies. BMJ Open Gastroenterol. 2019;6(1):e000339. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 38.Gauderman WJ, Mukherjee B, Aschard H, et al. Update on the State of the Science for Analytical Methods for Gene–environment Interactions. Am J Epidemiol. 2017;186(7):762–770. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 39.VanderWeele TJ, Ko YA, Mukherjee B. Environmental Confounding in Gene–environment Interaction Studies. Am J Epidemiol. 2013;178(1):144–152. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 40.Jeon J, Du M, Schoen RE, et al. Determining Risk of Colorectal Cancer and Starting Age of Screening Based on Lifestyle, Environmental, and Genetic Factors. Gastroenterology. 2018;154(8):2152–2164.e19. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 41.Poole C, Shrier I, VanderWeele TJ. Is the Risk Difference Really a More Heterogeneous Measure? Epidemiol Camb Mass. 2015;26(5):714–718. [DOI] [PubMed] [Google Scholar]

Associated Data

Supplementary Materials

eTables 1 and 2
Appendix files (except Excel files)

Data Availability Statement

Except for UK Biobank, CCFR 1, and CCFR 2, individual-level data are deposited in dbGaP (accession nos. phs001415.v1.p1, phs001315.v1.p1, phs001078.v1.p1, phs001903.v1.p1, phs001856.v1.p1, and phs001045.v1.p1). UK Biobank data are available through http://www.ukbiobank.ac.uk. CCFR 1 and CCFR 2 data can be requested by submitting an application for collaboration to the CCFR (forms, instructions, and contact information are available at www.coloncfr/collaboration.org). Code is available from the corresponding author upon request.
