Abstract
Background:
The genotoxin colibactin causes a tumor single base substitution (SBS) mutational signature, SBS88. It is unknown whether the associations of epidemiologic factors with colorectal cancer (CRC) risk and survival differ by SBS88.
Methods:
Within the Genetic Epidemiology of Colorectal Cancer Consortium and Colon Cancer Family Registry, we measured SBS88 in 4,308 microsatellite stable/microsatellite instability low tumors. Associations of epidemiologic factors with CRC risk by SBS88 were assessed using multinomial regression (N=4,308 cases, 14,192 controls; cohort-only cases N=1,911), and with CRC-specific survival using Cox proportional hazards regression (N=3,465 cases).
Results:
392 (9%) tumors were SBS88 positive. Among all cases, the highest quartile of fruit intake was associated with lower risk of SBS88-positive CRC than SBS88-negative CRC [odds ratio (OR) = 0.53, 95% confidence interval (CI) 0.37, 0.76; OR = 0.75, 95% CI 0.66, 0.85, respectively, Pheterogeneity = 0.047]. Among cohort studies, associations of BMI, alcohol, and fruit intake with CRC risk differed by SBS88. BMI ≥30 kg/m2 was associated with worse CRC-specific survival among those SBS88-positive [hazard ratio (HR) = 3.40, 95% CI 1.47, 7.84], but not among those SBS88-negative (HR = 0.97, 95% CI 0.78, 1.21, Pheterogeneity = 0.066).
Conclusions:
Associations of most epidemiologic factors with CRC risk or survival did not differ by SBS88. Higher BMI may be associated with worse CRC-specific survival among those SBS88-positive; however, validation is needed in samples with whole-genome or exome sequencing available.
Keywords: epidemiologic factor, risk factor, SBS88, colibactin, colorectal cancer, tumor phenotype, microbiome, lifestyle
Introduction
Colorectal cancer (CRC) is a critical public health problem that ranks as the third most commonly diagnosed cancer and second leading cause of cancer death globally.1 One emerging area of research in relation to CRC is the gut microbiome. Different bacterial phyla of the gut microbiota may impact carcinogenesis and progression through inflammation, DNA damage, and metabolite production.2 An important bacterial species that has been previously studied in relation to CRC etiology is Escherichia coli (E. coli),3 which may be directly genotoxic through production of colibactin.4
Colibactin is a genotoxin that causes inter-strand cross links5 and double-strand breaks6 in DNA. Colibactin is produced by E. coli and other enterobacteria carrying the pks island (i.e., pks+ E. coli).7,8 These pks+ E. coli have previously been found to be present in colon tissue of about 20% of healthy individuals,9,10 40% of patients with inflammatory bowel disease,10 and 60% of patients with familial adenomatous polyposis or CRC.9–12 The DNA damage caused by colibactin is identifiable through a tumor mutational signature that aggregates somatic DNA mutations, published by COSMIC as SBS88, which is characterized by specific single base-pair substitutions (https://cancer.sanger.ac.uk/cosmic/signatures).
Given that CRC is a genetically and molecularly heterogeneous disease, it is plausible that associations of certain lifestyle factors with CRC risk could vary by tumor phenotype. For example, previous work has found that associations of lifestyle factors, such as diet and smoking, with CRC risk vary by tumor molecular subtype.13,14 Regarding pks+ E. coli and colibactin specifically, one previous study found a stronger association of western diet score with risk of CRC containing high levels of pks+ E. coli in tumor tissue than with risk of CRC containing low or no pks+ E. coli.15 We hypothesized that other established epidemiologic factors, such as smoking, body mass index (BMI), history of type II diabetes, aspirin and non-aspirin NSAID use, menopausal hormone therapy use, and individual dietary variables, which may be related to compositional and functional alterations of the gut microbiome,16 would be more strongly associated with risk of CRC exhibiting a colibactin signature (i.e., SBS88-positive) than with risk of CRC not exhibiting a colibactin signature (i.e., SBS88-negative). Previously, our group found that the SBS88-positive signature was associated with tumors in the distal colon and rectum compared to the proximal colon, and that many somatic mutations associated with SBS88 positivity were also associated with colibactin-induced DNA damage, most prominently APC:c.835–8A>G (medRxiv 2023.03.10.23287127). Notably, the APC:c.835–8A>G mutation was previously associated with the colibactin signature and the distal colon in patients with unexplained colorectal polyposis17 and was found in precancerous polyps matching the colibactin signature.18 We also previously found that SBS88 positivity was associated with better CRC-specific survival (medRxiv 2023.03.10.23287127), and thus hypothesized that dysbiosis-related factors may have a differential association with survival for SBS88-positive CRC compared to SBS88-negative CRC.
The aim of the present analysis was to “open the black box” between the risk factors outlined above and CRC to understand which risk factors affect CRC through a pks+ E. coli-mediated pathway.
Methods
Study Population, Targeted Sequencing and Signature Development
Targeted sequencing was performed on 6,111 CRC cases within the Genetic Epidemiology of Colorectal Cancer Consortium and Colon Cancer Family Registry (GECCO-CCFR). GECCO is an international collaboration including CRC cases from 70 studies from North America, Australia, Asia, and Europe.19,20 CCFR is a consortium supported by the National Cancer Institute that consists of six centers dedicated to the establishment of a comprehensive collaborative infrastructure for interdisciplinary studies in the genetic epidemiology of colorectal cancer.21 All participants gave written informed consent. Studies were approved by their respective Institutional Review Boards and were conducted in accordance with the Declaration of Helsinki.
Tumors from 2,542 CRC cases were sequenced with a 1.34 megabase (Mb) targeted panel covering 205 genes,22 and tumors from 3,569 cases were sequenced with an expanded 1.96 Mb targeted panel covering 298 genes. DNA was extracted from formalin-fixed paraffin-embedded (FFPE) CRC tissue and macrodissected. Matching normal DNA was primarily extracted from either adjacent normal colonic FFPE tissue or peripheral blood. Tumor mutational signatures were calculated using the simulated annealing method implemented by the R package SignatureEstimation.23 The original set of 78 COSMIC single-base substitution (SBS) signatures (build GRCh37, version 3.2; https://cancer.sanger.ac.uk/cosmic/signatures, downloadable from https://cancer.sanger.ac.uk/signatures/documents/452/COSMIC_v3.2_SBS_GRCh37.txt) was limited to 18 signatures previously observed in CRC specifically,24 including the colibactin-induced tumor mutational signature, SBS88 (medRxiv 2023.03.10.23287127). SBS88 (defined at https://cancer.sanger.ac.uk/signatures/sbs/sbs88/) is characterized primarily by TT>NT mutations. Because tumors with low mutational burden are not captured well enough by targeted sequencing panels to support inferences about mutational signatures,25 we restricted our sample to participants with CRC tumors with at least five somatic single nucleotide variants (SNVs) to ensure accurate signature decomposition,25 resulting in the inclusion of 5,292 CRC tumors. The distribution of the number of SNVs was right-skewed, with a median of 10 SNVs (interquartile range 7–17) and a maximum of 1,117.
SBS88-positive tumor mutational signature, or evidence of pks+ E. coli colibactin-induced DNA damage, was defined as SBS88 contribution of >10% (medRxiv 2023.03.10.23287127). Among the 5,292 CRC tumors with at least five SNVs, 984 (19%) were defined as having high microsatellite instability (MSI-H), as determined by mSINGS26 (as previously described22). Among MSI-H tumors, only six (0.6%) were SBS88-positive; therefore, we restricted our analysis to CRC tumors that were microsatellite stable or exhibited low microsatellite instability (MSS/L, N=4,308) to ensure a large enough sample size to conduct stratified analyses of epidemiologic factors and to reduce potential confounding or heterogeneity introduced by MSI-H tumors. These cases were compared to 14,192 healthy, non-cancer control participants selected from the same studies. Given that SBS88 signature calculation was derived from targeted sequencing data as opposed to whole genome or whole exome sequencing, we conducted sensitivity analyses comparing differential associations between three groups: SBS88+APC+, SBS88+APC−, and SBS88− tumors, where APC+/− status was determined by the APC mutation APC:c.835–8A>G. APC:c.835–8A>G is strongly associated with the SBS88 signature (OR = 65.5, 95% CI 39.0–110.0, medRxiv 2023.03.10.23287127) and has been found both in patients with unexplained colorectal polyposis and in precancerous polyps fulfilling the SBS88 signature.17,18
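The sample restriction and signature call described above can be sketched in a few lines. This is a minimal illustration, not the study's pipeline: the `Tumor` record and its field names are hypothetical, and the SBS88 contribution is assumed to be already estimated (for example by SignatureEstimation) and expressed as a proportion between 0 and 1.

```python
from dataclasses import dataclass

MIN_SNVS = 5            # at least five somatic SNVs required for decomposition
SBS88_THRESHOLD = 0.10  # SBS88-positive if contribution is strictly above 10%


@dataclass
class Tumor:
    # Hypothetical record; field names are illustrative assumptions.
    n_snvs: int
    sbs88_contribution: float  # proportion in [0, 1]
    msi_high: bool             # True if called MSI-H (e.g. by mSINGS)


def is_eligible(tumor: Tumor) -> bool:
    """Keep MSS/L tumors with enough SNVs for signature decomposition."""
    return tumor.n_snvs >= MIN_SNVS and not tumor.msi_high


def sbs88_status(tumor: Tumor) -> str:
    """Classify a tumor by the >10% SBS88 contribution rule."""
    if tumor.sbs88_contribution > SBS88_THRESHOLD:
        return "SBS88-positive"
    return "SBS88-negative"


# Sample-size bookkeeping using the counts reported in the text.
n_with_five_snvs = 5292
n_msi_high = 984
n_mss_l = n_with_five_snvs - n_msi_high  # analytic MSS/L sample
```

The threshold is applied as a strict inequality because the definition reads ">10%"; a tumor at exactly 10% would be called SBS88-negative under this sketch.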
Epidemiologic factor selection
Data for epidemiologic factors were collected through self-report via structured questionnaire or in-person interview. Multi-step data harmonization was performed at the Fred Hutchinson Cancer Center and has been previously described.27 Briefly, common data elements (CDEs) were defined, study-specific data dictionaries and questionnaires examined, and elements mapped to CDEs after an iterative process of communication with data contributors. Permissible values, definitions, and standardized coding were combined into a single dataset using SQL. The resulting dataset was checked for quality assurance, including for errors and outlying values within and between studies.
Epidemiologic factors were selected for analysis if they were available in harmonized data and plausibly related to gut dysbiosis.28,29 We considered the following epidemiologic factors: tobacco smoking (non-smoker, study- and sex-specific quartiles of pack-years smoked), body mass index (kg/m2, 18.5–24.9, 25–<30, ≥30), history of type II diabetes (yes/no), study-specific defined regular use of aspirin and non-aspirin NSAIDs (yes/no), any post-menopausal hormone replacement therapy (HRT) use (yes/no), and dietary variables. Dietary variables were obtained from diet histories or food frequency questionnaires, and include consumption of red meat, processed meat, fiber, total calcium, total folate, fruits, vegetables, and alcohol. All dietary variables were coded as study- and sex-specific quartiles except alcohol, which was coded by grams of alcohol intake per day: non-drinker, 1–28 g/day (moderate drinker), and >28 g/day (heavy drinker). All categorical variables were modeled with the lowest level of exposure or no use as the reference group except alcohol, where moderate drinking was used as the reference due to the observed J-shaped association.30,31
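As an illustration of the categorical coding described above, the BMI and alcohol categories could be assigned as follows. This is a sketch only: the function names are hypothetical, and the handling of values the text does not cover (BMI below 18.5, BMI between 24.9 and 25, and alcohol intake above 0 but below 1 g/day) is an assumption made here, not something stated by the study.

```python
from typing import Optional


def bmi_category(bmi: float) -> Optional[str]:
    """Map BMI (kg/m2) to the three analytic categories."""
    if bmi < 18.5:
        # Assumption: values below the lowest listed category are left uncoded.
        return None
    if bmi < 25:
        # Assumption: values between 24.9 and 25 fall in the reference group.
        return "18.5-24.9"
    if bmi < 30:
        return "25-<30"
    return ">=30"


def alcohol_category(grams_per_day: float) -> str:
    """Map alcohol intake (g/day) to the three analytic categories."""
    if grams_per_day <= 0:
        return "non-drinker"
    if grams_per_day <= 28:
        # Assumption: intake above 0 but below 1 g/day counts as moderate.
        return "1-28 g/day"
    return ">28 g/day"


# Reference levels used in the models, as stated in the text.
REFERENCE_LEVELS = {"bmi": "18.5-24.9", "alcohol": "1-28 g/day"}
```

Moderate drinking, rather than non-drinking, is kept as the alcohol reference level to match the J-shaped association noted above.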
Assessment of Outcomes
Studies identified incident CRC via self-report of diagnosis from study participants (with confirmation via adjudication of medical records), population-based cancer registries, regional hospitals, or healthcare management organizations. Most studies ascertained vital status via linkage to state or national death registries, or state cancer registries, with cause of death verified by death certificates. Other studies used active follow-up to ascertain vital status, with dates and causes of death confirmed via review of death certificates and/or medical records by trained adjudicators.
Statistical Analysis
CRC cases were stratified by their SBS88 positivity status, and multinomial logistic regression was used to compare SBS88-positive CRC cases and SBS88-negative CRC cases to controls. Case-only analyses comparing SBS88-positive to SBS88-negative CRC cases used logistic regression. Primary models adjusted for age, sex, study site, and, for dietary exposures, total energy intake. Multivariable models additionally adjusted for all other exposures of interest to account for potential confounding. To assess whether the associations stratified by SBS88 status differed with longer time from exposure measurement to diagnosis, we conducted a sensitivity analysis restricted to participants from cohort studies.
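One way to see why a case-only comparison tests heterogeneity is that, without covariates, the case-only odds ratio equals the ratio of the two case-control odds ratios. The sketch below checks this with the fruit-intake counts reported in Table 1 (highest versus lowest quartile). These are crude, unadjusted odds ratios computed by hand, so they will not match the adjusted estimates from the regression models; the function name is hypothetical.

```python
from math import isclose


def odds_ratio(exposed_a: int, unexposed_a: int,
               exposed_b: int, unexposed_b: int) -> float:
    """Crude odds ratio of exposure comparing group A with group B."""
    return (exposed_a * unexposed_b) / (unexposed_a * exposed_b)


# Fruit intake, highest (Q4) vs lowest (Q1) quartile, counts from Table 1.
pos_q4, pos_q1 = 47, 104      # SBS88-positive cases
neg_q4, neg_q1 = 603, 874     # SBS88-negative cases
ctl_q4, ctl_q1 = 2508, 2706   # controls

or_pos = odds_ratio(pos_q4, pos_q1, ctl_q4, ctl_q1)        # SBS88+ vs controls
or_neg = odds_ratio(neg_q4, neg_q1, ctl_q4, ctl_q1)        # SBS88- vs controls
or_case_only = odds_ratio(pos_q4, pos_q1, neg_q4, neg_q1)  # SBS88+ vs SBS88-

# The control counts cancel, so the case-only OR is the ratio of the two ORs.
identity_holds = isclose(or_case_only, or_pos / or_neg)
```

With these counts the crude odds ratios come out at roughly 0.49 and 0.74, and the case-only odds ratio at roughly 0.66; a case-only odds ratio away from 1 indicates that the exposure association differs between the two tumor subtypes.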
We used Cox proportional hazards models to estimate associations of epidemiologic factors with CRC-specific survival separately among SBS88-positive and SBS88-negative tumors, and in a combined model to assess interaction. The outcome was the time to CRC-specific death, assessed from the date of diagnosis to the date of death or the end of follow-up, whichever occurred first. CRC-specific death was coded 1 if the individual died and 0 otherwise. To assess the proportional hazards assumption, we tested scaled Schoenfeld residuals for a non-zero slope as a function of survival time and found no significant violations. We censored follow-up at five years because deaths within five years were most likely to be related to CRC as opposed to other causes of death. After excluding cases with missing or inconsistent survival data, our survival analysis sample included 3,465 CRC cases. Survival models used the same covariates as the risk analyses described above. All analyses were performed using R software, version 4.1.3 (R Foundation for Statistical Computing, Vienna, Austria).
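The outcome definition above (time from diagnosis to death or end of follow-up, censored at five years) could be derived per participant roughly as follows. This is a sketch under stated assumptions: the function and argument names are hypothetical, a year is taken as 365.25 days, deaths from causes other than CRC are treated as censored, and records with an end date before diagnosis are rejected as inconsistent.

```python
from datetime import date
from typing import Optional, Tuple

DAYS_PER_YEAR = 365.25   # assumption: average year length
MAX_FOLLOWUP_YEARS = 5.0


def survival_record(diagnosis: date,
                    last_followup: date,
                    death: Optional[date],
                    died_of_crc: bool) -> Tuple[float, int]:
    """Return (follow-up time in years, event indicator) for a Cox model."""
    end = death if death is not None else last_followup
    days = (end - diagnosis).days
    if days < 0:
        # Mirrors the exclusion of cases with inconsistent survival data.
        raise ValueError("end of follow-up precedes diagnosis")
    years = days / DAYS_PER_YEAR
    # Assumption: only deaths attributed to CRC count as events.
    event = 1 if (death is not None and died_of_crc) else 0
    if years > MAX_FOLLOWUP_YEARS:
        # Censor at five years: later deaths are not counted as events.
        return MAX_FOLLOWUP_YEARS, 0
    return years, event
```

For example, a participant diagnosed on 1 January 2000 who died of CRC on 1 January 2007 would contribute five years of follow-up with event indicator 0 under this sketch.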
Data Availability Statement
The original panel-sequenced data used in this study are available at the database of Genotypes and Phenotypes (dbGaP). The Ontario Institute of Cancer Research (OICR) data is available under accession code phs002050.v1.p1. The Center for Inherited Disease Research (CIDR) data is available under accession code phs001905.v1.p1. Mutational signature definitions were downloaded from the COSMIC website at https://cancer.sanger.ac.uk/signatures/downloads/.
Results
Among 4,308 MSS/L CRC cases, 392 (9%) were SBS88-positive. Red meat intake, processed meat intake, high alcohol consumption, and diabetes status were associated with higher risk of both SBS88-positive and SBS88-negative CRC compared to controls (Table 1). Point estimates were generally consistent with inverse associations with both SBS88-positive and SBS88-negative CRC risk for fiber, total calcium, total folate, and HRT. In minimally adjusted models, those in the highest quartile of fruit intake had 0.53 times the risk of SBS88-positive CRC as compared to those in the lowest quartile; in comparison, the OR for this association was 0.75 with respect to the risk of SBS88-negative CRC (P for heterogeneity = 0.047, Table 1). Other potential differences in associations by SBS88 status were observed for vegetable consumption, tobacco smoking, and BMI, though not statistically significant (Table 1). When cases were further stratified by both SBS88 and APC:c.835–8A>G, there was not strong evidence of differential associations (Supplemental Table 1).
Table 1.
Odds ratios (95% confidence intervals) for the association of epidemiologic factors with CRC risk by SBS88 signature status
| Exposure of interest | SBS88+ cases (N, %)e | SBS88− cases (N, %)e | Controls (N, %) | SBS88+ vs. Controls, Minimal adjusted OR (95% CI) a | SBS88− vs. Controls, Minimal adjusted OR (95% CI) a | P heterogeneity (case/case analysis) c | SBS88+ vs. Controls, Multivariable-adjusted OR (95% CI) b | SBS88− vs. Controls, Multivariable-adjusted OR (95% CI) b | P heterogeneity (case/case analysis) c |
|---|---|---|---|---|---|---|---|---|---|
| Red meat (servings/day) | Overall N = 16,536 | | | | | | | | |
| Lowest (Q1) | 76 (23) | 735 (23) | 3496 (27) | 1 | 1 | | 1 | 1 | |
| Second (Q2) | 76 (23) | 804 (25) | 3500 (27) | 1.17 (0.84, 1.61) | 1.20 (1.07, 1.35) | 0.847 | 1.10 (0.79, 1.53) | 1.16 (1.03, 1.30) | 0.734 |
| Third (Q3) | 84 (26) | 785 (25) | 3105 (24) | 1.44 (1.04, 1.98) | 1.31 (1.16, 1.47) | 0.466 | 1.30 (0.93, 1.82) | 1.24 (1.09, 1.40) | 0.623 |
| Highest (Q4) | 93 (28) | 856 (27) | 2926 (22) | 1.57 (1.13, 2.17) | 1.42 (1.26, 1.60) | 0.470 | 1.33 (0.94, 1.90) | 1.28 (1.12, 1.46) | 0.724 |
| Ptrend | | | | 0.003 | 4.2e-09 | 0.333 | 0.076 | 1.8e-04 | 0.561 |
| Processed meat (servings/day) | Overall N = 14,678 | | | | | | | | |
| Lowest (Q1) | 74 (27) | 740 (28) | 3585 (30) | 1 | 1 | | 1 | 1 | |
| Second (Q2) | 57 (21) | 613 (23) | 2848 (24) | 1.04 (0.73, 1.49) | 1.10 (0.97, 1.24) | 0.849 | 0.98 (0.68, 1.41) | 1.04 (0.91, 1.17) | 0.783 |
| Third (Q3) | 58 (21) | 636 (24) | 2801 (24) | 1.09 (0.77, 1.56) | 1.18 (1.05, 1.34) | 0.852 | 0.99 (0.68, 1.43) | 1.07 (0.94, 1.21) | 0.798 |
| Highest (Q4) | 81 (30) | 639 (24) | 2546 (22) | 1.58 (1.13, 2.21) | 1.21 (1.06, 1.37) | 0.087 | 1.31 (0.90, 1.90) | 1.02 (0.89, 1.18) | 0.205 |
| Ptrend | | | | 0.011 | 0.002 | 0.112 | 0.183 | 0.651 | 0.238 |
| Fiber (g/day) | Overall N = 12,813 | | | | | | | | |
| Lowest (Q1) | 78 (29) | 623 (26) | 2411 (24) | 1 | 1 | | 1 | 1 | |
| Second (Q2) | 59 (22) | 588 (24) | 2528 (25) | 0.71 (0.50, 1.01) | 0.86 (0.76, 0.99) | 0.257 | 0.82 (0.57, 1.18) | 0.98 (0.85, 1.13) | 0.317 |
| Third (Q3) | 60 (23) | 591 (24) | 2623 (26) | 0.70 (0.49, 1.00) | 0.83 (0.72, 0.95) | 0.318 | 0.89 (0.59, 1.34) | 1.04 (0.89, 1.21) | 0.509 |
| Highest (Q4) | 68 (26) | 618 (26) | 2566 (25) | 0.81 (0.56, 1.17) | 0.85 (0.74, 0.98) | 0.728 | 1.21 (0.75, 1.95) | 1.21 (1.01, 1.45) | 0.981 |
| Ptrend | | | | 0.265 | 0.020 | 0.739 | 0.448 | 0.033 | 0.936 |
| Total Calcium (mg/day) | Overall N = 17,312 | | | | | | | | |
| Lowest (Q1) | 91 (26) | 775 (22) | 3081 (23) | 1 | 1 | | 1 | 1 | |
| Second (Q2) | 115 (32) | 1265 (36) | 3926 (29) | 0.85 (0.63, 1.16) | 1.08 (0.96, 1.20) | 0.222 | 0.89 (0.65, 1.22) | 1.16 (1.03, 1.30) | 0.187 |
| Third (Q3) | 79 (22) | 819 (23) | 3345 (25) | 0.70 (0.50, 0.97) | 0.86 (0.76, 0.97) | 0.208 | 0.76 (0.54, 1.07) | 0.98 (0.86, 1.11) | 0.148 |
| Highest (Q4) | 70 (20) | 630 (18) | 3116 (23) | 0.74 (0.52, 1.04) | 0.74 (0.65, 0.84) | 0.953 | 0.83 (0.57, 1.20) | 0.88 (0.77, 1.01) | 0.747 |
| Ptrend | | | | 0.034 | 5.7e-09 | 0.865 | 0.203 | 0.006 | 0.680 |
| Total Folate (mcg/day) | Overall N = 15,421 | | | | | | | | |
| Lowest (Q1) | 80 (23) | 758 (23) | 2505 (21) | 1 | 1 | | 1 | 1 | |
| Second (Q2) | 108 (31) | 1024 (31) | 3457 (29) | 0.86 (0.62, 1.18) | 0.81 (0.72, 0.92) | 0.935 | 0.92 (0.65, 1.28) | 0.84 (0.74, 0.95) | 0.732 |
| Third (Q3) | 80 (23) | 864 (26) | 2938 (25) | 0.71 (0.51, 1.00) | 0.79 (0.70, 0.89) | 0.588 | 0.82 (0.57, 1.17) | 0.85 (0.75, 0.97) | 0.877 |
| Highest (Q4) | 79 (23) | 637 (19) | 2891 (25) | 0.79 (0.57, 1.10) | 0.64 (0.56, 0.72) | 0.334 | 0.99 (0.67, 1.45) | 0.72 (0.62, 0.83) | 0.161 |
| Ptrend | | | | 0.100 | 3.0e-11 | 0.500 | 0.783 | 7.5e-05 | 0.270 |
| Fruit (servings/day) | Overall N = 16,415 | | | | | | | | |
| Lowest (Q1) | 104 (32) | 874 (28) | 2706 (21) | 1 | 1 | | 1 | 1 | |
| Second (Q2) | 96 (29) | 994 (32) | 4458 (34) | 0.70 (0.52, 0.94) | 0.77 (0.69, 0.86) | 0.600 | 0.72 (0.53, 0.97) | 0.80 (0.71, 0.90) | 0.571 |
| Third (Q3) | 79 (24) | 677 (22) | 3269 (25) | 0.71 (0.52, 0.96) | 0.67 (0.59, 0.75) | 0.877 | 0.75 (0.54, 1.04) | 0.72 (0.63, 0.81) | 0.892 |
| Highest (Q4) | 47 (14) | 603 (19) | 2508 (19) | 0.53 (0.37, 0.76) | 0.75 (0.66, 0.85) | 0.047 | 0.57 (0.38, 0.85) | 0.82 (0.71, 0.95) | 0.076 |
| Ptrend | | | | 9.0e-04 | 5.4e-08 | 0.135 | 0.012 | 6.8e-04 | 0.193 |
| Vegetables (servings/day) | Overall N = 16,475 | | | | | | | | |
| Lowest (Q1) | 77 (23) | 677 (21) | 2537 (20) | 1 | 1 | | 1 | 1 | |
| Second (Q2) | 102 (31) | 1005 (32) | 4341 (33) | 0.94 (0.68, 1.29) | 0.99 (0.87, 1.11) | 0.826 | 0.97 (0.70, 1.33) | 1.02 (0.91, 1.16) | 0.810 |
| Third (Q3) | 96 (29) | 816 (26) | 3454 (27) | 0.93 (0.68, 1.27) | 0.89 (0.79, 1.01) | 0.819 | 1.01 (0.73, 1.41) | 0.97 (0.85, 1.10) | 0.772 |
| Highest (Q4) | 56 (17) | 673 (21) | 2641 (20) | 0.70 (0.49, 1.01) | 0.95 (0.83, 1.08) | 0.099 | 0.79 (0.53, 1.18) | 1.08 (0.93, 1.25) | 0.124 |
| Ptrend | | | | 0.083 | 0.153 | 0.197 | 0.381 | 0.569 | 0.261 |
| Alcohol consumption | Overall N = 16,714 | | | | | | | | |
| 1–28 g/day (ref) | 134 (40) | 1430 (44) | 6242 (48) | 1 | 1 | | 1 | 1 | |
| Non-drinker | 155 (47) | 1356 (42) | 5748 (44) | 1.29 (1.01, 1.64) | 1.09 (0.99, 1.19) | 0.223 | 1.23 (0.96, 1.58) | 1.04 (0.95, 1.14) | 0.268 |
| P | | | | 0.043 | 0.071 | | 0.101 | 0.357 | |
| >28 g/day | 44 (13) | 457 (14) | 1148 (9) | 1.71 (1.19, 2.46) | 1.52 (1.33, 1.73) | 0.573 | 1.56 (1.08, 2.25) | 1.43 (1.25, 1.63) | 0.747 |
| P | | | | 0.004 | 4.5e-10 | | 0.018 | 2.3e-07 | |
| Pheterogeneity c | | | | | | 0.331 | | | 0.460 |
| Tobacco smoking (pack years) d | Overall N = 15,935 | | | | | | | | |
| Never smoker | 160 (51) | 1543 (50) | 6802 (54) | 1 | 1 | | 1 | 1 | |
| ≤25th percentile | 31 (9) | 327 (11) | 1385 (11) | 0.98 (0.66, 1.45) | 1.04 (0.91, 1.20) | 0.812 | 1.01 (0.68, 1.50) | 1.06 (0.92, 1.22) | 0.867 |
| 25th–50th percentile | 43 (14) | 355 (12) | 1446 (12) | 1.31 (0.92, 1.86) | 1.09 (0.95, 1.25) | 0.240 | 1.34 (0.94, 1.92) | 1.09 (0.95, 1.25) | 0.219 |
| 50th–75th percentile | 43 (14) | 387 (13) | 1465 (12) | 1.31 (0.92, 1.86) | 1.18 (1.04, 1.35) | 0.476 | 1.27 (0.89, 1.82) | 1.15 (1.01, 1.31) | 0.451 |
| ≥75th percentile | 36 (12) | 465 (15) | 1447 (12) | 1.05 (0.72, 1.52) | 1.35 (1.19, 1.53) | 0.250 | 0.95 (0.65, 1.40) | 1.26 (1.11, 1.44) | 0.219 |
| Ptrend | | | | 0.246 | 1.2e-06 | 0.725 | 0.496 | 2.0e-04 | 0.681 |
| BMI (kg/m2) | Overall N = 17,434 | | | | | | | | |
| 18.5–24.9 | 122 (36) | 1180 (35) | 5450 (40) | 1 | 1 | | 1 | 1 | |
| 25–<30 | 146 (43) | 1391 (42) | 5573 (40) | 1.18 (0.92, 1.52) | 1.12 (1.02, 1.22) | 0.770 | 1.15 (0.90, 1.48) | 1.10 (1.01, 1.21) | 0.829 |
| ≥30 | 72 (21) | 768 (23) | 2732 (20) | 1.19 (0.88, 1.61) | 1.35 (1.21, 1.50) | 0.349 | 1.10 (0.81, 1.50) | 1.28 (1.14, 1.43) | 0.287 |
| Ptrend | | | | 0.199 | 1.0e-07 | 0.414 | 0.446 | 1.6e-05 | 0.347 |
| Diabetes | Overall N = 17,577 | | | | | | | | |
| No | 316 (90) | 3070 (89) | 12,766 (93) | 1 | 1 | | 1 | 1 | |
| Yes | 36 (10) | 365 (11) | 1024 (7) | 1.44 (1.00, 2.07) | 1.34 (1.17, 1.54) | 0.763 | 1.34 (0.92, 1.94) | 1.25 (1.09, 1.44) | 0.788 |
| P | | | | 0.051 | 2.2e-05 | | 0.124 | 0.001 | |
| Aspirin/NSAID use | Overall N = 15,750 | | | | | | | | |
| No | 222 (74) | 2200 (72) | 7563 (61) | 1 | 1 | | 1 | 1 | |
| Yes | 79 (26) | 852 (28) | 4834 (39) | 0.63 (0.48, 0.82) | 0.67 (0.61, 0.73) | 0.638 | 0.62 (0.48, 0.82) | 0.66 (0.60, 0.73) | 0.656 |
| P | | | | 7.0e-04 | < e-10 | | 6.4e-04 | < e-10 | |
| HRT use | Overall N = 7,541 | | | | | | | | |
| No | 101 (68) | 901 (69) | 3345 (55) | 1 | 1 | | 1 | 1 | |
| Yes | 48 (32) | 408 (31) | 2738 (45) | 0.73 (0.51, 1.05) | 0.71 (0.61, 0.81) | 0.970 | 0.76 (0.53, 1.10) | 0.75 (0.65, 0.87) | 0.889 |
| P | | | | 0.089 | 6.9e-07 | | 0.150 | 8.0e-05 | |
a. Adjusted for age at diagnosis, sex, and study site. Dietary exposures (red meat through vegetables) were also adjusted for total energy (kcal/day).
b. In addition to the minimal adjustment, further adjusted for all other exposure variables of interest. **All imputed variables are the same as the variables of interest, except BMI, where BMI/5 imputed was used as a covariate for all models except BMI.
c. P heterogeneity tests the difference in the association between the lifestyle factor and colorectal cancer between the two tumor mutational signature subtypes, colibactin+ versus colibactin−. It is based on the case-only analysis testing the difference in specific quartiles/categories and/or trend.
d. Study- and sex-specific quartiles of pack-years in current and former smokers.
e. Microsatellite stable or low microsatellite instability tumors.
As a sensitivity analysis, we repeated analyses among cohort studies only (N=10,070 participants) given longer intervals between exposure assessment and CRC diagnosis. The stronger protective effect for fruit intake for SBS88-positive CRC risk compared to SBS88-negative CRC risk was consistent in the cohort-only analysis (Table 2). Alcohol non-drinkers in the cohort-only analysis had a higher risk of SBS88-positive CRC, while there was no such association with SBS88-negative CRC (P heterogeneity = 0.024, Table 2). Additionally, those with a BMI ≥30 kg/m2 had 1.40 (95% CI 1.20, 1.63) times the risk of SBS88-negative CRC compared to those with a BMI of 18.5–24.9 kg/m2, while there was no association with risk of SBS88-positive CRC for the higher BMI category (OR=0.82; 95% CI 0.53, 1.26; P for heterogeneity = 0.022).
Table 2.
Odds ratios (95% confidence intervals) for the association of epidemiologic factors with CRC risk by SBS88 signature status among participants in cohort studies
| Exposure of interest | SBS88+ cases (N, %)d | SBS88− cases (N, %)d | Controls (N, %) | SBS88+ vs. Controls, Multivariable-adjusted OR (95% CI) a | SBS88− vs. Controls, Multivariable-adjusted OR (95% CI) a | P heterogeneity (case/case analysis) b |
|---|---|---|---|---|---|---|
| Red meat (servings/day) | Overall N = 9774 | | | | | |
| Lowest (Q1) | 40 (22) | 369 (22) | 1968 (25) | 1 | 1 | |
| Second (Q2) | 36 (20) | 424 (25) | 2016 (25) | 0.86 (0.53, 1.38) | 1.09 (0.93, 1.29) | 0.373 |
| Third (Q3) | 54 (30) | 440 (26) | 1979 (25) | 1.26 (0.79, 2.03) | 1.07 (0.89, 1.27) | 0.431 |
| Highest (Q4) | 48 (27) | 432 (26) | 1968 (25) | 1.14 (0.66, 1.98) | 1.01 (0.82, 1.24) | 0.601 |
| Ptrend | | | | 0.359 | 0.979 | 0.309 |
| Processed meat (servings/day) | Overall N = 9650 | | | | | |
| Lowest (Q1) | 50 (29) | 491 (30) | 2440 (31) | 1 | 1 | |
| Second (Q2) | 38 (22) | 337 (21) | 1661 (21) | 1.13 (0.72, 1.77) | 1.04 (0.88, 1.22) | 0.693 |
| Third (Q3) | 36 (21) | 397 (24) | 1885 (24) | 0.94 (0.59, 1.51) | 1.06 (0.89, 1.25) | 0.753 |
| Highest (Q4) | 49 (28) | 404 (25) | 1862 (24) | 1.23 (0.75, 2.00) | 1.03 (0.85, 1.24) | 0.428 |
| Ptrend | | | | 0.577 | 0.711 | 0.598 |
| Fiber (g/day) | Overall N = 9686 | | | | | |
| Lowest (Q1) | 58 (33) | 423 (26) | 1929 (24) | 1 | 1 | |
| Second (Q2) | 41 (23) | 410 (25) | 1965 (25) | 0.81 (0.52, 1.27) | 1.00 (0.84, 1.18) | 0.376 |
| Third (Q3) | 37 (21) | 406 (25) | 2013 (26) | 0.81 (0.48, 1.37) | 1.04 (0.86, 1.26) | 0.412 |
| Highest (Q4) | 40 (23) | 389 (24) | 1975 (25) | 1.05 (0.55, 1.99) | 1.04 (0.82, 1.31) | 0.925 |
| Ptrend | | | | 0.976 | 0.699 | 0.995 |
| Total Calcium (mg/day) | Overall N = 10070 | | | | | |
| Lowest (Q1) | 58 (31) | 463 (27) | 2024 (25) | 1 | 1 | |
| Second (Q2) | 44 (24) | 425 (25) | 2120 (26) | 0.83 (0.55, 1.26) | 0.94 (0.80, 1.10) | 0.652 |
| Third (Q3) | 41 (22) | 439 (25) | 1950 (24) | 0.84 (0.53, 1.31) | 1.06 (0.90, 1.25) | 0.430 |
| Highest (Q4) | 43 (23) | 398 (23) | 2065 (25) | 0.99 (0.61, 1.62) | 1.00 (0.83, 1.19) | 0.880 |
| Ptrend | | | | 0.938 | 0.669 | 0.978 |
| Total Folate (mcg/day) | Overall N = 10070 | | | | | |
| Lowest (Q1) | 49 (26) | 470 (27) | 1910 (23) | 1 | 1 | |
| Second (Q2) | 54 (29) | 487 (28) | 2132 (26) | 1.05 (0.69, 1.59) | 0.83 (0.71, 0.97) | 0.284 |
| Third (Q3) | 35 (19) | 345 (20) | 1889 (23) | 0.71 (0.44, 1.15) | 0.63 (0.53, 0.75) | 0.717 |
| Highest (Q4) | 48 (26) | 423 (25) | 2228 (27) | 0.92 (0.56, 1.52) | 0.66 (0.55, 0.79) | 0.253 |
| Ptrend | | | | 0.435 | 4.3e-07 | 0.385 |
| Fruit (servings/day) | Overall N = 9591 | | | | | |
| Lowest (Q1) | 63 (35) | 463 (28) | 1888 (24) | 1 | 1 | |
| Second (Q2) | 46 (26) | 419 (26) | 2005 (26) | 0.72 (0.48, 1.07) | 0.90 (0.77, 1.05) | 0.323 |
| Third (Q3) | 41 (23) | 382 (23) | 2002 (26) | 0.68 (0.43, 1.05) | 0.86 (0.72, 1.01) | 0.321 |
| Highest (Q4) | 29 (16) | 376 (23) | 1877 (24) | 0.53 (0.31, 0.90) | 0.92 (0.76, 1.11) | 0.050 |
| Ptrend | | | | 0.019 | 0.311 | 0.064 |
| Vegetables (servings/day) | Overall N = 9591 | | | | | |
| Lowest (Q1) | 44 (25) | 405 (25) | 1923 (25) | 1 | 1 | |
| Second (Q2) | 43 (24) | 384 (23) | 1827 (24) | 1.25 (0.80, 1.94) | 1.14 (0.96, 1.34) | 0.630 |
| Third (Q3) | 51 (28) | 422 (26) | 2002 (26) | 1.35 (0.86, 2.11) | 1.09 (0.92, 1.29) | 0.356 |
| Highest (Q4) | 41 (23) | 429 (26) | 2020 (26) | 1.23 (0.73, 2.09) | 1.17 (0.97, 1.42) | 0.989 |
| Ptrend | | | | 0.371 | 0.155 | 0.850 |
| Alcohol consumption | Overall N = 9875 | | | | | |
| 1–28 g/day (ref) | 65 (37) | 775 (47) | 3669 (46) | 1 | 1 | |
| Non-drinker | 91 (51) | 685 (41) | 3800 (47) | 1.50 (1.07, 2.11) | 0.99 (0.87, 1.12) | 0.024 |
| P | | | | 0.019 | 0.852 | |
| >28 g/day | 21 (12) | 199 (12) | 570 (7) | 1.64 (0.97, 2.77) | 1.29 (1.06, 1.56) | 0.470 |
| P | | | | 0.065 | 0.011 | |
| Pheterogeneity b | | | | | | 0.104 |
| Tobacco smoking (pack years) c | Overall N = 9448 | | | | | |
| Never smoker | 92 (55) | 810 (51) | 4233 (55) | 1 | 1 | |
| ≤25th percentile | 14 (8) | 172 (11) | 874 (11) | 0.85 (0.48, 1.52) | 1.03 (0.85, 1.24) | 0.485 |
| 25th–50th percentile | 21 (13) | 181 (11) | 883 (11) | 1.18 (0.71, 1.93) | 1.01 (0.84, 1.22) | 0.491 |
| 50th–75th percentile | 27 (16) | 189 (12) | 895 (12) | 1.52 (0.96, 2.40) | 1.07 (0.88, 1.28) | 0.095 |
| ≥75th percentile | 14 (8) | 225 (14) | 818 (11) | 0.77 (0.43, 1.38) | 1.26 (1.05, 1.51) | 0.163 |
| Ptrend | | | | 0.699 | 0.029 | 0.945 |
| BMI (kg/m2) | Overall N = 9805 | | | | | |
| 18.5–24.9 | 77 (43) | 612 (36) | 3301 (42) | 1 | 1 | |
| 25–<30 | 71 (39) | 690 (41) | 3076 (39) | 0.92 (0.66, 1.28) | 1.14 (1.01, 1.29) | 0.243 |
| ≥30 | 32 (18) | 375 (22) | 1571 (20) | 0.82 (0.53, 1.26) | 1.40 (1.20, 1.63) | 0.022 |
| Ptrend | | | | 0.347 | 2.7e-05 | 0.020 |
| Diabetes | Overall N = 9961 | | | | | |
| No | 164 (92) | 1553 (93) | 7714 (95) | 1 | 1 | |
| Yes | 15 (8) | 114 (7) | 401 (5) | 1.82 (1.04, 3.18) | 1.40 (1.12, 1.76) | 0.378 |
| P | | | | 0.035 | 0.004 | |
| Aspirin/NSAID use | Overall N = 8573 | | | | | |
| No | 84 (63) | 819 (61) | 3856 (54) | 1 | 1 | |
| Yes | 49 (37) | 520 (39) | 3245 (46) | 0.71 (0.49, 1.02) | 0.74 (0.65, 0.84) | 0.743 |
| P | | | | 0.063 | 4.0e-06 | |
| HRT use | Overall N = 5163 | | | | | |
| No | 55 (63) | 465 (62) | 2262 (52) | 1 | 1 | |
| Yes | 33 (38) | 282 (38) | 2066 (48) | 0.76 (0.48, 1.20) | 0.81 (0.68, 0.96) | 0.617 |
| P | | | | 0.244 | 0.018 | |
a. Adjusted for age at diagnosis, sex, and study site. Dietary exposures (red meat through vegetables) were also adjusted for total energy (kcal/day). Additionally adjusted for all other exposure variables of interest. **All imputed variables are the same as the variables of interest, except BMI, where BMI/5 imputed was used as a covariate for all models except BMI.
b. P heterogeneity tests the difference in the association between the lifestyle factor and colorectal cancer between the two tumor mutational signature subtypes, colibactin+ versus colibactin−. It is based on the case-only analysis testing the difference in specific quartiles/categories and/or trend.
c. Study- and sex-specific quartiles of pack-years in current and former smokers.
d. Microsatellite stable or low microsatellite instability tumors.
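Readers who want to check how a reported odds ratio, its confidence interval, and its P value fit together can back-calculate an approximate Wald statistic on the log scale. The sketch below does this for the cohort-only estimate for alcohol non-drinkers and SBS88-positive CRC in Table 2 (OR 1.50, 95% CI 1.07, 2.11). It assumes a symmetric Wald interval on the log-odds scale and works from rounded published values, so it only approximates the model-based P value; the function name is hypothetical.

```python
from math import log
from statistics import NormalDist


def wald_from_ci(estimate: float, lower: float, upper: float,
                 level: float = 0.95):
    """Approximate SE, z, and two-sided P from a ratio estimate and its CI."""
    z_crit = NormalDist().inv_cdf(0.5 + level / 2)     # about 1.96 for 95%
    se = (log(upper) - log(lower)) / (2 * z_crit)      # SE of the log ratio
    z = log(estimate) / se                             # Wald z statistic
    p = 2 * (1 - NormalDist().cdf(abs(z)))             # two-sided P value
    return se, z, p


# Non-drinkers vs moderate drinkers, SBS88-positive CRC, cohort studies only.
se, z, p = wald_from_ci(1.50, 1.07, 2.11)
```

The back-calculated P value is close to the 0.019 reported in Table 2; small differences are expected from rounding of the published estimate and interval.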
In addition to CRC risk, we also compared differences in CRC-specific mortality for environmental and lifestyle factors stratified by SBS88 status. Among 3,465 CRC cases eligible for survival analysis, 734 CRC-specific deaths occurred within five years of follow-up. The only environmental and lifestyle factor meaningfully associated with differential CRC-specific survival was BMI. After adjusting for all other exposures, the hazard of CRC-specific death among those with SBS88-positive CRC and BMI ≥30 kg/m2 was 3.40 (95% CI 1.47, 7.84, P trend = 0.005) times as great as the hazard of CRC-specific death among those with SBS88-positive CRC and BMI 18.5–24.9 kg/m2; however, no association between BMI and survival was observed among those with SBS88-negative tumors (HR = 0.97, 95% CI 0.78, 1.21, P for heterogeneity = 0.066, Table 3). Among those with SBS88-positive tumors, higher red meat intake was associated with better survival and higher vegetable intake was associated with worse survival; however, these results may be unstable due to the small sample size among those with SBS88-positive tumors (42–95 SBS88-positive cases and 9–14 CRC-specific deaths per quartile for red meat and vegetable intake), and neither association differed significantly by SBS88 status. When cases were further stratified by both SBS88 and APC:c.835–8A>G, the differential association between BMI and CRC-specific death was observed only in the SBS88+APC− group and not in the SBS88+APC+ group, although these results may also be unstable due to small sample size (Supplemental Table 2).
Table 3.
Hazard ratios (95% confidence intervals) for the association of epidemiologic factors with colorectal cancer survival by SBS88 signature status
| Exposure of interest | SBS88+ cases d | SBS88− cases d | SBS88+ CRC-specific deaths | SBS88− CRC-specific deaths | SBS88+ CRC-specific mortality, Minimal adjusted HR (95% CI) a | SBS88− CRC-specific mortality, Minimal adjusted HR (95% CI) a | SBS88+ CRC-specific mortality, Multivariable-adjusted HR (95% CI) b | SBS88− CRC-specific mortality, Multivariable-adjusted HR (95% CI) b |
|---|---|---|---|---|---|---|---|---|
| Red meat (servings/day) | Total N= 286 | Total N= 2882 | ||||||
| Lowest (Q1) | 67 | 656 | 14 | 136 | 1 | 1 | 1 | 1 |
| Second (Q2) | 70 | 730 | 10 | 167 | 0.47 (0.20, 1.11) | 1.13 (0.90, 1.41) | 0.41 (0.16, 1.07) | 1.16 (0.92, 1.47) |
| Third (Q3) | 70 | 714 | 13 | 166 | 0.68 (0.31, 1.52) | 1.13 (0.90, 1.43) | 0.48 (0.20, 1.16) | 1.19 (0.93, 1.51) |
| Highest (Q4) | 79 | 782 | 12 | 153 | 0.58 (0.26, 1.32) | 0.96 (0.75, 1.22) | 0.40 (0.15, 1.07) | 1.03 (0.79, 1.35) |
| Ptrend | | | | | 0.346 | 0.721 | 0.102 | 0.813 |
| P heterogeneity | | | | | 0.757 (minimally adjusted) | | 0.745 (multivariable-adjusted) | |
| Processed meat (servings/day) | Total N= 227 | Total N= 2332 | ||||||
| Lowest (Q1) | 65 | 645 | 15 | 160 | 1 | 1 | 1 | 1 |
| Second (Q2) | 46 | 540 | 4 | 109 | 0.28 (0.08, 0.93) | 0.89 (0.69, 1.14) | 0.24 (0.06, 0.94) | 0.88 (0.69, 1.14) |
| Third (Q3) | 49 | 562 | 10 | 137 | 0.80 (0.33, 1.90) | 1.00 (0.79, 1.26) | 0.66 (0.25, 1.79) | 0.98 (0.77, 1.26) |
| Highest (Q4) | 67 | 585 | 13 | 114 | 0.74 (0.33, 1.66) | 0.79 (0.61, 1.01) | 1.05 (0.34, 3.24) | 0.76 (0.58, 1.01) |
| Ptrend | | | | | 0.849 | 0.144 | 0.790 | 0.137 |
| P heterogeneity | | | | | 0.769 (minimally adjusted) | | 0.762 (multivariable-adjusted) | |
| Fiber (g/day) | Total N= 224 | Total N= 2131 | ||||||
| Lowest (Q1) | 70 | 536 | 11 | 116 | 1 | 1 | 1 | 1 |
| Second (Q2) | 49 | 520 | 10 | 127 | 1.29 (0.53, 3.14) | 1.18 (0.91, 1.53) | 0.87 (0.31, 2.40) | 1.20 (0.92, 1.58) |
| Third (Q3) | 47 | 523 | 10 | 124 | 1.55 (0.59, 4.06) | 1.13 (0.87, 1.48) | 1.00 (0.31, 3.25) | 1.18 (0.87, 1.60) |
| Highest (Q4) | 58 | 552 | 7 | 122 | 0.72 (0.23, 2.27) | 1.09 (0.82, 1.44) | 0.44 (0.11, 1.79) | 1.15 (0.81, 1.64) |
| Ptrend | | | | | 0.752 | 0.661 | 0.296 | 0.505 |
| P heterogeneity | | | | | 0.690 (minimally adjusted) | | 0.721 (multivariable-adjusted) | |
| Total Calcium (mg/day) | Total N= 306 | Total N= 3159 | ||||||
| Lowest (Q1) | 82 | 698 | 19 | 155 | 1 | 1 | 1 | 1 |
| Second (Q2) | 103 | 1177 | 12 | 241 | 0.37 (0.16, 0.88) | 1.04 (0.84, 1.30) | 0.41 (0.16, 1.02) | 1.09 (0.87, 1.36) |
| Third (Q3) | 62 | 728 | 11 | 158 | 0.56 (0.23, 1.40) | 1.01 (0.80, 1.28) | 0.53 (0.20, 1.39) | 1.07 (0.84, 1.37) |
| Highest (Q4) | 59 | 556 | 10 | 128 | 0.59 (0.23, 1.53) | 1.09 (0.85, 1.40) | 0.63 (0.22, 1.83) | 1.16 (0.88, 1.52) |
| Ptrend | | | | | 0.300 | 0.613 | 0.374 | 0.355 |
| P heterogeneity | | | | | 0.473 (minimally adjusted) | | 0.548 (multivariable-adjusted) | |
| Total Folate (mcg/day) | Total N= 299 | Total N= 2956 | ||||||
| Lowest (Q1) | 69 | 682 | 10 | 160 | 1 | 1 | 1 | 1 |
| Second (Q2) | 97 | 950 | 16 | 209 | 0.93 (0.38, 2.25) | 1.05 (0.84, 1.32) | 0.90 (0.35, 2.35) | 1.03 (0.81, 1.31) |
| Third (Q3) | 65 | 779 | 13 | 161 | 0.91 (0.35, 2.38) | 0.88 (0.70, 1.12) | 0.80 (0.29, 2.25) | 0.86 (0.67, 1.11) |
| Highest (Q4) | 68 | 545 | 9 | 118 | 0.88 (0.32, 2.42) | 0.92 (0.72, 1.18) | 1.08 (0.36, 3.23) | 0.86 (0.65, 1.14) |
| Ptrend | | | | | 0.808 | 0.242 | 0.947 | 0.131 |
| P heterogeneity | | | | | 0.715 (minimally adjusted) | | 0.666 (multivariable-adjusted) | |
| Fruit (servings/day) | Total N= 283 | Total N= 2849 | ||||||
| Lowest (Q1) | 91 | 792 | 14 | 168 | 1 | 1 | 1 | 1 |
| Second (Q2) | 88 | 926 | 16 | 207 | 1.30 (0.59, 2.85) | 1.15 (0.92, 1.42) | 1.77 (0.74, 4.24) | 1.13 (0.91, 1.42) |
| Third (Q3) | 67 | 601 | 12 | 132 | 1.11 (0.49, 2.49) | 1.05 (0.83, 1.32) | 1.22 (0.51, 2.93) | 1.02 (0.79, 1.30) |
| Highest (Q4) | 37 | 530 | 6 | 107 | 0.95 (0.34, 2.63) | 0.97 (0.75, 1.24) | 1.19 (0.35, 4.07) | 0.93 (0.70, 1.23) |
| Ptrend | | | | | 0.952 | 0.748 | 0.794 | 0.512 |
| P heterogeneity | | | | | 0.802 (minimally adjusted) | | 0.797 (multivariable-adjusted) | |
| Vegetables (servings/day) | Total N= 288 | Total N= 2872 | ||||||
| Lowest (Q1) | 68 | 619 | 9 | 141 | 1 | 1 | 1 | 1 |
| Second (Q2) | 95 | 931 | 14 | 177 | 1.42 (0.57, 3.56) | 0.94 (0.74, 1.19) | 1.86 (0.72, 4.84) | 0.97 (0.76, 1.23) |
| Third (Q3) | 83 | 747 | 14 | 165 | 1.22 (0.51, 2.92) | 1.03 (0.82, 1.30) | 1.56 (0.62, 3.94) | 1.06 (0.83, 1.35) |
| Highest (Q4) | 42 | 575 | 12 | 135 | 2.34 (0.94, 5.85) | 1.10 (0.86, 1.40) | 4.00 (1.32, 12.11) | 1.15 (0.87, 1.52) |
| Ptrend | | | | | 0.129 | 0.355 | 0.046 | 0.253 |
| P heterogeneity | | | | | 0.135 (minimally adjusted) | | 0.142 (multivariable-adjusted) | |
| Alcohol consumption | Total N= 283 | Total N= 2872 | ||||||
| 1–28 g/day (ref) | 115 | 1284 | 14 | 261 | 1 | 1 | 1 | 1 |
| Non-drinker | 131 | 1174 | 27 | 252 | 1.65 (0.83, 3.30) | 1.01 (0.84, 1.21) | 1.91 (0.91, 4.03) | 1.01 (0.84, 1.21) |
| P | | | | | 0.155 | 0.921 | 0.089 | 0.930 |
| >28 g/day | 37 | 414 | 7 | 102 | 1.69 (0.63, 4.48) | 1.22 (0.96, 1.56) | 1.77 (0.63, 5.02) | 1.21 (0.95, 1.55) |
| P | | | | | 0.296 | 0.100 | 0.281 | 0.119 |
| P heterogeneity | | | | | 0.411 (minimally adjusted) | | 0.369 (multivariable-adjusted) | |
| Tobacco smoking (pack years) c | Total N= 270 | Total N= 2789 | ||||||
| Never smoker | 135 | 1352 | 21 | 297 | 1 | 1 | 1 | 1 |
| ≤25th percentile | 28 | 289 | 5 | 60 | 1.38 (0.49, 3.86) | 0.99 (0.74, 1.31) | 1.45 (0.47, 4.48) | 0.99 (0.75, 1.32) |
| 25th–50th percentile | 41 | 337 | 10 | 71 | 2.46 (1.07, 5.64) | 1.01 (0.78, 1.32) | 2.79 (1.14, 6.78) | 1.00 (0.76, 1.31) |
| 50th–75th percentile | 34 | 367 | 6 | 79 | 1.26 (0.48, 3.32) | 1.04 (0.81, 1.35) | 1.28 (0.46, 3.54) | 1.04 (0.80, 1.35) |
| ≥75th percentile | 32 | 444 | 4 | 103 | 1.28 (0.39, 4.14) | 1.14 (0.90, 1.44) | 1.37 (0.38, 4.96) | 1.13 (0.89, 1.44) |
| Ptrend | | | | | 0.353 | 0.313 | 0.303 | 0.346 |
| P heterogeneity | | | | | 0.634 (minimally adjusted) | | 0.676 (multivariable-adjusted) | |
| BMI (kg/m2) | Total N= 287 | Total N= 2912 | ||||||
| 18.5–24.9 | 102 | 1048 | 14 | 235 | 1 | 1 | 1 | 1 |
| 25–<30 | 126 | 1224 | 19 | 265 | 1.16 (0.55, 2.44) | 0.94 (0.78, 1.12) | 1.20 (0.54, 2.70) | 0.96 (0.80, 1.15) |
| ≥30 | 59 | 640 | 16 | 138 | 2.69 (1.25, 5.81) | 0.95 (0.77, 1.18) | 3.40 (1.47, 7.84) | 0.97 (0.78, 1.21) |
| Ptrend | | | | | 0.014 | 0.602 | 0.005 | 0.758 |
| P heterogeneity | | | | | 0.051 (minimally adjusted) | | 0.066 (multivariable-adjusted) | |
| Diabetes | Total N= 296 | Total N= 2985 | ||||||
| No | 268 | 2690 | 45 | 582 | 1 | 1 | 1 | 1 |
| Yes | 28 | 295 | 5 | 66 | 1.04 (0.41, 2.69) | 1.07 (0.83, 1.39) | 0.70 (0.25, 1.97) | 1.10 (0.84, 1.43) |
| P | | | | | 0.929 | 0.605 | 0.503 | 0.500 |
| P heterogeneity | | | | | 0.913 (minimally adjusted) | | 0.909 (multivariable-adjusted) | |
| Aspirin/NSAID use | Total N= 255 | Total N= 2699 | ||||||
| No | 188 | 1922 | 27 | 399 | 1 | 1 | 1 | 1 |
| Yes | 67 | 777 | 8 | 160 | 0.74 (0.31, 1.77) | 0.88 (0.72, 1.06) | 0.60 (0.23, 1.57) | 0.87 (0.72, 1.05) |
| P | | | | | 0.492 | 0.171 | 0.296 | 0.156 |
| P heterogeneity | | | | | 0.601 (minimally adjusted) | | 0.631 (multivariable-adjusted) | |
| HRT use | Total N= 115 | Total N= 1073 | ||||||
| No | 77 | 751 | 16 | 171 | 1 | 1 | 1 | 1 |
| Yes | 38 | 322 | 8 | 62 | 1.37 (0.51, 3.63) | 0.84 (0.62, 1.14) | 2.13 (0.57, 7.96) | 0.85 (0.62, 1.15) |
| P | | | | | 0.533 | 0.265 | 0.261 | 0.285 |
| P heterogeneity | | | | | 0.691 (minimally adjusted) | | 0.632 (multivariable-adjusted) | |
a. Adjusted for age at diagnosis, sex, and strata(study site). Dietary exposures (red meat through vegetables) were also adjusted for total energy (kcal/day).
b. In addition to the covariates in the minimally adjusted HR, further adjusted for all other exposure variables of interest. All imputed variables are the same as the variables of interest, except BMI, where imputed BMI/5 was used as a covariate for all models except the BMI model.
c. Study- and sex-specific quartiles of pack-years in current and former smokers.
d. Microsatellite stable or low microsatellite instability tumors.
Discussion
In this consortium study of CRC, most epidemiologic factors relevant to gut dysbiosis were not differentially associated with CRC risk or CRC-specific survival by tumor SBS88 status, and none would be significantly differentially associated with SBS88-positive or SBS88-negative CRC after accounting for multiple comparisons. We found a suggestive difference whereby higher fruit intake was more strongly associated with lower risk of SBS88-positive CRC than of SBS88-negative CRC. Among cohort studies, which have a longer time between exposure assessment and diagnosis, associations of fruit consumption, BMI, and alcohol consumption with CRC risk differed modestly by SBS88 status. Among SBS88-positive cases, those with BMI ≥30 kg/m2 (versus BMI 18.5–24.9 kg/m2) had more than 3 times the hazard of CRC-specific death, whereas no association was seen among SBS88-negative cases. However, this differential association was driven by tumors that were SBS88+APC−, suggesting that further research on these differential associations is needed, preferably with signature calculation in samples with either whole-genome or whole-exome sequencing available.
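To make the multiple-comparisons point concrete, the sketch below applies a Bonferroni threshold to the multivariable-adjusted P heterogeneity values from Table 3. The source does not state which correction the authors had in mind or how they would define the family of tests; treating the 13 survival exposures as one family and using Bonferroni are assumptions made here purely for illustration, and the function name is hypothetical.

```python
# Multivariable-adjusted P heterogeneity values for CRC-specific survival (Table 3).
p_heterogeneity = {
    "Red meat": 0.745,
    "Processed meat": 0.762,
    "Fiber": 0.721,
    "Total calcium": 0.548,
    "Total folate": 0.666,
    "Fruit": 0.797,
    "Vegetables": 0.142,
    "Alcohol": 0.369,
    "Tobacco smoking": 0.676,
    "BMI": 0.066,
    "Diabetes": 0.909,
    "Aspirin/NSAID use": 0.631,
    "HRT use": 0.632,
}


def bonferroni_significant(p_values, alpha=0.05):
    """Return the tests that pass a Bonferroni-corrected threshold, plus the threshold.

    Assumption: every entry in p_values belongs to the same family of tests.
    """
    threshold = alpha / len(p_values)
    passing = {name: p for name, p in p_values.items() if p < threshold}
    return passing, threshold


significant, threshold = bonferroni_significant(p_heterogeneity)
smallest = min(p_heterogeneity, key=p_heterogeneity.get)

print(f"Bonferroni threshold for {len(p_heterogeneity)} tests: {threshold:.4f}")
print(f"Smallest P heterogeneity: {smallest} ({p_heterogeneity[smallest]})")
print(f"Exposures passing the corrected threshold: {sorted(significant) or 'none'}")
```

Under these assumptions the smallest value (BMI, 0.066) already exceeds the uncorrected 0.05 level, so no exposure passes the stricter corrected threshold for survival.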
Mutational signatures are an exciting area of research in which technological advances now allow investigators to determine which aggregate somatic mutations may be caused by different exposures or mutational mechanisms.24,32 However, potential limitations arise when interpreting mutational signature results. Signatures may be approximations whose determination is affected by the mathematical approach and other features of the study in which they were derived, and a given signature may have multiple sources.24 In the case of SBS88, however, experimental evidence has shown a direct link between SBS88 and colibactin exposure generated by pks+ E. coli.11 Pleguezuelos-Manzano et al. found that human intestinal organoids exposed to pks+ E. coli had a distinct mutational signature absent from those without pks+ exposure, and this signature was also identified in a subset of 5,876 human cancer genomes from two independent cohorts.11 This experimental study showed a direct causal link between colibactin-producing pks+ E. coli and the SBS88 signature, reducing the potential for misclassification of the mutational signature mechanism.
While obesity is generally regarded as a risk factor for CRC, the relationship between BMI and CRC-specific survival is controversial.33 CRC is a heterogeneous disease, which may contribute to these inconsistent findings if the role of body size in CRC survival differs between tumor subtypes. Our large consortium study is the first to find that, among those with SBS88-positive tumors, those with BMI ≥30 kg/m2 may have more than 3 times the hazard of CRC-specific death compared to those with BMI 18.5–24.9 kg/m2. We did not see any association between BMI and CRC-specific survival among SBS88-negative cases. Previous meta-analyses have found conflicting results for the relationship between BMI and colorectal cancer survival. One, published in 2016 and including 18 studies and 60,346 participants, found that those in the highest and lowest BMI categories had higher risk than those in the middle categories.34 Another, published in 2022 and including 16 studies and 55,391 participants, found no significant difference in cancer-specific survival for those in the highest and lowest BMI categories, but found that those with BMI 25.0–29.9 kg/m2 had better cancer-specific survival than cases with BMI 18.5–24.9 kg/m2.35 Further research is needed to understand potential heterogeneity in the impact of adiposity on CRC progression that may be caused by pks+ E. coli as opposed to other mutagenic events.
Using quantitative polymerase chain reaction, Arima et al. found a stronger association of western diet score with risk of CRC containing high levels of pks+ E. coli than with risk of CRC containing low or no pks+ E. coli.15 In contrast, we focused on the presence of a tumor mutational signature believed to result from the influence of pks+ E. coli. Differences in the technical approaches used to characterize the role of pks+ E. coli in CRC may explain the differences in our findings. Unlike this prior study, we did not see a strong differential association between variables describing a westernized diet, such as red or processed meat intake, and CRC risk according to SBS88 status. When Arima et al. specifically examined red meat variables (i.e., the largest component of the western diet score), they did not see a statistically significant differential association by pks+ E. coli level; however, the hazard ratio for processed red meat among pks+ E. coli-high tumors differed in magnitude from those for pks+ E. coli-low or -negative tumors (multivariable-adjusted HRs = 2.34, 0.90, and 1.00, respectively).15 Further studies are needed of which components of a western diet, beyond red meat intake, drive this differential association with CRC risk by pks+ E. coli status.
Previous work by our research group has demonstrated an association between the SBS88 signature and better CRC-specific survival, in addition to other characteristics, among MSS/L tumors (medRxiv 2023.03.10.23287127). SBS88-positive tumors were associated with better survival than SBS88-negative tumors (HR = 0.69, 95% CI 0.52, 0.90) after adjustment for age, sex, study, and stage (medRxiv 2023.03.10.23287127). SBS88-positive tumors were more likely to be located in the distal colon or rectum than the proximal colon, more likely among women, and more likely among younger cases (medRxiv 2023.03.10.23287127). Distal colon CRC has previously been associated with processed meat consumption,36 which may be consistent with the findings of Arima et al.
Our study has several strengths. Notably, we have a large sample of CRC cases with targeted tumor sequencing data, allowing for determination of SBS88 signature status, extensive data on epidemiologic factors, survival information, and a large population of cancer-free controls. One critical limitation may be the timing of measurement of epidemiologic factors when assessing CRC risk, as it is unknown what period of exposure may be related to either subsequent SBS88 signature development or progression of the SBS88 signature to cancer. Lee-Six et al. have previously found that the mutational process underlying the SBS88 signature may occur very early in life, through an extrinsic, local mutagenic event in childhood before the age of 10 years.37 This finding raises the question of which exposure time point is relevant to colibactin (i.e., pks+ E. coli) exposure, and that time point may not be accurately captured in our study. However, this limitation does not affect the survival analysis. There may also be measurement error more generally when using self-reported, harmonized epidemiologic factor data, as not all factors can be reliably measured by interview or questionnaire. As previously mentioned, multiple comparison testing is a limitation of the present study. Due to limited sample size, we were unable to conduct analyses stratified by race and ethnicity or clinical factors. We were limited to tumors with at least five SNVs, due to limitations of targeted sequencing panels,25 which reduced our sample size and statistical power and may lead to misclassification. Our use of a targeted sequencing panel is additionally limited by the selective pressure on the genomic regions included, whereas whole-genome or whole-exome sequencing may define SBS88 with less misclassification.
Lastly, COSMIC (https://cancer.sanger.ac.uk/cosmic/signatures) has published an additional pks+ E. coli colibactin-induced tumor mutational signature of small insertions and deletions (indels, ID), ID18.38 We were unable to conduct analyses stratified by ID18 because of the low number and accuracy of indels detected by our targeted sequencing panel: 84% of tumors had fewer than five indel mutations, which is not suitable for accurate ID signature decomposition (medRxiv 2023.03.10.23287127).25 Future studies using whole-genome or whole-exome sequencing may enable analyses stratified by ID18.
In conclusion, most epidemiologic factors were not differentially associated with risk or survival by SBS88 status in this novel study combining tumor mutational signature, epidemiologic factor, and survival data in a large consortium. Higher BMI may be associated with worse CRC-specific survival among those with SBS88-positive tumors; however, validation is needed in samples where the SBS88 signature is calculated from whole-genome or whole-exome sequencing data. BMI, alcohol, and dietary factors may be differentially associated with CRC risk based on SBS88 status, although further investigation of exposure timing, with reduced misclassification of the SBS88 signature, is warranted.
Impact:
This study highlights the importance of identification of tumor phenotypes related to CRC and understanding potential heterogeneity for risk and survival.
Acknowledgements:
Disclaimer: Where authors are identified as personnel of the International Agency for Research on Cancer/World Health Organization, the authors alone are responsible for the views expressed in this article and they do not necessarily represent the decisions, policy or views of the International Agency for Research on Cancer/World Health Organization.
CCFR: The Colon CFR graciously thanks the generous contributions of their study participants, dedication of study staff, and the financial support from the U.S. National Cancer Institute, without which this important registry would not exist. The authors would like to thank the study participants and staff of the Seattle Colon Cancer Family Registry and the Hormones and Colon Cancer study (CORE Studies).
CORSA: We kindly thank all individuals who agreed to participate in the CORSA study. Furthermore, we thank all cooperating physicians and students and the Biobank Graz of the Medical University of Graz.
CPS-II: The authors express sincere appreciation to all Cancer Prevention Study-II participants, and to each member of the study and biospecimen management group. The authors would like to acknowledge the contribution to this study from central cancer registries supported through the Centers for Disease Control and Prevention’s National Program of Cancer Registries and cancer registries supported by the National Cancer Institute’s Surveillance Epidemiology and End Results Program. The authors assume full responsibility for all analyses and interpretation of results. The views expressed here are those of the authors and do not necessarily represent the American Cancer Society or the American Cancer Society – Cancer Action Network.
DACHS: We thank all participants and cooperating clinicians, and everyone who provided excellent technical assistance.
EPIC: Where authors are identified as personnel of the International Agency for Research on Cancer/World Health Organization, the authors alone are responsible for the views expressed in this article and they do not necessarily represent the decisions, policy or views of the International Agency for Research on Cancer/World Health Organization.
Nurses’ Health Study I and II, Health Professionals Follow up Study: The study protocol was approved by the institutional review boards of the Brigham and Women’s Hospital and Harvard T.H. Chan School of Public Health, and those of participating registries as required. We acknowledge Channing Division of Network Medicine, Department of Medicine, Brigham and Women’s Hospital as home of the NHS. The authors would like to acknowledge the contribution to this study from central cancer registries supported through the Centers for Disease Control and Prevention’s National Program of Cancer Registries (NPCR) and/or the National Cancer Institute’s Surveillance, Epidemiology, and End Results (SEER) Program. Central registries may also be supported by state agencies, universities, and cancer centers. Participating central cancer registries include the following: Alabama, Alaska, Arizona, Arkansas, California, Colorado, Connecticut, Delaware, Florida, Georgia, Hawaii, Idaho, Indiana, Iowa, Kentucky, Louisiana, Massachusetts, Maine, Maryland, Michigan, Mississippi, Montana, Nebraska, Nevada, New Hampshire, New Jersey, New Mexico, New York, North Carolina, North Dakota, Ohio, Oklahoma, Oregon, Pennsylvania, Puerto Rico, Rhode Island, Seattle SEER Registry, South Carolina, Tennessee, Texas, Utah, Virginia, West Virginia, Wyoming. The authors assume full responsibility for analyses and interpretation of these data.
PLCO: The authors thank the PLCO Cancer Screening Trial screening center investigators and the staff from Information Management Services Inc and Westat Inc. Most importantly, we thank the study participants for their contributions that made this study possible. Cancer incidence data have been provided by the District of Columbia Cancer Registry, Georgia Cancer Registry, Hawaii Cancer Registry, Minnesota Cancer Surveillance System, Missouri Cancer Registry, Nevada Central Cancer Registry, Pennsylvania Cancer Registry, Texas Cancer Registry, Virginia Cancer Registry, and Wisconsin Cancer Reporting System. All are supported in part by funds from the Center for Disease Control and Prevention, National Program for Central Registries, local states or by the National Cancer Institute, Surveillance, Epidemiology, and End Results program. The results reported here and the conclusions derived are the sole responsibility of the authors.
WHI: The authors thank the WHI investigators and staff for their dedication, and the study participants for making the program possible. A full listing of WHI investigators can be found at: http://www.whi.org/researchers/Documents%20%20Write%20a%20Paper/WHI%20Investigator%20Short%20List.pdf
Funding:
Genetics and Epidemiology of Colorectal Cancer Consortium (GECCO): National Cancer Institute, National Institutes of Health, U.S. Department of Health and Human Services (U01 CA137088, R01 CA059045, U01 CA164930, R21 CA191312, R01 CA244588, R01 CA206279, R01 CA201407, R01 CA488857, P20 CA252733). Genotyping/Sequencing services were provided by the Center for Inherited Disease Research (CIDR) contract number HHSN268201700006I and HHSN268201200008I. This research was funded in part through the NIH/NCI Cancer Center Support Grant P30 CA015704. Scientific Computing Infrastructure at Fred Hutch funded by ORIP grant S10OD028685. C.E. Thomas is supported by L70 CA284301, T32 CA094880, and T32 CA009168.
The Colon Cancer Family Registry (CCFR, www.coloncfr.org) is supported in part by funding from the National Cancer Institute (NCI), National Institutes of Health (NIH) (award U01 CA167551). Support for case ascertainment was provided in part from the Surveillance, Epidemiology, and End Results (SEER) Program and the following U.S. state cancer registries: AZ, CO, MN, NC, NH; and by the Victoria Cancer Registry (Australia) and Ontario Cancer Registry (Canada). The content of this manuscript does not necessarily reflect the views or policies of the NCI, NIH or any of the collaborating centers in the Colon Cancer Family Registry (CCFR), nor does mention of trade names, commercial products, or organizations imply endorsement by the US Government, any cancer registry, or the CCFR.
CORSA: The CORSA study was funded by Austrian Research Funding Agency (FFG) BRIDGE (grant 829675, to Andrea Gsur), the “Herzfelder’sche Familienstiftung” (grant to Andrea Gsur) and was supported by COST Action BM1206.
CPS-II: The American Cancer Society funds the creation, maintenance, and updating of the Cancer Prevention Study-II (CPS-II) cohort. The study protocol was approved by the institutional review boards of Emory University, and those of participating registries as required.
CRA: This work was supported by National Institutes of Health grant R01 CA068535.
CRCGEN: Colorectal Cancer Genetics & Genomics, Spanish study was supported by Instituto de Salud Carlos III, co-funded by FEDER funds –a way to build Europe– (grants PI14–613 and PI09–1286), Junta de Castilla y León (grant LE22A10–2), the Spanish Association Against Cancer (AECC) Scientific Foundation grant GCTRA18022MORE and the Consortium for Biomedical Research in Epidemiology and Public Health (CIBERESP), action Genrisk. Sample collection of this work was supported by the Xarxa de Bancs de Tumors de Catalunya sponsored by Pla Director d’Oncología de Catalunya (XBTC), Plataforma Biobancos PT13/0010/0013 and ICOBIOBANC, sponsored by the Catalan Institute of Oncology. We thank CERCA Programme, Generalitat de Catalunya for institutional support.
DACHS: This work was supported by the German Research Council (BR 1704/6–1, BR 1704/6–3, BR 1704/6–4, CH 117/1–1, HO 5117/2–1, HE 5998/2–1, KL 2354/3–1, RO 2270/8–1 and BR 1704/17–1), the Interdisciplinary Research Program of the National Center for Tumor Diseases (NCT), Germany, and the German Federal Ministry of Education and Research (01KH0404, 01ER0814, 01ER0815, 01ER1505A and 01ER1505B).
EPIC: The coordination of EPIC is financially supported by International Agency for Research on Cancer (IARC) and also by the Department of Epidemiology and Biostatistics, School of Public Health, Imperial College London which has additional infrastructure support provided by the NIHR Imperial Biomedical Research Centre (BRC). The national cohorts are supported by: Danish Cancer Society (Denmark); Ligue Contre le Cancer, Institut Gustave Roussy, Mutuelle Générale de l’Education Nationale, Institut National de la Santé et de la Recherche Médicale (INSERM) (France); German Cancer Aid, German Cancer Research Center (DKFZ), German Institute of Human Nutrition Potsdam-Rehbruecke (DIfE), Federal Ministry of Education and Research (BMBF) (Germany); Associazione Italiana per la Ricerca sul Cancro-AIRC-Italy, Compagnia di SanPaolo and National Research Council (Italy); Dutch Ministry of Public Health, Welfare and Sports (VWS), Netherlands Cancer Registry (NKR), LK Research Funds, Dutch Prevention Funds, Dutch ZON (Zorg Onderzoek Nederland), World Cancer Research Fund (WCRF), Statistics Netherlands (The Netherlands); Health Research Fund (FIS) - Instituto de Salud Carlos III (ISCIII), Regional Governments of Andalucía, Asturias, Basque Country, Murcia and Navarra, and the Catalan Institute of Oncology - ICO (Spain); Swedish Cancer Society, Swedish Research Council and Region Skåne and Region Västerbotten (Sweden); Cancer Research UK (14136 to EPIC-Norfolk; C8221/A29017 to EPIC-Oxford), Medical Research Council (1000143 to EPIC-Norfolk; MR/M012190/1 to EPIC-Oxford) (United Kingdom).
Harvard cohorts: HPFS is supported by the National Institutes of Health (P01 CA055075, UM1 CA167552, U01 CA167552, R01 CA151993, R35 CA197735, R35 CA253185), NHS by the National Institutes of Health (P01 CA087969, UM1 CA186107, R01 CA137178, R01 CA151993, R35 CA197735, R35 CA253185), and NHS2 (NHSII) by the National Institutes of Health (U01 CA176726, R35 CA197735, R35 CA253185, R37CA246175). M Song is supported by U01CA261961. T Ugai is supported by a grant from the Prevent Cancer Foundation. DA Drew is supported by K01DK120742.
IWHS: This study was supported by NIH grants CA107333 (R01 grant awarded to P.J. Limburg) and HHSN261201000032C (N01 contract awarded to the University of Iowa).
MCCS cohort recruitment was funded by VicHealth and Cancer Council Victoria. The MCCS was further supported by Australian NHMRC grants 509348, 209057, 251553 and 504711 and by infrastructure provided by Cancer Council Victoria. Cases and their vital status were ascertained through the Victorian Cancer Registry (VCR) and the Australian Institute of Health and Welfare (AIHW), including the National Death Index and the Australian Cancer Database. BM Lynch was supported by MCRF18005 from the Victorian Cancer Agency.
PLCO was supported by the Intramural Research Program of the Division of Cancer Epidemiology and Genetics and contracts from the Division of Cancer Prevention, National Cancer Institute, NIH, DHHS. Funding was provided by National Institutes of Health (NIH), Genes, Environment and Health Initiative (GEI) Z01 CP 010200, NIH U01 HG004446, and NIH GEI U01 HG 004438.
WHI: The WHI program is funded by the National Heart, Lung, and Blood Institute, National Institutes of Health, U.S. Department of Health and Human Services through contracts HHSN268201100046C, HHSN268201100001C, HHSN268201100002C, HHSN268201100003C, HHSN268201100004C, and HHSN271201100004C.
Footnotes
Conflict of Interest: Andrew T Chan has no COI related to this manuscript. For research unrelated to this manuscript, he has grant support from Pfizer Inc., Zoe Ltd, and Freenome, and consulting fees from Pfizer Inc., Boehringer Ingelheim, and Bayer Pharma AG. Marios Giannakis has no COI related to this manuscript; unrelated to this manuscript, he has research funding from Janssen. All other authors have nothing to disclose.
References
- 1.Sung H, Ferlay J, Siegel RL, Laversanne M, Soerjomataram I, Jemal A, et al. Global Cancer Statistics 2020: GLOBOCAN Estimates of Incidence and Mortality Worldwide for 36 Cancers in 185 Countries. CA Cancer J Clin. 2021;71(3):209–249. [DOI] [PubMed] [Google Scholar]
- 2.Sánchez-Alcoholado L, Ramos-Molina B, Otero A, Laborda-Illanes A, Ordóñez R, Medina JA, et al. The Role of the Gut Microbiome in Colorectal Cancer Development and Therapy Response. Cancers. 2020;12(6):1406. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 3.Rebersek M Gut microbiome and its role in colorectal cancer. BMC Cancer. 2021;21(1):1325. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 4.Dziubańska-Kusibab PJ, Berger H, Battistini F, Bouwman BAM, Iftekhar A, Katainen R, et al. Colibactin DNA-damage signature indicates mutational impact in colorectal cancer. Nat Med. 2020;26(7):1063–1069. [DOI] [PubMed] [Google Scholar]
- 5.Bossuet-Greif N, Vignard J, Taieb F, Mirey G, Dubois D, Petit C, et al. The Colibactin Genotoxin Generates DNA Interstrand Cross-Links in Infected Cells. mBio. 2018;9(2):e02393–17. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 6. Iftekhar A, Berger H, Bouznad N, Heuberger J, Boccellato F, Dobrindt U, et al. Genomic aberrations after short-term exposure to colibactin-producing E. coli transform primary colon epithelial cells. Nat Commun. 2021;12(1):1003.
- 7. Putze J, Hennequin C, Nougayrède JP, Zhang W, Homburg S, Karch H, et al. Genetic structure and distribution of the colibactin genomic island among members of the family Enterobacteriaceae. Infect Immun. 2009;77(11):4696–4703.
- 8. Strakova N, Korena K, Karpiskova R. Klebsiella pneumoniae producing bacterial toxin colibactin as a risk of colorectal cancer development - A systematic review. Toxicon Off J Int Soc Toxinology. 2021;197:126–135.
- 9. Dejea CM, Fathi P, Craig JM, Boleij A, Taddese R, Geis AL, et al. Patients with familial adenomatous polyposis harbor colonic biofilms containing tumorigenic bacteria. Science. 2018;359(6375):592–597.
- 10. Arthur JC, Perez-Chanona E, Mühlbauer M, Tomkovich S, Uronis JM, Fan TJ, et al. Intestinal inflammation targets cancer-inducing activity of the microbiota. Science. 2012;338(6103):120–123.
- 11. Pleguezuelos-Manzano C, Puschhof J, Rosendahl Huber A, van Hoeck A, Wood HM, Nomburg J, et al. Mutational signature in colorectal cancer caused by genotoxic pks+ E. coli. Nature. 2020;580(7802):269–273.
- 12. Buc E, Dubois D, Sauvanet P, Raisch J, Delmas J, Darfeuille-Michaud A, et al. High Prevalence of Mucosa-Associated E. coli Producing Cyclomodulin and Genotoxin in Colon Cancer. PLOS ONE. 2013;8(2):e56964.
- 13. Hidaka A, Harrison TA, Cao Y, Sakoda LC, Barfield R, Giannakis M, et al. Intake of Dietary Fruit, Vegetables, and Fiber and Risk of Colorectal Cancer According to Molecular Subtypes: A Pooled Analysis of 9 Studies. Cancer Res. 2020;80(20):4578–4590.
- 14. Wang X, Amitay E, Harrison TA, Banbury BL, Berndt SI, Brenner H, et al. Association Between Smoking and Molecular Subtypes of Colorectal Cancer. JNCI Cancer Spectr. 2021;5(4):pkab056.
- 15. Arima K, Zhong R, Ugai T, Zhao M, Haruki K, Akimoto N, et al. Western-Style Diet, pks Island-Carrying Escherichia coli, and Colorectal Cancer: Analyses From Two Large Prospective Cohort Studies. Gastroenterology. 2022;163(4):862–874.
- 16. Levy M, Kolodziejczyk AA, Thaiss CA, Elinav E. Dysbiosis and the immune system. Nat Rev Immunol. 2017;17(4):219–232.
- 17. Terlouw D, Suerink M, Boot A, van Wezel T, Nielsen M, Morreau H. Recurrent APC Splice Variant c.835-8A>G in Patients With Unexplained Colorectal Polyposis Fulfilling the Colibactin Mutational Signature. Gastroenterology. 2020;159(4):1612–1614.e5.
- 18. Chen B, Ramazzotti D, Heide T, Spiteri I, Fernandez-Mateos J, James C, et al. Contribution of pks+ E. coli mutations to colorectal carcinogenesis. Nat Commun. 2023;14(1):7827.
- 19. Fernandez-Rozadilla C, Timofeeva M, Chen Z, Law P, Thomas M, Schmit S, et al. Deciphering colorectal cancer genetics through multi-omic analysis of 100,204 cases and 154,587 controls of European and east Asian ancestries. Nat Genet. 2023;55(1):89–99.
- 20. Huyghe JR, Bien SA, Harrison TA, Kang HM, Chen S, Schmit SL, et al. Discovery of common and rare genetic risk variants for colorectal cancer. Nat Genet. 2019;51(1):76–87.
- 21. Newcomb PA, Baron J, Cotterchio M, Gallinger S, Grove J, Haile R, et al. Colon Cancer Family Registry: an international resource for studies of the genetic epidemiology of colon cancer. Cancer Epidemiol Biomark Prev Publ Am Assoc Cancer Res Cosponsored Am Soc Prev Oncol. 2007;16(11):2331–2343.
- 22. Zaidi SH, Harrison TA, Phipps AI, Steinfelder R, Trinh QM, Qu C, et al. Landscape of somatic single nucleotide variants and indels in colorectal cancer and impact on survival. Nat Commun. 2020;11(1):3644.
- 23. Huang X, Wojtowicz D, Przytycka TM. Detecting presence of mutational signatures in cancer with confidence. Bioinforma Oxf Engl. 2018;34(2):330–337.
- 24. Alexandrov LB, Kim J, Haradhvala NJ, Huang MN, Tian Ng AW, Wu Y, et al. The repertoire of mutational signatures in human cancer. Nature. 2020;578(7793):94–101.
- 25. Gulhan DC, Lee JJK, Melloni GEM, Cortés-Ciriano I, Park PJ. Detecting the mutational signature of homologous recombination deficiency in clinical samples. Nat Genet. 2019;51(5):912–919.
- 26. Salipante SJ, Scroggins SM, Hampel HL, Turner EH, Pritchard CC. Microsatellite instability detection by next generation sequencing. Clin Chem. 2014;60(9):1192–1199.
- 27. Hutter CM, Chang-Claude J, Slattery ML, Pflugeisen BM, Lin Y, Duggan D, et al. Characterization of gene-environment interactions for colorectal cancer susceptibility loci. Cancer Res. 2012;72(8):2036–2044.
- 28. Martinez JE, Kahana DD, Ghuman S, Wilson HP, Wilson J, Kim SCJ, et al. Unhealthy Lifestyle and Gut Dysbiosis: A Better Understanding of the Effects of Poor Diet and Nicotine on the Intestinal Microbiome. Front Endocrinol. 2021;12.
- 29. Carding S, Verbeke K, Vipond DT, Corfe BM, Owen LJ. Dysbiosis of the gut microbiota in disease. Microb Ecol Health Dis. 2015;26:26191.
- 30. Gong J, Hutter CM, Newcomb PA, Ulrich CM, Bien SA, Campbell PT, et al. Genome-Wide Interaction Analyses between Genetic Variants and Alcohol Consumption and Smoking for Risk of Colorectal Cancer. PLoS Genet. 2016;12(10):e1006296.
- 31. McNabb S, Harrison TA, Albanes D, Berndt SI, Brenner H, Caan BJ, et al. Meta-analysis of 16 studies of the association of alcohol with colorectal cancer. Int J Cancer. 2020;146(3):861–873.
- 32. Alexandrov LB, Stratton MR. Mutational signatures: the patterns of somatic mutations hidden in cancer genomes. Curr Opin Genet Dev. 2014;24:52–60.
- 33. Silva A, Faria G, Araújo A, Monteiro MP. Impact of adiposity on staging and prognosis of colorectal cancer. Crit Rev Oncol Hematol. 2020;145:102857.
- 34. Doleman B, Mills KT, Lim S, Zelhart MD, Gagliardi G. Body mass index and colorectal cancer prognosis: a systematic review and meta-analysis. Tech Coloproctology. 2016;20(8):517–535.
- 35. Li Y, Li C, Wu G, Yang W, Wang X, Duan L, et al. The obesity paradox in patients with colorectal cancer: a systematic review and meta-analysis. Nutr Rev. 2022;80(7):1755–1768.
- 36. Wei EK, Colditz GA, Giovannucci EL, Wu K, Glynn RJ, Fuchs CS, et al. A Comprehensive Model of Colorectal Cancer by Risk Factor Status and Subsite Using Data From the Nurses’ Health Study. Am J Epidemiol. 2017;185(3):224–237.
- 37. Lee-Six H, Olafsson S, Ellis P, Osborne RJ, Sanders MA, Moore L, et al. The landscape of somatic mutation in normal colorectal epithelial cells. Nature. 2019;574(7779):532–537.
- 38. Boot A, Ng AWT, Chong FT, Ho SC, Yu W, Tan DSW, et al. Characterization of colibactin-associated mutational signature in an Asian oral squamous cell carcinoma and in other mucosal tumor types. Genome Res. 2020;30(6):803–813.
Data Availability Statement
The original panel-sequenced data used in this study are available at the database of Genotypes and Phenotypes (dbGaP). The Ontario Institute for Cancer Research (OICR) data are available under accession code phs002050.v1.p1. The Center for Inherited Disease Research (CIDR) data are available under accession code phs001905.v1.p1. Mutational signature definitions were downloaded from the COSMIC website at https://cancer.sanger.ac.uk/signatures/downloads/.
