Summary
Background
Previous studies have shown three DNA methylation (DNAm) based algorithms of aging, DNAm PhenoAge acceleration (AgeAccelPheno), DNAm GrimAge acceleration (AgeAccelGrim), and the mortality risk score (MRscore), to be strong predictors of mortality and aging-related outcomes. We aimed to investigate if and to what extent these algorithms predict cancer.
Methods
In four subsets (n = 727, 1003, 910, and 412) of a population-based cohort from Germany, DNA methylation in whole blood was measured using the Infinium Methylation EPIC BeadChip kit or the Infinium HumanMethylation450K BeadChip Assay (Illumina Inc., San Diego, CA, USA). AgeAccelPheno, AgeAccelGrim, and a revised MRscore based on 8 CpGs only (MRscore-8CpGs) were calculated. Hazard ratios (HRs) were calculated to assess associations of the three DNAm algorithms with total cancer risk and with risk of invasive breast, lung, prostate, and colorectal cancer.
Findings
During 17 years of follow-up, a total of 697 malignant tumors (including breast cancer = 75, lung cancer = 146, prostate cancer = 114, colorectal cancer = 155) were observed. All three algorithms showed strong positive associations with lung cancer risk in a dose-response manner, with adjusted HRs (95% CI) per SD increase in AgeAccelPheno, AgeAccelGrim, and MRscore-8CpGs of 1·37 (1·03-1·82), 1·74 (1·11-2·73), and 2·06 (1·39-3·06), respectively. By contrast, strong inverse associations were seen with breast cancer risk [adjusted HRs 0·65 (0·49-0·86), 0·45 (0·25-0·80), and 0·42 (0·25-0·70), respectively]. A weak positive association of MRscore-8CpGs was seen with total cancer risk.
Interpretation
The DNAm algorithms, particularly the MRscore-8CpGs, have potential to contribute to site-specific cancer risk prediction.
Funding
The ESTHER study was funded by grants from the Baden-Württemberg state Ministry of Science, Research and Arts (Stuttgart, Germany), the Federal Ministry of Education and Research (Berlin, Germany), the Federal Ministry of Family Affairs, Senior Citizens, Women and Youth (Berlin, Germany), and the Saarland State Ministry of Health, Social Affairs, Women and the Family (Saarbrücken, Germany). The work of Xiangwei Li was supported by a grant from Fondazione Cariplo (Bando Ricerca Malattie invecchiamento, #2017-0653).
Keywords: DNA methylation, Epigenetic clock, Age acceleration, Cancer risk
Research in context.
Evidence before the study
Recently, three DNA methylation based algorithms, DNA methylation PhenoAge acceleration, DNA methylation GrimAge acceleration, and the mortality risk score, were developed and shown to be strongly associated with mortality and incidence of various aging-related diseases. However, only very few recent studies have investigated the associations of these algorithms with cancer risk, and the results were inconsistent.
Added value of the study
In this study, based on four subsets from a population-based cohort study from Germany (n = 727, 1003, 910, and 412, respectively), we aimed to investigate if and to what extent these algorithms predict cancer. All three algorithms showed strong positive associations with lung cancer risk in a clear dose response manner. By contrast, strong inverse associations of the three algorithms were seen with breast cancer risk. No associations were seen with risk of colorectal cancer and prostate cancer.
Implications of all the available evidence
Our study corroborates and expands evidence on potential associations between DNA methylation based algorithms and total cancer risk. At the same time it discloses strong variation of associations with specific types of cancer. Further research should address potential implications of the strong positive association with lung cancer risk for lung cancer risk stratification, e.g. in selecting high risk people for lung cancer screening. The intriguing finding of a strong inverse association with breast cancer risk, potential associations with other types of cancer and the underlying mechanisms likewise require clarification by further research.
Introduction
Cancer ranks as a leading cause of death worldwide, accounting for nearly 10 million deaths in 2020.1 With a rapidly growing burden of incidence and mortality, cancer is becoming a key barrier to increasing life expectancy in every country of the world.1, 2, 3 In the past two decades numerous studies have evaluated the potential of genetic factors to predict cancer risk, and polygenic risk scores are now established for multiple types of cancer.4, 5, 6, 7 Nevertheless, their contribution to cancer risk prediction remains limited for most cancers, and their use for cancer prevention is mostly limited to risk stratification (e.g. for cancer screening) given their non-modifiable nature.8,9 An alternative approach to cancer risk prediction may be algorithms based on DNA methylation (DNAm), which can be modified by environmental and lifestyle factors.10,11
Recently, three novel DNAm based algorithms of aging, DNAm phenotypic age (PhenoAge),12 DNAm GrimAge (GrimAge),13 and the mortality risk score (MRscore),14 were proposed based on methylation at 513, 1030, and 10 CpGs in DNA from whole blood samples, respectively. PhenoAge and GrimAge, which are second-generation epigenetic age clocks, were derived from DNAm surrogates for several factors related to age, plasma proteins, and smoking pack-years.12,13 MRscore is a linear combination of ten CpG sites that were identified to be most robustly related to all-cause mortality in an epigenome-wide association study with thorough internal and external validation.14 The residuals from the regression of PhenoAge and GrimAge on chronological age, termed AgeAccelPheno and AgeAccelGrim, together with MRscore were shown to be strongly associated with all-cause mortality and cancer-specific mortality across various study populations.15, 16, 17, 18 These DNAm algorithms were also reported to be strongly associated with the incidence of various aging-related diseases, including diabetes mellitus, myocardial infarction, and stroke.19,20 With the inclusion of DNAm surrogates for various risk factors related to aetiology of multiple cancers such as smoking, the three algorithms are hypothesized to be associated with incidence of cancer, in particular lung cancer. This particularly applies to MRscore whose 10 CpGs include smoking associated ones. However, only very few recently published studies have investigated the associations of these algorithms with cancer risk (i.e. overall-, lung-, prostate-, and colorectal cancer risk) and the results were inconsistent.20,21 It is therefore unclear whether and to what extent these algorithms are predictive of cancer risk, and comparative validation of the associations in population-based cohort studies is essential.
We aimed to investigate and compare the associations of AgeAccelPheno, AgeAccelGrim, and MRscore with the overall risk of cancer and risk of specific common cancers (breast-, lung-, prostate-, and colorectal cancer) in a population-based cohort of older adults (50-75 years at baseline) from Germany.
Methods
Study population and data collection
Our analysis is based on the ESTHER study, a large ongoing prospective, population-based cohort study conducted in Germany. Details of the study design and population have been described elsewhere.14,22,23 In brief, 9940 participants (50-75 years of age at baseline) were recruited by their general practitioners (GPs) during routine health checkups between July 2000 and December 2002, and followed up thereafter. Sociodemographic characteristics, health characteristics, lifestyle habits, and history of major diseases were collected using standardized self-administered questionnaires. Comprehensive medical data, including diagnoses of major diseases and drug prescriptions, were collected from the GPs’ records. Information on self-reported smoking at baseline was validated using serum cotinine measurements in a subgroup of 1500 study participants and was found to be highly accurate.24 Peripheral blood samples were collected at recruitment and stored at -80°C for later testing.
For the current analysis, four independent subsets were used, which had previously been selected for genome-wide DNAm measurements in various projects.25, 26, 27 Subsets I and II consist of 741 and 1030 subjects who were randomly selected from the ESTHER study. Subset III includes the first 500 men and 500 women consecutively enrolled during the first 6 months of recruitment (recruited between July and October 2000). Subset IV has a nested case-control design for cancer-related methylation signatures and included 266 participants with incident malignant cancer (lung cancer cases=116, prostate cancer cases=23, colorectal cancer cases=129) and 205 randomly selected participants among those free of malignant tumors. To prevent potential bias resulting from DNAm algorithm changes caused by cancer, cases of cancer that were diagnosed in the first two years after enrollment were excluded. Ultimately, 727, 1003, 910, and 412 participants of subsets I, II, III, and IV were included in our current study.
The four subsets in the current study were independent of and not overlapping with the subsample of the ESTHER cohort from which the MRscore had been derived in previous research.14
Ethics
All participants provided written informed consent. The ESTHER study was approved by the ethics committees of the medical faculty of the University of Heidelberg and the medical board of the state of Saarland.
Methylation assessment
DNAm levels of subsets I and II were assessed with the Infinium Methylation EPIC BeadChip kit (EPIC, Illumina Inc., San Diego, CA, USA), and DNAm profiles of subsets III and IV were determined with the Infinium HumanMethylation450K BeadChip Assay (450K, Illumina Inc., San Diego, CA, USA). As previously described,14,25,27 the assays were conducted following the manufacturer's instructions by the Genomics and Proteomics Core Facility at the German Cancer Research Center, Heidelberg, Germany (DKFZ). In data pre-processing, signals of probes with detection P-value >0·01, probes with >10% missing values, and probes targeting the X and Y chromosomes were excluded.14,25,27,28
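The probe-level exclusions described above can be sketched as follows. This is a minimal illustration in Python, not the pipeline used in the study: the function name `filter_probes`, the dictionary-based data layout, and the reading that individual signals with a detection P-value above 0·01 are set to missing before the 10% missingness rule is applied are all assumptions made for the example.

```python
def filter_probes(beta, detection_p, chromosome, p_cutoff=0.01, max_missing=0.10):
    """Apply three probe-level exclusions to a methylation data set.

    beta, detection_p: dict mapping probe ID -> list of per-sample values
    chromosome:        dict mapping probe ID -> chromosome label (e.g. "1", "X")
    All names and the data layout are hypothetical.
    """
    kept = {}
    for probe, values in beta.items():
        # Exclude probes targeting the sex chromosomes.
        if chromosome[probe] in {"X", "Y"}:
            continue
        # Assumption: a signal with detection P-value above the cutoff is set to missing.
        masked = [
            value if (value is not None and p <= p_cutoff) else None
            for value, p in zip(values, detection_p[probe])
        ]
        # Exclude probes with more than max_missing missing values.
        missing_fraction = sum(v is None for v in masked) / len(masked)
        if missing_fraction > max_missing:
            continue
        kept[probe] = masked
    return kept
```

A probe with exactly 10% missing signals is retained under this reading, because the stated rule excludes only probes with more than 10% missing values.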
Calculation of DNAm aging algorithms
AgeAccelPheno and AgeAccelGrim, which are the residuals from regression models of PhenoAge and GrimAge estimates on chronological age, were calculated using Horvath's online tool available at https://dnamage.genetics.ucla.edu/new.12,13
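The definition of age acceleration as a regression residual can be illustrated with a short Python sketch. It is not the online tool's implementation; the function name `age_acceleration` and the use of a simple linear regression with a single predictor are assumptions made for the example.

```python
def age_acceleration(epigenetic_age, chronological_age):
    """Residuals from a simple linear regression of epigenetic age on chronological age.

    A positive residual means the epigenetic age estimate is higher than
    expected for the person's chronological age. Hypothetical helper.
    """
    n = len(chronological_age)
    mean_x = sum(chronological_age) / n
    mean_y = sum(epigenetic_age) / n
    # Ordinary least-squares slope and intercept.
    sxx = sum((x - mean_x) ** 2 for x in chronological_age)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(chronological_age, epigenetic_age))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual = observed epigenetic age minus the age-predicted value.
    return [y - (intercept + slope * x)
            for x, y in zip(chronological_age, epigenetic_age)]
```

By construction the residuals average zero and are uncorrelated with chronological age within the sample in which the regression is fitted, which is why models using them can still adjust for age separately.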
The original MRscore was derived from the Infinium HumanMethylation450K BeadChip Assay (450K, Illumina Inc., San Diego, CA, USA) and constructed as the sum of the methylation β values of ten CpGs, each multiplied by its regression coefficient.14 Because two of the ten CpGs (cg01612140 and cg23665802) are not included on the EPIC array, we derived a new MRscore algorithm, the MRscore-8CpGs.27 The MRscore-8CpGs was obtained by regressing the remaining eight CpGs on the original MRscore in a third subset from the ESTHER study whose DNAm had been measured with both the 450K and the EPIC array (N=111, independent of and not overlapping with subsets I and II of the current study and the ESTHER subsample from which the MRscore had initially been derived):
The original MRscore and MRscore-8CpGs were highly correlated, with Spearman correlation coefficients of 0·9716 and 0·9771 in subsets III and IV, respectively. Associations of the original MRscore and MRscore-8CpGs with total and site-specific cancer risk in subsets III and IV were highly consistent (Supplementary Table 1).
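The general form of such a score (a weighted sum of β values) and the rank correlation used to compare two versions of a score can be sketched in Python as below. The CpG identifiers and weights in the usage note are placeholders, not the published coefficients, and the helper names are hypothetical.

```python
def methylation_score(betas, weights):
    """Weighted sum of methylation beta values; weights maps CpG ID -> coefficient."""
    return sum(weights[cpg] * betas[cpg] for cpg in weights)


def average_ranks(values):
    """Ranks starting at 1, with tied values sharing the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))
```

For example, `methylation_score({"cpg_1": 0.5, "cpg_2": 0.2}, {"cpg_1": 2.0, "cpg_2": -1.0})` multiplies each placeholder β value by its placeholder weight and adds the products; applying `spearman` to the full and reduced scores of the same participants gives the kind of agreement measure reported above.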
Ascertainment of incident cancer cases
Incident cases of cancer, including total cancer (ICD-10 codes C00–C97 excluding the code C44 for non-melanoma skin cancer), breast cancer (ICD-10 code C50), lung cancer (ICD-10 code C34), prostate cancer (ICD-10 code C61), and colorectal cancer (ICD-10 codes C18-C20), during follow-up between 2000 and the end of 2018 were identified through record linkage with the Saarland Cancer Registry, which has been shown to ascertain virtually all cancer diseases in the underlying population (≥95% in 2015/2016).
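The endpoint definitions above amount to a simple mapping from ICD-10 codes to outcome groups, sketched here in Python. The function name `classify_icd10`, the label strings, and the accepted code format (a letter C, two digits, and an optional decimal subcode) are assumptions for illustration; the sketch does not restrict breast cancer to women, which the analysis itself does.

```python
import re


def classify_icd10(code):
    """Return the set of endpoint labels that an ICD-10 code contributes to."""
    match = re.fullmatch(r"C(\d{2})(?:\.\d+)?", code.strip().upper())
    if match is None:
        return set()  # not a C00-C99 malignant neoplasm code
    site = int(match.group(1))
    labels = set()
    # Total cancer: C00-C97, excluding C44 (non-melanoma skin cancer).
    if site <= 97 and site != 44:
        labels.add("total")
    if site == 50:
        labels.add("breast")
    if site == 34:
        labels.add("lung")
    if site == 61:
        labels.add("prostate")
    if 18 <= site <= 20:
        labels.add("colorectal")
    return labels
```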
Statistical analysis
Standard descriptive methods were used to describe demographic characteristics of the study subjects at baseline. Associations of DNAm algorithms with cancer risk in subsets I, II, and III were estimated using Cox proportional hazards models. In subset IV, the associations were calculated using weighted Cox regression models that account for the case-control sampling design.29 The models were first adjusted for age, sex, leukocyte composition (estimated by the Houseman approach),30 and batch (Model 1), and we additionally controlled for educational level (≤9 years, 10-11 years, and ≥12 years), smoking status (never smoker, former smoker, current smoker), alcohol consumption (grams per day), body mass index (kg/m2), and diabetes (yes/no) (Model 2). Models for incident breast cancer were further adjusted for menopausal status (yes/no) and postmenopausal hormone use (yes/no). For lung cancer, additional models were run in which we controlled for smoking pack-years (packs of 20 cigarettes smoked per day multiplied by the number of years smoked) rather than smoking status. The proportional hazards assumption was checked by plots of scaled Schoenfeld residuals.31 Hazard ratios (HR) with corresponding 95% confidence intervals (CI) per standard deviation (SD) increase in DNAm algorithms were calculated for total cancer risk and risk of selected common cancers (breast, lung, prostate, colon and rectum). Due to the case-control study design of subset IV, the associations of the DNAm algorithms with total incident cancer were estimated in subsets I, II, and III only. When assessing the associations with risk of specific invasive cancers in subset IV, only the targeted cancer cases and controls were included in the analyses. Furthermore, we conducted subgroup analyses for the associations of the DNAm algorithms with total incident cancer by sex and age (50-64 years / 65-75 years), and we tested for statistical significance of interactions with these two factors.
To assess potential dose-response relationships of DNAm algorithms with cancer risk, we also ran Cox regression models including DNAm algorithms as categorical variables, using Model 2 for adjustment as described above. In subsets I, II, and III, the three DNAm algorithms were classified according to quartiles and the lowest quartile served as the reference category. In subset IV, these algorithms were categorized according to quartiles among controls. P-values for trend were derived from tests using the median of the DNAm algorithm within each category.
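The categorization and the trend variable described above can be sketched in Python as follows. This is an illustration only: the function names are hypothetical, the cut points come from `statistics.quantiles` with its default method (which may differ slightly from the quantile definition used in the actual analysis), and values equal to a cut point are placed in the lower category by assumption.

```python
from statistics import median, quantiles


def quartile_cutpoints(reference_values):
    """Three cut points splitting the reference distribution into quartiles.

    For a case-control subset, the reference values would be those of controls.
    """
    return quantiles(reference_values, n=4)


def assign_quartile(value, cutpoints):
    """Quartile number from 1 (lowest) to 4 (highest)."""
    return 1 + sum(value > cut for cut in cutpoints)


def trend_scores(values, cutpoints):
    """Replace each value by the median of its quartile, for a test of linear trend."""
    categories = [assign_quartile(v, cutpoints) for v in values]
    medians = {
        q: median(v for v, c in zip(values, categories) if c == q)
        for q in set(categories)
    }
    return [medians[c] for c in categories]
```

Entering the resulting trend score as a single continuous covariate in the regression model, in place of the quartile indicators, yields the P-value for trend.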
Because the DNAm profiles in the four subsets were measured in different time periods and with different batches of DNAm assessment chips, HRs and corresponding 95% CIs were calculated separately for each subset and combined by random-effects meta-analysis.
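The pooling step can be sketched in Python as below. The text does not state which between-study variance estimator was used; the sketch assumes the common DerSimonian-Laird estimator and recovers standard errors from the reported 95% CIs on the log scale. Function names are hypothetical, and any input values are for illustration rather than subset-specific results from the study.

```python
import math

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% interval


def log_hr_and_se(hr, lower, upper):
    """Log hazard ratio and its standard error, recovered from a 95% CI."""
    return math.log(hr), (math.log(upper) - math.log(lower)) / (2 * Z_95)


def random_effects_pool(estimates):
    """Pool (HR, lower, upper) tuples; returns (HR, lower, upper, tau2).

    Assumes the DerSimonian-Laird estimator of the between-study variance tau2.
    """
    effects, variances = [], []
    for hr, lower, upper in estimates:
        log_hr, se = log_hr_and_se(hr, lower, upper)
        effects.append(log_hr)
        variances.append(se ** 2)
    # Fixed-effect (inverse-variance) estimate, needed for Cochran's Q.
    weights = [1 / v for v in variances]
    fixed = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    c = sum(weights) - sum(w ** 2 for w in weights) / sum(weights)
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    # Random-effects weights add tau2 to each within-study variance.
    re_weights = [1 / (v + tau2) for v in variances]
    pooled = sum(w * y for w, y in zip(re_weights, effects)) / sum(re_weights)
    se = math.sqrt(1 / sum(re_weights))
    return (math.exp(pooled), math.exp(pooled - Z_95 * se),
            math.exp(pooled + Z_95 * se), tau2)
```

When the subset-specific estimates agree closely, tau2 is estimated as zero and the result coincides with a fixed-effect analysis; with heterogeneous estimates the pooled CI widens, which is consistent with the wide intervals seen for some pooled quartile estimates.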
All statistical analyses were carried out using SAS, version 9·4 (SAS Institute, Inc., Cary, NC). Statistical significance was defined by P-values < 0·05 in two-sided testing.
Role of funding source
None of the funders had any role in study design, data collection, data analysis, interpretation, writing of the report, or the decision to publish the study.
Results
Table 1 shows the baseline sociodemographic characteristics of participants by subset. In subset I, II, III, and controls in subset IV, mean ages were approximately 62 years and a slight majority of participants were women. About three out of four participants were overweight or obese, about half had ever smoked, and mean daily alcohol consumption was 9 to 11 g per day in all subsets.
Table 1.
Characteristics | Subset I (N=727) | Subset II (N=1003) | Subset III (N=910) | Subset IV controls (N=205) | Subset IV cases (N=207) | Entire ESTHER cohort (N=9940) |
---|---|---|---|---|---|---|
Age (years; mean ± SD) | 61·6±6·5 | 62·0±6·7 | 61·9±6·5 | 62·5±6·4 | 63·2±6·0 | 62·1±6·6 |
Sex (N/%) | ||||||
Men | 320 (44·0) | 435 (43·4) | 448 (49·2) | 88 (42·9) | 132 (63·8) | 4478 (45·0) |
Women | 407 (56·0) | 568 (56·6) | 462 (50·8) | 117 (57·0) | 75 (36·2) | 5462 (55·0) |
Educational levels (N/%)a | ||||||
Low (≤9 years) | 519 (73·5) | 748 (76·3) | 672 (73·9) | 149 (74·5) | 162 (78·3) | 7235 (74·7) |
Intermediate (10-11 years) | 112 (15·9) | 131 (13·4) | 141 (15·5) | 34 (17·0) | 23 (11·1) | 1372 (14·2) |
High (≥12 years) | 75 (10·6) | 101 (10·3) | 97 (10·7) | 17 (8·5) | 19 (9·2) | 1081 (11·2) |
Body mass index (N/%)b | ||||||
Normal weight (<25·0 kg/m2) | 188 (25·9) | 269 (26·9) | 231 (25·4) | 64 (31·2) | 49 (23·7) | 2724 (27·5) |
Overweight (25·0-<30·0 kg/m2) | 346 (47·7) | 475 (47·5) | 437 (48) | 100 (48·8) | 97 (46·9) | 4675 (47·1) |
Obesity (≥30·0 kg/m2) | 192 (26·5) | 256 (25·6) | 242 (26·6) | 41 (20·0) | 61 (29·5) | 2525 (25·4) |
Smoking status (N/%)c | ||||||
Never smoker | 354 (48·7) | 520 (51·8) | 445 (48·9) | 96 (49·5) | 63 (30·4) | 4832 (50·0) |
Former smoker | 241 (33·2) | 318 (31·7) | 295 (32·4) | 61 (31·4) | 82 (39·6) | 3185 (33·0) |
Current smoker | 132 (18·2) | 165 (16·5) | 170 (18·7) | 37 (19·1) | 60 (29·0) | 1649 (17·0) |
Diabetesd | ||||||
Yes | 608 (85) | 866 (87·4) | 769 (84·5) | 175 (85·8) | 169 (81·6) | 8230 (85·0) |
No | 107 (15) | 125 (12·6) | 141 (15·5) | 29 (14·2) | 35 (16·9) | 1455 (15·0) |
Alcohol consumption (grams per day; mean ± SD) | 9·3±12·8 | 9·8±13 | 9·7±13·7 | 10·4±14·1 | 12·2±17·7 | 9·8±14·0 |
Abbreviations: SD, standard deviation; AgeAccelPheno, DNA methylation phenotypic age acceleration; AgeAccelGrim, DNA methylation GrimAge acceleration; MRscore-8CpGs, revised version of continuous mortality risk score with 8 CpGs.
a Data missing for 21, 23, 8, and 252 participants in subset I, II, IV, and all study population in ESTHER.
b Data missing for 1, 3, and 16 participants in subset I, II, and all study population in ESTHER.
c Data missing for 13 and 274 participants in subset IV and all study population in ESTHER.
d Data missing for 12, 12, 4 and 255 participants in subset I, II, IV, and all study population in ESTHER.
During 17 years of follow-up, a total of 697 cancer cases (invasive tumors of female breast = 75, lung = 146, prostate = 114, colorectum = 155, Supplementary Table 2) were identified in the four subsets.
Table 2 shows the associations of DNAm algorithms with cancer incidence in the overall study population and by type of cancer. Multivariable adjusted HRs of total incident cancer (meta-analysis of subsets I, II, and III) were 1·11 (95% CI = 0·92-1·34), 1·12 (95% CI = 0·96-1·30), and 1·27 (95% CI = 1·04-1·56) per SD increase of AgeAccelPheno, AgeAccelGrim, and MRscore-8CpGs, respectively. Associations were similar for men and women and for younger and older participants (Supplementary Table 3), and none of the tests for interactions between DNAm algorithms and sex or age reached statistical significance (P for interactions > 0·05).
Table 2.
Predictors | Cases | HR (95% CI), Model 1 | HR (95% CI), Model 2 |
---|---|---|---|
Overall incident cancera | |||
AgeAccelPheno (per SD) | 490 | 1·05 (0·96-1·15) | 1·11 (0·92-1·34) |
AgeAccelGrim (per SD) | 490 | 1·15 (1·05-1·26) | 1·12 (0·96-1·30) |
MRscore-8CpGs (per SD) | 490 | 1·28 (1·07-1·52) | 1·27 (1·04-1·56) |
Incident breast cancerb | |||
AgeAccelPheno (per SD) | 75 | 0·73 (0·56-0·94) | 0·65 (0·49-0·86) |
AgeAccelGrim (per SD) | 75 | 0·67 (0·48-0·93) | 0·45 (0·25-0·80) |
MRscore-8CpGs (per SD) | 75 | 0·57 (0·38-0·87) | 0·42 (0·25-0·70) |
Incident lung cancerb | |||
AgeAccelPheno (per SD) | 146 | 1·53 (1·23-1·90) | 1·37 (1·03-1·82) |
AgeAccelGrim (per SD) | 146 | 2·62 (2·21-3·11) | 1·74 (1·11-2·73) |
MRscore-8CpGs (per SD) | 146 | 2·87 (2·36-3·48) | 2·06 (1·39-3·06) |
Incident prostate cancerb | |||
AgeAccelPheno (per SD) | 114 | 1·02 (0·77-1·36) | 0·97 (0·77-1·24) |
AgeAccelGrim (per SD) | 114 | 0·89 (0·72-1·08) | 0·89 (0·69-1·15) |
MRscore-8CpGs (per SD) | 114 | 0·90 (0·64-1·27) | 1·00 (0·73-1·37) |
Incident colorectal cancerb | |||
AgeAccelPheno (per SD) | 155 | 1·01 (0·80-1·27) | 1·24 (0·86-1·78) |
AgeAccelGrim (per SD) | 155 | 0·83 (0·60-1·15) | 1·09 (0·81-1·45) |
MRscore-8CpGs (per SD) | 155 | 0·73 (0·59-0·90) | 1·03 (0·70-1·50) |
Abbreviations: DNAm, DNA methylation; HR, hazard ratio; CI, confidence interval; SD, standard deviation; AgeAccelPheno, DNA methylation phenotypic age acceleration; AgeAccelGrim, DNA methylation GrimAge acceleration; MRscore-8CpGs, revised version of continuous mortality risk score with 8 CpGs.
a Meta-analysis of subset I, II, and III.
b Meta-analysis of all subsets. In subset IV, only targeted cancer cases and controls were included in the analyses.
Model 1, adjusted for age, sex, leukocyte composition, and batch.
Model 2, as Model 1, additionally adjusted for educational level, smoking status (never smoker, former smoker, current smoker), alcohol consumption (grams per day), body mass index (kg/m2), and diabetes. Models for incident breast cancer were additionally adjusted for menopausal status and postmenopausal hormone use.
However, strongly divergent patterns were seen for individual cancers. Strong inverse associations of the three DNAm algorithms with breast cancer risk were seen, with multivariable adjusted HRs of 0·65 (95% CI = 0·49-0·86), 0·45 (95% CI = 0·25-0·80), and 0·42 (95% CI = 0·25-0·70) per SD increase in AgeAccelPheno, AgeAccelGrim, and MRscore-8CpGs, respectively. By contrast, strong positive associations were observed for incident lung cancer, with multivariable adjusted HRs of 1·37 (95% CI = 1·03-1·82), 1·74 (95% CI = 1·11-2·73), and 2·06 (95% CI = 1·39-3·06) per SD increase in AgeAccelPheno, AgeAccelGrim, and MRscore-8CpGs, respectively. When the models were adjusted for smoking pack-years rather than smoking status, the positive associations were highly consistent (Supplementary Table 5). There were no significant associations of any of the three DNAm algorithms with incidence of prostate cancer or colorectal cancer.
Due to the small number of incident breast cancer cases in subset IV, the dose-response analysis of the association of the DNAm algorithms with breast cancer incidence was restricted to subsets I, II, and III (Table 3). MRscore-8CpGs showed a monotonic inverse dose-response relationship with breast cancer risk (P-trend = 0·006), with very low risk for those in the highest quartile (HR = 0·16, 95% CI = 0·04-0·74). No clear dose-response relationship with breast cancer risk was seen for AgeAccelPheno and AgeAccelGrim.
Table 3.
Predictors | Quartiles | Person-years | Cases | HR (95% CI)a | P-trend |
---|---|---|---|---|---|
AgeAccelPheno | Q1 | 5987·4 | 27 | 1·00 (Ref.) | 0·759 |
AgeAccelPheno | Q2 | 5185·7 | 15 | 0·83 (0·17-3·98) | |
AgeAccelPheno | Q3 | 4980·9 | 18 | 0·79 (0·22-2·81) | |
AgeAccelPheno | Q4 | 4398·6 | 15 | 0·48 (0·22-1·04) | |
AgeAccelGrim | Q1 | 8168·7 | 35 | 1·00 (Ref.) | 0·320 |
AgeAccelGrim | Q2 | 6061·8 | 18 | 0·36 (0·10-1·33) | |
AgeAccelGrim | Q3 | 3828·0 | 15 | 0·58 (0·27-1·25) | |
AgeAccelGrim | Q4 | 2494·1 | 7 | 0·12 (0·03-0·47) | |
MRscore-8CpGs | Q1 | 7058·5 | 28 | 1·00 (Ref.) | 0·006 |
MRscore-8CpGs | Q2 | 5820·7 | 25 | 0·80 (0·19-3·34) | |
MRscore-8CpGs | Q3 | 4645·3 | 15 | 0·55 (0·15-1·92) | |
MRscore-8CpGs | Q4 | 3028·1 | 7 | 0·16 (0·04-0·74) | |
Abbreviations: DNAm, DNA methylation; HR, hazard ratio; CI, confidence interval; AgeAccelPheno, DNA methylation phenotypic age acceleration; AgeAccelGrim, DNA methylation GrimAge acceleration; MRscore-8CpGs, revised version of continuous mortality risk score with 8 CpGs.
a Models adjusted for age, leukocyte composition, batch, educational level, smoking status (never smoker, former smoker, current smoker), alcohol consumption (grams per day), body mass index (kg/m2), diabetes, menopausal status, and postmenopausal hormone use.
Table 4 presents results of the dose-response analyses of the association between the DNAm algorithms and lung cancer incidence. Lung cancer incidence increased strongly from the lowest to the highest quartile of AgeAccelPheno, AgeAccelGrim, and MRscore-8CpGs, with P-values for trend of 0·033, 0·003, and 0·003, respectively.
Table 4.
Predictorsa | Quartiles | Person-years | Cases | HR (95% CI)b | P-trend |
---|---|---|---|---|---|
AgeAccelPheno | Q1 | 11068·3 | 22 | 1·00 (Ref.) | 0·033 |
AgeAccelPheno | Q2 | 10937·8 | 25 | 1·36 (0·66-2·82) | |
AgeAccelPheno | Q3 | 11256·6 | 35 | 1·29 (0·56-2·99) | |
AgeAccelPheno | Q4 | 11448·9 | 64 | 1·95 (1·11-3·41) | |
AgeAccelGrim | Q1 | 11589·0 | 10 | 1·00 (Ref.) | 0·003 |
AgeAccelGrim | Q2 | 10926·3 | 15 | 1·76 (0·13-23·54) | |
AgeAccelGrim | Q3 | 11674·8 | 25 | 3·35 (0·58-19·26) | |
AgeAccelGrim | Q4 | 10521·5 | 96 | 5·83 (0·54-63·55) | |
MRscore-8CpGs | Q1 | 10927·1 | 11 | 1·00 (Ref.) | 0·003 |
MRscore-8CpGs | Q2 | 10925·8 | 16 | 0·72 (0·25-2·11) | |
MRscore-8CpGs | Q3 | 11268·1 | 31 | 1·59 (0·27-9·41) | |
MRscore-8CpGs | Q4 | 11590·6 | 88 | 4·15 (1·01-17·09) | |
Abbreviations: DNAm, DNA methylation; HR, hazard ratio; CI, confidence interval; AgeAccelPheno, DNA methylation phenotypic age acceleration; AgeAccelGrim, DNA methylation GrimAge acceleration; MRscore-8CpGs, revised version of continuous mortality risk score with 8 CpGs.
a In subset IV, only lung cancer cases and controls were included in the analyses.
b Models adjusted for age, sex, leukocyte composition, batch, educational level, smoking status (never smoker, former smoker, current smoker), alcohol consumption (grams per day), body mass index (kg/m2), and diabetes.
Dose-response analyses of the associations of the three DNAm algorithms with prostate and colorectal cancer risk are shown in Supplementary Tables 5 and 6, respectively. None of the DNAm algorithms showed a clear dose-response relationship with prostate or colorectal cancer risk.
Discussion
In this population-based prospective cohort study, 697 incident cancer cases were identified during 17 years of follow-up. MRscore-8CpGs was positively associated with total cancer risk. However, large differences were seen between major cancer sites. Whereas a strong positive dose-response relationship was seen between all DNAm algorithms and lung cancer risk, MRscore-8CpGs showed a strong inverse dose-response relationship with breast cancer risk.
Epigenetic events that are related to cancer risk are believed to occur in the early process of cancer development.32 With the constant effects of the epigenetic events on genomic stability and gene expression, these changes might result in carcinogenesis from initiation through progression. MRscore-8CpGs and the second-generation epigenetic clocks, the AgeAccelPheno and AgeAccelGrim, were derived based on a large representation of CpGs related to age, plasma protein levels, smoking pack-years, and key disease and mortality risk factors, which are also involved in the aetiology of many cancers. It therefore appears plausible that these DNAm algorithms might also be associated with cancer risk.
However, only very few recent studies have investigated the associations of epigenetic aging with total cancer risk.20,21 Wang et al.20 conducted a study based on data from the Normative Aging Study and the KORA F4 cohort and reported associations of AgeAccelPheno, AgeAccelGrim and MRscore with total cancer incidence similar to those in our study. However, with only 298 incident cancer cases these associations did not reach statistical significance. The Melbourne Collaborative Cohort found positive associations of AgeAccelPheno and AgeAccelGrim with the total of seven specific types of cancer (total n=2994).21 That study did not assess MRscore or MRscore-8CpGs. In each of the studies, including ours, the associations of the algorithms with total cancer risk persisted after adjustment for multiple sociodemographic and lifestyle factors, which indicates that these algorithms might capture information beyond self-reported adverse environmental and specific lifestyle factors that affect the methylome over the life course.
Like our study, the Melbourne Collaborative Cohort (MCC) also reported associations of AgeAccelPheno and AgeAccelGrim with specific cancers, and found these associations to vary strongly between cancer sites. Although the lists of cancers assessed in that study and in our study overlapped only partially, results for lung, prostate and colorectal cancer were reported in both studies. Despite some differences in study population characteristics (e.g. age range 54-66 years in MCC versus 50-75 years in ESTHER, approximately 70% males in MCC versus 45% in ESTHER), overall results were quite consistent. Strong positive associations of AgeAccelPheno and AgeAccelGrim with lung cancer incidence, and no association or even an inverse association with prostate cancer, were seen in both studies. However, the positive association with colorectal cancer observed in the MCC was not confirmed in our study.
The positive associations of AgeAccelPheno, AgeAccelGrim and MRscore-8CpGs with lung cancer observed in our study were somewhat reduced but nevertheless remained strong after adjusting for either smoking status or pack-years. This suggests that they might be explained only to a limited extent by smoking, whose association with epigenetic changes, including epigenetic aging, has long been established.33,34
Another intriguing finding of our study is the inverse association of the three algorithms with breast cancer risk. A similar inverse association with breast cancer risk had also been reported for age acceleration according to Horvath's clock, a first-generation epigenetic clock.35 By contrast, no association with breast cancer risk was observed in a recent case-cohort study from the United States.36 To what extent this apparent inconsistency may be explained by differences in study populations or other design features remains to be explored in further research. For example, the US cohort exclusively consisted of sisters of patients with breast cancer, and mean follow-up and mean time-to-diagnosis were only 6·0 and 3·9 years. Also, possible mechanisms linking epigenetic age and breast cancer risk are yet to be fully disclosed. Interestingly, a previous study found that earlier menopause, which is inversely associated with breast cancer risk due to reduced estrogen exposure,37 was associated with increased epigenetic age acceleration in blood.38
Of the three DNA methylation algorithms assessed in relation to cancer risk in this study, two (AgeAccelPheno, AgeAccelGrim) were specifically derived to quantify accelerated aging, whereas the third one was originally derived for predicting all-cause mortality. In our study, AgeAccelGrim and MRscore-8CpGs predicted total and individual cancer risks approximately equally well, whereas associations were generally weaker with AgeAccelPheno. Given that methylation quantification is required for a much lower number of CpGs for MRscore-8CpGs (n=8) than for AgeAccelPheno (n=513) and AgeAccelGrim (n=1030), the former might be a particularly economical approach for DNA methylation-based quantification of cancer risk. Apart from cancer risk quantification and stratification, another potential use of the algorithms, to be evaluated in further research, might be as intermediate biomarkers in the assessment of the efficacy of cancer prevention strategies.
A major strength of our study is that it is based on a large population-based cohort study with extensive collection of biospecimens, lifestyle and medical data, and comprehensive long-term prospective follow-up with respect to morbidity and mortality. In particular, record linkage with the Saarland Cancer Registry ensured complete ascertainment of incident cancer diseases. However, several limitations should also be kept in mind. First, despite the overall large sample size of our study, statistical power of analyses and precision of estimates were limited for individual cancer sites due to relatively small numbers of patients with specific cancers. Second, analyses for specific malignant tumors were limited to four common types of cancer. Further research should address and is expected to disclose associations with additional cancers, as has been most recently demonstrated, for example, for pancreatic cancer. Third, our analyses were restricted to measurement of methylation of DNA from blood samples, which is known to differ from methylation patterns in various tissues, in particular the tissues of origin of the specific cancers assessed in our study. Fourth, because two of the ten CpGs used to construct MRscore are missing in the EPIC microarray data, all of our analyses regarding the mortality risk score were based on the MRscore-8CpGs rather than the original MRscore. However, given the very high correlations between both scores, this should not have had any relevant impact on the results.
Despite these limitations, our study corroborates and expands the evidence on associations between DNAm algorithms and site-specific cancer risks, which vary substantially across types of cancer. Further research should address potential implications of the strong positive association with lung cancer risk for lung cancer risk stratification, e.g. in selecting high-risk individuals for lung cancer screening. The intriguing finding of a strong inverse association with breast cancer risk, potential associations with other types of cancer, and the underlying mechanisms likewise require clarification by further research.
Contributors
Conception and design: X. Li, H. Brenner
Development of methodology: X. Li, H. Brenner
Acquisition of data: B. Schöttker, B. Holleczek, H. Brenner
Analysis and interpretation of data: X. Li, H. Brenner
Writing of the manuscript: X. Li, H. Brenner
Critical review and revision of manuscript: all authors
Study supervision: H. Brenner
Declaration of interests
All authors confirm that they had full access to all the data in the study and accept responsibility for the decision to submit for publication. No potential conflicts of interest were disclosed.
Acknowledgments
The authors thank the study participants and their general practitioners as well as the laboratory and administrative staff of the ESTHER study team. The authors gratefully acknowledge the contributions of the DKFZ Genomics and Proteomics Core Facility in processing the DNA samples and performing the laboratory work. The ESTHER study was funded by grants from the Baden-Württemberg state Ministry of Science, Research and Arts (Stuttgart, Germany), the Federal Ministry of Education and Research (Berlin, Germany), the Federal Ministry of Family Affairs, Senior Citizens, Women and Youth (Berlin, Germany), and the Saarland State Ministry of Health, Social Affairs, Women and the Family (Saarbrücken, Germany). The work of Xiangwei Li was supported by a grant from Fondazione Cariplo (Bando Ricerca Malattie invecchiamento, #2017-0653).
Data sharing statement
Data available on request from the authors.
Footnotes
Supplementary material associated with this article can be found in the online version at doi:10.1016/j.ebiom.2022.104083.