Integrative analysis of clinical and epigenetic biomarkers of mortality

Tianxiao Huan; Steve Nguyen; Elena Colicino; Carolina Ochoa‐Rosales; W David Hill; Jennifer A Brody; Mette Soerensen; Yan Zhang; Antoine Baldassari; Mohamed Ahmed Elhadad; Tanaka Toshiko; Yinan Zheng; Arce Domingo‐Relloso; Dong Heon Lee; Jiantao Ma; Chen Yao; Chunyu Liu; Shih‐Jen Hwang; Roby Joehanes; Myriam Fornage; Jan Bressler; Joyce BJ van Meurs; Birgit Debrabant; Jonas Mengel‐From; Jacob Hjelmborg; Kaare Christensen; Pantel Vokonas; Joel Schwartz; Sina A Gahrib; Nona Sotoodehnia; Colleen M Sitlani; Sonja Kunze; Christian Gieger; Annette Peters; Melanie Waldenberger; Ian J Deary; Luigi Ferrucci; Yishu Qu; Philip Greenland; Donald M Lloyd‐Jones; Lifang Hou; Stefania Bandinelli; Trudy Voortman; Brenner Hermann; Andrea Baccarelli; Eric Whitsel; James S Pankow; Daniel Levy

doi:10.1111/acel.13608

. 2022 May 12;21(6):e13608. doi: 10.1111/acel.13608

Integrative analysis of clinical and epigenetic biomarkers of mortality

Tianxiao Huan ^1,^2,^3,^✉, Steve Nguyen ⁴, Elena Colicino ⁵, Carolina Ochoa‐Rosales ^6,⁷, W David Hill ⁸, Jennifer A Brody ⁹, Mette Soerensen ^10,^11,¹², Yan Zhang ¹³, Antoine Baldassari ¹⁴, Mohamed Ahmed Elhadad ^15,^16,¹⁷, Tanaka Toshiko ¹⁸, Yinan Zheng ¹⁹, Arce Domingo‐Relloso ^20,^21,²², Dong Heon Lee ^1,², Jiantao Ma ^1,^2,²³, Chen Yao ^1,², Chunyu Liu ²⁴, Shih‐Jen Hwang ^1,², Roby Joehanes ^1,², Myriam Fornage ²⁵, Jan Bressler ²⁶, Joyce BJ van Meurs ²⁶, Birgit Debrabant ¹⁰, Jonas Mengel‐From ^10,¹², Jacob Hjelmborg ¹⁰, Kaare Christensen ^10,¹², Pantel Vokonas ^27,^28,²⁹, Joel Schwartz ³⁰, Sina A Gahrib ^9,³¹, Nona Sotoodehnia ⁹, Colleen M Sitlani ⁹, Sonja Kunze ^15,¹⁶, Christian Gieger ^15,^16,¹⁷, Annette Peters ^16,^17,^32,³³, Melanie Waldenberger ^15,^16,¹⁷, Ian J Deary ³⁴, Luigi Ferrucci ¹⁸, Yishu Qu ¹⁹, Philip Greenland ¹⁹, Donald M Lloyd‐Jones ¹⁹, Lifang Hou ¹⁹, Stefania Bandinelli ³⁵, Trudy Voortman ⁶, Brenner Hermann ^13,³⁶, Andrea Baccarelli ³⁷, Eric Whitsel ^14,³⁸, James S Pankow ^4,^✉, Daniel Levy ^1,^2,^✉

^¹The Framingham Heart Study, Framingham, Massachusetts, USA

^²The Population Sciences Branch, Division of Intramural Research, National Heart, Lung, and Blood Institute, National Institutes of Health, Bethesda, Maryland, USA

^³, Department of Ophthalmology and Visual Sciences, University of Massachusetts Medical School, Worcester, Massachusetts, USA

^⁴Division of Epidemiology & Community Health, School of Public Health, University of Minnesota, Minneapolis, Minnesota, USA

^⁵, Icahn School of Medicine at Mount Sinai, New York, New York, USA

^⁶Department of Epidemiology, Erasmus University Medical Center, Rotterdam, the Netherlands

^⁷Centro de Vida Saludable de la Universidad de Concepción, Concepción, Chile

^⁸Department of Psychology, Lothian Birth Cohorts, University of Edinburgh, Edinburgh, UK

^⁹, Cardiovascular Health Research Unit, Department of Medicine, University of Washington, Seattle, Washington, USA

^¹⁰, Department of Public Health, Epidemiology, Biostatistics and Biodemography, University of Southern Denmark, Odense C, Denmark

^¹¹, Department of Clinical Biochemistry and Pharmacology, Center for Individualized Medicine in Arterial Diseases, Odense University Hospital, Odense C, Denmark

^¹², Department of Clinical Genetics, Odense University Hospital, Odense C, Denmark

^¹³Division of Clinical Epidemiology & Aging Research, German Cancer Rsrch Ctr (DKFZ), Heidelberg, Germany

^¹⁴Department of Epidemiology, Gillings School of Global Public Health, University of North Carolina, Chapel Hill, North Carolina, USA

^¹⁵Research Unit of Molecular Epidemiology, Helmholtz Zentrum München, German Research Center for Environmental Health, Neuherberg, Germany

^¹⁶Institute of Epidemiology, Helmholtz Zentrum München, German Research Center for Environmental Health, Neuherberg, Germany

^¹⁷German Research Center for Cardiovascular Disease (DZHK), Partner Site Munich Heart Alliance, Munich, Germany

^¹⁸Translational Gerontology Branch, National Institute on Aging, Baltimore, Maryland, USA

^¹⁹, Department of Preventive Medicine, Northwestern University Feinberg School of Medicine, Chicago, Illinois, USA

^²⁰, Department of Chronic Diseases Epidemiology, National Center for Epidemiology, Carlos III Health Institute, Madrid, Spain

^²¹, Department of Environmental Health Sciences, Columbia University Mailman School of Public Health, New York, New York, USA

^²², Department of Statistics and Operations Research, University of Valencia, Valencia, Spain

^²³Nutrition Epidemiology and Data Science, Friedman School of Nutrition Science and Policy, Tufts University, Boston, Massachusetts, USA

^²⁴Department of Biostatistics, Boston University School of Public Health, Boston, Massachusetts, USA

^²⁵Human Genetics Center, School of Public Health, University of Texas Health Science Center at Houston, Houston, Texas, USA

^²⁶Department of Internal Medicine, Erasmus, Rotterdam, the Netherlands

^²⁷Veterans Affairs, Normative Aging Study, Boston, Massachusetts, USA

^²⁸Veterans Affairs, Boston Healthcare System, Boston, Massachusetts, USA

^²⁹Boston University School of Public Health, Boston, Massachusetts, USA

^³⁰Departments of Environmental Health and Epidemiology, Harvard TH Chan School of Public Health, Boston, Massachusetts, USA

^³¹Department of Psychology, Univ of Edinburgh, Edinburgh, UK

^³²German Center for Diabetes Research (DZD), München‐Neuherberg, Neuherberg, Germany

^³³Institute of Medical Information Sciences, Biometry and Epidemiology, Ludwig‐Maximilians‐University, Munich, Germany

^³⁴, Division of Pulmonary, Critical Care and Sleep Medicine, Center for Lung Biology, University of Washington, Seattle, Washington, USA

^³⁵Geriatric Unit, LHATC (Local Health Authorities Tuscany Centre), Florence, Italy

^³⁶Network Aging Research (NAR), University of Heidelberg, Heidelberg, Germany

^³⁷Precision Medicine Program, Department of Environmental Health Sciences, Mailman School of Public Health, Columbia University, New York, New York, USA

^³⁸Department of Medicine, School of Medicine, University of North Carolina, Chapel Hill, North Carolina, USA

Correspondence , Daniel Levy, Framingham Heart Study, Population Sciences Branch, National Heart, Lung, and Blood Institute, 73 Mt. Wayte Avenue, Suite 2, Framingham, MA 01702, USA. Email: LevyD@nih.gov , James S. Pankow, Division of Epidemiology and Community Health, School of Public Health, University of Minnesota, Minneapolis, MN 55455, USA. Email: panko001@umn.edu , Tianxiao Huan, Framingham Heart Study, National Heart, Lung, and Blood Institute, Framingham, MA 01702, USA. Email: tianxiao.huan@umassmed.edu

^✉

Corresponding author.

PMCID: PMC9197414 PMID: 35546478

Abstract

DNA methylation (DNAm) has been reported to be associated with many diseases and with mortality. We hypothesized that the integration of DNAm with clinical risk factors would improve mortality prediction. We performed an epigenome‐wide association study of whole blood DNAm in relation to mortality in 15 cohorts (n = 15,013). During a mean follow‐up of 10 years, there were 4314 deaths from all causes including 1235 cardiovascular disease (CVD) deaths and 868 cancer deaths. Ancestry‐stratified meta‐analysis of all‐cause mortality identified 163 CpGs in European ancestry (EA) and 17 in African ancestry (AA) participants at p < 1 × 10⁻⁷, of which 41 (EA) and 16 (AA) were also associated with CVD death, and 15 (EA) and 9 (AA) with cancer death. We built DNAm‐based prediction models for all‐cause mortality that predicted mortality risk after adjusting for clinical risk factors. The mortality prediction model trained by integrating DNAm with clinical risk factors showed an improvement in prediction of cancer death with 5% increase in the C‐index in a replication cohort, compared with the model including clinical risk factors alone. Mendelian randomization identified 15 putatively causal CpGs in relation to longevity, CVD, or cancer risk. For example, cg06885782 (in KCNQ4) was positively associated with risk for prostate cancer (Beta = 1.2, P _MR = 4.1 × 10⁻⁴) and negatively associated with longevity (Beta = −1.9, P _MR = 0.02). Pathway analysis revealed that genes associated with mortality‐related CpGs are enriched for immune‐ and cancer‐related pathways. We identified replicable DNAm signatures of mortality and demonstrated the potential utility of CpGs as informative biomarkers for prediction of mortality risk.

Keywords: cancer, cardiovascular disease, DNA methylation, machine learning, mortality

DNA methylation (DNAm) is an important epigenetic regulatory mechanism and is associated with many diseases and with human aging. Our study demonstrated that inter‐individual variation in DNAm is associated with all‐cause mortality risk and with cause‐specific mortality. We built a prediction model by integrating DNAm with clinical risk factors and we show that doing so can improve mortality prediction.

Abbreviations

AA: African ancestry
ARIC: the Atherosclerosis Risk in Communities study
CHD: coronary heart disease
CVD: cardiovascular disease
CHS: the Cardiovascular Health Study
DNAm: DNA methylation
DTR: the Danish Twin Register sample
EA: European ancestry
EWAS: epigenome‐wide association studies
ESTHER: the Epidemiologische Studie zu Chancen der Verhütung, Früherkennung und optimierten Therapie chronischer Erkrankungen in der älteren Bevölkerung
FHS: the Framingham Heart Study
GWAS: genome‐wide association studies
InCHIANTI: the Invecchiare in Chianti Study
KORA F4: the Cooperative Health Research in the Region of Augsburg
LBC: the Lothian Birth Cohorts
MR: Mendelian randomization
NAS: the Normative Aging Study
RS: the Rotterdam Study
WHI: the Women’s Health Initiative

1. INTRODUCTION

Despite substantial evidence of heritability of human longevity (h ² = 10–30%), genome‐wide association studies (GWAS) have reported few loci associated with human longevity (Deelen et al., 2019; Pilling et al., 2017; Timmers et al., 2019; van den Berg et al., 2017). DNA methylation (DNAm), the covalent binding of a methyl group to the 5′ carbon of cytosine‐ phosphate‐guanine (CpG) dinucleotide sequences, reflects a wide range of environmental exposures and genetic influences at the molecular level and altered DNAm has been shown to regulate gene expression (Jones & Takai, 2001). Recent studies have reported DNAm patterns associated with age in humans (Hannum et al., 2013; Horvath, 2013; Levine et al., 2018; Lu et al., 2019). Estimates of biological age based on DNAm referred to as "epigenetic age" or "DNAm age" have been validated in numerous studies, although the functions of these age‐associated CpGs are largely unknown (Horvath et al., 2015; Lu et al., 2019; Marioni et al., 2015; Marioni et al., 2015). DNAm age also has been shown to be predictive of many age‐related diseases and of all‐cause mortality (Chen et al., 2016; Dugué et al., 2018; Levine et al., 2018; Lu et al., 2019; Marioni et al., 2015).

Despite the association of DNAm age with a variety of age‐associated outcomes, age‐related CpGs are different from those that are most strongly associated with mortality. Relatively few DNAm studies have focused on mortality as the primary outcome (Colicino et al., 2020; Svane et al., 2018; Zhang et al., 2017). Moreover, due to sample size limitations, most DNAm mortality studies have not typically investigated cause‐specific mortality such as death due to cardiovascular disease (CVD) and cancer. Additionally, little is known about the prediction performance of DNAm‐based mortality models and whether or not such approaches improve mortality prediction above and beyond established clinical risk factors.

We hypothesized that inter‐individual variation in DNAm is associated with all‐cause mortality risk and with cause‐specific mortality, and that we could build models incorporating CpGs that would improve mortality prediction beyond established clinical risk factors. In this study, we report the results of a meta‐analysis of epigenome‐wide association studies (EWAS) of all‐cause mortality and cause‐specific mortality including death from CVD and cancer in up to 15,013 individuals from 15 prospective cohort studies in which DNAm was measured in whole blood. We built all‐cause mortality risk prediction models using penalized regression and machine learning methods and integrated DNAm and established mortality clinical risk factors and validated the models’ performance. Additionally, using Mendelian randomization, we identified putatively causal CpGs for mortality. Last, we investigated the downstream gene expression and pathway changes of the mortality‐related CpGs by testing associations between DNAm and gene expression. Figure 1 summarizes the multi‐step study design.

2. RESULTS

2.1. Study population

Table 1 presents the major clinical characteristics of the 15,013 study participants including 11,684 European ancestry (EA, mean age 65, 55% women) and 3329 African ancestry (AA, mean age 59, 70% women) participants from 15 cohorts (Table S1 summarizes additional clinical characteristics). Most studies had fewer than 15 years of mean follow‐up (mean values ranged from 6.4 to 13.7 years), except ARIC (mean follow‐up of 20.0 years in ARIC EA and 18.6 in ARIC AA participants, respectively). During follow‐up of EA participants, 2907 died of any cause, 688 of CVD, and 546 of cancer; among AA participants, 1407 died of any cause, 547 of CVD, and 322 of cancer.

TABLE 1.

Clinical characteristics the 15,013 study participants

									Prevalent diseases
Cohort	Total N	No. of all‐cause death	No. of CVD death	No. of cancer death	Time to death/last follow‐up years, mean (SD)	Age, mean (SD)	Sex (F, %)	BMI, mean (SD)	Type 2 Diabetes (n)	Coronary Heart Disease (n)	Heart Failure (n)	Stroke (n)	Hypertension (n)	Cancer (n)
European ancestry
ARIC	969	331	95	94	20.0 (5.2)	59.8 (5.5)	59	26.2 (4.5)	86	44	29	16	233	102
CHS	419	373	132		12.7 (6.1)	75.0 (4.9)	60	26.8 (4.9)	72	16	11	5	224	78
DTR	870	298	74	40	9.3 (3.4)	69.4 (7.9)	52	25.9 (3.9)	46 ^a			37	269	129
ESTHER	1000	265	94	90	13.7 (3.5)	62.1 (6.5)	50	27.8 (4.3)	154	144	110	28	572	77
FHS	2427	403	91	155	9.1 (2.2)	66.3 (9.0)	55	28.3 (5.3)	279	226	53	116	107	389
InCHIANTi	488	104			10.0 (1.6)	62.4 (15.8)	52	27.0 (3.9)	42	31	9	10	232
KORA F4	1727	89	31	35	6.4 (0.9)	61.0 (8.9)	51	28.1 (4.8)	158	105	41	47	789	154
LBC 1921	418	366			9.8 (4.7)	79.1 (0.6)	60	28.2 (4.0)	19	70		33	170
LBC 1936	900	192			10.2 (2.4)	69.6 (0.8)	50	27.7(4.4)	72	221		46	364
NAS	640	221	123	72	10.5 (3.3)	72.8 (6.8)	0	28.1 (4.0)	117	181		42	447	316
RS	731	73			6.8 (1.5)	59.9 (8.2)	54	27.4 (4.5)	74	45		30	385	76
WHI	1095	192	48	60	11.5 (3.5)	62 (6.9)	100	28.8 (5.9)	60	20	5	11	469	14
African ancestry
ARIC	2446	1069	424	322	18.6 (6.6)	56.5 (5.8)	64	30.1 (6.2)	643	120	163	75	1373	87
CHS	325	264	96		12.9 (6.6)	73.1 (5.5)	62	28.6 (5.2)	68	2	0	2	235	36
WHI	558	74	27		10.6 (3.7)	61 (6.8)	100	31.5 (6.1)	76	18	11	12	369	2

Open in a new tab

The clinical risk factors were ascertained at the time of blood draw for DNAm measurements. BMI was calculated as weight (kg) divided by height squared (m²). Diabetes was defined as a measured fasting blood glucose level of >125 mg/dl or current use of glucose‐lowering prescription medication. Hypertension was defined as a measured systolic blood pressure (BP) ≥140 mm Hg or diastolic BP ≥90 mm Hg or use of antihypertensive prescription medication. Cancer was defined as the occurrence of any type of cancer excluding non‐melanoma skin cancer.

^{^a}

The diabetes cases in DTR included both type I and type II diabetes.

2.2. Ancestry‐stratified epigenome‐wide meta‐analysis of all‐cause mortality

At Bonferroni‐corrected p < 1 × 10⁻⁷ (~0.05/400,000), we identified 163 CpGs whose differential methylation in whole blood was associated with all‐cause mortality in EA participants, and 17 CpGs in AA participants, after adjustment of age, sex, lifestyle factors, clinical risk factors, white blood cell types, and technical covariates (e.g., batch). Tables S2–S3 present the results for all CpGs at p < 1 × 10⁻⁵. Overall genomic inflation in meta‐analysis (λ) was estimated at 1.15 or less, indicating low inflation and low risk of false‐positive findings. Even though cohort‐specific analysis showed slightly higher genomic inflation in some cohorts (λ > 1.5 in two cohorts, Table S4), forest plots show that the results were not driven by results from one or several cohorts (Fig. S1). Sensitivity analysis results including meta‐analysis after correcting for λ in each cohort, meta‐analysis after excluding results from two cohorts with λ > 1.5, and meta‐analysis after excluding RS cohort are included in Tables S5 and S6. Results of the sensitivity analysis remained consistent with the main results in terms of direction and effect estimates with Pearson's correlation r = 0.99 (in EA, corrected for λ in each cohorts), r = 1.00 (in EA, after removing two cohorts with λ > 1.5), r = 1.00 (in EA, after removing RS) and r = 1.00 (in AA, corrected for λ in each cohorts).

Among the 177 all‐cause mortality‐related CpGs (union set of EA and AA results at p < 1 × 10⁻⁷), the vast majority of significant CpGs (151, 85%) were inversely associated with mortality, with hazards ratios (HRs) <1 (range 0.72 to 0.89 per standard deviation [SD]). Methylation at the remaining 26 (15%) CpGs was positively associated with mortality, with HRs >1 (range 1.13 to 1.32). The 177 CpGs are annotated to 121 genes and 43 intergenic regions.

2.3. Transethnic replication and sensitivity analysis

Of the 163 all‐cause mortality‐related CpGs in EA participants, 18 (11%) had p < 0.0003 (0.05/163) in AA participants; of the 17 CpGs in AA participants, 12 (71%) had p < 0.004 (0.05/17) in EA participants. Table 2 displays the transethnic replicated CpGs including 27 unique CpGs. The top 3 transethnic replicated CpGs in EA participants remained the top 3 in AA participants, including cg16743273 for MOBKL2A, cg18181703 for SOCS3, and cg21393163 at an intergenic region (Chr.1: 12217629).

TABLE 2.

Transethnic replicated all‐cause mortality‐related CpGs

CpG	Chr	Position	Gene	Meta‐analysis EA cohorts		Meta‐analysis AA cohorts		Transethnic replication
CpG	Chr	Position	Gene	HR (95% CI)	p‐value	HR (95% CI)	p‐value	Bonferroni‐corrected P
Discovered in EA, and then replicated in AA
cg16743273	19	2076833	MOBKL2A	1.15 (1.1–1.21)	1.57E−09	1.24 (1.15–1.33)	1.28E−08	2.08E−06
cg18181703	17	76354621	SOCS3	0.83 (0.8–0.87)	6.15E−16	0.82 (0.77–0.88)	3.71E−08	6.05E−06
cg21393163	1	12217629		0.84 (0.8–0.88)	4.15E−12	0.84 (0.79–0.89)	7.48E−08	1.22E−05
cg15310871	8	20077936	ATP6V1B2	1.18 (1.12–1.25)	1.42E−08	1.19 (1.11–1.26)	1.80E−07	2.94E−05
cg25953130	10	63753550	ARID5B	0.87 (0.83–0.91)	4.67E−10	0.86 (0.81–0.91)	1.22E−06	1.98E−04
cg05438378	15	67383736	SMAD3	0.88 (0.84–0.92)	1.52E−08	0.85 (0.79–0.91)	3.68E−06	6.00E−04
cg26470501	19	45252955	BCL3	0.84 (0.79–0.88)	8.38E−12	0.81 (0.74–0.89)	1.48E−05	2.42E−03
cg06126421	6	30720080		0.8 (0.75–0.86)	2.48E−10	0.84 (0.78–0.91)	1.69E−05	2.75E−03
cg02003183	14	103415882	CDC42BPB	1.19 (1.13–1.26)	1.94E−11	1.16 (1.08–1.24)	2.00E−05	3.26E−03
cg10950251	1	204466432		0.86 (0.82–0.91)	4.05E−08	0.86 (0.8–0.92)	2.34E−05	3.81E−03
cg17501210	6	166970252	RPS6KA2	0.86 (0.81–0.9)	5.84E−09	0.87 (0.82–0.93)	2.71E−05	4.41E−03
cg23598089	1	203652079	ATP2B4	1.13 (1.08–1.18)	2.36E−08	1.14 (1.07–1.22)	4.19E−05	6.84E−03
cg21993290	2	233703120	GIGYF2	0.88 (0.84–0.92)	6.13E−08	0.87 (0.81–0.93)	4.94E−05	8.06E−03
cg04987734	14	103415873	CDC42BPB	1.2 (1.15–1.26)	2.53E−14	1.15 (1.07–1.23)	5.77E−05	9.41E−03
cg20813374	6	35657180	FKBP5	0.84 (0.78–0.89)	4.27E−08	0.84 (0.77–0.91)	7.19E−05	1.17E−02
cg11927233	5	170816542	NPM1	0.84 (0.8–0.89)	2.43E−09	0.89 (0.84–0.95)	2.41E−04	3.92E−02
cg24859433	6	30720203		0.85 (0.81–0.9)	7.15E−10	0.88 (0.82–0.94)	2.70E−04	4.40E−02
cg01445100	16	88103339	BANP	1.23 (1.15–1.32)	1.88E−09	1.24 (1.1–1.39)	2.76E−04	4.49E−02
Discovered in AA, and then replicated in EA
cg18181703	17	76354621	SOCS3	0.83 (0.8–0.87)	6.15E−16	0.82 (0.77–0.88)	3.71E−08	1.04E−14
cg21393163	1	12217629		0.84 (0.8–0.88)	4.15E−12	0.84 (0.79–0.89)	7.48E−08	7.05E−11
cg16743273	19	2076833	MOBKL2A	1.15 (1.1–1.21)	1.57E−09	1.24 (1.15–1.33)	1.28E−08	2.67E−08
cg25114611	6	35696870	FKBP5	0.86 (0.81–0.91)	7.50E−07	0.81 (0.75–0.87)	1.79E−08	1.28E−05
cg16411857	16	57023191	NLRC5	0.88 (0.84–0.93)	4.40E−06	0.79 (0.74–0.85)	2.40E−11	7.47E−05
cg16936953	17	57915665	TMEM49	0.91 (0.87–0.95)	7.05E−05	0.82 (0.77–0.88)	1.72E−08	1.20E−03
cg23570810	11	315102	IFITM1	0.86 (0.8–0.93)	9.75E−05	0.77 (0.72–0.83)	2.35E−11	1.66E−03
cg12054453	17	57915717	TMEM49	0.92 (0.88–0.96)	1.57E−04	0.84 (0.79–0.89)	2.93E−08	2.66E−03
cg18942579	17	57915773	TMEM49	0.91 (0.87–0.96)	3.53E−04	0.8 (0.74–0.86)	2.58E−09	6.01E−03
cg01041239	18	13222581	C18orf1	1.1 (1.04–1.16)	1.29E−03	1.22 (1.14–1.31)	1.04E−08	2.20E−02
cg03038262	11	315262	IFITM1	0.88 (0.82–0.96)	1.85E−03	0.72 (0.66–0.79)	5.14E−13	3.15E−02
cg24408769	6	15506085	JARID2	1.11 (1.04–1.18)	2.17E−03	1.27 (1.17–1.37)	1.29E−08	3.68E−02

Open in a new tab

Abbreviations: AA, African ancestry; CI, confidence interval; EA, European ancestry; HR, hazard ratio per standard deviation.

Because ARIC had longer follow‐up than the other cohorts, in sensitivity analysis, we truncated ARIC follow‐up at 15 years. The HRs for the significant CpGs (at p < 1 × 10⁻⁵) remained consistent with the main results in terms of direction and effect estimates with Pearson's correlation r = 1.00 and r = 0.99 in EA and AA participants, respectively (Tables S2–S3 and Fig. S2).

2.4. Associations of DNAm with CVD death and cancer death

In comparison with results for all‐cause mortality, fewer CpGs were associated with CVD death (at p < 1 × 10⁻⁷, n = 4 in EA, and n = 15 in AA) and cancer death (n = 0 in EA, and n = 1 in AA); Tables S7–S8 report the corresponding results at p < 1 × 10⁻⁵. Among the 163 all‐cause mortality‐related CpGs identified in EA participants at p < 1 × 10⁻⁷, 41 CpGs were associated with CVD death, 16 with cancer death and 5 with both (at p < 0.05/163, Table S2). Among the 17 CpGs identified in AA participants at p < 1 × 10⁻⁷, 15 were associated with CVD death, 9 with cancer death and 8 with both (at p < 0.05/17, Table S3). Figure 2 shows the effect sizes and direction of effect for the top CpGs associated with all‐cause mortality, and their consistency with the results of analyses of CVD death and cancer death. If a CpG was positively correlated with all‐cause mortality, it also was positively correlated with CVD death and cancer death, and vice versa.

Effect sizes (log hazards ratios) and 95% confidence intervals of CpGs related to mortality identified by meta‐analysis, comparing the results for all‐cause mortality, CVD death, and cancer death. (a) Results of meta‐analysis of European ancestry (EA); (b) Results of meta‐analysis of African ancestry (AA). These figures showed the CpGs associated with all‐cause mortality identified by the meta‐analysis, which were also associated with either CVD death or cancer death passing Bonferroni‐corrected threshold. Figure 1a shows 51 CpGs in EA, including 41 CpGs associated with CVD death, 16 with cancer death, and 5 with both. Figure 1b shows 16 CpGs in AA, including 15 CpGs associated with CVD death, 8 with cancer death, and 7 with both

2.5. Mortality prediction model

To investigate whether DNAm can be used to predict mortality risk, we constructed prediction models for all‐cause mortality and evaluated their prediction of all‐cause mortality, CVD death, and cancer death. To ensure unbiased validation, we split the EA cohorts into separate discovery and replication sets (Figure 1 shows the analysis flowchart). The discovery cohorts consisted of 8288 participants (including 2173 deaths from all causes) from 10 cohorts, excluding FHS (n = 2427) and ARIC (n = 969), which were used as replication cohorts. The meta‐analysis of the discovery set identified 74 CpGs at p < 1 × 10⁻⁷, 158 CpGs at p < 1 × 10⁻⁶, 357 CpGs at p < 1 × 10⁻⁵, 931 CpGs at p < 1 × 10⁻⁴, 2717 CpGs at p < 1 × 10⁻³, and 28,323 CpGs at p < 0.05. We evaluated three types of input features: (a) clinical risk factors only (i.e., clinical risk factor models); (b) CpGs identified in the meta‐analysis of the discovery set (i.e., CpG models); and (c) the input features including both CpGs and clinical risk factors (i.e., integrative models). We also compared four prediction methods including Elastic net‐Cox proportional hazards (Elastic‐coxph; Friedman et al., 2010), Random survival forest (RSF) (Ishwaran et al., 2008), Cox‐nnet (Ching et al., 2018), and DeepSurv (Katzman et al., 2018) (see Methods for details). In general, the four prediction methods did not show major differences in predicting mortality outcomes as assessed by multiple evaluation metrics (Table S9 lists the evaluation metrics across all four methods). To simplify the presentation of results, we focused on the Elastic‐coxph method (Figure 3).

Kaplan–Meier estimates of mortality risk scores with respect to mortality outcomes in ARIC study. (a) Survival curves with respect to all‐cause mortality; (b) survival curves with respect to CVD death; (c) survival curves with respect to cancer death. The results were obtained from ARIC European ancestry participants with follow‐up truncated at 15 years. For cancer death, we excluded samples who had any type of cancer before blood drawn for DNA methylation measurements. The mortality risk scores for (a) and (b) were computed by the model (Table S10), and for (c) was computed by the model (Table S11)

2.5.1. Clinical risk factors strongly predict all‐cause mortality and CVD death

The C‐index of the clinical risk factor models (age, sex, and 12 clinical risk factors) was 0.80 for all‐cause mortality, 0.81 for CVD death and 0.77 for cancer death in FHS (reflecting the average values of 10‐fold cross‐validation). We considered 12 clinical risk factors including BMI, smoking, alcohol consumption, physical activity, educational attainment, and prevalent diseases including hypertension, CHD, heart failure, stroke, type 2 diabetes, and cancer. Among the 12 clinical risk factors, prevalent cancer status was the major contributor to predicting cancer death. After excluding individuals with prevalent cancer at the time of blood draw for DNAm measurements (i.e., the start of follow‐up), the C‐index of the clinical risk factor model was 0.57 for cancer death. Finally, two clinical risk models were built using the optimum parameters selecting by cross‐validation (see Methods). The first one was trained using all FHS participants and included 10 risk factors selected by the Elastic‐coxph method (to predict all‐cause mortality and CVD death, Table S10), and the second was trained using FHS participants excluding those with prevalent cancer cases and including 10 risk factors (to predict cancer death, Table S11). The corresponding C‐index of the clinical risk factor model was 0.75 for all‐cause mortality (HR = 2.64 per SD in the risk score, 95% CI [2.21, 3.15], p = 4.4 × 10⁻²⁷), 0.81 for CVD death (HR = 3.51, 95% CI [2.58, 4.79], p = 2.1 × 10⁻¹⁵), and 0.71 for cancer death (excluding prevalent cancer samples, HR = 2.35, 95% CI [1.74, 3.18], p = 2.3 × 10⁻⁸) in ARIC EA participants with follow‐up truncated at 15 years (Table 3).

TABLE 3.

Performance robustness comparison of mortality predictors in FHS and ARIC cohorts

Model	FHS ^a			ARIC ^b
Model	HR	C‐index	IBS	HR (95% CI)	C‐index	IBS
All‐cause mortality
Clinical risk factor model	3.37	0.80	0.07	2.64 (2.21–3.15)	0.75	0.04
CpG model	2.91	0.77	0.07	2.24 (1.89–2.66)	0.72	0.04
Integrative model	3.50	0.80	0.06	2.95 (2.45–3.55)	0.77	0.04
CVD death
Clinical risk factor model	3.74	0.81	0.02	3.51 (2.57–4.79)	0.81	0.02
CpG model	3.85	0.82	0.02	2.62 (1.56–3.91)	0.77	0.02
Integrative model	3.90	0.83	0.02	3.65 (2.63–5.05)	0.80	0.02
Cancer Death (excluding prevalent cancer cases)
Clinical risk factor model	1.25	0.57	0.01	2.35 (1.74–3.18)	0.71	0.02
CpG model	1.71	0.65	0.01	2.22 (1.64–2.89)	0.73	0.02
Integrative model	1.78	0.68	0.01	2.58 (1.90–3.50)	0.76	0.02

Open in a new tab

Abbreviation: HR, hazard ratio per standard deviation; IBS: Integrated brier score.

Note: The clinical risk factor models were trained by using clinical risk factors as the sole input features. The CpG Models were trained by using CpGs selecting in the discovery meta‐analysis. The integrative model was trained by using both clinical risk factors and CpGs selecting in the discovery meta‐analysis.

The Clinical Risk Factor Model used to predict all‐cause mortality and CVD death was shown in Table S10, and to predict cancer death (trained in samples excluding prevalent cancer cases) was shown in Table S11. The CpG model used to predict all‐cause mortality and CVD death was shown in Table S12, and to predict cancer death (trained in samples excluding prevalent cancer cases) was shown in Table S13. The integrative model used to predict all‐cause mortality and CVD death was shown in Table S14, and to predict cancer death (trained in samples excluding prevalent cancer cases) was shown in Table S15.

^{^a}

HR, C‐index and IBS values in FHS reflect the average values of 10 times cross‐validation.

^{^b}

The results were obtained from ARIC European ancestry participants with follow‐up truncated at 15 years.

2.5.2. DNAm predicts mortality independently of age and clinical risk factors

The models using all‐cause mortality‐related CpGs identified in the discovery cohorts as the sole input feature (the CpG model) were predictive of all‐cause mortality, CVD death, and cancer death in the replication set. As shown in Fig. S3, when more discovery CpGs were added to the model, the prediction performance metrics did not always improve. In FHS, the models with discovery CpGs at p < 1 × 10⁻³ showed the best predictive performance for all‐cause mortality (C‐index = 0.77) and CVD death (C‐index = 0.82), but the model with discovery CpGs at p < 1 × 10⁻⁵ showed the best predictive performance for cancer death (excluding prevalent cancer cases, [C‐index = 0.65]). The final CpG models that were trained using all FHS participants are provided in Table S12 including 76 CpGs to predict all‐cause mortality and CVD death, and in Table S13 including 56 CpGs to predict cancer death (excluding prevalent cancer cases). The C‐index of the CpG models with the best predictive performance in ARIC were 0.72 for all‐cause mortality (HR = 2.21, 95% CI [1.86, 2.62], P = 2.0 × 10⁻²⁰), 0.77 for CVD death (HR = 2.62, 95% CI [1.96, 3.51], p = 9.9 × 10⁻¹¹), and 0.73 for cancer death (HR = 2.22, 95% CI [1.67, 2.95], p = 3.2 × 10⁻⁸, Table 3). The association of the mortality risk scores calculated by the CpG models with mortality outcomes remained significant after adjusting for age, sex, and clinical risk factors; for all‐cause mortality (HR = 1.68, 95% CI [1.37, 2.07], p = 9.8 × 10⁻⁷), CVD death (HR = 1.81, 95% CI [1.24, 2.64], p = 0.002), and cancer death (HR = 2.04, 95% CI [1.46, 2.86], P = 3.0 × 10⁻⁵).

2.5.3. The integrative model (trained by CpGs and clinical risk factors) moderately improved upon the clinical risk factor model for all‐cause mortality and CVD death, and substantially improved the prediction of cancer death

As shown in Table 3, the integrative models demonstrated robustness for predicting mortality outcomes, with a good C‐index, HR, and low brier error rate. The final integrative models trained using data from all FHS participants are provided in Table S14 including nine clinical risk factors and 36 CpGs to predict all‐cause mortality and CVD death, and in Table S15 including seven clinical risk factors and 42 CpGs to predict cancer death (excluding prevalent cancer cases). The C‐index values of the integrative models were 0.80 (FHS, reflecting the average values of 10‐fold cross‐validation) and 0.77 (ARIC) for all‐cause mortality; 0.83 (FHS) and 0.80 (ARIC) for CVD death; and 0.69 (FHS) and 0.76 (ARIC) for cancer death. Kaplan–Meier survival curves for the mortality risk scores (split into high‐, middle‐, and low‐risk groups) in the ARIC EA cohort (computed by the integrative models using clinical risk factors and CpGs at discovery p < 1 × 10⁻⁶, Tables S14–S15) illustrate the higher death rate for those with a higher mortality risk score (log‐rank p < 1 × 10⁻⁶, Figure 4). In comparison with the clinical risk factor models, the integrative models slightly improved prediction of all‐cause mortality (0.7% increase in C‐index with addition of CpGs in FHS and 2% increase in ARIC), and of CVD death (2% increase in C‐index in FHS, but no increase in ARIC). We speculate that the reason for this minor increase is because the mortality‐related CpGs capture the contributions of clinical risk factors for CVD death. For cancer death, however, the C‐index of the integrative model revealed an 11% increase in FHS above and beyond the clinical risk factor model and a corresponding 5% increase in ARIC (C‐index for the clinical risk factor model is 0.71 [0.67, 0.75], and for the integrative model is 0.76 [0.72, 0.80], one‐tailed t‐test p = 0.036).

Hazard ratios per standard deviation increment with 95% confidence intervals for mortality. (a) With respect to all‐cause mortality; (b) with respect to CVD death; and (c) with respect to cancer death. The results were obtained from ARIC European ancestry participants with follow‐up truncated at 15 years. For cancer death, samples who had any type of cancer before blood drawn for DNA methylation measurements were excluded. Cox regression models were used to relate mortality outcomes to inversely transformed mortality risk scores computed by Integrative models (Tables S12–S13) and CpG models (Tables S10–S11), and inversely transformed DNAm age including GrimAge (Lu et al., 2019), PhenoAge (Levine et al., 2018), Horvath Age (Horvath, 2013), and Hannum Age (Hannum et al., 2013). *Adj age and sex* indicated the association further adjusted for age and sex. *Adj age*, *sex and risk factors* indicated the association further adjusted for age, sex and the other clinical risk factors

We also tested the mortality prediction models’ performance using the entire ARIC EA data (without truncation, Table S16). Due to the long follow‐up time in this older cohort (mean age 59.8 at baseline, with 20 ± 5.5 years follow‐up), the integrative model exhibits very similar performance features as the model using age and sex as the sole input features for predicting all‐cause mortality and CVD death. The integrative model improved prediction of cancer death with 2% increase in the C‐index versus the clinical risk factor model.

We further tested all‐cause mortality prediction models in the CARDIA study (baseline age 45 ± 3 years). The CARDIA study has 12 years of follow‐up, during which there were 27 deaths from all causes in 905 participants with DNA methylation. As shown in Table S17, the clinical risk factor model, the CpG model, and the integrative model each predicted all‐cause mortality, and each outperformed the DNAm age models.

2.6. Comparing the mortality prediction model with DNAm age

We compared four DNAm age models (i.e., PhenoAge (Levine et al., 2018), Horvath Age (Horvath, 2013), Hannum Age (Hannum et al., 2013), and GrimAge (Lu et al., 2019)) with our mortality prediction models (CpG only models and integrative CpG plus 12 risk factor models) for all‐cause mortality, CVD death, and cancer death in ARIC participants. The associations of mortality risk scores calculated by mortality prediction models with mortality outcomes were statistically significant, and the associations remained significant after adjusting for age and sex, and after additionally adjusting for the clinical risk factors. The four DNAm age models were significantly associated with mortality outcomes. After adjusting for age, sex, and clinical risk factors, however, only GrimAge remained associated with all‐cause mortality, CVD death, and cancer death. None of the other three DNAm age predictors was associated with mortality outcomes after additionally adjusting for clinical risk factors (Figure 4). The mortality prediction models (both the CpG only model and the integrative model that included the clinical risk factors and CpGs) outperformed the GrimAge model in prediction of mortality outcomes in terms of HRs and p values. The associations of mortality risk scores with mortality outcomes remained significant after adjusting for the four DNAm age terms (Table S18).

2.7. Associations of DNAm with genetic variants and Mendelian randomization analysis

Among the 177 all‐cause mortality‐related CpGs (union of EA and AA results at p < 1 × 10⁻⁷), 123 CpGs had significant associations with genetic variants (i.e., cis‐ or trans‐meQTL variants). meQTL variants for 80 CpGs could be linked to 618 GWAS Catalog (Buniello et al., 2019) index SNPs associated with 432 complex traits or diseases (Table S19).

We further performed multiple instrumental variable (IV) MR analysis for the 17 CpGs having ≥3 independent cis‐meQTL SNPs (pruned by LD r ² < 0.01, as IVs, to model the causal relations of differential methylation at these CpGs (as the exposure) in relation to the various outcomes, including longevity (Deelen et al., 2019), CVD, CVD risk factors, and cancer (Evangelou et al., 2018; Locke et al., 2015; Michailidou et al., 2017; Phelan et al., 2017; Schumacher et al., 2018; Scott et al., 2017; Wang et al., 2014; Willer et al., 2013). At p _MR < 0.05, MR supported causal effects of 15 CpGs on one or more outcome (Table S20), and 4 CpGs were statistically significant at p _MR < 0.05/17, including cg06885782 (within 1500 bases upstream of transcription start site [TSS1500] of KCNQ4) and cg04907244 (TSS1500 of SNORD93) in relation to prostate cancer (Schumacher et al., 2018; Beta = 1.2 and 2.1; and p _MR = 4.1 × 10⁻⁴ and 0.003, respectively), cg07094298 (in the gene body of TNIP2) in relation to lung cancer (Wang et al., 2014; Beta = 2.2, and p _MR = 0.003), and cg18241337 (in the gene body of SSR3) in relation to total cholesterol (Willer et al., 2013; Beta = 0.5, and p _MR = 0.003). cg06885782 (KCNQ4) also was associated with longevity (Deelen et al., 2019; Beta = −1.9, p _MR = 0.02).

2.8. Associations of DNAm with gene expression and pathway analysis

For the 177 all‐cause mortality‐related CpGs at p < 1 × 10⁻⁷, we assessed associations of CpGs with nearby gene expression (i.e., cis gene expression; within ± 1 Mb) and identified 15 cis‐ DNAm‐mRNA associated pairs (13 CpGs and 15 mRNAs) at p < 3 × 10⁻¹⁰. The genes located at these CpGs or cis‐eQTM mRNAs were not enriched for any biological processes or pathways. For the 719 all‐cause mortality‐related CpGs at p < 1 × 10⁻⁵, genes located at CpG sites were enriched for multiple immune functions, cellular response to organic substance, and negative regulation of cell communication (Gene Ontology [GO] (Ashburner et al., 2000), FDR < 0.05), and pathways for multiple types of cancer (Kyoto Encyclopedia of Genes and Genomes [KEGG] pathway (Kanehisa & Goto, 2000), p < 0.05, Table S21). There were 79 cis‐DNAm‐mRNA pairs (63 CpGs and 67 mRNAs, Table S22).

3. DISCUSSION

By performing EWAS using whole blood‐derived DNA from 15,013 individuals from 15 cohorts with the accrual of 4314 deaths during a mean follow‐up of more than 10 years, we identified robust DNAm signatures of all‐cause and cause‐specific mortality. We developed replicable mortality predictors by integrating mortality‐related CpGs with traditional clinical risk factors. The integrative models that included clinical risk factors and CpGs showed small improvements in prediction of all‐cause mortality and CVD death, and a more substantial improvement in prediction of cancer death compared to the traditional risk factor model.

Our study is one of the largest EWAS of mortality to date (Colicino et al., 2020; Svane et al., 2018; Zhang et al., 2017), and it revealed many replicable DNAm signatures for all‐cause mortality. Our results are consistent with those from previous EWAS of all‐cause mortality; the vast majority of CpGs (85% in our study, 84% in (Zhang et al., 2017), and 67% in (Colicino et al., 2020)) were inversely associated with mortality suggesting a greater mortality risk with lower CpG methylation. Our study identified more CpGs in EA cohorts (n = 163) than in AA cohorts (n = 17). As shown in Table 2, the effect sizes (i.e., HR) of mortality‐related CpGs in EA and AA participants were quite similar. We speculate that our study identified many more CpGs in EA participants than AA participants due the greater statistical power of the larger EA sample size. Using different DNAm data normalization methods (such as Noob (Triche Jr et al., 2013), SWAN (Maksimovic et al., 2012), BMIQ (Teschendorff et al., 2013), and Dasen (Pidsley et al., 2013), see File S1) in different cohorts may also affect the reproducibility of the results. For the 177 all‐cause mortality‐related CpGs (union of EA and AA results at p < 1 × 10⁻⁷), we examined their overlap with trait‐associated CpGs in the EWAS catalog (Table S23; Battram et al., 2021). We found that 172 CpGs (97%) have been reported to be associated with human age, 123 with smoking, 49 with alcohol consumption, 42 with sex, and 140 with the other diseases or traits. Many CpGs were associated with multiple traits. For example, the top two CpGs, cg02583484 and cg18181703, were have been reported to be associated with smoking, alcohol consumption, prenatal smoking, healthy diet, forced expiratory volume, C reactive protein, and many other traits. We speculate that many of the CpGs identified by EWAS reflect DNA methylation changes due to disease, human aging, lifestyle, and environmental influences. By linking CpGs with meQTLs and performing MR analysis, it is possible to further infer putatively causal CpGs. However, to identify definitely causal effects of CpGs on outcomes, functional studies are necessary.

Among the 177 all‐cause mortality‐related CpGs, 123 CpGs had significant associations with genetic variants (i.e., cis‐ or trans‐meQTL variants identified previously; Huan et al., 2019). For the remaining 44 CpGs, however, this does not mean that their methylation levels have nothing to do with genetic variation. It is possible that the previous meQTL study lacked sufficient statistical power to identify meQTLs for those CpGs. The mortality‐related CpGs are linked to hundreds of human complex diseases/traits via their cis‐meQTL SNPs, which coincide with 618 GWAS Catalog (Buniello et al., 2019) index SNPs. This leads us to hypothesize that many disease/phenotype‐associated SNPs may contribute to disease processes via effects on mortality‐related CpGs. In this way, the mortality‐related CpGs may contribute causally to disease. To test this hypothesis, we conducted MR analyses that confirmed several putatively causal associations of mortality‐related CpGs with longevity (Deelen et al., 2019), CVD (Nikpay et al., 2015), CVD risk factors, and several types of cancer (Evangelou et al., 2018; Locke et al., 2015; Michailidou et al., 2017; Phelan et al., 2017; Schumacher et al., 2018; Scott et al., 2017; Wang et al., 2014; Willer et al., 2013; Table S20). Among the four CpGs passing a Bonferroni‐corrected threshold in MR analyses, cg06885782 in KCNQ4 was reported to be associated with risk for prostate cancer (beta = 1.2, p _MR = 4.1 × 10⁻⁴) and negatively associated with longevity (beta = −1.9, p _MR = 0.02). KCNQ4 (potassium voltage‐gated channel subfamily Q member 4) was previously reported to be associated with age‐related hearing impairment (Van Eyken et al., 2006), and it contains genetic variants associated with all‐cause mortality and survival free of major diseases (Walter et al., 2011). cg07094298 in the gene body of TNIP2 was previously identified as causal for lung cancer. A recent study reported TNIP2‐ALK fusion in lung adenocarcinoma (Feng et al., 2019). cg04907244 (in TSS1500 of SNORD93) was identified as causal for prostate cancer by MR. SNORD93 and its methylation was reported to be associated with several cancer types including uveal melanoma (Gong et al., 2017), breast cancer (Patterson et al., 2017), and renal clear cell carcinoma (Zhao et al., 2020). Pathway analysis further supported a role of mortality‐related CpGs in relation to cancer risk. The intragenic CpGs were enriched for genes in cancer pathways, possibly as a consequence of the expression of nearby genes (cis‐eQTMs analysis, Table S21) related to immune function.

The 14 clinical risk factors for mortality were chosen based on prior knowledge. In contrast, there are far fewer established risk factors for cancer death other than age, sex, BMI, smoking, and alcohol consumption. It is not a surprise that the clinical risk factors themselves accurately predicted all‐cause mortality (C‐index = 0.80 in FHS, and 0.75 in ARIC) and CVD death (0.81 in FHS and 0.81 in ARIC), but not cancer death (0.57 in FHS and 0.71 in ARIC). Even though the clinical risk factors are important for stratifying CVD risk, clinical risk factors themselves are unable to reveal molecular mechanism and are thereby unable to highlight causal mechanisms or promising therapeutic targets. After integrating clinical risk factors with DNAm in the all‐cause mortality prediction model, the C‐index only slightly increased (<2%) compared with the clinical risk factors model with regard to all‐cause mortality and CVD death. As shown in Table S14, nine of the 14 clinical risk factors, including age, sex, physical activity, prevalent cancer, type II diabetes, hypertension, CHD, heart failure, and stroke, as well as 36 CpGs that were selected as the representative features. Compared with clinical risk factors, the individual coefficients of the CpGs are much smaller. The small increase in the C‐index and the small coefficients of the CpGs suggest that the contribution of CpGs to the prediction of death may overlap with these clinical risk factors. We also found that the mortality‐related CpGs as the sole input features were still able to predict mortality outcomes after adjusting for clinical risk factors. This suggests that mortality‐related CpGs may identify novel molecular mechanisms contributing to CVD mortality that cannot be captured by existing clinical risk factors.

In contrast to CVD and CVD mortality, for which established risk factors are highly predictive of risk, the prediction of cancer and cancer mortality has proved much more challenging. Owing to the lower prediction using clinical risk factors alone (0.57 in FHS and 0.71 in ARIC), the mortality‐related CpGs improved risk prediction of cancer death over and above the clinical risk factor model with an 11% increase in the C‐index in FHS and a 5% increase in ARIC. We further tested whether the all‐cause mortality prediction model can be used to predict mortality among all participants in the FHS with prevalent cancer (n = 389). During a mean follow‐up of 9 years, there were 165 deaths in this group. The integrative mortality model predicted mortality risk among cancer cases (HR [95%CI]: 4.23 [2.63–6.80], p = 2.9 × 10⁻⁹). These results in conjunction with MR and pathway analysis show strong evidence of potential causal relations between mortality‐related CpCs and pathways in cancer. Based on these results, we hypothesize that mortality‐related CpGs can shed light on the epigenetic regulation of molecular interactions and help to identify novel therapeutic targets to reduce mortality risk for both CVD and cancer death.

Recent studies have used DNAm of multiple CpG sites to predict chronological age (i.e., DNAm age) and showed that DNAm age was associated with all‐cause mortality. We explored the prediction provided by these DNAm age models and show that PhenoAge (Levine et al., 2018), Horvath Age (Horvath, 2013), Hannum Age (Hannum et al., 2013), and GrimAge (Lu et al., 2019) were associated with mortality before accounting for risk factors. Only GrimAge, however, remained associated with mortality after adjusting for clinical risk factors. In contrast, the other three DNAm age models were no longer associated with mortality (Figure 4). One possible explanation is that the three DNAm age predictors (i.e., PhenoAge, Horvath Age, and Hannum Age) identify CpGs associated with age, but are not specific for all‐cause or cause‐specific mortality risk. Of note, the CpGs that serve as DNAm mortality predictors and those that predict DNAm age in the three models do not overlap. Among the top CpGs (N = 177) associated with all‐cause mortality in our EWAS, only cg00687674 in TMEM84 is included in PhenoAge (Levine et al., 2018), and only cg19935065 in DNTT appears in Hannum Age (Hannum et al., 2013). GrimAge may have outperformed the other three DNAm age models in predicting mortality because the CpGs that it uses are associated with the levels of 80 CVD‐related blood proteins, and with lifestyle and clinical risk factors (such as smoking), and mortality (Ho et al., 2018; Shah et al., 2019; Yao et al., 2018). However, because the CpGs in the GrimAge model are not disclosed (i.e., they are proprietary), we were unable to determine whether any of the mortality‐related CpGs in our study overlap with CpGs in the GrimAge model. Of note, our mortality prediction models (both the CpG only model and the integrative model that included CpGs and the clinical risk factors) outperformed GrimAge in prediction of mortality outcomes.

We tested and compared four prediction methods including Elastic‐coxph (Friedman et al., 2010), a regression‐based method, and three machine learning methods (Ching et al., 2018; Ishwaran et al., 2008; Katzman et al., 2018). The machine learning models did not outperform Elastic‐coxph (Table S9 and Fig. S2). The clinical risk factor model trained by machine learning methods did not perform well in independent external replication. For example, the C‐index of the clinical risk factor model for all‐cause mortality was 0.67 using RSF¹⁷ versus 0.75 using Elastic‐coxph in ARIC participants. Based on this metric, the machine learning methods did not outperform the regression‐based methods when there were relatively few features as inputs.

The primary outcome of our study was all‐cause mortality. We did not train prediction models for CVD death or cancer death, but we tested the prediction ability of the all‐cause mortality predictor on CVD death and cancer death. The CpGs in the model were restricted to all‐cause mortality‐related CpGs. As shown in Figure 1, the top DNAm signatures for all‐cause mortality showed the same direction of effect for CVD death and cancer death. It is possible that some CpGs show opposite directions in relation to CVD death and cancer death, but we did not train separate models for these outcomes. Therefore, developing separate prediction models for CVD death and cancer death with a very large sample size would be an important next step.

4. CONCLUSIONS

In conclusion, the ancestry‐stratified epigenome‐wide meta‐analyses in 15 population‐based cohorts identified replicable DNAm signatures of all‐cause and cause‐specific mortality. The top mortality‐associated CpGs were linked with genes involved in immune‐ and cancer‐related pathways, and were reported to be associated with human longevity, CVD risk factors, and several types of cancer. We constructed and validated DNAm‐based prediction models that predicted mortality risk independent of established clinical risk factors. The prediction model trained by integrating DNAm with clinical risk factors showed small improvement in prediction of all‐cause mortality and CVD death, and a more substantial improvement in prediction of cancer death, compared with the model trained by clinical risk factors alone. The mortality‐related CpG sites and the DNAm‐based prediction models may serve as useful clinical tools for assessing all‐cause and cause‐specific mortality risk and for developing new therapeutic strategies.

5. METHODS

5.1. Study population

This study included 15,013 participants from 15 population‐based cohorts. There were 11,684 European ancestry (EA) participants from 12 cohorts, including the Atherosclerosis Risk in Communities (ARIC) Study, the Cardiovascular Health Study (CHS), the Danish Twin Register sample (DTR), the Epidemiologische Studie zu Chancen der Verhütung, Früherkennung und optimierten Therapie chronischer Erkrankungen in der älteren Bevölkerung (ESTHER), the Framingham Heart Study (FHS), the Invecchiare in Chianti (InCHIANTI) Study, the Cooperative Health Research in the Region of Augsburg (KORA F4), the Lothian Birth Cohorts of 1921 (LBC1921) and 1936 (LBC1936), the Normative Aging Study (NAS), the Rotterdam Study (RS), and Women's Health Initiative (WHI); and 3329 Africa ancestry (AA) participants from 3 cohorts, including ARIC, CHS, and WHI. For each participant, we calculated the follow‐up time between the date of the blood draw for DNAm measurements and the date at death or last follow‐up. Mean follow‐up was less than 15 years (range 6.2–13.7) for most cohorts, except for ARIC (mean 20.0 for EA and 18.6 for AA). The protocol for each study was approved by the institutional review board of each cohort. Further details for each cohort were included in File S1.

5.2. Mortality ascertainment and clinical phenotypes

Outcomes including death from all causes, deaths from CVD, and deaths from cancer were prospectively ascertained in each cohort. Survival status and details of death were ascertained using multiple strategies, including routine contact with participants for health history updates, surveillance at the local hospital, review of obituaries in the local newspaper, and National Death Index queries. Death certificates, hospital and nursing home records prior to death, and autopsy reports were requested and reviewed. Date and cause of death were determined separately for each cohort following review of all available medical records and /or were register‐based.

The clinical and lifestyle risk factors (referred to as clinical risk factors for simplicity thereafter) used as covariates in this study included age, sex, body mass index (BMI), smoking, alcohol consumption, physical activity, educational attainment, and prevalent diseases including hypertension, coronary heart disease (CHD), heart failure, stroke, type II diabetes, and cancer. Fourteen clinical risk factors were chosen based on prior knowledge; most of these are key CVD risk factors. The clinical risk factors were ascertained at the time of blood draw for DNAm measurements. BMI was calculated as weight (kg) divided by height squared (m²). Educational attainment (years of educational schooling), physical activity (frequency, intensity, or the metabolic equivalent of task [MET] scores), smoking status (yes/no, or cigs/day), and alcohol consumption (drinks per day) were self‐reported or ascertained by an administered questionnaire at routine research clinic visits. Diabetes was defined as a measured fasting blood glucose level of >125 mg/dl or current use of glucose‐lowering prescription medication. Hypertension was defined as a measured systolic blood pressure (BP) ≥140 mm Hg or diastolic BP ≥90 mm Hg or use of antihypertensive prescription medication. Cancer was defined as the occurrence of any type of cancer excluding non‐melanoma skin cancer.

5.3. DNA methylation measurements and quality control

For each cohort, DNA was extracted from whole blood and bisulfite‐converted using a Zymo EZ DNA methylation kit. DNAm was measured using the Illumina Infinium HumanMethylation450 (450K) BeadChip platform (Illumina Inc., San Diego, CA). Each cohort conducted independent laboratory DNAm measurement, quality control (including sample‐wise and probe‐wise filtering, and probe intensity background correction; see File S1).

5.4. Cohort‐specific epigenome‐wide association analysis

The correction of methylation data for technical covariates was cohort specific. Each cohort performed an independent investigation to select an optimized set of technical covariates (e.g., batch, plate, chip, row, and column), using measured or imputed blood cell type fractions, surrogate variables, and/or principal components. Most cohorts had previous publications using the same dataset for EWAS of different traits, such as EWAS of alcohol drinking and smoking (Mendelson et al., 2017; Michailidou et al., 2017). In this study, those cohorts used the same strategies as they did previously for correcting for technical variables including batch (see File S1). To avoid false positives driven by single CpG extreme values, in each cohort, we first performed rank‐based inverse normal transformation (INT) of DNAm β‐values (the ratio of methylated probe intensity divided by the sum of the methylation and unmethylated probe intensity). We then conducted time‐to‐event analyses using Cox proportional hazards models to test for associations between each CpG and mortality outcomes including all‐cause mortality, CVD death, and cancer death using the coxph() function in the “survival” R library, adjusting for clinical risk factors (see Mortality ascertainment and clinical risk factors), technical confounders, and familial relatedness. Because ARIC cohorts had much longer follow‐up than the other cohorts, ARIC follow‐up was truncated at 15 years and results were compared to those before truncation to determine whether results were impacted by duration of follow‐up.

In this study, we performed INT of DNAm β‐values to avoid false positives driven by extreme values of single CpGs. Using the FHS EWAS results as an example, Table S24 shows that the top CpGs associated with all‐cause mortality (without INT) were no longer significant after performing INT. This finding suggests that if we directly use DNAm β‐values, those extreme outlier values could lead to false‐positive results. Clearly, the distribution of DNA β‐values is non‐normal, and for this reason, we believe that the conservative INT approach we took protected against false‐positive results.

5.5. Meta‐analysis

The meta‐analysis was performed for all‐cause mortality, CVD death, and cancer death in EA (n = 11,684) and AA (n = 3329) participants, respectively, using inverse variance‐weighted random‐effects models implemented in metagen() function R packages (https://rdrr.io/cran/meta/man/metagen.html). We chose a random‐effects model because of the heterogeneity in follow‐up length and population demographics in the different cohorts (Table S1). We excluded the EWAS results for a study with <20 deaths. We excluded probes mapping to multiple locations on the sex chromosomes or with an underlying SNP (MAF > 5% in 1000 Genome Project data) at the CpG site or within 10bp of the single base extension. In addition, the meta‐analysis was constrained to methylation probes passing filtering criteria in five or more cohorts (see File S1), which resulted in ~400,000 CpGs that were included in the final analyses. The statistical significance threshold was p < 0.05/400,000 ≈ 1 × 10⁻⁷.

Three types of sensitivity analyses were performed including (1) correcting for λ values in each cohorts (Devlin et al., 2001), (2) excluding two cohorts with λ > 1.5 from the meta‐analysis, and (3) excluding results of RS, because the cohort‐specific analysis in RS having a strange distribution of top hits. There were 157 CpGs identified at p < 1e‐7 in the RS cohort‐specific analysis. The number is much more than the number of all‐cause mortality‐associated CpGs identified in the other cohorts.

5.6. Mortality prediction models

Mortality prediction models based on clinical risk factors and with the addition of DNAm were built and tested in EA cohorts. The analysis flowchart is shown in Figure 1. To ensure unbiased validation, we split the EA cohorts into discovery and replication sets. The discovery cohorts consisted of 8288 participants from 10 cohorts, excluding FHS (n = 2427) and ARIC (n = 969). Candidate CpGs as input features were selected by meta‐analysis in the discovery cohorts. FHS was used to train prediction models. ARIC was used to validate models. To build and replicate a prediction model, the DNAm data were preprocessed utilizing the same strategy as in the EWAS analysis.

5.6.1. Input features

To evaluate the prediction performance of clinical risk factors and DNAm comprehensively, we tested 13 sets of features, Feature set 1 (F1) included age (years), sex (male as 1 and female as 2), and 12 other clinical risk factors including BMI (kg/m²), smoking (current smoker as 1, and former and never smoker as 0), alcohol consumption (g/day), physical activity (MET scores), educational attainment (education years), and prevalent diseases (yes as 1 and no as 0) including hypertension, CHD, heart failure, stroke, type 2 diabetes, and cancer. F2‐F7 were mortality‐related CpGs selected by meta‐analysis in the discovery cohorts by inverse variance‐weighted random‐effects models at a series of p value thresholds, including F2 CpGs at p < 1e‐7, F3 CpGs at p < 1e‐6, F4 CpGs at p < 1e‐5, F5 CpGs at p < 1e‐4, F6 CpGs at p < 1e‐3, and F7 CpGs at p < 0.05. F8‐F13 are F1 (age, sex, and 12 clinical phenotypes) plus F2‐F7, respectively. In doing so, we were able to evaluate the prediction performance based on the clinical risk factors (F1) and the DNAm (F2‐F7), and test if the combination of DNAm with clinical risk factors (F8‐F13) could be able to improve the prediction performance by using clinical risk factors (F1) only and DNAm only (F2‐F7).

5.6.2. Model building

We compared four methods of building prediction models, including 1) Elastic net ‐ Cox proportional hazards method (Elastic‐coxph, using glmnet, a R package) (Friedman et al., 2010); 2) Random survival forest (RSF, using randomForestSRC, a R package) (Ishwaran et al., 2008); 3) Cox‐nnet (https://github.com/lanagarmire/cox‐nnet, a Python package) (Ching et al., 2018); and 4) DeepSurv (https://github.com/jaredleekatzman/DeepSurv, a Python package) (Katzman et al., 2018). The first method is a penalized linear regression method, while the other three are non‐linear machine learning methods.

Elastic‐coxph is a Cox regression model regularized with elastic net penalty (Friedman et al., 2010). Performing this method requires to identify best values of two parameters, α and λ. We tuned each model by iterating over a number of α and λ values under cross‐validation. α indicated linearly combined penalties of the lasso (α=0) and ridge (α=1) regression. λ is the shrinkage parameter, when λ =0 indicated no shrinkage, and as λ increases, the coefficients are shrunk ever more strongly. Effectively this will shrink some coefficients close to 0 for optimizing a set of features. The α value was set to 0.5, and the λ value was set to lambda.min when training models.

RSF is an ensemble tree model that is based on the random forest method for survival analysis (Ishwaran et al., 2008). The optimized values of parameters in RSF models, including the number of trees (nTrees=100) and nodeSize =15, were chosen by iterating over a number of values which maximized the accuracy of RSF models tested in the replication sets under cross‐validation. RSF can compute feature importance scores for feature selection.

Cox‐nnet is an artificial neural network‐based method for survival analysis (Ching et al., 2018). Cox‐nnet includes two layer neural network: one hidden layer and one output layer. The output layer was used to perform Cox regression based on the activation levels of the hidden layer. Cox‐nnet could also compute feature importance scores for feature selection. For each model training, the L2 regularization parameter is optimized using the L2CVProfile Python function by iterating over a number of values under cross‐validation.

DeepSurv is a deep learning‐based survival prediction method (Katzman et al., 2018). DeepSurv uses a multi‐layer feed forward neural network, of which the hidden layers consist of a fully connected layer of nodes, followed by a dropout layer, and the output is a single node with a linear activation which estimated the log‐risk function in the Cox model, parameterized by the weight of the network. The values of hyperparameters when using DeepSurv were L2 regularization =0.8, dropout =0.4, learning rate =0.02, hidden layer size (4 layers with nodes 500, 200, 100 and 50), lr_decay =0.001, momentum =0.9 and the activation method (using Scaled Exponential Linear Units), which were optimized by iterating over a number of values each‐by‐each and under cross‐validation. DeepSurv has not been used previously for selecting features.

The 2427 FHS participants were randomly split into 5 equal sets (n=485 or 486 in each set), and each set included approximately equal numbers of deaths. We then used 3 of the 5 sets (60%) for model training and the remaining 2 sets (40%) for model testing. In doing so, we obtained 10 combinations. In each training / testing combination, we constructed a model using the training data, and then used the model to generate a mortality risk score based on the testing data. We assessed associations of the predicted mortality risk score (after inversely normal transformation) with all‐cause mortality, CVD death, and cancer death in the testing data using time‐to‐event proportional hazards models. This data partitioning and cross‐validation strategy was only used to assess the robustness of prediction models when using different features and methods, and to select the optimized parameters for training models. The final models reported were built on all FHS participants using the optimized parameters. We also repeated the same analysis steps using FHS participants without cancer at baseline (n=2038; 238 deaths from all causes, 70 from CVD, and 42 from cancer).

5.6.3. Validation

The prediction models built using all FHS participants were tested in ARIC EA participants for the prediction of mortality outcomes. We performed tests on all‐cause mortality and CVD death on all ARIC EA participants truncated at 15 years of follow‐up, and tests on cancer death after excluding prevalent cancer. We further tested the all‐cause mortality prediction model in the CARDIA study. The CARDIA study has 12 years of follow‐up, during which there were 27 deaths from all causes in 905 participants with DNA methylation.

5.6.4. Evaluation of model performance

We used four evaluation metrics to assess model performance, including the concordance index (C‐index) (Harrell Jr et al., 1996), hazards ratio of predicted risk score (inversely transformed) for prediction of mortality, the integrated brier score (IBS) (Brier, 1950), and Kaplan–Meier (KM) survival curves for high‐, medium‐, and low‐risk groups (Kaplan & Meier, 1958). The C‐index reflects the percentage of individuals whose predicted survival times are correctly ordered. A C‐index of 0.50 reflects no improvement in prediction over chance. The brier score measures the mean of the difference between the observed and the estimated survival beyond a certain time. The brier score ranges between 0 and 1, and a larger score indicates higher inaccuracy. The integrated brier score is the brier score averaged over the entire time interval.

5.7. DNAm Age

We compared the prediction performance of DNAm age with our DNAm‐based mortality prediction model in relation to all‐cause mortality, CVD death, and cancer death in the ARIC EA cohort (truncating follow‐up at 15 years). Four measures of DNAm age were used in this study, including PhenoAge (Levine et al., 2018), Horvath age (Horvath, 2013), Hannum age (Hannum et al., 2013), and GrimAge (Lu et al., 2019). The Horvath Age is based on 353 CpGs, the Hannum age is based on 71 CpGs, and PhenoAge is based on 513 CpGs. DNAm age was calculated as the sum of the beta values multiplied by the reported effect size. Due to the GrimAge model was not publicly available, the GrimAge was calculated by uploading the DNAm data to the website (http://dnamage.genetics.ucla.edu/). Proportional hazards regression models were used to test the association between inversely rank transformed DNAm age (all 3 approaches) and mortality outcomes, adjusting for age, sex, and clinical covariates (see Mortality ascertainment and clinical phenotypes).

5.8. meQTLs

meQTLs (SNPs associated with DNA methylation) were identified from 4170 FHS participants as reported previously, including 4.7 million cis‐meQTLs and 630K trans‐meQTLs at p < 2 × 10⁻¹¹ for cis and p < 1.5 × 10⁻¹⁴ for trans (Huan et al., 2019). The genotypes were measured using Affymetrix SNP 500K mapping and Affymetrix 50K gene‐focused MIP arrays. Genotypes were imputed using the 1000 Genomes Project panel phase 3 using MACH / Minimac software. SNPs with MAF > 0.01 and imputation quality ratio >0.3 were retained. cis‐meQTLs were defined as SNPs residing within 1 Mb upstream or downstream of a CpG site. The FHS meQTL data resource includes 3.5 times more cis‐, and 10 times more trans‐meQTL SNPs than the other published studies to date (https://ftp.ncbi.nlm.nih.gov/eqtl/original_submissions/FHS_meQTLs/).

5.9. Mendelian randomization

Two‐sample Mendelian randomization (MR) was used to identify putatively causal CpGs for human longevity, CVD and CVD risk factors, and cancer types using a multi‐step strategy. Estimated associations and effect sizes between SNPs and traits were based on the latest published GWAS meta‐analysis of human longevity (Deelen et al., 2019), coronary heart disease (CHD) (Nikpay et al., 2015); myocardial infarction (MI) (Nikpay et al., 2015); type II diabetes (T2D) (Scott et al., 2017); body mass index (BMI) (Locke et al., 2015); lipids traits including high‐density lipoprotein (HDL) cholesterol, low‐density lipoprotein (LDL) cholesterol, total cholesterol (TC), and triglycerides (TG) (Willer et al., 2013); systolic blood pressure (SBP) and diastolic blood pressure (DBP) (Evangelou et al., 2018), and cancer types including breast cancer (Michailidou et al., 2017), prostate cancer (Schumacher et al., 2018), lung cancer (Wang et al., 2014) and ovarian cancer (Phelan et al., 2017). We were unable to include some other popular cancer types, because their GWAS data were not be accessible by us.

Instrumental variables (IVs) for each CpG site consisted of independent cis‐meQTLs pruned at linkage disequilibrium (LD) r ² < 0.01, retaining only one cis‐meQTL variant with the lowest SNP‐CpG p value in each LD block. LD proxies were defined using 1000 genomes imputation in EA. Inverse variance‐weighted (IVW) MR tests were performed on CpGs with at least three independent cis‐meQTL variants, which is the minimum number of IVs needed to perform multiple instruments MR. The multiple instruments improved the precision of IV estimates and allowed the examination of underlying IV assumption (Palmer et al., 2012). Among 177 all‐cause mortality‐related CpGs at p < 1 × 10⁻⁷, MR tests were performed on 17 CpGs having ≥3 independent cis‐meQTL SNPs. To test the validity of IVW‐MR results, we performed heterogeneity and MR‐EGGER pleiotropy tests for all IVs. The statistical significance threshold for MR is p _MR < 0.05/17, and both p _heter and p _pleio were required to be >0.05.

5.10. eQTMs

Association tests of DNAm and gene expression were performed in 3684 FHS participants with available DNAm and gene expression data. mRNA was extracted from whole blood (collected in PAXgene tubes) and profiled using the Affymetrix Human Exon 1.0 ST GeneChip platform. Raw gene expression data were first normalized using the RMA (robust multi‐array average) from Affymetrix Power Tools (APT, thermofisher.com/us/en/home/life‐science/microarray‐analysis/affymetrix.html#1_2) with quantile normalization. Then, output expression values of 17,318 genes were extracted by APT based on NetAffx annotation version 31.

DNAm β values were adjusted for age, sex, predicted blood cell fraction, the two top PCs of DNAm, and 25 surrogate variables (SVs), with DNAm as a fixed effect, and batch as a random effect by fitting LME models. Residuals (DNAm_resid) were retained. The gene expression values were adjusted for age, sex, predicted blood cell fraction, a set of technical covariates, the two top PCs and 25 SVs, with gene expression as a fixed effect, and batch as a random effect by LME, and residuals (mRNA_resid) were retained. Then, linear regression models were used to assess pair‐wise associations between DNAm_resid and mRNA_resid. SVs were calculated using the SVA package in R. A cis‐CpG‐mRNA pair was defined as a CpG residing ± 1 Mb of the TSS of the corresponding gene encoding the mRNA (cis‐eQTM). The annotations of CpGs and transcripts were obtained from annotation files of the HumanMethylation450K BeadChip and the Affymetrix exon array S1.0 platforms. We estimated that there were 1.6 × 10⁸ potential cis‐ CpG‐mRNA pairs. We only used cis‐eQTMs in this study because trans‐eQTMs were not replicated in independent external studies. The statistical significance threshold was p < 3 × 10⁻¹⁰ (0.05/1.6 × 10⁸).

5.11. Gene ontology and pathway enrichment analysis

Gene ontology and pathway enrichment analyses were performed on the genes annotated in relation to the 177 all‐cause mortality‐related CpGs at p < 1 × 10⁻⁷or p < 1 × 10⁻⁵ as well as the cis‐eQTM genes associated with those CpGs. To improve focus in this study, we only used results of KEGG and Gene Ontology–biological process (GO‐BP) terms. Enrichment tests used gometh function in missMethy R package, which can take into account two types of bias in DNA methylation study: (1) the differing number of probes per gene present on the array, and (2) CpGs that are annotated to multiple genes (Maksimovic et al., 2021).

CONFLICT OF INTEREST

The authors declare no conflict of interest.

AUTHORS’ CONTRIBUTIONS

T. H., D.L., and J.P. designed, directed, and supervised the project. T. H. and D.L. drafted the manuscript. T.H., S.N., E.C., C. R., D.H., J.B., M.S., Y.Z., A.B., E.M., and T.T. conducted the analyses. All authors participated in revising and editing the manuscripts. All authors have read and approved the final version of the manuscript.

CONSENT TO PARTICIPATE

This study included participants from 12 population‐based cohorts studies, including the Atherosclerosis Risk in Communities (ARIC) Study, the Cardiovascular Health Study (CHS), the Danish Twin Register sample (DTR), the Epidemiologische Studie zu Chancen der Verhütung, Früherkennung und optimierten Therapie chronischer Erkrankungen in der älteren Bevölkerung (ESTHER), the Framingham Heart Study (FHS), the Invecchiare in Chianti (InCHIANTI) Study, the Cooperative Health Research in the Region of Augsburg (KORA F4), the Lothian Birth Cohorts of 1921 (LBC1921) and 1936 (LBC1936), the Normative Aging Study (NAS), the Rotterdam Study (RS), and Women's Health Initiative (WHI). All study participants provided written informed consent.

Supporting information

Fig S1‐3

Click here for additional data file.^{(1.1MB, docx)}

Tables S1‐S4

Click here for additional data file.^{(990.9KB, xlsx)}

ACKNOWLEDGEMENTS

The views expressed in this manuscript are those of the authors and do not necessarily represent the views of the National Heart, Lung, and Blood Institute; the National Institutes of Health; or the U.S. Department of Health and Human Services. For a list of all the investigators who have contributed to WHI science, please visit: https://s3‐us‐west‐2.amazonaws.com/www‐whi‐org/wp‐content/uploads/WHI‐Investigator‐Long‐List.pdf.

Huan, T. , Nguyen, S. , Colicino, E. , Ochoa‐Rosales, C. , Hill, W. D. , Brody, J. A. , Soerensen, M. , Zhang, Y. , Baldassari, A. , Elhadad, M. A. , Toshiko, T. , Zheng, Y. , Domingo‐Relloso, A. , Lee, D. H. , Ma, J. , Yao, C. , Liu, C. , Hwang, S.‐J. , Joehanes, R. , … Levy, D. (2022). Integrative analysis of clinical and epigenetic biomarkers of mortality. Aging Cell, 21, e13608. 10.1111/acel.13608

Tianxiao Huan and Steve Nguyen authors are contribute equally.

Funding information

The Framingham Heart Study is funded by National Institutes of Health contract N01‐HC‐25195 and HHSN268201500001I. The laboratory work for this investigation was funded by the Division of Intramural Research, National Heart, Lung, and Blood Institute, National Institutes of Health. The analytical component of this project was funded by the Division of Intramural Research, National Heart, Lung, and Blood Institute, and the Center for Information Technology, National Institutes of Health, Bethesda, MD. The Cardiovascular Health Study is supported by NHLBI contracts HHSN268201200036C, HHSN268200800007C, HHSN268201800001C, N01HC55222, N01HC85079, N01HC85080, N01HC85081, N01HC85082, N01HC85083, N01HC85086; and NHLBI grants U01HL080295, U01HL130114, K08HL116640, R01HL087652, R01HL092111, R01HL103612, R01HL105756, R01HL103612, R01HL111089, R01HL116747 and R01HL120393 with additional contribution from the National Institute of Neurological Disorders and Stroke (NINDS). Additional support was provided through R01AG023629 from the National Institute on Aging (NIA), Merck Foundation/Society of Epidemiologic Research as well as Laughlin Family, Alpha Phi Foundation, and Locke Charitable Foundation. A full list of principal CHS investigators and institutions can be found at CHS‐NHLBI.org. The provision of genotyping data was supported in part by the National Center for Advancing Translational Sciences, CTSI grant UL1TR000124, and the National Institute of Diabetes and Digestive and Kidney Disease Diabetes Research Center (DRC) grant DK063491 to the Southern California Diabetes Endocrinology Research Center. Infrastructure for the CHARGE Consortium is supported in part by the National Heart, Lung, and Blood Institute grant R01HL105756. The DTR study was supported by The Danish Council for Independent Research—Medical Sciences (DFF‐6110‐00016), the European Union's Seventh Framework Programme (FP7/2007–2011) under grant Agreement No. 259679 and The Danish National Program for Research Infrastructure 2007 (09‐063256). The WHI program is funded by the National Heart, Lung, and Blood Institute, National Institutes of Health, U.S. Department of Health and Human Services through contracts HHSN268201600018C, HHSN268201600001C, HHSN268201600002C, HHSN268201600003C, and HHSN268201600004C. Work in WHI was NIEHS‐supported by R01‐ES020836 (EAW; AB; LH). Phenotype collection in the Lothian Birth Cohort 1921 was supported by the UK’s Biotechnology and Biological Sciences Research Council (BBSRC), The Royal Society and The Chief Scientist Office of the Scottish Government. Phenotype collection in the Lothian Birth Cohort 1936 was supported by Age UK (The Disconnected Mind project). Methylation typing was supported by Centre for Cognitive Ageing and Cognitive Epidemiology (Pilot Fund award), Age UK, The Wellcome Trust Institutional Strategic Support Fund, The University of Edinburgh, and The University of Queensland. IJD is a member of the University of Edinburgh Centre for Cognitive Ageing and Cognitive Epidemiology (CCACE), which is supported by funding from the BBSRC, the Medical Research Council (MRC), and the University of Edinburgh as part of the cross‐council Lifelong Health and Wellbeing initiative (MR/K026992/1). W.D.H. is supported by a grant from Age UK (Disconnected Mind Project). The VA Normative Aging Study is sponsored by the CSP/ERIC program of the U.S. Department of Veterans Affairs and is a research component of the Massachusetts Veterans Education Research and Information Center (MAVERIC). The Atherosclerosis Risk in Communities study has been funded in whole or in part with Federal funds from the National Heart, Lung, and Blood Institute, National Institutes of Health, Department of Health and Human Services, under Contract nos. 75N92022D00001, 75N92022D00002, 75N92022D00003, 75N92022D00004, and 75N92022D00005. Funding was also supported by 5RC2HL102419 and R01NS087541. The authors thank the staff and participants of the ARIC study for their important contributions. J.M. is supported by a grant from NIH HL135075. W.D.H. is supported by a Career Development Award from the Medical Research Council (MRC) [MR/T030852/1] for the project titled “From genetic sequence to phenotypic consequence: genetic and environmental links between cognitive ability, socioeconomic position, and health.”

Contributor Information

Tianxiao Huan, Email: tianxiao.huan@umassmed.edu.

James S. Pankow, Email: panko001@umn.edu.

Daniel Levy, Email: LevyD@nih.gov.

DATA AVAILABILITY STATEMENT

The DNA methylation data and phenotype data are available in dbGaP for some of the cohorts in this study (https://www.ncbi.nlm.nih.gov/gap/) including FHS (accession number phs000724.v5.p10) and WHI (accession number phs000200.v12.p3). For LBC, data are available in the European Genome‐phenome Archive (https://www.ebi.ac.uk/ega/home), under accession number EGAS00001000910. For the other cohorts including ARIC, CHS, NAS, InCHIANTi, KORA, ESTHER, Danish, RS, and CARDIA, the data are available on request by contacting with the principal investigators of each cohort.

REFERENCES

Ashburner, M. , Ball, C. A. , Blake, J. A. , Botstein, D. , Butler, H. , Cherry, J. M. , Davis, A. P. , Dolinski, K. , Dwight, S. S. , Eppig, J. T. , Harris, M. A. , Hill, D. P. , Issel‐Tarver, L. , Kasarskis, A. , Lewis, S. , Matese, J. C. , Richardson, J. E. , Ringwald, M. , Rubin, G. M. , & Sherlock, G. (2000). Gene ontology: tool for the unification of biology. Nature Genetics, 25(1), 25–29. 10.1038/75556 [DOI] [PMC free article] [PubMed] [Google Scholar]
Battram, T. , Yousefi, P. , Crawford, G. , Prince, C. , Babei, M. S. , Sharp, G. , Hatcher, C. , Vega‐Salas, M. J. , Khodabakhsh, S. , Whitehurst, O. , Langdon, R. , Mahoney, L. , Elliott, H. R. , Mancano, G. , Lee, M. A. , Watkins, S. H. , Lay, A. C. , Hemani, G. , Gaunt, T. R. , … Whitehurst, O. (2022). The EWAS Catalog: A database of epigenome‐wide association studies. Wellcome Open Research, 7(41), 41. 10.1093/nar/gky993 [DOI] [PMC free article] [PubMed] [Google Scholar]
Brier, G. W. (1950). Verification of forecasts expressed in terms of probability. Monthly Weather Review, 78(1), 1–3. [Google Scholar]
Buniello, A. , MacArthur, J. A. L. , Cerezo, M. , Harris, L. W. , Hayhurst, J. , Malangone, C. , McMahon, A. , Morales, J. , Mountjoy, E. , Sollis, E. , Suveges, D. , Vrousgou, O. , Whetzel, P. L. , Amode, R. , Guillen, J. A. , Riat, H. S. , Trevanion, S. J. , Hall, P. , Junkins, H. , … Parkinson, H. (2019). The NHGRI‐EBI GWAS Catalog of published genome‐wide association studies, targeted arrays and summary statistics 2019. Nucleic Acids Research, 47(D1), D1005–D1012. 10.1093/nar/gky1120 [DOI] [PMC free article] [PubMed] [Google Scholar]
Chen, B. H. , Marioni, R. E. , Colicino, E. , Peters, M. J. , Ward‐Caviness, C. K. , Tsai, P.‐C. , Roetker, N. S. , Just, A. C. , Demerath, E. W. , Guan, W. , Bressler, J. , Fornage, M. , Studenski, S. , Vandiver, A. R. , Moore, A. Z. , Tanaka, T. , Kiel, D. P. , Liang, L. , Vokonas, P. , … Horvath, S. (2016). DNA methylation‐based measures of biological age: meta‐analysis predicting time to death. Aging (Albany NY), 8(9), 1844. 10.18632/aging.101020 [DOI] [PMC free article] [PubMed] [Google Scholar]
Ching, T. , Zhu, X. , & Garmire, L. X. (2018). Cox‐nnet: an artificial neural network method for prognosis prediction of high‐throughput omics data. PLoS Computational Biology, 14(4), e1006076. 10.1371/journal.pcbi.1006076 [DOI] [PMC free article] [PubMed] [Google Scholar]
Colicino, E. , Marioni, R. , Ward‐Caviness, C. , Gondalia, R. , Guan, W. , Chen, B. , Tsai, P.‐C. , Huan, T. , Xu, G. , Golareh, A. , Schwartz, J. , Vokonas, P. , Just, A. , Starr, J. M. , McRae, A. F. , Wray, N. R. , Visscher, P. M. , Bressler, J. , Zhang, W. , … Baccarelli, A. (2020). Blood DNA methylation sites predict death risk in a longitudinal study of 12,300 individuals. Aging (Albany NY), 12(14), 14092. 10.18632/aging.103408 [DOI] [PMC free article] [PubMed] [Google Scholar]
Deelen, J. , Evans, D. S. , Arking, D. E. , Tesi, N. , Nygaard, M. , Liu, X. , Wojczynski, M. K. , Biggs, M. L. , van der Spek, A. , Atzmon, G. , Ware, E. B. , Sarnowski, C. , Smith, A. V. , Seppälä, I. , Cordell, H. J. , Dose, J. , Amin, N. , Arnold, A. M. , Ayers, K. L. , … Murabito, J. M. (2019). A meta‐analysis of genome‐wide association studies identifies multiple longevity genes. Nature Communications, 10(1), 1–14. 10.1038/s41467-019-11558-2 [DOI] [PMC free article] [PubMed] [Google Scholar]
Devlin, B. , Roeder, K. , & Wasserman, L. (2001). Genomic control, a new approach to genetic‐based association studies. Theoretical Population Biology, 60(3), 155–166. 10.1006/tpbi.2001.1542 [DOI] [PubMed] [Google Scholar]
Dugué, P.‐A. , Bassett, J. K. , Joo, J. H. E. , Jung, C.‐H. , Ming Wong, E. E. , Moreno‐Betancur, M. , Schmidt, D. , Makalic, E. , Li, S. , Severi, G. , Hodge, A. M. , Buchanan, D. D. , English, D. R. , Hopper, J. L. , Southey, M. C. , Giles, G. G. , & Milne, R. L. (2018). DNA methylation‐based biological aging and cancer risk and survival: Pooled analysis of seven prospective studies. International Journal of Cancer, 142(8), 1611–1619. 10.1002/ijc.31189 [DOI] [PubMed] [Google Scholar]
Evangelou, E. , Warren, H. R. , Mosen‐Ansorena, D. , Mifsud, B. , Pazoki, R. , Gao, H. E. , Ntritsos, G. , Dimou, N. , Cabrera, C. P. , Karaman, I. , Ng, F. L. , Evangelou, M. , Witkowska, K. , Tzanis, E. , Hellwege, J. N. , Giri, A. , Velez Edwards, D. R. , Sun, Y. V. , Cho, K. , … Caulfield, M. J. (2018). Genetic analysis of over 1 million people identifies 535 new loci associated with blood pressure traits. Nature Genetics, 50(10), 1412–1425. 10.1038/s41588-018-0205-x [DOI] [PMC free article] [PubMed] [Google Scholar]
Feng, T. , Chen, Z. , Gu, J. , Wang, Y. , Zhang, J. , & Min, L. (2019). The clinical responses of TNIP2‐ALK fusion variants to crizotinib in ALK‐rearranged lung adenocarcinoma. Lung Cancer, 137, 19–22. 10.1016/j.lungcan.2019.08.032 [DOI] [PubMed] [Google Scholar]
Friedman, J. , Hastie, T. , & Tibshirani, R. (2010). Regularization paths for generalized linear models via coordinate descent. Journal of Statistical Software, 33(1), 1. 10.18637/jss.v033.i01 [DOI] [PMC free article] [PubMed] [Google Scholar]
Gong, J. , Li, Y. , Liu, C.‐J. , Xiang, Y. U. , Li, C. , Ye, Y. , Zhang, Z. , Hawke, D. H. , Park, P. K. , Diao, L. , Putkey, J. A. , Yang, L. , Guo, A.‐Y. , Lin, C. , & Han, L. (2017). A pan‐cancer analysis of the expression and clinical relevance of small nucleolar RNAs in human cancer. Cell Reports, 21(7), 1968–1981. 10.1016/j.celrep.2017.10.070 [DOI] [PubMed] [Google Scholar]
Hannum, G. , Guinney, J. , Zhao, L. , Zhang, L. I. , Hughes, G. , Sadda, S. V. , Klotzle, B. , Bibikova, M. , Fan, J.‐B. , Gao, Y. , Deconde, R. , Chen, M. , Rajapakse, I. , Friend, S. , Ideker, T. , & Zhang, K. (2013). Genome‐wide methylation profiles reveal quantitative views of human aging rates. Molecular Cell, 49(2), 359–367. 10.1016/j.molcel.2012.10.016 [DOI] [PMC free article] [PubMed] [Google Scholar]
Harrell, F. E. Jr , Lee, K. L. , & Mark, D. B. (1996). Multivariable prognostic models: issues in developing models, evaluating assumptions and adequacy, and measuring and reducing errors. Statistics in Medicine, 15(4), 361–387. [DOI] [PubMed] [Google Scholar]
Ho, J. E. , Lyass, A. , Courchesne, P. , Chen, G. , Liu, C. , Yin, X. , Hwang, S.‐J. , Massaro, J. M. , Larson, M. G. , & Levy, D. (2018). Protein biomarkers of cardiovascular disease and mortality in the community. Journal of the American Heart Association, 7(14), e008108. 10.1161/JAHA.117.008108 [DOI] [PMC free article] [PubMed] [Google Scholar]
Horvath, S. (2013). DNA methylation age of human tissues and cell types. Genome Biology, 14(10), 3156. 10.1186/gb-2013-14-10-r115 [DOI] [PMC free article] [PubMed] [Google Scholar]
Horvath, S. , Pirazzini, C. , Bacalini, M. G. , Gentilini, D. , Di Blasio, A. M. , Delledonne, M. , Mari, D. , Arosio, B. , Monti, D. , Passarino, G. , De Rango, F. , D'Aquila, P. , Giuliani, C. , Marasco, E. , Collino, S. , Descombes, P. , Garagnani, P. , & Franceschi, C. (2015). Decreased epigenetic age of PBMCs from Italian semi‐supercentenarians and their offspring. Aging (Albany NY), 7(12), 1159. 10.18632/aging.100861 [DOI] [PMC free article] [PubMed] [Google Scholar]
Huan, T. , Joehanes, R. , Song, C. I. , Peng, F. , Guo, Y. , Mendelson, M. , Yao, C. , Liu, C. , Ma, J. , Richard, M. , Agha, G. , Guan, W. , Almli, L. M. , Conneely, K. N. , Keefe, J. , Hwang, S.‐J. , Johnson, A. D. , Fornage, M. , Liang, L. , & Levy, D. (2019). Genome‐wide identification of DNA methylation QTLs in whole blood highlights pathways for cardiovascular disease. Nature Communications, 10(1), 1–14. 10.1038/s41467-019-12228-z [DOI] [PMC free article] [PubMed] [Google Scholar]
Ishwaran, H. , Kogalur, U. B. , Blackstone, E. H. , & Lauer, M. S. (2008). Random survival forests. The Annals of Applied Statistics, 2(3), 841–860. 10.1214/08-AOAS169 [DOI] [Google Scholar]
Jones, P. A. , & Takai, D. (2001). The role of DNA methylation in mammalian epigenetics. Science, 293(5532), 1068–1070. [DOI] [PubMed] [Google Scholar]
Kanehisa, M. , & Goto, S. (2000). KEGG: Kyoto encyclopedia of genes and genomes. Nucleic Acids Research, 28(1), 27–30. 10.1093/nar/28.1.27 [DOI] [PMC free article] [PubMed] [Google Scholar]
Kaplan, E. L. , & Meier, P. (1958). Nonparametric estimation from incomplete observations. Journal of the American Statistical Association, 53(282), 457–481. 10.1080/01621459.1958.10501452 [DOI] [Google Scholar]
Katzman, J. L. , Shaham, U. , Cloninger, A. , Bates, J. , Jiang, T. , & Kluger, Y. (2018). DeepSurv: personalized treatment recommender system using a Cox proportional hazards deep neural network. BMC Medical Research Methodology, 18(1), 24. 10.1186/s12874-018-0482-1 [DOI] [PMC free article] [PubMed] [Google Scholar]
Levine, M. E. , Lu, A. T. , Quach, A. , Chen, B. H. , Assimes, T. L. , Bandinelli, S. , Hou, L. , Baccarelli, A. A. , Stewart, J. D. , Li, Y. , Whitsel, E. A. , Wilson, J. G. , Reiner, A. P. , Aviv, A. , Lohman, K. , Liu, Y. , Ferrucci, L. , & Horvath, S. (2018). An epigenetic biomarker of aging for lifespan and healthspan. Aging (Albany NY), 10(4), 573. 10.18632/aging.101414 [DOI] [PMC free article] [PubMed] [Google Scholar]
Locke, A. E. , Kahali, B. , Berndt, S. I. , Justice, A. E. , Pers, T. H. , Day, F. R. , & Yang, J. (2015). Genetic studies of body mass index yield new insights for obesity biology. Nature, 518(7538), 197–206. [DOI] [PMC free article] [PubMed] [Google Scholar]
Lu, A. T. , Quach, A. , Wilson, J. G. , Reiner, A. P. , Aviv, A. , Raj, K. , Hou, L. , Baccarelli, A. A. , Li, Y. , Stewart, J. D. , Whitsel, E. A. , Assimes, T. L. , Ferrucci, L. , & Horvath, S. (2019). DNA methylation GrimAge strongly predicts lifespan and healthspan. Aging (Albany NY), 11(2), 303. 10.18632/aging.101684 [DOI] [PMC free article] [PubMed] [Google Scholar]
Maksimovic, J. , Gordon, L. , & Oshlack, A. (2012). SWAN: Subset‐quantile within array normalization for illumina infinium HumanMethylation450 BeadChips. Genome Biology, 13(6), 1–12. 10.1186/gb-2012-13-6-r44 [DOI] [PMC free article] [PubMed] [Google Scholar]
Maksimovic, J. , Oshlack, A. , & Phipson, B. (2021). Gene set enrichment analysis for genome‐wide DNA methylation data. Genome Biology, 22(1), 1–26. 10.1186/s13059-021-02388-x [DOI] [PMC free article] [PubMed] [Google Scholar]
Marioni, R. E. , Shah, S. , McRae, A. F. , Chen, B. H. , Colicino, E. , Harris, S. E. , Gibson, J. , Henders, A. K. , Redmond, P. , Cox, S. R. , Pattie, A. , Corley, J. , Murphy, L. , Martin, N. G. , Montgomery, G. W. , Feinberg, A. P. , Fallin, M. D. , Multhaup, M. L. , Jaffe, A. E. , … Deary, I. J. (2015). DNA methylation age of blood predicts all‐cause mortality in later life. Genome Biology, 16(1), 25. 10.1186/s13059-015-0584-6 [DOI] [PMC free article] [PubMed] [Google Scholar]
Marioni, R. E. , Shah, S. , McRae, A. F. , Ritchie, S. J. , Muniz‐Terrera, G. , Harris, S. E. , Gibson, J. , Redmond, P. , Cox, S. R. , Pattie, A. , Corley, J. , Taylor, A. , Murphy, L. , Starr, J. M. , Horvath, S. , Visscher, P. M. , Wray, N. R. , & Deary, I. J. (2015). The epigenetic clock is correlated with physical and cognitive fitness in the Lothian Birth Cohort 1936. International Journal of Epidemiology, 44(4), 1388–1396. 10.1093/ije/dyu277 [DOI] [PMC free article] [PubMed] [Google Scholar]
Mendelson, M. M. , Marioni, R. E. , Joehanes, R. , Liu, C. , Hedman, Å. K. , Aslibekyan, S. , Demerath, E. W. , Guan, W. , Zhi, D. , Yao, C. , Huan, T. , Willinger, C. , Chen, B. , Courchesne, P. , Multhaup, M. , Irvin, M. R. , Cohain, A. , Schadt, E. E. , Grove, M. L. , … Deary, I. J. (2017). Association of body mass index with DNA methylation and gene expression in blood cells and relations to cardiometabolic disease: a Mendelian randomization approach. PLoS Med, 14(1), e1002215. 10.1371/journal.pmed.1002215 [DOI] [PMC free article] [PubMed] [Google Scholar]
Michailidou, K. , Lindström, S. , Dennis, J. , Beesley, J. , Hui, S. , Kar, S. , Lemaçon, A. , Soucy, P. , Glubb, D. , Rostamianfar, A. , Bolla, M. K. , Wang, Q. , Tyrer, J. , Dicks, E. D. , Lee, A. , Wang, Z. , Allen, J. , Keeman, R. , Eilber, U. , … Easton, D. F. (2017). Association analysis identifies 65 new breast cancer risk loci. Nature, 551(7678), 92. 10.1038/nature24284 [DOI] [PMC free article] [PubMed] [Google Scholar]
Nikpay, M. , Goel, A. , Won, H.‐H. , Hall, L. M. , Willenborg, C. , Kanoni, S. , & Hopewell, J. C. (2015). A comprehensive 1000 Genomes–based genome‐wide association meta‐analysis of coronary artery disease. Nature Genetics, 47(10), 1121. [DOI] [PMC free article] [PubMed] [Google Scholar]
Palmer, T. M. , Lawlor, D. A. , Harbord, R. M. , Sheehan, N. A. , Tobias, J. H. , Timpson, N. J. , Smith, G. D. , & Sterne, J. A. C. (2012). Using multiple genetic variants as instrumental variables for modifiable risk factors. Statistical Methods in Medical Research, 21(3), 223–242. 10.1177/0962280210394459 [DOI] [PMC free article] [PubMed] [Google Scholar]
Patterson, D. G. , Roberts, J. T. , King, V. M. , Houserova, D. , Barnhill, E. C. , Crucello, A. , Polska, C. J. , Brantley, L. W. , Kaufman, G. C. , Nguyen, M. , Santana, M. W. , Schiller, I. A. , Spicciani, J. S. , Zapata, A. K. , Miller, M. M. , Sherman, T. D. , Ma, R. , Zhao, H. , Arora, R. , … Borchert, G. M. (2017). Human snoRNA‐93 is processed into a microRNA‐like RNA that promotes breast cancer cell invasion. NPJ Breast Cancer, 3(1), 1–12. 10.1038/s41523-017-0032-8 [DOI] [PMC free article] [PubMed] [Google Scholar]
Phelan, C. M. , Kuchenbaecker, K. B. , Tyrer, J. P. , Kar, S. P. , Lawrenson, K. , Winham, S. J. , Dennis, J. , Pirie, A. , Riggan, M. J. , Chornokur, G. , Earp, M. A. , Lyra, P. C. , Lee, J. M. , Coetzee, S. , Beesley, J. , McGuffog, L. , Soucy, P. , Dicks, E. D. , Lee, A. , … Pharoah, P. D. P. (2017). Identification of 12 new susceptibility loci for different histotypes of epithelial ovarian cancer. Nature Genetics, 49(5), 680. 10.1038/ng.3826 [DOI] [PMC free article] [PubMed] [Google Scholar]
Pidsley, R. , Wong, C. C. , Volta, M. , Lunnon, K. , Mill, J. , & Schalkwyk, L. C. (2013). A data‐driven approach to preprocessing Illumina 450K methylation array data. BMC Genomics, 14(1), 1–10. 10.1186/1471-2164-14-293 [DOI] [PMC free article] [PubMed] [Google Scholar]
Pilling, L. C. , Kuo, C.‐L. , Sicinski, K. , Tamosauskaite, J. , Kuchel, G. A. , Harries, L. W. , Herd, P. , Wallace, R. , Ferrucci, L. , & Melzer, D. (2017). Human longevity: 25 genetic loci associated in 389,166 UK biobank participants. Aging (Albany NY), 9(12), 2504. 10.18632/aging.101334 [DOI] [PMC free article] [PubMed] [Google Scholar]
Schumacher, F. R. , Al Olama, A. A. , Berndt, S. I. , Benlloch, S. , Ahmed, M. , Saunders, E. J. , Dadaev, T. , Leongamornlert, D. , Anokian, E. , Cieza‐Borrella, C. , Goh, C. , Brook, M. N. , Sheng, X. , Fachal, L. , Dennis, J. , Tyrer, J. , Muir, K. , Lophatananon, A. , Stevens, V. L. , … Eeles, R. A. (2018). Association analyses of more than 140,000 men identify 63 new prostate cancer susceptibility loci. Nature Genetics, 50(7), 928. 10.1038/s41588-018-0142-8 [DOI] [PMC free article] [PubMed] [Google Scholar]
Scott, R. A. , Scott, L. J. , Mägi, R. , Marullo, L. , Gaulton, K. J. , Kaakinen, M. , Pervjakova, N. , Pers, T. H. , Johnson, A. D. , Eicher, J. D. , Jackson, A. U. , Ferreira, T. , Lee, Y. , Ma, C. , Steinthorsdottir, V. , Thorleifsson, G. , Qi, L. U. , Van Zuydam, N. R. , Mahajan, A. , … Prokopenko, I. (2017). An expanded genome‐wide association study of type 2 diabetes in Europeans. Diabetes, 66(11), 2888–2902. 10.2337/db16-1253 [DOI] [PMC free article] [PubMed] [Google Scholar]
Shah, R. V. , Hwang, S.‐J. , Yeri, A. , Tanriverdi, K. , Pico, A. R. , Yao, C. , Murthy, V. , Ho, J. , Vitseva, O. , Demarco, D. , Shah, S. , Iafrati, M. D. , Levy, D. , & Freedman, J. E. (2019). Proteins altered by surgical weight loss highlight biomarkers of insulin resistance in the community. Arteriosclerosis, Thrombosis, and Vascular Biology, 39(1), 107–115. 10.1161/ATVBAHA.118.311928 [DOI] [PMC free article] [PubMed] [Google Scholar]
Svane, A. , Soerensen, M. , Lund, J. , Tan, Q. , Jylhävä, J. , Wang, Y. , Pedersen, N. , Hägg, S. , Debrabant, B. , Deary, I. , Christensen, K. , Christiansen, L. , & Hjelmborg, J. (2018). DNA methylation and all‐cause mortality in middle‐aged and elderly Danish twins. Genes, 9(2), 78. 10.3390/genes9020078 [DOI] [PMC free article] [PubMed] [Google Scholar]
Teschendorff, A. E. , Marabita, F. , Lechner, M. , Bartlett, T. , Tegner, J. , Gomez‐Cabrero, D. , & Beck, S. (2013). A beta‐mixture quantile normalization method for correcting probe design bias in Illumina Infinium 450 k DNA methylation data. Bioinformatics, 29(2), 189–196. 10.1093/bioinformatics/bts680 [DOI] [PMC free article] [PubMed] [Google Scholar]
Timmers, P. R. H. J. , Mounier, N. , Lall, K. , Fischer, K. , Ning, Z. , Feng, X. , Bretherick, A. D. , Clark, D. W. , Agbessi, M. , Ahsan, H. , Alves, I. , Andiappan, A. , Awadalla, P. , Battle, A. , Bonder, M. J. , Boomsma, D. , Christiansen, M. , Claringbould, A. , Deelen, P. , … Joshi, P. K. (2019). Genomics of 1 million parent lifespans implicates novel pathways and common diseases and distinguishes survival chances. eLife, 8, e39856. 10.7554/eLife.39856 [DOI] [PMC free article] [PubMed] [Google Scholar]
Triche, T. J. Jr , Weisenberger, D. J. , Van Den Berg, D. , Laird, P. W. , & Siegmund, K. D. (2013). Low‐level processing of Illumina Infinium DNA methylation beadarrays. Nucleic Acids Research, 41(7), e90. [DOI] [PMC free article] [PubMed] [Google Scholar]
van den Berg, N. , Beekman, M. , Smith, K. R. , Janssens, A. , & Slagboom, P. E. (2017). Historical demography and longevity genetics: Back to the future. Ageing Research Reviews, 38, 28–39. 10.1016/j.arr.2017.06.005 [DOI] [PubMed] [Google Scholar]
Van Eyken, E. , Van Laer, L. , Fransen, E. , Topsakal, V. , Lemkens, N. , Laureys, W. , Nelissen, N. , Vandevelde, A. , & Wienker, T. & Van De Heyning, P. (2006). KCNQ4: a gene for age‐related hearing impairment? Human Mutation, 27(10), 1007–1016. [DOI] [PubMed] [Google Scholar]
Walter, S. , Atzmon, G. , Demerath, E. W. , Garcia, M. E. , Kaplan, R. C. , Kumari, M. , Lunetta, K. L. , Milaneschi, Y. , Tanaka, T. , Tranah, G. J. , Völker, U. , Yu, L. , Arnold, A. , Benjamin, E. J. , Biffar, R. , Buchman, A. S. , Boerwinkle, E. , Couper, D. , De Jager, P. L. , … Franceschini, N. (2011). A genome‐wide association study of aging. Neurobiology of Aging, 32(11), 2109, e2115–2109. e2128. 10.1016/j.neurobiolaging.2011.05.026 [DOI] [PMC free article] [PubMed] [Google Scholar]
Wang, Y. , McKay, J. D. , Rafnar, T. , Wang, Z. , Timofeeva, M. N. , Broderick, P. , Zong, X. , Laplana, M. , Wei, Y. , Han, Y. , Lloyd, A. , Delahaye‐Sourdeix, M. , Chubb, D. , Gaborieau, V. , Wheeler, W. , Chatterjee, N. , Thorleifsson, G. , Sulem, P. , Liu, G. , … Amos, C. I. (2014). Rare variants of large effect in BRCA2 and CHEK2 affect risk of lung cancer. Nature Genetics, 46(7), 736. 10.1038/ng.3002 [DOI] [PMC free article] [PubMed] [Google Scholar]
Willer, C. J. , Schmidt, E. M. , Sengupta, S. , Peloso, G. M. , Gustafsson, S. , Kanoni, S. , & Mora, S. (2013). Discovery and refinement of loci associated with lipid levels. Nature Genetics, 45(11), 1274. [DOI] [PMC free article] [PubMed] [Google Scholar]
Yao, C. , Chen, G. , Song, C. , Keefe, J. , Mendelson, M. , Huan, T. , & Wu, H. (2018). Genome‐wide mapping of plasma protein QTLs identifies putatively causal genes and pathways for cardiovascular disease. Nature Communications, 9(1), 1–11. [DOI] [PMC free article] [PubMed] [Google Scholar]
Zhang, Y. , Wilson, R. , Heiss, J. , Breitling, L. P. , Saum, K.‐U. , Schöttker, B. , Holleczek, B. , Waldenberger, M. , Peters, A. , & Brenner, H. (2017). DNA methylation signatures in peripheral blood strongly predict all‐cause mortality. Nature Communications, 8(1), 1–11. 10.1038/ncomms14617 [DOI] [PMC free article] [PubMed] [Google Scholar]
Zhao, Y. , Yan, Y. , Ma, R. , Lv, X. , Zhang, L. , Wang, J. , Zhu, W. , Zhao, L. , Jiang, L. , Zhao, L. , Wen, L. , Yang, B. O. , Chen, Y. , He, M. , Liu, M. , & Wei, M. (2020). Expression signature of six‐snoRNA serves as novel non‐invasive biomarker for diagnosis and prognosis prediction of renal clear cell carcinoma. Journal of Cellular and Molecular Medicine, 24(3), 2215–2228. 10.1111/jcmm.14886 [DOI] [PMC free article] [PubMed] [Google Scholar]

Associated Data

This section collects any data citations, data availability statements, or supplementary materials included in this article.

Supplementary Materials

Fig S1‐3

Click here for additional data file.^{(1.1MB, docx)}

Tables S1‐S4

Click here for additional data file.^{(990.9KB, xlsx)}

Data Availability Statement

[acel13608-bib-0001] Ashburner, M. , Ball, C. A. , Blake, J. A. , Botstein, D. , Butler, H. , Cherry, J. M. , Davis, A. P. , Dolinski, K. , Dwight, S. S. , Eppig, J. T. , Harris, M. A. , Hill, D. P. , Issel‐Tarver, L. , Kasarskis, A. , Lewis, S. , Matese, J. C. , Richardson, J. E. , Ringwald, M. , Rubin, G. M. , & Sherlock, G. (2000). Gene ontology: tool for the unification of biology. Nature Genetics, 25(1), 25–29. 10.1038/75556 [DOI] [PMC free article] [PubMed] [Google Scholar]

[acel13608-bib-0002] Battram, T. , Yousefi, P. , Crawford, G. , Prince, C. , Babei, M. S. , Sharp, G. , Hatcher, C. , Vega‐Salas, M. J. , Khodabakhsh, S. , Whitehurst, O. , Langdon, R. , Mahoney, L. , Elliott, H. R. , Mancano, G. , Lee, M. A. , Watkins, S. H. , Lay, A. C. , Hemani, G. , Gaunt, T. R. , … Whitehurst, O. (2022). The EWAS Catalog: A database of epigenome‐wide association studies. Wellcome Open Research, 7(41), 41. 10.1093/nar/gky993 [DOI] [PMC free article] [PubMed] [Google Scholar]

[acel13608-bib-0003] Brier, G. W. (1950). Verification of forecasts expressed in terms of probability. Monthly Weather Review, 78(1), 1–3. [Google Scholar]

[acel13608-bib-0004] Buniello, A. , MacArthur, J. A. L. , Cerezo, M. , Harris, L. W. , Hayhurst, J. , Malangone, C. , McMahon, A. , Morales, J. , Mountjoy, E. , Sollis, E. , Suveges, D. , Vrousgou, O. , Whetzel, P. L. , Amode, R. , Guillen, J. A. , Riat, H. S. , Trevanion, S. J. , Hall, P. , Junkins, H. , … Parkinson, H. (2019). The NHGRI‐EBI GWAS Catalog of published genome‐wide association studies, targeted arrays and summary statistics 2019. Nucleic Acids Research, 47(D1), D1005–D1012. 10.1093/nar/gky1120 [DOI] [PMC free article] [PubMed] [Google Scholar]

[acel13608-bib-0005] Chen, B. H. , Marioni, R. E. , Colicino, E. , Peters, M. J. , Ward‐Caviness, C. K. , Tsai, P.‐C. , Roetker, N. S. , Just, A. C. , Demerath, E. W. , Guan, W. , Bressler, J. , Fornage, M. , Studenski, S. , Vandiver, A. R. , Moore, A. Z. , Tanaka, T. , Kiel, D. P. , Liang, L. , Vokonas, P. , … Horvath, S. (2016). DNA methylation‐based measures of biological age: meta‐analysis predicting time to death. Aging (Albany NY), 8(9), 1844. 10.18632/aging.101020 [DOI] [PMC free article] [PubMed] [Google Scholar]

[acel13608-bib-0006] Ching, T. , Zhu, X. , & Garmire, L. X. (2018). Cox‐nnet: an artificial neural network method for prognosis prediction of high‐throughput omics data. PLoS Computational Biology, 14(4), e1006076. 10.1371/journal.pcbi.1006076 [DOI] [PMC free article] [PubMed] [Google Scholar]

[acel13608-bib-0007] Colicino, E. , Marioni, R. , Ward‐Caviness, C. , Gondalia, R. , Guan, W. , Chen, B. , Tsai, P.‐C. , Huan, T. , Xu, G. , Golareh, A. , Schwartz, J. , Vokonas, P. , Just, A. , Starr, J. M. , McRae, A. F. , Wray, N. R. , Visscher, P. M. , Bressler, J. , Zhang, W. , … Baccarelli, A. (2020). Blood DNA methylation sites predict death risk in a longitudinal study of 12,300 individuals. Aging (Albany NY), 12(14), 14092. 10.18632/aging.103408 [DOI] [PMC free article] [PubMed] [Google Scholar]

[acel13608-bib-0008] Deelen, J. , Evans, D. S. , Arking, D. E. , Tesi, N. , Nygaard, M. , Liu, X. , Wojczynski, M. K. , Biggs, M. L. , van der Spek, A. , Atzmon, G. , Ware, E. B. , Sarnowski, C. , Smith, A. V. , Seppälä, I. , Cordell, H. J. , Dose, J. , Amin, N. , Arnold, A. M. , Ayers, K. L. , … Murabito, J. M. (2019). A meta‐analysis of genome‐wide association studies identifies multiple longevity genes. Nature Communications, 10(1), 1–14. 10.1038/s41467-019-11558-2 [DOI] [PMC free article] [PubMed] [Google Scholar]

[acel13608-bib-0009] Devlin, B. , Roeder, K. , & Wasserman, L. (2001). Genomic control, a new approach to genetic‐based association studies. Theoretical Population Biology, 60(3), 155–166. 10.1006/tpbi.2001.1542 [DOI] [PubMed] [Google Scholar]

[acel13608-bib-0010] Dugué, P.‐A. , Bassett, J. K. , Joo, J. H. E. , Jung, C.‐H. , Ming Wong, E. E. , Moreno‐Betancur, M. , Schmidt, D. , Makalic, E. , Li, S. , Severi, G. , Hodge, A. M. , Buchanan, D. D. , English, D. R. , Hopper, J. L. , Southey, M. C. , Giles, G. G. , & Milne, R. L. (2018). DNA methylation‐based biological aging and cancer risk and survival: Pooled analysis of seven prospective studies. International Journal of Cancer, 142(8), 1611–1619. 10.1002/ijc.31189 [DOI] [PubMed] [Google Scholar]

[acel13608-bib-0011] Evangelou, E. , Warren, H. R. , Mosen‐Ansorena, D. , Mifsud, B. , Pazoki, R. , Gao, H. E. , Ntritsos, G. , Dimou, N. , Cabrera, C. P. , Karaman, I. , Ng, F. L. , Evangelou, M. , Witkowska, K. , Tzanis, E. , Hellwege, J. N. , Giri, A. , Velez Edwards, D. R. , Sun, Y. V. , Cho, K. , … Caulfield, M. J. (2018). Genetic analysis of over 1 million people identifies 535 new loci associated with blood pressure traits. Nature Genetics, 50(10), 1412–1425. 10.1038/s41588-018-0205-x [DOI] [PMC free article] [PubMed] [Google Scholar]

[acel13608-bib-0012] Feng, T. , Chen, Z. , Gu, J. , Wang, Y. , Zhang, J. , & Min, L. (2019). The clinical responses of TNIP2‐ALK fusion variants to crizotinib in ALK‐rearranged lung adenocarcinoma. Lung Cancer, 137, 19–22. 10.1016/j.lungcan.2019.08.032 [DOI] [PubMed] [Google Scholar]

[acel13608-bib-0013] Friedman, J. , Hastie, T. , & Tibshirani, R. (2010). Regularization paths for generalized linear models via coordinate descent. Journal of Statistical Software, 33(1), 1. 10.18637/jss.v033.i01 [DOI] [PMC free article] [PubMed] [Google Scholar]

[acel13608-bib-0014] Gong, J. , Li, Y. , Liu, C.‐J. , Xiang, Y. U. , Li, C. , Ye, Y. , Zhang, Z. , Hawke, D. H. , Park, P. K. , Diao, L. , Putkey, J. A. , Yang, L. , Guo, A.‐Y. , Lin, C. , & Han, L. (2017). A pan‐cancer analysis of the expression and clinical relevance of small nucleolar RNAs in human cancer. Cell Reports, 21(7), 1968–1981. 10.1016/j.celrep.2017.10.070 [DOI] [PubMed] [Google Scholar]

[acel13608-bib-0015] Hannum, G. , Guinney, J. , Zhao, L. , Zhang, L. I. , Hughes, G. , Sadda, S. V. , Klotzle, B. , Bibikova, M. , Fan, J.‐B. , Gao, Y. , Deconde, R. , Chen, M. , Rajapakse, I. , Friend, S. , Ideker, T. , & Zhang, K. (2013). Genome‐wide methylation profiles reveal quantitative views of human aging rates. Molecular Cell, 49(2), 359–367. 10.1016/j.molcel.2012.10.016 [DOI] [PMC free article] [PubMed] [Google Scholar]

[acel13608-bib-0016] Harrell, F. E. Jr , Lee, K. L. , & Mark, D. B. (1996). Multivariable prognostic models: issues in developing models, evaluating assumptions and adequacy, and measuring and reducing errors. Statistics in Medicine, 15(4), 361–387. [DOI] [PubMed] [Google Scholar]

[acel13608-bib-0017] Ho, J. E. , Lyass, A. , Courchesne, P. , Chen, G. , Liu, C. , Yin, X. , Hwang, S.‐J. , Massaro, J. M. , Larson, M. G. , & Levy, D. (2018). Protein biomarkers of cardiovascular disease and mortality in the community. Journal of the American Heart Association, 7(14), e008108. 10.1161/JAHA.117.008108 [DOI] [PMC free article] [PubMed] [Google Scholar]

[acel13608-bib-0018] Horvath, S. (2013). DNA methylation age of human tissues and cell types. Genome Biology, 14(10), 3156. 10.1186/gb-2013-14-10-r115 [DOI] [PMC free article] [PubMed] [Google Scholar]

[acel13608-bib-0019] Horvath, S. , Pirazzini, C. , Bacalini, M. G. , Gentilini, D. , Di Blasio, A. M. , Delledonne, M. , Mari, D. , Arosio, B. , Monti, D. , Passarino, G. , De Rango, F. , D'Aquila, P. , Giuliani, C. , Marasco, E. , Collino, S. , Descombes, P. , Garagnani, P. , & Franceschi, C. (2015). Decreased epigenetic age of PBMCs from Italian semi‐supercentenarians and their offspring. Aging (Albany NY), 7(12), 1159. 10.18632/aging.100861 [DOI] [PMC free article] [PubMed] [Google Scholar]

[acel13608-bib-0020] Huan, T. , Joehanes, R. , Song, C. I. , Peng, F. , Guo, Y. , Mendelson, M. , Yao, C. , Liu, C. , Ma, J. , Richard, M. , Agha, G. , Guan, W. , Almli, L. M. , Conneely, K. N. , Keefe, J. , Hwang, S.‐J. , Johnson, A. D. , Fornage, M. , Liang, L. , & Levy, D. (2019). Genome‐wide identification of DNA methylation QTLs in whole blood highlights pathways for cardiovascular disease. Nature Communications, 10(1), 1–14. 10.1038/s41467-019-12228-z [DOI] [PMC free article] [PubMed] [Google Scholar]

[acel13608-bib-0021] Ishwaran, H. , Kogalur, U. B. , Blackstone, E. H. , & Lauer, M. S. (2008). Random survival forests. The Annals of Applied Statistics, 2(3), 841–860. 10.1214/08-AOAS169 [DOI] [Google Scholar]

[acel13608-bib-0022] Jones, P. A. , & Takai, D. (2001). The role of DNA methylation in mammalian epigenetics. Science, 293(5532), 1068–1070. [DOI] [PubMed] [Google Scholar]

[acel13608-bib-0023] Kanehisa, M. , & Goto, S. (2000). KEGG: Kyoto encyclopedia of genes and genomes. Nucleic Acids Research, 28(1), 27–30. 10.1093/nar/28.1.27 [DOI] [PMC free article] [PubMed] [Google Scholar]

[acel13608-bib-0024] Kaplan, E. L. , & Meier, P. (1958). Nonparametric estimation from incomplete observations. Journal of the American Statistical Association, 53(282), 457–481. 10.1080/01621459.1958.10501452 [DOI] [Google Scholar]

[acel13608-bib-0025] Katzman, J. L. , Shaham, U. , Cloninger, A. , Bates, J. , Jiang, T. , & Kluger, Y. (2018). DeepSurv: personalized treatment recommender system using a Cox proportional hazards deep neural network. BMC Medical Research Methodology, 18(1), 24. 10.1186/s12874-018-0482-1 [DOI] [PMC free article] [PubMed] [Google Scholar]

[acel13608-bib-0026] Levine, M. E. , Lu, A. T. , Quach, A. , Chen, B. H. , Assimes, T. L. , Bandinelli, S. , Hou, L. , Baccarelli, A. A. , Stewart, J. D. , Li, Y. , Whitsel, E. A. , Wilson, J. G. , Reiner, A. P. , Aviv, A. , Lohman, K. , Liu, Y. , Ferrucci, L. , & Horvath, S. (2018). An epigenetic biomarker of aging for lifespan and healthspan. Aging (Albany NY), 10(4), 573. 10.18632/aging.101414 [DOI] [PMC free article] [PubMed] [Google Scholar]

[acel13608-bib-0027] Locke, A. E. , Kahali, B. , Berndt, S. I. , Justice, A. E. , Pers, T. H. , Day, F. R. , & Yang, J. (2015). Genetic studies of body mass index yield new insights for obesity biology. Nature, 518(7538), 197–206. [DOI] [PMC free article] [PubMed] [Google Scholar]

[acel13608-bib-0028] Lu, A. T. , Quach, A. , Wilson, J. G. , Reiner, A. P. , Aviv, A. , Raj, K. , Hou, L. , Baccarelli, A. A. , Li, Y. , Stewart, J. D. , Whitsel, E. A. , Assimes, T. L. , Ferrucci, L. , & Horvath, S. (2019). DNA methylation GrimAge strongly predicts lifespan and healthspan. Aging (Albany NY), 11(2), 303. 10.18632/aging.101684 [DOI] [PMC free article] [PubMed] [Google Scholar]

[acel13608-bib-0029] Maksimovic, J. , Gordon, L. , & Oshlack, A. (2012). SWAN: Subset‐quantile within array normalization for illumina infinium HumanMethylation450 BeadChips. Genome Biology, 13(6), 1–12. 10.1186/gb-2012-13-6-r44 [DOI] [PMC free article] [PubMed] [Google Scholar]

[acel13608-bib-0030] Maksimovic, J. , Oshlack, A. , & Phipson, B. (2021). Gene set enrichment analysis for genome‐wide DNA methylation data. Genome Biology, 22(1), 1–26. 10.1186/s13059-021-02388-x [DOI] [PMC free article] [PubMed] [Google Scholar]

[acel13608-bib-0031] Marioni, R. E. , Shah, S. , McRae, A. F. , Chen, B. H. , Colicino, E. , Harris, S. E. , Gibson, J. , Henders, A. K. , Redmond, P. , Cox, S. R. , Pattie, A. , Corley, J. , Murphy, L. , Martin, N. G. , Montgomery, G. W. , Feinberg, A. P. , Fallin, M. D. , Multhaup, M. L. , Jaffe, A. E. , … Deary, I. J. (2015). DNA methylation age of blood predicts all‐cause mortality in later life. Genome Biology, 16(1), 25. 10.1186/s13059-015-0584-6 [DOI] [PMC free article] [PubMed] [Google Scholar]

[acel13608-bib-0032] Marioni, R. E. , Shah, S. , McRae, A. F. , Ritchie, S. J. , Muniz‐Terrera, G. , Harris, S. E. , Gibson, J. , Redmond, P. , Cox, S. R. , Pattie, A. , Corley, J. , Taylor, A. , Murphy, L. , Starr, J. M. , Horvath, S. , Visscher, P. M. , Wray, N. R. , & Deary, I. J. (2015). The epigenetic clock is correlated with physical and cognitive fitness in the Lothian Birth Cohort 1936. International Journal of Epidemiology, 44(4), 1388–1396. 10.1093/ije/dyu277 [DOI] [PMC free article] [PubMed] [Google Scholar]

[acel13608-bib-0033] Mendelson, M. M. , Marioni, R. E. , Joehanes, R. , Liu, C. , Hedman, Å. K. , Aslibekyan, S. , Demerath, E. W. , Guan, W. , Zhi, D. , Yao, C. , Huan, T. , Willinger, C. , Chen, B. , Courchesne, P. , Multhaup, M. , Irvin, M. R. , Cohain, A. , Schadt, E. E. , Grove, M. L. , … Deary, I. J. (2017). Association of body mass index with DNA methylation and gene expression in blood cells and relations to cardiometabolic disease: a Mendelian randomization approach. PLoS Med, 14(1), e1002215. 10.1371/journal.pmed.1002215 [DOI] [PMC free article] [PubMed] [Google Scholar]

[acel13608-bib-0034] Michailidou, K. , Lindström, S. , Dennis, J. , Beesley, J. , Hui, S. , Kar, S. , Lemaçon, A. , Soucy, P. , Glubb, D. , Rostamianfar, A. , Bolla, M. K. , Wang, Q. , Tyrer, J. , Dicks, E. D. , Lee, A. , Wang, Z. , Allen, J. , Keeman, R. , Eilber, U. , … Easton, D. F. (2017). Association analysis identifies 65 new breast cancer risk loci. Nature, 551(7678), 92. 10.1038/nature24284 [DOI] [PMC free article] [PubMed] [Google Scholar]

[acel13608-bib-0035] Nikpay, M. , Goel, A. , Won, H.‐H. , Hall, L. M. , Willenborg, C. , Kanoni, S. , & Hopewell, J. C. (2015). A comprehensive 1000 Genomes–based genome‐wide association meta‐analysis of coronary artery disease. Nature Genetics, 47(10), 1121. [DOI] [PMC free article] [PubMed] [Google Scholar]

[acel13608-bib-0036] Palmer, T. M. , Lawlor, D. A. , Harbord, R. M. , Sheehan, N. A. , Tobias, J. H. , Timpson, N. J. , Smith, G. D. , & Sterne, J. A. C. (2012). Using multiple genetic variants as instrumental variables for modifiable risk factors. Statistical Methods in Medical Research, 21(3), 223–242. 10.1177/0962280210394459 [DOI] [PMC free article] [PubMed] [Google Scholar]

[acel13608-bib-0037] Patterson, D. G. , Roberts, J. T. , King, V. M. , Houserova, D. , Barnhill, E. C. , Crucello, A. , Polska, C. J. , Brantley, L. W. , Kaufman, G. C. , Nguyen, M. , Santana, M. W. , Schiller, I. A. , Spicciani, J. S. , Zapata, A. K. , Miller, M. M. , Sherman, T. D. , Ma, R. , Zhao, H. , Arora, R. , … Borchert, G. M. (2017). Human snoRNA‐93 is processed into a microRNA‐like RNA that promotes breast cancer cell invasion. NPJ Breast Cancer, 3(1), 1–12. 10.1038/s41523-017-0032-8 [DOI] [PMC free article] [PubMed] [Google Scholar]

[acel13608-bib-0038] Phelan, C. M. , Kuchenbaecker, K. B. , Tyrer, J. P. , Kar, S. P. , Lawrenson, K. , Winham, S. J. , Dennis, J. , Pirie, A. , Riggan, M. J. , Chornokur, G. , Earp, M. A. , Lyra, P. C. , Lee, J. M. , Coetzee, S. , Beesley, J. , McGuffog, L. , Soucy, P. , Dicks, E. D. , Lee, A. , … Pharoah, P. D. P. (2017). Identification of 12 new susceptibility loci for different histotypes of epithelial ovarian cancer. Nature Genetics, 49(5), 680. 10.1038/ng.3826 [DOI] [PMC free article] [PubMed] [Google Scholar]

[acel13608-bib-0039] Pidsley, R. , Wong, C. C. , Volta, M. , Lunnon, K. , Mill, J. , & Schalkwyk, L. C. (2013). A data‐driven approach to preprocessing Illumina 450K methylation array data. BMC Genomics, 14(1), 1–10. 10.1186/1471-2164-14-293 [DOI] [PMC free article] [PubMed] [Google Scholar]

[acel13608-bib-0040] Pilling, L. C. , Kuo, C.‐L. , Sicinski, K. , Tamosauskaite, J. , Kuchel, G. A. , Harries, L. W. , Herd, P. , Wallace, R. , Ferrucci, L. , & Melzer, D. (2017). Human longevity: 25 genetic loci associated in 389,166 UK biobank participants. Aging (Albany NY), 9(12), 2504. 10.18632/aging.101334 [DOI] [PMC free article] [PubMed] [Google Scholar]

[acel13608-bib-0041] Schumacher, F. R. , Al Olama, A. A. , Berndt, S. I. , Benlloch, S. , Ahmed, M. , Saunders, E. J. , Dadaev, T. , Leongamornlert, D. , Anokian, E. , Cieza‐Borrella, C. , Goh, C. , Brook, M. N. , Sheng, X. , Fachal, L. , Dennis, J. , Tyrer, J. , Muir, K. , Lophatananon, A. , Stevens, V. L. , … Eeles, R. A. (2018). Association analyses of more than 140,000 men identify 63 new prostate cancer susceptibility loci. Nature Genetics, 50(7), 928. 10.1038/s41588-018-0142-8 [DOI] [PMC free article] [PubMed] [Google Scholar]

[acel13608-bib-0042] Scott, R. A. , Scott, L. J. , Mägi, R. , Marullo, L. , Gaulton, K. J. , Kaakinen, M. , Pervjakova, N. , Pers, T. H. , Johnson, A. D. , Eicher, J. D. , Jackson, A. U. , Ferreira, T. , Lee, Y. , Ma, C. , Steinthorsdottir, V. , Thorleifsson, G. , Qi, L. U. , Van Zuydam, N. R. , Mahajan, A. , … Prokopenko, I. (2017). An expanded genome‐wide association study of type 2 diabetes in Europeans. Diabetes, 66(11), 2888–2902. 10.2337/db16-1253 [DOI] [PMC free article] [PubMed] [Google Scholar]

[acel13608-bib-0043] Shah, R. V. , Hwang, S.‐J. , Yeri, A. , Tanriverdi, K. , Pico, A. R. , Yao, C. , Murthy, V. , Ho, J. , Vitseva, O. , Demarco, D. , Shah, S. , Iafrati, M. D. , Levy, D. , & Freedman, J. E. (2019). Proteins altered by surgical weight loss highlight biomarkers of insulin resistance in the community. Arteriosclerosis, Thrombosis, and Vascular Biology, 39(1), 107–115. 10.1161/ATVBAHA.118.311928 [DOI] [PMC free article] [PubMed] [Google Scholar]

[acel13608-bib-0044] Svane, A. , Soerensen, M. , Lund, J. , Tan, Q. , Jylhävä, J. , Wang, Y. , Pedersen, N. , Hägg, S. , Debrabant, B. , Deary, I. , Christensen, K. , Christiansen, L. , & Hjelmborg, J. (2018). DNA methylation and all‐cause mortality in middle‐aged and elderly Danish twins. Genes, 9(2), 78. 10.3390/genes9020078 [DOI] [PMC free article] [PubMed] [Google Scholar]

[acel13608-bib-0045] Teschendorff, A. E. , Marabita, F. , Lechner, M. , Bartlett, T. , Tegner, J. , Gomez‐Cabrero, D. , & Beck, S. (2013). A beta‐mixture quantile normalization method for correcting probe design bias in Illumina Infinium 450 k DNA methylation data. Bioinformatics, 29(2), 189–196. 10.1093/bioinformatics/bts680 [DOI] [PMC free article] [PubMed] [Google Scholar]

[acel13608-bib-0046] Timmers, P. R. H. J. , Mounier, N. , Lall, K. , Fischer, K. , Ning, Z. , Feng, X. , Bretherick, A. D. , Clark, D. W. , Agbessi, M. , Ahsan, H. , Alves, I. , Andiappan, A. , Awadalla, P. , Battle, A. , Bonder, M. J. , Boomsma, D. , Christiansen, M. , Claringbould, A. , Deelen, P. , … Joshi, P. K. (2019). Genomics of 1 million parent lifespans implicates novel pathways and common diseases and distinguishes survival chances. eLife, 8, e39856. 10.7554/eLife.39856 [DOI] [PMC free article] [PubMed] [Google Scholar]

[acel13608-bib-0047] Triche, T. J. Jr , Weisenberger, D. J. , Van Den Berg, D. , Laird, P. W. , & Siegmund, K. D. (2013). Low‐level processing of Illumina Infinium DNA methylation beadarrays. Nucleic Acids Research, 41(7), e90. [DOI] [PMC free article] [PubMed] [Google Scholar]

[acel13608-bib-0048] van den Berg, N. , Beekman, M. , Smith, K. R. , Janssens, A. , & Slagboom, P. E. (2017). Historical demography and longevity genetics: Back to the future. Ageing Research Reviews, 38, 28–39. 10.1016/j.arr.2017.06.005 [DOI] [PubMed] [Google Scholar]

[acel13608-bib-0049] Van Eyken, E. , Van Laer, L. , Fransen, E. , Topsakal, V. , Lemkens, N. , Laureys, W. , Nelissen, N. , Vandevelde, A. , & Wienker, T. & Van De Heyning, P. (2006). KCNQ4: a gene for age‐related hearing impairment? Human Mutation, 27(10), 1007–1016. [DOI] [PubMed] [Google Scholar]

[acel13608-bib-0050] Walter, S. , Atzmon, G. , Demerath, E. W. , Garcia, M. E. , Kaplan, R. C. , Kumari, M. , Lunetta, K. L. , Milaneschi, Y. , Tanaka, T. , Tranah, G. J. , Völker, U. , Yu, L. , Arnold, A. , Benjamin, E. J. , Biffar, R. , Buchman, A. S. , Boerwinkle, E. , Couper, D. , De Jager, P. L. , … Franceschini, N. (2011). A genome‐wide association study of aging. Neurobiology of Aging, 32(11), 2109, e2115–2109. e2128. 10.1016/j.neurobiolaging.2011.05.026 [DOI] [PMC free article] [PubMed] [Google Scholar]

[acel13608-bib-0051] Wang, Y. , McKay, J. D. , Rafnar, T. , Wang, Z. , Timofeeva, M. N. , Broderick, P. , Zong, X. , Laplana, M. , Wei, Y. , Han, Y. , Lloyd, A. , Delahaye‐Sourdeix, M. , Chubb, D. , Gaborieau, V. , Wheeler, W. , Chatterjee, N. , Thorleifsson, G. , Sulem, P. , Liu, G. , … Amos, C. I. (2014). Rare variants of large effect in BRCA2 and CHEK2 affect risk of lung cancer. Nature Genetics, 46(7), 736. 10.1038/ng.3002 [DOI] [PMC free article] [PubMed] [Google Scholar]

[acel13608-bib-0052] Willer, C. J. , Schmidt, E. M. , Sengupta, S. , Peloso, G. M. , Gustafsson, S. , Kanoni, S. , & Mora, S. (2013). Discovery and refinement of loci associated with lipid levels. Nature Genetics, 45(11), 1274. [DOI] [PMC free article] [PubMed] [Google Scholar]

[acel13608-bib-0053] Yao, C. , Chen, G. , Song, C. , Keefe, J. , Mendelson, M. , Huan, T. , & Wu, H. (2018). Genome‐wide mapping of plasma protein QTLs identifies putatively causal genes and pathways for cardiovascular disease. Nature Communications, 9(1), 1–11. [DOI] [PMC free article] [PubMed] [Google Scholar]

[acel13608-bib-0054] Zhang, Y. , Wilson, R. , Heiss, J. , Breitling, L. P. , Saum, K.‐U. , Schöttker, B. , Holleczek, B. , Waldenberger, M. , Peters, A. , & Brenner, H. (2017). DNA methylation signatures in peripheral blood strongly predict all‐cause mortality. Nature Communications, 8(1), 1–11. 10.1038/ncomms14617 [DOI] [PMC free article] [PubMed] [Google Scholar]

[acel13608-bib-0055] Zhao, Y. , Yan, Y. , Ma, R. , Lv, X. , Zhang, L. , Wang, J. , Zhu, W. , Zhao, L. , Jiang, L. , Zhao, L. , Wen, L. , Yang, B. O. , Chen, Y. , He, M. , Liu, M. , & Wei, M. (2020). Expression signature of six‐snoRNA serves as novel non‐invasive biomarker for diagnosis and prognosis prediction of renal clear cell carcinoma. Journal of Cellular and Molecular Medicine, 24(3), 2215–2228. 10.1111/jcmm.14886 [DOI] [PMC free article] [PubMed] [Google Scholar]

PERMALINK

Integrative analysis of clinical and epigenetic biomarkers of mortality

Tianxiao Huan

Steve Nguyen

Elena Colicino

Carolina Ochoa‐Rosales

W David Hill

Jennifer A Brody

Mette Soerensen

Yan Zhang

Antoine Baldassari

Mohamed Ahmed Elhadad

Tanaka Toshiko

Yinan Zheng

Arce Domingo‐Relloso

Dong Heon Lee

Jiantao Ma

Chen Yao

Chunyu Liu

Shih‐Jen Hwang

Roby Joehanes

Myriam Fornage

Jan Bressler

Joyce BJ van Meurs

Birgit Debrabant

Jonas Mengel‐From

Jacob Hjelmborg

Kaare Christensen

Pantel Vokonas

Joel Schwartz

Sina A Gahrib

Nona Sotoodehnia

Colleen M Sitlani

Sonja Kunze

Christian Gieger

Annette Peters

Melanie Waldenberger

Ian J Deary

Luigi Ferrucci

Yishu Qu

Philip Greenland

Donald M Lloyd‐Jones

Lifang Hou

Stefania Bandinelli

Trudy Voortman

Brenner Hermann

Andrea Baccarelli

Eric Whitsel

James S Pankow

Daniel Levy

Abstract

Abbreviations

1. INTRODUCTION

FIGURE 1.

2. RESULTS

2.1. Study population

TABLE 1.

2.2. Ancestry‐stratified epigenome‐wide meta‐analysis of all‐cause mortality

2.3. Transethnic replication and sensitivity analysis

TABLE 2.

2.4. Associations of DNAm with CVD death and cancer death

FIGURE 2.

2.5. Mortality prediction model

FIGURE 3.

2.5.1. Clinical risk factors strongly predict all‐cause mortality and CVD death

TABLE 3.

2.5.2. DNAm predicts mortality independently of age and clinical risk factors

2.5.3. The integrative model (trained by CpGs and clinical risk factors) moderately improved upon the clinical risk factor model for all‐cause mortality and CVD death, and substantially improved the prediction of cancer death

FIGURE 4.

2.6. Comparing the mortality prediction model with DNAm age

2.7. Associations of DNAm with genetic variants and Mendelian randomization analysis

2.8. Associations of DNAm with gene expression and pathway analysis

3. DISCUSSION

4. CONCLUSIONS

5. METHODS

5.1. Study population

5.2. Mortality ascertainment and clinical phenotypes

5.3. DNA methylation measurements and quality control

5.4. Cohort‐specific epigenome‐wide association analysis

5.5. Meta‐analysis