Summary
Inflammation is a critical component of chronic diseases, aging progression, and lifespan. Omics signatures may characterize inflammation status beyond blood biomarkers. We leveraged genetics (polygenic risk score [PRS]), metabolomics (metabolomic risk score [MRS]), and epigenetics (epigenetic risk score [ERS]) to build multi-omics-multi-marker risk scores for inflammation status represented by the level of circulating C-reactive protein (CRP), interleukin 6 (IL-6), and tumor necrosis factor alpha (TNF-α). We found that multi-omics risk scores generally outperformed single-omics risk scores in predicting all-cause mortality in the Canadian Longitudinal Study on Aging. Compared with circulating inflammation biomarkers, some multi-omics risk scores had a higher hazard ratio (HR) for all-cause mortality when including both score and circulating IL-6 in the same model (1-SD IL-6 MRS-ERS: HR = 2.20 [1.55, 3.13] vs. 1-SD circulating IL-6: HR = 0.94 [0.67, 1.32]; 1-SD IL-6 PRS-MRS: HR = 1.47 [1.35, 1.59] vs. 1-SD circulating IL-6: HR = 1.33 [1.18, 1.51]; 1-SD IL-6 PRS-MRS-ERS: HR = 1.95 [1.40, 2.70] vs. 1-SD circulating IL-6: HR = 0.99 [0.71, 1.39]). In the Nurses’ Health Study (NHS), NHS II, and Health Professional Follow-up Study with available omics, 1-SD IL-6 PRS and 1-SD IL-6 PRS-MRS had HR = 1.12 [1.00, 1.26] and HR = 1.13 [1.01, 1.26], respectively, among individuals >65 years old without mutual adjustment of the score and circulating IL-6. Our study demonstrates that some multi-omics scores for inflammation markers may characterize important inflammation burden for an individual beyond those represented by blood biomarkers and improve our prediction capability for the aging process and lifespan.
Keywords: CLSA, genetics, metabolomics, DNA methylation
Graphical abstract

We developed single- and multi-omics risk scores to assess blood inflammation markers and validated them across three cohorts. Our multi-omics models outperformed blood markers in predicting all-cause mortality, offering a more comprehensive approach to capturing inflammation burden. This may help identify at-risk populations for targeted interventions to reduce inflammation-related mortality.
Introduction
Strong evidence supports the role of low-grade systemic inflammation in complex diseases, including cardiovascular disease and type 2 diabetes,1,2,3 as well as age-related diseases such as Alzheimer disease (AD).4 Aging and age-related diseases are associated with a decline in immune function, known as immunosenescence, and an increase in inflammation, sometimes referred to as inflammaging.5 Immunosenescence is associated with an imbalance of pro- and anti-inflammatory cytokines, and a chronic low-grade inflammatory state has been linked to poor function and mobility in older adults.5 Among traditionally measured inflammation biomarkers, blood C-reactive protein (CRP), interleukin (IL)-6, and tumor necrosis factor alpha (TNF-α) levels are independent predictors for all-cause mortality in several prospective studies,6,7,8 especially in those older than 65 years.8,9 However, a snapshot of these blood biomarkers may not reflect an individual’s full inflammation status.
Integrating omics that reflect different functional layers (“multi-omics”) may characterize a more comprehensive spectrum of inflammation and the burden of chronic inflammation, from immediate and acute status (e.g., metabolomics) to lifetime impact (e.g., genetics). Using multi-omics jointly may enhance our ability to evaluate and explore inflammation-related pathophysiology.
In this study, leveraging data from the Canadian Longitudinal Study on Aging (CLSA) and published genome-wide summary statistics, we established multi-omics scores for inflammation markers CRP, IL-6, and TNF-α based on genetics (via polygenic risk score [PRS]), metabolomics (via metabolomic risk score [MRS]), and epigenetics (via epigenetic risk score [ERS], using DNA methylation). We hypothesized that, independent of the observed level of blood biomarkers, omics-based inflammation signatures for CRP, IL-6, or TNF-α would reflect a greater burden of chronic inflammation and, therefore, be more strongly associated with a higher mortality hazard. The risk scores and the association with all-cause mortality were further tested for validation in the Nurses’ Health Study (NHS), NHS II, and the Health Professional Follow-up Study (HPFS).
Subjects, material, and methods
Study population
We used the comprehensive CLSA cohort (30,097 participants; 50.9% women, mean [SD] age = 62.96 [10.25]), whose participants were randomly selected from within 25–50 km of 11 data collection sites in seven Canadian provinces, were interviewed in person, took part in in-depth physical assessments at the collection sites, and provided blood and urine samples (2011–2015).10 The first (2015–2018) and the second (2018–2021) follow-ups on the comprehensive cohort include reports on all-cause mortality. The CLSA data were used to train the metabolomic and epigenetic prediction models and for the discovery association study of the omics-based signatures with all-cause mortality. The NHS, NHS II, and HPFS were used to validate the risk scores and the association with mortality. The description of the NHS, NHS II, and HPFS is presented in Methods S1. Participation in the CLSA cohort is voluntary, and all individuals provided written informed consent.10 An ethical review of the CLSA protocol was conducted by the research ethics board (REB) at each research site with the coordination of the McMaster Research Ethics Board (at baseline, there were 13 REBs involved). The study protocol for NHS, NHS II, and HPFS was approved by the institutional review boards of the Brigham and Women’s Hospital and Harvard T.H. Chan School of Public Health, and those of participating registries as required, and participants provided written informed consent.11,12
Assessment of omics
Details on measuring and processing the omics data in the CLSA are presented below. The methods for omics in the validation cohorts are described in Methods S2.
Genotyping and imputation
The DNA extraction, genotyping, and quality filtering protocols were described previously13 and in the CLSA Genome-Wide Genetic Data Release document (https://www.clsa-elcv.ca/wp-content/uploads/2023/06/clsa_gwas_v3.pdf). Briefly, genotyping was undertaken in five batches of roughly 5,000 samples, each using Axiom Analysis Suite 2.0, similar to the UK Biobank genotyping quality control (QC) documentation.13 The average call rate for passing samples was ≥95.0, and the Hardy-Weinberg equilibrium (HWE) p value was >10⁻⁶. Imputation quality using the TOPMed reference panel was assessed using the marker-wise information measure (Rsq) and compared to the imputation using the haplotype reference.14 SNPs with minor allele frequency (MAF) <0.05 were removed. Out of 794,409 genetic markers, 573,386 were selected after sample-based QC. The above procedure to remove duplicates resulted in 26,622 uniquely genotyped CLSA participants.
Metabolomics
Blood metabolomics were evaluated for 9,992 samples at baseline at Metabolon. QC and normalization methods are detailed in the CLSA Metabolomic Profiling Data Support Document (https://www.clsa-elcv.ca/wp-content/uploads/2024/01/CLSA_DataSupportDoc_Metabolomics_v2.0_2023Aug03.pdf). A batch-normalization method was applied to the data by correcting each metabolomic compound in instrument batch blocks by registering the medians of each batch to equal one (1.00) and normalizing each data point proportionately (termed the “block correction”). For the current study, the following metabolite groups were included: amino acids, xenobiotics, peptides, partially characterized molecules, nucleotides, lipids, energy, cofactors and vitamins, and carbohydrates (a total of 1,021 metabolites). Metabolites were further standardized to a mean of 0 and SD of 1 before being input into the models.
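The two normalization steps described above can be illustrated with a short Python sketch: a block correction that divides each value by the median of its instrument batch (so that every batch median equals 1), followed by standardization to a mean of 0 and SD of 1. This is a minimal illustration for a single metabolite, not the CLSA or Metabolon pipeline; the function names, the toy values, and the use of the sample SD (n − 1 denominator) are assumptions made for the example.

```python
from statistics import mean, median, stdev


def block_correct(values, batches):
    """Divide each value by the median of its batch so every batch median equals 1."""
    by_batch = {}
    for value, batch in zip(values, batches):
        by_batch.setdefault(batch, []).append(value)
    batch_medians = {batch: median(vals) for batch, vals in by_batch.items()}
    return [value / batch_medians[batch] for value, batch in zip(values, batches)]


def standardize(values):
    """Rescale values to mean 0 and SD 1 (sample SD; an assumption of this sketch)."""
    mu = mean(values)
    sd = stdev(values)
    return [(value - mu) / sd for value in values]


# Hypothetical raw intensities for one metabolite measured in two instrument batches.
raw = [10.0, 20.0, 30.0, 100.0, 200.0, 300.0]
batch_labels = ["A", "A", "A", "B", "B", "B"]

corrected = block_correct(raw, batch_labels)  # batch medians are now 1.0
z_scores = standardize(corrected)             # input scale for a prediction model
```

After the block correction, the two batches become directly comparable even though their raw intensity scales differ by an order of magnitude.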
DNA methylation profiling
The site-specific DNA methylation was measured using the Infinium MethylationEPIC BeadChip platform (Illumina, CA, USA) on DNA extracted from peripheral blood mononuclear cells (PBMCs) as described in the CLSA data support document: https://www.clsa-elcv.ca/wp-content/uploads/2023/06/clsa_datasupportdoc_epigenetics_v2.0_2022nov30_final.pdf. For this study, we performed sample-level QC using functions from the R package “minfi.”15 We evaluated the quality of 1,478 samples at baseline by the following means. (1) Computing the median for both Meth and Unmeth signals for each array and displaying in a scatterplot to identify outlier samples with low intensity. The cutoff of the median log2 intensity value was <10.5. (2) Evaluating outlier samples and confirming male and female blood samples clustered separately according to multi-dimensional scaling (MDS) plots. This was repeated twice: clustering by the predicted sex and clustering by the reported sex. Two samples were flagged: one sample had a predicted sex different from the reported sex, and one sample clustered within the opposite sex that was predicted and reported. (3) Calculating sample-wise missing rates. We set a detection p value of 0.01 and >5% missing rate as a threshold to be removed. (4) Additional QC was performed using the QC report generated by the “qcReport” function of the minfi R package. For the probe-level QC, we used the “preprocessQuantile” function to normalize the data and “getBeta” function to get the final beta values for 1,476 samples.
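The sample-level thresholds in steps (1) and (3) can be sketched as simple filters. The Python snippet below is a hypothetical illustration, not the minfi implementation: it assumes that the intensity cutoff is applied to the average of the median log2 Meth and Unmeth signals, and that a probe counts as missing for a sample when its detection p value exceeds 0.01. The function names and example values are invented for the sketch.

```python
def missing_rate(detection_p_values, p_cutoff=0.01):
    """Fraction of probes whose detection p value exceeds the cutoff (assumed rule)."""
    failed = sum(1 for p in detection_p_values if p > p_cutoff)
    return failed / len(detection_p_values)


def qc_flags(meth_median_log2, unmeth_median_log2, detection_p_values,
             intensity_cutoff=10.5, p_cutoff=0.01, max_missing=0.05):
    """Return the list of reasons a sample would be flagged; empty if it passes."""
    reasons = []
    # Assumption: the cutoff applies to the mean of the two median log2 intensities.
    if (meth_median_log2 + unmeth_median_log2) / 2 < intensity_cutoff:
        reasons.append("low_intensity")
    if missing_rate(detection_p_values, p_cutoff) > max_missing:
        reasons.append("high_missing_rate")
    return reasons


# Hypothetical samples: one passing, one failing both checks.
good_sample = qc_flags(11.2, 11.0, [0.001] * 99 + [0.5])
bad_sample = qc_flags(10.0, 10.4, [0.001] * 90 + [0.2] * 10)
```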
Assessment of inflammation markers
Circulating inflammatory biomarker measurements were detailed before.16 Briefly, CRP for 27,011 participants was measured in serum using the Cobas 8000 modular analyzer (Roche Diagnostics). TNF-α and IL-6 (available for n = 9,522 and n = 9,698 participants) were measured in serum using the Quantikine high-sensitivity ELISA (R&D Systems). Blood draws for inflammation markers were performed at the same time as the omics data at baseline. The current analysis included participants with overlapping blood inflammation markers and omics data.
Measurement of blood CRP and IL-6 in the validation cohorts NHS, NHS II, and HPFS is described in Methods S3. For the NHS/HPFS, we used inflammation markers measured closest to the timing of blood draws for omics. Of note, blood TNF-α was not measured in these studies.
Main covariates
Background characteristics collected at baseline used for the all-cause mortality association study were age, sex, smoking status (yes/no/former), body mass index (BMI), alcohol intake (frequency of alcohol consumption in the past 12 months), and race/ethnicity (based on self-reports and derived from culture and ethnicity background reports, as previously published).17 To avoid small numbers, participants were grouped into five ethnic groups: White/European, African, Asian, Hispanic, and other. Multimorbidity was defined as an ordinal measurement, ranging from 0 to 8, and calculated by summing the number of the following diseases in the CLSA (assessed by self-reports at baseline): dementia or AD, Parkinson disease, multiple sclerosis, diabetes mellitus, renal failure, musculoskeletal system and connective tissue disease (osteoporosis, rheumatoid arthritis, arthritis), cardiovascular risk factors and disease (hypertension, heart disease including congestive heart failure, angina, myocardial infarction or heart attack, coronary artery bypass graft [CABG] surgery or percutaneous coronary intervention, stroke), and cancer (except non-melanoma skin cancer). The diseases included in the multimorbidity score were based on previous publications.18,19
Prediction models
Single-omics (1-way) risk scores
Genome-wide association study (GWAS) summary statistics for CRP, IL-6, and TNF-α were curated from several publicly available sources. We conducted a meta-analysis for each circulating protein biomarker using METAL20 with the inverse-variance-weighted method. For CRP PRS, we used data from two cohorts with a combined sample size of n = 465,522. For IL-6 PRS, we used data from three cohorts with a combined n = 12,487, and, for TNF-α PRS, we used data from three cohorts with a combined n = 7,752. Using the genome-wide SNPs from the meta-analysis summary statistics, PRS was generated by applying Sparse Bayesian Regression (sBayesian) linear mixed models with a linkage disequilibrium (LD) sparse matrix (ukbEURu_imp_v3_HM3_n50k.chisq10.ldm.sparse) as demonstrated before.21 For MRS and ERS models, we first applied log10 transformation to the blood inflammation markers to reduce skewness and limit the effect of outliers. We generated MRS and ERS for each inflammation marker separately using elastic net regression (“glmnet” R package)22 with all metabolites and DNA methylation sites as independent variables. To reduce overfitting, 10-fold cross-validation (CV) was used to obtain the predicted value for each CLSA individual. Pearson R and root-mean-square error (RMSE) were calculated based on the observed and the predicted values. The coefficients used on the validation sets were generated from the last fold.
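The inverse-variance-weighted scheme used for the meta-analysis can be illustrated with the standard fixed-effect formulas: each study's effect estimate is weighted by 1/SE², the pooled effect is the weighted mean, and the pooled SE is the square root of the reciprocal of the summed weights. The Python sketch below shows this for a single SNP. It is a generic illustration rather than METAL itself, and the function name and example numbers are hypothetical.

```python
import math


def ivw_meta(betas, standard_errors):
    """Fixed-effect inverse-variance-weighted pooling of per-study effect estimates."""
    weights = [1.0 / se ** 2 for se in standard_errors]
    total_weight = sum(weights)
    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    z_score = pooled_beta / pooled_se
    return pooled_beta, pooled_se, z_score


# Hypothetical effect of one SNP on a log-transformed biomarker in two cohorts.
# The more precise study (smaller SE) dominates the pooled estimate.
beta, se, z = ivw_meta(betas=[0.1, 0.3], standard_errors=[0.1, 0.2])
```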
Multi-omics (2- and 3-way) risk scores
We established the risk scores based on each omics pair combination (PRS-ERS, PRS-MRS, MRS-ERS) or all three (PRS-MRS-ERS) using a hierarchical approach. This approach is useful in cohorts where sample sizes differ substantially across the measured omics (e.g., 2-fold or 10-fold more genetic than metabolomic samples). In this approach, we first took the omics with the largest number of samples and regressed its score out of the blood biomarker by linear regression. Next, we used the residuals from this linear regression as the outcome in an elastic net regression with the second-largest omics as the predictors. The 2-way risk score was the sum of the predicted value from the second step and the first omics' risk score multiplied by its beta from the first linear regression. For the 3-way risk scores, we built an additional linear regression predicting the inflammation marker from the previous two omics scores and used its residuals as the outcome for another 10-fold CV elastic net regression with the third set of omics as predictors. The 3-way risk score is the sum of the PRS multiplied by its beta from the second linear regression, the MRS-predicted value multiplied by its beta from the second linear regression, and the risk score output from the final elastic net (ERS). The conceptual framework for establishing 1-, 2-, and 3-way risk scores is detailed in Figure 1.
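The hierarchical procedure for a 2-way score can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the study's implementation: the second stage uses a single-predictor ordinary least squares fit as a stand-in for the 10-fold CV elastic net, the second omics layer is reduced to one hypothetical feature, and all names and toy values are invented for the example.

```python
from statistics import mean


def fit_simple_ols(x, y):
    """Least-squares intercept and slope for a single predictor."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


def two_way_score(first_score, second_feature, marker):
    """Hierarchical 2-way score: beta1 * first score + predicted residual."""
    # Step 1: regress the first omics score out of the (log-transformed) marker.
    intercept1, beta1 = fit_simple_ols(first_score, marker)
    residuals = [m - (intercept1 + beta1 * s) for s, m in zip(first_score, marker)]
    # Step 2: predict the residuals from the second omics layer.
    # Stand-in: simple OLS on one feature instead of an elastic net on many.
    intercept2, beta2 = fit_simple_ols(second_feature, residuals)
    predicted_residuals = [intercept2 + beta2 * f for f in second_feature]
    # Step 3: sum the two contributions.
    return [beta1 * s + r for s, r in zip(first_score, predicted_residuals)]


# Toy data in which the marker is an exact sum of two uncorrelated components.
prs = [-1.0, 1.0, -1.0, 1.0]
metabolite = [-1.0, -1.0, 1.0, 1.0]
log_marker = [2 * p + 3 * m for p, m in zip(prs, metabolite)]

score = two_way_score(prs, metabolite, log_marker)
```

Fitting the larger-sample layer first means its coefficient is estimated on all available individuals, while the smaller layer only has to explain what the first layer left over. A 3-way score would repeat steps 1 and 2 with the two earlier scores as predictors in the linear regression.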
Figure 1.
Analyses workflow and application for 1-, 2-, and 3-way risk scores and hierarchical approach for multi-omics integration
Steps for establishing omics risk scores using genetics, metabolomics, and DNA methylation demonstrated for blood CRP.
Statistical analysis
Blood CRP, IL-6, and TNF-α were log10 transformed before being used in all models and correlation tests. Pearson correlation was used to examine the correlation between the observed log-transformed inflammation markers and the omics risk scores. Spearman correlation was used to examine correlations between inflammation markers and age. We applied previously published DNA methylation scores for CRP and IL-6 to the CLSA data: (1) the CRP score by Ligthart et al.,23 which uses 218 CpGs to predict CRP (207 available in the CLSA data); (2) the CRP score by Barker et al.,24 which uses seven CpGs (six available); and (3) the IL-6 score by Stevenson et al.,25 which uses 35 CpGs (all available). The association with all-cause mortality was examined using Cox regression (“survival” R package: https://github.com/therneau/survival), with the follow-up time from baseline to either death or the end of the follow-up period. The models for the survival discovery analysis were adjusted for age, sex, smoking, alcohol intake, BMI, race/ethnicity, and multimorbidity. We reported the hazard ratios (HRs), 95% confidence intervals (CIs), and concordance resulting from the Cox models. We calculated the survival models' dynamic area under the curve (AUC) using the “dynpred” R package (DOI: 10.32614/CRAN.package.dynpred). Sensitivity analyses were performed using logistic regression, using the same covariates as in the Cox models. The risk scores and blood inflammation markers were standardized to a mean of 0 and SD of 1 before input to the models. Thus, the effect sizes of the Cox and logistic regression models represent a 1-SD change in each of these predictors. The likelihood ratio test (LRT) was performed using the “lrtest” function in the “lmtest” R package (DOI: 10.32614/CRAN.package.lmtest). All statistical analyses were performed using R (version 4.2.3; R Foundation for Statistical Computing).
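The summary quantities reported in the tables (Pearson R, RMSE, and HRs with 95% CIs) follow standard formulas, sketched below in Python. This is an illustrative set of helpers rather than the analysis code, which was run in R. The function names are hypothetical, and the CI assumes the usual normal approximation on the log-hazard scale, exp(β ± 1.96 × SE), where β is the Cox coefficient for a predictor standardized to SD 1.

```python
import math


def log10_transform(values):
    """Log10-transform positive biomarker concentrations."""
    return [math.log10(v) for v in values]


def pearson_r(x, y):
    """Pearson correlation between two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def rmse(predicted, observed):
    """Root-mean-square error: sqrt(sum((P_i - O_i)^2) / n)."""
    n = len(observed)
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(predicted, observed)) / n)


def hazard_ratio_ci(beta, se, z=1.96):
    """HR and 95% CI from a Cox coefficient and its standard error."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


# Hypothetical example: a coefficient of ln(2) per 1 SD corresponds to HR = 2.
hr, lower, upper = hazard_ratio_ci(beta=math.log(2), se=0.1)
```

Because the predictors are standardized, HRs for a risk score and for the circulating biomarker are on the same per-SD scale and can be compared directly within one model.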
Results
Description of cohorts used for this analysis
The background characteristics of the CLSA cohort are presented in Table 1. No significant differences in age or sex were observed between participants with metabolomics and epigenetics data and those without omics data (all p > 0.05). However, participants without genetic data (n = 3,475) were slightly older (63.5 vs. 62.9 years) and included a higher proportion of women (58.7% vs. 49.9%) than participants with genetic data (n = 26,622). The three inflammation markers CRP, IL-6, and TNF-α were positively correlated with age (CRP, R = 0.105; IL-6, R = 0.279; TNF-α, R = 0.289; p < 0.001 for all; Figure S1). The characteristics of the validation sets (NHS II: PRS, MRS, and ERS; NHS and HPFS: PRS and MRS) are presented in Table S1.
Table 1.
Background characteristics of the CLSA across the availability of omics data
| Characteristics | Individuals with genetic data n = 26,622 | Individuals with metabolomic data n = 9,992 | Individuals with epigenetic data n = 1,478 |
|---|---|---|---|
| Age, years | 62.88 (10.20) | 63.00 (10.20) | 63.12 (10.29) |
| Men, % | 50.12 | 49.10 | 49.46 |
| BMI, kg/m2 | 28.07 (8.02) | 28.07 (5.44) | 28.43 (5.69) |
| Current smokers, % | 8.9 | 9.45 | 9.95 |
| CRP, mg/L (a) | 2.56 (5.07) | 2.54 (4.69) | 2.68 (4.42) |
| IL-6, pg/mL (b) | 2.38 (1.63) | 2.38 (1.63) | 2.62 (1.74) |
| TNF-α, pg/mL (c) | 1.10 (0.50) | 1.11 (0.50) | 0.49 (7.16) |
Sample size before QC steps. Data are presented as means (SD) for continuous measurements and numbers/percentages for categorical.
(a) n with available data: genetic sample, 26,462; metabolomic sample, 9,958; epigenetic sample, 1,470.
(b) n with available data: genetic sample, 9,537; metabolomic sample, 9,728; epigenetic sample, 1,420.
(c) n with available data: genetic sample, 9,360; metabolomic sample, 9,954; epigenetic sample, 1,428.
CLSA, Canadian Longitudinal Study on Aging; CRP, C-reactive protein; IL-6, interleukin 6; TNF-α, tumor necrosis factor alpha.
One-way omics risk scores for inflammation markers
Inflammation PRS
We applied our pre-determined inflammation PRS26 to the CLSA, NHS, NHS II, and HPFS. In the CLSA, CRP PRS had the strongest correlation with blood CRP levels (n = 26,462, R = 0.336, p < 0.001). Correlations of the IL-6 and TNF-α PRS with the corresponding blood levels were significant but weaker (n = 9,537, R = 0.049, p < 0.001 and n = 9,360, R = 0.032, p = 0.001 for IL-6 and TNF-α, respectively; Tables 2, 3, and 4; Figures S2A–S2C). Similar correlations between blood CRP and CRP PRS were observed in the NHS, NHS II, and HPFS (Table 5). Blood IL-6 was strongly correlated with IL-6 PRS in all validation cohorts (Table 5).
Table 2.
Multi-omics inflammation risk scores for CRP and association with all-cause mortality in CLSA
| Omics integration | Approach | The number and type of predictors included in the final model | Sample size to calculate model performance [3-way sample size] | RMSE (a) [3-way sample size] | Correlation: blood CRP vs. risk score [3-way sample size] | Analysis model | HR (95% CI) per 1 SD of blood CRP | HR (95% CI) per 1 SD of the risk score | Sample size and number of events (b) |
|---|---|---|---|---|---|---|---|---|---|
| 3-way: PRS-MRS-ERS | hierarchical | 1 PRS + predicted value by 444 metabolites + predicted value by 87 CpGs | 1,430 | 0.408 | 0.790 | age, sex, smoking, multimorbidity, BMI, race/ethnicity, and alcohol-adjusted model (MV) | 1.38 (1.08, 1.77) | 1.58 (1.22, 2.06) | n = 1,255; number of events = 70 |
| | | | | | | MV + mutual adjustment | 1.02 (0.70, 1.49) | 1.56 (1.04, 2.33) | |
| 2-way: PRS-MRS | hierarchical | 1 PRS + predicted value by 444 metabolites | 9,759 [1,430] | 0.463 [0.462] | 0.767 [0.781] | age, sex, smoking, multimorbidity, BMI, race/ethnicity, and alcohol-adjusted model (MV) | 1.38 (1.26, 1.52) | 1.35 (1.24, 1.48) | n = 8,705; number of events = 476 |
| | | | | | | MV + mutual adjustment | 1.25 (1.11, 1.42) | 1.16 (1.02, 1.35) | |
| 2-way: PRS-ERS | hierarchical | 1 PRS + predicted value by 428 CpGs | 1,435 [1,430] | 0.514 [0.514] | 0.577 [0.577] | age, sex, smoking, multimorbidity, BMI, race/ethnicity, and alcohol-adjusted model (MV) | 1.40 (1.01, 1.79) | 1.34 (1.04, 1.74) | n = 1,259; number of events = 72 |
| | | | | | | MV + mutual adjustment | 1.30 (0.98, 1.71) | 1.18 (0.89, 1.58) | |
| 2-way: MRS-ERS | hierarchical | 320 metabolites + predicted value by 24 CpGs | 1,463 [1,430] | 0.282 [0.282] | 0.758 [0.760] | age, sex, smoking, multimorbidity, BMI, race/ethnicity, and alcohol-adjusted model (MV) | 1.38 (1.08, 1.77) | 1.69 (1.31, 2.19) | n = 1,284; number of events = 72 |
| | | | | | | MV + mutual adjustment | 0.94 (0.66, 1.33) | 1.79 (1.21, 2.64) | |
| 1-way: PRS | sBayesian | 1 PRS | 26,462 [1,430] | 0.602 [0.630] | 0.336 [0.304] | age, sex, smoking, multimorbidity, BMI, race/ethnicity, and alcohol-adjusted model (MV) | 1.29 (1.22, 1.36) | 1.05 (1.00, 1.11) | n = 23,692; number of events = 1,299 |
| | | | | | | MV + mutual adjustment | 1.30 (1.23, 1.38) | 0.97 (0.92, 1.03) | |
| 1-way: MRS | 10-fold CV elastic net | 320 metabolites | 9,958 [1,430] | 0.293 [0.283] | 0.740 [0.760] | age, sex, smoking, multimorbidity, BMI, race/ethnicity, and alcohol-adjusted model (MV) | 1.39 (1.27, 1.52) | 1.41 (1.29, 1.55) | n = 8,885; number of events = 488 |
| | | | | | | MV + mutual adjustment | 1.21 (1.08, 1.36) | 1.25 (1.10, 1.40) | |
| 1-way: ERS | 10-fold CV elastic net | 150 CpGs | 1,468 [1,430] | 0.371 [0.373] | 0.534 [0.536] | age, sex, smoking, multimorbidity, BMI, race/ethnicity, and alcohol-adjusted model (MV) | 1.40 (1.10, 1.78) | 1.45 (1.16, 1.82) | n = 1,288; number of events = 74 |
| | | | | | | MV + mutual adjustment | 1.23 (0.94, 1.60) | 1.34 (1.04, 1.72) | |
(a) RMSE for 2- and 3-way models was calculated as follows: $\mathrm{RMSE} = \sqrt{\sum_{i=1}^{n} (P_i - O_i)^2 / n}$, where $P_i$ and $O_i$ are the predicted and observed values.
(b) According to MV + mutual adjustment.
BMI, body mass index; CRP, C-reactive protein; CV, cross-validation; ERS, epigenetic risk score; MRS, metabolomic risk score; PRS, polygenic risk score.
Table 3.
Multi-omics inflammation risk scores for IL-6 and association with all-cause mortality in CLSA
| Omics integration | Approach | The number and type of predictors included in the final model | Sample size to calculate model performance [3-way sample size] | RMSE (a) [3-way sample size] | Correlation: blood IL-6 vs. risk score [3-way sample size] | Analysis model | HR (95% CI) per 1 SD of blood IL-6 | HR (95% CI) per 1 SD of the risk score | Sample size and number of events (b) |
|---|---|---|---|---|---|---|---|---|---|
| 3-way: PRS-MRS-ERS | hierarchical | 1 PRS + predicted value by 284 metabolites + predicted value by 5 CpGs | 1,384 | 0.357 | 0.702 | age, sex, smoking, multimorbidity, BMI, race/ethnicity, and alcohol-adjusted model (MV) | 1.44 (1.10, 1.90) | 1.93 (1.50, 2.49) | n = 1,216; number of events = 64 |
| | | | | | | MV + mutual adjustment | 0.99 (0.71, 1.39) | 1.95 (1.40, 2.70) | |
| 2-way: PRS-MRS | hierarchical | 1 PRS + predicted value by 284 metabolites | 9,534 [1,384] | 0.362 [0.377] | 0.698 [0.700] | age, sex, smoking, multimorbidity, BMI, race/ethnicity, and alcohol-adjusted model (MV) | 1.62 (1.45, 1.81) | 1.57 (1.47, 1.68) | n = 8,516; number of events = 444 |
| | | | | | | MV + mutual adjustment | 1.33 (1.18, 1.51) | 1.47 (1.35, 1.59) | |
| 2-way: PRS-ERS | hierarchical | 1 PRS + predicted value by 549 CpGs | 1,385 [1,384] | 0.367 [0.367] | 0.565 [0.565] | age, sex, smoking, multimorbidity, BMI, race/ethnicity, and alcohol-adjusted model (MV) | 1.44 (1.10, 1.90) | 1.30 (0.98, 1.73) | n = 1,217; number of events = 64 |
| | | | | | | MV + mutual adjustment | 1.38 (1.02, 1.86) | 1.12 (0.82, 1.54) | |
| 2-way: MRS-ERS | hierarchical | 270 metabolites + predicted value by 7 CpGs | 1,417 [1,384] | 0.190 [0.189] | 0.704 [0.704] | age, sex, smoking, multimorbidity, BMI, race/ethnicity, and alcohol-adjusted model (MV) | 1.46 (1.12, 1.92) | 2.12 (1.61, 1.78) | n = 1,245; number of events = 66 |
| | | | | | | MV + mutual adjustment | 0.94 (0.67, 1.32) | 2.20 (1.55, 3.13) | |
| 1-way: PRS | sBayesian | 1 PRS | 9,537 [1,384] | 0.664 [0.699] | 0.049 [0.038] | age, sex, smoking, multimorbidity, BMI, race/ethnicity, and alcohol-adjusted model (MV) | 1.62 (1.45, 1.81) | 0.95 (0.87, 1.05) | n = 8,519; number of events = 444 |
| | | | | | | MV + mutual adjustment | 1.62 (1.45, 1.81) | 0.95 (0.86, 1.04) | |
| 1-way: MRS | 10-fold CV elastic net | 270 metabolites | 9,728 [1,384] | 0.194 [0.191] | 0.707 [0.701] | age, sex, smoking, multimorbidity, BMI, race/ethnicity, and alcohol-adjusted model (MV) | 1.62 (1.45, 1.81) | 1.72 (1.58, 1.87) | n = 8,692; number of events = 456 |
| | | | | | | MV + mutual adjustment | 1.28 (1.13, 1.46) | 1.56 (1.41, 1.74) | |
| 1-way: ERS | 10-fold CV elastic net | 469 CpGs | 1,418 [1,384] | 0.220 [0.220] | 0.575 [0.572] | age, sex, smoking, multimorbidity, BMI, race/ethnicity, and alcohol-adjusted model (MV) | 1.46 (1.12, 1.92) | 1.55 (1.16, 2.08) | n = 1,246; number of events = 66 |
| | | | | | | MV + mutual adjustment | 1.27 (0.94, 1.73) | 1.38 (0.99, 1.91) | |
(a) RMSE for 2- and 3-way models was calculated as follows: $\mathrm{RMSE} = \sqrt{\sum_{i=1}^{n} (P_i - O_i)^2 / n}$, where $P_i$ and $O_i$ are the predicted and observed values.
(b) According to MV + mutual adjustment.
BMI, body mass index; CV, cross-validation; ERS, epigenetic risk score; IL-6, interleukin 6; MRS, metabolomic risk score; PRS, polygenic risk score.
Table 4.
Multi-omics inflammation risk scores for TNF-α and association with all-cause mortality in CLSA
| Omics integration | Approach | The number and type of predictors included in the final model | Sample size to calculate model performance [3-way sample size] | RMSE (a) [3-way sample size] | Correlation: blood TNF-α vs. risk score [3-way sample size] | Analysis model | HR (95% CI) per 1 SD of blood TNF-α | HR (95% CI) per 1 SD of the risk score | Sample size and number of events (b) |
|---|---|---|---|---|---|---|---|---|---|
| 3-way: PRS-MRS-ERS | hierarchical | 1 PRS + predicted value by 381 metabolites + predicted value by 33 CpGs | 1,392 | 0.122 | 0.642 | age, sex, smoking, multimorbidity, BMI, race/ethnicity, and alcohol-adjusted model (MV) | 1.65 (1.32, 2.06) | 1.61 (1.34, 1.94) | n = 1,224; number of events = 71 |
| | | | | | | MV + mutual adjustment | 1.31 (0.96, 1.77) | 1.39 (1.09, 1.79) | |
| 2-way: PRS-MRS | hierarchical | 1 PRS + predicted value by 381 metabolites | 9,357 [1,392] | 0.118 [0.123] | 0.657 [0.646] | age, sex, smoking, multimorbidity, BMI, race/ethnicity, and alcohol-adjusted model (MV) | 1.30 (1.18, 1.43) | 1.47 (1.36, 1.60) | n = 8,356; number of events = 456 |
| | | | | | | MV + mutual adjustment | 1.00 (0.89, 1.14) | 1.47 (1.33, 1.63) | |
| 2-way: PRS-ERS | hierarchical | 1 PRS + predicted value by 139 CpGs | 1,393 [1,392] | 0.145 [0.145] | 0.413 [0.413] | age, sex, smoking, multimorbidity, BMI, race/ethnicity, and alcohol-adjusted model (MV) | 1.65 (1.32, 2.06) | 1.44 (1.08, 1.91) | n = 1,225; number of events = 71 |
| | | | | | | MV + mutual adjustment | 1.59 (1.25, 2.01) | 1.22 (0.91, 1.63) | |
| 2-way: MRS-ERS | hierarchical | 111 metabolites + predicted value by 116 CpGs | 1,425 [1,392] | 0.120 [0.120] | 0.651 [0.649] | age, sex, smoking, multimorbidity, BMI, race/ethnicity, and alcohol-adjusted model (MV) | 1.66 (1.33, 2.07) | 1.65 (1.36, 1.99) | n = 1,253; number of events = 72 |
| | | | | | | MV + mutual adjustment | 1.30 (0.96, 1.78) | 1.42 (1.10, 1.83) | |
| 1-way: PRS | sBayesian | 1 PRS | 9,360 [1,392] | 0.273 [0.269] | 0.032 [0.023] | age, sex, smoking, multimorbidity, BMI, race/ethnicity, and alcohol-adjusted model (MV) | 1.30 (1.18, 1.43) | 0.96 (0.88, 1.06) | n = 8,359; number of events = 456 |
| | | | | | | MV + mutual adjustment | 1.30 (1.18, 1.43) | 0.96 (0.88, 1.06) | |
| 1-way: MRS | 10-fold CV elastic net | 111 metabolites | 9,554 [1,392] | 0.117 [0.121] | 0.661 [0.650] | age, sex, smoking, multimorbidity, BMI, race/ethnicity, and alcohol-adjusted model (MV) | 1.30 (1.18, 1.43) | 1.49 (1.37, 1.61) | n = 8,534; number of events = 468 |
| | | | | | | MV + mutual adjustment | 1.00 (0.89, 1.13) | 1.49 (1.34, 1.65) | |
| 1-way: ERS | 10-fold CV elastic net | 363 CpGs | 1,426 [1,392] | 0.146 [0.146] | 0.376 [0.377] | age, sex, smoking, multimorbidity, BMI, race/ethnicity, and alcohol-adjusted model (MV) | 1.66 (1.34, 2.07) | 1.40 (1.11, 1.77) | n = 1,254; number of events = 73 |
| | | | | | | MV + mutual adjustment | 1.60 (1.27, 2.02) | 1.25 (0.97, 1.61) | |
(a) RMSE for 2- and 3-way models was calculated as follows: $\mathrm{RMSE} = \sqrt{\sum_{i=1}^{n} (P_i - O_i)^2 / n}$, where $P_i$ and $O_i$ are the predicted and observed values.
(b) According to MV + mutual adjustment.
BMI, body mass index; CV, cross-validation; ERS, epigenetic risk score; MRS, metabolomic risk score; PRS, polygenic risk score; TNF-α, tumor necrosis factor alpha.
Table 5.
Validation of the CRP and IL-6 risk scores in NHS, NHS II, and HPFS
| Risk score | NHS: R (Pearson) | NHS: p | NHS: n | NHS II: R (Pearson) | NHS II: p | NHS II: n | HPFS: R (Pearson) | HPFS: p | HPFS: n |
|---|---|---|---|---|---|---|---|---|---|
| CRP | | | | | | | | | |
| ERS (a): blood draw #1 | – | – | – | 0.329 | <0.001 | 334 | – | – | – |
| ERS (a): blood draw #2 | – | – | – | 0.175 | 0.004 | 253 | – | – | – |
| MRS (b) | 0.376 | <0.001 | 4,321 | 0.424 | <0.001 | 580 | 0.222 | <0.001 | 725 |
| PRS | 0.440 | <0.001 | 8,409 | 0.405 | <0.001 | 1,736 | 0.414 | <0.001 | 3,887 |
| MRS-ERS | – | – | – | 0.370 | 0.042 | 43 | – | – | – |
| PRS-MRS | 0.547 | <0.001 | 2,412 | 0.502 | <0.001 | 180 | 0.418 | <0.001 | 506 |
| PRS-ERS | – | – | – | 0.523 | <0.001 | 253 | – | – | – |
| PRS-MRS-ERS | – | – | – | 0.401 | 0.014 | 37 | – | – | – |
| IL-6 | | | | | | | | | |
| ERS (a): blood draw #1 | – | – | – | 0.313 | <0.001 | 135 | – | – | – |
| ERS (a): blood draw #2 | – | – | – | 0.257 | 0.007 | 51 | – | – | – |
| MRS (b) | 0.321 | <0.001 | 3,333 | 0.356 | <0.001 | 696 | 0.251 | <0.001 | 813 |
| PRS | 0.698 | <0.001 | 5,181 | 0.703 | <0.001 | 1,428 | 0.691 | <0.01 | 2,546 |
| MRS-ERS | – | – | – | 0.385 | 0.16 | 15 | – | – | – |
| PRS-MRS | 0.449 | <0.001 | 1,953 | 0.189 | 0.50 | 187 | 0.440 | <0.001 | 619 |
| PRS-ERS | – | – | – | 0.428 | <0.001 | 89 | – | – | – |
| PRS-MRS-ERS | – | – | – | 0.347 | 0.25 | 13 | – | – | – |
CRP, C-reactive protein; ERS, epigenetic risk score; IL-6, interleukin 6; MRS, metabolomic risk score; PRS, polygenic risk score.
(a) Seven CpGs removed from the original 150-CpG ERS model due to differences in DNA methylation QC (removing probes).
(b) 238 and 205 metabolites were removed from NHS, NHS II, and HPFS datasets due to the differences in the metabolomic platforms (blank HMDB).
Inflammation MRS
Our MRS generated using 10-fold CV resulted in a prediction of the blood markers by 254–350 (CRP, n = 9,958, R = 0.740), 129–430 (IL-6, n = 9,728, R = 0.707), and 109–217 (TNF-α, n = 9,554, R = 0.661) metabolites (Tables 2, 3, and 4; Figure S3). Next, we validated our MRS for CRP and IL-6 in the NHS, NHS II, and HPFS data using our final model with 320 and 270 metabolites for CRP and IL-6 risk scores. The MRS for CRP performed well in the NHS, NHS II, and HPFS (Table 5). Of note, 238 (CRP) and 205 (IL-6) metabolites were removed from NHS, NHS II, and HPFS datasets (Table S2) due to the differences in the metabolomic platforms (blank HMDB). The MRS for IL-6 showed similar performance across all validation cohorts (Table 5).
Inflammation ERS
Our ERS generated using 10-fold CV resulted in a prediction of the blood markers by 122–328 (CRP, n = 1,468, R = 0.534), 356–527 (IL-6, n = 1,418, R = 0.575), and 137–769 (TNF-α, n = 1,426, R = 0.376) CpGs (Tables 2, 3, and 4; Figures S4A–S4C). We also applied previously published epigenetic scores to predict CRP and IL-6 in the CLSA data and correlated these with the actual blood levels of each marker. This resulted in a correlation of R = 0.183 and R = 0.256 for blood CRP vs. the published Ligthart and Barker CRP epigenetic scores and R = 0.123 for blood IL-6 vs. published Stevenson et al. IL-6 epigenetic score.
Validation of ERS for CRP and IL-6 using the NHS II data showed a good correlation between the ERS and the blood levels of each marker (CRP, R = 0.329 and R = 0.175; IL-6, R = 0.313 and R = 0.257; examined in two blood draws in the NHS II study; Table 5).
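The agreement statistics reported for each score (R and, in later sections, RMSE) can be illustrated with a minimal sketch. It assumes that R denotes a Pearson correlation between the omics risk score and the measured blood marker; the function names and the five data points are hypothetical and are not taken from the study.

```python
import math

def pearson_r(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def rmse(pred, obs):
    """Root-mean-square error between predicted and observed values."""
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(pred, obs)) / len(obs))

# Made-up values standing in for a risk score and the measured marker
# (both on the same, e.g. log-transformed, scale; this is an assumption).
score = [0.2, 0.5, 0.9, 1.4, 1.8]
blood = [0.1, 0.7, 0.8, 1.5, 1.6]

r = pearson_r(score, blood)
err = rmse(score, blood)
print(f"R = {r:.3f}, RMSE = {err:.3f}")
```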
Two-way omics risk scores for inflammation markers
Of the three 2-way integrated risk scores (PRS-MRS, PRS-ERS, and MRS-ERS), the PRS-MRS for CRP and TNF-α and the MRS-ERS for IL-6 had the highest correlation with the corresponding blood markers (Tables 2, 3, and 4). To examine whether this observation was due to the different samples available for these risk scores, we repeated the correlation test in the subset of samples later used for the 3-way signatures. CRP PRS-MRS and IL-6 MRS-ERS remained the strongest of the 2-way risk scores, but for TNF-α, the MRS-ERS rather than the PRS-MRS had the strongest correlation with the blood marker. In validation (Table 5), the 2-way PRS-MRS for CRP and IL-6 generally correlated well with the corresponding blood levels, except for the IL-6 PRS-MRS in NHS II (n = 187). The 2-way PRS-ERS correlated significantly with the corresponding blood levels of the markers in NHS II (Table 5). The CRP MRS-ERS correlated significantly with blood CRP (n = 43), whereas the IL-6 MRS-ERS did not correlate significantly with blood IL-6 (n = 15).
Three-way scores based on PRS, MRS, and ERS
The fully integrated (3-way) scores included the PRS, MRS, and ERS. The 3-way CRP risk score showed the strongest correlation with the corresponding blood markers of all the CRP risk scores (Tables 2, 3, and 4). All models (1-, 2-, and 3-way risk scores for all three inflammation markers) are presented in Tables S3–S7. The 3-way CRP risk score performed well in the NHS II validation set (Table 5), although the sample sizes were limited (n = 37).
We repeated the scores’ validation in the NHS, NHS II, and HPFS in a minimal sample that included only individuals with both CRP and IL-6 measurements, rather than all available samples per inflammation marker and omics (Table S8). This validation yielded similar results.
Blood CRP and CRP omics risk scores as predictors for all-cause mortality
We performed a series of multivariate models, detailed in Table S9. When adjusting for age, sex, smoking, multimorbidity, BMI, race/ethnicity, and alcohol (Table 2), the MRS-ERS for CRP had the highest HR (1.69, 95% CI [1.31, 2.19]). In mutually adjusted models, considering both blood CRP and each omics CRP signature in the same Cox model detailed above, the PRS-MRS-ERS, PRS-MRS, MRS-ERS, MRS, and ERS remained significant predictors for all-cause mortality. The 2-way MRS-ERS remained the strongest predictor for all-cause mortality compared with the other risk scores (HR = 1.79, 95% CI [1.21, 2.64]).
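As a reading aid for the hazard ratios reported here, the following sketch shows how an HR and its 95% CI relate to a Cox regression coefficient, assuming a Wald interval with z = 1.96 (the interval type is an assumption, not stated in this section). The helper names are hypothetical. The example back-calculates the log-hazard coefficient and standard error implied by the reported MRS-ERS CRP estimate (HR = 1.69, 95% CI [1.31, 2.19]) and rebuilds the interval from them.

```python
import math

def hr_with_ci(beta, se, z=1.96):
    """Hazard ratio and Wald confidence limits from a log-hazard coefficient."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)

def beta_se_from_hr(hr, lower, upper, z=1.96):
    """Approximate coefficient and standard error implied by a reported HR and CI."""
    beta = math.log(hr)
    se = (math.log(upper) - math.log(lower)) / (2 * z)
    return beta, se

# Reported estimate for the CRP MRS-ERS (per 1 SD of the score).
beta, se = beta_se_from_hr(1.69, 1.31, 2.19)
hr, lo, hi = hr_with_ci(beta, se)
print(f"beta = {beta:.3f}, SE = {se:.3f}, HR = {hr:.2f} [{lo:.2f}, {hi:.2f}]")
```

Because published limits are rounded, the rebuilt interval matches the reported one only approximately.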
We performed a series of 1-degree of freedom (df) LRTs, examining the above models with and without the CRP risk scores (e.g., multivariate models including the blood marker “mutual adjustment,” with and without 1-way ERS, with and without 2-way PRS-ERS, etc.). For a total of seven LRTs, one for each CRP risk score, the following LRTs were significant: ERS, MRS, PRS-MRS, MRS-ERS, and PRS-MRS-ERS (p < 0.05 for all).
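The likelihood-ratio tests described above compare nested models with and without a risk score. A minimal sketch of the calculation follows, using closed-form chi-square tail probabilities for 1 and 3 degrees of freedom (the two cases used in this article). The log-likelihood values in the example are hypothetical, and the function names are ours.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable, for df = 1 or df = 3."""
    if x <= 0:
        return 1.0
    tail = math.erfc(math.sqrt(x / 2))
    if df == 1:
        return tail
    if df == 3:
        return tail + math.sqrt(2 * x / math.pi) * math.exp(-x / 2)
    raise ValueError("only df = 1 and df = 3 are implemented in this sketch")

def likelihood_ratio_test(loglik_reduced, loglik_full, df):
    """LRT statistic and p value for a nested pair of fitted models."""
    stat = 2 * (loglik_full - loglik_reduced)
    return stat, chi2_sf(stat, df)

# Hypothetical partial log-likelihoods: covariates + blood marker (reduced)
# versus the same model plus one omics risk score (full).
stat, p = likelihood_ratio_test(-1520.4, -1517.1, df=1)
print(f"LRT statistic = {stat:.2f}, p = {p:.4f}")
```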
Blood IL-6 and IL-6 omics risk scores as predictors for all-cause mortality
We repeated the same models to test for association with all-cause mortality, with blood IL-6, different omics risk scores for IL-6, or a mutual adjustment (Table 3). Full model details, coefficients, and p values are presented in Table S10. The 3-way IL-6 risk score and the 2-way MRS-ERS had the highest HRs of all mutually adjusted models (HR = 1.95, 95% CI [1.40, 2.70] and HR = 2.20, 95% CI [1.55, 3.13], respectively), while blood IL-6 was not a significant predictor of all-cause mortality in these models. In 1-df LRTs, examining the above models of mutual adjustment with and without the IL-6 risk scores, the following LRTs were significant: MRS, PRS-MRS, MRS-ERS, and PRS-MRS-ERS (p < 0.05 for all). The 1-df LRT for the IL-6 ERS was marginal (p = 0.055). Of note, the number of individuals with both IL-6 PRS and blood IL-6 was slightly smaller than the number with IL-6 MRS and blood IL-6. Nevertheless, since the total difference between the sample sizes was 2, we constructed the multi-omics IL-6 risk scores in the same order as the CRP scores, starting with PRS, followed by MRS and ERS. Had the model construction started with the IL-6 MRS followed by the IL-6 PRS as the second omics, the prediction of the 2-way MRS-PRS IL-6 score would not have changed significantly (n = 9,536, R = 0.706, and RMSE = 0.409) compared with the 2-way PRS-MRS.
Blood TNF-α and TNF-α omics risk scores as predictors for all-cause mortality
Finally, we repeated the same Cox regression models with blood TNF-α, different risk scores for TNF-α, or a mutual adjustment (Table 4). Full model details, coefficients, and p values are presented in Table S11. The MRS, PRS-MRS, MRS-ERS, and the 3-way risk score were significantly associated with all-cause mortality in the mutually adjusted models (MRS: HR = 1.49 [1.34, 1.65]; MRS-ERS: HR = 1.42 [1.10, 1.83]; PRS-MRS: HR = 1.47 [1.33, 1.63]; PRS-MRS-ERS: HR = 1.39 [1.09, 1.79]). In 1-df LRTs, examining the above models of mutual adjustment with and without the TNF-α risk scores, all LRTs were significant (p < 0.05 for all) except for the TNF-α PRS, ERS, and PRS-ERS. Similar to IL-6, the number of individuals with TNF-α PRS and blood TNF-α was slightly smaller than with the TNF-α MRS. If the model construction had started with TNF-α MRS followed by TNF-α PRS as the second omics, the model prediction for the 2-way risk score MRS-PRS would not change significantly (n = 9,359, R = 0.659, and RMSE = 0.156).
Sensitivity analysis and prediction improvement by omics
A series of sensitivity analyses (Tables S12–S18) using logistic regression to get the odds ratio in comparable models to the survival analysis yielded similar conclusions.
Next, we repeated the Cox analyses, entering the traditional predictors first (model 1: age, sex, and smoking; model 2: model 1 + BMI; model 3: model 2 + race/ethnicity; model 4: model 3 + alcohol intake; model 5: model 4 + multimorbidity). We then included omics risk scores (model 6: model 5 + the 3-way risk score for CRP, IL-6, or TNF-α) and examined the change in C-statistics as an indicator for model improvement. The change in concordance for the three 3-way risk scores is presented in Figure 2. Examining the dynamic prediction of model 6 using the AUC showed that the full model with the 3-way IL-6 risk score had the highest AUC, followed by the TNF-α and the CRP scores (AUC = 0.792, 0.786, and 0.764 for IL-6, TNF-α, and CRP, respectively; Figure S5). Adding any of the 3-way risk scores resulted in the highest concordance compared with models including only traditional predictors of mortality. Adding all three 3-way risk scores to model 5 resulted in an AUC of 0.796.
Figure 2.
Changes in concordance across different models (fit)
(A) CRP 3-way risk score input in the last model (model 6).
(B) IL-6 3-way risk score input in the last model (model 6).
(C) TNF-α 3-way risk score input in the last model (model 6).
Model 1: age, sex, and smoking. Model 2: age, sex, smoking, and BMI. Model 3: age, sex, smoking, BMI, and race/ethnicity. Model 4: age, sex, smoking, BMI, race/ethnicity, and alcohol intake. Model 5: age, sex, smoking, BMI, race/ethnicity, alcohol intake, multimorbidity. Model 6: age, sex, smoking, BMI, race/ethnicity, alcohol intake, and multimorbidity, and the 3-way risk score (A, CRP; B, IL-6; C, TNF-α). Dots represent model’s concordance; error lines represent standard errors. BMI, body mass index; CRP, C-reactive protein; IL-6, interleukin 6; TNF-α, tumor necrosis factor alpha.
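The concordance (C-statistic) compared across models 1–6 can be illustrated with a small sketch of Harrell's concordance index for right-censored data. This is a simplified illustration rather than the software used in the study: it ignores tied event times, the function name is ours, and the four subjects are made up.

```python
def harrell_c(time, event, risk):
    """Harrell's concordance index.

    A pair is comparable when the subject with the shorter follow-up time
    had the event (event == 1). The pair is concordant when that subject
    also has the higher predicted risk; ties in risk count as one half.
    Tied follow-up times are ignored in this simplified version.
    """
    concordant = 0.0
    comparable = 0
    n = len(time)
    for i in range(n):
        if event[i] != 1:
            continue
        for j in range(n):
            if time[i] < time[j]:
                comparable += 1
                if risk[i] > risk[j]:
                    concordant += 1.0
                elif risk[i] == risk[j]:
                    concordant += 0.5
    return concordant / comparable

# Made-up follow-up times (years), death indicators, and model risk scores.
time = [2, 4, 6, 8]
event = [1, 1, 0, 1]
risk_model_a = [0.9, 0.2, 0.1, 0.3]   # one discordant pair
risk_model_b = [0.9, 0.7, 0.1, 0.3]   # every comparable pair ordered correctly

c_a = harrell_c(time, event, risk_model_a)
c_b = harrell_c(time, event, risk_model_b)
print(c_a, c_b)
```

A model whose added predictor orders more comparable pairs correctly has a higher concordance, which is the quantity plotted for each model in Figure 2.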
The joint effect of the CRP, IL-6, and TNF-α omics risk scores on all-cause mortality
As each blood inflammation marker may represent a different aspect of inflammation, we included all 3-way risk scores and all blood markers for the subsequent analysis. We followed the same models detailed above (Table S19). In a model including only the three 3-way risk scores and blood biomarkers to predict all-cause mortality, only the IL-6 risk score and TNF-α risk score were significantly associated with all-cause mortality (HR = 2.18 [1.38, 3.45] and HR = 1.42 [1.05, 1.92] for the IL-6 and TNF-α risk scores, respectively). However, when further adjusted for confounders, only the IL-6 risk score remained significantly associated with all-cause mortality (p = 0.027). We continued to explore the joint effect of the risk scores using an LRT. In a model including all three 3-way risk scores, blood markers, age, sex, smoking, multimorbidity, BMI, race/ethnicity, and alcohol intake, the 3-df LRT comparing the model with and without all three risk scores was significant (p = 0.005).
Validation of the associations with all-cause mortality
Finally, we sought to validate the results of the association of the risk scores with all-cause mortality in the validation cohorts. The median follow-up time was 28.0, 21.0, and 20.1 years for NHS, NHS II, and HPFS, respectively. We combined these cohorts because of the small number of participants with overlapping omics measurements, which allowed us to use the following risk scores in addition to blood CRP and IL-6: PRS, MRS, and PRS-MRS. We used the following comparable stepwise models to compare the HRs between the CLSA and the NHS, NHS II, and HPFS (by the available omics): age + sex + smoking + BMI + race/ethnicity + alcohol intake (full model; Tables S20 and S21). CLSA and NHS/HPFS showed the same direction of effect for the associations of blood CRP and blood IL-6 with mortality. Similarly, for the MRSs for IL-6 and CRP, the models’ results were similar between the cohorts, with a stronger effect size in the CLSA. The IL-6 2-way PRS-MRS showed the same direction in all cohorts, but the magnitudes differed between the CLSA and the NHS/HPFS.
To better align the average age of the baseline samples in CLSA and NHS/HPFS, we further restricted the models to participants aged >65 years in CLSA (n = 12,646 participants >65 years overall; maximum n = 11,173 with CRP data) and NHS/HPFS (n = 2,019; Tables S22 and S23); this increased the HRs observed in the NHS/HPFS, especially across the IL-6 risk scores.
Discussion
In this study, we established single- and multi-omics risk scores to evaluate blood inflammation markers. We validated these risk scores in three cohorts and compared our ERS to previously published scores, showing the increased predictive ability for mortality.
We designed a hierarchical approach to build our multi-omics models sequentially by leveraging the power of different omics’ available sample sizes and maximizing the residual total variance explained by subsequent omics risk score. At each step, we used a 10-fold CV to ensure the model used to compute the predicted score for each subject is completely independent of the subject in the calculation. Instead of inputting all coefficients from the 1-way scores to build a multi-omics score, this approach helps avoid overfitting the models, maximizes the variance explained by all involved omics predictors, and allows exploiting the changing sample size available per omics. Compared with previously published ERS for CRP built on Epigenome Wide Association Study (EWAS) results and 450K methylation array data (Ligthart et al.23 and Barker et al.24), our CRP ERS model outperformed the published ones with a higher correlation between the CRP ERS and blood CRP. This can be explained by using penalized regression for candidate predictors rather than performing EWAS, which reduces overfitting and provides the combined effect of predictors to get the highest explained variance, rather than single predictors.27 Also, additional CpGs in the EPIC array compared with the 450K array may possess additional biological information critical to evaluating the inflammatory state. Our ERS for IL-6 also outperformed the previously published score by Stevenson et al.,25 although both methods relied on penalized regression to establish the DNA methylation IL-6 score. However, Stevenson et al. methylation data were subset to probes in common on both the 450K and EPIC arrays to establish cross-array scores, as their training sample (Lothian Birth Cohort) used the 450K array to assess DNA methylation and applied the IL-6 score to the Generation Scotland with DNA methylation analyzed by the EPIC array.
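The sequential idea described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: each omics layer is reduced to a single simulated predictor, ordinary least squares stands in for penalized regression, all subjects have both layers, and the 2-way score is formed by adding the second layer's out-of-fold prediction of the residual to the first layer's out-of-fold score. All names and data are hypothetical.

```python
import random

def fit_line(x, y):
    """Least-squares fit of y = a + b * x (stand-in for a penalized model)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((v - mx) ** 2 for v in x)
    b = sum((u - mx) * (v - my) for u, v in zip(x, y)) / sxx
    return my - b * mx, b

def out_of_fold(x, y, k=10, seed=0):
    """Predict each subject from a model trained on the other k - 1 folds."""
    idx = list(range(len(x)))
    random.Random(seed).shuffle(idx)
    pred = [0.0] * len(x)
    for f in range(k):
        test = idx[f::k]
        held_out = set(test)
        train = [i for i in idx if i not in held_out]
        a, b = fit_line([x[i] for i in train], [y[i] for i in train])
        for i in test:
            pred[i] = a + b * x[i]  # subject i was not used to fit a, b
    return pred

def sse(pred, obs):
    """Sum of squared prediction errors."""
    return sum((p - o) ** 2 for p, o in zip(pred, obs))

# Simulated data: two omics layers that both contribute to the marker.
rng = random.Random(1)
n = 200
layer1 = [rng.gauss(0, 1) for _ in range(n)]
layer2 = [rng.gauss(0, 1) for _ in range(n)]
marker = [0.3 * g + 0.8 * m + rng.gauss(0, 0.5) for g, m in zip(layer1, layer2)]

# Step 1: 1-way score from the first layer (out-of-fold predictions).
score1 = out_of_fold(layer1, marker)
# Step 2: the second layer is trained on what the first layer left unexplained.
residual = [y - s for y, s in zip(marker, score1)]
score2 = out_of_fold(layer2, residual)
# Step 3: combine the two steps into a 2-way score (an assumed combination rule).
two_way = [s1 + s2 for s1, s2 in zip(score1, score2)]

print(sse(score1, marker), sse(two_way, marker))
```

In this simulation the 2-way score leaves less unexplained variation than the 1-way score because the second layer is fit only to the residual of the first, while every prediction comes from folds that exclude the subject being scored.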
The omics risk scores that included metabolomics showed the highest prediction of blood inflammation markers. It has been suggested that metabolites from different tissues are involved in regulating the activation of the immune system and, thus, predict inflammatory-based disease outcomes.28 A previous study used blood metabolomics from three biobanks to build MRS to identify high-risk groups for 12 diseases.29 In that study, the MRS was a stronger predictor for most future disease onset than PRS. That study also suggested that a joint PRS-MRS model may be the best-performing model for disease prediction. Integrating different omics from different functional layers, some subjected to external exposures such as metabolomics and epigenetics, and genetics, representing the inheritable risk of diseases, may provide complementary biological information on a phenotype or capture the confounding effect of other correlated phenotypes/exposures. This was supported by our results, with an increase in the performance of the prediction models, from single to multi-omics risk scores.
Our central hypothesis was that integrating several time-scope omics from different function layers may provide an additional biological potential for inflammation and mortality, which have not yet been reflected in the blood levels of the biomarkers measured at the same blood draw. When associated with mortality and compared to the blood levels of each inflammation marker, the risk scores outperformed the blood markers in terms of effect size and significance in most models compared to CRP, IL-6, and TNF-α. Interactions and crosstalk between inflammation markers and different omics, and within omics layers, can provide insights into inflammation and inflammatory-related diseases. Genetics-epigenetics, for example, have bi-directional crosstalk: genetic variants can influence epigenetic regulation,30 and epigenetics may mediate the penetrance of genetic predisposition of inflammatory diseases.31 Certain genes are known to influence the levels of inflammation markers; variations in the IL-6 gene have been associated with differences in levels of the IL-6 cytokine and other inflammation markers,32,33 and variations in TNF-α, CRP, and IL-6 can affect the production or activity of proteins that contribute to inflammation.34,35 Several CRP-related CpG sites were associated with the expression of nearby genes.23 Epigenetics, as opposed to genetics, which predicts heritable components of disease risks and phenotypes, varies dynamically throughout the life course owing to a complex set of endogenous biological processes and, thus, provides the blueprint for the production of proteins through gene expression.36
Additionally, when examining the contribution of each score to the association with all-cause mortality, by including all covariates starting with the traditional background factors and last for the omics risk scores, we found that any of the 3-way risk scores resulted in the highest concordance, compared with models with traditional factors for mortality (age, sex, smoking, BMI, race/ethnicity, alcohol intake, multimorbidity). This highlights the added value of integrating multi-omics data into risk prediction models and underscores their potential to improve prognostic accuracy beyond traditional clinical factors.
An elevated inflammatory profile reflecting chronic low-grade inflammation was shown to be a predictor for all-cause mortality. A previous meta-analysis of 14 studies showed that elevated blood CRP is an independent risk factor for all-cause mortality in the general population.6 This was also demonstrated in middle-aged and older men and women above 65 years.9 IL-6 predicted all-cause mortality among elderly men over 65 years.8 Other studies identified blood IL-6 and TNF-α as predictors for all-cause mortality, mostly among older populations and with chronic diseases.8,37,38 Among the general population in our study, some biological risk scores, such as 3-way CRP risk score, 3-way IL-6 risk score, 3-way TNF-α risk score, 2-way CRP MRS-ERS, 2-way IL-6 MRS-ERS, and 2-way TNF-α MRS-ERS, were stronger predictors for all-cause mortality compared with blood inflammation levels. When we sought to validate the associations of the omics risk scores and blood markers with all-cause mortality in the NHSs and HPFS, results were consistent between the cohorts for the IL-6 risk scores and for blood IL-6 and CRP, with smaller effect sizes in the same direction, but inconsistent for the CRP risk scores and the mutual adjustment models. These inconsistencies may be explained by the smaller sample size in the NHS/HPFS, the younger age of its participants compared with the CLSA, and potential differences in environmental exposures between the participants’ residential areas.
Our study has several limitations. First, we could not externally validate TNF-α models since TNF-α was not measured in any of the NHS/HPFS datasets. Second, the CLSA and NHS/HPFS used different metabolomic platforms and had fewer overlapping metabolites than the number of metabolites used in training the models (CLSA). Third, only one validation cohort (NHS II) had DNA methylation measured and was used to validate the 3-way risk score. Although this cohort had a substantially smaller sample size, we observed good performance for our 3-way CRP risk score. Fourth, for the CRP omics scores, the largest sample size was for genetics, followed by metabolomics and epigenetics, an ordering that suits the hierarchical approach and lets it use all available biological data. For IL-6 and TNF-α, however, slightly fewer individuals had both genetic data and the inflammation marker measured than had both metabolomic data and the marker, yet the construction of our models began with PRS and not MRS. We chose to be consistent with the CRP modeling order since the differences between the samples were minor. Furthermore, we showed that changing the order of the omics while constructing the 2-way and 3-way risk scores for IL-6 and TNF-α did not have a major effect on the models’ performance due to the similar sample size. Fifth, for the 1-way genetic risk scores, we observed a weak correlation between the IL-6 and TNF-α PRSs and the observed markers in the CLSA cohort but a strong correlation for the IL-6 PRS in the validation cohorts. In addition, in all cohorts, we could not find an association between any of the PRSs and all-cause mortality in our multivariable mutually adjusted models. Previous studies, however, mostly show associations between disease PRSs and all-cause mortality rather than inflammation PRSs. Thus, further research is needed to shed light on the relationship between inflammation PRSs and all-cause mortality.
Finally, some of the associations with all-cause mortality were not statistically significant in the NHS and HPFS, probably due to the limited sample size, the smaller number of overlapping metabolites across datasets, a larger percentage of women in the validation set, the participants’ age when the metabolomics were measured, the mean age at the beginning of the follow-up (stratifying on subjects >65 years old improved the validation performance), the duration of the follow-up, and differences in mortality rates due to different geographical environments. Nevertheless, our study has strengths that need to be highlighted. We integrated three omics to produce 1-, 2-, and 3-way risk scores for inflammation markers. Our scores outperformed previously published ERSs. We demonstrated that including at least two omics as predictors of inflammation markers improves the model’s predictive ability.
In conclusion, we have developed multi-omics inflammation signature models to capture the burden of inflammation more comprehensively across different time scopes. Utilizing omics-based inflammation risk scores, we predicted all-cause mortality beyond what is explained by traditional blood biomarkers, age, sex, and other background characteristics, including comorbidities. The integration of multiple omics data as risk scores offers a more robust approach for identifying populations at risk for inflammation-related comorbidities and mortality, thereby enabling better targeted interventions.
Data and code availability
The code for generating 1-, 2-, and 3-way risk scores is available at http://lianglab.rc.fas.harvard.edu/OmicsRiskScore/.
Acknowledgments
See acknowledgments in the supplemental information.
Author contributions
A.Y.M. and L.L. conceived and designed the study and wrote the manuscript. A.Y.M., H.Y., and J. Liu performed the statistical analysis. J. Li curated the GWAS summary statistics used for the PRS models. J.H. performed the technical review for the NHS, NHS II, and HPFS. L.L. obtained access to the CLSA. A.Y.M., A.B., A.R., M.S., A.H.E., L.C., K.K., M.J.S., and L.L. obtained access to NHS, NHS II, and HPFS. A.B., A.R., M.S., A.H.E., L.C., and K.K. generated the NHS II epigenetic data. L.L. directed and supervised the study. M.J.S. and G.P. made valuable suggestions regarding the cohorts and models included in this study. A.Y.M. and L.L. obtained permission for publication from the CLSA manuscript board. A.Y.M., H.Y., J. Liu, and L.L. obtained permission for publication from the NHS, NHS II, and HPFS manuscript boards. All authors contributed to the manuscript and approved the submitted version.
Declaration of interests
The authors declare no competing interests.
Published: July 8, 2025
Footnotes
Supplemental information can be found online at https://doi.org/10.1016/j.ajhg.2025.06.009.
Contributor Information
Anat Yaskolka Meir, Email: ayaskolkameir@hsph.harvard.edu.
Liming Liang, Email: lliang@hsph.harvard.edu.
Web resources
Canadian Longitudinal Study on Aging, www.clsa-elcv.ca
Health Professional Follow-up Study, https://sites.sph.harvard.edu/hpfs/for-collaborators/
Nurses’ Health Study, https://www.nurseshealthstudy.org/researchers
References
- 1. Ridker P.M., Hennekens C.H., Buring J.E., Rifai N. C-Reactive Protein and Other Markers of Inflammation in the Prediction of Cardiovascular Disease in Women. N. Engl. J. Med. 2000;342:836–843. doi: 10.1056/NEJM200003233421202.
- 2. Zhang W., Speiser J.L., Ye F., Tsai M.Y., Cainzos-Achirica M., Nasir K., Herrington D.M., Shapiro M.D. High-Sensitivity C-Reactive Protein Modifies the Cardiovascular Risk of Lipoprotein(a): Multi-Ethnic Study of Atherosclerosis. J. Am. Coll. Cardiol. 2021;78:1083–1094. doi: 10.1016/J.JACC.2021.07.016.
- 3. Okdahl T., Wegeberg A.-M., Pociot F., Brock B., Størling J., Brock C. Low-grade inflammation in type 2 diabetes: A cross-sectional study from a Danish diabetes outpatient clinic. BMJ Open. 2022;12 doi: 10.1136/bmjopen-2022-062188.
- 4. Kinney J.W., Bemiller S.M., Murtishaw A.S., Leisgang A.M., Salazar A.M., Lamb B.T. Inflammation as a central mechanism in Alzheimer’s disease. Alzheimer's Dement. 2018;4:575–590. doi: 10.1016/j.trci.2018.06.014.
- 5. Franceschi C., Garagnani P., Parini P., Giuliani C., Santoro A. Inflammaging: a new immune–metabolic viewpoint for age-related diseases. Nat. Rev. Endocrinol. 2018;14:576–590. doi: 10.1038/s41574-018-0059-4.
- 6. Li Y., Zhong X., Cheng G., Zhao C., Zhang L., Hong Y., Wan Q., He R., Wang Z. Hs-CRP and all-cause, cardiovascular, and cancer mortality risk: a meta-analysis. Atherosclerosis. 2017;259:75–82. doi: 10.1016/j.atherosclerosis.2017.02.003.
- 7. Brüünsgaard H., Pedersen B.K. Age-related inflammatory cytokines and disease. Immunol. Aller. Clin. 2003;23:15–39. doi: 10.1016/s0889-8561(02)00056-5.
- 8. Baune B.T., Rothermundt M., Ladwig K.H., Meisinger C., Berger K. Systemic inflammation (Interleukin 6) predicts all-cause mortality in men: results from a 9-year follow-up of the MEMO Study. Age. 2011;33:209–217. doi: 10.1007/s11357-010-9165-5.
- 9. Li Z.-H., Zhong W.-F., Lv Y.-B., Kraus V.B., Gao X., Chen P.-L., Huang Q.-M., Ni J.-D., Shi X.-M., Mao C., Wu X.B. Associations of plasma high-sensitivity C-reactive protein concentrations with all-cause and cause-specific mortality among middle-aged and elderly individuals. Immun. Ageing. 2019;16:28. doi: 10.1186/s12979-019-0168-5.
- 10. Raina P., Wolfson C., Kirkland S., Griffith L.E., Balion C., Cossette B., Dionne I., Hofer S., Hogan D., Van Den Heuvel E.R., et al. Cohort profile: the Canadian longitudinal study on aging (CLSA) Int. J. Epidemiol. 2019;48:1752–1753. doi: 10.1093/ije/dyz173.
- 11. Bao Y., Bertoia M.L., Lenart E.B., Stampfer M.J., Willett W.C., Speizer F.E., Chavarro J.E. Origin, methods, and evolution of the three Nurses’ Health Studies. Am. J. Public Health. 2016;106:1573–1581. doi: 10.2105/AJPH.2016.303338.
- 12. Mendoza K., Smith-Warner S.A., Rossato S.L., Khandpur N., Manson J.E., Qi L., Rimm E.B., Mukamal K.J., Willett W.C., Wang M., et al. Ultra-processed foods and cardiovascular disease: analysis of three large US prospective cohorts and a systematic review and meta-analysis of prospective cohort studies. Lancet Reg. Health. Am. 2024;37 doi: 10.1016/j.lana.2024.100859.
- 13. Forgetta V., Li R., Darmond-Zwaig C., Belisle A., Balion C., Roshandel D., Wolfson C., Lettre G., Pare G., Paterson A.D., et al. Cohort profile: genomic data for 26 622 individuals from the Canadian Longitudinal Study on Aging (CLSA) BMJ Open. 2022;12 doi: 10.1136/bmjopen-2021-059021.
- 14. Marchini J., Abecasis G., Durbin R. A reference panel of 64,976 haplotypes for genotype imputation. Nat. Genetics. 2016;48:1–279. doi: 10.1038/ng.3643.
- 15. Aryee M.J., Jaffe A.E., Corrada-Bravo H., Ladd-Acosta C., Feinberg A.P., Hansen K.D., Irizarry R.A. Minfi: a flexible and comprehensive Bioconductor package for the analysis of Infinium DNA methylation microarrays. Bioinformatics. 2014;30:1363–1369. doi: 10.1093/bioinformatics/btu049.
- 16. Verschoor C.P., Vlasschaert C., Rauh M.J., Paré G. A DNA methylation based measure outperforms circulating CRP as a marker of chronic inflammation and partly reflects the monocytic response to long-term inflammatory exposure: A Canadian longitudinal study of aging analysis. Aging Cell. 2023;22 doi: 10.1111/acel.13863.
- 17. Morin S.N., Berger C., Papaioannou A., Cheung A.M., Rahme E., Leslie W.D., Goltzman D. Race/ethnic differences in the prevalence of osteoporosis, falls and fractures: a cross-sectional analysis of the Canadian Longitudinal Study on Aging. Osteoporos. Int. 2022;33:2637–2648. doi: 10.1007/s00198-022-06539-z.
- 18. Pietzner M., Stewart I.D., Raffler J., Khaw K.-T., Michelotti G.A., Kastenmüller G., Wareham N.J., Langenberg C. Plasma metabolites to profile pathways in noncommunicable disease multimorbidity. Nat. Med. 2021;27:471–479. doi: 10.1038/s41591-021-01266-0.
- 19. Sun Q., Townsend M.K., Okereke O.I., Franco O.H., Hu F.B., Grodstein F. Physical Activity at Midlife in Relation to Successful Survival in Women at Age 70 Years or Older. Arch. Intern. Med. 2010;170:194–201. doi: 10.1001/ARCHINTERNMED.2009.503.
- 20. Willer C.J., Li Y., Abecasis G.R. METAL: fast and efficient meta-analysis of genomewide association scans. Bioinformatics. 2010;26:2190–2191. doi: 10.1093/bioinformatics/btq340.
- 21. Wainberg M., Forde N.J., Mansour S., Kerrebijn I., Medland S.E., Hawco C., Tripathy S.J. Genetic architecture of the structural connectome. Nat. Commun. 2024;15:1962. doi: 10.1038/s41467-024-46023-2.
- 22. Friedman J., Hastie T., Tibshirani R. Regularization paths for generalized linear models via coordinate descent. J. Stat. Softw. 2010;33:1–22. doi: 10.18637/jss.v033.i01.
- 23. Ligthart S., Marzi C., Aslibekyan S., Mendelson M.M., Conneely K.N., Tanaka T., Colicino E., Waite L.L., Joehanes R., Guan W., et al. DNA methylation signatures of chronic low-grade inflammation are associated with complex diseases. Genome Biol. 2016;17:255. doi: 10.1186/s13059-016-1119-5.
- 24. Barker E.D., Cecil C.A.M., Walton E., Houtepen L.C., O’Connor T.G., Danese A., Jaffee S.R., Jensen S.K.G., Pariante C., McArdle W., et al. Inflammation-related epigenetic risk and child and adolescent mental health: A prospective study from pregnancy to middle adolescence. Dev. Psychopathol. 2018;30:1145–1156. doi: 10.1017/S0954579418000330.
- 25. Stevenson A.J., Gadd D.A., Hillary R.F., McCartney D.L., Campbell A., Walker R.M., Evans K.L., Harris S.E., Spires-Jones T.L., McRae A.F., et al. Creating and validating a DNA methylation-based proxy for interleukin-6. J. Gerontol. A Biol. Sci. Med. Sci. 2021;76:2284–2292. doi: 10.1093/gerona/glab046.
- 26.Petersen L.K., Brixi G., Li J., Hu J., Wang Z., Han X., Meir A.Y., Tyrmi J., Mahalingaiah S., Piltonen T. Understanding and Predicting Polycystic Ovary Syndrome through Shared Genetic Architecture with Testosterone, SHBG, and Inflammatory Markers. medRxiv. 2023 doi: 10.1101/2023.10.17.23297115. Preprint at. [DOI] [Google Scholar]
- 27.Yousefi P.D., Suderman M., Langdon R., Whitehurst O., Davey Smith G., Relton C.L. DNA methylation-based predictors of health: applications and statistical considerations. Nat. Rev. Genet. 2022;23:369–383. doi: 10.1038/s41576-022-00465-w. [DOI] [PubMed] [Google Scholar]
- 28.Fitzpatrick M., Young S.P. Metabolomics–a novel window into inflammatory disease. Swiss Med. Wkly. 2013;143 doi: 10.4414/smw.2013.13743. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 29.Barrett J.C., Esko T., Fischer K., Jostins-Dean L., Jousilahti P., Julkunen H., Jääskeläinen T., Kerimov N., Kerminen S., Kolde A., et al. Metabolomic and genomic prediction of common diseases in 477,706 participants in three national biobanks. medRxiv. 2023 doi: 10.1101/2023.06.09.23291213. Preprint at. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 30.Cazaly E., Charlesworth J., Dickinson J.L., Holloway A.F. Genetic determinants of epigenetic patterns: providing insight into disease. Mol. Med. 2015;21:400–409. doi: 10.2119/molmed.2015.00001. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 31.Ventham N.T., Kennedy N.A., Adams A.T., Kalla R., Heath S., O’leary K.R., Drummond H., Wilson D.C., et al. IBD BIOM consortium, IBD CHARACTER consortium Integrative epigenome-wide analysis demonstrates that DNA methylation may mediate genetic risk in inflammatory bowel disease. Nat. Commun. 2016;7 doi: 10.1038/ncomms13507. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 32.Zhong A., Xiong X., Shi M., Xu H. Roles of interleukin (IL)-6 gene polymorphisms, serum IL-6 levels, and treatment in obstructive sleep apnea: a meta-analysis. Sleep Breath. 2016;20:719–731. doi: 10.1007/s11325-015-1288-6.
- 33.Kaur R.P., Vasudeva K., Singla H., Benipal R.P.S., Khetarpal P., Munshi A. Analysis of pro- and anti-inflammatory cytokine gene variants and serum cytokine levels as prognostic markers in breast cancer. J. Cell. Physiol. 2018;233:9716–9723. doi: 10.1002/JCP.26901.
- 34.Elahi M.M., Asotra K., Matata B.M., Mastana S.S. Tumor necrosis factor alpha −308 gene locus promoter polymorphism: An analysis of association with health and disease. Biochim. Biophys. Acta. 2009;1792:163–172. doi: 10.1016/J.BBADIS.2009.01.007.
- 35.Yanbaeva D.G., Dentener M.A., Spruit M.A., Houwing-Duistermaat J.J., Kotz D., Passos V.L., Wouters E.F. IL6 and CRP haplotypes are associated with COPD risk and systemic inflammation: A case-control study. BMC Med. Genet. 2009;10:23. doi: 10.1186/1471-2350-10-23.
- 36.Ning K., Li Y. Methodologies of Multi-Omics Data Integration and Data Mining: Techniques and Applications. Springer; 2023. Introduction to Multi-Omics; pp. 1–10.
- 37.Sun J., Axelsson J., Machowska A., Heimbürger O., Bárány P., Lindholm B., Lindström K., Stenvinkel P., Qureshi A.R. Biomarkers of cardiovascular disease and mortality risk in patients with advanced CKD. Clin. J. Am. Soc. Nephrol. 2016;11:1163–1172. doi: 10.2215/CJN.10441015.
- 38.Tripepi G., Mallamaci F., Zoccali C. Inflammation markers, adhesion molecules, and all-cause and cardiovascular mortality in patients with ESRD: searching for the best risk marker by multivariate modeling. J. Am. Soc. Nephrol. 2005;16:S83–S88. doi: 10.1681/asn.2004110972.
Data Availability Statement
The code for generating 1-, 2-, and 3-way risk scores is available at http://lianglab.rc.fas.harvard.edu/OmicsRiskScore/.


