Abstract
Aims
Evaluate sex differences in cardiovascular disease (CVD) risk prediction, including use of (i) optimal sex-specific risk predictors and (ii) sex-specific risk thresholds.
Methods and results
Prospective cohort study using UK Biobank, including 121 724 and 182 632 healthy men and women, respectively, aged 38–73 years at baseline. There were 11 899 (men) and 9110 (women) incident CVD cases (hospitalization or mortality) with a median of 12.1 years of follow-up. We used recalibrated pooled cohort equations (PCEs; 7.5% 10-year risk threshold as per US guidelines), QRISK3 (10% 10-year risk threshold as per UK guidelines), and Cox survival models using sparse sex-specific variable sets (via LASSO stability selection) to predict CVD risk separately in men and women. LASSO stability selection included 12 variables in common between men and women, with 3 additional variables selected for men and 1 for women. C-statistics were slightly lower for PCE than QRISK3 and models using stably selected variables, but were similar between men and women: 0.67 (0.66–0.68), 0.70 (0.69–0.71), and 0.71 (0.70–0.72) in men and 0.69 (0.68–0.70), 0.72 (0.71–0.73), and 0.72 (0.71–0.73) in women for PCE, QRISK3, and models using stably selected variables, respectively. At current clinically implemented risk thresholds, test sensitivity was markedly lower in women than men for all models: at 7.5% 10-year risk, sensitivity was 65.1 and 68.2% in men and 24.0 and 33.4% in women for PCE and models using stably selected variables, respectively; at 10% 10-year risk, sensitivity was 53.7 and 52.3% in men and 16.8 and 20.2% in women for QRISK3 and models using stably selected variables, respectively. Specificity was correspondingly higher in women than men. However, the sensitivity in women at 5% 10-year risk threshold increased to 50.1, 58.5, and 55.7% for PCE, QRISK3, and models using stably selected variables, respectively.
Conclusion
Use of sparse sex-specific variables improved CVD risk prediction compared with PCE but not QRISK3. At current risk thresholds, PCE and QRISK3 work less well for women than men, but sensitivity was improved in women using a 5% 10-year risk threshold. Use of sex-specific risk thresholds should be considered in any re-evaluation of CVD risk calculators.
Keywords: CVD risk prediction, Pooled cohort equations, QRISK3, Biomarkers, Sparse variable selection
Time of primary review: 41 days
See the editorial comment for this article ‘How far are we from accurate sex-specific risk prediction of cardiovascular disease? One size may not fit all’, by B. Huang et al., https://doi.org/10.1093/cvr/cvae135.
1. Introduction
Cardiovascular disease (CVD) is the leading cause of morbidity and mortality worldwide.1 Risk stratification via accurate prediction of future CVD risk is key to guiding effective early management and prevention, including lifestyle modifications and lipid-lowering therapeutics. A systematic review of CVD prediction models found that the most commonly included variables were age, smoking, systolic blood pressure, history of diabetes, total cholesterol, and high-density lipoprotein cholesterol.2 Alongside ethnicity and history of treated hypertension, these variables are included in the pooled cohort equations (PCEs), which are used in the USA to predict 10-year absolute atherosclerotic CVD risk as a decision aid for recommending lipid-lowering (statin) therapy, with a treatment threshold of 7.5% 10-year absolute risk or greater.3,4 In the UK, QRISK3 is used instead and incorporates additional variables, with a 10% 10-year absolute risk of atherosclerotic CVD used as a statin treatment threshold.5 However, both the US and UK guidelines note that lower treatment thresholds are likely to be clinically beneficial.6,7 Furthermore, there is evidence for sex- and age-specific treatment thresholds, with worse test sensitivity for younger vs. older adults and women vs. men.8,9
Models including additional variables have been proposed to predict incident coronary artery disease (CAD), with recent examples combining data from electronic healthcare records with polygenic risk scores (PRSs) for CAD10 and blood markers.11 Other recent studies have reported that PRS for CAD/CVD, when considered in isolation, yield a modest or non-significant improvement in predictive performance for CVD risk over traditional risk models.12–16 It has also been suggested that metabolomic data may be predictive of incident CVD, although their clinical utility in risk prediction remains to be established.17–20
Here, we analyse the UK Biobank data set to evaluate sex differences in CVD risk prediction, including use of (i) optimal sex-specific risk predictors and (ii) sex-specific risk thresholds.
2. Methods
2.1. Study participants
The UK Biobank recruited 502 536 volunteers aged 38–73 years between 2006 and 2010. Demographic and lifestyle factors, medical and surgical histories, standardized clinical measurements, and blood samples were collected at baseline. Genotyping and a panel of laboratory tests on stored serum and red blood cells were also performed.21 For the primary analyses, we excluded a total of 198 180 participants: 151 806 with prevalent CVD or missing data for any of the variables included in PCE or QRISK3,14 45 887 on lipid-lowering agents (as PCE and QRISK3 are used to guide the initiation of lipid-lowering therapeutics), and a further 487 who had withdrawn consent, leaving 304 356 participants without prior CVD at baseline for the present analyses (121 724 men and 182 632 women, Figure 1). Among these, a subset of 27 873 men and 40 982 women also had data on nuclear magnetic resonance (NMR) metabolic biomarkers measured in baseline plasma samples. The study complies with the Declaration of Helsinki.
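As a quick consistency check on the participant flow described above, the reported exclusions can be summed and compared with the analysed sample. This is only an arithmetic sketch in Python (the study itself used R); all counts are taken from the text, and the dictionary labels are shorthand introduced here.

```python
# Participant flow reported in Section 2.1 (all counts taken from the text).
recruited = 502_536
exclusions = {
    "prevalent CVD or missing PCE/QRISK3 variables": 151_806,
    "on lipid-lowering agents": 45_887,
    "withdrew consent": 487,
}
total_excluded = sum(exclusions.values())
analysed = recruited - total_excluded
men, women = 121_724, 182_632

# The three exclusion counts add up to the stated total, and the remainder
# matches the stated numbers of men and women.
assert total_excluded == 198_180
assert analysed == 304_356 == men + women
print(f"excluded={total_excluded}, analysed={analysed}")
```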
2.2. CVD definition
CVD was defined as myocardial infarction and its sequelae, angina, non-haemorrhagic stroke, and transient ischaemic attack.14 Cases (i.e. people who had a cardiovascular event during follow-up) were identified using linkage to hospital admissions, operation/procedure codes, and death registrations, and prevalent cases were further defined via a nurse-administered questionnaire at baseline (see Supplementary material online, Table S1). Participants who did not have a recorded cardiovascular event during follow-up were defined as non-cases, with censoring by availability of hospital admission and mortality data (7 April 2021).
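The case and censoring definitions above can be illustrated with a small sketch of how a follow-up time and event indicator might be derived for one participant. This is a hedged Python illustration, not the authors' code: the function name `follow_up`, its arguments, and in particular the optional `other_exit` date (for example a non-CVD death) are assumptions introduced here; only the censoring date comes from the text.

```python
from datetime import date
from typing import Optional, Tuple

# Administrative censoring date for linked hospital and mortality data (from the text).
CENSOR_DATE = date(2021, 4, 7)


def follow_up(baseline: date,
              cvd_event: Optional[date] = None,
              other_exit: Optional[date] = None) -> Tuple[float, int]:
    """Return (follow-up in years, event indicator) for one participant.

    `other_exit` is a hypothetical earlier exit (for example a non-CVD death);
    how such exits were handled is an assumption, not stated in the text.
    """
    end = CENSOR_DATE if other_exit is None else min(CENSOR_DATE, other_exit)
    if cvd_event is not None and cvd_event <= end:
        return (cvd_event - baseline).days / 365.25, 1   # incident case
    return (end - baseline).days / 365.25, 0              # censored non-case


# A participant recruited on 7 April 2009 with no recorded event is censored
# at the end of data availability.
years, is_case = follow_up(date(2009, 4, 7))
print(round(years, 1), is_case)  # 12.0 0
```

Events recorded after the censoring date are treated as unobserved in this sketch, so such a participant counts as a non-case.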
2.3. Study variables
Variables included in PCE3,4 are age, ethnicity (White, Black, and Other), smoking (never, former, and current), diabetes (prevalent self-reported or from hospital records), total and high-density lipoprotein cholesterol, systolic blood pressure (mean of two measurements), and use of antihypertensive medication. QRISK35 includes additional variables: standard deviation of systolic blood pressure, body mass index, family history of CAD, area-level deprivation score (Townsend), medication use including oral steroids and atypical antipsychotics, and self-reported prevalent conditions including chronic kidney disease stages 3–5, atrial fibrillation, migraine, rheumatoid arthritis, systemic lupus erythematosus, severe mental illness, and erectile dysfunction in men. In addition to the above variables, we considered for variable selection 26 further baseline serum biochemistry measurements (excluding oestradiol and rheumatoid factor, which were missing in more than 80% of participants)22,23; 23 baseline haematology measurements including full blood count and white blood cell differential24; a PRS for CVD developed using lassosum,25 as previously described14; and NMR-derived metabolic variables (available in ∼120 000 randomly sampled participants from the whole UK Biobank cohort). The NMR-derived metabolomic profile includes estimated blood levels of 168 annotated molecules, including lipoprotein lipids, fatty acids, and fatty acid compositions, as well as some low-molecular-weight metabolites including amino acids, ketone bodies, and glycolysis metabolites.26
2.4. Statistical analyses
We randomly split the data into three sex-stratified and non-overlapping sets, constraining the ratio of CVD cases to non-cases to be equal in all three data splits (Figure 1): (i) a variable selection data set (40%); (ii) a training data set (30%), in which PCE and QRISK3 were calculated/recalibrated (see Supplementary material online, Methods and Figure S1) and unpenalized Cox models were fit using stably selected variables; and (iii) a hold-out test data set (30%), comparing the predictive accuracy of recalibrated PCE and QRISK3 with the models using stably selected variables. The Cox models used follow-up time as the underlying time variable with CVD event as outcome. In the subset of participants with NMR data, we compared variable selection and model performance excluding and including metabolomic data. After filtering for highly correlated variables and overlap with directly measured blood markers, 18 metabolomic variables were included in our analyses (see Supplementary material online, Methods and Figure S2) and were available in 68 855 of the 304 356 participants included in our study (Figure 1). For biochemical and haematological variables, there was up to 20% missingness with similar proportions for CVD cases and non-cases (see Supplementary material online, Table S2). Missing values were imputed using multiple imputation with predictive mean matching over five iterations of chained random forests.27 Skewed variables were log-transformed prior to analyses.
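The 40/30/30 split described above, stratified by sex and with the case to non-case ratio held constant, can be sketched as follows. This is a minimal Python illustration under stated assumptions rather than the authors' R code: the function `stratified_split`, the `(participant_id, sex, is_case)` record layout, and the toy cohort are all hypothetical.

```python
import random
from collections import defaultdict


def stratified_split(records, fractions=(0.4, 0.3, 0.3), seed=0):
    """Split participant ids into non-overlapping sets, preserving the
    case:non-case ratio within each sex.

    records: iterable of (participant_id, sex, is_case) tuples (assumed layout).
    """
    rng = random.Random(seed)
    strata = defaultdict(list)
    for participant_id, sex, is_case in records:
        strata[(sex, is_case)].append(participant_id)

    splits = [[] for _ in fractions]
    for ids in strata.values():
        rng.shuffle(ids)
        start, cumulative = 0, 0.0
        for k, fraction in enumerate(fractions):
            cumulative += fraction
            # The last split takes the remainder so that no participant is dropped.
            stop = len(ids) if k == len(fractions) - 1 else round(cumulative * len(ids))
            splits[k].extend(ids[start:stop])
            start = stop
    return splits


# Toy cohort: 1000 participants, half men, 20% cases in each sex.
records = [(i, "M" if i % 2 == 0 else "F", i % 10 in (0, 1)) for i in range(1000)]
selection, training, held_out = stratified_split(records)
print(len(selection), len(training), len(held_out))  # 400 300 300
```

Splitting within each sex-by-case stratum, rather than sampling from the whole cohort, is what keeps the case proportion identical across the three sets.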
2.4.1. Variable selection
For variable selection, we used LASSO penalized regression in a stability selection framework28,29 to identify reproducible, parsimonious sets of variables that jointly contribute to CVD risk prediction. Briefly, we fit LASSO Cox models on 1000 independent 50% subsamples of the variable selection data set and estimated, across subsamples, the per-variable selection proportion as a proxy for variable importance. Model calibration was achieved by jointly identifying (i) the penalty parameter λ (controlling the sparsity of the LASSO model) and (ii) the threshold in selection proportion π (controlling the stability of the model, conditional on the penalty) above which a feature was considered as stably selected. These parameters were obtained by maximizing a likelihood-based stability score using the sharp package in R.29 We also performed sensitivity analyses assessing the reliability of the LASSO stability selection, using 100 subsampled variable selection data sets (see Supplementary material online, Methods).
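The resampling logic of stability selection can be sketched in a few lines. The Python below is a simplified illustration, not the sharp implementation: the base selector is a stand-in that thresholds absolute correlation with the outcome (it is not a LASSO Cox model), the calibration of λ and π by a stability score is not shown, and all function names and the simulated data are assumptions made for this example.

```python
import random
from statistics import mean


def abs_correlation(xs, ys):
    """Absolute Pearson correlation; returns 0.0 if either input is constant."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return abs(sxy) / (sxx * syy) ** 0.5 if sxx > 0 and syy > 0 else 0.0


def make_selector(features, threshold):
    """Stand-in base selector (NOT a LASSO Cox model): keep features whose
    absolute correlation with the outcome reaches `threshold`, which plays
    the role of the sparsity penalty lambda."""
    def select(rows):
        ys = [row["y"] for row in rows]
        return [f for f in features
                if abs_correlation([row[f] for row in rows], ys) >= threshold]
    return select


def selection_proportions(rows, select, n_subsamples=100, seed=0):
    """Fraction of 50% subsamples (without replacement) selecting each feature.
    Features never selected are absent from the result (proportion 0)."""
    rng = random.Random(seed)
    counts = {}
    for _ in range(n_subsamples):
        subsample = rng.sample(rows, len(rows) // 2)
        for feature in select(subsample):
            counts[feature] = counts.get(feature, 0) + 1
    return {feature: n / n_subsamples for feature, n in counts.items()}


def stable_set(proportions, pi):
    """Features whose selection proportion is at least the threshold pi."""
    return sorted(f for f, p in proportions.items() if p >= pi)


# Simulated data: `signal` drives the outcome, `noise` does not.
rng = random.Random(1)
rows = []
for _ in range(400):
    signal, noise = rng.gauss(0, 1), rng.gauss(0, 1)
    rows.append({"signal": signal, "noise": noise, "y": signal + 0.5 * rng.gauss(0, 1)})

proportions = selection_proportions(rows, make_selector(["signal", "noise"], 0.3))
print(stable_set(proportions, pi=0.9))
```

In the study, the selector applied to each subsample is a LASSO Cox model at penalty λ, and the pair (λ, π) is chosen by maximizing the stability score rather than being fixed in advance as it is here.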
2.4.2. Predictive performance
We calculated predictive accuracy (C-statistics) as well as sensitivity and specificity at relevant risk thresholds for 10-year risk (7.5% threshold for PCE, 10% threshold for QRISK3, and both 7.5 and 10% thresholds for models using stably selected variables). We used logistic regression models to perform receiver operating characteristic (ROC) analyses, reporting the mean and 95% confidence intervals of the area under the ROC curve (AUC). We also used a nested approach where log hazards from PCE and QRISK3, respectively, were forced into the LASSO stability selection models in place of their constituent variables. In addition, we calculated sensitivity and specificity at a 5% 10-year risk threshold in women across all models.
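For readers less familiar with these metrics, the sketch below shows how sensitivity and specificity at a risk threshold, and the AUC, can be computed from predicted 10-year risks and observed binary outcomes. It is an illustrative Python example on made-up values, not the authors' analysis: it ignores censoring, treats a risk at or above the threshold as high risk (matching the ≥ convention of the tables), and omits confidence intervals.

```python
def sensitivity_specificity(risks, outcomes, threshold):
    """Sensitivity and specificity when risk >= threshold is classed as high risk."""
    tp = sum(1 for r, y in zip(risks, outcomes) if y and r >= threshold)
    fn = sum(1 for r, y in zip(risks, outcomes) if y and r < threshold)
    tn = sum(1 for r, y in zip(risks, outcomes) if not y and r < threshold)
    fp = sum(1 for r, y in zip(risks, outcomes) if not y and r >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


def auc(risks, outcomes):
    """Probability that a random case has a higher predicted risk than a
    random non-case (ties count one half)."""
    cases = [r for r, y in zip(risks, outcomes) if y]
    non_cases = [r for r, y in zip(risks, outcomes) if not y]
    wins = sum((c > n) + 0.5 * (c == n) for c in cases for n in non_cases)
    return wins / (len(cases) * len(non_cases))


# Made-up predicted 10-year risks and outcomes (1 = incident CVD).
risks = [0.02, 0.06, 0.08, 0.12, 0.055, 0.09]
outcomes = [0, 0, 1, 1, 1, 0]

for threshold in (0.075, 0.05):
    sens, spec = sensitivity_specificity(risks, outcomes, threshold)
    print(f"threshold={threshold:.3f} sensitivity={sens:.2f} specificity={spec:.2f}")
print(f"AUC={auc(risks, outcomes):.2f}")
```

Lowering the threshold from 7.5% to 5% in this toy example raises sensitivity and lowers specificity while leaving the AUC unchanged, which is why a threshold comparison carries information that the C-statistic alone does not.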
Statistical analyses were performed using R version 4.2.2.30
3. Results
Mean age at baseline in men was 54.4 years in non-cases and 58.9 years in cases and 55.2 and 60.0 years, respectively, in women. A total of 11 899 men and 9110 women were diagnosed with CVD during the period of follow-up (median 12.1 years). Descriptive statistics, stratified by sex and case status, are shown in Supplementary material online, Table S3. Corresponding descriptive statistics for the subset with metabolomic data are reported in Supplementary material online, Table S4.
Our stability selection model consistently selected 12 variables in both men and women (Figure 2): age, albumin, antihypertensive medication, apolipoprotein B, atrial fibrillation, C-reactive protein, current smoker, cystatin C, family history of CAD, glycated haemoglobin, systolic blood pressure, and a PRS for CVD. In addition, apolipoprotein A1, lipoprotein(a), and white blood cell count were selected in men only and triglycerides in women only (see Supplementary material online, Table S5). Including variables beyond those stably selected did not substantially improve model performance (see Supplementary material online, Figure S3).
ROC analyses with logistic models for incident CVD in test data showed improvement in predictive accuracy when using stably selected variables vs. recalibrated PCE but not for QRISK3: in men, AUCs were 0.67 (0.66–0.68) for PCE and 0.70 (0.69–0.71) for QRISK3 vs. 0.71 (0.70–0.72) for models using stably selected variables; in women, they were 0.69 (0.68–0.70) for PCE and 0.72 (0.71–0.73) for QRISK3 vs. 0.72 (0.71–0.73) for stably selected variables (Figure 3).
Table 1 shows 10-year risk prediction reclassification, sensitivity, and specificity for LASSO stability selection variables vs. PCE (7.5% risk threshold) and QRISK3 (10% risk threshold). Test sensitivity was markedly lower in women than men for all models: at 7.5% 10-year risk, sensitivity was 65.1 and 68.2% in men and 24.0 and 33.4% in women for PCE and models using stably selected variables, respectively; at 10% 10-year risk, sensitivity was 53.7 and 52.3% in men and 16.8 and 20.2% in women for QRISK3 and stably selected variables, respectively. Specificity was correspondingly higher in women than men. However, the sensitivity in women at 5% 10-year risk threshold increased to 50.1, 58.5, and 55.7% for PCE, QRISK3, and stably selected variables, respectively (Table 2).
Table 1.

A PCE vs. LASSO

Men

CVD status | PCE predicted 10-year risk (%) | LASSO predicted 10-year risk <7.5% | LASSO predicted 10-year risk ≥7.5% | Reclassified (%) |
---|---|---|---|---|
Cases | <7.5 | 808 | 439 | 35.2 |
Cases | ≥7.5 | 329 | 1995 | 14.2 |
Non-cases | <7.5 | 16 886 | 2851 | 14.4 |
Non-cases | ≥7.5 | 3610 | 9601 | 27.3 |

Metric | LASSO (%) | PCE (%) |
---|---|---|
Sensitivity (at 7.5% 10-year risk) | 68.2 | 65.1 |
Specificity (at 7.5% 10-year risk) | 62.2 | 59.9 |

Women

CVD status | PCE predicted 10-year risk (%) | LASSO predicted 10-year risk <7.5% | LASSO predicted 10-year risk ≥7.5% | Reclassified (%) |
---|---|---|---|---|
Cases | <7.5 | 1631 | 448 | 21.5 |
Cases | ≥7.5 | 189 | 466 | 28.9 |
Non-cases | <7.5 | 43 785 | 3149 | 6.7 |
Non-cases | ≥7.5 | 2430 | 2694 | 47.4 |

Metric | LASSO (%) | PCE (%) |
---|---|---|
Sensitivity (at 7.5% 10-year risk) | 33.4 | 24.0 |
Specificity (at 7.5% 10-year risk) | 88.8 | 90.2 |

B QRISK3 vs. LASSO

Men

CVD status | QRISK3 predicted 10-year risk (%) | LASSO predicted 10-year risk <10% | LASSO predicted 10-year risk ≥10% | Reclassified (%) |
---|---|---|---|---|
Cases | <10 | 1391 | 264 | 16.0 |
Cases | ≥10 | 312 | 1604 | 16.3 |
Non-cases | <10 | 22 431 | 1499 | 6.3 |
Non-cases | ≥10 | 2483 | 6535 | 27.5 |

Metric | LASSO (%) | QRISK3 (%) |
---|---|---|
Sensitivity (at 10% 10-year risk) | 52.3 | 53.7 |
Specificity (at 10% 10-year risk) | 75.6 | 72.6 |

Women

CVD status | QRISK3 predicted 10-year risk (%) | LASSO predicted 10-year risk <10% | LASSO predicted 10-year risk ≥10% | Reclassified (%) |
---|---|---|---|---|
Cases | <10 | 2058 | 218 | 9.6 |
Cases | ≥10 | 124 | 334 | 27.1 |
Non-cases | <10 | 48 269 | 1358 | 2.7 |
Non-cases | ≥10 | 984 | 1447 | 40.5 |

Metric | LASSO (%) | QRISK3 (%) |
---|---|---|
Sensitivity (at 10% 10-year risk) | 20.2 | 16.8 |
Specificity (at 10% 10-year risk) | 94.6 | 95.3 |
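The sensitivity, specificity, and reclassification percentages in Table 1 follow directly from the cross-tabulated counts. The Python sketch below reproduces the panel A figures for men and women from those counts; the function names and the dictionary keys are labels introduced here for illustration.

```python
def pct(numerator, denominator):
    """Percentage rounded to one decimal place, as reported in the tables."""
    return round(100 * numerator / denominator, 1)


def summarise(cases, non_cases):
    """Each argument is [[low->low, low->high], [high->low, high->high]]:
    rows are the reference model's risk group, columns the LASSO model's."""
    (c_ll, c_lh), (c_hl, c_hh) = cases
    (n_ll, n_lh), (n_hl, n_hh) = non_cases
    n_cases = c_ll + c_lh + c_hl + c_hh
    n_non_cases = n_ll + n_lh + n_hl + n_hh
    return {
        "sensitivity_reference": pct(c_hl + c_hh, n_cases),
        "sensitivity_lasso": pct(c_lh + c_hh, n_cases),
        "specificity_reference": pct(n_ll + n_lh, n_non_cases),
        "specificity_lasso": pct(n_ll + n_hl, n_non_cases),
        "cases_reclassified_upwards": pct(c_lh, c_ll + c_lh),
    }


# Counts from Table 1A (PCE vs. LASSO at the 7.5% threshold).
men = summarise(cases=[[808, 439], [329, 1995]],
                non_cases=[[16886, 2851], [3610, 9601]])
women = summarise(cases=[[1631, 448], [189, 466]],
                  non_cases=[[43785, 3149], [2430, 2694]])
print(men["sensitivity_reference"], women["sensitivity_reference"])  # 65.1 24.0
```

Sensitivity for the reference model is the share of cases in its high-risk rows, whereas sensitivity for the LASSO model is the share of cases in the high-risk columns; specificity is computed the same way from the low-risk groups among non-cases.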
Table 2.

A PCE vs. LASSO

CVD status | PCE predicted 10-year risk (%) | LASSO predicted 10-year risk <5% | LASSO predicted 10-year risk ≥5% | Reclassified (%) |
---|---|---|---|---|
Cases | <5 | 962 | 402 | 29.5 |
Cases | ≥5 | 249 | 1121 | 18.2 |
Non-cases | <5 | 34 755 | 4306 | 11.0 |
Non-cases | ≥5 | 4439 | 8558 | 34.2 |

B QRISK3 vs. LASSO

CVD status | QRISK3 predicted 10-year risk (%) | LASSO predicted 10-year risk <5% | LASSO predicted 10-year risk ≥5% | Reclassified (%) |
---|---|---|---|---|
Cases | <5 | 959 | 175 | 15.4 |
Cases | ≥5 | 252 | 1348 | 15.8 |
Non-cases | <5 | 35 380 | 2285 | 6.1 |
Non-cases | ≥5 | 3814 | 10 579 | 26.5 |

C Sensitivity and specificity at 5% 10-year risk threshold

Metric | LASSO (%) | QRISK3 (%) | PCE (%) |
---|---|---|---|
Sensitivity (at 5% 10-year risk) | 55.7 | 58.5 | 50.1 |
Specificity (at 5% 10-year risk) | 75.3 | 72.4 | 75.0 |
In sensitivity analyses where PCE or QRISK3 log hazards were included in lieu of the constituent variables, stably selected variables differed slightly from the main analyses (see Supplementary material online, Figure S4). This did not affect model performance, with similar C-statistics, sensitivity, and specificity (see Supplementary material online, Table S6).
Among the subset of 68 855 participants with available metabolomic data, glycoprotein acetyls was selected in women only, in preference to C-reactive protein (see Supplementary material online, Figure S5), with no improvement in predictive performance (see Supplementary material online, Figures S6 and S7). Assessment of the reliability of LASSO stability selection showed similar variable sets across 100 subsampled iterations (see Supplementary material online, Methods and Figure S8).
4. Discussion
In this large population-based cohort, use of sex-specific stably selected variables improved predictive performance for CVD beyond PCE but not QRISK3, although QRISK3 was also developed selecting from an extensive set of risk predictors.5 Among the variables selected in both men and women, some are already included in PCE and QRISK3, while others used in these risk calculators were not selected (diabetes status, ethnicity, high-density lipoprotein, and total cholesterol). At the current clinical risk thresholds, sensitivity was much lower in women (with higher specificity) than in men for both PCE and QRISK3. A higher proportion of incident CVD cases might therefore go untreated in women than men using a common risk threshold for both sexes, as is current practice.
Our results concerning test sensitivity by sex are consistent with previous findings for PCE. In an analysis of PCE among 3685 participants in the Framingham Offspring Study, sensitivity was lower in women than men at the clinically used 7.5% 10-year risk threshold, except at the oldest ages; the authors suggest using a 5% risk threshold at younger ages (40–55 years).8 In 1685 patients of the YOUNG-MI registry, who had a myocardial infarction aged 50 years or below, sensitivity of PCE in women was around half that in men at the 7.5% risk threshold. However, sensitivity in women at the 5% risk threshold was similar to that in men at the 7.5% threshold.9 Together with our own findings, these results suggest that sex-specific risk thresholds should be considered for clinical implementation to avoid sex inequality in CVD risk prediction.
Sex-specific differences in CVD risk prediction are not well understood. They may reflect underlying physiological differences, including the impact of sex hormones, vascular remodelling, lipid metabolism, and endothelial function.31,32 In our study, among lipids, triglycerides33 were selected in women only and lipoprotein(a)34,35 and apolipoprotein A136–38 in men only. Apolipoprotein B was selected in both men and women, replacing more standard lipid measures currently included in PCE and QRISK3, consistent with it being a better risk predictor of incident CVD.39,40 In keeping with this, both the European Society of Cardiology41 and the 2019 American College of Cardiology/American Heart Association guidelines on primary prevention of CVD42 have highlighted the utility of apolipoprotein B to improve risk stratification.
Systemic inflammation is an important component of CVD risk, and some of the selected variables reflect this: while white blood cell count was selected in men only, serum albumin,43 C-reactive protein44 (acute phase reactants), and cystatin C (a sensitive marker of renal function45,46) were selected in both men and women. Among NMR metabolomic variables, glycoprotein acetyls47–49 was selected in women only in preference to C-reactive protein, but this did not improve predictive accuracy. In addition, glycated haemoglobin, a biomarker used in the diagnosis and monitoring of diabetes and non-diabetic hyperglycaemia50 (both pro-inflammatory states),51 was stably selected in preference to diabetes status in both men and women, in keeping with it being a continuous and therefore more informative variable. Given that glycated haemoglobin is increasingly recorded in electronic health records and was selected here in preference to diabetes status, there may be a case for including it in CVD risk calculators.
Use of PRS in CVD risk prediction remains controversial14,52; here, it was stably selected in both men and women but made only a modest contribution to predictive accuracy, in keeping with previous analyses of UK Biobank and other data.14,15 Family history of CAD, which may reflect common lifestyle and socio-economic factors as well as genetic risk,34 was also stably selected alongside PRS, indicating that the two contribute to CVD risk both jointly and independently.
4.1. Limitations
We included only participants aged 38–73 years at baseline who were mostly of European ancestry; the participants were on average healthier, were less deprived, and had lower mortality than the general population and therefore may not be fully representative.53 While PCE was developed in US cohorts, the present study uses a UK-based cohort; we performed model recalibration to correct for population differences54 and included the standard risk prediction tool (QRISK3) used in the UK. Other potentially important predictors including coronary artery calcium were not measured, and their inclusion may further improve risk prediction or potentially compete with variables selected in our models.55 The UK Biobank does not have complete prescription data during follow-up, so it is likely that some participants’ CVD risk would have been modified from baseline through clinical management. Cost–benefit and decision analyses would be needed before implementing either sex-specific risk thresholds or an enhanced predictive score. Variable selection, training, and test data were drawn from the same population; external validation in different cohorts and settings would help to generalize our findings to other populations.
4.2. Conclusions
Use of sparse sex-specific variables improved CVD risk prediction compared with PCE but not QRISK3. At current risk thresholds, PCE and QRISK3 work less well for women than men, but sensitivity was improved in women using a 5% 10-year risk threshold. Use of sex-specific risk thresholds should be considered in any re-evaluation of CVD risk calculators.
Translational perspective.
Cardiovascular disease risk prediction is an important component of clinical risk management and disease prevention. We find that at risk prediction thresholds used by currently applied risk prediction algorithms (pooled cohort equation 7.5% 10-year risk threshold in the USA and QRISK3 10% risk threshold in the UK), sensitivity of these risk prediction tools is markedly lower in women than in men. This sex inequality implies that women are proportionately less likely to receive appropriate clinical management including lipid-lowering therapy. If the risk prediction threshold is lowered to 5% 10-year risk in women, then sensitivity in women is substantially increased.
Contributor Information
Joshua Elliott, Department of Infectious Diseases, Faculty of Medicine, Imperial College London, London, UK; Imperial College Healthcare NHS Trust, London, UK; Department of Epidemiology and Biostatistics, School of Public Health, Imperial College London, 90 Wood Ln, London W12 0BZ, UK; National Institute for Health Research Imperial Biomedical Research Centre, Imperial College London, The Bays, Entrance, 2 S Wharf Rd, London W2 1NY, UK.
Barbara Bodinier, Department of Epidemiology and Biostatistics, School of Public Health, Imperial College London, 90 Wood Ln, London W12 0BZ, UK; MRC Centre for Environment and Health, School of Public Health, Imperial College London, Praed Street, London W2 1NY, UK.
Matthew Whitaker, Department of Epidemiology and Biostatistics, School of Public Health, Imperial College London, 90 Wood Ln, London W12 0BZ, UK; MRC Centre for Environment and Health, School of Public Health, Imperial College London, Praed Street, London W2 1NY, UK.
Rin Wada, Department of Epidemiology and Biostatistics, School of Public Health, Imperial College London, 90 Wood Ln, London W12 0BZ, UK; MRC Centre for Environment and Health, School of Public Health, Imperial College London, Praed Street, London W2 1NY, UK.
Graham Cooke, Department of Infectious Diseases, Faculty of Medicine, Imperial College London, London, UK; Imperial College Healthcare NHS Trust, London, UK; National Institute for Health Research Imperial Biomedical Research Centre, Imperial College London, The Bays, Entrance, 2 S Wharf Rd, London W2 1NY, UK.
Helen Ward, Department of Epidemiology and Biostatistics, School of Public Health, Imperial College London, 90 Wood Ln, London W12 0BZ, UK; National Institute for Health Research Imperial Biomedical Research Centre, Imperial College London, The Bays, Entrance, 2 S Wharf Rd, London W2 1NY, UK.
Ioanna Tzoulaki, Department of Epidemiology and Biostatistics, School of Public Health, Imperial College London, 90 Wood Ln, London W12 0BZ, UK; National Institute for Health Research Imperial Biomedical Research Centre, Imperial College London, The Bays, Entrance, 2 S Wharf Rd, London W2 1NY, UK; MRC Centre for Environment and Health, School of Public Health, Imperial College London, Praed Street, London W2 1NY, UK; British Heart Foundation Centre for Research Excellence, Imperial College London, South Kensington Campus, London SW7 2AZ, UK; Dementia Research Institute at Imperial College London, 86 Wood Ln, London W12 0BZ, UK; Health Data Research UK, Imperial College London, Exhibition Rd, South Kensington, London SW7 2AZ, UK; Department of Hygiene and Epidemiology, University of Ioannina Medical School, Ioannina, Greece.
Paul Elliott, Department of Epidemiology and Biostatistics, School of Public Health, Imperial College London, 90 Wood Ln, London W12 0BZ, UK; National Institute for Health Research Imperial Biomedical Research Centre, Imperial College London, The Bays, Entrance, 2 S Wharf Rd, London W2 1NY, UK; MRC Centre for Environment and Health, School of Public Health, Imperial College London, Praed Street, London W2 1NY, UK; British Heart Foundation Centre for Research Excellence, Imperial College London, South Kensington Campus, London SW7 2AZ, UK; Dementia Research Institute at Imperial College London, 86 Wood Ln, London W12 0BZ, UK; Health Data Research UK, Imperial College London, Exhibition Rd, South Kensington, London SW7 2AZ, UK.
Marc Chadeau-Hyam, Department of Epidemiology and Biostatistics, School of Public Health, Imperial College London, 90 Wood Ln, London W12 0BZ, UK; MRC Centre for Environment and Health, School of Public Health, Imperial College London, Praed Street, London W2 1NY, UK.
Supplementary material
Supplementary material is available at Cardiovascular Research online.
Funding
This work was supported by the National Institute for Health Research (NIHR) Imperial Biomedical Research Centre (BRC: PA6381_WPEA and PA0866_WDVC to J.E.); NIHR Health Protection Research Units in Chemical and Radiation Threats and Hazards and in Health Impact of Environmental Hazards (HPRU-2012-10141 to P.E.); British Heart Foundation Centre of Research Excellence at Imperial College (RE/18/4/34215 to P.E.); Medical Research Council (MR/L01341X/1 to P.E.); European Union H2020 EXPANSE (Horizon 2020 grant no. 874627 to M.C.-H.); LongITools (Horizon 2020 grant no. 874739 to M.C.-H.); Wellcome Trust (205456/Z/16/Z to H.W.); and UK Dementia Research Institute at Imperial College London (MC_PC_17114 to P.E.). P.E. is a UK Dementia Research Institute (DRI) Professor at Imperial College; UK DRI is funded by the UK MRC, Alzheimer’s Society, and Alzheimer’s Research UK. P.E. is co-lead of the driver programme on Social and Environmental Determinants of Health as part of Health Data Research UK, which is supported, among others, by MRC, NIHR, Engineering and Physical Sciences Research Council, Economic and Social Research Council, Wellcome Trust, and British Heart Foundation. H.W. acknowledges support from an NIHR Senior Investigator Award and the NIHR Applied Research Collaboration (ARC) Northwest London. G.C. is supported by an NIHR Professorship. Funders did not have any role in the study design, data analyses, result interpretation, manuscript preparation, and decision to publish.
Data availability
This study was conducted using the UK Biobank resource under application number 69328 granting access to the corresponding UK Biobank genetic and phenotype data. The UK Biobank received ethical approval from the North West Multi-centre Research Ethics Committee (REC reference: 11/NW/0382) to obtain and disseminate data and samples from the participants (http://www.ukbiobank.ac.uk/ethics/).
References
- 1. WHO. Global status report on noncommunicable diseases 2014. http://www.who.int/nmh/publications/ncd-status-report-2014/en/. 2014.
- 2. Damen JA, Hooft L, Schuit E, Debray TP, Collins GS, Tzoulaki I, Lassale CM, Siontis GC, Chiocchia V, Roberts C, Schlüssel MM, Gerry S, Black JA, Heus P, van der Schouw YT, Peelen LM, Moons KG. Prediction models for cardiovascular disease risk in the general population: systematic review. BMJ 2016;353:i2416.
- 3. Stone NJ, Robinson JG, Lichtenstein AH, Bairey Merz CN, Blum CB, Eckel RH, Goldberg AC, Gordon D, Levy D, Lloyd-Jones DM, McBride P, Schwartz JS, Shero ST, Smith SC Jr, Watson K, Wilson PW, Eddleman KM, Jarrett NM, LaBresh K, Nevo L, Wnek J, Anderson JL, Halperin JL, Albert NM, Bozkurt B, Brindis RG, Curtis LH, DeMets D, Hochman JS, Kovacs RJ, Ohman EM, Pressler SJ, Sellke FW, Shen WK, Smith SC Jr, Tomaselli GF; American College of Cardiology/American Heart Association Task Force on Practice Guidelines. 2013 ACC/AHA guideline on the treatment of blood cholesterol to reduce atherosclerotic cardiovascular risk in adults: a report of the American College of Cardiology/American Heart Association Task Force on Practice Guidelines. Circulation 2014;129:S1–S45.
- 4. Goff DC Jr, Lloyd-Jones DM, Bennett G, Coady S, D'Agostino RB, Gibbons R, Greenland P, Lackland DT, Levy D, O'Donnell CJ, Robinson JG, Schwartz JS, Shero ST, Smith SC Jr, Sorlie P, Stone NJ, Wilson PW, Jordan HS, Nevo L, Wnek J, Anderson JL, Halperin JL, Albert NM, Bozkurt B, Brindis RG, Curtis LH, DeMets D, Hochman JS, Kovacs RJ, Ohman EM, Pressler SJ, Sellke FW, Shen WK, Smith SC Jr, Tomaselli GF; American College of Cardiology/American Heart Association Task Force on Practice Guidelines. 2013 ACC/AHA guideline on the assessment of cardiovascular risk: a report of the American College of Cardiology/American Heart Association Task Force on Practice Guidelines. Circulation 2014;129:S49–S73.
- 5. Hippisley-Cox J, Coupland C, Brindle P. Development and validation of QRISK3 risk prediction algorithms to estimate future risk of cardiovascular disease: prospective cohort study. BMJ Online 2017;357:j2099.
- 6. National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification. 2014.
- 7. Grundy SM, Stone NJ, Bailey AL, Beam C, Birtcher KK, Blumenthal RS, Braun LT, de Ferranti S, Faiella-Tommasino J, Forman DE, Goldberg R, Heidenreich PA, Hlatky MA, Jones DW, Lloyd-Jones D, Lopez-Pajares N, Ndumele CE, Orringer CE, Peralta CA, Saseen JJ, Smith SC Jr, Sperling L, Virani SS, Yeboah J. 2018 AHA/ACC/AACVPR/AAPA/ABC/ACPM/ADA/AGS/APhA/ASPC/NLA/PCNA guideline on the management of blood cholesterol: a report of the American College of Cardiology/American Heart Association Task Force on Clinical Practice Guidelines. Circulation 2019;139:e1082–e1143.
- 8. Navar-Boggan AM, Peterson ED, D’Agostino RB, Pencina MJ, Sniderman AD. Using age- and sex-specific risk thresholds to guide statin therapy: one size may not fit all. J Am Coll Cardiol 2015;65:1633–1639.
- 9. Singh A, Collins BL, Gupta A, Fatima A, Qamar A, Biery D, Baez J, Cawley M, Klein J, Hainer J, Plutzky J, Cannon CP, Nasir K, Di Carli MF, Bhatt DL, Blankstein R. Cardiovascular risk and statin eligibility of young adults after an MI: partners YOUNG-MI registry. J Am Coll Cardiol 2018;71:292–302.
- 10. Forrest IS, Petrazzini BO, Duffy Á, Park JK, Marquez-Luna C, Jordan DM, Rocheleau G, Cho JH, Rosenson RS, Narula J, Nadkarni GN, Do R. Machine learning-based marker for coronary artery disease: derivation and validation in two longitudinal cohorts. Lancet 2023;401:215–225.
- 11. Agrawal S, Klarqvist MDR, Emdin C, Patel AP, Paranjpe MD, Ellinor PT, Philippakis A, Ng K, Batra P, Khera AV. Selection of 51 predictors from 13,782 candidate multimodal features using machine learning improves coronary artery disease prediction. Patterns (N Y) 2021;2:100364.
- 12. Khera AV, Chaffin M, Aragam KG, Haas ME, Roselli C, Choi SH, Natarajan P, Lander ES, Lubitz SA, Ellinor PT, Kathiresan S. Genome-wide polygenic scores for common diseases identify individuals with risk equivalent to monogenic mutations. Nat Genet 2018;50:1219–1224.
- 13. Riveros-Mckay F, Weale ME, Moore R, Selzam S, Krapohl E, Sivley RM, Tarran WA, Sørensen P, Lachapelle AS, Griffiths JA, Saffari A, Deanfield J, Spencer CCA, Hippisley-Cox J, Hunter DJ, O'Sullivan JW, Ashley EA, Plagnol V, Donnelly P. Integrated polygenic tool substantially enhances coronary artery disease prediction. Circ Genomic Precis Med 2021;14:e003304.
- 14. Elliott J, Bodinier B, Bond TA, Chadeau-Hyam M, Evangelou E, Moons KGM, Dehghan A, Muller DC, Elliott P, Tzoulaki I. Predictive accuracy of a polygenic risk score-enhanced prediction model vs a clinical risk score for coronary artery disease. JAMA 2020;323:636–645. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 15. Mosley JD, Gupta DK, Tan J, Yao J, Wells QS, Shaffer CM, Kundu S, Robinson-Cohen C, Psaty BM, Rich SS, Post WS, Guo X, Rotter JI, Roden DM, Gerszten RE, Wang TJ. Predictive accuracy of a polygenic risk score compared with a clinical risk score for incident coronary heart disease. JAMA 2020;323:627–635. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 16. Khan SS, Cooper R, Greenland P. Do polygenic risk scores improve patient selection for prevention of coronary artery disease? JAMA 2020;323:614–615. [DOI] [PubMed] [Google Scholar]
- 17. Würtz P, Havulinna AS, Soininen P, Tynkkynen T, Prieto-Merino D, Tillin T, Ghorbani A, Artati A, Wang Q, Tiainen M, Kangas AJ, Kettunen J, Kaikkonen J, Mikkilä V, Jula A, Kähönen M, Lehtimäki T, Lawlor DA, Gaunt TR, Hughes AD, Sattar N, Illig T, Adamski J, Wang TJ, Perola M, Ripatti S, Vasan RS, Raitakari OT, Gerszten RE, Casas J-P, Chaturvedi N, Ala-Korpela M, Salomaa V. Metabolite profiling and cardiovascular event risk: a prospective study of 3 population-based cohorts. Circulation 2015;131:774–785. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 18. Iliou A, Mikros E, Karaman I, Elliott F, Griffin JL, Tzoulaki I, Elliott P. Metabolic phenotyping and cardiovascular disease: an overview of evidence from epidemiological settings. Heart 2021;107:1123–1129. [DOI] [PubMed] [Google Scholar]
- 19. Tzoulaki I, Castagné R, Boulangé CL, Karaman I, Chekmeneva E, Evangelou E, Ebbels TMD, Kaluarachchi MR, Chadeau-Hyam M, Mosen D, Dehghan A, Moayyeri A, Ferreira DLS, Guo X, Rotter JI, Taylor KD, Kavousi M, de Vries PS, Lehne B, Loh M, Hofman A, Nicholson JK, Chambers J, Gieger C, Holmes E, Tracy R, Kooner J, Greenland P, Franco OH, Herrington D, Lindon JC, Elliott P. Serum metabolic signatures of coronary and carotid atherosclerosis and subsequent cardiovascular disease. Eur Heart J 2019;40:2883–2896. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 20. McGranaghan P, Saxena A, Rubens M, Radenkovic J, Bach D, Schleußner L, Pieske B, Edelmann F, Trippel TD. Predictive value of metabolomic biomarkers for cardiovascular disease risk: a systematic review and meta-analysis. Biomark Biochem Indic Expo Response Susceptibility Chem 2020;25:101–111. [DOI] [PubMed] [Google Scholar]
- 21. Sudlow C, Gallacher J, Allen N, Beral V, Burton P, Danesh J, Downey P, Elliott P, Green J, Landray M, Liu B, Matthews P, Ong G, Pell J, Silman A, Young A, Sprosen T, Peakman T, Collins R. UK biobank: an open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLoS Med 2015;12:e1001779. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 22. Fry D, Moffat S, Almond R, Gordon M, Singh P. UK biobank biomarker project companion document to accompany serum biomarker data. http://biobank.ndph.ox.ac.uk/showcase/showcase/docs/serum_biochemistry.pdf. 2019.
- 23. Tierney A, Fry D, Almond R, Gordon M, Moffat S. UK biobank biomarker enhancement project companion document to accompany HbA1c biomarker data. http://www.ukbiobank.ac.uk/uk-biobank-biomarker-panel/https://biobank.ndph.ox.ac.uk/ukb/ukb/docs/serum_hb1ac.pdf. 2018.
- 24. Sheard S, Nicholls R, Froggatt J. UK biobank haematology data companion document. https://biobank.ndph.ox.ac.uk/ukb/ukb/docs/haematology.pdf. 2017.
- 25. Mak TSH, Porsch RM, Choi SW, Zhou X, Sham PC. Polygenic scores via penalized regression on summary statistics. Genet Epidemiol 2017;41:469–480. [DOI] [PubMed] [Google Scholar]
- 26. Julkunen H, Cichońska A, Tiainen M, Koskela H, Nybo K, Mäkelä V, Nokso-Koivisto J, Kristiansson K, Perola M, Salomaa V, Jousilahti P, Lundqvist A, Kangas AJ, Soininen P, Barrett JC, Würtz P. Atlas of plasma nuclear magnetic resonance biomarkers for health and disease in 118,461 individuals from the UK Biobank. June 13, 2022. 10.1101/2022.06.13.22276332, preprint: not peer reviewed. [DOI] [PMC free article] [PubMed]
- 27. Mayer M. R package: ‘missRanger'. 2023.
- 28. Meinshausen N, Bühlmann P. Stability selection. J R Stat Soc Ser B Stat Methodol 2010;72:417–473. [Google Scholar]
- 29. Bodinier B, Filippi S, Nøst TH, Chiquet J, Chadeau-Hyam M. Automated calibration for stability selection in penalised regression and graphical models. J R Stat Soc Ser C Appl Stat 2023;72:1375–1393. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 30. R Core Team . R: A Language and Environment for Statistical Computing. Vienna, Austria: R Foundation for Statistical Computing; 2020. https://www.r-project.org/. [Google Scholar]
- 31. Regitz-Zagrosek V, Kararigas G. Mechanistic pathways of sex differences in cardiovascular disease. Physiol Rev 2017;97:1–37. [DOI] [PubMed] [Google Scholar]
- 32. Gerdts E, Regitz-Zagrosek V. Sex differences in cardiometabolic disorders. Nat Med 2019;25:1657–1666. [DOI] [PubMed] [Google Scholar]
- 33. Nordestgaard BG, Varbo A. Triglycerides and cardiovascular disease. Lancet 2014;384:626–635. [DOI] [PubMed] [Google Scholar]
- 34. Mehta A, Virani SS, Ayers CR, Sun W, Hoogeveen RC, Rohatgi A, Berry JD, Joshi PH, Ballantyne CM, Khera A. Lipoprotein(a) and family history predict cardiovascular disease risk. J Am Coll Cardiol 2020;76:781–793. [DOI] [PubMed] [Google Scholar]
- 35. Nordestgaard BG, Chapman MJ, Ray K, Borén J, Andreotti F, Watts GF, Ginsberg H, Amarenco P, Catapano A, Descamps OS, Fisher E, Kovanen PT, Kuivenhoven JA, Lesnik P, Masana L, Reiner Z, Taskinen M-R, Tokgözoglu L, Tybjærg-Hansen A. Lipoprotein(a) as a cardiovascular risk factor: current status. Eur Heart J 2010;31:2844–2853. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 36. Walldius G, Jungner I. Apolipoprotein B and apolipoprotein A-I: risk indicators of coronary heart disease and targets for lipid-modifying therapy. J Intern Med 2004;255:188–205. [DOI] [PubMed] [Google Scholar]
- 37. Lu M, Lu Q, Zhang Y, Tian G. Apob/apoA1 is an effective predictor of coronary heart disease risk in overweight and obesity. J Biomed Res 2011;25:266–273. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 38. Liting P, Guoping L, Zhenyue C. Apolipoprotein B/apolipoprotein A1 ratio and non-high-density lipoprotein cholesterol. Herz 2015;40:1–7. [DOI] [PubMed] [Google Scholar]
- 39. Kohli-Lynch CN, Thanassoulis G, Moran AE, Sniderman AD. The clinical utility of apoB versus LDL-C/non-HDL-C. Clin Chim Acta 2020;508:103–108. [DOI] [PubMed] [Google Scholar]
- 40. Trompet S, Packard CJ, Jukema JW. Plasma apolipoprotein-B is an important risk factor for cardiovascular disease, and its assessment should be routine clinical practice. Curr Opin Lipidol 2018;29:51–52. [DOI] [PubMed] [Google Scholar]
- 41. Mach F, Baigent C, Catapano AL, Koskinas KC, Casula M, Badimon L, Chapman MJ, De Backer GG, Delgado V, Ference BA, Graham IM, Halliday A, Landmesser U, Mihaylova B, Pedersen TR, Riccardi G, Richter DJ, Sabatine MS, Taskinen MR, Tokgozoglu L, Wiklund O; ESC Scientific Document Group . 2019 ESC/EAS guidelines for the management of dyslipidaemias: lipid modification to reduce cardiovascular risk. Eur Heart J 2020;41:111–188. [DOI] [PubMed] [Google Scholar]
- 42. Arnett DK, Blumenthal RS, Albert MA, Buroker AB, Goldberger ZD, Hahn EJ, Himmelfarb CD, Khera A, Lloyd-Jones D, McEvoy JW, Michos ED, Miedema MD, Muñoz D, Smith SC, Virani SS, Williams KA, Yeboah J, Ziaeian B. 2019 ACC/AHA guideline on the primary prevention of cardiovascular disease: a report of the American College of Cardiology/American Heart Association Task Force on Clinical Practice Guidelines. J Am Coll Cardiol 2019;74:e177–e232. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 43. Ronit A, Kirkegaard-Klitbo DM, Dohlmann TL, Lundgren J, Sabin CA, Phillips AN, Nordestgaard BG, Afzal S. Plasma albumin and incident cardiovascular disease. Arterioscler Thromb Vasc Biol 2020;40:473–482. [DOI] [PubMed] [Google Scholar]
- 44. de Ferranti S, Rifai N. C-reactive protein and cardiovascular disease: a review of risk prediction and interventions. Clin Chim Acta 2002;317:1–15. [DOI] [PubMed] [Google Scholar]
- 45. Mussap M, Plebani M. Biochemistry and clinical role of human cystatin C. Crit Rev Clin Lab Sci 2004;41:467–550. [DOI] [PubMed] [Google Scholar]
- 46. Angelidis C, Deftereos S, Giannopoulos G, Anatoliotakis N, Bouras G, Hatzis G, Panagopoulou V, Pyrgakis V, Cleman MW. Cystatin C: an emerging biomarker in cardiovascular disease. Curr Top Med Chem 2013;13:164–179. [DOI] [PubMed] [Google Scholar]
- 47. Connelly MA, Otvos JD, Shalaurova I, Playford MP, Mehta NN. Glyca, a novel biomarker of systemic inflammation and cardiovascular disease risk. J Transl Med 2017;15:219. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 48. Chiesa ST, Charakida M, Georgiopoulos G, Roberts JD, Stafford SJ, Park C, Mykkänen J, Kähönen M, Lehtimäki T, Ala-Korpela M, Raitakari O, Pietiäinen M, Pussinen P, Muthurangu V, Hughes AD, Sattar N, Timpson NJ, Deanfield JE. Glycoprotein acetyls: a novel inflammatory biomarker of early cardiovascular risk in the young. J Am Heart Assoc 2022;11:e024380. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 49. Lawler PR, Akinkuolie AO, Chandler PD, Moorthy MV, Vandenburgh MJ, Schaumberg DA, Lee IM, Glynn RJ, Ridker PM, Buring JE, Mora S. Circulating N-linked glycoprotein acetyls and longitudinal mortality risk. Circ Res 2016;118:1106–1115. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 50. American Diabetes Association . 2. Classification and diagnosis of diabetes: standards of medical care in diabetes—2021. Diabetes Care 2020;44:S15–S33. [DOI] [PubMed] [Google Scholar]
- 51. Grossmann V, Schmitt VH, Zeller T, Panova-Noeva M, Schulz A, Laubert-Reh D, Juenger C, Schnabel RB, Abt TGJ, Laskowski R, Wiltink J, Schulz E, Blankenberg S, Lackner KJ, Münzel T, Wild PS. Profile of the immune and inflammatory response in individuals with prediabetes and type 2 diabetes. Diabetes Care 2015;38:1356–1364. [DOI] [PubMed] [Google Scholar]
- 52. Groenendyk JW, Greenland P, Khan SS. Incremental value of polygenic risk scores in primary prevention of coronary heart disease: a review. JAMA Intern Med 2022;182:1082–1088. [DOI] [PubMed] [Google Scholar]
- 53. Fry A, Littlejohns TJ, Sudlow C, Doherty N, Adamska L, Sprosen T, Collins R, Allen NE. Comparison of sociodemographic and health-related characteristics of UK Biobank participants with those of the general population. Am J Epidemiol 2017;186:1026–1034. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 54. Pennells L, Kaptoge S, Wood A, Sweeting M, Zhao X, White I, Burgess S, Willeit P, Bolton T, Moons KGM, van der Schouw YT, Selmer R, Khaw KT, Gudnason V, Assmann G, Amouyel P, Salomaa V, Kivimaki M, Nordestgaard BG, Blaha MJ, Kuller LH, Brenner H, Gillum RF, Meisinger C, Ford I, Knuiman MW, Rosengren A, Lawlor DA, Völzke H, Cooper C, Marín Ibañez A, Casiglia E, Kauhanen J, Cooper JA, Rodriguez B, Sundström J, Barrett-Connor E, Dankner R, Nietert PJ, Davidson KW, Wallace RB, Blazer DG, Björkelund C, Donfrancesco C, Krumholz HM, Nissinen A, Davis BR, Coady S, Whincup PH, Jørgensen T, Ducimetiere P, Trevisan M, Engström G, Crespo CJ, Meade TW, Visser M, Kromhout D, Kiechl S, Daimon M, Price JF, Gómez de la Cámara A, Wouter Jukema J, Lamarche B, Onat A, Simons LA, Kavousi M, Ben-Shlomo Y, Gallacher J, Dekker JM, Arima H, Shara N, Tipping RW, Roussel R, Brunner EJ, Koenig W, Sakurai M, Pavlovic J, Gansevoort RT, Nagel D, Goldbourt U, Barr ELM, Palmieri L, Njølstad I, Sato S, Monique Verschuren WM, Varghese CV, Graham I, Onuma O, Greenland P, Woodward M, Ezzati M, Psaty BM, Sattar N, Jackson R, Ridker PM, Cook NR, D'Agostino RB, Thompson SG, Danesh J, Di Angelantonio E. Equalization of four cardiovascular risk algorithms after systematic recalibration: individual-participant meta-analysis of 86 prospective studies. Eur Heart J 2019;40:621–631. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 55. Polonsky TS, Greenland P. Viewing the value of coronary artery calcium testing from different perspectives. JAMA Cardiol 2018;3:908. [DOI] [PubMed] [Google Scholar]
Data Availability Statement
This study was conducted using the UK Biobank resource under application number 69328, which grants access to the corresponding UK Biobank genetic and phenotype data. The UK Biobank received ethical approval from the North West Multi-centre Research Ethics Committee (REC reference: 11/NW/0382) to obtain and disseminate data and samples from the participants (http://www.ukbiobank.ac.uk/ethics/).