2024 Sep 18;9(11):1018–1028. doi: 10.1001/jamacardio.2024.2912

Tailoring Risk Prediction Models to Local Populations

Aniket N Zinzuwadia 1, Olga Mineeva 2, Chunying Li 1, Zareen Farukhi 1,3, Franco Giulianini 1, Brian Cade 1, Lin Chen 1, Elizabeth Karlson 1, Nina Paynter 1, Samia Mora 1, Olga Demler 1,2
PMCID: PMC11411452  PMID: 39292486

Key Points

Question

Can established cardiovascular risk tools be adapted for local populations without sacrificing interpretability?

Findings

This cohort study including 95 326 individuals applied a machine learning recalibration method that uses minimal variables to the American Heart Association’s Predicting Risk of Cardiovascular Disease Events (AHA-PREVENT) equations for a New England population. This approach strengthened the AHA-PREVENT risk equations, improving calibration while maintaining similar risk discrimination.

Meaning

The results indicate that the interpretable machine learning-based recalibration method used in this study can be implemented to tailor risk stratification in local health systems.


This cohort study evaluates the use of a machine learning model to adapt guideline recommendations to local populations.

Abstract

Importance

Risk estimation is an integral part of cardiovascular care. Local recalibration of guideline-recommended models could address the limitations of existing tools.

Objective

To provide a machine learning (ML) approach to augment the performance of the American Heart Association’s Predicting Risk of Cardiovascular Disease Events (AHA-PREVENT) equations when applied to a local population while preserving clinical interpretability.

Design, Setting, and Participants

This cohort study used a New England–based electronic health record cohort of patients without prior atherosclerotic cardiovascular disease (ASCVD) who had the data necessary to calculate the AHA-PREVENT 10-year risk of developing ASCVD in the event period (2007-2016). Patients with prior ASCVD events, death prior to 2007, or age 79 years or older in 2007 were subsequently excluded. The final study population of 95 326 patients was split into 3 nonoverlapping subsets for training, testing, and validation. The AHA-PREVENT model was adapted to this local population using the open-source ML model (MLM) Extreme Gradient Boosting model (XGBoost) with minimal predictor variables, including age, sex, and AHA-PREVENT. The MLM was monotonically constrained to preserve known associations between risk factors and ASCVD risk. Along with sex, race and ethnicity data from the electronic health record were collected to validate the performance of ASCVD risk prediction in subgroups. Data were analyzed from August 2021 to February 2024.

Main Outcomes and Measures

Consistent with the AHA-PREVENT model, ASCVD events were defined as the first occurrence of either nonfatal myocardial infarction, coronary artery disease, ischemic stroke, or cardiovascular death. Cardiovascular death was coded via government registries. Discrimination, calibration, and risk reclassification were assessed using the Harrell C index, a modified Hosmer-Lemeshow goodness-of-fit test and calibration curves, and reclassification tables, respectively.

Results

In the test set of 38 137 patients (mean [SD] age, 64.8 [6.9] years; 22 708 [59.5%] women and 15 429 [40.5%] men; 935 [2.5%] Asian, 2153 [5.6%] Black, 1414 [3.7%] Hispanic, 31 400 [82.3%] White, and 2235 [5.9%] other, including American Indian, multiple races, unspecified, and unrecorded, consolidated owing to small numbers), MLM-PREVENT had improved calibration (modified Hosmer-Lemeshow P > .05) compared to the AHA-PREVENT model across risk categories in the overall cohort (χ²₃ = 2.2; P = .53 vs χ²₃ > 16.3; P < .001) and sex subgroups (men: χ²₃ = 2.1; P = .55 vs χ²₃ > 16.3; P < .001; women: χ²₃ = 6.5; P = .09 vs χ²₃ > 16.3; P < .001), while also surpassing a traditional recalibration approach. MLM-PREVENT maintained or improved AHA-PREVENT’s calibration in Asian, Black, and White individuals. Both MLM-PREVENT and AHA-PREVENT performed equally well in discriminating risk (approximate ΔC index, ±0.01). Using a clinically significant 7.5% risk threshold, MLM-PREVENT reclassified a total of 11.5% of patients. We visualize the recalibration through MLM-PREVENT ASCVD risk charts that highlight preserved risk associations of the original AHA-PREVENT model.

Conclusions and Relevance

The interpretable ML approach presented in this article enhanced the accuracy of the AHA-PREVENT model when applied to a local population while still preserving the risk associations found by the original model. This method has the potential to recalibrate other established risk tools and is implementable in electronic health record systems for improved cardiovascular risk assessment.

Introduction

Accurate assessment of atherosclerotic cardiovascular disease (ASCVD) risk plays an integral role in clinical decision-making, with clinicians using tools like the Pooled Cohort Equations (PCE) and Systematic Coronary Risk Evaluation (SCORE2) for ASCVD risk estimation.1,2,3,4 Despite their importance, risk prediction models have well-documented shortcomings, resulting in a limited generalizability to local, contemporary cohorts. For example, multiple studies report PCE overestimates 10-year ASCVD risk across risk groups, particularly in Black and Hispanic individuals.5,6,7,8,9 Furthermore, the performance of cardiovascular risk prognostic tools has been shown to deteriorate over time with evolving risk factor profiles of patients and the introduction of lipid-lowering treatments.10,11,12,13,14,15 In addition, the disconnect between model development in prospective enrollment-based cohorts and their clinical application in dynamic electronic health record (EHR) systems further compounds these challenges.16 Similar shortcomings have been reported across multiple medical domains, where guidelines are risk-based, leading investigators to advocate for region-specific ASCVD models, like SCORE2, that provide tailored risk calculators to minimize generalizability concerns.17

Recently, to address the well-established calibration challenges of the PCE, the American Heart Association (AHA) announced the race-free AHA Predicting Risk of Cardiovascular Disease Events (PREVENT) equations18 that incorporate the improved understanding of cardiovascular-kidney-metabolic syndrome and ASCVD risk. Despite recent validation in a diverse cohort, it is uncertain if the AHA-PREVENT equations could benefit from a postprocessing approach to adapt to distinct local populations not well represented in the original study. Here, we evaluate the performance of the AHA-PREVENT equations in a New England EHR cohort older than 55 years and present an approach to adapt risk prognostic tools to local study population data using a minimal set of demographic variables while preserving known risk associations. We borrow the foundation model principle from the field of machine learning (ML) by proposing that accurate local risk prognostic tools can be developed by retraining pretrained models on local data. Here, in lieu of retraining a base model, guideline-recommended cardiovascular risk models (eg, AHA-PREVENT, PCE and SCORE2) can serve as the foundation model, as said models follow rigorous validation processes overseen by expert committees. In this study, as proof of principle, we use the AHA-PREVENT equations as a foundation model, with the goal of adapting them to a New England EHR cohort beyond traditional recalibration strategies.19

To adapt AHA-PREVENT, we used the Extreme Gradient Boosting model (XGBoost), an open-source ML model (MLM) that has proven successful in developing novel prognostic models from complex, multilinear data for cardiovascular outcomes.20,21,22,23,24 Here, we trained the MLM in a training subset using the original AHA-PREVENT risk score and a limited set of demographic factors (age and sex) as input variables. This MLM-PREVENT model was then evaluated in a set-aside test population to determine whether this recalibration approach strengthened PREVENT. ML models are often black box models—a highly undesirable property in a clinical application. To overcome this issue, we monotonically constrained the MLM and demonstrated that by doing so, this approach will preserve all associations with risk pertinent to the original AHA-PREVENT model. We visualize this effect using ASCVD risk charts that are readily interpretable for clinician-patient shared decision-making.

Methods

Data Collection and Description

Study Population

We developed a cohort from the Mass General Brigham Research Patient Data Registry, a centralized clinical data registry from large Boston-area hospitals.25,26 For cohort selection, we identified all individuals (aged 55 years and older in 2007) with at least 1 lipid or blood pressure measurement during the study observation period from 1997 to 2006 and at least 1 visit in the follow-up period from 2007 to 2016. Patients with prior ASCVD events, death prior to 2007, or age 79 years or older in 2007 were subsequently excluded. We also removed participants with insufficient data for the AHA-PREVENT model. We focused on regularly seen hospital system patients to reduce informed presence bias, excluding those with a low EHR data-completeness score defined from a validated algorithm (eFigure 1 in Supplement 1).27,28,29,30 The final study population of 95 326 patients (eFigure 2 in Supplement 1) was stratified by ASCVD event rate into 3 nonoverlapping subsets: (1) a training set for model derivation (50%), (2) a validation set (10%), and (3) an independent cohort testing set (40%) (eTable 1 in Supplement 1). Patients had a median (IQR) of 152 (65-284) encounters over a median (IQR) period of 15.6 (10.1-19.2) years during the observation and follow-up period. The Mass General Brigham institutional review board approved the study protocol and waived informed consent for this study because it used historical, deidentified data and because obtaining consent would either alter the study’s end point to include only survivors or shift the study to a prospective design, requiring additional follow-up time. Data were analyzed from August 2021 to February 2024.
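The 50%/10%/40% split described above can be sketched as follows. This is a minimal illustration and not the authors' code: it assumes that "stratified by ASCVD event rate" means each subset receives the same share of patients with and without an event, and the function name `stratified_split` and the seed are hypothetical.

```python
import numpy as np

def stratified_split(event, fractions=(0.5, 0.1, 0.4), seed=0):
    """Return index arrays (train, validation, test), stratified on a binary event flag."""
    event = np.asarray(event)
    rng = np.random.default_rng(seed)
    parts = [[] for _ in fractions]
    for value in np.unique(event):
        # Shuffle the patients in this stratum (event or no event).
        idx = rng.permutation(np.flatnonzero(event == value))
        # Cumulative cut points inside the stratum, e.g. 50% and 60% of its size.
        cuts = np.round(np.cumsum(fractions)[:-1] * len(idx)).astype(int)
        for part, chunk in zip(parts, np.split(idx, cuts)):
            part.append(chunk)
    return tuple(np.sort(np.concatenate(p)) for p in parts)

# Toy cohort: 1000 patients, 72 of whom had an event (7.2%).
toy_event = np.array([1] * 72 + [0] * 928)
train_idx, val_idx, test_idx = stratified_split(toy_event)
```

Because the cut points are computed within each stratum, the event rate in every subset stays close to the overall rate, up to rounding.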

ASCVD Outcome Definition

The ASCVD outcome was defined as the first occurrence of nonfatal myocardial infarction, coronary artery disease, ischemic stroke, or cardiovascular death between 2007 and 2016. This composite end point was identified using validated ML disease phenotypes and national or state death records. ML disease phenotypes, comprising International Classification of Diseases, Ninth Revision (ICD-9), or Tenth Revision (ICD-10) codes, biomarkers, and clinical notes analyzed with natural language processing, were validated by clinicians beforehand.31 Cardiovascular deaths were determined using death records from 2 government registries: the Massachusetts Department of Public Health and the National Death Index using US Centers for Disease Control and Prevention criteria.32

Variable Definitions

We collected variables necessary for the AHA-PREVENT risk score calculation including age, sex, total cholesterol, high-density lipoprotein cholesterol, estimated glomerular filtration rate,33 systolic blood pressure, blood pressure-lowering medication use, diabetes, and smoking status. The most recent data during the observation period were noted for biomarkers. Patients who had documented use of a blood pressure-lowering or statin medication during the observation period were assumed to be taking medication for the remainder of the study period. Patients with active smoking status and those with diabetes or obesity were identified through validated disease phenotypes, similar to ASCVD event identification. Along with sex, race and ethnicity data from the EHR were collected to validate the performance of ASCVD risk prediction in subgroups.

Evaluation Metrics

Observed ASCVD event rates were determined via Kaplan-Meier analysis, while model discrimination was assessed using the Harrell C index with bootstrapped confidence intervals.34,35 Calibration was evaluated using the 2-sided Greenwood-Nam-D’Agostino χ² test (a modified Hosmer-Lemeshow test; eMethods [section 1] in Supplement 1) and visualized using calibration bar charts across risk categories of the overall cohort and subpopulations (sex, race, ethnicity, obesity, statin use, diabetes, etc).36,37 We used ASCVD risk categories (low: <5%, borderline: 5%-7.5%, intermediate: 7.5%-20%, and high: >20%) from recent guidelines, with a cutoff for adequate calibration set at P > .05.38,39 Reclassification calibration tables were calculated in the test set. Statistical analyses were performed using R version 3.6.0 (R Foundation).
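To make the risk categories and the discrimination metric concrete, the sketch below assigns a predicted risk to a guideline category and computes a basic Harrell C index in plain Python. It is an illustration under stated assumptions, not the study's R code: lower category bounds are treated as inclusive (the text does not specify boundary handling), tied event times are ignored, and the function names are hypothetical.

```python
from bisect import bisect_right

CUTS = (0.05, 0.075, 0.20)  # category boundaries from the text
LABELS = ("low", "borderline", "intermediate", "high")

def risk_category(risk):
    """Map a 10-year risk (as a proportion) to its category; lower bounds inclusive (assumption)."""
    return LABELS[bisect_right(CUTS, risk)]

def harrell_c(time, event, score):
    """Share of comparable pairs in which the earlier event has the higher score."""
    concordant = ties = comparable = 0
    n = len(time)
    for i in range(n):
        if not event[i]:
            continue  # pairs are anchored on a patient with an observed event
        for j in range(n):
            if time[i] < time[j]:  # patient j was still event-free when i had the event
                comparable += 1
                if score[i] > score[j]:
                    concordant += 1
                elif score[i] == score[j]:
                    ties += 1
    return (concordant + 0.5 * ties) / comparable

c_toy = harrell_c([1, 2, 3, 4], [1, 1, 0, 1], [0.9, 0.5, 0.6, 0.1])  # 4 of 5 pairs concordant
```

A bootstrapped confidence interval would resample patients with replacement and recompute `harrell_c` on each resample; the quadratic loop here is for clarity and would be slow on a cohort of this size.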

To illustrate model interpretability, ASCVD risk charts for theoretical office-based use of the MLM-PREVENT risk prognostic model were developed by applying the final MLM-PREVENT model to combinations of age, sex, and other risk predictors. Similar to charts for the European SCORE2 model, risk estimates were assigned a color on a continuous green (low risk) to red (high risk) gradient.40
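A risk chart of this kind can be sketched by evaluating a risk function over a grid of predictor values and mapping each estimate to a color. The code below is a hedged illustration only: `risk_to_rgb`, `chart_grid`, the 30% saturation point, and the `toy_predict` placeholder are all assumptions made for the example and do not reproduce the MLM-PREVENT model or the published color scale.

```python
from itertools import product

def risk_to_rgb(risk, max_risk=0.30):
    """Linear green-to-red gradient; risks at or above max_risk are fully red."""
    frac = min(max(risk / max_risk, 0.0), 1.0)
    return (round(255 * frac), round(255 * (1 - frac)), 0)

def chart_grid(predict, ages, sbps, cholesterols):
    """Color for every (age, systolic blood pressure, total cholesterol) cell."""
    return {
        (age, sbp, chol): risk_to_rgb(predict(age, sbp, chol))
        for age, sbp, chol in product(ages, sbps, cholesterols)
    }

def toy_predict(age, sbp, chol):
    # Placeholder risk function, NOT the PREVENT equations.
    return 0.001 * (age - 50) + 0.0005 * (sbp - 110) + 0.0002 * (chol - 150)

grid = chart_grid(toy_predict, ages=(55, 65, 75), sbps=(120, 140, 160), cholesterols=(160, 200, 240))
```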

Statistical Analysis

AHA-PREVENT Calculation

Ten-year risk of ASCVD events was calculated for each patient using the PREVENT equations according to published equations for men and women (eTable 2 in Supplement 1).18 Analysis was restricted to the base model given the uneven availability of the optional predictors, including urine albumin-to-creatinine ratio, hemoglobin A1c, and social deprivation index.

Recalibration Approaches

XGBoost, an open-source ensemble method, combines predictions of individual decision trees to enhance model accuracy.41,42 In the training subset, we specified the MLM with Cox survival tree-based options to predict 10-year ASCVD event risk using 3 predictor variables: AHA-PREVENT risk score, age, and sex. We introduced monotonic constraints to the MLM, requiring ASCVD risk to increase with age and AHA-PREVENT risk scores. We show that with constraints, the MLM preserved associations between risk factors and ASCVD risk of the AHA-PREVENT model (eMethods [section 2] in Supplement 1). Further details on MLM development and hyperparameter tuning are provided in the eMethods (section 3) in Supplement 1. The Cox survival MLM-PREVENT was used to estimate 10-year ASCVD event risk using AHA-PREVENT risk scores, age, and sex as input variables in the test subset. Our recalibration approach is visually summarized in Figure 1. We also recalibrated AHA-PREVENT using the conventional calibration slope and intercept method. Within the test set, we updated the baseline hazard by fitting a Cox model (PREVENT–calibration slope [CS]), which incorporated the linear predictor component of the risk score as a covariate.43 Each risk score (AHA-PREVENT, PREVENT-CS, and MLM-PREVENT) underwent minimal adjustment to match the observed 10-year ASCVD event rate.
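The constrained recalibration model described above might be configured along the following lines with the Python `xgboost` package. This is a sketch under stated assumptions rather than the authors' implementation (the study reports using R): the column order, `max_depth`, `eta`, and the number of boosting rounds are placeholders, not the tuned values from the eMethods, and the baseline 10-year survival would have to be estimated separately from the training data.

```python
import numpy as np

# Assumed column order of the feature matrix X.
FEATURES = ("prevent_risk", "age", "sex")

PARAMS = {
    "objective": "survival:cox",        # Cox partial-likelihood loss
    "eval_metric": "cox-nloglik",
    "monotone_constraints": "(1,1,0)",  # non-decreasing in PREVENT score and age; sex unconstrained
    "tree_method": "hist",
    "max_depth": 3,                     # placeholder, not the tuned value
    "eta": 0.05,                        # placeholder, not the tuned value
}

def cox_labels(time, event):
    """survival:cox encodes right-censored follow-up times as negative labels."""
    time = np.asarray(time, dtype=float)
    return np.where(np.asarray(event, dtype=bool), time, -time)

def fit_mlm(X, time, event, rounds=200):
    """Fit the constrained booster; predictions are returned on the hazard-ratio scale."""
    import xgboost as xgb  # imported lazily so the helpers above work without the package
    dtrain = xgb.DMatrix(np.asarray(X, dtype=float), label=cox_labels(time, event))
    return xgb.train(PARAMS, dtrain, num_boost_round=rounds)

def ten_year_risk(hazard_ratio, baseline_survival_10y):
    """Standard Cox conversion: risk = 1 - S0(10) ** HR."""
    return 1.0 - baseline_survival_10y ** np.asarray(hazard_ratio, dtype=float)
```

With the constraint string `(1,1,0)`, every tree split on the PREVENT score or on age can only raise the predicted hazard as the feature increases, which is what keeps the recalibrated score ordered consistently with the original equations within each sex.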

Figure 1. Workflow of Local Recalibration of the American Heart Association’s Predicting Risk of Cardiovascular Disease Events (AHA-PREVENT) Equations in a Contemporary Older New England Health Record Cohort.

ASCVD indicates atherosclerotic cardiovascular disease; CAD, coronary artery disease; demogr, demographic information; EHR, electronic health record; MI, myocardial infarction; MLM, machine learning model; NLP, natural language processing; resid, residual; T1, tree 1; T2, tree 2.

Results

In the test cohort of 38 137 patients, the mean (SD) age was 64.8 (6.9) years; 22 708 patients (59.5%) were women and 15 429 (40.5%) were men. By self-report, 935 participants (2.5%) were Asian, 2153 (5.6%) were Black, 1414 (3.7%) were Hispanic, 31 400 (82.3%) were White, and 2235 (5.9%) were other, including American Indian, multiple races, unspecified, and unrecorded, consolidated owing to small numbers. Baseline characteristics were recorded both overall and by subgroup (Table). Participant characteristics varied by race and ethnicity, with Black and Hispanic individuals having higher rates of hypertension, diabetes, and obesity than White patients. Mean observed ASCVD 10-year event rate was 7.2%, which differed by race and ethnicity with rates of 6.0%, 7.9%, 4.9%, and 7.3% for Asian, Black, Hispanic, and White individuals, respectively (eTable 3 in Supplement 1). Median (IQR) estimated uncalibrated AHA-PREVENT ASCVD risk for the overall cohort was 6.3% (3.7%-10.3%). Both AHA-PREVENT and MLM-PREVENT showed similar discriminative performance in the independent test population with C indexes of 0.72 (95% CI, 0.71-0.73) and 0.73 (95% CI, 0.72-0.73), respectively. Moreover, MLM-PREVENT maintained comparable discriminative performance across all race, ethnicity, and sex groups with nonsignificant changes in C indexes between −0.01 and 0.01 (eTable 4 in Supplement 1).

Table. Baseline Characteristics of Test Set by Race, Ethnicity, and Sex^a

| Characteristic | Overall (N = 38 137) | Male (n = 15 429) | Female (n = 22 708) | Asian (n = 935) | Black (n = 2153) | Hispanic (n = 1414) | White (n = 31 400) |
|---|---|---|---|---|---|---|---|
| Age, mean (SD), y | 64.8 (6.9) | 64.6 (6.8) | 64.9 (6.9) | 64.8 (7.0) | 64 (6.8) | 64 (6.7) | 64.8 (6.9) |
| Men, % | 40.5 | NA | NA | 38.3 | 36.9 | 36.4 | 41 |
| Systolic blood pressure, mean (SD), mm Hg | 129.3 (17.6) | 129.8 (17.2) | 128.9 (17.9) | 126.4 (18.6) | 134.8 (19.6) | 130.8 (18.2) | 128.9 (17.3) |
| Hypertension, % | 51.5 | 53.6 | 50 | 53.4 | 70.5 | 63.7 | 49.8 |
| Type 2 diabetes, % | 15.7 | 18.3 | 14 | 25.2 | 33.7 | 31.7 | 13.2 |
| Obesity, % | 29.0 | 26.7 | 30.6 | 12.7 | 40.9 | 36.7 | 28.4 |
| Smoking, % | 33.3 | 38.8 | 29.5 | 16.9 | 33.9 | 26.7 | 34.7 |
| Total cholesterol, mean (SD), mg/dL | 190.2 (40.7) | 178.8 (39.1) | 197.9 (39.9) | 189.1 (39.5) | 184.6 (42.3) | 183.9 (40.7) | 191.0 (40.5) |
| HDL cholesterol, mean (SD), mg/dL | 56.6 (18.2) | 49 (14.9) | 61.8 (18.4) | 55.4 (16.2) | 54.4 (16.7) | 50 (14.9) | 57.3 (18.4) |
| eGFR, mean (SD), mL/min/1.73 m² | 76.9 (17.7) | 77.4 (17.5) | 76.6 (17.8) | 79.1 (18.0) | 70.4 (20.4) | 80.2 (19.3) | 77.1 (17.2) |
| First-reported statin use before 2007, % | 35 | 39.1 | 32.2 | 35 | 39.8 | 42.4 | 34.1 |
| First-reported statin use after 2007, % | 29.6 | 31.2 | 28.5 | 30 | 26 | 22.2 | 30.6 |

Abbreviations: eGFR, estimated glomerular filtration rate; HDL, high-density lipoprotein.

SI conversion factor: To convert cholesterol to mmol/L, multiply by 0.0259.

^a Along with sex, race and ethnicity data from the electronic health record were collected to validate the performance of ASCVD risk prediction in subgroups.

The 10-year ASCVD outcome rate of 7.2% in this EHR cohort was similar to the mean uncalibrated predicted AHA-PREVENT risk of 7.5%; consequently, we minimally adjusted all risk scores for this observed rate. Despite this calibration in the large, both AHA-PREVENT and PREVENT-CS remained insufficiently calibrated (modified Hosmer-Lemeshow P < .05) across groups of predicted risk in the overall cohort (AHA-PREVENT: χ²₃ > 16.3; P < .001; PREVENT-CS: χ²₃ = 12.8; P = .005) and among male (AHA-PREVENT: χ²₃ > 16.3; P < .001; PREVENT-CS: χ²₃ > 16.3; P < .001) and female (AHA-PREVENT: χ²₃ > 16.3; P < .001; PREVENT-CS: χ²₃ > 16.3; P < .001) participants (Figure 2; eFigure 3 in Supplement 1). In the test subset, MLM-PREVENT exhibited adequate calibration across groups of predicted risk in the overall cohort (χ²₃ = 2.2; P = .53) and among male (χ²₃ = 2.1; P = .55) and female (χ²₃ = 6.5; P = .09) participants by alleviating ASCVD risk underestimation in male patients and overestimation in lower-risk female patients (Figure 2). MLM-PREVENT and PREVENT-CS maintained or improved AHA-PREVENT’s calibration in Asian, Black, and White individuals, although it did not resolve risk overestimation in Hispanic individuals (eFigures 3 and 4 in Supplement 1). Additionally, when considering subpopulations defined by clinical variables not included in the MLM recalibration, MLM-PREVENT had a variable effect: comparable calibration performance in patients with statin use in the observation period (MLM-PREVENT: χ²₃ = 11.3; P = .01; AHA-PREVENT: χ²₃ = 9.8; P = .02) and in patients with an estimated glomerular filtration rate consistent with at least early-stage kidney disease (MLM-PREVENT: χ²₃ > 16.3; P < .001; AHA-PREVENT: χ²₃ > 16.3; P < .001), and a decrease in calibration in patients with obesity (MLM-PREVENT: χ²₃ = 14.8; P = .002; AHA-PREVENT: χ²₃ = 4.0; P = .26) and diabetes (MLM-PREVENT: χ²₃ = 11.3; P = .01; AHA-PREVENT: χ²₃ = 7.4; P = .06) (eFigure 5 in Supplement 1). 
Furthermore, in the test dataset, the MLM-PREVENT model reclassified 4369 patients (11.5%) to a different risk category: 2022 from less than 7.5% to greater than 7.5% and 2347 in the opposite direction (eTable 5 in Supplement 1).
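The reclassification counts above can be checked with simple arithmetic, and the same tally is easy to compute from paired risk scores. The helper below is a hedged sketch: the function name `reclassification` is hypothetical, and it assumes that a risk exactly equal to 7.5% is counted in the higher category, which the text does not specify.

```python
def reclassification(old_risk, new_risk, threshold=0.075):
    """Count patients whose risk crosses the threshold between two models."""
    up = sum(o < threshold <= n for o, n in zip(old_risk, new_risk))
    down = sum(n < threshold <= o for o, n in zip(old_risk, new_risk))
    return {"up": up, "down": down, "share": (up + down) / len(old_risk)}

# Arithmetic check against the counts reported in the text.
moved_up, moved_down, n_test = 2022, 2347, 38137
print(moved_up + moved_down, round(100 * (moved_up + moved_down) / n_test, 1))  # prints: 4369 11.5

# Toy example with four patients: one moves up, one moves down.
toy = reclassification([0.05, 0.08, 0.10, 0.02], [0.09, 0.06, 0.12, 0.03])
```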

Figure 2. Rates of Events.

Rates of events estimated by the American Heart Association’s Predicting Risk of Cardiovascular Disease Events (AHA-PREVENT), PREVENT–calibration slope (CS), and Machine Learning Model (MLM)–PREVENT models compared with rates of observed events across subgroups by groups of atherosclerotic cardiovascular disease (ASCVD) predicted risk in the overall cohort (A-C), electronic health record (EHR) self-reported male sex (D-F), and EHR self-reported female sex (G-I). All risk scores were adjusted for the mean to match observed event rate. P values are for Nam-D’Agostino χ² goodness-of-fit test; a nonsignificant χ² (P > .05) indicates good calibration.
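For readers unfamiliar with the test named in this caption, the general form of the Greenwood-Nam-D’Agostino statistic can be written as below. This is a sketch of the standard published form, not a reproduction of the study's eMethods; the symbols are introduced here for illustration: G is the number of risk groups, KM_g(t) the Kaplan-Meier event probability at t = 10 years in group g, and p̄_g the mean predicted risk in that group.

```latex
\chi^2_{\mathrm{GND}}
  = \sum_{g=1}^{G}
    \frac{\bigl[\mathrm{KM}_g(t) - \bar{p}_g\bigr]^2}
         {\widehat{\operatorname{Var}}\bigl[\mathrm{KM}_g(t)\bigr]}
  \;\sim\; \chi^2_{G-1},
\qquad
\widehat{\operatorname{Var}}\bigl[\mathrm{KM}_g(t)\bigr]
  = \hat{S}_g(t)^2 \sum_{t_i \le t} \frac{d_i}{n_i\,(n_i - d_i)}
```

Here Ŝ_g(t) = 1 − KM_g(t) is the Kaplan-Meier survival estimate, and d_i and n_i are the numbers of events and of patients at risk at each event time t_i (Greenwood's formula). With the 4 risk categories used in this study, G − 1 = 3, which matches the 3 degrees of freedom reported for the χ² values.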

We visualized the MLM-PREVENT model by generating risk charts across age, cholesterol, systolic blood pressure, sex, and other variables. Figure 3 presents an example chart for patients without diabetes or blood pressure–lowering medications, with normal high-density lipoprotein cholesterol values and an estimated glomerular filtration rate of 70 mL/min/1.73 m². For comparison, we provide both a corresponding AHA-PREVENT risk diagram (eFigure 7 in Supplement 1) and a heatmap illustrating the absolute differences between the AHA-PREVENT and MLM-PREVENT models (Figure 4). Additionally, we used the MLM to recalibrate PCE risk scores with age, sex, race, and ethnicity in this EHR population, yielding improved calibration across groups of predicted risk and race and ethnicity (eMethods [section 4] in Supplement 1).

Figure 3. Machine Learning Model (MLM)–Adapted Predicting Risk of Cardiovascular Disease Events (PREVENT) Risk Score Heatmap for Men and Women.

MLM-PREVENT risk score heatmap for women (A) and men (B) with a high-density lipoprotein cholesterol level of 60 mg/dL, estimated glomerular filtration rate of 70 mL/min/1.73 m², no diabetes, and no current blood pressure–lowering medication use across age, total cholesterol, systolic blood pressure, and smoking categories. SI conversion factor: To convert cholesterol to mmol/L, multiply by 0.0259.

Figure 4. Absolute Difference Between Machine Learning Model (MLM)–Adapted Predicting Risk of Cardiovascular Disease Events (PREVENT) and the American Heart Association’s PREVENT Predicted Probabilities.

Absolute difference plotted as a heatmap for men and women with a high-density lipoprotein cholesterol level of 60 mg/dL, estimated glomerular filtration rate of 70 mL/min/1.73 m², no diabetes, and no current blood pressure–lowering medication use across age, total cholesterol, systolic blood pressure, and smoking categories.

Discussion

To our knowledge, this cohort study offers the first published ML approach to refine a risk prognostic model for local populations that focuses on preserving its clinical interpretability. In a New England EHR study of 95 326 individuals, we recalibrated AHA-PREVENT through an MLM in a training subset of patients using the AHA-PREVENT score and important demographic variables (age and sex). In the independent test set, the AHA-PREVENT equations had an overall acceptable model discrimination but were not sufficiently calibrated among male and female participants even after applying a traditional recalibration approach. MLM-PREVENT resolved certain limitations of the original model in terms of calibration while maintaining its discriminative performance. Using the 7.5% risk threshold used in current American College of Cardiology (ACC) and AHA guidelines, our approach resulted in the reclassification of 11.5% of individuals in the test subset. We also provide several tools to demonstrate the interpretability of this model.

ML methods are gaining prominence in cardiology, particularly in reading electrocardiograms and training other imaging- and genomic-based risk assessments.44,45 Despite this progress, the inherent black box nature of complex models poses a challenge for clinical application.46 However, XGBoost has been shown to surpass linear models in cardiovascular risk assessments while providing an ability to explain a model’s assessment at a cumulative patient level (eMethods [section 5] in Supplement 1).47,48 Here, we demonstrate that an MLM trained with a minimal set of predictor variables (AHA-PREVENT risk score plus age and sex) offered a clinically relevant strategy for recalibration but with several important considerations. First, to avoid a black box model, we used constrained XGBoost to maintain the risk associations of the original AHA-PREVENT model (eMethods [section 2] in Supplement 1). Second, we provided a visual summary of how this minimal set of predictor variables impacts MLM-PREVENT’s recalibration of AHA-PREVENT. For instance, as illustrated in eFigure 8 in Supplement 1, we observe that male sex contributes to an upward adjustment in model prediction, addressing AHA-PREVENT underestimation of risk in this population. Third, as a further consistency check, we recommend producing risk charts akin to those in European guidelines, allowing clinicians to verify the model’s monotonicity and quantify risk adjustment at the patient level, as exemplified in Figure 3.
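The monotonicity check recommended above can also be automated. The sketch below sweeps one input over a range of values while holding the others fixed and confirms that the predicted risk never decreases. It is an illustration under assumptions: `is_monotone_in` is a hypothetical helper, and the two toy prediction functions stand in for a fitted model, which is not reproduced here.

```python
def is_monotone_in(predict, base, feature, values, tol=1e-12):
    """True if predict(...) never decreases as `feature` sweeps through the sorted values."""
    preds = [predict({**base, feature: v}) for v in sorted(values)]
    return all(later >= earlier - tol for earlier, later in zip(preds, preds[1:]))

def constrained_toy(x):
    # Toy stand-in for a constrained model: risk rises with age and PREVENT score.
    return 0.002 * (x["age"] - 55) + 0.9 * x["prevent_risk"]

def unconstrained_toy(x):
    # Toy stand-in for an unconstrained model with a dip in risk at age 70.
    return constrained_toy(x) - (0.05 if x["age"] == 70 else 0.0)

patient = {"age": 60, "prevent_risk": 0.08, "sex": 1}
ages = range(55, 80, 5)
ok = is_monotone_in(constrained_toy, patient, "age", ages)
bad = is_monotone_in(unconstrained_toy, patient, "age", ages)
```

Running the same sweep over a grid of baseline patients gives a numerical counterpart to inspecting the risk charts by eye.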

The inadequate calibration of AHA-PREVENT in this cohort across male and female participants is noteworthy given the recent development and validation of these race-free equations in a large, contemporary US population that encompassed many adjudicated datasets.18 We suspect this discrepancy may stem from the unique characteristics of this New England cohort that are not consistent with the original validation at the national level, including older age (mean, 64.8 years), high rates of statin use (64.6% in either observation or event period), and higher socioeconomic status (median household income by zip code, approximately $87 000). This study adds to the growing evidence that any risk prediction model should be adapted to reflect local populations with distinct risk factor profiles, physician practice patterns, and disease incidence.7 Accordingly, local recalibration or postprocessing of existing tools has been seen as an increasingly feasible approach to address inevitable miscalibration in subpopulations of any risk prognostic tool.49,50

Scaling up the development of local models, despite mounting evidence supporting their effectiveness, has proven challenging. In contrast, guideline-recommended models undergo rigorous processes to ensure both statistical and clinical validity. The published open-source model is then subject to validation across independent datasets, ensuring that treatment decisions are based on clinically sound evidence. Without such checks, models may yield inaccurate calculations, leading to inappropriate treatment allocation. For instance, a proprietary sepsis prediction model derived from a single EHR system was promptly retracted after the research community demonstrated substandard performance across multiple cohorts.51,52,53 Additionally, black box models that bypass external validation could be prone to undetected model instability with slight data perturbations, posing a risk if incorporated into treatment decisions.54,55,56 To address these challenges and scale up local model development, we leveraged a guideline-recommended model as a foundation model, assuming it accumulates disease-risk associations that are clinically relevant based on current knowledge. We preserved these associations by specifying monotonic constraints that convert XGBoost, a black box ML algorithm, to a transparent method using standard software tools. This approach aims to optimize clinical relevance by ensuring the local model retains all disease-risk associations present in the established model.

The key clinical implication of this study is the proposal to recalibrate established models by retraining them on local data to mitigate miscalibration. In this article, we demonstrate this concept as a proof-of-principle, using AHA-PREVENT as a foundation model by retraining its disease-biomarker associations in an EHR-based contemporary cohort. Results were consistent when PCE risk scores were used instead of AHA-PREVENT, demonstrating the potential generalizability of this approach to recalibrate any guidelines-recommended model (eMethods [section 4] in Supplement 1). While XGBoost was used as the adaptation methodology, we suggest its use when traditional recalibration approaches, such as calibration intercept and slope (PREVENT-CS), are inadequate, as observed in this dataset. Furthermore, we encourage validation of this recalibration approach in distinct patient populations in other health systems. For detailed implementation steps, please refer to eMethods (section 6) in Supplement 1.

In addition, we recognize the ongoing importance of developing new ASCVD risk calculators, like the AHA-PREVENT equations, that incorporate an improved understanding of cardiovascular-kidney-metabolic syndrome and ASCVD risk.18 Still, we posit that a sustainable solution to inevitable miscalibration due to unique characteristics of geographic localities that cannot be fully captured with standard risk factors as well as temporal changes in treatment strategies requires a ready-to-use methodology for local recalibration. For our recalibration strategy, we limited input variables to the MLM to demographic variables contained in the foundation model; further exploration could consider the incorporation of additional variables, such as clinical comorbidities (eg, diabetes and obesity) or race and ancestry with possible ASCVD risk underestimation from existing risk calculators (eg, South Asian ancestry).57 When selecting special variables in a general setting, it is crucial to prioritize variables that have well-established associations with the outcome of interest and are readily available in clinical datasets. These variables should also contribute to improving model calibration without adding unnecessary complexity, ensuring the model remains interpretable and clinically applicable.

In this local EHR cohort, we observed ASCVD risk overestimation from AHA-PREVENT in Hispanic patients that was unresolved with a race-free recalibration approach. This miscalibration could stem from factors such as our local population’s nonrepresentativeness of the broader US population, lower Hispanic representation in the development of AHA-PREVENT equations, or our relatively small sample size of Hispanic individuals (approximately 3500). Regardless, further validation of AHA-PREVENT in multiethnic cohorts should be explored. The use of race in risk algorithms has been questioned in several medical conditions where race-based risk assessments worsen existing health disparities, such as the underdiagnosis of chronic kidney disease in Black patients due to adjusted glomerular filtration rate.58,59,60,61 While historically accepted for ASCVD risk estimation to safeguard against miscalibration and underperformance in high-risk communities, race correction has faced criticism for yielding significant and biologically implausible differences in predicted ASCVD risk between White and Black individuals.62 Additionally, the false equivalence of race as a biologic construct when included in risk tools has motivated recent calls for race to be removed from ASCVD risk calculators and replaced with causal mechanistic pathways and social determinants of health.63 Thus, we refrained from including race in our recalibration approach for AHA-PREVENT and were encouraged to see adequate calibration of AHA-PREVENT in Black individuals. However, the inclusion of race in an ideal ASCVD prediction risk calculator is challenging to avoid given the measurable correlations between race and cardiovascular disease, the impacts of structural racism that are not reflected in other causal variables, and the need for timely risk assessments in clinical settings. 
As PCE incorporated race into risk score calculation, we do report results of an MLM recalibration that incorporates race in eMethods (section 4) in Supplement 1; MLM-PCE exhibited improved model calibration in the New England cohort in Asian, Black, and Hispanic populations by incorporating these variables in model training. Further dialogue is necessary to determine the appropriateness of including race in recalibration strategies.
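As a rough illustration of the subgroup calibration check discussed above, the following Python sketch compares the observed event rate with the mean predicted risk in each subgroup. It is a simplification and not the analysis used in this study: it treats the 10-year outcome as binary and ignores censoring, the function name `observed_expected_by_group` is hypothetical, and the toy records are invented for illustration rather than drawn from the cohort.

```python
from collections import defaultdict


def observed_expected_by_group(records):
    """Summarize calibration per subgroup.

    records: iterable of (group, predicted_risk, event) tuples, where
    predicted_risk is a probability in [0, 1] and event is 0 or 1.
    Assumption: outcomes are fully observed (no censoring).
    """
    totals = defaultdict(lambda: [0, 0.0, 0])  # n, sum of predicted risk, events
    for group, risk, event in records:
        entry = totals[group]
        entry[0] += 1
        entry[1] += risk
        entry[2] += event

    summary = {}
    for group, (n, expected, observed) in totals.items():
        summary[group] = {
            "n": n,
            "mean_predicted": expected / n,
            "observed_rate": observed / n,
            # O/E below 1 means the model overestimates risk in this group.
            "oe_ratio": observed / expected if expected > 0 else float("nan"),
        }
    return summary


# Invented toy data: group "B" is predicted at twice its observed rate.
toy_records = [
    ("A", 0.25, 0), ("A", 0.25, 1), ("A", 0.25, 0), ("A", 0.25, 0),
    ("B", 0.5, 0), ("B", 0.5, 1), ("B", 0.5, 0), ("B", 0.5, 0),
]

for group, stats in sorted(observed_expected_by_group(toy_records).items()):
    print(group, stats)  # O/E is 1.0 for "A" and 0.5 for "B"
```

An observed-to-expected ratio well below 1 in one subgroup, alongside a ratio near 1 overall, is the pattern that would prompt a closer look at that subgroup with survival-based calibration measures.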

Limitations

We acknowledge several potential limitations of our study. First, this approach is prone to the challenges inherent to using EHR data, such as informed presence bias.64 To mitigate this challenge, we used data with a high data-completeness score, relied on validated and adjudicated disease phenotypes, and integrated cause of death information from 2 government registries.27,28 Second, to calculate risk estimates for individuals, we collected risk factor data over a 10-year observation period. Although risk estimates in clinical practice are calculated using laboratory values and clinical indicators from snapshots in time, we assume that this observation window still provides an accurate population-level representation of ASCVD risk. Third, our use of race and ethnicity categories may not fully capture the diverse ancestry within populations. Fourth, we were unable to collect information regarding statin discontinuation and assumed patients with statin use had continued, uninterrupted use following initiation. Thus, our results do not take medication discontinuation into account for a patient’s predicted risk; however, sensitivity analysis demonstrated that MLM-PREVENT addressed miscalibration in subgroups even when disregarding statin use. We recommend considering our proposed approach only after determining that classical recalibration methods are inadequate, as we observed in our data.19 Fifth, the current study design is potentially vulnerable to model optimism because our test and training data are from the same data source. However, we posit that this concern is less applicable because this framework is meant to be used for late integration to tailor existing risk prediction tools to local populations.

Conclusions

This study demonstrated that the MLM-PREVENT model effectively updated the AHA-PREVENT model by incorporating a minimal set of demographic variables while preserving disease risk associations to avoid a black box model. MLM-PREVENT was better calibrated and reclassified a noteworthy percentage of AHA-PREVENT risk scores into lower or higher risk categories. Because it relies on a minimal set of readily available predictor variables, this approach can be easily applied to local EHR systems. We demonstrated that retraining any guideline-recommended model is possible in a cost-efficient way without sacrificing interpretability.
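To make the idea of a recalibration that preserves known risk associations concrete, the sketch below fits a nondecreasing mapping from an existing risk score to a locally observed event rate. It deliberately swaps in a much simpler technique than the one used in this study: isotonic regression by the pool-adjacent-violators algorithm in place of a monotonically constrained XGBoost survival model. It also assumes binary, uncensored outcomes and omits age and sex. All function names are hypothetical, and the toy inputs are invented.

```python
import bisect
from collections import defaultdict


def pava(values, weights):
    """Weighted pool-adjacent-violators: nondecreasing least-squares fit."""
    blocks = []  # each block holds [weighted sum, total weight, number of points]
    for value, weight in zip(values, weights):
        blocks.append([value * weight, weight, 1])
        # Merge backward while the previous block's mean exceeds the last one's.
        while len(blocks) > 1 and (
            blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]
        ):
            total, weight_sum, count = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += weight_sum
            blocks[-1][2] += count
    fitted = []
    for total, weight_sum, count in blocks:
        fitted.extend([total / weight_sum] * count)
    return fitted


def fit_monotone_recalibration(predicted, events):
    """Return (sorted unique scores, monotone recalibrated risks)."""
    totals = defaultdict(lambda: [0, 0])  # score -> [events, patients]
    for score, event in zip(predicted, events):
        totals[score][0] += event
        totals[score][1] += 1
    scores = sorted(totals)
    rates = [totals[s][0] / totals[s][1] for s in scores]
    counts = [totals[s][1] for s in scores]
    return scores, pava(rates, counts)


def recalibrate(score, scores, fitted):
    """Step-function lookup: use the largest training score <= score."""
    index = bisect.bisect_right(scores, score) - 1
    return fitted[max(index, 0)]


# Invented toy data for illustration only.
scores, fitted = fit_monotone_recalibration(
    [0.1, 0.2, 0.2, 0.3, 0.4], [0, 1, 0, 0, 1]
)
print(list(zip(scores, fitted)))
print(recalibrate(0.25, scores, fitted))
```

Because the fitted mapping never decreases, a patient with a higher original score can never receive a lower recalibrated risk, which is the interpretability property that monotonic constraints are meant to protect.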

Supplement 1.

eMethods

eAppendix

eTable 1. Characteristics of study population by cohort subset

eTable 2. Estimation of 10-year ASCVD risk using the simplified AHA PREVENT sex-specific equations

eTable 3. Outcome and risk score summary of test set by race/ethnicity and sex subgroups in the test set

eTable 4. Harrell’s c-index across subgroups for PREVENT, PREVENT-CS, and MLM-PREVENT model

eTable 5. Reclassification table

eTable 6. Hyperparameters of XGBoost MLM-PREVENT

eTable 7. Estimation of ASCVD risk using PCE race- and sex-specific equations

eTable 8. Harrell’s c-index across subgroups for PCE and MLM-PCE models in test set

eFigure 1. ASCVD outcome by decile of data-completeness score

eFigure 2. Study Selection Flowchart

eFigure 3. Calibration plots for PREVENT, PREVENT following Cox calibration slope, and MLM-PREVENT across deciles of predicted PREVENT risk for subgroups

eFigure 4. Rates of events estimated by PREVENT, PREVENT-CS, and MLM-PREVENT models compared with rates of observed events across subgroups by groups of ASCVD predicted risk by race and ethnicity

eFigure 5. Rates of events estimated by PREVENT, PREVENT-CS, and MLM-PREVENT models compared with rates of observed events across subgroups by groups of ASCVD predicted risk by morbidity

eFigure 6. Rates of events estimated by PCE and MLM-PCE models compared with rates of observed events across subgroups based on estimated risk

eFigure 7. AHA-PREVENT risk score heatmap for men and women

eFigure 8. SHAP analysis of MLM-PREVENT

eFigure 9. Unconstrained PREVENT-XGBoost risk score heatmap for patients with diabetes across age, total cholesterol, systolic blood pressure, gender, and smoking categories

Supplement 2.

Data sharing statement

References

  • 1. Lloyd-Jones DM, Braun LT, Ndumele CE, et al. Use of risk assessment tools to guide decision-making in the primary prevention of atherosclerotic cardiovascular disease: a special report from the American Heart Association and American College of Cardiology. Circulation. 2019;139(25):e1162-e1177. doi: 10.1161/CIR.0000000000000638
  • 2. Visseren FLJ, Mach F, Smulders YM, et al.; ESC National Cardiac Societies; ESC Scientific Document Group. 2021 ESC Guidelines on cardiovascular disease prevention in clinical practice. Eur Heart J. 2021;42(34):3227-3337. doi: 10.1093/eurheartj/ehab484
  • 3. SCORE2 Working Group and ESC Cardiovascular Risk Collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur Heart J. 2021;42(25):2439-2454. doi: 10.1093/eurheartj/ehab309
  • 4. Assmann G, Cullen P, Schulte H. Simple scoring scheme for calculating the risk of acute coronary events based on the 10-year follow-up of the prospective cardiovascular Münster (PROCAM) study. Circulation. 2002;105(3):310-315. doi: 10.1161/hc0302.102575
  • 5. Cook NR, Ridker PM. Calibration of the pooled cohort equations for atherosclerotic cardiovascular disease: an update. Ann Intern Med. 2016;165(11):786-794. doi: 10.7326/M16-1739
  • 6. Flores Rosario K, Mehta A, Ayers C, et al. Performance of the pooled cohort equations in Hispanic individuals across the United States: insights from the Multi-Ethnic Study of Atherosclerosis and the Dallas Heart Study. J Am Heart Assoc. 2021;10(9):e018410. doi: 10.1161/JAHA.120.018410
  • 7. Yadlowsky S, Hayward RA, Sussman JB, McClelland RL, Min YI, Basu S. Clinical implications of revised pooled cohort equations for estimating atherosclerotic cardiovascular disease risk. Ann Intern Med. 2018;169(1):20-29. doi: 10.7326/M17-3011
  • 8. Pennells L, Kaptoge S, Wood A, et al.; Emerging Risk Factors Collaboration. Equalization of four cardiovascular risk algorithms after systematic recalibration: individual-participant meta-analysis of 86 prospective studies. Eur Heart J. 2019;40(7):621-631. doi: 10.1093/eurheartj/ehy653
  • 9. Damen JA, Pajouheshnia R, Heus P, et al. Performance of the Framingham risk models and pooled cohort equations for predicting 10-year risk of cardiovascular disease: a systematic review and meta-analysis. BMC Med. 2019;17(1):109. doi: 10.1186/s12916-019-1340-7
  • 10. Barda N, Yona G, Rothblum GN, et al. Addressing bias in prediction models by improving subpopulation calibration. J Am Med Inform Assoc. 2021;28(3):549-558. doi: 10.1093/jamia/ocaa283
  • 11. Pate A, Emsley R, Ashcroft DM, Brown B, van Staa T. The uncertainty with using risk prediction models for individual decision making: an exemplar cohort study examining the prediction of cardiovascular disease in English primary care. BMC Med. 2019;17(1):134. doi: 10.1186/s12916-019-1368-8
  • 12. Miller RJH, Sabovčik F, Cauwenberghs N, et al. Temporal shift and predictive performance of machine learning for heart transplant outcomes. J Heart Lung Transplant. 2022;41(7):928-936. doi: 10.1016/j.healun.2022.03.019
  • 13. Navar AM, Fonarow GC, Pencina MJ. Time to revisit using 10-year risk to guide statin therapy. JAMA Cardiol. 2022;7(8):785-786. doi: 10.1001/jamacardio.2022.1883
  • 14. Beaulieu-Jones BK, Yuan W, Brat GA, et al. Machine learning for patient risk stratification: standing on, or looking over, the shoulders of clinicians? NPJ Digit Med. 2021;4(1):62. doi: 10.1038/s41746-021-00426-3
  • 15. Kuragaichi T, Kataoka Y, Miyakoshi C, Miyamoto T, Sato Y. External validation of pooled cohort equations using systolic blood pressure intervention trial data. BMC Res Notes. 2019;12(1):271. doi: 10.1186/s13104-019-4293-1
  • 16. Goldstein BA, Pencina MJ. Testing clinical prediction models. JAMA. 2020;324(19):1998-1999. doi: 10.1001/jama.2020.19392
  • 17. Singh H, Mhasawade V, Chunara R. Generalizability challenges of mortality risk prediction models: a retrospective analysis on a multi-center database. PLOS Digit Health. 2022;1(4):e0000023. doi: 10.1371/journal.pdig.0000023
  • 18. Khan SS, Matsushita K, Sang Y, et al. Development and validation of the American Heart Association’s PREVENT Equations. Circulation. 2024;149(6):430-449. doi: 10.1161/CIRCULATIONAHA.123.067626
  • 19. Van Calster B, McLernon DJ, van Smeden M, Wynants L, Steyerberg EW; Topic Group ‘Evaluating diagnostic tests and prediction models’ of the STRATOS initiative. Calibration: the Achilles heel of predictive analytics. BMC Med. 2019;17(1):230. doi: 10.1186/s12916-019-1466-7
  • 20. Dinh A, Miertschin S, Young A, Mohanty SD. A data-driven approach to predicting diabetes and cardiovascular disease with machine learning. BMC Med Inform Decis Mak. 2019;19(1):211. doi: 10.1186/s12911-019-0918-5
  • 21. Athanasiou M, Sfrintzeri K, Zarkogianni K, Thanopoulou AC, Nikita KS. An explainable XGBoost–based approach towards assessing the risk of cardiovascular disease in patients with type 2 diabetes mellitus. In 2020 IEEE 20th International Conference on Bioinformatics and Bioengineering (BIBE), 2020: 859-864. doi: 10.1109/BIBE50027.2020.00146
  • 22. Rajliwall NS, Davey R, Chetty G. Cardiovascular risk prediction based on XGBoost. In 2018 5th Asia-Pacific World Congress on Computer Science and Engineering (APWC on CSE), 2018: 246-252. doi: 10.1109/APWConCSE.2018.00047
  • 23. Hyland SL, Faltys M, Hüser M, et al. Early prediction of circulatory failure in the intensive care unit using machine learning. Nat Med. 2020;26(3):364-373. doi: 10.1038/s41591-020-0789-4
  • 24. Mori M, Durant TJS, Huang C, et al. Toward dynamic risk prediction of outcomes after coronary artery bypass graft: improving risk prediction with intraoperative events using gradient boosting. Circ Cardiovasc Qual Outcomes. 2021;14(6):e007363. doi: 10.1161/CIRCOUTCOMES.120.007363
  • 25. Zhao J, Feng Q, Wu P, et al. Learning from longitudinal data in electronic health record and genetic data to improve cardiovascular event prediction. Sci Rep. 2019;9(1):717. doi: 10.1038/s41598-018-36745-x
  • 26. Williams BA. Constructing epidemiologic cohorts from electronic health record data. Int J Environ Res Public Health. 2021;18(24):13193. doi: 10.3390/ijerph182413193
  • 27. Lin KJ, Singer DE, Glynn RJ, Murphy SN, Lii J, Schneeweiss S. Identifying patients with high data completeness to improve validity of comparative effectiveness research in electronic health records data. Clin Pharmacol Ther. 2018;103(5):899-905. doi: 10.1002/cpt.861
  • 28. Lin KJ, Rosenthal GE, Murphy SN, et al. External validation of an algorithm to identify patients with high data-completeness in electronic health records for comparative effectiveness research. Clin Epidemiol. 2020;12:133-141. doi: 10.2147/CLEP.S232540
  • 29. Phelan M, Bhavsar NA, Goldstein BA. Illustrating informed presence bias in electronic health records data: how patient interactions with a health system can impact inference. EGEMS (Wash DC). 2017;5(1):22. doi: 10.5334/egems.243
  • 30. Yan M, Pencina MJ, Boulware LE, Goldstein BA. Observability and its impact on differential bias for clinical prediction models. J Am Med Inform Assoc. 2022;29(5):937-943. doi: 10.1093/jamia/ocac019
  • 31. Yu S, Ma Y, Gronsbell J, et al. Enabling phenotypic big data with PheNorm. J Am Med Inform Assoc. 2018;25(1):54-60. doi: 10.1093/jamia/ocx111
  • 32. Curtin SC. Trends in cancer and heart disease death rates among adults aged 45-64: United States, 1999-2017. Natl Vital Stat Rep. 2019;68(5):1-9.
  • 33. Inker LA, Eneanya ND, Coresh J, et al.; Chronic Kidney Disease Epidemiology Collaboration. New creatinine- and cystatin C-based equations to estimate GFR without race. N Engl J Med. 2021;385(19):1737-1749. doi: 10.1056/NEJMoa2102953
  • 34. Harrell FE Jr, Califf RM, Pryor DB, Lee KL, Rosati RA. Evaluating the yield of medical tests. JAMA. 1982;247(18):2543-2546. doi: 10.1001/jama.1982.03320430047030
  • 35. Uno H, Cai T, Pencina MJ, D’Agostino RB, Wei LJ. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat Med. 2011;30(10):1105-1117. doi: 10.1002/sim.4154
  • 36. Demler OV, Paynter NP, Cook NR. Tests of calibration and goodness-of-fit in the survival setting. Stat Med. 2015;34(10):1659-1680. doi: 10.1002/sim.6428
  • 37. D’Agostino RB, Nam BH. Evaluation of the Performance of Survival Analysis Models: Discrimination and Calibration Measures. In: Balakrishnan N, Rao CR, eds. Handbook of Statistics. Vol 23. Elsevier; 2003:1-25.
  • 38. Arnett DK, Blumenthal RS, Albert MA, et al. 2019 ACC/AHA Guideline on the Primary Prevention of Cardiovascular Disease: a report of the American College of Cardiology/American Heart Association Task Force on Clinical Practice Guidelines. Circulation. 2019;140(11):e596-e646. doi: 10.1161/CIR.0000000000000678
  • 39. Muntner P, Colantonio LD, Cushman M, et al. Validation of the atherosclerotic cardiovascular disease pooled cohort risk equations. JAMA. 2014;311(14):1406-1415. doi: 10.1001/jama.2014.2630
  • 40. Piepoli MF, Hoes AW, Agewall S, et al.; ESC Scientific Document Group. 2016 European Guidelines on cardiovascular disease prevention in clinical practice: the Sixth Joint Task Force of the European Society of Cardiology and Other Societies on Cardiovascular Disease Prevention in Clinical Practice (constituted by representatives of 10 societies and by invited experts) developed with the special contribution of the European Association for Cardiovascular Prevention & Rehabilitation (EACPR). Eur Heart J. 2016;37(29):2315-2381. doi: 10.1093/eurheartj/ehw106
  • 41. Chen T, Guestrin C. XGBoost: a scalable tree boosting system. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (Association for Computing Machinery, New York, NY, USA, 2016): 785-794. doi: 10.1145/2939672.2939785
  • 42. Lin I. IyarLin/survXgboost. 2024. https://github.com/IyarLin/survXgboost
  • 43. Crowson CS, Atkinson EJ, Therneau TM. Assessing calibration of prognostic risk scores. Stat Methods Med Res. 2016;25(4):1692-1706. doi: 10.1177/0962280213497434
  • 44. Unterhuber M, Kresoja KP, Rommel KP, et al. Proteomics-enabled deep learning machine algorithms can enhance prediction of mortality. J Am Coll Cardiol. 2021;78(16):1621-1631. doi: 10.1016/j.jacc.2021.08.018
  • 45. Commandeur F, Slomka PJ, Goeller M, et al. Machine learning to predict the long-term risk of myocardial infarction and cardiac death based on clinical risk, coronary calcium, and epicardial adipose tissue: a prospective study. Cardiovasc Res. 2020;116(14):2216-2225. doi: 10.1093/cvr/cvz321
  • 46. Petch J, Di S, Nelson W. Opening the black box: the promise and limitations of explainable machine learning in cardiology. Can J Cardiol. 2022;38(2):204-213. doi: 10.1016/j.cjca.2021.09.004
  • 47. Moore A, Bell M. XGBoost, a novel explainable ai technique, in the prediction of myocardial infarction: a UK Biobank cohort study. Clin Med Insights Cardiol. 2022;16:11795468221133611. doi: 10.1177/11795468221133611
  • 48. Salah H, Srinivas S. Explainable machine learning framework for predicting long-term cardiovascular disease risk among adolescents. Sci Rep. 2022;12(1):21905. doi: 10.1038/s41598-022-25933-5
  • 49. Pennells L, Kaptoge S, Di Angelantonio E. Adapting cardiovascular risk prediction models to different populations: the need for recalibration. Eur Heart J. 2023;45(2):129-131. doi: 10.1093/eurheartj/ehad748
  • 50. Laukkanen JA, Kunutsor SK. Is ‘re-calibration’ of standard cardiovascular disease (CVD) risk algorithms the panacea to improved CVD risk prediction and prevention? Eur Heart J. 2019;40(7):632-634. doi: 10.1093/eurheartj/ehy726
  • 51. Wong A, Otles E, Donnelly JP, et al. External validation of a widely implemented proprietary sepsis prediction model in hospitalized patients. JAMA Intern Med. 2021;181(8):1065-1070. doi: 10.1001/jamainternmed.2021.2626
  • 52. Habib AR, Lin AL, Grant RW. The epic sepsis model falls short-the importance of external validation. JAMA Intern Med. 2021;181(8):1040-1041. doi: 10.1001/jamainternmed.2021.3333
  • 53. Lyons PG, Hofford MR, Yu SC, et al. Factors associated with variability in the performance of a proprietary sepsis prediction model across 9 networked hospitals in the US. JAMA Intern Med. 2023;183(6):611-612. doi: 10.1001/jamainternmed.2022.7182
  • 54. Rudin C. Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead. Nat Mach Intell. 2019;1(5):206-215. doi: 10.1038/s42256-019-0048-x
  • 55. Rudin C, Radin J. Why are we using black box models in AI when we don’t need to? a lesson from an explainable AI competition. Harv Data Sci Rev. 2019;1(2). doi: 10.1162/99608f92.5a8a3a3d
  • 56. Antun V, Renna F, Poon C, Adcock B, Hansen AC. On instabilities of deep learning in image reconstruction and the potential costs of AI. Proc Natl Acad Sci U S A. 2020;117(48):30088-30095. doi: 10.1073/pnas.1907377117
  • 57. Volgman AS, Palaniappan LS, Aggarwal NT, et al.; American Heart Association Council on Epidemiology and Prevention; Cardiovascular Disease and Stroke in Women and Special Populations Committee of the Council on Clinical Cardiology; Council on Cardiovascular and Stroke Nursing; Council on Quality of Care and Outcomes Research; and Stroke Council. Atherosclerotic cardiovascular disease in South Asians in the United States: epidemiology, risk factors, and treatments: a scientific statement from the American Heart Association. Circulation. 2018;138(1):e1-e34. doi: 10.1161/CIR.0000000000000580
  • 58. Vyas DA, Eisenstein LG, Jones DS. Hidden in plain sight—reconsidering the use of race correction in clinical algorithms. N Engl J Med. 2020;383(9):874-882. doi: 10.1056/NEJMms2004740
  • 59. Norris KC, Eneanya ND, Boulware LE. Removal of race from estimates of kidney function: first, do no harm. JAMA. 2021;325(2):135-137.
  • 60. Obermeyer Z, Powers B, Vogeli C, Mullainathan S. Dissecting racial bias in an algorithm used to manage the health of populations. Science. 2019;366(6464):447-453. doi: 10.1126/science.aax2342
  • 61. Eneanya ND, Yang W, Reese PP. Reconsidering the consequences of using race to estimate kidney function. JAMA. 2019;322(2):113-114. doi: 10.1001/jama.2019.5774
  • 62. Vasan RS, van den Heuvel E. Differences in estimates for 10-year risk of cardiovascular disease in Black versus White individuals with identical risk factor profiles using pooled cohort equations: an in silico cohort study. Lancet Digit Health. 2022;4(1):e55-e63. doi: 10.1016/S2589-7500(21)00236-3
  • 63. Vyas DA, James A, Kormos W, Essien UR. Revising the atherosclerotic cardiovascular disease calculator without race. Lancet Digit Health. 2022;4(1):e4-e5. doi: 10.1016/S2589-7500(21)00258-2
  • 64. Goldstein BA, Bhavsar NA, Phelan M, Pencina MJ. Controlling for informed presence bias due to the number of health encounters in an electronic health record. Am J Epidemiol. 2016;184(11):847-855. doi: 10.1093/aje/kww112

Articles from JAMA Cardiology are provided here courtesy of American Medical Association
