Cardiovascular Research. 2021 Jul 14;118(9):2152–2164. doi: 10.1093/cvr/cvab236

Determining a minimum set of variables for machine learning cardiovascular event prediction: results from REFINE SPECT registry

Richard Rios 1, Robert J H Miller 2,3, Lien Hsin Hu 4, Yuka Otaki 5, Ananya Singh 6, Marcio Diniz 7, Tali Sharir 8,9, Andrew J Einstein 10,11, Mathews B Fish 12, Terrence D Ruddy 13, Philipp A Kaufmann 14, Albert J Sinusas 15, Edward J Miller 16, Timothy M Bateman 17, Sharmila Dorbala 18, Marcelo DiCarli 19, Serge Van Kriekinge 20, Paul Kavanagh 21, Tejas Parekh 22, Joanna X Liang 23, Damini Dey 24, Daniel S Berman 25, Piotr Slomka 26
PMCID: PMC9302886  PMID: 34259870

Abstract

Aims

Optimal risk stratification with machine learning (ML) from myocardial perfusion imaging (MPI) includes both clinical and imaging data. While most imaging variables can be derived automatically, clinical variables require manual collection, which is time-consuming and prone to error. We determined the fewest manually input and imaging variables required to maintain the prognostic accuracy for major adverse cardiac events (MACE) in patients undergoing single-photon emission computed tomography (SPECT) MPI.

Methods and results

This study included 20 414 patients from the multicentre REFINE SPECT registry and 2984 from the University of Calgary for training and external testing of the ML models, respectively. ML models were trained using all variables (ML-All) and all image-derived variables (including age and sex, ML-Image). Next, ML models were sequentially trained by incrementally adding manually input and imaging variables to baseline ML models based on their importance ranking. The fewest variables were determined as the ML models (ML-Reduced, ML-Minimum, and ML-Image-Reduced) that achieved comparable prognostic performance to ML-All and ML-Image. Prognostic accuracy of the ML models was compared with visual diagnosis, stress total perfusion deficit (TPD), and traditional multivariable models using area under the receiver-operating characteristic curve (AUC). ML-Minimum (AUC 0.798) obtained comparable prognostic accuracy to ML-All (AUC 0.799, P = 0.19) by including 12 of 40 manually input variables and 11 of 58 imaging variables. ML-Reduced achieved comparable accuracy (AUC 0.796) with a reduced set of manually input variables and all imaging variables. In external validation, the ML models also obtained comparable or higher prognostic accuracy than traditional multivariable models.

Conclusion

Reduced ML models, including a minimum set of manually collected or imaging variables, achieved slightly lower accuracy compared to a full ML model but outperformed standard interpretation methods and risk models. ML models with fewer collected variables may be more practical for clinical implementation.

Keywords: Machine learning, Prognosis, SPECT myocardial perfusion imaging, Major adverse cardiovascular events, Dimensionality reduction

Graphical Abstract



See the editorial comment for this article ‘Refining and simplifying decision models-tackling the ‘one size fits all’ challenge’, by Pablo Lamata, https://doi.org/10.1093/cvr/cvac083.

Translational Perspective

A reduced machine learning model, with 12 out of 40 manually collected variables and 11 of 58 imaging variables, achieved >99% of the prognostic accuracy of the full model. Models with fewer manually collected features require less infrastructure to implement, are easier for physicians to utilize, and are potentially critical to ensuring broader clinical implementation. Additionally, these models can integrate mechanisms to explain patient-specific risk estimates to improve physician confidence in the machine learning prediction.

1. Introduction

Single-photon emission computed tomography (SPECT) myocardial perfusion imaging (MPI) is frequently used to evaluate patients with known or suspected coronary artery disease (CAD).1,2 Recently, machine learning (ML) techniques have been used to improve the prediction of major adverse cardiovascular events (MACE) in patients undergoing SPECT MPI.3–5 These ML models combine clinical, stress, and imaging variables to improve prediction performance. ML models can potentially be utilized clinically to automatically select patients at low risk of MACE for rest scan cancellation,6 or to improve prediction of MACE4 and revascularization.5 Although the clinical potential of ML models has become more apparent,7 how best to implement them is not well established. Clinical use of ML models requires mechanisms to collect and input data into the algorithm. Most image variables can be obtained automatically from MPI processing software. However, clinical and stress variables require either manual data entry or linkage with electronic medical records. These methods are time-consuming and error-prone, and potentially require additional infrastructure, which can be a barrier to clinical implementation.3,8,9 ML models that limit the amount of data requiring manual entry can help overcome this barrier in clinical practice.

The aim of this study was to develop an ML model with the fewest manually input and imaging variables required to maintain prognostic accuracy for MACE. We used the REgistry of Fast Myocardial Perfusion Imaging with NExt generation SPECT (REFINE SPECT), a large, multicentre, international registry of patients who underwent MPI using the latest generation of SPECT scanners. We also used an independent population of patients who underwent MPI at the University of Calgary to provide an external validation of the ML models’ accuracy for predicting MACE risk. In addition, we demonstrated a method to visualize patient-specific explanations of risk estimates, aiming to provide a tool that supports physician decision-making.

2. Materials and methods

2.1 Study population

This study included 20 414 patients from the prognostic cohort of the multicentre REFINE SPECT registry.10 The cohort included consecutive patients referred for clinically indicated SPECT MPI from 2009 to 2014 at five centres. All data were de-identified and transferred to Cedars-Sinai Medical Center and were examined for potential errors by experienced nuclear cardiologists.10 Centres obtained informed consent or a waiver of consent through site-specific mechanisms. The institutional review boards at each centre approved local data collection and transfer, and the institutional review board at Cedars-Sinai Medical Center approved the overall collection of data for the registry. The study complies with the Declaration of Helsinki. To the extent allowed by the data-sharing agreements and IRB protocols, the data from this manuscript will be shared upon written request.

2.2 External population

This study also included 2984 patients from the University of Calgary. The cohort included consecutive patients referred for clinically indicated SPECT MPI from September 2014 to April 2019, with follow-up for MACE. All data and images were de-identified and transferred to Cedars-Sinai Medical Center, where they were examined for potential errors by experienced nuclear cardiologists. The University of Calgary was granted a waiver of consent, and its institutional review board approved local data collection and transfer to the core laboratory. The institutional review board at Cedars-Sinai Medical Center approved the overall collection of data for the registry. The study complies with the Declaration of Helsinki. To the extent allowed by the data-sharing agreements and IRB protocols, the data from this manuscript will be shared upon written request.

2.3 Clinical data

Clinical data were obtained including age, sex, body mass index (BMI), past medical history, symptoms, and family history of premature clinical coronary artery disease (CAD). Past medical history included diabetes mellitus (DM), hypertension, dyslipidaemia, smoking, previous myocardial infarction (MI), previous percutaneous coronary intervention (PCI), and prior coronary artery bypass grafting (CABG).10

2.4 Stress imaging acquisition and interpretation

Details of stress imaging acquisition and interpretation are available in the Supplementary material online.

2.5 Primary outcome

The primary outcome was MACE, which included all-cause mortality, non-fatal MI, admission for unstable angina, or early or late coronary revascularization (with either PCI or CABG). All-cause mortality was determined from the Social Security Death Index for US sites, the Ministry of Health National Death Database for Israel, and through the Open Architecture Clinical Information System in Canada. Non-fatal MI was defined based on hospital admission for chest pain, elevated cardiac enzyme levels, and typical ECG changes.11 Admission for unstable angina was defined as hospital admission for cardiac chest pain without elevated cardiac enzymes. All non-fatal events were adjudicated by experienced cardiologists after reviewing all available clinical, laboratory, and imaging information. We performed a separate analysis for the prediction of hard clinical outcomes (death or non-fatal MI).

2.6 Machine learning

An overview of the ML steps and models is shown in Figure 1. First, we divided the ML variables into those that can automatically be extracted for the ML algorithm—either from automatic image analysis or from the image header—and variables that require manual entry. Second, we obtained the ranking of variable importance.12,13 We then trained ML models with all variables (ML-All) and image-derived variables (including age and sex, ML-Image). Next, we derived minimum sets of manually input and image-derived variables by sequentially adding manually input variables with the highest gain metric, until all study variables were included in the ML model. Additional details are available in the Supplementary material online, including a full list of variables in Supplementary material online, Table S1.

Figure 1.


Machine learning workflow. The machine learning (ML) pathway and validation procedures used in this study. The REFINE SPECT study population was used first to rank variable importance, followed by classifying ML variables as either image-derived or manually input. Image-derived variables included all variables that can automatically be extracted for the ML algorithm, including age and sex, while manually input variables included all variables that may require manual entry into the ML algorithm. The minimum sets of imaging and manually input variables (ML-Image-Reduced and ML-Reduced) were derived in the study population (n = 20 414) by sequentially adding variables to the baseline model until prediction performance was similar to the full ML model. ML-Minimum was the model combining the reduced sets of imaging and manually input variables. Internal testing was performed with 10-fold cross-validation and one site held-out procedures. In the one site held-out procedure, four sites were used for model training and the model was tested in the held-out site. In the external validation procedure, ML models with all variables and reduced variables were trained in the study population (n = 20 414) and then tested in the external population (n = 2984). ML models were compared to traditional risk models, expert visual interpretation, and stress total perfusion deficit (TPD). Abbreviations: AUC, area under the receiver-operating characteristic curve; MACE, major adverse cardiovascular event; NRI, net reclassification improvement.

2.7 ML variables

Ninety-eight variables were available for ML modelling across imaging, clinical, and stress test variables. Only stress imaging variables were included, based on our previous work.6 As performed previously by Hu et al.,6 missing values were imputed with the population mean for continuous variables, while categorical variables were imputed with a separate ‘missing’ category. The missing rates for study variables are the same as described by Hu et al.6
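Purely as an illustration (the study itself was implemented in R), the imputation strategy described above can be sketched in a few lines; the function name and the list-of-values representation are ours:

```python
from statistics import mean

def impute(column, is_continuous):
    """Sketch of the imputation described above: population mean for
    continuous variables, a separate 'missing' category for categorical
    ones. `None` marks a missing value."""
    if is_continuous:
        observed = [v for v in column if v is not None]
        fill = mean(observed)  # population mean of the observed values
        return [fill if v is None else v for v in column]
    return ['missing' if v is None else v for v in column]
```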

2.8 Ranking of variable importance

We used the eXtreme Gradient Boosting (XGBoost) gain metric to rank the importance of all variables. The gain metric describes the total loss reduction obtained for each feature when it is used to classify the training data, i.e., it describes the relative contribution of each feature to the model.12 See Supplementary material online, Section A.1 for more details.
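For readers unfamiliar with the gain metric, the loss reduction of one split can be written out directly from XGBoost's standard split-gain formula; the per-feature importance reported by the software is the sum of these gains over all splits that use the feature. A minimal sketch (variable names are ours):

```python
def split_gain(g_left, h_left, g_right, h_right, lam=1.0, gamma=0.0):
    """Loss reduction for a single split in XGBoost. g_* and h_* are the
    sums of first and second derivatives of the loss over the samples
    routed left/right; lam is the L2 regularization term and gamma the
    complexity penalty for adding a leaf."""
    def score(g, h):
        return g * g / (h + lam)
    return 0.5 * (score(g_left, h_left) + score(g_right, h_right)
                  - score(g_left + g_right, h_left + h_right)) - gamma
```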

2.9 Classification of ML variables

The variables were divided into two categories: (i) image-derived variables, which included all variables that can be derived automatically using SPECT MPI interpretation software or extracted from the image header (a series of tags that describe the information within an image file), and (ii) manually input variables, which included all variables that may require manual entry into the ML algorithm. We labelled 58 variables as image-derived and 40 as manually input variables; see Supplementary material online, Tables S1 and S2. Age and sex were the only variables assumed to be universally available in the image header (not requiring manual entry into the ML algorithm).

2.10 Reduction of ML variables

Nested (double) cross-validation was used to determine the prognostic value of ML models with an increasing number of manually input variables. The reduction procedure was used to determine a minimum set of manually input variables and subsequently the fewest image-derived variables by following the same steps, see Supplementary material online for additional details.

First, an ML model was trained using all image-derived variables (including age and sex, ML-Image). Subsequently, based on the ranking of variable importance, ML models were trained by incrementing the manually input variables until all study variables were included. For example, the first ML model was trained with all image-derived variables plus the most important manually input variable; then, a second ML model was trained by adding the two most important manually input variables, and so forth until all manually input variables were included in the model (ML-All). Area under the receiver-operating characteristic curve (AUC) was used to quantify the prediction performance for MACE of each ML model in each inner testing set, see Supplementary material online. We determined the minimum set of manually input variables by identifying the ML model in which the average AUC value was ≥99.5% of the ML-All.
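The selection loop above can be sketched as follows; `auc_of` stands in for the (computationally expensive) train-and-score step inside the inner cross-validation loop, and all names are hypothetical:

```python
def minimum_variable_set(ranked_vars, auc_of, auc_full, threshold=0.995):
    """Add manually input variables in order of importance until the
    model's AUC reaches `threshold` (99.5%) of the full model's AUC.
    `auc_of(selected)` is assumed to train and score a model using all
    image-derived variables plus the selected manually input ones."""
    selected = []
    for var in ranked_vars:
        selected.append(var)
        if auc_of(selected) >= threshold * auc_full:
            break
    return selected
```

With a toy scoring function that gains 0.01 AUC per added variable from a baseline of 0.70, and a full-model AUC of 0.80, the loop stops at 10 variables (the first set reaching 0.995 × 0.80 = 0.796).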

2.11 Models for MACE prediction

We selected five ML models to evaluate MACE prediction: a model with all image-derived variables (ML-Image); a model using all image-derived variables and the minimum set of manually input variables (ML-Reduced); a model with all variables (ML-All); a model using only the minimum set of image-derived variables (ML-Image-Reduced); and a model using the minimum sets of image-derived and manually input variables (ML-Minimum). Additionally, we developed two traditional clinical models to compare with our ML models: a Cox proportional hazards model (Cox) and a logistic regression model (Logistic). Initially, a univariate analysis was performed using the minimum set of manually input variables in addition to age, sex, stress TPD, stress test type (exercise or pharmacologic), and stress LVEF. Variables were then considered for entry into the multivariable model if they were significant in univariate analysis (P < 0.05) and were maintained in the multivariable model if they retained significance. We separately assessed models with clinical information only (Cox and Logistic models), followed by models with clinical information in addition to stress TPD and stress LVEF (Cox+ and Logistic+ models).

2.12 Internal validation

ML and traditional risk models were trained and tested in two internal validation procedures: (i) 10-fold cross-validation and (ii) one site held-out validation. In the 10-fold cross-validation procedure, the population was randomly split into 10 equally sized folds (each with a similar proportion of patients with MACE and representation from all five sites). One fold is held out as the testing set, while the remaining nine folds are used for training. This is repeated 10 times, with a different fold held out for testing each time. The prognostic accuracy for MACE in each of the testing folds is concatenated to obtain an average predictive performance across the 10 testing procedures. This procedure gives an estimated predictive performance when the ML model is applied in a patient population with similar patient characteristics. We compared the ML models’ performance with the four-point scale expert visual interpretation (Diagnosis) and stress TPD using AUC and pairwise comparisons.14
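AUC itself has a simple rank-based interpretation: the probability that a randomly chosen patient with MACE receives a higher score than a randomly chosen patient without. A minimal sketch, which could be applied to the concatenated fold predictions (names are ours; the study used the pROC package in R):

```python
def roc_auc(scores, labels):
    """Rank-based AUC: fraction of (event, non-event) pairs in which the
    event patient scores higher, counting ties as half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```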

In the one site held-out procedure, one site was held-out from model training and used for testing, while the remaining sites were used as the training set. The proportion of patients with MACE and population characteristics varied between the training and testing populations due to differences between sites. This procedure gives an expected performance for the ML models when applied in a new population with characteristics that are different from the original training population. We used AUC to evaluate the prediction performance of Logistic and ML models and C-statistic for Cox models, which is analogous to AUC in a time-to-event analysis.15–18 The ML models’ performance was compared with the 4-point scale expert visual interpretation (Diagnosis).

2.13 External validation

We also performed an external validation, with models trained entirely in the REFINE SPECT study population (n = 20 414) and tested in the external population (n = 2984). An overview of the machine learning workflow and validation procedures is shown in Figure 1. Since patients from the external cohort underwent supine imaging with and without CT attenuation correction (not available at other sites) instead of two position imaging, we generated generalizable ML models which were trained with only supine non-attenuation corrected imaging variables. This procedure gives an expected performance of the generalizable ML models when applied to a new population. We compared the ML models’ performance with the 4-point scale expert visual interpretation (Diagnosis) using AUC and C-statistic. Methods used to compare AUC and C-statistic are outlined in the Supplementary material online.

2.14 Explanation of individual ML prediction

We developed a method to explain the ML model’s prediction to help overcome the perception of ML models as ‘black boxes’.7 In boosting models, for a given test case, often only a few variables contribute substantially to the model response while most variables are irrelevant.19,20 We calculated the contribution of each feature to individual predictions to visually support decision-making in the evaluation of MACE risk; see further details in Supplementary material online, Section C.
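The actual method is described in the supplementary material; purely as a sketch of the underlying idea, a per-patient explanation for a single regression tree can credit the change in predicted risk at each split to the feature used there (a Saabas-style decomposition; the nested-dict tree encoding is invented for illustration):

```python
def explain_path(tree, x):
    """Per-feature contributions for one sample in one regression tree.
    Each node stores the mean prediction `value` of its training samples;
    the change in `value` along the sample's path is credited to the
    feature used at that split. Returns (baseline, {feature: contribution})
    such that baseline + sum(contributions) equals the leaf prediction."""
    contrib = {}
    node = tree
    baseline = tree["value"]
    while "feature" in node:  # internal node; leaves store only "value"
        child = node["left"] if x[node["feature"]] < node["threshold"] else node["right"]
        f = node["feature"]
        contrib[f] = contrib.get(f, 0.0) + child["value"] - node["value"]
        node = child
    return baseline, contrib
```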

2.15 Statistical analysis

Continuous variables were summarized as mean ± SD, while categorical variables were summarized as number (frequency). Variables were compared using Wilcoxon rank-sum and chi-square tests for continuous and categorical variables, respectively. Brier scores were computed between predicted and observed MACE for the ML models to evaluate their probabilistic risk predictions. In addition, we performed a risk re-classification analysis with four categories between ML-Reduced and expert visual interpretation; see Supplementary material online for more details, with results in Supplementary material online, Tables S3 and S4. All models and statistics were implemented in R (version 4.0.3), using the following open-source packages: xgboost (version 1.2.0.1),12 survival (version 3.2-7), PredictABEL (version 1.2-4), and pROC (version 1.16.2).
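The Brier score mentioned above is simply the mean squared difference between the predicted probability and the observed binary outcome (lower is better, 0 is perfect); a minimal sketch:

```python
def brier_score(predicted, observed):
    """Mean squared difference between predicted MACE probabilities and
    observed outcomes (1 = MACE, 0 = no MACE)."""
    return sum((p - o) ** 2 for p, o in zip(predicted, observed)) / len(predicted)
```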

3. Results

3.1 Study cohort

A total of 20 414 patients were included in the study population. Table 1 provides the baseline clinical characteristics of the study population and univariable analysis of clinical variables for MACE prediction. During a mean follow-up of 4.7 ± 1.5 years, 3541 patients experienced at least one MACE, including 1617 deaths, 379 myocardial infarctions, 1895 revascularizations, and 300 admissions for unstable angina. The annual rate of MACE was 3.7%. In the study population, patients who experienced MACE were older (mean age 68 vs. 63, P < 0.001) and more likely to have a history of diabetes (38% vs. 23%, P < 0.001) or hypertension (74% vs. 63%, P < 0.001). Patients who experienced MACE were also more likely to require pharmacologic stress (68% vs. 49%, P < 0.001) or to have a positive ECG response during stress (14% vs. 9%, P < 0.001).

Table 1.

Patient characteristics in the study population and univariable analysis of clinical variables for MACE prediction

Name MACE No MACE P-value AUC (95% CI)
Age 68 ± 12 63 ± 12 <0.001 0.62 (0.61–0.63)
Men 2481 (70) 9158 (54) <0.001 0.579 (0.571–0.587)
BMI 28 ± 6 28 ± 6 0.9 0.506 (0.495–0.587)
CAD risk factors
 Diabetes 1329 (38) 3882 (23) <0.001 0.573 (0.564–0.581)
 Dyslipidaemia 2252 (72) 10 348 (61) <0.001 0.554 (0.545–0.562)
 Hypertension 2635 (74) 10 281 (61) <0.001 0.567 (0.559–0.575)
 Family history of CAD 819 (23) 4821 (29) <0.001 0.473 (0.465–0.481)
 Current smoker 636 (18) 3235 (19) 0.09 0.494 (0.487–0.501)
History of CAD
 Past MI 825 (23) 1939 (12) <0.001 0.559 (0.522–0.566)
 Previous PCI 1194 (34) 2772 (16) <0.001 0.586 (0.578–0.595)
 Past CABG 537 (15) 1157 (7) <0.001 0.542 (0.535–0.548)
Symptoms 0.49 (0.48–0.5)
 Typical angina 362 (10) 862 (5) <0.001
 Asymptomatic 1744 (49) 7829 (46) <0.01
 Atypical angina 811 (23) 3780 (22) 0.5
 Non-angina 624 (18) 4402 (26) <0.001
Exercise stress type 0.592 (0.583–0.601)
 Exercise stress 1147 (32) 8584 (51) <0.001
 Pharmacologic stress 2394 (68) 8289 (49) <0.001
Stress peak HR 107 ± 29 122 ± 32 <0.001 0.637 (0.628–0.647)
% of maximal predicted HR 70 ± 18 77 ± 19 <0.001 0.616 (0.606–0.626)
ECG response to stress 0.578 (0.569–0.588)
 Negative 1830 (52) 11 217 (66) <0.001
 Positive 480 (14) 1572 (9) <0.001
 Equivocal 175 (5) 1032 (6) <0.01
 Non-diagnostic 1047 (30) 2999 (18) <0.001
 Borderline 2 (0.06) 25 (0.15) 0.27
ST deviation 0 ± 1 0 ± 1 <0.001 0.523 (0.515–0.531)
Clinical response to stress 0.491 (0.481–0.501)
 Ischaemic 845 (24) 2811 (17) <0.001
 Non-ischaemic 1982 (56) 11 580 (69) <0.001
 Abnormal 146 (4) 503 (3) <0.001
 Non-diagnostic 423 (12) 1371 (8) <0.001
 Equivocal 138 (4) 581 (3) 0.2

Abbreviations: BMI, body mass index; CABG, coronary artery bypass grafting; CAD, coronary artery disease; CI, confidence interval; HR, heart rate; MACE, major adverse cardiovascular event; PCI, percutaneous coronary intervention; AUC, area under the receiver operating characteristics curve.

3.2 External cohort

A total of 2984 patients were included in the external cohort. A comparison of the study population and external population is shown in Table 2. Supplementary material online, Table S5 provides the baseline clinical characteristics of the external population. During a mean follow-up of 3.1 ± 1.4 years, 495 patients experienced at least one MACE, including 265 deaths, 78 myocardial infarctions, 189 revascularizations, and 76 admissions for unstable angina. The annual rate of MACE was 5.7%. Patients in the external population were older (mean age 67 vs. 64, P < 0.001) and less likely to have a history of hypertension (60% vs. 63%, P < 0.001). Patients in the external population were also less likely to require pharmacologic stress (37% vs. 52%, P < 0.001).

Table 2.

Comparison of study population and external population

Name REFINE cohort (n = 20 414) External cohort (n = 2984) P-value
Age 64 ± 12 67 ± 11 <0.001
Men 11 639 (57) 1625 (54) <0.001
BMI 28.4 ± 6.2 29.6 ± 6.4 <0.001
CAD risk factors
 Diabetes 5211 (26) 771 (26) 0.719
 Dyslipidaemia 12 900 (63) 1517 (50.8) <0.001
 Hypertension 12 916 (63) 1779 (60) <0.001
 Family history of CAD 5640 (28) 1537 (51) <0.001
 Current smoker 3871 (19) 192 (6) <0.001
History of CAD
 Past MI 2764 (14) 353 (12) 0.010
 Past PCI 3966 (19) 431 (14) <0.001
 Past CABG 1694 (8) 144 (5) <0.001
Symptoms
 Typical angina 1224 (6) 132 (4) <0.001
 Asymptomatic 9573 (47) 1473 (49) 0.012
 Atypical angina 4591 (22) 461 (15) <0.001
 Non-angina 5025 (25) 918 (31) <0.001
Exercise stress 9731 (48) 1892 (63) <0.001
Pharmacologic stress 10 683 (52) 1092 (37) <0.001
Stress peak heart rate 119 ± 32 124 ± 32 <0.001
% of maximal predicted HR 92 ± 9 88 ± 11 <0.001
ECG Response to stress
 Negative 13 047 (64) 2224 (75) <0.001
 Positive 2052 (10) 556 (19) <0.001
 Equivocal 1207 (6) 204 (6.8) 0.053
 Non-diagnostic 4046 (20) 0 (0) <0.001
 Borderline 27 (0) 0 (0) 0.041
ST Deviation 0 ± 1 0 ± 1 <0.001
Clinical response to stress
 Ischaemic 3656 (18) 112 (4) <0.001
 Non-ischaemic 13 562 (67) 2622 (88) <0.001
 Abnormal 649 (3) 10 (0) <0.001
 Non-diagnostic 1794 (9) 240 (8) 0.186
 Equivocal 719 (4) 0 (0) <0.001

Abbreviations: BMI, body mass index; CABG, coronary artery bypass grafting; CAD, coronary artery disease; CI, confidence interval; HR, heart rate; MACE, major adverse cardiovascular event; PCI, percutaneous coronary intervention.

3.3 Ranking of variable importance

Figure 2 shows the 20 variables with the highest importance factor, and Supplementary material online, Figure S2 provides the importance ranking for all variables. Image-derived variables represented 5 of the top 10 most important variables, including stress TPD (default, alternative, and combined views), age, and stress dose. Meanwhile, the manually input variables in the top 10 were stress peak heart rate (HR), previous PCI, indication for test, resting HR, and pharmacological stress.

Figure 2.


Ranking of variable importance. The 20 most important features derived using XGBoost’s gain metric. The image-derived category includes variables that can be derived automatically from the image file. The manually input category includes variables that require manual entry. All variables are in the default view (upright for D-SPECT and supine for Discovery), unless specified with * (alternate view: supine for D-SPECT or prone for Discovery) or as the combined view. Values are the average of the gain metric obtained during the 10 inner training sets and represent all 20 414 patients. ECG, electrocardiogram; ES, end-systolic; HR, heart rate; PCI, percutaneous coronary intervention; TPD, total perfusion deficit.

3.4 Minimum sets of variables

Figure 3 provides the average AUCs of the sequence of ML models trained while increasing the number of manually input features within the double cross-validation procedure. ML-All had an AUC of 0.797 [95% confidence interval (CI) 0.795–0.799]. The ML model trained with the first 12 manually input variables achieved ≥ 99.5% of the maximum prognostic accuracy. The minimum manually input variables were stress peak HR, previous PCI, indications for test, resting HR, pharmacological stress, BMI, DM, ECG response, symptoms, ST deviation, % maximal predicted HR, and clinical response to stress. Age and sex were ranked 3rd and 16th in variable importance, respectively. If age and sex could not be obtained from the image header, they would need to be added to the minimum set of manually input variables to maintain the same prognostic accuracy. Meanwhile, Supplementary material online, Figure S3 provides the average AUC of the sequence of ML models trained while increasing the number of image-derived variables.

Figure 3.


Incremental prognostic value of the manually input variables. ML-Image and ML-All models were trained using all image-derived variables and all study variables, respectively. Each point represents the prediction performance of the ML model trained using all image-derived variables and first most important manually input variables, based on the ranking of variable importance, e.g., ML-1 was trained using all image-derived variables plus the manually input variable with the highest gain metric (stress peak HR). ML-2 was trained by adding the second most important manually input variable (previous PCI) to ML-1. The ML-Reduced was the model whose average AUC value was ≥ 99.5% of ML-All. Values are the average of the results obtained in each of the 10 inner testing sets and represent all 20 414 patients. Abbreviations: ML, machine learning; AUC, area under the receiver-operating characteristic curve; BMI, body mass index; CI, confidence interval; DM, diabetes mellitus; ECG, electrocardiogram; HR, heart rate; PCI, percutaneous coronary intervention.

3.5 Internal validation

Figure 4 provides the overall MACE prediction performance for ML-All, ML-Minimum, ML-Reduced, ML-Image, 4-point scale visual diagnosis (Diagnosis), and stress TPD within the 10-fold cross-validation procedure. ML-Minimum and ML-Reduced obtained clinically similar prognostic accuracy to ML-All (AUC 0.798, 95% CI 0.790–0.806 and AUC 0.796, 95% CI 0.788–0.803 vs. AUC 0.799, 95% CI 0.792–0.807; P = 0.19 for ML-Minimum vs. ML-All and P < 0.01 for ML-Reduced vs. ML-All). The ML-Minimum and ML-Reduced models decreased the number of manually input variables by 70% but maintained more than 99.6% of the prognostic accuracy. ML-Image (AUC 0.755, 95% CI 0.746–0.764) had significantly higher prognostic accuracy compared with stress TPD or physician diagnosis (AUC 0.698, 95% CI 0.688–0.708 and AUC 0.68, 95% CI 0.67–0.69, respectively; P < 0.01). ML-Image maintained 94.6% of the prognostic accuracy without any manually input variables. Supplementary material online, Figure S4 provides the distribution of AUC values for the three ML models within the 10-fold cross-validation. Given the large sample size, even a small difference in the distribution of AUC values will result in a significant difference by DeLong’s method. Supplementary material online, Figure S5 provides the overall prediction performance for hard outcomes for ML-All, ML-Image, ML-Reduced, visual diagnosis, and stress TPD.

Figure 4.


Receiver-operating characteristic curves for MACE prediction. ML-All: model trained with all variables, ML-Image: model trained with all image-derived variables, ML-Reduced: model trained with all image-derived variables plus minimum manually input variables, and ML-Minimum: model trained with both minimum sets of variables compared to standard methods. AUC are derived from the results of the outer testing sets from 10-fold cross-validation including 20 414 patients, with statistical significance assessed by Delong’s method. Abbreviations: ML, machine learning; AUC, area under the receiver-operating characteristic curve; CI, confidence interval; MACE, major adverse cardiovascular event; TPD, total perfusion deficit.

Supplementary material online, Figure S6 provides the overall MACE prediction performance of the ML model trained with the minimum set of image-derived variables (ML-Image-Reduced). The prognostic accuracy of ML-Image-Reduced was comparable to ML-Image (AUC 0.751, 95% CI 0.742–0.76 and AUC 0.755, 95% CI 0.746–0.764, respectively; P < 0.01).

Table 3 provides the univariate predictors of MACE. Tables 3 and 4 outline the components of the multivariable Cox proportional hazards and logistic regression models. Supplementary material online, Figure S7 provides the overall MACE prediction performance for visual diagnosis as well as all clinical and four selected ML models in the one site held-out validation procedures.

Table 3.

Univariable and multivariable associations with MACE using Cox proportional hazards regression

Variable Univariable hazard ratio Multivariable hazard ratio (Cox) Multivariable hazard ratio (Cox+)
Stress TPD 1.06 (1.05–1.06) 1.03 (1.03–1.04)
Age 1.04 (1.03–1.04) 1.02 (1.02–1.02) 1.02 (1.02–1.02)
Stress peak HR 0.98 (0.974–0.985) 0.989 (0.987–0.991) 0.931 (0.905–0.959)
Previous PCI 2.37 (2.2–2.55) 1.76 (1.63–1.90) 1.46 (1.35–1.58)
Indication for test 0.997 (0.996–0.998)
Resting HR 0.844 (0.826–0.863) 0.941 (0.914–0.968) 0.931 (0.905–0.959)
Pharmacological stress agent 1.19 (1.15–1.22)
BMI 0.98 (0.974–0.985) 0.975 (0.969–0.981) 0.976 (0.969–0.982)
DM 1.76 (1.64–1.89) 1.43 (1.33–1.54) 1.41 (1.31–1.52)
ECG response 1.22 (1.19–1.25) 1.10 (1.08–1.13) 1.06 (1.03–1.09)
Sex 1.95 (1.81–2.11) 1.91 (1.77–2.07) 1.65 (1.51–1.79)
ST deviation 1.24 (1.18–1.31) 1.14 (1.07–1.21) 1.10 (1.04–1.17)
% MPHR 0.978 (0.977–0.98)
Stress LVEF 0.966 (0.964–0.968) 0.996 (0.992–0.999)
Stress test type 1.01 (1–1.01) 1.02 (1.02–1.02) 1.02 (1.01–1.02)

Associations with major adverse cardiovascular events (MACE) for the minimum set of manually input variables, stress test type, stress total perfusion deficit (TPD), and stress left ventricular ejection fraction (LVEF). Variables are ordered according to their information gain ranking. Variables that were not significantly associated with MACE were removed from the multivariable model. Pharmacological stress agent describes the specific agent used, while stress test type describes the stress protocol (exercise, pharmacologic, or augmented pharmacologic). Abbreviations: BMI, body mass index; DM, diabetes mellitus; HR, heart rate; MPHR, maximal predicted heart rate.

Table 4.

Multivariable associations with MACE using logistic regression

Parameter          Logistic                           Logistic+
Stress TPD         —                                  0.00962 (0.00878 to 0.0105)
Age                0.00289 (0.00237 to 0.00340)       0.003 (0.0025 to 0.00351)
Stress peak HR     −0.00141 (−0.00167 to −0.00115)    −0.000932 (−0.00119 to −0.000674)
Previous PCI       0.112 (0.0982 to 0.125)            0.0720 (0.0586 to 0.0853)
Resting HR         −0.02180 (−0.0265 to −0.0171)      −0.0226 (−0.0272 to −0.018)
BMI                −0.00373 (−0.00459 to −0.00287)    −0.00385 (−0.00470 to −0.00301)
DM                 0.0624 (0.0504 to 0.0744)          0.0556 (0.0439 to 0.0674)
ECG response       0.0193 (0.0151 to 0.0234)          0.0108 (0.00671 to 0.015)
Sex                0.0982 (0.0875 to 0.109)           0.0601 (0.0486 to 0.0715)
ST deviation       0.0105 (0.00693 to 0.0231)         0.00878 (0.000832 to 0.0167)
Stress LVEF        —                                  −0.00130 (−0.00183 to −0.000764)
Stress test type   0.00239 (0.00198 to 0.00280)       0.00197 (0.00157 to 0.00238)
Intercept          −0.205 (−0.273 to −0.136)          −0.0701 (−0.147 to 0.00642)

Associations with major adverse cardiovascular events (MACE) for the minimum set of manually input variables, stress test type, stress total perfusion deficit (TPD), and stress left ventricular ejection fraction (LVEF). Variables are ordered according to their information gain ranking. Abbreviations: BMI, body mass index; DM, diabetes mellitus; HR, heart rate; MPHR, maximal predicted heart rate.
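Coefficient tables of the kind shown in Table 4 come from fitting a multivariable model and reporting each parameter with an interval estimate. The sketch below (a generic NumPy illustration, not the registry analysis pipeline; `fit_logistic` is a hypothetical helper) fits a logistic model by Newton-Raphson and returns Wald-style 95% intervals from the inverse observed information:

```python
import numpy as np

def fit_logistic(X, y, n_iter=25):
    """Multivariable logistic regression via Newton-Raphson.
    X should include a leading column of 1s for the intercept.
    Returns (beta, ci) where ci holds 95% Wald intervals."""
    X = np.asarray(X, float)
    y = np.asarray(y, float)
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1 / (1 + np.exp(-X @ beta))       # predicted probabilities
        w = p * (1 - p)                       # IRLS weights
        H = X.T @ (X * w[:, None])            # observed information matrix
        beta = beta + np.linalg.solve(H, X.T @ (y - p))  # Newton step
    se = np.sqrt(np.diag(np.linalg.inv(H)))   # standard errors
    ci = np.column_stack([beta - 1.96 * se, beta + 1.96 * se])
    return beta, ci
```

The log-likelihood of the logistic model is concave, so Newton-Raphson from a zero start converges quickly on well-conditioned data; the same IRLS idea underlies most standard statistical packages.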

3.6 External validation

Figure 5 and Supplementary material online, Figure S8 provide the overall MACE prediction performance for visual diagnosis as well as all clinical and four selected ML models in the testing and training sets of the external validation procedure, respectively. Similar to the internal validation procedure, ML-All (AUC 0.739), ML-Reduced (AUC 0.725), and ML-Minimum (AUC 0.723) demonstrated higher AUCs than visual diagnosis (AUC 0.680, all P < 0.01). ML-Reduced had comparable or higher accuracy compared to traditional risk estimation models (Logistic+ AUC 0.718, P = 0.4; Cox+ C-statistic 0.703, P < 0.01). ML-Image obtained prediction performance comparable to multivariable models incorporating clinical and imaging information (e.g. Logistic+ AUC 0.718 vs. ML-Image AUC 0.713 vs. Cox+ C-statistic 0.703, P = 0.14).

Figure 5.


AUCs for MACE prediction in the external validation procedure. ML-All: model trained with all variables, see Supplementary material online, Tables S1 and S2; ML-Image: model trained with all image-derived variables, see Supplementary material online, Table S1; ML-Minimum: model trained with the minimum set of manually input and image-derived variables; ML-Reduced: model trained with all image-derived variables plus the minimum set of manually input variables. The minimum sets of manually input and image-derived variables are shown in Figure 3 and Supplementary material online, Figure S3, respectively. Cox and Logistic models were trained with clinical and stress variables only. Cox+ and Logistic+ were trained by adding stress TPD and LVEF to the Cox and Logistic models, respectively, see Tables 3 and 4. AUC was used to evaluate the prediction performance of Logistic, Logistic+, and ML models, and the C-statistic for Cox and Cox+. In the external cohort, physician interpretation was aided by knowledge of coronary calcium from attenuation corrected imaging, which was not used by ML. Abbreviations: MACE, major adverse cardiovascular events; AUC, area under the receiver-operating characteristic curve; CI, confidence interval; LVEF, left ventricular ejection fraction; ML, machine learning; TPD, total perfusion deficit.

3.7 Interpretation of individual ML prediction

Figure 6 shows patient-specific explanations of annualized MACE risk predictions for two patients using the ML-Reduced model. For the patient experiencing MACE (Figure 6A), previous PCI, stress HR, diabetes mellitus, pharmacological stress agent, % maximal predicted HR, and age were the features that led to increased risk. Meanwhile, for the patient without MACE (Figure 6B), stress peak HR, resting HR, age, stress dose, previous PCI, and ECG response were the features that led the ML model to decrease the MACE risk.

Figure 6.


Individual MACE risk prediction. Individual contribution of features for annualized major adverse cardiovascular event (MACE) risk predictions for two patients using the ML-Reduced model. ML-Reduced: model trained with all image-derived variables plus the minimum set of manually input variables. The individual contributions of the top 10 variables for MACE risk for each patient are provided (blue bars decrease risk while red bars increase risk). For the patient experiencing MACE (Panel A), previous percutaneous coronary intervention (PCI), stress heart rate (HR), diabetes mellitus (DM), pharmacological stress agent, % maximal predicted HR, and age were the features that led to increased risk. Meanwhile, for the patient without MACE (Panel B), stress peak HR, resting HR, age, stress dose, previous PCI, and electrocardiogram (ECG) response were the features that led the ML model to decrease the MACE risk. Grey dotted line indicates baseline cohort risk. Red dotted line indicates high risk cut point for annual MACE risk. Abbreviations: MACE, major adverse cardiovascular event; ML, machine learning; TPD, total perfusion deficit; AUC, area under the receiver-operating characteristic curve; BMI, body mass index; CI, confidence interval.
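Feature contribution plots of this kind decompose an individual prediction into per-variable terms relative to a cohort baseline. For tree ensembles this follows the feature contribution method of Palczewska et al.,19 but the idea is easiest to see for a linear risk score, where each bar is the coefficient times the patient's deviation from the cohort mean. The sketch below (an illustrative linear analogue with hypothetical inputs, not the authors' tree-based method) shows that decomposition:

```python
import numpy as np

def linear_contributions(beta, X, x_patient):
    """Per-feature contributions to one patient's linear risk score,
    relative to the cohort-average prediction:
        contribution_j = beta_j * (x_patient_j - mean_j).
    The contributions sum exactly to (patient score - baseline score),
    mirroring the red/blue bars of a feature contribution plot."""
    mean = X.mean(axis=0)                  # cohort mean of each feature
    contrib = beta * (x_patient - mean)    # signed per-feature bars
    baseline = float(mean @ beta)          # cohort-average score
    return contrib, baseline
```

Positive entries of `contrib` raise the patient's risk above the baseline (red bars) and negative entries lower it (blue bars), which is what makes such plots actionable at the individual level.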

3.8 Model calibration

Brier scores for ML-All, ML-Reduced, and ML-Image were 0.117, 0.118, and 0.123, respectively, showing good calibration between observed MACE and ML scores for all models. Figure 7 shows observed vs. predicted MACE for ML-All and ML-Reduced, indicating that the distribution of MACE risk across ML scores was similar for both models.

Figure 7.


Observed vs. predicted MACE. Observed vs. predicted major adverse cardiovascular events (MACE) grouped by machine learning (ML) risk score for all 20 414 patients. The ML-All model was trained using all the features, while the ML-Reduced model was trained using all image-derived variables plus the first 12 manually input variables.
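The Brier score and the observed-vs-predicted grouping of Figure 7 can be sketched as follows. This is a generic illustration using quantile-based risk bins (the authors' exact binning is not specified here; the function names are hypothetical):

```python
import numpy as np

def brier_score(y, p):
    """Mean squared difference between binary outcomes and predicted
    risks; lower is better, 0 is perfect."""
    y = np.asarray(y, float)
    p = np.asarray(p, float)
    return float(np.mean((y - p) ** 2))

def observed_vs_predicted(y, p, n_bins=10):
    """Group patients into quantile bins of predicted risk and compare
    the mean predicted risk with the observed event rate in each bin,
    as in an observed-vs-predicted calibration plot."""
    y = np.asarray(y, float)
    p = np.asarray(p, float)
    edges = np.quantile(p, np.linspace(0, 1, n_bins + 1))
    idx = np.clip(np.searchsorted(edges, p, side="right") - 1, 0, n_bins - 1)
    pred = np.array([p[idx == b].mean() for b in range(n_bins)])
    obs = np.array([y[idx == b].mean() for b in range(n_bins)])
    return pred, obs
```

For a well-calibrated model, the per-bin observed event rates track the mean predicted risks, and the Brier score stays close to the irreducible outcome variance.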

4. Discussion

We assessed the prognostic performance of ML models with a limited number of manually input and imaging variables (ML-Reduced and ML-Minimum). We showed that ML models including only variables automatically collected during image processing (ML-Image and ML-Image-Reduced) had significantly higher prognostic accuracy compared to standard interpretation methods. While these models were improved by adding all 40 available clinical and stress test variables, the ML-Reduced and ML-Minimum models with only 12 manually input variables achieved more than 99.5% of the prognostic accuracy. ML-Minimum achieved comparable prognostic accuracy to ML-Reduced with a reduced set of imaging and manually collected variables. External validation confirmed that reduced ML models had higher prognostic accuracy compared to expert visual interpretation, quantitative analysis, and traditional clinical models. Models with fewer collected features require less infrastructure to implement and would be easier for physicians to utilize in clinical practice.21 These more practical ML models should be considered in prospective studies assessing the clinical implementation of ML for MACE prediction.

There is growing evidence that ML-based techniques can improve disease diagnosis or risk prediction following MPI.1,2,4,5,21 Betancur et al.4 developed an ML model using clinical, imaging, and stress test variables using 2689 patients from a single centre, and improved MACE prediction compared to physician interpretation or quantitative analysis. Subsequent studies have shown that ML can improve prediction of early revascularization,5 or automatically select patients with a low risk of MACE for stress-only imaging.6 However, a major limitation to clinical implementation has been the feasibility of collecting the required variables. For example, the ML model employed by Betancur et al. used 45 variables potentially requiring manual collection by clinicians or technical staff, which is time-intensive and prone to error.8,9 One method to address this problem is to extract data from existing electronic medical records.3 However, this requires infrastructure that is not universally available, and the extracted data may still be incomplete or inaccurate.

We assessed the feasibility of an alternate solution, minimizing the amount of information requiring manual collection. This concept was previously assessed by Haro Alonso et al.21 in a cohort of 8321 patients undergoing SPECT MPI. They showed that a reduced ML model, with 6 imaging and clinical features, outperformed standard logistic regression with 14 variables. However, the reduced ML model had significantly lower prognostic accuracy compared to more comprehensive ML models.21 We expanded on this work by deriving reduced ML models with comparable risk prediction to the full ML model. We also demonstrated that removing some imaging variables from the ML-Reduced model (to generate ML-Minimum) further increased prognostic accuracy, highlighting the importance of variable optimization. Importantly, both ML-Reduced and ML-Minimum demonstrated higher prognostic accuracy compared to traditional logistic regression and Cox proportional hazards models built with clinical and imaging variables. Additionally, multivariable models with clinical and imaging information outperformed models with clinical information alone. A further benefit of the ML models is the ability to generate patient-specific explanations of risk predictions, as demonstrated in Figure 6, rather than population-level summaries of risk factors. Lastly, our results were similar when considering only death and non-fatal MI in the outcome.

We performed external validation in a population from a separate site, which was not involved in either variable selection or model training. As expected, there were significant differences in most patient characteristics between the external population and the study population. As a result, the accuracies of all models were lower in the external validation cohort. However, the prognostic accuracy of the reduced ML models was still higher than that of traditional risk estimation models. Additionally, they had higher prognostic accuracy compared to physician interpretation, which was aided by knowledge of coronary calcification from attenuation-corrected imaging. External validation in a large population with distinct characteristics is critical to demonstrate the generalizability of any ML model. Our results demonstrate that ML models with a reduced set of variables would be expected to outperform traditional risk estimation methods when applied to other populations. Additionally, ML models with a limited number of variables will be easier to implement clinically and can improve the flexibility of ML by allowing centres to identify their own ideal trade-off between prognostic accuracy and the requirement for variable collection. For instance, some centres may believe that the prognostic accuracy of ML-Image meets their clinical needs, and risk prediction could occur entirely automatically. Similarly, this method can be employed to provide automatic rest scan cancellation recommendations.6 The ease and flexibility afforded by ML models with fewer manually input variables may be critical to ensuring broader implementation.

We attempted to address another potential limitation to AI implementation by generating patient-specific explanations for MACE risk to reduce the 'black-box' perception of ML. Feature contribution plots describe how each variable contributes to the predicted risk.19 These plots allow physicians to better understand the relative importance of variables and potentially identify variables that are clinically actionable in the individual patient. For instance, noting that diabetes mellitus was a substantial contributor to the risk estimate may trigger the physician to re-evaluate the patient's diabetes management. These plots may also lead to improvements in the clinical performance of ML models by allowing physicians to identify potential errors in risk estimation. These mechanisms to explain predictions enable physicians to understand the model rationale, fostering confidence in the use of ML in clinical practice.

Our study has a few important limitations. The quantitative analysis of SPECT MPI was performed with one commercially available software package, and the ability to obtain values automatically may differ with other software. However, we identified the most important imaging variables, and prognostic accuracy should be similar if the variability in their measurements is small. For broader generalizability, we assumed only minimal demographic information would be available in the image header. However, other information, such as BMI, pharmacological stress, or resting HR, can potentially be extracted from the image header to improve the prognostic performance of ML-Image. Finally, although the models were robustly validated, prospective studies are needed to determine their potential impact on management.

5. Conclusions

ML models using only automatically extracted variables had improved prognostic accuracy compared to standard interpretation methods. While ML models with fewer collected variables can have slightly lower accuracy compared to a full ML model, they are easier to use clinically. More practical ML models like these, combined with methods to explain risk predictions, could help overcome key barriers to the broader clinical use of ML.

Supplementary material

Supplementary material is available at Cardiovascular Research online.

Authors’ contributions

R.R. pre-processed data, performed the experiments, conceived the experiments, analysed data, and wrote the manuscript; R.J.H.M. helped with the pre-processing of data, conceived the experiments, analysed data, provided material, and wrote the manuscript; L.H.H. analysed data, provided feedback on experiments, and reviewed the manuscript; A.S., P.K., and T.P. helped with the pre-processing of data; M.D. provided feedback on experiments; Y.O., A.J.E., E.J.M., S.D., S.V.K., J.X.L., D.D., and D.S.B. provided material, provided feedback on experiments, and reviewed the manuscript; T.S., M.B.F., T.D.R., P.A.K., A.J.S., T.M.B., and M.D.C. provided material; P.S. helped to perform experiments, conceived the experiments, analysed data, reviewed and edited the manuscript, and provided funding.

Conflict of interest: D.S.B., S.V.K., P.S., and P.K. participate in software royalties for QPS software at Cedars-Sinai Medical Center. P.S. has received research grant support from Siemens Medical Systems. D.S.B., S.D., A.J.E., and E.J.M. have served as consultants for GE Healthcare. S.D. has served as a consultant to Bracco Diagnostics; her institution has received grant support from Astellas. M.D.C. has received research grant support from Spectrum Dynamics and consulting honoraria from Sanofi and GE Healthcare. T.D.R. has received research grant support from GE Healthcare and Advanced Accelerator Applications. A.J.E. has served as a consultant to W. L. Gore & Associates, and his institution has received research support from Toshiba America Medical Systems, Roche Medical Systems, and W. L. Gore & Associates. E.J.M. has served as a consultant for Bracco Inc., and he and his institution have received grant support from Bracco Inc. D.S.B.'s institution has received grant support from HeartFlow. All other authors have reported that they have no relationships relevant to the contents of this paper to disclose.

Funding

This research was supported in part by grant R01HL089765 from the National Heart, Lung, and Blood Institute/National Institutes of Health (NHLBI/NIH) (PI: Piotr Slomka). The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health. The work was also supported in part by the Miriam and Sheldon Adelson Medical Research Foundation.

Data availability

The data underlying this article cannot be shared publicly due to institutional review board and multi-institutional data agreement constraints. To the extent allowed by the data-sharing agreements and IRB protocols, the data from this manuscript will be shared upon written request to the corresponding author.

Supplementary Material

cvab236_Supplementary_Data

Contributor Information

Richard Rios, Department of Imaging, Medicine, and Biomedical Sciences, Cedars-Sinai Medical Center, Los Angeles, CA 90048, USA.

Robert J H Miller, Department of Imaging, Medicine, and Biomedical Sciences, Cedars-Sinai Medical Center, Los Angeles, CA 90048, USA; Department of Cardiac Sciences, University of Calgary, Calgary, AB, Canada.

Lien Hsin Hu, Department of Nuclear Medicine, Taipei Veterans General Hospital, Taipei, Taiwan.

Yuka Otaki, Department of Imaging, Medicine, and Biomedical Sciences, Cedars-Sinai Medical Center, Los Angeles, CA 90048, USA.

Ananya Singh, Department of Imaging, Medicine, and Biomedical Sciences, Cedars-Sinai Medical Center, Los Angeles, CA 90048, USA.

Marcio Diniz, Department of Imaging, Medicine, and Biomedical Sciences, Cedars-Sinai Medical Center, Los Angeles, CA 90048, USA.

Tali Sharir, Department of Nuclear Cardiology, Assuta Medical Center, Tel Aviv, Israel; Faculty of Health Sciences, Ben Gurion University of the Negev, Beer Sheba, Israel.

Andrew J Einstein, Division of Cardiology, Department of Medicine, Columbia University Irving Medical Center and New York-Presbyterian Hospital, New York, NY, USA; Department of Radiology, Columbia University Irving Medical Center and New York-Presbyterian Hospital, New York, NY, USA.

Mathews B Fish, Department of Nuclear Medicine, Oregon Heart and Vascular Institute, Sacred Heart Medical Center, Springfield, OR, USA.

Terrence D Ruddy, Division of Cardiology, Department of Medicine, University of Ottawa Heart Institute, Ottawa ON, Canada.

Philipp A Kaufmann, Department of Nuclear Medicine, Cardiac Imaging, University Hospital Zurich, Zurich, Switzerland.

Albert J Sinusas, Department of Internal Medicine, Section of Cardiovascular Medicine, Yale University School of Medicine, New Haven, CT, USA.

Edward J Miller, Department of Internal Medicine, Section of Cardiovascular Medicine, Yale University School of Medicine, New Haven, CT, USA.

Timothy M Bateman, Cardiovascular Imaging Technologies LLC, Kansas City, MO, USA.

Sharmila Dorbala, Division of Nuclear Medicine and Molecular Imaging, Department of Radiology, Brigham and Women's Hospital, Boston, MA, USA.

Marcelo DiCarli, Division of Nuclear Medicine and Molecular Imaging, Department of Radiology, Brigham and Women's Hospital, Boston, MA, USA.

Serge Van Kriekinge, Department of Imaging, Medicine, and Biomedical Sciences, Cedars-Sinai Medical Center, Los Angeles, CA 90048, USA.

Paul Kavanagh, Department of Imaging, Medicine, and Biomedical Sciences, Cedars-Sinai Medical Center, Los Angeles, CA 90048, USA.

Tejas Parekh, Department of Imaging, Medicine, and Biomedical Sciences, Cedars-Sinai Medical Center, Los Angeles, CA 90048, USA.

Joanna X Liang, Department of Imaging, Medicine, and Biomedical Sciences, Cedars-Sinai Medical Center, Los Angeles, CA 90048, USA.

Damini Dey, Department of Imaging, Medicine, and Biomedical Sciences, Cedars-Sinai Medical Center, Los Angeles, CA 90048, USA.

Daniel S Berman, Department of Imaging, Medicine, and Biomedical Sciences, Cedars-Sinai Medical Center, Los Angeles, CA 90048, USA.

Piotr Slomka, Department of Imaging, Medicine, and Biomedical Sciences, Cedars-Sinai Medical Center, Los Angeles, CA 90048, USA.

References

  • 1. Fihn SD, Gardin JM, Abrams J, Berra K, Blankenship JC, Dallas AP, Douglas PS, Foody JM, Gerber TC, Hinderliter AL, King SB, Kligfield PD, Krumholz HM, Kwong RYK, Lim MJ, Linderbaum JA, Mack MJ, Munger MA, Prager RL, Sabik JF, Shaw LJ, Sikkema JD, Smith CR, Smith SC, Spertus JA, Williams SV; American College of Cardiology Foundation, American Heart Association Task Force on Practice Guidelines, American College of Physicians, American Association for Thoracic Surgery, Preventive Cardiovascular Nurses Association, Society for Cardiovascular Angiography and Interventions, Society of Thoracic Surgeons . 2012 ACCF/AHA/ACP/AATS/PCNA/SCAI/STS Guideline for the diagnosis and management of patients with stable ischemic heart disease: a report of the American College of Cardiology Foundation/American Heart Association Task Force on Practice Guidelines, and the American College of Physicians, American Association for Thoracic Surgery, Preventive Cardiovascular Nurses Association, Society for Cardiovascular Angiography and Interventions, and Society of Thoracic Surgeons. J Am Coll Cardiol 2012;60:e44–e164. [DOI] [PubMed] [Google Scholar]
  • 2. Knuuti J, Wijns W, Saraste A, Capodanno D, Barbato E, Funck-Brentano C, Prescott E, Storey RF, Deaton C, Cuisset T, Agewall S, Dickstein K, Edvardsen T, Escaned J, Gersh BJ, Svitil P, Gilard M, Hasdai D, Hatala R, Mahfoud F, Masip J, Muneretto C, Valgimigli M, Achenbach S, Bax JJ, Neumann F-J, Sechtem U, Banning AP, Bonaros N, Bueno H, Bugiardini R, Chieffo A, Crea F, Czerny M, Delgado V, Dendale P, Flachskampf FA, Gohlke H, Grove EL, James S, Katritsis D, Landmesser U, Lettino M, Matter CM, Nathoe H, Niessner A, Patrono C, Petronio AS, Pettersen SE, Piccolo R, Piepoli MF, Popescu BA, Räber L, Richter DJ, Roffi M, Roithinger FX, Shlyakhto E, Sibbing D, Silber S, Simpson IA, Sousa-Uva M, Vardas P, Witkowski A, Zamorano JL, Achenbach S, Agewall S, Barbato E, Bax JJ, Capodanno D, Cuisset T, Deaton C, Dickstein K, Edvardsen T, Escaned J, Funck-Brentano C, Gersh BJ, Gilard M, Hasdai D, Hatala R, Mahfoud F, Masip J, Muneretto C, Prescott E, Saraste A, Storey RF, Svitil P, Valgimigli M, Windecker S, Aboyans V, Baigent C, Collet J-P, Dean V, Delgado V, Fitzsimons D, Gale CP, Grobbee D, Halvorsen S, Hindricks G, Iung B, Jüni P, Katus HA, Landmesser U, Leclercq C, Lettino M, Lewis BS, Merkely B, Mueller C, Petersen S, Petronio AS, Richter DJ, Roffi M, Shlyakhto E, Simpson IA, Sousa-Uva M, Touyz RM, Benkhedda S, Metzler B, Sujayeva V, Cosyns B, Kusljugic Z, Velchev V, Panayi G, Kala P, Haahr-Pedersen SA, Kabil H, Ainla T, Kaukonen T, Cayla G, Pagava Z, Woehrle J, Kanakakis J, Tóth K, Gudnason T, Peace A, Aronson D, Riccio C, Elezi S, Mirrakhimov E, Hansone S, Sarkis A, Babarskiene R, Beissel J, Maempel AJC, Revenco V, de Grooth GJ, Pejkov H, Juliebø V, Lipiec P, Santos J, Chioncel O, Duplyakov D, Bertelli L, Dikic AD, Studenčan M, Bunc M, Alfonso F, Bäck M, Zellweger M, Addad F, Yildirir A, Sirenko Y, Clapp B, ESC Scientific Document Group . 
2019 ESC Guidelines for the diagnosis and management of chronic coronary syndromes: The Task Force for the diagnosis and management of chronic coronary syndromes of the European Society of Cardiology (ESC). Eur Heart J 2020;41:407–477. [DOI] [PubMed] [Google Scholar]
  • 3. Motwani M, Dey D, Berman DS, Germano G, Achenbach S, Al-Mallah MH, Andreini D, Budoff MJ, Cademartiri F, Callister TQ, Chang H-J, Chinnaiyan K, Chow BJW, Cury RC, Delago A, Gomez M, Gransar H, Hadamitzky M, Hausleiter J, Hindoyan N, Feuchtner G, Kaufmann PA, Kim Y-J, Leipsic J, Lin FY, Maffei E, Marques H, Pontone G, Raff G, Rubinshtein R, Shaw LJ, Stehli J, Villines TC, Dunning A, Min JK, Slomka PJ.. Machine learning for prediction of all-cause mortality in patients with suspected coronary artery disease: a 5-year multicentre prospective registry analysis. Eur Heart J 2017;38:500–507. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 4. Betancur J, Otaki Y, Motwani M, Fish MB, Lemley M, Dey D, Gransar H, Tamarappoo B, Germano G, Sharir T, Berman DS, Slomka PJ.. Prognostic value of combined clinical and myocardial perfusion imaging data using machine learning. JACC Cardiovasc Imaging 2018;11:1000–1009. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 5. Hu L-H, Betancur J, Sharir T, Einstein AJ, Bokhari S, Fish MB, Ruddy TD, Kaufmann PA, Sinusas AJ, Miller EJ, Bateman TM, Dorbala S, Di Carli M, Germano G, Commandeur F, Liang JX, Otaki Y, Tamarappoo BK, Dey D, Berman DS, Slomka PJ.. Machine learning predicts per-vessel early coronary revascularization after fast myocardial perfusion SPECT: results from multicentre REFINE SPECT registry. Eur Heart J Cardiovasc Imaging 2020;21:549–559. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 6. Hu L-H, Miller RJH, Sharir T, Commandeur F, Rios R, Einstein AJ, Fish MB, Ruddy TD, Kaufmann PA, Sinusas AJ, Miller EJ, Bateman TM, Dorbala S, Di Carli M, Liang JX, Eisenberg E, Dey D, Berman DS, Slomka PJ.. Prognostically safe stress-only single-photon emission computed tomography myocardial perfusion imaging guided by machine learning: report from REFINE SPECT. Eur Heart J - Cardiovasc Imaging 2021;22:705–714. [DOI] [PubMed] [Google Scholar]
  • 7. Dey D, Slomka PJ, Leeson P, Comaniciu D, Shrestha S, Sengupta PP, Marwick TH.. Artificial intelligence in cardiovascular imaging: JACC state-of-the-art review. J Am Coll Cardiol 2019;73:1317–1335. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 8. Hong MKH, Yao HHI, Pedersen JS, Peters JS, Costello AJ, Murphy DG, Hovens CM, Corcoran NM.. Error rates in a clinical data repository: lessons from the transition to electronic data transfer—a descriptive study. BMJ Open 2013;3:e002406. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 9. Nahm ML, Pieper CF, Cunningham MM.. Quantifying data quality for clinical trials using electronic data capture. PLoS One 2008;3:e3049. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 10. Slomka PJ, Betancur J, Liang JX, Otaki Y, Hu L-H, Sharir T, Dorbala S, Di Carli M, Fish MB, Ruddy TD, Bateman TM, Einstein AJ, Kaufmann PA, Miller EJ, Sinusas AJ, Azadani PN, Gransar H, Tamarappoo BK, Dey D, Berman DS, Germano G.. Rationale and design of the REgistry of Fast Myocardial Perfusion Imaging with NExt generation SPECT (REFINE SPECT). J Nucl Cardiol 2020;27:1010–1021. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 11. Thygesen K, Alpert JS, White HD, Jaffe AS, Apple FS, Galvani M, Katus HA, Newby LK, Ravkilde J, Chaitman B, Clemmensen PM, Dellborg M, Hod H, Porela P, Underwood R, Bax JJ, Beller GA, Bonow R, Van der Wall EE, Bassand J-P, Wijns W, Ferguson TB, Steg PG, Uretsky BF, Williams DO, Armstrong PW, Antman EM, Fox KA, Hamm CW, Ohman EM, Simoons ML, Poole-Wilson PA, Gurfinkel EP, Lopez-Sendon J-L, Pais P, Mendis S, Zhu J-R, Wallentin LC, Fernández-Avilés F, Fox KM, Parkhomenko AN, Priori SG, Tendera M, Voipio-Pulkki L-M, Vahanian A, Camm AJ, De Caterina R, Dean V, Dickstein K, Filippatos G, Funck-Brentano C, Hellemans I, Kristensen SD, McGregor K, Sechtem U, Silber S, Tendera M, Widimsky P, Zamorano JL, Morais J, Brener S, Harrington R, Morrow D, Lim M, Martinez-Rios MA, Steinhubl S, Levine GN, Gibler WB, Goff D, Tubaro M, Dudek D, Al-Attar N; Joint ESC/ACCF/AHA/WHF Task Force for the Redefinition of Myocardial Infarction . Universal definition of myocardial infarction. Circulation 2007;116:2634–2653. [DOI] [PubMed] [Google Scholar]
  • 12. Chen T, Guestrin C. XGBoost: a scalable tree boosting system. In: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining; 2016. p785–794.
  • 13. Chollet F. Deep Learning with Python. 1st ed. Shelter Island, New York, US: Manning Publications; 2017. [Google Scholar]
  • 14. DeLong ER, DeLong DM, Clarke-Pearson DL.. Comparing the areas under two or more correlated receiver operating characteristic curves: a nonparametric approach. Biometrics 1988;44:837–845. doi: 10.2307/2531595 [DOI] [PubMed] [Google Scholar]
  • 15. Hanley JA, McNeil BJ.. The meaning and use of the area under a receiver operating characteristic (ROC) curve. Radiology 1982;143:29–36. doi: 10.1148/radiology.143.1.7063747 [DOI] [PubMed] [Google Scholar]
  • 16. Kim WJ, Sung JM, Sung D, Chae M-H, An SK, Namkoong K, Lee E, Chang H-J.. Cox proportional hazard regression versus a deep learning algorithm in the prediction of dementia: an analysis based on periodic health examination. JMIR Med Inform 2019;7:e13139. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 17. Heagerty PJ, Zheng Y.. Survival model predictive accuracy and ROC curves. Biometrics 2005;61:92–105. [DOI] [PubMed] [Google Scholar]
  • 18. Harrell FE, Lee KL, Mark DB.. Multivariable prognostic models: issues in developing models, evaluating assumptions and adequacy, and measuring and reducing errors. Stat Med 1996;15:361–387. [DOI] [PubMed] [Google Scholar]
  • 19. Palczewska A, Palczewski J, Marchese Robinson R, Neagu D, Interpreting random forest classification models using a feature contribution method. In Bouabana-Tebibel T, Rubin SH (eds) Integration of Reusable Systems. Advances in Intelligent Systems and Computing. Springer International Publishing; 2014. p193–218. [Google Scholar]
  • 20. Hastie T, Tibshirani R, Friedman J. The Elements of Statistical Learning: Data Mining, Inference, and Prediction. 2nd ed. Basel, Switzerland: Springer Series in Statistics; 2017. [Google Scholar]
  • 21. Haro Alonso D, Wernick MN, Yang Y, Germano G, Berman DS, Slomka P.. Prediction of cardiac death after adenosine myocardial perfusion SPECT based on machine learning. J Nucl Cardiol 2019;26:1746–1754. [DOI] [PMC free article] [PubMed] [Google Scholar]



Articles from Cardiovascular Research are provided here courtesy of Oxford University Press
