2024 Jun 12;15:5034. doi: 10.1038/s41467-024-49390-y

Fig. 2. Diagnostic performance overview.

ROC and PR curve. Predictive performance of our deep learning-based approach (CARPE_ECG), a random forest based on clinical data (CARPE_Clin.), the cardiologist, and ST depression, shown as mean performance ± standard deviation (envelopes) over n = 25 bootstrap draws. The upper plots show that both machine learning approaches outperform the cardiologist in terms of area under the receiver operating characteristic curve and area under the precision-recall curve. In regions of high specificity (inline plot), the neural network is on par with the cardiologist, while CARPE_Clin. performs worse. Both machine learning methods outperform the cardiologist’s judgement in regions of high sensitivity (inline plot).

Decision curve. First row: Net benefit (ref. 43) plot for CARPE_ECG (green), CARPE_Clin. (orange), the cardiologist (purple), a myocardial perfusion scan (MPS) for no patient (black), and MPS for all patients (dashed grey). CARPE_Coll. is not shown as it is visually indistinguishable from CARPE_ECG. Net benefit puts both benefit and harm on the same scale. In our case, we consider harm to be inflicted by performing an unnecessary MPS. At a decision threshold of 5%, all approaches lead to a similar net benefit. At the second threshold of 15%, CARPE_Clin. and the cardiologist demonstrate a net benefit similar to performing MPS on all patients, while CARPE_ECG leads to a higher net benefit. Second row: Potential MPSs avoided compared to the cardiologist’s strategy. While the conventional ML model and deep learning avoid approximately the same number of MPSs at the decision threshold of 5% (11.5% and 12.8%, respectively), the gap increases at the pre-MPS threshold of 15% (15.3% and 5.3%, respectively). Envelopes in both rows show 95% confidence intervals around the mean over n = 25 bootstrap draws. Source data are provided as a Source Data file.
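To illustrate how net benefit puts benefit and harm on the same scale, the following is a minimal sketch of the standard decision-curve definition: true positives per patient minus false positives per patient, with the false positives weighted by the odds of the decision threshold. It is an assumption that the figure uses exactly this form. The function names and the toy data are hypothetical and are not taken from the study.

```python
def net_benefit(y_true, y_score, threshold):
    """Net benefit of scanning every patient whose predicted risk is >= threshold.

    Assumed standard decision-curve form (not confirmed as the authors' code):
        NB = TP / n - (FP / n) * threshold / (1 - threshold)
    """
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(y_true)
    # True positives: diseased patients who are sent to the scan.
    tp = sum(1 for y, s in zip(y_true, y_score) if s >= threshold and y == 1)
    # False positives: healthy patients who receive an unnecessary scan.
    fp = sum(1 for y, s in zip(y_true, y_score) if s >= threshold and y == 0)
    odds = threshold / (1.0 - threshold)
    return tp / n - (fp / n) * odds


def net_benefit_scan_all(y_true, threshold):
    """Net benefit of the reference strategy 'scan every patient'."""
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)


# Hypothetical toy cohort: 10 patients, 2 with disease (label 1).
labels = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
risks = [0.90, 0.20, 0.30, 0.10, 0.05, 0.02, 0.04, 0.12, 0.01, 0.03]

for pt in (0.05, 0.15):
    model_nb = net_benefit(labels, risks, pt)
    all_nb = net_benefit_scan_all(labels, pt)
    print(f"threshold={pt:.2f}  model={model_nb:.4f}  scan-all={all_nb:.4f}")
```

The "scan no patient" strategy has a net benefit of zero by construction, which is why it appears as a flat reference line in a decision-curve plot. A model is useful at a given threshold only if its net benefit exceeds both reference strategies.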