Prediction of normal echocardiograms from 12-lead electrocardiograms using deep learning to improve outpatient screening for structural heart disease

B Arends; D Ahmetagic; W A C Van Amsterdam; D Van Osch; T P Mast; M B Vessies; W A L Tonino; M Van 'T Veer; A J Teske; P Van Der Harst; R R Van De Leur; R Van Es

doi:10.1093/ehjdh/ztaf143.034

Abstract

Introduction

Echocardiography is indispensable for diagnosing structural heart disease (SHD). However, over 40% of initial outpatient imaging studies show no clinically significant findings, contributing to unnecessary utilization and capacity limitations in cardiology clinics.

Purpose

To develop and validate an artificial intelligence-based electrocardiogram interpretation (AI-ECG) model that can accurately identify patients unlikely to have clinically significant SHD, enabling more efficient use of echocardiography.

Methods

We retrospectively paired ECGs with transthoracic echocardiograms (performed within 90 days) from two Dutch hospitals (2009–2023). We built an ensemble model in two stages. First, a convolutional neural network analyzed the median beat ECG and produced nine probability scores, one for each predefined abnormality: moderate-or-greater valvular disease, left or right ventricular systolic dysfunction or dilation, and grade ≥II diastolic dysfunction. These probabilities, along with patient age and sex, were then input into an XGBoost classifier, which output a single probability that the echocardiogram would be abnormal (any of the nine abnormalities present). The operating point was fixed at 95% sensitivity in the development set to maximize negative predictive value, and model performance was evaluated in an outpatient test cohort.
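As an illustration of how an operating point can be fixed at a target sensitivity, the following is a minimal sketch and not the authors' implementation. The function names `threshold_for_sensitivity` and `sensitivity` are hypothetical, and the sketch assumes that a score at or above the threshold is classified as abnormal.

```python
import math


def threshold_for_sensitivity(scores, labels, target=0.95):
    """Return the highest threshold t such that flagging score >= t as
    abnormal reaches at least the target sensitivity on this data."""
    # Scores of the truly abnormal cases, highest first.
    positives = sorted((s for s, y in zip(scores, labels) if y == 1), reverse=True)
    if not positives:
        raise ValueError("need at least one abnormal case")
    # Number of abnormal cases that must be flagged to reach the target
    # (small epsilon guards against floating-point overshoot in the product).
    needed = math.ceil(target * len(positives) - 1e-9)
    return positives[needed - 1]


def sensitivity(scores, labels, threshold):
    """Fraction of abnormal cases with score >= threshold."""
    flagged = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= threshold)
    total = sum(1 for y in labels if y == 1)
    return flagged / total
```

In this sketch the threshold would be chosen once on development data and then applied unchanged to the test cohort, so the sensitivity observed there is an estimate rather than a guarantee.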

Results

The development sets comprised 80,635 ECGs from 52,006 patients (31.5% SHD), and the outpatient test cohort comprised 4,024 ECGs from 4,024 patients (20.3% SHD). Figure 1 shows the ROC curve (AUROC 0.84, 95% CI 0.83–0.86). At the prespecified threshold, sensitivity was 0.95 (95% CI: 0.93–0.96), specificity 0.35 (95% CI: 0.34–0.37), PPV 0.27 (95% CI: 0.25–0.29) and NPV 0.96 (95% CI: 0.95–0.98). The model would have deferred 1,130 (28.1%) echocardiograms while missing 41 significant SHD cases (1.0% of the cohort), none classified as severe.
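The deferral figures can be checked directly from the reported counts. The short calculation below is an illustrative sketch using only the three counts stated above; the variable names are hypothetical.

```python
# Counts reported for the outpatient test cohort.
n_total = 4024      # patients (one ECG each)
n_deferred = 1130   # echocardiograms the model would have deferred
n_missed = 41       # deferred patients who had significant SHD

# Deferred patients without significant SHD (true negatives).
true_negatives = n_deferred - n_missed

npv = true_negatives / n_deferred        # negative predictive value
deferral_rate = n_deferred / n_total     # share of echocardiograms deferred
miss_rate = n_missed / n_total           # missed cases as a share of the cohort

print(f"NPV: {npv:.2f}")
print(f"Deferred: {deferral_rate:.1%}")
print(f"Missed (of cohort): {miss_rate:.1%}")
```

The 1.0% figure is therefore a share of all test patients, not of patients with SHD.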

Conclusions

An AI-ECG screening model can effectively rule out clinically significant SHD with high sensitivity and NPV. In our retrospective analysis, it could potentially reduce the volume of outpatient echocardiograms by nearly 30%, improving efficiency. Prospective trials are warranted to validate these findings and to assess the safety, clinical integration workflow, and cost-effectiveness of implementing the model in practice.

Figure 1. Discriminative performance of AI-ECG
