Development and multicentre validation of an artificial intelligence electrocardiogram model for ventricular remodeling in repaired tetralogy of Fallot

Son Q Duong; Akhil Vaid; Pengfei Jiang; Yuval Bitterman; Yamini Krishnamurthy; I Min Chiu; Joshua Finer; Brian Cleary; Benjamin S Glicksberg; Ruchira Garg; Michael DiLorenzo; Mark Friedberg; Evan Zahn; Matthew Lewis; Michael Satzer; David Ouyang; Pierre Elias; Tal Geva; Sunil Ghelani; Brett R Anderson; Ali Zaidi; Rachel M Wald; Girish N Nadkarni; Joshua Mayourian

doi:10.1093/ehjdh/ztag015

2026 Feb 2;7(2):ztag015. doi: 10.1093/ehjdh/ztag015

Son Q Duong ¹,²,³,✉, Akhil Vaid ⁴,⁵, Pengfei Jiang ⁶,⁷, Yuval Bitterman ⁸, Yamini Krishnamurthy ⁹, I Min Chiu ¹⁰, Joshua Finer ¹¹, Brian Cleary ¹², Benjamin S Glicksberg ¹³,¹⁴,¹⁵, Ruchira Garg ¹⁶, Michael DiLorenzo ¹⁷, Mark Friedberg ¹⁸, Evan Zahn ¹⁹, Matthew Lewis ²⁰, Michael Satzer ²¹, David Ouyang ²², Pierre Elias ²³, Tal Geva ²⁴, Sunil Ghelani ²⁵, Brett R Anderson ²⁶,²⁷,²⁸, Ali Zaidi ²⁹,³⁰, Rachel M Wald ³¹,³², Girish N Nadkarni ³³,³⁴,#, Joshua Mayourian ³⁵,#

¹ Department of Pediatrics (Cardiology), Icahn School of Medicine at Mount Sinai, 1468 Madison Ave, Annenberg 3rd Fl, New York, NY 10029, USA

² Windreich Department of Artificial Intelligence and Human Health, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA

³ Center for Artificial Intelligence in Children’s Health, Mindich Child Health and Development Institute, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA

⁴ Windreich Department of Artificial Intelligence and Human Health, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA

⁵ The Hasso Plattner Institute for Digital Health at Mount Sinai, New York, NY 10029, USA

⁶ Center for Child Health Services Research, Mindich Child Health and Development Institute, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA

⁷ Department of Population Health Sciences and Policy, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA

⁸ Department of Paediatrics, Labatt Family Heart Centre, Hospital for Sick Children (SickKids), Toronto M5G 1X8, Canada

⁹ Department of Medicine (Cardiology), Columbia University Medical Center, New York, NY 10032, USA

¹⁰ Department of Medicine (Cardiology), Cedars-Sinai Medical Center, Los Angeles, CA 90048, USA

¹¹ Department of Medicine (Cardiology), Columbia University Medical Center, New York, NY 10032, USA

¹² Department of Pediatrics (Cardiology), Northwestern Feinberg School of Medicine, Chicago, IL 60611, USA

¹³ Windreich Department of Artificial Intelligence and Human Health, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA

¹⁴ Center for Artificial Intelligence in Children’s Health, Mindich Child Health and Development Institute, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA

¹⁵ The Hasso Plattner Institute for Digital Health at Mount Sinai, New York, NY 10029, USA

¹⁶ Department of Pediatrics (Cardiology), Cedars-Sinai Medical Center, Los Angeles, CA 90048, USA

¹⁷ Department of Pediatrics (Cardiology), Columbia University College of Physicians and Surgeons, New York, NY 10032, USA

¹⁸ Department of Paediatrics, Labatt Family Heart Centre, Hospital for Sick Children (SickKids), Toronto M5G 1X8, Canada

¹⁹ Department of Pediatrics (Cardiology), Cedars-Sinai Medical Center, Los Angeles, CA 90048, USA

²⁰ Department of Medicine (Cardiology), Columbia University Medical Center, New York, NY 10032, USA

²¹ Department of Pediatrics (Cardiology), Northwestern Feinberg School of Medicine, Chicago, IL 60611, USA

²² Department of Medicine (Cardiology), Cedars-Sinai Medical Center, Los Angeles, CA 90048, USA

²³ Department of Medicine (Cardiology), Columbia University Medical Center, New York, NY 10032, USA

²⁴ Department of Cardiology, Boston Children’s Hospital, Boston, MA 02115, USA

²⁵ Department of Cardiology, Boston Children’s Hospital, Boston, MA 02115, USA

²⁶ Department of Pediatrics (Cardiology), Icahn School of Medicine at Mount Sinai, 1468 Madison Ave, Annenberg 3rd Fl, New York, NY 10029, USA

²⁷ Center for Child Health Services Research, Mindich Child Health and Development Institute, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA

²⁸ Department of Population Health Sciences and Policy, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA

²⁹ Department of Pediatrics (Cardiology), Icahn School of Medicine at Mount Sinai, 1468 Madison Ave, Annenberg 3rd Fl, New York, NY 10029, USA

³⁰ Mount Sinai Fuster Heart Hospital, Department of Medicine, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA

³¹ Department of Paediatrics, Labatt Family Heart Centre, Hospital for Sick Children (SickKids), Toronto M5G 1X8, Canada

³² Peter Munk Cardiac Centre, Toronto Adult Congenital Heart Disease Program, University of Toronto, Toronto M5G 2C4, Ontario, Canada

³³ Windreich Department of Artificial Intelligence and Human Health, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA

³⁴ The Hasso Plattner Institute for Digital Health at Mount Sinai, New York, NY 10029, USA

³⁵ Department of Cardiology, Boston Children’s Hospital, Boston, MA 02115, USA

✉ Corresponding author. Tel: 1-844-733-7692, Email: son.duong@mssm.edu

Girish N Nadkarni and Joshua Mayourian contributed equally to the study.

Conflict of interest: G.N.N.: Heart Test Laboratories (Equity, Royalty, Scientific Advisory Board); Renalytix (Co-founder, Equity, Royalty, Scientific Advisory Board); Pensieve Health (Co-founder, Equity, Royalty, Scientific Advisory Board); S.Q.D. had full access to all the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis.

Roles

Son Q Duong: Conceptualization, Data curation, Formal analysis, Investigation, Methodology, Supervision, Validation, Visualization, Writing - original draft, Writing - review & editing

Akhil Vaid: Formal analysis, Investigation, Methodology

Pengfei Jiang: Data curation, Formal analysis, Writing - review & editing

Yuval Bitterman: Data curation, Writing - review & editing

Yamini Krishnamurthy: Data curation, Writing - review & editing

I Min Chiu: Data curation, Formal analysis, Writing - review & editing

Joshua Finer: Data curation, Formal analysis, Writing - review & editing

Brian Cleary: Data curation, Writing - review & editing

Benjamin S Glicksberg: Methodology, Resources, Writing - review & editing

Ruchira Garg: Data curation, Writing - review & editing

Michael DiLorenzo: Data curation, Writing - review & editing

Mark Friedberg: Data curation, Writing - review & editing

Evan Zahn: Data curation, Writing - review & editing

Matthew Lewis: Data curation, Formal analysis, Writing - review & editing

Michael Satzer: Data curation, Writing - review & editing

David Ouyang: Data curation, Formal analysis, Writing - review & editing

Pierre Elias: Data curation, Formal analysis, Writing - review & editing

Tal Geva: Data curation, Writing - review & editing

Sunil Ghelani: Data curation, Writing - review & editing

Brett R Anderson: Data curation, Formal analysis, Writing - review & editing

Ali Zaidi: Data curation, Investigation, Writing - review & editing

Rachel M Wald: Conceptualization, Data curation, Formal analysis, Investigation, Supervision, Writing - review & editing

Girish N Nadkarni: Formal analysis, Funding acquisition, Resources, Supervision, Writing - review & editing

Joshua Mayourian: Conceptualization, Formal analysis, Investigation, Methodology, Writing - review & editing

PMCID: PMC12902437 PMID: 41695565

Abstract

Aims

Periodic cardiac MRI (CMR) is recommended to identify adverse ventricular remodelling in repaired tetralogy of Fallot (TOF), but access to CMR is uneven, and compliance is poor. We developed a 12-lead electrocardiogram (ECG) artificial intelligence (AI) biomarker to identify CMR-quantified adverse biventricular remodelling in repaired TOF.

Methods and results

Six (1 train/5 external test) North American retrospective cohorts with paired ECG and CMR were included. The main outcome was a composite of ≥2 TOF-specific CMR abnormalities: right ventricular (RV) end-diastolic volume ≥ 160 mL/m², RV end-systolic volume ≥ 80 mL/m², RV ejection fraction (EF) <47%, and left ventricular EF <55%. Model discrimination, calibration, and net benefit as a screening test to rule out ventricular remodelling were assessed. Nine hundred and eight patients (2552 ECGs) were included in training, and 782 patients (1795 ECGs) in external validation (outcome prevalence 57%). The area under the receiver-operating curve (AUROC) was 0.85 (95% confidence interval 0.83–0.87), and average precision was 0.88. At a screening risk-threshold of 0.25, there was 92% sensitivity, 41% specificity, 87% negative predictive value, and 55% positive predictive value for ventricular remodelling, which yielded a 13% net reduction in CMR use on net benefit analysis. There was no difference by sex or race/ethnicity, but there were differences by age and site, with two of five sites with lower AUROC than the others, and three of five sites met criteria for miscalibration, which improved after centre-specific calibration.

Conclusion

A biomarker based on artificial intelligence analysis of electrocardiograms (AI-ECG) effectively identifies ventricular remodelling in repaired TOF to inform the timing of advanced imaging. Extensive external validation revealed variation in discrimination and calibration, which are important considerations for clinical implementation and regulatory approval pathways of AI-ECG in congenital heart disease.

Introduction

Tetralogy of Fallot (TOF) is the most common cyanotic congenital heart disease, affecting 1 in 3000 live births.¹ Right ventricular outflow tract (RVOT) dysfunction after repair of TOF is nearly universal and characterized by chronic pulmonary regurgitation with residual obstruction. Although well-tolerated early in life, chronic RVOT dysfunction precipitates a pathophysiological cascade that may lead to morbidity and mortality after the second decade of life.²

Lifelong surveillance is recommended for early detection of electromechanical cardiomyopathy that might be ameliorated by pulmonary valve replacement (PVR) to increase survival.^3,4 Guidelines recommend serial monitoring and RVOT intervention with PVR at ventricular size and functional thresholds that represent a ‘tipping point’ before irreversible changes occur.^2,4,5

Because assessment of RV size and systolic function by echocardiography is suboptimal,^6,7 cardiac magnetic resonance imaging (CMR) is recommended for assessment of RV volumes, ejection fraction (EF), and mass. However, CMR is infeasible in some patients, requires specialized expertise and equipment, and typically exceeds 2 h of patient and clinician time to acquire, process, and report.⁸ These factors limit its use in under-resourced settings: half of adults with congenital heart disease live more than an hour from an appropriate care centre,⁹ and adherence to guideline-recommended diagnostic imaging is low.¹⁰ Furthermore, frequent CMR in TOF may not be necessary in select patients.^11,12 This population is therefore an ideal target for the development of broadly available precision diagnostics to increase care efficiency and expand access to quantitative right ventricular (RV) assessment.

Artificial intelligence analysis of electrocardiograms (AI-ECG) is a novel method for RV assessment. AI-ECG methods can predict CMR-quantified RV dilation and dysfunction in adult¹³ and congenital heart disease populations,¹⁴ and AI-ECG complements echocardiogram-based RV functional assessment.¹⁵ However, the performance of AI-ECG in identifying adverse ventricular remodelling at clinically important quantitative thresholds in repaired TOF has not been explored. Additionally, prior AI-ECG studies in paediatric and congenital heart disease are limited by a lack of multicentre external validation, and thus between-centre differences in performance and calibration have yet to be explored. Such validation is an important next step in the regulatory approval pathway for these novel diagnostics.

This study sought to develop and externally validate an AI-ECG model in six centres across North America to predict the risk of clinically important CMR-based biventricular size and systolic functional abnormalities in patients with repaired TOF.

Methods

Participating centres

Model training occurred at a large northeastern combined paediatric and adult congenital heart disease (ACHD) centre. External validation occurred at five hospital-based centres consisting of mixed ACHD and paediatric practices across the USA and Canada. An overview of the training and validation process is shown in Figure 1. Institutional Review Board approval with a waiver of consent was obtained at all participating institutions. This study adheres to EHRA-AI¹⁶ and TRIPOD-AI¹⁷ reporting guidelines (see Supplementary material online).

Figure 1. Study overview. A multicentre study to train and validate a deep learning electrocardiogram model to predict abnormalities in biventricular size and function on cardiac MRI for patients with tetralogy of Fallot.

Inclusion criteria

Investigators at each participating centre retrospectively identified patients of any age followed at their institution with TOF status post full intracardiac repair, body surface area (BSA) ≥ 1 m², and at least one 12-lead ECG performed within 90 days of a CMR without an intermediate intervention (cardiac catheterization or surgical). At Site B, part of the cohort submitted was existing longitudinal registry data consisting of patients with moderate-or-greater pulmonary regurgitation at enrolment.¹⁸

Data collection

Clinical and demographic data were collected from electronic health records, internal databases, and imaging reports per site-specific practices (see Supplementary material online, Methods). Electrocardiograms obtained during clinical care were retrospectively identified and extracted in XML format from the MUSE ECG management system (GE Healthcare, USA), either manually (Site E, which submitted 1 ECG per CMR) or through database query (all other sites, resulting in ≥ 1 ECG per CMR in the inclusion period). Electrocardiogram tracings were 10 s acquisitions sampled at either 250 or 500 Hz. Only the standard 12 leads were included. Cardiac MRI volumes and EF were collected from reports of CMR studies performed during the course of clinical care, except for registry participants at Site B, for whom CMR volumes were remeasured in a core lab as part of the existing registry protocol.¹⁸

Prediction target

Cardiac MRI in TOF quantifies adverse ventricular remodelling to guide the timing of PVR. These CMR-defined criteria were the prediction targets of interest:^2–4,19,20 BSA-indexed RV end-diastolic volume (RVEDVi) ≥ 160 mL/m², BSA-indexed RV end-systolic volume (RVESVi) ≥ 80 mL/m², RVEF < 47%, LVEF < 55%, and a composite of ≥2 of the above criteria. The five model outputs were prediction probabilities ranging from 0 to 1 for their respective CMR abnormality. As guidelines suggest referral for PVR with ≥2 CMR abnormalities in asymptomatic individuals, the ≥2 composite criteria were considered the main outcome and were subsequently evaluated for clinical utility as a screening tool to identify adverse ventricular remodelling.
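The labelling rule described above can be written out explicitly. The following is a minimal sketch, assuming each CMR study is summarized by four values; the class and function names are hypothetical and are not taken from the study code.

```python
from dataclasses import dataclass


@dataclass
class CmrMeasurement:
    """One CMR study (hypothetical container, not from the study code)."""
    rvedvi: float  # BSA-indexed RV end-diastolic volume, mL/m^2
    rvesvi: float  # BSA-indexed RV end-systolic volume, mL/m^2
    rvef: float    # RV ejection fraction, %
    lvef: float    # LV ejection fraction, %


def cmr_abnormalities(m: CmrMeasurement) -> dict:
    """Apply the four TOF-specific thresholds stated in the text."""
    return {
        "rvedvi_ge_160": m.rvedvi >= 160,
        "rvesvi_ge_80": m.rvesvi >= 80,
        "rvef_lt_47": m.rvef < 47,
        "lvef_lt_55": m.lvef < 55,
    }


def composite_positive(m: CmrMeasurement, min_criteria: int = 2) -> bool:
    """Main outcome: at least `min_criteria` of the four abnormalities."""
    return sum(cmr_abnormalities(m).values()) >= min_criteria


# Illustrative values only (not patient data).
print(composite_positive(CmrMeasurement(165, 70, 50, 58)))  # one criterion met
print(composite_positive(CmrMeasurement(150, 85, 45, 60)))  # two criteria met
```

Each of the four booleans corresponds to one single-abnormality target, and the composite corresponds to the fifth model output.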

Model selection, architecture, and training

To maximize sample size, the training set was partitioned into 90% training and 10% validation for monitoring loss and hyperparameter tuning, with no other data split for internal testing at the training site. Instead, model performance was evaluated exclusively at external validation sites. Increasing the training sample by combining data across centres for multisite training was not feasible due to data-sharing restrictions. Starting model weights were from a previously published congenital heart disease model trained on over 90 000 ECGs.¹⁴ The network architecture is identical to previous work,^14,21 in which 12 × 2048 ECG signals are used as inputs to a convolutional neural network that includes residual blocks adapted for unidimensional signals.²² Details of model training are found in Supplementary material online, Methods.

Multicentre external validation

The model and pipeline for inference were packaged into a Docker container (Docker Inc., USA) for reproducibility of inference across all external validation centres. Global model discrimination was measured with area under the receiver-operating curve (AUROC) and average precision [analogous to area under the precision–recall curve (AUPRC)] with bootstrapped 95% confidence intervals. Subgroups were compared with the Bonferroni-corrected DeLong test. The containerized model and pipeline used for multicentre external inference and evaluation are available at https://github.com/sonqduong/ECGsizefxn_rTOF. Other software used and the data availability statement can be found in Supplementary material online, Methods.
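As an illustration of the discrimination metric, the sketch below computes AUROC from ranks and a percentile bootstrap interval using only the Python standard library. Resampling whole patients rather than single ECGs is an assumption made here because several ECGs can belong to one patient; the text does not state the resampling unit, and the function names are hypothetical.

```python
import random


def auroc(labels, scores):
    """Rank-based (Mann-Whitney) AUROC; tied scores share an average rank."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1  # average 1-based rank
        i = j + 1
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUROC needs both positive and negative labels")
    rank_sum = sum(r for r, y in zip(ranks, labels) if y == 1)
    return (rank_sum - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


def bootstrap_auroc_ci(labels, scores, patient_ids, n_boot=1000, seed=0):
    """Percentile 95% CI, resampling patients (assumed unit) with replacement."""
    auroc(labels, scores)  # fail early if only one class is present
    rng = random.Random(seed)
    by_patient = {}
    for idx, pid in enumerate(patient_ids):
        by_patient.setdefault(pid, []).append(idx)
    patients = list(by_patient)
    stats = []
    while len(stats) < n_boot:
        drawn = rng.choices(patients, k=len(patients))
        idx = [i for pid in drawn for i in by_patient[pid]]
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < len(ys):  # skip resamples with a single class
            stats.append(auroc(ys, [scores[i] for i in idx]))
    stats.sort()
    return stats[int(0.025 * n_boot)], stats[int(0.975 * n_boot) - 1]


print(auroc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 0.75
```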

Model calibration

Model calibration was assessed visually with reliability diagrams and statistically with the Spiegelhalter Z test.²³ A Z test P < 0.05 in a model with AUROC >0.65 was considered evidence of miscalibration. In centres that met criteria for miscalibration, centre-specific Platt scaling²³ was performed, which does not affect model AUROC at individual centres but improves calibration. Platt scaling was performed using leave-one-group-out cross-validation (with grouping at the patient level) to prevent data leakage and reduce overoptimistic estimates of performance. The calibrated results were then aggregated across centres to allow for global examination of performance.
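For readers unfamiliar with these calibration measures, the sketch below shows the standard Spiegelhalter Z statistic with a two-sided normal P-value, together with an expected calibration error (ECE). The use of 10 equal-width bins for ECE is an assumption, as the binning is not specified in the text; Platt scaling itself is not shown, and the function names are hypothetical.

```python
import math


def spiegelhalter_z(labels, probs):
    """Spiegelhalter Z statistic and two-sided P-value for calibration."""
    num = sum((y - p) * (1 - 2 * p) for y, p in zip(labels, probs))
    var = sum((1 - 2 * p) ** 2 * p * (1 - p) for p in probs)
    if var == 0:
        raise ValueError("variance is zero; statistic is undefined")
    z = num / math.sqrt(var)
    p_value = math.erfc(abs(z) / math.sqrt(2))  # equals 2 * (1 - Phi(|z|))
    return z, p_value


def expected_calibration_error(labels, probs, n_bins=10):
    """ECE over equal-width probability bins (n_bins=10 is an assumption)."""
    bins = [[] for _ in range(n_bins)]
    for y, p in zip(labels, probs):
        bins[min(int(p * n_bins), n_bins - 1)].append((y, p))
    ece = 0.0
    for b in bins:
        if b:
            observed = sum(y for y, _ in b) / len(b)
            predicted = sum(p for _, p in b) / len(b)
            ece += len(b) / len(labels) * abs(observed - predicted)
    return ece


# Toy example: a model that predicts 0.2 for ten patients who all have disease.
z, p = spiegelhalter_z([1] * 10, [0.2] * 10)
print(p < 0.05, expected_calibration_error([1] * 10, [0.2] * 10))
```

In the toy example the predictions underestimate risk, so the test rejects calibration and the ECE is large.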

Clinical utility

The composite outcome model was selected to demonstrate the clinical utility of AI-ECG as a biomarker to screen for adverse ventricular remodelling suggestive of the need for PVR. The cohort was further limited to the closest available ECG to the time of CMR to mimic the expected clinical use case of the algorithm. Threshold-specific accuracy metrics [sensitivity, specificity, negative predictive value (NPV), and positive predictive value (PPV)] were reported at a 25% risk threshold, which represented a clinically reasonable pre-test odds of disease at which CMR could be deferred. Net benefit with decision curve analysis²⁴ was implemented as described in the Supplementary material online, Methods, to evaluate the net reduction in CMR with implementation of the AI-ECG model at this threshold and over a range of clinically reasonable thresholds.
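The threshold-specific metrics and the net reduction in CMR can be illustrated from a 2 × 2 confusion table. The sketch below uses the standard decision curve formulation, in which the net reduction in interventions equals the difference in net benefit between the model and the "CMR all" strategy divided by the threshold odds; the authors' exact implementation is in their Supplementary Methods and may differ. The function name is hypothetical, and the example counts are those of Table 2, assuming that table reflects the 0.25 screening threshold.

```python
def screening_metrics(tp, fn, fp, tn, threshold):
    """Accuracy metrics and decision-curve quantities from confusion counts."""
    n = tp + fn + fp + tn
    odds = threshold / (1 - threshold)  # 0.25 -> 1:3
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        # Net benefit of testing only model-positive patients.
        "net_benefit_model": tp / n - fp / n * odds,
        # Net benefit of sending every patient to CMR.
        "net_benefit_cmr_all": (tp + fn) / n - (fp + tn) / n * odds,
        # (model - all) / odds, which simplifies to TN/n - (FN/n) / odds.
        "net_reduction": tn / n - fn / n / odds,
    }


# Counts from Table 2 (assumed to be at the 0.25 threshold).
m = screening_metrics(tp=392, fn=34, fp=337, tn=231, threshold=0.25)
for name, value in m.items():
    print(f"{name}: {value:.3f}")
```

With these counts the sketch gives a sensitivity of about 0.92, specificity 0.41, PPV 0.54, NPV 0.87, and a net reduction of about 0.13, in line with the values reported in the Results.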

Model explainability

Median waveform analysis provides a visualization of representative high-risk and low-risk ECGs. Saliency mapping was performed using a Shapley Additive Explanations framework⁷ to visualize input ECG features that contribute most to model prediction. Model explainability was performed on ECGs from test Site A using saliency mapping on median waveforms from 10 independent patient ECGs from the highest and lowest AI-ECG risk scores, similar to prior work.^25,26

As a comparator model, the final reported QRS duration on a subset of ECGs from Site A and Site B was collected, and a simple univariable logistic regression model was fit using patient-stratified five-fold cross-validation to predict the primary outcome using QRS duration as the sole predictor. Area under the receiver-operating curve and decision curve analysis were reported for the AI-ECG model.
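One way to keep cross-validation free of patient-level leakage is to assign whole patients, rather than single ECGs, to folds. The sketch below shows such a split using only the standard library. It interprets "patient-stratified" as grouping by patient, which is an assumption, and the function name is hypothetical; the logistic regression fit itself is omitted.

```python
import random


def patient_grouped_folds(patient_ids, n_folds=5, seed=0):
    """Yield (train_idx, test_idx) so that no patient spans both sets."""
    patients = sorted(set(patient_ids))
    random.Random(seed).shuffle(patients)
    # Round-robin assignment of shuffled patients to folds.
    fold_of = {pid: i % n_folds for i, pid in enumerate(patients)}
    for k in range(n_folds):
        test = [i for i, pid in enumerate(patient_ids) if fold_of[pid] == k]
        train = [i for i, pid in enumerate(patient_ids) if fold_of[pid] != k]
        yield train, test


# Illustrative IDs: patients "a" and "c" each contribute two ECGs.
ecg_patient_ids = ["a", "a", "b", "c", "c", "d", "e", "f"]
for train_idx, test_idx in patient_grouped_folds(ecg_patient_ids):
    print(sorted({ecg_patient_ids[i] for i in test_idx}))
```

The same grouping idea applies to the leave-one-group-out scheme used for centre-specific calibration.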

Results

Cohort characteristics

The training cohort consisted of 908 patients with 1991 CMRs and 2552 ECGs. The aggregated external validation cohort consisted of 782 patients with 996 CMRs and 1795 ECGs (Table 1). The aggregated external validation cohort had a higher prevalence of disease outcomes compared with the training site (34 vs. 23% for RVEDVi ≥ 160, 48 vs. 29% for RVESVi ≥ 80, 61 vs. 36% for RVEF < 47%, 51 vs. 35% for LVEF < 55%, 57 vs. 38% for composite of ≥2 criteria; P < 0.001 for all comparisons). There was heterogeneity across external validation centres in race/ethnicity, TOF subtype, age at study, year of study, body size measures, and RVOT type at CMR (P < 0.001 for all measures, Table 1). There were significant group differences in RV volumes and biventricular EFs across validation centres (P < 0.001).

Table 1.

Cohort characteristics

		Train site	Aggregated external validation	Individual external validation centres
		Train site	Aggregated external validation	Site A	Site B	Site C	Site D	Site E	P-value^*
Patient characteristics	Independent patients, n	908	782	161	250	54	175	142
	Female sex, n (%)	405 (45%)	351 (46%)	71 (49%)	104 (42%)	28 (52%)	76 (45%)	70 (49%)	0.41
	Race/ethnicity
	Non-Hispanic Asian	29 (3%)	70 (9%)	11 (7%)	33 (13%)	8 (15%)	6 (3%)	12 (8%)	<0.001
	Non-Hispanic Black	33 (4%)	55 (7%)	19 (12%)	4 (2%)	4 (7%)	8 (5%)	20 (14%)
	Hispanic	51 (6%)	95 (12%)	27 (17%)	4 (2%)	16 (30%)	12 (7%)	36 (25%)
	Other/mixed	35 (4%)	45 (6%)	16 (10%)	12 (5%)	2 (4%)	12 (7%)	3 (2%)
	Non-Hispanic Pacific Islander	—	4 (1%)	3 (2%)	0 (0%)	0 (0%)	0 (0%)	1 (1%)
	Unknown	152 (17%)	174 (22%)	27 (17%)	67 (27%)	1 (2%)	78 (45%)	1 (1%)
	Non-Hispanic White	608 (67%)	339 (43%)	58 (36%)	130 (52%)	23 (43%)	59 (34%)	69 (49%)
	TOF subtype
	Pulmonary stenosis	580 (64%)	530 (68%)	139 (86%)	181 (72%)	18 (33%)	91 (52%)	102 (72%)	<0.001
	Pulmonary atresia	161 (18%)	68 (9%)	16 (10%)	9 (4%)	6 (11%)	19 (11%)	18 (13%)
	PA/MAPCAs	7 (1%)	18 (2%)	2 (1%)	0 (0%)	1 (2%)	7 (4%)	8 (6%)
	Absent pulmonary valve	6 (1%)	22 (3%)	1 (1%)	0 (0%)	0 (0%)	9 (5%)	11 (8%)
	TOF-AV canal defect	11 (1%)	6 (1%)	1 (1%)	0 (0%)	1 (2%)	2 (1%)	2 (1%)
	Unspecified	143 (16%)	138 (18%)	2 (1%)	60 (24%)	28 (52%)	47 (27%)	1 (1%)
CMR Characteristics	Independent CMR, n	1991	996	242	263	72	184	235
	Age, median [IQR]	23 [16–34]	23 [16–34]	25 [18–35]	26 [18–40]	28 [22–40]	27 [17–41]	17 [14–22]	<0.001
	Year, median [IQR]	2012 [2008–2016]	2016 [2014–2020]	2019 [2014–2022]	2015 [2014–2017]	2017 [2015–2018]	2021 [2018–2022]	2015 [2011–2019]	<0.001
	Weight, median [IQR]	65 [52–79]	63 [52–78]	67 [56–79]	65 [52–80]	67 [54–84]	61 [50–75]	58 [49–70]	<0.001
	BSA, median [IQR]	1.7 [1.5–1.9]	1.7 [1.5–1.9]	1.8 [1.6–1.9]	1.7 [1.5–2.0]	1.8 [1.5–2.0]	1.7 [1.5–1.9]	1.6 [1.4–1.8]	<0.001
	Obese, n (%)	374 (19%)	162 (18%)	46 (19%)	48 (18%)	15 (21%)	17 (18%)	36 (15%)	0.75
	RVOT anatomy at the time of CMR
	Surgical/transcatheter PVR, n (%)	—	198 (20%)	52 (21%)	5 (2%)	37 (51%)	59 (32%)	45 (19%)	<0.001
	RV-PA conduit, n (%)	—	117 (12%)	15 (6%)	11 (4%)	2 (3%)	18 (10%)	71 (30%)
	Transannular patch, n (%)	—	382 (38%)	76 (31%)	136 (52%)	18 (25%)	90 (49%)	62 (26%)
	Valve-sparing, n (%)	—	175 (18%)	29 (12%)	75 (29%)	10 (14%)	7 (4%)	54 (23%)
	Unknown, n (%)	—	124 (12%)	70 (29%)	36 (14%)	5 (7%)	10 (5%)	3 (1%)
	RVEDVi, median [IQR]	129 [107–153]	132 [110–156]	126 [105–151]	150 [129–179]	129 [109–154]	134 [112–159]	119 [103–138]	<0.001
	RVESVi, median [IQR]	63 [515–80]	69 [55–87]	63 [51–75]	85 [68–104]	61 [47–77]	72 [58–94]	63 [52–76]	<0.001
	RVEF, median [IQR]	50 [45–55]	47 [42–52]	50 [45–54]	45 [40–49]	51 [47–58]	45 [39–50]	48 [42–52]	<0.001
	LVEF, median [IQR]	58 [54–62]	56 [52–60]	58 [55–63]	55 [51–59]	61 [57–66]	55 [51–59]	55 [50–58]	<0.001
ECG prediction	Independent ECGs, n	2552	1795	473	540	90	457	235
	RVEDVi ≥ 160, n (%)	590 (23%)	608 (34%)	90 (19%)	314 (58%)	25 (28%)	159 (35%)	20 (9%)	<0.001
	RVESVi ≥ 80, n (%)	752 (29%)	854 (48%)	151 (32%)	393 (73%)	23 (26%)	240 (53%)	47 (20%)	<0.001
	RVEF < 47%, n (%)	911 (36%)	1086 (61%)	201 (43%)	410 (76%)	26 (29%)	345 (75%)	104 (44%)	<0.001
	LVEF < 55%, n (%)	892 (35%)	909 (51%)	182 (39%)	336 (62%)	18 (20%)	258 (56%)	115 (49%)	<0.001
	≥2 Criteria, n (%)	966 (38%)	1019 (57%)	185 (39%)	420 (78%)	28 (31%)	298 (65%)	88 (37%)	<0.001

^* P-value for differences between external validation sites.

External validation model performance

The centre-aggregated primary outcome model AUROC was 0.85 (95% CI 0.83–0.87), and the average precision was 0.88 (95% CI 0.86–0.90). Notably, there was variation observed in AUROC across centres, which ranged from 0.69 to 0.88 (AUROC Site A: 0.87, B: 0.88, C: 0.82; D: 0.74, E: 0.69; see Supplementary material online, Figure S1). Three of five centres met criteria for miscalibration (Sites B, D, E; see Supplementary material online, Results), with miscalibrated sites tending to underestimate the risk of ventricular remodelling (See Supplementary material online, Figure S1). After within-centre calibration, the model was adequately calibrated with expected calibration error of 6.1% (Spiegelhalter Z test P = 0.25, Figure 2). Clinical utility of model implementation as a screening tool to identify adverse ventricular remodelling was examined for the ECG prediction closest to the time of CMR (outcome prevalence of 43%) at a risk threshold of 0.25. Performance metrics were sensitivity 92% (95% CI 89–95%), specificity 41% (95% CI 37–45%), PPV 54% (95% CI 50–57%), and NPV 87% (95% CI 83–91%). In net benefit analysis, implementation of the model as a screening test for ventricular remodelling yielded a net 13% reduction in CMR without missing any additional true positives at a risk threshold of 0.25. Limiting the evaluation to only well-performing centres increased the net reduction in CMR to 21%. Decision curves show a net reduction of the model compared to ‘CMR all’ and ‘CMR none’ strategies in a clinically acceptable risk threshold range of 8% to 50% (see Figure 3). Net benefit was examined across a range of disease prevalence from 25 to 50% due to concern that the study inclusion criteria favoured more diseased groups (see Supplementary material online, Figure S2). The projected net reduction in CMR increased to 24.5% at a prevalence of 25%, implying AI-ECG screening could be more effective to reduce CMR usage in settings where ventricular remodelling is rarer.
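The prevalence projection described above can be approximated by holding sensitivity and specificity fixed and re-weighting true and false negatives by an assumed prevalence. This is a sketch under that assumption, not the authors' supplementary method; the function name is hypothetical, and the inputs are derived from the counts in Table 2, assuming they reflect the 0.25 threshold.

```python
def projected_net_reduction(sensitivity, specificity, prevalence, threshold):
    """Net reduction in CMR per patient at an assumed disease prevalence.

    Assumes sensitivity and specificity do not change with prevalence.
    """
    odds = threshold / (1 - threshold)
    true_negative_rate = (1 - prevalence) * specificity  # CMRs safely avoided
    false_negative_rate = prevalence * (1 - sensitivity)  # missed disease
    return true_negative_rate - false_negative_rate / odds


# Sensitivity and specificity from the Table 2 counts.
sens = 392 / (392 + 34)
spec = 231 / (231 + 337)

for prev in (0.25, 0.43, 0.50):
    print(prev, round(projected_net_reduction(sens, spec, prev, 0.25), 3))
```

At a prevalence of 0.25 this gives a net reduction of about 0.245, and at 0.43 about 0.13, close to the figures quoted in the text.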

Figure 2. Model discrimination and calibration. Receiver-operating characteristic curves (left panel) and precision–recall curves (middle panel) to identify cardiac MRI-quantified volumetric and functional abnormalities in patients with repaired tetralogy of Fallot. Composite outcome reliability diagram (right panel). Calibrated predictions were aggregated across all sites. ECE, expected calibration error. P-value is the Spiegelhalter Z test P-value, in which P < 0.05 suggests model miscalibration.

Figure 3. Decision curve analysis. Implementation of AI-ECG as a screening tool to detect adverse ventricular remodelling to aid in the timing of advanced imaging in tetralogy of Fallot. The net reduction in cardiac MRI (green line) is plotted against the current strategy of every patient undergoing cardiac MRI (red dashed line) and the strategy of no patient undergoing cardiac MRI (black dashed line). The well-performing subset of sites (A, B, C) is also separately plotted to examine the best-case net benefit (green dashed line).

Examining the utility of the model as a tool to rule-in disease (i.e. tuned for specificity at a risk threshold of 0.75), the model sensitivity was 43% (40–46%), specificity 94% (92–96%), PPV 84% (78–89%), and NPV 68% (65–71%). In total, 59% of the study cohort fell within either the low- or high-risk thresholds.

Model performance for the subcomponent volume and systolic function CMR metrics, compared with the primary composite model, is shown in Figure 2. Prediction of RVESV and RVEF derangements had similar performance to the composite model (AUROC 0.85 for both). Right ventricular end-diastolic volume prediction performance was moderately lower (AUROC 0.78). Discrimination between those above and below an LVEF of 55% was limited (AUROC 0.69).

Subgroup analysis

Subgroup analysis of the composite outcome aggregated across centres is shown in Figure 4. There was no difference in model discrimination by patient sex (P = 0.18), race/ethnicity (P = 1.0), or presence of PVR after intracardiac repair (P = 1.0). There was lower performance in children and young adults <22 years old vs. older adults (AUROC 0.73 vs. 0.85; P = 0.005). Due to this finding, threshold-specific metrics were re-examined in the young adult subgroup, and despite the drop in AUROC, at the screening threshold to rule out adverse ventricular remodelling, the sensitivity was 85% and NPV was 83%, and there was 9% net benefit. At the specific ‘rule-in’ threshold, specificity was 97% and PPV 83%.

Figure 4. Key subgroup performance. Key subgroup model performance in the aggregated external test cohort. The Bonferroni-corrected DeLong test P-value for the difference in area under the receiver-operating curve is displayed on the right.

Obesity was also associated with improved prediction (AUROC 0.90 vs. 0.84, P = 0.025). However, age and BMI were positively correlated (Spearman rho 0.49, P < 0.001), and in the cohort <22 years, the proportion of obese was significantly lower compared with the older group (10.2 vs. 19.3%, P < 0.001). Therefore, AUROC was examined in the older age cohort, and no prediction performance difference was identified (AUROC 0.90 vs. 0.89 for obese vs. non-obese).

Model explainability with saliency mapping

Saliency mapping example of the composite outcome model is shown in Figure 5. Mapping suggests that the QRS complex is the most important region for the prediction of disease state, particularly in V1 and V6. High-risk features include R’ notching in V1, V2, and a taller R wave in V6.

Figure 5. Median waveform analysis and saliency mapping. Median waveforms of the 10 highest (red line) and 10 lowest (green line) predicted risk for the primary outcome in independent patient electrocardiograms from Site A. Saliency mapping (blue shade) demarcates important prediction regions of the waveform.

Comparison to QRS duration model

A univariable logistic regression model to predict adverse ventricular remodelling using QRS duration was fit from 392 available ECGs and compared with the matching AI-ECG predictions. AI-ECG outperformed QRS duration (AUROC 0.87 vs. 0.78, see Supplementary material online, Figure S3). The QRS duration model was properly calibrated (Spiegelhalter Z test P = 0.99), but at the same clinical decision threshold of 0.25, there was no net reduction in CMR, and the model was slightly harmful (net reduction −0.02).

Error analysis

To better understand the characteristics of the false negative predictions and the implications for care delivery, studies were partitioned into true positive, false negative, false positive, and true negative categories to examine CMR measurements and age at CMR (Table 2). The false negative group had less extreme median adverse remodelling than the true positive group. The false negative group was also younger than the true positive group (median age 17.0 vs. 27.1 years, P < 0.001), again demonstrating the diminished prediction performance at younger ages.

Table 2.

Characteristics of correctly classified and misclassified groups for the AI-ECG model

	True positive	False negative	False positive	True negative
n	392	34	337	231
Age (years)	27.1 [19.1–41.0]	17.0 [14.0–24.9]	21.2 [15.4–30.6]	21.9 [14.8–29.6]
RVEDVi (mL/m²)	159.7 [135.0–186.4]	150.9 [117.6–172.8]	120.8 [104.2–135.0]	117.0 [98.9–134.5]
RVESVi (mL/m²)	92.0 [80.7–109.1]	84.8 [68.4–90.0]	60.2 [51.7–69.1]	54.1 [44.9–65.7]
RVEF (%)	41.2 [35.8–45.0]	44.3 [40.1–49.0]	50.0 [47.0–53.2]	52.7 [49.9–56.9]
LVEF (%)	52.0 [47.4–55.4]	53.6 [50.0–58.0]	58.0 [55.2–61.0]	59.3 [55.7–64.0]

Results are presented as median [IQR].

Discussion

The widely available and cost-effective 12-lead ECG can serve as an AI-ECG biomarker to personalize the assessment of adverse ventricular remodelling in TOF. By linking CMR-defined criteria for adverse ventricular remodelling with the ECG waveform, this biomarker may inform the timing of further imaging and may reduce the number of CMR studies required in TOF. A strength of this study is the breadth of multiple external validations, which advance important technical and pragmatic aspects of multicentre AI model implementation in congenital heart disease. Important additional novel findings of this study are that (i) RV contractile metrics such as RVESV and RVEF were better predicted than RVEDV; and (ii) AI-ECG models exhibited important differences in discrimination and calibration across external validation sites.

Clinical significance

The output of the AI-ECG model is the probability of CMR volumetric or functional derangements indicative of adverse ventricular remodelling. This may be used as a novel biomarker to risk-stratify the need for and mode of advanced imaging. As the recommended CMR screening interval in the asymptomatic adult TOF population is every 3 years,^4,5 a proposed use of this model is to personalize the timing and mode of surveillance imaging. For low-risk patients who have less than a 25% risk of having an actionable CMR abnormality, use of this test may delay (but, because of false negatives, not necessarily replace) the need for surveillance CMR by a year, with 92% sensitivity and 87% NPV. One method of evaluating benefit is net benefit and decision curve analysis, which describes the net reduction in CMR across an acceptable range of risks. Acceptable risk depends on patient and practitioner preferences. In this analysis, anything less than a 25% risk, or three ‘negative’ CMR for every one ‘positive’ CMR showing significant ventricular remodelling, was considered a reasonable range of risk to delay a CMR given the typically slow progression of ventricular remodelling in older children and adults.¹¹ Because current practice is to obtain CMR on all patients, implementation of this strategy is equivalent to a strategy that identifies every case of disease in the population but refers only 87% of patients for CMR. Importantly, decision curves illustrate net reduction over a range of thresholds of patient and provider risk preference. A net reduction in CMR was observed for all risk probabilities at and above 8%, meaning that the current strategy of obtaining CMR on every patient would be warranted only for very risk-averse providers or patients. Although the model used North American CMR-based surveillance guidelines as prediction targets, these are similar to European Society of Cardiology guidelines, which also state that RVEDVi ≥160 mL/m², RVESVi ≥80 mL/m², and RV systolic dysfunction are Class IIa recommendations for PVR.²⁷
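
The net reduction argument can be made concrete with the decision-curve formula for interventions avoided (see reference 24). The sketch below is an illustration, not the authors' code: it assumes that the group sizes in Table 2 correspond to the 25% risk threshold, and the function name is hypothetical.

```python
def net_reduction_per_100(tn, fn, n_total, threshold):
    """Net CMR avoided per 100 patients versus scanning everyone.

    Each false negative is penalized by the odds (1 - pt) / pt implied by
    the risk threshold pt, i.e. 3 true negatives per missed case at 25%.
    """
    penalty = (1 - threshold) / threshold
    return 100 * (tn / n_total - (fn / n_total) * penalty)

tp, fn, fp, tn = 392, 34, 337, 231  # Table 2 group sizes (assumed at 25%)
n_total = tp + fn + fp + tn
reduction = net_reduction_per_100(tn, fn, n_total, threshold=0.25)
print(f"Net reduction: {reduction:.1f} CMR per 100 patients")
print(f"Equivalent referral rate: {100 - reduction:.0f}%")
```

Under these assumptions the result is roughly 13 CMR avoided per 100 patients after penalizing missed cases, which is consistent with the statement that the strategy is equivalent to referring only 87% of patients.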

The AI-ECG biomarker may also be informative for the patient with a high risk (>75%) of adverse ventricular remodelling. They could be referred for earlier-than-usual CMR for earlier disease identification, or they may be referred directly for retrospective ECG-gated cardiac CT for transcatheter pulmonary valve evaluation²⁸ and volumetric assessment to eliminate redundant imaging. Finally, patients with intermediate risk would continue usual care. The elimination of unnecessary or redundant imaging could increase cost-effectiveness and ensure that limited resources are used in those most likely to benefit.
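
The three-tier pathway described above could be expressed as a simple rule applied to the calibrated model output. This is a hypothetical sketch: the 25% and 75% cut-offs come from the text, while the function name and the returned labels are illustrative only and do not represent a validated clinical protocol.

```python
def triage(risk, low=0.25, high=0.75):
    """Map a calibrated probability of adverse remodelling to a pathway."""
    if not 0.0 <= risk <= 1.0:
        raise ValueError("risk must be a probability between 0 and 1")
    if risk < low:
        return "low risk: consider delaying surveillance CMR"
    if risk > high:
        return "high risk: consider earlier CMR or cardiac CT"
    return "intermediate risk: continue usual care"

for risk in (0.10, 0.50, 0.90):
    print(f"{risk:.2f} -> {triage(risk)}")
```

Values exactly at either cut-off fall into the intermediate group here, which is a design choice of this sketch rather than something specified in the study.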

This approach meets a particularly salient need in the ACHD population. Shortfalls in the availability of ACHD providers²⁹ lead to unequal access to speciality care. Early adulthood is a particularly high-risk period for patients with congenital heart disease to lose access to care,³⁰ and in one study, only 13% of patients were adherent to imaging guidelines,¹⁰ highlighting the challenge of delivering appropriate care to this population. Furthermore, this is a high-risk age window in which adverse ventricular remodelling may develop, and PVR may be indicated.³¹ Artificial intelligence analysis of electrocardiograms could be an effective tool to provide increased access to congenital care in this at-risk population, as it may be performed with non-specialized equipment and providers. Even though global model discrimination (AUROC) was decreased in patients <22 years old, it is important to recognize that the screening threshold metrics still suggest clinical utility and net benefit in this patient population. Nevertheless, even though the global false negative rate is low (high test sensitivity), error analysis suggests that the ‘missed’ patients tend to be younger patients with CMR measurements closer to the threshold for intervention. The clinical impact of delaying diagnosis in younger patients may be minimized as PVR in children is less common,⁵ and the vast majority of TOF patients do not progress rapidly across serial CMR.¹¹ However, as guidelines recommend a ‘baseline’ CMR in adolescence near the time of transition to adulthood,³² this algorithm may not be suited to replace this initial CMR. Differences in model performance by age may be related to underrepresentation of younger patients within the training cohort due to the experimental requirement for a paired CMR (which is typically first obtained in older children and young adults) and BSA >1 m².

Pathophysiological relevance

The ECG patterns in TOF convey important electromechanical information. Right bundle branch block QRS duration is prognostic, with a duration >180 ms a risk factor for sudden cardiac death.^33,34 Notably, the AI-ECG model outperformed QRS duration for the prediction of adverse remodelling, and QRS duration alone was insufficient to reduce CMR usage at clinically relevant screening thresholds. QRS fragmentation is a more recently described pattern that has been associated with RV size, function, exercise tolerance, and outcome.^35–40 Saliency mapping in this study identifies the later portions of the QRS complex, during early to mid-systole, as an important area for risk stratification, with notching of the R’ and R height observed as important factors. Interestingly, new data deemphasize the importance of diastolic size in risk stratification and emphasize that contractile metrics like RVESV and RVEF may have stronger associations with outcome.^5,19,20,41 Similarly, this study suggests that RVESV and RVEF may have stronger electromechanical representation in the ECG than diastolic size, which is a potential mechanism for the correlations observed between the ECG, the CMR, and long-term outcome in TOF.

Interestingly, obesity was associated with improved model performance. Obesity in TOF is associated with lower LV and RV EF and leads to underestimation of the degree of ventricular dilation when volumes are indexed to BSA rather than ideal body weight.⁴² In this study, weak negative correlations between BMI and RVEF (Spearman rho = −0.07, P = 0.024, see Supplementary material online, Figure S4) and between BMI and RVEDVi (Spearman rho = −0.08, P = 0.015) were observed, consistent with these prior reports. However, unsurprisingly, obesity and age were correlated with each other, and within the ≥22-year-old age group, model performance was not higher in obese patients. Given the known limitations of BSA scaling in volumetric RV assessment in the setting of obesity, ECG analysis could be a novel method for identification of ventricular remodelling, but further development of this method is required to ensure results are not confounded by age effects.

Implications for AI model development and deployment in congenital heart disease

This study goes beyond single-centre external validation to evaluate the generalizability of an AI-ECG model across a diverse spectrum of practices and settings. Based on the extensive external validation performed in this study, we conclude that local verification of discrimination and calibration is required. This is a novel and important finding in the congenital heart disease literature, particularly when examined in a regulatory framework. For ‘high risk’ AI systems, the newly implemented European Union AI Act demands ‘an appropriate level of accuracy’ which shall be defined ‘in cooperation with relevant stakeholders and organizations’. The United States Food and Drug Administration approval pathway requires evidence of model generalizability⁴³ that may be demonstrated with two separate external validations. This study could meet these requirements as three centres performed strongly on external validation, but further evaluation revealed suboptimal performance in two other centres. This reinforces that demonstration of discrimination, calibration, and net benefit across a wide range of centres is important for safe and effective adoption of AI tools. The current standards could be inadequate to ensure truly generalizable performance, which warrants further examination across the breadth of AI-ECG applications.

Calibration is an essential consideration to ensure that model probability scores can be reliably interpreted as predicted risk, but it is not well studied in the congenital heart disease literature. Calibration varied across sites, including miscalibration at one site that had a high AUROC (Site B). We speculate that differences in calibration may arise from out-of-distribution prevalence of adverse remodelling and age distributions in the miscalibrated centres (see Supplementary material online, Results and discussion). This highlights that model discrimination (AUROC) is only the first step towards AI model evaluation. Calibration is essential to ensure that globally applied risk thresholds can be expected to have similar performance across institutions. Furthermore, calibration ensures that the predicted risk aligns with the observed outcomes to give clinicians the context needed to make personalized decisions. Important metrics like net benefit are not meaningful without proper model calibration²⁴ because the model output probability does not reflect the underlying risk of disease. This is further emphasized by analysis of net benefit curves for the uncalibrated model (see Supplementary material online, Figure S5), which show very little net benefit at the a priori threshold amongst well-performing centres and negative net benefit when all centres are considered. This shows that without proper consideration of calibration, a model may be harmful compared with the current standard of care when considered in the context of clinical risk tolerance. These results echo recent findings that an FDA-approved AI-ECG model has calibration differences across centres that significantly affect the model performance at predefined screening thresholds to identify hypertrophic cardiomyopathy.⁴⁴ Mitigation of calibration differences across centres might be achieved through a multicentre training design to account for distribution differences across centres.
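
The distinction between discrimination and calibration can be illustrated with a toy example. The sketch below uses eight made-up probability and label pairs (not study data) and a monotone distortion of the scores as a stand-in for site-level miscalibration; all names are hypothetical. Because the distortion preserves ranking, AUROC is unchanged, yet the mean predicted risk no longer matches the observed event rate and a fixed 25% threshold now misses a case.

```python
def auroc(scores, labels):
    """Probability that a random positive outranks a random negative."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def observed_to_expected(scores, labels):
    """Observed event rate divided by mean predicted risk (1.0 is ideal)."""
    return (sum(labels) / len(labels)) / (sum(scores) / len(scores))

def missed_below(scores, labels, threshold=0.25):
    """Count positives that fall under the 'low-risk' threshold."""
    return sum(1 for s, y in zip(scores, labels) if s < threshold and y == 1)

# Hypothetical toy data: calibrated on average at the "home" site.
labels = [0, 0, 0, 1, 0, 1, 1, 1]
home = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]
# Monotone distortion standing in for miscalibration at another site.
away = [p ** 2 for p in home]

for name, scores in (("home", home), ("away", away)):
    print(name,
          f"AUROC={auroc(scores, labels):.3f}",
          f"O/E={observed_to_expected(scores, labels):.2f}",
          f"missed={missed_below(scores, labels)}")
```

In this toy setting both score sets rank patients identically, so a site could report the same AUROC while a shared risk threshold behaves quite differently, which is the practical reason for verifying calibration locally.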

To aid in further research and implementation in the congenital heart disease community, the complete containerized pipeline and model to perform AI-ECG inference from 12-lead ECG in XML format is available for download at https://github.com/sonqduong/ECGsizefxn_rTOF.

Limitations

The AI-ECG model output is the predicted probability of significant CMR abnormalities. However, these volumetric criteria for intervention are debated. A scientific statement⁵ released after data collection, model training, and analysis of this study suggested modification of the CMR-based indications for PVR, though RVESVi, RVEF, LVEF, and RVEDVi are still all directly or indirectly integrated. It remains to be seen whether this guidance will be broadly adopted. Notably, in this study, the composite, RVEF, and RVESV models all exhibit similar performance, which suggests that this model could be successfully adapted to updated guidance. Although the models were produced from retrospective data before the release of this statement, if the definitions of significant ventricular remodelling change, then CMR screening practices are also likely to change over time, which may in turn become a source of model drift in prospective implementation. It is also recognized that the AI-ECG model cannot completely replace advanced imaging because CMR also provides important information beyond quantification of volumes and EF such as pulmonary regurgitation fraction, aorta and pulmonary artery size, branch pulmonary artery flow distribution, and evaluation for scar and fibrosis to further risk stratify outcomes.^45–47

In this study, significant differences across institutions in CMR measurements and outcome prevalence are observed, which may be related to population differences across centres but may also be related to how CMR studies are quantified at each centre. The combined effect of interstudy and interobserver variability on RV measurements in TOF may be substantial, varying by 12.7 mL/m² for RVEDVi, 9.3 mL/m² for RVESVi, and 7% for RVEF.⁴⁸ Even in healthy children, coefficients of variation in RVEDV, RVESV, and RVEF range from 6 to ∼10%, with evidence for systematic differences between observers.⁴⁹ If there are systematic differences in how CMRs are measured, then this may degrade performance in external validation through ‘concept shift’ (i.e. similar ECG features may reflect the same underlying ‘true’ measurement, but due to CMR measurement differences, the ECG is labelled as disease-positive at one site and disease-negative at another). This may in part explain the differences in calibration observed across centres. Although one of the sites (Site B) submitted CMR data that were partially remeasured in a core lab¹⁸ and discrimination remained strong, it is likely that performance could be further improved by recontouring all studies at a core facility, as has been performed in other major TOF cohort studies,^18,50 to improve generalizability.

Because variations in discrimination are observed, it is important to verify discrimination and calibration locally. The calibration was performed using cross-validation instead of a hold-out calibration set to maximize sample size for detailed subgroup performance analysis. However, this strategy risks presenting overoptimistic calibration results, which may in turn affect net benefit analysis results, which are sensitive to calibration. In prospective implementation, recalibration would by necessity be performed on a hold-out calibration set. As this validation was only performed at North American sites, validation in Europe and other global locations that may benefit from AI-ECG screening is required. Finally, due to the experimental requirement for a paired CMR, the cohort is enriched for patients with more severe disease, as more frequent CMR is recommended in higher-risk patients.⁴ Certain populations with a likely lower incidence of disease such as those without significant pulmonary regurgitation might be underrepresented. We addressed this by examining net benefit under a range of projected disease prevalence and found that the net reduction in CMR increased as disease prevalence decreased.
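
The prevalence sensitivity analysis can be sketched in the same decision-curve framework. The example below is an illustration under stated assumptions rather than the authors' method: it holds sensitivity and specificity fixed at the values implied by Table 2, fixes the risk threshold at 25%, and uses an arbitrary set of prevalences chosen only for illustration.

```python
def projected_net_reduction(sens, spec, prevalence, threshold=0.25):
    """Net CMR avoided per 100 patients at a projected disease prevalence."""
    true_neg = spec * (1 - prevalence)       # fraction correctly deferred
    false_neg = (1 - sens) * prevalence      # fraction with missed disease
    return 100 * (true_neg - false_neg * (1 - threshold) / threshold)

sens = 392 / (392 + 34)    # implied by Table 2
spec = 231 / (231 + 337)   # implied by Table 2

# Illustrative prevalences only; 0.43 approximates the Table 2 cohort.
for prevalence in (0.43, 0.30, 0.20, 0.10):
    value = projected_net_reduction(sens, spec, prevalence)
    print(f"prevalence {prevalence:.0%}: {value:.1f} CMR avoided per 100")
```

With sensitivity and specificity held constant, the net reduction rises as prevalence falls, because more patients are true negatives and fewer are missed. In practice sensitivity and specificity may themselves shift with case mix, so this should be read as a first-order approximation.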

Conclusions

Artificial intelligence analysis of the electrocardiogram can be used as a biomarker for adverse ventricular remodelling to inform the timing of advanced imaging in patients with repaired TOF. This may increase care efficiency and expand access to care by reducing unnecessary or redundant CMR. Through its rare multicentre design, this study also reveals important differences in performance and model calibration across centres and cohorts, which have important practical implications for real-world prospective implementation of AI-ECG biomarkers in congenital heart disease and beyond.

Acknowledgements

The authors would like to thank Dr. Andrew Vickers for advice in implementation of net benefit analysis.

Contributor Information

Son Q Duong, Department of Pediatrics (Cardiology), Icahn School of Medicine at Mount Sinai, 1468 Madison Ave, Annenberg 3rd Fl, New York, NY 10029, USA; Windreich Department of Artificial Intelligence and Human Health, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA; Center for Artificial Intelligence in Children’s Health, Mindich Child Health and Development Institute, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA.

Akhil Vaid, Windreich Department of Artificial Intelligence and Human Health, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA; The Hasso Plattner Institute for Digital Health at Mount Sinai, New York, NY 10029, USA.

Pengfei Jiang, Center for Child Health Services Research, Mindich Child Health and Development Institute, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA; Department of Population Health Sciences and Policy, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA.

Yuval Bitterman, Department of Paediatrics, Labatt Family Heart Centre, Hospital for Sick Children (SickKids), Toronto M5G 1X8, Canada.

Yamini Krishnamurthy, Department of Medicine (Cardiology), Columbia University Medical Center, New York, NY 10032, USA.

I Min Chiu, Department of Medicine (Cardiology), Cedars-Sinai Medical Center, Los Angeles, CA 90048, USA.

Joshua Finer, Department of Medicine (Cardiology), Columbia University Medical Center, New York, NY 10032, USA.

Brian Cleary, Department of Pediatrics (Cardiology), Northwestern Feinberg School of Medicine, Chicago, IL 60611, USA.

Benjamin S Glicksberg, Windreich Department of Artificial Intelligence and Human Health, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA; Center for Artificial Intelligence in Children’s Health, Mindich Child Health and Development Institute, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA; The Hasso Plattner Institute for Digital Health at Mount Sinai, New York, NY 10029, USA.

Ruchira Garg, Department of Pediatrics (Cardiology), Cedars-Sinai Medical Center, Los Angeles, CA 90048, USA.

Michael DiLorenzo, Department of Pediatrics (Cardiology), Columbia University College of Physicians and Surgeons, New York, NY 10032, USA.

Mark Friedberg, Department of Paediatrics, Labatt Family Heart Centre, Hospital for Sick Children (SickKids), Toronto M5G 1X8, Canada.

Evan Zahn, Department of Pediatrics (Cardiology), Cedars-Sinai Medical Center, Los Angeles, CA 90048, USA.

Matthew Lewis, Department of Medicine (Cardiology), Columbia University Medical Center, New York, NY 10032, USA.

Michael Satzer, Department of Pediatrics (Cardiology), Northwestern Feinberg School of Medicine, Chicago, IL 60611, USA.

David Ouyang, Department of Medicine (Cardiology), Cedars-Sinai Medical Center, Los Angeles, CA 90048, USA.

Pierre Elias, Department of Medicine (Cardiology), Columbia University Medical Center, New York, NY 10032, USA.

Tal Geva, Department of Cardiology, Boston Children’s Hospital, Boston, MA 02115, USA.

Sunil Ghelani, Department of Cardiology, Boston Children’s Hospital, Boston, MA 02115, USA.

Brett R Anderson, Department of Pediatrics (Cardiology), Icahn School of Medicine at Mount Sinai, 1468 Madison Ave, Annenberg 3rd Fl, New York, NY 10029, USA; Center for Child Health Services Research, Mindich Child Health and Development Institute, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA; Department of Population Health Sciences and Policy, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA.

Ali Zaidi, Department of Pediatrics (Cardiology), Icahn School of Medicine at Mount Sinai, 1468 Madison Ave, Annenberg 3rd Fl, New York, NY 10029, USA; Mount Sinai Fuster Heart Hospital, Department of Medicine, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA.

Rachel M Wald, Department of Paediatrics, Labatt Family Heart Centre, Hospital for Sick Children (SickKids), Toronto M5G 1X8, Canada; Peter Munk Cardiac Centre, Toronto Adult Congenital Heart Disease Program, University of Toronto, Toronto M5G 2C4, Ontario, Canada.

Girish N Nadkarni, Windreich Department of Artificial Intelligence and Human Health, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA; The Hasso Plattner Institute for Digital Health at Mount Sinai, New York, NY 10029, USA.

Joshua Mayourian, Department of Cardiology, Boston Children’s Hospital, Boston, MA 02115, USA.

Supplementary material

Supplementary material is available at European Heart Journal – Digital Health.

Author contributions

Son Duong (Conceptualization [lead]; Data curation [lead]; Formal analysis [lead]; Investigation [lead]; Methodology [lead]; Supervision [equal]; Validation [lead]; Visualization [lead]; Writing—original draft [lead]; Writing—review & editing [lead]), Rachel Wald (Conceptualization [supporting]; Data curation [supporting]; Formal analysis [supporting]; Investigation [supporting]; Supervision [supporting]; Writing—review & editing [supporting]), Ali Zaidi (Data curation [supporting]; Investigation [supporting]; Writing—review & editing [supporting]), Brett Anderson (Data curation [supporting]; Formal analysis [supporting]; Writing—review & editing [supporting]), Sunil Ghelani (Data curation [supporting]; Writing—review & editing [supporting]), Tal Geva (Data curation [supporting]; Writing—review & editing [supporting]), Pierre Elias (Data curation [supporting]; Formal analysis [supporting]; Writing—review & editing [supporting]), David Ouyang (Data curation [supporting]; Formal analysis [supporting]; Writing—review & editing [supporting]), Michael Satzer (Data curation [supporting]; Writing—review & editing [supporting]), Matthew Lewis (Data curation [supporting]; Formal analysis [supporting]; Writing—review & editing [supporting]), Evan Zahn (Data curation [supporting]; Writing—review & editing [supporting]), Mark Friedberg (Data curation [supporting]; Writing—review & editing [supporting]), Michael DiLorenzo (Data curation [supporting]; Writing—review & editing [supporting]), Ruchira Garg (Data curation [supporting]; Writing—review & editing [supporting]), Benjamin Glicksberg (Methodology [supporting]; Resources [supporting]; Writing—review & editing [supporting]), Brian Cleary (Data curation [supporting]; Writing—review & editing [supporting]), Joshua Finer (Data curation [supporting]; Formal analysis [supporting]; Writing—review & editing [supporting]), I-Min Chiu (Data curation [supporting]; Formal analysis [supporting]; Writing—review & editing [supporting]), Yamini Krishnamurthy (Data curation [supporting]; Writing—review & editing [supporting]), Yuval Bitterman (Data curation [supporting]; Writing—review & editing [supporting]), Pengfei Jiang (Data curation [supporting]; Formal analysis [supporting]; Writing—review & editing [supporting]), Akhil Vaid (Formal analysis [supporting]; Investigation [supporting]; Methodology [supporting]), Girish Nadkarni (Formal analysis [supporting]; Funding acquisition [equal]; Resources [equal]; Supervision [equal]; Writing—review & editing [lead]), and Joshua Mayourian (Conceptualization [equal]; Formal analysis [equal]; Investigation [equal]; Methodology [equal]; Writing—review & editing [equal]).

Funding

This work was supported by National Institutes of Health K08HL173639 and the American Society of Echocardiography Foundation EDGES Award (S.Q.D.), National Institutes of Health R01HL167050 (G.N.N.), Kostin Innovation Fund (J.M.), Thrasher Research Fund Early Career Award (J.M.), and National Institutes of Health T32HL007572 (J.M.).

Data availability

The data underlying this article cannot be shared publicly because they contain protected patient health information. The data may be shared upon reasonable request and in accordance with appropriate regulatory oversight. The containerized pipeline and model to perform AI-ECG inference from 12-lead ECG in XML format are available for download at https://github.com/sonqduong/ECGsizefxn_rTOF.

References

1. Apitz C, Webb GD, Redington AN. Tetralogy of Fallot. Lancet 2009;374:1462–1471.
2. Geva T. Indications for pulmonary valve replacement in repaired tetralogy of Fallot: the quest continues. Circulation 2013;128:1855–1857.
3. Bokma JP, Geva T, Sleeper LA, Lee JH, Lu M, Sompolinsky T, et al. Improved outcomes after pulmonary valve replacement in repaired tetralogy of Fallot. J Am Coll Cardiol 2023;81:2075–2085.
4. Stout KK, Daniels CJ, Aboulhosn JA, Bozkurt B, Broberg CS, Colman JM, et al. 2018 AHA/ACC guideline for the management of adults with congenital heart disease: a report of the American College of Cardiology/American Heart Association Task Force on Clinical Practice Guidelines. Circulation 2019;139:e698–e800.
5. Geva T, Wald RM, Bucholz E, Cnota JF, McElhinney DB, Mercer-Rosa LM, et al. Long-term management of right ventricular outflow tract dysfunction in repaired tetralogy of Fallot: a scientific statement from the American Heart Association. Circulation 2024;150:e689–e707.
6. Mercer-Rosa L, Parnell A, Forfia PR, Yang W, Goldmuntz E, Kawut SM. Tricuspid annular plane systolic excursion in the assessment of right ventricular function in children and adolescents after repair of tetralogy of Fallot. J Am Soc Echocardiogr 2013;26:1322–1329.
7. Lopez L, Saurers DL, Barker PCA, Cohen MS, Colan SD, Dwyer J, et al. Guidelines for performing a comprehensive pediatric transthoracic echocardiogram: recommendations from the American Society of Echocardiography. J Am Soc Echocardiogr 2024;37:119–170.
8. Buddhe S, Soriano BD, Powell AJ. Survey of centers performing cardiovascular magnetic resonance in pediatric and congenital heart disease: a report of the Society for Cardiovascular Magnetic Resonance. J Cardiovasc Magn Reson 2022;24:10.
9. Salciccioli KB, Oluyomi A, Lupo PJ, Ermis PR, Lopez KN. A model for geographic and sociodemographic access to care disparities for adults with congenital heart disease. Congenit Heart Dis 2019;14:752–759.
10. Khan AM, McGrath LB, Ramsey K, Agarwal A, Broberg CS. Association of adults with congenital heart disease-specific care with clinical characteristics and healthcare use. J Am Heart Assoc 2021;10:e019598.
11. Rutz T, Ghandour F, Meierhofer C, Naumann S, Martinoff S, Lange R, et al. Evolution of right ventricular size over time after tetralogy of Fallot repair: a longitudinal cardiac magnetic resonance study. Eur Heart J Cardiovasc Imaging 2017;18:364–370.
12. Wald RM, Valente AM, Gauvreau K, Babu-Narayan SV, Assenza GE, Schreier J, et al. Cardiac magnetic resonance markers of progressive RV dilation and dysfunction after tetralogy of Fallot repair. Heart 2015;101:1724–1730.
13. Duong SQ, Vaid A, My VTH, Butler LR, Lampert J, Pass RH, et al. Quantitative prediction of right ventricular size and function from the ECG. J Am Heart Assoc 2024;13:e031671.
14. Mayourian J, Gearhart A, La Cava WG, Vaid A, Nadkarni GN, Triedman JK, et al. Deep learning-based electrocardiogram analysis predicts biventricular dysfunction and dilation in congenital heart disease. J Am Coll Cardiol 2024;84:815–828.
15. Duong SQ, Dominy CL, Lampert J, Singh S, Croft L, Zaidi AN, et al. Ensemble modeling of multimodal electrocardiogram and echocardiogram data improves quantitative assessment of right ventricular function. JACC Adv 2024;3:101186.
16. Svennberg E, Han JK, Caiani EG, Engelhardt S, Ernst S, Friedman P, et al. State of the art of artificial intelligence in clinical electrophysiology in 2025: a scientific statement of the European Heart Rhythm Association (EHRA) of the ESC, the Heart Rhythm Society (HRS), and the ESC Working Group on E-Cardiology. Europace 2025;27:euaf071.
17. Collins GS, Moons KGM, Dhiman P, Riley RD, Beam AL, Van Calster B, et al. TRIPOD + AI statement: updated guidance for reporting clinical prediction models that use regression or machine learning methods. BMJ 2024;385:q902.
18. Wald RM, Altaha MA, Alvarez N, Caldarone CA, Cavallé-Garrido T, Dallaire F, et al. Rationale and design of the Canadian outcomes registry late after tetralogy of Fallot repair: the CORRELATE study. Can J Cardiol 2014;30:1436–1443.
19. Valente AM, Gauvreau K, Assenza GE, Babu-Narayan SV, Schreier J, Gatzoulis MA, et al. Contemporary predictors of death and sustained ventricular tachycardia in patients with repaired tetralogy of Fallot enrolled in the INDICATOR cohort. Heart 2014;100:247–253.
20. Bokma JP, Winter MM, Oosterhof T, Vliegen HW, van Dijk AP, Hazekamp MG, et al. Preoperative thresholds for mid-to-late haemodynamic and clinical outcomes after pulmonary valve replacement in tetralogy of Fallot. Eur Heart J 2016;37:829–835.
21. Mayourian J, La Cava W, Vaid A, Ghelani SJ, Mannix R, Bezzerides VJ, et al. Pediatric electrocardiogram-based deep learning to predict left ventricular dysfunction and remodeling. Circulation 2024;149:917–931.
22. Ribeiro AH, Ribeiro MH, Paixão GMM, Oliveira DM, Gomes PR, Canazart JA, et al. Automatic diagnosis of the 12-lead ECG using a deep neural network. Nat Commun 2020;11:1760.
23. Huang Y, Li W, Macheret F, Gabriel RA, Ohno-Machado L. A tutorial on calibration measurements and calibration models for clinical prediction models. J Am Med Inform Assoc 2020;27:621–633.
24. Vickers AJ, Van Calster B, Steyerberg EW. Net benefit approaches to the evaluation of prediction models, molecular markers, and diagnostic tests. BMJ 2016;352:i6.
25. Khurshid S, Friedman S, Pirruccello JP, Di Achille P, Diamant N, Anderson CD, et al. Deep learning to predict cardiac magnetic resonance–derived left ventricular mass and hypertrophy from 12-lead ECGs. Circ Cardiovasc Imaging 2021;14:e012281.
26. Mayourian J, El-Bokl A, Lukyanenko P, La Cava WG, Geva T, Valente AM, et al. Electrocardiogram-based deep learning to predict mortality in paediatric and adult congenital heart disease. Eur Heart J 2025;46:856–868.
27. Baumgartner H, De Backer J, Babu-Narayan SV, Budts W, Chessa M, Diller G-P, et al. 2020 ESC guidelines for the management of adult congenital heart disease. Eur Heart J 2021;42:563–645.
28. Gillespie MJ, Benson LN, Bergersen L, Bacha EA, Cheatham SL, Crean AM, et al. Patient selection process for the harmony transcatheter pulmonary valve early feasibility study. Am J Cardiol 2017;120:1387–1392.
29. Chowdhury D, Johnson JN, Baker-Smith CM, Jaquiss RDB, Mahendran AK, Curren V, et al. Health care policy and congenital heart disease: 2020 focus on our 2030 future. J Am Heart Assoc 2021;10:e020605.
30. Moons P, Bratt EL, De Backer J, Goossens E, Hornung T, Tutarel O, et al. Transition to adulthood and transfer to adult care of adolescents with congenital heart disease: a global consensus statement of the ESC Association of Cardiovascular Nursing and Allied Professions (ACNAP), the ESC Working Group on Adult Congenital Heart Disease (WG ACHD), the Association for European Paediatric and Congenital Cardiology (AEPC), the Pan-African Society of Cardiology (PASCAR), the Asia-Pacific Pediatric Cardiac Society (APPCS), the Inter-American Society of Cardiology (IASC), the Cardiac Society of Australia and New Zealand (CSANZ), the International Society for Adult Congenital Heart Disease (ISACHD), the World Heart Federation (WHF), the European Congenital Heart Disease Organisation (ECHDO), and the Global Alliance for Rheumatic and Congenital Hearts (Global ARCH). Eur Heart J 2021;42:4213–4223.
31. Slouha E, Trygg G, Tariq AH, La A, Shay A, Gorantla VR. Pulmonary valve replacement timing following initial tetralogy of Fallot repair: a systematic review. Cureus 2023;15:e49577.
32. Mayourian J, La Cava WG, Vaid A, Nadkarni GN, Ghelani SJ, Mannix R, et al. Pediatric ECG-based deep learning to predict left ventricular dysfunction and remodeling. Circulation 2024;149:917–931.
33. Khairy P, Harris L, Landzberg MJ, Viswanathan S, Barlow A, Gatzoulis MA, et al. Implantable cardioverter-defibrillators in tetralogy of Fallot. Circulation 2008;117:363–370.
34. Gatzoulis MA, Till JA, Somerville J, Redington AN. Mechanoelectrical interaction in tetralogy of Fallot. Circulation 1995;92:231–237.
35. Egbe AC, Luis SA, Padang R, Warnes CA. Outcomes in moderate mixed aortic valve disease: is it time for a paradigm shift? J Am Coll Cardiol 2016;67:2321–2329.
36. Bokma JP, Winter MM, Vehmeijer JT, Vliegen HW, van Dijk AP, van Melle JP, et al. QRS fragmentation is superior to QRS duration in predicting mortality in adults with tetralogy of Fallot. Heart 2017;103:666–671.
37. Alonso P, Andrés A, Rueda J, Buendía F, Igual B, Rodríguez M, et al. Value of the electrocardiogram as a predictor of right ventricular dysfunction in patients with chronic right ventricular volume overload. Rev Esp Cardiol 2015;68:390–397.
38. Buntharikpornpun R, Jaruratanasirikul S, Roymanee S, Jarutach J, Wongwaitaweewong K, Sangthong R. Correlation between fragmented QRS and ventricular function from cardiac magnetic resonance in patients with repaired tetralogy of Fallot. Pediatr Cardiol 2021;42:1713–1721.
39. Book WM, Hurst JW, Parks WJ, Hopkins KL. Electrocardiographic predictors of right ventricular volume measured by magnetic resonance imaging late after total repair of tetralogy of Fallot. Clin Cardiol 1999;22:740–746.
40. Lumens J, Fan CS, Walmsley J, Yim D, Manlhiot C, Dragulescu A, et al. Relative impact of right ventricular electromechanical dyssynchrony versus pulmonary regurgitation on right ventricular dysfunction and exercise intolerance in patients after repair of tetralogy of Fallot. J Am Heart Assoc 2019;8:e010903.
41. Ishikita A, McIntosh C, Hanneman K, Lee MM, Liang T, Karur GR, et al. Machine learning for prediction of adverse cardiovascular events in adults with repaired tetralogy of Fallot using clinical and cardiovascular magnetic resonance imaging variables. Circ Cardiovasc Imaging 2023;16:e015205. [DOI] [PMC free article] [PubMed] [Google Scholar]
42. Aly S, Lizano Santamaria RW, Devlin PJ, Jegatheeswaran A, Russell J, Seed M, et al. Negative impact of obesity on ventricular size and function and exercise performance in children and adolescents with repaired tetralogy of Fallot. Can J Cardiol 2020;36:1482–1490. [DOI] [PubMed] [Google Scholar]
43. Center for Devices and Radiological Health. Good Machine Learning Practice for Medical Device Development: Guiding Principles. FDA. Published online March 25, 2025. https://www.fda.gov/medical-devices/software-medical-device-samd/good-machine-learning-practice-medical-device-development-guiding-principles (6 August 2025).
44. Lampert J, Bhatt DL, Vaid A, Kon K, Feinman J, Jou S, et al. Calibration of ECG-based deep-learning algorithm scores for patients flagged as high risk for hypertrophic cardiomyopathy. NEJM AI 2025;2:AIoa2400421.
45. Geva T. Repaired tetralogy of Fallot: the roles of cardiovascular magnetic resonance in evaluating pathophysiology and for pulmonary valve replacement decision support. J Cardiovasc Magn Reson 2011;13:9.
46. Valente AM, Cook S, Festa P, Ko HH, Krishnamurthy R, Taylor AM, et al. Multimodality imaging guidelines for patients with repaired tetralogy of Fallot: a report from the American Society of Echocardiography. J Am Soc Echocardiogr 2014;27:111–141.
47. Ghonim S, Gatzoulis MA, Ernst S, Li W, Moon JC, Smith GC, et al. Predicting survival in repaired tetralogy of Fallot: a lesion-specific and personalized approach. JACC Cardiovasc Imaging 2022;15:257–268.
48. Blalock SE, Banka P, Geva T, Powell AJ, Zhou J, Prakash A. Inter-study variability in CMR measurements of right ventricular volume, mass and ejection fraction in tetralogy of Fallot: a prospective observational study. J Cardiovasc Magn Reson 2012;14:P104.
49. van der Ven JPG, Sadighy Z, Valsangiacomo Buechel ER, Sarikouch S, Robbers-Visser D, Kellenberger CJ, et al. Multicentre reference values for cardiac magnetic resonance imaging derived ventricular size and function for children aged 0–18 years. Eur Heart J Cardiovasc Imaging 2020;21:102–113.
50. Valente AM, Gauvreau K, Assenza GE, Babu-Narayan SV, Evans SP, Gatzoulis M, et al. Rationale and design of an international multicenter registry of patients with repaired tetralogy of Fallot to define risk factors for late adverse outcomes: the INDICATOR cohort. Pediatr Cardiol 2013;34:95–104.


