2022 Dec 14;1(5):100153. doi: 10.1016/j.jacadv.2022.100153

Table 1.

Application of Artificial Intelligence in Congenital Heart Disease

| First Author | Year | Patient Population | Category for Analysis | Models | Training/Validation Data Sets | Test Data Set | Results Metrics | Limitations |
|---|---|---|---|---|---|---|---|---|
| Prenatal CHD screening | | | | | | | | |
| Chen et al [24] | 2017 | 900 fetuses | Echocardiograms | Composite RNN to define standard fetal cardiac imaging planes | 900 videos | 331 videos | AUC: 0.95 | Limited to healthy patients, not tested on CHD. |
| Dong et al [25] | 2022 | 3,910 fetuses (14.1% with CHD) | Echocardiograms | Random forest algorithms (ML) to differentiate normal and CHD hearts | 25 features | 10 features | AUC: 0.94; sensitivity: 0.85; specificity: 0.88 | Tabular data instead of raw images. No specific subtypes of CHD defined. Single center. |
| Arnaout et al [12] | 2021 | 1,326 fetuses | Echocardiograms | CNN (classification) | 107,823 images from 1,326 echocardiograms | 4,108 fetal ultrasounds | AUC: 0.99; sensitivity: 0.95; specificity: 0.96 | No published algorithms. Not clinically deployed in practice. |
| Truong et al [26] | 2022 | 3,910 fetuses (14.1% with CHD) | Echocardiograms | Random forest algorithms (ML) to differentiate normal and CHD hearts | 25 features | 10 features | AUC: 0.94; sensitivity: 0.85; specificity: 0.88 | Tabular data instead of raw images. No specific subtypes of CHD defined. Single center. |
| Postnatal CHD screening | | | | | | | | |
| Gharehbaghi et al [27] | 2017 | 55 healthy children vs 35 BAV | Heart sounds | Support vector machine and Markov model | Unknown | Unknown | Sensitivity: 0.86; specificity: 0.87 | Small study and clinical deployment not widespread. |
| Gharehbaghi et al [28] | 2020 | 50 healthy children vs 35 septal defects vs 30 valvular regurgitation | Heart sounds | Time growing neural network (a type of DL) | 80 patients for training, 30% random sampling as validation test | Unknown | Sensitivity: 0.92 | No test data sets and not used clinically. |
| Toba et al [29] | 2020 | 1,031 cardiac catheterizations from 657 CHD patients to predict pulmonary-to-systemic flow ratio | Chest x-rays | Transfer learning of CNN | 931 | 100 | AUC: 0.88; sensitivity: 0.47; specificity: 0.95 | Lack of external validation. Bias, as all were CHD patients who had a cardiac catheterization. Limited number of patients in the training group. |
| Gomez-Quintana et al [30] | 2021 | 265 term and late-preterm neonates (137 normal vs 89 PDA vs 39 CHD patients) | Heart sounds (healthy vs PDA; healthy vs CHD) | ML | 90% of data | 10% of data | AUC (PDA): 0.74; AUC (CHD): 0.78 | Not clinically deployed. Limited data sets. |
| Mori et al [31] | 2021 | 1,192 EKGs from 728 patients (828 normal and 364 ASD) | EKG | CNN and LSTM | Validation was 25% of 1,000 learning data | 192 EKG (155 healthy and 37 ASD) | AUC: 0.96; sensitivity: 0.76; specificity: 0.96 | Volume of data was small for DL. Bias associated with priming effect. Insufficient data to deploy into clinical practice. |
| Lai et al [32] | 2021 | 236 newborns | Pulse oximetry | ML (random forest, logistic regression, multilayer perceptron) | 158 healthy and 27 CHD patients (0-48 h), 50 healthy and 36 CHD patients (>48 h) | 50 healthy and 36 CHD | AUC: 0.91; sensitivity: 95.8%; specificity: 86.4% | Small data sets |
| Bos et al [33] | 2021 | 2,059 patients; 967 with LQTS and 1,092 evaluated for LQTS but discharged without a diagnosis | EKGs | CNN classification | Trained using 60% and validated in 10% of the patients | Tested on remaining 30% of patients | AUC: 0.900 (95% CI: 0.876-0.925) | Bias, as the patient cohort was sent with suspicion of possible LQTS, limiting generalizability. Lacks external validation and calibration from a different center. |
| Hong et al [34] | 2022 | | Color Doppler echocardiogram images | CNN for classification and segmentation | 4,031 cases with 370,057 images | 229 cases with 203,619 images, of which 105 cases with ASD and 124 with intact atrial septum | Accuracy, recall, precision, specificity, and F1 score of 0.8833, 0.8545, 0.8577, 0.9136, and 0.8546, respectively | Not generalizable to spectrum of CHD; single center. |
| Cardiac imaging | | | | | | | | |
| Pereira et al [35] | 2017 | 90 patients; 26 coarctation and 64 healthy | 2D echocardiograms of the parasternal long axis, apical 4-chamber, and suprasternal notch views | SVM (support vector machine classifiers) | Trained on 80% | Tested on 20% | Total error rate of 12.9% (11.5% false negative error and 13.6% false positive) | Single-center study. Limited to single disease. No external validation. |
| Diller et al [10] | 2019 | 132 patients with a systemic RV and 67 normal controls (73,425 TGA; 33,394 ccTGA; and 24,354 normal apical 4-chamber frames) | Echocardiograms | CNN (classification and segmentation) | 159 | 40 | Accuracy: 0.98 | Model requires external validation. |
| Wegner et al [36] | 2022 | 9,793 echocardiogram images from 262 patients with CHD (ToF, Ebstein, TGA) and 62 controls used to build a new model. Prior model was trained on 14,035 echocardiograms from patients without CHD for automated view classification. | Echocardiograms from patients with CHD or structural heart disease used to validate existing CNN trained on structurally normal hearts. Additional model built and trained on CHD echocardiograms to compare performance. | CNN view classification model | 80% for training and validation | 20% for testing | Noncongenital model: overall view-classification accuracy of 48.3% in patients with CHD vs 66.7% in patients without cardiac disease. New CHD-trained model: accuracy of 76.1% for view classification. | Single-center study. Not vendor agnostic. Relatively small number of patients with cyanotic forms of CHD (ie, 3 patients with HLHS, 1 with tricuspid atresia). |
| Karimi-Bidhendi et al [13] | 2020 | 64 patients (20 ToF, 9 DORV, 9 TGA, 8 cardiomyopathy, 9 coronary artery anomaly, 4 pulmonary stenosis, 3 truncus, 2 aortic arch anomaly) | MRI images | Generative adversarial network (a form of unsupervised learning) used to augment the training set. CNN used to segment MRI images | 26 patients randomly assigned to training data set (split 80/20 for training and validation) | 38 patients randomly selected for testing | Dice Similarity Index metrics of 91% and 86.8% for LV at end-diastole and end-systole, respectively, and 87.4% and 80.6% for RV at end-diastole and end-systole, respectively. | Externally validated. Single site. Small patient numbers. |
| Tandon et al [37] | 2021 | 87 cardiac MRI from repaired ToF patients | MRI images | CNN (transfer learning) | 57 | 30 | Dice similarity coefficient: 0.90 | Small data sets |
| Wang et al [38] | 2021 | 1,308 children (823 healthy, 209 VSDs, 276 ASDs) | Echocardiograms | CNN view classification for 5 views | 90% training | 10% testing | Autoencoders trained significantly better on CHD samples than healthy samples; cross-entropy healthy: 0.2649 ± 0.0369 vs 0.2597 ± 0.0327 for CHD, and mean squared difference healthy: 133.89 ± 79.06 vs 118.86 ± 61.52 for CHD. A lower cross-entropy indicates a closer representation of the underlying distribution. | No external validation. Limited diseases. |
| Procedural planning for catheterization and surgery | | | | | | | | |
| Ruiz-Fernandez et al [39] | 2016 | 2,432 patients | Basic clinical data, health history, surgical intervention, and postsurgical intervention | Classification model: (1) multilayer perceptron; (2) radial basis function; (3) self-organizing map; (4) decision tree | 2,432 | 2,432 | Accuracy: 0.99 | Not clinically deployed |
| Lu et al [40] | 2020 | 550 echocardiogram images; 275 before and after atrial septal occlusion surgery | 2D echocardiogram images | Variant of the U-Net architecture used to perform atrial segmentation via CNN to determine surgical outcomes of atrial septal defects before and after septal occlusion | 3:1 training-to-testing ratio | | The U-Net mean and SD reported for the Dice Similarity Index, Jaccard Index, and Hausdorff Distance were 0.9488 (±0.0209), 0.9033 (±0.0374), and 7.5625 (±4.4549), respectively. | Single clinical site and scanner used. No external validation. |
| Outcome prediction and risk stratification | | | | | | | | |
| Diller et al [10] | 2019 | 10,019 adult CHD patients | Clinical data, EKG, cardiopulmonary exercise test, laboratory markers | CNN to categorize diagnostic groups, disease complexity, and New York Heart Association Class | 44,000 medical reports | Unclear | Accuracy 91% in diagnosis, 96% in disease complexity, 90% in New York Heart Association Class | Retrospective single-center data. Raw echo and MRI data using specifically trained data need validation externally. |
| Atallah et al [41] | 2020 | 288 patients (72 ToF patients and 216 controls) | Clinical data and noninvasive testing | Random forest; decision tree to risk stratify into low, moderate, high risk for ventricular arrhythmia and life-threatening events | Unknown | Unknown | High-risk group sensitivity: 0.54; specificity: 0.86 | Small data set and retrospective. Unknown numbers for training and testing data sets. |
| Jalali et al [42] | 2020 | 549 single-ventricle patients | Clinical data, surgery | Logistic regression; decision tree; random forest; gradient boosting; deep neural network | 25 out of 100 variables selected for training | Unknown | AUC (mortality/cardiac transplantation): 0.95; AUC (prolonged length of stay): 0.94 | Exclusion of very ill patients from the PHN SVR trial, thus biased toward higher survival rates. Retrospective data set. |
| Bertsimas et al [43] | 2021 | 235,000 patients with 295,000 operations | Clinical data, general preoperative patient risk factors to predict mortality, postoperative MVST, and length of hospital stay (LOS) | Optimal classification trees; random forests; gradient boosting | 175,239 | 46,096 | AUC (mortality): 0.86; AUC (prolonged MVST): 0.85; AUC (prolonged LOS): 0.82 | Heterogeneous data can lead to bias. |
| Precision medicine | | | | | | | | |
| Meza et al [44] | 2018 | 651 neonates with critical left heart obstruction | 136 echocardiographic measures to group patients into 3 subtypes and identify differentiating characteristics | Unsupervised clustering analysis | Divided into group 1, 215; group 2, 338; and group 3, 98. | | Median LV end diastolic area was 1.35, 0.69, 2.47 cm² in groups 1, 2, and 3; P < 0.001. Overall mortality was 27%, 41%, and 12%, respectively; P < 0.001. | |
| Bruse et al [45] | 2017 | 60 patients | CMR | Automated segmentation, statistical shape modeling and unsupervised hierarchical clustering to group patients accordingly and identify novel subgroups | Cohort divided into 20 healthy subjects, 20 patients who had undergone surgical aortic arch reconstruction, and 20 patients who had their aorta pushed back posteriorly in the Lecompte maneuver for arterial switch operation | | Achieved automatic division of input shape data according to primary clinical diagnosis with a high F-score (0.902 ± 0.042) and Matthews correlation coefficient (0.851 ± 0.064) using the correlation/weighted distance/linkage combination. | Relatively small cohort of patients; not generalizable to other forms of CHD |
| Bahado-Singh et al [46] | 2022 | 24 coarctation patients and 16 controls | Blood spots | Deep learning to perform genome-wide DNA methylation analysis | Unknown | Unknown | AUC: 0.97; sensitivity: 0.95; specificity: 0.98 | Unknown number of training and testing data sets |

ASD = atrial septal defect; AUC = area under the curve; BAV = bicuspid aortic valve; ccTGA = congenitally corrected transposition of the great arteries; CHD = congenital heart disease; CI = confidence interval; CMR = cardiac magnetic resonance imaging; CNN = convolutional neural network; DL = deep learning; DORV = double outlet right ventricle; EKG = electrocardiogram; HLHS = hypoplastic left heart syndrome; LQTS = long QT syndrome; LSTM = long short-term memory; LV = left ventricle; ML = machine learning; MRI = magnetic resonance imaging; MVST = mechanical ventilatory support time; PDA = patent ductus arteriosus; PHN = Pediatric Heart Network; RNN = recurrent neural network; RV = right ventricle; SD = standard deviation; SVM = support vector machine; TGA = transposition of the great arteries; ToF = tetralogy of Fallot; VSD = ventricular septal defect.
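The results column reports sensitivity and specificity for the classification studies and Dice and Jaccard overlap for the segmentation studies. As a reading aid, the sketch below shows how these four metrics are conventionally computed from binary labels and binary masks. It is a minimal illustration under stated assumptions, not the method used by any study in the table: the function names and the toy data are hypothetical, labels are assumed to be coded 1 for disease present and 0 for absent, and both classes are assumed to occur at least once.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[float, float]:
    """Return (sensitivity, specificity) for binary labels, 1 = disease present."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    return tp / (tp + fn), tn / (tn + fp)


def dice_jaccard(mask_a: Sequence[int], mask_b: Sequence[int]) -> Tuple[float, float]:
    """Return (Dice, Jaccard) overlap for two flattened binary segmentation masks."""
    overlap = sum(1 for a, b in zip(mask_a, mask_b) if a and b)
    size_a = sum(1 for a in mask_a if a)
    size_b = sum(1 for b in mask_b if b)
    dice = 2 * overlap / (size_a + size_b)
    jaccard = overlap / (size_a + size_b - overlap)
    return dice, jaccard


# Hypothetical toy data, not taken from any study in the table.
labels = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
predictions = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
sens, spec = sensitivity_specificity(labels, predictions)
print(f"sensitivity={sens:.2f} specificity={spec:.2f}")  # sensitivity=0.75 specificity=0.83

reference_mask = [1, 1, 1, 0, 0]
predicted_mask = [0, 1, 1, 1, 0]
dice, jaccard = dice_jaccard(reference_mask, predicted_mask)
print(f"dice={dice:.2f} jaccard={jaccard:.2f}")  # dice=0.67 jaccard=0.50
```

For a single pair of masks the two overlap scores are linked by Dice = 2J / (1 + J), so Dice is never lower than Jaccard. Values averaged over many images, as reported in the table, need not satisfy that identity exactly.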