Author manuscript; available in PMC: 2025 Mar 19.
Published in final edited form as: Circulation. 2024 Feb 5;149(12):917–931. doi: 10.1161/CIRCULATIONAHA.123.067750

Pediatric Electrocardiogram-Based Deep Learning to Predict Left Ventricular Dysfunction and Remodeling

Joshua Mayourian a, William G La Cava b, Akhil Vaid c, Girish N Nadkarni c, Sunil J Ghelani a, Rebekah Mannix d, Tal Geva a, Audrey Dionne a, Mark E Alexander a, Son Q Duong c,e, John K Triedman a
PMCID: PMC10948312  NIHMSID: NIHMS1960426  PMID: 38314583

Abstract

Background:

Artificial intelligence–enhanced ECG analysis shows promise to detect ventricular dysfunction and remodeling in adult populations. However, its application to pediatric populations remains underexplored.

Methods:

A convolutional neural network was trained on paired ECG-echos (≤ 2 days apart) from patients ≤ 18 years old without major congenital heart disease to detect human expert classified greater than mild left ventricular (LV) dysfunction, hypertrophy, and dilation (individually and as a composite outcome). Model performance was evaluated on single ECG-echo pairs per patient at Boston Children’s Hospital and externally at Mount Sinai Hospital using area under the receiver operating (AUROC) and precision recall (AUPRC) curves.

Results:

The training cohort comprised 92,377 ECG-echo pairs (46,261 patients; median age 8.2 years). Test groups included internal testing (12,631 patients; median age 8.8 years; 4.6% composite outcomes), emergency department (2,830 patients; median age 7.7 years; 10.0% composite outcomes), and external validation (5,088 patients; median age 4.3 years; 6.1% composite outcomes) cohorts. Model performance was similar on internal test and emergency department cohorts, with model predictions of LV hypertrophy outperforming the pediatric cardiologist expert benchmark. Adding age and sex to the model did not improve performance. When using quantitative outcome cutoffs, model performance was similar between internal testing (composite outcome: AUROC 0.88, AUPRC 0.43; LV dysfunction: AUROC 0.92, AUPRC 0.23; LV hypertrophy: AUROC 0.88, AUPRC 0.28; LV dilation: AUROC 0.91, AUPRC 0.47) and external validation (composite outcome: AUROC 0.86, AUPRC 0.39; LV dysfunction: AUROC 0.94, AUPRC 0.32; LV hypertrophy: AUROC 0.84, AUPRC 0.25; LV dilation: AUROC 0.87, AUPRC 0.33), with composite outcome negative predictive values of 99.0% and 99.2%, respectively. Saliency mapping highlighted ECG components (precordial QRS complexes for all outcomes; T waves for LV dysfunction) that influence model predictions. High-risk ECG features included lateral T wave inversion (LV dysfunction), deep S waves in V1–V2 and tall R waves in V5–V6 (LV hypertrophy), and tall R waves in V4–V6 (LV dilation).

Conclusions:

This externally validated algorithm shows promise to inexpensively screen LV dysfunction and remodeling in children, which may facilitate improved access to care by democratizing the expertise of pediatric cardiologists.

Keywords: Artificial Intelligence, Pediatric Cardiology, Electrophysiology, Ventricular Dysfunction, Ventricular Remodeling

INTRODUCTION

ECG is a rapid, standardized, and cost-effective tool used ubiquitously for initial cardiac screening of adults and children.1 The utility of rule-, feature- and measurement-based human interpretation of the ECG varies by level of experience and expertise. This has historically motivated the development of computer-generated interpretations based on predefined rules and feature recognition algorithms that may not capture subtleties of an ECG.2 Recent work has demonstrated that deep learning-based artificial intelligence-enhanced ECG (AI-ECG) algorithms may result in greater diagnostic fidelity; studies of this approach in adult populations have reliably predicted a range of adult cardiovascular phenotypes, including ventricular dysfunction,3–7 ventricular hypertrophy,8–10 and ventricular dilation.9, 11

Progressive anatomical and physiological changes occurring from birth to adolescence lead to age-dependent variations in pediatric ECGs. The epidemiology and patterns of normal versus abnormal pediatric ECGs differ significantly from those of adults, which may be expected to limit generalizability of applying adult AI-ECG algorithms to pediatric cohorts.12 As an example, application of an adult hypertrophic cardiomyopathy AI-ECG model on a pediatric cohort had reduced performance with decreasing age.13 That work13 represents one of only a handful14 of available AI-ECG applications to pediatric cardiology, which highlights the paucity of pediatric AI-ECG models to date that could benefit pediatric populations.

In this work, this technological gap was addressed by developing and externally validating an AI-ECG model on a pediatric population (AI-pECG) to predict left ventricular (LV) dysfunction or remodeling on echo. To do so, a convolutional neural network was trained to predict human expert classified LV dysfunction, hypertrophy, and dilation (individually and as a composite outcome) using nearly 100,000 ECG-echo pairs ≤ 2 days apart. Model performance was then tested in >10,000 patients from an independent internal test cohort, nearly 3,000 patients from a separate clinical setting (emergency department), as well as >5,000 patients from an outside healthcare system. Finally, saliency mapping was performed to provide model explainability and identify regions of the ECG waveform that influence model predictions.

METHODS

Internal Study Population and Patient Assignment

Patient data was utilized from Boston Children’s Hospital between January 1, 2002 and December 31, 2021. Inclusion criteria consisted of children ≤ 18 years old with at least one echo. Echos performed in the operating room, medical intensive care unit, or cardiac intensive care unit were excluded. Patients with known major congenital heart disease15, 16 or implantable cardioverter-defibrillators/pacemakers were excluded. Patients with known congenital heart disease were identified based on the institutional Fyler coding system.16 This coding system has been mapped into the International Pediatric and Congenital Cardiac Code ICD-11 nomenclatures.15 Fyler codes used to exclude major congenital heart disease in this study are shown in Table S1.

Each qualifying echo event was paired to an ECG; only ECG-echo pairs ≤ 2 days apart were included. For patients with multiple ECGs within this timeframe, only the ECG closest in time to the echo was included. ECG-echo pairs with ECGs failing to pass quality control (see “Data Processing, Quality Control, and Filtering” for details) were removed. The remaining ECG-echo pairs were included in the main cohort.
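The pairing rule above can be illustrated with a minimal sketch. It assumes that "≤ 2 days apart" means an absolute time difference of at most 48 hours (the study may instead have compared calendar dates), and the function name and timestamps are hypothetical.

```python
from datetime import datetime, timedelta

MAX_GAP = timedelta(days=2)  # pairing window stated in the text (assumed to mean 48 hours)


def pair_echo_with_closest_ecg(echo_time, ecg_times, max_gap=MAX_GAP):
    """Return the ECG timestamp closest to the echo, or None if no ECG is within max_gap."""
    candidates = [t for t in ecg_times if abs(t - echo_time) <= max_gap]
    if not candidates:
        return None
    return min(candidates, key=lambda t: abs(t - echo_time))


# Hypothetical example: three ECGs recorded around one echo.
echo = datetime(2020, 3, 10, 9, 0)
ecgs = [
    datetime(2020, 3, 7, 9, 0),    # 3 days before: outside the window
    datetime(2020, 3, 9, 15, 0),   # 18 hours before: closest
    datetime(2020, 3, 11, 20, 0),  # 35 hours after: inside the window, but farther
]
closest = pair_echo_with_closest_ecg(echo, ecgs)
print(closest)
```

Quality control is a separate step and is not reproduced here.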

Similar to others7, a group stratified design was implemented for partitioning of the main cohort. Each patient was treated as a separate group, which restricts ECG-echo pairs for a given patient to either training or testing datasets in order to minimize leakage of ECG-echo pair data. If an ECG or echo within an ECG-echo pair was performed in the emergency department, then the ECG-echo pair was placed in the emergency department group. These same patients with other ECG-echo pairs were forced into the internal testing group to ensure no data leakage occurred between training and testing. The remaining patients were then randomly partitioned 80:20 into training and internal testing datasets.
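A minimal sketch of this patient-level partitioning is given below. The record fields (`patient_id`, `in_ed`), the random seed, and the rounding of the 20% test fraction are illustrative assumptions, not details taken from the study code.

```python
import random


def split_patients(pairs, test_fraction=0.2, seed=0):
    """Partition ECG-echo pairs so that no patient appears in both training and testing.

    pairs: list of dicts with hypothetical keys 'patient_id' and 'in_ed' (bool).
    """
    # Patients with any emergency department pair never contribute to training.
    ed_patients = {p["patient_id"] for p in pairs if p["in_ed"]}
    remaining = sorted({p["patient_id"] for p in pairs} - ed_patients)

    rng = random.Random(seed)
    rng.shuffle(remaining)
    n_test = round(len(remaining) * test_fraction)
    test_patients = set(remaining[:n_test])

    groups = {"train": [], "internal_test": [], "emergency": []}
    for p in pairs:
        if p["in_ed"]:
            groups["emergency"].append(p)
        elif p["patient_id"] in ed_patients or p["patient_id"] in test_patients:
            # Non-ED pairs of ED patients are forced into internal testing.
            groups["internal_test"].append(p)
        else:
            groups["train"].append(p)
    return groups


# Toy data: ten patients, one of whom (patient 0) also has an ED pair.
pairs = [{"patient_id": i, "in_ed": False} for i in range(10)]
pairs.append({"patient_id": 0, "in_ed": True})
groups = split_patients(pairs)
print({name: len(members) for name, members in groups.items()})
```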

External Study Population

For external validation, patient data from Mount Sinai Hospital between January 1, 2002 and December 31, 2018 were used (as Fyler codes were no longer available for this institution starting in 2019). Inclusion criteria consisted of children ≤ 18 years old with at least one ECG-echo pair ≤ 7 days apart. The same Fyler codes (Table S1) were used to exclude patients.

Data Retrieval

All raw ECG signals were exported from the MUSE ECG data management system (GE Healthcare, Chicago, IL). Waveform data were obtained from XML files, where each one-dimensional vector of data sampled at a rate of 250 Hz for 10 seconds duration (2500 samples) corresponds to a lead (I, II, and V1–V6). As performed by others,17 linear transformations of the vectors were performed based on the Einthoven law18 and Goldberger equation19 to obtain leads III, aVF, aVL, and aVR. Age, sex, and physician-reviewed ECG measurements (e.g., QRS interval, QRS axis, T axis, P axis, PR interval, QT interval, QTc, and heart rate) are archived in an internal database at Boston Children’s Hospital and were also retrieved. In addition, ECG-based diagnoses of LV hypertrophy made by expert pediatric cardiologists using conventional scoring systems20, 21 were retrieved using Fyler codes at Boston Children’s Hospital for benchmarking purposes.
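The standard relations behind this lead derivation are III = II − I (Einthoven), aVR = −(I + II)/2, aVL = I − II/2, and aVF = II − I/2 (Goldberger). The sketch below applies them sample by sample; the function name and the three-sample toy signals are illustrative only.

```python
def derive_limb_leads(lead_i, lead_ii):
    """Derive leads III, aVR, aVL, and aVF from leads I and II, sample by sample."""
    lead_iii = [b - a for a, b in zip(lead_i, lead_ii)]    # Einthoven: III = II - I
    avr = [-(a + b) / 2 for a, b in zip(lead_i, lead_ii)]  # Goldberger: aVR = -(I + II) / 2
    avl = [a - b / 2 for a, b in zip(lead_i, lead_ii)]     # Goldberger: aVL = I - II / 2
    avf = [b - a / 2 for a, b in zip(lead_i, lead_ii)]     # Goldberger: aVF = II - I / 2
    return {"III": lead_iii, "aVR": avr, "aVL": avl, "aVF": avf}


# Toy voltages (mV); a real recording would have 2500 samples per lead.
lead_i = [0.0, 0.5, 1.0]
lead_ii = [0.0, 1.0, 1.5]
derived = derive_limb_leads(lead_i, lead_ii)
print(derived)
```

A useful sanity check is that the three augmented leads sum to zero at every sample.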

Similarly, echo reports written by pediatric cardiologists are archived in an internal database at Boston Children’s Hospital; extracted records contained the human expert classification of the degree of left ventricular (LV) systolic dysfunction, hypertrophy, and/or dilation (if any). Potential grades were “trivial”, “mild”, “mild-to-moderate”, “moderate”, “moderate-to-severe”, and “severe”. When available, quantitative measures were also obtained of LV ejection fraction (% and z-score), LV mass (raw and z-score), and LV end-diastolic volume (raw and z-score). In an effort to make the model generalizable across multiple institutions, the Pediatric Heart Network (PHN) z-scores were utilized (based on healthy children with normal echos) for LV mass and LV end-diastolic volume.22 Since PHN z-scores were not available for LV ejection fraction, institutional z-scores were utilized (given that normative values are related to age23), which are publicly available online.

Quality Control and Data Preprocessing

In the case of multiple ECG recording attempts for a given ECG event, the final recorded ECG was retrieved. This ECG was then discarded if any lead was not 2500 samples long, or if any lead recording had no lead information (i.e., flat line). Given that ECGs are prone to recording errors (e.g., baseline wander; electrical interference), a high-pass filter was utilized24 with cutoff frequency 0.8 Hz, rejection band 0.2 Hz, ripple in passband 0.5 dB, and attenuation in rejection band 40 dB. The ECG was then trimmed to 2048 samples (approximately 8 seconds) to facilitate working with convolutional neural networks.
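The length check, flat-line check, and trimming step can be sketched as follows. This sketch makes two assumptions that the text does not specify: a flat line is taken to mean a lead whose samples are all identical, and trimming keeps the first 2048 samples. The high-pass filter is not reproduced.

```python
import math

EXPECTED_SAMPLES = 2500  # 10 s at 250 Hz, as stated in the text
TRIMMED_SAMPLES = 2048   # 2048 / 250 Hz = 8.192 s


def passes_quality_control(ecg):
    """ecg: dict mapping lead name to a list of samples."""
    for samples in ecg.values():
        if len(samples) != EXPECTED_SAMPLES:
            return False
        if max(samples) == min(samples):  # assumed definition of a flat line
            return False
    return True


def trim(ecg, n=TRIMMED_SAMPLES):
    """Keep the first n samples of every lead (which samples are kept is an assumption)."""
    return {lead: samples[:n] for lead, samples in ecg.items()}


# Toy signals: a sine wave stands in for a real lead.
good_lead = [math.sin(2 * math.pi * i / 250) for i in range(EXPECTED_SAMPLES)]
good_ecg = {"I": good_lead, "II": good_lead}
flat_ecg = {"I": good_lead, "II": [0.0] * EXPECTED_SAMPLES}
short_ecg = {"I": good_lead[:2000], "II": good_lead}

trimmed = trim(good_ecg)
print(passes_quality_control(good_ecg), len(trimmed["I"]))
```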

Definition of Primary Outcomes

Individual outcomes included LV systolic dysfunction, LV hypertrophy, and LV dilation. Human expert knowledge was considered as the ground truth, whereby: 1) LV systolic dysfunction was considered positive if the echo report was coded by a pediatric cardiologist for qualitatively greater than “mild” LV systolic dysfunction; 2) LV hypertrophy was considered positive if the echo report was coded by a pediatric cardiologist for qualitatively greater than “mild” LV hypertrophy, or LV hypertrophic cardiomyopathy; and 3) LV dilation was considered positive if the echo report was coded by a pediatric cardiologist for qualitatively greater than “mild” LV dilation, or LV dilated cardiomyopathy. The composite outcome was defined as having positive LV systolic dysfunction, hypertrophy, or dilation. The primary qualitative outcomes were used to train and internally test the models used herein. A similar coding structure was not available at the external site, restricting the ability to externally validate with qualitative cutoffs.
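The qualitative labeling rule can be expressed as a short sketch. The grade ordering follows the list of potential grades given in the Data Retrieval section; the report field names are hypothetical and do not reflect the institutional database schema.

```python
# Ordered from least to most severe, following the grades listed in the text.
GRADES = ["trivial", "mild", "mild-to-moderate", "moderate", "moderate-to-severe", "severe"]
RANK = {grade: i for i, grade in enumerate(GRADES)}


def greater_than_mild(grade):
    """True if the coded grade is qualitatively worse than 'mild'; None means not coded."""
    return grade is not None and RANK[grade] > RANK["mild"]


def qualitative_outcomes(report):
    """report: dict with hypothetical keys describing one coded echo report."""
    dysfunction = greater_than_mild(report.get("dysfunction_grade"))
    hypertrophy = (greater_than_mild(report.get("hypertrophy_grade"))
                   or report.get("hypertrophic_cardiomyopathy", False))
    dilation = (greater_than_mild(report.get("dilation_grade"))
                or report.get("dilated_cardiomyopathy", False))
    return {
        "dysfunction": dysfunction,
        "hypertrophy": hypertrophy,
        "dilation": dilation,
        "composite": dysfunction or hypertrophy or dilation,
    }


example = qualitative_outcomes({"dysfunction_grade": "mild", "dilation_grade": "moderate"})
print(example)
```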

To further evaluate the performance of the human expert trained model both internally (Boston Children’s Hospital) and externally (Mount Sinai Hospital), quantitative cutoffs were also implemented for the above outcomes, whereby LV ejection fraction, LV mass, and LV end-diastolic volume z-scores of ≤ −4, ≥ +4, and ≥ +4 (corresponding to quantitative “moderate” cutoffs), respectively, were considered positive. The quantitative composite outcome was defined as having positive quantitative LV systolic dysfunction, hypertrophy, or dilation. Note that an LV ejection fraction z-score ≤ −4 corresponds to an ejection fraction of 42% as a newborn, and linearly increases to 47% at 18 years of age.
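A minimal sketch of the quantitative cutoffs follows. How missing z-scores were handled is an assumption: here an outcome is left undetermined (`None`) when its z-score is unavailable, and the composite is left undetermined unless all three z-scores are present.

```python
def quantitative_outcomes(lvef_z=None, lv_mass_z=None, lvedv_z=None):
    """Apply the z-score cutoffs from the text; None marks an unavailable measurement."""
    dysfunction = None if lvef_z is None else lvef_z <= -4
    hypertrophy = None if lv_mass_z is None else lv_mass_z >= 4
    dilation = None if lvedv_z is None else lvedv_z >= 4
    parts = (dysfunction, hypertrophy, dilation)
    # Assumption: the composite is only evaluated when all three z-scores exist.
    composite = None if None in parts else any(parts)
    return {
        "dysfunction": dysfunction,
        "hypertrophy": hypertrophy,
        "dilation": dilation,
        "composite": composite,
    }


positive = quantitative_outcomes(lvef_z=-4.2, lv_mass_z=1.0, lvedv_z=0.5)
negative = quantitative_outcomes(lvef_z=-1.0, lv_mass_z=0.5, lvedv_z=3.9)
partial = quantitative_outcomes(lvef_z=-5.0)
print(positive["composite"], negative["composite"], partial["composite"])
```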

Model Selection, Architecture, and Training

The model was developed solely on the training set, which was further partitioned 95% for training and 5% for validation to allow for hyperparameter tuning. 12 × 2048 ECG samples were used as inputs to a convolutional neural network similar to the residual network described elsewhere17 that is adapted for unidimensional signals. This architecture allows neural networks to be efficiently trained with skip connections.17 A diagram of the architecture used in this study is shown in Figure S1.

More specifically, the artificial intelligence-enhanced pediatric ECG (AI-pECG) network consisted of a convolutional layer followed by four residual blocks with two convolutional layers per block.17 The convolutional layers start with 64 filters for the first layer and residual block, with a filter increase and subsampling as shown in Table S2. The output of each convolutional layer is rescaled using batch normalization and fed into a rectified linear activation unit, with subsequent dropout at a rate of 0.2. Max pooling and convolutional layers with filter length 1 are included in the skip connections to match main branch signal dimensions.17 The output of the last block is fed into a fully connected layer with a sigmoid activation function given that outcomes are not mutually exclusive.
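The choice of a sigmoid output can be illustrated with a small sketch. Each outcome receives its own independent probability, so the probabilities need not sum to 1, and a per-outcome binary cross-entropy can be averaged across outcomes. The logits and labels below are invented for illustration, and the exact form of the training loss is an assumption.

```python
import math


def sigmoid(x):
    """Map a logit to an independent probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def mean_binary_cross_entropy(labels, probs, eps=1e-7):
    """Average binary cross-entropy over outcomes (probabilities clipped for stability)."""
    total = 0.0
    for y, p in zip(labels, probs):
        p = min(max(p, eps), 1.0 - eps)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(labels)


# Hypothetical logits for (dysfunction, hypertrophy, dilation) on one ECG.
logits = [2.0, 1.0, 0.5]
probs = [sigmoid(z) for z in logits]
labels = [1, 1, 0]  # outcomes can co-occur, so more than one label may be positive
loss = mean_binary_cross_entropy(labels, probs)
print([round(p, 3) for p in probs], round(loss, 3))
```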

Similar to others,24 demographics (i.e., age and sex) were also incorporated as inputs along with ECG waveforms in a separate deep learning model (AI-pECG + age + sex). More specifically, the above architecture (Figure S1) was modified by adding a separate part of the model, where demographics (i.e., age and sex) were concatenated and passed through a fully connected layer. The outputs for demographic and ECG model parts are individually flattened and then concatenated to obtain one feature vector. The resulting feature vector was fed into the final fully connected layer with a sigmoid activation function. For details of this model architecture, see Table S3.

For each model, the final hyperparameters were obtained via a grid search on the training set among the following options: kernel size [3, 9, 17], batch size [8, 32, 64], and initial learning rate [0.01, 0.001, 0.0001, 0.00001]. The average cross-entropy was minimized using the Adam optimizer. Maximum 150 epochs were used with early stopping based on validation loss. The model with the lowest validation loss during hyperparameter tuning was selected as the final model. For the AI-pECG model, final hyperparameters were kernel size 17, batch size 32, learning rate 0.001. For the AI-pECG + age + sex model, final hyperparameters were kernel size 17, batch size 64, learning rate 0.001.
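The search space described above contains 3 × 3 × 4 = 36 configurations. The sketch below enumerates them and shows selection by lowest validation loss; the `select_best` helper and its input format are hypothetical, and no training is performed.

```python
from itertools import product

KERNEL_SIZES = [3, 9, 17]
BATCH_SIZES = [8, 32, 64]
LEARNING_RATES = [0.01, 0.001, 0.0001, 0.00001]

# Every combination of the three hyperparameters listed in the text.
grid = [
    {"kernel_size": k, "batch_size": b, "learning_rate": lr}
    for k, b, lr in product(KERNEL_SIZES, BATCH_SIZES, LEARNING_RATES)
]


def select_best(results):
    """results: list of (config, validation_loss) tuples; return the lowest-loss config."""
    return min(results, key=lambda item: item[1])[0]


# The final configurations reported in the text are members of this grid.
final_ecg_only = {"kernel_size": 17, "batch_size": 32, "learning_rate": 0.001}
final_with_demographics = {"kernel_size": 17, "batch_size": 64, "learning_rate": 0.001}
print(len(grid), final_ecg_only in grid, final_with_demographics in grid)
```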

Performance Evaluation and Statistical Analyses

Consistent with prior works,25–27 multiple ECG-echo pairs per patient were allowed in the training cohort, as: 1) progressive anatomical and physiological changes lead to age-dependent variations in pediatric ECGs that are important to capture over time; and 2) a patient may have a normal echo at one snapshot in time, but not another. In contrast, model performance was evaluated on the internal and external test groups using one ECG-echo pair per patient. To minimize confounding variables, the ECG-echo pair with the smallest time difference was selected for each patient.

Given the nature of an imbalanced dataset, the area under the receiver operating curve (AUROC) and area under the precision-recall (i.e., positive predictive value (PPV)-sensitivity) curve (AUPRC) were computed. To benchmark the LV hypertrophy model, pediatric cardiologist ECG-based diagnoses of LV hypertrophy were used. Other performance metrics evaluated included PPV, negative predictive value (NPV), sensitivity, and specificity. These metrics were calculated based on thresholds achieving 90% sensitivity in the training set. For all metrics, a higher value is indicative of better performance. Resampling with 1,000 bootstraps was implemented to obtain performance metric confidence intervals.
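The evaluation steps can be sketched in plain Python as below. This is an illustrative reimplementation under stated assumptions, not the study code: AUROC is computed by pairwise comparison, the threshold is the highest score that still captures at least 90% of positives, and the confidence interval is a simple percentile bootstrap. AUPRC is omitted for brevity, and the toy labels and scores are invented. In the study the threshold was set on the training set and then applied to test data; the toy example uses a single dataset.

```python
import math
import random


def auroc(labels, scores):
    """Probability that a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def threshold_for_sensitivity(labels, scores, target=0.90):
    """Highest threshold at which at least `target` of positives score >= threshold."""
    pos = sorted((s for y, s in zip(labels, scores) if y == 1), reverse=True)
    return pos[math.ceil(target * len(pos)) - 1]


def confusion_metrics(labels, scores, threshold):
    """Assumes every denominator is nonzero (true for the toy data below)."""
    tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= threshold)
    fn = sum(1 for y, s in zip(labels, scores) if y == 1 and s < threshold)
    fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= threshold)
    tn = sum(1 for y, s in zip(labels, scores) if y == 0 and s < threshold)
    return {"sensitivity": tp / (tp + fn), "specificity": tn / (tn + fp),
            "ppv": tp / (tp + fp), "npv": tn / (tn + fn)}


def bootstrap_ci(labels, scores, metric, n_boot=1000, seed=0):
    """Percentile 95% interval; resamples lacking one of the classes are redrawn."""
    rng = random.Random(seed)
    n = len(labels)
    values = []
    while len(values) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:
            values.append(metric(ys, [scores[i] for i in idx]))
    values.sort()
    return values[int(0.025 * n_boot)], values[int(0.975 * n_boot) - 1]


# Invented toy data: 4 positives among 10 ECGs.
labels = [0, 0, 0, 0, 0, 1, 0, 1, 1, 1]
scores = [0.1, 0.2, 0.3, 0.4, 0.5, 0.45, 0.6, 0.7, 0.8, 0.9]
point_auroc = auroc(labels, scores)
threshold = threshold_for_sensitivity(labels, scores)
metrics = confusion_metrics(labels, scores, threshold)
ci_low, ci_high = bootstrap_ci(labels, scores, auroc)
print(round(point_auroc, 3), threshold, metrics)
```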

Subgroup Analyses

Subgroup analyses were performed on the internal test set when considering all available ECG-echo pairs ≤ 2 days apart. Age and sex are known to influence ECG characteristics in a healthy pediatric population,1 and were therefore explored in subgroup analyses. Age partitioning was adapted from elsewhere1, 28 with groupings of age < 1, 1 ≤ age < 3, 3 ≤ age < 8, 8 ≤ age < 12, and 12 ≤ age ≤ 18 years. AUROCs and AUPRCs were calculated for each subgroup.
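The age partitioning can be written as a small helper. The group labels are illustrative, and rejecting ages outside 0 to 18 years is an assumption that mirrors the inclusion criteria.

```python
def age_group(age_years):
    """Assign an age in years to one of the five subgroups listed in the text."""
    if age_years < 0 or age_years > 18:
        raise ValueError("age outside the study range of 0 to 18 years")
    if age_years < 1:
        return "<1"
    if age_years < 3:
        return "1 to <3"
    if age_years < 8:
        return "3 to <8"
    if age_years < 12:
        return "8 to <12"
    return "12 to 18"


# Boundary ages fall into the older group, matching the inequalities in the text.
print([age_group(a) for a in (0.5, 1, 3, 8, 12, 18)])
```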

Model Explainability

In an effort to provide model interpretability, the following analyses were performed: 1) median waveform analysis; and 2) saliency mapping.

As described by others,26 median waveform analysis aggregates ECG samples into a single representative beat, so that examples of high-risk and low-risk ECGs can be visualized. Herein, the 100 ECGs in the internal test set with the highest model predictions for a given outcome were used to create high-risk median waveforms, and the 100 ECGs with the lowest model predictions were used to create low-risk median waveforms. Median waveforms were generated in each lead using the NeuroKit Python toolbox29 by: 1) detecting QRS complexes; 2) interpolating all ECGs to the same heart rate; 3) computing the median voltage across beats for each patient; and 4) computing the median voltage across patients for each time bin in the cardiac cycle.26
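Steps 3 and 4 of this procedure reduce to two nested medians, sketched below with the standard library rather than NeuroKit. The sketch assumes that QRS detection and heart-rate interpolation (steps 1 and 2) have already produced beats of equal length; the three-sample beats are toy values.

```python
from statistics import median


def median_beat(beats):
    """beats: list of equal-length beats; returns the median voltage in each time bin."""
    return [median(time_bin) for time_bin in zip(*beats)]


def cohort_median_waveform(beats_by_patient):
    """Median across beats within each patient, then median across patients."""
    per_patient = [median_beat(beats) for beats in beats_by_patient]
    return median_beat(per_patient)


# Toy data for one lead: three patients with 3, 2, and 1 aligned beats.
beats_by_patient = [
    [[0, 1, 0], [0, 3, 0], [0, 2, 0]],  # patient median: [0, 2, 0]
    [[0, 4, 1], [0, 6, 1]],             # patient median: [0, 5, 1]
    [[0, 1, -1]],                       # patient median: [0, 1, -1]
]
waveform = cohort_median_waveform(beats_by_patient)
print(waveform)
```

Taking the median within each patient first prevents patients with many recorded beats from dominating the cohort-level waveform.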

Saliency mapping helps identify which features of the ECG input contribute to model prediction. Saliency maps highlight components of the ECG where a change in input (i.e., ECG voltage) leads to a change in prediction.26 Saliency maps were created using a Shapley Additive Explanations (SHAP) framework.30 To highlight the most influential components of the ECG waveform, the SHAP values for the high-risk ECGs were obtained. Subsequently, the above steps to generate median waveforms were implemented on SHAP values over time. The resultant darker regions in saliency maps correspond to greater contribution to the prediction.

Data Availability and Software

Requests for Boston Children’s Hospital data and related materials will be internally reviewed to clarify if the request is subject to intellectual property or confidentiality constraints. Shareable data and materials will be released under a material transfer agreement for non-commercial research purposes. Use of Boston Children’s Hospital and Mount Sinai Hospital data was approved by their respective Institutional Review Boards.

Programming code used to perform the analyses is available upon reasonable request. The convolutional neural network used the Keras framework with a TensorFlow (Google) backend using Python 3.9.31 Deep learning was executed on institutional graphics processing units. All other pre- and post-processing code was written in Python 3.931 and R 4.032, and was executed locally.

RESULTS

Training Cohort: Patient Population Characteristics

Of the 272,221 echos at Boston Children’s Hospital from 104,508 children ≤ 18 years old without major congenital heart disease, there were 122,757 ECG-echo pairs ≤ 2 days apart. Of these ECG-echo pairs, 119,787 ECGs (61,722 patients) passed quality control, thus forming the main internal study cohort (Figure 1).

Figure 1: Schematic of Study Design.

(A) Schematic of training design. STROBE diagram showing initial patient selection and filtering at each data processing stage (with primary outcome rates shown) at Boston Children’s Hospital (light grey), leading to the final training cohort. (B) Schematic of testing design. Model performance was tested on one ECG-echo pair per patient using qualitative and quantitative outcome cutoffs across different Boston Children’s Hospital settings. External validation (Mount Sinai; dark grey) was performed on one ECG-echo pair per patient using quantitative cutoffs.

Abbreviations: quality control (QC); congenital heart disease (CHD); left ventricle (LV).

The training cohort comprised 92,377 ECG-echo pairs (46,261 patients; median age 8.2 [IQR, 2.9–13.8] years; 54% male; 56% white), 8.2% with composite LV outcomes, 2.4% with LV dysfunction, 3.5% with LV hypertrophy, and 3.8% with LV dilation (Table 1). Interestingly, median LV mass and volume z-scores derived from the PHN were 0.6, consistent with prior reports.33 Training cohort patient characteristics and outcomes stratified by age group are shown in Table S4. ECG characteristics stratified by age are within range of previously reported values for healthy children.1 ECG findings by age include a more rightward QRS, T, and P axis for age < 1, an increasing PR and QT interval with age, and decreasing heart rate with age (Table S4). Tables S5–S8 highlight the numerous significant differences in ECG-echo pair demographics, ECG characteristics, and echo data when stratifying by each outcome in the training cohort. Of note, approximately 40% of patients with LV dysfunction had concomitant LV dilation (Table S6).

Table 1:

Training Cohort Baseline Characteristics

Training
Demographics
ECG-echo Pairs 92,377
Patients 46,261
Sex
 Female 21,300 (46%)
 Male 24,950 (54%)
 Missing 11 (<0.01%)
Age at ECG (years) 8.2 (2.9, 13.8)
Race
 White 25,856 (56.0%)
 Black 2,509 (5.4%)
 Hispanic 3,894 (8.4%)
 Asian 1,408 (3.0%)
 Other 2,877 (6.2%)
 Missing 9,717 (21.0%)
ECG Characteristics
QRS interval (ms) 80.0 (70.0, 90.0)
QRS axis 72.0 (52.0, 86.0)
T axis 48.0 (33.0, 61.0)
P axis 46.0 (32.0, 58.0)
PR interval (ms) 126.0 (112.0, 142.0)
QT interval (ms) 350.0 (312.0, 382.0)
QTc (ms) 421.0 (407.0, 438.0)
Heart Rate (BPM) 89.0 (73.0, 113.0)
Echo Characteristics
LVEF (%) 63.0 (59.0, 66.0)
LVEF (z-score) −0.1 (−1.0, 0.6)
LV mass (g) 60.7 (32.9, 98.8)
LV mass (z-score) 0.6 (−0.2, 1.7)
LV EDV (mL) 72.6 (39.4, 115.6)
LV EDV (z-score) 0.6 (−0.2, 1.5)
Qualitative Outcomes
Composite LV outcome 7,587 (8.2%)
LV dysfunction 2,261 (2.4%)
LV hypertrophy 3,261 (3.5%)
LV dilation 3,520 (3.8%)

Data presented as median (interquartile range).

Abbreviations: left ventricle (LV); beats per minute (BPM); ejection fraction (EF); end-diastolic volume (EDV).

Testing Cohort: Patient Population Characteristics

The testing cohorts for evaluating model performance comprised one ECG-echo pair per patient, with 12,631 for internal testing, 2,830 for emergency department, and 5,088 for external validation cohorts (Figure 1 and Table 2).

Table 2:

Comparison of Demographics, Echo Characteristics, and Outcomes Stratified by Test Cohort

Characteristic | Internal Testing (Boston Children’s Hospital) | Emergency Department (Boston Children’s Hospital) | External Validation (Mount Sinai Hospital)
Demographics
Patients | 12,631 | 2,830 | 5,088
ECG-echo Pairs | 12,631 | 2,830 | 5,088
Sex
 Female | 5,859 (46%) | 1,217 (43%) | 2,696 (53%)
 Male | 6,772 (54%) | 1,613 (57%) | 2,376 (47%)
 Missing | | | 16 (0.3%)
Age at ECG (years) | 8.8 (2.8, 14.4) | 7.7 (1.2, 14.5) | 4.3 (0.3, 12.2)
Echo Characteristics
LVEF (%) | 63.0 (59.0, 66.0) | 62.0 (58.0, 66.0) | 62.1 (58.6, 65.8)
LVEF (z-score) | −0.1 (−0.9, 0.6) | −0.3 (−1.2, 0.6) | −0.3 (−1.0, 0.5)
LV mass (g) | 62.7 (32.5, 102.4) | 59.9 (25.3, 105.0) | 36.9 (13.4, 78.3)
LV mass (z-score) | 0.5 (−0.2, 1.4) | 0.7 (−0.1, 1.8) | 0.0 (−0.8, 1.1)
LV EDV (mL) | 77.3 (39.9, 121.8) | 70.6 (29.5, 118.2) | 47.8 (15.8, 97.8)
LV EDV (z-score) | 0.6 (−0.2, 1.3) | 0.6 (−0.3, 1.6) | 0.4 (−0.4, 1.4)
Qualitative Outcomes
Composite LV outcome | 567 (4.5%) | 260 (9.2%) |
LV dysfunction | 150 (1.2%) | 123 (4.3%) |
LV hypertrophy | 232 (1.8%) | 94 (3.3%) |
LV dilation | 254 (2.0%) | 102 (3.6%) |
Quantitative Outcomes *
Composite LV outcome | 437/9,476 (4.6%) | 203/1,974 (10.0%) | 280/4,602 (6.1%)
LV dysfunction | 83/9,565 (0.9%) | 87/2,036 (4.3%) | 61/5,080 (1.2%)
LV hypertrophy | 294/9,484 (3.1%) | 128/1,980 (6.5%) | 153/4,602 (3.3%)
LV dilation | 253/9,576 (2.6%) | 95/2,036 (4.7%) | 206/4,747 (4.3%)

Data presented as median (interquartile range). Note qualitative outcomes not available for the external validation site.

Abbreviations: left ventricle (LV); beats per minute (BPM); ejection fraction (EF); end-diastolic volume (EDV).

* Outcome rates presented as number of events/number of eligible echos based on z-score availability.

As shown in Table 2, age at ECG was similar between the training and internal test cohorts. In contrast, the emergency department—and more notably the external validation cohort—had younger ages at ECG. Similarly, echo characteristics were similar between the training and internal test cohorts. In contrast, the emergency department had lower LV ejection fraction and higher LV mass, with an increased prevalence in all outcomes. Interestingly, the external validation site had median echo z-scores closer to 0, but nonetheless had outcome rates similar to the internal test cohort (Table 2).

AI-pECG Model Performance on Qualitative Cutoffs

After training the AI-pECG model on nearly 100,000 ECG-echo pairs with corresponding human expert classified greater than mild LV dysfunction, hypertrophy, and dilation, model performance was evaluated.

During internal testing (Figure 2; left), the AI-pECG model achieved AUROCs of 0.86, 0.86, 0.86, and 0.88 for LV composite outcome, LV dysfunction, LV hypertrophy, and LV dilation, respectively. AUPRCs of 0.37, 0.18, 0.24, and 0.31 were achieved, respectively. During testing on the emergency department cohort (Figure 2; right), the AI-pECG model achieved AUROCs of 0.79, 0.81, 0.84, and 0.85 for the LV composite outcome, LV dysfunction, LV hypertrophy, and LV dilation, respectively. AUPRCs of 0.39, 0.33, 0.29, and 0.35 were achieved, respectively.

Figure 2: Pediatric Electrocardiogram-Based Deep Learning Model Performance at Qualitative Outcome Cutoffs.

Performance of the artificial intelligence-enhanced pediatric electrocardiogram (AI-pECG; blue) and AI-pECG with age and sex (AI-pECG + age + sex; orange) models, evaluated using the internal test (left) and emergency department (right) cohorts with receiver operating and precision-recall curves for the: (A) left ventricular (LV) composite; (B) LV dysfunction; (C) LV hypertrophy; and (D) LV dilation qualitative outcomes. In panel C, the grey dot represents the benchmark of pediatric cardiologist expert ECG-based diagnosis of LV hypertrophy. AUROC and AUPRC metric values for each model and outcome are inset. Dotted line represents chance. 95% confidence intervals are shown using bootstrapping.

Abbreviations: positive predictive value (PPV).

In both cases, the AI-pECG model outperformed the pediatric cardiologist expert ECG-based diagnosis of LV hypertrophy (Figure 2C; grey dot). Adding age and sex to the AI-pECG model led to similar performance (Figure 2). Model performance improved when considering all available ECG-echo pairs ≤ 2 days apart (Table S9) for patients in each cohort (Figure S2).

AI-pECG Model External Validation

Performance of the AI-pECG model to discriminate between quantitative cutoffs was subsequently explored internally (Figure 3; left) and externally using an outside healthcare system (Figure 3; right).

Figure 3: External Validation of Model to Predict Quantitative Cutoffs of Left Ventricular Function and Remodeling.

Performance of the artificial intelligence-enhanced pediatric electrocardiogram (AI-pECG) algorithm evaluated in the internal (left) and external (right) cohorts using receiver operating (AUROC) and precision recall (AUPRC) curves for the following outcomes: (A) left ventricular (LV) composite outcome (LV ejection fraction (LVEF) z-score ≤ −4 or LV mass z-score ≥ +4 or LV end-diastolic volume (LVEDV) z-score ≥ +4); (B) LV dysfunction (LVEF z-score ≤ −4); (C) LV hypertrophy (LV mass z-score ≥ +4); and (D) LV dilation (LVEDV z-score ≥ +4). Within the internal group, internal testing (blue) and emergency department (orange) performance is shown. AUROC and AUPRC metric values for each model and outcome are inset. Dotted line represents chance (note: blue and orange lines represent the prevalence of each respective group). 95% confidence intervals are shown using bootstrapping.

In general, performance when using quantitative cutoffs (Figure 3; left) was higher than qualitative cutoffs (Figure 2). During internal testing (Figure 3; left), the AI-pECG model achieved AUROCs of 0.88, 0.92, 0.88, and 0.91 for LV composite outcome, LV dysfunction, LV hypertrophy, and LV dilation, respectively; AUPRCs of 0.43, 0.23, 0.28, and 0.47 were achieved, respectively. For the emergency department cohort (Figure 3; left), the AI-pECG model achieved AUROCs of 0.81, 0.84, 0.82, and 0.84 for LV composite outcome, LV dysfunction, LV hypertrophy, and LV dilation, respectively; AUPRCs of 0.47, 0.43, 0.35, and 0.38 were achieved, respectively. LV dysfunction model performance was similar when using a quantitative cutoff of ejection fraction ≤ 40% and ≤ 50% (Figure S3). Again, model performance improved when considering all available ECG-echo pairs ≤ 2 days apart (Table S9) for patients in the internal cohorts (Figure S4).

During external validation (Figure 3; right), the AI-pECG model achieved AUROCs of 0.86, 0.94, 0.84, and 0.87 for LV composite outcome, LV dysfunction, LV hypertrophy, and LV dilation, respectively; AUPRCs of 0.39, 0.32, 0.25, and 0.33 were achieved, respectively.

Model sensitivity, specificity, NPV, PPV, and percentage predicted negative were subsequently evaluated when setting the following thresholds to achieve 90% sensitivity in the training set: 0.015 (LV dysfunction); 0.019 (LV hypertrophy); 0.04 (LV dilation); and 0.05 (LV composite outcome). Within the Boston Children’s Hospital cohorts, sensitivities were slightly lower than this 90% target for each outcome, ranging from 0.80 (LV hypertrophy) to 0.87 (LV dysfunction and dilation). In contrast, for the Mount Sinai cohort, sensitivities were markedly higher, ranging from 0.92 to 1.00. Correspondingly, the NPVs were higher, the percentages predicted negative were lower, and the PPVs were lower in the external validation cohort compared to the Boston Children’s cohort (Table 3).

Table 3:

Summary of Internal and External Validation Model Performance at a Select Threshold

Metric | Internal Testing (Boston Children’s Hospital) | Emergency Department (Boston Children’s Hospital) | External Validation (Mount Sinai Hospital)
LV Composite Outcome
Prevalence (%) | 4.6 | 10.0 | 6.1
Sensitivity | 0.84 [0.80–0.87] | 0.84 [0.79–0.89] | 0.96 [0.93–0.98]
Specificity | 0.73 [0.73–0.74] | 0.57 [0.55–0.60] | 0.34 [0.33–0.36]
NPV (%) | 99.0 [98.7–99.2] | 96.8 [95.9–97.8] | 99.2 [98.7–99.6]
PPV (%) | 13.2 [12.6–13.9] | 18.4 [17.2–19.7] | 8.6 [8.4–8.9]
Predicted Negative (%) | 70.8 [70.0–71.6] | 53.3 [51.1–55.4] | 32.6 [31.2–33.7]
LV Dysfunction
Prevalence (%) | 0.9 | 4.3 | 1.2
Sensitivity | 0.87 [0.80–0.93] | 0.85 [0.78–0.92] | 1.00 [1.00–1.00]
Specificity | 0.79 [0.79–0.80] | 0.58 [0.55–0.60] | 0.58 [0.57–0.59]
NPV (%) | 99.9 [99.8–99.9] | 98.9 [98.3–99.4] | 100 [100–100]
PPV (%) | 3.4 [3.1–3.7] | 8.2 [7.5–9.0] | 2.8 [2.7–2.9]
Predicted Negative (%) | 78.2 [77.3–79.0] | 55.8 [53.7–57.9] | 57.4 [56.0–58.7]
LV Hypertrophy
Prevalence (%) | 3.1 | 6.5 | 3.3
Sensitivity | 0.80 [0.76–0.85] | 0.81 [0.74–0.88] | 0.92 [0.88–0.96]
Specificity | 0.78 [0.77–0.79] | 0.67 [0.65–0.69] | 0.40 [0.39–0.42]
NPV (%) | 99.2 [99.0–99.4] | 98.1 [97.4–98.7] | 99.3 [98.9–99.7]
PPV (%) | 10.6 [9.9–11.3] | 14.6 [13.2–15.9] | 5.0 [4.8–5.3]
Predicted Negative (%) | 76.5 [75.6–77.3] | 64.1 [62.0–66.1] | 39.2 [37.8–40.6]
LV Dilation
Prevalence (%) | 2.6 | 4.7 | 4.3
Sensitivity | 0.87 [0.83–0.91] | 0.85 [0.78–0.92] | 0.95 [0.91–0.97]
Specificity | 0.77 [0.76–0.78] | 0.63 [0.61–0.65] | 0.40 [0.38–0.41]
NPV (%) | 99.5 [99.4–99.7] | 98.9 [98.3–99.4] | 99.4 [99.0–99.7]
PPV (%) | 9.3 [8.8–9.8] | 10.2 [9.2–11.1] | 6.7 [6.4–6.9]
Predicted Negative (%) | 75.4 [74.6–76.2] | 60.7 [58.5–62.9] | 38.3 [37.0–39.5]

Data presented as median [95% confidence interval]. Predicted Negative indicates the fraction of ECGs predicting negative echo findings at the given threshold.

Abbreviations: negative predictive value (NPV); positive predictive value (PPV).
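
The thresholding procedure behind Table 3 can be illustrated with a minimal Python sketch. It is not the authors' code: the function names are hypothetical, labels are assumed to be coded 1 (outcome present) or 0 (absent), and an ECG is assumed to be called positive when its score is greater than or equal to the threshold. The first function picks the largest threshold that reaches a target sensitivity in a reference (for example, training) set; the second reports the Table 3 metrics at that threshold in any cohort.

```python
import math


def threshold_for_sensitivity(scores, labels, target=0.90):
    """Largest threshold at which at least `target` of positives score >= it."""
    positives = sorted((s for s, y in zip(scores, labels) if y == 1), reverse=True)
    if not positives:
        raise ValueError("at least one positive case is required")
    # Number of positives that must be captured; the small tolerance guards
    # against floating-point overshoot in target * n.
    k = max(1, math.ceil(target * len(positives) - 1e-9))
    return positives[k - 1]


def _ratio(numerator, denominator):
    """Safe division that returns NaN for an empty denominator."""
    return numerator / denominator if denominator else float("nan")


def screening_metrics(scores, labels, threshold):
    """Sensitivity, specificity, PPV, NPV, and fraction predicted negative."""
    tp = fp = tn = fn = 0
    for s, y in zip(scores, labels):
        predicted_positive = s >= threshold  # assumed decision rule
        if predicted_positive and y == 1:
            tp += 1
        elif predicted_positive:
            fp += 1
        elif y == 1:
            fn += 1
        else:
            tn += 1
    return {
        "sensitivity": _ratio(tp, tp + fn),
        "specificity": _ratio(tn, tn + fp),
        "ppv": _ratio(tp, tp + fp),
        "npv": _ratio(tn, tn + fn),
        "predicted_negative": _ratio(tn + fn, tp + fp + tn + fn),
    }


# Toy data for illustration only (not study data).
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
toy_labels = [1, 1, 0, 1, 0, 0, 1, 0, 0, 0]
toy_threshold = threshold_for_sensitivity(toy_scores, toy_labels, target=0.75)
toy_metrics = screening_metrics(toy_scores, toy_labels, toy_threshold)
```

Because the threshold is fixed on one cohort and then applied to others, the realized sensitivity elsewhere can drift above or below the target, which is the pattern seen across the three test cohorts in Table 3.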

Subgroup Analysis

In a subgroup analysis (Figure 4), AI-pECG model performance in age < 1 year was lower for predicting LV hypertrophy and higher for predicting LV dilation. In age ≥ 12 years old, performance was slightly lower for predicting LV dilation. There was slightly better performance in females for predicting LV dysfunction and dilation, and in males for predicting LV hypertrophy (Figure 4).

Figure 4: Model Performance in Age and Sex Subgroups.

Forest plot showing AI-pECG area under the receiver operating (AUROC; red) and precision recall (AUPRC; black) curve performance when stratifying by age (age < 1, 1 ≤ age < 3, 3 ≤ age < 8, 8 ≤ age < 12, age ≥ 12) and sex for the following outcomes: left ventricular (LV) composite, LV dysfunction, LV hypertrophy, and LV dilation. 95% confidence intervals are shown using bootstrapping.
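
For readers unfamiliar with bootstrapped confidence intervals, the following Python sketch shows one common way to obtain them for AUROC: the percentile bootstrap. The exact resampling scheme used in the study is not stated here, so this is an assumption, and the function names are hypothetical. AUROC is computed as the probability that a randomly chosen positive case scores higher than a randomly chosen negative case (ties count one half); this pairwise form is quadratic in cohort size and is meant for illustration rather than large datasets.

```python
import random
from statistics import median


def auroc(scores, labels):
    """Probability that a random positive outranks a random negative."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes are required")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def bootstrap_auroc(scores, labels, n_boot=1000, alpha=0.05, seed=0):
    """Return (median, lower, upper) of AUROC over bootstrap resamples."""
    auroc(scores, labels)  # fails early if one class is missing
    rng = random.Random(seed)
    n = len(scores)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]  # resample with replacement
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:  # skip resamples containing a single class
            stats.append(auroc([scores[i] for i in idx], ys))
    stats.sort()
    lower = stats[int(alpha / 2 * n_boot)]
    upper = stats[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return median(stats), lower, upper
```

Applying `bootstrap_auroc` separately to the rows of each age or sex stratum would give one interval per subgroup, in the spirit of the forest plot.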

To investigate the influence of age-dependent ECG changes on overall model performance, a sensitivity analysis was performed by comparing overall model performance to model performance when excluding each age group of interest (Figure S5). Composite outcome model performance was insensitive to specific age groups (Figure S5). LV hypertrophy model performance slightly increased when excluding age < 1 year. LV dilation model performance marginally decreased when excluding age < 1 year and marginally increased when excluding age ≥ 12 years old.

Model Explainability

Finally, in an attempt to gain model interpretability, saliency mapping and median waveform analysis were performed.

As shown in Figure 5, the most salient ECG features for predicting LV dysfunction were the lateral precordial (V4–V6) QRS complexes and T waves. High-risk features for LV dysfunction included inverted T waves in the lateral precordial leads (V4–V6). Interestingly, S waves in V1–V2 were also salient, with deep S waves appearing as a high-risk feature. For LV hypertrophy, the most salient features were the precordial QRS complexes, and high-risk features included deep S waves in V1–V2. Interestingly, the QRS complex in limb lead I was also salient, with a high-amplitude R wave appearing as a high-risk feature (Figure 5). For LV dilation, the most salient features were the lateral precordial (V4–V6) QRS complexes, and high-risk features included high-amplitude R waves in V4–V6 (Figure 5).

Figure 5: Explainability of AI-pECG Predictions.

Visualization of median waveforms generated in each lead using ECGs from the 100 highest (red) and 100 lowest (green) AI-pECG predictions of left ventricular (LV) dysfunction, hypertrophy, and dilation. Saliency mapping demarcates regions of the ECG waveform having greatest (dark blue) and least (light blue) influence on each outcome. Saliency was averaged over the 100 highest predicted ECGs for each outcome.
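
The median waveform comparison described in the caption can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the study pipeline: each ECG is assumed to be a single-lead waveform already aligned and resampled to a common length (for example, one representative beat), and the function names are hypothetical. The sketch ranks ECGs by model prediction, takes the `k` highest and `k` lowest, and computes the sample-wise median of each group.

```python
from statistics import median


def median_waveform(waveforms):
    """Sample-wise median across equal-length waveforms."""
    length = len(waveforms[0])
    if any(len(w) != length for w in waveforms):
        raise ValueError("waveforms must share a common length")
    return [median(w[t] for w in waveforms) for t in range(length)]


def extreme_median_waveforms(waveforms, predictions, k=100):
    """Median waveforms of the k highest- and k lowest-scored ECGs."""
    if len(waveforms) != len(predictions) or len(waveforms) < 2:
        raise ValueError("need at least two waveforms with matching predictions")
    order = sorted(range(len(waveforms)), key=lambda i: predictions[i])
    k = max(1, min(k, len(order) // 2))  # keep the two groups disjoint
    lowest = [waveforms[i] for i in order[:k]]
    highest = [waveforms[i] for i in order[-k:]]
    return median_waveform(highest), median_waveform(lowest)


# Toy three-sample "beats" for illustration only (not study data).
toy_beats = [[0, 1, 0], [0, 2, 0], [0, 3, 0], [0, 10, -1], [0, 12, -2], [0, 14, -3]]
toy_preds = [0.1, 0.2, 0.05, 0.9, 0.8, 0.95]
high_risk, low_risk = extreme_median_waveforms(toy_beats, toy_preds, k=3)
```

Using the median rather than the mean keeps a few noisy or ectopic beats from distorting the group waveform, which is one reason a median is a natural choice for this kind of visual comparison.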

When stratifying by age (Figures S6–S8), distinct age-dependent high-risk features were identified. For LV dysfunction (Figure S6), tall R waves in lateral precordial leads were high-risk features for age < 1 year old, but not age ≥ 12 years old. Conventional scoring system components to predict LV hypertrophy20, 21 were detected as high-risk features in age ≥ 12 years old, and less so in age < 1 year (Figure S7). For LV dilation (Figure S8), tall R waves in lateral precordial leads were high-risk features for age < 1 year old, and less so in age ≥ 12 years old.

Finally, the saliency map for the composite outcome (Figure S9) appears to merge features from each of the individual outcomes of interest.

DISCUSSION

In this work, a technological gap in the application of ECG-based deep learning to a pediatric cohort for prediction of LV dysfunction or remodeling was addressed. The convolutional neural network trained on nearly 100,000 ECG-echo pairs with human expert classified greater than mild LV dysfunction, hypertrophy, and dilation performed well on an independent internal test set of >10,000 patients, on nearly 3,000 patients in the emergency room, as well as during external validation on an outside healthcare system with >5,000 patients. During internal and external testing, a high NPV was achieved. Saliency mapping and median waveform analysis provide physiologically relevant insights into ECG waveforms influencing model prediction of pediatric LV dysfunction or remodeling, allowing for visual comparison with existing algorithms for ECG interpretation. Altogether, these findings demonstrate the promise of AI-pECG to inexpensively screen for and/or diagnose LV dysfunction or remodeling in children, which may facilitate improved access to care and help prioritize patients for further studies and/or interventions.

Clinical Significance and Implications

Artificial intelligence has already had a transformative effect on adult cardiovascular medicine,2 but its potential for implementation in pediatric cardiology is only beginning to be appreciated.34, 35 To date, AI has been used in pediatric cardiology primarily for image-based deep learning applications.36–39 Analysis of ECG waveforms provides a rapid, easy-to-implement, and cost-effective application for artificial intelligence. Its utility in adults has been wide-ranging, including (but not limited to) prediction of ventricular dysfunction,3–7 ventricular hypertrophy,8–10 ventricular dilation,9, 11 atrial fibrillation and other arrhythmias,17, 26, 40, 41 age,42, 43 sex,42 and time-to-death.8, 43 The findings herein provide a proof-of-concept that similar ECG applications can be explored in children. They also suggest that deep learning may be applicable to other data streams (e.g., wearable biosensor data), which could aid in predicting outcomes of children,44 similar to what has been performed in adults.45

As a direct case example of clinical utility, ECG-echo pairs from the emergency room were considered, a setting in which AI-pECG could be of significant clinical and economic value. The economic burden includes misdiagnosis leading to unnecessary referrals and associated costs of echos, as well as missed diagnoses resulting in adverse clinical outcomes. AI-pECG predictions could help guide an emergency physician’s need to consult a pediatric cardiologist. The benchmark herein of pediatric cardiologists applying conventional LV hypertrophy criteria is on par with previous literature;46 given that model performance is superior, the model may also help guide pediatric cardiologists on whether to obtain an echo for a child without major congenital heart disease. This democratization of specialty expertise is likely to be particularly valuable for hospitals with low pediatric volumes and/or limited pediatric cardiology experience.47 As a thought example, using the composite LV outcome, the model has the capacity to achieve an NPV of 99% during internal testing, 97% in the emergency department, and 99% during external validation, with a potential to reduce echos obtained by 71%, 53%, and 33%, respectively (Table 3). NPVs and the potential reduction in echos improve further when assessing individual outcomes (Table 3).
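
The thought example can be made concrete by converting the reported composite-outcome percentages from Table 3 into counts per 1,000 patients. The short Python sketch below does this arithmetic; it assumes a hypothetical policy in which every predicted-negative ECG leads to a deferred echo, and the function and key names are illustrative only.

```python
def per_cohort_counts(prevalence_pct, npv_pct, predicted_negative_pct, cohort=1000):
    """Translate reported percentages into expected counts for a cohort."""
    deferred = cohort * predicted_negative_pct / 100   # echos not obtained
    missed = deferred * (1 - npv_pct / 100)            # true cases among those deferred
    diseased = cohort * prevalence_pct / 100           # all true cases
    return {
        "echos_deferred": deferred,
        "missed_cases": missed,
        "implied_sensitivity": (diseased - missed) / diseased,
    }


# Composite-outcome values as reported in Table 3.
internal = per_cohort_counts(4.6, 99.0, 70.8)
emergency = per_cohort_counts(10.0, 96.8, 53.3)
external = per_cohort_counts(6.1, 99.2, 32.6)
```

Under these assumptions, internal testing corresponds to roughly 708 deferred echos and about 7 missed cases per 1,000 patients, versus about 533 deferred and 17 missed in the emergency department, and about 326 deferred and 3 missed in external validation. The implied sensitivities (approximately 0.85, 0.83, and 0.96) agree with the reported values to within rounding, which is a useful consistency check on the table.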

The clinical utility of the AI-pECG model is reinforced by its convenience: it requires only inexpensive, rapidly acquired ECG waveform data. The robust performance of the model suggests that it is at least partially resistant to noise introduced during data acquisition and to age-dependent ECG variation. Importantly, the model uses only a single modality (i.e., independent of age/sex), rather than a complex clinical scoring system that would require user interaction and is susceptible to input error.

The saliency mapping also provides insight for clinicians to detect LV dysfunction and/or remodeling. It is reassuring that features identified by this model are relevant to findings currently used by clinicians to identify LV hypertrophy and dysfunction. First, saliency mapping and median waveform analysis identified lateral precordial leads as influential for LV dysfunction, with high-risk features of inverted T waves in V5–V6, which have previously been considered pathologic.48 Second, LV hypertrophy explainability was focused on precordial QRS complexes (e.g., deep S waves in V1–V2, tall R waves in V6), in keeping with previously established scoring systems.20, 21 Third, tall R waves in lateral precordial leads (V4–V6) were high-risk features for LV dilation. Finally, there were distinct age-dependent characteristics in saliency maps and median waveforms; notably, conventional scoring system components to predict LV hypertrophy20, 21 were detected as high-risk features in age ≥ 12 years old, and less so in age < 1 year (Figure S7).

Utilization of Both Qualitative and Quantitative Cutoffs

The primary objective was to create an ECG-based deep learning model based on human expert interpretation of echos of LV function and remodeling. This approach was taken because ~25% of patients were given only qualitative measures of LV function, so restricting training to quantitative measures would have substantially reduced the training dataset. Human expert opinion also incorporates multiple clinical datapoints that aid in decision making, which was of interest when training a model. Furthermore, most emergency medicine physicians will attempt cardiac point-of-care ultrasound and report qualitative outcomes on function. In contrast, from a pediatric cardiology perspective, z-scores are commonly reported for LV ejection fraction, LV mass, and LV volume. However, both qualitative49 and quantitative39, 50 cutoffs have limitations. Given the goal to externally validate the model for eventual multicenter use and validation, PHN22 z-scores were incorporated. In practical clinical use, it would be of value to have a model that is agnostic to the cutoff method used.

Herein, PHN z-scores led to higher internal median LV mass and volume z-scores. While these findings are consistent with previous work,33 this difference may nonetheless contribute to the variation in sensitivity and percentage predicted negative across institutions (Table 3), underscoring the need for: 1) improved standardization of echo measurements; 2) multi-center collaboration (e.g., federated learning) for model training and testing; and/or 3) consideration of institution-specific AI-pECG thresholds.

Limitations and Future Directions

There are several limitations of this work. First, human expert qualitative classification of outcomes based on echo is subject to inter-rater and intra-rater variability as highlighted above. This was addressed by demonstrating effective model performance using quantitative cutoffs across multiple institutions (Figure 3). Second, while AUROC and AUPRC performance is similar between internal and external cohorts (demonstrating generalizable discrimination) and the main objective is to screen for (i.e., rule out) pathology, model specificity and PPV were limited (demonstrating limited calibration, which may be attributed to the aforementioned measurement variability). Only one example of thresholding was used in evaluation of model performance, as further consideration is required to weigh the impact of resultant false positives and false negatives, as well as to optimally set thresholds across institutions. To this end, multicenter external validation to further refine thresholds for clinical implementation is warranted. Similarly, multicenter collaboration may help improve training/testing sample sizes, which may further improve performance (e.g., AUPRC) and generalizability. Third, it is conceivable that echo (or ECG) findings may change within the timeframe of the paired ECG (or echo); this risk was minimized by selecting the closest ECG-echo pair during testing. Fourth, other quantitative cutoffs could have also been implemented, but herein z-score based cutoffs were utilized given their ubiquitous use in pediatrics. Fifth, only one model architecture was attempted; conceivably, others may lead to improved model performance. Finally, these findings are limited to patients without major congenital heart disease.

Future work therefore includes application to other pediatric cardiology outcomes of interest (e.g., pediatric arrhythmia detection, mortality), other pediatric populations (e.g., major congenital heart disease), multicenter collaboration (for further model and threshold selection refinement), and prospective trials (to determine how to properly implement such tools to support clinical decision making). Additionally, methodologies need to be further developed to clarify the mechanistic insights afforded by models in relationship to ECG principles and current scoring systems.

Conclusions

In conclusion, these findings demonstrate the promise of AI-pECG to inexpensively screen for and/or diagnose LV dysfunction or remodeling in children. This tool may facilitate prioritization of patients for future interventions/studies, provide meaningful insight into novel ECG waveforms suggestive of LV dysfunction and/or remodeling, and potentially reduce disparities by improving access to care. Future multicenter collaboration, prospective trials, and application to congenital heart disease and pediatric arrhythmias are warranted.

CLINICAL PERSPECTIVE:

What is new?

  • An artificial intelligence–enhanced pediatric electrocardiogram (AI-pECG) algorithm is predictive of left ventricular dysfunction and remodeling in children across multiple healthcare systems.

  • The model outperforms a pediatric cardiologist benchmark for left ventricular hypertrophy; the addition of age and sex does not improve overall model performance.

  • Saliency mapping provides insight into ECG components (precordial QRS complexes for all outcomes; T waves for LV dysfunction) influencing model predictions, with high-risk features including lateral T wave inversion for LV dysfunction, deep S waves in V1–V2 and tall R waves in V5–V6 for LV hypertrophy, and tall R waves in V4–V6 for LV dilation.

What are the clinical implications?

  • This artificial intelligence–enhanced pediatric ECG algorithm shows promise to inexpensively screen for and diagnose LV dysfunction or remodeling in children, which may facilitate improved access to care and democratize the expertise of pediatric cardiologists.

  • Prospective trials may help guide model implementation to support clinical decision making.

  • Saliency mapping may promote clinician discovery of novel age-dependent ECG waveform patterns consistent with LV dysfunction and remodeling.

ACKNOWLEDGMENTS:

The authors would like to acknowledge Boston Children’s Hospital’s High-Performance Computing Resources Clusters Enkefalos 2 (E2) made available for conducting the research reported in this publication.

SOURCES OF FUNDING:

Funding support was received from the Thrasher Research Fund Early Career Award (J.M.), Boston Children’s Hospital Electrophysiology Research Education Fund (J.M., J.K.T.) and NIH grant R00-LM012926 from the National Library of Medicine (W.G.L.).

DISCLOSURES:

Dr. Nadkarni reports consultancy agreements with AstraZeneca, BioVie, GLG Consulting, Pensieve Health, Reata, Renalytix, Siemens Healthineers and Variant Bio; research funding from Goldfinch Bio and Renalytix; honoraria from AstraZeneca, BioVie, Lexicon, Daiichi Sankyo, Menarini Health and Reata; patents or royalties with Renalytix; owns equity and stock options in Pensieve Health and Renalytix as a scientific cofounder; owns equity in Verici Dx; has received financial compensation as a scientific board member and advisor to Renalytix; serves on the advisory board of Neurona Health; and serves in an advisory or leadership role for Pensieve Health and Renalytix. None played a role in the design or conduct of this study.

Nonstandard Abbreviations and Acronyms:

AI-ECG

Artificial Intelligence-Enhanced Electrocardiogram

AI-pECG

Artificial Intelligence-Enhanced Pediatric Electrocardiogram

AUROC

Area under the Receiver Operating Curve

AUPRC

Area under the Precision-Recall Curve

LV

Left ventricle

NPV

Negative predictive value

PPV

Positive predictive value

PHN

Pediatric Heart Network

Footnotes

SUPPLEMENTAL MATERIAL:

Tables S1–S9

Figures S1–S9

REFERENCES

  • 1. Saarel EV, Granger S, Kaltman JR, Minich LL, Tristani-Firouzi M, Kim JJ, Ash K, Tsao SS, Berul CI, Stephenson EA, Gamboa DG, Trachtenberg F, Fischbach P, Vetter VL, Czosek RJ, Johnson TR, Salerno JC, Cain NB, Pass RH, Zeltser I, Silver ES, Kovach JR, Alexander ME and Pediatric Heart Network I. Electrocardiograms in Healthy North American Children in the Digital Age. Circ Arrhythm Electrophysiol. 2018;11:e005808.
  • 2. Siontis KC, Noseworthy PA, Attia ZI and Friedman PA. Artificial intelligence-enhanced electrocardiography in cardiovascular disease management. Nat Rev Cardiol. 2021;18:465–478.
  • 3. Attia ZI, Kapa S, Lopez-Jimenez F, McKie PM, Ladewig DJ, Satam G, Pellikka PA, Enriquez-Sarano M, Noseworthy PA, Munger TM, Asirvatham SJ, Scott CG, Carter RE and Friedman PA. Screening for cardiac contractile dysfunction using an artificial intelligence-enabled electrocardiogram. Nat Med. 2019;25:70–74.
  • 4. Attia ZI, Kapa S, Yao X, Lopez-Jimenez F, Mohan TL, Pellikka PA, Carter RE, Shah ND, Friedman PA and Noseworthy PA. Prospective validation of a deep learning electrocardiogram algorithm for the detection of left ventricular systolic dysfunction. J Cardiovasc Electrophysiol. 2019;30:668–674.
  • 5. Yao X, Rushlow DR, Inselman JW, McCoy RG, Thacher TD, Behnken EM, Bernard ME, Rosas SL, Akfaly A, Misra A, Molling PE, Krien JS, Foss RM, Barry BA, Siontis KC, Kapa S, Pellikka PA, Lopez-Jimenez F, Attia ZI, Shah ND, Friedman PA and Noseworthy PA. Artificial intelligence-enabled electrocardiograms for identification of patients with low ejection fraction: a pragmatic, randomized clinical trial. Nat Med. 2021;27:815–819.
  • 6. Adedinsewo D, Carter RE, Attia Z, Johnson P, Kashou AH, Dugan JL, Albus M, Sheele JM, Bellolio F, Friedman PA, Lopez-Jimenez F and Noseworthy PA. Artificial Intelligence-Enabled ECG Algorithm to Identify Patients With Left Ventricular Systolic Dysfunction Presenting to the Emergency Department With Dyspnea. Circ Arrhythm Electrophysiol. 2020;13:e008437.
  • 7. Vaid A, Johnson KW, Badgeley MA, Somani SS, Bicak M, Landi I, Russak A, Zhao S, Levin MA, Freeman RS, Charney AW, Kukar A, Kim B, Danilov T, Lerakis S, Argulian E, Narula J, Nadkarni GN and Glicksberg BS. Using Deep-Learning Algorithms to Simultaneously Identify Right and Left Ventricular Dysfunction From the Electrocardiogram. JACC Cardiovasc Imaging. 2022;15:395–410.
  • 8. Liu CM, Hsieh ME, Hu YF, Wei TY, Wu IC, Chen PF, Lin YJ, Higa S, Yagi N, Chen SA and Tseng VS. Artificial Intelligence-Enabled Model for Early Detection of Left Ventricular Hypertrophy and Mortality Prediction in Young to Middle-Aged Adults. Circ Cardiovasc Qual Outcomes. 2022;15:e008360.
  • 9. Kokubo T, Kodera S, Sawano S, Katsushika S, Nakamoto M, Takeuchi H, Kimura N, Shinohara H, Matsuoka R, Nakanishi K, Nakao T, Higashikuni Y, Takeda N, Fujiu K, Daimon M, Akazawa H, Morita H, Matsuyama Y and Komuro I. Automatic Detection of Left Ventricular Dilatation and Hypertrophy from Electrocardiograms Using Deep Learning. Int Heart J. 2022;63:939–947.
  • 10. Ko WY, Siontis KC, Attia ZI, Carter RE, Kapa S, Ommen SR, Demuth SJ, Ackerman MJ, Gersh BJ, Arruda-Olson AM, Geske JB, Asirvatham SJ, Lopez-Jimenez F, Nishimura RA, Friedman PA and Noseworthy PA. Detection of Hypertrophic Cardiomyopathy Using a Convolutional Neural Network-Enabled Electrocardiogram. J Am Coll Cardiol. 2020;75:722–733.
  • 11. Shrivastava S, Cohen-Shelly M, Attia ZI, Rosenbaum AN, Wang L, Giudicessi JR, Redfield M, Bailey K, Lopez-Jimenez F, Lin G, Kapa S, Friedman PA and Pereira NL. Artificial Intelligence-Enabled Electrocardiography to Screen Patients with Dilated Cardiomyopathy. Am J Cardiol. 2021;155:121–127.
  • 12. Dickinson DF. The normal ECG in childhood and adolescence. Heart. 2005;91:1626–30.
  • 13. Siontis KC, Liu K, Bos JM, Attia ZI, Cohen-Shelly M, Arruda-Olson AM, Zanjirani Farahani N, Friedman PA, Noseworthy PA and Ackerman MJ. Detection of hypertrophic cardiomyopathy by an artificial intelligence electrocardiogram in children and adolescents. Int J Cardiol. 2021;340:42–47.
  • 14. Mori H, Inai K, Sugiyama H and Muragaki Y. Diagnosing Atrial Septal Defect from Electrocardiogram with Deep Learning. Pediatr Cardiol. 2021;42:1379–1387.
  • 15. Jacobs JP, Franklin RCG, Beland MJ, Spicer DE, Colan SD, Walters HL 3rd, Bailliard, Houyel L, St Louis JD, Lopez L, Aiello VD, Gaynor JW, Krogmann ON, Kurosawa H, Maruszewski BJ, Stellin G, Weinberg PM, Jacobs ML, Boris JR, Cohen MS, Everett AD, Giroud JM, Guleserian KJ, Hughes ML, Juraszek AL, Seslar SP, Shepard CW, Srivastava, Cook AC, Crucean A, Hernandez LE, Loomba RS, Rogers LS, Sanders SP, Savla JJ, Tierney ESS, Tretter JT, Wang L, Elliott MJ, Mavroudis C and Tchervenkov CI. Nomenclature for Pediatric and Congenital Cardiac Care: Unification of Clinical and Administrative Nomenclature - The 2021 International Paediatric and Congenital Cardiac Code (IPCCC) and the Eleventh Revision of the International Classification of Diseases (ICD-11). World J Pediatr Congenit Heart Surg. 2021;12:E1–E18.
  • 16. Colan SD. Early Database Initiatives: The Fyler Codes. In: Barach PR, Jacobs JP, Lipshultz SE and Laussen PC, eds. Pediatric and Congenital Cardiac Care: Volume 1: Outcomes Analysis. London: Springer London; 2015: 163–169.
  • 17. Ribeiro AH, Ribeiro MH, Paixao GMM, Oliveira DM, Gomes PR, Canazart JA, Ferreira MPS, Andersson CR, Macfarlane PW, Meira W Jr., Schon TB and Ribeiro ALP. Automatic diagnosis of the 12-lead ECG using a deep neural network. Nat Commun. 2020;11:1760.
  • 18. Einthoven W. Weiteres über das Elektrokardiogramm. Archiv für die gesamte Physiologie des Menschen und der Tiere. 1908;122:517–584.
  • 19. Goldberger E. A simple, indifferent, electrocardiographic electrode of zero potential and a technique of obtaining augmented, unipolar, extremity leads. American Heart Journal. 1942;23:483–492.
  • 20. Sokolow M and Lyon TP. The ventricular complex in left ventricular hypertrophy as obtained by unipolar precordial and limb leads. Am Heart J. 1949;37:161–86.
  • 21. Okin PM, Roman MJ, Devereux RB and Kligfield P. Electrocardiographic identification of increased left ventricular mass by simple voltage-duration products. J Am Coll Cardiol. 1995;25:417–23.
  • 22. Lopez L, Colan S, Stylianou M, Granger S, Trachtenberg F, Frommelt P, Pearson G, Camarda J, Cnota J, Cohen M, Dragulescu A, Frommelt M, Garuba O, Johnson T, Lai W, Mahgerefteh J, Pignatelli R, Prakash A, Sachdeva R, Soriano B, Soslow J, Spurney C, Srivastava S, Taylor C, Thankavel P, van der Velde M, Minich L and Pediatric Heart Network I. Relationship of Echocardiographic Z Scores Adjusted for Body Surface Area to Age, Sex, Race, and Ethnicity: The Pediatric Heart Network Normal Echocardiogram Database. Circ Cardiovasc Imaging. 2017;10.
  • 23. Lai WW, Mertens L, Cohen M and Geva T. Echocardiography in pediatric and congenital heart disease : from fetus to adult. 2021:1 online resource.
  • 24. Gustafsson S, Gedon D, Lampa E, Ribeiro AH, Holzmann MJ, Schon TB and Sundstrom J. Development and validation of deep learning ECG-based prediction of myocardial infarction in emergency department patients. Sci Rep. 2022;12:19615.
  • 25. Ulloa-Cerna AE, Jing L, Pfeifer JM, Raghunath S, Ruhl JA, Rocha DB, Leader JB, Zimmerman N, Lee G, Steinhubl SR, Good CW, Haggerty CM, Fornwalt BK and Chen R. rECHOmmend: An ECG-Based Machine Learning Approach for Identifying Patients at Increased Risk of Undiagnosed Structural Heart Disease Detectable by Echocardiography. Circulation. 2022;146:36–47.
  • 26. Khurshid S, Friedman S, Reeder C, Di Achille P, Diamant N, Singh P, Harrington LX, Wang X, Al-Alusi MA, Sarma G, Foulkes AS, Ellinor PT, Anderson CD, Ho JE, Philippakis AA, Batra P and Lubitz SA. ECG-Based Deep Learning and Clinical Risk Factors to Predict Atrial Fibrillation. Circulation. 2022;145:122–133.
  • 27. Sangha V, Nargesi AA, Dhingra LS, Khunte A, Mortazavi BJ, Ribeiro AH, Banina E, Adeola O, Garg N, Brandt CA, Miller EJ, Ribeiro ALP, Velazquez EJ, Giatti L, Barreto SM, Foppa M, Yuan N, Ouyang D, Krumholz HM and Khera R. Detection of Left Ventricular Systolic Dysfunction From Electrocardiographic Images. Circulation. 2023;148:765–777.
  • 28. Rijnbeek PR, Witsenburg M, Schrama E, Hess J and Kors JA. New normal limits for the paediatric electrocardiogram. Eur Heart J. 2001;22:702–11.
  • 29. Makowski D, Pham T, Lau ZJ, Brammer JC, Lespinasse F, Pham H, Scholzel C and Chen SHA. NeuroKit2: A Python toolbox for neurophysiological signal processing. Behav Res Methods. 2021;53:1689–1696.
  • 30. Lundberg SM, Erion G, Chen H, DeGrave A, Prutkin JM, Nair B, Katz R, Himmelfarb J, Bansal N and Lee SI. From Local Explanations to Global Understanding with Explainable AI for Trees. Nat Mach Intell. 2020;2:56–67.
  • 31. Team PC. Python: A dynamic, open source programming language. Python Software Foundation. 2015;78.
  • 32. Team RC. R: A language and environment for statistical computing. R Foundation for Statistical Computing. 2019.
  • 33. Lopez L, Frommelt PC, Colan SD, Trachtenberg FL, Gongwer R, Stylianou M, Bhat A, Burns KM, Cohen MS, Dragulescu A, Freud LR, Frommelt MA, Lytrivi ID, Mahgerefteh J, McCrindle BW, Pignatelli R, Prakash A, Sachdeva R, Soslow JH, Spurney C, Taylor CL, Thankavel PP, Thorsson T, Tretter JT, Young LT, LuAnn Minich L and Pediatric Heart Network I. Pediatric Heart Network Echocardiographic Z Scores: Comparison with Other Published Models. J Am Soc Echocardiogr. 2021;34:185–192.
  • 34. Gaffar S, Gearhart AS and Chang AC. The Next Frontier in Pediatric Cardiology: Artificial Intelligence. Pediatr Clin North Am. 2020;67:995–1009.
  • 35. Jone P-N, Gearhart A, Lei H, Xing F, Nahar J, Lopez-Jimenez F, Diller G-P, Marelli A, Wilson L, Saidi A, Cho D and Chang AC. Artificial Intelligence in Congenital Heart Disease. JACC: Advances. 2022;1:100153.
  • 36. Sethi Y, Patel N, Kaka N, Desai A, Kaiwan O, Sheth M, Sharma R, Huang H, Chopra H, Khandaker MU, Lashin MMA, Hamd ZY and Emran TB. Artificial Intelligence in Pediatric Cardiology: A Scoping Review. J Clin Med. 2022;11.
  • 37. Arnaout R, Curran L, Zhao Y, Levine JC, Chinn E and Moon-Grady AJ. An ensemble of neural networks provides expert-level prenatal detection of complex congenital heart disease. Nat Med. 2021;27:882–891.
  • 38. Gearhart A, Goto S, Deo RC and Powell AJ. An Automated View Classification Model for Pediatric Echocardiography Using Artificial Intelligence. J Am Soc Echocardiogr. 2022;35:1238–1246.
  • 39. Reddy CD, Lopez L, Ouyang D, Zou JY and He B. Video-Based Deep Learning for Automated Assessment of Left Ventricular Ejection Fraction in Pediatric Patients. J Am Soc Echocardiogr. 2023;36:482–489.
  • 40. Attia ZI, Noseworthy PA, Lopez-Jimenez F, Asirvatham SJ, Deshmukh AJ, Gersh BJ, Carter RE, Yao X, Rabinstein AA, Erickson BJ, Kapa S and Friedman PA. An artificial intelligence-enabled ECG algorithm for the identification of patients with atrial fibrillation during sinus rhythm: a retrospective analysis of outcome prediction. Lancet. 2019;394:861–867.
  • 41. Hannun AY, Rajpurkar P, Haghpanahi M, Tison GH, Bourn C, Turakhia MP and Ng AY. Cardiologist-level arrhythmia detection and classification in ambulatory electrocardiograms using a deep neural network. Nat Med. 2019;25:65–69.
  • 42. Attia ZI, Friedman PA, Noseworthy PA, Lopez-Jimenez F, Ladewig DJ, Satam G, Pellikka PA, Munger TM, Asirvatham SJ, Scott CG, Carter RE and Kapa S. Age and Sex Estimation Using Artificial Intelligence From Standard 12-Lead ECGs. Circ Arrhythm Electrophysiol. 2019;12:e007284.
  • 43. Lima EM, Ribeiro AH, Paixao GMM, Ribeiro MH, Pinto-Filho MM, Gomes PR, Oliveira DM, Sabino EC, Duncan BB, Giatti L, Barreto SM, Meira W Jr., Schon TB and Ribeiro ALP. Deep neural network-estimated electrocardiographic age as a mortality predictor. Nat Commun. 2021;12:5117.
  • 44. Tandon A, Nguyen HH, Avula S, Seshadri DR, Patel A, Fares M, Baloglu O, Amdani S, Jafari R, Inan OT and Drummond CK. Wearable Biosensors in Congenital Heart Disease: Needs to Advance the Field. JACC Adv. 2023;2.
  • 45. Attia ZI, Harmon DM, Dugan J, Manka L, Lopez-Jimenez F, Lerman A, Siontis KC, Noseworthy PA, Yao X, Klavetter EW, Halamka JD, Asirvatham SJ, Khan R, Carter RE, Leibovich BC and Friedman PA. Prospective evaluation of smartwatch-enabled detection of left ventricular dysfunction. Nat Med. 2022;28:2497–2503.
  • 46. Rivenes SM, Colan SD, Easley KA, Kaplan S, Jenkins KJ, Khan MN, Lai WW, Lipshultz SE, Moodie DS, Starc TJ, Sopko G, Zhang W, Bricker JT, Pediatric P and Cardiovascular Complications of Vertically Transmitted HIVISG. Usefulness of the pediatric electrocardiogram in detecting left ventricular hypertrophy: results from the Prospective Pediatric Pulmonary and Cardiovascular Complications of Vertically Transmitted HIV Infection (P2C2 HIV) multicenter study. Am Heart J. 2003;145:716–23.
  • 47. Morris SA and Lopez KN. Deep learning for detecting congenital heart disease in the fetus. Nat Med. 2021;27:764–765.
  • 48. D’Ascenzi F, Anselmi F, Berti B, Capitani E, Chiti C, Franchini A, Graziano F, Nistri S, Focardi M, Capitani M, Corrado D, Bonifazi M and Mondillo S. Prevalence and significance of T-wave inversion in children practicing sport: A prospective, 4-year follow-up study. Int J Cardiol. 2019;279:100–104.
  • 49. Lopez L, Colan SD, Frommelt PC, Ensing GJ, Kendall K, Younoszai AK, Lai WW and Geva T. Recommendations for quantification methods during the performance of a pediatric echocardiogram: a report from the Pediatric Measurements Writing Group of the American Society of Echocardiography Pediatric and Congenital Heart Disease Council. J Am Soc Echocardiogr. 2010;23:465–95; quiz 576–7.
  • 50. Frommelt PC, Minich LL, Trachtenberg FL, Altmann K, Camarda J, Cohen MS, Colan SD, Dragulescu A, Frommelt MA, Johnson TR, Kovalchin JP, Lin L, Mahgerefteh J, Nutting A, Parra DA, Pearson GD, Pignatelli R, Sachdeva R, Soriano BD, Spurney C, Srivastava S, Statile CJ, Stelter J, Stylianou M, Thankavel PP, Tierney ES, van der Velde ME, Lopez L and Pediatric Heart Network I. Challenges With Left Ventricular Functional Parameters: The Pediatric Heart Network Normal Echocardiogram Database. J Am Soc Echocardiogr. 2019;32:1331–1338 e1.

Data Availability Statement

Requests for Boston Children’s Hospital data and related materials will be internally reviewed to clarify if the request is subject to intellectual property or confidentiality constraints. Shareable data and materials will be released under a material transfer agreement for non-commercial research purposes. Use of Boston Children’s Hospital and Mount Sinai Hospital data was approved by their respective Institutional Review Boards.

Programming code used to perform the analyses is available upon reasonable request. The convolutional neural network used the Keras framework with a TensorFlow (Google) backend using Python 3.9 (reference 31). Deep learning was executed on institutional graphics processing units. All other pre- and post-processing code was written in Python 3.9 (reference 31) and R 4.0 (reference 32) and was executed locally.
