Structured abstract
Background:
Pulmonary hypertension (PH) is life-threatening and often diagnosed late in its course. We aimed to evaluate whether a deep learning approach using electrocardiogram (ECG) data alone can detect PH and clinically important subtypes.
Research Question:
Does an automated deep learning approach to ECG interpretation detect PH and its clinically important subtypes?
Study Design and Methods:
Adults with right heart catheterization (RHC) or an echocardiogram within 90 days of an ECG at the University of California, San Francisco (2012–2019) were retrospectively identified as PH or non-PH. A deep convolutional neural network was trained on patients’ 12-lead ECG voltage data. Patients were divided into training, development, and test sets in a ratio of 7:1:2.
Results:
Overall, 5016 PH and 19,454 non-PH patients were used in the study. Mean (SD) age at time of ECG was 62.29 (17.58) years and 49.88% were female. Mean interval between ECG and RHC or echocardiogram was 3.66 and 2.23 days for PH and non-PH patients, respectively. In the test dataset, the model achieved an area under the receiver operating characteristic curve (AUC), sensitivity, and specificity, respectively of 0.89, 0.79, and 0.84 to detect PH; 0.91, 0.83, and 0.84 to detect pre-capillary PH; 0.88, 0.81, and 0.81 to detect PAH, and 0.80, 0.73, and 0.76 to detect Group 3 PH. We additionally applied the trained model on ECGs from participants in the test dataset that were obtained from up to 2 years before diagnosis of PH: AUC was ≥0.79.
Interpretation:
A deep learning ECG algorithm can detect PH and PH subtypes around the time of diagnosis and can detect PH using ECGs that were done up to 2 years before RHC/echocardiogram diagnosis. This approach has the potential to reduce diagnostic delay in PH.
Keywords: artificial intelligence, deep learning, electrocardiogram, pulmonary hypertension
INTRODUCTION
Pulmonary hypertension (PH) is a life-threatening condition that encompasses a group of severe clinical entities with different etiologies. The World Health Organization (WHO) system is used to classify PH according to clinical presentation, hemodynamic characteristics, and treatment strategy1. PH can arise from pre-capillary or capillary abnormalities of the pulmonary vasculature (WHO Group 1, 3, 4, or 5) or from a post-capillary etiology of left-sided heart disease (WHO Group 2)1. The prognosis of PH depends on the etiology and severity of the condition but is often fatal if untreated2, 3. PH is frequently diagnosed late in its course, with the mean time between symptom onset and diagnosis of Group 1 PH (pulmonary arterial hypertension, PAH) ranging from 2.5 years4 to 3.9 years5. Despite the development of numerous targeted therapies for PAH, and treatments for other forms of pre-capillary PH in the past decade6-8, there has been no meaningful decrease in the diagnostic delay9, 10 due to lack of established screening guidelines for most forms of PH11.
Definitive diagnosis of PH relies on measurements from the right heart catheterization (RHC) procedure, which is invasive and costly. Echocardiography is the current gold standard in PH screening6; however, it is a more specialized, time-consuming procedure that is not readily accessible to all physicians12, 13. In contrast, the electrocardiogram (ECG) is the most commonly performed cardiovascular diagnostic procedure, being ordered at 5% of US office visits14, 22% of annual health examinations15, and 17% of emergency department visits16. Traditional ECG findings such as right axis deviation, right ventricular hypertrophy, and right bundle branch block raise suspicion for PH but are not sufficient for diagnostic purposes, and are typically not detected until severe stages of PH6. Prior work evaluating traditional ECG features showed that the presence of a qR pattern in V1 and the magnitude of the P wave amplitude in the inferior leads were highly predictive of an adverse outcome17, which reflects the prognostic importance of the degree of impairment of the right ventricle. However, guidelines and standard of care do not presently rely on ECG alone for PH detection.
Prior proof-of-concept work from our group has demonstrated that machine learning applied to ECG interpretation can extend the clinical utility of the ECG in diagnosis and disease progression across a wide range of cardiovascular diseases, including PH18. In the current work, we hypothesize that an automated deep learning approach to ECG interpretation can be used to detect PH and certain clinically important subtypes. We developed and tested an algorithmic framework that can detect PH broadly, as well as specific subtypes including pre-capillary PH, WHO Group 1 PH, and WHO Group 3 PH (PH due to lung disease and/or hypoxia). The algorithm represents a novel approach to identifying patients with suspected PH for the purpose of reducing the delay in diagnosis.
METHODS
Human Subjects Research
University of California, San Francisco (UCSF) Institutional Review Board approval was received for this study and informed consent was waived due to minimal risk.
Cohort Definitions
Eligible patients were adults who had undergone right heart catheterization (RHC) or had an echocardiogram within 90 days before or after an ECG at the University of California, San Francisco, between 2012 and 2019. Eligible patients were retrospectively identified and classified as PH or control: PH patients were those with mean pulmonary arterial pressure (mPAP) >20 mmHg by RHC or peak tricuspid regurgitation velocity (TRV) >3.4 m/s by echocardiogram; and control patients were those with RHC mPAP ≤20 mmHg or peak TRV ≤2.8 m/s.1 It is standard practice in the UCSF catheterization laboratory to obtain the mean PCWP by measuring the pressure at the mid a-wave. If PCWP is >15 mmHg, it is also standard practice in our lab to confirm with either a highly saturated blood sample (O2 sat >90%) or an LVEDP measurement. Echocardiographic data were only used to classify patients if RHC was not available. Patients who had intermediate probability of PH according to their TRV measurement (TRV between 2.8 and 3.4 m/s) and had no resting mPAP measurement from RHC were excluded. Summaries of hemodynamic measurements from echocardiogram and RHC for our cohort can be found in supplemental tables 1 and 2. The PH cohort was used to train and validate the convolutional neural network (CNN) algorithm as described below.
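The labeling rule described above can be sketched in a few lines. This is only an illustration of the stated thresholds, not the study's code; the function name `classify_ph` and the use of `None` for a missing measurement are assumptions.

```python
from typing import Optional


def classify_ph(mpap_mmhg: Optional[float], trv_m_s: Optional[float]) -> str:
    """Return 'PH', 'control', or 'excluded' from RHC mPAP and echo TRV.

    Hypothetical helper: None means the measurement is unavailable.
    """
    # RHC takes precedence; echo is used only when no mPAP is available.
    if mpap_mmhg is not None:
        return "PH" if mpap_mmhg > 20 else "control"
    if trv_m_s is None:
        return "excluded"  # neither measurement available
    if trv_m_s > 3.4:
        return "PH"
    if trv_m_s <= 2.8:
        return "control"
    # Intermediate TRV (2.8 to 3.4 m/s) without RHC cannot be classified.
    return "excluded"
```

For example, `classify_ph(None, 3.0)` returns `'excluded'`, while `classify_ph(18, 3.6)` returns `'control'` because the RHC measurement overrides the echocardiographic one.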
To test the algorithm’s performance in identifying subtypes of PH, patients with subtypes of PH were identified according to RHC measurements as defined by the 6th World Symposium on Pulmonary Hypertension consensus statement1. Patients without RHC data were not used in this PH subtype analysis. Pre-capillary PH patients were identified only if RHC was performed and were defined as having mPAP >20 mmHg, pulmonary artery wedge pressure (PAWP) ≤15 mmHg and pulmonary vascular resistance (PVR) ≥3 WU. Those without pre-capillary PH were defined as those with RHC data demonstrating mPAP ≤20 mmHg. Group 1 PH (PAH) patients were defined as PH patients with RHC mPAP >20 mmHg who had ≥1 code for a PAH-specific medication ordered during the 3 months prior to or 6 months following the qualifying RHC. A full list of PAH-specific medications is provided in the Supplemental Appendix. Group 3 PH (PH due to lung disease and/or hypoxia) patients were defined as PH patients with RHC demonstrating mPAP >20 mmHg who also had an International Classification of Diseases (ICD) code that was consistent with a diagnosis of Group 3 PH within 3 months before or after RHC PH diagnosis. A full list of the qualifying ICD codes is provided in the Supplemental Appendix.
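As a rough sketch of how the subtype criteria above combine, the following applies the hemodynamic thresholds and the medication window. All function names are hypothetical, and approximating 3 and 6 months as 90 and 180 days is an assumption made here for illustration.

```python
from datetime import date, timedelta
from typing import Iterable


def is_precapillary_ph(mpap_mmhg: float, pawp_mmhg: float, pvr_wu: float) -> bool:
    """Pre-capillary PH: mPAP >20 mmHg, PAWP <=15 mmHg, PVR >=3 WU."""
    return mpap_mmhg > 20 and pawp_mmhg <= 15 and pvr_wu >= 3


def has_order_in_window(rhc_date: date, order_dates: Iterable[date],
                        days_before: int = 90, days_after: int = 180) -> bool:
    """True if any PAH-medication order falls in the window around the RHC.

    Assumption: months are approximated as 30 days each.
    """
    start = rhc_date - timedelta(days=days_before)
    end = rhc_date + timedelta(days=days_after)
    return any(start <= d <= end for d in order_dates)


def is_group1_ph(mpap_mmhg: float, rhc_date: date,
                 order_dates: Iterable[date]) -> bool:
    """Group 1 PH (PAH): RHC mPAP >20 mmHg plus >=1 qualifying order."""
    return mpap_mmhg > 20 and has_order_in_window(rhc_date, order_dates)
```

A Group 3 rule would follow the same pattern with ICD code dates and a symmetric window around the RHC date.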
Data Extraction and Processing
RHC, echocardiogram reports, and patient health record data such as demographics, ICD codes, and medications were extracted from their respective UCSF electronic health record databases. Free text in RHC and echocardiogram reports was parsed to extract the required data (i.e., mPAP or TRV) into a structured format for each procedure. The nearest ECG within 90 days in either direction of the procedure date was matched to the RHC or echocardiogram, yielding one ECG per patient for model training. If no ECG was found within 90 days of a procedure date, patients were excluded. Patients were then randomly divided into training, development, and test datasets in a ratio of 7:1:2 for training, tuning, and validation of the model, respectively. Within the test dataset, PH subgroups (as defined above) were also identified. ECGs were recorded as part of routine clinical care at either 250 Hz or 500 Hz across 12 leads for 10 seconds; recordings were down-sampled as necessary to 250 Hz and converted to a 2,500x12 matrix. These data are available from any standard 12-lead ECG performed by any clinic.
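The matching and splitting steps can be illustrated as follows. This is a minimal sketch under stated assumptions, not the study pipeline: the helper names, the fixed random seed, and the tie-breaking behavior (the first of two equally distant ECGs wins) are all hypothetical.

```python
import random
from datetime import date
from typing import Iterable, List, Optional, Sequence, Tuple


def nearest_ecg(procedure_date: date, ecg_dates: Iterable[date],
                max_days: int = 90) -> Optional[date]:
    """Return the ECG date closest to the procedure, within +/- max_days."""
    candidates = [d for d in ecg_dates
                  if abs((d - procedure_date).days) <= max_days]
    if not candidates:
        return None  # patient would be excluded
    return min(candidates, key=lambda d: abs((d - procedure_date).days))


def split_patients(patient_ids: Iterable, ratios: Sequence[int] = (7, 1, 2),
                   seed: int = 0) -> Tuple[List, List, List]:
    """Randomly split patients into training, development, and test sets."""
    ids = list(patient_ids)
    random.Random(seed).shuffle(ids)
    total = sum(ratios)
    n_train = len(ids) * ratios[0] // total
    n_dev = len(ids) * ratios[1] // total
    # Splitting by patient keeps each patient in exactly one dataset.
    return ids[:n_train], ids[n_train:n_train + n_dev], ids[n_train + n_dev:]
```

Splitting at the patient level rather than the ECG level matters here because each patient contributes one ECG, so no patient can appear in both training and test data.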
Algorithm Development
We trained a single CNN to predict the presence of PH. The model had a one-dimensional ResNet architecture19, similar to that previously described20. The CNN accepts the input matrix of ECG voltages and produces an output score between 0 and 1 that corresponds to the predicted probability of PH. The CNN comprises 15 convolutional layers and was implemented in Keras version 2.2.4 and Python 3.6. Each layer has a filter size of 8 and consists of a convolution, a ReLU non-linearity21, and a batch normalization layer22. Alternating layers are connected by a residual connection. A max-pooling layer is applied at every 4th layer, reducing the time resolution by a factor of 2 each time, starting from 2500. The number of convolutional channels is doubled every 8th layer, starting from 64 channels. The output of the final layer of convolution is a 256 x 10 matrix, which is fed into a fully connected layer that outputs a 128 x 1 matrix. This feeds the sigmoid activation function to complete our model. This final architecture was decided upon after a grid search sweep of the following architectural parameters: number of convolutional layers {11,12,13,14,15,16,17}, number of fully connected layers {0,1}, size of fully connected layer {64,128,256,1024}. Each model was trained with the Adam optimizer at a learning rate of 0.001 until convergence; the learning rate was then reduced by a factor of 10 and training was continued until convergence had been reached three times23. The model output consists of two classes of predicted binary outcomes, each with a probability value between 0 and 1. The outcomes are mutually exclusive, so output probabilities add up to 1.
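To make the size of the architecture sweep concrete, the grid described above can be enumerated as in the sketch below. The dictionary keys and the decision to ignore the fully connected layer size when there are no fully connected layers are assumptions for illustration; the text does not say how that case was handled.

```python
from itertools import product

CONV_LAYERS = [11, 12, 13, 14, 15, 16, 17]
FC_LAYERS = [0, 1]
FC_SIZES = [64, 128, 256, 1024]


def architecture_grid():
    """Enumerate the distinct architectures in the sweep."""
    configs = []
    for n_conv, n_fc in product(CONV_LAYERS, FC_LAYERS):
        if n_fc == 0:
            # Assumption: the layer size is irrelevant without an FC layer.
            configs.append({"conv_layers": n_conv, "fc_layers": 0,
                            "fc_size": None})
        else:
            for size in FC_SIZES:
                configs.append({"conv_layers": n_conv, "fc_layers": n_fc,
                                "fc_size": size})
    return configs


# The configuration reported as final in the text above.
SELECTED = {"conv_layers": 15, "fc_layers": 1, "fc_size": 128}
```

Under this reading the sweep covers 7 x (1 + 4) = 35 distinct architectures, each of which would be trained and compared on the development dataset.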
AI Explainability
Understanding the features used by deep learning models can be important to contextualize their predictions, build trust in their conclusions, and possibly elucidate patterns in their input previously unknown to humans. Here, we apply an explainability technique called Local Interpretable Model-Agnostic Explanations (LIME) to highlight the voltage features of the ECG used by the model to make predictions about the presence of PH24 (Figure 3). We present voltages with importance weight represented by red highlights, in which color intensity corresponds to importance.
Figure 3.
A,B: Examples of ECG tracings used for algorithm development from an example patient with longstanding PH (top) and a patient without PH (bottom). Note some classic ECG features of PH in the PH tracing, including rightward axis and prominent R wave in V1. C,D: Features of the ECG voltages that were important for the model’s classification are highlighted in red, with color intensity corresponding to LIME-derived importance values. ECG, electrocardiogram; PH, pulmonary hypertension
Statistical Performance Metrics
CNN performance was evaluated by calculating the area under the receiver operating characteristic curve (AUC), sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV) and F1 score (harmonic mean of the PPV and sensitivity).25 A threshold for binary decision was chosen to optimize the F1 score in the development dataset and then applied to the test set for calculation of the F1 score, sensitivity and specificity and predictive values.
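The threshold-dependent metrics and the F1-based threshold choice can be sketched as below. This is an illustrative implementation, not the study's code: the function names are hypothetical, and treating a score at or above the threshold as a positive prediction is an assumption.

```python
def confusion_counts(y_true, scores, threshold):
    """Count TP, FP, TN, FN, calling a score >= threshold positive."""
    tp = fp = tn = fn = 0
    for y, s in zip(y_true, scores):
        predicted_positive = s >= threshold
        if predicted_positive and y == 1:
            tp += 1
        elif predicted_positive and y == 0:
            fp += 1
        elif not predicted_positive and y == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def metrics(tp, fp, tn, fn):
    """Sensitivity, specificity, PPV, NPV, and F1 from confusion counts."""
    sens = tp / (tp + fn) if tp + fn else 0.0
    spec = tn / (tn + fp) if tn + fp else 0.0
    ppv = tp / (tp + fp) if tp + fp else 0.0
    npv = tn / (tn + fn) if tn + fn else 0.0
    # F1 is the harmonic mean of PPV and sensitivity.
    f1 = 2 * ppv * sens / (ppv + sens) if ppv + sens else 0.0
    return {"sensitivity": sens, "specificity": spec,
            "ppv": ppv, "npv": npv, "f1": f1}


def best_f1_threshold(y_true, scores):
    """Pick the candidate threshold with the highest F1 (development set)."""
    candidates = sorted(set(scores))
    return max(candidates,
               key=lambda t: metrics(*confusion_counts(y_true, scores, t))["f1"])
```

The threshold returned by `best_f1_threshold` on development data would then be passed unchanged to `confusion_counts` on the test data, so that the test metrics are not tuned on the test set.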
The trained model was also tested on a subset of patients from the test set in which the prevalence of PH was made to be 1%, to simulate the estimated prevalence of PH in the general population.26 The trained model was then also tested on several PH subgroups, namely pre-capillary PH, Group 1 PH, and Group 3 PH, that were identified within the test dataset using the criteria described above; the same binary threshold was used as in the primary analysis. Of note, the CNN was not specifically re-trained for these subtypes due to low sample sizes. In these analyses (PH and its subgroups), the model was tested on ECGs from within 90 days of diagnosis by RHC or echocardiogram. In a separate analysis, the trained model was also applied on ECGs from patients in the test dataset taken from up to 2 years before the date that the diagnosis was made. To examine whether a CNN can also detect PH using fewer than 12-leads, as is commonly obtained from remote ECG monitoring devices, we trained separate CNNs using only 2 or 1 input ECG leads (leads I and V2) and reported results in the hold-out test dataset; all CNN hyperparameters were kept identical to those used during 12-lead training.
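One way to build a test subset with a target prevalence is to keep every control and randomly subsample cases, as sketched below. The text does not state how the 1% subset was constructed, so this procedure, the function names, and the fixed seed are assumptions.

```python
import random


def cases_for_prevalence(n_controls: int, prevalence: float) -> int:
    """Number of cases giving the target prevalence with all controls kept."""
    # Solve cases / (cases + controls) = prevalence for cases.
    return round(prevalence * n_controls / (1 - prevalence))


def subsample_cases(case_ids, n_controls: int, prevalence: float, seed: int = 0):
    """Randomly choose the cases to keep for the low-prevalence subset."""
    case_ids = list(case_ids)
    k = min(len(case_ids), cases_for_prevalence(n_controls, prevalence))
    return random.Random(seed).sample(case_ids, k)
```

With the 3903 controls reported for the test set, `cases_for_prevalence(3903, 0.01)` gives 39 cases, which agrees with the number of PH patients listed for the 1% prevalence analysis in Table 2.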
RESULTS
Patient Cohorts
A total of 24,470 adult patients were retrospectively identified from adults who had undergone RHC or had an echocardiogram within 90 days before or after an ECG (Figure 1). Patient demographics and characteristics are shown in Table 1. We identified 5016 PH patients (2830 by RHC, 56%) and 19,454 control patients. Mean (SD) age at time of ECG was 62.29 (17.58) years and 49.88% were female. The mean interval between ECG and RHC or echocardiogram was 3.66 and 2.23 days for PH and control patients, respectively; the median was 0 days between ECG and both RHC and echocardiogram. Demographics and patient characteristics were similar across the training, development, and test sets (data not shown). From the PH patients in the test set (N=982), 147 patients were confirmed as having pre-capillary PH by RHC, 119 were classified as Group 1 PH patients, based on RHC hemodynamics and medication orders, and 169 were classified as Group 3 PH patients, based on RHC hemodynamics and ICD codes.
Figure 1.
Patient flow diagram (Consort diagram) and PH Cohort Venn Diagram. ECG, electrocardiogram; echo, echocardiogram; EHR, electronic health records; PH, pulmonary hypertension; RHC, right heart catheterization; UCSF, University of California, San Francisco. *Forty ECG files were incomplete and were not used in model development.
Table 1.
Patient demographics and characteristics in the PH cohorts
Characteristic | Overall N=24,470 | PH (mPAP >20 mm Hg or TRV >3.4 m/s) N=5016 | Pre-capillary PH (by RHC only) N=701 | Group 1 PH - PAH (mPAP >20 mm Hg + 1 PAH-medication) N=572 | Group 3 PH – PH due to lung disease and/or hypoxia (mPAP >20 mm Hg + 1 qualifying ICD code) N=958 |
---|---|---|---|---|---|
Male gender, n (%) | 12,257 (50.1) | 2650 (52.8) | 296 (42.2) | 237 (41.4) | 562 (58.7) |
Female gender, n (%)* | 12,206 (49.9) | 2366 (47.2) | 405 (57.8) | 335 (58.6) | 396 (41.3) |
Mean (SD) age, years | 62.3 (17.6) | 65.3 (16.8) | 58.4 (14.5) | 55.0 (14.6) | 62.5 (12.4) |
Race, n (%) | | | | | |
White | 13,194 (53.9) | 1586 (31.6) | 235 (33.5) | 292 (51.0) | 553 (57.7) |
Black/African American | 2084 (8.5) | 425 (8.5) | 51 (7.3) | 72 (12.6) | 132 (13.8) |
Asian | 4309 (17.6) | 521 (10.4) | 59 (8.4) | 77 (13.5) | 97 (10.1) |
Native Hawaiian/Pacific Islander | 362 (1.5) | 69 (1.4) | 7 (1.0) | 9 (1.6) | 12 (1.3) |
Unknown | 4521 (18.5) | 2,415 (48.1) | 349 (49.8) | 122 (21.3) | 164 (17.1) |
*7 individuals identified as non-binary gender, hence the percentages do not total 100.
Detecting Pulmonary Hypertension
The primary objective of the study was to determine whether an automated deep learning approach to ECG interpretation could be used to detect PH. Our trained model achieved an AUC of 0.89, and sensitivity and specificity of 0.79 and 0.84, respectively, in our test dataset, which had a PH prevalence of 20% (Table 2). To simulate the estimated prevalence of PH in the general population,26 the trained model was also applied to a subset of the test dataset with 1% PH prevalence (Table 2, Figure S1); the ROC curves are shown in Figure 2. Positive predictive values for detection of PH (Table 2, Figure S2) were 0.56 in the overall test dataset (20% PH prevalence) and 0.05 in the dataset with 1% PH prevalence. The corresponding NPVs were 0.94 and 1.00, indicating that patients identified as PH-negative by the algorithm are unlikely to have PH. The predictions given by the CNN compared with the 'ground truth' (by RHC or echocardiogram) are shown in two-by-two tables (Figure S2). In a sensitivity analysis, all CNN performance metrics (AUC, sensitivity, specificity, F1, NPV, PPV) were materially similar, with overlapping confidence intervals, in PH cohorts stratified by RHC vs. echo PH diagnosis criteria (Table 3).
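The dependence of the predictive values on prevalence follows from Bayes' rule, and a short calculation reproduces the reported pattern. The sketch below uses the rounded sensitivity (0.79) and specificity (0.84) from the text, so its results match the reported values only approximately; the function name is hypothetical.

```python
def predictive_values(sensitivity: float, specificity: float, prevalence: float):
    """Return (PPV, NPV) implied by sensitivity, specificity, and prevalence."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    true_neg = specificity * (1 - prevalence)
    false_neg = (1 - sensitivity) * prevalence
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


ppv_20, npv_20 = predictive_values(0.79, 0.84, 0.20)  # about 0.55 and 0.94
ppv_01, npv_01 = predictive_values(0.79, 0.84, 0.01)  # about 0.05 and 1.00
```

At 20% prevalence this gives a PPV of roughly 0.55, close to the reported 0.56 (the small gap comes from rounding of the inputs), and at 1% prevalence the PPV falls to roughly 0.05 while the NPV rounds to 1.00.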
Table 2.
CNN performance in detecting PH in each of the test datasets
Test dataset | AUC | Sensitivity | Specificity | PPV | NPV | PH+ in test set, n | Controls in test set, n | Disease prevalence in dataset, % |
---|---|---|---|---|---|---|---|---|
PH at a prevalence of 20% | 0.89 (0.88-0.90) | 0.79 (0.76-0.81) | 0.84 (0.83-0.85) | 0.56 (0.53-0.58) | 0.94 (0.93-0.95) | 982 | 3903 | 20 |
PH at a prevalence of 1% | 0.88 (0.83-0.93) | 0.79 (0.68-0.90) | 0.84 (0.83-0.85) | 0.05 (0.03-0.06) | 1.00 (1.00-1.00) | 39 | 3903 | 1 |
Pre-capillary PH | 0.91 (0.89-0.94) | 0.83 (0.78-0.88) | 0.84 (0.83-0.88) | 0.17 (0.14-0.19) | 0.99 (0.99-0.99) | 147 | 87 | 63 |
Group 1 PH – PAH | 0.94 (0.92-0.96) | 0.88 (0.83-0.93) | 0.84 (0.83-0.85) | 0.15 (0.12-0.17) | 1.0 (0.99-1.0) | 119 | 87 | 56 |
Group 3 PH – PH due to lung disease and/or hypoxia | 0.9 (0.89-0.91) | 0.81 (0.77-0.84) | 0.84 (0.83-0.85) | 0.31 (0.29-0.34) | 0.98 (0.98-0.98) | 169 | 80 | 68 |
AUC=area under the receiver operating characteristic curve; PPV=positive predictive value; NPV=negative predictive value; PH=pulmonary hypertension; value (95% confidence interval)
Figure 2.
ROC curves for primary model's performance in the test dataset for A: PH patients; B: pre-capillary PH; C: Group 1 PH (PAH) patients; D: Group 3 PH patients. AUC, area under the receiver operating characteristic curve; PAH, pulmonary arterial hypertension; PH, pulmonary hypertension; ROC, receiver operating characteristic.
Table 3.
Performance of CNN to detect PH stratified by PH diagnosis criteria
PH Diagnosis Method | AUC | Sensitivity | Specificity | F1 score | PPV | NPV |
---|---|---|---|---|---|---|
RHC | 0.89 (0.87-0.9) | 0.78 (0.75-0.81) | 0.84 (0.83-0.85) | 0.51 (0.49-0.54) | 0.38 (0.36-0.41) | 0.97 (0.96-0.97) |
Echo | 0.90 (0.89-0.91) | 0.79 (0.76-0.82) | 0.84 (0.83-0.85) | 0.52 (0.49-0.55) | 0.39 (0.36-0.41) | 0.97 (0.96-0.97) |
AUC=area under the receiver operating characteristic curve; RHC=right heart catheterization; PPV=positive predictive value; NPV=negative predictive value.
Example ECGs from a PH patient and a non-PH patient show different characteristic ECG changes (Figure 3, A-B). We applied the LIME technique to the trained PH CNN to identify the ECG segments of greatest importance to the CNN's PH prediction for example ECGs from a PH and a non-PH patient (Figure 3, C-D)24. ECG segments well-recognized to be associated with PH, such as the large R-wave in lead V1, findings consistent with right ventricular hypertrophy, right axis deviation, and RV strain pattern (ST depression and T wave inversions in precordial and inferior leads, particularly in lead III), were all weighted heavily in the CNN PH prediction.
We then examined how the CNN would perform in predicting PH prior to the date of clinical PH diagnosis. For PH patients in the test dataset, we identified additional ECGs prior to the date of their PH diagnosis, grouped into intervals of 3–6 months from the date of diagnosis. The trained PH model was then applied to these ECGs, obtained up to 2 years prior to diagnosis, to examine the model’s ability to predict PH before the date of actual PH diagnosis (Figure 4). AUC was highest using ECGs from the time of diagnosis; it remained above 0.80 up to 1 year prior to diagnosis and was at least 0.79 up to 2 years prior to diagnosis. Figure 5 shows examples of ECGs from a PH patient around the time of PH diagnosis and 2 years prior to diagnosis, neither of which shows characteristic changes associated with longstanding PH.
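The pre-diagnosis analysis can be pictured as binning each earlier ECG by its time before diagnosis and computing an AUC per bin. The sketch below is only illustrative: the bin edges are hypothetical (the text states only that intervals were 3 to 6 months wide), and the pairwise AUC shown is the standard rank-based definition rather than necessarily the routine used in the study.

```python
from bisect import bisect_right

# Hypothetical bin edges in months before diagnosis.
BIN_EDGES_MONTHS = [0, 3, 6, 12, 18, 24]


def bin_label(months_before: float):
    """Return the interval label for an ECG, or None if outside 0-24 months."""
    if months_before < 0 or months_before >= BIN_EDGES_MONTHS[-1]:
        return None
    i = bisect_right(BIN_EDGES_MONTHS, months_before) - 1
    return f"{BIN_EDGES_MONTHS[i]}-{BIN_EDGES_MONTHS[i + 1]} months"


def auc(pos_scores, neg_scores):
    """AUC as the probability that a PH score exceeds a control score."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half
    return wins / (len(pos_scores) * len(neg_scores))
```

Calling `auc` once per bin, with the model scores of the PH patients' ECGs in that bin as `pos_scores`, would yield a curve of AUC against time before diagnosis like the one summarized in Figure 4.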
Figure 4.
Model performance to detect PH using ECGs obtained prior to date of clinical PH diagnosis. AUC, area under the receiver operating characteristic curve; PH, pulmonary hypertension.
Figure 5.
Example ECGs from the same patient 24 months prior to diagnosis (top) and at the time of diagnosis (bottom). Note that no major ECG changes suggestive of PH are visible globally in either example. ECG, electrocardiogram; PH, pulmonary hypertension.
Detecting Pulmonary Hypertension Sub-types
We applied the CNN, which was trained to detect broad PH, to test its ability to detect clinically relevant PH subtypes. The model achieved an AUC, sensitivity, and specificity of 0.91, 0.83, and 0.84, respectively, for pre-capillary PH patients defined by RHC criteria (Table 2). Adding medication criteria to RHC hemodynamic parameters, we identified patients with Group 1 PH. The model achieved an AUC, sensitivity, and specificity of 0.88, 0.81, and 0.81, respectively, for Group 1 PH patients. Finally, ICD code data were added to RHC hemodynamic parameters to identify patients with Group 3 PH. The model achieved an AUC, sensitivity, and specificity of 0.80, 0.73, and 0.76, respectively, for Group 3 PH patients (Figures 1-2, Table 2).
Remote ECG monitoring devices frequently capture fewer than 12 ECG leads; therefore, we also examined CNN performance in detecting overall PH using 1 or 2 ECG leads. Discrimination for PH remained high using fewer leads, with only mildly lower AUCs compared to CNNs using 12 ECG leads (Table 4).
Table 4.
Performance of CNN to detect PH with fewer than 12 ECG leads.
Model input | AUC | Sensitivity | Specificity | PPV | NPV |
---|---|---|---|---|---|
Detecting PH using 2 ECG leads (leads I and V2) | 0.84 (0.83-0.86) | 0.78 (0.75-0.81) | 0.76 (0.74-0.77) | 0.45 (0.42-0.48) | 0.93 (0.92-0.94) |
Detecting PH using ECG lead V2 | 0.83 (0.82-0.85) | 0.72 (0.68-0.75) | 0.8 (0.78-0.81) | 0.47 (0.44-0.5) | 0.92 (0.91-0.93) |
Detecting PH using ECG lead I | 0.81 (0.79-0.83) | 0.77 (0.73-0.8) | 0.7 (0.68-0.71) | 0.39 (0.37-0.42) | 0.92 (0.91-0.93) |
AUC=area under the receiver operating characteristic curve; PH=pulmonary hypertension; PPV=positive predictive value; NPV=negative predictive value.
DISCUSSION
Deep learning-based analysis of standard 12-lead ECGs can detect PH with high overall performance (AUC 0.89). Importantly, when our algorithm was applied to ECGs up to 2 years prior to clinical PH diagnosis, it performed well with an AUC of ≥0.79, demonstrating the broad potential to detect PH with ECGs much earlier than occurs through the current clinical standard of care. In addition, the model was able to identify PH in patients with pre-capillary PH, WHO Group 1 PH and WHO Group 3 PH. Accuracy to detect PH was only slightly decreased when data were limited to 1 or 2 ECG leads, as are often obtained from remote ECG monitoring devices, further expanding the clinical settings in which ECG-based PH detection can be deployed. This approach has the potential to reduce diagnostic delay in PH by adding a readily accessible, inexpensive, and relatively precise test to the currently available tools for screening.
We anticipate our ECG algorithm may be best suited for patients in specific high-risk clinical settings such as cardiology or pulmonology clinics, or for patients with unexplained dyspnea (e.g., from primary care or specialty settings). Positive predictive values for any algorithm increase with disease prevalence,27 so our algorithm may be better applied in higher-risk settings than in the general population. We demonstrated the comparative PPVs observed when PH prevalence was 20% (PPV: 0.56) vs. 1% (PPV: 0.05). In any scenario, the algorithm’s prediction would be considered by the treating physician alongside all other clinical signs and clinical data before a decision was made on next steps, to avoid unnecessary testing. Importantly, NPVs were 0.94 (20% prevalence) and 1.00 (1% prevalence), indicating the algorithm could correctly exclude patients from further unnecessary tests in most cases. Our algorithm could be applied in databases from large healthcare systems or specialty clinics to identify patients with PH from existing ECG data, as well as being applied prospectively, in “real-time”, as ECGs are being collected in multiple settings. Further, the ability to detect PH using fewer than 12 ECG leads greatly expands the potential impact of this CNN-based approach beyond standard clinical encounters where 12-lead ECGs are obtained, to include remote and ambulatory ECG monitoring settings that are used with increasing frequency.
Importantly, we also demonstrated that our algorithm can detect PH from ECGs obtained up to at least 2 years prior to the date of clinical diagnosis. This raises the possibility to achieve improved detection of existing but undiagnosed PH, earlier than the present standard of care. Currently, PH patients often present very symptomatic (WHO functional class III/IV) at the time of diagnosis,28, 29 representing an unmet need for improved screening. With the accessibility of ECG in most clinical settings, it is plausible that this algorithm could be used in primary care or resource-constrained settings and could potentially lead to more timely echocardiographic assessment, diagnosis, and specialist referrals. The unmet need and impact of improving the PH diagnostic workflow is evident from registry data, which show that a long delay in diagnosis is associated with a higher mortality risk in PAH.4
Kwon and colleagues also reported a deep learning model that used both demographic information and 12- or single-lead ECG data from two Korean hospitals (N=14,039). Their PH definitions relied solely on echocardiogram criteria, which are less rigorous, and they did not examine PH subtypes; they reported AUC, sensitivity and specificity of 0.90, 0.80, and 0.84, respectively, for the detection of PH.30 In comparison, our study corroborates their primary finding with echocardiogram data alone while using more stringent definitions of PH, including PH patients confirmed by RHC. We further demonstrated that our model was able to detect subtypes of PH, including pre-capillary PH, Group 1 PH, and Group 3 PH, for which there are commercially available treatments.
A systematic review and meta-analysis of 27 studies involving a total of 4386 patients with suspected PH who underwent RHC and echocardiogram found that echocardiographic estimation of systolic pulmonary artery pressure could predict PH with a pooled AUC, sensitivity and specificity of 0.88, 0.85, and 0.74, respectively.31 Four studies using tricuspid regurgitation pressure gradient obtained a pooled AUC, sensitivity and specificity of 0.85, 0.75, and 0.81, respectively.31 This demonstrates that the predictive accuracy of echocardiographic assessment of PH is generally acceptable, although sensitivity and specificity can vary depending on the measurements used. However, echocardiography requires specialist expertise, is associated with high inter-operator variability,12 and does not reliably lead to timely referrals.32, 33 For example, a study assessed a random sample of 500 echocardiographic reports of patients with estimated right ventricular systolic pressure >40 mm Hg between 2006 and 2014 (21% of which were >60 mm Hg) and found that only 31% mentioned PH in their report summaries and only 4.6% of patients were referred to the PH clinic.32 Furthermore, a study of patients who had an echocardiogram within 2 days of RHC found that 214 (47%) of the 459 RHC-confirmed PH patients did not have a measurable TRV on their echocardiogram.34 Moreover, access to echocardiography varies between and within countries35-37 due to socioeconomic differences and/or differences in local practice guidelines or policies. The use of echocardiograms has increased in recent years but a US study showed that echocardiograms remain underused during critical cardiovascular hospitalizations.36
Several groups have employed deep learning models to enable or enhance detection of cardiovascular conditions by ECG.38 We previously demonstrated that such deep learning analysis of 12-lead ECG can not only identify numerous diagnoses commonly associated with ECGs,39 but also diseases and conditions not typically diagnosed by ECG including hypertrophic cardiomyopathy (HCM), amyloidosis or even diastolic dysfunction with AUCs of 0.91, 0.86 and 0.84, respectively.18 Though ECGs do not currently play a central role in their clinical diagnosis, models could discriminate each disease fairly well and may help lead patients to more confirmatory testing. Others have subsequently corroborated these results for HCM in other populations40 and for other conditions like LV systolic dysfunction.41 This present study contributes to this body of literature to further establish that ECGs contain much more information than is presently appreciated by physician interpretation. Literature describing deployment of AI algorithms for screening and diagnosis in clinical practice so far is limited, but holds promise.38 The EAGLE trial demonstrated that an ECG model to detect low left ventricular ejection fraction increased diagnosis compared with usual care (2.1% vs. 1.6%, OR [95% CI]: 1.32 [1.01; 1.61], P=0.007) in a routine primary care setting in the US.42 Furthermore, the US Food and Drug Administration issued an Emergency Use Authorization for this model in patients with COVID-19.43
This study has some limitations. First, while we used gold-standard RHC-based PH cohort definitions when possible, we also used echocardiographic TRV criteria for PH to maximize sample size for algorithm training. Patients who had an intermediate probability of having PH based on their TRV measurement were excluded (unless they also had an mPAP measurement available) as they could not be robustly classified as PH-positive or PH-negative. In addition, WHO Group 1 PH patients were identified from RHC-confirmed PH patients who also had physician-prescribed orders of PAH-specific medications. While prescription of these PAH-specific medications is standardized at UCSF, it is possible that they could have been prescribed off-label for patients with other PH etiologies.3 Group 3 PH patients were identified based on qualifying ICD codes, though there are no Group 3 PH-specific ICD codes.44 In all three cases above, patients could have been misclassified, though this misclassification would be expected to bias results towards the null. Second, deep learning models should be tested in different ethnicities and races.45 While it is a strength that our study population was 50% female and at least 27.6% non-White, it will be important for future work to include validation in diverse populations. Third, positive predictive values, which depend strongly on disease prevalence, were 0.56 (at 20% prevalence of PH) and 0.05 (at 1% prevalence of PH), indicating that application of this algorithm to a general patient population would potentially result in a relatively high number of false positives and unnecessary tests. However, this algorithm is intended for patients who are at risk of having PH (e.g. those with chronic unexplained dyspnea) and the algorithm’s prediction would be considered alongside all the other clinical signs and data available to the treating physician before a decision was made on next steps.
Future ECG workflows could integrate algorithms such as ours into standard automated interpretation. The key strengths of this study are the application of a deep learning approach to a test that is readily accessible to primary care physicians and the use of data from a large population (N=24,470). Future research will investigate how this approach could be incorporated into the clinical workflow.
INTERPRETATION
This deep learning ECG algorithm can detect PH and performs well in identifying rare and treatable PH subtypes. It can also detect PH using ECGs obtained up to 2 years before RHC or echocardiogram diagnosis, demonstrating that it has the potential to substantially reduce the diagnostic delay in PH and help ensure timely treatment of this life-threatening condition.
Supplementary Material
Figure S1. Visual summary of study design and findings. AUC, area under the curve; mPAP, mean pulmonary artery pressure; RHC, right heart catheterization; TRV, tricuspid regurgitation velocity; Group 1 PH, pulmonary arterial hypertension; Group 3 PH, PH due to lung disease and/or hypoxia
Figure S2. Confusion matrices (two-by-two tables) showing the predictions by the primary model for the presence of PH compared with the 'ground truth' PH diagnosis by RHC and/or echocardiogram in (A) PH patients (prevalence of 20%), (B) pre-capillary patients, (C) Group 1 PH (PAH) patients, and (D) Group 3 PH patients. Panel (E) outlines how the performance metrics are calculated from these tables. NPV, negative predictive value; PAH, pulmonary arterial hypertension; PH, pulmonary hypertension; PPV, positive predictive value; RHC, right heart catheterization
This work describes the development and evaluation of a deep neural network “artificial intelligence” algorithm that shows high accuracy in detecting pulmonary hypertension and several subtypes from the 12-lead ECG alone.
It shows that pulmonary hypertension can be detected with this approach up to 2 years before the clinical diagnosis was made in the standard clinical workflow.
This provides a new, widely accessible, non-invasive approach to detecting pulmonary hypertension in higher-risk populations among patients who can benefit from directed treatments.
Acknowledgements:
GHT had full access to all of the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis. GHT, CB, DQ contributed to study design. SA, HM, LR, JB contributed to data analysis; MAA, LK, NM, BR, JB, CC, EK, XG, AN, DQ, CB, AJB, JEO, GHT contributed to interpretation of results. MAA, SA, HM, LR, LK, NM, BR, JB, CC, EK, XG, AN, DQ, CB, AJB, JEO, GHT contributed to writing of the manuscript. Janssen Pharmaceutical Companies of Johnson & Johnson provided funding for this work.
Funding Information
This work was partially supported by National Institutes of Health NHLBI K23HL135274 (Dr. Tison). This research also received funding support from Janssen Pharmaceutical Companies of Johnson & Johnson. Medical writing and editorial support, under the direction of the authors, were provided by Victoria Atess, PhD, and Clare Lowe, BSocSc (Ashfield MedComms, an Ashfield Health company), funded by Actelion Pharmaceuticals Ltd, a Janssen Pharmaceutical Company of Johnson & Johnson.
Nonstandard Abbreviations and Acronyms
- CNN: convolutional neural network
- ECG: electrocardiogram
- HCM: hypertrophic cardiomyopathy
- mPAP: mean pulmonary arterial pressure
- NPV: negative predictive value
- PAH: pulmonary arterial hypertension
- PH: pulmonary hypertension
- PPV: positive predictive value
- RHC: right heart catheterization
- TRV: tricuspid regurgitation velocity
Footnotes
Disclosures
Mandar A. Aras, Sean Abreau, Hunter Mills, Lakshmi Radhakrishnan, Liviu Klein, Neha Mantri, Benjamin Rubin, Joshua Barrios and Jeffrey E. Olgin have no other potential conflicts to disclose. Christel Chehoud, Emily Kogan, Xavier Gitton, Anderson Nnewihe, Deborah Quinn, and Charles Bridges are employees of Janssen Pharmaceutical Companies of Johnson & Johnson and own shares in the company. Atul Butte is a co-founder and consultant to Personalis and NuMedii; consultant to Samsung, Mango Tree Corporation, and in the recent past, 10x Genomics, Helix, Pathway Genomics, and Verinata (Illumina); has served on paid advisory panels or boards for Geisinger Health, Regenstrief Institute, Gerson Lehman Group, AlphaSights, Covance, Novartis, Genentech, Merck, and Roche; is a shareholder in Personalis and NuMedii; is a minor shareholder in Apple, Facebook, Alphabet (Google), Microsoft, Amazon, Snap, Snowflake, 10x Genomics, Illumina, Nuna Health, Assay Depot (Scientist.com), Vet24seven, Regeneron, Sanofi, Royalty Pharma, Pfizer, BioNTech, AstraZeneca, Moderna, Biogen, Twist Bioscience, Pacific Biosciences, Editas Medicine, Invitae, and Sutro, and several other non-health related companies and mutual funds; and has received honoraria and travel reimbursement for invited talks from Johnson and Johnson, Roche, Genentech, Pfizer, Merck, Lilly, Takeda, Varian, Mars, Siemens, Optum, Abbott, Celgene, AstraZeneca, AbbVie, Westat, several investment and venture capital firms, and many academic institutions, medical or disease specific foundations and associations, and health systems. Atul Butte receives royalty payments through Stanford University, for several patents and other disclosures licensed to NuMedii and Personalis. 
Atul Butte’s research has been funded by NIH, Northrop Grumman (as the prime on an NIH contract), Genentech, Johnson and Johnson, FDA, Robert Wood Johnson Foundation, Leon Lowenstein Foundation, Intervalien Foundation, Priscilla Chan and Mark Zuckerberg, the Barbara and Gerson Bakar Foundation, and in the recent past, the March of Dimes, Juvenile Diabetes Research Foundation, California Governor’s Office of Planning and Research, California Institute for Regenerative Medicine, L’Oreal, and Progenity. Geoffrey Tison has received research grants from Myokardia, General Electric and Janssen Pharmaceutical Companies of Johnson & Johnson.
REFERENCES
- 1. Simonneau G, Montani D, Celermajer DS, Denton CP, Gatzoulis MA, Krowka M, Williams PG and Souza R. Haemodynamic definitions and updated clinical classification of pulmonary hypertension. The European respiratory journal. 2019;53:1801913.
- 2. Benza RL, Miller DP, Barst RJ, Badesch DB, Frost AE and McGoon MD. An evaluation of long-term survival from time of diagnosis in pulmonary arterial hypertension from the REVEAL Registry. Chest. 2012;142:448–456.
- 3. Gall H, Felix JF, Schneck FK, Milger K, Sommer N, Voswinckel R, Franco OH, Hofman A, Schermuly RT, Weissmann N, Grimminger F, Seeger W and Ghofrani HA. The Giessen Pulmonary Hypertension Registry: Survival in pulmonary hypertension subgroups. The Journal of Heart and Lung Transplantation. 2017;36:957–967.
- 4. Khou V, Anderson JJ, Strange G, Corrigan C, Collins N, Celermajer DS, Dwyer N, Feenstra J, Horrigan M, Keating D, Kotlyar E, Lavender M, McWilliams TJ, Steele P, Weintraub R, Whitford H, Whyte K, Williams TJ, Wrobel JP, Keogh A and Lau EM. Diagnostic delay in pulmonary arterial hypertension: Insights from the Australian and New Zealand pulmonary hypertension registry. 2020;25:863–871.
- 5. Strange G, Gabbay E, Kermeen F, Williams T, Carrington M, Stewart S and Keogh A. Time from symptoms to definitive diagnosis of idiopathic pulmonary arterial hypertension: The delay study. Pulmonary circulation. 2013;3:89–94.
- 6. Galiè N, Humbert M, Vachiery J-L, Gibbs S, Lang I, Torbicki A, Simonneau G, Peacock A, Vonk Noordegraaf A, Beghetti M, Ghofrani A, Gomez Sanchez MA, Hansmann G, Klepetko W, Lancellotti P, Matucci M, McDonagh T, Pierard LA, Trindade PT, Zompatori M and Hoeper M. 2015 ESC/ERS Guidelines for the diagnosis and treatment of pulmonary hypertension. The Joint Task Force for the Diagnosis and Treatment of Pulmonary Hypertension of the European Society of Cardiology (ESC) and the European Respiratory Society (ERS). Endorsed by: Association for European Paediatric and Congenital Cardiology (AEPC), International Society for Heart and Lung Transplantation (ISHLT). 2015;46:903–975.
- 7. Kim MH, Johnston SS, Chu BC, Dalal MR and Schulman KL. Estimation of total incremental health care costs in patients with atrial fibrillation in the United States. Circ Cardiovasc Qual Outcomes. 2011;4:313–20.
- 8. Wang TJ, Larson MG, Levy D, Vasan RS, Leip EP, Wolf PA, D'Agostino RB, Murabito JM, Kannel WB and Benjamin EJ. Temporal relations of atrial fibrillation and congestive heart failure and their joint influence on mortality: the Framingham Heart Study. Circulation. 2003;107:2920–5.
- 9. Armstrong I, Billings C, Kiely DG, Yorke J, Harries C, Clayton S and Gin-Sing W. The patient experience of pulmonary hypertension: a large cross-sectional study of UK patients. BMC Pulmonary Medicine. 2019;19:67.
- 10. Brown LM, Chen H, Halpern S, Taichman D, McGoon MD, Farber HW, Frost AE, Liou TG, Turner M, Feldkircher K, Miller DP and Elliott CG. Delay in recognition of pulmonary arterial hypertension: factors identified from the REVEAL Registry. Chest. 2011;140:19–26.
- 11. Kiely DG, Lawrie A and Humbert M. Screening strategies for pulmonary arterial hypertension. European Heart Journal Supplements. 2019;21:K9–K20.
- 12. Ollivier C, Sun H, Amchin W, Beghetti M, Berger RMF, Breitenstein S, Garnett C, Gullberg N, Hassel P, Ivy D, Kawut SM, Klein A, Lesage C, Migdal M, Nije B, Odermarsky M, Strait J, de Graeff PA and Stockbridge N. New Strategies for the Conduct of Clinical Trials in Pediatric Pulmonary Arterial Hypertension: Outcome of a Multistakeholder Meeting With Patients, Academia, Industry, and Regulators, Held at the European Medicines Agency on Monday, June 12, 2017. Journal of the American Heart Association. 2019;8:e011306.
- 13. Sobczyk D, Nycz K and Andruszkiewicz P. Validity of a 5-minute focused echocardiography with A-F mnemonic performed by non-echocardiographers in the management of patients with acute chest pain. Cardiovascular ultrasound. 2015;13:16.
- 14. Rui P and Okeyode T. National Ambulatory Medical Care Survey: 2016 National Summary Tables. Available from: https://www.cdc.gov/nchs/data/ahcd/namcs_summary/2016_namcs_web_tables.pdf. 2016.
- 15. Bhatia RS, Bouck Z, Ivers NM, Mecredy G, Singh J, Pendrith C, Ko DT, Martin D, Wijeysundera HC, Tu JV, Wilson L, Wintemute K, Dorian P, Tepper J, Austin PC, Glazier RH and Levinson W. Electrocardiograms in Low-Risk Patients Undergoing an Annual Health Examination. JAMA internal medicine. 2017;177:1326–1333.
- 16. Pitts SR, Niska RW, Xu J and Burt CW. National Hospital Ambulatory Medical Care Survey: 2006 emergency department summary. National Health Statistics Report. 2008:1–38.
- 17. Bossone E, Paciocco G, Iarussi D, Agretto A, Iacono A, Gillespie BW and Rubenfire M. The prognostic role of ECG in primary pulmonary hypertension. Chest. 2002;121:513–518.
- 18. Tison GH, Zhang J, Delling FN and Deo RC. Automated and Interpretable Patient ECG Profiles for Disease Detection, Tracking, and Discovery. Circulation: Cardiovascular Quality and Outcomes. 2019;12:e005289.
- 19. He K, Zhang X, Ren S and Sun J. Deep Residual Learning for Image Recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). 2016:770–778.
- 20. Hannun AY, Rajpurkar P, Haghpanahi M, Tison GH, Bourn C, Turakhia MP and Ng AY. Cardiologist-level arrhythmia detection and classification in ambulatory electrocardiograms using a deep neural network. Nature Medicine. 2019;25:65–69.
- 21. Nair V and Hinton GE. Rectified linear units improve restricted boltzmann machines. Paper presented at: Proceedings of the 27th International Conference on International Machine Learning; 2010; Haifa, Israel.
- 22. Ioffe S and Szegedy C. Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift. Proceedings of the 32nd International Conference on Machine Learning. 2015;37:448–456.
- 23. Kingma DP and Ba JL. Adam: A method for stochastic optimization. 3rd International Conference on Learning Representations, ICLR. 2015:1–15.
- 24. Ribeiro MT, Singh S and Guestrin C. “Why should I trust you?” explaining the predictions of any classifier. In: KDD ’16: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. 2016:1135–1144. doi: 10.1145/2939672.2939778.
- 25. Saito T and Rehmsmeier M. The Precision-Recall Plot Is More Informative than the ROC Plot When Evaluating Binary Classifiers on Imbalanced Datasets. PLOS ONE. 2015;10:e0118432.
- 26. Hoeper MM, Humbert M, Souza R, Idrees M, Kawut SM, Sliwa-Hahnle K, Jing Z-C and Gibbs JSR. A global view of pulmonary hypertension. The Lancet Respiratory Medicine. 2016;4:306–322.
- 27. Parikh R, Mathai A, Parikh S, Chandra Sekhar G and Thomas R. Understanding and using sensitivity, specificity and predictive values. Indian journal of ophthalmology. 2008;56:45–50.
- 28. Chazova IY, Martynyuk TV, Valieva ZS, Gratsianskaya SY, Aleevskaya AM, Zorin AV and Nakonechnikov SN. Clinical and Instrumental Characteristics of Newly Diagnosed Patients with Various Forms of Pulmonary Hypertension according to the Russian National Registry. BioMed research international. 2020;2020:6836973.
- 29. Escribano-Subias P, Blanco I, López-Meseguer M, Lopez-Guarch CJ, Roman A, Morales P, Castillo-Palma MJ, Segovia J, Gómez-Sanchez MA and Barberà JA. Survival in pulmonary hypertension in Spain: insights from the Spanish registry. The European respiratory journal. 2012;40:596–603.
- 30. Kwon J-m, Kim K-H, Medina-Inojosa J, Jeon K-H, Park J and Oh B-H. Artificial intelligence for early prediction of pulmonary hypertension using electrocardiography. The Journal of Heart and Lung Transplantation. 2020;39:805–814.
- 31. Ni JR, Yan PJ, Liu SD, Hu Y, Yang KH, Song B and Lei JQ. Diagnostic accuracy of transthoracic echocardiography for pulmonary hypertension: a systematic review and meta-analysis. BMJ open. 2019;9:e033084.
- 32. Choi E, Brown RE, Sullivan MJ and Andrus BW. Echocardiography reporting of pulmonary hypertension and subsequent referral to a specialty clinic. Echocardiography. 2020;37:8–13.
- 33. Kanwar MK, Tedford RJ, Thenappan T, Marco TD, Park M and McLaughlin V. Elevated Pulmonary Pressure Noted on Echocardiogram: A Simplified Approach to Next Steps. Journal of the American Heart Association. 2021;10:e017684.
- 34. O'Leary JM, Assad TR, Xu M, Farber-Eger E, Wells QS, Hemnes AR and Brittain EL. Lack of a Tricuspid Regurgitation Doppler Signal and Pulmonary Hypertension by Invasive Measurement. Journal of the American Heart Association. 2018;7.
- 35. Munt B, O'Neill BJ, Koilpillai C, Gin K, Jue J and Honos G. Treating the right patient at the right time: Access to echocardiography in Canada. Canadian Journal of Cardiology. 2006;22:1029–33.
- 36. Papolos A, Narula J, Bavishi C, Chaudhry FA and Sengupta PP. U.S. Hospital Use of Echocardiography: Insights From the Nationwide Inpatient Sample. Journal of the American College of Cardiology. 2016;67:502–511.
- 37. van Gurp N, Boonman-De Winter LJ, Meijer Timmerman Thijssen DW and Stoffers HE. Benefits of an open access echocardiography service: a Dutch prospective cohort study. Netherlands Heart Journal. 2013;21:399–405.
- 38. Siontis KC, Noseworthy PA, Attia ZI and Friedman PA. Artificial intelligence-enhanced electrocardiography in cardiovascular disease management. Nature Reviews Cardiology. 2021.
- 39. Hughes JW, Olgin JE, Avram R, Abreau SA, Sittler T, Radia K, Hsia H, Walters T, Lee B, Gonzalez JE and Tison GH. Performance of a Convolutional Neural Network and Explainability Technique for 12-Lead Electrocardiogram Interpretation. JAMA cardiology. 2021.
- 40. Ko WY, Siontis KC, Attia ZI, Carter RE, Kapa S, Ommen SR, Demuth SJ, Ackerman MJ, Gersh BJ, Arruda-Olson AM, Geske JB, Asirvatham SJ, Lopez-Jimenez F, Nishimura RA, Friedman PA and Noseworthy PA. Detection of Hypertrophic Cardiomyopathy Using a Convolutional Neural Network-Enabled Electrocardiogram. Journal of the American College of Cardiology. 2020;75:722–733.
- 41. Attia ZI, Kapa S, Lopez-Jimenez F, McKie PM, Ladewig DJ, Satam G, Pellikka PA, Enriquez-Sarano M, Noseworthy PA, Munger TM, Asirvatham SJ, Scott CG, Carter RE and Friedman PA. Screening for cardiac contractile dysfunction using an artificial intelligence–enabled electrocardiogram. Nature Medicine. 2019;25:70–74.
- 42. Yao X, Rushlow DR, Inselman JW, McCoy RG, Thacher TD, Behnken EM, Bernard ME, Rosas SL, Akfaly A, Misra A, Molling PE, Krien JS, Foss RM, Barry BA, Siontis KC, Kapa S, Pellikka PA, Lopez-Jimenez F, Attia ZI, Shah ND, Friedman PA and Noseworthy PA. Artificial intelligence–enabled electrocardiograms for identification of patients with low ejection fraction: a pragmatic, randomized clinical trial. Nature Medicine. 2021;27:815–819.
- 43. Rocken C, Peters B, Juenemann G, Saeger W, Klein HU, Huth C, Roessner A and Goette A. Atrial amyloidosis: an arrhythmogenic substrate for persistent atrial fibrillation. Circulation. 2002;106:2091–7.
- 44. Mathai SC and Mathew S. Breathing (and Coding?) a Bit Easier: Changes to International Classification of Disease Coding for Pulmonary Hypertension. Chest. 2018;154:207–218.
- 45. Noseworthy PA, Attia ZI, Brewer LC, Hayes SN, Yao X, Kapa S, Friedman PA and Lopez-Jimenez F. Assessing and Mitigating Bias in Medical Artificial Intelligence: The Effects of Race and Ethnicity on a Deep Learning Model for ECG Analysis. Circulation: Arrhythmia and Electrophysiology. 2020;13:e007988.