JACC Asia. 2022 May 17;2(3):258–270. doi: 10.1016/j.jacasi.2022.02.008

Artificial Intelligence-Enabled Electrocardiogram Improves the Diagnosis and Prediction of Mortality in Patients With Pulmonary Hypertension

Chih-Min Liu a,b, Edward SC Shih c, Jhih-Yu Chen c, Chih-Han Huang c,d, I-Chien Wu e, Pei-Fen Chen e, Satoshi Higa f, Nobumori Yagi g, Yu-Feng Hu a,b,c, Ming-Jing Hwang c,d,∗∗, Shih-Ann Chen a,b,h
PMCID: PMC9627911  PMID: 36338407

Abstract

Background

Pulmonary hypertension is a disabling and life-threatening cardiovascular disease. Early detection of elevated pulmonary artery pressure (ePAP) is needed for prompt diagnosis and treatment to avoid detrimental consequences of pulmonary hypertension.

Objectives

This study sought to develop an artificial intelligence (AI)-enabled electrocardiogram (ECG) model to identify patients with ePAP and related prognostic implications.

Methods

From a hospital-based ECG database, the authors extracted the first pairs of ECG and transthoracic echocardiography taken within 2 weeks of each other from 41,097 patients to develop an AI model for detecting ePAP (PAP > 50 mm Hg by transthoracic echocardiography). The model was evaluated on independent data sets, including an external cohort of patients from Japan.

Results

Tests of 10-fold cross-validation neural-network deep learning showed that the area under the receiver-operating characteristic curve of the AI model was 0.88 (sensitivity 81.0%; specificity 79.6%) for detecting ePAP. The diagnostic performance was consistent across age, sex, and various comorbidities (diagnostic odds ratio >8 for most factors examined). At 6-year follow-up, the patients predicted by the AI model to have ePAP were independently associated with higher cardiovascular mortality (HR: 3.69). Similar diagnostic performance and prediction for cardiovascular mortality could be replicated in the external cohort.

Conclusions

The ECG-based AI model identified patients with ePAP and predicted their future risk for cardiovascular mortality. This model could serve as a useful clinical test to identify patients with pulmonary hypertension so that treatment can be initiated early to improve their survival prognosis.

Key words: all-cause mortality, artificial intelligence, cardiovascular mortality, deep learning, electrocardiogram, pulmonary hypertension

Abbreviations and Acronyms: AI, artificial intelligence; AIC, Akaike Information Criterion; AUC, area under the curve; ECG, electrocardiogram; ePAP, elevated pulmonary artery pressure; PAH, pulmonary arterial hypertension; PAP, pulmonary artery pressure; PH, pulmonary hypertension; TTE, transthoracic echocardiography

Pulmonary hypertension (PH), a condition of elevated pulmonary artery pressure (ePAP), affects more than a million individuals worldwide and causes premature disability, heart failure, and death.1 Left heart and lung diseases are the most common etiologies of PH.2,3 The presence of PH can increase the 5-year mortality rate to more than 30% if left untreated.4, 5, 6 Transthoracic echocardiography (TTE) is recommended for measuring pulmonary artery pressure (PAP) via the estimation of peak tricuspid regurgitation velocity. However, TTE is highly operator-dependent and requires a good acoustic window and flow tracings for correct PAP measurements. As such, delays of up to 2 years between the onset of symptoms and diagnosis of ePAP are frequently observed.7 Alternative tests, such as pulmonary function, serum N-terminal pro–B-type natriuretic peptide, and uric acid level tests, have been developed to increase the sensitivity for the early diagnosis of ePAP.8,9 Although their sensitivity remains unsatisfactory, ranging from 55.9% to 71.0%, these tests can identify patients with disease progression and future mortality, which is a key requirement for a qualified clinical test.10,11 Diagnosing PH at a stage when it may be more amenable to treatment is important. However, the current screening approaches for PH are resource-intensive, and population-based screening remains unavailable.12

Recently, researchers have successfully trained artificial intelligence (AI) models to correlate electrocardiogram (ECG) signals to echocardiographic phenotypes and automatically screen for cardiac structural derangement (eg, cardiac contractile dysfunction, hypertrophy).13,14 Using the same approach, a semiautomatic AI algorithm was recently created to detect ePAP,15 which requires a variety of clinical (eg, age, sex, height, weight, body mass index) and ECG (eg, P axis, PR interval, QRS duration, QT interval, presence of atrial fibrillation) characteristics, as well as predetermined arrhythmias from medical records or physician diagnoses. Because numerous predetermined clinical and ECG parameters are still needed, this approach is not expected to reduce the clinical workload.

The assessment of associations between a medical test or biomarker and relevant clinical outcomes is a prerequisite for a qualified diagnostic test. Although automatic detection of cardiac abnormality by AI is an encouraging new technology,13 AI algorithms have not been correlated to cardiovascular outcomes until very recently.16 In the present study, we aimed to develop a completely automated AI-enabled ECG model that can detect ePAP and identify patients at risk of cardiovascular disease mortality. This qualified test could enable early diagnosis, risk assessment, and therapeutic intervention in patients with ePAP or suspected PH.

Methods

In our study, ePAP was defined as PAP > 50 mm Hg by TTE, which indicates a high probability of PH according to established guidelines.17,18 A reproducibility analysis of the TTE measurements in the database we used has been reported,19,20 which showed that interobserver variability (4.4% to 6.4%) was only slightly higher than intraobserver variability (4.0% to 4.8%). From the TTE data, PAP was estimated as 4 × (peak tricuspid regurgitation velocity)² + estimated right atrial pressure.17 Deep learning AI models were trained to identify patients with ePAP. The design of the AI model is shown in Supplemental Figure S1 and the Supplemental Methods. The overall process is illustrated in the Central Illustration.
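As a minimal sketch of the PAP estimate and the ePAP label described above, the calculation can be written as follows. The function names are hypothetical, and the units (velocity in m/s, pressures in mm Hg) follow the usual convention for the simplified Bernoulli equation rather than anything stated explicitly in the text.

```python
EPAP_THRESHOLD_MMHG = 50.0  # study definition: ePAP is PAP > 50 mm Hg


def estimate_pap_mmhg(tr_peak_velocity_m_s: float, ra_pressure_mmhg: float) -> float:
    """Estimate PAP as 4 * v^2 + estimated right atrial pressure.

    Assumes velocity in m/s and pressure in mm Hg (conventional units).
    """
    return 4.0 * tr_peak_velocity_m_s ** 2 + ra_pressure_mmhg


def is_epap(pap_mmhg: float, threshold_mmhg: float = EPAP_THRESHOLD_MMHG) -> bool:
    """Label a measurement as ePAP when it strictly exceeds the threshold."""
    return pap_mmhg > threshold_mmhg


# Illustrative inputs only, not study data.
pap_a = estimate_pap_mmhg(3.0, 10.0)  # 4 * 9.0 + 10 = 46 mm Hg
pap_b = estimate_pap_mmhg(3.4, 5.0)   # about 51.2 mm Hg
print(pap_a, is_epap(pap_a))
print(round(pap_b, 2), is_epap(pap_b))
```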

Central Illustration.

An Electrocardiogram-Enabled Artificial Intelligence Model for Detecting Elevated Pulmonary Arterial Hypertension and Assessing Risk of Cardiovascular Mortality

The schematic overview highlights the clinical implications of pulmonary hypertension addressed by the AI model. AI = artificial intelligence; AUC = area under the curve; ECG = electrocardiogram; ePAP = elevated pulmonary artery pressure.

Study population

A total of 63,767 patients with both ECGs (Philips Healthcare) and TTEs between 2010 and 2017 were obtained from the ECG database at Taipei Veterans General Hospital, Taiwan. Among those, 229,787 ECG-TTE pairs were performed within a 2-week interval. We selected the 2-week interval following a previous study design.13 The first such ECG-TTE pair for each patient (n = 41,097) was selected to form the main analysis data to create the AI model (Group 1). To examine whether the AI model could predict the long-term incidence of developing cardiac abnormality in patients whose hearts were initially diagnosed as normal, we retrieved a TTE follow-up cohort of 10,818 patients from Group 1. These patients received a follow-up TTE after the first ECG-TTE pair at a mean interval of 2.6 ± 1.7 years (IQR from first ECG-TTE pair: 1.2 to 3.7 years). This subset of Group 1 patients was designated as Group 1′. Another 18,373 ECG-TTE pairs had intervals of more than 2 weeks between the ECG and TTE. These patients (designated as Group 2) were not in the main analysis group (Group 1) and thus also not in the follow-up analysis group (Group 1′). Group 2 was used as an independent group for ancillary validation of the AI model for cardiovascular mortality. An external cohort of Japanese patients (Group 3), described in the following text, was used to further test the model. Figure 1 shows the study’s flow diagram.

Figure 1.

Flowchart of Study and Dataset Enrolment and Allocation

In this scheme, each patient’s data were placed in only 1 of the data sets to avoid cross-contamination among training, validation, and tests during the derivation and evaluation of the models. AI models were derived using only the first valid ECG-TTE pairs. The first such ECG-TTE pair for each patient (n = 41,097) was selected to form the main analysis data to create the AI model (Group 1). Group 1′ is a subset of Group 1. Group 2 and Group 3 are independent cohorts used to validate the AI model. AI = artificial intelligence; ECG = electrocardiogram; ePAP = elevated pulmonary artery pressure; TTE = transthoracic echocardiography.

Main analysis to create the AI model (Group 1)

Ten AI models were derived from 10-fold cross-validation neural network deep learning. In this AI learning, all ECG-TTE pairs from Group 1 were randomly assigned to 10 independent segments (8 to the training set, 1 to the validation set, and 1 to the test set). Because any given patient’s data were exclusively present in only 1 of the 10 segments, the data could not be used simultaneously for model training, validation, or testing. This 8-1-1 training, validation, and test data segmentation scheme yielded 32,875 to 32,879 ECG-TTE pairs in the training set, 4,109 to 4,111 pairs in the validation set, and 4,109 to 4,111 pairs in the test set for each of the 10 cross-validation models. We used a rolling data-segment selection scheme (Figure 1) to ensure all pairs in Group 1 were used once for validation and tested once in 1 of the 10 models, thereby avoiding data selection bias. Note the cross-validation procedure used here is not a conventional one because the validation set is for the selection of the best model instead of the best hyperparameters (Supplemental Methods).
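The rolling 8-1-1 segmentation can be sketched as below. This is an illustration under stated assumptions, not the authors' code: the function name is hypothetical, and the choice of the segment following the test segment as the validation segment is a guess, because the exact rotation order is given only in Figure 1.

```python
import random


def rolling_811_splits(patient_ids, n_segments=10, seed=0):
    """Shuffle patients into n_segments disjoint segments, then rotate so that
    each segment is the test set once and the validation set once."""
    ids = list(patient_ids)
    random.Random(seed).shuffle(ids)
    # Disjoint segments: each patient falls into exactly one of them.
    segments = [ids[i::n_segments] for i in range(n_segments)]

    splits = []
    for k in range(n_segments):
        test_idx = k
        val_idx = (k + 1) % n_segments  # assumed rotation order
        train = [
            pid
            for j, segment in enumerate(segments)
            if j not in (test_idx, val_idx)
            for pid in segment
        ]
        splits.append(
            {"train": train, "val": segments[val_idx], "test": segments[test_idx]}
        )
    return splits


# One ECG-TTE pair per patient, so integer ids stand in for the Group 1 pairs.
splits = rolling_811_splits(range(41097))
for k, split in enumerate(splits):
    print(k, len(split["train"]), len(split["val"]), len(split["test"]))
```

With 41,097 pairs, each fold has roughly 32,880 training pairs and about 4,110 pairs each for validation and testing, in line with the ranges reported above.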

Follow-up analysis for patients developing ePAP (Group 1′)

Group 1′ patients included 2 subgroups: AI-predicted non-ePAP patients and AI-predicted ePAP patients. AI-predicted non-ePAP patients were those predicted in the test sets as not having ePAP who also had a confirmatory PAP <50 mm Hg by TTE (n = 6,532). AI-predicted ePAP patients were those identified by AI as having ePAP but who had a contradictory PAP <50 mm Hg by TTE performed within 2 weeks of the ECG (n = 2,984). We followed both subgroups for the subsequent development of ePAP.

Ancillary analysis of patients whose ECG and TTE occurred more than 2 weeks apart (Group 2)

Group 2 patients had their TTEs more than 2 weeks after their initial ECGs. The mean interval from ECG to TTE was 2.6 ± 2.1 years (IQR: 0.7–4.1 years). For the ancillary test on Group 2 patients, we intentionally used the worst-performing AI model from the 10-fold cross-validation of the main (Group 1) analysis. Because these patients lacked ECG-TTE pairs within a 2-week interval, we applied the AI model only to evaluate its prognostic value for Group 2 patients.

Further validation of the AI model on a Japanese cohort (Group 3)

A total of 279 deidentified ECGs (114 PH and 165 non-PH patients confirmed by TTE within a 2-week interval of ECG-TTE pairs) with records of age, sex, and history of diseases, as well as follow-up of cardiovascular mortality and all-cause mortality, were randomly retrieved from a registry compiled between 2009 and 2020 at Makiminato Central Hospital and Nakagami Hospital, Japan. The 10-second ECGs were acquired from Cardiofax V ECG-2400 (Nihon Kohden Corporation) and analyzed by the same AI model tested on Group 2 without retraining.

The methods for the survival and sensitivity analyses, and for using conventional ECG characteristics to diagnose ePAP, are provided in the Supplemental Appendix.

AI model for ePAP

Our model was adapted with permission from the deep learning neural network model of our previous work designed for the detection and classification of cardiac arrhythmias (Supplemental Figure S1, Supplemental Methods).21

Ethical approval

This study was approved by the institutional review board at Taipei Veterans General Hospital, Taipei, Taiwan.

Statistical analysis

All analyses were performed using SAS statistical software version 9.4 (TS1M3 DBCS3170; SAS Institute). The statistical methods used are described in the Supplemental Appendix.

Results

Main analysis of diagnostic performance of the AI model for ePAP

The patients in the main analysis (Group 1) were 60.0 ± 18.6 years of age, and 6.9% of them were diagnosed as having ePAP by TTE. Table 1 shows the baseline clinical characteristics. The patients with ePAP were older, more often male, and had more comorbidities.

Table 1.

Basic Characteristics of the Study Population in Group 1 Patients

| Characteristic | Elevated PAP (n = 2,824) | Nonelevated PAP (n = 38,273) | Total (N = 41,097) | P Value |
|---|---|---|---|---|
| Age, y | 73.0 ± 14.9 | 59.0 ± 18.4 | 60.0 ± 18.6 | <0.001 |
| Male | 1,639 (58.0) | 18,383 (48.0) | 20,022 (48.7) | <0.001 |
| Hypertension | 1,628 (57.7) | 15,716 (41.1) | 17,344 (42.2) | <0.001 |
| Diabetes mellitus | 827 (29.3) | 6,189 (16.2) | 7,016 (17.1) | <0.001 |
| Congestive heart failure | 1,272 (45.0) | 4,411 (11.5) | 5,683 (13.8) | <0.001 |
| Prior stroke | 311 (11.0) | 2,449 (6.4) | 2,760 (6.7) | <0.001 |
| Prior myocardial infarction | 147 (5.2) | 797 (2.1) | 944 (2.3) | <0.001 |
| Chronic kidney disease | 256 (9.1) | 926 (2.4) | 1,182 (2.9) | <0.001 |
| Chronic lung diseases | 788 (27.9) | 5,676 (14.8) | 6,464 (15.7) | <0.001 |
| Pulmonary embolism | 47 (1.7) | 114 (0.3) | 161 (0.4) | <0.001 |
| Rheumatic diseases | 266 (9.4) | 2,462 (6.4) | 2,728 (6.6) | <0.001 |

Values are mean ± SD or n (%).

Elevated or non-elevated according to transthoracic echocardiography measurement at a threshold of 50 mm Hg.

PAP = pulmonary artery pressure.

In the test sets of the main analysis (Group 1), the areas under the curve (AUCs) of the 10 cross-validation AI models for detecting ePAP were consistent (mean 0.88; 95% CI: 0.87–0.89) (Figure 2). Consistency was also observed for sensitivity (81.0%, 95% CI: 77.6%–84.4%), specificity (79.6%, 95% CI: 77.4%–81.7%), and accuracy (79.7%, 95% CI: 77.8%–81.5%) (Supplemental Table S1). The AUCs for the validation sets, shown in Supplemental Figure S2, also were consistent and similar (mean 0.88, 95% CI: 0.87–0.89) to those for the test sets. Because the AUCs of the 10 cross-validation AI models were consistent across the individual test sets, and because these test sets were randomly segmented, the possible bias from testing different models on different patients was likely minimal. These results demonstrated the robustness of our AI models, which required no predetermined clinical or ECG characteristics as input.
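The reported accuracy can be cross-checked from the reported sensitivity, specificity, and the 6.9% ePAP prevalence in Group 1, because accuracy is the prevalence-weighted average of the two. The sketch below uses a hypothetical helper name and only the rounded figures quoted in the text, so it reproduces the reported value only approximately.

```python
def accuracy_from_rates(sensitivity: float, specificity: float, prevalence: float) -> float:
    """Accuracy = P(disease) * sensitivity + P(no disease) * specificity."""
    return prevalence * sensitivity + (1.0 - prevalence) * specificity


# Rounded values quoted in the text for Group 1.
accuracy = accuracy_from_rates(sensitivity=0.810, specificity=0.796, prevalence=0.069)
print(f"{accuracy:.3f}")  # close to the reported 79.7%
```

Because ePAP is uncommon in this cohort, accuracy is dominated by specificity, which is why sensitivity and specificity are more informative than accuracy here.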

Figure 2.

Performance of AI Models for Identifying Group 1 ePAP Patients

The receiver-operating characteristic curves of the 10 cross-validation AI models (see Figure 1) show small variations in their areas under the curve. These performance results were evaluated on the test set data of the corresponding model (see Methods and Figure 1). AUC = area under the curve; ROC = receiver-operating characteristic; other abbreviations as in Figure 1.

The AI performance was strong across age, sex, and various comorbidities (Figure 3). For comorbidities, the diagnostic odds ratios all exceeded 7.84, indicating that the AI’s diagnostic performance was consistently predictive for all these different diseases. Interestingly, the presence of comorbidity was associated with a decreased diagnostic odds ratio for some diseases with statistical significance (interaction P < 0.05), suggesting a potential interference of ECG changes from these diseases in the AI diagnosis. The observed diagnostic odds ratio decreased with age but remained powerful, and even in patients older than 80 years, it was as high as 5.86. No sex difference was observed in the performance of the AI model.

Figure 3.

The Diagnostic Performance of Patient Subgroups

The 2 dashed vertical lines indicate the diagnostic odds ratio reference (odds ratio: 1.00) and the overall diagnostic odds ratio (here, odds ratio: 16.58), respectively. DOR = diagnostic odds ratio; L95CI = lower limit of 95% CI; PAH = pulmonary arterial hypertension; SENs = sensitivity; SPEc = specificity; U95CI = upper limit of 95% CI.
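For readers unfamiliar with the diagnostic odds ratio (DOR), the sketch below shows how it follows from sensitivity and specificity. The helper name is hypothetical. Using the rounded overall values from the text (81.0% and 79.6%) gives a DOR of roughly 16.6, close to the overall 16.58 in Figure 3; the small gap is presumably due to rounding and pooling, which is an assumption.

```python
def diagnostic_odds_ratio(sensitivity: float, specificity: float) -> float:
    """DOR = positive likelihood ratio / negative likelihood ratio
           = [sens / (1 - sens)] * [spec / (1 - spec)]."""
    return (sensitivity / (1.0 - sensitivity)) * (specificity / (1.0 - specificity))


# Overall AI model performance (rounded values quoted in the text).
dor_ai = diagnostic_odds_ratio(0.810, 0.796)

# Integrated conventional ECG characteristics (Table 2), for comparison.
dor_conventional = diagnostic_odds_ratio(0.341, 0.787)

print(f"AI model DOR: {dor_ai:.2f}")
print(f"Conventional ECG DOR: {dor_conventional:.2f}")
```

A DOR of 1.00 means the test does not discriminate at all, which is why it serves as the reference line in the figure.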

Leads V1, V2, and V3 were identified as the most important individual leads for the AI prediction of ePAP (Supplemental Figure S3A), which helps explain the model’s performance. This is consistent with the features of right ventricular hypertrophy that physicians look for in these 3 leads when diagnosing ePAP (Supplemental Figures S3B to S3E).17

For comparison, we analyzed the diagnostic performance of using conventional ECG characteristics to detect ePAP. The sensitivity of using conventional ECG characteristics including P pulmonale, right axis deviation, right ventricular hypertrophy, right ventricular strain, and right bundle branch block mimic patterns to identify ePAP was low (2.4% to 19.2%) (Supplemental Table 2). Even when all characteristics were integrated, sensitivity was only 34.1% (specificity 78.7%, accuracy 75.7%) (Table 2). The sensitivity of the conventional ECG characteristics to identify ePAP was too low for effective clinical use, in agreement with previous reports.17,22

Table 2.

Diagnostic Performance of Conventional ECG Characteristics for Detecting ePAP Patients in Group 1

| ECG characteristic | Sensitivity, % | Specificity, % |
|---|---|---|
| Conventional ECG characteristics for elevated PAP (a) | 34.1 | 78.7 |
| P pulmonale (b) | 2.6 | 98.1 |
| Right axis deviation (c) | 11.3 | 93.0 |
| Right ventricular hypertrophy (d) | 5.8 | 96.6 |
| Right ventricular strain (e) | 2.4 | 97.5 |
| Right bundle branch block mimics (f) | 19.2 | 89.9 |

ECG = electrocardiogram; ePAP = elevated pulmonary artery pressure; PAP = pulmonary artery pressure.

(a) Integrated electrocardiogram characteristics to diagnose elevated PAP using any of the following abnormalities.

(b) P pulmonale indicates right atrial enlargement, right atrial abnormality, or biatrial abnormalities in ECG annotation.

(c) Right axis deviation in ECG annotation.

(d) Right ventricular hypertrophy in ECG annotation.

(e) Right ventricular strain indicates T-wave abnormalities or ST-segment depression over anterior precordial leads in ECG annotation.

(f) Right bundle branch block mimics indicates right bundle branch block or rSR′ pattern in V1 or V2 without fulfilling the criteria of right bundle branch block in ECG annotation.

The diagnostic performance of the AI model was also better than a logistic regression model utilizing traditional ECG variables, as shown in the Supplemental Results and Supplemental Figure S4.

Follow-up analysis on future incidents of ePAP

Of the Group 1′ patients identified by the AI model as having a normal PAP who also had a confirmatory normal PAP by TTE within a 2-week window, 6,532 had a follow-up TTE. Of these AI-predicted non-ePAP patients, 372 (5.7%) went on to develop ePAP. By contrast, for the 2,984 patients labeled by the AI model as having ePAP but with a normal PAP by TTE (AI-predicted ePAP), 697 (23.4%) went on to develop ePAP (Supplemental Figure S5A). Compared with the AI-predicted non-ePAP patients, this represents a 5.04-fold risk of developing ePAP for the AI-predicted ePAP patients when the AI model defined their ECG as abnormal (multivariate-adjusted HR: 3.74 [95% CI: 3.28–4.26]). The AI model thus identified ECG abnormalities before overt ePAP manifested.
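The percentages above can be reproduced directly from the counts. The sketch below uses hypothetical helper names. Note that the crude ratio of the two proportions (about 4.1) is not the same quantity as the 5.04-fold figure quoted in the text, which presumably comes from a time-to-event analysis that accounts for follow-up time; that reading is an assumption.

```python
def cumulative_incidence(n_events: int, n_patients: int) -> float:
    """Proportion of patients who developed the outcome during follow-up."""
    return n_events / n_patients


# Counts quoted in the text for Group 1' (normal PAP by TTE at baseline).
incidence_ai_epap = cumulative_incidence(697, 2984)      # AI-predicted ePAP
incidence_ai_non_epap = cumulative_incidence(372, 6532)  # AI-predicted non-ePAP

crude_risk_ratio = incidence_ai_epap / incidence_ai_non_epap

print(f"AI-predicted ePAP: {incidence_ai_epap:.1%}")
print(f"AI-predicted non-ePAP: {incidence_ai_non_epap:.1%}")
print(f"Crude risk ratio: {crude_risk_ratio:.2f}")
```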

Ancillary analysis of future incidents of ePAP for an additional cohort of patients

We conducted further analysis on future incidents of ePAP for an independent group of patients (Group 2) identified by the AI model as having ePAP or normal PAP. The baseline characteristics of Group 2 patients are shown in Supplemental Table S2. Of the 4,269 AI-predicted ePAP patients, the incidence of ePAP as examined by TTE at more than 2 weeks after the ECG was 27.3%. In comparison, of the 14,104 patients identified by the AI model as having normal PAP, the incidence of an ensuing TTE-determined ePAP was 4.9% (Supplemental Figure S5B). This represents a 6.61-fold risk of developing ePAP for patients with AI-defined abnormal ECG (multivariate-adjusted HR: 4.32 [95% CI: 3.91–4.78]) (Supplemental Figure S5B). This result for an independent group of patients further demonstrates the robustness of the AI algorithm and its ability to identify ePAP even in the absence of a TTE diagnosis.

Cardiovascular outcomes predicted by the AI model

Cardiovascular mortality and all-cause mortality were analyzed for Group 1 patients. During the 6-year follow-up, Kaplan-Meier survival analysis showed that in comparison to patients stratified as non-ePAP by the AI model, those stratified as ePAP were associated with higher cardiovascular mortality (AI-predicted ePAP vs AI-predicted non-ePAP 1,027 [16.2%] vs 389 [2.0%] patients; P < 0.001) and higher all-cause mortality (3,581 [45.5%] vs 2,422 [10.9%] patients; P < 0.001), as shown in Figure 4.

Figure 4.

Kaplan-Meier Survival Curves for AI-Classified ePAP or Non-ePAP Patients

AI-predicted ePAP: Group 1 patients classified by the AI model as having ePAP. AI-predicted non-ePAP: Group 1 patients classified by the AI model as having non-ePAP. (A) Cardiovascular mortality. (B) All-cause mortality. Abbreviations as in Figure 1.
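For readers who want to see how survival curves of this kind are built, the following is a minimal Kaplan-Meier (product-limit) estimator. It is a generic sketch, not the SAS procedure used in the study; the function name and the toy follow-up data are hypothetical.

```python
def kaplan_meier(times, events):
    """Return [(time, survival_probability)] at each distinct event time.

    times  : follow-up time for each patient
    events : 1 if the patient died at that time, 0 if censored
    Patients censored at time t are counted as still at risk at t.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all patients sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths > 0:
            survival *= 1.0 - deaths / n_at_risk
            curve.append((t, survival))
        n_at_risk -= leaving
    return curve


# Toy data (years of follow-up, death indicator); not study data.
toy_times = [1, 2, 2, 3, 4]
toy_events = [1, 0, 1, 0, 1]
for t, s in kaplan_meier(toy_times, toy_events):
    print(t, round(s, 3))
```

In the study, one such curve is drawn for each AI-predicted group, and the groups are then compared.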

Using multivariate Cox regression analysis to adjust for potential confounding factors (age, sex, and various comorbidities), the AI stratification remained independent and the strongest predictor of cardiovascular mortality (HR: 3.69; 95% CI: 3.27–4.17; P < 0.001) and all-cause mortality (HR: 2.34; 95% CI: 2.21–2.47; P < 0.001), as shown in Table 3. Further analysis of patients stratified into groups according to presence/absence of a variety of comorbidities showed that the AI model’s predictive power for cardiovascular death was consistent across diabetes mellitus, hypertension, heart failure, myocardial infarction, stroke, lung diseases, kidney diseases, thromboembolism, and rheumatic and other diseases (Figure 5). Similar results were obtained for all-cause mortality (Supplemental Figure S6). The mortality analysis was also conducted for Group 2 patients and similar results were obtained. AI-predicted ePAP was associated with higher cardiovascular mortality (Supplemental Figure S7) (AI-predicted ePAP vs AI-predicted non-ePAP, 444 [14.9%] vs 277 [2.7%] patients; P < 0.001) and higher all-cause mortality (1,645 [44.5%] vs 1,900 [16.3%] patients; P < 0.001). Multivariate analysis also showed that AI-predicted ePAP was an independent predictor of cardiovascular mortality (HR: 2.63; 95% CI: 2.27–3.04; P < 0.001) and all-cause mortality (HR: 1.79; 95% CI: 1.67–1.91; P < 0.001), as shown in Supplemental Table S3. Further analysis showed that the AI model’s predictive power for cardiovascular death and all-cause death in Group 2 patients also was consistent across a variety of diseases (Supplemental Figures S8 and S9). The results from the 2 independent data sets (Groups 1 and 2) thus suggest the AI model reliably predicted cardiovascular outcomes.
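Hazard ratios from Cox regression are estimated on the log scale, so a 95% CI is symmetric around the HR only after taking logarithms. The sketch below, with hypothetical helper names, uses that property to recover the standard error of the log HR from a reported interval and to check that the reported HR sits at the geometric midpoint of its bounds. It assumes the intervals are Wald-type intervals with z = 1.96, which the text does not state.

```python
import math


def log_hr_standard_error(ci_lower: float, ci_upper: float, z: float = 1.96) -> float:
    """Standard error of log(HR) implied by a Wald-type confidence interval."""
    return (math.log(ci_upper) - math.log(ci_lower)) / (2.0 * z)


def ci_geometric_midpoint(ci_lower: float, ci_upper: float) -> float:
    """Point estimate implied by a CI that is symmetric on the log scale."""
    return math.sqrt(ci_lower * ci_upper)


# Group 1, AI-predicted ePAP, multivariate analysis (values quoted in the text).
print(round(ci_geometric_midpoint(3.27, 4.17), 2))  # cardiovascular mortality
print(round(ci_geometric_midpoint(2.21, 2.47), 2))  # all-cause mortality
print(round(log_hr_standard_error(3.27, 4.17), 3))
```

The same check applied to the much wider interval in the external cohort (2.51–176.88) returns a midpoint near the reported HR of 21.08, which shows that the wide interval reflects a large standard error from the small number of events rather than an inconsistency.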

Table 3.

Cardiovascular and All-Cause Mortality of AI-Predicted ePAP Patients in Group 1

| Variable | Univariate HR | Univariate 95% CI | Univariate P Value | Multivariate (a) HR | Multivariate (a) 95% CI | Multivariate (a) P Value |
|---|---|---|---|---|---|---|
| Cardiovascular mortality | | | | | | |
| Age | 1.09 | 1.08–1.09 | <0.001 | 1.06 | 1.05–1.06 | <0.001 |
| Male | 2.29 | 2.05–2.55 | <0.001 | 1.33 | 1.19–1.49 | <0.001 |
| Hypertension | 2.43 | 2.18–2.69 | <0.001 | 0.98 | 0.87–1.09 | 0.69 |
| Diabetes mellitus | 2.49 | 2.23–2.78 | <0.001 | 1.17 | 1.04–1.31 | 0.01 |
| Congestive heart failure | 6.51 | 5.88–7.21 | <0.001 | 2.38 | 2.13–2.65 | <0.001 |
| Prior stroke | 2.41 | 2.07–2.80 | <0.001 | 1.01 | 0.86–1.18 | 0.94 |
| Prior myocardial infarction | 5.70 | 4.79–6.77 | <0.001 | 1.80 | 1.51–2.16 | <0.001 |
| Chronic kidney disease | 3.09 | 2.52–3.81 | <0.001 | 2.02 | 1.74–2.34 | <0.001 |
| Chronic lung diseases | 2.87 | 2.57–3.20 | <0.001 | 1.12 | 1.00–1.26 | 0.05 |
| Pulmonary embolism | 2.19 | 1.18–4.08 | 0.01 | 1.17 | 0.63–2.18 | 0.63 |
| Rheumatic diseases | 1.42 | 1.17–1.73 | <0.001 | 1.07 | 0.88–1.30 | 0.52 |
| AI-predicted ePAP | 9.56 | 8.54–10.71 | <0.001 | 3.69 | 3.27–4.17 | <0.001 |
| All-cause mortality | | | | | | |
| Age | 1.07 | 1.07–1.07 | <0.001 | 1.05 | 1.05–1.06 | <0.001 |
| Male | 2.15 | 2.04–2.26 | <0.001 | 1.38 | 1.31–1.46 | <0.001 |
| Hypertension | 2.10 | 2.00–2.21 | <0.001 | 0.86 | 0.82–0.91 | <0.001 |
| Diabetes mellitus | 2.58 | 2.44–2.72 | <0.001 | 1.34 | 1.27–1.42 | <0.001 |
| Congestive heart failure | 3.61 | 3.42–3.81 | <0.001 | 1.46 | 1.38–1.55 | <0.001 |
| Prior stroke | 2.62 | 2.44–2.82 | <0.001 | 1.20 | 1.12–1.30 | <0.001 |
| Prior myocardial infarction | 3.28 | 2.95–3.64 | <0.001 | 1.22 | 1.10–1.37 | <0.001 |
| Chronic kidney disease | 3.65 | 3.33–4.01 | <0.001 | 2.17 | 2.02–2.34 | <0.001 |
| Chronic lung diseases | 3.06 | 2.37–3.96 | <0.001 | 1.24 | 1.17–1.32 | <0.001 |
| Pulmonary embolism | 2.80 | 2.65–2.95 | <0.001 | 1.83 | 1.42–2.37 | <0.001 |
| Rheumatic diseases | 1.74 | 1.60–1.90 | <0.001 | 1.33 | 1.21–1.45 | <0.001 |
| AI-predicted ePAP | 5.31 | 5.05–5.57 | <0.001 | 2.34 | 2.21–2.47 | <0.001 |

AI = artificial intelligence; ePAP = elevated pulmonary artery pressure.

(a) Estimated using multiple Cox regression stepwise analysis (all variables with P < 0.10 were included in the analysis).

Figure 5.

Subgroup Analyses of AI’s Performance on Cardiovascular Mortality

The HR for cardiovascular death (during 6-year follow-up) in AI-predicted ePAP patients and AI-predicted non-ePAP patients in Group 1 patients, according to subgroups of clinical characteristics. Abbreviations as in Figure 1.

External validation of the AI model

The baseline characteristics of the Japanese cohort are provided in Supplemental Table S4. A comparison of baseline characteristics among Group 1, 2, and 3 patients is presented in Supplemental Table S5. The comparison showed that the Japanese cohort (Group 3) was older, included more women, and had more chronic kidney disease, chronic lung diseases, and pulmonary embolism. In comparison, hypertension, diabetes mellitus, and prior myocardial infarction were more common in Group 2, and congestive heart failure and rheumatic diseases were more common in Group 1.

The AUC of the AI model for ePAP in the external cohort was 0.88 (sensitivity 93.9%, specificity 58.7%, accuracy 72.4%) (Figure 6A). Similarly, the patients stratified as ePAP by the AI model were consistently associated with higher cardiovascular mortality (Figure 6B) (AI-predicted ePAP vs AI-predicted non-ePAP 32 [29.8%] vs 1 [1.4%] patients; P < 0.001) and higher all-cause mortality (Supplemental Figure S10) (52 [46.7%] vs 3 [5.1%] patients; P < 0.001). The AI-predicted ePAP was an independent predictor for cardiovascular mortality (HR: 21.08; 95% CI: 2.51–176.88; P < 0.01) and all-cause mortality (HR: 9.15; 95% CI: 2.52–33.27; P < 0.01) after multivariate analysis (Supplemental Table S6). These results suggest the AI model could accommodate ECG data for patients from a different hospital in a different country.

Figure 6.

Performance of AI Model and Kaplan-Meier Curves for Japanese Cohort

(A) The receiver-operating characteristic curve of the AI model for ePAP diagnosis in the Japanese cohort. (B) Kaplan-Meier survival curves of cardiovascular mortality. AI-predicted ePAP: patients classified by the AI model as having ePAP. AI-predicted non-ePAP: patients classified by the AI model as having non-ePAP. Abbreviations as in Figures 1 and 2.

Discussion

In this study, we developed an automated AI model capable of identifying ePAP patients and predicting their risk for cardiovascular and all-cause mortality. The performance of this AI model was shown to be robust for both early diagnosis of ePAP and prognosis of mortality risk in independent patient groups, including an external cohort of Japanese patients. These results support the potential of this AI model as a qualified and valid clinical test to screen for patients at risk of developing ePAP so that treatment can be initiated early to improve their odds of survival.

The ability to detect ePAP is crucial for clinical suspicion of PH, early diagnosis, and prompt treatment because most patients with PH have minimal or no symptoms at an early stage.7,17,23,24 Early detection of PH and prompt therapy would translate into better 3-year survival rates for more than 50% of patients.25 The limitations of TTE have led to efforts devoted to developing alternative clinical tests. However, the traditional ECG criteria used to identify ePAP are not considered a reliable screening tool because of their low sensitivity of 34% to 55%, as observed in this study and others.12,22 The sensitivities of the pulmonary function, serum N-terminal pro–B-type natriuretic peptide, and uric acid level tests are slightly higher but still only 71%, 56% to 69%, and 68%, respectively.9, 10, 11, 12,26,27 Therefore, screening algorithms (eg, DETECT [Early, Simple and Reliable Detection of Pulmonary Arterial Hypertension in Systemic Sclerosis] or ASCS [Australian Scleroderma Cohort Study]) that incorporate clinical characteristics (eg, the presence of telangiectasia) and a variety of tests (eg, ECG, pulmonary function, serum N-terminal pro–B-type natriuretic peptide, and uric acid level) were developed to increase the chance of detection by relying heavily on expert opinions.8,9

Compared with these tests, our AI model automatically detected patients with ePAP and a high probability of PH with comparable, if not better, sensitivity and specificity. The results suggest the AI model could be used as a standalone test or incorporated into screening algorithms for early diagnosis of patients with PH. Notably, our AI model is noninvasive and needs only the information from ECG signals without any input of clinical characteristics or predetermined ECG parameters and diagnosis, which is a significant improvement over the recently reported AI algorithm.15 The low cost, low operator dependence, and worldwide usage of ECG examination confer an extraordinary opportunity for our AI-enabled ECG model to be a cost-effective and widely applicable tool to change clinical practice in screening patients at risk of PH.

We applied the same threshold used in the worst-performing validation model (Supplemental Figure S2) for the Group 1 patients to compute the AI model’s sensitivity and specificity for the external cohort of Japanese patients (Group 3). Although the AUC of the AI model was the same (0.88) in the 2 cohorts, the specificity was lower and the sensitivity higher for the external cohort because of the tradeoff between specificity and sensitivity at a fixed threshold. In future application, a different threshold value could be used to meet specific clinical considerations for Japanese patients. In addition, because all 10 AI models from the 10-fold cross-validation produced similar test results, it is likely that, for future clinical practice, no particular model of the 10 would significantly outperform the others, although the best validation model could be a logical choice for this purpose at present.
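The threshold tradeoff described above can be illustrated with a small sketch. The function name, the model scores, and the labels below are all hypothetical toy values chosen for illustration; they are not outputs of the study model.

```python
def sensitivity_specificity(scores, labels, threshold):
    """Classify score >= threshold as predicted ePAP and return (sens, spec)."""
    tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= threshold)
    fn = sum(1 for s, y in zip(scores, labels) if y == 1 and s < threshold)
    tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < threshold)
    fp = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


# Toy model outputs: the first 5 patients are non-ePAP (0), the last 5 ePAP (1).
toy_scores = [0.10, 0.25, 0.35, 0.45, 0.55, 0.40, 0.60, 0.70, 0.85, 0.95]
toy_labels = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]

for threshold in (0.5, 0.3):
    sens, spec = sensitivity_specificity(toy_scores, toy_labels, threshold)
    print(f"threshold={threshold}: sensitivity={sens:.2f}, specificity={spec:.2f}")
```

Lowering the threshold on the same scores raises sensitivity and lowers specificity while the ranking of patients, and therefore the AUC, is unchanged. This is the pattern seen when a threshold chosen on one cohort is reused on another.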

We also performed an analysis of the Akaike Information Criterion (AIC)28 for Group 1 patients. The AIC is used to evaluate the quality of each model relative to the other candidate models in a multiple-model analysis. The AIC analysis showed that the model combining baseline characteristics and AI-predicted ePAP had the best quality to assess the risk of cardiovascular mortality (Supplemental Table S7). The analysis also showed that, although the AI-predicted ePAP was the factor with the highest HR, it alone would not be a better model than that of all the baseline characteristics combined for predicting mortality risk. To be able to use these results in clinical practice, we devised an index score based on the HR for each parameter from the multivariate analysis of the combined model to calculate the risk of cardiovascular mortality and all-cause mortality for Group 1 patients, as shown in Supplemental Tables S8 and S9, respectively. By summing over the index scores, the 1-year cardiovascular mortality and 1-year all-cause mortality can be estimated from the high correlation between the index score sum, now called risk score, and mortality, as shown in Supplemental Figures S11 and S12, respectively. Note that the abrupt break in the correlation at the risk score of 16 for all-cause mortality was caused by there being only 2 patients with that highest risk score, both of whom survived at the 1-year point (Supplemental Figure S12).
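As a reminder of how the AIC ranks models, the sketch below applies the standard definition, AIC = 2k − 2 ln(L), where k is the number of fitted parameters and L is the maximized likelihood. The helper name, the parameter counts, and the log-likelihood values are hypothetical placeholders; the actual values are in Supplemental Table S7 and are not reproduced here.

```python
def aic(n_parameters: int, log_likelihood: float) -> float:
    """Akaike Information Criterion; lower values indicate a better tradeoff
    between goodness of fit and model complexity."""
    return 2.0 * n_parameters - 2.0 * log_likelihood


# Hypothetical (parameter count, log-likelihood) pairs for three candidate models.
candidates = {
    "AI-predicted ePAP only": (1, -1250.0),
    "baseline characteristics only": (11, -1180.0),
    "baseline characteristics + AI-predicted ePAP": (12, -1150.0),
}

scores = {name: aic(k, ll) for name, (k, ll) in candidates.items()}
best_model = min(scores, key=scores.get)

for name, value in scores.items():
    print(f"{name}: AIC = {value:.1f}")
print("Lowest AIC:", best_model)
```

With these placeholder numbers the combined model has the lowest AIC because its gain in fit outweighs the penalty for one extra parameter, which mirrors the qualitative finding reported above.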

A qualified biomarker needs to meet adequate diagnostic performance and be associated with the clinical endpoints of diseases. Although various AI models have been developed to detect cardiovascular abnormalities, their relevance to cardiovascular outcomes remains unknown. The present work showed that an AI-enabled ECG can be an independent predictor of long-term cardiovascular and all-cause mortality. As shown in Figure 4, our AI model identified patients at high risk of cardiovascular mortality, with the risk estimated at 4.2%, 9.4%, and 14.0%, and of all-cause mortality, estimated at 14.4%, 29.2%, and 40.6%, during their 1-, 3-, and 5-year follow-ups, respectively.

Stratification based on mortality risk is pivotal in optimizing therapeutic strategies to treat patients with PH. Currently, PH is clinically categorized as pulmonary arterial hypertension (PAH), or as PH caused by left heart disease, lung disease or hypoxia, chronic thromboembolism, or multifactorial mechanisms. The current guideline recommends a comprehensive assessment of patient prognosis to guide therapy according to disease category. For example, in PH patients diagnosed with PAH, the mortality risk can be classified as low, intermediate, or high, with an estimated 1-year all-cause mortality of below 5%, 5–10%, or above 10%, respectively.17,29,30,31 Clinicians use this risk stratification along with patients’ clinical characteristics, exercise test results, serum N-terminal pro–B-type natriuretic peptide levels, cardiac images, and hemodynamic determinants to determine an adequate treatment strategy (eg, single or combination therapy, intravenous or oral therapy).17 A similar decision-making strategy is applied to patients with PH caused by left heart disease or chronic thromboembolism.32,33
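As a small worked example of the PAH risk categories quoted above (below 5%, 5–10%, above 10% estimated 1-year all-cause mortality), the mapping can be written as a function. This is only a sketch of the cutoffs as stated in the text; the function name is hypothetical, the handling of the exact boundary values (5% and 10% counted as intermediate) is an assumption, and guideline-based stratification in practice relies on multiple clinical variables rather than a single number.

```python
def pah_risk_category(one_year_mortality: float) -> str:
    """Map an estimated 1-year all-cause mortality (fraction, 0 to 1) to a risk category."""
    if not 0.0 <= one_year_mortality <= 1.0:
        raise ValueError("mortality must be a fraction between 0 and 1")
    if one_year_mortality < 0.05:
        return "low"
    # Assumption: the boundary values 5% and 10% are treated as intermediate.
    if one_year_mortality <= 0.10:
        return "intermediate"
    return "high"


for estimate in (0.03, 0.07, 0.12):
    print(f"{estimate:.0%} -> {pah_risk_category(estimate)}")
```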

The prognostic power of our AI model suggests its usefulness in providing accurate risk stratification to direct differential therapeutic interventions, although a randomized clinical trial is needed to justify this application. Moreover, according to our subgroup sensitivity analysis, the AI model could be consistently applied to estimate mortality risk across PH categories, including PAH, PH caused by left heart disease, PH caused by lung disease or hypoxia, and PH caused by chronic thromboembolism. This expected wide applicability to patients with different diseases is a significant asset of our AI model.

Study limitations

The ePAP affirmed by TTE was selected as a surrogate marker for highly suspicious PH based on an established guideline,17 but no other marker was tested. This is a limitation due mainly to a limited number of qualified patients available in our database. For example, right heart catheterization for a definite diagnosis of PH was not used in this study because only 17 such patients (0.04% of Group 1) had an ECG within the 14-day interval of our study design. Likewise, the number of patients tested (279 patients) for the external cohort was relatively small. Finally, the ability of the AI model to detect dynamic changes, such as disappearing PH or PH developed at follow-up, is likely limited because the model was not designed to predict those, and the small number of patients under those conditions prevented a thorough analysis. Further study on larger cohorts is necessary to fully validate the AI model’s performance.

Conclusions

In this work, we showed that the deep learning neural network AI model we previously developed for detecting cardiac arrhythmias could be extended to accurately identify patients with ePAP from their ECG data. We further showed that the AI model could predict future incidents of ePAP, as well as the risk of cardiovascular and all-cause mortality, even for patients with various diseases. These results suggest that our AI model could be a useful clinical tool to identify patients with PH so that treatment can be initiated early to improve their survival prognosis.

Perspectives.

COMPETENCY IN MEDICAL KNOWLEDGE: Pulmonary hypertension affects more than a million individuals around the world, causes premature disability and heart failure, and increases the 5-year mortality rate to more than 30% if left untreated. Early detection of ePAP is needed for prompt diagnosis and treatment to avoid detrimental consequences of pulmonary hypertension. We developed an ECG-based AI model that identified patients with ePAP and predicted their future risk for cardiovascular mortality; the model was validated in independent patient groups, including an external cohort of Japanese patients. The diagnostic performance (AUC: 0.88) and risk prediction for cardiovascular mortality (HR: 3.6–21.1) satisfied clinical standards and outperformed conventional ECG diagnosis by cardiologists.

TRANSLATIONAL OUTLOOK: The AI-enabled ECG model can serve as a first automated, qualified, and valid clinical test for early diagnosis of ePAP and prognosis of mortality risk.

Funding Support and Author Disclosures

This work was supported by Taipei Veterans General Hospital (VGH108C-019, VN108-12, VN109-03, V109C-070, V110C-039, V110B-043, V111C-047, VGHUST111-G6-3-2, VTA111-A-1-2), Ministry of Science and Technology (MOST 108-2628-B-075-003, MOST 109-2628-B-075-017, MOST 110-2628-B-075-015, MOST 110-2314-B-075-063-MY3, MOST 110-2321-B-075-002), Szu-Yuan Research Foundation of Internal Medicine (107-041), National Health Research Institutes (NHRI-EX108-10513SC, NHRI-109BCC0-MF-202014-02), and Academia Sinica (AS-TM-109-01-05, AS-TM-110-01-01). The authors have reported that they have no relationships relevant to the contents of this paper to disclose.

Acknowledgments

The authors thank Wan-Ting Hsu and Li-Lien Liao for data collection.

Footnotes

The authors attest they are in compliance with human studies committees and animal welfare regulations of the authors’ institutions and Food and Drug Administration guidelines, including patient consent where appropriate. For more information, visit the Author Center.

Appendix

For expanded Methods and Results sections, and supplemental figures and tables, please see the online version of this paper.

Contributor Information

Yu-Feng Hu, Email: huhuhu0609@gmail.com.

Ming-Jing Hwang, Email: mjhwang@ibms.sinica.edu.tw.

References

  • 1. Elliott C.G., Barst R.J., Seeger W., et al. Worldwide physician education and training in pulmonary hypertension: pulmonary vascular disease: the global perspective. Chest. 2010;137:85S–94S. doi: 10.1378/chest.09-2816.
  • 2. Vachiery J.L., Adir Y., Barbera J.A., et al. Pulmonary hypertension due to left heart diseases. J Am Coll Cardiol. 2013;62:D100–D108. doi: 10.1016/j.jacc.2013.10.033.
  • 3. Seeger W., Adir Y., Barbera J.A., et al. Pulmonary hypertension in chronic lung diseases. J Am Coll Cardiol. 2013;62:D109–D116. doi: 10.1016/j.jacc.2013.10.036.
  • 4. Thenappan T., Shah S.J., Rich S., Tian L., Archer S.L., Gomberg-Maitland M. Survival in pulmonary arterial hypertension: a reappraisal of the NIH risk stratification equation. Eur Respir J. 2010;35:1079–1087. doi: 10.1183/09031936.00072709.
  • 5. Oswald-Mammosser M., Weitzenblum E., Quoix E., et al. Prognostic factors in COPD patients receiving long-term oxygen therapy. Importance of pulmonary artery pressure. Chest. 1995;107:1193–1198. doi: 10.1378/chest.107.5.1193.
  • 6. Abramson S.V., Burke J.F., Kelly J.J., Jr., et al. Pulmonary hypertension predicts mortality and morbidity in patients with dilated cardiomyopathy. Ann Intern Med. 1992;116:888–895. doi: 10.7326/0003-4819-116-11-888.
  • 7. Humbert M., Sitbon O., Chaouat A., et al. Pulmonary arterial hypertension in France: results from a national registry. Am J Respir Crit Care Med. 2006;173:1023–1030. doi: 10.1164/rccm.200510-1668OC.
  • 8. Coghlan J.G., Denton C.P., Grünig E., et al. Evidence-based detection of pulmonary arterial hypertension in systemic sclerosis: the DETECT study. Ann Rheum Dis. 2014;73:1340–1349. doi: 10.1136/annrheumdis-2013-203301.
  • 9. Thakkar V., Stevens W., Prior D., et al. The inclusion of N-terminal pro-brain natriuretic peptide in a sensitive screening strategy for systemic sclerosis-related pulmonary arterial hypertension: a cohort study. Arthritis Res Ther. 2013;15:R193. doi: 10.1186/ar4383.
  • 10. Williams M.H., Handler C.E., Akram R., et al. Role of N-terminal brain natriuretic peptide (N-TproBNP) in scleroderma-associated pulmonary arterial hypertension. Eur Heart J. 2006;27:1485–1494. doi: 10.1093/eurheartj/ehi891.
  • 11. Simpson C.E., Damico R.L., Hummers L., et al. Serum uric acid as a marker of disease risk, severity, and survival in systemic sclerosis-related pulmonary arterial hypertension. Pulm Circ. 2019;9(3):2045894019859477. doi: 10.1177/2045894019859477.
  • 12. Kiely D.G., Lawrie A., Humbert M. Screening strategies for pulmonary arterial hypertension. Eur Heart J Suppl. 2019;21:K9–K20. doi: 10.1093/eurheartj/suz204.
  • 13. Attia Z.I., Kapa S., Lopez-Jimenez F., et al. Screening for cardiac contractile dysfunction using an artificial intelligence-enabled electrocardiogram. Nat Med. 2019;25:70–74. doi: 10.1038/s41591-018-0240-2.
  • 14. Kwon J.M., Jeon K.H., Kim H.M., et al. Comparing the performance of artificial intelligence and conventional diagnosis criteria for detecting left ventricular hypertrophy using electrocardiography. Europace. 2020;22:412–419. doi: 10.1093/europace/euz324.
  • 15. Kwon J.M., Kim K.H., Medina-Inojosa J., Jeon K.H., Park J., Oh B.H. Artificial intelligence for early prediction of pulmonary hypertension using electrocardiography. J Heart Lung Transplant. 2020;39(8):805–814. doi: 10.1016/j.healun.2020.04.009.
  • 16. Raghunath S., Ulloa Cerna A.E., Jing L., et al. Prediction of mortality from 12-lead electrocardiogram voltage data using a deep neural network. Nat Med. 2020;26:886–891. doi: 10.1038/s41591-020-0870-z.
  • 17. Galie N., Humbert M., Vachiery J.L., et al. 2015 ESC/ERS guidelines for the diagnosis and treatment of pulmonary hypertension: the Joint Task Force for the Diagnosis and Treatment of Pulmonary Hypertension of the European Society of Cardiology (ESC) and the European Respiratory Society (ERS). Eur Heart J. 2016;37:67–119. doi: 10.1093/eurheartj/ehv317.
  • 18. Rudski L.G., Lai W.W., Afilalo J., et al. Guidelines for the echocardiographic assessment of the right heart in adults: a report from the American Society of Echocardiography endorsed by the European Association of Echocardiography, a registered branch of the European Society of Cardiology, and the Canadian Society of Echocardiography. J Am Soc Echocardiogr. 2010;23:685–713; quiz 786–788. doi: 10.1016/j.echo.2010.05.010.
  • 19. Liao J.N., Chao T.F., Kuo J.Y., et al. Global left atrial longitudinal strain using 3-beat method improves risk prediction of stroke over conventional echocardiography in atrial fibrillation. Circ Cardiovasc Imaging. 2020;13. doi: 10.1161/CIRCIMAGING.119.010287.
  • 20. Liao J.N., Chao T.F., Kuo J.Y., et al. Age, sex, and blood pressure-related influences on reference values of left atrial deformation and mechanics from a large-scale Asian population. Circ Cardiovasc Imaging. 2017;10(10). doi: 10.1161/CIRCIMAGING.116.006077.
  • 21. Chen T.M., Huang C.H., Shih E.S.C., Hu Y.F., Hwang M.J. Detection and classification of cardiac arrhythmias by a challenge-best deep learning neural network model. iScience. 2020;23:100886. doi: 10.1016/j.isci.2020.100886.
  • 22. Speich R. Diagnosing pulmonary hypertension: is there a revival of the electrocardiogram? Eur Respir J. 2011;37:994–996. doi: 10.1183/09031936.00189810.
  • 23. Taichman D.B., Ornelas J., Chung L., et al. Pharmacologic therapy for pulmonary arterial hypertension in adults: CHEST guideline and expert panel report. Chest. 2014;146:449–475. doi: 10.1378/chest.14-0793.
  • 24. Kiely D.G., Elliot C.A., Sabroe I., Condliffe R. Pulmonary hypertension: diagnosis and management. BMJ. 2013;346:f2028. doi: 10.1136/bmj.f2028.
  • 25. Humbert M., Yaici A., de Groote P., et al. Screening for pulmonary arterial hypertension in patients with systemic sclerosis: clinical characteristics at diagnosis and long-term survival. Arthritis Rheum. 2011;63:3522–3530. doi: 10.1002/art.30541.
  • 26. Gladue H., Steen V., Allanore Y., et al. Combination of echocardiographic and pulmonary function test measures improves sensitivity for diagnosis of systemic sclerosis-associated pulmonary arterial hypertension: analysis of 2 cohorts. J Rheumatol. 2013;40:1706–1711. doi: 10.3899/jrheum.130400.
  • 27. Hsu V.M., Moreyra A.E., Wilson A.C., et al. Assessment of pulmonary arterial hypertension in patients with systemic sclerosis: comparison of noninvasive tests with results of right-heart catheterization. J Rheumatol. 2008;35:458–465.
  • 28. Bozdogan H. Model selection and Akaike's Information Criterion (AIC): the general theory and its analytical extensions. Psychometrika. 1987;52:345–370.
  • 29. Benza R.L., Gomberg-Maitland M., Elliott C.G., et al. Predicting survival in patients with pulmonary arterial hypertension: the REVEAL risk score calculator 2.0 and comparison with ESC/ERS-based risk assessment strategies. Chest. 2019;156:323–337. doi: 10.1016/j.chest.2019.02.004.
  • 30. Humbert M., Sitbon O., Yaïci A., et al. Survival in incident and prevalent cohorts of patients with pulmonary arterial hypertension. Eur Respir J. 2010;36:549–555. doi: 10.1183/09031936.00057010.
  • 31. Kylhammar D., Kjellström B., Hjalmarsson C., et al. A comprehensive risk stratification at early follow-up determines prognosis in pulmonary arterial hypertension. Eur Heart J. 2018;39:4175–4181. doi: 10.1093/eurheartj/ehx257.
  • 32. Agarwal R., Shah S.J., Foreman A.J., et al. Risk assessment in pulmonary hypertension associated with heart failure and preserved ejection fraction. J Heart Lung Transplant. 2012;31:467–477. doi: 10.1016/j.healun.2011.11.017.
  • 33. Humbert M., Farber H.W., Ghofrani H.A., et al. Risk assessment in pulmonary arterial hypertension and chronic thromboembolic pulmonary hypertension. Eur Respir J. 2019;53(6):1802004. doi: 10.1183/13993003.02004-2018.

Articles from JACC Asia are provided here courtesy of Elsevier