Abstract
Background
The detection of dyskalemias—hypokalemia and hyperkalemia—currently depends on laboratory tests. Since cardiac tissue is very sensitive to dyskalemia, electrocardiography (ECG) may be able to uncover clinically important dyskalemias before laboratory results.
Objective
Our study aimed to develop a deep-learning model, ECG12Net, to detect dyskalemias based on ECG presentations and to evaluate the logic and performance of this model.
Methods
From May 2011 to December 2016, 66,321 ECG records with corresponding serum potassium (K+) concentrations were obtained from 40,180 patients admitted to the emergency department. ECG12Net is an 82-layer convolutional neural network that estimates serum K+ concentration. Six clinicians (three emergency physicians and three cardiologists) participated in a human-machine competition. Sensitivity, specificity, and balance accuracy were used to compare the performance of ECG12Net with that of these physicians.
Results
In a human-machine competition including 300 ECGs of different serum K+ concentrations, the area under the curve for detecting hypokalemia and hyperkalemia with ECG12Net was 0.926 and 0.958, respectively, which was significantly better than that of our best clinicians. Moreover, in detecting severe hypokalemia and severe hyperkalemia, the sensitivities were 96.7% and 83.3%, respectively, and the specificities were 93.3% and 97.8%, respectively. In a test set including 13,222 ECGs, ECG12Net had a similar performance in terms of sensitivity for severe hypokalemia (95.6%) and severe hyperkalemia (84.5%), with a mean absolute error of 0.531. The specificities for detecting hypokalemia and hyperkalemia were 81.6% and 96.0%, respectively.
Conclusions
A deep-learning model based on a 12-lead ECG may help physicians promptly recognize severe dyskalemias and thereby potentially reduce cardiac events.
Keywords: artificial intelligence, sudden cardiac death, electrocardiogram, machine learning, potassium homeostasis
Introduction
Dyskalemias (hyperkalemia and hypokalemia) are common causes of sudden cardiac death in clinical practice [1]. Prompt recognition and rapid correction of these potassium (K+) derangements are needed to prevent catastrophic outcomes [2]. Currently, the detection of dyskalemia relies on laboratory tests. Point-of-care blood testing provides rapid analysis of electrolyte levels; however, its accuracy and precision may not be as reliable as those of a clinical central laboratory, mainly because of dilution, which would underestimate plasma K+ concentration, and the inability to discern hemolysis from pseudohyperkalemia [3,4]. Electrocardiography (ECG), which is universally needed in patients with emergent cardiac or noncardiac conditions, may exhibit the typical changes seen in dyskalemia, since cardiac tissue is very sensitive to K+ derangements. The main ECG changes associated with hypokalemia include a decreased T wave amplitude, ST-segment depression, T wave inversion, a prolonged PR interval, and an increased corrected QT interval (QTc) [5]. The typical ECG findings for hyperkalemia progress from tall peaked T waves and a shortened QT interval to a lengthened PR interval and a loss of the P wave, followed by a widening QRS complex and ultimately a sine wave morphology [5,6]. Although these morphologic changes are well known in dyskalemias, even experienced clinicians frequently do not notice all of these subtle details [7].
Previous researchers have developed ECG quantification algorithms to predict serum K+ concentration based on T wave morphology, mainly using the slope and width of T waves. Hyperkalemia is associated with tall, narrow, and symmetrical T waves, whereas hypokalemia is associated with flat T waves [8-12]. The algorithms were mostly derived from continuous patient monitoring, such as during hemodialysis, with homogeneous ECG morphologies from a limited set of patients [8-12]. Recently, manual processing of T wave morphologies has been used to improve the diagnosis of hyperkalemia [13]. Nevertheless, using T wave changes alone to detect dyskalemias is less sensitive and specific than a comprehensive ECG interpretation [14].
With the revolution in artificial intelligence (AI), several advanced deep-learning models, such as Oxford’s VGGNet [15], Inception Net [16], ResNet [17], and DenseNet [18], have been developed, providing an unprecedented opportunity to improve health care; this was initiated by AlexNet’s victory in the ImageNet Large Scale Visual Recognition Challenge in 2012 [19]. Existing deep-learning models have been shown to achieve human-level performance and be effective in medical applications when large annotated datasets are available [17,20-22]. This potential to improve diagnosis and patient care prompted us to develop a deep-learning model to assist emergency physicians in recognizing ECG changes associated with dyskalemias.
Our study aimed to train a deep-learning model, ECG12Net, to predict serum K+ concentration by ECG. The deep-learning model was an 82-layer convolutional neural network that underwent a series of training processes to optimize model performance. The AI system, which learned from more than 50,000 electrocardiograms to identify critical morphologic changes, may help to reduce medical errors in emergency departments (EDs) that result from intense time pressure on staff during busy periods [23]. The performance of the trained model was then compared with that of emergency physicians and cardiologists. Finally, we visualized ECG12Net’s calculation process to understand why and how it works.
Methods
Data Source
The data were obtained from Tri-Service General Hospital, Taiwan, and research approval was given by the Institutional Review Board (IRB) (IRB No. 1-107-05-047). From May 11, 2011, to December 31, 2016, 40,180 emergency patients were enrolled; together they contributed 66,321 ECG records, each obtained within 1 hour before or after a reference serum K+ measurement. Serum K+ concentrations were measured in the laboratory using indirect ion-selective electrode methods that had been accredited by the International Organization for Standardization (ISO) standard ISO-15189 and the College of American Pathologists’ Laboratory Accreditation Program. All hemolyzed samples were excluded. Potential confounders, such as patients with chest pain or thyroid disorders, were not excluded from the study. We divided the dataset into training (~70%), validation (~10%), and test (~20%) sets by date. Emergency patients presenting before April 30, 2016, were included in the training set; those presenting between May 1 and July 20, 2016, were in the validation set; and those presenting after July 21, 2016, were in the test set to assess model performance. All records included in the training set were excluded from the validation and test sets; thus, there was no overlap among the three datasets. The ECG recordings were collected using a Philips 12-Lead ECG machine (PH080A). The ECG signal was recorded in a digital format. The sampling frequency was 500 Hz, with 2.5 seconds recorded in each lead. The estimated K+ concentrations ranged from 1.5 mEq/L to 7.5 mEq/L. Predicted K+ concentrations less than 1.5 mEq/L or greater than 7.5 mEq/L were indicated accordingly without further detail (ie, as either <1.5 mEq/L or >7.5 mEq/L). Patient characteristics and laboratory results were collected using an electronic health record system. The estimated glomerular filtration rate was calculated using the Chronic Kidney Disease Epidemiology Collaboration formula [24].
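The date-based split described above can be illustrated with a minimal Python sketch. The cutoff dates come from the text; how the boundary days themselves (April 30 and July 21, 2016) are assigned is not stated, so their handling here is an assumption, as are the record identifiers.

```python
from datetime import date

# Cutoffs taken from the text. Assigning April 30 to the training set and
# July 21 to the test set is an assumption; the source does not say.
TRAIN_END = date(2016, 4, 30)
VALID_END = date(2016, 7, 20)


def assign_split(visit_date: date) -> str:
    """Return 'training', 'validation', or 'test' for one ED visit date."""
    if visit_date <= TRAIN_END:
        return "training"
    if visit_date <= VALID_END:
        return "validation"
    return "test"


# Hypothetical records: (record_id, visit_date)
records = [
    ("ecg-001", date(2012, 3, 14)),
    ("ecg-002", date(2016, 6, 1)),
    ("ecg-003", date(2016, 11, 30)),
]
splits = {record_id: assign_split(d) for record_id, d in records}
```

Splitting by date rather than at random keeps every test record later in time than every training record, which is what makes the test set behave like a simulated prospective evaluation.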
Eight basic ECG morphology parameters (EMPs) were calculated by the Philips 12-Lead ECG machine: heart rate, PR interval, QRS duration, QT interval, QTc, P wave axis, RS wave axis, and T wave axis.
The Implementation of ECG12Net
We developed a 12-channel sequence-to-sequence model, which is modified from DenseNet [18]. The details are shown in Multimedia Appendix 1. The architecture of ECG12Net is shown in Figure 1. We designed an ECG lead block with 80 trainable layers whose architecture is shown in Figure 1 A. This ECG lead block was used to extract 864 features from each ECG lead, making a basic output prediction based on each lead. Figure 1 B shows how ECG12Net integrates all the information from the ECG leads to make an overall prediction. ECG12Net is composed of 12 of these ECG lead blocks corresponding to each lead sequence. We designed an attention mechanism based on a hierarchical attention network to concatenate these blocks, increasing the interpretive power of ECG12Net [25]. ECG12Net-1, which uses only ECG wave information, contains 82 trainable layers. To improve prediction performance, we added an EMPNet, which is a multilayer perceptron with two hidden layers containing eight EMPs, to ECG12Net-1 to create ECG12Net-2.
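To give a feel for how per-lead outputs can be merged, the following Python sketch combines 12 lead-level K+ predictions with softmax attention weights. It is a generic illustration only: the actual attention mechanism of ECG12Net is specified in Multimedia Appendix 1 and is not reproduced here, and the function names and numeric values are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def combine_leads(lead_predictions, attention_scores):
    """Attention-weighted average of per-lead K+ predictions (mEq/L)."""
    if len(lead_predictions) != len(attention_scores):
        raise ValueError("one attention score is needed per lead")
    weights = softmax(attention_scores)
    overall = sum(w * p for w, p in zip(weights, lead_predictions))
    return overall, weights


# Hypothetical per-lead predictions and raw attention scores for 12 leads.
lead_predictions = [4.1, 4.0, 3.9, 4.2, 4.0, 4.1, 3.2, 3.0, 3.1, 3.8, 4.0, 4.1]
attention_scores = [0.1, 0.0, 0.0, 0.2, 0.0, 0.1, 1.5, 2.0, 1.8, 0.4, 0.1, 0.0]
overall_k, lead_weights = combine_leads(lead_predictions, attention_scores)
```

Because the weights sum to 1, they can be read as the relative contribution of each lead to the overall prediction, which is one way an attention layer can add interpretability.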
Human-Machine Competition
We evaluated the performance of practicing physicians using a subtest set. We divided the data into five categories based on the serum K+ concentration: (1) K+ ≤2.5 mEq/L, (2) 2.5< K+ ≤3.5 mEq/L, (3) 3.5< K+ <5.5 mEq/L, (4) 5.5≤ K+ <6.5 mEq/L, and (5) K+ ≥6.5 mEq/L. Stratified sampling was used to create the subtest set due to the rarity of cases in the first and fifth categories. Each category of K+ concentration comprised 60 cases, and a total of 300 cases were used in the test. The participating physicians included an emergency physician under training (second-year resident); two emergency physicians, one with 4 and the other with 13 years of experience; a chief resident in cardiology; and two cardiologists, one with 2 and the other with 9 years of experience. The physicians had no access to patient information and no knowledge of the data. The responses they provided were entered into an online standardized data entry program. We calculated their sensitivity and specificity and compared their results with those of ECG12Net.
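The five K+ categories listed above translate directly into a small lookup function. The Python sketch below uses the cut points from the text; the function name is hypothetical, and the category labels in the comments follow the severe and mild-to-moderate wording used in Table 2.

```python
def potassium_category(k: float) -> int:
    """Map a serum K+ concentration (mEq/L) to categories 1-5 from the text."""
    if k <= 2.5:
        return 1  # severe hypokalemia
    if k <= 3.5:
        return 2  # mild-to-moderate hypokalemia
    if k < 5.5:
        return 3  # normokalemia
    if k < 6.5:
        return 4  # mild-to-moderate hyperkalemia
    return 5      # severe hyperkalemia


# Boundary values fall on the side given by the inequalities in the text.
examples = {k: potassium_category(k) for k in (2.5, 3.5, 4.0, 5.5, 6.5)}
```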
Statistical Analysis and Model Performance Assessment
The study cohort was divided into training, validation, and test sets. We presented their characteristics as the means and standard deviations, the numbers of patients, or the percentages, where appropriate. This information was compared using either analysis of variance or the chi-square test as appropriate. We then analyzed the EMP differences between the five serum K+ groups, and the EMPs were subjected to post hoc analysis. All the dyskalemia groups were compared to the normal group.
The primary analysis compared the performance of ECG12Net and the clinicians in dyskalemia prediction in a human-machine competition. Receiver operating characteristic curves and the areas under the curve (AUCs) were applied to evaluate the competition results. Additionally, the sensitivity, specificity, and balance accuracy of dyskalemia prediction by ECG12Net and the clinical physicians were calculated. The balance accuracy is defined as the mean of the sensitivity and specificity obtained in the study. Because the stratified sampling process distorted the original prevalence, the positive predictive value and negative predictive value for the competition results are not presented.
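As a worked illustration of these definitions, the Python sketch below computes sensitivity, specificity, and balance accuracy from counts. The example counts (81 of 120 hypokalemia cases detected, 168 of 180 non-hypokalemia cases correctly ruled out) are back-calculated from the proportions reported for ECG12Net-1 in Table 2 and are therefore an assumption, not data taken from the study files.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Proportion of true cases that were detected."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg: int, false_pos: int) -> float:
    """Proportion of non-cases that were correctly ruled out."""
    return true_neg / (true_neg + false_pos)


def balance_accuracy(sens: float, spec: float) -> float:
    """Mean of sensitivity and specificity, as defined in the text."""
    return (sens + spec) / 2


# Counts back-calculated from reported proportions (assumption).
sens = sensitivity(true_pos=81, false_neg=39)    # 81 / 120
spec = specificity(true_neg=168, false_pos=12)   # 168 / 180
bal_acc = balance_accuracy(sens, spec)
```

With these counts the three values round to 0.675, 0.933, and 0.804, in line with the ECG12Net-1 hypokalemia row of Table 2.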
The secondary analyses were performed on our test set with the data obtained after July 21, 2016, which had not been used in the training process. This was a simulated prospective study to evaluate the performance of the AI models with the mean absolute error (MAE) as the major measurement index due to the continuous predictions. Moreover, categorized analyses are also presented. Sensitivity, specificity, positive predictive value, negative predictive value, and the squared weighted kappa were used to evaluate the performance of the models. Finally, we conducted a series of logistic models to identify the effects of patient demographic characteristics on the performance of our deep-learning model.
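The two headline metrics of the secondary analysis can be sketched in a few lines of Python. This is a minimal illustration of the mean absolute error and of Cohen's kappa with squared (quadratic) weights, assuming class labels coded as integers from 0 to n_classes-1; the function names and example values are hypothetical, and the study itself used R.

```python
def mean_absolute_error(actual, predicted):
    """Average absolute difference between measured and predicted K+."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def squared_weighted_kappa(actual, predicted, n_classes):
    """Cohen's kappa with quadratic disagreement weights.

    Labels must be integers in range(n_classes). At least two classes
    must occur, otherwise the expected disagreement is zero.
    """
    n = len(actual)
    observed = [[0] * n_classes for _ in range(n_classes)]
    for a, p in zip(actual, predicted):
        observed[a][p] += 1
    row_totals = [sum(observed[i]) for i in range(n_classes)]
    col_totals = [sum(observed[i][j] for i in range(n_classes))
                  for j in range(n_classes)]
    weighted_observed = 0.0
    weighted_expected = 0.0
    for i in range(n_classes):
        for j in range(n_classes):
            weight = (i - j) ** 2 / (n_classes - 1) ** 2
            weighted_observed += weight * observed[i][j]
            weighted_expected += weight * row_totals[i] * col_totals[j] / n
    return 1.0 - weighted_observed / weighted_expected


# Hypothetical values: measured vs predicted K+ (mEq/L) and 3-class labels.
mae = mean_absolute_error([3.0, 4.0, 6.0], [3.5, 4.0, 5.0])
kappa = squared_weighted_kappa([0, 1, 2, 1, 0, 2], [0, 1, 2, 1, 0, 2], 3)
```

The quadratic weights penalize a hypokalemia case predicted as hyperkalemia more heavily than one predicted as normokalemia, which suits ordered categories such as K+ ranges.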
We used a significance level of P< throughout the analysis. Bootstrap 95% CIs were calculated and presented for all measure indexes based on 10,000 permutations. No additional adjustments for multiple comparisons were used because of the small number of planned comparisons. The statistical analysis was carried out using the software environment R, version 3.4.3 (The R Foundation).
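A bootstrap confidence interval of the kind described above can be sketched as follows. The text does not state which bootstrap variant was used, so the percentile method here is an assumption, as are the function name and the fixed random seed; the example input (58 of 60 severe hypokalemia cases detected) matches the count reported in the Model Interpretation section.

```python
import random


def bootstrap_ci(outcomes, n_resamples=10_000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the mean of a list of 0/1 outcomes."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(outcomes)
    stats = []
    for _ in range(n_resamples):
        # Draw n outcomes with replacement and record the resampled proportion.
        sample = rng.choices(outcomes, k=n)
        stats.append(sum(sample) / n)
    stats.sort()
    lower = stats[int((alpha / 2) * n_resamples)]
    upper = stats[int((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper


# 1 = detected, 0 = missed; 58 detected out of 60 cases.
outcomes = [1] * 58 + [0] * 2
ci_lower, ci_upper = bootstrap_ci(outcomes)
```

The same routine applies to specificity or balance accuracy by resampling the corresponding cases and recomputing the statistic on each draw.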
Results
Cohort Description
The training, validation, and test sets comprised records from 28,183; 3993; and 8004 patients, respectively. Table 1 shows the patient characteristics, which reveal similar distributions among the sets of gender, age, body mass index, marital status, education, and underlying comorbidities, including diabetes mellitus, coronary artery disease, hypertension, heart failure, hyperlipidemia, chronic kidney disease, chronic obstructive pulmonary disease, and pneumothorax. The training, validation, and test sets consisted of 46,692; 6407; and 13,222 pairs, respectively, of ECGs and K+ concentrations. The details of the laboratory and EMP analyses are presented in Multimedia Appendix 1. The detailed dyskalemia distribution (see Multimedia Appendix 1) shows a hypokalemia/hyperkalemia prevalence of 22.7%/2.6%, 22.9%/2.3%, and 22.7%/2.8% in the training, validation, and test sets, respectively.
Table 1.
Characteristic | Training set (N=28,183) | Validation set (N=3993) | Test set (N=8004) | P value
Gender, n (%) | | | | .08
  Female | 13,828 (49.07) | 1942 (48.64) | 3814 (47.65) |
  Male | 14,350 (50.92) | 2049 (51.31) | 4190 (52.35) |
Age (years), mean (SD) | 62.57 (19.45) | 62.47 (19.33) | 62.61 (19.25) | .93
Height (cm), mean (SD) | 162.24 (9.37) | 162.19 (9.58) | 163.29 (36.90) | .09
Weight (kg), mean (SD) | 63.98 (14.12) | 64.11 (14.16) | 63.75 (13.79) | .78
BMI (kg/m2), mean (SD) | 24.32 (6.38) | 24.39 (6.71) | 24.07 (4.49) | .24
Underlying comorbidities, n (%) | | | |
  Diabetes mellitus | 3553 (12.61) | 476 (11.92) | 1009 (12.61) | .47
  Coronary artery disease | 1694 (6.01) | 257 (6.44) | 485 (6.06) | .57
  Hypertension | 5219 (18.52) | 741 (18.56) | 1496 (18.69) | .94
  Heart failure | 825 (2.93) | 124 (3.11) | 239 (2.99) | .81
  Hyperlipidemia | 3868 (13.72) | 520 (13.02) | 1078 (13.47) | .45
  Chronic kidney disease | 6294 (22.33) | 859 (21.51) | 1786 (22.31) | .50
  Chronic obstructive pulmonary disease | 1351 (4.79) | 193 (4.83) | 408 (5.10) | .54
  Pneumothorax | 88 (0.31) | 11 (0.28) | 24 (0.30) | .92
Primary Analysis
The results of the human-machine competition are summarized in Figure 2. The AUCs of our ECG12Net-1 were 0.993, 0.926, 0.958, and 0.976 in the detection of severe hypokalemia, hypokalemia, hyperkalemia, and severe hyperkalemia, respectively. Due to the continuous nature of the K+ concentration predictions from ECG12Net, we used clinical cut points as described in the Methods section for further analysis. Our clinicians detected severe hypokalemia with sensitivities and specificities of 45.0%-78.3% and 74.4%-83.9%, respectively, whereas ECG12Net-1 achieved a sensitivity of 96.7% (95% CI 91.7-100.0) and a specificity of 93.3% (95% CI 89.4-96.7). In detecting severe hyperkalemia, the clinicians had nearly perfect specificity (92.8%-100.0%) but low sensitivity (16.7%-43.3%), while ECG12Net-1 exhibited a sensitivity of 83.3% (95% CI 73.3-91.7) and a specificity of 97.8% (95% CI 95.6-99.4). Including mild-to-moderate dyskalemias, ECG12Net-1 had the highest sensitivity in detecting hypokalemia (67.5%, 95% CI 59.2-75.8) and hyperkalemia (67.5%, 95% CI 59.2-75.8) in the human-machine competition. The details of the human-machine competition are shown in Table 2. In terms of balance accuracy, ECG12Net-1’s performance in hypokalemia detection was significantly better than that of the best clinician (cardiologist 2) (80.4%, 95% CI 75.7-84.9, vs 66.7%, 95% CI 61.4-72.1). In detecting hyperkalemia, the balance accuracy of ECG12Net-1 was also significantly better than that of the best clinician (cardiologist 3) (82.7%, 95% CI 78.2-86.8, vs 70.6%, 95% CI 65.6-75.4). Although ECG12Net-2 exhibited lower performance compared with ECG12Net-1, it performed much better than all of the clinicians. The results of the consistency analysis are shown in Multimedia Appendix 1. When inconsistency arose between the predictions made by ECG12Net and the experts, ECG12Net was approximately 3.85 times more likely to be correct (P<.001 based on the McNemar test).
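For readers unfamiliar with the McNemar test, the Python sketch below shows how discordant pairs could yield a "times more likely to be correct" ratio and a P value. It assumes the reported ratio is the ratio of discordant counts, which the text does not confirm; the counts used are purely hypothetical and are not the study data, and the statistic is computed without continuity correction.

```python
import math


def discordant_ratio(model_only_correct: int, expert_only_correct: int) -> float:
    """How many times more often the model alone was right than the expert alone."""
    return model_only_correct / expert_only_correct


def mcnemar_test(model_only_correct: int, expert_only_correct: int):
    """McNemar chi-square (1 df, no continuity correction) and its P value."""
    b, c = model_only_correct, expert_only_correct
    chi_square = (b - c) ** 2 / (b + c)
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi_square / 2))
    return chi_square, p_value


# Hypothetical discordant counts, chosen only to illustrate the arithmetic.
ratio = discordant_ratio(40, 10)
chi_square, p_value = mcnemar_test(40, 10)
```

Only the cases where the model and the expert disagree enter the test; cases where both were right or both were wrong carry no information about which of the two is better.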
Table 2.
All values are proportions (95% CI).
Type of dyskalemia and rater | Sensitivity [a], overall (n=120) | Sensitivity [a], severe (n=60) | Sensitivity [a], mild to moderate (n=60) | Specificity [a] (n=180) | Balance accuracy [b]
Hypokalemia (K+≤3.5 mEq/L) | | | | |
  Emergency physician 1 [c] | 0.300 (0.219-0.385) | 0.483 (0.356-0.613) | 0.117 (0.040-0.206) | 0.822 (0.765-0.875) | 0.561 (0.512-0.611)
  Emergency physician 2 [d] | 0.508 (0.420-0.598) | 0.683 (0.562-0.797) | 0.333 (0.217-0.455) | 0.744 (0.680-0.807) | 0.626 (0.572-0.682)
  Emergency physician 3 [e] | 0.467 (0.378-0.554) | 0.700 (0.581-0.812) | 0.233 (0.131-0.345) | 0.778 (0.717-0.835) | 0.622 (0.569-0.676)
  Cardiologist 1 [f] | 0.317 (0.236-0.403) | 0.450 (0.323-0.579) | 0.183 (0.091-0.288) | 0.839 (0.782-0.892) | 0.578 (0.528-0.628)
  Cardiologist 2 [g] | 0.550 (0.462-0.637) | 0.783 (0.673-0.885) | 0.317 (0.204-0.439) | 0.783 (0.722-0.842) | 0.667 (0.614-0.721)
  Cardiologist 3 [h] | 0.567 (0.477-0.654) | 0.767 (0.654-0.870) | 0.367 (0.246-0.492) | 0.761 (0.697-0.820) | 0.664 (0.608-0.718)
  ECG12Net-1 | 0.675 (0.592-0.758) | 0.967 (0.917-1.000) | 0.383 (0.267-0.500) | 0.933 (0.894-0.967) | 0.804 (0.757-0.849)
  ECG12Net-2 | 0.675 (0.592-0.758) | 0.967 (0.917-1.000) | 0.383 (0.267-0.500) | 0.922 (0.883-0.961) | 0.799 (0.751-0.843)
Hyperkalemia (K+≥5.5 mEq/L) | | | | |
  Emergency physician 1 | 0.192 (0.124-0.266) | 0.250 (0.145-0.365) | 0.133 (0.053-0.224) | 0.978 (0.954-0.995) | 0.585 (0.549-0.623)
  Emergency physician 2 | 0.175 (0.110-0.244) | 0.200 (0.103-0.304) | 0.150 (0.065-0.250) | 0.994 (0.982-1.000) | 0.585 (0.552-0.620)
  Emergency physician 3 | 0.208 (0.137-0.282) | 0.233 (0.130-0.344) | 0.183 (0.089-0.288) | 1.000 (1.000-1.000) | 0.604 (0.569-0.641)
  Cardiologist 1 | 0.108 (0.056-0.167) | 0.167 (0.077-0.266) | 0.050 (0.000-0.113) | 1.000 (1.000-1.000) | 0.554 (0.528-0.583)
  Cardiologist 2 | 0.200 (0.131-0.274) | 0.233 (0.132-0.345) | 0.167 (0.078-0.265) | 0.989 (0.971-1.000) | 0.594 (0.560-0.632)
  Cardiologist 3 | 0.483 (0.393-0.571) | 0.433 (0.305-0.558) | 0.533 (0.403-0.661) | 0.928 (0.888-0.963) | 0.706 (0.656-0.754)
  ECG12Net-1 | 0.675 (0.592-0.758) | 0.833 (0.733-0.917) | 0.517 (0.383-0.633) | 0.978 (0.956-0.994) | 0.827 (0.782-0.868)
  ECG12Net-2 | 0.683 (0.600-0.767) | 0.833 (0.733-0.917) | 0.533 (0.400-0.650) | 0.972 (0.944-0.994) | 0.828 (0.783-0.869)
[a] The test provides three selections for prediction: hypokalemia (K+ ≤3.5 mEq/L), normokalemia (3.5 mEq/L< K+ <5.5 mEq/L), and hyperkalemia (K+ ≥5.5 mEq/L).
[b] The balance accuracy value represents the average of the overall sensitivity and specificity.
[c] Emergency physician 1: second-year resident.
[d] Emergency physician 2: 4 years of experience.
[e] Emergency physician 3: 13 years of experience.
[f] Cardiologist 1: chief resident of cardiology.
[g] Cardiologist 2: 2 years of experience.
[h] Cardiologist 3: 9 years of experience.
Performance of ECG12Net on the Test Set
The model performance on the test set is shown in Multimedia Appendix 1. The performance of ECG12Net was better than that of any single lead. ECG12Net-1 had the lowest MAE (0.531). Including EMP information did not improve the prediction of K+ concentration (MAE ECG12Net-1: 0.531; MAE ECG12Net-2: 0.538). When predictions were categorized into three classes (hypokalemia, normokalemia, and hyperkalemia) or five classes (with the addition of severe hypokalemia and severe hyperkalemia), as described in Multimedia Appendix 1, ECG12Net-1 performed similarly and had the highest squared weighted kappa: 0.354 in the three-class categorization and 0.396 in the five-class categorization. For the detection of hypokalemia, the sensitivity, specificity, positive predictive value, and negative predictive value of ECG12Net-1 were 50.7%, 81.6%, 44.7%, and 85.0%, respectively; for hyperkalemia, they were 50.8%, 96.0%, 26.9%, and 98.5%, respectively. The confusion scatter plots for the predictions by the two ECG12Nets are shown in Figure 3. Importantly, in detecting severe hypokalemia and hyperkalemia, ECG12Net-1 demonstrated a sensitivity of 95.6% and 84.5%, respectively. ECG12Net-2 exhibited similar prediction capabilities for severe hypokalemia and hyperkalemia as ECG12Net-1.
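The link between these predictive values and disease prevalence can be checked with Bayes' rule. The Python sketch below recomputes the positive and negative predictive values from the reported sensitivity and specificity, using the test-set prevalences given in the Cohort Description (22.7% for hypokalemia, 2.8% for hyperkalemia). The function name is hypothetical, and because the inputs are rounded, the results should agree with the reported figures only to within rounding.

```python
def predictive_values(sensitivity: float, specificity: float, prevalence: float):
    """Return (PPV, NPV) implied by sensitivity, specificity, and prevalence."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    true_neg = specificity * (1 - prevalence)
    false_neg = (1 - sensitivity) * prevalence
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Reported ECG12Net-1 test-set values and test-set prevalences.
hypo_ppv, hypo_npv = predictive_values(0.507, 0.816, 0.227)
hyper_ppv, hyper_npv = predictive_values(0.508, 0.960, 0.028)
```

The low positive predictive value for hyperkalemia despite 96.0% specificity follows from its low prevalence: when few patients have the condition, even a small false-positive rate produces more false alarms than true detections.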
Model Interpretation
A total of 58 severe hypokalemia cases were correctly detected by ECG12Net-1, of which 15 (26%) were overlooked by clinician consensus. The classical ECG findings of U wave and ST segment depression, especially in leads V2 and V3, were consistently recognized as severe hypokalemia by both the clinicians and ECG12Net-1 (see Figure 4 A). As shown in Figure 4 B, ECG12Net-1 predicted a case of severe hypokalemia from ST segment depression in the V3 lead; this case was misdiagnosed by all the clinicians. Two cases of severe hypokalemia were misclassified by ECG12Net-1 but diagnosed correctly by the clinicians (data not shown). These cases had severe noise in the presented ECG; however, the clinicians made the correct diagnosis based on the presence of a prolonged QTc.
A total of 50 severe hyperkalemia cases were correctly detected by ECG12Net-1, with 36 (72%) of these cases overlooked by clinician consensus. Figure 4 C shows a typical ECG presentation of severe hyperkalemia with tented T waves accompanied by a long QRS complex duration, which was correctly diagnosed by all clinicians and ECG12Net-1. Figure 4 D shows a case of severe hyperkalemia correctly recognized by ECG12Net-1, with ST depression followed by a peaked T wave in lead V6, which was misdiagnosed as hypokalemia by all the clinicians. There were also 10 cases of severe hyperkalemia overlooked by ECG12Net-1 and all clinicians.
Discussion
In this study, we developed a deep-learning model, ECG12Net, to detect dyskalemias through ECG analysis. Using a deep convolutional network extracting many useful ECG features with a training set of more than 50,000 ECGs, ECG12Net performed better than clinicians in detecting dyskalemias. Notably, ECG12Net performed well with sensitivities of 95.6% and 84.5% in detecting severe hypokalemia and severe hyperkalemia, respectively.
ECG interpretation is one of the most important skills in medical practice. Previous studies have analyzed morphological features, for instance, the R wave peak [26] and the QRS complex [27], combined with machine learning approaches for disease detection, such as atrial fibrillation [28]. These systems were relatively imprecise, making it troublesome to quantify specific rhythm morphologies [29]. Although some recent studies have used deep convolutional neural networks and recurrent neural networks mainly for arrhythmia detection [30-35], most of the data were collected from wearable devices without offering all the important information provided by a 12-lead ECG [11]. The clinical value of these findings is also dampened by the lack of laboratory-based diagnosis and annotation and the relatively small volumes of data. In contrast, our database was unprecedented, comprising 40,180 patients and 66,321 laboratory-annotated ECG records collected by standard 12-lead ECG machines.
Galloway et al recently developed a deep-learning model to screen for hyperkalemia in patients with chronic kidney disease, stage III or higher, using ECG [36]. We applied ECG12Net to a broad set of patients in the ED and developed a continuous prediction of both hypokalemia and hyperkalemia. Moreover, although the three-category classification task in our study is more difficult than the two-category classification task in theirs, our ECG12Net achieved an AUC greater than 0.9 in detecting hyperkalemia, which is similar to that of their model with an AUC of 0.85-0.88. This highlights the strength of ECG12Net.
The EMPs of different K+ concentration groups yielded several interesting findings. The EMPs, such as the PR and QTc intervals, and the data used for analysis were all collected from the original ECGs (see Multimedia Appendix 1). The impact of hyperkalemia on the T wave axis was more profound than its impact on the axes of the P and RS waves. Hypokalemia was actually associated with a widening of the QRS complex, which may be explained by the decrease in conduction velocity caused by reduced K+ concentrations after hemodialysis [37]. Although the longest QTc occurred in the severe hypokalemia group, a well-documented finding, the QTc was longer in patients with hyperkalemia as well. In fact, for most of the intervals and durations, the nadir was in normokalemia, with increases in both forms of dyskalemia. Although the underlying mechanisms are unclear, these findings uncovered by big data may guide directions for further research.
Interestingly, the algorithm focusing only on morphologic changes (ie, ECG12Net-1) performed slightly better than that with additional EMP information (ie, ECG12Net-2). That the addition of EMP information did not improve the model’s predictive ability corroborates prior research that found that deep-learning models can automatically extract useful features for prediction without preprocessing [17,20,21]. This also highlights the importance of morphologic changes in ECG over EMPs in the detection of dyskalemias.
There are several clinical applications of ECG12Net shown in Multimedia Appendix 1. First, severe dyskalemia could be identified by ECG12Net within 5 minutes, much faster than laboratory testing, leading to more prompt management. Second, pseudodyskalemia, defined as an abnormal reported serum or plasma K+ concentration despite a normal in vivo K+ concentration, can be excluded early by ECG12Net to avoid inappropriate treatment. Third, the performance of ECG12Net is more than 10% better than that of the best cardiologist in our study, whose performance was similar to other experts in prior studies [38,39]. This means that emergency physicians could have access to a consistent, beyond cardiologist-level decision aid available 24 hours a day to help diagnose and manage dyskalemic patients. Fourth, the developed ECG12Net model can be included in a wearable device for dyskalemia detection, especially for patients with advanced chronic kidney disease or uremia on dialysis. Finally, the ECG12Net model could be incorporated into ECG machines in ambulances or remote areas to facilitate telemedicine.
Explainable AI plays a critical role in clinical practice [40,41]. The so-called “black box” approach in the deep-learning models often precludes the understanding of the decision-making process [42]. To increase the interpretability of our model, we established heatmaps to visualize the focus in the ECG by ECG12Net using class activation mappings [25,43], which can help physicians understand the logic of the AI decisions. Although our ECG12Net was approximately 3.85 times more likely to be correct when inconsistencies occurred between the AI and human predictions (see Multimedia Appendix 1), physicians who can integrate the AI suggestions with the symptoms and signs of patients should make the final decision to take appropriate action.
Some limitations of this study should be mentioned. First, the studied patients were enrolled from only one academic medical center, although the distribution of blood K+ concentration was similar to that in other large studies [44,45]. Multicenter validation is needed to confirm the value and application of this study. Second, only six clinicians competed against ECG12Net. Although their performance in severe hyperkalemia detection was consistent with that of the previous studies [38,39], comparisons should be made with more experts to confirm the superiority of ECG12Net. Third, only the patients in the ED with both an ECG and a serum K+ test were enrolled in this study, which may have caused selection bias and constrained the generalizability of the results. Fourth, although the sensitivity heatmap provides a glimpse into the basis for ECG12Net’s prediction, the reason why the particular ECG segment was highlighted remains unclear. Finally, ECG12Net showed decreased sensitivity in detecting mild-to-moderate hypokalemia, which accounts for the majority of dyskalemias, leading to low weighted averages of the sensitivities. Hypokalemia-associated ECG changes usually occur when the serum K+ level falls below 3 mEq/L [46], which may explain why our algorithm failed to accurately distinguish the ECG morphologies of mild-to-moderate hypokalemia from normokalemia.
In conclusion, we established a deep-learning model called ECG12Net to detect dyskalemias in the ED. The collaboration between physicians and AI can lead to better health care for our patients. This model will help emergency physicians promptly recognize severe dyskalemias and potentially reduce sudden cardiac death.
Acknowledgments
This study was supported in part by grants from the Ministry of Science and Technology, Taiwan (MOST 106-2314-B-016-035-MY3 to SHL, MOST 106-2314-B-016-038-MY3 and MOST 107-2511-H-016-002-MY2 to CSL, and MOST 108-2314-B-016-001- to CL), the Research Fund of Tri-Service General Hospital (TSGH-C-106-113 to SHL and TSGH-C107-007-007-S02 to CSL), and the Ter-Zer Foundation for Educational Achievement.
Abbreviations
- AI: artificial intelligence
- AUC: area under the curve
- ECG: electrocardiography
- ED: emergency department
- EMP: electrocardiography morphology parameter
- IRB: Institutional Review Board
- ISO: International Organization for Standardization
- MAE: mean absolute error
- QTc: corrected QT interval
Appendix
Supplementary materials.
Footnotes
Conflicts of Interest: None declared.
References
1. [No authors listed]. Editorial: Slow-K, quick quick, slow. Lancet. 1974 Nov 09;2(7889):1123–1124.
2. Priori SG, Blomström-Lundqvist C, Mazzanti A, Blom N, Borggrefe M, Camm J, Elliott PM, Fitzsimons D, Hatala R, Hindricks G, Kirchhof P, Kjeldsen K, Kuck K, Hernandez-Madrid A, Nikolaou N, Norekvål TM, Spaulding C, Van Veldhuisen DJ, ESC Scientific Document Group. 2015 ESC Guidelines for the management of patients with ventricular arrhythmias and the prevention of sudden cardiac death: The Task Force for the Management of Patients with Ventricular Arrhythmias and the Prevention of Sudden Cardiac Death of the European Society of Cardiology (ESC). Endorsed by: Association for European Paediatric and Congenital Cardiology (AEPC). Eur Heart J. 2015 Nov 01;36(41):2793–2867. doi: 10.1093/eurheartj/ehv316.
3. Gavala A, Myrianthefs P. Comparison of point-of-care versus central laboratory measurement of hematocrit, hemoglobin, and electrolyte concentrations. Heart Lung. 2017;46(4):246–250. doi: 10.1016/j.hrtlng.2017.04.003.
4. Dylewski JF, Linas S. Variability of potassium blood testing: Imprecise nature of blood testing or normal physiologic changes? Mayo Clin Proc. 2018 May;93(5):551–554. doi: 10.1016/j.mayocp.2018.03.019.
5. Diercks DB, Shumaik GM, Harrigan RA, Brady WJ, Chan TC. Electrocardiographic manifestations: Electrolyte abnormalities. J Emerg Med. 2004 Aug;27(2):153–160. doi: 10.1016/j.jemermed.2004.04.006.
6. Slovis C, Jenkins R. ABC of clinical electrocardiography: Conditions not primarily affecting the heart. BMJ. 2002 Jun 01;324(7349):1320–1323. doi: 10.1136/bmj.324.7349.1320. http://europepmc.org/abstract/MED/12039829.
7. Van Mieghem C, Sabbe M, Knockaert D. The clinical value of the ECG in noncardiac conditions. Chest. 2004 Apr;125(4):1561–1576. doi: 10.1378/chest.125.4.1561.
8. Dillon JJ, DeSimone CV, Sapir Y, Somers VK, Dugan JL, Bruce CJ, Ackerman MJ, Asirvatham SJ, Striemer BL, Bukartyk J, Scott CG, Bennet KE, Mikell SB, Ladewig DJ, Gilles EJ, Geva A, Sadot D, Friedman PA. Noninvasive potassium determination using a mathematically processed ECG: Proof of concept for a novel "blood-less, blood test". J Electrocardiol. 2015;48(1):12–18. doi: 10.1016/j.jelectrocard.2014.10.002. http://europepmc.org/abstract/MED/25453193.
9. Attia ZI, DeSimone CV, Dillon JJ, Sapir Y, Somers VK, Dugan JL, Bruce CJ, Ackerman MJ, Asirvatham SJ, Striemer BL, Bukartyk J, Scott CG, Bennet KE, Ladewig DJ, Gilles EJ, Sadot D, Geva AB, Friedman PA. Novel bloodless potassium determination using a signal-processed single-lead ECG. J Am Heart Assoc. 2016 Jan 25;5(1):e002746. doi: 10.1161/JAHA.115.002746. http://www.ahajournals.org/doi/full/10.1161/JAHA.115.002746?url_ver=Z39.88-2003&rfr_id=ori:rid:crossref.org&rfr_dat=cr_pub%3dpubmed.
10. Greenlee M, Wingo CS, McDonough AA, Youn J, Kone BC. Narrative review: Evolving concepts in potassium homeostasis and hypokalemia. Ann Intern Med. 2009 May 05;150(9):619–625. doi: 10.7326/0003-4819-150-9-200905050-00008. http://europepmc.org/abstract/MED/19414841.
11. Corsi C, Cortesi M, Callisesi G, De Bie J, Napolitano C, Santoro A, Mortara D, Severi S. Noninvasive quantification of blood potassium concentration from ECG in hemodialysis patients. Sci Rep. 2017 Feb 15;7:42492. doi: 10.1038/srep42492.
12. Corsi C, DeBie J, Napolitano C, Priori S, Mortara D, Severi S. Validation of a novel method for non-invasive blood potassium quantification from the ECG. Comput Cardiol. 2012;39:105–108. http://www.cinc.org/archives/2012/pdf/0105.pdf.
13. Velagapudi V, O'Horo JC, Vellanki A, Baker SP, Pidikiti R, Stoff JS, Tighe DA. Computer-assisted image processing 12 lead ECG model to diagnose hyperkalemia. J Electrocardiol. 2017;50(1):131–138. doi: 10.1016/j.jelectrocard.2016.09.001.
- 14.Montague BT, Ouellette JR, Buller GK. Retrospective review of the frequency of ECG changes in hyperkalemia. Clin J Am Soc Nephrol. 2008 Mar;3(2):324–330. doi: 10.2215/CJN.04611007. http://cjasn.asnjournals.org/cgi/pmidlookup?view=long&pmid=18235147. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 15.Simonyan K, Zisserman A. Very deep convolutional networks for large-scale image recognition. Proceedings of the 3rd International Conference on Learning Representations (ICLR 2015); 3rd International Conference on Learning Representations (ICLR 2015); May 7-9, 2015; San Diego, CA. 2015. pp. 1–14. https://arxiv.org/pdf/1409.1556.pdf. [Google Scholar]
- 16.Szegedy C, Liu W, Jia Y, Sermanet P, Reed S, Anguelov D. Going deeper with convolutions. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2015); IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2015); June 7-12, 2015; Boston, MA. 2015. pp. 1–9. [DOI] [Google Scholar]
- 17.He K, Zhang X, Ren S, Sun J. Deep residual learning for image recognition. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2016); IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2016); June 26-July 1, 2016; Las Vegas, NV. 2016. pp. 770–778. [DOI] [Google Scholar]
- 18.Huang G, Liu Z, Weinberger KQ, van der Maaten L. Densely connected convolutional networks. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2017); IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2017); July 21-26, 2017; Hawaii, HI. 2017. pp. 2261–2269. [DOI] [Google Scholar]
- 19.Krizhevsky A, Sutskever I, Hinton GE. ImageNet classification with deep convolutional neural networks. Commun ACM. 2017 May 24;60(6):84–90. doi: 10.1145/3065386. https://papers.nips.cc/paper/4824-imagenet-classification-with-deep-convolutional-neural-networks.pdf. [DOI] [Google Scholar]
- 20.Xiong W, Droppo J, Huang X, Seide F, Seltzer M, Stolcke A. Microsoft. 2017. Feb, [2020-02-21]. Achieving human parity in conversational speech recognition https://www.microsoft.com/en-us/research/wp-content/uploads/2016/11/ms_parity.pdf.
- 21.Amodei D, Ananthanarayanan S, Anubhai R, Bai J, Battenberg E, Case C, Casper J, Catanzaro B, Cheng Q, Chen G, Chen J, Chen J, Chen Z, Chrzanowski M, Coates A, Diamos G, Ding K, Du N, Elsen E, Engel J, Fang W, Fan L, Fougner C, Gao L, Gong C, Hannun A, Han T, Johannes L, Jiang B, Ju C, Jun B, LeGresley P, Lin L, Liu J, Liu Y, Li W, Li X, Ma D, Narang S, Ng A, Ozair S, Peng Y, Prenger R, Qian S, Quan Z, Raiman J, Rao V, Satheesh S, Seetapun D, Sengupta S, Srinet K, Sriram A, Tang H, Tang L, Wang C, Wang J, Wang K, Wang Y, Wang Z, Wang Z, Wu S, Wei L, Xiao B, Xie W, Xie Y, Yogatama D, Yuan B, Zhan J, Zhu Z. Deep Speech 2: End-to-end speech recognition in English and Mandarin. Proceedings of the International Conference on Machine Learning (ICML 2016); International Conference on Machine Learning (ICML 2016); June 19-24, 2016; New York, NY. 2016. pp. 1–10. http://proceedings.mlr.press/v48/amodei16.pdf. [DOI] [Google Scholar]
- 22.Esteva A, Kuprel B, Novoa RA, Ko J, Swetter SM, Blau HM, Thrun S. Dermatologist-level classification of skin cancer with deep neural networks. Nature. 2017 Feb 02;542(7639):115–118. doi: 10.1038/nature21056. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 23.Schenkel S. Promoting patient safety and preventing medical error in emergency departments. Acad Emerg Med. 2000 Nov;7(11):1204–1222. doi: 10.1111/j.1553-2712.2000.tb00466.x. https://onlinelibrary.wiley.com/resolve/openurl?genre=article&sid=nlm:pubmed&issn=1069-6563&date=2000&volume=7&issue=11&spage=1204. [DOI] [PubMed] [Google Scholar]
- 24.Levey AS, Stevens LA, Schmid CH, Zhang YL, Castro AF, Feldman HI, Kusek JW, Eggers P, Van Lente F, Greene T, Coresh J, CKD-EPI (Chronic Kidney Disease Epidemiology Collaboration) A new equation to estimate glomerular filtration rate. Ann Intern Med. 2009 May 05;150(9):604–612. doi: 10.7326/0003-4819-150-9-200905050-00006. http://europepmc.org/abstract/MED/19414839. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 25.Yang Z, Yang D, Dyer C, He X, Smola A, Hovy E. Hierarchical attention networks for document classification. Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (HLT-NAACL); 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (HLT-NAACL); June 12-17, 2016; San Diego, CA. 2016. pp. 1480–1489. https://www.aclweb.org/anthology/N16-1174.pdf. [DOI] [Google Scholar]
- 26.Li C, Zheng C, Tai C. Detection of ECG characteristic points using wavelet transforms. IEEE Trans Biomed Eng. 1995 Jan;42(1):21–28. doi: 10.1109/10.362922. [DOI] [PubMed] [Google Scholar]
- 27.Mukhopadhyay S, Biswas S, Roy A, Dey N. Wavelet based QRS complex detection of ECG signal. Int J Eng Res Appl. 2012 May;2(3):2361–2365. https://arxiv.org/ftp/arxiv/papers/1209/1209.1563.pdf. [Google Scholar]
- 28.Zabihi M, Rad A, Katsaggelos A, Kiranyaz S, Narkilahti S, Gabbouj M. Detection of atrial fibrillation in ECG hand-held devices using a random forest classifier. Comput Cardiol. 2017 Sep 24;44:1–4. doi: 10.22489/cinc.2017.069-336. http://www.cinc.org/archives/2017/pdf/069-336.pdf. [DOI] [Google Scholar]
- 29.Guglin ME, Thatai D. Common errors in computer electrocardiogram interpretation. Int J Cardiol. 2006 Jan 13;106(2):232–237. doi: 10.1016/j.ijcard.2005.02.007. [DOI] [PubMed] [Google Scholar]
- 30.Rajpurkar P, Hannun A, Haghpanahi M, Bourn C, Ng A. Stanford ML Group. 2017. [2020-02-21]. Cardiologist-level arrhythmia detection with convolutional neural networks https://arxiv.org/pdf/1707.01836.pdf.
- 31.Zihlmann M, Perekrestenko D, Tschannen M. Convolutional recurrent neural networks for electrocardiogram classification. Comput Cardiol. 2017;44:1–4. doi: 10.22489/cinc.2017.070-060. http://www.cinc.org/archives/2017/pdf/070-060.pdf. [DOI] [Google Scholar]
- 32.Rubin J, Parvaneh S, Rahman A, Conroy B, Babaeizadeh S. Densely connected convolutional networks and signal quality analysis to detect atrial fibrillation using short single-lead ECG recordings. Proceedings of 2017 Computing in Cardiology Conference (CinC); 2017 Computing in Cardiology Conference (CinC); September 24-27, 2017; Rennes, France. 2017. pp. 1–4. https://arxiv.org/ftp/arxiv/papers/1710/1710.05817.pdf. [DOI] [Google Scholar]
- 33.Acharya UR, Fujita H, Lih OS, Hagiwara Y, Tan JH, Adam M. Automated detection of arrhythmias using different intervals of tachycardia ECG segments with convolutional neural network. Inf Sci. 2017 Sep;405:81–90. doi: 10.1016/j.ins.2017.04.012. [DOI] [Google Scholar]
- 34.Tan JH, Hagiwara Y, Pang W, Lim I, Oh SL, Adam M, Tan RS, Chen M, Acharya UR. Application of stacked convolutional and long short-term memory network for accurate identification of CAD ECG signals. Comput Biol Med. 2018 Mar 01;94:19–26. doi: 10.1016/j.compbiomed.2017.12.023. [DOI] [PubMed] [Google Scholar]
- 35.Acharya UR, Fujita H, Oh SL, Hagiwara Y, Tan JH, Adam M. Application of deep convolutional neural network for automated detection of myocardial infarction using ECG signals. Inf Sci. 2017 Nov;415-416:190–198. doi: 10.1016/j.ins.2017.06.027. [DOI] [Google Scholar]
- 36.Galloway CD, Valys AV, Shreibati JB, Treiman DL, Petterson FL, Gundotra VP, Albert DE, Attia ZI, Carter RE, Asirvatham SJ, Ackerman MJ, Noseworthy PA, Dillon JJ, Friedman PA. Development and validation of a deep-learning model to screen for hyperkalemia from the electrocardiogram. JAMA Cardiol. 2019 May 01;4(5):428–436. doi: 10.1001/jamacardio.2019.0640. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 37.Severi S, Pogliani D, Fantini G, Fabbrini P, Viganò MR, Galbiati E, Bonforte G, Vincenti A, Stella A, Genovesi S. Alterations of atrial electrophysiology induced by electrolyte variations: Combined computational and P-wave analysis. Europace. 2010 Jun;12(6):842–849. doi: 10.1093/europace/euq042. [DOI] [PubMed] [Google Scholar]
- 38.Wrenn KD, Slovis CM, Slovis BS. The ability of physicians to predict hyperkalemia from the ECG. Ann Emerg Med. 1991 Nov;20(11):1229–1232. doi: 10.1016/s0196-0644(05)81476-3. [DOI] [PubMed] [Google Scholar]
- 39.Acker CG, Johnson JP, Palevsky PM, Greenberg A. Hyperkalemia in hospitalized patients: Causes, adequacy of treatment, and results of an attempt to improve physician compliance with published therapy guidelines. Arch Intern Med. 1998 Apr 27;158(8):917–924. doi: 10.1001/archinte.158.8.917. [DOI] [PubMed] [Google Scholar]
- 40.Holzinger A, Langs G, Denk H, Zatloukal K, Müller H. Causability and explainability of artificial intelligence in medicine. WIREs Data Min Knowl Discov. 2019 Apr 02;9(4):1–13. doi: 10.1002/widm.1312. https://www.onlinelibrary.wiley.com/doi/pdf/10.1002/widm.1312. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 41.Holzinger A, Kieseberg P, Weippl E, Tjoa A. Current advances, trends and challenges of machine learning and knowledge extraction: From machine learning to explainable AI. Proceedings of the International Cross-Domain Conference for Machine Learning and Knowledge Extraction; International Cross-Domain Conference for Machine Learning and Knowledge Extraction; August 27-30, 2018; Hamburg, Germany. 2018. pp. 1–8. [DOI] [Google Scholar]
- 42.Castelvecchi D. Can we open the black box of AI? Nature. 2016 Oct 06;538(7623):20–23. doi: 10.1038/538020a. [DOI] [PubMed] [Google Scholar]
- 43.Zhou B, Khosla A, Lapedriza A, Oliva A, Torralba A. Learning deep features for discriminative localization. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2016); IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2016); June 26-July 1, 2016; Las Vegas, NV. 2016. https://arxiv.org/pdf/1512.04150.pdf. [DOI] [Google Scholar]
- 44.Marti G, Schwarz C, Leichtle AB, Fiedler G, Arampatzis S, Exadaktylos AK, Lindner G. Etiology and symptoms of severe hypokalemia in emergency department patients. Eur J Emerg Med. 2014 Feb;21(1):46–51. doi: 10.1097/MEJ.0b013e3283643801. [DOI] [PubMed] [Google Scholar]
- 45.Nilsson E, Gasparini A, Ärnlöv J, Xu H, Henriksson KM, Coresh J, Grams ME, Carrero JJ. Incidence and determinants of hyperkalemia and hypokalemia in a large healthcare system. Int J Cardiol. 2017 Oct 15;245:277–284. doi: 10.1016/j.ijcard.2017.07.035. https://linkinghub.elsevier.com/retrieve/pii/S0167-5273(17)32575-5. [DOI] [PubMed] [Google Scholar]
- 46.El-Sherif N, Turitto G. Electrolyte disorders and arrhythmogenesis. Cardiol J. 2011;18(3):233–245. http://www.cardiologyjournal.org/en/darmowy_pdf.phtml?id=103&indeks_art=1446. [PubMed] [Google Scholar]