AI-Enabled ECG for Paroxysmal Atrial Fibrillation Detection: One Step to Closer to the Finish Line

Matthew M Kalscheur; Oguz Akbilgic

doi:10.1016/j.jacep.2023.05.023

Author manuscript; available in PMC: 2024 Aug 1.

Published in final edited form as: JACC Clin Electrophysiol. 2023 Jul 26;9(8 Pt 3):1783–1785. doi: 10.1016/j.jacep.2023.05.023

PMCID: PMC10928874 NIHMSID: NIHMS1970626 PMID: 37498242

With the rapidly increasing global burden of atrial fibrillation (AF) and the increased risk of stroke, heart failure, hospitalization, cognitive decline, and decreased quality of life associated with AF, identifying and treating AF is of paramount interest to public health.¹,² Unfortunately, AF is commonly first diagnosed following a disabling stroke.³,⁴ According to the most recent U.S. Preventive Services Task Force statement, evidence is insufficient to recommend for or against one-time screening strategies for AF by any method beyond pulse palpation in asymptomatic adults 50 years of age and older.⁵ However, in 2019, Attia et al⁶ published pioneering work introducing us to artificial intelligence (AI)–enabled electrocardiogram (ECG), a methodology that could play a pivotal role in facilitating more targeted screening for AF that leads to improved outcomes.⁶

In this issue of JACC: Clinical Electrophysiology, Gruwez et al⁷ make important contributions to crossing the finish line of widespread implementation of AI-enabled ECG. Attia et al⁶ had previously shown the use of convolutional neural networks (CNNs) to develop a model to identify patients with high likelihood of paroxysmal AF using sinus rhythm ECGs. A prospective trial using this model showed improved yield in AF screening.⁸ Noseworthy et al⁸ used the Mayo Clinic AI-enabled ECG model to stratify patients by AI-enabled ECG risk. Patients in the high AI-enabled ECG risk group had increased diagnostic yield for AF detection compared to usual care. This important trial set the stage for making this model clinically useful. However, there are at least 3 important hurdles to clear before widespread implementation: 1) replication of model performance in different populations with particular attention paid toward the robustness of AF annotations; 2) improved understanding of the features identified by the CNN to differentiate patients; and 3) demonstration that model predictions result in interventions that positively impact patients.

Here, Gruwez et al⁷ independently built a model to predict latent AF from sinus rhythm ECGs using a CNN architecture and methods replicating the work of Attia et al⁶ in Ziekenhuis Oost-Limburg (ZOL) (Genk, Belgium) and applied that model to a separate dataset from Ziekenhuis Maas en Kempen (ZMK) (Maaseik, Belgium). In both settings, the model exhibited similar performance to prior work. Gruwez et al⁷ thoughtfully discuss model performance, an important addition to guide implementation efforts. When deciding how to implement an AI-enabled ECG model to predict latent AF, the impact of disease prevalence requires consideration. Gruwez et al⁷ describe lower model performance after simulating their results in a lower event rate scenario by randomly excluding two-thirds of AF cases in their analytical cohort. Further study is warranted to determine whether the deteriorated model performance in the lower event rate scenario was due to the low event rate itself or to the significantly smaller number of AF cases available to train the model (or to optimize its hyperparameters). Regardless, a positive result in a low prevalence population will have a lower positive predictive value compared to a positive result in a high prevalence population and should impact the resulting intervention (eg, noninvasive monitoring vs invasive monitoring for AF).
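The dependence of positive predictive value on prevalence follows directly from Bayes' rule and can be illustrated with a minimal sketch. The sensitivity, specificity, and prevalence values below are hypothetical placeholders chosen only for illustration; they are not results reported by Gruwez et al⁷ or Attia et al,⁶ and the function name is our own.

```python
def positive_predictive_value(sensitivity: float, specificity: float, prevalence: float) -> float:
    """Bayes' rule: P(disease | positive test).

    All three inputs are proportions between 0 and 1.
    """
    true_positive_rate = sensitivity * prevalence
    false_positive_rate = (1.0 - specificity) * (1.0 - prevalence)
    return true_positive_rate / (true_positive_rate + false_positive_rate)


if __name__ == "__main__":
    # Hypothetical operating point, held fixed while prevalence varies.
    sensitivity, specificity = 0.80, 0.80
    for prevalence in (0.02, 0.05, 0.15):
        ppv = positive_predictive_value(sensitivity, specificity, prevalence)
        print(f"prevalence={prevalence:.2f}  PPV={ppv:.3f}")
```

With sensitivity and specificity held constant, the printed positive predictive value rises with prevalence, which is the reason the intensity of follow-up monitoring might reasonably differ between low and high prevalence populations.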

In terms of the hurdles described above, this work contributes in 2 important ways. External replication and validation of the AI-enabled ECG model to predict latent AF is a critical next step. This work pushes us toward that goal, but significant work is still required. The performance of supervised AI models depends upon the quality of outcome annotations. In theory, deriving the model at one site and validating it externally at another provides a fair assessment of the trained model. However, with the 2 institutions being <20 miles apart, shared patients seem likely, implying that data from one patient could be in both the derivation and external validation cohorts. Further, a patient with a no-AF annotation at one institution may have had an AF annotation at the other. A comprehensive chart review on a subset of patients to verify annotations would be interesting. Yet such limitations do not necessarily imply that the true model performance is lower; indeed, it could be even higher, as some patients with false-positive results may have received the diagnosis elsewhere.

To cross the finish line, a next step may be a collaborative learning effort using methods such as federated learning (FL) or institutional incremental learning.⁹ Developing a model that is broadly effective requires a large amount of data from diverse populations—including across nations. The most common paradigm to accomplish this to date involves multiple institutions pooling data in a central location for model training. Accomplishing this data sharing requires overcoming privacy and technical barriers that are even more difficult for an international initiative.

FL represents a potential solution to this problem as partner institutions train models locally on their own data in parallel. Individual models are then aggregated into a consensus model on a central server and distributed back to the individual institutions. The EXAM model (electronic medical record chest x-ray AI model) is an FL model developed among 20 institutes worldwide that predicts future oxygen requirements of patients presenting to the emergency department with symptomatic COVID-19.¹⁰ This important proof of concept achieved an average area under the curve of >0.92 for predicting outcomes at 24 and 72 hours; the consensus model showed a 16% performance improvement compared to local models and a 38% improvement in generalizability. Future AI-enabled ECG studies may benefit similarly from an FL approach.
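The train-locally, aggregate-centrally, redistribute loop described above can be sketched in a few lines. This is a toy illustration under stated assumptions, not the EXAM implementation: the model is a 2-parameter linear fit, aggregation is a sample-size-weighted average of parameters (one common choice), and every function name and data value is hypothetical.

```python
from typing import List, Sequence, Tuple

Weights = List[float]
Dataset = Sequence[Tuple[float, float]]  # (x, y) pairs held privately by one site


def local_update(weights: Weights, data: Dataset, lr: float = 0.1, epochs: int = 5) -> Weights:
    """Hypothetical local training: gradient descent on mean squared error for y ~ w0 + w1 * x."""
    w0, w1 = weights
    n = len(data)
    for _ in range(epochs):
        grad0 = 2.0 * sum((w0 + w1 * x - y) for x, y in data) / n
        grad1 = 2.0 * sum((w0 + w1 * x - y) * x for x, y in data) / n
        w0 -= lr * grad0
        w1 -= lr * grad1
    return [w0, w1]


def federated_average(site_weights: Sequence[Weights], site_sizes: Sequence[int]) -> Weights:
    """Server step: average the site parameters, weighted by each site's sample count."""
    total = sum(site_sizes)
    dim = len(site_weights[0])
    return [
        sum(w[i] * n for w, n in zip(site_weights, site_sizes)) / total
        for i in range(dim)
    ]


def run_rounds(sites: Sequence[Dataset], rounds: int = 50) -> Weights:
    """Repeat: broadcast consensus, train locally at each site, aggregate on the server."""
    consensus: Weights = [0.0, 0.0]
    for _ in range(rounds):
        local_models = [local_update(consensus, data) for data in sites]
        consensus = federated_average(local_models, [len(data) for data in sites])
    return consensus


if __name__ == "__main__":
    # Two hypothetical sites; only model parameters, never raw data, reach the server.
    site_a = [(x, 1.0 + 2.0 * x) for x in (0.0, 1.0, 2.0)]
    site_b = [(x, 1.0 + 2.0 * x) for x in (-1.0, 0.0, 1.0, 3.0)]
    print(run_rounds([site_a, site_b]))
```

The point of the design is that `federated_average` only ever receives parameter vectors and sample counts, so patient-level records stay at the originating institution.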

The subgroup analysis and saliency mapping presented by Gruwez et al⁷ provide a second nudge. Saliency mapping is a technique that highlights input features that are discriminative with respect to class. Gruwez et al⁷ used this technique to determine that the terminal part of the P-wave was critical to determining the probability of latent AF, a finding consistent with the hypothesis that the CNN identifies features related to structural changes in the left atrium. This observation could serve as a jumping off point to understand false-negative results. For example, there may be a significant difference in left atrial size in patients with false-negative AI-enabled ECG results compared to true-positive results. In a subgroup analysis, Gruwez et al⁷ showed that the performance of the model was higher in women (area under the receiver operating characteristic curve of 0.90 compared to that for men of 0.84). This finding may be consistent with a recent study showing more advanced atrial remodeling in women.¹¹ Future AI-enabled ECG studies should include analyses exploring feature importance and subgroup analyses to continue to generate hypotheses that may provide more insight into the model itself and the underlying disease process.
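To make the idea of saliency concrete, the following minimal sketch scores each input sample by how strongly a small perturbation of that sample changes the model output. It is an assumption-laden toy: the "model" is a hypothetical logistic score that depends only on 3 samples of a short signal, and the sensitivity is estimated by finite differences rather than by backpropagation through a CNN. It does not reproduce the method used by Gruwez et al.⁷

```python
import math
from typing import Callable, List, Sequence


def toy_model(signal: Sequence[float]) -> float:
    """Hypothetical stand-in for a trained classifier.

    Returns a probability-like score that depends only on samples 6, 7, and 8.
    """
    z = sum(signal[6:9]) - 1.0
    return 1.0 / (1.0 + math.exp(-z))


def saliency(model: Callable[[Sequence[float]], float],
             signal: Sequence[float],
             eps: float = 1e-4) -> List[float]:
    """Central finite-difference estimate of |d model / d signal[i]| for every sample i."""
    scores = []
    for i in range(len(signal)):
        up = list(signal)
        down = list(signal)
        up[i] += eps
        down[i] -= eps
        scores.append(abs(model(up) - model(down)) / (2.0 * eps))
    return scores


if __name__ == "__main__":
    signal = [0.1] * 12  # hypothetical 12-sample input
    for index, score in enumerate(saliency(toy_model, signal)):
        print(index, round(score, 4))
```

In this toy, only the samples the model actually uses receive nonzero saliency, which mirrors how a saliency map over an ECG can point to a specific segment such as the terminal part of the P-wave.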

The current work clearly advances the case that AI-enabled ECGs are likely to impact clinical care. However, there is still one last hurdle to clear. The model output must be tied to an action that leads to an intervention, and the clinical impact of that intervention is the true outcome of interest.¹² The models presented here and by the Mayo group were not developed to predict future AF. Rather, they identify structural changes associated with AF in sinus rhythm ECGs. In the test sets, it seems likely that many of the sinus rhythm ECGs tested were recorded sometime after the AF diagnosis. A subsequent analysis to demonstrate model performance using only sinus rhythm ECGs obtained in the window of interest prior to AF diagnosis would be interesting. Even with that caveat, we have already observed that an AI-enabled ECG model can increase the yield of AF screening.⁸ Hopefully, a well-validated, generalizable AI-enabled ECG model will lead to targeted screening that reduces adverse outcomes related to AF without causing increased harm. At that point, AI-enabled ECG for AF detection will have reached the finish line.

FUNDING SUPPORT AND AUTHOR DISCLOSURES

Dr Akbilgic has received grants from NIH for projects R01CA261834 and R21HL167126. Dr Kalscheur has reported that he has no relationships relevant to the contents of this paper to disclose.

Footnotes

The authors attest they are in compliance with human studies committees and animal welfare regulations of the authors’ institutions and Food and Drug Administration guidelines, including patient consent where appropriate. For more information, visit the Author Center.

REFERENCES

1. Hindricks G, Potpara T, Dagres N, et al. 2020 ESC guidelines for the diagnosis and management of atrial fibrillation developed in collaboration with the European Association of Cardio-Thoracic Surgery (EACTS): the task force for the diagnosis and management of atrial fibrillation of the European Society of Cardiology (ESC) developed with special contribution of the European Heart Rhythm Association of the ESC. Eur Heart J. 2021;42(5):373–498.
2. Kornej J, Borschel CS, Benjamin EJ, Schnabel RB. Epidemiology of atrial fibrillation in the 21st century: novel methods and new insights. Circ Res. 2020;127(1):4–20.
3. Lin H-J, Wolf PA, Benjamin EJ, Belanger AJ, D’Agostino RB. Newly diagnosed atrial fibrillation and acute stroke: the Framingham study. Stroke. 1995;26(9):1527–1530.
4. Gladstone DJ, Bui E, Fang J, et al. Potentially preventable strokes in high-risk patients with atrial fibrillation who are not adequately anti-coagulated. Stroke. 2009;40(1):235–240.
5. U.S. Preventive Services Task Force, Davidson KW, Barry MJ, et al. Screening for atrial fibrillation: US Preventive Services Task Force recommendation statement. JAMA. 2022;327(4):360–367.
6. Attia ZI, Noseworthy PA, Lopez-Jimenez F, et al. An artificial intelligence-enabled ECG algorithm for the identification of patients with atrial fibrillation during sinus rhythm: a retrospective analysis of outcome prediction. Lancet. 2019;394(10201):861–867.
7. Gruwez H, Barthels M, Haemers P, et al. Detecting paroxysmal atrial fibrillation from an electrocardiogram in sinus rhythm: external validation of the AI approach. J Am Coll Cardiol EP. 2023;9:1771–1782.
8. Noseworthy PA, Attia ZI, Behnken EM, et al. Artificial intelligence-guided screening for atrial fibrillation using electrocardiogram during sinus rhythm: a prospective non-randomised interventional trial. Lancet. 2022;400(10359):1206–1212.
9. Sheller MJ, Edwards B, Reina GA, et al. Federated learning in medicine: facilitating multi-institutional collaborations without sharing patient data. Sci Rep. 2020;10(1):12598.
10. Dayan I, Roth HR, Zhong A, et al. Federated learning for predicting clinical outcomes in patients with COVID-19. Nat Med. 2021;27(10):1735–1743.
11. Wong GR, Nalliah CJ, Lee G, et al. Sex-related differences in atrial remodeling in patients with atrial fibrillation: relationship to ablation outcomes. Circ Arrhythm Electrophysiol. 2022;15(1):e009925.
12. Smith MA, Adelaine S, Bednarz L, Patterson BW, Pothof J, Liao F. Predictive solutions in learning health systems: the critical need to systematize implementation of prediction to action to intervention. NEJM Catalyst. 2021;2(5).
