Author manuscript; available in PMC: 2025 Dec 6.
Published in final edited form as: Circ Arrhythm Electrophysiol. 2025 Sep 30;18(10):e013734. doi: 10.1161/CIRCEP.125.013734

Prediction of Atrial Fibrillation from the Electrocardiogram in the Community Using Deep Learning: A Multinational Study

Luisa C C Brant 1, Antônio H Ribeiro 2, Oseiwe B Eromosele 3, Marcelo M Pinto-Filho 1, Sandhi M Barreto 1, Bruce B Duncan 1,4, Martin G Larson 5,6, Emelia J Benjamin 3,6,7, Antonio L P Ribeiro 1, Honghuang Lin 8
PMCID: PMC12569998  NIHMSID: NIHMS2118667  PMID: 41025252

Abstract

Background.

We aimed to refine and validate a deep neural network (DNN) model from the electrocardiogram (ECG) to predict atrial fibrillation (AF) risk, using samples from diverse backgrounds: the Framingham Heart Study (FHS), UK Biobank, and ELSA-Brasil. We compared the model’s performance to the clinical CHARGE-AF risk score and evaluated the association with other cardiovascular outcomes.

Methods.

The ECG-AF model was refined using 60% of FHS samples free of AF. Its performance was then tested in the remaining FHS samples, UK Biobank, and ELSA-Brasil, with discrimination assessed by the area under the receiver operating characteristic curve (AUC). The association of ECG-AF with cardiovascular outcomes was assessed using Cox proportional hazards models.

Results.

The study sample included 10,097 FHS participants (mean age 53±12 years; 54.9% women), 49,280 participants from the UK Biobank (mean age 64±8 years, 47.9% women), and 12,284 participants from ELSA-Brasil (mean age 53±8 years, 54.7% women). The ECG-AF model showed moderate discrimination for incident AF (AUC=0.82, 95% CI 0.80–0.84) in the FHS, comparable to the CHARGE-AF score (AUC=0.83, 95% CI 0.81–0.85), and incremental when combined (AUC=0.85, 95% CI 0.83–0.87). In UK Biobank and ELSA-Brasil, combining ECG-AF and CHARGE-AF also improved prediction. Higher ECG-AF scores were associated with increased risks of heart failure, myocardial infarction, stroke, and all-cause mortality in all three cohorts.

Conclusions.

In multinational cohort studies, the single input ECG-AF DNN model demonstrated good performance in predicting AF and other cardiovascular outcomes, comparable to a multivariable clinical risk score, with improved performance when combined.

Keywords: atrial fibrillation, electrocardiogram, artificial intelligence, prediction, cohort studies

Introduction

Atrial fibrillation (AF) is the most common arrhythmia worldwide, with an estimated prevalence of at least 52.6 (95% UI 43.5–63.7) million people in 2021.1 AF is associated with an increased risk of stroke,2,3 dementia,4,5 myocardial infarction (MI),6 heart failure (HF),7 and death.8 In individuals with AF, the lifetime risks of developing subsequent HF and stroke are two in five and one in five, respectively.9 From the public health perspective, AF is associated with major socioeconomic burdens including higher costs and hospitalizations.10

Identifying individuals with higher AF risk may help to promote AF primary prevention in these individuals through risk factor modification.11 It may also facilitate early diagnosis by enhancing AF screening in individuals with elevated AF risk.12 From the population perspective, identifying individuals with higher AF risk to target screening may improve resource allocation by reducing the number needed to screen.13 Traditionally, AF prediction is assessed through clinical risk scores, such as the well-validated score from the Cohorts for Heart and Aging Research in Genomic Epidemiology consortium (CHARGE-AF).14,15 More recently, models to predict AF developed from the electrocardiogram (ECG) signal (ECG-AF models) through deep neural networks (DNN) have also been described. These ECG-AF models have been developed from ECGs of individuals receiving care at hospitals in the US or Canada,16–18 or in primary care centers in Brazil (the Clinical Outcomes in Digital Electrocardiography, CODE study),19 or in a community-based study of older persons in the US.20 The predictive performance of ECG-AF models has primarily been evaluated in healthcare-based populations, where they have demonstrated incremental discrimination when added to existing clinical risk scores. However, there is a paucity of studies assessing their performance in more diverse populations, particularly those with broader age ranges and systematically collected risk factors and outcomes. Additionally, the potential of ECG-AF models to predict other adverse cardiovascular events associated with AF remains underexplored.

We refined the ECG-AF DNN model for 5-year AF prediction, originally developed in the CODE study in Brazil, using a subset of individuals from the community-based Framingham Heart Study (FHS). We aimed to compare the discrimination, calibration, and reclassification performance of this refined ECG-AF model to the CHARGE-AF risk prediction score in another subset of FHS participants and to validate it in individuals from the UK Biobank and the ELSA-Brasil cohort study. We then evaluated whether adding the ECG-AF to the CHARGE-AF model (CHARGE-ECG) improved AF discrimination, calibration, and reclassification. Lastly, we evaluated whether ECG-AF could also predict other cardiovascular outcomes related to AF, including HF, MI, stroke, and death.

Methods

Data availability

The data that support the study’s findings are available through the FHS for Researchers Portal https://www.framinghamheartstudy.org/fhs-for-researchers/ or NIH’s BioLINCC https://biolincc.nhlbi.nih.gov/home/. UK Biobank data are publicly available by application (www.ukbiobank.ac.uk). ELSA-Brasil data are also available for researchers upon reasonable request (http://elsabrasil.org/pesquisadores/). The DNN-based ECG-AF model21 is available at https://github.com/mygithth27/af-risk-prediction-by-ecg-dnn.

Study participants

FHS is a community-based cohort initiated in 1948. Three generations of participants have been recruited. The detailed information of the study has been described previously.22 In the current study, we included ECGs free of AF or atrial flutter from participants of the FHS older than 40 years. Participants were eligible to re-enter the analyses for subsequent time periods after five years if they were still alive and did not have AF. Each index examination with its follow-up period is considered to be a separate person examination similar to our prior study.23 We considered participants at risk until death or age at the last follow-up, whichever came first (Supplemental Figure 1).

The UK Biobank is a nationwide, population-based prospective study, embedded within the UK’s National Health Service.24 It aims to understand the genetic, behavioral, and environmental determinants of common diseases.25 More than 500,000 participants aged 40–69 years were recruited during 2006–2010 at 22 assessment centers throughout the UK.26 In the current study, we included 49,280 participants with ECGs and related clinical data (Supplemental Figure 1).

The ELSA-Brasil study is a multicenter cohort that enrolled 15,105 participants between 2008 and 2010.27 Participants were civil servants from five federal universities and one federal research institute, located in the capital cities of different Brazilian states. For the present analysis, we included individuals who had a valid baseline ECG recorded as a digital signal (not as an image or paper tracing) that could be processed by the AI model, and who were over 40 years of age, resulting in 12,717 eligible participants. We further excluded individuals with missing data on AF risk factors (n = 186) and those with prevalent AF at baseline (n = 247), yielding a final analytic sample of 12,284 participants (Supplemental Figure 1).

Artificial Intelligence Model to Predict Atrial Fibrillation from the Electrocardiogram (ECG-AF)

Electrocardiogram acquisition

In FHS, ECGs have been routinely performed on all participants as part of standard clinical assessments. Beginning in 1986, FHS adopted digital ECG systems, initially using the Marquette MAC/PC and later the Marquette MAC 5000 (General Electric). Currently, the MUSE 8 ECG Management System (General Electric) is in use, enabling centralized access and contemporary analysis of all ECGs collected since the transition to digital formats.28 For this study, we included ECGs collected between 1986 and 2021, restricted to those without evidence of AF.

In the UK Biobank, a standardized resting 12-lead ECG was performed at a UK Biobank Imaging Assessment Centre. The GE Cardiosoft program was loaded onto the workstation and used to record the ECGs. We retrieved the ECG data as Extensible Markup Language (XML) files (Data-Field 20205). The ECG leads were recorded with a 500 Hz sampling frequency for 10 s.29 More information can be found at https://biobank.ctsu.ox.ac.uk/crystal/field.cgi?id=12323.

In ELSA-Brasil, 12-lead ECGs were performed at each study site using a standardized protocol with the Burdick Atria 6100 machine. Recordings were digitally acquired with 12 high-frequency leads at a sampling rate of 500 samples per second per channel over a 10-second duration. These digital ECGs were then transmitted to the centralized ECG reading center at the ELSA-Brasil research unit in Minas Gerais, where they were reviewed and interpreted by a senior cardiologist.30

Deep neural network (DNN) model for AF prediction

We used a DNN because of its strong performance and the possibility of transfer learning,31 which allows us to effectively adapt models pretrained on large datasets (such as CODE) to smaller cohorts (such as Framingham). Deep learning has also demonstrated state-of-the-art results in ECG analysis across several clinical tasks,32–34 and has consistently performed well in recent PhysioNet challenges.35

We used a two-stage approach to develop the DNN for AF prediction. In the first stage, we pre-trained our model on the CODE dataset, and in the second stage, we refined it with data from FHS. The pretraining followed a previously published protocol, which provides additional details. In brief, patients with a single exam were excluded unless that exam showed AF.21 The remaining patients were included in the analysis. This approach was designed to leverage the large number of annotated ECGs available in the CODE dataset alongside the more detailed and reliable follow-up data from the FHS. We trained a convolutional neural network (CNN) based on a residual network architecture to classify ECGs into one of three categories: no AF, prevalent AF, or incident AF. After applying the appropriate inclusion criteria, the CODE dataset comprised 631,514 ECGs classified as no AF, 41,851 as prevalent AF, and 12,280 as incident AF. The data were split into 60% for training, 10% for validation, and 30% for testing. Details of the model architecture and evaluation procedures are available in our prior publication.21 In the second stage, the model was further refined using FHS data by updating the neural network weights with the same optimization algorithm.

Fine-tuning a pre-trained model on a smaller dataset typically requires a reduced learning rate to ensure stable convergence.31 Accordingly, while a learning rate of 0.001 was used for training on the CODE dataset, a smaller learning rate of 0.0001 was applied during refinement on the FHS data. For this second phase, 60% of FHS ECGs were used to update the model parameters, and the remaining 40% were reserved to evaluate model performance. To mitigate overfitting, we employed standard regularization techniques, including dropout and weight decay. Model overfitting was further assessed using a hold-out test set within FHS, as well as external validation in two independent cohorts: the UK Biobank and ELSA-Brasil. Additionally, we performed 10-fold cross-validation to estimate prediction variability and to assess whether the model generalizes consistently to unseen data.
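The 60%/40% division of FHS data described above can be illustrated with a minimal sketch. It assumes the split is made at the participant level, so that all ECGs from one person fall on the same side. The function name, the record layout (`participant_id`, `ecg_id`), and the random seed are hypothetical and are not taken from the study code.

```python
import random
from collections import defaultdict


def split_by_participant(ecg_records, train_fraction=0.6, seed=0):
    """Split ECG records so that every ECG of a participant lands in one set.

    Assumption: each record is a dict with a 'participant_id' key.
    """
    by_participant = defaultdict(list)
    for record in ecg_records:
        by_participant[record["participant_id"]].append(record)

    # Shuffle participant identifiers reproducibly, then cut at the fraction.
    participant_ids = sorted(by_participant)
    random.Random(seed).shuffle(participant_ids)
    n_train = round(train_fraction * len(participant_ids))

    train = [r for pid in participant_ids[:n_train] for r in by_participant[pid]]
    test = [r for pid in participant_ids[n_train:] for r in by_participant[pid]]
    return train, test


# Toy data: 10 hypothetical participants with 1 to 3 ECGs each.
records = [
    {"participant_id": pid, "ecg_id": f"{pid}-{k}"}
    for pid in range(10)
    for k in range(1 + pid % 3)
]
train, test = split_by_participant(records)
```

Splitting by participant rather than by ECG keeps repeated recordings of the same person out of both sets at once, which would otherwise inflate the apparent test performance.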

Clinical risk score to predict AF

We calculated the CHARGE-AF score, which has been validated for 5-year AF prediction.36 This score used pooled data from 18,556 participants of the FHS, the Cardiovascular Health Study, and the Atherosclerosis Risk in Communities (ARIC) study, and included the following characteristics: age, race, height, weight, systolic and diastolic blood pressure, current smoking, treatment of hypertension, diabetes, and history of MI and HF.(26,27) CHARGE-AF was then further validated in other diverse cohorts.37, 38

Clinical variables

In FHS, participants underwent interviews, physical examinations, and laboratory measurements, as detailed previously.22,39 Race was classified as White or other races or ethnic groups, similar to CHARGE-AF.40 Systolic and diastolic blood pressure were measured according to the FHS protocol. Current cigarette smoking (within the year before examination) and medications for blood pressure, diabetes, and lipid-lowering were assessed by self-report. Diabetes was defined as treatment with a hypoglycemic agent, fasting blood glucose ≥126 mg/dL, or non-fasting plasma glucose of ≥200 mg/dL. Myocardial infarction (MI) was defined when the participant had ≥2 of 3 findings: (1) symptoms indicative of ischemia, (2) changes in blood biomarkers of myocardial necrosis, and (3) serial changes in the ECG. Heart failure (HF) was defined by the FHS uniform criteria.41

In the UK Biobank, participants completed questionnaires, underwent physical assessments, and provided blood, urine, and saliva samples. Demographic and clinical variables, including age, sex, race, body mass index (BMI), and blood pressure, were collected at baseline visits. Smoking status was determined from self-reported questionnaires and categorized as current, former, or never smoking. Additionally, participants with at least one ICD-9 or ICD-10 code indicating active tobacco use during a linked medical encounter (e.g., toxic effects of tobacco or nicotine) were also classified as current smoking. Clinical conditions included in the CHARGE-AF risk score—such as hypertension, diabetes, coronary artery disease, and heart failure—were defined either through self-reported questionnaires at baseline or by the presence of at least one ICD-9 or ICD-10 code for the condition listed as a primary or secondary diagnosis, or as a cause of death in linked health records. Detailed variable definitions have been previously described.26,28

In the ELSA-Brasil study, baseline assessments included standardized interviews, physical examinations, and laboratory testing.42 Race was self-reported and categorized according to the Brazilian National Census as White, Mixed/Brown, Black, Asian, or Indigenous.43

Hypertension was defined as a systolic blood pressure ≥140 mm Hg, a diastolic blood pressure ≥90 mm Hg, or self-reported use of antihypertensive medication. Smoking status was classified as either smoking or non-smoking, with the latter category including former smoking. Diabetes was defined by fasting glucose ≥126 mg/dL, 2-hour post-load glucose ≥200 mg/dL, glycohemoglobin ≥6.5%, or self-reported physician diagnosis or use of anti-diabetic medications. The presence of prevalent myocardial infarction (MI) or heart failure (HF) was self-reported.
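The ELSA-Brasil hypertension and diabetes definitions above are simple threshold rules and can be written down as a short sketch. The function and argument names are hypothetical, and the sketch assumes that all measurements are present (missing-value handling is not described in the text).

```python
def has_hypertension(sbp_mmhg, dbp_mmhg, on_antihypertensive):
    """Hypertension: SBP >= 140 mm Hg, DBP >= 90 mm Hg, or medication use."""
    return bool(sbp_mmhg >= 140 or dbp_mmhg >= 90 or on_antihypertensive)


def has_diabetes(fasting_glucose_mgdl, post_load_glucose_mgdl, hba1c_percent,
                 self_reported_diagnosis, on_antidiabetic):
    """Diabetes: any one of the laboratory or self-report criteria is met."""
    return bool(
        fasting_glucose_mgdl >= 126
        or post_load_glucose_mgdl >= 200
        or hba1c_percent >= 6.5
        or self_reported_diagnosis
        or on_antidiabetic
    )


# A participant with normal systolic but elevated diastolic pressure.
example_htn = has_hypertension(138, 92, False)
# A participant who meets only the glycohemoglobin criterion.
example_dm = has_diabetes(100, 150, 6.5, False, False)
```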

Study Outcomes

The primary outcome was incident AF within five years. We considered atrial flutter to be equivalent to AF. For FHS, AF was adjudicated by an FHS cardiologist reviewing electrocardiograms from the FHS research center, as well as outside medical records, interim hospitalizations, and Holter monitor results. Incident AF cases were defined as those who had AF after the ECG was performed, whereas prevalent AF cases were defined as those who had AF before or at the time when the ECG was performed. Cardiovascular outcomes were incident MI, HF, stroke, and all-cause death. MI, HF, and stroke were confirmed after adjudication by the Framingham Endpoint Review Committee (a panel of 2–3 clinicians) with information from the study’s research examinations and outside medical records (hospital and clinic) as previously stated for the respective condition.39
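The distinction between prevalent and incident AF relative to the ECG date can be sketched as a small labeling function. This is an illustration only: the function name, the label strings, and the use of 365.25 days per year for the five-year window are assumptions, and censoring by death or loss to follow-up is deliberately ignored.

```python
from datetime import date

# Assumption: the five-year horizon is approximated as 5 * 365.25 days.
HORIZON_DAYS = 5 * 365.25


def label_ecg(ecg_date, af_date, horizon_days=HORIZON_DAYS):
    """Label one ECG by the timing of the first AF diagnosis (or None)."""
    if af_date is None:
        return "none"
    if af_date <= ecg_date:
        # AF before or at the time of the ECG counts as prevalent.
        return "prevalent"
    if (af_date - ecg_date).days <= horizon_days:
        return "incident"
    # AF diagnosed after the horizon is not an event within the window.
    return "none"


ecg_day = date(2010, 1, 1)
labels = [
    label_ecg(ecg_day, None),
    label_ecg(ecg_day, date(2009, 6, 1)),
    label_ecg(ecg_day, date(2012, 6, 1)),
    label_ecg(ecg_day, date(2016, 1, 2)),
]
```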

In the UK Biobank, AF, MI, HF, and stroke outcomes were compiled through self-reported diagnoses, hospital and primary care encounters, and death records through linkages to routinely available national health-related datasets. AF was defined as in a previously published set of definitions from these sources. Detailed field names and data codes used have been previously reported.16,28,44,45 Although direct validation is not possible in the UK Biobank, the AF definition has been previously assessed in an external dataset with a reported positive predictive value of 92%.28

In the ELSA-Brasil study, the included outcomes were adjudicated based on standardized protocols by a Review Committee, which conducted two independent initial reviews, followed by a third review by a senior cardiologist to resolve any disagreements.46 A diagnosis of AF was made based on reports of AF from hospitalizations, death certificates, or detection of AF on an ECG performed during the second follow-up visit up until 12/31/2018. MI, HF, and stroke were identified through hospitalizations reporting these events or through death certificates.

Statistical Analyses

Descriptive statistics were calculated using means and standard deviations for continuous variables, or frequency counts and percentages for categorical variables. The model discrimination performance was assessed by the area under the receiver operating characteristic curve (AUC), with a higher AUC representing a better discrimination performance. This ECG-AF AUC was compared to the AUC from CHARGE-AF and from a combined score using inputs from both cited scores (CHARGE-ECG). The 95% confidence interval of the AUC was estimated by the DeLong method.47 The predicted risk was further adjusted for the calibration slope based on the training dataset. The calibration performance was assessed by calculating the Integrated Calibration Index (ICI), defined as the weighted mean absolute difference between predicted and observed event rates across deciles of predicted risk. A lower ICI indicates better calibration. Additionally, we computed the Brier score to quantify overall prediction accuracy as the mean squared difference between observed outcomes and recalibrated predictions. The Integrated Discrimination Improvement (IDI) was used to assess the improvement of classification after adding ECG-AF to the CHARGE-AF risk prediction model.48 The correlation between ECG-AF and CHARGE-AF was evaluated by Pearson’s correlation coefficient. The association of ECG-AF with cardiovascular outcomes and all-cause mortality was assessed using Cox proportional hazards models, with follow-up times censored at the last follow-up time or death. The models were adjusted for age and sex. The proportional hazards assumption was assessed using Schoenfeld residuals. As AF is a well-established risk factor for various cardiovascular outcomes, in an additional sensitivity analysis, we defined interim AF as those AF cases that were diagnosed before the onset of the respective outcome. We then treated interim AF as a confounding factor when assessing the association between the ECG-AF score and these cardiovascular events.
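Three of the metrics named above can be illustrated with a minimal pure-Python sketch: the AUC computed as the proportion of case/non-case pairs ranked correctly (ties counted as one half), the Brier score, and a decile-based calibration index following the definition given in the text. The function names and toy data are hypothetical, the study itself used R, and the DeLong confidence interval is not implemented here.

```python
def auc(y_true, y_score):
    """AUC as the fraction of (case, non-case) pairs ranked correctly."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def brier_score(y_true, y_prob):
    """Mean squared difference between predicted risk and observed outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def binned_calibration_index(y_true, y_prob, n_bins=10):
    """Weighted mean |mean predicted risk - observed rate| over risk bins."""
    pairs = sorted(zip(y_prob, y_true))
    n = len(pairs)
    total = 0.0
    for b in range(n_bins):
        chunk = pairs[b * n // n_bins:(b + 1) * n // n_bins]
        if not chunk:
            continue  # bins can be empty when there are fewer samples than bins
        mean_pred = sum(p for p, _ in chunk) / len(chunk)
        obs_rate = sum(y for _, y in chunk) / len(chunk)
        total += len(chunk) / n * abs(mean_pred - obs_rate)
    return total


# Toy example with four hypothetical participants.
y_true = [0, 0, 1, 1]
y_prob = [0.1, 0.4, 0.35, 0.8]
toy_auc = auc(y_true, y_prob)
toy_brier = brier_score(y_true, y_prob)
toy_calibration = binned_calibration_index(y_true, y_prob, n_bins=2)
```

In the toy data, three of the four case/non-case pairs are ordered correctly, so the AUC is 0.75. The pairwise AUC loop is quadratic in sample size and is meant for illustration rather than for cohorts of this size.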

In the secondary analyses, we examined the association of ECG-AF with 5-year AF risk stratified by sex. We further tested for effect modification by sex by including interaction terms in the Cox models. As sensitivity analyses, we assessed the effect of the model’s fine-tuning by evaluating the discrimination of the original CODE model in the three cohorts, and we also restricted the FHS test set sample to unique ECG entries from each participant. Statistical significance was defined as a two-sided P<0.05. All the statistical analyses were performed using R software version 4.0.3 (https://www.r-project.org/).

Ethical Considerations

The CODE Study was approved by the Research Ethics Committee of the Universidade Federal de Minas Gerais, protocol 49368496317.7.0000.5149. The FHS protocol was approved by the Boston University Medical Center Institutional Review Board. The UK Biobank was approved by the UK Biobank Research Ethics Committee (reference number 11/NW/0382). ELSA-Brasil was approved by the National Committee for Research Ethics (CONEP 976/2006) of the Ministry of Health and by the Research Ethics Committees of the participating institutions. All participants from the three cohorts provided written informed consent. The DNN-based ECG-AF model is publicly available at https://github.com/mygithth27/af-risk-prediction-by-ecg-dnn. We followed the STROBE guidelines for reporting our study.

Results

Training and Validation Samples

The current study included 10,097 participants from the FHS with 28,151 valid digital ECGs (mean baseline age 53±12 years, 54.9% were women, and 91.7% were White). The median number of ECGs per individual was 3 (interquartile range 2–4). The samples were further divided into the training set (6,036 participants with 16,876 ECGs) and the test set (4,061 participants with 11,275 ECGs). The training set was used to refine the ECG-AF model previously developed based on the CODE study, whereas the test set was used to validate the performance of the refined ECG-AF model. We further validated the refined model in 49,280 participants from the UK Biobank (mean age 64±8 years, 47.9% were women, 84.1% were White), and 12,284 participants from ELSA-Brasil (mean age 53±8 years, 54.7% were women, 51.8% were White). The baseline clinical characteristics of the study participants are shown in Table 1. In UK Biobank, participants were older and had higher blood pressures compared to FHS. In ELSA-Brasil, participants had a similar mean age to FHS participants at baseline, but were more racially diverse and exhibited a higher prevalence of diabetes compared to the other cohorts. The number of AF cases and referents in each cohort, stratified by sex, is shown in Supplemental Table 1.

Table 1.

Clinical characteristics of study samples at the baseline.

Characteristics* | FHS training set (n=6,036) | FHS test set (n=4,061) | UK Biobank (n=49,280) | ELSA-Brasil (n=12,284)
Age, years | 53 ± 12 | 53 ± 12 | 64 ± 8 | 53 ± 8
Women, n (%) | 3,316 (54.9) | 2,225 (54.8) | 23,617 (47.9) | 6,718 (54.7)
Race | | | |
 White, n (%) | 5,531 (91.6) | 3,727 (91.8) | 41,446 (84.1) | 6,359 (51.8)
 Black, n (%) | 158 (2.6) | 96 (2.4) | 144 (0.3) | 5,482 (44.6)
 Asian, n (%) | 97 (1.6) | 52 (1.3) | 704 (1.4) | 308 (2.5)
 Other, n (%) | 250 (4.1) | 186 (4.6) | 6,837 (13.9) | 135 (1.1)
Hispanic, n (%) | 210 (3.5) | 148 (3.6) | - | -
Current smoking, n (%) | 1,024 (17.0) | 715 (17.6) | 1,712 (3.5) | 1,653 (13.5)
Height, cm | 168 ± 10 | 167 ± 10 | 169 ± 9 | 165 ± 9
Weight, kg | 77 ± 18 | 77 ± 18 | 76 ± 15 | 74 ± 15
BMI, kg/m2 | 27.3 ± 5.3 | 27.3 ± 5.3 | 26.6 ± 4.5 | 27.1 ± 4.7
SBP, mmHg | 126 ± 20 | 126 ± 20 | 141 ± 19 | 122 ± 18
DBP, mmHg | 77 ± 10 | 77 ± 10 | 79 ± 10 | 76 ± 11
Hypertension treatment, n (%) | 1,259 (20.9) | 853 (21.0) | 16,344 (33.2) | 3,826 (31.1)
Diabetes mellitus, n (%) | 196 (3.2) | 146 (3.6) | 3,080 (6.3) | 2,098 (17.1)
Prevalent HF, n (%) | 31 (0.5) | 23 (0.6) | 347 (0.7) | 210 (1.7)
Prevalent MI, n (%) | 169 (2.8) | 116 (2.9) | 325 (0.7) | 243 (2.0)
Prevalent stroke, n (%) | 106 (1.8) | 70 (1.7) | 497 (1.0) | 181 (1.5)

AF: atrial fibrillation; BMI: body mass index; DBP: diastolic blood pressure; FHS: Framingham Heart Study; HDL: high-density lipoprotein; HF: heart failure; MI: myocardial infarction; SBP: systolic blood pressure; UK: United Kingdom.

* Values are n (%) for dichotomous variables and mean ± standard deviation for continuous variables. Prevalent AF cases at the baseline were excluded from the current study.

Incident AF rates within 5 years were 4.6 per 1,000 person-years in FHS, 3.9 in UK Biobank, and 1.5 in ELSA-Brasil. A modest correlation was observed between the ECG-AF and the CHARGE-AF score (correlation coefficient=0.41, 0.27, and 0.14 in the FHS, UK Biobank, and ELSA-Brasil, respectively, all with P<2.2×10^−16). We also examined the variability of prediction across the ten-fold cross-validation. The result was consistent across different folds, with a standard deviation of the AUC of 0.026 in the FHS test set and 0.019 in the UK Biobank.

Discrimination of ECG-AF Prediction Model

We assessed the discrimination of the ECG-AF model by the area under the curve (AUC). In the FHS test set with 405 AF cases, the model demonstrated moderate AF discrimination (AUC=0.82, 95% CI 0.80–0.84), which was comparable to the CHARGE-AF score (AUC=0.83, 95% CI 0.81–0.85) (Table 2). We further derived an integrative score (defined as CHARGE-ECG) by combining the ECG-AF score and the CHARGE-AF score, which showed a statistically significant improvement compared to each model separately (AUC=0.85, 95% CI 0.83–0.87, P<0.001) (Figure 1). In the sex-stratified analysis (Supplemental Figure 2), there was no difference according to sex (P=0.39).

Table 2.

Model performance for 5-year AF risk

Model | FHS AUC (95% CI) | FHS Sensitivity/Specificity | UK Biobank AUC (95% CI) | UK Biobank Sensitivity/Specificity | ELSA-Brasil AUC (95% CI) | ELSA-Brasil Sensitivity/Specificity
CHARGE-AF | 0.83 (0.81–0.85) | 0.76/0.77 | 0.78 (0.76–0.79) | 0.73/0.69 | 0.79 (0.74–0.85) | 0.71/0.77
ECG-AF | 0.82 (0.80–0.84) | 0.72/0.78 | 0.73 (0.71–0.75) | 0.63/0.71 | 0.72 (0.66–0.78) | 0.67/0.66
CHARGE-ECG | 0.85 (0.83–0.87) | 0.78/0.76 | 0.81 (0.79–0.82) | 0.74/0.72 | 0.81 (0.76–0.86) | 0.78/0.71

AF: atrial fibrillation; AUC: area under the curve; CHARGE-AF: Cohorts for Heart and Aging Research in Genomic Epidemiology consortium risk prediction score for AF; CHARGE-ECG: the combination of CHARGE-AF score and ECG-AF score; ECG-AF: Deep-learning model from electrocardiogram to predict AF; FHS: Framingham Heart Study; UK: United Kingdom.

Figure 1.

Receiver operating characteristic curves for incident AF prediction within five years in the Framingham Heart Study. CHARGE-AF: Model solely built from clinical factors in the CHARGE-AF model. ECG-AF: Model solely built from ECG waveform data. CHARGE-ECG: Model built from the combination of clinical factors and ECG waveform data. AUC: Area under the curve.

Similarly, in the UK Biobank (Figure 2) with 719 AF cases and ELSA-Brasil (Figure 3) with 76 cases, the ECG-AF model also demonstrated moderate AF discrimination (AUC=0.73, 95% CI 0.71–0.75 and AUC=0.72, 95% CI 0.66–0.78, respectively), which was lower than that of the CHARGE-AF score (AUC=0.78, 95% CI 0.76–0.79 and AUC=0.79, 95% CI 0.74–0.85, respectively). The integrative CHARGE-ECG score also showed improved performance in both cohorts (AUC=0.81, 95% CI 0.79–0.82 and AUC=0.81, 95% CI 0.76–0.86, respectively). Supplemental Figures 3 and 4 show the sex-stratified analysis for the UK Biobank and ELSA-Brasil, respectively.

Figure 2.

Receiver operating characteristic curves for incident AF prediction within five years in the UK Biobank. CHARGE-AF: Model solely built from clinical factors in the CHARGE-AF model. ECG-AF: Model solely built from ECG waveform data. CHARGE-ECG: Model built from the combination of clinical factors and ECG waveform data. AUC: Area under the curve.

Figure 3.

Receiver operating characteristic curves for incident AF prediction within five years in ELSA-Brasil. CHARGE-AF: Model solely built from clinical factors in the CHARGE-AF model. ECG-AF: Model solely built from ECG waveform data. CHARGE-ECG: Model built from the combination of clinical factors and ECG waveform data. AUC: Area under the curve.

In the sensitivity analysis, we assessed the model generalizability by applying the initial model developed solely based on the CODE dataset without the FHS refinement. As shown in Supplemental Table 2, the original CODE ECG-AF model reached a slightly lower AUC than the refined model in the FHS and UK Biobank, but not in ELSA-Brasil.

Reclassification and Calibration

We also assessed the reclassification of 5-year AF prediction by adding ECG-AF to the CHARGE-AF model. As shown in Supplemental Table 3, the CHARGE-ECG model significantly improved discrimination over the CHARGE-AF model for predicting 5-year AF risk, with an IDI of 0.049 (95% CI: 0.029–0.071, p<0.001) in the FHS test set. This improvement was primarily due to better differentiation among individuals who developed AF (D1 = 0.046), with a small trade-off in risk prediction among individuals who did not experience AF (D0 = −0.0035). The total risk separation also increased substantially, with a difference in mean predicted risk between cases and referents of 0.338, indicating that the CHARGE-ECG model provides more informative stratification of risk. A similar pattern was also observed in UK Biobank (IDI=0.038, 95% CI: 0.023–0.071, p<0.001) and ELSA-Brasil (IDI=0.023, 95% CI: 0.006–0.065).
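The relationship between the IDI and its two components can be made concrete with a short sketch. It assumes the standard decomposition in which D1 is the mean change in predicted risk among cases, D0 is the mean change among non-cases, and IDI = D1 − D0. This reading is consistent with the values reported above, but the function name and the toy data are hypothetical.

```python
def mean(values):
    values = list(values)
    return sum(values) / len(values)


def idi(y_true, p_old, p_new):
    """Return (IDI, D1, D0) for a new model compared with an old one.

    D1: mean change in predicted risk among cases (y == 1).
    D0: mean change in predicted risk among non-cases (y == 0).
    """
    d1 = mean(n - o for y, o, n in zip(y_true, p_old, p_new) if y == 1)
    d0 = mean(n - o for y, o, n in zip(y_true, p_old, p_new) if y == 0)
    return d1 - d0, d1, d0


# Toy example: the new model raises risk for cases and lowers it for non-cases.
toy_idi, toy_d1, toy_d0 = idi(
    y_true=[1, 1, 0, 0],
    p_old=[0.3, 0.5, 0.2, 0.4],
    p_new=[0.4, 0.6, 0.1, 0.4],
)

# Arithmetic check against the FHS components quoted in the text.
fhs_idi_from_components = 0.046 - (-0.0035)
```

With the reported FHS components, D1 − D0 gives 0.0495, which agrees with the reported IDI of 0.049 up to rounding of the components.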

The prediction models also demonstrated good calibration, with Integrated Calibration Index (ICI) values ranging from 0.007 in the UK Biobank to 0.032 in ELSA. Brier scores were also low, ranging from 0.011 in ELSA to 0.037 in the FHS test set (Supplemental Table 4). A calibration plot comparing predicted versus expected probabilities of incident AF for the three cohorts is shown in Supplemental Figure 5.

Association of ECG-AF with Cardiovascular Outcomes

As shown in Table 3, the ECG-AF was strongly associated with incident AF in FHS. One standard deviation increase of the ECG-AF score was associated with a 40% increase in AF risk (HR=1.40, 95% CI 1.37–1.42). The association was stronger in women (HR=1.44, 95% CI 1.41–1.48) than in men (HR=1.35, 95% CI 1.31–1.39), with an interaction term P<0.001. ECG-AF was also associated with HF (HR=1.26, 95% CI 1.22–1.29), MI (HR=1.06, 95% CI 1.01–1.11), stroke (HR=1.13, 95% CI 1.09–1.17), and all-cause mortality (HR=1.07, 95% CI 1.05–1.09). The associations remained significant after additionally adjusting for interim AF (Supplemental Table 5). As each participant from the FHS might contribute multiple ECGs to the analysis, we performed an additional analysis restricted to one ECG per participant. As reported in Supplemental Table 6, the association of ECG-AF score with cardiovascular outcomes and all-cause mortality remained significant.

Table 3.

Association of ECG-AF score with cardiovascular outcomes (incident atrial fibrillation, heart failure, myocardial infarction, and stroke) and all-cause mortality.

Cohort | Outcome | All: HR | All: 95% CI | All: P | Women only: HR | Women only: 95% CI | Women only: P | Men only: HR | Men only: 95% CI | Men only: P
FHS | AF | 1.40 | 1.37–1.42 | <0.001 | 1.44 | 1.41–1.48 | <0.001 | 1.35 | 1.31–1.39 | <0.001
FHS | HF | 1.26 | 1.22–1.29 | <0.001 | 1.27 | 1.22–1.31 | <0.001 | 1.25 | 1.20–1.30 | <0.001
FHS | MI | 1.06 | 1.01–1.11 | 0.015 | 1.06 | 0.99–1.14 | 0.10 | 1.06 | 1.00–1.14 | 0.058
FHS | Stroke | 1.13 | 1.09–1.17 | <0.001 | 1.17 | 1.12–1.23 | <0.001 | 1.07 | 1.01–1.14 | 0.002
FHS | Mortality | 1.07 | 1.05–1.09 | <0.001 | 1.09 | 1.06–1.12 | <0.001 | 1.05 | 1.02–1.08 | <0.001
UK Biobank | AF | 1.41 | 1.38–1.45 | <0.001 | 1.49 | 1.44–1.55 | <0.001 | 1.35 | 1.30–1.40 | <0.001
UK Biobank | HF | 1.38 | 1.33–1.44 | <0.001 | 1.40 | 1.32–1.48 | <0.001 | 1.35 | 1.27–1.44 | <0.001
UK Biobank | MI | 1.09 | 1.00–1.19 | 0.039 | 1.09 | 0.99–1.20 | 0.074 | 1.14 | 0.95–1.36 | 0.17
UK Biobank | Stroke | 1.14 | 1.05–1.24 | 0.0019 | 1.17 | 1.06–1.29 | 0.0013 | 1.06 | 0.89–1.28 | 0.51
UK Biobank | Mortality | 1.11 | 1.05–1.18 | <0.001 | 1.11 | 1.04–1.19 | <0.001 | 1.10 | 0.97–1.25 | 0.046
ELSA-Brasil | AF | 1.51 | 1.39–1.65 | <0.001 | 1.74 | 1.50–2.01 | <0.001 | 1.42 | 1.26–1.60 | <0.001
ELSA-Brasil | HF | 1.53 | 1.38–1.69 | <0.001 | 1.77 | 1.49–2.10 | <0.001 | 1.43 | 1.23–1.66 | <0.001
ELSA-Brasil | MI | 1.27 | 1.17–1.39 | <0.001 | 1.34 | 1.12–1.61 | 0.0016 | 1.26 | 1.14–1.39 | <0.001
ELSA-Brasil | Stroke | 1.34 | 1.18–1.53 | <0.001 | 1.41 | 1.14–1.74 | 0.0017 | 1.31 | 1.10–1.55 | 0.002
ELSA-Brasil | Mortality | 1.20 | 1.14–1.26 | <0.001 | 1.17 | 1.08–1.27 | <0.001 | 1.21 | 1.14–1.29 | <0.001
* The models were adjusted for age and sex. The sex-stratified models were only adjusted for age.

ECG: electrocardiogram; HR: Hazard ratio, expressed as one standard deviation (SD) increase of ECG-AF score; CI: confidence interval. FHS: Framingham Heart Study; UK: United Kingdom; AF: atrial fibrillation; HF: heart failure; MI: myocardial infarction

Similarly, in the UK Biobank, one standard deviation increase of the ECG-AF score was associated with a 41% increase in AF risk (95% CI 1.38–1.45), 38% increase in the risk of HF (95% CI 1.33–1.44), 9% increase in the risk of MI (95% CI 1.00–1.19), 14% increase in the risk of stroke (95% CI 1.05–1.24), and 11% increase in all-cause mortality (95% CI 1.05–1.18). In ELSA-Brasil, one standard deviation increase of the ECG-AF score was associated with a 51% increase in AF risk (95% CI 1.39–1.65), 53% increase in the risk of HF (95% CI 1.38–1.69), 27% increase in the risk of MI (95% CI 1.17–1.39), 34% increase in the risk of stroke (95% CI 1.18–1.53), and 20% increase in all-cause mortality (95% CI 1.14–1.26).
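The percentages above are restatements of the hazard ratios per one standard deviation in Table 3, while the confidence intervals are given on the hazard ratio scale. The conversion can be sketched as follows. The helper names are hypothetical, and the extrapolation to several standard deviations assumes the log-linear effect implied by a Cox model with the score entered as a continuous term.

```python
def percent_increase(hazard_ratio):
    """Convert a hazard ratio into a percent change in hazard."""
    return (hazard_ratio - 1.0) * 100.0


def hazard_ratio_for_k_sd(hazard_ratio_per_sd, k):
    """Hazard ratio for a k-SD increase, assuming a log-linear effect."""
    return hazard_ratio_per_sd ** k


# UK Biobank AF: HR 1.41 per SD corresponds to a 41% higher hazard.
uk_af_percent = round(percent_increase(1.41))
# Under the log-linear assumption, a 2-SD increase gives 1.41 squared.
uk_af_two_sd = hazard_ratio_for_k_sd(1.41, 2)
```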

Discussion

In this study, we refined a deep learning model to predict AF using 60% of ECGs from the FHS and validated the model in the remaining FHS samples, as well as externally in samples from the UK Biobank and ELSA-Brasil. ECG-AF showed moderate correlation with, and comparable performance to, the CHARGE-AF clinical risk score. Moreover, combining ECG-AF with CHARGE-AF in the CHARGE-ECG model yielded better discrimination and calibration for AF risk than either model alone and enhanced reclassification of AF. Beyond AF prediction, ECG-AF was also associated with other adverse outcomes, including HF, MI, stroke, and all-cause mortality. Taken together, our findings demonstrate that an ECG-based deep learning model using a single input can achieve risk prediction performance comparable to that of a multivariable clinical model. The improved predictive accuracy of the CHARGE-ECG model suggests that the two models capture partially distinct and complementary information. Importantly, we also report, for the first time, AF incidence rates in a Brazilian population, contributing valuable data from an underrepresented cohort.
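Discrimination in this study is summarized by the AUC, which can be read as the probability that a randomly chosen participant who developed AF received a higher risk score than a randomly chosen participant who did not. The sketch below computes that concordance probability directly from pairs. It is an illustration only: the scores are invented toy values, the function name `auc` is hypothetical, and the study's own analysis may have used different software and handled censoring differently.

```python
from itertools import product


def auc(case_scores, noncase_scores):
    """Probability that a random case outranks a random non-case (ties count 1/2)."""
    wins = 0.0
    for case, noncase in product(case_scores, noncase_scores):
        if case > noncase:
            wins += 1.0
        elif case == noncase:
            wins += 0.5
    return wins / (len(case_scores) * len(noncase_scores))


# Toy risk scores (hypothetical, not study data).
cases = [0.62, 0.35, 0.80]                 # participants who developed AF
noncases = [0.10, 0.40, 0.22, 0.35, 0.05]  # participants who did not

print(f"AUC = {auc(cases, noncases):.2f}")
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, which is the scale on which the reported values of 0.82 to 0.85 in the FHS should be read.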

Given the global rise in AF prevalence and the widespread availability and low cost of ECGs, our findings carry important clinical implications. We demonstrate that deep learning models trained on ECG data can generalize well across racially and geographically diverse populations, remain robust for predicting both AF and its associated outcomes, and offer an efficient alternative or complement to traditional clinical risk scores.49 These models may enable tailored screening strategies by identifying individuals at high risk of developing AF in the near future, thus reducing the number needed to screen and improving resource allocation. Early identification of individuals at risk for AF could facilitate timely interventions such as anticoagulation,50,51 which has been shown to reduce stroke risk by 64% and mortality by 26%. Better AF risk stratification may also allow for upstream prevention through targeted risk factor modification.

While AI-ECG algorithms are not included in current guidelines, the ACC/AHA/ACCP/HRS guidelines classify evidence of structural or electrical findings predisposing to AF as Stage 2 (Pre-AF), which merits treating modifiable risk factors and considering heightened surveillance. If AI-ECG is further validated, future AF guidelines may consider AI-ECG algorithms as a criterion for Stage 2 Pre-AF.52 Risk factor modification strategies, which remain to be tested in further studies, may include blood pressure control, weight management, and avoidance of alcohol and smoking.10,52

Our findings align with previous studies demonstrating the utility of DNNs for AF prediction from 12-lead ECGs. Attia et al. demonstrated the ability of a DNN model to identify the electrocardiographic signature of paroxysmal AF from 12-lead ECGs showing sinus rhythm in a short time window.53 This was reproduced in the CODE cohort for a longer time window (more than 5 years).19,54 Raghunath et al. also developed such models and provided evidence that these tools can help to identify patients at risk of AF-related stroke.55 Jabbour et al. developed and tested an AI-ECG algorithm in a tertiary cardiac center that surpassed clinical and polygenic risk scores for AF risk prediction.18

Emerging evidence also provides insights into how DNNs may detect subtle ECG features associated with future AF risk. Saliency maps suggest the P wave and surrounding regions are important for the prediction of AF risk using AI-ECG.53 Moreover, median waveform analysis suggests that a high estimated AF risk is associated with a longer P wave duration, slightly wider QRS, and a flatter ST segment.54 Medical review of exams suggests that many tracings marked by the incident-AF model as high-risk have hypertrophy, bundle branch block, or premature beats.54 Still, deep learning performs better in predicting future AF than classification models that use traditional engineered features from the ECG, indicating that deep learning models might be examining more than simple readings of wavelengths and amplitudes.19 Consistent with this hypothesis, the addition of traditional features from the ECG, such as left ventricular hypertrophy and PR interval, provided no gain in the CHARGE-AF predictive ability.15 More recently, other studies went one step further and reported that artificial intelligence provides a meaningful improvement in predictive accuracy for AF beyond clinical risk scores (CHARGE-AF), corroborating our findings.16,20

Despite the high lifetime risk of AF, the 5-year incidence rate remains relatively low in the general population. The number of individuals without AF far exceeds those who develop the condition, resulting in a marked class imbalance in our datasets. This imbalance poses challenges for risk prediction models, as it can lead to biased estimates and reduced sensitivity in identifying true cases.56 Moreover, given the low absolute incidence, a large number of individuals would need to be screened to identify a relatively small number of AF cases. While early identification remains a priority for prevention and intervention, the need to screen many low-risk individuals could increase the overall clinical and logistical burden. These findings highlight the importance of optimizing predictive performance not only for discrimination but also for calibration and clinical utility, particularly in populations with low event rates.
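The effect of a low event rate on screening yield can be made concrete with Bayes' rule. The sketch below computes the positive predictive value (PPV) of a risk threshold and the number of flagged individuals per incident case (1/PPV), and compares it with unselected screening (1/incidence). All numeric inputs (2% 5-year incidence, 70% sensitivity, 80% specificity) are hypothetical values chosen for illustration; they are not results from this study, and the function names are likewise assumptions.

```python
def ppv(sensitivity: float, specificity: float, incidence: float) -> float:
    """Positive predictive value of a risk threshold, by Bayes' rule."""
    true_pos = sensitivity * incidence
    false_pos = (1.0 - specificity) * (1.0 - incidence)
    return true_pos / (true_pos + false_pos)


def flagged_per_case(sensitivity: float, specificity: float, incidence: float) -> float:
    """Number of flagged (high-risk) individuals per incident case, i.e. 1 / PPV."""
    return 1.0 / ppv(sensitivity, specificity, incidence)


# Hypothetical inputs, not taken from the study.
INCIDENCE = 0.02     # assumed 5-year AF incidence
SENSITIVITY = 0.70   # assumed sensitivity at the chosen threshold
SPECIFICITY = 0.80   # assumed specificity at the chosen threshold

print(f"Unselected screening: {1.0 / INCIDENCE:.0f} people per incident case")
print(f"PPV among flagged:    {ppv(SENSITIVITY, SPECIFICITY, INCIDENCE):.3f}")
print(f"Targeted screening:   "
      f"{flagged_per_case(SENSITIVITY, SPECIFICITY, INCIDENCE):.0f} flagged people per incident case")
```

Under these assumed inputs, most flagged individuals still do not develop AF, which is why calibration and clinical utility matter alongside discrimination when event rates are low.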

Several limitations should be acknowledged. First, the FHS and UK Biobank cohorts are predominantly composed of individuals of European ancestry. Although ELSA-Brasil enhanced the racial diversity of our sample, the generalizability of ECG-AF to other underrepresented populations remains to be fully established. Nevertheless, the original ECG-AF model was trained in the racially and socioeconomically diverse Brazilian CODE cohort, and our study demonstrates successful model transferability across countries and cohorts. Second, although the CHARGE-ECG model improved predictive performance, the incremental gain over CHARGE-AF alone was modest. Therefore, ECG-AF may be particularly useful for opportunistic risk stratification when clinical variables are unavailable. Third, incidence rates varied across cohorts due to differences in outcome ascertainment. For example, ELSA-Brasil relied on hospitalization records and death certificates, whereas the UK Biobank and FHS also incorporated outpatient data. A previous study showed that 22% of incident AF cases in the UK Biobank were diagnosed only in primary care encounters.45 In FHS, all potential AF events were adjudicated and included Holter monitoring results. Fourth, because this is an observational study, we cannot infer causality or rule out residual confounding.

Despite these limitations, our study has notable strengths. We leveraged a large, community-based cohort for model refinement and validated findings in two diverse, well-characterized external cohorts. The ECG-AF model demonstrated strong and consistent predictive performance, both independently and in combination with an established clinical risk score. Moreover, ECG-AF was associated with multiple cardiovascular outcomes, underscoring its broader potential for clinical risk assessment.

In conclusion, our study demonstrates that a deep learning model applied to a single 12-lead ECG can predict AF with performance comparable to the multivariable CHARGE-AF clinical risk score. This suggests that risk stratification for AF may be feasible using only the ECG, even in the absence of full clinical data, facilitating its application in the context of opportunistic screening. Furthermore, when combined with the CHARGE-AF score, the AI-ECG model provides incremental predictive value, enhancing overall risk assessment. Given the growing burden of AF and the accessibility of ECG technology, ECG-AF models represent a promising tool for improving AF prevention and early detection strategies. Prospective validation and real-world clinical implementation studies will be essential to define their role in future practice.57

Supplementary Material

013734_-_Supplemental_Material

Tables S1–S6

Figures S1–S5

What is known

  • Atrial fibrillation (AF) risk is traditionally predicted with multivariable clinical risk scores. Artificial intelligence applied to the electrocardiogram (AI-ECG) through deep neural network (DNN) models can predict cardiovascular outcomes, including AF; however, evaluations in diverse cohort studies are still scarce.

  • Early identification of individuals at higher risk of AF is the first step to implement tailored prevention strategies and may also facilitate diagnosis and prevent complications by enhancing screening of high-risk individuals.

What the study adds

  • We refined a DNN model to predict AF from the ECG (ECG-AF) and demonstrated moderate discrimination (area under the curve [AUC] = 0.82) in the longstanding community-based Framingham Heart Study (FHS) and in other samples from diverse settings: the UK Biobank and ELSA-Brasil.

  • The ECG-AF model’s performance was comparable to the traditional CHARGE-AF multivariable clinical risk score across cohorts using only the ECG data. Combining ECG-AF with CHARGE-AF resulted in improved prediction, calibration and reclassification.

  • Considering that ECGs are low-cost and widely available, including through telemedicine in areas with limited health access, ECG-AF could be used alone or in combination with clinical risk scores for AF risk prediction, potentially enabling tailored prevention strategies and more efficient screening.

Sources of Funding:

The Framingham Heart Study acknowledges the support of contracts NO1-HC-25195, HHSN268201500001I, 75N92019D00031 and 75N92025D00012 from the National Heart, Lung and Blood Institute. The ELSA-Brasil cohort study is funded by the Brazilian Ministry of Health (Science and Technology Department) and the Brazilian Ministry of Science, Technology and Innovation (FINEP and CNPq), grant numbers: 01 060010.00 and 01.10.0643.03 (RS); 01 06 0212.00 and 01.10.0742-00 (BA); 01 06 0300.00 and 01.12.0284.00 (ES); 01 06 0278.00 and 01 10 0746 00 (MG); 01 06 0115.00 and 01.10.0773-00 (SP); and 01 06 0071.00 and 01.11.0093.01 (RJ). LCCB is supported in part by CNPq (307329/2022-4) and FAPEMIG (RED 00192-23), and has received the “Women for Science” award from the Brazilian Academy of Sciences, UNESCO, and L’Oréal Brasil. AHR is funded by the Kjell och Märta Beijer Foundation. EJB was supported by NIH grants 2R01 HL092577, 2U54HL120163, 1R01AG066010, and R01AG028321, and by American Heart Association Grant 18SFRN34110082. ALPR (465518/2014-1, 310790/2021-2), SMB (303656/2021-2), and BBD (304467/2015-4 and 307003/2020-5) are supported in part by CNPq, and ALPR additionally by FAPEMIG (RED 00192-23). HL is supported by NIH grants (U01AG068221, R01AG083735, and R21HL175584) and American Heart Association Grant 20SFRN35360180.

Nonstandard Abbreviations and Acronyms

AI: Artificial intelligence
AI-ECG: Artificial intelligence applied to the electrocardiogram
AF: Atrial fibrillation
BMI: Body mass index
BP: Blood pressure
CHARGE: Cohorts for Heart and Aging Research in Genomic Epidemiology consortium
CODE: Clinical Outcomes in Digital Electrocardiography Study
DNN: Deep neural network
ECG: Electrocardiogram
ECG-AF: Electrocardiogram-derived deep-learning prediction of atrial fibrillation
ELSA-Brasil: Estudo Longitudinal da Saúde do Adulto
FHS: Framingham Heart Study
HF: Heart failure
MI: Myocardial infarction

Footnotes

Disclosures: None

References:

  • 1. Global Health Data Exchange [Internet]. [cited 2025 Apr 18]. Available from: http://ghdx.healthdata.org/
  • 2. Wolf PA, Dawber TR, Thomas E, Kannel WB. Epidemiologic assessment of chronic atrial fibrillation and risk of stroke: The Framingham Study. Neurology. 2011;77:1579–1579.
  • 3. Wolf PA, Abbott RD, Kannel WB. Atrial fibrillation: a major contributor to stroke in the elderly. The Framingham Study. Arch Intern Med. 1987;147:1561–4.
  • 4. Elias MF, Sullivan LM, Elias PK, Vasan RS, D’Agostino RB Sr, Seshadri S, Au R, Wolf PA, Benjamin EJ. Atrial fibrillation is associated with lower cognitive performance in the Framingham offspring men. J Stroke Cerebrovasc Dis. 2006;15:214–22.
  • 5. Ott A, Breteler MM, de Bruyne MC, van Harskamp F, Grobbee DE, Hofman A. Atrial fibrillation and dementia in a population-based study. The Rotterdam Study. Stroke. 1997;28:316–21.
  • 6. Soliman EZ, Safford MM, Muntner P, Khodneva Y, Dawood FZ, Zakai NA, Thacker EL, Judd S, Howard VJ, et al. Atrial fibrillation and the risk of myocardial infarction. JAMA Intern Med. 2014;174:107–14.
  • 7. Wang TJ, Larson MG, Levy D, Vasan RS, Leip EP, Wolf PA, D’Agostino RB, Murabito JM, Kannel WB, Benjamin EJ. Temporal relations of atrial fibrillation and congestive heart failure and their joint influence on mortality: the Framingham Heart Study. Circulation. 2003;107(23):2920–5.
  • 8. Benjamin EJ, Wolf PA, D’Agostino RB, Silbershatz H, Kannel WB, Levy D. Impact of atrial fibrillation on the risk of death: the Framingham Heart Study. Circulation. 1998;98(10):946–52.
  • 9. Vinter N, Cordsen P, Johnsen SP, Staerk L, Benjamin EJ, Frost L, Trinquart L. Temporal trends in lifetime risks of atrial fibrillation and its complications between 2000 and 2022: Danish, nationwide, population based cohort study. BMJ. 2024;385:e077209.
  • 10. Martin SS, Aday AW, Almarzooq ZI, Anderson CAM, Arora P, Avery CL, Baker-Smith MC, Gibbs BB, Beaton AZ, Boehme AK, et al. 2024 Heart Disease and stroke statistics: A report of US and global data from the American Heart Association. Circulation. 2024;149(8):e347–913.
  • 11. Kornej J, Börschel CS, Benjamin EJ, Schnabel RB. Epidemiology of atrial fibrillation in the 21st century: Novel methods and new insights. Circ Res. 2020;127(1):4–20.
  • 12. Nadarajah R, Wu J, Frangi AF, Hogg D, Cowan C, Gale CP. What is next for screening for undiagnosed atrial fibrillation? Artificial intelligence may hold the key. Eur Heart J Qual Care Clin Outcomes. 2022;8(4):391–7.
  • 13. Sivanandarajah P, Wu H, Bajaj N, Khan S, Ng FS. Is machine learning the future for atrial fibrillation screening? Cardiovasc Digit Health J. 2022;3:136–45.
  • 14. O’Neal WT, Alonso A. The appropriate use of risk scores in the prediction of atrial fibrillation. J Thorac Dis. 2016;8:E1391–4.
  • 15. Alonso A, Krijthe BP, Aspelund T, Stepas KA, Pencina MJ, Moser CB, Sinner MF, Sotoodehnia N, Fontes JD, Janssens ACJW, et al. Simple risk model predicts incidence of atrial fibrillation in a racially and geographically diverse population: the CHARGE-AF consortium. J Am Heart Assoc. 2013;2:e000102.
  • 16. Khurshid S, Friedman S, Reeder C, Di Achille P, Diamant N, Singh P, Harrington LX, Wang X, Al-Alusi MA, Sarma G, et al. ECG-based deep learning and clinical risk factors to predict Atrial Fibrillation. Circulation. 2022;145(2):122–33.
  • 17. Yuan N, Duffy G, Dhruva SS, Oesterle A, Pellegrini CN, Theurer J, Vali M, Heidenreich PA, Keyhani S, Ouyang D. Deep learning of electrocardiograms in sinus rhythm from US veterans to predict atrial fibrillation. JAMA Cardiol. 2023;8:1131–9.
  • 18. Jabbour G, Nolin-Lapalme A, Tastet O, Corbin D, Jordà P, Sowa A, Delfrate J, Busseuil D, Hussin JG, Dubé MP, et al. Prediction of incident atrial fibrillation using deep learning, clinical models, and polygenic scores. Eur Heart J. 2024;45:4920–34.
  • 19. Biton S, Gendelman S, Ribeiro AH, Miana G, Moreira C, Ribeiro ALP, Behar JA. Atrial fibrillation risk prediction from the 12-lead electrocardiogram using digital biomarkers and deep representation learning. Eur Heart J Digit Health. 2021;2:576–85.
  • 20. Christopoulos G, Graff-Radford J, Lopez CL, Yao X, Attia ZI, Rabinstein AA, Petersen RC, Knopman DS, Mielke MM, Kremers W, et al. Artificial Intelligence-electrocardiography to predict incident atrial fibrillation: A population-based study. Circ Arrhythm Electrophysiol. 2020;13:e009355.
  • 21. Habineza T, Ribeiro AH, Gedon D, Behar JA, Ribeiro ALP, Schön TB. End-to-end risk prediction of atrial fibrillation from the 12-Lead ECG by deep neural networks. J Electrocardiol. 2023;81:193–200.
  • 22. Andersson C, Johnson AD, Benjamin EJ, Levy D, Vasan RS. 70-year legacy of the Framingham Heart Study. Nat Rev Cardiol. 2019;16:687–98.
  • 23. Vinter N, Huang Q, Fenger-Grøn M, Frost L, Benjamin EJ, Trinquart L. Trends in excess mortality associated with atrial fibrillation over 45 years (Framingham Heart Study): community based cohort study. BMJ. 2020;370:m2724.
  • 24. Ollier W, Sprosen T, Peakman T. UK Biobank: from concept to reality. Pharmacogenomics. 2005;6:639–46.
  • 25. Collins R. What makes UK Biobank special? Lancet. 2012;379(9822):1173–4.
  • 26. Sudlow C, Gallacher J, Allen N, Beral V, Burton P, Danesh J, Downey P, Elliott P, Green J, Landray M, et al. UK biobank: an open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLoS Med. 2015;12:e1001779.
  • 27. Aquino EML, Barreto SM, Bensenor IM, Carvalho MS, Chor D, Duncan BB, Lotufo PA, Mill JG, Molina MDC, Mota ELA, et al. Brazilian longitudinal study of adult health (ELSA-brasil): Objectives and design. Am J Epidemiol. 2012;175(4):315–24.
  • 28. Magnani JW, Newton-Cheh C, O’Donnell CJ, Levy D. Development and application of a longitudinal electrocardiogram repository: the Framingham Heart Study. J Electrocardiol. 2012;45:673–6.
  • 29. Khurshid S, Choi SH, Weng LC, Wang EY, Trinquart L, Benjamin EJ, Ellinor PT, Lubitz SA. Frequency of cardiac rhythm abnormalities in a half million adults. Circ Arrhythm Electrophysiol. 2018;11:e006273.
  • 30. Pinto MM Filho, Brant LCC, Padilha-da-Silva JL, Foppa M, Lotufo PA, Mill JG, Vasconcelo-Silva PR, Almeida MCC, Barreto SM, Ribeiro ALP. Electrocardiographic findings in Brazilian adults without heart disease: ELSA-Brasil. Arq Bras Cardiol. 2017;109:416–24.
  • 31.14.2. Fine-Tuning — Dive into Deep Learning 1.0.3 documentation [Internet]. [cited 2025 Apr 18]. Available from: https://d2l.ai/chapter_computer-vision/fine-tuning.html
  • 32. Raghunath S, Ulloa Cerna AE, Jing L, vanMaanen DP, Stough J, Hartzel DN, Leader JB, Kirchner HL, Stumpe MC, Hafez A, et al. Prediction of mortality from 12-lead electrocardiogram voltage data using a deep neural network. Nat Med. 2020;26:886–91.
  • 33. Hannun AY, Rajpurkar P, Haghpanahi M, Tison GH, Bourn C, Turakhia MP, Ng AY. Cardiologist-level arrhythmia detection and classification in ambulatory electrocardiograms using a deep neural network. Nat Med. 2019;25:65–9.
  • 34. Attia ZI, Kapa S, Lopez-Jimenez F, McKie PM, Ladewig DJ, Satam G, Pellikka PA, Enriquez-Sarano M, Noseworthy PA, Munger TM, et al. Screening for cardiac contractile dysfunction using an artificial intelligence-enabled electrocardiogram. Nat Med. 2019;25:70–4.
  • 35. Perez Alday EA, Gu A, J Shah A, Robichaux C, Ian Wong AK, Liu C, Liu F, Rad AB, Elola A, Seyedi S, et al. Classification of 12-lead ECGs: the PhysioNet/Computing in Cardiology Challenge 2020. Physiol Meas. 2020;41:124003.
  • 36. Himmelreich JCL, Veelers L, Lucassen WAM, Schnabel RB, Rienstra M, van Weert HCPM, Harskamp RE. Prediction models for atrial fibrillation applicable in the community: a systematic review and meta-analysis. Europace. 2020;22:684–94.
  • 37. Bundy JD, Heckbert SR, Chen LY, Lloyd-Jones DM, Greenland P. Evaluation of risk prediction models of atrial fibrillation (from the Multi-Ethnic Study of Atherosclerosis [MESA]). Am J Cardiol. 2020;125:55–62.
  • 38. Pfister R, Brägelmann J, Michels G, Wareham NJ, Luben R, Khaw KT. Performance of the CHARGE-AF risk model for incident atrial fibrillation in the EPIC Norfolk cohort. Eur J Prev Cardiol. 2015;22:932–9.
  • 39. Tsao CW, Vasan RS. Cohort Profile: The Framingham Heart Study (FHS): overview of milestones in cardiovascular epidemiology. Int J Epidemiol. 2015;44:1800–13.
  • 40. Alonso A, Roetker NS, Soliman EZ, Chen LY, Greenland P, Heckbert SR. Prediction of atrial fibrillation in a racially diverse cohort: The Multi-Ethnic Study of Atherosclerosis (MESA). J Am Heart Assoc. 2016;5. doi: 10.1161/JAHA.115.003077
  • 41. Levy D, Kenchaiah S, Larson MG, Benjamin EJ, Kupka MJ, Ho KKL, Murabito JM, Vasan RS. Long-term trends in the incidence of and survival with heart failure. N Engl J Med. 2002;347:1397–402.
  • 42. Mill JG, Pinto K, Griep RH, Goulart A, Foppa M, Lotufo PA, Maestri MK, Ribeiro AL, Andreão RV, Dantas EM, et al. Medical assessments and measurements in ELSA-Brasil. Rev Saude Publica. 2013;47 Suppl 2:54–62.
  • 43. Ethno-Racial Characteristics of the Population [Internet]. [cited 2025 Apr 18]. Available from: https://www.ibge.gov.br/en/statistics/full-list-statistics/17590-ethno-racial-characteristics-of-the-population.html?utm_source=chatgpt.com
  • 44. [cited 2025 Apr 8]. Available from: https://biobank.ndph.ox.ac.uk/ukb/ukb/docs/alg_outcome_main.pdf
  • 45. Camm CF, Von Ende A, Gajendragadkar PR, Pessoa-Amorim G, Mafham M, Allen N, Parish S, Casadei B, Hopewell JC. Role of primary and secondary care data in atrial fibrillation ascertainment: impact on risk factor associations, patient management, and mortality in UK Biobank. Europace. 2025;27. doi: 10.1093/europace/euae291
  • 46. Barreto SM, Ladeira RM, Bastos M do SCB de O, Diniz M de FHS, Jesus EA de, Kelles SMB, Luft VC, Melo ECP, Oliveira ERA de. Rev Saude Publica. 2013;47 Suppl 2:79–86.
  • 47. DeLong ER, DeLong DM, Clarke-Pearson DL. Comparing the areas under two or more correlated receiver operating characteristic curves: a nonparametric approach. Biometrics. 1988;44(3):837–45.
  • 48. Pencina MJ, D’Agostino RB Sr, Steyerberg EW. Extensions of net reclassification improvement calculations to measure usefulness of new biomarkers. Stat Med. 2011;30:11–21.
  • 49. Svennberg E, Friberg L, Frykman V, Al-Khalili F, Engdahl J, Rosenqvist M. Clinical outcomes in systematic screening for atrial fibrillation (STROKESTOP): a multicentre, parallel group, unmasked, randomised controlled trial. Lancet. 2021;398:1498–506.
  • 50. Friberg L, Rosenqvist M, Lip GYH. Net clinical benefit of warfarin in patients with atrial fibrillation: a report from the Swedish atrial fibrillation cohort study. Circulation. 2012;125:2298–307.
  • 51. Hart RG, Pearce LA, Aguilar MI. Meta-analysis: antithrombotic therapy to prevent stroke in patients who have nonvalvular atrial fibrillation. Ann Intern Med. 2007;146:857–67.
  • 52. Joglar JA, Chung MK, Armbruster AL, Benjamin EJ, Chyou JY, Cronin EM, Deswal A, Eckhardt LL, Goldberger ZD, Gopinathannair R, et al. 2023 ACC/AHA/ACCP/HRS guideline for the Diagnosis and Management of Atrial Fibrillation: A report of the American college of cardiology/American heart association joint committee on clinical practice guidelines. Circulation. 2024;149:e1–156.
  • 53. Attia ZI, Noseworthy PA, Lopez-Jimenez F, Asirvatham SJ, Deshmukh AJ, Gersh BJ, Carter RE, Yao X, Rabinstein AA, Erickson BJ, et al. An artificial intelligence-enabled ECG algorithm for the identification of patients with atrial fibrillation during sinus rhythm: a retrospective analysis of outcome prediction. Lancet. 2019;394:861–7.
  • 54. Zvuloni E, Read J, Ribeiro AH, Ribeiro ALP, Behar JA. On merging feature engineering and deep learning for diagnosis, risk prediction and age estimation based on the 12-lead ECG. IEEE Trans Biomed Eng. 2023;70:2227–36.
  • 55. Raghunath S, Pfeifer JM, Ulloa-Cerna AE, Nemani A, Carbonati T, Jing L, vanMaanen DP, Hartzel DN, Ruhl JA, Lagerman BF, et al. Deep neural networks can predict new-onset atrial fibrillation from the 12-lead ECG and help identify those at risk of atrial fibrillation-related stroke. Circulation. 2021;143:1287–98.
  • 56. Haixiang G, Yijing L, Shang J, Mingyun G, Yuanyue H, Bing G. Learning from class-imbalanced data: Review of methods and applications. Expert Syst Appl. 2017;73:220–39.
  • 57. Hlatky MA, Greenland P, Arnett DK, Ballantyne CM, Criqui MH, Elkind MSV, Go AS, Harrell FE Jr, Hong H, Howard BV, et al. Criteria for evaluation of novel markers of cardiovascular risk: a scientific statement from the American Heart Association. Circulation. 2009;119:2408–16.


Data Availability Statement

The data that support the study’s findings are available through the FHS for Researchers Portal https://www.framinghamheartstudy.org/fhs-for-researchers/ or NIH’s BioLINCC https://biolincc.nhlbi.nih.gov/home/. UK Biobank data are available to researchers by application (www.ukbiobank.ac.uk). ELSA-Brasil data are also available to researchers upon reasonable request (http://elsabrasil.org/pesquisadores/). The DNN-based ECG-AF model21 is available at https://github.com/mygithth27/af-risk-prediction-by-ecg-dnn.
