Abstract
Background
Performance of existing atrial fibrillation (AF) risk prediction models in poststroke populations is unclear. We evaluated predictive utility of an AF risk model in patients with acute stroke and assessed performance of a fully refitted model.
Methods and Results
Within an academic hospital, we included patients aged 46 to 94 years discharged for acute ischemic stroke between 2003 and 2018. We estimated 5‐year predicted probabilities of AF in 3 ways: using the Cohorts for Heart and Aging Research in Genomic Epidemiology for Atrial Fibrillation (CHARGE‐AF) model, by recalibrating CHARGE‐AF to the baseline risk of the sample, and by fully refitting a Cox proportional hazards model to the stroke sample (the Re‐CHARGE‐AF model). We compared discrimination and calibration between models and used 200 bootstrap samples for optimism‐adjusted measures. Among 551 patients with acute stroke, there were 70 incident AF events over 5 years (cumulative incidence, 15.2%; 95% CI, 10.6%–19.5%). Median predicted 5‐year risk from CHARGE‐AF was 4.8% (quartile 1–quartile 3, 2.0%–12.6%) and from Re‐CHARGE‐AF was 16.1% (quartile 1–quartile 3, 8.0%–26.2%). For CHARGE‐AF, discrimination was moderate (C statistic, 0.64; 95% CI, 0.57–0.70) and calibration was poor, underestimating AF risk (Greenwood‐Nam‐D’Agostino chi‐square, P<0.001). Calibration with recalibrated baseline risk was also poor (Greenwood‐Nam‐D’Agostino chi‐square, P<0.001). Re‐CHARGE‐AF improved discrimination (P=0.001) compared with CHARGE‐AF (C statistic, 0.74 [95% CI, 0.68–0.79]; optimism‐adjusted, 0.70 [95% CI, 0.65–0.75]) and was well calibrated (Greenwood‐Nam‐D’Agostino chi‐square, P=0.97).
Conclusions
Covariates from an established AF risk model enable accurate estimation of AF risk in a poststroke population after recalibration. A fully refitted model was required to account for varying baseline AF hazard and strength of associations between covariates and incident AF.
Keywords: atrial fibrillation, ischemic stroke, predicted risk
Subject Categories: Atrial Fibrillation, Ischemic Stroke, Risk Factors
Nonstandard Abbreviations and Acronyms
- CHARGE‐AF
Cohorts for Heart and Aging Research in Genomic Epidemiology for Atrial Fibrillation
- Re‐CHARGE‐AF
fully refitted CHARGE‐AF
Clinical Perspective
What Is New?
We evaluated the predictive utility of an established atrial fibrillation (AF) risk model (Cohorts for Heart and Aging Research in Genomic Epidemiology for Atrial Fibrillation [CHARGE‐AF]) in an acute stroke population, a setting where risk of AF is higher than in the population samples on which the model was based.
Both CHARGE‐AF and CHARGE‐AF recalibrated to the baseline risk of the poststroke sample were poorly calibrated and substantially underestimated the risk of AF.
A fully refitted CHARGE‐AF (Re‐CHARGE‐AF) model was required to account for varying baseline AF hazard and strength of associations between covariates and incident AF.
What Are the Clinical Implications?
As more risk estimates from prognostic models are incorporated into clinical tools, our results highlight the importance of evaluating model performance to ensure that accurate and useful information is being provided in the context of the population being treated.
Atrial fibrillation (AF) is a common cardiac arrhythmia associated with a 5‐fold increased risk of stroke. 1 AF‐related strokes have a high rate of recurrence and are associated with substantial morbidity, long‐term disability, and mortality. 2 , 3 , 4 Oral anticoagulants are effective for preventing strokes caused by AF. 5 Identifying patients with stroke who are at high risk for AF is challenging but important for preventing recurrent strokes.
AF may be asymptomatic even at the time of stroke, and detection may require extended cardiac rhythm monitoring. 6 , 7 , 8 Clinical guidelines support prolonged rhythm monitoring (≈30 days) for AF within 6 months in patients who have experienced an acute ischemic stroke with no other apparent cause (class IIa, level of evidence C), 9 and insertion of an implantable loop recorder to optimize detection of AF in patients with cryptogenic stroke in whom external ambulatory monitoring is inconclusive (class IIa, level of evidence B‐R). 10 Detection of AF with cardiac rhythm monitoring may occur in up to 20% of patients, but varies greatly by the timing, duration, and type of monitor used. 6 , 7 , 8 However, implantable cardiac rhythm monitoring is costly and invasive. 11
Assessing individual patient risk for AF may enable more efficient use of cardiac rhythm monitoring in individuals most likely to have had an AF‐related stroke. Although developed and validated in multiple community cohorts, the Cohorts for Heart and Aging Research in Genomic Epidemiology for Atrial Fibrillation (CHARGE‐AF) risk prediction model 12 has demonstrated poor calibration in healthcare‐related data sets, and has not been evaluated in an acute stroke population. 13 We sought to assess the performance of the CHARGE‐AF model to predict 5‐year incident AF after acute stroke, where risk of AF is higher than in the population samples on which the model was based. Based on our results, we performed full refitting of the CHARGE‐AF (Re‐CHARGE‐AF) model and assessed whether the updated model achieved favorable performance for prediction of AF following ischemic stroke.
Methods
Study Sample
Eligible patients were aged 46 to 94 years who were discharged from Massachusetts General Hospital following hospitalization for acute ischemic stroke between January 1, 2003, and December 31, 2018. In order to maximize follow‐up information, we included only those patients who were connected to a Massachusetts General Hospital primary care physician, defined by having at least 1 primary care visit during the 3 years before the stroke event. 14 , 15 Patients were excluded if they had a prevalent diagnosis of AF at the time of stroke, were diagnosed with AF within 7 days of the stroke event, or did not visit their Massachusetts General Hospital primary care physician following discharge. This medical records–based study was approved with a waiver of informed consent by the local Mass General Brigham institutional review board. Mass General Brigham data contain protected health information and cannot be publicly shared. The data processing scripts used to perform analyses will be made available to interested researchers upon reasonable request to the corresponding author.
Ascertainment of Clinical Factors
Patient characteristics, comorbidities, and medication lists were obtained from a central data repository at Mass General Brigham. 16 Age, sex, and race or ethnicity were ascertained at the time of stroke. Height, weight, and systolic and diastolic blood pressure (BP) recorded closest to the date of stroke were obtained. If a value was not documented on the date of the stroke event, we accepted the most proximal weight or BP documented in the electronic health record (EHR) within 5 years before the stroke event (lookback period for weight [median, 0.27 years; quartile 1–quartile 3, 0.08–1.02] and BP [median, 0.22 years; quartile 1–quartile 3, 0.05–0.76]). Sensitivity analyses limiting weight or BP to within 3 years before the stroke event excluded 93 patients from the sample but produced similar results in analyses of discrimination and calibration. The distribution of BP values documented on the date of stroke was similar to those documented before the stroke date, so we included BP values from the stroke date. We used the height documented closest to the stroke date without any time restrictions. Antihypertensive medication use was assessed based on any medications listed before the stroke date. Smoking status (current versus not) was assigned based on smoking status reported within a structured field in the health monitoring section of the EHR within 2 years; a problem list entry or International Classification of Diseases, Ninth Revision (ICD‐9) or International Classification of Diseases, Tenth Revision (ICD‐10), billing code within the prior year; or based on free text within the prior year using a natural language processing algorithm if smoking status using structured fields was unavailable. 17 Patients were considered to have diagnosed diabetes at the time of stroke (type 1 or type 2) using a previously validated algorithm. 18 Patients with congestive heart failure were identified using an internally validated algorithm that required 1 inpatient primary discharge or 2 outpatient visits with problem list terms or ICD‐9 or ICD‐10 billing codes for congestive heart failure (Table S1). Patients with previous myocardial infarction were identified using ICD‐9 or ICD‐10 billing codes. Death was ascertained based on EHR data linked to the Social Security Death Index.
Outcomes
The primary outcome was incident AF within 5 years of acute ischemic stroke. Incident AF status was ascertained using a previously validated EHR algorithm, which utilized problem list entries and inpatient or outpatient ICD‐9 or ICD‐10 codes. 19
Estimated AF Risk
We utilized CHARGE‐AF because it is a widely validated tool specifically designed to estimate risk of AF and has previously been compared with other metrics for estimating AF risk. 20 , 21 , 22 , 23 We estimated AF risk based on clinical risk factors in 3 ways. First, we calculated the CHARGE‐AF score using published components and weights. 12 We converted the CHARGE‐AF score into 5‐year predicted probability of AF using the formula $1 - S_0^{\exp(\sum \beta X - \overline{\sum \beta X})}$, where $\sum \beta X$ is an individual’s CHARGE‐AF score, and $S_0$ and $\overline{\sum \beta X}$ are the published constants for the baseline 5‐year AF‐free survival and the mean CHARGE‐AF score. 12 Second, we implemented CHARGE‐AF while recalibrating it to the baseline risk of the sample. 24 , 25 We generated an updated baseline risk by calculating the average 5‐year AF‐free survival and calculated the mean CHARGE‐AF score in the study sample. The CHARGE‐AF score was converted into 5‐year predicted probability using the same formula with updated constants: $1 - S_0^{\exp(\sum \beta X - \overline{\sum \beta X})}$, with $S_0$ now the average 5‐year AF‐free survival and $\overline{\sum \beta X}$ the mean CHARGE‐AF score in the study sample. Third, because adjusting the baseline risk alone did not result in a well‐calibrated model, we fully refitted a Cox proportional hazards model with AF incidence within 5 years as the outcome and included covariate terms for each component of the CHARGE‐AF model to create the Re‐CHARGE‐AF model. Censoring occurred at the time of death, at the last primary care visit during follow‐up if there was no visit history after 5 years, or after 5 years of follow‐up. We calculated predicted probabilities of 5‐year AF for the Re‐CHARGE‐AF model using the formula $1 - S_0^{\exp(\sum \beta X - \overline{\sum \beta X})}$, where the baseline risk $S_0$ represents the 5‐year AF‐free survival at the mean values of the risk factors in the sample, $\sum \beta X$ is an individual’s Re‐CHARGE‐AF score calculated using the regression coefficients from the updated Cox model (β) and the level for each risk factor (X), and the remaining constant $\overline{\sum \beta X}$ is the Re‐CHARGE‐AF score at the mean values of the risk factors in the sample.
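As an illustration of the score‐to‐probability conversion, the following Python sketch applies the standard Cox‐model form, 1 minus the baseline survival raised to the power exp(score minus mean score). It is not the authors' analysis code. The function name `five_year_af_risk` and every numeric value in the usage lines are hypothetical placeholders, not constants from CHARGE‐AF or from this study.

```python
import math


def five_year_af_risk(score: float, baseline_survival: float, mean_score: float) -> float:
    """Convert a Cox linear predictor into a 5-year predicted probability.

    score             -- an individual's sum of beta * X
    baseline_survival -- 5-year event-free survival at the mean score (S0)
    mean_score        -- the score at the mean values of the risk factors
    """
    if not 0.0 < baseline_survival < 1.0:
        raise ValueError("baseline_survival must lie strictly between 0 and 1")
    return 1.0 - baseline_survival ** math.exp(score - mean_score)


# Hypothetical values chosen only to show the behavior of the formula.
risk_at_mean = five_year_af_risk(score=12.0, baseline_survival=0.85, mean_score=12.0)
risk_above_mean = five_year_af_risk(score=13.0, baseline_survival=0.85, mean_score=12.0)
```

A patient whose score equals the mean score receives a predicted risk of exactly 1 minus the baseline survival. Recalibrating the baseline risk changes only the two constants, whereas full refitting also changes the coefficients that produce the score.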
Statistical Analysis
For descriptive data, we calculated means and SDs or numbers and percentages. We plotted the distribution of the estimated 5‐year predicted probability of the CHARGE‐AF model and the Re‐CHARGE‐AF model. To assess model performance, we compared discrimination and calibration between the CHARGE‐AF model and the Re‐CHARGE‐AF model. We assessed discrimination by comparing hazard ratios (HRs) among groups defined by both the CHARGE‐AF model and the Re‐CHARGE‐AF model. For each model, we created groups of predicted risk based on tertiles of the linear predictor values and then based on the 16th, 50th, and 84th percentiles of the linear predictor values. 26 We also assessed discrimination by calculating the Harrell C statistic and the Royston‐Sauerbrei $R^2_D$ and by plotting cumulative incidence curves for risk groups. We compared the Harrell C statistic between CHARGE‐AF and Re‐CHARGE‐AF with bias correction using 200 bootstrap samples. We assessed calibration by visually comparing the predicted and observed 5‐year AF risks from the CHARGE‐AF model, the CHARGE‐AF model with recalibrated baseline risk, and the Re‐CHARGE‐AF model with patients divided into risk groups based on quintiles, and also tested calibration using the Greenwood‐Nam‐D’Agostino test (where a significant P value suggested the presence of miscalibration). 27 To assess internal validity of discrimination and calibration estimates for the Re‐CHARGE‐AF model, we constructed 200 bootstrap samples and calculated the estimate of optimism and the optimism‐adjusted C statistic, and generated an optimism‐corrected calibration plot. 28 We considered a 2‐sided P value <0.05 to indicate statistical significance.
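To make the discrimination measure concrete, the sketch below computes a Harrell C statistic for right‐censored data in plain Python. It is a simplified illustration under stated assumptions, not the software used in this study: the function name `harrell_c` and the toy data are hypothetical, pairs with tied follow‐up times are skipped, and the bootstrap optimism correction is not shown.

```python
from typing import Sequence


def harrell_c(times: Sequence[float], events: Sequence[bool], scores: Sequence[float]) -> float:
    """Proportion of usable pairs in which the higher score had the earlier event."""
    concordant = 0.0
    usable = 0
    for i in range(len(times)):
        if not events[i]:
            continue  # each usable pair is anchored on an observed event
        for j in range(len(times)):
            # Usable only if j was still under observation and event-free
            # when i had the event (simplification: tied times are skipped).
            if times[i] < times[j]:
                usable += 1
                if scores[i] > scores[j]:
                    concordant += 1.0
                elif scores[i] == scores[j]:
                    concordant += 0.5  # tied scores count as half concordant
    if usable == 0:
        raise ValueError("no usable pairs")
    return concordant / usable


# Toy data (hypothetical): follow-up in years, event indicator, predicted score.
toy_times = [1.0, 2.0, 3.0, 4.0]
toy_events = [True, True, False, True]
c_perfect = harrell_c(toy_times, toy_events, [4.0, 3.0, 2.0, 1.0])
c_reversed = harrell_c(toy_times, toy_events, [1.0, 2.0, 3.0, 4.0])
```

A value of 0.5 corresponds to chance ordering and 1.0 to perfect ordering. The statistic depends only on the ranking of scores, which is why a model can discriminate moderately well while still being poorly calibrated.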
Results
Among 1110 patients discharged alive following an admission for acute ischemic stroke and connected to a Massachusetts General Hospital primary care physician, 228 (20.5%) had a prevalent diagnosis of AF, 132 (11.9%) had no follow‐up primary care visits after discharge, 68 (6.1%) did not meet age eligibility, 81 (7.3%) had missing data preventing AF risk estimation, and 50 (4.5%) had AF diagnosed within 7 days of the stroke, resulting in 551 patients for analysis (Figure 1). The mean age of patients was 68.0 years (SD, 11.8 years), 45.7% were women, and 80.6% were non‐Hispanic White. Additional baseline characteristics included in AF risk estimation are shown in Table 1. Characteristics of 559 patients with acute ischemic stroke linked with a primary care physician excluded from analyses (Table S2) and of patients from the original CHARGE‐AF derivation sample (Table S3) compared with the 551 patients included in these analyses are shown in the supplementary material.
Table 1.
Characteristic | N=551 |
---|---|
Age, y | 68.0±11.8 |
Female sex | 252 (45.7) |
Race or ethnicity | |
Non‐Hispanic White | 444 (80.6) |
Black | 48 (8.7) |
Asian | 17 (3.1) |
Hispanic | 19 (3.5) |
Other/unknown * | 23 (4.2) |
Height, cm | 167.9±10.9 |
Weight, kg | 81.7±18.6 |
Systolic BP, mm Hg | 141±24 |
Diastolic BP, mm Hg | 77±12 |
Smoking (current) | 114 (20.7) |
Antihypertensive medication use | 309 (56.1) |
Diabetes | 160 (29.0) |
Heart failure | 45 (8.2) |
Myocardial infarction | 34 (6.2) |
Values are mean±SD or number (percentage). BP indicates blood pressure.
* Other refers to 1 patient with race listed as "American Indian/Native Alaskan." The other 22 patients have Unknown race.
Over 5 years of follow‐up, there were 70 incident AF diagnoses (Kaplan‐Meier cumulative incidence, 15.2%; 95% CI, 10.6%–19.5%) and 32 death events that occurred before an AF diagnosis or the end of follow‐up (5.8%). The median duration of follow‐up among the entire patient sample was 1.92 years and among censored patients was 2.25 years. An estimate of potential follow‐up using the reverse Kaplan‐Meier method was 2.54 years (quartile 1–quartile 3: 0.99–5.00 years).
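The Kaplan‐Meier cumulative incidence differs from the crude proportion of events because censored patients leave the risk set. The following Python sketch shows the product‐limit calculation on hypothetical toy data. The function name `km_cumulative_incidence` is illustrative, and the code is not the authors' implementation.

```python
from typing import Sequence


def km_cumulative_incidence(times: Sequence[float], events: Sequence[bool], horizon: float) -> float:
    """Return 1 minus the Kaplan-Meier survival estimate at `horizon`."""
    survival = 1.0
    # Distinct times at which at least one event was observed, up to the horizon.
    event_times = sorted({t for t, e in zip(times, events) if e and t <= horizon})
    for t in event_times:
        at_risk = sum(1 for u in times if u >= t)  # still followed just before t
        n_events = sum(1 for u, e in zip(times, events) if e and u == t)
        survival *= 1.0 - n_events / at_risk
    return 1.0 - survival


# Toy data (hypothetical): 5 patients, 2 events, 3 censored.
toy_times = [1.0, 2.0, 3.0, 4.0, 5.0]
toy_events = [True, False, True, False, False]
toy_incidence = km_cumulative_incidence(toy_times, toy_events, horizon=5.0)
```

In the toy data the crude proportion is 2/5 = 40%, whereas the Kaplan‐Meier estimate is about 46.7%, because the patient censored at 2 years is no longer in the denominator when the second event occurs. The same mechanism explains why the reported cumulative incidence of 15.2% exceeds the crude proportion of 70/551 (about 12.7%).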
The estimated β coefficients and HRs from the original CHARGE‐AF model and the Re‐CHARGE‐AF model are shown in Table 2. Heart failure, age, and myocardial infarction remained strong predictors of AF incidence in both models, although there were large differences with wide CIs in the β coefficients and corresponding HRs for other variables (Figure S1). This includes some variables indicating decreased AF risk in our sample in contrast to the original CHARGE‐AF model results. The distributions of the estimated 5‐year predicted probability of AF for the CHARGE‐AF model and the Re‐CHARGE‐AF model are depicted in Figure 2. The distribution of AF risk for Re‐CHARGE‐AF is shifted to the right towards higher estimated AF risk. The median predicted 5‐year AF risk from the CHARGE‐AF model was 4.8% (quartile 1–quartile 3, 2.0%–12.6%). The median predicted 5‐year AF risk from the refitted model was 16.1% (quartile 1–quartile 3, 8.0%–26.2%).
Table 2.
Variable | CHARGE‐AF Estimated β (SE) 12 | CHARGE‐AF HR (95% CI) 12 | Re‐CHARGE‐AF Estimated β (SE) | Re‐CHARGE‐AF HR (95% CI) |
---|---|---|---|---|
Age (5 y) | 0.508 (0.022) | 1.66 (1.59–1.74) | 0.286 (0.065) | 1.33 (1.17–1.51) |
Race (White)* | 0.465 (0.093) | 1.59 (1.33–1.91) | −0.686 (0.309) | 0.50 (0.27–0.92) |
Height (10 cm) | 0.248 (0.036) | 1.28 (1.19–1.38) | −0.133 (0.128) | 0.88 (0.68–1.13) |
Weight (15 kg) | 0.115 (0.033) | 1.12 (1.05–1.20) | 0.421 (0.117) | 1.52 (1.21–1.92) |
Systolic BP (20 mm Hg) | 0.197 (0.033) | 1.22 (1.14–1.30) | 0.023 (0.114) | 1.02 (0.82–1.28) |
Diastolic BP (10 mm Hg) | −0.101 (0.032) | 0.90 (0.85–0.96) | −0.116 (0.121) | 0.89 (0.70–1.13) |
Smoking (current) | 0.359 (0.091) | 1.43 (1.20–1.71) | −0.517 (0.387) | 0.60 (0.28–1.27) |
Antihypertensive medication use | 0.349 (0.063) | 1.42 (1.25–1.60) | 0.004 (0.273) | 1.00 (0.59–1.72) |
Diabetes (yes) | 0.237 (0.073) | 1.27 (1.10–1.46) | −0.488 (0.291) | 0.61 (0.35–1.09) |
Heart failure (yes) | 0.701 (0.106) | 2.02 (1.64–2.48) | 0.627 (0.379) | 1.87 (0.89–3.93) |
Myocardial infarction (yes) | 0.496 (0.089) | 1.64 (1.38–1.96) | 0.282 (0.396) | 1.33 (0.61–2.88) |
BP indicates blood pressure; and HR, hazard ratio.
* For the Cohorts for Heart and Aging Research in Genomic Epidemiology for Atrial Fibrillation (CHARGE‐AF) model, the race coefficient corresponds to White persons compared with Black persons. For the fully refitted CHARGE‐AF (Re‐CHARGE‐AF) model, the race coefficient corresponds to non‐Hispanic White persons compared with persons from all other race/ethnic groups.
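The HR columns in Table 2 follow from the β and SE columns. The Python sketch below assumes a Wald‐type 95% CI, exp(β ± 1.96 × SE). That assumption reproduces the tabulated values for the rows checked here but is not stated in the text. The function name `hazard_ratio_ci` is hypothetical.

```python
import math
from typing import Tuple


def hazard_ratio_ci(beta: float, se: float, z: float = 1.96) -> Tuple[float, float, float]:
    """Return (HR, lower, upper) assuming a Wald interval on the log-hazard scale."""
    return (math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se))


# Re-CHARGE-AF rows from Table 2: age per 5 y and weight per 15 kg.
age_hr = tuple(round(v, 2) for v in hazard_ratio_ci(0.286, 0.065))
weight_hr = tuple(round(v, 2) for v in hazard_ratio_ci(0.421, 0.117))
```

Because each coefficient is expressed per the unit shown in the table (for example, per 5 years of age), the covariate must be rescaled to that unit before it is multiplied by β when computing a score.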
For the CHARGE‐AF model, the HR for AF was 4.93 (95% CI, 2.38–10.21) for the highest risk tertile group and 2.81 (95% CI, 1.30–6.07) for the middle risk tertile, both compared with the lowest risk tertile. By comparison, for the Re‐CHARGE‐AF model, the HR for AF was 6.69 (95% CI, 3.15–14.23) for its highest risk tertile and 2.33 (95% CI, 1.02–5.37) for the middle tertile. The C statistic for the CHARGE‐AF model was 0.64 (95% CI, 0.57–0.70) and the Royston‐Sauerbrei D statistic was 0.83 (95% CI, 0.46–1.20). For the Re‐CHARGE‐AF model, the C statistic was 0.74 (95% CI, 0.68–0.79) and the Royston‐Sauerbrei D statistic was 1.30 (95% CI, 1.10–1.50). The optimism‐adjusted C statistic for Re‐CHARGE‐AF was 0.70 (95% CI, 0.65–0.75) and was significantly greater than the C statistic for the CHARGE‐AF model (P=0.001). Cumulative incidence plots stratified by tertile groups of predicted risk based on the CHARGE‐AF model and the Re‐CHARGE‐AF model are shown in Figure 3A and 3B. The evaluation of discrimination with 4 risk groups based on the 16th, 50th, and 84th percentiles is shown in Table S4 and Figure S2. There is separation between the cumulative incidence curves for both CHARGE‐AF and Re‐CHARGE‐AF when stratified into 3 or 4 risk groups. In both instances, the highest risk group in the Re‐CHARGE‐AF model demonstrated greater separation from the next highest risk group (Kaplan‐Meier estimate for highest tertile: 33.4%; middle tertile: 16.1%; >84th percentile: 41.7%; 50th–84th percentile: 24.3%) compared with CHARGE‐AF (Kaplan‐Meier estimate for highest tertile: 31.7%; middle tertile: 19.6%; >84th percentile: 32.1%; 50th–84th percentile: 26.1%).
Calibration of the CHARGE‐AF model was poor, with the plot of observed 5‐year AF risk versus predicted 5‐year AF risk demonstrating marked underestimation of AF risk (Figure 4A) and the Greenwood‐Nam‐D’Agostino test indicating miscalibration (chi‐square: 24.8, P<0.001). Calibration of the CHARGE‐AF model with recalibrated baseline risk was also poor (Greenwood‐Nam‐D’Agostino chi‐square: 37.6, P<0.001; Figure S3). In contrast, the Re‐CHARGE‐AF model appeared well calibrated both with and without optimism adjustment (Figure 4B), as well as by the Greenwood‐Nam‐D’Agostino test (chi‐square: 0.53, P=0.97).
Discussion
Among over 500 primary care patients discharged from a regional stroke referral center after acute ischemic stroke, we observed that the CHARGE‐AF risk model achieved moderate discrimination for incident AF but was poorly calibrated and substantially underestimated the risk of AF. Recalibration of CHARGE‐AF to the baseline hazard of our poststroke sample was insufficient to achieve accurate absolute AF risk estimates. In contrast, a fully refitted model, Re‐CHARGE‐AF, demonstrated substantially greater discrimination of AF and achieved good calibration between predicted and observed AF incidence. Our findings suggest that accurate AF risk estimation in the poststroke setting can be achieved using covariates from an established AF risk model, but only after adjustment to account both for varying baseline risk and relative influence of covariates. Given that AF is an important predictor of recurrent stroke, our findings may enable accurate estimation of AF risk and ultimately aid in clinical management decisions in patients with stroke.
The CHARGE‐AF risk model was developed in community‐based cohorts to predict incident AF. Although it has been externally validated in multiple cohorts, 12 , 20 , 22 , 23 , 29 , 30 its discriminatory performance and calibration in an acute stroke population, whose risk of AF is elevated, has not previously been assessed. Our results demonstrate that calculating 5‐year predicted probability of incident AF in an acute stroke population using the published CHARGE‐AF model components and weights achieves moderate discrimination but poor calibration. A full model refitting comprising the CHARGE‐AF score components was required to achieve a more discriminative risk score that is well calibrated in the poststroke population.
Prior research has shown that cardiac monitoring following acute stroke may be underutilized and is not associated with predicted risk of AF. 31 , 32 , 33 Future research is needed to evaluate whether accurate assessments of AF risk in acute stroke survivors increases appropriate poststroke cardiac monitor utilization and identification of AF.
Our findings support the need to evaluate both discrimination and calibration of prognostic models before implementation in clinical practice. Even if a model demonstrates good discrimination, poor calibration can make predictions based on the model misleading. 34 We observed that the CHARGE‐AF model underestimated AF risk in a poststroke population, which may impact clinical decisions by physicians and may misrepresent risk to patients. For example, if utilizing predicted AF risk to determine whether extended or ambulatory cardiac monitoring is appropriate following stroke, poor calibration may lead to inappropriately lower utilization via underestimation of AF risk. Despite the importance of both discrimination and calibration in making a model clinically useful, systematic reviews have shown that calibration is assessed less often. 35 , 36 , 37 The C2HEST (coronary artery disease or chronic obstructive pulmonary disease [1 point each]; hypertension [1 point]; elderly [age ≥75 years, 2 points]; systolic heart failure [2 points]; thyroid disease [hyperthyroidism, 1 point]) score, a score originally developed in a general population of Asian patients to predict incident AF, was recently evaluated in a poststroke population in France. 38 While the C2HEST score showed adequate discrimination in this population, calibration of the model was not assessed. As our study demonstrates, recalibration and even refitting may be necessary in order to accurately present risk. 39
Calibration of prognostic models may be affected by several factors. The underlying risk of disease incidence and other patient characteristics may differ between where an algorithm is developed and where it is implemented. 39 In addition, calibration may be impacted by secular trends in disease incidence. 40 We applied the CHARGE‐AF model to an acute stroke sample from an academic medical hospital, which corresponds to a target population with high underlying risk of developing AF, and ascertained predictors using EHR data. Poor calibration may be expected since CHARGE‐AF was developed in community‐based cohorts with lower incidence of AF and routine follow‐up data collection. Additionally, the derivation data for CHARGE‐AF included only White and Black persons, 12 while our sample included additional racial and ethnic groups. We used the coefficient for race from CHARGE‐AF, although it may not be applicable to other groups.
This study has several potential limitations. We utilized the CHARGE‐AF model since it is most widely used and externally validated. 13 , 20 , 23 , 29 , 30 Other AF risk scores may have performed differently in a poststroke sample. Ascertainment of clinical predictors and incidence of AF was based on retrospective assessment of EHR documentation, which may be associated with misclassification. Our sample was limited to only those patients connected to our primary care network, a study design choice to increase the probability of having adequate follow‐up for outcome assessment. Despite this, we do not have the ability to fully ascertain incident diagnoses of AF for those patients who left the network and were censored during the follow‐up period. However, the proportion of patients diagnosed with AF following discharge in our sample compares favorably with a meta‐analysis of ambulatory AF diagnoses poststroke. 41 Limited sample size led to a modest number of incident AF events and imprecise estimates. Simulation studies suggest a minimum of 100 events for external validation of logistic regression models to detect differences in model performance for calibration and discrimination. 42 However, we did detect evidence of decreased calibration when applying CHARGE‐AF in our stroke sample despite the relatively low number of incident AF events, which motivated the recalibration. Our study was conducted within a single‐center tertiary academic hospital with patients who were largely of European ancestry, so generalizability may be limited. We did not perform external validation of our refitted model as our goal was not to propose a new standard model but rather to evaluate and correct the calibration of an existing and widely used model. However, we did observe some decrement in prediction accuracy in internal validation. 
It is possible that improved prediction of AF incidence could be achieved by building a new model or adding new risk factors that may be predictive in a poststroke population; however, that was not the objective of the current study. Unmeasured confounding may have affected the estimates of the model coefficients; thus, although the coefficients are predictive of AF risk in our representative Massachusetts General Hospital ischemic stroke population, they are not intended to represent biologically informative markers of disease risk. We do not propose external use of our derived coefficients. Rather, our findings suggest that, when possible, recalibration or refitting of existing models within populations in whom deployment is intended may facilitate substantially more accurate absolute risk estimates.
In conclusion, in a sample of patients with acute stroke connected to primary care, we found that the CHARGE‐AF risk model exhibited moderate discrimination of incident AF; however, it was poorly calibrated and underestimated true AF risk. A fully refitted model in our stroke sample substantially improved discrimination and was well calibrated. As we move towards incorporating risk estimates from prognostic models into clinical tools to improve decision making, it is critical to evaluate model performance, calibration, and discrimination to ensure that we are providing the most useful information in the context of the population being treated.
Sources of Funding
Drs Lubitz, Anderson, Trinquart, and Ashburner are supported by the American Heart Association (AHA) 18SFRN34250007 and 18SFRN34150007. Dr Ashburner is supported by National Institutes of Health (NIH) grant K01HL148506. Dr Lubitz is supported by National Institutes of Health (NIH) grant 1R01HL139731. Drs Ellinor and Benjamin are supported by NIH grant 1RO1HL092577 and the AHA (18SFRN34110082). Dr Ellinor is supported by grants from the Fondation Leducq (14CVD01) and the NIH (K24HL105780). Dr Anderson is supported by NIH grants R01NS103924 and U01NS069763, AHA‐Bugher Foundation Centers for Excellence in Hemorrhagic Stroke, the Massachusetts General Hospital Center for Neuroscience, and the Henry and Allison McCance Center for Brain Health. Dr Khurshid is supported by NIH T32HL007208. Dr Ko is supported by the American College of Cardiology Foundation/Merck Research Fellowship in Cardiovascular Diseases and Cardiometabolic Disorders. Dr Benjamin is also supported by R01 HL141434, R01AG066010, 1R01AG066914, and 2U54HL120163.
Disclosures
Dr Lubitz receives sponsored research support from Bristol‐Myers Squibb/Pfizer, Bayer AG, Boehringer Ingelheim, and Fitbit; has consulted for Bristol‐Myers Squibb/Pfizer and Bayer AG; and participates in a research collaboration with IBM. Dr Ellinor is supported by a grant from Bayer AG to the Broad Institute focused on the genetics and therapeutics of cardiovascular diseases. Dr Ellinor has also served on advisory boards or consulted for Bayer AG, MyoKardia, Quest Diagnostics, and Novartis. Dr Anderson receives sponsored research support from Bayer AG and has consulted for ApoPharma, Inc. Dr Singer receives research support from Bristol‐Myers Squibb and has consulted for Boehringer Ingelheim, Bristol‐Myers Squibb, Fitbit, Johnson and Johnson, Merck, and Pfizer. Dr Atlas receives sponsored research support from Bristol‐Myers Squibb/Pfizer and has consulted for Bristol‐Myers Squibb/Pfizer and Fitbit. The remaining authors have no disclosures to report.
References
- 1. Wolf PA, Abbott RD, Kannel WB. Atrial fibrillation as an independent risk factor for stroke: the Framingham Study. Stroke. 1991;22:983–988. doi: 10.1161/01.STR.22.8.983
- 2. Brüggenjürgen B, Rossnagel K, Roll S, Andersson FL, Selim D, Müller‐Nordhorn J, Nolte CH, Jungehülsing GJ, Villringer A, Willich SN. The impact of atrial fibrillation on the cost of stroke: the Berlin Acute Stroke Study. Value Health. 2007;10:137–143. doi: 10.1111/j.1524-4733.2006.00160.x
- 3. Hylek EM, Go AS, Chang Y, Jensvold NG, Henault LE, Selby JV, Singer DE. Effect of intensity of oral anticoagulation on stroke severity and mortality in atrial fibrillation. N Engl J Med. 2003;349:1019–1026. doi: 10.1056/NEJMoa022913
- 4. Lin HJ, Wolf PA, Kelly‐Hayes M, Beiser AS, Kase CS, Benjamin EJ, D’Agostino RB. Stroke severity in atrial fibrillation. The Framingham Study. Stroke. 1996;27:1760–1764. doi: 10.1161/01.STR.27.10.1760
- 5. Ruff CT, Giugliano RP, Braunwald E, Hoffman EB, Deenadayalu N, Ezekowitz MD, Camm AJ, Weitz JI, Lewis BS, Parkhomenko A, et al. Comparison of the efficacy and safety of new oral anticoagulants with warfarin in patients with atrial fibrillation: a meta‐analysis of randomised trials. Lancet. 2014;383:955–962. doi: 10.1016/S0140-6736(13)62343-0
- 6. Sanna T, Diener HC, Passman RS, Di Lazzaro V, Bernstein RA, Morillo CA, Rymer MM, Thijs V, Rogers T, Beckers F, et al. Cryptogenic stroke and underlying atrial fibrillation. N Engl J Med. 2014;370:2478–2486. doi: 10.1056/NEJMoa1313600
- 7. Gladstone DJ, Spring M, Dorian P, Panzov V, Thorpe KE, Hall J, Vaid H, O’Donnell M, Laupacis A, Côté R, et al. Atrial fibrillation in patients with cryptogenic stroke. N Engl J Med. 2014;370:2467–2477. doi: 10.1056/NEJMoa1311376
- 8. Kishore A, Vail A, Majid A, Dawson J, Lees KR, Tyrrell PJ, Smith CJ. Detection of atrial fibrillation after ischemic stroke or transient ischemic attack: a systematic review and meta‐analysis. Stroke. 2014;45:520–526. doi: 10.1161/STROKEAHA.113.003433
- 9. Kernan WN, Ovbiagele B, Black HR, Bravata DM, Chimowitz MI, Ezekowitz MD, Fang MC, Fisher M, Furie KL, Heck DV, et al. Guidelines for the prevention of stroke in patients with stroke and transient ischemic attack: a guideline for healthcare professionals from the American Heart Association/American Stroke Association. Stroke. 2014;45:2160–2236. doi: 10.1161/STR.0000000000000024
- 10. January CT, Wann LS, Calkins H, Chen LY, Cigarroa JE, Cleveland JC, Ellinor PT, Ezekowitz MD, Field ME, Furie KL, et al. 2019 AHA/ACC/HRS focused update of the 2014 AHA/ACC/HRS guideline for the management of patients with atrial fibrillation: a report of the American College of Cardiology/American Heart Association Task Force on Clinical Practice Guidelines and the Heart Rhythm Society. Circulation. 2019;140:e125–e151. doi: 10.1161/CIR.0000000000000665
- 11. Chew DS, Rennert‐May E, Spackman E, Mark DB, Exner DV. Cost‐effectiveness of extended electrocardiogram monitoring for atrial fibrillation after stroke: a systematic review. Stroke. 2020;51:2244–2248. doi: 10.1161/STROKEAHA.120.029340
- 12. Alonso A, Krijthe BP, Aspelund T, Stepas KA, Pencina MJ, Moser CB, Sinner MF, Sotoodehnia N, Fontes JD, Janssens AC, et al. Simple risk model predicts incidence of atrial fibrillation in a racially and geographically diverse population: the CHARGE‐AF consortium. J Am Heart Assoc. 2013;2:e000102. doi: 10.1161/JAHA.112.000102
- 13. Kolek MJ, Graves AJ, Xu M, Bian A, Teixeira PL, Shoemaker MB, Parvez B, Xu H, Heckbert SR, Ellinor PT, et al. Evaluation of a prediction model for the development of atrial fibrillation in a repository of electronic medical records. JAMA Cardiol. 2016;1:1007–1013. doi: 10.1001/jamacardio.2016.3366
- 14. Atlas SJ, Grant RW, Ferris TG, Chang Y, Barry MJ. Patient‐physician connectedness and quality of primary care. Ann Intern Med. 2009;150:325–335. doi: 10.7326/0003-4819-150-5-200903030-00008
- 15. Atlas SJ, Chang Y, Lasko TA, Chueh HC, Grant RW, Barry MJ. Is this “my” patient? Development and validation of a predictive model to link patients to primary care providers. J Gen Intern Med. 2006;21:973–978. doi: 10.1007/BF02743147
- 16. Murphy SN, Chueh HC. A security architecture for query tools used to access large biomedical databases. Proc AMIA Symp. 2002;552–556. [PMC free article] [PubMed] [Google Scholar]
- 17. Regan S, Meigs JB, Grinspoon SK, Triant VA. Determinants of smoking and quitting in HIV‐infected individuals. PLoS One. 2016;11:e0153103. doi: 10.1371/journal.pone.0153103 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 18. Grant RW, Cagliero E, Sullivan CM, Dubey AK, Estey GA, Weil EM, Gesmundo J, Nathan DM, Singer DE, Chueh HC, et al. A controlled trial of population management: diabetes mellitus: putting evidence into practice (DM‐PEP). Diabetes Care. 2004;27:2299–2305. doi: 10.2337/diacare.27.10.2299 [DOI] [PubMed] [Google Scholar]
- 19. Ashburner JM, Singer DE, Lubitz SA, Borowsky LH, Atlas SJ. Changes in use of anticoagulation in patients with atrial fibrillation within a primary care network associated with the introduction of direct oral anticoagulants. Am J Cardiol. 2017;120:786–791. doi: 10.1016/j.amjcard.2017.05.055 [DOI] [PubMed] [Google Scholar]
- 20. Khurshid S, Kartoun U, Ashburner JM, Trinquart L, Philippakis A, Khera AV, Ellinor PT, Ng K, Lubitz SA. Performance of atrial fibrillation risk prediction models in over 4 million individuals. Circ Arrhythm Electrophysiol. 2021;14:e008997. doi: 10.1161/CIRCEP.120.008997 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 21. Himmelreich JC, Veelers L, Lucassen WA, Schnabel RB, Rienstra M, van Weert HC, Harskamp RE. Prediction models for atrial fibrillation applicable in the community: a systematic review and meta‐analysis. Europace. 2020;22:684–694. doi: 10.1093/europace/euaa005 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 22. Himmelreich JC, Lucassen WA, Harskamp RE, Aussems C, van Weert HC, Nielen MM. CHARGE‐AF in a national routine primary care electronic health records database in the Netherlands: validation for 5‐year risk of atrial fibrillation and implications for patient selection in atrial fibrillation screening. Open Heart. 2021;8:e001459. Available from: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7816907/ [DOI] [PMC free article] [PubMed] [Google Scholar]
- 23. Christophersen IE, Yin X, Larson MG, Lubitz SA, Magnani JW, McManus DD, Ellinor PT, Benjamin EJ. A comparison of the CHARGE‐AF and the CHA2DS2‐VASc risk scores for prediction of atrial fibrillation in the Framingham Heart Study. Am Heart J. 2016;178:45–54. doi: 10.1016/j.ahj.2016.05.004 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 24. Steyerberg EW, Borsboom GJ, van Houwelingen HC, Eijkemans MJ, Habbema JD. Validation and updating of predictive logistic regression models: a study on sample size and shrinkage. Stat Med. 2004;23:2567–2586. doi: 10.1002/sim.1844 [DOI] [PubMed] [Google Scholar]
- 25. Crowson CS, Atkinson EJ, Therneau TM. Assessing calibration of prognostic risk scores. Stat Methods Med Res. 2016;25:1692–1706. doi: 10.1177/0962280213497434 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 26. Cox D. Note on grouping. J Am Stat Assoc. 1957;52:543–547. doi: 10.1080/01621459.1957.10501411 [DOI] [Google Scholar]
- 27. Demler OV, Paynter NP, Cook NR. Tests of calibration and goodness of fit in the survival setting. Stat Med. 2015;34:1659–1680. doi: 10.1002/sim.6428 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 28. Harrell FE, Lee KL, Mark DB. Multivariable prognostic models: issues in developing models, evaluating assumptions and adequacy, and measuring and reducing errors. Stat Med. 1996;15:361–387.
- 29. Hulme OL, Khurshid S, Weng LC, Anderson CD, Wang EY, Ashburner JM, Ko D, McManus DD, Benjamin EJ, Ellinor PT, et al. Development and validation of a prediction model for atrial fibrillation using electronic health records. JACC Clin Electrophysiol. 2019;5:1331–1341. doi: 10.1016/j.jacep.2019.07.016
- 30. Pfister R, Brägelmann J, Michels G, Wareham NJ, Luben R, Khaw KT. Performance of the CHARGE‐AF risk model for incident atrial fibrillation in the EPIC Norfolk cohort. Eur J Prev Cardiol. 2015;22:932–939. doi: 10.1177/2047487314544045
- 31. Demeestere J, Fieuws S, Lansberg MG, Lemmens R. Detection of atrial fibrillation among patients with stroke due to large or small vessel disease: a meta‐analysis. J Am Heart Assoc. 2016;5:e004151. doi: 10.1161/JAHA.116.004151
- 32. Burn J, Dennis M, Bamford J, Sandercock P, Wade D, Warlow C. Long‐term risk of recurrent stroke after a first‐ever stroke. The Oxfordshire community stroke project. Stroke. 1994;25:333–337. doi: 10.1161/01.STR.25.2.333
- 33. Khurshid S, Li X, Ashburner JM, Lipsanopoulos AT, Lee PR, Lin AK, Ko D, Ellinor PT, Schwamm LH, Benjamin EJ, et al. Usefulness of rhythm monitoring following acute ischemic stroke. Am J Cardiol. 2021;147:44–51. doi: 10.1016/j.amjcard.2021.01.038
- 34. Van Calster B, Vickers AJ. Calibration of risk prediction models: impact on decision‐analytic performance. Med Decis Making. 2015;35:162–169. doi: 10.1177/0272989X14547233
- 35. Collins GS, de Groot JA, Dutton S, Omar O, Shanyinde M, Tajar A, Voysey M, Wharton R, Yu LM, Moons KG, et al. External validation of multivariable prediction models: a systematic review of methodological conduct and reporting. BMC Med Res Methodol. 2014;14:40. doi: 10.1186/1471-2288-14-40
- 36. Christodoulou E, Ma J, Collins GS, Steyerberg EW, Verbakel JY, Van Calster B. A systematic review shows no performance benefit of machine learning over logistic regression for clinical prediction models. J Clin Epidemiol. 2019;110:12–22. doi: 10.1016/j.jclinepi.2019.02.004
- 37. Bouwmeester W, Zuithoff NPA, Mallett S, Geerlings MI, Vergouwe Y, Steyerberg EW, Altman DG, Moons KG. Reporting and methods in clinical prediction research: a systematic review. PLoS Medicine. 2012;9:1–12. doi: 10.1371/journal.pmed.1001221
- 38. Li YG, Bisson A, Bodin A, Herbert J, Grammatico‐Guillon L, Joung B, Wang YT, Lip GY, Fauchier L. C2HEST score and prediction of incident atrial fibrillation in poststroke patients: a French nationwide study. J Am Heart Assoc. 2019;8:e012546. doi: 10.1161/JAHA.119.012546
- 39. Van Calster B, McLernon DJ, van Smeden M, Wynants L, Steyerberg EW, Topic Group ‘Evaluating diagnostic tests and prediction models’ of the STRATOS initiative. Calibration: the Achilles heel of predictive analytics. BMC Med. 2019;17:230. doi: 10.1186/s12916-019-1466-7
- 40. Schonfeld SJ, Pee D, Greenlee RT, Hartge P, Lacey JV, Park Y, Schatzkin A, Visvanathan K, Pfeiffer RM. Effect of changing breast cancer incidence rates on the calibration of the Gail model. J Clin Oncol. 2010;28:2411–2417. doi: 10.1200/JCO.2009.25.2767
- 41. Sposato LA, Cipriano LE, Saposnik G, Vargas ER, Riccio PM, Hachinski V. Diagnosis of atrial fibrillation after stroke and transient ischaemic attack: a systematic review and meta‐analysis. Lancet Neurol. 2015;14:377–387. doi: 10.1016/S1474-4422(15)70027-X
- 42. Vergouwe Y, Steyerberg EW, Eijkemans MJ, Habbema JD. Substantial effective sample sizes were required for external validation studies of predictive logistic regression models. J Clin Epidemiol. 2005;58:475–483. doi: 10.1016/j.jclinepi.2004.06.017
Associated Data