Abstract
Background
The accuracy of stroke diagnosis in administrative claims for a contemporary population of Medicare enrollees has not been studied. We assessed the validity of diagnostic coding algorithms for identifying stroke in the Medicare population by linking data from the REasons for Geographic And Racial Differences in Stroke (REGARDS) Study to Medicare claims.
Methods and Results
The REGARDS Study enrolled 30,239 participants 45 years and older in the United States between 2003 and 2007. Stroke experts adjudicated suspected strokes using retrieved medical records. We linked data for participants enrolled in fee-for-service Medicare to claims files from 2003 through 2009. Using adjudicated strokes as the gold standard, we calculated accuracy measures for algorithms to identify incident and recurrent stroke.
We linked data for 15,089 participants, among whom 422 participants had adjudicated strokes during follow-up. An algorithm using primary discharge diagnosis codes for acute ischemic or hemorrhagic stroke [ICD-9-CM codes: 430, 431, 433.x1, 434.x1, 436] had positive predictive value of 92.6% (95% Confidence Interval (CI), 88.8%-96.4%), specificity of 99.8% (99.6%-99.9%), and sensitivity of 59.5% (53.8%-65.1%). An algorithm using only acute ischemic stroke codes [433.x1, 434.x1, 436] had positive predictive value of 91.1% (95% CI, 86.6%-95.5%), specificity of 99.8% (99.7%-99.9%), and sensitivity of 58.6% (52.4%-64.7%).
Conclusions
Claims-based algorithms to identify stroke in a contemporary Medicare cohort had high positive predictive value and specificity, supporting their use as outcomes for etiologic and comparative effectiveness studies in similar populations. These inpatient algorithms are unsuitable for estimating stroke incidence due to low sensitivity.
Keywords: cohort study, Medicare, diagnosis, comparative effectiveness, health services research
Introduction
Previous studies have assessed the accuracy of diagnoses of stroke in administrative claims databases.1-12 Relatively high positive predictive values (PPVs) ranging from 70% to 96% have been reported, especially in recent studies using fifth-digit International Classification of Diseases, Ninth Revision, Clinical Modification (ICD-9-CM) codes, supporting the use of these algorithms to identify stroke in the databases. The most recent studies in Medicare patients include one in a population of kidney transplant recipients in 200413 and another in patients with atrial fibrillation in 1998-1999.8 Improvements in modalities for diagnosis of stroke have changed clinical practice over the last 15 years,14, 15 and changes to ICD-9-CM codes have also occurred.16 Thus, the validity of stroke ascertainment algorithms in a contemporary population might differ from that suggested by older studies. High specificity and PPV are crucial for the validity of estimates of etiological and comparative effectiveness studies when these claims-based algorithms are used to identify outcome events.17-19
Investigators for the REasons for Geographic And Racial Differences in Stroke (REGARDS) Study, a nationwide epidemiologic study of 30,239 participants, recently linked the cohort to Medicare claims data. This linkage provided a unique opportunity to assess the validity of claims-based algorithms for stroke diagnosis in Medicare beneficiaries. The purpose of the current study is to assess the validity of the claims-based algorithms and illustrate the types of analyses in which these algorithms are useful in the contemporary Medicare population.
Methods
Data Sources
We used linked REGARDS-Medicare claims data for the study. The REGARDS data contained information on 30,239 community-dwelling participants 45 years and older recruited throughout the United States between 2003 and 2007, with oversampling of black participants and those living in the “Stroke Belt,” a region with a higher incidence of stroke than the rest of the country (Alabama, Arkansas, Georgia, Louisiana, Mississippi, North Carolina, South Carolina, and Tennessee). Details of the study methods have been described previously by Howard and colleagues.20 Participants who agreed to be enrolled completed a 45-minute telephone interview to collect demographic, socioeconomic, risk factor, and medical history information at baseline. A health professional then completed a visit to collect blood and urine samples, blood pressure measurements, electrocardiogram, and other key study variables. Participants were followed through phone interviews performed every 6 months. Medical records for self-reported suspected stroke events and/or stroke symptoms were retrieved whenever possible and centrally adjudicated for stroke events by a team of trained stroke experts.
Medicare is the primary health insurer of the US population 65 years and older,21 and administrative claims data for fee-for-service enrollees, who constitute approximately 80% of the total Medicare population,22 are collected and distributed by the Centers for Medicare & Medicaid Services for research use. The database contains final action claims data submitted by inpatient hospital providers, skilled nursing facility providers, institutional outpatient providers, noninstitutional providers, and durable medical equipment suppliers for reimbursement of services. The information contained in these files includes diagnoses and procedures, dates of service, reimbursement amounts, provider information, Medicare eligibility and enrollment information, and demographic and vital status characteristics.
For the current study, REGARDS data were linked to Medicare data for 2003 through 2009. In the entire REGARDS cohort, 26,659 (88.2%) participants provided a Social Security number and consented to linkage with Medicare data. REGARDS participants were linked based on exact agreement in Social Security number and sex and an exact match on at least 2 of 3 components of date of birth. The year of birth was allowed to differ by a maximum of 1 year. Following this procedure, a total of 17,942 participants were linked to Medicare data, resulting in a linkage rate of 91.2% (16,583/18,190) for those who were 65 years or older at any time during the study period.
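The linkage rule described above can be sketched as a small predicate. This is only an illustration of one reading of the rule, not the study's actual linkage program: the `Person` record, the field names, and the interpretation that a mismatched birth year may differ by at most 1 year are assumptions.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Person:
    """Hypothetical identifying fields available in both data sources."""
    ssn: str
    sex: str
    dob: date


def is_link(regards: Person, medicare: Person) -> bool:
    """Return True if two records satisfy the linkage rule as read here."""
    # Social Security number and sex must agree exactly.
    if regards.ssn != medicare.ssn or regards.sex != medicare.sex:
        return False
    a, b = regards.dob, medicare.dob
    # At least 2 of the 3 date-of-birth components must match exactly.
    n_matching = (a.year == b.year) + (a.month == b.month) + (a.day == b.day)
    if n_matching < 2:
        return False
    # Assumed reading: if the year is the mismatched component,
    # it may differ by at most 1 year.
    return abs(a.year - b.year) <= 1


base = Person("123456789", "F", date(1940, 5, 17))
print(is_link(base, Person("123456789", "F", date(1941, 5, 17))))
```

Under this reading, a record whose birth year is off by 2 years is rejected even when month and day agree.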
Study Cohort
We restricted the study population in the linked data set to participants with at least 1 month of eligibility in fee-for-service Medicare Part A and Part B after REGARDS enrollment (therefore excluding Medicare Advantage enrollment periods). We constructed 2 cohorts for the study: an “any stroke” cohort and a “first stroke” cohort. The any stroke cohort included all participants who met the criteria for the study population, and was used to conduct analyses on combined incident and recurrent stroke events. The first stroke cohort was further restricted to those free of self-reported physician-diagnosed stroke in REGARDS before cohort entry to limit the analysis to incident cases (Figure 1).
Figure 1.
Flowchart of Participant Linkage and Exclusion
Claims-Based Algorithms
We assessed the validity of the following inpatient claims-based algorithms using the Medicare institutional files: (1) the acute ischemic stroke (AIS) algorithm included ICD-9-CM code 433.x1, 434.x1, or 436 in the primary discharge diagnosis; (2) the intracranial hemorrhage (ICH) algorithm included code 430 or 431 in the primary discharge diagnosis; and (3) the AIS/ICH algorithm included code 430, 431, 433.x1, 434.x1, or 436 in the primary discharge diagnosis. These algorithms using only primary discharge diagnosis codes have been reported to have high PPV,23,24 and we expected high specificity and low sensitivity from them. Codes for transient ischemic attack were not included.
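As an illustration of the 3 code sets, the following sketch classifies a single primary discharge diagnosis code. The function name and the handling of codes stored with or without a decimal point are assumptions made for the example; they do not describe the actual claims-processing programs.

```python
import re

# 433.x1 and 434.x1 (fifth digit 1) plus 436; the decimal point is removed first.
_AIS_PATTERN = re.compile(r"^(433\d1|434\d1|436)$")
# 430 and 431.
_ICH_PATTERN = re.compile(r"^(430|431)$")


def classify_primary_diagnosis(code: str) -> set:
    """Return the algorithms ("AIS", "ICH", "AIS/ICH") matched by one code."""
    # Assumption: codes may appear as "434.91" or "43491".
    normalized = code.strip().replace(".", "")
    labels = set()
    if _AIS_PATTERN.match(normalized):
        labels.add("AIS")
    if _ICH_PATTERN.match(normalized):
        labels.add("ICH")
    if labels:
        # The combined algorithm is the union of the two code sets.
        labels.add("AIS/ICH")
    return labels


for example in ["434.91", "43301", "433.10", "431", "435.9"]:
    print(example, sorted(classify_primary_diagnosis(example)))
```

Codes such as 433.10 (fifth digit 0) and 435.9 (transient ischemic attack) match none of the algorithms.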
Gold Standard
Strokes adjudicated through medical record review were used as the gold standard. As part of the REGARDS cohort follow-up, participants or their proxies were contacted by telephone every 6 months for an interview that included questions about all hospitalizations. When stroke-related events (ie, new stroke symptoms or hospitalizations or ambulatory evaluations for stroke or transient ischemic attack) were reported during the interviews, medical records were pursued for review. Because some events were not captured through these self-reports during the interview, we conducted additional medical record retrievals to enhance the completeness and accuracy of the gold standard. Among participants with no previously self-reported events, we identified those with stroke claims in the Medicare database using the AIS/ICH algorithm and pursued their medical records.
After being reviewed by a stroke nurse for exclusion of obvious non-cases, all retrieved medical records were reviewed by a committee of stroke experts. Stroke was defined according to the World Health Organization definition as “rapidly developing clinical signs of focal, at times global, disturbance of cerebral function, lasting more than 24 hours or leading to death with no apparent cause other than that of vascular origin.”25 When events did not meet this definition but had symptoms lasting less than 24 hours with neuroimaging consistent with acute ischemia or hemorrhage, they were classified as “clinical strokes.” Strokes were further classified as ischemic or hemorrhagic. Both World Health Organization–defined and “clinical” stroke cases were included as the gold standard. The event status was considered missing if the medical chart could not be retrieved for adjudication.
Statistical Analysis
Baseline characteristics and age at cohort entry were summarized using the data collected in the REGARDS Study. For the calculation of the validity measures, each participant in the cohort contributed 1 suspected event (ie, a self-report or a claim). Those with multiple suspected events contributed the earliest of them. Because the proportion of unretrieved medical charts (ie, those of unknown gold standard status) differed between participants with and without claims, we calculated the validity measures in a stratified manner to avoid bias from informative missingness. All participants were categorized into 1 of the following 4 strata based on how the participant's suspected stroke events were identified: (A) claim/self-report: participants with suspected events identified by both claims and self-report during a follow-up telephone interview; (B) claims only: participants with suspected events identified by Medicare claims but not self-reported; (C) self-report only: participants with suspected events self-reported but not identified by claims; and (D) no events: participants without suspected stroke events in either claims data or self-report.
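The 4 strata can be expressed as a simple mapping from two indicators per participant. The sketch below is illustrative only; the function name, the boolean inputs, and the toy participant list are assumptions, not study data.

```python
from collections import Counter


def assign_stratum(has_claim: bool, has_self_report: bool) -> str:
    """Map a participant's suspected-event sources to stratum A, B, C, or D."""
    if has_claim and has_self_report:
        return "A"  # claim/self-report
    if has_claim:
        return "B"  # claims only
    if has_self_report:
        return "C"  # self-report only
    return "D"      # no events


# Hypothetical participants as (has_claim, has_self_report) pairs.
participants = [(True, True), (True, False), (False, True),
                (False, False), (False, False)]
counts = Counter(assign_stratum(claim, report) for claim, report in participants)
print(dict(counts))
```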
We first calculated the proportion of suspected events adjudicated as strokes among all events adjudicated in each of the 4 strata described above. When an adjudicated stroke occurred within 30 days before or after the admission date of a claim, that claim was considered to have correctly identified the stroke (ie, a true positive). For the primary analysis, we assumed that the proportion of true strokes was the same among those with and without retrieved medical records (ie, missing completely at random within the strata defined by the presence of self-report and/or claim) and estimated the overall PPV, sensitivity, specificity, and negative predictive value (NPV) under this assumption, applying stratum-specific imputed means (the percentage of suspected events adjudicated as strokes among those with retrieved records in the stratum) to those without medical records. We calculated 95% CIs by normal approximation. We also assessed whether these validity measures varied among subgroups of participants defined by categories of age at cohort entry, sex, and race. Among those who had an event identified by the AIS/ICH algorithm, we compared the baseline characteristics of the participants with and without retrieved medical records.
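The stratified imputation can be illustrated with a short calculation. The sketch below uses the counts reported in Table 2 for the AIS/ICH algorithm in the any stroke cohort; the function names and the choice of denominators for the normal-approximation CIs are assumptions, and the code is not the SAS program used for the study.

```python
from math import sqrt

# Table 2, AIS/ICH algorithm, any stroke cohort:
# (adjudicated stroke, adjudicated non-stroke, chart not retrieved)
STRATA = {
    "claim_report": (150, 17, 18),
    "claim_only": (34, 3, 60),
    "report_only": (153, 1447, 148),
}
NO_EVENT = 13059  # assumed to contain no strokes in the primary analysis


def imputed_strokes(stroke, no_stroke, missing):
    """Adjudicated strokes plus the stratum-specific mean applied to missing charts."""
    return stroke + missing * stroke / (stroke + no_stroke)


def wald_ci(p, n, z=1.96):
    """Normal-approximation CI; using n as the denominator is an assumption."""
    half_width = z * sqrt(p * (1 - p) / n)
    return max(0.0, p - half_width), min(1.0, p + half_width)


def validity(strata, no_event):
    totals = {name: sum(counts) for name, counts in strata.items()}
    true = {name: imputed_strokes(*counts) for name, counts in strata.items()}
    flagged = totals["claim_report"] + totals["claim_only"]  # claim present
    tp = true["claim_report"] + true["claim_only"]
    fp = flagged - tp
    fn = true["report_only"]  # strokes with no qualifying claim
    tn = totals["report_only"] - fn + no_event
    return {
        "ppv": (tp / flagged, flagged),
        "sensitivity": (tp / (tp + fn), tp + fn),
        "specificity": (tn / (tn + fp), tn + fp),
        "npv": (tn / (tn + fn), tn + fn),
    }


results = validity(STRATA, NO_EVENT)
for name, (estimate, n) in results.items():
    low, high = wald_ci(estimate, n)
    print(f"{name}: {100 * estimate:.1f} ({100 * low:.1f}-{100 * high:.1f})")
```

With these inputs the point estimates round to the values reported for the AIS/ICH algorithm in the any stroke cohort in Table 4.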
Sensitivity Analysis
To assess how dependent the estimates of the validity measures were on the assumed proportion of true strokes among the events of unknown status, we estimated the PPV, sensitivity, and specificity of the AIS/ICH algorithm in the any stroke cohort under different assumed proportions of true strokes.
While the optimal method for evaluating the algorithms would be to test them against the gold standard obtained in the whole group or a random sample regardless of self-report or claims, we were not able to do so because of limited resources for the medical chart review. In the primary analysis, we assumed that there were no unidentified cases in the “no event” stratum. We assessed the impact of this assumption in the second sensitivity analysis by estimating the PPV, sensitivity, and specificity of the AIS/ICH algorithm under different assumed prevalences of uncaptured true strokes in the stratum.
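Both sensitivity analyses amount to recomputing the measures while varying one assumption at a time. The following sketch does this with the Table 2 counts for the AIS/ICH algorithm in the any stroke cohort; the function name, parameter names, and the grids of assumed values are hypothetical choices for illustration and are not the values plotted in the figures.

```python
def measures(p_true_claim_only=34 / 37, no_event_prevalence=0.0):
    """PPV, sensitivity, and specificity under altered assumptions.

    p_true_claim_only: assumed proportion of true strokes among the 60
        unretrieved charts in the claims-only stratum (default: observed 34/37).
    no_event_prevalence: assumed proportion of uncaptured strokes among the
        13,059 participants in the no-event stratum (default: 0).
    """
    flagged = 185 + 97                      # participants with a qualifying claim
    tp = (150 + 18 * 150 / 167) + (34 + 60 * p_true_claim_only)
    fp = flagged - tp
    fn = (153 + 148 * 153 / 1600) + 13059 * no_event_prevalence
    tn = (1748 + 13059) - fn                # everyone without a claim and without stroke
    return {
        "ppv": tp / flagged,
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


# First analysis: vary the assumed true-stroke proportion among missing charts.
for p in (0.5, 0.7, 34 / 37, 1.0):
    print("claims-only proportion", round(p, 2), measures(p_true_claim_only=p))

# Second analysis: vary the assumed prevalence of uncaptured strokes.
for prevalence in (0.0, 0.005, 0.01):
    print("no-event prevalence", prevalence, measures(no_event_prevalence=prevalence))
```

In this sketch the PPV does not depend on the no-event prevalence at all, because participants in that stratum have no claims, whereas sensitivity falls as the assumed number of uncaptured strokes grows.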
The institutional review boards of the Duke University Health System and the University of Alabama at Birmingham approved the study. We used SAS version 9.3 (SAS Institute, North Carolina) for all analyses.
Results
Among the 17,942 participants linked to Medicare claims data, 15,089 had at least 1 month of eligibility for fee-for-service Medicare Parts A and B during follow-up and were eligible for the any stroke cohort. In total, 422 of the participants had at least one stroke during follow-up. After restriction to participants without a baseline report of stroke or transient ischemic attack, the first stroke cohort consisted of 13,096 participants.
In the any stroke cohort, the mean age was 69.3 years, and more than 87% of participants were 65 years or older. The cohort had approximately equal numbers of men and women, and 37% of participants were black (Table 1). Except for the presence of prior stroke or prior transient ischemic attack at baseline, participants in the first stroke cohort had similar demographic and comorbidity profiles, with slightly lower proportions of comorbid conditions.
Table 1.
Baseline Characteristics of the Study Population
| Characteristic | Any Stroke Cohort (n=15,089) | First Stroke Cohort (n=13,096) |
|---|---|---|
| Age, mean(SD), y* | 69.3 (7.2) | 69.1 (7.0) |
| Age group, No.(%) | ||
| <65y | 1915 (12.7) | 1551 (11.8) |
| 65-74y | 9713 (64.4) | 8702 (66.5) |
| ≥75y | 3461 (22.9) | 2843 (21.7) |
| Female, No.(%) | 7804 (51.7) | 6782 (51.8) |
| Race, No.(%) | ||
| Black | 5574 (36.9) | 4689 (35.8) |
| White | 9515 (63.1) | 8407 (64.2) |
| Body mass index, No.(%) | ||
| <18.5 | 146 (1.0) | 121 (0.9) |
| 18.5-25.0 | 3640 (24.3) | 3146 (24.2) |
| 25.0-30.0 | 5701 (38.1) | 4975 (38.3) |
| 30.0-35.0 | 3245 (21.7) | 2849 (21.9) |
| 35.0-40.0 | 1412 (9.4) | 1195 (9.2) |
| ≥40.0 | 832 (5.6) | 718 (5.5) |
| Smoking status, No.(%)† | ||
| Never a smoker | 6376 (42.4) | 5627 (43.2) |
| Past smoker | 6687 (44.5) | 5795 (44.5) |
| Current smoker | 1965 (13.1) | 1615 (12.4) |
| Comorbid conditions, No.(%)† | ||
| Atrial fibrillation | 1370 (9.2) | 1083 (8.4) |
| Deep vein thrombosis | 950 (6.3) | 763 (5.8) |
| Diabetes mellitus# | 3534 (24.3) | 2860 (22.6) |
| Dialysis | 93 (0.6) | 72 (0.6) |
| Dyslipidemia∥ | 9136 (62.8) | 7809 (61.9) |
| Heart disease‡ | 3323 (22.4) | 2623 (20.4) |
| Hypertension** | 9598 (63.8) | 8031 (61.5) |
| Kidney failure | 349 (2.3) | 264 (2.0) |
| Malignancy | 1759 (16.9) | 1512 (16.8) |
| Regular aspirin use, No.(%)† | 7397 (49.0) | 6149 (47.0) |
| History of cerebrovascular disease, No.(%)† | ||
| Prior stroke | 1181 (7.9) | 0 |
| Prior transient ischemic attack | 678 (4.9) | 0 |
*At the time of enrollment into the cohort.
†Self-reported at baseline.
‡Self-reported angioplasty, coronary artery bypass graft, myocardial infarction, or stenting or evidence of myocardial infarction via electrocardiogram.
∥Total cholesterol ≥240 mg/dL, low-density lipoprotein ≥160 mg/dL, high-density lipoprotein ≤40 mg/dL, or reported medication use.
#Fasting glucose ≥126 mg/dL, non-fasting glucose ≥200 mg/dL, or reported medication use.
**Systolic blood pressure ≥140 mm Hg, diastolic blood pressure ≥90 mm Hg, or reported medication use.
Among participants with no previously self-reported events, we identified an additional 120 events among 97 participants using the AIS/ICH algorithm. Medical records for 65 of the 120 events were successfully retrieved, and 48 of them were adjudicated as strokes. Among the remaining 17, 7 cases (11%), which were mostly hospitalizations for asymptomatic carotid procedures, were triaged by the stroke nurse; 1 case (2%) was adjudicated as transient ischemic attack; and 9 cases (14%) were adjudicated as non-strokes. In Table 2, we present the numbers of adjudicated strokes and non-strokes after medical record review, as well as the number of unretrieved (missing) medical records in each category of claims/self-report for the 3 algorithms, for the first suspected event of each participant. For example, for the AIS algorithm, there were 161 participants whose first suspected events were identified via both claim and self-report. Among them, 143 medical records (88.8%) were retrieved, and 126 of them were adjudicated as strokes, resulting in a PPV of 88% among the adjudicated events. We were unable to retrieve the medical records for the remaining 18 participants (11.2% missing). The proportion of cases adjudicated as stroke among those with retrieved medical charts was similar across the 3 algorithms but differed by stratum. Approximately 90% of events were adjudicated as strokes in the claim/self-report and claims-only strata, whereas 10% or fewer were adjudicated as strokes in the self-report only stratum (Table 2). For the AIS and AIS/ICH algorithms, the proportion of suspected events with unretrieved medical charts was about 10% in the claim/self-report and self-report only strata and approximately 60% in the claims-only stratum. Using the AIS/ICH algorithm, 282 participants had a suspected stroke event identified by claims (Table 2).
Among these 282 participants, those with unretrieved medical charts were on average younger and included substantially higher proportions of black participants, current smokers, and participants with deep vein thrombosis or diabetes mellitus than those whose medical charts were retrieved (Table 3).
Table 2.
Adjudication Results for Each Algorithm
| Algorithm and Strata | Total No. | Stroke (A), No. (%) | No Stroke (B), No. (%) | Missing, No. (%) | Proportion of True Cases Among Those Adjudicated, A/(A+B) |
|---|---|---|---|---|---|
| Any stroke cohort | |||||
| AIS | |||||
| Claim/report | 161 | 126 (78.3) | 17 (10.6) | 18 (11.2) | 0.88 |
| Claim-only | 83 | 26 (31.3) | 3 (3.6) | 54 (65.1) | 0.90 |
| Report-only | 1773 | 140 (7.9) | 1485 (83.8) | 148 (8.3) | 0.09 |
| No event | 13,072 | 0 | 13,072 (100.0) | 0 | — |
| Total | 15,089 | 292 (1.9) | 14,577 (96.6) | 220 (1.5) | — |
| ICH | |||||
| Claim/report | 26 | 23 (88.5) | 3 (11.5) | 0 | 0.88 |
| Claim-only | 15 | 8 (53.3) | 1 (6.7) | 6 (40.0) | 0.89 |
| Report-only | 1925 | 16 (0.8) | 1733 (90.0) | 176 (9.1) | 0.01 |
| No event | 13,123 | 0 | 13,123 (100.0) | 0 | — |
| Total | 15,089 | 47 (0.3) | 14,860 (98.5) | 182 (1.2) | — |
| AIS/ICH | |||||
| Claim/report | 185 | 150 (81.2) | 17 (9.2) | 18 (9.7) | 0.90 |
| Claim-only | 97 | 34 (35.1) | 3 (3.1) | 60 (61.9) | 0.92 |
| Report-only | 1748 | 153 (8.8) | 1447 (82.8) | 148 (8.5) | 0.10 |
| No event | 13,059 | 0 | 13,059 (100.0) | 0 | — |
| Total | 15,089 | 337 (2.2) | 14,526 (96.3) | 226 (1.5) | — |
| First-stroke cohort | |||||
| AIS | |||||
| Claim/report | 110 | 89 (80.9) | 8 (7.2) | 13 (11.8) | 0.92 |
| Claim-only | 49 | 17 (34.7) | 2 (4.1) | 30 (61.2) | 0.89 |
| Report-only | 1307 | 94 (7.2) | 1106 (84.6) | 107 (8.2) | 0.08 |
| No event | 11,630 | 0 | 11,630 (100.0) | 0 | — |
| Total | 13,096 | 200 (1.5) | 12,746 (97.3) | 150 (1.1) | — |
| ICH | |||||
| Claim/report | 22 | 19 (86.4) | 3 (13.6) | 0 | 0.86 |
| Claim-only | 7 | 4 (57.1) | 1 (14.3) | 2 (28.6) | 0.80 |
| Report-only | 1406 | 15 (1.2) | 1265 (90.0) | 126 (9.0) | 0.01 |
| No event | 11,661 | 0 | 11,661 (100.0) | 0 | — |
| Total | 13,096 | 38 (0.3) | 12,930 (98.7) | 128 (1.0) | — |
| AIS/ICH | |||||
| Claim/report | 130 | 109 (83.8) | 8 (6.2) | 13 (10.0) | 0.93 |
| Claim-only | 55 | 21 (38.2) | 2 (3.6) | 32 (58.2) | 0.91 |
| Report-only | 1287 | 107 (8.3) | 1073 (83.4) | 107 (8.3) | 0.09 |
| No event | 11,624 | 0 | 11,624 (100.0) | 0 | — |
| Total | 13,096 | 237 (1.8) | 12,707 (97.0) | 152 (1.2) | — |
Abbreviations: AIS, acute ischemic stroke; ICH, intracranial hemorrhage.
Table 3.
Differences in the Baseline Characteristics of Participants With and Without Medical Record Retrieval among Those With Claims
| Characteristic | Medical Charts Retrieved (n = 204), % | Medical Charts Not Retrieved (n = 78), % |
|---|---|---|
| Age group, y* | ||
| < 65 | 7.8 | 9.0 |
| 65-74 | 43.1 | 60.3 |
| ≥ 75 | 49.0 | 30.8 |
| Female | 44.6 | 55.1 |
| Race | ||
| Black | 40.7 | 66.7 |
| White | 59.3 | 33.3 |
| Body mass index | ||
| < 18.5 | 2.5 | 2.6 |
| 18.5-25.0 | 34.3 | 28.2 |
| 25.0-30.0 | 35.8 | 33.3 |
| 30.0-35.0 | 18.1 | 20.5 |
| 35.0-40.0 | 7.4 | 11.5 |
| ≥ 40.0 | 2.0 | 3.9 |
| Smoking status† | ||
| Never a smoker | 36.7 | 40.7 |
| Past smoker | 48.5 | 37.2 |
| Current smoker | 10.8 | 23.1 |
| Comorbid conditions† | ||
| Atrial fibrillation | 13.4 | 15.8 |
| Deep vein thrombosis | 9.1 | 16.9 |
| Diabetes mellitus | 14.3 | 27.4 |
| Dialysis | 1.5 | 0.0 |
| Dyslipidemia | 55.0 | 57.7 |
| Heart disease‡ | 32.8 | 42.1 |
| Kidney failure | 3.5 | 3.9 |
| Malignancy | 20.8 | 13.4 |
| Medications† | ||
| Regular aspirin use | 47.5 | 50.0 |
| History of cerebrovascular disease | ||
| Prior stroke | 23.0 | 27.3 |
| Prior transient ischemic attack | 10.8 | 10.7 |
*At the time of enrollment into the cohort.
†Self-reported at baseline.
‡Self-reported angioplasty, coronary artery bypass graft, myocardial infarction, or stenting or evidence of myocardial infarction via electrocardiogram.
The 3 algorithms had high specificity and NPV (Table 4). The PPVs of the algorithms were also high, ranging from 85% to 93%. Sensitivities were lower, ranging from 58.6% to 67.4% in the any stroke cohort and from 58.6% to 59.9% in the first stroke cohort. Differences in age, race, and sex had limited influence on the specificity and NPVs, whereas the estimated PPV and sensitivity varied more by these demographic characteristics; however, the 95% CIs were wide and overlapped among most of the subgroups (Tables 5 and 6).
Table 4.
Accuracy of ICD-9-CM Codes in the Primary Diagnosis Position
| Cohort | Algorithm | PPV (95% CI) | Sensitivity (95% CI) | Specificity (95% CI) | NPV (95% CI) |
|---|---|---|---|---|---|
| Any stroke cohort | AIS | 88.6 (84.7-92.6) | 58.6 (53.6-63.6) | 99.8 (99.7-99.9) | 99.0 (98.8-99.1) |
| | ICH | 88.6 (78.9-98.3) | 67.4 (54.8-79.9) | 100.0 (99.9-100.0) | 99.9 (99.8-99.9) |
| | AIS/ICH | 90.5 (87.1-94.0) | 60.4 (55.8-65.1) | 99.8 (99.6-99.9) | 98.9 (98.7-99.0) |
| First stroke cohort | AIS | 91.1 (86.6-95.5) | 58.6 (52.4-64.7) | 99.9 (99.8-100) | 99.2 (99.1-99.4) |
| | ICH | 84.8 (71.8-97.9) | 59.9 (44.9-74.9) | 100.0 (99.9-100.0) | 99.9 (99.8-99.9) |
| | AIS/ICH | 92.6 (88.8-96.4) | 59.5 (53.8-65.1) | 99.9 (99.8-100) | 99.1 (98.9-99.3) |
Abbreviations: AIS, acute ischemic stroke; ICD-9-CM, International Classification of Diseases, Ninth Revision, Clinical Modification; ICH, intracranial hemorrhage; NPV, negative predictive value; PPV, positive predictive value.
Table 5.
Accuracy Measures Stratified by Age, Sex, and Race in Any Stroke Cohort
| Algorithm | Category | No. | PPV (95% CI), % | Sensitivity (95% CI), % | Specificity (95% CI), % | NPV (95% CI), % |
|---|---|---|---|---|---|---|
| AIS | Age | | | | | |
| | < 65 y | 1915 | 85.4 (70.6-100) | 44.2 (29.3-59.1) | 99.8 (99.6-100) | 98.7 (98.2-99.2) |
| | 65-74 y | 9713 | 90.7 (85.3-96.0) | 56.0 (48.8-63.2) | 99.9 (99.8-100) | 99.2 (99.0-99.3) |
| | ≥ 75 y | 3461 | 87.8 (81.6-93.9) | 66.4 (58.7-74.1) | 99.6 (99.4-99.8) | 98.6 (98.2-99.0) |
| | Sex | | | | | |
| | Female | 7804 | 92.0 (87.1-96.9) | 62.6 (55.4-69.8) | 99.9 (99.8-100) | 99.2 (98.9-99.4) |
| | Male | 7285 | 85.8 (87.1-96.9) | 55.2 (48.3-62.2) | 99.7 (99.6-99.9) | 98.8 (98.5-99.0) |
| | Race | | | | | |
| | Black | 5574 | 93.0 (88.3-97.6) | 66.1 (58.8-73.3) | 99.8 (99.7-100) | 99.0 (98.7-99.2) |
| | White | 9515 | 86.5 (80.6-92.5) | 53.3 (46.5-60.1) | 99.8 (99.7-99.9) | 99.0 (98.8-99.2) |
| ICH | Age | | | | | |
| | < 65 y | 1915 | — | — | — | — |
| | 65-74 y | 9713 | 85.4 (71.3-99.5) | 67.3 (50.7-84.0) | 100.0 (99.9-100) | 99.9 (99.8-100) |
| | ≥ 75 y | 3461 | 93.8 (81.0-100) | 69.4 (49.9-88.8) | 100.0 (99.9-100) | 99.8 (99.7-100) |
| | Sex | | | | | |
| | Female | 7804 | 86.7 (71.0-100) | 77.8 (59.6-96.0) | 100.0 (99.9-100) | 99.9 (99.9-100) |
| | Male | 7285 | 91.3 (79.8-100) | 61.6 (45.3-78.0) | 100.0 (99.9-100) | 99.8 (99.7-99.9) |
| | Race | | | | | |
| | Black | 5574 | 81.4 (64.4-98.5) | 82.5 (65.7-99.2) | 99.9 (99.9-100) | 99.9 (99.9-100) |
| | White | 9515 | 95.2 (86.1-100) | 59.1 (42.5-75.6) | 100.0 (100-100) | 99.9 (99.8-99.9) |
| AIS/ICH | Age | | | | | |
| | < 65 y | 1915 | 86.1 (71.9-100) | 45.5 (30.7-60.3) | 99.8 (99.6-100) | 98.7 (98.2-99.2) |
| | 65-74 y | 9713 | 92.2 (87.7-96.8) | 57.9 (51.3-64.5) | 99.9 (99.8-100) | 99.1 (98.9-99.2) |
| | ≥ 75 y | 3461 | 90.1 (84.9-95.4) | 67.9 (60.7-75.0) | 99.6 (99.4-99.8) | 98.4 (98.0-98.8) |
| | Sex | | | | | |
| | Female | 7804 | 93.1 (88.8-97.4) | 64.2 (57.5-71.0) | 99.9 (99.8-100) | 99.1 (98.9-99.3) |
| | Male | 7285 | 88.3 (83.1-93.5) | 57.3 (50.9-63.7) | 99.8 (99.6-99.9) | 98.6 (98.4-98.9) |
| | Race | | | | | |
| | Black | 5574 | 93.1 (88.9-97.4) | 68.8 (62.1-75.5) | 99.8 (99.7-99.9) | 99.0 (98.7-99.2) |
| | White | 9515 | 89.3 (84.3-94.3) | 54.6 (48.3-60.9) | 99.8 (99.7-99.9) | 98.8 (98.6-99.1) |
Abbreviations: AIS, acute ischemic stroke; ICH, intracranial hemorrhage; NPV, negative predictive value; PPV, positive predictive value.
Table 6.
Accuracy Measures Stratified by Age, Sex, and Race in First Stroke Cohort
| Algorithm | Category | No. | PPV (95% CI), % | Sensitivity (95% CI), % | Specificity (95% CI), % | NPV (95% CI), % |
|---|---|---|---|---|---|---|
| AIS | Age | | | | | |
| | < 65 y | 1551 | — * | — * | — * | — * |
| | 65-74 y | 8702 | 96.9 (93.1-100) | 56.3 (47.9-64.6) | 100.0 (99.9-100) | 99.3 (99.1-99.5) |
| | ≥ 75 y | 2843 | 89.4 (82.3-96.6) | 66.8 (57.3-76.3) | 99.7 (99.5-99.9) | 98.9 (98.5-99.3) |
| | Sex | | | | | |
| | Female | 6782 | 95.7 (91.3-100) | 63.4 (54.9-71.9) | 99.9 (99.9-100) | 99.3 (99.1-99.5) |
| | Male | 6314 | 87.2 (79.8-94.6) | 54.1 (45.4-62.8) | 99.8 (99.7-99.9) | 99.1 (98.8-99.3) |
| | Race | | | | | |
| | Black | 4689 | 93.2 (87.4-99.0) | 62.7 (53.5-71.9) | 99.9 (99.8-100) | 99.1 (98.9-99.4) |
| | White | 8407 | 90.6 (84.5-96.8) | 55.8 (47.6-64.0) | 99.9 (99.8-100) | 99.2 (99.1-99.4) |
| ICH | Age | | | | | |
| | < 65 y | 1551 | — * | — * | — * | — * |
| | 65-74 y | 8702 | 76.7 (55.3-98.1) | 56.6 (35.1-78.2) | 100.0 (99.9-100) | 99.9 (99.8-100) |
| | ≥ 75 y | 2843 | 92.3 (77.8-100) | 64.6 (42.9-86.4) | 100.0 (99.9-100) | 99.8 (99.6-99.9) |
| | Sex | | | | | |
| | Female | 6782 | 79.2 (56.2-100) | 74.2 (50.3-98.2) | 100.0 (99.9-100) | 100.0 (99.9-100) |
| | Male | 6314 | 88.2 (72.9-100) | 53.3 (34.8-71.7) | 100.0 (99.9-100) | 99.8 (99.7-99.9) |
| | Race | | | | | |
| | Black | 4689 | 72.2 (46.9-97.6) | 71.5 (46.0-96.9) | 99.9 (99.9-100) | 99.9 (99.8-100) |
| | White | 8407 | 94.1 (82.9-100) | 55.6 (37.6-73.8) | 100.0 (100-100) | 99.8 (99.8-99.9) |
| AIS/ICH | Age | | | | | |
| | < 65 y | 1551 | — * | — * | — * | — * |
| | 65-74 y | 8702 | 96.2 (92.3-100) | 56.4 (48.6-64.2) | 100.0 (99.9-100) | 99.2 (99.0-99.4) |
| | ≥ 75 y | 2843 | 92.4 (86.7-98.1) | 67.4 (58.8-76.0) | 99.8 (99.6-99.9) | 98.7 (98.2-99.1) |
| | Sex | | | | | |
| | Female | 6782 | 96.3 (92.4-100) | 64.6 (56.5-72.6) | 99.9 (99.9-100) | 99.3 (99.1-99.5) |
| | Male | 6314 | 89.9 (83.8-96.0) | 55.2 (47.3-63.1) | 99.8 (99.7-99.9) | 98.9 (98.6-99.2) |
| | Race | | | | | |
| | Black | 4689 | 92.8 (87.2-98.4) | 64.3 (55.6-72.9) | 99.9 (99.8-100) | 99.1 (98.8-99.4) |
| | White | 8407 | 93.3 (88.5-98.1) | 56.5 (49.0-63.9) | 99.9 (99.9-100) | 99.1 (98.9-99.3) |
Abbreviations: AIS, acute ischemic stroke; ICH, intracranial hemorrhage; NPV, negative predictive value; PPV, positive predictive value.
*Calculations could not be completed as some cells had zero observations.
Figure 2 shows the results of the sensitivity analysis in the any stroke cohort, in which we varied the proportion of true cases assumed among the cases of unknown status in the claims-only stratum. Sensitivity and specificity were robust to changes in this assumption, whereas PPV changed slightly. Because the proportion of suspected events with unretrieved medical records was relatively small in strata A and C (approximately 10%), changing the assumed true stroke frequency among them had a much smaller impact (data not shown). Altering the assumed frequency of unidentified true strokes in the no-event stratum did not affect PPV and had a minimal effect on specificity (Figure 3). In contrast, sensitivity was meaningfully influenced by changes in this assumption.
Figure 2.
Sensitivity of the Validity Measures to Assumed Proportions of True Cases in Participants whose Charts were not Retrieved in Claims-Only stratum
Figure 3.
Change in the Validity Measures After Altering the Prevalence of True Cases Unidentified in Participants in the No-Event Stratum
Discussion
Using the adjudicated stroke cases in the REGARDS Study as the gold standard in a linked REGARDS-Medicare claims database, we calculated the sensitivity, specificity, PPV, and NPV for 3 claims-based algorithms to capture incident and recurrent stroke among contemporary Medicare beneficiaries. The algorithms, which used ICD-9-CM codes in the primary position of discharge diagnosis, identified strokes with very high specificity and NPVs, high PPVs, but lower sensitivity. Differences in age, sex, and race had limited influence on specificity and NPVs, but sensitivity and PPVs varied by these factors, although most of the CIs overlapped among subgroups.
The most recent validation study of claims-based stroke algorithms in the Medicare population, aside from one in a specific population of kidney transplant recipients,13 was conducted among hospitalized atrial fibrillation patients in 1998 and 1999.8 The study reported a PPV of 96% and sensitivity of 35% for more inclusive algorithms including 437 and 438 codes. As the authors explained, the inclusion of prior (prevalent) strokes in the gold standard most likely contributed to the low sensitivity and reduces the applicability of these measures to etiological studies, where the aim is to capture incident rather than prevalent strokes. The PPVs in our study were more comparable to the PPV of 88% among Tennessee Medicaid enrollees from 1999 to 200323 and to the PPVs of 99% for AIS, 89% for ICH, and 94% for subarachnoid hemorrhage (SAH) among Seattle residents from 1990 to 1996.24
Compared to some disease-specific populations, our base population is more similar to the general US population, arising from a population-based community-dwelling sample that constitutes the study cohort of REGARDS. While the oversampling from the stroke belt resulting in high stroke prevalence in our cohort will influence the generalizability of PPV to those populations with lower or higher stroke prevalence, the sensitivity and specificity estimates should be directly applicable to other populations with varying prevalence. Also, our gold standard events included approximately 10% of cases not resulting in hospitalizations, diagnosed only in outpatient settings. Inclusion of these cases led to lower sensitivity compared to some previously reported numbers,24 but again our estimate would be more relevant for most etiological studies conducted using claims databases, where participants are community dwellers at risk for non-admitted strokes.
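The dependence of PPV on prevalence, with sensitivity and specificity held fixed, follows from Bayes' theorem and can be illustrated numerically. In the sketch below, the sensitivity and specificity are the rounded AIS/ICH estimates for the any stroke cohort; the prevalence values other than 422/15,089 are hypothetical and chosen only to show the direction of the effect.

```python
def ppv_from_prevalence(sensitivity: float, specificity: float,
                        prevalence: float) -> float:
    """Positive predictive value implied by Bayes' theorem."""
    true_positive_rate = sensitivity * prevalence
    false_positive_rate = (1 - specificity) * (1 - prevalence)
    return true_positive_rate / (true_positive_rate + false_positive_rate)


SENSITIVITY = 0.604  # rounded estimate, AIS/ICH, any stroke cohort
SPECIFICITY = 0.998  # rounded estimate, AIS/ICH, any stroke cohort

# 422 / 15,089 is the stroke frequency in the any stroke cohort;
# 0.01 and 0.05 are hypothetical alternatives.
for prevalence in (0.01, 422 / 15089, 0.05):
    value = ppv_from_prevalence(SENSITIVITY, SPECIFICITY, prevalence)
    print(f"prevalence {prevalence:.3f}: PPV {100 * value:.1f}%")
```

Because the inputs are rounded, the PPV computed at the cohort's own stroke frequency is close to, but not identical to, the reported estimate.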
We conducted sensitivity analyses to estimate the impact of the assumed stroke frequency among the suspected events with unretrievable medical records and observed limited impact on the validity measures, especially specificity. The original medical record retrieval in REGARDS was triggered by the participant's self-report of a potential event. For this validation study, we retrieved additional charts for those with suspected strokes identified by inpatient diagnoses in Medicare data. The retrieval was less complete for events identified by Medicare claims only (43% missing) than for those with self-reports (10% missing). One of the major reasons for nonretrieval was that hospitals requested an updated medical record release form from the participants, which could not always be obtained. This was particularly a problem for the additional chart retrieval for events identified by Medicare claims only, as some of these events had occurred more than a year before the additional medical record retrieval was pursued for the current study. This informative missingness was accounted for, in part, by the stratified imputation of the missing gold standard stroke status.
To maximize the number of additionally captured strokes and obtain stable estimates for the validity measures under resource restrictions, we pursued additional medical records only for events identified by the restrictive AIS/ICH algorithm. As a result, we have no reliable estimate of the true stroke rate among events that were identified only by more inclusive algorithms and were not accompanied by a self-report. Such algorithms include those using secondary discharge diagnoses, outpatient diagnoses, or ICD-9-CM codes for transient ischemic attack (435), other and ill-defined cerebrovascular disease (437), or late effects of cerebrovascular disease (438). We are thus unable to report the validity of these more expansive algorithms, which are expected to have greater sensitivity at the cost of specificity. In the Online Data Supplement, we present probable ranges of validity estimates for two such algorithms based on our data. These suggest that including secondary-position discharge diagnoses for the same codes as the AIS/ICH algorithm will most likely increase sensitivity by several percentage points while lowering specificity by 0.1 to 0.3 percentage points, and that this algorithm can be an alternative for identifying strokes as outcomes in etiologic studies (Supplemental Table 2 and Supplemental Figure 1). On the other hand, the algorithm using all 430-438 codes in primary and secondary discharge diagnoses could reach sensitivity as high as 80%, at the cost of substantially reduced specificity and PPV (Supplemental Table 3 and Supplemental Figure 2). These algorithms should be directly validated in future studies.
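The difference between the restrictive and the more inclusive algorithms comes down to which codes, in which diagnosis positions, count as a stroke. The sketch below shows one way to express that in code. It assumes diagnoses are available as ICD-9-CM strings with or without the decimal point; the function names and the input layout are hypothetical and are not taken from Medicare file specifications.

```python
import re

# AIS/ICH code list from this study: 430, 431, 433.x1, 434.x1, 436.
# The fifth digit "1" on 433.x and 434.x denotes "with cerebral infarction".
AIS_ICH = re.compile(r"430|431|433\d1|434\d1|436")
# Broader family discussed above: any code in categories 430 through 438
ALL_CEREBROVASCULAR = re.compile(r"43[0-8]\d{0,2}")


def normalize(code: str) -> str:
    """Strip whitespace and the decimal point, e.g. '434.91' -> '43491'."""
    return code.strip().replace(".", "")


def is_ais_ich(primary_dx: str) -> bool:
    """Restrictive algorithm: AIS/ICH code in the primary discharge position."""
    return AIS_ICH.fullmatch(normalize(primary_dx)) is not None


def is_any_cerebrovascular(diagnoses) -> bool:
    """Inclusive algorithm: any 430-438 code in any listed position."""
    return any(ALL_CEREBROVASCULAR.fullmatch(normalize(d)) for d in diagnoses)


if __name__ == "__main__":
    # Hypothetical discharge record: primary diagnosis first, then secondary
    record = ["434.91", "401.9", "250.00"]
    print(is_ais_ich(record[0]), is_any_cerebrovascular(record))
```

A record with 433.10 (occlusion without infarction) or 435.9 (transient ischemic attack) as the primary diagnosis is rejected by the restrictive function but accepted by the inclusive one, which is the source of the expected gain in sensitivity and loss in specificity.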
Our study has several limitations. First, our primary analysis assumed that failure to retrieve medical charts depended only on the event identification process. Other variables, observed or unobserved, that are associated with both retrieval failure and the risk of stroke could have biased our estimates. However, the results of the sensitivity analyses suggest that the impact of such bias would be limited. Second, not all participants older than 65 years in the REGARDS cohort were linked to Medicare claims data. The proportion of black participants was 13 percentage points higher in the non-linked population, suggesting that declined linkage and incomplete linkage variables may have been systematically more common in this subpopulation (Supplemental Table 1). Third, in constructing the first stroke cohort, we excluded patients with a prior history of stroke or events suggestive of stroke using data from the REGARDS study only. Identification of prior stroke may therefore be incomplete, although the resulting misclassification is likely small. Lastly, because we handled the missing data with single imputed means, the reported CIs are inevitably somewhat too narrow, although not substantially so.
Conclusion
In the REGARDS-Medicare claims linked dataset of participants sampled from the general US population, claims-based algorithms using primary discharge diagnoses captured true stroke events among Medicare enrollees with high PPV and high specificity. This finding supports the validity of relative risk estimates derived in etiologic or comparative effectiveness studies with stroke as the outcome, under the assumption of nondifferential misclassification, and supports the use of these algorithms to identify stroke cohorts. Because of their low sensitivity, however, these algorithms are of limited use for accurately estimating population-level incidence rates of stroke or related healthcare utilization or costs. Further studies are needed to evaluate more sensitive Medicare algorithms for these purposes.
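The reasoning behind this conclusion can be illustrated numerically. Under nondifferential outcome misclassification, the observed risk in each exposure group equals sensitivity × true risk + (1 − specificity) × (1 − true risk), so with near-perfect specificity the risk ratio is nearly preserved even when sensitivity is low, while the absolute risk is not. The sketch below uses hypothetical true risks; the function names are ours.

```python
def observed_risk(true_risk: float, sensitivity: float, specificity: float) -> float:
    """Expected proportion flagged by the algorithm in one exposure group."""
    return sensitivity * true_risk + (1.0 - specificity) * (1.0 - true_risk)


def observed_risk_ratio(risk_exposed: float, risk_unexposed: float,
                        sensitivity: float, specificity: float) -> float:
    """Risk ratio when both groups share the same sensitivity and specificity."""
    return (observed_risk(risk_exposed, sensitivity, specificity)
            / observed_risk(risk_unexposed, sensitivity, specificity))


# Hypothetical true risks giving a true risk ratio of 2.0
RISK_EXPOSED, RISK_UNEXPOSED = 0.04, 0.02

if __name__ == "__main__":
    scenarios = [
        ("perfect specificity, low sensitivity", 0.595, 1.000),
        ("AIS/ICH estimates from this study", 0.595, 0.998),
        ("hypothetical inclusive algorithm", 0.800, 0.950),
    ]
    for label, sens, spec in scenarios:
        rr = observed_risk_ratio(RISK_EXPOSED, RISK_UNEXPOSED, sens, spec)
        risk = observed_risk(RISK_UNEXPOSED, sens, spec)
        print(f"{label}: observed RR {rr:.2f}, observed unexposed risk {risk:.4f}")
```

In this illustration the low-sensitivity, high-specificity scenario keeps the observed risk ratio close to the true value while underestimating the absolute risk, whereas the lower-specificity scenario pulls the ratio toward the null.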
Supplementary Material
Acknowledgments
The authors thank the investigators, the staff, and the participants of the REGARDS study for their valuable contributions. A full list of participating REGARDS investigators and institutions can be found at http://www.regardsstudy.org.
Funding/Support
This project was sponsored by the Agency for Healthcare Research and Quality (AHRQ), US Department of Health and Human Services, Rockville, Maryland, as part of the Cardiovascular Consortium and funded under contract number HHSA290201000007 – Task Order 2, as part of the Research Consortia for Comparative Effectiveness Studies in Cardiovascular Disease. The REGARDS project is supported by cooperative agreement U01NS041588 from the National Institute of Neurological Disorders and Stroke (NINDS), National Institutes of Health (NIH), Department of Health and Human Services. Dr Kumamaru was supported by the Program in Pharmacoepidemiology at Harvard School of Public Health through training grants from Pfizer, Millennium Pharmaceuticals, and ASISA, and by Honjo International Scholarship Foundation. Dr Setoguchi was supported by midcareer development award K02HS017731 from AHRQ. Dr Curtis was supported by grant R01HS018517 from AHRQ and grant K23AR053351 from National Institute of Arthritis and Musculoskeletal and Skin Diseases.
Footnotes
Disclaimer
The content is solely the responsibility of the authors and does not necessarily represent the official views of the funding agencies. Representatives of the funding agencies were involved in the review of the manuscript but not in collection, management, analysis, or interpretation of the data.
Disclosures
Dr Setoguchi reported receiving research support from Johnson & Johnson.
References
- 1. Jollis JG, Ancukiewicz M, DeLong ER, Pryor DB, Muhlbaier LH, Mark DB. Discordance of databases designed for claims payment versus clinical information systems. Implications for outcomes research. Annals of internal medicine. 1993;119:844–850. doi: 10.7326/0003-4819-119-8-199310150-00011.
- 2. Goldstein LB. Accuracy of icd-9-cm coding for the identification of patients with acute ischemic stroke: Effect of modifier codes. Stroke; a journal of cerebral circulation. 1998;29:1602–1604. doi: 10.1161/01.str.29.8.1602.
- 3. Derby CA, Lapane KL, Feldman HA, Carleton RA. Trends in validated cases of fatal and nonfatal stroke, stroke classification, and risk factors in southeastern new england, 1980 to 1991: Data from the pawtucket heart health program. Stroke; a journal of cerebral circulation. 2000;31:875–881. doi: 10.1161/01.str.31.4.875.
- 4. Tirschwell D, Kukull WA, Longstreth WT., Jr Inaccuracy of the icd-9-cm in identifying the diagnosis of ischemic cerebrovascular disease. Neurology. 1998;51:921. doi: 10.1212/wnl.51.3.921-a. author reply 922.
- 5. Piriyawat P, Smajsova M, Smith MA, Pallegar S, Al-Wabil A, Garcia NM, Risser JM, Moye LA, Morgenstern LB. Comparison of active and passive surveillance for cerebrovascular disease: The brain attack surveillance in corpus christi (basic) project. American journal of epidemiology. 2002;156:1062–1069. doi: 10.1093/aje/kwf152.
- 6. Heckbert SR, Kooperberg C, Safford MM, Psaty BM, Hsia J, McTiernan A, Gaziano JM, Frishman WH, Curb JD. Comparison of self-report, hospital discharge codes, and adjudication of cardiovascular events in the women's health initiative. American journal of epidemiology. 2004;160:1152–1158. doi: 10.1093/aje/kwh314.
- 7. Benesch C, Witter DM, Jr., Wilder AL, Duncan PW, Samsa GP, Matchar DB. Inaccuracy of the international classification of diseases (icd-9-cm) in identifying the diagnosis of ischemic cerebrovascular disease. Neurology. 1997;49:660–664. doi: 10.1212/wnl.49.3.660.
- 8. Birman-Deych E, Waterman AD, Yan Y, Nilasena DS, Radford MJ, Gage BF. Accuracy of icd-9-cm codes for identifying cardiovascular and stroke risk factors. Medical care. 2005;43:480–485. doi: 10.1097/01.mlr.0000160417.39497.a9.
- 9. Broderick J, Brott T, Kothari R, Miller R, Khoury J, Pancioli A, Gebel J, Mills D, Minneci L, Shukla R. The greater cincinnati/northern kentucky stroke study: Preliminary first-ever and total incidence rates of stroke among blacks. Stroke; a journal of cerebral circulation. 1998;29:415–421. doi: 10.1161/01.str.29.2.415.
- 10. Wahl PM, Rodgers K, Schneeweiss S, Gage BF, Butler J, Wilmer C, Nash M, Esper G, Gitlin N, Osborn N, Short LJ, Bohn RL. Validation of claims-based diagnostic and procedure codes for cardiovascular and gastrointestinal serious adverse events in a commercially-insured population. Pharmacoepidemiology and drug safety. 2010;19:596–603. doi: 10.1002/pds.1924.
- 11. Humphries KH, Rankin JM, Carere RG, Buller CE, Kiely FM, Spinelli JJ. Co-morbidity data in outcomes research: Are clinical data derived from administrative databases a reliable alternative to chart review? Journal of clinical epidemiology. 2000;53:343–349. doi: 10.1016/s0895-4356(99)00188-2.
- 12. Arnason T, Wells PS, van Walraven C, Forster AJ. Accuracy of coding for possible warfarin complications in hospital discharge abstracts. Thrombosis research. 2006;118:253–262. doi: 10.1016/j.thromres.2005.06.015.
- 13. Lentine KL, Schnitzler MA, Abbott KC, Bramesfeld K, Buchanan PM, Brennan DC. Sensitivity of billing claims for cardiovascular disease events among kidney transplant recipients. Clinical journal of the American Society of Nephrology : CJASN. 2009;4:1213–1221. doi: 10.2215/CJN.00670109.
- 14. Easton JD, Saver JL, Albers GW, Alberts MJ, Chaturvedi S, Feldmann E, et al. Definition and evaluation of transient ischemic attack. Stroke. 2009;40:2276–2293. doi: 10.1161/STROKEAHA.108.192218.
- 15. Albers GW, Caplan LR, Easton JD, Fayad PB, Mohr JP, Saver JL, et al. Transient ischemic attack--proposal for a new definition. N Engl J Med. 2002;347:1713–1716. doi: 10.1056/NEJMsb020987.
- 16. ICD-9-CM Coordination and Maintenance Committee Meeting; 2004 April 1-2; http://www.cdc.gov/nchs/data/icd9/agendaapril04%20revised.pdf. Accessed June 18, 2013.
- 17. Hsieh CC. The effect of non-differential outcome misclassification on estimates of the attributable and prevented fraction. Statistics in medicine. 1991;10:361–373. doi: 10.1002/sim.4780100308.
- 18. Setoguchi S, Solomon DH, Glynn RJ, Cook EF, Levin R, Schneeweiss S. Agreement of diagnosis and its date for hematologic malignancies and solid tumors between medicare claims and cancer registry data. Cancer causes & control : CCC. 2007;18:561–569. doi: 10.1007/s10552-007-0131-1.
- 19. White E. The effect of misclassification of disease status in follow-up studies: Implications for selecting disease classification criteria. American journal of epidemiology. 1986;124:816–825. doi: 10.1093/oxfordjournals.aje.a114458.
- 20. Howard VJ, Cushman M, Pulley L, Gomez CR, Go RC, Prineas RJ, Graham A, Moy CS, Howard G. The reasons for geographic and racial differences in stroke study: Objectives and design. Neuroepidemiology. 2005;25:135–143. doi: 10.1159/000086678.
- 21. Warren JL, Klabunde CN, Schrag D, Bach PB, Riley GF. Overview of the seer-medicare data: Content, research applications, and generalizability to the united states elderly population. Medical care. 2002;40:IV–3-18. doi: 10.1097/01.MLR.0000020942.47004.03.
- 22. Centers for Medicare & Medicaid Services, Medicare & Medicaid Statistical Supplement 2011 Edition. http://www.cms.gov/Research-Statistics-Data-and-Systems/Statistics-Trends-and-Reports/MedicareMedicaidStatSupp/2011.html. Accessed March 6, 2013.
- 23. Roumie CL, Mitchel E, Gideon PS, Varas-Lorenzo C, Castellsague J, Griffin MR. Validation of icd-9 codes with a high positive predictive value for incident strokes resulting in hospitalization using medicaid health data. Pharmacoepidemiology and drug safety. 2008;17:20–26. doi: 10.1002/pds.1518.
- 24. Tirschwell DL, Longstreth WT., Jr Validating administrative data in stroke research. Stroke; a journal of cerebral circulation. 2002;33:2465–2470. doi: 10.1161/01.str.0000032240.28636.bd.
- 25. Stroke--1989 Recommendations on stroke prevention, diagnosis, and therapy. Report of the WHO Task Force on Stroke and other Cerebrovascular Disorders. Stroke. 1989;20:1407–1431. doi: 10.1161/01.str.20.10.1407.