Short abstract
Introduction
Administrative hospital diagnostic coding data are increasingly being used in identifying incident and prevalent stroke cases, for outcome audit and for ‘big data’ research. Validity of administrative coding has varied in previous studies, but little is known about the temporal trends of coding accuracy, which could bias analyses.
Patients and methods
Using all incident and recurrent strokes in a population-based cohort (Oxford Vascular Study/OXVASC) with multiple sources of ascertainment as the reference, we determined the temporal trends in sensitivity and positive predictive value of hospital diagnostic codes for identifying acute stroke from 2002 to 2017.
Results
Of 1883 hospitalised strokes, 1341 (71.2%) were correctly identified by coding. Sensitivity of coding improved over time for all strokes (ptrend = 0.005) and for incident cases (ptrend = 0.002). Of 1995 apparent stroke admissions identified by International Classification of Diseases-10 stroke codes (I60–I68), 1588 (79.6%) used the stroke-specific codes (I60–I61/I63–I64). Positive predictive value was higher with the use of specific codes (83.2% vs. 69.2% for all codes) and highest if combined with the first admission only (88.5%), particularly during more recent time periods (2014–2017 = 90.3%). Of 2254 OXVASC incident strokes, 833 (37.0%) were not hospitalised. Sensitivity of coding increased over time for non-disabling stroke (ptrend = 0.001), but not for disabling/fatal stroke (ptrend = 0.40).
Conclusions
Although accuracy of hospital diagnostic coding for identifying acute strokes improved over the last 15 years, residual insensitivity supports linkage to other sources in large epidemiological studies. Moreover, differences in the time trends of coding sensitivity in relation to stroke severity might bias studies of trends in stroke outcome if only administrative coding is used.
Keywords: Diagnostic coding, population-based cohort, stroke, epidemiology, trend, sensitivity, positive predictive value
Introduction
Stroke is the second leading cause of death and the main cause of long-term neurological disability in adults.1 Routinely collected administrative hospital diagnostic coding data are inexpensive and widely available in electronic format and are therefore increasingly being used for quality audit and for ‘big data’ research, including studies of time trends in stroke incidence and outcome.2–4 However, in many countries hospital diagnostic coding is not designed for research purposes, is often done by non-clinical clerical staff, and largely depends on their interpreting medical notes and applying appropriate codes.3,5 Therefore, accuracy of hospital coding can be inadequate, and the limited sensitivity and positive predictive value (PPV) of coding in identifying acute stroke cases could bias the results of some studies, depending on the research question.5 Moreover, changes in coding quality over time could bias the results of clinical studies focusing on temporal trends.
Validation of administrative coding data has been done in several studies, but results were inconsistent, with a wide range of sensitivity (42–96%) and PPV (6–97%) reported.3,6 Most studies examined codes from the International Classification of Diseases (ICD) 8th and 9th revisions, and the majority were done before the year 2000.3,6 Since then, healthcare systems in European countries have switched to the ICD 10th revision. Advances in neuroimaging, particularly the increasing availability of MRI for suspected stroke, might also have improved coding reliability over time, as might changes in reimbursement incentives. One French study suggested that PPVs of hospital diagnostic codes in identifying acute stroke improved significantly in 2004–2008.7 We recently showed that sensitivity of coding is still poor in identifying inpatient acute strokes complicating procedures or other diseases,8 but in-hospital stroke accounts for only a small proportion of acute strokes and little is known about the temporal trends of coding accuracy for identifying all acute stroke cases. In light of the widespread use of coding data in stroke research in the UK and the absence of any change in reimbursement policy for hospital coding for stroke,2,3 we studied data from a prospective population-based stroke incidence study (Oxford Vascular Study) that included multiple sources of ascertainment to determine the temporal trends of sensitivity and PPV of hospital administrative diagnostic codes for identifying acute stroke in 2002–2017.
Methods
The Oxford Vascular Study (OXVASC) is an ongoing population-based study of the incidence and outcome of all acute vascular events.9 The study population comprises all 92,728 individuals, irrespective of age, registered with about 100 general practitioners (GPs) in nine general practices in Oxfordshire, UK.
The study methods have been reported elsewhere. Briefly, multiple overlapping methods of ‘hot’ and ‘cold’ pursuit were used to achieve near complete ascertainment of all individuals with transient ischaemic attack (TIA) or stroke.9 These included: (1) a daily, rapid access ‘TIA and stroke clinic’ to which participating GPs and the local emergency department refer individuals with suspected TIA or minor stroke; (2) daily searches of admissions to the medical, stroke, neurology and other relevant wards, including screening of all patients undergoing elective or emergency coronary, carotid or peripheral vascular investigations or interventions; (3) daily searches of the local emergency department attendance register; (4) daily searches of in-hospital death records via the bereavement office; (5) monthly searches of all death certificates and coroner’s reports for out of hospital deaths; (6) monthly searches of all brain and vascular imaging referrals; (7) monthly searches of GP diagnostic coding and hospital electronic record discharge codes, using a wide preselected range of ICD-10 codes (I60–I68; G45–G46; H34) occurring at any diagnostic position. All potential cases identified by coding were subsequently reviewed by study clinicians for adjudication. Stroke was defined as rapid-onset symptoms and/or signs of focal, and at times global, loss of cerebral function, with symptoms lasting more than 24 h or leading to death, with no apparent cause other than of vascular origin.9 TIAs (i.e. events lasting for less than 24 h) with acute infarct detected on brain imaging were not included as strokes.
Patients with suspected stroke were seen by study physicians as soon as possible after the initial presentation. Baseline demographic data, vascular risk factors and other comorbidities were collected from face-to-face interview and cross-referenced with primary care records. Detailed clinical history was recorded in all patients and assessments were made for stroke severity using the National Institutes of Health Stroke Scale (NIHSS). Major stroke was defined as NIHSS ≥ 5. Patients routinely had brain imaging (CT or MR), vascular imaging (Carotid Doppler or CT-angiography/MR-angiography or digital subtraction angiography), 12-lead electrocardiography (ECG) and standard blood tests. ECG, 24-h ECG (HOLTER) and five-day ambulatory ECG monitoring were done when clinically indicated. If a patient died before assessment, we obtained an eyewitness account of the clinical event and reviewed any relevant records. All cases were reviewed by the senior study neurologist (PMR) for final adjudication, and reasons for exclusion were recorded.
All patients were followed up face to face at 1, 6, 12, 60 and 120 months by a study nurse or physician to determine recurrent strokes. For patients who had moved out of the study area, telephone follow-up was done. All patients were flagged for the Office for National Statistics mortality data and all deaths during follow-up were recorded with causes. All recurrent strokes that presented to medical attention were also identified by the ongoing daily case ascertainment. If a recurrent stroke was suspected, the patient was re-assessed and investigated by a study physician.
Statistical analyses
To calculate sensitivity of hospital coding in identifying acute stroke episodes, we used all strokes ascertained and adjudicated in OXVASC in 2002–2017 with multiple sources as the reference standard. Cases identified by stroke-specific codes (I60–I61, I63–I64) as the primary diagnosis were considered as correctly identified cases by coding. We also reported the number of ‘false-negative’ cases that could be identified by coding if other codes or diagnostic positions were used.
To calculate PPVs, we compared all the stroke admissions of the study population identified by coding only with all adjudicated OXVASC stroke ascertained in the same period (2002–2017). Although as part of the ‘cold pursuit’ methods in OXVASC, clinical adjudication was performed for all potential cases identified from hospital discharge coding using ICD-10 codes I60–I68, G45–G46 and H34 at any diagnostic position, for the purpose of the current study, and particularly to avoid overestimation of ‘false-positive’ cases, only cases identified by ICD-10 codes I60–I68 as the primary diagnosis were considered as coding-identified stroke admissions.
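The two accuracy measures defined above reduce to simple proportions. The following is a minimal sketch, not the study's own analysis code (the analyses were run in SPSS); the function names are illustrative, and the counts are taken from the Results (1341 of 1883 hospitalised strokes identified by coding) and from Table 2 (1221 true positives among 1379 admissions using stroke-specific codes and first admission only).

```python
def sensitivity(true_positives: int, false_negatives: int) -> float:
    """Proportion of reference-standard strokes that coding identified."""
    return true_positives / (true_positives + false_negatives)


def positive_predictive_value(true_positives: int, false_positives: int) -> float:
    """Proportion of coding-identified admissions that were adjudicated strokes."""
    return true_positives / (true_positives + false_positives)


# Counts reported in the Results: 1341 of 1883 hospitalised strokes were coded.
hospitalised_sensitivity = sensitivity(1341, 1883 - 1341)

# Table 2, stroke-specific codes and first admission only: 1221 TP of 1379.
specific_first_ppv = positive_predictive_value(1221, 1379 - 1221)

print(f"Sensitivity: {hospitalised_sensitivity:.1%}")  # Sensitivity: 71.2%
print(f"PPV: {specific_first_ppv:.1%}")                # PPV: 88.5%
```

Note that the denominators differ: sensitivity is taken over the adjudicated reference cases, whereas PPV is taken over the admissions that coding flagged.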
We assessed the time trends of coding accuracy (sensitivity and PPVs) in identifying acute stroke cases in five consecutive 3-year periods (2002–2005, 2005–2008, 2008–2011, 2011–2014, 2014–2017) using Chi-square test for trend.
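As an illustration of the trend test, the sketch below implements a Cochran-Armitage-type chi-square test for linear trend with one degree of freedom. It assumes equally spaced period scores (1 to 5); the exact variant used by SPSS in the study is not stated in the text, so small numerical differences from the reported values are possible. The input counts are the 'All hospitalised cases' column for all stroke cases in Table 1.

```python
import math
from typing import Sequence, Tuple


def chi_square_trend(events: Sequence[int], totals: Sequence[int]) -> Tuple[float, float]:
    """Chi-square test for linear trend in proportions (1 degree of freedom).

    Assumes equally spaced scores 1..k for the k ordered groups.
    Returns (chi_square_statistic, p_value).
    """
    if len(events) != len(totals) or len(events) < 2:
        raise ValueError("events and totals must have the same length (>= 2)")
    scores = range(1, len(events) + 1)
    n = sum(totals)
    p_bar = sum(events) / n  # pooled proportion under the null hypothesis
    # Score-weighted sum of (observed - expected) events per group.
    t_stat = sum(s * (r - m * p_bar) for s, r, m in zip(scores, events, totals))
    sum_s = sum(s * m for s, m in zip(scores, totals))
    sum_s2 = sum(s * s * m for s, m in zip(scores, totals))
    variance = p_bar * (1 - p_bar) * (sum_s2 - sum_s ** 2 / n)
    chi2 = t_stat ** 2 / variance
    # Upper tail of a chi-square distribution with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Table 1, all stroke cases, all hospitalised cases (coding identified / total).
identified = [256, 232, 278, 278, 297]
hospitalised = [359, 358, 402, 380, 384]

chi2, p_trend = chi_square_trend(identified, hospitalised)
print(f"chi-square = {chi2:.2f}, p for trend = {p_trend:.4f}")
```

With these inputs the sketch gives a p value close to the reported ptrend of 0.005.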
For change of coding sensitivity over time, analyses were performed for all hospitalised stroke cases, all hospitalised incident stroke cases and for all incident strokes, with further stratification by age, stroke severity (non-disabling vs. disabling and fatal) and stroke subtype (ischaemic stroke, intracerebral haemorrhage and subarachnoid haemorrhage).
For change of PPVs over the study periods, given the lack of consensus about which codes should be used for identifying stroke outcomes, analyses were stratified by searching strategy. Strategies differed in the code set used, either the full non-specific ICD-10 codes for possible stroke (I60–I68) or the stroke-specific codes alone (I60–I61, I63–I64), and in the admissions selected: including all admissions, excluding stroke admissions within 28 days of a known previous stroke admission, or including only the first admission. The non-specific codes outside the stroke-specific set were I62 (subdural haemorrhage, nontraumatic extradural haemorrhage, and unspecified intracranial haemorrhage), I65 (occlusion and stenosis of precerebral arteries, not resulting in cerebral infarction), I66 (occlusion and stenosis of cerebral arteries, not resulting in cerebral infarction), I67 (other cerebrovascular diseases) and I68 (cerebrovascular disorders in diseases classified elsewhere). Analyses stratified by age and stroke subtype were also performed.
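The searching strategies can be sketched as filters over a list of coded admissions. Everything below is hypothetical illustration: the `Admission` record, the function names and the example data are not from the study, and two details are assumptions, namely that the 28-day window is measured from the patient's most recent previous coded stroke admission and that 'first admission' means the earliest admission per patient within the selected code set.

```python
from dataclasses import dataclass
from datetime import date
from typing import Dict, Iterable, List, Set

ALL_STROKE_CODES = {f"I{n}" for n in range(60, 69)}  # I60-I68
SPECIFIC_STROKE_CODES = {"I60", "I61", "I63", "I64"}


@dataclass(frozen=True)
class Admission:  # hypothetical record layout
    patient_id: str
    admitted: date
    primary_code: str  # ICD-10 code in the primary diagnostic position


def select_by_codes(admissions: Iterable[Admission], codes: Set[str]) -> List[Admission]:
    """Keep admissions whose primary 3-character ICD-10 category is in `codes`."""
    kept = [a for a in admissions if a.primary_code[:3] in codes]
    return sorted(kept, key=lambda a: (a.patient_id, a.admitted))


def exclude_readmissions(admissions: Iterable[Admission], window_days: int = 28) -> List[Admission]:
    """Drop admissions within `window_days` of the patient's previous admission."""
    kept: List[Admission] = []
    last_seen: Dict[str, date] = {}
    for a in sorted(admissions, key=lambda a: (a.patient_id, a.admitted)):
        previous = last_seen.get(a.patient_id)
        if previous is None or (a.admitted - previous).days > window_days:
            kept.append(a)
        last_seen[a.patient_id] = a.admitted
    return kept


def first_admission_only(admissions: Iterable[Admission]) -> List[Admission]:
    """Keep only the earliest admission for each patient."""
    first: Dict[str, Admission] = {}
    for a in sorted(admissions, key=lambda a: a.admitted):
        first.setdefault(a.patient_id, a)
    return sorted(first.values(), key=lambda a: a.patient_id)


# Made-up example data for three patients.
example = [
    Admission("A", date(2010, 1, 1), "I63"),
    Admission("A", date(2010, 1, 20), "I63"),  # re-admission after 19 days
    Admission("A", date(2011, 3, 1), "I64"),
    Admission("B", date(2012, 5, 5), "I67"),   # non-specific code
    Admission("B", date(2012, 9, 1), "I61"),
    Admission("C", date(2013, 1, 1), "G45"),   # TIA code, outside I60-I68
]

full = select_by_codes(example, ALL_STROKE_CODES)
specific = select_by_codes(example, SPECIFIC_STROKE_CODES)
specific_no_readmit = exclude_readmissions(specific)
specific_first = first_admission_only(specific)
print(len(full), len(specific), len(specific_no_readmit), len(specific_first))  # 5 4 3 2
```

Each resulting list would then be compared against the adjudicated reference cases to obtain the PPV for that strategy.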
All analyses were performed using SPSS version 22.
Standard protocol approvals, registrations, and patient consents
Written informed consent from patients or assent from relatives was obtained in all participants in OXVASC. OXVASC was approved by the local research ethics committee (OREC A: 05/Q1604/70).
Results
Of 3011 strokes ascertained in OXVASC from 2002 to 2017, 1883 (62.5%) were hospitalised (1647 admitted for the acute stroke, and 236 already in the hospital for other diseases at the time of stroke). A further 1044 (34.7%) cases were managed in outpatient clinics or in the community and 84 (2.8%) happened out of the area or abroad, and were not expected to be identified by local hospital diagnostic codes. The distributions of where patients were managed did not change over time, both for all stroke cases and for the 2254 (74.9%) incident cases (Web appendix 1).
Among all 1883 hospitalised stroke cases, 1341 (71.2%) were identified by diagnostic codes at discharge. The sensitivity of hospital coding to identify admitted stroke cases did not differ between incident and recurrent stroke cases (incident vs. recurrent: 71.1% vs. 72.1%, p = 0.68) but was much lower for stroke cases that happened during admission for other diseases (39.8% vs. 75.7% for those admitted for suspected stroke, p < 0.0001; Table 1) and was also lower in patients with posterior circulation vs. anterior circulation event (60.9% vs. 71.9%, p = 0.0002).
Table 1.
Year | All hospitalised cases (coding identified/total, %) | Admitted for suspected stroke (coding identified/total, %) | Stroke during hospital admission for other disease (coding identified/total, %) |
---|---|---|---|
All stroke cases | |||
2002–2005 | 256/359 (71.3) | 231/303 (76.2) | 25/56 (44.6) |
2005–2008 | 232/358 (64.8) | 217/315 (68.9) | 15/43 (34.9) |
2008–2011 | 278/402 (69.2) | 253/349 (72.5) | 25/53 (47.2) |
2011–2014 | 278/380 (73.2) | 266/338 (78.7) | 12/42 (28.6) |
2014–2017 | 297/384 (77.3) | 280/342 (81.9) | 17/42 (40.5) |
ptrend | 0.005 | 0.004 | 0.49 |
Total | 1341/1883 (71.2) | 1247/1647 (75.7) | 94/236 (39.8) |
Incident cases | |||
2002–2005 | 179/253 (70.8) | 169/223 (75.8) | 10/30 (33.3) |
2005–2008 | 174/276 (63.0) | 163/244 (66.8) | 11/32 (34.4) |
2008–2011 | 204/298 (68.5) | 194/268 (72.4) | 10/30 (33.3) |
2011–2014 | 223/297 (75.1) | 215/268 (80.2) | 8/29 (27.6) |
2014–2017 | 231/297 (77.8) | 216/265 (81.5) | 15/32 (46.9) |
ptrend | 0.002 | 0.002 | 0.43 |
Total | 1011/1421 (71.1) | 957/1268 (75.5) | 54/153 (35.3) |
Table 1 shows the change of coding sensitivity to identify hospitalised stroke cases over time. Overall sensitivity improved from 71.3% in 2002–2005 to 77.3% in 2014–2017 for all ascertained stroke cases (ptrend = 0.005; Table 1). However the apparent improvement in sensitivity was mainly accounted for by better coding for cases that were admitted for suspected stroke, and there was no change in coding sensitivity to identify stroke cases that happened during admission for other diseases (Table 1). Results were consistent in analyses confined to incident stroke ascertained in OXVASC (Table 1).
Reasons for coding-missed hospitalised strokes (‘false negative’) also changed over time (Web appendix 2). Although complete omission remained the most common reason (n = 287/53.0%), there were more ‘false negative’ cases with potentially relevant coding in non-primary diagnostic positions (Web appendix 2). Results were again similar for incident strokes (Web appendix 2).
Of 2254 incident stroke cases, 833 (37.0%) were not hospitalised in the local area. As shown in Figure 1, sensitivity of coding in identifying any stroke improved significantly over time, yet during the period of 2014–2017, 229 (49.6%) of all 462 incident strokes would have been missed if no additional ascertainment sources were used (Figure 1), particularly if non-disabling (n = 183/63.1%; Figure 1).
Among all incident strokes, there were 1938 ischaemic strokes, 221 intracerebral haemorrhages and 95 subarachnoid haemorrhages. Of 1011 correctly identified stroke admissions, 46 (4.5%) had wrong subtyping, which decreased over time (Web appendix 3). Consequently, increasing sensitivities of coding to identify any stroke or any hospitalised stroke over time were consistent for ischaemic and haemorrhagic strokes (Figure 2). Moreover, among all ischaemic stroke cases identified with I63 or I64, there was increasing use of I63 over time (n/%: 53/38.1% in 2002–2005 vs. 149/84.7% in 2014–2017, ptrend < 0.0001). However, hospitalisation rates were significantly lower for ischaemic than for haemorrhagic strokes (n/%: 1147/59.2% vs. 274/86.7%, p < 0.0001). Therefore, using hospital discharge codes alone would miss more ischaemic than haemorrhagic incident stroke cases (Figure 2), even during the last study period (coding-missed incident stroke in 2014–2017: 211/54.4% vs. 18/24.3%, p < 0.0001; Figure 2).
Analyses for age-specific temporal trends of coding sensitivity are presented in Web appendix 4. Sensitivity did not differ by age in identifying hospitalised stroke (p = 0.98) and the improvement over time was most marked for patients aged ≥75 years (71.6% in 2002–2005 to 82.1% in 2014–2017, ptrend = 0.003; Web appendix 4). However, as expected, hospitalisation rates for stroke increased with age (n/%: 289/54.9% at age <65 years vs. 838/67.4% at age over 75 years, p < 0.0001). Consequently, sensitivity of using coding to identify any incident stroke cases increased with age (ptrend = 0.001; Web appendix 4). The significant improvement of coding sensitivity over time was again seen at older ages but not at age <65 years (Web appendix 4), although at age ≥75 years, 41.5% of incident stroke cases during 2014–2017 would still have been missed if only hospital discharge codes were used.
There were 1995 stroke admissions identified by hospital discharge coding using the full non-specific ICD-10 codes for stroke (I60–I68), of which 1588 (79.6%) admissions used the specific codes (I60, I61, I63 and I64). PPV improved with specific codes (83.2% vs. 69.2% for all codes) and was highest if combined with the first admission (88.5%; Table 2). Despite the differences in coding accuracy by searching strategy, PPV improved significantly over time for all strategies (Table 2). Using specific codes and first admission only, the PPV for coding increased from 84.4% in 2002–2005 to 90.3% in 2014–2017 (ptrend = 0.001; Table 2). The trends were consistent for ischaemic stroke (Table 3) and intracerebral haemorrhage (Table 4), but not for subarachnoid haemorrhage (Table 4), although there were too few events to draw reliable conclusions.
Table 2.
Year | Full non-specific codes for possible stroke (I60–I68): all admissions (TP/total, %) | Full non-specific codes for possible stroke (I60–I68): excluding re-admission ≤28 days (TP/total, %) | Full non-specific codes for possible stroke (I60–I68): first admission only (TP/total, %) | Stroke-specific codes (I60, I61, I63, I64): all admissions (TP/total, %) | Stroke-specific codes (I60, I61, I63, I64): excluding re-admission ≤28 days (TP/total, %) | Stroke-specific codes (I60, I61, I63, I64): first admission only (TP/total, %) |
---|---|---|---|---|---|---|
2002–2005 | 245/390 (62.8) | 244/322 (75.8) | 236/308 (76.6) | 240/345 (69.9) | 239/287 (83.3) | 232/275 (84.4) |
2005–2008 | 245/380 (64.5) | 241/351 (68.7) | 231/326 (70.9) | 226/289 (78.2) | 224/271 (82.7) | 214/255 (83.9) |
2008–2011 | 282/394 (71.6) | 278/381 (73.0) | 255/339 (75.2) | 270/295 (91.5) | 266/289 (92.0) | 245/268 (91.4) |
2011–2014 | 306/418 (73.2) | 300/402 (74.6) | 280/359 (78.0) | 293/323 (90.7) | 288/314 (91.7) | 269/292 (92.1) |
2014–2017 | 303/413 (73.4) | 296/384 (77.1) | 271/337 (80.4) | 293/336 (87.2) | 286/318 (89.9) | 261/289 (90.3) |
ptrend | <0.0001 | 0.198 | 0.041 | <0.0001 | 0.0002 | 0.001 |
Total | 1381/1995 (69.2) | 1359/1840 (73.9) | 1273/1669 (76.3) | 1322/1588 (83.2) | 1303/1479 (88.2) | 1221/1379 (88.5) |
TP: true positive.
Table 3.
Year | Full codes (I63, I64): all admissions (TP/total, %) | Full codes (I63, I64): excluding re-admission ≤28 days (TP/total, %) | Full codes (I63, I64): first admission only (TP/total, %) | Stroke-specific codes (I63): all admissions (TP/total, %) | Stroke-specific codes (I63): excluding re-admission ≤28 days (TP/total, %) | Stroke-specific codes (I63): first admission only (TP/total, %) |
---|---|---|---|---|---|---|
2002–2005 | 186/283 (65.7) | 185/234 (79.1) | 179/226 (79.2) | 67/97 (69.1) | 66/82 (80.5) | 66/81 (81.5) |
2005–2008 | 168/226 (74.3) | 166/213 (77.9) | 158/204 (77.5) | 94/116 (81.0) | 92/110 (83.6) | 87/105 (82.9) |
2008–2011 | 209/232 (90.1) | 207/228 (90.8) | 189/210 (90.0) | 172/190 (90.5) | 170/186 (91.4) | 155/171 (90.6) |
2011–2014 | 238/262 (90.8) | 235/256 (91.8) | 218/236 (92.4) | 220/240 (91.7) | 217/234 (92.7) | 201/215 (93.5) |
2014–2017 | 221/249 (88.8) | 215/238 (90.3) | 196/216 (90.7) | 189/209 (90.4) | 183/198 (92.4) | 167/180 (92.8) |
ptrend | <0.0001 | <0.0001 | <0.0001 | <0.0001 | 0.0003 | 0.0002 |
Total | 1022/1252 (81.6) | 1008/1169 (86.2) | 940/1092 (86.1) | 742/852 (87.1) | 728/810 (89.9) | 676/752 (89.9) |
TP: true positive cases.
Table 4.
Year | Intracerebral haemorrhage (I61): all admissions (TP/total, %) | Intracerebral haemorrhage (I61): excluding re-admission ≤28 days (TP/total, %) | Intracerebral haemorrhage (I61): first admission only (TP/total, %) | Subarachnoid haemorrhage (I60): all admissions (TP/total, %) | Subarachnoid haemorrhage (I60): excluding re-admission ≤28 days (TP/total, %) | Subarachnoid haemorrhage (I60): first admission only (TP/total, %) |
---|---|---|---|---|---|---|
2002–2005 | 15/29 (51.7) | 15/28 (53.6) | 15/28 (53.6) | 17/33 (51.5) | 17/25 (68.0) | 16/21 (76.2) |
2005–2008 | 25/39 (64.1) | 25/35 (71.4) | 24/33 (72.7) | 15/24 (62.5) | 15/23 (65.2) | 15/18 (83.3) |
2008–2011 | 40/48 (83.3) | 39/46 (84.8) | 37/43 (86.0) | 11/15 (73.3) | 11/15 (73.3) | 11/15 (73.3) |
2011–2014 | 38/45 (84.4) | 36/43 (83.7) | 35/42 (83.3) | 11/16 (68.8) | 11/15 (73.3) | 10/14 (71.4) |
2014–2017 | 50/59 (84.7) | 49/53 (92.5) | 43/47 (91.5) | 15/28 (53.6) | 15/27 (55.6) | 15/26 (57.7) |
ptrend | 0.0002 | <0.0001 | 0.0001 | 0.74 | 0.47 | 0.09 |
Total | 168/220 (76.4) | 164/205 (80.0) | 154/193 (79.8) | 69/116 (59.5) | 69/105 (65.7) | 67/94 (71.3) |
TP: true positive cases.
Reasons for ‘false-positive’ cases are listed in Web appendix 5. When possible non-specific stroke codes and all admissions were used, miscoding elective admissions for rehabilitation, investigations or procedures for prior stroke as new stroke (n = 317/52.1%) was the most common reason, followed by non-stroke diagnosis (n = 249/41.0%). Reassuringly, misdiagnosing admissions for stroke rehabilitation as new strokes decreased from 28.1% in 2002–2005 to 5.5% in 2014–2017. As expected, if only stroke-specific codes and first admission were used, non-stroke diagnosis miscoded as stroke became the most common reason (n = 113/74.3%) and the reasons for ‘false-positive’ cases did not change over time (p = 0.24; Web appendix 5).
Analyses for temporal trends of PPV by age are presented in Web appendix 6. Among all 1995 stroke admissions identified by hospital discharge codes, 463 (23.2%) were of patients aged <65 years, 418 (21.0%) between 65 and 74 years and 1114 (55.8%) over 75 years. PPV for coding to identify true acute stroke cases increased with age, from 58.5% (95% confidence interval [CI] 54.0–62.9%) in those aged <65 years to 75.8% (73.2–78.2%) at age >75 years if full codes and all admissions were used (ptrend < 0.0001), and from 82.9% (78.1–86.8%) at age <65 years to 90.3% (88.1–92.2%) at age >75 years if specific codes and first admission were used (ptrend = 0.001). The overall improvement of PPV with time was seen at older ages (Web appendix 6), and the PPV was 95.0% (90.7–97.4%) in 2014–2017 when specific codes and first admissions were used. However, there was no change in PPV for patients at age <65 years, even when specific codes and first admissions were used (PPV 2014–2017: 78.1%, 95% CI 66.3–86.6%; Web appendix 6).
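For readers who want to attach a confidence interval to a PPV or sensitivity estimate, the sketch below uses the Wilson score interval. This is an assumption for illustration only: the text does not state which interval method produced the reported CIs. The example counts are the overall Table 2 figures for stroke-specific codes and first admission only (1221 true positives of 1379 admissions).

```python
import math
from typing import Tuple


def wilson_interval(successes: int, total: int, z: float = 1.959964) -> Tuple[float, float]:
    """Wilson score interval for a binomial proportion (default: 95% level)."""
    if total <= 0 or not 0 <= successes <= total:
        raise ValueError("need total > 0 and 0 <= successes <= total")
    p = successes / total
    z2 = z * z
    denominator = 1 + z2 / total
    centre = (p + z2 / (2 * total)) / denominator
    half_width = z * math.sqrt(p * (1 - p) / total + z2 / (4 * total * total)) / denominator
    return centre - half_width, centre + half_width


# Table 2 total, stroke-specific codes and first admission only.
ppv = 1221 / 1379
low, high = wilson_interval(1221, 1379)
print(f"PPV {ppv:.1%} (95% CI {low:.1%} to {high:.1%})")
```

The Wilson interval is chosen here because it stays within 0 to 1 and behaves well for proportions near 90%, where the simple normal approximation can overshoot.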
Discussion
Using all acute stroke cases ascertained in a population-based cohort as the reference standard, we showed that both sensitivity and PPVs of administrative hospital diagnostic coding of acute stroke cases improved in the last 15 years, with PPV >90% in recent years if stroke-specific codes plus first admission were used. However, despite improvement over time, hospital coding still lacked sensitivity. One fourth of hospitalised stroke cases and half of all incident strokes would have been missed if no additional ascertainment sources were used. Moreover, the time trends of improvement of coding sensitivity in non-disabling strokes but not in disabling or fatal strokes might bias interpretations of outcome audit studies if only administrative coding was used.
The overall 71.2% sensitivity of diagnostic coding in identifying hospitalised stroke in our study was similar to the estimates found in two previous French studies, which also used population-based registries as their gold standard for comparison.7,10 However, even in 2014–2017, sensitivity of coding found in our study was lower than the 90.6% reported from Sweden.11 This could be partly explained by the persistently low sensitivity (40.5% in 2014–2017) of diagnostic coding in identifying acute stroke cases that happened during admission for other diseases in our study.8 In the UK, hospital diagnostic coding is often done by non-clinical clerical staff and largely depends on their interpreting medical notes and applying appropriate codes.5 The actual reason for the acute admission is not always clear, particularly in patients with multiple comorbidities. We found increasing numbers of ‘false-negative’ cases with stroke codes identified in non-primary diagnostic positions, reflecting the change in coding practice in our local hospital, where >10 diagnostic codes are now allowed. However, although using codes in non-primary diagnostic positions may increase sensitivity, the gain is unlikely to be substantial (24 out of 88 ‘false-negative’ cases in 2014–2017) and would come at the expense of a lower PPV.12,13
We showed that coding sensitivity in identifying incident strokes increased with patient age and stroke severity and was higher in haemorrhagic than in ischaemic strokes, in accordance with previous studies.7,14,15 Moreover, these patterns remained consistent over time and were largely explained by the differences in hospitalisation rates between the subgroups, which are a potential source of selection bias in epidemiological studies that rely on hospital coding alone to identify incident stroke cases. In addition, the improvement of coding sensitivity in recent years for non-disabling strokes but not for disabling or fatal strokes might also bias studies of temporal trends in stroke outcomes if only coding data were used.
We found that PPVs of hospital diagnostic codes for acute stroke improved over time. One previous study in the US suggested that the improvement of PPVs from 1980 to 1990 in Rochester was attributable to advances in neuroimaging technology.16 In our study, use of brain imaging, particularly MRI, also increased significantly during the study period (12% in 2002–2010 vs. 46% in 2010–2014),17 which perhaps explains the increasing use over time of the more specific code I63 (ischaemic stroke) rather than I64 (unspecified stroke) for the diagnosis of ischaemic stroke. However, increasing use of MRI is not without cost, as misdiagnosing TIA as stroke was one of the common reasons for ‘false-positive’ cases in our study, which was also found in one previous study.7 Another possible explanation for the apparent improvement in PPVs is the improvement of acute stroke care in the UK during the study period, with increasing numbers of acute stroke cases being admitted and managed in the Acute Stroke Unit (data not shown).
Although there is a lack of consensus among stroke epidemiology studies about which codes should be used for identifying stroke outcomes, previous studies advocated the use of stroke-specific codes (I60–I61, I63–I64) over the use of full non-specific stroke codes (I60–I68).3,6 We also found that using this approach together with the inclusion of first admission alone could maximise the PPVs. In 2014–2017, selection of stroke-specific codes combined with inclusion of first admission resulted in a PPV >90%. However, given the lower PPVs observed at younger ages and for subarachnoid haemorrhage, these subgroups should be given priority for validation with other data sources.
Although we consider our results to be valid, the study has limitations. Firstly, our study was done in Oxfordshire and might not be representative of all hospitals in the UK. However, our estimates of PPVs were comparable to other UK studies.15,18 Secondly, as coding accuracy might differ between healthcare systems, the improvement of coding sensitivity and PPV in the UK might not be generalisable to other countries. Nevertheless, similar PPVs were reported in the last 5 years in Canada,19 Norway,20 Italy21 and South Korea.22 Thirdly, a third of stroke patients are not admitted to hospital in the UK.23,24 Therefore, sensitivity of hospital coding in identifying all incident stroke might be higher in countries with higher hospitalisation rates, although hospitalisation has also been reported to have decreased in other countries.25,26 Fourthly, as most TIA cases are managed in outpatient settings (i.e. TIA clinics) in the UK, our study was not able to quantify the change in hospital coding accuracy for TIAs. Finally, similar to previous studies,3,6 we only considered stroke-specific codes (I60–I61, I63–I64) at the primary diagnostic position as correctly identified cases by coding when calculating sensitivity. Identification of stroke cases using the full non-specific stroke codes (I60–I68) irrespective of diagnostic position is likely to increase sensitivity, albeit only by a small amount (Web appendix 2) and at the cost of lowering PPVs.
In conclusion, we showed that the accuracy of administrative hospital diagnostic coding for identifying acute strokes improved significantly in the last 15 years in Oxfordshire, UK. With appropriate selection of stroke-specific codes and inclusion of only first admission, PPVs >90% can be achieved, which would be adequate for large-scale epidemiological studies of the determinants of stroke. However, despite improvement over time, the lack of sensitivity for hospital coding does not support the use of these data alone for incidence estimates. Moreover, the stroke severity-specific difference of the time trends of coding sensitivity might bias interpretations of outcome audit studies if only administrative coding was used. Approaches to improve coding accuracy are still required and future studies should address the impact of additional linking to primary care data and other sources in large epidemiological studies.
Supplemental Material
Supplemental material, ESO881017 Supplemental Material for Temporal trends in the accuracy of hospital diagnostic coding for identifying acute stroke: A population-based study by Linxin Li, Lucy E Binney, Ramon Luengo-Fernandez, Louise E Silver, Peter M Rothwell and on behalf of the Oxford Vascular Study in European Stroke Journal
Acknowledgements
We are grateful to all the staff in the general practices that collaborated in the Oxford Vascular Study: Abingdon Surgery, Stert St, Abingdon; Malthouse Surgery, Abingdon; Marcham Road Family Health Centre, Abingdon; The Health Centre, Berinsfield; Key Medical Practice, Kidlington; 19 Beaumont St, Oxford; East Oxford Health Centre, Oxford; Church Street Practice, Wantage. This work uses data provided by patients and collected by the NHS as part of their care and support and would not have been possible without access to this data. The NIHR recognises and values the role of patient data, securely accessed and stored, both in underpinning and leading to improvements in research and care.
Contributorship
Linxin Li collected data, did the statistical analysis and interpretation, wrote and revised the manuscript.
Lucy Binney, Ramon Luengo-Fernandez and Louise Silver collected data.
Peter Rothwell conceived and designed the overall study, provided study supervision and funding, acquired, analysed and interpreted data, and wrote and revised the manuscript.
Declaration of Conflicting Interests
The author(s) declared the following potential conflicts of interest with respect to the research, authorship, and/or publication of this article: Dr. Li reports no disclosures. Dr. Silver reports no disclosures. Dr. Binney reports no disclosures. Dr. Luengo-Fernandez reports no disclosures. Dr. Rothwell reports no disclosures.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: The Oxford Vascular Study is funded by the National Institute for Health Research (NIHR) Oxford Biomedical Research Centre (BRC), Wellcome Trust, Wolfson Foundation, and British Heart Foundation. Professor Rothwell is in receipt of an NIHR Senior Investigator award. The views expressed are those of the author(s) and not necessarily those of the NHS, the NIHR or the Department of Health.
Informed consent and ethical approval
Written informed consent or assent from relatives was obtained in all participants in OXVASC. OXVASC was approved by the local research ethics committee (OREC A: 05/Q1604/70).
Guarantor
PMR
References
- 1. Strong K, Mathers C, Bonita R. Preventing stroke: saving lives around the world. Lancet Neurol 2007; 6: 182–187.
- 2. Seminog OO, Scarborough P, Wright FL, et al. Determinants of the decline in mortality from acute stroke in England: linked national database study of 795869 adults. BMJ 2019; 365: i1778.
- 3. Woodfield R, Grant I, Sudlow C. Accuracy of electronic health record data for identifying stroke cases in large-scale epidemiological studies: a systematic review from the UK Biobank Stroke Outcomes Group. PLoS One 2015; 10: e0140533.
- 4. Sacco S, Pistoia F, Carolei A. Stroke tracked by administrative coding data: is it fair? Stroke 2013; 44: 1766–1768.
- 5. Li L, Rothwell PM. Biases in detection of apparent “weekend effect” on outcome with administrative coding data: population based study of stroke. BMJ 2016; 353: i2648.
- 6. McCormick N, Bhole V, Lacaille D, et al. Validity of diagnostic codes for acute stroke in administrative databases: a systematic review. PLoS One 2015; 10: e0135834.
- 7. Aboa-Eboule C, Mengue D, Benzenine E, et al. How accurate is the reporting of stroke in hospital discharge data? A pilot validation study using a population-based stroke registry as control. J Neurol 2013; 260: 605–613.
- 8. Li L, Binney LE, Carter S, et al. Sensitivity of administrative coding in identifying inpatient acute strokes complicating procedures or other diseases in UK hospitals. J Am Heart Assoc 2019; 8: e012995.
- 9. Rothwell PM, Coull AJ, Silver LE, et al. Population-based study of event-rate, incidence, case fatality, and mortality for all acute vascular events in all arterial territories (Oxford Vascular Study). Lancet 2005; 366: 1773–1783.
- 10. Haesebaert J, Termoz A, Polazzi S, et al. Can hospital discharge databases be used to follow ischemic stroke incidence? Stroke 2013; 44: 1770–1774.
- 11. Koster M, Asplund K, Johansson A, et al. Refinement of Swedish administrative registers to monitor stroke events on the national level. Neuroepidemiology 2013; 40: 240–246.
- 12. Tirschwell DL, Longstreth WT Jr. Validating administrative data in stroke research. Stroke 2002; 33: 2465–2470.
- 13. Roumie CL, Mitchel E, Gideon PS, et al. Validation of ICD-9 codes with a high positive predictive value for incident strokes resulting in hospitalization using Medicaid health data. Pharmacoepidem Drug Safe 2008; 17: 20–26.
- 14. Oie LR, Madsbu MA, Giannadakis C, et al. Validation of intracranial hemorrhage in the Norwegian Patient Registry. Brain Behav 2018; 8: e00900.
- 15. Kivimaki M, Batty GD, Singh-Manoux A, et al. Validity of cardiovascular disease event ascertainment using linkage to UK hospital records. Epidemiology 2017; 28: 735–739.
- 16. Derby CA, Lapane KL, Feldman HA, et al. Trends in validated cases of fatal and nonfatal stroke, stroke classification, and risk factors in southeastern New England, 1980 to 1991: data from the Pawtucket Heart Health Program. Stroke 2000; 31: 875–881.
- 17. Li L, Yiin GS, Geraghty OC, et al. Incidence, outcome, risk factors, and long-term prognosis of cryptogenic transient ischaemic attack and ischaemic stroke: a population-based study. Lancet Neurol 2015; 14: 903–913.
- 18. Sinha S, Myint PK, Luben RN, et al. Accuracy of death certification and hospital record linkage for identification of incident stroke. BMC Med Res Methodol 2008; 8: 74.
- 19. Porter J, Mondor L, Kapral MK, et al. How reliable are administrative data for capturing stroke patients and their care? Cerebrovasc Dis Extra 2016; 6: 96–106.
- 20. Varmdal T, Bakken IJ, Janszky I, et al. Comparison of the validity of stroke diagnoses in a medical quality register and an administrative health register. Scand J Public Health 2016; 44: 143–149.
- 21. Baldereschi M, Balzi D, Di Fabrizio V, et al. Administrative data underestimate acute ischemic stroke events and thrombolysis treatments: data from a multicenter validation survey in Italy. PLoS One 2018; 13: e0193776.
- 22. Park TH, Choi JC. Validation of stroke and thrombolytic therapy in Korean National Health Insurance Claim Data. J Clin Neurol 2016; 12: 42–48.
- 23. Bamford J, Sandercock P, Warlow C, et al. Why are patients with acute stroke admitted to hospital? BMJ (Clin Res Ed) 1986; 292: 1369–1372.
- 24. Paul NL, Koton S, Simoni M, et al. Feasibility, safety and cost of outpatient management of acute minor ischaemic stroke: a population-based study. J Neurol Neurosurg Psych 2013; 84: 356–361.
- 25. Hallstrom B, Jonsson AC, Nerbrand C, et al. Lund Stroke Register: hospitalization pattern and yield of different screening methods for first-ever stroke. Acta Neurol Scand 2007; 115: 49–54.
- 26. Asplund K, Bonita R, Kuulasmaa K, et al. Multinational comparisons of stroke epidemiology. Evaluation of case ascertainment in the WHO MONICA Stroke Study. World Health Organization Monitoring Trends and Determinants in Cardiovascular Disease. Stroke 1995; 26: 355–360.
Supplementary Materials
Supplemental material (ESO881017) for "Temporal trends in the accuracy of hospital diagnostic coding for identifying acute stroke: A population-based study" by Linxin Li, Lucy E Binney, Ramon Luengo-Fernandez, Louise E Silver and Peter M Rothwell, on behalf of the Oxford Vascular Study, in European Stroke Journal.