Author manuscript; available in PMC: 2021 Dec 1.
Published in final edited form as: J Stroke Cerebrovasc Dis. 2020 Oct 15;29(12):105306. doi: 10.1016/j.jstrokecerebrovasdis.2020.105306

Identification of Patients with Nontraumatic Intracranial Hemorrhage using Administrative Claims Data

Rohit B Sangal 1,5, Samah Fodeh 1,5, Andrew Taylor 1,5, Craig Rothenberg 1,5, Emily B Finn 4,5, Kevin Sheth 2,5, Charles Matouk 3,5, Andrew Ulrich 1,5, Vivek Parwani 1,5, John Sather 1,5, Arjun Venkatesh 1,5
PMCID: PMC7686163  NIHMSID: NIHMS1627787  PMID: 33070110

Abstract

Introduction:

Nontraumatic intracranial hemorrhage (ICH) is a neurological emergency of research interest; however, unlike ischemic stroke, it has not been well studied in large datasets because no established administrative claims-based definition exists. We aimed to evaluate both explicit diagnosis codes and machine learning methods to create a claims-based definition for this clinical phenotype.

Methods:

We examined all patients admitted to our tertiary medical center with a primary or secondary International Classification of Diseases version 9 (ICD-9) or 10 (ICD-10) code for ICH in claims from any portion of the hospitalization in 2014-2015. As a gold standard, we defined the nontraumatic ICH phenotype based on manual chart review. We tested explicit definitions based on ICD-9 and ICD-10 that had been previously published in the literature, as well as four machine learning classifiers: support vector machine (SVM), logistic regression with LASSO, random forest and xgboost. We report five standard measures of model performance for each approach.

Results:

A total of 1830 patients with 2145 unique ICD-10 codes were included in the initial dataset, of which 437 (24%) were true positive based on manual review. The explicit ICD-10 definition performed best (Sensitivity = 0.89 (95% CI 0.85-0.92), Specificity = 0.83 (0.81-0.85), F-score = 0.73 (0.69-0.77)) and improved on an explicit ICD-9 definition (Sensitivity = 0.87 (0.83-0.90), Specificity = 0.77 (0.74-0.79), F-score = 0.67 (0.63-0.71)). Among machine learning classifiers, SVM performed best (Sensitivity = 0.78 (0.75-0.82), Specificity = 0.84 (0.81-0.87), AUC = 0.89 (0.87-0.92), F-score = 0.66 (0.62-0.69)).

Conclusions:

An explicit ICD-10 definition can be used to accurately identify patients with a nontraumatic ICH phenotype with substantially better performance than ICD-9. An explicit ICD-10 based definition is easier to implement and quantitatively not appreciably improved with the additional application of machine learning classifiers. Future research utilizing large datasets should utilize this definition to address important research gaps.

Keywords: stroke, quality, health services research, intracranial hemorrhage

Introduction

Administrative claims datasets based on data captured for the purposes of billing are commonly used for health services and outcomes research as well as public health surveillance.1-3 Although administrative claims offer the potential to study very large populations in a cost-effective manner by providing a large amount of information in a standardized format, the data often lack clinical accuracy given the vague coding of symptoms or diagnoses.3,4 While the introduction of the International Classification of Diseases version 10 (ICD-10) codes into administrative claims data sought to rectify the challenge of classification specificity, it may instead have exacerbated the problem by expanding the code set nearly 5-fold, from approximately 13,000 (ICD-9) to 68,000 (ICD-10) codes. In neurology, such claims data are the primary source of data for outcomes research and for quality measures of inpatient and outpatient care, and they are frequently applied to public health surveillance of cerebrovascular disease.1-3 Thus, it is important for clinicians to have assurance of accuracy in order to avoid overdiagnosis and under-surveillance and to target quality improvement initiatives in neurologic care appropriately, especially with respect to cerebrovascular accidents.

Most methodological research using claims-based data has focused on ischemic stroke subtypes despite the comparably higher morbidity and mortality of hemorrhagic strokes.2,5-10 Nontraumatic ICH, including subarachnoid hemorrhage and intracerebral hemorrhage, does not have a well-established claims-based definition despite comprising nearly 15% of all strokes.11 Prior work using ICD-9 and ICD-10 codes reports sensitivity ranging from 68-83% and 30-65%, respectively.1,2,5,7 Poor performance is likely related to a lack of specificity of ICD-9 codes for a clinical syndrome that is often identified on radiology images rather than as a discrete pathologic diagnosis. Furthermore, existing claims data and new electronic health record data lack standardized data to capture this syndrome, necessitating novel methods to promote future research. Such limitations may be overcome using machine learning models that accommodate thousands of codes and their interactions, which may facilitate better syndromic definition.12

Machine learning algorithms within health care have been used to analyze large amounts of heterogeneous data to predict clinical disease progression, response to therapy and new diagnoses.8,13,14 We sought to derive and compare claims-based definitions for nontraumatic ICH based on explicit definitions using ICD-9 and ICD-10 diagnosis codes as well as four commonly applied machine learning methods.

Methods

Study Setting and Dataset creation

This study examined all patients admitted to a tertiary medical center from 2014-2015 with an ICD-9 or ICD-10 code for ICH during the emergency department visit or as principal hospital discharge diagnosis (see supplement A). All diagnosis code data were obtained from the enterprise data warehouse using SQL queries that extracted the relevant raw data in comma-separated value format. The Standards for Reporting of Diagnostic Accuracy (STARD) guideline was used for the reporting of this study. This study was approved by the University Investigation Review Board.

Data Processing and Chart Review

As a gold standard, two reviewers examined every chart and classified each hospital discharge as a true positive nontraumatic ICH if chart review found no evidence of an alternative etiology. Specifically, ICH secondary to brain neoplasm or trauma was excluded because such cases are managed distinctly. This classification later served as the gold standard for building the classification models. Consistent with manual chart review research guidelines, we assessed interrater agreement for key variables by having both trained physician research team members abstract 100 random charts, and we report raw agreement and kappa statistics.15,16 The formula given by Cantor (1996) indicated that a sample of more than 70 charts would allow detection of a kappa statistic between 0.60 and 0.80 with 80% power at an alpha level of 0.05.15 The dataset was split 2:1 into training and validation sets. Because of class imbalance, we applied the synthetic minority oversampling technique (SMOTE) to the training set.17
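The 2:1 split and the oversampling step can be illustrated with a minimal sketch. This is not the study's code: the function names `stratified_split` and `smote_like_oversample`, the decision to stratify the split by label, and the default `k=5` are assumptions made for illustration. The oversampler follows the general idea of SMOTE (interpolating between a minority-class row and one of its nearest minority-class neighbours) and is applied to training rows only.

```python
import math
import random


def stratified_split(labels, train_fraction=2 / 3, seed=0):
    """Return (train_idx, valid_idx), keeping the class ratio in both parts."""
    rng = random.Random(seed)
    train_idx, valid_idx = [], []
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        cut = round(len(idx) * train_fraction)
        train_idx.extend(idx[:cut])
        valid_idx.extend(idx[cut:])
    return sorted(train_idx), sorted(valid_idx)


def smote_like_oversample(minority_rows, n_new, k=5, seed=0):
    """Create n_new synthetic rows, each on the segment between a minority
    row and one of its k nearest minority neighbours (Euclidean distance)."""
    if len(minority_rows) < 2:
        raise ValueError("need at least two minority rows to interpolate")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority_rows)
        neighbours = sorted(
            (row for row in minority_rows if row is not base),
            key=lambda row: math.dist(base, row),
        )[:k]
        partner = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> partner
        synthetic.append([b + gap * (p - b) for b, p in zip(base, partner)])
    return synthetic


# Toy labels mirroring the reported 24% prevalence (hypothetical data).
labels = [1] * 24 + [0] * 76
train_idx, valid_idx = stratified_split(labels)
```

Because interpolation produces fractional values, synthetic rows built from 0/1 code indicators are no longer strictly binary; whether that matters depends on the downstream classifier.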

Explicit Diagnosis-Based Definitions for Nontraumatic ICH:

An explicit definition was defined as a set of diagnosis codes meant solely to represent the underlying pathology; such definitions are often used in health services and outcomes research. Prior work using ICD-9 based codes had poor sensitivity,1,5,7 and no validated definition exists for ICD-10. We specified two explicit, rule-based definitions of nontraumatic ICH using ICD-9 and ICD-10. Because no prior ICD-10-based primary outcome definition has been published, we used the claims-based definition from a study examining major bleeding events, including ICH, in patients with venous thromboembolism.18 This was confirmed with project investigator review and agreement. Determining the performance of the ICD-9 and ICD-10 definitions allowed comparison with machine learning techniques.
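In practice, an explicit definition of this kind reduces to a membership rule over an encounter's diagnosis codes. The sketch below shows the shape of such a rule only. The prefix list is a hypothetical placeholder (ICD-10 categories I60, I61 and I62 cover nontraumatic subarachnoid, intracerebral and other intracranial hemorrhage); the code list actually evaluated in this study is given in its supplement and is not reproduced here. The function name `meets_explicit_definition` is likewise illustrative.

```python
# Hypothetical placeholder list, NOT the validated definition from the study.
NONTRAUMATIC_ICH_PREFIXES = ("I60", "I61", "I62")


def meets_explicit_definition(diagnosis_codes, prefixes=NONTRAUMATIC_ICH_PREFIXES):
    """Return True if any diagnosis code on the encounter starts with a listed prefix."""
    normalized = (code.strip().upper() for code in diagnosis_codes)
    return any(code.startswith(prefixes) for code in normalized)


# An encounter carrying I61.9 is flagged; one with only a traumatic code is not.
flagged = meets_explicit_definition(["I10", "I61.9"])
not_flagged = meets_explicit_definition(["S06.5X0A"])
```

A rule like this needs no training data and can be applied directly to any claims extract, which is the practical advantage of explicit definitions noted in this paper.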

Machine Learning Model Development for nontraumatic ICH

Explicit definitions defined by expert input may be unable to capture the complexity of codes and interactions necessary to define nontraumatic ICH. Previous literature has demonstrated performance improvement for claims-based definitions of clinical phenotypes utilizing machine learning techniques.19 We examined four machine learning models: support vector machine (SVM), random forest (RF), logistic regression with LASSO, and xgboost. These models were selected based on differences in their perceived interpretability, intrinsic feature selection capabilities, ability to model non-linear relationships, and prior reported performance.20,21 We additionally explored the performance benefit of upstream feature (variable) selection based on chi-square scores to filter ICD-10 codes for use in machine learning algorithms. Hyperparameter tuning and model selection were performed via grid search and evaluated through 5-fold cross-validation.
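The chi-square filtering step can be sketched as follows. The paper does not specify the exact statistic or cutoff, so this example assumes a Pearson chi-square on the 2x2 table of code presence against the gold-standard label and keeps the `top_k` highest-scoring codes; the names `chi_square_score` and `select_top_codes` are illustrative.

```python
def chi_square_score(feature, labels):
    """Pearson chi-square statistic for a binary feature against a binary label."""
    n = len(labels)
    a = sum(1 for f, y in zip(feature, labels) if f and y)       # code present, ICH
    b = sum(1 for f, y in zip(feature, labels) if f and not y)   # code present, no ICH
    c = sum(1 for f, y in zip(feature, labels) if not f and y)   # code absent, ICH
    d = n - a - b - c                                            # code absent, no ICH
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0  # constant feature or constant label carries no signal
    return n * (a * d - b * c) ** 2 / denom


def select_top_codes(code_matrix, labels, top_k):
    """code_matrix maps an ICD-10 code to a list of 0/1 indicators, one per patient."""
    scores = {code: chi_square_score(col, labels) for code, col in code_matrix.items()}
    return sorted(scores, key=scores.get, reverse=True)[:top_k]
```

To avoid leaking information from the validation set, scores of this kind would normally be computed on the training portion only.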

Outcomes

For all classifiers, sensitivity, specificity, area under the curve (AUC), F1 score and positive predictive value were computed as measures of model performance. The 95% Wilson score confidence intervals were used to assess the variability in the estimates for sensitivity, specificity and accuracy.22 All analyses were done in Python. The final model is available in the GitLab repository (https://gitlab.com/yaleemdatascience/atraumatic_ich).
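For readers who want to reproduce these measures, the sketch below computes the point estimates from a confusion matrix and a Wilson score interval for a proportion. It is an illustration under stated assumptions rather than the authors' implementation: the function names are hypothetical, and `z=1.96` is assumed for a two-sided 95% interval.

```python
import math


def wilson_interval(successes, total, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 for ~95%)."""
    if total <= 0:
        raise ValueError("total must be positive")
    p = successes / total
    denom = 1 + z**2 / total
    center = (p + z**2 / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    return center - half, center + half


def classification_metrics(tp, fp, fn, tn):
    """Point estimates from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    ppv = tp / (tp + fp)
    npv = tn / (tn + fn)
    f1 = 2 * ppv * sensitivity / (ppv + sensitivity)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": ppv,
        "npv": npv,
        "f1": f1,
    }
```

For example, the interval for sensitivity would be `wilson_interval(tp, tp + fn)` and the interval for specificity `wilson_interval(tn, tn + fp)`.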

Results

A total of 1830 patients were included, of whom 897 (49%) were female. All demographics are shown in Table 1. Patients identified as true positive nontraumatic ICH by manual chart review tended to be older, were more often Black and had higher in-hospital mortality than the remaining sample (Table 1). In total, 437 (24%) patients were true positives utilizing either ICD-9 or ICD-10 codes. Each of the two reviewers reviewed 100 (5.5%) random charts for five major variables (in-hospital mortality, anticoagulation reversal, antihypertensive medication administration, hypoxia and intubation), yielding a raw interrater agreement of 97.2% and a kappa of 0.929 (95% CI 0.88-0.98).
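Raw agreement and Cohen's kappa differ in that kappa discounts the agreement expected by chance. A minimal sketch of both calculations is shown here; the function names and the toy ratings are hypothetical and are not the study data.

```python
def raw_agreement(rater1, rater2):
    """Fraction of items on which the two raters gave the same rating."""
    return sum(a == b for a, b in zip(rater1, rater2)) / len(rater1)


def cohens_kappa(rater1, rater2):
    """Cohen's kappa: (observed - chance agreement) / (1 - chance agreement)."""
    n = len(rater1)
    observed = raw_agreement(rater1, rater2)
    categories = set(rater1) | set(rater2)
    expected = sum(
        (rater1.count(c) / n) * (rater2.count(c) / n) for c in categories
    )
    if expected == 1:
        return 1.0  # both raters used a single identical category throughout
    return (observed - expected) / (1 - expected)


# Toy ratings for one abstracted variable (hypothetical, not study data).
reviewer_a = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
reviewer_b = [1, 1, 0, 0, 0, 0, 0, 0, 0, 1]
agreement = raw_agreement(reviewer_a, reviewer_b)
kappa = cohens_kappa(reviewer_a, reviewer_b)
```

When one category dominates, raw agreement can be high even if kappa is modest, which is why both figures are reported.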

Table 1:

Patient Characteristics.

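| Characteristic | Total Sample N | Total Sample % | True Positive ICH** N | True Positive ICH** % | False Positive ICH N | False Positive ICH % | p-value |
|---|---|---|---|---|---|---|---|
| Total Sample | 1830 | | 437 | 24 | 1393 | 76 | |
| Gender (Female) | 897 | 49 | 215 | 49 | 682 | 49 | 0.970 |
| Race/ethnicity | | | | | | | |
| Black/African American | 210 | 11 | 69 | 16 | 141 | 10 | 0.002 |
| White/Caucasian | 1353 | 74 | 308 | 70 | 1045 | 75 | 0.068 |
| Asian | 28 | 2 | 10 | 2 | 18 | 1 | 0.209 |
| Other/Refused | 239 | 13 | 50 | 11 | 189 | 14 | 0.285 |
| Age in years* | 63.9 ± 17.8 | | 65.9 ± 16 | | 63.4 ± 18.3 | | 0.006 |
| Smoking Status | | | | | | | |
| Never | 665 | 36.3 | 164 | 37.5 | 501 | 36 | 0.592 |
| Previous | 682 | 37.3 | 167 | 38.2 | 515 | 37 | 0.680 |
| Other/Unknown | 250 | 13.7 | 59 | 13.5 | 191 | 13.7 | 0.975 |
| In-Hospital Mortality | 144 | 7.9 | 64 | 14.6 | 80 | 5.7 | <0.001 |

* Mean and standard deviation

** Determined by manual chart review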

2145 unique ICD-10 codes existed in the data set of which feature selection identified 220 (10.1%) relevant codes for use in the downstream machine learning models. ICD-10 codes related to nontraumatic ICH by frequency are listed in Table 2. 54% of the total sample was captured by the top ten codes (Table 2).

Table 2:

ICD-10 Diagnosis Code Frequency stratified by total sample and true positive ICH.

| Most Common 1° Code (Description) | ICD-10 code | Total Sample N | Total Sample % | Cumulative % | N of 1° code among 2° codes | % | True Positives N | True Positives % |
|---|---|---|---|---|---|---|---|---|
| Nontraumatic intracerebral hemorrhage, unspecified | I61.9 | 222 | 12.1% | 12.1% | 82 | 4.5% | 202 | 46.2% |
| Nontraumatic subarachnoid hemorrhage, unspecified | I60.9 | 180 | 9.8% | 22% | 32 | 1.7% | 138 | 31.6% |
| Traumatic subdural hemorrhage without loss of consciousness, initial encounter | S06.5X0A | 112 | 6.1% | 28.1% | 20 | 1.1% | 0 | 0% |
| Cerebral aneurysm, nonruptured | I67.1 | 103 | 5.6% | 33.7% | 42 | 2.3% | 18 | 4.1% |
| Nontraumatic subdural hemorrhage, unspecified | I62.0 | 95 | 5.2% | 38.9% | 46 | 2.5% | 49 | 11.2% |
| Cerebral infarction due to unspecified occlusion or stenosis of unspecified cerebral artery | I63.50 | 74 | 4% | 43% | 29 | 1.6% | 19 | 4.3% |
| Traumatic subdural hemorrhage with loss of consciousness of unspecified duration, initial encounter | S06.5X9A | 65 | 3.6% | 46.5% | 33 | 1.8% | 0 | 0% |
| Nontraumatic intracranial hemorrhage, unspecified | I62.9 | 47 | 2.6% | 49.1% | 15 | 0.8% | 37 | 8.5% |
| Cerebral infarction due to embolism of unspecified cerebral artery | I63.40 | 44 | 2.4% | 51.5% | 12 | 0.7% | 8 | 1.8% |
| Traumatic subarachnoid hemorrhage with loss of consciousness of unspecified duration, initial encounter | S06.6X9A | 40 | 2.2% | 53.7% | 23 | 1.3% | 0 | 0% |
| Total | | 982 | 53.7% | | 334 | | | |

Across definitions, ICD-10 performed best (Sensitivity=0.89 (95% CI 0.85-0.92), Specificity=0.83 (0.81-0.85), F-score=0.73 (0.69-0.77)) and better than ICD-9 (Sensitivity=0.87 (0.83-0.90), Specificity=0.77 (0.74-0.79), F-score=0.67 (0.63-0.71)). Among machine learning classifiers, SVM performed best, with the highest sensitivity and F1 measure. The performance of all classifiers is summarized in Table 3.

Table 3.

Classifier performance to identify nontraumatic ICH

| Measure | ICD-10 | ICD-9 | SVM | RF | LASSO | Xgboost |
|---|---|---|---|---|---|---|
| Specificity | 0.83 (0.81, 0.85) | 0.77 (0.74, 0.79) | 0.84 (0.81, 0.87) | 0.91 (0.89, 0.94) | 0.87 (0.84, 0.9) | 0.88 (0.85, 0.91) |
| Sensitivity | 0.89 (0.85, 0.92) | 0.87 (0.83, 0.90) | 0.78 (0.75, 0.82) | 0.61 (0.57, 0.65) | 0.65 (0.61, 0.69) | 0.64 (0.62, 0.68) |
| PPV | 0.63 (0.59, 0.67) | 0.54 (0.50, 0.58) | 0.56 (0.52, 0.6) | 0.64 (0.61, 0.68) | 0.57 (0.53, 0.61) | 0.58 (0.54, 0.62) |
| NPV | 0.96 (0.95, 0.97) | 0.95 (0.93, 0.96) | 0.94 (0.92, 0.96) | 0.9 (0.88, 0.92) | 0.9 (0.88, 0.93) | 0.9 (0.88, 0.93) |
| F1-Measure | 0.73 (0.69, 0.77) | 0.67 (0.63, 0.71) | 0.66 (0.62, 0.69) | 0.63 (0.59, 0.66) | 0.6 (0.57, 0.64) | 0.61 (0.58, 0.65) |
| AUC | | | 0.89 (0.87, 0.92) | 0.89 (0.86, 0.91) | 0.87 (0.84, 0.9) | 0.88 (0.85, 0.91) |

Discussion

We found that an explicit definition of nontraumatic ICH using ICD-10 codes was accurate at identifying the clinical syndrome of nontraumatic ICH in comparison to rigorous manual chart abstraction. The ICD-10 definition also outperformed an older definition based on ICD-9 and was surprisingly not outperformed by several machine learning classifiers (Table 3). Our validated definition should support many observational, comparative effectiveness and public health research efforts focused on nontraumatic ICH.

To our knowledge, this work represents the first methodologic investigation of ICD-10 based definitions for ICH. A primary strength of this research is our chart review. Administrative studies typically review a subset of charts to ensure accuracy, which raises the possibility of an unrepresentative sample; we reviewed all charts as a gold standard. While it would be impractical to chart review even larger data sets, this model was based on a robust gold standard. Our analyses extend one study by Tirschwell and colleagues (2002), which used administrative data to examine ICH and reported sensitivity, specificity and positive predictive values of 85%, 95% and 89%, respectively, justifying the use of ICD-9 codes to classify this disease.3 Not only does our study examine ICD-10 codes, but our 95% confidence intervals are also narrower than those reported in prior studies, in which wide intervals have been a persistent limitation.1,2,18 We attribute our improved confidence intervals to the increased specificity of ICD-10 codes. With respect to machine learning, research using ICD codes often uses the few codes that are most predictive and risks missing less common diagnosis codes.3 We incorporated more than two thousand codes into our model without sacrificing sensitivity or specificity.

Interestingly, machine learning techniques did not surpass our explicit ICD-10 definition. Our findings demonstrate the importance of evaluating simpler methods that may be more feasibly implemented to promote research efforts. Among the four machine learning tools we tested, SVM effectively identified patients with a nontraumatic ICH phenotype with 84% accuracy and outperformed prior literature utilizing ICD-9 codes alone.1,2,5,7,10 These machine learning models consistently approached but did not exceed the accuracy of ICD-10. This suggests that older data ontologies such as ICD-9 that lack clinical granularity may benefit from the additional use of machine learning methods, whereas more modern coding nomenclatures are capable of accurately identifying clinical phenotypes without introducing the statistical noise of advanced analytic methods. Thus, the use of explicit ICD-10 codes will result in improved specificity alone. Both ICD-10 and machine learning methods appear to be reasonable approaches for real-time surveillance in clinical research; however, the former is substantially less resource intensive and more generalizable.

Our findings must be interpreted within the context of our study design. The derivation and validation sets were derived at a single large academic hospital, and the results may not be applicable to research conducted at other institutions with substantially different diagnostic coding patterns. While the study was performed at a single center, it consisted of a robust sample with nationally accepted ICD codes, thus limiting location bias. Additionally, administrative studies that review a subset of charts to ensure accuracy also raise the possibility of an unrepresentative sample, and it would be impractical for any study using a large dataset to manually review all charts as was done herein. As more data are captured in the electronic health record, performance of machine learning algorithms may improve, which underscores the importance of re-evaluating methods as data are increasingly digitized for large-scale analysis.

Conclusion

In summary, ICD-10 diagnosis codes can be effectively used to identify patients with a nontraumatic ICH phenotype with substantially better performance than ICD-9. Despite robust model performance, machine learning classifiers did not outperform ICD-10. Researchers should apply similar validated ICD-10 definitions for public health surveillance, clinical trial planning/evaluation and observational research using large datasets.

Supplementary Material

mmc1

Highlights:

  • ICD-9 diagnosis codes for ICH have reported poor sensitivity and specificity.

  • This work represents the first formal evaluation of ICD-10 diagnosis codes.

  • ICD-10 diagnosis codes outperform ICD-9 in identifying patients with nontraumatic ICH.

  • Machine learning did not outperform ICD-10.

Acknowledgments

Thank you to Michael Yip, MD and Ross Littauer, MD for assistance in data collection and case review for this project.

Grant Support:

This project was conducted through the Yale Center for Healthcare Innovation, Redesign and Learning (CHIRAL). CHIRAL was funded by a grant (#P30HS023554-01) from the Agency for Healthcare Research and Quality.

The corresponding author received support from a Yale Center for Clinical Investigation KL2TR000140 from the National Center for Advancing Translational Science. The content is solely the responsibility of the authors and does not necessarily represent the official views of these organizations. In addition, we are grateful for the support provided to CHIRAL from Yale New Haven Hospital and the Claude D. Pepper Older Americans Independence Center at Yale School of Medicine (#P30AG021342 NIH/NIA).

Footnotes

Publisher's Disclaimer: This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.

Disclosures

No authors have any conflicts of interest with the development or dissemination of this work.

References

  • 1. Jones SA, Gottesman RF, Shahar E, Wruck L, Rosamond WD. Validity of hospital discharge diagnosis codes for stroke: The atherosclerosis risk in communities study. Stroke. 2014;45:3219–3225.
  • 2. Li L, Binney LE, Carter S, Gutnikov SA, Beebe S, Bowsher-Brown K, et al. Sensitivity of administrative coding in identifying inpatient acute strokes complicating procedures or other diseases in UK hospitals. Journal of the American Heart Association. 2019;8:e012995.
  • 3. Tirschwell DL, Longstreth WT Jr. Validating administrative data in stroke research. Stroke. 2002;33:2465–2470.
  • 4. Iezzoni LI. Using administrative diagnostic data to assess the quality of hospital care. Pitfalls and potential of ICD-9-CM. International journal of technology assessment in health care. 1990;6:272–281.
  • 5. Benesch C, Witter DM Jr., Wilder AL, Duncan PW, Samsa GP, Matchar DB. Inaccuracy of the international classification of diseases (ICD-9-CM) in identifying the diagnosis of ischemic cerebrovascular disease. Neurology. 1997;49:660–664.
  • 6. Claude Hemphill J 3rd, Lam A. Emergency neurological life support: Intracerebral hemorrhage. Neurocritical care. 2017;27:89–101.
  • 7. Goldstein LB. Accuracy of ICD-9-CM coding for the identification of patients with acute ischemic stroke: Effect of modifier codes. Stroke. 1998;29:1602–1604.
  • 8. Imran TF, Posner D, Honerlaw J, Vassy JL, Song RJ, Ho Y-L, et al. A phenotyping algorithm to identify acute ischemic stroke accurately from a national biobank: The million veteran program. Clinical epidemiology. 2018;10:1509.
  • 9. McCormick N, Bhole V, Lacaille D, Avina-Zubieta JA. Validity of diagnostic codes for acute stroke in administrative databases: A systematic review. PloS one. 2015;10:e0135834.
  • 10. Roumie CL, Mitchel E, Gideon PS, Varas-Lorenzo C, Castellsague J, Griffin MR. Validation of ICD-9 codes with a high positive predictive value for incident strokes resulting in hospitalization using Medicaid health data. Pharmacoepidemiology and drug safety. 2008;17:20–26.
  • 11. The top 10 causes of death: Fact sheet, World Health Organization; 2017.
  • 12. Obermeyer Z, Emanuel EJ. Predicting the future - big data, machine learning, and clinical medicine. The New England journal of medicine. 2016;375:1216–1219.
  • 13. Nori VS, Hane CA, Martin DC, Kravetz AD, Sanghavi DM. Identifying incident dementia by applying machine learning to a very large administrative claims dataset. PloS one. 2019;14:e0203246.
  • 14. Walsh JA, Rozycki M, Yi E, Park Y. Application of machine learning in the diagnosis of axial spondyloarthritis. Current opinion in rheumatology. 2019;31:362–367.
  • 15. Cantor AB. Sample-size calculations for Cohen’s kappa. Psychological Methods. 1996;1:150–153.
  • 16. Kaji AH, Schriger D, Green S. Looking through the retrospectoscope: Reducing bias in emergency medicine chart review studies. Annals of emergency medicine. 2014;64:292–298.
  • 17. Chawla NV, Bowyer KW, Hall LO, Kegelmeyer WP. SMOTE: Synthetic minority over-sampling technique. Journal of artificial intelligence research. 2002;16:321–357.
  • 18. Al-Ani F, Shariff S, Siqueira L, Seyam A, Lazo-Langner A. Identifying venous thromboembolism and major bleeding in emergency room discharges using administrative data. Thrombosis research. 2015;136:1195–1198.
  • 19. Shivade C, Raghavan P, Fosler-Lussier E, Embi PJ, Elhadad N, Johnson SB, et al. A review of approaches to identifying patient phenotype cohorts using electronic health records. Journal of the American Medical Informatics Association. 2013;21:221–230.
  • 20. Fernández-Delgado M, Cernadas E, Barro S, Amorim D. Do we need hundreds of classifiers to solve real world classification problems? The journal of machine learning research. 2014;15:3133–3181.
  • 21. Olson RS, La Cava W, Mustahsan Z, Varik A, Moore JH. Data-driven advice for applying machine learning to bioinformatics problems. arXiv preprint arXiv:1708.05070. 2017.
  • 22. Wilson EB. Probable inference, the law of succession, and statistical inference. Journal of the American Statistical Association. 1927;22:209–212.
