Author manuscript; available in PMC: 2022 Apr 1.
Published in final edited form as: Stroke. 2021 Feb 18;52(4):1370–1379. doi: 10.1161/STROKEAHA.120.032546

Dynamic Detection of Delayed Cerebral Ischemia: A Study in Three Centers

Murad Megjhani 1,*, Kalijah Terilli 1,*, Miriam Weiss 2, Jude Savarraj 3, Li Hui Chen 1, Ayham Alkhachroum 4, David J Roh 1, Sachin Agarwal 1, E Sander Connolly Jr 5, Angela Velazquez 1, Amelia Boehme 1, Jan Claassen 1, HuiMahn A Choi 3, Gerrit A Schubert 2, Soojin Park 1
PMCID: PMC8247633  NIHMSID: NIHMS1664081  PMID: 33596676

Abstract

Background and Purpose:

Delayed cerebral ischemia (DCI) after aneurysmal subarachnoid hemorrhage negatively impacts long-term recovery, but is often detected too late to prevent damage. We aim to develop hourly risk scores using routinely collected clinical data to detect DCI.

Methods:

A DCI classification model was trained using vital sign measurements (heart rate, blood pressure, respiratory rate, oxygen saturation) and demographics routinely collected for clinical care. Twenty-two time-varying physiologic measures were computed, including the mean, standard deviation, and cross-correlation of the heart rate time series with each of the other vital signs. Classification was achieved using an ensemble approach combining L2-regularized logistic regression, random forest, and support vector machine models. Classifier performance was assessed by the area under the receiver operating characteristic curve (AU-ROC) and confusion matrices. Hourly DCI risk scores were generated as the posterior probability at time t using the ensemble classifier and were evaluated on cohorts recruited at two external institutions (N=38, N=40).

Results:

The training cohort included 310 patients (median 54 years old [IQR 45-65], 80.2% female, 28.4% Hunt Hess 4-5, 38.7% Modified Fisher Scale 3-4); 101 (33%) developed DCI, with a median onset on day 6 [IQR 5-8]. Classification accuracy prior to DCI onset was 0.83 [IQR 0.76-0.83] AU-ROC. Risk scores applied to the external institution datasets correctly predicted 64% and 91% of DCI events as early as 12 hours before clinical detection, with 2.7 and 1.6 true alerts for every false alert, respectively.

Conclusion:

An hourly risk score for DCI derived from routine vital signs may have the potential to alert clinicians to DCI, which could reduce neurological injury.

Keywords: Delayed Cerebral Ischemia, aneurysmal Subarachnoid Hemorrhage, Machine Learning, Neurocritical Care

Introduction

Delayed cerebral ischemia (DCI) occurs in up to one in three patients with aneurysmal subarachnoid hemorrhage (SAH) and has a significant impact on functional and cognitive outcomes.1-3 Despite this impact, there are many barriers to the timely detection of DCI. Clinical prediction tools are static, and detection often comes too late to identify DCI before permanent damage has occurred. Confirmatory testing such as angiography carries risks, including patient transport and radiation exposure. At present, the onset of DCI is too often missed,3 obscured by impaired consciousness, and confirmatory testing too often fails to confirm the clinical suspicion of DCI. Other diagnostic approaches include ultrasound and electroencephalogram (EEG). Transcranial doppler to detect angiographic vasospasm is the only non-invasive surveillance tool supported by guidelines, yet it is limited by poor sensitivity, poor inter-rater reliability, and technician availability (at most, it is performed once daily). Moreover, adequate insonation is not possible for 10-15% of patients.4 Continuous quantitative EEG has shown promising results for detecting DCI; however, implementation requires 24-hour continuous EEG monitoring and expert reconciliation of artifacts.5, 6 An optimal DCI monitoring tool would be continuous and automated, performing without reliance on expert or technician availability.

Prediction and detection algorithms for DCI5, 7, 8 may include sodium levels, glucose variability, red blood cell distribution width, white blood cell count, neutrophil-lymphocyte ratio, hemoglobin, electroencephalogram, and vital signs.9-12 Vital signs are compelling targets, as they are universally collected continuous markers of cardiopulmonary status that are affected by states of inflammation13, 14 and autonomic dysfunction,15, 16 both of which have been implicated in the development of DCI following SAH. Incorporating featurized physiologic signals provided good prediction of DCI (AU-ROC 0.78), surpassing widely established imaging-based prediction scales such as the modified Fisher Scale alone (AU-ROC 0.54-0.58).10, 11 This approach, however, is of limited clinical applicability because it provides only a static, one-time assessment of DCI risk early after injury. In clinical practice, a monitor providing continuous assessment of DCI risk is needed. In the present study, we pivot from prediction to detection, using a cross-correlation method to featurize inter-vital-sign relationships (as successfully used by others to predict neonatal sepsis17) and applying machine learning methods to develop algorithms that alert when DCI becomes more or less likely to occur. Machine learning derived detection tools have been successfully developed and implemented in other areas of critical care,18, 19 including weaning of mechanically ventilated patients, instability and risk of cardiac arrest in children, hemodynamic shock, acute kidney injury, and, extensively, sepsis. We propose a data-driven DCI detection tool that offers hourly estimates of the patient’s current state, incorporating new physiologic information over time.
Our hypothesis is that such a model could be used to establish DCI onset time for better evaluation of proposed treatments and to improve patient care by acting as a trigger for confirmatory testing and subsequent timely intervention.

Methods

Data availability

All relevant data are presented within the article and its supporting information files. Additional information can be obtained upon request to the corresponding author.

Patient Population and clinical data collection

Consecutive patients with aneurysmal SAH admitted to the neurological intensive care unit (NICU) were prospectively enrolled in an observational cohort study at New York Presbyterian Hospital – Columbia University Irving Medical Center (Columbia)20 from 2006 to 2014. Clinical management of patients with SAH was aligned with the American Heart Association (AHA) guidelines.21 DCI was defined classically as delayed neurological deterioration (a ≥ 2 point change in GCS or a new focal neurological deficit lasting > 1 hour and not associated with surgical treatment) and/or a new cerebral infarct on brain imaging not attributable to any other cause.21, 22 TCDs were performed daily on weekdays; hypertension was induced if DCI or symptomatic vasospasm was suspected, and confirmatory diagnostic studies (computed tomography angiography or digital subtraction angiography) were performed. If angiographic vasospasm was discovered, intra-arterial verapamil (sometimes angioplasty, as clinically determined) was administered. Patients with a Glasgow Coma Scale score < 9 were considered for multimodality neuromonitoring. The study was approved by the Institutional Review Board. Patients were excluded if no parametric physiologic data were available, if they died before the DCI onset window (< Post Bleed Day (PBD) 3), or if early angiographic vasospasm was detected on the admitting angiogram (as DCI development was the signal of interest).

Patients with aneurysmal SAH admitted to the NICU at Rheinisch-Westfälische Technische Hochschule Aachen University (Aachen) from August 2018 to May 2020 with multimodality neuromonitoring were prospectively enrolled in an observational study. Monitors were placed in patients with high clinical grade (Hunt Hess >2) and/or extensive visible hemorrhage on CT scan warranting clinical concern for high DCI risk. Physiologic data were collected with Moberg Component Neuromonitoring Systems (Moberg Research Inc, Ambler, PA USA). The study was approved by the Institutional Review Board. At Aachen, clinical management of SAH patients was generally aligned with the European Stroke Organization (ESO) guidelines,23 which do not differ significantly from the AHA guidelines. TCDs are not performed regularly. At Aachen, DCI diagnosis was supplemented by “perfusion” DCI,24 defined as territorial or watershed zone hypoperfusion on CT perfusion scans triggered by abnormalities in multimodal neuromonitoring (brain tissue oxygen, microdialysis). DCI or “perfusion” DCI triggered clinicians to induce hypertension.

Patients with aneurysmal SAH admitted to the NICU at the University of Texas McGovern (Houston) from March 2018 to November 2019 were prospectively enrolled in an observational study. Physiologic data were collected subject to the availability of two Moberg Systems. The study was approved by the Institutional Review Board. At Houston, clinical management of patients with SAH was aligned with the AHA guidelines. As at Columbia, TCDs were performed daily on weekdays, and hypertension was induced if DCI or symptomatic vasospasm was suspected and confirmed by computed tomography angiography/perfusion or digital subtraction angiography.

For comparison of clinical characteristics and severity between institutions, we describe core and supplemental-highly recommended National Institute of Neurological Disorders and Stroke (NINDS) common data elements (CDEs) for SAH subject characteristics. These include age, gender, ethnicity, tobacco use, and hypertension history. For comparison (and model training), we describe routinely collected grading scales of injury severity and outcome prediction which were in the core CDE: World Federation of Neurological Surgeons scale (WFNS), and the supplemental CDEs: modified Fisher Score (mFS), Hunt & Hess grade (HH) and Glasgow Coma Scale (GCS).25, 26 Hospital events were compared between cohorts: cerebral edema, fever (>38.3ºC), pulmonary edema, hydrocephalus, and seizure. Outcome variables for comparison were in-hospital mortality, length of stay in the NICU, and modified Rankin Scale (mRS) at discharge, 3 months and 12 months post-discharge, as available. The target classification outcome of the study was DCI21, 22.

Physiologic Data Analysis and Cross-Correlation between Vital Signs

At Columbia, vital sign data were collected using a high-resolution acquisition system (BedmasterEX; Excel Medical Electronics Inc, Jupiter, FL, USA) at 0.2 Hz (every 5 sec). Houston and Aachen used Moberg Systems to collect data at approximately 0.017 Hz (once per minute). Six vital signs were collected: heart rate (HR), respiratory rate (RR), oxygen saturation (SPO2), and mean, systolic, and diastolic arterial blood pressure (ABP, SBP, DBP). We downsampled the Columbia data to 1-minute resolution to standardize the sampling frequency across the three institutions. Each 1-minute value was computed as the median of the samples in that minute, to limit the influence of erroneous or missing data.19 Missing data were imputed using a “carry-forward” approach, in which the most recent value is carried forward to fill subsequent empty time points.27
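The resampling and imputation steps described above can be illustrated with a minimal pandas sketch. This is not the authors' code: the function name `downsample_and_fill`, the synthetic heart-rate values, and the use of pandas are assumptions made for illustration only.

```python
import numpy as np
import pandas as pd


def downsample_and_fill(raw: pd.Series) -> pd.Series:
    """Median-downsample a vital-sign series to 1-minute bins, then carry forward."""
    # The median of each bin skips NaN samples and resists isolated spikes.
    per_minute = raw.resample("1min").median()
    # A bin with no valid samples is NaN; fill it with the last observed value.
    return per_minute.ffill()


# Hypothetical 5-second heart-rate samples (0.2 Hz) covering three minutes.
index = pd.date_range("2020-01-01 00:00:00", periods=36, freq="5s")
hr = pd.Series(80.0, index=index)
hr.iloc[3] = 250.0        # one spurious spike in the first minute
hr.iloc[12:24] = np.nan   # the second minute is entirely missing

hr_1min = downsample_and_fill(hr)
```

In this toy series the median suppresses the single spike, and the empty second minute inherits the value from the first.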

Data for each vital sign were rescaled using min-max normalization to allow for comparison.28 Twenty-two time-varying measures were computed in 10-minute windows: the mean and standard deviation of each of the six vital signs (12 measures), and the maximum and minimum cross-correlation (lags from −60 to +60 seconds) of HR with each of the remaining five vital signs (RR, SPO2, ABP, SBP, DBP; 10 measures). Measures were retained only for windows with at least 50% of the data available. Finally, hourly averages were computed, resulting in 336 time points over 14 days. The analysis was performed in Python (www.python.org).
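A simplified sketch of the windowed measures for one pair of signals (HR and one other vital sign) is given below. It is an illustration under stated assumptions, not the published implementation: the function names are hypothetical, the signals are synthetic, Pearson correlation at each lag stands in for the cross-correlation, and with 1-minute data the lag range of ±60 seconds is taken to mean ±1 sample.

```python
import numpy as np


def min_max(x):
    """Rescale a signal to [0, 1], ignoring NaN."""
    lo, hi = np.nanmin(x), np.nanmax(x)
    return (x - lo) / (hi - lo) if hi > lo else np.zeros_like(x)


def lagged_corr_extremes(a, b, max_lag=1):
    """Max and min Pearson correlation of a with b over lags -max_lag..+max_lag."""
    values = []
    for lag in range(-max_lag, max_lag + 1):
        if lag < 0:
            x, y = a[:lag], b[-lag:]
        elif lag > 0:
            x, y = a[lag:], b[:-lag]
        else:
            x, y = a, b
        ok = ~np.isnan(x) & ~np.isnan(y)
        if ok.sum() > 2 and np.std(x[ok]) > 0 and np.std(y[ok]) > 0:
            values.append(np.corrcoef(x[ok], y[ok])[0, 1])
    return (max(values), min(values)) if values else (np.nan, np.nan)


def window_features(hr, other, window=10, min_coverage=0.5):
    """Mean, SD and lagged-correlation extremes per non-overlapping window."""
    rows = []
    for start in range(0, len(hr) - window + 1, window):
        h = hr[start:start + window]
        o = other[start:start + window]
        # Keep a window only if at least half of each signal is present.
        if np.mean(~np.isnan(h)) < min_coverage or np.mean(~np.isnan(o)) < min_coverage:
            continue
        cmax, cmin = lagged_corr_extremes(h, o)
        rows.append((np.nanmean(h), np.nanstd(h), np.nanmean(o), np.nanstd(o), cmax, cmin))
    return np.array(rows)


# Synthetic one-hour signals sampled once per minute (not patient data).
rng = np.random.default_rng(0)
minutes = np.arange(60)
hr = 80 + 5 * np.sin(minutes / 5) + rng.normal(0, 1, 60)
abp = 90 + 4 * np.sin(minutes / 5) + rng.normal(0, 1, 60)
hr[20:27] = np.nan  # third window has only 30% coverage, so it is dropped

features = window_features(min_max(hr), min_max(abp))
```

Each retained row holds the six measures for one 10-minute window; averaging rows within an hour would give the hourly values described in the text.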

Statistical Analysis and Modelling

We applied an ensemble machine learning approach to build classifiers trained on Columbia data, with the plan to externally validate them on Houston and Aachen data. The large window for DCI presentation makes direct comparison across patients over time challenging. To address this, we used the DCI diagnosis as the temporal anchor (see Materials in the Data Supplement for expanded Methods) to align the data so that we could identify a physiologic signal as the disease develops, despite varied times of onset (Figure 1). Demographics included in the model were age, sex, mFS, WFNS, HH, and GCS at NICU admission.10
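The anchoring idea can be sketched as follows. The column names, the helper `align_to_anchor`, and the two example patients are hypothetical, and counting post-bleed day d as hour 24 × d is an assumption; the post-bleed day 7 anchor for a patient without DCI follows the convention reported in the Results.

```python
import numpy as np
import pandas as pd


def align_to_anchor(hourly: pd.DataFrame, anchor_hours: dict) -> pd.DataFrame:
    """Add 'hours_to_anchor' (negative before the anchor) to each hourly row."""
    out = hourly.copy()
    out["hours_to_anchor"] = out["hours_post_bleed"] - out["patient_id"].map(anchor_hours)
    return out


# Two hypothetical patients with 336 hourly time points (14 days) each.
hours = np.arange(336)
hourly = pd.DataFrame({
    "patient_id": np.repeat(["A", "B"], len(hours)),
    "hours_post_bleed": np.tile(hours, 2),
})

# Patient A: DCI diagnosed on post-bleed day 5. Patient B: no DCI, anchored at day 7.
anchors = {"A": 5 * 24, "B": 7 * 24}
aligned = align_to_anchor(hourly, anchors)

# Data window for a model that uses the 12 hours immediately before the anchor.
m1_window = aligned[(aligned["hours_to_anchor"] >= -12) & (aligned["hours_to_anchor"] < 0)]
```

After alignment, rows from different patients with the same `hours_to_anchor` are comparable even though the patients reached the anchor on different post-bleed days.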

Figure 1:


Illustrating the concept of anchoring patients’ data to the DCI onset to capture the temporal dynamics leading to DCI onset. The vertical black line indicates the onset of DCI.

Comparison of Demographics data:

Demographic data were compared between the DCI positive and DCI negative groups at each of the centers and for SAH patients across the three centers. Fisher’s exact test was applied to categorical variables, and the Mann-Whitney U test for two-group comparisons was applied to continuous variables. All statistical tests were two-tailed, and a p-value < 0.05 was considered statistically significant. The analysis was performed in RStudio (version 1.0.143, http://www.rstudio.com, RStudio Inc., Boston, USA).
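The two tests can be illustrated in Python with SciPy (the study itself used R, so this is a translation for illustration, not the original analysis script). The 2 × 2 counts are the Columbia female-gender counts reported in Table 1; the age values are synthetic, because patient-level data are not part of the article.

```python
import numpy as np
from scipy.stats import fisher_exact, mannwhitneyu

# Categorical variable: female gender by DCI status at Columbia (Table 1).
# Rows: DCI+, DCI-. Columns: female, not female.
table = [[81, 101 - 81], [137, 209 - 137]]
odds_ratio, p_categorical = fisher_exact(table, alternative="two-sided")

# Continuous variable: synthetic ages standing in for the two groups.
rng = np.random.default_rng(0)
age_dci = rng.normal(52, 10, size=101)
age_no_dci = rng.normal(56, 12, size=209)
u_stat, p_continuous = mannwhitneyu(age_dci, age_no_dci, alternative="two-sided")

significant_categorical = p_categorical < 0.05
```

Both calls request two-sided p-values explicitly, matching the two-tailed testing stated above.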

Columbia DCI Modeling:

We computed six summary statistics (range [max-min], mean, SD, median, IQR, and entropy) for each of the 22 vital sign measures and added 6 subject characteristic variables (age, sex, mFS, WFNS, HH, and GCS at NICU admission), resulting in 138 total features (22 × 6 + 6). Missing values were imputed using the median. We used F-statistics to identify the k best features explaining most variance within the dataset.27 We then built L2-regularized logistic regression (LR), linear and kernel support vector machine (SL, SK), random forest (RF), and ensemble (EC) classifiers for each day prior to the DCI anchor (see Materials in the Data Supplement for expanded Methods).27 We built classifiers using incrementally larger amounts of data going back from the anchor (Figure 2), as we hypothesized that the inherent temporal dynamics of the vitals change as they get closer to the anchor (or time of DCI onset). For example, our first model (M1) was created with data from the anchor to 12 hours before the anchor, the second model (M2) was created with 24 hours’ worth of data going back from the anchor, and so on, up to M14 (7 days). We used grid search to tune the hyper-parameters of the classifiers. We performed nested five-fold cross-validation to tune model parameters and to report the accuracy (see Materials in the Data Supplement for expanded Methods). All classifiers were evaluated for discrimination using the area under the receiver operating characteristic curve (AU-ROC) and the confusion matrix. We externally validated the performance of the Columbia classifiers (M1,…, M14) on the Houston and Aachen datasets (Figures 3 and 4).
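A compact scikit-learn sketch of this kind of workflow follows. Several choices are assumptions rather than details taken from the study: the soft-voting combination of the four base classifiers, the standardization step, the small parameter grid, and the synthetic 138-feature matrix generated with `make_classification`. It is meant only to show how F-statistic feature selection, an ensemble, and nested five-fold cross-validation fit together.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, StratifiedKFold, cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for a 138-feature matrix (not patient data).
X, y = make_classification(n_samples=150, n_features=138, n_informative=10,
                           weights=[0.67, 0.33], random_state=0)

# Assumed ensemble: soft voting over the four base classifiers.
ensemble = VotingClassifier(
    estimators=[
        ("lr", LogisticRegression(max_iter=1000)),  # L2 penalty is the default
        ("svm_linear", SVC(kernel="linear", probability=True, random_state=0)),
        ("svm_rbf", SVC(kernel="rbf", probability=True, random_state=0)),
        ("rf", RandomForestClassifier(n_estimators=25, random_state=0)),
    ],
    voting="soft",
)

pipeline = Pipeline([
    ("scale", StandardScaler()),
    ("select", SelectKBest(score_func=f_classif)),  # F-statistic feature ranking
    ("clf", ensemble),
])

# Illustrative grid: number of selected features and LR regularization strength.
param_grid = {"select__k": [10, 30], "clf__lr__C": [0.1, 1.0]}

inner = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)  # tuning folds
outer = StratifiedKFold(n_splits=5, shuffle=True, random_state=1)  # reporting folds

search = GridSearchCV(pipeline, param_grid, scoring="roc_auc", cv=inner)
outer_auc = cross_val_score(search, X, y, scoring="roc_auc", cv=outer)
median_auc = float(np.median(outer_auc))
```

Placing feature selection inside the pipeline means it is refit within every training fold, so the outer-fold AU-ROC values are not contaminated by information from the held-out data.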

Figure 2:


Overview of the approach.

Figure 3:


Performance of models (M1,…,M14) on Houston dataset over time leading to DCI anchor.

Figure 4:


Performance of models (M1,…,M14) on Aachen dataset over time leading to DCI anchor.

Hourly Risk Scores

For each external institution, we selected the Columbia classifier that performed best on that institution’s dataset and used the ensemble classifier to generate hourly risk scores indicating the current likelihood of DCI. For patient i, the risk score at time t, given features $x_{it}$, is computed as the posterior probability:

$$p(y_i \mid x_{it}) = f(w, x_{it})$$

where f is the classifier (EC) and w denotes the weights of the classifier. Machine learning models and risk scores were developed using the Python scikit-learn library.29 The optimal cut-off point was selected using the Youden index,30 which maximizes the difference between the true positive rate and the false positive rate over all possible cut-point values. Risk scores above the optimal cut-off point (threshold) indicate a higher probability of the patient developing DCI.
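The following sketch shows how a posterior probability can serve as a risk score and how a Youden-index threshold can be derived from it with scikit-learn. The data are synthetic, a plain logistic regression stands in for the ensemble classifier, the helper name `youden_threshold` is hypothetical, and the threshold is computed in-sample purely for brevity.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_curve


def youden_threshold(y_true, scores):
    """Return the cut-off maximizing J = TPR - FPR, together with J itself."""
    fpr, tpr, thresholds = roc_curve(y_true, scores)
    j = tpr - fpr
    best = int(np.argmax(j))
    return thresholds[best], j[best]


# Synthetic two-feature data (not patient data).
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 2))
y = (X[:, 0] + rng.normal(scale=1.0, size=300) > 0.5).astype(int)

# Stand-in classifier; the risk score is the posterior probability of class 1.
model = LogisticRegression().fit(X, y)
risk = model.predict_proba(X)[:, 1]

threshold, j_stat = youden_threshold(y, risk)
alerts = risk >= threshold  # scores at or above the cut-off raise an alert
```

In a deployment setting the threshold would be fixed on the training cohort and then applied unchanged to each new hourly score.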

Results

Patient Cohorts

Inclusion criteria were met for 310 SAH patients admitted to the NICU at Columbia University from May 2006 to December 2014. Exclusion criteria were vasospasm on admission vessel imaging, non-aneurysmal SAH, death before entering the DCI window (before PBD 3), and angiographic vasospasm without DCI development. These patients were excluded from model creation to enhance the clarity of the signal of interest, leaving just two groups: patients with or without DCI (Figure 2). Of the 310 patients included in model creation, 101 (33%) developed DCI while 209 (67%) did not. 88 (28%) were enrolled with a Hunt Hess grade of 4-5 and 121 (39%) with a modified Fisher score of 3-4. For patients with DCI, the median number of days from bleed to DCI was 6 (IQR 5, 8). This range was consistent with the clinically accepted peak time of DCI risk and with the time at which surveillance scans typically confirm the absence of angiographic vasospasm, triggering de-escalation. Post-bleed day 7 was used as the anchor for patients without DCI, making the average window used for analysis for all patients the first 14 days post-bleed.

Clinical and vital sign data were available and inclusion criteria were met for 38 SAH patients admitted to the NICU at Houston from March 2018 to November 2019. 19 (50%) were enrolled with a Hunt Hess grade of 4-5 and 38 (100%) with a modified Fisher score of 3-4. 12 patients (32%) developed DCI during their visit, a median of 7 (IQR 6,8) days post-bleed.

Clinical and vital sign data were available and inclusion criteria were met for 40 SAH patients admitted to the NICU at Aachen from August 2018 to May 2020. 15 (37.5%) were enrolled with a Hunt Hess grade of 4-5 and 20 (50%) with a modified Fisher score of 3-4. 15 patients (37.5%) developed “perfusion” DCI and, of those, 11 patients (27.5%) developed DCI during their visit; the earliest of either occurred a median of 7 (IQR 5, 10) days post-bleed. “Perfusion” DCI was always identified on or before DCI, with a mean difference of 1.5 days. In a post-hoc retrospective evaluation of Aachen data for DCI only, the onset occurred a median of 11 (IQR 8, 13) days post-bleed.

Demographic statistical differences between groups are reported in Table 1.

Table 1.

Characteristics of SAH patients with and without DCI

All patients, by center

| Characteristic | Columbia (n=310) | Houston (n=38) | p value | Aachen (n=40) | p value |
|---|---|---|---|---|---|
| Age, y, median (IQR) | 54 (45-65) | 57 (46-65) | 0.765 | 59 (49-65.5) | 0.137 |
| Female Gender, n (%) | 218 (70.3) | 28 (73.7) | 0.85 | 27 (67.5) | 0.716 |
| Hispanic Ethnicity, n (%) | 99 (31.9) | 12 (31.6) | 1 | 0 (0.0) | N/A |
| Tobacco Use, n (%) | 166 (53.6) | 20 (52.6) | 1 | 15 (37.5) | 0.065 |
| Hypertension History | 146 (47.1) | 27 (71.0) | 0.006* | 16 (40.0) | 0.5 |
| HH, 4 - 5, n (%) | 88 (28.4) | 19 (50.0) | 0.008* | 15 (37.5) | 0.269 |
| WFNS, 4 - 5, n (%) | 121 (39.0) | 22 (57.9) | 0.035* | 17 (42.5) | 0.732 |
| GCS, median (IQR) | 14 (8-15) | 8 (5-14) | <0.001* | 13 (7-15) | 0.101 |
| mFS, 3 - 4, n (%) | 120 (38.7) | 38 (100.0) | <0.001* | 20 (50.0) | 0.175 |
| Vasopressor Use, n (%) | 66/242 (27.3) | 20 (52.6) | 0.002* | 37 (92.5) | <0.001* |
| Cerebral Edema, n (%) | 176 (56.77) | 10 (26.32) | <0.001 | 8 (20) | <0.0001 |
| Fever >38.6°C, n (%) | 126 (40.65) | 26 (68.42) | 0.002* | 34 (85) | <0.0001* |
| Hydrocephalus, n (%) | 143 (46.1) | 26 (68.4) | 0.010* | 34 (85.0) | <0.0001* |
| Pulmonary Edema, n (%) | 51 (16.5) | 9 (23.7) | 0.260 | 10 (25.0) | 0.186 |
| Seizures, n (%) | 32 (10.3) | 3 (7.9) | 0.782 | 11 (27.5) | 0.004* |
| In-Hospital Mortality, n (%) | 34 (11.0) | 3 (7.9) | 0.781 | 9 (22.5) | 0.068 |
| LOS, d, median (IQR) | 14.9 (11-22) | 13 (10-17) | 0.103 | 14.5 (11-20) | 0.474 |
| mRS at Discharge, 3 - 6, n (%) | 195 (62.9) | 35 (92.1) | <0.001* | 24 (60.0) | 0.730 |
| mRS at 3 Months, 3 - 6, n (%) | 92/238 (38.7) | 16/29 (55.2) | 0.108 | 15/27 (55.6) | 0.101 |
| mRS at 12 Months, 3 - 6, n (%) | 93/229 (40.6) | 8/14 (57.1) | 0.268 | 11/19 (57.9) | 0.154 |

By DCI status within each center

| Characteristic | Columbia DCI+ (n=101) | Columbia DCI− (n=209) | p value | Houston DCI+ (n=12) | Houston DCI− (n=26) | p value | Aachen DCI+ (n=11) | Aachen DCI− (n=29) | p value |
|---|---|---|---|---|---|---|---|---|---|
| Age, y, median (IQR) | 50 (44-62) | 56 (47-66) | 0.033* | 53 (49-60) | 59 (45-70) | 0.285 | 52 (48-60) | 60 (53-66) | 0.055 |
| Female Gender, n (%) | 81 (80.2) | 137 (65.6) | 0.008* | 12 (100.0) | 16 (61.5) | 0.016* | 9 (81.8) | 18 (62.1) | 0.286 |
| Hispanic Ethnicity, n (%) | 30 (29.7) | 69 (33.0) | 0.604 | 3 (25.0) | 9 (34.6) | 0.714 | 0 (0.0) | 0 (0.0) | 1 |
| Tobacco Use, n (%) | 55 (54.5) | 111 (53.1) | 0.713 | 9 (75.0) | 11 (42.3) | 0.086 | 5 (45.5) | 10 (34.5) | 0.716 |
| Hypertension History | 39 (38.6) | 107 (51.2) | 0.04* | 11 (91.7) | 16 (61.5) | 0.121 | 7 (63.6) | 9 (31.0) | 0.08 |
| HH, 4 - 5, n (%) | 40 (39.6) | 48 (23.0) | 0.003* | 5 (41.7) | 14 (53.8) | 0.728 | 7 (63.6) | 8 (27.6) | 0.065 |
| WFNS, 4 - 5, n (%) | 50 (49.5) | 71 (34.0) | 0.009* | 7 (58.3) | 15 (57.7) | 1 | 7 (63.6) | 10 (34.5) | 0.153 |
| GCS, median (IQR) | 13 (6-15) | 15 (9-15) | <0.001* | 9 (6-13) | 8 (5-14) | 0.95 | 7 (6-13) | 13 (8-15) | 0.053 |
| mFS, 3 - 4, n (%) | 38 (37.6) | 82 (39.2) | 0.805 | 12 (100.0) | 26 (100.0) | 1 | 9 (81.8) | 11 (37.9) | 0.031* |
| mFS 0, n (%) | 0 (0.0) | 15 (7.2) | | 0 (0.0) | 0 (0.0) | | 0 (0.0) | 0 (0.0) | |
| mFS 1, n (%) | 15 (14.9) | 41 (19.6) | | 0 (0.0) | 0 (0.0) | | 2 (18.2) | 12 (41.4) | |
| mFS 2, n (%) | 48 (47.5) | 71 (34.0) | | 0 (0.0) | 0 (0.0) | | 0 (0.0) | 6 (20.7) | |
| mFS 3, n (%) | 32 (31.7) | 73 (34.9) | | 6 (50.0) | 19 (73.1) | | 4 (36.4) | 3 (10.34) | |
| mFS 4, n (%) | 6 (5.9) | 9 (4.3) | | 6 (50.0) | 7 (26.9) | | 5 (45.5) | 8 (27.6) | |
| Vasopressor Use, n (%) | 47/78 (60.3) | 19/164 (11.6) | <0.001* | 10 (83.3) | 10 (38.5) | 0.015* | 11 (100) | 26 (89.7) | 0.548 |
| Cerebral Edema, n (%) | 76 (75.2) | 100 (47.8) | <0.0001* | 4 (33.3) | 6 (23.1) | 0.694 | 4 (36.4) | 4 (13.8) | 0.182 |
| Fever >38.6°C, n (%) | 54 (53.5) | 72 (34.4) | 0.002* | 8 (66.7) | 18 (69.2) | 1 | 11 (100) | 23 (79.3) | 0.162 |
| Hydrocephalus, n (%) | 61 (60.4) | 82 (39.2) | <0.001* | 9 (75.0) | 17 (65.4) | 0.714 | 11 (100) | 23 (79.3) | 0.162 |
| Pulmonary Edema, n (%) | 27 (26.7) | 24 (11.5) | 0.001* | 4 (33.3) | 5 (19.2) | 0.423 | 3 (27.3) | 7 (24.1) | 1 |
| Seizures, n (%) | 12 (11.9) | 20 (9.6) | 0.553 | 1 (8.3) | 2 (7.6) | 1 | 3 (27.3) | 8 (27.6) | 1 |
| In-Hospital Mortality, n (%) | 8 (7.9) | 26 (12.4) | 0.253 | 0 (0.0) | 3 (11.54) | 0.538 | 5 (45.5) | 4 (13.8) | 0.083 |
| LOS, d, median (IQR) | 22 (17-30) | 12 (10-16) | <0.0001* | 18 (16-21) | 12 (9-14) | 0.004* | 18 (15-25) | 14 (9-18) | 0.022* |
| mRS at Discharge, 3 - 6, n (%) | 89 (88.1) | 106 (50.7) | <0.0001* | 10 (83.3) | 25 (96.2) | 0.229 | 9 (81.8) | 15 (51.7) | 0.148 |
| mRS at 3 Months, 3 - 6, n (%) | 40/74 (54.1) | 52/164 (31.7) | 0.001* | 4/11 (36.4) | 12/18 (66.7) | 0.143 | 6/9 (22.2) | 9/18 (50.0) | 0.683 |
| mRS at 12 Months, 3 - 6, n (%) | 36/70 (51.4) | 57/159 (35.8) | 0.029* | 2/4 (50.0) | 6/10 (60.0) | 1 | 6/7 (85.7) | 5/12 (41.7) | 0.147 |

* statistically significant

Selecting Predictive Models for Risk Score Tool

All classifiers trained with vital sign features performed better than those trained with demographic features alone (Table 2, Figure 5A, Figure I in the Data Supplement), producing AU-ROCs: L2-regularized Logistic Regression = 0.70 [0.69-0.71], SVM-Linear = 0.74 [0.68-0.74], SVM-Kernel = 0.73 [0.72-0.74], Random Forest = 0.89 [0.85-0.89], and Ensemble Classifier = 0.83 [0.76-0.83]. We selected the Ensemble Classifier, as Random Forest models were prone to overfitting. The Ensemble Classifier created with the most vital sign data (the 7 days prior to anchor) performed best on the Houston dataset, while the classifier with the least vital sign data (12 hours prior to anchor) performed best on the Aachen dataset (Figures 3 and 4).

Table 2.

AU-ROCs for 5 classifiers at all time-points

| Time from Anchor (days) | LR | SL | SK | RF | EC |
|---|---|---|---|---|---|
| −0.5 | 0.64 [0.64-0.65] | 0.63 [0.62-0.63] | 0.67 [0.63-0.68] | 0.64 [0.63-0.74] | 0.66 [0.65-0.67] |
| −1.0 | 0.62 [0.61-0.68] | 0.65 [0.59-0.70] | 0.64 [0.60-0.72] | 0.65 [0.64-0.70] | 0.67 [0.62-0.69] |
| −1.5 | 0.63 [0.59-0.64] | 0.61 [0.61-0.61] | 0.64 [0.61-0.64] | 0.66 [0.62-0.70] | 0.64 [0.60-0.64] |
| −2.0 | 0.61 [0.60-0.66] | 0.63 [0.61-0.63] | 0.64 [0.60-0.65] | 0.62 [0.61-0.68] | 0.62 [0.61-0.63] |
| −2.5 | 0.61 [0.59-0.64] | 0.66 [0.60-0.72] | 0.63 [0.63-0.64] | 0.60 [0.57-0.62] | 0.64 [0.62-0.65] |
| −3.0 | 0.65 [0.64-0.66] | 0.67 [0.60-0.69] | 0.67 [0.65-0.69] | 0.67 [0.65-0.68] | 0.67 [0.63-0.69] |
| −3.5 | 0.64 [0.62-0.67] | 0.60 [0.59-0.61] | 0.67 [0.63-0.68] | 0.66 [0.66-0.67] | 0.65 [0.64-0.66] |
| −4.0 | 0.67 [0.64-0.68] | 0.64 [0.57-0.64] | 0.68 [0.65-0.68] | 0.69 [0.69-0.70] | 0.68 [0.67-0.68] |
| −4.5 | 0.66 [0.64-0.73] | 0.69 [0.65-0.71] | 0.67 [0.64-0.72] | 0.68 [0.66-0.71] | 0.70 [0.66-0.73] |
| −5.0 | 0.68 [0.64-0.69] | 0.68 [0.63-0.68] | 0.69 [0.67-0.73] | 0.70 [0.66-0.71] | 0.70 [0.70-0.72] |
| −5.5 | 0.70 [0.66-0.70] | 0.69 [0.69-0.72] | 0.70 [0.69-0.73] | 0.72 [0.72-0.75] | 0.73 [0.72-0.73] |
| −6.0 | 0.68 [0.66-0.70] | 0.73 [0.64-0.74] | 0.72 [0.69-0.74] | 0.74 [0.72-0.76] | 0.71 [0.71-0.75] |
| −6.5 | 0.70 [0.69-0.71] | 0.74 [0.68-0.74] | 0.73 [0.72-0.74] | 0.75 [0.73-0.76] | 0.73 [0.73-0.76] |
| −7.0 | 0.68 [0.68-0.72] | 0.74 [0.66-0.77] | 0.72 [0.70-0.72] | 0.89 [0.85-0.90] | 0.83 [0.76-0.83] |

Figure 5:


Classifier performance and risk scores. (A) AU-ROCs for five classifiers (L2-Regularized Logistic Regression, Support Vector Machine – Linear and Kernel, Random Forest, and Ensemble Classifier) trained on initial demographic and vital sign features. White dotted lines are the median AU-ROCs and the blue box indicates the IQR. Best performing models are highlighted in red and the risk scores were generated using these models. Risk scores generated every 12 hours for Columbia, Houston (using model M14) and Aachen (using model M1). Risk scores generated every 1 hour for Columbia and Houston before Classical DCI and for Aachen before “Perfusion” DCI.

Risk Scores

We computed hourly risk scores using the Ensemble Classifiers. We selected the classifier that showed maximal separation in terms of AU-ROC, which was M1 for Aachen and M14 for Houston. The optimal threshold based on the Youden index was 0.41 (M1) and 0.35 (M14). All patients start with similar (high) risk scores. Over time, Houston patients without DCI show decreasing scores that drop below the threshold, while patients who develop DCI generate scores above the threshold (Figure 5C). With model M14 and a threshold risk score of 0.35, we correctly predicted 63.6% of patients with DCI at least 12 hours before clinical detection among Houston SAH patients. This is akin to 2.7 true alerts for every false alert. At Aachen, risk scores increased for patients developing DCI until “perfusion” DCI was diagnosed, at which point the risk scores for all SAH patients became more similar. We hypothesize that the use of a “perfusion” DCI threshold for intervention at Aachen alters the temporal dynamics of vital signs, thereby stabilizing the risk scores. With model M1 and an optimal threshold of 0.41, our model predicted 90.9% of DCI events 12 hours before “perfusion” DCI, i.e., 1.6 true alerts for every false alert.
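As a reading aid, a ratio of true to false alerts can be converted into the fraction of alerts that are true (the positive predictive value). The short sketch below assumes the reported ratios are counts of true alerts per false alert; the function name is hypothetical.

```python
def ppv_from_alert_ratio(true_per_false: float) -> float:
    """Fraction of all alerts that are true, given true alerts per false alert."""
    return true_per_false / (1.0 + true_per_false)


houston_ppv = ppv_from_alert_ratio(2.7)  # ratio reported for Houston (model M14)
aachen_ppv = ppv_from_alert_ratio(1.6)   # ratio reported for Aachen (model M1)
```

Under that assumption, roughly 73% of alerts at Houston and 62% of alerts at Aachen would correspond to true events.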

Performance on Angiographic Vasospasm

We further studied the performance of the classifiers on 71 patients with angiographic vasospasm without DCI, who were excluded while creating the models. The ensemble classifier trained with data from the 2.5 days prior to the DCI anchor (Model M5) correctly classified 82% of these patients as non-DCI one day before the anchor (Figure II in the Data Supplement).

Discussion

The current practice for DCI diagnosis is reliant on the availability of a useful exam in already neurologically injured patients (20% of SAH patients are comatose)31 as well as intermittent transcranial dopplers and the attention of expert caregivers in the busy and diurnal hospital setting.32 An automated, continuous monitoring tool has the potential to provide a continuous real-time risk assessment for DCI that is implementable in any ICU and thereby scalable.

Initial efforts towards automated DCI classification using simple summary statistics of vital signs significantly out-performed classifiers trained on resource-intensive information such as TCDs and nursing exams.9 Later work combined time series analysis of vitals with more sophisticated machine learning methods10, 11 to discover hidden signals of a patient state within continuous cardiopulmonary physiologic data. Each of these studies incorporated information from just the first few days post-bleed and boosted prediction to 0.77-0.78 AU-ROC, surpassing the modified Fisher scale, a radiological tool to predict DCI. The early promise of these machine learning models supports the notion of a useful physiologic signal for the classification of patients with DCI; however, these are still prediction models that do not provide an estimate of disease onset time.

In the current study, a prospectively collected observational dataset of demographics and vital signs from a cohort of 310 SAH patients was used to explore optimal machine learning models for DCI classification. Hourly risk scores were generated and externally validated on 2 institutional SAH cohorts.

DCI positive patients had persistently high risk scores at 12 hours before diagnosis, whereas DCI negative patients’ risk scores dropped below the critical threshold. Overfitting and limited generalizability are concerns with machine learning models used for prediction. Testing on external datasets supports the wider applicability of this model.

The patient characteristics of the external datasets differed from those of the model derivation dataset. The Columbia dataset was a consecutive cohort of all SAH, inclusive of mild to severe SAH. The Aachen dataset was more similar to the Columbia dataset in this regard. The Houston dataset consisted solely of modified Fisher 3 and 4 patients, with concomitantly higher Hunt Hess grades and lower GCS. Optimal model selection for generating hourly risk scores differed for the 2 external institutions, with one performing best when inclusive of all data preceding DCI (Houston), and the other performing best when limited to data more immediately preceding DCI (Aachen). This difference will need to be explored. A compelling hypothesis is that vital signs throughout the ICU stay are influenced by clinical care and that there may be more similarities in practices between the US institutions of Houston and Columbia.

Risk scores validated on the Aachen cohort resulted in higher sensitivity but lower specificity. Risk scores validated on the Houston cohort resulted in lower sensitivity but higher specificity. It is encouraging that the model would perform as well as it did in both of these cohorts, even in one with a selected high severity SAH population, as the biggest concern for missing DCI diagnoses is in those with limited consciousness and exam.

There are several limitations to this study. First, since the range of time to DCI onset after SAH is large, when we align patients by individual diagnosis, the amount of data before and after the anchor is inconsistent across patients. We tolerate this variation because alignment by the outcome event should maximize the information in the physiologic signal leading up to it, and our priority is to create models that can best capture this signal. Second, the challenge of causality leakage33 exists when building a classifier using data that may be influenced by the target outcome (e.g. detecting sepsis based on blood cultures that would have been sent only in response to clinical suspicion of sepsis). This is a challenge that must be addressed with prospective application of the model.

There is an abundance of routinely collected physiologic data generated in the NICU. To our knowledge, this is the first study in humans to apply a data-driven automated approach using vital sign time series data to produce hourly classifications of DCI after SAH. To be accepted as a clinical tool, this method must be implemented and compared with existing diagnostic methods, which at the moment are limited to intermittent transcranial dopplers and astute clinical acumen. Such a trial is being planned.

Conclusion

This is the first study to show that real-time hourly risk scores can classify DCI with good accuracy following aneurysmal SAH. These scores were developed applying data-driven machine learning approaches to widely available vital sign data collected in the critical care setting. These models may be generalizable as validation on two independent cohorts demonstrated good reproducibility. Future efforts will focus on affirming that the model is not impaired by causality leakage, comparing it against the clinical approach that exists for detecting DCI, and showing it improves outcome.

Supplementary Material

Supplemental Material

Acknowledgments

Funding: This study was funded by the NIH, NIEHS K01-ES026833 (SP); American Heart Association: 20POST35210653 (MM).

Disclosures:

DR reports funding from Portola Pharmaceuticals outside the submitted work.

JC reports funding from iCE Neurosystems outside the submitted work.

AA reports funding from National Center for Advancing Translational Sciences of the National Institutes of Health NIH under the Miami Clinical Translational Sciences Institute CTSI KL2 Career Development Award UL1TR002736 (AA).

Abbreviations

DCI: Delayed Cerebral Ischemia
aSAH: aneurysmal Subarachnoid Hemorrhage
EEG: Electroencephalogram
NICU: Neurological Intensive Care Unit
TCD: Transcranial Doppler
PBD: Post Bleed Day
CT: Computed Tomography
mRS: modified Rankin Scale
WFNS: World Federation of Neurological Surgeons scale
mFS: modified Fisher Score
HH: Hunt and Hess grade
GCS: Glasgow Coma Scale
HR: Heart Rate
RR: Respiratory Rate
SPO2: Oxygen Saturation
ABP: Mean Arterial Blood Pressure
SBP: Systolic Arterial Blood Pressure
DBP: Diastolic Arterial Blood Pressure
LR: L2-regularized Logistic Regression
SL: Support Vector Machine Linear
SK: Support Vector Machine Kernel
RF: Random Forest
EC: Ensemble Classifier
AU-ROC: Area Under the Receiver Operating Characteristic Curve

Footnotes

Conflicts of interest/Competing interests: None

References

  • 1. Dorsch N. A clinical review of cerebral vasospasm and delayed ischaemia following aneurysm rupture. Acta Neurochir Suppl. 2011;110:5–6. doi: 10.1007/978-3-7091-0353-1_1
  • 2. Eagles ME, Tso MK, Macdonald RL. Cognitive Impairment, Functional Outcome, and Delayed Cerebral Ischemia After Aneurysmal Subarachnoid Hemorrhage. World Neurosurg. 2019;124:e558–e562. doi: 10.1016/j.wneu.2018.12.152
  • 3. Schmidt J, Wartenberg KE, Fernandez A, Claassen J, Rincon F, Ostapkovich ND, Badjatia N, Parra A, Connolly E, Mayer SA. Frequency and clinical impact of asymptomatic cerebral infarction due to vasospasm after subarachnoid hemorrhage. J Neurosurg. 2008;109:1052–9. doi: 10.3171/jns.2008.109.12.1052
  • 4. Mastantuono JM, Combescure C, Elia N, Tramer MR, Lysakowski C. Transcranial Doppler in the Diagnosis of Cerebral Vasospasm: An Updated Meta-Analysis. Crit Care Med. 2018;46:1665–1672. doi: 10.1097/ccm.0000000000003297
  • 5. Claassen J, Hirsch LJ, Kreiter KT, Du EY, Connolly ES, Emerson RG, Mayer SA. Quantitative continuous EEG for detecting delayed cerebral ischemia in patients with poor-grade subarachnoid hemorrhage. Clin Neurophysiol. 2004;115:2699–710. doi: 10.1016/j.clinph.2004.06.017
  • 6. Rosenthal ES, Biswal S, Zafar SF, O’Connor KL, Bechek S, Shenoy AV, Boyle EJ, Shafi MM, Gilmore EJ, Foreman BP, et al. Continuous electroencephalography predicts delayed cerebral ischemia after subarachnoid hemorrhage: a prospective study of diagnostic accuracy. Annals of Neurology. 2018;83:958–69. doi: 10.1002/ana.25232
  • 7. Francoeur CL, Mayer SA. Management of delayed cerebral ischemia after subarachnoid hemorrhage. Crit Care. 2016;20:277. doi: 10.1186/s13054-016-1447-6
  • 8. Lai X, Zhang W, Ye M, Liu X, Luo X. Development and validation of a predictive model for the prognosis in aneurysmal subarachnoid hemorrhage. Journal of Clinical Laboratory Analysis. 2020:e23542.
  • 9. Roederer A, Holmes JH, Smith MJ, Lee I, Park S. Prediction of significant vasospasm in aneurysmal subarachnoid hemorrhage using automated data. Neurocritical Care. 2014;21:444–50. doi: 10.1007/s12028-014-9976-9
  • 10. Park S, Megjhani M, Frey HP, Grave E, Wiggins C, Terilli KL, Roh DJ, Velazquez A, Agarwal S, Connolly ES. Predicting delayed cerebral ischemia after subarachnoid hemorrhage using physiological time series data. J Clin Monit Comput. 2019;33:95–105. doi: 10.1007/s10877-018-0132-5
  • 11. Megjhani M, Terilli K, Frey HP, Velazquez AG, Doyle KW, Connolly ES, Roh DJ, Agarwal S, Claassen J, Elhadad N, et al. Incorporating High-Frequency Physiologic Data Using Computational Dictionary Learning Improves Prediction of Delayed Cerebral Ischemia Compared to Existing Methods. Front Neurol. 2018;9:122. doi: 10.3389/fneur.2018.00122
  • 12. Schmidt JM, Sow D, Crimmins M, Albers D, Agarwal S, Claassen J, Connolly ES, Elkind MS, Hripcsak G, Mayer SA. Heart rate variability for preclinical detection of secondary complications after subarachnoid hemorrhage. Neurocrit Care. 2014;20:382–9. doi: 10.1007/s12028-014-9966-y
  • 13. Provencio JJ. Inflammation in subarachnoid hemorrhage and delayed deterioration associated with vasospasm: a review. Acta Neurochir Suppl. 2013;115:233–8. doi: 10.1007/978-3-7091-1192-5_42
  • 14. Lucke-Wold BP, Logsdon AF, Manoranjan B, Turner RC, McConnell E, Vates GE, Huber JD, Rosen CL, Simard JM. Aneurysmal Subarachnoid Hemorrhage and Neuroinflammation: A Comprehensive Review. Int J Mol Sci. 2016;17:497. doi: 10.3390/ijms17040497
  • 15. Schmidt JM. Heart Rate Variability for the Early Detection of Delayed Cerebral Ischemia. J Clin Neurophysiol. 2016;33:268–74. doi: 10.1097/WNP.0000000000000286
  • 16. Park S, Kaffashi F, Loparo KA, Jacono FJ. The use of heart rate variability for the early detection of treatable complications after aneurysmal subarachnoid hemorrhage. J Clin Monit Comput. 2013;27:385–93. doi: 10.1007/s10877-013-9467-0
  • 17. Fairchild KD, Lake DE, Kattwinkel J, Moorman JR, Bateman DA, Grieve PG, Isler JR, Sahni R. Vital signs and their cross-correlation in sepsis and NEC: a study of 1,065 very-low-birth-weight infants in two NICUs. Pediatr Res. 2017;81:315–321. doi: 10.1038/pr.2016.215
  • 18. Alkhachroum A, Terilli K, Megjhani M, Park S. Harnessing Big Data in Neurocritical Care in the Era of Precision Medicine. Current Treatment Options in Neurology. 2020;22. doi: 10.1007/s11940-020-00622-8
  • 19. Johnson AE, Ghassemi MM, Nemati S, Niehaus KE, Clifton DA, Clifford GD. Machine Learning and Decision Support in Critical Care. Proc IEEE Inst Electr Electron Eng. 2016;104:444–466. doi: 10.1109/JPROC.2015.2501978
  • 20. Lantigua H, Ortega-Gutierrez S, Schmidt JM, Lee K, Badjatia N, Agarwal S, Claassen J, Connolly ES, Mayer SA. Subarachnoid hemorrhage: who dies, and why? Crit Care. 2015;19:309.
  • 21. Connolly ES Jr, Rabinstein AA, Carhuapoma JR, Derdeyn CP, Dion J, Higashida RT, Hoh BL, Kirkness CJ, Naidech AM, Ogilvy CS, et al. Guidelines for the management of aneurysmal subarachnoid hemorrhage: a guideline for healthcare professionals from the American Heart Association/American Stroke Association. Stroke. 2012;43:1711–37. doi: 10.1161/STR.0b013e3182587839
  • 22. Vergouwen MD, Vermeulen M, van Gijn J, Rinkel GJ, Wijdicks EF, Muizelaar JP, Mendelow AD, Juvela S, Yonas H, Terbrugge KG, et al. Definition of delayed cerebral ischemia after aneurysmal subarachnoid hemorrhage as an outcome event in clinical trials and observational studies: proposal of a multidisciplinary research group. Stroke. 2010;41:2391–5. doi: 10.1161/strokeaha.110.589275
  • 23. Steiner T, Juvela S, Unterberg A, Jung C, Forsting M, Rinkel G. European Stroke Organization guidelines for the management of intracranial aneurysms and subarachnoid haemorrhage. Cerebrovasc Dis. 2013;35:93–112. doi: 10.1159/000346087
  • 24. Veldeman M, Albanna W, Weiss M, Conzen C, Schmidt TP, Schulze-Steinen H, Wiesmann M, Clusmann H, Schubert GA. Invasive neuromonitoring with an extended definition of delayed cerebral ischemia is associated with improved outcome after poor-grade subarachnoid hemorrhage. J Neurosurg. 2020:1–8. doi: 10.3171/2020.3.JNS20375
  • 25. Damani R, Mayer S, Dhar R, Martin RH, Nyquist P, Olson DM, Mejia-Mantilla JH, Muehlschlegel S, Jauch EC, Mocco J, et al. Common Data Element for Unruptured Intracranial Aneurysm and Subarachnoid Hemorrhage: Recommendations from Assessments and Clinical Examination Workgroup/Subcommittee. Neurocrit Care. 2019;30:28–35. doi: 10.1007/s12028-019-00736-1
  • 26. Suarez JI, Sheikh MK, Macdonald RL, Amin-Hanjani S, Brown RD, de Oliveira Manoel AL, Derdeyn CP, Etminan N, Keller E, Leroux PD, et al. Common Data Elements for Unruptured Intracranial Aneurysms and Subarachnoid Hemorrhage Clinical Research: A National Institute for Neurological Disorders and Stroke and National Library of Medicine Project. Neurocrit Care. 2019;30:4–19. doi: 10.1007/s12028-019-00723-6
  • 27. Friedman J, Hastie T, Tibshirani R. The elements of statistical learning. vol 1. Springer Series in Statistics. New York; 2001.
  • 28. Han J, Pei J, Kamber M. Data Mining: Concepts and Techniques. 2011.
  • 29. Pedregosa F, Varoquaux G, Gramfort A, Michel V, Thirion B, Grisel O, Blondel M, Prettenhofer P, Weiss R, Dubourg V, et al. Scikit-learn: Machine learning in Python. Journal of Machine Learning Research. 2011;12:2825–2830.
  • 30. Youden WJ. Index for rating diagnostic tests. Cancer. 1950;3:32–35.
  • 31. Quintard H, Leduc S, Ferrari P, Petit I, Ichai C. Early and persistent high level of PS 100beta is associated with increased poor neurological outcome in patients with SAH: is there a PS 100beta threshold for SAH prognosis? Crit Care. 2016;20:33. doi: 10.1186/s13054-016-1200-1
  • 32. Deshmukh H, Hinkley M, Dulhanty L, Patel HC, Galea JP. Effect of weekend admission on in-hospital mortality and functional outcomes for patients with acute subarachnoid haemorrhage (SAH). Acta Neurochir (Wien). 2016;158:829–35. doi: 10.1007/s00701-016-2746-z
  • 33. Kaufman S, Rosset S, Perlich C, Stitelman O. Leakage in data mining: Formulation, detection, and avoidance. ACM Transactions on Knowledge Discovery from Data (TKDD). 2012;6:15.

Data Availability Statement

All relevant data are presented within the article and its supporting information files. Additional information can be obtained upon request to the corresponding author.
