Abstract
Background
Timely surgical decompression improves functional outcomes and survival among children with traumatic brain injury (TBI) and increased intracranial pressure. Previous scoring systems for identifying the need for surgical decompression after TBI in children and adults have had several barriers to use. These barriers include the inability to generate a score with missing data, a requirement for radiographic imaging that may not be immediately available, and limited accuracy. To address these limitations, we developed a Bayesian network to predict the probability of neurosurgical intervention among injured children and adolescents (age 1 to 18 years) using physical exam findings and injury characteristics observable at hospital arrival.
Methods
We obtained patient, injury, transportation, resuscitation, and procedure characteristics from the 2017 to 2019 Trauma Quality Improvement Project database. We trained and validated a Bayesian network to predict the probability of a neurosurgical intervention, defined as undergoing a craniotomy, craniectomy, or intracranial pressure monitor placement. We evaluated model performance using the area under the receiver operating characteristic curve (AUROC) and calibration curves. We evaluated the percentage contribution of each input to the prediction of neurosurgical intervention using relative mutual information (RMI).
Results
The final model included four predictor variables: the Glasgow Coma Scale score (RMI 31.9%), pupillary response (RMI 11.6%), mechanism of injury (RMI 5.8%), and presence of prehospital cardiopulmonary resuscitation (RMI 0.8%). The model achieved an AUROC of 0.90 (95% CI 0.89, 0.91) and had a calibration slope of 0.77 (95% CI 0.29, 1.26) with a y-intercept of 0.05 (95% CI −0.14, 0.25).
Conclusion
We developed a Bayesian network that predicts neurosurgical intervention for all injured children using four factors immediately available on arrival. Compared to a binary threshold model, this probabilistic model may allow clinicians to stratify management strategies based on risk.
Level of Evidence
Prognostic, Level III
Keywords: Bayesian prediction, pediatrics, neurosurgery, intracranial pressure
BACKGROUND
Immediate strategies for improving survival and functional outcomes are limited for children and adults with a severe traumatic brain injury (TBI) (Glasgow Coma Scale [GCS] score < 9) or a traumatic intracranial lesion at any level of severity.(1–5) Options may include serial neurological examination and imaging, intracranial pressure (ICP) monitoring, and immediate operative intervention when there is strong evidence of elevated ICP.(6) Although close observation and the placement of an ICP monitor do not directly improve outcomes, the information obtained from these strategies can guide medical and surgical management.(6) The Brain Trauma Foundation has identified several observable signs, symptoms, and radiographic characteristics of elevated ICP that can guide the decision to proceed with operative intervention. These characteristics include the GCS score, pupillary response, presence of a focal neurological deficit, size of the intracranial lesions, presence of mass effect on computerized tomography (CT), continued neurological deterioration, and medically refractory ICP.(1–5) For children and adults with brain injuries who do not meet these criteria, it is often difficult to determine who will have progression of intracranial hemorrhage or development of intracranial hypertension.
Implementation of a ‘Level 1 Neuro-Trauma (L1N) alert’ reduced the time to anesthesia induction, surgical incision, disposition from the emergency department (ED) to the pediatric intensive care unit (PICU), and ICP monitor placement at a stand-alone American College of Surgeons (ACS) verified Level 1 pediatric trauma program.(7) Activation of this alert increased the accessibility of plasma products and allowed time for the operating room and intensive care unit staff to allocate resources necessary for TBI management. Criteria used to trigger this activation were developed using domain knowledge instead of a data-driven approach. Only one data-driven triage tool has been validated to identify injured children and adolescents at risk for requiring a craniotomy or craniectomy, the Surgical Intervention for Traumatic Injury (SITI) score.(8–10) Using SITI in practice has had several barriers, including the inability to generate a score with missing data, delays that may occur when obtaining radiographic findings needed for the calculation, and exclusion of several categories of children, including those with mild TBI, penetrating head injuries, and skull fractures. A triage tool that identifies children who may require an ICP monitor has yet to be described. To address these limitations, we developed a Bayesian network to predict the probability of neurosurgical intervention (craniotomy, craniectomy, or ICP monitor placement) after pediatric injury, usable for all injured children, using variables immediately available upon hospital arrival. We intend for our data-driven model to triage injured children like the L1N activation criteria to mobilize resources and reduce the time to definitive multidisciplinary TBI management.
METHODS
Subject Selection
To develop and validate a Bayesian network, we used the 2017 to 2019 Trauma Quality Improvement Project (TQIP) database. This trauma registry contains information related to each patient’s demographics, injury and resuscitation characteristics, in-hospital procedures and events, and outcome data.(11) In-hospital procedures and events are categorized within TQIP using International Classification of Diseases Clinical Modification (ICD-CM) codes.(12) A data-driven approach has previously been described for identifying children and adolescents in TQIP who received a craniotomy, craniectomy, or ICP monitor with excellent accuracy using the 9th edition of ICD-CM codes.(12) Because the 10th edition of the ICD-CM (ICD-10-CM) codes became required in the United States in October 2015, we first developed and validated ICD-10-CM computable phenotypes to identify craniotomy, craniectomy, and ICP monitor placement (including parenchymal and ventriculostomy monitors) in TQIP using this approach.(12) The Institutional Review Boards qualified this study as an exempt protocol. We used the Transparent Reporting of a Multivariable Prediction Model for Individual Prognosis or Diagnosis (TRIPOD) guidelines to ensure proper reporting of methods, results, and discussion for the development of the Bayesian network (Supplemental Digital Content 1).
Identification of Neurosurgical Procedures by ICD-10-CM Codes
To develop computable phenotypes, we first used data from a previous prospective cohort study of children and adolescents (<18 years old) diagnosed with TBI at two pediatric level 1 trauma centers between October 2015 and June 2017. We excluded children with a GCS score ≥ 13 if they did not undergo a craniotomy, craniectomy, or an ICP monitor placement within 24 hours of admission. To avoid bias from patients whose mental status was depressed at admission due to factors such as medications, we excluded surviving patients with a GCS < 13 and discharged from the intensive care unit (ICU) within 24 hours of admission without a neurosurgical or critical care intervention (e.g., intubation, central venous line placement, arterial line placement).(13, 14) These prospectively collected data were then linked to trauma registry codes in the format submitted to the National Trauma Data Bank (NTDB) and the TQIP, creating the dataset needed to develop and validate computable phenotypes.
Similar to earlier work using the 9th edition ICD-CM codes, we developed a penalized logistic regression and Boolean classifiers to predict craniotomy, craniectomy, and ICP monitor placement.(12) Each ICD-10-CM code in a Boolean classifier is given equal weight, but the regression models have different coefficients for each code. As recommended by the TRIPOD statement, we used temporal (first 70% training, subsequent 30% testing) validation.(15) Candidate predictors (ICD-10-CM codes) were identified from previously published studies and by review of ICD-10-CM codebooks by clinical experts.
For the penalized logistic regression model, we allowed the elastic net mixing parameter to vary between 0 (ridge regression) and 0.9 (elastic net regression). We selected and validated the optimal model using 5-fold cross-validation with accuracy as the metric to evaluate each model. We also evaluated model performance using the Brier score and discrimination using sensitivity, specificity, and the area under the receiver operating characteristic curve (AUROC).
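As an illustration of the evaluation metrics named above, the following minimal sketch computes accuracy, sensitivity, specificity, and the Brier score from predicted probabilities. The function names and the 0.5 classification threshold are assumptions made for this example; they are not taken from the study's analysis code.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, tn, fn) for binary labels coded 0/1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, fp, tn, fn


def brier_score(y_true, probs):
    """Mean squared difference between predicted probability and outcome."""
    return sum((p - t) ** 2 for t, p in zip(y_true, probs)) / len(y_true)


def classification_metrics(y_true, probs, threshold=0.5):
    """Summarize a probabilistic classifier at one threshold (assumed 0.5)."""
    y_pred = [1 if p >= threshold else 0 for p in probs]
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    return {
        "accuracy": (tp + tn) / len(y_true),
        "sensitivity": tp / (tp + fn) if tp + fn else float("nan"),
        "specificity": tn / (tn + fp) if tn + fp else float("nan"),
        "brier": brier_score(y_true, probs),
    }


# Toy data for illustration only (not study data)
toy_outcomes = [1, 1, 0, 0, 0]
toy_probs = [0.9, 0.4, 0.2, 0.6, 0.1]
toy_metrics = classification_metrics(toy_outcomes, toy_probs)
```

A lower Brier score indicates predicted probabilities that lie closer to the observed outcomes, which is why it complements the threshold-dependent metrics.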
The optimal penalized regression model for classifying ICP monitor placement performed only modestly better than a Boolean classifier using the same ICD-10-CM codes (accuracy 84.1% versus 82.5%) (Table 1). For craniotomy or craniectomy, the Boolean classifier performed better (accuracy 90.5% versus 85.7%). We chose the Boolean classifier to identify craniotomy, craniectomy, and ICP monitor placement for the remainder of this study.
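A Boolean classifier of this kind can be sketched in a few lines: a record is classified as positive when any of its procedure codes appears in the candidate code set. The example below uses the ICP monitor candidate codes listed in Table 1; the function name, the code normalization step, and the example records are hypothetical.

```python
# ICP monitor candidate ICD-10 procedure codes as listed in Table 1
ICP_MONITOR_CODES = frozenset({
    "4A103BD", "4A107BD", "4A007BD", "4A003BD", "009630Z",
    "009600Z", "009640Z", "00H002Z", "00H032Z",
})


def boolean_classifier(procedure_codes, phenotype_codes):
    """Return True if any recorded procedure code is in the phenotype set.

    Every code carries equal weight: one match is enough.
    Codes are upper-cased and stripped first (an assumed cleaning step).
    """
    cleaned = {code.strip().upper() for code in procedure_codes}
    return not cleaned.isdisjoint(phenotype_codes)


# Hypothetical records for illustration only
record_with_monitor = ["0BH17EZ", "4a103bd "]
record_without_monitor = ["0BH17EZ", "0N800ZZ"]

has_monitor = boolean_classifier(record_with_monitor, ICP_MONITOR_CODES)
no_monitor = boolean_classifier(record_without_monitor, ICP_MONITOR_CODES)
```

In contrast, the penalized regression assigns each code its own coefficient, so the same set of codes can yield different predicted probabilities depending on which codes are present.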
Table 1.
Intracranial Pressure Monitor and Craniotomy/Craniectomy Classification Models
| Classifier | ICD-10-CM Procedure Codes | Accuracy (%) | Sensitivity (%) | Specificity (%) | AUC (%) | Brier |
|---|---|---|---|---|---|---|
| NTDB ICP | | | | | | |
| model | 4A103BD (6.1), 4A107BD, 4A007BD (5.9), 4A003BD (6.1), 009630Z (5.7), 009600Z (0.9), 009640Z, 00H002Z (5.9), 00H032Z (5.9) | 84.1 (73.2, 91.1) | 63.6 (51.3, 74.4) | 95.1 (86.8, 98.3) | 85.2 (75.3, 95.2) | 0.170 |
| ad hoc | | 82.5 (71.4, 90.0) | 63.6 (51.3, 74.4) | 92.7 (83.5, 96.9) | 82.5 (71.6, 93.3) | 0.175 |
| NTDB Craniotomy/Craniectomy | | | | | | |
| model | 0N800ZZ (1.0), 0NB00ZZ, 0N9100Z, 0NH104Z, 0N9000Z, 0NB03ZZ (1.0), 0NR307Z, 0N9130Z, 00J00ZZ (1.8), 00903ZZ (1.0), 00B00ZX (1.0), 0NS204Z, 0NS104Z, 0NB20ZZ, 0NS00ZZ, 0NS304Z, 0NB30ZZ, 0NQ20ZZ, 0NB50ZZ, 0NQ50ZZ, 0NR20KZ, 0NQ40ZZ, 0NB10ZZ, 0NR207Z, 0NR107Z, 0NR607Z, 0NT60ZZ, 00B00ZZ, 00Q00ZZ, 00N00ZZ, 00900ZZ | 85.7 (75.0, 92.3) | 30.8 (20.8, 43.0) | 100.0 (94.3, 100.0) | 92.4 (87.7, 97.0) | 0.135 |
| ad hoc | | 90.5 (80.7, 95.6) | 53.8 (41.7, 65.6) | 100.0 (94.3, 100.0) | 94.6 (90.6, 98.7) | 0.095 |
“model” = best penalized regression model selected by cross-validation. Predictors in bold were included in the final model with coefficients shown on the logit scale.
“ad hoc” = Boolean classifier including all of the candidate predictors from the corresponding penalized regression model in the row above, i.e. any of X or Y or Z = yes.
ICP = intracranial pressure; AUC = area under the curve; NTDB = National Trauma Data Bank; Estimates are shown with 95% confidence intervals except the Brier’s score.
Bayesian Belief Network Subject Selection and Characteristics
We included all records of children and adolescents (ages 1 to 18 years old) from the 2017 to 2019 TQIP database, including those classified ‘dead on arrival,’ for model development and validation. We abstracted demographic, injury and resuscitation characteristics, in-hospital procedures and events, and outcomes, including age, sex, weight, the initial emergency department vital signs and GCS score, pupillary reactivity, presence of prehospital cardiopulmonary resuscitation (CPR), the origin of patient before arrival, transportation mode, mechanism of injury, identified ICD-10-CM codes for ICP monitor placement, craniotomy, and craniectomy, and mortality.
Bayesian Model Development
A Bayesian network estimates the probability of a dependent variable using the joint probability distribution of observed or unobserved predictor variables.(16, 17) We used the software package BayesiaLab (Franklin, TN) to train a Bayesian network to predict the probability of neurosurgical intervention following arrival to the hospital among injured children and adolescents using immediately available emergency department features. We assigned this approach the acronym NINJA: Neurosurgery after INJury in pediAtrics. Because patients may require both procedures, we categorized each patient who underwent a neurosurgical procedure into two categories based on their initial procedure: craniotomy or craniectomy with or without ICP monitor placement (craniotomy/craniectomy) and ICP monitor placement alone. To create a binary dependent variable, we combined these two categories into ‘neurosurgical intervention.’
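To illustrate how a probabilistic network can still return a prediction when an input is unobserved, the sketch below uses a deliberately simplified naive Bayes structure with two predictors. This is not the NINJA network: the structure is a simplification, the conditional probability values are invented for illustration, and only the 1.3% prior mirrors the baseline frequency of neurosurgical intervention reported in this study.

```python
# Prior probability of neurosurgical intervention (1 = yes, 0 = no).
# 0.013 mirrors the 1.3% baseline frequency reported in this study.
PRIOR = {1: 0.013, 0: 0.987}

# HYPOTHETICAL conditional probability tables P(finding | intervention).
# These numbers are invented for illustration and are not model estimates.
CPT = {
    "gcs": {
        1: {"3-8": 0.70, "9-12": 0.15, "13-15": 0.15},
        0: {"3-8": 0.03, "9-12": 0.04, "13-15": 0.93},
    },
    "pupils": {
        1: {"both": 0.55, "one": 0.10, "neither": 0.35},
        0: {"both": 0.97, "one": 0.01, "neither": 0.02},
    },
}


def posterior_intervention(evidence):
    """Return P(intervention = 1 | observed evidence) under naive Bayes.

    An unobserved variable (absent or None) is marginalized out; because
    each conditional table sums to 1, it contributes a factor of 1.
    """
    weights = {}
    for target in (0, 1):
        weight = PRIOR[target]
        for variable, value in evidence.items():
            if value is None:
                continue  # missing input: skip rather than fail
            weight *= CPT[variable][target][value]
        weights[target] = weight
    return weights[1] / (weights[0] + weights[1])


p_no_evidence = posterior_intervention({})
p_gcs_only = posterior_intervention({"gcs": "3-8", "pupils": None})
p_both = posterior_intervention({"gcs": "3-8", "pupils": "neither"})
```

With no evidence the function returns the prior, and each additional observed finding updates the probability, which is the property that allows a score to be generated with missing data.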
We a priori selected variables for potential inclusion in the network using domain knowledge, including (1) age, (2) sex, (3) initial emergency department values of systolic blood pressure, heart rate, respiratory rate, oxygen saturation, GCS score, (4) pupillary reactivity, (5) need for prehospital CPR, (6) classification as ‘dead on arrival,’ (7) origin prior to arrival, (8) transportation mode and (9) mechanism of injury (Table 2). We discretized continuous variables using the decision tree algorithm within the software package based on the data rather than defining these values using reference standards.(17) We defined pupillary reactivity as both reactive, one reactive, and neither reactive. Because pupillary reactivity is only reported for patients with an Abbreviated Injury Scale (AIS) head injury severity score ≥ 1, we made two assumptions for completing these missing values.(11) First, we assumed both pupils were reactive when the patient had no assigned AIS head injury score and was not ‘dead on arrival.’ Second, we assumed no pupillary reactivity when a patient had no AIS head injury score assigned and was ‘dead on arrival.’ We defined ‘dead on arrival’ using the coded data field in TQIP.(11) We defined the origin of arrival as transfer from an outside institution and other. We defined transportation mode as by ambulance, air, private vehicle, and ‘other.’ If a child was transported by air and ambulance, we designated the transportation mode as ‘air.’ We consolidated the 27 mechanism of injury categories used in TQIP into six categories based on domain knowledge and the frequency and proportion of patients receiving neurosurgical interventions. These final groups included ‘penetrating by firearm’, ‘fall’, ‘struck by, against,’ ‘motor vehicle crash,’ ‘pedestrian struck,’ and ‘other’ mechanisms. We classified mechanism as ‘other’ if the mechanism was associated with neurosurgical intervention less than the baseline frequency (1.3%). 
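The two pupillary reactivity assumptions and the transportation rule described above can be expressed as simple functions. This is a minimal sketch: the function names, argument names, and category strings are hypothetical, and the ordering of transportation modes other than 'air' over 'ambulance' is an assumption added for the example.

```python
def impute_pupillary_reactivity(pupils, ais_head, dead_on_arrival):
    """Fill a missing pupillary reactivity value using the stated assumptions.

    pupils: 'both reactive', 'one reactive', 'neither reactive', or None
    ais_head: AIS head injury severity score, or None if none was assigned
    dead_on_arrival: True, False, or None if unknown
    """
    if pupils is not None:
        return pupils  # observed value is kept as recorded
    if ais_head is None and dead_on_arrival is False:
        return "both reactive"     # assumption 1: no head injury, not dead on arrival
    if ais_head is None and dead_on_arrival is True:
        return "neither reactive"  # assumption 2: no head injury, dead on arrival
    return None  # otherwise left missing


def designate_transport_mode(modes):
    """Collapse one or more recorded transport modes into a single category.

    Air takes precedence over ambulance, as described in the text; the
    remaining precedence order is an assumption made for this sketch.
    """
    for mode in ("air", "ambulance", "private vehicle"):
        if mode in modes:
            return mode
    return "other"
```

For example, `designate_transport_mode({"ambulance", "air"})` returns `'air'`, matching the rule for children transported by both air and ambulance.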
We used AIS severity scale values of ≥1 to determine the presence of any injury in eight major body regions: head, face, neck, chest, abdomen, spine, extremities, and external.
Table 2.
Summary Statistics of the Training, Calibration, and Validation Data Using the 2017–2019 Trauma Quality Improvement Project Dataset
| Variable | Training: Observed Data (n = 128949) | Training: % Missing | Calibration: Observed Data (n = 128944) | Calibration: % Missing | Validation: Observed Data (n = 128966) | Validation: % Missing |
|---|---|---|---|---|---|---|
| Age, years (median [IQR]) | 11 (5,16) | 0.0 | 11 (5,16) | 0.0 | 11 (5,16) | 0.0 |
| Sex, n (%) | | | | | | |
| Male | 84099 (65.2) | 0.02 | 84236 (65.3) | 0.01 | 84239 (65.3) | 0.01 |
| Female | 44823 (34.8) | | 44692 (34.7) | | 44714 (34.7) | |
| ED vital signs (median [IQR]) | | | | | | |
| Systolic blood pressure | 121 (111,133) | 7.8 | 121 (110,133) | 7.8 | 121 (111,133) | 7.7 |
| Heart rate | 100 (85,116) | 2.4 | 100 (85,116) | 2.4 | 100 (85,116) | 2.4 |
| Respiratory rate | 20 (18,24) | 3.7 | 20 (18,24) | 3.6 | 20 (18,24) | 3.6 |
| Oxygen saturation | 99 (98,100) | 8.2 | 99 (98,100) | 8.3 | 99 (98,100) | 8.2 |
| ED GCS score (median [IQR]) | 15 (15,15) | 6.2 | 15 (15,15) | 6.2 | 15 (15,15) | 6.1 |
| Mechanism of injury, n (%) | | | | | | |
| Firearm | 5993 (4.6) | 1.2 | 6092 (4.7) | 1.2 | 5985 (4.6) | 1.2 |
| Fall | 45212 (35.1) | | 45010 (34.9) | | 44945 (34.8) | |
| Struck by, against | 13217 (10.2) | | 13405 (10.4) | | 13441 (10.4) | |
| Motor vehicle crash | 32502 (25.2) | | 32766 (25.4) | | 32663 (25.3) | |
| Pedestrian struck | 11353 (8.8) | | 11228 (8.7) | | 11588 (9.0) | |
| Other | 19063 (14.8) | | 18835 (14.6) | | 18744 (14.5) | |
| Pre-hospital CPR, n (%) | 1468 (1.1) | 2.4 | 1449 (1.1) | 2.4 | 1423 (1.1) | 2.4 |
| Body region injured, n (%) | | | | | | |
| Head | 38871 (30.1) | 0.0 | 38531 (29.9) | 0.0 | 38872 (30.1) | 0.0 |
| Face | 31990 (24.8) | | 32167 (25.0) | | 32132 (24.9) | |
| Neck | 3328 (2.6) | | 3296 (2.6) | | 3312 (2.6) | |
| Thorax | 17573 (13.6) | | 17506 (13.6) | | 17656 (13.7) | |
| Abdomen/pelvis | 17203 (13.3) | | 17193 (13.3) | | 17210 (13.3) | |
| Spine | 10236 (7.9) | | 10232 (7.9) | | 10317 (8.0) | |
| Extremities | 79398 (61.6) | | 79445 (61.6) | | 79554 (61.7) | |
| External | 11196 (8.7) | | 11310 (8.8) | | 11174 (8.7) | |
| Transfer from OSH, n (%) | 45976 (35.6) | 0.01 | 46028 (35.7) | 0.004 | 46066 (35.7) | 0.005 |
| Dead on arrival, n (%) | 747 (0.6) | 0.005 | 714 (0.6) | 0.005 | 753 (0.6) | 0.01 |
| Neurosurgical intervention, n (%) | 1671 (1.3) | 0.0 | 1669 (1.3) | 0.0 | 1658 (1.3) | 0.0 |
| Time to surgery (hours [IQR]) | 3.1 (1.8,9.6) | 5.4 | 3.2 (1.8,10.7) | 4.1 | 3.1 (1.8,9.6) | 4.9 |
ED: emergency department; CPR: cardiopulmonary resuscitation; ICP: intracranial pressure; GCS: Glasgow Coma Scale; OSH: outside hospital; IQR: interquartile range
We randomly split the data into equally balanced thirds to train, recalibrate, and validate the model. Missing data in the training dataset were imputed during model development using the structural expectation maximization algorithm.(16, 18) We applied the Augmented Markov Blanket supervised learning algorithm to define the machine-driven direct and indirect causal relationships between the need for neurosurgical intervention and the selected variables.(17) Because the strength of these relationships is not equivalent, a uniform structural coefficient threshold may overfit or underfit the machine-driven model.(17) To address this limitation, we augmented the supervised learning algorithm with structural prior learning to select the best model.(17) The software package selected predictors for inclusion in the final model based on ‘mutual information,’ the amount of information contributed by each predictor to the prediction.(17, 19) We ranked the contribution of each variable in the final model as a predictor of the dependent variable using relative mutual information (RMI), the percentage equivalent of mutual information.(17)
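For readers unfamiliar with mutual information, the sketch below computes it from paired categorical observations. It assumes that relative mutual information is the mutual information between a predictor and the outcome divided by the entropy of the outcome; that normalization is our reading of the term and may differ in detail from the software package's implementation. Function names and toy data are hypothetical.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (bits) of a list of categorical values."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def mutual_information(xs, ys):
    """Mutual information (bits) between two paired categorical lists."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum(
        (c / n) * math.log2((c / n) / ((px[x] / n) * (py[y] / n)))
        for (x, y), c in pxy.items()
    )


def relative_mutual_information(xs, ys):
    """Assumed definition: I(X; Y) / H(Y), the share of outcome
    uncertainty explained by the predictor (0 to 1)."""
    h = entropy(ys)
    return mutual_information(xs, ys) / h if h > 0 else 0.0


# Toy data for illustration only
outcome = [0, 0, 1, 1]
perfect_predictor = [0, 0, 1, 1]      # fully determines the outcome
unrelated_predictor = [0, 1, 0, 1]    # independent of the outcome

rmi_perfect = relative_mutual_information(perfect_predictor, outcome)
rmi_unrelated = relative_mutual_information(unrelated_predictor, outcome)
```

Under this definition, a value of 31.9% would mean that knowing the predictor removes roughly one third of the uncertainty about the outcome.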
Model Recalibration and Validation
Because the initial model over-predicted the probability of neurosurgical intervention, we recalibrated the model using isotonic regression.(20) We performed internal model validation using the test data (Figure 1). Model performance was assessed using discrimination and calibration. We evaluated discrimination using the area under the receiver operating characteristic (AUROC) curve. AUROC values were rated as ‘poor’ to ‘excellent’ using standard values.(21) Calibration was evaluated graphically using a calibration plot of observed versus predicted values and by evaluating the slope (optimal value 1) and intercept (optimal value 0) of the best fit line.(20, 22–25) To assess model performance at a probability threshold, we identified the threshold that maximized the Matthews correlation coefficient (MCC) using the calibrated training dataset probabilities. At this threshold, we evaluated the 2×2 confusion matrix metrics using the validation dataset, including sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), accuracy, and MCC. We presented data as mean and standard deviation (SD) or median with interquartile range (IQR) based on data distributions. We defined significance using two-tailed tests as p<0.05. Statistical analysis was performed using SAS 9.4 (Cary, NC) and figures were developed using GraphPad PRISM (San Diego, CA).
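Isotonic regression recalibration can be sketched with the pool-adjacent-violators algorithm, which fits a non-decreasing step function mapping raw model probabilities to observed outcome frequencies. The implementation below is a minimal illustration with hypothetical function names and toy data; it is not the procedure used in the study's software.

```python
import bisect


def pool_adjacent_violators(values):
    """Return the non-decreasing least-squares fit to a sequence."""
    blocks = []  # each block is [mean, number of points pooled]
    for v in values:
        blocks.append([float(v), 1])
        # Merge backwards while the sequence of block means decreases
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            m2, c2 = blocks.pop()
            m1, c1 = blocks.pop()
            blocks.append([(m1 * c1 + m2 * c2) / (c1 + c2), c1 + c2])
    fitted = []
    for mean, count in blocks:
        fitted.extend([mean] * count)
    return fitted


def fit_isotonic_calibrator(raw_probs, outcomes):
    """Fit on a calibration set; returns sorted raw scores and fitted values.

    Ties in raw score are ordered with events first so that tied
    points are always pooled into a single block.
    """
    order = sorted(range(len(raw_probs)),
                   key=lambda i: (raw_probs[i], -outcomes[i]))
    xs = [raw_probs[i] for i in order]
    ys = pool_adjacent_violators([outcomes[i] for i in order])
    return xs, ys


def calibrate(p, xs, ys):
    """Map a new raw probability to a calibrated one (step-function lookup)."""
    i = bisect.bisect_right(xs, p) - 1
    return ys[max(i, 0)]


# Toy calibration set for illustration only
raw = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6]
events = [0, 0, 1, 0, 1, 1]
grid, fitted_values = fit_isotonic_calibrator(raw, events)
```

Because the fitted function is monotone, recalibration changes the probability values without changing the ranking of patients, so discrimination is largely preserved.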
Figure 1.

Stacked Histograms of Model Prediction Probabilities after Isotonic Regression Adjustment from Validation Dataset
RESULTS
Study Population Overview
The study population consisted of 386,859 injured children and adolescents, including 128,949 (33.3%) patients in the training dataset, 128,944 (33.3%) patients in the calibration dataset, and 128,966 (33.3%) in the test dataset (Table 2). We did not observe any clinical or statistical differences between these data (Table 2). A total of 4,998 patients underwent a neurosurgical intervention during their hospitalization within a median of 4.0 hours of arrival (3.2 hours [interquartile range (IQR) 1.8,10.8]). Among these patients, 2,508 (50.2%) underwent a craniotomy/craniectomy and 2,490 (49.8%) had an ICP monitor placed (Table 3). The median time after arrival to the hospital was 2.9 hours (IQR 1.7, 13.6) for craniotomy/craniectomy and 3.3 hours (IQR 1.9, 7.5) for ICP monitor placement (Table 3). Among patients who had an ICP monitor placed, 296 (11.9%) required a craniotomy/craniectomy later in their hospital stay. The median time to craniotomy/craniectomy following ICP monitor placement was 41 hours (IQR 9.1, 110.6). Patients who first received an ICP monitor differed from those who received a craniotomy/craniectomy based on several characteristics, including age, heart rate, respiratory rate, GCS score, pupillary reactivity, mechanism of injury, presence of prehospital CPR, classification as ‘dead on arrival’, transfer status, body regions injured, the number of body regions injured, and the time to neurosurgical intervention (Supplementary Digital Content 2).
Model Performance
The final model included four predictor variables: GCS score, pupillary reactivity, mechanism of injury, and presence of prehospital CPR. The GCS score had the largest contribution to prediction (RMI 31.9%), followed by pupillary reactivity (RMI 11.6%), mechanism of injury (RMI 5.8%), and occurrence of prehospital CPR (RMI 0.8%) (Figure 2). Model discrimination was ‘good’ when using only the GCS score (AUROC 0.87, 95% CI 0.86, 0.88) and improved to ‘excellent’ (AUROC 0.90, 95% CI 0.89, 0.91) with inclusion of pupillary response, mechanism of injury, and presence of prehospital CPR. The model was well calibrated, with a calibration curve slope of 0.77 (95% CI 0.29, 1.26) and an intercept of 0.05 (95% CI −0.14, 0.25) (Figure 3).
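The two performance measures reported here can be illustrated as follows. The AUROC is computed as the probability that a randomly chosen patient with the outcome receives a higher predicted probability than one without it. The calibration line is obtained by grouping predictions into bins and fitting a least-squares line of observed frequency on mean predicted probability; the ten equal-width bins and the function names are assumptions made for this sketch and may differ from the study's procedure.

```python
def auroc(y_true, scores):
    """AUROC as the fraction of (event, non-event) pairs ranked correctly.

    Ties count as half. Quadratic in sample size; fine for small examples.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def calibration_line(y_true, probs, n_bins=10):
    """Slope and intercept of observed frequency versus mean prediction.

    Predictions are grouped into equal-width bins (an assumed scheme);
    at least two non-empty bins are required.
    """
    bins = {}
    for t, p in zip(y_true, probs):
        index = min(int(p * n_bins), n_bins - 1)
        bins.setdefault(index, []).append((p, t))
    xs = [sum(p for p, _ in group) / len(group) for group in bins.values()]
    ys = [sum(t for _, t in group) / len(group) for group in bins.values()]
    mean_x, mean_y = sum(xs) / len(xs), sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Toy data for illustration only
toy_auc = auroc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])

# Perfectly calibrated toy predictions: 1 of 4 events at 0.25, 3 of 4 at 0.75
toy_probs = [0.25] * 4 + [0.75] * 4
toy_events = [1, 0, 0, 0, 1, 1, 1, 0]
toy_slope, toy_intercept = calibration_line(toy_events, toy_probs)
```

A slope below 1, as observed here, suggests that predictions at the extremes are somewhat too extreme, while an intercept near 0 suggests little systematic over- or under-prediction.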
Figure 2.

Bayesian Belief Network with Relative Mutual Information of Predictor Variables
Figure 3.

Performance Metrics for NINJA Using the Validation Subset. (A) Calibration Curve and (B) Receiver Operating Characteristic Curve
Among patients with blunt injuries, performance was excellent (AUROC 0.90, 95% CI 0.89, 0.91) and the model’s output probabilities were well calibrated (calibration slope 0.84, 95% CI 0.39, 1.28; y-intercept 0.04, 95% CI −0.14, 0.22) with inclusion of all observed and unobserved predictor variables (Figure 4). Among patients with penetrating injuries, performance was good (AUROC 0.88, 95% CI 0.89, 0.92) and the probabilities generated were also well calibrated (calibration slope 0.97, 95% CI 0.84, 1.1; y-intercept −0.01, 95% CI −0.04, 0.03) (Figure 4).
Figure 4.

Calibration Curves for (A) Blunt and (B) Penetrating Injury Types and Receiver Operating Characteristic Curves for (C) Blunt and (D) Penetrating Injury Types using the Validation Subset
The probability threshold for NINJA that optimized MCC was 14.5%. At this threshold, NINJA had a specificity of 0.97 (95% CI 0.97, 0.97), sensitivity of 0.64 (95% CI 0.62, 0.66), PPV of 0.23 (95% CI 0.22, 0.23), NPV of 0.99 (95% CI 0.99, 0.99), accuracy of 0.97 (95% CI 0.94, 0.97), and MCC of 0.37.
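The threshold search and the MCC itself can be sketched as follows. The first two functions compute the MCC from a confusion matrix and scan candidate thresholds for the maximum; the third uses the standard identity MCC = sqrt(sensitivity × specificity × PPV × NPV) − sqrt(FNR × FPR × FOR × FDR) to check the internal consistency of rounded rates. Function names and toy data are hypothetical.

```python
import math


def mcc(tp, fp, tn, fn):
    """Matthews correlation coefficient from confusion matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


def best_mcc_threshold(y_true, probs):
    """Scan each distinct predicted probability as a threshold and
    return (threshold, mcc) for the one with the highest MCC."""
    best_threshold, best_value = None, -1.0
    for threshold in sorted(set(probs)):
        tp = fp = tn = fn = 0
        for y, p in zip(y_true, probs):
            if p >= threshold:
                tp, fp = tp + (y == 1), fp + (y == 0)
            else:
                fn, tn = fn + (y == 1), tn + (y == 0)
        value = mcc(tp, fp, tn, fn)
        if value > best_value:
            best_threshold, best_value = threshold, value
    return best_threshold, best_value


def mcc_from_rates(sensitivity, specificity, ppv, npv):
    """MCC from the four rates, using the standard identity."""
    agree = math.sqrt(sensitivity * specificity * ppv * npv)
    disagree = math.sqrt((1 - sensitivity) * (1 - specificity)
                         * (1 - npv) * (1 - ppv))
    return agree - disagree


# Toy data for illustration only
toy_threshold, toy_mcc = best_mcc_threshold(
    [0, 0, 0, 1, 0, 1], [0.01, 0.02, 0.05, 0.2, 0.1, 0.6])

# Consistency check using the rounded rates reported in the text
reported_mcc = mcc_from_rates(0.64, 0.97, 0.23, 0.99)
```

Applied to the rounded values reported above, the identity gives approximately 0.37, in agreement with the reported MCC.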
DISCUSSION
In this study, we developed and validated NINJA, a Bayesian network that predicts the probability of neurosurgical intervention following injury among children and adolescents. This model generates a probability using variables observable soon after patient arrival rather than laboratory or radiographic findings that take longer to obtain. Our model had excellent discrimination and was well calibrated. The incorporation of additional examination findings and injury characteristics (pupillary response, mechanism of injury, and prehospital CPR) improved model performance compared to using the GCS score alone, a finding consistent with other studies related to timing to neurosurgical intervention and outcomes.(10,12) The model performed similarly among patients with blunt and penetrating injuries.
Despite recommending timely surgical intervention, the Brain Trauma Foundation does not define the optimal time threshold because the evidence to support one threshold over another is controversial.(1–5, 26) Two explanations may account for this controversy. First, injured adults and children who require an operation early after arrival to the hospital may be more severely injured.(27, 28) Second, children with brain injuries are more likely than adults to present to an ED unequipped to manage complex brain injuries, where they spend on average more than two hours.(27–29) Among children with brain injuries who require transfer for surgical management, intracranial hemorrhage may evolve during this time period and increase the severity of brain injury. In our study, we observed that 39.4% of children who had a craniotomy or craniectomy as their initial procedure were evaluated first at another institution, a finding consistent with other studies.(30, 31) Although an outcomes analysis was beyond the scope of this study, transfer between emergency departments is associated with higher mortality and longer hospital and intensive care unit lengths of stay in injured adults and children when controlling for age, injury severity score, and presence of injury to the head, neck, and extremities.(27, 29, 32, 33) NINJA may prompt providers at non-trauma hospitals or trauma hospitals without neurosurgical capabilities to initiate the transfer process earlier, improve communication between institutions, and sometimes avoid the need for repeat laboratory or imaging studies.
Accurate and timely diagnosis of a severe brain injury is a requirement for providing early management that can improve functional outcomes and survival. Emergency and critical care providers often use heuristic reasoning to diagnose and develop a plan of care.(34) Heuristics are mental shortcuts that allow for quick and efficient judgments during time-critical and high-pressure situations and are tuned by experience.(34) When applied to unfamiliar scenarios, decisions based on heuristic reasoning are more prone to unintended outcomes than those based on a conscious and deliberate approach.(34) Because uncertainty has been cited by trauma surgeons as a common reason for delayed diagnosis, decision support aids have been developed to identify life-threatening injuries, including hemorrhagic shock and severe TBI.(8, 35–41) When clinicians are provided an estimate of risk from a decision support aid, their diagnostic accuracy can improve over clinical judgment alone and the time to management can be reduced.(35, 42) Decision support aids like NINJA may provide additional contextual information to improve the timeliness of diagnosis and management of brain injuries.
The SITI scoring system is a scale validated at a binary threshold to identify injured children and adults who may receive a craniotomy or craniectomy within 24 hours of arrival.(10) The cutoff values for the SITI scale were determined using an AUROC analysis to identify the optimal sensitivity and specificity.(10) As with NINJA, the number of children who received an intervention was small compared to the population used for validation. When the numbers of positive cases (neurosurgical intervention) and negative cases (no neurosurgical intervention) are highly imbalanced, this method can select a threshold that overestimates the predictive performance of the model.(20–23) MCC accounts for imbalanced data and class swapping and provides a more accurate representation of a model’s predictive ability because it combines sensitivity, specificity, PPV, and NPV into a single value.(24–26) Although the sensitivity, PPV, and NPV of NINJA were lower than those of the SITI score, direct comparison between NINJA and the SITI scale is not possible for several reasons, including different inclusion and exclusion criteria and outcome definitions. In addition to serving as a performance metric, the MCC can be used to identify a probability threshold for binary predictors that may exceed the predictive performance of a threshold generated by the intersection of sensitivity and specificity.
Our study has several limitations. First, this study used retrospective data from trauma centers that participate in TQIP. Future prospective studies are needed to validate our findings in other settings. Second, TQIP does not report data from injured children under one year of age, limiting our model to older children and adolescents. Third, TQIP only contains pupillary reactivity data for patients with an AIS head injury score >0. Because we had to make assumptions about pupillary reactivity for patients without head injuries, prospective validation of our approach for classification is needed. Fourth, our model predicts the probability of a child or adolescent undergoing a neurosurgical intervention and not the actual need for neurosurgical intervention. Future studies are needed that include adjudication of the need for neurosurgical intervention. Fifth, most children who present with signs of life and a penetrating head injury will have an immediate neurosurgical intervention. NINJA may be most applicable for predicting neurosurgical intervention in closed or blunt head injuries, when the need is less clinically apparent. Finally, our model did not include laboratory or radiographic findings. Although we intentionally excluded these metrics for the purpose of this study, incorporation of these findings will likely improve model performance.
Recognition of and early intervention for TBI are needed to achieve the best functional outcomes and highest survival. Our Bayesian network, NINJA, trained on a national cohort of injured children and adolescents, accurately predicts the probability of undergoing neurosurgical intervention based on immediately observable factors. Although identification of outcomes associated with our model was beyond the scope of this study, this probabilistic model also allows stratification of risk that can help prioritize the initial evaluation and management of injured children and adolescents.
Supplementary Material
Supplementary Digital Content 2. Comparison of Demographic, Injury, and Resuscitation Characteristics Between Neurosurgical Procedures in the Training, Calibration, and Validation Datasets
Supplementary Digital Content 1: TRIPOD Statement
Acknowledgments
This work was supported by the National Institutes of Health: award number(s) R01LM011834, K23HD074620, and R03HD094912.
Footnotes
The authors have no conflict of interest to report.
An abstract related to this project was presented at the American College of Surgeons Committee on Trauma Local Resident Paper Competition, October 21, 2022, in Washington, D.C. and at the American College of Surgeons Committee on Trauma Regional Resident Paper Competition, December 3, 2022, in Baltimore, MD.
