Abstract
Regression models that relate admission variables to outcome have been the focus of Traumatic Brain Injury (TBI) research. Although practical and well established, these approaches do not evaluate interactions between predictors. We therefore applied a set-theoretic logical analysis to the Corticosteroid Randomization after Significant Head Injury (CRASH) trial database. Complete data analysis of 6945 patients demonstrated that 9 different configurations of admission variables were sufficient for favorable outcome (moderate disability or good recovery) in 87.5% of all cases and explained 57% of favorable outcomes. We also evaluated the contrasting configurations for unfavorable versus favorable outcome. Results are largely in line with the findings of previous studies; however, the influence of age fell behind that of the GCS components, which is unexpected. Specifying a combination of admission parameters that is likely to translate into a given clinical outcome is appealing from a clinician’s perspective, so our results have considerable translational value.
Introduction
Traumatic Brain Injury is a significant source of morbidity and mortality. The number of people living with TBI-related disability is quoted to be 5.3 million in the United States1 and 7.7 million in the old member states of the European Union2. Furthermore, TBI affects a younger population (<45 years), which contributes to its devastating impact on society. Prognostic models have given increasing insight into predictor importance, highlighting patient age, motor response and imaging findings as the most influential predictors of outcome3. These findings helped tailor our assessment protocols and pointed out the variables that should be gathered for clinical trials3. In the past, TBI studies have investigated both multi-variable and single-variable models to assess the prognostic strength of variables on TBI outcome4, 5. Multi-variable models3, 6, 7 focus on development and assessment of the combined effect of multiple variables on the outcome, while single-variable approaches focus on assessing the prognostic strength of one particular variable8. More recently, machine-learning methods such as Bayesian Networks have been applied to TBI databases and proved to be an appealing way to formalize intuitive as well as unexpected associations between variables7.
Model specifications for multi-variate approaches using regression often rely on the inherent assumption that each variable has an independent effect on the outcome9. These techniques are known to assess the net effect10 of a set of variables on an outcome and are not generally concerned with configurations and interactions of variables. To understand complex biological conditions, the interactions between variables need to be studied in a multidimensional manner.
In reviewing the methods that assess the effect of multiple variables on TBI outcome, we note that the mainstream techniques are limited in expressing the interactions between variables and the role of these interactions in predicting the outcome. This means that in these techniques, the data are assumed to have just one ready answer for the magnitude of a variable’s effect on the outcome. When interaction terms are not modelled adequately, the accuracy of estimates in regression approaches can be affected by model misspecification11. If interaction terms are not modelled, the effects of individual independent variables are likely to be over-estimated.
Modeling interactions in a multi-variate analysis is not a straightforward task. Starting from single variables, all possible combinations of variables need to be investigated. Depending on the number of variables, multiple models can be generated, and the validation of these models is non-trivial. Conventional statistical methods cannot account for situations in which only specific combinations of variables reveal their impact on the outcome (conjunctural causation) or in which all paths that lead to an outcome need to be uncovered simultaneously (equifinality). These methods also fall short in explaining situations in which a given combination of variables contributes to the presence of an outcome but at the same time is irrelevant for the absence of that outcome (causal asymmetry)12.
Despite the depth and breadth of recent investigations, there is limited generalized knowledge to model the complex interaction of variables and the prognostic value of these interactions in TBI. In this study our goal is to systematically investigate these interactions. While considering that the predictors of favorable outcome in TBI are not necessarily the negation or reversal of predictors of unfavorable outcome, we study the interaction of variables causative to this asymmetry, in a multi-dimensional, multi-variate manner.
Set-theoretic logical analysis methods can detect recurring causal patterns17 and are well suited to exploring a configurational model of TBI outcome. For this, we apply the method of Qualitative Comparative Analysis12, 10, 18 (QCA), which, unlike statistical approaches, can address the three important phenomena of conjunctural causation, equifinality and causal asymmetry inherent in modelling configurations18. The general assumption behind the configurational approach applied here is that the interactions or combinations of different predictor variables can explain the difference in outcome classes. Hence, whereas a statistical approach such as regression provides an estimate of the impact of each study variable on the outcome in a specified model, QCA allows a study factor to participate in different configurations affecting the outcome.
The paper proceeds as follows. First, we briefly cover the background on TBI and the current state of research in this area and introduce the explanatory variables included in our study. Next, we explain the analytical framework behind our study, followed by the research design. We then present the QCA results and offer a more substantive interpretation of risk patterns before concluding the paper with an assessment of the predictive power of the model compared to that of a simple logistic regression model, followed by a discussion.
Prognostic Models and Predictor Variables in TBI
The International Mission for Prognosis and Analysis of Clinical Trials in TBI (IMPACT)13 set forth three prognostic models with different levels of complexity, using well-known predictors (age, Glasgow Coma Motor Score, and pupillary reactivity), computed tomographic characteristics (CT classification and traumatic subarachnoid hemorrhage), secondary insults (hypoxia or hypotension) and laboratory values on admission (Hb and glucose)3, 14. These models can predict 6-month outcome in patients with severe or moderate TBI with good discriminative ability based on the Area Under Curve (AUC)13. Assessment and validation of these widely accepted prediction models on different cohorts has been the focus of many investigations. Externally, the IMPACT models were validated against the Corticosteroid Randomization after Significant Head Injury (CRASH)15 trial findings. The CRASH trial included 10008 cases of patients with traumatic head injury within 8 hours of clinical assessment from 239 hospitals in 29 countries.
We have based our current study on the clinically relevant variables from previous studies by IMPACT and CRASH researchers, who identified age, motor score and imaging abnormalities as important predictors of clinical outcome in TBI16, 3, 7. Study variables include demographics, injury characteristics, computed tomography (CT) findings and the components of the Glasgow Coma Scale (GCS: motor response, verbal response and eye opening). The outcome measure was dichotomized at 6 months as death or severe disability versus moderate disability or good recovery.
From the 10008 cases in the CRASH dataset, about a third had one or more missing values and were omitted from our analysis. Our analysis is therefore based on the 6945 cases that had no missing values. Table 1 describes the characteristics of patient data in the CRASH dataset. Missing CT findings were responsible for the majority of the excluded cases (2191 of the 10008 patients, 21.9%). For the majority of these patients (2063), a CT brain scan was not performed at all, whereas only 128 had one or more imaging findings not recorded in the dataset. We considered multiple imputation of missing data, but this would be technically difficult to interface with the subsequent analysis. Furthermore, previous studies with the CRASH trial dataset found no difference between imputed and complete datasets6. We therefore chose to undertake a complete data analysis rather than imputing missing values. Another consideration regarding the dataset was the better early outcomes (14 days) for high-income countries compared with low-middle income regions. The 6-month outcomes (used in our study) were, however, similar between income regions.
Table 1.
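Variable category | Variable (abbreviation) | Category | Total cases with no missing values |
---|---|---|---|
Epidemiology | Sex (sex) | Male | 5706 |
Epidemiology | Sex (sex) | Female | 1239 |
Epidemiology | Age (age) | <20 | 892 |
Epidemiology | Age (age) | 20-24 | 1191 |
Epidemiology | Age (age) | 25-29 | 860 |
Epidemiology | Age (age) | 30-34 | 754 |
Epidemiology | Age (age) | 35-44 | 1199 |
Epidemiology | Age (age) | 45-54 | 899 |
Epidemiology | Age (age) | >55 | 1150 |
Epidemiology | Injury Cause (cause) | Road traffic accident | 4780 |
Epidemiology | Injury Cause (cause) | Fall >2 meters | 920 |
Epidemiology | Injury Cause (cause) | Other | 1245 |
Epidemiology | Major extracranial injury (ec) | Yes | 1638 |
Epidemiology | Major extracranial injury (ec) | No | 5307 |
Assessment | Eye opening (eye) | No response | 2680 |
Assessment | Eye opening (eye) | To pain | 1261 |
Assessment | Eye opening (eye) | To verbal stimulus | 1764 |
Assessment | Eye opening (eye) | Spontaneous | 1240 |
Assessment | Motor response (motor) | No response | 601 |
Assessment | Motor response (motor) | Extension | 407 |
Assessment | Motor response (motor) | Abnormal flexion | 515 |
Assessment | Motor response (motor) | Withdrawal | 933 |
Assessment | Motor response (motor) | Localises | 2723 |
Assessment | Motor response (motor) | Follows commands | 1766 |
Assessment | Verbal response (verbal) | No response | 2640 |
Assessment | Verbal response (verbal) | Incomprehensible sounds | 1124 |
Assessment | Verbal response (verbal) | Single words | 821 |
Assessment | Verbal response (verbal) | Confused | 2006 |
Assessment | Verbal response (verbal) | Oriented | 354 |
Assessment | Pupillary response (pupils) | Both reactive | 5791 |
Assessment | Pupillary response (pupils) | No response unilateral | 496 |
Assessment | Pupillary response (pupils) | No response | 658 |
Image findings | Petechial haemorrhage (phm) | Yes | 1974 |
Image findings | Petechial haemorrhage (phm) | No | 4971 |
Image findings | Subarachnoid bleed (sah) | Yes | 2206 |
Image findings | Subarachnoid bleed (sah) | No | 4739 |
Image findings | Obliterated 3rd ventricle or basal cisterns (oblt) | Yes | 1663 |
Image findings | Obliterated 3rd ventricle or basal cisterns (oblt) | No | 5282 |
Image findings | Midline shift (mdls) | Yes | 1021 |
Image findings | Midline shift (mdls) | No | 5924 |
Image findings | Hematoma (hmt) | Yes | 2718 |
Image findings | Hematoma (hmt) | No | 4227 |
Outcome | Outcome at 6 months | Death or severe disability | 2763 |
Outcome | Outcome at 6 months | Moderate disability or good recovery | 4182 |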
Qualitative Comparative Analysis
The method of Qualitative Comparative Analysis (QCA) carries potential for the analysis of complex dependencies in configurational data12. Ragin describes QCA as “an analytic technique designed specifically for the study of cases as configurations of aspects, conceived as combinations of set memberships”12. A configuration is a combination of variables that consistently produces (i.e. is sufficient for) the outcome10.
At its core, QCA is based on ideas from the field of logic synthesis19 to obtain the minimal Boolean sum-of-products (SOP) formulas that fully represent a given truth table of variables. The truth table lists all logically possible combinations of the variables based on the dataset included in the study.
The core algorithm in QCA, the Quine-McCluskey20, 21 algorithm, was established in the 1950s and is used for minimization of Boolean logic formulas to find the smallest logically valid combinations of variables that have the largest coverage over all the cases under investigation. The minimization process is based on repeatedly applying three laws of logic: 1) absorption (e.g. x1·x2 + x1·x2’ = x1), 2) idempotency or redundancy (e.g. x1 + x1 = x1), and 3) the law of excluded middle (e.g. x1 + x1’ = 1).
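The merge step that drives this minimization can be sketched in a few lines of Python. This is a minimal illustration only, not the fsQCA implementation: it generates prime implicants by repeatedly applying x1·x2 + x1·x2’ = x1 and omits the subsequent step of selecting a minimal cover. The function names and the three toy truth table rows are assumptions made for the example.

```python
from itertools import combinations

def merge(a, b):
    """Combine two implicants that differ in exactly one fixed position.

    Implicants are strings over '0', '1' and '-' ('-' = variable eliminated).
    Returns the merged implicant, or None if the pair cannot be combined.
    """
    diff = [i for i, (p, q) in enumerate(zip(a, b)) if p != q]
    if len(diff) == 1 and "-" not in (a[diff[0]], b[diff[0]]):
        i = diff[0]
        return a[:i] + "-" + a[i + 1:]
    return None

def prime_implicants(minterms):
    """Repeat the merge step until no pair of implicants can be combined."""
    current, primes = set(minterms), set()
    while current:
        merged, used = set(), set()
        for a, b in combinations(sorted(current), 2):
            m = merge(a, b)
            if m is not None:
                merged.add(m)
                used.update((a, b))
        primes |= current - used  # implicants that could not be merged further
        current = merged
    return primes

# Toy rows over (x1, x2, x3) that show the outcome; hypothetical, not CRASH data.
rows = ["110", "111", "011"]
print(sorted(prime_implicants(rows)))
```

In this toy case the three rows reduce to two implicants, each with one variable eliminated, which is the same role the dashes play in the configuration tables.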
The Quine-McCluskey algorithm like any other logical analysis method is not concerned with the empirical validity of the formulas that are being discovered. It is the role of the analyst to design a valid foundation for analysis and then to assess the empirical validity of the findings. After listing all variables in a truth table, the analyst needs to select the threshold at which sufficient evidence for the outcome is defined. For example, if the analyst wants to uncover all combinations of variables that lead to a certain outcome 85% of the time, the sufficiency score needs to be set to 85%. All combinations of conditions that meet this threshold are then included in further analysis. The analyst can also define the minimum number of occurrences of a certain combination for it to be included in the study. This gives the analyst the choice to include for example all combinations that appeared at least two times for favorable outcome.
The parameters of fit in QCA are consistency and coverage12. These parameters assess how consistently a combination of conditions appears in the data and the degree to which the findings cover or explain the dataset.
Steps of Analysis
For analysis, we used fsQCA22, software developed by Ragin12 for configurational analysis. An implementation of QCA in R23 was also used for replication and comparison.
The steps are schematically shown below (Figure 1).
Variable Selection and Dimensionality Reduction
Since the computational cost of an exact multivariate logical analysis increases with the number of variables included in the study, the algorithms used with these methods cannot process a large number of variables. The predictor variables in the TBI dataset are nominal and multi-valued. When flattened, the total number of variables in the truth table sums to 36 (including the outcome variable). An exact analysis of 35 variables and one outcome requires 1.8 petabytes of memory and could not be run on the conventional lab computers at the university (6th Generation Intel® Core™ i7-6700T processor (8M cache, up to 3.60 GHz), 12 GB memory, 1 TB hard drive). For this reason, we need to select the most informative variables and consider reducing the granularity of multi-value variables by merging multiple sub-categories.
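The size of the flattened truth table can be checked with a short calculation. The sketch below assumes a particular flattening convention (a two-level variable keeps a single column, a k-level variable expands into k columns); this convention is our assumption, chosen because it reproduces the 35 predictor columns stated above from the category counts in Table 1. It does not attempt to reproduce the memory estimate.

```python
# Number of categories per predictor, read off Table 1.
levels = {
    "sex": 2, "age": 7, "cause": 3, "ec": 2,
    "eye": 4, "motor": 6, "verbal": 5, "pupils": 3,
    "phm": 2, "sah": 2, "oblt": 2, "mdls": 2, "hmt": 2,
}

def flattened_columns(levels):
    """Assumed convention: binary predictors keep one column, k-level ones get k."""
    return sum(1 if k == 2 else k for k in levels.values())

n_predictors = flattened_columns(levels)
n_rows = 2 ** n_predictors  # logically possible rows of the truth table

print(n_predictors, n_rows)
```

Each additional binary column doubles the number of logically possible rows, which is why removing or merging even a few columns changes the feasibility of an exact analysis.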
Result
We employed the binary decision tree algorithm RPART24, which is an implementation of Classification and Regression Trees25 (CART) in R, to identify the most informative variables and the cut-off point for each multi-level variable. We pruned the resulting decision tree using two different complexity parameters (0.001 and 0.01) and evaluated the predictive power of the resulting models based on the Area Under Curve (AUC). Table 2 compares the AUC of the two models with that of the original CRASH dataset. DeLong’s test was used to formally compare the ROC curves for the different models.
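How a classification tree arrives at a cut-off point for an ordered multi-level variable can be illustrated with a small stand-alone sketch. It is not RPART: it evaluates every cut of one ordered variable using Gini impurity, a common CART splitting criterion, and ignores complexity-parameter pruning. The per-level counts are invented for the illustration and are not taken from the CRASH data.

```python
def gini(pos, n):
    """Gini impurity of a node with `pos` positive cases out of `n`."""
    if n == 0:
        return 0.0
    p = pos / n
    return 2 * p * (1 - p)

def best_binary_split(counts):
    """Find the cut of an ordered variable that minimises weighted Gini impurity.

    `counts` is a list of (level, n_favorable, n_unfavorable) in level order.
    Returns (levels_left, levels_right, weighted_impurity).
    """
    total = sum(f + u for _, f, u in counts)
    best = None
    for cut in range(1, len(counts)):
        left, right = counts[:cut], counts[cut:]
        n_left = sum(f + u for _, f, u in left)
        n_right = total - n_left
        fav_left = sum(f for _, f, _ in left)
        fav_right = sum(f for _, f, _ in right)
        score = (n_left * gini(fav_left, n_left)
                 + n_right * gini(fav_right, n_right)) / total
        if best is None or score < best[2]:
            best = ([lv for lv, _, _ in left], [lv for lv, _, _ in right], score)
    return best

# Hypothetical (level, favorable, unfavorable) counts for an ordered motor score.
toy_motor = [
    ("none", 10, 90), ("extension", 20, 80), ("withdrawal", 40, 60),
    ("localises", 80, 20), ("follows commands", 90, 10),
]
left, right, impurity = best_binary_split(toy_motor)
print(left, right, round(impurity, 3))
```

The two groups returned by the best cut are what a binarized variable would encode as 0 and 1.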
Table 2.
Model | AUC | 95% CI (DeLong) | DeLong p* |
---|---|---|---|
Original CRASH | 0.8348 | 0.8252-0.8444 | - |
11-var Binarized | 0.8235 | 0.8136-0.8334 | 0.1091 |
9-var Binarized | 0.8175 | 0.8073-0.8276 | 0.01504 |
* Compared with Original CRASH
The 11-var model showed no significant difference in AUC (DeLong p > 0.05) compared with the original model (non-binarized dataset). The alternative hypothesis was that the true difference in AUC is not equal to 0. Even though the datasets represent the same population, the paired ROC test is not applicable for comparing the ROC curves of the binarized models with that of the original model, since the models are very different and are deemed to be unpaired by the built-in glm (generalized linear model) algorithm in R. At AUC 0.8175, the 9-variable model has a higher AUC than the sensitivity-based (AUC 0.8149) and specificity-based (AUC 0.8132) models reported in an earlier study7. The DeLong p-value for the 9-var model is below 0.05 (p = 0.01504), indicating that, unlike the 11-var model, its AUC differs significantly from that of the original model.
We test two models. The first model includes the 9 most informative variables based on the application of RPART, and the second model includes only 7 variables. Variable importance ranking for the two models is given in Table 3.
Table 3.
Variables | motor | verbal | eye | pupils | age | mdls | oblt | hmt | sah | ec | phm | sex | cause |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
9-var model | 30 | 18 | 15 | 13 | 11 | 5 | 4 | 2 | 2 | 1 | 1 | - | - |
7-var model | 35 | 15 | 14 | 16 | 11 | 4 | 3 | 1 | 1 | - | - | - | - |
Comparative Analysis Using QCA
The first step in an exact analysis of a dataset using the configurational approach is to construct a truth table of variables. Each case in the CRASH dataset corresponds to a row in the truth table. A truth table represents a binary tree in which every input variable takes a value of either zero or one. The truth table for our dataset is constructed by replacing, for each variable x, its ith label with a new binary variable xi. This means that multi-level variables are flattened into binary variables by expanding column-wise. Given that our dataset includes 6945 cases, which in QCA terms represents a large-N analysis, it is unlikely that we can find perfectly sufficient causal combinations. We tested multiple levels and decided to set the sufficiency threshold18 to 70%. The inclusion cut-off is kept at 1, meaning that a single occurrence of a combination is enough to include it for further analysis.
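The construction described above can be sketched as follows. This is an illustrative outline under stated assumptions, not the fsQCA procedure: `flatten` and `truth_table` are hypothetical helper names, the column names are invented, and the seven toy cases are made up. The 0.70 threshold and the cut-off of one case mirror the settings reported in the text.

```python
from collections import defaultdict

def flatten(case, variable, labels):
    """Replace a multi-level variable by one binary column per label."""
    return {f"{variable}_{lab}": int(case[variable] == lab) for lab in labels}

def truth_table(cases, conditions, outcome, threshold=0.70, min_cases=1):
    """Group cases by configuration and flag rows that pass both cut-offs."""
    tally = defaultdict(lambda: [0, 0])  # configuration -> [n cases, n with outcome]
    for case in cases:
        key = tuple(case[c] for c in conditions)
        tally[key][0] += 1
        tally[key][1] += case[outcome]
    rows = []
    for key, (n, n_out) in sorted(tally.items()):
        consistency = n_out / n
        rows.append({
            "configuration": dict(zip(conditions, key)),
            "n": n,
            "consistency": consistency,
            "sufficient": n >= min_cases and consistency >= threshold,
        })
    return rows

# Toy binary cases (hypothetical, not CRASH data).
toy_cases = [
    {"age_lt45": 1, "pupils_reactive": 1, "favorable": 1},
    {"age_lt45": 1, "pupils_reactive": 1, "favorable": 1},
    {"age_lt45": 1, "pupils_reactive": 1, "favorable": 1},
    {"age_lt45": 1, "pupils_reactive": 1, "favorable": 0},
    {"age_lt45": 0, "pupils_reactive": 0, "favorable": 0},
    {"age_lt45": 0, "pupils_reactive": 1, "favorable": 1},
    {"age_lt45": 0, "pupils_reactive": 1, "favorable": 0},
]
table = truth_table(toy_cases, ["age_lt45", "pupils_reactive"], "favorable")
for row in table:
    print(row)
```

Only the rows flagged as sufficient would be passed on to the Boolean minimization step.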
Raw coverage (RC), unique coverage (UC) and consistency (CONS) are the parameters of fit and assess how consistently a combination of conditions appears in the data and the degree to which the findings cover or explain the dataset18. The dashes (-) in the result tables mean that presence or absence of the variable does not matter for the outcome of that configuration.
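For crisp (binary) sets these three parameters reduce to simple proportions, which the sketch below computes under the usual crisp-set definitions: consistency is the share of cases matching a configuration that show the outcome, raw coverage is the share of all cases with the outcome that match the configuration, and unique coverage counts only those matched by no other configuration in the solution. The function names and the toy cases are assumptions for illustration, and the sketch assumes every configuration matches at least one case.

```python
def matches(case, config):
    """True if the case satisfies every condition fixed by the configuration.

    Conditions absent from `config` are "don't care" (the dashes in the tables).
    """
    return all(case[k] == v for k, v in config.items())

def fit_parameters(cases, configs, outcome):
    """Return (raw coverage, unique coverage, consistency) per configuration."""
    with_outcome = [c for c in cases if c[outcome] == 1]
    results = []
    for i, cfg in enumerate(configs):
        covered = [c for c in cases if matches(c, cfg)]
        hits = [c for c in covered if c[outcome] == 1]
        others = configs[:i] + configs[i + 1:]
        unique = [c for c in hits if not any(matches(c, o) for o in others)]
        results.append((
            len(hits) / len(with_outcome),    # raw coverage (RC)
            len(unique) / len(with_outcome),  # unique coverage (UC)
            len(hits) / len(covered),         # consistency (CONS)
        ))
    return results

# Toy cases with two conditions A, B and outcome Y (hypothetical data).
toy = [
    {"A": 1, "B": 1, "Y": 1}, {"A": 1, "B": 0, "Y": 1}, {"A": 0, "B": 1, "Y": 1},
    {"A": 1, "B": 1, "Y": 0}, {"A": 0, "B": 0, "Y": 0}, {"A": 0, "B": 0, "Y": 1},
]
params = fit_parameters(toy, [{"A": 1}, {"B": 1}], "Y")
for rc, uc, cons in params:
    print(round(rc, 3), round(uc, 3), round(cons, 3))
```

Because configurations can overlap, raw coverages do not sum to the coverage of the whole solution; unique coverage isolates the part that each configuration alone contributes.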
Analysis of the 9-Var Model
As shown in Table 4, 40 combinations explained 67.8% of the cases with favorable outcome with a consistency of 84.9%. The top 6 configurations for favorable outcome based on this model are reported. The first four configurations cover more cases in the dataset based on their RC and UC. Table 5 shows the top 6 configurations for unfavorable outcome.
Table 4.
No. | age | eye | motor | verbal | pupils | oblt | mdls | hmt | sah | RC | UC | CONS |
---|---|---|---|---|---|---|---|---|---|---|---|---|
1 | < 45 | - | localises or follows commands | > single words | both reactive | - | no | - | no | 0.335 | 0.028 | 0.905 |
2 | < 45 | any response | - | > single words | both reactive | no | no | - | - | 0.370 | 0.014 | 0.899 |
3 | < 45 | any response | localises or follows commands | - | - | no | no | no | - | 0.318 | 0.000 | 0.896 |
4 | < 45 | any response | localises or follows commands | - | both reactive | no | no | - | - | 0.414 | 0.020 | 0.885 |
5 | < 45 | - | withdrawal or less¹ | - | both reactive | - | no | no | yes | 0.086 | 0.015 | 0.806 |
6 | - | any response | withdrawal or less | > single words | - | no | no | no | no | 0.315 | 0.079 | 0.872 |
Table 5.
No. | age | eye | motor | verbal | pupils | oblt | mdls | hmt | sah | RC | UC | CONS |
---|---|---|---|---|---|---|---|---|---|---|---|---|
1 | - | no response | withdrawal or less | Incomp. sounds or no response | no response/unilateral | - | - | - | yes | 0.125 | 0.009 | 0.901 |
2 | - | no response | withdrawal or less | Incomp. sounds or no response | - | yes | - | yes | yes | 0.087 | 0.018 | 0.889 |
3 | - | - | withdrawal or less | Incomp. sounds or no response | - | - | yes | yes | no | 0.078 | 0.024 | 0.857 |
4 | <45 | no response | withdrawal or less | - | - | no | no | - | - | 0.059 | 0.001 | 0.921 |
5 | <45 | - | withdrawal or less | Incomp. sounds or no response | no response/unilateral | - | no | no | yes | 0.030 | 0.004 | 0.848 |
6 | <45 | any response | - | Incomp. sounds or no response | no response/unilateral | no | - | yes | - | 0.029 | 0.002 | 0.964 |
For unfavorable outcome, 63 combinations explained 42.9% of the cases with a consistency of 87.2%. The top 6 of these configurations are reported in Table 5.
Analysis of the 7-Var Model
Since hmt and sah have the lowest variable importance rankings in our classification tree, we removed these two variables to evaluate the resulting configurations without them. The AUC of the 7-var model is 0.811 (95% CI: 0.8136-0.8334 (DeLong)). With a DeLong p-value of 9.474E-04 compared with the original model, the ROC curves of the two models were significantly different. We found that 9 combinations explained 57.2% of the cases with favorable outcome with a consistency of 85.7%. Of these 9 configurations, Table 6 reports the top 6 with the highest RC and UC.
Table 6.
No. | age | eye | motor | verbal | pupils | oblt | mdls | RC | UC | CONS |
---|---|---|---|---|---|---|---|---|---|---|
1 | < 45 | - | localises OR follows commands | - | both reactive | no | no | 0.483 | 0.058 | 0.860 |
2 | < 45 | - | localises OR follows commands | at least single words | both reactive | - | no | 0.394 | 0.037 | 0.893 |
3 | < 45 | any response | - | at least single words | both reactive | no | - | 0.384 | 0.025 | 0.897 |
4 | < 45 | any response | localises OR follows commands | - | - | no | no | 0.423 | 0.003 | 0.883 |
5 | < 45 | any response | localises OR follows commands | at least single words | - | no | - | 0.364 | 2.3E-4 | 0.899 |
6 | < 45 | no response | withdrawal or less² | at least single words | - | no | no | 0.003 | 0.003 | 0.736 |
With a raw coverage of 0.48, the configuration of row 1 in Table 6 explains the highest number of favorable outcomes covered by the total model (2740 cases), capturing the configuration “patients (below 45), with motor (localises OR follows commands) AND pupils (both reactive) AND mdls (no) AND oblt (no).” This means that, regardless of the values of eye and verbal, any case that matches row 1 has a favorable outcome with 86% consistency. On the other hand, 20 combinations explained 44.5% of the cases of unfavorable outcome with a consistency of 83%. Due to space limitations, we report only the top 8 configurations in Table 7 below.
Table 7.
No. | age | eye | motor | verbal | pupils | oblt | mdls | RC | UC | CONS |
---|---|---|---|---|---|---|---|---|---|---|
1 | 45 and above | no response | - | incomp. sounds or no response | - | - | - | 0.200 | 0.069 | 0.832 |
2 | 45 and above | - | - | incomp. sounds or no response | no response/unilateral | no | - | 0.059 | 0.009 | 0.858 |
3 | 45 and above | - | - | incomp. sounds or no response | - | no | yes | 0.034 | 0.006 | 0.840 |
4 | - | no response | withdrawal or less | incomp. sounds or no response | no response/unilateral | - | - | 0.239 | 0.043 | 0.857 |
5 | - | no response | withdrawal or less | incomp. sounds or no response | - | - | yes | 0.161 | 0.020 | 0.873 |
6 | 45 and above | no response | - | - | no response/unilateral | no | no | 0.035 | 0.001 | 0.860 |
7 | 45 and above | no response | - | - | both reactive | yes | no | 0.018 | 0.001 | 0.836 |
8 | 45 and above | any response | withdrawal or less² | - | both reactive | - | yes | 0.009 | 0.002 | 0.896 |
With a raw coverage of 0.2, the configuration of row 1 in Table 7 explains the highest number of unfavorable outcomes covered by the total model; capturing the configuration “patients who are 45 and above with eye = (no response) AND verbal = (incomprehensible sounds or no response).”
Predicting Outcome with QCA
To evaluate the usefulness of the 7-variable QCA model, we compared its ability to predict the TBI outcome with that of a simple binary logistic regression (Logit) model:
ln(P / (1 − P)) = β0 + β1x1 + β2x2 + … + β7x7
where P is the predicted probability of TBI outcome based on the assumption of a linear relationship between the variables, x1 to x7 are the seven binarized admission variables and β0 to β7 are the fitted coefficients. The purpose of using this simple model for comparison is to show the difference between the results of a conventional additive model and those of QCA. The two models are based on very different assumptions. The linear logistic regression model assigns a weight to each independent variable and is additive in nature. The QCA model takes patterns of interactions between variables into account and outputs multiple combinations.
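The additive nature of the Logit model can be made concrete with a small sketch. The coefficients below are hypothetical values chosen for illustration and are not fitted to the CRASH data; the point is only that, without interaction terms, switching one predictor changes the log-odds by the same amount whatever the values of the other predictors.

```python
import math

def logit_probability(x, intercept, weights):
    """P = 1 / (1 + exp(-(b0 + sum_i b_i * x_i))) for binary predictors x."""
    z = intercept + sum(w * xi for w, xi in zip(weights, x))
    return 1.0 / (1.0 + math.exp(-z))

def log_odds(p):
    return math.log(p / (1.0 - p))

# Hypothetical coefficients for three binary predictors (not fitted to CRASH).
INTERCEPT = -0.3
WEIGHTS = [1.2, 0.8, -0.5]

def effect_of_first_variable(others):
    """Change in log-odds when the first predictor switches from 0 to 1."""
    on = logit_probability([1] + others, INTERCEPT, WEIGHTS)
    off = logit_probability([0] + others, INTERCEPT, WEIGHTS)
    return log_odds(on) - log_odds(off)

# The same shift in log-odds is obtained in two different contexts.
print(effect_of_first_variable([0, 0]), effect_of_first_variable([1, 1]))
```

A configurational model has no such constant per-variable effect: a condition may matter in one configuration and be irrelevant (a dash) in another.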
We compared the predictive power of the two models based on their true positive and false positive rates as well as their overall prediction accuracy. The results are shown in Table 10. Precision reports the fraction of cases predicted to have a given outcome that truly have that outcome. Recall reports the fraction of cases with a given outcome that the model correctly identifies. Accuracy is the percentage of correct predictions among all the predictions the model makes. One main difference between the two models is that the Logit model generates one model for the whole dataset, whereas the QCA model explains patterns in only a fraction of the dataset.
Table 10.
N = 6945 (Favorable outcome* = 4182 cases, Unfavorable outcome** = 2763 cases)
Model | Precision | Recall | True Positive Rate | False Positive Rate |
---|---|---|---|---|
7-var QCA favorable | 2394/2790 = 0.857 | 2394/4182 (2790) | 0.857 | 0.133 |
7-var QCA unfavorable | 1232/1483 = 0.830 | 1232/2763 (1483) | 0.830 | 0.169 |
7-var Logit favorable | 0.755 | 3608/4182 | 0.881 | 0.447 |
7-var Logit unfavorable | 0.735 | 1591/2763 | 0.553 | 0.317 |
* Favorable outcome = Moderate Disability or Good Recovery
** Unfavorable outcome = Death or Severe Disability at 6 Months
If precision and recall for the two models are calculated based on the number of cases that they claim to explain, the QCA model benefits from higher accuracy:
Accuracy of the QCA model = (2394 + 1232) / (2790 + 1483) = 0.849
Accuracy of the Logit model = (3684 + 1527) / (4182 + 2763) = 0.750
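The arithmetic behind these figures for the QCA model can be retraced with a short sketch that uses only the counts reported in Table 10. The dictionary layout and variable names are ours; `claimed` denotes the number of cases matched by the reported configurations for each outcome class.

```python
def precision(true_pos, predicted_pos):
    """Share of cases predicted to have the outcome that truly have it."""
    return true_pos / predicted_pos

def recall(true_pos, actual_pos):
    """Share of cases with the outcome that the model identifies."""
    return true_pos / actual_pos

# Counts for the 7-var QCA model as reported in Table 10.
qca = {
    "favorable":   {"tp": 2394, "claimed": 2790, "actual": 4182},
    "unfavorable": {"tp": 1232, "claimed": 1483, "actual": 2763},
}

for label, c in qca.items():
    print(label,
          round(precision(c["tp"], c["claimed"]), 3),
          round(recall(c["tp"], c["actual"]), 3))

total_tp = sum(c["tp"] for c in qca.values())
total_claimed = sum(c["claimed"] for c in qca.values())
total_cases = sum(c["actual"] for c in qca.values())

covered_accuracy = total_tp / total_claimed  # accuracy over the cases QCA claims to explain
share_covered = total_claimed / total_cases  # fraction of the dataset those cases represent
print(round(covered_accuracy, 3), round(share_covered, 3))
```

Reporting the covered share next to the accuracy makes explicit that the QCA figure refers to a subset of the 6945 cases, whereas the Logit figure refers to all of them.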
However, when we evaluate the predictive power of each model on the full dataset, the Logit model demonstrates better precision and recall than the QCA model in predicting outcome, but suffers from higher false positive rates for both unfavorable and favorable outcome. These results highlight the advantages of using QCA, particularly when variables that affect outcome positively do not necessarily have the reverse effect when they are removed, which enables us to highlight the possible asymmetries in the way individual variables can influence the outcome through their participation in configurations.
For the 9-var model, the cases of favourable outcome that QCA did not cover comprise 215 different combinations, and for cases of unfavourable outcome that number is 174. For the 7-var model, the numbers are 79 and 59 respectively. Some of these non-covered cases are single occurrences of a configuration of variables that could not be factored with other configurations.
Discussion
Our study demonstrated a different approach to evaluating predictors of clinical outcome in TBI. With the methods of QCA, we established multiple configurations of admission variables that are predictive of favorable versus unfavorable outcome. Most of the findings are intuitive: young age (<45), good neurological condition and lack of CT abnormalities are in keeping with favorable outcome, whereas older age, poor neurological condition and CT findings such as mass effect or traumatic subarachnoid bleed are suggestive of an unfavorable outcome. These results are in line with previous studies; however, an unexpected finding was that, on formal variable importance ranking using RPART, age fell behind the GCS components as well as pupillary response. This is further traceable in several of the configurations for unfavorable outcome (1-3 in Table 5) in which age does not appear. A further finding of our study is the set of dichotomization values for admission variables, which we established using a binary decision tree algorithm (RPART). Binary adaptation of clinical features is appealing to clinicians because it simplifies patient assessment, particularly in the emergency setting. We have demonstrated that collapsing multi-level variables into binary ones does not impact model performance when the full set or most of the variables present in the original model are maintained (Table 2). Consequently, a binary model can potentially inform a simplified assessment protocol without substantial loss of clinical information.
The configurational, asymmetric models uncovered through the application of set-theoretic methods such as QCA make these methods appealing when interactions between variables are possible. Comparative analysis with QCA is receiving increasing attention among researchers from a variety of disciplines such as social science26, business and economics27, 28, management and organization29, education30, and health policy research31, 32, 33, 34.
For analyzing the CRASH dataset with QCA, we made the models more parsimonious by removing some variables since the complexity of the dataset does not allow us to apply the exact procedure of QCA to the full dataset, or even the 11-var dataset. As a limitation, we forced the multi-level variables into dichotomies based on their first split in the classification tree. This first split is considered to be most informative for contrasting outcomes, however, with more granularity we might find different and possibly better results. These binary models however showed similar performance to models built with multi-level variables. A translational value of our findings is that the configurations of admission variables can be regarded as “typical” patient scenarios that are strongly predictive of a clinical outcome.
Conclusion
A configurational, asymmetric model of TBI outcome is investigated. We have demonstrated that the dichotomization of admission variables can provide a basis for simplified assessment protocols that can be usefully implemented, for example, in small centers without specialist capacity. From a clinician’s perspective, it is also useful to be presented with a set of “typical” scenarios that are suggestive of favorable versus unfavorable outcome. We evaluated the predictive power of a simple logistic regression model and a QCA model using the CRASH dataset. The Logit model demonstrates better precision and recall than the QCA model in predicting outcome, but suffers from higher false positive rates for both unfavorable and favorable outcomes. The QCA model explains patterns in only a fraction of the dataset, while the Logit model attempts to cover the whole dataset. We are currently investigating a new heuristic based on logic synthesis and network analysis methods to overcome the inherent limitations of QCA in terms of the number of variables and the complexity of the dataset, while automating the inclusion of interaction terms to develop minimal models that are maximally predictive.
Footnotes
1. Withdrawal or less means: withdrawal or abnormal flexion or extension or no response for Motor Response assessment.
2. Withdrawal or less means: withdrawal or abnormal flexion or extension or no response for Motor Response assessment.
References
- 1. Langlois JA, Sattin RW. Traumatic brain injury in the United States: research and programs of the Centers for Disease Control and Prevention (CDC). The Journal of head trauma rehabilitation. 2005 May 1;20(3):187–8. doi: 10.1097/00001199-200505000-00001.
- 2. Tagliaferri F, Compagnone C, Korsic M, Servadei F, Kraus J. A systematic review of brain injury epidemiology in Europe. Acta neurochirurgica. 2006 Mar 1;148(3):255–68. doi: 10.1007/s00701-005-0651-y.
- 3. Murray GD, Butcher I, McHugh GS, Lu J, Mushkudiani NA, Maas AI, Marmarou A, Steyerberg EW. Multivariable prognostic analysis in traumatic brain injury: results from the IMPACT study. Journal of neurotrauma. 2007 Feb 1;24(2):329–37. doi: 10.1089/neu.2006.0035.
- 4. Majdan M, Brazinova A, Rusnak M, Leitgeb J. Outcome prediction after traumatic brain injury: Comparison of the performance of routinely used severity scores and multivariable prognostic models. Journal of Neurosciences in Rural Practice. 2017 Jan 1;8(1):20. doi: 10.4103/0976-3147.193543.
- 5. Hawley C, Sakr M, Scapinello S, Salvo J, Wrenn P. Traumatic brain injuries in older adults—6 years of data for one UK trauma centre: retrospective analysis of prospectively collected data. Emergency Medicine Journal. 2017. doi: 10.1136/emermed-2016-206506.
- 6. Steyerberg EW, Mushkudiani N, Perel P, Butcher I, Lu J, McHugh GS, Murray GD, Marmarou A, Roberts I, Habbema JD, Maas AI. Predicting outcome after traumatic brain injury: development and international validation of prognostic scores based on admission characteristics. PLoS Med. 2008 Aug 5;5(8):e165. doi: 10.1371/journal.pmed.0050165.
- 7. Zador Z, Sperrin M, King AT. Predictors of outcome in traumatic brain injury: new insight using receiver operating curve indices and Bayesian network analysis. PLoS one. 2016 Jul 7;11(7):e0158762. doi: 10.1371/journal.pone.0158762.
- 8. Marmarou A, Lu J, Butcher I, McHugh GS, Murray GD, Steyerberg EW, Mushkudiani NA, Choi S, Maas AI. Prognostic value of the Glasgow Coma Scale and pupil reactivity in traumatic brain injury assessed pre-hospital and on enrollment: an IMPACT analysis. Journal of neurotrauma. 2007 Feb 1;24(2):270–80. doi: 10.1089/neu.2006.0029.
- 9. Lingsma HF, Roozenbeek B, Steyerberg EW, Murray GD, Maas AI. Early prognosis in traumatic brain injury: from prophecies to predictions. The Lancet Neurology. 2010 May 31;9(5):543–54. doi: 10.1016/S1474-4422(10)70065-X.
- 10. Ragin CC. Fuzzy-set social science. University of Chicago Press; 2000 Aug 1.
- 11. Gordon RA. Issues in multiple regression. American Journal of Sociology. 1968;73(5):592–616.
- 12. Ragin C. The comparative method: Moving beyond qualitative and quantitative methods. 1987.
- 13. Hukkelhoven CW, Steyerberg EW, Habbema JD, Farace E, Marmarou A, Murray GD, Marshall LF, Maas AI. Predicting outcome after traumatic brain injury: development and validation of a prognostic score based on admission characteristics. Journal of neurotrauma. 2005 Oct 1;22(10):1025–39. doi: 10.1089/neu.2005.22.1025.
- 14. Maas AI, Steyerberg EW, Marmarou A, McHugh GS, Lingsma HF, Butcher I, Lu J, Weir J, Roozenbeek B, Murray GD. IMPACT recommendations for improving the design and analysis of clinical trials in moderate to severe traumatic brain injury. Neurotherapeutics. 2010 Jan 31;7(1):127–34. doi: 10.1016/j.nurt.2009.10.020.
- 15.Collaborators MC, Perel P, Arango M, Clayton T, Edwards P, Komolafe E, Poccock S, Roberts I, Shakur H, Steyerberg E, Yutthakasemsunt S. Predicting outcome after traumatic brain injury: practical prognostic models based on large cohort of international patients. Bmj. 2008 Feb 23;336(7641):425–9. doi: 10.1136/bmj.39461.643438.25. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 16.Roozenbeek B, Maas AI, Menon DK. Changing patterns in the epidemiology of traumatic brain injury. Nature Reviews Neurology. 2013 Apr 1;9(4):231–6. doi: 10.1038/nrneurol.2013.22. [DOI] [PubMed] [Google Scholar]
- 17.Mahoney J, Goertz G, Ragin CC. InHandbook of causal analysis for social research. Netherlands: Springer; 2013. Causal models and counterfactuals; pp. 75–90. [Google Scholar]
- 18.Ragin CC. Chicago: University of Chicago Press; 2008. Redesigning social inquiry: Fuzzy sets and beyond. [Google Scholar]
- 19.Shannon C. The synthesis of two-terminal switching circuits. Bell Labs Technical Journal. 1949;28(1):59–98. [Google Scholar]
- 20.Nelson RJ, Quine WV. The problem of simplifying truth functions. The American mathematical monthly, vol. 59 (1952), pp. 521-531. (Offprint 1952, on sale by the Mathematical Association of America.) The Journal of Symbolic Logic. 1953 Sep 1;18(03):280–2. [Google Scholar]
- 21.McCluskey EJ. Minimization of Boolean functions. Bell Labs Technical Journal. 1956 Nov 1;35(6):1417–44. [Google Scholar]
- 22.Ragin CC, Drass KA, Davey S. Tucson, Arizona: Department of Sociology, University of Arizona; 2006. Fuzzy-set/qualitative comparative analysis 2.0. [Google Scholar]
- 23.Thiem A, Dusa A. QCA: A package for qualitative comparative analysis. The R Journal. 2013 Jun 1;5(1):87–97. [Google Scholar]
- 24.Therneau TM, Atkinson B, Ripley B. rpart: Recursive partitioning. R package version. 2010;3:1–46. [Google Scholar]
- 25.Breiman L, Friedman J, Stone CJ, Olshen RA. CRC press; 1984. Classification and regression trees. [Google Scholar]
- 26.Cress DM, Snow DA. Mobilization at the margins: Resources, benefactors, and the viability of homeless social movement organizations. American Sociological Review. 1996 Dec 1;:1089–109. [Google Scholar]
- 27.Evans AJ, Aligica PD. The spread of the flat tax in Eastern Europe: A comparative study. Eastern European Economics. 2008 May 1;46(3):49–67. [Google Scholar]
- 28.Valliere D, Ni N, Wise S. Prior relationships and M&A exit valuations: a set-theoretic approach. The Journal of Private Equity. 2008;11(2):60–72. [Google Scholar]
- 29.Greckhamer T. Cross-cultural differences in compensation level and inequality across occupations: A set-theoretic analysis. Organization Studies. 2011 Jan;32(1):85–115. [Google Scholar]
- 30.Glaesser J, Cooper B. Selectivity and flexibility in the German secondary school system: A configurational analysis of recent data from the German socio-economic panel. European Sociological Review. 2010 [Google Scholar]
- 31.Thomas J, O’Mara-Eves A, Brunton G. Using qualitative comparative analysis (QCA) in systematic reviews of complex interventions: a worked example. Systematic reviews. 2014 Jun 20;3(1):67. doi: 10.1186/2046-4053-3-67. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 32.Kahwati L, Viswanathan M, Golin CE, Kane H, Lewis M, Jacobs S. Identifying configurations of behavior change techniques in effective medication adherence interventions: a qualitative comparative analysis. Systematic reviews. 2016 May 4;5(1):83. doi: 10.1186/s13643-016-0255-z. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 33.Harkreader S, Imershein AW. The conditions for State action in Florida’s health-care market. Journal of health and social behavior. 1999 Jun 1;:159–74. [PubMed] [Google Scholar]
- 34.Schensul JJ, Chandran D, Singh SK, Berg M, Singh S, Gupta K. The use of qualitative comparative analysis for critical event research in alcohol and HIV in Mumbai, India. AIDS and Behavior. 2010 Aug 1;14(1):113–25. doi: 10.1007/s10461-010-9736-6. [DOI] [PMC free article] [PubMed] [Google Scholar]