Abstract
INTRODUCTION
Missing data are inherent in clinical research and may be especially problematic for trauma studies. This study describes a sensitivity analysis to evaluate the impact of missing data on clinical risk prediction algorithms. Three blood transfusion prediction models were evaluated utilizing an observational trauma dataset with valid missing data.
METHODS
The PRospective Observational Multi-center Major Trauma Transfusion (PROMMTT) study included patients requiring ≥ 1 unit of red blood cells (RBC) at 10 participating U.S. Level I trauma centers from July 2009 – October 2010. Physiologic, laboratory, and treatment data were collected prospectively up to 24h after hospital admission. Subjects who received ≥ 10 RBC units within 24h of admission were classified as massive transfusion (MT) patients. Correct classification percentages for three MT prediction models were evaluated using complete case analysis and multiple imputation. A sensitivity analysis for missing data was conducted to determine the upper and lower bounds for correct classification percentages.
RESULTS
PROMMTT enrolled 1,245 subjects. MT was received by 297 patients (24%). The percentage of missing data ranged from 2.2% (heart rate) to 45% (respiratory rate). Proportions of complete cases utilized in the MT prediction models ranged from 41% to 88%. All models demonstrated similar correct classification percentages using complete case analysis and multiple imputation. In the sensitivity analysis, correct classification upper-lower bound ranges per model were 4%, 10%, and 12%. Predictive accuracy for all models using PROMMTT data was lower than reported in the original datasets.
CONCLUSIONS
Evaluating the accuracy of clinical prediction models with missing data can be misleading, especially with many predictor variables and moderate levels of missingness per variable. The proposed sensitivity analysis describes the influence of missing data on risk prediction algorithms. Reporting upper/lower bounds for percent correct classification may be more informative than multiple imputation, which provided similar results to complete case analysis in this study.
Keywords: PROMMTT, trauma, incomplete data, massive transfusion
INTRODUCTION
There is a recognized need in the trauma community to account for missing data in predicting outcomes.1,2 Collecting complete information for all patients may be especially difficult in a trauma setting because critically injured patients require immediate care under narrow time constraints. Incomplete cases, or patients whose records do not contain information for all analyzed variables in a predictive model, are commonly excluded during analysis due to the default settings for statistical software programs. As the number of variables increases in multiple regression models, even small amounts of missing data can lead to many incomplete cases.3 Missing information arises from various sources including incomplete medical records, procedures not performed or not applicable, and data collection errors.
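The effect of the number of predictors on the share of complete cases can be illustrated with a short calculation. The sketch below assumes, purely for illustration, that each variable is missing independently at the same hypothetical rate of 5%; real trauma data, including the dataset analyzed here, need not satisfy that assumption.

```python
from math import prod


def expected_complete_fraction(missing_rates):
    """Expected share of complete cases if variables go missing independently."""
    return prod(1.0 - rate for rate in missing_rates)


# Hypothetical example: every variable is missing for 5% of patients.
for n_vars in (1, 5, 10, 20):
    share = expected_complete_fraction([0.05] * n_vars)
    print(f"{n_vars:2d} variables -> {share:.0%} complete cases")
```

Under these assumed rates, a ten-variable model would retain only about 60% of patients in a complete case analysis.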
Trauma registries often have more missing information for patients who are severely injured, suggesting an injury severity bias in missing data.4 Excluding incomplete cases from analysis is valid only if data are missing completely at random, a rare scenario in clinical research.3–5 The most critically injured trauma patients are excluded from some research analyses because their records contain missing data, and the results may be biased by excluding these patients. Because research efforts should be as inclusive and generalizable as possible, accounting for those patients with missing information is essential.
Approximately 3% of all trauma patients receive a massive transfusion (MT), defined as ≥10 units of red blood cells (RBC) within 24 hours of hospital admission. This MT definition does not include severely bleeding patients who died before receiving 10 units of RBC, but it has been used in prior literature to identify bleeding patients despite its limitations. Improvements in the U.S. trauma system have led to reductions in mortality for MT patients from 90% in the 1970s to the currently reported mortality rates of 30%-70%.6–8 Many of these deaths are considered preventable with improved treatment of coagulopathy, transfusion support, and emergency surgical care.9,10
Many trauma centers have instituted MT protocols to facilitate earlier blood component transfusion,11 and early identification of hemorrhaging patients who may require MT protocol activation may further reduce mortality in this population.12 MT protocols involve notifying the transfusion service and laboratory, instituting laboratory testing algorithms, and initiating rapid blood product preparation and delivery in pre-specified amounts.13,14 Accurate prediction of patients requiring activation of a MT protocol is critical due to the inherent risks of transfusions given concerns regarding blood-borne diseases and transfusion-related acute lung injury.15,16 As blood components are scarce resources that require efficient management, the performance of MT protocols should be monitored to improve turn-around times and patient outcomes, while minimizing transfusion adverse events and wastage.13
In the past decade multiple transfusion studies have focused on developing risk prediction models to identify hemorrhaging patients who require activation of an MT protocol.17–25 The objective of this analysis was to evaluate the impact of missing data on previously published MT prediction models using a recent prospective multi-center transfusion study.
METHODS
Study Population
The PRospective Observational Multi-center Major Trauma Transfusion (PROMMTT) Study included ten U.S. Level I trauma centers.26 From July 2009 to October 2010, adult trauma patients (age ≥ 16) arriving directly from the injury scene who received at least one unit of RBC in the Emergency Department (ED) were eligible for enrollment. Exclusion criteria included death within 30 minutes of ED arrival, treatment at an outside hospital, > 5 minutes of CPR, > 20% burn injury, severe inhalation injury, pregnancy, and prisoner status. Additional details regarding the study have been previously published.26,27
Data Collection
Clinical information was captured by data collectors (typically medical students or full time research staff) who prospectively followed study patients up to 2 hours after blood transfusions stopped or 24 hours after admission to the ED, whichever occurred first. Information describing injuries, ED vitals and labs, and treatments provided was collected prospectively. Further information including patient outcome was collected daily from the hospital medical records by research coordinators at each site. Local Institutional Review Boards for each trauma center and the U.S. Army Medical Research and Materiel Command Office of Research Protections approved the study.
Statistical Analysis
Percentages of missing data were evaluated for ED vitals and laboratory results variables used in the MT prediction models. Missing data in selected ED vital signs and laboratory tests were analyzed for association with MT and survival time to determine whether severely injured patients were more (or less) likely to be missing these data. Pearson’s Chi-square tests (or Fisher’s exact tests when the assumptions of Chi-square were not met) were calculated to compare the proportion of missing data in MT vs. non-MT patients and among patients surviving <6 hours, 6–24 hours, and >24 hours after admission. Statistical significance was assessed at α = 0.05. No significance levels were adjusted for multiple tests.
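As an illustration of the comparison of missingness between MT and non-MT patients, the following sketch computes a Pearson Chi-square test for a 2x2 table using only the Python standard library. It is not the Stata code used in the study; the function name is hypothetical, and the example counts are the GCS missingness counts from Table 3. The result may differ slightly from the reported p-value because of software defaults and rounding.

```python
from math import erfc, sqrt


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]].

    Returns (statistic, p_value) using 1 degree of freedom.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    stat = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # For 1 degree of freedom the chi-square upper tail equals erfc(sqrt(x / 2)).
    p_value = erfc(sqrt(stat / 2.0))
    return stat, p_value


# GCS from Table 3: 39 of 295 MT patients and 70 of 939 non-MT patients missing.
stat, p = chi_square_2x2(39, 295 - 39, 70, 939 - 70)
print(f"chi-square = {stat:.2f}, p = {p:.4f}")
```

When any expected cell count is small, Fisher's exact test would be used instead, as described above.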
Among the nine MT prediction models considered,17–25 three contained variables available in PROMMTT. McLaughlin et al.,21 Cancio et al.,23 and Larson et al.24 MT prediction models were applied to the dataset (Table 1). Correct classification percentages were compared using complete case analysis, multiple imputation, and a sensitivity analysis to determine the upper/lower bounds for correct classification. For the multiple imputation, values were imputed for missing data points in variables with missingness proportion > 3% (hematocrit, pH, diastolic blood pressure, GCS, respiratory rate, base deficit, and hemoglobin). Predictors of missingness in the multiple imputation model were variables missing for ≤ 3% of patients (heart rate, systolic blood pressure, age, gender, injury severity score, RBC units at 24 hours, and survival at 6 hours and 24 hours). Ten multiple imputation datasets (m=10) were created. Given ≤ 45% missingness for all imputed variables, a high level of efficiency (>95%) is achieved with ten imputation sets.28
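The efficiency statement can be checked with Rubin's relative-efficiency approximation, 1 / (1 + γ/m), where γ is the fraction of missing information and m is the number of imputed datasets. The sketch below assumes that this is the formula behind the quoted figure and that γ can be approximated by the largest per-variable missingness (45%); both are assumptions made here for illustration.

```python
def relative_efficiency(missing_fraction, m):
    """Rubin's relative efficiency of m imputations versus infinitely many."""
    return 1.0 / (1.0 + missing_fraction / m)


# Assumed fraction of missing information: 0.45 (respiratory rate missingness).
for m in (3, 5, 10, 20):
    print(f"m = {m:2d}: efficiency = {relative_efficiency(0.45, m):.1%}")
```

With m = 10 the approximation gives an efficiency just under 96%, consistent with the >95% stated above, whereas m = 5 would fall below 95% under the same assumption.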
Table 1.
Massive transfusion prediction models
Algorithm | Model | Variables | Cut-points |
---|---|---|---|
McLaughlin et al.21 | log (p/[1-p]) = −1.576 + 0.825*SBP + 0.826*HR + 1.044*Hct + 0.462*pH | SBP = Systolic Blood Pressure; HR = Heart Rate; Hct = Hematocrit; pH = pH | Algorithm variables = 1 if: SBP < 110; HR > 105; Hct < 32%; pH < 7.25 |
Cancio et al.23 | log (p/[1-p]) = 0.638 − 0.0115*RTS − 0.011*DAP + 0.358*SI | RTS = Revised Trauma Score = 0.9368*GCScode + 0.7326*SBPcode + 0.2908*RRcode; DAP = Diastolic Arterial Pressure; SI = Shock Index = Heart Rate / Systolic Blood Pressure | GCScode = 4 if GCS = 13–15, 3 if GCS = 9–12, 2 if GCS = 6–8, 1 if GCS = 4–5, 0 if GCS = 3; SBPcode = 4 if SBP > 89, 3 if SBP = 76–89, 2 if SBP = 50–75, 1 if SBP = 1–49, 0 if SBP = 0; RRcode = 4 if RR = 10–29, 3 if RR > 29, 2 if RR = 6–9, 1 if RR = 1–5, 0 if RR = 0 |
Larson et al.24 | Predicted to require MT if any two characteristics are true | Heart rate; Systolic Blood Pressure; Base Deficit; Hemoglobin | MT = 1 if: Heart rate > 110; SBP < 110; Base excess ≤ −6; Hemoglobin < 11 |
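The three algorithms in Table 1 can be written out as short functions. The following is a minimal sketch that copies the coefficients and cut-points as printed in Table 1; the function names are hypothetical, the inputs are assumed to be complete, and no probability threshold is applied to the two logistic models because the source does not state one here.

```python
from math import exp


def _logistic(x):
    return 1.0 / (1.0 + exp(-x))


def mclaughlin_probability(sbp, hr, hct, ph):
    """McLaughlin et al.: logistic model on four dichotomized predictors."""
    logit = (-1.576
             + 0.825 * (sbp < 110)
             + 0.826 * (hr > 105)
             + 1.044 * (hct < 32)
             + 0.462 * (ph < 7.25))
    return _logistic(logit)


def revised_trauma_score(gcs, sbp, rr):
    """RTS built from the coded GCS, SBP, and respiratory rate in Table 1."""
    gcs_code = 4 if gcs >= 13 else 3 if gcs >= 9 else 2 if gcs >= 6 else 1 if gcs >= 4 else 0
    sbp_code = 4 if sbp > 89 else 3 if sbp >= 76 else 2 if sbp >= 50 else 1 if sbp >= 1 else 0
    rr_code = 3 if rr > 29 else 4 if rr >= 10 else 2 if rr >= 6 else 1 if rr >= 1 else 0
    return 0.9368 * gcs_code + 0.7326 * sbp_code + 0.2908 * rr_code


def cancio_probability(gcs, sbp, rr, dbp, hr):
    """Cancio et al.: logistic model on RTS, diastolic pressure, and shock index."""
    if sbp == 0:
        return None  # shock index (HR / SBP) is undefined
    rts = revised_trauma_score(gcs, sbp, rr)
    shock_index = hr / sbp
    return _logistic(0.638 - 0.0115 * rts - 0.011 * dbp + 0.358 * shock_index)


def larson_predicts_mt(hr, sbp, base_deficit, hgb):
    """Larson et al.: MT predicted when at least two criteria are met.

    Base excess <= -6 is expressed here as base deficit >= 6.
    """
    criteria = [hr > 110, sbp < 110, base_deficit >= 6, hgb < 11]
    return sum(criteria) >= 2


print(round(mclaughlin_probability(sbp=95, hr=120, hct=30, ph=7.20), 3))
print(larson_predicts_mt(hr=120, sbp=100, base_deficit=2, hgb=12))
```

Returning `None` when SBP is zero mirrors the one PROMMTT subject for whom the shock index could not be calculated.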
The sensitivity analysis included two single imputation datasets created under best and worst case scenarios: critical and non-critical values were assigned to missing data based on MT outcomes (i.e. whether massive transfusion was eventually received by the patient). Critical values were defined as the 95th percentile value for non-missing MT patients, and non-critical values were defined as the median value for patients with non-missing data who received < 3 units of RBC and survived > 24 hours after ED admission. For calculation of the upper bound (best case scenario), critical values were assigned to MT patients with missing information and non-critical values to missing data for non-MT patients. For calculation of the lower bound (worst case scenario), non-critical values were assigned to missing data for MT patients and critical values to non-MT subjects with missing data. Accuracy of the prediction models was determined by comparing the true MT status with predicted MT status using complete case analysis, multiple imputation, and upper/lower bound sensitivity analysis. Sensitivity, specificity, percent correct classification, and area under the receiver operating characteristic curve (AUC) are presented. Statistical analyses were performed using Stata v. 12 (College Station, TX).
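The best and worst case imputation can be sketched in a few lines. The example below applies the Larson rule to four hypothetical patient records (not PROMMTT data), filling missing values with the critical and non-critical values listed for that model in Table 4. The record layout and function names are assumptions made for illustration, not the study's Stata implementation.

```python
# Fill-in values for the Larson predictors, taken from Table 4.
CRITICAL = {"hr": 159, "sbp": 50, "base_deficit": 22, "hgb": 6.5}
NON_CRITICAL = {"hr": 99, "sbp": 115, "base_deficit": 5, "hgb": 12.45}


def larson_rule(p):
    flags = [p["hr"] > 110, p["sbp"] < 110, p["base_deficit"] >= 6, p["hgb"] < 11]
    return sum(flags) >= 2


def impute(patient, scenario):
    """Fill missing (None) predictors for the 'upper' or 'lower' bound scenario."""
    if scenario == "upper":
        # Best case: MT patients get critical values, non-MT get non-critical.
        use_critical = patient["mt"]
    else:
        # Worst case: the assignment is reversed.
        use_critical = not patient["mt"]
    fill = CRITICAL if use_critical else NON_CRITICAL
    return {k: (fill[k] if v is None and k in fill else v) for k, v in patient.items()}


def percent_correct(patients, scenario):
    filled = [impute(p, scenario) for p in patients]
    correct = sum(larson_rule(p) == p["mt"] for p in filled)
    return 100.0 * correct / len(filled)


# Hypothetical records; None marks a missing value.
patients = [
    {"hr": 130, "sbp": 90, "base_deficit": None, "hgb": None, "mt": True},
    {"hr": 100, "sbp": None, "base_deficit": None, "hgb": 12, "mt": True},
    {"hr": None, "sbp": 120, "base_deficit": 3, "hgb": None, "mt": False},
    {"hr": 95, "sbp": 125, "base_deficit": 2, "hgb": 13, "mt": False},
]

upper = percent_correct(patients, "upper")
lower = percent_correct(patients, "lower")
print(f"upper bound {upper:.0f}%, lower bound {lower:.0f}%, range {upper - lower:.0f}%")
```

Records without missing values, and records whose observed values already decide the prediction, contribute equally to both bounds, so the width of the range reflects only the patients whose classification depends on the missing data.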
RESULTS
A total of 1,245 patients were enrolled in PROMMTT. MT was received by 297 patients (24%). Table 2 summarizes evaluated variables and describes missing information. Percentages of missing data for predictor variables ranged from 2.2% (heart rate) to 45% (respiratory rate).
Table 2.
Summary of injury characteristics, physiological measures, and laboratory values from PROMMTT which are used to predict the MT status
Variable | Median (IQR) or N (%) | Missing, N (%) |
---|---|---|
GCS Total, median (IQR) | 14 (3–5) | 110 (8.8) |
SBP (mmHg), median (IQR) | 106 (86–128) | 32 (2.6) |
DBP (mmHg), median (IQR) | 67.5 (53–82) | 217 (17) |
Heart Rate (bpm), median (IQR) | 105 (86–124) | 27 (2.2) |
Respiratory Rate (bpm), median (IQR) | 20 (18–26) | 559 (45) |
Hematocrit, median (IQR) | 34.8 (30.4–39) | 51 (4.1) |
Hemoglobin, median (IQR) | 11.7 (10.1–13.3) | 47 (3.8) |
Base Deficit, median (IQR) | 6.15 (3–10.1) | 285 (23) |
pH, median (IQR) | 7.27 (7.18–7.34) | 270 (22) |
24h PRBC (units), median (IQR) | 5 (2–9) | 1 (0.01) |
Massive Transfusion, N (%) | 297 (24) | 1 (0.01) |
GCS = Glasgow Coma Score, SBP = Systolic Blood Pressure, DBP = Diastolic Blood Pressure, PRBC = packed red blood cells
Most evaluated variables contained higher proportions of missing information for MT patients compared to patients who did not receive MT (Table 3). Base deficit and pH were exceptions, which were missing at slightly higher proportions for non-MT patients, but the differences were not statistically significant. Survival time was strongly associated with missing data for most evaluated variables, except GCS, base deficit, and pH, for which differences in missing data by survival did not reach statistical significance. Patients surviving < 6 hours after ED admission had the most missing data for all variables except respiratory rate (Table 3). Missing information was highly dependent on study site. Sites reported significantly different percentages of missing information (all p<0.001).
Table 3.
Summary of missing data for injury characteristics, physiological measures, and lab values by MT status and survival at 6 hours and 24 hours after ED admission.
Variable | MT, Missing N (%) | Non-MT, Missing N (%) | p-value† (by MT) | Survived >24 hrs, Missing N (%) | Survived 6–24 hrs, Missing N (%) | Death <6 hrs, Missing N (%) | p-value† (by survival) | Missing Range Among 10 Sites, % | p-value† (by study site) |
---|---|---|---|---|---|---|---|---|---|
Total, N | 295 | 939 | - | 1097 | 46 | 102 | - | - | - |
GCS Total | 39 (13) | 70 (7) | 0.003 | 97 (9) | 2 (4) | 11 (11) | 0.45 | 2–21 | <0.001 |
SBP | 21 (7) | 10 (1) | <0.001 | 16 (1) | 3 (7) | 13 (13) | <0.001 | 0–7 | <0.001 |
DBP | 68 (23) | 148 (16) | 0.004 | 176 (16) | 9 (20) | 32 (31) | <0.001 | 0–75 | <0.001 |
Heart Rate | 13 (4) | 13 (1) | 0.003 | 19 (2) | 0 (0) | 8 (8) | 0.002 | 0–7 | <0.001 |
Respiratory Rate | 149 (50) | 409 (43) | 0.04 | 455 (41) | 35 (76) | 69 (68) | <0.001 | 10–83 | <0.001 |
Hematocrit | 20 (7) | 30 (3) | 0.01 | 31 (3) | 3 (7) | 13 (13) | <0.001 | 0–11 | <0.001 |
Hemoglobin | 19 (6) | 27 (3) | 0.01 | 31 (3) | 3 (7) | 13 (13) | <0.001 | 0–9 | <0.001 |
Base Deficit | 62 (21) | 223 (24) | 0.34 | 243 (22) | 9 (20) | 33 (32) | 0.06 | 6–90 | <0.001 |
pH | 59 (20) | 210 (22) | 0.38 | 232 (21) | 8 (17) | 30 (29) | 0.12 | 0–90 | <0.001 |
† P-values were calculated using Pearson’s Chi-square tests or Fisher’s exact tests when Chi-square assumptions were not met.
GCS = Glasgow Coma Score, SBP = Systolic Blood Pressure, DBP = Diastolic Blood Pressure
Critical and non-critical values utilized during the sensitivity analysis are presented in Table 4. The number of incomplete variables for each MT model is described in Table 5. There was an inverse association between the number of variables in the models and the number of complete cases. The majority of patients had available data for all four variables in the McLaughlin and Larson models (75% and 74%, respectively). However, the Larson model does not require all four variables to be available in order to predict MT status (Table 1). For patients with two positively coded variables or three negatively coded variables, the Larson model predicts MT status even if the remaining variables are incomplete. In the PROMMTT dataset, 88% of cases had sufficient available data to calculate the Larson prediction without imputation (Table 6). Only 41% of patients had available data for all five variables required for the Cancio model.
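The way the Larson rule tolerates incomplete records can be made explicit. In this sketch (hypothetical function name, missing values represented as `None`), the prediction is returned as soon as two criteria are known to be met or three are known not to be met, and is left undetermined otherwise.

```python
def larson_with_missing(hr=None, sbp=None, base_deficit=None, hgb=None):
    """Return True (MT predicted), False (MT not predicted), or None (undetermined)."""
    checks = [
        None if hr is None else hr > 110,
        None if sbp is None else sbp < 110,
        None if base_deficit is None else base_deficit >= 6,
        None if hgb is None else hgb < 11,
    ]
    positives = sum(c is True for c in checks)
    negatives = sum(c is False for c in checks)
    if positives >= 2:
        return True   # two criteria met: missing values cannot change the result
    if negatives >= 3:
        return False  # at most one criterion can still be met
    return None       # the missing values would decide the prediction


print(larson_with_missing(hr=120, sbp=100))          # two positives, two missing
print(larson_with_missing(hr=90, sbp=120, hgb=13))   # three negatives, one missing
print(larson_with_missing(hr=120, sbp=120))          # undetermined
```

Only records that fall into the undetermined branch need imputation, which is why the Larson model could be evaluated for more patients than had all four variables recorded.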
Table 4.
Imputation strategy for MT prediction models: variables, components, and imputation values.
Algorithm Variables | Components | Critical Values | Non-Critical Values |
---|---|---|---|
McLAUGHLIN et al.21 | |||
HR | High HR | 159 | 99 |
SBP | Low SBP | 50 | 115 |
Hematocrit | Low Hematocrit | 19 | 36.6 |
pH | Low pH | 6.83 | 7.31 |
CANCIO et al.23 | |||
DBP | Low DBP | 29 | 72 |
SI | High HR | 159 | 99 |
SI | Low SBP | 50 | 115 |
RTS | Low GCS | 3 | 14 |
RTS | Low SBP | 50 | 115 |
RTS | Low RR | 12 | 20 |
LARSON et al.24 | |||
HR > 110 | High HR | 159 | 99 |
SBP < 110 | Low SBP | 50 | 115 |
Base Deficit ≥ 6 | High Base Deficit | 22 | 5 |
Hemoglobin < 11 | Low Hemoglobin | 6.5 | 12.45 |
Table 5.
Number of patients with missing variables to calculate MT probability using McLaughlin et al.,21 Cancio et al.,23 and Larson et al.24 MT prediction algorithms
Model | Complete cases, N (%) | Cases missing 1 variable, N (%) | Cases missing 2 variables, N (%) | Cases missing 3 variables, N (%) | Cases missing 4 variables, N (%) | Cases missing 5 variables, N (%) |
---|---|---|---|---|---|---|
McLaughlin et al.21 | 936 (75) | 260 (21) | 34 (3) | 8 (0.6) | 7 (0.6) | - |
Cancio et al.23 | 514 (41) | 574 (46) | 123 (10) | 20 (2) | 5 (0.4) | 9 (0.7) |
Larson et al.24 | 1095 (88) * | 279 (22) | 32 (3) | 8 (0.6) | 6 (0.5) | - |
* Larson et al.24 model requires 2 of 4 available positively coded variables or 3 of 4 available negatively coded variables to calculate MT probability.
Table 6.
Comparison of MT predictive model capability using complete case analysis and imputation.
Algorithm | Analysis | Dataset | N (%) | Sensitivity % | Specificity % | Correctly Classified, % | AUC |
---|---|---|---|---|---|---|---|
McLaughlin et al.21 | Complete Case | PROMMTT | 936 (75) | 65 | 58 | 60 | 0.62 |
McLaughlin et al.21 | Multiple Imputation | PROMMTT | 1198 (96) | 61 | 62 | 62 | 0.62 |
McLaughlin et al.21 | Upper Bound | PROMMTT | 1245 (100) | 66 | 63 | 63 | 0.64 |
McLaughlin et al.21 | Lower Bound | PROMMTT | 1245 (100) | 57 | 60 | 59 | 0.59 |
McLaughlin et al.21 | Original Report | McLaughlin et al.21 | 396 (NR) † | 59 | 77 | 70 | 0.75 |
Cancio et al.23 | Complete Case | PROMMTT | 513 (41) ‡ | 77 | 32 | 41 | 0.54 |
Cancio et al.23 | Multiple Imputation | PROMMTT | 1120 (90) | 77 | 30 | 40 | 0.54 |
Cancio et al.23 | Upper Bound | PROMMTT | 1245 (100) | 82 | 32 | 46 | 0.57 |
Cancio et al.23 | Lower Bound | PROMMTT | 1245 (100) | 76 | 27 | 36 | 0.51 |
Cancio et al.23 | Original Report | Cancio et al.23 | 536 (77) | NR | NR | 62 | 0.64 |
Larson et al.24 | Complete Case | PROMMTT | 1095 (88) * | 82 | 43 | 53 | 0.63 |
Larson et al.24 | Multiple Imputation | PROMMTT | 1219 (98) | 80 | 45 | 53 | 0.63 |
Larson et al.24 | Upper Bound | PROMMTT | 1245 (100) | 84 | 50 | 58 | 0.67 |
Larson et al.24 | Lower Bound | PROMMTT | 1245 (100) | 73 | 38 | 46 | 0.55 |
Larson et al.24 | Original Report | Larson et al.24 | 1124 (53) | 69 | 65 | 66 | 0.67 |
† Predictive capability was calculated using a validation dataset, separate from the dataset used for model development. Incomplete cases were excluded from the validation set. The number of incomplete cases is not provided.
‡ One subject had SBP=0 and HR=0; SI could not be calculated.
* Complete cases must contain at least two positively coded variables or three negatively coded variables for evaluation, as required by the Larson et al.24 algorithm.
NR = not reported
Predictive capabilities of the three MT models are compared in Table 6. The highest correct classification was observed in the McLaughlin model for complete case analysis, multiple imputation, and the sensitivity analysis. All models demonstrated similar correct classification percentages using complete case analysis and multiple imputation. In the sensitivity analysis, correct classification ranges were 4% (McLaughlin), 10% (Cancio), and 12% (Larson). All models demonstrated lower predictive accuracy as compared to the original published results using complete case analysis.21,23,24
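The upper-lower bound ranges follow directly from the correct classification percentages in Table 6. The short sketch below reproduces that arithmetic and checks that the complete case and multiple imputation estimates lie within the bounds; the dictionary layout is an assumption made for illustration.

```python
# Percent correctly classified, copied from Table 6:
# (lower bound, complete case, multiple imputation, upper bound)
table6 = {
    "McLaughlin": (59, 60, 62, 63),
    "Cancio": (36, 41, 40, 46),
    "Larson": (46, 53, 53, 58),
}

ranges = {}
for model, (lower, complete_case, imputed, upper) in table6.items():
    ranges[model] = upper - lower
    inside = lower <= complete_case <= upper and lower <= imputed <= upper
    print(f"{model}: range = {ranges[model]} points, estimates within bounds: {inside}")
```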
DISCUSSION
Missing data may be highly variable and difficult to avoid in an emergency trauma setting. Missing information in this study was dependent on patient factors, such as injury severity and mortality, as well as hospital factors, such as individual trauma center processes and policies. To evaluate the impact of missing data on assessments of model predictive accuracy, we compared correct classification percentages using complete case analysis, multiple imputation, and a sensitivity analysis to establish the upper and lower bounds for correct classification. While complete case analysis and multiple imputation resulted in similar estimates for model classification accuracy, the sensitivity analysis provided additional information to assess the impact of missing data on model correct classification.
Future studies should employ strategies to maximize data quality. In PROMMTT, these strategies included: 1) prospective data collection, 2) standardized training of data collectors, 3) collection of information regarding the reasons for missing data, and 4) efforts from a data coordinating center to investigate data quality issues and clean the data. Despite these strategies, some missing data were inevitable in PROMMTT because it was a purely observational study. Whereas clinical trials may ensure complete information by requiring certain tests and observations, PROMMTT did not standardize care, procedures or tests across physicians and study sites, resulting in missing data. Another useful strategy to improve data quality, especially in studies with limited resources, requires maintaining a narrow research focus and limiting information collected to only necessary variables.
Many missing variables were associated with MT status (Table 3); severely hemorrhaging patients often had less complete information than other patients. Survival time was a predictor of missing data for many variables, with higher levels of missingness in patients surviving < 6 hours after ED admission. These results are consistent with previous research demonstrating that injury severity factors are predictive of missing information in trauma registries.4 The strongest and most consistent predictor of missingness was the study site (Table 3). Varying patient populations, clinical workflow processes, and institutional policies among sites are responsible for dramatically different percentages of missing data between trauma centers. These associations between missingness and other measured variables in the dataset suggest that these data were not missing completely at random (MCAR).
Complete case analysis may not accurately assess an algorithm’s predictive ability, especially with high levels of missing data. If data are MCAR, then complete case analysis results in unbiased estimates. If the information is missing at random (MAR), indicating that missing data are correlated with other available variables in the dataset, then imputation strategies will provide more accurate estimates than complete case analysis.3,29,30 However, if information is missing not at random (MNAR), meaning the missingness is dependent on information not available in the dataset, then multiple imputation will not improve the results.3 In this study multiple imputation resulted in similar predictive accuracy of MT models as complete case analysis, and therefore it is possible that the relevant data were MNAR for the study dataset. The effectiveness of multiple imputation may have been limited by the number of available and complete variables. With more complete data, multiple imputation may have been more successful, but the magnitude of this limitation is difficult to determine. Developing a multiple imputation model requires many decisions, such as which variables to impute, which predictor variables to include, and how many datasets to create. Incorrect assumptions about the missing data mechanism and limited availability of predictor variables (whether they were unmeasured or collected with a high level of missingness) limit the usefulness of multiple imputation approaches.
Reporting upper and lower bounds for percent correct classification can be more informative than multiple imputation for understanding the impact of missing data on model predictive accuracy. The sensitivity analysis described in this study estimated the upper and lower bounds for percent correct classification by substituting missing data points with best and worst case scenarios based on patients’ reported MT outcomes. The best case (upper bound) estimate imputed values that would be expected—critical values for MT patients and non-critical values for patients that did not receive MT. The worst case (lower bound) estimate imputed the converse—non-critical values for MT patients and critical values for non-MT patients. The upper-lower bound range of the sensitivity analysis was fairly narrow in the McLaughlin model (4%), suggesting a relatively small impact of missing data on model predictive accuracy. Larger upper-lower bound ranges in the Cancio (10%) and Larson (12%) models suggest that predictive accuracy was more impacted by missing data. This type of systematic approach could be useful in trauma research as well as other clinical specialties, where some level of missing data is inevitable.
Correct classification percentages for all analyzed models were lower than those originally reported. These differences in model accuracy may be a result of the different patient populations in which the models were originally developed as compared to the PROMMTT population. To establish external validity of a prediction model, it is recommended that a different dataset be used to assess the accuracy of the prediction model. The McLaughlin algorithm reported predictive accuracy using a separate validation dataset.21 Our findings in this report are indicators of external validity for the evaluated predictive models. We caution that our results should not be interpreted as endorsement of any particular prediction algorithm.
The definition of MT used in the previously developed prediction algorithms does not include the most severely hemorrhaging patients—those who did not live long enough to receive 10 RBC units, and this survival bias has been recognized in the trauma community.31 There are multiple options to address survival bias,24,32 but a consensus has not been established regarding the best methods to account for survival bias in predicting massive transfusion.
In developing a clinical decision model, the likelihood of encountering missing data in a real clinical setting is important to consider. If the goal is to estimate the probability of MT (or other clinical outcome) for a single patient in order to guide treatment decisions, then multiple imputation approaches are not possible. The overall level of missing data for a model is dependent not only on which variables are selected to be in the model but also the number of variables necessary to complete the model calculations. Developers of clinical decision models may consider incorporating some flexibility in the model such as substituting other available information. The Larson model allows flexibility in the variables used, as only two positively coded variables are necessary to predict MT, and three negatively coded variables are sufficient to predict that the patient will not require MT. The Larson model had 88% complete cases, whereas the McLaughlin model required all four variables to be available and had 75% complete cases, and the Cancio model required five variables and had 41% complete cases.
This study has several limitations. Only three MT models were included because the necessary variables to evaluate additional MT prediction models were not collected in PROMMTT. The study population included 10 academic Level I trauma centers, and the results may not be generalizable to all trauma centers. Missing data are highly site-dependent and different rates of missingness may lead to different results. Survival bias is inherent in the definition of MT and was not accounted for in this study. However, survival bias was present for all evaluated models and imputation approaches, so the comparison among these approaches remains valid. Finally, this study evaluated the impact of missing data on the implementation of MT risk prediction, not on the development of risk prediction models. Missing data are equally important to consider during model development. All three MT models included in this study were developed using data from complete cases only.21,23,24
Strengths of the study include a prospectively collected dataset with standardized computer data entry and data quality assurance protocols. The prospective nature of the study and efforts to maintain data quality provided an opportunity to appraise MT models in likely the best-case scenario of minimizing missing information. Inclusion of 10 Level I trauma centers provided a wide range of missing data that helped to assess external validity of the MT prediction algorithms.
Recording fully complete data in a trauma clinical setting may be especially difficult, as the priority is to provide treatment for severely injured patients. In these settings missing data should be further analyzed to understand its impact on study results. Complete case analysis and/or multiple imputation should be considered only after careful analysis of the missing data mechanisms in order to understand the potential biases introduced by these approaches. The sensitivity analysis proposed in this study provides additional information about the impact of missing data that is not available through complete case analysis or multiple imputation. Wider ranges between sensitivity analysis upper and lower bounds suggest a larger impact of missing data on evaluating a model’s predictive accuracy. Future studies evaluating clinical or other types of risk prediction models with incomplete data should consider utilizing a similar approach to assess the impact of missing data on model classification.
Acknowledgments
Funding/Support: This project was funded by the U.S. Army Medical Research and Materiel Command subcontract W81XWH-08-C-0712. Infrastructure for the Data Coordinating Center was supported by CTSA funds from NIH grant UL1 RR024148.
Role of the Sponsor: The sponsors did not have any role in the design and conduct of the study; collection, management, analysis and interpretation of the data; preparation, review or approval of the manuscript; or the decision to submit this manuscript for publication.
Footnotes
AUTHOR CONTRIBUTIONS
Study concept and design: Holcomb, del Junco, Rahbar, Fox
Acquisition of data: Alarcon, Brasel, Bulger, Cohen, Cotton, Holcomb, Muskat, Myers, Phelan, Schreiber
Data analysis: Trickey
Interpretation of data: Trickey, Fox, del Junco, Ning, Rahbar
Drafting of the manuscript: Trickey
Critical revision of the manuscript for important intellectual content: Trickey, Fox, del Junco, Ning, Rahbar, Holcomb, Alarcon, Brasel, Bulger, Cohen, Cotton, Muskat, Myers, Phelan, Schreiber, Cotton, Wade, White
Obtained funding: Rahbar
Administrative, technical, or material support: Rahbar, Holcomb, Fox, del Junco, Alarcon, Brasel, Bulger, Cohen, Cotton, Muskat, Myers, Phelan, Schreiber
Study supervision: Rahbar, Holcomb
Conflict of Interest Disclosures: All authors have completed and submitted the ICMJE Form for Disclosure of Potential Conflicts of Interest. Dr Holcomb reported serving on the board for Tenaxis, the Regional Advisory Council for Trauma, and the National Trauma Institute; providing expert testimony for the Department of Justice; grants funded by the Haemonetics Corporation, and KCI USA, Inc. and consultant fees from the Winkenwerder Company. No other disclosures were reported.
Disclaimer: The views and opinions expressed in this manuscript are those of the authors and do not reflect the official policy or position of the Army Medical Department, Department of the Army, the Department of Defense, or the United States Government.
Previous Presentation of the Information Reported in the Manuscript: These data were presented at the PROMMTT Symposium held at the 71st Annual Meeting of the American Association for the Surgery of Trauma (AAST) on September 10–15, 2012 in Kauai, Hawaii.
REFERENCES
- 1. Bouamra O, Wrotchford A, Hollis S, Vail A, Woodford M, Lecky F. A new approach to outcome prediction in trauma: a comparison with the TRISS model. J Trauma. 2006;61:701–710. doi: 10.1097/01.ta.0000197175.91116.10.
- 2. Glance LG, Osler TM, Mukamel DB, Meredith W, Dick AW. Impact of statistical approaches for handling missing data on trauma center quality. Ann Surg. 2009;249:143–148. doi: 10.1097/SLA.0b013e31818e544b.
- 3. Little RJA. Regression with missing x’s: a review. J Am Stat Assoc. 1992;87(420):1227–1237.
- 4. Joseph L, Belisle P, Tamim H, Sampalis JS. Selection bias found in interpreting analyses with missing data for the prehospital index for trauma. J Clin Epidemiol. 2004;57:147–153. doi: 10.1016/j.jclinepi.2003.08.002.
- 5. Fairclough DL, Peterson HF, Chang V. Why are missing quality of life data a problem in clinical trials of cancer therapy? Stat Med. 1998;17:667–677. doi: 10.1002/(sici)1097-0258(19980315/15)17:5/7<667::aid-sim813>3.0.co;2-6.
- 6. Como JJ, Dutton RP, Scalea TM, Edelman BB, Hess JR. Blood transfusion rates in the care of acute trauma. Transfusion. 2004;44:809–813. doi: 10.1111/j.1537-2995.2004.03409.x.
- 7. Wilson RF, Mammen E, Walt AJ. Eight years of experience with massive blood transfusions. J Trauma. 1971;11:275–285. doi: 10.1097/00005373-197104000-00001.
- 8. Cinat ME, Wallace WC, Nastanski F, West J, Sloan S, Ocariz J, Wilson SE. Improved survival following massive transfusion in patients who have undergone trauma. Arch Surg. 1999;134:964–968. doi: 10.1001/archsurg.134.9.964.
- 9. Gruen RL, Jurkovich GJ, McIntyre LK, Foy HM, Maier RV. Patterns of errors contributing to trauma mortality: lessons learned from 2594 deaths. Ann Surg. 2006;244:371–380. doi: 10.1097/01.sla.0000234655.83517.56.
- 10. Teixeira PG, Inaba K, Hadjizacharia P, Brown C, Salim A, Rhee P, Browder T, Noguchi TT, Demetriades D. Preventable or potentially preventable mortality at a mature trauma center. J Trauma. 2007;63(6):1338–1347. doi: 10.1097/TA.0b013e31815078ae.
- 11. Holcomb JB, Jenkins D, Rhee P, Johannigman J, Mahoney P, Mehta S, Cox ED, Gehrke MJ, Beilman GJ, Schreiber M, et al. Damage control resuscitation: directly addressing the early coagulopathy of trauma. J Trauma. 2007;62:307–310. doi: 10.1097/TA.0b013e3180324124.
- 12. Cotton BA, Gunter OL, Isbell J, Au BK, Robertson AM, Morris JA, St. Jacques P, Young PP. Damage control hematology: The impact of a trauma exsanguinations protocol on survival and blood product utilization. Presented at: 66th Annual Meeting of the American Association for the Surgery of Trauma; November 5, 2007;
- 13. Shaz BH, Dente CJ, Harris RS, MacLeod JB, Hillyer CD. Transfusion management of trauma patients. Anesth Analg. 2009;108:1760–1768. doi: 10.1213/ane.0b013e3181a0b6c6.
- 14. O’Keeffe T, Refaai M, Tchorz K, Forestner JE, Sarode R. A massive transfusion protocol to decrease blood component use and costs. Arch Surg. 2008;143(7):686–691. doi: 10.1001/archsurg.143.7.686.
- 15. Napolitano L. Cumulative risks of early red blood cell transfusion. J Trauma. 2006;60(6) suppl:S26–S34. doi: 10.1097/01.ta.0000199979.95789.17.
- 16. MacLennan S, Williamson LM. Risks of fresh frozen plasma and platelets. J Trauma. 2006;60(6) suppl:S46–S50. doi: 10.1097/01.ta.0000199546.22925.31.
- 17. Yucel N, Lefering R, Maegele M, Vorweg M, Tjardes T, Ruchholtz S, Neugebauer E, Wappler F, Bouillon B, Rixen D. the Polytrauma Study Group of the German Trauma Society. Trauma Associated Severe Hemorrhage (TASH)-score: probability of mass transfusion as a surrogate for life threatening hemorrhage after multiple trauma. J Trauma. 2006;60(6):1228–1236. doi: 10.1097/01.ta.0000220386.84012.bf.
- 18. Schreiber MA, Perkins J, Kiraly L, Underwood S, Wade C, Holcomb JB. Early predictors of massive transfusion in combat casualties. J Am Coll Surg. 2007;205(4):541–545. doi: 10.1016/j.jamcollsurg.2007.05.007.
- 19. Nunez TC, Dutton WD, May AK, Holcomb JB, Young PP, Cotton BA. Emergency department blood transfusion predicts early massive transfusion and early blood component requirement. Transfusion. 2010;50(9):1914–1920. doi: 10.1111/j.1537-2995.2010.02682.x.
- 20. Nunez TC, Voskresensky IV, Dossett LA, Shinall R, Dutton WD, Cotton BA. Early prediction of massive transfusion in trauma: simple as ABC (assessment of blood consumption)? J Trauma. 2009;66(2):346–352. doi: 10.1097/TA.0b013e3181961c35.
- 21. McLaughlin DF, Niles SE, Salinas J, Perkins JG, Cox ED, Wade CE, Holcomb JB. A predictive model for massive transfusion in combat casualty patients. J Trauma. 2008;64(2 Suppl):S57–S63. doi: 10.1097/TA.0b013e318160a566.
- 22. Rainer TH, Ho AMH, Yeung JHH, Cheung NK, Wong RSM, Tang N, Ng SK, Wong GKC, Lai PBS, Graham CA. Early risk stratification of patients with major trauma requiring massive blood transfusion. Resuscitation. 2011;82(6):724–729. doi: 10.1016/j.resuscitation.2011.02.016.
- 23. Cancio LC, Wade CE, West SA, Holcomb JB. Prediction of mortality and of the need for massive transfusion in casualties arriving at combat support hospitals in Iraq. J Trauma. 2008;64(2 Suppl):S51–S55. doi: 10.1097/TA.0b013e3181608c21.
- 24. Larson CR, White CE, Spinella PC, Jones JA, Holcomb JB, Blackbourne LH, Wade CE. Association of shock, coagulopathy, and initial vital signs with massive transfusion in combat casualties. J Trauma. 2010;69(Suppl 1):S26–S32. doi: 10.1097/TA.0b013e3181e423f4.
- 25. Maegele M, Lefering R, Wafaisade A, Theodorou P, Wutzler S, Fischer P, Bouillon B, Paffrath T. the Trauma Registry of the Deutsche Gesellschaft fur Unfallchirurgie (TR-DGU). Revalidation and update of the TASH-Score: a scoring system to predict the probability for massive transfusion as a surrogate for life-threatening haemorrhage after severe injury. Vox Sang. 2011;100(2):231–238. doi: 10.1111/j.1423-0410.2010.01387.x.
- 26. Rahbar MH, Fox EE, del Junco DJ, Cotton BA, Podbielski J, Matijevic N, Cohen MJ, Schreiber MA, Zhang J, Mirhaji P, et al. Coordination and management of multicenter clinical studies in trauma: Experience from the PRospective Observational Multicenter Major Trauma Transfusion (PROMMTT) Study. Resuscitation. 2012;83:459–464. doi: 10.1016/j.resuscitation.2011.09.019.
- 27. Holcomb JB, del Junco DJ, Fox EE, Wade CE, Cohen MJ, Schreiber MA, Alarcon LH, Bai Y, Brasel KJ, Bulger EM, et al. The Prospective, Observational, Multicenter, Major Trauma Transfusion (PROMMTT) Study: Comparative Effectiveness of a Time-varying Treatment with Competing Risks. Arch Surg. doi: 10.1001/2013.jamasurg.387. in press.
- 28. Schafer JL, Olsen MK. Multiple imputation for multivariate missing data problems: A data analyst’s perspective. Multivariate Behav Res. 1998;33:545–571. doi: 10.1207/s15327906mbr3304_5.
- 29. Ambler G, Omar RZ, Royston P. A comparison of imputation techniques for handling missing predictor values in a risk model with a binary outcome. Stat Methods Med Res. 2007;16:277–298. doi: 10.1177/0962280206074466.
- 30. Janssen KJM, Donders ART, Harrell FE, Vergouwe Y, Chen Q, Grobbee DE, Moons KGM. Missing covariate data in medical research: to impute is better than to ignore. J Clin Epidemiol. 2010;63:721–727. doi: 10.1016/j.jclinepi.2009.12.008.
- 31. Snyder CW, Weinberg JA, McGwin G, Jr, Melton SM, George RL, Reiff DA, Cross JM, Hubbard-Brown J, Rue LW, III, Kerby JD. The relationship of blood product ratio to mortality: survival benefit or survival bias? J Trauma. 2009;66:358–362. doi: 10.1097/TA.0b013e318196c3ac.
- 32. Borgman MA, Spinella PC, Holcomb JB, Blackbourne LH, Wade CE, Lefering R, Bouillon B, Maegele M. The effect of FFP:RBC ratio on morbidity and mortality in trauma patients based on transfusion prediction score. Vox Sang. 2011;101:44–54. doi: 10.1111/j.1423-0410.2011.01466.x.