Abstract
Background:
Injury severity score (ISS) is a measurement of injury severity based on the Abbreviated Injury Scale (AIS). Due to the difficulty and expense of AIS coding, there have been recent efforts to map ISS from administrative International Classification of Diseases (ICD) codes instead. Specifically, the open-source and freely available International Classification of Diseases Programs for Injury Categorization in R (ICDPIC-R) converts ICD-9 codes to ISS. This study aims to compare ICDPIC calculations versus manually derived TQIP calculations for ICD-10 codes. Previous work diverged widely: Fleischman et al. found moderate to substantial concordance between ICDPIC and ISS, whereas DiBartolomeo et al. found none to slight concordance. Given these very different findings, we hypothesized moderate concordance with the use of more detailed ICD-10 codes.
Methods:
This was an observational cohort study of 1,040,728 encounters in the Trauma Quality Improvement Program (TQIP) registry for the year 2018. ICDPIC-R was used to derive ISS from the ICD-10 codes in the registry. The resulting scores were compared to the manually-derived ISS in TQIP.
Results:
The median difference between ISS derived from ICDPIC (ISS-ICDPIC) and manually-derived ISS (ISS-TQIP) was −3 [−5, 0] while the mean difference was −2.09 (−2.10, −2.07). There was substantial concordance between ISS-ICDPIC and ISS-TQIP (κ = 0.66). ISS-ICDPIC was a better predictor of mortality (AUC = 0.853 vs. 0.836) but a worse predictor of ICU admission (AUC = 0.741 vs. 0.757) and hospital stay ≥ 10 days (AUC = 0.701 vs. 0.743). ICDPIC has substantial concordance with TQIP for the firearm (κ = 0.69), motor vehicle trauma (κ = 0.71), and pedestrian (κ = 0.73) injury mechanisms.
Conclusion:
When TQIP data are unavailable, ICDPIC remains a valid way to calculate ISS after transition to ICD-10 codes. ISS-ICDPIC performs well in predicting a number of outcomes of interest but is best served as a predictor of mortality.
Level of Evidence:
Level III Prognostic and Epidemiologic
Keywords: Injury Severity Score, Injury Severity, ICD-10, ICDPIC, Trauma
Social Media Summary:
ICDPIC calculates ISS with moderate concordance with TQIP after transition to ICD-10 codes but requires caution in certain populations. ICDPIC performs best at high ISS ranges and for firearm, motor-vehicle, and pedestrian mechanisms.
Hashtags: #trauma, #ICDPIC, #injuryseverity
BACKGROUND
First described in 1974, the Injury Severity Score (ISS) is a single numerical description of overall injury severity for injured patients.(1) Since then, various other scoring systems for injury severity have been developed, but the ISS has remained the gold standard in trauma research modeling due to its simple calculation and correlation with mortality, length of hospital stay, and need for intensive care.(1–5) The ISS relies on an individual’s Abbreviated Injury Scale (AIS) scores. The AIS assigns each injury a body region and a severity from 1 to 6, with 1 indicating a minor injury and 6 indicating a maximal, untreatable injury.(6) The AIS is the most widely used anatomic scale for rating severity of injuries, but its coding requires expertise with a proprietary manual, making it time-consuming, labor-intensive, and expensive.(6) AIS coding must be done by dedicated, trained personnel. As a result, the generated AIS scores and ISS are largely available only for individuals cared for in trauma centers with the resources to hire these personnel.
There have been many efforts to address this limitation by developing an automated method to abstract ISS from International Classification of Diseases (ICD) codes. One of the most recent efforts has been by Clark et al. (7), who developed the International Classification of Diseases Programs for Injury Categorization in R (ICDPIC-R), open-source software that converts either ICD-9-CM or ICD-10-CM codes to AIS/ISS. Multiple studies have validated ICDPIC code mapping in generating ISS.(8–11) However, these studies were conducted using an older version of ICDPIC and retired ICD-9-CM codes.
The United States transitioned in October 2015 from approximately 17,000 ICD-9-CM codes to over 155,000 detailed ICD-10-CM codes.(12) Clark updated ICDPIC from version 0.1.0 (ICDPIC-2018) to version 1.0.0 (ICDPIC-2021), adding support for basic ICD-10 codes and transitioning from calculating AIS using an ad hoc algorithm to ridge regression.(13–15) Ridge regression is a model tuning method designed to better address multicollinearity of predictors.(16) ICDPIC-2021 derived its lookup tables from the gold standard American College of Surgeons Trauma Quality Improvement Program (TQIP) 2017 data and the Healthcare Cost and Utilization Project National Inpatient Sample (NIS) 2016 data instead of the National Trauma Data Bank (NTDB) 2015 data.(13,15) Clark showed that ICDPIC-2021 using ICD-10-CM codes predicts mortality better than ICDPIC-2018.(13) However, there is a literature gap on whether ICDPIC-2021 correctly predicts ISS compared to manually-derived ISS-TQIP.
Our study’s overall goal was to understand how reliably ICDPIC-2021 estimated ISS compared to TQIP across a range of injury severity, AIS body regions, and primary mechanisms of injury. First, we sought to determine the concordance of ISS calculated by ICDPIC-2021 using ICD-10-CM (ISS-ICDPIC) with ISS manually derived from AIS (ISS-TQIP), comparing exact scores as well as the commonly used ISS categories (1–8, 9–15, 16–24, 25–40, 41–49, 50–75). Second, we sought to compare the AIS severity score distribution by body region between ICDPIC-derived AIS and manually-derived AIS in TQIP. Third, we sought to compare the ability of both scores to predict mortality, ICU admission, and long hospital stay. Lastly, we sought to estimate ISS-ICDPIC performance in classifying ISS compared to ISS-TQIP by mechanism of injury. Previous findings diverged widely: Fleischman et al. found moderate to substantial concordance between ICDPIC and ISS, whereas DiBartolomeo et al. found none to slight concordance.(10,11) Our hypothesis was therefore that ISS and AIS derived from ICDPIC would have moderate concordance with manually-coded ISS and AIS in TQIP when using the more detailed ICD-10-CM codes.
METHODS
Study Design
This study was an observational cohort study of patients included in the TQIP 2018 data set. TQIP is collated by the American College of Surgeons and contains patient data from over 750 participating trauma centers across the United States. Data in TQIP are standardized at the time of submission using the data definitions in the National Trauma Data Standard (NTDS) data dictionary.(17) This study conforms with the Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) guidelines and a complete checklist has been uploaded as Supplemental Digital Content (SDC 1). This study was approved by the Institutional Review Board (STU00212654). Informed consent was waived due to use of a deidentified database.
TQIP encounters from January 1, 2018 through December 31, 2018 were used because all ICD diagnoses were codified under the ICD-10-CM system. Furthermore, unlike the TQIP 2017 data, they had not been used to train the ICDPIC-2021 model. All encounters in the TQIP 2018 data set were eligible for inclusion (n = 1,043,736). Exclusion criteria were encounters without a manually-coded AIS or ISS (n = 2,957) and encounters without valid ICD-10 codes to calculate an ISS using ICDPIC (n = 52). The NTDS considers valid ICD-10-CM codes to be those in ranges S00–S99, T07, T14, T20–28, and T30–32. ICDPIC-R also requires that injury codes conclude with the letter “A”, indicating an initial encounter.(13) There is an exception for codes indicating a fracture, where codes concluding with the letters “B” or “C” indicate an initial encounter with an open fracture. Based on these exclusion criteria, only encounters with a manually-coded ISS (ISS-TQIP) and ICD-10 derived ISS using ICDPIC-2021 (ISS-ICDPIC) were included in analysis (n = 1,040,728). Encounters without a TQIP manually-coded mechanism of injury were excluded from later analysis involving mechanism stratification (n = 23,177). The flow diagram for encounter inclusion is shown in Figure 1.
Figure 1.

Flow diagram of inclusion and exclusion criteria.
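The code-validity rules described above (NTDS category ranges plus an initial-encounter seventh character) can be sketched as a small filter. This is an illustration under stated assumptions, not the logic ICDPIC-R actually implements: the function names are hypothetical, the list of fracture categories is an illustrative guess rather than something given in the text, and the text is read literally, so codes without a seventh character are treated as unusable.

```python
import re

# ASSUMPTION (not from the source): an illustrative, non-exhaustive set of
# ICD-10-CM categories treated as fractures for the "B"/"C" exception.
FRACTURE_CATEGORIES = {
    "S02", "S12", "S22", "S32", "S42", "S52", "S62", "S72", "S82", "S92",
}


def in_ntds_range(code: str) -> bool:
    """True if the category is in S00-S99, T07, T14, T20-T28, or T30-T32."""
    category = code.replace(".", "").upper()[:3]
    if not re.fullmatch(r"[ST]\d\d", category):
        return False
    if category[0] == "S":
        return True  # every S00-S99 category is in range
    number = int(category[1:])
    return number in (7, 14) or 20 <= number <= 28 or 30 <= number <= 32


def is_initial_encounter(code: str) -> bool:
    """True if the code ends in 'A', or in 'B'/'C' for an assumed fracture category."""
    compact = code.replace(".", "").upper()
    if len(compact) != 7:
        return False  # literal reading: a seventh character is required
    last = compact[-1]
    if last == "A":
        return True
    return last in ("B", "C") and compact[:3] in FRACTURE_CATEGORIES


def is_usable_injury_code(code: str) -> bool:
    """Combine both rules (hypothetical helper, not an ICDPIC-R function)."""
    return in_ntds_range(code) and is_initial_encounter(code)
```

Under these assumptions, a code such as `S72.001B` (open fracture, initial encounter) passes, while `S72.001D` (subsequent encounter) and `T40.1X1A` (outside the NTDS ranges) do not.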
Study Variables
The primary predictor was the ICDPIC-derived ISS. For each encounter, ISS was calculated as the sum of the squares of the scores of the three most severely injured AIS body regions.(1) ICDPIC-R version 1.0.0 (ICDPIC-2021) was used to derive ISS from the ICD-10-CM codes in the TQIP registry. For ICDPIC-2021, the “roc_max_TQIP” method was specified in the package to convert ICD to AIS, as the developers recommend this method for US trauma registries coded using only ICD-10-CM.(7) The corresponding AIS body regions were assigned using the proposed Centers for Disease Control (CDC) classification.(7,18) Following the standard set by Boyd et al. (19), ISS > 15 was used as one cutoff for “serious” injury. A second cutoff of ISS > 24 was used as a marker of “severe” injury, based on literature suggesting similar outcomes for patients with an ISS less than 15 and those with an ISS between 16 and 24.(20) Other covariates of interest were patient age, race, sex, mechanism of injury, number of ICD-10 codes, number of AIS codes, manually-derived AIS Severity Score for all body regions, in-hospital mortality, rate of ICU admission, and duration of hospital stay. A hospital stay greater than or equal to 10 days was considered a long hospital stay. The primary outcome of interest was the concordance between ISS-TQIP and ISS-ICDPIC.
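As a worked illustration of the ISS definition above, the following minimal sketch sums the squares of the three highest regional AIS severities. The function name and region labels are hypothetical. The rule that any AIS of 6 sets the ISS to 75 is the usual ISS convention and is not stated in this text, so it is flagged as an assumption in the code.

```python
from typing import Dict


def injury_severity_score(max_ais_by_region: Dict[str, int]) -> int:
    """Compute ISS from the highest AIS severity recorded in each body region.

    Keys are body-region labels (hypothetical names here); values are AIS
    severities from 0 (no injury in that region) to 6.
    """
    severities = list(max_ais_by_region.values())
    if any(s < 0 or s > 6 for s in severities):
        raise ValueError("AIS severity must be between 0 and 6")
    # ASSUMPTION: standard convention (not stated in the text) that an
    # untreatable injury (AIS 6) sets ISS to its maximum of 75.
    if 6 in severities:
        return 75
    top_three = sorted(severities, reverse=True)[:3]
    return sum(s * s for s in top_three)


example = {"head_neck": 4, "chest": 3, "extremities": 2, "external": 1}
print(injury_severity_score(example))  # 4^2 + 3^2 + 2^2 = 29
```

Because only the three most severely injured regions contribute, the external injury in this example does not change the score.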
Statistics
Continuous data were reported as mean and standard deviation, mean and 95% confidence interval, or median and interquartile range (IQR). Categorical data were reported as totals with percentages included as appropriate. To assess concordance between the manually-derived ISS-TQIP and ICDPIC-derived ISS, the percent agreement, mean difference, and median difference between scores were calculated. Bland-Altman plots of the difference between ISS-ICDPIC and ISS-TQIP versus the mean of the two scores were used to visualize agreement.(21) The limits of agreement were set a priori at 95%, or 1.96 standard deviations from the mean difference. Subsequently, the weighted Cohen’s Kappa (κ) was calculated to assess concordance of ISS-TQIP and ISS-ICDPIC as exact scores, as categorized into the previously established categories of 1–8, 9–15, 16–24, 25–40, 41–49, and 50–75 (22), and as dichotomized into seriously injured (ISS > 15) or severely injured (ISS > 24). We then calculated the sensitivity, specificity, positive predictive value, and negative predictive value of ISS-ICDPIC in predicting serious or severe injury, treating ISS-TQIP as the gold standard. To compare the predictive ability of the two scores, we evaluated how well each predicted the outcomes of interest: in-hospital mortality, ICU admission, and long hospital stay. We fit logistic regression models in which one of these outcomes was the dependent variable and one of the scores was the independent variable. Performance was assessed based on the resulting receiver operating characteristic (ROC) curves and area under the curve (AUC) values. AUC values of the correlated ROC curves were compared using the DeLong statistic.(23)
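The Bland-Altman quantities described above (mean difference and limits of agreement at 1.96 standard deviations) can be sketched as follows. This is a minimal illustration on made-up scores, not the study's analysis code; the function name is hypothetical, and the use of the sample standard deviation (n − 1 denominator) is an assumption, since the text does not say which was used.

```python
from statistics import mean, stdev
from typing import Dict, List, Sequence


def bland_altman(iss_icdpic: Sequence[float], iss_tqip: Sequence[float]) -> Dict[str, object]:
    """Return the bias and 95% limits of agreement for paired scores."""
    if len(iss_icdpic) != len(iss_tqip) or len(iss_icdpic) < 2:
        raise ValueError("need two equally long sequences with at least 2 pairs")
    # Difference is ICDPIC minus TQIP, so negative values mean underestimation.
    diffs: List[float] = [a - b for a, b in zip(iss_icdpic, iss_tqip)]
    pair_means: List[float] = [(a + b) / 2 for a, b in zip(iss_icdpic, iss_tqip)]
    bias = mean(diffs)
    spread = 1.96 * stdev(diffs)  # ASSUMPTION: sample SD (n - 1 denominator)
    return {
        "bias": bias,
        "lower_limit": bias - spread,
        "upper_limit": bias + spread,
        "x": pair_means,  # x-axis of the plot: mean of the two scores
        "y": diffs,       # y-axis of the plot: difference of the two scores
    }


# Toy data, invented purely for illustration.
toy_icdpic = [4, 9, 1, 17, 25]
toy_tqip = [8, 10, 4, 16, 29]
result = bland_altman(toy_icdpic, toy_tqip)
print(result["bias"], result["lower_limit"], result["upper_limit"])
```

The limits are symmetric around the bias by construction, which is a quick sanity check when reading reported limits of agreement.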
We also compared ICDPIC AIS Severity Score estimations across the distribution of manually-coded TQIP AIS Severity Scores by AIS body region. Lastly, we calculated ISS-ICDPIC performance in classifying ISS compared to ISS-TQIP among primary mechanisms of injury as specified by TQIP. Python version 3.7 was used for all data management and statistical analysis. R version 4.0.3 was used to run ICDPIC-R.
RESULTS
A total of 1,040,728 encounters had sufficient data to calculate ISS-TQIP and ISS-ICDPIC and were included in this study. This sample was 59.3% male and 75.1% white with a median age of 51 and a median injury severity score of 8 (Table 1).
Table 1:
Study Sample Characteristics
| Variables | N = 1,040,728 (%) |
|---|---|
| Sex,a | |
| Male | 616628 (59.3) |
| Female | 423980 (40.7) |
| Race,b | |
| White | 769351 (75.1) |
| Black | 145664 (14.2) |
| Asian | 19988 (2.0) |
| Other | 90713 (8.9) |
| Age, years, median [IQR]c | 51 [28–70] |
| ≤ 18 years | 123835 (12.7) |
| 19–44 years | 299592 (30.8) |
| 45–64 years | 234009 (24.0) |
| 65–84 years | 253549 (26.0) |
| ≥ 85 years | 63011 (6.5) |
| Estimate [IQR] | |
| Injury severity score | 8 [4, 10] |
| Number of AIS codes | 2 [1, 4] |
| Number of ICD codes | 3 [1, 5] |
| Length of stay, daysd | 3 [2, 6] |
| ICU admission, mean (95% CI) | 18.2% (18.1, 18.3) |
| In-hospital mortality, mean (95% CI) | 3.53% (3.50, 3.57) |
IQR = Interquartile Range, AIS = Abbreviated Injury Scale, ICD = International Classification of Disease, CI = Confidence Interval
a. 120 encounters did not have sex recorded.
b. 16,952 encounters did not have race recorded.
c. 66,732 encounters did not have age recorded.
d. 13,312 encounters did not have length of stay recorded.
The median ISS-TQIP and ISS-ICDPIC were 8 [4, 10] and 4 [1, 9], respectively. The median difference between the two scores was −3 [−5, 0]. The Bland-Altman plots demonstrated the difference between ISS-ICDPIC and ISS-TQIP versus the mean of the two scores (Figure 2). The mean difference was −2.08 with a 95% confidence interval of (−2.09, −2.06) and agreement limits of (−14.51, 10.34).
Figure 2.

Bland-Altman plot of the mean of ISS-ICDPIC and ISS-TQIP versus the difference of ISS-ICDPIC-2021 minus ISS-TQIP. Mean difference −2.08 (dashed blue line, 95% CI [−2.09, −2.06]). Limits of agreement −14.51 to 10.34 (dashed orange lines).
ICDPIC calculated the same ISS as TQIP 17.7% of the time, with a weighted κ = 0.66. ISS-ICDPIC fell in the same ISS category as ISS-TQIP 57.7% of the time, corresponding to a weighted κ = 0.61, and in the same or an adjacent category 93.6% of the time (Table 2). ICDPIC tended to mis-categorize into a lower rather than a higher category.
Table 2:
Concordance of ISS-TQIP with ISS-ICDPIC by Six ISS Severity Categories
| ISS-TQIP (rows) vs. ISS-ICDPICa (columns) | 1–8 | 9–15 | 16–24 | 25–40 | 41–49 | 50–75 | Total |
|---|---|---|---|---|---|---|---|
| 1–8 | 455339 | 54017 | 12702 | 3119 | 69 | 12 | 525258 |
| 9–15 | 212051 | 93274 | 40607 | 7545 | 181 | 41 | 353699 |
| 16–24 | 21193 | 33568 | 30499 | 10729 | 577 | 136 | 96702 |
| 25–40 | 4093 | 12263 | 17550 | 19730 | 1928 | 688 | 56252 |
| 41–49 | 53 | 281 | 699 | 2091 | 811 | 504 | 4439 |
| 50–75 | 50 | 129 | 318 | 2097 | 823 | 961 | 4378 |
| Total | 692779 | 193532 | 102375 | 45311 | 4389 | 2342 | 1040728 |
ISS = Injury Severity Score, ICD = International Classification of Disease, TQIP = Trauma Quality Improvement Program
The weighted κ for concordance between ISS-ICDPIC and ISS-TQIP was 0.66.
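The category-level agreement figures can be recomputed directly from the counts in Table 2. The sketch below is illustrative only: the function names are hypothetical, and the quadratic disagreement weights are an assumption, because the text does not state which weighting scheme was used for the weighted κ.

```python
# Counts copied from Table 2: rows are ISS-TQIP categories, columns are
# ISS-ICDPIC categories, both ordered 1-8, 9-15, 16-24, 25-40, 41-49, 50-75.
TABLE_2 = [
    [455339, 54017, 12702, 3119, 69, 12],
    [212051, 93274, 40607, 7545, 181, 41],
    [21193, 33568, 30499, 10729, 577, 136],
    [4093, 12263, 17550, 19730, 1928, 688],
    [53, 281, 699, 2091, 811, 504],
    [50, 129, 318, 2097, 823, 961],
]


def agreement_within(table, max_distance):
    """Share of encounters whose two categories differ by at most max_distance."""
    total = sum(sum(row) for row in table)
    hits = sum(
        count
        for i, row in enumerate(table)
        for j, count in enumerate(row)
        if abs(i - j) <= max_distance
    )
    return hits / total


def weighted_kappa(table, power=2):
    """Weighted Cohen's kappa with disagreement weights |i - j| ** power.

    ASSUMPTION: power=2 (quadratic weights); the source does not say.
    """
    k = len(table)
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(table[i][j] for i in range(k)) for j in range(k)]
    observed = sum(
        abs(i - j) ** power * table[i][j] for i in range(k) for j in range(k)
    ) / n
    expected = sum(
        abs(i - j) ** power * row_totals[i] * col_totals[j]
        for i in range(k)
        for j in range(k)
    ) / (n * n)
    return 1 - observed / expected


print(round(agreement_within(TABLE_2, 0) * 100, 1))  # same category
print(round(agreement_within(TABLE_2, 1) * 100, 1))  # same or adjacent category
print(round(weighted_kappa(TABLE_2), 2))
```

With the Table 2 counts, the same-category and same-or-adjacent shares come out at the 57.7% and 93.6% quoted in the text, which suggests the table and the prose are consistent with each other.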
Using a cutoff of ISS > 15 for serious injury, TQIP classified 15.5% of encounters as seriously injured while ICDPIC classified 14.8% as seriously injured. Agreement between ICDPIC and TQIP for the classification of ISS > 15 was 86.9%, corresponding to a weighted κ = 0.49. The sensitivity, specificity, positive predictive value, and negative predictive value of serious injury classification by ICDPIC compared to TQIP were 55.7%, 92.7%, 58.4%, and 91.9% respectively.
Using a cutoff of ISS > 24 for severe injury, TQIP classified 6.3% of encounters as severely injured while ICDPIC classified 5.0%. Agreement between ICDPIC and TQIP for the classification of ISS > 24 was 94.4%, corresponding to a weighted κ = 0.48. The sensitivity, specificity, positive predictive value, and negative predictive value of severe injury classification by ICDPIC-2021 compared to TQIP were 45.5%, 97.7%, 57.0%, and 96.4% respectively.
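The dichotomous comparisons above follow from collapsing Table 2 into a 2×2 table at the chosen cutoff and treating ISS-TQIP as the reference. The following sketch shows one way to do this; the function names are hypothetical and the code is an illustration, not the authors' implementation.

```python
# Counts copied from Table 2: rows are ISS-TQIP categories, columns are
# ISS-ICDPIC categories, both ordered 1-8, 9-15, 16-24, 25-40, 41-49, 50-75.
TABLE_2 = [
    [455339, 54017, 12702, 3119, 69, 12],
    [212051, 93274, 40607, 7545, 181, 41],
    [21193, 33568, 30499, 10729, 577, 136],
    [4093, 12263, 17550, 19730, 1928, 688],
    [53, 281, 699, 2091, 811, 504],
    [50, 129, 318, 2097, 823, 961],
]


def dichotomize(table, first_positive):
    """Collapse to 2x2 counts; category indices >= first_positive are positive."""
    tp = fp = fn = tn = 0
    for i, row in enumerate(table):        # i: ISS-TQIP (reference) category
        for j, count in enumerate(row):    # j: ISS-ICDPIC (test) category
            reference_positive = i >= first_positive
            test_positive = j >= first_positive
            if reference_positive and test_positive:
                tp += count
            elif reference_positive:
                fn += count
            elif test_positive:
                fp += count
            else:
                tn += count
    return tp, fp, fn, tn


def diagnostic_summary(tp, fp, fn, tn):
    """Sensitivity, specificity, PPV, NPV, raw agreement, and Cohen's kappa."""
    n = tp + fp + fn + tn
    observed = (tp + tn) / n
    # Chance agreement from the marginal totals of the 2x2 table.
    expected = ((tp + fn) * (tp + fp) + (tn + fp) * (tn + fn)) / (n * n)
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "agreement": observed,
        "kappa": (observed - expected) / (1 - expected),
    }


# Index 2 is the 16-24 category, so first_positive=2 corresponds to ISS > 15.
serious = diagnostic_summary(*dichotomize(TABLE_2, first_positive=2))
# Index 3 is the 25-40 category, so first_positive=3 corresponds to ISS > 24.
severe = diagnostic_summary(*dichotomize(TABLE_2, first_positive=3))
for name, value in serious.items():
    print(name, round(value, 3))
```

For a two-category table, weighting has no effect, so the plain Cohen's κ computed here is the same quantity as a weighted κ on the dichotomized scores.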
Figure 3 shows the ROC curves of the two ISS scores for predicting mortality, ICU admission, and hospital stay greater than or equal to 10 days. The AUC of ICDPIC is higher for predicting mortality (0.853 vs. 0.836, p < 0.01) but lower for predicting ICU admission (0.741 vs. 0.757, p < 0.01) or long hospital stay (0.701 vs. 0.743, p < 0.01).
Figure 3.

A) Receiver operating characteristic (ROC) curves for ISS-ICDPIC and ISS-TQIP in predicting in-hospital mortality. B) ROC curves for ISS-ICDPIC and ISS-TQIP in predicting ICU admission. C) ROC curves for ISS-ICDPIC and ISS-TQIP in predicting long hospital stay (length of hospital stay ≥ 10 days).
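The AUC values compared above can be understood through their rank interpretation: the AUC is the probability that a randomly chosen patient with the outcome has a higher score than a randomly chosen patient without it, counting ties as one half. For a logistic model with a single score as predictor and a positive coefficient, the predicted probability is a monotone transform of the score, so the AUC of the model equals the AUC of the raw score. The sketch below uses invented data and a hypothetical function name; its pairwise loop is fine for a toy example but would be far too slow for a registry-sized data set, and the DeLong comparison is not shown.

```python
from typing import Sequence


def rank_auc(scores: Sequence[float], outcomes: Sequence[int]) -> float:
    """AUC as the share of (event, non-event) pairs ranked correctly by score."""
    with_event = [s for s, y in zip(scores, outcomes) if y == 1]
    without_event = [s for s, y in zip(scores, outcomes) if y == 0]
    if not with_event or not without_event:
        raise ValueError("need at least one event and one non-event")
    correctly_ranked = 0.0
    for p in with_event:
        for q in without_event:
            if p > q:
                correctly_ranked += 1.0   # higher score for the patient with the event
            elif p == q:
                correctly_ranked += 0.5   # ties count as half
    return correctly_ranked / (len(with_event) * len(without_event))


# Toy data, invented for illustration: ISS values and in-hospital death (1 = died).
toy_iss = [1, 4, 9, 16, 25, 34]
toy_died = [0, 0, 0, 1, 0, 1]
print(rank_auc(toy_iss, toy_died))  # 7 of 8 pairs ranked correctly -> 0.875
```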
There were a total of 3,551,792 manually-coded AIS severity scores and 4,022,483 ICDPIC-2021 derived scores (SDC 2). The manually-derived median AIS severity score was 2 [1, 2]. The ICDPIC-derived median AIS severity score was 1 [1, 2]. Compared to TQIP, ICDPIC categorized more codes into an AIS severity of 1. As a result, the distribution of AIS severity by AIS body region differed significantly from TQIP for all body regions. In particular, ICDPIC mapped more ICD-10 codes into the extremities and head/neck regions and fewer into the external region.
Mean and median differences between the calculated ISS-ICDPIC and ISS-TQIP scores differed depending on the mechanism of injury (Table 3). There was at least moderate concordance between ISS-ICDPIC and ISS-TQIP for all injury mechanisms. The mechanisms with the lowest concordance, as represented by the lowest weighted κ statistic, were drowning/suffocation and poisoning. The mean difference between ISS-ICDPIC and ISS-TQIP was smallest in magnitude for the hot object, natural or environmental, overexertion category (−0.19) and the firearm mechanism (−0.25). On average, ISS-ICDPIC underestimated ISS-TQIP for all mechanisms except cut/pierce, fire/flame, and poisoning.
Table 3:
Comparing ISS-ICDPIC-2021 Concordance with ISS-TQIP By Primary Mechanism of Injury
| Primary Mechanism of Injury | N (%) | Mean Difference (95% CI) | Limits of Agreement | Median Difference [IQR] | Weighted κ (same ISS) |
|---|---|---|---|---|---|
| All mechanisms | 1017551 (100) | −2.08 (−2.09, −2.06) | −14.49 to 10.34 | −3 [−5, 0] | 0.66 |
| Cut or pierce | 42324 (4.2) | 1.05 (0.98, 1.11) | −12.16 to 14.26 | 0 [−3, 4] | 0.54 |
| Drowning or suffocation | 709 (0.07) | −5.01 (−5.80, −4.23) | −25.90 to 15.87 | −3 [−9, 0] | 0.47 |
| Fall | 497721 (48.9) | −2.54 (−2.56, −2.53) | −12.91 to 7.83 | −3 [−5, 0] | 0.58 |
| Fire or flame | 8440 (0.8) | 1.69 (1.51, 1.87) | −14.98 to 18.36 | 0 [0, 3] | 0.66 |
| Firearm | 42161 (4.1) | −0.25 (−0.34, −0.15) | −19.69 to 19.20 | 0 [−5, 5] | 0.69 |
| Hot object, natural or environmental, overexertion | 27195 (2.7) | −0.19 (−0.25, −0.13) | −10.21 to 9.82 | 0 [−3, 1] | 0.53 |
| Motor vehicle trauma | 276421 (27.2) | −2.42 (−2.45, −2.40) | −16.52 to 11.68 | −3 [−6, 1] | 0.71 |
| Machinery | 7209 (0.7) | −1.26 (−1.37, −1.14) | −11.08 to 8.57 | −1 [−3, 0] | 0.61 |
| Other | 22084 (2.2) | −1.01 (−1.10, −0.93) | −14.06 to 12.03 | 0 [−3, 0] | 0.66 |
| Pedal cyclist, other | 15727 (1.5) | −2.62 (−2.71, −2.53) | −13.73 to 8.49 | −3 [−5, 0] | 0.58 |
| Pedestrian, other | 7860 (0.8) | −2.24 (−2.41, −2.08) | −16.91 to 12.42 | −3 [−6, 1] | 0.73 |
| Poisoning | 246 (0.02) | 3.17 (2.25, 4.09) | −11.15 to 17.48 | 0 [0, 7] | 0.48 |
| Struck by, against | 66384 (6.5) | −1.87 (−1.92, −1.83) | −12.34 to 8.59 | −3 [−4, 0] | 0.61 |
| Unspecified | 3070 (0.3) | −1.16 (−1.40, −0.92) | −14.37 to 12.05 | −1 [−4, 2] | 0.66 |
DISCUSSION
The aim of the study was to assess the performance of ICDPIC in mapping ICD-10 codes to ISS. The results of this study demonstrated substantial concordance between ISS derived from ICD-10-CM codes using ICDPIC and manually-coded ISS-TQIP. Consistent with previous studies, ISS-ICDPIC underestimated ISS-TQIP.(10,24) While the substantial concordance between the two scores is encouraging, there are several differences in performance between the two scores that researchers should be aware of when deciding to utilize ICDPIC.
First, there is poor agreement between the two scores at higher ISS. The sensitivity and PPV of ISS-ICDPIC for predicting ISS-TQIP > 15 or ISS-TQIP > 24 are all below 60%. This finding is particularly concerning given the common use of ISS in identifying patients who are seriously or severely injured. Interestingly, this poor concordance at higher ISS values does not translate to consistently worse performance in predicting other outcomes associated with increased injury severity. ISS-ICDPIC has a higher AUC value as a predictor of mortality but a lower AUC value as a predictor of ICU admission or long hospital stay. This difference in performance can be attributed to the fact that the algorithm for ICDPIC was designed around optimizing mortality prediction when assigning AIS and ISS values.(25) Looking at the comparison of the two ROC curves for predicting mortality, it should be noted that they do cross. At lower false positive rate (FPR) and higher specificity there is an area in which TQIP outperforms ICDPIC. However, given that most trauma researchers are interested in a sensitive predictor of mortality, the superior performance of ISS-ICDPIC when the FPR is greater than 0.15 is encouraging.
That ISS-ICDPIC is a better predictor of mortality while matching ISS-TQIP poorly at higher severity values shows that, although the gold standard manually-coded ISS correlates well with mortality, it is not a perfect predictor. Optimizing mortality prediction is not the same as minimizing the difference between ISS-ICDPIC and ISS-TQIP. The decreased performance of ISS-ICDPIC in predicting ICU admission or long hospital stay also shows that improved mortality prediction does not guarantee improved prediction of other outcomes associated with high injury severity. In fact, emphasizing accurate mortality prediction comes at the cost of accurate prediction of these other two outcomes. However, it should be acknowledged that despite poor concordance between the two scores at higher ISS, the ROC curves and AUC values of ISS-ICDPIC and ISS-TQIP are similar and well above the diagonal, indicating that ISS-ICDPIC performs respectably as a predictor of multiple outcomes of interest. As a result, we believe that ISS-ICDPIC can be used as a substitute for ISS-TQIP, particularly in studies that use ISS to study mortality.
The concordance between the two scores also varied by different mechanisms of injury. In subsequent analysis, we found that ICDPIC tended to achieve higher concordance with TQIP for mechanisms with higher rates of mortality. Specifically, ICDPIC performed best for the mechanisms of motor vehicle trauma, firearm, and pedestrian injury which had mortalities of 4.1%, 15.6%, and 6.1%, respectively. ICDPIC performed relatively poorly for the mechanisms of cut/pierce and hot object, natural/environmental, overexertion which had mortalities of 1.6% and 0.9%, respectively.
This observation lines up with our previous discussion on the design of ICDPIC to optimize mortality prediction. For example, when there are 100 encounters labeled as fatal, ICDPIC can map the associated ICD-10 diagnosis to a more accurate and full injury severity gradient depending on if certain codes are present in all 100 encounters, 80, 60, or so on. This becomes more difficult when there are only 5 encounters as a certain code being present in 2 vs. 3 encounters will lead to different injury severity scores, but assigning them different severity scores may be inappropriate as a code being present in 1 additional fatal encounter may be due to chance rather than an actual association with increased severity. In other words, since the algorithm works to predict mortality and mortality is a rare event occurring in around 3.5% of encounters, the algorithm places greater weight on codes that contribute to mortality. As a result, these codes have a more accurate mapped ISS severity. Surprisingly, despite ICDPIC having a very different distribution of assigned AIS body regions compared to TQIP, this did not seem to factor into the differences seen in concordance after mechanism stratification. ICDPIC may not map a patient’s injury to the exact same AIS region as TQIP but it was generally able to capture when injuries were to multiple regions of the body. This is likely due to the detailed ICD-10-CM coding system that allows for further specification of injury location. Overall, researchers should recognize that ISS-ICDPIC has good concordance with the gold standard for the firearm, motor vehicle trauma, and pedestrian mechanisms and can be more reliably used in datasets primarily consisting of these mechanisms of injury and/or datasets with high rates of mortality.
Our study was similar to Fleischman et al.’s (10) multiregional retrospective cohort study that compared ICD-9-derived ISS-ICDPIC with manually-coded values in a trauma registry leveraging ICD-9 codes. Their study had higher rates of agreement with respect to same ISS (59.2% vs. 17.7%), same ISS range (76.4% vs. 57.7%), same or adjacent ISS range (95.6% vs. 93.6%), and ISS > 15 (91.1% vs. 86.9%). Again, we believe these differences are primarily driven by the ICDPIC algorithm optimizing mortality prediction, which does not necessarily correlate with accurately matching ISS-TQIP. Unlike in their study where ISS-ICDPIC performed worse compared to manually-coded ISS in predicting mortality (AUC = 0.84 vs 0.88), ISS-ICDPIC is actually a better predictor of mortality in our study. Otherwise, ICD-10-CM predictably provided a more detailed coding system with almost 8000 distinct codes compared to the almost 2000 distinct AIS codes.
The implications of this study are that ISS-ICDPIC utilizing ICD-10-CM codes should continue to serve as a valid substitute for calculation of ISS when manually-coded ISS is not available. Overall, ICDPIC has substantial concordance with TQIP when calculating the same ISS or same ISS range. ISS-ICDPIC and ISS-TQIP have poor agreement for categorizing serious or severe injury but are both good predictors of mortality, ICU admission, and long hospital stay. Due to the design of ICDPIC in mapping ICD-10 codes to AIS severity to best predict mortality, ISS-ICDPIC is best used in studies that wish to utilize ISS in studying mortality. In addition, ICDPIC is best applied to datasets that have higher mortality rates.
Further analysis should be conducted to identify which specific ICD-10-CM codes are major contributors to the differences between ISS-ICDPIC and ISS-TQIP. This can be accomplished through analysis of outliers in the Bland-Altman plot and further analysis of the specific ICD-10-CM codes associated with the mechanisms that had the lowest Cohen’s κ values. We are also curious about the utility of an algorithm that optimizes prediction of other outcomes of interest or an algorithm that works to minimize the difference between ICDPIC and manually-coded ISS.
This study has several limitations. First, it is primarily limited by the quality of the coding of ICD-10-CM diagnoses, AIS scores, and ISS in the TQIP dataset. Nonetheless, TQIP uses standardized definitions and dedicated trained surgical clinical reviewers to abstract data, resulting in the highest quality data currently available on a large scale in trauma.(26) Second, because TQIP data are submitted from trauma centers only, our findings have limited generalizability to non-trauma centers and even low-level trauma centers. It is unknown if our results can be generalized to non-trauma hospitals given the potential differences in coding resources and processes between trauma and non-trauma centers. Third, this study focused on concordance between ISS-ICDPIC and ISS-TQIP. ISS is a good, but not perfect, metric for predicting mortality. As such, demonstrating concordance may not be helpful for researchers who use other injury severity scoring systems. Nonetheless, we believe our data offer important insights into which ISS categories, AIS body regions, and primary mechanisms of injury demonstrate high concordance after the transition to ICD-10-CM codes. Specifically, the Healthcare Cost and Utilization Project Nationwide Inpatient Sample (HCUP/NIS) uses ICDPIC to calculate ISS but still uses a version of ICDPIC that only supports ICD-9-CM codes. These data support researchers using ICDPIC in HCUP/NIS encounters that include ICD-10-CM codes, with the understanding that ICDPIC performs best in studying patients with these specific injury profiles.
CONCLUSION
ICD to AIS code mapping using ICDPIC-R remains a valid way to generate ISS after the transition from ICD-9-CM to ICD-10-CM codes. ISS-ICDPIC generally underestimates ISS when compared to ISS-TQIP. ISS-ICDPIC performs best when used in standard categories (1–8, 9–15, 16–24, 25–40, 41–49, 50–75) and when used as a predictor of mortality. Further study is required to improve ICDPIC code mapping, especially at higher ISS range categories, and for certain mechanisms of injury. Manually-derived data from TQIP remain the gold standard.
Supplementary Material
Source of Funding:
Anne Stey was funded by AAST research award and the ACS C. James Carrico Faculty Research Fellowship and K23HL157832-01 from NHLBI.
Footnotes
SUPPLEMENTAL DIGITAL CONTENT
- STROBE Checklist application in the manuscript
- Abbreviated injury severity score by body region for the trauma quality improvement program 2017 and 2018.
Previous Presentation: These data were presented at the American College of Surgeons Committee on Trauma, May 2021, Chicago IL.
Conflicts of Interest: There are no conflicts of interest in regards to these data.
REFERENCES
- 1.Baker SP, O’Neill B, Haddon W, Long WB. The injury severity score: a method for describing patients with multiple injuries and evaluating emergency care. J Trauma. 1974;14(3):187–96. [PubMed] [Google Scholar]
- 2.Chawda MN, Hildebrand F, Pape HC, Giannoudis PV. Predicting outcome after multiple trauma: Which scoring system? Vol. 35, Injury. Elsevier; 2004. p. 347–58.
- 3.Paladugu R, Schein M, Gardezi S, Wise L. One hundred citation classics in general surgical journals. World J Surg [Internet]. 2002. Sep 25 [cited 2021 Apr 19];26(9):1099–105. Available from: https://link.springer.com/article/10.1007/s00268-002-6376-7
- 4.Tamim H, Al Hazzouri AZ, Mahfoud Z, Atoui M, El-Chemaly S. The injury severity score or the new injury severity score for predicting mortality, intensive care unit admission and length of hospital stay: Experience from a university hospital in a developing country. Injury. 2008. Jan 1;39(1):115–20.
- 5.Lavoie A, Moore L, LeSage N, Liberman M, Sampalis JS. The Injury Severity Score or the New Injury Severity Score for predicting intensive care unit admission and hospital length of stay? Injury [Internet]. 2005. [cited 2021 Jul 5];36(4):477–83. Available from: https://pubmed.ncbi.nlm.nih.gov/15755427/
- 6.Rating the Severity of Tissue Damage: I. The Abbreviated Scale. JAMA J Am Med Assoc [Internet]. 1971. Jan 11 [cited 2021 Jul 5];215(2):277–80. Available from: https://jamanetwork.com/journals/jama/fullarticle/359231
- 7.Clark DE, Black AW, Skavdahl DH, Hallagan LD. Open-access programs for injury categorization using ICD-9 or ICD-10. Inj Epidemiol. 2018;5(1).
- 8.Greene NH, Kernic MA, Vavilala MS, Rivara FP. Validation of ICDPIC software injury severity scores using a large regional trauma registry. Inj Prev [Internet]. 2015. Oct 1 [cited 2021 Apr 19];21(5):325–30. Available from: https://injuryprevention.bmj.com/content/21/5/325
- 9.Sears JM, Blanar L, Bowman SM. Predicting work-related disability and medical cost outcomes: A comparison of injury severity scoring methods. Injury [Internet]. 2014;45(1):16–22.
- 10.Fleischman RJ, Mann NC, Dai M, Holmes JF, Wang NE, Haukoos J, et al. Validating the use of ICD-9 code mapping to generate injury severity scores. J Trauma Nurs. 2017;24(1):4–14.
- 11.Di Bartolomeo S, Tillati S, Valent F, Zanier L, Barbone F. ISS mapped from ICD-9-CM by a novel freeware versus traditional coding: A comparative study. Scand J Trauma Resusc Emerg Med. 2010;18(1):1–7.
- 12.Topaz M, Shafran-Topaz L, Bowles KH. ICD-9 to ICD-10: evolution, revolution, and current debates in the United States. Perspect Health Inf Manag [Internet]. 2013. [cited 2021 Apr 19];10(Spring). Available from: /pmc/articles/PMC3692324/
- 13.Implementation Details [Internet]. Available from: https://cran.r-project.org/web/packages/icdpicr/vignettes/implementation-details.html
- 14.Airaksinen NK, Heinänen MT, Handolin LE. The reliability of the ICD-AIS map in identifying serious road traffic injuries from the Helsinki Trauma Registry. Injury [Internet]. 2019. Sep 1 [cited 2021 Jul 5];50(9):1545–51. Available from: https://pubmed.ncbi.nlm.nih.gov/31371171/
- 15.ICDPICR [Internet]. Available from: https://github.com/ablack3/icdpicr
- 16.Hoerl AE, Kannard RW, Baldwin KF. Ridge regression: some simulations [Internet]. 2007. [cited 2021 Sep 19];4(2):105–23.
- 17.National Trauma Data Bank ® NTDB Research Data Set User Manual and Variable Description List. [cited 2021 Aug 14]; Available from: www.ntdb.org
- 18.Annest J, Hedegaard H, Chen L, Warner M, Small E. Proposed Framework for Presenting Injury Data using ICD-10-CM External Cause of Injury Codes. 2014. [cited 2021 Apr 19]; Available from: https://www.cdc.gov/injury/wisqars/pdf/icd-10-cm_external_cause_injury_codes-a.pdf
- 19.Boyd CR, Tolson MA, Copes WS. Evaluating trauma care: The TRISS method. J Trauma - Inj Infect Crit Care. 1987;27(4):370–8.
- 20.Palmer C. Major trauma and the Injury Severity Score - Where should we set the bar? In: Annual Proceedings - Association for the Advancement of Automotive Medicine [Internet]. Association for the Advancement of Automotive Medicine; 2007. [cited 2021 Jul 5]. p. 13–29. Available from: /pmc/articles/PMC3217501/
- 21.Martin Bland J, Altman DG. Statistical methods for assessing agreement between two methods of clinical measurement. Lancet. 1986. Feb 8;327(8476):307–10.
- 22.Copes WS, Champion HR, Sacco WJ, Lawnick MM, Keast SL, Bain LW. The injury severity score revisited. J Trauma - Inj Infect Crit Care [Internet]. 1988. [cited 2021 Jul 5];28(1):69–76. Available from: https://pubmed.ncbi.nlm.nih.gov/3123707/
- 23.DeLong ER, DeLong DM, Clarke-Pearson DL. Comparing the Areas under Two or More Correlated Receiver Operating Characteristic Curves: A Nonparametric Approach. Biometrics. 1988. Sep;44(3):837.
- 24.Di Bartolomeo S, Tillati S, Valent F, Zanier L, Barbone F. ISS mapped from ICD-9-CM by a novel freeware versus traditional coding: a comparative study. Scand J Trauma Resusc Emerg Med [Internet]. 2010. Mar 31 [cited 2021 Nov 30];18(1):17. Available from: /pmc/articles/PMC2852374/
- 25.Clark DE, Black AW. ICDPIC-R Version 1.0.0. Injury [Internet]. 2021. Feb [cited 2021 Sep 19];0(0). Available from: http://www.injuryjournal.com/article/S0020138321001820/fulltext
- 26.Hornor MA, Hoeft C, Nathens AB. Quality Benchmarking in Trauma: from the NTDB to TQIP [Internet]. Vol. 4, Current Trauma Reports. Springer International Publishing; 2018. [cited 2021 Apr 24]. p. 160–9.
