Table 2 – Model performance on the internal validation data.
| Model | c-statistic | Average Precision | Sensitivity | Specificity | Positive Predictive Value | % Positive |
|---|---|---|---|---|---|---|
| Baseline | 0.888 (0.881–0.894) | 0.215 (0.197–0.235) | 0.970 (0.963–0.977) | 0.439 (0.436–0.442) | 0.037 (0.035–0.038) | 57.0% (56.7–57.3) |
| Logistic Regression | 0.907 (0.900–0.913) | 0.280 (0.260–0.300) | 0.962 (0.954–0.970) | 0.512 (0.509–0.515) | 0.042 (0.040–0.043) | 49.8% (49.5–50.1) |
| Decision Tree | 0.916 (0.910–0.922) | 0.298 (0.278–0.318) | 0.963 (0.955–0.971) | 0.618 (0.614–0.621) | 0.053 (0.050–0.055) | 39.5% (39.2–39.8) |
| Random Forest | 0.913 (0.906–0.918) | 0.247 (0.230–0.265) | 0.962 (0.954–0.970) | 0.593 (0.590–0.597) | 0.050 (0.047–0.052) | 41.9% (41.5–42.2) |
| Gradient Boosting Machine | 0.924 (0.919–0.929) | 0.292 (0.273–0.314) | 0.963 (0.956–0.971) | 0.651 (0.648–0.654) | 0.058 (0.055–0.060) | 36.2% (35.9–36.5) |

Penalized logistic regression [24], decision tree [25], random forest [26], and gradient boosting machine [27] models were trained on the training data and evaluated on the internal validation data to predict transfusion on the day of surgery, using the procedure-specific transfusion rates observed in 2016–18. For comparison, a baseline model that used only the procedure-specific transfusion rate is also presented. c-statistic: area under the receiver operating characteristic curve. Average precision: area under the precision-recall curve, indicative of model discrimination for the positive class. % Positive: percentage of cases in the cohort for which the model made a positive prediction, i.e., recommended a type and screen. The decision threshold for each model was fixed to achieve 96% sensitivity on the training data. 95% confidence intervals are shown in parentheses.
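
The thresholding step described in the footnote can be illustrated with a minimal Python sketch: a cutoff is chosen on training scores so that at least 96% of true positives are flagged, and the same cutoff is then applied unchanged to validation scores to obtain sensitivity, specificity, positive predictive value, and % positive. The function names, the `score >= cutoff` convention, and the toy data are assumptions made for illustration; this is not the authors' implementation.

```python
import math


def threshold_for_sensitivity(scores, labels, target=0.96):
    """Highest cutoff whose sensitivity on the given data is >= target.

    A case counts as predicted positive when score >= cutoff (an assumed convention).
    """
    positive_scores = sorted(
        (s for s, y in zip(scores, labels) if y == 1), reverse=True
    )
    if not positive_scores:
        raise ValueError("at least one positive case is required")
    # Number of positives that must be flagged to reach the target sensitivity;
    # the small epsilon guards against floating-point overshoot in the product.
    needed = max(1, math.ceil(target * len(positive_scores) - 1e-9))
    return positive_scores[needed - 1]


def metrics_at_threshold(scores, labels, cutoff):
    """Confusion-matrix metrics for a fixed cutoff."""
    tp = fp = tn = fn = 0
    for s, y in zip(scores, labels):
        predicted_positive = s >= cutoff
        if predicted_positive and y == 1:
            tp += 1
        elif predicted_positive:
            fp += 1
        elif y == 1:
            fn += 1
        else:
            tn += 1

    def ratio(num, den):
        return num / den if den else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        "pct_positive": 100 * ratio(tp + fp, tp + fp + tn + fn),
    }


# Synthetic toy data (hypothetical, not from the study).
train_scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.2, 0.1, 0.05]
train_labels = [1, 1, 0, 1, 0, 0, 0, 0]
val_scores = [0.95, 0.35, 0.6, 0.1]
val_labels = [1, 1, 0, 0]

# Fix the cutoff on training data only, then reuse it on validation data.
cutoff = threshold_for_sensitivity(train_scores, train_labels, target=0.96)
train_metrics = metrics_at_threshold(train_scores, train_labels, cutoff)
val_metrics = metrics_at_threshold(val_scores, val_labels, cutoff)
print(cutoff, train_metrics, val_metrics)
```

Because the cutoff is fixed on the training data, the sensitivity on the validation data can differ from the 96% target, which is consistent with the validation sensitivities in the table not all being exactly 0.960.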