Table 1. XGBoost models were trained with and without metadata (ST, CC, Agr group, and MSSA/MRSA status).
Model performance was assessed by ten-fold cross-validation, and each measure is reported as its average across the folds. For non-binary classification, each class's precision and recall are weighted by that class's proportion of the overall dataset during averaging, and Cohen’s Kappa is calculated using squared (quadratic) weights. Specificity is not reported for non-binary classification models because it is included in the weighted precision and recall measures.
| Model | Precision | Recall | Specificity | AUROC (95% CI) | Cohen’s Kappa (p-value) |
|---|---|---|---|---|---|
| Binary predictor +/− metadata | .875 | .333 | .990 | .697 (.553, .840) | .429 (.000000037) |
| Four category predictor + metadata | .423 (weighted) | .443 (weighted) | N/A | .664 (.597, .731) | .255 (.00441) (weighted) |
| Four category predictor − metadata | .451 (weighted) | .326 (weighted) | N/A | .667 (.576, .758) | .133 (.131) (weighted) |
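
The weighted averaging and squared-weight Kappa described in the caption can be illustrated with a minimal sketch. It is not the authors' code: the function names are hypothetical, the four categories are assumed to be integer-coded (0 to 3) on an ordinal scale, and the toy labels are made up for illustration only.

```python
from collections import Counter


def weighted_precision_recall(y_true, y_pred):
    """Per-class precision and recall, averaged with weights equal to
    each class's share of the true labels (hypothetical helper)."""
    support = Counter(y_true)
    n = len(y_true)
    precision = recall = 0.0
    for cls, count in support.items():
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p == cls)
        predicted = sum(1 for p in y_pred if p == cls)
        weight = count / n
        # A class that is never predicted contributes a precision of 0.
        precision += weight * (tp / predicted if predicted else 0.0)
        recall += weight * (tp / count)
    return precision, recall


def quadratic_weighted_kappa(y_true, y_pred, n_classes):
    """Cohen's kappa with squared-distance disagreement weights.
    Assumes integer labels 0..n_classes-1 and at least two distinct
    labels in the data (otherwise the denominator is zero)."""
    n = len(y_true)
    observed = [[0.0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        observed[t][p] += 1
    row = [sum(observed[i]) for i in range(n_classes)]
    col = [sum(observed[i][j] for i in range(n_classes))
           for j in range(n_classes)]
    num = den = 0.0
    for i in range(n_classes):
        for j in range(n_classes):
            w = (i - j) ** 2 / (n_classes - 1) ** 2
            num += w * observed[i][j]          # observed disagreement
            den += w * row[i] * col[j] / n     # disagreement expected by chance
    return 1.0 - num / den


# Toy labels for one cross-validation fold (illustrative, not study data).
y_true = [0, 0, 1, 1, 2, 2, 3, 3]
y_pred = [0, 1, 1, 1, 2, 3, 3, 3]
precision, recall = weighted_precision_recall(y_true, y_pred)
kappa = quadratic_weighted_kappa(y_true, y_pred, n_classes=4)
print(round(precision, 3), round(recall, 3), round(kappa, 3))
```

Under ten-fold cross-validation, these quantities would be computed once per fold and then averaged, which is how the table values are described as being obtained.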