Table 5.
Classifier | Diagnosis PPV (±SD) | Diagnosis NPV (±SD) | Management PPV (±SD) | Management NPV (±SD) | Severity PPV (±SD) | Severity NPV (±SD) |
---|---|---|---|---|---|---|
Random | 0.57 | 0.43 | 0.62 | 0.38 | 0.88 | 0.12 |
AS or PAS ≥ 4 and appendix diameter ≥ 6 mm | 0.82 | 0.85 | — | — | — | — |
Suspected diagnosis | 0.71 | 1.00 | — | — | — | — |
LR (full) | 0.83 (±0.07) | 0.83 (±0.09) | 0.89 (±0.06) | 0.79 (±0.09) | 0.92 (±0.04) | 0.51 (±0.28) |
LR (w/o US) | 0.78 (±0.08) | 0.68 (±0.10) | 0.91 (±0.03) | 0.88 (±0.10) | 0.94 (±0.04) | 0.61 (±0.34) |
LR (w/o peritonitis/abdominal guarding) | 0.83 (±0.09) | 0.82 (±0.11) | 0.82 (±0.05) | 0.74 (±0.09) | 0.92 (±0.04) | 0.45 (±0.29) |
LR (w/o US and peritonitis/abdominal guarding) | 0.76 (±0.09) | 0.68 (±0.10) | 0.78 (±0.04) | 0.68 (±0.09) | 0.93 (±0.04) | 0.69 (±0.33) |
RF (full) | 0.89 (±0.08) | 0.88 (±0.05) | 0.88 (±0.04) | 0.90 (±0.12) | 0.93 (±0.03) | 0.80 (±0.26) |
RF (w/o US) | 0.78 (±0.07) | 0.74 (±0.10) | 0.89 (±0.04) | 0.88 (±0.10) | 0.93 (±0.03) | 0.72 (±0.24) |
RF (w/o peritonitis/abdominal guarding) | 0.92 (±0.05) | 0.88 (±0.07) | 0.81 (±0.09) | 0.74 (±0.13) | 0.92 (±0.04) | 0.77 (±0.24) |
RF (w/o US and peritonitis/abdominal guarding) | 0.74 (±0.11) | 0.69 (±0.09) | 0.75 (±0.05) | 0.65 (±0.10) | 0.92 (±0.03) | 0.72 (±0.23) |
GBM (full) | 0.89 (±0.07) | 0.90 (±0.04) | 0.91 (±0.04) | 0.88 (±0.10) | 0.93 (±0.02) | 0.67 (±0.21) |
GBM (w/o US) | 0.81 (±0.09) | 0.73 (±0.11) | 0.91 (±0.03) | 0.87 (±0.11) | 0.93 (±0.02) | 0.70 (±0.25) |
GBM (w/o peritonitis/abdominal guarding) | 0.87 (±0.08) | 0.89 (±0.06) | 0.81 (±0.04) | 0.77 (±0.08) | 0.93 (±0.03) | 0.72 (±0.24) |
GBM (w/o US and peritonitis/abdominal guarding) | 0.73 (±0.09) | 0.70 (±0.10) | 0.76 (±0.06) | 0.67 (±0.11) | 0.93 (±0.03) | 0.68 (±0.23) |
Results are reported as mean positive and negative predictive values (PPV and NPV) with standard deviations (SD) across 10 folds. “Full” models use all predictors; models “w/o US” were trained without ultrasonographic (US) findings; models “w/o peritonitis/abdominal guarding” were trained without the “peritonitis/abdominal guarding” predictor; and models “w/o US and peritonitis/abdominal guarding” were trained without either ultrasonographic findings or the “peritonitis/abdominal guarding” predictor. For all classifiers, a probability threshold of 0.5 was used to assign the predicted class. “Random” corresponds to a random guess and serves as a naïve baseline. Bold values correspond to the best average performance achieved across all models.
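As an illustration of how the entries in this table could be computed, the following minimal sketch derives PPV and NPV from predicted probabilities at a threshold of 0.5 and summarizes them as mean and SD across folds. It is not the authors' code: the function names `ppv_npv` and `summarize_folds` are hypothetical, a probability exactly equal to the threshold is assumed to count as positive, and the sample standard deviation is assumed (the caption does not say which SD estimator was used).

```python
from statistics import mean, stdev


def ppv_npv(y_true, y_prob, threshold=0.5):
    """Return (PPV, NPV) for binary labels and predicted probabilities.

    Assumption: a probability >= threshold is classified as positive.
    """
    tp = fp = tn = fn = 0
    for label, prob in zip(y_true, y_prob):
        predicted = 1 if prob >= threshold else 0
        if predicted == 1 and label == 1:
            tp += 1
        elif predicted == 1 and label == 0:
            fp += 1
        elif predicted == 0 and label == 0:
            tn += 1
        else:
            fn += 1
    # PPV = TP / (TP + FP); NPV = TN / (TN + FN).
    # Undefined values (no positive or no negative predictions) become NaN.
    ppv = tp / (tp + fp) if (tp + fp) else float("nan")
    npv = tn / (tn + fn) if (tn + fn) else float("nan")
    return ppv, npv


def summarize_folds(folds, threshold=0.5):
    """Summarize PPV and NPV over folds as ((mean, SD), (mean, SD)).

    `folds` is a list of (y_true, y_prob) pairs, one per held-out fold.
    Assumption: sample standard deviation (n - 1 in the denominator).
    """
    ppvs, npvs = zip(*(ppv_npv(t, p, threshold) for t, p in folds))
    return (mean(ppvs), stdev(ppvs)), (mean(npvs), stdev(npvs))


if __name__ == "__main__":
    # Toy data for two folds; these are not values from the study.
    toy_folds = [
        ([1, 1, 0, 0], [0.9, 0.4, 0.6, 0.1]),
        ([1, 1, 0], [0.9, 0.8, 0.2]),
    ]
    (ppv_mean, ppv_sd), (npv_mean, npv_sd) = summarize_folds(toy_folds)
    print(f"PPV {ppv_mean:.2f} (±{ppv_sd:.2f}), NPV {npv_mean:.2f} (±{npv_sd:.2f})")
```

With few negative cases per fold, the NPV denominator is small, which is one plausible reason for the larger SDs in the severity NPV column.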