2021 Apr 29;9:662183. doi: 10.3389/fped.2021.662183

Table 5.

Ten-fold cross-validation results for logistic regression (LR), random forest (RF), and generalized boosted regression (GBM) models for predicting diagnosis, management, and severity.

| Classifier | Diagnosis PPV (±SD) | Diagnosis NPV (±SD) | Management PPV (±SD) | Management NPV (±SD) | Severity PPV (±SD) | Severity NPV (±SD) |
|---|---|---|---|---|---|---|
| Random | 0.57 | 0.43 | 0.62 | 0.38 | 0.88 | 0.12 |
| AS or PAS ≥ 4 and appendix diameter ≥ 6 mm | 0.82 | 0.85 | | | | |
| Suspected diagnosis | 0.71 | 1.00 | | | | |
| LR (full) | 0.83 (±0.07) | 0.83 (±0.09) | 0.89 (±0.06) | 0.79 (±0.09) | 0.92 (±0.04) | 0.51 (±0.28) |
| LR (w/o US) | 0.78 (±0.08) | 0.68 (±0.10) | 0.91 (±0.03) | 0.88 (±0.10) | 0.94 (±0.04) | 0.61 (±0.34) |
| LR (w/o peritonitis/abdominal guarding) | 0.83 (±0.09) | 0.82 (±0.11) | 0.82 (±0.05) | 0.74 (±0.09) | 0.92 (±0.04) | 0.45 (±0.29) |
| LR (w/o US and peritonitis/abdominal guarding) | 0.76 (±0.09) | 0.68 (±0.10) | 0.78 (±0.04) | 0.68 (±0.09) | 0.93 (±0.04) | 0.69 (±0.33) |
| RF (full) | 0.89 (±0.08) | 0.88 (±0.05) | 0.88 (±0.04) | 0.90 (±0.12) | 0.93 (±0.03) | 0.80 (±0.26) |
| RF (w/o US) | 0.78 (±0.07) | 0.74 (±0.10) | 0.89 (±0.04) | 0.88 (±0.10) | 0.93 (±0.03) | 0.72 (±0.24) |
| RF (w/o peritonitis/abdominal guarding) | 0.92 (±0.05) | 0.88 (±0.07) | 0.81 (±0.09) | 0.74 (±0.13) | 0.92 (±0.04) | 0.77 (±0.24) |
| RF (w/o US and peritonitis/abdominal guarding) | 0.74 (±0.11) | 0.69 (±0.09) | 0.75 (±0.05) | 0.65 (±0.10) | 0.92 (±0.03) | 0.72 (±0.23) |
| GBM (full) | 0.89 (±0.07) | 0.90 (±0.04) | 0.91 (±0.04) | 0.88 (±0.10) | 0.93 (±0.02) | 0.67 (±0.21) |
| GBM (w/o US) | 0.81 (±0.09) | 0.73 (±0.11) | 0.91 (±0.03) | 0.87 (±0.11) | 0.93 (±0.02) | 0.70 (±0.25) |
| GBM (w/o peritonitis/abdominal guarding) | 0.87 (±0.08) | 0.89 (±0.06) | 0.81 (±0.04) | 0.77 (±0.08) | 0.93 (±0.03) | 0.72 (±0.24) |
| GBM (w/o US and peritonitis/abdominal guarding) | 0.73 (±0.09) | 0.70 (±0.10) | 0.76 (±0.06) | 0.67 (±0.11) | 0.93 (±0.03) | 0.68 (±0.23) |

Results are reported as mean positive and negative predictive values (PPV/NPV) with standard deviations (SD) across the 10 cross-validation folds. “Full” models use all predictors; models “w/o US” were trained without ultrasonographic findings; models “w/o peritonitis/abdominal guarding” were trained without the “peritonitis/abdominal guarding” predictor; and models “w/o US and peritonitis/abdominal guarding” were trained without either ultrasonographic findings or the “peritonitis/abdominal guarding” predictor. For all classifiers, a probability threshold of 0.5 was used to assign the predicted class. “Random” corresponds to a random guess and serves as a naïve baseline. Bold values correspond to the best average performances achieved across all models.
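
As a rough illustration of how figures of this kind can be computed, the sketch below thresholds predicted probabilities at 0.5, derives PPV and NPV for each held-out fold, and summarizes them as mean and SD. It is not the authors' code. The function names (`ppv_npv`, `summarize_folds`, `random_baseline`), the treatment of a probability of exactly 0.5 as positive, and the use of the sample standard deviation are all assumptions made for the example. The baseline function relies on the fact that a guess made independently of the true label has an expected PPV equal to the prevalence of the positive class and an expected NPV equal to one minus that prevalence, which matches the complementary pairs in the “Random” row.

```python
import statistics


def ppv_npv(y_true, y_prob, threshold=0.5):
    """Return (PPV, NPV) for binary labels after thresholding probabilities.

    Assumption: a probability equal to the threshold counts as positive.
    """
    tp = fp = tn = fn = 0
    for truth, prob in zip(y_true, y_prob):
        predicted = 1 if prob >= threshold else 0
        if predicted == 1 and truth == 1:
            tp += 1
        elif predicted == 1 and truth == 0:
            fp += 1
        elif predicted == 0 and truth == 0:
            tn += 1
        else:
            fn += 1
    # PPV is undefined when nothing is predicted positive (likewise NPV).
    ppv = tp / (tp + fp) if (tp + fp) else float("nan")
    npv = tn / (tn + fn) if (tn + fn) else float("nan")
    return ppv, npv


def summarize_folds(folds, threshold=0.5):
    """Summarize per-fold PPV/NPV as ((mean, sd), (mean, sd)).

    `folds` is a list of (y_true, y_prob) pairs, one per held-out fold.
    Assumption: SD is the sample standard deviation across folds.
    """
    ppvs, npvs = zip(*(ppv_npv(y, p, threshold) for y, p in folds))
    return (
        (statistics.mean(ppvs), statistics.stdev(ppvs)),
        (statistics.mean(npvs), statistics.stdev(npvs)),
    )


def random_baseline(y_true):
    """Expected (PPV, NPV) of a guess that ignores the true label."""
    n = len(y_true)
    positives = sum(y_true)
    return positives / n, (n - positives) / n


if __name__ == "__main__":
    # Two tiny hypothetical folds, for illustration only.
    folds = [
        ([1, 1, 0, 0, 1], [0.9, 0.4, 0.6, 0.2, 0.7]),
        ([1, 0], [0.8, 0.1]),
    ]
    (ppv_mean, ppv_sd), (npv_mean, npv_sd) = summarize_folds(folds)
    print(f"PPV {ppv_mean:.2f} (±{ppv_sd:.2f}), NPV {npv_mean:.2f} (±{npv_sd:.2f})")
```

With few negative predictions per fold, NPV rests on a small denominator and can swing widely from fold to fold, which is one plausible reading of the large SDs in the severity NPV column.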