eLife. 2021 Oct 18;10:e70640. doi: 10.7554/eLife.70640

Table 2. Performance metrics of methods: random forest, gradient boosting machine (GBM), and logistic regression.

Model                 Phase                        AUC (DeLong) (95% CI)   Sensitivity (95% CI)   Specificity (95% CI)
Random forest         Training, Mar–Apr (MA)       0.97 (0.97–0.98)        0.93 (0.91–0.97)       0.92 (0.88–0.94)
Random forest         Validation, Mar–Apr (MA)     0.83 (0.80–0.87)        0.82 (0.72–0.92)       0.75 (0.63–0.83)
Random forest         Testing, May–Dec (MD)        0.78 (0.73–0.84)        0.73 (0.54–1.00)       0.73 (0.41–0.89)
GBM                   Training, Mar–Apr (MA)       0.88 (0.86–0.89)        0.85 (0.80–0.88)       0.77 (0.73–0.81)
GBM                   Validation, Mar–Apr (MA)     0.84 (0.80–0.88)        0.80 (0.66–0.90)       0.75 (0.65–0.87)
GBM                   Testing, May–Dec (MD)        0.78 (0.73–0.83)        0.77 (0.65–0.94)       0.71 (0.50–0.79)
Logistic regression   Training, Mar–Apr (MA)       0.84 (0.82–0.86)        0.80 (0.77–0.84)       0.74 (0.70–0.77)
Logistic regression   Validation, Mar–Apr (MA)     0.83 (0.79–0.87)        0.84 (0.76–0.91)       0.73 (0.65–0.79)
Logistic regression   Testing, May–Dec (MD)        0.52 (0.44–0.60)        0.87 (0.18–1.00)       0.26 (0.11–0.94)

Comparison of the performance of three methods: random forest, GBM, and a logistic regression model, each applied to the rebalanced dataset obtained with the SMOTE methodology. Logistic regression predictions are computed using 10-fold cross-validation so that they are comparable with the random forest and GBM predictions (which use out-of-bag estimation and 10-fold cross-validation, respectively).
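The evaluation setup in the caption can be sketched as follows. This is a minimal illustration on synthetic data, not the study's dataset or code: the core SMOTE interpolation step is re-implemented in a few lines to keep the example self-contained (in practice one would use a library such as imbalanced-learn), and all model settings below are illustrative assumptions.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import cross_val_predict
from sklearn.neighbors import NearestNeighbors

rng = np.random.default_rng(0)

def smote(X_min, n_new, k=5):
    """Core SMOTE idea: synthesize minority-class samples by interpolating
    between a minority point and one of its k nearest minority neighbours."""
    nn = NearestNeighbors(n_neighbors=k + 1).fit(X_min)
    _, idx = nn.kneighbors(X_min)                  # idx[:, 0] is the point itself
    rows = rng.integers(len(X_min), size=n_new)    # random minority points
    nbrs = idx[rows, rng.integers(1, k + 1, size=n_new)]  # random neighbour, skip self
    gaps = rng.random((n_new, 1))                  # interpolation coefficients in (0, 1)
    return X_min[rows] + gaps * (X_min[nbrs] - X_min[rows])

# Imbalanced synthetic cohort (~10% positives), standing in for the real data.
X, y = make_classification(n_samples=600, weights=[0.9, 0.1], random_state=0)
n_new = (y == 0).sum() - (y == 1).sum()
X_res = np.vstack([X, smote(X[y == 1], n_new)])
y_res = np.concatenate([y, np.ones(n_new, dtype=int)])

# Random forest scored with out-of-bag predictions; GBM and logistic
# regression scored with 10-fold cross-validated predictions, as in the caption.
rf = RandomForestClassifier(n_estimators=200, oob_score=True, random_state=0)
rf.fit(X_res, y_res)
auc = {"Random forest (OOB)": roc_auc_score(y_res, rf.oob_decision_function_[:, 1])}
for name, model in [("GBM", GradientBoostingClassifier(random_state=0)),
                    ("Logistic regression", LogisticRegression(max_iter=1000))]:
    p = cross_val_predict(model, X_res, y_res, cv=10, method="predict_proba")[:, 1]
    auc[name] = roc_auc_score(y_res, p)
for name, a in auc.items():
    print(f"{name}: AUC = {a:.2f}")
```

Comparing all three models on the same rebalanced data, with each model's probabilities estimated out-of-sample (out-of-bag or cross-validated), is what makes the AUCs in the table directly comparable.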