2020 Aug 5;20:55. doi: 10.1186/s40644-020-00329-8

Table 3.

Performance drop of models trained on single-center data and applied to unseen multi-center data, using non-robust features and robust features with priors, averaged across class boundaries (lower is better). Values are listed as the mean and 95% confidence interval (CI), calculated with the adjusted bootstrap percentile (BCa) method. The lowest drop is indicated in bold for each metric. Bal. acc.: balanced accuracy; Acc.: accuracy.

| Feature set | AUC drop | Bal. acc. drop | Acc. drop | Specificity drop | Sensitivity drop | F1 drop | Precision drop |
|---|---|---|---|---|---|---|---|
| Non-robust features | 0.52 CI: [0.50,0.56] | 0.40 CI: [0.26,0.45] | 0.48 CI: [0.33,0.53] | 0.80 CI: [0.70,0.88] | 0.06 CI: [0.00,0.15] | 0.38 CI: [0.24,0.50] | 0.54 CI: [0.39,0.63] |
| Robust features, sequence prior | 0.30 CI: [0.22,0.36] | 0.26 CI: [0.03,0.35] | 0.37 CI: [0.33,0.43] | 0.40 CI: [−0.10,0.75] | 0.18 CI: [0.00,0.34] | 0.38 CI: [0.24,0.53] | 0.51 CI: [0.37,0.65] |
| Robust features, hand-picked | 0.32 CI: [0.27,0.36] | 0.26 CI: [0.18,0.31] | 0.33 CI: [0.27,0.37] | 0.42 CI: [0.27,0.50] | 0.16 CI: [0.02,0.37] | 0.35 CI: [0.22,0.54] | 0.48 CI: [0.35,0.66] |
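
The caption reports each drop as a mean with a 95% BCa bootstrap interval. As a rough illustration of how such an interval can be computed, the sketch below implements a textbook BCa interval using only the Python standard library. It is not the authors' code: the function name `bca_interval`, the definition of a drop as single-center score minus multi-center score, the choice of resampling unit, and all numbers in the usage example are assumptions made for illustration and are not taken from the paper.

```python
import random
from statistics import NormalDist, mean


def bca_interval(values, stat=mean, n_boot=2000, alpha=0.05, seed=0):
    """Return (low, high) BCa bootstrap confidence limits for stat(values)."""
    values = list(values)
    rng = random.Random(seed)
    n = len(values)
    norm = NormalDist()
    theta_hat = stat(values)

    # Bootstrap replicates: resample with replacement and recompute the statistic.
    reps = sorted(stat(rng.choices(values, k=n)) for _ in range(n_boot))

    # Bias correction z0, from the fraction of replicates below the point estimate.
    frac = sum(r < theta_hat for r in reps) / n_boot
    frac = min(max(frac, 1 / (n_boot + 1)), n_boot / (n_boot + 1))  # keep inv_cdf finite
    z0 = norm.inv_cdf(frac)

    # Acceleration, estimated from leave-one-out (jackknife) values of the statistic.
    jack = [stat(values[:i] + values[i + 1:]) for i in range(n)]
    jack_mean = mean(jack)
    num = sum((jack_mean - j) ** 3 for j in jack)
    den = 6.0 * sum((jack_mean - j) ** 2 for j in jack) ** 1.5
    accel = num / den if den > 0 else 0.0

    def adjusted_quantile(p):
        # Shift the nominal quantile p by the bias and acceleration terms.
        z = z0 + norm.inv_cdf(p)
        return norm.cdf(z0 + z / (1.0 - accel * z))

    def pick(q):
        # Nearest-rank lookup in the sorted replicates.
        idx = min(max(int(round(q * (n_boot - 1))), 0), n_boot - 1)
        return reps[idx]

    return pick(adjusted_quantile(alpha / 2)), pick(adjusted_quantile(1 - alpha / 2))


# Hypothetical AUC values for eight runs (illustrative only, not from the paper).
single_center_auc = [0.91, 0.88, 0.93, 0.90, 0.87, 0.92, 0.89, 0.94]
multi_center_auc = [0.62, 0.55, 0.60, 0.58, 0.49, 0.66, 0.57, 0.61]

# Assumed definition of a drop: single-center score minus multi-center score.
drops = [s - m for s, m in zip(single_center_auc, multi_center_auc)]
low, high = bca_interval(drops)
print(f"AUC drop {mean(drops):.2f} CI: [{low:.2f},{high:.2f}]")
```

Unlike the plain percentile method, the BCa limits need not be symmetric around the mean, which is consistent with the asymmetric intervals in the table. If SciPy is available, `scipy.stats.bootstrap` with `method='BCa'` provides a maintained alternative to this hand-written version.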