Table 3. Sources of variation in cross-validation (CV) and external validation (EV) performance and in their minimum, min(CV,EV), a measure of predictable performance.
ANOVA all end points | Cross-validation variance (%) | | External validation variance (%) | | Min(CV,EV) variance (%) | |
---|---|---|---|---|---|---|
Performance metric | AUC | MCC | AUC | MCC | AUC | MCC |
Feature ranking | 0.01 | 0.23 | 0.00 | 0.12 | 0.00 | 0.13 |
Number of features | 0.38 | 0.48 | 0.15 | 0.37 | 0.24 | 0.46 |
Distance metric | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |
Vote weighting | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |
Number of neighbors | 1.56 | 0.93 | 0.84 | 0.56 | 1.09 | 0.57 |
Decision threshold | 0.00 | 6.41 | 0.00 | 6.14 | 0.00 | 5.81 |
End point (data set) | 78.70 | 68.99 | 85.04 | 71.30 | 83.90 | 72.01 |
Data set | 16.46 | 6.22 | 9.77 | 3.33 | 11.14 | 4.16 |
Residual | 2.88 | 16.73 | 4.19 | 18.16 | 3.62 | 16.85 |
Abbreviations: ANOVA, analysis of variance; AUC, area under the receiver operating characteristic curve; MCC, Matthews correlation coefficient.
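
As a rough illustration of how percentages like those in Table 3 can be obtained, the sketch below attributes the total sum of squares of a performance metric to each factor's main effect and reports the remainder as residual. It is a minimal sketch, not the authors' procedure: the exact ANOVA model is not given here, so a balanced design with main effects only is assumed, and the function name `variance_percentages`, the field names, and the toy records are all hypothetical. The Min(CV,EV) values in the table are not the element-wise minima of the CV and EV columns, which suggests that the minimum is taken per model before the decomposition; the sketch follows that assumption.

```python
from collections import defaultdict


def variance_percentages(records, factors, response):
    """Percent of the total sum of squares attributed to each factor.

    Assumes a balanced full-factorial design and main effects only;
    whatever is left over (interactions, noise) is reported as 'Residual'.
    """
    values = [r[response] for r in records]
    grand_mean = sum(values) / len(values)
    ss_total = sum((v - grand_mean) ** 2 for v in values)

    result = {}
    for factor in factors:
        # Group the response by the levels of this factor.
        groups = defaultdict(list)
        for r in records:
            groups[r[factor]].append(r[response])
        # Between-level sum of squares for the factor's main effect.
        ss_factor = sum(
            len(g) * (sum(g) / len(g) - grand_mean) ** 2
            for g in groups.values()
        )
        result[factor] = 100.0 * ss_factor / ss_total

    result["Residual"] = 100.0 - sum(result.values())
    return result


# Hypothetical toy models: 2 neighbor counts x 2 decision thresholds.
records = [
    {"k": 1, "threshold": 0.5, "cv": 0.70, "ev": 0.66},
    {"k": 1, "threshold": 0.6, "cv": 0.72, "ev": 0.74},
    {"k": 3, "threshold": 0.5, "cv": 0.80, "ev": 0.78},
    {"k": 3, "threshold": 0.6, "cv": 0.82, "ev": 0.84},
]

# Assumption: min(CV, EV) is computed per model, then decomposed.
for r in records:
    r["min_cv_ev"] = min(r["cv"], r["ev"])

pct = variance_percentages(records, ["k", "threshold"], "min_cv_ev")
for source, share in pct.items():
    print(f"{source}: {share:.2f}%")
```

Running the same function with `"cv"` or `"ev"` as the response would give the analogues of the first two column groups. With an unbalanced design or interaction terms, a full ANOVA from a statistics package would be needed instead of this hand-rolled decomposition.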