Table 2.
Category | Predictor | Mean per‐protein AUC | Per‐protein AUC at the worst quartile of proteins | Significance of differences compared to DISOselect |
---|---|---|---|---|
Hypothetical method | Oracle | 0.983 | 0.984 | p‐value < .01 (significantly better) |
Proposed model | DISOselect | 0.974 | 0.971 | |
Consensus models | Top2Predictor SVR | 0.947 | 0.938 | p‐value < .01 (significantly worse) |
Consensus models | Top2Predictor LR | 0.947 | 0.936 | p‐value < .01 (significantly worse) |
Consensus models | 12Predictor SVR | 0.942 | 0.929 | p‐value < .01 (significantly worse) |
Consensus models | 12Predictor LR | 0.940 | 0.921 | p‐value < .01 (significantly worse) |
Individual predictors | SPOT‐disorder | 0.940 | 0.927 | p‐value < .01 (significantly worse) |
Individual predictors | DISOPRED3 | 0.935 | 0.921 | p‐value < .01 (significantly worse) |
Individual predictors | ESpritz‐Xray | 0.880 | 0.832 | p‐value < .01 (significantly worse) |
Individual predictors | ESpritz‐NMR | 0.865 | 0.809 | p‐value < .01 (significantly worse) |
Individual predictors | VSL2B | 0.864 | 0.816 | p‐value < .01 (significantly worse) |
Individual predictors | disEMBL‐465 | 0.853 | 0.768 | p‐value < .01 (significantly worse) |
Individual predictors | IUPred‐short | 0.843 | 0.768 | p‐value < .01 (significantly worse) |
Individual predictors | disEMBL‐HL | 0.816 | 0.719 | p‐value < .01 (significantly worse) |
Individual predictors | ESpritz‐DisProt | 0.772 | 0.649 | p‐value < .01 (significantly worse) |
Individual predictors | JRONN | 0.733 | 0.603 | p‐value < .01 (significantly worse) |
Individual predictors | IUPred‐long | 0.718 | 0.584 | p‐value < .01 (significantly worse) |
Individual predictors | GlobPlot | 0.646 | 0.537 | p‐value < .01 (significantly worse) |
Note: We compared the mean per‐protein AUCs computed over the test proteins and the AUCs for the worst (least accurately predicted) quartile of the test proteins (i.e., the 25% point in Figure 6). Methods are sorted by their mean per‐protein AUCs. The last column lists the significance of the differences between the per‐protein AUCs of the predictions selected by DISOselect and those generated by each of the other methods (including the oracle). To assess significance, we sampled 50% of the proteins in the test data set at random 10 times and compared the corresponding 10 pairs of AUCs. Normality was tested with the Anderson‐Darling test at the .05 significance level; we used the t test for normally distributed measurements and the Wilcoxon test otherwise.
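
The resampling protocol described in the note can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code used to produce Table 2: the function names (`summarize`, `paired_subsample_means`, `paired_t_statistic`) are hypothetical, the "25% point" is assumed to be the first quartile of the per‐protein AUCs, each subsample is assumed to be summarized by its mean per‐protein AUC, and the t test is assumed to be paired. The Anderson‐Darling normality check and the Wilcoxon fallback are omitted here; in practice they would come from a statistics library.

```python
import math
import random
import statistics


def summarize(per_protein_aucs):
    """Return (mean AUC, AUC at the 25% point) for one method."""
    # Assumption: the "25% point" is read as the first quartile cut point.
    q1 = statistics.quantiles(per_protein_aucs, n=4, method="inclusive")[0]
    return statistics.mean(per_protein_aucs), q1


def paired_subsample_means(aucs_a, aucs_b, n_repeats=10, fraction=0.5, seed=0):
    """Draw random protein subsets; return one (mean_a, mean_b) pair per draw."""
    if len(aucs_a) != len(aucs_b):
        raise ValueError("both methods must be scored on the same proteins")
    rng = random.Random(seed)
    n = len(aucs_a)
    k = max(1, int(n * fraction))
    pairs = []
    for _ in range(n_repeats):
        # Sample protein indices without replacement, shared by both methods.
        idx = rng.sample(range(n), k)
        mean_a = statistics.mean(aucs_a[i] for i in idx)
        mean_b = statistics.mean(aucs_b[i] for i in idx)
        pairs.append((mean_a, mean_b))
    return pairs


def paired_t_statistic(pairs):
    """Paired t statistic for the subsample means (assumes normal differences)."""
    diffs = [a - b for a, b in pairs]
    sd = statistics.stdev(diffs)
    if sd == 0:
        raise ValueError("all paired differences are identical")
    return statistics.mean(diffs) / (sd / math.sqrt(len(diffs)))


# Made-up per-protein AUCs for two hypothetical methods on 20 proteins.
method_a = [0.90 + 0.004 * i for i in range(20)]
method_b = [a - 0.02 - 0.005 * (i % 3) for i, a in enumerate(method_a)]

pairs = paired_subsample_means(method_a, method_b)
print(summarize(method_a), summarize(method_b))
print(paired_t_statistic(pairs))
```

With 10 pairs the statistic has 9 degrees of freedom, so a two‐sided test at the .01 level rejects when its absolute value exceeds roughly 3.25.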