Table 2.
Comparison of the ten structure-trained and disorder-trained predictors of binding residues and the new hybridDBRpred meta-predictor using the sampled test dataset
Dataset | Type of methods | Predictors | AUC | AULCratio at 0.1 FPR | Sensitivity at 0.1 FPR | Sensitivity at 0.2 FPR | Specificity at 0.5 TPR | Specificity at 0.7 TPR | maxF1 |
---|---|---|---|---|---|---|---|---|---|
Structure-annotated proteins from the benchmark dataset | Structure-trained | TargetS | 0.650 ± 0.026+/+ | 0.932 ± 0.407+/+ | 0.099 ± 0.033+/+ | 0.297 ± 0.054+/+ | 0.697 ± 0.035+/+ | 0.557 ± 0.034+/+ | 0.132 ± 0.012+/+ |
Structure-annotated proteins from the benchmark dataset | Structure-trained | TargetDNA | 0.752 ± 0.027+/+ | 4.654 ± 0.763+/+ | 0.399 ± 0.050+/+ | 0.567 ± 0.050+/+ | 0.834 ± 0.031+/+ | 0.690 ± 0.044+/+ | 0.235 ± 0.028+/+ |
Structure-annotated proteins from the benchmark dataset | Structure-trained | BindN+ | 0.747 ± 0.018+/+ | 3.939 ± 0.534+/+ | 0.347 ± 0.032+/+ | 0.540 ± 0.034+/+ | 0.825 ± 0.020+/+ | 0.672 ± 0.029+/+ | 0.207 ± 0.020+/+ |
Structure-annotated proteins from the benchmark dataset | Structure-trained | DNAPred | 0.808 ± 0.020 /+ | 7.314 ± 0.975 /– | 0.509 ± 0.049 /= | 0.649 ± 0.039 /+ | 0.904 ± 0.025 /= | 0.756 ± 0.035 /+ | 0.335 ± 0.040 /– |
Structure-annotated proteins from the benchmark dataset | Structure-trained | DNAgenie | 0.748 ± 0.035+/+ | 3.476 ± 0.894+/+ | 0.325 ± 0.062+/+ | 0.546 ± 0.068+/+ | 0.821 ± 0.037+/+ | 0.676 ± 0.065+/+ | 0.205 ± 0.034+/+ |
Structure-annotated proteins from the benchmark dataset | Disorder-trained | fMoRFpred | 0.432 ± 0.018+/+ | 0.572 ± 0.145+/+ | 0.062 ± 0.010+/+ | 0.139 ± 0.014+/+ | 0.409 ± 0.024+/+ | 0.228 ± 0.023+/+ | 0.088 ± 0.010+/+ |
Structure-annotated proteins from the benchmark dataset | Disorder-trained | ANCHOR2 | 0.468 ± 0.033+/+ | 0.171 ± 0.160+/+ | 0.029 ± 0.024+/+ | 0.127 ± 0.043+/+ | 0.453 ± 0.040+/+ | 0.306 ± 0.039+/+ | 0.092 ± 0.010+/+ |
Structure-annotated proteins from the benchmark dataset | Disorder-trained | DeepDISObind | 0.489 ± 0.035+/+ | 0.069 ± 0.083+/+ | 0.015 ± 0.015+/+ | 0.081 ± 0.035+/+ | 0.526 ± 0.045+/+ | 0.386 ± 0.047+/+ | 0.095 ± 0.012+/+ |
Structure-annotated proteins from the benchmark dataset | Disorder-trained | MoRFchibi | 0.533 ± 0.027+/+ | 1.293 ± 0.346+/+ | 0.121 ± 0.030+/+ | 0.216 ± 0.034+/+ | 0.545 ± 0.042+/+ | 0.358 ± 0.033+/+ | 0.097 ± 0.012+/+ |
Structure-annotated proteins from the benchmark dataset | Disorder-trained | DisoRDPbind | 0.621 ± 0.030+/+ | 1.965 ± 0.458+/+ | 0.187 ± 0.036+/+ | 0.333 ± 0.047+/+ | 0.657 ± 0.043+/+ | 0.475 ± 0.049+/+ | 0.127 ± 0.015+/+ |
Structure-annotated proteins from the benchmark dataset | Baseline meta-predictors | Average-based | 0.815 ± 0.017–/+ | 6.324 ± 0.736+/– | 0.488 ± 0.045+/+ | 0.674 ± 0.037–/+ | 0.895 ± 0.017+/= | 0.781 ± 0.030–/+ | 0.294 ± 0.030–/= |
Structure-annotated proteins from the benchmark dataset | Baseline meta-predictors | Logistic regression | 0.777 ± 0.019+/+ | 4.619 ± 0.857+/+ | 0.395 ± 0.051+/+ | 0.580 ± 0.047+/+ | 0.861 ± 0.023+/+ | 0.739 ± 0.034+/+ | 0.237 ± 0.025+/+ |
Structure-annotated proteins from the benchmark dataset | Deep learning meta-predictor | hybridDBRpred | 0.827 ± 0.017–/ | 5.704 ± 0.724+/ | 0.504 ± 0.042=/ | 0.704 ± 0.040–/ | 0.900 ± 0.016=/ | 0.802 ± 0.028–/ | 0.291 ± 0.027+/ |
Disorder-annotated proteins from the benchmark dataset | Structure-trained | TargetS | 0.557 ± 0.030+/+ | 1.932 ± 0.291+/+ | 0.181 ± 0.026+/+ | 0.311 ± 0.036+/+ | 0.581 ± 0.048+/+ | 0.349 ± 0.047+/+ | 0.130 ± 0.016+/+ |
Disorder-annotated proteins from the benchmark dataset | Structure-trained | TargetDNA | 0.529 ± 0.030+/+ | 1.892 ± 0.362+/+ | 0.170 ± 0.030+/+ | 0.280 ± 0.037+/+ | 0.528 ± 0.048+/+ | 0.301 ± 0.040+/+ | 0.122 ± 0.018+/+ |
Disorder-annotated proteins from the benchmark dataset | Structure-trained | BindN+ | 0.566 ± 0.034+/+ | 2.522 ± 0.562+/+ | 0.206 ± 0.018+/+ | 0.327 ± 0.045+/+ | 0.598 ± 0.058+/+ | 0.343 ± 0.047+/+ | 0.144 ± 0.025+/+ |
Disorder-annotated proteins from the benchmark dataset | Structure-trained | DNAPred | 0.535 ± 0.031+/+ | 2.342 ± 0.375+/+ | 0.187 ± 0.027+/+ | 0.281 ± 0.035+/+ | 0.557 ± 0.049+/+ | 0.298 ± 0.047+/+ | 0.130 ± 0.017+/+ |
Disorder-annotated proteins from the benchmark dataset | Structure-trained | DNAgenie | 0.683 ± 0.058 /+ | 3.769 ± 1.144 /+ | 0.323 ± 0.079 /+ | 0.447 ± 0.094 /+ | 0.753 ± 0.083 /+ | 0.564 ± 0.092 /+ | 0.220 ± 0.054 /+ |
Disorder-annotated proteins from the benchmark dataset | Disorder-trained | fMoRFpred | 0.512 ± 0.014+/+ | 1.313 ± 0.187+/+ | 0.118 ± 0.011+/+ | 0.204 ± 0.013+/+ | 0.488 ± 0.025+/+ | 0.321 ± 0.021+/+ | 0.107 ± 0.020+/+ |
Disorder-annotated proteins from the benchmark dataset | Disorder-trained | ANCHOR2 | 0.585 ± 0.056+/+ | 1.551 ± 0.634+/+ | 0.158 ± 0.053+/+ | 0.315 ± 0.078+/+ | 0.614 ± 0.083+/+ | 0.420 ± 0.089+/+ | 0.133 ± 0.023+/+ |
Disorder-annotated proteins from the benchmark dataset | Disorder-trained | DeepDISObind | 0.640 ± 0.066+/+ | 1.799 ± 0.771+/+ | 0.117 ± 0.065+/+ | 0.331 ± 0.095+/+ | 0.670 ± 0.083+/+ | 0.499 ± 0.116+/+ | 0.147 ± 0.027+/+ |
Disorder-annotated proteins from the benchmark dataset | Disorder-trained | MoRFchibi | 0.628 ± 0.028+/+ | 2.370 ± 0.491+/+ | 0.214 ± 0.038+/+ | 0.369 ± 0.040+/+ | 0.687 ± 0.043+/+ | 0.461 ± 0.041+/+ | 0.155 ± 0.031+/+ |
Disorder-annotated proteins from the benchmark dataset | Disorder-trained | DisoRDPbind | 0.632 ± 0.028+/+ | 2.508 ± 0.486+/+ | 0.235 ± 0.035+/+ | 0.392 ± 0.042+/+ | 0.710 ± 0.044+/+ | 0.469 ± 0.050+/+ | 0.163 ± 0.021+/+ |
Disorder-annotated proteins from the benchmark dataset | Baseline meta-predictors | Average-based | 0.655 ± 0.037+/+ | 3.610 ± 0.668=/+ | 0.302 ± 0.047=/+ | 0.469 ± 0.051=/+ | 0.769 ± 0.049=/+ | 0.485 ± 0.077+/+ | 0.200 ± 0.034+/+ |
Disorder-annotated proteins from the benchmark dataset | Baseline meta-predictors | Logistic regression | 0.678 ± 0.025=/+ | 3.129 ± 0.617+/+ | 0.274 ± 0.038+/+ | 0.411 ± 0.045+/+ | 0.756 ± 0.036=/+ | 0.584 ± 0.038=/+ | 0.186 ± 0.027+/+ |
Disorder-annotated proteins from the benchmark dataset | Deep learning meta-predictor | hybridDBRpred | 0.766 ± 0.048–/ | 4.805 ± 1.140–/ | 0.396 ± 0.079–/ | 0.569 ± 0.082–/ | 0.844 ± 0.048–/ | 0.723 ± 0.061–/ | 0.258 ± 0.055–/ |
Entire benchmark dataset | Structure-trained | TargetS | 0.593 ± 0.012+/+ | 1.549 ± 0.246+/+ | 0.147 ± 0.021+/+ | 0.287 ± 0.029+/+ | 0.641 ± 0.029+/+ | 0.455 ± 0.032+/+ | 0.120 ± 0.012+/+ |
Entire benchmark dataset | Structure-trained | TargetDNA | 0.631 ± 0.024+/+ | 3.034 ± 0.396+/+ | 0.268 ± 0.030+/+ | 0.403 ± 0.035+/+ | 0.708 ± 0.037+/+ | 0.454 ± 0.039+/+ | 0.169 ± 0.016+/+ |
Entire benchmark dataset | Structure-trained | BindN+ | 0.644 ± 0.024+/+ | 3.085 ± 0.422+/+ | 0.267 ± 0.028+/+ | 0.422 ± 0.034+/+ | 0.729 ± 0.017+/+ | 0.337 ± 0.037+/+ | 0.170 ± 0.017+/+ |
Entire benchmark dataset | Structure-trained | DNAPred | 0.659 ± 0.025+/+ | 4.505 ± 0.531+/+ | 0.329 ± 0.032=/+ | 0.445 ± 0.035+/+ | 0.747 ± 0.034+/+ | 0.501 ± 0.049+/+ | 0.221 ± 0.020+/+ |
Entire benchmark dataset | Structure-trained | DNAgenie | 0.703 ± 0.036 /+ | 3.595 ± 0.714 /+ | 0.327 ± 0.047 /+ | 0.503 ± 0.059 /+ | 0.795 ± 0.057 /+ | 0.585 ± 0.062 /+ | 0.208 ± 0.032 /+ |
Entire benchmark dataset | Disorder-trained | fMoRFpred | 0.477 ± 0.013+/+ | 1.006 ± 0.137+/+ | 0.092 ± 0.010+/+ | 0.176 ± 0.012+/+ | 0.457 ± 0.019+/+ | 0.282 ± 0.015+/+ | 0.094 ± 0.012+/+ |
Entire benchmark dataset | Disorder-trained | ANCHOR2 | 0.534 ± 0.032+/+ | 1.431 ± 0.450+/+ | 0.147 ± 0.037+/+ | 0.233 ± 0.044+/+ | 0.535 ± 0.053+/+ | 0.337 ± 0.037+/+ | 0.108 ± 0.014+/+ |
Entire benchmark dataset | Disorder-trained | DeepDISObind | 0.566 ± 0.032+/+ | 1.309 ± 0.445+/+ | 0.132 ± 0.039+/+ | 0.261 ± 0.061+/+ | 0.565 ± 0.056+/+ | 0.415 ± 0.036+/+ | 0.115 ± 0.016+/+ |
Entire benchmark dataset | Disorder-trained | MoRFchibi | 0.585 ± 0.022+/+ | 1.775 ± 0.310+/+ | 0.163 ± 0.023+/+ | 0.298 ± 0.030+/+ | 0.621 ± 0.033+/+ | 0.411 ± 0.029+/+ | 0.119 ± 0.016+/+ |
Entire benchmark dataset | Disorder-trained | DisoRDPbind | 0.626 ± 0.021+/+ | 2.282 ± 0.323+/+ | 0.214 ± 0.025+/+ | 0.364 ± 0.031+/+ | 0.694 ± 0.034+/+ | 0.470 ± 0.036+/+ | 0.143 ± 0.014+/+ |
Entire benchmark dataset | Baseline meta-predictors | Average-based | 0.726 ± 0.023–/+ | 4.839 ± 0.485–/+ | 0.387 ± 0.032–/+ | 0.560 ± 0.035–/+ | 0.840 ± 0.021–/+ | 0.644 ± 0.047–/+ | 0.239 ± 0.021–/+ |
Entire benchmark dataset | Baseline meta-predictors | Logistic regression | 0.720 ± 0.016–/+ | 3.747 ± 0.513+/+ | 0.328 ± 0.031=/+ | 0.485 ± 0.034=/+ | 0.812 ± 0.020=/+ | 0.642 ± 0.028–/+ | 0.207 ± 0.019=/+ |
Entire benchmark dataset | Deep learning meta-predictor | hybridDBRpred | 0.786 ± 0.023–/ | 5.187 ± 0.485–/ | 0.432 ± 0.036–/ | 0.619 ± 0.036–/ | 0.874 ± 0.016–/ | 0.727 ± 0.035–/ | 0.262 ± 0.024–/ |
We report averages and the corresponding standard deviations over the 100 subsets (see ‘Assessment metrics and statistical analysis’ section for details). Sensitivity is calibrated for all methods to the same FPR of 0.1 and 0.2, and specificity is calibrated to the same sensitivity (TPR) of 0.5 and 0.7; this allows for a direct comparison between methods under several diverse predictive scenarios. The best results for a given dataset and for each column are in bold font. We report results of the statistical significance tests as superscripts in the ‘x/y’ format, where x denotes the comparison against the current method with the highest AUC and y denotes the comparison against the new hybridDBRpred meta-predictor; +, =, and – denote that the best current predictor or hybridDBRpred is significantly better than, not significantly different from, or significantly worse than the other method, respectively, at P-value < 0.01. A blank x or y marks the reference method itself.
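The calibrated metrics described in the note above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`sensitivity_at_fpr`, `specificity_at_tpr`, `summarize`), the rule that a residue is predicted as binding when its score is at or above the threshold, and the use of the sample standard deviation are all assumptions made for illustration, and the toy labels and scores are made up. How the 100 subsets are sampled is described in the ‘Assessment metrics and statistical analysis’ section and is not reproduced here.

```python
from statistics import mean, stdev


def _split(labels, scores):
    """Separate scores of binding (1) and non-binding (0) residues."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    return pos, neg


def _rate(group, threshold):
    """Fraction of residues in `group` predicted as binding (score >= threshold)."""
    return sum(s >= threshold for s in group) / len(group)


def sensitivity_at_fpr(labels, scores, target_fpr):
    """Highest sensitivity (TPR) among thresholds whose FPR <= target_fpr."""
    pos, neg = _split(labels, scores)
    best = 0.0
    for t in set(scores):  # every observed score is a candidate threshold
        if _rate(neg, t) <= target_fpr:
            best = max(best, _rate(pos, t))
    return best


def specificity_at_tpr(labels, scores, target_tpr):
    """Highest specificity (1 - FPR) among thresholds whose TPR >= target_tpr."""
    pos, neg = _split(labels, scores)
    best = 0.0
    for t in set(scores):
        if _rate(pos, t) >= target_tpr:
            best = max(best, 1.0 - _rate(neg, t))
    return best


def summarize(values):
    """Mean and sample standard deviation of one metric across subsets (assumed)."""
    return mean(values), stdev(values)


# Toy data: 4 binding and 10 non-binding residues with made-up scores.
labels = [1, 1, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05, 0.04, 0.03, 0.02, 0.01]

print(sensitivity_at_fpr(labels, scores, 0.1))  # 0.75
print(specificity_at_tpr(labels, scores, 0.7))  # 0.9
```

Fixing the FPR (or TPR) before reading off the other quantity places every predictor at the same operating point, which is why the sensitivity and specificity columns can be compared directly across methods. Applying `summarize` to the per-subset values of a metric yields entries of the "average ± standard deviation" form used in the table.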