Table 1.
| | Average AUC-ROC | | | | Average AUC-PrecRec | | | | AUC-ROC All | | | | AUC-PrecRec All | | | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Window Size | 1 | 3 | 11 | 19 | 1 | 3 | 11 | 19 | 1 | 3 | 11 | 19 | 1 | 3 | 11 | 19 |
| DAQ(AA) | **0.76** | **0.84** | **0.90** | **0.92** | **0.41** | **0.55** | **0.70** | **0.73** | **0.78** | **0.88** | **0.95** | **0.96** | **0.44** | **0.63** | **0.81** | **0.85** |
| Q-score | 0.72 | 0.77 | 0.81 | 0.83 | **0.41** | 0.47 | 0.53 | 0.55 | 0.66 | 0.70 | 0.73 | 0.74 | 0.36 | 0.38 | 0.37 | 0.36 |
| EMRinger | 0.55 | 0.58 | 0.62 | 0.63 | 0.22 | 0.24 | 0.31 | 0.33 | 0.56 | 0.59 | 0.66 | 0.69 | 0.15 | 0.17 | 0.23 | 0.27 |
| CaBLAM | 0.66 | 0.69 | 0.72 | 0.73 | 0.33 | 0.38 | 0.48 | 0.48 | 0.63 | 0.65 | 0.68 | 0.70 | 0.28 | 0.30 | 0.37 | 0.39 |
Performance in detecting inconsistent residue positions in 35 first-version models of the PDB2Ver dataset was evaluated for four validation scores: DAQ(AA), Q-score, EMRinger, and CaBLAM. Average AUC-ROC and Average AUC-PrecRec values were computed for each model separately and then averaged over the models. The latter two evaluations, AUC-ROC All and AUC-PrecRec All, considered the 35 models altogether (Methods). Four window sizes (1, 3, 11, and 19 residues) were used to average scores. The largest values in each column are indicated in bold. Supplementary Figure 7 shows the score distributions for inconsistent and consistent residue positions.
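The two aggregation modes in the caption (per-model AUC averaged over models vs. a single AUC over all models pooled) and the window-based score averaging can be sketched as follows. This is a minimal illustration with hypothetical toy data, not the paper's evaluation code; AUC-ROC is computed via the Mann-Whitney rank statistic, and the toy convention is that higher scores indicate inconsistent positions.

```python
def auc_roc(scores, labels):
    """AUC-ROC as the probability that a positive (label 1, here an
    inconsistent residue) outranks a negative; ties count as 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def window_average(scores, w):
    """Average each per-residue score over a centered window of w residues,
    truncating the window at the chain ends."""
    half = w // 2
    out = []
    for i in range(len(scores)):
        window = scores[max(0, i - half):i + half + 1]
        out.append(sum(window) / len(window))
    return out

# Two hypothetical "models": per-residue scores and inconsistency labels
# (1 = inconsistent position, 0 = consistent).
models = [
    ([0.1, 0.2, 0.8, 0.9, 0.7], [0, 0, 1, 1, 1]),
    ([0.2, 0.9, 0.3, 0.8, 0.1], [0, 1, 0, 1, 0]),
]
w = 3  # one of the window sizes used in the table

# "Average" mode: AUC per model, then mean over models.
per_model = [auc_roc(window_average(s, w), y) for s, y in models]
avg_auc = sum(per_model) / len(per_model)

# "All" mode: pool windowed scores from every model, then one AUC.
all_scores = [v for s, _ in models for v in window_average(s, w)]
all_labels = [v for _, y in models for v in y]
auc_all = auc_roc(all_scores, all_labels)
```

The same scheme applies to AUC-PrecRec by swapping in a precision-recall AUC in place of `auc_roc`; the per-model averaging and the pooled "All" variant are unchanged.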