Nucleic Acids Res. 2005 Apr 1;33(6):1874–1891. doi: 10.1093/nar/gki327

Table 3.

Comparison of selected servers participating in LiveBench-8

[Table 3 image: gki327t3.jpg]

Only publicly available servers that provide a description of the underlying algorithm are listed. Results obtained in LiveBench-7 are also displayed if available. Results for the PROTINFO-CM server are not presented because of late predictions. Servers are colored blue (sequence-only methods), red (hybrid methods) and black (structure meta predictors). The ‘Code’ column shows the code of the method as given in Table 1. Results obtained using the 3D-score assessment measure are shown (see Table 2). The ‘Sum’ column reports the sum of scores obtained for correct models of difficult targets (those with no PSI-BLAST assignment with an E-value below 0.001). The ‘FR’ column shows the number of correct models generated for difficult targets. The ‘All’ column shows the number of correct models generated for all targets, including the easy ones. The ‘ROC’ (Receiver Operating Characteristic) value describes the specificity of the confidence scores reported by the methods: it is the average number of correct models that have a higher confidence score than the first, second … tenth false prediction. The ‘ROC%’ column gives the ‘ROC’ value divided by the total number of targets and multiplied by 100. Robetta is not listed here since it does not provide confidence scores. The ‘Score’ column reports the score of the third false positive prediction, and the ‘3’ column gives the number of correct predictions with a higher score than the third false one. This score can be used as an approximate confidence threshold, below which false positive predictions become frequent. The ‘Lost’ column shows the number of missing predictions for each server; servers with more than a few missing predictions cannot be properly evaluated. Some servers that entered LiveBench in the eighth round have missing scores in LiveBench-7. The new sequence-only methods exhibit high specificity and can compete with structure meta predictors in this ranking.
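The ‘ROC’ specificity measure described above can be sketched as follows. This is a hypothetical re-implementation based solely on the caption's description, not the authors' evaluation code; the input representation (a list of confidence/correctness pairs) and the tie-breaking by sort order are assumptions.

```python
def livebench_roc(predictions, n_false=10):
    """Average, over the first n_false false predictions (ranked by
    descending confidence), of the number of correct models that score
    higher than each false one.

    predictions: list of (confidence_score, is_correct) tuples.
    """
    ranked = sorted(predictions, key=lambda p: p[0], reverse=True)
    counts = []          # correct models seen above each false prediction
    correct_so_far = 0
    false_seen = 0
    for _score, is_correct in ranked:
        if is_correct:
            correct_so_far += 1
        else:
            false_seen += 1
            counts.append(correct_so_far)
            if false_seen == n_false:
                break
    return sum(counts) / len(counts) if counts else 0.0
```

The ‘ROC%’ column would then be `100 * livebench_roc(predictions) / total_targets`, normalizing the measure by the number of targets in the round.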
The structure meta predictors rank much higher in the sensitivity-based (FR) and model-quality-based (Sum) evaluations. In LiveBench-8, the meta–meta predictors, such as 3JC1 and 3JCa, which use the results of other meta predictors, profit greatly from their parasitic nature. Servers that are not maintained over long periods of time become obsolete owing to outdated fold libraries: for example, FFAS now seems to perform similarly to PDBb, whereas it performed much better in the first rounds of LiveBench. High-quality servers are able to generate ∼50% more correct models than PDBb (‘All’ column).
