2016 Apr 4;6:23990. doi: 10.1038/srep23990

Table 2. Per-target average correlation, average loss, average Spearman’s correlation, average Kendall tau score, and total number of evaluated targets for Qprob and several other pure single-model QA methods on the Stage 2 CASP11 dataset.

QA Method	Ave. corr.	Ave. loss	Ave. Spearman	Ave. Kendall tau	p-value (loss)	p-value (corr.)	# targets
ProQ2	0.372	0.058	0.366	0.256	0.2387	0.8636	83
Qprob	0.381	0.068	0.387	0.272	–	–	83
VoroMQA	0.401	0.069	0.386	0.269	0.4335	0.5864	83
ProQ2-refine	0.37	0.069	0.375	0.264	0.2442	0.9656	83
ModelEvaluator	0.324	0.072	0.305	0.212	0.002554	0.3084	83
Dope	0.304	0.077	0.324	0.228	1.59E-07	0.74	83
RWplus	0.295	0.084	0.314	0.22	7.00E-09	0.11	83
Wang_SVM	0.362	0.085	0.351	0.245	0.4774	0.1502	83
raghavagps-qaspro	0.222	0.085	0.205	0.139	3.07E-07	0.006219	83
Wang_deep_2	0.307	0.086	0.298	0.208	0.000593	0.03628	83
Wang_deep_1	0.302	0.089	0.293	0.203	0.000911	0.04544	83
DFIRE2	0.235	0.091	0.253	0.175	6.15E-11	0.004036	83
Wang_deep_3	0.302	0.092	0.29	0.202	0.000469	0.008166	83
RF_CB_SRS_OD	0.36	0.097	0.35	0.243	0.06173	0.002035	83
FUSION	0.05	0.111	0.082	0.054	7.16E-11	5.82E-07	83

The p-values of the pairwise Wilcoxon signed-rank test for the difference in loss and in correlation between Qprob and each other method are listed for comparison. Five single-model QA methods that did not participate in CASP11 are also listed and highlighted in bold.
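The quantities reported in Table 2 can be illustrated with a minimal Python sketch. It is not the authors' evaluation code, and it rests on several assumptions that the table itself does not state: the loss for a target is taken as the true quality of the best model in the pool minus the true quality of the model the QA method ranks first; the Kendall score is computed as tau-a (no tie correction); and the signed-rank p-value uses a two-sided normal approximation with zero differences dropped. All function names and the toy scores are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation between two equal-length score lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def average_ranks(values):
    """Ranks starting at 1; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    return pearson(average_ranks(x), average_ranks(y))

def kendall_tau_a(x, y):
    """Kendall tau-a: (concordant - discordant) / number of pairs."""
    n = len(x)
    s = 0
    for i in range(n):
        for j in range(i + 1, n):
            prod = (x[i] - x[j]) * (y[i] - y[j])
            s += (prod > 0) - (prod < 0)
    return s / (n * (n - 1) / 2)

def loss(predicted, true):
    """Assumed definition: best true quality minus true quality of the top-ranked model."""
    top = max(range(len(predicted)), key=lambda i: predicted[i])
    return max(true) - true[top]

def wilcoxon_signed_rank(a, b):
    """Paired test on per-target values of two methods.

    Returns (W+, two-sided p-value) using the normal approximation,
    dropping zero differences and applying no tie correction.
    """
    diffs = [p - q for p, q in zip(a, b) if p != q]
    n = len(diffs)
    if n == 0:
        return 0.0, 1.0
    ranks = average_ranks([abs(d) for d in diffs])
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    mean = n * (n + 1) / 4
    sd = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    z = (w_plus - mean) / sd
    return w_plus, math.erfc(abs(z) / math.sqrt(2))

# Toy data for a single target: predicted scores and true quality of four models.
predicted = [0.9, 0.7, 0.8, 0.4]
true = [0.60, 0.65, 0.55, 0.30]
print(pearson(predicted, true), spearman(predicted, true))
print(kendall_tau_a(predicted, true), loss(predicted, true))
```

In such a setup, each metric would be computed once per target and then averaged over all targets, and `wilcoxon_signed_rank` would be applied to the two per-target lists (for example, the per-target losses of Qprob and of one other method). For real analyses, `scipy.stats.wilcoxon`, `scipy.stats.spearmanr`, and `scipy.stats.kendalltau` offer tested implementations with tie handling and exact p-values.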