2010 Jun 4;38(Web Server issue):W35–W40. doi: 10.1093/nar/gkq415

Table 3.

Validation of specificity-site detection by SH and mR, scored as the area under the curve (AUC) of the PR plots against gold-standard specificity sites in 22 data sets: 7 sets as defined in Table 2 and 15 sets obtained from Chakrabarti and Panchenko (15).

Dataset	cbm9	cd00120	cd00264	cd00333	cd00363	cd00365	cd00423	cd00985	CN-myc	GPCR190	GPCR	GST	IDH/IMDH	LacI	MDH/LDH	AQP/GLP	nucl cycl.^a	rab5/6	ras/ral	ricin	serine	Smad	Weighted average
# positives	7	3	3	12	6	10	4	3	11	21	21	9	14	28	1	23	2	28	12	21	2	29
mR	0.161	0.058	0.006	0.301	0.010	0.055	0.204	0.329	0.037	0.246	0.347	0.156	0.050	0.266	0.063	0.213	0.417	0.540	0.666	0.186	0.078	0.719	0.310
mR Z	0.161	0.058	0.006	0.301	0.010	0.055	0.204	0.329	0.037	0.252	0.347	0.156	0.050	0.282	0.063	0.216	0.417	0.539	0.666	0.186	0.078	0.721	0.312
SH	0.074	0.054	0.003	0.287	0.008	0.119	0.080	0.198	0.067	0.486	0.489	0.242	0.048	0.124	0.125	0.249	0.413	0.602	0.540	0.194	0.261	0.713	0.330
SH Z	0.074	0.054	0.003	0.287	0.008	0.119	0.080	0.198	0.067	0.517	0.489	0.242	0.048	0.207	0.125	0.268	0.413	0.602	0.540	0.194	0.261	0.703	0.342
ProteinKeys	0.049	0.008	0.087	0.203	0.010	0.010	0.002	0.034	0.027	0.377	0.505	0.483	0.065	0.301	0.005	0.119	0.011	0.364	0.092	0.276	0.006	0.748	0.287
PROUST-II	0.349	0.079	0.012	0.055	0.011	0.016	0.049	0.058	0.122	0.308	^b	0.446	0.089	0.111	0.015	0.187	0.305	0.455	0.378	0.256	0.750	0.723	0.258
SDPpred v.2	0.122	0.126	0.017	0.376	0.012	0.126	0.234	0.509	0.162	0.508	0.508	0.615	0.196	0.146	0.250	0.242	0.413	0.416	0.357	0.201	0.542	0.522	0.333
Xdet	0.352	0.106	0.080	0.366	0.011	0.103	0.196	0.387	0.086	0.125	^b	0.117	0.100	0.190	0.033	0.169	0.054	0.350	0.398	0.173	0.105	0.688	0.234
Xdet sup^c	0.209	0.106	0.019	0.346	0.012	0.189	0.171	0.534	0.101	0.275	^b	0.402	0.129	0.207	0.250	0.208	0.292	0.346	0.545	0.193	0.750	0.677	0.279
Average	0.172	0.072	0.026	0.280	0.010	0.088	0.136	0.286	0.078	0.344	0.448	0.318	0.086	0.204	0.103	0.208	0.304	0.468	0.465	0.206	0.314	0.691	0.298

^aNucleotidyl cyclase.

^bThe GPCR data set is above the maximum of 1000 sequences for these methods.

^cSupervised by using subgroupings.

A higher AUC corresponds to better performance. For comparison, predictions by ProteinKeys, PROUST-II, SDPpred v.2 and Xdet are also shown. Best-scoring methods for each data set are in bold. The final column lists the average AUC per method, weighted by the number of positives, and the bottom row lists the average per data set.
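The two kinds of average in the table can be reproduced from its own values. The sketch below is illustrative only: it assumes the weighted average is the sum of (AUC × number of positives) divided by the total number of positives, and that the per-data-set average is a plain mean over the nine method rows. The helper name `weighted_average` is hypothetical and does not come from the paper.

```python
# Number of gold-standard positives per data set (row "# positives").
positives = [7, 3, 3, 12, 6, 10, 4, 3, 11, 21, 21,
             9, 14, 28, 1, 23, 2, 28, 12, 21, 2, 29]

# AUC values of mR on the same 22 data sets (row "mR").
mr_auc = [0.161, 0.058, 0.006, 0.301, 0.010, 0.055, 0.204, 0.329,
          0.037, 0.246, 0.347, 0.156, 0.050, 0.266, 0.063, 0.213,
          0.417, 0.540, 0.666, 0.186, 0.078, 0.719]

# AUC values of all nine method rows on the first data set (column "cbm9").
cbm9_auc = [0.161, 0.161, 0.074, 0.074, 0.049, 0.349, 0.122, 0.352, 0.209]


def weighted_average(values, weights):
    """Mean of `values` weighted by `weights` (hypothetical helper)."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


# Assumption: weights are the per-data-set numbers of positives.
mr_weighted = weighted_average(mr_auc, positives)

# Assumption: the bottom row is an unweighted mean over the method rows.
cbm9_mean = sum(cbm9_auc) / len(cbm9_auc)

print(f"mR weighted average: {mr_weighted:.3f}")
print(f"cbm9 average:        {cbm9_mean:.3f}")
```

Under these assumptions the script gives 0.310 for mR and 0.172 for cbm9, in agreement with the final column and the bottom row of the table.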