2008 Feb 13;9(Suppl 1):S4. doi: 10.1186/1471-2105-9-S1-S4

Table 3.

Evaluation results on the benchmark data set.

	Protein-level accuracy			Per-segment accuracy			Per-residue accuracy
Method	Q_ok	False positives	False negatives	Qhtm F-score	Qhtm %obs	Qhtm %prd	Q2

PHDpsihtm08	84	2	3	98	99	98	80
TMpro	83	14	0	95	95	96	73
HMMTOP2	83	6	0	99	99	99	80
DAS	79	16	0	97	99	96	72
TopPred2	75	10	8	90	90	90	77
TMHMM1	71	1	8	90	90	90	80
SOSUI	71	1	8	87	88	86	75
PHDhtm07	69	3	14	82	83	81	78
KD	65	81	0	91	94	89	67
PHDhtm08	64	2	19	76	77	76	78
GES	64	53	0	93	97	90	71
PRED-TMR	61	4	8	87	84	90	76
Ben-Tal	60	3	11	84	79	89	72
Eisenberg	58	66	0	92	95	89	69
Hopp-Woods	56	89	0	89	93	86	62
WW	54	32	0	93	95	91	71
Roseman	52	95	0	88	94	83	58
Av-Cid	52	95	0	88	93	83	60
Levitt	48	93	0	87	91	84	59
A-Cid	47	95	0	89	95	83	58
Heijne	45	92	0	87	93	82	61
Bull-Breese	45	100	0	87	92	82	55
Sweet	43	84	0	86	90	83	63
Radzicka	40	100	0	86	93	79	56
Nakashima	39	90	0	85	88	83	60
Fauchere	36	99	0	86	92	80	56
Lawson	33	98	0	82	86	79	55
EM	31	99	0	84	92	77	57
Wolfenden	28	2	39	52	43	62	62

Performance of methods other than TMpro was originally reported in Protein Science [16] and is reproduced here with permission from Cold Spring Harbor Laboratory Press, Copyright 2002. The TMpro values compared against these published values are those returned by the benchmark web server [29] when TMpro predictions are uploaded. The columns, from left to right, show the method being evaluated, followed by three groups of metrics. Protein-level accuracy: Q_ok is the percentage of proteins in which all experimentally determined segments are predicted correctly and no extra segments are predicted, that is, there is a one-to-one match between predicted and experimentally determined segments; False positives is the percentage of globular proteins that are misrecognized as membrane proteins; False negatives is the number of membrane proteins that are misclassified as soluble proteins because no TM segment is predicted in the protein. Per-segment accuracy: the segment F-score is the geometric mean of Recall and Precision, where Recall (Q^htm,%obs) is the percentage of experimentally determined segments that are predicted correctly and Precision (Q^htm,%prd) is the percentage of predicted segments that are correct. Per-residue accuracy: Q₂ is the residue-level accuracy when all residues in a protein are considered together, and the Q₂ value for the entire set of proteins is the average of the values for the individual proteins. See [16] for further details on these metrics.
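As an illustration of the two aggregate metrics defined in the caption, the following minimal sketch computes a segment F-score as the geometric mean of Recall and Precision, and a set-level Q₂ as the mean of per-protein Q₂ values. The function names `segment_f_score` and `mean_q2` are hypothetical and are not part of the benchmark server [29]. The inputs are the rounded percentages from the table, so a recomputed F-score may occasionally differ by one point from a published value that was presumably derived from unrounded inputs.

```python
import math


def segment_f_score(recall_pct, precision_pct):
    """Geometric mean of Recall (Qhtm %obs) and Precision (Qhtm %prd), in percent."""
    return math.sqrt(recall_pct * precision_pct)


def mean_q2(per_protein_q2):
    """Set-level Q2: the average of the Q2 values of the individual proteins."""
    values = list(per_protein_q2)
    if not values:
        raise ValueError("at least one per-protein Q2 value is required")
    return sum(values) / len(values)


# (Recall, Precision) pairs copied from the table above.
table_rows = {
    "TMpro": (95, 96),
    "KD": (94, 89),
    "Wolfenden": (43, 62),
}

for method, (recall, precision) in table_rows.items():
    print(method, round(segment_f_score(recall, precision)))
```

For the three rows used here, the rounded results agree with the F-score column of the table. The per-protein Q₂ values are not listed in the table, so `mean_q2` would need them from the benchmark output.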