2008 Oct 6;9:418. doi: 10.1186/1471-2105-9-418

Table 3.

Benchmarking FOSTA against the PIRSF dataset

Set	Families	Pairings	Basic statistics				Evaluation statistics
			TP	FP	TN	FN	PPV (%)	MCC
A	122	2127	1744	2	3717	383	99.89	0.86
B	1095	18865	12967	23	34656	5898	99.82	0.77
C	474	11221	9146	62	11819	2075	99.33	0.83
D	339	5287	3674	16	4938	1613	99.57	0.72

N	1691	32213	23857	87	50192	8356	99.64	0.79
*	2020	37500	27531	103	55130	9969	99.63	0.79

Set: the identifier for each curation set, with the curation string that defines it given in quotes [A = 'Full/Desc.', B = 'Full', C = 'Preliminary', D = 'None', N = aNnotated (A + B + C), * = All (N + D)]; Families: the number of discrete protein families in the curation set; Pairings: the number of discrete pairings across all families to be tested in FOSTA; Basic statistics: the counts of true positives (TP), false positives (FP), true negatives (TN) and false negatives (FN); Evaluation statistics: the PPV (positive predictive value, TP/(TP + FP), expressed as a percentage) and the MCC (Matthews Correlation Coefficient), both rounded to two decimal places.
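
The evaluation statistics can be recomputed from the basic counts. The following is a minimal sketch, not code from the paper: it assumes the standard definition of the MCC, (TP·TN − FP·FN) / sqrt((TP + FP)(TP + FN)(TN + FP)(TN + FN)), which the legend does not spell out, and the names `ppv`, `mcc` and `ROWS` are hypothetical.

```python
from math import sqrt


def ppv(tp, fp):
    """Positive predictive value as a percentage: 100 * TP / (TP + FP)."""
    return 100.0 * tp / (tp + fp)


def mcc(tp, fp, tn, fn):
    """Matthews Correlation Coefficient (standard definition, assumed here)."""
    numerator = tp * tn - fp * fn
    denominator = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, return 0.0 when any marginal total is zero.
    return numerator / denominator if denominator else 0.0


# (TP, FP, TN, FN) copied from the rows of the table above.
ROWS = {
    "A": (1744, 2, 3717, 383),
    "B": (12967, 23, 34656, 5898),
    "C": (9146, 62, 11819, 2075),
    "D": (3674, 16, 4938, 1613),
    "N": (23857, 87, 50192, 8356),
    "*": (27531, 103, 55130, 9969),
}

for name, (tp, fp, tn, fn) in ROWS.items():
    print(name, round(ppv(tp, fp), 2), round(mcc(tp, fp, tn, fn), 2))
```

Under these definitions, the rounded values agree with the PPV and MCC columns of the table. In every row the Pairings count also equals TP + FN.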