2008 Oct 6;9:418. doi: 10.1186/1471-2105-9-418

Table 3.

Benchmarking FOSTA against the PIRSF dataset

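| Set | Families | Pairings | TP | FP | TN | FN | PPV (%) | MCC |
|-----|----------|----------|----|----|----|----|---------|-----|
| A | 122 | 2127 | 1744 | 2 | 3717 | 383 | 99.89 | 0.86 |
| B | 1095 | 18865 | 12967 | 23 | 34656 | 5898 | 99.82 | 0.77 |
| C | 474 | 11221 | 9146 | 62 | 11819 | 2075 | 99.33 | 0.83 |
| D | 339 | 5287 | 3674 | 16 | 4938 | 1613 | 99.57 | 0.72 |
| N | 1691 | 32213 | 23857 | 87 | 50192 | 8356 | 99.64 | 0.79 |
| * | 2020 | 37500 | 27531 | 103 | 55130 | 9969 | 99.63 | 0.79 |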

Set: the identifier for each curation set [A = 'Full/Desc.', B = 'Full', C = 'Preliminary', D = 'None', N = aNnotated (A+B+C), * = All (N+D)], where the quoted curation string is the string that defines the set; Families: the number of discrete protein families in the curation set; Pairings: the number of discrete pairings across all families to be tested in FOSTA; Basic statistics: the counts of true positives (TP), false positives (FP), true negatives (TN) and false negatives (FN); Evaluation statistics: the PPV (positive predictive value, TP/(TP + FP), expressed as a percentage) and the MCC (Matthews Correlation Coefficient), both rounded to 2 decimal places.
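
The evaluation statistics can be recomputed from the basic counts. The following is a minimal sketch, not the authors' code: the PPV follows the formula given in the legend, while the MCC uses the standard definition, (TP·TN − FP·FN) / sqrt((TP + FP)(TP + FN)(TN + FP)(TN + FN)), which the legend names but does not spell out. The function names `ppv` and `mcc` and the `counts` dictionary are illustrative assumptions; the numbers are copied from the table.

```python
from math import sqrt


def ppv(tp, fp):
    """Positive predictive value as a percentage: 100 * TP / (TP + FP)."""
    return 100.0 * tp / (tp + fp)


def mcc(tp, fp, tn, fn):
    """Matthews correlation coefficient (standard definition, assumed here)."""
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        # Common convention when any marginal total is zero.
        return 0.0
    return (tp * tn - fp * fn) / denom


# (TP, FP, TN, FN) for each curation set, copied from Table 3.
counts = {
    "A": (1744, 2, 3717, 383),
    "B": (12967, 23, 34656, 5898),
    "C": (9146, 62, 11819, 2075),
    "D": (3674, 16, 4938, 1613),
    "N": (23857, 87, 50192, 8356),
    "*": (27531, 103, 55130, 9969),
}

for set_id, (tp, fp, tn, fn) in counts.items():
    # Round to 2 decimal places, as in the table.
    print(set_id, round(ppv(tp, fp), 2), round(mcc(tp, fp, tn, fn), 2))
```

Under these assumptions the rounded values agree with the PPV and MCC columns of the table for every set.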