Table 3.
Set | Families | Pairings | TP | FP | TN | FN | PPV (%) | MCC |
A | 122 | 2127 | 1744 | 2 | 3717 | 383 | 99.89 | 0.86 |
B | 1095 | 18865 | 12967 | 23 | 34656 | 5898 | 99.82 | 0.77 |
C | 474 | 11221 | 9146 | 62 | 11819 | 2075 | 99.33 | 0.83 |
D | 339 | 5287 | 3674 | 16 | 4938 | 1613 | 99.57 | 0.72 |
N | 1691 | 32213 | 23857 | 87 | 50192 | 8356 | 99.64 | 0.79 |
* | 2020 | 37500 | 27531 | 103 | 55130 | 9969 | 99.63 | 0.79 |
Set: the identifier for each curation set [A = 'Full/Desc.', B = 'Full', C = 'Preliminary', D = 'None', N = aNnotated (A+B+C), * = All (N+D)]; Curation string: the string that defines the curation set; Families: the number of discrete protein families in the curation set; Pairings: the number of discrete pairings across all families to be tested in FOSTA; Basic statistics: the counts of true positives (TP), false positives (FP), true negatives (TN) and false negatives (FN); Evaluation statistics: the PPV (positive predictive value, TP/(TP + FP), expressed as a percentage) and the MCC (Matthews Correlation Coefficient), both rounded to 2 decimal places.
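As an illustration of how the evaluation statistics follow from the basic counts, the sketch below recomputes PPV and MCC for set A. It is not taken from FOSTA itself: the function names `ppv` and `mcc` are hypothetical, and the MCC formula used is the standard textbook definition, which is assumed (not stated in the legend) to be the one applied here.

```python
import math


def ppv(tp, fp):
    """Positive predictive value as a percentage: 100 * TP / (TP + FP)."""
    return 100.0 * tp / (tp + fp)


def mcc(tp, fp, tn, fn):
    """Matthews correlation coefficient (standard definition, assumed here)."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, return 0 when any marginal total is zero.
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Counts for set A, copied from Table 3.
tp, fp, tn, fn = 1744, 2, 3717, 383
print(round(ppv(tp, fp), 2), round(mcc(tp, fp, tn, fn), 2))  # prints: 99.89 0.86
```

Under these assumptions the recomputed values agree with the PPV and MCC columns for set A, and in every row TP + FN equals the Pairings count.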