2019 Feb 8;47(6):e36. doi: 10.1093/nar/gkz061

Table 3.

Results of the BLAST search on the false positive set of E. coli and S. aureus, and specifically on the false positives in disagreement with the annotation of the Mass Spectrometry (MS) and Edman sequencing (Ecogene) dataset

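| Set-up | Type | # | Aligned: total | Aligned: TIS | Aligned: TIS + stop | Description: hypothetical |
|---|---|---|---|---|---|---|
| S. aureus | Proteoforms | 79 | 79 | 77 | 73 | 12 |
| S. aureus | Novel protein | 25 | 19 | 17 | 15 | 6 |
| E. coli | Proteoforms | 232 | 232 | 217 | 198 | 39 |
| E. coli | Novel protein | 258 | 204 | 157 | 137 | 106 |
| MS | Proteoforms | 34 | 34 | 22 | 28 | 1 |
| Ecogene | Proteoforms | 43 | 43 | 40 | 36 | 1 |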

These predictions can be divided into proteoforms, whose TIS lies upstream or downstream of that of the annotated ORF, and novel proteins, which are ORFs with a non-annotated stop site. A BLAST search of these proteins was performed against the non-redundant protein database, with a maximum E-value cut-off of 0.1. The total number of false positives is given for each type (#), followed by the number that produced an alignment (total). Taking only the best-aligned protein (i.e. the hit with the lowest E-value) for each false positive, the number of matches aligned by start site (TIS), or by both start and stop site (TIS + stop), is given. Finally, the number of best-aligned proteins described as ‘hypothetical’ is given.
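
The counting procedure described in this footnote can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the `Hit` record, its field names, and the example hits are hypothetical, and "aligned by start site" is read here as the alignment beginning at residue 1 of both query and subject, with "aligned by stop site" meaning that it extends to the last residue of both. The source does not state the exact criteria used.

```python
from dataclasses import dataclass

E_VALUE_CUTOFF = 0.1  # maximum E-value stated in the table footnote


@dataclass
class Hit:
    """One BLAST alignment (hypothetical record layout, for illustration only)."""
    query_id: str
    evalue: float
    qstart: int   # first aligned residue in the query (1-based)
    qend: int     # last aligned residue in the query
    qlen: int     # full query length
    sstart: int   # first aligned residue in the subject (1-based)
    send: int     # last aligned residue in the subject
    slen: int     # full subject length
    description: str


def best_hits(hits, cutoff=E_VALUE_CUTOFF):
    """Keep, per query, the hit with the lowest E-value at or below the cut-off."""
    best = {}
    for hit in hits:
        if hit.evalue > cutoff:
            continue
        current = best.get(hit.query_id)
        if current is None or hit.evalue < current.evalue:
            best[hit.query_id] = hit
    return best


def summarize(hits, cutoff=E_VALUE_CUTOFF):
    """Count aligned queries, start/stop agreement, and 'hypothetical' descriptions."""
    best = best_hits(hits, cutoff)
    tis = tis_stop = hypothetical = 0
    for hit in best.values():
        # Assumed criteria: alignment covers the first / last residue of both sequences.
        start_ok = hit.qstart == 1 and hit.sstart == 1
        stop_ok = hit.qend == hit.qlen and hit.send == hit.slen
        tis += int(start_ok)
        tis_stop += int(start_ok and stop_ok)
        hypothetical += int("hypothetical" in hit.description.lower())
    return {
        "aligned": len(best),
        "tis": tis,
        "tis_stop": tis_stop,
        "hypothetical": hypothetical,
    }


# Made-up example hits; these are not data from the study.
hits = [
    Hit("p1", 1e-50, 1, 100, 100, 1, 100, 100, "DNA gyrase subunit A"),
    Hit("p1", 1e-10, 5, 100, 100, 8, 90, 120, "hypothetical protein"),
    Hit("p2", 1e-5, 1, 80, 90, 1, 80, 95, "hypothetical protein"),
    Hit("p3", 0.5, 1, 50, 50, 1, 50, 50, "transposase"),  # above the cut-off
]
summary = summarize(hits)
print(summary)
```

In this sketch a query with no hit at or below the cut-off (here `p3`) is not counted as aligned, which mirrors how the "total" column can be smaller than the "#" column for novel proteins.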