2022 Jan 17;23(2):bbab549. doi: 10.1093/bib/bbab549

Table 2.

Generation of a curated benchmark ORF set. The benchmark set archives contain GFF files with the labels of all annotated ORF sets (positive/negative), MS labels, tool predictions, close-proximity genes, genome sequences, and reference annotations, so that the data can be inspected in a genome browser. Links to the original data sources are provided. For each dataset, the sequencing depth is given, computed as the total number of reads times the average read length, divided by the genome length [80]. For each annotated ORF set (translatome, sORFs, close-proximity genes, and stand-alone genes), the number of ORFs identified as translated (positive) or nontranslated (negative) is listed.

| Organism | E. coli | L. monocytogenes [56] | P. aeruginosa [59] | S. typhimurium [58] |
|---|---|---|---|---|
| Benchmark set [zip] | E. coli | L. monocytogenes | P. aeruginosa | S. typhimurium |
| Growth conditions | WT, LB @ 37 °C | WT, BHI @ 37 °C | WT, n-alkanes | WT, LB @ 37 °C |
| Data | GSE131514 | SAMEA3864955, SAMEA3864956 | SAMN06617371 | SRX3456030, SRX3456038 |
| Sequencing depth | 42.98 | 939.76 | 81.92 | 38.92 |

| Set | E. coli, positive | E. coli, negative | L. monocytogenes, positive | L. monocytogenes, negative | P. aeruginosa, positive | P. aeruginosa, negative | S. typhimurium, positive | S. typhimurium, negative |
|---|---|---|---|---|---|---|---|---|
| Translatome | 2763 (65%) | 1485 (35%) | 2288 (80%) | 579 (20%) | 3935 (71%) | 1638 (29%) | 3284 (66%) | 1689 (34%) |
| sORFs | 54 (48%) | 60 (52%) | 7 (100%) | 0 (0%) | 7 (58%) | 5 (42%) | 31 (31%) | 69 (69%) |
| Close-proximity genes | 1794 (64%) | 1015 (36%) | 1622 (80%) | 432 (20%) | 2511 (69%) | 1113 (31%) | 1947 (66%) | 1010 (34%) |
| Stand-alone genes | 969 (67%) | 470 (33%) | 666 (82%) | 147 (18%) | 1424 (73%) | 525 (27%) | 1337 (66%) | 679 (34%) |
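
The sequencing depth defined in the caption (total number of reads times average read length, divided by genome length) can be written as a small helper. This is a minimal sketch: the function name, parameter names, and example inputs are illustrative assumptions and are not taken from the benchmark datasets or the authors' code.

```python
def sequencing_depth(total_reads: int, mean_read_length: float, genome_length: int) -> float:
    """Return depth = total_reads * mean_read_length / genome_length.

    All three arguments are hypothetical names chosen for this sketch;
    lengths are in nucleotides.
    """
    if genome_length <= 0:
        raise ValueError("genome_length must be positive")
    return total_reads * mean_read_length / genome_length


if __name__ == "__main__":
    # Made-up inputs for illustration only (not values from Table 2).
    depth = sequencing_depth(total_reads=8_000_000,
                             mean_read_length=25,
                             genome_length=4_600_000)
    print(f"Sequencing depth: {depth:.2f}")
```

Because depth scales linearly with read count and read length, datasets with very different values in the table, such as 939.76 versus 38.92, differ in the amount of sequence data relative to genome size, not in the method of calculation.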