Table 1.
Species | # of annotated genes | # of genes predicted by GeneMarkS-T | Sn/Sp of GeneMarkS-T predicted genes | # of HC genes (Order excluded DB) | Sn/Sp of HC genes (Order excluded DB) | # of HC genes (Species excluded DB) | Sn/Sp of HC genes (Species excluded DB) |
---|---|---|---|---|---|---|---|
C. elegans | 19,969 | 14,746 | 46.8/63.4 | 8,062 | 35.7/88.4 | 11,399 | 51.7/90.6 |
A. thaliana | 27,445 | 17,589 | 51.2/79.9 | 16,008 | 55.0/94.7 | 16,551 | 58.8/97.6 |
D. melanogaster | 13,951 | 10,163 | 59.6/81.8 | 8,109 | 59.6/81.8 | 9,223 | 63.7/96.3 |
S. lycopersicum | 25,158 | 19,526 | 67.8/77.8 | 17,231 | 74.9/95.2 | 17,489 | 75.8/95.1 |
D. rerio | 25,611 | 22,992 | 59.6/59.9 | 16,918 | 67.0/88.5 | 16,573 | 66.9/90.4 |
G. gallus | 17,279 | 17,381 | 49.6/47.0 | 12,473 | 74.4/89.1 | 12,564 | 74.0/88.4 |
M. musculus | 22,611 | 15,819 | 49.6/63.2 | 13,057 | 63.5/93.2 | 12,965 | 63.9/94.5 |
Two versions of the reference protein database were used for each species: the ‘Species excluded’ database, containing all proteins from an OrthoDB segment except those from the same species, and the smaller ‘Order excluded’ database, containing all proteins from the same OrthoDB segment except those from the same taxonomic order (see Materials). Additional data are provided in Supplemental Table S1.
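
The relationship between the two reference databases can be illustrated with a minimal sketch. It is not the pipeline used to build the databases; the `Protein` record, the function names, and the placeholder taxon labels (`species_A`, `order_X`, and so on) are all hypothetical and serve only to show why the ‘Order excluded’ set is always contained in the ‘Species excluded’ set, given that each species belongs to exactly one order.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Protein:
    """Hypothetical record for one protein in an OrthoDB segment."""
    protein_id: str
    species: str
    order: str


def species_excluded(segment, target_species):
    """Keep every protein except those from the target species."""
    return [p for p in segment if p.species != target_species]


def order_excluded(segment, target_order):
    """Keep every protein except those from the target taxonomic order."""
    return [p for p in segment if p.order != target_order]


# Toy segment with placeholder taxon labels (assumed, not real data).
segment = [
    Protein("p1", "species_A", "order_X"),  # the species being annotated
    Protein("p2", "species_B", "order_X"),  # same order, different species
    Protein("p3", "species_C", "order_Y"),  # different order
    Protein("p4", "species_D", "order_Z"),  # different order
]

species_db = species_excluded(segment, "species_A")
order_db = order_excluded(segment, "order_X")

species_ids = [p.protein_id for p in species_db]
order_ids = [p.protein_id for p in order_db]

print("Species excluded:", species_ids)
print("Order excluded:  ", order_ids)
```

In this toy segment, excluding the order removes `p2` in addition to `p1`, so the ‘Order excluded’ database is the smaller of the two and represents the harder test case, in which no close relatives of the target species remain in the reference set.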