[Preprint]. 2024 Apr 17:2023.01.13.524024. Originally published 2023 Jan 15. [Version 5] doi: 10.1101/2023.01.13.524024

Table 1.

Statistics characterizing the initial GeneMarkS-T gene predictions made in assembled transcripts, as well as the HC genes predicted by the GeneMarkS-TP module. Two versions of a reference protein database were used for each species: ‘Species excluded’ and ‘Order excluded’ (see Data Sets). Refinement by GeneMarkS-TP reduced the number of initial predictions and produced a significant increase in Pr. In most genomes Sn also increased, most notably in the large, inhomogeneous genomes of G. gallus and M. musculus. Additional data are provided in Supplemental Table S1.

| Species | # of genes initially predicted by GeneMarkS-T | Sn/Pr of GeneMarkS-T predicted genes | Order excluded DB: # of HC genes | Order excluded DB: Sn/Pr of HC genes | Species excluded DB: # of HC genes | Species excluded DB: Sn/Pr of HC genes |
|---|---|---|---|---|---|---|
| C. elegans | 14,746 | 46.8 / 63.4 | 8,062 | 35.7 / 88.4 | 11,399 | 51.7 / 90.6 |
| A. thaliana | 17,589 | 51.2 / 79.9 | 16,008 | 56.7 / 97.3 | 16,551 | 58.8 / 97.6 |
| D. melanogaster | 10,163 | 59.6 / 81.8 | 8,109 | 55.0 / 94.7 | 9,223 | 63.7 / 96.3 |
| S. lycopersicum | 19,526 | 68.0 / 78.4 | 17,231 | 75.1 / 95.3 | 17,489 | 75.8 / 95.2 |
| D. rerio | 22,992 | 59.6 / 59.9 | 16,918 | 67.0 / 88.5 | 16,573 | 66.9 / 90.4 |
| G. gallus | 17,381 | 49.6 / 47.0 | 12,473 | 74.4 / 89.1 | 12,564 | 74.0 / 88.4 |
| M. musculus | 15,819 | 49.6 / 63.2 | 13,057 | 63.5 / 93.2 | 12,965 | 63.9 / 94.5 |
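
The caption's statements about Pr and Sn can be checked directly against the table values. The following is a minimal sketch, not part of the original analysis: the names `TABLE1`, `deltas`, and `changes` are hypothetical, the values are copied from Table 1, and the differences are treated as percentage points on the assumption that Sn and Pr are reported as percentages.

```python
# Sn/Pr values copied from Table 1. Tuple layout (an illustrative choice):
# (initial Sn, initial Pr,
#  HC Sn with Order excluded DB, HC Pr with Order excluded DB,
#  HC Sn with Species excluded DB, HC Pr with Species excluded DB)
TABLE1 = {
    "C. elegans":      (46.8, 63.4, 35.7, 88.4, 51.7, 90.6),
    "A. thaliana":     (51.2, 79.9, 56.7, 97.3, 58.8, 97.6),
    "D. melanogaster": (59.6, 81.8, 55.0, 94.7, 63.7, 96.3),
    "S. lycopersicum": (68.0, 78.4, 75.1, 95.3, 75.8, 95.2),
    "D. rerio":        (59.6, 59.9, 67.0, 88.5, 66.9, 90.4),
    "G. gallus":       (49.6, 47.0, 74.4, 89.1, 74.0, 88.4),
    "M. musculus":     (49.6, 63.2, 63.5, 93.2, 63.9, 94.5),
}


def deltas(row):
    """Return (change in Sn, change in Pr) of HC genes relative to the
    initial GeneMarkS-T predictions, for each protein database version."""
    sn0, pr0, sn_order, pr_order, sn_species, pr_species = row
    return {
        "order_excluded": (round(sn_order - sn0, 1), round(pr_order - pr0, 1)),
        "species_excluded": (round(sn_species - sn0, 1), round(pr_species - pr0, 1)),
    }


changes = {species: deltas(row) for species, row in TABLE1.items()}

for species, change in changes.items():
    for db, (d_sn, d_pr) in change.items():
        print(f"{species:16s} {db:17s} dSn={d_sn:+.1f} dPr={d_pr:+.1f}")
```

In this tabulation Pr rises for every species under both database versions, while Sn falls only for C. elegans and D. melanogaster with the ‘Order excluded’ database, which is consistent with the caption's "most of the genomes".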