2024 May;34(5):757–768. doi: 10.1101/gr.278373.123

Table 4.

The table shows the total numbers of protein-coding genes and of individual CDSs (including alternative isoforms) annotated in the seven genomes. The annotation sources are described in Supplemental Table S7.

| Species | Genome length (Mb) | Reference annotation: no. of protein-coding genes | Reference annotation: no. of CDSs | Reference annotation: introns per gene |
|---|---|---|---|---|
| C. elegans | 100 | 19,969 | 28,544 | 4.8 |
| A. thaliana | 119 | 27,445 | 40,827 | 4.0 |
| D. melanogaster | 138 | 13,951 | 22,395 | 2.8 |
| S. lycopersicum | 807 | 25,158 (15,138) | 31,911 (15,150) | 4.4 (4.3) |
| D. rerio | 1345 | 25,610 (17,893) | 42,929 (19,975) | 8.4 (8.4) |
| G. gallus | 1050 | 17,279 (10,736) | 38,534 (12,733) | 9.0 (9.2) |
| M. musculus | 2723 | 22,405 (16,531) | 58,318 (20,708) | 6.0 (8.6) |

The numbers of genes and individual CDSs in the intersections of the NCBI RefSeq and the Ensembl annotations are given in parentheses (see Methods). Annotations of the C. elegans, A. thaliana, and D. melanogaster genomes are identical between RefSeq and Ensembl; therefore, the intersection sets have the same numbers of genes and CDSs as in the RefSeq annotation.
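
To make the relation between the full annotations and the parenthesized intersection sets easier to read, the following minimal sketch derives two ratios from the values in Table 4: the fraction of genes retained in the RefSeq/Ensembl intersection, and the number of CDSs per gene in each set. The dictionary layout and the names `table4` and `summarize` are illustrative assumptions, not part of the original analysis. Only the four genomes with distinct intersection sets are included.

```python
# Values copied from Table 4: (no. of protein-coding genes, no. of CDSs).
# "full" is the reference annotation; "intersection" is the RefSeq/Ensembl
# intersection given in parentheses. The layout is a hypothetical convenience.
table4 = {
    "S. lycopersicum": {"full": (25158, 31911), "intersection": (15138, 15150)},
    "D. rerio":        {"full": (25610, 42929), "intersection": (17893, 19975)},
    "G. gallus":       {"full": (17279, 38534), "intersection": (10736, 12733)},
    "M. musculus":     {"full": (22405, 58318), "intersection": (16531, 20708)},
}


def summarize(entry):
    """Derive simple ratios from one row of the table."""
    genes, cdss = entry["full"]
    genes_i, cdss_i = entry["intersection"]
    return {
        "gene_fraction_retained": genes_i / genes,
        "cds_per_gene_full": cdss / genes,
        "cds_per_gene_intersection": cdss_i / genes_i,
    }


summary = {species: summarize(entry) for species, entry in table4.items()}

for species, s in summary.items():
    print(
        f"{species}: {s['gene_fraction_retained']:.1%} of genes retained; "
        f"CDSs per gene {s['cds_per_gene_full']:.2f} (full) vs "
        f"{s['cds_per_gene_intersection']:.2f} (intersection)"
    )
```

For all four genomes, the intersection sets contain fewer CDSs per gene than the full annotations.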