Table 2. Statistics of RLFS-positive genes according to the Ensembl gene annotation system, and the ratio of RLFS-positive genes to the total number of genes for each gene category in each organism.
Organism | Protein coding genes* (total) | Protein coding genes* (RLFS positive) | Protein coding genes* (ratio) | Pseudogenes* (total) | Pseudogenes* (RLFS positive) | Pseudogenes* (ratio) | Long noncoding RNA genes* (total) | Long noncoding RNA genes* (RLFS positive) | Long noncoding RNA genes* (ratio) | Short noncoding RNA genes* (total) | Short noncoding RNA genes* (RLFS positive) | Short noncoding RNA genes* (ratio) |
---|---|---|---|---|---|---|---|---|---|---|---|---|
human | 23 431 | 17 798 | 0.76 | 15 934 | 4 433 | 0.28 | 14 966 | 9 028 | 0.60 | 9 771 | 2 296 | 0.23 |
mouse | 22 835 | 18 201 | 0.80 | 6 129 | 2 161 | 0.35 | 4 291 | 3 283 | 0.77 | 5 924 | 1 950 | 0.33 |
rat | 22 771 | 17 529 | 0.77 | 1 840 | 662 | 0.36 | 79 | 65 | 0.82 | 1 714 | 608 | 0.35 |
chimpanzee | 18 759 | 15 462 | 0.82 | 572 | 213 | 0.37 | 0 | 0 | n/a | 8 681 | 1 870 | 0.22 |
chicken | 15 508 | 12 007 | 0.77 | 42 | 26 | 0.62 | 0 | 0 | n/a | 1 558 | 546 | 0.35 |
frog | 18 442 | 9 932 | 0.54 | 173 | 44 | 0.25 | 0 | 0 | n/a | 1 306 | 268 | 0.21 |
fruit_fly | 13 863 | 2 423 | 0.17 | 200 | 11 | 0.06 | 830 | 104 | 0.13 | 789 | 30 | 0.04 |
yeast | 6 692 | 102 | 0.02 | 21 | 4 | 0.19 | 15 | 0 | n/a | 398 | 0 | n/a |
All organisms | 142 301 | 93 454 | 0.66 | 24 911 | 7 554 | 0.30 | 20 181 | 12 480 | 0.62 | 30 141 | 7 568 | 0.25 |
*The Ensembl gene biotypes were grouped into protein coding genes, pseudogenes, long non-coding RNA genes and short non-coding RNA genes (available at http://asia.ensembl.org/Help/Faq?id=468). Examples of gene biotypes in each group are as follows:
- Protein coding genes: IGC gene, IGD gene, IG gene, IGJ gene, IGLV gene, IGM gene, IGV gene, IGZ gene, nonsense mediated decay, non-translating CDS, non-stop decay, polymorphic pseudogene, TRC gene, TRD gene, TRJ gene.
- Pseudogenes: disrupted domain, IGC pseudogene, IGJ pseudogene, IG pseudogene, IGV pseudogene, processed pseudogene, transcribed processed pseudogene, transcribed unitary pseudogene, transcribed unprocessed pseudogene, translated processed pseudogene, TRJ pseudogene, unprocessed pseudogene.
- Long non-coding RNA genes: 3-prime overlapping ncRNA, ambiguous ORF, antisense, antisense RNA, lincRNA, ncRNA host, processed transcript, sense intronic, sense overlapping.
- Short non-coding RNA genes: miRNA, miRNA pseudogene, miscRNA, miscRNA pseudogene, Mt rRNA, Mt tRNA, rRNA, scRNA, snlRNA, snoRNA, snRNA, tRNA, tRNA pseudogene.
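
The Ratio columns of Table 2 appear to be the RLFS-positive count divided by the total count, rounded to two decimals, and the "All organisms" row appears to pool the counts across organisms rather than average the per-organism ratios. The following is a minimal Python sketch of that arithmetic for the protein coding column only. The names `PROTEIN_CODING`, `rlfs_ratio` and `pooled_ratio` are hypothetical and introduced here for illustration, and returning `None` for a zero total is an assumption of this sketch (the table writes n/a in such cells).

```python
# Protein coding gene counts copied from Table 2: (total, RLFS positive).
PROTEIN_CODING = {
    "human": (23431, 17798),
    "mouse": (22835, 18201),
    "rat": (22771, 17529),
    "chimpanzee": (18759, 15462),
    "chicken": (15508, 12007),
    "frog": (18442, 9932),
    "fruit_fly": (13863, 2423),
    "yeast": (6692, 102),
}


def rlfs_ratio(total, positive):
    """Return positive / total rounded to two decimals.

    Returns None when total is 0 (assumption: the ratio is undefined there).
    """
    if total == 0:
        return None
    return round(positive / total, 2)


def pooled_ratio(counts):
    """Sum totals and positives over all organisms, then take one ratio."""
    total = sum(t for t, _ in counts.values())
    positive = sum(p for _, p in counts.values())
    return total, positive, rlfs_ratio(total, positive)


if __name__ == "__main__":
    for organism, (total, positive) in PROTEIN_CODING.items():
        print(organism, rlfs_ratio(total, positive))
    print("All organisms", pooled_ratio(PROTEIN_CODING))
```

Pooling weights each organism by its gene count, so organisms with large annotated gene sets dominate the "All organisms" ratio.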