2014 Jun 16;23(22):5866–5878. doi: 10.1093/hmg/ddu309

Table 1.

The 19 features used to select the potential non-coding set

Feature [source]                  Genes   Peptide detection (%)
Homology existence [UP]             131   6.87
Pseudogene [E]                       75   6.67
PUTATIVE transcripts [G]            434   2.53
Caution—pseudogene [UP]              79   2.53
Caution—dubious CDS [UP]             47   2.13
Poor conservation (MI score) [A]    987   2.03
Predicted existence [UP]            507   1.58
No protein features [A]            1212   1.32
Nonsense-mediated decay [G]          78   1.28
Circular annotation [E/UP]          336   1.19
Uncertain existence [UP]            100   1.00
Primate gene family [E]             563   0.89
Read-through [E/G]                  229   0.87
Obsolete [E/UP]                     130   0.00
Dubious EST support [E/G]            98   0.00
Non-functional [E]                   44   0.00
Non-coding [E]                       38   0.00
Antisense/opposite strand [E]        25   0.00
Miscellaneous RNA [E]                 7   0.00

Each feature is explained in more detail in the Materials and methods and Supplementary sections. The source of each feature is indicated in square brackets (A, APPRIS; E, Ensembl; G, GENCODE; UP, UniProt). For each feature, we also show the number of genes carrying the feature and the percentage of those genes for which we detect peptides in the seven datasets.
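The percentage column can be converted back into approximate gene counts, which makes the small numbers behind some rows easier to see. The sketch below is illustrative only: the function names `detected_genes` and `detected_counts` are hypothetical, and the counts are inferred by rounding the published two-decimal percentages, not taken from the source. If a gene can carry more than one feature, the per-feature counts should not be summed.

```python
# Rows copied from Table 1: (feature, number of genes, peptide detection %).
TABLE_1 = [
    ("Homology existence [UP]", 131, 6.87),
    ("Pseudogene [E]", 75, 6.67),
    ("PUTATIVE transcripts [G]", 434, 2.53),
    ("Caution—pseudogene [UP]", 79, 2.53),
    ("Caution—dubious CDS [UP]", 47, 2.13),
    ("Poor conservation (MI score) [A]", 987, 2.03),
    ("Predicted existence [UP]", 507, 1.58),
    ("No protein features [A]", 1212, 1.32),
    ("Nonsense-mediated decay [G]", 78, 1.28),
    ("Circular annotation [E/UP]", 336, 1.19),
    ("Uncertain existence [UP]", 100, 1.00),
    ("Primate gene family [E]", 563, 0.89),
    ("Read-through [E/G]", 229, 0.87),
    ("Obsolete [E/UP]", 130, 0.00),
    ("Dubious EST support [E/G]", 98, 0.00),
    ("Non-functional [E]", 44, 0.00),
    ("Non-coding [E]", 38, 0.00),
    ("Antisense/opposite strand [E]", 25, 0.00),
    ("Miscellaneous RNA [E]", 7, 0.00),
]


def detected_genes(genes: int, pct: float) -> int:
    """Approximate number of genes with detected peptides.

    Assumption: the percentage is (detected genes / genes) * 100,
    rounded to two decimals, so rounding recovers an integer count.
    """
    return round(genes * pct / 100)


def detected_counts(rows):
    """Map each feature name to its inferred detected-gene count."""
    return {name: detected_genes(genes, pct) for name, genes, pct in rows}


if __name__ == "__main__":
    for name, count in detected_counts(TABLE_1).items():
        print(f"{name}: ~{count} genes with peptide evidence")
```

Under this reading, the highest detection rate in the table (6.87% for homology existence) corresponds to roughly 9 of 131 genes.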