2024 Nov 23;11:1278. doi: 10.1038/s41597-024-04121-2

Table 5.

Gene annotation summary.

| Gene category | Gene biotype | CHM13 | KSA001 | Count of unmapped |
|---|---|---|---|---|
| Protein-coding | protein coding | 19,969 | 19,968 | 1 |
| Non-coding RNA | lncRNA | 17,482 | 17,482 | 0 |
| | miRNA | 2,045 | 2,223 | 20 |
| | misc RNA | 2,224 | 2,221 | 3 |
| | Mt rRNA | 3 | 3 | 0 |
| | Mt tRNA | 29 | 29 | 0 |
| | ribozyme | 8 | 8 | 0 |
| | rRNA | 1,007 | 1,007 | 0 |
| | rRNA pseudogene | 506 | 503 | 3 |
| | scaRNA | 48 | 48 | 0 |
| | scRNA | 2 | 2 | 0 |
| | snoRNA | 945 | 944 | 1 |
| | snRNA | 1,886 | 1,883 | 3 |
| | sRNA | 5 | 5 | 0 |
| | TEC | 1,341 | 1,341 | 0 |
| | vault RNA | 1 | 1 | 0 |
| Pseudogenes | pseudogene | 18 | 15 | 3 |
| | polymorphic pseudogene | 50 | 50 | 0 |
| | processed pseudogene | 10,769 | 10,764 | 5 |
| | transcribed processed pseudogene | 551 | 550 | 1 |
| | transcribed unitary pseudogene | 138 | 137 | 1 |
| | transcribed unprocessed pseudogene | 941 | 941 | 0 |
| | translated processed pseudogene | 2 | 2 | 0 |
| | translated unprocessed pseudogene | 1 | 1 | 0 |
| | unitary pseudogene | 98 | 98 | 0 |
| | unprocessed pseudogene | 2,725 | 2,723 | 2 |
| Immunoglobulin/T-cell receptor | IG C gene | 15 | 15 | 0 |
| | IG D gene | 10 | 0 | 10 |
| | IG C pseudogene | 9 | 10 | 0 |
| | IG J gene | 18 | 8 | 10 |
| | IG J pseudogene | 3 | 3 | 0 |
| | IG pseudogene | 1 | 1 | 0 |
| | IG V gene | 148 | 148 | 0 |
| | IG V pseudogene | 216 | 214 | 2 |
| | TR C gene | 7 | 7 | 0 |
| | TR J gene | 80 | 73 | 7 |
| | TR J pseudogene | 4 | 4 | 0 |
| | TR V gene | 108 | 108 | 0 |
| | TR V pseudogene | 33 | 33 | 0 |
| Unknown | StringTie | 48 | 48 | 0 |
| Total | | 63,494 | 63,421 | 73 |

Counts of CHM13 genes (without chrY) mapped by Liftoff to the KSA001 genome assembly.
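
The arithmetic behind the table can be checked row by row. The sketch below is a hypothetical Python check, not part of the published analysis. It assumes that every CHM13 gene is either mapped exactly once or reported as unmapped, so that the unmapped count should equal the CHM13 count minus the KSA001 count. That assumption may not hold if extra gene copies were annotated in KSA001. The names `ROWS` and `inconsistent_rows` are illustrative, and only a subset of rows is included.

```python
# Rows transcribed from Table 5: (biotype, CHM13, KSA001, unmapped).
# Only a subset is listed; the remaining rows can be added in the same form.
ROWS = [
    ("protein coding", 19969, 19968, 1),
    ("lncRNA", 17482, 17482, 0),
    ("miRNA", 2045, 2223, 20),
    ("misc RNA", 2224, 2221, 3),
    ("IG D gene", 10, 0, 10),
    ("IG C pseudogene", 9, 10, 0),
    ("IG J gene", 18, 8, 10),
    ("TR J gene", 80, 73, 7),
]


def inconsistent_rows(rows):
    """Return rows where CHM13 - KSA001 differs from the unmapped count.

    Assumption (not stated in the source): each CHM13 gene maps at most
    once, so the difference between the two columns equals the unmapped count.
    """
    return [row for row in rows if row[1] - row[2] != row[3]]


if __name__ == "__main__":
    for biotype, chm13, ksa001, unmapped in inconsistent_rows(ROWS):
        print(f"{biotype}: CHM13 - KSA001 = {chm13 - ksa001}, unmapped = {unmapped}")
```

With the values as they appear in this copy of the table, the miRNA and IG C pseudogene rows are flagged. Flagged rows are worth comparing against the published table and the Liftoff output before drawing any conclusion from them.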