2020 Dec 30;13(1):49. doi: 10.3390/v13010049

Table 1.

This table presents the distribution of the number of predicted genes for each dataset. Bat-CoVs exhibit the widest range of gene counts (2 to 12), and pangolin-CoVs have the highest maximum, with one genome having 17 predicted genes. These outliers have low sequence or assembly quality. In the pangolin-CoV genome reporting 17 genes, low-quality (“NNNN”) nucleotide regions span the centre of genes, which causes PROKKA to identify the two ends of one gene as separate genes. The median gene count differs only in bat-CoVs (9, versus 11 in the other datasets), likely owing to the large phylogenetic variation across bat-CoVs.

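| Dataset  | Min. | Median | Mean | Max. | Sample Count |
|----------|------|--------|------|------|--------------|
| Wuhan    | 7    | 11     | 11   | 13   | 46           |
| Charite  | 9    | 11     | 11   | 12   | 117          |
| Bat      | 2    | 9      | 9    | 12   | 215          |
| Pangolin | 10   | 11     | 12   | 17   | 7            |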
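
The split-gene artefact described in the caption could be screened for by locating runs of “N” in a genome and checking which annotated gene spans they fall inside. The following is a minimal Python sketch of that idea, not the authors' pipeline: the function names, the four-base threshold, and the toy sequence and gene coordinates are all assumptions made for illustration.

```python
import re


def find_n_runs(sequence, min_length=4):
    """Return half-open (start, end) intervals for runs of N.

    min_length=4 is an assumed threshold, chosen only to mirror the
    "NNNN" notation in the caption.
    """
    pattern = re.compile(r"[Nn]{%d,}" % min_length)
    return [(m.start(), m.end()) for m in pattern.finditer(sequence)]


def genes_interrupted(genes, n_runs):
    """Return the names of genes whose span overlaps any N run.

    genes maps a gene name to a half-open (start, end) interval.
    """
    hits = []
    for name, (g_start, g_end) in genes.items():
        overlaps = any(
            run_start < g_end and g_start < run_end
            for run_start, run_end in n_runs
        )
        if overlaps:
            hits.append(name)
    return hits


# Toy data (hypothetical): gene_a contains a six-base N run, gene_b does not.
toy_sequence = "ATGGCT" + "N" * 6 + "GCTTAA" + "ATGAAATAG"
toy_genes = {"gene_a": (0, 18), "gene_b": (18, 27)}

runs = find_n_runs(toy_sequence)
print(runs)
print(genes_interrupted(toy_genes, runs))
```

Genomes in which an N run falls inside an expected gene span are candidates for an inflated gene count, because an annotator may report the intact sequence on either side of the gap as a separate gene.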