Table 3.
Attribute | Value | % |
---|---|---|
Genome size (bp) a | 3,088,407 | 100 |
DNA coding (bp) | 2,621,999 | 84.9 |
DNA G + C (bp) | 1,164,329 | 37.7 |
DNA scaffolds | 1281 | 100 |
Total genes | 3097 | 100 |
Protein-coding genes | 3045 | 98.3 |
RNA genes | 46 | 1.5 |
Pseudo genes | 6 | 0.2 |
Genes in internal clusters | - | - |
Genes with function prediction b | 2051 | 67.4 |
Genes assigned to COGs | 1659 | 54.5 |
Genes with Pfam domains | 1984 | 65.2 |
Genes with signal peptides c | 337 | 11.1 |
Genes with transmembrane helices | 626 | 20.6 |
CRISPR repeats | 10 | - |
a All 1281 scaffolds are >200 bp. Of these, 478 (37.3%) are >1000 bp and together comprise 2,726,561 bp (88.3% of all base pairs).
b Genes with function prediction are all 3045 protein-coding genes minus the 994 genes annotated as “hypothetical proteins” that have no COG category or fall into the COG categories “unknown function” or “general function prediction only”, and that have no Pfam domain or only a Pfam “domain of unknown function”.
c Includes genes for which a signal peptide was predicted by at least two of the three tools used. Percentages of genes with function prediction, COGs, Pfam domains, signal peptides and transmembrane helices were calculated against a total of 3045 protein-coding genes.
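
The arithmetic described in the footnotes can be reproduced with a short script. This is only a minimal sketch for checking the table, not the annotation pipeline itself. All counts are copied from Table 3 and its footnotes. The helper name `pct` and the variable names are illustrative assumptions.

```python
# Counts copied from Table 3 and its footnotes; names are illustrative only.
GENOME_SIZE_BP = 3_088_407
SCAFFOLDS_TOTAL = 1281
SCAFFOLDS_OVER_1KB = 478
BP_IN_SCAFFOLDS_OVER_1KB = 2_726_561
PROTEIN_CODING = 3045
HYPOTHETICAL_WITHOUT_FUNCTION = 994


def pct(part, whole):
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


# Footnote b: genes with function prediction are the protein-coding genes
# minus the hypothetical proteins lacking informative COG/Pfam assignments.
function_prediction = PROTEIN_CODING - HYPOTHETICAL_WITHOUT_FUNCTION

# Footnote c: these rows use the protein-coding gene count as denominator.
gene_counts = {
    "Genes with function prediction": function_prediction,
    "Genes assigned to COGs": 1659,
    "Genes with Pfam domains": 1984,
    "Genes with signal peptides": 337,
    "Genes with transmembrane helices": 626,
}
percentages = {name: pct(n, PROTEIN_CODING) for name, n in gene_counts.items()}

# Footnote a: share of scaffolds >1000 bp, and share of base pairs they hold.
scaffold_share = pct(SCAFFOLDS_OVER_1KB, SCAFFOLDS_TOTAL)
base_pair_share = pct(BP_IN_SCAFFOLDS_OVER_1KB, GENOME_SIZE_BP)

for name, value in percentages.items():
    print(f"{name}: {gene_counts[name]} ({value}%)")
print(f"Scaffolds >1000 bp: {scaffold_share}% of scaffolds, "
      f"{base_pair_share}% of base pairs")
```

The computed values match those reported in the table, which confirms that the functional annotation rows use 3045 protein-coding genes, not 3097 total genes, as the denominator.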