2025 Aug 7;15:28941. doi: 10.1038/s41598-025-14813-3

Table 2.

Sequencing and assembly statistics and PROKKA annotation results. Contigs, number of contigs; genome N50, the length of the shortest contig among the longest contigs that together cover 50% of the genome; genome size, the total length of each bin; GC, guanine-cytosine (GC) content (%); total CDS, number of predicted coding sequences (CDS); matching to COGs, number of CDS classified in Clusters of Orthologous Groups (COGs); with EC number, number of CDS assigned an Enzyme Commission (EC) number; missing CDS, number of CDS not classified in COG.
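
The N50 definition in the caption can be illustrated with a minimal sketch. The function name `n50` and the example contig lengths are hypothetical; they are not taken from the paper or from any bin in Table 2, and the authors' assembly tools may compute the value differently in edge cases.

```python
def n50(contig_lengths):
    """Return the N50 of a list of contig lengths (in bp)."""
    if not contig_lengths:
        raise ValueError("need at least one contig length")
    total = sum(contig_lengths)
    running = 0
    # Walk the contigs from longest to shortest, accumulating length.
    for length in sorted(contig_lengths, reverse=True):
        running += length
        # The first contig that brings the running total to at least
        # half of the assembly length defines the N50.
        if 2 * running >= total:
            return length


# Hypothetical contig lengths (bp), for illustration only.
example = [100, 80, 60, 40, 20]
print(n50(example))  # 80: 100 + 80 = 180 covers at least half of 300
```

A higher N50 for a similar genome size indicates a less fragmented assembly, which is how the Genome N50 and Contigs rows can be read together.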

|  | BIN 1 | BIN 2 | BIN 3 | BIN 4 | BIN 5 | BIN 6 | BIN 7 |
|---|---|---|---|---|---|---|---|
| GTDB taxonomy | Chloroflexus | UBA5754 | UBA6016 | Halothiobacillaceae | Calditerrivibrionaceae | UBA9959 | Caldisericaceae |
| Abundance (%) | 1.7 | 1.4 | 1.8 | 68.1 | 0.3 | 0.7 | 0.3 |
| Contigs | 40 | 66 | 166 | 481 | 374 | 43 | 109 |
| Genome N50 (bp) | 197,509 | 44,607 | 13,484 | 4,167 | 6,534 | 75,433 | 20,298 |
| Genome size (bp) | 4,106,688 | 1,986,167 | 1,584,930 | 1,535,600 | 1,815,858 | 1,852,120 | 1,308,601 |
| GC (%) | 56 | 58 | 47 | 64 | 33 | 31 | 34 |
| Completeness (%) | 100 | 98.2 | 96.4 | 86.8 | 93.4 | 92.9 | 95.5 |
| Contamination (%) | 0 | 0.09 | 0.58 | 3.5 | 4.5 | 0 | 0 |
| Total CDS | 3,395 | 1,854 | 1,595 | 1,549 | 1,719 | 1,711 | 1,238 |
| Matching to COGs | 1,091 | 730 | 661 | 768 | 775 | 608 | 520 |
| Hypothetical protein | 1,574 | 780 | 640 | 589 | 622 | 774 | 494 |
| With EC Number | 520 | 218 | 175 | 118 | 188 | 207 | 145 |
| Missing CDS | 210 | 126 | 119 | 74 | 134 | 122 | 79 |
| Total tRNA | 50 | 50 | 41 | 38 | 41 | 56 | 37 |
| Total rRNA | 0 | 3 | 0 | 2 | 3 | 0 | 3 |
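
As a quick arithmetic cross-check of the annotation rows, the sketch below sums Matching to COGs, Hypothetical protein, With EC Number, and Missing CDS for each bin and compares the result with Total CDS. The values are transcribed from Table 2; the variable and function names are illustrative assumptions, and the check says nothing about how PROKKA itself assigns categories.

```python
# Values transcribed from Table 2, in order BIN 1 .. BIN 7.
bins = ["BIN 1", "BIN 2", "BIN 3", "BIN 4", "BIN 5", "BIN 6", "BIN 7"]
total_cds = [3395, 1854, 1595, 1549, 1719, 1711, 1238]
matching_cogs = [1091, 730, 661, 768, 775, 608, 520]
hypothetical = [1574, 780, 640, 589, 622, 774, 494]
with_ec = [520, 218, 175, 118, 188, 207, 145]
missing = [210, 126, 119, 74, 134, 122, 79]


def column_sums(*rows):
    """Sum several equal-length rows column by column (one value per bin)."""
    return [sum(column) for column in zip(*rows)]


sums = column_sums(matching_cogs, hypothetical, with_ec, missing)

# Collect any bin whose four annotation rows do not add up to Total CDS.
mismatches = [name for name, s, t in zip(bins, sums, total_cds) if s != t]
print(mismatches)  # [] : the four rows add up to Total CDS in every bin
```

The empty result is consistent with the four rows being non-overlapping categories that together account for every predicted CDS in a bin, although the table itself does not state this explicitly.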