2023 May 15;2023:gigabyte81. doi: 10.46471/gigabyte.81

Table 4.

General statistics of the two M. margaritifera genome assemblies, including read alignment, gene prediction, and annotation.

| Statistic | Genome V2 contig* | Genome V1 contig | Genome V1 scaffold | Megalonaias nervosa | Potamilus streckersoni | Venustaconcha ellipsiformis |
| --- | --- | --- | --- | --- | --- | --- |
| Total number of sequences ≥ 1,000 bp | 1,700 | 265,718 | 105,185 | 90,895 | 2,366 | 371,427 |
| Total number of sequences ≥ 10,000 bp | 1,700 | 66,019 | 15,384 | 54,764 | 2,162 | 26,952 |
| Total number of sequences ≥ 25,000 bp | 1,202 | 18,725 | 11,583 | 29,042 | 1,831 | 5,073 |
| Total number of sequences ≥ 50,000 bp | 1,570 | 4,284 | 9,265 | 12,699 | 1,641 | 1,456 |
| Total length ≥ 1,000 bp | 2.45 Gb | 2.2 Gb | 2.47 Gb | 2.36 Gb | 1.77 Gb | 1.59 Gb |
| Total length ≥ 10,000 bp | 2.45 Gb | 1.52 Gb | 2.29 Gb | 2.19 Gb | 1.77 Gb | 0.54 Gb |
| Total length ≥ 25,000 bp | 2.45 Gb | 789 Mb | 2.23 Gb | 1.76 Gb | 1.76 Gb | 0.23 Gb |
| Total length ≥ 50,000 bp | 2.44 Gb | 299 Mb | 2.15 Gb | 1.19 Gb | 1.76 Gb | 0.10 Gb |
| N50 length | 3.42 Mb | 16 kb | 288 kb | 50 kb | 2.05 Mb | 6,657 bp |
| L50 | 207 | 34,910 | 2,393 | 12,463 | 245 | 58,531 |
| Largest contig | 23 Mb | 0.209 Mb | 2.5 Mb | 0.588 Mb | 10 Mb | 313,274 bp |
| GC content (%) | 35.3 | 35.42 | 35.42 | 35.82 | 33.79 | 34.19 |
| Clean paired-end (PE) reads alignment stats | | | | | | |
| Percentage of mapped RNA-seq PE (%), average | 96.94 | - | 97.75 | - | - | |
| Percentage of mapped WGS PE (%) | 99.69 | - | 97.75 | - | - | |
| Total BUSCO for the genome assembly (%) | | | | | | |
| #Euk database | C:99.2% [S:97.6%, D:1.6%], F:0.4% | - | C:86.8% [S:85.8%, D:1.0%], F:5.9% | C:70.6% [S:70.2%, D:0.4%], F:14.9% | C:98.1% [S:97.3%, D:0.8%], F:0.8% | C:45.9% [S:45.5%, D:0.4%], F:36.9% |
| #Met database | C:96.9% [S:95.5%, D:1.4%], F:2.0% | - | C:84.9% [S:83.8%, D:1.1%], F:4.9% | C:71.5% [S:70.1%, D:1.4%], F:14.5% | C:95.0% [S:93.6%, D:1.4%], F:2.3% | C:53.7% [S:52.8%, D:0.9%], F:29.7% |
| Masking repetitive regions and gene prediction | | | | | | |
| Percentage masked bases (%) | 57.32 | - | 59.07 | 25.00 | 51.03 | 36.29 |
| Number of mRNAs | 48,314 | - | 40,544 | 49,149 | 41,065 | 41,697 |
| Protein coding genes (CDS) | 48,314 | - | 35,119 | 49,149 | 41,065 | - |
| Functionally annotated genes | 35,649 | - | 31,584 | - | - | - |
| Total gene length | 1.13 Gb | - | 902 Mb | - | - | - |
| Total BUSCO for the predicted proteins (%) | | | | | | |
| + Euk database | C:97.6% [S:83.9%, D:13.7%], F:2.0% | - | C:90.6% [S:81.2%, D:9.4%], F:3.9% | - | - | - |
| + Met database | C:98.7% [S:84.7%, D:14.0%], F:0.8% | - | C:92.6% [S:82.3%, D:10.3%], F:3.2% | - | - | - |

*Genome V2 refers to the new assembly produced here, which is solely at the contig level (i.e., it has no scaffolds); Genome V1 refers to the first M. margaritifera genome [13]. #Euk: out of a total of 303 genes in the Eukaryota library profile; #Met: out of a total of 978 genes in the Metazoa library profile; + Euk: out of a total of 255 genes in the Eukaryota library profile; + Met: out of a total of 954 genes in the Metazoa library profile. #,+ C: Complete; S: Single-copy; D: Duplicated; F: Fragmented.
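
For readers less familiar with the contiguity and BUSCO rows, the following is a minimal Python sketch of how such figures are conventionally derived. It is not the pipeline used for this table. The function names `n50_l50` and `parse_busco` are hypothetical, the contig lengths in the usage note are toy values, and the "M" (missing) entry is an assumption based on the usual BUSCO convention that complete, fragmented, and missing percentages sum to 100; the table itself does not report it.

```python
import re


def n50_l50(lengths):
    """Return (N50, L50) for a list of sequence lengths in bp.

    N50 is the length of the shortest sequence in the smallest set of
    longest sequences that together cover at least half of the assembly;
    L50 is the number of sequences in that set.
    """
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= half:
            return length, count
    raise ValueError("lengths must be non-empty")


def parse_busco(summary):
    """Parse a summary such as 'C:99.2% [S:97.6%, D:1.6%], F:0.4%'.

    Returns a dict of percentages keyed by C, S, D, F, plus an assumed
    'M' (missing) value computed as 100 - C - F. Because the published
    percentages are rounded, 'M' is approximate.
    """
    values = {key: float(val)
              for key, val in re.findall(r"([CSDF]):([\d.]+)%", summary)}
    values["M"] = round(100 - values["C"] - values["F"], 1)
    return values
```

For example, `n50_l50([10, 8, 6, 4, 2])` returns `(8, 2)`: the two longest toy sequences (10 + 8 = 18) are the first to reach half of the total of 30. Applied to the Genome V2 #Euk entry, `parse_busco("C:99.2% [S:97.6%, D:1.6%], F:0.4%")` yields C = 99.2, with S + D adding up to C, and an implied missing fraction of 0.4.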