Table 4.
Statistic | Genome V2 contig* | Genome V1 contig | Genome V1 scaffold | Megalonaias nervosa | Potamilus streckersoni | Venustaconcha ellipsiformis | |
---|---|---|---|---|---|---|---|
Total number of sequences ≥ 1,000 bp | 1,700 | 265,718 | 105,185 | 90,895 | 2,366 | 371,427 | |
Total number of sequences ≥ 10,000 bp | 1,700 | 66,019 | 15,384 | 54,764 | 2,162 | 26,952 | |
Total number of sequences ≥ 25,000 bp | 1,202 | 18,725 | 11,583 | 29,042 | 1,831 | 5,073 | |
Total number of sequences ≥ 50,000 bp | 1,570 | 4,284 | 9,265 | 12,699 | 1,641 | 1,456 | |
Total length ≥ 1,000 bp | 2.45 Gb | 2.2 Gb | 2.47 Gb | 2.36 Gb | 1.77 Gb | 1.59 Gb | |
Total length ≥ 10,000 bp | 2.45 Gb | 1.52 Gb | 2.29 Gb | 2.19 Gb | 1.77 Gb | 0.54 Gb | |
Total length ≥ 25,000 bp | 2.45 Gb | 789 Mb | 2.23 Gb | 1.76 Gb | 1.76 Gb | 0.23 Gb | |
Total length ≥ 50,000 bp | 2.44 Gb | 299 Mb | 2.15 Gb | 1.19 Gb | 1.76 Gb | 0.10 Gb | |
N50 length | 3.42 Mb | 16 kb | 288 kb | 50 kb | 2.05 Mb | 6,657 bp | |
L50 | 207 | 34,910 | 2,393 | 12,463 | 245 | 58,531 | |
Largest contig | 23 Mb | 0.209 Mb | 2.5 Mb | 0.588 Mb | 10 Mb | 313,274 bp | |
GC content, % | 35.3 | 35.42 | 35.42 | 35.82 | 33.79 | 34.19 | |
Clean paired-end (PE) Reads Alignment Stats | |||||||
Percentage of Mapped RNA-seq PE (%) | Average 96.94 | - | 97.75 | - | - | - | |
Percentage of Mapped WGS PE (%) | 99.69 | - | 97.75 | - | - | - | |
Total BUSCO for the genome assembly (%) | |||||||
#Euk database | C:99.2% [S:97.6%, D:1.6%], F:0.4% | - | C:86.8% [S:85.8%, D:1.0%], F:5.9% | C:70.6% [S:70.2%, D:0.4%], F:14.9% | C:98.1% [S:97.3%, D:0.8%], F:0.8% | C:45.9% [S:45.5%, D:0.4%], F:36.9% | |
#Met database | C:96.9% [S:95.5%, D:1.4%], F:2.0% | - | C:84.9% [S:83.8%, D:1.1%], F:4.9% | C:71.5% [S:70.1%, D:1.4%], F:14.5% | C:95.0% [S:93.6%, D:1.4%], F:2.3% | C:53.7% [S:52.8%, D:0.9%], F:29.7% | |
Masking Repetitive Regions and Gene Prediction | |||||||
Percentage masked bases (%) | 57.32 | - | 59.07 | 25.00 | 51.03 | 36.29 | |
Number of mRNAs | 48,314 | - | 40,544 | 49,149 | 41,065 | 41,697 | |
Protein coding genes (CDS) | 48,314 | - | 35,119 | 49,149 | 41,065 | - | |
Functionally annotated genes | 35,649 | - | 31,584 | - | - | - | |
Total gene length | 1.13 Gb | - | 902 Mb | - | - | - | |
Total BUSCO for the predicted proteins (%) | |||||||
+ Euk database | C:97.6% [S:83.9%, D:13.7%], F:2.0% | - | C:90.6% [S:81.2%, D:9.4%], F:3.9% | - | - | - | |
+ Met database | C:98.7% [S:84.7%, D:14.0%], F:0.8% | - | C:92.6% [S:82.3%, D:10.3%], F:3.2% | - | - | - | |
*Genome V2 refers to the new assembly produced here and is solely at the contig level, i.e., it has no scaffolds; Genome V1 refers to the first M. margaritifera genome [13]; #Euk: out of a total of 303 genes in the Eukaryota library profile; #Met: out of a total of 978 genes in the Metazoa library profile; + Euk: out of a total of 255 genes in the Eukaryota library profile; + Met: out of a total of 954 genes in the Metazoa library profile; #,+ C: Complete; S: Single-copy; D: Duplicated; F: Fragmented.
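
For readers unfamiliar with the contiguity metrics in Table 4, the following is a minimal sketch of how N50 and L50 are conventionally derived from a list of sequence lengths. The function name `n50_l50` and the toy lengths are illustrative assumptions, not part of the pipeline used for these assemblies, and the values in the table were not produced with this code.

```python
def n50_l50(lengths):
    """Return (N50, L50) for an iterable of sequence lengths in bp.

    N50 is the length of the shortest sequence in the smallest set of
    longest sequences that together cover at least half of the total
    assembly length; L50 is the number of sequences in that set.
    """
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("no sequence lengths given")
    half_total = sum(ordered) / 2
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= half_total:
            return length, count


# Toy example (hypothetical lengths, not taken from any assembly in Table 4).
toy_lengths = [80, 70, 50, 40, 30, 20, 10]
n50, l50 = n50_l50(toy_lengths)
print(f"N50 = {n50} bp, L50 = {l50}")
```

Under this definition, a higher N50 together with a lower L50 indicates a more contiguous assembly, which is how the Genome V2 and Genome V1 columns can be compared.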