2024 Jan 20;115(2):212–220. doi: 10.1093/jhered/esae003

Table 2.

Sequencing and assembly statistics, and accession numbers.

BioProjects and vouchers VGP NCBI BioProject PRJNA489243
Species NCBI BioProject PRJNA970804
NCBI BioSample SAMN33212336
NCBI Genome accessions Haplotype 1 Haplotype 2
Assembly accession GCA_030035585.1 GCA_030020955.1
Genome sequences JASCZL000000000 JASCZM000000000
Genome sequence PacBio HiFi reads Run 3 PACBIO_SMRT (Sequel II) runs: 6.5 million reads, 102 Gbases
Omni-C Illumina reads Run 2 ILLUMINA (Illumina NovaSeq 6000) runs: 457.5 million reads, 138.2 Gb
Assembly identifier (quality code) [a] mDugDug1 1(8.8.P8.Q70.C99)
HiFi read coverage [b] 32.0X
Genome Assembly Quality Metrics Haplotype 1 Haplotype 2
Number of contigs 294 256
Contig N50 (bp) 57,632,671 57,883,746
Contig NG50 (bp) 57,632,671 57,883,746
Longest contigs 162,184,114 209,448,431
Number of scaffolds 198 167
Scaffold N50 (bp) 177,379,183 138,031,769
Scaffold NG50 (bp) 177,379,183 138,031,769
Largest scaffold 267,865,978 230,272,189
Size of final assembly (bp) 3,159,179,246 3,154,861,630
Phased block NG50 (bp) 57,632,671 57,883,746
Gaps per Gbp (# Gaps) 25 (79) 28 (88)
Indel QV (frameshift) 41.52 42.16
Base pair QV 70.4553 70.3254
Full assembly = 70.3899
K-mer completeness 97.9001 97.8847
Full assembly = 99.7025
BUSCO completeness (vertebrata), n = 3354 C [c] S [c] D [c] F [c] M [c]
Vertebrata n = 3354 H1 [d] 97.9% 95.9% 2.0% 1.0% 1.1%
H2 [d] 97.8% 95.7% 2.1% 1.1% 1.1%
Mammalia n = 9226 H1 [d] 96.2% 95.3% 0.9% 0.8% 3.0%
H2 [d] 96.1% 95.2% 0.9% 0.8% 3.1%
Organelles 1 complete mitochondrial sequence (pending NCBI accession code)

[a] Assembly quality code in x·y·P·Q·C notation, derived from Rhie et al. (2021): x = log10[contig NG50]; y = log10[scaffold NG50]; P = log10[phased block NG50]; Q = Phred base accuracy QV (quality value); C = % of the genome represented by the first “n” scaffolds, following a karyotype of 2n = 48 inferred from the ancestral taxon Trichechus manatus (Noronha et al. 2022).
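
The quality code in the table can be reproduced from the Haplotype 1 metrics with a few lines of Python. This is a minimal sketch, not the authors' pipeline: the function name `quality_code` is hypothetical, rounding the log10 terms to the nearest integer is an assumption (truncating would give 7 for the contig term, which would not match the reported code), and the C value of 99 is taken from the code itself because the table does not list the underlying percentage.

```python
import math

def quality_code(contig_ng50, scaffold_ng50, phased_ng50, base_qv, pct_in_scaffolds):
    """Format an x.y.P.Q.C quality code string.

    Assumption: log10 terms are rounded to the nearest integer, while
    QV and C are truncated; this reproduces the code given in the table.
    """
    x = round(math.log10(contig_ng50))
    y = round(math.log10(scaffold_ng50))
    p = round(math.log10(phased_ng50))
    q = math.floor(base_qv)
    c = math.floor(pct_in_scaffolds)
    return f"{x}.{y}.P{p}.Q{q}.C{c}"

# Haplotype 1 values from the table; 99.0 is a placeholder read off the
# reported code, since the table does not give the percentage directly.
code = quality_code(57_632_671, 177_379_183, 57_632_671, 70.3899, 99.0)
print(code)  # 8.8.P8.Q70.C99
```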

[b] Read coverage and NGx statistics were calculated from the estimated genome size of 3.16 Gbp.
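
To illustrate how NGx differs from Nx, the sketch below computes both on a toy set of sequence lengths. NGx measures the running total against an estimated genome size, whereas Nx measures it against the assembly's own total length. The function name `ngx` and the toy lengths are illustrative assumptions, not data from the paper. The last lines divide the rounded HiFi yield by the estimated genome size, which gives roughly 32.3X, close to the reported 32.0X; the table does not explain the small difference.

```python
def ngx(lengths, genome_size, x=50):
    """Return the length of the sequence at which the running total,
    taken from longest to shortest, first reaches x% of genome_size."""
    threshold = genome_size * x / 100
    total = 0
    for length in sorted(lengths, reverse=True):
        total += length
        if total >= threshold:
            return length
    return 0  # the assembly is too short to cover x% of the genome

# Toy example (not real data): the assembly totals 150, the genome estimate is 200.
toy = [50, 40, 30, 20, 10]
ng50 = ngx(toy, genome_size=200)      # threshold 100, reached at the 30-long sequence
n50 = ngx(toy, genome_size=sum(toy))  # threshold 75, reached at the 40-long sequence
print(ng50, n50)  # 30 40

# Coverage from the rounded figures in the table: 102 Gbases over 3.16 Gbp.
coverage = 102e9 / 3.16e9
print(round(coverage, 1))  # 32.3
```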

[c] Complete BUSCOs (C), Complete and single-copy BUSCOs (S), Complete and duplicated BUSCOs (D), Fragmented BUSCOs (F), Missing BUSCOs (M).
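
The BUSCO categories are related by two identities: C = S + D, and C + F + M = 100%. A short check, sketched here with the percentages from the table, confirms that the reported values are internally consistent. The helper name `is_consistent` and the 0.15-point tolerance (to absorb rounding to one decimal place) are assumptions made for this illustration.

```python
# (C, S, D, F, M) percentages copied from the table.
busco = {
    ("Vertebrata", "H1"): (97.9, 95.9, 2.0, 1.0, 1.1),
    ("Vertebrata", "H2"): (97.8, 95.7, 2.1, 1.1, 1.1),
    ("Mammalia", "H1"): (96.2, 95.3, 0.9, 0.8, 3.0),
    ("Mammalia", "H2"): (96.1, 95.2, 0.9, 0.8, 3.1),
}

def is_consistent(c, s, d, f, m, tol=0.15):
    """Check C = S + D and C + F + M = 100 within a rounding tolerance."""
    return abs(c - (s + d)) <= tol and abs(c + f + m - 100.0) <= tol

results = {key: is_consistent(*values) for key, values in busco.items()}
for (lineage, haplotype), ok in results.items():
    print(lineage, haplotype, "consistent" if ok else "inconsistent")
```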

[d] (H1) Haplotype 1 and (H2) Haplotype 2 assembly values.