Table 2.
Sequencing and assembly statistics, and accession numbers
Bio projects and vouchers | CCGP NCBI BioProject | PRJNA720569
Genera NCBI BioProject | PRJNA763234
Species NCBI BioProject | PRJNA782591
NCBI BioSample | SAMN21436765
Specimen identification | Photo voucher MWFB Acc 2021-49
NCBI Genome accessions | Primary | Alternate
Assembly accession | GCA_022086475.1 | GCA_022086895.1
Genome sequences | JAJLPC000000000 | JAJLPC000000000
Genome sequence | PacBio HiFi reads | Run | 3 PACBIO_SMRT (Sequel II) runs: 5.5M spots, 80.5G bases, 55.6Gb
Accession | SRR17460090
Hi-C Illumina reads | Run | 2 Illumina HiSeq X Ten runs: 738.2M spots, 222.9G bases, 74.4Gb
Accession | SRX13631283
Genome Assembly Quality Metrics | Assembly identifier (quality code^a) | rActMar1 (6.C.Q66)
HiFi read coverage^b | 30X
Metric | Primary | Alternate
Number of contigs | 198 | 4308
Contig N50 (bp) | 75 081 387 | 2 165 034
Longest contig (bp) | 223 757 816 | 15 093 301
Number of scaffolds | 49 | 2508
Scaffold N50 (bp) | 146 229 595 | 18 355 815
Largest scaffold (bp) | 361 952 230 | 118 325 371
Size of final assembly (bp) | 2 319 354 532 | 2 209 870 670
Gaps per Gbp | 54 | 818
Indel QV (frame shift) | 46.59 | 46.59
Base pair QV | 66.58 | 65.45 | Full assembly = 66.00
k-mer completeness | 92.881 | 88.25 | Full assembly = 99.01
BUSCO completeness (vertebrata), n = 3354
Assembly | C | S | D | F | M
P^c | 96.70% | 95.80% | 0.90% | 0.90% | 2.40%
A^c | 91.20% | 89.90% | 1.30% | 1.50% | 7.30%
Organelles | 1 complete mitochondrial sequence | CM039065.1
^a Assembly quality code x.y.Q, derived from the notation of Rhie et al. (2020): x = log10[contig NG50]; y = log10[scaffold NG50]; Q = Phred base accuracy QV (quality value); C = chromosome level. BUSCO scores: (C)omplete, split into complete and (S)ingle-copy and complete and (D)uplicated; (F)ragmented; and (M)issing BUSCO genes. n, number of BUSCO genes in the set/database. bp, base pairs.
^b Read coverage was calculated based on a genome size of 2.6 Gb.
^c (P)rimary and (A)lternate assembly values.
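
The relationships behind several table entries can be checked with a few lines of arithmetic. The sketch below is illustrative only and is not the pipeline used to produce the table. The function names (`read_coverage`, `phred_to_error`, `busco_consistent`) are hypothetical, the coverage formula (total HiFi bases divided by genome size) is an assumption about how the 30X figure was obtained, and the Phred conversion uses the standard definition QV = -10 log10(error probability).

```python
def read_coverage(total_bases, genome_size_bp):
    """Approximate fold coverage: sequenced bases divided by genome size (assumed formula)."""
    return total_bases / genome_size_bp


def phred_to_error(qv):
    """Convert a Phred-scaled quality value to a per-base error probability."""
    return 10 ** (-qv / 10)


def busco_consistent(c, s, d, f, m, tol=0.11):
    """Check that C = S + D and C + F + M = 100%, allowing for rounding to 0.1%."""
    return abs(c - (s + d)) <= tol and abs(c + f + m - 100.0) <= tol


# Values taken from the table: 80.5G HiFi bases, 2.6 Gb genome size.
coverage = read_coverage(80.5e9, 2.6e9)

# Base pair QV of the primary assembly.
primary_error = phred_to_error(66.58)

# BUSCO percentages for the primary (P) and alternate (A) assemblies.
primary_busco_ok = busco_consistent(96.70, 95.80, 0.90, 0.90, 2.40)
alternate_busco_ok = busco_consistent(91.20, 89.90, 1.30, 1.50, 7.30)

print(f"HiFi coverage: {coverage:.1f}x")
print(f"Primary per-base error probability: {primary_error:.2e}")
print(f"BUSCO rows consistent: P={primary_busco_ok}, A={alternate_busco_ok}")
```

Under these assumptions the coverage comes out at about 31x, close to the 30X reported, and a base pair QV of 66.58 corresponds to roughly one expected error per 4.5 million bases.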