Table 2.
Sequencing and assembly statistics, and accession numbers
| Bio projects and vouchers | CCGP NCBI BioProject | PRJNA720569 |
| Genera NCBI BioProject | PRJNA763234 |
| Species NCBI BioProject | PRJNA782591 |
| NCBI BioSample | SAMN21436765 |
| Specimen identification | Photo voucher MWFB Acc 2021-49 |
| NCBI Genome accessions | Primary | Alternate |
| Assembly accession | GCA_022086475.1 | GCA_022086895.1 |
| Genome sequences | JAJLPC000000000 | JAJLPC000000000 |
| Genome sequence | PacBio HiFi reads | Run | 3 PACBIO_SMRT (Sequel II) runs: 5.5M spots, 80.5G bases, 55.6Gb |
| Accession | SRR17460090 |
| Hi-C Illumina reads | Run | 2 Illumina HiSeq X Ten runs: 738.2M spots, 222.9G bases, 74.4Gb |
| Accession | SRX13631283 |
| Genome Assembly Quality Metrics | Assembly identifier (quality codeᵃ) | rActMar1 (6.C.Q66) |
| HiFi read coverageᵇ | 30X |
| Primary | Alternate |
| Number of contigs | 198 | 4308 |
| Contig N50 (bp) | 75 081 387 | 2 165 034 |
| Longest contig (bp) | 223 757 816 | 15 093 301 |
| Number of scaffolds | 49 | 2508 |
| Scaffold N50 (bp) | 146 229 595 | 18 355 815 |
| Largest scaffold (bp) | 361 952 230 | 118 325 371 |
| Size of final assembly (bp) | 2 319 354 532 | 2 209 870 670 |
| Gaps per Gbp | 54 | 818 |
| Indel QV (Frame shift) | 46.59 | 46.59 |
| Base pair QV | 66.58 | 65.45 |
| Full assembly = 66.00 |
| k-mer completeness | 92.881 | 88.25 |
| Full assembly = 99.01 |
| BUSCO completeness (vertebrata) n = 3354 | C | S | D | F | M |
| Pᶜ | 96.70% | 95.80% | 0.90% | 0.90% | 2.40% |
| Aᶜ | 91.20% | 89.90% | 1.30% | 1.50% | 7.30% |
| Organelles | 1 Complete mitochondrial sequence | CM039065.1 |
ᵃ Assembly quality code in x.y.Q notation, derived from Rhie et al. (2020): x = log10[contig NG50]; y = log10[scaffold NG50]; Q = Phred base accuracy QV (quality value). C (in the quality code) = chromosome level. BUSCO scores: (C)omplete; complete and (S)ingle-copy; complete and (D)uplicated; (F)ragmented; and (M)issing BUSCO genes. n, number of BUSCO genes in the set/database. bp, base pairs.
ᵇ Read coverage was calculated based on a genome size of 2.6 Gb.
ᶜ (P)rimary and (A)lternate assembly values.
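
The footnote definitions can be illustrated with a short Python sketch. It is not the pipeline used to produce this table: the function names are hypothetical, the sequence lengths are toy values, and taking the integer part of each logarithm and of the QV when building the code string is an assumption. NG50 differs from N50 only in that the halfway point is taken from the estimated genome size rather than from the assembly size.

```python
import math


def ng50(lengths, genome_size):
    """Length of the sequence at which the running total, counted from the
    longest sequence down, first reaches half of the estimated genome size."""
    running_total = 0
    for length in sorted(lengths, reverse=True):
        running_total += length
        if running_total >= genome_size / 2:
            return length
    return 0  # the sequences cover less than half of the genome


def quality_code(contig_ng50, scaffold_ng50, qv, chromosome_level=False):
    """Build an x.y.Q string; the rounding down is an assumption of this sketch."""
    x = math.floor(math.log10(contig_ng50))
    # "C" replaces the scaffold exponent for chromosome-level assemblies
    y = "C" if chromosome_level else str(math.floor(math.log10(scaffold_ng50)))
    return f"{x}.{y}.Q{math.floor(qv)}"


def read_coverage(total_bases, genome_size):
    """Average read depth: sequenced bases divided by the genome size."""
    return total_bases / genome_size


# Toy lengths: N50 uses the assembly size (150), NG50 an assumed genome size (200)
toy_lengths = [50, 40, 30, 20, 10]
print(ng50(toy_lengths, sum(toy_lengths)))  # N50 of the toy set
print(ng50(toy_lengths, 200))               # NG50 of the toy set

# Hypothetical NG50 values, not taken from the table
print(quality_code(2_500_000, 30_000_000, 41.3))
print(quality_code(2_500_000, 30_000_000, 41.3, chromosome_level=True))

# Rounded read total from the table (80.5G bases) and the 2.6 Gb genome size
print(round(read_coverage(80.5e9, 2.6e9), 1))
```

Because NG50 depends on the assumed genome size, the exponents in the quality code cannot be read directly off the N50 rows of the table.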