Table 2.
Bio Projects and Vouchers | CCGP NCBI BioProject | PRJNA720569
Genera NCBI BioProject | PRJNA765883
Species NCBI BioProject | PRJNA777227
NCBI BioSample | SAMN29046565
Specimen identification | L20-20
NCBI Genome accessions | Primary | Alternate
Assembly accession | JANIGQ000000000 | JANIGR000000000
Genome sequences | GCA_024610735.1 | GCA_024610745.1
Genome Sequence | PacBio HiFi reads | Run | 1 PACBIO_SMRT (Sequel II) run: 6.1M spots, 90.4G bases, 48.9G bytes
Accession | SRX17388741
Omni-C Illumina reads | Runs | 2 ILLUMINA (Illumina NovaSeq 6000) runs: 130.9M spots, 39.5G bases, 13.9G bytes
Accessions | SRX23638327, SRX23638328
Genome Assembly Quality Metrics | Assembly identifier (Quality code a) | mUrsAme1(7.7.P7.Q58.C91)
HiFi Read coverage b | 38.02×
| Primary | Alternate
Number of contigs | 339 | 77,310
Contig N50 (bp) | 58,859,121 | 43,280
Contig NG50 b | 59,189,856 | 60,742
Longest Contigs | 107,133,695 | 831,372
Number of scaffolds | 316 | 77,310
Scaffold N50 | 67,550,933 | 43,280
Scaffold NG50 b | 68,367,985 | 60,742
Largest scaffold | 122,379,270 | 831,372
Size of final assembly | 2,524,264,886 | 2,885,111,500
Phased block NG50 b | 59,189,856 | 60,747
Gaps per Gbp (# Gaps) | 9 (22) | 0 (0)
Indel QV (Frame shift) | 43.1369853 | 42.81941933
Base pair QV | 63.0115 | 56.9775 | Full assembly = 58.8514
k-mer completeness | 98.187 | 75.5469 | Full assembly = 99.6329
BUSCO completeness (mammalia_odb10) n = 9226 | C c | S c | D c | F c | M c
P d | 96.30% | 95.60% | 0.70% | 1.10% | 2.60%
A d | 62.60% | 58.00% | 4.60% | 7.80% | 29.60%
Organelles | 1 Partial mitochondrial sequence | JANIGQ010000317.1
a Assembly quality code in x.y.P.Q.C notation, derived from Rhie et al. (2021): x = log10[contig NG50]; y = log10[scaffold NG50]; P = log10[phased block NG50]; Q = Phred base accuracy QV (quality value); C = % of the genome represented by the first “n” scaffolds, following a known karyotype for U. americanus of 2n = 74 (Nash and O’Brien 1987). Quality code metrics were calculated from the primary assembly (mUrsAme1.0.p).
b Read coverage and NGx statistics were calculated based on the estimated genome size of 2.37 Gb.
c BUSCO scores: Complete BUSCOs (C); Complete and single-copy BUSCOs (S); Complete and duplicated BUSCOs (D); Fragmented BUSCOs (F); Missing BUSCOs (M).
d (P)rimary and (A)lternate assembly values.
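As an illustration of footnotes a and b, the short Python sketch below rebuilds the quality code string and an approximate HiFi coverage from values in the table. It is not the authors' pipeline, and it rests on several assumptions: the function names are hypothetical; each component is truncated to an integer with `floor`, which the footnote does not state; the Q input is the full-assembly base pair QV (58.8514), chosen only because its floor matches the reported Q58; and the C value (91) is read back from the assembly identifier, since the table does not list it separately.

```python
import math


def quality_code(contig_ng50, scaffold_ng50, phased_block_ng50, base_qv, pct_in_first_n):
    """Build an x.y.P.Q.C string as described in footnote a.

    Flooring every component is an assumption made for this sketch.
    """
    x = math.floor(math.log10(contig_ng50))        # order of magnitude of contig NG50
    y = math.floor(math.log10(scaffold_ng50))      # order of magnitude of scaffold NG50
    p = math.floor(math.log10(phased_block_ng50))  # order of magnitude of phased block NG50
    q = math.floor(base_qv)                        # Phred-scaled base accuracy
    c = math.floor(pct_in_first_n)                 # % of genome in the first n scaffolds
    return f"{x}.{y}.P{p}.Q{q}.C{c}"


def coverage(total_bases, genome_size):
    """Read coverage as sequenced bases divided by estimated genome size (footnote b)."""
    return total_bases / genome_size


# Primary-assembly NG50 values from the table; QV and C chosen as explained above.
code = quality_code(59_189_856, 68_367_985, 59_189_856, 58.8514, 91)

# 90.4G HiFi bases (rounded, as listed in the table) over the 2.37 Gb estimate.
hifi_cov = coverage(90.4e9, 2.37e9)

print(code)                 # 7.7.P7.Q58.C91
print(f"{hifi_cov:.1f}x")   # 38.1x
```

The coverage obtained from the rounded inputs (about 38.1×) differs slightly from the tabulated 38.02×, presumably because the reported figure was computed from unrounded base counts and genome size estimates.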