2021 Nov 24;113(2):188–196. doi: 10.1093/jhered/esab071

Table 2.

Sequencing and assembly statistics, and accession numbers

Bio projects & vouchers

| Item | Value |
| --- | --- |
| CCGP NCBI BioProject | PRJNA720569 |
| Genera NCBI BioProject | PRJNA721387 |
| Species NCBI BioProject | PRJNA734616 |
| NCBI BioSample | SAMN19489519 |
| Specimen identification | UCR ACC. # 292491 |

| NCBI Genome accessions | Primary | Alternate |
| --- | --- | --- |
| Assembly accession | GCA_019985065.1 | GCA_019985075.1 |
| Genome sequences | JAHSPW000000000 | JAHSPX000000000 |

Genome sequence

| Read type | Field | Value |
| --- | --- | --- |
| PacBio HiFi reads | Run | 1 PACBIO_SMRT (Sequel II) run: 1.8 M spots, 27.3 G bases, 8.7 Gb download |
| | Accession | SRR14883332 |
| Hi-C Illumina reads | Run | 1 Illumina HiSeq X Ten run: 199.2 M spots, 59.7 G bases, 37 Gb download |
| | Accession | SRR14883331 |

Genome assembly quality metrics

| Metric | Value |
| --- | --- |
| Assembly identifier (Quality code^a) | ddArcGlau1 (6.7.Q62) |
| HiFi read coverage^b | 45X |

| Metric | Primary | Alternate | Full assembly |
| --- | --- | --- | --- |
| Number of contigs | 353 | 2470 | |
| Contig N50 (bp) | 8 041 760 | 1 739 008 | |
| Longest contig | 22 990 225 | 10 884 557 | |
| Number of scaffolds | 271 | 2350 | |
| Scaffold N50 (bp) | 31 280 158 | 3 804 428 | |
| Largest scaffold | 45 401 621 | 22 987 546 | |
| Size of final assembly (bp) | 547 548 103 | 556 397 040 | |
| Gaps per Gbp | 150 | 1885 | |
| Indel QV (frame shift) | 48.36 | 47.39 | |
| Base pair QV | 62.36 | 56.24 | 58.28 |
| k-mer completeness | 74.39 | 65.01 | 95.59 |

BUSCO completeness (embryophyta), n = 1614

| C | S | D | F | M |
| --- | --- | --- | --- | --- |
| 98.20% | 95.70% | 2.50% | 0.90% | 0.90% |
| 85.90% | 83.30% | 2.60% | 1.30% | 12.80% |

Organelles

| Organelle | Accession |
| --- | --- |
| 1 partial mitochondrial sequence | MZ779111 |
| 1 partial chloroplast sequence | XXXXXX |

a Assembly quality code in x.y.Q notation, derived from Rhie et al. (2021): x = log10[contig NG50]; y = log10[scaffold NG50]; Q = Phred base accuracy QV (quality value). BUSCO scores: (C)omplete; complete and (S)ingle-copy; complete and (D)uplicated; (F)ragmented; (M)issing BUSCO genes. n, number of BUSCO genes in the set/database. bp, base pairs.
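
As a quick check of the x.y.Q notation in footnote a, the sketch below recomputes the code from the primary-assembly values in Table 2. It assumes that each component is rounded down to an integer (the footnote does not state the rounding), and it uses the reported N50 values as stand-ins for NG50, which the table does not list. The function name `quality_code` is hypothetical.

```python
import math


def quality_code(contig_n50_bp, scaffold_n50_bp, base_qv):
    """Build an x.y.Q string (hypothetical helper; flooring is an assumption)."""
    x = math.floor(math.log10(contig_n50_bp))    # order of magnitude of contig N50
    y = math.floor(math.log10(scaffold_n50_bp))  # order of magnitude of scaffold N50
    q = math.floor(base_qv)                      # Phred-scaled base accuracy, truncated
    return f"{x}.{y}.Q{q}"


# Primary-assembly values from Table 2 (N50 used in place of NG50).
code = quality_code(8_041_760, 31_280_158, 62.36)
print(code)
```

Under these assumptions the result agrees with the reported code, 6.7.Q62.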

b Read coverage was calculated assuming a genome size of 600 Mb.
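
The coverage figure in footnote b can be reproduced approximately by dividing the HiFi yield by the assumed genome size. The sketch below takes the 27.3 G bases reported for run SRR14883332 in Table 2 and the 600 Mb genome size from the footnote. Truncating the result to a whole number is an assumption, and `read_coverage` is a hypothetical name.

```python
def read_coverage(total_bases, genome_size_bp):
    """Mean read depth: sequenced bases divided by genome size (hypothetical helper)."""
    return total_bases / genome_size_bp


# 27.3 G bases of HiFi reads over an assumed 600 Mb genome.
coverage = read_coverage(27.3e9, 600e6)
print(f"{coverage:.1f}X")  # truncating this value gives the 45X listed in Table 2
```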