2025 Jan 24;116(3):335–343. doi: 10.1093/jhered/esaf002

Table 2. Sequencing and assembly statistics, and accession numbers.

Bio Projects & Vouchers CCGP NCBI BioProject PRJNA720569
Genera NCBI BioProject PRJNA766268
Species NCBI BioProject PRJNA766268
NCBI BioSample SAMN36908962
Specimen identification 1044C
NCBI Genome accessions Haplotype 1 Haplotype 2
Assembly accession JAVGWX000000000 JAVGWY000000000
Genome sequences GCA_036924085.1 GCA_036924075.1
Genome sequence PacBio HiFi reads Run 1 PACBIO_SMRT (Sequel IIe) run:
6.5M spots, 73.7G bases, 41.6G bytes
Accession SRX23901772
Omni-C Illumina reads Run 2 ILLUMINA (Illumina NovaSeq 6000) runs:
152.5M spots, 46.1G bases, 15G bytes
Accession SRX23901773, SRX23901774
Genome assembly quality metrics Assembly identifier (Quality code [a]) xgAriColu1 (6.7.P6.Q.C97)
HiFi read coverage [b] 32.11X
Haplotype 1 Haplotype 2
Number of contigs 1,382 1,530
Contig N50 (bp) 3,670,449 3,749,547
Contig NG50 [b] 3,670,449 3,700,100
Longest contigs 19,357,573 22,212,580
Number of scaffolds 401 581
Scaffold N50 94,891,770 93,600,053
Scaffold NG50 [b] 94,891,770 93,600,053
Largest scaffold 180,022,102 176,612,698
Size of final assembly 2,294,334,993 2,282,270,284
Phased block NG50 [b] 3,863,613 3,797,846
Gaps per Gbp (# Gaps) 428 (981) 416 (949)
Indel QV (Frame shift) 51.44 50.79
Base pair QV 60.61 60.62
Full assembly = 60.61
k-mer completeness 90.19 90.07
Full assembly = 97.19
BUSCO completeness (metazoa), n = 954 [c] C S D F M
H1 [d] 93.90% 86.40% 7.50% 2.10% 4.00%
H2 [d] 92.40% 84.20% 8.20% 2.10% 5.50%
BUSCO completeness (mollusca), n = 5295 [c] C S D F M
H1 [d] 87.80% 70.10% 17.70% 2.40% 9.80%
H2 [d] 87.50% 70.00% 17.50% 2.10% 10.40%
Organelles Mitochondrial sequence CM072560.1

[a] Assembly quality code x.y.P.Q.C, a notation derived from Rhie et al. (2021): x = log10[contig NG50]; y = log10[scaffold NG50]; P = log10[phased block NG50]; Q = Phred base accuracy QV (quality value); C = % genome represented by the first “n” scaffolds, following a known karyotype of 2n = 52, estimated as the mode of the number of chromosomes from the closely related species Arion vulgaris (NCBI: GCA_020796225.1; Chen et al. 2022). The quality code for the whole assembly is given by that of the primary assembly (xgAriColu1.0.hap1).
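The x.y.P.Q.C notation in the footnote above can be illustrated with a minimal Python sketch. It assumes each log10 term and the QV are truncated to an integer (the footnote does not state the rounding rule), and the function name `quality_code` is hypothetical. The inputs are the Haplotype 1 values from the table; the Q field is taken from the base pair QV row and the C value from the printed code, both as assumptions.

```python
import math

def quality_code(contig_ng50, scaffold_ng50, phased_block_ng50, base_qv, pct_in_scaffolds):
    """Build an x.y.P.Q.C style string (hypothetical helper, truncation assumed)."""
    x = math.floor(math.log10(contig_ng50))        # order of magnitude of contig NG50
    y = math.floor(math.log10(scaffold_ng50))      # order of magnitude of scaffold NG50
    p = math.floor(math.log10(phased_block_ng50))  # order of magnitude of phased block NG50
    q = math.floor(base_qv)                        # Phred base accuracy, truncated
    c = math.floor(pct_in_scaffolds)               # % of genome in the first n scaffolds
    return f"{x}.{y}.P{p}.Q{q}.C{c}"

# Haplotype 1 values from the table; 97 is read from the printed code.
code = quality_code(3_670_449, 94_891_770, 3_863_613, 60.61, 97)
print(code)
```

Under these assumptions the x, y, P, and C fields agree with the code printed in the table.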

[b] Read coverage and NGx statistics were calculated based on the estimated genome size of 2.29 Gb.
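To make the difference between Nx and NGx concrete, here is a small Python sketch of the usual definitions: N50 uses the assembly's own total length as the reference, while NG50 uses an external genome size estimate (2.29 Gb in this table). The function names and the toy sequence lengths are hypothetical and are not taken from the assembly.

```python
def ngx(lengths, genome_size, x=50):
    """Length of the sequence at which the running total of sorted lengths
    first reaches x% of genome_size; None if the assembly is too short."""
    threshold = genome_size * x / 100
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running >= threshold:
            return length
    return None

def nx(lengths, x=50):
    """Nx is NGx with the assembly's own total length as the reference size."""
    return ngx(lengths, sum(lengths), x)

# Toy contig lengths (hypothetical), total length 150.
toy = [50, 40, 30, 20, 10]
print(nx(toy))        # reference size is the assembly total
print(ngx(toy, 200))  # reference size is a larger genome size estimate
```

When the assembly size is close to the genome size estimate, as for Haplotype 1 here, N50 and NG50 can coincide.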

[c] BUSCO scores: complete BUSCOs (C), complete and single-copy BUSCOs (S), complete and duplicated BUSCOs (D), fragmented BUSCOs (F), and missing BUSCOs (M).
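The BUSCO categories are related by two identities: C = S + D, and C + F + M = 100%. The following Python sketch checks the percentages reported in the table against both, with a small tolerance for rounding to one decimal place. The dictionary layout and the helper name `is_consistent` are illustrative choices.

```python
# BUSCO percentages as reported in the table, keyed by (lineage, haplotype).
busco = {
    ("metazoa", "H1"): {"C": 93.9, "S": 86.4, "D": 7.5, "F": 2.1, "M": 4.0},
    ("metazoa", "H2"): {"C": 92.4, "S": 84.2, "D": 8.2, "F": 2.1, "M": 5.5},
    ("mollusca", "H1"): {"C": 87.8, "S": 70.1, "D": 17.7, "F": 2.4, "M": 9.8},
    ("mollusca", "H2"): {"C": 87.5, "S": 70.0, "D": 17.5, "F": 2.1, "M": 10.4},
}

def is_consistent(scores, tol=0.11):
    """Check C = S + D and C + F + M = 100 within a rounding tolerance."""
    complete_ok = abs(scores["S"] + scores["D"] - scores["C"]) <= tol
    total_ok = abs(scores["C"] + scores["F"] + scores["M"] - 100.0) <= tol
    return complete_ok and total_ok

for key, scores in busco.items():
    print(key, is_consistent(scores))
```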

[d] H1: Haplotype 1 assembly values; H2: Haplotype 2 assembly values.