Table 1. Genome data for Elegia similella, ilEleSimi1.1.
Project accession data | ||
---|---|---|
Assembly identifier | ilEleSimi1.1 | |
Species | Elegia similella | |
Specimen | ilEleSimi1 | |
NCBI taxonomy ID | 1101167 | |
BioProject | PRJEB56060 | |
BioSample ID | SAMEA10978763 | |
Isolate information | ilEleSimi1, male: whole organism (DNA and Hi-C sequencing) | |
Assembly metrics * | | Benchmark |
Consensus quality (QV) | 66.4 | ≥ 50 |
k-mer completeness | 100.0% | ≥ 95% |
BUSCO ** | C:98.8%[S:98.3%,D:0.5%],F:0.4%,M:0.8%,n:5,286 | C ≥ 95% |
Percentage of assembly mapped to chromosomes | 99.99% | ≥ 95% |
Sex chromosomes | ZZ | localised homologous pairs |
Organelles | Mitochondrial genome: 15.3 kb | complete single alleles |
Raw data accessions | ||
Pacific Biosciences SEQUEL II | ERR10224929 | |
Hi-C Illumina | ERR10297823 | |
Genome assembly | ||
Assembly accession | GCA_947532085.1 | |
Accession of alternate haplotype | GCA_947532095.1 | |
Span (Mb) | 780.4 | |
Number of contigs | 50 | |
Contig N50 length (Mb) | 23.1 | |
Number of scaffolds | 33 | |
Scaffold N50 length (Mb) | 28.7 | |
Longest scaffold (Mb) | 56.26 | |
Genome annotation | ||
Number of protein-coding genes | 18,805 | |
Number of gene transcripts | 18,942 | |
* Assembly metric benchmarks are adapted from column VGP-2020 of “Table 1: Proposed standards and metrics for defining genome assembly quality” from (Rhie et al., 2021).
** BUSCO scores based on the lepidoptera_odb10 BUSCO set using version 5.3.2. C = complete [S = single copy, D = duplicated], F = fragmented, M = missing, n = number of orthologues in comparison. A full set of BUSCO scores is available at https://blobtoolkit.genomehubs.org/view/CANNWO01/dataset/CANNWO01/busco.
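
As an illustration of the BUSCO notation described in the footnote above, the following Python sketch parses the summary string from Table 1 and checks that the categories are internally consistent (C = S + D, and C + F + M = 100%). It is not part of the assembly pipeline; the function name `parse_busco` and the rounding tolerance of 0.15 percentage points are assumptions made for this example.

```python
import re


def parse_busco(summary: str) -> dict:
    """Parse a BUSCO short-summary string into percentages plus the count n.

    Hypothetical helper for illustration; assumes the exact layout
    'C:..%[S:..%,D:..%],F:..%,M:..%,n:..' used in Table 1.
    """
    pattern = (
        r"C:(?P<C>[\d.]+)%\[S:(?P<S>[\d.]+)%,D:(?P<D>[\d.]+)%\],"
        r"F:(?P<F>[\d.]+)%,M:(?P<M>[\d.]+)%,n:(?P<n>[\d,]+)"
    )
    match = re.fullmatch(pattern, summary.strip())
    if match is None:
        raise ValueError(f"unrecognised BUSCO summary: {summary!r}")
    # Percentages become floats; n may contain a thousands separator.
    fields = {k: float(v) for k, v in match.groupdict().items() if k != "n"}
    fields["n"] = int(match["n"].replace(",", ""))
    return fields


busco = parse_busco("C:98.8%[S:98.3%,D:0.5%],F:0.4%,M:0.8%,n:5,286")

# Complete = single copy + duplicated (allowing for rounding to one decimal).
assert abs(busco["C"] - (busco["S"] + busco["D"])) < 0.15
# Complete + fragmented + missing should account for every orthologue searched.
assert abs(busco["C"] + busco["F"] + busco["M"] - 100.0) < 0.15
# The completeness benchmark in Table 1 is C >= 95%.
assert busco["C"] >= 95.0
```

The same checks can be applied to any BUSCO summary in this format, for example when comparing the primary assembly against its alternate haplotype.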