Abstract
The plum fruit moth Grapholita funebrana (Tortricidae, Lepidoptera) is an important pest of many wild and cultivated stone fruits and other plants in the family Rosaceae. Here, we assembled its nuclear and mitochondrial genomes using Illumina, Nanopore, and Hi-C sequencing technologies. The nuclear genome size is 570.9 Mb, with a repeat content of 51.28% and a BUSCO completeness of 97.7%. The karyotype for males is 2n = 56. We identified 17,979 protein-coding genes, 5,643 tRNAs, and 94 rRNAs. We also determined the mitochondrial genome of this species and annotated 13 protein-coding genes, 22 tRNAs, and 2 rRNAs. These genomes provide resources to understand the genetics, ecology, and genome evolution of tortricid moths.
Subject terms: Genomics, Agricultural genetics
Background & Summary
The plum fruit moth Grapholita funebrana is an important fruit borer from the family Tortricidae of Lepidoptera1,2. Larvae of G. funebrana cause damage by boring into the fruits of many wild and cultivated stone fruits and other plants in the family Rosaceae, such as apricot, cherry, peach, and plum3. This species is native to Europe and is currently found in fruit-growing regions of Europe, northern Africa, and Asia4. In orchards, G. funebrana often co-occurs with other fruit borers, such as the oriental fruit moth Grapholita molesta (Busck), the codling moth Cydia pomonella, and the peach fruit moth Carposina sasakii Matsumura5. While many studies have focused on the biology and management of fruit borers, research on G. funebrana is lagging behind6–10. In addition, moths from the family Tortricidae are ideal for unveiling the evolution of chromosome fusion11,12. While species from the order Lepidoptera often have a conserved chromosome number of n = 31, many species in the family Tortricidae have a reduced number of chromosomes due to the fusion of chromosome pairs13,14. Recent research has found that a common ancestor of the subfamilies Tortricinae and Olethreutinae diverged from the ancestral lepidopteran chromosome pattern due to a fusion of sex chromosomes with autosomes15. The karyotype of tortricid moths was traditionally studied by cytogenetic methods and fluorescence in situ hybridization15. Determining the genome sequences will improve understanding of the molecular evolution of chromosomes of tortricid moths16. Currently, chromosome-level genomes have been published for C. pomonella16 and G. molesta17, and many further assemblies for Tortricidae are publicly available in GenBank (https://www.ncbi.nlm.nih.gov/datasets/genome/?taxon=7139).
In this study, we assembled a chromosome-level genome for G. funebrana, as well as its mitochondrial genome, using Oxford Nanopore Technologies (ONT) long-read sequencing, Illumina short-read sequencing, high-throughput chromatin conformation capture (Hi-C) sequencing, and RNA sequencing (RNA-seq). We obtained a nuclear genome assembly of 570.9 Mb with an N50 of 21 Mb. These high-quality genomes will provide invaluable resources for the study of G. funebrana and for in-depth investigation of chromosome evolution at macroevolutionary and microevolutionary levels.
Methods
Material and sequencing
Apricot (Prunus armeniaca) fruits with G. funebrana larvae were collected from Yanqing, Beijing, China, and reared in the laboratory for about 30 days to obtain specimens of different developmental stages. To decrease the effect of heterozygosity, a single larva was used for long-read, short-read, and Hi-C library construction. One larva, one pupa, and one adult (sex unknown) were each collected for the construction of RNA-seq libraries. All samples were immediately flash-frozen in liquid nitrogen and stored at −80 °C for subsequent experiments.
Genomic DNA was extracted using a magnetic bead method (Invitrogen, Thermo Fisher Scientific, USA), and RNA was extracted using the RNAprep Pure Plus Kit (Tiangen, China). The quantity of DNA was measured using Qubit 3.0. To generate short-read data for the genome survey, an Illumina library with an insert size of 350 bp was constructed and sequenced on the Illumina NovaSeq 6000 platform. For de novo genome assembly, a 15–20 kb ONT library was prepared and sequenced on the ONT platform to generate long-read data. To generate the Hi-C data, tissue from a larva was fixed with paraformaldehyde and digested with the restriction enzyme DpnII, generating fragments with sticky ends. These sticky ends were repaired using DNA polymerase and ligated together to form chimeric circles using DNA ligase. The ligated DNA was then decrosslinked, purified, and sheared to an insert size of 350 bp. The Hi-C library was sequenced on the Illumina NovaSeq 6000 platform to generate 150-bp paired-end reads. For genome annotation, paired-end RNA-seq libraries were constructed using the VAHTS™ mRNA-seq V2 Library Prep Kit (Vazyme, Nanjing, China) and sequenced on the Illumina NovaSeq 6000 platform to generate 150-bp paired-end reads. In total, 33.7 Gb of Illumina short reads, 69.7 Gb of ONT long reads, 58.3 Gb of Hi-C reads, and 21.9 Gb of RNA-seq reads were generated. The raw Illumina reads were filtered using Fastp v0.21.018 with default parameters.
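The data yields above can be related to the genome size with a simple depth calculation. The following minimal Python sketch divides each yield by the final assembly size of 570.9 Mb reported in this paper. It assumes that "Gb" denotes gigabases and that depth is taken as yield divided by assembly size; the function and variable names are illustrative and not part of the original workflow.

```python
def nominal_depth(yield_gb: float, genome_mb: float) -> float:
    """Nominal sequencing depth: total sequenced bases divided by genome size."""
    if genome_mb <= 0:
        raise ValueError("genome_mb must be positive")
    return (yield_gb * 1e9) / (genome_mb * 1e6)


# Yields reported in the text (assumed to be gigabases).
yields_gb = {"Illumina": 33.7, "ONT": 69.7, "Hi-C": 58.3}

# Final assembly size reported in the text (Mb).
ASSEMBLY_MB = 570.9

depths = {name: nominal_depth(gb, ASSEMBLY_MB) for name, gb in yields_gb.items()}

for name, depth in depths.items():
    print(f"{name}: ~{depth:.0f}x")
```

Using the k-mer-based size estimate instead of the assembly size would give slightly higher values.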
Genome survey
The genome survey was performed using a k-mer-based method. The k-mer coverage was counted from Illumina short reads using Jellyfish version 2.2.1019 with parameters: ‘count -m 21 -C -s 5G’. Genome size, heterozygosity, and duplication rate were estimated using GenomeScope version 2.020. The results showed a genome size of about 515 Mb, a heterozygosity rate of 1.91%, and a duplication rate of 1.21%.
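To illustrate what the k-mer counting step produces, the following minimal Python sketch counts canonical k-mers (the behaviour requested by the `-C` flag of Jellyfish) and builds the abundance histogram that genome profiling tools such as GenomeScope take as input. It is a toy in-memory re-implementation for short strings, not the Jellyfish algorithm; the function names and example reads are hypothetical.

```python
from collections import Counter

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def canonical(kmer: str) -> str:
    """Return the lexicographically smaller of a k-mer and its reverse complement."""
    reverse_complement = kmer.translate(_COMPLEMENT)[::-1]
    return min(kmer, reverse_complement)


def count_canonical_kmers(reads, k=21):
    """Count canonical k-mers across reads, skipping k-mers with non-ACGT bases."""
    counts = Counter()
    for read in reads:
        read = read.upper()
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            if set(kmer) <= set("ACGT"):
                counts[canonical(kmer)] += 1
    return counts


def abundance_histogram(counts):
    """Map each multiplicity to the number of distinct k-mers observed that many times."""
    return Counter(counts.values())


# Toy example with k = 3 (hypothetical reads, not data from this study).
toy_counts = count_canonical_kmers(["ACGTAC", "GTACGT"], k=3)
toy_hist = abundance_histogram(toy_counts)
```

In the toy example, every k-mer collapses onto one of two canonical forms, which is why counting both strands together halves the number of distinct k-mers.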
Genome assembly
The Nanopore long reads were assembled into a primary set of nuclear genome contigs using NextDenovo v2.5.121 with parameters: ‘read_cutoff = 1k, genome_size = 400 m, pa_correction = 20, nextgraph_options = -a 1’. The primary assembly contained 215 contigs, with a total size of 594 Mb and an N50 of 6.6 Mb. Because assemblies based on ONT reads have a high error rate, the primary contigs were polished using NextPolish 1.4.122, with one round based on long reads and one round based on short reads. To achieve a chromosome-level assembly, the polished contigs were anchored into pseudomolecules based on Hi-C read information. Specifically, the Hi-C reads were mapped to the contigs using Chromap 0.2.423 with options: “--preset hic --remove-pcr-duplicates --trim-adapters --SAM”. The SAM output was sorted by read name and written in BAM format using Samtools v1.1724 with options: “sort -n -O BAM”. YaHS v1.2a.125 and Juicebox 1.22.0126 were then used for unsupervised and supervised scaffolding, respectively. After scaffolding, most contigs (95.3% of contigs and 99.86% of base pairs) were anchored into 28 pseudo-chromosomes (Fig. 1a), consistent with the karyotype of most species in the subfamily Olethreutinae. To fill the gaps between contigs, we performed two rounds of polishing based on long and short reads using NextPolish. The final assembly has a genome size of 570.9 Mb, with an N50 of 21 Mb. The assembled genome is 56.9 Mb larger than the estimated genome size. The MitoZ v3.6 pipeline27 was used to assemble the mitochondrial genome with Megahit v1.2928 (“--kmers_megahit 39 59 79 99 119 141 --requiring_taxa Lepidoptera”) and to annotate it. The mitochondrial genome of G. funebrana is 15,488 bp in length and contains 13 protein-coding genes, 22 tRNA genes, and 2 rRNA genes (Fig. 1b).
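The N50 values quoted above follow the usual definition: the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the assembly. A minimal Python sketch of that calculation is given below; the function name and the toy lengths are illustrative and are not taken from the assembly itself.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    Sequences are visited from longest to shortest; the N50 is the length
    of the sequence at which the running total first reaches half of the
    summed length.
    """
    if not lengths:
        raise ValueError("lengths must not be empty")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Toy example (hypothetical contig lengths).
toy_n50 = n50([2, 3, 4, 5, 6])
```

Applied to the contig or scaffold lengths of an assembly, for example as read from a FASTA index, the same function gives the N50 for that assembly stage.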
Fig. 1.
Hi-C interaction heat map of the nuclear genome (a), and distribution of genes and read coverage on the mitochondrial genome (b).
Genome annotations
For repeat sequence annotation, a species-specific repeat library was generated using RepeatModeler v2.0.429 with the option “-LTRStruct”. The species-specific repeat library, the RepBase database, and a repeat element library for Arthropoda from the Dfam database were then combined and passed to RepeatMasker v4.1.430 for repeat annotation. RepeatMasker was run with options: “-no_is -norna -xsmall -q”.
For gene structure annotation, we used a pipeline integrating RNA-seq-based, ab initio, and homology-based methods. The RNA-seq reads from the larva, pupa, and adult libraries were mapped to our final assembly with Hisat v2.2.027 and assembled into transcripts with Stringtie v2.1.231. The transcriptome assemblies and the protein sequences of Plutella xylostella (Accession: GCA_932276165.132) were provided as evidence to the MAKER v3.01.04 pipeline26 for integration. SNAP v2013-02-1628 and Augustus v3.2.329 were used for ab initio annotation. Transfer RNA (tRNA) genes were predicted using tRNAscan-SE 2.0.1233 with default parameters, and ribosomal RNA (rRNA) genes were predicted using Barrnap 0.9 (https://github.com/tseemann/barrnap). The above gene models were merged into consensus models using EvidenceModeler v2.1.033. Functional annotation of protein-coding genes was performed using EggNOG-mapper v234.
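With the `-xsmall` option, RepeatMasker writes repeats in lowercase (soft-masking) rather than replacing them with N. One simple way to summarise such output is to take the fraction of lowercase bases. The sketch below assumes plain FASTA input supplied as an iterable of lines and is only an illustration; it is not how the repeat content reported in this paper was computed, and the function name is hypothetical.

```python
def softmasked_fraction(fasta_lines):
    """Return the fraction of lowercase (soft-masked) bases in FASTA-formatted lines."""
    masked = 0
    total = 0
    for line in fasta_lines:
        line = line.strip()
        if not line or line.startswith(">"):
            continue  # skip blank lines and sequence headers
        total += len(line)
        masked += sum(1 for base in line if base.islower())
    if total == 0:
        raise ValueError("no sequence found")
    return masked / total


# Toy example (hypothetical sequence): 6 of 12 bases are soft-masked.
toy_fraction = softmasked_fraction([">chr1", "ACGTacgt", "AAaa"])
```

For a real assembly the lines could be streamed from an open file handle, so that the genome never needs to be held in memory.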
Chromosome feature
Gene number, repeat sequence density, and guanine-cytosine (GC) content were calculated in non-overlapping 500-kb windows using Bedtools v2.30.035. Chromosomes were named after the lepidopteran ancestral linkage groups14, based on homology to Sesia bembeciformis36. Homology was detected using LAST37 alignment. A Circos plot of chromosome features was generated with TBtools v2.02138 (Fig. 2a).
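The windowed statistics can be illustrated with GC content. The following Python sketch computes the GC fraction in consecutive non-overlapping windows of a single sequence. It is a simplified stand-in for the Bedtools-based calculation described above, not a reproduction of it: the function name is hypothetical, and it assumes that ambiguous bases such as N are excluded from the denominator.

```python
def gc_by_window(sequence, window=500_000):
    """Yield (start, end, gc_fraction) for consecutive non-overlapping windows.

    The last window may be shorter than `window`. Bases other than A, C, G,
    and T are ignored; a window with no such bases yields NaN.
    """
    if window <= 0:
        raise ValueError("window must be positive")
    for start in range(0, len(sequence), window):
        chunk = sequence[start:start + window].upper()
        acgt = sum(chunk.count(base) for base in "ACGT")
        gc = chunk.count("G") + chunk.count("C")
        fraction = gc / acgt if acgt else float("nan")
        yield start, start + len(chunk), fraction


# Toy example with a 4-bp window (hypothetical sequence).
toy_windows = list(gc_by_window("GGCCAATTGC", window=4))
```

Gene counts and repeat density per window follow the same pattern, with interval overlaps taking the place of base counts.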
Fig. 2.
Chromosome features of the Grapholita funebrana genome. (a) Circos plot of GC content, gene count, and repeat content. Chromosomes were labeled using Merian elements according to their homology with the lepidopteran ancestral linkage groups14. (b) Synteny blocks between G. funebrana and G. molesta reveal the same number of chromosomes and highly conserved gene order in the two moths. The chromosomes of the two genomes were numbered according to their length. The grey lines show the synteny blocks between the two genomes.
Data Records
Illumina, Nanopore, Hi-C, and transcriptome data for G. funebrana genome sequencing have been deposited in the NCBI Sequence Read Archive under accession number SRP48223139. The final assembled nuclear genome of G. funebrana has been deposited in NCBI GenBank under accession number GCA_038095595.140. The mitochondrial genome has been deposited in NCBI GenBank under accession number PP77602341. The genome assembly and annotation files are available in Figshare42.
Technical Validation
The Hi-C heatmap revealed a well-structured interaction pattern. Short-read sequencing data were mapped to the final assembly with BWA v0.7.1743, giving a mapping rate of 97.7%. The completeness of the G. funebrana genome assembly was evaluated using BUSCO44 based on the lepidoptera_odb10 database (n = 5286). The completeness of the initial assembly (contig level) was 90.9%, and it increased to 97.7% (97.2% single-copy, 0.5% duplicated, 0.6% fragmented, and 1.7% missing genes) after polishing with NextPolish22 (Table 1). We identified 14,547 protein-coding genes, 11,673 of which were functionally annotated. The completeness of the annotated gene set was 95.8% (94.8% single-copy, 1.0% duplicated, 1.1% fragmented, and 3.1% missing genes). A synteny analysis between G. funebrana and G. molesta17 was performed using MCScan in the JCVI package45. Strong syntenic blocks were found between the two closely related species (Fig. 2b). Together, this evidence supports the completeness and accuracy of the G. funebrana genome assembly.
Table 1.
Statistics of the G. funebrana genome assembly.
| Item | Contig | Purged contig | Hi-C raised scaffold | Polished scaffold |
|---|---|---|---|---|
| No. of contigs | 215 | 175 | 28 | 28 |
| Size (Mb) | 593.9 | 580.3 | 579.6 | 570.9 |
| N50 (Mb) | 6.6 | 7.2 | 21.4 | 21.0 |
| GC content | 37.8% | 37.6% | 37.6% | 37.5% |
| Single-copy BUSCOs | 90.2% | 90.9% | 90.5% | 97.2% |
| Duplicated BUSCOs | 0.7% | 0.4% | 0.3% | 0.5% |
| Fragmented BUSCOs | 4.4% | 4.4% | 4.4% | 0.6% |
| Missing BUSCOs | 4.7% | 4.7% | 4.8% | 1.7% |
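The BUSCO completeness figures quoted in the Technical Validation text are the sum of the single-copy and duplicated categories. The short Python sketch below reproduces that arithmetic for the polished assembly and for the annotated gene set, using the percentages stated in the text. The function name is hypothetical, and the tolerance used when checking that the four categories sum to 100% is an assumption to allow for rounding in the reported values.

```python
def busco_complete(single_pct, duplicated_pct, fragmented_pct, missing_pct, tol=0.15):
    """Return complete BUSCOs (%) as single-copy plus duplicated.

    Raises ValueError if the four categories do not sum to 100% within
    `tol`, an assumed allowance for rounding in reported percentages.
    """
    total = single_pct + duplicated_pct + fragmented_pct + missing_pct
    if abs(total - 100.0) > tol:
        raise ValueError(f"categories sum to {total:.1f}%, expected 100%")
    return round(single_pct + duplicated_pct, 1)


# Percentages reported in the text for the polished assembly and the gene set.
assembly_complete = busco_complete(97.2, 0.5, 0.6, 1.7)
geneset_complete = busco_complete(94.8, 1.0, 1.1, 3.1)
```

The same check can be applied to any set of reported BUSCO percentages.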
Acknowledgements
This work was supported by the National Natural Science Foundation of China (32272543) and the Beijing Key Laboratory of Environmentally Friendly Management on Pests of North China Fruits (BZ0432).
Author contributions
S.J.W. designed the study. J.C.C. contributed to the materials. L.J.C. and F.Y. analysed the data. F.Y. and L.J.C. wrote the manuscript. S.J.W. revised the manuscript.
Code availability
No custom scripts or code were used in this study.
Competing interests
The authors declare no competing interests.
References
- 1.Li L-L, et al. Functional disparity of four pheromone-binding proteins from the plum fruit moth Grapholita funebrana Treitscheke in detection of sex pheromone components. Int. J. Biol. Macromol. 2023;225:1267–1279. doi: 10.1016/j.ijbiomac.2022.11.186.
- 2.Lo Verde G, Guarino S, Barone S, Rizzo R. Can mating disruption be a possible route to control plum fruit moth in mediterranean environments? Insects. 2020;11:589. doi: 10.3390/insects11090589.
- 3.Dickler, E. Tortricid pests of pome and stone fruits, eurasian species. in Tortricids Pests, Their Biology, Natural Enemies and Control (eds. van der Geest, L. P. S. & Evenhuis, H. H.) 435–452 (Elsevier, Amsterdam, Netherlands, 1991).
- 4.F, K. A taxonomic review of the genus Grapholita and allied genera (Lepidoptera: Tortricidae) in the Palaearctic region. Ent. Scand. Suppl. 55, 110 (1999).
- 5.Chen MH, Dorn S. Reliable and efficient discrimination of four internal fruit-feeding Cydia and Grapholita species (Lepidoptera: Tortricidae) by polymerase chain reaction-restriction fragment length polymorphism. J. Econ. Entomol. 2009;102:2209–2216. doi: 10.1603/029.102.0625.
- 6.Ioriatti C, et al. Toxicity of emamectin benzoate to Cydia pomonella (L.) and Cydia molesta (Busck) (Lepidoptera: Tortricidae): laboratory and field tests. Pest Manag. Sci. 2009;65:306–312. doi: 10.1002/ps.1689.
- 7.Liu J, et al. Reverse chemical ecology guides the screening for Grapholita molesta pheromone synergists. Pest Manag. Sci. 2022;78:643–652. doi: 10.1002/ps.6674.
- 8.Stelinski LL, Il’ichev AL, Gut LJ. Efficacy and release rate of reservoir pheromone dispensers for simultaneous mating disruption of codling moth and oriental fruit moth (Lepidoptera: Tortricidae). J. Econ. Entomol. 2009;102:315–323. doi: 10.1603/029.102.0142.
- 9.Witzgall P, Stelinski L, Gut L, Thomson D. Codling moth management and chemical ecology. Annu. Rev. Entomol. 2008;53:503–522. doi: 10.1146/annurev.ento.53.103106.093323.
- 10.Wu Y, et al. Laboratory evaluation of the compatibility of Beauveria bassiana with the egg parasitoid Trichogramma dendrolimi (Hymenoptera: Trichogrammatidae) for joint application against the oriental fruit moth Grapholita molesta (Lepidoptera: Tortricidae). Pest Manag. Sci. 2022;78:3608–3619. doi: 10.1002/ps.7003.
- 11.Nguyen P, et al. Neo-sex chromosomes and adaptive potential in tortricid pests. Proc. Natl. Acad. Sci. 2013;110:6931–6936. doi: 10.1073/pnas.1220372110.
- 12.Sahara K, Yoshido A, Traut W. Sex chromosome evolution in moths and butterflies. Chromosome Res. 2012;20:83–94. doi: 10.1007/s10577-011-9262-z.
- 13.Nguyen, P. & Carabajal Paladino, L. On the neo-sex chromosomes of Lepidoptera. in Evolutionary Biology: Convergent Evolution, Evolution of Complex Traits, Concepts and Methods (ed. Pontarotti, P.) 171–185. 10.1007/978-3-319-41324-2_11 (Springer International Publishing, Cham, 2016).
- 14.Wright, C. J., Stevens, L., Mackintosh, A., Lawniczak, M. & Blaxter, M. Comparative genomics reveals the dynamics of chromosome evolution in Lepidoptera. Nat. Ecol. Evol. 1–14, 10.1038/s41559-024-02329-4 (2024).
- 15.Šíchová J, Nguyen P, Dalíková M, Marec F. Chromosomal evolution in tortricid moths: conserved karyotypes with diverged features. PLoS ONE. 2013;8:e64520. doi: 10.1371/journal.pone.0064520.
- 16.Wan F, et al. A chromosome-level genome assembly of Cydia pomonella provides insights into chemical ecology and insecticide resistance. Nat. Commun. 2019;10:4237. doi: 10.1038/s41467-019-12175-9.
- 17.Cao L-J, et al. Population genomic signatures of the oriental fruit moth related to the Pleistocene climates. Commun. Biol. 2022;5:142. doi: 10.1038/s42003-022-03097-2.
- 18.Chen S, Zhou Y, Chen Y, Gu J. fastp: an ultra-fast all-in-one FASTQ preprocessor. Bioinformatics. 2018;34:i884–i890. doi: 10.1093/bioinformatics/bty560.
- 19.Marçais G, Kingsford C. A fast, lock-free approach for efficient parallel counting of occurrences of k-mers. Bioinformatics. 2011;27:764–770. doi: 10.1093/bioinformatics/btr011.
- 20.Vurture GW, et al. GenomeScope: fast reference-free genome profiling from short reads. Bioinformatics. 2017;33:2202–2204. doi: 10.1093/bioinformatics/btx153.
- 21.Hu J, et al. NextDenovo: an efficient error correction and accurate assembly tool for noisy long reads. Genome Biol. 2024;25:107. doi: 10.1186/s13059-024-03252-4.
- 22.Hu J, Fan J, Sun Z, Liu S. NextPolish: a fast and efficient genome polishing tool for long-read assembly. Bioinformatics. 2020;36:2253–2255. doi: 10.1093/bioinformatics/btz891.
- 23.Zhang H, et al. Fast alignment and preprocessing of chromatin profiles with Chromap. Nat. Commun. 2021;12:6566. doi: 10.1038/s41467-021-26865-w.
- 24.Danecek P, et al. Twelve years of SAMtools and BCFtools. GigaScience. 2021;10:giab008. doi: 10.1093/gigascience/giab008.
- 25.Zhou C, McCarthy SA, Durbin R. YaHS: yet another Hi-C scaffolding tool. Bioinformatics. 2023;39:btac808. doi: 10.1093/bioinformatics/btac808.
- 26.Durand NC, et al. Juicer provides a one-click system for analyzing loop-resolution Hi-C experiments. Cell Syst. 2016;3:95–98. doi: 10.1016/j.cels.2016.07.002.
- 27.Meng G, Li Y, Yang C, Liu S. MitoZ: a toolkit for animal mitochondrial genome assembly, annotation and visualization. Nucleic Acids Res. 2019;47:e63. doi: 10.1093/nar/gkz173.
- 28.Li D, Liu C-M, Luo R, Sadakane K, Lam T-W. MEGAHIT: an ultra-fast single-node solution for large and complex metagenomics assembly via succinct de Bruijn graph. Bioinformatics. 2015;31:1674–1676. doi: 10.1093/bioinformatics/btv033.
- 29.Flynn JM, et al. RepeatModeler2 for automated genomic discovery of transposable element families. Proc. Natl. Acad. Sci. 2020;117:9451–9457. doi: 10.1073/pnas.1921046117.
- 30.Tarailo-Graovac M, Chen N. Using RepeatMasker to identify repetitive elements in genomic sequences. Curr. Protoc. Bioinforma. 2009;25:4.10.1–4.10.14. doi: 10.1002/0471250953.bi0410s25.
- 31.Pertea M, Kim D, Pertea GM, Leek JT, Salzberg SL. Transcript-level expression analysis of RNA-seq experiments with HISAT, StringTie and Ballgown. Nat. Protoc. 2016;11:1650–1667. doi: 10.1038/nprot.2016.095.
- 32.2022. Genbank. GCA_932276165.1
- 33.Chan PP, Lin BY, Mak AJ, Lowe TM. tRNAscan-SE 2.0: improved detection and functional classification of transfer RNA genes. Nucleic Acids Res. 2021;49:9077–9096. doi: 10.1093/nar/gkab688.
- 34.Cantalapiedra CP, Hernández-Plaza A, Letunic I, Bork P, Huerta-Cepas J. eggNOG-mapper v2: Functional annotation, orthology assignments, and domain prediction at the metagenomic scale. Mol. Biol. Evol. 2021;38:5825–5829. doi: 10.1093/molbev/msab293.
- 35.Quinlan AR. BEDTools: The swiss-army tool for genome feature analysis. Curr. Protoc. Bioinforma. 2014;47:11.12.1–11.12.34. doi: 10.1002/0471250953.bi1112s47.
- 36.2022. Genbank. GCA_943735995.1
- 37.Katoh K, Frith MC. Adding unaligned sequences into an existing alignment using MAFFT and LAST. Bioinformatics. 2012;28:3144–3146. doi: 10.1093/bioinformatics/bts578.
- 38.Chen C, et al. TBtools: An Integrative Toolkit Developed for Interactive Analyses of Big Biological Data. Mol. Plant. 2020;13:1194–1202. doi: 10.1016/j.molp.2020.06.009.
- 39.2024. NCBI Sequence Read Archive. SRP482231
- 40.2024. Genbank. GCA_038095595.1
- 41.2024. Genbank. PP776023
- 42.Wei S-J, Yang F. 2024. Genome annotation of Grapholita funebrana. Figshare.
- 43.Li H, Durbin R. Fast and accurate short read alignment with Burrows–Wheeler transform. Bioinformatics. 2009;25:1754–1760. doi: 10.1093/bioinformatics/btp324.
- 44.Manni M, Berkeley MR, Seppey M, Simão FA, Zdobnov EM. BUSCO update: Novel and streamlined workflows along with broader and deeper phylogenetic coverage for scoring of eukaryotic, prokaryotic, and viral genomes. Mol. Biol. Evol. 2021;38:4647–4654. doi: 10.1093/molbev/msab199.
- 45.Tang H, et al. Synteny and collinearity in plant genomes. Science. 2008;320:486–488. doi: 10.1126/science.1153917.