Scientific Data. 2023 Jan 19;10:36. doi: 10.1038/s41597-023-01950-5

Chromosome-level genome assembly of the Colorado potato beetle, Leptinotarsa decemlineata

Junjie Yan 1,#, Chaowei Zhang 2,#, Mengdi Zhang 1, Hang Zhou 2, Zhangqi Zuo 2, Xinhua Ding 3, Runzhi Zhang 4, Fei Li 2, Yulin Gao 1
PMCID: PMC9849343  PMID: 36653371

Abstract

The Colorado potato beetle (Leptinotarsa decemlineata) is one of the most notorious insect pests of potatoes globally. Here, we generated a high-quality chromosome-level genome assembly of L. decemlineata using a combination of PacBio HiFi sequencing and Hi-C scaffolding technologies. The genome assembly (~1,008 Mb) is anchored to 18 chromosomes (17 + XO), with a scaffold N50 of 58.32 Mb. It contains 676 Mb of repeat sequences and 29,606 protein-coding genes. The chromosome-level genome assembly of L. decemlineata will be a helpful resource for the beetle and invasive biology research communities.

Subject terms: Genome, Agricultural genetics

Background & Summary

The Colorado potato beetle (CPB), Leptinotarsa decemlineata, is one of the most successful globally invasive insects. Its current habitat spans over 16 million km2 across North America, Europe and Asia and continues to expand1. Both adults and larvae devour entire leaves, which makes CPB one of the most destructive insect pests. It has been estimated that a single larva can consume approximately 40 cm2 of potato leaves over the larval stage2,3. Chemical pesticides have been used to control CPB since the 1860s4. However, strong selection pressure has promoted the emergence of highly insecticide-resistant CPB populations over recent decades5,6. Since the middle of the last century, the beetle has developed resistance to 52 different insecticide compounds.

Whole-genome sequencing is a fundamental tool for addressing important questions in biological research because it provides the full set of gene resources for a given species. The first genome assembly of L. decemlineata, based on Illumina short reads, was published in 20187, followed by an improved version, Ldec_2.0. These two versions of the CPB genome have provided useful gene resources for the beetle community8,9. However, owing to the limitations of short reads in genome assembly, the quality of the CPB genome still needs to be improved.

To this end, we applied PacBio HiFi sequencing and high-throughput chromosome conformation capture (Hi-C) technologies to generate a high-quality chromosome-level genome assembly of L. decemlineata (Fig. 2). The new assembly has a total scaffold length of 1,008.42 Mb anchored to 18 chromosomes (17 + XO). Compared with the published version Ldec_2.0, the scaffold N50 increased from 139 kb to 58.32 Mb. Benchmarking Universal Single-Copy Orthologs (BUSCO) analysis showed that gene completeness increased from 92.1% to 98.0% (Table 1). A total of 676 Mb of repeat sequences, representing 67.04% of the whole genome, were identified, far more than the 16.93% found in Ldec_2.0 (Table 1), suggesting that the new version of the CPB genome is more complete. Among these repeat sequences, 72.47% were classified as known repeat elements (Table 2). In addition, the number of protein-coding genes increased from 24,671 to 29,606, indicating that a more complete gene set was obtained. Most protein-coding genes identified in the previous version can be found in the new annotation. Functional categories were assigned based on the Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway and Gene Ontology (GO) databases (Table 3).

Fig. 2.

Heatmap of genome-wide Hi-C data (resolution = 500,000 bp) and overview of the genomic landscape of Leptinotarsa decemlineata. (a) The heatmap of chromosome interactions in L. decemlineata was visualized by HiCPlotter45. The frequency of Hi-C interaction links is represented by colours, which ranges from white (low) to red (high). (b) Circos plot of distribution of the genomic elements in L. decemlineata was visualized by Circos46. From the outer ring to the inner circle, blue, red and green represent GC content, repeat sequence coverage and gene density of each chromosome, respectively.

Table 1.

Comparison of two Leptinotarsa decemlineata genome assemblies.

Genome assembly | This study | Ldec_2.0*
Genome size (Mb) | 1008.42 | 641.99
Assembly level | Chromosome | Scaffold
Number of assembled chromosomes | 18 | Not available
Contig N50 (kb) | 8098.89 | 47.4
Scaffold N50 (Mb) | 58.32 | 0.139
BUSCO genes (%) | C:98.0% [S:89.5%, D:8.5%], F:0.8%, M:1.2% | C:92.1% [S:91.2%, D:0.9%], F:4.2%, M:3.7%
GC content (%) | 34.7 | 22.3
Number of genes | 29,606 | 24,671
Repeat (%) | 67.04 | 16.93

*Ldec_2.0 is the previously published assembly from Schoville et al.7.
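The BUSCO entries in Table 1 use the compact summary notation, in which complete (C) is the sum of single-copy (S) and duplicated (D), and C, fragmented (F) and missing (M) add up to 100%. A minimal Python sketch of reading the two strings from Table 1 and checking that arithmetic is given here; the function names and the tolerance are illustrative choices, not part of the study's pipeline.

```python
import re

def parse_busco(summary):
    """Parse a BUSCO summary string into a dict of percentages keyed by letter."""
    values = {}
    for key, number in re.findall(r"([CSDFM]):(\d+(?:\.\d+)?)%", summary):
        values[key] = float(number)
    return values

def is_consistent(busco, tol=0.11):
    """Check C = S + D and C + F + M = 100, allowing for rounding to one decimal."""
    parts_ok = abs(busco["S"] + busco["D"] - busco["C"]) <= tol
    total_ok = abs(busco["C"] + busco["F"] + busco["M"] - 100.0) <= tol
    return parts_ok and total_ok

# Strings copied from Table 1.
this_study = parse_busco("C:98.0% [S:89.5%, D:8.5%], F:0.8%, M:1.2%")
ldec_2_0 = parse_busco("C:92.1% [S:91.2%, D:0.9%], F:4.2%, M:3.7%")

# Gain in complete BUSCO genes between the two assemblies, in percentage points.
gain = round(this_study["C"] - ldec_2_0["C"], 1)
print(is_consistent(this_study), is_consistent(ldec_2_0), gain)
```

The duplicated fraction (D) is the value that differs most between the two assemblies, which is why it is kept as a separate field rather than folded into C.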

Table 2.

Statistics of repeat elements of Leptinotarsa decemlineata.

Repeat types | Number of elements | Length occupied (bp) | Percentage of sequence
SINE | 89 | 5392 | 0.00%
LINE | 518279 | 175527570 | 17.41%
LTR | 141193 | 66692839 | 6.61%
DNA elements | 295376 | 156214793 | 15.49%
Unclassified | 1110542 | 277582050 | 27.53%
Small RNA | 512 | 38038 | 0.00%
Total bases masked | 2065991 | 676060682 | 67.04%
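The percentages in Table 2 can be recomputed from the base-pair counts and the assembly size. The short Python sketch below does so, assuming the genome size of 1,008.42 Mb from Table 1 as the denominator (the exact base count is not given in the text, so the rounded value is used).

```python
# Assembly size from Table 1, rounded to 0.01 Mb (assumed denominator).
GENOME_SIZE_BP = 1_008_420_000

# Length occupied (bp) per repeat class, copied from Table 2.
repeat_bp = {
    "SINE": 5_392,
    "LINE": 175_527_570,
    "LTR": 66_692_839,
    "DNA elements": 156_214_793,
    "Unclassified": 277_582_050,
    "Small RNA": 38_038,
}

# Sum of all classes; should equal the "Total bases masked" row.
total_masked = sum(repeat_bp.values())

# Share of the genome occupied by each class, in percent.
percent = {name: round(100 * bp / GENOME_SIZE_BP, 2) for name, bp in repeat_bp.items()}
total_percent = round(100 * total_masked / GENOME_SIZE_BP, 2)

for name, value in percent.items():
    print(f"{name}: {value:.2f}%")
print(f"Total: {total_masked} bp, {total_percent:.2f}%")
```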

Table 3.

Functional annotation of protein-coding genes of Leptinotarsa decemlineata.

Genome annotation | Number of elements
Predicted protein-coding genes | 29606
Swiss-Prot | 14053
GO | 14606
KEGG | 9135
Pfam | 16628

A total of 418 single-copy orthologous genes were found among CPB and 15 other insect species (Table S1). These 1:1:1 orthologous genes were used to construct a phylogenetic tree. The evolutionary analysis showed that L. decemlineata and other Chrysomelidae beetles formed a cluster. Anoplophora glabripennis (family: Cerambycidae) diverged from L. decemlineata (family: Chrysomelidae) approximately 96.5 million years ago (mya), and Tribolium castaneum (family: Tenebrionidae) diverged from L. decemlineata approximately 152.5 mya9.

In total, 14,446 gene clusters were identified across the 16 species. Compared with other insect species, CPB had 1,260 expanded and 716 contracted gene families (Fig. 3, Table S2). REVIGO analysis indicated that the expanded orthogroups were enriched in DNA integration, macroautophagy, regulation of adenosine receptor signalling pathway and diverse other biological processes (Fig. 4a, Table S3). In contrast, the contracted orthogroups were significantly enriched in L-ornithine transmembrane transporter activity and virus receptor activity (Fig. 4b, Table S4).

Fig. 3.

Phylogenetic tree of Leptinotarsa decemlineata and 15 other insect species. The numbers of expanded gene families (green) and contracted gene families (red) are shown to the right of each species branch44.

Fig. 4.

Gene ontology (GO) enrichment of expanded and contracted orthogroups in Leptinotarsa decemlineata.

The whole genomes of Tribolium castaneum and Anthonomus grandis, two other species in Coleoptera, have been publicly reported10,11; we therefore performed whole-genome synteny analysis of L. decemlineata with these two species. A large number of fission and fusion events were identified between L. decemlineata and the other two beetles, suggesting that the beetle family Chrysomelidae has undergone a high degree of divergence. CPB has an XO sex-determination system12. Synteny analysis also showed that CPB chromosome 6 (Chr 6) shares high sequence synteny with the X chromosome of T. castaneum (Fig. 5). The gene LdVssc has been reported as X-linked13, and this gene is located on Chr 6. Taken together, this evidence indicates that CPB Chr 6 is the X chromosome.

Fig. 5.

Comparative analysis of synteny among Leptinotarsa decemlineata, Tribolium castaneum and Anthonomus grandis. (a) Whole-genome synteny between Leptinotarsa decemlineata and Tribolium castaneum. (b) Whole-genome synteny between Leptinotarsa decemlineata and Anthonomus grandis.

As the first high-quality chromosome-level genome assembly in Chrysomelidae, the L. decemlineata assembly not only illuminates the genetic architecture of this important agricultural pest and provides a powerful resource for identifying new gene targets for control measures, but also allows exploration of the biological characteristics of Chrysomelidae beetles.

Methods

Sample collection and sequencing

Leptinotarsa decemlineata adults were collected from Xinjiang Province, China. The adults were fed with fresh potato leaves and maintained at 26 ± 1 °C, under a 14:10-hr (light–dark) photoperiod cycle and 85% ± 5% relative humidity.

Genomic DNA was extracted from one female pupa using the QIAamp DNA Mini Kit (QIAGEN). The sex of the CPB pupa was identified by observing the 7th visible sternite14: in the female pupa it is separated in the middle by a suture, whereas in the male pupa it is complete and depressed in the centre. The integrity and purity of the DNA were verified by agarose gel electrophoresis (AGE) and with a Nanodrop 2000. Eight micrograms of genomic DNA were sheared using g-Tubes (Covaris) and concentrated with AMPure PB magnetic beads. Each SMRTbell library was constructed using the Pacific Biosciences SMRTbell Template Prep Kit 1.0. The constructed library was size-selected for molecules of 8–12 kb using the Sage ELF system, followed by primer annealing and binding of the SMRTbell templates to polymerases with the DNA Polymerase Binding Kit. Sequencing was carried out on the Pacific Biosciences Sequel II platform (Annoroad Gene Technology Co., Ltd, Beijing, China).

Chromosome-level genome assembly of L. decemlineata

HiFi reads were produced using the circular consensus sequencing (CCS) mode on the PacBio long-read system. A total of 31 Gb of HiFi reads (~30× coverage) was produced, with an average read length of 19,479 bp. De novo assembly of the PacBio HiFi reads was performed using Hifiasm v0.1314.
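The stated coverage follows from dividing the total HiFi yield by the genome size. The Python sketch below reproduces that arithmetic using the rounded figures quoted in the text (31 Gb of reads, a 1,008.42 Mb assembly, 19,479 bp average read length); because the inputs are rounded, the read count it prints is only a rough estimate and is not a number reported by the study.

```python
def sequencing_depth(total_bases, genome_size):
    """Average depth of coverage: sequenced bases divided by genome size."""
    return total_bases / genome_size

hifi_bases = 31e9            # 31 Gb of HiFi reads (rounded, from the text)
assembly_size = 1_008.42e6   # assembly length in bp (Table 1)
mean_read_length = 19_479    # average HiFi read length in bp

depth = sequencing_depth(hifi_bases, assembly_size)

# Rough number of reads implied by the yield and the mean read length.
approx_reads = hifi_bases / mean_read_length

print(f"depth ~ {depth:.1f}x, reads ~ {approx_reads:,.0f}")
```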

Hi-C libraries were constructed using a standard procedure15 and sequenced on the Illumina HiSeq X Ten platform (Annoroad Gene Technology Co., Ltd, Beijing, China). The clean reads were first aligned to the genome assembly using Bowtie 2 v2.2.316. Unmapped reads mainly consisted of chimeric fragments spanning the ligation junction. The ligation site of each unmapped read was determined with HiC-Pro v2.7.817, and its 5′ fraction was then aligned back to the genome assembly. The results of both mapping steps were merged into a single alignment file. Reads with low mapping quality, reads with multiple matches in the assembly, singletons and mitochondrial DNA reads were discarded. The valid interaction pairs were used to scaffold the assembled contigs into 18 pseudo-chromosomes using LACHESIS v2e27abb18. The number of pseudo-chromosomes was consistent with the L. decemlineata karyotype (n = 17 + XO)19. The chromosome interaction matrix was visualized as a heatmap, which showed diagonal patches of strong linkage (Fig. 2a). The quality and completeness of the assembled genome were evaluated using BUSCO v5.020.
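A Hi-C heatmap such as Fig. 2a is drawn from a matrix of contact counts between fixed-size genomic bins (500,000 bp in the Fig. 2 caption). As a minimal illustration of that binning step, the Python sketch below counts contacts per bin pair. The input layout of (chromosome, position, chromosome, position) tuples and the example coordinates are hypothetical; this is not the HiC-Pro or HiCPlotter implementation.

```python
from collections import Counter

BIN_SIZE = 500_000  # resolution given in the Fig. 2 caption

def bin_contacts(pairs, bin_size=BIN_SIZE):
    """Count Hi-C contacts per pair of genomic bins.

    `pairs` is an iterable of (chrom1, pos1, chrom2, pos2) tuples
    (an assumed layout chosen for illustration).
    """
    counts = Counter()
    for chrom1, pos1, chrom2, pos2 in pairs:
        a = (chrom1, pos1 // bin_size)
        b = (chrom2, pos2 // bin_size)
        # Use a sorted key so that (a, b) and (b, a) are counted together.
        key = (a, b) if a <= b else (b, a)
        counts[key] += 1
    return counts

# Hypothetical valid pairs, not data from the study.
example_pairs = [
    ("Chr1", 120_000, "Chr1", 480_000),    # both ends in the same bin
    ("Chr1", 510_000, "Chr1", 90_000),     # neighbouring bins
    ("Chr1", 90_000, "Chr1", 700_000),     # same bin pair, opposite order
    ("Chr1", 1_200_000, "Chr2", 300_000),  # inter-chromosomal contact
]
contacts = bin_contacts(example_pairs)
for key, count in sorted(contacts.items()):
    print(key, count)
```

In a correctly scaffolded assembly, most counts fall in bin pairs close to the diagonal of each chromosome, which is the pattern the heatmap is used to check.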

Gene prediction and functional annotation

A repeat database was used to train RepeatModeler221. The repeat elements were then annotated using RepeatMasker v4.1.022 by homology searching with default parameters. After filtering the repeat sequences, the results of de novo, transcriptome-based and homology-based methods were combined to predict gene composition23. De novo gene models were generated using BRAKER2 v.2.1.524. Thirteen CPB transcriptomes were downloaded from the NCBI SRA database (SRR12121893, SRR13510813, SRR13510819, SRR13510821, SRR13510823, SRR9667707, SRR12121892, SRR13510812, SRR13510818, SRR13510820, SRR13510822, SRR9667699.1, SRR9667708). The transcriptomes were processed using Trimmomatic25, HISAT2 v.2.1.026 and StringTie2 v.2.1.527 to generate transcript assemblies. Homologous proteins from all insect species were obtained from OrthoDB28. Homology-based evidence was generated using GenomeThreader v.1.7.129. Finally, gene models were predicted by integrating the results of the three prediction methods using EVidenceModeler30.

The functions of protein-coding genes were annotated using DIAMOND BLASTP against the Swiss-Prot protein database (https://www.uniprot.org/) and Pfam database (http://pfam.xfam.org/). The predicted genes were classified into functional categories based on KEGG (https://www.genome.jp/kegg) and GO (https://www.uniprot.org/) (Table 3).

Phylogenetic analysis

We selected 15 coleopteran species for phylogenomic analysis, with Chrysoperla carnea (Order: Neuroptera) as an out-group. The protein sequences of all taxa except CPB were downloaded from NCBI and InsectBase 2.023 (Table S1).

A total of 418 single-copy orthogroups were extracted using Broccoli v1.231. The protein sequences in each orthogroup were extracted using seqkit v2.2.032, independently aligned using MAFFT v7.47133 and filtered using trimAl v1.434 with default parameters. The phylogenetic tree was constructed using IQ-TREE v1.6.1035 with the following parameters: -nt AUTO -m TEST -bb 1000. Branch support values were obtained from 1,000 ultrafast bootstrap replicates. The divergence times among species were estimated using MCMCtree in the PAML package v4.9j36. Three standard divergence time points based on fossil records in the Paleobiology Database (www.paleobiodb.org) were applied: (a) stem Chrysomeloidea at 93.5–99.6 mya; (b) stem Coleoptera at 166.1–168.3 mya; and (c) stem Coccinellidae at 295.5–298.9 mya.
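Selecting single-copy (1:1:1) orthogroups amounts to keeping only the groups that contain exactly one gene from every species in the analysis. A minimal Python sketch of that filter follows. The dictionary layout, species codes and gene names are hypothetical placeholders and do not reflect Broccoli's actual output files.

```python
def single_copy_orthogroups(orthogroups, species):
    """Return IDs of orthogroups with exactly one gene in every listed species."""
    wanted = set(species)
    selected = []
    for og_id, genes_by_species in orthogroups.items():
        present_in_all = set(genes_by_species) == wanted
        if present_in_all and all(len(genes_by_species[s]) == 1 for s in wanted):
            selected.append(og_id)
    return selected

# Hypothetical three-species example (the study used 16 species).
species = ["Ldec", "Tcas", "Agla"]
orthogroups = {
    "OG1": {"Ldec": ["g1"], "Tcas": ["t1"], "Agla": ["a1"]},        # single-copy
    "OG2": {"Ldec": ["g2", "g3"], "Tcas": ["t2"], "Agla": ["a2"]},  # duplicated in one species
    "OG3": {"Ldec": ["g4"], "Tcas": ["t3"]},                        # missing in one species
    "OG4": {"Ldec": ["g5"], "Tcas": ["t4"], "Agla": ["a3"]},        # single-copy
}
single_copy = single_copy_orthogroups(orthogroups, species)
print(single_copy)
```

Each retained orthogroup contributes one sequence per species, so the per-group alignments can be trimmed and concatenated without having to choose among paralogs.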

Gene family expansion and contraction

The expansion and contraction of gene families were determined using CAFE v5.0.02937. The phylogenetic tree with divergence times was used as input. A p-value threshold of 0.05 was used to identify families that were significantly expanded or contracted. Gene ontology (GO) enrichment of the expanded and contracted orthogroups of L. decemlineata was analysed and visualized with REVIGO38. The dispensability (i.e., redundancy with respect to the chosen representative GO term) of the GO terms was less than 0.1.
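The significance filter described above can be sketched in Python as follows. The record layout (family ID, change in gene count on the CPB branch, p-value) and the example values are hypothetical and do not reproduce CAFE's output format; treating the threshold as a strict "p < 0.05" is also an assumption, since the text only gives the value 0.05.

```python
P_VALUE_CUTOFF = 0.05  # threshold stated in the text

def classify_families(families, cutoff=P_VALUE_CUTOFF):
    """Split gene families into significantly expanded and contracted lists.

    `families` is a list of (family_id, gene_count_change, p_value) tuples
    (a hypothetical layout used only for this illustration).
    """
    expanded, contracted = [], []
    for family_id, change, p_value in families:
        if p_value >= cutoff or change == 0:
            continue  # not significant, or no net change on this branch
        (expanded if change > 0 else contracted).append(family_id)
    return expanded, contracted

# Invented example records, not results from the study.
example_families = [
    ("OG0001", +5, 0.001),
    ("OG0002", -3, 0.020),
    ("OG0003", +1, 0.300),
    ("OG0004", -2, 0.049),
    ("OG0005", 0, 0.010),
]
expanded, contracted = classify_families(example_families)
print(expanded, contracted)
```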

Chromosomal synteny analysis

The whole-genome synteny analysis among the three species was carried out using satsuma2 (https://github.com/bioinfologics/satsuma2). Synteny blocks were plotted across chromosomes using CIRCOS39.

Identification of sex chromosomes

To identify the X chromosome, BLASTN was used with default parameters to map the X-linked locus LdVssc to the 18 CPB chromosomes.
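One way to read such a search is to total the alignment scores per chromosome and take the chromosome with the highest total. The Python sketch below does this for hits in the standard 12-column BLAST tabular layout (query, subject, identity, length, mismatches, gap opens, query start/end, subject start/end, e-value, bit score). The use of tabular output and all hit values shown are assumptions made for illustration; the text states only that default parameters were used.

```python
from collections import defaultdict

def bitscore_by_subject(blast_tab_text):
    """Sum bit scores per subject sequence from 12-column BLAST tabular text."""
    totals = defaultdict(float)
    for line in blast_tab_text.strip().splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        subject = fields[1]           # column 2: subject sequence ID
        totals[subject] += float(fields[11])  # column 12: bit score
    return dict(totals)

def best_chromosome(blast_tab_text):
    """Return the subject with the highest summed bit score."""
    totals = bitscore_by_subject(blast_tab_text)
    return max(totals, key=totals.get)

# Invented hits, not results from the study.
example_rows = [
    ("LdVssc", "Chr6", "99.8", "6100", "12", "0", "1", "6100", "1500000", "1506099", "0.0", "11200"),
    ("LdVssc", "Chr2", "84.1", "240", "38", "0", "3001", "3240", "88000", "88239", "2e-60", "230"),
    ("LdVssc", "Chr6", "98.9", "450", "5", "0", "5500", "5949", "1520000", "1520449", "0.0", "800"),
]
example_text = "\n".join("\t".join(row) for row in example_rows)
totals = bitscore_by_subject(example_text)
print(best_chromosome(example_text), totals)
```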

Data Records

The PacBio and Hi-C sequencing data used for the genome assembly have been deposited in the NCBI Sequence Read Archive with accession numbers SRR2051912440,41 and SRR2109553642 under BioProject accession number PRJNA854273. The chromosomal assembly has been deposited at GenBank with accession number JANJPO00000000043. The annotated genes have been deposited in InsectBase 2.0 with ID IBG_0081844.

Technical Validation

The chromosome-level genome assembly was 1,008 Mb in length, with a scaffold N50 of 58.32 Mb. BUSCO assessment showed that 98.0% of BUSCO genes (insecta_odb10) were identified in the genome assembly (Table 1), indicating a highly complete assembly of the L. decemlineata genome.
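N50 is the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the assembly. A short Python sketch of that definition is given below, followed by the fold change in scaffold N50 between the two assemblies in Table 1. The toy scaffold lengths are invented for illustration.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        # Stop at the first sequence that brings the running sum to half the total.
        if 2 * running >= total:
            return length
    raise ValueError("n50() requires at least one sequence length")

# Toy scaffold lengths (arbitrary units), not data from the study.
toy_lengths = [80, 70, 50, 40, 30, 20]
toy_n50 = n50(toy_lengths)

# Scaffold N50 values from Table 1, both in Mb.
fold_increase = 58.32 / 0.139

print(toy_n50, round(fold_increase))
```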

The Hi-C heatmap revealed a well-organized interaction contact pattern along the diagonals within/around the chromosome inversion region (Fig. 2a), which indirectly confirmed the accuracy of the chromosome assembly.

Fig. 1.

The Colorado potato beetle, Leptinotarsa decemlineata.

Acknowledgements

This work was supported by the Guangdong Major Project of Basic and Applied Basic Research (2021B0301030004), the National Key Research and Development Program of China (2018YFD0200802) and the National Natural Science Foundation of China (32102271).

Author contributions

Y.G. and F.L. conceived the research project. J.Y., X.D. and M.Z. led the collection of samples and population metadata. C.Z., Z.H. and F.L. performed the bioinformatic analyses. Y.G., F.L. and R.Z. wrote the manuscript.

Code availability

All software and pipelines were executed according to the manual and protocols of the published bioinformatic tools. The version and code/parameters of software have been described in Methods.

Competing interests

The authors declare no competing interests.

Footnotes

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

These authors contributed equally: Junjie Yan, Chaowei Zhang.

Contributor Information

Fei Li, Email: lifei18@zju.edu.cn.

Yulin Gao, Email: gaoyulin@caas.cn.

Supplementary information

The online version contains supplementary material available at 10.1038/s41597-023-01950-5.

References

  • 1. Zehnder GW. Timing of Insecticides for Control of Colorado Potato Beetle (Coleoptera: Chrysomelidae) in Eastern Virginia Based on Differential Susceptibility of Life Stages. J. Econ. Entomol. 1986;79:851–856. doi: 10.1093/jee/79.3.851.
  • 2. Logan PA, Casagrande RA, Faubert HH, Drummond FA. Temperature-dependent development and feeding of immature Colorado potato beetles, Leptinotarsa decemlineata (Say) (Coleoptera: Chrysomelidae). Environ. Entomol. 1985;14:275–283. doi: 10.1093/ee/14.3.275.
  • 3. Ferro DN, Alyokhin AV, Tobin DB. Reproductive status and flight activity of the overwintered Colorado potato beetle. Entomol. Exp. Appl. 1999;91:443–448. doi: 10.1046/j.1570-7458.1999.00512.x.
  • 4. Harris CR, Svec HJ. Colorado potato beetle resistance to carbofuran and several other insecticides in Quebec. J. Econ. Entomol. 1981;74:421–424. doi: 10.1093/jee/74.4.421.
  • 5. Gauthier, N. L., Hofmaster, R. N. & Semel, M. History of Colorado potato beetle control. In “Advances in potato pest management” (Lashomb, J. H. & Casagrande, R. Eds.). Hutchinson Ross Stroudsberg Pa. 13–33 (1981).
  • 6. Alyokhin A, Baker M, Mota-Sanchez D, Dively G, Grafius E. Colorado Potato Beetle Resistance to Insecticides. Am. J. Potato Res. 2008;85:395–413. doi: 10.1007/s12230-008-9052-0.
  • 7. Schoville SD, et al. A model species for agricultural pest genomics: the genome of the Colorado potato beetle, Leptinotarsa decemlineata (Coleoptera: Chrysomelidae). Sci. Rep. 2018;8:1931. doi: 10.1038/s41598-018-20154-1.
  • 8. Dunn NA, et al. Apollo: Democratizing genome annotation. PLOS Comput. Biol. 2019;15:e1006790. doi: 10.1371/journal.pcbi.1006790.
  • 9. Thomas GWC, et al. Gene content evolution in the arthropods. Genome Biol. 2020;21:15. doi: 10.1186/s13059-019-1925-7.
  • 10. Tribolium Genome Sequencing Consortium. The genome of the model beetle and pest Tribolium castaneum. Nature. 2008;452:949–955. doi: 10.1038/nature06784.
  • 11. Cohen, Z. P. et al. Insight into weevil biology from a reference quality genome of the boll weevil, Anthonomus grandis grandis Boheman (Coleoptera: Curculionidae). G3 GenesGenomesGenetics jkac309 (2022).
  • 12. Hsiao TH, Hsiao C. Chromosomal analysis of Leptinotarsa and Labidomera species (Coleoptera: Chrysomelidae). Genetica. 1983;60:139–150. doi: 10.1007/BF00127500.
  • 13. Hawthorne DJ. AFLP-Based Genetic Linkage Map of the Colorado Potato Beetle Leptinotarsa decemlineata: Sex Chromosomes and a Pyrethroid-Resistance Candidate Gene. Genetics. 2001;158:695–700. doi: 10.1093/genetics/158.2.695.
  • 14. Y P. A method for sex determination of the Colorado potato beetle pupa, Leptinotarsa decemlineata (Coleoptera: Chrysomelidae). Entomol. News. 1993;104:140–142.
  • 15. Cheng H, Concepcion GT, Feng X, Zhang H, Li H. Haplotype-resolved de novo assembly using phased assembly graphs with hifiasm. Nat. Methods. 2021;18:170–175. doi: 10.1038/s41592-020-01056-5.
  • 16. Shi J, et al. Chromosome conformation capture resolved near complete genome assembly of broomcorn millet. Nat. Commun. 2019;10:464. doi: 10.1038/s41467-018-07876-6.
  • 17. Langmead B, Salzberg SL. Fast gapped-read alignment with Bowtie 2. Nat. Methods. 2012;9:357–359. doi: 10.1038/nmeth.1923.
  • 18. Servant N, et al. HiC-Pro: an optimized and flexible pipeline for Hi-C data processing. Genome Biol. 2015;16:259. doi: 10.1186/s13059-015-0831-x.
  • 19. Burton JN, et al. Chromosome-scale scaffolding of de novo genome assemblies based on chromatin interactions. Nat. Biotechnol. 2013;31:1119–1125. doi: 10.1038/nbt.2727.
  • 20. Petitpierre, E., Segarra, C., Yadav, J. S. & Virkki, N. Chromosome Numbers and Meioformulae of Chrysomelidae. in Biology of Chrysomelidae (eds. Jolivet, P., Petitpierre, E. & Hsiao, T. H.) 161–186, 10.1007/978-94-009-3105-3_10 (Springer Netherlands, 1988).
  • 21. Krzywinski M, et al. Circos: An information aesthetic for comparative genomics. Genome Res. 2009;19:1639–1645. doi: 10.1101/gr.092759.109.
  • 22. Flynn JM, et al. RepeatModeler2 for automated genomic discovery of transposable element families. Proc. Natl. Acad. Sci. 2020;117:9451–9457. doi: 10.1073/pnas.1921046117.
  • 23. Tarailo‐Graovac, M. & Chen, N. Using RepeatMasker to Identify Repetitive Elements in Genomic Sequences. Curr. Protoc. Bioinforma. 25 (2009).
  • 24. Mei Y, et al. InsectBase 2.0: a comprehensive gene resource for insects. Nucleic Acids Res. 2022;50:D1040–D1045. doi: 10.1093/nar/gkab1090.
  • 25. Brůna T, Hoff KJ, Lomsadze A, Stanke M, Borodovsky M. BRAKER2: automatic eukaryotic genome annotation with GeneMark-EP+ and AUGUSTUS supported by a protein database. NAR Genomics Bioinforma. 2021;3:lqaa108. doi: 10.1093/nargab/lqaa108.
  • 26. Bolger AM, Lohse M, Usadel B. Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics. 2014;30:2114–2120. doi: 10.1093/bioinformatics/btu170.
  • 27. Kubota A, et al. Cytochrome P450 CYP2 genes in the common cormorant: Evolutionary relationships with 130 diapsid CYP2 clan sequences and chemical effects on their expression. Comp. Biochem. Physiol. Part C Toxicol. Pharmacol. 2011;153:280–289. doi: 10.1016/j.cbpc.2010.11.006.
  • 28. Kovaka S. Transcriptome assembly from long-read RNA-seq alignments with StringTie2. Genome Biol. 2019;20:1–13. doi: 10.1186/s13059-019-1910-1.
  • 29. Kuznetsov, D. et al. OrthoDB v11: annotation of orthologs in the widest sampling of organismal diversity. Nucleic Acids Res. gkac998 (2022).
  • 30. Gremme G, Brendel V, Sparks ME, Kurtz S. Engineering a software tool for gene structure prediction in higher organisms. Inf. Softw. Technol. 2005;47:965–978. doi: 10.1016/j.infsof.2005.09.005.
  • 31. Haas BJ, et al. Automated eukaryotic gene structure annotation using EVidenceModeler and the Program to Assemble Spliced Alignments. Genome Biol. 2008;9:R7. doi: 10.1186/gb-2008-9-1-r7.
  • 32. Derelle R, Philippe H, Colbourne JK. Broccoli: Combining Phylogenetic and Network Analyses for Orthology Assignment. Mol. Biol. Evol. 2020;37:3389–3396. doi: 10.1093/molbev/msaa159.
  • 33. Shen W, Le S, Li Y, Hu F. SeqKit: A Cross-Platform and Ultrafast Toolkit for FASTA/Q File Manipulation. PLOS ONE. 2016;11:e0163962. doi: 10.1371/journal.pone.0163962.
  • 34. Katoh K, Standley DM. MAFFT multiple sequence alignment software version 7: improvements in performance and usability. Mol. Biol. Evol. 2013;30:772–780. doi: 10.1093/molbev/mst010.
  • 35. Capella-Gutiérrez S, Silla-Martínez JM, Gabaldón T. trimAl: a tool for automated alignment trimming in large-scale phylogenetic analyses. Bioinformatics. 2009;25:1972–1973. doi: 10.1093/bioinformatics/btp348.
  • 36. Nguyen L-T, Schmidt HA, Von Haeseler A, Minh BQ. IQ-TREE: a fast and effective stochastic algorithm for estimating maximum-likelihood phylogenies. Mol. Biol. Evol. 2015;32:268–274. doi: 10.1093/molbev/msu300.
  • 37. Yang Z. PAML 4: Phylogenetic Analysis by Maximum Likelihood. Mol. Biol. Evol. 2007;24:1586–1591. doi: 10.1093/molbev/msm088.
  • 38. Mendes FK, Vanderpool D, Fulton B, Hahn MW. CAFE 5 models variation in evolutionary rates among gene families. Bioinformatics. 2021;36:5516–5518. doi: 10.1093/bioinformatics/btaa1022.
  • 39. Supek F, Bošnjak M, Škunca N, Šmuc T. REVIGO Summarizes and Visualizes Long Lists of Gene Ontology Terms. PLOS ONE. 2011;6:e21800. doi: 10.1371/journal.pone.0021800.
  • 40. Krzywinski MI, et al. Circos: An information aesthetic for comparative genomics. Genome Res. 2009;19:1639–1645. doi: 10.1101/gr.092759.109.
  • 41. 2022. NCBI Sequence Read Archive. SRR20519124
  • 42. 2022. NCBI Sequence Read Archive. SRR21095536
  • 43. 2022. NCBI GenBank. JANJPO000000000
  • 44. Leptinotarsa decemlineata in InsectBase 2.0 http://v2.insect-genome.com/Ldec
  • 45. Manni M, Berkeley MR, Seppey M, Simão FA, Zdobnov EM. BUSCO Update: Novel and Streamlined Workflows along with Broader and Deeper Phylogenetic Coverage for Scoring of Eukaryotic, Prokaryotic, and Viral Genomes. Mol. Biol. Evol. 2021;38:4647–4654. doi: 10.1093/molbev/msab199.
  • 46. Akdemir KC, Chin L. HiCPlotter integrates genomic data with interaction matrices. Genome Biol. 2015;16:1–8. doi: 10.1186/s13059-015-0767-1.


Articles from Scientific Data are provided here courtesy of Nature Publishing Group
