Abstract
The ithomiine butterflies (Nymphalidae: Danainae) represent the largest known radiation of Müllerian mimetic butterflies. They numerically dominate mimetic butterfly communities, which include species such as the iconic neotropical Heliconius genus. Recent studies on the ecology and genetics of speciation in Ithomiini have suggested that sexual pheromones, colour pattern and perhaps hostplant could drive reproductive isolation. However, no reference genome was available for Ithomiini, which has hindered further exploration of the genetic architecture of these candidate traits, and more generally of the genomic patterns of divergence. Here, we generated high-quality, chromosome-scale genome assemblies for two Melinaea species, M. marsaeus and M. menophilus, and a draft genome of the species Ithomia salapia. We obtained genomes with a size ranging from 396 to 503 Mb across the three species and scaffold N50 of 40.5 and 23.2 Mb for the two chromosome-scale assemblies. Using collinearity analyses we identified massive rearrangements between the two closely related Melinaea species. An annotation of transposable elements and gene content was performed, as well as a specialist annotation to target chemosensory genes, which are crucial for host plant detection and mate recognition in mimetic species. A comparative genomic approach revealed independent gene expansions in ithomiines and particularly in gustatory receptor genes. These first three genomes of ithomiine mimetic butterflies constitute a valuable addition and a welcome comparison to existing biological models such as Heliconius, and will enable further understanding of the mechanisms of adaptation in butterflies.
Keywords: chromosome-level genome, Hi-C, ithomiine butterflies, mimicry, olfaction
1. Introduction
The butterfly tribe Ithomiini (Nymphalidae: Danainae), which comprises 393 species, represents the largest known radiation of Müllerian mimetic butterflies, whereby co-occurring chemically-defended species converge in wing colour pattern, which acts as a warning signal learned and avoided by predators (Muller, 1879; Sherratt, 2008). Ithomiine butterflies are endemic to the neotropics, where they numerically dominate butterfly communities in lowland and mountain forests up to 2500 m, and where they engage in mimetic interactions with many other Lepidoptera (Beccaloni, 1997).
As such, ithomiine butterflies have an important ecological relevance. It is thus no wonder that ithomiine species served as examples in Bates’ (Bates, 1862) and Müller’s (Muller, 1879) original descriptions of Batesian (where palatable prey mimic distasteful ones) and Müllerian mimicry, respectively. Ithomiine butterflies are also remarkable in that many species have the unusual characteristic of harbouring partially transparent or translucent wings (McClure, Clerc, et al., 2019; Pinna et al., 2021). Mimetic butterflies have long attracted speciation research, as they usually feature assortative mating for wing colour patterns (e.g., Jiggins et al., 2001), combined with selection against hybrids between forms with different colour patterns (e.g., Merrill et al., 2014), because such hybrids typically harbour intermediate, nonmimetic colour patterns. The iconic genus Heliconius has been the focus of multiple such speciation studies, using both experimental (Jiggins et al., 2001; Merrill et al., 2014) and genomic (Martin et al., 2013; Merrill et al., 2019; Nadeau et al., 2012) approaches.
While colour pattern is believed to be a strong driver of diversification of mimetic butterflies (Kozak et al., 2015), including, possibly, Ithomiini (Jiggins et al., 2006), chemosensory communication may also be involved in speciation. Selection for similarity on a mating cue among co-occurring species is likely to result in reproductive interference (Boussens-Dumon & Llaurens, 2021; Estrada & Jiggins, 2008), raising the question of alternative mate recognition cues. Chemical signals such as sex pheromones have been suggested to play a role in reproductive isolation in mimetic butterflies (Darragh et al., 2020; González-Rojas et al., 2020), particularly among co-mimetic species (Mérot et al., 2015). In ithomiine butterflies putative sexual pheromones have long been studied (Schulz et al., 2004), and have been shown to diverge between closely related taxa (Mann et al., 2020; McClure, Mahrouche, et al., 2019; Stamm et al., 2019), suggesting a possible role in reproductive isolation (McClure, Clerc, et al., 2019). More broadly, butterflies are phytophagous during the larval stage, and hostplant adaptation, mediated by chemical communication, has been hypothesized to be a major driver of speciation (Ehrlich & Raven, 1964; Jousselin & Elias, 2019). In Ithomiini, where butterfly-plant interaction tends to be very specific (Willmott & Mallet, 2004), divergent selection on hostplant has been documented (e.g., McClure & Elias, 2016b). Chemosensory and associated genes (i.e., all genes involved in chemical communication) thus represent particularly relevant targets for the study of speciation in mimetic butterflies. In butterflies, the detection of chemical signals is mainly performed by three types of membrane receptors, named odorant receptors (ORs), gustatory receptors (GRs) and ionotropic receptors (IRs), and two families of secreted proteins, the odorant-binding proteins (OBPs) and the chemosensory proteins (CSPs) (Pelosi et al., 2006; Robertson, 2019).
The role of specific lineages of the OR gene family in the detection of volatile sex pheromones has been characterized in moths (Montagné et al., 2021). However, little is known of the molecular bases of pheromone detection in butterflies (Eyres et al., 2016; van Schooten et al., 2020). In Ithomiini, only one recent study addressed chemosensory genes, and found that one OR was differentially expressed between two subspecies of Melinaea marsaeus (Piron-Prunier et al., 2021), suggesting a possible role of chemical communication in mate choice.
Likewise, in contrast to Heliconius, little is known about the overall genomic patterns of speciation in Ithomiini. Two studies, one using microsatellites and the other relying on reduced-complexity genomic data, revealed a range of levels of genetic differentiation among subspecies in five ithomiine species (Gauthier et al., 2020; McClure, Mahrouche, et al., 2019), calling for more in-depth studies of population genetic structure and patterns of gene flow.
Despite these needs, research on speciation in Ithomiini is hindered by the lack of reference genomes. The paucity of genomic resources for Ithomiini is surprising, given their ecological and historical importance. The closest reference genome is that of the monarch butterfly, Danaus plexippus (Gu et al., 2019; Zhan et al., 2011), which belongs to the nymphalid tribe Danaini and that diverged from the Ithomiini tribe circa 42 million years ago (Chazot et al., 2019).
Here we present the first genomes of three Ithomiini species, Ithomia salapia (subspecies aquinia), Melinaea marsaeus (subspecies rileyi) and Melinaea menophilus (subspecies ssp nov). Ithomia salapia is a typical “clearwing” ithomiine butterfly, in that it shows transparent or translucent wings (Figure 1). Subspecies of I. salapia belong to large mimicry rings that include ithomiine and non ithomiine species (Beccaloni, 1997; Willmott et al., 2017). The genus Ithomia belongs to the Ithomiine “core-group”, a clade that encompasses 80% of the species of the tribe and that underwent steady diversification in the Central Andes during the Miocene before colonizing other neotropical regions (Chazot et al., 2019). A recent population genetic study in a suture zone showed that gene flow between subspecies of I. salapia was highly reduced, suggesting incipient speciation (Gauthier et al., 2020). The genus Melinaea (Figure 1) belongs to a basal Amazonian lineage that probably experienced high extinction rates during the Miocene before diversifying at a higher pace during the last couple of million years (Chazot et al., 2019). Melinaea species engage in mimetic interactions with multiple other Lepidoptera, including species from the tribe Heliconiini (Beccaloni, 1997). Also, and contrasting with I. salapia, genetic studies based on microsatellite and coding sequences found an exceptionally low level of divergence among Melinaea subspecies and even species (Chazot et al., 2019; Dasmahapatra et al., 2010; McClure, Mahrouche, et al., 2019), which may indicate recent diversification or extensive gene flow. Another intriguing feature in the genus Melinaea is the high karyotypic lability, with multiple events of chromosomal fusion recorded between two closely related subspecies (Brown Jr et al., 2004; McClure et al., 2017).
Figure 1.
Melinaea marsaeus, Melinaea menophilus, Ithomia salapia and wing pattern variation between subspecies of each of these species (source Joron et al., 2006 and photograph credits Céline Houssin)
Because the genomes of these three species are large and highly heterozygous, it has been necessary to test and combine different sequencing methods. The genomes of M. marsaeus and M. menophilus presented here were assembled combining PacBio HiFi, 10x and HiC, which allowed us to assemble genomes at the chromosome level. The I. salapia genome, obtained with 10x sequencing, is more fragmented and can be considered as a draft genome. For each of the genomes we generated gene annotations using a pipeline that incorporated transcriptomic data and manually annotated the chemosensory gene families, as these families are usually badly predicted by automatic annotations.
2. Materials and Methods
2.1. Sample collection, DNA extraction, library construction and sequencing
Females of I. salapia aquinia were collected in Urahuasha (6°27’ S, 76°20’ W, San Martin, Peru) and kept in captivity, where they were presented with potted Witheringia solanacea for egg-laying. Females of M. marsaeus rileyi and M. menophilus ssp nov were collected in Micaela (5°56’ S, 76°14’ W, Loreto province, Peru) and Urahuasha, respectively, and kept in captivity in Tarapoto (San Martin, Peru), where they were presented with potted Juanulloa parasitica on which they laid eggs. Larvae of all species were reared on their host plants until pupation, and pupae were preserved in empty plastic vials at −80°C until extraction.
For the genomes of M. marsaeus (ilMelMars1.1) and M. menophilus (ilMelMeno1.1), DNA extraction, library preparation and sequencing were performed by the Scientific Operations core at the Wellcome Sanger Institute. DNA was extracted from flash-frozen pupae of female butterflies with the Qiagen MagAttract HMW DNA kit. Pacific Biosciences (PacBio) HiFi libraries were sequenced on a PacBio SEQUEL II. 10x Genomics Chromium version 2 libraries and HiC Arima version 2.0 libraries were constructed according to the manufacturer’s instructions and sequenced on Illumina HiSeq X instruments.
DNA of the two individuals used for the genome of I. salapia was extracted following a protocol adapted from Mayjonade et al. (2016). Samples were snap frozen alive in liquid nitrogen and conserved at −80°C. DNA was extracted from the whole butterfly bodies with the exception of the head. Butterflies were ground in a frozen mortar with liquid nitrogen, and 150 mg of tissue powder was mixed with 900 μl of preheated buffer and 6 μl of RNaseA. Tubes were incubated for 120 min at 50°C for lysis, and then at −10°C for 10 min, with the addition of 300 μl of potassium acetate for the precipitation. One volume of binding buffer was added with 100 μl of Serapure beads solution. Three washing cycles were used and DNA was resuspended in 100 μl of EB buffer. Library construction, including adaptor ligation and size selection, was performed according to the manufacturer’s instructions. The two 10x Chromium Genome libraries were sequenced on one lane of the HiSeq 2500 with a 250PE-RR read metric.
2.2. Transcriptomic data
For M. marsaeus and I. salapia, transcriptomic data were generated from various body parts (abdomen, thorax, head) and developmental stages (adult, pupae and two larval stages) (detailed in Table 1) to maximize transcript diversity. In addition, targeted tissues from pupal wing discs and antennae in M. marsaeus were used (Piron-Prunier et al., 2021). Tissue samples were homogenized in 600 μl of RLT buffer with TissueLyser (Qiagen). Total RNA was then extracted according to the manufacturer’s protocol (RNeasy Mini kit, Qiagen) and eluted in 30 μl of RNase-free water. To avoid genomic contamination, RNase-free DNase treatment (Qiagen) was performed during RNA extraction. The quality of the isolated RNA was checked on 0.8% agarose gel for the presence of 28S and 18S bands. The quality and quantity of RNA were further analysed using a Qubit 2.0 fluorometer (Invitrogen) and RNA integrity was confirmed using an Agilent Bioanalyser 2100 (Agilent Technologies). Libraries were sequenced on an Illumina HiSeq 2500 platform.
Table 1. Statistics of raw read data including sequencing strategy, read length, number of reads and total sequenced bases.
Genomic data

| Species | Sequencing strategy | Read length (bp) | No. reads | No. bases (Gb) |
|---|---|---|---|---|
| M. marsaeus | PacBio HiFi | N50 = 11,228 | 2,593,875 | 26.81 |
| | 10x | 151 | 146,203,982 | 22.07 |
| | HiC | 151 | 118,404,072 | 17.88 |
| M. menophilus | PacBio HiFi | N50 = 11,996 | 2,275,183 | 25.20 |
| | 10x | 151 | 118,378,976 | 17.87 |
| | HiC | 151 | 147,947,432 | 22.34 |
| I. salapia | 10x | 150 | 163,421,078 | 24.51 |
| | 10x | 150 | 155,015,880 | 23.25 |

Transcriptomic data

| Species | Tissue | Read length (bp) | No. reads | No. bases (Gb) |
|---|---|---|---|---|
| M. marsaeus * | Thorax | 150 | 72,739,116 | 10.91 |
| | Abdomen | 150 | 67,825,765 | 10.17 |
| | Head | 150 | 76,024,296 | 11.40 |
| | Pupae | 150 | 79,408,138 | 11.91 |
| | Fifth instar caterpillar | 150 | 82,451,250 | 12.37 |
| | Pupae wing disks | 150 | 376,165,512 | 56.42 |
| | Adult antenna | 150 | 410,103,511 | 61.52 |
| I. salapia | Thorax | 150 | 181,315,214 | 27.20 |
| | Abdomen | 150 | 145,723,564 | 21.86 |
| | Head | 150 | 164,980,366 | 24.75 |
| | Pupae | 150 | 151,605,900 | 22.74 |
| | Fifth instar caterpillar | 150 | 149,181,784 | 22.38 |
2.3. Genome size and heterozygosity estimation using k-mers approaches
Genome characteristics (genome size and heterozygosity) were estimated for each raw-read data set using k-mer spectrum analysis. K-mer distributions were computed using JELLYFISH version 2.2.10 (Marçais & Kingsford, 2011) with a k-mer size of 31. GENOMESCOPE2 (Ranallo-Benavidez et al., 2020) was then used to estimate genome characteristics and generate plots (Figure S1).
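To illustrate the principle behind k-mer based genome size estimation, the sketch below divides the total number of k-mer occurrences by the depth of the main coverage peak. This is a deliberately simplified approximation and not the mixture model fitted by GENOMESCOPE2; the function name, the `min_depth` error cutoff and the toy histogram are assumptions made for illustration only.

```python
def estimate_genome_size(histogram, min_depth=5):
    """Rough haploid genome size from a k-mer depth histogram.

    histogram maps k-mer depth -> number of distinct k-mers seen at that depth.
    Depths below min_depth are treated as sequencing errors (assumed cutoff).
    """
    usable = {d: n for d, n in histogram.items() if d >= min_depth}
    if not usable:
        raise ValueError("no k-mers at or above min_depth")
    # Depth holding the most distinct k-mers, taken as the main coverage peak.
    peak_depth = max(usable, key=usable.get)
    # Total k-mer occurrences is the sum of depth * count over retained depths.
    total_kmers = sum(d * n for d, n in usable.items())
    return total_kmers / peak_depth


# Toy histogram (hypothetical values, not data from this study).
toy = {1: 900, 2: 300, 9: 50, 10: 100, 11: 50, 20: 10}
size = estimate_genome_size(toy)
```

In highly heterozygous genomes such as those studied here, the tallest peak may correspond to heterozygous k-mers at half the homozygous depth, which is one reason a model-based tool is preferable to this shortcut.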
2.4. Genome assembly
For M. marsaeus and M. menophilus, the assembly process included the following sequence of steps: initial PacBio assembly generation with Hifiasm version 0.15.1 (Cheng et al., 2021), retained haplotig separation with purge_dups version 1.2.3 (Guan et al., 2020), short-read polishing using FreeBayes version 1.3.1-called variants (Garrison & Marth, 2012) from 10x Genomics Chromium reads aligned with LongRanger version 2.2.2 (https://github.com/10XGenomics/longranger), and Hi-C based scaffolding with SALSA2 version 2.2 (Ghurye et al., 2019) using Hi-C contact map (Figure S2). The mitochondrial genome was assembled using MitoHifi version 2 (https://github.com/marcelauliano/MitoHiFi). Finally, the assemblies were analysed and manually improved using rapid curation (Howe et al., 2021). Chromosome-scale scaffolds confirmed by the Hi-C data have been named in order of size. Genome completeness was assessed with BUSCO version 5 (Manni et al., 2021) using the “genome” mode with the lepidoptera_odb10 orthologue data set composed of 5286 orthologous genes. BUSCO genes were also used to identify the Z chromosomes in both species. The putative Z chromosomes also showed reduced read coverage in both species, supporting that they are Z chromosomes of females. In M. menophilus a second chromosome with reduced coverage, Hi-C links to the Z chromosome and very small size (2.99 Mbp) was assigned as putative W chromosome. For I. salapia, all 10x libraries of the two samples were first assembled separately with Supernova version 2.1.1 (Visendi, 2022) and then combined with Ragout using one genome as reference and the other one as target (Kolmogorov et al., 2018). Base accuracy (QV) was estimated using a k-mer size of 21 with Merqury (Rhie et al., 2020).
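As an aid to interpreting the base accuracy values, the sketch below shows how a k-mer based consensus quality value can be derived, following our understanding of the formula described for Merqury (Rhie et al., 2020): k-mers found in the assembly but absent from the reads are treated as errors, and the per-base error rate is Phred-scaled. The function name and the toy counts are hypothetical; this is not Merqury's code.

```python
import math


def kmer_based_qv(asm_only_kmers, total_asm_kmers, k=21):
    """Phred-scaled consensus quality from k-mer counts.

    asm_only_kmers: k-mers present in the assembly but not in the read set.
    total_asm_kmers: all k-mers in the assembly.
    """
    if asm_only_kmers == 0:
        return math.inf  # no unsupported k-mers observed
    # Probability that a single base is correct, given the fraction of
    # assembly k-mers that are supported by the reads.
    p_base_correct = (1 - asm_only_kmers / total_asm_kmers) ** (1 / k)
    error_rate = 1 - p_base_correct
    return -10 * math.log10(error_rate)


# Toy counts (hypothetical): 21 unsupported k-mers out of one million.
qv = kmer_based_qv(21, 1_000_000)
```

With these toy counts the per-base error rate is close to one in a million, that is, a QV of about 60.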
2.5. Synteny
Synteny between M. marsaeus and M. menophilus genomes was investigated using the positions of the complete nonduplicated BUSCO genes. Using a custom-made R script, we merged the BUSCO gene position files and plotted them against each other.
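The merging step can be sketched as follows. The original analysis used a custom R script that is not reproduced here; this Python version is only an illustration and assumes the tab-separated layout of a BUSCO `full_table.tsv` file (BUSCO id, status, sequence, gene start, and so on). The function names, BUSCO identifiers and coordinates are hypothetical.

```python
def parse_busco_table(lines):
    """Return {busco_id: (sequence, start)} for BUSCOs with status 'Complete'."""
    positions = {}
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and header comments
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 5 or fields[1] != "Complete":
            continue  # skips Duplicated, Fragmented and Missing entries
        busco_id, _, sequence, start = fields[:4]
        positions[busco_id] = (sequence, int(start))
    return positions


def merge_positions(table_a, table_b):
    """Pair the positions of BUSCOs that are complete in both genomes."""
    shared = sorted(set(table_a) & set(table_b))
    return [(busco_id, table_a[busco_id], table_b[busco_id]) for busco_id in shared]


# Toy input lines (hypothetical identifiers and coordinates).
marsaeus = [
    "# Busco id\tStatus\tSequence\tGene Start\tGene End\tStrand\tScore\tLength",
    "100at7088\tComplete\tchr7\t1200\t3400\t+\t950.1\t700",
    "200at7088\tComplete\tchr2\t50000\t52000\t-\t800.0\t600",
    "300at7088\tMissing",
]
menophilus = [
    "100at7088\tComplete\tchr1\t900\t3100\t+\t940.0\t700",
    "200at7088\tDuplicated\tchr5\t100\t2000\t+\t700.0\t600",
]
pairs = merge_positions(parse_busco_table(marsaeus), parse_busco_table(menophilus))
```

Each resulting tuple gives one point of a dot plot: the position in the first genome on one axis and the position in the second genome on the other.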
2.6. Gene prediction, automated and functional annotations
The transposable element annotation was performed using RepeatMasker (Tarailo-Graovac & Chen, 2009). This annotation was exported into GFF3 files and used as a mask for gene annotation. Later, repeat masking with de novo repeat discovery, automated curation and filtering was performed using the EarlGrey pipeline (version 1.2) (Baril et al., 2021) with default settings in combination with the Arthropoda library from the Dfam database (version 3.5) (Storer et al., 2021). The automated gene prediction and annotation was done using MAKER (Cantarel et al., 2008), integrating different features based on (i) the mapping of Lepidoptera proteins from LepBase (Challi et al., 2016), (ii) the transcriptomes of each species generated by the assembly of RNA-Seq data with Trinity 2.8.4 (Haas et al., 2013) and (iii) ab initio gene predictions using Augustus (Hoff & Stanke, 2019). Reliable gene predictions were extracted according to annotation edit distance (AED) ≤0.2 or a minimum coverage of 1000 from RNAseq data mapping, after optimization using BUSCO statistics. Annotation completeness was assessed with BUSCO version 5 (Manni et al., 2021) using the “protein” mode with the lepidoptera_odb10 ortholog data set composed of 5286 orthologous genes. The functional annotation was performed using blastp from BLAST+ version 2.5.0 (Camacho et al., 2009) to compare predicted proteins in each genome to the NCBI nonredundant database. The 10 best hits below an e-value of 1e-08 without complexity masking were conserved. Interproscan (Jones et al., 2014) was used to search protein sequences for known protein domains in the different databases available in Interproscan. Finally, we used Blast2GO (Conesa et al., 2005) to associate a protein with a gene ontology (GO) group.
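The filtering rule for reliable gene predictions can be expressed as a simple predicate. The sketch below is an illustration under stated assumptions: the record layout, the gene identifiers and the interpretation of "coverage" as a single number per gene are hypothetical, since the text does not specify how the RNAseq coverage was computed.

```python
def is_reliable(aed, rnaseq_coverage, max_aed=0.2, min_coverage=1000):
    """Keep a gene model if AED is low enough OR RNAseq support is high enough."""
    return aed <= max_aed or rnaseq_coverage >= min_coverage


# Toy gene models (hypothetical values).
genes = [
    {"id": "gene1", "aed": 0.05, "coverage": 12},    # kept: low AED
    {"id": "gene2", "aed": 0.60, "coverage": 4500},  # kept: high coverage
    {"id": "gene3", "aed": 0.45, "coverage": 30},    # discarded: fails both
]
kept = [g["id"] for g in genes if is_reliable(g["aed"], g["coverage"])]
```

Because the two criteria are combined with a logical OR, a model with a poor AED can still be retained when transcript evidence is strong.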
2.7. Orthologue analyses
Orthologous genes between annotated genes in each species and the seven outgroups (Pieris napi, Bicyclus anynana, Junonia coenia, Melitaea cinxia, Heliconius erato, Heliconius melpomene and Danaus plexippus) were identified using OrthoFinder version 2.5.2 (Emms & Kelly, 2015). Single copy orthologue proteins were extracted, aligned using MAFFT version 7.01775 and concatenated using AMAS (Borowiec, 2016). The species phylogeny was performed on this alignment composed of 996 orthologues for a length of 647 kb using PhyML (Anisimova & Gascuel, 2006) including a branch support estimation with 1000 bootstrap iterations.
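The selection of single-copy orthologues used for the phylogeny can be sketched as below: an orthogroup qualifies only if it contains exactly one gene in every species. This is an illustration of the criterion rather than OrthoFinder's own code; the data structure, species labels and gene names are hypothetical.

```python
def single_copy_orthogroups(orthogroups, species):
    """Return IDs of orthogroups with exactly one gene in every listed species.

    orthogroups maps orthogroup ID -> {species name: list of gene IDs}.
    """
    return sorted(
        og_id
        for og_id, members in orthogroups.items()
        if all(len(members.get(sp, [])) == 1 for sp in species)
    )


# Toy orthogroups (hypothetical identifiers).
species = ["M_marsaeus", "M_menophilus", "I_salapia"]
orthogroups = {
    "OG1": {"M_marsaeus": ["a1"], "M_menophilus": ["b1"], "I_salapia": ["c1"]},
    "OG2": {"M_marsaeus": ["a2", "a3"], "M_menophilus": ["b2"], "I_salapia": ["c2"]},
    "OG3": {"M_marsaeus": ["a4"], "M_menophilus": ["b3"]},
}
single_copy = single_copy_orthogroups(orthogroups, species)
```

Here OG2 is excluded because of a duplication in one species and OG3 because one species is missing.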
2.8. Manual annotation of chemosensory genes
For each of the chemosensory gene families, that is, the odorant receptors (ORs), the gustatory receptors (GRs), the variant ionotropic receptors (IRs), the odorant-binding proteins (OBPs) and the chemosensory proteins (CSPs), amino acid sequences previously identified from the genomes of D. plexippus, H. melpomene, S. frugiperda and B. mori (Briscoe et al., 2013; Gouin et al., 2017; Guo et al., 2017; Heliconius Genome Consortium, 2012; Liu et al., 2018; Meslin et al., 2022; Vogt et al., 2015; Zhan et al., 2011) as well as from the transcriptome of M. marsaeus (Piron-Prunier et al., 2021) were used as queries in a tBLASTn search against genome assemblies of the three species (e-value threshold 0.001), in order to identify scaffolds containing the genes to annotate. Query amino acid sequences were then aligned on these scaffolds with Exonerate (Slater & Birney, 2005) to identify precise intron-exon boundaries and create gene models. These models were visualized using Integrated Genomics Viewer version 2.11.9 (Robinson et al., 2011), and badly predicted models were eliminated from the final sequence data sets. Nucleotide and amino acid sequences were extracted with GffRead (Pertea & Pertea, 2020). To create CSP and GR trees, amino acid sequences from Ithomiini were aligned with those of the above-mentioned species (except S. frugiperda GRs, which were not included to limit the number of sequences) with MAFFT version 7 (Katoh et al., 2019). Maximum-likelihood phylogenies were built using PhyML 3.0 (Guindon et al., 2010) following model selection by SMS (Lefort et al., 2017). Branch support was estimated via the SH-like approximate likelihood-ratio test (Anisimova & Gascuel, 2006).
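The first step, identifying scaffolds that carry candidate chemosensory genes, can be sketched as a filter on tBLASTn hits. The example assumes the standard 12-column BLAST tabular output (`-outfmt 6`), in which the subject identifier is the second column and the e-value the eleventh; the function name, query names and scaffold names are hypothetical.

```python
def candidate_scaffolds(blast_lines, max_evalue=1e-3):
    """Return the sorted set of subject scaffolds with a hit at or below max_evalue."""
    scaffolds = set()
    for line in blast_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.rstrip("\n").split("\t")
        subject, evalue = fields[1], float(fields[10])
        if evalue <= max_evalue:
            scaffolds.add(subject)
    return sorted(scaffolds)


# Toy hits (hypothetical): query, subject, identity, length, mismatches, gap opens,
# query start/end, subject start/end, e-value, bit score.
hits = [
    "DpleOR1\tscaffold_3\t62.5\t120\t45\t0\t1\t120\t5000\t5360\t2e-30\t110",
    "DpleOR1\tscaffold_9\t30.1\t40\t28\t0\t10\t50\t800\t920\t0.5\t25",
    "HmelGR4\tscaffold_3\t55.0\t90\t40\t1\t5\t95\t9000\t9270\t1e-12\t70",
    "HmelGR4\tscaffold_12\t48.0\t80\t41\t0\t1\t80\t100\t340\t0.001\t40",
]
scaffolds = candidate_scaffolds(hits)
```

The retained scaffolds would then be passed to the alignment step that defines intron-exon boundaries.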
3. Results and Discussion
3.1. Sequencing strategy comparison
In order to obtain a high-quality reference genome for M. marsaeus, we combined deeper PacBio sequencing using the new HiFi technology with low error rates, 10x sequencing and HiC data (Table 1). The use of a HiC approach, which enabled us to organize the scaffolds at the chromosome level, was particularly successful as it resulted in a final genome of 503 Mb composed of 22 scaffolds and an N50 of 40.5 Mb (Table 2). The same strategy was used for the species M. menophilus and yielded similar quality results with a genome of 496 Mb composed of 28 scaffolds and an N50 of 23.2 Mb (Table 2). For I. salapia, two 10x libraries were generated from two individuals and sequenced separately (Table 1). Largely due to the absence of HiC libraries and PacBio HiFi libraries, the genome obtained for this species is more fragmented than those of the two Melinaea species. The final assembly is composed of 23,973 scaffolds for a total of 396 Mb and an N50 of 1.5 Mb (Table 2). For M. marsaeus, the 22 scaffolds obtained could be grouped into 13 chromosomes, two sex chromosomes W and Z, the mitochondrion and six unplaced scaffolds. For M. menophilus, the 28 scaffolds were grouped into 20 chromosomes, two sex chromosomes W and Z, the mitochondrion and five unplaced scaffolds. The final number of chromosomes assembled matches the number of chromosomes identified by cytogenetic techniques in M. menophilus, that is, 2n = 42 (Dutrillaux et al., 2022).
Table 2. Assembly statistics and completeness evaluation.
| Assembly statistics | M. marsaeus | M. menophilus | I. salapia |
|---|---|---|---|
| No. scaffolds | 22 | 28 | 23,973 |
| N50 scaffold (bp) | 40,461,556 | 23,164,123 | 1,472,785 |
| L50 scaffold count | 6 | 8 | 70 |
| Mean scaffold size (bp) | 22,886,664.64 | 17,716,406.50 | 16,515 |
| Longest scaffold (bp) | 46,264,634 | 41,164,108 | 15,188,582 |
| %N | 0.001 | 0.002 | 8.74 |
| GC content (%) | 31.7 | 31.7 | 30.99 |
| Total length (bp) | 503,506,622 | 496,059,382 | 395,915,617 |
| Base quality (QV) | | | |
| PacBio HiFi | 58.55 | 58.87 | |
| 10x | 48.23 | 49.61 | 58.54 |
| BUSCO results on genomes (%) | | | |
| Complete and single-copy BUSCOs | 96.8 | 98.1 | 95.0 |
| Complete and duplicated BUSCOs | 0.6 | 0.6 | 2.4 |
| Fragmented BUSCOs | 0.3 | 0.3 | 0.9 |
| Missing BUSCOs | 2.3 | 1.0 | 1.7 |
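For readers less familiar with the contiguity metrics in Table 2, the sketch below computes scaffold N50 and L50 from a list of scaffold lengths: scaffolds are sorted from longest to shortest and accumulated until half of the total assembly length is reached. The function name and the toy lengths are illustrative assumptions.

```python
def n50_l50(lengths):
    """Return (N50, L50) for a list of scaffold lengths.

    N50 is the length of the scaffold at which the running total, taken from the
    longest scaffold downwards, first reaches half of the assembly length.
    L50 is the number of scaffolds needed to reach that point.
    """
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= half:
            return length, count
    raise ValueError("empty list of scaffold lengths")


# Toy assembly (hypothetical lengths): total 150, half 75.
n50, l50 = n50_l50([50, 40, 30, 20, 10])
```

In this toy case the two longest scaffolds (50 + 40 = 90) are enough to pass the halfway mark, so N50 is 40 and L50 is 2.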
3.2. Genome size and heterozygosity estimation
For each of the three genomes, the size of the final assemblies is within, or slightly above, the range of the size estimates from k-mer approaches on the raw reads. For M. marsaeus the k-mer estimates range from 330 to 496 Mb (Table S1) and the assembled genome size is 503 Mb; for M. menophilus the k-mer estimates range from 357 to 527 Mb (Table S1) and the assembled genome size is 496 Mb; and finally for I. salapia, the k-mer size estimate range is 352 to 357 Mb and the assembled genome size is 396 Mb (Table 2, Table S1). These genome sizes are at the top of the distribution of genome sizes observed in the Danainae, ranging from 249 to 455 Mb, but are below those of the largest genomes observed in the Nymphalidae, such as Polyura nepenthes (Nymphalidae, Charaxinae) whose genome size is estimated at 925 Mb (Liu et al., 2020). When comparing 10x data, almost four times more heterozygosity is observed for M. marsaeus than for M. menophilus (Table S1). The levels of heterozygosity estimated using k-mer approaches vary between data sets but are overall fairly high (Table S1). This may be related to the demographic history of the populations and, for M. marsaeus, to the mechanisms of divergence and hybridisation that exist in the suture zone between the Andes and the Amazon. The populations of M. marsaeus around Tarapoto were found to be profoundly admixed in a previous study (McClure & Elias, 2016a). This high level of divergence between M. marsaeus populations and their hybridisation may explain the difficulty of assembly encountered during the first attempt to sequence this species.
The final assemblies show a high level of completeness, as testified by high BUSCO completeness using the “genome” mode (Seppey et al., 2019). For each of the genomes, including the more fragmented genome of I. salapia, more than 95% of 5286 single copy orthologues across Lepidoptera were recovered (Table 2).
In contrast to the highly collinear genomes of Heliconius butterflies, where most species have 21 chromosomes (Seixas et al., 2021), our closely related Melinaea species differ strongly in chromosome number (14 vs. 21) and show numerous massive rearrangements (Figure 2). The only two M. marsaeus chromosomes that fully correspond to a single M. menophilus chromosome are chromosome 7 (chr. 1 in M. menophilus) and the Z chromosome. The high variation in chromosome numbers among species in the genus Melinaea has already been reported (Brown et al., 2004; Dutrillaux et al., 2022; McClure et al., 2017). Here we show that this variation could be the result of fusion and fission events.
Figure 2.
Low synteny between M. marsaeus and M. menophilus despite very recent splitting time. The positions of BUSCO genes mapping uniquely to both genomes are shown in the order of the M. marsaeus chromosomes. The colours reflect the different M. marsaeus chromosomes. A fully conserved chromosome would be reflected as a single diagonal line as in M. marsaeus chromosome 7, which corresponds to M. menophilus chromosome 1. Grey lines indicate chromosome ends
3.3. Gene prediction and function annotation
Prior to the gene annotation step, an annotation of transposable and repeated elements was performed. To perform reliable gene annotation we took advantage of transcriptomic data. For M. marsaeus, we used assembled transcripts from a study on differential expression between two subspecies (Piron-Prunier et al., 2021), which included a reference transcriptome for that species across multiple stages (larval, pupal and imago) and transcriptomes of targeted tissues, namely pupal wing discs and antennae (Table 1). For I. salapia, we sequenced and assembled a reference transcriptome by sequencing transcripts from different tissues and different developmental stages (Table 1). Automated annotations combining transcriptomic data, known lepidopteran proteins and ab initio predictions annotated respectively 52,865 genes for M. marsaeus, 54,531 genes for M. menophilus and 32,213 for I. salapia. After the filtering of the reliable gene predictions, 18,670 genes were kept for M. marsaeus, 19,174 for M. menophilus and 18,283 for I. salapia. These genes have comparable characteristics in terms of gene size, number and sizes of exons and introns (Table 3). Like the genomes, these annotations and the predicted proteins have a high completeness level identified by BUSCO using the “protein” mode with more than 85.8% of the lepidopteran single copy orthologues recovered (Table 3).
Table 3. Annotation statistics and predicted protein completeness evaluation.
| Gene statistics | M. marsaeus | M. menophilus | I. salapia |
|---|---|---|---|
| No. of raw genes | 52,865 | 54,431 | 32,213 |
| No. of filtered genes | 18,670 | 19,174 | 18,283 |
| Average gene length (bp) | 5821.59 | 5779.93 | 5228.13 |
| Median gene length (bp) | 3877.00 | 3839.00 | 2889.00 |
| Average exon per gene | 6.78 | 6.68 | 6.09 |
| Average exon length (bp) | 257.52 | 262.14 | 239.17 |
| Average intron per gene | 5.78 | 5.68 | 5.09 |
| Average intron length (bp) | 643.28 | 650.95 | 659.45 |
| % coding sequence | 6.48 | 6.77 | 6.74 |
| BUSCO results on proteins (%) | | | |
| Complete and single-copy BUSCOs | 85.8 | 87.5 | 88.4 |
| Complete and duplicated BUSCOs | 1.0 | 1.1 | 2.6 |
| Fragmented BUSCOs | 1.0 | 0.9 | 1.5 |
| Missing BUSCOs | 12.2 | 10.5 | 7.5 |
Annotation of the repetitive elements of the genome, combining de novo and homology-based discovery approaches, revealed increased transposable element content with increasing genome size, with 14% total repeat content in I. salapia and 24% in the two Melinaea species (Figure S3). The differences could be linked to sequencing strategies. The complement of different element classes differed between the species and from the repeat content described in Danaus species, which themselves show considerable variation within the genus (Baril & Hayward, 2022). More specifically, the ithomiine genomes all exhibit increased DNA transposon, Rolling-circle and LINE and LTR retroelement content but decreased contributions of Penelope elements. SINE retroelements comprise nearly 3% of the genome assemblies in both Melinaea species but less than 0.2% of the I. salapia genome. Transposon landscape analysis supports recent transposon activity in all genomes, as indicated by the presence of several TE classifications with low genetic distance to their consensus sequences (Figure S2). Regarding the distribution at the chromosome level, the sex chromosomes have different concentrations of repeated elements than the autosomes. The Z chromosomes present only 14% of transposable elements for both species. Conversely, the W chromosomes have a much higher concentration than the autosomes, reaching 59.72% for M. menophilus and 73.62% for M. marsaeus. However, for both the Z and W chromosomes, the composition of the different families of transposable elements is substantially similar between the sex chromosomes and with the rest of the genome (Figure S3).
3.4. Comparison with key lepidopteran reference genomes
Orthologous genes for all annotated genes in the three focal species and seven outgroup butterfly species, including reference genomes such as Danaus plexippus (PRJNA564985), the species most closely related to the Ithomiini, and Heliconius melpomene (PRJEA71053), a species belonging to a large clade of mimetic butterflies, were identified using OrthoFinder version 1.1474 (Emms & Kelly, 2015). In total, 16,736 orthology groups were identified, including 93.0% of all the analysed genes from the 10 species. Among them, 5792 orthogroups are shared by all species. Larger gene numbers were observed for the Melinaea species. Only a small proportion of genes are shared exclusively by the ithomiines, representing 4.4 and 3.0% of the genes for M. marsaeus and M. menophilus, respectively, and 2.0% of the genes for I. salapia (in light orange on Figure 3). Within Melinaea, a large proportion of genes are associated with the Melinaea genus and shared between the two species, representing 11.1% of genes for M. marsaeus and 10.4% of genes for M. menophilus (in light yellow on Figure 3). Finally, we also observed a large proportion of species-specific genes, reaching 11.9% (including 6.2% of duplicated species-specific genes) for M. marsaeus and 14.3% (including 7.4% of duplicated species-specific genes) for M. menophilus (in green on Figure 3).
Figure 3.
Phylogeny and orthologous gene numbers across 10 butterfly genomes. “Shared by some” represents orthologues shared by eight out of the 10 species and without phylogenetic signal
3.5. Annotation of chemosensory genes
Chemosensory cues and signals are instrumental for butterflies as they are involved in host plant detection and in mate recognition. This is especially the case in mimetic butterflies, whereby the colour pattern may not provide an effective cue for mate recognition due to mimicry (Mérot et al., 2015). Detection of chemosensory cues and signals by the peripheral nervous system of insects is mainly governed by transmembrane receptors located at the membrane of olfactory or gustatory neurons, responsible for signal transduction upon ligand activation. In insects, such receptors belong to three multigene families: the odorant receptors (ORs), the gustatory receptors (GRs) and the variant ionotropic receptors (IRs). Depending on the insect order, the number of genes within each family can vary from a few dozen to several hundred (Robertson, 2019). We annotated genes belonging to these families in the three Ithomiini genomes (Table 4). The number of OR genes varied from 62 in M. menophilus to 70 in I. salapia, which is similar to the numbers found in other lepidopteran genomes, including that of the closely related species D. plexippus (Montagné et al., 2021). The same holds true for IR genes, whose numbers varied from 31 in M. marsaeus to 36 in I. salapia. By contrast, we annotated an unexpectedly large number of GR genes in the three species, with more than 200 in M. marsaeus. This high number of genes compared with other Nymphalidae (including D. plexippus) results from extensive duplications in Ithomiini that occurred in several lineages of the GR phylogeny (Figure 4). So far, such expansions of GR repertoires in Lepidoptera have been documented only in the Noctuidae family, where they have been tentatively linked to polyphagy (Gouin et al., 2017; Meslin et al., 2022). It is interesting to note that somewhat similar expansions also occurred independently in Ithomiini, which are not polyphagous but rather oligophagous species (McClure & Elias, 2016b; Willmott & Mallet, 2004).
Table 4. Number of chemosensory genes annotated in different lepidopteran genomes.
| Species | OR | GR | IR | OBP | CSP |
|---|---|---|---|---|---|
| M. marsaeus | 63 | 209 | 31 | 39 | 54 |
| M. menophilus | 62 | 187 | 34 | 35 | 53 |
| I. salapia | 70 | 167 | 36 | 40 | 43 |
| D. plexippus | 64 | 56 | 32 | 32 | 34 |
| H. melpomene | 66 | 73 | 33 | 51 | 33 |
| S. frugiperda | 69 | 234 | 42 | 50 | 22 |
| B. mori | 73 | 76 | 30 | 39 | 21 |
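To make the scale of these differences concrete, the short sketch below expresses the gene counts of Table 4 as ratios relative to D. plexippus, the closest non-ithomiine relative in the table. The counts are copied from Table 4; the choice of D. plexippus as the baseline and the helper name `fold_change` are illustrative assumptions rather than an analysis reported in the paper.

```python
# Gene counts copied from Table 4 (columns: OR, GR, IR, OBP, CSP).
FAMILIES = ["OR", "GR", "IR", "OBP", "CSP"]
TABLE4 = {
    "M. marsaeus": [63, 209, 31, 39, 54],
    "M. menophilus": [62, 187, 34, 35, 53],
    "I. salapia": [70, 167, 36, 40, 43],
    "D. plexippus": [64, 56, 32, 32, 34],
    "H. melpomene": [66, 73, 33, 51, 33],
    "S. frugiperda": [69, 234, 42, 50, 22],
    "B. mori": [73, 76, 30, 39, 21],
}


def fold_change(table, baseline):
    """Ratio of each species' gene count to the baseline species, per family."""
    reference = table[baseline]
    return {
        species: {fam: n / ref for fam, n, ref in zip(FAMILIES, counts, reference)}
        for species, counts in table.items()
    }


fold = fold_change(TABLE4, baseline="D. plexippus")
for species in ("M. marsaeus", "M. menophilus", "I. salapia"):
    ratios = ", ".join(f"{fam} x{fold[species][fam]:.2f}" for fam in FAMILIES)
    print(f"{species}: {ratios}")
```

On these numbers the OR, IR and OBP ratios stay close to 1, whereas the GR ratio is roughly 3 to 3.7 in the three ithomiines and the CSP ratio is above 1.5 in both Melinaea species.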
Figure 4.
Maximum-likelihood phylogeny of lepidopteran GRs, built from amino acid sequences from B. mori, H. melpomene, D. plexippus, I. salapia, M. marsaeus and M. menophilus. Deep nodes highly supported by the likelihood-ratio test (aLRT >0.95) are indicated by black dots. Those that correspond to Ithomiini-specific large expansions (more than 10 genes) are shown with stars. The scale bar represents the expected number of amino acid substitutions per site
Apart from transmembrane receptors, chemodetection in insects also relies on soluble proteins that can bind and transport semiochemicals in the aqueous lymph of olfactory and gustatory sensilla, so that they can reach the neurons. The genomes of Ithomiini contain 35 to 40 genes encoding odorant-binding proteins (OBPs), which is in the range of what has been observed in other Lepidoptera. On the other hand, the number of chemosensory proteins (CSPs) is exceptionally high in Ithomiini genomes, especially in both Melinaea species, which have more than 50 CSP genes (Table 4). The phylogenetic analysis shows that all but one of the CSP lineages are highly conserved in Lepidoptera, whereas numerous gene duplications occurred in a butterfly-specific lineage (Figure 5). This expansion has been documented previously (Heliconius Genome Consortium, 2012), yet it is particularly spectacular in Ithomiini genomes, which contain up to 30 CSP genes in this lineage (in M. marsaeus) versus 11 in D. plexippus and six in H. melpomene. This confirms a previous observation made following the analysis of the M. marsaeus transcriptome (Piron-Prunier et al., 2021).
Figure 5.
Maximum-likelihood phylogeny of lepidopteran CSPs, built from amino acid sequences from B. mori (Bmor), S. frugiperda (Sfru), H. melpomene (Hmel), D. plexippus (Dple), I. salapia (Isal), M. marsaeus (Mmar) and M. menophilus (Mmen). Deep nodes highly supported by the likelihood-ratio test (aLRT >0.95) are indicated by black dots. The scale bar represents the expected number of amino acid substitutions per site
4. Conclusion
In this study we sequenced, de novo assembled, and annotated the genomes of three ithomiine species. We analysed their genomic features and performed genomic content comparison and orthologous gene identification with D. plexippus, which belongs to the same subfamily (Danainae), and various outgroups including two Heliconius species (Nymphalidae: Heliconiinae), a well-studied mimetic genus that includes species that are mimetic with Melinaea. Manual curation of chemosensory genes in the three genomes revealed unexpected expansions of GR genes, as previously observed only in polyphagous noctuids. These first genomes of ithomiine mimetic butterflies will be useful to further understand the mechanisms of adaptation and the genetic bases underpinning mimicry, and provide a welcome comparison to existing biological models of mimicry such as Heliconius.
Supplementary Material
Acknowledgements
ME acknowledges financial support from ANR (projects SPECREP and CLEARWING) and HFSP (RGP0014/2016). MGX acknowledges financial support from France Génomique National infrastructure, funded as part of “Investissement d’Avenir” programme managed by Agence Nationale pour la Recherche (contract ANR-10-INBS-09). RD, SM and CZ acknowledge financial support from Wellcome grant WT207492, and SM and work at the Wellcome Sanger Institute were also supported by Wellcome grant WT206194. We thank members of the Wellcome Sanger Institute Scientific Operations and Tree of Life core groups for their contributions to data production and assembly for the Melinaea genomes. We thank Mario Tuanama and Ronald Mori-Pezo for help with rearing and collecting butterflies. MM acknowledges a postdoctoral fellowship by the Fonds Québécois de la Recherche sur la Nature et les Technologies (FQRNT). STM acknowledges support from the BBSRC Institute Strategy Programme (BB/P012574/1). We thank the Peruvian authorities for research permits (236-2012-AG-DGFFS-DGEFFS, 201-2013-MINAGRI-DGFFS/DGEFFS, 002-2015-SERFOR-DGGSPFFS and 373-2017-SEROR-DGGSPFFS), the Gobierno Regional San Martín PEHCBM (permit: 124-2016-GRSM/PEHCBM-DMA/EII-ANP/JARR) and the Museo de Historia Natural and Professor Gerardo Lamas for their support. We thank the GenOuest BioInformatics Platform (http://www.genouest.org/), which allowed the use of a computing cluster for bioinformatic analyses.
Funding information
Agence Nationale de la Recherche, Grant/Award Number: ANR-10-INBS-09; Human Frontier Science Program
Footnotes
Author Contributions
Marianne Elias designed the study. Melanie McClure performed sampling. Marianne Elias, Paul Jay and Sam T. Mugford performed DNA extraction. Joana Meier, Fabrice Legeai, Jérémy Gauthier, Claire Lemaitre, Annabel Whibley, Hugues Parrinello, Sam T. Mugford, Richard Durbin, Chenzi Zhou, Shane McCarthy, Florence Piron-Prunier, Paul Jay, Camille Noûs performed the sequencing and assemblies with contributions from Christopher W. Wheat. Jérémy Gauthier and Fabrice Legeai performed the annotations with contributions from Anthony Bretaudeau. Annabel Whibley performed the transposable element analyses. Joana Meier performed the synteny analyses. Jérémy Gauthier and Hélène Boulain performed the orthologous gene analyses. Nicolas Montagné, Emma Persyn, Christelle Monsempes, Marie-Christine François, Camille Meslin and Emmanuelle Jacquin-Joly performed the chemosensory analyses. All authors took part in discussions concerning the analyses and result interpretations. Jérémy Gauthier and Marianne Elias wrote the manuscript, with contributions from all authors.
Conflict of Interest
The authors declare no conflicts of interest.
Supporting Information
Additional supporting information can be found online in the Supporting Information section at the end of this article.
Data Availability Statement
Genome assemblies have been made available on the NCBI under accession numbers GCA_918358865.1 for M. marsaeus, GCA_918358695.1 for M. menophilus and JAPOND000000000 for I. salapia. Raw genomic data can be found under BioProjects PRJNA836751 (M. marsaeus v1), PRJEB48295 (M. marsaeus v2) and PRJEB48296 (M. menophilus). Raw transcriptomic data can be found under BioProject PRJNA836751. The assembly and annotation pipelines, including custom scripts, have been made available in the GitHub repository https://github.com/JeremyLGauthier/scripts_Ithomiine_genomes.
Open Research Badges
This article has earned an Open Data badge for making publicly available the digitally-shareable data necessary to reproduce the reported results. The data is available at https://github.com/JeremyLGauthier/scripts_Ithomiine_genomes.
References
- Anisimova M, Gascuel O. Approximate likelihood-ratio test for branches: A fast, accurate, and powerful alternative. Systematic Biology. 2006;55(4):539–552. doi: 10.1080/10635150600755453.
- Baril T, Hayward A. Migrators within migrators: Exploring transposable element dynamics in the monarch butterfly, Danaus plexippus. Mobile DNA. 2022;13(1):5. doi: 10.1186/s13100-022-00263-5.
- Baril T, Imrie R, Hayward A. Earl Grey. 2021. doi: 10.5281/zenodo.5654616.
- Bates HW. XXXII. Contributions to an insect fauna of the Amazon valley. Lepidoptera: Heliconidae. Transactions of the Linnean Society of London. 1862;23(3):495–566. doi: 10.1111/j.1096-3642.1860.tb00146.x.
- Beccaloni GW. Vertical stratification of ithomiine butterfly (Nymphalidae: Ithomiinae) mimicry complexes: The relationship between adult flight height and larval host–plant height. Biological Journal of the Linnean Society. 1997;62(3):313–341. doi: 10.1006/bijl.1997.0165.
- Borowiec ML. AMAS: A fast tool for alignment manipulation and computing of summary statistics. PeerJ. 2016;4:e1660. doi: 10.7717/peerj.1660.
- Boussens-Dumon G, Llaurens V. Sex, competition and mimicry: An eco-evolutionary model reveals unexpected impacts of ecological interactions on the evolution of phenotypes in sympatry. Oikos. 2021;130(11):2028–2039.
- Briscoe AD, Macias-Muñoz A, Kozak KM, Walters JR, Yuan F, Jamie GA, Martin SH, Dasmahapatra KK, Ferguson LC, Mallet J, Jacquin-Joly E, et al. Female behaviour drives expression and evolution of gustatory receptors in butterflies. PLoS Genetics. 2013;9(7):e1003620. doi: 10.1371/journal.pgen.1003620.
- Brown KS, Jr, Von Schoultz B, Suomalainen E. Chromosome evolution in neotropical Danainae and Ithomiinae (lepidoptera). Hereditas. 2004;141(3):216–236. doi: 10.1111/j.1601-5223.2004.01868.x.
- Camacho C, Coulouris G, Avagyan V, Ma N, Papadopoulos J, Bealer K, Madden TL. BLAST+: Architecture and applications. BMC Bioinformatics. 2009;10:421. doi: 10.1186/1471-2105-10-421.
- Cantarel BL, Korf I, Robb SMC, Parra G, Ross E, Moore B, Holt C, Alvarado AS, Yandell M. MAKER: An easy-to-use annotation pipeline designed for emerging model organism genomes. Genome Research. 2008;18(1):188–196. doi: 10.1101/gr.6743907.
- Challis RJ, Kumar S, Dasmahapatra KK, Jiggins CD, Blaxter M. Lepbase: The lepidopteran genome database. Cold Spring Harbor Laboratory; 2016.
- Chazot N, Willmott KR, Lamas G, Freitas AVL, Piron-Prunier F, Arias CF, Mallet J, De-Silva DL, Elias M. Renewed diversification following Miocene landscape turnover in a neotropical butterfly radiation. Global Ecology and Biogeography: A Journal of Macroecology. 2019;28(8):1118–1132. doi: 10.1111/geb.12919.
- Cheng H, Concepcion GT, Feng X, Zhang H, Li H. Haplotype-resolved de novo assembly using phased assembly graphs with hifiasm. Nature Methods. 2021;18(2):170–175. doi: 10.1038/s41592-020-01056-5.
- Conesa A, Götz S, García-Gómez JM, Terol J, Talón M, Robles M. Blast2GO: A universal tool for annotation, visualization and analysis in functional genomics research. Bioinformatics. 2005;21(18):3674–3676. doi: 10.1093/bioinformatics/bti610.
- Darragh K, Montejo-Kovacevich G, Kozak KM, Morrison CR, Figueiredo CME, Ready JS, Salazar C, Linares M, Byers KJRP, Merrill RM, McMillan WO, et al. Species specificity and intraspecific variation in the chemical profiles of Heliconius butterflies across a large geographic range. Ecology and Evolution. 2020;10(9):3895–3918. doi: 10.1002/ece3.6079.
- Dasmahapatra KK, Lamas G, Simpson F, Mallet J. The anatomy of a “suture zone” in Amazonian butterflies: A coalescent-based test for vicariant geographic divergence and speciation. Molecular Ecology. 2010;19(19):4283–4301. doi: 10.1111/j.1365-294X.2010.04802.x.
- Dutrillaux B, Dutrillaux A-M, McClure M, Gèze M, Elias M, Bed’hom B. Improved basic cytogenetics challenges holocentricity of butterfly chromosomes. 2022:2022.03.11.484012. doi: 10.1159/000526034.
- Ehrlich PR, Raven PH. Butterflies and plants: A study in co-evolution. Evolution; International Journal of Organic Evolution. 1964;18(4):586–608.
- Emms DM, Kelly S. OrthoFinder: Solving fundamental biases in whole genome comparisons dramatically improves orthogroup inference accuracy. Genome Biology. 2015;16:157. doi: 10.1186/s13059-015-0721-2.
- Estrada C, Jiggins CD. Interspecific sexual attraction because of convergence in warning colouration: Is there a conflict between natural and sexual selection in mimetic species? Journal of Evolutionary Biology. 2008;21(3):749–760. doi: 10.1111/j.1420-9101.2008.01517.x.
- Eyres I, Jaquiéry J, Sugio A, Duvaux L, Gharbi K, Zhou J-J, Legeai F, Nelson M, Simon J-C, Smadja CM, Butlin R, et al. Differential gene expression according to race and host plant in the pea aphid. Molecular Ecology. 2016;25(17):4197–4215. doi: 10.1111/mec.13771.
- Garrison E, Marth G. Haplotype-based variant detection from short-read sequencing. arXiv. 2012. http://arxiv.org/abs/1207.3907
- Gauthier J, de Silva DL, Gompert Z, Whibley A, Houssin C, Le Poul Y, McClure M, Lemaitre C, Legeai F, Mallet J, Elias M. Contrasting genomic and phenotypic outcomes of hybridization between pairs of mimetic butterfly taxa across a suture zone. Molecular Ecology. 2020;29(7):1328–1343. doi: 10.1111/mec.15403.
- Ghurye J, Rhie A, Walenz BP, Schmitt A, Selvaraj S, Pop M, Phillippy AM, Koren S. Integrating hi-C links with assembly graphs for chromosome-scale assembly. PLoS Computational Biology. 2019;15(8):e1007273. doi: 10.1371/journal.pcbi.1007273.
- González-Rojas MF, Darragh K, Robles J, Linares M, Schulz S, McMillan WO, Jiggins CD, Pardo-Diaz C, Salazar C. Chemical signals act as the main reproductive barrier between sister and mimetic Heliconius butterflies. Proceedings of the Royal Society B: Biological Sciences. 2020;287(1926):20200587. doi: 10.1098/rspb.2020.0587.
- Gouin A, Bretaudeau A, Nam K, Gimenez S, Aury J-M, Duvic B, Hilliou F, Durand N, Montagné N, Darboux I, Kuwar S, et al. Two genomes of highly polyphagous lepidopteran pests (Spodoptera frugiperda, Noctuidae) with different host-plant ranges. Scientific Reports. 2017;7(1):11816. doi: 10.1038/s41598-017-10461-4.
- Gu L, Reilly PF, Lewis JJ, Reed RD, Andolfatto P, Walters JR. Dichotomy of dosage compensation along the neo Z chromosome of the monarch butterfly. Current Biology: CB. 2019;29(23):4071–4077.e3. doi: 10.1016/j.cub.2019.09.056.
- Guan D, McCarthy SA, Wood J, Howe K, Wang Y, Durbin R. Identifying and removing haplotypic duplication in primary genome assemblies. Bioinformatics. 2020;36(9):2896–2898. doi: 10.1093/bioinformatics/btaa025.
- Guindon S, Dufayard J-F, Lefort V, Anisimova M, Hordijk W, Gascuel O. New algorithms and methods to estimate maximum-likelihood phylogenies: Assessing the performance of PhyML 3.0. Systematic Biology. 2010;59(3):307–321. doi: 10.1093/sysbio/syq010.
- Guo H, Cheng T, Chen Z, Jiang L, Guo Y, Liu J, Li S, Taniai K, Asaoka K, Kadono-Okuda K, Arunkumar KP, et al. Expression map of a complete set of gustatory receptor genes in chemosensory organs of Bombyx mori. Insect Biochemistry and Molecular Biology. 2017;82:74–82. doi: 10.1016/j.ibmb.2017.02.001.
- Haas BJ, Papanicolaou A, Yassour M, Grabherr M, Blood PD, Bowden J, Couger MB, Eccles D, Li B, Lieber M, Md MM, et al. De novo transcript sequence reconstruction from RNA-seq using the trinity platform for reference generation and analysis. Nature Protocols. 2013;8(8):1494–1512. doi: 10.1038/nprot.2013.084.
- Heliconius Genome Consortium. Butterfly genome reveals promiscuous exchange of mimicry adaptations among species. Nature. 2012;487(7405):94–98. doi: 10.1038/nature11041.
- Hoff KJ, Stanke M. Predicting genes in single genomes with AUGUSTUS. Current Protocols in Bioinformatics. 2019;65(1):e57. doi: 10.1002/cpbi.57.
- Howe K, Chow W, Collins J, Pelan S, Pointon D-L, Sims Y, Torrance J, Tracey A, Wood J. Significantly improving the quality of genome assemblies through curation. GigaScience. 2021;10(1). doi: 10.1093/gigascience/giaa153.
- Jiggins CD, Mallarino R, Willmott KR, Bermingham E. The phylogenetic pattern of speciation and wing pattern change in neotropical Ithomia butterflies (lepidoptera: Nymphalidae). Evolution; International Journal of Organic Evolution. 2006;60(7):1454–1466. doi: 10.1554/05-483.1.
- Jiggins CD, Naisbit RE, Coe RL, Mallet J. Reproductive isolation caused by colour pattern mimicry. Nature. 2001;411(6835):302–305. doi: 10.1038/35077075.
- Jones P, Binns D, Chang H-Y, Fraser M, Li W, McAnulla C, McWilliam H, Maslen J, Mitchell A, Nuka G, Pesseat S, et al. InterProScan 5: Genome-scale protein function classification. Bioinformatics. 2014;30(9):1236–1240. doi: 10.1093/bioinformatics/btu031.
- Joron M, Papa R, Beltrán M, Chamberlain N, Mavárez J, Baxter S, Abanto M, Bermingham E, Humphray SJ, Rogers J, Beasley H, et al. A conserved supergene locus controls colour pattern diversity in Heliconius butterflies. Public Library of Science, Biology. 2006;4:e303. doi: 10.1371/journal.pbio.0040303.
- Jousselin E, Elias M. Testing host-plant driven speciation in phytophagous insects: A phylogenetic perspective. arXiv. 2019. http://arxiv.org/abs/1910.09510
- Katoh K, Rozewicki J, Yamada KD. MAFFT online service: Multiple sequence alignment, interactive sequence choice and visualization. Briefings in Bioinformatics. 2019;20(4):1160–1166. doi: 10.1093/bib/bbx108.
- Kolmogorov M, Armstrong J, Raney BJ, Streeter I, Dunn M, Yang F, Odom D, Flicek P, Keane TM, Thybert D, Paten B, et al. Chromosome assembly of large and complex genomes using multiple references. Genome Research. 2018;28(11):1720–1732. doi: 10.1101/gr.236273.118.
- Kozak KM, Wahlberg N, Neild AFE, Dasmahapatra KK, Mallet J, Jiggins CD. Multilocus species trees show the recent adaptive radiation of the mimetic heliconius butterflies. Systematic Biology. 2015;64(3):505–524. doi: 10.1093/sysbio/syv007.
- Lefort V, Longueville J-E, Gascuel O. SMS: Smart model selection in PhyML. Molecular Biology and Evolution. 2017;34(9):2422–2424. doi: 10.1093/molbev/msx149.
- Liu G, Chang Z, Chen L, He J, Dong Z, Yang J, Lu S, Zhao R, Wan W, Ma G, Li J, et al. Genome size variation in butterflies (Insecta, Lepidotera, Papilionoidea): A thorough phylogenetic comparison. Systematic Entomology. 2020;45(3):571–582. doi: 10.1111/syen.12417.
- Liu N-Y, Xu W, Dong S-L, Zhu J-Y, Xu Y-X, Anderson A. Genome-wide analysis of ionotropic receptor gene repertoire in lepidoptera with an emphasis on its functions of Helicoverpa armigera. Insect Biochemistry and Molecular Biology. 2018;99:37–53. doi: 10.1016/j.ibmb.2018.05.005.
- Mann F, Szczerbowski D, de Silva L, McClure M, Elias M, Schulz S. 3-acetoxy-fatty acid isoprenyl esters from androconia of the ithomiine butterfly Ithomia salapia. Beilstein Journal of Organic Chemistry. 2020;16:2776–2787. doi: 10.3762/bjoc.16.228.
- Manni M, Berkeley MR, Seppey M, Simão FA, Zdobnov EM. BUSCO update: Novel and streamlined workflows along with broader and deeper phylogenetic coverage for scoring of eukaryotic, prokaryotic, and viral genomes. Molecular Biology and Evolution. 2021;38(10):4647–4654. doi: 10.1093/molbev/msab199.
- Marçais G, Kingsford C. A fast, lock-free approach for efficient parallel counting of occurrences of k-mers. Bioinformatics. 2011;27(6):764–770. doi: 10.1093/bioinformatics/btr011.
- Martin SH, Dasmahapatra KK, Nadeau NJ, Salazar C, Walters JR, Simpson F, Blaxter M, Manica A, Mallet J, Jiggins CD. Genome-wide evidence for speciation with gene flow in Heliconius butterflies. Genome Research. 2013;23(11):1817–1828. doi: 10.1101/gr.159426.113.
- Mayjonade B, Gouzy J, Donnadieu C, Pouilly N, Marande W, Callot C, Langlade N, Muños S. Extraction of high-molecular-weight genomic DNA for long-read sequencing of single molecules. BioTechniques. 2016;61(4):203–205. doi: 10.2144/000114460.
- McClure M, Clerc C, Desbois C, Meichanetzoglou A, Cau M, Bastin-Héline L, Bacigalupo J, Houssin C, Pinna C, Nay B, Llaurens V, et al. Why has transparency evolved in aposematic butterflies? Insights from the largest radiation of aposematic butterflies, the Ithomiini. Proceedings of the Royal Society of London Series B: Biological Sciences. 2019;286:20182769. doi: 10.1098/rspb.2018.2769.
- McClure M, Dutrillaux B, Dutrillaux A-M, Lukhtanov V, Elias M. Heterozygosity and chain Multivalents during meiosis illustrate ongoing evolution as a result of multiple holokinetic chromosome fusions in the genus Melinaea (lepidoptera, Nymphalidae). Cytogenetic and Genome Research. 2017;153(4):213–222. doi: 10.1159/000487107.
- McClure M, Elias M. Ecology, life history, and genetic differentiation in neotropical Melinaea (Nymphalidae: Ithomiini) butterflies from North-Eastern Peru. Zoological Journal of the Linnean Society. 2016a;179(1):110–124. doi: 10.1111/zoj.12433.
- McClure M, Elias M. Unravelling the role of host plant expansion in the diversification of a neotropical butterfly genus. BMC Evolutionary Biology. 2016b;16(1):128. doi: 10.1186/s12862-016-0701-5.
- McClure M, Mahrouche L, Houssin C, Monllor M, Le Poul Y, Frérot B, Furtos A, Elias M. Does divergent selection predict the evolution of mate preference and reproductive isolation in the tropical butterfly genus Melinaea (Nymphalidae: Ithomiini)? The Journal of Animal Ecology. 2019;88(6):940–952. doi: 10.1111/1365-2656.12975.
- Mérot C, Frérot B, Leppik E, Joron M. Beyond magic traits: Multimodal mating cues in Heliconius butterflies. Evolution; International Journal of Organic Evolution. 2015;69(11):2891–2904. doi: 10.1111/evo.12789.
- Merrill RM, Chia A, Nadeau NJ. Divergent warning patterns contribute to assortative mating between incipient Heliconius species. Ecology and Evolution. 2014;4(7):911–917. doi: 10.1002/ece3.996.
- Merrill RM, Rastas P, Martin SH, Melo MC, Barker S, Davey J, Wo MM, Jiggins CD. Genetic dissection of assortative mating behavior. PLoS Biology. 2019;17(2):e2005902. doi: 10.1371/journal.pbio.2005902.
- Meslin C, Mainet P, Montagné N, Robin S, Legeai F, Bretaudeau A, Johnston JS, Koutroumpa F, Persyn E, Monsempès C, François MC, et al. Spodoptera littoralis genome mining brings insights on the dynamic of expansion of gustatory receptors in polyphagous noctuidae. Genes, Genomes, Genetics. 2022;12(8). doi: 10.1093/g3journal/jkac131.
- Montagné N, Wanner K, Jacquin-Joly E. In: Insect pheromone biochemistry and molecular biology. Second. Blomquist GJ, Vogt RG, editors. Academic Press; 2021. 15 - olfactory genomics within the lepidoptera; pp. 469–505.
- Muller F. Ituna and Thyridia; a remarkable case of mimicry in butterflies. Transactions of the Royal Entomological Society of London. 1879:xx–xxix.
- Nadeau NJ, Whibley A, Jones RT, Davey JW, Dasmahapatra KK, Baxter SW, Quail MA, Joron M, French-Constant RH, Blaxter ML, Mallet J, et al. Genomic islands of divergence in hybridizing Heliconius butterflies identified by large-scale targeted sequencing. Philosophical Transactions of the Royal Society of London Series B, Biological Sciences. 2012;367(1587):343–353. doi: 10.1098/rstb.2011.0198.
- Pelosi P, Zhou J-J, Ban LP, Calvello M. Soluble proteins in insect chemical communication. Cellular and Molecular Life Sciences: CMLS. 2006;63(14):1658–1676. doi: 10.1007/s00018-005-5607-0.
- Pertea G, Pertea M. GFF Utilities: GffRead and GffCompare. F1000Research. 2020;9. doi: 10.12688/f1000research.23297.2.
- Pinna CS, Vilbert M, Borensztajn S, Daney de Marcillac W, Piron-Prunier F, Pomerantz A, Patel NH, Berthier S, Andraud C, Gomez D, Elias M. Mimicry can drive convergence in structural and light transmission features of transparent wings in lepidoptera. eLife. 2021;10:e69080. doi: 10.7554/eLife.69080.
- Piron-Prunier F, Persyn E, Legeai F, McClure M, Meslin C, Robin S, Alves-Carvalho S, Mohammad A, Blugeon C, Jacquin-Joly E, Montagné N, et al. Comparative transcriptome analysis at the onset of speciation in a mimetic butterfly, the Ithomiini Melinaea marsaeus. Journal of Evolutionary Biology. 2021;34:1704–1721. doi: 10.1111/jeb.13940.
- Ranallo-Benavidez TR, Jaron KS, Schatz MC. GenomeScope 2.0 and Smudgeplot for reference-free profiling of polyploid genomes. Nature Communications. 2020;11(1):1432. doi: 10.1038/s41467-020-14998-3.
- Rhie A, Walenz BP, Koren S, Phillippy AM. Merqury: Reference-free quality, completeness, and phasing assessment for genome assemblies. Genome Biology. 2020;21:245. doi: 10.1186/s13059-020-02134-9.
- Robertson HM. Molecular evolution of the major arthropod chemoreceptor gene families. Annual Review of Entomology. 2019;64:227–242. doi: 10.1146/annurev-ento-020117-043322.
- Robinson JT, Thorvaldsdóttir H, Winckler W, Guttman M, Lander ES, Getz G, Mesirov JP. Integrative genomics viewer. Nature Biotechnology. 2011;29(1):24–26. doi: 10.1038/nbt.1754.
- Schulz S, Beccaloni G, Brown KS, Boppré M, Freitas AVL, Ockenfels P, Trigo JR. Semiochemicals derived from pyrrolizidine alkaloids in male ithomiine butterflies (lepidoptera: Nymphalidae: Ithomiinae). Biochemical Systematics and Ecology. 2004;32(8):699–713.
- Seixas FA, Edelman NB, Mallet J. Synteny-based genome assembly for 16 species of Heliconius butterflies, and an assessment of structural variation across the genus. Genome Biology and Evolution. 2021;13(7):evab069. doi: 10.1093/gbe/evab069.
- Seppey M, Manni M, Zdobnov EM. In: Gene prediction: Methods and protocols. Kollmar M, editor. Springer; New York: 2019. BUSCO: Assessing genome assembly and annotation completeness; pp. 227–245.
- Sherratt TN. The evolution of Müllerian mimicry. Die Naturwissenschaften. 2008;95(8):681–695. doi: 10.1007/s00114-008-0403-y.
- Slater GSC, Birney E. Automated generation of heuristics for biological sequence comparison. BMC Bioinformatics. 2005;6:31. doi: 10.1186/1471-2105-6-31.
- Stamm P, Mann F, McClure M, Elias M, Schulz S. Chemistry of the Androconial secretion of the Ithomiine butterfly Oleria Onega. Journal of Chemical Ecology. 2019;45(9):768–778. doi: 10.1007/s10886-019-01100-5.
- Storer J, Hubley R, Rosen J, Wheeler TJ, Smit AF. The Dfam community resource of transposable element families, sequence models, and genome annotations. Mobile DNA. 2021;12(1):2. doi: 10.1186/s13100-020-00230-y.
- Tarailo-Graovac M, Chen N. Using RepeatMasker to identify repetitive elements in genomic sequences. Current Protocols in Bioinformatics. 2009;25:4.10.1–4.10.14. doi: 10.1002/0471250953.bi0410s25.
- van Schooten B, Meléndez-Rosa J, Van Belleghem SM, Jiggins CD, Tan JD, McMillan WO, Papa R. Divergence of chemosensing during the early stages of speciation. Proceedings of the National Academy of Sciences of the United States of America. 2020;117(28):16438–16447. doi: 10.1073/pnas.1921318117.
- Visendi P. In: Plant bioinformatics: Methods and protocols. Edwards D, editor. Springer US; 2022. De novo assembly of linked reads using supernova 2.0; pp. 233–243.
- Vogt RG, Große-Wilde E, Zhou J-J. The lepidoptera odorant binding protein gene family: Gene gain and loss within the GOBP/PBP complex of moths and butterflies. Insect Biochemistry and Molecular Biology. 2015;62:142–153. doi: 10.1016/j.ibmb.2015.03.003.
- Willmott KR, Mallet J. Correlations between adult mimicry and larval host plants in ithomiine butterflies. Proceedings of the Royal Society B: Biological Sciences. 2004;271(Suppl 5):S266–S269. doi: 10.1098/rsbl.2004.0184.
- Willmott KR, Robinson Willmott JC, Elias M, Jiggins CD. Maintaining mimicry diversity: Optimal warning colour patterns differ among microhabitats in Amazonian clearwing butterflies. Proceedings of the Royal Society B: Biological Sciences. 2017;284(1855):20170744. doi: 10.1098/rspb.2017.0744.
- Zhan S, Merlin C, Boore JL, Reppert SM. The monarch butterfly genome yields insights into long-distance migration. Cell. 2011;147(5):1171–1185. doi: 10.1016/j.cell.2011.09.052.