eTOC blurb
Westerman, VanKuren et al. show that butterfly wing color maps to a putative cis-regulatory element adjacent to two aristaless genes. The genes are differentially expressed between white and yellow wings and CRISPR knockout of aristaless1 causes white wings to develop yellow. Both colors have been shared among species via hybridization.
Summary
Neotropical Heliconius butterflies display a diversity of warningly colored wing patterns which serve roles in both Müllerian mimicry and mate choice behavior. Wing pattern diversity in Heliconius is controlled by a small number of unlinked, Mendelian “switch” loci [1]. One of these, termed the K locus, switches between yellow and white color patterns, important mimicry signals as well as mating cues [2-4]. Furthermore, mate preference behavior is tightly linked to this locus [4]. K controls the distribution of white vs. yellow scales on the wing, with a dominant white allele and a recessive yellow allele. Here we combine fine-scale genetic mapping, genome-wide association studies, gene expression analyses, population and comparative genomics, and genome editing with CRISPR/Cas9 to characterize the molecular basis of the K locus in Heliconius and to infer its evolutionary history. We show that white vs. yellow color variation in Heliconius cydno is due to alternate haplotypes at a putative cis-regulatory element (CRE) downstream of a tandem duplication of the homeodomain transcription factor aristaless. Aristaless1 (al1) and aristaless2 (al2) are differentially regulated between white and yellow wings throughout development with elevated expression of al1 in developing white wings suggesting a role in repressing pigmentation. Consistent with this, knock-out of al1 causes white wings to become yellow. The evolution of wing color in this group has been marked by retention of the ancestral yellow color in many lineages, a single origin of white coloration in H. cydno, and subsequent introgression of white color from H. cydno into H. melpomene.
Results
The diverse color patterns of neotropical Heliconius butterflies are well known for their role in Müllerian mimicry [5-9], or mimicry among mutually toxic species [10]. In Heliconius, many instances of mimicry involve the sharing of color patterns among different Heliconius species, all of which are chemically defended and unpalatable to predators [5, 11]. In addition to serving as signals to predators—both as warning coloration and in the context of mimicry—Heliconius wing patterns also play an essential role in intraspecific communication. In particular, a number of Heliconius species and subspecies have been shown to mate assortatively and this is driven by male preference for females with conspecific color patterns [2-4, 12-14]. Wing patterns are highly variable within and between Heliconius species but the genetic architecture of this variation is dominated by a small number of large-effect loci [15-17]. These largely unlinked, Mendelian loci switch color pattern elements on particular portions of the wing [18]. Three of the four largest effect switch loci have been positionally cloned and traced back to the actions of specific genes: optix, WntA, and cortex [19-21]. The fourth switch locus, called the K locus, switches light portions of the wing between white and yellow but the molecular basis of this locus has not been identified.
Species and subspecies in the Heliconius cydno complex display a diversity of color patterns resulting primarily from variation in melanic patterning, as well as white vs. yellow variation controlled by the K locus. This phenotypic variation exists because across the range of H. cydno, which spans a number of subspecies as well as closely-related species such as H. pachinus, members of this clade are precise co-mimics of distantly-related Heliconius species such as H. sapho, H. eleuchia, and H. hewitsoni. Previously, we found that the K locus plays a dominant role in mediating assortative mating between H. cydno galanthus and H. pachinus, white and yellow sister taxa from Costa Rica (Figure S1A), because males use wing color (white vs. yellow) as the primary cue in choosing females [4]. Furthermore, male mate preference is genetically linked to the K locus itself [4]. Our previous genetic and QTL mapping results showed that the Mendelian K locus and quantitative variation in male mate preference both map to the same end of chromosome 1, in the vicinity of wingless, a color patterning candidate gene. Further supporting the linkage between color and preference, we also found that white and yellow males from the polymorphic subspecies H. cydno alithea differ in their preference for white and yellow females; white males do not, on average, prefer a color but yellow males significantly prefer yellow females [2].
To identify the molecular basis of the K locus’ effects, we improved our genetic mapping of the color switch using 127 offspring from five F2 backcross broods between white H. cydno galanthus and yellow H. cydno alithea. This resulted in a 525 kbp zero-recombinant window that contained 16 genes (Figure 1). This region was immediately adjacent to wingless but did not include it. Next, we analyzed whole genome sequence (WGS) data from 43 butterflies: 10 white H. cydno galanthus from Costa Rica, 10 yellow H. pachinus from Costa Rica, 13 white H. cydno alithea from Ecuador, and 10 yellow H. cydno alithea from Ecuador. Genome-wide association (GWA) mapping of wing color using the full dataset resulted in a narrow peak of single nucleotide polymorphism (SNP) associations located in the fine-mapping interval (Figure 1). There were three perfectly associated variants in this peak, spanning a distance of 3,920 bp, with the top two being just 434 bp apart. The associated SNPs fell in a non-coding intergenic region 28 kbp downstream of aristaless1 (al1) and aristaless2 (al2), a Lepidoptera-specific tandem duplication of the developmental transcription factor aristaless [22]. Aristaless is a paired-like homeodomain transcription factor with important roles in appendage patterning in Drosophila [23], and aristaless2 expression has been shown to correlate with discal cell wing pattern elements in a variety of butterflies [22], making one or both of these genes good candidates for the Heliconius K locus.
Figure 1. Genetic mapping of the Heliconius cydno K locus.
Linkage and QTL mapping have shown that wing color and male color preference are genetically linked on chromosome 1 (left). Fine mapping of yellow vs. white wing color using 127 F2 backcross offspring from crosses between white H. cydno galanthus and yellow H. cydno alithea identified a 525 kbp zero-recombinant window (middle). This region, from NDAE1 to HEATR1, spans 16 genes, including a tandem duplication of the developmental transcription factor aristaless. Genome-wide association mapping of wing color based on 43 whole-genome sequences (11,499,871 variants) identified a single, strong peak of association downstream of aristaless1, including three perfectly associated variants (right).
To explore the potential function of al1 and al2 in Heliconius wing coloration, we analyzed expression of these genes in developing wing discs of white and yellow H. cydno using quantitative real-time PCR (qPCR). Interestingly, we found largely non-overlapping expression patterns, with al1 being highly expressed throughout development in white H. cydno (Figure 2A) and al2 being expressed in yellow H. cydno, albeit at a lower level (Figure 2B). The white color of scales on Heliconius wings is structurally based, whereas yellow results when those same scales produce or import a yellow pigment, the ommochrome precursor 3-hydroxy-L-kynurenine (3-OHK) [24]. Because the white K locus allele, which corresponds to the absence of pigment, is dominant to the yellow allele, we hypothesized that the white allele may function as a dominant repressor of pigmentation. If al1 is functionally responsible for switching wing color, then eliminating expression of al1, which is expressed at high levels during the development of white wings, should result in a yellow pigmented wing. To test this, we used CRISPR/Cas9 [25, 26] to knock out al1 in white H. cydno galanthus and observed the effect in first-generation (G0) individuals that were somatic mosaics. In support of our hypothesis, al1 knockout resulted in streaks of yellow clones spanning the white patch on the forewing (Figure 2C-E). Experiments with white H. cydno alithea produced similar results (Figure S2). Our data suggest that the narrow genomic interval we mapped downstream of al1 is a cis-regulatory element that leads to differential expression of the aristaless genes in developing butterflies, ultimately generating white or yellow wings.
Figure 2. aristaless1 controls Heliconius wing color.
Gene expression levels of aristaless1 (A) and aristaless2 (B) differ between white and yellow winged butterflies over the course of wing disc development with aristaless1 up-regulated in white H. cydno. (C-E) Mosaic CRISPR/Cas9 knockout of aristaless1 in white H. cydno generates yellow pigmented wing scales instead of wild-type white scales. See also Figure S2.
There are multiple, well-documented instances of adaptive introgression of wing pattern mimicry in Heliconius butterflies, many involving H. melpomene donating its color pattern to other species [27, 28]. It has been hypothesized that yellow-winged subspecies of H. cydno, including yellow H. cydno alithea and H. pachinus, may have independently acquired their wing color from sympatric H. melpomene [29]. We tested this by scanning the K locus interval for evidence of introgression between yellow winged H. cydno and yellow winged H. melpomene using the D-statistic [30], but our results revealed no evidence of yellow haplotype introgression (Figure 3A). In contrast, we did find a strong D-statistic signature (2.6–2.62 Mbp: D = 0.421±0.113, Z = 3.735, p = 0.0002) between white H. cydno and the white winged H. melpomene cythera, indicative of introgression of the white haplotype (Figure 3A). This signal was centered directly on the putative CRE downstream of al1. This region highlighted by the D-statistic also showed a peak in the related fd statistic (2.6–2.62 Mbp: fd = 0.376±0.101, Z = 3.726, p = 0.0002), as well as reduced DNA sequence divergence, dxy, between white H. cydno and white H. melpomene (Figure 3A). These signatures at the putative CRE stood out relative to chromosome 1 as a whole (Figure S1B) and the focal 20 kbp window was significantly different from windows spanning chromosome 1 (D: U = 2051, p = 2.94 × 10^−7; fd: U = 1980, p = 1.98 × 10^−7; dxy: U = 3438, p = 0.0002). Furthermore, comparison of a phylogenetic tree inferred using SNPs in the K locus (Figure 3B) to a tree inferred from 35.6 million genome-wide SNPs (Figure 3C) revealed instances of discordance indicative of introgression in the color determining region: the K locus phylogeny grouped samples by wing color, while the genome-wide phylogeny grouped samples by taxon.
The introgression signal was specific to the K locus, as trees based on intervals adjacent to, but excluding K, also grouped samples by taxon instead of color (Figure S1C). The grouping of all white H. melpomene K locus haplotypes with white H. cydno haplotypes suggests that the white haplotype originated in H. cydno and was subsequently transferred to H. melpomene. Introgression appears to have been recent and spatially restricted because the K locus tree clustered white H. melpomene cythera samples with white H. cydno alithea, both of which co-occur in western Ecuador. Similarly, the small cluster of yellow H. melpomene samples that group with yellow H. cydno suggests that a minority of the yellow variation present in H. melpomene was acquired from H. cydno. Again, this appears to have been recent introgression because the K locus tree clustered these two H. melpomene samples, which were collected from the same site in Costa Rica, with yellow H. pachinus from Costa Rica.
Figure 3. Evolution of wing color across Heliconius butterflies.
(A) Sliding-window analyses of Patterson’s D-statistic, fd and dxy identified peaks of biased allele sharing between white H. cydno and white H. melpomene at the putative CRE downstream of al1. (B) A gene genealogy based on 30 kbp encompassing the putative CRE reveals evidence of ancestral yellow variation across Heliconius, as well as instances of putative introgression from H. cydno into H. melpomene. The close affiliation of H. timareta and H. melpomene on the K locus tree may also indicate introgression, but that remains less certain because of the highly variable position of H. timareta across gene genealogies (Figure S1C). Internal nodes with 100% bootstrap support are marked with a blue circle. (C) Phylogeny of the melpomene/cydno/silvaniform clade of Heliconius inferred based on > 35 million genome-wide SNPs. See also Figure S1.
Introgression at the K locus appears to involve movement of alleles from H. cydno to H. melpomene. Furthermore, these results suggest that the H. cydno lineage was ancestrally yellow and that the current widespread yellow coloration in the group is due to segregating ancestral variation. Sometime after the origin of the H. cydno clade, there was a new innovation, the origin of white wing coloration, and this was subsequently exported to H. melpomene via introgression. This evolutionary inference allowed us to further dissect potential regulation of the K locus. If white is the derived phenotype, we might expect to see evidence of novel functional variants associated with the white haplotype in the location of the putative CRE. Indeed, we found white-associated SNPs in the location of the putative CRE that resulted in novel binding sites for transcription factors Mitf, biniou, vismay, and aristaless itself (Figure S3). However, the general pattern was of potentially greater regulation of genes on the yellow haplotype, with transcription factors generally having higher binding affinities for yellow haplotype sequences than white haplotype sequences across the K locus interval (Figure S3). Furthermore, white-associated alleles at two highly-associated SNPs caused disruption of putative Mitf and vismay binding sites (Figure S3). The potential involvement of Mitf here suggests intriguing parallels between Heliconius wing coloration and the developmental genetics of stripe patterning in rodents, where an aristaless-like gene, Alx3, interacts with Mitf to switch between light and dark hairs [31]. A single copy of Mitf is present in Heliconius (HMEL006056 in H. melpomene, release 2, and HEL_005372 in H. erato lativitta) and is expressed in developing wings [32], but additional data will be required to compare the color patterning pathways.
Discussion
Wing pattern mimicry in butterflies has served as an important example of adaptation since Charles Darwin and Alfred Russel Wallace first proposed the concept of evolution by natural selection [33]. Both Henry Walter Bates and Fritz Müller built their concepts of mimicry, respectively known as Batesian and Müllerian mimicry, around their observations of Amazonian butterflies [10, 34]. Heliconius butterflies were much discussed in those early days of mimicry theory and have continued to fuel our understanding of adaptation and diversification ever since. Because Heliconius color patterns also contribute to assortative mating and reproductive isolation [13], they provide a window into the process of speciation, in particular the mechanisms by which divergent ecological selection may facilitate speciation [35]. Recently, the molecular basis of Heliconius color pattern variation has been revealed, with critical observations connecting different Mendelian switch loci to the effects of the genes optix, WntA, and cortex. Here we have shown that a central component of mimicry in Heliconius cydno and related species, what has historically been called the K locus, is controlled by the gene aristaless1. The white vs. yellow wing color that is controlled by aristaless1 serves as both a mimicry signal and a mating cue, making this discovery relevant to understanding the genetics of both adaptation and speciation in the clade.
Our findings raise a number of important questions that have yet to be resolved. First, we still do not understand the regulation and developmental biology of this color switch. Our results indicate that al1 contributes to the development of distinct wing colors but is al2 involved as well? Previous work indicates that al2 expression may play a widespread role in patterning discal cell elements on the wings of butterflies [22]. It is unclear to what extent those observations intersect with our characterization of the K locus here. Even within Heliconius, we do not know whether al1 controls color pattern variation in other subclades, such as the pupal mating species H. eleuchia, H. sapho, and H. hewitsoni, which are variably yellow and white and are the co-mimics of the H. cydno clade. Finally, we have previously shown that mate preference is also genetically linked to the K locus [4] and it remains to be seen whether this color associated region, or a different linked region, underlies mate preference variation. Future work, combining behavior, genomics, and functional genetics will help to clarify these important, outstanding questions.
STAR Methods
CONTACT FOR REAGENT AND RESOURCE SHARING
Further information and requests for resources and reagents should be directed to and will be fulfilled by the Lead Contact, Marcus Kronforst (mkronforst@uchicago.edu).
METHOD DETAILS
Genetic mapping
We crossed a white H. cydno galanthus female with a yellow H. cydno alithea male to generate F1 hybrids and then individually paired five F1 males with yellow H. cydno alithea females to generate F2 backcross broods. Across the five families, we raised a total of 127 adult F2 butterflies (50 yellow, 77 white). We PCR amplified and Sanger sequenced nine markers in the K locus region in parents and offspring to identify and score segregating variation, particularly heterozygous sites in F1 fathers that could be traced in F2 offspring. Primers are in Table S3. We detected a total of seven recombinants representing four unique recombination events, two on either side of the 525 kbp zero-recombinant interval (Figure 1).
Genome-wide association mapping
We performed GWA for color using 43 previously published whole genome sequencing datasets [36]: H. cydno galanthus (N=10), H. pachinus (N=10), and yellow (N=10) and white (N=13) H. cydno alithea. Sample details are in Table S1. We trimmed adapters and low-quality regions from raw reads using Trimmomatic 0.36 [37] before mapping to the H. melpomene (release 2) reference genome [38] using BWA MEM v0.7.12 [39] with default settings except for the -M flag to mark secondary alignments. We marked duplicate reads using PicardTools v1.92, then called SNPs and indels using HaplotypeCaller and GenotypeGVCFs from the Genome Analysis Toolkit (GATK v3.7-0 [40]) with default parameters, except that expected heterozygosity (-hets) was set to 0.005. We filtered out sites with overall quality less than 1000 and sites that exhibited strong supporting read biases to generate a final SNP/indel callset. We jointly called inversions (300 bp − 3 Mb), deletions (50 bp − 3 Mb), tandem duplications (300 bp − 3 Mb), and insertions using delly 0.7.6 [41]. We included only high-quality (≥ 3 supporting read pairs), nucleotide-resolution calls in our final call set. Filtered SNP, indel, and SV calls were merged before subsequent analyses. GWA was performed using a univariate linear mixed model in GEMMA 0.94 [42]. We excluded variant sites with >20% missing genotypes or minor allele frequency < 0.05. We analyzed p-values from site-wise Wald tests.
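The site-level exclusion rule described above (more than 20% missing genotypes, or minor allele frequency below 0.05) can be illustrated with a minimal sketch. The function name `passes_gwa_filters` and the genotype encoding (alternate-allele counts 0, 1, 2 per diploid sample, `None` for a missing call) are assumptions made for illustration; this is not the filtering code used by GEMMA or by the authors.

```python
from typing import Optional, Sequence


def passes_gwa_filters(genotypes: Sequence[Optional[int]],
                       max_missing: float = 0.20,
                       min_maf: float = 0.05) -> bool:
    """Return True if a variant site survives the two filters.

    genotypes: alternate-allele count per diploid sample (0, 1 or 2),
               or None where the genotype is missing (assumed encoding).
    """
    n_samples = len(genotypes)
    called = [g for g in genotypes if g is not None]
    if n_samples == 0 or not called:
        return False

    # Filter 1: exclude sites with more than 20% missing genotypes.
    missing_rate = 1 - len(called) / n_samples
    if missing_rate > max_missing:
        return False

    # Filter 2: exclude sites with minor allele frequency below 0.05.
    alt_freq = sum(called) / (2 * len(called))
    maf = min(alt_freq, 1 - alt_freq)
    return maf >= min_maf
```

For example, a site genotyped in all samples with an alternate-allele frequency of 0.45 is retained, whereas a site with three of ten genotypes missing is dropped regardless of its allele frequency.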
qRT-PCR
We compared al1 and al2 gene expression between white H. cydno galanthus and yellow H. cydno alithea using qRT-PCR. We assessed the relative expression of genes in forewing tissue at developmental time points spanning 5th instar larvae to late pupal stages. We extracted RNA using Trizol and then synthesized cDNA using BioRad’s iScript cDNA synthesis kit. For each target, we designed gene-specific primers and checked for high (>90%) primer efficiency with standard curves. Primers are in Table S3. We also assessed expression of a control gene, elongation factor 1 alpha (ef1α), to normalize expression levels of our targets. Reactions were run on an ABI 7500 Fast HT machine using the ABI SYBR Green 2X master mix and ABI MicroAmp Fast Optical 96-well plates. We quantified gene expression levels using the 2^(−ΔΔCT) method.
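As a worked illustration of the 2^(−ΔΔCT) calculation, the sketch below normalizes a target gene's threshold cycle (CT) to the reference gene within each sample and then compares the sample with a calibrator. The function name, the argument names, and the choice of calibrator sample are hypothetical; the formula also assumes roughly 100% amplification efficiency, which is an approximation given the >90% efficiency threshold stated above.

```python
def relative_expression(ct_target_sample: float,
                        ct_reference_sample: float,
                        ct_target_calibrator: float,
                        ct_reference_calibrator: float) -> float:
    """Fold change of a target gene by the 2^(-ddCT) method.

    Each delta-CT normalizes the target (e.g. al1) to the reference gene
    (e.g. ef1a) within one sample; the delta-delta-CT then compares the
    sample of interest with the calibrator sample.
    """
    delta_ct_sample = ct_target_sample - ct_reference_sample
    delta_ct_calibrator = ct_target_calibrator - ct_reference_calibrator
    delta_delta_ct = delta_ct_sample - delta_ct_calibrator
    return 2.0 ** (-delta_delta_ct)


# Target crosses threshold 2 cycles earlier (relative to the reference gene)
# in the sample than in the calibrator, i.e. about a 4-fold higher level.
fold_change = relative_expression(22.0, 18.0, 24.0, 18.0)
```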
CRISPR/Cas9 knockouts
We reared H. cydno galanthus and H. cydno alithea in greenhouse insectaries at the University of Chicago. Larvae were fed on fresh host-plant material (Passiflora) under a light regime of 16 h light and 8 h dark at 26-27 °C, 60-80% humidity. We used two or three single guide RNAs (sgRNAs) to generate long deletions or frameshifts (Figure S2) [26, 43]. sgRNAs were identified using FlyCRISPR tools (http://flycrispr.molbio.wisc.edu/tools) to search genomic regions for GGN18NGG or N20NGG sequences. The specificity of candidate sgRNA sequences was assessed using BLAST to confirm there were not multiple binding sites. Target sequences are in Table S3. A sgRNA template was generated by PCR amplification with a forward primer encoding a T7 polymerase-binding site and a sgRNA target site using Phusion polymerase (New England Biolabs, Ipswich, MA, USA), and a reverse primer encoding the remainder of the sgRNA sequence [43]. In vitro transcription was conducted using the MEGAscript T7 Kit (Ambion, Waltham, MA, USA) and the product was purified by phenol–chloroform extraction and isopropanol precipitation [44].
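The two target motifs can be made concrete with a small pattern search. The sketch below is not the FlyCRISPR tool; it is a hypothetical helper, `find_sgrna_targets`, that scans only the forward strand of a sequence for a 20-nt protospacer followed by an NGG PAM, and optionally requires the protospacer to begin with GG (the GGN18NGG form). Specificity checking, which the text performs with BLAST, is not included.

```python
import re
from typing import List, Tuple


def find_sgrna_targets(seq: str,
                       require_gg_start: bool = False) -> List[Tuple[int, str]]:
    """List (0-based start, 20-nt protospacer) for candidate target sites.

    require_gg_start=False matches N20-NGG; True matches GG-N18-NGG.
    Only the forward strand is searched (a simplifying assumption).
    """
    seq = seq.upper()
    protospacer = "GG[ACGT]{18}" if require_gg_start else "[ACGT]{20}"
    # A lookahead keeps matches zero-width so overlapping sites are reported.
    pattern = re.compile(rf"(?=({protospacer})[ACGT]GG)")
    return [(m.start(), m.group(1)) for m in pattern.finditer(seq)]


example = "TT" + "GG" + "A" * 18 + "TGG" + "CC"
sites = find_sgrna_targets(example, require_gg_start=True)
```

Searching the reverse complement of the same region with the same function would cover targets on the opposite strand.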
To retrieve eggs for injection, we offered host-plants to female butterflies to lay eggs for a period of 1-3 hours. To soften the chorion, the eggs were collected, washed for 120 s in 7.5% benzalkonium chloride (Sigma-Aldrich), rinsed in water, and dried by air ventilation. Eggs were arranged on a double-sided adhesive tape glued to a glass slide with the micropyle facing up to allow the caterpillars to break the eggshell. Injection mixtures containing sgRNAs and recombinant Cas9 protein (PNA Bio Inc.) were injected using a 0.5-mm borosilicate needle (Sutter Instruments, Novato, CA, USA). The concentrations of Cas9 and sgRNAs varied between 125–250 ng/μl and 83–125 ng/μl, respectively. After injection, embryos were placed into humid petri dishes until hatching (~4 days) and moved to an incubator at 26-27 °C and ~80% humidity to develop. Hatchlings were collected via paintbrush and transferred to the host plant. Adults eclosed approximately 10 days post-pupation. After emerging, the adults were frozen for genotyping and pinned.
For genotyping, we isolated DNA from tissue showing mutant phenotypes, or single legs, using a phenol-chloroform protocol, and PCR amplified a section of al1 flanking the sgRNA target region (primers are in Table S3). We gel-purified PCR products, subcloned them into a TOPO TA vector (Invitrogen, Carlsbad, CA, USA), and then sequenced them with dye terminator technology. The sequence data were analyzed with Geneious v9.1.3 software.
Evolutionary analyses
We performed whole genome re-sequencing for 9 adult butterflies and combined these with 117 whole genome re-sequencing datasets from NCBI (PRJNA226620 and PRJNA308754) [28, 45] and ENA (ERP002440 and PRJEB8011) [20, 46]. The raw reads were processed with Trimmomatic v0.36 [37] and the high quality reads were aligned to the H. melpomene reference genome [38] using Bowtie2 v2.2.9 with the parameter --very-sensitive-local [46]. Picard v2.8.1 (https://broadinstitute.github.io/picard/) was used to remove PCR duplicates. RealignerTargetCreator and IndelRealigner in GATK v3.4 were used to realign indels [40] and UnifiedGenotyper was used to call genotypes [48]. SNPs with good quality (Qual > 30) were used in the downstream analyses. Sample details are listed in Table S2.
We performed a genome-wide maximum-likelihood phylogenetic analysis by extracting polymorphic genotype calls (approximately 35.64 million SNPs) from 39 individuals (one sample per species, subspecies, morph) with good quality, converting them into PHYLIP format and constructing a genome-wide tree using RAxML [49] with GTRGAMMA model and 100 bootstrap replicates (Figure 3C, Figure S1A). We also extracted genotype calls from 126 individuals and constructed a phylogenetic tree for 30 kb regions in the same manner (Figure 3B, Figure S1C). The tree images were visualized using iTOL [50].
We applied Patterson’s D-statistic [30, 51] to identify potential introgression signatures around the K locus. We used H. wallacei as an outgroup taxon and set the ingroup taxa as (yellow H. c. alithea, white H. c. alithea, H. m. cythera) and (yellow H. c. alithea, white H. c. alithea, H. m. rosina). Because we had multiple individuals per taxon, we used the frequency of the derived allele in each taxon, rather than binary counts of fixed ABBA and BABA sites, to detect gene flow. The D-statistic was calculated as
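(1) $$D(P_1,P_2,P_3,P_4)=\frac{\sum_{i}\left[(1-\hat{P}_{i1})\,\hat{P}_{i2}\,\hat{P}_{i3}\,(1-\hat{P}_{i4})-\hat{P}_{i1}\,(1-\hat{P}_{i2})\,\hat{P}_{i3}\,(1-\hat{P}_{i4})\right]}{\sum_{i}\left[(1-\hat{P}_{i1})\,\hat{P}_{i2}\,\hat{P}_{i3}\,(1-\hat{P}_{i4})+\hat{P}_{i1}\,(1-\hat{P}_{i2})\,\hat{P}_{i3}\,(1-\hat{P}_{i4})\right]}$$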
[30] where P1, P2, P3 and P4 refer to the four taxa (P4 being the outgroup) and P̂ij refers to the derived allele frequency at SNP i in population j. We also calculated the related statistic, fd, using the following equation:
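The frequency-based D-statistic can be sketched in a few lines: at each SNP, the ABBA weight is (1 − p1)·p2·p3·(1 − p4) and the BABA weight is p1·(1 − p2)·p3·(1 − p4), and D is the difference of the summed weights divided by their total. The function name `patterson_d` and the list-of-frequencies input are assumptions for illustration, not the authors' script.

```python
from typing import Sequence


def patterson_d(p1: Sequence[float], p2: Sequence[float],
                p3: Sequence[float], p4: Sequence[float]) -> float:
    """Patterson's D from per-site derived allele frequencies.

    p1, p2, p3 are the ingroup populations and p4 is the outgroup;
    all four sequences must list the same SNPs in the same order.
    """
    abba = 0.0
    baba = 0.0
    for a, b, c, o in zip(p1, p2, p3, p4):
        abba += (1 - a) * b * c * (1 - o)  # P2 shares the derived allele with P3
        baba += a * (1 - b) * c * (1 - o)  # P1 shares the derived allele with P3
    total = abba + baba
    if total == 0:
        raise ValueError("no informative ABBA/BABA sites in this window")
    return (abba - baba) / total
```

Positive values indicate excess allele sharing between P2 and P3, negative values excess sharing between P1 and P3, and values near zero are consistent with no gene flow.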
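(2) $$f_d=\frac{\sum_{i}\left[(1-\hat{P}_{i1})\,\hat{P}_{i2}\,\hat{P}_{i3}\,(1-\hat{P}_{iO})-\hat{P}_{i1}\,(1-\hat{P}_{i2})\,\hat{P}_{i3}\,(1-\hat{P}_{iO})\right]}{\sum_{i}\left[(1-\hat{P}_{i1})\,\hat{P}_{iD}\,\hat{P}_{iD}\,(1-\hat{P}_{iO})-\hat{P}_{i1}\,(1-\hat{P}_{iD})\,\hat{P}_{iD}\,(1-\hat{P}_{iO})\right]}$$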
[52] with P1, P2, P3 and O as the four taxa of the comparison. At each site, PD denotes whichever of P2 or P3 had the higher frequency of the derived allele.
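A minimal sketch of fd follows. The numerator is the ABBA minus BABA sum for (P1, P2, P3, O); the denominator is the same sum with both P2 and P3 replaced, site by site, by the higher of their two derived allele frequencies (PD). The function names are hypothetical and the inputs are assumed to be per-site derived allele frequencies.

```python
from typing import Sequence


def abba_minus_baba(p1: Sequence[float], p2: Sequence[float],
                    p3: Sequence[float], po: Sequence[float]) -> float:
    """Sum over sites of the ABBA weight minus the BABA weight."""
    return sum((1 - a) * b * c * (1 - o) - a * (1 - b) * c * (1 - o)
               for a, b, c, o in zip(p1, p2, p3, po))


def f_d(p1: Sequence[float], p2: Sequence[float],
        p3: Sequence[float], po: Sequence[float]) -> float:
    """fd: observed ABBA-BABA excess relative to the maximum expected
    when P2 and P3 both carry the higher of their two frequencies (PD)."""
    pd = [max(b, c) for b, c in zip(p2, p3)]
    numerator = abba_minus_baba(p1, p2, p3, po)
    denominator = abba_minus_baba(p1, pd, pd, po)
    if denominator == 0:
        raise ValueError("fd is undefined for this window")
    return numerator / denominator
```

With a single site where P1 = 0, P2 = 0.5, P3 = 1 and the outgroup is fixed for the ancestral allele, the numerator is 0.5 and the denominator is 1, giving fd = 0.5.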
For each 20 kbp window, the standard error was calculated using a moving block bootstrap approach according to Zhang et al. [28]. A two-tailed z-test was then performed to determine whether each D and fd value was significantly different from zero, indicative of potential gene flow.
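The z-test step can be sketched as follows: the statistic is divided by its bootstrap standard error and the two-tailed p-value is taken from the standard normal distribution. The function name is hypothetical, and the bootstrap that produces the standard error is not reproduced here. The example assumes that the ± value reported in the Results for D in the 2.6–2.62 Mbp window (0.421 ± 0.113) is a standard error; those rounded values give Z of about 3.73, close to the reported 3.735.

```python
import math
from typing import Tuple


def two_tailed_z_test(estimate: float, standard_error: float) -> Tuple[float, float]:
    """Return (Z, two-tailed p) for the null hypothesis estimate == 0."""
    if standard_error <= 0:
        raise ValueError("standard error must be positive")
    z = estimate / standard_error
    # For a standard normal variable, P(|X| > |z|) = erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Rounded values from the Results: D = 0.421 with an assumed SE of 0.113.
z_score, p_value = two_tailed_z_test(0.421, 0.113)
```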
We also calculated mean pairwise sequence divergence (dxy) for two comparisons, white H. cydno vs. white H. melpomene cythera and white H. cydno vs. yellow H. melpomene rosina, around the K locus as well as across chromosome 1 as a whole. We calculated mean dxy as
(3) $$d_{xy}=\frac{1}{n}\sum_{i=1}^{n}\left[\hat{P}_{ix}\,(1-\hat{P}_{iy})+\hat{P}_{iy}\,(1-\hat{P}_{ix})\right]$$
[53] where P̂ refers to the reference allele frequency in the corresponding population (x or y) at site i, and n is the number of sites.
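A minimal sketch of the dxy calculation: at each site, the probability that one allele drawn from population x differs from one drawn from population y is px·(1 − py) + py·(1 − px), and these values are averaged over sites. The function name is hypothetical, and dividing by the number of sites supplied (rather than, for example, all callable sites in a window) is an assumption of this sketch.

```python
from typing import Sequence


def mean_dxy(px: Sequence[float], py: Sequence[float]) -> float:
    """Mean pairwise divergence between populations x and y.

    px, py: reference allele frequency at each site in each population,
            listed in the same site order.
    """
    if len(px) != len(py) or len(px) == 0:
        raise ValueError("px and py must be non-empty and of equal length")
    per_site = [x * (1 - y) + y * (1 - x) for x, y in zip(px, py)]
    return sum(per_site) / len(per_site)
```

A site fixed for different alleles in the two populations contributes 1, a site fixed for the same allele contributes 0, and a site at frequency 0.5 in both contributes 0.5.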
We calculated D, fd, and dxy across chromosome 1 in non-overlapping windows of 20 kbp, including windows with a minimum of 1,000 SNPs (Figure S1B), following ref. [54]. For a more detailed look at the K locus (Figure 3A), we calculated D, fd, and dxy in non-overlapping windows of 10 kbp, including windows with a minimum of 500 SNPs. We compared K locus values of D, fd, and dxy to windows spanning chromosome 1 using the Mann-Whitney U test.
Transcription factor binding site analysis
We tested whether SNPs in the putative aristaless cis-regulatory element differentially affect transcription factor (TF) binding between yellow and white haplotypes using two related approaches. First, we generated 21 bp sequences for the top 10 color-associated SNPs by concatenating the 10 bp flanking the SNP on each side with either the “yellow” H. pachinus allele (fixed in H. pachinus and yellow H. c. alithea, but absent in H. c. galanthus) or the “white” (H. c. galanthus) allele. We used TomTom [55] to manually search for TFBS sequences in the white and yellow sequences using TF position weight matrices (PWMs) from the Drosophila melanogaster OnTheFly database [56]. We retained for further study six TFs that (1) showed a presence/absence difference between the two sequences, (2) had a binding site encompassing the SNP site, and (3) had a strong PWM score. In our second approach, we analyzed the binding affinity of these six TFs at all SNP sites spanning an 11 kbp interval that included the putative CRE. We used position count matrices collected from the Fly Factor Survey database [57] and calculated relative affinity following ref. [58]. Where there were multiple matrices for a factor in the database, we selected the matrix with the highest number of counts per nucleotide. To avoid bias in matrices with few counts, we added a pseudocount of 1 to all matrices (distributed evenly across nucleotides). We generated 21 bp sequences, as before, for all SNPs in the putative CRE. We then calculated the PWM score, Si, of each sequence as
(4) $$S_i=\sum_{j}\log\frac{f(b,j)}{0.25}$$
where f(b,j) is the frequency of base b at position j in the PWM and 0.25 is the background probability of observing that nucleotide at that position. The relative binding affinity, Ki, of a sequence is then given by
(5) $$K_i=\exp\left[-\frac{S_m-S_i}{\lambda}\right]$$
where Sm is the score of the consensus sequence for a PWM. We fixed the parameter λ to 1 [58]. This gives the predicted binding affinity of a sequence in units of K0, the affinity of a consensus binding site for that factor.
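The scoring scheme can be sketched as follows, under several stated assumptions: the logarithm is taken as the natural log, the pseudocount of 1 is interpreted as 0.25 added to each nucleotide at each position, and the sequence scored has the same length as the PWM (scanning a 21 bp sequence with a shorter matrix is not shown). All function names are hypothetical and the count matrix in the example is invented for illustration.

```python
import math
from typing import Dict, List

BASES = "ACGT"
Column = Dict[str, float]


def counts_to_frequencies(counts: List[Column], pseudocount: float = 1.0) -> List[Column]:
    """Add the pseudocount (spread evenly over A, C, G, T) and normalize."""
    freqs = []
    for col in counts:
        padded = {b: col[b] + pseudocount / 4 for b in BASES}
        total = sum(padded.values())
        freqs.append({b: padded[b] / total for b in BASES})
    return freqs


def pwm_score(seq: str, freqs: List[Column], background: float = 0.25) -> float:
    """S_i: sum over positions of log(f(b, j) / background)."""
    return sum(math.log(freqs[j][b] / background) for j, b in enumerate(seq))


def relative_affinity(seq: str, freqs: List[Column], lam: float = 1.0) -> float:
    """K_i = exp(-(S_m - S_i) / lambda), in units of the consensus affinity."""
    consensus = "".join(max(BASES, key=lambda b: col[b]) for col in freqs)
    s_m = pwm_score(consensus, freqs)
    s_i = pwm_score(seq, freqs)
    return math.exp(-(s_m - s_i) / lam)


# Invented two-position count matrix, for illustration only.
example_counts = [{"A": 9, "C": 1, "G": 0, "T": 0},
                  {"A": 0, "C": 0, "G": 10, "T": 0}]
example_freqs = counts_to_frequencies(example_counts)
```

With λ = 1 and natural logs, the relative affinity reduces to the product, over positions, of the ratio between the frequency of the observed base and that of the consensus base, so the consensus sequence scores exactly 1.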
QUANTIFICATION AND STATISTICAL ANALYSIS
We performed genome-wide association mapping of wing color using 43 previously published whole genome sequencing datasets [36]: H. cydno galanthus (N=10), H. pachinus (N=10), and yellow (N=10) and white (N=13) H. cydno alithea. GWA was performed using a univariate linear mixed model in GEMMA 0.94 [42]. We excluded variant sites with >20% missing genotypes or minor allele frequency < 0.05. We analyzed p-values from site-wise Wald tests. We compared al1 and al2 gene expression between white H. cydno galanthus and yellow H. cydno alithea using qRT-PCR. We also assessed expression of a control gene, elongation factor 1 alpha (ef1α), to normalize expression levels of our targets. We quantified gene expression levels using the 2^(−ΔΔCT) method. Values displayed in Figure 2A and B are scaled means ± SEM, based on three biological replicates of each color/developmental stage. To study evolution of the K locus, we calculated D and fd statistics using the following taxa: (yellow H. c. alithea, white H. c. alithea, H. m. cythera, H. wallacei) and (yellow H. c. alithea, white H. c. alithea, H. m. rosina, H. wallacei). These calculations used data from four samples of each ingroup taxon and one sample of H. wallacei. The four H. m. rosina samples designated as the yellow melpomene group did not include the two samples subsequently shown to cluster with yellow cydno on the K locus gene genealogy. A two-tailed z-test was performed to determine whether each D and fd value was significantly different from zero. We also calculated dxy between white H. cydno and white H. melpomene cythera and between white H. cydno and yellow H. melpomene rosina, around the K locus as well as across chromosome 1 as a whole. These calculations used data from four samples of each taxon. We compared K locus values of D, fd, and dxy to windows spanning chromosome 1 using the Mann-Whitney U test.
DATA AND SOFTWARE AVAILABILITY
The NGS data have been deposited in NCBI SRA under BioProject ID PRJNA485723.
Highlights.
Butterfly wing color maps to a putative cis-regulatory element of aristaless1.
Aristaless1 is differentially expressed between white and yellow wings.
CRISPR knockout of aristaless1 turns white wings yellow.
Wing color has been shared among species via hybridization.
Acknowledgments
We thank Sumitha Nallu and Carlos Sahagun for assistance in the lab and greenhouse. We also thank Nancy Greig, Durrell Kapan, and Larry Gilbert for help and guidance with various aspects of this project. This project was funded by a Chicago Biomedical Consortium Postdoctoral Research Grant to E.L.W., Big Ideas Generator funding from the University of Chicago and Alfred P. Sloan Foundation funding to S.E.P., NIH grant P50GM068763 to Harvard’s FAS Center for Systems Biology, the Pew Biomedical Scholars program, University of Chicago’s Neubauer funds, NSF grant IOS-1452648 and NIH grant GM108626 to M.R.K.
Footnotes
Declaration of Interests
The authors declare no competing interests.
References
- 1. Kronforst MR, and Papa R (2015). The functional basis of wing patterning in Heliconius butterflies: the molecules behind mimicry. Genetics 200, 1–19.
- 2. Chamberlain NL, Hill RI, Kapan DD, Gilbert LE, and Kronforst MR (2009). Polymorphic butterfly reveals the missing link in ecological speciation. Science 326, 847–850.
- 3. Kronforst MR, Young LG, and Gilbert LE (2007). Reinforcement of mate preference among hybridizing Heliconius butterflies. J. Evol. Biol. 20, 278–285.
- 4. Kronforst MR, Young LG, Kapan DD, McNeely C, O’Neill RJ, and Gilbert LE (2006). Linkage of butterfly mate preference and wing color preference cue at the genomic location of wingless. Proc. Natl. Acad. Sci. USA 103, 6575–6580.
- 5. Brown KS (1981). The biology of Heliconius and related genera. Annu. Rev. Entomol. 26, 427–456.
- 6. Eltringham H (1916). On specific and mimetic relationships in the genus Heliconius. Trans. Entomol. Soc. London 1916, 101–148.
- 7. Kapan DD (2001). Three-butterfly system provides a field test of mullerian mimicry. Nature 409, 338–340.
- 8. Turner JRG (1976). Adaptive radiation and convergence in subdivisions of the butterfly genus Heliconius (Lepidoptera: Nymphalidae). Zool. J. Linn. Soc. 58, 297–308.
- 9. Turner JRG (1981). Adaptation and evolution in Heliconius: A defense of Neo Darwinism. Ann. Rev. Ecol. Syst. 12, 99–121.
- 10. Müller F (1879). Ituna and Thyridia; a remarkable case of mimicry in butterflies. Trans. Entomol. Soc. London 1879, xx–xxix.
- 11. Engler-Chaouat HS, and Gilbert LE (2007). De novo synthesis vs. sequestration: negatively correlated metabolic traits and the evolution of host plant specialization in cyanogenic butterflies. J. Chem. Ecol. 33, 25–42.
- 12. Jiggins CD, Estrada C, and Rodrigues A (2004). Mimicry and the evolution of premating isolation in Heliconius melpomene Linnaeus. J. Evol. Biol. 17, 680–691.
- 13. Jiggins CD, Naisbit RE, Coe RL, and Mallet J (2001). Reproductive isolation caused by colour pattern mimicry. Nature 411, 302–305.
- 14. Merrill RM, Gompert Z, Dembeck LM, Kronforst MR, McMillan WO, and Jiggins CD (2011). Mate preference across the speciation continuum in a clade of mimetic butterflies. Evolution 65, 1489–1500.
- 15. Nijhout HF, Wray G, and Gilbert LE (1990). An analysis of the phenotypic effects of certain color pattern genes in Heliconius (Lepidoptera: Nymphalidae). Biol. J. Linn. Soc. 40, 357–372.
- 16. Sheppard PM, Turner JRG, Brown KS, Benson WW, and Singer MC (1985). Genetics and the evolution of Muellerian mimicry in Heliconius butterflies. Philos. Trans. R. Soc. Lond. B Biol. Sci. 308, 433–613.
- 17. Turner JRG (1971). The genetics of some polymorphic forms of the butterflies Heliconius melpomene (Linnaeus) and H. erato (Linnaeus). II. The hybridization of subspecies from Surinam and Trinidad. Zoologica 56, 125–157.
- 18. Naisbit RE, Jiggins CD, and Mallet J (2003). Mimicry: developmental genes that contribute to speciation. Evol. Dev. 5, 269–280.
- 19. Martin A, Papa R, Nadeau NJ, Hill RI, Counterman BA, Halder G, Jiggins CD, Kronforst MR, Long AD, McMillan WO, et al. (2012). Diversification of complex butterfly wing patterns by repeated regulatory evolution of a Wnt ligand. Proc. Natl. Acad. Sci. USA 109, 12632–12637.
- 20. Nadeau NJ, Pardo-Diaz C, Whibley A, Supple MA, Saenko SV, Wallbank RW, Wu GC, Maroja L, Ferguson L, Hanly JJ, et al. (2016). The gene cortex controls mimicry and crypsis in butterflies and moths. Nature 534, 106–110.
- 21. Reed RD, Papa R, Martin A, Hines HM, Counterman BA, Pardo-Diaz C, Jiggins CD, Chamberlain NL, Kronforst MR, Chen R, et al. (2011). optix drives the repeated convergent evolution of butterfly wing pattern mimicry. Science 333, 1137–1141.
- 22. Martin A, and Reed RD (2010). Wingless and aristaless2 define a developmental ground plan for moth and butterfly wing pattern evolution. Mol. Biol. Evol. 27, 2864–2878.
- 23. Campbell G, Weaver T, and Tomlinson A (1993). Axis specification in the developing Drosophila appendage: the role of wingless, decapentaplegic, and the homeobox gene aristaless. Cell 74, 1113–1123.
- 24. Gilbert LE, Forrest HS, Schultz TD, and Harvey DJ (1988). Correlations of ultrastructure and pigmentation suggest how genes control development of wing scales in Heliconius butterflies. J. Res. Lepid. 26, 141–160.
- 25. Li X, Fan D, Zhang W, Liu G, Zhang L, Zhao L, Fang X, Chen L, Dong Y, Chen Y, et al. (2015). Outbred genome sequencing and CRISPR/Cas9 gene editing in butterflies. Nat. Commun. 6, 8212.
- 26. Mazo-Vargas A, Concha C, Livraghi L, Massardo D, Wallbank RWR, Zhang L, Papador JD, Martinez-Najera D, Jiggins CD, Kronforst MR, et al. (2017). Macroevolutionary shifts of WntA function potentiate butterfly wing-pattern diversity. Proc. Natl. Acad. Sci. USA 114, 10701–10706.
- 27. Heliconius Genome Consortium. (2012). Butterfly genome reveals promiscuous exchange of mimicry adaptations among species. Nature 487, 94–98.
- 28. Zhang W, Dasmahapatra KK, Mallet J, Moreira GR, and Kronforst MR (2016). Genome-wide introgression among distantly related Heliconius butterfly species. Genome Biol. 17, 25.
- 29. Gilbert LE (2003). Adaptive novelty through introgression in Heliconius wing patterns: evidence for a shared genetic “tool box” from synthetic hybrid zones and a theory of diversification. In Ecology and Evolution Taking Flight: Butterflies as Model Systems, Boggs CL, Watt WB, and Ehrlich PR, eds. (Chicago: University of Chicago Press), pp. 281–318.
- 30. Durand EY, Patterson N, Reich D, and Slatkin M (2011). Testing for ancient admixture between closely related populations. Mol. Biol. Evol. 28, 2239–2252.
- 31. Mallarino R, Henegar C, Mirasierra M, Manceau M, Schradin C, Vallejo M, Beronja S, Barsh GS, and Hoekstra HE (2016). Developmental mechanisms of stripe patterns in rodents. Nature 539, 518–523.
- 32. Lewis JJ, van der Burg KRL, Mazo-Vargas A, and Reed RD (2016). ChIP-Seq-annotated Heliconius erato genome highlights patterns of cis-regulatory evolution in Lepidoptera. Cell Rep. 16, 2855–2863.
- 33. Darwin C, and Wallace AR (1858). On the tendency of species to form varieties; and on the perpetuation of varieties and species by natural means of selection. Zool. J. Linn. Soc. 3, 46–50.
- 34. Bates HW (1862). Contributions to an insect fauna of the Amazon Valley. Lepidoptera: Heliconidae. Trans. Linn. Soc. Lond. 23, 495–566.
- 35. Jiggins CD (2008). Ecological speciation in mimetic butterflies. BioScience 58, 541–548.
- 36. Gallant JR, Imhoff VE, Martin A, Savage WK, Chamberlain NL, Pote BL, Peterson C, Smith GE, Evans B, Reed RD, et al. (2014). Ancient homology underlies adaptive mimetic diversity across butterflies. Nat. Commun. 5, 4817.
- 37. Bolger AM, Lohse M, and Usadel B (2014). Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics 30, 2114–2120.
- 38. Davey JW, Chouteau M, Barker SL, Maroja L, Baxter SW, Simpson F, Joron M, Mallet J, Dasmahapatra KK, and Jiggins CD (2016). Major improvements to the Heliconius melpomene genome assembly used to confirm 10 chromosome fusion events in 6 million years of butterfly evolution. G3 6, 695–708.
- 39. Li H, and Durbin R (2009). Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics 25, 1754–1760.
- 40. McKenna A, Hanna M, Banks E, Sivachenko A, Cibulskis K, Kernytsky A, Garimella K, Altshuler D, Gabriel S, Daly M, et al. (2010). The Genome Analysis Toolkit: a MapReduce framework for analyzing next-generation DNA sequencing data. Genome Res. 20, 1297–1303.
- 41. Rausch T, Zichner T, Schlattl A, Stutz AM, Benes V, and Korbel JO (2012). DELLY: structural variant discovery by integrated paired-end and split-read analysis. Bioinformatics 28, i333–i339.
- 42. Zhou X, and Stephens M (2012). Genome-wide efficient mixed-model analysis for association studies. Nat. Genet. 44, 821–824.
- 43. Perry M, Kinoshita M, Saldi G, Huo L, Arikawa K, and Desplan C (2016). Molecular logic behind the three-way stochastic choices that expand butterfly colour vision. Nature 535, 280–284.
- 44. Bassett A, and Liu JL (2014). CRISPR/Cas9 mediated genome engineering in Drosophila. Methods 69, 128–136.
- 45. Kronforst MR, Hansen ME, Crawford NG, Gallant JR, Zhang W, Kulathinal RJ, Kapan DD, and Mullen SP (2013). Hybridization reveals the evolving genomic architecture of speciation. Cell Rep. 5, 666–677.
- 46. Martin SH, Dasmahapatra KK, Nadeau NJ, Salazar C, Walters JR, Simpson F, Blaxter M, Manica A, Mallet J, and Jiggins CD (2013). Genome-wide evidence for speciation with gene flow in Heliconius butterflies. Genome Res. 23, 1817–1828.
- 47. Langmead B, and Salzberg SL (2012). Fast gapped-read alignment with Bowtie 2. Nat. Methods 9, 357–359.
- 48. DePristo MA, Banks E, Poplin R, Garimella KV, Maguire JR, Hartl C, Philippakis AA, del Angel G, Rivas MA, Hanna M, et al. (2011). A framework for variation discovery and genotyping using next-generation DNA sequencing data. Nat. Genet. 43, 491–498.
- 49. Stamatakis A (2006). RAxML-VI-HPC: maximum likelihood-based phylogenetic analyses with thousands of taxa and mixed models. Bioinformatics 22, 2688–2690.
- 50. Letunic I, and Bork P (2016). Interactive tree of life (iTOL) v3: an online tool for the display and annotation of phylogenetic and other trees. Nucleic Acids Res. 44, W242–245.
- 51. Green RE, Krause J, Briggs AW, Maricic T, Stenzel U, Kircher M, Patterson N, Li H, Zhai W, Fritz MH-Y, et al. (2010). A draft sequence of the Neandertal genome. Science 328, 710–722.
- 52. Martin SH, Davey JW, and Jiggins CD (2015). Evaluating the use of ABBA-BABA statistics to locate introgressed loci. Mol. Biol. Evol. 32, 244–257.
- 53. Smith J, and Kronforst MR (2013). Do Heliconius butterfly species exchange mimicry alleles? Biol. Lett. 9, 20130503.
- 54. Jay P, Whibley A, Frezal L, Rodriguez de Cara MA, Nowell RW, Mallet J, Dasmahapatra KK, and Joron M (2018). Supergene evolution triggered by the introgression of a chromosomal inversion. Curr. Biol. 28, 1839–1845.
- 55. Gupta S, Stamatoyannopoulos JA, Bailey TL, and Noble WS (2007). Quantifying similarity between motifs. Genome Biol. 8, R24.
- 56. Shazman S, Lee H, Socol Y, Mann RS, and Honig B (2014). OnTheFly: a database of Drosophila melanogaster transcription factors and their binding sites. Nucleic Acids Res. 42, D167–171.
- 57. Zhu LJ, Christensen RG, Kazemian M, Hull CJ, Enuameh MS, Basciotta MD, Brasefield JA, Zhu C, Asriyan Y, Lapointe DS, et al. (2011). FlyFactorSurvey: a database of Drosophila transcription factor binding specificities determined using the bacterial one-hybrid system. Nucleic Acids Res. 39, D111–117.
- 58. Berg OG, and von Hippel PH (1988). Selection of DNA binding sites by regulatory proteins. II. The binding specificity of cyclic AMP receptor protein to recognition sites. J. Mol. Biol. 200, 709–723.