Abstract
Neo-sex chromosomes are found in many taxa, but the forces driving their emergence and spread are poorly understood. The female-specific neo-W chromosome of the African monarch (or queen) butterfly Danaus chrysippus presents an intriguing case study because it is restricted to a single ‘contact zone’ population, involves a putative colour patterning supergene, and co-occurs with infection by the male-killing endosymbiont Spiroplasma. We investigated the origin and evolution of this system using whole genome sequencing. We first identify the ‘BC supergene’, a broad region of suppressed recombination across nearly half a chromosome, which links two colour patterning loci. Association analysis suggests that the genes yellow and arrow in this region control the forewing colour pattern differences between D. chrysippus subspecies. We then show that the same chromosome has recently formed a neo-W that has spread through the contact zone within approximately 2,200 years. We also assembled the genome of the male-killing Spiroplasma, and find that it shows perfect genealogical congruence with the neo-W, suggesting that the neo-W has hitchhiked to high frequency as the male-killer has spread through the population. The complete absence of female crossing-over in the Lepidoptera causes whole-chromosome hitchhiking of a single neo-W haplotype, carrying a single allele of the BC supergene and dragging multiple non-synonymous mutations to high frequency. This has created a population of infected females that all carry the same recessive colour patterning allele, making the phenotypes of each successive generation highly dependent on uninfected male immigrants. Our findings show how hitchhiking can occur between the physically unlinked genomes of host and endosymbiont, with dramatic consequences.
A chromosome carrying a colour patterning supergene has spread rapidly through a population of African monarch butterflies (Danaus chrysippus) by hitchhiking with a male-killing endosymbiont, Spiroplasma, showing how hitchhiking can occur between the unlinked genomes of host and endosymbiont, with dramatic consequences.
Introduction
Structural changes to the genome play an important role in evolution by altering the extent of recombination among loci. This is best studied in the context of chromosomal inversions that cause localised recombination suppression, and can be favoured by selection if they help to maintain clusters of co-adapted alleles (or ‘supergenes’) in the face of genetic mixing [1–4]. A greater extent of recombination suppression occurs in the formation of heteromorphic sex chromosomes, which can link sex-specific alleles similarly to supergenes [5]. However, suppressed recombination can also have costs. In particular, male-specific Y and female-specific W chromosomes can be entirely devoid of recombination, making them vulnerable to genetic hitchhiking and the accumulation of deleterious mutations through ‘Muller’s ratchet’, which may explain their deterioration over time [6–8]. These contrasting benefits and costs of recombination suppression are of particular interest in the evolution of neo-sex chromosomes, which can form through fusion of autosomes to existing sex chromosomes. There is accumulating evidence that neo-sex chromosomes are common in animals [9–15], but the processes underlying their emergence, spread, and subsequent evolution have not been widely studied. In particular, there are few studied examples of recently formed neo-sex chromosomes that are not yet fixed in a species.
The African monarch (or queen) butterfly Danaus chrysippus provides a unique test case for the causes and consequences of changes in genome architecture and recombination suppression. Like its American cousin (D. plexippus), it feeds on milkweeds and has bright colour patterns that warn predators of its distastefulness. However, within Africa, D. chrysippus is divided into four subspecies with distinct colour patterns and largely distinct ranges (Fig 1A). Predator learning should favour the maintenance of a single, monomorphic warning pattern in any given area. For this reason, researchers have long been puzzled by the large polymorphic contact zone in East and Central Africa, where all four D. chrysippus subspecies meet and interbreed [16–18] (Fig 1A). Crosses have shown that colour pattern differences between the subspecies are controlled by Mendelian autosomal loci, including the tightly linked ‘B’ and ‘C’ loci (putatively a ‘BC supergene’ [19]) that define three common forewing patterns [20,21] (Fig 1A). However, crosses with females from the contact zone revealed that the BC chromosome has become sex linked, forming a neo-W that is unique to this population [19,22]. Because female meiosis is achiasmatic (it lacks crossing-over) in the Lepidoptera, the formation of a neo-W would instantaneously cause perfect linkage, not just of the B and C loci but of an entire non-recombining chromosome, along with other maternally inherited DNA.
What is particularly striking is that the presence of the neo-W coincides with infection by a maternally inherited ‘male-killer’ endosymbiont related to Spiroplasma ixodetis, which kills male offspring and leads to highly female-biased sex ratios where infection is common [22–24]. The combination of neo-W and male-killing is expected to dramatically alter the inheritance and evolution of the BC chromosome [22,25]: Infected females typically give rise to all-female broods who should always inherit the same colour patterning allele on their neo-W, along with the male-killer, while the other maternal allele is systematically eliminated in the dead sons (Fig 1B), forming a genetic sink for all colour pattern alleles not on the neo-W. It has been suggested that the restriction of male-killing to females with the neo-W, and only in the region in which hybridisation occurs between subspecies, may not be a coincidence [19,22,25–27]. However, the genomic underpinnings of this system—the genetic controllers of colour pattern, the source and spread of the neo-W, and its relationship with the male-killer—have until now remained a mystery. We generated a reference genome for D. chrysippus and used whole genome sequencing of population samples to uncover the interconnected evolution of the BC supergene, neo-W, and Spiroplasma. Our findings reveal a recent whole-chromosome selective sweep caused by hitchhiking between the host and endosymbiont genomes.
Results and discussion
Identification of the BC supergene
We assembled a high-quality draft genome for D. chrysippus, with a total length of 322 megabases (Mb), a scaffold N50 length of 0.63 Mb, and a BUSCO [28] completeness score of 94% (S1–S8 Tables). We then further scaffolded the genome into a pseudo-chromosomal assembly based on homology with the Heliconius melpomene genome [29–31], accounting for known fusions that differentiate these species [9,30,32] (S1 Fig). We also resequenced 42 individuals representing monomorphic populations of each of the four subspecies and a polymorphic population from a known male-killing hotspot near Nairobi, in the contact zone (Fig 1A, S9 Table).
To identify the putative BC supergene, we scanned for genomic regions showing high differentiation between the subspecies and an association with colour pattern. Elevated genetic differentiation (FST) and absolute divergence (dXY) are largely restricted to a handful of broad peaks, with a background FST of approximately zero (Fig 2A, S2 Fig, and S3 Fig). This low background level implies a nearly panmictic population across the continent. The effective population size appears to be very large, as average genome-wide diversity at putatively neutral 4-fold degenerate third codon positions is 0.042, which is among the highest values reported for animals [33,34]. The islands of differentiation that stand out from this background imply selection for local adaptation maintaining particular differences between the subspecies, similar to patterns seen between geographic races of Heliconius butterflies [35]. However, here the peaks of differentiation are broad, covering several Mb, implying some mechanism of recombination suppression such as inversions that differentiate the subspecies.
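The link between neutral diversity and effective population size can be illustrated with the standard neutral expectation for a diploid population, π = 4Neμ. The sketch below is not an analysis from this study: the mutation rate is an assumption (the per-site, per-generation rate that has been reported for Heliconius melpomene), and the function name is hypothetical.

```python
# Minimal sketch: effective population size implied by neutral diversity,
# using the standard neutral expectation pi = 4 * Ne * mu for diploids.

def effective_population_size(pi: float, mu: float) -> float:
    """Return Ne implied by nucleotide diversity pi and mutation rate mu."""
    return pi / (4.0 * mu)

PI_4FOLD = 0.042     # diversity at 4-fold degenerate sites, from the text
ASSUMED_MU = 2.9e-9  # ASSUMPTION: rate reported for Heliconius melpomene,
                     # not estimated for D. chrysippus in this study

ne = effective_population_size(PI_4FOLD, ASSUMED_MU)
print(f"Implied Ne: {ne:.2e}")
```

Under this assumed mutation rate, the implied effective population size is in the millions, the same order of magnitude as the effective population size setting used for phasing in the Methods.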
The inclusion of the polymorphic contact-zone samples, and the fact that three of the subspecies each carry a unique colour pattern allele (Fig 1A), allowed us to identify particular differentiated regions associated with the three major colour pattern traits. A region of approximately 3 Mb on Chromosome 4 is associated with the white hindwing patch (A locus) and a region of approximately 5 Mb on Chromosome 15 (hereafter chr15) is associated with both background orange/brown (B locus) and the forewing black tip (C locus) (Fig 2A and S2 Fig). Below, we refer to this region on chr15, which spans over 200 protein-coding genes, as the BC supergene [19], although we note that additional associated SNPs on Chromosome 22 suggest that background wing melanism may also be influenced by other loci.
Clustering analysis based on genetic distances reveals three clearly distinct alleles at the BC supergene (Fig 2C). This further supports the hypothesis of recombination suppression, although a number of individuals show mosaic ancestry consistent with occasional recombination (S4 Fig). The three main alleles correspond to the three common forewing phenotypes, so we term these BCchrysippus (orange background with black forewing tip, formerly bbcc), BCdorippus (orange without black tip, formerly bbCC), and BCorientis (brown background with black forewing tip, formerly BBcc) (Fig 2C). Fifteen of the twenty contact zone individuals are heterozygous, carrying two distinct BC alleles, and a few carry putative recombinant alleles, as do some of the southern African form orientis individuals (S4 Fig). As shown previously, BCdorippus (which includes the dominant C allele) and BCorientis (which includes the dominant B allele) are both dominant over the recessive BCchrysippus (Fig 2C and S4 Fig).
Although it can be challenging to identify particular functional mutations in regions of suppressed recombination, the presence of some recombinant individuals allowed us to narrow down candidate regions for the B and C loci. A cluster of SNPs most strongly associated with background colour (B locus) is found just upstream of the gene yellow, and a phylogenetic network for a 30-kb region around yellow groups individuals nearly perfectly by phenotype, although some individuals classed as heterozygous were intermingled with homozygotes (S5 Fig). In Drosophila, Yellow expression is associated with variation in melanism [37], and in some butterflies, yellow knockouts show reduced melanin pigmentation [38], making this a compelling candidate for the B locus. The strongest associations with forewing tip (C locus) occur at the gene arrow, and a phylogenetic network for a 100-kb region around this gene similarly clusters individuals by phenotype (S5 Fig). In Drosophila, Arrow is essential for Wnt signalling in wing development [39]. Wnt signalling is known to underlie variation in colour pattern in Heliconius butterflies [40], and knockout mutants for the Wnt ligand gene WntA in D. plexippus show a loss of pigmentation [41]. This makes arrow a promising candidate for the C locus. While these genes represent our best candidates, numerous strongly associated SNPs occurred closer to other genes in this region (S10 Table). Future studies will aim to narrow down and validate these associations.
Irrespective of their precise mode of action, the patterns of association imply that the B and C loci are approximately 1.6 Mb apart (S5 Fig) and would therefore be fairly loosely linked under normal recombination. This physical distance translates to around 7.6 cM, assuming crossover rates similar to those in Heliconius [31,42], whereas the estimated recombination distance between B and C based on crosses is 1.9 cM [43]. Theory predicts that recombination suppression can be favoured if it maintains linkage disequilibrium (LD) between co-adapted alleles in the face of gene flow [1–4]. Our study is one of only a few cases in which it can be shown that alleles at distinct loci that each influence a component of a complex trait are maintained in LD by suppressed recombination [44,45].
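The comparison of map distances above can be made explicit with a few lines of arithmetic. This sketch uses only the three values quoted in the text; the per-Mb rate is simply the one implied by those values, not an independently sourced estimate.

```python
# Minimal sketch of the arithmetic behind the map-distance comparison.
PHYSICAL_DISTANCE_MB = 1.6  # distance between the B and C loci (text)
EXPECTED_CM = 7.6           # expected under Heliconius-like crossover rates (text)
OBSERVED_CM = 1.9           # estimated from crosses (text)

# Crossover rate implied by the two quoted figures (cM per Mb).
implied_rate_cm_per_mb = EXPECTED_CM / PHYSICAL_DISTANCE_MB

# Observed recombination as a fraction of the expectation.
observed_fraction = OBSERVED_CM / EXPECTED_CM

print(f"Implied rate: {implied_rate_cm_per_mb:.2f} cM/Mb")
print(f"Observed/expected: {observed_fraction:.2f}")
```

The observed recombination distance is therefore about a quarter of what the physical distance alone would predict, consistent with suppressed recombination between the two loci.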
It is likely that chromosomal rearrangements contribute to recombination suppression at the BC supergene. Although our short-read data do not allow us to test directly for inversions, they do reveal dramatic variation in sequencing coverage over the proximal end of the chromosome. Comparison of coverage among individuals suggests a large (approximately 5 Mb) polymorphic insertion in this region that tends to occur in individuals carrying the BCdorippus allele (S6 Fig). Synteny comparison with H. melpomene reveals that this insertion involves an expansion in copy number of a region of several hundred kb. Comparison of copy numbers for two of the genes in the expansion with several other species confirms that it is derived in D. chrysippus (S7 Fig). The expansion appears to occur just a few kb from the coding region of arrow (S6 Fig), and is also perfectly associated with the presence of the dominant dorippus phenotype (absence of black forewing tip) (S7 Fig). It is possible that it has a causal effect on the phenotype by influencing the expression of arrow, but it might simply be linked to the causative mutation. Either way, we suggest that this large structural change, which increases the length of the chromosome by nearly a third, contributes to recombination suppression between the BCdorippus allele and other supergene alleles by interfering with chromosome pairing in heterozygotes.
A neo-W chromosome traps a single haplotype of chr15 in contact zone females
Previous crossing experiments indicated that the BC chromosome has become sex linked in contact zone females [22]. To confirm this hypothesis using genetic tools, we created a ‘cured line’ by treating a female from an all-female brood with tetracycline to eliminate Spiroplasma and allow the survival of male offspring [23]. A cross using this female confirms perfect sex-linkage of forewing phenotype (n = 22, chi-squared test p = 0.00002; S8 Fig). We then used PCR assays on a subsequent sibling cross from the cured line to confirm that maternal alleles for chr15 segregate with sex (n = 22, p < 0.00003), whereas paternal alleles segregate randomly (n = 22, p = 0.36; S8 Fig). These results exactly match the model (Fig 1B) in which the BC supergene has become linked to the W chromosome in females but continues to segregate as an autosome in males.
Although we were unable to definitively identify any scaffolds from the ancestral W chromosome, which is likely to be highly repetitive, we can test whether chr15 shows the expected hallmarks of a young neo-W, hypothesised to have formed through fusion to the ancestral W [22]. Due to the complete absence of recombination in females, we expect that a single fused haplotype of chr15 would be spreading in the population. Any unique mutations specific to this haplotype should therefore occur at high frequency in females and be absent in males. We scanned for such high-frequency female-specific mutations and found them to be abundant across the entire length of chr15 and nearly absent throughout the rest of the genome (Fig 3A). At the individual level, we can clearly identify 15 females (14 collected in the contact zone and the single ‘cured line’ female) that consistently share these high-frequency mutations (S9 Fig). Genetic distance among these females in the colinear region of chr15 (outside the BC supergene) is reduced, indicating that they all share a similar haplotype of the fused chromosome (Fig 3B).
The neo-W formed recently and spread rapidly
Genetic variation accumulated in the neo-W lineage since its formation can tell us about its age. Sequence divergence between the neo-W and autosomal copies of chr15 (inferred from the density of heterozygous sites in the colinear region of chr15 in females carrying the neo-W) is not significantly different from that between the autosomal copies in ‘wild-type’ individuals that lack the fusion (Fig 3C, Wilcoxon signed rank test, p = 0.36, n = 48 windows of 100 kb each). This implies that insufficient time has passed since the fusion event for significant accumulation of new mutations. The limited divergence of the neo-W haplotype from the autosomal copy of chr15 in each female makes it challenging to isolate. Nonetheless, by using diagnostic mutations that are unique to and fixed in the neo-W lineage, we were able to identify sequencing reads from the shared haplotype and reconstruct a partial neo-W sequence for each female (S10 Fig). A dated genealogy based on these sequences places the root of the neo-W lineage at approximately 2,200 years (26,400 generations) ago (posterior mean = 2,201, SD = 318).
The neo-W is present in all but one of the contact zone females, implying a rapid spread since its formation. This process is similar to a selective sweep of a beneficial mutation, except that complete recombination suppression in females means that the sweep affects the entire chromosome equally. Unlike a conventional sweep, it is not expected to eliminate genetic diversity from the population, as these females will also carry an autosomal copy of chr15 inherited from their father (Fig 1B). Indeed, we see a 20% reduction in overall nucleotide diversity (π) on chr15 in females of the neo-W lineage (Fig 3D). However, when we consider only the neo-W haplotype in each of these females, we see a nearly complete absence of genetic variation, with a π of 0.00007, more than two orders of magnitude lower than for autosomal copies of chr15 (0.0228) (Fig 3D). These results further support a very recent and rapid spread of the neo-W.
The neo-W haplotype carries the recessive BCchrysippus allele at the BC supergene (S4 Fig). However, previous work [22] shows that at the focal sampling site in the contact zone, most males are immigrants homozygous for the dominant BCdorippus allele, and the vast majority of females (84%) are heterozygous BCdorippus/BCchrysippus, as expected if most inherit BCdorippus from their father and BCchrysippus (on the neo-W) from their mother. The dominant dorippus phenotype is therefore by far the most abundant in this population. Because aposematic colouration should be under positive frequency dependent selection, it is highly unlikely that the spread of the neo-W can be explained by selection on colour pattern, highlighting the question of what else might have driven its spread.
Hitchhiking between the neo-W and Spiroplasma
We hypothesised that the neo-W has spread as a result of co-inheritance with the male-killing Spiroplasma, which is itself spreading through the population as a selfish element. Experiments have suggested that all-female broods have enhanced survival relative to females from broods that include males, possibly due to reduced competition for resources [46], although other factors such as improved immunity [47] have not been tested. A similar boost to the relative fitness of infected females is thought to have driven the rapid spread of a male-killing Wolbachia in the butterfly Hypolimnas bolina, which has occurred over a similar timescale to that reported here [48]. For Spiroplasma to drive the spread of the neo-W, it would also need to be strictly vertically inherited down the female line, such that it is always co-inherited with the neo-W.
We identified nine scaffolds making up the 1.75-Mb Spiroplasma genome in our D. chrysippus assembly (S11 Fig). Infected individuals are clearly identifiable by mapping resequencing reads to the Spiroplasma scaffolds (S11 Fig), and this was confirmed by PCR. As predicted, all females in the neo-W lineage are infected (with the exception of the cured line female, in which Spiroplasma had been eliminated). Moreover, all infected females fall into the same mitochondrial clade (Fig 4A), consistent with matrilineal inheritance. To confirm that the Spiroplasma is strictly vertically inherited and always associated with a single female lineage, we used PCR assays for Spiroplasma and mitochondrial haplotype and expanded our sample size to 158 individuals, including samples used in previous studies going back two decades [19,23] (S12 Table and S12 Fig). This confirms the perfect association: 100% of infected individuals (n = 42) carry the same mitochondrial haplotype, and this haplotype is otherwise rare, occurring in 8% of uninfected individuals (n = 116) (S12 Fig).
Like the neo-W, the Spiroplasma genomes carry limited variation among individuals (π = 0.0005), consistent with a single and recent outbreak of the endosymbiont. Although the lack of variation makes it challenging to infer genealogies, our inferred maximum likelihood genealogies for the neo-W and Spiroplasma are strikingly congruent (Fig 4B). The low bootstrap support for multiple nodes is unsurprising, given that these sequences descend from a recent common ancestor, such that most nodes will be defined by only a few informative sites. This does not weaken the support for congruence, however, as the probability of two incorrectly inferred topologies matching by chance is infinitesimally small. In a permutation test for congruence between the two distance matrices [49], the observed level of congruence exceeds all 100,000 random permutations. There is therefore strong support for co-inheritance of the neo-W and Spiroplasma [50].
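The logic of a permutation test for congruence between two distance matrices can be sketched as follows. This is a generic Mantel-style test written for illustration and may differ from the specific method of [49]; the function names and the toy coordinates are hypothetical. Individuals are relabelled at random in one matrix, and the observed correlation between matrices is compared with the permuted values.

```python
import random
from itertools import combinations

def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5

def upper_triangle(mat, order):
    """Pairwise distances for all pairs, with individuals taken in `order`."""
    return [mat[order[i]][order[j]]
            for i, j in combinations(range(len(order)), 2)]

def mantel_test(d1, d2, n_perm=999, seed=0):
    """Return (observed correlation, one-sided permutation p-value)."""
    rng = random.Random(seed)
    ident = list(range(len(d1)))
    x = upper_triangle(d1, ident)
    observed = pearson(x, upper_triangle(d2, ident))
    n_ge = 0
    for _ in range(n_perm):
        perm = ident[:]
        rng.shuffle(perm)  # relabel individuals in the second matrix
        if pearson(x, upper_triangle(d2, perm)) >= observed:
            n_ge += 1
    return observed, (n_ge + 1) / (n_perm + 1)

# Toy data (hypothetical): two nearly congruent sets of positions.
a = [0, 1, 3, 6, 10, 15]
b = [0, 1.2, 2.9, 6.3, 9.8, 15.1]
dist_a = [[abs(p - q) for q in a] for p in a]
dist_b = [[abs(p - q) for q in b] for p in b]
r_obs, p_value = mantel_test(dist_a, dist_b)
print(r_obs, p_value)
```

With a finite number of permutations, the smallest attainable p-value is 1/(n_perm + 1), which is why a large number of permutations is needed to demonstrate very strong congruence.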
The combined spread of three physically unlinked DNA molecules (the mitochondrial genome, neo-W, and Spiroplasma genome) constitutes a form of genetic hitchhiking, but is facilitated by their strict matrilineal inheritance rather than physical linkage. We cannot entirely rule out the possibility that the neo-W is contributing to this spread, or even driving it entirely, through direct selection or meiotic drive. In theory, this is testable by examining broods that carry the neo-W but lack Spiroplasma: if the neo-W itself were driving, these broods should comprise more females than males despite the absence of the male-killer. We raised 11 such broods in our cured line, and Smith [51] reported 10 natural broods that showed sex-linked colour pattern and no male-killing. Across these 21 broods, totalling 528 adult offspring, 51% were female. This does not differ significantly from the null expectation of 50% (binomial test p = 0.7). However, we note that detecting meiotic drive causing a 1% female bias with good power would require a far larger sample size (>15,000). Importantly, the few natural broods that have been found to show sex-linked colour pattern without male-killing have only been reported from regions in which Spiroplasma infection is present, implying that these broods result from occasional failed transmission of the endosymbiont [23]. Despite this potential for the neo-W to become decoupled from the male-killer, it has not spread beyond these regions, further supporting the hypothesis that hitchhiking with the male-killer underlies its rapid spread. Selfish elements have been shown to drive hitchhiking of the mitochondrial genome or a portion of a chromosome through a population and even across species boundaries [52–54]. Our findings show how an entire chromosome can be captured in the same way. Hitchhiking may therefore be of general importance in driving the spread of neo-sex chromosomes.
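The sex-ratio test and the sample-size argument above can be illustrated with a short calculation. This is a sketch under stated assumptions rather than the analysis performed in the study: the count of 269 females is our rounding of 51% of 528, "good power" is taken to mean 80% power at a 5% significance level, a 1% bias is read as 51% females, and the sample size uses a standard normal approximation. Function names are hypothetical.

```python
from math import ceil, comb
from statistics import NormalDist

def binomial_two_sided_p(k: int, n: int) -> float:
    """Exact two-sided binomial test against a null proportion of 0.5."""
    k_hi = max(k, n - k)                 # the null is symmetric around n/2
    tail = sum(comb(n, i) for i in range(k_hi, n + 1))
    return min(1.0, 2 * tail / 2 ** n)

def required_n(p1, p0=0.5, alpha=0.05, power=0.8, two_sided=True):
    """Normal-approximation sample size to detect proportion p1 against p0."""
    z = NormalDist().inv_cdf
    z_alpha = z(1 - alpha / 2) if two_sided else z(1 - alpha)
    z_beta = z(power)
    num = z_alpha * (p0 * (1 - p0)) ** 0.5 + z_beta * (p1 * (1 - p1)) ** 0.5
    return ceil((num / (p1 - p0)) ** 2)

N_OFFSPRING = 528   # from the text
N_FEMALE = 269      # ASSUMPTION: 51% of 528, rounded to a whole count

p_obs = binomial_two_sided_p(N_FEMALE, N_OFFSPRING)
n_two_sided = required_n(0.51)                   # two-sided test
n_one_sided = required_n(0.51, two_sided=False)  # one-sided test
print(p_obs, n_two_sided, n_one_sided)
```

Under these assumptions, both the one-sided and two-sided sample sizes come out above 15,000, far larger than the 528 offspring available.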
In D. chrysippus, it is currently unclear whether the neo-W or male-killer emerged first. It is also unclear whether their co-occurrence in a single ancestor was simply a coincidence or instead reflects some functional connection, such as the suggestion that the neo-W might confer susceptibility to the male-killer [22]. It is important to note that this is not the first time a neo-sex chromosome has formed in this lineage. A fusion of Chromosome 21 to the ancestral Z chromosome occurred in an ancestor of all Danaus species, producing a neo-Z [9,32,55]. It is speculated that a complementary fusion of Chromosome 21 to the ancestral W also occurred [9,55], but this is difficult to conclusively verify because of degradation of the W chromosome over longer timescales. If this hypothesis of an ancient neo-W is correct, then the neo-W we describe (W-chr15) might in fact be better described as a neo-neo-W (W-chr21-chr15). It is possible that the spread of the original W-chr21 was also driven by hitchhiking with a selfish endosymbiont.
Genetic and phenotypic consequences of recombination suppression
Sex chromosome evolution in many other taxa involves the progressive spread of recombination suppression outward from the sex-determining locus [56]. By contrast, the absence of crossing over in female meiosis means that a lepidopteran neo-W experiences complete and immediate recombination suppression over its entire length. Butterfly W chromosomes are therefore thought to be highly degenerated and repetitive, and to our knowledge none have been successfully assembled to date. The young age of the D. chrysippus neo-W therefore provides a rare opportunity to study the early evolutionary consequences of recombination suppression across an entire chromosome. Two related processes could shape its evolution: hitchhiking of preexisting deleterious mutations that were initially rare in the population [6], and accumulation of novel deleterious mutations due to reduced purging through recombination and selection (i.e., Muller’s Ratchet) [7].
As a proxy for the ‘genetic load’ of deleterious mutations in the population, we considered Pn/Ps, the normalised ratio of non-synonymous to synonymous polymorphisms. Because of purifying selection, non-synonymous polymorphisms are typically rare, and where they do occur, the mutant allele typically occurs at low frequency in the population [57]. When considering all polymorphisms in the neo-W lineage, Pn/Ps for chr15 (excluding the BC supergene, to avoid bias) is very slightly (approximately 5%) higher than for other autosomes (S13 Fig). Of 1,000 bootstrap replicates, 916 reproduced this bias, corresponding to a p-value of 0.084. However, when we partition polymorphisms by allele frequency, we see that chr15 carries a large excess of non-synonymous polymorphisms in the highest frequency class (i.e., minor allele at 50%), with a Pn/Ps ratio >3 times larger than on other autosomes (S13 Fig). This holds across all 1,000 bootstrap replicates (i.e., p < 0.001). A change in the frequency distribution of non-synonymous variants, without a significant change in their abundance, is best explained by hitchhiking of preexisting mildly deleterious alleles that were initially rare in the population but were inadvertently carried to high frequency along with the neo-W haplotype, and are therefore now found in all females in this lineage. In fact, Pn/Ps for high-frequency polymorphisms on chr15 is somewhat higher than would be expected through hitchhiking alone based on comparison with singleton mutations on other autosomes (p = 0.044). This suggests that accumulation of additional mildly deleterious alleles on the neo-W might have occurred early during its spread through the population.
At the phenotypic level, perhaps counterintuitively, the spread of a single supergene allele on the neo-W has not caused homogenisation of warning pattern among contact zone females and might in fact have the opposite effect. In locations where the neo-W and Spiroplasma are nearly fixed, such as our sampling site near Nairobi, the high incidence of male-killing implies that the population is strongly shaped by immigrant males. Because the BCchrysippus allele on the neo-W is universally recessive, daughters will tend to match the phenotype of their immigrant father. However, because the neo-W is always transmitted to daughters, the paternal chr15 copy will be lost to male-killing after one generation, creating a genetic sink for immigrant male genes [22] (S14 Fig). This combination of processes results in a female population that is highly sensitive to the source of immigrants, which is known to fluctuate seasonally with monsoon winds [16,58] (S14 Fig). This model leads to the testable prediction that seasonal fluctuations in female phenotypes should be most dramatic where male-killing is most abundant.
Future evolutionary trajectories
The future of the neo-W and Spiroplasma outbreak is uncertain. A lack of males could lead to local extinctions [27], but extinction of the entire infected lineage is unlikely given the high dispersal ability and seasonal influxes of males in the contact zone. Indeed, it is notable that Spiroplasma infection has only been recorded within the contact zone population (with the exception of a single South African brood reported here, S12 Table), especially given theory showing that male-killers should spread very rapidly across the geographical range of a panmictic population if they provide even a very weak selective advantage [48]. Future work will investigate whether its spread might be curtailed by environmental factors, for example if oviposition behaviour or host plant availability only leads to sibling competition (and consequent benefits for all-female broods) under certain conditions [46]. An alternative and non–mutually exclusive hypothesis is that dispersal rates of infected females are strongly reduced. In other systems, sex-ratio distortion has driven adaptive responses by the host, including changes to the mating system [59] and the evolution of resistance to male-killing [60,61]. The absence of evidence for these phenomena in D. chrysippus might simply reflect the recency of the male-killing outbreak. Eventually, we also expect the non-recombining neo-W to begin to degenerate through further hitchhiking, gene loss, and the spread of repetitive elements [8,56]. This young system provides a rare opportunity to study how these phenomena unfold through time and space.
Methods
Ethics statement
Butterfly collection was performed under permit where relevant: NACOSTI/P15/3290/3607, NACOSTI/P15/2403/3602 (National Commission for Science and Technology, Kenya), MINEDUC/S&T/459/2017 (Ministry of Education, Rwanda), EMDEP006/17 (Environmental Management Division, St Helena Government); and always with permission of the land owner and/or local authorities. We also worked with local researchers wherever possible, including authors DJM, KSO, SC, and IJG and with the Lepidopterists Society of Africa.
Reference genome sequencing, assembly, and annotation
Detailed methods for generation of the D. chrysippus reference genome are provided in S1 Text. Briefly, a draft assembly was generated using SPAdes [62] from a combination of paired-end and mate-pair libraries of various insert sizes. Scaffolding and resolution of haplotypes were performed using Redundans [63] and HaploMerger2 [64]. The assembly was annotated using a combination of de novo gene predictors, yielding 16,654 protein-coding genes. Mitochondrial genomes were assembled using NOVOPlasty [65].
Although we currently lack linkage information for further scaffolding, we generated a pseudo-chromosomal assembly based on homology with the highly contiguous H. melpomene genome [30,31,66], adjusted for known karyotypic differences [9,30–32,55]. Although these genomes are diverged by approximately 90 million years, this homology-based approach has been shown previously to be successful for reconstructing chromosomes in a fragmented D. plexippus genome [9]. In total, 282 Mb (87% of the genome) could be confidently assigned to chromosomes (S1 Fig).
Scaffolds representing the Spiroplasma genome were identified based on read depth of remapped reads (S11 Fig) and homology to other available Spiroplasma genomes. Annotation was performed using the RAST server pipeline [67,68].
Population sample resequencing and genotyping
This study made use of 42 newly sequenced D. chrysippus individuals, as well as previously sequenced individuals of the sister species, D. petilia (n = 1) and the next closest outgroup, D. gilippus (n = 2) [69] (S9 Table). Details of DNA extraction, sequencing, and genotyping are provided in S1 Text. Briefly, DNA was extracted from thorax tissue and sequenced (paired-end, 150 bp) to a mean depth of coverage of 20× or greater. Reads were mapped to the D. chrysippus reference assembly using Stampy [70] v1.0.31, and genotyping was performed using GATK version 3 [71,72]. Genotype calls were required to have an individual read depth ≥8, and heterozygous and alternate allele calls were further required to have an individual genotype quality (GQ) ≥20 for downstream analyses.
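The per-genotype filters described above can be expressed as a simple rule. The sketch below is illustrative only and is not the pipeline used in the study; the function name and the representation of a genotype as a tuple of allele indices (0 for the reference allele) are assumptions.

```python
MIN_DEPTH = 8  # minimum individual read depth (text)
MIN_GQ = 20    # minimum genotype quality for non-reference calls (text)

def keep_genotype(alleles, depth, gq):
    """Return True if a genotype call passes the depth and GQ filters.

    alleles: tuple of allele indices, e.g. (0, 1); None if the call is missing.
    """
    if alleles is None or depth < MIN_DEPTH:
        return False
    is_hom_ref = all(a == 0 for a in alleles)
    # Only heterozygous and alternate-allele calls need to pass the GQ filter.
    if not is_hom_ref and gq < MIN_GQ:
        return False
    return True

print(keep_genotype((0, 1), depth=12, gq=35))  # heterozygote, well supported
print(keep_genotype((0, 0), depth=9, gq=10))   # hom-ref, GQ filter not applied
```

Applying the GQ threshold only to non-reference calls retains more homozygous reference genotypes while still guarding against poorly supported variant calls.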
Genomic differentiation and associations with wing pattern
We used the fixation index (FST) and absolute divergence (dXY) to examine genetic differentiation across the genome among the three subspecies for which we had six or more individuals sequenced. FST and dXY were computed using the script popgenWindows.py (github.com/simonhmartin/genomics_general release 0.2) with a sliding window of 100 kb, stepping in increments of 20 kb. Windows with fewer than 20,000 genotyped sites after filtering (see above) were ignored.
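As a rough illustration of the windowing scheme, the sketch below generates 100-kb windows in 20-kb steps, skips windows with fewer than 20,000 genotyped sites, and averages per-site dXY for biallelic sites. It is not the popgenWindows.py implementation: the function names, the use of alternate-allele frequencies as input, and the truncation of windows at the chromosome end are assumptions.

```python
from bisect import bisect_left, bisect_right


def sliding_windows(chrom_length, size=100_000, step=20_000):
    """Yield (start, end) 1-based inclusive window coordinates."""
    start = 1
    while start <= chrom_length:
        # Assumption: the final windows are truncated at the chromosome end.
        yield start, min(start + size - 1, chrom_length)
        start += step


def dxy_site(p1, p2):
    """Per-site dXY for a biallelic site, given the alternate allele
    frequency in population 1 (p1) and population 2 (p2)."""
    return p1 * (1 - p2) + p2 * (1 - p1)


def windowed_dxy(positions, freqs1, freqs2, chrom_length,
                 size=100_000, step=20_000, min_sites=20_000):
    """Average dXY per window over all genotyped sites (variant and invariant).

    positions must be sorted; freqs1 and freqs2 are parallel lists.
    Windows with fewer than min_sites genotyped sites are skipped.
    """
    results = []
    for start, end in sliding_windows(chrom_length, size, step):
        lo = bisect_left(positions, start)
        hi = bisect_right(positions, end)
        n_sites = hi - lo
        if n_sites < min_sites:
            continue
        total = sum(dxy_site(freqs1[i], freqs2[i]) for i in range(lo, hi))
        results.append((start, end, n_sites, total / n_sites))
    return results
```

Because the step is one fifth of the window size, adjacent windows overlap by 80 kb, so neighbouring estimates are not independent.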
To identify SNPs associated with the three Mendelian colour pattern traits (i.e., the A, B, and C loci) (Fig 1A), we used PLINK v1.9 [73] with the ‘--assoc’ option and coded phenotypes quantitatively as 0 or 1, with 0.5 for assumed heterozygotes, which causes PLINK to use the Wald test for quantitative traits. In addition to the quality and depth filters above, SNPs used for this analysis were required to have genotypes for at least 40 individuals, a minor allele count of at least 2, and to be heterozygous in no more than 75% of individuals. SNPs were also thinned to a minimum distance of 100 bp.
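The SNP filters for the association analysis can be sketched as follows. The thresholds are those given in the text; the function names, the coding of genotypes as alternate-allele counts, the greedy thinning strategy, and the choice to compute the heterozygote fraction over genotyped individuals only are assumptions for illustration.

```python
def passes_snp_filters(genotypes, min_called=40, min_mac=2, max_het_frac=0.75):
    """Apply the per-SNP filters to one site.

    genotypes: alternate-allele counts per diploid individual (0, 1 or 2),
               with None for a missing call (hypothetical encoding).
    """
    called = [g for g in genotypes if g is not None]
    if not called or len(called) < min_called:
        return False
    alt_count = sum(called)
    # Minor allele count: the rarer of the two alleles among called genotypes.
    minor_allele_count = min(alt_count, 2 * len(called) - alt_count)
    if minor_allele_count < min_mac:
        return False
    # Assumption: the 75% limit is measured among genotyped individuals.
    het_frac = sum(1 for g in called if g == 1) / len(called)
    return het_frac <= max_het_frac


def thin(positions, min_dist=100):
    """Greedy thinning: keep a SNP only if it lies at least min_dist bp
    beyond the previously kept SNP."""
    kept = []
    for pos in sorted(positions):
        if not kept or pos - kept[-1] >= min_dist:
            kept.append(pos)
    return kept
```

The heterozygosity cap is a common guard against collapsed paralogous regions, in which nearly every individual appears heterozygous.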
To examine relationships among diploid individuals in specific regions of interest, we constructed phylogenetic networks using the Neighbor-Net [74] algorithm, implemented in SplitsTree [75]. Pairwise distances used for input were computed using the script distMat.py (github.com/simonhmartin/genomics_general release 0.2).
Haplotype cluster assignment
To assign haplotypes to clusters in the BC supergene region, we first phased genotypes with SHAPEIT2 [76,77], using SNPs filtered as for the association analysis above, except with a minor allele count of at least 4 and no thinning. Default parameters were used for phasing except that the effective population size was set to 3 × 10⁶. To minimise phasing switch errors, we analysed each 20-kb window separately. Cluster assignment for both haplotypes from each individual was based on average genetic distance to all haplotypes from each of three reference groups: D. c. dorippus, D. c. orientis, or D. c. alcippus (the latter is also representative of D. c. chrysippus, as they share the same alleles at the BC supergene). A haplotype was assigned to one of the three groups if its average genetic distance to members of that group was less than 80% of the average distance to the other two groups; otherwise, it was left as unassigned. Genetic distances were computed using the script popgenWindows.py (github.com/simonhmartin/genomics_general release 0.2).
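The assignment rule can be sketched for a single haplotype in a single window as below. The 80% threshold is from the text. The function name is hypothetical, and the sketch adopts one possible reading of the rule, namely that the distance to the nearest group must be below 80% of the distance to each of the other two groups; the original analysis may have applied the threshold differently.

```python
def assign_cluster(mean_dist, threshold=0.8):
    """Assign a haplotype to a reference group, or leave it unassigned.

    mean_dist: dict mapping group name to the average genetic distance
               between the focal haplotype and that group's haplotypes.
    """
    best = min(mean_dist, key=mean_dist.get)
    others = [d for group, d in mean_dist.items() if group != best]
    # Assumed interpretation: the nearest group must be closer than
    # threshold times the distance to every other group.
    if all(mean_dist[best] < threshold * d for d in others):
        return best
    return "unassigned"
```

For example, `assign_cluster({"dorippus": 0.01, "orientis": 0.02, "alcippus": 0.03})` assigns the haplotype to the dorippus group, whereas a haplotype that is nearly equidistant from two groups is left unassigned.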
Identification of neo-W–specific sequencing reads
To identify females carrying the neo-W chromosome, we visualised the distribution of female-specific derived mutations that occur at high frequency. Allele frequencies were computed using the script freq.py (github.com/simonhmartin/genomics_general release 0.2). Because of the absence of female meiotic crossing over in Lepidoptera, all females carrying the neo-W fusion should share a conserved chromosomal haplotype for the entire fused chromosome. To isolate this shared fused haplotype from the autosomal copy, we first identified diagnostic mutations as those that are present in a single copy in each member of the ‘neo-W lineage’ and absent from all other individuals and outgroups. We then isolated the sequencing read pairs from each of these females that carry the derived mutation (S10 Fig). This resulted in a patchy alignment file, with a stack of read pairs over each diagnostic mutation. Based on these aligned reads, we genotyped each individual as described above, except here setting the ploidy level to 1, and requiring a minimum read depth of 3.
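The definition of a diagnostic mutation can be expressed as a simple per-site test. In this sketch, the function name, the encoding of genotypes as derived-allele counts, and the decision to reject sites with any missing genotype are assumptions; only the criterion itself (a single copy in every neo-W female, absent elsewhere) is taken from the text.

```python
def is_diagnostic(site, neo_w_females, others):
    """Return True if a derived mutation is diagnostic of the neo-W.

    site: dict mapping sample name to derived-allele count (0, 1 or 2),
          with None or a missing key for an absent call.
    neo_w_females: samples in the neo-W lineage.
    others: all remaining individuals, including outgroups.
    """
    # Every neo-W female must carry exactly one copy (heterozygous).
    for sample in neo_w_females:
        if site.get(sample) != 1:
            return False
    # No other individual may carry the derived allele.
    # Assumption: a missing genotype disqualifies the site.
    for sample in others:
        if site.get(sample) != 0:
            return False
    return True
```

Requiring exactly one copy reflects the expectation that the mutation sits on the non-recombining neo-W copy and not on the autosomal homologue.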
Diversity and divergence of the neo-W
The lack of recombination across the neo-W makes it possible to gain insights into its age. Over time, mutations will arise that differentiate the neo-W from the recombining autosomal copies of the chromosome. We estimated this divergence based on average heterozygosity in females carrying the neo-W and compared it to heterozygosity from contact-zone individuals not carrying the neo-W. Heterozygosity was computed using the script popgenWindows.py (github.com/simonhmartin/genomics_general release 0.2), focusing only on the colinear portion of the chromosome (i.e., the distal portion from 11 Mb onwards), which is outside of the BC supergene. Heterozygosity was computed in 100-kb windows, and windows were discarded if they contained fewer than 20,000 sites genotyped in at least two individuals from each population.
A recent spread of the neo-W through the population should also be detectable in the form of strong conservation of the neo-W haplotype in all females that carry it (i.e., reduced genetic diversity). We therefore computed nucleotide diversity (π) in 100-kb windows as above. Reported values of π and heterozygosity represent the mean ± standard deviation across 100-kb windows.
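The two summary statistics used here can be sketched in a few lines. Heterozygosity is the proportion of differing sites between the two haplotypes of one individual, and π is the same quantity averaged over all pairs of haplotypes in a sample. The function names, the use of "N" for missing data, the averaging of per-pair proportions, and the use of the sample standard deviation are assumptions; the popgenWindows.py script may differ in these details.

```python
from itertools import combinations
from statistics import mean, stdev


def pairwise_diff(h1, h2, missing="N"):
    """Count differing sites and compared sites between two haplotypes."""
    compared = 0
    diffs = 0
    for a, b in zip(h1, h2):
        if a == missing or b == missing:
            continue
        compared += 1
        if a != b:
            diffs += 1
    return diffs, compared


def heterozygosity(h1, h2):
    """Proportion of sites differing between one individual's two haplotypes."""
    diffs, compared = pairwise_diff(h1, h2)
    return diffs / compared if compared else None


def nucleotide_diversity(haplotypes):
    """Mean proportion of differing sites over all pairs of haplotypes."""
    values = []
    for h1, h2 in combinations(haplotypes, 2):
        diffs, compared = pairwise_diff(h1, h2)
        if compared:
            values.append(diffs / compared)
    return mean(values) if values else None


def summarise(window_values):
    """Mean and sample standard deviation across windows (needs >= 2 windows)."""
    return mean(window_values), stdev(window_values)
```

Under this scheme a haplotype that has recently swept to high frequency yields π close to zero, whereas heterozygosity in neo-W females reflects divergence between the neo-W and the autosomal copy.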
Genealogical analyses
We produced maximum likelihood trees for the mitochondrial genome, neo-W, and Spiroplasma genome, using PhyML v3 [78] with the GTR substitution model. Given the small number of SNPs in both the neo-W and Spiroplasma genomes, regions with inconsistent coverage across individuals were excluded manually. Only sites with no missing genotypes were included.
We estimated the root node age for the neo-W using BEAST2 [79,80] version 2.5.1 with a fixed clock model and an exponential population growth prior. For all other priors we used the defaults as defined by BEAUti v2.5.1. We assumed a mutation rate of 2.9 × 10⁻⁹ per site per generation based on a direct estimate for Heliconius butterflies [81] and 12 generations per year [82]. BEAST2 was run for 500,000,000 iterations, sampling every 50,000 iterations, and we used Tracer [83] version 1.7.1 to check for convergence of posterior distributions and compute the root age after discarding a burn-in of 10%.
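The unit conversions implied by these settings can be made explicit with a short worked calculation. The mutation rate, generations per year, chain length, sampling interval, and burn-in are taken from the text; the function names are hypothetical, and the sample count ignores any initial state that the software may also log.

```python
MUTATION_RATE_PER_GENERATION = 2.9e-9   # per site per generation (from the text)
GENERATIONS_PER_YEAR = 12               # from the text


def rate_per_year():
    """Mutation rate per site per year implied by the two assumptions above."""
    return MUTATION_RATE_PER_GENERATION * GENERATIONS_PER_YEAR


def years_to_generations(years):
    """Convert an age in years to an age in generations."""
    return years * GENERATIONS_PER_YEAR


def retained_samples(chain_length, sample_every, burn_in_fraction_percent):
    """Number of posterior samples left after discarding the burn-in.

    Integer arithmetic is used throughout to avoid rounding surprises.
    """
    total = chain_length // sample_every
    discarded = total * burn_in_fraction_percent // 100
    return total - discarded
```

With these values the rate is about 3.48 × 10⁻⁸ per site per year, an age of 2,200 years corresponds to 26,400 generations, and a chain of 500,000,000 iterations sampled every 50,000 gives about 10,000 samples, of which roughly 9,000 remain after a 10% burn-in.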
We tested for congruence between the neo-W and Spiroplasma trees using PACo [49], which assesses the goodness of fit between host and parasite distance matrices, with 100,000 permutations. Distance matrices were computed using the script distMat.py (github.com/simonhmartin/genomics_general release 0.2).
Analysis of synonymous and non-synonymous polymorphism
We computed Pn/Ps as the ratio of non-synonymous polymorphisms per non-synonymous site to synonymous polymorphisms per synonymous site. Synonymous and non-synonymous sites were defined conservatively as 4-fold degenerate and 0-fold degenerate codon positions, respectively, with the requirement that the other two codon positions are invariant across the entire dataset. Only sites genotyped in all 15 females in the neo-W lineage were considered, and counts were stratified by minor allele frequency using the script sfs.py (github.com/simonhmartin/genomics_general release 0.1).
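The ratio defined above can be written out as a minimal sketch. The function names and the example counts in the note that follows are hypothetical; only the definition of Pn/Ps and the sample size of 15 neo-W copies come from the text.

```python
def pn_ps(n_nonsyn_poly, n_nonsyn_sites, n_syn_poly, n_syn_sites):
    """Pn/Ps: non-synonymous polymorphisms per non-synonymous (0-fold) site
    divided by synonymous polymorphisms per synonymous (4-fold) site.

    Returns None when the ratio is undefined (no sites, or no
    synonymous polymorphism).
    """
    if n_nonsyn_sites == 0 or n_syn_sites == 0 or n_syn_poly == 0:
        return None
    pn = n_nonsyn_poly / n_nonsyn_sites
    ps = n_syn_poly / n_syn_sites
    return pn / ps


def minor_allele_count(derived_count, n_copies=15):
    """Minor allele count among n_copies haploid neo-W copies, used here
    as the stratification variable."""
    return min(derived_count, n_copies - derived_count)
```

For instance, 10 polymorphisms at 1,000 non-synonymous sites and 20 polymorphisms at 500 synonymous sites (made-up counts) give Pn/Ps = 0.25. With 15 haploid copies the minor allele count ranges from 1 to 7.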
Butterfly rearing and molecular diagnostics
To generate a stock line that is cured of Spiroplasma infection, we treated caterpillars from all-female broods with tetracycline, following Jiggins and colleagues [23]. A ‘cured line’ was initiated from a single treated female that had the heterozygous Cc transiens phenotype (Fig 1A). This female was crossed to a wild male (homozygous cc) to test for sex linkage of phenotype. The cured line was maintained through sibling crosses for six generations and the persistence of males indicated that Spiroplasma had been eliminated.
We then applied a molecular test for sex linkage of chr15 using the F5 brood from the cured line. We designed two separate PCR diagnostics based on SNPs segregating on chr15 to distinguish between the two chromosomes of the male and the female parents (S11 Table). PCR was performed using the Phusion HF Master Mix and HF Buffer (New England Biolabs, Ipswich, MA).
To screen for Spiroplasma infection, we designed a PCR assay targeting the glycerophosphoryl diester phosphodiesterase (GDP) gene (S11 Table). PCR was performed as above. We confirmed the sensitivity of this diagnostic by analysing individuals of known infection status based on whole genome sequencing (12 infected and 11 uninfected).
To investigate whether Spiroplasma infection was always associated with a single mitochondrial haplotype, we designed a PCR RFLP for the Cytochrome Oxidase Subunit I (COI) that differentiates the infected ‘K’ lineage (S11 Table). PCR was performed as above. A subset of products were verified by Sanger sequencing after purification using the QIAquick PCR Purification Kit (Qiagen).
Supporting information
Acknowledgments
We are grateful to Godfrey Amoni Etelej, Laura Hebberecht-Lopez, and Glennis Julian for support with butterfly rearing. We thank Roger Vila, Frank Jiggins, and David Pryce for providing samples, and Jenny York, Frank Jiggins, Deborah Charlesworth, and Greg Hurst for helpful comments.
Abbreviations
- chr15: Chromosome 15
- COI: Cytochrome Oxidase Subunit I
- GDP: glycerophosphoryl diester phosphodiesterase
- LD: linkage disequilibrium
- Mb: megabase
Data Availability
Raw genomic data and assemblies are available from GenBank (project accession numbers PRJNA448181 and PRJEB35880, and individual sample accessions are provided in S9 Table). All processed data files underlying all figures are available from the Dryad digital repository: https://doi.org/10.5061/dryad.9kd51c5d0. Scripts used for data analysis are available from https://github.com/simonhmartin/genomics_general.
Funding Statement
This work was funded by European Research Council (https://erc.europa.eu) European Union Horizon 2020 research and innovation programme grant 646625 (CB), ERC grant 339873 (CDJ), National Geographic Society (https://www.nationalgeographic.org) Research Grant WW-138R-17 (IJG), and a Royal Society (https://royalsociety.org) University Research Fellowship URF\R1\180682 (SHM). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
References
- 1. Kirkpatrick M, Barton N. Chromosome inversions, local adaptation and speciation. Genetics. 2006;173: 419–34. doi:10.1534/genetics.105.047985
- 2. Bürger R, Akerman A. The effects of linkage and gene flow on local adaptation: A two-locus continent–island model. Theor Popul Biol. 2011;80: 272–288. doi:10.1016/j.tpb.2011.07.002
- 3. Guerrero RF, Rousset F, Kirkpatrick M. Coalescent patterns for chromosomal inversions in divergent populations. Philos Trans R Soc B Biol Sci. 2012;367: 430–438. doi:10.1098/rstb.2011.0246
- 4. Charlesworth D, Charlesworth B. Selection on recombination in clines. Genetics. 1979;91: 575–580.
- 5. Charlesworth D. The status of supergenes in the 21st century: Recombination suppression in Batesian mimicry and sex chromosomes and other complex adaptations. Evol Appl. 2016;9: 74–90. doi:10.1111/eva.12291
- 6. Rice WR. Genetic hitchhiking and the evolution of reduced genetic activity of the Y sex chromosome. Genetics. 1987;116: 161–167.
- 7. Charlesworth B. Model for evolution of Y chromosomes and dosage compensation. Proc Natl Acad Sci U S A. 1978;75: 5618–22. doi:10.1073/pnas.75.11.5618
- 8. Bachtrog D, Charlesworth B. The temporal dynamics of processes underlying Y chromosome degeneration. Genetics. 2008;179: 1513–25. doi:10.1534/genetics.107.084012
- 9. Mongue AJ, Nguyen P, Voleníková A, Walters JR. Neo-sex chromosomes in the monarch butterfly, Danaus plexippus. G3. 2017;7: 3281–3294. doi:10.1534/g3.117.300187
- 10. Bracewell RR, Bentz BJ, Sullivan BT, Good JM. Rapid neo-sex chromosome evolution and incipient speciation in a major forest pest. Nat Commun. 2017;8. doi:10.1038/s41467-017-00021-9
- 11. Pala I, Naurin S, Stervander M, Hasselquist D, Bensch S, Hansson B. Evidence of a neo-sex chromosome in birds. Heredity. 2012;108: 264–272. doi:10.1038/hdy.2011.70
- 12. Kitano J, Ross JA, Mori S, Kume M, Jones FC, Chan YF, et al. A role for a neo-sex chromosome in stickleback speciation. Nature. 2009;461: 1079–1083. doi:10.1038/nature08441
- 13. Nguyen P, Sykorova M, Sichova J, Kuta V, Dalikova M, Capkova Frydrychova R, et al. Neo-sex chromosomes and adaptive potential in tortricid pests. Proc Natl Acad Sci. 2013;110: 6931–6936. doi:10.1073/pnas.1220372110
- 14. Carabajal Paladino LZ, Provazníková I, Berger M, Bass C, Aratchige NS, López SN, et al. Sex Chromosome Turnover in Moths of the Diverse Superfamily Gelechioidea. Genome Biol Evol. 2019;11: 1307–1319. doi:10.1093/gbe/evz075
- 15. Steinemann M, Steinemann S. Enigma of Y chromosome degeneration: neo-Y and neo-X chromosomes of Drosophila miranda a model for sex chromosome evolution. Genetica. 1998;102–103: 409–20.
- 16. Smith DAS, Owen DF, Gordon IJ, Lowis NK. The butterfly Danaus chrysippus (L.) in East Africa: polymorphism and morph-ratio clines within a complex, extensive and dynamic hybrid zone. Zool J Linn Soc. 1997;120: 51–78.
- 17. Smith DAS. Heterosis, epistasis and linkage disequilibrium in a wild population of the polymorphic butterfly Danaus chrysippus (L.). Zool J Linn Soc. 1980;69: 87–109.
- 18. Smith DAS, Owen DF, Gordon IJ, Owiny AM. Polymorphism and evolution in the butterfly Danaus chrysippus (L.) (Lepidoptera: Danainae). Heredity. 1993;71: 242–251. doi:10.1038/hdy.1993.132
- 19. Smith DAS, Gordon IJ, Allen JA. Reinforcement in hybrids among once isolated semispecies of Danaus chrysippus (L.) and evidence for sex chromosome evolution. Ecol Entomol. 2010;35: 77–89. doi:10.1111/j.1365-2311.2009.01143.x
- 20. Smith DAS. Genetics of Some Polymorphic Forms of the African Butterfly Danaus chrysippus L. (Lepidoptera: Danaidae). Insect Syst Evol. 1975;6: 134–144. doi:10.1163/187631275X00235
- 21. Clarke CA, Sheppard PM, Smith AG. The genetics of fore and hindwing colour in crosses between Danaus chrysippus from Australia and from Sierra Leone (Danaidae). Lepid Soc J. 1973.
- 22. Smith DAS, Gordon IJ, Traut W, Herren J, Collins S, Martins DJ, et al. A neo-W chromosome in a tropical butterfly links colour pattern, male-killing, and speciation. Proc R Soc B Biol Sci. 2016;283: 20160821. doi:10.1098/rspb.2016.0821
- 23. Jiggins FM, Hurst GD, Jiggins CD, v d Schulenburg JH, Majerus ME. The butterfly Danaus chrysippus is infected by a male-killing Spiroplasma bacterium. Parasitology. 2000;120: 439–46. doi:10.1017/s0031182099005867
- 24. Herren JK, Gordon I, Holland PWH, Smith D. The butterfly Danaus chrysippus (Lepidoptera: Nymphalidae) in Kenya is variably infected with respect to genotype and body size by a maternally transmitted male-killing endosymbiont (Spiroplasma). Int J Trop Insect Sci. 2007;27: 62. doi:10.1017/S1742758407818327
- 25. Gordon IJ, Ireri P, Smith DAS. Hologenomic speciation: Synergy between a male-killing bacterium and sex-linkage creates a “magic trait” in a butterfly hybrid zone. Biol J Linn Soc. 2014;111: 92–109. doi:10.1111/bij.12185
- 26. Lushai G, Allen JA, Goulson D, Maclean N, Smith DAS. The butterfly Danaus chrysippus (L.) in East Africa comprises polyphyletic, sympatric lineages that are, despite behavioural isolation, driven to hybridization by female-biased sex ratios. Biol J Linn Soc. 2005;86: 117–131. doi:10.1111/j.1095-8312.2005.00526.x
- 27. Idris E, Saeed M. Hassan S. Biased sex ratios and aposematic polymorphism in African butterflies: A hypothesis. Ideas Ecol Evol. 2013;6: 5–16. doi:10.4033/iee.2013.6.2.n
- 28. Simão FA, Waterhouse RM, Ioannidis P, Kriventseva E V., Zdobnov EM. BUSCO: assessing genome assembly and annotation completeness with single-copy orthologs. Bioinformatics. 2015;31: 3210–3212. doi:10.1093/bioinformatics/btv351
- 29. Heliconius Genome Consortium T. Butterfly genome reveals promiscuous exchange of mimicry adaptations among species. Nature. 2012;487: 94–8. doi:10.1038/nature11041
- 30. Davey JW, Chouteau M, Barker SL, Maroja L, Baxter SW, Simpson F, et al. Major Improvements to the Heliconius melpomene Genome Assembly Used to Confirm 10 Chromosome Fusion Events in 6 Million Years of Butterfly Evolution. G3. 2016;6: 695–708. doi:10.1534/g3.115.023655
- 31. Davey JW, Barker SL, Rastas PM, Pinharanda A, Martin SH, Durbin R, et al. No evidence for maintenance of a sympatric Heliconius species barrier by chromosomal inversions. Evol Lett. 2017;1: 138–154. doi:10.1002/evl3.12
- 32. Ahola V, Lehtonen R, Somervuo P, Salmela L, Koskinen P, Rastas P, et al. The Glanville fritillary genome retains an ancient karyotype and reveals selective chromosomal fusions in Lepidoptera. Nat Commun. 2014;5: 1–9. doi:10.1038/ncomms5737
- 33. Leffler EM, Bullaughey K, Matute DR, Meyer WK, Ségurel L, Venkat A, et al. Revisiting an old riddle: what determines genetic diversity levels within species? PLoS Biol. 2012;10: e1001388. doi:10.1371/journal.pbio.1001388
- 34. Mackintosh A, Laetsch DR, Hayward A, Charlesworth B, Waterfall M, Vila R, et al. The determinants of genetic diversity in butterflies. Nat Commun. 2019;10: 3466. doi:10.1038/s41467-019-11308-4
- 35. Martin SH, Dasmahapatra KK, Nadeau NJ, Salazar C, Walters JR, Simpson F, et al. Genome-wide evidence for speciation with gene flow in Heliconius butterflies. Genome Res. 2013;23: 1817–1828. doi:10.1101/gr.159426.113
- 36. Martin SH (2020). Data from: Whole-chromosome hitchhiking driven by a male-killing endosymbiont. Dryad Digital Repository [cited 2020 Jan 1]. Openly available from: doi:10.5061/dryad.9kd51c5d0
- 37. Wittkopp PJ, Vaccaro K, Carroll SB. Evolution of yellow Gene Regulation and Pigmentation in Drosophila. Curr Biol. 2002;12: 1547–1556. doi:10.1016/s0960-9822(02)01113-2
- 38. Zhang L, Martin A, Perry MW, van der Burg KRL, Matsuoka Y, Monteiro A, et al. Genetic Basis of Melanin Pigmentation in Butterfly Wings. Genetics. 2017;205: 1537–1550. doi:10.1534/genetics.116.196451
- 39. Rives AF, Rochlin KM, Wehrli M, Schwartz SL, DiNardo S. Endocytic trafficking of Wingless and its receptors, Arrow and DFrizzled-2, in the Drosophila wing. Dev Biol. 2006;293: 268–283. doi:10.1016/j.ydbio.2006.02.006
- 40. Martin A, Papa R, Nadeau NJ, Hill RI, Counterman BA, Halder G. Diversification of complex butterfly wing patterns by repeated regulatory evolution of a Wnt ligand. Proc Natl Acad Sci U S A. 2012;109: 12632–12637. doi:10.1073/pnas.1204800109
- 41. Mazo-Vargas A, Concha C, Livraghi L, Massardo D, Wallbank RWR, Zhang L, et al. Macroevolutionary shifts of WntA function potentiate butterfly wing-pattern diversity. Proc Natl Acad Sci. 2017;114: 10701–10706. doi:10.1073/pnas.1708149114
- 42. Van Belleghem SM, Rastas P, Papanicolaou A, Martin SH, Arias CF, Supple MA, et al. Complex modular architecture around a simple toolkit of wing pattern genes. Nat Ecol Evol. 2017;1: 0052. doi:10.1038/s41559-016-0052
- 43. Smith DAS. Evidence for autosomal meiotic drive in the butterfly Danaus chrysippus L. Heredity. 1976;36: 139–142. doi:10.1038/hdy.1976.13
- 44. Coughlan JM, Willis JH. Dissecting the role of a large chromosomal inversion in life history divergence throughout the Mimulus guttatus species complex. Mol Ecol. 2019;28: 1343–1357. doi:10.1111/mec.14804
- 45. Lee CR, Wang B, Mojica JP, Mandáková T, Prasad KVSK, Goicoechea JL, et al. Young inversion with multiple linked QTLs under selection in a hybrid zone. Nat Ecol Evol. 2017;1. doi:10.1038/s41559-017-0119
- 46. Gordon IJ, Ireri P, Smith DAS. Preference for isolated host plants facilitates invasion of Danaus chrysippus (Linnaeus, 1758) (Lepidoptera: Nymphalidae) by a bacterial male-killer Spiroplasma. Austral Entomol. 2015;54: 210–216. doi:10.1111/aen.12113
- 47. Hurst GDD, Hutchence KJ. Host defence: Getting by with a little help from our friends. Curr Biol. 2010;20: R806–R808. doi:10.1016/j.cub.2010.07.038
- 48. Duplouy A, O’Neill SL. Rapid spread of male-Killing Wolbachia in the butterfly Hypolimnas bolina. J Evol Biol. 2010;23: 209–227. doi:10.1007/978-3-642-12340-5_13
- 49. Balbuena JA, Míguez-Lozano R, Blasco-Costa I. PACo: A Novel Procrustes Application to Cophylogenetic Analysis. PLoS ONE. 2013;8: e61048. doi:10.1371/journal.pone.0061048
- 50. Richardson MF, Weinert LA, Welch JJ, Linheiro RS, Magwire MM, Jiggins FM, et al. Population Genomics of the Wolbachia Endosymbiont in Drosophila melanogaster. PLoS Genet. 2012;8. doi:10.1371/journal.pgen.1003129
- 51. Smith DAS. African Queens and Their Kin: A Darwinian Odyssey. Taunton, UK: Brambleby Books; 2014.
- 52. Jiggins FM. Male-killing Wolbachia and mitochondrial DNA: selective sweeps, hybrid introgression and parasite population dynamics. Genetics. 2003;164: 5–12.
- 53. Hurst GDD, Jiggins FM. Problems with mitochondrial DNA as a marker in population, phylogeographic and phylogenetic studies: The effects of inherited symbionts. Proc R Soc B Biol Sci. 2005;272: 1525–1534. doi:10.1098/rspb.2005.3056
- 54. Palopoli MF, Wu CI. Rapid evolution of a coadapted gene complex: Evidence from the segregation Distorter (SD) system of meiotic drive in Drosophila melanogaster. Genetics. 1996;143: 1675–1688.
- 55. Traut W, Ahola V, Smith DAS, Gordon IJ, Ffrench-Constant RH. Karyotypes versus Genomes: The Nymphalid Butterflies Melitaea cinxia, Danaus plexippus, and D. chrysippus. Cytogenet Genome Res. 2018;153: 46–53. doi:10.1159/000484032
- 56. Wright AE, Dean R, Zimmer F, Mank JE. How to make a sex chromosome. Nat Commun. 2016. p. 12087. doi:10.1038/ncomms12087
- 57. Fay JC, Wyckoff GJ, Wu C-I. Positive and Negative Selection on the Human Genome. Genetics. 2001;158: 1227–1234.
- 58. Smith DAS, Owen DF. Colour genes as markers for migratory activity: The butterfly Danaus chrysippus in Africa. Oikos. 1997;78: 127–135.
- 59. Jiggins FM, Hurst GDD, Majerus MEN. Sex-ratio-distorting Wolbachia causes sex-role reversal in its butterfly host. Proc R Soc B Biol Sci. 2000;267: 69–73. doi:10.1098/rspb.2000.0968
- 60. Hornett EA, Charlat S, Duplouy AMR, Davies N, Roderick GK, Wedell N, et al. Evolution of male-killer suppression in a natural population. PLoS Biol. 2006;4: 1643–1648. doi:10.1371/journal.pbio.0040283
- 61. Wilfert L, Jiggins FM. The dynamics of reciprocal selective sweeps of host resistance and a parasite counter-adaptation in Drosophila. Evolution. 2013;67: 761–773. doi:10.1111/j.1558-5646.2012.01832.x
- 62. Bankevich A, Nurk S, Antipov D, Gurevich AA, Dvorkin M, Kulikov AS, et al. SPAdes: A New Genome Assembly Algorithm and Its Applications to Single-Cell Sequencing. J Comput Biol. 2012;19: 455–477. doi:10.1089/cmb.2012.0021
- 63. Pryszcz LP, Gabaldón T. Redundans: an assembly pipeline for highly heterozygous genomes. Nucleic Acids Res. 2016;44: e113. doi:10.1093/nar/gkw294
- 64. Huang S, Kang M, Xu A. HaploMerger2: rebuilding both haploid sub-assemblies from high-heterozygosity diploid genome assembly. Bioinformatics. 2017;33: 2577–2579. doi:10.1093/bioinformatics/btx220
- 65. Dierckxsens N, Mardulyn P, Smits G. NOVOPlasty: de novo assembly of organelle genomes from whole genome data. Nucleic Acids Res. 2016;45: gkw955. doi:10.1093/nar/gkw955
- 66. Dasmahapatra KK, Walters JR, Briscoe AD, Davey JW, Whibley A, Nadeau NJ, et al. Butterfly genome reveals promiscuous exchange of mimicry adaptations among species. Nature. 2012;487: 94–8. doi:10.1038/nature11041
- 67. Overbeek R, Olson R, Pusch GD, Olsen GJ, Davis JJ, Disz T, et al. The SEED and the Rapid Annotation of microbial genomes using Subsystems Technology (RAST). Nucleic Acids Res. 2014;42: D206–D214. doi:10.1093/nar/gkt1226
- 68. Aziz RK, Bartels D, Best AA, DeJongh M, Disz T, Edwards RA, et al. The RAST Server: Rapid Annotations using Subsystems Technology. BMC Genomics. 2008;9: 75. doi:10.1186/1471-2164-9-75
- 69. Zhan S, Zhang W, Niitepõld K, Hsu J, Haeger JF, Zalucki MP, et al. The genetics of monarch butterfly migration and warning colouration. Nature. 2014;514: 317–21. doi:10.1038/nature13812
- 70. Lunter G, Goodson M. Stampy: a statistical algorithm for sensitive and fast mapping of Illumina sequence reads. Genome Res. 2011;21: 936–9. doi:10.1101/gr.111120.110
- 71. DePristo MA, Banks E, Poplin R, Garimella K V, Maguire JR, Hartl C, et al. A framework for variation discovery and genotyping using next-generation DNA sequencing data. Nat Genet. 2011;43: 491–8. doi:10.1038/ng.806
- 72. Van der Auwera GA, Carneiro MO, Hartl C, Poplin R, del Angel G, Levy-Moonshine A, et al. From fastQ data to high-confidence variant calls: The genome analysis toolkit best practices pipeline. Curr Protoc Bioinforma. 2013;43: 11.10.1–11.10.33. doi:10.1002/0471250953.bi1110s43
- 73. Chang CC, Chow CC, Tellier LC, Vattikuti S, Purcell SM, Lee JJ. Second-generation PLINK: rising to the challenge of larger and richer datasets. Gigascience. 2015;4: 7. doi:10.1186/s13742-015-0047-8
- 74. Bryant D, Moulton V. Neighbor-Net: An Agglomerative Method for the Construction of Phylogenetic Networks. Mol Biol Evol. 2003;21: 255–265. doi:10.1093/molbev/msh018
- 75. Huson DH, Bryant D. Application of Phylogenetic Networks in Evolutionary Studies. Mol Biol Evol. 2006;23: 254–267. doi:10.1093/molbev/msj030
- 76. Delaneau O, Zagury J-F, Marchini J. Improved whole-chromosome phasing for disease and population genetic studies. Nat Methods. 2013;10: 5–6. doi:10.1038/nmeth.2307
- 77. Delaneau O, Marchini J, Zagury JF. A linear complexity phasing method for thousands of genomes. Nat Methods. 2012;9: 179–181. doi:10.1038/nmeth.1785
- 78. Guindon S, Dufayard J-F, Lefort V, Anisimova M, Hordijk W, Gascuel O. New algorithms and methods to estimate maximum-likelihood phylogenies: assessing the performance of PhyML 3.0. Syst Biol. 2010;59: 307–21. doi:10.1093/sysbio/syq010
- 79. Bouckaert RR. DensiTree: making sense of sets of phylogenetic trees. Bioinformatics. 2010;26: 1372–3. doi:10.1093/bioinformatics/btq110
- 80. Bouckaert R, Heled J, Kühnert D, Vaughan T, Wu C-H, Xie D, et al. BEAST 2: A Software Platform for Bayesian Evolutionary Analysis. PLoS Comput Biol. 2014;10: e1003537. doi:10.1371/journal.pcbi.1003537
- 81. Keightley PD, Pinharanda A, Ness RW, Simpson F, Dasmahapatra KK, Mallet J, et al. Estimation of the Spontaneous Mutation Rate in Heliconius melpomene. Mol Biol Evol. 2015;32: 239–243. doi:10.1093/molbev/msu302
- 82. Owen DF, Chanter DO. Population biology of tropical African butterflies. Sex ratio and genetic variation in Acraea encedon. J Zool. 1969;157: 345–374. doi:10.1111/j.1469-7998.1969.tb01707.x
- 83. Rambaut A, Drummond AJ, Xie D, Baele G, Suchard MA. Posterior Summarization in Bayesian Phylogenetics Using Tracer 1.7. Syst Biol. 2018;67: 901–904. doi:10.1093/sysbio/syy032
- 84. Reed RD, Papa R, Martin A, Hines HM, Kronforst MR, Chen R, et al. optix Drives the Repeated Convergent Evolution of Butterfly Wing Pattern Mimicry. Science. 2011;333: 1137–1142. doi:10.1126/science.1208227
- 85. Nadeau NJ, Pardo-diaz C, Whibley A, Supple MA, Suzanne V, Richard W, et al. The gene cortex controls mimicry and crypsis in butterflies and moths. Nature. 2016;534: 106–110. doi:10.1038/nature17961
- 86. Westerman EL, VanKuren NW, Massardo D, Tenger-Trolander A, Zhang W, Hill RI, et al. Aristaless Controls Butterfly Wing Color Variation Used in Mimicry and Mate Choice. Curr Biol. 2018;28: 3469–3474.e4. doi:10.1016/j.cub.2018.08.051
- 87. Kunte K, Zhang W, Tenger-Trolander A, Palmer DH, Martin A, Reed RD, et al. doublesex is a mimicry supergene. Nature. 2014;507: 229–232. doi:10.1038/nature13112
- 88. Thompson MJ, Timmermans MJ, Jiggins CD, Vogler AP. The evolutionary genetics of highly divergent alleles of the mimicry locus in Papilio dardanus. BMC Evol Biol. 2014;14: 140. doi:10.1186/1471-2148-14-140
- 89. Smith DAS, Gordon IJ, Depew LA, Owen DF. Genetics of the butterfly Danaus chrysippus (L.) in a broad hybrid zone, with special reference to sex ratio, polymorphism and intragenomic conflict. Biol J Linn Soc. 1998;65: 1–40. doi:10.1006/bijl.1998.0240