PLOS Genetics. 2025 Apr 10;21(4):e1011655. doi: 10.1371/journal.pgen.1011655

Genomic regions of current low hybridisation mark long-term barriers to gene flow in scarce swallowtail butterflies

Sam Ebdon 1,*, Dominik R Laetsch 1, Roger Vila 2, Stuart J E Baird 3,#, Konrad Lohse 1,#
Editor: Nicolas Bierne 4
PMCID: PMC12040345  PMID: 40209170

Abstract

Many closely related species continue to hybridise after millions of generations of divergence. However, the extent to which current patterning in hybrid zones connects back to the speciation process remains unclear: does evidence for current multilocus barriers support the hypothesis of speciation due to multilocus divergence? We analyse whole-genome sequencing data to investigate the speciation history of the scarce swallowtails Iphiclides podalirius and I. feisthamelii, which abut at a narrow (∼25 km) contact zone north of the Pyrenees. We first quantify the heterogeneity of effective migration rate under a model of isolation with migration, using genomes sampled across the range to identify long-term barriers to gene flow. Secondly, we investigate the recent ancestry of individuals from the hybrid zone using genome polarisation and estimate the coupling coefficient under a model of a multilocus barrier. We infer a low rate of long-term gene flow from I. feisthamelii into I. podalirius (the direction of which matches the admixture across the hybrid zone) and complete reproductive isolation across ≈33% of the genome. Our contrast of recent and long-term gene flow shows that regions of low recent hybridisation are indeed enriched for long-term barriers which maintain divergence between these hybridising sister species. This finding paves the way for future analysis of the evolution of reproductive isolation along the speciation continuum.

Author summary

Efforts to understand how new species evolve typically approach the problem through either: 1) investigating patterns of genetic exchange across ‘hybrid zones’, where closely related species interbreed, or 2) modelling the demographic history of species divergence. Both approaches are capable of quantifying variation in genetic exchange, or ‘gene flow’, along the genome to identify regions of reproductive isolation; yet they rely on different genetic signatures. While the former exploits allele frequency clines and patterns of linkage disequilibrium set up since the most recent range contact, the latter averages signatures over the history of divergence. Hence, we can contrast the genomic distribution of barriers acting on these different time scales to test how patterns of gene flow change across the speciation continuum. Here we use this strategy to capture the speciation dynamics of a pair of hybridising papilionid butterflies. Our results show not only that these species continue to produce hybrids more than a million years after the onset of divergence, but also that there is a significant degree of concordance between patterns of gene flow observed along the genome across time scales.

Introduction

A fundamental goal of speciation research is to understand the genetic basis of reproductive isolation (RI) between diverging species and quantify the demographic and selective processes that lead to a build-up of RI [1]. We now know that episodes of gene flow during speciation are not only possible [2–5] but frequent [6–10]: closely related species often continue to hybridise after millions of generations of divergence [11–14], yet remain distinct despite low levels of gene flow [15]. However, given the time scales over which speciation occurs, the processes that contribute to RI are likely to vary over time [16]. Stankowski and Ravinet [17] define the speciation continuum as a continuum of reproductive isolation, from incipience to complete hybrid inviability, and highlight that the species we observe today are at different stages along this continuum. For example, sister species in many temperate plant and animal taxa form secondary contact zones [18,19] which may be substantially younger than the onset of speciation. In other words, contemporary contact zones are likely one of many instances of secondary range contact generated by drastic environmental changes, such as glacial cycles over the last 800,000 years [20]. Hybrid zones (HZs) for such taxa are stable precisely because there is strong selection against admixed ancestry. Whilst it is evident that genetic divergence and differentiation vary along the genome [21,22], understanding the extent to which this reflects variation in the rate of gene flow requires explicit modelling of the interplay between migration, selection, genetic drift, and recombination.

Locally beneficial alleles may be selected against if they migrate into unfavourable environments and/or genetic backgrounds, reducing effective gene flow [23]. However, over time, gene flow may vary as a consequence of range shifts and/or other changes in demography caused by glacial cycles, and, as a consequence, the selective forces and targets underpinning RI may also vary over time. Periods of complete allopatry during which gene flow is interrupted facilitate the build-up of strong endogenous [24] or ‘intrinsic’ barriers [25]. Genetic studies of HZs show that recent introgression may vary considerably along the genome [26]: some genome regions harbouring strong incompatibilities act as strong barriers and show steep clines [27,28]. For example, steep clines in nature predicted [29] the approximate location of the second mammalian hybrid sterility gene Hstx2 years before its identification in the lab [30]. In other regions introgression may be unimpeded by selection or even advantageous [31–33]. Importantly, the barriers associated with RI since the onset of divergence may differ from those acting during secondary contact, many of which may have arisen recently [34,35] and may even have evolved to reinforce existing barriers in the face of gene flow [36]. To date, few studies have compared the targets of recent selection against heterospecific ancestry in hybrids with the architecture of long-term barriers to gene flow.

A recent meta-analysis of barriers to gene flow between the Western and Eastern European house mouse (Mus musculus domesticus vs M. musculus musculus) found no significant overlap between postzygotic incompatibilities mapped in interspecific crosses in the lab and barriers to gene flow detected in the natural HZ of these taxa [37]. There are biological reasons for the non-repeatability between barriers detected in lab studies and HZs: firstly, the former are heavily biased towards barriers acting in the F1 or the first few backcross generations, the formation of which is extremely rare in natural HZs (which are typically several dispersal distances wide). Secondly, as argued by Frayer and Payseur [37], “the loci underlying barriers observed in the lab may be distinct from those that impede gene flow in nature” and may be involved in pre-mating isolation and/or extrinsic local adaptation. However, the lack of overlap found by Frayer and Payseur [37] may simply reflect a lack of statistical power: their study pooled information from many clinal analyses with varying (but high) false positive rates, and mapping studies have low resolution in general, since an n-generation lab cross study can only generate O(n) crossovers per chromosome.

Thus, one may argue that contrasts of barriers acting since the most recent secondary contact of a species pair with barriers acting since the onset of divergence are not only more biologically relevant than contrasts with lab-based (early generation) hybrids, but also have greater statistical power, at least in principle. This simply reflects the fact that a typical HZ individual involves a much larger number of admixture generations, ancestors, and junctions in ancestry than any lab-cross.

Previous attempts to investigate the overlap of HZ introgression with long-term barriers relied on comparing outliers of genetic differentiation with cline-based analyses and yielded mixed results: Harrison and Larson [38] highlight several systems in which outliers of increased genetic differentiation show a reduction of recent introgression as measured by steeper genomic clines across secondary contact hybrid zones [28,39–46]. However, many other studies report mixed or negative results [40,47–54]. Interestingly, in some instances, fixed inversions have been shown to be associated with both ‘islands of divergence’ and steep clines [55–58]. While divergence outliers may correlate with regions of low HZ introgression, both the degree and significance of this association remain unclear.

In practice, comparisons between genomic clines and outliers of genetic differentiation have been limited for two reasons: firstly, summary statistics of the site frequency spectrum (such as FST) confound barrier effects with other population genetic processes and, by focusing on the most extreme outliers, limit sample size and power. Secondly, the power of clinal analyses is limited both by the need for extensive geographic sampling (which is impossible and/or prohibitively expensive for many natural HZ systems) and insufficiency of summary statistics; e.g. clines are often merely centered and scaled without allowing for variation in their shape and asymmetry.

The lack of comparisons of barriers to gene flow across time scales has been highlighted in a recent review on speciation genomics [35]. However, filling this gap requires quantitative inference frameworks that can distinguish barriers on both time scales from other evolutionary processes using the limited sample sizes available for most natural HZs. Such approaches have only become available recently.

Speciation in Iphiclides butterflies

The southern European ‘scarce swallowtails’ Iphiclides podalirius and I. feisthamelii are large papilionid butterflies, typically associated with various species of Prunus and other Rosaceae bushes and trees. While I. podalirius ranges across the north of the Mediterranean from France to East Asia, I. feisthamelii is restricted to the Iberian peninsula and the northwest of Africa (Fig 1). Despite a long appreciation of the phenotypic differences between the two species [59], including genital morphology [60] and male ultraviolet wing patterns [61], their taxonomic status has been disputed after DNA barcoding at the mitochondrial COX1 locus revealed that the two taxa share mitochondrial haplotypes [62,63]. However, it has recently been suggested that mitochondrial genealogy reflects an introgression sweep most likely linked to infection by Wolbachia [61]. The two taxa diverged approximately 1.2 million years ago [20,64] and today, the species abut at a narrow (∼25 km) contact zone north of the Pyrenees [65] (Fig 1). While potential hybrids have been diagnosed based on morphology in a small set of museum specimens [66], the putative HZ has not yet been characterised genetically. Here, we quantify introgression across the HZ and estimate long-term effective migration rates (m_e) for this pair of sister species.

Fig 1. (A) Sampling locations and ranges of I. feisthamelii (purple) and I. podalirius (teal) butterflies.

The samples collected from the hybrid zone (HZ) are shown in yellow. (B) Sampling locations of butterflies from the Iphiclides HZ. The dashed line represents the approximate HZ center, based on samples collected by Lafranchis et al. [66]. The circular samples resemble I. feisthamelii (HI < 0.1), the triangular samples are intermediate hybrids (HI > 0.1). Maps were generated using the Python ‘basemap’ package with Natural Earth’s 10m country and coastline datasets (available here), and a relief layer from the ArcGIS Rest Service [90]. The basemap plotting code is available here.

Aims and objectives

We investigate the history of speciation and quantify both long and short-term barriers to gene flow between I. podalirius and I. feisthamelii. We use two recently developed minimal-assumption inference methods gIMble [67] and diem [68] on whole-genome sequencing (WGS) data to quantify long-term barriers and selection against recent introgression in the HZ, respectively. Firstly, we infer variation in the long-term rate of effective migration along the genome since species divergence under an explicit demographic model and locate putative barriers to gene flow with gIMble. Secondly, we use the genome polarisation framework diem to quantify recent barriers to introgression in six putative hybrid individuals sampled from the Iphiclides HZ (Fig 1) and characterize the multilocus barrier to ongoing hybridisation between these species. Finally, we investigate the overlap between long-term barriers to gene flow and genomic regions that are depleted for recent introgression. We address the following questions:

  1. What is the direction and rate of gene flow between I. podalirius and I. feisthamelii since their initial divergence?

  2. What fraction of the Iphiclides genome acts as a long-term barrier to gene flow, and what are the properties of barrier regions?

  3. What is the evidence for recent gene flow between the species across the HZ?

  4. Is introgression across the HZ impeded by long-term barrier regions, i.e. do the barriers acting over these two different timescales overlap?

  5. What is the strength of the multilocus barrier acting against introgression in the HZ?

Results

Interspecific variation is consistent with hybridisation in Iphiclides

We generated WGS data for six individuals from the Iphiclides hybrid zone (we will refer to these as “the HZ set”) in southern France (Fig 1B) and 14 individuals sampled throughout the ranges of both species (“the non-HZ set”, Fig 1A). The non-HZ set includes six samples of I. podalirius and eight of I. feisthamelii.

Visualising genetic variation in a PCA reveals distinct clusters both within and between species. The first principal component (PC1) captures interspecies differentiation between I. podalirius and I. feisthamelii (≈23% of the variation, Fig 2). Samples assigned morphologically to each species form two clusters along PC1, with individuals from the HZ falling between the two parental species clusters (Fig 2). The two North African I. feisthamelii samples are separated from the European I. feisthamelii samples along PC2 (≈14% of the variation, Fig 2). Genetic diversity is slightly higher in I. feisthamelii than in I. podalirius (Table 1).

Fig 2. (A) The background demographic history of species divergence and gene flow, the height and width of populations, is relative to the maximum likelihood estimates under the IM2,pod model (Table 2).

This figure was generated using demes [91]. (B) PCA of Iphiclides sampled across Europe; I. feisthamelii and I. podalirius samples are shown in purple and teal respectively. Samples from the HZ are shown in yellow: PC1 captures differences between the two taxa, PC2 geographic structure, particularly the separation between North African and European I. feisthamelii.

Table 1. Estimates of genetic diversity (H, π, and θ), divergence (dxy), and differentiation (FST) at intergenic (I) and fourfold degenerate (4D) sites between I. feisthamelii and I. podalirius. Estimates for 4D sites are taken from [20].

Species          H_I      H_4D     π_I      θ_I      d_xy,I  d_xy,4D  F_ST,I  F_ST,4D
I. feisthamelii  0.00676  0.00794  0.00682  0.00692  0.0244  0.0275   0.594   0.575
I. podalirius    0.00585  0.00521  0.00652  0.00659

We identified runs of homozygosity (ROH) > 100 kb using PLINK (v1.9) [69] to estimate inbreeding via F_ROH. One I. podalirius sample from Sicily (RVcoll12R048) was particularly inbred (F_ROH = 0.25), and two other I. podalirius individuals, one from Romania (RVcoll14E561) and one from the HZ (1325), were somewhat inbred (F_ROH = 0.076 and 0.061, respectively). F_ROH in the remaining samples (including all I. feisthamelii) was negligible (S1 Table).
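The F_ROH statistic used above can be illustrated with a minimal sketch. It assumes the common definition of F_ROH as the summed length of qualifying ROH divided by the length of genome analysed; the function name and the example tract lengths are hypothetical, and the exact denominator used in the study is not stated in this passage.

```python
def f_roh(roh_lengths_bp, genome_length_bp, min_length_bp=100_000):
    """Fraction of the analysed genome covered by long runs of homozygosity.

    roh_lengths_bp: lengths (bp) of the ROH tracts called for one individual.
    genome_length_bp: length (bp) of the genome considered (assumed denominator).
    min_length_bp: only tracts strictly longer than this are counted.
    """
    if genome_length_bp <= 0:
        raise ValueError("genome_length_bp must be positive")
    total = sum(length for length in roh_lengths_bp if length > min_length_bp)
    return total / genome_length_bp


# Hypothetical individual: one tract is too short to count.
example = f_roh([50_000, 200_000, 300_000], genome_length_bp=1_000_000)
```

In this toy example the two tracts above 100 kb sum to 500 kb, so half of the 1 Mb toy genome is in ROH.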

Evidence for post-divergence gene flow from I. feisthamelii into I. podalirius

We use the composite-likelihood method gIMble [67] to infer long-term barriers to gene flow. Although the speciation history of Iphiclides most likely involved glacial cycles of isolation and secondary contact, our aim is not to reconstruct this likely complex and dynamic demographic history in any detail, but rather to capture the variation in long-term gene flow along the genome with the fewest parameters. We therefore fit an IM model that assumes a constant rate of m_e through time but allows for heterogeneity in m_e along the genome [9,67]. Contrasting the support for a background IM model and a history without gene flow in genomic windows gives a measure of the cumulative local strength of barrier loci. Since this analysis of barriers relies on the assumption of a two-population history, HZ and North African samples were excluded. We summarise genetic variation in short 64-base ‘blocks’, which are assumed to be non-recombining, under no direct selection, and to evolve with a constant mutation rate. These assumptions allow modelling the shared genealogical history of closely linked variants in the composite likelihood framework implemented in gIMble. To maximise the density of neutrally evolving sites, we follow Laetsch et al. [67] and restrict this analysis to intergenic sequence, which has similar per-site diversity and divergence to fourfold degenerate (4D) sites in coding regions (Table 1). After applying coverage-based filters (see Methods), our analyses include 39% of the genome (≈160 out of 408 Mb).

We fit a series of demographic models to the blockwise site frequency spectrum (bSFS) of the whole genome: a model of strict divergence (DIV) and IM models with migration in either direction, each with three N_e parameters (IM3,pod and IM3,fei). Out of these three scenarios (Table 2), the best fitting history is an IM3 model with unidirectional gene flow (forwards in time) from I. feisthamelii into I. podalirius (IM3,pod). Since an IM model will always fit at least as well as the (nested) DIV model, we compared the observed improvement in model fit (Δln CL) to the null distribution of Δln CL, which we obtained by simulating 100 data sets under the best fitting DIV history (see Methods). This parametric bootstrap confirms that the IM3,pod model does indeed fit significantly better than a DIV history (S1 Fig).
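The logic of the parametric bootstrap can be sketched as follows. This is an illustration of the general technique rather than the gIMble workflow: the function name is hypothetical, the null values would come from refitting both models to data simulated under the DIV history, and the "+1" correction is a common convention assumed here, not something stated in the text.

```python
def bootstrap_p_value(observed_improvement, null_improvements):
    """One-sided p-value for an observed improvement in model fit.

    observed_improvement: improvement in ln CL of the IM over the DIV model
        on the real data.
    null_improvements: the same improvement computed on data sets simulated
        under the nested (DIV) model.
    Uses the (k + 1) / (n + 1) convention so the p-value is never exactly 0.
    """
    if not null_improvements:
        raise ValueError("need at least one simulated replicate")
    as_extreme = sum(1 for value in null_improvements
                     if value >= observed_improvement)
    return (as_extreme + 1) / (len(null_improvements) + 1)
```

With 100 simulated replicates, the smallest p-value this sketch can return is 1/101, which is reached when no replicate matches or exceeds the observed improvement.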

Table 2. Maximum composite likelihood parameters for three demographic models of species divergence. We use the IM3,pod model for barrier inference. Δln ⁡  CL is relative to the best supported model (bold).

Model    N_fei    N_pod    N_anc      T          m         Δln CL
DIV      459,000  416,000  1,240,000  1,810,000  -         -389,374
IM3,pod  483,000  377,000  1,150,000  2,180,000  4.73E-08  0
IM3,fei  438,000  429,000  1,190,000  2,020,000  2.80E-08  -238,736

Under the background/global IM model, we infer a split time (T) 2,180,000 generations ago (Fig 2 and Table 2) and N_e estimates of 483,000, 377,000 and 1,140,000 for I. feisthamelii, I. podalirius and the ancestral population, respectively. This split time is close to previous estimates for the species pair [20]. The long-term genome-wide rate of gene flow m_e from I. feisthamelii into I. podalirius is 4.73 × 10^-8, which corresponds to M = 2N_e m_e = 0.046 migrants per generation.
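As a quick arithmetic check of the scaled migration rate, the sketch below evaluates 2 N_e m_e with the Table 2 estimates. Which N_e enters the scaling is not stated in this passage; the reported value of 0.046 is reproduced with the I. feisthamelii estimate, which is an observation from the numbers rather than a claim made by the text. Variable and function names are illustrative.

```python
# Maximum composite likelihood estimates under IM3,pod (Table 2).
N_FEI = 483_000   # N_e of I. feisthamelii
N_POD = 377_000   # N_e of I. podalirius
M_E = 4.73e-8     # effective migration rate per generation


def scaled_migration(n_e, m_e):
    """Scaled migration rate 2 * N_e * m_e (migrants per generation)."""
    return 2 * n_e * m_e


m_with_fei = scaled_migration(N_FEI, M_E)  # about 0.0457
m_with_pod = scaled_migration(N_POD, M_E)  # about 0.0357
```

Either way the value is far below one migrant per generation, consistent with the low rate of long-term gene flow described above.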

Extensive genome-wide reproductive isolation between I. feisthamelii and I. podalirius

We infer the effective migration rate m_e along the genome in sliding windows each composed of 28,125 intergenic blocks (which corresponds to a median window span of ≈150 kb). For each window, we estimate parameters under the best fitting IM3,pod history using a pre-computed grid. Note that whilst the split time T is fixed globally, the remaining four parameters (N_e for each population and m_e) are estimated locally, i.e. per window. The estimates of local N_e are approximately normally distributed (S2 Fig). In contrast, local m_e estimates have a strongly leptokurtic distribution with a peak at m_e = 0 and a long tail up to the maximum value in the grid (S2 Fig). Note that the large number of windows in the largest m_e bin reflects the fact that our grid of m_e estimates necessarily truncates the m_e distribution, but does not affect our analyses of barriers.

Following Laetsch et al. [67] we label windows as barriers to gene flow if a DIV history (m_e = 0) has greater marginal support than an IM history assuming the best fitting genome-wide value of m_e (Table 2) and this difference in marginal support is supported by a parametric bootstrap (see Methods), i.e. Δ_B,0 > 0. Applying this strictest possible barrier definition, we find that barriers to gene flow are widespread in the Iphiclides genome, making up 20% of windows across all chromosomes. Combining overlapping barrier windows, we define 555 barrier regions across all autosomes that cover 33% of the genome (≈143 Mb). The average length of barrier regions is ≈257 kb (median ≈200 kb) with a maximum of 1.4 Mb.
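The step of combining overlapping barrier windows into barrier regions can be sketched as a standard interval merge. The tuple layout, the function name and the coordinates are hypothetical, and windows that merely touch are treated as overlapping here, which is an assumption rather than a documented choice.

```python
def merge_windows(windows):
    """Merge overlapping (chrom, start, end) windows into contiguous regions.

    Windows on different chromosomes are never merged. Coordinates are
    treated as plain integers with start <= end.
    """
    merged = []
    for chrom, start, end in sorted(windows):
        if merged and merged[-1][0] == chrom and start <= merged[-1][2]:
            # Overlaps (or touches) the previous region: extend it.
            prev_chrom, prev_start, prev_end = merged[-1]
            merged[-1] = (prev_chrom, prev_start, max(prev_end, end))
        else:
            merged.append((chrom, start, end))
    return merged


# Hypothetical barrier windows: the first two overlap and form one region.
regions = merge_windows([
    ("chr1", 0, 100), ("chr1", 50, 150), ("chr1", 200, 300), ("chr2", 0, 100),
])
```

Because sliding windows overlap, the merged regions can cover a different share of the genome than the share of windows labelled as barriers.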

The genomic correlates of barrier regions

Assuming that the causal loci underlying barriers are in or near genes, we predict an enrichment of long-term barrier regions for coding sequence. This pattern has been observed in Heliconius melpomene and H. cydno [67] as well as scans for pairwise incompatibilities in natural hybrid populations of swordtail fish [70]. Contrary to this expectation, we find that both the density of coding sequence (CDS) and repeats are reduced in barrier regions (S3 Fig, see discussion). Given that our windows are both overlapping and genetic variation is autocorrelated along the genome (due to LD and various other sources of autocorrelation), standard statistical methods that assume independence are inappropriate for assessing the significance of differences between barrier and non-barrier windows and would lead to vastly overconfident conclusions. Instead, we use a circular data bootstrap approach to quantify the statistical support for correlates of barriers. This is inspired by a similar circular resampling scheme used by Yassin et al. [71] and Nouhaud et al. [72] who resampled by circularising the genome and sampling datasets with fixed offsets. We modify this procedure to sample with random offsets and circularising each chromosome (rather than the whole genome). This allows us to obtain null distributions for any genomic measure (e.g. CDS density) for random sets of genomic windows that have the same distribution between and distances/clustering within chromosomes as the set of barrier windows inferred by gIMble. Applying this data bootstrap confirms that both CDS and repeat density are indeed significantly reduced in long-term barriers compared to non-barrier windows (S3 Fig).
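A minimal sketch of a per-chromosome circular bootstrap with random offsets is given below. It is an illustration under stated assumptions, not the authors' code: each chromosome is represented by an ordered list of per-window values (for example CDS density) and a parallel list of barrier flags, and all names are hypothetical.

```python
import random


def circular_null(values_by_chrom, barrier_mask_by_chrom, n_reps=1000, seed=0):
    """Null distribution of the mean value in windows flagged as barriers.

    For every replicate, the barrier mask of each chromosome is rotated by an
    independent random offset, treating the chromosome as a circle. This keeps
    the number of flagged windows per chromosome and their spacing intact.
    """
    n_flagged = sum(sum(1 for f in mask if f)
                    for mask in barrier_mask_by_chrom.values())
    if n_flagged == 0:
        raise ValueError("no barrier windows flagged")
    rng = random.Random(seed)
    null_means = []
    for _ in range(n_reps):
        total = 0.0
        for chrom, values in values_by_chrom.items():
            n = len(values)
            if n == 0:
                continue
            offset = rng.randrange(n)  # random rotation for this chromosome
            for i, flagged in enumerate(barrier_mask_by_chrom[chrom]):
                if flagged:
                    total += values[(i + offset) % n]
        null_means.append(total / n_flagged)
    return null_means
```

The observed mean in barrier windows can then be compared with this distribution, for instance by counting the replicates that are at least as extreme.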

Models of local adaptation under migration-selection balance predict a concentration of barriers in regions of low recombination and a positive correlation between m_e and recombination [73]. While we lack direct estimates of recombination for Iphiclides, we can use chromosome location as an indirect proxy for recombination over two scales. First, given the requirement of a single cross-over per meiosis [74], chromosome length is inversely related to recombination rate [75–77]. Second, a ubiquitous feature of meiosis in Lepidoptera is that, despite the lack of centromeres, recombination is reduced in the centre of chromosomes. Indeed, heterozygosity is strongly negatively correlated with chromosome length in both I. podalirius (Pearson’s ρ = -0.827, p = 3.18e-9) and I. feisthamelii (Pearson’s ρ = -0.856, p = 3.19e-9) and is lower towards the centre of chromosomes (S4C Fig and S5 Fig). We therefore expect both long chromosomes and chromosome centres to be enriched for barriers to gene flow. In line with these predictions, we find that barriers are correlated with recombination rate variation between and within chromosomes. First, both the proportion of barrier sequence (Pearson’s ρ = 0.43, p = 0.0241, S4A Fig) and the average m_e (Pearson’s ρ = -0.55, p = 0.00195) are correlated with chromosome length. Second, we observe that, on average, barrier regions are closer to the centre of chromosomes than non-barrier regions (circular bootstrap, p = 0.001, S4C Fig). The number of barrier regions varies widely between chromosomes: e.g. chromosomes 29 and 31 harbour zero and one barrier region, respectively. In contrast, chromosomes 19 and 17 have 53 (29.3% of windows) and 49 (29.5% of windows) barrier regions, respectively (S6 Fig).

Genome polarisation reveals complex hybrids

Chromosome-painting of individual genomes via diagnostic markers provides a direct way to visualize the mosaic ancestry of HZ samples [78–80]. While many studies have relied on assigning hybrid genotypes based on a reference panel assumed to reflect ‘pure’ ancestry, this assumption is both unnecessary and biases inference against introgression [81]. We labeled the ancestry of alleles at all non-singleton SNPs with respect to the focal (i.e. between species) barrier using the EM algorithm implemented in diem [68]. This approach requires no a priori defined sets of reference individuals or candidate variants and assigns genotypes of 0/0, 1/1 or 0/1 at each variant position, which correspond to homozygous for each species or heterozygous by source, respectively. Note that although all samples were included in the polarisation, we focus our analysis of recent introgression on the “HZ set” of six individuals (Fig 1B).

Genome polarisation shows that the HZ individuals (Fig 1B) have an alternating pattern of podalirius/feisthamelii ancestry along all autosomes (Fig 3), which can only be generated by hybridisation after divergence. We find substantial heterogeneity in the degree of hybridisation, as quantified by the hybrid index (HI, defined as the average across genotype assignments of an individual), both between individuals and chromosomes (Fig 4). The two most intermediate HZ samples, 1325 (HI = 0.712) and 1322 (HI = 0.420), show large stretches of all three possible ancestries, a pattern that reflects a complex history of hybridisation: the expected HI of a simple n-th generation backcross hybrid is (1/2)^n with homozygous tracts for one species only. While we expect substantial variation around this expectation, successive generations of simple backcrosses move the expected HI towards the parental values of 0 or 1 along the sides of the ternary plot of HI against heterozygosity (H). In other words, interspecific H (the proportion of genotypes with an allele from each side of the barrier) remains maximal. Thus, the relationship between HI and interspecific H shows that samples 1325 and 1330 are complex backcrosses (i.e. their ancestry is incompatible with successive backcrosses with pure parental individuals), and suggests that in both cases introgression is towards I. podalirius (Fig 4A). Interestingly, all HZ samples except 1322 show a deficit in interspecific heterozygosity relative to Hardy-Weinberg equilibrium (Fig 4). While these ternary diagrams show genotype samples from genomes rather than from loci, Hardy-Weinberg equilibrium remains the null expectation for localities at equilibrium for an admixture process uniform along the genome. Thus, the deficit in heterozygosity suggests local substructure/inbreeding in the HZ and/or heterogeneous selection against heterozygous (by source) ancestry, which may be driven by partially dominant incompatibilities.

Strikingly, we find that the six HZ individuals (which include two male samples) exhibit only one Z chromosome of mixed ancestry (1325, a female), which suggests that the Z is a substantially greater barrier than autosomes of equivalent length, as predicted by numerous theoretical models [82–85].
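The reference lines of the ternary plot discussed above can be written down in a short sketch. It assumes the idealised case in which an n-th generation simple backcross (n = 1 being the F1) carries only genotypes that are homozygous for the recurrent parent or heterozygous by source, so that H = 2 HI; function names are illustrative.

```python
def simple_backcross_expectation(n):
    """Expected (HI, H) for an n-th generation simple backcross; n = 1 is the F1.

    HI is measured from the recurrent parent (coded 0). Every introgressed
    tract is heterozygous by source, so H is twice the hybrid index.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    hi = 0.5 ** n
    return hi, 2 * hi


def max_heterozygosity(hi):
    """Upper edge of the ternary plot: the largest H possible for a given HI."""
    return 2 * min(hi, 1 - hi)


def hardy_weinberg_heterozygosity(hi):
    """Interspecific heterozygosity expected under Hardy-Weinberg equilibrium."""
    return 2 * hi * (1 - hi)
```

A sample whose H falls well below `max_heterozygosity(hi)` cannot be explained by successive backcrosses to pure parents, and a sample below `hardy_weinberg_heterozygosity(hi)` shows the heterozygosity deficit described in the text.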

Fig 3. Circular representation of the location of barriers identified using gIMble and the polarity of diagnostic markers for each sample across linkage groups.

The inner ring shows the location of each chromosome in alternating grey and white. Moving outwards, the next ring indicates the location of barriers to gene flow (in red) and non-barrier/migrating regions (in black). The remaining rings show the genotype of each sample at each diagnostic marker. Teal bars are diagnostic of I. podalirius, purple bars are diagnostic of I. feisthamelii, and yellow bars are heterozygous for markers diagnostic of each species. The outermost ring shows the density of high DI sites. The location of each megabase of sequence for each chromosome is indicated on the outside of the circle.

Fig 4. Hybrid index (HI) versus interspecific heterozygosity (H) for samples collected from the Iphiclides hybrid zone (HZ).

(A) Mean values for each HZ sample. (B) Mean values for each chromosome including all samples from the HZ. (C) Mean values for each chromosome excluding the three I. feisthamelii-like samples (1303, 1306 and 1308, HI  ≈  0). The dashed line indicates the expectation under Hardy-Weinberg equilibrium.

Long-term barriers to gene flow are associated with regions of low hybridity

Setting up a comparison between long-term barriers to gene flow and introgression across the Iphiclides HZ requires a measure of barrier strength that extracts information contained in the small sample of HZ individuals. Classic theory for hybrid zone barriers is couched in terms of cline parameters which cannot be meaningfully estimated from a sample of six individuals. Nevertheless, it is clear that chromosome paintings even for a small sample of individuals contain a wealth of information about recent introgression. For example, assuming that our sample of six HZ individuals is centred, a genomic region without any gene flow (i.e. a complete short-term barrier region) is expected to be painted entirely homozygous for I. podalirius and I. feisthamelii ancestry for three individuals on either side of the centre (clines would be maximally narrow, stepped, and steep). To measure the local strength of barriers in a small set of polarized hybrid genomes, we estimated D̄ [86], the multi-site mean pairwise LD, in windows defined in the gIMble analysis. This is numerically equivalent to the variance in HI and captures the strong LD seen along stretches of co-introgressing variation: when minimal for polarised data (D̄ = 0), states of sites are uncorrelated, and when maximal (D̄ = 0.25), each sample reflects pure ancestry.
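The stated equivalence between D̄ and the variance in HI suggests a simple per-window calculation, sketched below. The sketch assumes polarised genotypes coded as 0, 1 or 2 copies of the allele assigned to one species and uses the population variance (dividing by the number of samples); the exact estimator in [86] may differ, and the function names are hypothetical.

```python
def window_hybrid_index(genotypes):
    """Hybrid index of one sample within a window.

    genotypes: polarised genotypes coded 0, 1 or 2 (assumed coding).
    """
    if not genotypes:
        raise ValueError("window contains no markers")
    return sum(genotypes) / (2 * len(genotypes))


def hybrid_index_variance(genotypes_by_sample):
    """Population variance of the per-sample hybrid index within a window."""
    indices = [window_hybrid_index(g) for g in genotypes_by_sample]
    mean = sum(indices) / len(indices)
    return sum((hi - mean) ** 2 for hi in indices) / len(indices)


# Complete barrier: three samples purely of each ancestry gives the maximum.
complete_barrier = hybrid_index_variance([[0, 0, 0, 0]] * 3 + [[2, 2, 2, 2]] * 3)
```

Under this coding the complete-barrier example reaches the maximum of 0.25 quoted above, whereas samples that all share the same hybrid index in a window give 0.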

We find that, at the scale of gIMble windows (≈100 kb), D̄ is negatively correlated with m_e,i (Pearson’s ρ = -0.57, p = 0.0). To test whether short-term barrier strength (as measured by D̄) is greater within long-term barrier windows than expected by chance, we conducted two different circularised bootstrap schemes to obtain null distributions for D̄: either circularising each chromosome (accounting for differences in the number of barriers between chromosomes) or circularising the whole genome (not accounting for chromosome effects). We find that the mean D̄ for barrier windows is significantly greater than either of these resampled distributions (circular bootstrap, p < 0.001, Fig 5B). This suggests ongoing selection against hybrid ancestry within the HZ at long-term barrier loci (Fig 5A). While the upward shift in the resampled distributions when accounting for chromosome of origin suggests that the between-chromosome variation in recombination rate does contribute to the overlap between long and short-term barriers, this overlap cannot be explained by the fact that both D̄ (Pearson’s ρ = 0.74, p = 4.84e-06) and barrier density are correlated with chromosome length (S4A Fig and S4B Fig).

Fig 5. The distributions of D̄, the number of unique ancestry junctions, and the number of strongly diagnostic sites (A/C/E) across gIMble barrier windows (grey) and non-barrier/migrating windows (yellow) and their corresponding bootstrap results (B/D/F).

The latter two metrics have been corrected for window span. Both D̄ calculated per window (A and B, circular bootstrap, p < 0.001) and the number of highly diagnostic markers (E and F, circular bootstrap, p < 0.001) are greater within long-term barriers than in non-barrier windows. The number of unique ancestry junctions is lower within long-term barriers than in non-barrier windows (C and D, circular bootstrap, p < 0.001). Null distributions of D̄ were generated using four different resampling schemes (B). In each instance, random values of D̄ were drawn without replacement from the empirical distribution of window-wise D̄ to generate datasets corresponding to the number of barrier windows. We repeat each resampling 1,000 times and compare the distribution of mean D̄ to the observed value. Firstly, we sample datasets from the entire genome with (green) and without (black) circularising (see Methods). Secondly, we resample per chromosome, accounting for differences in the number of outliers between chromosomes, also with (blue) and without (grey) circularising. For the number of junctions (D) and the number of strongly diagnostic sites (F), we only show the most conservative test, the circular bootstrap accounting for chromosome of origin.

While D¯ is an obvious measure of short-term barrier strength that is straightforward to compute from a set of diem polarized genomes, it is important to consider alternatives. A potential measure of the short-term barrier effects in a set of hybrid samples is the density of unique ancestry junctions. The number of unique ancestry junctions in a genomic region is negatively correlated with barrier strength [87], as selection against heterozygous ancestry limits the decay of admixture tracts generated by successive introgression events. Thus, we expect barrier regions to be populated by fewer and larger blocks of co-ancestry, and necessarily, fewer junctions than migrating regions. Consistent with this prediction, we find that barrier regions contain almost half the number of ancestry junctions expected by chance (circular bootstrap, p<0.001, Fig 5C and 5D).

However, both measures of short-term barrier strength rely on markers with high diagnostic indices which occur at higher density in gIMble barrier windows compared to random windows (circular bootstrap, p<0.001, Fig 5E and 5F). Given that the local density of high DI markers must be a result of barriers acting over a range of timescales, it is crucial to test whether the overlap of long and short-term barriers we find (in the form of an excess of D¯ or reduction of junctions in gIMble barriers) simply reflects the density of high DI markers. To assess the robustness of our findings to heterogeneity in high DI marker density, we repeated our comparison between barrier and migrating windows (in terms of D¯ and junction density) for a dataset in which high DI markers were downsampled to be uniformly distributed (see methods). We find that our core result of reduced introgression across the six HZ samples in long-term barriers holds equally when we remove the heterogeneity of high DI markers (S8 Fig and S9 Fig). Our results are similarly robust to the exclusion of individual HZ samples (circular bootstrap, p<0.001 in all six instances for each metric).

Fig 6. Distribution of sizes of (purple points) I. feisthamelii tracts introgressing into I. podalirius, and (teal points) I. podalirius tracts into I. feisthamelii, on a log-log scale.

Fig 6

Note, as most introgression is heterozygous, the introgressing tracts largely correspond to the yellow tracts in Fig 3. These are compared to theoretical predictions (solid lines) for exchange between two infinite demes (S1 Appendix A-5). Introgression into I. podalirius is plotted for [M, S, R, T= 15, 0.11, 0.25, 150]. Introgression into I. feisthamelii is plotted for [M, S, R, T = 1.5, 0.0, 0.25, 275]. The difference in gradients on the right indicates stronger coupling (S ∕ R) of the I. podalirius background despite more migrants M per generation. The distribution of small blocks towards the left does not match theory (see Results), making the time-since-contact estimates T lower bounds only.

Modelling the history of introgressed blocks in complex hybrids

We have so far interrogated the data for the six HZ individuals in terms of the variation in the strength of barriers acting since the most recent secondary contact, both between and within chromosomes. However, theoretical models of multilocus barriers consider the aggregate effects of many loci under selection. This creates a genome-wide barrier that (assuming a model of uniformly distributed selection targets) can be captured by the ratio between the selection pressure and the recombination rate, a.k.a. the coupling coefficient θ = S ∕ R [88,89]. Thus an obvious question is: what strength of multilocus selection is the mosaic ancestry of the six HZ individuals compatible with? To address this, we considered the length distribution of introgression tracts. If we make the simplifying assumption that introgression across the HZ into either species started at time T, occurs at a constant rate m, and only involves backcrosses (i.e. the recipient population is infinitely large), the lengths of admixture tracts depend only on the coupling coefficient θ [88,89]. More precisely, the equilibrium solution for the gradient of the admixture tract length distribution on a log-log scale is -(3 + θ)/(1 + θ) (see Fig 6, S1 Appendix). Thus, to learn about the direction and dynamics of recent gene flow across the HZ, we fit the distribution of admixture tract length of colinear, introgressed ancestry to the analytic expectation under this model [89]. In the absence of a recombination map, we measured the length of admixture tracts (x) relative to chromosome length, i.e. we assumed that each autosome has a map length of 25 cM, which corresponds to an average of one cross-over event per male meiosis (female butterflies are achiasmatic). To avoid block lengths being fragmented by rare errors and gene conversion events, we kernel smoothed diem labeling along chromosomes at a scale of 10⁻⁴ × chromosome length, which corresponds to >1kb for most autosomes.
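The relationship between coupling and the log-log gradient can be made concrete with a minimal sketch. It assumes the sign convention implied by the neutral expectation reported for long tracts (a gradient of -3 at θ = 0), i.e. gradient = -(3 + θ)/(1 + θ). The function names are illustrative and are not taken from any published software; the S and R values are those quoted in the Fig 6 caption.

```python
def coupling(S, R):
    """Coupling coefficient theta = S / R (selection over recombination)."""
    return S / R


def loglog_gradient(theta):
    """Assumed equilibrium gradient of the tract length distribution
    on a log-log scale: -(3 + theta) / (1 + theta)."""
    return -(3.0 + theta) / (1.0 + theta)


def theta_from_gradient(g):
    """Invert the assumed relationship; meaningful for -3 <= g < -1."""
    return -(g + 3.0) / (g + 1.0)


theta = coupling(0.11, 0.25)          # S and R as quoted in the Fig 6 caption
neutral = loglog_gradient(0.0)        # neutral expectation
coupled = loglog_gradient(theta)      # shallower than the neutral gradient
recovered = theta_from_gradient(coupled)
```

Under this convention stronger coupling gives a shallower gradient, which is the pattern expected if selection slows the break-up of long introgressed blocks.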

Considering long tracts (l>0.04), neutrality provides a good fit for the distribution of tracts of I. feisthamelii ancestry in I. podalirius (i.e. a gradient of -3). In contrast, the size distribution of I. podalirius admixture tracts in I. feisthamelii is best explained by θ = 0.44 (Fig 6). This asymmetry mirrors the asymmetry of long-term gene flow under the IM history inferred by gIMble and suggests that gene flow has been stable for much of the last 150 - 275 generations (75 - 140 years). In contrast, over the same timescale I. podalirius admixture into I. feisthamelii has been strongly selected against. Interestingly, we find that the admixture tract length distribution in neither direction involves a simple asymptote for short tracts as predicted for a secondary contact initiated at time T with constant m. This may be due to several factors: first, gene flow in the more distant past may have been genuinely lower, which (given the dependence of Iphiclides on orchards and other anthropogenic habitats) may reflect changes in landscape use. Secondly, our simplistic measure of x as a proportion of chromosome length is inadequate for short blocks, given that recombination is known to vary substantially along butterfly chromosomes [76]. Thirdly, our model is unrealistically simplistic in that it ignores space: it assumes introgression into an infinitely large panmictic population. In a spatially continuous population, we expect shorter tracts with increasing distance from the HZ, which is indeed what we observe in Iphiclides. Fourthly, violating our infinite population size assumption, hybrids in finite populations can interact, producing complex admixture tracts. This may inflate the distribution of small tracts in both admixture directions and explain the asymmetric deviation from expectation we see between fits. Finally, it is also plausible that the polygenic model of barrier architecture assumed by Baird [89] breaks down over short physical scales, because it does not include the possibility that increasingly short blocks may have increasingly variable fitness effects (including the possibility of adaptive introgression).

Discussion

Much of the recent research on the genomics of speciation has focused either on fitting demographic models to estimate migration over timescales of Ne generations [92-95] or on investigating barriers to recent introgression (10s or 100s of generations) in studies of HZs (e.g. [29,52,96,97]), natural hybrid populations [70] and laboratory crosses (e.g. [98,99]). However, there are surprisingly few attempts to understand how barriers to gene flow over different timescales are related. Here we have fitted explicit demographic models of speciation to infer the heterogeneity in long-term effective migration between two sister species of Iphiclides butterflies. We intersect this inference with signals of recent, heterogeneous introgression across a HZ estimated from a small set of HZ individuals using genome polarisation.

Evidence for both long-term and recent gene flow between scarce swallowtail sister species

We find that the long-term demographic history of I. podalirius and I. feisthamelii is well approximated by an IM model with a low rate of unidirectional migration (M=2Nem=0.046) from I. feisthamelii into I. podalirius. Thus our analysis adds to the growing number of young species pairs for which a signal of long-term and ongoing migration has been identified, including great apes [100], butterflies [8,101,102], mollusks [103], angiosperms [104], birds [105], Drosophila [106], and many more.

However, we note that the global effective migration rate (me) we infer in Iphiclides is considerably lower than estimates obtained for sister species pairs of Heliconius [67] and Brenthis [102] butterflies, both of which are of comparable age (S7 Fig). Although this low background level of genome-wide migration reduces the power to identify individual barrier regions, it still allows quantification of the variation of me and its correlates along the genome. Furthermore, we find evidence for ongoing introgression between the two Iphiclides species across a HZ in the form of complex hybrids. Relating the distribution of admixture tract lengths to analytic expectations under a model of secondary contact [89] suggests that recent gene flow into I. podalirius is neutral, while gene flow into I. feisthamelii is strongly selected against. This asymmetry is concordant with the inferred direction of long-term gene flow. It is also consistent with genetic load arguments, which predict stronger selection against admixture tracts derived from the taxon with lower Ne [107], i.e. I. podalirius in this case.

More generally, it is encouraging how much information about both recent and long-term introgression is contained in a small sample of individual genomes. Given that most species pairs do not have HZs, such long-term barrier inference is the only genomic information available about their speciation history. Even when HZs exist, they may not be amenable to clinal analyses due to sampling constraints, as is the case for Iphiclides. However, our analyses demonstrate that the genome-wide barrier, which is described by the coupling coefficient, and the recent history of gene flow can be estimated from a handful of HZ individuals. It is perhaps surprising how well the admixture tract lengths we observe in Iphiclides fit analytic expectations under the simplest possible model of secondary contact, which assumes panmixia, infinite population size and weak selection [89]. The fact that the observed length distribution deviates from this expectation for short tracts suggests that small samples of hybrid genomes contain additional information about admixture. In particular, it would be interesting to fit models of more realistic barrier architectures that assume a large but finite number of loci. Furthermore, classic theory on Fisher junctions for HZs only considers the decay of admixture tracts due to crossover events [88]. However, the internal decay of admixture tracts due to gene conversion events has so far been ignored and overlays the effects of an additional clock.

The ability to estimate the genomic distribution of block sizes (Fig 6) and relate it to theory [89] developed long before genomic data were available is thrilling. The distribution of large blocks is expected to reach equilibrium quickly, and appears log-log linear, suggesting relatively strong coupling of 0.44 at least in one direction. Neither direction reaches the ‘tipping point’ coupling of S ∕ R = 1 [88], where multilocus clines ‘congeal’; however, that sharp threshold at equilibrium is misleading. Because small blocks equilibrate very slowly, it would take longer than an interglacial period for a sharp distinction in secondary contact outcome to be perceivable [89] (Fig 2).

Long and short-term barriers to gene flow

The fact that barriers, inferred over these two time-scales of evolution, overlap shows that a significant subset of barriers is persistent and acts at very different points of the speciation continuum. This suggests that regions of the genome that maintain reproductive isolation between species in the long-term are also relevant early on in species divergence. Indeed, in Heliconius butterflies it has been shown that the same wing pattern genes maintain species differences both across HZs [108] and in deep time [67]. However, in contrast, the congruence in barrier landscapes across timescales we find in Iphiclides is not restricted to a small number of large effect genes, but rather a genome-wide phenomenon, suggesting a polygenic barrier architecture.

We would argue that the internal comparison of long and short-term barrier landscapes we have conducted here is a more promising avenue for testing models of speciation than comparisons/contrasts of species pairs at different stages, which invariably differ in speciation history and barrier architecture [17,109,110]. Thus, an important direction for future work is to develop quantitative predictions for the temporal change in barrier landscapes. Under an allopatric null model, which may apply to Iphiclides during glacial periods, pairwise intrinsic incompatibilities are assumed to accumulate at random positions in the genome and genome-wide coupling is expected to increase quadratically with time [111]. Under this model, we expect a limited amount of temporal barrier overlap which arises from the fact that barrier loci that establish early have a greater effect on long-term me than later barriers. In contrast, verbal models of ‘divergence hitchhiking’ assume that early barriers expand locally [4,112], and so may predict a greater degree of temporal overlap. While theoretical models show that locally expanding barriers are possible, given sufficiently strong selection and linkage [113], there is so far not much empirical evidence that locally expanding ‘islands of speciation’ are a common feature of speciation.

The architecture of barrier loci

We found substantial variation both in the size of barrier regions and in the proportion of barrier sequence per chromosome. Specifically, barriers to gene flow aggregate on large chromosomes and towards chromosome centres. Both are regions of the genome where recombination rates are reduced. Thus, it appears on a broad scale that the architecture of reproductive isolation between Iphiclides species is strongly linked to recombination rate heterogeneity, as would be expected from barrier loci that individually confer small effects. This is consistent with previous research on Heliconius butterflies, which demonstrated that barriers to introgression are concentrated in regions of low recombination [67,76], and supports a polygenic barrier architecture. This architecture was originally proposed as a null model of reproductive isolation [114] and evidence for polygenic barriers to gene flow has accumulated across a range of taxa [98,115-117].

We investigated whether barrier regions are associated with particular genomic features and failed to find enrichment of repetitive elements or, more surprisingly, of coding sequence, as one might expect if reproductive isolation is driven by selection on genes. Our circular resampling procedure controls for differences in gene and repeat density between chromosomes, so it is clear that the reduced gene density for barrier regions (circular bootstrap, p<0.001, S3 Fig) is not simply a consequence of the negative correlation between gene density and chromosome length and the fact that barrier density is higher for long chromosomes. However, it may well be that the effect of intra-chromosomal variation in recombination rate vastly outweighs the impact of coding sequence density.

Limitations

To infer barriers to gene flow conservatively, we used the strictest possible threshold (me,i=0) and quantified the false positive rate in a parametric bootstrap (see methods). Whilst this potentially excludes actual barriers with me,i>0 at putatively neutral flanking regions (which are the basis/input of our inference), it minimises the false positive rate.

Numerous factors may contribute to biases in our results (see [35], and [67] for gIMble limitations). Given that direct estimates of recombination are not available for Iphiclides, we cannot directly quantify the degree to which recombination rate heterogeneity contributes to migration rate variation, nor account for the fact that uncertainty in estimates of both short and long-term barriers depends on the local recombination rate.

There may be scope to improve the power to detect weak gene flow through modelling more detailed demographic scenarios. Firstly, whilst it is very likely that the Iphiclides pair has undergone repeated rounds of separation and gene flow, we have fit a much simpler model that assumes a single continuous rate of migration. Secondly, we have modelled gene flow as unidirectional and inferred the most likely direction by comparing models. In reality, gene flow between these species is likely bidirectional but asymmetric, as estimates in other systems suggest [118-120]. Expanding the gIMble framework to include isolation with initial migration (IIM) and bi-directional gene flow may improve our power to model the build-up of genomic barriers [67,121].

The diem genome polarisation algorithm is designed to work with even small amounts of low quality data. There should therefore be few power limitations working with high quality genome scale data, and indeed estimates of hybrid indices, interspecific heterozygosity, and visualisation of admixture all have high precision [68]. Estimating admixture tract size distributions requires a further level of precision, however: a single ancestry state error in a long block will on average halve its estimated length. An error due to a miscalled variant would affect one tract, but an error due to a mispolarised site could affect many. To avoid such error-driven tract fragmentation, we filter polarised sites for high diagnostic index (strongly correlated with the support for correct polarity) and introduce an arbitrarily chosen scale of kernel smoothing of state along chromosomes. This has an advantage over HMM approaches in that it has no starting point chirality, but the disadvantage that it censors true small tract signal with some distribution of false negatives. This, if anything, makes the excess of small tract observations over simple model expectations more surprising (Fig 6).

Finally, one of the most striking results of our analysis of HZ is the scarcity of introgressed ancestry tracts on the Z chromosome (Fig 3). While we have not quantified the density of gIMble barriers on the Z (because such an analysis would be limited to male samples), it will be fascinating to investigate the long-term evolution of Z linked barriers in Iphiclides in the future.

Conclusion

We have demonstrated an association between long-term and short-term reproductive isolation in a species pair of swallowtail butterflies. Despite being several million generations old, gene flow persists between these species and is currently concentrated at a HZ. The considerable number, varying size, and location of barrier regions suggest that the genomic architecture of speciation is polygenic, or multilocus sensu Barton [88]. We find that variation in me correlates with proxies of recombination rate variation both between and within chromosomes and so is likely shaped by recombination rate heterogeneity, a pattern previously observed in Heliconius butterflies [67,76]. However, in the absence of a directly estimated recombination map, it is impossible to know to what extent the overlap between long and short-term barriers reflects a shared/stable recombination landscape rather than shared selective targets. Unlike genomic cline methods [122], our approach maximises information about barriers to gene flow contained in small samples over different time scales. This paves the way for future quantitative analyses of the temporal evolution of the genomic landscape of species barriers. While most speciation processes are far too slow to observe directly, genomic variation clearly contains information about the interplay of forces acting at different stages of the speciation continuum.

Methods

Ethics statement

Field sampling of butterflies was conducted in compliance with the School of Biological Sciences Ethics Committee at the University of Edinburgh and the European Research Council ethics review procedure. Permissions for field sampling were obtained from the Generalitat de Catalunya (SF/639), the Gobierno de Aragon (INAGA/500201/24/2018/0614 to Karl Wotton) and the Gobierno del Principado de Asturias (014252).

Sampling and sequencing

In total, we generated WGS data (150 base paired-end reads) for 20 individuals (S1 Table). Field sampling was conducted in 2017 and 2018 at several locations across Southern and Central Europe (Spain, France, Romania, and Hungary). Samples were hand-netted in the field, flash-frozen from live in a liquid nitrogen dry shipper (Voyageur 12) and stored at -70 °C. Wings were retained for identification. DNA extractions for all individuals were performed using a Qiagen DNeasy Blood & Tissue kit. Extractions were used to prepare TruSeq Nano gel-free libraries by Edinburgh Genomics which were sequenced on a NovaSeq 6000 or HiSeq X (S1 Table). Raw reads are deposited at the ENA (accession number PRJEB76171).

QC, read mapping, variant calling, and summaries

Reads were trimmed and checked for quality using FastQC v0.11.8 [123] both before and after trimming with FastP v0.20.0 [124], using MultiQC v1.7 [125] to visualise the results. Trimmed reads were aligned to the I. podalirius reference assembly [126] using bwa-mem v0.7.17 [127]. We marked duplicates using sambamba v0.6.6 [128]. Variants were called using freebayes v1.3 [129] (-k -w -j -T 0.02 -E -1 --report-genotype-likelihood-max). A principal component analysis (PCA) was performed on intergenic autosomal variants using plink v1.9 [69]. Genetic diversity (π), mean individual heterozygosity (H) and divergence (dXY) were estimated at intergenic sites using the gIMble ‘gimbleprep’ module [67].

Fitting models of speciation across the genome using the bSFS

To infer the likely speciation history of this species pair and to estimate migration rates along the genome we used the software package gIMble [67]. This analysis was restricted to the non-HZ set of individuals. As the inference assumes that each sample contributes diploid genotypes, analysis was restricted to the autosomes to allow the inclusion of both male and female data. Demographic models fit by gIMble assume a neutral model of evolution, and so we focused analysis on intergenic sequence (39% of the assembly), i.e. genic and repeat-rich regions and contigs that were not scaffolded into chromosomes were excluded. We filtered genotype (GT) calls to a minimum depth of eight reads per sample and a maximum depth of three times the mean coverage per sample. Additionally, GT calls were required to have a minimum PHRED quality of one and SNPs within 2 bases of indels were removed. We used the gIMble ‘parse’ module to quantify genetic diversity and divergence for the filtered subset of the data. We summarised variation in pair-blocks of 64 bases and tallied all blockwise site frequency configurations of mutations across heterospecific sample pairs (see [67] for details). We then fitted ‘strict divergence’ (DIV) and ‘isolation with migration’ (IM) demographic models to the genome-wide blockwise site frequency spectrum using the gIMble ‘optimize’ module.

To quantify heterogeneity in me and Ne, we fit a grid of parameters to sliding windows of a fixed length of 28,125 pair-blocks with a 20% overlap along the genome. This results in a total of 14,173 windows with a minimum span of 50 kb. The grid discretised Ne from 100,000 to 1,500,000 (in increments of 100,000) for I. feisthamelii, from 100,000 to 900,000 (in increments of 50,000) for I. podalirius, and from 100,000 to 2,400,000 (in increments of 100,000) for the ancestral population; me (forwards in time from I. feisthamelii into I. podalirius) was discretised from 0 to 4e-7, in increments of 5e-9. The split time T was fixed to the global result of 2,181,320 generations (Table 2) and the mutation rate was set to μ = 2.9 × 10⁻⁹ per base and generation [130] throughout. We label windows as barriers to gene flow where the marginal support for a DIV model (i.e. me = 0) is greater than the support for an IM model parameterised with the most likely inferred effective migration rate under a null model of me variation (i.e. ΔB,0>0).
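To give a sense of the size of this parameter grid, the following sketch enumerates the four discretised axes and multiplies their lengths. It also checks the stated minimum window span under two assumptions that are ours rather than stated in the text: that pair-blocks are tallied across all 6 × 6 = 36 heterospecific sample pairs (six individuals per species, as in the simulations), and that a window reaches its minimum span when every pair contributes a block at every position. All variable names are illustrative.

```python
import math


def inclusive_range(start, stop, step):
    """Evenly spaced values from start to stop, both ends included."""
    n_steps = round((stop - start) / step)
    return [start + i * step for i in range(n_steps + 1)]


ne_feisthamelii = inclusive_range(100_000, 1_500_000, 100_000)
ne_podalirius = inclusive_range(100_000, 900_000, 50_000)
ne_ancestral = inclusive_range(100_000, 2_400_000, 100_000)
m_e = inclusive_range(0.0, 4e-7, 5e-9)

axes = [ne_feisthamelii, ne_podalirius, ne_ancestral, m_e]
grid_size = math.prod(len(axis) for axis in axes)  # parameter combinations

# Minimum window span, ASSUMING 36 heterospecific pairs all contribute
# a 64-base block at every position within the window.
pair_blocks_per_window = 28_125
heterospecific_pairs = 6 * 6
block_length = 64
min_span_bases = pair_blocks_per_window / heterospecific_pairs * block_length
```

Under these assumptions the arithmetic reproduces the 50 kb minimum span quoted above, and the grid contains just under half a million parameter combinations per window.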

Simulations and bootstrapping

We used the following bootstrapping approach to test the global support for an IM model over a DIV model: we performed 100 simulations using msprime [131] (via the gIMble ‘simulate’ module) parameterised by the DIV history which best fit the empirical data. We simulated 14,173 windows of the same size as the real data, i.e. 28,125 (64 base) pair-blocks, for six diploid individuals per population to match the empirical sample set. We simulated a discrete genome and an infinite allele mutation model. We assumed a constant per base crossover rate based on one crossover per male meiosis per chromosome (i.e. each chromosome has a sex-averaged map length of 25 cM) and a mutation rate of 2.9 × 10⁻⁹ per base and generation, as estimated for Heliconius butterflies [130]. We fit a DIV model and an IM model to each simulation and compared the null distribution of relative fit (ln CL) to the empirical ln CL.

To adjust our barrier definition for false positives (which are expected at an appreciable rate when the background me is low), we ran a parametric bootstrap on the local estimates of me: for each window, we simulated 100 replicates under the best fitting local background history using msprime [131] (via the gIMble ‘simulate’ module) and obtained a null distribution of ΔB,0. These simulations were analogous to the bootstrap for the global model, except that we allowed Ne parameters to vary between windows. We simulated under a null model with a globally fixed me, but accounted for variation in Ne by assuming the best composite log-likelihood Ne parameter inferred for each window. Only windows for which ΔB,0 in the real data was greater than the largest value contained in the ΔB,0 distribution from simulated data were labelled as barriers. This ensures a false positive rate <0.01.
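The labelling rule described above can be written as a short sketch. It is not the gIMble implementation: the function name and the toy ΔB,0 values are hypothetical, and the code only encodes the decision rule (positive support for DIV that also exceeds every simulated replicate).

```python
def is_barrier(observed_delta, simulated_deltas):
    """Return True if a window passes the conservative barrier rule.

    observed_delta   -- support for DIV over IM in the real data
    simulated_deltas -- the same statistic from replicates simulated
                        under the local null history (100 in the text)
    """
    if not simulated_deltas:
        raise ValueError("need at least one simulated replicate")
    return observed_delta > 0 and observed_delta > max(simulated_deltas)


# Hypothetical toy values, for illustration only.
null_deltas = [-1.2, 0.4, 2.1, 0.9]
strong_window = is_barrier(3.5, null_deltas)   # exceeds every replicate
weak_window = is_barrier(1.0, null_deltas)     # exceeded by one replicate
```

Because a window must beat the largest of 100 null replicates, a null window is expected to pass less than once in a hundred trials, which is where the bound on the false positive rate comes from.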

To test whether various metrics are associated with barrier windows we used a simple resampling approach. In all instances, random empirical estimates of each per-window metric were sampled without replacement to generate datasets of sizes relevant to the scope of the resampling scheme. We compared the empirical estimate to the distribution of means of 1000 resampled data sets. Unless otherwise indicated, we resampled datasets with respect to each chromosome in a circularised fashion (similar to the shift-permutation approach outlined by Yassin et al. [71] and Nouhaud et al. [72]), by shifting the assignments of barrier windows within a chromosome irrespective of chromosome ends by a random integer. This approach accounts for between-chromosome variation in recombination rates and generates null distributions that have the same clustering as barriers inferred in the data. Finally, we repeated each circular bootstrap after excluding each of the HZ samples in turn to check for robustness to sub-sampling. For each of these tests we consider an adjusted significance threshold of p = 0.05/6 ≈ 0.008. The bootstrapping was implemented using a custom script available at https://github.com/LohseLab/circular_bootstrap.
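The per-chromosome circular resampling can be illustrated with a minimal sketch. This is not the script from the repository cited above; function names, the toy data and the one-sided p-value with a +1 correction are our assumptions. Barrier labels are rotated along each chromosome by a random integer, wrapping around chromosome ends, so that the number and clustering of barrier windows per chromosome are preserved.

```python
import random


def circular_null_means(values_by_chrom, barrier_by_chrom,
                        n_resamples=1000, seed=1):
    """Null distribution of the mean metric over (shifted) barrier windows."""
    rng = random.Random(seed)
    n_barriers = sum(sum(flags) for flags in barrier_by_chrom.values())
    if n_barriers == 0:
        raise ValueError("need at least one barrier window")
    null_means = []
    for _ in range(n_resamples):
        total = 0.0
        for chrom, values in values_by_chrom.items():
            flags = barrier_by_chrom[chrom]
            n = len(values)
            shift = rng.randrange(n)  # independent random offset per chromosome
            total += sum(values[(i + shift) % n]
                         for i, flag in enumerate(flags) if flag)
        null_means.append(total / n_barriers)
    return null_means


def observed_barrier_mean(values_by_chrom, barrier_by_chrom):
    """Mean of the metric over the windows actually labelled as barriers."""
    picked = [v for chrom, values in values_by_chrom.items()
              for v, flag in zip(values, barrier_by_chrom[chrom]) if flag]
    return sum(picked) / len(picked)


def one_sided_p(observed, null_means):
    """Share of resampled means >= observed (+1 correction, an assumption)."""
    return (1 + sum(m >= observed for m in null_means)) / (1 + len(null_means))


# Toy per-window metric on two hypothetical chromosomes.
values = {"chr1": [0.1, 0.2, 0.9, 0.8, 0.1, 0.2], "chr2": [0.3, 0.7, 0.2, 0.1]}
barriers = {"chr1": [False, False, True, True, False, False],
            "chr2": [False, True, False, False]}
observed = observed_barrier_mean(values, barriers)
null = circular_null_means(values, barriers)
p_value = one_sided_p(observed, null)
```

Rotating labels rather than drawing windows independently keeps adjacent barrier windows adjacent in every resample, which is what makes the test conservative when barriers are clustered.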

To test whether various measures of barrier strength are robust to heterogeneity in highly diagnostic marker density we employed a downsampling approach. SNPs were downsampled with over dispersion on the physical metric by choosing those with reference positions closest to multiples of ‘spacing’ 1100nt. Different spacing options were first trialed; 1100nt was the smallest spacing at which all chromosomes showed no significant deviation from uniformity after downsampling (Cramér-von Mises test, p=0.05). As downsampling cannot remove gaps, and gIMble windows do not occur in SNV gaps, for each chromosome the 5 largest gaps in SNV positions were excised for the purposes of this uniformity testing.
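One way to read the downsampling rule is sketched below: for every multiple of the spacing along a chromosome, keep the single SNP whose reference position is closest to it. The tie-breaking rule, the treatment of targets that fall in gaps, and the function name are our assumptions; the uniformity test and gap excision are not reproduced here.

```python
import bisect


def downsample_to_spacing(positions, spacing=1100):
    """Keep, for each multiple of `spacing`, the closest SNP position.

    Ties are broken in favour of the smaller position (an assumption).
    A SNP that is closest to several targets is kept only once.
    """
    pos = sorted(set(positions))
    if not pos:
        return []
    kept = set()
    last_target = (pos[-1] // spacing + 1) * spacing
    for target in range(0, last_target + 1, spacing):
        j = bisect.bisect_left(pos, target)       # first position >= target
        candidates = pos[max(j - 1, 0): j + 1]    # neighbours on either side
        kept.add(min(candidates, key=lambda p: (abs(p - target), p)))
    return sorted(kept)


# Toy positions (in nt) on a hypothetical chromosome.
snps = [10, 1090, 1105, 1500, 2195, 2300, 5000]
thinned = downsample_to_spacing(snps, spacing=1100)
```

In this toy case the dense cluster around 1100 nt is reduced to one marker, while the sparse region between 2300 and 5000 nt remains a gap, matching the observation that downsampling cannot remove gaps.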

Estimating hybrid indices and the ancestry of hybrid zone individuals

We polarized all variable sites (across all individuals) that were not singletons (which are uninformative for genome polarisation by association in state) and estimated genome-wide hybrid indices (HIs) using the diem framework implemented in both R and Mathematica [68]. Variants were subject to the same filtering criteria as the gIMble analysis. We calculated HIs per individual using sets of polarised sites with high diagnostic index (DI > -20). We used chromosome subsets of these sites to calculate HIs per chromosome. At a further reduction in scale we computed HI for each gIMble window and compared D¯ across all individuals between barrier and non-barrier gIMble windows using the bootstrap scheme described above.

Deconfounding signal from multiple barriers

diem is designed to polarise genomes with respect to one barrier at a time; however, a trial diem analysis of the full genomes dataset suggested two barriers were present in the data: the focal barrier to gene flow between ‘French’ I. podalirius and ‘Iberian’ I. feisthamelii, and a ‘nuisance’ barrier to gene flow between ‘Iberian’ and ‘African’ I. feisthamelii, presumably due to the Strait of Gibraltar. For the focal analysis, signal of the nuisance barrier was removed by censoring all sites with minority homozygous state only present in (the two) samples from north Africa.

Kernel smoothing of genotypes along chromosomes

diem estimates the polarity of each site independently. Where a barrier exists we may expect polarisation to reveal tracts of different inheritance along chromosomes. Downstream analyses of tract size distributions require estimates of the boundaries of these tracts. We estimate tract boundaries using kernel smoothing of the diploid state along chromosomes. Working on the physical (Mb) metric, the Laplace distribution truncated at its 95% probability boundaries provides the kernel. This is centered on each site to provide a weighted average for each state ‘smoothed’ over the chosen physical scale. The centre-site is then estimated to have the state with greatest smoothed weight. Where variable sites are dense on the physical metric, smoothed estimates will be influenced by many flanking sites; where sparse, by few. Imposing a given smoothing kernel scale assumes we are uninterested in state variation at higher frequency (involving shorter tracts). The kernel smoothing is therefore being used as a low-pass signal filter to leave larger tracts with clearer boundaries.
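The smoothing step can be sketched as follows. This is an illustration rather than the diem implementation: we assume that the smoothing scale corresponds to the Laplace scale parameter b, so that the central 95% of the kernel lies within b·ln(20) of the focal site, and the state labels (0, 1, 2) and function name are hypothetical.

```python
import math


def laplace_smooth_states(positions, states, scale):
    """Assign each site the state with the largest kernel-weighted support.

    positions -- site coordinates (same units as `scale`)
    states    -- state label per site (any hashable value)
    scale     -- ASSUMED to be the Laplace scale parameter b
    """
    cutoff = scale * math.log(20.0)  # central 95% of a Laplace(0, b) kernel
    smoothed = []
    for centre in positions:
        weights = {}
        for pos, state in zip(positions, states):
            distance = abs(pos - centre)
            if distance <= cutoff:
                weights[state] = weights.get(state, 0.0) + math.exp(-distance / scale)
        smoothed.append(max(weights, key=weights.get))
    return smoothed


# An isolated, probably erroneous, state inside a long tract is smoothed away...
isolated = laplace_smooth_states(list(range(7)), [0, 0, 0, 2, 0, 0, 0], scale=2.0)
# ...whereas the boundary between two long tracts is left in place.
two_tracts = laplace_smooth_states(list(range(10)), [0] * 5 + [1] * 5, scale=1.0)
```

The quadratic loop is kept for clarity; for genome-scale data one would restrict the inner loop to sites within the cutoff, for example with a sliding window over sorted positions.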

Untangling blocks from genotypes

When diploid genomes are polarised with respect to a single barrier, an interval of heterozygous sites within otherwise homozygous chromosome genotype suggests a single tract of minority inheritance on one of the chromosome copies. While the same pattern would result from two minority tracts, one on each chromosome copy, that ‘perfectly’ abut (without overlap), simple genotypes can be parsimoniously (and automatically) parsed into haplotype tracts by assuming such perfect abutment is so rare as to be negligible. This accounts for the vast majority of the data: for each such heterozygous interval, a block on one strand is counted. Occasional homozygous introgressed intervals are treated similarly: a block on one strand is counted. Finally (and rarest of all), intervals of mixed homozygous-introgressed and heterozygous state are treated similarly: a block on one strand, spanning the mixed interval, is counted. In this way small blocks which positionally overlap with larger introgressing blocks within the same diploid individual are censored from the block size distribution.
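The parsing rule above amounts to treating every maximal run of non-majority state as one block. A minimal sketch, assuming a hypothetical coding of smoothed states (0 = homozygous for the recipient background, 1 = heterozygous, 2 = homozygous introgressed) and taking block boundaries at the first and last non-majority site, which is our simplification:

```python
def introgressed_blocks(positions, states, majority=0):
    """Return (start, end) for each maximal run of non-majority state.

    Heterozygous, homozygous-introgressed and mixed intervals are all
    counted as a single block on one strand, as described in the text.
    """
    blocks = []
    start = end = None
    for pos, state in zip(positions, states):
        if state != majority:
            if start is None:
                start = pos          # a new block opens here
            end = pos                # extend the current block
        elif start is not None:
            blocks.append((start, end))
            start = end = None
    if start is not None:            # block running to the chromosome end
        blocks.append((start, end))
    return blocks


# Toy chromosome: a heterozygous tract, a mixed tract, and a terminal site.
toy_positions = [1, 2, 3, 4, 5, 6, 7, 8]
toy_states = [0, 1, 1, 0, 2, 1, 0, 1]
toy_blocks = introgressed_blocks(toy_positions, toy_states)
```

Dividing each block length by the chromosome length would then give the relative tract length used for the log-log distribution; missing data are not handled in this sketch.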

Supporting information

S1 Appendix

(PDF)

pgen.1011655.s001.pdf (164.9KB, pdf)
S1 Table

(CSV)

pgen.1011655.s002.csv (2.2KB, csv)
S1 Fig

(PDF)

pgen.1011655.s003.pdf (110.4KB, pdf)
S2 Fig

(PDF)

pgen.1011655.s004.pdf (564.4KB, pdf)
S3 Fig

(PDF)

pgen.1011655.s005.pdf (520.6KB, pdf)
S4 Fig

(PDF)

pgen.1011655.s006.pdf (645.6KB, pdf)
S5 Fig

(PDF)

pgen.1011655.s007.pdf (232.6KB, pdf)
S6 Fig

(PDF)

pgen.1011655.s008.pdf (671.1KB, pdf)
S7 Fig

(PDF)

pgen.1011655.s009.pdf (523.7KB, pdf)
S8 Fig

(PNG)

pgen.1011655.s010.png (65.3KB, png)
S9 Fig

(PNG)

pgen.1011655.s011.png (65.3KB, png)

Acknowledgments

We would like to thank Carla and Oskar Lohse for catching the two most informative hybrid samples: individuals 1322 and 1325. We thank Katy McDonald for help in the molecular lab, Alex Hayward for help with field collections and Edinburgh Genomics for generating libraries and sequence data. We are indebted to Vlad Dincă, Raluca Vodă and Leonardo Dapporto for contributing samples and to Nick Barton for helpful suggestions on the analyses of admixture tracts and Alex Mackintosh for insightful comments on an earlier version of this manuscript. We thank Richard Lewington for permission to reproduce his butterfly illustrations.

Data Availability

Read data is available from the ENA at PRJEB76171. Reads for sample IP 504 were generated by a previous study (doi: 10.1093/g3journal/jkac193) and are available at the ENA at PRJEB51340. Input data for plots and statistics is available from https://github.com/samebdon/iphiclides_speciation_data. The bootstrapping was implemented using a custom script available at https://github.com/LohseLab/circular_bootstrap.

Funding Statement

This work was supported by a European Research Council starting grant (ModelGenomLand 757648 to KL and DRL), an EastBio studentship from the Biotechnology and Biological Sciences Research Council including a stipend (to SE), a fellowship from the Natural Environment Research Council (NE/L011522/1 to KL), and a Ministerio de Ciencia e Innovación grant PID2022-139689NB-I00 (MICIU/AEI/10.13039/501100011033 and ERDF, EU to RV) (https://erc.europa.eu/homepage, https://www.ukri.org/councils/bbsrc/, https://www.aei.gob.es). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Decision Letter 0

Kelly A Dyer, Nicolas Bierne

9 Aug 2024

Dear Dr Ebdon,

Thank you very much for submitting your Research Article entitled 'Genomic regions of current low hybridisation mark long-term barriers to gene flow in scarce swallowtail butterflies' to PLOS Genetics.

The manuscript was fully evaluated at the editorial level and by independent peer reviewers. The reviewers appreciated the attention to an important problem, but raised some substantial concerns about the current manuscript. Based on the reviews, we will not be able to accept this version of the manuscript, but we would be willing to review a much-revised version. We cannot, of course, promise publication at that time.

Should you decide to revise the manuscript for further consideration here, your revisions should address the specific points made by each reviewer. We will also require a detailed list of your responses to the review comments and a description of the changes you have made in the manuscript.

If you decide to revise the manuscript for further consideration at PLOS Genetics, please aim to resubmit within the next 60 days, unless it will take extra time to address the concerns of the reviewers, in which case we would appreciate an expected resubmission date by email to plosgenetics@plos.org.

If present, accompanying reviewer attachments are included with this email; please notify the journal office if any appear to be missing. They will also be available for download from the link below. You can use this link to log into the system when you are ready to submit a revised version, having first consulted our Submission Checklist.

To enhance the reproducibility of your results, we recommend that you deposit your laboratory protocols in protocols.io, where a protocol can be assigned its own identifier (DOI) such that it can be cited independently in the future. Additionally, PLOS ONE offers an option to publish peer-reviewed clinical study protocols. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols

Please be aware that our data availability policy requires that all numerical data underlying graphs or summary statistics are included with the submission, and you will need to provide this upon resubmission if not already present. In addition, we do not permit the inclusion of phrases such as "data not shown" or "unpublished results" in manuscripts. All points should be backed up by data provided with the submission.

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool.  PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email us at figures@plos.org.

PLOS has incorporated Similarity Check, powered by iThenticate, into its journal-wide submission system in order to screen submitted content for originality before publication. Each PLOS journal undertakes screening on a proportion of submitted articles. You will be contacted if needed following the screening process.

To resubmit, log into your Editorial Manager account and select the option 'Revise Submission' in the 'Submissions Needing Revision' folder.

We are sorry that we cannot be more positive about your manuscript at this stage. Please do not hesitate to contact us if you have any concerns or questions.

Yours sincerely,

Nicolas Bierne

Academic Editor

PLOS Genetics

Kelly Dyer

Section Editor

PLOS Genetics

Dear Dr. Ebdon,

Thank you for submitting your manuscript to PLoS Genetics. It has been carefully reviewed by three referees. Overall, we appreciated your use of two innovative methods, gIMble and diem - one for demographic reconstruction and barrier locus mapping, and the other for chromosome painting in hybrids - within the same study. We also found your approach of comparing the results of these methods by analyzing parental populations far from the contact zone and hybrids within the contact zone interesting. In addition, we were intrigued by the result highlighted in your title, provided it was well supported. However, the reviewers have identified several concerns that need to be addressed before your manuscript can be considered for publication. Chief among these is the issue of small sample size, which was discussed by all of them. While the reviewers are open to the possibility that relevant conclusions can be drawn from a small sample, they remain unconvinced. Regarding gIMble, while small sample size may not be a significant issue for inferring demographic parameters, it poses a challenge for barrier mapping. This aspect needs to be further explored and explicitly acknowledged in your manuscript. If your primary objective is to correlate the results with those obtained using diem, the effect of sample size may be less critical. However, it must be explicitly recognized that the position of each individual barrier should not be considered strongly supported. Regarding the diem method, there is also a lack of clarity about the time scale to which your results apply. Given that your hybrid zone is likely partially congealed, and that only one individual could be considered an early generation hybrid, while the others are likely introgressed local parents, the congruence with the gIMble results may not be surprising. These local parents could simply be more introgressed, with bigger introgression tracts, compared to allopatric parents. 
The diem analysis again raises the question of whether your sample size is sufficient for conclusive results. In addition, you should acknowledge that your approach is conceptually similar, though not identical, to the genomic cline approach and discuss the relevant literature. It is important to explicitly address these issues while explaining that your primary goal is the correlation shown in Figure 6. You must also propose a solution to ensure that local errors do not unduly influence the overall correlation. Finally, as referee #2 pointed out, if the main objective of your study is the result shown in Figure 6, it needs to be discussed thoroughly. Was it expected or unexpected? How does it differ from the recent meta-analysis by Frayer and Payseur (https://doi.org/10.1093/evolut/qpae044) that found no correlation? It is important to discuss how your natural hybrids differ from lab-crossed hybrids in this context.

A small additional point. Figure 5 is not a De Finetti triangle and the concept of a genomic Wahlund effect is misleading. In an admixed population different loci will quickly drift or sweep to different allele frequencies and deviation from the line is more of an expectation than an exception. Please delete the dotted line. The difference between chromosomes is interesting, though, and underexplored. Some chromosomes support lower heterozygosity: what is their length? Their density in barrier loci? What is their contribution to the result in Figure 6?

I'll stop here; you have a lot of comments to address in the reviewers' reports. It is imperative that you address all concerns in a thoroughly revised version of your manuscript. While there is no formal "reject/encourage" decision at PLoS Genetics, I am assigning your manuscript a major revision. Please be aware that while your analyses are of interest, PLoS Genetics places a strong emphasis on the novelty of a study, and I am not yet sufficiently convinced in this regard after this round of reviewing. Therefore, it is crucial that your revised text meets the expectations of your title. Using gIMble and diem will not suffice. You will also need to convince the reviewers that your methods are unbiased and that your sample size is adequate - neither of which is currently guaranteed.

I look forward to reviewing your revised manuscript.

Best regards,

Nicolas

Reviewer's Responses to Questions

Comments to the Authors:

Please note here if the review is uploaded as an attachment.

Reviewer #1: In this study, Ebdon et al. describe barriers to gene flow between two species of Iphiclides butterflies at two different timescales, including the first genetic description of a hybrid zone between these species. They find a substantial barrier to gene flow that appears similar at both timescales. It is rare and difficult to compare these two timescales, and it was not a foregone conclusion that these approaches would identify the same loci within the genome. This manuscript utilizes two exciting, recently-developed techniques to draw inferences from an impressively limited number of individuals. This study will be exciting to many within the speciation community who are limited by sample collection. Additionally, the comparison of the distribution of tract lengths to theoretical predictions is particularly exciting.

A limitation of this work is the lack of a detailed recombination map for these species. The authors have addressed this in several ways, but it would be helpful to have two further questions clarified within the text: 1) Are the predicted relationships between chromosome structure and recombination rate on lines 202-205 supported by any literature within this genus?; 2) How do these results relate to the expectation that tracts should be broken down more slowly in regions of lower recombination, and thus we may see minor parent ancestry reduced at loci farther away from a barrier locus than in a region of high recombination? While the authors have already spent significant space on addressing the issue of recombination, I think it is important as it is key to some of the interpretations and would be relevant for other researchers who are unable to obtain a detailed recombination map for their species.

Furthermore, it would be useful for the authors to discuss how the amount of divergence between the species reflects on the power to detect barriers along the genome, and if that could contribute to the observed similarities between the methods.

While this manuscript is well written, there are a few terms that require clarification when first introduced. In most cases, these terms are defined in the Methods (presented here following the Results), but not defined when first introduced in the Results. Examples include: “delta B0” on line 191, “focal barrier” on line 215, and the coupling coefficient terms on line 243. I recommend that the authors revisit the Results section and bring all necessary information forward from the Methods section, if it is to remain at the end of the manuscript.

Minor comments:

The authors should clarify the proportion of the genome that is reproductively isolated. The abstract says 33% and at line 193 it says 35%. If these numbers are not the same for a reason, it would be helpful to clarify.

On lines 134 and 135, should these be references to Figure 1 rather than Figure 2?

Figure 5 is referenced in the text before Figure 4.

Appendix equation A-3 appears to be missing a negative sign in the exponent.

Reviewer #2: This paper compares two analyses of barriers to gene flow between two Iphiclides species. The first major analysis uses a nice recent IM approach to quantify effective rates of migration across the genome. The second method uses genomes from 6 (or sometimes 3) hybrids to investigate barriers that act currently. A central conclusion is the partial concordance between long- and short-term barriers.

Overall, I found this an interesting but frustrating paper to review. On the positive side, the new methods (Figs. 4-5) are exciting, and results from the IM methods are also interesting and well applied. On the negative side, the whole paper felt hastily written, with many methods inadequately reported (many Figures lack important information in the legends for example). I also thought that the paper lacked a clear message. As such, it was difficult to tell which results were meant as interesting descriptive asides, and which were meant to contribute to a broader conclusion.

1. Overall message

The final sentences of both the abstract and the conclusions seemed disappointing at the moment. The "aims and objectives" section currently names software packages and the four questions are just cryptic descriptions of the analyses that follow. The reader needs to understand exactly why it is important to compare long-term and short-term barriers. What exactly could we conclude if the two sets of barriers were congruent vs. incongruent? What would an intermediate degree of congruence tell us? None of this was made clear either from the introduction or the aims.

2. Assumptions of the methods

The paper assumes that the reader is quite familiar with several previous papers (especially Barton and Gale 1993; Baird 1995; Laetsch et al. 2023, including e.g., the methods of parametric bootstrap). Currently, key quantities (S, R, D etc.) are not properly defined. I think more details of the methods should be reported throughout.

3. Congruence between the two sets of barriers

An interesting claim of the paper is the congruence between the long- and short-term barriers. This is demonstrated by a weakish negative correlation between m_e and D (reported just as a summary statistic), and by an ANOVA on D between barrier and non-barrier windows (Figure 6).

If I am correct that this is the central result of the paper, the reader needs to be better convinced of its importance and robustness. The p value of the ANOVA is reported, but without summary stats, sample size, or tests of its parametric assumptions. We also need to understand exactly how independent the short- and long-term analyses are. Naively, we might expect analysis of recent hybrids to be confounded if the taxa had already undergone extensive gene flow in the recent past. Is this a problem or not?

The conclusion also mentions possible confounding with recombination rate. I understand that a proper test might be impossible, but couldn’t you, for example, compare results for short- vs long chromosomes?

Small points:

1. “we label windows as barriers to gene flow if a DIV history (me = 0) has greater marginal support than an IM history assuming the best fitting genome-wide value of me” Is this correctly described as “the strictest possible threshold” (p. 18), and might results be affected by adaptive introgression?

2. “Genomic Wahlund effect” Are you suggesting an explanation here (e.g. that the reduced heterozygosity follows from spatial structure)? If so I would explain, if not, I would avoid the label.

3. “The deficit of coding sequence is likely a consequence of the strong negative correlation between gene density and chromosome length (Pearson’s ρ = -0.421) on one hand and the strong positive correlation between barrier density and chromosome length on the other. These together may explain the strong negative correlation between gene and barrier density (Pearson’s ρ = -0.652), which is partially driven by the numerous small chromosomes with very high numbers of genes and small numbers of barriers.”

Couldn’t this be tested?

Reviewer #3: Review of "Genomic regions of current low hybridisation mark long-term barriers to gene flow in scarce swallowtail butterflies"

This manuscript takes an interesting approach to make a lot out of a relatively modest data set to ask an interesting question in speciation genomics - the relationship between short term and longer term permeability of the genome to introgression. In contrast to the more traditional approach of comparing divergent population pairs at different points on "the speciation continuum" this study compares populations of the focal species pairs at different portions of the range - such that the allopatric populations reveal long-term historical introgression, while sympatric populations reveal the short-term recent process of hybridization and introgression. This is an interesting approach, which will undoubtedly inspire others to perform similar analyses in their systems. That said, I have numerous concerns about this study.

First - before asking their major motivating question, the authors make a solid approach to asking basic questions about the system and inferring the history of speciation and gene flow. Notably they fit an IM model to their allopatric samples and infer a history of ongoing gene flow. They also show that this history fits better than a simple model of pure divergence and that simple simulations under the best fitting demographic model of pure divergence do not confuse the IM inference. While this is reasonable, there are a few weaknesses here that must be noted, if not fully fixed. First the details of simulation are quite vague - e.g. the authors say they used gIMble simulate to run these simulations. However this is not particularly informative as looking back at the gIMble paper, it seems that gIMble simulate is simply a wrapper for msprime. As such the authors should a. Reference msprime as this seems to rob msprime of a citation and b. Provide more details about genome architecture - did they model the genome structure of their focal species? Or was this a bunch of unlinked sites? Or was this a few chromosomes all 1 Morgan long? Was the mutation rate fixed or variable across chromosomes etc etc. Similarly, it appears that the DIV model assumes a clean and instantaneous split rather than a split from a structured ancestral population etc etc etc. All of these details have important influence on the interpretation of the fact that neutral demographic simulations with a pure split of divergence are not confused for an IM model as a misspecified demographic model can lead to poor inference. Similarly linked selection can also generate signals confused for introgression, especially in the IM framework [e.g. Smith and Hahn 2024 https://academic.oup.com/genetics/advance-article/doi/10.1093/genetics/iyae089/7683793]. 
I'm not sure what to do with these criticisms, as the authors are using currently established standard best practices of the field, but I am concerned that such issues could lead to incorrect inferences downstream.

I have similar concerns about the interpretation of the block length distribution. The authors note some (but not all) of the complicating assumptions involved in going from a block length distribution (which they note is further enhanced by the lack of a genetic map) to an inference of selection. I don't think this analysis provides much support for anything, and it is probably best to remove it.

The most exciting result is - of course - the inferred correlation between long term and short term barriers to introgression. This is an interesting result, but it was not well explained or motivated. Here are a few issues which require some thought / attention:

First off the justification for using D - an indirect measure of admixture proportion rather than some more traditional approach - was not well explained. Trying to read the minds of the authors, my intuition is that this reflects "mixture LD" (sensu Falush et al 2003), and that the authors were worried that some more straightforward local ancestry deconvolution approach would miss admixed blocks on this short time scale. But this was neither stated nor supported / justified so it would help if the authors elaborated some.

Second - the stats are not well explained - e.g. What is the ANOVA model for comparing barrier and non-barrier loci - is this simply a t-test or is there more to the model? What is done about the non-independence of windows? My hunch is that both D and m_{e,i} are autocorrelated across the genome, so this should be addressed. Additionally, the stats reporting was strange - in one case a correlation coefficient was provided without a measure of significance, in another case a p value was provided without a summary of the effect size etc...

Third - the biological question was unclear. Is this a question about the repeatability of key barrier loci, or a question about if the landscape of recombination itself is sufficient to generate correlations in local ancestry across time? Either way, the authors should better articulate their biological question, and either develop a parametric model that includes variables such as chromosome length and position, or develop some form of nonparametric matched permutation, to see if the correlation exceeds predictions from these simple correlates (although it could be that this model being inadequate simply reflects the limited info about the genome etc). This is particularly important because the authors find that simple genome features well predict local introgression.
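
One way to picture the kind of nonparametric permutation raised here, together with the autocorrelation concern in the previous point, is a circular-shift test: one per-window track is rotated by a random offset, which keeps its own autocorrelation intact while breaking its alignment with the other track. The following is only a minimal sketch under stated assumptions, not the procedure used in the manuscript. The function names `pearson` and `circular_shift_test` are hypothetical, the two tracks are simulated toy data standing in for per-window m_e and D, and a real analysis would probably shift within chromosomes rather than across one concatenated sequence.

```python
import random
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def circular_shift_test(x, y, n_perm=999, seed=0):
    """Two-sided permutation p-value for cor(x, y) under random circular shifts of y.

    Rotating y preserves its autocorrelation but breaks its alignment with x.
    """
    rng = random.Random(seed)
    observed = pearson(x, y)
    n = len(y)
    extreme = 0
    for _ in range(n_perm):
        k = rng.randrange(1, n)          # non-zero offset
        shifted = y[k:] + y[:k]          # rotate the track by k windows
        if abs(pearson(x, shifted)) >= abs(observed):
            extreme += 1
    # Add-one correction so the p-value is never exactly zero
    return observed, (extreme + 1) / (n_perm + 1)

# Toy data (hypothetical): an autocorrelated "m_e" track and a "D" track
# that is negatively related to it plus noise.
toy = random.Random(1)
m_e, level = [], 0.0
for _ in range(200):
    level = 0.9 * level + toy.gauss(0, 1)
    m_e.append(level)
D = [-v + toy.gauss(0, 1) for v in m_e]

r, p = circular_shift_test(m_e, D)
print(f"r = {r:.3f}, circular-shift p = {p:.3f}")
```

Reporting `r` alongside `p` also gives the effect size and the significance measure together, which is the pairing the previous point asks for.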

Finally, the sample size here is remarkably small - only five samples from sympatry - and two with relatively little admixture. While it could be argued that making much of little data is a strength of the question and approach, I would like to have some sense of power, sample size necessary etc. While one genome provides a nice collection of coalescent genealogies and sometimes can be quite revealing for pop gen, one recently admixed sample does not. A stronger paper would include both some form of jackknifing at the individual level and some complementary simulation.
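
Jackknifing at the individual level can be illustrated with a leave-one-out loop: the statistic is recomputed with each individual dropped in turn, and the spread of those estimates gives a standard error. This is a minimal sketch under stated assumptions. The function name `jackknife` is hypothetical, the six per-individual values are invented for illustration and are not data from the study, and the mean is used as the statistic only because its jackknife standard error has a known closed form to check against. In practice the statistic would be whatever genome-wide summary is at stake, recomputed without each hybrid.

```python
from math import sqrt
from statistics import mean, stdev

def jackknife(samples, statistic):
    """Leave-one-out estimates of `statistic` and the jackknife standard error."""
    n = len(samples)
    # Recompute the statistic with individual i removed
    loo = [statistic(samples[:i] + samples[i + 1:]) for i in range(n)]
    centre = sum(loo) / n
    se = sqrt((n - 1) / n * sum((t - centre) ** 2 for t in loo))
    return loo, se

# Hypothetical per-individual values (e.g. a hybrid index for six hybrids)
hybrid_index = [0.05, 0.12, 0.48, 0.81, 0.93, 0.97]

loo, se = jackknife(hybrid_index, mean)
print("leave-one-out estimates:", [round(t, 3) for t in loo])
print("jackknife SE:", round(se, 4))
# For the mean, the jackknife SE equals the usual standard error s / sqrt(n)
print("s / sqrt(n):  ", round(stdev(hybrid_index) / sqrt(len(hybrid_index)), 4))
```

If a single leave-one-out estimate sits far from the others, the result depends heavily on that individual, which is the sensitivity this point is concerned with.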

**********

Have all data underlying the figures and results presented in the manuscript been provided?

Large-scale datasets should be made available via a public repository as described in the PLOS Genetics data availability policy, and numerical data that underlies graphs or summary statistics should be provided in spreadsheet form as supporting information.

Reviewer #1: Yes

Reviewer #2: No: 

Reviewer #3: None

**********

PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #2: No

Reviewer #3: No

Decision Letter 1

Kelly A Dyer, Nicolas Bierne

5 Dec 2024

PGENETICS-D-24-00612R1

Genomic regions of current low hybridisation mark long-term barriers to gene flow in scarce swallowtail butterflies

PLOS Genetics

Dear Dr. Ebdon,

Thank you for submitting your manuscript to PLOS Genetics. After careful consideration, we feel that it has merit but does not fully meet PLOS Genetics's publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.

Please submit your revised manuscript within 30 days (by Jan 04 2025 11:59PM). If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosgenetics@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pgenetics/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

Please include the following items when submitting your revised manuscript:

* A rebuttal letter that responds to each point raised by the editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'. This file does not need to include responses to formatting updates and technical items listed in the 'Journal Requirements' section below.

* A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'.

* An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'.

If you would like to make changes to your financial disclosure, competing interests statement, or data availability statement, please make these updates within the submission form at the time of resubmission. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter.

We look forward to receiving your revised manuscript.

Kind regards,

Nicolas Bierne

Academic Editor

PLOS Genetics

Kelly Dyer

Section Editor

PLOS Genetics

Aimée Dudley

Editor-in-Chief

PLOS Genetics

Anne Goriely

Editor-in-Chief

PLOS Genetics

Additional Editor Comments:

Dear Dr Ebdon,

Your revised manuscript was examined by two of the three referees of the first round. The third did not respond to my request and I did not wish to invite a new referee. Both referees gave positive feedback about this revised version. I share their positive evaluation. However, referee 1's concern about divergence time is shared by me, and it remains an important caveat of your paper. We are not convinced that your deductions and interpretations allow you to be so categorical in your conclusions about the contrast between two very different evolutionary time scales. Your poor knowledge of the study system, the limited number of demographic histories considered (2 pop IM model, no space, no variation of connectivity through time etc...), and above all the small number of individuals in a hybrid zone whose spatial structure and level of local introgression (the tails of the clines) are little known, do not allow you to assess whether the coincidence of regions resistant to introgression in and far from the hybrid zone is a surprise or not. Your junctions approach is the best you can get from this very small sample, and I very much appreciate it, but that's not the issue, the issue is more about the generality of the interpretations. So yes, you find figures, 100-200 generations in the HZ and 2M generations between the parents, but so what? Does it allow you to believe that you are doing better than previous studies that have attempted the same investigation? I receive, and share, and so does referee 1, the criticisms of lab-cross studies, but at least we know what type of hybrids we're studying. And I'm not very happy that you only make negative comments about the low number of recombination generations, as if all this literature should be thrown away, rather than trying to have a constructive discussion. In fact, I don't know whether I prefer 1,000 lab F2s or 6 hybrids from a little-known HZ, be they studied with an elegant junction method. 
I could also well prefer a genomic cline analysis with fewer markers but many individuals and a nice transect. In your case, you have your own vision of your complex HZ hybrids, but in reality we don't know very much (the fact you misunderstood my concern that you might have locally introgressed parents and an early generation hybrid between these introgressed parents, proves you have a preconceived view). I don't share your certainties. That said, contrary to my conclusion in the first round, I think your methodological approaches deserve to be published in PloS Genet, and the comparison of the two approaches remains super interesting. However, I must ask you to make an effort to correct your certainties throughout your manuscript. You should work on the abstract and completely rewrite the author summary. Throughout the manuscript, you must try to mitigate this very strong idea that you are comparing things that are as different as you say they are. If you have a secondary contact, even though the hybrids from the HZ introgression tails have recently incorporated new neutral heterospecific tracts, they are expected to have the same barriers as the parents. That's what people have done by comparing divergence islands and genomic clines in other studies. Your methods may well be more elegant, and that's where the novelty lies, not so much in the conceptual findings. I understand that my comment is somewhat at odds with my recommendation of the first round, but your revision and the very nice new analyses you've done, as well as your inability to really convince me of the significance of finding coincident barriers with the two methods, prompt me to ask you to tone that result down a bit in your revised version. Sorry for the back and forth.

I look forward to reading your revised version soon.

Best regards,

Nicolas Bierne

Minor comments :

- "vastly different timescales", "very recent signatures" "long-term signatures over the history of divergence" etc. please mitigate.

- L46 "speciation in the face of gene flow" → episodes of gene flow during speciation

- L58 "contemporary contact zones are likely one of many instances of secondary contact generated by drastic environmental changes" : I don’t understand what you mean here, probably too much than what you could.

- L 173 "An IM model supports a history of speciation with gene flow". What can an IM model support other than an IM model? Given you do not test secondary contact or varying level of gene flow with time, do not conclude more than what you can. I really don’t mind that you use an IM model to map the barriers, but do not conclude too much about the history of divergence and gene flow. That a hybrid zone in the southwest of France would not be a secondary contact after post-glacial recolonization would surprise me a lot.

- Figure 4 "The dashed line indicates the expectation under Hardy-Weinberg equilibrium" Again, please delete this dashed line. If I make an effort to go your way, this line could be useful if you plot the population average (but you don’t really have a population sample here), that this average is close to the line, and you find a way to show how far the individual scatterplot deviates from it as a measure of HW and linkage disequilibrium (but the variance in the ancestry does the job well). When you compare different parts of a genome, it's against the genome average, but given that an individual genomic average may well deviate from the line due to linkage disequilibrium in HZ or selection and drift in an admixed population, I don't see the point of showing that expectation. Finally, as I said previously, in an isolated admixed population that converged toward HWLE, one might expect individuals to be close to this line under neutrality and slow drift in large populations, but this is never the case, you always deviate from it (notwithstanding that theory predicts selection for heterozygosity and ancestry of the major or fitter parent). When an expectation is not useful, when it is not well explained and when it can mislead the reader because it is unusual, I don't see the point of showing this expectation. To put it briefly, I don't agree with you at all that it's a useful expectation.

Journal Requirements:

1) Please amend your detailed Financial Disclosure statement. This is published with the article. It must therefore be completed in full sentences and contain the exact wording you wish to be published. Please ensure that the funders and grant numbers match between the Financial Disclosure field and the Funding Information tab in your submission form. Note that the funders must be provided in the same order in both places as well.

- State the initials, alongside each funding source, of each author to receive each grant. For example: "This work was supported by the National Institutes of Health (####### to AM; ###### to CJ) and the National Science Foundation (###### to AM)."

- State what role the funders took in the study. If the funders had no role in your study, please state: "The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript."

If you did not receive any funding for this study, please simply state: “The authors received no specific funding for this work.”

Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Authors:

Please note here if the review is uploaded as an attachment.

Reviewer #1: In this study, the authors investigate short- and long-term barriers to gene flow between two species of Iphiclides butterflies. They use two recently developed techniques (gIMble and diem) to draw inferences from small sample sizes. Interestingly, they find that the barriers identified through the two approaches overlap significantly. They find that the Z chromosome has very little introgression. This study is exciting because such comparisons of different time scales are rare.

This is the second time I have reviewed this manuscript. I appreciate the authors’ efforts to make improvements, and I believe the manuscript has been strengthened as a result. I appreciate the new emphasis on the overlap between the short- and long-term barriers, which is a novel contribution of this manuscript. I particularly appreciate the authors’ addition of junction number as a metric and their new permutation approach, as it is exciting to see this theory applied to real data. Previous attempts at applying junctions to real data have looked at very different types of hybrid populations that made this comparison difficult (e.g. Lavretsky et al 2019, https://doi.org/10.1002/ece3.4981).

That being said, I’m not sure that the authors have addressed my concern about divergence. I was less concerned with the time of divergence and more concerned with the uneven pattern of differentiated markers along the genome. At line 330, the authors state that “sites with high diagnostic indices are more densely populated within barrier windows relative to a null resampled distribution.” Is there an issue of relative power along the genome? It would improve the paper for the authors to address this.

Minor comments:

1) I agree with the authors that the Frayer and Payseur paper has several additional complications brought on by the lab vs wild cross comparison (as was the purpose of their discussion), but it may also be worth noting that the time scales are different (where late-stage hybrids in that comparison may be closer to the short-term barriers considered here).

2) Around line 205, it would improve clarity to explicitly say that the loci used for the sliding windows are not the same as those used for fitting the global IM model above.

3) At lines 332-334, you say that the number of junctions should be negatively correlated with barrier strength. While I agree a reduction is expected, I think there is evidence that the strength of the reduction is modulated by several factors and not just selection strength (e.g. Hvala et al 2018, https://doi.org/10.1111/evo.13509). If this is based on specific theory, it would be good to cite it. This does not impact your conclusion, as you only show a reduction and not a correlation.

4) At lines 516-518, the authors state that their approach may undercount small blocks, but they actually observe more small blocks than expected by their model. Please clarify this point.

5) On lines 583 and 596, the authors reference different Keightley papers for the same mutation rate.

6) In the Methods section titled “Untangling blocks from genotypes,” the authors state that no phasing choice is necessary to determine the lengths of minor tracks of introgression. While this is true, a choice would still be required in most cases where minor ancestry is homozygous. The legend of Figure 6 suggests that that does not happen in most cases, and this seems reasonable given the relative proportions of admixture. However, it would be useful to clarify in the Methods both that most blocks were heterozygous, and what was done with homozygous blocks.

7) At the end of the Figure 5 legend, there should be a reference to plot F rather than plot E.

Reviewer #2: This manuscript is greatly improved. I like the new stats and especially the new circular randomization. The intro and discussion also do a much better job of contextualizing the work, and explaining why its questions are interesting. I am grateful to the authors for their constructive response to the earlier review.

I have two very very small comments:

1. There are lots of acronyms and their component single letters in this ms (D, Z, I, H, M, HI, DI, HZ, IM, IF, HWE, LD, CL etc.). I know they are all standard, but I would double check if any of the rarer ones could be removed (e.g. mt, x, HWE, FPR), or different notation used when similar symbols appear in the same paragraph (e.g. p. 9 has H, I, Z, HZ, HI, and IF).

2. p. 6: "using a pre-computed grid" Not sure what this meant.

**********

Have all data underlying the figures and results presented in the manuscript been provided?

Large-scale datasets should be made available via a public repository as described in the PLOS Genetics data availability policy, and numerical data that underlies graphs or summary statistics should be provided in spreadsheet form as supporting information.

Reviewer #1: Yes

Reviewer #2: Yes

**********

PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #2: No

[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.]

Figure resubmission:

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email PLOS at figures@plos.org. Please note that Supporting Information files do not need this step. If there are other versions of figure files still present in your submission file inventory at resubmission, please replace them with the PACE-processed versions.

Reproducibility:

To enhance the reproducibility of your results, we recommend that authors deposit laboratory protocols in protocols.io, where a protocol can be assigned its own identifier (DOI) such that it can be cited independently in the future. Additionally, PLOS ONE offers an option to publish peer-reviewed clinical study protocols. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols

Decision Letter 2

Kelly A Dyer, Nicolas Bierne

14 Mar 2025

Dear Dr Ebdon,

We are pleased to inform you that your manuscript entitled "Genomic regions of current low hybridisation mark long-term barriers to gene flow in scarce swallowtail butterflies" has been editorially accepted for publication in PLOS Genetics. Congratulations!

Before your submission can be formally accepted and sent to production you will need to complete our formatting changes, which you will receive in a follow up email. Please be aware that it may take several days for you to receive this email; during this time no action is required by you. Please note: the accept date on your published article will reflect the date of this provisional acceptance, but your manuscript will not be scheduled for publication until the required changes have been made.

Once your paper is formally accepted, an uncorrected proof of your manuscript will be published online ahead of the final version, unless you’ve already opted out via the online submission form. If, for any reason, you do not want an earlier version of your manuscript published online or are unsure if you have already indicated as such, please let the journal staff know immediately at plosgenetics@plos.org.

In the meantime, please log into Editorial Manager at https://www.editorialmanager.com/pgenetics/, click the "Update My Information" link at the top of the page, and update your user information to ensure an efficient production and billing process. Note that PLOS requires an ORCID iD for all corresponding authors. Therefore, please ensure that you have an ORCID iD and that it is validated in Editorial Manager. To do this, go to ‘Update my Information’ (in the upper left-hand corner of the main menu), and click on the Fetch/Validate link next to the ORCID field.  This will take you to the ORCID site and allow you to create a new iD or authenticate a pre-existing iD in Editorial Manager.

If you have a press-related query, or would like to know about making your underlying data available (as you will be aware, this is required for publication), please see the end of this email. If your institution or institutions have a press office, please notify them about your upcoming article at this point, to enable them to help maximise its impact. Inform journal staff as soon as possible if you are preparing a press release for your article and need a publication date.

Thank you again for supporting open-access publishing; we are looking forward to publishing your work in PLOS Genetics!

Yours sincerely,

Nicolas Bierne

Academic Editor

PLOS Genetics

Kelly Dyer

Section Editor

PLOS Genetics

Aimée Dudley

Editor-in-Chief

PLOS Genetics

Anne Goriely

Editor-in-Chief

PLOS Genetics

www.plosgenetics.org

Twitter: @PLOSGenetics

----------------------------------------------------

Comments from the reviewers (if applicable):

----------------------------------------------------

Data Deposition

If you have submitted a Research Article or Front Matter that has associated data that are not suitable for deposition in a subject-specific public repository (such as GenBank or ArrayExpress), one way to make that data available is to deposit it in the Dryad Digital Repository. As you may recall, we ask all authors to agree to make data available; this is one way to achieve that. A full list of recommended repositories can be found on our website.

The following link will take you to the Dryad record for your article, so you won't have to re‐enter its bibliographic information, and can upload your files directly: 

http://datadryad.org/submit?journalID=pgenetics&manu=PGENETICS-D-24-00612R2

More information about depositing data in Dryad is available at http://www.datadryad.org/depositing. If you experience any difficulties in submitting your data, please contact help@datadryad.org for support.

Additionally, please be aware that our data availability policy requires that all numerical data underlying display items are included with the submission, and you will need to provide this before we can formally accept your manuscript, if not already present.

----------------------------------------------------

Press Queries

If you or your institution will be preparing press materials for this manuscript, or if you need to know your paper's publication date for media purposes, please inform the journal staff as soon as possible so that your submission can be scheduled accordingly. Your manuscript will remain under a strict press embargo until the publication date and time. This means an early version of your manuscript will not be published ahead of your final version. PLOS Genetics may also choose to issue a press release for your article. If there's anything the journal should know or you'd like more information, please get in touch via plosgenetics@plos.org.

Acceptance letter

Kelly A Dyer, Nicolas Bierne

PGENETICS-D-24-00612R2

Genomic regions of current low hybridisation mark long-term barriers to gene flow in scarce swallowtail butterflies

Dear Dr Ebdon,

We are pleased to inform you that your manuscript entitled "Genomic regions of current low hybridisation mark long-term barriers to gene flow in scarce swallowtail butterflies" has been formally accepted for publication in PLOS Genetics! Your manuscript is now with our production department and you will be notified of the publication date in due course.

The corresponding author will soon be receiving a typeset proof for review, to ensure errors have not been introduced during production. Please review the PDF proof of your manuscript carefully, as this is the last chance to correct any errors. Please note that major changes, or those which affect the scientific understanding of the work, will likely cause delays to the publication date of your manuscript.

Soon after your final files are uploaded, unless you have opted out or your manuscript is a front-matter piece, the early version of your manuscript will be published online. The date of the early version will be your article's publication date. The final article will be published to the same URL, and all versions of the paper will be accessible to readers.

Thank you again for supporting PLOS Genetics and open-access publishing. We are looking forward to publishing your work!

With kind regards,

Anita Estes

PLOS Genetics

On behalf of:

The PLOS Genetics Team

Carlyle House, Carlyle Road, Cambridge CB4 3DN | United Kingdom

plosgenetics@plos.org | +44 (0) 1223-442823

plosgenetics.org | Twitter: @PLOSGenetics

Associated Data

    This section collects any data citations, data availability statements, or supplementary materials included in this article.

    Supplementary Materials

    S1 Appendix (PDF): pgen.1011655.s001.pdf (164.9KB, pdf)
    S1 Table (CSV): pgen.1011655.s002.csv (2.2KB, csv)
    S1 Fig (PDF): pgen.1011655.s003.pdf (110.4KB, pdf)
    S2 Fig (PDF): pgen.1011655.s004.pdf (564.4KB, pdf)
    S3 Fig (PDF): pgen.1011655.s005.pdf (520.6KB, pdf)
    S4 Fig (PDF): pgen.1011655.s006.pdf (645.6KB, pdf)
    S5 Fig (PDF): pgen.1011655.s007.pdf (232.6KB, pdf)
    S6 Fig (PDF): pgen.1011655.s008.pdf (671.1KB, pdf)
    S7 Fig (PDF): pgen.1011655.s009.pdf (523.7KB, pdf)
    S8 Fig (PNG): pgen.1011655.s010.png (65.3KB, png)
    S9 Fig (PNG): pgen.1011655.s011.png (65.3KB, png)
    Attachment (submitted filename: Ebdon2024_responses.pdf): pgen.1011655.s012.pdf (523.7KB, pdf)
    Attachment (submitted filename: Ebdon2024_responses_auresp_2.pdf): pgen.1011655.s013.pdf (483.1KB, pdf)

    Data Availability Statement

    Read data is available from the ENA at PRJEB76171. Reads for sample IP 504 were generated by a previous study (doi: 10.1093/g3journal/jkac193) and are available at the ENA at PRJEB51340. Input data for plots and statistics is available from https://github.com/samebdon/iphiclides_speciation_data. The bootstrapping was implemented using a custom script available at https://github.com/LohseLab/circular_bootstrap.
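As a rough illustration of what a circular resampling scheme of this kind can look like, the sketch below rotates one track of flagged genomic windows against another and asks how often the rotated overlap matches or exceeds the observed one. It is not the authors' script: the function names, the 0/1 window encoding, the treatment of the windows as a single circular sequence, and the toy data are all assumptions made for illustration.

```python
import random


def overlap(a, b):
    """Number of windows flagged (1) in both tracks."""
    return sum(1 for x, y in zip(a, b) if x and y)


def circular_shift(track, offset):
    """Rotate a track by `offset` windows, wrapping around the end."""
    offset %= len(track)
    return track[-offset:] + track[:-offset]


def circular_permutation_pvalue(a, b, n_perm=1000, seed=0):
    """One-sided p-value for the overlap of a and b under random rotations of b."""
    rng = random.Random(seed)
    observed = overlap(a, b)
    hits = 0
    for _ in range(n_perm):
        # Draw a non-zero offset so the rotated track differs from the original.
        shifted = circular_shift(b, rng.randrange(1, len(b)))
        if overlap(a, shifted) >= observed:
            hits += 1
    # Add-one correction keeps the estimate away from exactly zero.
    return observed, (hits + 1) / (n_perm + 1)


# Toy tracks: 12 windows, with the same three adjacent windows flagged in both.
long_term = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0]
recent = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0]
observed, p_value = circular_permutation_pvalue(long_term, recent, n_perm=200)
```

Rotating rather than shuffling keeps the clustering of flagged windows within each track intact, so the null distribution reflects chance alignment between the two tracks rather than the loss of their spatial autocorrelation.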


    Articles from PLOS Genetics are provided here courtesy of PLOS
