Proceedings of the National Academy of Sciences of the United States of America. 2018 Jun 26;115(28):E6507–E6515. doi: 10.1073/pnas.1800334115

Whole-genome data reveal the complex history of a diverse ecological community

Lynsey Bunnefeld a,b,1,2, Jack Hearn a,1, Graham N Stone a,3, Konrad Lohse a,3
PMCID: PMC6048486  PMID: 29946026

Significance

Widespread biological communities are common, but little is known about how they assemble. A key question is how sets of trophically linked species (predators and their prey, hosts and parasites) spread to occupy current distributions. Do they disperse together, preserving ecological interactions, or separately, such that interactions are interrupted? This is central to assessing the potential for coevolution in a system and requires inference of species associations both over space and through time. Here, we use de novo genomic data and likelihood-based approaches to infer the assembly history of a multispecies community of Western Palearctic insect herbivores and parasitoid natural enemies—the two trophic groups that together comprise 50% of all animal species.

Keywords: population genomics, comparative phylogeography, gall wasps, chalcid parasitoids, Western Palearctic

Abstract

How widespread ecological communities assemble remains a key question in ecology. Trophic interactions between widespread species may reflect a shared population history or ecological fitting of local pools of species with very different population histories. Which scenario applies is central to the stability of trophic associations and the potential for coevolution between species. Here we show how alternative community assembly hypotheses can be discriminated using whole-genome data for component species and provide a likelihood framework that overcomes current limitations in formal comparison of multispecies histories. We illustrate our approach by inferring the assembly history of a Western Palearctic community of insect herbivores and parasitoid natural enemies, trophic groups that together comprise 50% of terrestrial species. We reject models of codispersal from a shared origin and of delayed enemy pursuit of their herbivore hosts, arguing against herbivore attainment of “enemy-free space.” The community-wide distribution of species expansion times is also incompatible with a random, neutral model of assembly. Instead, we reveal a complex assembly history of single- and multispecies range expansions through the Pleistocene from different directions and over a range of timescales. Our results suggest substantial turnover in species associations and argue against tight coevolution in this system. The approach we illustrate is widely applicable to natural communities of nonmodel species and makes it possible to reveal the historical backdrop against which natural selection acts.


Ecological communities comprise organisms of multiple species living in a specified place and time (1). A common feature of life on Earth is that very similar communities, showing high overlap in species composition and patterns of interaction, can be found in multiple locations (2). How such sets of communities assemble remains largely unknown (3). Are similar communities derived by expansion of a single original community, with component species expanding their distributions side by side such that ecological interactions among them are maintained? Or do similar communities arise repeatedly through a process of ecological fitting (2, 4), involving the local sorting of species pools into communities based on trait values that have not been shaped by coevolution? Alternatively, community assembly could be an entirely random and ecologically neutral process (5). Which of these models applies is central to our understanding of how traits associated with species interactions evolve, because while codispersal allows sustained coadaptation and coevolutionary interactions among component species, no such processes are associated with ecological fitting or random assembly. These alternative models also make contrasting predictions for variation in population history across component species in a community. While codispersal predicts a concordant origin and direction of range expansion in interacting species, ecological fitting and neutral processes have no such requirement and can accommodate very discordant population histories.

Concordant population histories are expected in obligate species associations, such as those between plants and specialist pollinators, and these commonly show both coevolution of associated traits and codiversification of the lineages involved (6, 7). However, much of terrestrial diversity is found in communities dominated by less specific interactions between guilds of species, for which either codispersal or ecological fitting is a plausible assembly mechanism. These are exemplified by the rich insect communities associated with temperate trees, in which widespread herbivores are commonly attacked by a consistent set of parasitoid enemies (8, 9). Both herbivores and parasitoids in such systems show traits that structure trophic interactions (10–12), but the extent to which these represent coadaptations or the results of ecological fitting remains poorly understood. Indeed, evidence exists for asynchronous divergence between Joshua tree and yucca moth populations, suggesting that, despite the obligate nature of this pollination mutualism, delayed host tracking may also be at play (13).

Here we assess the evidence for four alternative models (Fig. 1 A–D) in the assembly of an exemplar community of oak-associated insects, comprising herbivorous insect hosts and parasitoid and inquiline natural enemies (Fig. 1 legend). These models include strict codispersal (simultaneous range expansion), host tracking (parasitoids’ pursuit of their hosts), and ecological fitting (local recruitment of parasitoids from alternative hosts), each of which makes contrasting predictions for spatial patterns of genetic variation within and among species. We also evaluate the support for an alternative null model of assembly under ecological neutrality in which a species’ history is a random draw from a community-wide distribution regardless of its trophic interactions with other community members (5, 14).

Fig. 1.

Population histories under alternative hypotheses of community assembly. (A) Under strict codispersal species disperse between refugia together from a shared origin and their population histories show the same topology and timing of population splits. (B) Host tracking predicts that hosts and parasitoids share a common geographic origin, but allows parasitoids to follow after their hosts. Host tracking is significant ecologically because hosts can achieve a measure of enemy-free space (20) which decouples coevolutionary interactions between trophic levels. Enemy escape has been seen over ecological timescales (21, 22), but has rarely been studied in the context of population histories (20). (C) Ecological fitting results in the same set of interacting species in each refuge, but allows highly discordant patterns of range expansion from different origins. (D) Under an ecological and biogeographic null model, expansion times are random draws from a single community-wide distribution. Shown is a six-species community with two parasitoids (orange) and one gall wasp host (brown) expanding out of the east and two parasitoids (gray) and one gall wasp (blue) expanding out of the west. Split times (T1 and T2) are drawn randomly from an exponential distribution. (E) The demographic history of each species is captured by a seven-parameter model. We assume a separate population size (Ne) for each refugial population (three parameters). In this example, NE<NW<NC (shown by width of bars and where NE is the Ne for the eastern refugium, and so on) and an instantaneous admixture event (shown by a horizontal arrow) at Tadm transfers a fraction f of the central population into the east. The ancestral populations always share an Ne with one of their descendant populations. Here, Anc1 shares an Ne with the central population and Anc2 shares an Ne with the eastern population. (F) Exemplar members of the oak gall community. 
Galls induced by the four oak gall wasp species (Bottom) are attacked by a range of parasitoid wasps (Top). Natural enemies from Top Left to Top Right: Torymus auratus, Synergus umbraculus, Eurytoma brunniventris, and Ormyrus nitidulus. (S. umbraculus is an inquiline oak gall wasp that inhabits galls induced by other oak gall wasps; for brevity we group it with parasitoid species hereafter). Galls from Bottom Left to Bottom Right: Andricus grossulariae, Neuroterus quercusbaccarum, Biorhiza pallida, and Pseudoneuroterus saliens. [Scale bars: 1 mm (Top row) and 1 cm (Bottom row).]

Our exemplar community comprises oak cynipid gall wasp herbivores and their chalcid parasitoid wasp natural enemies, which form a set of interacting species whose distributions span the Western Palearctic from Iberia to Iran. Associated parasitoids attack only cynipid galls, making this system ecologically closed and hence suitable for analysis in isolation (8). Single-species analyses of community members (15–18) have found westward declines in genetic diversity that support expansion into Europe from Western Asia, an “out of the East” pattern that is concordant with inferred divergence of Western Palearctic gall wasp lineages in Western Asia 5–7 Mya (19). However, the extent to which this is true of the whole community, and particularly the parasitoids, remains unclear (20).

We test the predictions of each model of community assembly using whole-genome sequence (WGS) data for 13 species (4 gall wasp herbivores and 9 parasitoids) of the oak gall community, each assembled de novo and sampled across three Pleistocene refugial populations spanning the Western Palearctic: Iberia (west), the Balkans (center), and Iran (east) (SI Appendix, Table S1). For each species and each refugium, we generated WGS data for two males whose haploidy greatly facilitates bioinformatic and population genetic analyses. We infer an explicit history of divergence and admixture between refugia for each species, using a computationally efficient composite-likelihood framework (23) based on the joint occurrence of mutations in sequence blocks (24). Analyzing WGS data in a blockwise manner overcomes the limitations of previous comparative analyses which have either not considered the effect of demographic history on genetic diversity (25, 26) or lacked power because of the small number of loci available (20). We conduct a formal comparison of demographic histories across species and between trophic levels to address the following questions: (i) To what extent do community members share a common origin, direction, and timescale of range expansion across the Western Palearctic? (ii) Did gall wasp herbivores disperse before their parasitoids and so achieve “enemy-free space”? (iii) Is there evidence of dispersal between refugia following their initial colonization, compatible with multiple waves of range expansion? And (iv) can the timing of expansion events across the full set of species be explained by a neutral model of community assembly?

We find a diversity of population histories in oak gall wasps and their parasitoids. While most species dispersed into Europe from the east, four species joined the community following dispersal from the west. Evidence of gene flow between refugial populations implies multiple range expansion events and the potential for dynamic community evolution on a continental scale. Species also vary in the timing of their range expansions, with initial divergences between refugial populations dating from less than 100 kya to over 1 Mya. A likelihood-based comparison of these histories allows us to reject both strict codispersal of this community and herbivore occupation of enemy-free space, two paradigmatic models of deterministic community assembly. Hosts as a guild did not disperse before their parasitoids, providing no evidence for (perfect) enemy escape. However, our results also argue against an ecologically neutral null model of community assembly that views the histories of individual species as random draws from a simple community-wide distribution. Instead, we identify a complex history for this community, including a mixture of idiosyncratic range expansions combined with multispecies pulses of range expansion that appear to correspond to climatic shifts during the Pleistocene. The interspecific variation of demographic histories we uncover implies between-refugium variation in species interactions that is entirely compatible with a geographic mosaic model of coevolution (27, 28) on a continental scale. The fact that population histories are largely uncoupled across members of the oak gall community implies that any host–parasitoid coevolution is likely to have been diffuse.

Results

We generated WGS data (100-base paired-end Illumina) for a total of 75 individual haploid male wasps. For each of the 13 species we sampled two individuals from each of the three refugia (west, center, and east; SI Appendix, Table S2). Reads for each species were combined to assemble reference genomes de novo (SI Appendix, Table S1) and mapped back to generate variation data.

Defining Demographic Model Space.

Our initial aim was to infer the history of longitudinal range expansions into, and admixture between, refugial populations for each species. We defined a space of plausible demographic models that is biologically realistic yet small enough to allow an exhaustive comparison of all possible population relationships. Cross-species comparisons of the pairwise genetic diversity within (π) and divergence between (dXY) refugia provide several immediate insights into the demographic history of the oak gall community and were used to guide the selection of appropriate models of population history (SI Appendix, Fig. S1). These summaries indicated that a minimal description of the longitudinal history of each species should include (i) differences in effective population size between refugia, (ii) divergence between populations, and (iii) the possibility of admixture between refugia. These processes can be captured by a seven-parameter model (Fig. 1E) comprising the effective population size Ne of each refugium (NE, NC, NW); three time parameters, namely the two divergence times between refugial populations (T1 and T2) and the time Tadm of an instantaneous unidirectional admixture event; and the fraction f that this admixture event transfers from a source population into another refugium. To restrict model complexity to a computationally feasible number of parameters, we considered only admixture that involved the oldest population either as source or as sink (Fig. 1E).
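The two summary statistics used to guide model choice can be illustrated with a minimal sketch. It assumes aligned haploid sequences of equal length and uses the standard definitions: π is the mean per-site difference between two sequences from the same refugium, and dXY is the same quantity for sequences drawn from different refugia. The function names and the toy sequences are hypothetical and are not taken from the study's pipeline.

```python
from itertools import combinations, product


def per_site_differences(seq_a, seq_b):
    """Proportion of differing sites, skipping sites with missing or ambiguous calls."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    diffs = 0
    for a, b in zip(seq_a, seq_b):
        if a in "ACGT" and b in "ACGT":
            compared += 1
            diffs += a != b
    return diffs / compared if compared else float("nan")


def pi_within(pop):
    """Mean pairwise diversity among sequences sampled from one refugium."""
    pairs = list(combinations(pop, 2))
    return sum(per_site_differences(a, b) for a, b in pairs) / len(pairs)


def dxy_between(pop_x, pop_y):
    """Mean pairwise divergence between sequences sampled from two refugia."""
    pairs = list(product(pop_x, pop_y))
    return sum(per_site_differences(a, b) for a, b in pairs) / len(pairs)


# Toy data (hypothetical): two haploid males per refugium.
west = ["ACGTACGTAC", "ACGTACGTAT"]
east = ["ACGAACGTAC", "ACGAACGTAC"]
print(pi_within(west), pi_within(east), dxy_between(west, east))
```

With two haploid individuals per refugium, π rests on a single pairwise comparison per population, which is why genome-wide averaging over many sites matters for these summaries.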

For each species, we conducted a full search of model space (a total of 48 full models, where Tadm < T1 < T2), which encompasses three possible orders of population divergence, four combinations of ancestral Ne, and four possible unidirectional admixture events (3 × 4 × 4 = 48), and determined the best-supported model. We also assessed the support for all simpler scenarios nested within these full models, including divergence without admixture (f=0), polytomous divergence from a single ancestral population (T1=T2), and complete panmixia across the range (T1=T2=0). To get a sense of the likely timescale of community assembly, we converted time estimates into years, using a direct mutation rate estimate for Drosophila melanogaster (29) of 3.5 × 10⁻⁹ mutations per base per generation (Table 1).
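The size of the full model space can be checked by enumeration. The sketch below is one reading of the description and of Fig. 1E, not the authors' code: it assumes that the younger ancestor (Anc1) takes the Ne of one of its two descendant refugia, that the older ancestor (Anc2) takes the Ne of either the outgroup refugium or Anc1, and that admixture always has the oldest (outgroup) population as source or sink. The dictionary keys are hypothetical.

```python
from itertools import product

REFUGIA = ("W", "C", "E")  # west (Iberia), center (Balkans), east (Iran)


def enumerate_full_models():
    """List every full model as a dict; the field names are hypothetical."""
    models = []
    for outgroup in REFUGIA:  # refugium that splits off at the older time T2
        sisters = tuple(r for r in REFUGIA if r != outgroup)
        # Assumed reading: Anc1 (ancestor of the two sisters) takes the Ne of
        # one sister; Anc2 (root) takes the Ne of the outgroup or of Anc1.
        ancestral_ne = [(anc1, anc2) for anc1 in sisters for anc2 in (outgroup, anc1)]
        # Admixture must involve the oldest population as source or as sink.
        admixture = [(outgroup, s) for s in sisters] + [(s, outgroup) for s in sisters]
        for (anc1, anc2), (source, sink) in product(ancestral_ne, admixture):
            models.append({
                "topology": f"({outgroup},({sisters[0]},{sisters[1]}))",
                "anc1_ne": anc1,
                "anc2_ne": anc2,
                "admixture": f"{source}->{sink}",
            })
    return models


models = enumerate_full_models()
print(len(models))  # 3 topologies x 4 ancestral-Ne choices x 4 admixture events = 48
```

Under these assumptions every row of Table 1 with a resolved topology corresponds to exactly one of the enumerated combinations.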

Table 1.

Maximum composite-likelihood estimates (MCLE) of demographic parameters and population topologies under the best supported model for each species

| Species binomial | Species code | Anc1 | Anc2 | Topology | NW | NC | NE | Tadm | T1 | T2 | f | So→Si |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Parasitoids | | | | | | | | | | | | |
| Cecidostiba fungosa | Cfun | West | West | East | 22.9 | 367.5 | 25.0 | 142** | 142 | 622 | 0.79 | W→E |
| Eupelmus annulatus | Eann | East | West | West | 15.5 | 82.1 | 40.4 | 0 | 41 | 131 | 0.03 | |
| Eurytoma brunniventris | Ebru | Centre | Centre | East | 1162 | 47.5 | 116.3 | 0* | 72 | 1051 | 0.91 | W→E |
| Megastigmus dorsalis | Mdor | Centre | East | East | 2.5 | 34.3 | 11.4 | 0* | 8 | 90 | 0.39 | C→E |
| Megastigmus stigmatizans | Msti | West | | Polytomy | 6.4 | 1.0 | 0.4 | | 37 | 37 | | |
| Ormyrus nitidulus | Onit | Centre | Centre | East | 3.6 | 17.9 | 10.6 | 24 | 24 | 113 | 0 | |
| Ormyrus pomaceus | Opom | East | East | West | 18.0 | 71.3 | 30.3 | 53 | 102 | 217 | 0.06 | |
| Synergus umbraculus | Sumb | Centre | West | West | 8.7 | 13.4 | 19.8 | 38** | 41 | 309 | 0.54 | C→W |
| Torymus auratus | Taur | East | | Polytomy | 1050 | 20.9 | 41.7 | | 80 | 80 | | |
| Gall wasp hosts | | | | | | | | | | | | |
| Andricus grossulariae | Agro | Centre | East | East | 1.0 | 1.9 | 13.9 | 14 | 35 | 45 | 0.09 | E→W |
| Biorhiza pallida | Bpal | East | West | West | 13.0 | 20.0 | 65.0 | 0* | 19 | 135 | 0.12 | W→E |
| Neuroterus quercusbaccarum | Neuqba | Centre | East | East | 10.3 | 20.7 | 12.2 | 153** | 154 | 1175 | 0.34 | C→E |
| Pseudoneuroterus saliens | Neusal | Centre | Centre | East | 2.4 | 35.7 | 11.0 | 60 | 66 | 474 | 0.05 | |

Divergence (T1, T2) and admixture times (Tadm) are given in thousands of years (ky), and effective population sizes (NW, NC, NE) are given in units of 10⁴ individuals. Anc1 and Anc2 indicate which current population size is shared with the younger and older ancestral populations, respectively (Fig. 1E). For Tadm, * indicates that the 95% confidence interval (CI) of the estimate for this parameter includes zero, while ** indicates that the 95% CI overlaps the 95% CI for T1. So→Si indicates the source and sink populations for the admixture event and is given only where the 95% CI for f did not overlap zero.

Species Have Expanded Across the Western Palearctic in Different Directions.

A history of directional range expansion (T1<T2) was supported for 11 of 13 species, while the two parasitoid species Torymus auratus and Megastigmus stigmatizans showed an unresolved polytomy (T1=T2). Of the 11 species showing a directional signal, 7 were inferred to have an eastern origin [i.e., an (E,(C,W)) topology], including 3 of the 4 gall wasp species and 4 of the 9 parasitoids (Fig. 2 and Table 1). Four species supported a western origin [i.e., a (W,(C,E)) topology], including the gall wasp Biorhiza pallida, the inquiline gall wasp Synergus umbraculus, and two parasitoids. A (C,(W,E)) topology was not supported for any species (Table 1). As expected, the four species supporting a western origin had more recent mean gene divergence (dXY) between the center and eastern populations than between the west and either center or east (SI Appendix, Fig. S1). While we can reject a null model under which all three orders of population divergence are equally likely (P=0.029, binomial test), this incongruence in expansion histories argues against both strict codispersal and host tracking, but is compatible with ecological fitting and with random assembly.

Fig. 2.

Community-wide patterns of range expansion across the Western Palearctic. Estimates for nine parasitoids and four gall wasp host species are shown at Top and Bottom, respectively (for species abbreviations see Table 1). Divergence times are shown in green and comprise a point estimate flanked by 95% CIs. The arrow on the older divergence time for each species shows the direction of range expansion; e.g., a right-pointing arrow indicates expansion from west to east. Note that the arrow is absent for Msti and Taur because their histories are unresolved polytomies and hence nondirectional. Where supported in the best model, admixture times are shown in black. The arrow gives the direction of admixture, again comprising a point estimate flanked by 95% CIs. CIs are arbitrary for the three species with admixture estimated at the present day. Orange vertical bars indicate interglacials. The best-fitting community-wide mixture distribution (Materials and Methods) of divergence and admixture times is shown in gray. Note that the x axis on the right-hand side is compressed relative to that on the left. Point estimates and 95% CIs are tabulated in SI Appendix, Table S6.

Admixture (f significantly >0) between refugia was inferred for seven species (indicated in black in Fig. 2). Most had an inferred eastern origin with admixture back into the east either from the center or from the west. The exceptions were the inquiline S. umbraculus, which showed the reverse pattern, a western origin with admixture back into the west from the center, and two gall wasps, B. pallida with a western origin and low admixture into the east and Andricus grossulariae with the reverse (Table 1).

The Age of Refugial Populations Differs Across Species but Not Guilds.

Population divergence times between refugia vary by over an order of magnitude across species, the oldest split (T2) ranging from 1,175 kya in the gall wasp Neuroterus quercusbaccarum and 1,051 kya in the parasitoid Eurytoma brunniventris to <50 kya in the gall wasp A. grossulariae and the parasitoid M. stigmatizans (Table 1). Given that a large number of species have nonoverlapping confidence intervals (computed using a full parametric bootstrap; Materials and Methods) for this split, we can confidently reject assembly via strict codispersal (Fig. 2). Irrespective of whether we consider all species jointly or only those with a putative eastern origin, the deepest population split (T2) is not consistently older in gall wasp hosts than in their parasitoid enemies (Fig. 2), arguing against both delayed host tracking and widespread enemy escape. We stress that our comparison of population divergence times across species does not rely on any absolute calibration but assumes only (i) knowledge of the generation times of each species (which we have; Materials and Methods and ref. 20) and (ii) an equal mutation rate across species (which seems reasonable) (see SI Appendix, Table S3 for uncalibrated parameter estimates for the top three best-supported models for each species).

Testing Ecologically Neutral Assembly.

Our rejection of the codispersal and delayed host tracking models raises the question of whether the observed variation in species’ demographic histories is nonetheless structured or whether it is compatible with a neutral model of community assembly. We address the first question by asking how many divergence events are required to explain the demographic history of this community. Considering only the oldest split in each species (T2), we explored models of codivergence (using a discretized grid of the marginal support for T2; Materials and Methods). The most parsimonious model of codivergence contained three clusters of species with indistinguishably similar divergence times (T. auratus and Megastigmus dorsalis at 82 kya, Eupelmus annulatus and B. pallida at 133 kya, and E. brunniventris and N. quercusbaccarum at 1,175 kya) and seven species that each diverged independently of all others (ΔlnL=1.8, 3 df, p=0.308; SI Appendix, Table S4).
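The reported P value is consistent with a standard likelihood-ratio test, in which twice the difference in log-likelihood between nested models is referred to a chi-square distribution with degrees of freedom equal to the difference in the number of parameters. The sketch below assumes that this is the test used; the function names are hypothetical, and the chi-square tail probability is computed from closed-form finite sums that hold for integer degrees of freedom, so only the Python standard library is needed.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with positive integer df."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{i=0}^{df/2-1} (x/2)^i / i!
        term = 1.0
        total = 1.0
        for i in range(1, df // 2):
            term *= half / i
            total += term
        return math.exp(-half) * total
    # Odd df: erfc(sqrt(x/2)) + exp(-x/2) * sum_{i=1}^{(df-1)/2} (x/2)^(i-1/2) / Gamma(i+1/2)
    total = math.erfc(math.sqrt(half))
    term = math.sqrt(half) / math.gamma(1.5)
    for i in range(1, (df - 1) // 2 + 1):
        total += math.exp(-half) * term
        term *= half / (i + 0.5)
    return total


def lrt_pvalue(delta_lnl, df):
    """P value for a log-likelihood difference between nested models."""
    return chi2_sf(2.0 * delta_lnl, df)


# Codivergence model with three clusters versus fully independent divergence times.
print(round(lrt_pvalue(1.8, 3), 3))
```

A nonsignificant result here means that merging the six species into three codivergence clusters does not fit significantly worse than allowing all 13 species their own divergence time.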

A second, and we argue more meaningful, question is whether the assembly of the oak gall community is compatible with an altogether random process. In other words, can we reject an ecologically neutral null model that does not consider trophic interactions? To test this, we fitted continuous distributions to the community-wide set of divergence (T1 and T2) and admixture (Tadm) estimates. The inference of community-wide parameters was implemented in a likelihood framework with two key features: First, we propagated the uncertainty in species-specific parameter estimates by using a discretized grid of marginal support for the T parameters and f of each species. Second, for each species with significant admixture, the contributions of the time of admixture (Tadm) and the oldest split (T2) to the community-wide distribution were weighted by its admixture fraction (i.e., f and 1 − f, respectively) (see Materials and Methods for details).
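One way to picture these two features is sketched below. This is an illustrative reading only, since the exact formulation is given in Materials and Methods: it assumes that each species contributes the community-wide density averaged over its grid of time values, weighted by normalized marginal support, and that for admixed species the log-contributions of Tadm and T2 are combined with weights f and 1 − f. Whether the weighting is applied on the log scale is an assumption, and all names and numbers are hypothetical.

```python
import math


def normalise(support):
    """Rescale grid support values so that they sum to one."""
    total = sum(support)
    return [s / total for s in support]


def marginal_likelihood(grid, support, density):
    """Average a community-wide density over a species' grid of time values."""
    weights = normalise(support)
    return sum(w * density(t) for t, w in zip(grid, weights))


def species_log_contribution(t2_grid, t2_support, density,
                             tadm_grid=None, tadm_support=None, f=0.0):
    """Log-contribution of one species; Tadm and T2 weighted by f and 1 - f."""
    lnl_t2 = math.log(marginal_likelihood(t2_grid, t2_support, density))
    if tadm_grid is None or f == 0.0:
        return lnl_t2
    lnl_tadm = math.log(marginal_likelihood(tadm_grid, tadm_support, density))
    return f * lnl_tadm + (1.0 - f) * lnl_t2


def example_density(t, rate=0.01):
    """Hypothetical community-wide density: exponential waiting time (per ky)."""
    return rate * math.exp(-rate * t)


# Hypothetical grids (ky) and unnormalized support values for one species.
t2_grid, t2_support = [100.0, 200.0, 300.0], [1.0, 2.0, 1.0]
tadm_grid, tadm_support = [10.0, 20.0, 30.0], [1.0, 1.0, 1.0]
print(species_log_contribution(t2_grid, t2_support, example_density,
                               tadm_grid, tadm_support, f=0.5))
```

Summing such contributions over species gives a community-wide log-likelihood that can be maximized over the parameters of the chosen density.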

Arguably (14), the simplest ecologically neutral model would view the timing of divergence and admixture between refugia as a simple waiting process (T parameters are drawn from an exponential distribution) or assume a single unimodal distribution (e.g., a log normal). We were able to reject both of these null models of community assembly in favor of a more complex history involving several continuous multispecies divergence pulses. The most parsimonious model assumes that, community-wide, the times of between-refuge population divergence and admixture are draws from a mixture distribution consisting of a single exponential and three log-normal distributions (Fig. 2) with modes at 13, 38, and 140 kya [ΔlnL (relative to a model with a single exponential) = 16.7, 9 df, P ≈ 0.0001; SI Appendix, Table S5].
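The shape of such a mixture can be sketched as follows. Only the three modes (13, 38, and 140 kya) come from the text; the mixture weights, the log-normal shape parameter, and the exponential rate are hypothetical placeholders, not the fitted values in SI Appendix, Table S5. The sketch uses the fact that a log-normal with parameters μ and σ has its mode at exp(μ − σ²).

```python
import math


def exponential_pdf(t, rate):
    """Density of an exponential waiting time."""
    return rate * math.exp(-rate * t) if t >= 0 else 0.0


def lognormal_pdf(t, mu, sigma):
    """Density of a log-normal distribution with log-scale mean mu and sd sigma."""
    if t <= 0:
        return 0.0
    z = (math.log(t) - mu) / sigma
    return math.exp(-0.5 * z * z) / (t * sigma * math.sqrt(2.0 * math.pi))


def mu_for_mode(mode, sigma):
    """Log-scale mean that places the log-normal mode at `mode`."""
    return math.log(mode) + sigma ** 2


def mixture_pdf(t, weights, rate, lognormals):
    """One exponential plus several log-normal components; weights must sum to one."""
    density = weights[0] * exponential_pdf(t, rate)
    for w, (mu, sigma) in zip(weights[1:], lognormals):
        density += w * lognormal_pdf(t, mu, sigma)
    return density


MODES_KY = (13.0, 38.0, 140.0)      # modes reported in the text
SIGMA = 0.2                         # hypothetical shape parameter
WEIGHTS = (0.25, 0.25, 0.25, 0.25)  # hypothetical mixture weights
RATE = 1.0 / 300.0                  # hypothetical exponential rate per ky
LOGNORMALS = [(mu_for_mode(m, SIGMA), SIGMA) for m in MODES_KY]

for t in (13.0, 38.0, 140.0, 600.0):
    print(t, mixture_pdf(t, WEIGHTS, RATE, LOGNORMALS))
```

The exponential component absorbs divergence times that do not fall into any pulse, such as the very old splits, while each log-normal component represents one multispecies pulse.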

Community-Wide Expansion Pulses Coincide with Warm Periods in the Pleistocene.

The community-wide modes of divergence and admixture times coincide approximately with warm periods in the late Pleistocene (corresponding to the beginning of the current Holocene interglacial, a Dansgaard–Oeschger interstadial event, and just before the Eemian interglacial). This suggests that, within the oak gall community, expansions into and admixture between refugia are most likely associated with periods of geographic expansion of the distributions of oaks and their associated gall communities—a process confirmed for the Eemian interglacial by fossils for oaks, gall wasps, and parasitoids (30). Although the correspondence of community-wide time estimates with Pleistocene climatic events is striking, we stress that these absolute time estimates rely on a calibration with a Drosophila mutation rate and should be interpreted with caution.
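The sensitivity of absolute dates to the calibration can be made explicit with a small worked example. It assumes that a time estimate is obtained in mutational units, τ = μ × T, where T is measured in generations, so that converting to years requires dividing by the assumed mutation rate and multiplying by the generation time. The value of τ and the generation time below are hypothetical; only the rate of 3.5 × 10⁻⁹ comes from the text.

```python
def scaled_time_to_years(tau, mu, generation_time_years):
    """Convert tau = mu * T (T in generations) into calendar years."""
    generations = tau / mu
    return generations * generation_time_years


MU_DROSOPHILA = 3.5e-9  # per base per generation, as used for Table 1
TAU = 3.5e-4            # hypothetical scaled divergence time
GENERATION_TIME = 1.0   # hypothetical: one generation per year

for factor in (0.5, 1.0, 2.0):
    years = scaled_time_to_years(TAU, MU_DROSOPHILA * factor, GENERATION_TIME)
    print(f"mutation rate x{factor}: {years / 1000:.0f} ky")
```

Because the conversion is inversely proportional to the mutation rate, a true rate twice the Drosophila value would halve every absolute date, while leaving the relative order and ratios of divergence times across species unchanged as long as the rate is shared.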

Discussion

Understanding how complex natural communities have assembled requires a retrospective approach and inference of the population histories of component species. Previous comparative analyses of the phylogeographic history of trophically linked species either have been limited in scope by the small numbers of species compared (e.g., refs. 17 and 31) and/or were based on small numbers of loci and achieved low resolution of the demographic histories of individual species (20). Here we use whole-genome data for small samples of individuals to make detailed, continental-scale reconstructions of the Pleistocene histories of 13 members of a widespread insect community: oak gall wasps and their associated parasitoid enemies.

Our systematic comparison of population histories within and between trophic levels unveils considerable complexity in the assembly of the oak gall community. This can neither be explained by classic models of deterministic community assembly which predict a small number of codivergence events (32) nor be captured by an ecologically neutral process which predicts a simple community-wide distribution of divergence times (14).

Members of the oak gall community differ in both the directionality and timing of their longitudinal expansion into Europe, without any consistent difference between gall wasp hosts and their parasitoids. We can therefore confidently reject both codivergence and enemy escape, two contrasting and paradigmatic models of community assembly. Moreover, the fact that we identify a minimum of 10 distinct divergence (or codivergence) events for the oldest split implies that we can also rule out more complex assembly scenarios that involve a small number of codivergence events, e.g., a history where all species codiverge with at least one other. Our comparative analysis also allows us to reject an ecologically neutral null model of random community assembly. This model views species as exchangeable and assumes that divergence and admixture events between refugia happen at a single community-wide rate (5, 14) and independently of past climatic events.

Ecological Fitting and Host Range.

Our finding that species differ in both the timing and the direction of range expansions supports the existence of a continental-scale geographic mosaic of species interactions (27, 28) whose structure has varied through time. For example, we infer that the seven species showing an out-of-the-east population history have shared the eastern refuge throughout their history. In contrast, interactions in the center and the west between any two of the same set of species are possible only when both have reached these refugia. Because divergence times in both guilds vary by an order of magnitude, there must have been considerable turnover in species interactions. For example, E. brunniventris, which has the most ancient out-of-the-east parasitoid population history, can only have exploited hosts whose expansion histories into the center are at least as early (such as N. quercusbaccarum). Interactions in the center with later-expanding hosts, such as A. grossulariae, could be restored only when these, too, arrived from the east. The complex temporal dynamics of species associations in this system suggest that current values of phenotypic traits associated with trophic interactions likely reflect a complex selective history, with low potential for tight coevolution between any given pair of species. Such turnover in trophic relationships may also explain why this and other communities centered around temperate insect herbivores are dominated by relatively generalist species that attack a range of hosts within the community (11, 33). We suggest that generalists are likely to have a larger fitness space (4) over host diversity than specialists, facilitating recruitment to regionally varying host assemblages through ecological fitting.

An alternative explanation for the incongruence in timing we observe across species is that the genetic signatures of older demographic events have been erased by extensive subsequent admixture in some species, but not in others. While such potential loss of signal is a general limitation of demographic inference and cannot be ruled out, it seems an unlikely explanation for our general finding of incongruence between species, given the common absence of admixture signals.

A species’ potential to expand its range is a function of its ecology and life history. A recent study identified life-history strategy as the only correlate of genetic diversity: Animals with more rapid life histories and larger numbers of offspring had higher genetic diversity (25). While this and other comparative studies of genetic diversity (26, 34) have ignored demography, comparing species with respect to their demographic histories makes it possible (at least in principle) to test for assembly rules. For example, if ecological fitting plays an important role in community assembly, more generalist and/or widespread parasitoids should have older histories than specialists. We do indeed find that T2 and ancestral Ne are positively correlated with both host range and abundance, which is compatible with this idea (SI Appendix, Fig. S2). However, confirming this relationship will require larger samples of species.

Eastern Origins and Westerly Winds.

A majority of members of the oak gall community have an eastern origin, a pattern that is compatible with the east to west declines in genetic diversity found by previous studies of this community (16, 17, 31) and Western Palearctic taxa more generally (35, 36). It also matches the inferred origin of gall wasp lineages now found across the Western Palearctic ca. 10 Mya in Anatolia/Iran (19). The longitudinal colonization history of deciduous oaks in the Western Palearctic is currently too poorly resolved to know whether gall wasps tracked their oak hosts westward into Europe (37, 38). While only a few species in the oak gall community show evidence for admixture, it is intriguing that our inference of admixture is (i) most substantial in the three species with the oldest histories, N. quercusbaccarum, Cecidostiba fungosa, and E. brunniventris (Table 1), and (ii) directed predominantly from the west into the east, that is, in the opposite direction of population divergence. We conducted a simple simulation study to confirm that this directional signal is not an artifact of our modeling approach; i.e., admixture back into the source population is not easier to detect than in the reverse direction (Materials and Methods). Inference of gene flow between refugia is compatible with the documented range expansion by both gall wasps and associated parasitoids over similar spatial scales (17, 39) and the ability of gall wasps to reach offshore islands (40). Given that both chalcid parasitoids and cynipids have been observed in aerial samples taken 200 m from the ground (41) and the prevailing winds in the Western Palearctic are from the west, it seems plausible that gene flow into the east is simply easier and therefore more likely than in the reverse direction.

Limitations and Potential Confounding Factors.

While the demographic models we consider are necessarily simplifications of the truth, they capture key population-level processes that have been largely ignored by previous comparative analyses. First, many comparative phylogeographic studies have focused on divergence between only pairs of populations (20, 42). Second, analyses of divergence and admixture between multiple populations have, for the sake of tractability, often assumed equal Ne among populations (43, 44) which can bias estimates of divergence and admixture. Our finding that Ne varies by up to two orders of magnitude among populations within species (Table 1) highlights the importance of modeling Ne parameters explicitly (45, 46). While this study’s reanalysis of B. pallida inferred little admixture when accounting for Ne differences between refugia, a previous analysis based on a single sample per population (43) suggested substantial gene flow from east to west under a model in which ancestral Ne was fixed across all populations.

To make robust inferences about community assembly, it is imperative to minimize biases that might affect comparability among species. We have sequenced species to the same coverage whenever possible and have used the same bioinformatic pipeline and comparable block lengths (in terms of diversity) for all species. To test whether selective constraint is likely to differ drastically among species or trophic levels (which could lead to systematic biases in demographic estimates), we compared diversity across all sites to that at synonymous coding sites and found no difference between hosts and parasitoids (SI Appendix, Suppl. Info 2). Finally, our parametric bootstrap replicates, which were simulated with recombination, show that recombination (which may differ between species) had minimal effects on model choice and parameter estimates.

Implications for Comparative Phylogeography.

Previous comparative studies have been unable to build such a nuanced picture of community assembly in this or any other system due to limitations in both available data and analytical approaches (20, 47, 48) [although new methods have begun to emerge (49, 50); see ref. 13 for an early successful example]. In particular, we conducted a previous analysis of the oak gall wasp community based solely on mitochondrial data sampled for the same set of refugia and an overlapping set of taxa. In contrast to the present study, this previous analysis had very limited power to resolve even simple demographic histories with or without gene flow (20).

We now have both the inference tools and the data necessary to resolve divergence and admixture relationships among multiple populations and across many trophically linked species. Given the resolution of WGS data, it becomes hard to justify the logic of testing for strict codivergence (51), if this implies identical divergence times across species. Three recent comparative phylogeographic analyses, each based on thousands of ddRAD loci, avoided using approaches that test for strict codivergence, but instead compared parameter estimates post hoc among species (52–54). For example, Oswald et al. (53) found that six pairs of codistributed birds shared the same mode of divergence (isolation with asymmetric migration) but diverged asynchronously. Our approach replaces informal post hoc comparisons with a formal comparative framework that both incorporates uncertainty in species-specific parameter estimates and characterizes community-wide distributions of divergence times.

Outlook.

Being able to efficiently reconstruct intraspecific histories from genome-scale data across multiple species means that one can infer the tempo and mode of community assembly across space and through time. The composite-likelihood framework we use applies to a wide range of demographic histories (31, 55) and, as we have shown here, lends itself to comparative analyses. Given sufficient replication in terms of species, such analyses open the door to answering a range of fundamental questions in evolution and ecology. For example, an interesting avenue of future research will be to explicitly include selection in comparative analyses. In particular, it will be fascinating to ask whether and how range shifts and admixture events are accompanied by selection on genes that may be involved in host–parasitoid interactions. It will soon be possible to use WGS data to compare not only the demographic histories of interacting species but also the signatures of selective sweeps and their targets in the genome. In practice, such detailed comparisons will require both further work on inference methods and more complete datasets, including more contiguous (and functionally annotated) reference genomes and larger samples of individuals and taxa.

Materials and Methods

Samples and Sequencing.

Sample processing.

We generated WGS data for three species of oak gall wasp hosts, eight parasitoids, and one inquiline. We also reanalyzed WGS data generated for a pilot study for B. pallida, another gall wasp species (43) (ENA Sequence Read Archive: http://www.ebi.ac.uk/ena/data/view/ERP002280). For each species we sampled a minimum of two male individuals from each of three refugial populations spanning the Western Palearctic: Iberia (west), the Balkans (center), and Iran (east) (SI Appendix, Table S2). Only a single eastern sample was available for two species, Ormyrus pomaceus and B. pallida, and a single western sample for S. umbraculus.

DNA was extracted from (haploid) male individuals, using the Qiagen DNeasy kit. Nextera libraries were generated for each individual specimen and sequenced on a HiSeq 2000 by Edinburgh Genomics. Raw reads are deposited in the European Nucleotide Archive (accession nos. PRJEB20883/ERP023079). Median average coverage per individual ranged from 3.81 to 6.59 for the parasitoids and from 2.94 to 9.33 for the gall wasps. The inquiline gall wasp S. umbraculus was sequenced to higher coverage (median 34.1) because its genome size was initially overestimated.

De novo genome assembly.

Reads were quality controlled and adapter trimmed using trimmomatic (56), followed by cutadapt (57) to remove remaining Nextera transposase sequences. Results were inspected using FastQC (58). For each species, data were combined across all individuals and a reference genome assembly generated using SPAdes (59), MaSuRCA (60), or CLC de novo assembler (SI Appendix, Table S1). Assemblies are available at the European Nucleotide Archive (accession nos. PRJEB27189/ERP109243). We used RepeatScout (61) and RepeatMasker (62) to de novo predict repeat regions. Contigs that Blast (63) matched putative contaminant (bacteria, fungi) or mitochondrial sequence after inspection of blobplots (64) were removed from further analysis.

Variant calling.

Reads per individual were mapped back to each reference genome using BWA (0.7.15-r1140) (65) and duplicates marked using picard (broadinstitute.github.io/picard/) (V2.9.0) MarkDuplicates. Variant calling was performed with freebayes (https://github.com/ekg/freebayes) (v1.1.0-3-g961e5f3) with minimum alternate allele count (-C) of 1, minimum base quality (-q) of 10, and minimum mapping quality (-m) of 20. Variants were then filtered for a minimum quality of 20 and decomposed into allelic primitives. Remaining single-nucleotide polymorphisms (SNPs) were retained for further analyses.
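As a minimal sketch, the freebayes settings listed above can be assembled into an argument list as follows. The helper name and the file names are hypothetical, the command is only built (not executed), and no options beyond the three thresholds stated in the text are implied; in freebayes, -C sets the minimum alternate allele count, -q the minimum base quality, and -m the minimum mapping quality.

```python
def freebayes_command(reference_fasta, bam_paths, min_alt_count=1,
                      min_base_quality=10, min_mapping_quality=20):
    """Build (but do not run) a freebayes argument list.

    The function name and the file names used below are illustrative
    assumptions; only the three threshold values come from the text.
    """
    cmd = [
        "freebayes",
        "-f", reference_fasta,            # reference genome (FASTA)
        "-C", str(min_alt_count),         # minimum alternate allele count
        "-q", str(min_base_quality),      # minimum base quality
        "-m", str(min_mapping_quality),   # minimum mapping quality
    ]
    cmd.extend(bam_paths)                 # one BAM file per individual
    return cmd


# Hypothetical input files, for illustration only.
example_cmd = freebayes_command("reference.fa", ["sample1.bam", "sample2.bam"])
print(" ".join(example_cmd))
```

The resulting list could be passed to a process runner; the subsequent quality filtering and decomposition into allelic primitives are separate steps that are not sketched here.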

Generating blockwise data.

The CallableLoci walker of GATK (v3.4) (66) was used to identify regions covered in all individuals and with a base quality >10 and a mapping quality >20 in the mark-duplicated bam files (these filters are the same used for SNP calling in freebayes). Blocks of a fixed length l of callable sites were generated for each species, using custom scripts. For parasitoids, we included only sites that passed the CallableLoci filter in all six individuals. Given the overall lower sequencing depth for the gall wasps, we relaxed this requirement in one of the four gall wasp hosts, Pseudoneuroterus saliens, and partitioned the data into blocks using the CallableLoci filter independently for each subsample of n=3 (see below).

Since our inference framework assumes no recombination within blocks, we partitioned the data into short blocks (and performed a simulation-based check for potential biases). To ensure equal information content and to minimize differences in bias due to recombination within blocks across species, we generated blocks with a fixed average number of 1.5 pairwise differences. In other words, the physical length of blocks l was chosen to be inversely proportional to the average pairwise per site diversity π (postfiltering) of a species and so differed between taxa.
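The relationship between block length and diversity can be illustrated with a short sketch. It assumes that the expected number of pairwise differences in a block is simply l × π, so that l = 1.5/π; the function name and the example π values are hypothetical and are not estimates from this study.

```python
def block_length(pi_per_site, target_differences=1.5):
    """Return the block length l (in callable sites) for which the expected
    number of pairwise differences, l * pi, equals the target value."""
    if pi_per_site <= 0:
        raise ValueError("per-site diversity must be positive")
    return round(target_differences / pi_per_site)


# Hypothetical per-site diversities for a high- and a low-diversity species.
print(block_length(0.005))
print(block_length(0.001))
```

Under this scheme a species with lower diversity receives proportionally longer blocks, which is why block lengths differ between taxa.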

Blocks with a physical span (prefiltering length) of >2l or containing more than five “None” sites were removed. Likewise, short contigs (<2l) were ignored. These filtering steps resulted in datasets of between 56,031 and 659,990 blocks per species. For P. saliens where each subsample was filtered independently, average block number was 78,105 blocks.
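The three filtering rules above can be expressed as a single predicate. The following is a sketch only: the `Block` record and its field names are hypothetical and do not correspond to the custom scripts used in the study.

```python
from dataclasses import dataclass


@dataclass
class Block:
    contig_length: int  # length of the contig the block lies on
    span: int           # physical (prefiltering) span of the block
    none_sites: int     # number of "None" sites within the block


def keep_block(block, l, max_none_sites=5):
    """Apply the filters described in the text for a block length l."""
    if block.contig_length < 2 * l:   # short contigs are ignored
        return False
    if block.span > 2 * l:            # block stretched over too long a span
        return False
    if block.none_sites > max_none_sites:
        return False
    return True


# Hypothetical blocks for a species with l = 300.
blocks = [Block(5000, 350, 0), Block(5000, 700, 0),
          Block(400, 320, 0), Block(5000, 350, 6)]
kept = [b for b in blocks if keep_block(b, l=300)]
print(len(kept))
```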

Fitting Demographic Models.

Computing model likelihoods.

Models were fitted to block-wise data using an analytic coalescent framework (24). Briefly, the probability of observing a particular mutational configuration in a sequence block can be expressed as a higher-order derivative of the generating function (GF) of genealogical branch lengths. This calculation assumes an infinite-sites mutation model (which is reasonable given the low per site diversity). It is straightforward to calculate the support (i.e., the logarithm of the likelihood, lnL) of a particular model, given the corresponding GF and a table of counts of observed block-wise mutational configurations.

We used an automation (23) previously implemented in Mathematica (67) to obtain the GF for the set of 48 models. Since a solution for the GF for the full sample of six individuals (two per population) is intractable, we computed the composite likelihood,

$$CL = \prod_{i}^{\#\mathrm{blocks}} \; \prod_{j \subset n:\, |j| = 3} L(\Theta \mid \underline{k}_{i,j}),$$

where $\underline{k}_{i,j}$ is the count of the mutations defined by the three unrooted branches for block i and triplet j. In other words, the composite likelihood (CL) is a product over contributions from all (14 for a sample of size n=6 we consider) possible (unrooted) subsamples of triplets and all blocks. This calculation scales to arbitrarily large samples of individuals as the GF expressions for each triplet are small. It also makes use of the phase information contained in the haploid data. Unlike previous implementations (i.e., ref. 43) there is no requirement for an outgroup. Analogous CL strategies based on subsamples have been used in phylogenetic network analysis (68) and to infer divergence and continuous gene flow (69, 70). We maximized lnCL for all parameters $(N_E, N_C, N_W, T_1, T_2, T_{adm}, f)$ included in the best-supported model, using a Nelder–Mead simplex algorithm implemented in the Mathematica function NMaximize. Configurations involving more than $k_{max}=2$ mutations on any particular branch were combined in the CL calculation. The method differs from other recent approaches for inferring reticulate evolutionary histories in that it allows Ne to differ between populations (cf. ref. 68) and makes use of linkage among sites [cf. TreeMix (71)].
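The bookkeeping behind the composite likelihood can be sketched as follows: mutation counts on the three branches are lumped above kmax, tabulated, and the log probabilities of the observed configurations are summed over blocks and triplets. This is an illustration under stated assumptions, not the authors' Mathematica implementation. In particular, the probability table below is a toy stand-in (independent branches with arbitrary probabilities), whereas in the actual method these probabilities are derived from the GF of the demographic model; representing "more than kmax" by the value kmax + 1 is also an assumption.

```python
import math
from collections import Counter
from itertools import product


def lump(config, kmax=2):
    """Cap each branch count at kmax + 1, which stands for 'more than kmax'."""
    return tuple(min(k, kmax + 1) for k in config)


def ln_composite_likelihood(observed_configs, config_prob, kmax=2):
    """Sum log probabilities over all observed (block, triplet) configurations.

    observed_configs: iterable of (k1, k2, k3) mutation counts.
    config_prob: dict mapping each lumped configuration to its probability
                 under the model (hypothetical input in this sketch).
    """
    counts = Counter(lump(c, kmax) for c in observed_configs)
    return sum(n * math.log(config_prob[c]) for c, n in counts.items())


# Toy probability table: independent branches with made-up probabilities for
# 0, 1, 2 and >2 mutations. A real table would come from the model's GF.
branch_prob = {0: 0.6, 1: 0.25, 2: 0.1, 3: 0.05}
toy_prob = {cfg: math.prod(branch_prob[k] for k in cfg)
            for cfg in product(range(4), repeat=3)}

observed = [(0, 0, 0), (1, 0, 4)]   # hypothetical block-wise counts
print(ln_composite_likelihood(observed, toy_prob))
```

Because the product over blocks and triplets becomes a sum of logarithms weighted by configuration counts, the cost of evaluating lnCL depends on the number of distinct configurations rather than on the number of blocks.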

Parametric bootstrap.

Maximizing the CL across blocks and subsamples ignores linkage between blocks but gives an unbiased point estimate of parameters. To measure the uncertainty in demographic parameter estimates we conducted a full parametric bootstrap: For each species, we simulated 100 replicate datasets in msprime (72) under the full ancestral recombination graph and the best-supported model (and the MCLEs under that model). Recombination rates per base pair (ρ) were inferred for each species, using the two-locus GF outlined in ref. 24. This inference was based on two Spanish individuals from each species. Each bootstrap replicate had the same total sequence length as the real dataset (after filtering). We assumed that blocks in the bootstrap replicates were immediately adjacent to one another, which is conservative given the filtering pipeline used to delimit blocks in the real data. For the sake of computational efficiency, each simulation replicate was divided into 50 equally sized chunks/chromosomes. The 95% CIs were obtained as 2 SDs of estimates across bootstrap replicates.
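The final step of the bootstrap can be sketched in a few lines. The sketch assumes that "2 SDs" means an interval of two sample standard deviations on either side of the point estimate; the function name and the replicate values are hypothetical, and the simulation of replicate datasets is not shown.

```python
import statistics


def bootstrap_ci(mcle, replicate_estimates, n_sd=2.0):
    """Approximate 95% CI as the point estimate +/- n_sd standard deviations
    of the estimates obtained from parametric bootstrap replicates."""
    sd = statistics.stdev(replicate_estimates)
    return mcle - n_sd * sd, mcle + n_sd * sd


# Hypothetical estimates of one parameter from five bootstrap replicates.
replicates = [1.0, 2.0, 3.0, 4.0, 5.0]
lower, upper = bootstrap_ci(3.0, replicates)
print(lower, upper)
```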

Identifying codivergence clusters.

To assess whether the oldest divergence times T2 differed significantly between species and trophic levels, we implemented a hierarchical clustering procedure. For each of the six species that overlapped in a 95% CI for T2 with any other species, we evaluated the marginal lnCL along a discretized grid (maximizing lnCL across all other parameters). We used the SD in T2 estimates across parametric bootstrap replicates to rescale these marginal lnCL. These rescaled lnCL were treated as an approximation of the true marginal support (lnL) for T2.

To test whether divergence times differed between species, we computed the support for models in which divergence times are shared by subsets of taxa. Essentially, this phrases the test for clustered divergence times as a model simplification problem: For example, the MCLE for a codivergence event involving two or more species is given by the maximum of the sum of their respective marginal lnL. The associated reduction in model support is measured by ΔlnL relative to the full model, i.e., allowing species to have unique divergence times. We identified the most parsimonious model as the model with the fewest distinct T2 clusters (assuming ΔlnL has a χ² distribution) and examined all possible combinations of codiverged species.
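The model simplification step can be illustrated for a single candidate cluster. The sketch assumes that the rescaled marginal lnL curves of all species have been evaluated on the same T2 grid; the function name and the numbers are hypothetical, and the significance test against a χ² distribution is left out.

```python
def codivergence_support(marginal_lnL):
    """Compare a shared divergence time with species-specific times.

    marginal_lnL: dict mapping species name to a list of marginal lnL values
                  evaluated on one common T2 grid (hypothetical input).
    Returns the grid index of the shared estimate and the loss in support
    (delta lnL) relative to the full model.
    """
    curves = list(marginal_lnL.values())
    n_points = len(curves[0])
    # Full model: every species takes its own best T2.
    full = sum(max(curve) for curve in curves)
    # Shared model: sum the curves pointwise and take the maximum.
    summed = [sum(curve[i] for curve in curves) for i in range(n_points)]
    best_index = max(range(n_points), key=summed.__getitem__)
    return best_index, full - summed[best_index]


# Hypothetical marginal support curves for two species on a four-point grid.
curves = {"species_A": [-5.0, -1.0, -3.0, -8.0],
          "species_B": [-9.0, -4.0, -2.5, -6.0]}
index, delta_lnL = codivergence_support(curves)
print(index, delta_lnL)
```

Repeating this calculation for every subset of species gives the ΔlnL values needed to search for the model with the fewest distinct T2 clusters.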

Characterizing the community-wide distribution of divergence and admixture times.

We fitted a series of distributions of increasing complexity to the inferred T parameters for all species. We assumed that divergence and admixture times across species are given by some community-wide function g(t;λ). The likelihood of λ, the community-wide rate of population divergence (or set of parameters in the case of more complex functions) is

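$$L(\lambda) = \prod_{k=1}^{n} \int_{f_k} \int_{T_{adm,k}} \int_{T_{1,k}} \int_{T_{2,k}} p(\Theta_k)\, f_k\, g(T_{adm,k};\lambda) \times g(T_{1,k};\lambda) \times (1-f_k)\, g(T_{2,k};\lambda) \qquad [1]$$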

where Θk is the set of time parameters and the admixture fraction for the kth species and p(Θk) is the (normalized) likelihood of Θk. Since computing Eq. 1 directly from the data is intractable, we used a discretized grid of likely T and f values (i.e., replacing the integrals in [1] by sums). Grids were centered around the MCLE and bounded by the 95% CI of parameters. To ensure that each species contributes equal information to the community-wide inference, we normalized the lnCL grid of each species assuming a total of 2,000 unlinked blocks. Although admittedly arbitrary, this is very conservative: If we assume that blocks are randomly distributed in the genome, this normalization corresponds to an average distance of at least 150 kb between blocks (assuming a minimum genome size of 300 Mb). In each dimension, we evaluated five points. For computational tractability, we fixed θ and the Ne scalars at their MCLEs.
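The discretization can be sketched for a deliberately simplified case: a single time parameter per species and an exponential g(t;λ). The full calculation in Eq. 1 also involves the admixture fraction and the other time parameters, which are omitted here. All function names, grid values, and candidate rates are hypothetical.

```python
import math


def exponential_density(t, rate):
    """Exponential g(t; lambda) with rate lambda."""
    return rate * math.exp(-rate * t)


def ln_community_likelihood(rate, species_grids):
    """Discretized log likelihood of a community-wide rate.

    species_grids: one list per species of (T, weight) pairs, where the
    weights are that species' likelihood on its grid (normalized here).
    The integral over T is replaced by a sum over grid points.
    """
    total = 0.0
    for grid in species_grids:
        weight_sum = sum(w for _, w in grid)
        total += math.log(sum((w / weight_sum) * exponential_density(t, rate)
                              for t, w in grid))
    return total


def best_rate(species_grids, candidate_rates):
    """Pick the candidate rate with the highest discretized likelihood."""
    return max(candidate_rates,
               key=lambda r: ln_community_likelihood(r, species_grids))


# Two hypothetical species, five grid points each (arbitrary time units).
grids = [
    [(0.5, 0.05), (1.0, 0.2), (1.5, 0.5), (2.0, 0.2), (2.5, 0.05)],
    [(2.0, 0.1), (2.5, 0.2), (3.0, 0.4), (3.5, 0.2), (4.0, 0.1)],
]
print(best_rate(grids, [0.1, 0.25, 0.5, 1.0, 2.0]))
```

More complex community-wide functions, such as mixtures of an exponential and log-normal components, would replace `exponential_density` while leaving the summation over grid points unchanged.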

We initially fitted exponential and log-normal distributions, followed by increasingly complex mixture distributions of a single exponential and several log normals. Models were fitted sequentially: As we included additional mixture contributions, we fixed the parameters and mixture weights of the lowest weighted component of the previous distribution. We identified the most parsimonious model as the simplest model (i.e., with the fewest parameters) which did not give a significant reduction in lnL. The python scripts and Mathematica notebooks are available at https://github.com/KLohse/BunnefeldEtAL_2018.

Supplementary Material

Supplementary File
pnas.1800334115.sapp.pdf (391.3KB, pdf)

Acknowledgments

We thank Richard Harrison, James Nicholls, Sarah White, and Lisa Cooper for help in the laboratory and Edinburgh Genomics for sequencing. We give many thanks to Stuart Baird, Alex Hayward, and Mike Hickerson for discussions throughout the project and to Ally Phillimore for a critical manuscript review. We are grateful to Rob Ness for advice on the bioinformatics; to Pablo Fuentes-Utrilla, Győrgy Csóka, George Melika, and Majide Tavakoli for contributing specimens; to Richard R. Askew for help with identification; and to William Walton for collating the ecological data. This work was supported by a standard grant from the Natural Environment Research Council (NERC) United Kingdom (to G.N.S. and K.L.) (NE/J010499/1). K.L. is supported by a NERC Independent Research fellowship (NE/L011522/1).

Footnotes

The authors declare no conflict of interest.

This article is a PNAS Direct Submission. C.M. is a guest editor invited by the Editorial Board.

Data deposition: Raw reads have been deposited in the European Nucleotide Archive (ENA) (accession no. ERP023079) and the Short Read Archive (accession no. PRJEB20883). Genome assemblies are deposited in the ENA (accession nos. PRJEB27189 and ERP109243). Python and Mathematica code can be found at https://github.com/KLohse/BunnefeldEtAL_2018.

This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1800334115/-/DCSupplemental.

References

  • 1. Vellend M. Conceptual synthesis in community ecology. Q Rev Biol. 2010;85:183–206. doi: 10.1086/652373.
  • 2. Janzen DH. On ecological fitting. Oikos. 1985;45:308–310.
  • 3. Ricklefs RE. Intrinsic dynamics of the regional community. Ecol Lett. 2015;18:497–503. doi: 10.1111/ele.12431.
  • 4. Agosta SJ, Klemens JA. Ecological fitting by phenotypically flexible genotypes: Implications for species associations, community assembly and evolution. Ecol Lett. 2008;11:1123–1134. doi: 10.1111/j.1461-0248.2008.01237.x.
  • 5. Hubbell SP. The Unified Neutral Theory of Biodiversity and Biogeography, Princeton Monographs in Population Biology. Princeton Univ Press; Princeton: 2001.
  • 6. Jousselin E, Rasplus JY, Kjellberg F. Convergence and coevolution in a mutualism: Evidence from a molecular phylogeny of ficus. Evolution. 2003;57:1255–1269. doi: 10.1554/02-445.
  • 7. Mueller UG, Rehner SA, Schultz TR. The evolution of agriculture in ants. Science. 1998;281:2034–2038. doi: 10.1126/science.281.5385.2034.
  • 8. Stone GN, Schönrogge K, Atkinson RJ, Bellido D, Pujade-Villar J. The population biology of oak gall wasps (hymenoptera: Cynipidae). Annu Rev Entomol. 2002;47:633–668. doi: 10.1146/annurev.ento.47.091201.145247.
  • 9. Leppänen SA, Altenhofer E, Liston AD, Nyman T. Ecological versus phylogenetic determinants of trophic associations in a plant-leafminer-parasitoid food web. Evolution. 2013;67:1493–1502. doi: 10.1111/evo.12028.
  • 10. Bailey R, et al. Host niches and defensive extended phenotypes structure parasitoid wasp communities. PLoS Biol. 2009;7:e1000179. doi: 10.1371/journal.pbio.1000179.
  • 11. Askew RR. The diversity of insect communities in leaf mines and plant galls. J Anim Ecol. 1980;49:145–152.
  • 12. Stireman JO, Singer MS. Determinants of parasitoid-host associations: Insights from a natural tachinid-lepidopteran community. Ecology. 2003;84:296–310.
  • 13. Smith CI, Godsoe WKW, Tank S, Yoder JB, Pellmyr O. Distinguishing coevolution from covariance in an obligate pollination mutualism: Asynchronous divergence in Joshua tree and its pollinators. Evolution. 2008;62:2676–2687. doi: 10.1111/j.1558-5646.2008.00500.x.
  • 14. Rosindell J, Hubbell SP, Etienne RS. The unified neutral theory of biodiversity and biogeography at age ten. Trends Ecol Evol. 2011;26:340–348. doi: 10.1016/j.tree.2011.03.024.
  • 15. Lohse K, Sharanowski B, Stone GN. Quantifying the population history of the oak gall parasitoid Cecidostiba fungosa. Evolution. 2010;58:439–442. doi: 10.1111/j.1558-5646.2010.01024.x.
  • 16. Stone GN, et al. The phylogeographical clade trade: Tracing the impact of human-mediated dispersal on the colonization of northern Europe by the oak gallwasp Andricus kollari. Mol Ecol. 2007;16:2768–2781. doi: 10.1111/j.1365-294X.2007.03348.x.
  • 17. Hayward A, Stone GN. Comparative phylogeography across two trophic levels: The oak gall wasp Andricus kollari and its chalcid parasitoid Megastigmus stigmatizans. Mol Ecol. 2006;15:479–489. doi: 10.1111/j.1365-294X.2005.02811.x.
  • 18. Nicholls JA, et al. Concordant phylogeography and cryptic speciation in two western Palaearctic oak gall parasitoid species complexes. Mol Ecol. 2010;19:592–609. doi: 10.1111/j.1365-294X.2009.04499.x.
  • 19. Stone GN, et al. Extreme host plant conservatism during at least 20 million years of host plant pursuit by oak gallwasps. Evolution. 2009;63:854–869. doi: 10.1111/j.1558-5646.2008.00604.x.
  • 20. Stone GN, et al. Reconstructing community assembly in time and space reveals enemy escape in a western palaearctic insect community. Curr Biol. 2012;22:531–537. doi: 10.1016/j.cub.2012.01.059.
  • 21. Prior KM, Hellmann JJ. Does enemy loss cause release? A biogeographical comparison of parasitoid effects on an introduced insect. Ecology. 2013;94:1015–1024. doi: 10.1890/12-1710.1.
  • 22. Vosteen I, Gershenzon J, Kunert G. Enemy-free space promotes maintenance of host races in an aphid species. Oecologia. 2016;181:659–672. doi: 10.1007/s00442-015-3469-1.
  • 23. Lohse K, Chmelik M, Martin SH, Barton NH. Efficient strategies for calculating blockwise likelihoods under the coalescent. Genetics. 2016;202:775–786. doi: 10.1534/genetics.115.183814.
  • 24. Lohse K, Harrison RJ, Barton NH. A general method for calculating likelihoods under the coalescent process. Genetics. 2011;58:977–987. doi: 10.1534/genetics.111.129569.
  • 25. Romiguier J, et al. Comparative population genomics in animals uncovers the determinants of genetic diversity. Nature. 2014;515:261–263. doi: 10.1038/nature13685.
  • 26. Leffler EM, et al. Revisiting an old riddle: What determines genetic diversity levels within species? PLoS Biol. 2012;10:1–9. doi: 10.1371/journal.pbio.1001388.
  • 27. Thompson J. The Geographic Mosaic of Coevolution. Univ of Chicago Press; Chicago: 2005.
  • 28. Hoberg EP, Brooks DR. A macroevolutionary mosaic: Episodic host-switching, geographical colonization and diversification in complex host–parasite systems. J Biogeogr. 2008;35:1533–1550.
  • 29. Keightley PD, et al. Analysis of the genome sequences of three Drosophila melanogaster spontaneous mutation accumulation lines. Genome Res. 2009;19:1195–1201. doi: 10.1101/gr.091231.109.
  • 30. Stone GN, van der Ham RWJM, Brewer JG. Fossil oak galls preserve ancient multitrophic interactions. Proc R Soc B Biol Sci. 2008;275:2213–2219. doi: 10.1098/rspb.2008.0494.
  • 31. Lohse K, Barton NH, Melika N, Stone GN. A likelihood-based comparison of population histories in a parasitoid guild. Mol Ecol. 2012;49:832–842. doi: 10.1111/j.1365-294X.2012.05700.x.
  • 32. Ricklefs RE. Disintegration of the ecological community. Am Nat. 2008;172:741–750. doi: 10.1086/593002.
  • 33. Askew RR. On the biology of the inhabitants of oak galls of Cynipidae (Hymenoptera) in Britain. Trans Soc Br Entomol. 1961;14:237–258.
  • 34. Corbett-Detig RB, Hartl DL, Sackton TB. Natural selection constrains neutral diversity across a wide range of species. PLoS Biol. 2015;13:1–25. doi: 10.1371/journal.pbio.1002112.
  • 35. Koch MA, et al. Three times out of Asia minor: The phylogeography of Arabis alpina l. (Brassicaceae). Mol Ecol. 2006;15:825–839. doi: 10.1111/j.1365-294X.2005.02848.x.
  • 36. Duvaux L, Belkhir K, Boulesteix M, Boursot P. Isolation and gene flow: Inferring the speciation history of European house mice. Mol Ecol. 2011;20:5248–5264. doi: 10.1111/j.1365-294X.2011.05343.x.
  • 37. Petit RJ, et al. Chloroplast DNA variation in European white oaks: Phylogeography and patterns of diversity based on data from over 2600 populations. For Ecol Manag. 2002;156:5–26.
  • 38. Atkinson R, Rokas A, Stone G. Longitudinal patterns in species richness and genetic diversity in European oaks and oak gallwasps. In: Weiss S, editor. Phylogeography in Southern European Refugia. Springer; Dordrecht, The Netherlands: 2007. pp. 127–154.
  • 39. Stone GN, Sunnucks P. Genetic consequences of an invasion through a patchy environment–The cynipid gallwasp Andricus quercuscalicis (Hymenoptera: Cynipidae). Mol Ecol. 1993;2:251–268.
  • 40. Walker P, Leather SR, Crawley MJ. Differential rates of invasion in three related alien oak gall wasps (cynipidae: Hymenoptera). Divers Distrib. 2002;8:335–349.
  • 41. Chapman J, Reynolds D, Smith A, Smith E, Woiwod I. An aerial netting study of insects migrating at high altitude over England. Bull Entomol Res. 2004;94:123–136. doi: 10.1079/ber2004287.
  • 42. Hickerson MJ, Stahl EA, Lessios HA, Crandall K. Test for simultaneous divergence using approximate Bayesian computation. Evolution. 2006;60:2435–2453.
  • 43. Hearn J, et al. Likelihood-based inference of population history from low-coverage de novo genome assemblies. Mol Ecol. 2014;23:198–211. doi: 10.1111/mec.12578.
  • 44. Yu Y, Dong J, Liu KJ, Nakhleh L. Maximum likelihood inference of reticulate evolutionary histories. Proc Natl Acad Sci USA. 2014;111:16448–16453. doi: 10.1073/pnas.1407950111.
  • 45. Marko PB, Hart MW. The complex analytical landscape of gene flow inference. Trends Ecol Evol. 2011;26:448–456. doi: 10.1016/j.tree.2011.05.007.
  • 46. Hung CM, Drovetski SV, Zink RM. The roles of ecology, behaviour and effective population size in the evolution of a community. Mol Ecol. 2017;26:3775–3784. doi: 10.1111/mec.14152.
  • 47. Sutton TL, Riegler M, Cook JM. One step ahead: A parasitoid disperses farther and forms a wider geographic population than its fig wasp host. Mol Ecol. 2016;25:882–894. doi: 10.1111/mec.13445.
  • 48. Espíndola A, Carstens BC, Alvarez N. Comparative phylogeography of mutualists and the effect of the host on the genetic structure of its partners. Biol J Linn Soc. 2014;113:1021–1035.
  • 49. Beeravolu Reddy C, Hickerson MJ, Frantz LAF, Lohse K. 2017. Blockwise site frequency spectra for inferring complex population histories and recombination. bioRxiv:077958. Preprint, posted September 25, 2017.
  • 50. Xue AT, Hickerson MJ. The aggregate site frequency spectrum for comparative population genomic inference. Mol Ecol. 2015;24:6223–6240. doi: 10.1111/mec.13447.
  • 51. Papadopoulou A, Knowles LL. Toward a paradigm shift in comparative phylogeography driven by trait-based hypotheses. Proc Natl Acad Sci USA. 2016;113:8018–8024. doi: 10.1073/pnas.1601069113.
  • 52. Satler JD, Carstens BC. Do ecological communities disperse across biogeographic barriers as a unit? Mol Ecol. 2017;26:3533–3545. doi: 10.1111/mec.14137.
  • 53. Oswald JA, Overcast I, Mauck WM, Andersen MJ, Smith BT. Isolation with asymmetric gene flow during the nonsynchronous divergence of dry forest birds. Mol Ecol. 2017;26:1386–1400. doi: 10.1111/mec.14013.
  • 54. Rougemont Q, et al. Inferring the demographic history underlying parallel genomic divergence among pairs of parasitic and nonparasitic lamprey ecotypes. Mol Ecol. 2017;26:142–162. doi: 10.1111/mec.13664.
  • 55. Bunnefeld L, Frantz LAF, Lohse K. Inferring bottlenecks from genome-wide samples of short sequence blocks. Genetics. 2015;201:1157–1169. doi: 10.1534/genetics.115.179861.
  • 56. Bolger AM, Lohse M, Usadel B. Trimmomatic: A flexible trimmer for Illumina sequence data. Bioinformatics. 2014;30:2114–2120. doi: 10.1093/bioinformatics/btu170.
  • 57. Martin M. Cutadapt removes adapter sequences from high-throughput sequencing reads. EMBnet J. 2011;17:10.
  • 58. Andrews S. 2010 FastQC: A quality control application for high throughput sequence data. Available at www.bioinformatics.babraham.ac.uk/projects/fastqc. Accessed December 12, 2013.
  • 59. Bankevich A, et al. SPAdes: A new genome assembly algorithm and its applications to single-cell sequencing. J Comput Biol. 2012;19:455–477. doi: 10.1089/cmb.2012.0021.
  • 60. Zimin AV, et al. The MaSuRCA genome assembler. Bioinformatics. 2013;29:2669–2677. doi: 10.1093/bioinformatics/btt476.
  • 61. Price AL, Jones NC, Pevzner PA. De novo identification of repeat families in large genomes. Bioinformatics. 2005;21:i351–i358. doi: 10.1093/bioinformatics/bti1018.
  • 62. Tarailo-Graovac M, Chen N. Using RepeatMasker to identify repetitive elements in genomic sequences. Curr Protoc Bioinformatics. 2009;Chap 4:Unit 4.10. doi: 10.1002/0471250953.bi0410s25.
  • 63. Altschul SF, et al. Gapped BLAST and PSI-BLAST: A new generation of protein database search programs. Nucleic Acids Res. 1997;25:3389–3402. doi: 10.1093/nar/25.17.3389.
  • 64. Kumar S, Jones M, Koutsovoulos G, Clarke M, Blaxter M. Blobology: Exploring raw genome data for contaminants, symbionts and parasites using taxon-annotated gc-coverage plots. Front Genet. 2013;4:237. doi: 10.3389/fgene.2013.00237.
  • 65. Li H, Durbin R. Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics. 2009;25:1754–1760. doi: 10.1093/bioinformatics/btp324.
  • 66. Auwera GAVD, et al. From FastQ data to high confidence variant calls: The genome analysis Toolkit best practices pipeline. Curr Protoc Bioinformatics. 2013;43:11.10.1–11.10.33. doi: 10.1002/0471250953.bi1110s43.
  • 67. Wolfram Research, Inc. Mathematica, Version 11.1.1.0. Wolfram Research, Inc.; Champaign, IL: 2017.
  • 68. Than C, Ruths D, Nakhleh L. Phylonet: A software package for analyzing and reconstructing reticulate evolutionary relationships. BMC Bioinformatics. 2008;9:322. doi: 10.1186/1471-2105-9-322.
  • 69. Costa RJ, Wilkinson-Herbots H. Inference of gene flow in the process of speciation: An efficient maximum-likelihood method for the isolation-with-initial-migration model. Genetics. 2017;205:1597–1618. doi: 10.1534/genetics.116.188060.
  • 70. Yang Z. A likelihood ratio test of speciation with gene flow using genomic data. Genome Biol Evol. 2010;2:200–211. doi: 10.1093/gbe/evq011.
  • 71. Pickrell JK, Pritchard JK. Inference of population splits and mixtures from genome-wide allele frequency data. PLoS Genet. 2012;8:1–17. doi: 10.1371/journal.pgen.1002967.
  • 72. Kelleher J, Etheridge AM, McVean G. Efficient coalescent simulation and genealogical analysis for large sample sizes. PLoS Comput Biol. 2016;12:1–22. doi: 10.1371/journal.pcbi.1004842.

Articles from Proceedings of the National Academy of Sciences of the United States of America are provided here courtesy of National Academy of Sciences
