PLOS ONE. 2010 Jan 19;5(1):e8655. doi: 10.1371/journal.pone.0008655

Genetic Variation at Nuclear Loci Fails to Distinguish Two Morphologically Distinct Species of Aquilegia

Elizabeth A Cooper 1,*, Justen B Whittall 2, Scott A Hodges 3, Magnus Nordborg 4
Editor: Simon Joly 5
PMCID: PMC2808223  PMID: 20098727

Abstract

Aquilegia formosa and A. pubescens are two closely related species belonging to the columbine genus. Despite their morphological and ecological differences, previous studies have revealed a large degree of intercompatibility, as well as little sequence divergence between these two taxa [1], [2]. We compared the inter- and intraspecific patterns of variation for 9 nuclear loci, and found that the two species were practically indistinguishable at the level of DNA sequence polymorphism, indicating either very recent speciation or continued gene flow. As a comparison, we also analyzed variation at two loci across 30 other Aquilegia taxa; this revealed slightly more differentiation among taxa, which seemed best explained by geographic distance. By contrast, we found no evidence for isolation by distance on a more local geographic scale. We conclude that the extremely low levels of genetic differentiation between A. formosa and A. pubescens at neutral loci will facilitate future genome-wide scans for speciation genes.

Introduction

The genetic mechanisms underlying the process of speciation are of critical interest to evolutionary biologists. In order to unravel this process, it is necessary to both identify the genes responsible for existing reproductive barriers and to consider what demographic and selective forces have shaped these traits. In particular, many recent studies have focused on the role of gene flow during the speciation process [3]–[10], even though the more traditional (allopatric) view of speciation posits that genetic exchange must be rare in order for species to remain distinct [11]. These studies have shown that adaptive differences between species can be maintained even in the face of significant amounts of introgression, especially if only a few genes or genomic regions control the traits that lead to reproductive isolation [4]. Genome-wide analyses of many species have shown that levels of introgression can vary across the genome, with divergent selection playing an active role in preventing gene flow at the loci underlying adaptive traits, but not acting at other areas in the genome [3], [4], [12]. Incipient species will also show varying levels of differentiation across the genome, with the most differentiated regions also being the most likely to contain genes that restrict random mating [3], [7]. These species can appear almost identical at many loci, even in the complete absence of genetic exchange.

Whether or not gene flow is a factor, closely related taxa offer an excellent opportunity to study the genetic changes and processes that lead to reproductive isolation, since genome-wide scans should be able to pinpoint loci with higher levels of differentiation, and these loci are most likely to be under the influence of natural selection [3], [13]–[15]. While the identification of potential speciation genes will not definitively prove a particular speciation model, comparing the pattern of variation in these loci with the pattern of shared variation in neutral loci will provide much more insight into the question of whether or not two species have diverged in the face of gene flow [3], [7]. Thus, it is especially important to identify pairs or groups of species that maintain high levels of shared polymorphism over much of their genomes.

The columbine genus Aquilegia (Ranunculaceae) is an excellent example of a recent, rapid adaptive radiation [1], and thus should provide an opportunity to identify the genetic changes important for speciation. The genus comprises approximately 70 outcrossing species that occupy a wide variety of habitats in North America, Europe, and Asia [16] and that differ substantially in floral morphology [16], [17]. Despite these differences, species are usually cross-compatible [18], [19].

Two species, Aquilegia formosa and A. pubescens, have long been studied for the purpose of understanding the factors controlling reproductive isolation between them [20]–[23]. A. formosa is found throughout mountainous regions of western North America while A. pubescens is restricted to the southern Sierra Nevada range [22]. The species exhibit distinct differences in floral characters that have been shown to influence pollinator preference, thereby restricting gene flow between them [22], [23] (Figure 1). Additionally, they prefer different habitats: A. formosa populations typically occur in moist areas with well-developed soils at lower elevations (below 3,000 m), whereas A. pubescens populations are found in drier, poorly developed soils at higher elevations (3,000–4,000 m) [20], [21], [24]. However, the two species are highly interfertile, and form natural hybrid zones at mid elevations where the two habitats co-occur [22]. Molecular markers exhibit more introgression than morphological characters near these zones, suggesting that gene flow could be extensive between these species for neutral markers [22].

Figure 1. Striking differences in floral morphology between (A) the hummingbird pollinated Aquilegia formosa and (B) the hawkmoth pollinated Aquilegia pubescens.


Previous studies have uncovered limited DNA sequence variation between A. formosa and A. pubescens in both chloroplast and nuclear sequences [1], [2]. However, these previous studies either showed low sequence variation across a wide range of Aquilegia species [1] or sampled few individuals [2], and therefore do not address the degree of genetic differentiation between these species. Other studies suggest that intraspecific sequence variation may be quite similar in A. formosa and A. pubescens and thus that they may be especially useful for identifying speciation genes. For instance, microsatellite loci have similar numbers of alleles and size ranges [25], and another study including over 850 AFLP markers polymorphic in a small sample of both species found only one marker that showed complete differentiation [17]. Because these previous studies did not assess variation at the DNA sequence level or use relatively large population samples, we sought to gain insight into the inter- and intraspecific patterns of genetic variation in these species by sequencing nine nuclear loci from a total of 80 individuals from several populations. As a comparison, we also assessed variation among all of the North American species in the genus (plus some Eurasian taxa) by sequencing two of the nine nuclear loci. By examining loci that are not believed to be involved in the maintenance of reproductive isolation, we sought to assess the potential of using genome-wide scans for speciation genes in these species by determining levels of neutral variation and population structure.

Results

Polymorphism Levels and Linkage Disequilibrium

The counts of segregating sites found in each fragment are given in Table 1. Estimates of both $\theta_W$ and $\theta_\pi$ generally fell in the range of 0.004 to 0.006 per base pair (Figure 2). Overall, these estimates are slightly lower than estimates of $\theta$ in other outcrossing plant species such as maize (Inline graphic) [26] and sunflowers (Inline graphic) [27], similar to estimates in the model species Arabidopsis thaliana [28], and higher than estimates found in soybeans (Inline graphic) [29]. However, values of $\theta_W$ and $\theta_\pi$ varied across the 9 fragments, and one of them, UF3GT, was substantially more polymorphic than the others, with a value of $\theta$ between 0.01 and 0.02 (Figure 2). The estimates of $\theta$ for all fragments are strikingly correlated across species; as we shall demonstrate in the next section, this is because almost all variation is shared.

Table 1. Polymorphism Counts for Each Fragment

For each count in bold, the number of sites represents the number of SNPs plus the number of indels treated as single SNPs. Numbers in parentheses represent the number of sites with a Minor Allele Frequency (MAF) Inline graphic5% and Inline graphic10%, respectively.

Fragment  Total segregating sites  Indels  A. formosa exclusive  A. pubescens exclusive  Shared        Fixed differences
Acetyl    5 (2, 2)                 0       1 (0, 0)              2 (0, 0)                2 (2, 2)      0
DEFEN     20 (8, 6)                3       9 (1, 0)              4 (0, 0)                7 (7, 6)      0
Gapc      44 (14, 8)               14      18 (1, 0)             11 (0, 0)               15 (13, 8)    0
H3        18 (7, 2)                2       5 (0, 0)              5 (1, 0)                8 (6, 2)      0
Heat      20 (7, 4)                1       7 (0, 0)              9 (4, 1)                4 (3, 3)      0
AP3       29 (16, 12)              10      10 (4, 3)             11 (4, 2)               8 (8, 7)      0
LFY       12 (4, 2)                4       6 (2, 0)              3 (0, 0)                3 (2, 2)      0
Pist      15 (8, 4)                2       10 (4, 1)             2 (1, 0)                3 (3, 3)      0
UF3GT     34 (19, 16)              4       9 (2, 0)              6 (0, 0)                19 (17, 16)   0

Figure 2. Variation among levels of polymorphism for each species.


The sequences of each of the 9 fragments were grouped according to species, and $\theta_W$ and $\theta_\pi$ were estimated separately for each group. Grey bars represent estimates for A. formosa, and black bars represent A. pubescens.

Linkage disequilibrium (LD) was not extensive in any of the 9 regions that were sequenced, with average $r^2$ values ranging between 0.1 and 0.2 for most of the fragments. When values of $r^2$ are plotted against physical distance between SNPs, the relationship is weak (Figure S7). The fragments with the highest levels of polymorphism show evidence for a rapid decay of LD (within about 1 kb or less). The combined fragment data show low LD values overall, and our estimate of $\rho$ was 0.009, which is higher than estimates of $\rho$ in humans [30], suggesting a relatively high rate of recombination in these species (Table 2).

Table 2. Estimates of recombination rate for each fragment.

Fragment Name  Inline graphic  $\rho$
Acetyl         1               0.081
AP3            11              0.003
Defen          8               0.133
Gapc           12              0.004
H3             3               0.00
Heat           6               0.126
LFY            4               0.006
Pist           4               0.001
UF3GT          15              0.023
Combined                       0.009

Genetic Differentiation

When we compared the minor allele counts for each species, we found that few high frequency SNPs corresponded to species-specific polymorphisms (Figure 3 and Table 1). In fact, at sites with a minor allele frequency Inline graphic5%, there were more than twice as many shared polymorphisms (61) as species-specific polymorphisms (24). We found no fixed differences in any of the 9 sequences.
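The totals quoted above can be checked directly against Table 1. The short sketch below sums the first value in each parenthesis (the counts at the 5% minor allele frequency threshold) across the nine fragments; the dictionary layout and the function name `tally` are illustrative choices, not part of the original analysis.

```python
# Counts from Table 1 at the 5% MAF threshold, per fragment:
# (A. formosa exclusive, A. pubescens exclusive, shared)
TABLE1_MAF5 = {
    "Acetyl": (0, 0, 2),
    "DEFEN": (1, 0, 7),
    "Gapc": (1, 0, 13),
    "H3": (0, 1, 6),
    "Heat": (0, 4, 3),
    "AP3": (4, 4, 8),
    "LFY": (2, 0, 2),
    "Pist": (4, 1, 3),
    "UF3GT": (2, 0, 17),
}


def tally(table):
    """Return (shared, species-specific) totals summed over all fragments."""
    formosa = sum(row[0] for row in table.values())
    pubescens = sum(row[1] for row in table.values())
    shared = sum(row[2] for row in table.values())
    return shared, formosa + pubescens


shared, species_specific = tally(TABLE1_MAF5)
print(shared, species_specific)  # 61 24
```

The species-specific total of 24 splits into 14 sites exclusive to A. formosa and 10 exclusive to A. pubescens.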

Figure 3. Comparison of minor allele counts in A. formosa and A. pubescens.


The horizontal line represents the mean allele count in A. pubescens, while the vertical line represents the mean allele count in A. formosa. Point size reflects the number of comparisons at that point. Species-specific polymorphisms correspond to the points along either the very bottom or the far left of the plot. All other sites correspond to a shared polymorphism. There are no fixed differences. The average minor allele frequency for any species-specific polymorphism was 0.105, while the average frequency for any shared allele was 0.424.

STRUCTURE was unable to cluster individuals according to species under the naïve assumption of K = 2 (Figure S1). The most likely number of clusters appeared to be around 11 (Figure S2), based on when the estimated probability and the average clusteredness stopped (consistently) increasing (Figures S3 and S4). Although the pattern of clustering does not correspond perfectly to the sample populations, it does not seem to be entirely random, especially among the more well-defined clusters (where individuals tend to have membership coefficients Inline graphic0.5). We found that pairs of individuals from the same population tended to cluster together Inline graphic15% of the time, whereas pairs of individuals from different populations only clustered together Inline graphic9.5% of the time (Inline graphic in Inline graphic test). Similarly, Inline graphic11% of same-species pairs were found in the same cluster, whereas only Inline graphic8% of different-species pairs were clustered together (Inline graphic).

Average $F_{ST}$ between the two species was approximately Inline graphic (with 95% C.I. between Inline graphic and Inline graphic), which is low, but statistically different from zero. When populations were randomly assigned to 2 groups (regardless of species), we achieved very similar results: a mean $F_{ST}$ of Inline graphic with a 95% C.I. between Inline graphic and Inline graphic. Although these estimates are technically statistically different, they do not suggest that much of the observed differentiation is due to species differences.

In order to determine whether such a high degree of shared polymorphism was common in the Aquilegia genus or unique to A. formosa and A. pubescens, we also calculated $F_{ST}$ in a broader sample of 32 taxa using two gene regions (Gapc and UF3GT). We estimated $F_{ST}$ between pairs of populations and obtained a mean estimate of 0.247. Because the sample of 32 taxa encompassed a broader geographical range, we tested the relationship between geographic distance and genetic differentiation across pairs of populations (Figure 4). Results of the Mantel test indicated that geographic distance had a significant relationship with genetic differentiation within the Aquilegia genus (Inline graphic, Inline graphic), more so than any of the other factors we examined (Figure S5). However, on a more local scale, we do not find evidence for isolation by distance either within or between A. formosa and A. pubescens populations (Inline graphic, Inline graphic) (Figure 5).

Figure 4. Relationship between geographic distance and genetic distance.


Each dot represents a comparison between 2 populations of at least 5 individuals. For populations where there were more than 5 individuals, estimates of $F_{ST}$ were bootstrapped to ensure that the larger sample size did not cause any bias in the estimate.

Figure 5. Relationship between geographic distance and genetic distance for A. formosa and A. pubescens only.


Black squares represent comparisons between A. formosa and A. pubescens populations; gray triangles are comparisons among populations of A. formosa; white circles are comparisons among populations of A. pubescens.

Isolation-Migration

When MIMAR was run with the migration rate fixed at Inline graphic, the time since the split between A. formosa and A. pubescens is estimated as approximately 0.062 in coalescent time units (Figure S6). If we assume the mutation rate to be Inline graphic, then this is equivalent to 55,784.5 generations. If the actual mutation rate in Aquilegia is higher than we assumed, then the estimated number of generations since the split will be lower, and if the actual mutation rate is lower, then the number of generations will be higher. The generation time in Aquilegia is not known, but a very rough estimate can be calculated as 10 years, based on the observation that the plants seem to produce seeds in the wild for about 20 years. If we assume the generation time is around 10 years, then the MIMAR results suggest that A. formosa and A. pubescens diverged 557,845 years ago. When migration was incorporated into the model, the estimate for the time since the split rose slightly (to 660,860 years). Both of these estimates seem reasonable, given that the diversification of the North American Aquilegia clade is believed to have occurred less than 2 million years ago [31].
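The conversion from generations to years above is a single multiplication, and writing it out makes the dependence on the assumed generation time explicit. The symbols $T_{\text{gen}}$, $T_{\text{years}}$, and $g$ are labels introduced here for illustration only, and the generation count for the migration model is back-calculated from the reported years under the same assumed 10-year generation time.

```latex
\begin{align*}
T_{\text{years}} &= T_{\text{gen}} \times g \\
\text{no migration:}\quad 55{,}784.5 \times 10 &= 557{,}845 \text{ years} \\
\text{with migration:}\quad 660{,}860 \,/\, 10 &= 66{,}086 \text{ generations}
\end{align*}
```

Because the relationship is linear, halving the assumed generation time to 5 years would halve both divergence estimates.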

Although we obtained believable estimates of the divergence time, MIMAR was not able to converge on an estimate for the migration rate, despite the fact that the model seemed to be mixing well and the estimates of $\theta$ corresponded to our earlier calculations (data not shown). Using Wright's $F_{ST}$-based estimator of migration rate, we calculated that the average number of migrants between populations per generation ($Nm$) was 6. Because the MIMAR analysis suggests that A. formosa and A. pubescens have diverged recently, it is reasonable to assume that at least some of the shared variation is due to ancestral polymorphism, and is not solely the result of gene flow between the two species. Therefore, this estimate of 6 migrants per generation should be considered as a maximum possible value for $Nm$.

Discussion

We used direct sequencing to compare levels of intra- and interspecific variation in Aquilegia, and found that our genetic data could not distinguish A. formosa and A. pubescens. Not only were values of $\theta$ strikingly similar across species for every fragment, but estimates of $F_{ST}$ were also extremely low, indicating that almost all polymorphism is shared between species. This is a remarkable finding given that these two species are strongly differentiated both ecologically and morphologically.

Several studies of other species have uncovered the same phenomenon. Different species of wild sunflowers exhibit strong ecological differentiation, but it has been found that there are few fixed differences between the species, despite very high levels of intraspecific variation (higher than what we observed in Aquilegia) [4]. Hybridization also occurs between these species, and there is evidence for long-term introgression since their divergence one million years ago [4]. Gene flow has also played a role in shaping the patterns of genetic divergence among species in the Hawaiian silversword alliance, which (like Aquilegia) is another example of an adaptive radiation in plants [32]. Finally, African cichlid fishes represent one of the most dramatic examples of an adaptive radiation, and many of the more than 2,000 unique species in this group have arisen via sympatric speciation and are still capable of forming viable hybrid offspring, despite many ecological, morphological, and behavioral differences [33].

As in the above examples, it is known that hybrid zones form between A. formosa and A. pubescens [20]–[22]. There are also some genetic markers which suggest introgression beyond the hybrid zones [22], which makes it tempting to speculate that gene flow between the species has been occurring since their divergence. Our implementation of the isolation-migration model [34], [35] produced an estimate of the divergence time that fit well with the model of recent speciation, but since it could not simultaneously converge on an estimate for the migration rate, we cannot be sure that gene flow is still occurring. This may be the result of too little data in general, or it may also be the result of having zero fixed differences in the sample.

The patterns of population structure were also unclear in our sample; geographic distance between populations has a clear correlation with genetic differentiation in the broad sample of North American Aquilegia taxa, but there is not a clear relationship when only A. formosa and A. pubescens are examined on a more local scale. At the same time, the clustering of individuals in STRUCTURE does not seem entirely random, with two individuals being more likely to cluster together if they are from the same species and the same population than if they are not. It is possible that these results are a reflection of a pre-existing population structure in the common ancestor, or that migration between populations has made the structure harder to discern.

We believe that finding the loci responsible for reproductive isolation will help us to gain a clearer understanding of how speciation has occurred in Aquilegia. A relatively recent scan of genome-wide patterns of interspecific differentiation in two species of European oaks led to the identification of a few genomic regions which seem to underlie species divergence [36]. Like Aquilegia, these oak species were closely related and highly interfertile, despite exhibiting significant differences in ecology and morphology. The overall low levels of interspecific variation in these species facilitated the identification of highly differentiated regions. The primary goal of this study was to assess the feasibility of a similar type of genome-wide scan for highly differentiated loci in Aquilegia. Our results have shown that despite reasonable levels of intraspecific polymorphism, genetic differentiation is incredibly low at neutral loci, which should make it easier to distinguish putative speciation genes.

Materials and Methods

Sample Collection and Preparation

Leaf tissue was collected from individual plants found in different locations along the west coast of North America. Samples were taken from 40 individuals of each species, for a total sample size of 80 individuals. A. formosa samples were taken from 9 different populations in California, Nevada, Washington state, British Columbia, and Alaska. The number of individuals in each of these populations varied between 1 and 10, but most populations had 5 individuals. There were only 3 populations of A. pubescens, and all of them were from California. There were between 4 and 16 individuals in each of these populations (see also Table S1 for a description of the sampling). Because the A. pubescens populations were less geographically dispersed than the A. formosa samples, there was some concern that A. pubescens might falsely appear to be less polymorphic than A. formosa. However, as was discussed in the Results section, the same level of polymorphism was found in both species, so sampling bias was not an issue.

DNA extractions were performed using Qiagen's DNeasy Plant Mini Extraction Kits. Due to limited sample amounts, extracted DNA was used directly in only 5 out of the 9 amplifications (Acetyl, Defen, H3, LFY, and UF3GT). For the remaining 4 amplifications, the extracted DNA was first amplified using Qiagen's REPLI-g Mini Kit and corresponding whole genome amplification protocol.

Additional leaves were collected from thirty-two Aquilegia taxa (including A. formosa and A. pubescens) [17]. Twenty-five of these are also native to North America, while the remaining 7 are found in Europe and Asia. For each species, between 1 and 3 populations were sampled, with an average of 5 individuals per population (Table S1). The majority of individuals came from western North America. DNA extractions were performed as described above.

Fragment Amplification and Sequencing

Nine short regions of the Aquilegia genome were amplified in the original sample via PCR using 3′-UTR anchored primers (Table S2). These primers were originally designed by Whittall et al. [2] to reconstruct a species-level phylogeny for several members of the Aquilegia genus (including A. formosa, but excluding A. pubescens). None of these regions are expected to be involved in the evolution of reproductive barriers. Two of the 9 regions were also amplified in the broader sample of 32 species (Gapc and UF3GT). All of the sequences contained some non-exonic DNA (Table S3).

All PCR amplifications were done in a total volume of 25 μL, with 20 μL Promega PCR Master Mix (2×: 50 units per mL of Taq polymerase, 400 μM dATP, 400 μM dGTP, 400 μM dCTP, 400 μM dTTP, 3 mM Mg²⁺), 3 μL of forward and reverse primers (10 μM each), and approximately 20 ng of DNA template. Although the annealing temperature varied slightly among primer pairs, the cycling conditions were generally as follows: 92 °C for 2 minutes, followed by 35 cycles of: 92 °C for 45 seconds, 61 °C for 30 seconds, 72 °C for 1.5 minutes, and a final extension step at 72 °C for 10 minutes.

Sequencing for the original sample of 80 individuals was performed in both directions using the Beckman-Coulter CEQ 2000 platform. Purifications and sequencing reactions were all done as recommended by the Beckman-Coulter protocols. PCR products were purified using Promega's Wizard MagneSil PCR Clean-Up System. Eight microliters of purified template were mixed with 1 μL CEQ 10× Buffer, 1 μL CEQ QuickStart Mix, 2.8 μL water, and 0.25 μL of either forward or reverse primer (for a total reaction volume of 13 μL). The sequencing reaction mixtures were then subjected to the following cycling conditions: 96 °C for 20 seconds, 50 °C for 20 seconds, and 60 °C for 4 minutes for a total of 40 cycles, followed by holding at 4 °C. The reaction products were cleaned up using the Beckman-Coulter protocol for “Ethanol Plate Precipitation in a CEQ sample plate,” and then finally loaded into the CEQ 2000 for sequencing. Sequencing for the broader sample was performed on the Li-Cor System.

Sequence Alignment and Editing

Sequences obtained from the CEQ 2000 were aligned using phredPhrap [37], [38], and visualized in Consed [39]. All alignments were edited manually with the aid of MABCW (program written by T. Hu; scripts and more information available upon request). The indel polymorphisms that we were able to identify were all relatively short, and we only observed two alleles at each of these sites. We were not able to characterize individuals that were heterozygous at these sites, and treated these sequences as missing data during our analyses. For homozygous individuals, indels were analyzed as biallelic SNPs.

For each fragment, the set of segregating sites was identified using alignments of all sequences from both species. The sites in this set were then subsequently characterized as either exclusive to one species or shared based on whether or not they were still segregating in an alignment of sequences from only one species. At each SNP position, the derived allele was determined by using a draft assembly of the Aquilegia coerulea (Goldsmith) genome as an outgroup (Joint Genome Institute (JGI) Aquilegia Sequencing Project, unpublished data).
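The classification rule described above can be sketched in a few lines. This is a minimal illustration that assumes each species' alignment is a list of equal-length strings; the function name `classify_sites`, the label strings, and the choice of which characters count as missing data are all hypothetical and do not come from the original scripts.

```python
MISSING = {"N", "?"}  # assumed missing-data codes; gaps are kept as alleles


def classify_sites(formosa, pubescens):
    """Label every alignment column by how its variation is partitioned."""
    labels = []
    for column in range(len(formosa[0])):
        alleles_f = {seq[column] for seq in formosa} - MISSING
        alleles_p = {seq[column] for seq in pubescens} - MISSING
        if len(alleles_f | alleles_p) < 2:
            labels.append("monomorphic")
        elif len(alleles_f) > 1 and len(alleles_p) > 1:
            labels.append("shared")  # still segregating in both species
        elif len(alleles_f) > 1:
            labels.append("exclusive_formosa")
        elif len(alleles_p) > 1:
            labels.append("exclusive_pubescens")
        else:
            labels.append("fixed_difference")  # each species fixed for a different allele
    return labels


print(classify_sites(["AAGT", "ACTT"], ["AAGC", "ACGC"]))
```

In this toy input the four columns are, in order, monomorphic, shared, exclusive to A. formosa, and a fixed difference.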

For the purpose of linkage disequilibrium analyses, haplotypes were reconstructed using PHASE 2.0.2 [40], [41]. For all other analyses (estimation of $\theta$, $F_{ST}$, MIMAR, and population structure), we used the un-phased genotype data directly.

Analysis

The population mutation parameter ($\theta$) was estimated using Watterson's estimator ($\theta_W$) [42] and the average number of pairwise differences ($\theta_\pi$) [43]. Using in-house scripts (available upon request), both of these statistics were determined for each of the 9 sets of sequences and then scaled by the length of the sequence in order to get a per base pair value. The reading frame for each fragment was assumed based on alignment with cDNA sequences available in GenBank (accession numbers: DQ286961, DQ224264, DQ224271, DQ217409, DQ286960, DQ224258, AY162852, and DQ286959). Estimates of $\theta$ for different classes of sites were scaled by the total number of silent sites or nonsynonymous sites in each sequence (see Table S4 for results). The number of silent sites (S) and the number of nonsynonymous sites (N) were calculated based on a simple Jukes-Cantor model of substitution [44], with the following equations: $S = L_4 + \frac{1}{3}L_2$, $N = L_0 + \frac{2}{3}L_2$, where $L_0$ is the number of non-degenerate sites, $L_2$ is the number of twofold degenerate sites, and $L_4$ is the number of fourfold degenerate sites.
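As a rough illustration of the two estimators named above, the sketch below computes Watterson's estimator and the average number of pairwise differences, each scaled per base pair. It is not the in-house script used in the study: it assumes phased sequences of equal length given as strings with no missing data, whereas the study worked from un-phased genotypes, and the function names are hypothetical.

```python
from itertools import combinations


def segregating_sites(seqs):
    """Count alignment columns with more than one allele."""
    return sum(1 for column in zip(*seqs) if len(set(column)) > 1)


def watterson_theta(seqs):
    """Watterson's estimator per base pair: S / a_n / L."""
    n = len(seqs)
    a_n = sum(1.0 / i for i in range(1, n))  # harmonic number over n - 1 terms
    return segregating_sites(seqs) / a_n / len(seqs[0])


def nucleotide_diversity(seqs):
    """Average number of pairwise differences per base pair."""
    pairs = list(combinations(seqs, 2))
    diffs = sum(sum(a != b for a, b in zip(s, t)) for s, t in pairs)
    return diffs / len(pairs) / len(seqs[0])


toy = ["AAAA", "AAAT", "AATT"]
print(watterson_theta(toy), nucleotide_diversity(toy))
```

For the three toy sequences there are 2 segregating sites and 4 pairwise differences over 3 pairs, so both estimators come out to one third per site.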

Linkage disequilibrium (LD) between SNPs was quantified using $r^2$, the squared correlation coefficient. For each fragment, $r^2$ was plotted as a function of the distance between SNPs (measured in base pairs). The population recombination parameter ($\rho$) was estimated by fitting the equation given in [45], [46]:

graphic file with name pone.0008655.e091.jpg

where Inline graphic. For all analyses of recombination, low frequency (MAF Inline graphic10%) polymorphisms were removed, since they provide little information about the overall pattern of LD.
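The fitted equation survives here only as an image reference, so the sketch below rests on an assumption: it uses the Hill and Weir expectation for $r^2$ with a finite-sample correction, a form widely used for this purpose, which may or may not be the exact equation fitted in the study. The function names, the grid-search fit, and the sample size of 160 chromosomes (80 diploid individuals) are illustrative choices.

```python
def r_squared(hap_a, hap_b):
    """Squared correlation between two biallelic SNPs coded 0/1 on phased haplotypes."""
    n = len(hap_a)
    p_a = sum(hap_a) / n
    p_b = sum(hap_b) / n
    p_ab = sum(1 for a, b in zip(hap_a, hap_b) if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b
    return d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))


def hill_weir_expected_r2(rho_per_bp, distance_bp, n):
    """Expected r^2 at a given distance for n sampled chromosomes (assumed form)."""
    c = rho_per_bp * distance_bp
    decay = (10 + c) / ((2 + c) * (11 + c))
    correction = 1 + (3 + c) * (12 + 12 * c + c * c) / (n * (2 + c) * (11 + c))
    return decay * correction


def fit_rho(observations, n, grid):
    """Pick the candidate rho that minimises squared error over (distance, r^2) pairs."""
    def squared_error(rho):
        return sum((r2 - hill_weir_expected_r2(rho, d, n)) ** 2 for d, r2 in observations)
    return min(grid, key=squared_error)


# Synthetic check: data generated at rho = 0.009 per bp should be recovered.
synthetic = [(d, hill_weir_expected_r2(0.009, d, 160)) for d in (50, 200, 800)]
print(fit_rho(synthetic, 160, [0.001, 0.003, 0.009, 0.027]))
```

A grid search is used only to keep the example dependency-free; a nonlinear least-squares routine would be the more usual choice.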

Estimates of Wright's $F_{ST}$ were calculated based on estimates of $\pi$ [47] using the following equation:

$$F_{ST} = \frac{\pi_{\mathrm{between}} - \pi_{\mathrm{within}}}{\pi_{\mathrm{between}}}$$

where $\pi_{\mathrm{between}}$ refers to the average pairwise difference between individuals from different species, and $\pi_{\mathrm{within}}$ is the average pairwise difference within species. Confidence intervals were obtained by using 10,000 bootstrap replicates.
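A minimal sketch of this calculation follows, assuming the estimator is the difference between the between-species and within-species average pairwise differences, divided by the between-species value. The input format (lists of equal-length phased sequences), the pooling of within-species pairs from both species into a single average, and the function names are assumptions made for illustration; the bootstrap step is omitted.

```python
from itertools import combinations, product


def mean_diff(pairs):
    """Average number of differing sites over a collection of sequence pairs."""
    pairs = list(pairs)
    return sum(sum(a != b for a, b in zip(s, t)) for s, t in pairs) / len(pairs)


def fst(species_a, species_b):
    """F_ST from between-species and pooled within-species pairwise differences."""
    pi_between = mean_diff(product(species_a, species_b))
    within_pairs = list(combinations(species_a, 2)) + list(combinations(species_b, 2))
    pi_within = mean_diff(within_pairs)
    return (pi_between - pi_within) / pi_between


print(fst(["AAAA", "AAAT"], ["AATT", "AATA"]))
```

In the toy example the between-species average is 1.5 differences and the within-species average is 1, giving a value of one third; two samples fixed for different alleles would give 1.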

Analysis of Isolation By Distance (IBD) was performed using a Mantel test [48] with 10,000 replications as implemented by the R package “ade4” [49], [50]. The genetic distance matrix was composed of estimates for $F_{ST}$ while the geographic distance matrix was measured in kilometers between populations.
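The logic of a Mantel test can be sketched without any statistical package. The example below is a plain-Python stand-in written for illustration, not the ade4 implementation used in the study: it correlates the upper triangles of two distance matrices and builds a null distribution by permuting population labels in one of them. The function names, the one-sided p-value, and the toy coordinates are assumptions.

```python
import random


def pearson(xs, ys):
    """Pearson correlation between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def upper_triangle(matrix, order):
    """Flatten the upper triangle after relabelling rows and columns by `order`."""
    n = len(order)
    return [matrix[order[i]][order[j]] for i in range(n) for j in range(i + 1, n)]


def mantel(genetic, geographic, permutations=10000, seed=0):
    """Return the observed correlation and a one-sided permutation p-value."""
    rng = random.Random(seed)
    labels = list(range(len(genetic)))
    geo = upper_triangle(geographic, labels)
    observed = pearson(upper_triangle(genetic, labels), geo)
    hits = 0
    for _ in range(permutations):
        shuffled = labels[:]
        rng.shuffle(shuffled)  # permute population labels of one matrix only
        if pearson(upper_triangle(genetic, shuffled), geo) >= observed:
            hits += 1
    return observed, (hits + 1) / (permutations + 1)


# Toy data: four populations on a line, genetic distance proportional to geography.
coords = [0, 100, 300, 700]
geo_km = [[abs(a - b) for b in coords] for a in coords]
gen = [[0.001 * d for d in row] for row in geo_km]
print(mantel(gen, geo_km, permutations=200, seed=1))
```

Permuting labels rather than individual matrix cells preserves the dependence among distances that share a population, which is the reason a Mantel test is used instead of an ordinary correlation test.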

Population structure was inferred directly from the sequence data using the program STRUCTURE 2.0, which implements a model-based clustering approach [51]. STRUCTURE was run under the “linkage model” with “correlated allele frequencies.” Specifying correlated allele frequencies enhances the ability of the algorithm to detect distinct clusters even among a sample of very closely related populations [52], which is well suited to the Aquilegia data set. Although geographic sampling information was available, initial STRUCTURE runs suggested that geographic location did not correspond well with the genetic data, so we did not use the “prior population information” model to assist in clustering. The program was run with a burn-in length of 50,000 and a run length of 20,000. This was done several times for each K value (ranging from 2 to 15) in order to ensure that results were consistent. Plots of the STRUCTURE output were generated using distruct [53]. The average “clusteredness” of individuals was calculated for each STRUCTURE run according to the equation presented by Rosenberg et al. [52].
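The clusteredness statistic mentioned above can be written compactly. The sketch below follows the form usually attributed to Rosenberg et al., in which each individual's membership vector is compared with the uniform vector $1/K$ and the result is averaged over individuals; the function name and the input layout (one list of K membership coefficients per individual) are assumptions made for illustration.

```python
def clusteredness(memberships):
    """Average distance of membership vectors from the uniform vector (range 0 to 1)."""
    k = len(memberships[0])
    total = 0.0
    for q in memberships:
        squared_dev = sum((q_ik - 1.0 / k) ** 2 for q_ik in q)
        total += (k / (k - 1) * squared_dev) ** 0.5
    return total / len(memberships)


print(clusteredness([[1.0, 0.0], [0.0, 1.0]]))  # every individual in one cluster
print(clusteredness([[0.5, 0.5]]))              # membership split evenly
```

A value of 1 means every individual is assigned wholly to a single cluster, and 0 means membership is spread evenly across all K clusters.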

In order to estimate divergence time and migration rate, the data were analyzed using the program MIMAR [35], which can incorporate recombination into an “isolation–migration” model. The mutation rate, Inline graphic, was assumed to be Inline graphic, based on an estimation of the average substitution rate in nuclear DNA in plants [54]. The intralocus recombination rate was set at Inline graphic, based on the estimation of the population recombination rate from linkage disequilibrium data. Inline graphic, Inline graphic, and Inline graphic were all sampled from a uniform prior distribution Inline graphic. The time since split, Inline graphic, measured in generations, was sampled from the prior distribution Inline graphic. Migration was either fixed at Inline graphic, or drawn from a prior range between 0.135 and 7.39. The program was run for Inline graphic recorded steps, and Inline graphic burnin steps.

We also estimated the migration rate using Wright's equation [55] for an n-island population model, which is based on $F_{ST}$:

graphic file with name pone.0008655.e113.jpg
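Since the equation itself is available here only as an image reference, the sketch below assumes the classic island-model approximation $F_{ST} \approx 1/(4Nm + 1)$, rearranged to give $Nm$. Whether the study used this form or a variant corrected for a finite number of islands is not recoverable from the text, and the function names and the input value of 0.04 are purely illustrative.

```python
def migrants_per_generation(fst):
    """Nm implied by F_ST under the assumed approximation F_ST = 1 / (4Nm + 1)."""
    return (1.0 / fst - 1.0) / 4.0


def fst_from_migrants(nm):
    """Inverse relation: F_ST expected for a given number of migrants Nm."""
    return 1.0 / (4.0 * nm + 1.0)


print(migrants_per_generation(0.04))  # hypothetical F_ST value chosen for illustration
print(fst_from_migrants(6))
```

The relation is steeply nonlinear at low differentiation, so small changes in a low $F_{ST}$ estimate translate into large changes in the implied number of migrants.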

Supporting Information

Figure S1

STRUCTURE cannot cluster individuals according to species. Each individual is indicated by a thin line, where the two colors represent the estimated membership coefficients for the 2 clusters. The clusteredness score for this plot was estimated as 0.26.

(0.04 MB PDF)

Figure S2

Inferred population structure for 80 Aquilegia individuals. The results from STRUCTURE are plotted for K = 11, which had an average clusteredness score of ≈0.52. Each individual is represented by a thin horizontal line, with corresponding population and species information given on either side.

(0.06 MB PDF)

Figure S3

Probability of different K estimates. The estimated log probability of the data (as calculated by STRUCTURE) is plotted against different K values. For each K value, STRUCTURE was run 3 times, and the plotted value is the average of those 3 runs.

(0.03 MB PDF)

Figure S4

Average clusteredness for different K values. For each K value, the average clusteredness measures the extent to which each individual belongs to a single cluster rather than to multiple clusters, so the higher the clusteredness the “better” the clusters.

(0.03 MB PDF)
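The clusteredness score mentioned in these captions can be sketched in Python as follows, assuming it follows the definition in [52]: for each individual, the distance of its membership coefficients from the uniform vector (1/K, ..., 1/K) is scaled to lie between 0 and 1 and then averaged over individuals. The function name and the input layout (one list of K coefficients per individual) are hypothetical choices for illustration.

```python
import math


def clusteredness(q):
    """Average clusteredness of a membership-coefficient matrix.

    q is a list of rows, one per individual, each holding K coefficients
    that sum to 1. Assumed definition: the mean over individuals of
    sqrt(K / (K - 1) * sum_k (q_ik - 1/K)^2).
    """
    k = len(q[0])
    if k < 2:
        raise ValueError("at least two clusters are required")
    total = 0.0
    for row in q:
        squared_dev = sum((x - 1.0 / k) ** 2 for x in row)
        total += math.sqrt(k / (k - 1) * squared_dev)
    return total / len(q)
```

Under this definition, a score of 1 means every individual is assigned entirely to one cluster, and a score of 0 means every individual is split evenly across all K clusters.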

Figure S5

Other factors influencing FST in Aquilegia. In all panels, red dots indicate comparisons where both populations were the same for the factor being considered, while gray dots indicate comparisons where the two populations were different. Panel (A) shows FST vs distance both within and between species, with the green diamonds indicating comparisons between either A. formosa or A. pubescens and one of the natural hybrid populations. Panel (B) shows FST vs distance with the same and different pollinator syndrome, while Panel (C) shows the same comparisons for habitat type.

(0.09 MB PDF)

Figure S6

MIMAR estimates of time since divergence.

(0.41 MB PDF)

Figure S7

R2 versus distance for the combined data.

(0.07 MB PDF)

Table S1

Summary of Aquilegia samples used in this study.

(0.04 MB PDF)

Table S2

Primer pairs used to amplify the 9 nuclear loci.

(0.03 MB PDF)

Table S3

Positions of introns, exons, and UTRs in each locus.

(0.02 MB PDF)

Table S4

Levels of polymorphism for synonymous and nonsynonymous sites.

(0.04 MB PDF)

Acknowledgments

We thank Jeremy Schmutz for providing us with unpublished DNA sequence data from the Aquilegia coerulea genome project. We also thank Dr. Simon Joly and two anonymous reviewers for their helpful comments and criticisms on earlier versions of this manuscript.

Footnotes

Competing Interests: The authors have declared that no competing interests exist.

Funding: This work was supported by an NSF grant: EF-0412727. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

References

  • 1. Hodges S, Arnold M. Columbines: A geographically widespread species flock. Proceedings of the National Academy of Sciences. 1994;91:5129–5132. doi: 10.1073/pnas.91.11.5129.
  • 2. Whittall JB, Medina-Marino A, Zimmer EA, Hodges SA. Generating single-copy nuclear gene data for a recent adaptive radiation. Molecular Phylogenetics and Evolution. 2006;39:124–134. doi: 10.1016/j.ympev.2005.10.010.
  • 3. Hey J. Recent advances in assessing gene flow between diverging populations and species. Current Opinion in Genetics and Development. 2006;16:592–596. doi: 10.1016/j.gde.2006.10.005.
  • 4. Strasburg JL, Rieseberg LH. Molecular demographic history of the annual sunflowers Helianthus annuus and H. petiolaris: large effective population sizes and rates of long-term gene flow. Evolution. 2008;62:1936–1950. doi: 10.1111/j.1558-5646.2008.00415.x.
  • 5. Crow KD, Munehara H, Kanamoto Z, Balanov A, Antonenko D, et al. Maintenance of species boundaries despite rampant hybridization between three species of reef fishes (Hexagrammidae): implications for the role of selection. Biological Journal of the Linnean Society. 2007;91:135–147.
  • 6. Yatabe Y, Kane NC, Scotti-Saintagne C, Rieseberg LH. Rampant gene exchange across a strong reproductive barrier between the annual sunflowers, Helianthus annuus and H. petiolaris. Genetics. 2007;175:1883–1893. doi: 10.1534/genetics.106.064469.
  • 7. Butlin RK, Galindo J, Grahame JW. Sympatric, parapatric or allopatric: the most important way to classify speciation? Philosophical Transactions of the Royal Society B: Biological Sciences. 2008;363:2997–3007. doi: 10.1098/rstb.2008.0076.
  • 8. Mallet J. Hybridization, ecological races and the nature of species: empirical evidence for the ease of speciation. Philosophical Transactions of the Royal Society B: Biological Sciences. 2008;363:2971–2986. doi: 10.1098/rstb.2008.0081.
  • 9. Noor MA, Feder JL. Speciation genetics: evolving approaches. Nature Reviews Genetics. 2006;7:851–861. doi: 10.1038/nrg1968.
  • 10. Seehausen O. Hybridization and adaptive radiation. Trends in Ecology and Evolution. 2004;19:198–207. doi: 10.1016/j.tree.2004.01.003.
  • 11. Coyne JA, Orr HA. Speciation. Sunderland, MA U.S.A.: Sinauer Associates, Inc; 2004.
  • 12. Turner TL, Hahn MW, Nuzhdin SV. Genomic islands of speciation in Anopheles gambiae. PLoS Biology. 2005;3:1572–1578. doi: 10.1371/journal.pbio.0030285.
  • 13. Clarke B, Johnson M, Murray J. Clines in the genetic distance between two species of island land snails: how ‘molecular leakage’ can mislead us about speciation. Philosophical Transactions of the Royal Society B: Biological Sciences. 1996;351:773–784.
  • 14. Wang RL, Wakeley J, Hey J. Gene flow and natural selection in the origin of Drosophila pseudoobscura and close relatives. Genetics. 1997;147:1091–1106. doi: 10.1093/genetics/147.3.1091.
  • 15. Machado CA, Kliman RM, Markert JA, Hey J. Inferring the history of speciation from multilocus DNA sequence data: The case of Drosophila pseudoobscura and close relatives. Molecular Biology and Evolution. 2002;19:472–488. doi: 10.1093/oxfordjournals.molbev.a004103.
  • 16. Munz P. Aquilegia: the wild and cultivated columbines. Gentes Herbarum. 1946;7:1–150.
  • 17. Whittall JB, Hodges SA. Pollinator shifts drive increasingly long nectar spurs in columbine flowers. Nature. 2007;447:706–709. doi: 10.1038/nature05857.
  • 18. Prazmo W. Cytogenetic studies on the genus Aquilegia. IV. Fertility relationships among the Aquilegia species. Acta Soc Bot Poloniae. 1965;34:667–685.
  • 19. Taylor R. Interspecific hybridization and its evolutionary significance in the genus Aquilegia. Brittonia. 1967;19:379–390.
  • 20. Grant V. Isolation and hybridization between Aquilegia formosa and A. pubescens. Aliso. 1952;2:341–360.
  • 21. Chase V, Raven P. Evolutionary and ecological relationships between Aquilegia formosa and A. pubescens (Ranunculaceae), two perennial plants. Evolution. 1975;29:474–486. doi: 10.1111/j.1558-5646.1975.tb00837.x.
  • 22. Hodges S, Arnold M. Floral and ecological isolation between Aquilegia formosa and Aquilegia pubescens. Proceedings of the National Academy of Sciences. 1994;91:2493–2496. doi: 10.1073/pnas.91.7.2493.
  • 23. Fulton M, Hodges S. Floral isolation between Aquilegia formosa and Aquilegia pubescens. Proceedings of the Royal Society B: Biological Sciences. 1999;266:2247–2252.
  • 24. Hodges SA, Fulton M, Yang JY, Whittall JB. Verne Grant and evolutionary studies of Aquilegia. New Phytologist. 2004;161:113–120.
  • 25. Yang JY, Counterman BA, Eckert CG, Hodges SA. Cross-species amplification of microsatellite loci in Aquilegia and Semiaquilegia (Ranunculaceae). Molecular Ecology Notes. 2005;5:317–320.
  • 26. Tenaillon M, Sawkins M, Long A, Gaut R, Doebley J, et al. Patterns of DNA sequence polymorphism along chromosome I of maize (Zea mays ssp. mays L.). Proceedings of the National Academy of Sciences. 2001;98:9161–9166. doi: 10.1073/pnas.151244298.
  • 27. Kolkman JM, Berry ST, Leon AJ, Slabaugh MB, Tang S, et al. Single nucleotide polymorphisms and linkage disequilibrium in sunflower. Genetics. 2007;177:457–468. doi: 10.1534/genetics.107.074054.
  • 28. Nordborg M, Hu TT, Ishino Y, Jhaveri J, Toomajian C, et al. The pattern of polymorphism in Arabidopsis thaliana. PLoS Biology. 2005;3:1289–1299. doi: 10.1371/journal.pbio.0030196.
  • 29. Zhu Y, Song Q, Hyten D, Tassell CV, Matukumalli L, et al. Single-nucleotide polymorphisms in soybean. Genetics. 2003;163:1123–1134. doi: 10.1093/genetics/163.3.1123.
  • 30. Ptak SE, Hinds DA, Koehler K, Nickel B, Patil N, et al. Fine-scale recombination patterns differ between chimpanzees and humans. Nature Genetics. 2005;37:429–434. doi: 10.1038/ng1529.
  • 31. Kay KM, Whittall JB, Hodges SA. A survey of nuclear ribosomal internal transcribed spacer substitution rates across angiosperms: an approximate molecular clock with life history effects. BMC Evolutionary Biology. 2006;6. doi: 10.1186/1471-2148-6-36.
  • 32. Lawton-Rauh A, Robichaux R, Purugganan M. Diversity and divergence patterns in regulatory genes suggest differential gene flow in recently derived species of the Hawaiian silversword alliance adaptive radiation (Asteraceae). Molecular Ecology. 2007;16:3995–4013. doi: 10.1111/j.1365-294X.2007.03445.x.
  • 33. Kocher TD. Adaptive evolution and explosive speciation: The Cichlid fish model. Nature Reviews Genetics. 2004;5:288–298. doi: 10.1038/nrg1316.
  • 34. Hey J, Nielsen R. Multilocus methods for estimating population sizes, migration rates and divergence time, with applications to the divergence of Drosophila pseudoobscura and D. persimilis. Genetics. 2004;167:747–760. doi: 10.1534/genetics.103.024182.
  • 35. Becquet C, Przeworski M. A new approach to estimate parameters of speciation models with application to apes. Genome Research. 2007;17:1505–1519. doi: 10.1101/gr.6409707.
  • 36. Scotti-Saintagne C, Mariette S, Porth I, Goicoechea PG, Barreneche T, et al. Genome scanning for interspecific differentiation between two closely related oak species [Quercus robur L. and Q. petraea (Matt.) Liebl.]. Genetics. 2004;168:1615–1626. doi: 10.1534/genetics.104.026849.
  • 37. Ewing B, Green P. Base-calling of automated sequencer traces using phred. II. Error probabilities. Genome Research. 1998;8:186–194.
  • 38. Ewing B, Hillier L, Wendl MC, Green P. Base-calling of automated sequencer traces using phred. I. Accuracy assessment. Genome Research. 1998;8:175–185. doi: 10.1101/gr.8.3.175.
  • 39. Gordon D, Abajian C, Green P. Consed: A graphical tool for sequence finishing. Genome Research. 1998;8:195–202. doi: 10.1101/gr.8.3.195.
  • 40. Stephens M, Smith NJ, Donnelly P. A new statistical method for haplotype reconstruction from population data. American Journal of Human Genetics. 2001;68:978–989. doi: 10.1086/319501.
  • 41. Stephens M, Donnelly P. A comparison of Bayesian methods for haplotype reconstruction from population genotype data. American Journal of Human Genetics. 2003;73:1162–1169. doi: 10.1086/379378.
  • 42. Watterson GA. On the number of segregating sites in genetical models without recombination. Theoretical Population Biology. 1975;7:256–276. doi: 10.1016/0040-5809(75)90020-9.
  • 43. Tajima F. Evolutionary relationship of DNA sequences in finite populations. Genetics. 1983;105:437–460. doi: 10.1093/genetics/105.2.437.
  • 44. Jukes T, Cantor C. Evolution of protein molecules. New York: Academic Press; 1969. pp. 21–123.
  • 45. Weir B, Hill W. Nonuniform recombination within the human beta-globin gene cluster. American Journal of Human Genetics. 1986;38:776–778.
  • 46. Cutter AD, Baird SE, Charlesworth D. High nucleotide polymorphism and rapid decay of linkage disequilibrium in wild populations of Caenorhabditis remanei. Genetics. 2006;174:901–913. doi: 10.1534/genetics.106.061879.
  • 47. Hudson RR, Slatkin M, Maddison WP. Estimation of levels of gene flow from DNA sequence data. Genetics. 1992;132:583–589. doi: 10.1093/genetics/132.2.583.
  • 48. Mantel N. The detection of disease clustering and a generalized regression approach. Cancer Research. 1967;27:209–220.
  • 49. R Development Core Team. R: A Language and Environment for Statistical Computing. Vienna, Austria: R Foundation for Statistical Computing; 2007. URL http://www.R-project.org.
  • 50. Dray S, Dufour A. The ade4 package: implementing the duality diagram for ecologists. Journal of Statistical Software. 2007;22:1–20.
  • 51. Pritchard JK, Stephens M, Donnelly P. Inference of population structure using multilocus genotype data. Genetics. 2000;155:945–959. doi: 10.1093/genetics/155.2.945.
  • 52. Rosenberg NA, Mahajan S, Ramachandran S, Zhao C, Pritchard JK, et al. Clines, clusters, and the effect of study design on the inference of human population structure. PLoS Genetics. 2005;1:660–671. doi: 10.1371/journal.pgen.0010070.
  • 53. Rosenberg NA. Distruct: A program for the graphical display of structure results. 2002. http://rosenberglab.bioinformatics.med.umich.edu/distruct.html.
  • 54. Wolfe KH, Li WH, Sharp PM. Rates of nucleotide substitution vary greatly among plant mitochondrial, chloroplast, and nuclear DNAs. Proceedings of the National Academy of Sciences. 1987;84:9054–9058. doi: 10.1073/pnas.84.24.9054.
  • 55. Wright S. The genetical structure of populations. Annals of Eugenics. 1951;15:323–354. doi: 10.1111/j.1469-1809.1949.tb02451.x.
