Proceedings of the National Academy of Sciences of the United States of America. 2011 Aug 8;108(34):14192–14197. doi: 10.1073/pnas.1104212108

Arabidopsis hybrid speciation processes

Roswitha Schmickl 1, Marcus A Koch 1,1
PMCID: PMC3161561  PMID: 21825128

Abstract

The genus Arabidopsis provides a unique opportunity to study fundamental biological questions in plant sciences using the diploid model species Arabidopsis thaliana and Arabidopsis lyrata. However, only a few studies have focused on introgression and hybrid speciation in Arabidopsis, although polyploidy is a common phenomenon within this genus. More recently, there is growing evidence of significant gene flow between the various Arabidopsis species. So far, we know Arabidopsis suecica and Arabidopsis kamchatica as fully stabilized allopolyploid species. Both species evolved during Pleistocene glaciation and deglaciation cycles in Fennoscandinavia and the amphi-Beringian region, respectively. Previous studies of these hybrids were either conducted on a phylogeographic scale or reconstructed the hybrids experimentally in the laboratory. In our study we focus on the regional and population level. Our research area is located in the foothills of the eastern Austrian Alps, where two Arabidopsis species, Arabidopsis arenosa and A. lyrata ssp. petraea, are sympatrically distributed. Our hypothesis of genetic introgression, migration, and adaptation to the changing environment during the Pleistocene has been confirmed: We observed significant, mainly unidirectional gene flow between the two species, which has given rise to the tetraploid A. lyrata. This cytotype was able to escape from the narrow ecological niche occupied by diploid A. lyrata ssp. petraea on limestone outcrops by migrating northward into siliceous areas, leaving behind a trail of genetic differentiation.

Keywords: Central Europe, evolution, Brassicaceae


Life is a dynamic process frequently characterized by departure from equilibrium stages (1). This principle provides the fundamental background for adaptation and speciation in a dynamic and changing environment. Gene flow within and between populations or even species reflects such dynamic processes. Microevolutionary processes are considered to provide frequency-based changes in a population system allowing gradual differentiation. Assuming increased isolation, macroevolutionary processes contribute to divergent and diversifying genetic differentiation. With time, completely isolated gene pools are established, eventually resulting in new species adapted to different environments (2).

A special evolutionary case is indicated when these different species meet again and reproductive isolation is incomplete. In the case of highly diverged and almost isolated ancestral gene pools, this is called hybridization. Subsequent secondary isolation of these hybridizing gene pools results in hybrid speciation processes (3). However, the term “hybridization” can be applied to a continuous spectrum of genetic admixture: Hybridization occurs between nearly fully isolated gene pools, normally treated as different species, giving rise to strong hybrids with initially half of each parental genome. If backcrossing of the hybrids into one particular parental population is involved (introgression), the proportion of the other, introgressing parental genome will decrease, resulting in weak hybrids. Numerous studies have focused on hybrid speciation and introgression in the past (4, 5), but only a few have demonstrated the continuous process of hybridization and introgression on a regional scale (6, 7).

The genus Arabidopsis is now known to be frequently affected by hybridization and introgression. Alongside seven mainly diploid species (8) and the Arabidopsis arenosa species aggregate with two ploidy levels, there are two polyploid species for which hybridization has been demonstrated: Arabidopsis suecica was described as a hybrid between diploid Arabidopsis thaliana (n = 5) and diploid A. arenosa (n = 8) in Fennoscandinavia (9). It is assumed that this hybrid originated from a single Pleistocene speciation event dating back to between 20,000 and 300,000 y ago (10), with A. arenosa contributing as the paternal parent. The second natural hybrid, Arabidopsis kamchatica (2n = 32), was recently reported to be a hybrid between the Siberian Arabidopsis lyrata ssp. petraea (2n = 16) and a member of the Arabidopsis halleri complex (A. halleri ssp. gemmifera, 2n = 16) (11, 12), and it has been shown that A. kamchatica probably evolved several times independently during Pleistocene glaciation and deglaciation cycles. It is distributed throughout the amphi-Beringian region but is also found in Japan, Korea, and Taiwan. There is also evidence of past gene flow in Europe between diploid A. lyrata and A. halleri on the basis of either nuclear genes (13) or self-incompatibility alleles (14).

In this study, we bridge the gap between these two natural Arabidopsis hybrids by describing a third natural hybrid and introgression system between A. arenosa and A. lyrata ssp. petraea (which we will refer to simply as A. lyrata hereafter) within a well-defined geographical, ecological, and biological setting. A. arenosa is distributed throughout central and southeastern Europe. Diploids were detected exclusively in southeastern Europe and the Carpathians, and tetraploids in the Carpathians and remaining distribution areas (15). A. lyrata has a circumarctic distribution but is also found in Central Europe and North America. So far, A. lyrata ssp. petraea has been reported to be almost exclusively diploid (16); only Polatschek (16) found several tetraploid populations in eastern Austria, and two of these populations were investigated by Mable et al. (17). In the present study, we found that tetraploid A. lyrata is widely distributed in this region. The study area is located in the northeastern Austrian Alps and adjacent lowland regions covering an area of ∼900 km2. This region was greatly affected by glaciation and deglaciation cycles during the Pleistocene, as the boundary of the Last Glacial Maximum ran along its eastern edge. Ecologically, the area can be subdivided into a limestone-dominated region, which is located in the eastern Austrian Forealps, and a siliceous region to the north comprising the Danube River Valley in the Wachau (Fig. S1). The whole area is a major center of genetic diversity for both the A. arenosa and A. lyrata species complex (18).

On the basis of extensive field observations, we developed the hypothesis that both species were affected by a changing environment, were forced to migrate and adapt, and probably came into secondary contact after the last glaciation. We have analyzed seven nuclear-encoded microsatellite markers to study gene flow between populations, a maternally inherited chloroplast genome marker [trnL intron (trnL) and trnL/F intergenic spacer (trnL/F-IGS) of tRNA-Leu and tRNA-Phe, respectively] to investigate migrational movements due to seed dispersal, chromosome number variation to distinguish between diploids and polyploids, and morphometric measurements to characterize introgression independently of the molecular means.

By combining all of the above methods, our main intention was to accomplish the following: (i) to define the geographic distribution and admixture of the cyto- and morphotypes of A. arenosa and A. lyrata in the eastern Austrian Forealps and the Wachau and set these aspects in the context of cytotype distribution within the species’ overall distribution range; (ii) to investigate the direction of hybridization and introgression between the two species and compare this with other Arabidopsis hybrids; and (iii) to define putative hybrid zones.

Results

Cytogenetic Analysis Indicates Nonoverlapping Distribution Patterns of the Diploid and Tetraploid Cytotypes of A. arenosa and A. lyrata in Eastern Austria.

Diploid and tetraploid cytotypes were found in distinct geographic regions (Fig. S2): In the western part of the eastern Austrian Forealps, tetraploid A. arenosa populations are found, and these expand from colline and montane habitats into the subalpine boulders of the northeastern Limestone Alps. Diploid A. lyrata populations are restricted to the warmer eastern margin of the eastern Austrian Forealps. These populations are found in cryptic refugia, which are characterized by the occurrence of cold-adapted plants that survived the interglacials and Holocene warming in naturally open habitats such as single, exposed rocks or rocky slopes (19). Tetraploid A. lyrata populations are located in the central and northeastern Austrian Forealps. In addition, all populations from the northerly adjacent Danube River Valley (Wachau) are also found to be tetraploid. In the northern part of the Wachau and also farther north into the Bohemian Massif, only tetraploid A. arenosa populations are present. Populations of mixed cytotypes (diploids and tetraploids) were not detected, and triploids were rare and found only in one population.

Morphometric Analyses and Molecular Data Recognize Largely Well-Separated Groups of Diploid and Tetraploid Cytotypes and Give First Evidence for Hybridization.

Diploids of both species, A. arenosa and A. lyrata, were validated according to morphological criteria. As no diploid A. arenosa populations were found in eastern Austria, diploids from the Western Carpathians were included in the analyses. These populations were relatively close to our study area. Principal component analysis (PCA) of all individuals investigated is displayed in Fig. 1A. Eigenvalues of axes one and two were 0.25 and 0.11, respectively, composing 36% of the data's total variance. The analysis revealed two main groups corresponding to taxonomy, A. arenosa and A. lyrata. The diploid clusters of each species are clearly separated, but the tetraploid clusters, in particular of A. lyrata, partly overlap, indicating intermediate (hybrid) morphology between the species.

Fig. 1.

(A) PCA based on 29 morphological characters (three quantitative, six quantitative as ratios, nine qualitative). Each symbol (triangle, circle) represents an individual of either diploid or tetraploid A. arenosa or A. lyrata ssp. petraea. Percentage quotations along the axes represent PCA eigenvalues. Ten individuals per population were analyzed on average, and a total of 731 individuals were included from 81 populations. (B) cpDNA trnL/F suprahaplotype network of both diploid and tetraploid A. arenosa and A. lyrata ssp. petraea. The sizes of the circles indicate the relative frequency of a suprahaplotype. Species and cytotypes are indicated with the same colors as in A. Plastid DNA sequence polymorphism was analyzed in 10 individuals per population on average, and a total of 946 individuals were included from 98 populations. (C) Population structure estimated by BAPS analyses of both diploid and tetraploid A. arenosa and A. lyrata ssp. petraea populations, based on seven microsatellite loci. Genetic mixture model with “clustering of groups of individuals” and “fixed k” option was applied to test whether grouping was according to species and cytotypes. Using K = 2, we tested whether the dataset clustered according to A. arenosa and A. lyrata; using K = 4, we tested whether the two species groups each split up into two subgroups of diploids and tetraploids. Each population is represented by a vertical bar. Different sizes of the bars correspond to different numbers of individuals per population. Species and cytotypes are indicated with the same colors as in A and B. On average, 20 individuals per population—in total 1,830 individuals from 97 populations—were analyzed.

The differentiation of the dataset, based on morphological data, is strongly supported by the chloroplast DNA (cpDNA) sequence analysis. The haplotype network built with trnLF suprahaplotypes (Fig. 1B) revealed three “core” suprahaplotypes—A, B, and C—which each predominantly characterize one species: A and B were found mainly in A. arenosa and C in A. lyrata. Except for Q, all derived “tip” suprahaplotypes were exclusively observed in either A. arenosa (AU, BE, E, L, P, U, Y) or A. lyrata (AC, AH, AI, AJ, AK, AL, R, V). Only a few of them were shared between the diploid and tetraploid cytotypes (E, L, and U in A. arenosa); the remaining suprahaplotypes were unique for either the diploids or the tetraploids. The few examples of sharing of the “core” suprahaplotypes A, B, and C and the “tip” suprahaplotype Q can be explained by ancestral shared polymorphisms, a well-known phenomenon within the genus Arabidopsis (18, 14), predating the radiation of the genus approximately 2 million years ago (18). An alternative explanation for shared cpDNA polymorphism is recent or past hybridization resulting in chloroplast capture. However, we will show that gene flow is mainly unidirectional from A. arenosa into A. lyrata (via pollen) and either does not occur or is extremely rare in the other direction. Consequently, putatively introgressed A. lyrata populations from the Wachau carry the original and maternally inherited A. lyrata cpDNA suprahaplotypes (Fig. 2).

Fig. 2.

(A) Distribution of cpDNA trnL/F suprahaplotypes of A. arenosa and A. lyrata ssp. petraea. Diploid A. lyrata ssp. petraea populations are indicated by gray rings surrounding the suprahaplotype circles. (B) Zoom into the Wachau region with a detailed presentation of cpDNA trnL/F suprahaplotypes of tetraploid A. lyrata populations.

Microsatellite data, analyzed with BAPS, also revealed a differentiation between A. arenosa and A. lyrata (Fig. 1C). BAPS was favored over Structure, because a pregrouping of the dataset, based on clustering of populations instead of individuals, could be achieved, which is a unique feature of BAPS. Furthermore, BAPS can handle combined data of diploids and tetraploids, and Structure cannot. A genetic mixture model with “clustering of groups of individuals” and a “fixed k” option was applied to test if grouping was according to species and cytotypes. Using K = 2, we tested whether the dataset clustered according to A. arenosa and A. lyrata; using K = 4, we tested whether the two species groups each split up into two subgroups of diploids and tetraploids based on microsatellite data. In these analyses, admixture was not taken into account because we do not focus on interploidal gene flow in this article. Gene flow between the tetraploids of both species was analyzed with Structure. Group clustering under K = 2 detected a group of diploid and tetraploid A. arenosa (green cluster) and a group of diploid and tetraploid A. lyrata (yellow cluster) (Fig. 1C). Under K = 4, formation of the four main groups could be observed: diploid A. arenosa (orange cluster), tetraploid A. arenosa (green cluster), diploid A. lyrata (blue cluster), and tetraploid A. lyrata from the Wachau (yellow cluster). Tetraploid A. lyrata from the eastern Austrian Forealps (green cluster) grouped together with tetraploid A. arenosa, which indicates hybridization.

Microsatellite Markers Give Evidence for Genetic Introgression Predominantly from A. arenosa into A. lyrata.

Regarding microsatellite data, analyzed with Structure, K values were estimated with Structure-sum. The highest value of the mean deltaK graph was reached under K = 2 (Fig. S3A), and also under K = 2, the least varying data point could be observed in the distribution-of-likelihood graph (Fig. S3B). K = 4 was taken as an example of a more strongly differentiated population structure. As analysis with K = 2 showed genetic introgression more clearly than analysis with K = 4, we will focus on the results with K = 2. Structure analysis of the tetraploids, excluding all diploids, with K = 2 revealed two genetic clusters, and each cluster corresponded to one of the species (Fig. S4 A and B). A. arenosa was characterized by the green cluster, A. lyrata by the yellow cluster, and populations with both colors showed signs of introgression and hybridization between these two species. In general, a decrease of the A. arenosa-specific cluster in tetraploid A. lyrata populations from the south to the north of the study area could be observed. In the southern part, the eastern Austrian Forealps, all A. lyrata populations exhibited large proportions of the A. arenosa-specific cluster, which identifies these populations as strong interspecific hybrids.

These results were supported by a correlation analysis of genetic differentiation (GST values) and geographic distances (Fig. 3) along a linear transect through the whole study area (Fig. 3 and Fig. S2). Population pairs for calculating GST and geographic distance always consisted of the northernmost population of the linear transect and one of the southern populations. Tetraploid A. lyrata populations from the eastern Austrian Forealps (marked with “B” in Fig. 3 and Fig. S2) showed strong hybrid indices between 0.032 and 0.058 (mean 0.045 ± 0.011), and moderate hybrid indices of populations from the northern Wachau (marked with “A” in Fig. 3 and Fig. S2) were found to vary from 0.009 to 0.023 (mean 0.014 ± 0.005). All remaining A. lyrata populations showed weak hybrid indices between 0.002 and 0.022 (mean 0.006 ± 0.004). A. arenosa populations also revealed weak hybrid indices between 0.002 and 0.018 (mean 0.006 ± 0.005).

Fig. 3.

Genetic differentiation GST after Nei (20), calculated with Tetrasat, of tetraploid A. arenosa and A. lyrata ssp. petraea populations plotted against geographic distance along a linear transect through the study area. Each population's hybrid index is shown as a vertical bar and corresponds to the fraction (%) of the genetic cluster characteristic for A. arenosa found in an A. lyrata population and vice versa. The hybrid index of a population was obtained from Structure analysis with K = 2. Genetic identities of the individuals belonging to a population were combined using an in-house software tool: For K = 2, for example, the proportions of the A. arenosa- and A. lyrata-like genetic clusters of all individuals of a population were summed up to the total proportions of these two genetic clusters within this population. The hybrid zone with moderate hybrid indices, located in the Wachau, is marked with “A.” The hybrid zone with strong hybrid indices, located in the eastern Austrian Forealps, is marked with “B.” Acquisition of geographic distances is explained in detail in Fig. S2.
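The population-level hybrid index described in this caption can be illustrated with a minimal Python sketch. It is not the in-house tool mentioned above, whose details are not given here. The function names, the record layout (population, A. arenosa-like proportion, A. lyrata-like proportion), and the normalization by the number of individuals are assumptions made for illustration only.

```python
from collections import defaultdict


def population_cluster_proportions(memberships):
    """Combine per-individual cluster memberships (K = 2) per population.

    `memberships` is an iterable of (population, q_arenosa, q_lyrata)
    tuples, where the two q values are one individual's membership
    proportions in the A. arenosa-like and A. lyrata-like clusters.
    The record layout is hypothetical, not the authors' file format.
    """
    sums = defaultdict(lambda: [0.0, 0.0])
    counts = defaultdict(int)
    for population, q_arenosa, q_lyrata in memberships:
        sums[population][0] += q_arenosa
        sums[population][1] += q_lyrata
        counts[population] += 1
    # Assumption: summed proportions are divided by the number of
    # individuals so that the two population values add up to 1.
    return {
        population: (total[0] / counts[population], total[1] / counts[population])
        for population, total in sums.items()
    }


def hybrid_index(proportions, species):
    """Fraction of the *other* species' cluster found in a population.

    `proportions` is a (q_arenosa, q_lyrata) pair for the population;
    `species` is "arenosa" or "lyrata" (the population's taxonomy).
    """
    q_arenosa, q_lyrata = proportions
    if species == "lyrata":
        return q_arenosa
    if species == "arenosa":
        return q_lyrata
    raise ValueError("species must be 'arenosa' or 'lyrata'")
```

With this layout, an A. lyrata population whose individuals carry on average 20% of the A. arenosa-like cluster would receive a hybrid index of 0.2.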

Genetic Gradient, Based on Microsatellite Markers, Indicates Northward Migration of the Introgressed A. lyrata Populations.

On the basis of the Structure analysis with K = 4 (Fig. S4 C and D), a migration model of tetraploid A. lyrata populations could be developed: As there was a high proportion of the red cluster in populations from both the eastern Austrian Forealps and the southern Wachau (Fig. S4 C and D), colonization of the Wachau from the eastern Alps can be assumed. The red cluster gradually decreased toward the northern Wachau (Fig. S4D), and the yellow and violet clusters prevailed. The yellow cluster was present mainly in the south- and northwestern part of the Wachau, and the violet cluster in the northeastern part on both sides of the river. Hence, the tetraploid populations exhibit a genetic gradient from south to north.

Discussion

In this study we describe a natural Arabidopsis hybrid involving a combination of parental species: On the basis of morphological data, diploid cytotypes of both A. arenosa and A. lyrata are well separated, but, in contrast, tetraploids of each species partly overlap in morphology, which indicates introgression between the two species. On the basis of nuclear microsatellites, we also observed introgression between A. arenosa and A. lyrata resulting in populations with different hybrid indices. High values of genetic diversity of the subsequently isolated, introgressed tetraploids in the study region (Table S1) indicate that the populations have an ancient colonization history. This is supported by a genetic gradient, observed from microsatellite data, which marks the hybrid's migration route from the eastern Austrian Forealps into the Danube River Valley (Wachau). On the basis of nuclear microsatellites, we observed mainly unidirectional gene flow from A. arenosa as pollen donor into A. lyrata as the mother plant. Similarly, A. suecica’s hybrid origin is explained by the paternal contribution of A. arenosa and A. thaliana as maternal genetic source (21). In the case of A. kamchatica, the pollen donor is represented mainly by A. halleri, and A. lyrata served as the mother plant (11). The data presented here also provide some evidence that in natural Arabidopsis hybrids each species is predetermined to serve either as pollen donor (A. arenosa) or pollen acceptor (A. lyrata).

The genus Arabidopsis is one of numerous examples of incomplete reproductive isolation between species. Although diploids of each species are well separated according to morphological and neutral molecular marker data, hybridization frequently occurs, especially in plant families that underwent radiation events, e.g., Asteraceae [Helianthus (22), Senecio (23, 24), Tragopogon (25, 26)] and Brassicaceae [Boechera (27), Cardamine (28)]. In many of these study systems, hybridization is not one single event, but occurs multiple times, independently and sometimes even polytopically (24, 25–28). For the A. lyrata × A. arenosa hybrid/introgression system, microsatellite marker data indicate that populations from the eastern Austrian Forealps with their strong hybrid index mark the initial hybridization event, which could have happened during Pleistocene glaciation and deglaciation cycles. However, it will remain an open question if these plants hybridized on the diploid or tetraploid level or even between the two different ploidy levels. Populations from the northern Wachau with their moderate hybrid index mark a second, probably very recent and ongoing hybridization event, and for those populations hybridization on the tetraploid level can be assumed, as only tetraploids are found in this region today.

Hybridization between A. arenosa and A. lyrata is restricted to a region that was subject to strong environmental dynamics during Pleistocene glaciation and deglaciation cycles; although the area remained unglaciated at least during the Last Glacial Maximum, the boundary of this last glaciation ran along its eastern edge. At that time, climate oscillations were the reason for environmental dynamics, resulting in disturbed habitats, which, in general, promote hybrid speciation (29–31). Today, habitats disturbed by humans offer a major ecological basis for hybridization (30, 31). Pacheco et al. (31), for example, found enhanced hybridization in anthropogenically disturbed regions in comparison with undisturbed ones in Gunnera, and they argued that environmental dynamics might have brought the parental species into closer contact. In the case of A. lyrata × A. arenosa, present-day edaphic and climatic conditions provide additional ecologically altered habitats for the hybrid in contrast to its parents: Substrate types in the Wachau, where most of the hybrids occur, are siliceous bedrocks, rarely serpentine, and not the usual limestone and dolomite on which the diploids of both species preferentially grow. Average annual rainfall declines by more than 50%, and annual mean temperature increases by a maximum of 2 °C. These past and present environmental dynamics probably facilitated hybridization and contributed to the establishment of the hybrid populations. Our future projects will aim at testing whether edaphic adaptation in particular was one major driving force for hybrid speciation between A. arenosa and A. lyrata.

Methods

Plant Material.

Plants were collected on numerous field trips as both silica-dried and herbarium material. Detailed information is provided in the SI Text and SI Appendix along with chromosome numbers and information on mitotic chromosome preparation.

DNA Isolation, Amplification, and Sequencing of Plastid DNA.

Total DNA was obtained from dried-leaf material and extracted according to the cetyl trimethylammonium bromide protocol of Doyle and Doyle (32) with modifications according to previous studies (11). For the cpDNA marker trnL intron and trnL/F intergenic spacer (trnL/F-IGS), primers, PCR cycling scheme, purification of the amplified fragment, cycle sequencing, and sequencing on a MegaBace 500 sequencer followed the protocol of Schmickl et al. (11). Amplified sequences of trnL/F-IGS included the complete trnL/F-IGS and the first 18 bases of the trnF gene.

Genotyping.

Microsatellites were chosen from previous population studies of A. lyrata (33). Selection criteria, PCR, and genotyping conditions are provided together with a list of the seven microsatellites chosen for the analyses (Table S2). Scoring of fragment sizes and fluorescence intensity/peak heights (in tetraploids) was manually performed from the raw data displayed with Genetic Profiler (GE Healthcare). Allele frequencies within each tetraploid individual could be unambiguously assigned manually for the majority of individuals on the basis of the fluorescence intensity of the fragment peaks. However, two aspects had to be taken into account: (i) In general, the area size underneath the electrophoretic peak was measured, but (ii) it was considered that fluorescence intensity slightly decreases with increasing fragment length. About 10% of the total number of analyzed individuals with ambiguous allele frequencies were excluded from the analyses.
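The scoring described above was done manually. Purely as an illustration of the underlying idea, the following Python sketch turns the peak areas of one tetraploid individual at one locus into allele copy numbers. The function name, the input layout, and the greedy apportionment rule are assumptions and not the authors' procedure, and the slight decrease of fluorescence intensity with fragment length is ignored here.

```python
def estimate_dosage(peak_areas, ploidy=4):
    """Estimate allele copy numbers from electropherogram peak areas.

    `peak_areas` maps allele (fragment size) -> peak area for a single
    individual at a single locus. Every observed allele receives one
    copy; each remaining copy goes to the allele whose area per
    assigned copy is currently largest (a simple greedy rule chosen
    for illustration, not the authors' manual scoring scheme).
    """
    if not peak_areas:
        raise ValueError("at least one peak is required")
    if len(peak_areas) > ploidy:
        raise ValueError("more alleles than chromosome sets")
    if any(area <= 0 for area in peak_areas.values()):
        raise ValueError("peak areas must be positive")

    dosage = {allele: 1 for allele in peak_areas}
    for _ in range(ploidy - len(peak_areas)):
        # Allele that is most under-represented relative to its peak area.
        best = max(peak_areas, key=lambda a: peak_areas[a] / dosage[a])
        dosage[best] += 1
    return dosage
```

Under this rule, two peaks with areas of roughly 3:1 yield three copies and one copy, whereas two peaks of similar area yield two copies each. Individuals whose ratios fall between such cases correspond to the ambiguous genotypes that were excluded from the analyses.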

Data Analyses.

Plastidic trnL/F sequence definition and network analysis.

Plastidic trnL/F sequences were defined as haplotypes and suprahaplotypes following our previous studies (e.g., 11). Haplotypes are characterized by varying (in sequence and structure) trnF pseudogenes in the 3′-region of the trnL/F-IGS close to the functional trnF gene: Haplotypes belonging to one suprahaplotype share the same base order throughout the whole sequence except for the pseudogene-rich region, where they vary in both length and base content. Mutation rate within the pseudogene-rich region is about 20 times higher than within the noncoding spacer and intron regions (34). Therefore, our cpDNA dataset is based on trnL/F suprahaplotypes only. Suprahaplotypes differ from each other by single point mutations and/or indels. Newly defined trnL/F haplotypes were assigned GenBank nos. FJ477717–FJ477722 (SI Text and SI Appendix). The network was constructed with TCS version 1.21 (35) using the statistical parsimony algorithm (36). Gaps (except polyT stretches) were coded as single additional binary characters.

Coding of microsatellite alleles and genetic assignment tests.

Information on the coding procedure is provided in Table S2. Microsatellite raw data were deposited at Dryad under doi:10.5061/dryad.j5g76. Two Bayesian analyses were used to identify population structure. Both analyses assume random mating. This assumption can be applied for diploid A. lyrata with its well-investigated sporophytic self-incompatibility system (e.g., ref. 37), which prevents self-pollination, and also for tetraploid A. lyrata, as Mable et al. (17) demonstrated that the tetraploids from eastern Austria remained self-incompatible. The mating system of A. arenosa was addressed in a greenhouse experiment: Over a period of 3 y, we performed selfing and reciprocal outcrossing experiments with the populations that were studied with microsatellite markers. We obtained an outcrossing rate of over 95% over selfing. To pregroup the dataset, including all populations (diploids and tetraploids; altogether 1,830 individuals taken from 97 populations, 4 individuals as the minimum size of a population), BAPS version 5.1 (38) was favored over Structure. Pregrouping was based on clustering of populations instead of individuals, which is a unique feature of BAPS. Furthermore, BAPS can handle combined data of diploids and tetraploids, and Structure cannot. A genetic mixture model with “clustering of groups of individuals” and a “fixed k” option was applied to test if grouping was according to species and cytotypes. Using K = 2, we tested whether the dataset clustered according to A. arenosa and A. lyrata; using K = 4, we tested whether the two species groups split up into two subgroups each of diploids and tetraploids on the basis of microsatellite data. In these analyses, admixture was not taken into account because we do not focus on interploidal gene flow in this article. With Structure version 2.3.3 (39), only tetraploids of A. arenosa and A. lyrata, which were drawn from a total of 1,609 individuals from 87 populations, were analyzed.
The admixture model was applied, assuming individuals to have inherited a genome fraction from ancestors in population K. This model is certainly correct for the majority of the dataset, due to a common colonization history of each of the two species in eastern Austria. The correlated allele frequency model (40) was used because allele frequencies in different populations of each species are likely to be similar, due to past migration events and/or shared ancestry. The K value was estimated with R version 2.11.1 and Structure2.2-sum R script (41) from 96 runs with K values ranging from K = 1–12, each with 8 iterations. The highest value of the mean deltaK graph was reached under K = 2 (Fig. S3A), and under K = 2, the least-varying data point could be observed in the distribution of likelihood graph (Fig. S3B). K = 4 was taken as an example of a more strongly differentiated population structure. Analyses were run for 1,000,000 generations, of which the first 100,000 were discarded as burn-in. Genetic identities of individuals of each population were combined using an in-house software tool: For K = 2, for example, the proportions of the yellow and green genetic clusters of all individuals of a population were summed up to the total proportions of these two genetic clusters within this population. The hybrid index of a population, plotted on the GST graph (Fig. 3), corresponds to the fraction (%) of the genetic cluster characteristic for A. arenosa found in A. lyrata and vice versa. These values were obtained from the Structure analysis with K = 2.
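The choice of K from replicate runs, as described above, can be sketched in Python. The sketch assumes the commonly used definition of deltaK: the absolute second-order difference of the mean log-likelihood across successive K values, divided by the standard deviation of the log-likelihood across replicate runs at that K. Whether the R script used by the authors computes exactly this variant is not stated here, and the function name and input layout are hypothetical.

```python
from statistics import mean, stdev


def delta_k(log_likelihoods):
    """Compute deltaK for every K that has both neighbouring K values.

    `log_likelihoods` maps K -> list of log-likelihood values, one per
    replicate run (at least two runs per K). Returns a dict K -> deltaK.
    K values at either end of the tested range have no second-order
    difference and are therefore skipped.
    """
    means = {k: mean(values) for k, values in log_likelihoods.items()}
    result = {}
    for k in sorted(log_likelihoods):
        if k - 1 not in means or k + 1 not in means:
            continue
        spread = stdev(log_likelihoods[k])
        if spread == 0:
            # deltaK is undefined when all replicate runs agree exactly.
            continue
        second_difference = abs(means[k + 1] - 2 * means[k] + means[k - 1])
        result[k] = second_difference / spread
    return result


def best_k(log_likelihoods):
    """Return the K with the highest deltaK."""
    values = delta_k(log_likelihoods)
    return max(values, key=values.get)
```

Because the statistic needs both neighbours, it cannot select the lowest or highest K tested (here K = 1 and K = 12), which is one reason to inspect the likelihood distribution alongside it, as done in Fig. S3B.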

Population differentiation based on microsatellite data.

Tetraploids (1,604 individuals from 86 populations) were analyzed with Tetrasat version 1.0 (42). This software was originally developed to calculate allele frequencies of partial heterozygotes before calculating parameters of population genetics, as the authors assumed peak heights not strongly correlating with allele frequencies (43). However, we believe that, in our dataset, most allele frequencies can be assigned, and so we use an input file with four alleles per individual per marker. Pairwise population differentiation (GST) was calculated according to Nei (20). GST values are plotted against geographic distances along a linear transect through the research area (Fig. 3 and Fig. S2). Population pairs for calculating GST and geographic distance always consisted of the northernmost population of the linear transect and one of the southern populations.
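As an illustration of the pairwise differentiation measure, the following Python sketch computes a basic GST, defined as (HT − HS)/HT, for two populations from tetraploid genotypes coded as four alleles per individual per marker. It is a simplified sketch and not the Tetrasat implementation: sample-size corrections are omitted, both populations are weighted equally, and all function names and the data layout are assumptions.

```python
from collections import Counter


def allele_frequencies(genotypes):
    """Allele frequencies at one locus from a list of genotypes.

    Each genotype is a tuple of alleles (four for a tetraploid).
    """
    counts = Counter(allele for genotype in genotypes for allele in genotype)
    total = sum(counts.values())
    return {allele: n / total for allele, n in counts.items()}


def expected_heterozygosity(frequencies):
    """Gene diversity: 1 minus the sum of squared allele frequencies."""
    return 1.0 - sum(p * p for p in frequencies.values())


def pairwise_gst(population_1, population_2):
    """GST = (HT - HS) / HT for two populations, averaged over loci.

    Each population is a list with one allele-frequency dict per locus,
    with loci in the same order in both lists.
    """
    hs_sum = 0.0
    ht_sum = 0.0
    for freq_1, freq_2 in zip(population_1, population_2):
        # HS: mean within-population gene diversity.
        hs_sum += (expected_heterozygosity(freq_1) + expected_heterozygosity(freq_2)) / 2
        # HT: gene diversity of the pooled (mean) allele frequencies.
        alleles = set(freq_1) | set(freq_2)
        pooled = {a: (freq_1.get(a, 0.0) + freq_2.get(a, 0.0)) / 2 for a in alleles}
        ht_sum += expected_heterozygosity(pooled)
    if ht_sum == 0:
        return 0.0  # no variation at any locus, hence no differentiation
    return (ht_sum - hs_sum) / ht_sum
```

For the transect analysis, one population of each pair would always be the northernmost population, and the resulting values would be plotted against geographic distance.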

Morphometric Analyses.

Selection criteria of morphological characters are provided with the list of the 29 characters investigated (Table S3). Raw data of morphometric measurements are deposited at Dryad under doi:10.5061/dryad.j5g76. The whole dataset of diploid and tetraploid A. arenosa and A. lyrata was examined with PCA to show overall morphological plasticity both within and between species and cytotypes. PCA was performed with standardization by zero mean and unit SD. Euclidean distance was used for computing pairwise similarities. PCA was carried out with SYN-TAX 2000 (44) and graphically displayed with SPSS version 16.0.
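The standardization step and the share of variance carried by the first PCA axis can be sketched in plain Python. This is not a reimplementation of SYN-TAX 2000: the sketch assumes purely numeric characters (qualitative characters would first need a numeric coding), uses power iteration to find only the first axis, and all function names are hypothetical.

```python
import math
from statistics import mean, stdev


def standardize(rows):
    """Scale every character (column) to zero mean and unit SD.

    A character that is constant across all individuals has SD 0 and
    would raise ZeroDivisionError; such columns must be dropped first.
    """
    columns = list(zip(*rows))
    means = [mean(column) for column in columns]
    sds = [stdev(column) for column in columns]
    return [[(x - m) / s for x, m, s in zip(row, means, sds)] for row in rows]


def covariance_matrix(z):
    """Covariance of standardized data, i.e. the correlation matrix."""
    n, p = len(z), len(z[0])
    return [[sum(row[i] * row[j] for row in z) / (n - 1) for j in range(p)]
            for i in range(p)]


def leading_axis(matrix, iterations=500):
    """Largest eigenvalue and its eigenvector by power iteration.

    The start vector (1, 2, ..., p) is an arbitrary choice; the method
    fails if it happens to be orthogonal to the leading eigenvector.
    """
    p = len(matrix)
    vector = [float(i + 1) for i in range(p)]
    for _ in range(iterations):
        product = [sum(matrix[i][j] * vector[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in product))
        if norm == 0.0:
            raise ValueError("start vector lies in the null space")
        vector = [x / norm for x in product]
    product = [sum(matrix[i][j] * vector[j] for j in range(p)) for i in range(p)]
    eigenvalue = sum(a * b for a, b in zip(vector, product))
    return eigenvalue, vector


def first_axis_variance_share(rows):
    """Proportion of total variance explained by the first PCA axis."""
    correlation = covariance_matrix(standardize(rows))
    eigenvalue, _ = leading_axis(correlation)
    total = sum(correlation[i][i] for i in range(len(correlation)))
    return eigenvalue / total
```

Expressed this way, a first-axis share of 0.25 would mean that one quarter of the standardized morphological variance lies along that axis.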

Acknowledgments

We thank Holger Baldauf, Susanne Ball, Christina Mall, Juraj Paule, and Michaela Wernisch for laboratory assistance. We thank Markus Kiefer for the development of two software tools, which can be requested from the corresponding author. Two anonymous reviewers are gratefully acknowledged for substantial improvement of the manuscript. This research was supported by Deutsche Forschungsgemeinschaft Grant KO 2302/5-2 (to M.A.K.).

Footnotes

The authors declare no conflict of interest.

Data deposition: The sequences reported in this paper have been deposited in the GenBank database (accession nos. FJ477717–FJ477722). Microsatellite and morphological raw data have been deposited at Dryad under doi:10.5061/dryad.j5g76.

*This Direct Submission article had a prearranged editor.

This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1104212108/-/DCSupplemental.

References

  • 1. Kleidon A, Malhi Y, Cox PM. Maximum entropy production in environmental and ecological systems. Philos Trans R Soc Lond B Biol Sci. 2010;365:1297–1302. doi: 10.1098/rstb.2010.0018.
  • 2. Coyne JA, Orr HA. Allopatric and parapatric speciation. Speciation. Sunderland, MA: Sinauer; 2004. pp. 83–124.
  • 3. Barton NH, Hewitt GM. Adaptation, speciation and hybrid zones. Nature. 1989;341:497–503. doi: 10.1038/341497a0.
  • 4. Mallet J. Hybrid speciation. Nature. 2007;446:279–283. doi: 10.1038/nature05706.
  • 5. Rieseberg LH, Willis JH. Plant speciation. Science. 2007;317:910–914. doi: 10.1126/science.1137729.
  • 6. Abbott RJ, Lowe AJ. Origins, establishment and evolution of new polyploid species: Senecio cambrensis and S. eboracensis in the British Isles. Biol J Linn Soc Lond. 2004;82:467–474.
  • 7. Milne RI, Abbott RJ. Reproductive isolation among two interfertile Rhododendron species: Low frequency of post-F1 hybrid genotypes in alpine hybrid zones. Mol Ecol. 2008;17:1108–1121. doi: 10.1111/j.1365-294X.2007.03643.x.
  • 8. Koch MA, Wernisch M, Schmickl R. Arabidopsis thaliana’s wild relatives: An updated overview on systematics, taxonomy and evolution. Taxon. 2008;57:933–943.
  • 9. O'Kane SL, Schaal BA, Al-Shehbaz IA. The origins of Arabidopsis suecica (Brassicaceae) as indicated by nuclear rDNA sequences. Syst Bot. 1996;21:559–566.
  • 10. Jakobsson M, et al. A unique recent origin of the allotetraploid species Arabidopsis suecica: Evidence from nuclear DNA markers. Mol Biol Evol. 2006;23:1217–1231. doi: 10.1093/molbev/msk006.
  • 11. Schmickl R, Jørgensen MH, Brysting AK, Koch MA. The evolutionary history of the Arabidopsis lyrata complex: A hybrid in the amphi-Beringian area closes a large distribution gap and builds up a genetic barrier. BMC Evol Biol. 2010;10:98. doi: 10.1186/1471-2148-10-98.
  • 12. Shimizu-Inatsugi R, et al. The allopolyploid Arabidopsis kamchatica originated from multiple individuals of Arabidopsis lyrata and Arabidopsis halleri. Mol Ecol. 2009;18:4024–4048. doi: 10.1111/j.1365-294X.2009.04329.x.
  • 13. Wang W-K, et al. Multilocus analysis of genetic divergence between outcrossing Arabidopsis species: Evidence of genome-wide admixture. New Phytol. 2010;188:488–500. doi: 10.1111/j.1469-8137.2010.03383.x.
  • 14. Castric V, Bechsgaard J, Schierup MH, Vekemans X. Repeated adaptive introgression at a gene under multiallelic balancing selection. PLoS Genet. 2008;4:e1000168. doi: 10.1371/journal.pgen.1000168.
  • 15. Mesicek J. Chromosome counts in Cardaminopsis arenosa agg. (Cruciferae). Preslia. 1970;42:225–248.
  • 16. Polatschek A. Cytotaxonomische Beiträge zur Flora der Ostalpenländer. Österr Bot Z. 1966;113:1–46.
  • 17. Mable BK, Beland J, Di Berardo C. Inheritance and dominance of self-incompatibility alleles in polyploid Arabidopsis lyrata. Heredity. 2004;93:476–486. doi: 10.1038/sj.hdy.6800526.
  • 18. Koch MA, Matschinger M. Evolution and genetic differentiation among relatives of Arabidopsis thaliana. Proc Natl Acad Sci USA. 2007;104:6272–6277. doi: 10.1073/pnas.0701338104.
  • 19. Birks HJB, Willis KJ. Alpines, trees, and refugia in Europe. Plant Ecol Div. 2008;1:147–160.
  • 20. Nei M. Definition and estimation of fixation indices. Evolution. 1986;40:643–645. doi: 10.1111/j.1558-5646.1986.tb00516.x.
  • 21. Comai L, et al. Phenotypic instability and rapid gene silencing in newly formed Arabidopsis allotetraploids. Plant Cell. 2000;12:1551–1568. doi: 10.1105/tpc.12.9.1551.
  • 22. Yatabe Y, Kane NC, Scotti-Saintagne C, Rieseberg LH. Rampant gene exchange across a strong reproductive barrier between the annual sunflowers, Helianthus annuus and H. petiolaris. Genetics. 2007;175:1883–1893. doi: 10.1534/genetics.106.064469.
  • 23. Brennan AC, Bridle JR, Wang A-L, Hiscock SJ, Abbott RJ. Adaptation and selection in the Senecio (Asteraceae) hybrid zone on Mount Etna, Sicily. New Phytol. 2009;183:702–717. doi: 10.1111/j.1469-8137.2009.02944.x.
  • 24. Ashton PA, Abbott RJ. Multiple origins and genetic diversity in the newly arisen allopolyploid species, Senecio cambrensis Rosser (Compositae). Heredity. 1992;68:25–32.
  • 25. Symonds VV, Soltis PS, Soltis DE. Dynamics of polyploid formation in Tragopogon (Asteraceae): Recurrent formation, gene flow, and population structure. Evolution. 2010;64:1984–2003. doi: 10.1111/j.1558-5646.2010.00978.x.
  • 26. Soltis DE, et al. Recent and recurrent polyploidy in Tragopogon (Asteraceae): Cytogenetic, genomic and genetic comparisons. Biol J Linn Soc Lond. 2004;82:485–501.
  • 27. Koch MA, Dobeš C, Mitchell-Olds T. Multiple hybrid formation in natural populations: Concerted evolution of the internal transcribed spacer of nuclear ribosomal DNA (ITS) in North American Arabis divaricarpa (Brassicaceae). Mol Biol Evol. 2003;20:338–350. doi: 10.1093/molbev/msg046.
  • 28. Franzke A, Mummenhoff K. Recent hybrid speciation in Cardamine (Brassicaceae): Conversion of nuclear ribosomal ITS sequences in statu nascendi. Theor Appl Genet. 1999;98:831–834.
  • 29. Bleeker W. Interspecific hybridization in Rorippa (Brassicaceae): Patterns and processes. Syst Biodivers. 2007;5:311–319.
  • 30. Urbanska KM, Landolt E. Patterns and processes of man-influenced hybridization in Cardamine L. In: van Raamsdonk LWD, den Nijs JCM, editors. Plant Evolution in Man-Made Habitats. Amsterdam: Hugo de Vries Laboratory; 1999. pp. 29–47.
  • 31. Pacheco P, Stuessy TF, Crawford DJ. Natural interspecific hybridization in Gunnera (Gunneraceae) of the Juan Fernandez Islands, Chile. Pac Sci. 1991;45:389–399.
  • 32. Doyle JJ, Doyle JL. A rapid DNA isolation procedure for small quantities of fresh leaf tissue. Phytochem Bull. 1987;19:11–15.
  • 33. Clauss MJ, Cobban H, Mitchell-Olds T. Cross-species microsatellite markers for elucidating population genetic structure in Arabidopsis and Arabis (Brassicaeae). Mol Ecol. 2002;11:591–601. doi: 10.1046/j.0962-1083.2002.01465.x.
  • 34. Koch MA, et al. Evolution of the trnF(GAA) gene in Arabidopsis relatives and the Brassicaceae family: Monophyletic origin and subsequent diversification of a plastidic pseudogene. Mol Biol Evol. 2005;22:1032–1043. doi: 10.1093/molbev/msi092.
  • 35. Clement M, Posada D, Crandall KA. TCS: A computer program to estimate gene genealogies. Mol Ecol. 2000;9:1657–1659. doi: 10.1046/j.1365-294x.2000.01020.x.
  • 36. Templeton AR, Crandall KA, Sing CF. A cladistic analysis of phenotypic associations with haplotypes inferred from restriction endonuclease mapping and DNA sequence data. III. Cladogram estimation. Genetics. 1992;132:619–633. doi: 10.1093/genetics/132.2.619.
  • 37. Schierup MH, Mable BK, Awadalla P, Charlesworth D. Identification and characterization of a polymorphic receptor kinase gene linked to the self-incompatibility locus of Arabidopsis lyrata. Genetics. 2001;158:387–399. doi: 10.1093/genetics/158.1.387.
  • 38. Corander J, Waldmann P, Sillanpää MJ. Bayesian analysis of genetic differentiation between populations. Genetics. 2003;163:367–374. doi: 10.1093/genetics/163.1.367.
  • 39. Pritchard JK, Stephens M, Donnelly P. Inference of population structure using multilocus genotype data. Genetics. 2000;155:945–959. doi: 10.1093/genetics/155.2.945.
  • 40. Falush D, Stephens M, Pritchard JK. Inference of population structure using multilocus genotype data: Linked loci and correlated allele frequencies. Genetics. 2003;164:1567–1587. doi: 10.1093/genetics/164.4.1567.
  • 41. Ehrich D, et al. Genetic consequences of Pleistocene range shifts: Contrast between the Arctic, the Alps and the East African mountains. Mol Ecol. 2007;16:2542–2559. doi: 10.1111/j.1365-294X.2007.03299.x.
  • 42. Markwith SH, Stewart DJ, Dyer JL. TETRASAT: A program for the population analysis of allotetraploid microsatellite data. Mol Ecol Notes. 2006;6:586–589.
  • 43. Markwith SH, Scanlon MJ. Multiscale analysis of Hymenocallis coronaria (Amaryllidaceae) genetic diversity, genetic structure, and gene movement under the influence of unidirectional stream flow. Am J Bot. 2007;94:151–160. doi: 10.3732/ajb.94.2.151.
  • 44. Podani J. SYN-TAX 2000. Computer Programs for Data Analysis in Ecology and Systematics: User's Manual. Budapest, Hungary: Scientia Publishing; 2001. pp. 1–53.


