Significance
Central topics in evolutionary biology include uncovering the processes and genetic bases of speciation and documenting environmental adaptations and processes responsible for them. The challenging environment of the Qinghai-Tibetan Plateau (QTP) facilitates such investigations, and the Tibetan frog, Nanorana parkeri, offers a unique opportunity to investigate these processes. A cohort of whole-genome sequences of 63 individuals from across its entire range opens avenues for incorporating population genomics into studies of speciation. Natural selection plays an important role in maintaining and driving the continuing divergence and reproductive isolation of populations of the species. The QTP is a natural laboratory for studying how selection drives adaptation, how environments influence evolutionary history, and how these factors can interact to provide insight into speciation.
Keywords: gene flow, hybridization, natural selection, population genomics, speciation
Abstract
Tibetan frogs, Nanorana parkeri, are differentiated genetically but not morphologically along geographical and elevational gradients in a challenging environment, presenting a unique opportunity to investigate processes leading to speciation. Analyses of whole genomes of 63 frogs reveal population structuring and historical demography, characterized by highly restricted gene flow in a narrow geographic zone lying between matrilines West (W) and East (E). A population found only along a single tributary of the Yalu Zangbu River has the mitogenome only of E, whereas nuclear genes of W comprise 89–95% of the nuclear genome. Selection accounts for 579 broadly scattered, highly divergent regions (HDRs) of the genome, which involve 365 genes. These genes fall into 51 gene ontology (GO) functional classes, 14 of which are likely to be important in driving reproductive isolation. GO enrichment analyses of E reveal many overrepresented functional categories associated with adaptation to high elevations, including blood circulation, response to hypoxia, and UV radiation. Four genes, including DNAJC8 in the brain, TNNC1 and ADORA1 in the heart, and LAMB3 in the lung, differ in levels of expression between low- and high-elevation populations. High-altitude adaptation plays an important role in maintaining and driving continuing divergence and reproductive isolation. Use of total genomes enabled recognition of selection and adaptation in and between populations, as well as documentation of evolution along a stepped cline toward speciation.
Speciation, the fundamental phenomenon underlying biodiversity, continues to be a central focus of research in evolutionary biology. Disentangling pattern and process and gaining an understanding of the underlying genetic mechanisms as speciation proceeds are central issues in species biology (1, 2). Most vertebrate species arise by vicariant isolation. This form of speciation takes time and the right combination of stochastic genetic change and natural selection (3, 4). It can occur swiftly under certain circumstances, for example, when populations experience rapid demographic changes (e.g., bottlenecks, expansions) (5, 6) or when they are exposed to ecological shifts (7–9). Speciation becomes difficult to understand when it involves many nonexclusive mechanisms (10). Fortunately, approaches using whole genomes or transcriptomes are opening new research pathways (11–13).
Speciation accompanied by gene flow (i.e., without complete geographical isolation) is thought to be common in animal evolution (14, 15). The role of gene flow in speciation remains controversial because it is often assumed to be an impediment to speciation (16). However, gene flow can also facilitate evolution and speciation by transferring adaptive genes or generating novel genes (17). A genomic approach has the potential to identify genes that are important for adaptation and speciation, which often evolve more rapidly than other genes (18). Growing numbers of studies have used genome-scale comparisons in phylogeography and speciation [e.g., birds (19–22), fishes (23, 24), mammals (25, 26), reptiles (27, 28)].
How organisms adapt and diversify in high-altitude environments has attracted much attention in recent years (29–31). Because of its heterogeneous topography, inhospitable environment, and complex paleoclimate history, the Qinghai-Tibetan Plateau (QTP) is a natural laboratory for studying adaptation and speciation (32–35). Differences in elevation between valleys and mountaintops often exceed 2,000 m, presenting strong ecological gradients in abiotic variables, such as oxygen partial pressure (36), UV radiation (37), precipitation (38), and ambient temperatures (39). The periodicity of climatic oscillations affects diverse phenomena, such as dispersal, fluctuations in population size, and speciation (32, 40).
An endemic Tibetan frog, Nanorana parkeri, occurs in lentic environments of the southeastern QTP along the Yalu Zangbu River (YZR) drainage. It faces challenges that few other amphibians experience. Not unusual for frogs, it occurs at an elevation of 2,800 m, yet it is the only amphibian to also exist at 5,000 m (41), where oxygen is scarce and UV radiation is dangerously high. Western (W) and eastern (E) mitochondrial DNA matrilines of N. parkeri diverged in the middle Pleistocene and formed distinct entities with overlapping elevational ranges (42). Three nuclear DNA loci correspond to this pattern with limited admixture of E and W alleles in two localities near their geographic boundaries (43). However, neither the extent of gene flow nor the degree of isolation in the zone of admixture is known. No morphological differences between W and E have been noted (44). Clines in elevation, genetic-morphological discordance, and mtDNA introgression in the zone of admixture offer an unprecedented opportunity to investigate the role selection plays in driving rapid genetic change, environmental adaptation, and the evolution of species differences.
The annotated genome of N. parkeri (45) facilitates investigations into the evolutionary drivers of population divergence, and possibly speciation, by enabling the identification of selected genes and their functions. Herein, we report on whole-genome sequences of 63 N. parkeri individuals from across the range of the species and decipher the genetic mechanisms that may be responsible for driving population divergence. We reconstruct the evolutionary history of N. parkeri from a genomic perspective, investigate the impacts of isolation and gene flow on the process of speciation, and explore genomic consequences. We also evaluate ecological factors leading to speciation and identify candidate genes underlying the observed differentiation.
Results
Sampling and Sequencing.
Collection localities ranged from 2,900 to 4,900 m in elevation and covered the entire documented distribution of N. parkeri (Fig. 1A and SI Appendix, Table S1). The 45 E and W individuals were sequenced for 33.77 Gb (16.89-fold coverage) on average, and 18 individuals with mixed E and W heritage were sequenced for 11.92 Gb (5.96-fold coverage). Most reads were aligned (90.88% average mappable rate) to the reference genome of N. parkeri belonging to E (45). After genotyping and stringent quality-filtering, we retained 8.59 million single-nucleotide polymorphisms (SNPs) of E and W individuals and 6.44 million SNPs of mixed heritage.
Fig. 1.
Sampling sites and population structure of the Tibetan frog (N. parkeri). (A) Sampling locations (ArcGIS 10.2; Esri). Site numbers refer to SI Appendix, Table S1. Colors denote the five main groups recovered from population structure and phylogenetic analyses. Gray indicates hybrid populations (localities 25–27). (B) PCA of all high-coverage samples. (C) Principal component plot of E samples only. (D) ML tree based on concatenated sequences. W, green; E1, purple; E2, yellow; E3, light blue; E4, red.
Population Structure, Phylogeny, and Introgression.
Principal component analysis (PCA) and population structure analyses unambiguously identify five genetic clusters. The PCA plot separates populations W and E along the first eigenvector, which explains 57.76% of total genetic variance (Fig. 1B). The second eigenvector identifies subpopulations E1–E4 and explains 8.80% of the variance. E1 and E3 comprise samples from low elevations (∼2,900–3,300 m; localities 1–5; Fig. 1 A–C) near the Yalu Zangbu Grand Canyon, and no obvious geographic barrier separates them. Subpopulation E2 includes samples from high elevations (∼3,900–4,900 m; localities 6–8; Fig. 1 A–C). The remaining localities, at elevations from ∼3,700–4,500 m, form subpopulation E4. Phylogenetic analyses based on 1,000 neutral loci (Fig. 1D) and gene tree-based coalescent (SI Appendix, Figs. S1 and S2) methods display nearly identical topologies. Populations W and E split into distinct branches, with subpopulation E1 in the most basal position, subpopulation E2 following subpopulation E1, and then allopatric units E3 and E4. Population structure analysis corroborates the phylogenetic analyses and PCA, and identifies signs of nuclear gene flow from E to W at their boundary (Fig. 2). Two topology-based methods verify nuclear admixture among some of the matrilines. The TreeMix analysis (SI Appendix, Fig. S3) and the D-statistic test (Table 1) show gene flow within sympatric E1 and E3. A significant signature of admixture is found between E2 and E4 (|Z| > 3; Table 1).
Fig. 2.
Population structure plots with the number of ancestral clusters (K) = 2–5.
Table 1.
Admixture signatures from D-statistic tests
Test | D statistic | Z score |
W, E1; E3, E4 | −0.247 | −50.868 |
W, E1; E3, E2 | −0.264 | −61.376 |
W, E1; E4, E2 | −0.009 | −2.354 |
W, E2; E3, E4 | 0.117 | 21.667 |
E1, E2; E3, E4 | 0.260 | 78.181 |
Populations with gene flow are denoted in boldface.
Demographic History.
We used the generalized phylogenetic coalescent sampler (G-PhoCS) (46) to infer ancestral population sizes, divergence times, and migration rates. We estimate that W and E diverged ∼688.3–846.0 thousand years ago (Ka) (Fig. 3A). The ancestral effective population size of E at this period was reduced by ∼56% (SI Appendix, Table S2). Pairwise sequentially Markovian coalescent (PSMC) results suggest that W was reduced by ∼50% (SI Appendix, Fig. S4). Populations expanded two- to threefold until the penultimate glaciation [Marine Isotope Stage 6, ∼100–200 Ka (47)]. E1 diverged from E2–E4 at ∼181.7 Ka [95% credible interval (CI) = 158.0–205.4 Ka]. The time of divergence for E2 relative to E3 and E4, and between the latter two, is estimated at 45.9 Ka (95% CI: 36.9–55.6 Ka) and 39.3 Ka (95% CI: 32.1–46.6 Ka), respectively. We may underestimate the divergence time because the mutation rate [0.776e-09 per site per year (45)] is a conservative value. Demographic analysis shows evidence of genetic exchanges among all subpopulations of E as well as directional migration from E4 to W (Fig. 3A). Gene flow from the hypothesized ancestral population of E2–E4 into W occurred between 45.9 and 181.7 Ka (G-PhoCS analyses). Coalescent simulations of gene flow between W and E under different migration scenarios suggest that the inferred demographic parameters were most concordant with the observed patterns of differentiation (Fig. 3B).
Fig. 3.
Demographic inference. (A) Demographic history inferred by G-PhoCS. Widths of branches are proportional to Ne. Horizontal dashed lines denote posterior estimates for divergence times, associated mean values are shown in bold, and 95% credible intervals are shown in parentheses. Arrows indicate the direction of gene flow, and associated numbers indicate the estimates of total migration rates. (B) Distribution of FST(W, E1) values in the observed data and in the simulated data under different migration scenarios between W and E. The full model is the simulation with the full set of demographic parameters inferred from G-PhoCS. The no_E234 model is the simulation without the postdivergence migration from E234 to W. The no_E model is the simulation without current and postdivergence migration from E to W.
Genomic Isolation Between West and East Matrilines.
We used several statistical approaches to assess the extent of genomic differentiation between W and E. High levels of differentiation between W and E and within E are found using pairwise mean relative divergence (FST) and absolute sequence divergence (dxy). Between W and E, the mean values (FST = 0.4787 ± 0.0097; dxy = 0.0030 ± 0.0002) are more than threefold higher than those among subpopulations of E (FST: 0.1405 ± 0.0322, dxy: 0.0009 ± 0.0001; Fig. 4 and SI Appendix, Tables S3 and S4). Levels of divergence measured by the density of fixed differences per site between populations (df) are about 100-fold higher than those among subpopulations of E (SI Appendix, Table S5). The genomic landscape of divergence based on the df statistic shows large numbers of loci across the genome fixed between W and subpopulations of E (Fig. 4A and SI Appendix, Fig. S5).
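As a rough illustration of the two statistics contrasted above, the following Python sketch computes dxy and a Hudson-type FST for a single window from per-site allele frequencies. The choice of estimator, the function names, and the toy inputs are assumptions made for illustration; the text does not state which FST estimator was used.

```python
def dxy_site(p1, p2):
    """Probability that two alleles, one drawn from each population, differ."""
    return p1 * (1 - p2) + p2 * (1 - p1)


def window_dxy(freqs1, freqs2, window_len):
    """Absolute divergence per site: summed over variable sites and divided
    by the window length, so that invariant sites count as zero."""
    return sum(dxy_site(a, b) for a, b in zip(freqs1, freqs2)) / window_len


def window_fst_hudson(freqs1, freqs2, n1, n2):
    """Hudson-type FST as a ratio of sums over sites (assumed estimator).

    n1 and n2 are the numbers of sampled chromosomes in each population.
    """
    num = 0.0
    den = 0.0
    for p1, p2 in zip(freqs1, freqs2):
        num += ((p1 - p2) ** 2
                - p1 * (1 - p1) / (n1 - 1)
                - p2 * (1 - p2) / (n2 - 1))
        den += dxy_site(p1, p2)
    return num / den if den > 0 else float("nan")
```

Because FST is relative, it rises when within-population diversity falls, whereas dxy does not; that is why the two are reported side by side.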
Fig. 4.
Genomic divergence associated with species formation. (A) FST distribution between W and E1 and their landscape of genomic divergence measured by the df. Both statistics are measured from 50-kb nonoverlapping windows. Scaffolds are concatenated to reveal the whole-genomic divergence pattern. (B) FST distribution between E1 and E2 and their landscape of genomic divergence. (C) FST distribution between E3 and E4 and their landscape of genomic divergence.
We used a modified population branch statistic (PBS) approach to measure the extent of genomic differentiation between W and all four subpopulations of E (Materials and Methods). The mean PBS value of W (PBSW) is 0.5583. Neutral coalescent simulations based on the inferred demographic model find the top 2.5% of the observed PBSW values (ca. PBSW ≥ 0.8906; Fig. 5A) to be significantly higher than simulated neutral values (P < 2.2e-16, two-tailed Mann–Whitney test; Fig. 6). Because all five populations of the species expanded recently (based on PSMC analysis), we performed the simulation taking this expansion into account; results were similar to those under the G-PhoCS model (SI Appendix, section 1.1). Accordingly, we identify highly divergent regions (HDRs) of the genome as those with observed PBSW ≥ 0.8906; there are 579 such regions with a total length of 39.60 Mb. The longest fragment is 350 kb, but most (76.34%) span only a single 50-kb window (SI Appendix, Fig. S6). FST and dxy values are significantly higher in HDRs than in the background genome (P < 2.2e-16, two-tailed Mann–Whitney test; Fig. 5B and SI Appendix, Table S6). The correlation between FST and dxy is significantly positive (Pearson’s R > 0.3, P < 2.2e-16; SI Appendix, Fig. S7), reflecting reduced gene flow in these regions. A contrasting pattern between W and E occurs with respect to nucleotide diversity (π) (SI Appendix, Table S6). The value in HDRs of W is no less than for the rest of the genome (0.00078 vs. 0.00067) but is reduced significantly (P < 2.2e-16) in the corresponding regions of subpopulations of E (Fig. 5B and SI Appendix, Table S6). The reduced level of intrapopulation diversity in HDRs suggests that selection has affected E. Relative to the genomic background, these regions also contain a significantly higher derived allele frequency (DAF) (P < 2.2e-16).
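The outlier logic described above can be sketched in Python using the standard three-population form of PBS. This is an illustrative assumption: the modified statistic applied to W and the four subpopulations of E is defined in Materials and Methods and is not reproduced here, and the function names are hypothetical.

```python
import math


def branch_length(fst, cap=0.999999):
    """Transform FST to a divergence-time scale, T = -ln(1 - FST).
    FST is clamped to [0, cap] so the logarithm stays finite."""
    fst = min(max(fst, 0.0), cap)
    return -math.log(1.0 - fst)


def pbs(fst_ab, fst_ac, fst_bc):
    """Standard three-population PBS for focal population A."""
    return (branch_length(fst_ab)
            + branch_length(fst_ac)
            - branch_length(fst_bc)) / 2.0


def top_fraction_threshold(values, fraction=0.025):
    """Smallest value that still falls within the top `fraction` of windows."""
    ranked = sorted(values, reverse=True)
    k = max(1, int(len(ranked) * fraction))
    return ranked[k - 1]
```

Windows whose PBS is at or above the returned threshold would be flagged as candidate HDRs; adjacent flagged windows can then be merged into longer regions.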
Fig. 5.
Identification of HDRs. (A) Distribution of FST-based statistic of PBSW. HDRs in top 2.5% PBS distribution, light blue; outside HDRs, gray. (B) Comparisons of HDRs of PBSW in terms of FST, π, dxy, and DAF versus the genomic background. (C) Distribution of PBSE1. (D) Comparisons of HDRs of PBSE1 with the genomic background. Asterisks designate levels of significance between HDRs and outside HDRs by a two-tailed Mann–Whitney test (*P < 0.01; **P < 1e-8; ***P < 2.2e-16).
Fig. 6.
Distribution of observed genome-wide top 2.5% PBSW values compared with the simulated PBSW values under the full model.
Divergence of W and E Matrilines in Relation to Reproductive Isolation.
We used the annotated genome of N. parkeri to infer the functions of candidate target genes within the HDRs that might relate to speciation. We annotated 365 protein-coding genes in the HDRs of PBSW. Gene ontology (GO) evaluations identified 51 functional classes that were significantly overrepresented (P < 0.05; SI Appendix, Table S7). In total, 21 genes distributed among 14 GO terms have functions related to reproduction, including, as examples, reproductive developmental process (GO:0003006), sexual reproduction (GO:0019953), and spermatogenesis (GO:0007283) (Table 2 and SI Appendix, Table S7).
Table 2.
GO analysis of genes located in regions that strongly differentiated W from all four subpopulations of E
Category | Term | No. of genes | P value |
Cluster 1 | GO:0048538∼thymus development | 5 | 0.00 |
Cluster 1 | GO:0048534∼hemopoietic or lymphoid organ development | 12 | 0.01 |
Cluster 1 | GO:0002520∼immune system development | 12 | 0.01 |
Cluster 2 | GO:0003006∼reproductive developmental process | 15 | 0.00 |
Cluster 2 | GO:0048610∼reproductive cellular process | 11 | 0.00 |
Cluster 2 | GO:0048609∼reproductive process in a multicellular organism | 18 | 0.01 |
Cluster 2 | GO:0032504∼multicellular organism reproduction | 18 | 0.01 |
Cluster 2 | GO:0019953∼sexual reproduction | 17 | 0.01 |
Cluster 2 | GO:0007281∼germ cell development | 7 | 0.01 |
Cluster 2 | GO:0007276∼gamete generation | 15 | 0.01 |
Cluster 2 | GO:0048232∼male gamete generation | 12 | 0.02 |
Cluster 2 | GO:0007283∼spermatogenesis | 12 | 0.02 |
Cluster 3 | GO:0035270∼endocrine system development | 6 | 0.01 |
Cluster 3 | GO:0030325∼adrenal gland development | 3 | 0.01 |
Annotation clusters with an enrichment score of ≥2 are shown.
Little Representation of E Matrilines in the Nuclear Genome Within the Zone of Admixture.
Individuals in the zone of admixture are more closely related to W than E with respect to the nuclear genome (SI Appendix, Figs. S8 and S9). However, all mixed individuals have the mitogenome of E (SI Appendix, Fig. S10). We used PCAdmix to identify genomic regions from E in the genomes of the mixed individuals. The low level of E ancestry detected (∼5.9–11.2%; SI Appendix, Fig. S11) is not caused by systematic errors (<2%; SI Appendix, Table S8). Regions of E ancestry occur in no less than 67% (24 of 36 haploid genomes) of mixed individuals. Such regions, totaling about 1.6 Mb, are dispersed randomly across the genome, with a mean length of 62 kb. These regions contain 31 protein-coding genes, one relevant to mitochondria [VDAC3 (48)], but none exhibit reduced levels of polymorphism (0.00059 > 0.00046) or significant variation in intergroup and intragroup nonsynonymous/synonymous ratios [P > 0.05, McDonald–Kreitman test (49)]. We detect no signal of selection on the mitogenome in the zone of admixture (Tajima's D = −1.49, P > 0.10).
Incomplete Isolation and Gene Flow Within the Geographic Range of E Matrilines.
Levels of whole-genomic differentiation among E1–E4 are comparatively small. Interpopulation FST values range from 0.0974 to 0.1825, and dxy values range from 0.0007 to 0.0010 (SI Appendix, Tables S3 and S4). Mean df values within the subpopulations range from 1.27e-07 to 6.02e-07 (SI Appendix, Figs. S12 and S13 and Table S5), which corresponds to ∼1.27–6.02 fixed differences for every 10 Mb of sequence data in compared populations. We applied the PBS statistic to identify HDRs and then examined the features of these regions. Similar patterns occur for each subpopulation as follows: (i) most outlier windows (>72%) are discontinuous and one window in length (50 kb) (SI Appendix, Fig. S6); (ii) FST and dxy values are significantly higher than the background genome (P < 2.2e-16, two-tailed Mann–Whitney test; Fig. 5D and SI Appendix, Fig. S14 and Table S6); (iii) π is significantly lower in HDRs relative to the rest of the genome, although the level varies among the subpopulations (Fig. 5D and SI Appendix, Fig. S14 and Table S6); and (iv) E1–E3 are skewed toward high frequencies of derived variants (Fig. 5D and SI Appendix, Fig. S14 and Table S6).
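The conversion from df to fixed differences per 10 Mb quoted above is a simple product, sketched below in Python together with one plausible way of counting fixed differences from allele frequencies. The function names and the definition of a fixed site (frequency 0 in one population and 1 in the other) are assumptions for illustration.

```python
def fixed_difference_density(freqs1, freqs2, n_sites):
    """Fixed differences per site (df): sites where one population is fixed
    for one allele and the other population is fixed for the alternative."""
    fixed = sum(
        1
        for p1, p2 in zip(freqs1, freqs2)
        if (p1 == 0.0 and p2 == 1.0) or (p1 == 1.0 and p2 == 0.0)
    )
    return fixed / n_sites


def expected_fixed_differences(df, span_bp):
    """Expected number of fixed differences across a span of span_bp sites."""
    return df * span_bp
```

For example, `expected_fixed_differences(1.27e-07, 10_000_000)` gives about 1.27, and the same call with 6.02e-07 gives about 6.02, matching the range stated in the text.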
Environmental, Histological, and Physiological Divergence Affecting E Matrilines.
In a PCA of all bioclimatic variables associated with frog sampling sites (SI Appendix, Fig. S15), PC1 explains 49.13% of the variation. PC1 scores differ significantly between low-elevation clades E1 and E3 and high-elevation clades E2 and E4 (P = 3.3e-4). Histological sections of middorsal skin show that the frogs from high elevations have a significantly greater number of granular glands than frogs from low elevations (P < 0.05, two-tailed t test; Fig. 7A); such glands may function in response to environmental stimuli (50). Hemoglobin (Hb) levels in peripheral blood are correlated with elevation (P < 0.01; Fig. 7B). In contrast to Hb levels, muscular oxygen content in relatively low-elevation populations is significantly higher than muscular oxygen content in populations from higher elevations (P < 0.01; Fig. 7C); this corresponds to the decreased oxygen content in air.
Fig. 7.
Morphological and physiological changes associated with elevation. (A) Bar plots of the differences in numbers of granular glands in the middorsal skin between low-elevation (2,968 m, E1) and high-elevation (4,859 m, E4) populations (two-tailed t test: P < 0.05). (B) Hb levels (grams per deciliter) at different elevations. Red lines show the best-fit regression line based on a third-order polynomial equation. The 95% confidence interval is shown in gray. (C) TOC (μmol/L) in low-elevation (E1) and high-elevation (E4) populations. (D) Expression level of DNAJC8 in the brain, TNNC1 and ADORA1 in the heart, and LAMB3 in the lung from low-elevation (E1) and high-elevation (E4) populations of E, respectively. Eight replicates were performed for each group. Statistically significant differences in differential expression are indicated by asterisk(s) (two-tailed t test: *P < 0.05; **P < 0.01).
Genes, Ecological Factors, and the Evolution of E Matrilines.
Genes from the HDRs of each subpopulation were evaluated for functional categories that related to environmental factors. GO enrichment analyses revealed many overrepresented functional categories that appear to associate with adaptation to the environment of the QTP (SI Appendix, Tables S9–S12). For instance, Kyoto Encyclopedia of Genes and Genomes pathways and GO categories related to metabolism (fatty acid metabolism, hsa00071), blood circulation (circulatory system process, GO:0003013), motility (muscle cell differentiation, GO:0042692), and immune response (inflammatory response, GO:0006954) characterize low-elevation E1 (SI Appendix, Table S9). GO categories associated with apoptosis (induction of apoptosis, GO:0006917) were significantly enriched in high-elevation E2 (SI Appendix, Table S10), and response to radiation (GO:0009314) and light stimulus (GO:0009416) were significantly enriched in high-elevation E4 (SI Appendix, Table S11). GO categories associated with blood circulation (GO:0008015) and metabolism process (response to lipid, GO:0033993) are present in low-elevation E3 (SI Appendix, Table S12).
To verify the function of the genes in HDRs, we performed real-time quantitative PCR on seven candidate genes from three different tissues (heart, lung, and brain). These seven candidate genes were identified based upon what they potentially contribute to high-altitude adaptation (SI Appendix, Table S13). Four of these differ in their levels of expression between low-elevation and high-elevation lineages (Fig. 7D), including two related to cardiac function in the heart [Troponin C1 Slow (TNNC1) (51) and Adenosine A1 Receptor (ADORA1) (52)] and two hypoxia-related genes [DnaJ Homolog Subfamily C Member 8 (DNAJC8) in the brain (53) and Laminin Subunit Beta 3 (LAMB3) in the lung (54)].
Discussion
Genomic Isolation and Speciation Between E and W Matrilines.
Population genomic analyses find substantial genomic isolation between W and E, with limited directional gene flow. Gene flow has been found only in one geographically restricted area, where no obvious geographic barrier (e.g., mountains) exists that could restrict dispersal. Further, the genetic patterns indicate that frogs from both E and W cross the YZR within their ranges and reject the null hypothesis of panmixia within a single species. Neutral processes, including divergence, changes in current and ancestral effective population size (Ne), and levels of gene flow, have the potential to explain the broad pattern of differentiation between W and E. However, the HDRs between W and E exhibit significantly more genetic differentiation than our simulation using the inferred history. HDRs could result from positive selection and not demographic history (23). Furthermore, the HDRs in subpopulations of E exhibit significantly lower π and higher DAF compared with the rest of the genome (Fig. 5B), suggesting the action of positive selection on E. Because the HDRs of W and E harbor many genes that relate to GO categories involved in fertilization, particularly to spermatogenesis, they likely lead to reproductive isolation by suppressing gene flow. Studies of other species have detected the rapid evolution of male reproductive genes, for instance, in primates (55), birds (19), fruit flies (56), and amphibians (57). Taken together with the fact that W, E2, and E4 occupy similar climatic environments (SI Appendix, Fig. S15) and display no differences in morphology, we suggest that endogenous selection is a dominant factor building reproductive isolation, resulting in speciation between W and E matrilines.
Populations from the geographic region of admixture between E and W matrilines occur along a narrow valley containing a tributary of the YZR. They have the mitogenome of E, whereas nuclear genes of W strongly dominate. Such mitonuclear discordance is common in vertebrates (58); however, here, a cohort of N. parkeri genomes offers an unusual opportunity to decipher the genomic pattern and the genetic mechanism of the phenomenon. Population genomic analyses reveal the E introgressed regions in hybrids are small and randomly dispersed in the genome. Those short introgressed segments suggest to us that they formed by hybridization, and were subsequently broken into even smaller sizes by recombination operating over many generations (59, 60). We detect no signal of selection in either the regions of introgression or the mitochondrial genome, and no significant enrichment of the pathways relevant to mitochondria or of the GOs of genes in introgressed regions. Thus, the mitonuclear discordance is not caused by natural selection driven by environmental factors (29, 61, 62). A likely scenario is establishment of reproductive isolating barriers (RIBs) resulting from secondary contact with historical and limited introgression after mid-Pleistocene divergence (63, 64). We hypothesize that hybrids formed historically only between females of E and males of W upon secondary contact, with hybrid females continuing to backcross to the males of W, leading to postzygotic, prezygotic, or both categories of RIBs.
Ecological Adaptation Within E Matrilines.
E Tibetan frogs primarily segregate into high-elevation (>3,700 m, E2 and E4) and low-elevation (<3,700 m, E1 and E3) populations (Fig. 1A). This pattern likely reflects an ecological stratification in the southeastern QTP, including oxygen partial pressure (36), UV radiation (37), precipitation (38), and ambient temperature (39). If ecological selection is one of the forces that drives differentiation of the populations of E, one expects to find an imprint on genomic regions involved in ecological adaptation (27, 65).
The HDRs of each subpopulation of E exhibit signals of selection compared with the background, including low π, increased dxy between populations, and elevated DAF. GO enrichment analyses point to environmental drivers of divergence. No clear geographic barriers separate the low-elevation sympatric pair E1 and E3. Although they were formed at different times (Fig. 3A), they share several GO functional categories, such as blood circulation and metabolic processes. Given the phylogeny and population history, these functions likely evolved independently in each lineage. The histological, physiological, and expression evidence (Fig. 7) is consistent with the scenario that ecological stratification in the southeastern QTP has greatly promoted adaptive population diversification in situ.
In summary, N. parkeri is a useful model to investigate the processes and genetic bases of speciation along geographic and environmental gradients. We argue that natural selection plays important roles in driving continuing divergence within the species, and even in maintaining it. The extreme environments of the Tibetan Plateau can drive the rapid evolution of species [e.g., yak (33)]. Given the rapidity of changes and the challenging environment, the area is a natural laboratory for studying how selection drives adaptation; how environments influence evolutionary history; and, in some cases, how speciation can occur.
Materials and Methods
Sample Collection and Sequencing.
Tibetan frogs (45 in total) were selected for our genomic study based on matrilineal (mtDNA) genealogies (42). An additional 18 individuals come from the area of mixed clades in a tributary of the YZR (localities 25, 26, and 27; Fig. 1). One individual of Nanorana pleskei was used as the outgroup taxon (66). All collections were made according to animal use protocols approved by the Kunming Institute of Zoology Animal Care and Ethics Committee.
Total genomic DNA was extracted from liver, muscle, toe clips, or tadpole samples using the phenol/chloroform method (67). For each individual, 1–3 μg of DNA was sheared into fragments of 300–800 bp using the Covaris system. DNA fragments were processed and sequenced using Illumina paired-end sequencing technology (Illumina, Inc.). The 45 nonhybrid individuals and the outgroup taxon were sequenced to a target depth of 15×, and the 18 hybrid individuals were sequenced to a target depth of 5×. The raw sequence data from this study have been submitted to the Genome Sequence Archive (gsa.big.ac.cn/) under accession nos. CRA000919 (N. parkeri) and CRA000918 (N. pleskei).
SNP Calling and Filtering.
Raw sequence reads of each individual were mapped to the Tibetan frog reference genome (45) using BWA-ALN (v.0.7.4) (68) with default parameters. SAMtools (v.0.1.18) was used for sorting and removing PCR duplicates (69). To minimize false-positive SNP calls around indels, local realignment around indels was performed using the Genome Analysis Tool Kit (v.2.6-5) (70). Raw SNPs were extracted using SAMtools on the locally realigned BAM files with the command “samtools mpileup -q 20 -Q 20 -C 50 -uDEf.”
To obtain high-quality genotype calls for downstream analyses, we kept SNPs that met the following criteria: (i) sites were at least 5 bp away from a predicted insertion/deletion, (ii) the consensus quality was ≥40, (iii) sites were neither triallelic nor indels, (iv) the depth fell between the 2.5th and 97.5th percentiles of the depth distribution, (v) SNPs had minor allele frequencies ≥ 0.01, and (vi) SNPs occurred in more than 95% of high-coverage individuals (nonhybrids) and 75% of medium-coverage individuals (hybrids). The filtered data were phased using Beagle v.3.3.2 (71).
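The six criteria above can be expressed as a single predicate. The Python sketch below is a hypothetical illustration, not the pipeline used in the study: the `Site` record, its field names, and the simple rounded-index percentile are all assumptions, and the call-rate threshold is passed in so that 0.95 (nonhybrids) or 0.75 (hybrids) can be applied to the corresponding sample set.

```python
from dataclasses import dataclass


@dataclass
class Site:
    quality: float      # consensus quality of the SNP call
    n_alleles: int      # number of distinct alleles observed at the site
    is_indel: bool      # True if the variant is an insertion/deletion
    depth: int          # total read depth at the site
    maf: float          # minor allele frequency
    call_rate: float    # fraction of individuals with a genotype call
    dist_to_indel: int  # distance (bp) to the nearest predicted indel


def rounded_index_percentile(values, q):
    """Simple percentile: the value at the rounded fractional index."""
    ordered = sorted(values)
    return ordered[round(q / 100 * (len(ordered) - 1))]


def depth_bounds(depths, low_q=2.5, high_q=97.5):
    """Lower and upper depth cutoffs from the empirical depth distribution."""
    return (rounded_index_percentile(depths, low_q),
            rounded_index_percentile(depths, high_q))


def passes_filters(site, depth_lo, depth_hi, min_call_rate):
    """Apply criteria (i) to (vi) to one candidate SNP."""
    return (
        site.dist_to_indel >= 5                       # (i)
        and site.quality >= 40                        # (ii)
        and site.n_alleles == 2 and not site.is_indel  # (iii)
        and depth_lo <= site.depth <= depth_hi        # (iv)
        and site.maf >= 0.01                          # (v)
        and site.call_rate > min_call_rate            # (vi)
    )
```

Computing the depth bounds once per data set and passing them in keeps the per-site check cheap and makes the cutoffs explicit.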
Population Structure, Phylogenetic Inference, and Admixture Analyses.
We used PCA and population structure analysis to evaluate the genetic structuring of frogs. SNPs in scaffolds longer than 500 kb were extracted; they occupied about 76% of the entire genome. PCA was performed using the package GCTA (v.1.24.2) (72). The genomic ancestry of each individual was inferred using Frappe (v.1.1) (73). To avoid the effect of linkage disequilibrium, we selected one SNP for each interval of 50 kb. The postulated number of ancestral clusters (K) was set from two to five, and the maximum number of expectation-maximization iterations was set to 10,000.
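The two subsetting steps described above (restricting to scaffolds longer than 500 kb and keeping one SNP per 50-kb interval) might look like the following Python sketch. How the retained SNP was chosen within each interval is not stated; keeping the lowest-position SNP is an assumption, as are the function names and the data layout.

```python
def long_scaffolds(scaffold_lengths, min_len=500_000):
    """Names of scaffolds strictly longer than min_len base pairs."""
    return {name for name, length in scaffold_lengths.items() if length > min_len}


def thin_snps(snps, interval=50_000):
    """Keep one SNP per (scaffold, interval) bin to reduce linkage.

    snps is an iterable of (scaffold, position) pairs in any order; the
    lowest-position SNP in each bin is retained (an assumed rule).
    """
    kept = {}
    for scaffold, pos in snps:
        key = (scaffold, pos // interval)
        if key not in kept or pos < kept[key][1]:
            kept[key] = (scaffold, pos)
    return sorted(kept.values())
```

Thinning by fixed physical intervals is a coarse proxy for linkage pruning; it does not use measured linkage disequilibrium.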
Phylogenetic relationships were inferred via both coalescent and concatenation methods. To minimize the effects of potential alignment errors and regions with strong natural selection, we performed these analyses on putatively neutral genomic regions by filtering out the positions with repeat sequences, exons, and the 10 kb flanking them on each side. We randomly selected 1,000 neutral loci with a window size of 100 kb from the intergenic region. First, we reconstructed individual gene trees for each window based on the maximum likelihood (ML) approach using RAxML (v.8.1.15) (74). Support values for each node were inferred using 100 rapid bootstrap replicates based on the GTRGAMMA model. Second, for gene tree-based coalescent analysis, species trees were generated using MP-EST (75) and STAR (76, 77), with populations as the tips of the tree. Third, for the concatenation analysis, ML trees were constructed based on the GTRGAMMA model using the concatenated sequences from the same set of loci.
To infer admixture events, we applied the D-statistic using the qpDstat module in the ADMIXtools package (78). We also adopted a tree-based approach, implemented in TreeMix (79), to verify the existence of gene flow by modeling from 0 to 4 migration events with blocks of 5,000 SNPs.
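For readers unfamiliar with the D-statistic, the quantity can be sketched from population allele frequencies in the form given by Patterson et al. (78). The Python example below is only an illustration of the point estimate. It omits the block jackknife that qpDstat uses to obtain standard errors, the function name is hypothetical, and sign conventions may differ between tools.

```python
def d_statistic(freqs):
    """Point estimate of D(W, X; Y, Z) from per-SNP allele frequencies.

    freqs: iterable of (w, x, y, z) tuples, the frequency of the same
    allele in populations W, X, Y, and Z at each SNP.
    """
    numerator = 0.0
    denominator = 0.0
    for w, x, y, z in freqs:
        numerator += (w - x) * (y - z)
        denominator += (w + x - 2.0 * w * x) * (y + z - 2.0 * y * z)
    if denominator == 0.0:
        raise ValueError("no informative sites")
    return numerator / denominator
```

A value near zero is expected when W and X are equally related to Y; a significant departure from zero suggests gene flow involving one of the two.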
Demographic Analysis Using G-PhoCS and PSMC.
G-PhoCS (46) was employed to infer the complete demographic history for N. parkeri, including population divergence times, ancestral population size, and migration rates based on 1,000 neutral loci (80). The parameters were inferred in a Bayesian manner using Markov Chain Monte Carlo to jointly sample model parameters and genealogies of the input loci (46). Migration scenarios were added by combining results of D-statistic tests, Frappe, and TreeMix because G-PhoCS often has limited ability to characterize complex migration scenarios (81). Additionally, two postdivergence migration bands were added to test whether gene flow occurred between W and E after they separated. Each Markov chain was run for 2,000,000 generations while sampling parameter values every 20th iteration. Burn-in and convergence of each run were determined with TRACER 1.5 (82). More information about the control file of G-PhoCS is provided in SI Appendix, section 1.2. Divergence times in units of years, effective population sizes, and migration rates were calibrated by the estimates of generation time and neutral mutation rate from previous studies (45, 83). We repeated the G-PhoCS analysis with four separate runs to obtain reliable and stable estimates for the demographic parameters. To validate the G-PhoCS inferences and test whether the differential gene flow to W was correctly identified, we used ms (84) to perform simulations under different migration scenarios between W and E (SI Appendix, section 1.3). We used the inferred demographic parameters to produce 50 Mbp of sequences and compared the observed patterns of differentiation for W–E1 with those under different migration scenarios.
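The calibration of mutation-scaled estimates can be sketched as follows. This Python example assumes the usual definitions, in which the divergence parameter τ equals time multiplied by the mutation rate and θ equals 4Neμ with μ per generation. The default values are the generation time and mutation rate quoted in this section; the function name and the `scale` argument (for outputs reported in rescaled units) are hypothetical.

```python
def calibrate_gphocs(tau, theta, mu_per_year=0.776e-9, gen_time=5.0, scale=1.0):
    """Convert mutation-scaled tau and theta to years and effective size.

    tau, theta: mutation-scaled estimates, multiplied by `scale` first
    (use scale != 1 only if the estimates were reported in rescaled units).
    """
    mu_per_gen = mu_per_year * gen_time
    years = tau * scale / mu_per_year          # tau = T_years * mu_per_year
    ne = theta * scale / (4.0 * mu_per_gen)    # theta = 4 * Ne * mu_per_gen
    return years, ne
```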
The trajectory of demographic histories for the five populations of Tibetan frogs was inferred by the PSMC model (85). Because PSMC has high false-negative rates at low sequence coverage, we restricted this analysis to the individual in each group with the highest coverage (≥15×). In addition, a correction factor (-N) was invoked to correct for the false-negative rate caused by the shallow sequence depth (80). The PSMC analysis was run with the following parameters: -N25 -t15 -r5 -b -p “4+25*2+4+6”. A bootstrapping approach with 100 replicates was performed to assess the variation in the inferred Ne trajectories. A generation time of 5 y and a neutral mutation rate of 0.776e-09 per site per year were used to convert the population sizes and scaled time into real sizes and time (45, 83).
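The conversion from scaled PSMC output to years and effective population sizes can be sketched as below. This Python example follows the scaling convention commonly used for PSMC output, with a 100-bp bin size assumed as the default; it is an illustration and not the script used in this study. The names `theta0`, `t_k`, and `lambda_k` stand for the scaled mutation rate per bin, the scaled interval boundaries, and the relative population sizes.

```python
def scale_psmc(theta0, t_k, lambda_k,
               mu_per_year=0.776e-9, gen_time=5.0, bin_size=100):
    """Convert scaled PSMC estimates to years and effective population sizes.

    theta0:   scaled mutation rate per bin
    t_k:      scaled interval boundaries, in units of 2 * N0 generations
    lambda_k: population sizes relative to N0
    """
    mu_per_gen = mu_per_year * gen_time
    n0 = theta0 / (4.0 * mu_per_gen * bin_size)
    years = [2.0 * n0 * t * gen_time for t in t_k]
    ne = [n0 * lam for lam in lambda_k]
    return years, ne
```

With the values given above, the per-generation mutation rate is 0.776e-09 × 5 = 3.88e-09 per site.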
The program fastsimcoal2 (86) was used to estimate the extent of population expansion within the past 30,000 y based on the G-PhoCS model. Fifty independent runs were performed, and the run with the highest likelihood was kept. For each run, demographic estimates were obtained from 100,000 simulations (-n 100,000) and 40 expectation/conditional maximization cycles (-L 40) per parameter file.
Inference of Mitochondrial Phylogeny and Hybrid Ancestry Assignment.
We used BWA-ALN (v.0.7.4) to map all raw sequence reads from each individual to the previously assembled complete mitochondrial genome of N. pleskei (87), the closest relative of the Tibetan frog. SNPs were identified and filtered using the same procedure as for the nuclear genome. The matrilineal (mitochondrial) genealogy was inferred using the ML method based on the GTRGAMMA model implemented in RAxML (v.8.1.15) (74). PCAdmix (88) was used to identify the ancestries in each of the 18 hybrid individuals. We used W and E4 as the ancestral populations because they were the geographically closest populations and showed the highest signal of admixture in the f3 test. To prevent high-linkage blocks from having excessive influence on the inferred ancestry of a region, SNPs in strong linkage disequilibrium (r2 > 0.80) were thinned in all ancestral and admixed groups. PCA was performed using windows of 40 SNPs. A default calling threshold of 0.9 was used as the criterion to assign ancestry. However, systematic errors, such as incomplete lineage sorting and estimation errors, may have contributed to similar introgression signals in the PCAdmix analysis. To estimate these, we repeated the PCAdmix analysis using non-E4 individuals as hybrids.
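The ancestry-calling rule can be illustrated with a short Python sketch. It assumes that each window comes with a posterior probability for each ancestral population and that a window is left unassigned when no posterior reaches the threshold. The input layout, the function name, and the use of ≥ rather than > are assumptions of this example and do not describe the PCAdmix output format.

```python
def call_ancestry(posteriors, threshold=0.9):
    """Assign an ancestry label to each window, or None if uncertain.

    posteriors: list of dicts, one per window, mapping an ancestry label
    (for example "W" or "E4") to its posterior probability.
    """
    calls = []
    for window in posteriors:
        label, prob = max(window.items(), key=lambda kv: kv[1])
        calls.append(label if prob >= threshold else None)
    return calls
```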
Inference of HDRs.
Genomic differentiation was calculated based on the PBS under the topological transformation described previously (89). The PBS value estimates the amount of sequence change along a population branch since its divergence from other branches of a population tree. First, pairwise FST statistics were calculated using Weir and Cockerham’s method (90), implemented in VCFtools v0.1.11 (91), with nonoverlapping 50-kb genomic windows. Negative values of FST were treated as 0. Windows less than 20 kb in length were excluded from further analysis. Log-transformed FST values were used for the PBS calculation (92). The length of the branch leading to W since the divergence from all subpopulations of E was estimated as follows:
The length of the branch leading to E1 since the divergence from the remaining subpopulations of E clades was then estimated as follows:
Calculations for branch lengths leading to subpopulations E2, E3, and E4 were similar to the equation for E1. HDRs were defined as the upper 2.5% of each PBS distribution. To avoid potential assembly or mapping errors at the ends of scaffolds, we further removed high-divergence peaks spanning fewer than two consecutive windows at either end of a scaffold.
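For orientation, the standard three-population form of the PBS (92) can be sketched in Python. This is not the multi-population transformation applied in this study, which is not reproduced here. The example shows only the log transformation of FST, the branch-length formula for one focal population A relative to populations B and C, and a simple upper-tail cutoff. All function names are hypothetical.

```python
import math


def branch_length(fst):
    """Log-transform FST into a divergence time, T = -ln(1 - FST)."""
    fst = max(fst, 0.0)            # negative FST values are treated as 0
    return -math.log(1.0 - fst)    # undefined for FST = 1


def pbs(fst_ab, fst_ac, fst_bc):
    """Branch length leading to focal population A in a three-population tree."""
    t_ab, t_ac, t_bc = (branch_length(f) for f in (fst_ab, fst_ac, fst_bc))
    return (t_ab + t_ac - t_bc) / 2.0


def upper_tail_cutoff(values, fraction=0.025):
    """Smallest value that still falls in the top `fraction` of `values`."""
    ordered = sorted(values, reverse=True)
    n_top = max(1, int(len(ordered) * fraction))
    return ordered[n_top - 1]
```

Windows whose PBS is at or above the cutoff would then be the candidate HDRs, before the scaffold-end filter described above is applied.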
Characterization of HDRs in Terms of dxy, df, π, and DAF.
In addition to FST and PBS, we applied several population genomic parameters to quantify and compare the outlier windows with the background genome, including the following: dxy between populations (93), df between populations (19), within-population π level (19, 23, 91), DAF, and population-scaled recombination rates (ρ). We treated N. pleskei as the outgroup when comparing differences in DAF between W and E, and treated three nonadmixed W individuals (Fig. 2) as the outgroup when comparing DAF differences among subpopulations of E.
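Textbook per-site forms of three of these statistics for biallelic SNPs are sketched below in Python. The exact estimators used by the cited tools may differ in detail, and the function names are hypothetical. Window-level values are obtained by summing per-site values and dividing by the window length, so that invariant sites contribute zero.

```python
def pi_site(p, n_chrom):
    """Nucleotide diversity at one biallelic site, with sample-size correction."""
    return 2.0 * p * (1.0 - p) * n_chrom / (n_chrom - 1)


def dxy_site(p1, p2):
    """Absolute divergence between two populations at one biallelic site."""
    return p1 * (1.0 - p2) + p2 * (1.0 - p1)


def is_fixed_difference(p1, p2):
    """True if the two populations are fixed for alternative alleles."""
    return (p1 == 0.0 and p2 == 1.0) or (p1 == 1.0 and p2 == 0.0)


def window_mean(per_site_values, window_length):
    """Average over every site in the window, variable or not."""
    return sum(per_site_values) / window_length
```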
GO Enrichment Analysis.
GO enrichment analysis was performed using DAVID (Database for Annotation Visualization and Integrated Discovery) (94). GO terms with fewer than two genes were excluded from further analysis. GO terms with a P value <0.05 were considered to be significantly enriched.
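The logic of an overrepresentation test can be illustrated with a one-sided hypergeometric test. The Python sketch below is a generic example and is not DAVID's implementation, which, to our understanding, reports a more conservative variant of the Fisher exact test. The function name and argument names are hypothetical.

```python
from math import comb


def hypergeom_pvalue(k, n, K, N):
    """P(X >= k) for a GO term under the hypergeometric null.

    k: candidate genes annotated with the term
    n: candidate genes in total
    K: background genes annotated with the term
    N: background genes in total
    """
    upper = min(n, K)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / comb(N, n)
```

Under the criteria stated above, a term would be retained only if it contains at least two candidate genes and its P value is below 0.05.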
Phenotype Data from Physiological and Morphological Detection.
Hb levels and tissue oxygen content (TOC) were collected in the field during June and July 2015. Measurements were taken in five communities in E along an elevation gradient: Nyingchi (29.61°N, 94.36°E, 2,968 m above sea level), Lhasa (29.67°N, 90.88°E, 3,671 m above sea level), Zhaxigang (29.75°N, 91.95°E, 4,011 m above sea level), Riduo (29.70°N, 92.23°E, 4,368 m above sea level), and Milashankou (29.80°N, 92.34°E, 4,859 m above sea level). At least 10 individuals were measured for each community. Hb level in peripheral blood was measured from 53 adult frogs, and muscular oxygen content was measured from 69 adult frogs. Hb was determined using Mission Plus Hb, immediately after drawing blood from the heart ventricle. TOC was determined via a fiber optic cable (PreSens Precision Sensing GmbH). We also made histological sections of the middorsal skin of seven frogs from the communities of Nyingchi and Milashankou, which represented populations from E at low and high elevations, respectively. The tissues were fixed in Heidenhain’s Susa fixative to make paraffin sections (5 μm thick).
RNA Isolation and Reverse Transcriptase PCR Assay.
Total RNA from populations of E at low (Nyingchi, 2,968 m) and high (Milashankou, 4,859 m) elevations was extracted using the TRIzol total RNA extract kit (Tiangen). Reverse transcription was carried out using the Fermentas RevertAid First-Strand cDNA synthesis kit (Fermentas) to prepare templates for real-time quantitative PCR. The primers of candidate genes used for real-time PCR are listed in SI Appendix, Table S14. Beta-actin (ACTB) was used as a loading control.
Acknowledgments
We thank Michael W. Nachman (University of California, Berkeley), Weiwei Zhai (Genome Institute of Singapore), and two reviewers for helpful comments and suggestions. We also thank Na Lin of Anhui University for assistance in file preparation. This work was supported by the Strategic Priority Research Program (B) (Grant XDB13020200 to J.C.) of the Chinese Academy of Sciences (CAS), National Natural Science Foundation of China (Grants 91431105 and 31622052 to J.C.), and the Animal Branch of the Germplasm Bank of Wild Species of CAS (Large Research Infrastructure Funding) (J.C.). W.-W.Z. was supported by Strategic Priority Research Program (B) (Grant XDB03030107) of CAS and the National Key Research and Development Program of China (Grant 2017YFC0505202). G.-D.W. was supported by the 13th Five-year Informatization Plan of CAS (Grant No. XXH13503-05). J.C. and G.-D.W. are supported by the Youth Innovation Promotion Association, CAS.
Footnotes
The authors declare no conflict of interest.
The sequence data reported in this paper have been deposited in the genome sequence archive of Beijing Institute of Genomics, Chinese Academy of Sciences, gsa.big.ac.cn (accession nos. CRA000919 for N. parkeri, and CRA000918 for N. pleskei).
This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1716257115/-/DCSupplemental.
References
- 1. Seehausen O, et al. Genomics and the origin of species. Nat Rev Genet. 2014;15:176–192. doi: 10.1038/nrg3644.
- 2. Coyne J, Orr H. Speciation. Sinauer; Sunderland, MA: 2004.
- 3. Mayr E. Systematics and the Origin of Species from the Viewpoint of a Zoologist. Harvard Univ Press; Cambridge, MA: 1999.
- 4. Dobzhansky T. Genetics and the Origin of Species. Columbia Univ Press; New York: 1937.
- 5. Hewitt GM. Genetic consequences of climatic oscillations in the Quaternary. Philos Trans R Soc Lond B Biol Sci. 2004;359:183–195, discussion 195. doi: 10.1098/rstb.2003.1388.
- 6. Avise JC. Phylogeography: The History and Formation of Species. Harvard Univ Press; Cambridge, MA: 2001.
- 7. Funk DJ. Isolating a role for natural selection in speciation: Host adaptation and sexual isolation in Neochlamisus bebbianae leaf beetles. Evolution. 1998;52:1744–1759. doi: 10.1111/j.1558-5646.1998.tb02254.x.
- 8. Schluter D. The Ecology of Adaptive Radiation. Oxford Univ Press; Oxford: 2000.
- 9. Nosil P. Ecological Speciation. Oxford Univ Press; Oxford: 2012.
- 10. Turelli M, Barton NH, Coyne JA. Theory and speciation. Trends Ecol Evol. 2001;16:330–343. doi: 10.1016/s0169-5347(01)02177-2.
- 11. Nosil P, Feder JL. Genomic divergence during speciation: Causes and consequences. Philos Trans R Soc Lond B Biol Sci. 2012;367:332–342. doi: 10.1098/rstb.2011.0263.
- 12. Feder JL, Egan SP, Nosil P. The genomics of speciation-with-gene-flow. Trends Genet. 2012;28:342–350. doi: 10.1016/j.tig.2012.03.009.
- 13. Feder JL, Flaxman SM, Egan SP, Comeault AA, Nosil P. Geographic mode of speciation and genomic divergence. Annu Rev Ecol Evol Syst. 2013;44:73–97.
- 14. Nosil P. Speciation with gene flow could be common. Mol Ecol. 2008;17:2103–2106. doi: 10.1111/j.1365-294X.2008.03715.x.
- 15. Pinho C, Hey J. Divergence with gene flow: Models and data. Annu Rev Ecol Evol Syst. 2010;41:215–230.
- 16. Mayr E. Animal Species and Evolution. Belknap Press of Harvard Univ Press; Cambridge, MA: 1966.
- 17. Nolte AW, Tautz D. Understanding the onset of hybrid speciation. Trends Genet. 2010;26:54–58. doi: 10.1016/j.tig.2009.12.001.
- 18. Nosil P, Funk DJ, Ortiz-Barrientos D. Divergent selection and heterogeneous genomic divergence. Mol Ecol. 2009;18:375–402. doi: 10.1111/j.1365-294X.2008.03946.x.
- 19. Ellegren H, et al. The genomic landscape of species divergence in Ficedula flycatchers. Nature. 2012;491:756–760. doi: 10.1038/nature11584.
- 20. Poelstra JW, et al. The genomic landscape underlying phenotypic integrity in the face of gene flow in crows. Science. 2014;344:1410–1414. doi: 10.1126/science.1253226.
- 21. Burri R, et al. Linked selection and recombination rate variation drive the evolution of the genomic landscape of differentiation across the speciation continuum of Ficedula flycatchers. Genome Res. 2015;25:1656–1665. doi: 10.1101/gr.196485.115.
- 22. Lamichhaney S, et al. Evolution of Darwin’s finches and their beaks revealed by genome sequencing. Nature. 2015;518:371–375. doi: 10.1038/nature14181.
- 23. Malinsky M, et al. Genomic islands of speciation separate cichlid ecomorphs in an East African crater lake. Science. 2015;350:1493–1498. doi: 10.1126/science.aac9927.
- 24. Jones FC, et al.; Broad Institute Genome Sequencing Platform & Whole Genome Assembly Team. The genomic basis of adaptive evolution in threespine sticklebacks. Nature. 2012;484:55–61. doi: 10.1038/nature10944.
- 25. Phifer-Rixey M, Bomhoff M, Nachman MW. Genome-wide patterns of differentiation among house mouse subspecies. Genetics. 2014;198:283–297. doi: 10.1534/genetics.114.166827.
- 26. Carneiro M, et al. The genomic architecture of population divergence between subspecies of the European rabbit. PLoS Genet. 2014;10:e1003519. doi: 10.1371/journal.pgen.1003519.
- 27. Campbell-Staton SC, et al. Winter storms drive rapid phenotypic, regulatory, and genomic shifts in the green anole lizard. Science. 2017;357:495–498. doi: 10.1126/science.aam5512.
- 28. Bragg JG, et al. Phylogenomics of a rapid radiation: The Australian rainbow skinks. BMC Evol Biol. 2018;18:15. doi: 10.1186/s12862-018-1130-4.
- 29. Cheviron ZA, Brumfield RT. Migration-selection balance and local adaptation of mitochondrial haplotypes in rufous-collared sparrows (Zonotrichia capensis) along an elevational gradient. Evolution. 2009;63:1593–1605. doi: 10.1111/j.1558-5646.2009.00644.x.
- 30. Cheviron ZA, Connaty AD, McClelland GB, Storz JF. Functional genomics of adaptation to hypoxic cold-stress in high-altitude deer mice: Transcriptomic plasticity and thermogenic performance. Evolution. 2014;68:48–62. doi: 10.1111/evo.12257.
- 31. Storz JF, Scott GR, Cheviron ZA. Phenotypic plasticity and genetic adaptation to high-altitude hypoxia in vertebrates. J Exp Biol. 2010;213:4125–4136. doi: 10.1242/jeb.048181.
- 32. Favre A, et al. The role of the uplift of the Qinghai-Tibetan Plateau for the evolution of Tibetan biotas. Biol Rev Camb Philos Soc. 2015;90:236–253. doi: 10.1111/brv.12107.
- 33. Qiu Q, et al. The yak genome and adaptation to life at high altitude. Nat Genet. 2012;44:946–949. doi: 10.1038/ng.2343.
- 34. Qu Y, et al. Ground tit genome reveals avian adaptation to living at high altitudes in the Tibetan plateau. Nat Commun. 2013;4:2071. doi: 10.1038/ncomms3071.
- 35. Simonson TS, et al. Genetic evidence for high-altitude adaptation in Tibet. Science. 2010;329:72–75. doi: 10.1126/science.1189406.
- 36. Wang D. 2003. The geography of aquatic vascular plants of Qinghai Xizang (Tibet) Plateau. PhD dissertation (Wuhan University, Wuhan, China).
- 37. Norsang G, Kocbach L, Stamnes J, Tsoja W, Pincuo N. Spatial distribution and temporal variation of solar UV radiation over the Tibetan Plateau. Appl Phys Rev. 2011;3:37.
- 38. Li SC, et al. Change of annual precipitation over Qinghai-Xizang Plateau and sub-regions in recent 34 Years. J Desert Res. 2007;27:307–314.
- 39. Ma X, Lu X. Annual cycle of reproductive organs in a Tibetan frog, Nanorana parkeri. Anim Biol Leiden Neth. 2010;60:259–271.
- 40. Zhou W, et al. River islands, refugia and genetic structuring in the endemic brown frog Rana kukunoris (Anura, Ranidae) of the Qinghai-Tibetan Plateau. Mol Ecol. 2013;22:130–142. doi: 10.1111/mec.12087.
- 41. Kunming Institute of Zoology (CAS) 2018 AmphibiaChina. The database of Chinese amphibians. Available at www.amphibiachina.org/. Accessed May 5, 2018.
- 42. Zhou WW, et al. DNA barcodes and species distribution models evaluate threats of global climate changes to genetic diversity: A case study from Nanorana parkeri (Anura: Dicroglossidae). PLoS One. 2014;9:e103899. doi: 10.1371/journal.pone.0103899.
- 43. Liu J, et al. Phylogeography of Nanorana parkeri (Anura: Ranidae) and multiple refugia on the Tibetan Plateau revealed by mitochondrial and nuclear DNA. Sci Rep. 2015;5:9857. doi: 10.1038/srep09857.
- 44. Fei L, Ye CY. Amphibians of China: I. Science Press; Beijing: 2017.
- 45. Sun YB, et al. Whole-genome sequence of the Tibetan frog Nanorana parkeri and the comparative evolution of tetrapod genomes. Proc Natl Acad Sci USA. 2015;112:E1257–E1262. doi: 10.1073/pnas.1501764112.
- 46. Gronau I, Hubisz MJ, Gulko B, Danko CG, Siepel A. Bayesian inference of ancient human demography from individual genome sequences. Nat Genet. 2011;43:1031–1034. doi: 10.1038/ng.937.
- 47. Zheng B, Xu Q, Shen Y. The relationship between climate change and Quaternary glacial cycles on the Qinghai-Tibetan Plateau: Review and speculation. Quat Int. 2002;35643:93–101.
- 48. Rahmani Z, Maunoury C, Siddiqui A. Isolation of a novel human voltage-dependent anion channel gene. Eur J Hum Genet. 1998;6:337–340. doi: 10.1038/sj.ejhg.5200198.
- 49. McDonald JH, Kreitman M. Adaptive protein evolution at the Adh locus in Drosophila. Nature. 1991;351:652–654. doi: 10.1038/351652a0.
- 50. Xu X, Lai R. The chemistry and biological activities of peptides from amphibian skin secretions. Chem Rev. 2015;115:1760–1846. doi: 10.1021/cr4006704.
- 51. Hershberger RE, et al. Coding sequence rare variants identified in MYBPC3, MYH6, TPM1, TNNC1, and TNNI3 from 312 patients with familial or idiopathic dilated cardiomyopathy. Circ Cardiovasc Genet. 2010;3:155–161. doi: 10.1161/CIRCGENETICS.109.912345.
- 52. Jenner TL, Rose’Meyer RB. Loss of vascular adenosine A1 receptors with age in the rat heart. Vascul Pharmacol. 2006;45:341–349. doi: 10.1016/j.vph.2006.05.005.
- 53. Kaindl AM, et al. Erythropoietin protects the developing brain from hyperoxia-induced cell death and proteome changes. Ann Neurol. 2008;64:523–534. doi: 10.1002/ana.21471.
- 54. Kokkonen N, et al. Hypoxia upregulates carcinoembryonic antigen expression in cancer cells. Int J Cancer. 2007;121:2443–2450. doi: 10.1002/ijc.22965.
- 55. Wyckoff GJ, Wang W, Wu CI. Rapid evolution of male reproductive genes in the descent of man. Nature. 2000;403:304–309. doi: 10.1038/35002070.
- 56. Ting CT, Tsaur SC, Wu ML, Wu CI. A rapidly evolving homeobox at the site of a hybrid sterility gene. Science. 1998;282:1501–1504. doi: 10.1126/science.282.5393.1501.
- 57. Malone JH, Michalak P. Physiological sex predicts hybrid sterility regardless of genotype. Science. 2008;319:59. doi: 10.1126/science.1148231.
- 58. Toews DP, Brelsford A. The biogeography of mitochondrial and nuclear discordance in animals. Mol Ecol. 2012;21:3907–3930. doi: 10.1111/j.1365-294X.2012.05664.x.
- 59. Pool JE, Nielsen R. Inference of historical changes in migration rate from the lengths of migrant tracts. Genetics. 2009;181:711–719. doi: 10.1534/genetics.108.098095.
- 60. Harris K, Nielsen R. Inferring demographic history from a spectrum of shared haplotype lengths. PLoS Genet. 2013;9:e1003521. doi: 10.1371/journal.pgen.1003521.
- 61. Irwin DE. Local adaptation along smooth ecological gradients causes phylogeographic breaks and phenotypic clustering. Am Nat. 2012;180:35–49. doi: 10.1086/666002.
- 62. Currat M, Ruedi M, Petit RJ, Excoffier L. The hidden side of invasions: Massive introgression by local genes. Evolution. 2008;62:1908–1920. doi: 10.1111/j.1558-5646.2008.00413.x.
- 63. Chan KM, Levin SA. Leaky prezygotic isolation and porous genomes: Rapid introgression of maternally inherited DNA. Evolution. 2005;59:720–729.
- 64. Colliard C, et al. Strong reproductive barriers in a narrow hybrid zone of West-Mediterranean green toads (Bufo viridis subgroup) with Plio-Pleistocene divergence. BMC Evol Biol. 2010;10:232. doi: 10.1186/1471-2148-10-232.
- 65. Storz JF. Using genome scans of DNA polymorphism to infer adaptive population divergence. Mol Ecol. 2005;14:671–688. doi: 10.1111/j.1365-294X.2005.02437.x.
- 66. Che J, et al. Spiny frogs (Paini) illuminate the history of the Himalayan region and Southeast Asia. Proc Natl Acad Sci USA. 2010;107:13765–13770. doi: 10.1073/pnas.1008415107.
- 67. Sambrook J, Russell DW. Molecular Cloning: A Laboratory Manual. Cold Spring Harbor Laboratory Press; Plainview, NY: 2012.
- 68. Li H, Durbin R. Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics. 2009;25:1754–1760. doi: 10.1093/bioinformatics/btp324.
- 69. Li H, et al.; 1000 Genome Project Data Processing Subgroup. The sequence alignment/map format and SAMtools. Bioinformatics. 2009;25:2078–2079. doi: 10.1093/bioinformatics/btp352.
- 70. DePristo MA, et al. A framework for variation discovery and genotyping using next-generation DNA sequencing data. Nat Genet. 2011;43:491–498. doi: 10.1038/ng.806.
- 71. Browning SR, Browning BL. Rapid and accurate haplotype phasing and missing-data inference for whole-genome association studies by use of localized haplotype clustering. Am J Hum Genet. 2007;81:1084–1097. doi: 10.1086/521987.
- 72. Yang J, Lee SH, Goddard ME, Visscher PM. GCTA: A tool for genome-wide complex trait analysis. Am J Hum Genet. 2011;88:76–82. doi: 10.1016/j.ajhg.2010.11.011.
- 73. Tang H, Peng J, Wang P, Risch NJ. Estimation of individual admixture: Analytical and study design considerations. Genet Epidemiol. 2005;28:289–301. doi: 10.1002/gepi.20064.
- 74. Stamatakis A. RAxML version 8: A tool for phylogenetic analysis and post-analysis of large phylogenies. Bioinformatics. 2014;30:1312–1313. doi: 10.1093/bioinformatics/btu033.
- 75. Liu L, Yu L, Edwards SV. A maximum pseudo-likelihood approach for estimating species trees under the coalescent model. BMC Evol Biol. 2010;10:302. doi: 10.1186/1471-2148-10-302.
- 76. Liu L, Yu L, Pearl DK, Edwards SV. Estimating species phylogenies using coalescence times among sequences. Syst Biol. 2009;58:468–477. doi: 10.1093/sysbio/syp031.
- 77. Shaw TI, Ruan Z, Glenn TC, Liu L. STRAW: Species TRee analysis web server. Nucleic Acids Res. 2013;41:W238–W241. doi: 10.1093/nar/gkt377.
- 78. Patterson N, et al. Ancient admixture in human history. Genetics. 2012;192:1065–1093. doi: 10.1534/genetics.112.145037.
- 79. Pickrell JK, Pritchard JK. Inference of population splits and mixtures from genome-wide allele frequency data. PLoS Genet. 2012;8:e1002967. doi: 10.1371/journal.pgen.1002967.
- 80. Wang GD, et al. Out of southern East Asia: The natural history of domestic dogs across the world. Cell Res. 2016;26:21–33. doi: 10.1038/cr.2015.147.
- 81. Freedman AH, et al. Genome sequencing highlights the dynamic early history of dogs. PLoS Genet. 2014;10:e1004016. doi: 10.1371/journal.pgen.1004016.
- 82. Rambaut A, Drummond AJ. 2007 Tracer 1.5. Available at tree.bio.ed.ac.uk/software/tracer/. Accessed July 3, 2011.
- 83. Ma X, Lu X. Sexual size dimorphism in relation to age and growth based on skeletochronological analysis in a Tibetan frog. Amphibia-Reptilia. 2009;30:351–359.
- 84. Hudson RR. Generating samples under a Wright-Fisher neutral model of genetic variation. Bioinformatics. 2002;18:337–338. doi: 10.1093/bioinformatics/18.2.337.
- 85. Li H, Durbin R. Inference of human population history from individual whole-genome sequences. Nature. 2011;475:493–496. doi: 10.1038/nature10231.
- 86. Excoffier L, Dupanloup I, Huerta-Sánchez E, Sousa VC, Foll M. Robust demographic inference from genomic and SNP data. PLoS Genet. 2013;9:e1003905. doi: 10.1371/journal.pgen.1003905.
- 87. Chen G, Wang B, Liu J, Xie F, Jiang J. Complete mitochondrial genome of Nanorana pleskei (Amphibia: Anura: Dicroglossidae) and evolutionary characteristics. Curr Zool. 2011;57:785–805.
- 88. Brisbin A, et al. PCAdmix: Principal components-based assignment of ancestry along each chromosome in individuals with admixed ancestry from two or more populations. Hum Biol. 2012;84:343–364. doi: 10.3378/027.084.0401.
- 89. Lachance J, Tishkoff SA. Biased gene conversion skews allele frequencies in human populations, increasing the disease burden of recessive alleles. Am J Hum Genet. 2014;95:408–420. doi: 10.1016/j.ajhg.2014.09.008.
- 90. Weir BS, Cockerham CC. Estimating F‐statistics for the analysis of population structure. Evolution. 1984;38:1358–1370. doi: 10.1111/j.1558-5646.1984.tb05657.x.
- 91. Danecek P, et al.; 1000 Genomes Project Analysis Group. The variant call format and VCFtools. Bioinformatics. 2011;27:2156–2158. doi: 10.1093/bioinformatics/btr330.
- 92. Yi X, et al. Sequencing of 50 human exomes reveals adaptation to high altitude. Science. 2010;329:75–78. doi: 10.1126/science.1190371.
- 93. Ai H, et al. Adaptation and possible ancient interspecies introgression in pigs identified by whole-genome sequencing. Nat Genet. 2015;47:217–225. doi: 10.1038/ng.3199.
- 94. Huang W, Sherman BT, Lempicki RA. Systematic and integrative analysis of large gene lists using DAVID bioinformatics resources. Nat Protoc. 2009;4:44–57. doi: 10.1038/nprot.2008.211.