eLife. 2024 Nov 8;12:RP89815. doi: 10.7554/eLife.89815

Archaic introgression contributed to shape the adaptive modulation of angiogenesis and cardiovascular traits in human high-altitude populations from the Himalayas

Giulia Ferraretti 1, Paolo Abondio 2, Marta Alberti 1, Agnese Dezi 3, Phurba T Sherpa 4, Paolo Cocco 5, Massimiliano Tiriticco 5, Marco Di Marcello 5, Guido Alberto Gnecchi-Ruscone 6, Luca Natali 5,7, Angela Corcelli 8, Giorgio Marinelli 5, Davide Peluzzi 5, Stefania Sarno 1, Marco Sazzini 1,9,†
Editors: Emilia Huerta-Sanchez 10, George H Perry 11
PMCID: PMC11548878  PMID: 39513938

Abstract

It is well established that several Homo sapiens populations experienced admixture with extinct human species during their evolutionary history. In some cases, such gene flow may have played a role in modulating their capability to cope with a variety of selective pressures, thus resulting in archaic adaptive introgression events. A paradigmatic example of this evolutionary mechanism is offered by the EPAS1 gene, whose most frequent haplotype in Himalayan highlanders was shown to reduce their susceptibility to chronic mountain sickness and to have been introduced into the gene pool of their ancestors by admixture with Denisovans. In this study, we aimed to further expand the investigation of the impact of archaic introgression on more complex adaptive responses to hypobaric hypoxia evolved by populations of Tibetan/Sherpa ancestry, which have plausibly been mediated by soft selective sweeps and/or polygenic adaptations rather than by hard selective sweeps. For this purpose, we used a combination of composite-likelihood and gene network-based methods to detect adaptive loci in introgressed chromosomal segments from Tibetan whole-genome sequence (WGS) data and to shortlist those presenting Denisovan-like derived alleles that participate in the same functional pathways and are absent in populations of African ancestry, which are assumed not to have experienced Denisovan admixture. According to this approach, we identified multiple genes putatively involved in archaic introgression events that, especially as regards TBC1D1, RASGRF2, PRKAG2, and KRAS, have plausibly contributed to shaping the adaptive modulation of angiogenesis and of certain cardiovascular traits in high-altitude Himalayan peoples. These findings provided unprecedented evidence about the complexity of the adaptive phenotype evolved by these human groups to cope with challenges imposed by hypobaric hypoxia, offering new insights into the tangled interplay of genetic determinants that mediates the physiological adjustments crucial for human adaptation to the high-altitude environment.

Research organism: Human

Introduction

The scientific community currently agrees that the Homo sapiens species experienced admixture with extinct Hominins, since traces of such interbreeding events are still detectable in the genomes of modern humans (Gouy and Excoffier, 2020). In fact, people belonging to non-African populations show 1–2% of Neanderthal ancestry (Green et al., 2010; Prüfer et al., 2014), while Melanesians and East-Asians present 3% and 0.2% of Denisovan ancestry, respectively (Reich et al., 2010; Meyer et al., 2013; Prüfer et al., 2014; Racimo et al., 2017). Although evidence supporting selection against introgressed alleles has been collected (Simonti et al., 2016; Racimo et al., 2017; McArthur et al., 2021), some of the genomic segments showing signatures ascribable to archaic introgression were also shown to have been targeted by natural selection in modern human populations, thus providing examples of adaptive introgression (AI) events (Racimo et al., 2017).

So far, several studies have indeed identified introgressed archaic alleles at high frequency in human genes involved in metabolism or in the response to environmental conditions, such as temperature, sunlight, and altitude (Prüfer et al., 2014; Vernot and Akey, 2014; Sankararaman et al., 2014; Huerta-Sánchez et al., 2014; Gittelman et al., 2016; Racimo et al., 2017; Enard and Petrov, 2018; Dannemann and Racimo, 2018). Moreover, some genes that play a role in immune responses to pathogens are characterized by a similar pattern of variability (Laurent et al., 2011; Enard and Petrov, 2018) and certain Neanderthal alleles have been shown to be associated with down-regulation of gene expression in brain and testes (McCoy et al., 2017; Racimo et al., 2017; Dannemann and Racimo, 2018). These works collectively attest that genetic variants introduced into the human gene pool by admixture with archaic species can significantly impact our biology, possibly by altering the modulation of several functional pathways. In particular, the high frequency of some archaic alleles in protein-coding and/or regulatory genomic regions suggests a possible adaptive role for Neanderthal and/or Denisovan variants, pointing to a further evolutionary mechanism that potentially contributed to the processes of human biological adaptation to different environmental and cultural settings. By introducing new alleles into the gene pool of a given population, admixture in fact provides a very rapid opportunity for natural selection to act on it (Huerta-Sánchez et al., 2014; Jeong et al., 2014; Racimo et al., 2015; Hamid et al., 2021), and this is thought to have occurred during the evolutionary history of H. sapiens, particularly after the last Out of Africa migration in the late Pleistocene (Sugden, 2018; Vahdati et al., 2022). According to this view, gene flow from extinct Hominin species could have facilitated the adaptation of H. sapiens populations to peculiar Eurasian environments.

For instance, a Denisovan origin of the adaptive EPAS1 haplotype, which confers reduced susceptibility to chronic mountain sickness to Tibetan and Sherpa highlanders (Beall, 2007; Bigham et al., 2010; Yi et al., 2010; Peng et al., 2011; Xu et al., 2011), is well established (Huerta-Sánchez et al., 2014; Zhang et al., 2021). However, the hard selective sweep experienced in high-altitude Himalayan populations by the EPAS1 introgressed haplotype has been demonstrated to account only for an indirect aspect of their adaptive phenotype, which does not explain most of the physiological adjustments they evolved to cope with hypobaric hypoxia (Gnecchi-Ruscone et al., 2018). Therefore, how far gene flow between Denisovans and the ancestors of Tibetan/Sherpa peoples facilitated the evolution of other key adaptive traits of these populations remains to be elucidated.

To fill this gap, and to overcome the main limitation of most approaches currently used to test for AI (i.e., inferring archaic introgression and the action of natural selection separately by means of different algorithms, which increases the risk of obtaining biased results due to confounding variables), we assembled a dataset of whole-genome sequences (WGSs) from 27 individuals of Tibetan ancestry living at high altitude (Cho et al., 2017; Jeong et al., 2018) and we analysed it using a composite-likelihood method specifically developed to detect introgression and selection jointly in a single analysis (Setter et al., 2020). Notably, this method was designed to recognize AI mediated by subtle selective events (such as those involved in polygenic adaptation) and/or soft selective sweeps, which represent the evolutionary mechanisms that are thought to have played a more relevant role than hard selective sweeps during the adaptive history of human groups characterized by particularly small effective population size, such as Tibetans and Sherpa (Gnecchi-Ruscone et al., 2018). This approach was coupled with validation of the identified putative adaptive introgressed loci through (1) the assessment of the composition of gene networks made up of functionally related DNA segments presenting archaic derived alleles that are absent in human groups assumed not to have experienced Denisovan admixture, such as African ones, (2) the confirmation that natural selection targeted these genomic regions in populations of Tibetan ancestry, and (3) the quantification of genetic distance between modern and archaic haplotypes. Together, these analyses provided new evidence about the biological functions that have mediated high-altitude adaptation in Himalayan populations and that have been favourably shaped by admixture of their ancestors with Denisovans.

Results

Spatial distribution of genomic variation and ancestry components of Tibetan samples

After quality control (QC) filtering of the available WGS data, we obtained a dataset made up of 27 individuals of Tibetan ancestry characterized for 6,921,628 single-nucleotide variants (SNVs). To assess whether this dataset represents a reliable proxy for the genomic variation observable in high-altitude Himalayan populations, we merged it with genome-wide genotyping data for 1086 individuals of East-Asian ancestry belonging to both low- and high-altitude groups (Gnecchi-Ruscone et al., 2017; Landini et al., 2021). We thus obtained an extended dataset including 231,947 SNVs (Supplementary file 1a), which was used to perform population structure analyses.

Results from ADMIXTURE and principal components analysis (PCA) were found to be concordant with those described in previous studies (Jeong et al., 2014; Gnecchi-Ruscone et al., 2017; Gnecchi-Ruscone et al., 2018; Yang et al., 2021). According to the ADMIXTURE model showing the best predictive accuracy (K = 7) (Figure 1—figure supplement 1), the examined WGS exhibited a predominant genetic component that was also appreciably represented in other populations speaking Tibeto-Burman languages, such as Tu, Yizu, Naxi, Lahu, and Sherpa (Figure 1A and Figure 1—figure supplement 2). This component reached an average proportion of around 78% in individuals of Tibetan ancestry from Nepal included in the extended dataset, and of more than 80% in the subjects under investigation, who live in the Nepalese regions of Mustang and Ghorka (Figure 1A, B). This suggests that after their relatively recent migration into Nepalese high-altitude valleys, these communities might have experienced a higher degree of isolation and genetic drift than populations that are still settled on the Tibetan Plateau, in which the same ancestry fraction did not exceed 64% (Figure 1A, B). Nevertheless, the overall ADMIXTURE profile of the considered WGS appeared to be quite comparable to those inferred from genome-wide genotyping data for other Tibetan populations (Figure 1A, B, Figure 1—figure supplement 2). Similarly, PCA pointed to the expected divergence of Tibetan and Sherpa high-altitude groups from the cline of genomic variation of East-Asian lowland populations (Abdulla et al., 2009; Jeong et al., 2014; Gnecchi-Ruscone et al., 2017; Zhang et al., 2017; Wang et al., 2022; Figure 1C). Remarkably, the WGS under investigation clustered within the bulk of genome-wide data generated for other groups from the Tibetan Plateau, thus supporting their representativeness as concerns the overall genetic background of high-altitude Himalayan populations.
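
As an illustration of how a best-fitting K can be chosen from cross-validation (CV) errors, the following minimal Python sketch picks the K with the lowest CV error. It assumes that the log lines follow the form `CV error (K=7): 0.5050`, which is an assumption about the ADMIXTURE output and not something stated in this article. The error values are invented for the example and are not those of the present study.

```python
import re

# Assumed log-line format (an assumption, not taken from the article):
#   CV error (K=7): 0.5050
CV_LINE = re.compile(r"CV error \(K=(\d+)\): ([0-9.]+)")


def parse_cv_errors(log_text):
    """Collect {K: CV error} from concatenated log text."""
    return {int(k): float(err) for k, err in CV_LINE.findall(log_text)}


def best_k(cv_errors):
    """Return the K associated with the lowest cross-validation error."""
    return min(cv_errors, key=cv_errors.get)


# Hypothetical values, for illustration only.
example_log = """\
CV error (K=6): 0.5080
CV error (K=7): 0.5050
CV error (K=8): 0.5070
"""

cv_errors = parse_cv_errors(example_log)
chosen_k = best_k(cv_errors)
print(chosen_k)
```

In practice, the CV error curve is usually inspected as a whole (as in Figure 1—figure supplement 1), since neighbouring values of K can have very similar errors.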

Figure 1. Population structure analyses performed on the extended dataset including Tibetan, Sherpa, and lowland East-Asian individuals.

(A) Admixture analysis showed the best predictive accuracy when seven (K = 7) population clusters were tested. Populations included in the dataset are labelled according to population names and acronyms reported in Supplementary file 1a. (B) Map showing geographic location and admixture proportions at K = 7 of the high-altitude groups included in the extended dataset. The label Tibetans_WG indicates whole-genome sequence data for individuals of Tibetan ancestry analysed in the present study. Additional information about the considered samples (e.g., number of individuals per group, reference study, and abbreviations used) is reported in Supplementary file 1a. (C) Principal components analysis (PCA) plot considering PC1 vs PC2 and summarizing genomic divergence between high-altitude Tibetan/Sherpa people and the cline of variation observable for lowland East-Asian populations. The enlarged square displays clustering between Tibetan samples sequenced for the whole genome (i.e., blue dots) and Tibetan samples characterized by genome-wide data (i.e., light-blue squares).

Figure 1—figure supplement 1. Scatterplot showing the number of possible population clusters (K) tested by the different ADMIXTURE runs performed and the cross-validation (CV) errors associated with them.

Figure 1—figure supplement 2. Admixture analyses performed on the extended dataset for K = 2 to K = 12.

Numbers of K increase from the top to the bottom of the plot and the considered populations are named according to the labels reported in Supplementary file 1a.

Detecting putative AI signatures in Tibetan genomes

To identify genomic regions showing signatures putatively ascribable to AI events, we scanned Tibetan WGS with the VolcanoFinder algorithm and we computed the composite likelihood ratio (LR) and −logα statistics for each polymorphic site (Setter et al., 2020). We then considered the most significant results by focusing on loci showing LR values falling in the positive tail (i.e., top 5%) of the obtained distribution (see Materials and methods, Supplementary file 1b).
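
The tail-based filtering described above can be sketched as follows. This is a minimal illustration of selecting the top 5% of a per-site score distribution, not the VolcanoFinder implementation. The function name `top_tail`, the dictionary layout, and the toy LR values are all hypothetical.

```python
def top_tail(lr_by_site, top_percent=5):
    """Return (threshold, sites) for the top `top_percent`% of LR values.

    `lr_by_site` maps a genomic position to its LR score. Sites are ranked
    by decreasing LR and those at or above the score of the last site in
    the top fraction are kept (ties at the threshold are therefore included).
    """
    ranked = sorted(lr_by_site.items(), key=lambda kv: kv[1], reverse=True)
    n_keep = (len(ranked) * top_percent) // 100  # integer arithmetic, no rounding issues
    if n_keep == 0:
        return None, []
    threshold = ranked[n_keep - 1][1]
    return threshold, [site for site, lr in ranked if lr >= threshold]


# Toy data: 40 hypothetical sites with steadily increasing LR values.
toy_lr = {1000 + 10 * i: 0.5 * i for i in range(1, 41)}

threshold, significant_sites = top_tail(toy_lr)
print(threshold, significant_sites)
```

An empirical cutoff of this kind ranks sites relative to the genome-wide distribution rather than assigning formal p-values, which is why the text refers to the positive tail of the obtained distribution.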

According to such an approach, we were first able to recapitulate the AI event previously described for the EPAS1 gene (Figure 2—figure supplement 1A; Huerta-Sánchez et al., 2014; Hu et al., 2017; Zhang et al., 2021). In fact, this chromosomal interval was found to be characterized by a remarkable number of variants (N = 19) showing significant LR scores, as well as by high overall values of −logα (Figure 2—figure supplement 1A), suggesting, respectively, the plausible archaic origin of many alleles at this gene and an appreciable action of natural selection on it. Five of these significant SNVs have already been described as Denisovan-like derived alleles at outstanding frequency (i.e., ranging between 0.96 and 1) in Tibetans but not in other modern human populations (Supplementary file 1c; Zhang et al., 2021). In contrast, at the genomic region encompassing EGLN1 (which we considered as a negative control for AI, see Materials and methods) we detected high −logα values coupled with a low number of SNVs (N = 3) showing significant LR scores, with only one being remarkably above the adopted significance threshold (Figure 2—figure supplement 1B and Supplementary file 1d). Overall, these findings are concordant with evidence from the literature that suggests adaptive evolution of both EPAS1 and EGLN1 loci in high-altitude Himalayan populations (Yang et al., 2017; Liu et al., 2022), although only the former was shown to have been impacted by archaic introgression (Huerta-Sánchez et al., 2014; Hu et al., 2017; Zhang et al., 2021).

Moreover, we were able to confirm other introgression signatures previously inferred from WGS data for populations of Tibetan ancestry, such as those involving the PRKCE gene and the MIRLET7BHG long non-coding region, which are located in the overlapping upstream chromosomal intervals, respectively, of EPAS1 and PPARA (Figure 2—figure supplement 2A, B). In line with what has been reported for EPAS1, PPARA has also been proposed to play a role in the modulation of high-altitude adaptation of Himalayan human groups (Simonson et al., 2010; Horscroft et al., 2017; Zhang et al., 2021). Interestingly, AI signatures identified by VolcanoFinder in the MIRLET7BHG locus also extended into the PPARA gene, as well as into its downstream region (Figure 2—figure supplement 2A), supporting the findings described by Hu et al., 2017. Finally, we observed patterns comparable to those at EPAS1 and PPARA for 10 additional genomic regions that were variously pointed out by previous studies as Tibetan and/or Han Chinese DNA segments potentially carrying introgressed Denisovan alleles (Hu et al., 2017; Browning et al., 2018; Zhang et al., 2021; Figure 2A, B, Figure 2—figure supplement 3A, Supplementary file 1e).

Figure 2. Distribution of VolcanoFinder statistics suggestive of putative adaptive introgressed loci across the TBC1D1 and PRKAG2 genomic regions.

Genomic positions of each single-nucleotide variant (SNV) are reported on the x-axis, while the related statistics are displayed on the y-axis. Pink background indicates the chromosomal interval occupied by the considered genes, while the grey background identifies those genes (i.e., PGM2 in the TBC1D1 downstream genomic region and the RHEB gene in the upstream PRKAG2 region) possibly involved in regulatory transcription mechanisms. The dashed red line identifies the threshold set to filter for significant likelihood ratio (LR) values (i.e., top 5% of LR values). For both these genomic regions, the distributions of LR and −logα are concordant with those observed at the EPAS1 positive control for adaptive introgression (AI). (A) A total of 50 significant LR values (red stars) were observed, and −logα values (grey diamonds) were collectively elevated, in both the TBC1D1 gene and its downstream genomic regions. A remarkable concentration of significant LR values characterizing 19 SNVs was especially observable in the first portion of the gene. (B) The entire PRKAG2 genomic region was found to comprise 46 SNVs showing significant LR values, with the greatest peaks being located in the downstream region associated with this gene. Peaks detected for the LR statistic are accompanied by peaks of −logα values.

Figure 2—figure supplement 1. Distribution of VolcanoFinder statistics across the EPAS1 and EGLN1 positive and negative controls for adaptive introgression.

In the plots, the chart base reports genomic positions of variants, pink background indicates the starting and the ending positions of the genes, while the grey background identifies those genes (i.e., the LINC02583 gene in the EPAS1 downstream genomic region, the SPRTN gene in the upstream EGLN1 region) possibly involved in regulatory transcription mechanisms. The dashed red line identifies the significance threshold set to filter likelihood ratio (LR) values (i.e., top 5% LR values). (A) The 19 single-nucleotide variants (SNVs) showing significant LR values (i.e., red stars) were closely distributed in the ending portion of the EPAS1 gene and in both up- and downstream genomic regions flanking this locus. The −logα values (i.e., grey diamonds) appeared consistently distributed across the entire EPAS1 region. A similar pattern is also observed for the TBC1D1, PRKAG2, and RASGRF2 new candidate AI genes, as reported in Figure 2A, B and in Figure 2—figure supplement 3. (B) The EGLN1 genomic region is characterized by only three significant LR values, among which only one strongly deviates from the LR significance threshold. Several SNVs distributed in the EGLN1 starting portion, as well as in its flanking regions, showed elevated −logα values supporting the action of natural selection on them in the considered Tibetan population.

Figure 2—figure supplement 2. Distribution of VolcanoFinder statistics across MIRLET7BHG, PPARA, and PRKCE genes.

In the plots, the chart base reports genomic positions of variants, pink background indicates the starting and the ending positions of the PPARA and PRKCE genes, while the grey background identifies those loci (i.e., MIRLET7BHG long non-coding, CDPF1 and TTC38 genes located in PPARA up- and downstream regions and the SRBD1 gene located in the PRKCE upstream genomic region, respectively) possibly involved in regulatory transcription mechanisms. The red horizontal dashed line displays the significance threshold set for filtering significant LR values. (A) A total of 32 single-nucleotide variants (SNVs) showed significant LR values (red stars) covering the entire genomic region considered. The greatest LR peaks are observed in the ending portion of the PPARA gene and in the CDPF1 gene. Collectively, the genomic regions comprising significant LR scores also showed elevated −logα values (grey diamonds). (B) The 55 significant LR scores and the elevated −logα values cover the entire region of the PRKCE gene (i.e., a gene located in a genomic region near EPAS1).

Figure 2—figure supplement 3. Distribution of VolcanoFinder statistics across the RASGRF2 candidate adaptive introgression (AI) gene.

The chart base reports genomic positions of variants, pink background indicates the starting and the ending positions of the RASGRF2 gene, while the grey background identifies those genes (i.e., RASGRF2-AS1 antisense RNA gene and CKMT2 located in the RASGRF2 up- and downstream regions, respectively) possibly involved in regulatory transcription mechanisms. The red horizontal dashed line displays the significance threshold set for filtering significant likelihood ratio (LR) values. A total of 14 significant LR values (red stars) were evenly distributed across the entire genomic region considered, with the greatest peak found in the ending portion of the RASGRF2 gene. Elevated peaks of −logα are observable in both the RASGRF2 gene and its flanking genomic regions. Such a pattern is in line with that observed for the EPAS1 positive control for AI, as reported in Figure 2—figure supplement 1, and for both TBC1D1 and PRKAG2 genomic regions, as reported in Figure 2A, B.

Figure 2—figure supplement 4. Distribution of VolcanoFinder statistics across the KRAS candidate adaptive introgression (AI) gene.

The pink and the grey rectangles represent the portions of the genome covered by the KRAS and ETFRF1 genes, respectively. Although only three significant likelihood ratio (LR) values can be observed for this genomic region, the KRAS gene was included in our set of new candidate AI genes because it was confirmed by all the subsequent validation analyses performed and in accordance with previous evidence advanced by Hu et al., 2017 and Browning et al., 2018, which suggests that a portion of variants included in the KRAS gene, as well as in its downstream region, shows signatures of archaic Denisovan introgression in both Tibetan and CHB populations. Elevated −logα values are instead consistently distributed across the KRAS gene and its surrounding genomic regions.

Validating genomic regions affected by archaic introgression

To validate signatures of archaic introgression at the candidate AI loci identified with VolcanoFinder, we relied on the approach described by Gouy and Excoffier, 2020. In detail, we used the Signet algorithm to identify networks of genes participating in the same functional pathway and presenting archaic-like (i.e., Denisovan) derived alleles observable in the putative admixed group (i.e., Tibetans) but not in an outgroup of African ancestry (i.e., Yoruba, YRI), by assuming that only Eurasian H. sapiens populations experienced Denisovan admixture (see Materials and methods).
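
The allele-level criterion stated above (a Denisovan-like derived allele observed in Tibetans but not in YRI) can be sketched as a simple filter. This is a simplified reading of the criterion for illustration, not the Signet algorithm itself. The record layout, field names, and toy sites are hypothetical.

```python
def is_denisovan_like_candidate(site):
    """True if the Denisovan allele is derived, present in the putative
    admixed group (Tibetans), and absent from the African outgroup (YRI)."""
    denisovan = site["denisovan"]
    return (
        denisovan != site["ancestral"]            # derived state in Denisova
        and denisovan in site["tibetan_alleles"]  # seen in the admixed group
        and denisovan not in site["yri_alleles"]  # not seen in the outgroup
    )


# Hypothetical sites, for illustration only.
toy_sites = [
    {"id": "site_1", "ancestral": "A", "denisovan": "G",
     "tibetan_alleles": {"A", "G"}, "yri_alleles": {"A"}},
    {"id": "site_2", "ancestral": "C", "denisovan": "T",
     "tibetan_alleles": {"C", "T"}, "yri_alleles": {"C", "T"}},
    {"id": "site_3", "ancestral": "G", "denisovan": "G",
     "tibetan_alleles": {"G", "A"}, "yri_alleles": {"G"}},
]

candidates = [s["id"] for s in toy_sites if is_denisovan_like_candidate(s)]
print(candidates)
```

Here `site_2` is excluded because the allele is shared with the outgroup, and `site_3` because the Denisovan allele is ancestral rather than derived.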

After having crosschecked results from the VolcanoFinder and Signet analyses, we identified six gene networks that turned out to be consistently significant across all the Signet runs performed and that included a total of 15 genes pointed out by VolcanoFinder as candidate AI loci (Supplementary file 1f). Four of these loci were part of the gene network overall ascribable to the Pathways in Cancer biological functions, which also included the EPAS1 positive control for AI (Figure 3A). Most of the other genes supported by both analyses were instead observed in significant networks belonging to the Ras signalling and AMPK signalling pathways (Supplementary file 1f).
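
The crosscheck between the two analyses amounts to intersecting each significant network with the set of VolcanoFinder candidates. The sketch below illustrates this under the assumption that both results are available as plain gene lists. The network names and gene identifiers are placeholders, not the memberships reported in Supplementary file 1f.

```python
def crosscheck(networks, volcano_candidates):
    """For each network, list the member genes also flagged by the
    introgression scan. Networks without shared genes are dropped."""
    shared = {}
    for name, genes in networks.items():
        overlap = sorted(genes & volcano_candidates)
        if overlap:
            shared[name] = overlap
    return shared


# Placeholder inputs, for illustration only.
toy_networks = {
    "pathway_1": {"GENE_A", "GENE_B", "GENE_C"},
    "pathway_2": {"GENE_C", "GENE_D"},
    "pathway_3": {"GENE_E"},
}
toy_candidates = {"GENE_A", "GENE_C", "GENE_F"}

supported = crosscheck(toy_networks, toy_candidates)
# Distinct genes supported by both analyses (a gene can sit in several networks).
supported_genes = sorted({g for genes in supported.values() for g in genes})
print(supported, supported_genes)
```

Counting distinct genes separately from per-network overlaps matters because a single gene can belong to more than one pathway.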

Figure 3. Significant gene networks including Denisovan-like derived alleles according to the Signet analysis.

(A) Schematic representation of the activation of the RAS/MAPK(ERK) axis after interaction of the bradykinin receptors with their ligands (e.g., ANG II) within the framework of the Pathways in Cancer network. Genes supported by both Signet (i.e., belonging to the significant network associated with Pathways in Cancer) and VolcanoFinder (i.e., including at least one single-nucleotide variant (SNV) showing a likelihood ratio (LR) value within the top 5% of the obtained results) analyses as potentially introgressed loci are highlighted in red and have a solid outline. Grey circles with a dotted-dashed contour instead indicate genes supported only by Signet, while loci marked with stars are those including genomic windows showing a LASSI T statistic within the top 5% of the related distribution. After the interaction between ANG II (active enzyme angiotensin II) and bradykinin receptors, activation of the Ras protein encoded by KRAS, mediated by RAS-GTPases (e.g., RASGRF2), triggers a series of phosphorylation reactions that eventually promote angiogenesis (Kranenburg et al., 2004). In detail, phosphorylation of the MAPK1 protein and prevention of MAPK1-DAPK-1-dependent apoptosis lead to increased MAPK1 activity (Kanehisa and Goto, 2000; Stevens et al., 2007) that causes increased FOS mRNA expression (Monje et al., 2005). FOS together with other proteins (e.g., Jun) forms the AP-1 transcription factor, which binds to the VEGF promoter region, upregulating its expression in endothelial cells (Catar et al., 2013) and sustaining angiogenesis when the hypoxia inducible factor 1 (HIF-1) signalling cascade is inhibited (Lorenzo et al., 2014). (B) Gene network built by setting co-expression as force function and by displaying the entire set of genes identified by the Signet algorithm as belonging to significant pathways including Denisovan-like derived alleles. Genes whose variation pattern was supported by both VolcanoFinder and Signet analyses (e.g., TBC1D1) as shaped by archaic introgression are displayed with a solid black outline. The EPAS1 positive control locus, previously shown to have mediated adaptive introgression in Tibetan populations, is represented as a light-blue octagon. Genes included in pathways involved in angiogenesis (e.g., RASGRF2) and/or activated in hypoxic conditions (e.g., PRKAG2) are reported as dark red and light-blue circles, respectively, while the remaining significant genes are represented as light-grey circles. The closeness or distance between nodes reflects their tendency to be co-expressed with each other, and all the connections inferred are characterized by a confidence score ≥0.7.

Interestingly, gene networks belonging to the Pathways in Cancer and Ras signalling pathway appeared to be tightly related from a functional perspective because they include oncogenes that promote the initiation and progression of tumour growth by stimulating cell proliferation and angiogenesis (Kranenburg et al., 2004). In particular, according to the Kyoto Encyclopaedia of Genes and Genomes (KEGG) database, KRAS and RASGRF2 genes from the Ras signalling network were also found to contribute to the Pathways in Cancer functions (Figure 3A), especially by interacting with the identified candidate introgressed loci PLCB1, RASGRP2, DAPK1, MAPK1, FOS, and VEGFA to modulate the VEGF signalling pathway, which is activated in hypoxic conditions and induces the transcription of genes that promote angiogenesis (Figure 3A; Maxwell and Ratcliffe, 2002; Kranenburg et al., 2004).

Also, the AMPK signalling pathway is known to be activated in different cell types by stresses such as deprivation of oxygen and/or glucose, leading to the inhibition of energy-consuming biosynthetic pathways (e.g., protein and glycogen synthesis) and to the activation of ATP-producing catabolic pathways, such as fatty acid oxidation and glycolysis (Kanehisa and Goto, 2000; Chen et al., 2018; Dengler, 2020).

No significant gene networks involving the EGLN1 genomic region, considered as a negative control for AI, were instead reconstructed with the Signet approach (Supplementary file 1f), suggesting that the very few variants at this locus that showed VolcanoFinder LR scores above the adopted significance threshold might represent false positive results (Figure 2—figure supplement 1B and Supplementary file 1d).

Shortlisting introgressed genomic regions characterized by adaptive evolution

To further shortlist the most robust candidate genes involved in AI events, we applied the LASSI algorithm to phased Tibetan WGS data with the aim of searching for genomic signatures ascribable to the action of natural selection (Harris and DeGiorgio, 2020) (see Materials and methods).

This enabled us to confirm the strong selective events that occurred at the EPAS1 and EGLN1 genes, as previously reported by multiple studies conducted on high-altitude Himalayan populations (Beall et al., 2010; Yi et al., 2010; Simonson et al., 2010; Horscroft et al., 2017; Zhang et al., 2021), as well as to corroborate adaptive evolution of some of the genes pointed out by both VolcanoFinder and Signet analyses. In fact, several chromosomal intervals associated with these loci presented values of the computed T statistic that fall within the top 5% of the related distribution (Figure 4C, D, Figure 4—figure supplements 1 and 2C, D, Figure 4—figure supplement 3C, D).

Figure 4. Representation of genetic distances between modern and archaic haplotypes and barplots showing haplotype frequency spectra for TBC1D1 and RASGRF2 candidate adaptive introgression (AI) genes.

Haplotypes are reported in rows, while derived (i.e., black square) and ancestral (i.e., white square) alleles are displayed in columns. Haplotypes are ranked from top to bottom according to their number of pairwise differences with respect to the Denisovan sequence. (A) Heatmap displaying divergence between Tibetan, CHB and YRI TBC1D1 haplotypes with respect to the Denisovan genome. A total of 33 TBC1D1 haplotypes (i.e., 61% of the overall haplotypes inferred for this region) belonging to individuals with Tibetan ancestry are plotted in the upper part of the heatmap, thus presenting the smallest number of pairwise differences with respect to the Denisovan sequence. (B) Heatmap displaying divergence between Tibetan, CHB and YRI RASGRF2 haplotypes with respect to the Denisovan genome. A total of 16 Tibetan haplotypes in the RASGRF2 genomic region present no differences with respect to the Denisovan sequence. As regards barplots, the haplotypes detected in the considered genomic windows are reported on the x-axis, while the frequency of each haplotype is indicated on the y-axis. The black and dark-grey bars indicate the more frequent haplotypes (i.e., the putative adaptive haplotypes inferred by the LASSI method), while red stars mark those haplotypes carrying Denisovan-like derived alleles. (C) TBC1D1 haplotype frequency spectrum. The TBC1D1 gene presents a haplotype pattern qualitatively comparable to that observed at EPAS1 (Figure 4—figure supplement 3A), with a predominant haplotype carrying archaic derived alleles and reaching elevated frequencies in Tibetan populations. In line with this observation, such a pattern was inferred by LASSI as consistent with a non-neutral evolutionary scenario, even if it seems to be characterized by a soft rather than a hard selective sweep due to the occurrence of three sweeping haplotypes. (D) RASGRF2 haplotype frequency spectrum. A soft selective sweep was also inferred for the considered RASGRF2 genomic window, although the frequencies reached by the sweeping haplotypes turned out to be more similar to each other. The second most represented haplotype was the one carrying the archaic derived alleles and reached a frequency of 29% in the Tibetan group.
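
The two summaries shown in this figure (ranking haplotypes by their number of pairwise differences from the Denisovan sequence, and the haplotype frequency spectrum) can be sketched as follows. This is a minimal illustration that encodes haplotypes as strings of 0 (ancestral) and 1 (derived). The encoding, function names, and toy haplotypes are assumptions made for the example and do not reproduce the LASSI implementation or the study data.

```python
from collections import Counter


def pairwise_differences(haplotype, reference):
    """Number of positions at which a haplotype differs from the reference."""
    if len(haplotype) != len(reference):
        raise ValueError("haplotype and reference must have the same length")
    return sum(a != b for a, b in zip(haplotype, reference))


def rank_by_distance(haplotypes, reference):
    """Order haplotypes from the closest to the most distant from the reference."""
    return sorted(haplotypes, key=lambda h: pairwise_differences(h, reference))


def frequency_spectrum(haplotypes):
    """Distinct haplotypes with their frequencies, most frequent first."""
    total = len(haplotypes)
    return [(h, n / total) for h, n in Counter(haplotypes).most_common()]


# Toy data: '1' = derived allele, '0' = ancestral allele (hypothetical).
denisovan = "1101"
toy_haplotypes = ["1101", "0101", "1101", "0000", "1101", "0101"]

ranked = rank_by_distance(toy_haplotypes, denisovan)
spectrum = frequency_spectrum(toy_haplotypes)
print(ranked, spectrum)
```

For scale, 27 diploid individuals yield 2 × 27 = 54 phased haplotypes, which is consistent with the 33 TBC1D1 haplotypes reported in panel A corresponding to 61% of the total.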

Figure 4—figure supplement 1. Haplotype frequency spectra of the top windows detected as adaptively evolved by LASSI in the EPAS1 and EGLN1 genomic regions.

Barplots showing haplotype frequency spectra in the genomic windows associated with the highest T value and linked to the (A) EPAS1 and (B) EGLN1 genes. The x-axis reports the haplotypes detected in the windows, while the y-axis indicates the frequency of each haplotype. For these windows, the haplotype frequency spectra clearly reflect the pattern of diversity expected under the hard selective sweep model, in which a single predominant haplotype carrying adaptive variants (i.e., the sweeping haplotype, represented by the black bar) reaches elevated frequency in the population.
Figure 4—figure supplement 2. Representation of genetic distances between modern and archaic haplotypes.


Heatmap displaying the divergence between Tibetan, CHB and YRI KRAS and PRKAG2 haplotypes with respect to the Denisovan sequence. Haplotypes are reported in rows, while derived (i.e., black square) and ancestral (i.e., white square) alleles are displayed in columns. Haplotypes are ranked from top to bottom according to their number of pairwise differences with respect to the Denisovan sequence. The red square identifies the cluster of Tibetan haplotypes classified by the LASSI method as sweeping haplotypes (i.e., haplotypes with elevated or moderate frequencies and which carry putative adaptive variants). (A) 16% of Tibetan haplotypes inferred for KRAS conformed with a non-neutral evolutionary scenario according to LASSI results and presented the smallest number of pairwise differences with respect to the Denisovan genome, being plotted in the upper part of the heatmap. (B) 33% of Tibetan PRKAG2 haplotypes cluster in the upper part of the heatmap, being among the closest haplotypes to the Denisovan sequence and presenting only four pairwise differences with it. Barplots showing the haplotype frequency spectra of the KRAS and PRKAG2 windows suggestive of adaptations mediated by soft selective sweeps in Tibetans. In both plots, the x-axis reports the haplotypes detected in the considered windows, while the y-axis indicates the frequency of each haplotype. The black and dark-grey bars indicate the more frequent haplotypes (i.e., the sweeping haplotypes inferred by LASSI), while the red star marks those haplotypes carrying Denisovan-like derived variants. (C) KRAS presents a pattern qualitatively comparable to that expected under non-neutral evolution (i.e., positive likelihood T values), with two main haplotypes carrying putative adaptive variants and reaching elevated frequencies in Tibetans. The second sweeping haplotype carries the Denisovan-like derived variant and reaches a frequency of 16%.
(D) The most frequent sweeping haplotype detected in this PRKAG2 window reaches a frequency of 33% in Tibetans and carries the Denisovan-like derived variant.
Figure 4—figure supplement 3. Representation of genetic distances between modern and archaic haplotypes.


Heatmap displaying the divergence between Tibetan, CHB and YRI EPAS1 and EGLN1 haplotypes with respect to the Denisovan sequence. Haplotypes are reported in rows, while derived (i.e., black square) and ancestral (i.e., white square) alleles are displayed in columns. Haplotypes are ranked from top to bottom according to their number of pairwise differences with respect to the Denisovan sequence. The red square identifies the cluster of Tibetan haplotypes classified by LASSI as sweeping haplotypes (i.e., haplotypes with elevated or moderate frequencies which carry putative adaptive variants). (A) The first homogeneous cluster of haplotypes visible in the upper part of the heatmap belongs to Tibetan individuals (i.e., light-blue cluster). These haplotypes are among the closest ones to the archaic Denisovan sequence indicated in black, thus confirming archaic introgression at EPAS1. As for the haplotypes inferred for the other populations in the plot, only one YRI haplotype presents one pairwise difference fewer than the haplotypes in the first Tibetan cluster. (B) Except for two Tibetan haplotypes, which appeared to be the closest ones to the archaic reference, the cluster of haplotypes presenting the lowest number of differences with respect to the Denisovan sequence belongs to the Han Chinese population. The most frequent Tibetan haplotype did not present any variant shared with the archaic reference, counting 14 pairwise differences with respect to the Denisovan genome and thus not supporting the archaic origin of these EGLN1 variants. Barplots showing the haplotype frequency spectra of the (C) EPAS1 and (D) EGLN1 windows suggestive of adaptation mediated by hard selective sweeps in Tibetans. In both plots, the x-axis reports the haplotypes detected in the considered windows, while the y-axis indicates the frequency of each haplotype.
The black and dark-grey bars indicate the more frequent haplotypes (i.e., sweeping haplotypes inferred by LASSI), while the red star marks those sweeping haplotypes carrying Denisovan-like derived variants. For these windows, haplotype frequency spectra clearly reflect patterns expected under the hard selective sweep model, in which a single haplotype carrying adaptive variants (i.e., the sweeping haplotype, represented by the black bar) reaches elevated frequency in the population. However, only for EPAS1 does such a haplotype actually carry the Denisovan-like derived variant.

In more detail, in addition to EPAS1, genomic windows associated with the DAPK1, GNG7, AK5, TBC1D1, PLCB1, RASGRF2, and PRKAG2 introgressed loci supported by both VolcanoFinder and Signet approaches were found to present scores within the top 5% of the T distribution, suggesting that their haplotype diversity was appreciably shaped by positive selection. Interestingly, adaptive evolution of the TBC1D1 and RASGRF2 genes has been previously proposed by studies conducted on different populations of Tibetan ancestry (Peng et al., 2011; Zheng et al., 2023).

Estimating genetic distance between modern and archaic sequences

As a final step for prioritizing the most convincing AI genes supported by VolcanoFinder, Signet, and LASSI approaches, as well as to explicitly test whether the Denisovan human species represented a plausible source of archaic alleles for them, we merged Tibetan WGS data with those from low-altitude Han Chinese (CHB) and YRI populations sequenced by the 1000 Genomes Project (Auton et al., 2015), and with the Denisovan genome. We then used the Haplostrips algorithm (Marnetto and Huerta‐Sánchez, 2017) to estimate genetic distance between modern and archaic haplotypes at the candidate AI genes reported in the previous paragraph. We especially considered genomic windows that included Denisovan-like derived alleles and that presented values of the likelihood T statistic supporting adaptive evolution (see Materials and methods).
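The core idea of ranking haplotypes by their number of pairwise differences from an archaic reference can be illustrated with a minimal sketch. This is not the Haplostrips implementation: the function names, the string encoding of haplotypes ('1' for derived, '0' for ancestral), and the toy sequences are all assumptions introduced here for illustration.

```python
from typing import Dict, List, Tuple


def pairwise_differences(haplotype: str, reference: str) -> int:
    """Count the sites at which a haplotype differs from the reference."""
    if len(haplotype) != len(reference):
        raise ValueError("haplotypes must cover the same sites")
    return sum(a != b for a, b in zip(haplotype, reference))


def rank_by_distance(haplotypes: Dict[str, str], reference: str) -> List[Tuple[str, int]]:
    """Sort haplotype labels by increasing distance from the reference.

    Ties are broken alphabetically by label so the order is deterministic.
    """
    distances = {
        label: pairwise_differences(hap, reference)
        for label, hap in haplotypes.items()
    }
    return sorted(distances.items(), key=lambda item: (item[1], item[0]))


# Hypothetical toy data: '1' = derived allele, '0' = ancestral allele.
archaic = "110110"
toy_haplotypes = {
    "TIB_1": "110110",
    "TIB_2": "110100",
    "CHB_1": "010000",
    "YRI_1": "000001",
}

ranking = rank_by_distance(toy_haplotypes, archaic)
for label, distance in ranking:
    print(label, distance)
```

In this toy example the haplotypes closest to the archaic sequence are listed first, which mirrors the top-to-bottom ordering used in the heatmaps.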

Among the tested putative AI loci, TBC1D1, RASGRF2, PRKAG2, and KRAS were found to present substantial proportions of Tibetan haplotypes that cluster close to the Denisovan sequence, thus showing the lowest numbers of pairwise differences with respect to it as compared with CHB or YRI haplotypes (Figure 4A, B, Figure 4—figure supplement 2A, B). More in detail, 61% of the TBC1D1 Tibetan haplotypes turned out to be the nearest ones to the archaic sequence by entailing only two pairwise differences with respect to it (Figure 4A), while 29% of Tibetan haplotypes inferred for the RASGRF2 gene were even identical to the Denisovan DNA (Figure 4B). Interestingly, both these chromosomal intervals were classified by the LASSI method as regions whose variation pattern was conformed with the soft selective sweep model, presenting three potential adaptive haplotypes (i.e., those haplotypes that plausibly carry putative advantageous alleles and thus increased in frequency due to positive selection). At each gene, one of these haplotypes was found to contain the Denisovan-like derived alleles that are completely absent in YRI (Figure 4A, D). A similar pattern was observed also for the considered PRKAG2 and KRAS genomic windows (Figure 4—figure supplement 2A, B).

Overall, the distribution of similarities between modern and archaic haplotypes described for the four identified AI candidate loci appears to be comparable to that obtained for EPAS1, with the sole relevant distinction being represented by an even more pronounced differentiation between Tibetan and CHB patterns plausibly ascribable to the occurrence of a hard (rather than soft) selective sweep at EPAS1 (Figure 4—figure supplement 3A), as previously proposed (Simonson et al., 2010; Huerta-Sánchez et al., 2014). In fact, at EPAS1 the haplotype carrying the lowest number of pairwise differences (N = 3) with respect to the archaic one belongs to the Tibetan population, in which it reached a frequency of 74%, being instead absent in all the other modern groups considered (Figure 4—figure supplement 3A). Conversely, the EGLN1 genomic window showing the highest T score according to LASSI analysis was characterized by an opposite pattern, with CHB haplotypes being overrepresented among those with the lowest number of pairwise differences (N = 3) with respect to the Denisovan genome (Figure 4—figure supplement 3B). Moreover, the sole EGLN1 putative adaptive haplotype inferred by LASSI for Tibetans was among those presenting the highest number of differences as compared with the archaic sequence, being characterized exclusively by alleles that are not observed in the Denisovan genome and thus suggesting that natural selection targeted modern rather than archaic EGLN1 variation (Figure 4—figure supplement 3B).

Discussion

In the present study, we aimed at investigating how far gene flow between the Denisovan archaic human species and the ancestors of modern populations settled in high-altitude regions of the Himalayas contributed to the evolution of key adaptive traits of these human groups, in addition to having conferred them reduced susceptibility to chronic mountain sickness (Huerta-Sánchez et al., 2014). For this purpose, we used WGS data from individuals of Tibetan ancestry to search for genomic signatures ascribable to AI mediated by weak selective events rather than by hard selective sweeps, under the assumption that soft sweeps and/or processes of polygenic adaptation are more likely to have occurred in such remarkably isolated and small effective population size groups (Gnecchi-Ruscone et al., 2018).

By assembling a large genome-wide dataset including both low- and high-altitude populations, we first framed the available WGS data into the landscape of East-Asian genomic variation. This confirmed that the genomes under investigation are well representative of the overall profiles of ancestry components observable in high-altitude Himalayan peoples (Figure 1A, B). In fact, the considered individuals were found to show close genetic similarity to other populations of Tibetan ancestry (Jeong et al., 2014) and to Sherpa people from Nepal (Gnecchi-Ruscone et al., 2017), as well as to appreciably diverge from the cline of variation of lowland East-Asians (Abdulla et al., 2009; Figure 1C).

Based on this evidence, we submitted Tibetan WGS to a pipeline of analyses that implemented multiple independent approaches aimed at identifying genomic regions characterized by signatures putatively ascribable to AI events and by tight functional correlations with each other. According to such a rationale, we shortlisted the candidate introgressed loci that most likely contribute to the same adaptive trait by searching for chromosomal intervals including loci simultaneously showing: (1) significant LR scores computed by the VolcanoFinder algorithm (Figure 2A, B, Figure 2—figure supplements 3 and 4), (2) Denisovan-like derived alleles belonging to significant networks of functionally related genes reconstructed with the Signet method and completely absent in populations of African ancestry (Figure 3A, B, Supplementary file 1f), (3) signatures ascribable to the action of natural selection as pointed out by LASSI analysis (Figure 4C, D, Figure 4—figure supplement 2C, D), and (4) haplotypes more similar to the Denisovan ones rather than to those observed in other modern human populations, as depicted by the Haplostrips approach (Figure 4A, B; Figure 4—figure supplement 2A, B).

Overall, in addition to EPAS1, which we considered as a positive control for AI, a total of 18 genes encompassed within the putative AI chromosomal intervals identified by the VolcanoFinder method (Supplementary file 1e) were found to have been previously proposed as genomic regions impacted by introgression of Denisovan alleles in Tibetan and/or Han Chinese populations (Huerta-Sánchez et al., 2014; Hu et al., 2017; Browning et al., 2018; Yang et al., 2017; Zhang et al., 2021). Among them, PPARA, PRKCE, and TBC1D1 (Figure 2A, Figure 2—figure supplement 2A, B) were also specifically suggested to have played an adaptive role in high-altitude groups from the Tibetan Plateau (Simonson et al., 2010; Peng et al., 2011; Horscroft et al., 2017; Arciero et al., 2018; Deng et al., 2019; Zhang et al., 2021; Zheng et al., 2023). Interestingly, PPARA encodes a nuclear transcription factor whose decreased expression in the myocardium of rats exposed to hypoxia seems to contribute to the maintenance of correct heart contractile function despite such a stressful condition (Cole et al., 2016). Similarly, the PRKCE protein kinase C has been demonstrated to exert a cardio-protective role against ischemic injury (Scruggs et al., 2016). Moreover, TBC1D1 encodes a protein whose serine phosphorylation sites are targeted by AMP-activated protein kinases (AMPK) after the activation of the AMPK signalling pathway as a result of the increased cellular AMP/ATP ratio caused by stresses that induce lower ATP production (e.g., deprivation of oxygen and/or glucose) or that accelerate ATP consumption (e.g., intense muscle contraction) (Kanehisa and Goto, 2000; Vichaiwong et al., 2010). In addition, another member of the AMPK signalling pathway, PRKAG2, was suggested by both VolcanoFinder analysis and literature data to present putative introgressed Denisovan alleles in Tibetan populations (Figure 2B; Zhang et al., 2021).
Mutations at this locus are known to cause the PRKAG2 cardiac syndrome, an inherited disease characterized by ventricular pre-excitation, supraventricular arrhythmias, and cardiac hypertrophy (Zhang et al., 2013; Porto et al., 2016). Dysregulation of AMPK activity mediated by reduction in PRKAG2 expression and leading to the impairment of glycogen metabolism in cell cultures has been proposed as a possible cause for the development of this pathological condition (Zhang et al., 2013). Conversely, enhanced activation of the AMPK signalling pathway during pregnancy coupled with PRKAG2 overexpression was observed in the placenta of women living at high altitudes when compared with women living in low-altitude regions (Lorca et al., 2021). Finally, high-altitude individuals of Tibetan ancestry were found to exhibit reduced incidence of major adverse cardiovascular events with respect to low-altitude controls possibly indicating the involvement of protective cardiac mechanisms in the modulation of high-altitude adaptations as previously proposed (Kolár and Ostádal, 2004; Mallet et al., 2018; Lei et al., 2024). We can thus speculate that adaptive evolution at the PPARA, PRKCE, TBC1D1, and PRKAG2 genomic regions in Tibetans might have contributed to the development of protective mechanisms that reduce cardiovascular risk associated to the hypoxic stress.

In addition to these genes, two other identified candidate AI loci (i.e., RASGRF2 and KRAS) have been previously shown to have been targeted by natural selection in Tibetan populations (Peng et al., 2011) or to present putative introgressed archaic alleles (Hu et al., 2017; Browning et al., 2018; Figure 2—figure supplements 3 and 4). The proteins encoded by these loci are closely related from a functional perspective, with the Ras protein specific guanine nucleotide releasing factor 2 representing a calcium-regulated nucleotide exchange factor that activates the RAS protein encoded by the proto-oncogene KRAS (Kanehisa and Goto, 2000; Sayers et al., 2022).

Although evidence reported in the literature seems to corroborate VolcanoFinder results that indicate PPARA and PRKCE as putatively implicated in AI events experienced by Tibetan ancestors, only the EPAS1, TBC1D1, RASGRF2, PRKAG2, and KRAS loci were finally retained after the adopted filtering procedure as the most reliable candidate AI loci.

Archaic introgression at these genomic regions was first confirmed by Signet analysis, with TBC1D1 and PRKAG2 being included in a significant gene network belonging to the AMPK signalling pathway, while RASGRF2 and KRAS participate in that related to the Ras signalling pathway (Supplementary file 1f). The same analysis pointed to EPAS1 as a member of a significant network belonging to the Pathways in cancer, a complex group of biological functions such as those involved in Ras, MAPK, VEGF, and HIF-1 signalling cascades (Kanehisa and Goto, 2000; Figure 3A and Supplementary file 1f). These findings emphasize a link between biological mechanisms activated within the context of hypoxic tumour microenvironments in different types of cancers and those involved in high-altitude adaptations, especially as concerns functions that might underlie the improved angiogenesis observed in Tibetan and Sherpa individuals (Gnecchi-Ruscone et al., 2018; Calderón-Gerstein and Torres-Samaniego, 2021). For instance, accumulation of the HIF-1α transcriptional factor in the nucleus of cells close to hypoxic tumour masses entails the activation of diverse biological responses such as the formation of dense capillary structures that permit the supply of oxygen and nutrients to cancer cells, thus determining tumour progression and/or treatment failure (Brahimi-Horn et al., 2007; Calderón-Gerstein and Torres-Samaniego, 2021). In line with this evidence, the Ras and MAPK/ERK signalling pathways have been proposed to play a significant role in promoting angiogenesis by triggering VEGF expression, being possibly implicated in the adaptive response to hypoxia evolved by high-altitude populations (Figure 3A; Kanehisa and Goto, 2000; Kranenburg et al., 2004). Moreover, in the study by Lorenzo et al., 2014 a gain-of-function mutation in the EGLN1 gene was observed in Tibetans and was shown to enhance the catalytic activity of the HIF prolyl 4-hydroxylase 2 (PHD2) under hypoxic conditions.
This alters the binding of HIF-2α (the isoform 2 of the inducible hypoxia transcriptional factor encoded by EPAS1) and negatively regulates the activation of the HIF-1 signalling pathway during hypoxia, eventually offering protection against the detrimental effects of prolonged polycythaemia. When HIF-2α exerts its transcriptional activities along with p300 protein and HIF-1β, it enhances VEGF expression and permits the activation of the VEGF signalling pathway (Kanehisa and Goto, 2000; Rashid et al., 2021). Accordingly, down-regulation of the HIF-1 signalling pathway entails the reduction of VEGF mRNA expression (Greenberger et al., 2008; Zhang et al., 2018). Coupled with these observations, results from the Signet analysis suggest that in individuals of Tibetan ancestry, when the HIF-1 signalling pathway is likely down-regulated in chronic hypoxic conditions (Lorenzo et al., 2014), adaptive changes at the Ras/MAPK signalling pathways could represent alternative biological mechanisms that, in its place, sustain improved angiogenesis and thus permit adequate tissue oxygenation (Figure 3A).

The same five candidate genes, in addition to the EGLN1 locus considered as a negative control for AI, were also confirmed by the LASSI method to have adaptively evolved in the studied populations (Figure 4C, D, Figure 4—figure supplement 1A, B, Figure 4—figure supplements 2 and 3C, D). In fact, several genomic windows in their associated chromosomal intervals and/or in their flanking regions presented positive values of the computed likelihood T statistic, many of which fell in the top 5% of the related distribution, thus indicating their non-neutral evolution. Interestingly, TBC1D1, RASGRF2, PRKAG2, KRAS, and EPAS1 significant genomic windows pointed out by LASSI were also found to include Denisovan-like derived alleles that are completely absent in the YRI African population, suggesting that positive selection acted in Tibetans on haplotypes containing archaic introgressed variation. Such a scenario was further supported by the Haplostrips analysis, which revealed for all the loci mentioned above patterns of similarity between Tibetan and Denisovan haplotypes that are comparable to that observed for EPAS1 (Figure 4A, B, Figure 4—figure supplement 2A, B), with the sole exception being represented by EGLN1 (Figure 4—figure supplement 1B, Figure 4—figure supplement 3B, D, Figure 4—figure supplement 3A).

Overall, we collected multiple lines of evidence supporting both the archaic origin and the adaptive role of variation at TBC1D1, RASGRF2, PRKAG2, and KRAS genes in populations of Tibetan ancestry. Genetic signatures at such loci are especially consistent with the hypothesis of adaptive events mediated by soft selective sweeps and/or polygenic mechanisms that involved haplotypes including both modern and archaic introgressed alleles. Therefore, the results obtained expand the knowledge about AI events that involved the ancestors of modern high-altitude Himalayan populations and Denisovans, and emphasize once more the complexity of the adaptive phenotype evolved by these human groups to cope with challenges imposed by hypobaric hypoxia. Accordingly, they offer new insights for future studies aimed at elucidating the molecular mechanisms by which several genes along with TBC1D1, PRKAG2, RASGRF2, and KRAS interact with each other and contribute to mediate physiological adjustments that are crucial for human adaptation to the high-altitude environment.

Materials and methods

Ethics

The University of Bologna Ethics Committee released approval (Prot. 205142, 12/9/2019) for the present study within the framework of the project ‘Genetic adaptation and acclimatization to high altitude as experimental models to investigate the biological mechanisms that regulate physiological responses to hypoxia’. However, no new sampling campaign was performed in the context of the present study and all the genomic data analysed were publicly available. The informed consent for the 27 Tibetan WGS analysed here was previously obtained and declared in the Ethics statement section of the study by Jeong et al., 2018.

Dataset composition and curation

The dataset used in the present study included WGS data for 27 individuals of Tibetan ancestry recruited from the high-altitude Nepalese districts of Mustang and Ghorka (Jeong et al., 2018). Although these subjects reside in Nepal, they have been previously shown to speak Tibetan dialects and to live in communities showing religious and social organizations typical of populations settled on the Tibetan Plateau, being also biologically representative of high-altitude Tibetan people (Cho et al., 2017). To filter for high-quality genotypes, the dataset was subjected to QC procedures using the software PLINK v1.9 (Purcell et al., 2007). In detail, we retained autosomal SNVs characterized by no significant deviations from the Hardy–Weinberg equilibrium (p > 5.3 × 10⁻⁹ after Bonferroni correction for multiple testing), as well as loci/samples showing less than 5% of missing data. Moreover, we removed SNVs with ambiguous A/T or G/C alleles and multiallelic variants, obtaining a dataset composed of 27 individuals and 6,921,628 SNVs. WGS data were finally phased with SHAPEIT2 v2.r904 (Delaneau et al., 2013) by applying default parameters and using the 1000 Genomes Project dataset as a reference panel (Auton et al., 2015) and HapMap phase 3 recombination maps.
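The per-variant filtering logic described above can be summarized in a short sketch. It does not reproduce the PLINK commands actually used in the study; the `Variant` record, the `passes_qc` function, and the toy values are hypothetical, while the thresholds (Hardy–Weinberg p-value of 5.3 × 10⁻⁹ and 5% missingness) are those reported in the text.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Strand-ambiguous allele pairs: A/T and C/G read the same on both strands.
AMBIGUOUS_PAIRS = {frozenset("AT"), frozenset("CG")}


@dataclass
class Variant:
    """Hypothetical per-variant summary used only for this illustration."""
    variant_id: str
    alleles: Tuple[str, ...]  # reference allele followed by alternate allele(s)
    hwe_p: float              # Hardy-Weinberg test p-value
    missing_rate: float       # fraction of samples with a missing genotype


def passes_qc(v: Variant, hwe_threshold: float = 5.3e-9,
              max_missing: float = 0.05) -> bool:
    """Return True if the variant survives all filters described in the text."""
    if len(v.alleles) != 2:
        return False  # multiallelic variant
    if any(len(a) != 1 or a not in "ACGT" for a in v.alleles):
        return False  # not a single-nucleotide variant
    if frozenset(v.alleles) in AMBIGUOUS_PAIRS:
        return False  # ambiguous A/T or G/C alleles
    if v.hwe_p <= hwe_threshold:
        return False  # significant deviation from Hardy-Weinberg equilibrium
    return v.missing_rate < max_missing


# Hypothetical toy variants covering each filter.
toy_variants: List[Variant] = [
    Variant("v1", ("A", "G"), 0.40, 0.00),
    Variant("v2", ("A", "T"), 0.40, 0.00),       # ambiguous
    Variant("v3", ("C", "G"), 0.40, 0.00),       # ambiguous
    Variant("v4", ("A", "C", "T"), 0.40, 0.00),  # multiallelic
    Variant("v5", ("A", "G"), 1e-12, 0.00),      # fails HWE
    Variant("v6", ("C", "T"), 0.20, 0.10),       # too much missing data
    Variant("v7", ("G", "T"), 0.90, 0.01),
]

kept = [v.variant_id for v in toy_variants if passes_qc(v)]
print(kept)
```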

Population structure analyses

To assess representativeness, genetic homogeneity, and ancestry composition of Tibetan WGS included in the dataset, we performed genotype-based population structure analyses. For this purpose, we merged the unphased WGS dataset with genome-wide genotyping data for 34 East-Asian populations (Gnecchi-Ruscone et al., 2017; Landini et al., 2021) and we applied the same QC described above. The obtained extended dataset included 231,947 SNVs and was used to assess the degree of recent shared ancestry (i.e., identity-by-descent, IBD) for each pair of individuals. Identity-by-state (IBS) estimates were thus used to calculate an IBD kinship coefficient and a threshold of PI_HAT = 0.270 was used to remove closely related subjects up to the second degree (Ojeda-Granados et al., 2022). To discard variants in linkage disequilibrium (LD) with each other, we then removed one SNV for each pair showing r² > 0.2 within sliding windows of 50 SNVs, advancing by 5 SNVs at a time. The obtained LD-pruned dataset was finally filtered for variants with minor allele frequency <0.01 and used to compute PCA utilizing the smartpca method implemented in the EIGENSOFT package (Patterson et al., 2006), as well as to run the ADMIXTURE algorithm version 1.3.0 (Alexander et al., 2009) by testing K = 2 to K = 12 population clusters. In detail, 25 replicates with different random seeds were run for each K and we retained only those presenting the highest log-likelihood values. In addition, cross-validation errors were calculated for each K to identify the value that best fit the data. Both PCA and ADMIXTURE results were visualized with the R software version 4.0.5.
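The selection of ADMIXTURE replicates can be sketched as follows. The `Run` record, the function names, and all numerical values are hypothetical; in particular, the sketch assumes that the cross-validation error compared across K values is the one of the retained (highest log-likelihood) replicate, which the text does not state explicitly.

```python
from typing import Dict, List, NamedTuple


class Run(NamedTuple):
    """Hypothetical summary of one ADMIXTURE replicate."""
    k: int
    seed: int
    loglik: float
    cv_error: float


def best_run_per_k(runs: List[Run]) -> Dict[int, Run]:
    """Keep, for each K, the replicate with the highest log-likelihood."""
    best: Dict[int, Run] = {}
    for run in runs:
        if run.k not in best or run.loglik > best[run.k].loglik:
            best[run.k] = run
    return best


def best_k(best: Dict[int, Run]) -> int:
    """Return the K whose retained replicate has the lowest CV error."""
    return min(best.values(), key=lambda r: r.cv_error).k


# Hypothetical toy values (two replicates per K instead of 25).
toy_runs = [
    Run(2, 1, -1050.0, 0.560), Run(2, 2, -1040.0, 0.550),
    Run(3, 1, -1000.0, 0.520), Run(3, 2, -1010.0, 0.530),
    Run(4, 1, -995.0, 0.540), Run(4, 2, -990.0, 0.545),
]

selected = best_run_per_k(toy_runs)
chosen_k = best_k(selected)
print(chosen_k, selected[chosen_k])
```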

Detecting signatures of AI

To identify chromosomal regions showing signatures putatively ascribable to AI events, we submitted the phased WGS dataset to the VolcanoFinder pipeline, which relies on the analysis of the allele frequency spectrum of the population that is supposed to have experienced archaic introgression (Setter et al., 2020). This model considers three populations, namely a recipient (i.e., the modern population), a donor (i.e., the archaic population), and an outgroup, and assumes the occurrence of an introgression event after which a beneficial haplotype is introduced into the modern gene pool and starts to rise in frequency because of random demographic processes and of the action of natural selection on it. The influence of these evolutionary forces results in multiple haplotypes that carry the beneficial archaic allele, and thus in the elevated heterozygosity level tested by the LR statistic. Therefore, the pattern of variability tested by such a statistic differs markedly from that attributable to classical selective sweeps (especially from that associated with a hard selective sweep), making it possible to identify weak signatures of adaptation that resulted from both introgression and the action of natural selection on beneficial standing genetic variation.

The VolcanoFinder algorithm was chosen among several approaches developed to detect AI signatures for the following reasons. First, it can jointly test both archaic introgression and adaptive evolution according to a model that differs from those considered by other statistics that are aimed at identifying chromosomal segments showing low divergence with respect to a specific archaic sequence and/or enriched in alleles uniquely shared between the admixed group and the archaic source and characterized by a frequency above a certain threshold in the population under study (Racimo et al., 2017). In fact, these methods are especially useful to test an evolutionary scenario conforming to that expected when adaptation was mediated by hard selective sweeps. By contrast, VolcanoFinder was shown to have elevated power in the identification of AI events mediated by more than one predominant haplotype (Setter et al., 2020), as expected when soft sweeps/polygenic adaptations occurred. Moreover, VolcanoFinder is computationally less demanding than algorithms that need to be trained on genomic simulations built specifically to reflect the evolutionary history of the population under study (Gower et al., 2021; Zhang et al., 2023), which may introduce bias in the obtained results if the information that guides the simulations is not accurate.

We thus scanned Tibetan WGS data using the VolcanoFinder method and calculated two statistics for each polymorphic site, α (subsequently converted to −logα) and LR, which are informative, respectively, of: (1) the strength of natural selection and (2) the conformity to the evolutionary model of AI. Since elevated LR scores are assumed to support AI signatures (Setter et al., 2020), we filtered the most significant results obtained by focusing on SNVs showing LR values falling in the positive tail (i.e., top 5%) of the distribution built for such a statistic. We then visualized the distribution of both α and LR parameters across the genomic regions including the EPAS1 and EGLN1 genes, which we considered, respectively, as positive and negative control loci for AI, by investigating chromosomal intervals spanning 50 kb up- and downstream of these genes. We then defined the new candidate AI genomic regions based on their conformity with the pattern observed for the positive AI control gene (i.e., according to the detection of multiple peaks of LR scores consistently distributed in the entire genomic region considered and coupled with elevated values of the −logα parameter). Moreover, we relied on evidence advanced by previous studies aimed at detecting archaic introgression signatures from WGS data for individuals with Tibetan and/or Han Chinese ancestries (Hu et al., 2017; Browning et al., 2018; Zhang et al., 2021) to filter out loci potentially targeted by natural selection in Tibetans, but with questionable archaic origins.
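Selecting the SNVs that fall in the top 5% of the LR distribution amounts to a simple rank-based cutoff, which can be sketched as follows. This operates on already computed scores and is not part of VolcanoFinder itself; the function name, the rounding rule used to turn the fraction into a count, and the toy scores are assumptions made here for illustration.

```python
from typing import Dict, List


def top_fraction(scores: Dict[str, float], fraction: float = 0.05) -> List[str]:
    """Return the identifiers whose score lies in the upper tail.

    The number of retained items is the rounded product of the fraction and
    the number of scores, with a minimum of one item.
    """
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    ranked = sorted(scores, key=lambda snv: scores[snv], reverse=True)
    n_top = max(1, round(len(ranked) * fraction))
    return ranked[:n_top]


# Hypothetical LR scores for 40 SNVs: snv1 has LR = 1.0, ..., snv40 has LR = 40.0.
toy_scores = {f"snv{i}": float(i) for i in range(1, 41)}

candidates = top_fraction(toy_scores, fraction=0.05)
print(candidates)
```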

Identifying gene networks including Denisovan-like derived alleles

To validate archaic introgression signatures inferred with VolcanoFinder by using an independent method, we followed the Signet approach described by Gouy and Excoffier, 2020, with the aim of identifying biological functions whose underlying genomic regions might have been significantly shaped by Denisovan introgression. The Signet approach consists in crosschecking the information contained in the input dataset with that collected in reference databases of functional pathways, such as KEGG (available at https://www.kegg.jp/), using a simulated annealing algorithm to define the High Scoring Subnetworks within each biological pathway (Gouy and Excoffier, 2020). In detail, we used the Signet algorithm to reconstruct networks of genes that participate in the same biological pathway and that also include Denisovan-like derived alleles observable in the Tibetan population but not in an outgroup population of African ancestry, by assuming that only Eurasian H. sapiens populations experienced Denisovan admixture. To do so, we first compared the Tibetan and Denisovan genomes to assess which SNVs were present in both modern and archaic sequences. These loci were further compared with the reconstructed ancestral reference human genome sequence to discard those presenting an ancestral state (i.e., that we have in common with several primate species). Moreover, we further filtered the considered variants by retaining only those alleles that were completely absent (i.e., with frequency equal to zero) in the YRI population sequenced by the 1000 Genomes Project (Auton et al., 2015). We then calculated the average frequency in the Tibetan population of the Denisovan-like derived alleles observable in each gene and we used the obtained genomic distribution to inform Signet.
We performed five independent runs of the Signet algorithm to check for consistency of the significant gene networks and functional pathways identified and we finally depicted the confirmed results using Cytoscape v3.9.1 (Shannon et al., 2003).

Testing adaptive evolution of candidate introgressed loci

To confirm signatures ascribable to the action of natural selection at the putative introgressed loci supported by both the VolcanoFinder and Signet analyses, we applied the LASSI likelihood method (Harris and DeGiorgio, 2020) to the available Tibetan WGS data. This approach detects and distinguishes genomic regions that have experienced different types of selective events (i.e., strong and weak ones) and has been demonstrated to be more powerful in doing so than other haplotype-based approaches (Harris and DeGiorgio, 2020). The rationale behind the LASSI approach is to recognize how the haplotype frequency spectrum of a given genomic region is modified after natural selection has acted on it. For instance, according to the hard sweep model, when a beneficial mutation with a very strong impact on a given phenotypic trait arises in the population, the haplotype frequency spectrum of the corresponding genomic region will be characterized by a single haplotype with an extremely elevated frequency (i.e., the sweeping haplotype), while the other haplotypes, if they exist, are found at very low frequencies. By contrast, when selection acts on a few new variants with a lower impact on a trait, or on standing genetic variation, the resulting haplotype frequency spectrum will be characterized by two or a few more haplotypes that reach moderate frequencies in the population. Finally, the haplotype frequency spectrum expected under neutrality will be characterized by a series of different haplotypes at low frequencies in the population.
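The three qualitative patterns described above can be made concrete with a small sketch that computes a haplotype frequency spectrum from a list of haplotype labels. It only illustrates the concept and is not part of the LASSI software; the function name and the toy samples are hypothetical.

```python
from collections import Counter

def haplotype_frequency_spectrum(haplotypes):
    """Return haplotype frequencies sorted from most to least common."""
    counts = Counter(haplotypes)
    total = len(haplotypes)
    return sorted((c / total for c in counts.values()), reverse=True)

# Toy samples of 20 haplotypes each (labels stand for distinct sequences).
hard_sweep = ["A"] * 18 + ["B", "C"]              # one dominant haplotype
soft_sweep = ["A"] * 9 + ["B"] * 8 + ["C"] * 3    # a few at moderate frequency
neutral = [str(i) for i in range(10)] * 2         # many rare haplotypes
```

In this toy setting the hard-sweep sample has a single class at frequency 0.9, the soft-sweep sample has two classes near 0.4, and the neutral sample has ten classes at 0.1 each.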

Specifically, we calculated for each genomic window the likelihood T statistic, which measures the conformity of variability patterns of the analysed region to those expected according to a haplotype frequency spectrum under adaptive rather than neutral evolution. In addition, the LASSI algorithm calculated the parameter m (i.e., the number of sweeping haplotypes) for each genomic region, thus classifying them as affected by hard sweeps (when m = 1) or soft sweeps (when m > 1). In detail, T scores significantly different from zero indicate the conformity with a non-neutral evolutionary scenario, with ever higher likelihood scores being indicative of increasingly robust evidence for a selective event (Harris and DeGiorgio, 2020).

The method requires setting a custom value for the length of the considered genomic windows, which is measured as the number of SNVs included in them, and moving windows by 1 SNV after each computation. Therefore, we selected this fixed-length value (i.e., 13 SNVs) by estimating the average number of SNVs included in a haplotype block, as defined for the population under study by using the --blocks function implemented in PLINK v1.9 (Purcell et al., 2007). Moreover, following the indication by Harris and DeGiorgio, 2020 to choose values <10 for the fixed number of haplotypes in the spectrum (i.e., K values) in order to increase the power of the T statistic, we set K to seven. Finally, we chose likelihood model 3 to calculate the T statistic and we applied the LASSI algorithm to the phased WGS dataset.
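The windowing scheme described above can be sketched as follows. The helper names are hypothetical, the input to `window_size_from_blocks` is assumed to be a list of per-block SNV counts already extracted from the block output, and renormalising the truncated spectrum so that it sums to one is an assumption about how the K most frequent haplotypes are handled, not a reproduction of the LASSI implementation.

```python
def window_size_from_blocks(snvs_per_block):
    """Round the mean number of SNVs per haplotype block to an integer."""
    return round(sum(snvs_per_block) / len(snvs_per_block))

def sliding_windows(n_snvs, size=13, step=1):
    """Yield (start, end) SNV index pairs, end exclusive, for each window."""
    for start in range(0, n_snvs - size + 1, step):
        yield start, start + size

def truncate_spectrum(spectrum, k=7):
    """Keep the K most frequent haplotype classes and renormalise them."""
    top = sorted(spectrum, reverse=True)[:k]
    total = sum(top)
    return [f / total for f in top]
```

With 15 SNVs and the default settings, three overlapping windows are produced, each shifted by one SNV relative to the previous one.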

We then focused on the genomic windows showing T scores falling in the positive tail (i.e., top 5%) of the obtained distribution and crosschecked these results with the significant ones pointed out by the VolcanoFinder and Signet analyses, to shortlist genomic regions that have plausibly experienced both archaic introgression and adaptive evolution.
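The crosschecking step can be pictured as an intersection of three evidence sets. The sketch below assumes that the top-scoring LASSI windows are already available as (chromosome, start, end) tuples and that the VolcanoFinder and Signet results have been reduced to gene names; the gene labels, coordinates, and function names are placeholders, not values from the study.

```python
def overlaps(window, gene):
    """True if a window and a gene share a chromosome and at least one base."""
    w_chrom, w_start, w_end = window
    g_chrom, g_start, g_end = gene
    return w_chrom == g_chrom and w_start <= g_end and g_start <= w_end

def genes_hit_by_windows(windows, gene_coords):
    """Return the names of genes overlapped by at least one window."""
    return {
        name
        for name, coords in gene_coords.items()
        if any(overlaps(w, coords) for w in windows)
    }

def shortlist(lassi_windows, gene_coords, volcanofinder_genes, signet_genes):
    """Genes supported by all three analyses, in alphabetical order."""
    lassi_genes = genes_hit_by_windows(lassi_windows, gene_coords)
    return sorted(lassi_genes & set(volcanofinder_genes) & set(signet_genes))
```

A gene enters the shortlist only if it is supported by every method, which mirrors the conservative logic of requiring concordant evidence for both introgression and selection.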

Haplostrips analysis

To explicitly test whether the putative adaptive introgressed loci pointed out by the VolcanoFinder, Signet, and LASSI analyses present variation patterns compatible with a scenario of introgression from the Denisovan archaic human species, we estimated the genetic distance between modern and archaic haplotypes inferred for the genomic windows supported by all the methods mentioned above. For the sake of comparison, we did the same for EPAS1 and EGLN1, the established positive and negative control genes that have been previously shown to be involved or not involved, respectively, in AI events. For this purpose, we used the Haplostrips pipeline, as described in previous studies (Huerta-Sánchez et al., 2014; Marnetto and Huerta‐Sánchez, 2017). Moreover, since the EGLN1 gene did not include any Denisovan-like variant identified according to the criteria described in the previous paragraphs, we chose to build the Haplostrips heatmap by considering the EGLN1 genomic window associated with the highest value of the LASSI statistic.

In detail, we merged the 27 Tibetan whole genomes under study (Jeong et al., 2018) with 27 CHB and 27 YRI WGSs (Auton et al., 2015) and with the Denisovan genome (Meyer et al., 2013) (downloadable at http://cdna.eva.mpg.de/neandertal/altai/Denisovan/). The CHB population, which is known to share a relatively recent common ancestry with Tibetans, was used as a ‘negative low-altitude control’ (i.e., as a group whose ancestors experienced Denisovan introgression, but did not evolve high-altitude adaptation). YRI individuals were instead used as the outgroup population (i.e., a population that presumably did not experience Denisovan admixture), as previously proposed (Zhang et al., 2021). We then phased the assembled dataset with SHAPEIT2 v2.r904 (Delaneau et al., 2013), as described in the Dataset composition and curation section, and we ran the Haplostrips algorithm.
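The idea of ranking modern haplotypes by their distance from an archaic reference can be sketched in a few lines. This is a simplified illustration and not the Haplostrips code: haplotypes are assumed to be equal-length strings of 0/1 alleles, distance is taken as the number of differing sites, and all names and toy sequences are hypothetical.

```python
def distance_to_reference(haplotype, reference):
    """Count the sites at which a haplotype differs from the reference."""
    if len(haplotype) != len(reference):
        raise ValueError("haplotypes must cover the same set of sites")
    return sum(a != b for a, b in zip(haplotype, reference))

def order_by_distance(haplotypes, reference):
    """Return (name, distance) pairs sorted from closest to farthest."""
    scored = [
        (name, distance_to_reference(seq, reference))
        for name, seq in haplotypes.items()
    ]
    return sorted(scored, key=lambda pair: (pair[1], pair[0]))

# Toy data: a reference (archaic) haplotype and three modern haplotypes.
reference = "11011"
modern = {"TIB_1": "11011", "CHB_1": "11000", "YRI_1": "00100"}
```

In this toy example the Tibetan-labelled haplotype is identical to the reference, while the outgroup haplotype differs at every site, which is the kind of ordering that a heatmap sorted by distance to the archaic sequence makes visible.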

Acknowledgements

We acknowledge support from the Fondazione Cassa di Risparmio in Bologna through the project 'Genetic adaptation and acclimatization to high altitude as experimental models to investigate the biological mechanisms that regulate physiological responses to hypoxia', which was granted to MS (n. 2019.0552).

Funding Statement

The funders had no role in study design, data collection, and interpretation, or the decision to submit the work for publication.

Contributor Information

Marco Sazzini, Email: marco.sazzini2@unibo.it.

Emilia Huerta-Sanchez, Brown University, United States.

George H Perry, Pennsylvania State University, United States.

Funding Information

This paper was supported by the following grant:

  • Fondazione Cassa di Risparmio in Bologna 2019.0552 to Marco Sazzini.

Additional information

Competing interests

No competing interests declared.

Author contributions

Data curation, Software, Formal analysis, Investigation, Writing – original draft.

Data curation, Software, Formal analysis, Investigation, Writing – original draft.

Software, Formal analysis.

Data curation, Writing – review and editing.

Data curation, Writing – review and editing.

Data curation, Writing – review and editing.

Data curation, Writing – review and editing.

Data curation, Writing – review and editing.

Data curation, Writing – review and editing.

Data curation, Writing – review and editing.

Data curation, Writing – review and editing.

Data curation, Writing – review and editing.

Data curation, Writing – review and editing.

Formal analysis, Writing – original draft.

Conceptualization, Resources, Supervision, Funding acquisition, Writing – review and editing.

Ethics

The University of Bologna Ethics Committee granted approval (Prot. 205142, 12/9/2019) for the present study within the framework of the project 'Genetic adaptation and acclimatization to high altitude as experimental models to investigate the biological mechanisms that regulate physiological responses to hypoxia'. However, no new sampling campaign was performed in the context of the present study and all the genomic data analysed were publicly available. Informed consent for the 27 Tibetan WGSs analysed here was previously obtained and declared in the Ethics statement section of the study by Jeong et al., 2018.

Additional files

Supplementary file 1. Supplementary tables 1a-1f.

(a) Populations included in the extended dataset. (b) Single-nucleotide variants (SNVs) associated with values falling in the top 5% of the distribution of the likelihood ratio (LR) statistic calculated by VolcanoFinder. (c) SNVs associated with values falling in the top 5% of the distribution of the LR statistic and comprised in the genomic region of the EPAS1 gene (i.e., 2:46474546–2:46663836). (d) SNVs associated with values falling in the top 5% of the distribution of the LR statistic and comprised in the genomic region of the EGLN1 gene (i.e., 1:231449502–1:231608033). (e) Adaptive introgressed genes confirmed by VolcanoFinder and previous studies. (f) Gene networks including Denisovan-like derived alleles identified according to the Signet approach.

elife-89815-supp1.xlsx (8.4MB, xlsx)
MDAR checklist

Data availability

The current manuscript is a computational study, so no data have been generated for this manuscript. The dataset used has been generated by Jeong et al., 2018. The code and the software used have been developed by Marnetto and Huerta‐Sánchez, 2017; Setter et al., 2020; Gouy and Excoffier, 2020; Harris and DeGiorgio, 2020.

The following previously published dataset was used:

Jeong C, Witonsky DB, Basnyat B, Neupane M, Beall CM, Childs G, Craig SR, Novembre J, Di Rienzo A. 2018. Tibetan/Sherpa Sequence Reads. NCBI BioProject. PRJNA420511

References

  1. Abdulla MA, Ahmed I, Assawamakin A, Bhak J, Brahmachari SK, Calacal GC, Chaurasia A, Chen CH, Chen J, Chen YT, Chu J, Cutiongco-de la Paz EMC, De Ungria MCA, Delfin FC, Edo J, Fuchareon S, Ghang H, Gojobori T, Han J, Ho SF, Hoh BP, Huang W, Inoko H, Jha P, Jinam TA, Jin L, Jung J, Kangwanpong D, Kampuansai J, Kennedy GC, Khurana P, Kim HL, Kim K, Kim S, Kim WY, Kimm K, Kimura R, Koike T, Kulawonganunchai S, Kumar V, Lai PS, Lee JY, Lee S, Liu ET, Majumder PP, Mandapati KK, Marzuki S, Mitchell W, Mukerji M, Naritomi K, Ngamphiw C, Niikawa N, Nishida N, Oh B, Oh S, Ohashi J, Oka A, Ong R, Padilla CD, Palittapongarnpim P, Perdigon HB, Phipps ME, Png E, Sakaki Y, Salvador JM, Sandraling Y, Scaria V, Seielstad M, Sidek MR, Sinha A, Srikummool M, Sudoyo H, Sugano S, Suryadi H, Suzuki Y, Tabbada KA, Tan A, Tokunaga K, Tongsima S, Villamor LP, Wang E, Wang Y, Wang H, Wu JY, Xiao H, Xu S, Yang JO, Shugart YY, Yoo HS, Yuan W, Zhao G, Zilfalil BA, HUGO Pan-Asian SNP Consortium. Indian Genome Variation Consortium Mapping human genetic diversity in Asia. Science. 2009;326:1541–1545. doi: 10.1126/science.1177074. [DOI] [PubMed] [Google Scholar]
  2. Alexander DH, Novembre J, Lange K. Fast model-based estimation of ancestry in unrelated individuals. Genome Research. 2009;19:1655–1664. doi: 10.1101/gr.094052.109. [DOI] [PMC free article] [PubMed] [Google Scholar]
  3. Arciero E, Kraaijenbrink T, Haber M, Mezzavilla M, Ayub Q, Wang W, Pingcuo Z, Yang H, Wang J, Jobling MA, van Driem G, Xue Y, de Knijff P, Tyler-Smith C. Demographic history and genetic adaptation in the himalayan region inferred from genome-wide snp genotypes of 49 populations. Molecular Biology and Evolution. 2018;35:1916–1933. doi: 10.1093/molbev/msy094. [DOI] [PMC free article] [PubMed] [Google Scholar]
  4. Auton A, Brooks LD, Durbin RM, Garrison EP, Kang HM, Korbel JO, Marchini JL, McCarthy S, McVean GA, Abecasis GR, 1000 Genomes Project Consortium A global reference for human genetic variation. Nature. 2015;526:68–74. doi: 10.1038/nature15393. [DOI] [PMC free article] [PubMed] [Google Scholar]
  5. Beall CM. Two routes to functional adaptation: tibetan and andean high-altitude natives. PNAS. 2007;104:8655–8660. doi: 10.1073/pnas.0701985104. [DOI] [PMC free article] [PubMed] [Google Scholar]
  6. Beall CM, Cavalleri GL, Deng L, Elston RC, Gao Y, Knight J, Li C, Li JC, Liang Y, McCormack M, Montgomery HE, Pan H, Robbins PA, Shianna KV, Tam SC, Tsering N, Veeramah KR, Wang W, Wangdui P, Weale ME, Xu Y, Xu Z, Yang L, Zaman MJ, Zeng C, Zhang L, Zhang X, Zhaxi P, Zheng YT. Natural selection on EPAS1 (HIF2alpha) associated with low hemoglobin concentration in Tibetan highlanders. PNAS. 2010;107:11459–11464. doi: 10.1073/pnas.1002443107. [DOI] [PMC free article] [PubMed] [Google Scholar]
  7. Bigham A, Bauchet M, Pinto D, Mao X, Akey JM, Mei R, Scherer SW, Julian CG, Wilson MJ, López Herráez D, Brutsaert T, Parra EJ, Moore LG, Shriver MD. Identifying signatures of natural selection in tibetan and andean populations using dense genome scan data. PLOS Genetics. 2010;6:e1001116. doi: 10.1371/journal.pgen.1001116. [DOI] [PMC free article] [PubMed] [Google Scholar]
  8. Brahimi-Horn MC, Chiche J, Pouysségur J. Hypoxia and cancer. Journal of Molecular Medicine. 2007;85:1301–1307. doi: 10.1007/s00109-007-0281-3. [DOI] [PubMed] [Google Scholar]
  9. Browning SR, Browning BL, Zhou Y, Tucci S, Akey JM. Analysis of human sequence data reveals two pulses of archaic denisovan admixture. Cell. 2018;173:53–61. doi: 10.1016/j.cell.2018.02.031. [DOI] [PMC free article] [PubMed] [Google Scholar]
  10. Calderón-Gerstein WS, Torres-Samaniego G. High altitude and cancer: an old controversy. Respiratory Physiology & Neurobiology. 2021;289:103655. doi: 10.1016/j.resp.2021.103655. [DOI] [PubMed] [Google Scholar]
  11. Catar R, Witowski J, Wagner P, Annett Schramm I, Kawka E, Philippe A, Dragun D, Jörres A. The proto-oncogene C-Fos transcriptionally regulates VEGF production during peritoneal inflammation. Kidney International. 2013;84:1119–1128. doi: 10.1038/ki.2013.217. [DOI] [PubMed] [Google Scholar]
  12. Chen X, Li X, Zhang W, He J, Xu B, Lei B, Wang Z, Cates C, Rousselle T, Li J. Activation of AMPK inhibits inflammatory response during hypoxia and reoxygenation through modulating JNK-mediated NF-κB pathway. Metabolism. 2018;83:256–270. doi: 10.1016/j.metabol.2018.03.004. [DOI] [PMC free article] [PubMed] [Google Scholar]
  13. Cho JI, Basnyat B, Jeong C, Di Rienzo A, Childs G, Craig SR, Sun J, Beall CM. Ethnically Tibetan women in Nepal with low hemoglobin concentration have better reproductive outcomes. Evolution, Medicine, and Public Health. 2017;2017:82–96. doi: 10.1093/emph/eox008. [DOI] [PMC free article] [PubMed] [Google Scholar]
  14. Cole MA, Abd Jamil AH, Heather LC, Murray AJ, Sutton ER, Slingo M, Sebag-Montefiore L, Tan SC, Aksentijević D, Gildea OS, Stuckey DJ, Yeoh KK, Carr CA, Evans RD, Aasum E, Schofield CJ, Ratcliffe PJ, Neubauer S, Robbins PA, Clarke K. On the pivotal role of PPARα in adaptation of the heart to hypoxia and why fat in the diet increases hypoxic injury. FASEB Journal. 2016;30:2684–2697. doi: 10.1096/fj.201500094R. [DOI] [PMC free article] [PubMed] [Google Scholar]
  15. Dannemann M, Racimo F. Something old, something borrowed: admixture and adaptation in human evolution. Current Opinion in Genetics and Development. 2018;53:1–8. doi: 10.1016/j.gde.2018.05.009. [DOI] [PubMed] [Google Scholar]
  16. Delaneau O, Zagury JF, Marchini J. Improved whole-chromosome phasing for disease and population genetic studies. Nature Methods. 2013;10:5–6. doi: 10.1038/nmeth.2307. [DOI] [PubMed] [Google Scholar]
  17. Deng L, Zhang C, Yuan K, Gao Y, Pan Y, Ge X, He Y, Yuan Y, Lu Y, Zhang X, Chen H, Lou H, Wang X, Lu D, Liu J, Tian L, Feng Q, Khan A, Yang Y, Jin Z-B, Yang J, Lu F, Qu J, Kang L, Su B, Xu S. Prioritizing natural-selection signals from the deep-sequencing genomic data suggests multi-variant adaptation in Tibetan highlanders. National Science Review. 2019;6:1201–1222. doi: 10.1093/nsr/nwz108. [DOI] [PMC free article] [PubMed] [Google Scholar]
  18. Dengler F. Activation of AMPK under Hypoxia: many roads leading to rome. International Journal of Molecular Sciences. 2020;21:2428. doi: 10.3390/ijms21072428. [DOI] [PMC free article] [PubMed] [Google Scholar]
  19. Enard D, Petrov DA. Evidence that RNA viruses drove adaptive introgression between neanderthals and modern humans. Cell. 2018;175:360–371. doi: 10.1016/j.cell.2018.08.034. [DOI] [PMC free article] [PubMed] [Google Scholar]
  20. Gittelman RM, Schraiber JG, Vernot B, Mikacenic C, Wurfel MM, Akey JM. Archaic hominin admixture facilitated adaptation to out-of-africa environments. Current Biology. 2016;26:3375–3382. doi: 10.1016/j.cub.2016.10.041. [DOI] [PMC free article] [PubMed] [Google Scholar]
  21. Gnecchi-Ruscone GA, Jeong C, De Fanti S, Sarno S, Trancucci M, Gentilini D, Di Blasio AM, Sherpa MG, Sherpa PT, Marinelli G, Di Marcello M, Natali L, Peluzzi D, Pettener D, Di Rienzo A, Luiselli D, Sazzini M. The genomic landscape of Nepalese Tibeto-Burmans reveals new insights into the recent peopling of Southern Himalayas. Scientific Reports. 2017;7:15512. doi: 10.1038/s41598-017-15862-z. [DOI] [PMC free article] [PubMed] [Google Scholar]
  22. Gnecchi-Ruscone GA, Abondio P, De Fanti S, Sarno S, Sherpa MG, Sherpa PT, Marinelli G, Natali L, Di Marcello M, Peluzzi D, Luiselli D, Pettener D, Sazzini M. Evidence of polygenic adaptation to high altitude from tibetan and sherpa genomes. Genome Biology and Evolution. 2018;10:2919–2930. doi: 10.1093/gbe/evy233. [DOI] [PMC free article] [PubMed] [Google Scholar]
  23. Gouy A, Excoffier L. Polygenic patterns of adaptive introgression in modern humans are mainly shaped by response to pathogens. Molecular Biology and Evolution. 2020;37:1420–1433. doi: 10.1093/molbev/msz306. [DOI] [PubMed] [Google Scholar]
  24. Gower G, Picazo PI, Fumagalli M, Racimo F. Detecting adaptive introgression in human evolution using convolutional neural networks. eLife. 2021;10:e64669. doi: 10.7554/eLife.64669. [DOI] [PMC free article] [PubMed] [Google Scholar]
  25. Green RE, Krause J, Briggs AW, Maricic T, Stenzel U, Kircher M, Patterson N, Li H, Zhai W, Fritz MHY, Hansen NF, Durand EY, Malaspinas AS, Jensen JD, Marques-Bonet T, Alkan C, Prüfer K, Meyer M, Burbano HA, Good JM, Schultz R, Aximu-Petri A, Butthof A, Höber B, Höffner B, Siegemund M, Weihmann A, Nusbaum C, Lander ES, Russ C, Novod N, Affourtit J, Egholm M, Verna C, Rudan P, Brajkovic D, Kucan Ž, Gušic I, Doronichev VB, Golovanova LV, Lalueza-Fox C, de la Rasilla M, Fortea J, Rosas A, Schmitz RW, Johnson PLF, Eichler EE, Falush D, Birney E, Mullikin JC, Slatkin M, Nielsen R, Kelso J, Lachmann M, Reich D, Pääbo S. A draft sequence of the Neandertal genome. Science. 2010;328:710–722. doi: 10.1126/science.1188021. [DOI] [PMC free article] [PubMed] [Google Scholar]
  26. Greenberger LM, Horak ID, Filpula D, Sapra P, Westergaard M, Frydenlund HF, Albaek C, Schrøder H, Ørum H. A RNA antagonist of hypoxia-inducible factor-1alpha, EZN-2968, inhibits tumor cell growth. Molecular Cancer Therapeutics. 2008;7:3598–3608. doi: 10.1158/1535-7163.MCT-08-0510. [DOI] [PubMed] [Google Scholar]
  27. Hamid I, Korunes KL, Beleza S, Goldberg A. Rapid adaptation to malaria facilitated by admixture in the human population of Cabo Verde. eLife. 2021;10:e63177. doi: 10.7554/eLife.63177. [DOI] [PMC free article] [PubMed] [Google Scholar]
  28. Harris AM, DeGiorgio M. A likelihood approach for uncovering selective sweep signatures from haplotype data. Molecular Biology and Evolution. 2020;37:3023–3046. doi: 10.1093/molbev/msaa115. [DOI] [PMC free article] [PubMed] [Google Scholar]
  29. Horscroft JA, Kotwica AO, Laner V, West JA, Hennis PJ, Levett DZH, Howard DJ, Fernandez BO, Burgess SL, Ament Z, Gilbert-Kawai ET, Vercueil A, Landis BD, Mitchell K, Mythen MG, Branco C, Johnson RS, Feelisch M, Montgomery HE, Griffin JL, Grocott MPW, Gnaiger E, Martin DS, Murray AJ. Metabolic basis to Sherpa altitude adaptation. PNAS. 2017;114:6382–6387. doi: 10.1073/pnas.1700527114. [DOI] [PMC free article] [PubMed] [Google Scholar]
  30. Hu H, Petousi N, Glusman G, Yu Y, Bohlender R, Tashi T, Downie JM, Roach JC, Cole AM, Lorenzo FR, Rogers AR, Brunkow ME, Cavalleri G, Hood L, Alpatty SM, Prchal JT, Jorde LB, Robbins PA, Simonson TS, Huff CD. Evolutionary history of Tibetans inferred from whole-genome sequencing. PLOS Genetics. 2017;13:e1006675. doi: 10.1371/journal.pgen.1006675. [DOI] [PMC free article] [PubMed] [Google Scholar]
  31. Huerta-Sánchez E, Jin X, Bianba Z, Peter BM, Vinckenbosch N, Liang Y, Yi X, He M, Somel M, Ni P, Wang B, Ou X, Luosang J, Cuo ZXP, Li K, Gao G, Yin Y, Wang W, Zhang X, Xu X, Yang H, Li Y, Wang J, Wang J, Nielsen R. Altitude adaptation in Tibetans caused by introgression of Denisovan-like DNA. Nature. 2014;512:194–197. doi: 10.1038/nature13408. [DOI] [PMC free article] [PubMed] [Google Scholar]
  32. Jeong C, Alkorta-Aranburu G, Basnyat B, Neupane M, Witonsky DB, Pritchard JK, Beall CM, Di Rienzo A. Admixture facilitates genetic adaptations to high altitude in Tibet. Nature Communications. 2014;5:1–7. doi: 10.1038/ncomms4281. [DOI] [PMC free article] [PubMed] [Google Scholar]
  33. Jeong C, Witonsky DB, Basnyat B, Neupane M, Beall CM, Childs G, Craig SR, Novembre J, Di Rienzo A. Detecting past and ongoing natural selection among ethnically Tibetan women at high altitude in Nepal. PLOS Genetics. 2018;14:e1007650. doi: 10.1371/journal.pgen.1007650. [DOI] [PMC free article] [PubMed] [Google Scholar]
  34. Kanehisa M, Goto S. KEGG: kyoto encyclopedia of genes and genomes. Nucleic Acids Research. 2000;28:27–30. doi: 10.1093/nar/28.1.27. [DOI] [PMC free article] [PubMed] [Google Scholar]
  35. Kolár F, Ostádal B. Molecular mechanisms of cardiac protection by adaptation to chronic hypoxia. Physiological Research. 2004;53:S3–S13. [PubMed] [Google Scholar]
  36. Kranenburg O, Gebbink MFBG, Voest EE. Stimulation of angiogenesis by Ras proteins. Biochimica et Biophysica Acta. 2004;1654:23–37. doi: 10.1016/j.bbcan.2003.09.004. [DOI] [PubMed] [Google Scholar]
  37. Landini A, Yu S, Gnecchi-Ruscone GA, Abondio P, Ojeda-Granados C, Sarno S, De Fanti S, Gentilini D, Di Blasio AM, Jin H, Nguyen TT, Romeo G, Prata C, Bortolini E, Luiselli D, Pettener D, Sazzini M. Genomic adaptations to cereal-based diets contribute to mitigate metabolic risk in some human populations of East Asian ancestry. Evolutionary Applications. 2021;14:297–313. doi: 10.1111/eva.13090. [DOI] [PMC free article] [PubMed] [Google Scholar]
  38. Laurent AR, J. M. J, Subhash K, Alasdair M, Klara D, Loren G, Farbod B, Baback G, Ma L, P. F. A, Joshua K, Mary C, Derek M, Raja R, Meral B, Marsh SGE, Martin M, Lisbeth AG, Sofia T, Peter P. The shaping of modern human immune systems by multiregional admixture with archaic humans. Science (New York, N.Y.) 2011;334:89–94. doi: 10.1126/science.1209202. [DOI] [PMC free article] [PubMed] [Google Scholar]
  39. Lei L, Liu M, Ma D, Lei X, Zeng S, Li P, Huang K, Lyu J, Lei Q. Cardioprotective effects of high-altitude adaptation in cardiac surgical patients: a retrospective cohort study with propensity score matching. Frontiers in Cardiovascular Medicine. 2024;11:1347552. doi: 10.3389/fcvm.2024.1347552. [DOI] [PMC free article] [PubMed] [Google Scholar]
  40. Liu C-C, Witonsky D, Gosling A, Lee JH, Ringbauer H, Hagan R, Patel N, Stahl R, Novembre J, Aldenderfer M, Warinner C, Di Rienzo A, Jeong C. Ancient genomes from the Himalayas illuminate the genetic history of Tibetans and their Tibeto-Burman speaking neighbors. Nature Communications. 2022;13:1203. doi: 10.1038/s41467-022-28827-2. [DOI] [PMC free article] [PubMed] [Google Scholar]
  41. Lorca RA, Houck JA, Laurent LC, Matarazzo CJ, Baker K, Horii M, Nelson KK, Bales ES, Euser AG, Parast MM, Moore LG, Julian CG. High altitude regulates the expression of AMPK pathways in human placenta. Placenta. 2021;104:267–276. doi: 10.1016/j.placenta.2021.01.010. [DOI] [PubMed] [Google Scholar]
  42. Lorenzo FR, Huff C, Myllymäki M, Olenchock B, Swierczek S, Tashi T, Gordeuk V, Wuren T, Ri-Li G, McClain DA, Khan TM, Koul PA, Guchhait P, Salama ME, Xing J, Semenza GL, Liberzon E, Wilson A, Simonson TS, Jorde LB, Kaelin WG, Jr, Koivunen P, Prchal JT. A genetic mechanism for Tibetan high-altitude adaptation. Nature Genetics. 2014;46:951–956. doi: 10.1038/ng.3067. [DOI] [PMC free article] [PubMed] [Google Scholar]
  43. Mallet RT, Manukhina EB, Ruelas SS, Caffrey JL, Downey HF. Cardioprotection by intermittent hypoxia conditioning: evidence, mechanisms, and therapeutic potential. American Journal of Physiology. Heart and Circulatory Physiology. 2018;315:H216–H232. doi: 10.1152/ajpheart.00060.2018. [DOI] [PMC free article] [PubMed] [Google Scholar]
  44. Marnetto D, Huerta‐Sánchez E. Haplostrips: revealing population structure through haplotype visualization. Methods in Ecology and Evolution. 2017;8:1389–1392. doi: 10.1111/2041-210X.12747. [DOI] [Google Scholar]
  45. Maxwell PH, Ratcliffe PJ. Oxygen sensors and angiogenesis. Seminars in Cell & Developmental Biology. 2002;13:29–37. doi: 10.1006/scdb.2001.0287. [DOI] [PubMed] [Google Scholar]
  46. McArthur E, Rinker DC, Capra JA. Quantifying the contribution of Neanderthal introgression to the heritability of complex traits. Nature Communications. 2021;12:4481. doi: 10.1038/s41467-021-24582-y. [DOI] [PMC free article] [PubMed] [Google Scholar]
  47. McCoy RC, Wakefield J, Akey JM. Impacts of Neanderthal-introgressed sequences on the landscape of human gene expression. Cell. 2017;168:916–927. doi: 10.1016/j.cell.2017.01.038. [DOI] [PMC free article] [PubMed] [Google Scholar]
  48. Meyer M, Kircher M, Gansauge M, Li H, Mallick S, Schraiber JG, Jay F, Prüfer K, De C, Sudmant PH, Alkan C, Fu Q, Do R, Rohland N, Siebauer M, Green RE, Bryc K, Briggs AW, Dabney J, Eichler EE. A high coverage genome sequence from an archaic Denisovan individual. Semantic Scholar; 2013. [DOI] [PMC free article] [PubMed] [Google Scholar]
  49. Monje P, Hernández-Losa J, Lyons RJ, Castellone MD, Gutkind JS. Regulation of the transcriptional activity of c-Fos by ERK. A novel role for the prolyl isomerase PIN1. The Journal of Biological Chemistry. 2005;280:35081–35084. doi: 10.1074/jbc.C500353200. [DOI] [PubMed] [Google Scholar]
  50. Ojeda-Granados C, Abondio P, Setti A, Sarno S, Gnecchi-Ruscone GA, González-Orozco E, De Fanti S, Jiménez-Kaufmann A, Rangel-Villalobos H, Moreno-Estrada A, Sazzini M. Dietary, cultural, and pathogens-related selective pressures shaped differential adaptive evolution among native mexican populations. Molecular Biology and Evolution. 2022;39:msab290. doi: 10.1093/molbev/msab290. [DOI] [PMC free article] [PubMed] [Google Scholar]
  51. Patterson N, Price AL, Reich D. Population structure and eigenanalysis. PLOS Genetics. 2006;2:e190. doi: 10.1371/journal.pgen.0020190. [DOI] [PMC free article] [PubMed] [Google Scholar]
  52. Peng Y, Yang Z, Zhang H, Cui C, Qi X, Luo X, Tao X, Wu T, Chen H, Shi H, Su B. Genetic variations in Tibetan populations and high-altitude adaptation at the Himalayas. Molecular Biology and Evolution. 2011;28:1075–1081. doi: 10.1093/molbev/msq290. [DOI] [PubMed] [Google Scholar]
  53. Porto AG, Brun F, Severini GM, Losurdo P, Fabris E, Taylor MRG, Mestroni L, Sinagra G. Clinical spectrum of PRKAG2 syndrome. Circulation. Arrhythmia and Electrophysiology. 2016;9:e003121. doi: 10.1161/CIRCEP.115.003121. [DOI] [PMC free article] [PubMed] [Google Scholar]
  54. Prüfer K, Racimo F, Patterson N, Jay F, Sankararaman S, Sawyer S, Heinze A, Renaud G, Sudmant PH, de Filippo C, Li H, Mallick S, Dannemann M, Fu Q, Kircher M, Kuhlwilm M, Lachmann M, Meyer M, Ongyerth M, Siebauer M, Theunert C, Tandon A, Moorjani P, Pickrell J, Mullikin JC, Vohr SH, Green RE, Hellmann I, Johnson PLF, Blanche H, Cann H, Kitzman JO, Shendure J, Eichler EE, Lein ES, Bakken TE, Golovanova LV, Doronichev VB, Shunkov MV, Derevianko AP, Viola B, Slatkin M, Reich D, Kelso J, Pääbo S. The complete genome sequence of a Neanderthal from the Altai Mountains. Nature. 2014;505:43–49. doi: 10.1038/nature12886. [DOI] [PMC free article] [PubMed] [Google Scholar]
  55. Purcell S, Neale B, Todd-Brown K, Thomas L, Ferreira MAR, Bender D, Maller J, Sklar P, de Bakker PIW, Daly MJ, Sham PC. PLINK: A tool set for whole-genome association and population-based linkage analyses. American Journal of Human Genetics. 2007;81:559–575. doi: 10.1086/519795. [DOI] [PMC free article] [PubMed] [Google Scholar]
  56. Racimo F, Sankararaman S, Nielsen R, Huerta-Sánchez E. Evidence for archaic adaptive introgression in humans. Nature Reviews. Genetics. 2015;16:359–371. doi: 10.1038/nrg3936. [DOI] [PMC free article] [PubMed] [Google Scholar]
  57. Racimo F, Marnetto D, Huerta-Sánchez E. Signatures of archaic adaptive introgression in present-day human populations. Molecular Biology and Evolution. 2017;34:296–317. doi: 10.1093/molbev/msw216. [DOI] [PMC free article] [PubMed] [Google Scholar]
  58. Rashid M, Zadeh LR, Baradaran B, Molavi O, Ghesmati Z, Sabzichi M, Ramezani F. Up-down regulation of HIF-1α in cancer progression. Gene. 2021;798:145796. doi: 10.1016/j.gene.2021.145796. [DOI] [PubMed] [Google Scholar]
  59. Reich D, Green RE, Kircher M, Krause J, Patterson N, Durand EY, Viola B, Briggs AW, Stenzel U, Johnson PLF, Maricic T, Good JM, Marques-Bonet T, Alkan C, Fu Q, Mallick S, Li H, Meyer M, Eichler EE, Stoneking M, Richards M, Talamo S, Shunkov MV, Derevianko AP, Hublin J-J, Kelso J, Slatkin M, Pääbo S. Genetic history of an archaic hominin group from Denisova Cave in Siberia. Nature. 2010;468:1053–1060. doi: 10.1038/nature09710. [DOI] [PMC free article] [PubMed] [Google Scholar]
  60. Sankararaman S, Mallick S, Dannemann M, Prüfer K, Kelso J, Pääbo S, Patterson N, Reich D. The genomic landscape of Neanderthal ancestry in present-day humans. Nature. 2014;507:354–357. doi: 10.1038/nature12961. [DOI] [PMC free article] [PubMed] [Google Scholar]
  61. Sayers EW, Bolton EE, Brister JR, Canese K, Chan J, Comeau DC, Connor R, Funk K, Kelly C, Kim S, Madej T, Marchler-Bauer A, Lanczycki C, Lathrop S, Lu Z, Thibaud-Nissen F, Murphy T, Phan L, Skripchenko Y, Tse T, Wang J, Williams R, Trawick BW, Pruitt KD, Sherry ST. Database resources of the national center for biotechnology information. Nucleic Acids Research. 2022;50:D20–D26. doi: 10.1093/nar/gkab1112. [DOI] [PMC free article] [PubMed] [Google Scholar]
  62. Scruggs SB, Wang D, Ping P. PRKCE gene encoding protein kinase C-epsilon-Dual roles at sarcomeres and mitochondria in cardiomyocytes. Gene. 2016;590:90–96. doi: 10.1016/j.gene.2016.06.016.
  63. Setter D, Mousset S, Cheng X, Nielsen R, DeGiorgio M, Hermisson J. VolcanoFinder: Genomic scans for adaptive introgression. PLOS Genetics. 2020;16:e1008867. doi: 10.1371/journal.pgen.1008867.
  64. Shannon P, Markiel A, Ozier O, Baliga NS, Wang JT, Ramage D, Amin N, Schwikowski B, Ideker T. Cytoscape: A software environment for integrated models of biomolecular interaction networks. Genome Research. 2003;13:2498–2504. doi: 10.1101/gr.1239303.
  65. Simonson TS, Yang Y, Huff CD, Yun H, Qin G, Witherspoon DJ, Bai Z, Lorenzo FR, Xing J, Jorde LB, Prchal JT, Ge R. Genetic evidence for high-altitude adaptation in Tibet. Science. 2010;329:72–75. doi: 10.1126/science.1189406.
  66. Simonti CN, Vernot B, Bastarache L, Bottinger E, Carrell DS, Chisholm RL, Crosslin DR, Hebbring SJ, Jarvik GP, Kullo IJ, Li R, Pathak J, Ritchie MD, Roden DM, Verma SS, Tromp G, Prato JD, Bush WS, Akey JM, Denny JC, Capra JA. The phenotypic legacy of admixture between modern humans and Neandertals. Science. 2016;351:737–741. doi: 10.1126/science.aad2149.
  67. Stevens C, Lin Y, Sanchez M, Amin E, Copson E, White H, Durston V, Eccles DM, Hupp T. A germ line mutation in the death domain of DAPK-1 inactivates ERK-induced apoptosis. The Journal of Biological Chemistry. 2007;282:13791–13803. doi: 10.1074/jbc.M605649200.
  68. Sugden AM. Earliest modern humans out of Africa. Science. 2018;359:407. doi: 10.1126/science.359.6374.407-p.
  69. Vahdati AR, Weissmann JD, Timmermann A, Ponce de León M, Zollikofer CPE. Exploring Late Pleistocene hominin dispersals, coexistence and extinction with agent-based multi-factor models. Quaternary Science Reviews. 2022;279:107391. doi: 10.1016/j.quascirev.2022.107391.
  70. Vernot B, Akey JM. Resurrecting surviving neandertal lineages from modern human genomes. Science. 2014;343:1017–1021. doi: 10.1126/science.1245938.
  71. Vichaiwong K, Purohit S, An D, Toyoda T, Jessen N, Hirshman MF, Goodyear LJ. Contraction regulates site-specific phosphorylation of TBC1D1 in skeletal muscle. The Biochemical Journal. 2010;431:311–320. doi: 10.1042/BJ20101100.
  72. Wang Y, Zou X, Wang M, Yuan D, Yang L, Zeng Y, Cheng F, Tang R, He G. The genomic history of southwestern Chinese populations demonstrated massive population migration and admixture among proto-Hmong-Mien speakers and incoming migrants. Molecular Genetics and Genomics. 2022;297:241–262. doi: 10.1007/s00438-021-01837-3.
  73. Xu S, Li S, Yang Y, Tan J, Lou H, Jin W, Yang L, Pan X, Wang J, Shen Y, Wu B, Wang H, Jin L. A genome-wide search for signals of high-altitude adaptation in Tibetans. Molecular Biology and Evolution. 2011;28:1003–1011. doi: 10.1093/molbev/msq277.
  74. Yang J, Jin ZB, Chen J, Huang XF, Li XM, Liang YB, Mao JY, Chen X, Zheng Z, Bakshi A, Zheng DD, Zheng MQ, Wray NR, Visscher PM, Lu F, Qu J. Genetic signatures of high-altitude adaptation in Tibetans. PNAS. 2017;114:4189–4194. doi: 10.1073/pnas.1617042114.
  75. Yang X-Y, Rakha A, Chen W, Hou J, Qi X-B, Shen Q-K, Dai S-S, Sulaiman X, Abdulloevich NT, Afanasevna ME, Ibrohimovich KB, Chen X, Yang W-K, Adnan A, Zhao R-H, Yao Y-G, Su B, Peng M-S, Zhang Y-P. Tracing the genetic legacy of the Tibetan empire in the Balti. Molecular Biology and Evolution. 2021;38:1529–1536. doi: 10.1093/molbev/msaa313.
  76. Yi X, Liang Y, Huerta-Sanchez E, Jin X, Cuo ZXP, Pool JE, Xu X, Jiang H, Vinckenbosch N, Korneliussen TS, Zheng H, Liu T, He W, Li K, Luo R, Nie X, Wu H, Zhao M, Cao H, Zou J, Shan Y, Li S, Yang Q, Ni P, Tian G, Xu J, Liu X, Jiang T, Wu R, Zhou G, Tang M, Qin J, Wang T, Feng S, Li G, Luosang J, Wang W, Chen F, Wang Y, Zheng X, Li Z, Bianba Z, Yang G, Wang X, Tang S, Gao G, Chen Y, Luo Z, Gusang L, Cao Z, Zhang Q, Ouyang W, Ren X, Liang H, Zheng H, Huang Y, Li J, Bolund L, Kristiansen K, Li Y, Zhang Y, Zhang X, Li R, Li S, Yang H, Nielsen R, Wang J, Wang J. Sequencing of 50 human exomes reveals adaptation to high altitude. Science. 2010;329:75–78. doi: 10.1126/science.1190371.
  77. Zhang B, Xu R, Zhang J, Zhao X, Wu H, Ma L, Hu J, Zhang J, Ye Z, Zheng X, Qin Y. Identification and functional analysis of a novel PRKAG2 mutation responsible for Chinese PRKAG2 cardiac syndrome reveal an important role of non-CBS domains in regulating the AMPK pathway. Journal of Cardiology. 2013;62:241–248. doi: 10.1016/j.jjcc.2013.04.010.
  78. Zhang C, Lu Y, Feng Q, Wang X, Lou H, Liu J, Ning Z, Yuan K, Wang Y, Zhou Y, Deng L, Liu L, Yang Y, Li S, Ma L, Zhang Z, Jin L, Su B, Kang L, Xu S. Differentiated demographic histories and local adaptations between Sherpas and Tibetans. Genome Biology. 2017;18:115. doi: 10.1186/s13059-017-1242-y.
  79. Zhang J, Xu J, Dong Y, Huang B. Down-regulation of HIF-1α inhibits the proliferation, migration, and invasion of gastric cancer by inhibiting PI3K/AKT pathway and VEGF expression. Bioscience Reports. 2018;38:BSR20180741. doi: 10.1042/BSR20180741.
  80. Zhang X, Witt KE, Bañuelos MM, Ko A, Yuan K, Xu S, Nielsen R, Huerta-Sanchez E. The history and evolution of the Denisovan-EPAS1 haplotype in Tibetans. PNAS. 2021;118:1–9. doi: 10.1073/pnas.2020803118.
  81. Zhang X, Kim B, Singh A, Sankararaman S, Durvasula A, Lohmueller KE. MaLAdapt reveals novel targets of adaptive introgression from Neanderthals and Denisovans in worldwide human populations. Molecular Biology and Evolution. 2023;40:msad001. doi: 10.1093/molbev/msad001.
  82. Zheng W, He Y, Guo Y, Yue T, Zhang H, Li J, Zhou B, Zeng X, Li L, Wang B, Cao J, Chen L, Li C, Li H, Cui C, Bai C, Qi X, Su B. Large-scale genome sequencing redefines the genetic footprints of high-altitude adaptation in Tibetans. Genome Biology. 2023;24:73. doi: 10.1186/s13059-023-02912-1.

eLife Assessment

Emilia Huerta-Sanchez 1

This study presents valuable findings on what networks of genes were impacted by introgression from Denisovans, to identify the biological functions involved in high-altitude adaptation in Tibet. This study applies solid and previously validated methodology to identify genes with signatures of both introgression and positive selection. This paper would be of interest to population geneticists, anthropologists, and scientists studying the genetic basis underlying high-altitude adaptation.

Reviewer #1 (Public review):

Anonymous

Over the last decade, numerous studies have identified adaptation signals in modern humans driven by genomic variants introgressed from archaic hominins such as Neanderthals and Denisovans. One of the most classic signals comes from a beneficial haplotype in the EPAS1 gene in Tibetans that is evidently of Denisovan origin and facilitated high altitude adaptation (HAA). Given that HAA is a complex trait with numerous underlying genetic contributions, in this paper Ferraretti et al. asked whether Denisovan introgression facilitated HAA in other ways by contributing to additional HAA-related genetic variants. Specifically, the authors considered that if such signatures exist, they are most likely only mild signals from polygenic selection, or soft sweeps on standing archaic variation, in contrast to a strong and nearly complete selection signal like that at EPAS1. They leveraged a few recently developed methods, including a composite likelihood method for detecting adaptive introgression and a biological network-based method for detecting polygenic selection, and identified compelling evidence of additional genes that exhibit a Denisovan-like adaptive introgression signature and contributed to polygenic adaptation at high altitude in Tibetans.

Strength:

The study is well motivated by an important question, which is, whether archaic introgression can drive polygenic adaptation via multiple small effect contributions in genes underlying different biological pathways regulating a complex trait (such as HAA). This is a valid question and the influence of archaic introgression on polygenic adaptation has not been thoroughly explored by previous studies.

The authors reexamined previously published high-altitude Tibetan whole genome data and detected new evidence of adaptive introgression and polygenic selection. Specifically, by applying VolcanoFinder, they confirmed previously identified adaptive introgression alleles such as EPAS1 and PPARA. By applying signet, they identified subsets of biological pathways enriched for archaic variants that contributed to HAA polygenic selection. They also leveraged additional methods such as LASSI and haplotype plotting to help confirm the signature of natural selection on their newly discovered adaptive introgression candidate genes.

Weakness:

The manuscript also improved substantially since the initial review, and the new candidate genes presented here now harbor compelling and convincing evidence of both adaptive introgression and HAA polygenic selection. There are no notable weaknesses in the revised manuscript and updated results.

Reviewer #2 (Public review):

Anonymous

Summary:

Ferraretti et al. identify adaptively introgressed genes using VolcanoFinder and then identify pathways enriched for adaptively introgressed genes. They use signet to identify pathways that are enriched for Denisovan alleles. The authors find that angiogenesis is one of the biological functions enriched for introgression.

Strengths:

Most papers that have studied the genetic basis of high altitude (HA) adaptation in Tibet have highly emphasized the role of a few genes (e.g. EPAS1, EGLN1), and in this paper the authors look for more subtle signals of selection in other genes to investigate how archaic introgression may be enriched at the pathway level. A couple of methods are used to confirm the consistency of the results.

Looking into the biological functions enriched for Denisovan introgression in Tibetans is important for characterizing the impact of Denisovan introgression in facilitating high altitude adaptations.

Weaknesses:

I thank the authors for providing an improved version of their manuscript.

eLife. 2024 Nov 8;12:RP89815. doi: 10.7554/eLife.89815.3.sa3

Author response

Giulia Ferraretti 1, Paolo Abondio 2, Marta Alberti 3, Agnese Dezi 4, Phurba T Sherpa 5, Paolo Cocco 6, Massimiliano Tiriticco 7, Marco Di Marcello 8, Guido Alberto Gnecchi-Ruscone 9, Luca Natali 10, Angela Corcelli 11, Giorgio Marinelli 12, Davide Peluzzi 13, Stefania Sarno 14, Marco Sazzini 15

The following is the authors’ response to the original reviews.

Reviewer #1 (Public Review):

Over the last decade, numerous studies have identified adaptation signals in modern humans driven by genomic variants introgressed from archaic hominins such as Neanderthals and Denisovans. One of the most classic signals comes from a beneficial haplotype in the EPAS1 gene in Tibetans that is evidently of Denisovan origin and facilitated high altitude adaptation (HAA). Given that HAA is a complex trait with numerous underlying genetic contributions, in this paper Ferraretti et al. asked whether additional HAA-related genes may also exhibit a signature of adaptive introgression. Specifically, the authors considered that if such signatures exist, they are most likely only mild signals from polygenic selection, or soft sweeps on standing archaic variation, in contrast to a strong and nearly complete selection signal like that at EPAS1. Therefore, they leveraged two methods, including a composite likelihood method for detecting adaptive introgression and a biological network-based method for detecting polygenic selection, and identified two additional genes that harbor plausible signatures of adaptive introgression for HAA.

Strengths:

The study is well motivated by an important question, which is, whether archaic introgression can drive polygenic adaptation via multiple small effect contributions in genes underlying different biological pathways regulating a complex trait (such as HAA). This is a valid question and the influence of archaic introgression on polygenic adaptation has not been thoroughly explored by previous studies.

The authors reexamined previously published high-altitude Tibetan whole genome data and applied a couple of the recently developed methods for detecting adaptive introgression and polygenic selection.

Weaknesses:

My main concern with this paper is that I am not too convinced that the reported genomic regions putatively under polygenic selection are indeed of archaic origin. Other than some straightforward population structure characterizations, the authors mainly did two analyses with regard to the identification of adaptive introgression: First, they used one composite likelihood-based method, the VolcanoFinder, to detect the plausible archaic adaptive introgression and found two candidate genes (EP300 and NOS2). Next, they attempted to validate the identified signal using another method that detects polygenic selection based on biological network enrichments for archaic variants.

In general, I don't see in the manuscript that the choice of methods here is well justified. VolcanoFinder is one among the several commonly used methods for detecting adaptive introgression (e.g., the D, RD, U, and Q statistics, genomatnn, MaLAdapt, etc.). Even if the selection was mild and incomplete, some of these other methods should be able to recapitulate and validate the results, and such validation is currently missing in this paper. Besides, some of the recent papers that studied the distribution of archaic ancestry in Tibetans don't seem to report archaic segments in the two gene regions. Altogether, these points leave me unsure about the presence of archaic introgression, in contrast to just selection on ancestral variation.

Furthermore, the authors tried to validate the results by using signet, a method that detects enrichments of alleles under selection in a set of biological networks related to the trait. However, the authors did not provide sufficient description on how they defined archaic alleles when scoring the genes in the network. In fact, reading from the method description, they seemed to only have considered alleles shared between Tibetans and Denisovans, but not necessarily exclusively shared between them. If the alleles used for scoring the networks in Signet are also found in other populations such as Han Chinese or Africans, then that would make a substantial difference in the result, leading to potential false positives.

Overall, given the evidence provided by this article, I am not sure they are adequate to suggest archaic adaptive introgression. I recommend additional analyses for the authors to consider for rigorously testing their hypothesis. Please see the details in my review to the authors.

Reviewer #2 (Public Review):

Ferraretti et al. identify adaptively introgressed genes using VolcanoFinder and then identify pathways enriched for adaptively introgressed genes. They also use signet to identify pathways that are enriched for Denisovan alleles. The authors find that angiogenesis and nitric oxide induction are enriched for archaic introgression.

Strengths:

Most papers that have studied the genetic basis of high altitude (HA) adaptation in Tibet have highly emphasized the role of a few genes (e.g., EPAS1, EGLN1), and in this paper, the authors look for more subtle signals in other genes (e.g., EP300, NOS2) to investigate how archaic introgression may be enriched at the pathway level.

Looking into the biological functions enriched for Denisovan introgression in Tibetans is important for characterizing the impact of Denisovan introgression.

Weaknesses:

The manuscript lacks details or justification about how/why some of the analyses were performed. Below are some examples where the authors could provide additional details.

The authors made specific choices in their window analysis. These choices are not justified or there is no comment as to how results might change if these choices were perturbed. For example, in the methods, the authors write "Then, the genome was divided into 200 kb windows with an overlap of 50 kb and for each of them we calculated the ratio between the number of significant SNVs and the total number of variants."

Additional information is needed for clarity. For example, "we considered only protein-protein interactions showing confidence scores ≥ 0.7 and the obtained protein frameworks were integrated using information available in the literature regarding the functional role of the related genes and their possible involvement in high-altitude adaptation." What do the confidence scores mean? Why 0.7?

In the Methods section (Identifying gene networks enriched for Denisovan-like derived alleles), the authors write "To validate VolcanoFinder results by using an independent approach". Does this mean that for signet the authors do not use the regions identified as adaptively introgressed using VolcanoFinder? I thought in the original signet paper, the authors used a summary describing the amount of introgression of a given region.

Later, the authors write "To do so, we first compared the Tibetan and Denisovan genomes to assess which SNVs were present in both modern and archaic sequences. These loci were further compared with the ancestral reconstructed reference human genome sequence (1000 Genomes Project Consortium et al., 2015) to discard those presenting an ancestral state (i.e., that we have in common with several primate species)." It is not clear why the authors are citing the 1000 Genomes Project. Are they comparing with the human reference genome or with all populations in the 1000 Genomes Project? Also, are the authors allowing derived alleles that are shared with Africans? Typically, populations from Africa are used as controls since the Denisovan introgression occurred in Eurasia.

The methods section for Figures 4B, 4C, and 4D is a little hard to understand. What is the x-axis on these plots? Is it the number of pairwise differences to Denisovan? The caption is not clear here. The authors mention that "Conversely, for non-introgressed loci (e.g., EGLN1), we might expect a remarkably different pattern of haplotypes distribution, with almost all haplotype classes presenting a larger proportion of non-Tibetan haplotypes rather than Tibetan ones." There is clearly structure in EGLN1. There is a group of non-Tibetan haplotypes that are closer to Denisovan and a group of Tibetan haplotypes that are distant from Denisovan...How do the authors interpret this?

In the original signet paper (Gouy and Excoffier 2017), they apply signet to data from Tibetans. Zhang et al. PNAS (2021) also applied it to Tibetans. It would be helpful to highlight how the approach here is different.

We thank the Reviewers for having appreciated the rationale of our study and for having identified potential issues that deserved to be addressed in order to better focus on robust results specifically supported by multiple approaches.

First, we agree with the Reviewers that the methodologies adopted in the present study required more clarification and justification than was provided in the original version of the manuscript, so as to make them intelligible to a broad range of scientists. As reported thoroughly in the revised version of the text, the VolcanoFinder algorithm, which we used as the primary method to discover new candidate genomic regions affected by adaptive introgression, was chosen among several approaches developed to detect signatures ascribable to such an evolutionary process for the following reasons: (i) VolcanoFinder is one of the few methods that can jointly test for both archaic introgression and adaptive evolution (e.g., the D statistic cannot formally test for the action of natural selection, having been developed to provide genome-wide estimates of allele sharing between archaic and modern groups rather than to identify specific genomic regions enriched for introgressed alleles); (ii) the model tested by the VolcanoFinder algorithm differs remarkably from those considered by other methods typically used to test for adaptive introgression, such as the RD, U and Q statistics, which are aimed at identifying chromosomal segments that show low divergence with respect to a specific archaic sequence and/or are enriched in alleles uniquely shared between the admixed group and the source population and present at a frequency above a certain threshold in the population under study; these statistics are thus especially useful for testing an evolutionary scenario in which adaptation was mediated by strong selective sweeps rather than by weak polygenic mechanisms (see answer to comment #1 of Reviewer #1 for further details); (iii) VolcanoFinder is less computationally demanding than other algorithms, such as genomatnn and MaLAdapt, which also need to be trained on large genomic simulations built specifically to reflect the evolutionary history of the population under study, thus increasing the possibility of introducing bias in the obtained results if the information that guides the simulations is not accurate.

Despite that, we agree with Reviewer #2 that some criteria formerly implemented during the filtering of VolcanoFinder results (e.g., normalization of LR scores, use of a sliding windows approach, and implementation of enrichment analysis based on specific confidence scores) might introduce erratic changes, which depend on the thresholds adopted, in the list of the genomic regions considered as the most likely candidates to have experienced adaptive introgression. To avoid this issue, and to adhere more strictly to the VolcanoFinder pipeline of analyses developed by Setter et al. 2020, in the revised version of the manuscript we have opted to use raw LR scores and to shortlist the most significant results by focusing on loci showing values falling in the top 5% of the genomic distribution obtained for such a statistic (see Materials and methods for details).
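
The shortlisting of loci in the top 5% of the genome-wide LR distribution can be sketched as follows. This is a minimal illustration and not the authors' pipeline: the function name `top_lr_sites`, the dictionary layout, and the toy LR values are assumptions introduced here, and ties at the cutoff are simply all retained.

```python
def top_lr_sites(lr_by_site, fraction=0.05):
    """Keep the sites whose raw LR score lies in the top `fraction` of all scores.

    `lr_by_site` maps a (chromosome, position) key to a raw LR score.
    The layout is hypothetical; real VolcanoFinder output must be parsed first.
    """
    if not lr_by_site:
        return {}
    # Number of sites to retain (at least one).
    k = max(1, int(len(lr_by_site) * fraction))
    # The k-th largest score acts as the empirical cutoff.
    cutoff = sorted(lr_by_site.values(), reverse=True)[k - 1]
    return {site: lr for site, lr in lr_by_site.items() if lr >= cutoff}


# Toy input: 100 sites whose LR equals their position (made-up values).
toy_scores = {("chr1", pos): float(pos) for pos in range(1, 101)}
selected = top_lr_sites(toy_scores)
```

Because the cutoff is an empirical quantile of the observed distribution, no normalization or window size has to be chosen, which is the property the revised filtering relies on.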

Moreover, to further reduce the use of potentially arbitrary filtering thresholds, we decided not to implement functional enrichment analysis to prioritize results from the VolcanoFinder method. In this respect, although a STRING confidence score (i.e., the approximate probability that a predicted interaction exists between two proteins belonging to the same functional pathway according to information stored in the KEGG database) above 0.7 is generally considered a high confidence score (string-db.org, Szklarczyk et al. 2014), we replaced such a prioritization criterion by considering as the most robust candidates for adaptive introgression only those genomic regions that turned out to be supported by all the approaches used (i.e., VolcanoFinder, Signet, LASSI and Haplostrips analyses).

According to the Reviewers’ comments on the use of the Signet algorithm, we realized that the rationale behind such a validation approach was not well described in the original version of the manuscript. First and foremost, we would like to clarify that in the present study we did not use this method to test for the action of natural selection (as it was formerly used by Gouy et al. 2017), but specifically to identify genomic regions putatively affected by archaic introgression. For this purpose, we followed the approach described by Gouy and Excoffier 2020 by searching for significant networks of genes presenting archaic-derived variants observable in the considered Tibetan populations but not in an outgroup population of African ancestry. Accordingly, we used the Signet method as an independent approach to obtain a first validation of introgressed (but not necessarily adaptive) loci pointed out by VolcanoFinder results.

In detail, in response to the question by Reviewer #2 about which genomic regions have been considered in the Signet analysis, it is necessary to clarify that to obtain the input score associated with each gene along the genome, as required by the algorithm, we calculated average frequency values per gene by considering all the archaic-derived alleles included in the Tibetan dataset but not in the outgroup one. Therefore, we did not take into account only those loci identified as significant by VolcanoFinder analysis, but we performed an independent genome scan. Then, we crosschecked significant results from VolcanoFinder and Signet approaches and we shortlisted the genomic regions supported by both. This approach thus differs from that of Zhang et al. 2021, in which the input scores per gene were obtained by considering only those loci previously pointed out by another method as putatively introgressed. Moreover, as mentioned in the previous paragraph, our approach also differs from that implemented by Gouy et al. 2017, in which the input score assigned to each gene was represented by the variant showing the smallest P-value associated with a selection statistic, being thus informative about putative adaptive events but not introgression ones.
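
The per-gene input score described above (the average Tibetan frequency of archaic-derived alleles falling in each gene) could be computed along these lines. This is a hedged sketch: the function name `gene_scores`, the `(gene, frequency)` input layout, and the placeholder gene labels are assumptions made for illustration, and the variants are assumed to have already been filtered for archaic-derived alleles absent from the outgroup.

```python
from collections import defaultdict


def gene_scores(variants):
    """Average the Tibetan frequency of archaic-derived alleles per gene.

    `variants` is an iterable of (gene, tibetan_frequency) pairs, one per
    retained allele. Genes without any retained allele receive no score.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for gene, freq in variants:
        totals[gene] += freq
        counts[gene] += 1
    return {gene: totals[gene] / counts[gene] for gene in totals}


# Toy input with placeholder gene labels and made-up frequencies.
toy_variants = [("GENE_A", 0.2), ("GENE_A", 0.4), ("GENE_B", 0.1)]
scores = gene_scores(toy_variants)
```

Scoring every gene in this way, rather than only genes already flagged by another method, is what makes the scan independent of the VolcanoFinder results.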

However, as correctly pointed out by both Reviewers, we formerly performed Signet analysis by considering derived alleles shared between Tibetans and the Denisovan species, without filtering out those alleles that are also observed in other modern human populations. We agree with the Reviewers that this approach cannot rule out the possibility of retaining false positive results ascribable to ancestral polymorphisms rather than introgressed alleles. According to the Reviewers’ suggestion, we thus repeated the Signet analysis by removing derived alleles also observed in an outgroup population of African ancestry (i.e., Yoruba), by assuming that only Eurasian H. sapiens populations experienced Denisovan admixture. In detail, we considered only those alleles that: (i) were shared between Tibetans and Denisovan (i.e., Denisovan-like alleles); (ii) were assumed to be derived according to the comparison with the ancestral reconstructed reference human genome sequence; (iii) were completely absent (i.e., with a frequency equal to zero) in the Yoruba population sequenced by the 1000 Genomes Project. Although the comment of Reviewer #1 seems to propose the possible use of Han Chinese as a further control population, we decided not to filter out Denisovan-like derived alleles that are also present in this human group, because evidence collected so far suggests that Denisovan introgression in the gene pool of East Asian ancestors predated the split between low-altitude and high-altitude populations (Lu et al. 2016; Hu et al. 2017) and, as mentioned before, we aimed at using the Signet algorithm to validate introgression events rather than adaptive ones (see the answer to comment #6 of Reviewer #1 for further details).
Moreover, we would like to remark that we decided to maintain the Signet analysis as a validation method in the revised version of the manuscript because: (i) comments from both Reviewers converge in suggesting how to effectively improve this approach, and (ii) it represents a method that goes beyond the simple identification of single putative introgressed alleles, by instead enabling us to point out those biological functions that might have been collectively shaped by gene flow from Denisovans.
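
The three allele-level criteria listed above can be expressed as a simple filter. The sketch below is illustrative only: the `Site` record, its field names, and the toy sites are hypothetical, and real data would come from VCF files and an ancestral-allele annotation rather than hand-written dictionaries.

```python
from dataclasses import dataclass, field


@dataclass
class Site:
    """Hypothetical per-site record used only for this illustration."""
    tibetan_freqs: dict          # allele -> frequency in Tibetans
    denisovan_alleles: set       # alleles observed in the Denisovan genome
    ancestral_allele: str        # reconstructed ancestral state
    yoruba_freqs: dict = field(default_factory=dict)  # allele -> frequency in Yoruba


def denisovan_like_derived(site):
    """Return the alleles at `site` that satisfy criteria (i)-(iii)."""
    kept = []
    for allele, freq in site.tibetan_freqs.items():
        observed_in_tibetans = freq > 0
        shared_with_denisovan = allele in site.denisovan_alleles        # (i)
        derived = allele != site.ancestral_allele                       # (ii)
        absent_in_yoruba = site.yoruba_freqs.get(allele, 0.0) == 0      # (iii)
        if observed_in_tibetans and shared_with_denisovan and derived and absent_in_yoruba:
            kept.append(allele)
    return kept


# Toy sites: the first passes all criteria, the second fails (iii).
site_kept = Site({"A": 0.7, "G": 0.3}, {"G"}, "A", {"A": 1.0})
site_dropped = Site({"C": 0.5, "T": 0.5}, {"T"}, "C", {"C": 0.9, "T": 0.1})
```

Note that no Han Chinese frequency appears in the filter, mirroring the choice not to exclude alleles shared with that population.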

In addition to validating genomic regions putatively affected by archaic introgression by crosschecking results from the VolcanoFinder and Signet analyses, according to the suggestion by Reviewer #1 we implemented a further validation procedure aimed at formally testing for the adaptive evolution of the identified candidate introgressed loci. For this purpose, we applied the LASSI likelihood haplotype-based method (Harris & DeGiorgio 2020) to Tibetan whole genome data. Notably, we chose this approach mainly for the following reasons: (i) it is able to detect and distinguish genomic regions that have experienced different types of selective events (i.e., strong and weak ones); (ii) it has been demonstrated to have increased power in identifying them with respect to other selection statistics (e.g., H12 and nSL) (Harris & DeGiorgio 2020). Again, we performed an independent genome scan using the LASSI algorithm and then we crosschecked the obtained significant results with those previously supported by VolcanoFinder and Signet approaches in order to shortlist genomic regions that have plausibly experienced both archaic introgression and adaptive evolution.
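
Cross-checking independent genome scans amounts to intersecting their lists of significant loci. The following sketch shows the idea with placeholder identifiers; the function name `shared_candidates` and the toy hit lists are assumptions and do not correspond to any reported result.

```python
def shared_candidates(*scans):
    """Return the identifiers reported as significant by every scan."""
    if not scans:
        return set()
    shared = set(scans[0])
    for scan in scans[1:]:
        shared &= set(scan)  # keep only loci also present in this scan
    return shared


# Placeholder hit lists for three independent scans (made-up labels).
volcanofinder_hits = {"GENE_A", "GENE_B", "GENE_C"}
signet_hits = {"GENE_B", "GENE_C", "GENE_D"}
lassi_hits = {"GENE_C", "GENE_E"}
shortlist = shared_candidates(volcanofinder_hits, signet_hits, lassi_hits)
```

A strict intersection is conservative: a locus missed by any one method is dropped, which trades sensitivity for robustness.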

Moreover, we maintained a final validation step represented by Haplostrips analysis, which was instead specifically performed on chromosomal segments supported by results from all of the VolcanoFinder, Signet, and LASSI approaches. This enabled us to assess the similarity between Denisovan haplotypes and those observed in Tibetans (i.e., the population under study in which archaic alleles might have played an adaptive role in response to high-altitude selective pressures), Han Chinese (i.e., a sister group whose common ancestors with Tibetans have experienced Denisovan admixture, but have then evolved at low altitude), and Yoruba (i.e., an outgroup that is assumed to have not received gene flow from Denisovans).
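
The haplotype comparison underlying this step can be illustrated by counting the differences between each modern haplotype and an archaic reference haplotype, then ordering haplotypes by that distance. This is a simplified sketch in the spirit of the analysis, not the Haplostrips implementation: haplotypes are encoded as hypothetical 0/1 strings, population labels are placeholders, and no clustering is performed.

```python
def differences(haplotype, reference):
    """Count the positions at which two equal-length haplotypes differ."""
    if len(haplotype) != len(reference):
        raise ValueError("haplotypes must cover the same sites")
    return sum(a != b for a, b in zip(haplotype, reference))


def sort_by_distance(haplotypes, reference):
    """Order (population, haplotype) pairs by distance to the reference.

    Returns (distance, population, haplotype) tuples, closest first.
    """
    scored = [(differences(hap, reference), pop, hap) for pop, hap in haplotypes]
    return sorted(scored, key=lambda item: item[0])


# Toy data: 1 = derived allele, 0 = ancestral allele (made-up haplotypes).
archaic_reference = "1101"
toy_haplotypes = [("pop_C", "0010"), ("pop_A", "1101"), ("pop_B", "1001")]
ordered = sort_by_distance(toy_haplotypes, archaic_reference)
```

In such an ordering, an excess of haplotypes from the target population among those closest to the archaic reference is the pattern expected for an introgressed segment.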

In conclusion, we believe that the substantial changes incorporated in the manuscript according to the Reviewers’ suggestions strongly improved the study by enabling us to focus on more solid results than those formerly presented. Interestingly, although the single candidate loci supported by all the approaches now implemented for validating the obtained results have attained higher prioritization than the previous ones (which are supported by some but not all of the adopted methods), angiogenesis still stands out as one of the main biological functions that have been shaped by events of adaptive introgression in human groups of Tibetan ancestry. This provides new evidence for the contribution of introgressed Denisovan alleles other than the EPAS1 ones in modulating the complex adaptive responses evolved by Himalayan populations to cope with selective pressures imposed by high altitudes.

Responses to Recommendations For The Authors:

Reviewer #1:

The authors mainly relied on one method, VolcanoFinder (VF), to detect adaptive introgression signals. As one of the recently developed methods, VF indeed demonstrated statistical power at detecting mild selection on archaic variants, as well as detecting soft sweeps on standing variations. However, compared to other commonly used methods for detecting adaptive introgression, such as the U and Q stats (Racimo et al. 2017), genomatnn (Gower et al. 2021), or MaLAdapt (Zhang et al. 2023), VF doesn't seem to have better power at capturing mild and incomplete sweeps. And it makes me wonder about the justification for choosing VF over other methods here, which is not clearly explained in the manuscript. If these adaptive introgression candidates are legitimate, even if the signals are mild, at least some of the other methods should be able to recapitulate the signature (even if they don't necessarily make it through the genome-wide significance thresholds). I would be more convinced about the archaic origin of these regions if the authors could validate their reported findings using some of the aforementioned other methods.

According to the Reviewer’s suggestion, in the revised version of the manuscript we have expanded the considerations concerning the rationale that guided the choice of the adopted methods. In particular, in the Materials and methods section (see page 12) we have specified the reasons for having used the VolcanoFinder algorithm.

First, it represents one of the few approaches that rely on a model able to test jointly the occurrence of archaic introgression and the adaptive evolution of the genomic regions affected by archaic gene flow, without the need for considering the putative source of introgression. This was a relevant aspect for us, because we planned to adopt at least two main independent (and possibly quite different in terms of the underlying approaches) methods to validate the identified candidate introgressed loci, and the other algorithm we used (i.e., Signet) was explicitly based on the comparison of modern data with the archaic sequence. Accordingly, the model tested by VolcanoFinder differs from those considered by the RD, U and Q statistics. In fact, the RD statistic is aimed at identifying regions of the genome with low divergence with respect to a given archaic reference, while the U/Q statistics can detect those chromosomal segments enriched in alleles that (i) are uniquely shared between the admixed group (e.g., Tibetans) and the source population (e.g., Denisovans), and (ii) present a frequency above a specific threshold in the admixed population (Racimo et al. 2016). For instance, all the loci considered as likely involved in adaptive introgression events by Racimo et al. 2016 presented remarkable frequencies, with most of them showing values above 50%. That being so, we decided not to implement these methods because we believe that they are more suitable for the detection of adaptive introgression events involving few variants with a strong effect on the phenotype, which entail a substantial increase in frequency in the population subjected to the selective pressure (i.e., cases such as that of EPAS1), while it appears challenging to choose an arbitrary frequency threshold appropriate for the detection of weak and/or polygenic selective events.
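
The dependence of U-like statistics on a frequency threshold can be made concrete with a small sketch. The function below counts, within one window, the sites whose derived allele is fixed in the archaic genome, rare in the outgroup, and above a chosen frequency in the target population. It is an illustration written for this response-style argument, not the implementation of Racimo et al.; the function name, default thresholds, and toy frequencies are all assumptions.

```python
def u_like_count(sites, outgroup_max=0.01, target_min=0.2, archaic_freq=1.0):
    """Count sites in a window that look uniquely shared with the archaic genome.

    Each site is (outgroup_freq, target_freq, archaic_freq) for the derived
    allele. A site is counted when the allele matches the archaic frequency,
    is below `outgroup_max` in the outgroup, and exceeds `target_min` in the
    target population.
    """
    count = 0
    for outgroup_f, target_f, archaic_f in sites:
        if archaic_f == archaic_freq and outgroup_f < outgroup_max and target_f > target_min:
            count += 1
    return count


# Toy window with made-up frequencies: (outgroup, target, archaic).
toy_window = [
    (0.0, 0.6, 1.0),  # high frequency in the target: counted at any threshold used here
    (0.0, 0.1, 1.0),  # low frequency in the target: counted only with a permissive threshold
    (0.3, 0.7, 1.0),  # common in the outgroup: never counted
    (0.0, 0.5, 0.0),  # not shared with the archaic genome: never counted
]
strict = u_like_count(toy_window, target_min=0.2)
permissive = u_like_count(toy_window, target_min=0.05)
```

The second toy site shows the issue raised in the text: an introgressed allele at modest frequency, as expected under weak or polygenic selection, contributes to the statistic only if the threshold happens to be set low enough.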

As regards the possible use of the Maladapt or genomatnn approaches as validation methods, we believe that they are more computationally demanding than the Signet algorithm and, above all, they have the disadvantage of needing to be trained on simulated genomic data. This makes them more prone to bias introduced by simulations that do not carefully reflect the evolutionary history of the population under study.

Overall, we do not agree with the Reviewer’s statement that we mainly relied on a single method to detect adaptive introgression signals because, as mentioned above, the Signet algorithm was specifically used to identify genomic regions putatively affected by introgression. This method relies on assumptions very similar to those described above for the U/Q statistics (e.g., it considers alleles uniquely shared between Tibetans and Denisovans), but avoids the need to select a frequency threshold to shortlist the most likely adaptive introgressed loci. In addition, following another suggestion by the Reviewer, we have now implemented a further approach to provide evidence for the adaptive evolution of the candidate introgressed loci (see response to comment #3).

As regards the use of Signet, based on comments from both the Reviewers we realized that the rationale behind this validation approach was not well described in the original version of the manuscript. First and foremost, we would like to clarify that in the present study we did not use this method to test for the action of natural selection (as it was formerly used by Gouy et al. 2017), but specifically to identify genomic regions putatively affected by archaic introgression. For this purpose, we followed the approach described by Gouy and Excoffier (2020) by searching for significant networks of genes presenting archaic-derived variants observable in the considered Tibetan populations. We thus used the Signet method as an independent approach to obtain a first validation of VolcanoFinder results. However, following suggestions from both the Reviewers, we modified the criteria adopted to filter for archaic-derived variants by excluding those alleles in common between Denisovan and the Yoruba outgroup population (see response to comment #6 for further information regarding this aspect).

To sum up, we think that the combination of VolcanoFinder and Signet+LASSI approaches offered a good compromise between the computational effort required to shortlist the most robust candidate adaptive introgressed loci and the type of model tested (i.e., one that does not discard a priori genomic signatures ascribable to weak and/or polygenic selective events). Moreover, we would like to remark that we decided to maintain the Signet method as a validation approach in the revised version of the manuscript because: (i) comments from both the Reviewers converge in suggesting how to effectively improve this approach, and (ii) it can be used both to perform single-locus validation analysis and to search for those biological functions that have been collectively most impacted by archaic introgression, making it possible to test a more realistic approximation of the polygenic model of adaptation involving introgressed alleles. In fact, although the single candidate loci supported by all the approaches now implemented for validating the obtained results (see responses to comments #3 and #7 for further details) have attained higher prioritization than the previous ones (i.e., EP300 and NOS2, which are now supported by some but not all the adopted methods), angiogenesis still stands out as one of the main biological functions that have been shaped by events of adaptive introgression in the ancestors of Tibetan populations.

Besides, I am a little surprised to see that in Supplementary Figure 2, VF didn't seem to capture more significant LR values in the EPAS1 region (positive control of adaptive introgression) than in the negative control EGLN1 region. The author explained this as the selection on EPAS1 region is "not soft enough", which I find a bit confusing. If there is no major difference in significant values between the positive and negative controls, how would the authors be convinced the significant values they detected in their two genes are true positives? I would like to see more discussion and justification of the VF results and interpretations.

In light of this observation and of Reviewer #2’s overall comment on the procedures implemented for filtering VolcanoFinder results, we realized that both the normalization of LR scores and the use of a sliding-window approach might introduce erratic changes, which depend on the thresholds adopted, in the list of the genomic regions considered as the most likely candidates to have experienced adaptive introgression. To avoid this issue, and to adhere more strictly to the VolcanoFinder pipeline of analyses developed by Setter et al. 2020, in the revised version of the manuscript we have opted to use raw LR scores and to shortlist the most significant results by focusing on loci showing values falling in the top 5% of the genomic distribution obtained for this statistic (see Materials and methods, page 13 lines 4-16 for further details).
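The top 5% filter on raw LR scores can be sketched as follows. This is only an illustration of an empirical-quantile cutoff, not code from VolcanoFinder; the function names and the (variant ID, LR) tuple layout are assumptions made for the example.

```python
def top_fraction_threshold(lr_scores, fraction=0.05):
    """Return the smallest LR value that still falls in the top `fraction`
    of the genome-wide distribution (empirical cutoff, no normalization)."""
    ranked = sorted(lr_scores, reverse=True)
    n_keep = max(1, int(len(ranked) * fraction))
    return ranked[n_keep - 1]


def shortlist_variants(variants, fraction=0.05):
    """Keep (variant_id, lr) pairs whose raw LR score reaches the cutoff.

    Ties at the cutoff value are all retained.
    """
    threshold = top_fraction_threshold([lr for _, lr in variants], fraction)
    return [(vid, lr) for vid, lr in variants if lr >= threshold]


# Toy genome-wide scan: 100 variants with LR scores 1.0 .. 100.0.
toy_scan = [(f"snv{i}", float(i)) for i in range(1, 101)]
candidates = shortlist_variants(toy_scan)
```

Because the cutoff is taken directly from the ranked raw scores, the shortlist does not depend on a window size or on a normalization step.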

By following this approach, we indeed observed a clearer pattern than that previously described, in which the distribution of LR scores in the EPAS1 genomic region is remarkably different from that obtained for the EGLN1 gene (Figure 2 – figure supplement 1). In more detail, we identified a total of 19 EPAS1 variants showing scores within the top 5% of LR values, in contrast to only three EGLN1 SNVs. Moreover, LR values were collectively more aggregated in the EPAS1 genomic region and showed a higher average value than that observed for EGLN1. We reported LR values, as well as -log (a) scores calculated for these control genes, in Supplement tables 3 and 4.

Nevertheless, we agree with the Reviewer that the results pointed out by VolcanoFinder need to be confirmed by additional methods, which is what we have done to define both the new candidate adaptive introgressed loci and the considered positive/negative controls. In fact, validation analyses performed to confirm signatures of both archaic introgression and adaptive evolution (i.e., Signet, LASSI and Haplostrips) converged in indicating that Tibetan variability at the EGLN1 gene does not seem to have been shaped by archaic introgression events but only by the action of natural selection (see Results, page 5 lines 3-9, page 6 lines 23-25, page 7 lines 29-36; Discussion page 14 lines 33-36; Figure 2 – figure supplement 1B and Figure 4 – figure supplement 1B, 3B and 3D), in accordance with what was previously proposed (Hu et al., 2017). On the other hand, results from all validation analyses confirmed adaptive introgression signatures at the EPAS1 genomic region (see Results page 4 lines 32-37, page 5 lines 1-2 and 30-34, page 6 lines 23-29; Figure 3A, 3B and Figure 4 – figure supplement 1A, 3A and 3C).

Finally, as already reported in the former version of the manuscript, our choice of considering EPAS1 and EGLN1 respectively as positive and negative controls for adaptive introgression was guided by previous evidence suggesting these loci as targets of natural selection in high-altitude Himalayan populations (Yang et al., 2017; Liu et al., 2022), although only EPAS1 was proved to have been involved also in an adaptive introgression event (Huerta-Sanchez et al., 2014; Hu et al., 2017).

With that being said, I suggest the authors try to first validate the signal of positive selection in the two gene regions using methods such as H2/H1 (Garud et al. 2015), iHS (Voight et al. 2006) etc. that have demonstrated power and success at detecting mild sweeps and soft sweeps, regardless of if these are adaptive introgression.

Following the Reviewer’s suggestion, we validated the new candidate adaptive introgressed loci by also using a method that formally tests for the action of natural selection. In particular, we decided to use the LASSI (Likelihood-based Approach for Selective Sweep Inference) algorithm developed by Harris & DeGiorgio (2020) mainly for the following reasons: (i) it is able to identify both strong and weak genomic signatures of positive selection similarly to other approaches, but additionally it can distinguish these signals by explicitly classifying genomic windows affected by hard or soft selective sweeps; (ii) when applied to simulated data generated under different demographic models and by setting a range of different values for the parameters that describe a selective event (e.g., the time at which the beneficial mutation arose, the selection coefficient s), it has been shown to have increased power with respect to traditional selection scans, such as nSL, H2/H1 and H12 (see Harris & DeGiorgio 2020 for further details).
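For readers less familiar with the haplotype-based statistics mentioned above, the sketch below computes H1, H12 and H2/H1 (as defined by Garud et al. 2015) from the haplotypes observed in one genomic window. It is not the LASSI likelihood model; the function name and the toy windows are hypothetical and serve only to show how hard and soft sweeps leave different haplotype frequency patterns.

```python
from collections import Counter


def haplotype_homozygosity(haplotypes):
    """Compute H1, H12 and H2/H1 for the haplotypes in one window.

    H1    = sum of squared haplotype frequencies
    H12   = H1 computed after pooling the two most frequent haplotypes
    H2/H1 = (H1 minus the squared frequency of the top haplotype) / H1
    """
    n = len(haplotypes)
    freqs = sorted((c / n for c in Counter(haplotypes).values()), reverse=True)
    p1 = freqs[0]
    p2 = freqs[1] if len(freqs) > 1 else 0.0
    h1 = sum(p * p for p in freqs)
    h12 = h1 + 2 * p1 * p2
    h2 = h1 - p1 * p1
    return {"H1": h1, "H12": h12, "H2/H1": h2 / h1}


# Toy windows: one dominant haplotype (hard-sweep-like) versus
# two common haplotypes (soft-sweep-like). Labels are placeholders.
hard_like = haplotype_homozygosity(["A"] * 9 + ["B"])
soft_like = haplotype_homozygosity(["A"] * 5 + ["B"] * 4 + ["C"])
```

In the toy data both windows show elevated H12, while H2/H1 is much larger in the window with two common haplotypes, which is the contrast used to tell soft from hard sweeps.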

According to such an approach, we were able to recapitulate signatures of natural selection previously observed in Tibetans for both EPAS1 and EGLN1 (Figure 4 – figure supplement 1 and 3C – 3D). We also obtained comparable patterns for our previous candidate adaptive introgressed loci (i.e., EP300 and NOS2), as well as for the new ones that have been instead prioritized in the revised version of the manuscript according to consistent results also from VolcanoFinder, Signet and Haplostrips analyses (see Results, page 6 lines 30-35; Figure 4C, 4D, Figure 4 – figure supplement 2C and 2D).

With regard to the plausible archaic origin of the haplotypes under selection in these gene regions, my concern comes from the fact that other recent studies characterizing the archaic ancestry landscape in Tibetans and East Asians (eg. SPrime reports from Browning et al. 2018, as well as ArchaicSeeker reports from Yuan et al. 2021) didn't report archaic segments in regions overlapping with EP300 and NOS2. So how would the authors explain the discrepancy here, that adaptive introgression is detected yet there is little evidence of archaic segments in the regions?

We thank the Reviewer for the comment and the references provided. However, having read the suggested articles, it appears that neither of them analysed genomes from individuals of Tibetan ancestry. Moreover, in the study by Yuan et al. 2021 we were not able to find any table or supplementary table reporting the genomic segments showing signatures of Denisovan-like introgression in East Asian groups; only the findings of enrichment analyses performed on significant results are described for the Papuan population. In any case, as reported below in the response to comment #5, and in line with what the Reviewer observed about the original version of the manuscript, according to the additional validation analyses implemented during this revision EP300 and NOS2 received lower prioritization than other loci showing more robust signatures supporting introgression of Denisovan alleles in the gene pool of Tibetan ancestors (i.e., TBC1D1, PRKAG2, KRAS and RASGRF2). Three out of four of these genes are also in accordance with previously published results supporting introgression of Denisovan alleles in the ancestors of present-day Han Chinese (Browning et al. 2018) or directly in Tibetan genomes (Hu et al. 2017) (see Results, page 5 lines 10-21 and Supplement table 5). Nevertheless, the reason why not all the candidate adaptive introgression regions detected by our analyses are found among the results of Browning et al. 2018 may be that in Han Chinese this archaic variation could have evolved neutrally after the introgression events, thus preventing the identification of chromosomal segments enriched in putative archaic introgressed variants according to the VolcanoFinder and LASSI approaches (which also consider the impact of natural selection). In fact, the Sprime method implemented by Browning et al. 2018 focuses only on introgression events rather than on adaptive introgression ones. For instance, the Denisovan-like regions identified with Sprime in Han Chinese by that study do not include the EPAS1 region at all.

Additionally, looking at Figure 4 and Supplementary Figure 4, the authors showed haplotype comparisons between Tibetans, Denisovan, and Han Chinese for EP300 and NOS2 regions. However, in both figures, there are about equal number of Tibetans and Han Chinese that harbor the haplotype with somewhat close distance to the Denisovan genotype. And this closest haplotype is not even that similar to the Denisovan. So how would the authors rule out the possibility that instead of adaptive introgression, the selection was acting on just an ancestral modern human haplotype?

We agree with the Reviewer that, according to the analyses presented in the original version of the manuscript, the haplotype patterns observed at the EP300 and NOS2 loci by means of the Haplostrips approach cannot rule out the possibility that their adaptive evolution involved ancestral modern human haplotypes. In fact, after the modifications implemented in the adopted pipeline of analyses based on the Reviewers’ suggestions, their role in modulating complex adaptations to high altitude was also confirmed by results obtained with the LASSI algorithm (in addition to results from previous studies: Bigham et al., 2010; Zheng et al., 2017; Deng et al., 2019; X. Zhang et al., 2020), but their putative archaic origin received lower prioritization than that of other loci, as it was not confirmed by all the analyses performed.

Furthermore, I have a question about how exactly the authors scored the genes in their network analysis using Signet. The manuscript mentioned they were looking for enrichment of archaic-like derived alleles, and in the methods section, they mentioned they used SNPs that are present in both Denisovan and Tibetan genomes but are not in the chimp ancestral allele state. But are these "derived" alleles also present in Han Chinese or Africans? If so, what are the frequencies? And if the authors didn't use derived alleles exclusively shared between Tibetans and Denisovans, that may lead to false positives of the enrichment analysis, as the result would not be able to rule out the selection on ancestral modern human variation.

As mentioned in the response to comment #1, following the suggestions of both the Reviewers we have modified the criteria adopted for filtering archaic derived variants exclusively shared between Denisovans and Tibetans. In particular, we retained as input for Signet analysis only those alleles that (i) were shared between Tibetans and Denisovan (i.e., Denisovan-like alleles), (ii) were in their derived state, and (iii) were completely absent (i.e., showed a frequency equal to zero) in the Yoruba population sequenced by the 1000 Genomes Project and used here as an outgroup, under the assumption that only Eurasian H. sapiens populations experienced Denisovan admixture. We instead decided not to filter out potential Denisovan-like derived alleles that are also present in the Han Chinese population, because multiple lines of evidence agree in indicating that gene flow from Denisovans occurred in the ancestral East Asian gene pool no sooner than 48–46 thousand years ago (Teixeira et al. 2019; Zhang et al. 2021; Yuan et al. 2021), thus predating the split between low-altitude and high-altitude groups, which occurred approximately 15 thousand years ago (Lu et al. 2016; Hu et al. 2017). In fact, traces of this archaic gene flow are still detectable in the genomes of several low-altitude populations of East Asian ancestry (Yuan et al. 2021).
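The three filtering criteria listed above can be written as a short predicate. This is a minimal sketch under stated assumptions: the record layout, the field names and the toy sites are hypothetical and do not reproduce the scripts actually used to prepare the Signet input.

```python
def is_denisovan_like_derived(site):
    """Apply the three criteria to one biallelic site.

    (i)   the Denisovan allele is observed in Tibetans,
    (ii)  that allele differs from the ancestral state (i.e., it is derived),
    (iii) its frequency in the Yoruba outgroup is exactly zero.
    Han Chinese frequencies are deliberately not used as a filter.
    """
    allele = site["denisovan_allele"]
    shared_with_tibetans = allele in site["tibetan_alleles"]
    derived = allele != site["ancestral_allele"]
    absent_in_yoruba = site["yoruba_freq"].get(allele, 0.0) == 0.0
    return shared_with_tibetans and derived and absent_in_yoruba


# Toy sites (made-up identifiers and frequencies).
toy_sites = [
    {"id": "s1", "denisovan_allele": "T", "ancestral_allele": "C",
     "tibetan_alleles": {"C", "T"}, "yoruba_freq": {"C": 1.0}},
    {"id": "s2", "denisovan_allele": "G", "ancestral_allele": "A",
     "tibetan_alleles": {"A", "G"}, "yoruba_freq": {"A": 0.9, "G": 0.1}},
    {"id": "s3", "denisovan_allele": "A", "ancestral_allele": "A",
     "tibetan_alleles": {"A", "G"}, "yoruba_freq": {"A": 1.0}},
    {"id": "s4", "denisovan_allele": "C", "ancestral_allele": "T",
     "tibetan_alleles": {"T"}, "yoruba_freq": {"T": 1.0}},
]

retained = [s["id"] for s in toy_sites if is_denisovan_like_derived(s)]
```

Only the first toy site passes: the second is present in Yoruba, the third is ancestral, and the fourth is not observed in Tibetans.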

Concerning the above, I would also suggest the authors replot their Figure 4 and Figure S4 by adding the African population (eg. YRI) in the plot, and examine the genetic distance among the modern human haplotypes, in contrast to their distance to Denisovan.

Following the Reviewer’s suggestion, after having identified new candidate adaptive introgressed loci according to the revised pipeline of analyses, we ran the Haplostrips algorithm by including in the dataset 27 individuals (i.e., 54 haplotypes) from the Yoruba population sequenced by the 1000 Genomes Project (Figure 4A, 4B, Figure 4 - figure supplement 2A, 2B, 3A).

Reviewer #2:

In the methods the authors write "Since composite likelihood statistics are not associated with pvalues, we implemented multiple procedures to filter SNVs according to the significance of their LR values." What does significance mean here?

After modifications applied to the adopted pipeline of analyses according to the Reviewers’ suggestions (see responses to public reviews and to comments #1, #3, #6, #7 of Reviewer #1), new candidate adaptive introgressed loci have been identified specifically by focusing on variants showing LR values falling in the top 5% of the genomic distribution obtained for such a statistic in order to adhere more strictly to the VolcanoFinder approach developed by Setter et al. 2020. Therefore, the related sentence in the materials and methods section was modified accordingly.

Signet should be cited the first time it appears in the manuscript. The citation in the references is wrong. It lists R. Nielsen as the last author, but R. Nielsen is not an author of this paper.

We thank the Reviewer for the comment. We have now mentioned the article by Gouy and Excoffier (2020) in the Results section where the Signet algorithm was first described and we have corrected the related reference.

I could not find Figure 5 which is cited in the methods in the main text. I assume the authors mean Supplementary Figure 5, but the supplementary files have Figure 4.

We thank the Reviewer for the comment. We have checked and modified figures included in the article and in the supplementary files to fix this issue.

I didn't see a table with the genes identified as adaptatively introgressed with VolcanoFinder. This would be useful as I believe this is the first time VolcanoFinder is being used on Tibetan data?

Following the Reviewer’s suggestion, we have reported in Supplement table 2 all the variants showing LR scores falling in the top 5% of the genomic distribution obtained for this statistic, along with the associated α parameters computed by the VolcanoFinder algorithm.

It is easier for the reviewer if lines have numbers.

Following the Reviewer’s suggestion, we have included line numbers in the revised version of the manuscript.

Associated Data


    Data Citations

    1. Jeong C, Witonsky DB, Basnyat B, Neupane M, Beall CM, Childs G, Craig SR, Novembre J, Di Rienzo A. 2018. Tibetan/Sherpa Sequence Reads. NCBI BioProject. PRJNA420511

    Supplementary Materials

    Supplementary file 1. Supplementary tables 1a-1f.

    (a) Populations included in the extended dataset. (b) Single-nucleotide variants (SNVs) associated with values falling in the top 5% of the distribution of the likelihood ratio (LR) statistic calculated by VolcanoFinder. (c) SNVs associated with values falling in the top 5% of the distribution of the LR statistic and located in the genomic region of the EPAS1 gene (i.e., 2:46474546–2:46663836). (d) SNVs associated with values falling in the top 5% of the distribution of the LR statistic and located in the genomic region of the EGLN1 gene (i.e., 1:231449502–1:231608033). (e) Adaptive introgressed genes confirmed by VolcanoFinder and previous studies. (f) Gene networks including Denisovan-like derived alleles identified according to the Signet approach.

    elife-89815-supp1.xlsx (8.4MB, xlsx)
    MDAR checklist

    Data Availability Statement

    The current manuscript is a computational study, so no data have been generated for this manuscript. The dataset used has been generated by Jeong et al., 2018. The code and the software used have been developed by Marnetto and Huerta‐Sánchez, 2017; Setter et al., 2020; Gouy and Excoffier, 2020; Harris and DeGiorgio, 2020.

    The following previously published dataset was used:

    Jeong C, Witonsky DB, Basnyat B, Neupane M, Beall CM, Childs G, Craig SR, Novembre J, Di Rienzo A. 2018. Tibetan/Sherpa Sequence Reads. NCBI BioProject. PRJNA420511


    Articles from eLife are provided here courtesy of eLife Sciences Publications, Ltd
