Proposed workflow for hypoxia-related therapeutics and schematic genealogical tree
A: a proposed workflow for hypoxia-related therapeutics, starting with genetic samples and ending with candidate therapeutic targets. B: schematic genealogical tree illustrating the evolution of a non-recombining genomic fragment across three populations, one of which migrates to high altitude (HA population) and undergoes genetic adaptation, whereas the others remain at low altitude (LA and Outgroup populations). The bottom of the tree (leaves) represents individuals sampled from the current generation, whereas the upper sections reflect the past genealogy. In the HA population, hypoxia imposes positive natural selection on the beneficial allele (blue star), increasing its frequency (in the non-CMS group) at the expense of individuals carrying the maladapted allele (CMS). As long as phenotypic variation persists in the adaptive trait (e.g., Hb levels are still variable in the HA population, meaning the selective sweep is ongoing), genetic association may find variants associated with the trait. However, once the trait reaches fixation, or when effect sizes or cohorts are small, genome-wide association (GWA) is unlikely to reveal the adaptive genes. Neutrality tests can be used to pinpoint genomic regions under selection in both settings (i.e., pre- and postfixation, and given a smaller sample). These tests exploit properties of the genealogical tree. The LSBL/PBS tests approximate the length of the branch leading to the MRCA of the HA population, which is unusually long in regions under selection (see long branch with blue SNPs). Tajima's π uses the mean allelic heterogeneity, which is unusually low in regions under selection (since HA individuals are genetically similar given their relatively recent MRCA). The iHS/EHH tests use haplotype homozygosity, which is unusually high and extends over longer stretches in regions under selection (most variation in HA individuals, shown as SNPs on the path from the MRCA to the present HA individuals, is common to the entire HA sample).
Common practice is to genotype a population sample and then impute from a nearby, densely sequenced reference population (e.g., the LA population). Because imputation relies on conserved linkage disequilibrium (LD) between target and reference populations, and LD is strongly altered by selective sweeps, imputation will be inaccurate in regions evolving under strong selection. This further illustrates the importance of WGS. MRCA, most recent common ancestor.
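The branch-length and diversity statistics described in panel B can be sketched in a few lines of Python. This is a minimal illustration, not the pipeline used for the figure. It assumes the standard definitions: LSBL combines the three pairwise FST values directly, PBS does the same after the transformation T = -ln(1 - FST), and π is the mean number of pairwise differences between haplotypes. The function names, the FST values, and the toy haplotypes are hypothetical and chosen only for illustration.

```python
import math
from itertools import combinations


def fst_to_time(fst):
    """Transform a pairwise FST (0 <= fst < 1) into T = -ln(1 - FST)."""
    return -math.log(1.0 - fst)


def lsbl(fst_ha_la, fst_ha_og, fst_la_og):
    """Branch length specific to the HA population, from raw pairwise FST."""
    return (fst_ha_la + fst_ha_og - fst_la_og) / 2.0


def pbs(fst_ha_la, fst_ha_og, fst_la_og):
    """Population branch statistic for HA, from log-transformed FST."""
    t_ha_la = fst_to_time(fst_ha_la)
    t_ha_og = fst_to_time(fst_ha_og)
    t_la_og = fst_to_time(fst_la_og)
    return (t_ha_la + t_ha_og - t_la_og) / 2.0


def nucleotide_diversity(haplotypes):
    """Mean number of pairwise differences among equal-length haplotypes."""
    pairs = list(combinations(haplotypes, 2))
    total = sum(sum(a != b for a, b in zip(h1, h2)) for h1, h2 in pairs)
    return total / len(pairs)


# Hypothetical FST values for one genomic window (not real data).
swept = pbs(fst_ha_la=0.5, fst_ha_og=0.5, fst_la_og=0.1)
neutral = pbs(fst_ha_la=0.1, fst_ha_og=0.1, fst_la_og=0.1)
print(f"PBS, window under selection: {swept:.3f}")
print(f"PBS, neutral window:         {neutral:.3f}")

# Toy haplotypes (0 = ancestral, 1 = derived); the swept sample is nearly uniform.
ha_sample = ["1101", "1101", "1100"]
la_sample = ["0101", "1010", "0011"]
print(f"pi in HA: {nucleotide_diversity(ha_sample):.3f}")
print(f"pi in LA: {nucleotide_diversity(la_sample):.3f}")
```

In this toy setup the HA branch is long and HA diversity is low in the swept window, which is the pattern the caption describes. In practice these statistics are computed per window across the genome and compared against the genome-wide distribution.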