Author manuscript; available in PMC: 2021 Aug 1.
Published in final edited form as: Trends Genet. 2020 Jun 10;36(8):563–576. doi: 10.1016/j.tig.2020.05.006

In The Blood: Connecting Variant to Function In Human Hematopoiesis

Satish K Nandakumar 1,2,3,*, Xiaotian Liao 1,2,3,*, Vijay G Sankaran 1,2,3,4,Δ
PMCID: PMC7363574  NIHMSID: NIHMS1603112  PMID: 32534791

Abstract

Genome-wide association studies (GWAS) have identified thousands of genetic variants associated with a range of human diseases and traits. However, understanding the mechanisms by which these genetic variants impact associated diseases and traits, often referred to as the variant to function (V2F) problem, remains a significant hurdle. Solving the V2F challenge requires us to identify causative genetic variants, relevant cell types/ states, target genes, and mechanisms by which variants can cause diseases or alter phenotypic traits. In this review, we discuss the emerging functional approaches being applied to tackle the V2F problem for blood cell traits, illuminating how human genetic variation can impact key mechanisms in hematopoiesis, as well as highlighting future prospects for this nascent field.

Keywords: GWAS, variant to function, hematopoiesis, blood cell traits

Variant to Function: The next grand challenge in human genetics

Genome-wide association studies (GWAS, see Glossary) take advantage of human genetic variation to identify specific polymorphisms in the genome that are associated with a trait or disease of interest. The majority of variants identified by typical GWAS are common, given that these studies are better powered to detect significant effects for such variants (therefore, GWAS can be referred to as common variant association studies, or CVAS) [1, 2]. Innovation in genotyping technologies and computational analyses has led to an explosion of GWAS aimed at studying a range of human diseases and traits with increasingly larger cohort sizes (thousands to millions of individuals). As of March 2020, the National Human Genome Research Institute - European Bioinformatics Institute (NHGRI-EBI) GWAS Catalog contained 4476 published studies reporting 177,629 associations [3]. Despite these advances, our ability to obtain biological insight from these studies has remained limited, with a few exceptions [4–7]. How genetic variants contribute to a disease phenotype, often referred to as the variant to function (V2F) problem, is still unclear for the vast majority of GWAS loci. Each V2F problem can be a significant undertaking that involves a range of tailored functional approaches. These validation studies have generally lagged behind variant discovery, leaving a significant proportion of GWAS loci functionally understudied. A major limitation is that mechanistic studies have to be customized, in contrast to genetic studies that employ similar approaches irrespective of the disease being studied. A significant challenge is, therefore, to address the V2F problem in a scalable manner. In this review, we describe approaches taken to tackle the V2F problem in the context of blood cell traits, which have led to important advances in our understanding of how human hematopoiesis occurs in health and disease.

Introduction to the variant to function (V2F) problem

By performing a GWAS study, we can pinpoint regions of the human genome (genomic loci) that are associated with any particular disease or phenotype. However, the ultimate goal of GWAS is to understand the underlying mechanisms impacted by genetic variants that ultimately lead to the observed phenotype. To understand how genetic variants contribute to a biological phenotype, we need to determine the following information (Figure 1, Key Figure):

Figure 1. Post-GWAS (genome-wide association study) workflow.

The figure represents a generalized process moving from studying variant to function in human hematopoiesis. Candidate GWAS loci are first dissected by fine-mapping to identify potential causal variants, which can be further prioritized with in silico prediction, functional annotation, and experimental validation methods. Once the target cell types in which these variants act are identified, one can perform experiments to define variants such as reporter assays, enhancer silencing or disruption (variant-centric), or perform experiments to identify genes such as loss-of-function screens (gene-centric). These approaches and additional functional studies can lead to the identification of mechanisms impacted by the genetic variant.

Causal variant(s):

At each GWAS locus, there are many potential common variants (dozens to thousands) that are in linkage disequilibrium (LD) with the sentinel or tag SNP. Often, the causal variant underlying the phenotype, be it coding or noncoding, might not be directly genotyped in the study. Identifying the causal variant is a logical first step in V2F studies. Coding variants are comparatively straightforward to identify and functionally examine, as they directly impact gene function. However, a majority of GWAS variants are noncoding variants, whose effects (enhancer activity, promoter function, splicing, boundary elements, etc.) and associated target genes can be difficult to identify.

Cell type or cell state:

Genetic variants can influence the phenotype in any cell type/ state (or in multiple cell types) within the body. Identifying the relevant cell type is critical to understand function. Non-coding variants are thought to act through modulation of cell-type specific enhancer elements, and hence can have selective effects [8]. In contrast, coding variants can affect any cell type wherein the gene is expressed, although not all functions might be relevant in every cellular context [9].

Causal gene(s):

Connecting the target gene(s) to regulatory variants (that affect cell-type specific enhancers) can be challenging owing to our limited understanding of the non-coding genome and connectivity between regulatory elements and their target genes. These connections can be different across cell types, with each regulatory element potentially impacting multiple genes [4].

Cellular/ physiological function:

Identifying the physiological function impacted by a causal variant/ gene that leads to a phenotype is the final step in V2F studies, which validates the genetic association. This involves functionally testing the causal variant or gene in an appropriate and relevant cellular assay. Perturbations of causal variants/ genes in model organisms can also be valuable to study complex physiologic functions.

The path of going from variant to function is a significant challenge and currently, no straightforward/ gold-standard methods exist that enable all of this information to be obtained at once. Often, multiple approaches can be adopted in parallel to tackle each of the V2F components [10, 11]. We explore selected examples from blood cell traits that illustrate how success can be obtained in the (often circuitous) arc going from genetic discovery to functional validation.

Functional studies of hematopoietic traits

There are over ten different cell types in circulating blood with unique and key physiologic roles: red blood cells that play a central role in oxygen transport; white blood cells (granulocytes, monocytes, lymphocytes) that enable immune function; and platelets that are necessary for hemostasis. The numbers and characteristics of the individual blood cell types are tightly regulated within a narrow range, with variation between individuals mostly arising from environmental and genetic factors [12, 13]. Blood cell traits include measures of the numbers and characteristics of red blood cells (e.g. RBC count, hemoglobin concentration, mean cell volume, mean cell hemoglobin content, mean cell hemoglobin concentration, etc.), platelets (e.g. platelet count, mean platelet volume), and white blood cells (e.g. granulocyte, monocyte, and lymphocyte counts). Most of these blood cell traits can be measured as part of a complete blood count (CBC), which is frequently performed during routine clinical care, and such measurements are therefore available from large population-based studies [12, 14, 15]. However, there are other blood traits, such as the amounts of a fetal form of hemoglobin (fetal hemoglobin or HbF), that have also been studied through GWAS but are less frequently measured [16].

Given how often CBCs are performed, some of the largest GWAS have been performed on blood cell traits, with recent studies involving over 29 different traits in over 750,000 individuals [12–15]. The variation in blood cell traits is highly heritable with a narrow sense heritability that ranges between 5.8% and 28.8%, depending on the trait [14, 15]. From these studies, a large collection of conditionally-independent loci have been identified, including 8403 loci for red blood cell traits, 2825 loci for platelet traits, and 5672 loci for white blood cell traits [14, 15]. This large number of loci is valuable to understand normal variation in hematopoiesis. Moreover, it is clear that this polygenic variation has a key role in impacting the phenotypes observed in those with presumed monogenic blood disorders, such as inherited anemias and bone marrow failure syndromes [14].

Overview of hematopoiesis

Because circulating mature blood cells have a finite lifespan, they need to be constantly replenished through the process of hematopoiesis, which occurs primarily inside the bone marrow of adults (Figure 2) [17]. The hematopoietic stem cell (HSC) is the most primitive cell, capable of indefinite self-renewal and multi-lineage differentiation [18–20]. HSCs further differentiate into multipotent progenitors (MPPs) and lineage-committed progenitors with limited or absent self-renewal capacity [21–24]. One of the major strengths of the hematopoietic system is that stem/progenitor cells and mature cells can be highly enriched using antibodies followed by fluorescence-activated cell sorting (FACS), enabling prospective functional analyses. Detailed epigenetic maps and gene expression profiles are available for many components of the hematopoietic compartment [25, 26]. Key transcription factors required for various differentiation steps have been identified through studies in model organisms, such as mice and zebrafish [17]. Additionally, hematopoietic cells can be readily studied using well established in vitro assays, such as colony-forming assays (CFU-C), and in vivo assays, such as transplantation into immunocompromised mice [27]. The availability of growth factors also enables the ex vivo culture of hematopoietic cells and makes them amenable to examination using perturbation approaches such as CRISPR-Cas9 genome editing, shRNA knockdown, and lentivirus-mediated overexpression [28–30]. Given the unique accessibility of this system and the ability to manipulate it, some of the best studied examples of V2F studies derive from blood cell traits.

Figure 2. Schematic of the human hematopoietic hierarchy.

Dashed lines indicate recently discovered differentiation paths. HSC, hematopoietic stem cell; MPP, multipotent progenitor; LMPP, lymphoid-primed multipotent progenitor; CMP, common myeloid progenitor; CLP, common lymphoid progenitor; GMP, granulocyte-macrophage progenitor; MEP, megakaryocyte-erythroid progenitor; NK, natural killer cell; CD4, CD4+ T cell; CD8, CD8+ T cell; B, B cell; pDC, plasmacytoid dendritic cell; mono, monocyte; mDC, myeloid dendritic cell; gran, granulocyte; ery, erythroid; Mega, megakaryocyte. The looping arrow represents the self-renewal ability of hematopoietic stem cells. 16 blood traits that have been genetically studied are shown below the hierarchy.

Prominent insights into hematopoiesis from GWAS studies

Fetal Hemoglobin Regulation

The identification of BCL11A as a key regulator of fetal hemoglobin expression is a prime example of success derived from GWAS, a discovery that has now moved into the arena of Phase 1/2 clinical trials for curative treatment of patients with sickle cell disease and β-thalassemia (NCT03282656 [I], NCT0365567 [II]).

Hemoglobin is the main oxygen-carrying molecule present abundantly in red blood cells. The predominant form of hemoglobin expressed during gestation is fetal hemoglobin (HbF) [31]. Shortly after birth, the adult form of hemoglobin replaces HbF. There has been great interest in understanding the process of hemoglobin switching due to the long-standing observation that elevated HbF levels after infancy can ameliorate the clinical severity of both sickle cell disease and β-thalassemia [32]. In 2007–2008, GWAS of HbF variation identified 3 genomic loci: the hemoglobin subunit beta (HBB) gene locus on chromosome 11, the HBS1L-MYB intergenic region on chromosome 6, and the B-cell lymphoma/leukemia 11A (BCL11A) gene on chromosome 2 [16, 33, 34]. Although the former two loci had been previously identified, a role for BCL11A in erythroid cells was unknown at that time.

It was initially noted that BCL11A showed a varied developmental expression pattern in human erythropoiesis, such that it was well expressed at the later stages of development when the HbF-encoding genes, hemoglobin subunit gamma 1 and 2 (HBG1/2), are largely silenced, whereas BCL11A was expressed at lower levels at earlier stages of development when HbF is well expressed [35]. This suggested that BCL11A expression might silence HbF, a phenomenon directly demonstrated through RNA-interference-mediated knockdown of BCL11A that resulted in increased HbF amounts within adult erythroid cells [35]. Furthermore, the loss of BCL11A in transgenic mice containing the human β-globin locus resulted in impaired developmental HbF silencing [36]. Subsequent identification of humans with deletions or mutations in the BCL11A gene and persistent HbF levels has further verified the key in vivo role of BCL11A in hemoglobin switching [37, 38]. Recent studies have further defined the mechanisms by which BCL11A acts to silence the HbF-encoding genes, including identification of interactions with the nucleosome remodeling and histone deacetylase (NuRD) complex [39], as well as interactions with chromatin at the proximal promoters of HBG1/2 and more distal regions involved in this silencing process [40–43]. Moreover, the upstream regulators of the BCL11A gene that enable hemoglobin switching in human development, including the RNA binding protein Lin-28 homolog B (LIN28B), have recently been uncovered [44]. None of these key biological insights would have been possible without the initial GWAS for HbF.

Although the initial functional work identified a key role for BCL11A in HbF silencing, the role of putative causal variants identified through the GWAS remained unclear. The second intron of BCL11A harbored an enhancer that specifically regulates expression of BCL11A in erythroid cells, containing three DNase I hypersensitive (DHS) peaks (termed +62, +58 and +55). Two of the most significantly-associated variants, rs1427407 and rs7606173, were found to be located in DHS peaks +62 and +55, respectively [8]. Deletion of the entire 10 kb enhancer region specifically decreased BCL11A expression in erythroid cells, but not in lymphoid cells [8]. Interestingly, a CRISPR saturation mutagenesis screen identified the +58 DHS peak as harboring nearly all of the enhancer activity [45], and this is now being pursued therapeutically for sickle cell disease and β-thalassemia through a number of ongoing studies [46]. However, even though attention has focused on this therapeutically-relevant +58 DHS region, the impact of the causal variant(s) in this larger enhancer element remains to be fully defined, with further functional studies needed to dissect this. Importantly, all current therapeutic approaches that target BCL11A, including genome editing or gene therapy approaches using RNA interference, seek to mimic the effects of the genetic variants upon the gene (downregulation of BCL11A specifically in erythroid cells), emphasizing the value of the V2F arc in defining optimal therapeutic approaches for human diseases [47–49].

Erythropoiesis

Similar to HbF-associated variants, functional follow-up of variants associated with other RBC traits has resulted in unexpected insights into erythropoiesis. During human terminal erythroid differentiation, red cell precursors undergo 4–5 cell divisions during which they progressively decrease in cell size prior to exiting the cell cycle and producing enucleate reticulocytes [50]. Further insights into how this process is regulated came from genetic association studies. A genetic variant, rs9349205, was found to be associated with both size and count of red blood cells [51]. The variant was located at 6p21.1 in the cyclin D3 (CCND3) locus, ~15 kb upstream of the gene promoter, in an erythroid specific enhancer element marked by DNase I hypersensitivity and histone 3 Lys 4 monomethylation (H3K4me1) and also bound by the erythroid transcription factors GATA binding protein 1 (GATA1), TAL bHLH transcription factor 1 (TAL1), and Krüppel-like factor 1 (KLF1) [52]. This variant-containing regulatory element showed enhancer activity in reporter assays, and the major allele (linked to decreased RBC count and increased MCV) was associated with decreased enhancer activity relative to the minor allele. Chromatin conformation capture assay (3C) showed that this enhancer element interacted with the CCND3 gene, nominating it as the causal gene for most red blood cell traits. Relative to wildtype mice, CCND3 knockout mice showed an increase in red blood cell size and a decrease in RBC count, similar to the effect of the trait-associated variant. Further functional analysis of primary human and mouse erythroid cells revealed that the encoded protein, cyclin D3, served as a key regulator of the number of cell divisions during terminal erythropoiesis. Of note, a recent study used CRISPR-Cas9-facilitated homology-directed repair (HDR) to create allelic variants of rs9349205 in erythroid cell lines and confirmed that decreased expression of CCND3 was associated with the major allele [53].
In addition, recent GWAS studies have identified a secondary variant (rs112233623) 161 bp upstream of the original variant that independently impacts these same red cell traits such as red blood cell (RBC) count and mean corpuscular volume (MCV), suggesting complex genetic regulation at this locus [13].

With the ever-increasing size of GWAS for red blood cell traits, more loci have been continually identified (e.g. 8400 conditionally independent loci to date), which makes it impractical to study each locus individually and emphasizes the need to scale up these functional approaches [12, 54]. High-throughput functional assays performed in trait-relevant cell types are extremely valuable to prioritize GWAS-nominated variants or genes. For example, a high-throughput reporter assay termed a massively parallel reporter assay (MPRA) was used to screen 2756 RBC trait-associated variants for both enhancer activity of the variant-harboring regulatory element and for allelic variation in human erythroid cells [10]. This enabled important global insights: specifically, many trait-associated variants impacted the activity of regulatory elements but did not directly perturb chromatin occupancy sites of master erythroid transcription factors, such as GATA1. Moreover, interesting individual variants could also be identified through this screen, such as the variant rs737092 that was associated with RBC count. Studies of this variant-harboring regulatory element through endogenous CRISPR-Cas9 genome editing and perturbation of the target gene revealed a role for the splicing factor RNA binding motif protein 38 (RBM38) as a key regulator of alternative splicing during terminal erythropoiesis.

Platelet traits

Beyond the success of studying red blood cell traits, there are other examples of well-characterized GWAS loci relevant to other aspects of hematopoiesis. For instance, 7q22.3 was the first human locus identified to be associated with platelet size (measured by mean platelet volume or MPV) [55]. The strongest association signal within the locus was found at rs342293, located in a chromatin region that is accessible in megakaryocytes, but not in erythroid cells [56]. The variant was also located in an intergenic region neighboring 6 genes within a ~1 Mb interval, which were all highly transcribed in megakaryocytes, the precursor cell that gives rise to platelets. Expression quantitative trait locus (eQTL) analysis revealed that rs342293 specifically impacts the expression of the phosphatidylinositol-4,5-bisphosphate 3-kinase catalytic subunit gamma (PIK3CG) gene in platelets. PIK3CG encodes the γ-chain of PI3-kinase, responsible for the synthesis of phosphatidylinositol-3,4,5-trisphosphate, a key component enabling megakaryocytes to produce platelet precursors. Specifically, Pik3cg−/− mice showed differences in the expression of platelet genes, with no changes in platelet volume [56]. The putative causal variant, rs342293, disrupted binding motifs of two key megakaryocyte transcription factors, GATA1 and myelodysplasia syndrome-associated protein 1 and EVI1 complex locus / ecotropic viral integration site 1 (MECOM/EVI1); however, in gel shift assays, the binding of EVI1 alone was reduced by the variant [56]. This example illustrates how the V2F arc can provide new and unexpected insights about how key regulators of hematopoiesis might vary.

Additionally, a whole exome sequencing-based association analysis of blood cell traits identified a low-frequency synonymous coding variant in the hematopoietic transcription factor growth factor independent 1B transcriptional repressor (GFI1B) that is associated with lower platelet counts in humans [57]. The synonymous variant rs150813342 was predicted to have an effect on splicing by disrupting a putative exon splicing enhancer [57]. To directly assess the impact upon GFI1B splicing in an endogenous setting, CRISPR-Cas9 editing was applied to create several isogenic hematopoietic cell clones harboring this variant, which led to decreased formation of the GFI1B isoform containing exon 5 (long isoform), as well as the preferential formation of the isoform without exon 5 (short isoform) relative to wild type clones. In addition, the splicing defects in the edited clones resulted in severely reduced megakaryocyte differentiation with unaltered erythroid differentiation [57]. Going from the variant to hematopoietic differentiation, this series of experiments showed that the GFI1B variant restricted the formation of the long isoform that has a key role in megakaryocyte differentiation and platelet production in humans.

Neutrophil traits

Another well-studied association can be found between neutrophil count and the chemokine receptor CXCR2 in humans. Specifically, three rare missense variants (rs55799208, p.Arg153His; rs10201766, p.Arg236Cys; and rs61733609, p.Arg248Gln) in CXCR2 were identified using a rare variant association study [58]. These variants collectively appear to explain a mild reduction in neutrophil counts (1.7–3.5 × 10⁹/L) in healthy populations, and are predicted to impact protein function. Using a highly penetrant truncated CXCR2 mutant, identified in congenital neutropenia (a genetic disorder associated with a severe reduction in neutrophil counts), it was shown that the migration of neutrophils out of the bone marrow requires CXCR2. Moreover, in mouse models, neutrophils lacking Cxcr2 are retained in the bone marrow and fail to effectively mobilize to the blood [59]. However, the effects of the three rare missense variants on cellular migration and neutrophil mobilization remain to be directly tested. Of note, common variants linked to the ligand for CXCR2, the chemokine CXCL2, have also been associated with low neutrophil counts [60].

There are a few key lessons that can be gleaned from all of these successful V2F studies on blood cell traits. First, variants associated with blood cell traits predominantly impact cell-intrinsic mechanisms within hematopoietic cell types, although there might be examples of extrinsic mechanisms that remain to be elucidated. Second, prioritizing non-coding variants based on overlap with epigenetic marks, such as accessible chromatin or transcription factor binding sites, can potentially enable improved V2F studies. Third, the choice of a relevant functional assay in an appropriate cell type is crucial, as exemplified by the study of variants impacting RBC traits at select stages of erythropoiesis in vitro.

Despite this success, each of the examples provided above has had limitations in completely fulfilling the V2F scheme. One major limitation of earlier GWAS follow-up studies was that the V2F arc often failed to involve endogenous perturbation, which has now been made possible through advances in genome editing technologies and the ability to deliver these tools into primary hematopoietic cells. As we discuss in detail below, a number of challenges remain in completing the V2F arc. However, promising approaches continue to emerge in this nascent field.

Approaching the problem from multiple angles: current strategies for functional interpretation of variants

Although we have described a few select examples from hematopoietic traits, it is clear that low-throughput approaches can be limited, and there are opportunities to pursue the V2F problem using scalable and systematic approaches to identify causal variants, genes, cell-types, and functions.

Prioritization of causal genetic variants

A critical first step in the V2F arc is identifying the causal variant(s) underlying an association and eliminating other variants that might be in linkage disequilibrium with them because the variants have been co-inherited over human history. Fine-mapping methods can be important to prioritize candidate variants while minimizing the background. These approaches can be broadly categorized into three groups: statistical genetic approaches, colocalization methods using epigenomic or other annotations, and experimental approaches. Of course, combinations of these broad approaches can be particularly powerful.

Statistical fine-mapping

In order to identify putative causal variants, statistical genetic fine-mapping approaches have been developed which aim to pinpoint causal variants in trait-associated loci by modeling linkage disequilibrium structure and association strength [61, 62]. Most of these approaches are limited by the underlying assumption that only a single causal variant exists at any given locus, despite studies suggesting otherwise [63–65]. Recent innovations have enabled these methods to account for multiple putative causal variants and have been instantiated through approaches such as CaviarBF, GUESSFM, and FINEMAP [66–68]. Nevertheless, these methods have their own drawbacks. When multiple variants are present in high linkage disequilibrium (LD), for example, it is nearly impossible to statistically distinguish variants. Additionally, allowing for multiple co-existing causal variants requires testing of various combinations of putatively causal variants, a process which can be computationally intensive as increasing numbers of causal configurations are tested at each locus.
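To make the single-causal-variant assumption concrete, the following is a minimal sketch of one widely used simplification: per-variant approximate Bayes factors (Wakefield's formula) are computed from GWAS effect sizes and standard errors, normalized into posterior inclusion probabilities, and accumulated into a 95% credible set. It is not the algorithm of CaviarBF, GUESSFM, or FINEMAP, and it ignores LD between variants. The variant identifiers, effect sizes, and the prior standard deviation of 0.2 are illustrative assumptions, not values from any study discussed here.

```python
import math

def log_abf(beta, se, prior_sd=0.2):
    """Log approximate Bayes factor for association (Wakefield's formula).

    prior_sd is an assumed prior standard deviation of the true effect size.
    """
    v = se ** 2              # sampling variance of the estimate
    w = prior_sd ** 2        # prior variance of the true effect
    z = beta / se
    shrink = w / (v + w)
    return 0.5 * math.log(1.0 - shrink) + 0.5 * z * z * shrink

def posterior_inclusion_probabilities(stats, prior_sd=0.2):
    """PIPs assuming exactly one causal variant and equal priors per variant."""
    logs = {snp: log_abf(b, s, prior_sd) for snp, (b, s) in stats.items()}
    top = max(logs.values())  # subtract the maximum for numerical stability
    weights = {snp: math.exp(value - top) for snp, value in logs.items()}
    total = sum(weights.values())
    return {snp: weight / total for snp, weight in weights.items()}

def credible_set(pips, coverage=0.95):
    """Smallest set of top-ranked variants whose PIPs sum to >= coverage."""
    chosen, cumulative = [], 0.0
    for snp, pip in sorted(pips.items(), key=lambda kv: kv[1], reverse=True):
        chosen.append(snp)
        cumulative += pip
        if cumulative >= coverage:
            break
    return chosen

# Hypothetical summary statistics at one locus: variant -> (beta, standard error)
locus = {
    "rs_hypothetical_1": (0.10, 0.02),
    "rs_hypothetical_2": (0.09, 0.02),
    "rs_hypothetical_3": (0.02, 0.02),
}
pips = posterior_inclusion_probabilities(locus)
print(credible_set(pips))
```

In this toy locus, the two strongly associated variants cannot be separated by association strength alone, so both end up in the credible set, which mirrors the difficulty described above for variants in high LD.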

Functional fine-mapping guided by genome annotations

Naturally, functional annotation data, such as epigenomic information and overlap with gene bodies or distance to gene transcription start sites, is often used to further prioritize variants that have a higher probability of being the causal variants. Broadly speaking, many different types of genomic data are potentially informative, including gene expression, transcription factor binding, chromatin accessibility, DNA methylation, and histone modification data. Many international efforts have assembled large collections of epigenomic data across many cell lines and tissue types, such as the ENCODE, Roadmap Epigenomics, and FANTOM projects [69, 70]. However, co-localization approaches are limited in the assumptions about potential causal mechanisms that need to be made when using these data.

Experimental fine-mapping

The effects of genetic variants on certain molecular functions such as transcriptional enhancer activity or RNA splicing can be assessed using exogenous assays. One commonly used experimental method for assessing transcriptional regulatory activity is a reporter assay, which requires one to insert regulatory sequences surrounding candidate variants upstream of a reporter gene, such as green fluorescent protein (GFP) or luciferase, and then introduce such constructs into relevant cell types [71]. Variation in regulatory activity can then be measured by comparing RNA and DNA amounts for every construct using sequencing or other approaches. By using sequencing, significant scaling can be achieved, an approach termed a massively parallel reporter assay (MPRA) [71, 72]. For example, one study used this approach to test 32,373 variants from 3,642 cis-eQTL loci for differential allelic effect and found 842 variants with differential expression between alleles, including 53 well-annotated variants linked to diseases and traits in the literature [73]. These methods can also be used for other GWAS, as was done for red cell traits as discussed above [10]. However, such reporter-based methods face limitations, as these assays are done exogenously and most often require a specific activity to be tested, such as impacts on transcriptional regulation or splicing. Nonetheless, they are valuable assays that can enable rapid assessments of numerous alleles associated with a disease or trait of interest.
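The comparison of RNA and DNA amounts per construct can be illustrated with a small sketch. It computes a log2(RNA/DNA) activity for each replicate of a reference-allele and an alternate-allele construct and reports the difference in mean activity (the allelic skew) with a Welch-style standard error. The counts, the pseudocount, and the function names are hypothetical, and published MPRA analyses typically use more elaborate count models than this.

```python
import math
from statistics import mean, variance

def log2_activity(rna_counts, dna_counts, pseudocount=1.0):
    """Per-replicate reporter activity as log2(RNA / DNA).

    The pseudocount (an assumption here) avoids division by zero.
    """
    return [
        math.log2((rna + pseudocount) / (dna + pseudocount))
        for rna, dna in zip(rna_counts, dna_counts)
    ]

def allelic_skew(ref_activity, alt_activity):
    """Difference in mean activity (alt minus ref) and its standard error."""
    difference = mean(alt_activity) - mean(ref_activity)
    standard_error = math.sqrt(
        variance(ref_activity) / len(ref_activity)
        + variance(alt_activity) / len(alt_activity)
    )
    return difference, standard_error

# Hypothetical barcode-summed counts across three replicates
ref = log2_activity(rna_counts=[100, 110, 90], dna_counts=[100, 100, 100])
alt = log2_activity(rna_counts=[200, 220, 180], dna_counts=[100, 100, 100])

skew, se = allelic_skew(ref, alt)
print(f"allelic skew (log2): {skew:.2f} +/- {se:.2f}")
```

A positive skew indicates higher reporter activity for the alternate allele. Normalizing RNA to DNA matters because constructs are not equally represented in the plasmid pool.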

Identification of the relevant cell type

A key step in defining mechanisms of trait-associated genetic variation is the identification of relevant cell types and states. For a given disease or trait, several methods that integrate tissue-specific gene expression or epigenomic data with GWAS summary statistics to identify risk loci enrichment in specific cell types were developed recently, including methods such as SNPsea, DEPICT, RolyPoly, g-chromVAR, and CHEERS [13, 74–79]. Such frameworks allow researchers to narrow down potential cell types or states, even potentially at single-cell resolution, that might be most relevant to the disease of interest, enabling focused functional dissection.
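The underlying idea of cell-type enrichment can be sketched with a deliberately simple stand-in: a one-sided hypergeometric test asking whether trait-associated variants overlap accessible chromatin peaks of a given cell type more often than a background set of variants does. This is not how SNPsea, DEPICT, RolyPoly, g-chromVAR, or CHEERS work internally; those methods additionally account for LD, fine-mapping uncertainty, and signal strength. All counts below are invented for illustration.

```python
from math import comb

def hypergeom_upper_tail(overlap, background_total, background_in_peaks, trait_total):
    """P(X >= overlap) when trait_total variants are drawn without replacement
    from background_total variants, background_in_peaks of which lie in peaks."""
    upper = min(background_in_peaks, trait_total)
    favourable = sum(
        comb(background_in_peaks, i)
        * comb(background_total - background_in_peaks, trait_total - i)
        for i in range(overlap, upper + 1)
    )
    return favourable / comb(background_total, trait_total)

# Hypothetical counts: 1000 background variants, 100 of them in peaks of the
# cell type being tested; 20 trait-associated variants, 10 of them in peaks.
p_value = hypergeom_upper_tail(
    overlap=10, background_total=1000, background_in_peaks=100, trait_total=20
)
print(f"enrichment p-value: {p_value:.2e}")
```

Repeating the test across the peak sets of each hematopoietic population and ranking the resulting p-values gives a rough first view of which cell types deserve focused follow-up.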

Nomination of target genes

In many cases, completing the arc of perturbing a variant and observing molecular effects is challenging. Therefore, the V2F arc is often best pursued in a stepwise manner, by first connecting candidate variants to causal genes and then connecting the genes to molecular effects. Sometimes projecting a variant to a particular gene is straightforward, such as when it resides in a promoter or in a gene body, but the majority of variants implicated by GWAS reside in non-coding regions, and identifying the gene(s) that a variant perturbs is generally a non-trivial task. One intuitive approach is to identify genes near a candidate variant and assess gene expression changes in the presence of that variant to determine which, if any, might be candidate causal genes. However, this can be challenging, particularly because gene regulation can occur over long distances. Therefore, proximity ligation approaches are often employed to delineate enhancer-promoter interactions [80, 81]. For instance, a recent study has attempted to comprehensively predict enhancer-promoter interactions across 53 red cell trait-associated loci using proximity ligation approaches in human erythroid cells and found 194 candidate target genes at 48 loci [53]. Furthermore, chromatin accessibility of the variant can be correlated with gene expression to suggest likely causal genes, under the assumption that accessible chromatin of the region overlapping a variant correlates highly with the expression of genes of interest [13].
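The accessibility-expression correlation strategy can be sketched as follows: given the accessibility of the variant-overlapping region and the expression of each nearby gene measured across the same set of cell populations, candidate genes are ranked by Pearson correlation. The gene names, population order, and values are hypothetical placeholders, and a real analysis would also need to assess significance and correct for testing many gene-peak pairs.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return covariance / math.sqrt(var_x * var_y)

def rank_candidate_genes(peak_accessibility, gene_expression):
    """Rank genes by correlation between peak accessibility and expression."""
    scores = {
        gene: pearson(peak_accessibility, expression)
        for gene, expression in gene_expression.items()
    }
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

# Hypothetical measurements across five populations (e.g. HSC, CMP, GMP, MEP, ery)
accessibility = [0.1, 0.2, 0.3, 2.5, 3.0]
expression = {
    "GENE_A": [1.0, 1.2, 0.9, 6.0, 8.0],   # tracks accessibility
    "GENE_B": [5.0, 5.1, 4.9, 5.0, 5.2],   # roughly constant
    "GENE_C": [4.0, 3.5, 4.2, 0.5, 0.3],   # anti-correlated
}
ranked = rank_candidate_genes(accessibility, expression)
for gene, r in ranked:
    print(f"{gene}\t{r:+.2f}")
```

A high correlation only nominates a gene. It does not establish that the variant-harboring element regulates it, which is why such rankings are usually combined with proximity ligation or perturbation data.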

Identification of cellular function

Determining the causality of a variant ideally requires demonstrating an altered phenotype by allelic replacement in a native cellular context. Once the allele is replaced, molecular phenotypes such as gene expression can be assayed to determine whether the different alleles have different activities on the same genetic background. Allele-specific changes are now feasible with the advent of genome editing tools, such as CRISPR-Cas9, but the scale of investigation these methods can achieve remains limited, and they might not be applicable to some relevant cell types in which precise changes remain challenging, if not impossible, to introduce, even with the latest genome manipulation tools such as base editors [82]. Therefore, perturbing a region containing a putative causal variant, or an entire regulatory element, by using genome editing tools to create insertions or deletions is often done in place of precise allelic changes, because such data can at least functionally validate the role of a particular variant-harboring regulatory element.

Beyond the difficulty of perturbing a specific regulatory element or creating an allelic change, the functional assays that are relevant to a given disease or trait can be challenging to define. In some cases, relevant phenotypes might be obvious. For instance, in the context of blood cell trait variation, alterations of hematopoietic cell expansion/differentiation in culture could represent a relevant phenotype [11]. However, for other traits or diseases, this might not be so readily done, and identifying optimal functional assays remains a substantial challenge.

Scalable perturbation methods

Despite the advances discussed above in prioritizing target variants and genes for follow-up studies, investigating small numbers of individual variants will not allow us to take full advantage of the hundreds to thousands of variants identified by any GWAS to gain biological insight. Therefore, scalable approaches remain of significant value, assuming an appropriate functional readout can be defined (Figure 3). For instance, high-throughput loss-of-function (LOF) screens of putative target genes have been proposed as a complementary approach to the conventional variant-focused method, circumventing the challenge of target gene identification. In the context of red blood cell traits, we conducted a high-throughput LOF screen with multiple shRNAs targeting all nominated genes from a GWAS involving 75 loci in primary human stem and progenitor cells (HSPCs) undergoing erythroid differentiation [11]. This enabled us to identify 77 candidate genes at 38 of the original 75 loci, whose suppression either promoted or perturbed erythroid differentiation.
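The readout of a pooled shRNA screen of this kind is typically the change in abundance of each hairpin over time. The sketch below shows one simple way such data could be summarized; it is not the analysis pipeline used in [11]. The hairpin names, gene labels, read counts, and function names are hypothetical, and normalization to total reads with a median across hairpins is an assumption made for illustration.

```python
from math import log2
from statistics import median


def hairpin_log2_fold_change(initial_counts, final_counts, pseudocount=1.0):
    """Log2 fold change of each hairpin's share of reads between time points."""
    total_initial = sum(initial_counts.values())
    total_final = sum(final_counts.values())
    fold_changes = {}
    for hairpin, initial in initial_counts.items():
        start = (initial + pseudocount) / total_initial
        end = (final_counts[hairpin] + pseudocount) / total_final
        fold_changes[hairpin] = log2(end / start)
    return fold_changes


def gene_level_scores(fold_changes, hairpin_to_gene):
    """Summarize each gene by the median log2 fold change of its hairpins."""
    by_gene = {}
    for hairpin, value in fold_changes.items():
        by_gene.setdefault(hairpin_to_gene[hairpin], []).append(value)
    return {gene: median(values) for gene, values in by_gene.items()}


# Hypothetical read counts at the start and end of differentiation.
initial_counts = {"x1": 100, "x2": 100, "x3": 100, "c1": 100, "c2": 100, "c3": 100}
final_counts = {"x1": 20, "x2": 25, "x3": 30, "c1": 100, "c2": 110, "c3": 90}

# Hypothetical mapping of hairpins to a candidate gene and a control set.
hairpin_to_gene = {
    "x1": "GENE_X", "x2": "GENE_X", "x3": "GENE_X",
    "c1": "CONTROL", "c2": "CONTROL", "c3": "CONTROL",
}

fold_changes = hairpin_log2_fold_change(initial_counts, final_counts)
scores = gene_level_scores(fold_changes, hairpin_to_gene)
for gene, score in scores.items():
    print(gene, round(score, 2))
```

Requiring concordant effects from several independent hairpins per gene, as the median does here, is one way to reduce the influence of off-target effects of any single shRNA.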

Figure 3. Scalable approaches for tackling the V2F challenge.

Displayed in the figure are some high-throughput approaches used to prioritize genetic variants identified from GWAS studies. The majority of GWAS nominated variants affect regulatory elements such as enhancers, thereby affecting gene expression. Effects of genetic variants on transcriptional activity of thousands of regulatory elements can be efficiently studied using pooled reporter constructs exogenously introduced into relevant cell types. Alternatively, pooled CRISPR/Cas9 screens can be applied to endogenously perturb either the regulatory element, target gene, or genetic variant in their native chromatin state. Various functional assays, such as single-cell RNA sequencing (e.g. as depicted by a gene expression heat map), cell proliferation/survival assays, and flow cytometry, can be used as functional readouts of CRISPR-Cas9 screens. UTR, untranslated region; barcodes, unique DNA sequences that identify reporter constructs; CRISPRi, CRISPR interference; CRISPRa, CRISPR activation. The methods listed are not comprehensive, but the figure depicts a general representation of the mechanistic progression involved in the V2F arc.

Furthermore, CRISPR-based functional perturbation methods have been applied at large scale to target multiple variant-containing regulatory elements, or the candidate genes themselves, enabling investigation of these regions in their native cellular context. For example, a CRISPR-Cas9-based genome-wide pooled LOF screen was conducted to identify genes that regulate the proliferation of primary human T-cells [83]. In addition to finding key mediators of T-cell receptor (TCR) signaling pathways, the screen also identified a previously uncharacterized gene, FAM105A, that contained missense variants associated with the risk of developing allergic diseases such as asthma, hay fever, and eczema [83]. Further studies demonstrated that the FAM105A gene could mediate immunosuppressive adenosine signals in CD8+ T-cells, identifying a potential mechanism of allergic responses.

Two recent studies utilized KRAB-dCas9 screens and MPRAs to dissect thousands of non-coding variants at the TNF alpha-induced protein 3 (TNFAIP3) locus, which is associated with multiple autoimmune diseases [84, 85]. Combining KRAB-dCas9-mediated repression, to identify endogenous enhancers of TNFAIP3, with an MPRA approach, to identify elements that also show allele-specific activity exogenously in several immune cell lines, led to the prioritization of 18 variants at this locus [84]. Interestingly, applying a single MPRA approach in the most relevant cell type, primary CD4+ T-cells, led to the identification of a single variant, rs6927172, that affected disease-relevant autoimmune function [85]. This variant is located in a super-enhancer region, and CRISPR-Cas9 editing of the variant led to unrestrained T-cell activation due to decreased TNFAIP3 expression. However, these studies also illustrate the challenges in definitively identifying putative causal variants using multiple assays and across a limited range of cell states that might not faithfully mimic physiology. Another creative approach combined a pooled KRAB-dCas9 screen with single-cell RNA-sequencing to measure the effects of 5920 enhancer perturbations on single-cell transcriptomes, which resulted in the identification of gene targets for 664 enhancers [86]. Such high-throughput genome-wide mapping of enhancer function in several trait-relevant cell types might further facilitate the V2F arc, and it is likely that such screens could be performed across a range of diseases and traits, given that gene expression is a relatively universal readout.
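Allele-specific activity in an MPRA is commonly summarized by comparing the reporter RNA output of the two alleles, each normalized to the abundance of its construct in the plasmid DNA pool. The following sketch shows this calculation under simplifying assumptions; it does not reproduce the statistical models used in [84, 85]. The barcode counts, the replicate structure, and the function names are hypothetical.

```python
from math import log2
from statistics import mean


def reporter_activity(rna_count, dna_count, pseudocount=1.0):
    """Log2 ratio of RNA to DNA counts for one construct in one replicate."""
    return log2((rna_count + pseudocount) / (dna_count + pseudocount))


def allelic_skew(ref_replicates, alt_replicates):
    """Mean activity of the alternate allele minus that of the reference
    allele; each replicate is an (rna_count, dna_count) pair."""
    ref_activity = mean([reporter_activity(rna, dna) for rna, dna in ref_replicates])
    alt_activity = mean([reporter_activity(rna, dna) for rna, dna in alt_replicates])
    return alt_activity - ref_activity


# Hypothetical counts for one variant: two replicates per allele.
ref_replicates = [(400, 100), (380, 100)]
alt_replicates = [(100, 100), (110, 100)]

skew = allelic_skew(ref_replicates, alt_replicates)
print(round(skew, 2))
```

A negative skew, as in this toy example, would indicate that the alternate allele drives less reporter expression than the reference allele. In practice, significance is assessed across replicates and many barcodes per construct before a variant is prioritized.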

Concluding Remarks

Although we have provided only a few examples of V2F success stories from the study of blood cell traits, and have discussed the challenges in completing the V2F arc, it is clear that this is a field with significant potential for further advances. There will undoubtedly be key innovations in both computational and experimental tools that enable further insights into the transition from variant to function. We are excited to see how these approaches contribute to our understanding of variation in human hematopoiesis (see Outstanding Questions), and there are likely to be other cases where this can lead to more effective therapies, as discussed for BCL11A and HbF regulation.

Outstanding Questions.

What are the most effective computational and experimental methods for prioritizing variants for post-GWAS functional follow-up studies?

To what extent can V2F studies be performed in cell lines and what phenotypes are best examined in primary cells?

What are optimal model systems for studying specific blood cell traits or diseases and to what extent can model organisms be helpful in such studies?

How can we develop systematic approaches to study genetic variants that impact blood cells through cell-extrinsic mechanisms?

How can we gain insights into the functional basis by which polygenic variation impacts blood diseases? Could scalable functional assays prove valuable for understanding polygenic variation?

How can we functionally study genetic variants that predispose to blood phenotypes that require long latencies, such as clonal hematopoiesis or blood cancers?

Given the difficulty in acquiring somatic mutations in the experimental setting, should we look for intermediate phenotypes?

The coming years are likely to provide exciting opportunities for further advances in human genetics. In the context of hematopoiesis, we foresee three exciting and important areas where advances are likely to occur. First, the focus of most GWAS has been on variation in healthy populations. However, it is becoming clear that this polygenic variation can contribute to disease risk, and presumably serve as a modifier in blood disorders with a monogenic basis [14]. Elucidating the polygenic influences on blood diseases will undoubtedly improve our understanding of the pathophysiology of these disorders, and V2F studies will be key to understanding how these polygenic influences might mechanistically contribute to disease pathogenesis or progression. Second, whereas most studies have focused on variation in baseline homeostatic blood traits, there are also likely to be heritable influences on blood diseases, as exemplified by blood cancers such as myeloproliferative neoplasms, chronic lymphocytic leukemia, and multiple myeloma [87–90]. Both understanding the genetic underpinnings of this heritable risk and ultimately tackling the V2F arc in this context will be valuable. Third, it is likely that studies of additional phenotypes relevant to hematopoiesis, particularly traits that are not measured as routinely as CBCs, will be valuable. In this context, phenotype-first and genotype-first recall studies will be complementary in improving our knowledge of how human variation can alter the process of hematopoiesis.

Highlights.

Variant to Function (V2F) studies aim to decipher the mechanisms by which GWAS-nominated genetic variants contribute to specific phenotypes.

Targeted V2F studies in blood cell traits have led to novel insights into hematopoiesis in areas including hemoglobin switching, production of mature blood cells, and immune regulation.

Scalable perturbation methods are enabling global V2F studies by prioritizing variants and target genes for further experimental studies to validate the associated phenotype.

Future V2F studies might help elucidate how common genetic variants influence rare blood disorders, the risk of acquiring blood cancers, and other cryptic hematopoietic phenotypes.

Acknowledgments

We apologize for not covering many key findings in the field due to space limitations. We thank members of the Sankaran laboratory for valuable discussions. Work in our laboratory was supported by the New York Stem Cell Foundation, National Institutes of Health Grants R01 DK103794 and R01 HL146500, the MPN Research Foundation, the Leukemia & Lymphoma Society, and a gift from the Lodish Family to Boston Children’s Hospital. SKN is a Scholar of the American Society of Hematology. VGS is a New York Stem Cell Foundation-Robertson Investigator.

Glossary

Base editors

new CRISPR-Cas9 based genome editing tools that enable the direct conversion of a specific DNA base without creating double-stranded DNA breaks

Boundary element

DNA sequences that define the ends of functionally independent domains of transcriptional activity

β-thalassemia

a genetic blood disorder characterized by reduced levels of the beta-subunit of hemoglobin in developing red blood cell precursors

Chromatin conformation assay (3C)

an experimental technique that can identify long-range chromatin interactions

Common variant association study (CVAS)

GWAS that examine common genetic variants, which are generally defined as having an allele frequency of greater than 1% in the population

Coding variant

a genetic variant in the protein-coding region of a gene

Colony-forming unit cell (CFU-C)

assays used to observe the ability of hematopoietic stem and progenitor cells to form colonies in semisolid media

Complete blood count (CBC)

a blood test used to evaluate an individual’s health and detect disorders by measuring the amount and properties of circulating blood cells

CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats)-Cas9

a bacterial adaptive immune system containing a Cas9 endonuclease that has been repurposed to perform genome editing in mammalian cells

DNase I hypersensitive site (DHS)

regions of open chromatin that are sensitive to cleavage by the DNase I enzyme

Enhancer

regulatory sequences that enhance the transcription of an associated gene when bound by transcription factors

Expression quantitative trait loci (eQTL)

a genomic locus that explains variation in gene expression of nearby genes

Fluorescence-activated cell sorting (FACS)

a method of purifying populations of live cells based on the expression of a fluorescent protein or binding of fluorescently labeled antibodies

Genome-wide association study (GWAS)

a genome-wide analysis that associates genetic variants with a phenotype of interest

Gel shift assay

an affinity-based electrophoresis technique used to study protein-DNA, protein-RNA, or protein-protein interactions

Hemoglobin switching

a developmental process whereby the composition of hemoglobin alters in response to shifting needs for oxygen. This is a process regulated primarily at the level of transcription

Homology-directed repair (HDR)

a naturally occurring nucleic acid repair system initiated by the presence of double-strand DNA breaks that can be re-purposed to modify genomes

Isogenic

a population of cells that are genetically identical except at a single locus. Such cells are often derived from the same precursors and established through genome editing in a single region

KRAB-dCas9

a system where the catalytically deactivated Cas9 (dCas9) protein is fused with KRAB (Krüppel-associated box) that recruits various histone modifiers to suppress gene expression through chromatin structural changes

Lentivirus

a type of retrovirus that can infect dividing and nondividing cells. It has been used to deliver genetic information stably into the genome of the host cell

Linkage disequilibrium (LD)

the difference between the observed frequency of a particular combination of alleles at two loci and the frequency expected for random association

Massively parallel reporter assay (MPRA)

a high-throughput platform that allows for the analysis of transcriptional activities of hundreds or thousands of regulatory elements in a single experiment

Noncoding variant

a genetic variant that is located outside the protein-coding region of the human genome

Promoter

DNA sequences that define where the transcription of a gene starts

Proximity ligation approaches

a variety of techniques that rely on the cross-linking of distal chromatin regions to study long-range chromatin interactions

Variant-harboring regulatory element

regions of DNA sequences containing specific genetic variants that regulate transcription of neighboring or distal target gene(s)

Short hairpin RNA (shRNA)

a short sequence of RNA that folds into a tight hairpin structure and can be applied to silence gene expression

Sickle cell disease

a genetic blood disorder caused by a point mutation in the hemoglobin beta-subunit that results in a tendency of hemoglobin to polymerize in the deoxygenated state. This results in the deformation of red blood cells with this hemoglobin, which can block small blood vessels, causing a variety of clinical complications

Single nucleotide polymorphism (SNP)

a single nucleotide substitution at a specific position in the genome

Splicing

the process by which the introns are excised out of the primary messenger RNA transcript, and the coding regions are joined together to generate mature messenger RNA, which then serves as the template for synthesis of a specific protein

Tag SNP

a SNP that is genotyped in a GWAS

Footnotes

Conflict of Interest

The authors declare no conflict of interest.

Resources

I

This study is registered with ClinicalTrials.gov.

II

This study is registered with ClinicalTrials.gov.

References

  • 1. Claussnitzer M et al. (2020) A brief history of human disease genetics. Nature 577 (7789), 179–189.
  • 2. Visscher PM et al. (2017) 10 Years of GWAS Discovery: Biology, Function, and Translation. Am J Hum Genet 101 (1), 5–22.
  • 3. Buniello A et al. (2019) The NHGRI-EBI GWAS Catalog of published genome-wide association studies, targeted arrays and summary statistics 2019. Nucleic Acids Res 47 (D1), D1005–D1012.
  • 4. Claussnitzer M et al. (2015) FTO Obesity Variant Circuitry and Adipocyte Browning in Humans. N Engl J Med 373 (10), 895–907.
  • 5. Sankaran VG et al. (2008) Human fetal hemoglobin expression is regulated by the developmental stage-specific repressor BCL11A. Science 322 (5909), 1839–42.
  • 6. Musunuru K et al. (2010) From noncoding variant to phenotype via SORT1 at the 1p13 cholesterol locus. Nature 466 (7307), 714–9.
  • 7. Gupta RM et al. (2017) A Genetic Variant Associated with Five Vascular Diseases Is a Distal Regulator of Endothelin-1 Gene Expression. Cell 170 (3), 522–533.e15.
  • 8. Bauer DE et al. (2013) An erythroid enhancer of BCL11A subject to genetic variation determines fetal hemoglobin level. Science 342 (6155), 253–7.
  • 9. Guo MH et al. (2017) Comprehensive population-based genome sequencing provides insight into hematopoietic regulatory mechanisms. Proc Natl Acad Sci U S A 114 (3), E327–E336.
  • 10. Ulirsch JC et al. (2016) Systematic Functional Dissection of Common Genetic Variation Affecting Red Blood Cell Traits. Cell 165 (6), 1530–1545.
  • 11. Nandakumar SK et al. (2019) Gene-centric functional dissection of human genetic variation uncovers regulators of hematopoiesis. eLife 8, e44080.
  • 12. Astle WJ et al. (2016) The allelic landscape of human blood cell trait variation and links to common complex disease. Cell 167 (5), 1415–1429.e19.
  • 13. Ulirsch JC et al. (2019) Interrogation of human hematopoiesis at single-cell and single-variant resolution. Nature genetics 51 (4), 683–693.
  • 14. Vuckovic D et al. (2020) The Polygenic and Monogenic Basis of Blood Traits and Diseases. DOI: 10.1101/2020.02.02.20020065.
  • 15. Chen M-H et al. (2020) Trans-ethnic and ancestry-specific blood-cell genetics in 746,667 individuals from 5 global populations. DOI: 10.1101/2020.01.17.910497.
  • 16. Lettre G et al. (2008) DNA polymorphisms at the BCL11A, HBS1L-MYB, and beta-globin loci associate with fetal hemoglobin levels and pain crises in sickle cell disease. Proceedings of the National Academy of Sciences of the United States of America 105 (33), 11869–11874.
  • 17. Orkin SH and Zon LI (2008) Hematopoiesis: an evolving paradigm for stem cell biology. Cell 132 (4), 631–44.
  • 18. Notta F et al. (2011) Isolation of single human hematopoietic stem cells capable of long-term multilineage engraftment. Science 333 (6039), 218–21.
  • 19. Doulatov S et al. (2012) Hematopoiesis: a human perspective. Cell Stem Cell 10 (2), 120–36.
  • 20. Spangrude GJ et al. (1988) Purification and characterization of mouse hematopoietic stem cells. Science 241 (4861), 58–62.
  • 21. Akashi K et al. (2000) A clonogenic common myeloid progenitor that gives rise to all myeloid lineages. Nature 404 (6774), 193–7.
  • 22. Kondo M et al. (1997) Identification of clonogenic common lymphoid progenitors in mouse bone marrow. Cell 91 (5), 661–72.
  • 23. Manz MG et al. (2002) Prospective isolation of human clonogenic common myeloid progenitors. Proc Natl Acad Sci U S A 99 (18), 11872–7.
  • 24. Morrison SJ et al. (1997) Identification of a lineage of multipotent hematopoietic progenitors. Development 124 (10), 1929–39.
  • 25. Corces MR et al. (2016) Lineage-specific and single-cell chromatin accessibility charts human hematopoiesis and leukemia evolution. Nat Genet 48 (10), 1193–203.
  • 26. Ludwig LS et al. (2019) Transcriptional States and Chromatin Accessibility Underlying Human Erythropoiesis. Cell Rep 27 (11), 3228–3240.e7.
  • 27. Eaves CJ (2015) Hematopoietic stem cells: concepts, definitions, and the new reality.
  • 28. Metcalf D (2008) Hematopoietic cytokines. Blood 111 (2), 485–91.
  • 29. Kaushansky K (2006) Lineage-specific hematopoietic growth factors. N Engl J Med 354 (19), 2034–45.
  • 30. Giani FC et al. (2016) Targeted application of human genetic variation can improve red blood cell production from stem cells. Cell Stem Cell 18 (1), 73–78.
  • 31. Sankaran VG and Orkin SH (2013) The switch from fetal to adult hemoglobin. Cold Spring Harb Perspect Med 3 (1), a011643.
  • 32. Platt OS et al. (1991) Pain in sickle cell disease. Rates and risk factors. N Engl J Med 325 (1), 11–6.
  • 33. Uda M et al. (2008) Genome-wide association study shows BCL11A associated with persistent fetal hemoglobin and amelioration of the phenotype of beta-thalassemia. Proc Natl Acad Sci U S A 105 (5), 1620–5.
  • 34. Menzel S et al. (2007) A QTL influencing F cell production maps to a gene encoding a zinc-finger protein on chromosome 2p15. Nat Genet 39 (10), 1197–9.
  • 35. Sankaran VG et al. (2008) Human Fetal Hemoglobin Expression Is Regulated by the Developmental Stage-Specific Repressor BCL11A. Science 322 (5909), 1839–1842.
  • 36. Sankaran VG et al. (2009) Developmental and species-divergent globin switching are driven by BCL11A. Nature 460 (7259), 1093–7.
  • 37. Basak A et al. (2015) BCL11A deletions result in fetal hemoglobin persistence and neurodevelopmental alterations. J Clin Invest 125 (6), 2363–8.
  • 38. Dias C et al. (2016) BCL11A Haploinsufficiency Causes an Intellectual Disability Syndrome and Dysregulates Transcription. Am J Hum Genet 99 (2), 253–74.
  • 39. Xu J et al. (2013) Corepressor-dependent silencing of fetal hemoglobin expression by BCL11A. Proc Natl Acad Sci U S A 110 (16), 6518–23.
  • 40. Liu N et al. (2018) Direct Promoter Repression by BCL11A Controls the Fetal to Adult Hemoglobin Switch. Cell 173 (2), 430–442.e17.
  • 41. Sankaran VG et al. (2011) A functional element necessary for fetal hemoglobin silencing. The New England journal of medicine 365 (9), 807–14.
  • 42. Xu J et al. (2010) Transcriptional silencing of {gamma}-globin by BCL11A involves long-range interactions and cooperation with SOX6. Genes & development 24 (8), 783–98.
  • 43. Martyn GE et al. (2018) Natural regulatory mutations elevate the fetal globin gene via disruption of BCL11A or ZBTB7A binding. Nat Genet 50 (4), 498–503.
  • 44. Basak A et al. (2020) Control of human hemoglobin switching by LIN28B-mediated regulation of BCL11A translation. Nat Genet 52 (2), 138–145.
  • 45. Canver MC et al. (2015) BCL11A enhancer dissection by Cas9-mediated in situ saturating mutagenesis. Nature 527 (7577), 192–7.
  • 46. Wu Y et al. (2019) Highly efficient therapeutic gene editing of human hematopoietic stem cells. Nat Med 25 (5), 776–783.
  • 47. Dendrou CA et al. (2016) Resolving TYK2 locus genotype-to-phenotype differences in autoimmunity. Sci Transl Med 8 (363), 363ra149.
  • 48. Ding Q et al. (2014) Permanent alteration of PCSK9 with in vivo CRISPR-Cas9 genome editing. Circ Res 115 (5), 488–92.
  • 49. Brendel C et al. (2016) Lineage-specific BCL11A knockdown circumvents toxicities and reverses sickle phenotype. J Clin Invest 126 (10), 3868–3878.
  • 50. Hattangadi SM et al. (2011) From stem cell to red cell: regulation of erythropoiesis at multiple levels by multiple proteins, RNAs, and chromatin modifications. Blood 118 (24), 6258–68.
  • 51. Ganesh SK et al. (2009) Multiple loci influence erythrocyte phenotypes in the CHARGE Consortium. Nature genetics 41 (11), 1191–8.
  • 52. Sankaran VG et al. (2012) Cyclin D3 coordinates the cell cycle during differentiation to regulate erythrocyte size and number. Genes Dev 26 (18), 2075–87.
  • 53. Downes DJ et al. (2020) An integrated platform to systematically identify causal variants and genes for polygenic human traits. DOI: 10.1101/813618.
  • 54. van der Harst P et al. (2012) Seventy-five genetic loci influencing the human red blood cell. Nature 492 (7429), 369–75.
  • 55. Soranzo N et al. (2009) A novel variant on chromosome 7q22.3 associated with mean platelet volume, counts, and function. Blood 113 (16), 3831–7.
  • 56. Paul DS et al. (2011) Maps of open chromatin guide the functional follow-up of genome-wide association signals: application to hematological traits. PLoS Genet 7 (6), e1002139.
  • 57. Polfus LM et al. (2016) Whole-Exome Sequencing Identifies Loci Associated with Blood Cell Traits and Reveals a Role for Alternative GFI1B Splice Variants in Human Hematopoiesis. Am J Hum Genet 99 (2), 481–8.
  • 58. Auer PL et al. (2014) Rare and low-frequency coding variants in CXCR2 and other genes are associated with hematological traits. Nat Genet 46 (6), 629–34.
  • 59. Eash KJ et al. (2010) CXCR2 and CXCR4 antagonistically regulate neutrophil trafficking from murine bone marrow. J Clin Invest 120 (7), 2423–31.
  • 60. Sabeti PC et al. (2007) Genome-wide detection and characterization of positive selection in human populations. Nature 449 (7164), 913–8.
  • 61.Farh KK-H et al. (2015) Genetic and epigenetic fine mapping of causal autoimmune disease variants. Nature 518 (7539), 337–343. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 62.Wellcome Trust Case Control, C. et al. (2007) Association scan of 14,500 nonsynonymous SNPs in four diseases identifies autoimmunity variants. Nature genetics 39 (11), 1329–1337. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 63.Flister MJ et al. (2013) Identifying multiple causative genes at a single GWAS locus. Genome research 23 (12), 1996–2002. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 64.Galarneau G et al. (2010) Fine-mapping at three loci known to affect fetal hemoglobin levels explains additional genetic variation. Nature genetics 42 (12), 1049–1051. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 65.Chung CC et al. (2011) Fine mapping of a region of chromosome 11q13 reveals multiple independent loci associated with risk of prostate cancer. Human molecular genetics 20 (14), 2869–2878. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 66.Chen W et al. (2015) Fine Mapping Causal Variants with an Approximate Bayesian Method Using Marginal Test Statistics. Genetics 200 (3), 719–736. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 67.Wallace C et al. (2015) Dissection of a Complex Disease Susceptibility Region Using a Bayesian Stochastic Search Approach to Fine Mapping. PLoS genetics 11 (6), e1005272–e1005272. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 68.Benner C, Spencer CC, Havulinna AS, Salomaa V, Ripatti S, & Pirinen M (2016) FINEMAP: efficient variable selection using summary data from genome-wide association studies Bioinformatics (32(10)), 1493–1501. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 69.Consortium F et al. (2014) A promoter-level mammalian expression atlas. Nature 507 (7493), 462–470. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 70.Romanoski CE et al. (2015) Roadmap for regulation. Nature 518 (7539), 314–316. [DOI] [PubMed] [Google Scholar]
  • 71.Melnikov A et al. (2012) Systematic dissection and optimization of inducible enhancers in human cells using a massively parallel reporter assay. Nature biotechnology 30 (3), 271–277. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 72.Patwardhan RP et al. (2012) Massively parallel functional dissection of mammalian enhancers in vivo. Nat Biotechnol 30 (3), 265–70. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 73.Tewhey R et al. (2016) Direct Identification of Hundreds of Expression-Modulating Variants using a Multiplexed Reporter Assay. Cell 165 (6), 1519–1529. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 74.Soskic B et al. (2019) Chromatin activity at GWAS loci identifies T cell states driving complex immune diseases. Nature genetics 51 (10), 1486–1493. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 75.Hu X et al. (2011) Integrating Autoimmune Risk Loci with Gene-Expression Data Identifies Specific Pathogenic Immune Cell Subsets. Am J Hum Genet 89 (4), 496–506.
  • 76.Pers TH et al. (2015) Biological interpretation of genome-wide association studies using predicted gene functions. Nat Commun 6, 5890.
  • 77.Finucane HK et al. (2018) Heritability enrichment of specifically expressed genes identifies disease-relevant tissues and cell types. Nat Genet 50 (4), 621–629.
  • 78.Slowikowski K et al. (2014) SNPsea: an algorithm to identify cell types, tissues and pathways affected by risk loci. Bioinformatics 30 (17), 2496–7.
  • 79.Calderon D et al. (2017) Inferring Relevant Cell Types for Complex Traits by Using Single-Cell Gene Expression. Am J Hum Genet 101 (5), 686–699.
  • 80.Mifsud B et al. (2015) Mapping long-range promoter contacts in human cells with high-resolution capture Hi-C. Nat Genet 47 (6), 598–606.
  • 81.Kempfer R and Pombo A (2020) Methods for mapping 3D chromosome architecture. Nat Rev Genet 21 (4), 207–226.
  • 82.Rees HA and Liu DR (2018) Base editing: precision chemistry on the genome and transcriptome of living cells. Nat Rev Genet 19 (12), 770–788.
  • 83.Shifrut E et al. (2018) Genome-wide CRISPR Screens in Primary Human T Cells Reveal Key Regulators of Immune Function. Cell 175 (7), 1958–1971.e15.
  • 84.Ray JP et al. (2020) Prioritizing disease and trait causal variants at the TNFAIP3 locus using functional and genomic features. Nat Commun 11.
  • 85.Bourges C et al. (2020) Resolving mechanisms of immune-mediated disease in primary CD4 T cells. EMBO Mol Med, e12112.
  • 86.Gasperini M et al. (2019) A Genome-wide Framework for Mapping Gene Regulation via Cellular Genetic Screens. Cell 176 (1–2), 377–390.e19.
  • 87.Hinds DA et al. (2016) Germ line variants predispose to both JAK2 V617F clonal hematopoiesis and myeloproliferative neoplasms. Blood 128 (8), 1121–8.
  • 88.Speedy HE et al. (2014) A genome-wide association study identifies multiple susceptibility loci for chronic lymphocytic leukemia. Nat Genet 46 (1), 56–60.
  • 89.Yang JJ et al. (2012) Genome-wide association study identifies germline polymorphisms associated with relapse of childhood acute lymphoblastic leukemia. Blood, pp. 4197–204.
  • 90.Went M et al. (2018) Identification of multiple risk loci and regulatory mechanisms influencing susceptibility to multiple myeloma. Nat Commun 9 (1), 3707.

RESOURCES