Human Molecular Genetics. 2022 Sep 3;31(R1):R84–R96. doi: 10.1093/hmg/ddac194

Multiplexed functional genomic assays to decipher the noncoding genome

Yonatan A Cooper, Qiuyu Guo, Daniel H Geschwind
PMCID: PMC9585676  PMID: 36057282

Abstract

Linkage disequilibrium and the incomplete regulatory annotation of the noncoding genome complicate the identification of functional noncoding genetic variants and their causal association with disease. Current computational methods for variant prioritization have limited predictive value, necessitating the application of highly parallelized experimental assays to efficiently identify functional noncoding variation. Here, we summarize two distinct approaches, massively parallel reporter assays and CRISPR-based pooled screens, and describe their flexible implementation to characterize human noncoding genetic variation at unprecedented scale. Each approach provides unique advantages and limitations, highlighting the importance of multimodal methodological integration. These multiplexed assays of variant effects are undoubtedly poised to play a key role in the experimental characterization of noncoding genetic risk, informing our understanding of the underlying mechanisms of disease-associated loci and the development of more robust predictive classification algorithms.

Introduction—Challenges of interpreting noncoding genetic variation

Since the early 2000s, the rapid development and plummeting costs of genotyping arrays and next-generation sequencing (NGS) technologies have revolutionized the fields of genetics and genomics (1). The widespread deployment of genotyping, whole exome and whole genome sequencing studies in the research, clinical and direct-to-consumer spaces has enabled the identification of millions of genetic variants across myriad individuals and ancestral populations as well as the de novo somatic mutational landscape in cancer (2–4). The interpretation of this vast and rapidly expanding catalog of human genetic variation has emerged as a fundamental problem in modern genetics. Understanding the relationship between genotypic and phenotypic variance is critical for elucidating the molecular basis for gene expression, identifying the genetic contribution to diseases, and providing genetic diagnosis and risk stratification in the clinical setting.

The biological interpretation of gene-mapping studies remains a major challenge. In genome-wide association studies (GWAS), linkage disequilibrium (LD) within trait-associated loci obscures the underlying causal variants among the many correlated polymorphisms within the locus, the majority of which are expected to be functionally neutral (5). Additionally, the vast majority of loci and variants identified by GWAS are found within non-protein coding regions, which comprise roughly 98% of the human genome and are under relatively less selective constraint than protein coding regions (6). Careful statistical partitioning of heritability has repeatedly confirmed that the majority of heritability for common disorders is harbored within the noncoding genome, particularly within enhancer domains and intronic splice sites (7,8). Similarly, the bulk of variation identified in NGS studies is found in the noncoding genome, and interpretation of the millions of noncoding rare variants of likely small effects remains notoriously difficult. The contribution of noncoding variation to rare and Mendelian disorders also remains poorly understood, although the existence of mutations that focally disrupt enhancers (9), along with the many mutations known to act via haploinsufficiency, suggests that regulatory mutations affecting gene dosage are more common than currently appreciated. Similarly, examples of single noncoding mutations associated with severe developmental disorders (10) and an expanding array of noncoding driver mutations found in cancer (11,12) demonstrate the broader importance of interpreting noncoding variation in the clinical setting. In contrast to protein altering mutations, noncoding variation maintains unclear functional relationships with putative target genes and may not act within all cell types or tissues (challenges summarized in Table 1). 
Therefore, high-throughput functional interpretation of noncoding genetic variation is one of the major challenges facing modern genetics (13).

Table 1.

The major challenges in interpreting noncoding loci and variants identified in GWAS or NGS studies, and proposed methodological solutions to address these challenges

Challenge | Exacerbating variables | Methodological solutions
Is the variant causal? | Linkage disequilibrium; low power/small effect sizes; genotyping/imputation failure | Fine-mapping; deep imputation with improved population reference panels, increased sample sizes
Is the variant functional? | Same as above; uncertain function of noncoding genome | Prioritization algorithms; functional genomic annotations; genome editing/in vitro modeling; MPRA
Which gene is regulated? | Gene dense regions; strong LD; complex 3D genomic architecture | QTL colocalization; 3D interactome assays; genome editing/in vitro modeling
Which cell-type mediates risk? | Trait in heterogeneous tissue, or unclear causal organ system; gene dense regions | Heritability enrichment by cell-type; transcriptomics/proteomics; genome editing/in vitro modeling

Interpretation of noncoding genetic variation using functional genomic maps

Noncoding genetic variation is assumed to primarily function by directly or indirectly influencing gene expression or splicing. Therefore, an efficient and simple prioritization strategy is to overlap variants with functional genomic annotations (14). These annotations are identified through empirical assays leveraging NGS to query DNA regions enriched for specific biochemical modifications (called ‘peaks’) known to correlate with regulatory activity. This includes methods to identify DNA accessibility (DNase-hypersensitivity (15) and ATAC-seq (16)), DNA methylation, histone modifications (Histone ChIP-seq) (17), Transcription Factor (TF) Binding (TF-ChIP, HT-SELEX, PBM (18)), STARR-Seq (19) or direct assessments of transcriptional activity (GRO/PRO-cap (20,21), GRO-seq (22), PRO-seq (23), CAGE (24)). Large-scale consortium initiatives including NIH Roadmap (25), ENCODE (26), IHEC (27), PsychENCODE (28) and FANTOM5 (29) catalog multiple such marks across many cell-lines and tissues and are an invaluable public resource. Other consortia catalog features such as TFBSs (e.g. JASPAR (18), TRANSFAC (30)) or enhancer annotations (e.g. Enhancer Atlas (31)). Moreover, the development of computational tools including ChromHMM (32) and Segway (33) that integrate multiple annotations across tissues provides higher resolution genomic segmentation and more refined descriptions of transcriptional regulatory states. Other tools, including RegulomeDB (34) and HaploReg (35), integrate pre-existing or user defined marks to directly prioritize GWAS variants. Similarly, functional maps are leveraged by a number of computational algorithms and machine learning methods for functional variant prediction (discussed below; tools reviewed in (36)). The mechanistic relevance of functional genomic annotations to gene regulation and variant prioritization is supported by the strong statistical enrichment of GWAS variation within these regions in a tissue and cell-type-specific manner (37–41).
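The overlap-based prioritization described above can be illustrated with a minimal Python sketch. It is not taken from any of the cited tools: the function names (`build_index`, `in_peak`, `prioritize`), the tuple layouts and the 0-based, half-open coordinate convention are all assumptions chosen for illustration.

```python
from bisect import bisect_right

def build_index(peaks):
    """Index (chrom, start, end) peaks per chromosome for fast lookup.

    Coordinates are assumed to be 0-based and half-open, i.e. [start, end).
    Overlapping or book-ended peaks on a chromosome are merged first.
    """
    by_chrom = {}
    for chrom, start, end in peaks:
        by_chrom.setdefault(chrom, []).append((start, end))
    index = {}
    for chrom, intervals in by_chrom.items():
        intervals.sort()
        merged = [list(intervals[0])]
        for start, end in intervals[1:]:
            if start <= merged[-1][1]:
                merged[-1][1] = max(merged[-1][1], end)
            else:
                merged.append([start, end])
        index[chrom] = ([s for s, _ in merged], [e for _, e in merged])
    return index

def in_peak(index, chrom, pos):
    """Return True if position `pos` on `chrom` falls inside any indexed peak."""
    if chrom not in index:
        return False
    starts, ends = index[chrom]
    i = bisect_right(starts, pos) - 1  # last peak starting at or before pos
    return i >= 0 and pos < ends[i]

def prioritize(variants, index):
    """Keep only the (variant_id, chrom, pos) tuples that overlap a peak."""
    return [v for v in variants if in_peak(index, v[1], v[2])]

# Hypothetical peaks and variants, for illustration only.
peaks = [("chr1", 100, 200), ("chr1", 150, 300), ("chr2", 50, 60)]
variants = [("var_a", "chr1", 250), ("var_b", "chr1", 300), ("var_c", "chr3", 10)]
idx = build_index(peaks)
kept = prioritize(variants, idx)
```

A lookup of this kind only reports co-localization with a mark. It says nothing about whether the variant changes the activity of the element.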

Nevertheless, functional genomic maps are subject to a number of limitations. First, biochemical maps are often discordant and fail to overlap relevant variation. One analysis found little genomic overlap between multiple enhancer annotation methods and found that only a minority of GWAS and eQTL variants (including curated causal variants) overlapped any given enhancer mark (42). Furthermore, experimental characterization of predicted enhancers found that between 30 and 46% are transcriptionally active (43,44), which taken together suggests that a large proportion of enhancers are both misspecified and undetected. Second, biochemical peaks are broad and lack the nucleotide-level resolution to disambiguate closely spaced variants (45). Third, existing functional maps are static snapshots and may not recapitulate critical demographic, disease or environmental-specific states, nor dynamic stimulus-induced regulatory changes (46,47). Moreover, such maps do not always capture cell-type or stage-specific aspects of transcriptional regulation, although single-cell epigenomic studies (48) as well as cross-tissue imputation strategies (49) attempt to rectify this. Finally, and most critically, overlap of GWAS variation with regulatory annotations alone does not prove causal relationships. For a comprehensive review of advances and challenges in genome wide annotation of regulatory elements, see Xu et al. (this issue) (14).

High-throughput functional assays for the annotation of noncoding variation

The limitations of biochemical annotations have motivated the development of an array of high-throughput experimental methods for genomic characterization (50) (Table 2). These multiplexed assays of variant effects (MAVEs) generally come in two flavors: (1) parallelized, direct genomic manipulation using gene-editing technology, and (2) massively multiplexed synthetic reporter assays using NGS as a readout (51). MAVEs have rapidly proliferated in recent years, leading to the collection and curation of rich functional data across the spectrum of noncoding variation (51,52). In the remainder of this review, we will discuss the impact of both approaches toward characterizing noncoding genetic elements and their respective limitations. These high-throughput experimental methods are especially important because existing computational methods are either not well powered or lack the specificity to define functional variation in a convincing manner (45).

Table 2.

Comparison of different methods used to screen noncoding and regulatory variants

Screening technology | Advantages | Limitations
CRISPR knockout | Native context; infer target genes | Relatively low throughput; cannot control exact editing result
CRISPRa/i | Same as above | Relatively low throughput; cannot infer exact causal variants
CRISPR base editing | Same as above; precise base conversion/direct inference | Relatively low throughput; bystander edits; limited types of conversions; editing window restriction
CRISPR prime editing | Same as above; versatile types of edits | Relatively low throughput; low efficiency
MPRA | Higher throughput; direct measurement of functional variants | Artificial expression; not within native genome context; cannot infer target genes

Interrogation of noncoding regulatory function through CRISPR/Cas-based genome modification

Genome editing technologies based on CRISPR/Cas systems permit perturbation of genetic elements in their native genomic contexts and subsequent measures of downstream phenotypic response (53). The modular nature of these technologies enables targeted mutagenesis (54), epigenomic modulation (e.g. CRISPRi (55) and CRISPRa (56)) and direct base pair editing (base editing (57) or prime editing (58)), among other applications (Fig. 1). Excitingly, these assays can be scaled by generating pooled libraries of many gRNAs combined with single-cell phenotyping to create high-throughput CRISPR screens (59). However, the bulk of these multiplexed assays have focused on manipulation of the protein-coding genome in pooled knock-out or saturation mutagenesis studies (60).

Figure 1.


Effects of CRISPR-based tools on noncoding variants. (A) CRISPR/Cas9 generates a double-strand break close to the target site, which induces NHEJ, resulting in an unpredictable repair event that often involves knockout of the variant of interest (highlighted ‘X’). (B) Base editing, as exemplified by the cytidine editor, leads to conversion of specific base pairings (highlighted ‘C’ to ‘T’) for all available relevant base instances in an editing window. (C) Prime editing induces an accurate and versatile conversion of a short sequence (‘X’ to ‘Y’ conversion) by utilizing the template for reverse transcription on the prime editing guide RNA (pegRNA); however, the efficiency of prime editing is much lower than the other technologies. (D) Epigenetic modulators, such as CRISPR interference (CRISPRi), use fused transcriptional co-effectors (KRAB-dCas9 in this case) to modify relevant chemical moieties on chromatin to change chromatin states and repress transcription; a range of up to 1 kb of DNA can be influenced.

Nevertheless, pooled CRISPR screens have been used to efficiently survey the noncoding regulatory landscape. The earliest iterations of these approaches paired Cas9-induced mutagenesis of cis-regulatory regions with functional readouts including cellular proliferation or antibody staining paired with FACS (61) (Fig. 2A). Classic examples include mutagenesis and characterization of putative enhancers upstream of the BCL11A gene in HUDEP-2 erythroid cells (62), TP53 enhancers in MCF-7 fibroblasts (63), among others (64,65). Focusing on biochemically predefined enhancers tractably constrained the target search space but likely missed novel regulatory elements (66) (Fig. 2B).

Figure 2.


Common screening strategies for the noncoding regulatory landscape with CRISPR-based tools. (A) In the first strategy (upper panel), a cellular knock-in reporter line is constructed to monitor endogenous expression of the target gene. Following CRISPRi or CRISPR perturbation, cells with low reporter expression are identified by FACS, followed by gRNA quantification to identify enrichment within functional relevant sites. This type of screening is inexpensive and well-powered, but limited to one target gene. In the second strategy (lower panel), single-cell RNA-seq (scRNA-seq) is performed along with single-cell gRNA sequencing (sc-gRNA-seq). Cells expressing each unique gRNA are grouped together and compared with the remaining cells to identify differentially expressed genes; this analytic framework is similar to eQTL analysis and offers genome-wide coverage, but requires much greater sequencing depth and expense. (B) Screening can be performed in either a tiling (upper) or focused (lower) fashion. The tiling array is unbiased but limited to a restricted number of loci, while the focused array is biased toward known epigenomic or enhancer annotations but enables surveillance of many distinct loci simultaneously.
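As a rough illustration of the FACS-based strategy in panel A, the following Python sketch scores each gRNA by its enrichment in the sorted, low-reporter population relative to the unsorted pool. The function name `guide_enrichment`, the count dictionaries and the pseudocount are assumptions made for this example. Published screens typically use dedicated statistical models with replicates and per-element aggregation of multiple gRNAs.

```python
import math

def guide_enrichment(sorted_counts, unsorted_counts, pseudocount=1.0):
    """Log2 enrichment of each gRNA in the sorted bin versus the unsorted pool.

    Both inputs map gRNA identifiers to raw read counts. Counts are converted
    to within-sample frequencies (with a pseudocount to avoid division by
    zero), so the two libraries may differ in sequencing depth.
    """
    guides = sorted(set(sorted_counts) | set(unsorted_counts))
    n = len(guides)
    total_sorted = sum(sorted_counts.get(g, 0) for g in guides) + pseudocount * n
    total_unsorted = sum(unsorted_counts.get(g, 0) for g in guides) + pseudocount * n
    scores = {}
    for g in guides:
        freq_sorted = (sorted_counts.get(g, 0) + pseudocount) / total_sorted
        freq_unsorted = (unsorted_counts.get(g, 0) + pseudocount) / total_unsorted
        scores[g] = math.log2(freq_sorted / freq_unsorted)
    return scores

# Hypothetical counts: g1 is over-represented among low-reporter cells.
scores = guide_enrichment({"g1": 300, "g2": 100}, {"g1": 100, "g2": 100})
neutral = guide_enrichment({"g1": 50, "g2": 50}, {"g1": 500, "g2": 500})
```

A positive score for a gRNA suggests that perturbing its target site reduces reporter expression. Agreement across several gRNAs tiling the same element is what makes such a call convincing.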

CRISPRi/a screening represents a major technical breakthrough by greatly expanding throughput compared with earlier mutagenesis studies (Fig. 2). For CRISPR interference (CRISPRi), targeted suppression of endogenous transcriptional machinery is achieved through fusion of dCas9 to repressor proteins, typically the Kruppel-associated box domain (KRAB) (55). Other repressor domains include LSD1, DNMT3A, SID4X and HDAC3 (61,67). Conversely, for CRISPR activation (CRISPRa), dCas9 is fused to one or more activating domains including VP64 (e.g. SunTag (68)), p300 (69) or the MS2 RNA hairpin that recruits MCP-p65-HSF1 fusion proteins (e.g. SAM (70)) to drive gene expression. These systems modify transcriptional machinery within ~200–500 bp of the targeting gRNA to enable regional rather than nucleotide level assay resolution (Fig. 1D). Assay multiplexing and incorporation of single-cell transcriptomic measurement allows for the efficient surveillance of the noncoding regulatory landscape at the genome-wide level and includes approaches such as CROP-seq (71), Perturb-seq (59), CRISP-seq (72), MOSAIC-seq (73) and CRISPRi-FlowFISH (74).

In an early seminal paper, Fulco and colleagues (75) used CRISPRi to characterize a ~1 Mb region surrounding the MYC and GATA1 transcription factors, identifying nine distal enhancers. A more recent study from Gasperini et al. (76) used the CROP-seq method to screen more than 5920 putative enhancers in K562 cells, identifying 664 novel enhancer-gene relationships (Fig. 2A), highlighting the rapid evolution and improving throughput of these approaches. Other examples of CRISPRi and CRISPRa demonstrate a rapidly growing literature characterizing enhancer elements across cell types (reviewed by Shukla and Huangfu (61)). These assays have also been used to characterize other genomic features such as long noncoding RNAs (77), transcription factor binding sites (TFBS) and cistromes (78,79), and even to validate putative regulatory variants identified from eQTL studies, GWAS or cancer-associated somatic mutations identified in clinical cohorts using WGS (80,81).

A major consideration when interpreting CRISPR-based screening methods is the potential for off-target effects (82). This issue is somewhat mitigated by in silico gRNA design tools that use either alignment (83,84) or evidence-based models to predict on-target and off-target efficiency (85–87). While there are several experimental methods to detect off-target effects of single gRNAs in a genome-wide fashion (88–90), such methods are not scalable for CRISPR screening assays. Therefore, using multiple gRNAs to target each individual locus is the typical solution.

Another methodological limitation is the high dropout rate in mRNA detection using single-cell sequencing technologies, which is magnified in pooled CRISPRi screens incorporating eQTL-like analysis pipelines (76). Practically, our experience suggests that differential gene expression (DGE) based on scRNA-seq data is powered to detect changes primarily in the top decile of genes expressed in a given cell population. Additionally, enhancers have smaller effect sizes relative to promoters and may have compensatory or redundant function when perturbed (76). These two issues, combined with a large multiple-testing burden given the scale of the noncoding genome, suggest a significant limitation in detection power for these assays.
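The eQTL-like comparison underlying these screens can be sketched in a few lines of Python. This is an illustrative simplification, not the pipeline used in (76): the function name `perturbation_effect`, the input layout (one expression value and one gRNA label per cell) and the pseudocount are assumptions. A real analysis would add a statistical test and multiple-testing correction across all gRNA-gene pairs.

```python
import math

def perturbation_effect(expression, guide_of_cell, guide, pseudocount=1.0):
    """Compare one gene's expression in cells carrying `guide` with all other cells.

    `expression` holds per-cell counts for a single gene and `guide_of_cell`
    the gRNA detected in each cell. The detection fractions make the effect
    of dropout visible: a lowly expressed gene is observed in few cells of
    either group, which leaves little signal to compare.
    """
    carriers = [x for x, g in zip(expression, guide_of_cell) if g == guide]
    others = [x for x, g in zip(expression, guide_of_cell) if g != guide]
    if not carriers or not others:
        raise ValueError("need at least one carrier cell and one non-carrier cell")
    mean_carriers = sum(carriers) / len(carriers)
    mean_others = sum(others) / len(others)
    return {
        "log2_fold_change": math.log2(
            (mean_carriers + pseudocount) / (mean_others + pseudocount)
        ),
        "detected_in_carriers": sum(x > 0 for x in carriers) / len(carriers),
        "detected_in_others": sum(x > 0 for x in others) / len(others),
    }

# Hypothetical counts for one gene in eight cells.
expr = [0, 1, 0, 1, 3, 3, 4, 2]
guides = ["gE"] * 4 + ["ctrl"] * 4
effect = perturbation_effect(expr, guides, "gE")
```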

Lastly, there is a tradeoff between throughput and base-pair level resolution. CRISPRa/i approaches lead to chromatin modifications at the range of several hundred base pairs, enabling in aggregate extended assessment of the broader genetic landscape without base-pair level resolution. In contrast, base editing approaches allow nucleotide-scale manipulation and have become highly efficient, with desired conversion rates up to 90% (91). However, these nucleotide conversions can be restricted (primarily C-to-T or A-to-G) (92), bystander effects can occur (i.e. unintended edits of similar bases within the editing window) (93) and fixed windows of editing may limit the editability of bases. As opposed to base editing, even the most efficient version of prime editing thus far achieves 10–20% of desired edits (94), which is not pragmatic for high-throughput screening with a gene expression readout. A recent study that performed a saturation prime editing screen on specific coding regions of two genes with 26 pegRNA structures (95) cleverly used functional markers for target enrichment to achieve high editing efficiency (up to 80%). Overall, base pair editing approaches are very promising and with additional technical advances are likely to be useful for high-throughput screening of noncoding variants in the near future.
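The editing-window and bystander constraints on base editing can be made concrete with a small Python sketch. The window of protospacer positions 4 to 8 (counted from the PAM-distal end) is an assumption that is often quoted for early cytidine base editors and differs between editor variants. The function name `cytosine_edits` and the example sequence are hypothetical.

```python
def cytosine_edits(protospacer, target_pos, window=(4, 8)):
    """Check whether a target C is editable and list bystander Cs.

    Positions are 1-based from the PAM-distal end of the protospacer.
    Returns (target_editable, bystander_positions), where bystanders are
    other cytosines inside the assumed editing window.
    """
    seq = protospacer.upper()
    if not 1 <= target_pos <= len(seq):
        raise ValueError("target_pos lies outside the protospacer")
    lo, hi = window
    cytosines_in_window = [
        i for i in range(lo, hi + 1) if i <= len(seq) and seq[i - 1] == "C"
    ]
    target_editable = target_pos in cytosines_in_window
    bystanders = [i for i in cytosines_in_window if i != target_pos]
    return target_editable, bystanders

# Hypothetical 20-nt protospacer with cytosines at positions 4, 6, 7 and 15.
spacer = "GATCACCTGAAGTTCGATGG"
inside = cytosine_edits(spacer, 6)
outside = cytosine_edits(spacer, 15)
```

In this example the C at position 6 is editable but shares the window with two other cytosines, while the C at position 15 cannot be reached with this gRNA at all.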

Massively parallel reporter assays enable direct functional characterization of diverse genomic features

A second set of approaches for the functional characterization of genetic elements are collectively known as massively parallel reporter assays (MPRA), which involve the construction of a synthetic library of genomic elements, each paired with a reporter gene and a unique, genetically encoded barcode. These libraries are delivered to cell lines or tissues of interest, and the functional effects of library elements are assayed through the multiplexed measurement of barcoded reporter transcripts using NGS (96). This method has enabled regulatory characterization of diverse sets of genomic features across a variety of biological contexts (Fig. 3).

Figure 3.


MPRA workflow. (A) Individual cis-regulatory elements (CREs) can either be synthesized on a DNA oligonucleotide array or sampled from genomic DNA using selection or capture for specific target regions. (B) The CREs are cloned into a reporter vector to drive expression of a uniquely barcoded transcript. Each CRE normally corresponds to multiple unique identifying barcodes. In this example, the CRE is upstream of the minimal promoter and barcoded transcript, but other MPRA designs exist, each with specific advantages and drawbacks. (C) The MPRA reporter library is delivered into a cell model, followed by mRNA collection and unique barcode amplification using targeted PCR. (D) MPRA data can be used to infer transcriptional activity of tested CREs or compare the relative transcriptional efficacies of different alleles of the same variant within a given CRE.

The earliest MPRA iterations were developed to test the transcriptional activity of enhancers (43,97–99) by counting uniquely barcoded transcripts deriving from each enhancer element. These synthetic assays have variable library designs and implementations. For example: (1) enhancer elements can be placed upstream of a minimal promoter, or within the 3’ UTR of the reporter gene (STARR-seq (19), suRE (100,101)). (2) Enhancer DNA can be obtained via microarray synthesis, PCR (102) or genomic DNA capture (103,104). (3) Libraries can be delivered using episomal or integrating vectors (105,106). MPRAs have been used to successfully screen for enhancer activity (96), repressor elements (107) or differential activity across a diverse array of prokaryotic and eukaryotic cell types (97–99,108). Packaging libraries within viral delivery platforms, including adeno-associated virus (104,109) and lentivirus (105,110), have broadened the available cellular contexts to include difficult primary cell lines such as neural cells (110–112), and in vivo assays in the mouse retina (113) and brain (104,109). Approaches incorporating saturation mutagenesis (114) or tiling of overlapping elements (115) enable elucidation of TF-binding logic and nucleotide-resolution sequence specificity. Finally, more recent MPRA iterations probe the function of other noncoding genomic features such as 5’ and 3’ UTRs or splice sites and their effects on transcriptional, post-transcriptional or translational regulation. These include MPRA characterization of RNA splicing (116,117), RNA or protein stability (118–120), RNA editing (121), RNA localization (122) and translation efficiency (123).

MPRA can also be used to compare transcriptional regulatory effects between alleles to interrogate noncoding genetic variation, though these assays are performed less frequently due to the challenge of measuring small allelic effect sizes. Kircher et al. (114) performed saturation mutagenesis across 20 disease associated cis-regulatory elements to test for functional effects across ~30 000 SNVs. Others have assessed synthetic configurations of TFBSs to query the sensitivity of DNA-binding elements to disrupting variation (112,124–126). Additionally, MPRA has been used to screen thousands of noncoding variants derived from eQTL (127,128), GWAS and WGS studies. Specifically, MPRA has been used to measure allelic effects of variants associated with vascular traits (129,130), cancer (130–132), obesity (133), diabetes (134), osteoarthritis (135), COPD (136), lupus (137) and neuropsychiatric (138,139) and neurodegenerative disorders (140). Other work has characterized rare autism-associated (141) or other common (142) variants located in human accelerated regions, as well as assessment of human-specific (143) and introgressed (144) regulatory variants.
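A minimal Python sketch of an allelic comparison is given below, assuming that each allele is represented by several barcodes with paired RNA and plasmid DNA counts that have already been normalized for sequencing depth. The function names, the `min_dna` filter and the use of a median are illustrative choices. Published analyses generally rely on replicate-aware statistical models.

```python
import math
from statistics import median

def barcode_activity(rna, dna, pseudocount=1.0):
    """Log2 ratio of RNA to DNA counts for a single barcode."""
    return math.log2((rna + pseudocount) / (dna + pseudocount))

def allelic_effect(ref_barcodes, alt_barcodes, min_dna=10):
    """Compare reporter activity between two alleles of the same element.

    Each argument is a list of (rna_count, dna_count) pairs, one per barcode.
    Barcodes with fewer than `min_dna` DNA counts are dropped because their
    ratios are dominated by sampling noise.
    """
    def summarize(barcodes):
        activities = [barcode_activity(r, d) for r, d in barcodes if d >= min_dna]
        if not activities:
            raise ValueError("no barcode passed the DNA count filter")
        return median(activities)

    ref = summarize(ref_barcodes)
    alt = summarize(alt_barcodes)
    return {"ref_activity": ref, "alt_activity": alt, "log2_allelic_skew": alt - ref}

# Hypothetical counts: the alternative allele halves reporter output.
ref = [(199, 99), (399, 199), (99, 49), (5, 2)]
alt = [(99, 99), (199, 199), (49, 49)]
result = allelic_effect(ref, alt)
```

Summarizing over many barcodes per allele is what allows small allelic differences to be separated from barcode-specific noise.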

It has been repeatedly shown that MPRAs are exquisitely sensitive at detecting functional disruption of TF binding sites within gene-regulatory elements (112,145,146). Recently, Cooper et al. (140) used MPRA to screen common variants derived from GWAS for two neurodegenerative disorders, finding that functional common regulatory variants disrupted a network of interacting TFs likely to operate in a cell-type specific manner to modulate disease risk. These results underscore how high-throughput data from functional genetic screens can intersect with network biology to provide broader insights into the etiology of complex traits beyond mechanistic characterization of individual loci.

When compared with gene-editing approaches, MPRAs afford a number of advantages. First, most MPRAs do not require deep single-cell phenotyping and are therefore comparatively cheap. Synthetic library construction is flexible and not limited to particular sequences or regions, such as those harboring PAM sites, and internal validity and normalization are facilitated through the incorporation of known control sequences. Moreover, assay throughput is an order of magnitude greater, with MPRA libraries typically querying >10 000 elements. Most significantly, MPRAs allow for nucleotide level resolution and testing of a priori defined sets of human polymorphisms.

Technical considerations are important and library design and implementation can have a large impact on results (147). Increasing enhancer size changes regulatory function through the introduction of distal TFBSs with transcriptional modifying effects, which may be more reliable (147). The ‘classic’ placement of library elements upstream of the minimal promoter (5′ orientation) increases assay sensitivity to alterations in DNA-binding motifs, whereas elements in the 3′ orientation (STARR-seq) are more influenced by RNA-binding proteins (147,148). Additionally, MPRAs testing allelic effects are dependent on high transfection efficiency, which has necessitated the use of immortalized cell lines in the literature (149), which can impact assay outcomes (150). Therefore, MPRAs are ideally performed in primary cell types or in vivo tissues most relevant to the underlying trait of interest, or using a staged design. Similarly, assays rarely account for state-specific or environmental factors that may interact with genetic elements to influence gene expression. Improvements in library delivery methods and more complex state-specific study designs will begin to rectify this.

Additionally, since MPRAs involve testing DNA elements outside of their native genomic environments they do not account for locus-specific chromatin architectures, DNA–protein and DNA–RNA interactions and adjacent sequence contexts that strongly regulate gene expression. Yet, despite this limitation, they are remarkably reliable. In all cases, interpretation should be cautious, as demonstration of regulatory function does not necessarily imply that a specific variant is ‘causal’ or pathogenic. Similarly, functional variants might regulate multiple downstream genes through one-to-many enhancer relationships. Only experimental follow-up can resolve which functional variants or associated risk genes are mechanistically relevant.

Integration of multiple functional methods overcomes individual technical limitations

Multimodal methodological integration can address many of the limitations of individual approaches. In a recent tour de force, Ray and colleagues (151) characterized ~2776 regulatory SNVs in the multi-disease associated TNFAIP3 locus across three related erythroid cell types using seven approaches: HiChIP, ATAC-seq, TFBS assessment, CRISPRa, CRISPRi, L-MPRA (lenti) and T-MPRA (transfection). In their analysis, they found that CRISPRi and T-MPRA were effective for identifying likely causal variation and particularly favored the integration of these two approaches (151). In another recent study, we also integrated T-MPRA, CRISPRi screens and published biochemical data to functionally prioritize 5706 GWAS variants for two neurodegenerative disorders, Alzheimer’s disease and Progressive Supranuclear Palsy (140). These complementary methods were used to sequentially and iteratively prioritize high-probability causal variants and account for the limitations of individual methods: MPRA provided allele level functional information, CRISPRi verified functionality in native genomic contexts and assignment of downstream genes, and biochemical assays confirmed accessibility and likely functionality within relevant post-mortem tissues (140). Other recent advances include the transMPRA approach, which cleverly combines synthetic MPRA libraries with CRISPRi screens in a combinatorial manner to systematically identify trans regulators of user-defined noncoding genetic elements (152). These results highlight the necessity of a multimodal approach in future studies seeking to meaningfully annotate the noncoding genome.

Data from multiplexed assays can improve predictive classification methods

The development of inferential and predictive methods for identifying functional genomic elements is an active area of ongoing research. Computational methods have the advantage of higher throughput and lower time and overhead costs compared with experimental approaches. In recent years, an array of statistical and machine learning algorithms leveraging sequence features including biochemical annotations and evolutionary conservation have been developed for functional variant prediction (see Nicholls and colleagues for an overview of machine learning methods (153) and Schipper and Posthuma, this issue, for a broad discussion of analytic methods (155)). Unfortunately, these algorithms still exhibit inconsistent predictive accuracy across all use cases and poor agreement with each other (45,147), limiting utility in the research and clinical setting. As such, the ACMG’s latest guidelines have restrictive rules on the usage of predictive algorithms in clinical reporting (154). To rectify this, it has been proposed that functional data accrued from high-throughput experimental screens can serve as an orthogonal benchmark of algorithm performance (128,156). However, the extent to which computational algorithms recapitulate functional data from MPRA during direct comparison remains highly disparate in the literature (129,140,147,157) and may vary depending on the specific features used to train particular algorithms. Interestingly, recent work demonstrates that functional data from MPRA can be directly incorporated to train computational algorithms and boost predictive performance (158). For example, Movva and colleagues (159) developed MPRA-DragoNN, a convolutional neural network exclusively trained on pre-existing MPRA, and validated this method by fine-mapping an independent GWAS of LDL levels.
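One simple way to use experimental calls as a benchmark is to ask how well a predictor's scores rank MPRA-functional variants above non-functional ones. The Python sketch below computes the area under the ROC curve from its rank-based definition. The choice of AUROC as the metric, the function name `auroc` and the example values are assumptions for illustration, and the cited benchmarking studies may use other metrics.

```python
def auroc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen functional variant
    (label 1) receives a higher score than a randomly chosen non-functional
    variant (label 0), with ties counted as one half. The quadratic loop is
    adequate for benchmark sets of a few thousand variants.
    """
    positives = [s for s, y in zip(scores, labels) if y]
    negatives = [s for s, y in zip(scores, labels) if not y]
    if not positives or not negatives:
        raise ValueError("both functional and non-functional variants are required")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))

# Hypothetical predictor scores and MPRA-derived labels for four variants.
predictor_scores = [0.9, 0.8, 0.3, 0.1]
mpra_functional = [1, 0, 1, 0]
benchmark = auroc(predictor_scores, mpra_functional)
```

A value near 0.5 indicates that the predictor ranks experimentally functional variants no better than chance, whereas a value of 1.0 indicates perfect separation.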

These findings are illustrative of a broader point: high-throughput experimental datasets can be leveraged to improve a priori predictive methods across different classes of functional variation. For instance, Fulco and colleagues developed the CRISPRi-FlowFISH method to characterize putative enhancers in K562 lymphoblastoid cells and used the resultant functional data to generate an Activity-by-Contact model of enhancer function that outperformed existing predictive methods (74) and effectively prioritized GWAS variation (160). Rosenberg et al. (116) created a functional assay to measure differential splicing effects for more than 2 million synthetic variants, which was used to train a predictive model that vastly outperformed existing algorithms. Other work functionally screened polyribosome loading for more than 280 000 synthetically derived 5’ UTRs to generate an effective predictive model for the translational efficacy of 5’ UTR sequence variants (123). These examples demonstrate that the diverse, rapidly expanding catalog of experimental functional datasets can be utilized to improve predictive algorithms and may even inform the interpretation of human variation in clinical databases.

Conclusions and future perspectives

Recent technological innovations have rapidly advanced our functional understanding of the human regulatory genome (Xu et al., this issue). Tissue-specific, genome-wide annotations of putative regulatory elements are now available across hundreds of tissues and cell lines, as exemplified by work from the ENCODE (148), Roadmap Epigenomics (25) and PsychENCODE (161) consortiums. Cell-type-specific regulatory maps represent the next frontier in this effort. Yet, functional screening is still required to fully delineate the functional effects of noncoding genetic variation.

The complex genetic architectures underlying most common traits require integrative experimental approaches to dissect individual loci and to understand their combined effects. Work on autism spectrum disorders (ASD) is a particularly salient example of genetic studies revolutionizing causal models of disease. Studies over the past decades have identified causal roles for: (1) common variants, (2) rare, de novo and protein-truncating variants, (3) copy number variants and (4) high-confidence syndromic genes (162). Although the phenotypes of ASD are multifaceted and highly variable, both genetic (163,164) and transcriptomic (165,166) evidence points to disruption of convergent gene networks and pathways during fetal brain development. Comprehensive mapping of such networks advances our understanding of disease etiology. Transcriptional regulation through cis-regulatory elements, where noncoding variants play critical roles, is a key piece of these networks. Therefore, high-throughput functional assays are required to dissect the variants involved and to map their cell-type specificity and downstream target genes. A similar logic can be applied to many other diseases.

CRISPRa/i screens and MPRA are two proven technologies for dissecting noncoding variation. Other high-throughput methods that survey additional aspects of noncoding variant function are also worth mentioning. For example, SNP-SELEX (167) and REEL-Seq (168) are parallelized in vitro assays that directly query DNA/TF interactions. In addition, variants affecting RNA splicing (sQTLs) are predicted to play a significant role in disease etiology (8,169), but high-throughput functional characterization of sQTLs is in its early stages (e.g. (117)). To improve high-throughput analysis of splicing, a more efficient prime editing assay coupled with an isoform-level transcriptomic readout would be optimal.

One major area of opportunity, aside from increased scale, is to couple these technologies with other biological assays. Currently, CRISPR screens rely primarily on transcriptomics or cell proliferation (170) as readouts. However, recent studies have started to incorporate imaging analysis into this repertoire (171,172), which may facilitate the linkage of genetic variants to other disease-specific biological features. Other complex biological readouts, such as neuronal activity or metabolomics, could also be integrated within these assays to reflect the relevant physiological processes more directly. Ultimately, functional genetic screening will ideally be performed at base-pair resolution, in versatile, native and relevant cellular contexts, and with physiologically relevant functional readouts. In the near term, this will depend on improved gene editing technologies and cell delivery methods.

Funding

National Institute of Neurological Disorders and Stroke [UG3NS104095 to D.H.G.]; National Institute of Mental Health [1U54NS123746 and 5U01MH116489-05 to D.H.G.]; National Institute on Aging fellowship [1F30AG064832 to Y.A.C.]; UCLA-Caltech MSTP training grant [T32-GM008042 to Y.A.C.].

Contributor Information

Yonatan A Cooper, Department of Human Genetics, David Geffen School of Medicine, University of California Los Angeles, Los Angeles, CA, USA; Medical Scientist Training Program, David Geffen School of Medicine, University of California Los Angeles, Los Angeles, CA, USA; Center for Neurobehavioral Genetics, Jane and Terry Semel Institute for Neuroscience and Human Behavior, University of California Los Angeles, Los Angeles, CA, USA.

Qiuyu Guo, Center for Neurobehavioral Genetics, Jane and Terry Semel Institute for Neuroscience and Human Behavior, University of California Los Angeles, Los Angeles, CA, USA.

Daniel H Geschwind, Department of Human Genetics, David Geffen School of Medicine, University of California Los Angeles, Los Angeles, CA, USA; Program in Neurogenetics, Department of Neurology, University of California Los Angeles, Los Angeles, CA, USA; Center for Autism Research and Treatment, Semel Institute, University of California Los Angeles, Los Angeles, CA, USA; Institute of Precision Health, University of California Los Angeles, Los Angeles, CA, USA.

References

  • 1. Claussnitzer, M., Cho, J.H., Collins, R., Cox, N.J., Dermitzakis, E.T., Hurles, M.E., Kathiresan, S., Kenny, E.E., Lindgren, C.M. and MacArthur, D.G. (2020) A brief history of human disease genetics. Nature, 577, 179–189. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 2. Karczewski, K.J., Francioli, L.C., Tiao, G., Cummings, B.B., Alföldi, J., Wang, Q., Collins, R.L., Laricchia, K.M., Ganna, A. and Birnbaum, D.P. (2020) The mutational constraint spectrum quantified from variation in 141,456 humans. Nature, 581, 434–443. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 3. Sudlow, C., Gallacher, J., Allen, N., Beral, V., Burton, P., Danesh, J., Downey, P., Elliott, P., Green, J. and Landray, M. (2015) UK biobank: an open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLoS Med., 12, e1001779. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 4. Weinstein, J.N., Collisson, E.A., Mills, G.B., Shaw, K.R., Ozenberger, B.A., Ellrott, K., Shmulevich, I., Sander, C. and Stuart, J.M. (2013) The cancer genome atlas pan-cancer analysis project. Nat. Genet., 45, 1113–1120. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 5. Schaid, D.J., Chen, W. and Larson, N.B. (2018) From genome-wide associations to candidate causal variants by statistical fine-mapping. Nat. Rev. Genet., 19, 491–504. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 6. Timpson, N.J., Greenwood, C.M., Soranzo, N., Lawson, D.J. and Richards, J.B. (2018) Genetic architecture: the shape of the genetic contribution to human traits and disease. Nat. Rev. Genet., 19, 110. [DOI] [PubMed] [Google Scholar]
  • 7. Gusev, A., Lee, S.H., Trynka, G., Finucane, H., Vilhjálmsson, B.J., Xu, H., Zang, C., Ripke, S., Bulik-Sullivan, B. and Stahl, E. (2014) Partitioning heritability of regulatory and cell-type-specific variants across 11 common diseases. Am. J. Hum. Genet., 95, 535–552. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 8. Li, Y.I., Van De Geijn, B., Raj, A., Knowles, D.A., Petti, A.A., Golan, D., Gilad, Y. and Pritchard, J.K. (2016) RNA splicing is a primary link between genetic variation and disease. Science, 352, 600–604. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 9. de la Torre-Ubieta, L., Stein, J.L., Won, H., Opland, C.K., Liang, D., Lu, D. and Geschwind, D.H. (2018) The dynamic landscape of open chromatin during human cortical neurogenesis. Cell, 172, 289–304.e18. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 10. Wright, C.F., Quaife, N.M., Ramos-Hernández, L., Danecek, P., Ferla, M.P., Samocha, K.E., Kaplanis, J., Gardner, E.J., Eberhardt, R.Y. and Chao, K.R. (2021) Non-coding region variants upstream of MEF2C cause severe developmental disorder through three distinct loss-of-function mechanisms. Am. J. Hum. Genet., 108, 1083–1094. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 11. Melton, C., Reuter, J.A., Spacek, D.V. and Snyder, M. (2015) Recurrent somatic mutations in regulatory regions of human cancer genomes. Nat. Genet., 47, 710–716. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 12. Fredriksson, N.J., Ny, L., Nilsson, J.A. and Larsson, E. (2014) Systematic analysis of noncoding somatic mutations and gene expression alterations across 14 tumor types. Nat. Genet., 46, 1258–1263. [DOI] [PubMed] [Google Scholar]
  • 13. Alexander, R.P., Fang, G., Rozowsky, J., Snyder, M. and Gerstein, M.B. (2010) Annotating non-coding regions of the genome. Nat. Rev. Genet., 11, 559–571. [DOI] [PubMed] [Google Scholar]
  • 14. Xu, J., Pratt, H.E., Moore, J.E., Gerstein, M.B. and Weng, Z. (2022) Building integrative functional maps of gene regulation. Hum. Mol. Genet., doi: 10.1093/hmg/ddac195. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 15. Boyle, A.P., Davis, S., Shulha, H.P., Meltzer, P., Margulies, E.H., Weng, Z., Furey, T.S. and Crawford, G.E. (2008) High-resolution mapping and characterization of open chromatin across the genome. Cell, 132, 311–322. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 16. Buenrostro, J.D., Giresi, P.G., Zaba, L.C., Chang, H.Y. and Greenleaf, W.J. (2013) Transposition of native chromatin for fast and sensitive epigenomic profiling of open chromatin, DNA-binding proteins and nucleosome position. Nat. Methods, 10, 1213. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 17. Kouzarides, T. (2007) Chromatin modifications and their function. Cell, 128, 693–705. [DOI] [PubMed] [Google Scholar]
  • 18. Fornes, O., Castro-Mondragon, J.A., Khan, A., Van der Lee, R., Zhang, X., Richmond, P.A., Modi, B.P., Correard, S., Gheorghe, M. and Baranašić, D. (2020) JASPAR 2020: update of the open-access database of transcription factor binding profiles. Nucleic Acids Res., 48, D87–D92. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 19. Arnold, C.D., Gerlach, D., Stelzer, C., Boryń, L.M., Rath, M. and Stark, A. (2013) Genome-wide quantitative enhancer activity maps identified by STARR-seq. Science, 339, 1074–1077. [DOI] [PubMed] [Google Scholar]
  • 20. Core, L.J., Martins, A.L., Danko, C.G., Waters, C.T., Siepel, A. and Lis, J.T. (2014) Analysis of nascent RNA identifies a unified architecture of initiation regions at mammalian promoters and enhancers. Nat. Genet., 46, 1311–1320. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 21. Yao, L., Liang, J., Ozer, A., Leung, A.K., Lis, J.T. and Yu, H. (2022) A comparison of experimental assays and analytical methods for genome-wide identification of active enhancers. Nat. Biotechnol., 40, 1056–1065. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 22. Core, L.J., Waterfall, J.J. and Lis, J.T. (2008) Nascent RNA sequencing reveals widespread pausing and divergent initiation at human promoters. Science, 322, 1845–1848. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 23. Mahat, D.B., Kwak, H., Booth, G.T., Jonkers, I.H., Danko, C.G., Patel, R.K., Waters, C.T., Munson, K., Core, L.J. and Lis, J.T. (2016) Base-pair-resolution genome-wide mapping of active RNA polymerases using precision nuclear run-on (PRO-seq). Nat. Protoc., 11, 1455. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 24. Shiraki, T., Kondo, S., Katayama, S., Waki, K., Kasukawa, T., Kawaji, H., Kodzius, R., Watahiki, A., Nakamura, M. and Arakawa, T. (2003) Cap analysis gene expression for high-throughput analysis of transcriptional starting point and identification of promoter usage. Proc. Natl. Acad. Sci., 100, 15776–15781. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 25. Bernstein, B.E., Stamatoyannopoulos, J.A., Costello, J.F., Ren, B., Milosavljevic, A., Meissner, A., Kellis, M., Marra, M.A., Beaudet, A.L. and Ecker, J.R. (2010) The NIH roadmap epigenomics mapping consortium. Nat. Biotechnol., 28, 1045–1048. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 26. ENCODE Project Consortium (2012) An integrated encyclopedia of DNA elements in the human genome. Nature, 489, 57. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 27. Stunnenberg, H.G., Abrignani, S., Adams, D., de Almeida, M., Altucci, L., Amin, V., Amit, I., Antonarakis, S.E., Aparicio, S. and Arima, T. (2016) The International Human Epigenome Consortium: a blueprint for scientific collaboration and discovery. Cell, 167, 1145–1149. [DOI] [PubMed] [Google Scholar]
  • 28. Wang, D., Liu, S., Warrell, J., Won, H., Shi, X., Navarro, F.C., Clarke, D., Gu, M., Emani, P., Yang, Y.T. et al. (2018) Comprehensive functional genomic resource and integrative model for the human brain. Science, 362(6420), eaat8464. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 29. FANTOM Consortium (2014) A promoter-level mammalian expression atlas. Nature, 507, 462. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 30. Wingender, E., Dietze, P., Karas, H. and Knüppel, R. (1996) TRANSFAC: a database on transcription factors and their DNA binding sites. Nucleic Acids Res., 24, 238–241. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 31. Andersson, R., Gebhard, C., Miguel-Escalada, I., Hoof, I., Bornholdt, J., Boyd, M., Chen, Y., Zhao, X., Schmidl, C., Suzuki, T. et al. (2014) An atlas of active enhancers across human cell types and tissues. Nature, 507, 455–461. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 32. Ernst, J. and Kellis, M. (2017) Chromatin-state discovery and genome annotation with ChromHMM. Nat. Protoc., 12, 2478. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 33. Hoffman, M.M., Buske, O.J., Wang, J., Weng, Z., Bilmes, J.A. and Noble, W.S. (2012) Unsupervised pattern discovery in human chromatin structure through genomic segmentation. Nat. Methods, 9, 473. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 34. Boyle, A.P., Hong, E.L., Hariharan, M., Cheng, Y., Schaub, M.A., Kasowski, M., Karczewski, K.J., Park, J., Hitz, B.C. and Weng, S. (2012) Annotation of functional variation in personal genomes using RegulomeDB. Genome Res., 22, 1790–1797. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 35. Ward, L.D. and Kellis, M. (2012) HaploReg: a resource for exploring chromatin states, conservation, and regulatory motif alterations within sets of genetically linked variants. Nucleic Acids Res., 40, D930–D934. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 36. Rojano, E., Seoane, P., Ranea, J.A. and Perkins, J.R. (2019) Regulatory variants: from detection to predicting impact. Brief. Bioinform., 20, 1639–1654. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 37. Trynka, G., Sandor, C., Han, B., Xu, H., Stranger, B.E., Liu, X.S. and Raychaudhuri, S. (2013) Chromatin marks identify critical cell types for fine mapping complex trait variants. Nat. Genet., 45, 124–130. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 38. Farh, K.K.-H., Marson, A., Zhu, J., Kleinewietfeld, M., Housley, W.J., Beik, S., Shoresh, N., Whitton, H., Ryan, R.J., Shishkin, A.A. et al. (2015) Genetic and epigenetic fine mapping of causal autoimmune disease variants. Nature, 518, 337–343. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 39. Finucane, H.K., Bulik-Sullivan, B., Gusev, A., Trynka, G., Reshef, Y., Loh, P.-R., Anttila, V., Xu, H., Zang, C. and Farh, K. (2015) Partitioning heritability by functional annotation using genome-wide association summary statistics. Nat. Genet., 47, 1228. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 40. Bulik-Sullivan, B.K., Loh, P.-R., Finucane, H.K., Ripke, S., Yang, J., Patterson, N., Daly, M.J., Price, A.L. and Neale, B.M. (2015) LD Score regression distinguishes confounding from polygenicity in genome-wide association studies. Nat. Genet., 47, 291–295. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 41. Finucane, H.K., Reshef, Y.A., Anttila, V., Slowikowski, K., Gusev, A., Byrnes, A., Gazal, S., Loh, P.-R., Lareau, C., Shoresh, N. et al. (2018) Heritability enrichment of specifically expressed genes identifies disease-relevant tissues and cell types. Nat. Genet., 50, 621–629. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 42. Benton, M.L., Talipineni, S.C., Kostka, D. and Capra, J.A. (2019) Genome-wide enhancer annotations differ significantly in genomic distribution, evolution, and function. BMC Genomics, 20, 1–22. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 43. Kwasnieski, J.C., Fiore, C., Chaudhari, H.G. and Cohen, B.A. (2014) High-throughput functional testing of ENCODE segmentation predictions. Genome Res., 24, 1595–1602. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 44. Kvon, E.Z., Kazmar, T., Stampfel, G., Yáñez-Cuna, J.O., Pagani, M., Schernhuber, K., Dickson, B.J. and Stark, A. (2014) Genome-scale functional characterization of Drosophila developmental enhancers in vivo. Nature, 512, 91–95. [DOI] [PubMed] [Google Scholar]
  • 45. Liu, L., Sanderford, M.D., Patel, R., Chandrashekar, P., Gibson, G. and Kumar, S. (2019) Biological relevance of computationally predicted pathogenicity of noncoding variants. Nat. Commun., 10, 1–11. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 46. Alasoo, K., Rodrigues, J., Mukhopadhyay, S., Knights, A.J., Mann, A.L., Kundu, K., Hale, C., Dougan, G. and Gaffney, D.J. (2018) Shared genetic effects on chromatin and gene expression indicate a role for enhancer priming in immune response. Nat. Genet., 50, 424–431. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 47. Soskic, B., Cano-Gamez, E., Smyth, D.J., Rowan, W.C., Nakic, N., Esparza-Gordillo, J., Bossini-Castillo, L., Tough, D.F., Larminie, C.G., Bronson, P.G. et al. (2019) Chromatin activity at GWAS loci identifies T cell states driving complex immune diseases. Nat. Genet., 51, 1486–1493. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 48. Kelsey, G., Stegle, O. and Reik, W. (2017) Single-cell epigenomics: Recording the past and predicting the future. Science, 358, 69–75. [DOI] [PubMed] [Google Scholar]
  • 49. Ernst, J. and Kellis, M. (2015) Large-scale imputation of epigenomic datasets for systematic annotation of diverse human tissues. Nat. Biotechnol., 33, 364–376. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 50. Starita, L.M., Ahituv, N., Dunham, M.J., Kitzman, J.O., Roth, F.P., Seelig, G., Shendure, J. and Fowler, D.M. (2017) Variant interpretation: functional assays to the rescue. Am. J. Hum. Genet., 101, 315–325. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 51. Esposito, D., Weile, J., Shendure, J., Starita, L.M., Papenfuss, A.T., Roth, F.P., Fowler, D.M. and Rubin, A.F. (2019) MaveDB: an open-source platform to distribute and interpret data from multiplexed assays of variant effect. Genome Biol., 20, 1–11. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 52. Rubin, A.F., Min, J.K., Rollins, N.J., Da, E.Y., Esposito, D., Harrington, M., Stone, J., Bianchi, A.H., Fu, Y. and Gallaher, M. (2021) MaveDB v2: a curated community database with over three million variant effects from multiplexed functional assays. bioRxiv.
  • 53. Knott, G.J. and Doudna, J.A. (2018) CRISPR-Cas guides the future of genetic engineering. Science, 361, 866–869. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 54. Yang, H., Wang, H., Shivalila, C.S., Cheng, A.W., Shi, L. and Jaenisch, R. (2013) One-step generation of mice carrying reporter and conditional alleles by CRISPR/Cas-mediated genome engineering. Cell, 154, 1370–1379. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 55. Gilbert, L.A., Larson, M.H., Morsut, L., Liu, Z., Brar, G.A., Torres, S.E., Stern-Ginossar, N., Brandman, O., Whitehead, E.H. and Doudna, J.A. (2013) CRISPR-mediated modular RNA-guided regulation of transcription in eukaryotes. Cell, 154, 442–451. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 56. Maeder, M.L., Linder, S.J., Cascio, V.M., Fu, Y., Ho, Q.H. and Joung, J.K. (2013) CRISPR RNA–guided activation of endogenous human genes. Nat. Methods, 10, 977–979. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 57. Komor, A.C., Kim, Y.B., Packer, M.S., Zuris, J.A. and Liu, D.R. (2016) Programmable editing of a target base in genomic DNA without double-stranded DNA cleavage. Nature, 533, 420–424. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 58. Anzalone, A.V., Randolph, P.B., Davis, J.R., Sousa, A.A., Koblan, L.W., Levy, J.M., Chen, P.J., Wilson, C., Newby, G.A. and Raguram, A. (2019) Search-and-replace genome editing without double-strand breaks or donor DNA. Nature, 576, 149–157. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 59. Dixit, A., Parnas, O., Li, B., Chen, J., Fulco, C.P., Jerby-Arnon, L., Marjanovic, N.D., Dionne, D., Burks, T. and Raychowdhury, R. (2016) Perturb-Seq: dissecting molecular circuits with scalable single-cell RNA profiling of pooled genetic screens. Cell, 167, 1853–1866. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 60. Bock, C., Datlinger, P., Chardon, F., Coelho, M.A., Dong, M.B., Lawson, K.A., Lu, T., Maroc, L., Norman, T.M. and Song, B. (2022) High-content CRISPR screening. Nature Reviews Methods Primers, 2, 1–23. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 61. Shukla, A. and Huangfu, D. (2018) Decoding the noncoding genome via large-scale CRISPR screens. Curr. Opin. Genet. Dev., 52, 70–76. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 62. Canver, M.C., Smith, E.C., Sher, F., Pinello, L., Sanjana, N.E., Shalem, O., Chen, D.D., Schupp, P.G., Vinjamur, D.S. and Garcia, S.P. (2015) BCL11A enhancer dissection by Cas9-mediated in situ saturating mutagenesis. Nature, 527, 192–197. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 63. Korkmaz, G., Lopes, R., Ugalde, A.P., Nevedomskaya, E., Han, R., Myacheva, K., Zwart, W., Elkon, R. and Agami, R. (2016) Functional genetic screens for enhancer elements in the human genome using CRISPR-Cas9. Nat. Biotechnol., 34, 192–198. [DOI] [PubMed] [Google Scholar]
  • 64. Sanjana, N.E., Wright, J., Zheng, K., Shalem, O., Fontanillas, P., Joung, J., Cheng, C., Regev, A. and Zhang, F. (2016) High-resolution interrogation of functional elements in the noncoding genome. Science, 353, 1545–1549. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 65. Diao, Y., Fang, R., Li, B., Meng, Z., Yu, J., Qiu, Y., Lin, K.C., Huang, H., Liu, T. and Marina, R.J. (2017) A tiling-deletion-based genetic screen for cis-regulatory element identification in mammalian cells. Nat. Methods, 14, 629–635. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 66. Rajagopal, N., Srinivasan, S., Kooshesh, K., Guo, Y., Edwards, M.D., Banerjee, B., Syed, T., Emons, B.J., Gifford, D.K. and Sherwood, R.I. (2016) High-throughput mapping of regulatory DNA. Nat. Biotechnol., 34, 167–174. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 67. Xu, X. and Qi, L.S. (2019) A CRISPR–dCas toolbox for genetic engineering and synthetic biology. J. Mol. Biol., 431, 34–47. [DOI] [PubMed] [Google Scholar]
  • 68. Tanenbaum, M.E., Gilbert, L.A., Qi, L.S., Weissman, J.S. and Vale, R.D. (2014) A protein-tagging system for signal amplification in gene expression and fluorescence imaging. Cell, 159, 635–646. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 69. Hilton, I.B., D’ippolito, A.M., Vockley, C.M., Thakore, P.I., Crawford, G.E., Reddy, T.E. and Gersbach, C.A. (2015) Epigenome editing by a CRISPR-Cas9-based acetyltransferase activates genes from promoters and enhancers. Nat. Biotechnol., 33, 510–517. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 70. Konermann, S., Brigham, M.D., Trevino, A.E., Joung, J., Abudayyeh, O.O., Barcena, C., Hsu, P.D., Habib, N., Gootenberg, J.S. and Nishimasu, H. (2015) Genome-scale transcriptional activation by an engineered CRISPR-Cas9 complex. Nature, 517, 583–588. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 71. Datlinger, P., Rendeiro, A.F., Schmidl, C., Krausgruber, T., Traxler, P., Klughammer, J., Schuster, L.C., Kuchler, A., Alpar, D. and Bock, C. (2017) Pooled CRISPR screening with single-cell transcriptome readout. Nat. Methods, 14, 297–301. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 72. Jaitin, D.A., Weiner, A., Yofe, I., Lara-Astiaso, D., Keren-Shaul, H., David, E., Salame, T.M., Tanay, A., van Oudenaarden, A. and Amit, I. (2016) Dissecting immune circuits by linking CRISPR-pooled screens with single-cell RNA-seq. Cell, 167, 1883–1896. [DOI] [PubMed] [Google Scholar]
  • 73. Xie, S., Duan, J., Li, B., Zhou, P. and Hon, G.C. (2017) Multiplexed engineering and analysis of combinatorial enhancer activity in single cells. Mol. Cell, 66, 285–299. [DOI] [PubMed] [Google Scholar]
  • 74. Fulco, C.P., Nasser, J., Jones, T.R., Munson, G., Bergman, D.T., Subramanian, V., Grossman, S.R., Anyoha, R., Doughty, B.R. and Patwardhan, T.A. (2019) Activity-by-contact model of enhancer–promoter regulation from thousands of CRISPR perturbations. Nat. Genet., 51, 1664–1669. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 75. Fulco, C.P., Munschauer, M., Anyoha, R., Munson, G., Grossman, S.R., Perez, E.M., Kane, M., Cleary, B., Lander, E.S. and Engreitz, J.M. (2016) Systematic mapping of functional enhancer–promoter connections with CRISPR interference. Science, 354, 769–773. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 76. Gasperini, M., Hill, A.J., McFaline-Figueroa, J.L., Martin, B., Kim, S., Zhang, M.D., Jackson, D., Leith, A., Schreiber, J. and Noble, W.S. (2019) A genome-wide framework for mapping gene regulation via cellular genetic screens. Cell, 176, 377–390. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 77. Cai, P., Otten, A.B., Cheng, B., Ishii, M.A., Zhang, W., Huang, B., Qu, K. and Sun, B.K. (2020) A genome-wide long noncoding RNA CRISPRi screen identifies PRANCR as a novel regulator of epidermal homeostasis. Genome Res., 30, 22–34. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 78. Fei, T., Li, W., Peng, J., Xiao, T., Chen, C.-H., Wu, A., Huang, J., Zang, C., Liu, X.S. and Brown, M. (2019) Deciphering essential cistromes using genome-wide CRISPR screens. Proc. Natl. Acad. Sci., 116, 25186–25195. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 79. Lopes, R., Sprouffske, K., Sheng, C., Uijttewaal, E.C., Wesdorp, A.E., Dahinden, J., Wengert, S., Diaz-Miyar, J., Yildiz, U. and Bleu, M. (2021) Systematic dissection of transcriptional regulatory networks by genome-scale and single-cell CRISPR screens. Sci. Adv., 7, eabf5733. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 80. Stuart, W.D., Guo, M., Fink-Baldauf, I.M., Coleman, A.M., Clancy, J.P., Mall, M.A., Lim, F.-Y., Brewington, J.J. and Maeda, Y. (2020) CRISPRi-mediated functional analysis of lung disease-associated loci at non-coding regions. NAR Genom. Bioinform., 2, lqaa036. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 81. Dietlein, F., Wang, A.B., Fagre, C., Tang, A., Besselink, N.J., Cuppen, E., Li, C., Sunyaev, S.R., Neal, J.T. and Van Allen, E.M. (2022) Genome-wide analysis of somatic noncoding mutation patterns in cancer. Science, 376, eabg5601. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 82. Fu, Y., Foden, J.A., Khayter, C., Maeder, M.L., Reyon, D., Joung, J.K. and Sander, J.D. (2013) High-frequency off-target mutagenesis induced by CRISPR-Cas nucleases in human cells. Nat. Biotechnol., 31, 822–826. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 83. Bae, S., Park, J. and Kim, J.-S. (2014) Cas-OFFinder: a fast and versatile algorithm that searches for potential off-target sites of Cas9 RNA-guided endonucleases. Bioinformatics, 30, 1473–1475. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 84. Jacquin, A.L.S., Odom, D.T. and Lukk, M. (2019) Crisflash: open-source software to generate CRISPR guide RNAs against genomes annotated with individual variation. Bioinformatics, 35, 3146–3147. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 85. Hsu, P.D., Scott, D.A., Weinstein, J.A., Ran, F.A., Konermann, S., Agarwala, V., Li, Y., Fine, E.J., Wu, X., Shalem, O. et al. (2013) DNA targeting specificity of RNA-guided Cas9 nucleases. Nat. Biotechnol., 31, 827–832. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 86. Doench, J.G., Fusi, N., Sullender, M., Hegde, M., Vaimberg, E.W., Donovan, K.F., Smith, I., Tothova, Z., Wilen, C., Orchard, R. et al. (2016) Optimized sgRNA design to maximize activity and minimize off-target effects of CRISPR-Cas9. Nat. Biotechnol., 34, 184–191. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 87. Chuai, G., Ma, H., Yan, J., Chen, M., Hong, N., Xue, D., Zhou, C., Zhu, C., Chen, K., Duan, B. et al. (2018) DeepCRISPR: optimized CRISPR guide RNA design by deep learning. Genome Biol., 19, 80. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 88. Kuscu, C., Arslan, S., Singh, R., Thorpe, J. and Adli, M. (2014) Genome-wide analysis reveals characteristics of off-target sites bound by the Cas9 endonuclease. Nat. Biotechnol., 32, 677–683. [DOI] [PubMed] [Google Scholar]
  • 89. Kim, D., Bae, S., Park, J., Kim, E., Kim, S., Yu, H.R., Hwang, J., Kim, J.-I. and Kim, J.-S. (2015) Digenome-seq: genome-wide profiling of CRISPR-Cas9 off-target effects in human cells. Nat. Methods, 12, 237–243. [DOI] [PubMed] [Google Scholar]
  • 90. Cameron, P., Fuller, C.K., Donohoue, P.D., Jones, B.N., Thompson, M.S., Carter, M.M., Gradia, S., Vidal, B., Garner, E., Slorach, E.M. et al. (2017) Mapping the genomic landscape of CRISPR–Cas9 cleavage. Nat. Methods, 14, 600–606. [DOI] [PubMed] [Google Scholar]
  • 91. Koblan, L.W., Doman, J.L., Wilson, C., Levy, J.M., Tay, T., Newby, G.A., Maianti, J.P., Raguram, A. and Liu, D.R. (2018) Improving cytidine and adenine base editors by expression optimization and ancestral reconstruction. Nat. Biotechnol., 36, 843–846. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 92. Anzalone, A.V., Koblan, L.W. and Liu, D.R. (2020) Genome editing with CRISPR–Cas nucleases, base editors, transposases and prime editors. Nat. Biotechnol., 38, 824–844. [DOI] [PubMed] [Google Scholar]
  • 93. Huang, T.P., Newby, G.A. and Liu, D.R. (2021) Precision genome editing using cytosine and adenine base editors in mammalian cells. Nat. Protoc., 16, 1089–1128. [DOI] [PubMed] [Google Scholar]
  • 94. Chen, P.J., Hussmann, J.A., Yan, J., Knipping, F., Ravisankar, P., Chen, P.-F., Chen, C., Nelson, J.W., Newby, G.A., Sahin, M. et al. (2021) Enhanced prime editing systems by manipulating cellular determinants of editing outcomes. Cell, 184, 5635–5652.e29. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 95. Erwood, S., Bily, T.M.I., Lequyer, J., Yan, J., Gulati, N., Brewer, R.A., Zhou, L., Pelletier, L., Ivakine, E.A. and Cohn, R.D. (2022) Saturation variant interpretation using CRISPR prime editing. Nat. Biotechnol., 40, 885–895. [DOI] [PubMed] [Google Scholar]
  • 96. Inoue, F. and Ahituv, N. (2015) Decoding enhancers using massively parallel reporter assays. Genomics, 106, 159–164. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 97. Patwardhan, R.P., Lee, C., Litvin, O., Young, D.L., Pe’er, D. and Shendure, J. (2009) High-resolution analysis of DNA regulatory elements by synthetic saturation mutagenesis. Nat. Biotechnol., 27, 1173–1175. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 98. Patwardhan, R.P., Hiatt, J.B., Witten, D.M., Kim, M.J., Smith, R.P., May, D., Lee, C., Andrie, J.M., Lee, S.-I. and Cooper, G.M. (2012) Massively parallel functional dissection of mammalian enhancers in vivo. Nat. Biotechnol., 30, 265. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 99. Melnikov, A., Murugan, A., Zhang, X., Tesileanu, T., Wang, L., Rogov, P., Feizi, S., Gnirke, A., Callan, C.G. and Kinney, J.B. (2012) Systematic dissection and optimization of inducible enhancers in human cells using a massively parallel reporter assay. Nat. Biotechnol., 30, 271–277. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 100. van Arensbergen, J., FitzPatrick, V.D., de Haas, M., Pagie, L., Sluimer, J., Bussemaker, H.J. and van Steensel, B. (2017) Genome-wide mapping of autonomous promoter activity in human cells. Nat. Biotechnol., 35, 145–153. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 101. van Arensbergen, J., Pagie, L., FitzPatrick, V.D., de Haas, M., Baltissen, M.P., Comoglio, F., van der Weide, R.H., Teunissen, H., Võsa, U. and Franke, L. (2019) High-throughput identification of human SNPs affecting regulatory element activity. Nat. Genet., 51, 1160–1169. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 102. Vockley, C.M., Guo, C., Majoros, W.H., Nodzenski, M., Scholtens, D.M., Hayes, M.G., Lowe, W.L. and Reddy, T.E. (2015) Massively parallel quantification of the regulatory effects of noncoding genetic variation in a human cohort. Genome Res., 25, 1206–1214. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 103. Vanhille, L., Griffon, A., Maqbool, M.A., Zacarias-Cabeza, J., Dao, L.T., Fernandez, N., Ballester, B., Andrau, J.C. and Spicuglia, S. (2015) High-throughput and quantitative assessment of enhancer activity in mammals by CapStarr-seq. Nat. Commun., 6, 1–10. [DOI] [PubMed] [Google Scholar]
  • 104. Shen, S.Q., Myers, C.A., Hughes, A.E., Byrne, L.C., Flannery, J.G. and Corbo, J.C. (2016) Massively parallel cis-regulatory analysis in the mammalian central nervous system. Genome Res., 26, 238–255. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 105. Inoue, F., Kircher, M., Martin, B., Cooper, G.M., Witten, D.M., McManus, M.T., Ahituv, N. and Shendure, J. (2017) A systematic comparison reveals substantial differences in chromosomal versus episomal encoding of enhancer activity. Genome Res., 27, 38–52. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 106. Davis, J.E., Insigne, K.D., Jones, E.M., Hastings, Q.A., Boldridge, W.C. and Kosuri, S. (2020) Dissection of c-AMP response element architecture by using genomic and episomal massively parallel reporter assays. Cell Systems, 11, 75–85. [DOI] [PubMed] [Google Scholar]
  • 107. Jayavelu, N.D., Jajodia, A., Mishra, A. and Hawkins, R.D. (2020) Candidate silencer elements for the human and mouse genomes. Nat. Commun., 11, 1–15. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 108. Sharon, E., Kalma, Y., Sharp, A., Raveh-Sadka, T., Levo, M., Zeevi, D., Keren, L., Yakhini, Z., Weinberger, A. and Segal, E. (2012) Inferring gene regulatory logic from high-throughput measurements of thousands of systematically designed promoters. Nat. Biotechnol., 30, 521–530. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 109. Lambert, J.T., Su-Feher, L., Cichewicz, K., Warren, T.L., Zdilar, I., Wang, Y., Lim, K.J., Haigh, J.L., Morse, S.J. and Canales, C.P. (2021) Parallel functional testing identifies enhancers active in early postnatal mouse brain. eLife, 10, e69479. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 110. Maricque, B.B., Dougherty, J.D. and Cohen, B.A. (2017) A genome-integrated massively parallel reporter assay reveals DNA sequence determinants of cis-regulatory activity in neural cells. Nucleic Acids Res., 45, e16–e16. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 111. Inoue, F., Kreimer, A., Ashuach, T., Ahituv, N. and Yosef, N. (2019) Identification and massively parallel characterization of regulatory elements driving neural induction. Cell Stem Cell, 25, 713–727. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 112. Kreimer, A., Ashuach, T., Inoue, F., Khodaverdian, A., Deng, C., Yosef, N. and Ahituv, N. (2022) Massively parallel reporter perturbation assays uncover temporal regulatory architecture during neural differentiation. Nat. Commun., 13, 1–15. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 113. White, M.A., Myers, C.A., Corbo, J.C. and Cohen, B.A. (2013) Massively parallel in vivo enhancer assay reveals that highly local features determine the cis-regulatory function of ChIP-seq peaks. Proc. Natl. Acad. Sci., 110, 11952–11957. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 114. Kircher, M., Xiong, C., Martin, B., Schubach, M., Inoue, F., Bell, R.J., Costello, J.F., Shendure, J. and Ahituv, N. (2019) Saturation mutagenesis of twenty disease-associated regulatory elements at single base-pair resolution. Nat. Commun., 10, 1–15. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 115. Ernst, J., Melnikov, A., Zhang, X., Wang, L., Rogov, P., Mikkelsen, T.S. and Kellis, M. (2016) Genome-scale high-resolution mapping of activating and repressive nucleotides in regulatory regions. Nat. Biotechnol., 34, 1180–1190. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 116. Rosenberg, A.B., Patwardhan, R.P., Shendure, J. and Seelig, G. (2015) Learning the sequence determinants of alternative splicing from millions of random sequences. Cell, 163, 698–711. [DOI] [PubMed] [Google Scholar]
  • 117. Cheung, R., Insigne, K.D., Yao, D., Burghard, C.P., Wang, J., Hsiao, Y.-H.E., Jones, E.M., Goodman, D.B., Xiao, X. and Kosuri, S. (2019) A multiplexed assay for exon recognition reveals that an unappreciated fraction of rare genetic variants cause large-effect splicing disruptions. Mol. Cell, 73, 183–194. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 118. Rabani, M., Pieper, L., Chew, G.-L. and Schier, A.F. (2017) A massively parallel reporter assay of 3′ UTR sequences identifies in vivo rules for mRNA degradation. Mol. Cell, 68, 1083–1094. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 119. Litterman, A.J., Kageyama, R., Le Tonqueze, O., Zhao, W., Gagnon, J.D., Goodarzi, H., Erle, D.J. and Ansel, K.M. (2019) A massively parallel 3′ UTR reporter assay reveals relationships between nucleotide content, sequence conservation, and mRNA destabilization. Genome Res., 29, 896–906. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 120. Matreyek, K.A., Starita, L.M., Stephany, J.J., Martin, B., Chiasson, M.A., Gray, V.E., Kircher, M., Khechaduri, A., Dines, J.N., Hause, R.J. et al. (2018) Multiplex assessment of protein variant abundance by massively parallel sequencing. Nat. Genet., 50, 874–882. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 121. Safra, M., Nir, R., Farouq, D., Slutskin, I.V. and Schwartz, S. (2017) TRUB1 is the predominant pseudouridine synthase acting on mammalian mRNA via a predictable and conserved code. Genome Res., 27, 393–406. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 122. Shukla, C.J., McCorkindale, A.L., Gerhardinger, C., Korthauer, K.D., Cabili, M.N., Shechner, D.M., Irizarry, R.A., Maass, P.G. and Rinn, J.L. (2018) High-throughput identification of RNA nuclear enrichment sequences. EMBO J., 37, e98452. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 123. Sample, P.J., Wang, B., Reid, D.W., Presnyak, V., McFadyen, I.J., Morris, D.R. and Seelig, G. (2019) Human 5′ UTR design and variant effect prediction from a massively parallel translation assay. Nat. Biotechnol., 37, 803–809. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 124. Mogno, I., Kwasnieski, J.C. and Cohen, B.A. (2013) Massively parallel synthetic promoter assays reveal the in vivo effects of binding site variants. Genome Res., 23, 1908–1915. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 125. Smith, R.P., Taher, L., Patwardhan, R.P., Kim, M.J., Inoue, F., Shendure, J., Ovcharenko, I. and Ahituv, N. (2013) Massively parallel decoding of mammalian regulatory sequences supports a flexible organizational model. Nat. Genet., 45, 1021–1028. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 126. King, D.M., Hong, C.K.Y., Shepherdson, J.L., Granas, D.M., Maricque, B.B. and Cohen, B.A. (2020) Synthetic and genomic regulatory elements reveal aspects of cis-regulatory grammar in mouse embryonic stem cells. eLife, 9, e41279. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 127. Tewhey, R., Kotliar, D., Park, D.S., Liu, B., Winnicki, S., Reilly, S.K., Andersen, K.G., Mikkelsen, T.S., Lander, E.S. and Schaffner, S.F. (2016) Direct identification of hundreds of expression-modulating variants using a multiplexed reporter assay. Cell, 165, 1519–1529. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 128. Abell, N.S., DeGorter, M.K., Gloudemans, M.J., Greenwald, E., Smith, K.S., He, Z. and Montgomery, S.B. (2022) Multiple causal variants underlie genetic associations in humans. Science, 375, 1247–1254. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 129. Ulirsch, J.C., Nandakumar, S.K., Wang, L., Giani, F.C., Zhang, X., Rogov, P., Melnikov, A., McDonel, P., Do, R. and Mikkelsen, T.S. (2016) Systematic functional dissection of common genetic variation affecting red blood cell traits. Cell, 165, 1530–1545. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 130. Liu, S., Liu, Y., Zhang, Q., Wu, J., Liang, J., Yu, S., Wei, G.-H., White, K.P. and Wang, X. (2017) Systematic identification of regulatory variants associated with cancer risk. Genome Biol., 18, 1–14. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 131. Choi, J., Zhang, T., Vu, A., Ablain, J., Makowski, M.M., Colli, L.M., Xu, M., Hennessey, R.C., Yin, J., Rothschild, H. et al. (2020) Massively parallel reporter assays of melanoma risk variants identify MX2 as a gene promoting melanoma. Nat. Commun., 11, 1–16. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 132. Ajore, R., Niroula, A., Pertesi, M., Cafaro, C., Thodberg, M., Went, M., Bao, E.L., Duran-Lozano, L., de Lapuente, L., Portilla, A. and Olafsdottir, T. (2022) Functional dissection of inherited non-coding variation influencing multiple myeloma risk. Nat. Commun., 13, 1–15. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 133. Joslin, A.C., Sobreira, D.R., Hansen, G.T., Sakabe, N.J., Aneas, I., Montefiori, L.E., Farris, K.M., Gu, J., Lehman, D.M. and Ober, C. (2021) A functional genomics pipeline identifies pleiotropy and cross-tissue effects within obesity-associated GWAS loci. Nat. Commun., 12, 1–15. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 134. Khetan, S., Kales, S., Kursawe, R., Jillette, A., Ulirsch, J.C., Reilly, S.K., Ucar, D., Tewhey, R. and Stitzel, M.L. (2021) Functional characterization of T2D-associated SNP effects on baseline and ER stress-responsive β cell transcriptional activation. Nat. Commun., 12, 1–13. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 135. Klein, J.C., Keith, A., Rice, S.J., Shepherd, C., Agarwal, V., Loughlin, J. and Shendure, J. (2019) Functional testing of thousands of osteoarthritis-associated variants for regulatory activity. Nat. Commun., 10, 1–9. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 136. Castaldi, P.J., Guo, F., Qiao, D., Du, F., Naing, Z.Z.C., Li, Y., Pham, B., Mikkelsen, T.S., Cho, M.H., Silverman, E.K. et al. (2019) Identification of functional variants in the FAM13A chronic obstructive pulmonary disease genome-wide association study locus by massively parallel reporter assays. Am. J. Respir. Crit. Care Med., 199, 52–61. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 137. Lu, X., Chen, X., Forney, C., Donmez, O., Miller, D., Parameswaran, S., Hong, T., Huang, Y., Pujato, M., Cazares, T. et al. (2021) Global discovery of lupus genetic risk variant allelic enhancer activity. Nat. Commun., 12, 1–13. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 138. Myint, L., Wang, R., Boukas, L., Hansen, K.D., Goff, L.A. and Avramopoulos, D. (2020) A screen of 1,049 schizophrenia and 30 Alzheimer’s-associated variants for regulatory potential. Am. J. Med. Genet. B Neuropsychiatr. Genet., 183, 61–73. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 139. Matoba, N., Liang, D., Sun, H., Aygün, N., McAfee, J.C., Davis, J.E., Raffield, L.M., Qian, H., Piven, J. and Li, Y. (2020) Common genetic risk variants identified in the SPARK cohort support DDHD2 as a candidate risk gene for autism. Transl. Psychiatry, 10, 1–14. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 140. Cooper, Y.A., Teyssier, N., Dräger, N.M., Guo, Q., Davis, J.E., Sattler, S.M., Yang, Z., Patel, A., Wu, S., Kosuri, S. et al. (2022) Functional regulatory variants implicate distinct transcriptional networks in dementia. Science, 377, eabi8654. [DOI] [PubMed] [Google Scholar]
  • 141. Doan, R.N., Bae, B.-I., Cubelos, B., Chang, C., Hossain, A.A., Al-Saad, S., Mukaddes, N.M., Oner, O., Al-Saffar, M. and Balkhy, S. (2016) Mutations in human accelerated regions disrupt cognition and social behavior. Cell, 167, 341–354. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 142. Uebbing, S., Gockley, J., Reilly, S.K., Kocher, A.A., Geller, E., Gandotra, N., Scharfe, C., Cotney, J. and Noonan, J.P. (2021) Massively parallel discovery of human-specific substitutions that alter enhancer activity. Proc. Natl. Acad. Sci. U.S.A., 118, e2007049118. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 143. Weiss, C.V., Harshman, L., Inoue, F., Fraser, H.B., Petrov, D.A., Ahituv, N. and Gokhman, D. (2021) The cis-regulatory effects of modern human-specific variants. eLife, 10, e63713. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 144. Jagoda, E., Xue, J.R., Reilly, S.K., Dannemann, M., Racimo, F., Huerta-Sanchez, E., Sankararaman, S., Kelso, J., Pagani, L. and Sabeti, P.C. (2022) Detection of Neanderthal Adaptively Introgressed Genetic Variants That Modulate Reporter Gene Expression in Human Immune Cells. Mol. Biol. Evol., 39, msab304. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 145. Kheradpour, P., Ernst, J., Melnikov, A., Rogov, P., Wang, L., Zhang, X., Alston, J., Mikkelsen, T.S. and Kellis, M. (2013) Systematic dissection of regulatory motifs in 2000 predicted human enhancers using a massively parallel reporter assay. Genome Res., 23, 800–811. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 146. Grossman, S.R., Zhang, X., Wang, L., Engreitz, J., Melnikov, A., Rogov, P., Tewhey, R., Isakova, A., Deplancke, B. and Bernstein, B.E. (2017) Systematic dissection of genomic features determining transcription factor binding and enhancer function. Proc. Natl. Acad. Sci., 114, E1291–E1300. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 147. Klein, J.C., Agarwal, V., Inoue, F., Keith, A., Martin, B., Kircher, M., Ahituv, N. and Shendure, J. (2020) A systematic evaluation of the design and context dependencies of massively parallel reporter assays. Nat. Methods, 17, 1083–1091. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 148. The ENCODE Project Consortium, Moore, J.E., Purcaro, M.J., Pratt, H.E., Epstein, C.B., Shoresh, N., Adrian, J., Kawli, T., Davis, C.A., Dobin, A. et al. (2020) Expanded encyclopaedias of DNA elements in the human and mouse genomes. Nature, 583, 699–710. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 149. Melnikov, A., Zhang, X., Rogov, P., Wang, L. and Mikkelsen, T.S. (2014) Massively parallel reporter assays in cultured mammalian cells. J. Vis. Exp., (90), p.e51719. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 150. Mulvey, B., Lagunas, T. and Dougherty, J.D. (2020) Massively Parallel Reporter Assays: Defining Functional Psychiatric Genetic Variants across Biological Contexts. Biol. Psychiatry, 89(1), 76–89. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 151. Ray, J.P., de Boer, C.G., Fulco, C.P., Lareau, C.A., Kanai, M., Ulirsch, J.C., Tewhey, R., Ludwig, L.S., Reilly, S.K. and Bergman, D.T. (2020) Prioritizing disease and trait causal variants at the TNFAIP3 locus using functional and genomic features. Nat. Commun., 11, 1–13. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 152. Calderon, D., Ellis, A., Daza, R.M., Martin, B., Tome, J.M., Chen, W., Chardon, F.M., Leith, A., Lee, C. and Trapnell, C. (2020) TransMPRA: A framework for assaying the role of many trans-acting factors at many enhancers. bioRxiv.
  • 153. Nicholls, H.L., John, C.R., Watson, D.S., Munroe, P.B., Barnes, M.R. and Cabrera, C.P. (2020) Reaching the end-game for GWAS: machine learning approaches for the prioritization of complex disease loci. Front. Genet., 11, 350. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 154. Schipper, M. and Posthuma, D. (2022) Demystifying non-coding GWAS variants: an overview of computational tools and methods. Hum. Mol. Genet., ddac198. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 155. Ghosh, R., Oak, N. and Plon, S.E. (2017) Evaluation of in silico algorithms for use with ACMG/AMP clinical variant interpretation guidelines. Genome Biol., 18, 225. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 156. Dong, S. and Boyle, A.P. (2022) Prioritization of regulatory variants with tissue-specific function in the non-coding regions of human genome. Nucleic Acids Res., 50, e6–e6. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 157. Nishizaki, S.S. and Boyle, A.P. (2017) Mining the unknown: assigning function to noncoding single nucleotide polymorphisms. Trends Genet., 33, 34–45. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 158. Beer, M.A. (2017) Predicting enhancer activity and variant impact using gkm-SVM. Hum. Mutat., 38, 1251–1258. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 159. Movva, R., Greenside, P., Marinov, G.K., Nair, S., Shrikumar, A. and Kundaje, A. (2019) Deciphering regulatory DNA sequences and noncoding genetic variants using neural network models of massively parallel reporter assays. PLoS One, 14, e0218073. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 160. Nasser, J., Bergman, D.T., Fulco, C.P., Guckelberger, P., Doughty, B.R., Patwardhan, T.A., Jones, T.R., Nguyen, T.H., Ulirsch, J.C. and Lekschas, F. (2021) Genome-wide enhancer maps link risk variants to disease genes. Nature, 593, 238–243. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 161. Akbarian, S., Liu, C., Knowles, J.A., Vaccarino, F.M., Farnham, P.J., Crawford, G.E., Jaffe, A.E., Pinto, D., Dracheva, S., Geschwind, D.H. et al. (2015) The PsychENCODE project. Nat. Neurosci., 18, 1707–1712. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 162. Havdahl, A., Niarchou, M., Starnawska, A., Uddin, M., van der Merwe, C. and Warrier, V. (2021) Genetic contributions to autism spectrum disorder. Psychol. Med., 51, 2260–2273. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 163. Ruzzo, E.K., Pérez-Cano, L., Jung, J.-Y., Wang, L., Kashef-Haghighi, D., Hartl, C., Singh, C., Xu, J., Hoekstra, J.N., Leventhal, O. et al. (2019) Inherited and De Novo Genetic Risk for Autism Impacts Shared Networks. Cell, 178, 850–866.e26. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 164. Satterstrom, F.K., Kosmicki, J.A., Wang, J., Breen, M.S., De Rubeis, S., An, J.-Y., Peng, M., Collins, R., Grove, J., Klei, L. et al. (2020) Large-Scale Exome Sequencing Study Implicates Both Developmental and Functional Changes in the Neurobiology of Autism. Cell, 180, 568–584.e23. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 165. Parikshak, N.N., Luo, R., Zhang, A., Won, H., Lowe, J.K., Chandran, V., Horvath, S. and Geschwind, D.H. (2013) Integrative Functional Genomic Analyses Implicate Specific Molecular Pathways and Circuits in Autism. Cell, 155, 1008–1021. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 166. Ramaswami, G., Won, H., Gandal, M.J., Haney, J., Wang, J.C., Wong, C.C.Y., Sun, W., Prabhakar, S., Mill, J. and Geschwind, D.H. (2020) Integrative genomics identifies a convergent molecular subtype that links epigenomic with transcriptomic differences in autism. Nat. Commun., 11, 4873. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 167. Yan, J., Qiu, Y., Ribeiro dos Santos, A.M., Yin, Y., Li, Y.E., Vinckier, N., Nariai, N., Benaglio, P., Raman, A., Li, X. et al. (2021) Systematic analysis of binding of transcription factors to noncoding variants. Nature, 591, 147–151. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 168. Zhao, Y., Wu, D., Jiang, D., Zhang, X., Wu, T., Cui, J., Qian, M., Zhao, J., Oesterreich, S., Sun, W. et al. (2020) A sequential methodology for the rapid identification and characterization of breast cancer-associated functional SNPs. Nat. Commun., 11, 3340. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 169. Walker, R.L., Ramaswami, G., Hartl, C., Mancuso, N., Gandal, M.J., de la Torre-Ubieta, L., Pasaniuc, B., Stein, J.L. and Geschwind, D.H. (2019) Genetic Control of Expression and Splicing in Developing Human Brain Informs Disease Mechanisms. Cell, 179, 750–771.e22. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 170. Tian, R., Gachechiladze, M.A., Ludwig, C.H., Laurie, M.T., Hong, J.Y., Nathaniel, D., Prabhu, A.V., Fernandopulle, M.S., Patel, R., Abshari, M. et al. (2019) CRISPR Interference-Based Platform for Multimodal Genetic Screens in Human iPSC-Derived Neurons. Neuron, 104, 239–255.e12. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 171. Wheeler, E.C., Vu, A.Q., Einstein, J.M., DiSalvo, M., Ahmed, N., Van Nostrand, E.L., Shishkin, A.A., Jin, W., Allbritton, N.L. and Yeo, G.W. (2020) Pooled CRISPR screens with imaging on microraft arrays reveals stress granule-regulatory factors. Nat. Methods, 17, 636–642. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 172. Feldman, D., Singh, A., Schmid-Burgk, J.L., Carlson, R.J., Mezger, A., Garrity, A.J., Zhang, F. and Blainey, P.C. (2019) Optical Pooled Screens in Human Cells. Cell, 179, 787–799.e17. [DOI] [PMC free article] [PubMed] [Google Scholar]

Articles from Human Molecular Genetics are provided here courtesy of Oxford University Press
