Table 2:
Method Family | Method | Parallel Methods | Variant Selection Criteria | Targets | Context | Cell Lines | Cell Type | Application | Limitations | Secondary Validation | Source |
---|---|---|---|---|---|---|---|---|---|---|---|
MPRAs | Mutagenesis | NA | Random nucleotide substitutions in enhancers at a rate of 10% per position. | 27,000 | Episomal | HEK293T | Human kidney | Direct comparison of hundreds of thousands of putative regulatory sequences in a single cell culture. | Depends on (1) careful design of the sequence library, (2) minimization of artifacts during amplification and cloning, (3) high transfection efficiency, and (4) sufficient power to detect transcriptional shifts. | NA | Melnikov et al. 2012 |
MPRAs | Mutagenesis | NA | Tested 2,104 WT sequences & 3,314 engineered variants with motif disruptions. | >5,000 | Episomal | HepG2, K562 | Carcinoma LL | (1) Manipulations of a large number of enhancers and disruptions for individual cis-regulatory motifs. (2) Well-suited to systematic testing of pairs or sets of elements, and de novo enhancer design. | (1) Unable to determine relative contribution of chromatin vs. primary sequence information. (2) Focused on distal enhancers at least 2 kb from any annotated TSS, for the SV40 promoter region only. | Luciferase validation | Kheradpour et al. 2013 |
MPRAs | Mutagenesis | NA | 20 promoters/enhancers (600 bp loci) | 30,000 SNPs/ | Episomal | HepG2 | Liver | (1) Scaled saturation mutagenesis to measure regulatory consequences of tens of thousands of regulatory elements. (2) Longer sequences than are typical for MPRAs, up to 600 bp, to provide more context. | (1) Limited with respect to context, both cis and trans. (2) Reproducibility of measurements for elements with lower basal activity. | RNAi | Kircher et al. 2019 |
MPRAs | GWAS-based MPRA | NA | Selection of variants in high LD with 75 GWAS hits from red blood cell (RBC) traits | 2,756 SNPs | Episomal | K562 | Erythroid | Identified 32 functional variants representing 23 of the original 75 GWAS hits | (1) Not configured to detect functional variants in haplotypes that may be jointly causal and fall within more than one regulatory element. (2) Primarily useful as a screen to reduce the set of leads. | Cas9 genome editing | Ulirsch et al. 2016 |
MPRAs | GWAS-based MPRA | NA | 1,049 SZ and 30 AD variants in high LD with lead SNPs from 64 and 9 genome-wide significant loci, respectively | 1,228 SNPs | Episomal | K562, SH-SY5Y | LCLs | Identified 148 variants showing allelic differences in K562 and 53 in SH-SY5Y cells | (1) High potential for false negatives due to lack of native context. (2) False positives possible if the regulatory effect on the reporter gene comes from other parts of the construct or if the variant resides in closed chromatin. | NA | Myint et al. 2020 |
MPRAs | eQTL-based MPRA | NA | Candidate eQTLs from an RNA-seq dataset of lymphoblastoid cell lines (LCLs) | 3,642 cis-eQTLs | Episomal | HepG2, NA12878, NA19239 | Liver, LCLs | (1) Identified 842 eQTLs with a significant transcriptional shift between alleles. (2) Provides a discovery tool for linking a genetic locus to a phenotype. | (1) Cannot test for causality. (2) Endogenously silenced sequences explain a proportion of reported active sequences. (3) Positive predictive value of 34%–68%. | CRISPR-mediated allelic replacement | Tewhey et al. 2016 |
MPRAs | LentiMPRA | RNA-seq, ATAC-seq, H3K27ac ChIP-seq | Identified by RNA-seq/ATAC-seq/H3K27ac ChIP-seq based on genes involved in neural differentiation | ~2,300 | Chromosomally integrated | HepG2, hESC | Liver, hNPCs | Functional characterization of >1,500 temporal enhancers. (1) LentiMPRA can be used in an episomal or integrated context; (2) can be used in a wide variety of cell types; (3) numerous barcodes per variant; and (4) extensive predictive modeling. | (1) Even as an integrated reporter assay, each tested element is removed from its native sequence location and epigenetic context. | CRISPRi | Inoue et al. 2017, 2019 |
MPRAs | SuRE MPRA | DNase-seq, ATAC-seq, H3K27ac | Randomly generated two SuRE libraries of ~300 million random fragment-barcode pairs | 5.9 million SNPs | Episomal | K562, HepG2 | Erythroid, Liver | (1) Increased traditional MPRA scale by >100-fold. (2) Provides a resource to help identify causal SNPs among candidates generated by GWAS and eQTL studies. | (1) Random SNPs assayed outside of endogenous context. (2) Power to detect a transcriptional shift may be limited by the number of barcodes per fragment. | CRISPR/Cas9 SNP editing | van Arensbergen et al. 2019 |
MPRAs | Two-stage MPRA screen | NA | Random genome-wide DNA fragments | 32,776 substitutions | Episomal | hESC | hNSCs | (1) MPRA in human neural stem cells. (2) Identified 532 HARs and HGEs with human-specific changes in enhancer activity in human neural stem cells. | (1) Effects were modest and lacked genomic context. | CRISPRi enhancer validation | Uebbing et al. 2020 |
CRISPR | Multiplexed CRISPRi, eQTL-inspired framework | scRNA-seq | Top 5,000 intergenic open chromatin regions in K562 cells | 5,920 | Endogenous | K562 | LCLs | Identified 664 cis enhancer-gene pairs enriched for specific transcription factors, non-housekeeping status, and genomic and 3D conformational proximity to their target genes | (1) Not all enhancers are susceptible to perturbation; (2) variable degree of gRNA targeting ability; (3) enhancers may be required for initial establishment rather than maintenance; (4) not a comprehensive survey of the noncoding landscape. | CRISPRi singleton experiments | Gasperini et al. 2019 |
CRISPR | Genome-wide CRISPRa screen | NA | Library targeting all computationally predicted TFs & other DNA-binding factors (TRANSFAC) | 2,428 | Endogenous | CamES | Neuron | (1) Systematically identify transcription factors that efficiently promote neuronal fate from ESCs. (2) Generated a quantitative GI map for the neuronal fate decision by pairwise activation of core neuronal-inducing factors. | (1) Scalability of the CRISPRa screens | Flow cytometry and cDNA expression | Liu et al. 2019 |
CRISPR | CRISPRi-FlowFISH | ChIP-seq, Hi-C | Selected all DNase I hypersensitive (DHS) elements in K562 cells within 450 kb of 30 genes in five genomic regions | 4,662 | Hi-C and ChIP-seq | K562 | LCLs | (1) Tests noncoding regulatory elements by mapping and modeling promoter–promoter regulation, functions of CTCF sites, and combinatorial effects. (2) Potential application to any gene. (3) Method uses endogenous genes, allowing candidate target genes to be identified. | (1) Does not profile effects of intronic enhancers; (2) performance may decrease for weakly expressed genes. | ABC prediction model | Fulco et al. 2019 |
CRISPR | Pooled CRISPR screens | Parallel screens of enhancers, genes, & genomic background | Identified from H3K27ac datasets from human cortex, developing cortex, limb, embryonic stem cells, and adult tissues | 10,674 genes & 2,227 enhancers | Endogenous | H9 hESCs | hNSCs | (1) Probed gene disruptions affecting proliferation in a model of human corticogenesis and their associations with neurodevelopmental disease | (1) sgRNA-Cas9 screening uses the original Cas9 method, which produces insertions, deletions, and substitutions that alter the DNA directly rather than inhibiting or activating expression or enhancer activity. | Confirm enhancer-gene interaction using Hi-C data | Geller et al. 2019 |
CRISPR | Pooled genome-wide & CRISPRi screens | CROP-seq, longitudinal imaging | CRISPRi v2 H1 library with top 5 sgRNAs per gene (Horlbeck et al., 2016) | 18,905 genes | Endogenous | hiPSCs | Neuron | (1) Identified distinct neuronal roles for ubiquitous genes; (2) an inducible and reversible method enabling the time-resolved dissection of human gene function; (3) perturbs gene function via partial knockdown; (4) longitudinal imaging provides a timeline of toxicity and reveals gene-specific temporal patterns. | (1) Scalability limited by use of lentivirus; synthetic sgRNAs in arrayed CRISPRi screens would increase scalability. (2) False-positive phenotypes possible due to interference with the differentiation process. | Secondary CRISPR screens, CROP-seq | Tian et al. 2019 |
CRISPR | Pooled CRISPRa/i screen | CROP-seq | (1) Genome-wide survival-based screen. (2) Secondary CRISPR screens. (3) CROP-seq | Genome-wide and targeted screens > 1000 genes | Endogenous | hiPSCs | Neuron | (1) The first genome-wide CRISPRa/i screens in human neurons. (2) Uncovered neuron-response pathways to chronic oxidative stress implicated in neurodegenerative disease. (3) Established CRISPRbrain resource to compare gene function across human cell types. | (1) CRISPRa/i screening protocol may confound results related to oxidative stress, as gRNA integration could affect cellular stress. | Lipidomics in KO neurons | Tian et al. 2020 |
CC = cervical cancer cells, hESCs = human embryonic stem cells, hiPSCs = human induced pluripotent stem cells, LCLs = lymphoblastoid cell lines, ML = myelogenous leukemia cells, hNSCs = human neural stem cells, NPCs = neuronal progenitor cells, CREST-seq = “cis-regulatory element scan by tiling-deletion and sequencing”