Author manuscript; available in PMC: 2017 Jul 31.
Published in final edited form as: ACS Synth Biol. 2017 Mar 7;6(6):1034–1042. doi: 10.1021/acssynbio.6b00358

A Scalable Epitope Tagging Approach for High Throughput ChIP-Seq Analysis

Xiong Xiong, Yanxiao Zhang, Jian Yan ‡,#, Surbhi Jain §, Sora Chee ||, Bing Ren ‡,||,*, Huimin Zhao †,⊥,*
PMCID: PMC5536957  NIHMSID: NIHMS879533  PMID: 28215080

Abstract

Eukaryotic transcription factors (TFs) typically recognize short genomic sequences, alone or together with other proteins, to modulate gene expression. Mapping TF-DNA interactions in the genome is crucial for understanding the gene regulatory programs in cells. While chromatin immunoprecipitation followed by sequencing (ChIP-Seq) is commonly used for this purpose, its application is severely limited by the availability of suitable antibodies for TFs. To overcome this limitation, we developed an efficient and scalable strategy named cmChIP-Seq that combines the clustered regularly interspaced short palindromic repeats (CRISPR) technology with microhomology mediated end joining (MMEJ) to genetically engineer a TF with an epitope tag. We demonstrated the utility of this tool by applying it to four TFs in a human colorectal cancer cell line. The highly scalable procedure makes this strategy well suited to ChIP-Seq analysis of TFs in diverse species and cell types.

Keywords: ChIP-Seq, microhomology mediated end joining, CRISPR/Cas9, genome engineering, FLAG tagging



Genome-wide profiling of TF-DNA interactions is crucial for dissecting the transcriptional regulation networks that govern the spatiotemporal expression of genes in an organism.1–3 Chromatin immunoprecipitation followed by sequencing (ChIP-Seq) is the most common method for this purpose.4–6 However, this technique is limited by the availability of ChIP-grade antibodies against transcription factors.7,8 To address this bottleneck, bacterial artificial chromosome (BAC) vectors have been employed to introduce epitope tags into transcription factors, so that ChIP-Seq experiments can be performed with highly specific antibodies against the epitopes.9 However, genetic engineering of BACs can be time-consuming and laborious.10 Alternatively, recombinant adeno-associated virus (rAAV) has been employed as a delivery vehicle for a knock-in (KI) vector to insert the epitope tag and overcome low recombination efficiency.11,12 More recently, a method was reported that directly modifies an endogenous locus via CRISPR/Cas9 mediated homologous recombination (HR).13 All these approaches suffer from low scalability because assembling BAC vectors or constructing long HR arms is laborious, costly and time-consuming.

To circumvent these difficulties, we have developed a highly scalable approach for epitope tagging of transcription factors in cultured cells. Specifically, we combined the clustered regularly interspaced short palindromic repeats (CRISPR)14,15 technology with microhomology mediated end joining (MMEJ)16 to insert a 3×FLAG tag with screening markers at the C-terminus of a target TF. Recently, microhomology has been used to predict nuclease target sites that allow efficient gene disruption.17 Suzuki and co-workers developed an MMEJ-assisted KI method that has been applied to a variety of systems, ranging from cell lines such as HEK293T, HeLa and CHO-K1 to silkworm, zebrafish and frog.18 The advantages of this method include the ease of vector construction and a reasonable efficiency of precise gene editing, which could reach 85% in certain organisms. In addition, compared with the HR mediated integration method, MMEJ-assisted KI was accompanied by improved colony-forming efficiency.19

Our CRISPR-MMEJ mediated tagging approach addresses two major bottlenecks in current KI strategies. One bottleneck is the low efficiency of gene targeting, which necessitates laborious downstream genotyping of individual clones. Our method alleviates this problem by using drug selection or fluorescent screening of cell populations. The other bottleneck is the low throughput of the procedure, which is limited by laborious homology arm construction. Our method replaces long homology arms with microhomologous arms of only 8 to 10 bp. Compared with non-homologous end joining (NHEJ), MMEJ provides an alternative cellular repair mechanism with more precise integration.17,18 Using this CRISPR-MMEJ mediated tagging approach, we tagged the TFs SP1, MYC, TCF7L2 and CTCF and used the resulting cells for successful ChIP-Seq experiments.

RESULTS AND DISCUSSION

Design of the CRISPR-MMEJ Mediated ChIP-Seq (cmChIP-Seq) Method

We assembled an all-in-one expression vector, CRISPRexp, containing multiple guide RNA cassettes and a Cas9 nuclease gene.20 We also constructed donor plasmids for the target transcription factors. The Cas9 nuclease gene is driven by the CBh promoter (the chicken β-actin short promoter), while each of the three sgRNAs is driven by its own hU6 promoter (Supplemental Table S1). The CRISPR/Cas9 nuclease generates a double-strand break (DSB) a few base pairs upstream of the stop codon in the last coding exon. We chose CRISPR target sites close to the stop codon in order to maintain the integrity of the coding sequence. The engineered donor plasmid, MicroDonor, contains the epitope tag followed by a P2A sequence, mNeonGreen, a T2A sequence and the puromycin resistance gene, and the cassette is flanked by microhomologous arms of only 8 to 10 bp that match the sequences upstream and downstream of the cleavage site in the genome (Figure 1). The MicroDonor was constructed using Gibson assembly21 and linearized by CRISPR/Cas9 cleavage in vivo. After up to 7 days of selection in puromycin, drug-resistant clones were isolated and the integration events were verified through genotyping analyses.
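The arm design can be illustrated with a short sketch. It assumes a blunt cut at a known index in a genomic sequence and simply copies the 8 to 10 bp on either side of the cut as the microhomologous arms. The function names, the toy sequence and the placeholder cassette string are hypothetical and are not taken from the constructs used in this work; a real design must also keep the tag in frame with the TF coding sequence.

```python
def microhomology_arms(genomic: str, cut_index: int, arm_len: int = 10):
    """Return (left_arm, right_arm) flanking a blunt cut.

    The cut is assumed to fall between genomic[cut_index - 1] and
    genomic[cut_index]. arm_len is restricted to the 8-10 bp range
    described in the text.
    """
    if not 8 <= arm_len <= 10:
        raise ValueError("arm_len must be between 8 and 10 bp")
    if cut_index - arm_len < 0 or cut_index + arm_len > len(genomic):
        raise ValueError("cut site is too close to the end of the sequence")
    left_arm = genomic[cut_index - arm_len:cut_index]
    right_arm = genomic[cut_index:cut_index + arm_len]
    return left_arm, right_arm


def flank_cassette(genomic: str, cut_index: int, cassette: str, arm_len: int = 10) -> str:
    """Flank a knock-in cassette with the two microhomologous arms."""
    left_arm, right_arm = microhomology_arms(genomic, cut_index, arm_len)
    return left_arm + cassette + right_arm


# Toy example: hypothetical 40 bp sequence, not a real locus.
toy_locus = "GATTACAGGCTTCAGAGCATCCTGAAGTTCTGAGGCTAAC"
left, right = microhomology_arms(toy_locus, cut_index=20, arm_len=8)
# The cassette string is a placeholder for the real tag and marker sequence.
donor = flank_cassette(toy_locus, 20, "<TAG-P2A-mNeonGreen-T2A-PuroR>", arm_len=8)
```

Because the arms are this short, they can be carried on the primers used to amplify the cassette, which is what makes the donor construction scale.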

Figure 1.


Schematic depiction of the MMEJ-mediated TF tagging strategy, named cmChIP-Seq, for high throughput ChIP-Seq analysis. Cells are transfected with plasmids containing the Cas9 nuclease, gRNAs, and epitope tag donor constructs, leading to the integration of the FLAG tag, 2A linker sequences, mNeonGreen and the puromycin resistance gene at the 3′ end of the target transcription factor gene.

Analyses for SP1, MYC and TCF7L2 Binding Sites in HCT116 Cells

As a proof of principle, we first applied the above strategy to the transcription factor SP1 (specificity protein 1). SP1 is a well-characterized TF that has been implicated in cell growth, differentiation, apoptosis and carcinogenesis.22 It activates the transcription of a variety of cellular genes by binding putative GC-rich sites in their promoters. The designed CRISPR/Cas9 nuclease creates a DSB 10 bp upstream of the stop codon in the last coding exon of SP1 and linearizes the tagging cassette in the donor vector. The well-studied colorectal carcinoma (CRC) cell line HCT116 was chosen for this experiment because many TF ChIP-Seq data sets are available for this cell line.23 Initially, single cell clones were isolated and verified through genotyping. To validate the MMEJ mediated integration, PCR and Sanger sequencing were performed to ensure that the epitope tag was integrated in-frame (Figure 2A and 2B). By examining the 5′ junction at the DSB, we found that 10 clones showed in-frame integration, 70% of which matched the exact intended integration sequence. Sample No. 8 showed biallelic insertion, as no wild type product was detected (Figure 2B).
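The distinction between exact, in-frame and out-of-frame junctions can be expressed as a simple length check. The following is a minimal sketch under stated assumptions: the function names and the toy sequences are hypothetical, and the check only compares lengths, so it would not detect substitutions or premature stop codons that a full Sanger trace analysis would reveal.

```python
from collections import Counter


def classify_junction(observed: str, expected: str) -> str:
    """Classify a sequenced 5' junction against the intended knock-in junction.

    "exact":      identical to the intended sequence
    "in-frame":   differs, but the length difference is a multiple of 3
    "frameshift": length difference is not a multiple of 3
    """
    observed, expected = observed.upper(), expected.upper()
    if observed == expected:
        return "exact"
    if (len(observed) - len(expected)) % 3 == 0:
        return "in-frame"
    return "frameshift"


def summarize_junctions(junction_by_clone: dict, expected: str) -> Counter:
    """Count junction classes over a set of clones."""
    return Counter(classify_junction(seq, expected) for seq in junction_by_clone.values())


# Toy data: hypothetical junction sequences, not the clones of Figure 2.
expected_junction = "CAGAGCATGACTACAAAGAC"
toy_clones = {
    "clone_a": expected_junction,                                       # exact
    "clone_b": expected_junction[:8] + "GGA" + expected_junction[8:],   # +3 bp
    "clone_c": expected_junction[:8] + "G" + expected_junction[8:],     # +1 bp
}
summary = summarize_junctions(toy_clones, expected_junction)
```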

Figure 2.


Genotyping analyses for SP1 monoclonal samples. (A) Junction sequencing results. The intended knock-in sequence is shown at the top. Blue: microhomologous arm; Yellow: initial sequence of 3×FLAG; Green: inserted nucleotides. (B) PCR check for selected samples no. 1, 2, 3 and 8. The 3′ junction check used a forward primer targeting the puromycin resistance gene and a reverse primer targeting the genomic region after the double-strand break. The genome check used two primers that both target the genome. Triangles mark amplicons of the expected size and the asterisk marks the wild type product.

We selected four clones with precise integration at the 5′ junction for ChIP-Seq analyses. The experiments were successful for all four clones despite variations in their 3′ junctions. Consistent with the published ChIP-Seq data for this factor in the CRC cell line HCT116 (Figure S1A),23 ChIP-Seq analyses of the tagged SP1 protein using a monoclonal antibody against the FLAG tag clearly showed ChIP signal enrichment at the previously identified SP1 binding sites in these cells (Figure S1B). SP1 has been reported to play an essential role in activating the human ek1 (ETNK1) promoter and in constitutively regulating the AXL promoter.24,25 Examples of enriched peaks at the promoter regions of ETNK1 and AXL are given in Figure 3A and Figure S2A, and the read enrichment tracks are highly similar among the four monoclonal samples. Moreover, when we selected the top 500 peaks in each ChIP-Seq data set for de novo motif discovery, the known SP1 binding motif sequence (CCCGCC) was recovered as the top hit (Figure 3B). This result demonstrates that our method can map TF binding sites as effectively as ChIP-Seq with antibodies against the protein itself.
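The selection of motif discovery input can be sketched as follows. The Methods state that the top 500 peaks were used, with 200 bp of sequence centered on each peak summit. The sketch assumes that "top" means highest peak score, which the text does not specify, and the function name and tuple layout are hypothetical.

```python
def top_summit_windows(summits, n_top=500, flank=100):
    """Return (chrom, start, end) windows around the highest-scoring summits.

    summits: iterable of (chrom, summit_position, score) tuples.
    Ranking by score is an assumption; the window is summit +/- flank,
    so flank=100 gives the 200 bp window described in the Methods.
    """
    ranked = sorted(summits, key=lambda s: s[2], reverse=True)[:n_top]
    return [(chrom, max(0, pos - flank), pos + flank) for chrom, pos, _ in ranked]


# Toy summits (hypothetical coordinates and scores).
toy_summits = [("chr1", 1000, 5.0), ("chr2", 50, 9.0), ("chr1", 3000, 7.0)]
windows = top_summit_windows(toy_summits, n_top=2, flank=100)
```

The resulting intervals would then be converted to FASTA sequences with a genome-aware tool before being passed to the motif finder.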

Figure 3.


(A) Representative read enrichment tracks in the Integrative Genomics Viewer (IGV) for SP1 monoclonal samples. (B) Motif analyses on the top 500 peaks of each SP1 monoclonal sample identified the SP1 binding motif. MEME de novo motif discovery was run on the top 500 peaks, using 100 bp on each side of the peak summit, and recovered the SP1 motif in every sample.

To further demonstrate the generality of our cmChIP-Seq method, we tagged the transcription factors TCF7L2 and MYC with 3×FLAG epitope tags and verified the junctions (Figures S3 and S4). For each TF, we picked three clones for ChIP-Seq analyses using anti-FLAG monoclonal antibodies. In all cases, consistent ChIP-Seq enrichment signals were obtained that matched the results of previous ChIP-Seq studies using antibodies against the TFs (Figure S5A, S5C and S5D).26,27 Since no MYC data set had been deposited for HCT116, we compared our results to those from the BL14 cell line.27 Examples of genome enrichment tracks revealed that TCF7L2 occupied regions neighboring UAP1L1 and CCND1 (Figure 4A, Figure S2B), consistent with the TCF7L2 binding sites reported in a prior study.28 In the MYC tracks, we observed enrichment at the KAT2A and HIF1A sites, suggesting that MYC controls the expression of the corresponding genes (Figure 4A, Figure S2C).29,30 The canonical binding motifs for TCF7L2 and MYC were also enriched at the top 500 binding sites (Figure 4B and 4C): TCF7L2 binds the regulatory element sequence (ACATCAAAGGGA) and MYC binds the E-box sequence (CACGTG). Taken together, these results demonstrate that the tagging process was successful and that the C-terminal tandem peptide add-on allowed effective epitope recognition and downstream chromatin precipitation.
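Because a TF can bind either DNA strand, a motif such as CCCGCC has to be searched together with its reverse complement, whereas the E-box CACGTG is its own reverse complement. The sketch below illustrates this with exact string matching; it is a simplified stand-in for position weight matrix scanning, and the function names and toy sequences are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def contains_motif(seq: str, motif: str) -> bool:
    """True if the motif occurs on either strand of seq (exact match only)."""
    seq, motif = seq.upper(), motif.upper()
    return motif in seq or reverse_complement(motif) in seq


def fraction_with_motif(sequences, motif: str) -> float:
    """Fraction of sequences carrying at least one exact motif occurrence."""
    sequences = list(sequences)
    if not sequences:
        return 0.0
    hits = sum(contains_motif(s, motif) for s in sequences)
    return hits / len(sequences)


# The E-box is palindromic; the SP1 core motif is not.
ebox_rc = reverse_complement("CACGTG")   # "CACGTG"
sp1_rc = reverse_complement("CCCGCC")    # "GGCGGG"
```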

Figure 4.


(A) Representative DNA-binding protein read enrichment tracks on the IGV for MYC and TCF7L2 monoclonal samples. (B) Top motif identified for TCF7L2 monoclonal samples matched known TCF7L2 motif. (C) Motif discovery for MYC monoclonal samples identified MYC motif as the top hit.

ChIP-Seq Analyses of TF Binding Sites in Pooled Cell Populations

In the above genetic engineering experiments, the rate-limiting step was clonal expansion and genotyping. Encouraged by the relatively high tagging efficiency, we asked whether this step could be eliminated so that the pooled cell population, instead of genotyped cell clones, could be used directly for ChIP-Seq analysis. As a proof of principle, we transfected cells with the tagging vectors for SP1 and collected three biological replicates of the transfected cell population for ChIP-Seq analyses. A portion of each replicate was also characterized by fluorescence-activated cell sorting (FACS) and genotyping to determine the success rate of epitope tagging (Figure 5A). A substantial fraction of each pooled cell sample was fluorescent (Figure S6). For instance, 58.1 ± 3.4% of the cells were GFP positive for SP1 after antibiotic selection for 7 days. ChIP-Seq was performed on the remaining cell pools either before or after FACS. The ChIP-Seq experiments for sorted samples showed binding patterns at the ETNK1 and AXL regions that are comparable to those of the monoclonal samples described earlier (Figure 5B, Figure S2D). Additionally, the aggregated read enrichment at published SP1 peaks indicated that the pooled cell samples performed as well as the monoclonal samples (Figure S1C and S1D).

Figure 5.


(A) Genotyping analyses for SP1 pooled cell samples. Whole cassette analysis used both primers annealing to genomic DNA and junction check involved one primer targeting genome and the other recognizing the insert. Triangular points refer to expected amplicon sizes with insertions and the asterisk refers to wild type product. (B) Representative DNA-binding protein read enrichment tracks on the IGV for SP1 pooled cell samples. (C) Motif discovery for SP1 pooled cell samples before fluorescence-activated cell sorting (FACS). (D) Motif discovery for SP1 pooled cell samples after FACS. (E) Read enrichment heatmap of the 3000 bp regions centered on the SP1 ChIP-Seq peaks generated by ENCODE. Each row is a peak and the x-axis denotes the whole 3000 bp genomic regions. Each column is a TF-tagged SP1 sample.

Similarly, for each data set, the top 500 peaks were selected and MEME analysis for motif discovery was performed. Notably, core SP1 motif enrichment was observed in all pooled cell samples (Figure 5C and 5D). We also compared the signal profiles of all SP1 epitope-tagged samples (including monoclonal and pooled cell samples) on 3000 bp regions surrounding the published Encyclopedia of DNA Elements (ENCODE) SP1 peaks.23 The heatmap not only showed that our samples recapitulated the ChIP-Seq enrichment at ENCODE peaks, but also demonstrated high consistency among all our samples (Figure 5E). In particular, the pooled cell samples are almost identical to the monoclonal samples, except for the difference in read depth. The fact that ChIP-Seq data from pooled cells are of comparable quality to those from single clones suggests that pooled cell samples provide adequate enrichment of specific signal over nonspecific signal and that the single clone isolation step can be skipped.
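Each row of such a heatmap is a binned read count profile across the 3000 bp window centered on one reference peak. The sketch below shows one way to compute a row from sorted read positions. It is a simplified illustration and not the HOMER implementation used in this work; the bin size of 100 bp, the function name and the toy read positions are assumptions.

```python
from bisect import bisect_left


def window_profile(read_positions, center, half_width=1500, bin_size=100):
    """Count reads per bin in a window of 2 * half_width bp around a peak center.

    read_positions: sorted list of read positions on one chromosome.
    Returns a list of (2 * half_width) // bin_size counts, left to right.
    """
    start = center - half_width
    n_bins = (2 * half_width) // bin_size
    end = start + n_bins * bin_size          # exclusive window end
    counts = [0] * n_bins
    lo = bisect_left(read_positions, start)  # first read inside the window
    hi = bisect_left(read_positions, end)    # first read past the window
    for pos in read_positions[lo:hi]:
        counts[(pos - start) // bin_size] += 1
    return counts


# Toy reads (hypothetical positions) around a peak centered at 2500.
toy_reads = [1000, 2450, 2500, 2550, 4000]
row = window_profile(toy_reads, center=2500)
```

Stacking one such row per reference peak, for each sample, gives the matrix that is displayed as a heatmap; differences in read depth between samples would need normalization before comparison.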

We also assessed the feasibility of our approach for additional TFs, namely MYC and CTCF. Without FACS, biological replicates were used directly for ChIP-Seq analyses after puromycin treatment. Genotypes were confirmed through PCR (Figure S7A and S7B), and the MYC samples displayed signal enrichment comparable to the previous result (Figure S7C).27 The DNA binding patterns were consistent with those of the MYC monoclonal samples described above (Figure 6A, Figure S2E). CTCF showed enrichment at the BBC3 region (Figure 6A), where CTCF has been shown to bind as part of the CTCF-cohesin complex.31 Motif analyses reported the E-box sequence for MYC binding with high concordance (Figure 6B). The top motif discovered for CTCF was the sequence CCACCAGGGGGCGC in all three replicates (Figure 6D). The CTCF consensus binding sequence contains CpG dinucleotides and can be subject to DNA methylation. When we extracted the ChIP-Seq enrichment signal within the previously published peak lists, we again found highly enriched signals toward all peak centers for MYC and CTCF (Figure 6C and 6E).

Figure 6.


(A) Representative DNA-binding protein read enrichment tracks on the IGV for MYC and CTCF pooled cell samples. (B) Motif discovery for MYC pooled cell samples before FACS. (C) Read enrichment heatmap of the 3000 bp regions centered on the MYC peaks generated by Seitz and co-workers.27 Each row is a peak and the x-axis denotes the whole 3000 bp genomic regions. Each column is a TF-tagged MYC sample. (D) Motif discovery for CTCF pooled cell samples before FACS. (E) Read enrichment heatmap of the 3000 bp regions centered on the CTCF ChIP-Seq peaks generated by ENCODE. Each row is a peak and the x-axis denotes the whole 3000 bp genomic regions. Each column is a TF-tagged CTCF sample.

The ENCODE consortium has carried out ChIP-Seq studies to investigate transcription factor binding in mammalian cells.23 However, only a small fraction of the known human TFs has been characterized so far. Here we demonstrated that our TF tagging approach is a quick and simple route to high throughput ChIP-Seq analysis. This new method takes advantage of the recently developed CRISPR/Cas9 system and combines it with the microhomology mediated end joining mechanism. It enables the modification of an endogenous TF within weeks, without the tedious single colony selection and individual genotyping that traditional TF tagging methods require and that may take a few months for one TF. One of the key advantages of this CRISPR-MMEJ based KI strategy is the relative ease of donor vector construction, which can be completed within 3 days. We designed very short homologous arms into the primers used to amplify the insert cassette, which can then be easily cloned into the backbone via Gibson assembly. This makes donor vector assembly more feasible and scalable than any HR approach that requires long homologous arms flanking the insert cassette.

The tagged TFs could be pulled down precisely and efficiently with the anti-FLAG monoclonal antibody. The data generated by the new method displayed higher quality than most TF ChIP-Seq data obtained with polyclonal antibodies against the TF protein. When we compared our enrichment tracks to ENCODE and other prior work, our data sets showed high consistency. In the SP1 examples, signals from the sorted pooled cells were more enriched than those from the nonsorted cells, which could be due to less interference from wild type cells. Nevertheless, nonsorted samples are adequate for chromatin precipitation experiments, as we confirmed by testing tagged MYC and CTCF. As all ChIP-Seq experiments can be performed with the same antibody, this also makes possible the quantitative comparison of ChIP-Seq data among different TFs. This strategy could be applied to more TFs, especially those without commercial ChIP-grade antibodies.

Although successful tagging events and improved sequencing signal enrichment were observed, we noticed that this strategy is still limited for certain TFs. To assess the integration efficiency, we estimated the number of colonies formed after drug selection (data not shown). We observed more surviving colonies for MYC than for the other three TFs. Because MYC is expressed at a higher level in HCT116 cells, its stronger native promoter could confer better resistance at the same drug concentration. Successful insertion depends on both the efficiency of CRISPR/Cas9 cutting and the efficiency of MMEJ repair. We chose insertion sites close to the stop codon to maintain the integrity of the coding sequence. The CRISPR/Cas9 system requires a specific protospacer adjacent motif (PAM) for target recognition, which constrains the choice of sites at which DSBs can be introduced. The gRNA sequence affects Cas9 binding efficiency and consequently the efficiency of DSB generation.32 As MMEJ is an alternative repair pathway, it shares enzymes with the classic DNA repair pathways.33 The mechanism of MMEJ is under extensive study but is still incompletely understood.34,35 Therefore, the insertion efficiency is expected to vary with the CRISPR/Cas9 cleavage efficiency and the rate of microhomology mediated repair. We also examined potential CRISPR off-target sites predicted by online software (http://crispr.mit.edu/)36 for the SP1 samples. For monoclone #8 and the pooled cell samples, no significant mutations were detected (Supplementary Table S2 and S3). In our approach, the CRISPR/Cas system relies on one gRNA to direct recognition of the genomic target and two gRNAs to release the donor cassette. Yamamoto and co-workers reported an upgraded donor version in which only one generic gRNA is required to fragment the donor vector.18 Adopting that design would further improve the scalability of the method through simpler CRISPR plasmid construction. The overall efficiency is also expected to be higher because, with no trimming of extra bases outside the microhomologies, repair proceeds by distal MMEJ. Distal MMEJ is considered to occur at a higher frequency than proximal MMEJ, on which our integration system is based.18
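The PAM constraint on target site choice can be made concrete with a small scan for candidate cut sites near a stop codon. The sketch assumes an NGG PAM, a 20 nt spacer and a blunt cut 3 bp upstream of the PAM, which are the commonly described properties of Streptococcus pyogenes Cas9; the Cas9 variant is not specified in this section, so these are assumptions, as are the function name, the 20 bp distance limit and the toy sequences.

```python
def candidate_cut_sites(seq: str, stop_index: int, max_dist: int = 20, spacer_len: int = 20):
    """List (cut_index, strand) for cuts at most max_dist bp upstream of a stop codon.

    cut_index is the index of the first base after the blunt cut.
    Assumes an NGG PAM and a cut 3 bp upstream of the PAM (assumption, see text).
    """
    seq = seq.upper()
    sites = []
    for p in range(len(seq) - 2):
        triplet = seq[p:p + 3]
        cut = None
        if triplet[1:] == "GG" and p - spacer_len >= 0:
            # NGG PAM on the forward strand; spacer is seq[p - spacer_len:p].
            cut, strand = p - 3, "+"
        elif triplet[:2] == "CC" and p + 3 + spacer_len <= len(seq):
            # CCN on the forward strand is an NGG PAM on the reverse strand;
            # spacer is the reverse complement of seq[p + 3:p + 3 + spacer_len].
            cut, strand = p + 6, "-"
        if cut is not None and 0 <= stop_index - cut <= max_dist:
            sites.append((cut, strand))
    return sites


# Toy sequences (hypothetical): one forward-strand PAM, one reverse-strand PAM.
forward_toy = "AT" * 10 + "TGG" + "ATATTAA"   # stop codon TAA starts at index 27
reverse_toy = "TATCCA" + "AT" * 10 + "TAA"    # stop codon TAA starts at index 26
forward_sites = candidate_cut_sites(forward_toy, stop_index=27)
reverse_sites = candidate_cut_sites(reverse_toy, stop_index=26)
```

A locus with no PAM in the allowed window returns an empty list, which is one way the limitation discussed above shows up in practice.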

In summary, we repurposed the CRISPR/Cas9 enabled MMEJ process to introduce an epitope tag into genes of interest and performed the downstream chromatin precipitation using a single ChIP-grade anti-FLAG antibody. Given the variation in tagging efficiencies, we advanced pooled samples to ChIP preparation directly after quick genotyping confirmation. For all targets tested, similar enrichment patterns were observed between isolated monoclonal samples and pooled cell samples at binding sites discovered by previous ChIP-Seq studies. As a result, this new method reduces the time and labor needed, could enable mapping of less-characterized TFs via ChIP-Seq analysis, and could open avenues for other efficient genome modifications.

METHODS AND MATERIALS

Construction of CRISPRexp Plasmids

The CRISPRexp plasmid was assembled using the Multiplex CRISPR/Cas9 Assembly System kit (Addgene; Kit 1000000055). Three gRNA-expressing cassettes were incorporated into a single plasmid using Golden Gate assembly.21 Oligonucleotides for gRNA templates were synthesized and annealed into corresponding intermediate vectors. The oligonucleotides used are listed in Supplementary Table S1.

Construction of Donor Vectors

The donor vectors were constructed using PCR and Gibson Assembly Cloning kit (New England Biolabs, Ipswich, MA). Short homology arms have been designed and included in the primers. The original donor backbone was a gift from Dr. Kenichi T. Suzuki from Hiroshima University, Hiroshima, Japan.

Cell Culture and Transfection

HCT116 cells were routinely maintained in McCoy’s 5A medium (ATCC, Manassas, VA) supplemented with 10% fetal bovine serum (FBS; Hyclone, Logan, UT). Cells were seeded in 100 mm dishes at a density of 1 × 10^6 cells. After 24 h, cells were transfected with 6.66 μg of CRISPRexp plasmid and 3.33 μg of donor vector using FuGene HD transfection reagent (Promega, Madison, WI) under conditions specified by the manufacturer. After transfection, cells were cultured with the transfection reagent for 24 h and then in the growth medium described above for an additional 2 days. Puromycin selection (0.5–1.0 μg/mL) was applied for 7 days, and single clones were isolated using cloning cylinders (Sigma-Aldrich, St. Louis, MO). Only the SP1 pooled cells were used for FACS (Supplementary Methods and Materials).

Genomic PCR and DNA Sequencing

The genomic DNA from cell pellets was extracted using QuickExtract DNA solution (Epicenter, Chicago, IL). Genomic PCR was performed using Herculase II Fusion DNA polymerase (Agilent Technologies, Santa Clara, CA) or Q5 High-Fidelity DNA polymerase (New England BioLabs, Ipswich, MA) with primers listed in the Supplementary Table S4. The PCR products were subjected to direct DNA sequencing service (ACGT, Inc., Wheeling, IL & GENEWIZ, Cambridge, MA).

ChIP-Seq Analysis

The ChIP-Seq analysis was carried out following well-established guidelines.37 Briefly, cells were cross-linked with 1% formaldehyde for 10 min at room temperature. Chromatin was sheared using a Covaris M220 Focused-ultrasonicator (Covaris, Woburn, MA) to obtain DNA fragments of about 400–600 bp. Five micrograms of monoclonal antibody M2 (Sigma, Cat. No. F1804) was used to pull down the tagged TF. The chromatin was then de-cross-linked at 65 °C overnight with proteinase K (New England Biolabs, Ipswich, MA). DNA was purified using the MinElute PCR purification kit (Qiagen) and prepared into a library for sequencing on an Illumina HiSeq 2500 sequencer (Illumina, San Diego, CA). Single-end reads of 50 bp were generated and mapped to the human genome hg19 with the BWA alignment software.38 Duplicated reads at the same genomic loci were removed and peak calling was performed using MACS2.39 The peak numbers for all the monoclonal and pooled cell samples are listed in Supplementary Table S5. Supplementary Figure S8 shows the Pearson correlations among the previously reported analyses, monoclonal samples and pooled cell samples for each TF. All data were deposited in the public database GEO under accession number GSE78064. The MEME suite was used to perform the de novo motif discovery.40 For each experiment, the top 500 peaks were selected and genomic sequences within 200 bp centered on each peak summit were used as input for MEME. The MEME parameters were set to -revcomp -dna -nmotifs 3 -minw 5 -maxw 20, with all other parameters left at their defaults. The ChIP signal heatmaps for peak lists (Figure 5E and others) were generated with the HOMER annotatePeaks.pl script based on the 3000 bp surrounding peak centers.41
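The duplicate removal step can be sketched as keeping one read per mapped locus. The text does not name the tool used or say whether strand was considered, so the strand-aware key below is an assumption, and the function name and tuple layout are hypothetical.

```python
def remove_duplicate_reads(reads):
    """Keep the first read seen at each (chrom, position, strand) locus.

    reads: iterable of (read_name, chrom, five_prime_position, strand) tuples.
    Including the strand in the key is an assumption, not stated in the Methods.
    """
    seen = set()
    unique = []
    for name, chrom, pos, strand in reads:
        key = (chrom, pos, strand)
        if key not in seen:
            seen.add(key)
            unique.append((name, chrom, pos, strand))
    return unique


# Toy alignments (hypothetical): r2 duplicates r1; r3 and r4 are distinct loci.
toy_alignments = [
    ("r1", "chr1", 100, "+"),
    ("r2", "chr1", 100, "+"),
    ("r3", "chr1", 100, "-"),
    ("r4", "chr2", 100, "+"),
]
deduplicated = remove_duplicate_reads(toy_alignments)
```

Removing such duplicates before peak calling limits the influence of PCR amplification artifacts on the apparent enrichment.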


Acknowledgments

This work was supported by the Carl R. Woese Institute for Genomic Biology at the University of Illinois at Urbana–Champaign (H.Z.), National Institutes of Health (1U54DK107965) (H.Z.), the Ludwig Institute for Cancer Research (B.R.), National Institutes of Health (P50 GM085764-04, 1U54DK107977-01) (B.R.), and an International Postdoctoral fellowship from the Swedish Vetenskapsradet (537-2014-6796) (J.Y.). The authors thank Dr. Barbara Pilas (Flow Cytometry Facility, Biotechnology Center, University of Illinois at Urbana–Champaign, Urbana, IL 61801, USA) for helpful suggestions.

ABBREVIATIONS

CRISPR/Cas

clustered regularly interspaced short palindromic repeats and CRISPR-associated proteins

DSB

double-strand break

gRNA

guide RNA

HR

homologous recombination

MMEJ

microhomology mediated end joining

Footnotes

Author Contributions

X.X., Y.Z., J.Y., B.R. and H.Z. designed the experiments; X.X. and Y.Z. performed all the experiments with the help of J.Y., S.J. and S.C.; X.X., Y.Z., B.R. and H.Z. wrote the manuscript.

Notes

The authors declare no competing financial interest.

Supporting Information

The Supporting Information is available free of charge on the ACS Publications website at DOI: 10.1021/acssynbio.6b00358.

Supplementary methods, additional figures and tables (PDF)

References

  • 1.Rivera CM, Ren B. Mapping human epigenomes. Cell. 2013;155:39–55. doi: 10.1016/j.cell.2013.09.011. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 2.Hawkins RD, Ren B. Genome-wide location analysis: insights on transcriptional regulation. Hum Mol Genet. 2006;15:R1–7. doi: 10.1093/hmg/ddl043. [DOI] [PubMed] [Google Scholar]
  • 3.Tam WL, Lim B. StemBook. Cambridge, MA: 2008. Genome-wide transcription factor localization and function in stem cells. [PubMed] [Google Scholar]
  • 4.Rodriguez R, Miller KM. Unravelling the genomic targets of small molecules using high-throughput sequencing. Nat Rev Genet. 2014;15:783–796. doi: 10.1038/nrg3796. [DOI] [PubMed] [Google Scholar]
  • 5.Jothi R, Cuddapah S, Barski A, Cui K, Zhao K. Genome-wide identification of in vivo protein-DNA binding sites from ChIP-Seq data. Nucleic Acids Res. 2008;36:5221–5231. doi: 10.1093/nar/gkn488. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 6.Furey TS. ChIP-seq and beyond: new and improved methodologies to detect and characterize protein-DNA interactions. Nat Rev Genet. 2012;13:840–852. doi: 10.1038/nrg3306. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 7.Landt SG, Marinov GK, Kundaje A, Kheradpour P, Pauli F, Batzoglou S, Bernstein BE, Bickel P, Brown JB, Cayting P, Chen YW, DeSalvo G, Epstein C, Fisher-Aylor KI, Euskirchen G, Gerstein M, Gertz J, Hartemink AJ, Hoffman MM, Iyer VR, Jung YL, Karmakar S, Kellis M, Kharchenko PV, Li QH, Liu T, Liu XS, Ma LJ, Milosavljevic A, Myers RM, Park PJ, Pazin MJ, Perry MD, Raha D, Reddy TE, Rozowsky J, Shoresh N, Sidow A, Slattery M, Stamatoyannopoulos JA, Tolstorukov MY, White KP, Xi S, Farnham PJ, Lieb JD, Wold BJ, Snyder M. ChIP-seq guidelines and practices of the ENCODE and modENCODE consortia. Genome Res. 2012;22:1813–1831. doi: 10.1101/gr.136184.111. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 8.Egelhofer TA, Minoda A, Klugman S, Lee K, Kolasinska-Zwierz P, Alekseyenko AA, Cheung MS, Day DS, Gadel S, Gorchakov AA, Gu TT, Kharchenko PV, Kuan S, Latorre I, Linder-Basso D, Luu Y, Ngo Q, Perry M, Rechtsteiner A, Riddle NC, Schwartz YB, Shanower GA, Vielle A, Ahringer J, Elgin SCR, Kuroda MI, Pirrotta V, Ren B, Strome S, Park PJ, Karpen GH, Hawkins RD, Lieb JD. An assessment of histone-modification antibody quality. Nat Struct Mol Biol. 2011;18:91. doi: 10.1038/nsmb.1972. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 9.Pilon AM, Ajay SS, Kumar SA, Steiner LA, Cherukuri PF, Wincovitch S, Anderson SM, Center NCS, Mullikin JC, Gallagher PG, Hardison RC, Margulies EH, Bodine DM. Genome-wide ChIP-Seq reveals a dramatic shift in the binding of the transcription factor erythroid Kruppel-like factor during erythrocyte differentiation. Blood. 2011;118:e139–148. doi: 10.1182/blood-2011-05-355107. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 10.Liu M, GS, Battle M, Stiles JK. Gene Functional Studies Using Bacterial Artificial Chromosome (BACs) Bacterial Artificial Chromosomes. 2011 doi: 10.5772/32167. [DOI] [Google Scholar]
  • 11.Wang Z. Epitope tagging of endogenous proteins for genome-wide chromatin immunoprecipitation analysis. Methods Mol Biol. 2009;567:87–98. doi: 10.1007/978-1-60327-414-2_6. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 12.Zhang X, Guo C, Chen Y, Shulha HP, Schnetz MP, LaFramboise T, Bartels CF, Markowitz S, Weng Z, Scacheri PC, Wang Z. Epitope tagging of endogenous proteins for genome-wide ChIP-chip studies. Nat Methods. 2008;5:163–165. doi: 10.1038/nmeth1170. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 13.Savic D, Partridge EC, Newberry KM, Smith SB, Meadows SK, Roberts BS, Mackiewicz M, Mendenhall EM, Myers RM. CETCh-seq: CRISPR epitope tagging ChIP-seq of DNA-binding proteins. Genome Res. 2015;25:1581–1589. doi: 10.1101/gr.193540.115. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 14.Ran FA, Hsu PD, Wright J, Agarwala V, Scott DA, Zhang F. Genome engineering using the CRISPR-Cas9 system. Nat Protoc. 2013;8:2281–2308. doi: 10.1038/nprot.2013.143. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 15.Cong L, Ran FA, Cox D, Lin S, Barretto R, Habib N, Hsu PD, Wu X, Jiang W, Marraffini LA, Zhang F. Multiplex genome engineering using CRISPR/Cas systems. Science. 2013;339:819–823. doi: 10.1126/science.1231143. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 16.McVey M, Lee SE. MMEJ repair of double-strand breaks (director’s cut): deleted sequences and alternative endings. Trends Genet. 2008;24:529–538. doi: 10.1016/j.tig.2008.08.007. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 17.Bae S, Kweon J, Kim HS, Kim JS. Microhomology-based choice of Cas9 nuclease target sites. Nat Methods. 2014;11:705–706. doi: 10.1038/nmeth.3015. [DOI] [PubMed] [Google Scholar]
  • 18.Sakuma T, Nakade S, Sakane Y, Suzuki KT, Yamamoto T. MMEJ-assisted gene knock-in using TALENs and CRISPR-Cas9 with the PITCh systems. Nat Protoc. 2016;11:118–133. doi: 10.1038/nprot.2015.140. [DOI] [PubMed] [Google Scholar]
  • 19.Nakade S, Tsubota T, Sakane Y, Kume S, Sakamoto N, Obara M, Daimon T, Sezutsu H, Yamamoto T, Sakuma T, Suzuki KT. Microhomology-mediated end-joining-dependent integration of donor DNA in cells and animals using TALENs and CRISPR/Cas9. Nat Commun. 2014;5:5560. doi: 10.1038/ncomms6560. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 20.Sakuma T, Nishikawa A, Kume S, Chayama K, Yamamoto T. Multiplex genome engineering in human cells using all-in-one CRISPR/Cas9 vector system. Sci Rep. 2014 doi: 10.1038/srep05400. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 21. Gibson DG, Young L, Chuang RY, Venter JC, Hutchison CA, 3rd, Smith HO. Enzymatic assembly of DNA molecules up to several hundred kilobases. Nat Methods. 2009;6:343–345. doi: 10.1038/nmeth.1318.
  • 22. Vizcaino C, Mansilla S, Portugal J. Sp1 transcription factor: A long-standing target in cancer chemotherapy. Pharmacol Ther. 2015;152:111–124. doi: 10.1016/j.pharmthera.2015.05.008.
  • 23. Consortium EP. An integrated encyclopedia of DNA elements in the human genome. Nature. 2012;489:57–74. doi: 10.1038/nature11247.
  • 24. Kuan CS, See Too WC, Few LL. Sp1 and Sp3 Are the Transcription Activators of Human ek1 Promoter in TSA-Treated Human Colon Carcinoma Cells. PLoS One. 2016;11:e0147886. doi: 10.1371/journal.pone.0147886.
  • 25. Mudduluru G, Allgayer H. The human receptor tyrosine kinase Axl gene–promoter characterization and regulation of constitutive expression by Sp1, Sp3 and CpG methylation. Biosci Rep. 2008;28:161–176. doi: 10.1042/BSR20080046.
  • 26. Frietze S, Wang R, Yao L, Tak YG, Ye Z, Gaddis M, Witt H, Farnham PJ, Jin VX. Cell type-specific binding patterns reveal that TCF7L2 can be tethered to the genome by association with GATA3. Genome Biol. 2012;13:R52. doi: 10.1186/gb-2012-13-9-r52.
  • 27. Seitz V, Butzhammer P, Hirsch B, Hecht J, Gutgemann I, Ehlers A, Lenze D, Oker E, Sommerfeld A, von der Wall E, Konig C, Zinser C, Spang R, Hummel M. Deep sequencing of MYC DNA-binding sites in Burkitt lymphoma. PLoS One. 2011;6:e26837. doi: 10.1371/journal.pone.0026837.
  • 28. Zhao J, Schug J, Li M, Kaestner KH, Grant SF. Disease-associated loci are significantly over-represented among genes bound by transcription factor 7-like 2 (TCF7L2) in vivo. Diabetologia. 2010;53:2340–2346. doi: 10.1007/s00125-010-1852-3.
  • 29. Yin YW, Jin HJ, Zhao W, Gao B, Fang J, Wei J, Zhang DD, Zhang J, Fang D. The Histone Acetyltransferase GCN5 Expression Is Elevated and Regulated by c-Myc and E2F1 Transcription Factors in Human Colon Cancer. Gene Expression. 2015;16:187–196. doi: 10.3727/105221615X14399878166230.
  • 30. Chen C, Cai S, Wang G, Cao X, Yang X, Luo X, Feng Y, Hu J. c-Myc enhances colon cancer cell-mediated angiogenesis through the regulation of HIF-1alpha. Biochem Biophys Res Commun. 2013;430:505–511. doi: 10.1016/j.bbrc.2012.12.006.
  • 31. Gomes NP, Espinosa JM. Gene-specific repression of the p53 target gene PUMA via intragenic CTCF-Cohesin binding. Genes Dev. 2010;24:1022–1034. doi: 10.1101/gad.1881010.
  • 32. Wang T, Wei JJ, Sabatini DM, Lander ES. Genetic screens in human cells using the CRISPR-Cas9 system. Science. 2014;343:80–84. doi: 10.1126/science.1246981.
  • 33. Truong LN, Li Y, Shi LZ, Hwang PY, He J, Wang H, Razavian N, Berns MW, Wu X. Microhomology-mediated End Joining and Homologous Recombination share the initial end resection step to repair DNA double-strand breaks in mammalian cells. Proc Natl Acad Sci U S A. 2013;110:7720–7725. doi: 10.1073/pnas.1213431110.
  • 34. Kent T, Chandramouly G, McDevitt SM, Ozdemir AY, Pomerantz RT. Mechanism of microhomology-mediated end-joining promoted by human DNA polymerase theta. Nat Struct Mol Biol. 2015;22:230–237. doi: 10.1038/nsmb.2961.
  • 35. Crespan E, Czabany T, Maga G, Hubscher U. Microhomology-mediated DNA strand annealing and elongation by human DNA polymerases lambda and beta on normal and repetitive DNA sequences. Nucleic Acids Res. 2012;40:5577–5590. doi: 10.1093/nar/gks186.
  • 36. Hsu PD, Scott DA, Weinstein JA, Ran FA, Konermann S, Agarwala V, Li Y, Fine EJ, Wu X, Shalem O, Cradick TJ, Marraffini LA, Bao G, Zhang F. DNA targeting specificity of RNA-guided Cas9 nucleases. Nat Biotechnol. 2013;31:827–832. doi: 10.1038/nbt.2647.
  • 37. Yan J, Enge M, Whitington T, Dave K, Liu J, Sur I, Schmierer B, Jolma A, Kivioja T, Taipale M, Taipale J. Transcription factor binding in human cells occurs in dense clusters formed around cohesin anchor sites. Cell. 2013;154:801–813. doi: 10.1016/j.cell.2013.07.034.
  • 38. Li H, Durbin R. Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics. 2009;25:1754–1760. doi: 10.1093/bioinformatics/btp324.
  • 39. Zhang Y, Liu T, Meyer CA, Eeckhoute J, Johnson DS, Bernstein BE, Nusbaum C, Myers RM, Brown M, Li W, Liu XS. Model-based analysis of ChIP-Seq (MACS). Genome Biol. 2008;9:R137. doi: 10.1186/gb-2008-9-9-r137.
  • 40. Bailey TL, Elkan C. Fitting a mixture model by expectation maximization to discover motifs in biopolymers. Proceedings of the International Conference on Intelligent Systems for Molecular Biology (ISMB); 1994. pp. 28–36.
  • 41. Heinz S, Benner C, Spann N, Bertolino E, Lin YC, Laslo P, Cheng JX, Murre C, Singh H, Glass CK. Simple combinations of lineage-determining transcription factors prime cis-regulatory elements required for macrophage and B cell identities. Mol Cell. 2010;38:576–589. doi: 10.1016/j.molcel.2010.05.004.
