Abstract
Genome editing critically relies on selective recognition of target sites. However, despite recent progress, the underlying search mechanism of genome-editing proteins is not fully understood in the context of cellular chromatin environments. Here, we use single-molecule imaging in live cells to directly study the behavior of CRISPR/Cas9 and TALEN. Our single-molecule imaging of genome-editing proteins reveals that Cas9 is less efficient in heterochromatin than TALEN because Cas9 becomes encumbered by local searches on non-specific sites in these regions. We find up to a fivefold increase in editing efficiency for TALEN compared to Cas9 in heterochromatin regions. Overall, our results show that Cas9 and TALEN use a combination of 3-D and local searches to identify target sites, and the nanoscopic granularity of local search determines the editing outcomes of the genome-editing proteins. Taken together, our results suggest that TALEN is a more efficient gene-editing tool than Cas9 for applications in heterochromatin.
Subject terms: Genetic engineering, Single-molecule biophysics, CRISPR-Cas9 genome editing
While Cas9 outperforms TALENs in euchromatin, it is less efficient in heterochromatic regions. Here the authors, using single-molecule imaging, show that Cas9 uses a less efficient search strategy compared to TALENs in these regions.
Introduction
Clustered regularly interspaced short palindromic repeats (CRISPR)/CRISPR-associated protein 9 (Cas9) and transcription activator-like effector nuclease (TALEN) are programmable DNA search engines that query genomic sequences for target-specific editing1. Both Cas9 and TALEN can recognize a custom genetic sequence but have strikingly different mechanisms of target-site binding2. Cas9 can be programmed to find a specific DNA sequence upstream of an indispensable 3-nucleotide motif (protospacer adjacent motif or PAM) by designing a single guide RNA (sgRNA) that mediates target-site binding through DNA-RNA pairing3. On the other hand, the DNA-binding domain of a TALEN consists of a tandem array of 33–34 amino acid (aa)-long customizable monomers that theoretically can be assembled to recognize any genetic sequence following a one-repeat-binds-one-base-pair recognition code4,5. In vitro single-molecule studies have shown that TALEs (nuclease-free analogs of TALENs) utilize a unique rotationally decoupled, “molecular zip-line” mechanism for target-site search along DNA: they translate along the DNA backbone without rotating or tracking the major groove6,7. However, it is not known how TALEs maneuver through the complex nuclear architecture and search for the target site in vivo. Previous studies, often using dCas9 (a nuclease-deficient Cas9), have presented conflicting views of the CRISPR/Cas9 search mechanism, describing it either as pure 3-D diffusion8–10 or as capable of 1-D diffusion along DNA11.
In this work, we directly observe the search behavior of dCas9 and TALE proteins in different chromatin environments in vivo. By analyzing the trajectories of single protein molecules in live cells, we characterize the local search mechanisms of TALE and dCas9 in euchromatin and heterochromatin regions. Our results show that Cas9 is less efficient than TALEN in heterochromatin regions because Cas9 tends to become encumbered by local searches on non-specific sites. To further assess the functional implications of the differences in search behaviors, we conducted a TIDE (Tracking of Indels by Decomposition)12 analysis of TALEN and Cas9, revealing that TALEN was up to fivefold more efficient than Cas9 in the constrained heterochromatin regions of the genome. Overall, this combined strategy allows us to independently investigate the search mechanism as well as the editing efficiency of both genome-editing proteins.
Results
Live-cell imaging of TALE and dCas9 proteins
Live-cell single-molecule fluorescence microscopy13,14 was used to directly observe the search dynamics of TALE and dCas9 proteins in mammalian cells. We designed a TALE protein that is primarily in the search mode because it has few binding sites: it targets the cystic fibrosis transmembrane conductance regulator (CFTR) genomic locus in euchromatin and has fewer than 4 binding sites in the genome. We also synthesized a TALE protein with multiple target sites, namely a TALE targeting the Alu retrotransposon elements15, with an estimated 1 million interspersed target sites (Fig. 1a). In both cases, the proteins were constructed using an in-house liquid handling robotic system17 and fused with a Halotag domain16, which enables 1:1 stoichiometric labeling with JF549 dye18 (Supplementary Fig. 1). Similarly, we designed guide RNAs targeting CFTR and Alu sites to be used with dCas9 proteins. We performed control experiments with the core histone protein H2B (Histone 2B), a widely studied DNA-binding protein19,20 (Supplementary Table 1).
TALE and dCas9 exhibit two major search behaviors
Protein search dynamics were analyzed using two different imaging conditions: short-exposure times (10–20 ms) to study fast diffusion kinetics (Fig. 1b, Top) and long-exposure times (500 ms) to characterize residence times of the bound molecules (Fig. 1b, Bottom). First, we analyzed the data obtained for the short-exposure time imaging condition. The seminal theoretical framework described by Berg, Winter, and von Hippel identified that specific DNA-binding proteins undergo four major processes of translocation, namely, (i) “long-range” or “macroscopic” dissociation-reassociation events (3-D diffusion), (ii) “short-range” or “microscopic” dissociation-reassociation events (hopping), (iii) ring-closure or “intersegmental transfer” in case the protein has 2 DNA-binding sites, and lastly (iv) “sliding” (1-D diffusion) along the DNA molecule21. More recently, 3-D diffusion and hopping have been referred to as global search, and 1-D sliding and jumping (<5 bp) have been characterized as local search19,22. Our experimental setup is unable to differentiate between a jump and a pure 1-D sliding translocation due to the lower bound of the short-exposure time and the inability to visualize DNA. Accordingly, we treat fast-diffusing molecules as carrying out global search and slow-diffusing molecules as carrying out local search19.
We performed multi-state Gaussian fitting on normalized diffusion coefficient histograms of TALE and dCas9 proteins (Fig. 1c). Diffusion histograms of CFTR-TALE exhibited two types of search behaviors, a “fast” diffusion (red curve) and a “slow” diffusion (green curve) with significant overlap (Fig. 1c, top image). These results show that TALE proteins are capable of switching (on a timescale of 20 ms) from fast to slow diffusion and vice versa. We posit that the fast-diffusing populations (red curve) result primarily from global search events such as hopping and 3-D diffusion, whereas the slow-diffusing populations (green curve) include locally searching molecules. In conclusion, the double-peak behavior of the normalized diffusion coefficient histograms suggests that TALE proteins engage in global search as well as local search behavior. A similar analysis of diffusion coefficient histograms for H2B controls also showed evidence of two molecular populations, captured by fast- and slow-moving H2B molecules (Supplementary Fig. 2).
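To illustrate how a distribution of diffusion coefficients can be separated into a "slow" and a "fast" population, the following minimal sketch fits a two-component 1-D Gaussian mixture by expectation-maximization. This is a stand-in technique, not the histogram fitting procedure used in this work; the function name `fit_two_gaussians`, the initialization, and the variance floor are illustrative assumptions.

```python
import math

def fit_two_gaussians(values, n_iter=100):
    """Fit a two-component 1-D Gaussian mixture by expectation-maximization.

    Hypothetical helper for illustration only. Returns (weights, means,
    variances), with component 0 initialized at the smallest value and
    component 1 at the largest value.
    """
    n = len(values)
    mean_all = sum(values) / n
    floor = 1e-6  # variance floor (assumption) to avoid division by zero
    var_all = max(sum((v - mean_all) ** 2 for v in values) / n, floor)
    mu = [min(values), max(values)]
    var = [var_all, var_all]
    w = [0.5, 0.5]
    for _ in range(n_iter):
        # E-step: responsibility of each component for each value
        resp = []
        for v in values:
            p = [
                w[k] * math.exp(-((v - mu[k]) ** 2) / (2 * var[k]))
                / math.sqrt(2 * math.pi * var[k])
                for k in range(2)
            ]
            total = p[0] + p[1]
            resp.append([p[0] / total, p[1] / total])
        # M-step: update weights, means, and variances
        for k in range(2):
            nk = sum(r[k] for r in resp)
            w[k] = nk / n
            mu[k] = sum(r[k] * v for r, v in zip(resp, values)) / nk
            var[k] = max(
                sum(r[k] * (v - mu[k]) ** 2 for r, v in zip(resp, values)) / nk,
                floor,
            )
    return w, mu, var
```

The fitted weights play the role of the population fractions, and the two means correspond to the centers of the slow and fast peaks. Overlapping populations, as observed here, will yield soft rather than clear-cut assignments.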
We further studied the search behavior of nuclease-deficient Cas9 (dCas9) in living cells. Our results show that dCas9 also exhibits two major search behaviors, similar to TALE. The characteristic “fast” and “slow” diffusion populations make up the CFTR-dCas9 diffusion processes (Fig. 1c, bottom). The kinetic parameters of the search process are comparable between CFTR-TALE and dCas9. We have found the target-search process of TALE molecules similar to that of dCas9 in a live-mammalian nucleus. However, it has been reported that Cas9 tends to outperform TALEN when editing sequences in the open chromatin2. In the next sections, we aimed to investigate the fundamental molecular differences of TALE and dCas9 target-search processes and how they may affect the editing outcomes.
Long exposure time condition allows visualization of DNA-bound proteins
We further imaged proteins using long-exposure times (500 ms), allowing for the visualization of DNA-bound molecules (bottom image, Fig. 1b). In this imaging condition, fast-moving proteins diffuse in the background, and only DNA-bound proteins are visible. Using long-exposure times, we determined the residence time histograms of bound TALE and Cas9 proteins that targeted CFTR and Alu genomic loci. Residence time histograms could not be fit with a single-exponential decay function (Supplementary Fig. 3). Histograms of Alu- and CFTR-TALE residence times were well described by a two-component exponential decay model, which suggests the presence of “non-specifically” and “specifically” bound molecules23 (Fig. 2a). Binding times were determined for both populations based on the double-exponential decay model, after correcting for photobleaching (see Methods). Our results show that short-lived populations of CFTR-TALE (τ1, CFTR) and Alu-TALE (τ1, Alu) have lifetimes of 0.48 s and 1.01 s, respectively (Fig. 2a). However, residence times of the long-lived population (τ2) differed significantly between CFTR-TALE and Alu-TALE, such that CFTR-TALE and Alu-TALE showed residence times of 1.8 s and 20.2 s, respectively. Because Alu TALE has more than a million target sites, the longer residence time τ2 for Alu-TALE reflects the behavior of proteins that are bound to specific target sites. Moreover, CFTR has only a few target sites (<4), such that the vast majority of CFTR TALE–DNA interactions are likely to be non-specific. Therefore, we deduced that TALE spends an average of 1.8 s (τ2 of CFTR TALE) at non-specific sites and 20.2 s (τ2 of Alu-TALE) bound to target sites. The short-lived populations described by τ1 reflect the dynamics of the transitionary molecules, representing the time between two chromatin-binding events. 
Hence, 81.6% and 86.4% of CFTR-TALE and Alu-TALE molecules, respectively, are undergoing global search, whereas 18.4% of CFTR-TALE molecules are engaging in local search and 13.6% of Alu-TALE molecules are either specifically bound or undergoing local search.
Residence time distributions of dCas9 proteins were also well described by a two-component exponential decay function (Fig. 2b). Our analysis shows that dCas9 spends 13.41 s on specific target sites (τ2 of Alu-dCas9) and 5.87 s on non-specific sites (τ2 of CFTR-dCas9). dCas9 therefore spends more time on non-specific sites (5.87 s) than TALE does (1.8 s). Since it is not possible to distinguish target-site-bound from non-specifically bound protein molecules with our imaging methodology, we note that the calculated residence times of specifically bound proteins could be underestimates, as the calculation may include some proteins that are not specifically bound. Since the mammalian genome has more than a million Alu target sites, it is highly probable that the longer residence time corresponds to proteins that are bound to target sites, and non-specifically interacting proteins will have a minor impact on the calculated residence time. A similar analysis of H2B controls also revealed two bound populations, with the long-lived population spending 14.8 s on the genome (Supplementary Fig. 2).
Difference between TALE and dCas9 local search behaviors
To further understand the molecular origins of the slow-diffusing, locally searching populations of TALE and dCas9, we analyzed individual trajectories and characterized their search dynamics by calculating an instantaneous diffusion coefficient Dinst (see “Methods”). The rapid global search was distinguished from the local search using thresholds for Dinst (depicted by the blue dashed line in Fig. 3a)23. Dinst is plotted for one characteristic trajectory each of a single TALE and a single dCas9 protein (Fig. 3a). We observed that both TALE and dCas9 proteins transition rapidly between slow and fast Dinst ranges. TALE proteins transition between fast global and slow local search along DNA in live cells, which is consistent with prior in vitro single-molecule studies of TALE proteins6,7. Similarly, dCas9 can engage in local search of the genome in live cells in addition to global search.
To further analyze local search dynamics, we plotted histograms of Dinst for TALE and dCas9 (Fig. 3b). The dashed blue vertical line demarcates the globally searching population from the locally searching population. We also determined the time spent by TALE and dCas9 in local DNA search in one cycle (Fig. 3c). Local search time per cycle is defined as the time spent by a protein molecule interacting with DNA between two consecutive cycles of global search. Our results show that the average local search fraction of dCas9 (56%) is marginally larger than that of TALE (50%). Moreover, dCas9 spends more time (96 ± 1 ms) engaging in local DNA search than TALE (65 ± 0.3 ms) per local search cycle. Overall, dCas9 spends more time undergoing local search compared to TALE.
To further probe in vivo TALE search dynamics, we determined the jumping angles of TALE proteins during the search process (Fig. 3d). Jumping angles describe the relative change in the direction of motion of a DNA-binding protein due to the local search environment encountered in the target-search process (e.g., genome compaction, other transcription factors). We also defined a skewness factor to quantify the asymmetry/non-uniformity of the jumping angle distribution (Supplementary Fig. 4). CFTR-TALE exhibits a non-uniform distribution of jumping angles (skewness: 2.41), revealing that the TALE target-search process is affected by genomic occlusions (Fig. 3d). H2B proteins (Supplementary Fig. 2) demonstrate a significant bias towards 180° (skewness: 2.61), suggesting a constricted search environment. On the other hand, dCas9 shows a uniform angular distribution (skewness: 1.25), indicating that dCas9 performs an efficient genome search at the whole-nucleus level (Fig. 3d). These fundamental differences in local search efficiencies enable Cas9 to outperform TALENs in open chromatin. However, it is still not clear how local search affects the performance of TALE and dCas9 in compact chromatin states of the human genome.
TALEs navigate heterochromatin more efficiently than Cas9
We next studied the search mechanism of TALE and dCas9 in the context of prominent genomic features in heterochromatin in live-mammalian cells. We directly imaged TALE and dCas9 search dynamics in three heterochromatin environments: Alu repetitive retrotransposons, centromeric structures, and a compact genomic locus marked by H3K9 trimethylation epigenetic modifications24. HeLa cells were used to image repetitive elements, and due to the availability of epigenetic data in HCT116 cells, they were used for imaging compact genomic loci. Stable (in HeLa cells) or transient (in HCT116 cells) expression of heterochromatin protein 1 alpha (HP1α) fused with the green fluorescent protein (GFP) enabled specific tracking of single protein molecules in heterochromatin25 (Supplementary Fig. 5). Prior work has shown that TALEN and Cas9 editing activity is hindered in heterochromatin26–28; however, the search dynamics of these proteins in the context of heterochromatin is not understood. We observed overall slower kinetics compared to euchromatin and differential search behavior depending on the chromatin context for both dCas9 and TALE (Supplementary Fig. 6). Jumping angle analyses (Supplementary Fig. 6) demonstrate that both TALE and dCas9 encounter a considerably constricted search space as indicated by the highly skewed angular distribution in the heterochromatin region. TALE heterochromatin search is described by three distinct modes, including an additional intermediate diffusive population for repetitive elements Alu and centromere (Supplementary Fig. 6). However, when dCas9 and TALEs were designed to search for a target-site embedded in highly compacted constitutive heterochromatin located in chromosome 16, there was a significant difference between TALE (TALE 16) and dCas9 (gRNA9) search kinetics (Fig. 4a). TALE 16 (D = 2.35 µm2/s) showed significantly faster overall search dynamics compared to dCas9-gRNA9 (D = 1.93 µm2/s) in heterochromatin (Fig. 4a). 
The jumping angles of TALE 16 are more uniformly distributed (skewness: 2.16) than those of dCas9-gRNA9 (skewness: 2.65), indicating that TALE can maneuver through the tight heterochromatin environment more efficiently (Fig. 4b).
TALEN shows higher editing efficiency than Cas9 in heterochromatin
To assess the functional implications of differences in search behavior of TALE and dCas9 in heterochromatin, we constructed a series of TALENs and Cas9-gRNA variants capable of editing sequences present in highly repressed heterochromatin loci. Using HCT116 ENCODE H3K9me3 and H3K27me3 ChIP-seq data29, we chose twelve chromosome loci of approximately 500 bp that differed in ChIP-seq signal fold change, ranging from 2.543 to 9.547 (Supplementary Table 2). We designed four gRNAs and two TALEN pairs per locus using the Benchling, CHOPCHOP30, and SAPTA31 design tools (Supplementary Tables 3, 4). We tested the ability of gRNA constructs to cut chromatin-less plasmid DNA by an eGFP reporter assay in which an active gRNA cuts a target site placed out-of-frame with the eGFP gene, resulting in loss of fluorescence upon cutting (Supplementary Fig. 7)32. The eGFP reporter assay enabled us to screen for gRNAs that are functional and to determine their editing efficiency in the context of heterochromatin (Supplementary Fig. 8). We also chose four published gRNAs33 targeting euchromatin sites located in endogenous genes to compare the efficiency of genome-editing proteins in transcriptionally active open chromatin. We performed TIDE analysis12 to calculate the target-site editing efficiency. In 11 out of 12 loci (91.67%), TALENs showed similar or higher editing activity in heterochromatin compared to Cas9 (Fig. 5a, top; Supplementary Fig. 9), whereas at 4 euchromatin sites, Cas9 demonstrated either similar or greater editing activity, indicating that TALEN’s enhanced editing activity in heterochromatin is a context-dependent phenomenon (Fig. 5a, bottom). Together, the genome-editing efficiency results are consistent with the in vivo search dynamics results, showing that TALE proteins are more efficient than Cas9 in navigating dense heterochromatin regions of the genome due to an enhanced ability to sample heterochromatin locally.
In contrast, in euchromatin, this advantage is superseded by Cas9’s increased local search ability.
Based on our results, we propose a mechanistic model for the search mechanisms of TALE and dCas9 in heterochromatin that explains the difference in their relative editing performance in euchromatin and heterochromatin (Fig. 5b). Our single-molecule imaging analysis shows that not only do the search mechanisms of dCas9 and TALE adapt to the chromatin environment, but they also differ significantly in their extent of local search. Together, these properties correlate with functional efficacy: Cas9 is more efficient in cutting at euchromatin sites due to its greater ability to query binding sites in a relatively unhindered environment. We hypothesize that, in heterochromatin, the enhanced local search is not a beneficial feature for Cas9 and results in reduced cutting efficiency, whereas TALEN can access heterochromatin with greater ease due to a lesser extent of local search behavior. TALE’s unique rotationally decoupled DNA search mechanism7 and short local search enable it to glide over compact heterochromatin structures in a mammalian nucleus. On the other hand, dCas9 has to unravel the DNA double helix to interrogate for target specificity, and nucleosomes act as roadblocks for the local search, essentially trapping dCas9 molecules in the heterochromatin regions. Hence, TALE is able to find a target site embedded in mammalian heterochromatin with greater efficiency compared to dCas9.
Discussion
In conclusion, we used a combination of single-molecule imaging and sequencing-based editing analysis to study the search dynamics of TALE and Cas9 proteins in live cells. Our results show that TALE proteins use a combination of local search and 3-D diffusion to find their target site in mammalian cells. In addition, dCas9 proteins exhibit local search behavior while sampling DNA to find the target site. We conducted a detailed single-molecule investigation of the effect of structurally distinct chromatin states on the target-search mechanism of genome-editing proteins. Alu- and centromere-targeting TALEs and dCas9 variants were used to characterize the search process in prominent heterochromatin structural elements of the mammalian genome. In the case of centromeric structures, the target sites are highly repetitive and concentrated, and we observe a “hopping”-like behavior of TALE and dCas9 proteins. We further show that, for target-searching dCas9 molecules, this hopping behavior depends on the presence of similar sites in close proximity. dCas9 targeting Alu retrotransposon elements, which are not concentrated but are interspersed throughout the genome, does not exhibit hopping behavior, which suggests that the target-search processes of these proteins in heterochromatin are fundamentally different. For TALEs, the hopping behavior is seemingly dependent on the compaction of the chromatin, but for dCas9, there is an additional requirement, perhaps an increased concentration of PAM sites or of seed regions including a PAM site.
Our results show that TALE and dCas9 search behavior is strongly dependent on the search environment and can have functional consequences, i.e., for relative genome-editing performance. As demonstrated by TIDE analysis, TALENs emerge as the superior genome-editing tool in heterochromatin regions, with editing efficiencies up to 5-fold greater than those of Cas9. Our results show that the extent of local search appears to be the most prominent distinguishing factor in determining editing outcomes at heterochromatin loci. To locate the target site, Cas9 is dependent on local search interactions to a greater extent than TALEs, and this becomes a disadvantage in highly compact chromatin architecture, limiting Cas9’s editing efficiency in those regions. Overall, these results serve as a guide in selecting genome-editing proteins for the engineering of hard-to-edit heterochromatin regions of mammalian cells for general as well as therapeutic purposes.
Methods
Cell culture and transfection
HeLa cells (ATCC® CCL-2™) were cultured in DMEM medium supplemented with 10% heat-inactivated FBS (HI FBS, Life Technologies), 100 U/mL of penicillin, and 100 µg/mL streptomycin. Cells were imaged 24 h after transfection. HCT116 (ATCC® CCL-247™) cells were cultured in McCoy 5A medium supplemented with 10% FBS. Culture conditions were maintained at 37 °C with 5% CO2. Cells were transfected with plasmid DNA using Lipofectamine 2000 (Life Technologies) or Fugene HD transfection reagent (3:1 Reagent:DNA ratio) (Promega #E2311).
Lentiviral transduction
Lentiviral particles were produced in HEK293T (ATCC® CRL-3216™) cells using Fugene HD (Promega #E2311) for the transfection of plasmids. HEK293T cells were split to reach a confluency of 50–60% at the time of transfection. Lentiviral vectors were co-transfected with the lentiviral packaging plasmid psPAX2 (Addgene #12260) and the VSV-G envelope plasmid pMD2.G (Addgene #12259). Transfection reactions were assembled in reduced serum media (Opti-MEM; Gibco #31985-070). For lentiviral particle production on 6-well plates, 1 μg lentiviral vector, 0.5 μg psPAX2, and 0.25 μg pMD2.G were mixed in 0.4 mL Opti-MEM. After 15–20 min of incubation at room temperature, the transfection reactions were dispersed over HEK293T cells. The media was changed 24 h post-transfection, and the virus was harvested at 60 h post-transfection. Viral supernatants were filtered using 0.45 μm cellulose acetate or polyethersulfone (PES) membrane filters and stored at −80 °C. Polybrene (8 μg/mL; Sigma-Aldrich) was supplemented to enhance transduction efficiency.
Plasmid construction
TALE(N)s were assembled using an in-house robotic liquid handling system (Fluent) following the protocol described previously15,32. Assembled TALEs were validated by restriction digestion (SpeI/BamHI) and Sanger sequencing with primers: N2-end-seq-F: 5′ AGCTGGATACCGGCCAACTCTT and C01-SEQ-R: 5′ ACCAGGTGGTCGTTTGTCAA. The Halotag gene was then subcloned into correct TALE assemblies by Gibson assembly. Plasmid pcDNA-dCas9-Halotag expressing SpdCas9-Halotag was constructed by replacing VP64 in pcDNA-dCas9-VP6434 purchased from Addgene (Addgene plasmid 47107) with Halotag, which was PCR amplified from a custom pCMV-TALECFTR-Halotag plasmid. pcDNA-dCas9-VP64 was digested with AscI and AflII, and the PCR-amplified Halotag was assembled with the digested backbone into the final plasmid construct by Gibson assembly. All Gibson assemblies were carried out using the Gibson assembly master mix (NEB #E2611L). H2B-GFP35 was a gift from Geoff Wahl (Addgene plasmid #11680). Plasmid pcDNA-H2B-Halotag was made by assembling a PCR-amplified Halotag fragment with pCMV-H2B-EGFP digested with AgeI-HF and NotI-HF. For lentiviral production, the lentiv4 empty backbone harboring a puromycin selection marker was digested with BamHI and AgeI, and PCR-amplified GFP-HP1a26 (Addgene #17652) was inserted into the backbone by Gibson assembly and validated by Sanger sequencing.
gRNA design and cloning
All gRNAs were cloned into pSPgRNA34, which features a U6 promoter and the Streptococcus pyogenes gRNA scaffold and was purchased from Addgene (Addgene plasmid 47108). A 20 bp guide sequence was cloned into pSPgRNA by annealing and phosphorylating two complementary oligonucleotides, 5′-caccgN20-3′ and 5′-aaacN20c-3′, and then ligating them into a BbsI-digested pSPgRNA backbone. N20 represents the 20 bp guide sequence.
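The oligonucleotide pattern above can be sketched in a few lines of Python. The sketch assumes that N20 in the second oligonucleotide stands for the reverse complement of the guide, which is what the two strands need in order to anneal; the helper names `reverse_complement` and `guide_oligos` are hypothetical, and the example guide is arbitrary rather than one used in this work.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(complement[base] for base in reversed(seq.upper()))

def guide_oligos(guide):
    """Build the two oligonucleotides for a 20-nt guide.

    Top strand:    5'-cacc g N20-3'
    Bottom strand: 5'-aaac revcomp(N20) c-3'  (reverse complement is an
    assumption; the text writes N20 for both strands)
    The lowercase 4-nt prefixes are the single-stranded overhangs.
    """
    guide = guide.upper()
    if len(guide) != 20 or set(guide) - set("ACGT"):
        raise ValueError("guide must be 20 nt of A/C/G/T")
    top = "cacc" + "g" + guide
    bottom = "aaac" + reverse_complement(guide) + "c"
    return top, bottom

if __name__ == "__main__":
    top, bottom = guide_oligos("GATTACAGATTACAGATTAC")
    print(top)
    print(bottom)
```

After removing the 4-nt overhangs, the two strands are exact reverse complements of each other, which gives a quick sanity check on any designed pair.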
Labeling and live-cell imaging
Cells were washed with 1× phosphate-buffered saline (PBS) and incubated with 2 nM of JF549 dye18 for 15 min. Cells were washed three times with PBS and incubated for 15 min in phenol red-free DMEM medium. Finally, cells were washed an additional three times with PBS and plated in 35 mm glass-bottom dishes (Cellvis) in phenol red-free DMEM medium. Fluorescence microscopy was performed on a Nikon Ti Eclipse microscope with ×150 magnification (CFI Apo TIRF ×100 Oil, N.A. 1.49, Nikon) using Nikon Elements software. Live cells were imaged at 30 °C in a temperature-controlled chamber (InVivo Scientific). A 561 nm excitation laser (MLC400B, Agilent Technologies) was used to excite the fluorophore. A quad-band dichroic (ZT405-488-561-640RPC, Chroma) and a 600/50 emission filter (Semrock) were used. An EMCCD camera (iXon DU-897E, Andor) was used to capture images at 20 ms and 500 ms exposure times. We could achieve a localization accuracy of up to ~5 nm for short-exposure time movies. This localization accuracy was much smaller than the diffusion length scales of TALE and Cas9 proteins in the live-cell nucleus.
For imaging the heterochromatin region, HP1 protein was fused with GFP and was stably expressed in HCT116 cell line. GFP was imaged by 488 nm laser with the emission filter of 510/20 on the same microscope. Immediately after GFP illumination, proteins (labeled with JF646)18 were tracked by illuminating the same area with the red laser (640 nm). Later during the analysis, using Fiji36, we created the region of interest (ROI) based on GFP fluorescence. The same ROI was overlaid on the corresponding JF646 movie, and trajectories within the heterochromatin ROI were analyzed.
Live-cell imaging movie of CFTR fused with JF549 in HeLa cells is available in Supplementary Movie 1.
Single-particle tracking calculations
Movies obtained from the microscope were analyzed with the Trackmate plugin37 of Fiji36, and trajectories were extracted. For analyzing the fast-moving trajectories, a cut-off (Dfast) of 5 µm2/s was selected. The maximum possible displacement between two frames (Rfast) was calculated from Dfast and was used to link particles in two consecutive frames. Trajectories were further analyzed with msdanalyzer38 and in-house MATLAB scripts to extract the diffusion coefficients and residence times of TALE proteins. We calculated the diffusion coefficient using an unbiased covariance-based estimator (CVE)39:
1 |
2 |
3 |
where Δx_n = x_{n+1} − x_n in a trajectory time series. In this equation, ⟨·⟩ denotes averages over the time series Δx_1…Δx_n. Δt is the exposure time.
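For readers who want to reproduce this calculation, the sketch below implements the covariance-based estimator in its commonly quoted form, D = ⟨Δx_n²⟩/(2Δt) + ⟨Δx_n Δx_{n+1}⟩/Δt, which is assumed here to match the estimator used. Averaging the x and y estimates to obtain a 2-D value is a further assumption, and the function names are hypothetical.

```python
def cve_diffusion_1d(x, dt):
    """Covariance-based estimate of D from one coordinate of a trajectory.

    x  : list of positions at consecutive frames
    dt : time between frames
    Uses D = <dx_n^2> / (2 dt) + <dx_n * dx_{n+1}> / dt (assumed form).
    """
    dx = [b - a for a, b in zip(x, x[1:])]
    if len(dx) < 2:
        raise ValueError("need at least three positions")
    mean_sq = sum(d * d for d in dx) / len(dx)
    # Covariance of consecutive displacements; negative when localization
    # noise dominates, which offsets the noise contribution in mean_sq.
    mean_cov = sum(a * b for a, b in zip(dx, dx[1:])) / (len(dx) - 1)
    return mean_sq / (2 * dt) + mean_cov / dt

def cve_diffusion_2d(xs, ys, dt):
    """Average of the per-axis estimates (assumption: isotropic motion)."""
    return 0.5 * (cve_diffusion_1d(xs, dt) + cve_diffusion_1d(ys, dt))
```

Because the covariance term can be negative, estimates for short or noisy trajectories may come out below zero; such values are a property of the estimator rather than a coding error.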
In our 500 ms exposure time movies, fast-moving populations of proteins were blurred and we only observed the bound proteins. For these slow trajectories, a diffusion coefficient cutoff (Dslow) of 0.05 µm2/s was selected. Trajectories were generated based on Dslow. The residence time of a protein was estimated as the total time the protein appeared in the movie. The single appearance of a protein was also considered. Residence time distributions were fitted with either a single- or a double-component exponential decay model, whichever gave the better fit, and decay rates were calculated. The general single-exponential decay model is
$$F(t) = f\,e^{-t/\tau} \tag{4}$$
The general double-exponential decay model is
$$F(t) = f_1\,e^{-t/\tau_1} + f_2\,e^{-t/\tau_2} \tag{5}$$
In this equation, τ1 and τ2 are the residence times of two populations and f1 and f2 are the corresponding fractions.
Photobleaching rate also affects the calculation of the residence times of the bound populations. Dissociation rates are related to the photobleaching rates in the following manner:
$$k_{\mathrm{output}} = k_{\mathrm{off}} + k_{b} \tag{6}$$
where koutput is the rate that is obtained from fitting the exponentials, kb is the photobleaching rate, and koff is the dissociation rate. Residence time can be calculated by taking the inverse of the dissociation rate:
$$\tau = \frac{1}{k_{\mathrm{off}}} \tag{7}$$
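A minimal sketch of the residence-time calculation follows. It assumes that the fitted rate is the sum of the dissociation and photobleaching rates, so that the dissociation rate is recovered by subtraction; the function names are hypothetical, and any numbers passed in are placeholders rather than values measured in this work.

```python
import math

def double_exponential(t, f1, tau1, f2, tau2):
    """Two-component decay with fractions f1, f2 and lifetimes tau1, tau2."""
    return f1 * math.exp(-t / tau1) + f2 * math.exp(-t / tau2)

def corrected_residence_time(k_output, k_bleach):
    """Residence time after removing the photobleaching contribution.

    Assumes k_output = k_off + k_bleach, so k_off = k_output - k_bleach
    and the residence time is 1 / k_off.
    """
    k_off = k_output - k_bleach
    if k_off <= 0:
        raise ValueError("fitted rate must exceed the photobleaching rate")
    return 1.0 / k_off
```

The guard on `k_off` reflects a practical limit: if molecules bleach about as fast as the fitted decay, the dissociation rate cannot be resolved from the data.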
To calculate the jumping angles, trajectories were divided into groups of three consecutive time points. The angle between the two vectors defined by the three points was calculated, and polar histograms were plotted using an in-house MATLAB script.
Angular distribution analysis
An in-house script was written in MATLAB to calculate the jumping angles. Three consecutive points in the trajectory were chosen, and the angle was calculated between the two vectors formed by points 1 and 2 and by points 2 and 3. This was done for all the points in every trajectory to obtain the jumping angles. Jumping angles were plotted with the polar histogram function of MATLAB.
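The angle calculation described above can be sketched as follows in Python (the original analysis used MATLAB). The sliding three-point window and the convention of reporting angles in the range 0° to 360°, with 180° meaning a full reversal, are assumptions; the function name is hypothetical.

```python
import math

def jumping_angles(points):
    """Angles (degrees, 0 to 360) between consecutive displacement vectors.

    points : list of (x, y) positions at consecutive frames.
    For each triple p1, p2, p3 the angle between p1->p2 and p2->p3 is
    returned: 0 means straight on, 180 means a reversal of direction.
    Triples containing a zero-length step are skipped.
    """
    angles = []
    for (x1, y1), (x2, y2), (x3, y3) in zip(points, points[1:], points[2:]):
        ux, uy = x2 - x1, y2 - y1
        vx, vy = x3 - x2, y3 - y2
        if (ux == 0 and uy == 0) or (vx == 0 and vy == 0):
            continue
        cross = ux * vy - uy * vx  # sign gives the turning direction
        dot = ux * vx + uy * vy
        angles.append(math.degrees(math.atan2(cross, dot)) % 360.0)
    return angles
```

Using `atan2` on the cross and dot products keeps the turning direction, so left and right turns fall on opposite halves of a polar histogram instead of being folded together.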
Distance threshold for SPT
We used Trackmate37 plugin of ImageJ to extract the trajectories of single particles. We defined the distance threshold (r) for defining the trajectory of single protein molecules. r was defined as the maximum distance that a particle can travel in consecutive frames. The value of r is dependent on the exposure time and is calculated as follows:
For 2D diffusion coefficient,
8 |
For fast diffusion,
9 |
10 |
For 500 ms, slow diffusion,
11 |
12 |
For 500 ms, chromatin movement,
13 |
14 |
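As a rough numerical guide to these thresholds, the sketch below assumes that the linking distance follows the 2-D mean squared displacement relation r² = 4DΔt. That relation, the function name, and the pairing of each cutoff with an exposure time are assumptions made for illustration; the values used in the analysis may differ.

```python
import math

def linking_distance(d_cutoff, dt):
    """Maximum frame-to-frame displacement for a 2-D diffusion cutoff.

    d_cutoff : diffusion coefficient cutoff in um^2/s
    dt       : exposure time in s
    Assumes r = sqrt(4 * D * dt); returns r in um.
    """
    return math.sqrt(4.0 * d_cutoff * dt)

if __name__ == "__main__":
    # Cutoffs quoted in the text, paired with assumed exposure times.
    print(round(linking_distance(5.0, 0.020), 3))   # fast diffusion, 20 ms
    print(round(linking_distance(0.05, 0.500), 3))  # slow diffusion, 500 ms
```

Under these assumptions the fast cutoff of 5 µm2/s at 20 ms gives roughly 0.63 µm, and the slow cutoff of 0.05 µm2/s at 500 ms gives roughly 0.32 µm.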
Local search analysis
Using the mean velocity filter in TrackMate37, a threshold was set to include bound molecules only. Using the links in tracks statistics, the average displacement dave over all tracks was calculated:
Dinst was defined as
15 |
For H2B bound population:
16 |
17 |
Dinst determination for local search:
18 |
19 |
Dinst determination for 3D search:
20 |
21 |
A range of Dinst ± 1 μm2/s was set to define bound, local search and 3-D diffusion regimes.
The threshold of Dinst was used to analyze the individual trajectories and characterize the cycles of local search. Each cycle of local search was defined by the time spent by the protein locally searching the DNA between consecutive 3D search regimes. We calculated the local search cycle times and plotted their histograms in Fig. 3c.
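The cycle definition above can be sketched as a simple run-length calculation on a per-frame series of Dinst values. The sketch uses a single threshold to separate local from 3D search and counts only local runs that are bounded by 3D search on both sides; both choices, along with the function name, are assumptions made for illustration.

```python
def local_search_cycle_times(d_inst, threshold, frame_time):
    """Durations of local search cycles in one trajectory.

    d_inst     : list of instantaneous diffusion coefficients, one per frame
    threshold  : values below this count as local search, others as 3D search
    frame_time : time per frame (same unit as the returned durations)
    Only runs of local search that start after a 3D search frame and end
    at a 3D search frame are counted.
    """
    cycles = []
    run = 0
    seen_global = False
    for d in d_inst:
        if d < threshold:
            if seen_global:
                run += 1  # extend the current local search run
        else:
            if run > 0:
                cycles.append(run * frame_time)  # run closed by 3D search
            run = 0
            seen_global = True
    return cycles
```

Runs that touch the start or end of a trajectory are dropped because their true duration is unknown, which avoids biasing the histogram toward short cycles.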
Reporter assay cloning and transfection
The CMV-GFP-HP1a (Addgene #17652) plasmid was modified to remove the GFP-HP1a sequence using BamHI and HindIII. This backbone was used to clone TALEN or gRNA binding sites in-frame with GFP, which was amplified by PCR from the same backbone. Complementary oligos, Forward Primer: ctaggccaccatggtg(N20NGG)cc and Reverse Primer: gatcgg(revcomp(N20NGG))caccatggtggc, containing the Kozak sequence and the binding site with PAM, were phosphorylated, annealed, and then ligated to the backbone using T7 Ligase (NEB #M0318L). 0.5×10^5 cells/well were plated in a 24-well plate 24 h before transfection. 335 ng of reporter plasmid was diluted in 26 μL of pre-warmed Opti-MEM along with 195 ng of either empty plasmid (no gRNA) or gRNA plasmid. 1.65 μL of Fugene HD reagent equilibrated to room temperature was added to the Opti-MEM-DNA solution, mixed by pipetting 15–16 times, and incubated at room temperature for 15–20 min. The resulting Opti-MEM-DNA-Fugene HD mixture was added dropwise to sample wells. Cells were harvested 48 h post-transfection for flow cytometry measurements.
Flow cytometry and analysis
Cells were trypsinized and collected 48 h post-transfection. Collected cells were resuspended in 500 μL PBS to prepare flow cytometry samples. Samples were analyzed on an LSR II Flow Cytometer (BD Biosciences), and data analysis was performed using FCS Express 6 (Supplementary Fig. 7b). The arithmetic mean of GFP fluorescence was used to compare Cas9-gRNA samples to a reporter-only control.
Editing comparison assay
Four gRNAs and two TALEN pairs were designed for each heterochromatin locus. One gRNA and two TALEN pairs were designed for euchromatin loci. The top two gRNAs with the highest predicted cutting efficiency were selected using the Benchling CRISPR design tool (https://www.benchling.com/crispr/) and the CHOPCHOP gRNA design tool (https://chopchop.cbu.uib.no/). TALEN pairs were designed with CHOPCHOP and the SAPTA scoring algorithm (http://bao.rice.edu/Research/BioinformaticTools/TAL_targeter.html). TALEN pair constructs with the highest predicted cutting efficiency were synthesized for TIDE analysis. Cas9-gRNA and TALEN pair plasmids were transfected into HCT116 cells in equimolar amounts using Fugene HD transfection reagent following the manufacturer’s protocol, and cell samples were collected after 48 h for TIDE analysis, which corresponds to cells undergoing at least two cell cycles. This averages out the confounding factors associated with differential HDR and NHEJ efficiency at different cell cycle stages.
Genomic PCR and DNA sequencing
Genomic DNA was extracted from cell pellets using QuickExtract DNA solution 1.0 (Epicentre). Genomic PCR was performed using Herculase polymerase (Agilent) or KOD polymerase with the primers listed in Supplementary Table 5. The PCR products were sequenced by Sanger DNA sequencing (Genewiz or ACGT Inc.).
TIDE analysis
Genomic PCR products were purified using a gel extraction kit (Zymo Research). Indel rates were analyzed with the online TIDE software (http://tide.nki.nl) using WT sequences as the reference. Default parameters were used for indel analysis of CRISPR/Cas9. For TALEN editing, the 20 bp sequence between the paired binding sites was used as the “guide sequence”.
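For the TALEN case, the sequence between the paired binding sites can be pulled out of the reference amplicon programmatically. The sketch below is an illustration under stated assumptions: both binding sites are given as they read on the reference strand, the function name is hypothetical, and the example sequences are invented rather than taken from the paper.

```python
def talen_spacer(reference, left_site, right_site):
    """Return the reference sequence lying between two TALEN binding sites.

    Both sites must be written as they appear on the reference strand
    (the right-hand site is therefore NOT reverse-complemented here).
    """
    ref = reference.upper()
    left = left_site.upper()
    right = right_site.upper()

    left_start = ref.find(left)
    if left_start == -1:
        raise ValueError("left binding site not found in reference")
    spacer_start = left_start + len(left)

    # Search for the right-hand site only downstream of the left-hand site.
    spacer_end = ref.find(right, spacer_start)
    if spacer_end == -1:
        raise ValueError("right binding site not found downstream of left site")
    return ref[spacer_start:spacer_end]


if __name__ == "__main__":
    # Invented sequences, for illustration only.
    left = "TGACCTAGGCATTACGA"
    right = "TCGGATACCGTTAGCCA"
    amplicon = "GGGAAA" + left + "ACGTACGTACGTACGTACGT" + right + "CCCTTT"
    spacer = talen_spacer(amplicon, left, right)
    print(spacer, len(spacer))
```

Checking the returned length against the expected 20 bp before submitting it as the “guide sequence” is a cheap way to catch a mis-specified binding site.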
Statistical analysis
Data are shown as mean and s.e.m. All p-values were generated from two-tailed t tests using the GraphPad Prism software package (version 6.0c, GraphPad Software) or Microsoft Excel (version 15.24).
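For readers reproducing these summaries outside Prism or Excel, the quantities can be computed directly. The sketch below assumes an unpaired, equal-variance (Student's) t test, which the text does not specify, and uses made-up numbers. It returns the t statistic and degrees of freedom; the two-tailed p-value would then come from the t distribution, for example via `scipy.stats.ttest_ind`.

```python
import math
import statistics


def mean_sem(values):
    """Return (mean, standard error of the mean) using the sample SD (n - 1)."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two replicates")
    return statistics.mean(values), statistics.stdev(values) / math.sqrt(n)


def student_t(a, b):
    """Unpaired, equal-variance t statistic and degrees of freedom (an assumption)."""
    n1, n2 = len(a), len(b)
    m1, m2 = statistics.mean(a), statistics.mean(b)
    v1, v2 = statistics.variance(a), statistics.variance(b)
    df = n1 + n2 - 2
    # Pooled variance weights each group's variance by its degrees of freedom.
    pooled = ((n1 - 1) * v1 + (n2 - 1) * v2) / df
    t = (m1 - m2) / math.sqrt(pooled * (1.0 / n1 + 1.0 / n2))
    return t, df


if __name__ == "__main__":
    # Made-up indel percentages for two conditions, three replicates each.
    group_a = [12.0, 15.0, 14.0]
    group_b = [31.0, 28.0, 35.0]
    print(mean_sem(group_a), mean_sem(group_b), student_t(group_a, group_b))
```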
Reporting summary
Further information on research design is available in the Nature Research Reporting Summary linked to this article.
Acknowledgements
The authors thank L. Lavis for generously providing HaloTag ligands for imaging experiments and Tarun Chhabra and Neetesh Sharma for assistance with statistical data analysis. We also thank Kai Wen Teng and Duncan Nall from Selvin lab for their help with initial imaging experiments. We would also like to thank Guanhua Xun and Emily Gaither for help with TALEN synthesis. This work was supported by the National Institutes of Health (1U54DK107965 and 1UM1HG009402 to H.Z. and NS100019 to P.R.S.) and the National Science Foundation (PHY 1430124 to P.R.S.).
Author contributions
Conceptualization: S.J., S.S., and H.Z.; methodology: S.J. and S.S.; Matlab scripts: S.S.; experimental investigation: S.J., S.S., C.Y., Z.F., M.Z., M.L., S.T.L., X.X., and S.A.; imaging data analysis: S.S., S.J., and S.A.; ChIP-seq data analysis: Y.W.; writing: S.J., S.S., H.Z., C.M.S., and P.R.S.; guidance and discussion: H.Z., P.R.S., and C.M.S.; supervision: H.Z. and P.R.S.
Data availability
All data are available in the main text or the supplementary information text and files. Genomic loci sequence files are provided in the Source data file. Single-molecule imaging raw datasets and any other relevant data are available from the corresponding author upon request. Source data are provided with this paper.
Code availability
The custom code used for data analysis in this study is available from the corresponding author upon request. The code can also be accessed on GitHub (https://github.com/sshukla101/SelvinLab/blob/master/Diffusion_Coefficient_Calculation_CVE_Estimator.m).
Competing interests
The authors declare no competing interests.
Footnotes
Peer review information Nature Communications thanks Chirlmin Joo and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Peer reviewer reports are available.
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
These authors contributed equally: Surbhi Jain, Saurabh Shukla
Supplementary information
Supplementary information is available for this paper at 10.1038/s41467-020-20672-5.
References
1. Gaj T, Gersbach CA, Barbas CF. ZFN, TALEN and CRISPR/Cas-based methods for genome engineering. Trends Biotechnol. 2013;31:397–405. doi: 10.1016/j.tibtech.2013.04.004.
2. Wei C, et al. TALEN or Cas9—rapid, efficient and specific choices for genome modifications. J. Genet. Genomics. 2013;40:281–289. doi: 10.1016/j.jgg.2013.03.013.
3. Jinek M, et al. A programmable dual-RNA–guided DNA endonuclease in adaptive bacterial immunity. Science. 2012;337:816–821. doi: 10.1126/science.1225829.
4. Boch J, et al. Breaking the code of DNA binding specificity of TAL-type III effectors. Science. 2009;326:1509–1512. doi: 10.1126/science.1178811.
5. Moscou MJ, Bogdanove AJ. A simple cipher governs DNA recognition by TAL effectors. Science. 2009;326:1501. doi: 10.1126/science.1178817.
6. Cuculis L, Abil Z, Zhao H, Schroeder CM. Direct observation of TALE protein dynamics reveals a two-state search mechanism. Nat. Commun. 2015;6:7277. doi: 10.1038/ncomms8277.
7. Cuculis L, Abil Z, Zhao H, Schroeder CM. TALE proteins search DNA using a rotationally decoupled mechanism. Nat. Chem. Biol. 2016;12:831–837. doi: 10.1038/nchembio.2152.
8. Sternberg SH, Redding S, Jinek M, Greene EC, Doudna JA. DNA interrogation by the CRISPR RNA-guided endonuclease Cas9. Nature. 2014;507:62–67. doi: 10.1038/nature13011.
9. Knight SC, et al. Dynamics of CRISPR-Cas9 genome interrogation in living cells. Science. 2015;350:823–826. doi: 10.1126/science.aac6572.
10. Shibata M, et al. Real-space and real-time dynamics of CRISPR-Cas9 visualized by high-speed atomic force microscopy. Nat. Commun. 2017;8:1–9. doi: 10.1038/s41467-016-0009-6.
11. Globyte V, Lee SH, Bae T, Kim J-S, Joo C. CRISPR/Cas9 searches for a protospacer adjacent motif by lateral diffusion. EMBO J. 2019;38:e99466. doi: 10.15252/embj.201899466.
12. Brinkman EK, Chen T, Amendola M, van Steensel B. Easy quantitative assessment of genome editing by sequence trace decomposition. Nucleic Acids Res. 2014;42:e168–e168. doi: 10.1093/nar/gku936.
13. Elf J, Li G-W, Xie XS. Probing transcription factor dynamics at the single-molecule level in a living cell. Science. 2007;316:1191–1194. doi: 10.1126/science.1141967.
14. Liu Z, Tjian R. Visualizing transcription factor dynamics in living cells. J. Cell Biol. 2018;217:1181–1191. doi: 10.1083/jcb.201710038.
15. Deininger P. Alu elements: know the SINEs. Genome Biol. 2011;12:236. doi: 10.1186/gb-2011-12-12-236.
16. Stagge F, Mitronova GY, Belov VN, Wurm CA, Jakobs S. Snap-, CLIP- and halo-tag labelling of budding yeast cells. PLoS ONE. 2013;8:e78745. doi: 10.1371/journal.pone.0078745.
17. Chao R, et al. Fully automated one-step synthesis of single-transcript TALEN pairs using a biological foundry. ACS Synth. Biol. 2017;6:678–685. doi: 10.1021/acssynbio.6b00293.
18. Grimm JB, et al. A general method to improve fluorophores for live-cell and single-molecule microscopy. Nat. Methods. 2015;12:244–250. doi: 10.1038/nmeth.3256.
19. Izeddin I, et al. Single-molecule tracking in live cells reveals distinct target-search strategies of transcription factors in the nucleus. eLife. 2014;3:e02230. doi: 10.7554/eLife.02230.
20. Chen J, et al. Single-molecule dynamics of enhanceosome assembly in embryonic stem cells. Cell. 2014;156:1274–1285. doi: 10.1016/j.cell.2014.01.062.
21. Berg OG, Winter RB, von Hippel PH. Diffusion-driven mechanisms of protein translocation on nucleic acids. 1. Models and theory. Biochemistry. 20:6929–6948.
22. Normanno D, Dahan M, Darzacq X. Intra-nuclear mobility and target search mechanisms of transcription factors: a single-molecule perspective on gene expression. Biochim. Biophys. Acta. 1819:482–493.
23. Loffreda A, et al. Live-cell p53 single-molecule binding is modulated by C-terminal acetylation and correlates with transcriptional activity. Nat. Commun. 2017;8:313. doi: 10.1038/s41467-017-00398-7.
24. Rice JC, Allis CD. Histone methylation versus histone acetylation: new insights into epigenetic regulation. Curr. Opin. Cell Biol. 2001;13:263–273. doi: 10.1016/S0955-0674(00)00208-8.
25. Cheutin T, et al. Maintenance of stable heterochromatin domains by dynamic HP1 binding. Science. 2003;299:721–725. doi: 10.1126/science.1078572.
26. Yarrington RM, Verma S, Schwartz S, Trautman JK, Carroll D. Nucleosomes inhibit target cleavage by CRISPR-Cas9 in vivo. Proc. Natl. Acad. Sci. 2018. doi: 10.1073/pnas.1810062115.
27. Chen X, et al. Probing the impact of chromatin conformation on genome editing tools. Nucleic Acids Res. 2016;44:6482–6492. doi: 10.1093/nar/gkw524.
28. Isaac RS, et al. Nucleosome breathing and remodeling constrain CRISPR-Cas9 function. eLife. 2016;5:e13450.
29. Tasan I, et al. CRISPR/Cas9-mediated knock-in of an optimized TetO repeat for live cell imaging of endogenous loci. Nucleic Acids Res. 2018;46:e100–e100. doi: 10.1093/nar/gky501.
30. Labun K, et al. CHOPCHOP v3: expanding the CRISPR web toolbox beyond genome editing. Nucleic Acids Res. 2019;47:W171–W174. doi: 10.1093/nar/gkz365.
31. Lin Y, et al. SAPTA: a new design tool for improving TALE nuclease activity. Nucleic Acids Res. 2014;42:e47–e47. doi: 10.1093/nar/gkt1363.
32. Sun N, Liang J, Abil Z, Zhao H. Optimized TAL effector nucleases (TALENs) for use in treatment of sickle cell disease. Mol. Biosyst. 2012;8:1255–1263. doi: 10.1039/c2mb05461b.
33. Slaymaker IM, et al. Rationally engineered Cas9 nucleases with improved specificity. Science. 2016;351:84–88. doi: 10.1126/science.aad5227.
34. Perez-Pinera P, et al. RNA-guided gene activation by CRISPR-Cas9-based transcription factors. Nat. Methods. 2013;10:973–976. doi: 10.1038/nmeth.2600.
35. Kanda T, Sullivan KF, Wahl GM. Histone-GFP fusion protein enables sensitive analysis of chromosome dynamics in living mammalian cells. Curr. Biol. CB. 1998;8:377–385. doi: 10.1016/S0960-9822(98)70156-3.
36. Schindelin J, et al. Fiji: an open-source platform for biological-image analysis. Nat. Methods. 2012;9:676–682. doi: 10.1038/nmeth.2019.
37. Tinevez J-Y, et al. TrackMate: An open and extensible platform for single-particle tracking. Methods. 2017;115:80–90. doi: 10.1016/j.ymeth.2016.09.016.
38. Persson F, Lindén M, Unoson C, Elf J. Extracting intracellular diffusive states and transition rates from single-molecule tracking data. Nat. Methods. 2013;10:265–269. doi: 10.1038/nmeth.2367.
39. Vestergaard CL, Blainey PC, Flyvbjerg H. Optimal estimation of diffusion coefficients from single-particle trajectories. Phys. Rev. E. 2014;89:022726. doi: 10.1103/PhysRevE.89.022726.