Abstract
Using Monte Carlo simulations, we deconvolved the sliding and hopping kinetics of GFP-LacI proteins on elongated DNA from their experimentally observed seconds-long diffusion trajectories. Our simulations suggest the following results: (i) in each diffusion trajectory, a protein makes on average hundreds of alternating slides and hops with a mean sliding time of several tens of milliseconds; (ii) sliding dominates the root-mean-square displacement of fast diffusion trajectories, whereas hopping dominates slow ones; (iii) flow and variations in salt concentration have limited effects on hopping kinetics, while in vivo DNA configuration is not expected to influence sliding kinetics; and (iv) the rate of occurrence for hops longer than 200 nm agrees with experimental data for EcoRV proteins.
I. INTRODUCTION
Timely target association of DNA-binding (DB) proteins is important for prompt cellular response to external stimuli using mechanisms such as gene regulation, DNA replication, and DNA repair. The target association rates of DB proteins frequently deviate from the diffusion limit due to their interactions with nonspecific DNA via the process of facilitated diffusion [1–3]. Facilitated diffusion mainly consists of two motions: sliding, where a protein diffuses along nonspecific DNA without losing contact, and hopping, where the protein jumps off DNA and undergoes 3D diffusion before reassociating to the same (Fig. 1) or a different segment of DNA (referred to as intersegmental transfer). In this article, we regard events with long hopping distances, usually called jumping, as a form of hopping. A DB protein may slide and hop many times on nonspecific DNA before reaching the target. In order to quantify the effect of facilitated diffusion on DB proteins’ target binding rate, how long a protein spends sliding on DNA (mean sliding time 〈t1〉) and how fast it moves along DNA (sliding diffusion coefficient D1) are two critical parameters for all calculations of in vitro and in vivo DNA geometries [2,4–8].
Single-molecule (SM) fluorescence imaging studies of DB proteins’ Brownian diffusion along elongated DNA have obtained effective diffusion coefficients D for the whole seconds-long diffusions (in this article we define each observed diffusion event between protein association and permanent dissociation to be a diffusion trajectory, and t is the total time of the diffusion) [3,9–21]. In the past, numerous studies had substituted t and D values in the place of 〈t1〉 and D1 in target binding rate and protein-nonspecific-DNA binding energy calculations since 〈t1〉 and D1 were not experimentally accessible [3,5–8,10,12,17,22]. Since the extent of hopping involvement is unknown, it is dubious to use t and D values for 〈t1〉 and D1. Recent evidence suggests that these diffusion trajectories include both sliding and hopping: (i) the sliding time of DB proteins has been estimated to be milliseconds [6,12,22,23]; (ii) the sliding displacement has been estimated to be less than 50 bp [24], shorter than the displacements of whole diffusion trajectories of the reported DB proteins (>100 nm); and (iii) hops longer than 200 nm have been observed [15]. In order to obtain 〈t1〉 and D1 from experimental data, deconvolving sliding and hopping from individual diffusion trajectories is necessary.
II. SIMULATIONS
Here we deconvolve sliding and hopping in a diffusion trajectory and obtain 〈t1〉 and D1 using (i) Monte Carlo simulations, (ii) experimental D and t values, and (iii) the following two relations (derived in Ref. [25]):
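$$D\,t = N\left(D_1\langle t_1\rangle + D_3\langle t_3\rangle\right) \qquad (1)$$
$$t = N\left(\langle t_1\rangle + \langle t_3\rangle\right) \qquad (2)$$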
where N is the mean number of sliding and hopping alternations in a diffusion trajectory, D3 is the 3D diffusion coefficient of the protein, and 〈t3〉 is the mean hopping time. From the hopping simulations we first determine N and 〈t3〉; then, combining these with the experimental D and t values, we obtain 〈t1〉 and D1 from Eqs. (1) and (2).
For each hopping simulation, a protein was initially positioned at the protein-center-to-DNA-center distance R = rDNA + rprotein + Δr, where rDNA = 1 nm is the DNA radius, rprotein = 2.68 nm for GFP-LacI, and Δr ≈ 0.5 nm is an estimate of the protein-DNA binding distance (the location of the interaction potential minimum, beyond which we consider no protein-DNA interactions) [26,27]. The protein immediately dissociated from DNA and underwent 3D diffusion until it rebound to DNA, at which point its position was recorded, or until the maximum number of steps of the hopping simulation was reached, in which case the protein was assumed to have permanently dissociated and its diffusion trajectory was not used in subsequent data analysis. Figure 2 describes the criterion for determining whether a hopping protein collided with DNA. For every step, the length of the perpendicular drawn from the center of the DNA to the line connecting the last two protein locations (dashed arrow) was calculated; if it was less than R, association occurred. The binding position was chosen to be the midpoint between the two protein locations. We modeled DNA as an infinite, rigid cylinder and assumed 100% probability of association on protein-DNA collision; the distance along DNA between the protein binding location and its origin denotes the hopping distance.
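The collision criterion and a single hop can be sketched as follows. This is a minimal illustration, not the authors' code: the function names are hypothetical, the DNA axis is taken as the x axis, lengths are in nm, the perpendicular is clamped to the segment between the two positions (an assumption), and the very first step is treated like any other step (also an assumption).

```python
import math
import random

def distance_axis_to_segment(p_prev, p_next):
    """Shortest distance from the DNA axis (taken as the x axis) to the segment
    joining two consecutive protein positions, measured in the y-z plane."""
    y0, z0 = p_prev[1], p_prev[2]
    dy, dz = p_next[1] - y0, p_next[2] - z0
    seg2 = dy * dy + dz * dz
    if seg2 == 0.0:
        return math.hypot(y0, z0)
    # Foot of the perpendicular from the axis, clamped to the segment (assumption).
    s = min(1.0, max(0.0, -(y0 * dy + z0 * dz) / seg2))
    return math.hypot(y0 + s * dy, z0 + s * dz)

def check_association(p_prev, p_next, R):
    """Return (True, midpoint) if the step comes within R of the axis, else (False, None)."""
    if distance_axis_to_segment(p_prev, p_next) < R:
        midpoint = tuple((a + b) / 2.0 for a, b in zip(p_prev, p_next))
        return True, midpoint
    return False, None

def simulate_hop(R, delta, max_steps, rng):
    """Simulate one hop with Gaussian steps of standard deviation delta per axis.
    Returns the hopping distance along DNA, or None if the protein has not
    rebound within max_steps (treated as permanently dissociated)."""
    pos = (0.0, R, 0.0)  # start at the binding distance R from the axis
    for _ in range(max_steps):
        new = tuple(c + rng.gauss(0.0, delta) for c in pos)
        hit, where = check_association(pos, new, R)
        if hit:
            return abs(where[0])  # displacement along the DNA axis
        pos = new
    return None
```

Calling `simulate_hop(4.2, 0.0267, 1000, random.Random(1))` runs one hop with a deliberately small step cap; the paper's cap of about 2.1 × 10⁸ steps would be far too slow for this pure-Python sketch.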
The simulation parameters were determined as follows. The hopping simulation step size δ and step time τ are the collision distance and time, respectively [28]. At temperature T = 294 K, the instantaneous velocity of a protein of mass m in solution is taken to be the one-dimensional root-mean-square (rms) velocity vrms = √(kBT/m), where kB is the Boltzmann constant and m = 67.5 kDa for a GFP-LacI monomer. Using the Einstein-Stokes relation, D3 = δ²/(2τ) = kBT/(6πηr) = 8.03 × 10⁷ nm²/s for GFP-LacI, where the viscosity of water is η = 10⁻³ N s/m² and the protein hydrodynamic radius is r = 2.68 nm (assuming a typical protein density of 1.38 g/cm³). Together with vrms = δ/τ, this gives δ = 0.267 Å and τ = 4.46 ps. Each simulation step in the x, y, and z dimensions was drawn from a Gaussian distribution with a mean of zero and a standard deviation of δ.
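As a check on these parameters, the sketch below recomputes D3, δ, and τ from T, m, η, and r. It assumes the one-dimensional rms velocity √(kBT/m) and the random-walk relation δ = vrms τ; the function name and dictionary keys are illustrative only.

```python
import math

K_B = 1.380649e-23        # Boltzmann constant, J/K
DALTON = 1.66053907e-27   # atomic mass unit, kg

def step_parameters(T, mass_kda, eta, radius_nm):
    """Collision distance and time of a Brownian particle (SI inputs except
    mass in kDa and radius in nm)."""
    thermal = K_B * T                                   # k_B T in J
    mass = mass_kda * 1e3 * DALTON                      # kg
    v_rms = math.sqrt(thermal / mass)                   # 1D rms velocity (assumed form), m/s
    D3 = thermal / (6.0 * math.pi * eta * radius_nm * 1e-9)  # Einstein-Stokes, m^2/s
    delta = 2.0 * D3 / v_rms                            # from D3 = delta^2/(2 tau), delta = v tau
    tau = delta / v_rms
    return {
        "v_rms_m_per_s": v_rms,
        "D3_nm2_per_s": D3 * 1e18,
        "delta_angstrom": delta * 1e10,
        "tau_ps": tau * 1e12,
    }

params = step_parameters(T=294.0, mass_kda=67.5, eta=1e-3, radius_nm=2.68)
```

With these inputs the sketch gives values close to the quoted δ = 0.267 Å, τ = 4.46 ps, and D3 = 8.03 × 10⁷ nm²/s; small differences in the last digit come from rounding of the constants.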
The time limit for simulation of each GFP-LacI hop was ≈ 1 ms (or 2.1 × 10⁸ steps), selected according to the following two estimates: (i) Since the observed diffusion of proteins on DNA is the combination of sliding and hopping with diffusion coefficients D1 and D3, respectively, the maximum total hopping time of a diffusion trajectory cannot exceed Nt3,max = Dt/D3, the value reached when D1 ≈ 0. For GFP-LacI, 〈D〉 ≈ 2 × 10⁴ nm²/s [3], which dictates that t3,max ≈ 0.25 ms when t is on the order of 1 s, using the lower bound for N of one hop per diffusion trajectory. Therefore, a hopping time limit of t3,max ≈ 1 ms for a single hop should be sufficiently long for all 3D diffusing proteins to return to DNA. (ii) A longer hopping time limit, such as 10 ms per hop (data not shown), results in additional proteins returning to DNA only after individual hops long enough to be detected in SM measurements; such hops are usually used to separate single diffusion trajectories into segments free of large displacements for accurate D analysis [3,15].
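The bound in estimate (i) is simple arithmetic and can be reproduced as below. The function name is hypothetical, and the inputs are the values quoted in the text (t = 1 s, N = 1, τ = 4.46 ps).

```python
def max_hop_time(D_obs, t_total, D3, n_hops=1):
    """Upper bound on the time of a single hop, from N * t3_max = D * t / D3
    (the limit in which sliding contributes no displacement, D1 ~ 0)."""
    return D_obs * t_total / (n_hops * D3)

# Values quoted in the text: <D> ~ 2e4 nm^2/s, t ~ 1 s, D3 = 8.03e7 nm^2/s.
t3_max = max_hop_time(D_obs=2e4, t_total=1.0, D3=8.03e7)   # seconds

# Duration covered by the simulation cap of 2.1e8 steps of 4.46 ps each.
sim_limit = 2.1e8 * 4.46e-12                                # seconds
```

Here `t3_max` comes out near 0.25 ms and `sim_limit` just under 1 ms, so the simulation cap exceeds the bound by roughly a factor of four.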
III. RESULTS AND DISCUSSION
For 4 × 10⁵ GFP-LacI hopping simulations (maximum simulation time t3,max ≈ 1 ms) with δ = 0.267 Å and R = 4.2 nm, 99.809% of the trials resulted in the protein reassociating to DNA, and thus the probability for a simulated hop to return to DNA is P = 0.99809. The hopping characteristics are shown in Figs. 3(a) and 3(b), in which the mean hopping distance along DNA is 3.37 Å (median, 0.41 Å), the mean hopping height (the maximum radial distance of the protein from DNA) is 4.93 Å (median, 0.45 Å), and the mean number of steps per hop is 4.97 × 10⁴ (median, 5), yielding a mean hopping time of 〈t3〉 = 0.22 μs. The mean number of hops in a GFP-LacI diffusion trajectory is N = 526, obtained by dividing the total number of simulated hops of 4 × 10⁵ by the total number of nonreturned hopping events of 763; the distribution for the number of hops per diffusion trajectory is shown in Fig. 4. This set of values has been verified to converge with those from a larger simulation of 4 × 10⁶ hops. Specifically, the N values differ by 0.57%. The inset of Fig. 3 shows the distribution of total hopping displacements in a diffusion trajectory, with each data point simulated from 526 randomly selected hopping displacements. The rms total hopping displacement per diffusion trajectory is 127.5 nm, and the mean total hopping time is N〈t3〉 = 115 μs. Note that although shorter hopping distances, such as ones less than the base pair length of 3.4 Å, do not carry direct biological significance, nor do they noticeably disrupt sliding, they are important for correctly assessing rms total hopping displacement statistics in a diffusion trajectory.
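The bookkeeping behind N and 〈t3〉 can be sketched as follows. The ratio of the rounded counts quoted above gives about 524, close to the reported N = 526. The geometric sampler is an added assumption (each hop escapes independently with the same probability) and is meant only to illustrate how a distribution of hops per trajectory could arise; the variable and function names are hypothetical.

```python
import math
import random

total_hops = 400_000       # simulated hops (from the text)
non_returned = 763         # hops that did not return within the time limit
tau = 4.46e-12             # step time, s
mean_steps = 4.97e4        # mean number of steps per hop

escape_prob = non_returned / total_hops       # probability that a hop does not return
mean_hops = total_hops / non_returned         # mean hops per trajectory
mean_hop_time = mean_steps * tau              # <t3>, s
total_hop_time = mean_hops * mean_hop_time    # N <t3>, s

def sample_hops_per_trajectory(q, rng):
    """Draw the number of hops until the first non-returning hop, assuming
    independent hops with escape probability q (geometric distribution)."""
    u = 1.0 - rng.random()                    # uniform in (0, 1]
    return int(math.log(u) / math.log(1.0 - q)) + 1

rng = random.Random(0)
samples = [sample_hops_per_trajectory(escape_prob, rng) for _ in range(20000)]
sample_mean = sum(samples) / len(samples)
```

Under this assumption the sample mean approaches 1/(1 − P), and individual trajectories vary widely in their hop counts.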
We can also compute the “diffusion to capture” probability P for a protein to return to DNA using a steady-state solution to the diffusion equation, incorporating a cutoff radial distance c [28]. Proteins released after the initial step at b = 4.22 nm are either adsorbed at the DNA surface (R = 4.2 nm) or escape beyond the cutoff distance c. The probability is time independent and given by
$$P = \frac{\ln(c/b)}{\ln(c/R)} \qquad (3)$$
Imposing the same cutoff distance c = 551.2 nm in subsequent simulations, we obtained P = 0.99865, in near agreement with the analytical value above.
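For orientation, the capture probability can be evaluated numerically. The sketch assumes the standard steady-state result for diffusion between two coaxial absorbing cylinders, P = ln(c/b)/ln(c/R); that this is the exact form used for Eq. (3) is an assumption, and the function name is hypothetical.

```python
import math

def capture_probability(R, b, c):
    """Probability that a particle released at radius b is absorbed at the inner
    cylinder (radius R) before reaching the outer cutoff (radius c), from the
    steady-state diffusion equation in cylindrical geometry (assumed form)."""
    if not (R <= b <= c):
        raise ValueError("expected R <= b <= c")
    return math.log(c / b) / math.log(c / R)

# Radii quoted in the text, in nm.
P_analytic = capture_probability(R=4.2, b=4.22, c=551.2)
```

With the quoted radii this evaluates to roughly 0.999, to be compared with the simulated P = 0.99865. The result is sensitive to b: a release radius one step size farther out (b ≈ 4.227 nm) lowers it to about 0.9987.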
Having obtained 〈t3〉 and N from simulation, we now solve Eqs. (1) and (2) for 〈t1〉 and D1 from the experimentally measured values of t and D. With values of D for GFP-LacI ranging from 2.3 × 10² to 1.3 × 10⁵ nm²/s [3] and t = 10.4 s [Fig. 3(d)],
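$$\langle t_1\rangle = \frac{t}{N} - \langle t_3\rangle \qquad (4)$$
$$D_1 = \frac{D\,t - N D_3\langle t_3\rangle}{N\langle t_1\rangle} \qquad (5)$$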
The sliding time is several tens of ms and D1 ranges from ≈0 for slow diffusion to ≈D for fast diffusion. The 〈D1〉 for GFP-LacI is 9.1 × 10³ nm²/s using 〈D〉 of 10⁴ nm²/s. Since D1 > 0, Eq. (5) sets the lower bound of D such that it must be greater than D3N〈t3〉/t ≈ 896 nm²/s. The rms total sliding displacement in a diffusion trajectory becomes longer than the rms total hopping displacement when D > 2ND3〈t3〉/t ≈ 1790 nm²/s.
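The numbers in this paragraph can be reproduced with a few lines. The sketch assumes that the two relations take the form Dt = N(D1〈t1〉 + D3〈t3〉) and t = N(〈t1〉 + 〈t3〉), which is consistent with the bounds quoted above; the function name is hypothetical.

```python
def solve_sliding(D, t, N, D3, t3):
    """Solve the two assumed relations
         D * t = N * (D1 * t1 + D3 * t3)
         t     = N * (t1 + t3)
    for the mean sliding time t1 (s) and sliding diffusion coefficient D1 (nm^2/s)."""
    t1 = t / N - t3
    D1 = (D * t - N * D3 * t3) / (N * t1)
    return t1, D1

# GFP-LacI values from the text (R = 4.2 nm).
N, D3, t3, t = 526, 8.03e7, 0.22e-6, 10.4
t1, D1 = solve_sliding(D=1e4, t=t, N=N, D3=D3, t3=t3)

D_lower_bound = N * D3 * t3 / t      # D below this would require D1 < 0
D_crossover = 2.0 * D_lower_bound    # above this, sliding dominates the rms displacement
```

With these rounded inputs, `t1` is about 19.8 ms and `D1` about 9.1 × 10³ nm²/s. The bound and crossover come out near 893 and 1787 nm²/s, within rounding of the quoted 896 and 1790 nm²/s.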
Since our protein-nonspecific-DNA binding distance is an estimate, we have carried out simulations with Δr ranging from 0.5 to 6.5 nm (corresponding to protein-DNA distances R of 4.2 and 10.2 nm, respectively). Comparing the R = 10.2 nm results to the R = 4.2 nm results, the distributions for hopping distances [Fig. 3(a)] and hopping times [Fig. 3(b)] are similar, although the mean hopping distance reduces to 2.82 Å, the mean number of steps per hop reduces to 3.23 × 10⁴, and the mean number of hops N doubles to 1101. Solving for 〈t1〉 and D1 at R = 10.2 nm, we found 〈t3〉 = 0.14 μs, N〈t3〉 = 154 μs, 〈t1〉 = 9.4 ms (approximately half of the value for R = 4.2 nm), and D1 to be similar to the previously calculated value for R = 4.2 nm. Given that the sliding and hopping values at R = 4.2 and 10.2 nm are close, our method and results can be safely applied to most DB protein-DNA binding distances.
To investigate hopping distances within a diffusion trajectory, Fig. 3(c) shows the distribution of the number of hops per diffusion trajectory longer than a finite hopping distance, ranging from 0.25 Å to 800 nm, for R = 4.2 and 10.2 nm. For the 4.2-nm results, 3.37 hops in a diffusion trajectory were longer than 5 nm, and 11% of diffusion trajectories had a hop longer than 200 nm. As expected, the results for 10.2 nm are approximately twice as large since N is doubled. The crosses represent EcoRV proteins, which have a comparable hydrodynamic radius of 2.66 nm (see Table I) and were experimentally observed in different buffers to have hopped longer than 200 nm, with reported occurrences ranging from 6 to 16% per diffusion trajectory [15]. These observations are in agreement with our simulation results. Furthermore, for hops longer than 300 and 500 nm, our observations agree with the reported values in Fig. 4(a) of Ref. [15].
TABLE I.
Protein | rprotein (nm) | δ (Å) | D (nm²/s) |
---|---|---|---|
YFP-LacI, 2 (a) | 3.13 | 0.284 | 4.6 × 10⁴ [12] |
GFP-LacI | 2.68 | 0.267 | 2.3 × 10²–1.3 × 10⁵ [3] |
EcoRV, 2 | 2.66 | 0.262 | 0.9–2.5 × 10⁴ [15] |
EcoRV (b) |  |  | 3.1 × 10³ [19] |
RNAP, 4 (b) |  |  | 6.1 × 10³–4.3 × 10⁵ [13] |
RNAP (b) |  |  | 1.3 × 10⁵ [29], ~10⁴ [9] |
hOgg1 | 2.36 | 0.247 | 5.78 × 10⁵ [10] |
p53 | 2.34 | 0.246 | 3.01 × 10⁵ [17] |
UL42 | 2.63 | 0.261 | 5.1 × 10³–2.2 × 10⁴ [16] |
T7 gp5, 2 | 2.86 | 0.272 | 8.0 × 10⁵–1.86 × 10⁶ [21] |
T7 gp5, 2 | 3.00 | 0.278 | 4.0 × 10⁵ [21] |
C-Ada | 1.77 | 0.214 | 1.3 × 10⁶ [20] |
(a) The number 2 indicates a dimer, and 4 indicates a tetramer.
(b) Unknown molecular size due to unspecified/uncertain protein components and/or labels.
Other DB proteins may differ from GFP-LacI in their sizes and thus in δ and R. Table I lists DB proteins that can hop on DNA (as opposed to proteins that slide only [11]) and that have been studied using SM fluorescence tracking methods on elongated DNA. Despite the difference in R by up to 1.26 nm, the δ values differ by less than 0.07 Å. The effect of the R difference is considered in Fig. 5(a), which shows the number of hops per diffusion trajectory longer than a finite distance, ranging from 0.1 Å to 800 nm, for δ = 0.267 Å and R from 4.2 to 10.2 nm. The number of hops per diffusion trajectory increases moderately with R for all hopping distances, indicating that our hopping results are applicable to most observed DB proteins.
The step size δ in the current approach, based on microscopic Brownian random walk models, can be made larger or smaller for vastly different particle sizes. Figure 5(b) shows distributions of hopping distances for three δ values: 0.267, 3.4, and 10 Å (we used R = 4.2 nm and t3,max ≈ 1 ms). The distribution curves collapse when protein hopping distances are larger than δ, indicating that the tail distribution of protein hopping probability has the same asymptotic form at long distances, in agreement with the solution to the diffusion equation [30]. However, the mean hopping distance [Fig. 5(b), inset; values 3.37, 36, and 95 Å], the mean number of hops N in a trajectory (526, 42, and 14), and 〈t3〉 (0.22, 3.1, and 9.2 μs) all depend sensitively on δ, as short-length-scale motions dominate protein-DNA reassociation [Fig. 3(a)]. This regime cannot be accessed in the macroscopic theory, i.e., by solving the diffusion equation directly.
When the protein-nonspecific-DNA association probability, p, is not 100%, e.g., due to rotation of the DNA-binding domain during large hops, hopping statistics and the subsequent sliding statistics will change. For a low binding probability of p = 10%, although on average 10 consecutive hops would be needed for reassociation, the mean number of association attempts will still be N. However, the effective mean hopping time 〈t3′〉 and the mean hopping distance are expected to increase, while the effective number of hops per diffusion trajectory N′ decreases, since t is held constant. The effective total hopping time and the rms total hopping distance per diffusion trajectory should therefore remain constant. The binding probability is thus inversely related to the effective mean sliding time 〈t1′〉, according to Eq. (2), which for p = 10% results in a 10-fold increase in 〈t1′〉.
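The scaling argument can be made concrete under one reading of the paragraph: that an effective hop consists on average of 1/p association attempts, so N′ = pN and 〈t3′〉 = 〈t3〉/p, with t held fixed. Both relations, the form t = N(〈t1〉 + 〈t3〉), and the function name are assumptions made for illustration.

```python
def effective_kinetics(p, N, t1, t3):
    """Effective hop number, hop time, and sliding time for association
    probability p, assuming N' = p * N and t3' = t3 / p at fixed total time t."""
    t_total = N * (t1 + t3)           # total trajectory time (assumed relation)
    N_eff = p * N                     # fewer, longer effective hops
    t3_eff = t3 / p                   # each effective hop spans ~1/p attempts
    t1_eff = t_total / N_eff - t3_eff # sliding time from t = N' (t1' + t3')
    return N_eff, t3_eff, t1_eff

# GFP-LacI values from the text, with p = 10%.
N_eff, t3_eff, t1_eff = effective_kinetics(p=0.1, N=526, t1=19.8e-3, t3=0.22e-6)
```

In this reading the total hopping time N′〈t3′〉 is unchanged and the effective sliding time is 〈t1〉/p, i.e., ten times longer for p = 10%.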
When salt concentration varies, p and R will change, as will D3 within a few angstroms of the DNA surface. However, since t remains dominated by the total sliding time (〈t1〉 ≫ 〈t3〉), the observed changes in t with salt concentration are likely due to changes in the total sliding time rather than the total hopping time. Consequently, changes in t as a result of varying salt concentration are not indicative of hopping and should not be used to determine its presence in diffusion trajectories, in disagreement with Refs. [6,10,16,17,21].
Some studies use flow to elongate DNA and/or investigate hopping properties of DB proteins [10,17,20,21,31]. Here we describe the effect of flow on hopping distances using the maximum reported flow rate in SM studies of 100 μm/s. For our mean hopping time of 〈t3〉 = 0.22 μs, a typical dissociated protein is carried by flow a distance of 0.22 Å along DNA; this distance is negligible compared to its mean hopping distance of 3.37 Å (the total displacement of the protein from flow alone within a diffusion trajectory consisting of 526 hops will be 11.6 nm, which is substantially less than the total hopping displacement of 127.5 nm observed for GFP-LacI and, similarly, other proteins, as shown above). On the other hand, for a trajectory that includes a 1-μm-long hop, which occurs once every 1000 diffusion trajectories, the hopping time is 6.22 ms and a protein is carried 622 nm along DNA by the flow. This distance would be sufficiently large for the protein to be considered dissociated.
Our results suggest that in diffusion trajectories without hops longer than a few hundred nanometers, a protein is unlikely to have been “washed out” by flow, whereas in trajectories that include such large hops it may be. However, according to Fig. 3(c), such events occur in only approximately 1% of all diffusion trajectories.
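The flow estimates above follow from multiplying the flow speed by the time spent off the DNA, as in this short check. The constant and function names are illustrative; the inputs are the values quoted in the text.

```python
FLOW_SPEED_NM_PER_S = 100e3   # 100 um/s, the maximum flow rate quoted in the text

def flow_displacement_nm(hop_time_s, flow_speed=FLOW_SPEED_NM_PER_S):
    """Distance (nm) a dissociated protein is carried by flow during one hop."""
    return flow_speed * hop_time_s

per_hop = flow_displacement_nm(0.22e-6)      # mean hop: 0.022 nm = 0.22 angstrom
per_trajectory = 526 * per_hop               # 526 mean hops per trajectory
long_hop = flow_displacement_nm(6.22e-3)     # the 6.22 ms hop discussed in the text
```

The three results are about 0.22 Å, 11.6 nm, and 622 nm, matching the figures in the discussion.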
Furthermore, sliding kinetics are not expected to be drastically affected by DNA configuration since a protein remains in contact with nonspecific DNA and should not be subject to DNA condensation and coiling either in vivo or in vitro, contrary to hopping kinetics. The reported values for D1 and t can therefore be applied under in vivo situations for better estimation of target binding rates.
IV. CONCLUSION
In summary, this study analyzes DB proteins’ hopping on elongated DNA to address sliding kinetics. While we have made several assumptions regarding the nature of protein association and modeling DNA, our study suggests that the observed sliding kinetics is a robust feature. Although hopping kinetics will change according to in vivo conditions, the lower bound on D for a typical DB protein should help future experiments in identifying the presence of hopping in protein diffusion trajectories with greater certainty.
ACKNOWLEDGMENTS
We are grateful to Anders Carlsson for helpful discussions. M.C.D. wishes to thank the National Institutes of Health for Grant No. 5T90 DA022871.
References
- 1. Riggs AD, Bourgeois S, Cohn M. J. Mol. Biol. 1970;53:401. doi: 10.1016/0022-2836(70)90074-4.
- 2. Halford SE, Marko JF. Nucl. Acids Res. 2004;32:3040. doi: 10.1093/nar/gkh624.
- 3. Wang YM, Austin RH, Cox EC. Phys. Rev. Lett. 2006;97:048302. doi: 10.1103/PhysRevLett.97.048302.
- 4. Klenin KV, Merlitz H, Langowski J, Wu CX. Phys. Rev. Lett. 2006;96:018104. doi: 10.1103/PhysRevLett.96.018104.
- 5. Hu L, Grosberg AY, Bruinsma R. Biophys. J. 2008;95:1151. doi: 10.1529/biophysj.108.129825.
- 6. Wunderlich Z, Mirny LA. Nucl. Acids Res. 2008;36:3570. doi: 10.1093/nar/gkn173.
- 7. Murugan R. Biophys. J. 2010;99:353. doi: 10.1016/j.bpj.2010.04.026.
- 8. de la Rosa MAD, et al. Biophys. J. 2010;98:2943. doi: 10.1016/j.bpj.2010.02.055.
- 9. Harada Y, et al. Biophys. J. 1999;76:709. doi: 10.1016/S0006-3495(99)77237-1.
- 10. Blainey PC, et al. Proc. Natl. Acad. Sci. USA. 2006;103:5752. doi: 10.1073/pnas.0509723103.
- 11. Granéli A, et al. Proc. Natl. Acad. Sci. USA. 2006;103:1221. doi: 10.1073/pnas.0508366103.
- 12. Elf J, Li G-W, Xie XS. Science. 2007;316:1191. doi: 10.1126/science.1141967.
- 13. Kim JH, Larson RG. Nucl. Acids Res. 2007;35:3848. doi: 10.1093/nar/gkm332.
- 14. Gorman J, et al. Mol. Cell. 2007;28:359. doi: 10.1016/j.molcel.2007.09.008.
- 15. Bonnet I, et al. Nucl. Acids Res. 2008;36:4118. doi: 10.1093/nar/gkn376.
- 16. Komazin-Meredith G, et al. Proc. Natl. Acad. Sci. USA. 2008;105:10721. doi: 10.1073/pnas.0802676105.
- 17. Tafvizi A, et al. Biophys. J. 2008;95:L01. doi: 10.1529/biophysj.108.134122.
- 18. Porecha RH, Stivers JT. Proc. Natl. Acad. Sci. USA. 2008;105:10791. doi: 10.1073/pnas.0801612105.
- 19. Biebricher A, et al. Biophys. J. 2009;96:L50. doi: 10.1016/j.bpj.2009.01.035.
- 20. Lin Y, et al. Biophys. J. 2009;96:1911. doi: 10.1016/j.bpj.2008.11.021.
- 21. Etson CM, et al. Proc. Natl. Acad. Sci. USA. 2010;107:1900. doi: 10.1073/pnas.0912664107.
- 22. Li G-W, Berg OG, Elf J. Nat. Phys. 2009;5:294.
- 23. Revzin A. The Biology of Nonspecific DNA Protein Interactions. CRC Press; London: 1990.
- 24. Gowers DM, Wilson GG, Halford SE. Proc. Natl. Acad. Sci. USA. 2005;102:15883. doi: 10.1073/pnas.0505378102.
- 25. The DB protein's displacement on DNA, x, contains two alternating diffusion displacements: 1D sliding displacement x1 and 3D hopping displacement x3. This relation has also been verified by simulations.
- 26. Florescu A-M, Joyeux M. J. Chem. Phys. 2009;130:015103. doi: 10.1063/1.3050097.
- 27. Dahirel V, Paillusson F, Jardat M, Barbi M, Victor JM. Phys. Rev. Lett. 2009;102:228101. doi: 10.1103/PhysRevLett.102.228101.
- 28. Berg HC. Random Walks in Biology. Princeton University Press; Princeton, NJ: 1993.
- 29. Ricchetti M, Metzger W, Heumann H. Proc. Natl. Acad. Sci. USA. 1988;85:4610. doi: 10.1073/pnas.85.13.4610.
- 30. Loverdo C, Benichou O, Voituriez R, Biebricher A, Bonnet I, Desbiolles P. Phys. Rev. Lett. 2009;102:188101. doi: 10.1103/PhysRevLett.102.188101.
- 31. Kim JH, et al. Nano. Res. Lett. 2007;2:185.