Author manuscript; available in PMC: 2013 Jun 17.
Published in final edited form as: Phys Rev E Stat Nonlin Soft Matter Phys. 2011 Feb 16;83(2 0 1):021907. doi: 10.1103/PhysRevE.83.021907

Protein sliding and hopping kinetics on DNA

Michael C DeSantis 1, Je-Luen Li 2, Y M Wang 1
PMCID: PMC3683889  NIHMSID: NIHMS477724  PMID: 21405863

Abstract

Using Monte Carlo simulations, we deconvolved the sliding and hopping kinetics of GFP-LacI proteins on elongated DNA from their experimentally observed seconds-long diffusion trajectories. Our simulations suggest the following results: (i) in each diffusion trajectory, a protein makes on average hundreds of alternating slides and hops with a mean sliding time of several tens of milliseconds; (ii) sliding dominates the root-mean-square displacement of fast diffusion trajectories, whereas hopping dominates slow ones; (iii) flow and variations in salt concentration have limited effects on hopping kinetics, while in vivo DNA configuration is not expected to influence sliding kinetics; and (iv) the rate of occurrence for hops longer than 200 nm agrees with experimental data for EcoRV proteins.

I. INTRODUCTION

Timely target association of DNA-binding (DB) proteins is important for prompt cellular response to external stimuli using mechanisms such as gene regulation, DNA replication, and DNA repair. The target association rates of DB proteins frequently deviate from the diffusion limit due to their interactions with nonspecific DNA via the process of facilitated diffusion [1–3]. Facilitated diffusion mainly consists of two motions: sliding, where a protein diffuses along nonspecific DNA without losing contact, and hopping, where the protein jumps off DNA and undergoes 3D diffusion before reassociating to the same (Fig. 1) or a different segment of DNA (referred to as intersegmental transfer). In this article, we regard events with long hopping distances, usually called jumping, as a form of hopping. A DB protein may slide and hop many times on nonspecific DNA before reaching the target. In order to quantify the effect of facilitated diffusion on DB proteins’ target binding rate, how long a protein spends sliding on DNA (mean sliding time 〈t1〉) and how fast it moves along DNA (sliding diffusion coefficient D1) are two critical parameters for all calculations of in vitro and in vivo DNA geometries [2,4–8].

FIG. 1.


(Color online) Schematics of a diffusion trajectory showing a protein initially binding to DNA, proceeding to slide (light disks) and hop (dark disks), and finally permanently dissociating from DNA. This example diffusion trajectory has two discernible hops.

Single-molecule (SM) fluorescence imaging studies of DB proteins’ Brownian diffusion along elongated DNA have obtained effective diffusion coefficients D for the whole seconds-long diffusions (in this article we define each observed diffusion event between protein association and permanent dissociation to be a diffusion trajectory, and t is the total time of the diffusion) [3,9–21]. In the past, numerous studies have substituted t and D values for 〈t1〉 and D1 in target binding rate and protein-nonspecific-DNA binding energy calculations since 〈t1〉 and D1 were not experimentally accessible [3,5–8,10,12,17,22]. Since the extent of hopping involvement is unknown, using t and D values for 〈t1〉 and D1 is questionable. Recent evidence suggests that these diffusion trajectories include both sliding and hopping: (i) the sliding time of DB proteins has been estimated to be milliseconds [6,12,22,23]; (ii) the sliding displacement has been estimated to be less than 50 bp [24], shorter than the displacements of whole diffusion trajectories of the reported DB proteins (>100 nm); and (iii) hops longer than 200 nm have been observed [15]. In order to obtain 〈t1〉 and D1 from experimental data, deconvolving sliding and hopping from individual diffusion trajectories is necessary.

II. SIMULATIONS

Here we deconvolve sliding and hopping in a diffusion trajectory and obtain 〈t1〉 and D1 using (i) Monte Carlo simulations, (ii) experimental D and t values, and (iii) the following two relations (derived in Ref. [25]):

$$t = N\langle t_1\rangle + N\langle t_3\rangle, \qquad (1)$$
$$2Dt = 2D_1 N\langle t_1\rangle + 2D_3 N\langle t_3\rangle, \qquad (2)$$

where N is the mean number of sliding and hopping alternations in a diffusion trajectory, D3 is the 3D diffusion coefficient of the protein, and 〈t3〉 is the mean hopping time. From hopping simulations we first determine N and 〈t3〉; then, combining these with experimental D and t values, 〈t1〉 and D1 are obtained using Eqs. (1) and (2).

For each hopping simulation, a protein was initially positioned at the protein-center-to-DNA-center distance of R = rDNA + rprotein + Δr, where rDNA = 1 nm is the DNA radius, rGFP–LacI = 2.68 nm, and Δr ≈ 0.5 nm is an estimate of the protein-DNA binding distance (or location of the interaction potential minimum beyond which we consider no protein-DNA interactions) [26,27]. The protein immediately dissociated from DNA and underwent 3D diffusion until rebinding to DNA, at which time its position was recorded, or until the maximum number of steps of the hopping simulation was reached, in which case the protein was assumed to have permanently dissociated and its diffusion trajectory was not used in subsequent data analysis. Figure 2 describes the criterion for determining whether a hopping protein collided with DNA. For every step, the length of the perpendicular drawn from the center of the DNA to the line connecting the last two protein locations (dashed arrow) was calculated and, if it was less than R, association occurred. The binding position was chosen to be the midpoint between the two protein locations. We modeled DNA as an infinite, rigid cylinder, assuming 100% probability for association on protein-DNA collision; the distance between the protein binding location and its origin denotes the hopping distance.
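The association criterion described above can be sketched in a few lines of Python. This is an illustrative sketch, not the code used for the reported simulations: the function names are hypothetical, the DNA axis is assumed to lie along z, and the "line connecting the last two protein locations" is interpreted here as the finite segment between them (an assumption).

```python
import math

def min_axis_distance(p0, p1):
    """Shortest distance from the DNA axis (taken as the z axis) to the
    segment p0 -> p1. For an infinite cylinder only x and y matter."""
    x0, y0 = p0[0], p0[1]
    dx, dy = p1[0] - x0, p1[1] - y0
    seg2 = dx * dx + dy * dy
    if seg2 == 0.0:
        return math.hypot(x0, y0)
    # Foot of the perpendicular from the axis, clamped to the segment
    # (clamping is our assumption about how the step is treated).
    s = -(x0 * dx + y0 * dy) / seg2
    s = min(1.0, max(0.0, s))
    return math.hypot(x0 + s * dx, y0 + s * dy)

def check_association(p0, p1, R):
    """Return the midpoint binding position if the step passes within R
    of the DNA axis, otherwise None (100% binding on collision)."""
    if min_axis_distance(p0, p1) < R:
        return tuple(0.5 * (a + b) for a, b in zip(p0, p1))
    return None

# Hypothetical steps (coordinates in nm) with R = 4.2 nm.
miss = check_association((5.0, -3.0, 0.0), (5.0, 3.0, 1.0), 4.2)
hit = check_association((4.0, -3.0, 0.0), (4.0, 3.0, 2.0), 4.2)
print(miss, hit)
```

In the second step both end points lie outside R, yet the path between them crosses the binding radius, which is the situation the perpendicular test is designed to catch.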

FIG. 2.


(Color online) Determination of protein-DNA association. The gray (open) circle marks the effective protein-DNA binding distance. The protein moves ballistically between consecutive steps.

The simulation parameters were determined as follows. The hopping simulation step size δ and step time τ are the collision distance and time, respectively [28]. At temperature T = 294 K, the instantaneous velocity of a protein of mass m in solution is the root-mean-square (rms) velocity $\sqrt{\langle v_x^2\rangle} = \sqrt{k_B T/m} = \delta/\tau = 6.02$ m/s, where kB is the Boltzmann constant and m = 67.5 kDa for a GFP-LacI monomer. Using the Einstein-Stokes relation, $D_3 = \delta^2/(2\tau) = k_B T/(6\pi\eta r) = 8.03 \times 10^{7}$ nm²/s for GFP-LacI, where the viscosity of water is η = 10⁻³ N s/m² and the protein hydrodynamic radius r is 2.68 nm (assuming a typical protein density of 1.38 g/cm³). Combining the two relations, we obtain $\delta = 2D_3/\sqrt{\langle v_x^2\rangle}$. Therefore, δ = 0.267 Å and τ = 4.46 ps. Each simulation step in the x, y, z dimensions was drawn from a Gaussian distribution with a mean of zero and a standard deviation of δ.
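As a check on the arithmetic, the parameters above can be recomputed from the stated inputs. The sketch below uses standard values for the Boltzmann constant and the dalton (not quoted in the text); the variable names are ours. The results agree with the quoted v, D3, δ, and τ to within about 1%.

```python
import math

K_B = 1.380649e-23        # Boltzmann constant, J/K
DALTON = 1.66053907e-27   # atomic mass unit, kg

T = 294.0                 # temperature, K
m = 67.5e3 * DALTON       # GFP-LacI monomer mass, kg
eta = 1.0e-3              # viscosity of water, N s/m^2
r = 2.68e-9               # hydrodynamic radius, m

v_rms = math.sqrt(K_B * T / m)               # rms velocity per dimension, m/s
D3 = K_B * T / (6.0 * math.pi * eta * r)     # 3D diffusion coefficient, m^2/s
delta = 2.0 * D3 / v_rms                     # step size, m
tau = delta / v_rms                          # step time, s

print(f"v_rms = {v_rms:.2f} m/s")
print(f"D3    = {D3 * 1e18:.3e} nm^2/s")
print(f"delta = {delta * 1e10:.3f} Angstrom")
print(f"tau   = {tau * 1e12:.2f} ps")
```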

The time limit for simulation of each GFP-LacI hop was ≈ 1 ms (or 2.1 × 10⁸ steps), selected according to the following two estimations: (i) Since the observed diffusion of proteins on DNA is the combination of sliding and hopping with diffusion coefficients D1 and D3, respectively, the maximum total hopping time of a diffusion trajectory cannot exceed N t3,max = Dt/D3 when D1 ≈ 0. For GFP-LacI, 〈D〉 ≈ 2 × 10⁴ nm²/s [3], which dictates that t3,max ≈ 0.25 ms when t is on the order of 1 s, using the lower bound for N of one hop per diffusion trajectory. Therefore, a hopping time limit of t3,max ≈ 1 ms for a single hop should be sufficiently long for all 3D diffusing proteins to return to DNA. (ii) A longer hopping time limit, such as 10 ms per hop (data not shown), results in additional proteins returning to DNA with individual hopping distances longer than $\sqrt{2Dt} = 200$ nm, a distance detectable in SM measurements; such large displacements are usually used to separate single diffusion trajectories into segments free of large displacements for accurate D analysis [3,15].
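The two estimates behind the 1 ms limit are simple enough to verify numerically. The following sketch only restates the arithmetic of the paragraph above with the quoted values; variable names are ours.

```python
import math

D = 2.0e4        # mean observed diffusion coefficient, nm^2/s
D3 = 8.03e7      # 3D diffusion coefficient, nm^2/s
t = 1.0          # trajectory duration, s (order of magnitude)
N_min = 1        # lower bound: one hop per trajectory

# (i) Upper bound on a single hop's duration when D1 ~ 0.
t3_max = D * t / (D3 * N_min)          # s

# (ii) rms displacement of a whole trajectory, the detectable scale.
rms_displacement = math.sqrt(2.0 * D * t)   # nm

# Duration covered by the simulated step budget.
tau = 4.46e-12                         # step time, s
max_steps = 2.1e8
time_limit = max_steps * tau           # s

print(f"t3_max           = {t3_max * 1e3:.2f} ms")
print(f"rms displacement = {rms_displacement:.0f} nm")
print(f"time limit       = {time_limit * 1e3:.2f} ms")
```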

III. RESULTS AND DISCUSSION

For 4 × 10⁵ GFP-LacI hopping simulations (maximum simulation time t3,max ≈ 1 ms) with δ = 0.267 Å and R = 4.2 nm, 99.809% of these trials resulted in the protein reassociating to DNA and thus the probability for a simulated hop to return to DNA is P = 0.99809. The hopping characteristics are shown in Figs. 3(a) and 3(b), in which the mean hopping distance along DNA is 3.37 Å (median, 0.41 Å), the mean hopping height (the maximum radial distance of the protein from DNA) is 4.93 Å (median, 0.45 Å), and the mean number of steps per hop is 4.97 × 10⁴ (median, 5), yielding a mean hopping time of 〈t3〉 = 0.22 μs. The mean number of hops in a GFP-LacI diffusion trajectory is N = 526, obtained by dividing the total number of simulated hops of 4 × 10⁵ by the total number of nonreturned hopping events of 763; the distribution for the number of hops per diffusion trajectory is shown in Fig. 4. This set of values has been verified to converge with those from a larger simulation of 4 × 10⁶ hops. Specifically, N values differ by 0.57%. The inset of Fig. 3 shows the distribution of total hopping displacements in a diffusion trajectory with each data point simulated from 526 randomly selected hopping displacements. The rms total hopping displacement per diffusion trajectory is 127.5 nm ($\sqrt{2D_3 N\langle t_3\rangle}$), and the mean total hopping time is N〈t3〉 = 115 μs. Note that although shorter hopping distances, such as ones less than the base pair length of 0.34 nm, do not carry direct biological significance nor do they noticeably disrupt sliding, they are important for correctly assessing rms total hopping displacement statistics in a diffusion trajectory.
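The summary statistics quoted above follow from a few ratios, sketched below with the reported counts. Variable names are ours, and only values stated in the text are used. Note that the raw ratio of simulated to nonreturned hops evaluates to roughly 524, within 0.5% of the quoted N = 526.

```python
tau = 4.46e-12         # step time, s
mean_steps = 4.97e4    # mean number of steps per hop
n_hops = 4.0e5         # simulated hops
n_lost = 763           # hops that did not return within the time limit

mean_hop_time = mean_steps * tau            # s
return_probability = 1.0 - n_lost / n_hops
hops_per_trajectory = n_hops / n_lost

N = 526                # quoted mean number of hops per trajectory
t3 = 0.22e-6           # quoted mean hopping time, s
total_hop_time = N * t3                     # s

print(f"mean hop time       = {mean_hop_time * 1e6:.3f} us")
print(f"return probability  = {return_probability:.5f}")
print(f"hops per trajectory = {hops_per_trajectory:.1f}")
print(f"total hopping time  = {total_hop_time * 1e6:.1f} us")
```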

FIG. 3.


(Color online) (a) Distributions of hopping distances along DNA for δ = 0.267 Å and R = 4.2 (green, open circles) and 10.2 nm (red dots), and hopping height for R = 4.2 nm (gray line). (b) Distributions for number of steps per hop for R = 4.2 and 10.2 nm. (Inset) Distribution for total hopping displacement per diffusion trajectory and Gaussian fit (solid line). (c) Number of hops per diffusion trajectory longer than 0.25 Å, and up to hops longer than 800 nm, for R = 4.2 and 10.2 nm. The crosses are experimental data for EcoRV proteins, where the occurrence rates of hops per diffusion trajectory longer than 200 nm are 0.06, 0.1, and 0.16 (the 0.15 value was omitted for clarity) [15]. (d) GFP-LacI total diffusion time t distribution (from experimental data in Ref. [3]). The mean of the exponential fit (solid line) is 10.4 s.

FIG. 4.


(Color online) Distribution of number of hops per diffusion trajectory. The results of 4 × 10⁵ individual hopping simulations constitute a total of 763 protein diffusion trajectories such that 526 hops occur on average per trajectory.

We can also compute the “diffusion to capture” probability P for a protein to return to DNA using a steady-state solution to the diffusion equation, incorporating a cutoff radial distance c [28]. Proteins released after the initial step at b = 4.22 nm are either adsorbed at the DNA surface (R = 4.2 nm) or escape beyond $c = R + \sqrt{4D_3 t_{3,\max}}$. The probability is time independent and given by

$$P = \frac{\log(c/b)}{\log(c/R)} = 0.99896. \qquad (3)$$

Imposing the same cutoff distance c = 551.2 nm in subsequent simulations, we obtained P = 0.99865, in near agreement with the analytical value above.
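Equation (3) is straightforward to evaluate. The sketch below uses the rounded values quoted in the text (b = 4.22 nm, R = 4.2 nm, c = 551.2 nm); because P is very sensitive to b − R, the rounded inputs reproduce the quoted value only to within about 10⁻⁴. The function names are ours.

```python
import math

def capture_probability(b, R, c):
    """Steady-state probability that a particle released at radius b is
    absorbed at the inner cylinder R before escaping past radius c."""
    return math.log(c / b) / math.log(c / R)

def cutoff_radius(R, D3, t3_max):
    """Cutoff distance c = R + sqrt(4 * D3 * t3_max)."""
    return R + math.sqrt(4.0 * D3 * t3_max)

R = 4.2      # binding distance, nm
b = 4.22     # release radius after the initial step, nm
c = 551.2    # cutoff radius quoted in the text, nm

P = capture_probability(b, R, c)
c_estimate = cutoff_radius(R, 8.03e7, 1.0e-3)   # using t3_max ~ 1 ms

print(f"P = {P:.5f}")
print(f"c estimated with t3_max = 1 ms: {c_estimate:.0f} nm")
```

Using the nominal 1 ms time limit gives a cutoff a few percent larger than 551.2 nm, consistent with the limit being only approximately 1 ms.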

Having obtained 〈t3〉 and N from simulation, we now solve Eqs. (1) and (2) for 〈t1〉 and D1 from the experimentally measured values of t and D. With values of D for GFP-LacI ranging from 2.3 × 10² to 1.3 × 10⁵ nm²/s [3] and t = 10.4 s [Fig. 3(d)],

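$$\langle t_1\rangle = \frac{t - N\langle t_3\rangle}{N} \approx \frac{t}{N} = 19.8\ \text{ms}, \qquad (4)$$
$$D_1 \approx D - \frac{D_3 N\langle t_3\rangle}{t} = D - \frac{9309}{10.4}\ \text{nm}^2/\text{s}. \qquad (5)$$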

The sliding time is several tens of ms and D1 ranges from ≈0 for slow diffusion to ≈D for fast diffusion. The 〈D1〉 for GFP-LacI is 9.1 × 10³ nm²/s using 〈D〉 of 10⁴ nm²/s. Since D1 > 0, Eq. (5) sets the lower bound of D such that it must be greater than D3N〈t3〉/t ≈ 896 nm²/s. The rms total sliding displacement in a diffusion trajectory becomes longer than the rms total hopping displacement when D > 2ND3〈t3〉/t ≈ 1790 nm²/s.
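For readers who wish to apply Eqs. (1) and (2) to other proteins, the solution can be sketched as a short function. The function name is ours; inputs are the quoted GFP-LacI values, and the outputs match the numbers above to within rounding.

```python
def sliding_kinetics(D, t, N, t3, D3):
    """Solve Eqs. (1) and (2) for the mean sliding time t1 (s) and the
    sliding diffusion coefficient D1 (nm^2/s)."""
    t1 = (t - N * t3) / N                      # from Eq. (1)
    D1 = (D * t - D3 * N * t3) / (N * t1)      # from Eq. (2)
    return t1, D1

N = 526          # mean number of hops per trajectory
t3 = 0.22e-6     # mean hopping time, s
D3 = 8.03e7      # 3D diffusion coefficient, nm^2/s
t = 10.4         # mean trajectory duration, s
D = 1.0e4        # mean observed diffusion coefficient, nm^2/s

t1, D1 = sliding_kinetics(D, t, N, t3, D3)

# D must exceed this for D1 > 0.
D_min = D3 * N * t3 / t
# Above this, rms sliding displacement exceeds rms hopping displacement.
D_cross = 2.0 * D_min

print(f"<t1>    = {t1 * 1e3:.1f} ms")
print(f"D1      = {D1:.0f} nm^2/s")
print(f"D_min   = {D_min:.0f} nm^2/s")
print(f"D_cross = {D_cross:.0f} nm^2/s")
```

The crossover follows from comparing 2D1N〈t1〉 ≈ 2(D − D_min)t with 2D3N〈t3〉 = 2D_min t.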

Since our protein-nonspecific-DNA binding distance is an estimate, we have carried out simulations with Δr ranging from 0.5 to 6.5 nm (corresponding to protein-DNA distances R of 4.2 and 10.2 nm, respectively). Comparing the R = 10.2 nm results to the R = 4.2 nm results, the distributions for hopping distances [Fig. 3(a)] and hopping times [Fig. 3(b)] are similar, although the mean hopping distance reduces to 2.82 Å, the mean number of steps per hop reduces to 3.23 × 10⁴, and the mean number of hops N doubles to 1101. Solving for 〈t1〉 and D1 at R = 10.2 nm, we found 〈t3〉 = 0.14 μs, N〈t3〉 = 154 μs, 〈t1〉 = 9.4 ms (approximately half of the value for R = 4.2 nm), and D1 to be similar to the previously calculated value for R = 4.2 nm. Given that the sliding and hopping values at R = 4.2 and 10.2 nm are close, our method and results can be safely applied to most DB protein-DNA binding distances.

To investigate hopping distances within a diffusion trajectory, Fig. 3(c) shows the distribution of the number of hops per diffusion trajectory longer than a finite hopping distance, ranging from 0.25 Å to 800 nm, for R = 4.2 and 10.2 nm. For the 4.2-nm results, 3.37 hops in a diffusion trajectory were longer than 5 nm, and 11% of diffusion trajectories had a hop longer than 200 nm. As expected, the results for 10.2 nm are approximately twice as large since N is doubled. The crosses represent EcoRV proteins, which have a comparable hydrodynamic radius of 2.66 nm (see Table I) and were experimentally observed in different buffers to have hopped longer than 200 nm, with reported occurrences ranging from 6 to 16% per diffusion trajectory [15]. These observations are in agreement with our simulation results. Furthermore, for hops longer than 300 and 500 nm, our observations agree with the reported values in Fig. 4(a) of Ref. [15].

TABLE I.

DB protein diffusion properties on elongated DNA.

| Protein | rprotein (nm) | δ (Å) | D (nm²/s) |
|---|---|---|---|
| YFP-LacI, 2^a | 3.13 | 0.284 | 4.6 × 10⁴ [12] |
| GFP-LacI | 2.68 | 0.267 | 2.3 × 10²–1.3 × 10⁵ [3] |
| EcoRV, 2 | 2.66 | 0.262 | 0.9–2.5 × 10⁴ [15] |
| EcoRV^a | | | 3.1 × 10³ [19] |
| RNAP, 4^a | | | 6.1 × 10³–4.3 × 10⁵ [13] |
| RNAP^a | | | 1.3 × 10⁵ [29], ~10⁴ [9] |
| hOgg1 | 2.36 | 0.247 | 5.78 × 10⁵ [10] |
| p53 | 2.34 | 0.246 | 3.01 × 10⁵ [17] |
| UL42 | 2.63 | 0.261 | 5.1 × 10³–2.2 × 10⁴ [16] |
| T7 gp5, 2 | 2.86 | 0.272 | 8.0 × 10⁵–1.86 × 10⁶ [21] |
| T7 gp5, 2 | 3.00 | 0.278 | 4.0 × 10⁵ [21] |
| C-Ada | 1.77 | 0.214 | 1.3 × 10⁶ [20] |

^a The number 2 indicates a dimer, and 4 indicates a tetramer.

^b Unknown molecular size due to unspecified/uncertain protein components and/or labels.

Other DB proteins may differ from GFP-LacI in their sizes and thus δ and R. Table I lists DB proteins that can hop on DNA (instead of proteins that slide only [11]) studied using SM fluorescence tracking methods on elongated DNA. Despite the difference in R by up to 1.26 nm, the δ values differ by less than 0.07 Å. The effect of the R difference is considered in Fig. 5(a), in which the number of hops per diffusion trajectory longer than a finite distance, ranging from 0.1 Å to 800 nm, is shown for δ = 0.267 Å and R from 4.2 to 10.2 nm. The number of hops per diffusion trajectory increases with R moderately for all hopping distances, indicating that our hopping results are applicable to most observed DB proteins.

FIG. 5.


(Color online) Distributions for number of hops per diffusion trajectory longer than 0.1, 0.34, 1, 5, 10, 20, 50, 100, 200, 300, 500, and 800 nm (top to bottom in a), (a) for R ranging from 4.2 to 10.2 nm (left to right) and (b) for R = 4.2 nm and δ = 0.267 (circles), 3.4 (empty squares), and 10.2 Å (crosses). (Inset) Hopping distance distributions for the three δ values.

The step size δ in the current approach, based on microscopic Brownian random walk models, can be made larger or smaller for vastly different particle sizes. Figure 5(b) shows distributions of hopping distances for three δ values: 0.267, 3.4, and 10 Å (we used R = 4.2 nm and t3,max ≈ 1 ms). The distribution curves collapse when protein hopping distances are larger than δ, indicating that the tail distribution of protein hopping probability has the same asymptotic form at long distances, in agreement with the solution to the diffusion equation [30]. However, the mean hopping distance [Fig. 5(b, inset); values 3.37, 36, and 95 Å], the mean number of hops N in a trajectory (526, 42, and 14), and 〈t3〉 (0.22, 3.1, and 9.2 μs) all depend sensitively on δ, as short-length-scale motions dominate protein-DNA reassociation [Fig. 3(a)]. This regime cannot be accessed in the macroscopic theory, i.e., by solving the diffusion equation directly.

When the protein-nonspecific-DNA association probability, p, is not 100%, e.g., due to rotation of the DNA-binding domain during large hops, hopping statistics and the subsequent sliding statistics will change. For a low binding probability of p = 10%, although on average 10 consecutive hops would be needed for reassociation, the mean number of association attempts will still be N. However, the effective mean hopping time 〈t3〉 and the mean hopping distance are expected to increase, while the effective number of hops per diffusion trajectory N′ decreases since t is held constant. The effective total hopping time and the rms total hopping distance per diffusion trajectory should therefore remain constant. The binding probability is thus inversely related to the effective mean sliding time 〈t1〉, according to Eq. (2), which for p = 10% results in a 10-fold increase in 〈t1〉.
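The scaling argument above can be made concrete with a small sketch. It assumes, as a simplification of our own, that each return to DNA binds independently with probability p, so that an effective hop consists of 1/p attempts on average; the function name and this independence assumption are not from the simulations reported here.

```python
def effective_kinetics(p, N, t3, t):
    """Effective hop number, hopping time, and sliding time when each
    protein-DNA collision binds with probability p (assumed independent)."""
    n_eff = p * N                 # fewer completed hops per trajectory
    t3_eff = t3 / p               # each effective hop takes 1/p attempts
    t1_eff = (t - n_eff * t3_eff) / n_eff   # Eq. (1) with effective values
    return n_eff, t3_eff, t1_eff

N, t3, t = 526, 0.22e-6, 10.4

n_full, t3_full, t1_full = effective_kinetics(1.0, N, t3, t)
n_low, t3_low, t1_low = effective_kinetics(0.1, N, t3, t)

print(f"p = 1.0: N' = {n_full:.0f}, <t1> = {t1_full * 1e3:.1f} ms")
print(f"p = 0.1: N' = {n_low:.1f}, <t1> = {t1_low * 1e3:.0f} ms")
print(f"total hopping time unchanged: {n_low * t3_low * 1e6:.1f} us")
```

Under this assumption the total hopping time is independent of p, while the effective mean sliding time scales as 1/p.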

When salt concentration varies, p and R will change, as will D3 within a few angstroms of the DNA surface. However, since t remains ≈ N〈t1〉 because N〈t3〉 ≪ N〈t1〉, the observed changes in t with salt concentration are likely due to changes in the total sliding time rather than the total hopping time. Consequently, changes in t as a result of varying salt concentration are not indicative of hopping and should not be used to determine its presence in diffusion trajectories, in disagreement with Refs. [6,10,16,17,21].

Some studies use flow to elongate DNA and/or investigate hopping properties of DB proteins [10,17,20,21,31]. Here we describe the effect of flow on hopping distances using the maximum reported flow rate in SM studies of 100 μm/s. For our mean hopping time of 〈t3〉 = 0.22 μs, a typical dissociated protein is carried by flow a length of 0.22 Å along DNA; this distance is negligible compared to its mean hopping distance of 3.37 Å. The total displacement of the protein from flow alone within a diffusion trajectory consisting of 526 hops will be 11.6 nm, which is substantially less than the total hopping displacement of 127.5 nm observed for GFP-LacI and, similarly, other proteins, as shown above. On the other hand, for a trajectory that includes a 1-μm-long hop, which occurs once every 1000 diffusion trajectories, the hopping time is 6.22 ms and a protein is carried 622 nm along DNA by flow. This distance would be sufficiently large for the protein to be considered dissociated.
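The flow estimates above amount to multiplying the flow speed by the relevant hopping time. In the sketch below, the duration of the 1-μm hop is estimated as L²/(2D3); this one-dimensional estimate is our assumption about how the 6.22 ms figure arises, chosen because it reproduces that value. Variable names are ours.

```python
flow_speed = 1.0e5     # 100 um/s expressed in nm/s
t3 = 0.22e-6           # mean hopping time, s
N = 526                # mean number of hops per trajectory
D3 = 8.03e7            # 3D diffusion coefficient, nm^2/s

drift_per_hop = flow_speed * t3            # nm
drift_per_trajectory = N * drift_per_hop   # nm

hop_length = 1000.0                        # a 1-um hop, nm
# Assumed 1D diffusion estimate of the time needed to travel hop_length.
long_hop_time = hop_length**2 / (2.0 * D3) # s
long_hop_drift = flow_speed * long_hop_time  # nm

print(f"drift per mean hop   = {drift_per_hop * 10:.2f} Angstrom")
print(f"drift per trajectory = {drift_per_trajectory:.1f} nm")
print(f"1-um hop duration    = {long_hop_time * 1e3:.2f} ms")
print(f"drift during 1-um hop = {long_hop_drift:.0f} nm")
```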

Our results suggest that in diffusion trajectories without hops longer than a few hundred nanometers, a protein is unlikely to have been “washed out,” whereas in trajectories that include such large hops it may be. However, according to Fig. 3(c), such an event occurs in only approximately 1% of all diffusion trajectories.

Furthermore, sliding kinetics are not expected to be drastically affected by DNA configuration since a protein remains in contact with nonspecific DNA and should not be subject to DNA condensation and coiling either in vivo or in vitro, contrary to hopping kinetics. The reported values for D1 and t can therefore be applied under in vivo situations for better estimation of target binding rates.

IV. CONCLUSION

In summary, this study analyzes DB proteins’ hopping on elongated DNA to address sliding kinetics. While we have made several assumptions regarding the nature of protein association and modeling DNA, our study suggests that the observed sliding kinetics is a robust feature. Although hopping kinetics will change according to in vivo conditions, the lower bound on D for a typical DB protein should help future experiments in identifying the presence of hopping in protein diffusion trajectories with greater certainty.

ACKNOWLEDGMENTS

We are grateful to Anders Carlsson for helpful discussions. M.C.D. wishes to thank the National Institutes of Health for Grant No. 5T90 DA022871.

References
