Author manuscript; available in PMC: 2013 Dec 14.
Published in final edited form as: Phys Rev Lett. 2012 Dec 10;109(24):248301. doi: 10.1103/PhysRevLett.109.248301

The range of interaction between DNA-bending proteins is controlled by the second-longest correlation length for bending fluctuations

Houyin Zhang, John F. Marko
PMCID: PMC3759365  NIHMSID: NIHMS505066  PMID: 23368394

Abstract

When a DNA molecule is stretched, the zero-force correlation length for its bending fluctuations (the persistence length $A$) bifurcates into two different correlation lengths: the shorter “longitudinal” correlation length $\xi_\parallel(f)$ and the longer “transverse” correlation length $\xi_\perp(f)$. In the high-force limit, $\xi_\parallel(f) = \xi_\perp(f)/2 = \sqrt{k_BTA/f}/2$. When DNA-bending proteins bind to the DNA molecule, there is an effective interaction between the protein-generated bends mediated by DNA elasticity and bending fluctuations. Surprisingly, the range of this interaction is not the longest correlation length, which is associated with transverse fluctuations of the tangent vector along the polymer, but instead is the second-longest correlation length, the longitudinal one, $\xi_\parallel(f, \mu)$. The effect arises from the protein-bend contribution to the Hamiltonian having an axial rotational symmetry which eliminates its coupling to the transverse fluctuations.


In living cells, binding of proteins to DNA controls gene expression and packaging of the genome [1-3], requiring DNA-binding proteins to communicate and cooperate over a wide range of length scales. Many DNA-binding proteins distort the double helix, which has led to development of micromechanical methods for analysis of protein-DNA interactions [4-7]. Notably, tension along DNA molecules can induce cooperative interactions between distant DNA-bending proteins [8-10]. Numerical analysis of a semiflexible “worm-like chain” (WLC) showed the force-generated interaction range between two quenched bends (corresponding to bound proteins) to be $\xi_\parallel(f) \approx \sqrt{A/\beta f}/2$ [10], with $\beta = 1/k_BT$, in the high-force range $f > k_BT/A \approx 0.1$ pN. The dominant (longest) correlation length for the WLC is that for bending fluctuations, $\xi_\perp(f) \approx \sqrt{A/\beta f}$ [11]. Surprisingly, the protein-induced bends have an interaction range exactly half of $\xi_\perp(f)$. Here we show that this arises from the rotational symmetry of the part of the energy describing the protein-DNA interaction. We will illustrate this for a specific model, but we will also argue that our results are general for a wide class of similar models of proteins interacting with DNA.

To investigate the force-dependent interactions between DNA-binding proteins we use a discretized WLC model including protein-binding effects [10] (essentially the model of Refs. [12] and [13]). The DNA is treated as a chain of N links each of length b, with each link able to point in any direction with a Boltzmann weight that depends on the angle between adjacent links. This form of the model allows it to be treated by “transfer matrix” [14] methods, whereby summation over individual link degrees of freedom can be reduced to matrix multiplications. This method has a long history of use in biopolymer physics, from early studies of the helix-coil transition [15], to recent studies of semiflexible polymer elasticity [16, 17].

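In our model, proteins are taken to bind “nonspecifically” (with no sequence-dependent affinity) to the nodes between adjacent links. The DNA-protein complex energy is a sum over the links,

$$\beta E = \sum_{i=1}^{N} \left\{ \frac{a}{2} \left| \hat t_{i+1} - \hat t_i \right|^2 (1 - n_i) + \left[ \frac{a'}{2} \left( \hat t_{i+1} \cdot \hat t_i - \gamma \right)^2 - \mu \right] n_i - \beta b f\, \hat t_i \cdot \hat z - \eta\, n_i n_{i+1} \right\}, \tag{1}$$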

where $\hat t_i$ is a unit vector which describes the direction of the $i$-th link, the dimensionless parameter $a$ ($= A/b$) is the naked DNA bending rigidity, $a'$ is the rigidity of a protein-occupied node, and $n_i$ is the protein occupation degree of freedom for the $i$-th node ($n_i = 0$ for an unoccupied node, or 1 for a protein-bound node). Protein binding is controlled by the chemical potential $\mu$, which is a function of bulk protein concentration $c$ ($\mu = \ln[c/c_0]$, where $c_0$ is a protein-dependent constant determining the binding affinity), and the bends induced by protein binding are described by $\gamma = \cos\theta$, where $\theta$ is the preferred angle between adjacent links resulting from protein binding. The DNA-protein complex is stretched by the applied force $f$ in the $\hat z$ direction. Finally, $\eta$ is the intrinsic binding cooperativity, describing the strength of direct interaction between adjacent bound proteins.
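As an illustration of how the terms of Eq. 1 combine, the following minimal sketch evaluates the dimensionless energy for one explicit configuration of a closed chain. The function name `beta_energy` and its argument names are hypothetical (they do not come from the paper), and the periodic wrap-around of the last link onto the first is an assumption made so that every node has two neighbors.

```python
def beta_energy(tangents, occupancy, a, a_prime, gamma, mu, eta, beta_b_f):
    """Dimensionless energy beta*E of Eq. 1 for one closed-chain configuration.

    tangents  : list of N unit vectors (tx, ty, tz), one per link
    occupancy : list of N values n_i in {0, 1}, one per node
    beta_b_f  : the dimensionless force beta * b * f
    """
    n_links = len(tangents)
    total = 0.0
    for i in range(n_links):
        # Assumed periodic closure: link N is followed by link 1.
        t, t_next = tangents[i], tangents[(i + 1) % n_links]
        n, n_next = occupancy[i], occupancy[(i + 1) % n_links]
        dot = sum(u * v for u, v in zip(t, t_next))
        diff_sq = sum((v - u) ** 2 for u, v in zip(t, t_next))
        total += 0.5 * a * diff_sq * (1 - n)                     # bare-DNA bending
        total += (0.5 * a_prime * (dot - gamma) ** 2 - mu) * n   # protein-occupied node
        total -= beta_b_f * t[2]                                 # stretching along z
        total -= eta * n * n_next                                # adjacent-protein cooperativity
    return total


if __name__ == "__main__":
    z_hat = (0.0, 0.0, 1.0)
    # Straight, protein-free chain of 4 links: only the stretching term contributes.
    print(beta_energy([z_hat] * 4, [0] * 4, 10.0, 50.0, 0.0, -5.0, 0.0, 1.2))
```

For a straight chain every occupied node with $\gamma = 0$ pays the full bending penalty $a'/2$, which is why binding is strongly coupled to the applied tension in this model.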

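This model describes a variety of DNA-binding proteins [10, 12, 18], depending on the parameter values chosen. For most situations of interest, the total DNA contour length $L$ ($= Nb$) can be taken to be much larger than the link length $b$, and a periodic boundary condition can be used ($\hat t_N = \hat t_0$). We compute the equilibrium partition function $Z = \mathrm{Tr}(T^N)$ following the method of Ref. [10], where $T$ is the transfer matrix for a link,

$$T = \begin{pmatrix} \langle \hat t_i, 1 | T | \hat t_{i+1}, 1 \rangle & \langle \hat t_i, 1 | T | \hat t_{i+1}, 0 \rangle \\ \langle \hat t_i, 0 | T | \hat t_{i+1}, 1 \rangle & \langle \hat t_i, 0 | T | \hat t_{i+1}, 0 \rangle \end{pmatrix} = \begin{pmatrix} A e^{\mu + \eta} & A e^{\mu} \\ B & B \end{pmatrix}. \tag{2}$$

Here $A$ and $B$ are matrices with elements $\langle \hat t | A | \hat t' \rangle = e^{-\frac{a'}{2} (\hat t \cdot \hat t' - \gamma)^2 + \beta b f\, \hat t \cdot \hat z}$ and $\langle \hat t | B | \hat t' \rangle = e^{-\frac{a}{2} |\hat t - \hat t'|^2 + \beta b f\, \hat t \cdot \hat z}$.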

The matrices may be expressed in terms of spherical harmonics, i.e.,

$$\langle l m | A | l' m' \rangle = \int d\hat t\, d\hat t'\, Y^*_{lm}(\hat t)\, \langle \hat t | A | \hat t' \rangle\, Y_{l'm'}(\hat t'), \tag{3}$$

where the integrals are over the unit sphere. Due to axial symmetry around the stretching ($z$) direction, the matrix elements are diagonal in $m$, i.e., Eq. 3 is proportional to $\delta_{mm'}$. Closed-form expressions for the matrix elements of $A$ and $B$ can be found in Sec. S1 of Supplemental Material [19].

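Protein occupation at any node along the chain can be computed in this transfer matrix framework using the “operator” $\langle \hat t, n | \hat n | \hat t', n' \rangle = \delta_{\hat t \hat t'}\, \delta_{nn'}\, n$, or

$$\hat n = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}, \tag{4}$$

where 1 and 0 stand for the unit matrix and zero matrix in the $\{|l,m\rangle\}$ space, respectively. The average and two-point correlations for the occupation variables are

$$\langle n_i \rangle = \frac{\mathrm{Tr}(\hat n T^N)}{\mathrm{Tr}(T^N)}, \quad \langle n_i n_{i+j} \rangle = \frac{\mathrm{Tr}(\hat n T^j \hat n T^{N-j})}{\mathrm{Tr}(T^N)}, \quad G_{\hat n}(r,f) \equiv \langle n_i n_{i+j} \rangle - \langle n_i \rangle \langle n_{i+j} \rangle, \tag{5}$$

where $G_{\hat n}(r, f)$ is the protein occupation correlation function for two DNA-binding proteins separated by contour length $r = jb$.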
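The trace formulas for the occupation average and two-point correlation can be checked on a deliberately simplified toy case. The sketch below assumes the tangent degrees of freedom are dropped entirely (the blocks $A$ and $B$ are replaced by the number 1), so the transfer matrix reduces to a 2×2 lattice-gas matrix with the same $e^{\mu+\eta}$, $e^{\mu}$, 1, 1 layout. This is not the model solved in the paper; it only shows the mechanics of the trace expressions, compared against brute-force enumeration on a short ring. All function names are hypothetical.

```python
import itertools
import math


def matmul(X, Y):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def matpow(X, p):
    """p-th power of a 2x2 matrix by repeated multiplication."""
    R = [[1.0, 0.0], [0.0, 1.0]]
    for _ in range(p):
        R = matmul(R, X)
    return R


def trace(X):
    return X[0][0] + X[1][1]


def occupancy_stats(mu, eta, N, j):
    """<n_i>, <n_i n_{i+j}> and their connected correlation from trace formulas."""
    # Rows and columns are ordered n = 1, then n = 0 (toy case with A = B = 1).
    T = [[math.exp(mu + eta), math.exp(mu)], [1.0, 1.0]]
    n_op = [[1.0, 0.0], [0.0, 0.0]]
    Z = trace(matpow(T, N))
    mean = trace(matmul(n_op, matpow(T, N))) / Z
    pair = trace(matmul(matmul(n_op, matpow(T, j)),
                        matmul(n_op, matpow(T, N - j)))) / Z
    return mean, pair, pair - mean * mean


def brute_force(mu, eta, N, j):
    """Same three quantities by summing over all 2**N occupation patterns."""
    Z = mean = pair = 0.0
    for ns in itertools.product((0, 1), repeat=N):
        weight = math.exp(sum(mu * ns[i] + eta * ns[i] * ns[(i + 1) % N]
                              for i in range(N)))
        Z += weight
        mean += weight * ns[0]
        pair += weight * ns[0] * ns[j % N]
    mean, pair = mean / Z, pair / Z
    return mean, pair, pair - mean * mean


if __name__ == "__main__":
    print(occupancy_stats(-1.0, 0.7, 8, 3))
    print(brute_force(-1.0, 0.7, 8, 3))
```

In this toy case the connected correlation vanishes when $\eta = 0$, since nothing couples the sites; in the full model the coupling between distant sites is supplied by the tangent-vector blocks.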

The three components of the tangent vector can also be expressed as “operators”: $\langle \hat t, n | \hat t^i | \hat t', n' \rangle = \delta_{\hat t \hat t'}\, \delta_{nn'}\, t^i$, with $i = x, y,$ or $z$. It will be useful to express $\hat t^x$ and $\hat t^y$ in terms of raising and lowering operators $\hat t^\pm = \hat t^x \pm i \hat t^y$. Matrix elements of these operators in the $\{|l,m\rangle\}$ basis are straightforward to compute (Sec. S2 of Supplemental Material [19]).

These matrices have some important symmetries. First, $\hat t^\pm$ and $\hat t^z$ are diagonal in $n$ (for the moment we suppress the simple $n$-dependence of these operators). Second, $\hat t^z$ is diagonal in $m$ due to rotational symmetry around the $z$ direction; in the $\{|l,m\rangle\}$ space with an appropriate order of basis vectors, $\hat t^z$ is block-diagonal (nonzero entries shown as purple or dark gray entries of Fig. 1 for $l, l' \le 3$). On the other hand, in this basis the raising (lowering) operator $\hat t^+$ ($\hat t^-$) is a block lower (upper) next-to-diagonal matrix [nonzero entries indicated as the orange or light gray lower (upper) next-to-diagonal elements in Fig. 1].

FIG. 1. (color online) Structure of the three tangent component operators $\hat t^z$, $\hat t^+$, and $\hat t^-$ in $\{|l, m\rangle\}$ space for the case $l, l' \le 3$ (ignoring the $n$-dependence). For matrix $\hat t^z$, only matrix elements along the main diagonal blocks are possibly non-zero (purple or dark gray); all other matrix elements are zero. For matrix $\hat t^+$ ($\hat t^-$), only matrix elements in the lower (upper) next-to-diagonal blocks are possibly non-zero (orange or light gray); all other matrix elements are zero.

The raising and lowering operators also have some key symmetries in the $\{|l,m\rangle\}$ basis:

$$\hat t^{+}_{l,m,n;\,l',m-1,n} = -\hat t^{+}_{l',1-m,n;\,l,-m,n}, \qquad \hat t^{-}_{l,m-1,n;\,l',m,n} = -\hat t^{-}_{l',-m,n;\,l,1-m,n}. \tag{6}$$

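The average z-component of the tangent vector is

$$\langle t^z_i \rangle = \frac{\mathrm{Tr}(\hat t^z T^N)}{\mathrm{Tr}(T^N)}, \tag{7}$$

which also equals the molecule end-to-end extension as a fraction of total contour length. The transverse components of the tangent vector average to zero by symmetry: $\langle t^x_i \rangle = \langle t^y_i \rangle = 0$.

The longitudinal correlations are

$$G_{\hat t^z}(r,f) \equiv \langle t^z_i t^z_{i+j} \rangle - \langle t^z_i \rangle \langle t^z_{i+j} \rangle, \quad \langle t^z_i t^z_{i+j} \rangle = \frac{\mathrm{Tr}(\hat t^z T^j \hat t^z T^{N-j})}{\mathrm{Tr}(T^N)}, \tag{8}$$

and the transverse correlations are

$$G_{\hat t^x}(r,f) \equiv \langle t^x_i t^x_{i+j} \rangle = \langle t^+_i t^-_{i+j} \rangle / 2 = G_{\hat t^y}(r,f), \quad \langle t^+_i t^-_{i+j} \rangle = \frac{\mathrm{Tr}(\hat t^+ T^j \hat t^- T^{N-j})}{\mathrm{Tr}(T^N)}. \tag{9}$$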

Before studying force-generated protein-protein interactions, we examine the bending correlation lengths for naked DNA. In the absence of protein ($\mu = -\infty$, driving $n_i = 0$), Eq. 1 becomes a 1-d Heisenberg model. We use a segment size $b = 5$ nm (approximately 15 bp) and DNA bending stiffness $a = 10$, corresponding to persistence length $A = ab = 50$ nm. We consider the long-DNA limit $L \gg A$ for all computations, and a cutoff on $l$ is chosen to obtain ten-digit numerical precision (typically $l_{\max} = 14$). The average of the z-component of the tangent vector $\langle t^z \rangle$ and the tangent vector correlation functions $G_{\hat t^z}(r, f)$ and $G_{\hat t^x}(r, f)$ are computed from Eqs. 7, 8 and 9 (Sec. S3 of Supplemental Material [19]). For large $j$, we extract correlation lengths from the expected asymptotic form of the correlation functions:

$$G_{\hat t^z}(r,f) = C_\parallel(f)\, e^{-r/\xi_\parallel(f)}, \tag{10}$$

where $\xi_\parallel(f)$ and $C_\parallel(f)$ are the longitudinal correlation length and amplitude. Similarly, we extract the transverse correlation length $\xi_\perp(f)$ and amplitude $C_\perp(f)$.
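One simple way to read a correlation length and amplitude off the asymptotic exponential form is a straight-line fit to the logarithm of the correlation function at large separations. The sketch below is only an illustration of that idea on synthetic data: the function name `fit_exponential_tail` and the choice of an ordinary least-squares fit are assumptions, and the paper does not state which fitting procedure was used.

```python
import math


def fit_exponential_tail(r_values, g_values):
    """Least-squares line through (r, ln G); returns (xi, C) for G ~ C * exp(-r / xi)."""
    logs = [math.log(g) for g in g_values]
    n = len(r_values)
    r_mean = sum(r_values) / n
    y_mean = sum(logs) / n
    slope = (sum((r - r_mean) * (y - y_mean) for r, y in zip(r_values, logs))
             / sum((r - r_mean) ** 2 for r in r_values))
    intercept = y_mean - slope * r_mean
    return -1.0 / slope, math.exp(intercept)


if __name__ == "__main__":
    # Synthetic tail with known parameters, sampled at r = j * b with b = 5 nm.
    xi_true, c_true, b_nm = 7.16, 0.02, 5.0
    rs = [j * b_nm for j in range(10, 31)]
    gs = [c_true * math.exp(-r / xi_true) for r in rs]
    print(fit_exponential_tail(rs, gs))
```

On real data the fit window matters: separations that are too short are contaminated by faster-decaying terms, so the fit should start only where $\ln G$ is visibly linear in $r$.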

Fig. 2(a) shows $\xi_\parallel(f)$ (bottom pink crosses) and $\xi_\perp(f)$ (top orange pluses) as a function of force. For small forces, the two correlation lengths approach the same limit $\xi_\parallel(f) \approx \xi_\perp(f) \approx 50$ nm, the persistence length of dsDNA. However, as force is increased, the two correlation lengths vary differently. As one enters the high-force range ($f > k_BT/A \approx 0.1$ pN) the correlation lengths bifurcate into two curves, both $\sim f^{-1/2}$ but with different coefficients: $\xi_\perp = 2\xi_\parallel = \sqrt{k_BTA/f}$. That is, $\xi_\parallel(f)$ approaches (numerically) exactly half of $\xi_\perp(f)$ for large forces.
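The zero-force limit can be estimated by hand for the discrete chain. The sketch below assumes the standard result that, with bend weight $e^{a\,\hat t\cdot\hat t'}$ and no force, successive bend angles are independent and $\langle \hat t_i \cdot \hat t_{i+j} \rangle = (\coth a - 1/a)^j$, so both correlation lengths equal $b/\ln[1/(\coth a - 1/a)]$. The function name is hypothetical.

```python
import math


def zero_force_correlation_length(a, b_nm):
    """Tangent correlation length (nm) of the discrete chain at zero force."""
    # Mean cosine of the bend angle between adjacent links for weight exp(a * t.t').
    mean_cos = 1.0 / math.tanh(a) - 1.0 / a
    return b_nm / math.log(1.0 / mean_cos)


if __name__ == "__main__":
    # a = 10, b = 5 nm as in the text; the result is close to A = a * b = 50 nm.
    print(zero_force_correlation_length(10.0, 5.0))
```

With $a = 10$ and $b = 5$ nm this gives about 47.5 nm, slightly below the continuum value $A = ab = 50$ nm because of the finite link length.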

FIG. 2. (color online) Tangent vector correlation lengths and amplitudes for naked DNA. (a) The longitudinal correlation length $\xi_\parallel(f)$ calculated by three different methods: by Eq. 10 (bottom pink crosses), by Eq. 11 (bottom red line), and by Eq. 13 (bottom green squares, overlapping with bottom pink crosses). The transverse correlation length $\xi_\perp(f)$ calculated by three different methods: by Eq. 10 (top orange pluses), by Eq. 11 (top blue line), and by Eq. 13 (top aqua circles, overlapping with top orange pluses). In the high-force range, $\xi_\perp(f) = 2\xi_\parallel(f) = \sqrt{k_BTA/f}$. (b) The longitudinal correlation function amplitude $C_\parallel(f)$ calculated by Eq. 10 (bottom pink crosses), agreeing with $G_{\hat t^z}(0, f)$ (bottom green squares) and the high-force limit from Eq. 11 (bottom red line). The transverse correlation function amplitude $C_\perp(f)$ calculated by Eq. 10 (top orange pluses), agreeing with $G_{\hat t^x}(0, f)$ (top aqua circles) and the high-force limit from Eq. 11 (top blue line).

Fig. 2(b) shows the longitudinal correlation function amplitude $C_\parallel(f)$ (bottom pink crosses), the zero-distance correlation function $G_{\hat t^z}(0, f)$ (bottom green squares), the transverse correlation function amplitude $C_\perp(f)$ (top orange pluses), and the zero-distance correlation function $G_{\hat t^x}(0, f)$ (top aqua circles). The longitudinal amplitude is (numerically) the square of the transverse correlation function amplitude for large forces.

To analytically understand the relationship between the transverse and longitudinal correlation functions for naked DNA, we consider the high-force limit ($f \gg k_BT/A$) of the continuous WLC model, where a Gaussian approximation for tangent vector fluctuations becomes appropriate [11]. In this limit, since $\hat t$ fluctuates only slightly around $\hat z$, $t_x$ and $t_y$ are small quantities, and $t_z = 1 - (t_x^2 + t_y^2)/2 + \mathcal{O}(t_x^4 + t_y^4)$. Following Ref. [11] (Sec. S4 of Supplemental Material [19]),

$$G_{\hat t^x}(r,f) = \sqrt{\frac{k_BT}{4Af}}\; e^{-r/\sqrt{k_BTA/f}} + \cdots, \qquad G_{\hat t^z}(r,f) = \left(\frac{k_BT}{4Af}\right) e^{-r/\sqrt{k_BTA/4f}} + \cdots. \tag{11}$$
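The link between the transverse and longitudinal expressions can be sketched with Wick's theorem. The outline below assumes that, in the Gaussian regime, $t_x$ and $t_y$ are independent zero-mean Gaussian fields sharing the same two-point function $G_{\hat t^x}(r,f)$; it is an illustrative outline under that assumption, not a reproduction of the derivation in the Supplemental Material.

```latex
\begin{aligned}
G_{\hat t^z}(r,f)
  &\simeq \tfrac{1}{4}\Big[\big\langle u^2(0)\,u^2(r)\big\rangle
        - \big\langle u^2\big\rangle^{2}\Big],
  \qquad u^2 \equiv t_x^2 + t_y^2, \\
\big\langle t_x^2(0)\,t_x^2(r)\big\rangle - \big\langle t_x^2\big\rangle^{2}
  &= 2\,G_{\hat t^x}(r,f)^{2}
  \quad \text{(Wick's theorem; the same holds for } t_y\text{, and the } x\text{-}y
  \text{ cross terms cancel)}, \\
G_{\hat t^z}(r,f)
  &\simeq \tfrac{1}{4}\,(2+2)\,G_{\hat t^x}(r,f)^{2}
   = G_{\hat t^x}(r,f)^{2}
   = \frac{k_BT}{4Af}\; e^{-2r/\sqrt{k_BTA/f}} .
\end{aligned}
```

Squaring the transverse correlator doubles the decay rate, which is the origin of the factor of two between the two correlation lengths.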

Therefore, the longitudinal correlation function is essentially the square of the transverse correlation function in the large-force limit, indicating that $\xi_\parallel(f) = \sqrt{k_BTA/4f} = \xi_\perp(f)/2$ and $C_\parallel(f) = k_BT/4Af = C_\perp^2(f)$. Fig. 2 includes analytic results for $\xi_\parallel(f)$ (Fig. 2(a), bottom red line), $\xi_\perp(f)$ (Fig. 2(a), top blue line), $C_\parallel(f)$ (Fig. 2(b), bottom red line) and $C_\perp(f)$ (Fig. 2(b), top blue line). In the high-force range, the analytic results match our numerical calculation.
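To get a feel for the magnitudes involved, the high-force expressions can be evaluated directly. The sketch below assumes a thermal energy of $k_BT \approx 4.1$ pN·nm (room temperature, a value not stated in the text) and the persistence length $A = 50$ nm used above; the function and key names are hypothetical.

```python
import math

KBT_PN_NM = 4.1  # assumed thermal energy at room temperature, in pN*nm


def high_force_correlations(f_pN, A_nm=50.0, kBT=KBT_PN_NM):
    """High-force (Gaussian) correlation lengths in nm and dimensionless amplitudes."""
    xi_perp = math.sqrt(kBT * A_nm / f_pN)         # transverse correlation length
    c_perp = math.sqrt(kBT / (4.0 * A_nm * f_pN))  # transverse amplitude
    return {
        "xi_perp": xi_perp,
        "xi_par": xi_perp / 2.0,   # longitudinal length is half the transverse one
        "C_perp": c_perp,
        "C_par": c_perp ** 2,      # longitudinal amplitude is the square
    }


if __name__ == "__main__":
    for force in (0.5, 1.0, 5.0, 10.0):
        print(force, high_force_correlations(force))
```

At 1 pN these formulas give a transverse length of roughly 14 nm and a longitudinal length of roughly 7 nm, both well below the zero-force persistence length.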

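The asymptotic behaviors of the transverse and longitudinal correlation functions can also be obtained from the discrete WLC transfer matrices, using $G_{\hat t^z}(r, f)$ and $G_{\hat t^x}(r, f)$ defined in Eq. 8 and Eq. 9, respectively. In the large-$N$ limit (Sec. S5 of Supplemental Material [19]),

$$G_{\hat t^z}(r,f) = \sum_{k=1}^{\infty} (t^z_0)_{0,k}\,(t^z_0)_{k,0} \left(\frac{\lambda^0_k}{\lambda^0_0}\right)^{j}, \qquad G_{\hat t^x}(r,f) = \frac{1}{2} \sum_{k=1}^{\infty} (t^+_{0,-1})_{0,k}\,(t^-_{-1,0})_{k,0} \left(\frac{\lambda^{-1}_k}{\lambda^0_0}\right)^{j}, \tag{12}$$

where the $\lambda^m_k$ ($m = 0, \pm 1, \pm 2, \ldots$; $k = |m|, |m|+1, |m|+2, \ldots$) are the eigenvalues of the $m$-th diagonal block of the “naked DNA” matrix $B$, ordered so that $\lambda^m_{|m|} > \lambda^m_{|m|+1} > \lambda^m_{|m|+2} > \cdots$ (the blocks $m$ and $-m$ have identical spectra, $\lambda^{-m}_k = \lambda^{m}_k$). Sec. S6 of the Supplemental Material [19] shows how applied force $f$ affects the eigenvalues $\{\lambda^m_k\}$, i.e., the transfer matrix spectrum of naked DNA.

$(t^z_m)_{k_1,k_2}$ is the matrix element of the $m$-th diagonal block of the matrix $\hat t^z$ (ignoring $n$-dependence), expressed by taking the eigenvectors $\{|\lambda^m_k\rangle\}$ of the $m$-th diagonal block of the naked matrix $B$ as the basis: $(t^z_m)_{k_1,k_2} = \langle \lambda^m_{k_1} | \hat t^z_m | \lambda^m_{k_2} \rangle$. Similarly, $(t^+_{m,m-1})_{k_1,k_2} = \langle \lambda^m_{k_1} | \hat t^+_{m,m-1} | \lambda^{m-1}_{k_2} \rangle$ and $(t^-_{m-1,m})_{k_1,k_2} = \langle \lambda^{m-1}_{k_1} | \hat t^-_{m-1,m} | \lambda^m_{k_2} \rangle$.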

For large distances (large $j$) the correlation functions in Eq. 12 are dominated by the $k = 1$ contributions, giving

$$\xi_\parallel(f) = \frac{b}{\ln(\lambda^0_0/\lambda^0_1)}, \qquad \xi_\perp(f) = \frac{b}{\ln(\lambda^0_0/\lambda^{-1}_1)}. \tag{13}$$

We have verified that the values of $\xi_\parallel(f)$ and $\xi_\perp(f)$ from Eq. 13, computed by matrix diagonalization [bottom green squares and top aqua circles in Fig. 2(a)], match the results computed directly from the multiplication of matrices [compare to bottom pink crosses and top orange pluses in Fig. 2(a)].
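The eigenvalue route to the two correlation lengths can be reproduced approximately with a small stand-alone script. The sketch below does not use the spherical-harmonic truncation described in the text; instead it swaps in a Gauss-Legendre (Nyström) discretization in $x = \cos\theta$ of the naked-DNA kernel in each azimuthal sector, with the azimuthal integral done by a midpoint sum and the force term split symmetrically between the two tangents (a similarity transformation that leaves the eigenvalues unchanged). The thermal energy $k_BT = 4.1$ pN·nm, the grid sizes, and all function names are assumptions. The sector $m = 1$ is used for the transverse length, since the $m = \pm 1$ blocks share the same spectrum.

```python
import math


def gauss_legendre(n):
    """Nodes and weights of n-point Gauss-Legendre quadrature on [-1, 1]."""
    xs, ws = [], []
    for i in range(n):
        x = math.cos(math.pi * (i + 0.75) / (n + 0.5))   # standard initial guess
        for _ in range(100):                              # Newton iteration on P_n
            p_prev, p = 1.0, x
            for k in range(2, n + 1):
                p_prev, p = p, ((2 * k - 1) * x * p - (k - 1) * p_prev) / k
            dp = n * (x * p - p_prev) / (x * x - 1.0)     # derivative P_n'(x)
            dx = p / dp
            x -= dx
            if abs(dx) < 1e-14:
                break
        xs.append(x)
        ws.append(2.0 / ((1.0 - x * x) * dp * dp))
    return xs, ws


def block_matrix(m, a, F, xs, ws, n_phi=64):
    """Symmetrized Nystrom matrix of the naked-DNA kernel in the sector exp(i*m*phi)."""
    n = len(xs)
    phis = [math.pi * (k + 0.5) / n_phi for k in range(n_phi)]
    cos_phi = [math.cos(p) for p in phis]
    cos_mphi = [math.cos(m * p) for p in phis]
    M = [[0.0] * n for _ in range(n)]
    for i in range(n):
        si = math.sqrt(1.0 - xs[i] ** 2)
        for j in range(i, n):
            sj = math.sqrt(1.0 - xs[j] ** 2)
            # exp(-a|t - t'|^2 / 2) = exp(a (t.t' - 1)); force split half and half.
            base = a * (xs[i] * xs[j] - 1.0) + 0.5 * F * (xs[i] + xs[j])
            azim = sum(math.exp(base + a * si * sj * c) * cm
                       for c, cm in zip(cos_phi, cos_mphi)) / n_phi
            M[i][j] = M[j][i] = math.sqrt(ws[i] * ws[j]) * azim
    return M


def top_eigenpair(M, iters=600):
    """Largest eigenvalue and unit eigenvector of a symmetric matrix (power iteration)."""
    n = len(M)
    v = [1.0 + 0.5 * i / n for i in range(n)]   # generic start vector, no symmetry
    for _ in range(iters):
        w = [sum(mij * vj for mij, vj in zip(row, v)) for row in M]
        norm = math.sqrt(sum(c * c for c in w))
        v = [c / norm for c in w]
    w = [sum(mij * vj for mij, vj in zip(row, v)) for row in M]
    return sum(vi * wi for vi, wi in zip(v, w)), v


def correlation_lengths(f_pN, a=10.0, b_nm=5.0, kBT=4.1, n_nodes=48):
    """(xi_parallel, xi_perp) in nm from the two leading eigenvalue ratios."""
    F = b_nm * f_pN / kBT                       # dimensionless force beta*b*f
    xs, ws = gauss_legendre(n_nodes)
    M0 = block_matrix(0, a, F, xs, ws)
    lam00, v0 = top_eigenpair(M0)
    # Deflate the leading m = 0 mode to expose the second m = 0 eigenvalue.
    deflated = [[M0[i][j] - lam00 * v0[i] * v0[j] for j in range(n_nodes)]
                for i in range(n_nodes)]
    lam01, _ = top_eigenpair(deflated)
    lam11, _ = top_eigenpair(block_matrix(1, a, F, xs, ws))
    return b_nm / math.log(lam00 / lam01), b_nm / math.log(lam00 / lam11)


if __name__ == "__main__":
    for force in (0.0, 1.0, 10.0):
        print(force, correlation_lengths(force))
```

At zero force both lengths should coincide near the persistence length, and at forces of several pN the ratio of the longitudinal to the transverse length should approach one half, in line with the behavior described for Fig. 2(a).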

We now consider the interaction between DNA-bending proteins. We computed the protein occupation correlation function $G_{\hat n}(r, f)$ along a DNA in dilute protein solution, the decay of which provides the correlation length for the inter-protein interaction, for the case $\gamma = 0$ and $a' = 50$, corresponding to a rather stiff protein-DNA complex with a preferred bend angle of 90°. We extract the protein occupancy correlation length, just as in Eq. 10; the result for $\mu = -5$ and $\eta = 0$ is shown in Fig. 3 (bottom grey filled circles; also see Sec. S7 of Supplemental Material [19]). For comparison, in Fig. 3 we also show the tangent vector correlation lengths of protein-bound DNA (bottom pink crosses and top orange pluses) and those of naked DNA (bottom green squares and top aqua circles).

FIG. 3. (color online) Protein occupation correlation length along DNA (bottom grey filled circles) in dilute protein solution ($\mu = -5$, $\eta = 0$, $\gamma = 0$, $a' = 50$). For reference, the tangent vector correlation lengths along DNA in the same solution (bottom pink crosses and top orange pluses) and those along a naked DNA (bottom green squares and top aqua circles) are shown. Inset shows ln[G(r)/e] vs. distance at f = 0.1 pN (green squares) and a linear fit of it (black line).

Fig. 3 reveals that the protein occupancy correlation length is equal to the longitudinal tangent vector correlation length $\xi_\parallel(f, \mu)$. This can be understood from the diagonal-in-$m$ form (Fig. 1) of the protein occupancy matrix $\hat n$, which is shared by $A$, $\hat t^z$, and $B$, reflecting their axial rotational symmetry. For the protein occupancy correlation function, just as for $G_{\hat t^z}(r, f)$, blocks of different $m$ are decoupled, so the decay of the correlation function is determined by the two largest eigenvalues of the $m = 0$ diagonal block of the transfer matrix $T$, the block that contains the largest eigenvalue of $T$. Thus, the correlation function of the protein occupation degrees of freedom has a decay length of $\xi_\parallel(f, \mu)$. For large forces, this is exactly half of the longest correlation length in the problem, $\xi_\perp(f, \mu)$; this longer transverse correlation length arises from the off-block-diagonal structure of $\hat t^\pm$, which allows selection of the two largest eigenvalues from any two adjacent diagonal blocks of $T$. The protein occupancy operator has higher symmetry (Eq. 4), which makes $n_i^2 = n_i$ and explains the jump of the protein occupation correlation function at $r = 0$.

Our main result, namely $\xi_\parallel$ being the range of protein-protein interactions mediated by bending fluctuations along a stretched DNA, follows from the rotational symmetry of the protein-DNA interaction operator $A$. This result is general for proteins which distort DNA structure as long as the distortions are isotropically distributed around the double helix axis (the same result follows for other values of $\gamma$, $a'$ and $\mu$, Sec. S8 of Supplemental Material [19]). For non-sequence-specific binding this will be the case, since two proteins may bind in essentially any relative axial orientation by shifting their binding positions over a 10.5 base-pair range (the $h \approx 3.6$ nm helix repeat for dsDNA), a length scale small relative to the inter-protein separations we are considering. Twist fluctuations of inter-protein DNA will further randomize the relative axial orientation of pairs of DNA-bound proteins. Of course, for very large forces where $\xi_\parallel$ becomes comparable to $h$, a more detailed model taking into account the helix structure would need to be employed, but in that regime, with $\xi_\parallel$ comparable in size to a typical DNA-binding protein, one would expect direct interactions between adjacent proteins to dominate (Sec. S7 of Supplemental Material [19]).

One might ask whether the longer $\xi_\perp$ can ever be the protein-protein interaction range. This would require the relative angular orientations of the two proteins around the axial (stretching) direction to be fixed: one might imagine using a sequence-specific DNA-binding protein and binding sites “phased” to be at the same orientation around the double helix (spaced at multiples of 10.5 bp) in an attempt to do this. However, since angular (twist) fluctuations between the two sites will increase with interprotein distance, one would expect to see an interaction decaying with the longer $\xi_\perp$ only at short distances, with the shorter $\xi_\parallel$ dominating at longer distances.

Finally, we note that while the individual protein-generated distortions considered here are isotropically distributed around the double helix axis, the relative orientation of two nearby bends can be expected to be angularly correlated. This arises from the fact that two oppositely-directed bends of the double helix compensate one another’s distortion [8-10]. Therefore one can expect force on a DNA to direct the self-organization of DNA-bending proteins into clusters (via the interaction we have analyzed) with correlated bend directions. If the bends are chiral (the general case since DNA and proteins are both chiral), nearby proteins will be chirally organized, i.e., forming locally helical DNA-protein complexes. An important case is that of nucleosomes, which have well-defined entry/exit angles for the DNA bound to them, with broken chiral symmetry; moderate applied tension can be expected to drive the packing and helical organization of adjacent nucleosomes in chromatin.

Acknowledgments

We acknowledge the support of the NSF through Grant DMR-0715099, and of the NIH through Grant 1U54CA143869-01 (NU-PS-OC).

Contributor Information

Houyin Zhang, Email: houyinzhang2011@u.northwestern.edu.

John F. Marko, Email: john-marko@northwestern.edu.

References

1. Stavans J, Oppenheim A. Phys. Biol. 2006;3:R1. doi: 10.1088/1478-3975/3/4/R01.
2. Rocha EPC. Annu. Rev. Genet. 2008;42:211–233. doi: 10.1146/annurev.genet.42.110807.091653.
3. Rippe K, editor. Genome Organization and Function in the Cell Nucleus. New York: Wiley-VCH; 2011.
4. Ali BM, Amit R, Braslavsky I, Oppenheim AB, Gileadi O, Stavans J. Proc. Natl. Acad. Sci. U.S.A. 2001;98:10658. doi: 10.1073/pnas.181029198.
5. van Noort J, Verbrugge S, Goosen N, Dekker C, Dame RT. Proc. Natl. Acad. Sci. U.S.A. 2004;101:6969. doi: 10.1073/pnas.0308230101.
6. Meglio A, Praly E, Ding F, Allemand JF, Bensimon D, Croquette V. Curr. Opin. Struct. Biol. 2009;19:615. doi: 10.1016/j.sbi.2009.08.005.
7. Chaurasiya KR, Paramanathan T, McCauley MJ, Williams MC. Phys. Life Rev. 2010;7:299. doi: 10.1016/j.plrev.2010.06.001.
8. Rudnick J, Bruinsma R. Biophys. J. 1999;76:1725. doi: 10.1016/S0006-3495(99)77334-0.
9. Koslover EF, Spakowitz AJ. Phys. Rev. Lett. 2009;102:178102. doi: 10.1103/PhysRevLett.102.178102.
10. Zhang H, Marko JF. Phys. Rev. E. 2010;82:051906. doi: 10.1103/PhysRevE.82.051906.
11. Marko JF, Siggia ED. Macromolecules. 1995;28:8759.
12. Yan J, Marko JF. Phys. Rev. E. 2003;68:011905. doi: 10.1103/PhysRevE.68.011905.
13. Yan J, Kawamura R, Marko JF. Phys. Rev. E. 2005;71:061905. doi: 10.1103/PhysRevE.71.061905.
14. Kramers HA, Wannier GH. Phys. Rev. 1941;60:252.
15. Zimm BH, Bragg JK. J. Chem. Phys. 1959;31:526.
16. Livadaru L, Netz RR, Kreuzer HJ. Macromolecules. 2003;36:3732.
17. Storm C, Nelson PC. Phys. Rev. E. 2003;67:051906. doi: 10.1103/PhysRevE.67.051906.
18. Xiao B, Zhang H, Johnson RC, Marko JF. Nucleic Acids Res. 2011;39:5568. doi: 10.1093/nar/gkr141.
19. Supplemental Material is available at (URL)
