Abstract
When a DNA molecule is stretched, the zero-force correlation length for its bending fluctuations, the persistence length A, bifurcates into two different correlation lengths: the shorter “longitudinal” correlation length ξ‖(f) and the longer “transverse” correlation length ξ⊥(f). In the high-force limit, ξ‖(f) = ξ⊥(f)/2. When DNA-bending proteins bind to the DNA molecule, there is an effective interaction between the protein-generated bends, mediated by DNA elasticity and bending fluctuations. Surprisingly, the range of this interaction is not the longest correlation length, that associated with transverse fluctuations of the tangent vector along the polymer, but instead is the second-longest one, the longitudinal correlation length ξ‖(f, μ). The effect arises because the protein-bend contribution to the Hamiltonian has an axial rotational symmetry, which eliminates its coupling to the transverse fluctuations.
In living cells, binding of proteins to DNA controls gene expression and packaging of the genome [1–3], requiring DNA-binding proteins to communicate and cooperate over a wide range of length scales. Many DNA-binding proteins distort the double helix, which has led to development of micromechanical methods for analysis of protein-DNA interactions [4–7]. Notably, tension along DNA molecules can induce cooperative interactions between distant DNA-bending proteins [8–10]. Numerical analysis of a semiflexible “worm-like chain” (WLC) showed the force-generated interaction range between two quenched bends (corresponding to bound proteins) to be √(kBTA/4f) [10], in the high-force range f > kBT/A ≈ 0.1 pN. The dominant (longest) correlation length for the WLC is that for bending fluctuations, ξ⊥(f) = √(kBTA/f) [11]. Surprisingly, the protein-induced bends have an interaction range exactly half of ξ⊥(f). Here we show that this arises from the rotational symmetry of the part of the energy describing the protein-DNA interaction. We will illustrate this for a specific model, but we will also argue that our results are general for a wide class of similar models of proteins interacting with DNA.
To investigate the force-dependent interactions between DNA-binding proteins we use a discretized WLC model including protein-binding effects [10] (essentially the model of Refs. [12] and [13]). The DNA is treated as a chain of N links each of length b, with each link able to point in any direction with a Boltzmann weight that depends on the angle between adjacent links. This form of the model allows it to be treated by “transfer matrix” [14] methods, whereby summation over individual link degrees of freedom can be reduced to matrix multiplications. This method has a long history of use in biopolymer physics, from early studies of the helix-coil transition [15], to recent studies of semiflexible polymer elasticity [16, 17].
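The transfer-matrix idea can be sketched on a much simpler system than the WLC. The example below uses a hypothetical two-state chain with periodic boundary conditions (the couplings `J` and `h` and all function names are illustrative assumptions, not part of the model of Ref. [10]) and checks that the trace of the N-th power of a 2×2 matrix reproduces the brute-force sum over all configurations.

```python
import itertools
import math

def transfer_matrix(J, h):
    """2x2 transfer matrix for a toy chain of two-state links (s = +1 or -1)."""
    states = (1, -1)
    # The field term is split symmetrically between the two neighbouring links.
    return [[math.exp(J * s * sp + 0.5 * h * (s + sp)) for sp in states]
            for s in states]

def matmul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def partition_function_tm(J, h, N):
    """Z = Tr(T^N): the sum over link states becomes N matrix multiplications."""
    T = transfer_matrix(J, h)
    P = [[1.0, 0.0], [0.0, 1.0]]
    for _ in range(N):
        P = matmul(P, T)
    return P[0][0] + P[1][1]

def partition_function_brute(J, h, N):
    """Direct sum of Boltzmann weights over all 2^N configurations of the ring."""
    Z = 0.0
    for s in itertools.product((1, -1), repeat=N):
        minus_beta_E = sum(J * s[i] * s[(i + 1) % N] + h * s[i] for i in range(N))
        Z += math.exp(minus_beta_E)
    return Z

print(partition_function_tm(0.7, 0.3, 8), partition_function_brute(0.7, 0.3, 8))
```

The brute-force sum costs 2^N terms, while the matrix route costs N small multiplications, which is what makes long chains tractable.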
In our model, proteins are taken to bind “nonspecifically” (with no sequence-dependent affinity) to the nodes between adjacent links. The DNA-protein complex energy is a sum over the links,
(1) |
where t̂i is a unit vector which describes the direction of each link, the dimensionless parameter a (= A/b) is the naked DNA bending rigidity, a′ is the rigidity of a protein-occupied node, and ni is the protein occupation degree of freedom for the i-th node (ni = 0 for an unoccupied node, or 1 for a protein-bound node). Protein binding is controlled by the chemical potential μ which is a function of bulk protein concentration c (μ = ln[c/c0] where c0 is a protein-dependent constant determining the binding affinity), and the bends induced by protein binding are described by γ = cosθ, where θ is the preferred angle between adjacent links resulting from protein binding. The DNA-protein complex is stretched by the applied force in the ẑ direction. Finally, η is the intrinsic binding cooperativity, describing the strength of direct interaction between adjacent bound proteins.
This model describes a variety of DNA-binding proteins [10, 12, 18], depending on the various parameter values chosen. For most situations of interest, the total DNA contour length L (= Nb) can be taken to be much bigger than the link length b, and a periodic boundary condition can be used (t̂N = t̂0). We compute the equilibrium partition function Z = Tr(T^N) following the method of Ref. [10], where T is the transfer matrix for a link
(2) |
Here A and B are matrices with elements and .
The matrices may be expressed in terms of spherical harmonics, i.e.,
(3) |
where the integrals are on the unit sphere. Due to axial symmetry around the stretching (z) direction, the matrix elements are diagonal in m, i.e., Eq. 3 is proportional to δmm′. Closed-form expressions for the matrix elements of A and B can be found in Sec. S1 of Supplemental Material [19].
Protein occupation at any node along the chain can be computed in this transfer matrix framework using the “operators” 〈t̂, n|n̂|t̂′, n′〉 = δt̂t̂′δnn′ n, or
(4) |
where 1 and 0 stand for the unit matrix and zero matrix in the {|l,m〉} space, respectively. The average and two-point correlations for the occupation variables are
(5) |
where Gn̂(r, f) is the protein occupation correlation function for two DNA-binding proteins separated by contour length r = jb.
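How an occupation “operator” enters the traces can be sketched with a stripped-down lattice gas that keeps only the occupation variables ni, the chemical potential μ and the cooperativity η, and drops the tangent vectors entirely. This reduced model, the trace expressions used, and the function names are assumptions made for illustration; they are not Eq. 5 itself. The check against brute-force enumeration confirms that inserting the diagonal matrix n̂ between powers of T gives the mean occupation and the two-point correlation.

```python
import itertools
import math

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def matpow(X, p):
    R = [[1.0, 0.0], [0.0, 1.0]]
    for _ in range(p):
        R = matmul(R, X)
    return R

def trace(X):
    return X[0][0] + X[1][1]

def occupancy_stats_tm(mu, eta, N, j):
    """Mean occupation and connected two-point correlation at separation j."""
    # Toy weight per node pair: exp(mu*(n + n')/2 + eta*n*n'), with n, n' in {0, 1}.
    T = [[math.exp(0.5 * mu * (n + m) + eta * n * m) for m in (0, 1)]
         for n in (0, 1)]
    n_op = [[0.0, 0.0], [0.0, 1.0]]  # diagonal occupation "operator"
    Z = trace(matpow(T, N))
    mean = trace(matmul(n_op, matpow(T, N))) / Z
    two_point = trace(matmul(matmul(n_op, matpow(T, j)),
                             matmul(n_op, matpow(T, N - j)))) / Z
    return mean, two_point - mean ** 2

def occupancy_stats_brute(mu, eta, N, j):
    """Same quantities by summing over all 2^N occupation patterns on a ring."""
    Z = mean = two_point = 0.0
    for n in itertools.product((0, 1), repeat=N):
        w = math.exp(sum(mu * n[i] + eta * n[i] * n[(i + 1) % N]
                         for i in range(N)))
        Z += w
        mean += w * n[0]
        two_point += w * n[0] * n[j]
    mean /= Z
    return mean, two_point / Z - mean ** 2

print(occupancy_stats_tm(-1.0, 0.5, 8, 3))
print(occupancy_stats_brute(-1.0, 0.5, 8, 3))
```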
The three components of the tangent vector can also be expressed as “operators”: 〈t̂, n|t̂i|t̂′, n′〉 = δt̂t̂′δnn′ti, i = x,y, or z. It will be useful to express t̂x and t̂y in terms of raising and lowering operators t̂± = t̂x ± it̂y. Matrix elements of these operators in the {|l,m〉} basis are straightforward to compute (Sec. S2 of Supplemental Material [19]).
These matrices have some important symmetries. First, t̂± and t̂z are diagonal in n (for the moment we suppress the simple n-dependence of these operators). Second, t̂z is diagonal in m due to rotational symmetry around the z direction; in the {|l,m〉} space with an appropriate order of basis vectors, t̂z is block-diagonal (nonzero entries shown as purple or dark gray entries of Fig. 1 for l, l′ ≤ 3). On the other hand, in this basis the raising (lowering) operator t̂+ (t̂−) is a block lower (upper) next-to-diagonal matrix [nonzero entries indicated as the orange or light gray lower (upper) next-to-diagonal elements in Fig. 1].
The raising and lowering operators also have some key symmetries in the {|l,m〉} basis:
(6) |
The average z-component of the tangent vector is
(7) |
which also equals the molecule end-to-end extension as a fraction of total contour length. The transverse components of the tangent vector average to zero by symmetry: 〈t̂x〉 = 〈t̂y〉 = 0.
The longitudinal correlations are
(8) |
and the transverse correlations are
(9) |
Before studying force-generated protein-protein interactions, we examine the bending correlation lengths for naked DNA. In the absence of protein (μ = −∞, driving ni = 0), Eq. 1 becomes a 1-d Heisenberg model. We use a segment size b = 5 nm (approximately 15 bp) and DNA bending stiffness a = 10 corresponding to persistence length A = ab = 50 nm. We consider the long-DNA limit L ≫ A for all computations, and a cutoff on l is chosen to obtain ten-digit numerical precision (typically lmax = 14). The averages of the z-component of the tangent vector 〈t̂z〉 and the tangent vector correlation functions Gt̂z (r, f) and Gt̂x (r, f) are computed from Eqs. 7, 8 and 9 (Sec. S3 of Supplemental Material [19]). For large j, we extract correlation lengths from the expected asymptotic form of the correlation functions:
$$G_{\hat{t}_z}(r, f) \simeq C_\parallel(f)\, e^{-r/\xi_\parallel(f)} \qquad (10)$$
where ξ‖ (f) and C‖ (f) are the longitudinal correlation length and amplitude. Similarly, we extracted the transverse correlation length ξ⊥(f) and amplitude C⊥(f).
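For orientation, the zero-force decay length of the discrete chain can be estimated in closed form. The sketch below assumes that the naked-DNA weight for adjacent links is exp(a t̂i·t̂i+1), i.e. the classical 1-d Heisenberg chain at zero force, for which the nearest-neighbor tangent correlation is the Langevin function coth(a) − 1/a and correlations decay geometrically with link separation. The function names are illustrative and are not taken from Ref. [10].

```python
import math

def langevin(a):
    """Nearest-neighbour tangent correlation <t_i . t_(i+1)> of the zero-force chain."""
    return 1.0 / math.tanh(a) - 1.0 / a

def zero_force_correlation_length(a, b):
    """Decay length (same units as b) of <t_i . t_(i+j)> = langevin(a)**j."""
    return -b / math.log(langevin(a))

# Parameters quoted in the text: a = 10, b = 5 nm.
xi0 = zero_force_correlation_length(a=10.0, b=5.0)
print(f"zero-force correlation length: {xi0:.2f} nm")
```

With a = 10 and b = 5 nm this gives roughly 47.5 nm, close to but slightly below the continuum value A = ab = 50 nm; the two agree as a grows.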
Fig. 2(a) shows ξ‖(f) (bottom pink crosses) and ξ⊥(f) (top orange pluses) as a function of force. For small forces, the two correlation lengths approach the same limit ξ‖(f) ≈ ξ⊥(f) ≈ 50 nm, the persistence length for a dsDNA. However, as force is increased, the two correlation lengths vary differently. As one enters the high-force range (f > kBT/A ≈ 0.1 pN) the correlation lengths bifurcate into two curves, both ∼ f^(−1/2) but with different coefficients: ξ⊥(f) ≈ √(kBTA/f) and ξ‖(f) ≈ √(kBTA/4f). ξ‖(f) approaches (numerically) exactly half of ξ⊥(f) for large forces.
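To put numbers on the high-force regime, the short sketch below evaluates the crossover force kBT/A and the two correlation lengths, assuming the continuum Gaussian result ξ⊥ = √(kBTA/f) of Ref. [11] together with ξ‖ = ξ⊥/2. The value kBT ≈ 4.1 pN·nm (room temperature) and the function names are assumptions introduced here.

```python
import math

KBT_PN_NM = 4.1  # assumed thermal energy at room temperature, in pN*nm

def crossover_force(A_nm, kBT=KBT_PN_NM):
    """Force scale kBT/A (pN) above which the high-force regime applies."""
    return kBT / A_nm

def xi_perp(f_pN, A_nm=50.0, kBT=KBT_PN_NM):
    """Transverse correlation length (nm) in the high-force Gaussian limit."""
    return math.sqrt(kBT * A_nm / f_pN)

def xi_par(f_pN, A_nm=50.0, kBT=KBT_PN_NM):
    """Longitudinal correlation length (nm), taken as half of xi_perp."""
    return 0.5 * xi_perp(f_pN, A_nm, kBT)

print(f"kBT/A = {crossover_force(50.0):.3f} pN")
for f in (0.5, 1.0, 5.0, 10.0):
    print(f"f = {f:5.1f} pN: xi_perp = {xi_perp(f):6.2f} nm, xi_par = {xi_par(f):6.2f} nm")
```

Under these assumptions the crossover force is about 0.08 pN, consistent with the ≈ 0.1 pN quoted above, and at 1 pN the two lengths are roughly 14 nm and 7 nm.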
Fig. 2(b) shows the longitudinal correlation function amplitude C‖ (f) (bottom pink crosses), the zero-distance correlation function Gt̂z (0, f) (bottom green squares), the transverse correlation function amplitude C⊥(f) (top orange pluses), and the zero-distance correlation function Gt̂x (0, f) (top aqua circles). The longitudinal amplitude is (numerically) the square of the transverse correlation function amplitude for large forces.
To analytically understand the relationship between the transverse and longitudinal correlation functions for naked DNA, we consider the high-force limit (f ≫ kBT/A) of the continuous WLC model where a Gaussian approximation for tangent vector fluctuations becomes appropriate [11]. In this limit, since t̂ fluctuates only slightly around ẑ, tx and ty are small quantities, and tz ≈ 1 − (tx² + ty²)/2. Following Ref. [11] (Sec. S4 of Supplemental Material [19]),
(11) |
Therefore, the longitudinal correlation function is essentially the square of the transverse correlation function in the large force limit, indicating that ξ‖(f) = ξ⊥(f)/2 and C‖(f) = C⊥(f)². Fig. 2 includes analytic results for ξ‖(f) (Fig. 2(a), bottom red line), ξ⊥(f) (Fig. 2(a), top blue line), C‖(f) (Fig. 2(b), bottom red line) and C⊥(f) (Fig. 2(b), top blue line). In the high-force range, the analytic results match our numerical calculation.
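The squaring relation can be sketched with Wick's theorem. The block below assumes the standard high-force Gaussian energy of the continuum WLC [11], writes t⊥ = (tx, ty), and takes Gt̂x to be the correlation function of a single transverse component; it is a sketch of the reasoning under those assumptions rather than a reproduction of Eq. 11.

```latex
\begin{align}
\frac{E}{k_BT} &\approx \frac{1}{2}\int ds
  \left[ A\,\bigl|\partial_s \mathbf{t}_\perp\bigr|^2
       + \frac{f}{k_BT}\,\bigl|\mathbf{t}_\perp\bigr|^2 \right] + \text{const}, \\
G_{\hat t_x}(r,f) &= \int \frac{dq}{2\pi}\,\frac{e^{iqr}}{A q^2 + f/k_BT}
  = \frac{1}{2}\sqrt{\frac{k_BT}{fA}}\; e^{-r/\xi_\perp},
  \qquad \xi_\perp = \sqrt{\frac{k_BT A}{f}}, \\
G_{\hat t_z}(r,f) &\approx \frac{1}{4}
  \Bigl[ \bigl\langle t_\perp^2(0)\, t_\perp^2(r) \bigr\rangle
       - \bigl\langle t_\perp^2 \bigr\rangle^2 \Bigr]
  = \frac{1}{4}\cdot 2\,\Bigl[ G_{\hat t_x}(r,f)^2 + G_{\hat t_y}(r,f)^2 \Bigr]
  = G_{\hat t_x}(r,f)^2 .
\end{align}
```

Squaring the exponential halves the decay length and squares the amplitude, which is the origin of both the factor of two between the correlation lengths and the relation between the amplitudes.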
The asymptotic behaviors of the transverse and longitudinal correlation functions can also be obtained from the discrete WLC transfer matrices, using Gt̂z (r, f) and Gt̂x (r, f) defined in Eq. 8 and Eq. 9, respectively. In the large-N limit (Sec. S5 of Supplemental Material [19]),
(12) |
where the are the eigenvalues of the m-th diagonal block of the “naked DNA” matrix B and . Sec. S6 of the Supplemental Material [19] shows how applied force f affects the eigenvalues , i.e., the transfer matrix spectrum of naked DNA.
is the matrix element of the m-th diagonal block of the matrix t̂z (ignoring n-dependence) expressed by taking the eigenvectors of the m-th diagonal block of the naked matrix B as the basis, Similarly, and .
For large distances (large j) the correlation functions of Eq. 12 are dominated by the k = 1 contributions, giving
(13) |
We have verified that ξ‖ (f) and ξ⊥(f) obtained from Eq. 13 by matrix diagonalization [bottom green squares and top aqua circles in Fig. 2(a)] match the values computed directly from multiplication of matrices [bottom pink crosses and top orange pluses in Fig. 2(a)].
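The same kind of cross-check can be sketched on a toy problem. The example below uses a hypothetical symmetric 2×2 transfer matrix (couplings `J`, `h` and all names are illustrative assumptions, unrelated to the WLC matrices) and compares the decay length obtained from the ratio of the two eigenvalues with the one obtained from the decay of a correlation function computed by explicit matrix products.

```python
import math

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def matpow(X, p):
    R = [[1.0, 0.0], [0.0, 1.0]]
    for _ in range(p):
        R = matmul(R, X)
    return R

def trace(X):
    return X[0][0] + X[1][1]

def xi_from_eigenvalues(J, h, b):
    """Decay length b / ln(lambda_1 / lambda_2) from the closed-form 2x2 spectrum."""
    mid = math.exp(J) * math.cosh(h)
    gap = math.sqrt(math.exp(2 * J) * math.sinh(h) ** 2 + math.exp(-2 * J))
    return b / math.log((mid + gap) / (mid - gap))

def xi_from_products(J, h, b, N=60, j=5):
    """Decay length from the ratio G(j)/G(j+1) of a connected correlation function."""
    T = [[math.exp(J + h), math.exp(-J)],
         [math.exp(-J), math.exp(J - h)]]
    S = [[1.0, 0.0], [0.0, -1.0]]  # diagonal operator for the two-state variable
    Z = trace(matpow(T, N))
    mean = trace(matmul(S, matpow(T, N))) / Z

    def G(r):
        two_point = trace(matmul(matmul(S, matpow(T, r)),
                                 matmul(S, matpow(T, N - r)))) / Z
        return two_point - mean ** 2

    return b / math.log(G(j) / G(j + 1))

print(xi_from_eigenvalues(0.5, 0.2, 5.0), xi_from_products(0.5, 0.2, 5.0))
```

For a chain much longer than the decay length the two estimates agree to numerical precision, since finite-size corrections are exponentially small in N.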
We now consider the interaction between DNA-bending proteins. We computed the protein occupation correlation function Gn̂ (r, f) along a DNA in dilute protein solution, the decay of which provides the correlation length for the inter-protein interaction, for the case γ = 0 and a′ = 50, corresponding to a rather stiff protein-DNA complex with a preferred bend angle of 90°. We extract the protein occupancy correlation length, just as in Eq. 10; the result for μ = −5 and η = 0 is shown in Fig. 3 (bottom grey filled circles; also see Sec. S7 of Supplemental Material [19]). For comparison, in Fig. 3 we also show the tangent vector correlation lengths of protein-bound DNA (bottom pink crosses and top orange pluses) and those of naked DNA (bottom green squares and top aqua circles).
Fig. 3 reveals that the protein occupancy correlation length is equal to the longitudinal tangent vector correlation length ξ‖(f, μ). This can be understood from the diagonal-in-m form (Fig. 1) of the protein occupancy matrix n̂, which is shared by A, t̂z, and B, reflecting their axial rotational symmetry. For the protein occupancy correlation function, just as for Gt̂z (r, f), blocks of different m are decoupled, so the decay of the correlation function is determined by the two largest eigenvalues of the m = 0 diagonal block of the transfer matrix T, the block that contains the largest eigenvalue of T. Thus, the correlation function of the protein occupation degrees of freedom has a decay length of ξ‖ (f,μ). For large forces, this is exactly half of the longest correlation length in the problem, ξ⊥(f, μ); this longer transverse correlation length arises from the off-block-diagonal structure of t̂±, which allows selection of the two largest eigenvalues from any two adjacent diagonal blocks of T. The protein occupancy operator n̂ has higher symmetry (Eq. 4), which makes and explains the jump of protein occupation correlation function at r = 0.
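The selection rule can be illustrated with a toy spectrum. The sketch below assumes, purely for illustration, eigenvalues of the form exp[−ε(|m| + 2k)] for level k of block m, an oscillator-like pattern chosen to mimic the high-force regime; it does not use the actual eigenvalues of T, and the names `toy_eigenvalue`, `decay_length` and `eps` are hypothetical. An operator diagonal in m (such as n̂ or t̂z) connects the top eigenvector only to other m = 0 states, while t̂± connects it to the neighboring m = ±1 blocks.

```python
import math

def toy_eigenvalue(m, k, eps):
    """Hypothetical spectrum: block label m, level k = 0, 1, 2, ... within the block."""
    return math.exp(-eps * (abs(m) + 2 * k))

def decay_length(b, eps, delta_m):
    """Decay length for an operator that changes m by delta_m (0 or +/-1)."""
    lam_top = toy_eigenvalue(0, 0, eps)  # largest eigenvalue, in the m = 0 block
    if delta_m == 0:
        # Connected correlations drop the k = 0 term, leaving the next m = 0 level.
        lam_next = toy_eigenvalue(0, 1, eps)
    else:
        # Off-block operators reach the top level of the adjacent block.
        lam_next = toy_eigenvalue(delta_m, 0, eps)
    return b / math.log(lam_top / lam_next)

b, eps = 5.0, 0.3  # illustrative values only
xi_par_toy = decay_length(b, eps, 0)
xi_perp_toy = decay_length(b, eps, 1)
print(xi_par_toy, xi_perp_toy, xi_par_toy / xi_perp_toy)
```

In this toy spectrum the block-diagonal operator decays with exactly half the length of the off-block one, mirroring the factor of two between ξ‖ and ξ⊥ at large force.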
Our main result, namely ξ‖ being the range of protein-protein interactions mediated by bending fluctuations along a stretched DNA, follows from the rotational symmetry of the protein-DNA interaction operator A. This result is general for proteins which distort DNA structure as long as the distortions are isotropically distributed around the double helix axis (the same result follows for other values of γ, a′ and μ, Sec. S8 of Supplemental Material [19]). For non-sequence-specific binding this will be the case since two proteins may bind in essentially any relative axial orientation by shifting their binding positions over a 10.5 base-pair range (the h ≈ 3.6 nm helix repeat for dsDNA), a length scale small relative to the inter-protein separations we are considering. Twist fluctuations of inter-protein DNA will further randomize the relative axial orientation of pairs of DNA-bound proteins. Of course, for very large forces where ξ‖ becomes comparable to h, a more detailed model taking into account the helix structure would need to be employed, but in that regime, with ξ‖ comparable in size to a typical DNA-binding protein, one would expect direct interactions between adjacent proteins to dominate (Sec. S7 of Supplemental Material [19]).
One might ask whether the longer ξ⊥ can ever be the protein-protein interaction range. This would require the relative angular orientations of the two proteins around the axial (stretching) direction to be fixed: one might imagine using a sequence-specific DNA-binding protein and binding sites “phased” to be at the same orientation around the double helix (spaced multiples of 10.5 bp) in an attempt to do this. However, since angular (twist) fluctuations between the two sites will increase with interprotein distance, one would expect to see an interaction decaying with the longer ξ⊥ only at short distances, with the shorter ξ‖ dominating at longer distances.
Finally, we note that while the individual protein-generated distortions considered here are isotropically distributed around the double helix axis, the relative orientation of two nearby bends can be expected to be angularly correlated. This arises from the fact that two oppositely-directed bends of the double helix compensate one another’s distortion [8–10]. Therefore one can expect force on a DNA to direct the self-organization of DNA-bending proteins into clusters (via the interaction we have analyzed) with correlated bend directions. If the bends are chiral (the general case since DNA and proteins are both chiral), nearby proteins will be chirally organized, i.e., forming locally helical DNA-protein complexes. An important case is that of nucleosomes, which have well-defined entry/exit angles for the DNA bound to them, with broken chiral symmetry; moderate applied tension can be expected to drive the packing and helical organization of adjacent nucleosomes in chromatin.
Acknowledgments
We acknowledge the support of the NSF through Grant DMR-0715099, and of the NIH through Grant 1U54CA143869-01 (NU-PS-OC).
Contributor Information
Houyin Zhang, Email: houyinzhang2011@u.northwestern.edu.
John F. Marko, Email: john-marko@northwestern.edu.
References
- 1. Stavans J, Oppenheim A. Phys. Biol. 2006;3:R1. doi: 10.1088/1478-3975/3/4/R01.
- 2. Rocha EPC. Annu. Rev. Genet. 2008;42:211–233. doi: 10.1146/annurev.genet.42.110807.091653.
- 3. Rippe K, editor. Genome Organization and Function in the Cell Nucleus. New York: Wiley-VCH; 2011.
- 4. Ali BM, Amit R, Braslavsky I, Oppenheim AB, Gileadi O, Stavans J. Proc. Natl. Acad. Sci. U.S.A. 2001;98:10658. doi: 10.1073/pnas.181029198.
- 5. van Noort J, Verbrugge S, Goosen N, Dekker C, Dame RT. Proc. Natl. Acad. Sci. U.S.A. 2004;101:6969. doi: 10.1073/pnas.0308230101.
- 6. Meglio A, Praly E, Ding F, Allemand JF, Bensimon D, Croquette V. Curr. Opin. Struct. Biol. 2009;19:615. doi: 10.1016/j.sbi.2009.08.005.
- 7. Chaurasiya KR, Paramanathan T, McCauley MJ, Williams MC. Phys. Life Rev. 2010;7:299. doi: 10.1016/j.plrev.2010.06.001.
- 8. Rudnick J, Bruinsma R. Biophys. J. 1999;76:1725. doi: 10.1016/S0006-3495(99)77334-0.
- 9. Koslover EF, Spakowitz AJ. Phys. Rev. Lett. 2009;102:178102. doi: 10.1103/PhysRevLett.102.178102.
- 10. Zhang H, Marko JF. Phys. Rev. E. 2010;82:051906. doi: 10.1103/PhysRevE.82.051906.
- 11. Marko JF, Siggia ED. Macromolecules. 1995;28:8759.
- 12. Yan J, Marko JF. Phys. Rev. E. 2003;68:011905. doi: 10.1103/PhysRevE.68.011905.
- 13. Yan J, Kawamura R, Marko JF. Phys. Rev. E. 2005;71:061905. doi: 10.1103/PhysRevE.71.061905.
- 14. Kramers HA, Wannier GH. Phys. Rev. 1941;60:252.
- 15. Zimm BH, Bragg JK. J. Chem. Phys. 1959;31:526.
- 16. Livadaru L, Netz RR, Kreuzer HJ. Macromolecules. 2003;36:3732.
- 17. Storm C, Nelson PC. Phys. Rev. E. 2003;67:051906. doi: 10.1103/PhysRevE.67.051906.
- 18. Xiao B, Zhang H, Johnson RC, Marko JF. Nucleic Acids Res. 2011;39:5568. doi: 10.1093/nar/gkr141.
- 19.Supplemental Material is available at (URL)