Abstract
The binding of proteins onto DNA contributes to the shaping and packaging of the genome as well as to the expression of specific genetic messages. With a view to understanding the interplay between the presence of proteins and the deformation of DNA involved in such processes, we developed a new method to minimize the elastic energy of DNA fragments at the mesoscale level. Our method makes it possible to obtain the optimal pathways of protein-decorated DNA molecules for which the terminal base pairs are spatially constrained. We focus in this work on the deformations induced by selected architectural proteins on circular DNA. We report the energy landscapes of DNA minicircles subjected to different levels of torsional stress and containing one or two proteins as functions of the chain length and spacing between the proteins. Our results reveal cooperation between the elasticity of the double helix and the structural distortions of DNA induced by bound proteins. We find that the imposed mechanical stress influences the placement of proteins on DNA and that the proteins, in turn, modulate the mechanical stress and thereby broadcast their presence along DNA.
INTRODUCTION
The organization of long genomes in the confined spaces of a cell requires special facilitating mechanisms. A variety of architectural proteins play key roles in these processes. Some of these proteins help to compact the DNA by introducing sharp turns along its pathway while others bring distant parts of the chain molecule into close proximity. The histone-like HU protein from Escherichia coli strain U93 and the structurally related Hbb protein from Borrelia burgdorferi induce some of the largest known deformations of the DNA double-helical structure, including global bends in excess of 160°, appreciable untwisting, and accompanying dislocations of the helical axis.1–5 The degree of DNA deformation reflects both the phasing of the protein-induced structural distortions and the number of nucleotides wrapped on the protein surface. Changes in the spacing between deformation sites on DNA, with one such deformation associated with each half of these two dimeric protein assemblies, contribute to chiral bends, while variation in the electrostatic surface of the proteins perturbs the degree of association with DNA and the extent of duplex bending.
By contrast, recent studies have highlighted the important role played by proteins which bridge and scaffold distant DNA sites and induce the formation of loops. For example, the Escherichia coli H-NS (histone-like nucleoid structuring) protein forms a superhelical scaffold to compact DNA,6 and the Escherichia coli terminus-containing factor MatP can bridge two DNA sites to form a loop.7 The MatP protein is responsible for the compaction of a specific macrodomain (Ter) in Escherichia coli and is believed to be able to bind sites located either on separate chromosomes or within one chromosome.8 The detailed influence of MatP on chromosome segregation and cellular division is yet to be unraveled,8 but it is expected that the presence of MatP proteins could significantly alter the organization and the dynamics of the whole Ter macrodomain.7
The composite kinking and wrapping of DNA on the surface of HU resembles, albeit on a much smaller scale, the packaging of DNA around the histone octamer of the nucleosome core particle,9 while MatP might serve as a bacterial analogue of the insulator-binding proteins that divide chromatin into independent functional domains.10 Deciphering how these two very different bacterial proteins influence the structure and properties of DNA provides a first step in understanding how architectural proteins contribute to the spatial organization of genomes.
With this goal in mind, we have developed a new method for the optimization of DNA structure at the base-pair level. Our method takes account of the sequence-dependent elasticity of DNA and can be applied to chain fragments in which the first and last base pairs are spatially constrained. Moreover, our approach allows for constraints in intervening parts of the DNA and thus makes it possible to model the presence of bound proteins on DNA. We can accordingly compute the energy landscape for a wide variety of protein–DNA systems and herein illustrate the effects of HU and Hbb on the configurations of circular DNA and provide examples of the bridging of distant sites on circular DNA by MatP.
The advantage of our method resides in the direct control of the positions and orientations of the base pairs at the ends of a DNA chain and in the capability of accounting for the presence of bound proteins. In addition, the different constraints due to the boundary conditions and the presence of proteins are directly integrated in the minimization process, which makes it possible to rely on unconstrained numerical optimization methods. Others have derived methods to optimize the energy of elastic deformation for DNA fragments. For example, Coleman et al.11 developed a method similar to ours, but their approach requires explicit specification of the forces and moments acting on the terminal base pairs (i.e., there is no direct control on the positions and orientations of the terminal base pairs). Zhang and Crothers also presented an optimization method,12 albeit limited to circular DNA, which only considers angular deformations within the double helix and takes the boundary conditions into account through Lagrange multipliers.
MATERIALS AND METHODS
We present in this section our new method for minimizing the elastic energy of a collection of DNA base pairs. The method accounts for the sequence-dependent elasticity of DNA and can be applied to a DNA fragment in which the first and last base pairs are spatially constrained. Moreover, our approach makes it possible to constrain parts of the DNA in order to model the presence of bound proteins. We describe the method in symbolic fashion for readability and present full details of the calculations and optimization procedure in the Supporting Information.
Geometry of a Collection of Base Pairs.
We consider a collection of N rigid base pairs, and for the ith base pair, we denote by $\underline{x}^i$ its origin and by $\mathbf{d}^i$ the matrix containing the axes of the base-pair frame organized as column vectors. The segmented curve defined by the base-pair origins is referred to as the collection centerline and can be interpreted as a discrete double-helical axis. Here and in the rest of this paper, vector symbols are underlined and matrices are represented with bold symbols. Superscripted italic letters (i, j, …) are used to index base pairs and base-pair steps.
Base-Pair Step Parameters.
The geometry of a collection of base pairs is traditionally described in terms of the base-pair step parameters (or step parameters for short).13 These parameters describe the relative arrangement of successive base pairs and are denoted for the ith step, following Coleman et al.,11 as $\underline{p}^i = (\theta_1^i, \theta_2^i, \theta_3^i, \rho_1^i, \rho_2^i, \rho_3^i)$, with components referred to, respectively, as tilt, roll, twist, shift, slide, and rise. The first three parameters correspond to three angles describing the relative orientation of base-pair frames $\mathbf{d}^i$ and $\mathbf{d}^{i+1}$, and the last three parameters are the components of the step-joining vector expressed in a particular frame referred to as the step frame (see Section 2 in the Supporting Information for more details and the papers of Lu and Olson,14 El Hassan and Calladine,15 and Coleman et al.11 for pictorial definitions and explicit equations). The set of step parameters for the base-pair collection is a vector of dimension 6(N − 1) and is denoted $\underline{P}$.
We shall see later that the energy of elastic deformation for a base-pair step can be written as a quadratic form with respect to the step parameters. The step parameters, however, are not a convenient representation to express constraints on the positions and orientations of base pairs within the collection. This is due to the fact that the components of the step-joining vector ri depend on the relative orientation of the two base pairs forming the step. In order to circumvent this issue, we introduce a different representation of the geometry of a base-pair collection.
Base-Pair Step Degrees of Freedom.
We now define a new set of variables for each step of the base-pair collection. These variables are denoted $\underline{\phi}^i = (\underline{\psi}^i, \underline{r}^i)$ for the ith step and referred to as the step degrees of freedom or step dofs for short. Although the variables $\underline{\psi}^i$ are identical to the angular step parameters $\underline{\theta}^i = (\theta_1^i, \theta_2^i, \theta_3^i)$, we use a different symbol for clarity. The variables $\underline{r}^i$ are the components of the step-joining vector expressed with respect to the global reference frame (a convenient choice for the global reference frame is the first base-pair frame). The set of step dofs for the base-pair collection is a vector of dimension 6(N − 1) and is denoted $\underline{\Phi}$.
The main advantage of this choice of variables is to separate the representation of the centerline of the base-pair collection from the orientation of the base-pair frames. This is somewhat similar to the centerline/spin representation of Langer and Singer16 for continuous elastic rods. This representation is particularly convenient to deal with the end conditions applied to a collection of base pairs.
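The separation between centerline and frame orientation can be illustrated with a minimal sketch. It assumes only what is stated above, namely that the joining vectors are expressed in the global frame, so the base-pair origins follow from a running sum. The function names and the numerical joining vectors are hypothetical and chosen purely for illustration.

```python
# Sketch: rebuild the centerline from global-frame joining vectors.
# Function names and numbers are illustrative, not taken from the paper.

def centerline(origin, joining_vectors):
    """Return the base-pair origins given the first origin and the joining vectors."""
    points = [tuple(origin)]
    for r in joining_vectors:
        last = points[-1]
        points.append(tuple(a + b for a, b in zip(last, r)))
    return points


def end_to_end(joining_vectors):
    """End-to-end vector: the plain sum of the global-frame joining vectors."""
    return tuple(sum(components) for components in zip(*joining_vectors))


# Three hypothetical steps (in angstroms), four base pairs in total.
r_vectors = [(0.0, 0.0, 3.4), (0.5, 0.0, 3.3), (0.5, 0.2, 3.3)]
pts = centerline((0.0, 0.0, 0.0), r_vectors)
e2e = end_to_end(r_vectors)
print(pts[-1], e2e)
```

Because no rotation enters the sum, a constraint on the end-to-end vector is linear in these variables, which is what makes them convenient for end conditions.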
Energy of Elastic Deformation.
The energy of elastic deformation for a DNA base-pair collection is defined as the sum of the energies of elastic deformation of its steps. For the ith step, this energy is given by the following quadratic form:
$$E^i = \frac{1}{2}\,(\underline{p}^i - \underline{p}_0^i)^{\mathsf{T}}\,\mathbf{F}^i\,(\underline{p}^i - \underline{p}_0^i) \tag{1}$$
where $\underline{p}_0^i$ contains the intrinsic step parameters (i.e., the zero-energy configuration of the step, also called the rest state), and $\mathbf{F}^i$ is a 6 × 6 matrix containing the elastic moduli associated with the different modes of deformation. Both the intrinsic step parameters and the force constant matrix can differ for each step depending on the DNA sequence. The elastic energy for the base-pair collection is then given by
$$E = \sum_{i=1}^{N-1} E^i \tag{2}$$
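As a minimal sketch of the quadratic step energy and its sum over steps, the following evaluates both for a toy collection. The force constants used here are hypothetical placeholders, not the values of the force field used in this work, and the function names are illustrative.

```python
# Sketch of the quadratic step energy and its sum over steps (energies in kBT).
# The force constants below are hypothetical placeholders.

def step_energy(p, p0, F):
    """0.5 * (p - p0)^T F (p - p0) for one step with six step parameters."""
    d = [a - b for a, b in zip(p, p0)]
    return 0.5 * sum(d[m] * F[m][n] * d[n] for m in range(6) for n in range(6))


def collection_energy(P, P0, Fs):
    """Sum of the step energies over all steps of the collection."""
    return sum(step_energy(p, p0, F) for p, p0, F in zip(P, P0, Fs))


k = [0.04, 0.04, 0.06, 20.0, 20.0, 20.0]   # hypothetical diagonal entries
F = [[k[m] if m == n else 0.0 for n in range(6)] for m in range(6)]
p0 = [0.0, 0.0, 34.2857, 0.0, 0.0, 3.4]      # straight B-DNA-like rest state
p_bent = [0.0, 5.0, 34.2857, 0.0, 0.0, 3.4]  # same step with 5 degrees of roll

E_rest = step_energy(p0, p0, F)
E_bent = step_energy(p_bent, p0, F)          # 0.5 * 0.04 * 5**2
E_total = collection_energy([p_bent, p0], [p0, p0], [F, F])
print(E_rest, E_bent, E_total)
```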
The purpose of our method is to minimize the elastic energy of a base-pair collection given by eq 2 under the constraints detailed below. To do so with a gradient-based optimizer, we need to calculate the derivatives of the elastic energy. The difficulty stems from the fact that this gradient has to be calculated with respect to a set of independent variables. This set of independent variables depends on the end conditions applied to the collection of base pairs and also on the presence of bound proteins. As mentioned above, the step parameters are not convenient for dealing with the end conditions. We therefore calculate the gradient with respect to the step dofs. We first consider the trivial case of an unconstrained collection (i.e., a collection with no imposed end conditions and no bound proteins), and we then show how to account for specific end conditions and the presence of bound proteins. These constraints are explicitly solved (see Supporting Information for the details of the calculations), which makes it possible to reduce the number of step dofs used as independent variables in the minimization process.
The minimization of the elastic energy given by eq 2 can be carried out using any traditional gradient-based method (our results are obtained using an L-BFGS algorithm, but other methods would lead to similar results).
Note that our method does not require the elastic energy of a step to be a quadratic form with respect to the step parameters.17 The expression for the elastic energy of a step can also include contributions arising from the coupling between neighboring base-pair steps.
Elastic Energy Gradient for a Free Base-Pair Collection.
The case of a base-pair collection free of any end conditions and without bound proteins is trivial in the sense that the optimization leads to the reference configuration. We will use this gradient for a free collection as the starting point of our method.
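The gradient of the elastic energy for a single step is obtained directly from eq 1:
$$\frac{\partial E^i}{\partial \underline{p}^i} = \mathbf{F}_s^i\,(\underline{p}^i - \underline{p}_0^i) \tag{3}$$
where $\underline{p}_0^i$ denotes the intrinsic step parameters and we introduce the symmetrized force constant matrix defined as $\mathbf{F}_s^i = \frac{1}{2}\left(\mathbf{F}^i + (\mathbf{F}^i)^{\mathsf{T}}\right)$. It follows that the variation of the elastic energy for the complete collection is given by
$$\mathrm{d}E = \sum_{i=1}^{N-1} \left(\frac{\partial E^i}{\partial \underline{p}^i}\right)^{\mathsf{T}} \mathrm{d}\underline{p}^i = \left(\frac{\partial E}{\partial \underline{P}}\right)^{\mathsf{T}} \mathrm{d}\underline{P} \tag{4}$$
In order to obtain the gradient with respect to the step dofs, we introduce the Jacobian matrix $\mathbf{J}_\Phi$, defined as
$$\mathbf{J}_\Phi = \frac{\partial \underline{P}}{\partial \underline{\Phi}}, \qquad \text{so that} \qquad \mathrm{d}\underline{P} = \mathbf{J}_\Phi\,\mathrm{d}\underline{\Phi} \tag{5}$$
It follows that
$$\mathrm{d}E = \left(\frac{\partial E}{\partial \underline{P}}\right)^{\mathsf{T}} \mathbf{J}_\Phi\,\mathrm{d}\underline{\Phi} \tag{6}$$
which leads to the following expression for the free-collection gradient with respect to the step dofs:
$$\frac{\partial E}{\partial \underline{\Phi}} = \mathbf{J}_\Phi^{\mathsf{T}}\,\frac{\partial E}{\partial \underline{P}} \tag{7}$$
The details of the calculation of the matrix $\mathbf{J}_\Phi$ are given in Section 3.2 of the Supporting Information.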
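The chain rule behind the free-collection gradient can be checked numerically on a deliberately simplified model. The sketch below is a planar, single-step toy and not the three-dimensional Jacobian of the Supporting Information: it assumes one rotation angle and a two-component joining vector, with the translational parameters taken as the joining vector expressed in a frame rotated by half the step angle. All names and numbers are hypothetical.

```python
import math

# Planar toy model (an assumption for illustration, not the paper's 3D Jacobian):
# dofs phi = (psi, rx, ry); parameters p = (theta, rho_x, rho_y), where rho is
# the joining vector r expressed in a frame rotated by psi / 2.

K = [100.0, 50.0, 50.0]    # hypothetical force constants
P0 = [0.6, 0.0, 3.4]       # hypothetical rest state


def params_from_dofs(phi):
    psi, rx, ry = phi
    c, s = math.cos(psi / 2), math.sin(psi / 2)
    return [psi, c * rx + s * ry, -s * rx + c * ry]


def jacobian(phi):
    """Analytic dP/dPhi for the toy model (rows: parameters, columns: dofs)."""
    psi = phi[0]
    c, s = math.cos(psi / 2), math.sin(psi / 2)
    _, rho_x, rho_y = params_from_dofs(phi)
    return [[1.0, 0.0, 0.0],
            [0.5 * rho_y, c, s],
            [-0.5 * rho_x, -s, c]]


def energy(phi):
    p = params_from_dofs(phi)
    return 0.5 * sum(k * (a - b) ** 2 for k, a, b in zip(K, p, P0))


def gradient_dofs(phi):
    """Gradient with respect to the dofs: J^T applied to the parameter gradient."""
    p = params_from_dofs(phi)
    g_p = [k * (a - b) for k, a, b in zip(K, p, P0)]
    J = jacobian(phi)
    return [sum(J[m][j] * g_p[m] for m in range(3)) for j in range(3)]


def finite_difference(phi, h=1e-6):
    grad = []
    for j in range(3):
        up, dn = list(phi), list(phi)
        up[j] += h
        dn[j] -= h
        grad.append((energy(up) - energy(dn)) / (2 * h))
    return grad


phi = [0.7, 0.4, 3.1]
analytic = gradient_dofs(phi)
numeric = finite_difference(phi)
print(analytic, numeric)
```

The first column of the toy Jacobian shows the coupling mentioned earlier: a change in the angular dof alters the translational parameters even when the global joining vector is held fixed.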
End Conditions.
We now consider the case of a base-pair collection with end conditions; i.e., both the end-to-end vector and the orientation between the first and last base pairs are fixed. Our results can easily be generalized to other types of end conditions.
The end-to-end vector for the collection of base pairs is given by
$$\underline{r} = \sum_{i=1}^{N-1} \underline{r}^i \tag{8}$$
If the end-to-end vector is imposed, it follows that
$$\sum_{i=1}^{N-1} \mathrm{d}\underline{r}^i = \underline{0} \tag{9}$$
The end-to-end rotation corresponds to the orientation between the first and last base pairs and is given by
$$\mathbf{D}^{(1,N)} = \prod_{i=1}^{N-1} \mathbf{D}^i = \mathbf{D}^1\,\mathbf{D}^2 \cdots \mathbf{D}^{N-1} \tag{10}$$
where $\mathbf{D}^{(i,j)}$ denotes the rotation matrix between the ith and jth base pairs, and $\mathbf{D}^i$ is the rotation matrix for the ith step, which is completely parametrized by the angular step dofs (see Sections 2.1 and 3.1 of the Supporting Information and the work of Coleman et al.11 for further details). If the end-to-end rotation is imposed, we have the following constraint:
$$\mathrm{d}\mathbf{D}^{(1,N)} = \mathbf{0} \tag{11}$$
We show in Section 4 of the Supporting Information that these two constraints imply that the step dofs are not independent variables. In other words, the end conditions reduce the number of independent step dofs. Note that we choose to express the last step dofs as the nonindependent variables, but any other step could have been chosen. We now denote as $\underline{\tilde{\Phi}}$ the set of independent step dofs, and we have the relation $\underline{\Phi} = \underline{\Phi}(\underline{\tilde{\Phi}})$. We then define the Jacobian matrix $\mathbf{J}_{\tilde{\Phi}}$ such that
$$\mathbf{J}_{\tilde{\Phi}} = \frac{\partial \underline{\Phi}}{\partial \underline{\tilde{\Phi}}} \tag{12}$$
The dimensions of this matrix depend on the precise details of the end conditions, although the number of rows is always 6(N − 1); the details of its expression are given in Section 4 of the Supporting Information.
It follows that the gradient of the elastic energy for a collection of base pairs subjected to imposed end-to-end conditions is
$$\frac{\partial E}{\partial \underline{\tilde{\Phi}}} = \mathbf{J}_{\tilde{\Phi}}^{\mathsf{T}}\,\mathbf{J}_\Phi^{\mathsf{T}}\,\frac{\partial E}{\partial \underline{P}} \tag{13}$$
Note that this gradient corresponds to the gradient with respect to the independent step dofs and accounts for the end conditions. This is one advantage of our method: the end conditions are directly accounted for. Hence, we transform a constrained optimization problem into an unconstrained one, which makes the numerical implementation simpler and more robust.
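The elimination of dependent variables can be sketched for the translational part alone. The example below assumes a toy energy that is quadratic directly in the global joining vectors, which is a simplification of the actual step-parameter energy, and it uses plain gradient descent as a stand-in for L-BFGS. The last joining vector is treated as the dependent variable, so the reduced gradient for each independent vector is its own gradient minus that of the last vector. All names and values are hypothetical.

```python
# Sketch: solve an imposed end-to-end vector explicitly by making the last
# joining vector dependent. Toy energy and plain gradient descent are
# assumptions made for illustration only.

def full_gradient(rs, r0, k):
    """Gradient of 0.5 * k * sum |r_i - r0|^2 with respect to every r_i."""
    return [[k * (a - b) for a, b in zip(r, r0)] for r in rs]


def reduced_gradient(g):
    """Gradient with respect to the independent vectors: g_i - g_last."""
    last = g[-1]
    return [[a - b for a, b in zip(gi, last)] for gi in g[:-1]]


def close_chain(independent, R):
    """Append the dependent last vector so that all vectors sum to R."""
    total = [sum(c) for c in zip(*independent)]
    return independent + [[a - b for a, b in zip(R, total)]]


R = [3.0, 0.0, 9.0]      # imposed end-to-end vector (hypothetical)
r0 = [0.0, 0.0, 3.4]     # rest joining vector (hypothetical)
k = 20.0                 # stiffness (hypothetical)

free = [[0.0, 0.0, 3.4], [0.0, 0.0, 3.4]]   # two independent vectors, three steps
for _ in range(200):
    rs = close_chain(free, R)
    g = reduced_gradient(full_gradient(rs, r0, k))
    free = [[a - 0.01 * b for a, b in zip(r, gi)] for r, gi in zip(free, g)]

rs = close_chain(free, R)
print(rs)
```

The constraint holds at every iteration by construction, so the optimizer never needs Lagrange multipliers or a projection step; in this symmetric toy case the three vectors converge to R/3.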
Bound-Protein Constraint.
In order to study protein-decorated DNA, we need to account for the presence of bound proteins on the double helix. We model the binding of proteins on the DNA by considering the step parameters of the binding domain as frozen; i.e., these step parameters are imposed and cannot change. For example, the frozen step parameters can be extracted from high-resolution crystal structures of protein–DNA complexes with the help of the 3DNA software.14 The choice of frozen steps can, of course, be varied in different rounds of minimization to allow for deformations in the protein–DNA complex.
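We first consider that the kth step in the base-pair collection is frozen; i.e., the step parameters $\underline{p}^k$ are imposed (the following results can be easily generalized to an arbitrary number of frozen steps). It follows directly that the angular step dofs are constant, and we therefore have $\mathrm{d}\underline{\psi}^k = \underline{0}$. The translational step dofs, however, are not constant: any changes in the angular step dofs $\underline{\psi}^j$ with j < k will change the orientation of the joining vector of the frozen step. Indeed, we show in Section 5 of the Supporting Information how to relate the variations in the joining vector to the variations in the angular step dofs of the preceding steps.
Similar to the set of independent step dofs introduced for the treatment of the end conditions, denoted $\underline{\tilde{\Phi}}$, we introduce a new set of independent step dofs $\underline{\hat{\Phi}}$, which corresponds to the set of nonfrozen step dofs among the set $\underline{\tilde{\Phi}}$; i.e., we have $\underline{\tilde{\Phi}} = \underline{\tilde{\Phi}}(\underline{\hat{\Phi}})$. We define the matrix $\mathbf{K}$ as the Jacobian (see Section 5 of the Supporting Information for details):
$$\mathbf{K} = \frac{\partial \underline{\tilde{\Phi}}}{\partial \underline{\hat{\Phi}}} \tag{14}$$
We obtain for the gradient
$$\frac{\partial E}{\partial \underline{\hat{\Phi}}} = \mathbf{K}^{\mathsf{T}}\,\mathbf{J}_{\tilde{\Phi}}^{\mathsf{T}}\,\hat{\mathbf{J}}_\Phi^{\mathsf{T}}\,\frac{\partial E}{\partial \underline{\hat{P}}} \tag{15}$$
Here, $\mathbf{J}_{\tilde{\Phi}}$ is the end-condition Jacobian of eq 12, $\underline{\hat{P}}$ denotes the set of nonfrozen step parameters, and $\hat{\mathbf{J}}_\Phi$ is the Jacobian matrix defined as
$$\hat{\mathbf{J}}_\Phi = \frac{\partial \underline{\hat{P}}}{\partial \underline{\Phi}} \tag{16}$$
This matrix is directly obtained from $\mathbf{J}_\Phi$ by removing the rows associated with the frozen step parameters.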
The gradient obtained with eq 15 corresponds to the gradient of the elastic energy of the base-pair collection (eq 2) with respect to the set of nonfrozen independent step dofs.
Force Field.
The expression for the elastic energy of a base-pair step (eq 1) depends on two sequence-dependent quantities: the intrinsic step parameters and the force constant matrix Fi. These quantities constitute the force field of our theory. The intrinsic step parameters describe the rest states of the base-pair steps, while the force constants are the stiffnesses of the steps.
In this work, we use a uniform ideal force field, i.e., not depending on the details of the DNA sequence. The intrinsic step parameters for this force field are given by p0 = (0°, 0°, 34.2857°, 0 Å, 0 Å, 3.4 Å), which correspond to a B-DNA-like straight rest state and a helical repeat of 10.5 bp. We do not consider coupling between the different modes of deformation, and hence, the force constants matrix is diagonal and reads
$$\mathbf{F} = \mathrm{diag}\left(F_{11}, F_{22}, F_{33}, F_{44}, F_{55}, F_{66}\right) \tag{17}$$
The units for the first three diagonal entries are kBT deg−2 and kBT Å−2 for the last three entries. This force field can be considered quasi-inextensible due to the high force constants associated with the translational step parameters. In the limit of an infinitely long base-pair collection, we can calculate the persistence lengths corresponding to the force field, and we find that the bending and twisting persistence lengths are 47.7 and 66.6 nm, respectively.
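The quoted persistence lengths can be related back to angular force constants with a short calculation. The sketch below assumes the usual harmonic-chain relation in which a persistence length equals the angular stiffness per step (in kBT rad−2) multiplied by the rise of 3.4 Å; this relation is our assumption for illustration, and the resulting numbers are back-computed estimates rather than values quoted from the force field. It also checks that the intrinsic twist corresponds to 360° divided by the helical repeat.

```python
import math

RISE_ANGSTROM = 3.4                          # intrinsic rise of the ideal force field
DEG2_PER_RAD2 = (180.0 / math.pi) ** 2       # converts kBT/rad^2 to kBT/deg^2


def force_constant_per_deg2(persistence_length_nm, rise=RISE_ANGSTROM):
    """Angular force constant (kBT/deg^2) implied by a persistence length.

    Assumes persistence length = stiffness (kBT/rad^2) * rise, i.e. a
    harmonic chain with isotropic bending; this is an illustrative assumption.
    """
    stiffness_rad = persistence_length_nm * 10.0 / rise   # nm -> angstrom
    return stiffness_rad / DEG2_PER_RAD2


k_bend = force_constant_per_deg2(47.7)    # roughly 0.043 kBT/deg^2
k_twist = force_constant_per_deg2(66.6)   # roughly 0.060 kBT/deg^2
rest_twist = 360.0 / 10.5                 # degrees per step for a 10.5 bp repeat
print(k_bend, k_twist, rest_twist)
```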
Protein-Binding Procedure.
As explained earlier, we model the presence of proteins on DNA by setting the base-pair step parameters of the binding domain to imposed values. These imposed values can be extracted from high-resolution crystal structures of protein–DNA complexes or can be chosen to represent an ideal protein–DNA binding system.
In this study, we use an iterative procedure to bind a protein onto DNA. For example, we consider a protein with a binding domain of np steps to be bound starting at the kth step; i.e., steps k to k + np − 1 are bound (and, hence, frozen). The iterative procedure consists of adjusting the step parameters of the binding domain using the following linear ramp function:
$$\underline{p}^i(\lambda) = (1 - \lambda)\,\underline{p}^i_{\text{free}} + \lambda\,\underline{p}^i_{\text{bound}}, \qquad i = k, \ldots, k + n_p - 1 \tag{18}$$
where the term $\underline{p}^i_{\text{free}}$ denotes the step parameters of the protein-free DNA (in our case, they will correspond to the step parameters of a portion of a naked-DNA minicircle) and the terms $\underline{p}^i_{\text{bound}}$ are the step parameters of the DNA found in the protein–DNA complex. The parameter λ ∈ [0, 1] is the ramp parameter. Our procedure starts with λ = 0, and we minimize the DNA base-pair collection while gradually increasing the value of the ramp parameter until λ = 1. The outcome is an optimized DNA base-pair collection in which a part of DNA is shaped as if the protein were bound to it. This numerical procedure is not meant to convey the physical process of protein binding to DNA. It is designed to set regions of DNA to the states found in protein–DNA complexes. The binding ramp can also be tuned in order to improve the robustness of the method. For example, the linear terms can be replaced by quadratic expressions. Such modifications might be needed to avoid instabilities or to control changes in the total twist while ramping the step parameters in the case of binding domains with large deformations.
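The linear binding ramp can be sketched as a simple interpolation between two sets of step parameters. The function names, the number of ramp stages, and the step-parameter values below are hypothetical; in a full implementation each stage would be followed by a re-minimization of the chain with the bound steps frozen at the interpolated values.

```python
# Sketch of the linear binding ramp for one step of the binding domain.
# Names, stage count, and parameter values are illustrative assumptions.

def ramp(p_free, p_bound, lam):
    """Interpolate step parameters between the naked (lam=0) and bound (lam=1) states."""
    if not 0.0 <= lam <= 1.0:
        raise ValueError("ramp parameter must lie in [0, 1]")
    return [(1.0 - lam) * a + lam * b for a, b in zip(p_free, p_bound)]


def ramp_schedule(n_stages):
    """Evenly spaced ramp values from 0 to 1 inclusive."""
    return [s / n_stages for s in range(n_stages + 1)]


p_free = [0.0, 3.6, 34.3, 0.0, 0.0, 3.4]      # hypothetical naked-minicircle step
p_bound = [2.0, 40.0, 25.0, 0.5, -0.8, 4.5]   # hypothetical protein-bound step

targets = []
for lam in ramp_schedule(4):
    target = ramp(p_free, p_bound, lam)
    targets.append(target)
    # A full implementation would minimize the chain energy here,
    # with the steps of the binding domain frozen at `target`.

print(targets[2])   # halfway between the two states
```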
Minicircle Topology.
The topology of a DNA minicircle is characterized by its linking number Lk,18 writhing number Wr,19,20 and total twist Tw.20 The linking number of a minicircle, an integer, describes the entanglement of the minicircle centerline and the curve traced out by one of the edges of the base-pair frames (see our earlier work21 for detailed explanations and computational methods). In particular, for a planar minicircle the linking number corresponds to the number of turns the double helix makes. For a minicircle of N bp, the relaxed linking number Lk0 is given by the integer nearest to N/10.5, where 10.5 is the assumed helical repeat of DNA. We introduce the difference between the actual and relaxed linking numbers of a minicircle as ΔLk = Lk − Lk0. Because a DNA minicircle is covalently closed, its linking number, and hence ΔLk, is constant and not altered by the deformation of the double helix.
The linking number of a minicircle is always equal to the sum of its writhing number and total twist, Lk = Tw + Wr. The writhing number characterizes the global folding and chiral distortions of the minicircle centerline, while the total twist measures the twisting or twist density of the base pairs around the centerline. Like the linking number, the writhing number and total twist can be directly obtained from the minicircle base-pair frames as explained previously.21 The invariance of the linking number implies that when the minicircle is deformed, the total twist and the writhing number are redistributed. The total twist is directly related to the torsional stress within the double helix, while the writhing number depends on the curvature of the centerline, albeit in a nontrivial way.
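The topological bookkeeping described above reduces to a few lines of arithmetic. The sketch below computes the relaxed linking number as the integer nearest to N/10.5 and recovers the writhing number from Lk = Tw + Wr. The function names are hypothetical, and the total twist passed to `writhe` is an illustrative value, not a computed one.

```python
HELICAL_REPEAT = 10.5   # assumed helical repeat, bp per turn


def relaxed_linking_number(n_bp):
    """Integer nearest to N / 10.5."""
    return round(n_bp / HELICAL_REPEAT)


def writhe(linking_number, total_twist):
    """Writhing number from the conservation relation Lk = Tw + Wr."""
    return linking_number - total_twist


lk0 = {n_bp: relaxed_linking_number(n_bp) for n_bp in (63, 94, 100, 105)}
print(lk0)

# Underwound 100 bp minicircle: Lk = Lk0 - 1. With a hypothetical total
# twist of 9.3 turns, the remaining 0.3 turn must appear as negative writhe.
lk = lk0[100] - 1
wr = writhe(lk, 9.3)
print(lk, wr)
```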
Protein-Free DNA Minicircles.
In order to study protein-decorated DNA minicircles, we first need to generate protein-free minicircles. For a minicircle of N bp, we first build a set of step parameters using the following formula for the ith step:
$$\underline{p}^i = \left(\Delta \sin(i\,\omega),\; \Delta \cos(i\,\omega),\; \omega,\; 0,\; 0,\; \rho_{3,0}\right) \tag{19}$$
where Δ = 2π/N is the bending angle between the planes of successive base pairs and $\rho_{3,0}$ denotes the intrinsic value of the rise step parameter within our force field. The parameter ω corresponds to the local twist density in the minicircle and can be adjusted to generate over- or undertwisted configurations. In order to enforce the covalent closure of the minicircle, we add at the end of the base-pair collection an additional base pair identical to the first one. We then minimize the energy of the base-pair collection under the imposed end-to-end vector and rotation (which are both null in the present case). The outcome of the minimization is an optimized DNA minicircle with an imposed ΔLk.
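A starting configuration of this kind can be sketched as follows. The construction assumes a common recipe in which a bend of constant magnitude per step is split into tilt and roll according to the accumulated twist, with a uniform twist of 360°·Lk/N per step; the phase convention, the function name, and the choice of degrees are our assumptions for illustration and may differ from the formula actually used. The result is only an initial guess that the subsequent minimization would refine.

```python
import math


def circle_step_parameters(n_bp, linking_number, rise=3.4):
    """Initial step parameters (tilt, roll, twist, shift, slide, rise) for a closed circle.

    One step per base pair: the closing base pair duplicates the first one.
    Angles are in degrees, distances in angstroms. Illustrative assumption only.
    """
    bend = 360.0 / n_bp                       # bending angle per step
    twist = 360.0 * linking_number / n_bp     # uniform twist per step
    steps = []
    for i in range(1, n_bp + 1):
        phase = math.radians(i * twist)       # accumulated twist sets the bend direction
        steps.append((bend * math.sin(phase), bend * math.cos(phase),
                      twist, 0.0, 0.0, rise))
    return steps


steps = circle_step_parameters(100, 9)        # underwound 100 bp circle (Lk0 = 10)
total_twist_turns = sum(s[2] for s in steps) / 360.0
print(len(steps), total_twist_turns)
```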
We initially focus on planar, protein-free minicircles of lengths ranging from 63 to 105 bp, i.e., from 6 to 10 helical repeats. Planar minicircles always have a writhing number equal to zero, which means that the total twist is equal to the linking number. Because we require the naked minicircles to be planar, there is an upper bound on the value of |Tw| (and, hence, of |ΔLk|). This limiting value is related to the Michell–Zajac instability,22,23 and for larger values of |Tw|, planar circular configurations are no longer stable. With our ideal force field, for a given chain length, we can always generate optimized minicircles with ΔLk = 0 and, depending on the chain length, ΔLk = −1 or ΔLk = +1 (for a few specific chain lengths, all three values are possible; see Section 6 of the Supporting Information for more details). In other words, we always compute optimized protein-free minicircles with ΔLk = 0, and in addition, we also generate optimized minicircles with ΔLk= −1 and/or ΔLk= +1. These different minicircles are then used with the binding ramp (eq 18) to model the presence of bound proteins.
RESULTS AND DISCUSSION
We first apply our minimization method to the optimization of DNA minicircles on which one or two proteins are bound. We focus on two distinct proteins: the abundant histone-like HU protein and the structurally related Hbb protein. These two proteins have a common fold and share some structural similarities, such as the fact that they both associate as dimers and both introduce significant localized bends in the double helix. The Hbb dimer, however, binds a longer stretch of DNA and induces a sharper turn of the double helix than the HU dimer. We used the 3DNA software14 to extract the step parameters of the DNA found in known crystal complexes1,5 (Protein Data Bank files 1P71 and 2NP2 for HU and Hbb, respectively). The length of the extracted binding domain of HU is 17 bp, and that of Hbb is 35 bp. The net bending angle introduced by the HU dimer, roughly 135°, is also less than that induced by Hbb (about 160°), where the measured values correspond to the angles between the normals of the first and last base pairs of the two protein–DNA complexes. Both proteins also slightly underwind the double helix. That is, the total twist across the DNA bound to each protein is less than the total twist of an undeformed B-DNA fragment of the same length (for HU ΔTw ≃ −0.18 turn and for Hbb ΔTw ≃ −0.1 turn).
Minicircles with a Single Bound Protein.
We first performed a series of optimizations to study how the addition of an HU or Hbb dimer affects the configurations of a DNA minicircle. We considered minicircles of 63–105 bp in both relaxed (ΔLk = 0) and supercoiled (ΔLk = ±1) states. The known uniformity of closed protein-bound DNA structures generated through earlier Monte Carlo treatments of chains of comparable length,24 combined with the success of others in mirroring the observed and simulated cyclization properties of even longer (150–160 bp) DNA chains through the minimization of DNA elastic energy,25 lends support to our approach, allowing us to omit consideration of thermal fluctuations and focus instead on the lowest-energy states.
In order to characterize the optimized configurations, we define a relative step energy ε as
$$\varepsilon = \frac{E / N_{\text{free}}}{E_0 / N} \tag{20}$$
where E is the energy of the optimized minicircle with the bound dimer, E0 is the energy of the optimized protein-free minicircle, N denotes the minicircle chain length, and Nfree the protein-free chain length. In other words, ε is the ratio of the energy per base-pair step for the minicircle bearing a single protein versus that for the naked minicircle. Recall that our elastic energy for a minicircle accounts for protein-free steps only and ignores the contributions of protein–DNA binding.
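The relative step energy can be written as a one-line function. The sketch below takes the verbal definition above at face value, i.e., the energy per protein-free step of the decorated minicircle divided by the energy per step of the naked minicircle. The function name and the energies in the example are hypothetical and serve only to show how values below and equal to 1 arise.

```python
def relative_step_energy(e_bound, e_naked, n_total, n_free):
    """Ratio of energy per protein-free step (decorated) to energy per step (naked)."""
    return (e_bound / n_free) / (e_naked / n_total)


# Hypothetical energies (kBT) for a 100 bp minicircle with 83 protein-free steps.
eps_lower = relative_step_energy(30.0, 40.0, 100, 83)
# Same energy per step in both minicircles gives a ratio of exactly one.
eps_equal = relative_step_energy(33.2, 40.0, 100, 83)
print(eps_lower, eps_equal)
```

A value below 1 means that the remaining protein-free DNA is, on average, less strained per step than in the naked minicircle, even though fewer steps share the deformation.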
The results for both proteins are presented in Figure 1. The relaxed minicircles include configurations from two competing families of structures, which exhibit similar ~21 bp periodic dependencies on chain length.26–29 The oscillations of energy with chain length differ in phase by about a helical turn such that the valleys in the energy profile of one family of closed configurations coincide with the peaks in the other and vice versa. The breaks in the plotted data correspond to jumps between the configurational families. The reported supercoiled states are natural extensions of these families that arise as the DNA ends fall out of register. The underwound states build up with an increase in chain length and the overwound states with a decrease.
The presence of an HU dimer almost always reduces the energy of the protein-free DNA compared to that of the naked minicircle (as shown by the values of the relative step energy ε less than 1 in the upper panel in Figure 1). On the other hand, the addition of an Hbb protein has a mixed effect on the relative step energy depending on the chain length and the value of ΔLk (lower panel in Figure 1). A possible explanation lies in the fact that the binding domain of the Hbb dimer is twice as long as that of the HU dimer. Thus, the number of protein-free DNA steps on an Hbb-bound minicircle is lower than that on a minicircle of the same size bearing an HU protein. In other words, the deformation required to satisfy the boundary conditions is more localized in minicircles with an Hbb dimer, and hence, the energy is higher. This argument also accounts for the differences in the chain-length dependence of the relative step energy for minicircles containing an HU dimer versus those with an Hbb dimer. As noted above, the chain-length dependence of the relative step energy follows a damped oscillatory pattern. The periodicity in the Hbb pattern is not as clear as that for HU, suggesting that, within the present range of chain lengths, a minicircle with a single Hbb dimer is more constrained than one with an HU protein.
The other interesting feature found in these results is the influence of the linking number and the total twist on the relative step energy. In the case of the HU dimer, relaxed minicircles have, in most cases, a lower energy than under- or overwound minicircles. It is only for chain lengths close to half-integral numbers of helical repeats that the energy is lower for underwound minicircles. On the other hand, underwound minicircles of 79 bp or greater bearing an Hbb dimer are consistently lower in energy than relaxed or overwound minicircles. As mentioned above, both dimers slightly underwind the double helix, which means that their presence on a minicircle induces a redistribution of the torsional stress. Note that this redistribution may entail the conversion of part of the torsional stress into bending deformations as a consequence of the changes in the total twist and the writhing number. This also explains why overwound minicircles lead to higher energies: the large difference in the torsional stress between the naked DNA (prior to the addition of a protein) and the bound DNA causes higher deformations in the remaining part of the minicircle. In addition, the torsional stress in naked DNA depends on the chain length: for chain lengths close to integral numbers of helical repeats, the torsional stress is lower in relaxed minicircles than in underwound minicircles, but for chain lengths near half-integral numbers of helical repeats, the situation is reversed.
Our results further suggest that the torsional stress in DNA may influence the recruitment of HU and Hbb dimers and might act as a control mechanism for the presence of such proteins. For example, a relaxed minicircle of 94 bp might be more likely to associate with an HU than an Hbb dimer (assuming that the binding affinities of the two dimers are comparable), whereas an underwound minicircle of the same length might be more likely to take up an Hbb dimer. Although our results are obtained on covalently closed DNA minicircles (and, hence, under a topological constraint), it is reasonable to think that similar effects will take place on torsionally constrained linear DNA fragments (e.g., where the anchoring conditions could be set by large protein assemblies or magnetic/optical tweezers).
Minicircles with Two Bound Proteins.
Our second series of optimizations focuses on minicircles of 100 and 105 bp containing two HU or two Hbb dimers. In addition, for minicircles of 100 bp we consider relaxed (ΔLk = 0) and underwound (ΔLk = −1) molecules. This choice is motivated by the fact that the difference in energy between a naked-DNA minicircle of 100 bp with ΔLk = 0 and ΔLk = −1 is only 1.8 kBT (the relaxed minicircle corresponds to the lower energy). Our computations provide energy landscapes of the protein-bound minicircles as functions of the spacing s between the two binding sites. These landscapes are directly related to the relative likelihoods of forming minicircles with two dimers at specific locations. In other words, the landscapes reveal which dimer positions along the minicircle are more apt to be occupied, e.g., in cyclization experiments. We report in Figures 2 and 3 the energy landscapes as well as the associated changes in the total twist ΔTw (with respect to the planar, naked minicircles).
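The small energy gap between the relaxed and underwound naked 100 bp minicircles can be roughly cross-checked with a back-of-the-envelope estimate. The sketch below assumes that both minicircles are planar with identical bending energy, that the excess twist is spread uniformly over the N steps, and that the twist force constant is the one implied by a 66.6 nm twisting persistence length and a 3.4 Å rise. These are our simplifying assumptions, not the optimization actually performed, so only approximate agreement with the quoted 1.8 kBT should be expected.

```python
import math

HELICAL_REPEAT = 10.5
# Assumed twist force constant (kBT/deg^2), back-computed from a 66.6 nm
# twisting persistence length and a 3.4 angstrom rise.
K_TWIST = (666.0 / 3.4) * (math.pi / 180.0) ** 2


def planar_twist_energy(n_bp, delta_lk):
    """Twist energy (kBT) of a planar circle with uniformly distributed twist."""
    lk = round(n_bp / HELICAL_REPEAT) + delta_lk
    excess_deg = 360.0 * lk / n_bp - 360.0 / HELICAL_REPEAT   # per-step deviation
    return 0.5 * K_TWIST * n_bp * excess_deg ** 2


e_relaxed = planar_twist_energy(100, 0)
e_underwound = planar_twist_energy(100, -1)
gap = e_underwound - e_relaxed
print(e_relaxed, e_underwound, gap)
```

Under these assumptions the relaxed minicircle is the lower-energy state by slightly less than 2 kBT, in line with the value reported above.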
The energy landscapes consist of several local minima separated by high-energy states. In other words, there are well-defined locations for optimal placements of two HU or Hbb proteins along the DNA minicircles. Note that only the minima with the lowest energies and, hence, with the highest Boltzmann weights, are likely to be relevant to the statistical physics of protein-decorated minicircles. These minima appear periodically in the landscapes, and as expected, the period is roughly equal to the assumed DNA helical repeat (10.5 bp). In addition, the relative locations of the two proteins along the minicircle appear to affect the torsional stress significantly as evidenced by the variations in the total twist. Moreover, the lowest energy configurations are all similar from a geometric point of view. That is, for both HU and Hbb proteins, the globally optimal configurations are obtained when the two dimers are located at antipodal or near antipodal sites (as shown by the configurations labeled “1” in Figures 2 and 3). We note that the minima for the minicircles with two HU dimers are of lower energy than those for minicircles with two Hbb dimers. As noted above, the length of protein-free DNA for Hbb-bound minicircles is shorter than that for HU. Indeed, for two Hbb dimers, the length of naked DNA is comparable in size to the Hbb binding domain. The Hbb system is therefore highly constrained and leads to larger values in the illustrated energy landscapes and greater changes in the total twist.
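The statement that only the lowest minima carry appreciable Boltzmann weight can be made concrete with a short sketch. With energies expressed in kBT, the weight of each minimum is proportional to exp(−E). The energies below are hypothetical and are not read from Figures 2 or 3, and the sketch ignores entropic contributions and any difference in protein–DNA binding free energy.

```python
import math


def boltzmann_weights(energies_kbt):
    """Normalized Boltzmann weights for energies given in units of kBT."""
    e_min = min(energies_kbt)                      # shift for numerical stability
    factors = [math.exp(-(e - e_min)) for e in energies_kbt]
    z = sum(factors)
    return [f / z for f in factors]


minima = [20.0, 22.0, 26.0]    # hypothetical local minima of a landscape (kBT)
weights = boltzmann_weights(minima)
print(weights)
```

A minimum lying only a few kBT above the global one already contributes little, which is why the discussion concentrates on the deepest wells of each landscape.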
The results for the minicircles of 100 bp containing two HU or two Hbb dimers (see data on left in Figures 2 and 3) show that the energy minima are lower for the underwound minicircle (ΔLk = −1) than for the relaxed minicircle (ΔLk = 0). This is consistent with the results obtained for minicircles of 100 bp with a single dimer (see Figure 1). The variations in the energy landscapes of the relaxed and underwound minicircles of 100 bp are out of phase. That is, the minima for the relaxed minicircles correspond to the maxima of the underwound minicircles and vice versa. This phase shift between the energy landscapes of relaxed and underwound minicircles is roughly equal to half the assumed helical repeat (~5–6 bp). The changes in the total twist for these minicircles are also similar, although the magnitude of the change is larger for the minicircles with two Hbb proteins. For the relaxed minicircles of 100 bp, the changes in the total twist are negative (with respect to the planar, protein-free minicircles), while for underwound minicircles the changes are positive. This implies, because of the conservation of the linking number, that the changes in the writhing number are of opposite sign for relaxed versus underwound minicircles. Indeed, as shown in Figure 3 (configurations labeled “2”), the minicircles are of different handedness. Also, notice that the pattern in the changes in the total twist for minicircles of 100 bp bearing two dimers is similar regardless of the associated protein: the local maxima in the energy correspond to the highest changes in the total twist, while the local minima correlate with those of moderate values of ΔTw.
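The sign argument for the writhing number can be written out explicitly. The relation below uses the standard decomposition of the linking number into twist and writhe and takes, as in the text, the planar protein-free minicircle of the same linking number as the reference state; the subscript "ref" is notation introduced here for illustration.

```latex
Lk = Tw + Wr, \qquad
Wr_{\mathrm{ref}} = 0 \;\Rightarrow\; Tw_{\mathrm{ref}} = Lk, \qquad
Wr = Lk - Tw = -\left(Tw - Tw_{\mathrm{ref}}\right) = -\Delta Tw .
```

A negative ΔTw (relaxed 100 bp minicircles) therefore corresponds to positive writhe, and a positive ΔTw (underwound minicircles) to negative writhe.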
For minicircles of 105 bp containing two HU or Hbb dimers, the energy landscapes resemble those obtained for 100 bp minicircles. In particular, the minima appear periodically (every ~10.5 bp) and are separated by high-energy configurations. The minima for the 105 bp chains are lower than those on the minicircles of 100 bp. This difference is due to the fact that the torsional stress in the 105 bp minicircle is lower than that in the 100 bp minicircle prior to the addition of the proteins. The main difference between minicircles of 100 and 105 bp resides in the changes in the total twist. Whereas the changes in the total twist are comparable for 100 bp minicircles bearing two HU or two Hbb dimers, the values of ΔTw at the local minima are of different signs for the corresponding 105 bp HU and Hbb minicircles. That is, for the lower energy configurations, the presence of two HU dimers reduces the torsional stress in the minicircle (see Figure 2), while two Hbb dimers increase the torsional stress (see Figure 3). Notice, however, that the magnitudes of the changes in the total twist for these local minima are comparable.
Our findings for minicircles with two dimers reveal an interplay between the elastic properties of DNA and the positioning of proteins on the double helix. The result cannot be attributed solely to the proteins shaping the double helix or solely to the DNA stiffnesses controlling the positioning of proteins. Rather, there is cooperation between the presence of proteins and the elasticity of the double helix, particularly in the distribution of the torsional stress. The high degree of contrast in the energy landscapes suggests that once a protein is bound to a topologically constrained DNA fragment, the stress in the double helix will favor specific binding sites for other proteins. Such an interpretation echoes the results about DNA–protein allosteric effects found in single-molecule studies.30 It is also interesting to note that the local minima found in the energy landscapes are always flanked by configurations of comparable energy (within a few kBT). This suggests that there should be fluctuations in experimental measurements of the most likely binding sites of HU and Hbb dimers on DNA minicircles. Also note that these results have been obtained with a minimal model: DNA is treated as an isotropic material with standard bending and twisting stiffnesses, and the study is restricted to a special class of protein geometry, which includes specific deformations of the double helix. Nevertheless, the approach serves as a proof of concept and paves the way for more detailed studies of the synergy between DNA deformation and protein positioning.
CONCLUSIONS
The minimization procedure introduced in this work facilitates the investigation of how architectural proteins may contribute to the spatial organization and genetic processing of DNA. This new approach gives direct control of the positions and orientations of the base pairs at the ends of a DNA chain and makes it possible to specify the precise sites of protein uptake and the detailed changes in double-helical structure brought about by the binding of protein. Here we illustrate the utility of the method in a study of the elastic energy landscapes of DNA minicircles decorated by the nonspecific architectural protein HU and the similarly folded, albeit site-specific,31 Hbb protein. Both proteins associate as dimers, severely bend their DNA targets, and untwist them. We consider the Hbb-bound DNA as an extreme example of HU-induced DNA distortion and thus treat both proteins as nonspecific. We focus on covalently closed molecules comparable in length to the loops that are formed by various regulatory proteins and enzymes that bind to sequentially distant sites on DNA32,33 and allow for the uptake of one or two HU or Hbb dimers on the DNA. We also consider underwound and overwound minicircles in order to study the added effects of torsional stress within the double helix on the protein-binding landscapes. We find that the presence of protein has a significant effect on the bending and twisting deformations in the minicircles and, conversely, that the torsional stress within DNA prior to the addition of proteins has a strong effect on the optimal placement of proteins along the minicircles. For example, we show that a single HU dimer is more likely to bind relaxed rather than under- or overwound minicircles of most chain lengths between 63 and 105 bp and that an Hbb dimer binds preferentially to underwound minicircles of the same lengths.
Our results reveal cooperation between the deformability of the double helix and the structural distortions of DNA induced by bound proteins. In the case of minicircles with two HU or two Hbb dimers, the presence of a first protein strongly influences the locations of the optimal binding sites of a second protein. That is, the DNA, through its elastic deformation, acts as a communication medium between the proteins. In particular, the torsional stress and the twisting stiffness appear to play a major role in this action at a distance. The mechanical signaling also provides a rationale for the DNA allostery reported in single-molecule studies of the dissociation of proteins on DNA chains constrained to full extension by the flow of solvent.30 In particular, the binding of one protein on the extended DNA stabilizes or destabilizes the binding of another protein, even when not in direct contact. Indeed, this sort of long-range communication, in which the binding of a ligand in one part of the DNA helix influences (positively or negatively) the recognition of a different ligand at a remote site, also termed telestability,34 has puzzled DNA scientists for decades. Although our results concern a special class of proteins bound to covalently closed rather than straightened DNA, we observe an interplay between the elasticity of the double helix and the placement of proteins along DNA that may hold for extended as well as cyclic chains. 
Indeed, constraints on the twist of the intervening DNA give rise to a decaying oscillatory pattern of protein uptake on linear chains of the type observed experimentally,35 as does the interplay between DNA double-helical geometry and the deformations of base pairs considered in recent analytical work36 and the many interatomic interactions incorporated in detailed molecular simulations of protein-bound DNA chain fragments.37–41 In our case, the most likely placement of a second DNA-bending protein on a minicircle is antipodal to the first, i.e., as sequentially far apart as possible. Our studies provide examples of how the mechanical stress in DNA can control the placement of proteins and how proteins can alter the mechanical stress to broadcast their presence along DNA. Our findings further suggest that local modifications of the mechanical properties of DNA, such as methylation or the occurrence of kinks, could modulate and possibly repress this type of mechanical signaling. The very different responses of the minicircles to the degree of DNA uptake on HU vs Hbb also lend support to ideas suggesting how the unwrapping of nucleosomal DNA in chromatin might contribute to the binding of other molecules on the intervening protein-free DNA linkers and the deformability of chromatin as a whole.42 Note, for example, the greater accessibility and lower energy of the DNA connecting proteins at sites labeled “2” on 100-bp minicircles in Figures 2 and 3. The DNA connecting the HUs adopts a smooth bend with a locally accessible groove structure compared to the tightly folded supercoiled configuration that accompanies the uptake of nearly a turn of DNA (9 bp) on the termini of each Hbb.
Although the minimal elastic energy configurations of long DNA chains are not necessarily relevant from a statistical physics point of view, our method can be used to study larger systems, such as long DNA molecules decorated by multiple proteins. For example, we show in Figure 4 two examples of multiple loops on circular DNA induced by the Ter-specific protein MatP,7 which is thought to be involved in the condensation of chromosomes in Escherichia coli. We also show in Figure 5 a long DNA plasmid decorated by 64 closely spaced Hbb proteins. The software can be used in combination with other tools (e.g., 3DNA14) to optimize DNA fragments anchored by proteins. An interesting application of our approach resides in the development of software for biomolecular sculpting,43 which would make it possible to study how proteins can bundle and organize DNA. Our methodology could also be used to investigate the effects of DNA “melting” within a closed double-helical pathway, e.g., introducing noncanonical, opened states of the type captured in molecular-dynamics simulations44 or observed in high-resolution structures45 at selected sites and studying the “dynamic” response of the DNA as a whole. We are currently working on improving our method to account for additional types of constraints, such as the treatment of DNA excluded volume via electrostatic interactions.46 The software does not currently check for collisions between DNA and proteins. Although the proteins collide on some of the minicircles presented here (see Figures 2 and 3), the collisions only occur in high-energy configurations, in which pairs of proteins lie immediately next to one another. We also plan to include more realistic force fields to account, for example, for the higher deformability of pyrimidine–purine compared to other base-pair steps or for the flexibility of protein–DNA assemblies. 
This task is facilitated by the fact that the method already takes into account the sequence-dependent elasticity of DNA. It is thus also possible to use our approach to examine the effects of different force fields, e.g., knowledge-based47 vs computer-simulated48 elastic potentials, on large-scale chain properties, such as protein-mediated DNA looping.29 Note that the software makes it possible for the user to define and use custom force fields, thus facilitating investigations on the sensitivity of optimized structures with respect to the details of the force constants and intrinsic parameters. Finally, we have begun to study other biomolecular systems, including the short loops mediated by the Lac and Gal repressor proteins, nucleosome-decorated minicircles, and the relative contributions of protein and DNA flexibility in these systems.
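The idea of swapping force fields can be sketched with a quadratic base-pair-step energy of the kind used in knowledge-based potentials,47 in which each step has rest-state parameters and a matrix of force constants. The code below is a standalone illustration and does not reproduce the interface of the software; the function names, the two "stiff" and "soft" force fields, and all numerical values are hypothetical placeholders.

```python
def diagonal(values):
    """Build a diagonal stiffness matrix from a list of force constants."""
    n = len(values)
    return [[values[i] if i == j else 0.0 for j in range(n)] for i in range(n)]


def step_energy(params, rest, stiffness):
    """Quadratic energy 0.5 * d^T F d for one base-pair step, with d = params - rest."""
    d = [p - r for p, r in zip(params, rest)]
    n = len(d)
    return 0.5 * sum(d[i] * stiffness[i][j] * d[j] for i in range(n) for j in range(n))


def chain_energy(steps, rest, stiffness):
    """Sum the step energies over a chain that uses one rest state and one stiffness."""
    return sum(step_energy(s, rest, stiffness) for s in steps)


# Step parameters ordered as tilt, roll, twist (degrees), shift, slide, rise (angstroms).
REST = [0.0, 0.0, 360.0 / 10.5, 0.0, 0.0, 3.4]

# Hypothetical force fields; the constants are placeholders, not published values.
FORCE_FIELDS = {
    "stiff": diagonal([0.04, 0.04, 0.06, 2.0, 2.0, 10.0]),
    "soft": diagonal([0.02, 0.02, 0.03, 1.0, 1.0, 5.0]),
}

# Three identical steps, each rolled by 5 degrees away from the rest state.
bent_steps = [[0.0, 5.0, 360.0 / 10.5, 0.0, 0.0, 3.4] for _ in range(3)]
energies = {name: chain_energy(bent_steps, REST, f) for name, f in FORCE_FIELDS.items()}
```

Evaluating the same configuration under several force fields, as in the last line, is one simple way to gauge how sensitive an optimized structure is to the choice of force constants. A sequence-dependent version would look up the rest state and stiffness for each dinucleotide step rather than sharing one pair across the chain.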
The methods described in this paper are implemented as a set of documented C++ tools available at https://nicocvn.github.io/emDNA/.
ACKNOWLEDGMENTS
This work was generously supported by the U.S. Public Health Service under research grant GM34809.
Footnotes
Supporting Information
The Supporting Information is available free of charge at https://pubs.acs.org/doi/10.1021/acs.jpcb.0c11612.
Full details of the method of minimization of the elastic energy for a collection of base pairs (PDF)
Complete contact information is available at: https://pubs.acs.org/10.1021/acs.jpcb.0c11612
The authors declare no competing financial interest.
Contributor Information
Nicolas Clauvelin, Center for Quantitative Biology and Department of Chemistry and Chemical Biology, Rutgers, the State University of New Jersey, Piscataway, New Jersey 08854, United States.
Wilma K. Olson, Center for Quantitative Biology and Department of Chemistry and Chemical Biology, Rutgers, the State University of New Jersey, Piscataway, New Jersey 08854, United States.
REFERENCES
- (1) Swinger KK; Lemberg KM; Zhang Y; Rice PA Flexible DNA bending in HU-DNA cocrystal structures. EMBO J. 2003, 22, 3749–3760.
- (2) Swinger KK; Rice PA IHF and HU: flexible architects of bent DNA. Curr. Opin. Struct. Biol. 2004, 14, 28–35.
- (3) Swinger KK; Rice PA Structure-based analysis of HU-DNA binding. J. Mol. Biol. 2007, 365, 1005–1016.
- (4) Sagi D; Friedman N; Vorgias C; Oppenheim AB; Stavans J Modulation of DNA conformations through the formation of alternative high-order HU-DNA complexes. J. Mol. Biol. 2004, 341, 419–428.
- (5) Mouw KW; Rice PA Shaping the Borrelia burgdorferi genome: crystal structure and binding properties of the DNA-bending protein Hbb. Mol. Microbiol. 2007, 63, 1319–1339.
- (6) Arold ST; Leonard PG; Parkinson GN; Ladbury JE H-NS forms a superhelical protein scaffold for DNA condensation. Proc. Natl. Acad. Sci. U. S. A. 2010, 107, 15728–15732.
- (7) Dupaigne P; Tonthat NK; Espéli O; Whitfill T; Boccard F; Schumacher MA Molecular basis for a protein-mediated DNA-bridging mechanism that functions in condensation of the E. coli chromosome. Mol. Cell 2012, 48, 560–571.
- (8) Dame RT; Kalmykowa OJ; Grainger DC Chromosomal macrodomains and associated proteins: implications for DNA organization and replication in gram negative bacteria. PLoS Genet. 2011, 7, e1002123.
- (9) Luger K; Mader AW; Richmond RK; Sargent DF; Richmond TJ Crystal structure of the nucleosome core particle at 2.8 Å resolution. Nature 1997, 389, 251–260.
- (10) West AG; Gaszner M; Felsenfeld G Insulators: many functions, many mechanisms. Genes Dev. 2002, 16, 271–288.
- (11) Coleman BD; Olson WK; Swigon D Theory of sequence-dependent DNA elasticity. J. Chem. Phys. 2003, 118, 7127–7140.
- (12) Zhang YL; Crothers DM Statistical mechanics of sequence-dependent circular DNA and its application for DNA cyclization. Biophys. J. 2003, 84, 136–153.
- (13) Dickerson RE; Bansal M; Calladine CR; Diekmann S; Hunter WN; Kennard O; von Kitzing E; Lavery R; Nelson HCM; Olson WK; et al. Definitions and nomenclature of nucleic acid structure parameters. Nucleic Acids Res. 1989, 17, 1797–1803.
- (14) Lu X-J; Olson WK 3DNA: a software package for the analysis, rebuilding, and visualization of three-dimensional nucleic acid structures. Nucleic Acids Res. 2003, 31, 5108–5121.
- (15) El Hassan MA; Calladine CR The assessment of the geometry of dinucleotide steps in double-helical DNA: a new local calculation scheme. J. Mol. Biol. 1995, 251, 648–664.
- (16) Langer J; Singer DA Lagrangian aspects of the Kirchhoff elastic rod. SIAM Rev. 1996, 38, 605–618.
- (17) Gonzalez O; Petkevičiūte D; Maddocks JH A sequence-dependent rigid-base model of DNA. J. Chem. Phys. 2013, 138, 055102.
- (18) Gauss CF Carl Friedrich Gauss Werke; Gedruckt in der Dieterichschen Universitätsdruckerei (Kaestner WF): Göttingen, 1867.
- (19) Pohl WF The self-linking number of a closed space curve. J. Math. Mech. 1968, 17, 975–985.
- (20) White JH; Bauer WR Calculation of the twist and the writhe for representative models of DNA. J. Mol. Biol. 1986, 189, 329–341.
- (21) Clauvelin N; Tobias I; Olson WK Characterization of the geometry and topology of DNA pictured as a discrete collection of atoms. J. Chem. Theory Comput. 2012, 8, 1092–1107.
- (22) Zajac EE Stability of two planar loop elasticas. J. Appl. Mech. 1962, 29, 136–142.
- (23) Goriely A Twisted elastic rings and the rediscoveries of Michell’s instability. J. Elast. 2006, 84, 281–299.
- (24) Wei J; Czapla L; Grosner MA; Swigon D; Olson WK DNA topology confers sequence specificity to nonspecific architectural proteins. Proc. Natl. Acad. Sci. U. S. A. 2014, 111, 16742–16747.
- (25) Manning RS; Maddocks JH; Kahn JD A continuum rod model of sequence-dependent DNA structure. J. Chem. Phys. 1996, 105, 5626–5646.
- (26) Swigon D; Coleman BD; Olson WK Modeling the Lac repressor-operator assembly: the influence of DNA looping on Lac repressor conformation. Proc. Natl. Acad. Sci. U. S. A. 2006, 103, 9879–9884.
- (27) Zhang Y; McEwen AE; Crothers DM; Levene SD Statistical-mechanical theory of DNA looping. Biophys. J. 2006, 90, 1903–1912.
- (28) Goyal S; Lillian T; Blumberg S; Meiners J-C; Meyhöfer E; Perkins NC Intrinsic curvature of DNA influences LacR-mediated looping. Biophys. J. 2007, 93, 4342–4359.
- (29) Perez PJ; Olson WK Insights into genome architecture deduced from the properties of short Lac repressor-mediated DNA loops. Biophys. Rev. 2016, 8, 135–144.
- (30) Kim S; Bröstromer E; Xing D; Jin J; Chong S; Ge H; Wang S; Gu C; Yang L; Gao YQ; et al. Probing allostery through DNA. Science 2013, 339, 816–819.
- (31) Kobryn K; Naigamwalla DZ; Chaconas G Site-specific DNA binding and bending by the Borrelia burgdorferi Hbb protein. Mol. Microbiol. 2000, 37, 145–155.
- (32) Adhya S Multipartite genetic control elements: communication by DNA loop. Annu. Rev. Genet. 1989, 23, 227–250.
- (33) Schleif R DNA looping. Annu. Rev. Biochem. 1992, 61, 199–223.
- (34) Burd J; Wartell RM; Dodgson JB; Wells RD Transmission of stability (telestability) in deoxyribonucleic acid. J. Biol. Chem. 1975, 250, 5109–5113.
- (35) Koslover EF; Spakowitz AJ Twist- and tension-mediated elastic coupling between DNA-binding proteins. Phys. Rev. Lett. 2009, 102, 178102.
- (36) Singh J; Purohit PK Elasticity as the basis of allostery in DNA. J. Phys. Chem. B 2019, 123, 21–28.
- (37) Xu X; Ge H; Gu C; Gao YQ; Wang SS; Thio BJR; Hynes JT; Xie XS; Cao J Modeling spatial correlation of DNA deformation: DNA allostery in protein binding. J. Phys. Chem. B 2013, 117, 13378–13387.
- (38) Dršata T; Zgarbová M; Špačková N; Jurečka P; Šponer J; Lankaš F Mechanical model of DNA allostery. J. Phys. Chem. Lett. 2014, 5, 3831–3835.
- (39) Gu C; Zhang J; Yang YI; Chen X; Ge H; Sun Y; Su X; Yang L; Xie S; Gao YQ DNA structural correlation in short and long ranges. J. Phys. Chem. B 2015, 119, 13980–13990.
- (40) Dršata T; Zgarbová M; Jurečka P; Šponer J; Lankaš F On the use of molecular dynamics simulations for probing allostery through DNA. Biophys. J. 2016, 110, 874–876.
- (41) Balaceanu A; Pérez A; Dans PD; Orozco M Allosterism and signal transfer in DNA. Nucleic Acids Res. 2018, 46, 7554–7565.
- (42) Lesne A; Foray N; Cathala G; Forné T; Wong H; Victor J-M Chromatin fiber allostery and the epigenetic code. J. Phys.: Condens. Matter 2015, 27, 064114.
- (43) Touzain F; Petit M-A; Schbath S; Karoui ME DNA motifs that sculpt the bacterial chromosome. Nat. Rev. Microbiol. 2011, 9, 15–26.
- (44) Irobalieva RN; Fogg JM; Catanese DJ Jr.; Sutthibutpong T; Chen M; Barker AK; Ludtke SJ; Harris SA; Schmid MF; Chiu W; et al. Structural diversity of supercoiled DNA. Nat. Commun. 2015, 6, 8440.
- (45) Barnes CO; Calero M; Malik I; Graham BW; Spahr H; Lin G; Cohen AE; Brown IS; Zhang Q; Pullara F; et al. Crystal structure of a transcribing RNA polymerase II complex reveals a complete transcription bubble. Mol. Cell 2015, 59, 258–269.
- (46) Westcott TP; Tobias I; Olson WK Modeling self-contact forces in the elastic theory of DNA supercoiling. J. Chem. Phys. 1997, 107, 3967–3980.
- (47) Olson WK; Gorin AA; Lu X-J; Hock LM; Zhurkin VB DNA sequence-dependent deformability deduced from protein-DNA crystal complexes. Proc. Natl. Acad. Sci. U. S. A. 1998, 95, 11163–11168.
- (48) Lankaš F; Šponer J; Hobza P; Langowski J Sequence-dependent elastic properties of DNA. J. Mol. Biol. 2000, 299, 695–709.