Proceedings of the National Academy of Sciences of the United States of America. 1998 Dec 22;95(26):15163–15165. doi: 10.1073/pnas.95.26.15163

DNA curvature and deformation in protein–DNA complexes: A step in the right direction

Donald M Crothers
PMCID: PMC33930  PMID: 9860938

What role does the energy of DNA deformation play in the thermodynamics and specificity of complex formation by proteins that distort the double helix? This question by its nature requires a conjunction between structural, energetic, and thermodynamic studies. Detailed analysis of the extent of DNA distortion depends on knowing both the starting and ending conformations. In spite of the extensive body of DNA structures determined since the pioneering work of Drew et al. (1), there are few cases in which high-resolution structures are available for DNA both free and in complex with the protein to which it binds. In the paper by Rozenberg et al. (2) in this issue of the Proceedings, the Shakked group adds a high-resolution x-ray study of the DNA target of the bovine papillomavirus E2 protein to its earlier work on the structure of the trp operator DNA (3). This contribution, taken together with the earlier determination of the structure of the E2 protein–DNA complex (4, 5), effectively doubles the number of cases where direct “before and after” comparison of DNA structures is possible.

The sequence target of E2 has the general form ACCGNNNNCGGT, where N4 is variable and does not contact the protein. Binding of the protein yields bends toward the major grooves at the dyad-symmetric ACCG/CGGT elements (4, 5). The bends result primarily from roll, which measures rotation of a base pair plane about its long axis. This motion creates an angle, narrowing toward the major groove for positive roll, between two otherwise parallel adjacent base pairs. Because the centers of the two ACCG/CGGT sequences are less than a full helical turn apart, a writhed curve rather than a planar bend of the DNA axis results.
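The geometric point about the spacing of the two half sites can be checked with a little arithmetic. The sketch below assumes a helical repeat of 10.5 bp per turn, a typical solution value for B-DNA that is not stated in the text, and counts the centers of the two 4-bp elements in ACCGNNNNCGGT by base-pair position. The function name `helical_phase_deg` is only illustrative.

```python
BP_PER_TURN = 10.5  # assumed helical repeat of B-DNA in solution (not from the text)


def helical_phase_deg(separation_bp: float, bp_per_turn: float = BP_PER_TURN) -> float:
    """Rotation of the local helix frame (degrees, 0-360) between two sites."""
    return (360.0 * separation_bp / bp_per_turn) % 360.0


# Centers of the two half sites, in 1-based base-pair positions.
center_accg = (1 + 4) / 2    # ACCG occupies positions 1-4
center_cggt = (9 + 12) / 2   # CGGT occupies positions 9-12
separation = center_cggt - center_accg

phase = helical_phase_deg(separation)
shortfall = 360.0 - phase    # how far the second bend is from being in phase

print(f"separation = {separation} bp, phase = {phase:.1f} deg, "
      f"short of a full turn by {shortfall:.1f} deg")
```

Under these assumptions the centers are 8 bp apart, which corresponds to about 274° of helical rotation, roughly 86° short of a full turn. Two bends that each point toward the local major groove therefore point in different directions in space, which is why the axis follows a writhed rather than a planar path.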

Comparison with the structure of the starting DNA, determined by Rozenberg et al. (2) from two crystal forms containing four unique copies of the same molecule, shows that the protein amplifies a tendency to bend toward the major groove at the ACCG sequences, particularly by roll at the CC and CG steps. The small degree of structure modification induced in this region by protein binding is emphasized by the value of 0.8 Å for the rms difference between the conserved ACCG/CGGT regions of the bound and free helices, only slightly larger than the 0.6 Å observed for the difference between the two equivalent domains of the free DNA molecules. Thus, although the mechanism is not entirely “lock and key,” the DNA clearly has a predisposition to adopt the structure bound by the protein in the consensus sequence region.

However, matters are not so simple for the N4 spacer. The protein-induced DNA bend has a substantial component caused by a bend toward the minor groove at the central spacer. The gACGTc sequence at the N4 spacer takes on substantial negative roll, even at the CG sequence that was amplified in its tendency to positive roll in the consensus sequence region. However, Rozenberg et al. (2) note that the induced negative roll is more pronounced at the AC and GT steps than at CG, paralleling the trend observed in the preference for more positive roll at CG than AC/GT in the naked DNA. Thus, one can view the bend as enforced by the protein, but the localized components still reflect the intrinsic relative bending tendencies of the DNA.

Affinity of the natural E2 binding sites is modulated by the nature of the N4 spacer (6), so one should not expect that all of the observed sequences are selected to maximize binding affinity. On the face of it, GACGTC would seem to be a rather poor choice, because it is represented in the GCN4 target sequence ATGACGTCAT, which contains a small (≈8°) bend toward the major groove in solution (7, 8). Comparison with the related sequence ATGACTCAT, which is not bent, indicates that the central CG step is essential (7, 8). Why would a sequence bent toward the major groove be selected when the protein induces a bend toward the minor groove? Why not a sequence such as an A tract that causes intrinsic curvature toward the minor groove? Indeed, such sequences are found as the target of the human virus E2 protein.

Thermodynamic data that bear on this issue were recently provided by Hines et al. (9), who investigated the dependence of the affinity of bovine and human papillomavirus E2 proteins for DNA constructs containing different spacer sequences and also sources of flexibility such as nicks and gaps in the N4 sequence. The results showed a remarkable contrast between the closely related bovine and human virus proteins. Binding of bovine E2 was only marginally affected by substitution of an A4 tract for ACGT, whereas the affinity of the human virus protein increased by nearly 30-fold. Nicks or gaps in the DNA binding site also produced little change in the affinity for bovine E2, but reduced by as much as 100-fold the affinity for the human virus protein. At this stage of enlightenment it is difficult to generalize on these observations, except to say that it obviously will be of great interest to compare the structures of the bovine and human varieties of the E2 protein–DNA complexes, as well as the respective target DNAs, to understand how the two protein binding affinities can be so differently impacted by the curvature and flexibility of their DNA targets.
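To put these affinity ratios on an energy scale, a fold change in binding constant can be converted to a free energy difference with ΔΔG = −RT ln(fold). The sketch below assumes a temperature of 298.15 K, which is not taken from Hines et al. (9), and treats the quoted "nearly 30-fold" and "as much as 100-fold" as round numbers, so the results are order-of-magnitude estimates only.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol*K)


def ddg_from_fold_change(fold: float, temperature_k: float = 298.15) -> float:
    """Free energy change (kcal/mol) for a `fold`-fold change in binding affinity.

    Negative values mean tighter binding. The default temperature is an assumption.
    """
    return -R_KCAL * temperature_k * math.log(fold)


# Human E2: A4 tract in place of ACGT raises affinity roughly 30-fold.
ddg_a_tract = ddg_from_fold_change(30.0)
# Human E2: nicks or gaps lower affinity by up to roughly 100-fold.
ddg_nick = ddg_from_fold_change(1.0 / 100.0)

print(f"A-tract spacer: {ddg_a_tract:.2f} kcal/mol")
print(f"nicked/gapped spacer: {ddg_nick:+.2f} kcal/mol")
```

On this estimate the spacer substitution is worth about 2 kcal/mol of binding free energy for the human protein, and the loss of continuity in the spacer costs up to about 2.7 kcal/mol.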

Considerable effort has gone into generating dinucleotide (10, 11) or trinucleotide (12) parameter sets designed to predict the observed magnitude and direction of curvature of DNA sequences (13). Additional experimental tests come, for example, from predicting the preferred positions of nucleosomes on genomic DNA (14), reflecting the intrinsic curvature that helps meet the need for bending DNA around the core histones. Work recently published by Olson and coworkers (15, 16) adds parameter sets from a different source. They surveyed the cumulative x-ray data on DNA oligonucleotides (15) and oligonucleotide-protein complexes (16), and tabulated average roll and tilt angles for individual dinucleotide steps, as well as other variables that characterize local DNA structure. (Tilt arises from rotation about a base pair short axis and is generally less than half as large as roll.) The average roll angles from the x-ray data differ from one dinucleotide to another, as is also the case for the other parameter sets. In some instances there is agreement; for example, all of the parameter sets collected in Table 1 assign low or negative roll to AA/TT steps, and high positive roll to the CG step (the one found at the center of the ACGT E2 spacer). However, as the table shows, there is considerable divergence in sign and magnitude.

Table 1.

Parameters for roll angles at dinucleotide steps

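Rows give the first (5′) base of the step, columns the second (3′) base; sets a–d are identified in the notes below.

5′ base  Set       A       T       G       C
T        a       0.9    −6.5     1.6    −2.7
T        b       8.0    −5.4     6.7     2.0
T        c       2.6     0.5     1.1    −0.1
T        d       3.3     0.7     4.7     1.9
A        a      −6.5     2.6     8.4    −0.9
A        b      −5.4    −7.3     1.0    −2.4
A        c       0.5    −0.6     2.9     0.4
A        d       0.7     1.1     4.5     0.7
C        a       1.6     8.4     6.7     1.2
C        b       6.7     1.0     4.6     1.3
C        c       1.1     2.9     6.6     6.5
C        d       4.7     4.5     5.4     3.6
G        a      −2.7    −0.9     1.2    −5.0
G        b       2.0    −2.4     1.3    −3.7
G        c      −0.1     0.4     6.5    −7.0
G        d       1.9     0.7     3.6     0.3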

Bend angles are given in degrees. 

a. The first row in each set comes from Bolshoy et al. (10) and corresponds to the roll angle between nucleotide i and j. For example, the upper left corner of the matrix corresponds to T followed by A in the 5′ to 3′ direction.

b. The second row is from De Santis et al. (11).

c. The third row is from Gorin et al. (15) for naked DNA structures.

d. The fourth row is from Olson et al. (16) for selected data from protein–DNA complexes.

In considering Table 1 it is well to keep in mind the source of the numbers in the different parameter sets. The values from De Santis et al. (11) have their origin in theoretical calculations of the minimum energy structure, and those of Bolshoy et al. (10) were generated to fit the observed curvature of DNA sequences. The experimental tests of these parameters lie, for example, in reproducing experimentally observed DNA curvature and in predicting where nucleosomes will bind preferentially, based on the preferred direction and magnitude of DNA curvature. Because DNA curvature depends on the difference between roll angles, the predictions generated by the De Santis et al. and Bolshoy et al. parameter sets would be unaffected by adding or subtracting a constant to the roll angles at all steps. The Gorin et al. (15) and Olson et al. (16) parameters, however, are based on direct structural observations, and cannot be adjusted in that way without violating the basic assumption that the structure in the crystal reflects the structure in solution. To provide a more transparent comparison between the parameters, Table 2 shows the result of setting the mean of each set arbitrarily to zero, and adjusting the range by dividing by half the difference between the maximum and minimum roll angle values. Viewed this way the agreement appears better; for example, in most cases the parameters for a particular dinucleotide have the same sign.
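One way to see why a constant offset drops out of the curvature prediction is a first-order vector-sum model, in which each roll wedge contributes a small bend rotated by its helical phase. The sketch below is a simplification under stated assumptions, not the procedure used by any of the cited groups: it ignores tilt and sequence-dependent twist, assumes a uniform 10.5 bp per turn, and uses a hypothetical 21-bp repeat unit (exactly two turns) chosen only for illustration. The roll values are the Bolshoy et al. row of Table 1.

```python
import cmath
import math

# Bolshoy et al. roll angles (degrees) from Table 1, keyed by the 5'->3' step.
ROLL_BOLSHOY = {
    "TA": 0.9, "TT": -6.5, "TG": 1.6, "TC": -2.7,
    "AA": -6.5, "AT": 2.6, "AG": 8.4, "AC": -0.9,
    "CA": 1.6, "CT": 8.4, "CG": 6.7, "CC": 1.2,
    "GA": -2.7, "GT": -0.9, "GG": 1.2, "GC": -5.0,
}


def step_rolls(repeat_unit, table, offset=0.0):
    """Roll at every step of a tandem repeat (the last base wraps to the first)."""
    n = len(repeat_unit)
    return [table[repeat_unit[i] + repeat_unit[(i + 1) % n]] + offset
            for i in range(n)]


def net_bend(rolls_deg, bp_per_turn=10.5):
    """First-order vector sum of roll wedges, each rotated by its helical phase."""
    return sum(r * cmath.exp(2j * math.pi * k / bp_per_turn)
               for k, r in enumerate(rolls_deg))


unit = "AAAAAACGGCTAAAAAACGGC"  # hypothetical 21-bp repeat, two helical turns

bend_original = net_bend(step_rolls(unit, ROLL_BOLSHOY))
bend_shifted = net_bend(step_rolls(unit, ROLL_BOLSHOY, offset=5.0))

print(f"|net bend| original: {abs(bend_original):.3f} deg")
print(f"|net bend| with +5 deg on every step: {abs(bend_shifted):.3f} deg")
```

The two magnitudes agree because the phase factors sum to zero over a whole number of helical turns, so any constant added to every step cancels. For a stretch that is not an integral number of turns the cancellation is only approximate.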

Table 2.

Relative roll angles at dinucleotide steps

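Rows give the first (5′) base of the step, columns the second (3′) base; each set is labeled by its reference number.

5′ base  Set        A       T       G       C
T        (10)    0.06   −0.94    0.15   −0.42
T        (11)    0.98   −0.77    0.81    0.20
T        (15)    0.16   −0.15   −0.06   −0.24
T        (16)    0.28   −0.75    0.82   −0.28
A        (10)   −0.94    0.29    1.07   −0.18
A        (11)   −0.77   −1.02    0.07   −0.38
A        (15)   −0.15   −0.31    0.20   −0.16
A        (16)   −0.75   −0.59    0.75   −0.75
C        (10)    0.15    1.07    0.84    0.10
C        (11)    0.81    0.07    0.54    0.11
C        (15)   −0.06    0.20    0.75    0.73
C        (16)    0.82    0.75    1.10    0.39
G        (10)   −0.42   −0.18    0.10   −0.73
G        (11)    0.20   −0.38    0.11   −0.55
G        (15)   −0.24   −0.16    0.73   −1.25
G        (16)   −0.28   −0.75    0.39   −0.90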

Relative dinucleotide roll angles compared for the Bolshoy et al. (10), De Santis et al. (11), Gorin et al. (15), and Olson et al. (16) parameter sets. The numbers were generated from the corresponding entries in Table 1 by (arbitrarily) setting the mean for each set to zero and normalizing the range by dividing by half the difference between the maximum and minimum roll angles. Thus the total range for each set is 2.0 units.  
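The normalization described in this caption is simple to reproduce. The sketch below applies it to the De Santis et al. row of Table 1; it assumes the mean is taken over all 16 entries of the matrix, which the text does not specify, and the function name `normalize` is illustrative.

```python
# De Santis et al. roll angles (degrees) from Table 1, keyed by the 5'->3' step.
ROLL_DE_SANTIS = {
    "TA": 8.0, "TT": -5.4, "TG": 6.7, "TC": 2.0,
    "AA": -5.4, "AT": -7.3, "AG": 1.0, "AC": -2.4,
    "CA": 6.7, "CT": 1.0, "CG": 4.6, "CC": 1.3,
    "GA": 2.0, "GT": -2.4, "GG": 1.3, "GC": -3.7,
}


def normalize(rolls):
    """Shift a parameter set to zero mean, then divide by half its (max - min) range."""
    values = list(rolls.values())
    mean = sum(values) / len(values)
    half_range = (max(values) - min(values)) / 2
    return {step: (v - mean) / half_range for step, v in rolls.items()}


relative = normalize(ROLL_DE_SANTIS)
for step in ("TA", "AA", "AT", "CG"):
    print(f"{step}: {relative[step]:+.2f}")
```

With these inputs TA comes out at 0.98 and AT at −1.02, the extremes of the De Santis et al. set in Table 2, so the total range is 2.0 units as stated.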

Stringent tests of the predictive power of these parameters are now available from solution data. For example, how well can one predict the relative curvature of molecules containing A tracts with varying sequences between the tracts, as deduced from comparative electrophoresis data (17)? The De Santis et al. parameters (11) reproduce quite well the small variation of curvature in such sequences (17), but the Gorin et al. (15) x-ray parameter set does not do as well. For example, the repeated sequence …AnTGACTAn… contains no dinucleotides in the sequence between the A tracts that are given large positive roll by the Gorin et al. parameters, yet it is actually about 20% more curved than sequences of the form …AnGTCGGAn… (17), which contain the CG and GG dinucleotide steps assigned high positive roll by the Gorin et al. parameters. This lack of predictive power provides a challenge to those who argue that the x-ray data show unambiguously that high positive roll in sequences between A tracts is the basis for A-tract bends (18).

Further data for testing the parameters will be increasingly available from cyclization kinetic measurements, which are capable of characterizing intrinsic DNA bends as small as 8° (8, 19–22). For example, how well do the parameter sets predict the direction and magnitude of curvature of DNA sequences such as the CAP site in the lac promoter (17)? Although success is not assured, one can be optimistic about the prospects for bridging between structural descriptions at the atomic level and the global properties of DNA curvature.

The energy of deforming DNA to bind a protein depends not only on the propensity to curve in the required direction, but also on the magnitude of the forces that oppose further bending (i.e., curvature vs. bendability). The analysis of Olson et al. (16) brings new insight into this part of the problem. They assumed that the spread of roll and other angles in the crystal structure database about the mean for a particular dinucleotide in DNA–protein complexes reflects a Boltzmann distribution in the ensemble, and proceeded to extract the corresponding harmonic potential force constants. The same assumption about the roll and tilt angular fluctuations for B-DNA (15) allows one to use a generalization of the Schellman formula for “hinge” bending (23) to two independent bend angles (24) to calculate a persistence length (a measure of DNA bending stiffness) of about 175 bp, close enough to the canonical value of about 150 bp to provide some reassurance about the underlying assumption. Further support comes from the observation that DNAs containing multiple CTG repeats are considerably more flexible than normal B-DNA (25, 26), because the reported stiffness parameters indicate that the individual dinucleotides CT, TG, and GC are unusually flexible in the naked DNA parameter set (15) (although not in the set for protein–DNA complexes; ref. 16). It may, however, turn out that quantitative characterization of both bending and flexibility will require trinucleotide or higher order parameter sets to be fully effective (12).
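The two steps in this argument, from an angular spread to a force constant and from angular fluctuations to a persistence length, can be sketched numerically. For a harmonic potential, a Boltzmann distribution with standard deviation σ implies a force constant k = kT/σ². For the stiffness, the code uses a small-angle worm-like-chain estimate, P ≈ 2/(σ_roll² + σ_tilt²) base pairs with the angles in radians, which follows from ⟨cos θ⟩ ≈ 1 − ⟨θ²⟩/2 for two independent bend angles; this may differ in detail from the expressions in refs. 23 and 24. The spreads of 5.0° and 3.5° are hypothetical values chosen only to show the size of fluctuation that gives a result near 175 bp, and the temperature is assumed.

```python
import math

KT_KCAL = 0.593  # kT per mole in kcal/mol near 298 K (assumed temperature)


def force_constant(sigma_deg: float, kt: float = KT_KCAL) -> float:
    """Harmonic force constant, kcal/(mol*deg^2), from a Boltzmann spread: k = kT / sigma^2."""
    return kt / sigma_deg ** 2


def persistence_length_bp(sigma_roll_deg: float, sigma_tilt_deg: float) -> float:
    """Small-angle estimate P = 2 / (sigma_roll^2 + sigma_tilt^2), angles in radians, P in bp."""
    variance = math.radians(sigma_roll_deg) ** 2 + math.radians(sigma_tilt_deg) ** 2
    return 2.0 / variance


# Hypothetical average spreads per base-pair step (illustration only).
sigma_roll, sigma_tilt = 5.0, 3.5

k_roll = force_constant(sigma_roll)
p_bp = persistence_length_bp(sigma_roll, sigma_tilt)

print(f"roll force constant: {k_roll:.4f} kcal/(mol*deg^2)")
print(f"persistence length: {p_bp:.0f} bp")
```

The estimate is sensitive to the assumed spreads: because P scales as the inverse of the summed variances, a change of about 10% in the fluctuation amplitude shifts the result by about 20%, comparable to the gap between 175 and 150 bp.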

Finally, returning to the CG step in the E2 spacer, Olson et al. (16) cast interesting light on why this step may be chosen. Their analysis of the structural variation of DNA in protein complexes (see their table 2) provides an estimate of the volume in Cartesian coordinate-angle conformation space that lies within a specified free energy contour. By this measure CG (followed by CA) has the largest volume and is therefore the most flexible of the dinucleotide steps in protein complexes. This finding supports the view of Rozenberg et al. (2) that the GACGTC sequence has an unusual ability to adapt its conformation. It certainly is safe to say that the days are over when one dinucleotide step was as good as another in determining the structure and mechanical properties of DNA.

Footnotes

A commentary on this article begins on page 15194.

References

1. Drew H R, Wing R M, Takano T, Broka C, Tanaka S, Itakura K, Dickerson R E. Proc Natl Acad Sci USA. 1981;78:2179–2183. doi: 10.1073/pnas.78.4.2179.
2. Rozenberg H, Rabinovich D, Frolow F, Hegde R S, Shakked Z. Proc Natl Acad Sci USA. 1998;95:15194–15199. doi: 10.1073/pnas.95.26.15194.
3. Shakked Z, Guzikevich-Guerstein G, Frolow F, Rabinovich D, Joachimiak A, Sigler P. Nature (London) 1994;368:469–473. doi: 10.1038/368469a0.
4. Hegde R S, Grossman S R, Laimins L A, Sigler P B. Nature (London) 1992;359:505–512. doi: 10.1038/359505a0.
5. Hegde R S, Wang A-F, Kim S-S, Schapira M. J Mol Biol. 1998;276:797–808. doi: 10.1006/jmbi.1997.1587.
6. Li R, Knight J, Bream G, Stenlund A, Botchan M. Genes Dev. 1989;3:510–526. doi: 10.1101/gad.3.4.510.
7. Paolella D M, Palmer C R, Schepartz A. Science. 1994;264:1130–1133. doi: 10.1126/science.8178171.
8. Hockings S C, Kahn J D, Crothers D M. Proc Natl Acad Sci USA. 1998;95:1410–1415. doi: 10.1073/pnas.95.4.1410.
9. Hines C S, Meghoo C, Shetty S, Biburger M, Brenowitz M, Hegde R S. J Mol Biol. 1998;276:809–818. doi: 10.1006/jmbi.1997.1578.
10. Bolshoy A, McNamara P, Harrington R E, Trifonov E N. Proc Natl Acad Sci USA. 1991;88:2312–2316. doi: 10.1073/pnas.88.6.2312.
11. De Santis P, Palleschi A, Savino M, Scipioni A. Biochemistry. 1990;29:9269–9273. doi: 10.1021/bi00491a023.
12. Brukner I, Sanchez R, Suck D, Pongor S. EMBO J. 1995;14:1812–1818. doi: 10.1002/j.1460-2075.1995.tb07169.x.
13. Bofelli D, De Santis P, Palleschi A, Risuleo B, Savino M. FEBS Lett. 1992;300:175–178. doi: 10.1016/0014-5793(92)80190-r.
14. Bofelli D, De Santis P, Palleschi A, Savino M. Biophys Chem. 1991;39:127–136. doi: 10.1016/0301-4622(91)85014-h.
15. Gorin A A, Zhurkin V B, Olson W K. J Mol Biol. 1995;247:34–48. doi: 10.1006/jmbi.1994.0120.
16. Olson W K, Gorin A A, Lu X-J, Hock L M, Zhurkin V B. Proc Natl Acad Sci USA. 1998;95:11163–11168. doi: 10.1073/pnas.95.19.11163.
17. Haran T E, Kahn J D, Crothers D M. J Mol Biol. 1994;244:135–144. doi: 10.1006/jmbi.1994.1713.
18. Goodsell D S, Kaczor-Grzeskowiak M, Dickerson R E. J Mol Biol. 1994;239:79–96. doi: 10.1006/jmbi.1994.1352.
19. Sitlani A, Crothers D M. Proc Natl Acad Sci USA. 1998;95:1404–1409. doi: 10.1073/pnas.95.4.1404.
20. Hockings S C, Kahn J D, Crothers D M. Proc Natl Acad Sci USA. 1998;95:1410–1415. doi: 10.1073/pnas.95.4.1410.
21. Crothers D M, Drak J, Kahn J D, Levene S D. Methods Enzymol. 1992;212:3–29. doi: 10.1016/0076-6879(92)12003-9.
22. Kahn J D, Crothers D M. J Mol Biol. 1998;276:287–309. doi: 10.1006/jmbi.1997.1515.
23. Schellman J A. Biopolymers. 1974;13:217–226. doi: 10.1002/bip.1974.360130115.
24. Levene S D, Crothers D M. J Biomol Struct Dyn. 1983;1:429–436. doi: 10.1080/07391102.1983.10507452.
25. Bacolla A, Gellibolian R, Shimizu M, Amirlkhaeri S, Kang S, Ohshima K, Larson J E, Harvey S C, Stollar B D, Wells R D. J Biol Chem. 1997;272:16783–16792. doi: 10.1074/jbc.272.27.16783.
26. Chastain P D, Sinden R R. J Mol Biol. 1998;275:405–411. doi: 10.1006/jmbi.1997.1502.
