Author manuscript; available in PMC: 2013 Jul 18.
Published in final edited form as: Int J Non Linear Mech. 2008 Dec;43(10):1082–1093. doi: 10.1016/j.ijnonlinmec.2008.07.003

Mesoscale modeling of multi-protein-DNA assemblies: the role of the catabolic activator protein in Lac-repressor-mediated looping

David Swigon *, Wilma K Olson #
PMCID: PMC3715064  NIHMSID: NIHMS84473  PMID: 23874000

Abstract

DNA looping plays a key role in the regulation of the lac operon in Escherichia coli. The presence of a tightly bent loop (between sequentially distant sites of Lac repressor protein binding) purportedly hinders the binding of RNA polymerase and subsequent transcription of the genetic message. The unexpectedly favorable binding interaction of this protein-DNA assembly with the catabolic activator protein (CAP), a protein that also bends DNA and paradoxically facilitates the binding of RNA polymerase, stimulated extension of our base-pair level theory of DNA elasticity to the treatment of DNA loops formed in the presence of several proteins. Here we describe in detail a procedure to determine the structures and free energies of multi-protein-DNA assemblies and illustrate the predicted effects of CAP binding on the configurations of the wild-type 92-bp Lac repressor-mediated O3-O1 DNA loop. We show that the DNA loop adopts an antiparallel orientation in the most likely structure and that this loop accounts for the published experimental observation that, when CAP is bound to the loop, one of the arms of LacR binds to an alternative site that is displaced from the original site by 5 bp.

Introduction

Understanding the biological functions of macromolecules and their assemblies requires detailed knowledge of their three-dimensional structures. The experimental methods that are used to elucidate biomolecular structure provide data that represent a compromise between the size of the object studied and the amount of detail obtained, e.g., low-resolution electron microscopic images of large multi-component systems sacrifice the precise atomic information found in the high-resolution X-ray structures of small molecules. Computational methods help to overcome this limitation by combining the available experimental data with models that reflect well-known biophysical properties of the molecular systems.

This paper describes the computational approach that we have developed for determining the structures and free energies of multi-protein-DNA assemblies. Our method assumes that the DNA deforms in such an assembly, with the protein components providing spatial constraints on DNA, either by distorting the DNA double-helical structure at the binding site or by placing restrictions on the locations of the ends of an otherwise free DNA segment. Thus, correct determination of the configuration and energy of the constrained DNA leads to a model that provides a very good approximation of the actual structure of the assembly.

We make use of a base-pair level theory of DNA elasticity [1] and an efficient algorithm for calculating the configuration of protein-constrained DNA. One of the important advantages of this procedure is that the DNA and protein molecules are represented as elastic bodies and, consequently, that their deformations can be described using a relatively small number of variables. Such an approach makes the modeling of assemblies with tens of thousands of atoms both computationally feasible and efficient.

We developed this methodology in order to gain deeper understanding of the large protein-DNA assemblies formed during transcription, the first step in the expression of genes, and the communication of the many molecular species involved in the regulation of this process. The transcription of most genes reflects the interplay of activator or repressor proteins that bind in the vicinity of the transcription start site, i.e., the precise location on DNA where RNA polymerase (RNAP), the enzyme that copies the information encoded in DNA into RNA, first binds. The mechanisms of control range from the simple competition of repressors with RNAP for overlapping binding sites on DNA to the recruitment of RNAP by activators near the transcription start site and the so-called “action at a distance” [2] of activators/repressors and RNAP from binding sites separated by hundreds of base pairs (100–500 bp). The DNA in the latter case forms loops, which bring sequentially distant sites close together. The classic example of transcription regulation in which DNA looping plays an important role is the E. coli lac operon, the system of proteins and the DNA sequences responsible for the expression of the enzymes used by the bacterium in lactose metabolism. A 5-bp change in the separation of the nucleotide sites where proteins bind alters the looping ability of DNA and can completely disrupt the regulatory function of the operon [3].

Here we continue our investigation of the DNA loops formed by the binding of the tetrameric Lac repressor (LacR) protein to its recognition sequences within the lac operon. The primary site of attachment of LacR to DNA, called O1, lies 401 and 92 bp away from two secondary sites, respectively called O2 and O3. The LacR complex can adopt either a rigid V-shaped structure or a flexible extended arrangement with dimeric domains connected by a flexible hinge. The DNA can associate with LacR in one of several possible looping modes, depending upon the orientation of the bound nucleotide sequence with respect to the protein. Our earlier work [4] predicts that LacR adopts the extended form when bound to the O3-O1 loop in vitro. The sequence of the wild-type O3-O1 loop of the lac operon contains a binding site for another E. coli protein, the catabolite activator protein (CAP), also called the cAMP receptor protein, located 11 bp away from the O3 LacR recognition site [5]. Other weaker sites of CAP binding within the lac operon do not appear to play a role in transcription activation [6]. Much attention has been given to the question of whether LacR, a repressor of Lac genes, and CAP, an activator of those genes, can bind simultaneously to DNA in the so-called promoter region preceding (upstream of) the transcription start site. Hudson and Fried [7] concluded from enzyme-cutting patterns of DNA that CAP binding places no restrictions on the binding of LacR to the O1 site but precludes binding to the O3 site. They subsequently found [8] that simultaneous binding of CAP and LacR to the O3 site is possible at high concentrations of LacR but that the sequence covered by LacR, the so-called footprint, is shifted approximately 5–6 bp upstream. Balaeff et al. [9] recently constructed a model of the LacR-CAP-DNA loop complex in one of the possible looping modes using the V-shaped LacR structure and an ideal elastic-rod representation of DNA. A 6–8 bp upstream relocation of the O3 site lowers the computed energy of their modeled DNA loop.

Here we determine the configurations and deformational free energies of LacR-CAP-DNA loop complexes using our sequence-dependent treatment of DNA. We take into account all DNA looping modes, the precise structural changes in DNA induced by the binding of LacR and CAP, the possibility of opening LacR, and the binding interactions of LacR with DNA. We find that, in the vicinity of the O3 binding site, there is an alternative LacR binding site, which we call O3*, that preserves nearly all of the observed hydrogen-bonding interactions of DNA with LacR in the complex with the O1 site [10]. When CAP is bound to the loop and LacR to O3*, one of the antiparallel loop types becomes very favorable in terms of its free energy. We propose that this structure is the LacR-mediated loop found by Fried and Hudson [8] in the presence of CAP.

Methods

Sequence-dependent DNA elasticity

The mechanical properties of DNA influence its role in the cell [11]. High-resolution structural studies show that both the intrinsic structure and the elastic properties depend on sequence. Some base-pair steps, i.e., neighboring base pairs, act as natural wedges that change the direction of the helical axis; others are sites of under- or overtwisting relative to the average twist of ca. 36°. The positioning of local bends in phase with the double-helical structure gives the DNA intrinsic curvature [12]–[14], and the positioning of the local stiffness in phase with the helical repeat leads to bending anisotropy [15]. The six rigid-body motions describing the relative rotation and displacement of neighboring base pairs, depicted in Fig. 1, can be strongly coupled [16]. The untwisting of adjacent residues frequently induces an increase in roll, the parameter that describes the component of bending with predominant influence on the widths of the major and minor grooves. Untwisting can also induce a decrease in slide, the parameter describing the motion of a base-pair plane along its long axis.

Fig. 1

(A) Model of a base-pair step showing the vector $r^n = x^{n+1} - x^n$ that connects the origins of successive residues and the orthonormal frames $(d_1^n, d_2^n, d_3^n)$ and $(d_1^{n+1}, d_2^{n+1}, d_3^{n+1})$ on the base pairs. Each base is covalently bonded at the darkened corner to one of the two sugar-phosphate chains. The minor-groove edges of base pairs are shaded in gray, and the antiparallel 5′-3′ directions of the complementary strands are denoted by the arrows at the edges.

(B) Schematic representation of kinematical variables describing the relative orientation and displacement of base pairs in a step. Images illustrate positive values of the designated variables with respect to the leading strand, denoted in (A) by the arrow on the left.

The elastic theory used here captures these deformational features of DNA and also accounts for the dependence of the mechanical properties on nucleotide sequence [1], [17]. The DNA configuration is specified by giving, for each base pair, numbered by index $n$, its location $x^n$ in space and its orientation described by an embedded orthonormal frame $(d_1^n, d_2^n, d_3^n)$. The relative orientation and position of the base pair and its predecessor are specified by six kinematical variables $(\xi_i^n)_{i=1}^{6} = (\theta_1^n, \theta_2^n, \theta_3^n, \rho_1^n, \rho_2^n, \rho_3^n)$, termed, respectively, tilt, roll, twist, shift, slide, and rise (see Fig. 1 for representative images and the Appendix for a convenient definition of the kinematical variables) [18].

Let us write $\Xi = \{\xi_i^n\}_{i=1,\dots,6}^{n=1,\dots,N-1}$ for the configuration of a DNA segment of length $N$. The elastic energy Ψ = Ψ(Ξ) is taken to be the sum of the base-pair step energies $\psi^n$, each of which is a quadratic function of the corresponding six variables $\xi_i^n$, i.e.,

$$\Psi = \sum_{n=1}^{N-1} \psi^n, \qquad \psi^n = \frac{1}{2} \sum_{i=1}^{6} \sum_{j=1}^{6} F_{ij}^{XY} \Delta\xi_i^n \Delta\xi_j^n; \tag{1}$$

here XY is the nucleotide sequence (in the direction of the coding strand) of the $n$th base-pair step, $\Delta\xi_i^n = \xi_i^n - \bar{\xi}_i^{XY}$ are the deviations of the variables from their intrinsic values $\bar{\xi}_i^{XY}$, and $F_{ij}^{XY}$ are the elastic moduli. We use empirical estimates of intrinsic values and moduli deduced from the averages and fluctuations of base-pair step parameters in high-resolution DNA-protein complexes [16] and normalized so that the persistence length of mixed-sequence DNA matches observed values (~500 Å) [19].
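As a rough illustration of equation (1), the following Python sketch evaluates the quadratic step energy over all six step parameters and sums it over the steps of a segment. The function names, the diagonal moduli, and the deformed step are hypothetical values chosen only for the example; they are not the empirical parameters of [16]. The rest state uses the twist of ca. 36° and rise of 3.4 Å quoted in the text.

```python
# Variable order in each step: tilt, roll, twist (degrees), shift, slide, rise (angstroms).

def step_energy(xi, xi_bar, F):
    """psi^n = 1/2 * sum_ij F_ij * dxi_i * dxi_j for one base-pair step."""
    d = [a - b for a, b in zip(xi, xi_bar)]
    return 0.5 * sum(F[i][j] * d[i] * d[j] for i in range(6) for j in range(6))

def elastic_energy(steps, rest_states, moduli):
    """Psi = sum of the step energies over the N-1 steps of an N-bp segment."""
    return sum(step_energy(xi, xb, F)
               for xi, xb, F in zip(steps, rest_states, moduli))

# Hypothetical diagonal moduli (illustrative numbers, not fitted values).
F_demo = [[0.0] * 6 for _ in range(6)]
for i, k in enumerate([0.04, 0.02, 0.03, 1.0, 1.5, 8.0]):
    F_demo[i][i] = k

rest = [0.0, 0.0, 36.0, 0.0, 0.0, 3.4]   # intrinsic step (assumed)
bent = [0.0, 5.0, 34.0, 0.0, 0.0, 3.4]   # roll +5 deg, twist -2 deg

psi = step_energy(bent, rest, F_demo)
Psi = elastic_energy([bent, bent], [rest, rest], [F_demo, F_demo])
```

In a sequence-dependent calculation, `rest_states` and `moduli` would be looked up per step from the dimer type XY; off-diagonal entries of `F` carry couplings such as that between twist and roll.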

Electrostatic energy of DNA

We account for the polyelectrolyte nature of DNA by adding to the total energy the contribution resulting from the repulsion between negatively charged phosphate groups on the DNA backbone. The electrostatic energy Φ is a sum of pairwise interactions, adjusted for counterion condensation and screening effects due to ionic environment. For simplicity it is convenient to merge the two charges associated with each base pair into a single charge of twice the magnitude located at the centroid of the base pair, in which case the energy Φ becomes:

$$\Phi = \frac{(2\delta)^2}{4\pi\varepsilon} \sum_{m=1}^{N-2} \sum_{n=m+2}^{N} \frac{\exp(-\kappa |r^{mn}|)}{|r^{mn}|}, \tag{2}$$

where $r^{mn} = x^m - x^n$ is the position vector connecting the centers of base pairs $m$ and $n$, δ is the net effective charge of the base pair, taken to be 0.48e or $7.7 \times 10^{-20}$ Coulombs assuming 76% charge neutralization by condensed cations [20], ε is the permittivity of water at 300 K, and κ is the Debye screening parameter, which, for monovalent salt such as NaCl, depends on the molar salt concentration $c$ as $\kappa = 0.329\sqrt{c}$ Å$^{-1}$. The merging of charges reduces the amount of computer time needed to evaluate Φ and introduces, in our experience, only a small error in the resulting configurations [unpublished results]. The sum in (2) does not include the electrostatic interaction of nearest neighbors, because we assume that such an interaction is already accounted for in the local elastic terms in (1). Thus, equation (2) represents only nonlocal (i.e., long-range) effects.
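The structure of the double sum in (2) can be sketched in a few lines of Python. The example below computes the Debye screening parameter from the salt concentration and the dimensionless part of the sum, skipping nearest neighbors as described above. The function names and the straight test rod are illustrative assumptions; the physical prefactor $(2\delta)^2/4\pi\varepsilon$ is left to the caller so that no unit convention is imposed here.

```python
import math

def debye_kappa(c_molar):
    """Debye screening parameter in 1/angstrom: kappa = 0.329 * sqrt(c)."""
    return 0.329 * math.sqrt(c_molar)

def screened_sum(centers, kappa):
    """Sum of exp(-kappa*r)/r over base-pair pairs (m, n) with n >= m + 2.

    centers: list of (x, y, z) base-pair centers in angstroms.
    Multiply the result by (2*delta)**2 / (4*pi*eps) to obtain Phi.
    """
    total = 0.0
    n_bp = len(centers)
    for m in range(n_bp - 2):
        for n in range(m + 2, n_bp):      # nearest neighbors are excluded
            r = math.dist(centers[m], centers[n])
            total += math.exp(-kappa * r) / r
    return total

# Hypothetical test geometry: a straight 10-bp rod with 3.4 angstrom rise.
rod = [(0.0, 0.0, 3.4 * i) for i in range(10)]
kappa = debye_kappa(0.1)                  # 0.1 M monovalent salt (assumed)
s = screened_sum(rod, kappa)
```

The cost of the sum grows quadratically with the number of base pairs, which is why merging the two phosphate charges of each base pair into one saves time.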

Constraints on DNA configurations

The binding of proteins imposes several types of constraints on DNA in macromolecular complexes.

(A) Intrinsic structure

The binding of a protein to DNA between base pairs $n_1$ and $n_2$ alters the intrinsic values $\bar{\xi}_i^n$ and moduli $F_{ij}^n$, $n_1 \le n < n_2$, $i,j = 1,\dots,6$, of the contacted base-pair steps. Good estimates of these quantities can be found from analysis of available high-resolution structures of the protein-DNA complex of interest [21]. If no such structures exist, one can approximate the configuration of the bound complex with a related structure or a so-called homology model [22]. We generally assume that the configuration of protein-bound DNA is identical to the configuration of the crystal structure, and that that configuration is rigid and unaffected by the deformation of adjacent DNA, i.e., $\xi_i^n$, $n_1 \le n < n_2$, $i = 1,\dots,6$, will be held fixed for each bound protein.

(B) Rigid end conditions

Some proteins bind two DNA sites simultaneously and impose spatial constraints, i.e., end conditions, on the DNA segment between the binding sites. Such proteins, if rigid, will impose constraints on the relative position and orientation of the bound DNA segments. If the bound DNA is also rigid, a relation between two base pairs, one from each bound segment, is sufficient to describe the constraint:

$$d_i^m \cdot d_j^n = D_{ij}^{mn}, \qquad d_i^m \cdot (x^n - x^m) = r_i^{mn}. \tag{3}$$

(C) Semi-rigid end conditions

In some circumstances, the protein can deform via one or several degrees of freedom αk, k = 1,…,K. In such cases the relative position and orientation of the bound DNA segments are functions of the degrees of freedom:

$$d_i^m \cdot d_j^n = g_{ij}^{mn}(\{\alpha_k\}), \qquad d_i^m \cdot (x^n - x^m) = h_i^{mn}(\{\alpha_k\}). \tag{4}$$

(D) Cyclization

Ring closure imposes special configurational constraints on a DNA segment. Such constraints can be accounted for by introducing an additional, hypothetical $(N+1)$th base pair and then requiring that the position and orientation of that base pair be identical to the position and orientation of the first base pair:

$$x^1 = x^{N+1}, \qquad d_i^1 = d_i^{N+1}. \tag{5}$$

(E) Linking number

A circular DNA or a DNA closed by binding to a protein at two or more sites is subject to the topological constraint of fixed linking number. The linking number of circular DNA is an integer that represents how many times the two strands of the molecule are intertwined [23],[24]. For closed DNA, the linking number cannot be changed unless one of the strands of DNA is severed. The linking number of a DNA segment bound to a protein at two ends, however, can be changed. The linking number is defined in this case by introducing a virtual closure for each strand and can be changed by dissociation of the protein from the DNA.

The linking number Lk(Ξ) of two piecewise linear curves $\{x^n + \varepsilon d_2^n\}_{n=1}^{N}$ and $\{x^n - \varepsilon d_2^n\}_{n=1}^{N}$ representing the DNA strands can be computed using the results of [25]. An important consequence of Lk(Ξ) being an integer invariant is that it does not change when the configuration Ξ is perturbed infinitesimally. Therefore it does not contribute to the Lagrangian (see below) and serves primarily to classify computed equilibrium configurations.

We denote the above collection of constraints (A)–(E) imposed on the configuration Ξ of the DNA under consideration as:

$$C_j(\Xi) = 0, \qquad j = 1,\dots,M. \tag{6}$$

(F) Soft end conditions

There are proteins that bind to DNA in two places but contain a flexible polypeptide tether between the DNA binding subunits. Such proteins do not impose a rigid constraint on the relative location and orientation of the bound DNA, but rather provide an additional contribution to the total energy of the system:

Θ=Θ(Ξ). (7)

For example, for polypeptide linkers one may take the energy derived from the radial distribution function, $\Theta(\Xi) = -kT \ln p(r(\Xi))$, where $p(r) = c\pi r^2 \exp(-a(r-b)^2)$ (see [26]) and $r(\Xi)$ is the distance between anchoring points on subunits bound to DNA with configuration Ξ. This added energy Θ can also be used to take account of conservative applied external forces, as in single-molecule manipulation experiments [27].
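A minimal Python sketch of the tether energy $\Theta = -kT \ln p(r)$ is given below, assuming the radial distribution $p(r) = c\pi r^2 \exp(-a(r-b)^2)$ quoted above. The parameter values for `a` and `b` are hypothetical placeholders rather than fitted linker constants, and the helper `most_probable_distance` is added only for illustration; it follows from setting the derivative of $\ln p$ to zero, which gives $r^2 - br - 1/a = 0$.

```python
import math

def tether_energy(r, a, b, c=1.0, kT=1.0):
    """Theta = -kT * ln p(r) with p(r) = c*pi*r^2*exp(-a*(r - b)^2)."""
    p = c * math.pi * r ** 2 * math.exp(-a * (r - b) ** 2)
    return -kT * math.log(p)

def most_probable_distance(a, b):
    """Positive root of r^2 - b*r - 1/a = 0, where Theta is minimal."""
    return 0.5 * (b + math.sqrt(b * b + 4.0 / a))

# Hypothetical linker parameters (angstrom units assumed).
a_demo, b_demo = 0.01, 30.0
r_star = most_probable_distance(a_demo, b_demo)
theta_min = tether_energy(r_star, a_demo, b_demo)
```

The minimum lies slightly beyond `b` because the $r^2$ factor in the radial distribution favors larger separations.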

(G) DNA-DNA steric hindrance

Although the total energy contains an electrostatic term that penalizes configurations in which two parts of the same DNA molecule come into close proximity, it does not, by itself, prevent overlap of computed configurations. Thus, one must include constraints of steric hindrance to ensure that no points in two distinct DNA base pairs occupy the same position in space. There are, in fact, two situations that must be avoided: (a) cases in which the DNA is bent to such a degree that two neighboring base pairs intersect and (b) cases in which two DNA segments come together in such a way that sequentially distant base pairs intersect. For a DNA represented as a tube with a continuous axial curve, a convenient way to enforce both constraints at once is to restrict the global curvature of the tube [28]. That method is not appropriate for the discrete model because of the possibility of shearing deformations. Instead we use a simple and efficient approximation to enforce the two constraints. Specifically, we treat each base pair $n$ as a disc $D^n$ of diameter $d = 20$ Å centered at $x^n$ with normal $d_3^n$ and require (a) that for each $n$ the discs $D^n$ and $D^{n+1}$ do not intersect and (b) that the centers of $D^m$ and $D^n$ are separated by more than $d$ for any $m$, $n$ such that $|m - n| > \pi d/(2\langle\bar{\rho}_3\rangle)$ (where $\langle\bar{\rho}_3\rangle = 3.4$ Å is the average rise in undeformed DNA). Both constraints (a) and (b) can be represented as inequalities

$$E_k(\Xi) \le 0, \qquad k = 1,\dots,K. \tag{8}$$
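Condition (b) lends itself to a short check. The Python sketch below, with hypothetical function names, computes the smallest index separation to which the condition applies and lists the pairs of sequentially distant base pairs whose centers lie within one disc diameter. Condition (a), the intersection test for neighboring discs, requires the disc normals and is not implemented here.

```python
import math

D = 20.0     # disc diameter d in angstroms (value from the text)
RISE = 3.4   # average rise in undeformed DNA, angstroms (value from the text)

def min_separation_index(d=D, rise=RISE):
    """Smallest integer |m - n| satisfying |m - n| > pi*d / (2*rise)."""
    return math.floor(math.pi * d / (2.0 * rise)) + 1

def distant_contact_violations(centers, d=D, rise=RISE):
    """Pairs (m, n) of distant base pairs whose centers are not more than d apart."""
    k_min = min_separation_index(d, rise)
    bad = []
    for m in range(len(centers)):
        for n in range(m + k_min, len(centers)):
            if math.dist(centers[m], centers[n]) <= d:
                bad.append((m, n))
    return bad

# A straight 30-bp rod should produce no violations.
straight = [(0.0, 0.0, RISE * i) for i in range(30)]
violations = distant_contact_violations(straight)
```

With $d = 20$ Å and a rise of 3.4 Å the threshold $\pi d/(2 \times 3.4)$ is about 9.24, so the check starts at base pairs ten or more steps apart.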

(H) Protein-DNA steric hindrance

In addition to constraining the steric interactions of DNA with itself, the DNA must be prevented from sterically interfering with the bound proteins and the bound proteins from interfering with one another. These additional constraints can be implemented by introducing a surface for each bound protein and requiring that no sphere Sn intersects any protein surface and no two protein surfaces intersect each other. It should be noted that the detailed features of protein surfaces can greatly increase the number of equilibrium configurations of the assembly and therefore, in the interest of computational efficiency, the proteins can be roughly approximated by appropriate ideal geometrical bodies, such as ellipsoids. The resulting constraints will again have the form (8).

Classical mechanics

We are concerned with two goals: (i) computing the locally stable configurational states of the constrained DNA, and (ii) describing the likelihood of occurrence of each of the states in a thermally excited environment. We omit consideration of the dynamics of DNA or the rates of transitions between configurational states.

Statistical mechanics tells us that the most likely configuration of DNA is the one for which the total energy E of the DNA,

E(Ξ)=Ψ(Ξ)+Φ(Ξ)+Θ(Ξ), (9)

i.e., the sum of the elastic energy Ψ, the electrostatic energy Φ, and the total energy of deformable proteins and applied loading Θ, is minimized subject to the imposed constraints (6) and (8). Such a configuration is clearly one for which the first variation in E vanishes for any perturbation. Following the standard approach to nonlinear constrained optimization [29] we introduce the Lagrangian L(Ξ,Γ,Λ) with Lagrange multiplier constants $\Gamma = \{F_j\}_{j=1}^{M}$ and $\Lambda = \{G_j\}_{j=1}^{K}$ as follows:

$$L(\Xi,\Gamma,\Lambda) = E(\Xi) + \sum_{j=1}^{M} F_j C_j(\Xi) + \sum_{j=1}^{K} G_j E_j(\Xi). \tag{10}$$

A necessary condition for a configuration Ξ* obeying constraints (6) and (8) to be in equilibrium is that there are multipliers Γ* and Λ* ≥ 0 such that Ξ* obeys the Kuhn-Tucker equations [30]:

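$$\nabla_\Xi L(\Xi^*,\Gamma^*,\Lambda^*) = \nabla_\Xi E(\Xi^*) + \sum_{j=1}^{M} F_j^* \nabla_\Xi C_j(\Xi^*) + \sum_{k=1}^{K} G_k^* \nabla_\Xi E_k(\Xi^*) = 0, \tag{11a}$$
$$C_j(\Xi^*) = 0, \qquad j = 1,\dots,M, \tag{11b}$$
$$G_k^* E_k(\Xi^*) = 0, \qquad k = 1,\dots,K. \tag{11c}$$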

Note that the variational equation (11a) expresses the laws of balance of forces and moments acting on the $n$th base pair [1]:

$$f^n - f^{n-1} = g^n, \qquad m^n - m^{n-1} = f^n \times r^n + n^n, \qquad n = 2,\dots,N, \tag{12}$$

where $f^n$ and $m^n$ are the force and moment exerted on the $n$th base pair by the $(n+1)$th base pair, and $g^n$ and $n^n$ are the external force and moment acting on the $n$th base pair, which result from the electrostatic interaction or arise as Lagrange multipliers corresponding to constraints (6) and (8). Our assumption about the charges centralized in the base pairs implies (see [31]) that if Θ = 0 and self-contact is absent, then $n^n = 0$ and

$$g^n = \frac{(2\delta)^2}{4\pi\varepsilon} \sum_{\substack{m=1 \\ |m-n| \ge 2}}^{N} \frac{(1 + \kappa |r^{mn}|)\exp(-\kappa |r^{mn}|)}{|r^{mn}|^3}\, r^{mn}. \tag{13}$$

As shown in [1], equations (1) and (12) imply that

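$$f^n \cdot d_i^n = \sum_{j=1}^{3} Q_{ij}^n \frac{\partial \psi^n}{\partial \rho_j^n} = \sum_{j=1}^{3} \sum_{k=1}^{6} Q_{ij}^n F_{(j+3)k}^{XY} \Delta\xi_k^n, \tag{14}$$
$$m^n \cdot d_i^n = \sum_{j=1}^{3} \Gamma_{ij}^n \left( \sum_{s=1}^{6} F_{js}^{XY} \Delta\xi_s^n + \sum_{k=1}^{3} \sum_{l=1}^{3} \Lambda_{kl}^{nj} \rho_l^n \sum_{s=1}^{6} F_{(j+3)s}^{XY} \Delta\xi_s^n \right), \tag{15}$$

where the matrices $Q_{ij}^n$, $\Gamma_{ij}^n$, and $\Lambda_{kl}^{nj}$ (not to be confused with the Lagrange multipliers introduced above) depend on $(\xi_i^n)$ as shown in the Appendix.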

A solution Ξ* of equations (11) (or, equivalently, equations (11b), (11c), and (12)–(15)) is in equilibrium in the sense that the first variation of the total energy vanishes for all perturbations. Such a solution is not necessarily the configuration globally minimizing E, but can be any metastable or unstable equilibrium configuration corresponding to the set of constraints. The stability of an equilibrium configuration Ξ* can be verified by checking that the constrained Hessian,

$$\nabla_\Xi^2 L(\Xi^*,\Gamma^*,\Lambda^*) = \nabla_\Xi^2 E(\Xi^*) + \sum_{j=1}^{M} F_j^* \nabla_\Xi^2 C_j(\Xi^*) + \sum_{k=1}^{K} G_k^* \nabla_\Xi^2 E_k(\Xi^*), \tag{16}$$

obeys $\nabla_\Xi^2 L(\Xi^*,\Gamma^*,\Lambda^*)[X,X] > 0$ for all normalized perturbations $X$ such that

$$\nabla_\Xi C_j(\Xi^*)[X] = 0, \quad j = 1,\dots,M, \qquad G_k^* \nabla_\Xi E_k(\Xi^*)[X] = 0, \quad k = 1,\dots,K. \tag{17}$$

Equilibrium configurations for which the Hessian is not positive definite are saddle points on the energy surface and can be used to characterize transition points (mountain passes) between locally stable configurations.

Statistical mechanics of the assembly

Each configuration Ξ* that locally minimizes the energy E is a possible state of the system. The probability of occurrence of a state Ξ* is proportional to the integral

$$Z(\Xi^*) = \int_{B(\Xi^*)} \exp(-E(\Xi)/kT)\, J(\Xi)\, d\Xi, \tag{18}$$

where B(Ξ*) is the basin of attraction of Ξ*, defined as the set of configurations Ξ obeying constraints (6) and (8) from which there is a downhill path to Ξ* but not to any other local minimum of the system. The Jacobian J is included in the probability measure because of the non-canonical choice of independent variables [32]. If Ξ* is a sufficiently deep minimum, one can approximate the integral in (18) by another in which (i) the integration domain is extended from B(Ξ*) to the entire set of configurations that obey the constraints, and (ii) the energy function is replaced (and its definition extended to points not in B(Ξ*)) by a quadratic expansion about Ξ*:¹

$$Z(\Xi^*) \approx \int_{C_j(\Xi)=0} \exp\bigl(-E(\Xi)/kT + \ln J(\Xi)\bigr)\, d\Xi \approx \exp\bigl(-E(\Xi^*)/kT\bigr)\, J(\Xi^*) \int_{\substack{\nabla_\Xi C_j(\Xi^*)[X]=0 \\ G_k^* \nabla_\Xi E_k(\Xi^*)[X]=0}} \exp\left(-\frac{A[X,X]}{2kT}\right) dX, \tag{19}$$

where the quadratic functional A has the form:

$$A = \nabla_\Xi^2 L(\Xi^*,\Gamma^*,\Lambda^*) - 2kT\, \nabla_\Xi^2 \ln J(\Xi^*). \tag{20}$$

The error of this approximation depends on the depth of the potential well corresponding to the minimum, the size of B(Ξ*), and the departure of L from a quadratic function. Therefore it is difficult to assess this error a priori except for very special cases. The presence of constraints in the integration domain can be treated by using a reduced set of perturbations Y that belong to the joint nullspace of the linear operators $\nabla_\Xi C_j(\Xi^*)$ and $G_k^* \nabla_\Xi E_k(\Xi^*)$ (see next section).

If multiple equilibrium states $\Xi_1^*, \Xi_2^*, \dots, \Xi_K^*$ are present for given constraints (6) and (8), the probability of each such state is computed as:

$$P(\Xi_j^*) = Z(\Xi_j^*) \Big/ \sum_{i=1}^{K} Z(\Xi_i^*), \qquad j = 1,\dots,K. \tag{21}$$

The aforementioned method can be used to calculate many cases of interest, such as (i) the free energies of multiple states of a single topoisomer, or (ii) the free energies of topoisomers, by relaxing the constraint of fixed Lk while keeping the constraint of closure.

The free energy $G_{\mathrm{DNA}}$ of a state Ξ* is given by:

$$G_{\mathrm{DNA}} = -kT \ln Z(\Xi^*). \tag{22}$$

The looping free energy difference $\Delta G_{\mathrm{DNA}}$ can be estimated as the difference between $G_{\mathrm{DNA}}$ and the free energy $G_{\mathrm{free}}$ of a state with the closure conditions (B), (C) and (D) relaxed. The energy $\Delta G_{\mathrm{DNA}}$ describes only the contribution from DNA deformation and does not include the protein-DNA binding energies, which must be added in order to assess the probability of loop formation under experimental conditions. Nonetheless, $\Delta G_{\mathrm{DNA}}$ can be used to compare the probability of formation of distinct states (loop types) in which the number, type, and location of bound proteins are the same.
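Combining (21) and (22), the relative populations of competing states follow from their free energies as Boltzmann weights. The Python sketch below, with a hypothetical function name and made-up free energies, shows one numerically safe way to do this; subtracting the smallest free energy before exponentiating avoids overflow and leaves the ratios unchanged.

```python
import math

def state_probabilities(free_energies_kT):
    """P_j = Z_j / sum_i Z_i with Z_j = exp(-G_j / kT).

    free_energies_kT: free energies of the states in units of kT.
    """
    g_min = min(free_energies_kT)
    weights = [math.exp(-(g - g_min)) for g in free_energies_kT]
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical free energies (in kT) of three loop types.
probs = state_probabilities([33.0, 35.5, 38.0])
```

Only free-energy differences matter here, which is why the comparison is meaningful for states that share the same number, type, and location of bound proteins.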

Although the method above has been applied to computations of looping and ring-closure probability, the possible errors associated with comparing the closed and open states have not been thoroughly examined. Thus, for the computation of closure probabilities and the cyclization factor it is better to use a Monte Carlo procedure [34], which can yield such quantities efficiently and with high precision.

Computational procedure

Equations (11) represent a system of non-polynomial algebraic equations in the variables $\Xi = \{\xi_i^n\}_{i=1,\dots,6}^{n=1,\dots,N-1}$. If the only contribution to the energy E is the elastic energy Ψ (as was the case in [1] and [17]), then the equations form a weakly coupled system that can be solved for base pairs 1, 2, …, N consecutively in terms of the moments and forces applied to the first base pair, in a way similar to solving an initial value problem for an ordinary differential equation. In the general case, the Hessian matrix for the system is full. A solution Ξ* can be found by solving the system of algebraic equations (11) numerically using, for example, the Levenberg-Marquardt procedure starting from an appropriate initial configuration. Multiple solutions can be found by randomizing starting configurations for the algorithm.

A more convenient approach is to use a continuation method with one of the constraints depending on a homotopy parameter $0 \le \lambda \le 1$ in such a way that λ = 0 corresponds to constraints giving a known solution, e.g., the stress-free state, and λ = 1 corresponds to the constraints for which a solution is sought. One commonly employed situation is that in which various topoisomers are found by relaxing the constraint of fixed angle ϕ between the $d_1$ vectors at the two ends of DNA. Knowing the solution for one topoisomer, one can compute another topoisomer by varying ϕ over the interval [0, 2π]. Other variable parameters may be the length N, various matrix elements in (3) and (4), or the diameter d.
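The idea of continuation can be illustrated on a toy scalar problem. The Python sketch below tracks a root of a hypothetical equation $g(x, \lambda) = 0$ from a known solution at λ = 0 to the target at λ = 1, reusing each converged solution as the starting guess for the next value of λ. Plain Newton iteration is used here for brevity as a stand-in for the solver applied to the full system (11); the test function, step count, and tolerances are assumptions made for the example.

```python
def newton(g, dg, x0, tol=1e-12, max_iter=50):
    """Solve g(x) = 0 by Newton iteration started from x0."""
    x = x0
    for _ in range(max_iter):
        step = g(x) / dg(x)
        x -= step
        if abs(step) < tol:
            return x
    raise RuntimeError("Newton iteration did not converge")

def continuation(g, dg, x_start, n_steps=20):
    """Track a root of g(x, lam) = 0 from lam = 0 to lam = 1."""
    x = x_start
    for k in range(1, n_steps + 1):
        lam = k / n_steps
        # The previous solution is the initial guess for the next lam.
        x = newton(lambda y: g(y, lam), lambda y: dg(y, lam), x)
    return x

# Toy problem: g(x, lam) = x^3 + x - 10*lam; x = 0 solves the lam = 0 case.
root = continuation(lambda x, lam: x ** 3 + x - 10.0 * lam,
                    lambda x, lam: 3.0 * x ** 2 + 1.0,
                    x_start=0.0)
```

Small steps in λ keep each starting guess close to the next solution, which is what makes the approach more robust than a single solve from a random initial configuration.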

Once an equilibrium configuration Ξ* is found, its stability is verified by computing the constrained Hessian (16). If Ξ* is stable, then its partition function Z can be computed approximately using (19) and (20): the quadratic functional A in (20) and an orthogonal basis B for the joint nullspace of $\nabla_\Xi C_j(\Xi^*)$ (with j = 1,…,M) and $G_k^* \nabla_\Xi E_k(\Xi^*)$ (with k = 1,…,K) are computed, and the integral in (19) is converted to a standard multivariate Gaussian integral and evaluated explicitly as:

$$\int \exp\left(-\frac{y^T B^T A B\, y}{2kT}\right) dy_1 \cdots dy_M = \sqrt{\frac{(2\pi kT)^M}{\det(B^T A B)}}, \tag{23}$$

where M here denotes the number of basis vectors in B, i.e., the dimension of the nullspace.

In all computations full use is made of symbolic algebra software (Maple 10 by Maplesoft) and automatic differentiation [35].
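The closed-form Gaussian integral used for the partition function can be checked numerically in a simple case. The Python sketch below assumes the reduced matrix $B^T A B$ has already been diagonalized, so that only its eigenvalues are needed, and writes the factor kT explicitly; it compares the closed form $\sqrt{(2\pi kT)^M/\det}$ with a midpoint-rule quadrature in one dimension. The function names and numerical values are hypothetical.

```python
import math

def gauss_integral(eigs, kT=1.0):
    """sqrt((2*pi*kT)^M / det) for a positive-definite matrix with eigenvalues eigs."""
    det = math.prod(eigs)
    return math.sqrt((2.0 * math.pi * kT) ** len(eigs) / det)

def numeric_1d(a, kT=1.0, n=20000):
    """Midpoint rule for the integral of exp(-a*y^2 / (2*kT)) over +/- 10 sigma."""
    sigma = math.sqrt(kT / a)
    lo = -10.0 * sigma
    h = 20.0 * sigma / n
    return h * sum(math.exp(-a * (lo + (i + 0.5) * h) ** 2 / (2.0 * kT))
                   for i in range(n))

exact = gauss_integral([2.5], kT=0.6)
approx = numeric_1d(2.5, kT=0.6)
```

Because the exponent is separable in the eigenbasis, the multidimensional result is the product of the one-dimensional integrals, one per eigenvalue.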

Proteins

Lac repressor

Each arm of the LacR tetramer (represented schematically by boxes I and III in Fig. 2C) contains two polypeptide chains. A four-helix bundle tetramerization domain, located at the base of the “V” (represented by cube II in Fig. 2C), holds the two halves of the complex in place. A small contact interface between the dimeric fragments (located in the circle in Fig. 2A) further stabilizes the complex. Disruption of that contact interface is expected to promote the opening of the V-shaped structure. Electron microscopic images of freeze-etched samples of LacR [36] show extended forms that are presumably obtained by opening the “V” to the extent that the tetramerization domain occupies the middle of the assembly and the DNA binding sites lie at opposite ends of the complex.² As in [4], we treat the opening of the tetramer as a concerted motion of three rigid domains (I, II, and III) about two hinges (Fig. 2C). For simplicity we assume that the two hinge angles, i.e., the angle between $l_{\mathrm{I}}$ and $l_{\mathrm{II}}$ and that between $l_{\mathrm{II}}$ and $l_{\mathrm{III}}$, shown in Fig. 2C, are identical and hence each equal to one half of the total angle of opening α. Additional flexibility is provided by allowing the rotation of domains I and III about axes $l_{\mathrm{I}}$ and $l_{\mathrm{III}}$ by equal amounts β, where −90° ≤ β ≤ 90°. For the V-shaped configuration found in the crystal, α = 34° and β = 33°. We assume that LacR can exist in two states: (i) the aforementioned V-shaped arrangement [38],[39] or (ii) a flexible state with two degrees of freedom, α and β. The free-energy penalty for opening LacR, $G_{\mathrm{LacR}}$, has been estimated [4] to be between 1.8 kT and 3.8 kT.³

Fig. 2

(A) Model of the structure of the tetrameric Lac repressor protein (LacR) in complex with O1 and O3 operator segments, obtained by composition of available X-ray data (see Methods). The black spheres on protein represent the Cα atoms of Gln 335 and those on DNA the P atoms of the central base pairs. Color-coding denotes the protein monomers and DNA chains in closest contact at the highlighted P atoms. The black circle marks the dimer contact interface found in the crystal structure.

(B) DNA loop types. The color-coded arrows depict the 5′-3′ directions of the sequence strand on LacR in the four possible orientations of DNA on the tetramer. The colors correspond to those of the associated DNA and protein chains in part (A).

(C) Schematic representation of LacR opening. The rigid domains I (residues 1–332 of chains A and B and the DNA bound to these chains) and III (corresponding residues of chains C and D and the bound DNA) are connected to domain II (residues 340–354 of chains A, B, C, D) by two hinges. The axes of rotational symmetry of the three domains are $l_{\mathrm{I}}$, $l_{\mathrm{II}}$, and $l_{\mathrm{III}}$. Chains A–D correspond respectively to the proteins shown in (A) in violet, yellow, green, and red.

(D) Schematic representation of the closures of DNA strands used in the computation of linking number. The top closure is appropriate for antiparallel loops and the bottom for parallel and extended loops.

For our calculations we assign to the protein-bound segments of DNA a three-dimensional structure that is in accord with currently available crystallographic data. Because the crystal structure of the LacR tetramer with one dimer bound to the O3 sequence and the other to O1 has not been determined, we employ a model built by superposing the 2.6-Å resolution structure [40] of the LacR dimer complexed with the Osym operator (PDB_id: 1EFA) and the 2.7-Å structure [39] of the LacR tetramer lacking DNA-binding headpieces (PDB_id: 1LBI). We assume that the bound O3 operator adopts the same structure as the bound O1 operator and the complex is symmetric.

The DNA operators can be oriented in one of two ways with respect to each protein dimer, with the 5′-3′ direction of the coding strand pointing inside or outside the V-shaped reference state (Fig. 2B). The combination of possible DNA orientations for each dimer gives rise to four possible DNA looping modes for the repressor assembly (eight possible modes if the V-shaped complex is asymmetric [41]). Following the notation of Geanacopoulos et al. [42] and our earlier work [4], we denote these modes A1, A2, P1, and P2, where A and P refer, respectively, to antiparallel and parallel orientations of the operators (see Fig. 2B). Because the core regions of the protein monomers are congruent, there appears to be no a priori preference for a given orientation of DNA on the protein.

For the computation of the linking number, Lk, we introduce virtual closures of the two DNA strands through the tetramer assembly (see Fig. 2D). Each of these closures originates at the phosphorus atom on one of the DNA strands attached to the central base pair of the O3 operator, passes through the Gln 335 Cα atom of the LacR chain that makes direct contact with the 5′-end of the strand [10], continues through a second Gln 335 Cα atom in the other half of the protein assembly, and terminates at the corresponding phosphorus atom on the O1 operator in such a way that the linked phosphorus atoms lie on the same DNA strand.
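Once the virtual closures turn each strand into a closed curve, a linking number is well defined. The following is a minimal sketch of one standard way to evaluate it, a midpoint-rule discretization of the Gauss double integral for two closed polygons. It is not claimed to be the algorithm used in this work; the function names and the two test circles are illustrative assumptions.

```python
import math

def sample_closed_curve(point_fn, n):
    """Return n points of the closed curve point_fn(t), t in [0, 2*pi)."""
    return [point_fn(2.0 * math.pi * k / n) for k in range(n)]

def gauss_linking_number(curve_a, curve_b):
    """Midpoint-rule estimate of the Gauss linking integral of two closed polygons."""
    total = 0.0
    na, nb = len(curve_a), len(curve_b)
    for i in range(na):
        a0, a1 = curve_a[i], curve_a[(i + 1) % na]
        da = [a1[k] - a0[k] for k in range(3)]          # segment vector on curve A
        ma = [0.5 * (a0[k] + a1[k]) for k in range(3)]  # segment midpoint on curve A
        for j in range(nb):
            b0, b1 = curve_b[j], curve_b[(j + 1) % nb]
            db = [b1[k] - b0[k] for k in range(3)]
            mb = [0.5 * (b0[k] + b1[k]) for k in range(3)]
            r = [ma[k] - mb[k] for k in range(3)]
            dist = math.sqrt(r[0] ** 2 + r[1] ** 2 + r[2] ** 2)
            cross = [da[1] * db[2] - da[2] * db[1],
                     da[2] * db[0] - da[0] * db[2],
                     da[0] * db[1] - da[1] * db[0]]
            total += (cross[0] * r[0] + cross[1] * r[1] + cross[2] * r[2]) / dist ** 3
    return total / (4.0 * math.pi)

# Toy check on two singly linked circles (a Hopf link), not on DNA strands.
circle_1 = sample_closed_curve(lambda t: (math.cos(t), math.sin(t), 0.0), 200)
circle_2 = sample_closed_curve(lambda t: (1.0 + math.cos(t), 0.0, math.sin(t)), 200)
lk_estimate = gauss_linking_number(circle_1, circle_2)
print(round(lk_estimate))
```

The sign of the result depends on the orientation assigned to each curve, which is why the closures must connect phosphorus atoms on the same strand.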

Catabolite activator protein

The catabolite activator protein (CAP) is a dimeric protein, with each subunit containing a ligand-binding domain and a DNA-binding domain. The affinity of CAP for its DNA binding site increases upon binding two cAMP molecules, yielding an apparent equilibrium constant of $4.1 \times 10^{7}\ \mathrm{M}^{-1}$ [43]. Upon binding DNA, CAP kinks the double-helical structure sharply at two sites, producing a global bend of 80°±12° [5], [44]. The base-pair step parameters used here to model the 20-bp CAP binding site found between the O3 or O3* and O1 operator sites on DNA correspond to those in the crystal complex of CAP with the consensus binding sequence [5] (PDB_id: 1CGP).
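Because all loop energies below are reported in units of kT, it can help to express the quoted equilibrium constant on the same scale. The conversion below is a rough sketch that assumes a 1 M standard state $c^{\circ}$ and treats the apparent constant as a simple association constant; neither assumption is stated in the text.

$$
\Delta G^{\circ} = -kT \ln\!\left(K c^{\circ}\right) = -kT \ln\!\left(4.1 \times 10^{7}\right) \approx -17.5\ kT
$$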

Results

The above procedure underlies our earlier analyses of (i) the sequence-dependent configurations of closed DNA minicircles [1], [17], (ii) the looping of DNA mediated by LacR [4], and (iii) the determination of the structures of open and closed complexes of RNAP and CAP [6]. Here we report an additional application of our method, extending the analysis of DNA looping mediated by the LacR to cases in which CAP is present.

Figure 3 illustrates representative minimum-energy configurations of the wild-type O3–O1 loop mediated by LacR for various combinations of looping mode, linking number, and LacR conformation. The configurations shown for each looping mode are the two most probable topoisomers. Each of these configurations minimizes the energy of DNA looping at fixed linking number Lk for the given choice of anchoring conditions on the protein (footnote 4). Configurations Ea and Eb (previously called P1E) optimize LacR-opening geometry. The calculated values of ΔGDNA and other characteristics of the configurations are listed in Table 1 (footnote 5). As shown previously [4], the elastic contribution dominates the energies of the preferred configurations, and the Eb loop with flexible LacR has the lowest total free energy. Thus, the Eb loop is predicted to be the most likely arrangement of the DNA-LacR complex in vitro (footnote 6).

Fig. 3.


Representative minimum-energy configurations of LacR-mediated O3-O1 loops with DNA shown in aqua, LacR in red, and operator sites in blue. Geometric and energetic properties of the loops are given in Table 1.

Table 1.

Calculated energy (in kT) and configurational parameters for O3-O1 LacR-mediated DNA loops.

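| Loop | α | β | Lk | Ψ | Φ | GLacR | GDNA | ΔGDNA |
|---|---|---|---|---|---|---|---|---|
| A1a | 34 | 33 | 9 | 32.1 | 71.9 | | 117.7 | 36.5 |
| A1b | 34 | 33 | 10 | 39.1 | 70.0 | | 122.7 | 41.4 |
| A2a | 34 | 33 | 8 | 33.1 | 71.6 | | 118.0 | 36.7 |
| A2b | 34 | 33 | 9 | 41.6 | 70.3 | | 125.3 | 44.1 |
| P1a | 34 | 33 | 9 | 38.8 | 73.0 | | 123.4 | 42.2 |
| P1b | 34 | 33 | 10 | 62.5 | 71.3 | | 145.4 | 64.1 |
| P2a | 34 | 33 | 9 | 71.3 | 73.4 | | 145.4 | 77.4 |
| P2b | 34 | 33 | 10 | 45.7 | 70.8 | | 130.4 | 49.1 |
| Ea | | | 8 | 41.2 | 70.0 | 2.8±1 | 127.2±1 | 45.9±1 |
| Eb | | | 9 | 23.1 | 68.3 | 2.8±1 | 107.4±1 | 26.1±1 |
| Free | | | | 0 | 68.2 | | 81.2 | |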

Loops denoted by labels in Figure 3; LacR deformation angles (α, β) and closed pathway used to calculate Lk defined in Figure 2; Ψ: elastic energy; Φ: electrostatic energy at 10 mM salt; GLacR: free energy of LacR opening; GDNA: free energy of LacR-mediated loop at room temperature under the given ionic conditions; ΔGDNA: free energy difference between loop and “free” DNA with bound LacR dimers. “Free” refers to the unconstrained linear DNA chain of the same wild-type (O3-O1) sequence: GGCAGTGAGCGCAACGCAATTAATGTGAGTTAGCTCACTCATTAGGCACCCCAGGCTTTACACTTTATGCTTCCGGCTCGTATGTTGTGTGGAATTGTGAGCGGATAACAATT. Here the O3 and O1 sequences are shown in boldface.
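To see how strongly the Eb loop is favored, the ΔGDNA column of Table 1 can be converted into relative Boltzmann populations. The sketch below assumes that ΔGDNA (in kT) is the only quantity distinguishing the loops and ignores the ±1 kT uncertainty quoted for the extended forms; the function name is illustrative.

```python
import math

# Delta G_DNA values in kT, copied from the last column of Table 1.
DELTA_G = {
    "A1a": 36.5, "A1b": 41.4, "A2a": 36.7, "A2b": 44.1,
    "P1a": 42.2, "P1b": 64.1, "P2a": 77.4, "P2b": 49.1,
    "Ea": 45.9, "Eb": 26.1,
}

def boltzmann_fractions(delta_g):
    """Normalized weights exp(-dG), shifted by the minimum for numerical safety."""
    g_min = min(delta_g.values())
    weights = {name: math.exp(-(g - g_min)) for name, g in delta_g.items()}
    total = sum(weights.values())
    return {name: w / total for name, w in weights.items()}

fractions = boltzmann_fractions(DELTA_G)
best = max(fractions, key=fractions.get)
for name, frac in sorted(fractions.items(), key=lambda item: -item[1]):
    print(f"{name}: {frac:.3e}")
```

Under these assumptions the 10.4 kT gap between Eb and the next loop (A1a) leaves every other configuration with a population below one part in ten thousand.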

Complexes with CAP

In principle, there is no steric hindrance preventing the simultaneous binding of CAP to its DNA recognition site and LacR to the O3 site on the E. coli lac operon. Representative configurations and free energies of the O3-O1 loops with bound CAP are reported, respectively, in Figure 4 and Table 2. The free energies ΔGDNA of CAP-bound DNA loops anchored to the V-shaped LacR structure exceed, by at least 5 kT, those of the corresponding CAP-free structures. The outer surface of CAP (i.e., the surface antipodal to the DNA binding site) makes unfavorable steric contacts with LacR or DNA in some configurations (A1c, A1d, A2d, P1c, and P1d), and the bending of the activator protein is ~180° out of phase with the bending of the LacR-mediated loop in others (A2c and A2d). The Ed configuration is much lower in free energy than all other configurations and minimizes ΔGDNA with a value comparable to that of the Eb loop formed in the absence of CAP.

Fig. 4.


Representative minimum-energy configurations of LacR-mediated O3-O1 loops with bound CAP shown in yellow and other components color-coded as in Figure 3. Geometric and energetic properties of the loops are given in Table 2.

Table 2.

Calculated energy (in kT) and configurational parameters for O3-O1 LacR-mediated DNA loops with bound CAP.

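| Loop | α | β | Lk | Ψ | Φ | GLacR | GDNA | ΔGDNA |
|---|---|---|---|---|---|---|---|---|
| A1c | 34 | 33 | 9 | 38.2 | 71.2 | | 121.9 | 42.2 |
| A1d | 34 | 33 | 10 | 50.1 | 70.0 | | 132.1 | 52.4 |
| A2c | 34 | 33 | 8 | 76.5 | 73.7 | | 162.6 | 82.9 |
| A2d | 34 | 33 | 9 | 80.6 | 73.6 | | 167.0 | 87.3 |
| P1c | 34 | 33 | 9 | 44.5 | 72.9 | | 128.2 | 48.5 |
| P1d | 34 | 33 | 10 | 82.8 | 73.9 | | 167.7 | 88.0 |
| Ec | 110 | 18 | 8 | 35.6 | 68.9 | 2.8±1 | 120.1±1 | 40.4±1 |
| Ed | 108 | −50 | 9 | 24.3 | 67.2 | 2.8±1 | 105.9±1 | 26.2±1 |
| Free | | | | 0 | 67.0 | | 79.7 | |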

Loops denoted by labels in Figure 4; α, β, Lk, Ψ, Φ, GLacR, GDNA are as in Table 1. ΔGDNA: free energy difference between loop and “free” DNA with bound CAP and LacR dimers. Sequence: GGCAGTGAGCGCAACGCAATTAATGTGAGTTAGCTCACTCATTAGGCACCCCAGGCTTTACACTTTATGCTTCCGGCTCGTATGTTGTGTGGAATTGTGAGCGGATAACAATT. Here the O3 and O1 sequences are shown in boldface and the CAP binding site is underlined.
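The statement that CAP raises ΔGDNA of the V-shaped loops by at least 5 kT can be checked directly from Tables 1 and 2. The sketch below pairs each CAP-bound loop with the CAP-free loop of the same looping mode and linking number; that pairing rule is an assumption made here for illustration.

```python
# Delta G_DNA in kT for V-shaped LacR loops, copied from Tables 1 and 2.
NO_CAP = {"A1a": 36.5, "A1b": 41.4, "A2a": 36.7, "A2b": 44.1, "P1a": 42.2, "P1b": 64.1}
WITH_CAP = {"A1c": 42.2, "A1d": 52.4, "A2c": 82.9, "A2d": 87.3, "P1c": 48.5, "P1d": 88.0}

# Assumed pairing: same looping mode and same Lk (for example A1c with A1a, both Lk = 9).
PAIRS = {"A1c": "A1a", "A1d": "A1b", "A2c": "A2a", "A2d": "A2b", "P1c": "P1a", "P1d": "P1b"}

# Free-energy cost of CAP binding for each paired loop, rounded to the table precision.
penalty = {cap: round(WITH_CAP[cap] - NO_CAP[free], 1) for cap, free in PAIRS.items()}
smallest = min(penalty, key=penalty.get)

for name, value in penalty.items():
    print(f"{name}: +{value} kT")
print("smallest penalty:", smallest, penalty[smallest])
```

With this pairing the smallest increase is 5.7 kT (A1c relative to A1a), while the A2 loops are penalized by more than 40 kT.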

Exploration of the DNA sequence in the vicinity of the O3 site reveals an additional putative LacR binding site 5 bp upstream of O3, GCGGGCAGTGAGCGCAA, which shares structural similarities with the O1 site. As shown in Fig. 5, all but two of the unique hydrogen-bonding contacts between protein and DNA atoms would be preserved if the LacR were to associate with the modified sequence, here termed O3*, in the same way that it binds the natural O1 operator in solution [10]. Although the putative binding site aligns poorly against the nucleotides that comprise O1 (only 5 of 19 base pairs are identical), the base-pair modifications at the key sites are conservative in the sense that the substitutions preserve the positioning of key elements in DNA recognized by protein [48]; e.g., the O4 hydrogen-bond acceptor on T is replaced by O6 on G and the N6 hydrogen-bond donor on A is replaced by N4 on C in the six T·A→G·C modifications. Moreover, two thirds of the close contacts of LacR and DNA (≤ 3.4 Å) are nonspecific in that they involve sugar and base atoms.
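The 5-bp offset between O3* and O3 can be illustrated by locating both sites in the sequence listed with Table 3. The sketch below assumes that O3 is the 21-bp segment GGCAGTGAGCGCAACGCAATT (the boldface marking of the operators is not reproduced here) and measures the shift between site centers; the helper name is illustrative.

```python
# O3*-O1 sequence as listed with Table 3.
SEQUENCE = (
    "AAGCGGGCAGTGAGCGCAACGCAATTAATGTGAGTTAGCTCACTCATTAGGCACCCCAGGC"
    "TTTACACTTTATGCTTCCGGCTCGTATGTTGTGTGGAATTGTGAGCGGATAACAATT"
)
O3_STAR = "GCGGGCAGTGAGCGCAA"      # putative site quoted in the text (17 bp)
O3 = "GGCAGTGAGCGCAACGCAATT"       # assumed extent of the O3 operator (21 bp)

def site_center(sequence, site):
    """Zero-based position of the central base of the first occurrence of site."""
    start = sequence.find(site)
    if start < 0:
        raise ValueError("site not found in sequence")
    return start + (len(site) - 1) / 2

# A positive shift means that O3* is centered upstream (5' side) of O3.
shift = site_center(SEQUENCE, O3) - site_center(SEQUENCE, O3_STAR)
print(shift)
```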

Fig. 5.


Diagrams of comparative molecular interactions of the LacR headpiece with the O1, O3, and O3* operators. Strand I is the leading strand of each recognition site, i.e., the sequences listed in Tables 1 and 2 for O1 and O3 and in Table 3 for O3*, and Strand II is the complement. The O1 interactions are as found in solution [10]; the O3 and O3* interactions are putative. DNA base pairs in close contact (interatomic distances of 3.4 Å or less) with protein residues in the O1 complex are denoted in gray (van der Waals’ interactions) and aqua (hydrogen bonds) in the center of each diagram. The contacted atoms of the bases are indicated on each side of the center column, together with the corresponding protein residues on the outside. The base pair contacts in O3 and O3* that are identical to those in O1 are highlighted in yellow, and the contacts that preserve hydrogen bonding are denoted in orange. The bifurcated hydrogen bonding (at position 17) that accommodates all base-pair combinations is shown in pink. Comparison of the three sites shows that most of the hydrogen-bonding elements are conserved upon substitution of the O3 or O3* sequence, a direct consequence of the isosteric character of the Watson-Crick base pairs, the pseudo-symmetric positioning of hydrogen-bond donor atoms in the DNA minor groove [48], the equivalent positioning of major-groove atoms in conservatively substituted (G↔T and A↔C) bases [48], and the dual hydrogen-bonding (donor or acceptor) capabilities of selected amino acids. The two potential hydrogen-bonding elements not conserved upon substitution of the O3 or O3* sequences are outlined by boxes.

Representative configurations of LacR-CAP-DNA loop topoisomers with LacR bound at the putative O3* site are shown in Figure 6, and the computed free energies are given in Table 3. The free energies of the topoisomers are generally larger than those for the O3-O1 loops without CAP, with the singular exception of the A2f loop, for which ΔGDNA equals 24.0 kT at 10 mM monovalent salt. This number is lower than ΔGDNA for the open Ee loop, making A2f the most likely CAP-bound O3*-O1 loop, and is even lower than the free energy of Ed, the optimal CAP-bound O3-O1 loop. Direct comparison of ΔGDNA for the A2f and Ed loops, however, is precluded because GO3*, the binding energy of LacR to the O3* site, is not known. The CAP-induced bend is naturally positioned near the locus of highest curvature in the A2f loop, thereby absorbing the cost of bending DNA. The same happens, but to a lesser degree, in the A2e, P2c, and Ee loops. The P1e and P1f loops resemble the U and O CAP-bound loops reported by Balaeff et al. [9], in which LacR is bound 7 bp upstream of the O3 site (a location different from O3*). The energies of the P1e and P1f loops found here, with LacR bound at O3*, substantially exceed that of A2f or, for that matter, the P2c loop.
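The role of the unknown binding energy can be made explicit with a rough bound. Suppose, as an assumption made only for this sketch, that the total free energy of each assembly is the sum of ΔGDNA and the LacR binding energy at the upstream site, that all other contributions cancel, and that the ±1 kT uncertainty on the extended loop is neglected. With the Table 2 and Table 3 values for Ed and A2f, the A2f loop is then preferred when

$$
\Delta G_{\mathrm{DNA}}(\mathrm{A2f}) + G_{O3^{*}} < \Delta G_{\mathrm{DNA}}(\mathrm{Ed}) + G_{O3}
\quad\Longleftrightarrow\quad
G_{O3^{*}} - G_{O3} < 26.2\ kT - 23.9\ kT = 2.3\ kT .
$$

Under these assumptions, binding at O3* could be weaker than binding at O3 by up to about 2 kT before the Ed loop would become the more favorable form.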

Fig. 6.


Representative minimum-energy configurations of LacR-mediated O3*-O1 loops with bound CAP and LacR positioned at the alternate O3* binding site. Geometric and energetic properties of the loops are given in Table 3 and the molecular color-coding in Figure 3 and Figure 4.

Table 3.

Calculated energy (in kT) and configurational parameters for O3*-O1 LacR-mediated DNA loops with bound CAP.

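| Loop | α | β | Lk | Ψ | Φ | GLacR | GDNA | ΔGDNA |
|---|---|---|---|---|---|---|---|---|
| A1e | 34 | 33 | 9 | 65.6 | 76.2 | | 154.4 | 70.2 |
| A1f | 34 | 33 | 10 | 48.4 | 72.6 | | 133.6 | 49.4 |
| A2e | 34 | 33 | 8 | 65.2 | 74.9 | | 153.0 | 68.8 |
| A2f | 34 | 33 | 9 | 23.0 | 72.8 | | 108.1 | 23.9 |
| P1e | 34 | 33 | 9 | 90.6 | 80.6 | | 183.2 | 99.0 |
| P1f | 34 | 33 | 10 | 54.8 | 75.7 | | 142.5 | 58.3 |
| P2c | 34 | 33 | 10 | 35.4 | 73.9 | | 122.4 | 38.2 |
| Ee | 61 | 2 | 9 | 54.4 | 73.7 | 2.8±1 | 144.0±1 | 59.8±1 |
| Free | | | | 0 | 70.5 | | 84.2 | |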

Loops denoted by labels in Figure 6; α, β, Lk, Ψ, Φ, GLacR, GDNA are as in Table 1. “Free” refers to the unbound, linear DNA chain of the (O3*-O1) sequence: AAGCGGGCAGTGAGCGCAACGCAATTAATGTGAGTTAGCTCACTCATTAGGCACCCCAGGCTTTACACTTTATGCTTCCGGCTCGTATGTTGTGTGGAATTGTGAGCGGATAACAATT. Here the O3* and O1 sequences are shown in boldface and the CAP binding site is underlined.

Discussion

The general procedure described here for computing equilibrium configurations of protein-DNA assemblies and estimating their free energies takes into account the sequence-dependence of DNA deformability and various types of constraints on such configurations, including conformational changes induced by binding proteins, flexibility of protein-bound DNA, restrictions on contacts between sequentially distant DNA segments, closure constraints, electrostatic forces, and the constraint of fixed linking number. The advantage of using a discrete, as opposed to a continuum, model for DNA is that the protein-induced changes in intrinsic structure and deformability, found in high-resolution structures, are represented exactly. Although we employ a quadratic energy function for DNA deformation, the formalism developed here can be easily extended to more general energy functions, provided they remain additive; in that case only equations (13) and (14) would be affected. The procedure is computationally efficient: a single configuration and its free energy can be determined within a few minutes on a standard desktop PC (Dell Optiplex GX270, with 3 GHz Pentium 4, running Matlab 6). We have verified that the main conclusions reported in this paper would be valid even if the DNA were assumed to be ideally elastic, i.e., homogeneous, isotropic, intrinsically straight, and with no coupling. Such assumptions, however, do not simplify the computation and hence there is no reason to make them here.

The application of this procedure to the analysis of DNA looping mediated by the Lac repressor protein is an important first step for obtaining a dynamical picture of the interactions of LacR, CAP, RNAP, and DNA that will add to current understanding of the regulation of the lac operon in vivo. The proposed configuration of the LacR-CAP-DNA loop, shown as A2f in Fig. 6, in which LacR binds to the alternative site O3*, agrees with the experimental evidence [8] showing a 5-bp upstream shift of the LacR binding site upon CAP binding. Although the O3* site has not yet been confirmed experimentally, it is likely that if LacR binds to the sequence, it does so only in the presence of CAP. The strong CAP-induced bending deformation of DNA contributes to the stability of the A2f loop by mimicking the site of highest curvature. This looping mode also brings the DNA surrounding the CAP recognition sequence into close contact with positively charged residues on the sides of CAP (Lys26, Lys166, His199, and Lys201; possibly, Lys22 and Lys44) that may provide additional stabilizing energy [49]. These sites appear to be responsible for the CAP-induced bending of DNA observed in time-resolved fluorescence measurements [44].

In addition, we find that the Ed loop (the extended form of the P1 loop), in which LacR is bound to the O3 site, is energetically comparable to the optimum O3*-bound A2f loop. As is clear from Fig. 2B, the O1 operator is oriented in the same direction on LacR in the A2 and P1 (Ed) looping modes, but O3 is oriented differently. Thus, interconversion between the A2f form and the open Ed configuration would entail reorientation of O3 with respect to LacR, as well as the shift of binding site. Such configurational transitions may occur in solution. Final determination of the likely configurations of LacR-mediated DNA loops in the presence of CAP requires further experimental work.

In summary, we predict that the presence of CAP completely alters the distribution of LacR-mediated loop types. The binding of CAP also appears to increase the apparent affinity of LacR to DNA by lowering the loop-formation energy. In other words, the looped structure induces a mechanical coupling that gives rise to a binding cooperativity between CAP and LacR that cannot be accounted for by traditional mechanisms because these proteins are not in direct contact. The structure and precise placement of CAP and other proteins on DNA undoubtedly play important roles in determining both the configurations and the populations of DNA loops formed in the cell and detected in gene expression studies. Our predictions can be tested experimentally in several ways: (i) the presence of the A2f loop can be detected by measuring the cutting enhancement of DNase I in footprinting experiments, a method we have described previously [4], and (ii) the existence of the alternative binding site can be tested by binding affinity experiments.

Acknowledgement

D.S. acknowledges support from an A.P. Sloan Fellowship and NSF grant DMS-05-16646 and W.K.O. support from USPHS grant GM34809. We also thank the Institute for Mathematics and Its Applications at the University of Minnesota for providing a stimulating environment to carry out this work and Dr. Yun Li for sharing unpublished data on LacR-DNA interactions.

Appendix

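In the interest of making this paper self-contained, we here review the parametrization $\xi = (\theta_1, \theta_2, \theta_3, \rho_1, \rho_2, \rho_3)$ and the matrices $Q_{ij}$, $\Gamma_{ij}$, and ${}^{j}\Lambda_{kl}$ used to describe DNA structure in our computations. For simplicity, the superscript $n$ denoting the base-pair number has been omitted from these terms. If we let $D_{ij} = \mathbf{d}_i^{n} \cdot \mathbf{d}_j^{n+1}$ be the matrix of coordinates of the frame $\mathbf{d}_j^{n+1}$ with respect to the frame $\mathbf{d}_i^{n}$, then $D = TBT$, where $T$ and $B$ are defined as

$$
T = \begin{bmatrix} \cos(\theta_3/2) & -\sin(\theta_3/2) & 0 \\ \sin(\theta_3/2) & \cos(\theta_3/2) & 0 \\ 0 & 0 & 1 \end{bmatrix}, \quad
N = \begin{bmatrix} \theta_2/\kappa & -\theta_1/\kappa & 0 \\ \theta_1/\kappa & \theta_2/\kappa & 0 \\ 0 & 0 & 1 \end{bmatrix}, \quad
K = \begin{bmatrix} \cos\kappa & 0 & \sin\kappa \\ 0 & 1 & 0 \\ -\sin\kappa & 0 & \cos\kappa \end{bmatrix},
$$

$$
B = N^{T} K N = \kappa^{-2} \begin{bmatrix}
\theta_1^2 + \theta_2^2 \cos\kappa & \theta_1 \theta_2 (1 - \cos\kappa) & \theta_2 \kappa \sin\kappa \\
\theta_1 \theta_2 (1 - \cos\kappa) & \theta_1^2 \cos\kappa + \theta_2^2 & -\theta_1 \kappa \sin\kappa \\
-\theta_2 \kappa \sin\kappa & \theta_1 \kappa \sin\kappa & \kappa^2 \cos\kappa
\end{bmatrix},
$$

with $\kappa = \sqrt{\theta_1^2 + \theta_2^2}$. Thus $\kappa$ is the overall bending angle and $\theta_1$, $\theta_2$ describe the bending direction. If we let $r_i = \mathbf{d}_i^{n} \cdot (\mathbf{x}^{n+1} - \mathbf{x}^{n})$ be the vector of components of the displacement vector with respect to the frame $\mathbf{d}_i^{n}$, we can then define $\rho = Q^{T} r$, where $Q = T\sqrt{B}$. (Incidentally, the matrix $\sqrt{B}$ has the same form as $B$ except that $\sin\kappa$ and $\cos\kappa$ are replaced by $\sin(\kappa/2)$ and $\cos(\kappa/2)$.) Finally,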

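$$
\Gamma = \begin{bmatrix}
-\dfrac{\theta_1 \sin\zeta}{\kappa} + \dfrac{\theta_2 \cos\zeta}{2\tan(\kappa/2)} & -\dfrac{\theta_2 \sin\zeta}{\kappa} - \dfrac{\theta_1 \cos\zeta}{2\tan(\kappa/2)} & \tan(\kappa/2)\cos\zeta \\[2ex]
\dfrac{\theta_1 \cos\zeta}{\kappa} + \dfrac{\theta_2 \sin\zeta}{2\tan(\kappa/2)} & \dfrac{\theta_2 \cos\zeta}{\kappa} - \dfrac{\theta_1 \sin\zeta}{2\tan(\kappa/2)} & \tan(\kappa/2)\sin\zeta \\[2ex]
-\theta_2/2 & \theta_1/2 & 1
\end{bmatrix},
$$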

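where $\zeta = \theta_3/2 - \gamma$, $\sin\gamma = \theta_1/\kappa$, $\cos\gamma = \theta_2/\kappa$, and for each $j$ the matrix ${}^{j}\Lambda$ is a skew matrix with the following components:

$$
\begin{aligned}
{}^{1}\Lambda_{12} &= \frac{\theta_2 \left(1 - \cos(\kappa/2)\right)}{\kappa^2}, &
{}^{1}\Lambda_{13} &= \frac{\theta_1 \theta_2 \left(2\sin(\kappa/2) - \kappa\right)}{2\kappa^3}, &
{}^{1}\Lambda_{23} &= \frac{1}{2} + \frac{\theta_2^2 \left(2\sin(\kappa/2) - \kappa\right)}{2\kappa^3}, \\
{}^{2}\Lambda_{12} &= \frac{\theta_1 \left(\cos(\kappa/2) - 1\right)}{\kappa^2}, &
{}^{2}\Lambda_{13} &= \frac{\theta_1^2 \left(\kappa - 2\sin(\kappa/2)\right)}{2\kappa^3} - \frac{1}{2}, &
{}^{2}\Lambda_{23} &= \frac{\theta_1 \theta_2 \left(\kappa - 2\sin(\kappa/2)\right)}{2\kappa^3}, \\
{}^{3}\Lambda_{12} &= \frac{\cos(\kappa/2)}{2}, &
{}^{3}\Lambda_{13} &= -\frac{\theta_1 \sin(\kappa/2)}{2\kappa}, &
{}^{3}\Lambda_{23} &= -\frac{\theta_2 \sin(\kappa/2)}{2\kappa}.
\end{aligned}
$$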
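The rotation matrices of this appendix can be checked numerically. The sketch below builds $D = TBT$ and the mid-step frame $Q = T\sqrt{B}$ from standard right-handed rotations and compares the closed-form skew-matrix entries with finite differences. Two points are assumptions made here rather than statements taken from the text: the sign conventions of the elementary rotations, and the reading of ${}^{j}\Lambda$ as $(\partial Q^{T}/\partial\theta_j)\,Q$, so that $\partial\rho/\partial\theta_j = {}^{j}\Lambda\,\rho$ at fixed $r$. All function names are illustrative.

```python
import math

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)] for i in range(3)]

def transpose(a):
    return [list(row) for row in zip(*a)]

def rot_z(angle):
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def rot_y(angle):
    c, s = math.cos(angle), math.sin(angle)
    return [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]

def bend(theta1, theta2, fraction=1.0):
    """N^T K N with bending angle fraction*kappa (fraction=0.5 gives sqrt(B))."""
    kappa = math.hypot(theta1, theta2)
    n = rot_z(math.atan2(theta1, theta2))  # sin(gamma) = theta1/kappa, cos(gamma) = theta2/kappa
    return matmul(transpose(n), matmul(rot_y(fraction * kappa), n))

def step_matrix(theta):
    """D = T B T."""
    t = rot_z(theta[2] / 2.0)
    return matmul(t, matmul(bend(theta[0], theta[1]), t))

def mid_frame(theta):
    """Q = T sqrt(B)."""
    return matmul(rot_z(theta[2] / 2.0), bend(theta[0], theta[1], 0.5))

def lambda_numeric(theta, j, h=1e-6):
    """Central-difference estimate of (dQ^T/dtheta_j) Q (assumed definition)."""
    up, down = list(theta), list(theta)
    up[j] += h
    down[j] -= h
    q_up, q_down = transpose(mid_frame(up)), transpose(mid_frame(down))
    dqt = [[(q_up[r][c] - q_down[r][c]) / (2.0 * h) for c in range(3)] for r in range(3)]
    return matmul(dqt, mid_frame(theta))

def lambda_closed_form(theta):
    """Entries (12, 13, 23) of the skew matrices for j = 1, 2, 3."""
    t1, t2 = theta[0], theta[1]
    k = math.hypot(t1, t2)
    s, c = math.sin(k / 2.0), math.cos(k / 2.0)
    f = (2.0 * s - k) / (2.0 * k ** 3)
    return [
        (t2 * (1.0 - c) / k ** 2, t1 * t2 * f, 0.5 + t2 * t2 * f),
        (t1 * (c - 1.0) / k ** 2, -t1 * t1 * f - 0.5, -t1 * t2 * f),
        (c / 2.0, -t1 * s / (2.0 * k), -t2 * s / (2.0 * k)),
    ]

theta = (0.3, -0.2, 0.5)  # arbitrary test angles in radians
d = step_matrix(theta)
ddt = matmul(d, transpose(d))
orthogonality_error = max(abs(ddt[i][j] - (1.0 if i == j else 0.0))
                          for i in range(3) for j in range(3))
closed = lambda_closed_form(theta)
lambda_error = 0.0
for j in range(3):
    lam = lambda_numeric(theta, j)
    for value, (r, c) in zip(closed[j], [(0, 1), (0, 2), (1, 2)]):
        lambda_error = max(lambda_error, abs(lam[r][c] - value))
print(orthogonality_error, lambda_error)
```

Setting $\theta_1 = \theta_2 = 0$ reduces `step_matrix` to a pure rotation by $\theta_3$ about the third axis, which is a quick way to confirm that twist is split evenly between the two factors $T$.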

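Using the formulae of [32], one can show that the Jacobian $J$ in (18)–(20) is given by:

$$
J(\Xi) = \prod_{n=1}^{N} \cos(\kappa^{n}) = \prod_{n=1}^{N} \cos\!\left(\sqrt{(\theta_1^{n})^2 + (\theta_2^{n})^2}\right) = \prod_{n=1}^{N} \cos\!\left(\sqrt{(\xi_1^{n})^2 + (\xi_2^{n})^2}\right)
$$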

Other parametrizations of the discrete DNA model have been proposed (see the review [50]), but only the one introduced in [18] and described here has the property that $\theta_3$ is independent of $\theta_1$, $\theta_2$ in the following sense: consider a configuration in which the centers $\mathbf{x}^{n}$ and the normal vectors $\mathbf{d}_3^{n}$ all lie in a plane $P$, and the configuration can be described by the set $\{\theta_1^{n}, \theta_2^{n}, \theta_3^{n}, \rho_1^{n}, \rho_2^{n}, \rho_3^{n}\}_{n=1}^{N}$. Now suppose that the structure is deformed in such a way that $\mathbf{x}^{n}$ and $\mathbf{d}_3^{n}$ still lie in $P$ and the angle between $\mathbf{d}_1^{n}$ and the plane $P$ is held fixed. The parametrization described above guarantees that the new configuration will have $\{\theta_3^{n}\}_{n=1}^{N}$ identical to those of the old configuration. This property is important for separating the twisting and bending energy contributions to the DNA energy and is not true for any other parametrization defined in the literature.

Footnotes

Publisher's Disclaimer: This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final citable form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.

1

To the best of our knowledge this method was first employed in DNA statistical mechanics by Zhang and Crothers [33].

2

Villa et al. [37] dispute the possibility of opening the LacR tetramer on the basis of their molecular dynamics simulations, in which two salt bridges form at the dimer-dimer interface during the forced opening of the tetramer and appear to lock the structure in the V-shape. The forced opening, however, occurs on a timescale (16 ns) several orders of magnitude shorter than the expected real timescale of opening, and there is no experimental support for the salt-bridge formation.

3

Villa et al. [37] suggest, on the basis of molecular dynamics simulations of one type of LacR-mediated DNA loop, that the dimer headpieces rotate with respect to the rest of the protein. We do not consider such deformations here in view of the uncertainty of these predictions in the absence of supporting experimental data.

4

The P1a and P1b loops resemble the “o” and “e” loops obtained by Balaeff et al. using an elastic rod model of DNA and later termed O and U loops by the same authors [9]. The energies of the O3-O1 loops reported in those papers are lower than those found here due to the choice of bending modulus in and [9], which would be appropriate for a chain with persistence length 300 Å but not DNA.

5

The electrostatic energies listed in Table 1 differ slightly from those reported in [4] due to the relocation of charged sites from the phosphorus positions in [4] to the origins of base pairs here.

6

Looped structures resembling those seen in Ea and Eb appear during the course of molecular dynamics simulations of LacR opening [46]. Direct comparison of the structures is not possible, as the structures in [46] are dynamically evolving and subject to forcing and hence not in equilibrium. Extended LacR-mediated loops have also been considered by Zhang et al. [47]. The free energies of such loops (described as SL loops in that paper) are, like ours, lower than those of the parallel and antiparallel forms (respectively termed WA and LB loops) but also lower than the values reported here.

References Cited

  • 1. Coleman BD, Olson WK, Swigon D. Theory of sequence-dependent DNA elasticity. J. Chem. Phys. 2003;118:7127.
  • 2. Burd JF, Wartell RM, Dodgson JB, Wells RD. Transmission of stability (telestability) in deoxyribonucleic acid. J. Biol. Chem. 1975;250:5109–5113.
  • 3. Müller-Hill B. The lac Operon. Berlin: Walter de Gruyter; 1996.
  • 4. Swigon D, Coleman BD, Olson WK. Modeling the Lac repressor-operator assembly: The influence of DNA looping on Lac repressor conformation. Proc. Natl. Acad. Sci. USA. 2006;103:9879. doi: 10.1073/pnas.0603557103.
  • 5. Schultz SC, Shields GC, Steitz TA. Crystal structure of a CAP-DNA complex: the DNA is bent by 90°. Science. 1991;253:1001. doi: 10.1126/science.1653449.
  • 6. Lawson CL, Swigon D, Murakami K, Darst SA, Berman HM, Ebright RH. Catabolite activator protein (CAP): DNA binding and transcription activation. Curr. Opin. Struct. Biol. 2004;14:1. doi: 10.1016/j.sbi.2004.01.012.
  • 7. Hudson JM, Fried MG. Co-operative interactions between the catabolite gene activator protein and the lac repressor at the lactose promoter. J. Mol. Biol. 1990;214:381. doi: 10.1016/0022-2836(90)90188-R.
  • 8. Fried MG, Hudson JM. DNA looping and Lac repressor-CAP interaction. Science. 1996;274:1930. doi: 10.1126/science.274.5294.1930.
  • 9. Balaeff A, Mahadevan L, Schulten K. Structural basis for cooperative DNA binding by CAP and Lac repressor. Structure. 2004;12:123. doi: 10.1016/j.str.2003.12.004.
  • 10. Kalodimos CG, Bonvin AMJJ, Salinas RK, Wechselberger R, Boelens R, Kaptein R. Plasticity in protein-DNA recognition: Lac repressor interacts with its natural operator O1 through alternative conformations of its DNA-binding domain. EMBO J. 2002;21:2866. doi: 10.1093/emboj/cdf318.
  • 11. Garcia HG, Grayson P, Han L, Inamdar M, Kondev J, Nelson PC, Phillips R, Widom J, Wiggins PA. Biological consequences of tightly bent DNA: the other life of a macromolecular celebrity. Biopolymers. 2007;85:115. doi: 10.1002/bip.20627.
  • 12. Trifonov EN. DNA in profile. Trends Biochem. Sci. 1991;16:467. doi: 10.1016/0968-0004(91)90181-t.
  • 13. Crothers DM, Drak J, Kahn JD, Levene SD. DNA bending, flexibility, and helical repeat by cyclization kinetics. Methods Enzymol. 1992;212:3. doi: 10.1016/0076-6879(92)12003-9.
  • 14. Hagerman PJ. Straightening out the bends in curved DNA. Biochim. Biophys. Acta. 1992;1131:125. doi: 10.1016/0167-4781(92)90066-9.
  • 15. Matsumoto A, Olson WK. Sequence-dependent motions of DNA: a normal mode analysis at the base-pair level. Biophys. J. 2002;83:22. doi: 10.1016/S0006-3495(02)75147-3.
  • 16. Olson WK, Gorin AA, Lu X-J, Hock LM, Zhurkin VB. DNA sequence-dependent deformability deduced from protein-DNA crystal complexes. Proc. Natl. Acad. Sci. USA. 1998;95:11163. doi: 10.1073/pnas.95.19.11163.
  • 17. Olson WK, Swigon D, Coleman BD. Implications of the dependence of the elastic properties of DNA on nucleotide sequence. Phil. Trans. Roy. Soc. Lond. A. 2004;362:1403. doi: 10.1098/rsta.2004.1380.
  • 18. El Hassan MA, Calladine CR. The assessment of the geometry of dinucleotide steps in double-helical DNA; a new local calculation scheme. J. Mol. Biol. 1995;251:648. doi: 10.1006/jmbi.1995.0462.
  • 19. Olson WK, Colasanti AV, Czapla L, Zheng G. Insights into the sequence-dependent macromolecular properties of DNA from base-pair level modeling. In: Voth GA, editor. Coarse-Graining of Condensed Phase and Biomolecular Systems. Taylor and Francis Group, LLC; 2008. pp. 205–223. Chapter 14.
  • 20. Manning GS. The molecular theory of polyelectrolyte solutions with applications to the electrostatic properties of polynucleotides. Quart. Rev. Biophys. 1978;11:179. doi: 10.1017/s0033583500002031.
  • 21. Berman HM, Westbrook J, Feng Z, Gilliland G, Bhat TN, Weissig H, Shindyalov IN, Bourne PE. The Protein Data Bank. Nucleic Acids Res. 2000;28:235. doi: 10.1093/nar/28.1.235.
  • 22. Marti-Renom MA, Stuart A, Fiser A, Sánchez R, Melo F, Sali A. Comparative protein structure modeling of genes and genomes. Annu. Rev. Biophys. Biomol. Struct. 2000;29:291. doi: 10.1146/annurev.biophys.29.1.291.
  • 23. Courant R. Differential and Integral Calculus. Vol. 2. London: Blackie; 1936.
  • 24. White JH. In: Waterman MS, editor. Mathematical Methods for DNA Sequences. Boca Raton, FL: CRC; 1989. p. 225.
  • 25. Swigon D, Coleman BD, Tobias I. The elastic rod model for DNA and its application to the tertiary structure of DNA minicircles in mononucleosomes. Biophys. J. 1998;74:2515. doi: 10.1016/S0006-3495(98)77960-3.
  • 26. Möglich A, Joder K, Kiefhaber T. End-to-end distance distributions and intrachain diffusion constants in unfolded polypeptide chains indicate intramolecular hydrogen bond formation. Proc. Natl. Acad. Sci. USA. 2006;103:12394. doi: 10.1073/pnas.0604748103.
  • 27. Charvin G, Allemand J-F, Strick TR, Bensimon D, Croquette V. Twisting DNA: single molecule studies. Contemporary Physics. 2004;45:383.
  • 28. Gonzalez O, Maddocks JH, Schuricht F, von der Mosel H. Global curvature and self-contact of nonlinearly elastic curves and rods. Calculus of Variations and Partial Differential Equations. 2002;14:29.
  • 29. Avriel M. Nonlinear Programming: Analysis and Methods. Dover Publications; 2003.
  • 30. Kuhn HW, Tucker AW. Nonlinear programming. In: Proc. 2nd Berkeley Symp. University of California Press; 1951. p. 481.
  • 31. Biton YY, Coleman BD, Swigon D. On bifurcations of equilibria of intrinsically curved, electrically charged, rod-like structures that model DNA molecules in solution. J. Elasticity. 2007;87:187.
  • 32. Gonzalez O, Maddocks JH. Extracting parameters for base-pair level models of DNA from molecular dynamics simulations. Theor. Chem. Acc. 2001;106:76.
  • 33. Zhang YL, Crothers DM. Statistical mechanics of sequence-dependent circular DNA and its application for DNA cyclization. Biophys. J. 2003;84:136–153. doi: 10.1016/S0006-3495(03)74838-3.
  • 34. Czapla L, Swigon D, Olson WK. Sequence-dependent effects in the cyclization of short DNA. J. Chem. Theory Comput. 2006;2:685. doi: 10.1021/ct060025+.
  • 35. Griewank A, Corliss G. Automatic Differentiation of Algorithms. Philadelphia: SIAM; 1991.
  • 36. Ruben GC, Roos TB. Conformation of Lac repressor tetramer in solution, bound and unbound to operator DNA. Microsc. Res. Tech. 1997;36:400. doi: 10.1002/(SICI)1097-0029(19970301)36:5<400::AID-JEMT10>3.0.CO;2-W.
  • 37. Villa E, Balaeff A, Schulten K. Structural dynamics of the lac repressor-DNA complex revealed by a multiscale simulation. Proc. Natl. Acad. Sci. USA. 2005;102:6783–6788. doi: 10.1073/pnas.0409387102.
  • 38. Friedman AM, Friedman TO, Steitz TA. Crystal structure of Lac repressor core tetramer and its implications for DNA looping. Science. 1995;268:1721. doi: 10.1126/science.7792597.
  • 39. Lewis M, Chang G, Horton NC, Kercher MA, Pace HC, Schumacher MA, Brennan RG, Lu P. Crystal structure of the lactose operon repressor and its complexes with DNA and inducer. Science. 1996;271:1247. doi: 10.1126/science.271.5253.1247.
  • 40. Bell CE, Lewis M. A closer view of the conformation of the Lac repressor bound to operator. Nat. Struct. Biol. 2000;7:209. doi: 10.1038/73317.
  • 41. Goyal S, Lillian T, Blumberg S, Meiners JC, Meyhofer E, Perkins N. Intrinsic curvature of DNA influences Lac-R mediated looping. Biophys. J. 2007;93:4342–4359. doi: 10.1529/biophysj.107.112268.
  • 42. Geanacopoulos M, Vasmatzis G, Zhurkin VB, Adhya S. Gal repressosome contains an antiparallel DNA loop. Nat. Struct. Biol. 2001;8:432. doi: 10.1038/87595.
  • 43. Pyles EA, Lee JC. Mode of selectivity in cyclic AMP receptor protein-dependent promoters in Escherichia coli. Biochemistry. 1996;35:1162. doi: 10.1021/bi952187q.
  • 44. Kapanidis AN, Ebright YW, Ludescher RD, Chan S, Ebright RH. Mean DNA bend angle and distribution of DNA bend angles in the CAP-DNA complex in solution. J. Mol. Biol. 2001;312:453. doi: 10.1006/jmbi.2001.4976.
  • 45. Balaeff A, Mahadevan L, Schulten K. Elastic rod model of a DNA loop in the lac operon. Phys. Rev. Lett. 1999;83:4900–4903.
  • 46. Villa E, Balaeff A, Mahadevan L, Schulten K. Multi-scale method for simulating protein-DNA complexes. Multiscale Modeling and Simulation. 2004;2:527–553.
  • 47. Zhang Y, McEwen AE, Crothers DM, Levene SD. Analysis of in-vivo LacR-mediated gene repression based on the mechanics of DNA looping. PLoS ONE. 2006;1:e136.
  • 48. Seeman NC, Rosenberg JM, Rich A. Sequence-specific recognition of double helical nucleic acids by proteins. Proc. Natl. Acad. Sci. USA. 1976;73:804–808. doi: 10.1073/pnas.73.3.804.
  • 49. Warwicker J, Engelman BP, Steitz TA. Electrostatic calculations and model-building suggest that DNA bound to CAP is sharply bent. Proteins. 1987;2:283–289. doi: 10.1002/prot.340020404.
  • 50. Lu X-J, Babcock MS, Olson WK. Overview of nucleic acid analysis programs. J. Biomol. Struct. Dynam. 1999;16:833. doi: 10.1080/07391102.1999.10508296.
