Proceedings of the National Academy of Sciences of the United States of America. 2005 Nov 15;102(47):16909–16910. doi: 10.1073/pnas.0508686102

Target site localization by site-specific, DNA-binding proteins

Jonathan Widom
PMCID: PMC1288015  PMID: 16287970

Bacterial genomes contain over a million base pairs of DNA, and the human genome contains over a billion base pairs, yet specific locations in all this DNA must be reached and recognized in a timely manner by just a handful of molecules of any given site-specific, DNA-binding protein. How proteins rapidly search the entire genome to find a specific site has been the subject of intense discussion for decades but remains mysterious. In a recent issue of PNAS, an article by Gowers et al. (1) provided a clear experimental distinction between two different but similar-appearing search mechanisms.

Work on this problem of DNA target site location began in earnest with a startling report by Riggs and colleagues (2), who showed that the Escherichia coli lac repressor protein bound to a specific target site in vitro 1,000 times faster than had been imagined possible. Subsequent studies from other groups confirmed this conclusion and extended the results to many other protein-DNA systems. Basic explanations of these results and detailed theoretical models were developed by Berg and von Hippel and their colleagues (3-5) and have since been extended to consider searching over protein-DNA complexes, such as chromatin (6), and to define the optimal search strategy (7-9).

Three distinct search mechanisms, known collectively as “facilitated diffusion,” are now recognized to contribute importantly to a protein's ability to search along nonspecific DNA to find a specific target site. In the “sliding” mechanism, a protein diffuses along the length of the DNA without dissociating. The reduced dimensionality of the sliding process accelerates encounters over short distances, whereas the random walk nature of sliding (forward and backward steps are equally probable) makes this mode of searching very inefficient for long distances. In the “intersegment transfer” mechanism, a protein having two distinct DNA binding sites may be handed off directly from one segment to another that happens to be immediately adjacent in space without ever dissociating, like a monkey moving through a tree. Finally, in the “hopping” or “jumping” mechanism, the protein releases entirely from the DNA, without, however, significantly mixing with the other protein molecules free in solution; it diffuses a short distance in solution and then rebinds at another nearby DNA location. In general, the site of rebinding will be a short distance away along the same stretch of DNA, but it may instead occur on a stretch of DNA that was far away along the DNA length but that happened to loop back nearby in three-dimensional space to the original site.
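The scaling argument behind the inefficiency of long-range sliding can be made concrete with a minimal sketch. It assumes, purely for illustration, that sliding is an unbiased walk in 1-bp steps; the step size and the function name `mean_exit_steps` are assumptions and are not taken from the work discussed here. The function computes exactly the mean number of steps such a walk needs to first reach a site a given distance away on either side.

```python
from fractions import Fraction


def mean_exit_steps(half_width):
    """Exact mean number of unit steps for an unbiased 1D walk that starts
    at position 0 to first reach +half_width or -half_width.

    Solves T(x) = 1 + T(x-1)/2 + T(x+1)/2 with T(+/-half_width) = 0
    using the Thomas algorithm for tridiagonal systems.
    Assumption: one step = 1 bp, forward and backward equally probable.
    """
    n = 2 * half_width - 1        # interior positions -half_width+1 .. half_width-1
    off = Fraction(-1, 2)         # sub- and super-diagonal coefficient
    diag = Fraction(1)            # diagonal coefficient
    rhs = Fraction(1)             # right-hand side (one step consumed per move)

    # Forward elimination
    cp = [off / diag]
    dp = [rhs / diag]
    for _ in range(1, n):
        m = diag - off * cp[-1]
        cp.append(off / m)
        dp.append((rhs - off * dp[-1]) / m)

    # Back substitution
    t = [Fraction(0)] * n
    t[-1] = dp[-1]
    for i in range(n - 2, -1, -1):
        t[i] = dp[i] - cp[i] * t[i + 1]

    return t[half_width - 1]      # value at the starting position 0


for distance_bp in (30, 75, 1000):
    print(distance_bp, mean_exit_steps(distance_bp))
```

Under these assumptions the mean step count equals the square of the distance, so covering 75 bp costs roughly 6 times as many steps as covering 30 bp, and covering 1,000 bp costs more than 1,000 times as many.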

The hopping/jumping mechanism reflects an important and underappreciated property of protein-DNA interactions. Consider a protein that starts out immediately adjacent to a stretch of DNA (for example, having just dissociated from it). No bonds exist between protein and DNA, so the protein is free to undergo a random walk through solution. A remarkable property of such random walks is that there is a very high probability that the protein will return after a short time to another stretch close to (typically within a few base pairs of) its original site on the same DNA.
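The tendency of a just-released protein to come back to the same DNA can be illustrated with a toy lattice simulation. This is a sketch under strong assumptions, not a physical calculation: the DNA is idealized as the infinite z axis of a cubic lattice, the protein as a point taking unit steps, and "rebinding" as the first return to the axis. The lattice units do not correspond to base pairs or to real times, and all names (`rebinding_trial`, `simulate_rebinding`) are hypothetical.

```python
import random

# The six unit steps of a simple cubic-lattice random walk.
STEPS = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]


def rebinding_trial(rng, max_steps):
    """Start one lattice step away from the z axis (the idealized DNA).

    Returns the z coordinate at which the walk first touches the axis,
    or None if it has not returned within max_steps.
    """
    x, y, z = 1, 0, 0
    for _ in range(max_steps):
        dx, dy, dz = rng.choice(STEPS)
        x, y, z = x + dx, y + dy, z + dz
        if x == 0 and y == 0:
            return z
    return None


def simulate_rebinding(trials=2000, max_steps=500, seed=0):
    """Return (fraction of walks that rebound, median |z| offset on rebinding)."""
    rng = random.Random(seed)
    offsets = [rebinding_trial(rng, max_steps) for _ in range(trials)]
    returned = sorted(abs(z) for z in offsets if z is not None)
    fraction = len(returned) / trials
    median_offset = returned[len(returned) // 2] if returned else None
    return fraction, median_offset


fraction, median_offset = simulate_rebinding()
print("fraction rebinding within the step limit:", fraction)
print("median offset along the axis at rebinding:", median_offset)
```

In this idealization the motion perpendicular to the axis is a two-dimensional random walk, which is recurrent, so the rebinding fraction keeps rising (slowly) as `max_steps` is increased; the walks that do return early land close to where they started.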

A consequence of this correlated dissociation/rebinding process is to make sliding and hopping mechanisms appear very similar. Gowers et al. (1) developed an approach that takes advantage of a heterodimeric restriction enzyme, BbvCI, to distinguish (processive) sliding from (dissociative) hopping/jumping during translocation of this enzyme. Gowers et al. analyze the processivity of enzyme digestion by using DNA fragments containing pairs of the enzyme's asymmetric recognition sites that are separated by a variable distance and with the two sites either directly repeated or in inverted orientation. With directly repeated sites (Fig. 1 Upper), sliding allows the enzyme to act processively and cleave both sites during a single encounter. However, for the inverted-site orientation, sliding does not allow cleavage at the second site during a single encounter because the inverted-site orientation requires that the enzyme turn around on the DNA, which, in turn, requires full dissociation of the enzyme (Fig. 1 Lower). If translocation occurs solely by sliding, processive action will be observed on the direct repeat DNA but not on the inverted, whereas, if translocation always involves at least one dissociation/reassociation step, the apparent processivity will be equal on the two DNAs. Gowers et al. measure the processivity by quantifying the excess probability of making a second cleavage given that a first cleavage has already occurred, for the initial stages of reactions carried out in substrate excess.
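The logic of the direct-versus-inverted comparison can be sketched with a toy two-route model. It is not the analysis used by Gowers et al.; it assumes, hypothetically, that after the first cleavage the enzyme reaches the second site either by pure sliding (probability `p_slide`) or by a route that includes at least one dissociation/rebinding step (probability `p_hop`), that these routes are mutually exclusive, and that the second route is equally available in both site orientations. The function names and all numerical values are illustrative only.

```python
def second_cut_probability(p_slide, p_hop, orientation):
    """Toy probability of a coupled second cleavage in a single encounter.

    p_slide: hypothetical probability of reaching site 2 by sliding alone.
    p_hop:   hypothetical probability of reaching site 2 by a route with at
             least one dissociation/rebinding step (orientation-independent).
    Sliding alone is productive only when the sites are directly repeated.
    """
    if not (0 <= p_slide and 0 <= p_hop and p_slide + p_hop <= 1):
        raise ValueError("route probabilities must be non-negative and sum to at most 1")
    if orientation == "direct":
        return p_slide + p_hop
    if orientation == "inverted":
        return p_hop
    raise ValueError("orientation must be 'direct' or 'inverted'")


def processivity_ratio(p_slide, p_hop):
    """Direct/inverted ratio of coupled-cleavage probabilities (needs p_hop > 0)."""
    direct = second_cut_probability(p_slide, p_hop, "direct")
    inverted = second_cut_probability(p_slide, p_hop, "inverted")
    return direct / inverted


# Illustrative values only: some sliding versus no sliding at all.
print(processivity_ratio(0.10, 0.25))
print(processivity_ratio(0.00, 0.25))
```

In this toy model the ratio is 1 + p_slide/p_hop, so it exceeds 1 only when pure sliding contributes and falls to exactly 1 when every translocation includes a dissociation/rebinding step.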

Fig. 1.

Experimental test for sliding. (Upper) A DNA fragment has two target sites for the heterodimeric restriction enzyme BbvCI oriented in the same direction. The enzyme can bind to one site, then slide along the DNA (with or without also rotating around the DNA axis) and retain the correct orientation to recognize and cleave the second site. (Lower) A DNA fragment has two restriction sites in inverted orientation. It is not possible for an enzyme bound to the left-hand site to recognize and cleave the right-hand site without first dissociating. The initial stages of digestion reactions are analyzed, with substrate present in excess over enzyme. Sliding is detected as an enhanced probability of coupled cleavages at both sites, when the sites are in a direct orientation compared with the inverted orientation. For sites in the inverted orientation, the enzyme must dissociate and then rebind to cleave the second site. A rotation around just the y axis leaves the two subunits each contacting the face opposite their cleavage site; an additional rotation about the z axis would restore the correct contacts but is forbidden because it would require movement of protein atoms through the DNA. A single rotation about the x axis leads to the same outcome as the previous two rotations combined and again would require moving protein atoms through the DNA. The impossibility of such a rotation may be better appreciated by imagining the protein to be elongated along the DNA, as in a hot dog bun lying over a hot dog. For a hot dog oriented along the y axis, rotation of the bun about the x axis is forbidden because it requires passage of the bun through the hot dog. These steric arguments apply to any plausible geometric shape for the protein-DNA interface. As few as two bonds remaining between protein and DNA would prevent any of these rotations.

The results are dramatic. When the two sites are within 30 bp of each other and the experiment is carried out at very low salt concentration [which facilitates long-range sliding by reducing the protein's dissociation rate (10)], cleavage on the direct repeat DNAs is ≈1.4-fold more processive than on the DNAs carrying the inverted repeats. However, when the separation is increased to 75 bp or the salt concentration is raised to a roughly physiological level, the ratio of processivities on the direct to inverted repeat DNAs decreases to ≈1. These results mean that translocation over distances of ≈30 bp or more in conditions of physiological salt concentration always includes at least one dissociation/rebinding step.

These results pertain at present to one particular site-specific, DNA-binding protein; however, they are likely to be broadly applicable. The BbvCI enzyme has an affinity for nonspecific DNA and a salt concentration-dependence to this affinity that are typical for generic site-specific, DNA-binding proteins (10), suggesting that its sliding behavior is likely to be generic. Moreover, the relatively short range of sliding observed in this work is consistent with theoretical predictions for optimal sliding strategies, for which details specific to a given protein are unimportant (7-9). A nice feature of this work is that the concept underlying the experiment can be extended to other enzyme-DNA interactions, although such application requires that the enzyme interact differently with each DNA strand. Thus, it will be possible in future work to directly test the generality of these conclusions.

These results lead to a picture of the search process in which the journey of a protein from an arbitrary starting location to a specific target site along the genome is dominated by hopping/jumping or direct intersegment transfer steps. Sliding facilitates both the search and the protein-DNA “docking” reaction but only in the neighborhood immediately around each landing site.

Much remains to be learned. Although the conclusions derived from these studies on BbvCI are likely to be broadly applicable, structural studies of site-specific, DNA-binding proteins in complexes with nonspecific DNA have identified different specific ways in which the proteins appear to have evolved to facilitate target site location by one mechanism or another. For example, the nonspecifically bound BamHI endonuclease (11) has a large arch-shaped active site cavity that fits loosely over the DNA but has no direct DNA contacts at all, suggesting that this enzyme might float nearly “frictionless” over the DNA surface as it slides along nonspecific DNA. Binding to a specific target site is coupled to a large-scale conformational change in the enzyme that clamps the cavity tightly around the target. In contrast, the Cro repressor protein appears to bend the DNA in nonspecific complexes as sharply as in the specific complexes (12), suggesting that sliding by Cro might be anything but frictionless. In another contrast with the BamHI nonspecific complex, the lac repressor (13) and the bacteriophage T4 Dam DNA adenine methyltransferase (14) retain nearly identical folds in nonspecific and specific complexes. The same set of residues facilitates loose, primarily electrostatic interactions in the nonspecific complexes and tight, base-specific interactions with the specific targets. Such ability would seem to facilitate a search based on hopping/jumping. More work is required to tease out the real meaning of these striking features of nonspecific DNA complexes.

In addition, questions of a more basic nature remain unanswered. Other search modes may prove to be important. Many regulatory proteins are oligomeric and have two distinct specific DNA-binding surfaces. When one such protein is already bound to one specific target site, binding at a nearby target site may be greatly accelerated by DNA looping (15), increasing occupancies and decreasing noise in gene regulation (16). At a more basic level, we do not know the real rates of target location for any in vivo system. Current physical models of gene regulation (17-21) assume that individual regulatory steps, such as the binding and unbinding of repressor and activator molecules and of the initially formed “closed” RNA polymerase/promoter complexes, are in rapid quasiequilibrium; but this critical assumption remains unproven. Indeed, although physical considerations suggest that primary events in gene regulation, such as target site location and recruitment of auxiliary factors, should occur in seconds or less, the best recent experiments show the real in vivo events occurring over minutes or more (22). Finally, real genomes exist as protein-DNA complexes, not naked DNA, and it remains unclear how this structural organization of bacterial or eukaryotic genomes will affect the dynamics of the search process (6).

Author contributions: J.W. wrote the paper.

Conflict of interest statement: No conflicts declared.

See companion article on page 15883 in issue 44 of volume 102.
