Biophysical Journal. 2017 Mar 14;112(5):859–867. doi: 10.1016/j.bpj.2017.01.018

On the Mechanism of Homology Search by RecA Protein Filaments

Maria P Kochugaeva 1,2, Alexey A Shvets 1,2, Anatoly B Kolomeisky 1,2
PMCID: PMC5358537  PMID: 28297645

Abstract

Genetic stability is a key factor in the maintenance, survival, and reproduction of biological cells. It relies on many processes, but one of the most important is homologous recombination, in which the repair of breaks in double-stranded DNA molecules takes place with the help of several specific proteins. In bacteria, this task is accomplished by RecA proteins that are active as nucleoprotein filaments formed on single-stranded segments of DNA. A critical step in homologous recombination is the search for a corresponding homologous region on DNA, which is called a homology search. Recent single-molecule experiments clarified some aspects of this process, but its molecular mechanisms remain poorly understood. We developed a quantitative theoretical approach to analyze the homology search. It is based on a discrete-state stochastic model that takes into account the most relevant physical-chemical processes in the system. Using a method of first-passage processes, a full dynamic description of the homology search is presented. It is found that the search dynamics depends on the degree of extension of DNA molecules and on the size of RecA nucleoprotein filaments, in agreement with experimental single-molecule measurements of DNA pairing by RecA proteins. Our theoretical calculations, supported by extensive Monte Carlo computer simulations, provide a molecular description of the mechanisms of the homology search.

Introduction

The successful functioning of biological cells, which includes accurate DNA replication and transfer of genetic information without errors, strongly depends on the stability of their genomes (1). However, within a cell, DNA molecules constantly experience attacks by various active chemical molecules, thermal collisions with other molecules, and the effect of external radiation (2, 3, 4). This leads to frequent defects and even breaks in double-stranded DNA chains, which might be lethal for cells. Fortunately, there is a natural process known as “homologous recombination”, which is crucial for repairing DNA double-strand breaks (3, 5, 6, 7). During this process, the nucleotide sequence of another identical or similar DNA duplex is used for restoring the original sequence in the broken DNA molecule (8, 9). Furthermore, homologous recombination plays a fundamental role in genetic diversity and in the reassembling of genetic information, which is important for the adaptation of living systems to changing environments (1, 3, 5, 8, 10, 11, 12, 13, 14).

In bacteria, a central role in homologous recombination is played by RecA proteins (11, 15, 16). They belong to a class of recombination proteins that are conserved across different organisms, from bacteria to mammals (RadA in archaea, and Rad51 and Dmc1 in eukaryotes) (3). The process of homologous recombination consists of several stages, and RecA is activated by assembling into a nucleoprotein filament on a single-stranded DNA segment that appeared due to a double-strand break in DNA (15, 17, 18, 19, 20). One of the most mysterious steps in homologous recombination is the process in which the RecA nucleoprotein filament looks for the corresponding homologous regions on identical undamaged DNA (7, 21, 22, 23). This is known as a “homology search”. It is still poorly understood, and it remains unknown how RecA nucleoprotein filaments can find and recognize the homologous sequences on a long DNA chain so quickly and efficiently (3, 9, 10). By now, it is well established that the homology search is not an active process, because ATP hydrolysis is not required for it to proceed (18, 19). Therefore, it must be related to the protein search for targets on DNA, which has been intensively investigated in recent years (24, 25, 26, 27, 28, 29, 30, 31, 32). However, a comprehensive description of the mechanisms of the homology search still does not exist even for simplified in vitro systems. The homology search is even more difficult to understand under in vivo conditions, where the identity and the functions of various assisting proteins are not yet fully established (3, 9).

Several important experimental advances in studying the homology search have been reported in recent years. Experimental studies that utilized optical trapping and single-molecule fluorescence microscopy were able to visualize how individual RecA nucleoprotein filaments paired broken and undamaged DNA molecules (12, 23). It was determined that the homology search strongly depends on the 3D conformational state of the target DNA, i.e., on the degree of the polymer extension, as well as on the size of the nucleoprotein filament. Based on these observations, it was suggested that the RecA filament utilizes a so-called “intersegmental contact sampling” mechanism, in which multiple contacts with the target DNA are explored (23). As a result, the homology search is more efficient for longer filaments and for less extended (more coiled) DNA chains. A general question on the role of DNA coiling in the protein search has been addressed theoretically (24, 33, 34, 35). However, a quantitative model of the intersegment transfer for the homology search has not yet been developed.

A different idea, which argues that filament sliding is the source of the high efficiency of the homology search process, was also explored in 2012 (13). Single-molecule fluorescence measurements with a high spatiotemporal resolution were employed, and, in contrast to earlier experiments that indicated no sliding (36), they revealed multiple diffusional events for RecA filaments on DNA. Using computer simulations, it was argued that sliding accelerates the homology search by as much as 200 times (13). However, the degree of sliding was not significant: the observed sliding lengths were comparable to or less than the size of the RecA filament. In addition, the continuum model utilized in the simulations might not be reasonable under these conditions, and more advanced theoretical models show that sliding might not actually lead to such large accelerations in the search dynamics (27).

Another important question that is still highly debated is the mechanism of homologous recognition (10, 14, 37, 38). It has been argued that electrostatic interactions as well as torsional deformations of DNA molecules are key players in the recognition process. But the exact molecular details of this process remain poorly understood, especially for in vivo cellular conditions (10, 14, 37, 38, 39).

In this article, we develop a minimalist theoretical model of the homology search that provides a quantitative analysis of the underlying dynamics. It is based on the idea related to the intersegmental contact sampling mechanism, and the model argues that the RecA nucleoprotein filament can scan DNA faster for more coiled DNA conformations. Our analysis extends the earlier developed discrete-state stochastic approach for protein search for targets on DNA (26, 27), which explicitly takes into consideration major physical and chemical processes in the system. This allows us to obtain a full analytical description for all dynamic properties of the homology search by utilizing a method of first-passage probabilities (27). Our theoretical calculations, supported by extensive Monte Carlo computer simulations, show that, indeed, extending the DNA chain and/or shortening the RecA filament will slow down the search dynamics. Furthermore, our theoretical model is able to describe quantitatively and explain the experimental observations (23), as well as to make testable predictions.

Materials and Methods

Theoretical model

To describe the homology search, we introduce a discrete-state stochastic model as presented in Fig. 1. The DNA molecule is viewed as consisting of $L$ binding sites, and one of them (at the site $m$) is the homology segment that the RecA filament needs to find and recognize. We assume that the size of each site is equal to the filament’s length, so that each site includes many nucleotides. The RecA nucleoprotein filament, which consists of the protein molecule and a ssDNA segment, can associate nonspecifically to dsDNA at any site with a rate $k_{\mathrm{on}}$, and the dissociation rate into the solution is equal to $k_{\mathrm{off}}$ (Fig. 1). We divide the bulk volume around DNA into $L$ segments, each of them surrounding the corresponding site on DNA (as shown in Fig. 1). While in the solution, the RecA filament can go from one segment to a neighboring segment with the effective diffusion rate $u$ (Fig. 1). It is important to note that the rate $u$ is generally not an intrinsic diffusivity of the RecA filament, but instead an effective rescaled diffusion rate that depends on the degree of coiling of DNA. This effectively means that we simplified a three-dimensional (3D) motion of the RecA filament into an effective one-dimensional (1D) motion. There are several arguments to justify this approximation. We can always look at the translation of the center of mass of the RecA filament projected along the line that connects the DNA ends as an effective 1D motion. In addition, the experiments on DNA pairing by RecA filaments were performed in a quasi-1D setup in which the dual optical trapping system fixed the DNA ends, and the width of the experimental setup is ∼1 μm, while the DNA chain is extended to a distance of ∼10 μm (23). The corresponding diffusion rate depends on the end-to-end extension of the DNA chain: the smaller this distance, the larger the rate $u$. This assumption explicitly incorporates the idea of the intersegmental contact sampling. To simplify calculations, a single-molecule view is adopted. We also label all states as $n^{(i)}$, with $i=0$ when the RecA filament is associated to DNA, and $i=1$ when the RecA filament is free in the solution (Fig. 1).

Figure 1. A schematic view of the homology search on DNA by RecA filaments. There are $L-1$ nonspecific sites and one specific homology site at the position $m$ on the DNA chain. A filament can diffuse in the solution along the DNA chain with the rate $u$, or might associate to any site on DNA with the rate $k_{\mathrm{on}}$. From DNA, the RecA filament can dissociate into the solution with the rate $k_{\mathrm{off}}$. The states in the bulk solution are labeled as 1, and the states on DNA are labeled as 0. To see this figure in color, go online.
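The kinetic scheme of Fig. 1 can be sampled directly with a Gillespie-type Monte Carlo simulation. The following is a minimal sketch under stated assumptions, not the authors' simulation code: the function names, the reflecting treatment of the end segments, and the parameter values are illustrative choices. The search is taken to end at the first binding to site `m`, and the closed-form mean search time of Eqs. 21 and 22 is included only for comparison.

```python
import random


def mean_search_time_eq21(L, m, u, k_on, k_off):
    """Closed-form mean search time (Eqs. 21 and 22 of the text)."""
    W = 1 + 3 * L + 2 * L**2 - 6 * m - 6 * L * m + 6 * m**2
    return W * (k_on + k_off) / (6.0 * u * k_off) + (L - 1) / k_off + L / k_on


def simulate_search_time(L, m, u, k_on, k_off, rng):
    """One stochastic trajectory; returns the first-passage time to site m."""
    n = rng.randint(1, L)  # uniform starting segment in the solution
    t = 0.0
    while True:
        # The filament is in solution segment n: it can hop to a
        # neighboring segment (rate u each) or bind to site n (rate k_on).
        hops = (n > 1) + (n < L)  # end segments have a single neighbor
        total = hops * u + k_on
        t += rng.expovariate(total)
        if rng.random() * total < k_on:
            if n == m:
                return t  # bound to the homology site: search completed
            # Nonspecific binding: wait for dissociation back to segment n.
            t += rng.expovariate(k_off)
        elif n == 1:
            n = 2
        elif n == L:
            n = L - 1
        else:
            n += rng.choice((-1, 1))


def estimate_mean_time(L, m, u, k_on, k_off, runs, seed=0):
    """Average first-passage time over independent trajectories."""
    rng = random.Random(seed)
    total = sum(simulate_search_time(L, m, u, k_on, k_off, rng)
                for _ in range(runs))
    return total / runs


if __name__ == "__main__":
    # Illustrative parameters only (not the values used in the figures).
    args = (10, 5, 5.0, 1.0, 1.0)
    print("Monte Carlo:", estimate_mean_time(*args, runs=2000))
    print("Eq. 21     :", mean_search_time_eq21(*args))
```

With enough trajectories the sampled average should approach the analytical value; the agreement is statistical, so the tolerance depends on the number of runs.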

The RecA filament starts from the solution and the process is completed when it reaches the homology site $m$ for the first time. This means that we adopt a simple all-or-nothing approach to describe the recognition step in the homology search, neglecting the molecular details of this complex process (10, 14, 37, 38, 39). This also suggests employing a first-passage method of analyzing the dynamic properties, which has proved very successful in analyzing various protein search phenomena (26, 27, 31, 40). We introduce functions $F_n^{(0)}(t)$ and $F_n^{(1)}(t)$, defined as the probability density functions to reach the target at time $t$ for the first time, if initially at $t=0$ the RecA filament starts at the site $n^{(i)}$ ($n=0,1,\ldots,L$) on the DNA ($i=0$) or in the solution ($i=1$). The temporal evolution of the first-passage probabilities follows the backward master equations (27). For $n \neq m$, we have

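$$\frac{\partial F_n^{(0)}(t)}{\partial t} = k_{\mathrm{off}} F_n^{(1)}(t) - k_{\mathrm{off}} F_n^{(0)}(t), \tag{1}$$

and

$$\frac{\partial F_n^{(1)}(t)}{\partial t} = u\left[F_{n+1}^{(1)}(t) + F_{n-1}^{(1)}(t)\right] + k_{\mathrm{on}} F_n^{(0)}(t) - (2u + k_{\mathrm{on}}) F_n^{(1)}(t). \tag{2}$$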

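Because the dynamics is different at the boundaries ($n=1$, $n=m$, or $n=L$), the corresponding equations are the following:

$$\frac{\partial F_1^{(1)}(t)}{\partial t} = u F_2^{(1)}(t) + k_{\mathrm{on}} F_1^{(0)}(t) - (u + k_{\mathrm{on}}) F_1^{(1)}(t), \tag{3}$$

$$\frac{\partial F_m^{(1)}(t)}{\partial t} = u\left[F_{m+1}^{(1)}(t) + F_{m-1}^{(1)}(t)\right] + k_{\mathrm{on}} F_m^{(0)}(t) - (2u + k_{\mathrm{on}}) F_m^{(1)}(t), \tag{4}$$

$$\frac{\partial F_L^{(1)}(t)}{\partial t} = u F_{L-1}^{(1)}(t) + k_{\mathrm{on}} F_L^{(0)}(t) - (u + k_{\mathrm{on}}) F_L^{(1)}(t). \tag{5}$$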

In addition, the initial conditions imply that

$$F_m^{(0)}(t) = \delta(t). \tag{6}$$

The physical meaning of this expression is the following: if the RecA filament starts at t=0 from the homology sequence site, the search is instantaneously completed.

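To determine the first-passage probability functions, we utilize a Laplace transformation, $\tilde{F}(s) = \int_0^{\infty} e^{-st} F(t)\,dt$. This allows us to simplify the problem by solving algebraic equations instead of the original differential equations. Thus, Eqs. 1 and 2 are modified into

$$s\tilde{F}_n^{(0)}(s) = k_{\mathrm{off}} \tilde{F}_n^{(1)}(s) - k_{\mathrm{off}} \tilde{F}_n^{(0)}(s); \tag{7}$$

$$s\tilde{F}_n^{(1)}(s) = u\left[\tilde{F}_{n+1}^{(1)}(s) + \tilde{F}_{n-1}^{(1)}(s)\right] + k_{\mathrm{on}} \tilde{F}_n^{(0)}(s) - (2u + k_{\mathrm{on}}) \tilde{F}_n^{(1)}(s). \tag{8}$$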

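And for the boundaries, taking into account Eq. 6, we have $\tilde{F}_m^{(0)}(s)=1$, and the following expressions can be written:

$$\left[s + u + k_{\mathrm{on}} - \frac{k_{\mathrm{on}} k_{\mathrm{off}}}{s + k_{\mathrm{off}}}\right] \tilde{F}_1^{(1)}(s) = u \tilde{F}_2^{(1)}(s); \tag{9}$$

$$\left[s + u + k_{\mathrm{on}} - \frac{k_{\mathrm{on}} k_{\mathrm{off}}}{s + k_{\mathrm{off}}}\right] \tilde{F}_L^{(1)}(s) = u \tilde{F}_{L-1}^{(1)}(s); \tag{10}$$

$$\left[s + 2u + k_{\mathrm{on}}\right] \tilde{F}_m^{(1)}(s) = u\left[\tilde{F}_{m+1}^{(1)}(s) + \tilde{F}_{m-1}^{(1)}(s)\right] + k_{\mathrm{on}}. \tag{11}$$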

These equations can be solved, producing from Eq. 7:

$$\tilde{F}_n^{(1)}(s) = \tilde{F}_n^{(0)}(s)\,\frac{s + k_{\mathrm{off}}}{k_{\mathrm{off}}}. \tag{12}$$

Then, from Eq. 8, taking into account the boundary conditions, we obtain

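$$\tilde{F}_n^{(0)}(s) = A_1 y^{n} + A_2 y^{-n}, \tag{13}$$

where

$$y = \frac{s + 2u + k_{\mathrm{on}} - \dfrac{k_{\mathrm{on}} k_{\mathrm{off}}}{s + k_{\mathrm{off}}} - \sqrt{\left(s + 2u + k_{\mathrm{on}} - \dfrac{k_{\mathrm{on}} k_{\mathrm{off}}}{s + k_{\mathrm{off}}}\right)^{2} - 4u^{2}}}{2u}, \tag{14}$$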

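and the coefficients $A_1$ and $A_2$ have different values depending on the position of the state $n$ with respect to the target. For $n \le m$, we have

$$A_1 = \frac{k_{\mathrm{on}}\left(y^{m} + y^{2L-m+1}\right)}{\left(\dfrac{k_{\mathrm{on}} k_{\mathrm{off}}}{s + k_{\mathrm{off}}}\right)\left(y^{m} + y^{1-m}\right)\left(y^{m} + y^{2L-m+1}\right) + u\left(1 - y^{2}\right)\left(1 - y^{2L}\right)}, \tag{15}$$

and

$$A_2 = y A_1. \tag{16}$$

Similarly, for $n \ge m$ it can be shown that

$$A_1 = \frac{k_{\mathrm{on}}\left(y^{m} + y^{1-m}\right)}{\left(\dfrac{k_{\mathrm{on}} k_{\mathrm{off}}}{s + k_{\mathrm{off}}}\right)\left(y^{m} + y^{1-m}\right)\left(y^{m} + y^{2L-m+1}\right) + u\left(1 - y^{2}\right)\left(1 - y^{2L}\right)}, \tag{17}$$

and

$$A_2 = y^{2L+1} A_1. \tag{18}$$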

The knowledge of first-passage distribution functions allows us to obtain a comprehensive description of the dynamics in the system. For example, we are interested in evaluating the mean search times for the RecA filament to find the homology sequence starting from the solution. It is reasonable to assume that the nucleoprotein filament can start with equal probability at any region of the solution around the DNA chain. This also corresponds to the experimental conditions of studying the homology search with well-mixed chemical components (23). Then for the mean search time, we derive

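$$T = \frac{1}{L} \sum_{n=1}^{L} \tau_n^{(1)}, \tag{19}$$

where

$$\tau_n^{(1)} = -\left.\frac{\partial \tilde{F}_n^{(1)}}{\partial s}\right|_{s=0} \tag{20}$$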

is defined as a mean time to reach the target from the state n in the solution (Fig. 1). The final expression for the mean search time is given by

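$$T = \frac{W}{6u\left[k_{\mathrm{off}}/(k_{\mathrm{on}} + k_{\mathrm{off}})\right]} + \left[\frac{L-1}{k_{\mathrm{off}}} + \frac{L}{k_{\mathrm{on}}}\right], \tag{21}$$

where

$$W = 1 + 3L + 2L^{2} - 6m - 6Lm + 6m^{2}. \tag{22}$$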

Equation 21 has a clear physical meaning. The first term corresponds to the time that the RecA filament spends diffusing in the solution under the condition that it is not bound to the DNA (the probability of which is equal to $k_{\mathrm{off}}/(k_{\mathrm{on}}+k_{\mathrm{off}})$). The second term describes the total time for associations and dissociations. So in the limit of very fast diffusion in the solution, $u \to \infty$, which corresponds to highly coiled DNA configurations, the search time is equal to

$$T = \frac{L-1}{k_{\mathrm{off}}} + \frac{L}{k_{\mathrm{on}}}. \tag{23}$$

This expression is easy to understand because, on average, the filament will make $L-1$ unsuccessful binding attempts to DNA before the $L$th step, where it will find the homology sequence. This also agrees with the analytical result obtained earlier (27).
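Equation 21 can be cross-checked numerically without the closed-form coefficients. The sketch below assumes the kinetic scheme of Fig. 1 with end segments that have a single neighbor, eliminates the DNA-bound states from the mean first-passage time equations, and solves the remaining tridiagonal system for the mean times from each solution segment. The function names are illustrative and are not taken from the original work.

```python
def mean_search_time_eq21(L, m, u, k_on, k_off):
    """Closed-form mean search time (Eqs. 21 and 22 of the text)."""
    W = 1 + 3 * L + 2 * L**2 - 6 * m - 6 * L * m + 6 * m**2
    return W * (k_on + k_off) / (6.0 * u * k_off) + (L - 1) / k_off + L / k_on


def solve_tridiagonal(lower, diag, upper, rhs):
    """Thomas algorithm for a tridiagonal linear system (no pivoting)."""
    n = len(diag)
    d = list(diag)
    b = list(rhs)
    for i in range(1, n):
        w = lower[i] / d[i - 1]
        d[i] -= w * upper[i - 1]
        b[i] -= w * b[i - 1]
    x = [0.0] * n
    x[-1] = b[-1] / d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = (b[i] - upper[i] * x[i + 1]) / d[i]
    return x


def mean_search_time_numeric(L, m, u, k_on, k_off):
    """Mean first-passage time averaged over all starting solution segments.

    After eliminating the bound states, the mean times tau_n obey
      hops*u*tau_n - u*(neighbors) = 1 + k_on/k_off   for n != m,
      (hops*u + k_on)*tau_m - u*(neighbors) = 1       for n == m,
    where hops is the number of neighboring segments (1 at the ends).
    """
    lower, diag, upper, rhs = [], [], [], []
    for n in range(1, L + 1):
        hops = (n > 1) + (n < L)
        lower.append(-u if n > 1 else 0.0)
        upper.append(-u if n < L else 0.0)
        if n == m:
            diag.append(hops * u + k_on)
            rhs.append(1.0)
        else:
            diag.append(hops * u)
            rhs.append(1.0 + k_on / k_off)
    tau = solve_tridiagonal(lower, diag, upper, rhs)
    return sum(tau) / L


if __name__ == "__main__":
    # Illustrative parameters only.
    args = (20, 7, 3.0, 2.0, 0.5)
    print(mean_search_time_numeric(*args), mean_search_time_eq21(*args))
```

The two numbers should agree to within floating-point round-off for any target position, including targets at the ends of the chain.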

Similar analysis can be done for any dynamic property in the homology search, and we will illustrate this below when the experimental results are analyzed.

Results and Discussion

Dynamic properties of the homology search

The knowledge of first-passage probability functions allows us to fully analyze the dynamics of the homology search. Several important questions can now be investigated. One of them is the role of the target sequence position along the DNA on the search dynamics. This can be done by varying the target location m, evaluating how it affects the mean search time. The corresponding results are presented in Fig. 2.

Figure 2. Normalized mean search time to find the homology sequence as a function of the relative target position for different DNA lengths. (Solid lines) Theoretical predictions; (symbols) Monte Carlo simulations. Parameters used for calculations are: $k_{\mathrm{on}}=u=10^{5}$ s$^{-1}$ and $k_{\mathrm{off}}=10^{3}$ s$^{-1}$. To see this figure in color, go online.

One can see that moving the target location along the DNA chain changes the search times for all sets of parameters (Fig. 2). The fastest dynamics is observed when the homology sequence is in the middle of DNA, and the slowest dynamics when the target is at the end. For long DNA chains, $L \gg 1$, it is up to four times faster to find the target if it is in the center versus the end of the DNA molecule (Fig. 2). This can be easily explained by noting that, to reach the homology sequence, the RecA filament must first come to the volume segment around the target. Because the filament starts with equal probability anywhere in the volume around the DNA chain, the dynamics is faster if the target is in the middle of the chain. On average, the protein filament must diffuse $L/4$ segments to reach the segment around the target in the middle, while for the end target locations it must move through $L/2$ volume segments. The time spent diffusing in the solution scales quadratically with the distance. Then the slowdown of the search due to moving the target to the end is given by $T/T_{\max} \approx (L/4)^{2}/(L/2)^{2} = 1/4$ (Fig. 2).
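The factor-of-four estimate can be checked against Eq. 21 directly. The short sketch below evaluates the ratio of the mean search time for a central target to that for a target at the first site; the function names are illustrative, and the rates are the ones quoted in the caption of Fig. 2 as reconstructed here.

```python
def mean_search_time(L, m, u, k_on, k_off):
    """Closed-form mean search time (Eqs. 21 and 22 of the text)."""
    W = 1 + 3 * L + 2 * L**2 - 6 * m - 6 * L * m + 6 * m**2
    return W * (k_on + k_off) / (6.0 * u * k_off) + (L - 1) / k_off + L / k_on


def center_to_end_ratio(L, u, k_on, k_off):
    """T(target in the middle) / T(target at the first site)."""
    t_center = mean_search_time(L, L // 2, u, k_on, k_off)
    t_end = mean_search_time(L, 1, u, k_on, k_off)
    return t_center / t_end


if __name__ == "__main__":
    for L in (10, 100, 1000):
        print(L, center_to_end_ratio(L, u=1e5, k_on=1e5, k_off=1e3))
```

For these rates the ratio approaches 1/4 as the chain gets longer, because the diffusion term of Eq. 21 dominates and $W$ is four times smaller for $m=L/2$ than for $m=1$ when $L \gg 1$.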

It is interesting to note that this behavior differs from that of other protein search systems where 3D diffusion is usually very fast (27, 31). For these systems, the relative contribution of the sliding along DNA in comparison with the bulk diffusion determines the importance of varying the target position. If the search is dominated by the motion via the bulk solution, the variation of the target position is less relevant for the protein search dynamics.

The next question is to understand the effect of the length of DNA in the homology search. The dependence of the search time on $L$ is presented in Fig. 3. Two different behaviors are observed. When the protein filament spends most of the time in the solution ($u \ll k_{\mathrm{on}}$), the mean search time has a quadratic scaling, $T \sim L^{2}$. In the solution, the RecA protein filament performs a simple unbiased diffusion. At the same time, when the rate-limiting step is associated with binding/unbinding events to/from DNA ($u \gg k_{\mathrm{on}}, k_{\mathrm{off}}$), the scaling is linear, $T \sim L$, because the filament visits each site on DNA independently of the others. The same conclusions can be obtained from Eq. 21. Thus, for RecA to conduct a fast and efficient search, it should not stay very long in the solution if DNA chains are quite long. This also means that the search is generally faster for longer RecA filaments, assuming that the length of DNA is fixed, because a smaller number of associations to DNA is needed to find the homology sequence.
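The two scaling regimes follow from Eq. 21 and can be illustrated by estimating the local exponent $\alpha$ in $T \sim L^{\alpha}$ from two chain lengths. This is a minimal sketch with illustrative rate values chosen to sit deep in each regime; they are assumptions and are not the parameters of Fig. 3.

```python
import math


def mean_search_time(L, m, u, k_on, k_off):
    """Closed-form mean search time (Eqs. 21 and 22 of the text)."""
    W = 1 + 3 * L + 2 * L**2 - 6 * m - 6 * L * m + 6 * m**2
    return W * (k_on + k_off) / (6.0 * u * k_off) + (L - 1) / k_off + L / k_on


def scaling_exponent(L1, L2, u, k_on, k_off):
    """Log-log slope of T versus L between L1 and L2, with the target at L/2."""
    t1 = mean_search_time(L1, L1 // 2, u, k_on, k_off)
    t2 = mean_search_time(L2, L2 // 2, u, k_on, k_off)
    return math.log(t2 / t1) / math.log(L2 / L1)


if __name__ == "__main__":
    # Slow exchange between segments (u << k_on): diffusion term dominates.
    print(scaling_exponent(100, 1000, u=1.0, k_on=1e5, k_off=1e3))
    # Fast exchange (u >> k_on, k_off): binding/unbinding term dominates.
    print(scaling_exponent(100, 1000, u=1e8, k_on=1.0, k_off=1.0))
```

The first call gives an exponent close to 2 and the second an exponent close to 1, in line with the quadratic and linear regimes described above.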

Figure 3. Mean search times to find the homology sequence as a function of the DNA length. (Solid lines) Theoretical predictions; (symbols) Monte Carlo simulations. The homology sequence is in the middle of the DNA chain ($m=L/2$). To see this figure in color, go online.

Another important factor in the homology search is the strength of protein-DNA interactions. To quantify the affinity between protein and DNA molecules, we consider an equilibrium binding constant $K=k_{\mathrm{on}}/k_{\mathrm{off}}$ that specifies the tendency of the protein to associate nonspecifically to DNA. The mean search times as a function of the binding affinity are illustrated in Fig. 4. A nonmonotonic behavior is observed, which can be explained using the following arguments. For weak affinities, $K \ll 1$, the protein filament rarely associates to DNA, and this prevents it from quickly finding the homology sequence. For strong affinities, $K \gg 1$, the effect is opposite. The nonspecific interaction between the RecA filament and DNA is so strong that the protein is frequently trapped at different locations on DNA. This again slows down the search dynamics. Only for intermediate affinities, $K \sim 1$–$10$, is the homology search fast, because the protein can efficiently scan the DNA without being trapped. Experiments suggest that RecA operates in this regime of affinities (23). Similar nonmonotonic behavior for search dynamics in other protein systems has been explained earlier (32).
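The nonmonotonic dependence on the affinity can be made explicit with Eq. 21. At fixed $k_{\mathrm{off}}$, substituting $k_{\mathrm{on}} = K k_{\mathrm{off}}$ gives $T(K) = W(1+K)/(6u) + (L-1)/k_{\mathrm{off}} + L/(K k_{\mathrm{off}})$, and setting $dT/dK=0$ yields $K^{*}=\sqrt{6uL/(W k_{\mathrm{off}})}$. This expression for $K^{*}$ is derived here from Eq. 21 and does not appear in the original text. The sketch below compares it with a brute-force scan; $L$, $m$, and $u$ follow the caption of Fig. 4, while the value of $k_{\mathrm{off}}$ is an illustrative assumption.

```python
import math


def mean_search_time(L, m, u, k_on, k_off):
    """Closed-form mean search time (Eqs. 21 and 22 of the text)."""
    W = 1 + 3 * L + 2 * L**2 - 6 * m - 6 * L * m + 6 * m**2
    return W * (k_on + k_off) / (6.0 * u * k_off) + (L - 1) / k_off + L / k_on


def optimal_affinity(L, m, u, k_off):
    """K = k_on/k_off minimizing Eq. 21 at fixed k_off (from dT/dK = 0)."""
    W = 1 + 3 * L + 2 * L**2 - 6 * m - 6 * L * m + 6 * m**2
    return math.sqrt(6.0 * u * L / (W * k_off))


def scan_affinity(L, m, u, k_off, points=401):
    """Brute-force minimum of T(K) on a logarithmic grid from 1e-2 to 1e2."""
    grid = [10 ** (-2 + 4 * i / (points - 1)) for i in range(points)]
    times = [mean_search_time(L, m, u, K * k_off, k_off) for K in grid]
    return grid[times.index(min(times))]


if __name__ == "__main__":
    L, m, u, k_off = 100, 50, 1.0, 0.01  # k_off is an assumed value
    print("grid minimum :", scan_affinity(L, m, u, k_off))
    print("dT/dK = 0    :", optimal_affinity(L, m, u, k_off))
```

Both estimates fall at an intermediate affinity, with longer search times on either side of the minimum.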

Figure 4. Mean search times to find the homology sequence as a function of the RecA filament-DNA affinity, $K=k_{\mathrm{on}}/k_{\mathrm{off}}$. Different curves correspond to different but fixed dissociation transition rates $k_{\mathrm{off}}$. Parameters used for calculations are: $L=100l$, $m=50l$, and $u=1$ s$^{-1}$. To see this figure in color, go online.

Our theoretical approach allows us to fully describe all possible search behaviors in the system. To show this, we build a dynamic phase diagram for the homology search, as shown in Fig. 5. To understand the dynamics, we note that there are two relevant length scales in the system. The first one is the length of the DNA chain, $L$. The second one is a length $d=\sqrt{u/k_{\mathrm{on}}}$, which has the physical meaning of the average distance that the RecA filament diffuses in the solution before binding to DNA. These two length scales lead to two different dynamic search regimes (Fig. 5). When $d \ll L$, the nucleoprotein filament has a strong tendency to nonspecifically bind DNA at any position, and this effectively traps the protein-ssDNA filament on the dsDNA chain. The search time decreases with increasing the length $d$ because it corresponds to relaxing the trapping on DNA, leading to a faster search for the homology sequence. The situation is different for $d \gg L$, when the protein filament prefers to be found in the solution. But this does not help with the search. Increasing the length $d$ also increases the mean search time, and this makes the search even less efficient. One can clearly see that the optimal search is observed when the filament diffusion in the solution is balanced by frequent associations to and dissociations from DNA.

Figure 5. Mean search times to find the homology sequence as a function of the characteristic length $d$ for different DNA lengths. (Solid curves) Theoretical predictions; (symbols) Monte Carlo computer simulations. Parameters used for calculations were $m=L/2$, $u=10^{5}$ s$^{-1}$, and $k_{\mathrm{off}}=10^{3}$ s$^{-1}$. To see this figure in color, go online.

It is important to note that this dynamic phase diagram is also different from other protein search systems, where three dynamic regimes are usually observed (27, 31). The main reason for this is the absence of the sliding along the DNA, which is associated with another length scale and a different dynamic regime.

Comparison with experiments

One of the advantages of our theoretical approach is the fact that it provides a fully analytical description of the homology search dynamics. It can be tested by applying it to a quantitative description of the experiments on RecA. Here we present the analysis of the single-molecule observations of the in vitro RecA homology search (23). In these experiments, the degree of DNA pairing by RecA filaments has been measured for different end-to-end distances in DNA and for different filament lengths (23).

To assist our analysis, we first have to evaluate the effective diffusion rate u. Because the RecA nucleoprotein filaments utilized in these experiments usually are not very long (several hundreds of nucleotides in length), they can be viewed as rigid cylindrical tubes. Then the translational diffusion constant for the cylindrical object of length l and diameter a in the solution of viscosity η can be written as (41)

$$D = \frac{k_{B}T\left[\ln(l/a) + c\right]}{6\pi\eta l}, \tag{24}$$

where the numerical parameter $c$ is a finite-size correction. For long cylinders, $l \gg a$, it was numerically evaluated that $c \approx 0.312$ (41). For RecA protein, the diameter can be estimated as $a \approx 2$ nm (42), and the filament with 500 nucleotides has a length $l \approx 150$ nm, so that $l$ is always much larger than $a$. Assuming that the viscosity of the solution can be reasonably approximated as the viscosity of water, $\eta(\mathrm{water}) = 8.9 \times 10^{-4}$ kg m$^{-1}$ s$^{-1}$, and considering RecA nucleoprotein filaments with a length of 500 nucleotides, we estimate the translational diffusion coefficient as $D \approx 7 \times 10^{-12}$ m$^{2}$/s.

To obtain the estimate for the effective diffusion rate $u$, we note that the volume around the DNA molecule is divided into $L/l$ segments, and the effective 1D size of each segment is given by $l_{\mathrm{eff}}=R/(L/l)$, where $R$ is the end-to-end DNA distance (see also Fig. 1). It takes the average time $1/u$ to move the distance $l_{\mathrm{eff}}$ in our model. Simultaneously, the time for the RecA filament to diffuse the same distance in the solution is $l_{\mathrm{eff}}^{2}/(2D)$. Then the following condition

$$\frac{1}{u} \approx \frac{l_{\mathrm{eff}}^{2}}{2D} = \frac{R^{2} l^{2}}{2 D L^{2}} \tag{25}$$

leads to the explicit estimate for the rate $u$. For experimental parameters that describe the DNA pairing by RecA filaments ($L=48{,}502$ bp) (23), it can be shown that $u \approx 10^{3}$ s$^{-1}$ for the fully extended DNA chains ($R=16$ μm), and for the coiled DNA chains ($R \approx 2$ μm), we obtain $u \approx 10^{5}$ s$^{-1}$. The hundredfold increase in the diffusion rate $u$ is related to the order-of-magnitude decrease in the end-to-end distance $R$ between the fully stretched and fully coiled DNA conformations and to the quadratic dependence of the time on the distance for the diffusion process. Experiments also suggest that for the filament concentration of 100 pM, the equilibrium binding constant is $K>10$, giving the connection between the transition rates $k_{\mathrm{on}}$ and $k_{\mathrm{off}}$ (23).
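Equation 25 implies that $u$ scales as $1/R^{2}$ at a fixed number of segments. The sketch below illustrates this with the effective rates listed in the caption of Fig. 6 (113 segments, $u = 24{,}536$ s$^{-1}$ at $R = 2$ μm). The diffusion coefficient is backed out from those caption values purely for illustration; it is an inferred number, not one stated in the text, and the function names are hypothetical.

```python
def effective_rate(D, R, n_segments):
    """Eq. 25: u = 2 D / l_eff^2, with l_eff = R / n_segments (SI units)."""
    l_eff = R / n_segments
    return 2.0 * D / l_eff**2


def implied_diffusion_coefficient(u, R, n_segments):
    """Invert Eq. 25 to obtain the D that corresponds to a given rate u."""
    l_eff = R / n_segments
    return u * l_eff**2 / 2.0


if __name__ == "__main__":
    n_segments = 113  # L = 113 l in the caption of Fig. 6
    # D inferred from the caption value u = 24,536 1/s at R = 2 micrometers.
    D = implied_diffusion_coefficient(24536.0, 2e-6, n_segments)
    print("implied D (m^2/s):", D)
    for R_um in (2, 4, 6, 8):
        print(R_um, "um ->", effective_rate(D, R_um * 1e-6, n_segments))
```

Doubling the end-to-end distance lowers the effective rate fourfold, which reproduces the sequence of rates quoted in the caption of Fig. 6.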

The experimentally observed fraction f(t) of RecA molecules that find the homology sequence at the time t can be evaluated from the first-passage probability functions, averaged over the initial starting conditions,

$$f(t) = \frac{1}{L} \sum_{n=1}^{L} \int_{0}^{t} F_n^{(1)}(t')\,dt'. \tag{26}$$

These quantities are explicitly calculated by numerically inverting the Laplace transforms $\tilde{F}_n^{(1)}(s)$ using the procedure described in Valko and Abate (43). Experimental measurements on the degree of DNA pairing by RecA filaments are then described with only one fitting parameter, and the results are presented in Figs. 6 and 7.
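A calculation of this kind can be sketched as follows. The text does not state which inversion algorithm was used, so the fixed-Talbot method is an assumption made here; the tridiagonal solve of Eqs. 8 to 11 in Laplace space, the function names, and the parameter values are likewise illustrative. The transform of the cumulative fraction in Eq. 26 is $\tilde{f}(s) = \frac{1}{Ls}\sum_n \tilde{F}_n^{(1)}(s)$.

```python
import cmath
import math


def cumulative_fraction_laplace(s, L, m, u, k_on, k_off):
    """(1/(L s)) * sum_n F~_n^(1)(s), from Eqs. 8-11 as a tridiagonal system."""
    c = k_on * k_off / (s + k_off)  # bound states eliminated via Eq. 12
    diag, rhs = [], []
    for n in range(1, L + 1):
        hops = (n > 1) + (n < L)  # end segments have a single neighbor
        if n == m:
            diag.append(s + hops * u + k_on)
            rhs.append(k_on)
        else:
            diag.append(s + hops * u + k_on - c)
            rhs.append(0.0)
    # Thomas algorithm; every off-diagonal element equals -u.
    for i in range(1, L):
        w = -u / diag[i - 1]
        diag[i] -= w * (-u)
        rhs[i] -= w * rhs[i - 1]
    F = [0.0] * L
    F[-1] = rhs[-1] / diag[-1]
    for i in range(L - 2, -1, -1):
        F[i] = (rhs[i] + u * F[i + 1]) / diag[i]
    return sum(F) / (L * s)


def talbot_invert(F, t, M=24):
    """Fixed-Talbot numerical inversion of a Laplace transform F at time t > 0."""
    r = 2.0 * M / (5.0 * t)
    total = 0.5 * complex(F(r)).real * math.exp(r * t)
    for k in range(1, M):
        theta = k * math.pi / M
        cot = math.cos(theta) / math.sin(theta)
        s = r * theta * complex(cot, 1.0)
        sigma = theta + (theta * cot - 1.0) * cot
        total += (cmath.exp(t * s) * F(s) * complex(1.0, sigma)).real
    return r / M * total


def paired_fraction(t, L, m, u, k_on, k_off):
    """Eq. 26: fraction of searches completed by time t."""
    return talbot_invert(
        lambda s: cumulative_fraction_laplace(s, L, m, u, k_on, k_off), t)


if __name__ == "__main__":
    # Small illustrative system, not the experimental parameter set.
    for t in (1.0, 14.0, 200.0):
        print(t, paired_fraction(t, L=6, m=3, u=2.0, k_on=1.0, k_off=1.0))
```

The same routine can be evaluated for other parameter sets; double-precision arithmetic limits the useful range of `M`, so very stiff rate combinations may require a multi-precision implementation.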

Figure 6. Fraction of the homologously paired RecA molecules as a function of the DNA end-to-end distance for $l=430$ nucleotide filaments 2 min after the beginning of the homology search. (Red symbols) Experimental data from Forget and Kowalczykowski (23); (green circles) theoretical estimates. The following parameters were utilized for calculations: $k_{\mathrm{on}}=1000$ s$^{-1}$, $k_{\mathrm{off}}=2$ s$^{-1}$, $m=55l$, and $L=113l$; the effective diffusion rates $u$ = 24,536, 6134, 2726, and 1533 s$^{-1}$ for $R$ = 2, 4, 6, and 8 μm DNA end-to-end distances, respectively. To see this figure in color, go online.

Figure 7. Fraction of the homologously paired RecA molecules as a function of time for 430 nucleotide filaments. (Symbols) Experimental data from Forget and Kowalczykowski (23); (solid curves) theoretical predictions. The following parameters were utilized for calculations: $k_{\mathrm{on}}=1000$ s$^{-1}$, $k_{\mathrm{off}}=2$ s$^{-1}$, $m=55l$, and $L=113l$; the effective diffusion rates $u$ = 24,536 and 2726 s$^{-1}$ for 2 and 6 μm DNA end-to-end distances, respectively. To see this figure in color, go online.

As one can see from Fig. 6, extending the distance between DNA ends makes the homology search less efficient. When DNA is coiled (2 μm end-to-end distance), <85% of the targets are found in <2 min. For the extended DNA chain (8 μm end-to-end distance), <10% of the homology sequences are located during the same time period. Similar behavior is presented in Fig. 7, where the temporal evolution of the fraction of the successful homology search events is presented. Our theoretical picture, which argues for longer search times for more extended DNA configurations, is able to capture these experimental observations reasonably well.

Because of the fully quantitative nature of our theoretical model, more than just existing experimental observations can be described. We can also make specific predictions that can be tested in the lab. More specifically, for the experimental conditions described in Forget and Kowalczykowski (23), the dependence of the search time on the degree of coiling of DNA molecules is presented in Fig. 8. We predict that the homology search will be accomplished in <2 min for coiled DNA chains, while for the extended conformations it might take up to 8–10 min.

Figure 8. Theoretical predictions for homology search times as a function of the DNA end-to-end distance for experimental conditions in Forget and Kowalczykowski (23). To see this figure in color, go online.

Conclusions

We developed a fully quantitative theoretical approach to describe the homology search process by RecA nucleoprotein filaments. It is based on the idea of intersegment sampling, which suggests that it takes longer to probe more extended DNA conformations and for shorter filaments. Our discrete-state stochastic model of the homology search, which takes into account the most relevant physical-chemical processes in the system, is solved analytically using the method of first-passage processes.

The presented theoretical analysis shows that the location of the homology sequence influences the search dynamics: the search is up to four times faster if the target is in the middle of the DNA chain in comparison with the location at the DNA ends. We also found that the dependence of the search time on the DNA length is determined by the dominating process in the system. If the RecA filament spends most of the time in the solution around the DNA molecule, then the search time has a quadratic dependence on the DNA length. However, if association to DNA and dissociation from DNA are rate-limiting steps, then linear scaling is observed. In addition, the nonspecific filament-DNA interactions have been identified as another factor that affects the homology search. Our theoretical calculations suggest that there is an optimal filament-DNA affinity that speeds up the search dynamics. It was argued that this effect is the result of the compromise between being trapped on DNA for strong attractive interactions and not coming to DNA at all for strong repulsions. We also constructed a dynamic phase diagram for the homology search, where two different regimes were found. When the distance traveled by the filament in the solution is much smaller than the DNA length, the RecA filament has a strong tendency to be trapped on DNA. Increasing this length accelerates the search. However, when this length becomes larger than the DNA length, the trend is reversed because the filament is mostly in the solution, and this increases its search time for the homology sequence. Furthermore, the theoretical description of the homology search indicates that it differs in many aspects from other processes of proteins searching for targets on DNA. Finally, our theoretical picture is successfully applied to analyzing single-molecule experiments on DNA pairing by RecA filaments. We are able to quantitatively account for the effect of DNA coiling and the size of RecA filaments during the homology search. Experimentally testable predictions on the homology search times are also presented.

Although our theoretical approach provides a simple quantitative description of the homology search, which also agrees reasonably well with single-molecule experimental measurements, there are several issues that should be discussed. It has been argued that the intersegmental contact sampling is the leading mechanism in the homology search (23). However, our theoretical method only partially takes this mechanism into account. We incorporate the idea that it takes longer to search more extended DNA chains, but we implicitly assume that the RecA protein binds and dissociates as a whole filament during the search process. In reality, some parts of the filament can dissociate from DNA while simultaneously binding to other segments on DNA. This possibility of intersegment transfer is not accounted for in our method. These arguments suggest that our model should work better for more extended DNA conformations, and its accuracy for more coiled DNA configurations is less reliable. At the same time, one should notice that the intersegment transfer has been successfully accounted for in other protein search systems (40), so our theoretical method can be extended in this direction. Other important factors that were not taken into account in our theoretical model are the sliding of RecA filaments along the DNA chains and the DNA sequence heterogeneity (13, 44, 45). It is still unclear if these effects are relevant for the homology search, but the proposed discrete-state stochastic framework can be extended to address this question. Despite these issues, we believe that our method captures some important physical-chemical aspects of the mechanisms of the homology search. It will be important to test the theoretical predictions in experimental studies, as well as in more advanced theoretical treatments.

Author Contributions

A.B.K. designed research; M.P.K. and A.A.S. performed research; and all authors wrote the article.

Acknowledgments

We acknowledge the help of Dr. Peter Valko in clarifying some technical issues with numerical calculations.

The work was supported by the Welch Foundation (grant No. C-1559), by the NSF (grant No. CHE-1360979), and by the Center for Theoretical Biological Physics sponsored by the NSF (grant No. PHY-1427654).

Editor: Keir Neuman.

References

  • 1. Alberts B., Johnson A., Walter P. Molecular Biology of the Cell. 5th Ed. Garland Science; New York: 2007.
  • 2. Krejci L., Altmannova V., Zhao X. Homologous recombination and its regulation. Nucleic Acids Res. 2012;40:5795–5818. doi: 10.1093/nar/gks270.
  • 3. Renkawitz J., Lademann C.A., Jentsch S. Mechanisms and principles of homology search during recombination. Nat. Rev. Mol. Cell Biol. 2014;15:369–383. doi: 10.1038/nrm3805.
  • 4. Kuzminov A. DNA replication meets genetic exchange: chromosomal damage and its repair by homologous recombination. Proc. Natl. Acad. Sci. USA. 2001;98:8461–8468. doi: 10.1073/pnas.151260698.
  • 5. Sagi D., Tlusty T., Stavans J. High fidelity of RecA-catalyzed recombination: a watchdog of genetic diversity. Nucleic Acids Res. 2006;34:5021–5031. doi: 10.1093/nar/gkl586.
  • 6. Gonda D.K., Radding C.M. The mechanism of the search for homology promoted by recA protein. Facilitated diffusion within nucleoprotein networks. J. Biol. Chem. 1986;261:13087–13096.
  • 7. Morrical S.W. DNA-pairing and annealing processes in homologous recombination and homology-directed repair. Cold Spring Harb. Perspect. Biol. 2015;7:a016444. doi: 10.1101/cshperspect.a016444.
  • 8. Chen Z., Yang H., Pavletich N.P. Mechanism of homologous recombination from the RecA-ssDNA/dsDNA structures. Nature. 2008;453:489–494. doi: 10.1038/nature06971.
  • 9. Cox M.M. Motoring along with the bacterial RecA protein. Nat. Rev. Mol. Cell Biol. 2007;8:127–138. doi: 10.1038/nrm2099.
  • 10. Dorfman K.D., Fulconis R., Viovy J.-L. Model of RecA-mediated homologous recognition. Phys. Rev. Lett. 2004;93:268102. doi: 10.1103/PhysRevLett.93.268102.
  • 11. Fulconis R., Mine J., Viovy J.-L. Mechanism of RecA-mediated homologous recombination revisited by single molecule nanomanipulation. EMBO J. 2006;25:4293–4304. doi: 10.1038/sj.emboj.7601260.
  • 12. Gibb B., Greene E.C. Sliding to the rescue of damaged DNA. eLife. 2012;1:e00347. doi: 10.7554/eLife.00347.
  • 13. Ragunathan K., Liu C., Ha T. RecA filament sliding on DNA facilitates homology search. eLife. 2012;1:e00067. doi: 10.7554/eLife.00067.
  • 14. Kornyshev A.A., Wynveen A. The homology recognition well as an innate property of DNA structure. Proc. Natl. Acad. Sci. USA. 2009;106:4683–4688. doi: 10.1073/pnas.0811208106.
  • 15. Galletto R., Amitani I., Kowalczykowski S.C. Direct observation of individual RecA filaments assembling on single DNA molecules. Nature. 2006;443:875–878. doi: 10.1038/nature05197.
  • 16. van der Heijden T., Modesti M., Dekker C. Homologous recombination in real time: DNA strand exchange by RecA. Mol. Cell. 2008;30:530–538. doi: 10.1016/j.molcel.2008.03.010.
  • 17. Bell J.C., Plank J.L., Kowalczykowski S.C. Direct imaging of RecA nucleation and growth on single molecules of SSB-coated ssDNA. Nature. 2012;491:274–278. doi: 10.1038/nature11598.
  • 18. Menetski J.P., Kowalczykowski S.C. Interaction of recA protein with single-stranded DNA. Quantitative aspects of binding affinity modulation by nucleotide cofactors. J. Mol. Biol. 1985;181:281–295. doi: 10.1016/0022-2836(85)90092-0.
  • 19. Kowalczykowski S.C., Krupp R.A. DNA-strand exchange promoted by RecA protein in the absence of ATP: implications for the mechanism of energy transduction in protein-promoted nucleic acid transactions. Proc. Natl. Acad. Sci. USA. 1995;92:3478–3482. doi: 10.1073/pnas.92.8.3478.
  • 20. Bochkarev A., Pfuetzner R.A., Frappier L. Structure of the single-stranded-DNA-binding domain of replication protein A bound to DNA. Nature. 1997;385:176–181. doi: 10.1038/385176a0.
  • 21. Fu H., Le S., Yan J. Dynamics and regulation of RecA polymerization and de-polymerization on double-stranded DNA. PLoS One. 2013;8:e66712. doi: 10.1371/journal.pone.0066712.
  • 22. Pugh B.F., Cox M.M. General mechanism for RecA protein binding to duplex DNA. J. Mol. Biol. 1988;203:479–493. doi: 10.1016/0022-2836(88)90014-9.
  • 23. Forget A.L., Kowalczykowski S.C. Single-molecule imaging of DNA pairing by RecA reveals a three-dimensional homology search. Nature. 2012;482:423–427. doi: 10.1038/nature10782.
  • 24. Hu T., Grosberg A.Y., Shklovskii B.I. How proteins search for their specific sites on DNA: the role of DNA conformation. Biophys. J. 2006;90:2731–2744. doi: 10.1529/biophysj.105.078162.
  • 25. Kolomeisky A.B. Physics of protein-DNA interactions: mechanisms of facilitated target search. Phys. Chem. Chem. Phys. 2011;13:2088–2095. doi: 10.1039/c0cp01966f.
  • 26. Kolomeisky A.B., Veksler A. How to accelerate protein search on DNA: location and dissociation. J. Chem. Phys. 2012;136:125101. doi: 10.1063/1.3697763.
  • 27. Veksler A., Kolomeisky A.B. Speed-selectivity paradox in the protein search for targets on DNA: is it real or not? J. Phys. Chem. B. 2013;117:12695–12701. doi: 10.1021/jp311466f.
  • 28. Mirny L.A., Slutsky M., Kosmrlj A. How a protein searches for its site on DNA: the mechanism of facilitated diffusion. J. Phys. A Math. Theor. 2009;42:434013.
  • 29. Koslover E.F., Díaz de la Rosa M.A., Spakowitz A.J. Theoretical and computational modeling of target-site search kinetics in vitro and in vivo. Biophys. J. 2011;101:856–865. doi: 10.1016/j.bpj.2011.06.066.
  • 30. Bauer M., Metzler R. Generalized facilitated diffusion model for DNA-binding proteins with search and recognition states. Biophys. J. 2012;102:2321–2330. doi: 10.1016/j.bpj.2012.04.008.
  • 31. Kochugaeva M.P., Shvets A.A., Kolomeisky A.B. How conformational dynamics influences the protein search for targets on DNA. J. Phys. A Math. Theor. 2016;49:444004.
  • 32. Cherstvy A.G., Kolomeisky A.B., Kornyshev A.A. Protein-DNA interactions: reaching and recognizing the targets. J. Phys. Chem. B. 2008;112:4741–4750. doi: 10.1021/jp076432e.
  • 33. van den Broek B., Lomholt M.A., Wuite G.J.L. How DNA coiling enhances target localization by proteins. Proc. Natl. Acad. Sci. USA. 2008;105:15738–15742. doi: 10.1073/pnas.0804248105.
  • 34. Lomholt M.A., van den Broek B., Metzler R. Facilitated diffusion with DNA coiling. Proc. Natl. Acad. Sci. USA. 2009;106:8204–8208. doi: 10.1073/pnas.0903293106.
  • 35. Leger J.F., Robert J., Marko J.F. RecA binding to a single double-stranded DNA molecule: a possible role of DNA conformational fluctuations. Proc. Natl. Acad. Sci. USA. 1998;95:12295–12299. doi: 10.1073/pnas.95.21.12295.
  • 36. Adzuma K. No sliding during homology search by RecA protein. J. Biol. Chem. 1998;273:31565–31573. doi: 10.1074/jbc.273.47.31565.
  • 37. Kornyshev A.A., Leikin S. Sequence recognition in the pairing of DNA duplexes. Phys. Rev. Lett. 2001;86:3666–3669. doi: 10.1103/PhysRevLett.86.3666.
  • 38. Cherstvy A.G., Kornyshev A.A., Leikin S. Torsional deformation of double helix in interaction and aggregation of DNA. J. Phys. Chem. B. 2004;108:6508–6518. doi: 10.1021/jp0380475.
  • 39. Klapstein K., Chou T., Bruinsma R. Physics of RecA-mediated homologous recognition. Biophys. J. 2004;87:1466–1477. doi: 10.1529/biophysj.104.039578.
  • 40. Esadze A., Kemme C.A., Iwahara J. Positive and negative impacts of nonspecific sites during target location by a sequence-specific DNA-binding protein: origin of the optimal search at physiological ionic strength. Nucleic Acids Res. 2014;42:7039–7046. doi: 10.1093/nar/gku418.
  • 41. Ortega A., Garcia de la Torre J. Hydrodynamic properties of rodlike and disklike particles in dilute solution. J. Chem. Phys. 2003;119:9914–9919.
  • 42. Xing X., Bell C.E. Crystal structures of Escherichia coli RecA in a compressed helical filament. J. Mol. Biol. 2004;342:1471–1485. doi: 10.1016/j.jmb.2004.07.091.
  • 43. Valko P.P., Abate J. Comparison of sequence accelerators for the Gaver method of numerical Laplace transform inversion. Comput. Math. Appl. 2004;48:629–636.
  • 44. Shvets A.A., Kolomeisky A.B. Sequence heterogeneity accelerates protein search for targets on DNA. J. Chem. Phys. 2015;143:245101. doi: 10.1063/1.4937938.
  • 45. Bauer M., Rasmussen E.S., Metzler R. Real sequence effects on the search dynamics of transcription factors on DNA. Sci. Rep. 2015;5:10072. doi: 10.1038/srep10072.

Articles from Biophysical Journal are provided here courtesy of The Biophysical Society
