Proceedings of the National Academy of Sciences of the United States of America. 2013 Mar 4;110(12):4500–4505. doi: 10.1073/pnas.1214051110

From mechanical folding trajectories to intrinsic energy landscapes of biopolymers

Michael Hinczewski a,1, J Christof M Gebhardt b,c, Matthias Rief b, D Thirumalai a,1
PMCID: PMC3606983  PMID: 23487746

Abstract

In single-molecule laser optical tweezer (LOT) pulling experiments, a protein or RNA is juxtaposed between DNA handles that are attached to beads in optical traps. The LOT generates folding trajectories under force in terms of time-dependent changes in the distance between the beads. How to construct the full intrinsic folding landscape (without the handles and beads) from the measured time series is a major unsolved problem. By using rigorous theoretical methods—which account for fluctuations of the DNA handles, rotation of the optical beads, variations in applied tension due to finite trap stiffness, as well as environmental noise and limited bandwidth of the apparatus—we provide a tractable method to derive intrinsic free-energy profiles. We validate the method by showing that the exactly calculable intrinsic free-energy profile for a generalized Rouse model, which mimics the two-state behavior in nucleic acid hairpins, can be accurately extracted from simulated time series in a LOT setup regardless of the stiffness of the handles. We next apply the approach to trajectories from coarse-grained LOT molecular simulations of a coiled-coil protein based on the GCN4 leucine zipper and obtain a free-energy landscape that is in quantitative agreement with simulations performed without the beads and handles. Finally, we extract the intrinsic free-energy landscape from experimental LOT measurements for the leucine zipper.


The energy landscape perspective has provided a conceptual framework to describe how RNA (1) and proteins (2–4) fold. Some of the key theoretical predictions (5, 6) have been confirmed by experiments (7). More refined comparisons require mapping the full folding landscape of biomolecules. Advances in laser optical tweezer (LOT) experiments have been used to obtain free-energy profiles as a function of the extension of biomolecules under tension (7–12).

The usefulness of the LOT technique, however, hinges on the assumption that information about the fluctuating biomolecule can be accurately recovered from the raw experimental data, namely the time-dependent changes in the positions of the beads in the optical traps, attached to the biomolecule by double-stranded DNA (dsDNA) handles (Fig. 1). Thus, we only have access to the intrinsic folding landscape of the biomolecule (in the absence of handles and beads) indirectly through the bead–bead separation along the force direction. Many extraneous factors, such as fluctuations of the handles (13, 14), rotation of the beads, and the varying applied tension due to finite trap stiffness, can distort the intrinsic folding landscape. Moreover, the detectors and electronic systems used in the data collection have finite response times, leading to filtering of high-frequency components in the signal (15). Ad hoc attempts have been made to account for handle effects based on experimental estimates of stretched DNA properties, using techniques similar to image deconvolution (8, 11, 16). Theory has been used to extract free-energy information from nonequilibrium pulling experiments (17), and to determine the intrinsic power spectrum of protein fluctuations (18) from LOT data. However, to date there has been no comprehensive theory to model and correct for all of the systematic instrumental distortions of the underlying folding landscapes of proteins and RNA.

Fig. 1.


Dual-beam optical tweezer setup for studying the equilibrium folding landscape of a single protein molecule under force.

How can one construct the intrinsic free-energy profile of a biomolecule using the measured folding trajectories in the presence of beads and handles [the total separation ztot(t) in Fig. 1 as a function of time t]? Here, we solve this problem using a rigorous theoretical procedure. Besides ztot(t), the only inputs needed in our theory are the bead radii, the trap strengths and positions, and handle characteristics such as the contour length, the persistence length, and the elastic stretch modulus. The output is the intrinsic free energy as a function of the biomolecular extension (zp in Fig. 1) in the constant force ensemble.

We validate our approach using two systems: (i) a generalized Rouse model (GRM) hairpin (19), which has an analytically solvable double-well energy landscape under force; and (ii) a double-stranded coiled-coil protein based on the yeast transcriptional factor GCN4 leucine zipper domain, whose folding landscape was studied using a LOT experiment (11). We first use coarse-grained molecular simulations to obtain the intrinsic free-energy landscape of the isolated protein at a constant force. We then simulate mechanical folding trajectories using the full LOT setup, from which we quantitatively recover the intrinsic free-energy landscape of GCN4, thus further establishing the efficacy of our theory. Finally, we apply our theory to experimentally generated data, and show that we can get reliable estimates for the protein energy profile independent of the optical trap parameters.

Results

Theory for Constructing the Intrinsic Protein Folding Landscape from Measurements.

In a dual-beam optical tweezer setup (Fig. 1) the protein is covalently connected to dsDNA handles that are attached to glass or polystyrene beads in two optical traps. For small displacements of the beads from the trap centers (20) the trap potentials are harmonic, with strengths kx = kz ≡ ktrap along the lateral plane, and a weaker axial strength ky = αktrap, where α < 1 (21). For simplicity, we take both traps to have equal strengths, although our method can be generalized to an asymmetric setup. The trap centers are separated from each other along the $\hat{z}$ axis, with trap 1 at z = 0 and trap 2 at z = ztrap. As the bead–handle–protein system fluctuates in equilibrium, the positions of the bead centers r1(t) and r2(t) vary in time. The experimentalist can collect a time series of the z components of the bead positions z1(t) and z2(t). Denote the mean of each time series as $\bar{z}_1$ and $\bar{z}_2$. We assume that the trap centers are sufficiently far apart that the whole system is under tension, which implies that the mean bead displacements are nonzero, $\bar{z}_1 = z_{\rm trap} - \bar{z}_2 = \bar{F}/k_{\rm trap}$, where $\bar{F}$ is the mean tension along $\hat{z}$. We focus on the case where there is no feedback mechanism to maintain a constant force, so the instantaneous tension in the system changes as the total end-to-end extension component ztot(t) ≡ z2(t) − z1(t) (Fig. 1) varies. Although we choose one particular passive setup, the theory can be adapted to other types of passive optical tweezer systems (8, 20) where the force is approximately constant (in which case we could skip the transformation into the constant-force ensemble described below). The mean tension $\bar{F}$, a measure of the overall force scale, can be tuned at the start of the experiment by making the trap separation ztrap larger (leading to higher $\bar{F}$) or smaller (leading to lower $\bar{F}$).
Because $\bar{z}_{\rm tot} = z_{\rm trap} - 2\bar{F}/k_{\rm trap}$, the precise relationship between ztrap and the mean tension $\bar{F}$ requires knowing the mean total extension $\bar{z}_{\rm tot}$, which depends among other things on the details of the energy landscape. Hence, we cannot in general calculate beforehand what $\bar{F}$ will be for a given ztrap. However, one of the advantages of our approach is that we can combine data from different experimental runs (each having a different ztrap and $\bar{F}$) to accurately construct the protein free-energy profile. This combination is carried out through the weighted histogram analysis method (WHAM) (22) (Supporting Information) in a spirit similar to earlier work in the context of optical tweezers (23, 24). We first solve the problem of obtaining the protein landscape based on a single observed trajectory of bead-to-bead separations specified as ztot as a function of t.

The key quantity in the construction procedure is $\mathcal{P}_{\rm tot}(z_{\rm tot})$, the equilibrium probability distribution of ztot within the external trap potential, which can be directly derived from the experimental time series. The imperfect nature of the measured data, due to noise and low-pass filtering effects in the recording apparatus, will distort $\mathcal{P}_{\rm tot}(z_{\rm tot})$, but we have developed a technique to model and approximately correct for these issues (Materials and Methods, FBS). Once we have an experimental estimate for $\mathcal{P}_{\rm tot}(z_{\rm tot})$, the objective is to find $\tilde{\mathcal{P}}_p(z_p)$, the intrinsic distribution of the protein end-to-end extension component zp at some constant force F0, whose value we are free to choose. (Tilde notation denotes probabilities in the constant-force ensemble.) The intrinsic protein free-energy profile is $\tilde{\mathcal{F}}_p(z_p) = -k_BT \ln \tilde{\mathcal{P}}_p(z_p)$. The procedure, obtained from rigorous theoretical underpinnings described in detail in SI Text, consists of two steps:

  • i) Transformation into the constant-force ensemble. Given $\mathcal{P}_{\rm tot}(z_{\rm tot})$, we obtain the total system end-to-end distribution at a constant F0 using

[Eq. 1]

where β = 1/kBT and C is a normalization constant. The equation above applies in the case of a single experimental trajectory at a particular trap separation ztrap.

  • ii) Extraction of the intrinsic protein distribution. In the constant-force ensemble, Inline graphic, relates the total end-to-end fluctuations Inline graphic to the end-to-end distributions for the individual components Inline graphic, where α denotes bead (b), handle (h), or protein (p), and “*” is a 1D convolution operator. For the beads, “end-to-end” refers to the extension between the bead center and the handle attachment point, projected along Inline graphic. In Fourier space the convolution has the form

$$\tilde{\mathcal{P}}_{\rm tot}(k) = \tilde{\mathcal{P}}_{bh}(k)\,\tilde{\mathcal{P}}_{p}(k) \qquad [2]$$

where Inline graphic is the Fourier transform of Inline graphic. Here, Inline graphic, which is the result of convolving all of the bead and handle distributions, acts as the main point-spread function relating the intrinsic protein distribution Inline graphic to Inline graphic. Because Inline graphic can be modeled from a theoretical description of the handles and beads, we can solve for Inline graphic using Eq. 2 and hence find Inline graphic, the intrinsic free-energy profile of the protein.
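The two steps can be sketched numerically. What follows is a minimal illustration under stated assumptions, not the authors' implementation. For step i it assumes that the reweighting factor has the form exp[βF0 z + βktrap(ztrap − z)²/4], which is what two equal harmonic traps acting in series on the bead-bead separation would give. For step ii it replaces the modeled bead/handle point-spread function with a hypothetical Gaussian and inverts the convolution by a regularized (Wiener-style) division in Fourier space. The function names, the regularization constant `eps`, and every numerical value are illustrative.

```python
import numpy as np

KBT = 4.114  # thermal energy in pN nm at T = 298 K


def to_constant_force(z, p_tot, k_trap, z_trap, f0, kbt=KBT):
    """Step i (assumed form): remove the trap potential and apply a constant force f0."""
    dz = z[1] - z[0]
    # Assumed log-weight: beta*F0*z + beta*k_trap*(z_trap - z)^2 / 4
    log_w = (f0 * z + 0.25 * k_trap * (z_trap - z) ** 2) / kbt
    log_w -= log_w.max()  # constant shift, absorbed into the normalization C
    p = p_tot * np.exp(log_w)
    return p / (p.sum() * dz)


def deconvolve(p_total, psf, dz, eps=1e-3):
    """Step ii: regularized division in Fourier space (not an exact inversion).

    psf must be sampled on the same grid and centered on index len(psf) // 2.
    """
    ft_total = np.fft.fft(p_total)
    ft_psf = np.fft.fft(np.fft.ifftshift(psf)) * dz
    ft_p = ft_total * np.conj(ft_psf) / (np.abs(ft_psf) ** 2 + eps)
    p = np.clip(np.real(np.fft.ifft(ft_p)), 0.0, None)  # drop small negative ringing
    return p / (p.sum() * dz)


def gaussian(z, mu, sigma):
    return np.exp(-0.5 * ((z - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))


# Synthetic demonstration; none of these numbers come from the paper.
dz = 0.25
z = (np.arange(512) - 256) * dz                       # nm, grid centered on z = 0
p_protein = 0.5 * gaussian(z, -8, 2) + 0.5 * gaussian(z, 8, 2)
psf = gaussian(z, 0, 2)                               # stand-in for the bead/handle PSF
# Forward model: circular convolution of the protein distribution with the PSF
p_total = np.real(
    np.fft.ifft(np.fft.fft(p_protein) * np.fft.fft(np.fft.ifftshift(psf)))
) * dz
p_recovered = deconvolve(p_total, psf, dz, eps=1e-6)
free_energy = -np.log(np.clip(p_recovered, 1e-300, None))  # in units of kBT
```

With noisy experimental histograms a much larger `eps` (or a different regularization altogether) would be needed, because plain Fourier division amplifies noise at the frequencies where the point-spread function is small.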

The derivation of the procedure (SI Text) shows the conditions under which the two-step method works. The mathematical approximation underlying step i becomes exact if: (i) kx = ky = 0; or (ii) the full 3D total system end-to-end probability is separable into a product of distributions for longitudinal Inline graphic and transverse (Inline graphic, Inline graphic) components. In general, condition (ii) is not physically sensible (19). However, if Inline graphic is the typical length scale describing transverse fluctuations, then condition (i) is approximately valid when Inline graphic. If this condition breaks down, accurate construction of the intrinsic energy landscape cannot be performed without knowledge of the transverse behavior. In the simulation and experimental results below, the force scales are such that transverse fluctuations are small, Inline graphic, so to ensure condition (i) is met, we require that Inline graphic pN/nm at T = 298 K. We use the experimental value ktrap = 0.25 pN/nm in our test cases (11), which is well under the upper limit. In principle, one can choose any F0, the force value of the constant-force ensemble where we carry out the analysis. In practice, F0 should be chosen from among the range of forces that is sampled in equilibrium during the actual experiment, because this will minimize statistical errors in the final constructed landscape. For example, setting Inline graphic, the mean tension, is a reasonable choice.

Step ii depends on knowledge of Inline graphic, and thus the individual constant-force distributions of the beads and the handles in Fourier space. The point-spread function is characterized by the bead radius Rb, the handle contour length L, the handle persistence length lp, and the handle elastic stretching modulus γ. In Inline graphic we also include the covalent linkers which attach the handles to the beads and protein, giving two additional parameters: the linker stiffness κ and length ℓ. Using the extensible semiflexible chain as a model for the handles, we exploit an exact mapping between this model and the propagator for the motion of a quantum particle on the surface of a unit sphere (25) to calculate the handle Fourier-space distribution to arbitrary numerical precision. Together with analytical results for the bead and linker distributions, we can directly solve for Inline graphic. To verify that the analytical model for the point-spread function can accurately describe handle/bead fluctuations over a range of forces, we have analyzed data from control experiments on a system involving only the dsDNA handles attached to beads, where Inline graphictot = Inline graphicbh (SI Text). The theory simultaneously fits results for several experimental quantities measured on the same system: the distributions Inline graphic derived from three different trap separations, corresponding to mean forces F0 = 9.4−12.7 pN, and a force-extension curve. The accuracy of the model Inline graphic is ∼1–3%, within the experimental error margins.

Robustness of the Theory Validated by Application to an Exactly Solvable Model.

We first apply the theory to a problem for which the intrinsic free-energy profiles at arbitrary force are known exactly. The GRM hairpin (SI Text) is a two-state folder whose full 3D equilibrium end-to-end distributions are analytically solvable. A representative GRM distribution Inline graphic at F0 = 11.9 pN is plotted in Fig. 2A. The upper part shows a projection onto the Inline graphic plane, because Inline graphic is cylindrically symmetric, whereas the lower part shows the further projection onto the z coordinate. The two peaks correspond to the native (N) state at small z, and the unfolded (U) state at large z. To model the optical tweezer system, we add handles and beads to the GRM hairpin, whose probabilities Inline graphic and Inline graphic (including transverse fluctuations) are illustrated in Fig. 2 B and C. The full 3D behavior is derived in an analogous manner to the theory mentioned above for the 1D Fourier-space distribution Inline graphic of the beads/handles; the only difference is that the transverse degrees of freedom are not integrated out. The 3D convolution of the system components, plus the optical trap contribution, gives the total distribution Inline graphictot in Fig. 2D. The bead, handle, linker, and trap parameters are listed in Table S1. From Inline graphictot one can calculate the mean total z extension and the mean tension, which in this case are Inline graphic nm, Inline graphic pN.

Fig. 2.


GRM hairpin in an optical tweezer setup. First row shows the exact end-to-end distributions along Inline graphic for each component type in the system: (A) GRM, (B) dsDNA handle, (C) polystyrene bead. Handle, bead, and trap parameters are listed in Table S1 (GRM column). (Upper) Probabilities projected onto cylindrical coordinates Inline graphic. (Lower) Projection onto z alone. (D) Result for the total system end-to-end distribution Inline graphictot derived by convolving the component probabilities and accounting for the optical traps. (E–G) Construction of the original GRM distribution Inline graphic starting from Inline graphictot. (E) Inline graphictot (purple) and Inline graphic (blue) as a function of z on the bottom axis, measured relative to Inline graphic, the average extension for each distribution. For Inline graphictot, the upper axis shows the z range translated into the corresponding trap forces F. After removing the trap effects, Inline graphic is the distribution for constant force F0 = 11.9 pN. (F) Inline graphic, describing the total probability at F0 of fluctuations resulting from both handles and the rotation of the beads. (G) Constructed solution for Inline graphic (solid line), obtained by numerically inverting the convolution Inline graphic. Exact analytical result for Inline graphic is shown as a dashed line; zN is the position of the N peak.

The Inline graphic-probability projection in Fig. 2D (Lower) is the information accessible in an experiment, and the computation of the intrinsic distribution in Fig. 2A (Lower) is the ultimate goal of the construction procedure. Comparing A and D, two effects of the apparatus are visible: the GRM peaks have been partially blurred into each other, and the transverse (ρ) fluctuations have been enhanced. The handles provide the dominant contribution to both these effects.

Fig. 2 E–G illustrates the construction procedure for the GRM optical tweezer system. E corresponds to step i, with a transformation of the distribution Inline graphictot (whose varying force scale is shown along the top axis) into Inline graphic at constant force F0 = 11.9 pN. Step ii uses the exact Inline graphic, shown in real space in F, and produces the intrinsic distribution Inline graphic, drawn as a solid line in G. The agreement with the exact analytical result (dashed line) is extremely close, with a median error of 3% over the range shown. This deviation is due to the approximation in step i, discussed above, as well as the numerical implementation of the deconvolution procedure.

As shown in our previous study (19), the smaller the ratio lp/L for the handles, the more the features of the protein energy landscape get blurred by the handle fluctuations. Because the experimentally measured total distribution always distorts to some extent the intrinsic protein free-energy profile due to the finite duration and sampling of the system trajectory, more flexible handles will exacerbate the signal-to-noise problem. To illustrate this effect, we performed Brownian dynamics simulations of the GRM in the optical tweezer setup, with handles modeled as extensible, semiflexible bead–spring chains (SI Text). In Fig. 3A we compare the free energy ℱtot = −kB T ln Inline graphictot for a fixed L = 100 nm and a varying lp/L, derived from the simulation trajectories, and the exact intrinsic GRM result Inline graphic at F0. When the handles are very flexible, with lp/L = 0.02, the energy barrier between the native and unfolded states almost entirely disappears in ℱtot, with the noise making the precise barrier shape difficult to resolve. Remarkably, even with this extreme level of distortion, using our theory we still recover a reasonable estimate of the intrinsic landscape (Fig. 3B). For each ℱtot in Fig. 3A, Fig. 3B compares the result of the construction procedure and the exact answer for Inline graphic. Clearly some information is lost as lp/L becomes smaller, because the lp/L = 0.02 system does not yield as accurate a result as the ones with stiffer handles. However, in all cases the basic features of the exact Inline graphic are reproduced. Thus, the method works remarkably well over a wide range of handle parameters. This conclusion is generally valid even when other parameters are varied (see Fig. S3 in SI Text for tests at various F0 and ktrap). 
The excellent agreement between the constructed and intrinsic free-energy profiles for the exactly solvable GRM hairpin over a wide range of handle and trap experimental variables establishes the robustness of the theory.
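The total-system free energy compared in Fig. 3A is simply the negative logarithm of the histogram of the recorded extension. The sketch below shows that operation on synthetic data; it is an illustration only. The function name, the bin layout, and the two-Gaussian "trajectory" are assumptions, and independent random samples stand in for a Brownian dynamics time series.

```python
import numpy as np


def free_energy_profile(z_series, bins=100):
    """Estimate F(z)/kBT = -ln P(z) from an equilibrium time series of extensions."""
    hist, edges = np.histogram(z_series, bins=bins, density=True)
    centers = 0.5 * (edges[:-1] + edges[1:])
    populated = hist > 0                     # empty bins have undefined free energy
    f = -np.log(hist[populated])
    return centers[populated], f - f.min()   # shift so the deepest well sits at zero


# Synthetic two-state data: independent samples from two Gaussian wells (nm).
rng = np.random.default_rng(0)
samples = np.concatenate([rng.normal(-3.0, 1.0, 100_000),
                          rng.normal(3.0, 1.0, 100_000)])
z_centers, f_over_kbt = free_energy_profile(samples, bins=np.linspace(-8, 8, 81))
barrier = f_over_kbt[np.argmin(np.abs(z_centers))]   # free energy near z = 0, in kBT
```

The statistical error of each bin grows as its population shrinks, which is why barrier regions are the first features to be lost when a trajectory is short or the wells are blurred together.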

Fig. 3.


Effects of handle characteristics on the free-energy profile of the GRM in a LOT setup. (A) Total system free energy ℱtot = −kBT ln Inline graphictot for fixed L = 100 nm, and varying ratios lp/L. All of the other parameters are in Table S1 (GRM column). The exact analytical free energy at F0 =11.9 pN (dashed line) for the GRM alone, Inline graphic, is shown for comparison. (B) For each ℱtot in A, the construction of Inline graphic at F0, together with the exact answer (dashed line). (C) For system parameters matching the experiment (Table S1), the variance of the point-spread function Inline graphic broken down into the individual handle, bead, and linker contributions. The fraction for each component is shown as a function of varying handle elastic modulus γ.

Intrinsic Folding Landscape of a Simulated Leucine Zipper.

To demonstrate that the theory can be used to produce equilibrium intrinsic free-energy profiles with multiple states from mechanical folding trajectories, we performed simulations of a protein in an optical tweezer setup. The simulations were designed to mirror a single-molecule experiment (11). To this end we studied a coiled-coil, LZ26 (26), based on three repeats of the leucine zipper domain from the yeast transcriptional factor GCN4 (27) (Materials and Methods). The simple linear unzipping of the two strands of LZ26 allows us to map the end-to-end extension to the protein configuration. Furthermore, the energy heterogeneity of the native bonds that form the ‘‘teeth’’ of the zipper leads to a nontrivial folding landscape with at least two intermediate states (11, 26, 28).

The LZ26 N structure in Fig. 4 (from a simulation snapshot) shows two alpha-helical strands running from the N terminus at the bottom to the C terminus at the top. In the experiment a handle is attached to the N terminus of each strand, and this is where the strands begin to unzip under applied force. To prevent complete strand separation, the C termini are cross-linked through a disulfide bridge between two cysteine residues. Each alpha-helix coil consists of a series of seven-residue heptad repeats, with positions labeled a through g. For the leucine zipper the a and d positions are the teeth, consisting of mostly hydrophobic residues (valine and leucine) which have strong noncovalent interactions with their counterparts on the other strand. The exceptions to the hydrophobic pattern are the three hydrophilic asparagine residues in the a position on each strand (marked in blue in the structure snapshots in Fig. 4). As has been seen experimentally (11, 26) (and shown below through simulations), the weaker interaction of these asparagine pairs is crucial in determining the properties of the intermediate folding states.

Fig. 4.


Intrinsic characteristics of the LZ26 leucine zipper at constant F0, derived from SOP simulations in the absence of handles/beads. (A) LZ26 free energy Inline graphic at F0 = 12.3 pN vs. end-to-end extension z. (Right) Representative protein configurations from the four wells (N, I1, I2, U), with asparagine residues colored blue. (B) Average fraction of native contacts between the two alpha-helical strands of LZ26 (the “zipper bonds”) as a function of z. (Left) Lists of the a and d residues in the heptads making up the amino acid sequence for each LZ26 strand, placed according to their position along the zipper. Asparagines (N) are highlighted in blue. (C) For the residues listed in B, the residue contact energies used in the SOP simulation [rescaled BT (30) values].

In analyzing the LZ26 leucine zipper system, we performed coarse-grained simulations using the self-organized polymer (SOP) model (29) (full details in SI Text, with parameters summarized in Table S1). The intrinsic free-energy profile Inline graphic at F0 = 12.3 pN in Fig. 4A has four prominent wells in Inline graphic as a function of zp corresponding to four stages in the progressive unzipping of LZ26. At F0 = 12.3 pN all of the states are populated, and the system fluctuates in equilibrium between the wells. The transition barrier between N and I1 exhibits a shallow dip that may correspond to an additional, very transiently populated intermediate. Because this dip is much smaller than kBT, we do not count it as a distinct state.

Adding the optical tweezer apparatus to the SOP simulation significantly distorts the measured probability distributions. In the first row of Fig. 5 sample simulation trajectory fragments are shown both for the protein-only case (Fig. 5A) at constant force F0 = 12.3 pN, and within the full optical tweezer system (Fig. 5C) with ztrap = 503 nm. For the latter case we plot both ztot(t) (purple) and zp(t) (gray), allowing us to see how the bead separation tracks changes in the protein extension. The probability distributions Inline graphic and Inline graphictot are plotted in Fig. 5 B and D, respectively. In Fig. 5E, the distribution Inline graphictot within the optical tweezer system is plotted for ztrap = 503 nm. Although we only illustrate this particular ztrap value, ∼260 trajectories are generated at different ztrap and combined together using WHAM (22) (SI Text) to produce a single Inline graphic at a constant force F0 = 12.3 pN (Fig. 5E). We can then use our theoretical method to recover the protein free energy Inline graphic (Fig. 5F). Despite numerical errors due to limited statistical sampling (both in the protein-only and total system runs), there is remarkable agreement between the constructed result and Inline graphic derived from protein-only simulations. This is particularly striking given that the total system free energy ℱtot(ztot) = −kBT ln Inline graphictot(ztot) (Fig. 5F) shows that handles/beads blur the energy landscape, reducing the energy barriers to a degree that the N state is difficult to resolve. The signature of N in ℱtot(ztot) is a slight change in the curvature at higher energies on the left of the I1 well. Nevertheless, we still recover a basin of attraction representing the N state in the constructed Inline graphic. The results in Fig. 5 provide a self-consistency check of the method for a system with multiple intermediates.

Fig. 5.


(A and B) A trajectory fragment and the probability distribution Inline graphic from SOP simulations of the LZ26 leucine zipper at constant force F0 = 12.3 pN in the absence of handles/beads. (C and D) A trajectory fragment and the total system distribution Inline graphictot at ztrap = 503 nm. C shows both the total extension ztot(t) (purple) and the protein extension zp(t) (gray). Triangles mark times when the protein makes a transition between states, and the arrows point to two enlarged portions of the trajectories. In all cases the z-axis origin is zI1, the peak location of the I1 intermediate state. (E–G) Leucine zipper free-energy profiles extracted from time series (third row = simulation, fourth row = experiment). First column shows the total system end-to-end distribution Inline graphictot and the corresponding Inline graphic at constant force F0 = 12.3 pN. In the experimental case F0 = 12.3 ± 0.9 pN is the midpoint force at which the I1 and U states are equally likely. For Inline graphictot, ztrap = 503 nm (simulation), 1553 ± 1 nm (experiment). Force scales at the top are the range of trap forces for Inline graphictot. Second column shows the computed intrinsic protein free-energy profiles Inline graphic compared with the total system profile, ℱtot (shifted upward for clarity). (F) SOP simulations for the protein alone at constant F0 provide a reference landscape, drawn as a dashed line. (H) Dotted curve is the reconstructed Inline graphic at the midpoint force F0 = 12.1 ± 0.9 pN, from a second, independent experimental trajectory, with ztrap = 1547 ± 1 nm. Inline graphic curves have a median uncertainty of 0.4 kBT over the plotted range (see SI Text for error analysis).

Folding Landscape of the Leucine Zipper from Experimental Trajectories.

As a final test of the efficacy of the theory we used the experimental time series data (11) to obtain Inline graphic. The data consist of two independent runs with the LZ26 leucine zipper, using the same handle/bead parameters for each run (Table S1) but at different trap separations ztrap. We project the deconvolved landscape from each trajectory onto the midpoint force F0 where the two most populated states (I1 and U) have equal probabilities in Inline graphic. The values of F0 derived from the two runs are the same within error bounds: 12.3 ± 0.9 and 12.1 ± 0.9 pN. The detailed deconvolution steps are shown for one run in the last row of Fig. 5. The intrinsic free-energy profile Inline graphic is shown for both runs in Fig. 5H (solid and dotted blue curves, respectively). Accounting for error due to finite trajectory length and uncertainties in the apparatus parameters, the median total uncertainty in each of the reconstructed landscapes is about 0.4 kBT in the z range shown (see SI Text for full error analysis). The landscapes from the two independent runs have a median difference of 0.3 kBT, and hence the method gives consistent results between runs, up to a small experimental uncertainty, an important test of its practical utility. The reproducibility of Inline graphic is a testament to the stability of the dual optical tweezer setup. Each trajectory lasted for more than 100 s, and thus collected ∼10²–10⁵ of the various types of transitions between protein states (the slowest transition, U → I2, occurred on time scales of 0.4–0.6 s).

Comparison between the experimental Inline graphic in Fig. 5H and the simulation result in Fig. 5F reveals a notable difference: The landscape constructed using the experimental data does not have four basins. The N state may not be discernible in the experiment because of the limited resolution of the apparatus (see below). The spacing between the I1 and I2 wells is similar in the simulation and experiment (∼9–13 nm), but that between I2 and U is ∼13 nm in the simulation vs. 25 nm in the experiment. This is likely due to a larger helix content in the unfolded state in the simulations.

Discussion

Origins of the Variance in the Point-Spread Function.

Our theory for the point-spread function Inline graphic can be used to understand the interplay of physical effects that relate the intrinsic protein distribution to the total system. To quantify the various contributions to Inline graphic, we calculated its variance. Because variances of probability distributions combine additively upon convolution, we break down the variance of Inline graphic into the individual bead, handle, and linker contributions. Fig. 3C shows the fraction of the variance associated with each component as a function of the handle elastic modulus γ at F0 = 12.3 pN, with Rb = 500 nm, L = 188 nm, and lp = 20 nm (the approximate experimental parameters from ref. 11). For any given value of γ, the heights of the four colored slices give the four fractions. Although the linker parameters were not directly measured in ref. 11, we have assumed κ = 200 kcal/(mol⋅nm²) and ℓ = 1.5 nm. The handle contribution is itself broken down into the “elastic” part, defined as the extra variance due to finite γ, compared with an inextensible (γ → ∞) worm-like chain (WLC), and the remainder, which we call the WLC part. For the case of ref. 11, γ = 400 pN. Because the length extension relative to the WLC result is ∼F0/γ, we expect finite handle extensibility to play a small role. Surprisingly, the elastic contribution to the total Inline graphic variance at this γ is 43%, comparable to the WLC contribution of 48%. Hence, in predicting Inline graphic correctly it is important to account for both the bending rigidity and elasticity of the handles.
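The additivity of variances under convolution, which underlies the breakdown in Fig. 3C, can be checked numerically. The sketch below uses two arbitrary stand-in distributions (a Gaussian and a box) rather than the actual bead, handle, or linker distributions; the grid and widths are illustrative assumptions. It also evaluates the ratio F0/γ for the values quoted above, which comes to about 3%.

```python
import numpy as np


def variance(z, p):
    """Variance of a distribution p sampled on a uniform grid z."""
    dz = z[1] - z[0]
    p = p / (p.sum() * dz)
    mean = (z * p).sum() * dz
    return ((z - mean) ** 2 * p).sum() * dz


dz = 0.05
z = np.arange(-20, 20 + dz, dz)
p_a = np.exp(-0.5 * (z / 2.0) ** 2)               # Gaussian-shaped component
p_b = np.where(np.abs(z) <= 3.0, 1.0, 0.0)        # box-shaped component
p_a /= p_a.sum() * dz
p_b /= p_b.sum() * dz

p_conv = np.convolve(p_a, p_b) * dz               # distribution of the summed extension
z_conv = 2 * z[0] + dz * np.arange(len(p_conv))   # grid of the full convolution

var_sum = variance(z, p_a) + variance(z, p_b)
var_conv = variance(z_conv, p_conv)               # agrees with var_sum

# Relative handle stretching expected from the elastic modulus alone
f0, gamma = 12.3, 400.0                           # pN, values quoted in the text
relative_extension = f0 / gamma                   # about 0.03
```

A small relative extension does not imply a small share of the variance, since the elastic term adds fluctuations of its own; that is the point of the 43% figure.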

Instrumental Noise Filtering and the Limits of the Theoretical Approach.

The difference in the number of wells in the simulation and experimental free-energy landscapes of the leucine zipper is related to finite time and spatial resolution. The measured time series is subject to noise and low-pass filtering due to the apparatus (15). The standard experimental protocol often involves additional low-pass filtering as a way of removing noise and smoothing trajectories: for the leucine zipper every five data points (originally recorded at 10-μs intervals) are averaged together during collection to give a time step of 50 μs (11). Noise broadens the measured distribution of bead separations, whereas low-pass filtering narrows it. The finite bandwidth scaling (FBS) technique (Materials and Methods; SI Text), based on the details of the specific apparatus used in the experiment, estimates and corrects for the distortions.
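The five-point averaging described above can be illustrated on synthetic data. This is a sketch of plain block averaging only, not of the FBS correction; the function name and the white-noise test signal are assumptions. For uncorrelated noise, averaging n points narrows the distribution by a factor close to 1/√n.

```python
import numpy as np


def block_average(x, n=5):
    """Average consecutive, non-overlapping blocks of n samples (a crude low-pass filter)."""
    x = np.asarray(x, dtype=float)
    m = (len(x) // n) * n            # drop an incomplete trailing block
    return x[:m].reshape(-1, n).mean(axis=1)


dt_raw = 10e-6                       # s, raw sampling interval quoted in the text
n_avg = 5
dt_avg = dt_raw * n_avg              # effective time step after averaging (50 microseconds)

rng = np.random.default_rng(1)
noise = rng.normal(0.0, 1.0, 100_000)        # uncorrelated noise, arbitrary units
smoothed = block_average(noise, n_avg)
narrowing = smoothed.std() / noise.std()     # close to 1/sqrt(5) for white noise
```

A real bead signal is correlated in time, so the narrowing is weaker than for white noise and depends on the relaxation times involved.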

The FBS theory can only apply corrections to peaks (i.e., distinct protein states) that we observe in the measured probability distributions. There is the possibility of protein states leaving no discernible signature in the recorded distribution. The N state in the leucine zipper is connected only to the I1 state in the folding pathway. In the simulations, where the N state is directly observed, it has short mean lifetimes (≲6 μs in the studied force range). The N ↔ I1 change involves the shortest mean extension difference (∼8 nm) among all of the transitions. If the N state in the actual protein has similar properties, it could be impossible to resolve it in the experimental data for two reasons: (i) Regardless of any additional filtering, the intrinsic low-pass characteristics of the apparatus filter out states with very short lifetimes. For our LOT setup, the effective low-pass filter time scale for the detectors/electronics is τf ∼ 7 μs (SI Text), which is at the cutting edge of current technology. Thus, states with lifetimes ≲τf will not appear as distinct peaks in the measured distribution. (ii) Independent of the filtering issues in detection/recording, environmental background noise in the time series also poses a problem, particularly because we measure bead displacements, and these have signal amplitudes at high frequencies that are generally attenuated compared with the intrinsic amplitudes of the protein conformational changes. The reason for this is that the beads have much larger hydrodynamic drag than dsDNA handles or proteins, and their characteristic relaxation times τr in the optical traps may be comparable to or larger than the lifetime of a particular protein state. The bead cannot fully respond to force changes on time scales shorter than its relaxation time (14). For example, τr ∼ 20 μs in the leucine zipper experiment. If the lifetime of the N state at a particular force is much smaller than τr, protein transitions from I1→N→I1 will generally occur before the bead can relax into the N-state equilibrium position. If the bead displacements associated with these transitions are smaller than the noise amplitude in the time series, the entire excursion to the N state will be lost to the noise.
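A rough estimate of how much of the equilibrium displacement the bead can reach during a brief excursion follows from treating the bead as a single-exponential (first-order) relaxer. This is a simplifying assumption made only for the sketch below, not the full bead-handle-protein dynamics. The value τr = 20 μs and the 6-μs lifetime come from the text; the function names are illustrative.

```python
import math

def response_fraction(lifetime_us, tau_r_us):
    """Fraction of the full equilibrium displacement reached by the end of an
    excursion of duration lifetime_us, assuming single-exponential relaxation."""
    return 1.0 - math.exp(-lifetime_us / tau_r_us)

def simulate_excursion(lifetime_us, tau_r_us, dt_us=0.01):
    """Integrate dx/dt = (1 - x) / tau_r during the excursion (explicit Euler),
    starting from x = 0, and return the displacement reached at its end."""
    x = 0.0
    for _ in range(round(lifetime_us / dt_us)):
        x += dt_us * (1.0 - x) / tau_r_us
    return x

tau_r = 20.0  # bead relaxation time (us), from the text
for lifetime in (6.0, 20.0, 200.0):
    analytic = response_fraction(lifetime, tau_r)
    numeric = simulate_excursion(lifetime, tau_r)
    print(f"lifetime {lifetime:6.1f} us: analytic {analytic:.3f}, numeric {numeric:.3f}")
```

Under this assumption a 6-μs visit to the N state with τr = 20 μs lets the bead cover only about 26% of the displacement it would reach at equilibrium, which makes the excursion correspondingly easier to lose in the noise.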

We can illustrate the finite response time of the bead using simulations where resolution is not limited by noise or apparatus filtering, allowing us to see the relationship between ztot(t) and zp(t), compared in two different trajectory fragments in Fig. 5C. Triangles in the figure indicate times where the protein makes a transition between states. Changes in protein extension during these transitions are very rapid, and the bead generally mirrors these changes with a small time lag, as seen in the enlarged trajectory interval at t = 36–42 μs. When the protein makes sharp, extremely brief excursions (such as a visit to the N state from I1 in the enlarged interval t = 90–96 μs), the corresponding changes in bead separation are smaller and much less well-defined. In the presence of noise, such tiny changes would be obscured.

Thus, we surmise that the N state is not observable due to some combination of apparatus filtering, noise, and finite bead response time. Hence, the theory applied to the experimental data produces a landscape with only I1, I2, and U wells, as opposed to the four wells in the simulation data.

Conclusions

Extraction of the energy landscape of biomolecules using LOT data is complicated because accurate analysis depends on correcting for distortions due to system components. We have solved this problem completely by developing a theoretically based construction method that accounts for these factors. Through an array of tests involving an analytically solvable hairpin model, coarse-grained protein simulations, and experimental data, we have demonstrated the robustness of the technique in a range of realistic scenarios. The method works for arbitrarily complicated landscapes, producing consistent results when the same protein is studied under different force scales.

Materials and Methods

FBS.

Probability distributions derived from experimental time series of bead–bead separations are corrupted by noise, finite apparatus bandwidth, and in some cases additional filtering due to the data processing protocol. We developed FBS theory to model and correct for these effects (SI Text), using information encoded in time series autocorrelations, together with spectral characterization of the dual-trap optical tweezer detector and electronic systems (15). All of the experimental distributions of the total extension ztot in the main text were first processed by FBS.

Leucine Zipper.

We use a variant of the coarse-grained SOP model (29, 31), where each of the 176 residues in LZ26 is represented by a bead centered at the Cα position (SI Text). The α-helical secondary structure is stabilized by interactions which mimic (i, i + 4) hydrogen bonding (32). We use residue-dependent energies for tertiary interactions (30). Simulations are carried out using an overdamped Brownian dynamics algorithm (33).
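For orientation, a single update of an overdamped Brownian dynamics integrator can be sketched as follows. This is a one-dimensional toy example in reduced units with a harmonic force, not the SOP simulation code: it neglects hydrodynamic interactions, and the parameter values and function names are assumptions chosen for illustration.

```python
import math
import random

def brownian_step(x, force, diffusion, kT, dt, rng):
    """One overdamped Brownian dynamics update without hydrodynamic interactions:
    deterministic drift (D / kT) * F * dt plus a Gaussian random displacement
    with variance 2 * D * dt."""
    drift = (diffusion / kT) * force * dt
    kick = math.sqrt(2.0 * diffusion * dt) * rng.gauss(0.0, 1.0)
    return x + drift + kick

def simulate_harmonic(k, diffusion, kT, dt, n_steps, seed=0):
    """Trajectory of a particle in the potential U(x) = k * x**2 / 2."""
    rng = random.Random(seed)
    x = 0.0
    traj = []
    for _ in range(n_steps):
        x = brownian_step(x, -k * x, diffusion, kT, dt, rng)
        traj.append(x)
    return traj

# Reduced units (hypothetical values): k = D = kT = 1.
traj = simulate_harmonic(k=1.0, diffusion=1.0, kT=1.0, dt=0.05, n_steps=200_000)
mean = sum(traj) / len(traj)
var = sum((x - mean) ** 2 for x in traj) / len(traj)
print(f"sampled variance = {var:.3f} (equipartition predicts kT / k = 1)")
```

Checking that the sampled variance approaches kT/k is a simple consistency test for the time step; in a multi-bead model the same update is applied to every coordinate with the force obtained from the full potential.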

Acknowledgments

M.H. was a Ruth L. Kirschstein National Research Service Postdoctoral Fellow supported by Grant 1 F32 GM 97756-1 from the National Institute of General Medical Sciences. D.T. was supported by Grant GM 089685 from the National Institutes of Health.

Footnotes

The authors declare no conflict of interest.

*This Direct Submission article had a prearranged editor.

This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1214051110/-/DCSupplemental.

References

  1. Thirumalai D, Hyeon C. RNA and protein folding: Common themes and variations. Biochemistry. 2005;44(13):4957–4970. doi: 10.1021/bi047314+.
  2. Onuchic JN, Luthey-Schulten Z, Wolynes PG. Theory of protein folding: The energy landscape perspective. Annu Rev Phys Chem. 1997;48:545–600. doi: 10.1146/annurev.physchem.48.1.545.
  3. Dill KA, Ozkan SB, Shell MS, Weikl TR. The protein folding problem. Annu Rev Biophys. 2008;37:289–316. doi: 10.1146/annurev.biophys.37.092707.153558.
  4. Thirumalai D, O’Brien EP, Morrison G, Hyeon C. Theoretical perspectives on protein folding. Annu Rev Biophys. 2010;39:159–183. doi: 10.1146/annurev-biophys-051309-103835.
  5. Guo Z, Thirumalai D. Kinetics of protein folding: Nucleation mechanism, time scales, and pathways. Biopolymers. 1995;36(1):83–102.
  6. Klimov DK, Thirumalai D. Symmetric connectivity of secondary structure elements enhances the diversity of folding pathways. J Mol Biol. 2005;353(5):1171–1186. doi: 10.1016/j.jmb.2005.09.029.
  7. Stigler J, Ziegler F, Gieseke A, Gebhardt JCM, Rief M. The complex folding network of single calmodulin molecules. Science. 2011;334(6055):512–516. doi: 10.1126/science.1207598.
  8. Woodside MT, et al. Direct measurement of the full, sequence-dependent folding landscape of a nucleic acid. Science. 2006;314(5801):1001–1004. doi: 10.1126/science.1133601.
  9. Woodside MT, et al. Nanomechanical measurements of the sequence-dependent folding landscapes of single nucleic acid hairpins. Proc Natl Acad Sci USA. 2006;103(16):6190–6195. doi: 10.1073/pnas.0511048103.
  10. Neupane K, Yu H, Foster DAN, Wang F, Woodside MT. Single-molecule force spectroscopy of the add adenine riboswitch relates folding to regulatory mechanism. Nucleic Acids Res. 2011;39(17):7677–7687. doi: 10.1093/nar/gkr305.
  11. Gebhardt JCM, Bornschlögl T, Rief M. Full distance-resolved folding energy landscape of one single protein molecule. Proc Natl Acad Sci USA. 2010;107(5):2013–2018. doi: 10.1073/pnas.0909854107.
  12. Elms PJ, Chodera JD, Bustamante C, Marqusee S. The molten globule state is unusually deformable under mechanical force. Proc Natl Acad Sci USA. 2012;109(10):3796–3801. doi: 10.1073/pnas.1115519109.
  13. Hyeon C, Thirumalai D. Forced-unfolding and force-quench refolding of RNA hairpins. Biophys J. 2006;90(10):3410–3427. doi: 10.1529/biophysj.105.078030.
  14. Manosas M, et al. Force unfolding kinetics of RNA using optical tweezers. II. Modeling experiments. Biophys J. 2007;92(9):3010–3021. doi: 10.1529/biophysj.106.094243.
  15. von Hansen Y, Mehlich A, Pelz B, Rief M, Netz RR. Auto- and cross-power spectral analysis of dual trap optical tweezer experiments using Bayesian inference. Rev Sci Instrum. 2012;83(9):095116. doi: 10.1063/1.4753917.
  16. Yu H, et al. Direct observation of multiple misfolding pathways in a single prion protein molecule. Proc Natl Acad Sci USA. 2012;109(14):5283–5288. doi: 10.1073/pnas.1107736109.
  17. Hummer G, Szabo A. Free energy profiles from single-molecule pulling experiments. Proc Natl Acad Sci USA. 2010;107(50):21441–21446. doi: 10.1073/pnas.1015661107.
  18. Hinczewski M, von Hansen Y, Netz RR. Deconvolution of dynamic mechanical networks. Proc Natl Acad Sci USA. 2010;107(50):21493–21498. doi: 10.1073/pnas.1010476107.
  19. Hyeon C, Morrison G, Thirumalai D. Force-dependent hopping rates of RNA hairpins can be estimated from accurate measurement of the folding landscapes. Proc Natl Acad Sci USA. 2008;105(28):9604–9609. doi: 10.1073/pnas.0802484105.
  20. Greenleaf WJ, Woodside MT, Abbondanzieri EA, Block SM. Passive all-optical force clamp for high-resolution laser trapping. Phys Rev Lett. 2005;95(20):208102. doi: 10.1103/PhysRevLett.95.208102.
  21. Neuman KC, Block SM. Optical trapping. Rev Sci Instrum. 2004;75(9):2787–2809. doi: 10.1063/1.1785844.
  22. Ferrenberg AM, Swendsen RH. Optimized Monte Carlo data analysis. Phys Rev Lett. 1989;63(12):1195–1198. doi: 10.1103/PhysRevLett.63.1195.
  23. Shirts MR, Chodera JD. Statistically optimal analysis of samples from multiple equilibrium states. J Chem Phys. 2008;129(12):124105. doi: 10.1063/1.2978177.
  24. de Messieres M, Brawn-Cinani B, La Porta A. Measuring the folding landscape of a harmonically constrained biopolymer. Biophys J. 2011;100(11):2736–2744. doi: 10.1016/j.bpj.2011.03.067.
  25. Kierfeld J, Niamploy O, Sa-yakanit V, Lipowsky R. Stretching of semiflexible polymers with elastic bonds. Eur Phys J E Soft Matter. 2004;14(1):17–34. doi: 10.1140/epje/i2003-10089-3.
  26. Bornschlögl T, Rief M. Single molecule unzipping of coiled coils: Sequence resolved stability profiles. Phys Rev Lett. 2006;96(11):118102. doi: 10.1103/PhysRevLett.96.118102.
  27. O’Shea EK, Klemm JD, Kim PS, Alber T. X-ray structure of the GCN4 leucine zipper, a two-stranded, parallel coiled coil. Science. 1991;254(5031):539–544. doi: 10.1126/science.1948029.
  28. Bornschlögl T, Rief M. Single-molecule dynamics of mechanical coiled-coil unzipping. Langmuir. 2008;24(4):1338–1342. doi: 10.1021/la7023567.
  29. Hyeon C, Dima RI, Thirumalai D. Pathways and kinetic barriers in mechanical unfolding and refolding of RNA and proteins. Structure. 2006;14(11):1633–1645. doi: 10.1016/j.str.2006.09.002.
  30. Betancourt MR, Thirumalai D. Pair potentials for protein folding: Choice of reference states and sensitivity of predicted native states to variations in the interaction schemes. Protein Sci. 1999;8(2):361–369. doi: 10.1110/ps.8.2.361.
  31. Mickler M, et al. Revealing the bifurcation in the unfolding pathways of GFP by using single-molecule experiments and simulations. Proc Natl Acad Sci USA. 2007;104(51):20268–20273. doi: 10.1073/pnas.0705458104.
  32. Denesyuk NA, Thirumalai D. Crowding promotes the switch from hairpin to pseudoknot conformation in human telomerase RNA. J Am Chem Soc. 2011;133(31):11858–11861. doi: 10.1021/ja2035128.
  33. Ermak DL, McCammon JA. Brownian dynamics with hydrodynamic interactions. J Chem Phys. 1978;69(4):1352–1360.

Articles from Proceedings of the National Academy of Sciences of the United States of America are provided here courtesy of National Academy of Sciences