Biophysical Journal. 2017 May 9;112(9):1829–1840. doi: 10.1016/j.bpj.2017.03.028

Competing Pathways and Multiple Folding Nuclei in a Large Multidomain Protein, Luciferase

Zackary N Scholl 1, Weitao Yang 2, Piotr E Marszalek 3,∗∗
PMCID: PMC5425382  PMID: 28494954

Abstract

Proteins obtain their final functional configuration through incremental folding with many intermediate steps in the folding pathway. If known, these intermediate steps could be valuable new targets for designing therapeutics and the sequence of events could elucidate the mechanism of refolding. However, determining these intermediate steps is difficult, and has been elusive for most proteins, especially large, multidomain proteins. Here, we effectively map part of the folding pathway for the model large multidomain protein, Luciferase, by combining single-molecule force-spectroscopy experiments and coarse-grained simulation. Single-molecule refolding experiments reveal the initial nucleation of folding while simulations corroborate these stable core structures of Luciferase, and indicate the relative propensities for each to propagate to the final folded native state. Both experimental refolding and Monte Carlo simulations of Markov state models generated from simulation reveal that Luciferase most often folds along a pathway originating from the nucleation of the N-terminal domain, and that this pathway is the least likely to form nonnative structures. We then engineer truncated variants of Luciferase whose sequences correspond to the putative structures from simulation and we use atomic force spectroscopy to determine their unfolding behavior and stability. These experimental results corroborate the structures predicted from the folding simulation and strongly suggest that they are intermediates along the folding pathway. Taken together, our results suggest that initial Luciferase refolding occurs along a vectorial pathway and also suggest a mechanism that chaperones may exploit to prevent misfolding.

Introduction

An understanding of how proteins incrementally obtain their structure through folding is crucial for the prevention of diseases and development of new therapeutics (1, 2, 3, 4). The previous 50 years of protein folding research (see reviews (5, 6, 7, 8, 9, 10)) have produced excellent theoretical frameworks for understanding the folding of single-domain proteins, but most proteins are multidomain: multidomain proteins comprise >70% of the proteomes in all kingdoms of life (11). Multidomain protein folding behavior is not well understood, and recent experiments have shown that the simple smooth energy landscape theories typically do not apply to multidomain proteins, which often display non-Anfinsen mechanisms like kinetic partitioning (12), cotranslational folding (13, 14, 15, 16, 17, 18), coexisting intersecting folding pathways (19), or confinement by chaperonins (20, 21).

Here we make use of the model multidomain protein, Firefly Luciferase (Photinus pyralis, denoted as “Luciferase” hereafter), which effectively cannot refold (refolding takes up to 72 h (22, 23)), and only folds due to the involvement of molecular chaperones, or the ribosome (23, 24, 25, 26). In this study, we ask the following: is the structural topology of Luciferase inherently predisposed to inhibit spontaneous refolding while promoting cotranslational or vectorial folding? To answer this question, we use a combination of experiment and simulation. We first use single-molecule refolding experiments to identify initial steps in refolding, and then we use coarse-grain simulation to identify autonomously folded core substructures (27, 28). Finally, we use systematic truncation and single-molecule force spectroscopy to experimentally test the selected Luciferase substructures (29, 30, 31, 32, 33, 34, 35).

In contrast to previous truncation studies that use traditional spectroscopic techniques to evaluate stability (36, 37), we experimentally test the structure and stability of the simulation-predicted structures using single-molecule force spectroscopy (SMFS, reviewed in (38, 39, 40, 41, 42, 43, 44)). Our method of truncation has three distinct advantages over traditional truncation studies, as follows. First, SMFS takes advantage of the observation that mechanically stable structures produce distinct and consistent force peaks while unstructured polypeptides lack any discernible force peaks and follow a simple wormlike chain polymer elasticity (45). Second, using polyproteins as flanking handles eliminates the introduction of the C-terminal carboxyl charge or N-terminal amino charge at the sites of truncation, which should not be present in the true protein and may introduce anomalous charge interactions. Third, single-molecule experiments unfold proteins in isolation, which can minimize their tendency to aggregate.

Whereas SMFS can relay information about stability through the unfolding force peak, the extension of the polypeptide chain after the unfolding event (the contour-length increment) directly measures the number of amino acids initially present in the stable folded structure (46). Thus the results from SMFS can be directly compared to steered molecular dynamics simulations to infer whether the measured contour-length increment is consistent with the structure identified in the simulations. This powerful combination of simulation and experiment reveals that Luciferase has the greatest propensity to fold correctly, without misfolding, when the process occurs vectorially, from the N-terminal domain to the C-terminal domain, as it does cotranslationally.
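As a rough illustration of how a contour-length increment relates to the number of residues in a folded structure, the following sketch converts between the two. The spacing of 0.365 nm per residue, the function names, and the optional folded end-to-end distance are assumptions made for illustration; they are not parameters reported in this study.

```python
# Hypothetical helpers relating contour-length increments to residue counts.
NM_PER_RESIDUE = 0.365  # assumed contour length per amino acid (nm); not from this study


def contour_length_nm(n_residues, nm_per_residue=NM_PER_RESIDUE):
    """Contour length of a fully extended polypeptide of n_residues."""
    return n_residues * nm_per_residue


def expected_increment_nm(n_residues, folded_end_to_end_nm=0.0,
                          nm_per_residue=NM_PER_RESIDUE):
    """Expected contour-length increment when a folded structure unravels.

    The folded structure already spans folded_end_to_end_nm between its
    termini, so that distance is subtracted from the released length.
    """
    return contour_length_nm(n_residues, nm_per_residue) - folded_end_to_end_nm


def residues_from_increment(delta_lc_nm, folded_end_to_end_nm=0.0,
                            nm_per_residue=NM_PER_RESIDUE):
    """Invert the relation: estimate the residues in the structure that unfolded."""
    return (delta_lc_nm + folded_end_to_end_nm) / nm_per_residue


# Full-length Luciferase has 550 residues.
full_length = contour_length_nm(550)
```

Under the assumed spacing, the 550 residues of Luciferase correspond to roughly 200 nm of contour length.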

Materials and Methods

Protein purification

Plasmids containing the I91₃-Luciferase-I91₄ construct and the I91₉ polyprotein were used to purify protein as described in the literature (47, 48). Truncated versions of Luciferase were generated at the DNA level through mutagenesis or gene synthesis from GenScript (Piscataway, NJ) at the residues specified in the main text. Plasmids were transformed and overexpressed in C41(DE3)pLysS cells (Lucigen; www.lucigen.com) and then purified using NiNTA columns (Qiagen, Hilden, Germany). The resulting purified protein was dialyzed into a buffer containing either 1× or 2× PBS pH 7.6 at a concentration of 1–10 mg/mL and used for atomic force microscopy (AFM) measurements.

Atomic force spectroscopy

Atomic force spectroscopy (AFS) experiments were performed using custom-built instruments (49). Proteins were diluted to 150 μg/mL in a 2× PBS buffer and loaded onto a freshly evaporated gold substrate for 1 h. The substrate was then washed once and used for pulling experiments. We used either OBL cantilevers (spring constant ∼6 pN/nm) or MLCT cantilevers (spring constant ∼16 pN/nm) (Bruker, Billerica, MA). The spring constant of each cantilever was calibrated using the energy equipartition approach (49). During unfolding experiments, the AFM cantilever was pressed against the sample at a contact force of 100–500 pN to nonspecifically bind to the protein. The presence of at least five I91 domains (ΔLc ∼28 nm and unfolding force ∼200 pN) allowed for the unequivocal determination of the unfolding pattern of the protein. During refolding experiments, the AFM cantilever tip was first moved toward the sample, relaxing the protein, and then was retracted away from the sample to determine the force-extension data of the refolded polypeptide; this procedure minimized unwanted tip-surface interactions. All data were analyzed using the software MATLAB 7.10 (The MathWorks, Natick, MA). The Gaussian mixture model fitting for three was done using an expectation-maximization algorithm.
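The analysis in this study was done in MATLAB; the following is only a minimal Python sketch of how a one-dimensional Gaussian mixture can be fitted by expectation maximization. The function name, the quantile-based initialization, the iteration count, and the width floor are all assumptions for illustration rather than the settings used here.

```python
import math


def fit_gmm_1d(data, n_components, n_iter=200):
    """Fit a 1D Gaussian mixture by expectation maximization.

    Returns a list of (weight, mean, std) tuples sorted by mean.
    """
    xs = sorted(data)
    n = len(xs)
    # Initialize the means at evenly spaced quantiles of the data (assumed choice)
    means = [xs[int((k + 0.5) * n / n_components)] for k in range(n_components)]
    width = (xs[-1] - xs[0]) / (2.0 * n_components) or 1.0
    stds = [width] * n_components
    weights = [1.0 / n_components] * n_components

    for _ in range(n_iter):
        # E-step: responsibility of each component for each data point
        resp = []
        for x in xs:
            dens = [w * math.exp(-0.5 * ((x - m) / s) ** 2) / (s * math.sqrt(2.0 * math.pi))
                    for w, m, s in zip(weights, means, stds)]
            total = sum(dens) or 1e-300
            resp.append([d / total for d in dens])
        # M-step: update weights, means, and standard deviations
        for k in range(n_components):
            nk = sum(r[k] for r in resp)
            if nk < 1e-12:
                continue  # leave an empty component unchanged
            weights[k] = nk / n
            means[k] = sum(r[k] * x for r, x in zip(resp, xs)) / nk
            var = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, xs)) / nk
            stds[k] = max(math.sqrt(var), 1e-3)  # floor avoids collapse to zero width
    return sorted(zip(weights, means, stds), key=lambda c: c[1])
```

Called on a list of contour-length increments with the desired number of components, the function returns one (weight, mean, standard deviation) triple per fitted distribution.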

Steered molecular dynamics

Simulations were conducted using the software GROMACS 4.5.7 (www.gromacs.org) (50) using a coarse-grain Cα model of Luciferase generated using the software SMOG (Software Modeling Group; http://smog.iem.pw.edu.pl/) (51, 52) from the apo-conformation crystal structure of Luciferase (53). Truncated variants of Luciferase were generated by truncating contacts from the original Luciferase coarse-grain Cα model. Simulations were conducted using a temperature of 150 temperature units. This temperature is near the peak of the heat capacity for this particular coarse-grain Cα model of Luciferase, as shown previously (47). All simulations were performed at a speed of 1 nm/s using a spring constant of 6 pN/nm, by pulling from the N-terminus and fixing the C-terminus. Similar results were obtained when pulling from the C-terminus and fixing the N-terminus (data not shown).

Molecular dynamics of luciferase folding

Simulations were conducted using the software GROMACS 5.0.6 (54) using a coarse-grain Cα model of the apo-conformation of Luciferase (53) generated using the software SMOG (51, 52). Simulations were first initialized by heating the Luciferase model to 300 temperature units, which immediately denatures the coarse-grain model of the protein. We then conducted 407 coarse-grain simulations, each starting from a randomly selected heat-denatured state, and quenched the temperature to 150 temperature units. This temperature is near the peak of the heat capacity for this particular coarse-grain Cα model of Luciferase, as shown previously, and allows the coarse-grain model to refold (47). Simulations were run for 1,000,000,000 steps, with a step size of 0.0005. A frame was saved to an output file every 1,000,000 steps and used for analysis.

Markov state models

Markov state models (MSMs) are useful for determining a complete description of kinetic and thermodynamic properties from many simulations (55, 56, 57). The MSMs in this study were determined by clustering trajectories into a series of metastable states, and then determining the transition matrix that corresponds to the exchange between these states.

First, simulations were analyzed and clustered using the software Python 2.7.6 (https://www.python.org/download/releases/2.7.6/; accessed January 22, 2015) with packages from SciPy, NumPy, and Scikit-learn (58, 59). Each frame from the resulting trajectory was analyzed for whether each of the 1956 contacts of Luciferase was present, to create a contact vector. Frames with <20% of the total contacts were automatically clustered as “unfolded” whereas frames containing >90% of the total contacts were automatically clustered as “folded”. A matrix containing the contact vectors for all of the remaining frames was then used for clustering via K-means or DBSCAN algorithms. Similar results were obtained for both clustering methods and for different numbers of clusters, as shown in Fig. S1.
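The presorting step described above can be sketched as follows. This is a minimal illustration, not the analysis code used in this study: the function names and the distance tolerance of 1.2 used to decide whether a contact is formed are assumptions, while the 20% and 90% thresholds follow the text.

```python
def contact_vector(distances, native_distances, tolerance=1.2):
    """Binary contact vector for one frame.

    A contact counts as formed when the pair distance is within
    tolerance times its native distance (the tolerance is an assumed value).
    """
    return [1 if d <= tolerance * d0 else 0
            for d, d0 in zip(distances, native_distances)]


def presort_frame(vector, unfolded_below=0.20, folded_above=0.90):
    """Label a frame by the fraction of native contacts that are formed."""
    fraction = sum(vector) / len(vector)
    if fraction < unfolded_below:
        return "unfolded"
    if fraction > folded_above:
        return "folded"
    return "cluster"  # frame is passed on to K-means or DBSCAN


def frames_for_clustering(vectors):
    """Keep only the contact vectors that are neither unfolded nor folded."""
    return [v for v in vectors if presort_frame(v) == "cluster"]
```

Only the frames returned by `frames_for_clustering` would then be stacked into the matrix handed to the clustering algorithm.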

The transition matrix for the MSM, P, was generated by identifying transitions between metastable states in each individual simulation and then tallying them into a single matrix that gives the transition probability between any two given states across all the simulations (60). The MSMs were visualized using the software Graphviz (Graph Visualization Software; graphviz.org) (61). These visualizations (Fig. S2) omit the self-transitions and transitions below a minimum threshold of 0.001%. Kinetics were determined using an initialization vector for relative occupancy, v(0), and then determining the subsequent relative occupancy through matrix multiplication, as follows:

$v(t) = v(0)\,P^{t}$, (1)

where P is the normalized transition probability matrix for the Markov chain, and t is the step. Results from kinetic calculations were also similar irrespective of cluster size or method.
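The tallying and propagation steps can be sketched in a few lines. This is a minimal illustration under stated assumptions: the function names and the three-state toy trajectories are hypothetical, and states that are never left in the data are treated as absorbing, which is a choice made here rather than one stated in the text.

```python
def transition_matrix(trajectories, n_states):
    """Tally state-to-state transitions over all trajectories and row-normalize."""
    counts = [[0.0] * n_states for _ in range(n_states)]
    for traj in trajectories:
        for a, b in zip(traj[:-1], traj[1:]):
            counts[a][b] += 1.0
    matrix = []
    for i, row in enumerate(counts):
        total = sum(row)
        if total == 0:
            # State never left in the data: treat it as absorbing (assumption)
            matrix.append([1.0 if j == i else 0.0 for j in range(n_states)])
        else:
            matrix.append([c / total for c in row])
    return matrix


def propagate(v0, P, steps):
    """Relative occupancy after `steps` steps: v(t) = v(0) P^t (row vector times matrix)."""
    v = list(v0)
    for _ in range(steps):
        v = [sum(v[i] * P[i][j] for i in range(len(v))) for j in range(len(P))]
    return v


# Toy example with 3 hypothetical states: 0 = unfolded, 1 = intermediate, 2 = folded
toy_trajectories = [[0, 0, 1, 1, 2], [0, 1, 0, 1, 2], [0, 0, 0, 1, 2]]
P = transition_matrix(toy_trajectories, 3)
occupancy = propagate([1.0, 0.0, 0.0], P, 50)
```

Repeated vector-matrix products are used instead of an explicit matrix power so that the occupancy at every intermediate step is available if a kinetic trace is wanted.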

For the scope of this work we will limit the discussion to the results from clustering of Luciferase into 14 states, shown in Fig. S3, which are related by their transition probabilities shown in Fig. S2. The results of all other clusterings are shown in Figs. S4–S7, which all give similar conclusions.

Monte Carlo simulations

The percentage of pathways that are able to reach the native state, as shown in Fig. 4, was determined using a Monte Carlo simulation of the MSM for the 14-state cluster shown in Fig. S3. We simulated 10,000 trajectories that start in the unfolded state and progress to the folded state via the probabilities outlined in the MSM transition matrix. Trajectories that regressed back to the unfolded state were ignored. The probabilities in Fig. 4 represent the fraction of these trajectories that pass through that particular state.
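A minimal sketch of such a Monte Carlo procedure is given below. It reflects one possible reading of the text, in which an excursion that regresses to the unfolded state is discarded and only the final excursion from unfolded to folded is scored. The function names, the four-state toy matrix, and the step cap are hypothetical.

```python
import random


def sample_next(row, rng):
    """Draw the next state index from one row of the transition matrix."""
    r = rng.random()
    cumulative = 0.0
    last = 0
    for state, p in enumerate(row):
        if p <= 0.0:
            continue
        cumulative += p
        last = state
        if r < cumulative:
            return state
    return last  # guard against floating-point round-off


def pathway_fractions(P, unfolded, folded, n_trajectories=10000, seed=0,
                      max_steps=100000):
    """Fraction of completed folding pathways that pass through each state."""
    rng = random.Random(seed)
    visits = [0] * len(P)
    completed = 0
    for _trial in range(n_trajectories):
        state, visited = unfolded, set()
        for _step in range(max_steps):
            state = sample_next(P[state], rng)
            if state == unfolded:
                visited = set()  # regressed: discard this excursion
            elif state == folded:
                completed += 1
                for s in visited:
                    visits[s] += 1
                break
            else:
                visited.add(state)
    return [v / completed if completed else 0.0 for v in visits]


# Toy matrix (hypothetical): 0 = unfolded, 1 and 2 = folding nuclei, 3 = folded
toy_P = [[0.90, 0.07, 0.03, 0.00],
         [0.05, 0.90, 0.00, 0.05],
         [0.02, 0.00, 0.90, 0.08],
         [0.00, 0.00, 0.00, 1.00]]
fractions = pathway_fractions(toy_P, unfolded=0, folded=3, n_trajectories=2000, seed=1)
```

In the toy matrix the two nuclei do not interconvert, so every completed pathway passes through exactly one of them and the two fractions sum to one.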

Figure 4.

Representation of the main folding routes from coarse-grain simulations (back arrows representing unfolding not shown; for complete Markov state model, see Fig. S2). The numbers above each state represent that state number for the 14-state clustering (reference in Fig. S3), while the percentage is the probability of obtaining that structure during any given folding pathway (see Materials and Methods). The arrow thickness qualitatively represents the relative flux through the pair of states. Structures with a star indicate that they were evaluated experimentally by SMFS. To see this figure in color, go online.

Results

Luciferase is a 550-amino-acid monomeric protein that uses ATP to catalyze oxidation of D-(-)-Luciferin (Luciferin) to produce a photon of light in Firefly insects (62, 63, 64). The structural conformation of Luciferase has high homology with other AMP-forming ligases like acyl-CoA ligases (65) and nonribosomal peptide synthetases (66). Most structural studies indicate that there are three structural domains in Luciferase, as follows: an N-terminal, a middle, and a C-terminal domain, but the precise structural boundaries are difficult to ascertain and the boundary definitions depend highly on the method of characterization, as shown in Fig. S8 (colored bars). A recent method by Porter and Rose (67) used a thermodynamic metric of protein domains and observed that Luciferase has four domains, as follows: residues 15–266, 267–337, 338–433, and 434–522. Protein Domain Parser (68) determined that there are three domains, as follows: residues 18–218, 219–437, and 438–542. Conti et al. (69) determined a structure for Luciferase and observed that there are four structural domains. Previous results in our lab show that experimental and simulated mechanical unfolding occurs in three domains composed of residues 1–248, 249–443, and 444–550 (47), but their contribution to the folding pathway is unknown and it is unclear how the structure of the domain is obtained incrementally through folding.

Luciferase initiates refolding through folding nuclei

Luciferase does not readily refold and requires the use of chaperones to accelerate its refolding (22, 23, 25, 47, 70). This may be due to Luciferase’s propensity to aggregate after unfolding (23), while chaperones may be able to avert this process by acting as an unfoldase to relieve nonnative protein-protein interactions before allowing the protein to refold (71, 72, 73, 74, 75, 76). Thus, we sought to use an atomic force microscope as the ultimate unfoldase, to perform single-molecule refolding on Luciferase by first prestretching the molecule to a high force (≈100 pN) before attempting to refold.

In our refolding experiment (schematic in Fig. 1), each attempt begins with the molecule prestretched. A refolding attempt is initiated when the substrate is moved at constant velocity toward a set point, after which it immediately retracts at constant velocity. During each refolding attempt the protein experiences a force related to the entropic elasticity of the polypeptide chain, which is well approximated by a wormlike chain (77). A polypeptide approximated by a wormlike chain with a persistence length of 0.4 nm, moving at constant velocity, will spend 27% of the time at a force below 5 pN. This force of 5 pN is the observed isometric unfolding force for many molecules (i.e., the force at which the molecule spends 50% of the time folded and 50% unfolded), and lower forces will favor refolding over unfolding (78, 79, 80, 81). Thus, a refolding attempt spanning 4 s will allow ≈1 s during which the force on the molecule is below the typical isometric force so that refolding can occur. Successful refolding is detected during the retraction, where the folded domain may be forcefully unfolded under the increasing force, resulting in a peak in the force-extension curve (denoted an “unfolding event”). We hypothesized that this method of refolding would allow us to observe refolding events of Luciferase.
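The estimate of the time spent at low force can be checked with a short calculation. The sketch below uses the Marko-Siggia interpolation formula for the wormlike chain and finds the relative extension at which the force reaches 5 pN. It assumes a thermal energy of 4.11 pN·nm (near room temperature) and a constant-velocity ramp that sweeps the relative extension uniformly between zero and the full contour length, so that the fraction of time below a given force equals the relative extension at that force. The function names are hypothetical.

```python
KT_PN_NM = 4.11  # thermal energy near room temperature in pN*nm (assumed)


def wlc_force(rel_extension, persistence_nm=0.4, kt=KT_PN_NM):
    """Marko-Siggia interpolation: force (pN) at relative extension x/Lc in [0, 1)."""
    z = rel_extension
    return (kt / persistence_nm) * (0.25 / (1.0 - z) ** 2 - 0.25 + z)


def rel_extension_at_force(force_pn, persistence_nm=0.4, kt=KT_PN_NM):
    """Invert wlc_force by bisection (the force rises monotonically with extension)."""
    lo, hi = 0.0, 1.0 - 1e-9
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if wlc_force(mid, persistence_nm, kt) < force_pn:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Relative extension at which the chain reaches the 5 pN isometric force
z_isometric = rel_extension_at_force(5.0)
```

Under these assumptions `z_isometric` comes out near 0.27, in line with the 27% quoted in the text.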

Figure 1.

Schematic of the refolding experiments for the I91₃-Luciferase-I91₄ construct and I91₉. The distance between the substrate and the tip is modulated by the piezo, which executes a constant-velocity ramp as shown in (A). The force on the molecule as determined by the extension, closely approximated by a wormlike chain, is shown in (B). The shaded region in (A) and (B) is where the molecule experiences a force that is below the typical isometric unfolding force (5 pN). The physical schematic of the experiment is shown in (C), where the piezo moves toward the cantilever, at which point a protein can fold (denoted by the star); folding is detected by the subsequent unfolding, which results in a force peak.

We first performed 765 refolding attempts on >50 single molecules of the polyprotein I91₉, without Luciferase, as a control experiment. Most of the refolding attempts yielded fully folded I91 and/or fully unfolded I91 (unstructured polypeptide chain), but there was a small fraction that contained events that have neither the characteristic unfolding contour-length increment of 28 nm nor the unfolding force of 200 pN, as shown in Figs. S9 and S10. The vast majority of these peaks are <120 pN, which we use as the cutoff threshold for non-I91 unfolding events (Fig. S9). We observe that 11% of refolding attempts yield non-I91 unfolding events. Previous literature has detected similar types of events, which were determined to be misfolded I91 (82, 83, 84, 85).

We then performed 849 refolding attempts on >50 single molecules of the I91-flanked Luciferase construct (I91₃-Luciferase-I91₄). The presence of flanking I91 domains serves as a positive control for the single molecule under extension and as a handle for manipulation of the molecule of interest (86). The refolding pattern contained a single non-I91 event in 37.4 ± 0.3% of recordings, as shown in Fig. 3 (bottom) with representative examples in Fig. 2. When the pulling speed is decreased to allow up to 10 s of time in which the molecule experiences a force <5 pN, the refolding pattern contains a single non-I91 event in 36.9 ± 0.8% of recordings. Thus, we cannot detect a difference in the propensity of refolding within the time range that our AFM can probe. The origin of this event could be the misfolding of I91 or the misfolding/refolding of Luciferase. However, several observations point to the conclusion that the majority of these events are from Luciferase and not I91. First, non-I91 events occur approximately three times more often with Luciferase (31% with I91₃-Luciferase-I91₄ and 11% with I91₉), despite the construct having fewer I91 domains and similar experimental times. Second, the contour length to the first I91 domain for the Luciferase refolding events is consistent with the size of Luciferase (which should provide ∼200 nm of contour length, Fig. S11). Third, the distribution of contour-length increments of the non-I91 events of Luciferase does not match the distribution from I91 alone (Fig. 3).

Figure 3.

Distributions of the contour-length increments for non-I91 peaks in refolding recordings (peaks with force <120 pN; see Fig. S9 for all peaks). Top shows the distribution for refolding of I91₉. Bottom shows the distribution for the refolding of I91₃-Luciferase-I91₄, which has events that are not present in I91₉ and that match previous results for the unfolding of full Luciferase. To see this figure in color, go online.

Figure 2.

Four examples of successive refolding attempts on a single molecule of I91₃-Luciferase-I91₄. The order of recordings goes from bottom to top for each set. As can be seen, the I91 domains (200 pN unfolding force) are able to refold successfully upon each attempt. A single non-I91 peak appears a fraction of the time, and a wormlike chain fit (dashed line) shows contour-length increments that match previously reported contour-length increments for individual domains of Luciferase.

The distribution of the contour-length increments for the non-I91 events also indicates that Luciferase may be refolding and not misfolding. There are three distributions (R3, R4, and R5) unique to Luciferase, centered at 49.1, 70.1, and 84.5 nm. These distributions and forces (shown in Table 1) are consistent with the previous results for the contour-length increments for the unfolding of the natively folded truncated N-terminal domain (54.3 nm; Fig. S12), middle domain (71.9 nm; Fig. S13), and the N-terminal domain of Luciferase (87.4 nm; Fig. S14), respectively (47). Thus, these results indicate that Luciferase is able to, in most cases, initially refold to the native state of one of its domains. The first two distributions, R1 and R2, may be indicative of refolding of the smaller C-terminal domain of Luciferase (Fig. S15) or partial structures of the other two domains; however, we are currently unable to rule out events similar to I91 events, as shown in Figs. S16 and S17. We note that none of these refolding attempts yielded a fully folded Luciferase, which may indicate that a difficult barrier to the full refolding of Luciferase remains.

Table 1.

Results for the Contour-Length Increment and Unfolding Force for Non-I91 Events from Refolding of Luciferase

Unfolding Event    ΔLc [nm]      Fu [pN]
R1                 22.0 ± 4.0    58 ± 29
R2                 31.2 ± 4.3    55 ± 28
R3                 49.1 ± 5.2    45 ± 22
R4                 70.1 ± 2.9    45 ± 25
R5                 84.5 ± 3.7    46 ± 26

Unfolding events are labeled according to clusters in Fig. 3.

Folding simulations reveal three independent folding nuclei

We sought to understand the domains that correspond to the initial and subsequent refolding events for Luciferase using molecular dynamics simulations of structure-based models. These models explicitly encode the native state but are well suited for determining the degree of topological frustration that exists in a protein acquiring its native fold. We simulated 407 model-dependent folding trajectories, each of which started from the unfolded state. These trajectories were then used to generate a Markov state model (MSM), which can provide a complete description of the thermodynamic and kinetic properties of the combined simulations (more details in Materials and Methods) (57, 87, 88, 89).

The prominent states determined from the folding simulations are shown in Fig. 4. The transition probabilities (Fig. S2) predict that only three states are accessible to Luciferase, from the unfolded ensemble. These three states have the same contact maps and transition probabilities, regardless of clustering method or number of clusters as shown in Fig. S1. Although the clustering results are similar across number of clusters and method (Fig. S1), this article will focus on the 14-state clustering of the coarse-grained folding trajectory of Luciferase, whose MSM is shown in Fig. S2.

The states that are first occupied by Luciferase after being unfolded are states 11 (C-terminal domain), 1 (core of middle domain), and 3 (core of N-terminal domain), whose structural representations are shown in Fig. 4 and contact maps in Fig. S3. The transitions between the unfolded ensemble and these three states (and self-transitions) account for 99.998% of those captured in the simulation, and offer three distinct pathways for folding.

The most frequent pathway, which goes through state 1, is composed of the core of the second half of the middle domain of Luciferase, namely residues 332–440 (Fig. 4, orange structure). The unfolded ensemble transitions to state 1 with a probability of 0.70% during each step of the simulation. While it is the most frequent state originating from the unfolded ensemble, this state only results in reaching the final folded state 36% of the time. State 1 is also moderately unstable, as it regresses back to the unfolded ensemble 3.3% of the time and only transits further down the folding pathway 2.2% of the time (whereas the rest of the time it transitions back to itself), as shown in the full MSM in Fig. S2.

The second most frequent pathway passes through state 3, which has a probability of 0.26% for transitioning from the unfolded ensemble during each step in the simulation. This state is the core of the Luciferase N-terminal domain, composed of residues 46–180, as shown in Fig. 4 as the blue structure. The N-terminal core is the most proficient at reaching the final folded state, as 52% of the folding pathways originate from it. In contrast to state 1, which corresponds to the middle domain of Luciferase, the N-terminal core is much more stable. Although it transitions back to the unfolded ensemble 0.9% of the time, it will transit further down the folding pathway 9.7% of the time.

The third and least frequented pathway from the unfolded ensemble passes through state 11, which corresponds to the C-terminal domain of Luciferase (Fig. 4 as the red structure). The C-terminal domain is the least likely to be the origin for the final folded state, as it is present in only 4% of all folding pathways, and because the unfolded ensemble transitions to the C-terminal domain with a low probability of 0.13% at each step of the simulation. This state is moderately unstable, similar to the middle domain of Luciferase, as it unfolds 3.7% of the time and only continues along the folding pathway 1.5% of the time.

Nonnative states predicted from simulation

As can be seen in the kinetics from the Monte Carlo simulations of the Markov states (Fig. S18), there are four states (2, 4, 7, and 10) that have nonzero occupancy alongside the folded state when the simulation converges. With the exception of state 2, which represents the natively folded Luciferase without the C-terminal domain (Fig. 4; Fig. S2), these states represent a nonnative topology.

All of the nonnative states from our simulations involve an incorrect threading of the N-terminal residues (Fig. 5). In the first case, state 7, a β-hairpin at residues 36–52 is interrupted by residues 261–278. Normally residues 268–278 would form a helix; however, because of the threading, there are too many steric clashes for a helix to form, so those contacts cannot be acquired. The structure is stable, though, because the surrounding contacts can still form natively. This state is most often obtained from state 5, which is along the pathway of nucleating from the C-terminal or middle domain (where the N-terminal domain remains unfolded).

Figure 5.

Representative structures of the topologically misfolded states obtained during folding of Luciferase, with the state labeled in bold (referring to 14-state clustering in Fig. S2). The structures are colored from red (N-terminal residues) to blue (C-terminal residues). The right side shows a template molecule of the correctly folded form of Luciferase. To see this figure in color, go online.

In the second case of nonnative topology, state 10, the linker residues 104–110 enter the loop formed by the helix-turn-strand residues 73–95, instead of going around. Because of this, the helix between residues 109 and 121 cannot form properly, although the rest of the native contacts are formed normally. This state transitions from the fully folded state (although this could be mislabeling, as the numbers of contacts are very similar), but it also transitions often from state 8, which is a case where the N-terminal domain folds and the C-terminal part of the middle domain folds, but the interface is not yet set.

In the third case of nonnative topology, state 4, the first helix, residues 19–34, is not natively formed because the most N-terminal residues are caught between a helix composed of residues 82–95 and a hairpin-helix composed of residues 36–67. The N-terminal residues 1–15 normally form contacts with the middle domain of Luciferase (data not shown) and the rest of the contacts are enough to stabilize this fold. The states 5 and 9 can transition to this state, and are both along the pathway leading from state 11.

The transition probabilities from the 14-state cluster (Materials and Methods) can be used to determine which pathway is the most frequent origin for misfolding. Using a Monte Carlo approach, we started simulations in the unfolded ensemble and progressed the state until either a nonnative state or a fully folded state was reached. The simulations that folded from the C-terminal domain (state 11) obtained one of the nonnative states 88% of the time. The simulations that folded from the middle domain (state 1 or 12) obtained a nonnative state in 37% of their trajectories. The N-terminal pathway (state 3 or 6) had the best avoidance of nonnative states, as trajectories starting from these states obtained a misfolded state only 16% of the time.

Experimental stability of conformations predicted from simulation

Folding nuclei are experimentally stable

The three states determined from coarse-grained simulations, 11, 1, and 3, represent the folding nuclei of the C-terminal, middle, and N-terminal domains of Luciferase, respectively (Fig. 4). To determine whether these folding nuclei are stable, we engineered the truncated variants of the proteins corresponding to the three states and examined their force-extension profiles using force spectroscopy. If they are stable structures, probing by force spectroscopy will produce distinct and consistent force peaks, whereas no discernible force peaks will be produced if the truncated variants are unstable or unfolded, as these would simply follow wormlike chain polymer elasticity.

Before experimentally perturbing with force, we first simulated the mechanical force experiment using the prospective structures from states 3, 1, and 11 with coarse-grained steered molecular dynamics (Materials and Methods) to determine whether their elasticity follows a simple wormlike chain or reveals unfolding force peaks that produce contour-length increments that could be later compared to the experimental observations. The simulated structures were generated by double-truncation of the full model of Luciferase at the residues specified from the three states. All three of these steered molecular dynamics simulations yielded a single rupture event with a contour-length increment (ΔLc) that matched the experimental contour-length increment (see Fig. 7; Table 2). We would expect that experimental measurements of the contour-length increment should yield a similar result, if the domain is folded as predicted.

Figure 7.

Experimental SMFS results of the contour-length increment (left) and unfolding force (right) for Luciferase truncations. Previous results on the full Luciferase (47) are shown at the top for comparison, which shows unfolding of C-terminal domain (light shading), middle domain (dark shading), and N-terminal domain (solid). The dashed bar shows the result for the contour-length increment calculated from SMD simulations.

Table 2.

Results for the Contour-Length Increment and Unfolding Force for Unfolding of Full Luciferase and Luciferase Truncations

Unfolding event       ΔLc [nm] (Simulations)    ΔLc [nm] (Experiments)    Fu [pN]
Luciferase, Peak 1    35.0                      38.7 ± 6.6                24 ± 12
Luciferase, Peak 2    68.0                      71.9 ± 5.7                38 ± 14
Luciferase, Peak 3    91.3                      87.4 ± 5.2                54 ± 18
Luc1-203              59.5                      51.1 ± 10.5               53 ± 29
Luc1-268              83.2                      79.6 ± 8.1                81 ± 37
Luc1-450-Peak1        67.5                      72.8 ± 7.3                39 ± 16
Luc1-450-Peak2        90.5                      84.1 ± 6.8                54 ± 15
Luc332-440            33.7                      26.0 ± 5.7                51 ± 21
Luc46-180             47.0                      36.2 ± 8.7                78 ± 39
Luc433-550            36.1                      33.5 ± 6.4                29 ± 15

Distributions of results are shown in Fig. 7. Representative results are shown in Fig. 6. The Simulations column corresponds to the contour-length increment measured from steered molecular dynamics coarse-grain simulations of the respective molecule.

We made protein constructs representing the three states by engineering their sequences at the DNA level and then flanking them with I91 domains, as for native Luciferase (47). These proteins were produced and purified and then subjected to mechanical force (see Materials and Methods). Representative results of the force-extension curves are shown in Fig. 6 and the distributions of contour-length increment and unfolding force are shown in Fig. 7. All three states, state 11 (C-terminal domain, Luc433-550), state 1 (core of middle domain, Luc332-440), and state 3 (core of N-terminal domain, Luc46-180), showed distributions of contour lengths that peaked very close to the simulated contour-length increment determined from SMD calculations, indicating that these truncated variants are all stable and have structures near the predicted model structure. The contour-length increment and the unfolding forces of state 11 were very similar to the distributions of the full Luciferase unfolding for Peak 1 (determined earlier to be the C-terminal domain by Scholl et al. (47)), as tested using the Kolmogorov-Smirnov test with correction for multiple hypothesis testing. However, contour-length increments for state 1 and state 3 determined from force-spectroscopy measurements on truncated variants were shorter than the contour-length increments corresponding to Peaks 2 and 3, respectively, in full-length Luciferase. This is understandable, as both these states are truncations of the larger domains whose unfolding gives rise to Peaks 2 and 3.
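A comparison of two measured distributions of this kind can be sketched as follows. This is an illustration only: the text does not state which multiple-testing correction was applied, so the Bonferroni correction used here is an assumption, as are the function names. The p-value uses the standard asymptotic Kolmogorov series with a small-sample adjustment, which is approximate for small data sets.

```python
import math


def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic: largest gap between empirical CDFs."""
    a, b = sorted(sample_a), sorted(sample_b)
    i = j = 0
    d = 0.0
    while i < len(a) and j < len(b):
        x = min(a[i], b[j])
        while i < len(a) and a[i] <= x:
            i += 1
        while j < len(b) and b[j] <= x:
            j += 1
        d = max(d, abs(i / len(a) - j / len(b)))
    return d


def ks_pvalue(d, n_a, n_b, terms=100):
    """Asymptotic p-value from the Kolmogorov distribution."""
    n_eff = n_a * n_b / (n_a + n_b)
    lam = (math.sqrt(n_eff) + 0.12 + 0.11 / math.sqrt(n_eff)) * d
    if lam < 0.2:
        return 1.0  # the series converges slowly here and the p-value is ~1
    total = sum((-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam)
                for k in range(1, terms + 1))
    return min(1.0, max(0.0, 2.0 * total))


def bonferroni(p_values):
    """Bonferroni correction (assumed): scale each p-value by the number of tests."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]
```

A large corrected p-value means the test cannot distinguish the two distributions, which is the sense in which two sets of measurements are described as very similar.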

Figure 6.


Three representative examples of experimental force-spectroscopy recordings obtained for each construct in this study. All the vertical scale bars represent 100 pN and the horizontal scale bars all represent 50 nm. Arrows indicate the force rupture event used in analysis.

The contour-length increments determined from refolding experiments (Table 1) and truncation unfolding experiments (Table 2) indicate that refolding may populate these truncations. The contour-length increment distribution for R2 (〈ΔLc〉 = 31.2 nm) could contain contributions from Luc46–180 (〈ΔLc〉 = 36.2 nm) and Luc433–550 (〈ΔLc〉 = 33.5 nm). Similarly, the contour-length increment distribution for R1 (〈ΔLc〉 = 22.0 nm) could contain contributions from folded Luc332–440 (〈ΔLc〉 = 26.0 nm). However, both R1 and R2 also appear in the refolding of I919 alone (Fig. 3), so they could also have contributions from I91 (for example, Figs. S16 and S17). The contribution from I91 is smaller for R2 than for R1, however, because R2 (29%) is more frequent than R1 (16%) in Luciferase refolding, but not in I919 refolding. We plan on refolding these truncations individually to more accurately determine their contributions to the total refolding of Luciferase.

Predominant folding transitions are experimentally stable

After the initial core of Luciferase forms (states 11, 1, or 3), there are several predominant states along the folding pathway of Luciferase. The Monte Carlo simulations of folding (Fig. 4) and the kinetics (Fig. S18) indicate that states 2, 6, and 12 are frequently observed along the folding pathway to the native state (>30% of all pathways). States 2 and 6 are expansions of the N-terminal core, containing residues 1–450 and 1–268, respectively, as shown in Figs. 4 and S3. State 12 is an expansion of state 1 (Fig. S3), which we chose not to investigate further experimentally in this study, as it lies on the less predominant pathway folding from state 1.
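The idea of estimating how often a state appears along folding pathways can be sketched with a Monte Carlo walk on a Markov state model transition matrix. The example below is a toy under stated assumptions: the four-state matrix, the state labels, and the function names are hypothetical and are not the model built in this study.

```python
import random


def sample_path(transition, start, target, rng, max_steps=10_000):
    """Random walk on a row-stochastic matrix from start until target is reached."""
    path = [start]
    state = start
    for _ in range(max_steps):
        if state == target:
            break
        state = rng.choices(range(len(transition)), weights=transition[state])[0]
        path.append(state)
    return path


def visit_fractions(transition, start, target, n_paths=5000, seed=0):
    """Fraction of completed start-to-target paths that visit each state."""
    rng = random.Random(seed)
    counts = [0] * len(transition)
    completed = 0
    for _ in range(n_paths):
        path = sample_path(transition, start, target, rng)
        if path[-1] != target:
            continue  # discard walks that did not reach the target in time
        completed += 1
        for state in set(path):
            counts[state] += 1
    return [count / completed for count in counts]


# Hypothetical toy model: 0 = unfolded, 1 and 2 = competing intermediates,
# 3 = native (absorbing). Each row sums to 1.
TOY_MATRIX = [
    [0.0, 0.6, 0.4, 0.0],
    [0.0, 0.5, 0.0, 0.5],
    [0.0, 0.0, 0.5, 0.5],
    [0.0, 0.0, 0.0, 1.0],
]

fractions = visit_fractions(TOY_MATRIX, start=0, target=3)
for state, fraction in enumerate(fractions):
    print(f"state {state}: visited in {100 * fraction:.1f}% of pathways")
```

In this toy model every pathway passes through exactly one intermediate, so the two intermediate fractions sum to one. In a model with interconverting intermediates, the per-state percentages can sum to more than 100%.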

Following our approach for states 1, 3, and 11, we engineered truncated variants for state 2 and state 6 and examined their structure and stability by SMD and by AFM. In the following discussion we also include earlier results obtained on a truncation from residues 1–203, which is similar to state 6, only slightly smaller (47). Representative experimental force-extension curves are shown in Fig. 6 and the results from the experimental force-spectroscopy measurements are shown in Fig. 7.

The contour-length increments calculated from SMD simulations correspond well with the experimentally measured contour-length increments, indicating that all the original models appropriately represent the N-C extension of the folded state of the Luciferase truncation. Also, as expected, the truncation encompassing residues 1–450 from state 2 results in two separate peaks. These two peaks match Peaks 2 and 3 of the full Luciferase unfolding, as their contour-length increment distributions and their rupture force distributions are not statistically significantly different (as tested using the Kolmogorov-Smirnov test). This is in line with previous results positing that Peak 1 from the full Luciferase unfolding originates from the unfolding of the C-terminal domain of Luciferase and that Peaks 2 and 3 correspond to the unfolding of the middle domain of Luciferase and the N-terminal domain, respectively. These experiments indicate that these truncations all likely adopt structures similar to those predicted by simulation.

As suggested by the full Luciferase results, the truncations may also be present in the refolding experiments, as judged by similarities between the contour-length increments in refolding experiments (Table 1) and truncation experiments (Table 2). The contour-length increment distribution for R3 (〈ΔLc〉 = 49.1 nm) closely matches Luc1–203 (N-terminal core, 〈ΔLc〉 = 51.1 nm). Similarly, the contour-length increment distribution for R4 (〈ΔLc〉 = 70.1 nm) is similar to Luc1–450 Peak 1 (middle domain of Luciferase, 〈ΔLc〉 = 72.8 nm). Also, the contour-length increment distribution of R5 (〈ΔLc〉 = 84.5 nm) matches Luc1–450 Peak 2 (N-terminal domain, 〈ΔLc〉 = 84.1 nm). This further supports our view that the refolding experiments detect events that originate from natively folded Luciferase.
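The matching of refolding peaks to truncations by mean contour-length increment can be illustrated with a simple nearest-value ranking. The mean values below are the ones quoted in the text. The ranking rule and the function name `rank_truncations` are assumptions made for illustration, and the rule ignores the widths of the distributions, which matter for any real assignment.

```python
# Mean contour-length increments (nm) quoted in the text.
REFOLDING_PEAKS = {"R1": 22.0, "R2": 31.2, "R3": 49.1, "R4": 70.1, "R5": 84.5}
TRUNCATIONS = {
    "Luc332-440": 26.0,
    "Luc433-550": 33.5,
    "Luc46-180": 36.2,
    "Luc1-203": 51.1,
    "Luc1-450 Peak 1": 72.8,
    "Luc1-450 Peak 2": 84.1,
}


def rank_truncations(peak_nm, truncations):
    """Return (name, offset_nm) pairs sorted by |offset|; offset = truncation - peak."""
    offsets = [(name, value - peak_nm) for name, value in truncations.items()]
    return sorted(offsets, key=lambda item: abs(item[1]))


for peak, value in REFOLDING_PEAKS.items():
    (best, best_offset), (second, second_offset) = rank_truncations(value, TRUNCATIONS)[:2]
    print(f"{peak} ({value:.1f} nm): closest {best} ({best_offset:+.1f} nm), "
          f"then {second} ({second_offset:+.1f} nm)")
```

For R2 the two closest candidates are Luc433-550 and Luc46-180, consistent with the suggestion that this peak may contain contributions from more than one truncation.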

Discussion

Here we use AFM-based SMFS and simulation to identify the folding nuclei that initialize refolding of Luciferase. Previous experiments found that Luciferase was unable to refold in an appreciable amount of time (22, 23) and that Luciferase must either acquire its native fold through the one-time event of cotranslational folding (25, 26) or through interaction with chaperones (24, 47). We find that the refolding of Luciferase can begin under certain conditions, in the absence of chaperones, and that the folding complexity is mainly due to competing substructures. There were four major conclusions from our experiments and simulations, as follows.

First, we observe that the presence of an unfoldase-like (71, 72, 73) interaction helps to initialize refolding of Luciferase. Previous single-molecule attempts to refold Luciferase often encountered misfolding or nonspecific interactions (47, 70), unless chaperones were provided. Here, we use AFM as the ultimate unfoldase, and we observe force-extension curves that either show no refolding or contain an event that matches what would be expected for the refolding of one of the domains of Luciferase (Fig. 3). This seems to support the idea that some chaperone proteins may affect refolding positively by binding and prestretching the molecule to alleviate nonspecific interactions (74, 75, 76). We never observed complete refolding of Luciferase, although our refolding time was extended only up to 10 s, whereas in principle Luciferase may take hours to refold (22, 23). Such timescales are currently beyond the temporal stability of our AFM, though we plan to implement new techniques to considerably improve this stability (90) and allow longer refolding times in the future.

Second, we find that Luciferase initially refolds through competing pathways that involve either the N-terminal domain or the middle domain. Simulations of the folding pathway predict that the majority of successful refolding occurs through the N-terminal domain, as such pathways were obtained 52% of the time in Monte Carlo simulations from Markov state models of refolding (Fig. 4). Refolding experiments also detected a distribution of contour-length increments that closely matched the correct unfolding of N-terminal domain residues 1–203 or residues 1–268, which were obtained 41% of the time in experiments (R3 and R5; Fig. 3). The middle domain of Luciferase, residues 268–440, was identified as the second most common starting point for refolding: it occurred in 35% of pathways in simulation (Fig. 4). The middle domain of Luciferase was also definitively identified in refolding experiments (R4; Fig. 3), where it accounted for at least 14% of refolding events. Thus, both simulations and experiments are in agreement with a vectorial folding of Luciferase, which predisposes Luciferase to optimal cotranslational folding.

Third, we find that the topology of Luciferase allows for several viable substructures. Folding simulations revealed distinct and consistent states where Luciferase incrementally acquires its native fold (Fig. 4). All tested substructures were found to be mechanically stable and have a contour-length increment expected from the hypothetical structure (Fig. 7). This implies that Luciferase is capable of forming native interactions incrementally, and the pathway from the unfolded to folded state may only transition through a finite number of consistent substructures. We note that although these substructures are mechanically stable, their atomic structure may deviate from prediction and require further biophysical characterization to understand their role in the folding pathway. We plan to carry out these characterizations on truncated mutants, without flanking domains, in future work.

Fourth, our simulations of folding lend insight into the type of nonnative interactions that can occur in Luciferase (Fig. 5). All examples of these nonnative structures were caused by folding into an incorrect topology in the loops between N-terminal helices. Monte Carlo simulations using the MSM transition matrix indicate that the nonnative interactions originated from the C-terminal domain, middle domain, and N-terminal domain with probabilities of 88, 36, and 17%, respectively. These observations suggest that, due to the topology of Luciferase, nonnative interactions can be avoided by initializing folding from the N-terminal domain, which is the predominant initialization of refolding that we observed experimentally. These observations also suggest that, under denaturing conditions such as heat shock, the unfolded C-terminal domain may contribute to the onset of misfolding and guide the protein to a nonnative structure. We plan to test these hypotheses in future work.

Conclusions

Whereas much is known about protein folding in general, the folding of large, multidomain proteins is still a difficult problem that has not been solved. Here we make use of coarse-grained simulation and SMFS to broach several steps in the folding of Luciferase, a model multidomain protein. Our results indicate that Luciferase initializes its folding from competing pathways, either from a core N-terminal domain or through the middle domain. This is similar to what has been expected to occur under cotranslational conditions and our results likely represent key components of the intermediate structures along this pathway. Our observation that folding from the C-terminal domain leads to nonnative structures suggests a mechanism for chaperones to prevent misfolding by eliminating this pathway.

Author Contributions

Z.N.S., W.Y., and P.E.M. designed research and wrote the paper. Z.N.S. performed research and analyzed data.

Acknowledgments

We are grateful to Prof. J. Clarke (University of Cambridge, UK) for providing the pAFM 1–8 plasmid. The authors are grateful to Dr. João Nunes for insightful discussion.

This work was supported by NIH grant No. R01GM061870-13, National Science Foundation (NSF) GRFP grant No. 1106401 and the Katherine Goodman Stern Fellowship to Z.N.S., and by National Science Foundation (NSF) grant No. MCB-1517245 to P.E.M. and W.Y.

Editor: Daniel Muller.

Contributor Information

Zackary N. Scholl, Email: scholl@ualberta.edu.

Piotr E. Marszalek, Email: pemar@duke.edu.

Supporting Material

Document S1. Figs. S1–S18
mmc1.pdf (5.4MB, pdf)
Document S2. Article plus Supporting Material
mmc2.pdf (9.2MB, pdf)

References

  • 1. Gruebele M., Dave K., Sukenik S. Globular protein folding in vitro and in vivo. Annu. Rev. Biophys. 2016;45:233–251. doi: 10.1146/annurev-biophys-062215-011236.
  • 2. Dobson C.M. Protein folding and misfolding. Nature. 2003;426:884–890. doi: 10.1038/nature02261.
  • 3. Fersht A. W. H. Freeman; New York: 1999. Structure and Mechanism in Protein Science: A Guide to Enzyme Catalysis and Protein Folding.
  • 4. Dill K.A., MacCallum J.L. The protein-folding problem, 50 years on. Science. 2012;338:1042–1046. doi: 10.1126/science.1219021.
  • 5. Gruebele M. Protein folding: the free energy surface. Curr. Opin. Struct. Biol. 2002;12:161–168. doi: 10.1016/s0959-440x(02)00304-4.
  • 6. Bartlett A.I., Radford S.E. An expanding arsenal of experimental methods yields an explosion of insights into protein folding mechanisms. Nat. Struct. Mol. Biol. 2009;16:582–588. doi: 10.1038/nsmb.1592.
  • 7. Englander S.W., Mayne L., Krishna M.M.G. Protein folding and misfolding: mechanism and principles. Q. Rev. Biophys. 2007;40:287–326. doi: 10.1017/S0033583508004654.
  • 8. Onuchic J.N., Wolynes P.G. Theory of protein folding. Curr. Opin. Struct. Biol. 2004;14:70–75. doi: 10.1016/j.sbi.2004.01.009.
  • 9. Eaton W.A., Muñoz V., Hofrichter J. Fast kinetics and mechanisms in protein folding. Annu. Rev. Biophys. Biomol. Struct. 2000;29:327–359. doi: 10.1146/annurev.biophys.29.1.327.
  • 10. Schueler-Furman O., Wang C., Baker D. Progress in modeling of protein structures and interactions. Science. 2005;310:638–642. doi: 10.1126/science.1112160.
  • 11. Han J.-H., Batey S., Clarke J. The folding and evolution of multidomain proteins. Nat. Rev. Mol. Cell Biol. 2007;8:319–330. doi: 10.1038/nrm2144.
  • 12. Peng Q., Li H. Atomic force microscopy reveals parallel mechanical unfolding pathways of T4 lysozyme: evidence for a kinetic partitioning mechanism. Proc. Natl. Acad. Sci. USA. 2008;105:1885–1890. doi: 10.1073/pnas.0706775105.
  • 13. Sander I.M., Chaney J.L., Clark P.L. Expanding Anfinsen’s principle: contributions of synonymous codon selection to rational protein design. J. Am. Chem. Soc. 2014;136:858–861. doi: 10.1021/ja411302m.
  • 14. Kaiser C.M., Goldman D.H., Bustamante C. The ribosome modulates nascent protein folding. Science. 2011;334:1723–1727. doi: 10.1126/science.1209740.
  • 15. Kimchi-Sarfaty C., Oh J.M., Gottesman M.M. A “silent” polymorphism in the MDR1 gene changes substrate specificity. Science. 2007;315:525–528. doi: 10.1126/science.1135308.
  • 16. Chaney J.L., Clark P.L. Roles for synonymous codon usage in protein biogenesis. Annu. Rev. Biophys. 2015;44:143–166. doi: 10.1146/annurev-biophys-060414-034333.
  • 17. Hartl F.U., Hayer-Hartl M. Converging concepts of protein folding in vitro and in vivo. Nat. Struct. Mol. Biol. 2009;16:574–581. doi: 10.1038/nsmb.1591.
  • 18. Gloge F., Becker A.H., Bukau B. Co-translational mechanisms of protein maturation. Curr. Opin. Struct. Biol. 2014;24:24–33. doi: 10.1016/j.sbi.2013.11.004.
  • 19. Pirchi M., Ziv G., Haran G. Single-molecule fluorescence spectroscopy maps the folding landscape of a large protein. Nat. Commun. 2011;2:493. doi: 10.1038/ncomms1504.
  • 20. Chakraborty K., Chatila M., Hayer-Hartl M. Chaperonin-catalyzed rescue of kinetically trapped states in protein folding. Cell. 2010;142:112–122. doi: 10.1016/j.cell.2010.05.027.
  • 21. Zhou H.-X., Dill K.A. Stabilization of proteins in confined spaces. Biochemistry. 2001;40:11289–11293. doi: 10.1021/bi0155504.
  • 22. Herbst R., Schäfer U., Seckler R. Equilibrium intermediates in the reversible unfolding of firefly (Photinus pyralis) luciferase. J. Biol. Chem. 1997;272:7099–7105. doi: 10.1074/jbc.272.11.7099.
  • 23. Herbst R., Gast K., Seckler R. Folding of firefly (Photinus pyralis) luciferase: aggregation and reactivation of unfolding intermediates. Biochemistry. 1998;37:6586–6597. doi: 10.1021/bi972928i.
  • 24. Schumacher R.J., Hurst R., Matts R.L. ATP-dependent chaperoning activity of reticulocyte lysate. J. Biol. Chem. 1994;269:9493–9499.
  • 25. Frydman J., Erdjument-Bromage H., Hartl F.U. Co-translational domain folding as the structural basis for the rapid de novo folding of firefly luciferase. Nat. Struct. Biol. 1999;6:697–705. doi: 10.1038/10754.
  • 26. Svetlov M.S., Kommer A., Spirin A.S. Effective cotranslational folding of firefly luciferase without chaperones of the Hsp70 family. Protein Sci. 2006;15:242–247. doi: 10.1110/ps.051752506.
  • 27. Ding F., Guo W., Shea J.-E. Reconstruction of the src-SH3 protein domain transition state ensemble using multiscale molecular dynamics simulations. J. Mol. Biol. 2005;350:1035–1050. doi: 10.1016/j.jmb.2005.05.017.
  • 28. Shaw D.E., Maragakis P., Wriggers W. Atomic-level characterization of the structural dynamics of proteins. Science. 2010;330:341–346. doi: 10.1126/science.1187409.
  • 29. Itzhaki L.S., Neira J.L., Fersht A.R. Search for nucleation sites in smaller fragments of chymotrypsin inhibitor 2. J. Mol. Biol. 1995;254:289–304. doi: 10.1006/jmbi.1995.0617.
  • 30. Kurt N., Rajagopalan S., Cavagnero S. Effect of hsp70 chaperone on the folding and misfolding of polypeptides modeling an elongating protein chain. J. Mol. Biol. 2006;355:809–820. doi: 10.1016/j.jmb.2005.10.029.
  • 31. de Prat Gay G., Ruiz-Sanz J., Fersht A.R. Conformational pathway of the polypeptide chain of chymotrypsin inhibitor-2 growing from its N terminus in vitro. Parallels with the protein folding pathway. J. Mol. Biol. 1995;254:968–979. doi: 10.1006/jmbi.1995.0669.
  • 32. Neira J.L., Itzhaki L.S., Fersht A.R. Following co-operative formation of secondary and tertiary structure in a single protein module. J. Mol. Biol. 1997;268:185–197. doi: 10.1006/jmbi.1997.0932.
  • 33. Neira J.L., Fersht A.R. Acquisition of native-like interactions in C-terminal fragments of barnase. J. Mol. Biol. 1999;287:421–432. doi: 10.1006/jmbi.1999.2602.
  • 34. Gay G.D.P., Ruiz-Sanz J., Fersht A.R. Folding of a nascent polypeptide chain in vitro: cooperative formation of structure in a protein module. Proc. Natl. Acad. Sci. USA. 1995;92:3683–3686. doi: 10.1073/pnas.92.9.3683.
  • 35. Bertz M., Rief M. Mechanical unfoldons as building blocks of maltose-binding protein. J. Mol. Biol. 2008;378:447–458. doi: 10.1016/j.jmb.2008.02.025.
  • 36. Neira J.L., Fersht A.R. Exploring the folding funnel of a polypeptide chain by biophysical studies on protein fragments. J. Mol. Biol. 1999;285:1309–1333. doi: 10.1006/jmbi.1998.2249.
  • 37. Chow C.C., Chow C., Cavagnero S. Chain length dependence of apomyoglobin folding: structural evolution from misfolded sheets to native helices. Biochemistry. 2003;42:7090–7099. doi: 10.1021/bi0273056.
  • 38. Neuman K.C., Nagy A. Single-molecule force spectroscopy: optical tweezers, magnetic tweezers and atomic force microscopy. Nat. Methods. 2008;5:491–505. doi: 10.1038/nmeth.1218.
  • 39. Borgia A., Williams P.M., Clarke J. Single-molecule studies of protein folding. Annu. Rev. Biochem. 2008;77:101–125. doi: 10.1146/annurev.biochem.77.060706.093102.
  • 40. Hoffmann T., Dougan L. Single molecule force spectroscopy using polyproteins. Chem. Soc. Rev. 2012;41:4781–4796. doi: 10.1039/c2cs35033e.
  • 41. Noy A., Friddle R.W. Practical single molecule force spectroscopy: how to determine fundamental thermodynamic parameters of intermolecular bonds with an atomic force microscope. Methods. 2013;60:142–150. doi: 10.1016/j.ymeth.2013.03.014.
  • 42. Zoldák G., Rief M. Force as a single molecule probe of multidimensional protein energy landscapes. Curr. Opin. Struct. Biol. 2013;23:48–57. doi: 10.1016/j.sbi.2012.11.007.
  • 43. Mashaghi A., Kramer G., Tans S.J. Chaperone action at the single-molecule level. Chem. Rev. 2014;114:660–676. doi: 10.1021/cr400326k.
  • 44. Woodside M.T., Block S.M. Reconstructing folding energy landscapes by single-molecule force spectroscopy. Annu. Rev. Biophys. 2014;43:19–39. doi: 10.1146/annurev-biophys-051013-022754.
  • 45. Carrion-Vazquez M., Oberhauser A.F., Fernandez J.M. Mechanical and chemical unfolding of a single protein: a comparison. Proc. Natl. Acad. Sci. USA. 1999;96:3694–3699. doi: 10.1073/pnas.96.7.3694.
  • 46. Dietz H., Rief M. Protein structure by mechanical triangulation. Proc. Natl. Acad. Sci. USA. 2006;103:1244–1247. doi: 10.1073/pnas.0509217103.
  • 47. Scholl Z.N., Yang W., Marszalek P.E. Chaperones rescue luciferase folding by separating its domains. J. Biol. Chem. 2014;289:28607–28618. doi: 10.1074/jbc.M114.582049.
  • 48. Scholl Z.N., Josephs E.A., Marszalek P.E. Modular, nondegenerate polyprotein scaffolds for atomic force spectroscopy. Biomacromolecules. 2016;17:2502–2505. doi: 10.1021/acs.biomac.6b00548.
  • 49. Kim M., Abdi K., Marszalek P.E. Fast and forceful refolding of stretched alpha-helical solenoid proteins. Biophys. J. 2010;98:3086–3092. doi: 10.1016/j.bpj.2010.02.054.
  • 50. Hess B., Kutzner C., Lindahl E. GROMACS 4: algorithms for highly efficient, load-balanced, and scalable molecular simulation. J. Chem. Theory Comput. 2008;4:435–447. doi: 10.1021/ct700301q.
  • 51. Clementi C. Coarse-grained models of protein folding: toy models or predictive tools? Curr. Opin. Struct. Biol. 2008;18:10–15. doi: 10.1016/j.sbi.2007.10.005.
  • 52. Noel J.K., Levi M., Whitford P.C. SMOG 2: a versatile software package for generating structure-based models. PLOS Comput. Biol. 2016;12:e1004794. doi: 10.1371/journal.pcbi.1004794.
  • 53. Franks N.P., Jenkins A., Brick P. Structural basis for the inhibition of firefly luciferase by a general anesthetic. Biophys. J. 1998;75:2205–2211. doi: 10.1016/S0006-3495(98)77664-7.
  • 54. Abraham M.J., Murtola T., Lindahl E. GROMACS: high performance molecular simulations through multi-level parallelism from laptops to supercomputers. SoftwareX. 2015;1:19–25.
  • 55. Pande V.S., Beauchamp K., Bowman G.R. Everything you wanted to know about Markov state models but were afraid to ask. Methods. 2010;52:99–105. doi: 10.1016/j.ymeth.2010.06.002.
  • 56. Chodera J.D., Noé F. Markov state models of biomolecular conformational dynamics. Curr. Opin. Struct. Biol. 2014;25:135–144. doi: 10.1016/j.sbi.2014.04.002.
  • 57. de Sancho D., Best R.B. Reconciling intermediates in mechanical unfolding experiments with two-state protein folding in bulk. J. Phys. Chem. Lett. 2016;7:3798–3803. doi: 10.1021/acs.jpclett.6b01722.
  • 58. Pedregosa F., Varoquaux G., Duchesnay É. Scikit-learn: machine learning in Python. J. Mach. Learn. Res. 2011;12:2825–2830.
  • 59. van der Walt S., Colbert S.C., Varoquaux G. The NumPy array: a structure for efficient numerical computation. Comput. Sci. Eng. 2011;13:22–30.
  • 60. Shukla D., Hernández C.X., Pande V.S. Markov state models provide insights into dynamic modulation of protein function. Acc. Chem. Res. 2015;48:414–422. doi: 10.1021/ar5002999.
  • 61. Gansner E.R., North S.C. An open graph visualization system and its applications to software engineering. Softw. Pract. Exper. 2000;30:1203–1233.
  • 62. Denburg J.L., Lee R.T., McElroy W.D. Substrate-binding properties of firefly luciferase. I. Luciferin-binding site. Arch. Biochem. Biophys. 1969;134:381–394. doi: 10.1016/0003-9861(69)90297-5.
  • 63. Lee R.T., Denburg J.L., McElroy W.D. Substrate-binding properties of firefly luciferase. II. ATP-binding site. Arch. Biochem. Biophys. 1970;141:38–52. doi: 10.1016/0003-9861(70)90103-7.
  • 64. DeLuca M., McElroy W.D. Kinetics of the firefly luciferase catalyzed reactions. Biochemistry. 1974;13:921–925. doi: 10.1021/bi00702a015.
  • 65. Chang K.H., Xiang H., Dunaway-Mariano D. Acyl-adenylate motif of the acyl-adenylate/thioester-forming enzyme superfamily: a site-directed mutagenesis study with the Pseudomonas sp. strain CBS3 4-chlorobenzoate:coenzyme A ligase. Biochemistry. 1997;36:15650–15659. doi: 10.1021/bi971262p.
  • 66. Kleinkauf H., von Döhren H. A nonribosomal system of peptide biosynthesis. Eur. J. Biochem. 1996;236:335–351. doi: 10.1111/j.1432-1033.1996.00335.x.
  • 67. Porter L.L., Rose G.D. A thermodynamic definition of protein domains. Proc. Natl. Acad. Sci. USA. 2012;109:9420–9425. doi: 10.1073/pnas.1202604109.
  • 68. Alexandrov N., Shindyalov I. PDP: protein domain parser. Bioinformatics. 2003;19:429–430. doi: 10.1093/bioinformatics/btg006.
  • 69. Conti E., Franks N.P., Brick P. Crystal structure of firefly luciferase throws light on a superfamily of adenylate-forming enzymes. Structure. 1996;4:287–298. doi: 10.1016/s0969-2126(96)00033-0.
  • 70. Mashaghi A., Mashaghi S., Tans S.J. Misfolding of luciferase at the single-molecule level. Angew. Chem. 2014;126:10558–10561. doi: 10.1002/anie.201405566.
  • 71. Shtilerman M., Lorimer G.H., Englander S.W. Chaperonin function: folding by forced unfolding. Science. 1999;284:822–825. doi: 10.1126/science.284.5415.822.
  • 72. Lin Z., Madan D., Rye H.S. GroEL stimulates protein folding through forced unfolding. Nat. Struct. Mol. Biol. 2008;15:303–311. doi: 10.1038/nsmb.1394.
  • 73. Sharma S.K., De los Rios P., Goloubinoff P. The kinetic parameters and energy cost of the Hsp70 chaperone as a polypeptide unfoldase. Nat. Chem. Biol. 2010;6:914–920. doi: 10.1038/nchembio.455.
  • 74. Finka A., Mattoo R.U., Goloubinoff P. Experimental milestones in the discovery of molecular chaperones as polypeptide unfolding enzymes. Annu. Rev. Biochem. 2016;85:715–742. doi: 10.1146/annurev-biochem-060815-014124.
  • 75. De Los Rios P., Goloubinoff P. Hsp70 chaperones use ATP to remodel native protein oligomers and stable aggregates by entropic pulling. Nat. Struct. Mol. Biol. 2016;23:766–769. doi: 10.1038/nsmb.3283.
  • 76. Sousa R., Liao H.-S., Lafer E.M. Clathrin-coat disassembly illuminates the mechanisms of Hsp70 force generation. Nat. Struct. Mol. Biol. 2016;23:821–829. doi: 10.1038/nsmb.3272.
  • 77. Bouchiat C., Wang M.D., Croquette V. Estimating the persistence length of a worm-like chain molecule from force-extension measurements. Biophys. J. 1999;76:409–413. doi: 10.1016/s0006-3495(99)77207-3.
  • 78. Cecconi C., Shank E.A., Marqusee S. Direct observation of the three-state folding of a single protein molecule. Science. 2005;309:2057–2060. doi: 10.1126/science.1116702.
  • 79. Elms P.J., Chodera J.D., Marqusee S. The molten globule state is unusually deformable under mechanical force. Proc. Natl. Acad. Sci. USA. 2012;109:3796–3801. doi: 10.1073/pnas.1115519109.
  • 80. Chen H., Yuan G., Yan J. Dynamics of equilibrium folding and unfolding transitions of titin immunoglobulin domain under constant forces. J. Am. Chem. Soc. 2015;137:3540–3546. doi: 10.1021/ja5119368.
  • 81. Alemany A., Rey-Serra B., Ritort F. Mechanical folding and unfolding of protein barnase at the single-molecule level. Biophys. J. 2016;110:63–74. doi: 10.1016/j.bpj.2015.11.015.
  • 82. das Eiras Nunes, J. M. 2013. Tracking chaperone-mediated folding using force spectroscopy. Ph.D. thesis, Technical University of Dresden. https://www.researchgate.net/publication/299537797.
  • 83. Fernandez J.M., Oberhauser A.F., Carrion-Vazquez M. Single protein misfolding events captured by atomic force microscopy. Nat. Struct. Biol. 1999;6:1025–1028. doi: 10.1038/14907.
  • 84. Wright C.F., Teichmann S.A., Dobson C.M. The importance of sequence diversity in the aggregation and evolution of proteins. Nature. 2005;438:878–881. doi: 10.1038/nature04195.
  • 85. Nunes J.M., Mayer-Hartl M., Müller D.J. Action of the Hsp70 chaperone system observed with single proteins. Nat. Commun. 2015;6:6307. doi: 10.1038/ncomms7307.
  • 86. Scholl Z.N., Li Q., Marszalek P.E. Single molecule mechanical manipulation for studying biological properties of proteins, DNA, and sugars. Wiley Interdiscip. Rev. Nanomed. Nanobiotechnol. 2014;6:211–229. doi: 10.1002/wnan.1253.
  • 87. Lane T.J., Shukla D., Pande V.S. To milliseconds and beyond: challenges in the simulation of protein folding. Curr. Opin. Struct. Biol. 2013;23:58–65. doi: 10.1016/j.sbi.2012.11.002.
  • 88. Bowman G.R., Voelz V.A., Pande V.S. Taming the complexity of protein folding. Curr. Opin. Struct. Biol. 2011;21:4–11. doi: 10.1016/j.sbi.2010.10.006.
  • 89. Reddy G., Thirumalai D. Dissecting ubiquitin folding using the self-organized polymer model. J. Phys. Chem. B. 2015;119:11358–11370. doi: 10.1021/acs.jpcb.5b03471.
  • 90. King G.M., Carter A.R., Perkins T.T. Ultrastable atomic force microscopy: atomic-scale stability and registration in ambient conditions. Nano Lett. 2009;9:1451–1456. doi: 10.1021/nl803298q.


Articles from Biophysical Journal are provided here courtesy of The Biophysical Society
