Author manuscript; available in PMC: 2009 Jul 18.
Published in final edited form as: J Mol Biol. 2008 May 17;380(4):742–756. doi: 10.1016/j.jmb.2008.05.023

Backrub-like backbone simulation recapitulates natural protein conformational variability and improves mutant side-chain prediction

Colin A Smith 1,2,3,4,*, Tanja Kortemme 1,2,3,*
PMCID: PMC2603262  NIHMSID: NIHMS59641  PMID: 18547585

Summary

Incorporation of effective backbone sampling into protein simulation and design is an important step in increasing the accuracy of computational protein modeling. Recent analysis of high-resolution crystal structures has suggested a new model, termed backrub, to describe localized, hinge-like alternative backbone and side chain conformations observed in the crystal lattice. The model involves internal backbone rotations about axes between Cα atoms. Based on this observation, we have implemented a backrub-inspired sampling method in the Rosetta structure prediction and design program. We evaluate this model of backbone flexibility using three different tests. First, we show that Rosetta backrub simulations recapitulate the correlation between backbone and side-chain conformations in the high-resolution crystal structures upon which the model was based. As a second test of backrub sampling, we show that backbone flexibility improves the accuracy of predicting point-mutant side chain conformations over fixed backbone rotameric sampling alone. Finally, we show that backrub sampling of triosephosphate isomerase loop 6 can capture the ms/µs oscillation between the open and closed states observed in solution. Our results suggest that backrub sampling captures a sizable fraction of localized conformational changes that occur in natural proteins. Application of this simple model of backbone motions may significantly improve both protein design and atomistic simulations of localized protein flexibility.

Keywords: flexible backbone sampling, backrub motion, point mutation, Monte Carlo, triosephosphate isomerase loop 6

Introduction

Proteins undergo conformational fluctuations in response to thermal energy, binding events, and mutation. Understanding and predicting such excursions around the native state of a protein is a key challenge in computational molecular biology. Side chain sampling1 has been shown to be an extremely useful first-order method for predicting small-scale conformational change. Successful applications include protein-protein docking2,3, total redesign of protein sequences4,5, and redesign of both protein-protein6 and protein-DNA7 interfaces. However, one key approximation made by many of these applications is keeping the backbone structure fixed. In actual proteins the backbone often undergoes subtle shifts in response to binding events8 or sequence changes9. Successfully capturing such near-native shifts is thus important for many docking and design applications.

Numerous methods have been developed to take backbone flexibility into account for both the whole protein and local subsections. Molecular dynamics is currently one of the most pervasive methods. However, in the absence of a steep energy gradient, dynamics depend on random thermal velocities and a long sequence of time steps to sample motions as simple as a rotamer change. Monte Carlo minimization of backbone torsion angles10–12 has also been very successful, but can result in highly non-local displacements of the protein backbone and becomes less efficient with greater protein size. Insertion of peptide fragments has been used for de novo protein structure prediction13 and loop prediction14, but causes similar propagating changes. Several non-local sampling techniques have been applied to protein design including random torsion angle sampling15 and more correlated methods such as fragment insertion16, parameterized coiled-coils17, and normal mode analysis18. These methods make use of patterns commonly observed in protein structures or a harmonic approximation of intra-protein interactions to increase backbone sampling efficiency. Other methods have addressed the problem of making local perturbations using heuristics to iteratively optimize backbone torsion angles until distortions of covalent geometry are minimized19–21, but those techniques sometimes leave strained chain junctions that must be relaxed with other algorithms. Another method, called wriggling22, was developed to make partially local moves in which groups of four torsion angles are changed simultaneously to minimize the displacement of distant atoms.

Deformations of protein backbones are truly local only if all consecutive atoms beyond the perturbed region remain fixed. Several local methods exist, the first being introduced by Go and Scheraga23 with numerous subsequent refinements and adaptations24–27. These methods involve making a random prerotation of one or more backbone angles, followed by solving a geometric constraint equation for six other backbone degrees of freedom to maintain the locality of the move. Several of the methods incorporated bond angle sampling, either as part of the prerotation24,27, or both the prerotation and the solved constraint equation26. The latter work also biased the prerotations towards less perturbed backbone conformations. The implementation of these methods is more complex than other common techniques like rotamer sampling. Another drawback is that such loop closure methods are biased towards proposing moves that satisfy bonded, geometric constraints, whose multiple free rotation axes can lead to radically different conformations, often with substantial steric clashes and unsatisfied hydrogen bonds. Those non-bonded factors are particularly relevant in highly packed protein cores and interfaces.

The work described here, instead of being motivated by geometric constraints, derives its motional model from conformational variations observed in high-resolution (≤ 1Å) crystal structures28. The fluctuations observed in the crystal lattice motivated Davis et al. to create a simple model, called Backrub, for subtle backbone shifts using just three residues. The core idea in this work is to use that type of motion, observed in nature, to computationally sample backbone configurations in a generalized scheme. A similar move set was recently described29 in the context of a simplified energy function. Here, we investigate the utility of the backrub move to sample conformations in the context of the Rosetta all-atom force field. Rosetta has been successfully used for protein-protein docking3, protein-ligand docking30, redesign of protein cores16, design of new protein interface specificities6, and de novo prediction of small protein structures31. As an initial test, we recapitulate the backbone/side-chain correlations observed in the same high-resolution structures that inspired the Backrub model. We go on to show that backrub backbone flexibility improves side-chain modeling of point mutations. Finally, as a demonstration of the method’s potential, we present a proof-of-concept simulation showing efficient sampling of the opening and closing of triosephosphate isomerase loop 6. Our results indicate that the backbone sampling described here captures a sizable fraction of the subtle conformational variability found in folded proteins.

Results

We implemented the backrub sampling protocol inspired by motion observed in protein structures28 (see Figure 1, Figure 2, and Methods), and evaluated it using three different tests: First, we sought to determine whether the motional model, combined with an all-atom force field, could recapitulate the variation seen in occurrences of a backrub motion in high-resolution crystal structures. Secondly, we test whether backrub sampling can improve the accuracy of modeling small backbone and side chain conformational changes in response to single point mutations in a set of crystal structure pairs. Finally, we show simulations indicating that backrub sampling can capture conformational variability observed in a long time-scale loop motion.

Figure 1.

Schematic showing the generalized backrub move. Moves are made by first randomly selecting the polypeptide backbone segment size, typically 2–12 residues, then randomly selecting a starting residue compatible with the selected segment size. The Cα atoms of the starting and ending residues define the rotation axis. All atoms between the two Cα atoms are then rotated about that axis by a random angle up to 11–40 degrees, depending on the segment size. To minimize the bond angle penalty imposed by full atom force fields, precise placement of branching Cβ and hydrogen atoms is done using quadratic equations that describe the relationship between the backbone bond angle and branching atom spherical coordinates (see Materials and Methods).
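As a rough illustration of the geometry described in this caption, the following Python sketch rotates every atom lying between two pivot Cα atoms about the axis joining them, using Rodrigues' rotation formula. It is a minimal sketch and not the Rosetta implementation: the function names, the flat list of coordinates, and the index-based choice of pivots are assumptions made for the example, and the Cβ and hydrogen placement step is omitted.

```python
import math


def rotate_about_axis(point, axis_start, axis_end, angle_deg):
    """Rotate `point` about the line through axis_start and axis_end
    by angle_deg degrees (Rodrigues' rotation formula)."""
    # Unit vector along the rotation axis.
    k = [e - s for s, e in zip(axis_start, axis_end)]
    norm = math.sqrt(sum(c * c for c in k))
    k = [c / norm for c in k]
    # Position of the point relative to a point on the axis.
    v = [p - s for p, s in zip(point, axis_start)]
    theta = math.radians(angle_deg)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    k_dot_v = sum(a * b for a, b in zip(k, v))
    k_cross_v = [
        k[1] * v[2] - k[2] * v[1],
        k[2] * v[0] - k[0] * v[2],
        k[0] * v[1] - k[1] * v[0],
    ]
    rotated = [
        v[i] * cos_t + k_cross_v[i] * sin_t + k[i] * k_dot_v * (1.0 - cos_t)
        for i in range(3)
    ]
    return [r + s for r, s in zip(rotated, axis_start)]


def backrub_move(coords, start, end, angle_deg):
    """Return a copy of `coords` in which all atoms strictly between the
    pivot atoms at indices `start` and `end` are rotated about the
    pivot-pivot axis. The pivots and everything outside stay fixed."""
    new_coords = [list(c) for c in coords]
    for i in range(start + 1, end):
        new_coords[i] = rotate_about_axis(
            coords[i], coords[start], coords[end], angle_deg
        )
    return new_coords
```

Because only atoms between the two pivots move, the perturbation stays local, which is the property the caption emphasizes. In a real simulation the segment size, starting residue, and angle would be drawn at random within the limits given above.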

Figure 2.

Flow chart depicting the series of substeps taken during a single Monte Carlo step. The proportion of backbone, rotamer, and backbone + rotamer steps are controlled by two parameters. The first, Protamer, specifies the probability of only making a rotamer move. The second, Pbackbone, specifies the probability that only the backbone is modified, given that a rotamer only move type was not selected.
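The two parameters in this flow chart imply a simple two-stage random draw. The sketch below shows one way to code it. The function names are hypothetical, and the numeric values in the usage lines are placeholders, since the caption does not give the values used.

```python
import random


def choose_move_type(p_rotamer, p_backbone, rng=random):
    """Pick one of the three Monte Carlo step types.

    p_rotamer:  probability of a rotamer-only move.
    p_backbone: probability of a backbone-only move, given that a
                rotamer-only move was not selected.
    """
    if rng.random() < p_rotamer:
        return "rotamer"
    if rng.random() < p_backbone:
        return "backbone"
    return "backbone+rotamer"


def move_type_probabilities(p_rotamer, p_backbone):
    """Overall probability of each step type implied by the two parameters."""
    return {
        "rotamer": p_rotamer,
        "backbone": (1.0 - p_rotamer) * p_backbone,
        "backbone+rotamer": (1.0 - p_rotamer) * (1.0 - p_backbone),
    }


# Placeholder values, chosen only to exercise the functions.
probs = move_type_probabilities(p_rotamer=0.5, p_backbone=0.5)
step = choose_move_type(0.5, 0.5)
```

Note that Pbackbone is a conditional probability, so the overall fraction of backbone-only steps is (1 − Protamer) × Pbackbone rather than Pbackbone itself.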

Test 1: Simulation of 3-Residue Backrubs

Davis et al.28 derived the “Backrub” model of protein backbone motion from examples of three-residue segments exhibiting multiple backbone conformations in high-resolution (≤ 1.0 Å) crystal structures. To model those variations, they used the Cα atoms as pivot points and enumerated the three possible rotation axes between them. By manually rotating the backbone around those axes, they were able to model the conformational transitions in a significant number of cases. They catalogued 126 such instances in the PDB that fit their model, the majority of which involved a simultaneous rotamer change. In those cases, the backbone-determined location of the Cβ atom significantly altered the conformation of attached side chain atoms. As an initial test of our generalized backrub sampling method, we used focused Monte Carlo simulations to determine whether we could detect distinct populations of coupled backbone/side-chain conformations centered on coordinates observed in the PDB.

Out of 161 derived starting structures (see Materials and Methods), the majority (105) came from PDB residue entries with χ1 angles of the central side chain, i, occupying multiple rotameric bins (−60°, 60°, 180°). In our analysis, we therefore used the χ1 angle as a one-dimensional representation of the side chain conformation. We used the Cαi,initial–Cαi−1–Cαi+1–Cαi,current pseudo-dihedral angle (τdisp) to represent the backbone conformation of the 3-residue segment (Figure 3). We wanted to determine whether the simulations showed a similar correspondence between side-chain and backbone conformation to that observed in the crystal structures. To answer that question, we calculated τdisp probability distributions for each of the χ1 bins visited during the simulations and compared those distributions to the crystallographic τdisp backbone angles. An example analysis of a simulation showing good agreement with the PDB is given in Figure 4.
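For readers who want to compute τdisp themselves, the pseudo-dihedral can be evaluated with a standard four-point dihedral formula. The sketch below is an illustration only: the function names and the sign convention are assumptions, since the text does not state which rotation direction is taken as positive.

```python
import math


def _sub(a, b):
    return [x - y for x, y in zip(a, b)]


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return [
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    ]


def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle (degrees) defined by four points."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    norm = math.sqrt(_dot(b1, b1))
    b1 = [c / norm for c in b1]
    # Components of b0 and b2 perpendicular to the central axis b1.
    v = [b0[i] - _dot(b0, b1) * b1[i] for i in range(3)]
    w = [b2[i] - _dot(b2, b1) * b1[i] for i in range(3)]
    x = _dot(v, w)
    y = _dot(_cross(b1, v), w)
    return math.degrees(math.atan2(y, x))


def tau_disp(ca_i_initial, ca_prev, ca_next, ca_i_current):
    """Pseudo-dihedral Ca(i,initial)-Ca(i-1)-Ca(i+1)-Ca(i,current)."""
    return dihedral(ca_i_initial, ca_prev, ca_next, ca_i_current)
```

When the central Cα has not moved, the angle is zero; a pure rotation of the central residue about the Cαi−1–Cαi+1 axis changes τdisp by exactly the rotation angle, which is why it serves as a one-dimensional backbone coordinate.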

Figure 3.

For 3-residue backrub analysis, the τdisp angle was used as a one-dimensional representation of the backbone conformation. In simulations, it was defined as the Cαi,initial–Cαi−1–Cαi+1–Cαi,current pseudo dihedral angle (red). In this illustration, the starting atomic coordinates are shown in gray. In some high-resolution PDB structures, alternate Cα coordinates were not provided, so the τdisp angle was instead defined as the Cβi,initial–Cαi−1–Cαi+1–Cβi,alternate pseudo dihedral angle (not shown) for all PDB analysis.

Figure 4.

Example backbone/χ1 populations from a 3-residue backrub simulation of PDB 1PQ7 chain A, residues 62–64, starting from the “B” alternate backbone coordinates. The central side-chain is glutamine. A) To monitor coupled backbone and side-chain conformational changes, we recorded both the Cαi,initial–Cαi−1–Cαi+1–Cαi,current pseudo-dihedral angle (τdisp) and χ1 angle after every Monte Carlo step. Those angles are shown binned into hexagonal arrays55. B) We separated the backbone pseudo-dihedral angles by χ1 angle and generated normalized histograms for bins −60° (red), 60° (green), and 180° (blue, not shown because the overall population was < 0.05%). Circles indicate the population means (〈τdisp | χ1〉). For each alternate Cβ atom position found in the PDB, the Cβi,initial–Cαi−1–Cαi+1–Cβi,alternate dihedral angle (also τdisp) is indicated as a vertical line colored according to the χ1 bin. The overall population of each bin is indicated in the upper right. The RMSD of the population means from the corresponding PDB τdisp angles is shown in the upper left. Representative structures are shown from the C) −60° and D) 60° χ1 bins. The three simulated residues are shown with a ball and stick representation. Other protein residues are shown using a surface representation. The two crystallographic alternate backbone conformations are shown using a wire representation and colored according to the χ1 bin. Images were created with VMD.

A simple binary metric indicating if the simulations correctly captured the side-chain/backbone bias is whether the average backbone conformations for each χ1 bin (〈τdisp | χ1〉, circles in Figure 4B) were in the same relative orientations found in the PDB (vertical lines in Figure 4B). This is easiest to interpret for those residues with PDB side-chain conformations in exactly two χ1 bins, as in Figure 4. There were 98 starting structures where that was the case, and in 76 of those the simulations visited both χ1 bins observed in the PDB, making the comparison possible. Out of these 76, 55 (73%) showed the correct bias, which is significantly better (chi-square p-value 1 × 10⁻⁴) than would be expected at random (50%). When only buried side chains (SASA < 30%, see below) are considered, 15 out of 17 (88%) are correct.
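The significance quoted above can be checked with a one-degree-of-freedom chi-square goodness-of-fit test against a 50% null. The text does not state the exact form of the test, so the version below (no continuity correction) is an assumption; it is included only as a worked check of the arithmetic.

```python
import math


def chi_square_vs_even(successes, total):
    """Chi-square goodness-of-fit test of `successes` out of `total`
    against an expected 50/50 split. Returns (statistic, p_value)."""
    failures = total - successes
    expected = total / 2.0
    chi2 = ((successes - expected) ** 2 + (failures - expected) ** 2) / expected
    # Survival function of the chi-square distribution with 1 degree of
    # freedom, written in terms of the complementary error function.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


chi2, p_value = chi_square_vs_even(55, 76)
print(f"chi2 = {chi2:.2f}, p = {p_value:.1e}")
```

For 55 correct cases out of 76, the statistic is 2 × 17² / 38 ≈ 15.2, and the resulting p-value is on the order of 10⁻⁴, in line with the value reported in the text.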

A comparison between the mean τdisp angles from the simulations and those determined from the PDB shows reasonable agreement (Figure 5). As the accuracy of rotamer prediction has been shown to be strongly dependent on the degree of residue burial32–35, we show results for 24 residues with solvent accessible surface areas (SASA) of < 30%, using the surface area of an extended residue flanked by glycines as the reference SASA. Deviations from the diagonal can result from both scoring/sampling problems in our modeling procedure and uncertainty in the crystallographic fitting. However, there is a reasonable positive correlation (R = 0.64). The correlation becomes clearer when individual simulations (connected by lines) are examined. Nearly all such lines show positive slopes, indicating that the simulations capture the direction of correlated side chain and backbone conformational changes correctly in many cases, albeit with some variation in the absolute magnitude.

Figure 5.

Backbone angular displacement (〈τdisp | χ1〉) is correlated between PDB structures and predicted populations from 3-residue backrub simulations. Points are colored by χ1 bin: −60° (red), 60° (green), and 180° (blue). Points from the same simulation are connected by thin lines. Any connecting lines with positive slopes represent simulations which show the correct bias between side chain conformation and backbone conformation. Disconnected points are from simulations where only one of the χ1 bins seen in the PDB was visited during the simulation. The thick black line shows a least squares linear fit.

A notable observation is that at certain backbone τdisp angles (i.e. < 3° or > 17° in Figure 4), some side chain conformations are completely inaccessible. The intervening backbone conformations form a transitional zone where the rotameric change becomes progressively more energetically favorable. In our simulations, there is little evidence for an energetic barrier between the subtly different backbone conformations. On the other hand, there can be significant energy barriers involved in side chain transitions, particularly in the protein core. Our data indicate that the side chain rotamers may lock the backbone into slightly different conformations, giving rise to the alternate conformations observed by Davis et al.28 This mirrors another simulation study, where a side chain transition played a key role in stabilizing a relatively unconstrained backbone conformational transition36.

Backbone/χ1 Correlation in Crystal Structures Alone

After observing the correlation of backbone conformation with the side chain χ1 angle in our simulations, we wanted to determine whether the same biases could be observed at a global level in the Davis et al.28 dataset, irrespective of the simulation results. To do so, we considered the 68 residues (out of 126) where there were at least two χ1 bins represented in the PDB. For every alternate backbone conformation, we calculated the τdisp angle, using the first conformation as the reference structure. We then normalized the τdisp angles for each of the 68 residues to make the τdisp weighted mean (using PDB occupancies as the weights) of each individual residue 0. The distribution of τdisp for each χ1 bin is shown in Figure 6.
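The per-residue normalization described here amounts to subtracting an occupancy-weighted mean. A minimal sketch follows; the function name and the example numbers are hypothetical and are not taken from the dataset.

```python
def center_by_weighted_mean(tau_disp_angles, occupancies):
    """Shift the tau_disp angles of one residue so that their
    occupancy-weighted mean is zero."""
    total_weight = sum(occupancies)
    weighted_mean = (
        sum(t * w for t, w in zip(tau_disp_angles, occupancies)) / total_weight
    )
    return [t - weighted_mean for t in tau_disp_angles]


# Hypothetical residue with two alternate conformations. The first
# conformation is the reference, so its raw tau_disp is 0 by construction.
centered = center_by_weighted_mean([0.0, 8.0], [0.6, 0.4])
```

Centering each residue this way removes the arbitrary choice of reference conformation, so that angles from different residues can be pooled into the per-bin histograms.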

Figure 6.

In high-resolution crystal structures, alternate backbone conformations are correlated with the side chain χ1 angle, with a straightforward structural explanation. A) Out of 126 residues in the backrub set, 68 have χ1 angles in multiple rotameric bins. For those residues, the calculated Cβi,initial–Cαi−1–Cαi+1–Cβi,alternate pseudo-dihedral angles (τdisp) described in Figure 4B were normalized by the average angle (weighted by PDB occupancy). Histograms of those angles are shown using 2.5° bins and colored by χ1 bin: −60° (red), 60° (green), and 180° (blue). B) The clear difference between the −60°/180° and 60° bins has a straightforward structural explanation, where side chains in the 60° bin push the backbone to the left, and the −60°/180° side chains push the backbone to the right. Hypothetical γ atom positions are colored by χ1 bin.

Interestingly, the distribution of τdisp angles for the 60° χ1 bin was significantly different from the distributions for the −60°/180° χ1 bins (Figure 6A). There was a 5.8° difference in means between the 60° and the joint −60°/180° distributions. The structural explanation for the difference is quite clear when the orientation of the Cβ-C/Oγ bond vector is visualized on a hypothetical backbone with the central side chain pointing up and the Cαi−1 atom in front of the Cαi+1 atom (Figure 6B). In that orientation, the 60° Cβ-C/Oγ vector points nearly perpendicular to the Cαi−1-Cαi+1 axis, pushing the backbone in a counter-clockwise direction. That rotation corresponds to a negative τdisp angle. In the −60°/180° bins, the Cβ-C/Oγ vectors point in the opposite direction but are much less perpendicular. The degree of perpendicularity helps explain the negative skew of the 60° distribution in comparison to the relative symmetry of the −60°/180° distributions.

In principle, the dependence of backbone conformation on side-chain conformation could be used to derive coupled moves in sequence and structural optimization algorithms. For example, the differences in backbone distributions could be used to restrict sampling of backbone conformations when switching into or out of the 60° χ1 rotameric bin.

Test 2: Point Mutant Side Chain Prediction

In addition to the distinct conformations observed in the high-resolution crystal structure dataset discussed above, another context in which subtle backbone differences may be important is residue point mutation. A single-residue point mutation represents the simplest in a series of increasingly difficult structural modeling tasks in which one is given a template and must then predict the new low-energy conformation after a known perturbation. In addition, the ability to accurately predict the conformation of a side chain upon point mutation has direct bearing on the success of protein sequence design algorithms.

We wanted to determine the extent to which generalized backrub sampling could improve the prediction of point mutant side chains, especially when using a fixed rotamer library as commonly done in computational protein design methods. Recently, Bordner and Abagyan37 compiled a large benchmark set of PDB structure pairs differing by a single point mutation. We applied the generalized backrub protocol to locally refine structural models after mutation/fixed backbone rotamer optimization in Rosetta16. We found that overall, incorporation of backrub sampling improved both side chain heavy atom RMSD and χ1/χ2 recovery within 40° (Figure 7). We also found that the local backbone RMSD between PDB structure pairs was correlated with prediction difficulty, in terms of both RMSD and χ1/χ2 recovery. The larger the backbone conformational change upon mutation, the larger the improvement resulting from backrub sampling. In particular, the fraction of pairs with the highest starting RMSD showed the most sizeable improvement. Similar observations were made in a previous study15, which showed improvements in prediction of side chain conformations after core substitutions in T4 lysozyme when comparing flexible with fixed backbone methods. There, backbone flexibility was modeled using a different mechanism employing random continuous adjustments of ±3 degrees to each backbone angle.
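Checking whether a predicted χ angle is "within 40°" of the crystallographic value requires handling the 360° periodicity of torsion angles. The sketch below is one way to do this. Treating the threshold as inclusive and requiring every listed χ angle to match are assumptions, as are the function names.

```python
def angle_difference(a_deg, b_deg):
    """Smallest absolute difference between two angles, in degrees,
    accounting for 360-degree periodicity."""
    d = (a_deg - b_deg) % 360.0
    return min(d, 360.0 - d)


def chi_recovered(predicted_chis, observed_chis, tolerance_deg=40.0):
    """True if every predicted chi angle lies within the tolerance of the
    corresponding observed chi angle."""
    return all(
        angle_difference(p, o) <= tolerance_deg
        for p, o in zip(predicted_chis, observed_chis)
    )


# A chi2 of 170 degrees and one of -175 degrees are only 15 degrees apart.
ok = chi_recovered([-60.0, 170.0], [-65.0, -175.0])
```

Without the wrap-around step, angles near ±180° (the trans rotamer) would be wrongly scored as far apart.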

Figure 7.

Generalized backrub sampling (blue) improves A) the RMSD and B) χ1/χ2 recovery (within 40°) of predicted point mutant side chains over fixed backbone sampling alone (red). Prediction results were sorted by increasing starting backbone RMSD and divided into four equally sized groups. The improvement was most pronounced for structure pairs with a larger starting backbone RMSD. Breaks between groups are marked along the x-axis. Results are shown for 543 non-proline residues for which the solvent accessible surface area of the wild-type residue was < 5%. Residues within a 6 Å radius of the wild-type residue were sampled. In the boxplots, boxes indicate the interquartile range (IQR), thick horizontal lines show the median, and dots show the mean. Whiskers extend to the most extreme data point within 1.5 times the IQR of the 25th or 75th percentile.

In addition to the dependence on initial backbone RMSD, we also investigated how a number of other factors affected the extent of improvement, including the radius within which neighboring residues were allowed to change rotameric conformations (4, 5, 6, 7, and 8 Å from the mutated residue) and the degree of burial of the mutated residue (< 5%, < 30%, and ≤ 100% solvent accessible surface area). Figure 7 shows results from the point mutant predictions that showed the best overall improvement in prediction accuracy, considering < 5% solvent exposure and a 6 Å sampling radius. A complete enumeration of the radius of sampling and degree of burial is shown in supplementary data for both side-chain RMSD and χ1/χ2 recovery. Between a 4 Å and 8 Å sampling radius, the prediction of side chain RMSD does not change significantly. Backrub sampling gives better χ1/χ2 recovery at 6 Å than at any other radius. Considering the amount of residue burial, backrub sampling continues to improve overall RMSD prediction somewhat using a 30% SASA cutoff. When evaluating all residues including those that are largely solvent exposed, backrub sampling still improves predictions for high backbone RMSD pairs, but makes low RMSD pairs slightly worse (see Supplementary Material).

Figure 8 shows examples of backrub sampling improving side chain prediction. The improvement can come from two sources, namely better prediction of the side-chain conformation and better prediction of the protein backbone. In some cases, the side-chain improvement comes at the cost of backbone prediction accuracy, as is shown in the last row of images. However, the worsening of backbone (Cα/Cβ) RMSD is relatively small compared with the improvement in side chain RMSD (Table 1). The source of the error could lie in crystallographic uncertainty, inaccuracies in the scoring function, or compensation for a discretized rotameric side-chain representation.

Figure 8.

Backrub backbone sampling improves side chain prediction, as shown in these examples. The fixed backbone prediction is shown in red and the backrub prediction is shown in blue. The starting PDB structure is shown in green and the target mutated PDB structure is shown in purple. Nitrogen and oxygen atoms are shown in light blue and red, respectively. Examples are sorted by the improvement in mutant residue Cα/Cβ RMSD from fixed backbone to backrub protocols. The modeled mutation M153F in 1KY0 and 1L0J is the same, but 1KY0 has a leucine at positions 118 & 121 whereas 1L0J has a methionine at positions 118 & 121. In both cases, backrub sampling correctly shifts the backbone at residue 153 and better recovers the target side chain. In the last five examples, backrub sampling increases the Cα/Cβ RMSD but improves the side chain prediction. Cα/Cβ and side-chain RMSDs are listed in Table 1. Images were created using ICM Browser.

Table 1.

Cα/Cβ and side-chain RMSD for examples of point mutant predictions pictured in Figure 8. Examples were selected from cases where backrub sampling improved side-chain prediction. The majority of the selected examples also showed improvement in prediction of the backbone, although this was not always the case.

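RMSD values are in Å. "Cα/Cβ" columns give the mutated residue Cα/Cβ RMSD; "SC" columns give the side chain RMSD. Delta = Backrub − Fixed BB.

| PDB:Chain | Mutation | Cα/Cβ Fixed BB | Cα/Cβ Backrub | Cα/Cβ Delta | SC Fixed BB | SC Backrub | SC Delta |
|-----------|----------|----------------|---------------|-------------|-------------|------------|----------|
| 1CV1:A | M111I | 1.13 | 0.21 | −0.91 | 2.41 | 0.37 | −2.04 |
| 1CWU:B | G138A | 1.03 | 0.16 | −0.87 | NA | NA | NA |
| 1KGY:H | G1649Q | 1.22 | 0.42 | −0.80 | 4.54 | 1.21 | −3.33 |
| 1LVE:A | Q89L | 0.64 | 0.09 | −0.55 | 1.14 | 0.30 | −0.84 |
| 2HEC:A | A56F | 0.70 | 0.16 | −0.55 | 3.70 | 0.98 | −2.71 |
| 2MEB:A | L56M | 0.57 | 0.08 | −0.49 | 1.18 | 0.59 | −0.60 |
| 1THP:B | P225Y | 0.75 | 0.31 | −0.44 | 2.24 | 0.38 | −1.86 |
| 1LUW:B | Q30V | 0.64 | 0.23 | −0.41 | 1.01 | 0.32 | −0.69 |
| 1GAD:P | N313T | 0.43 | 0.13 | −0.30 | 2.61 | 0.27 | −2.34 |
| 5EAA:A | S191W | 0.68 | 0.40 | −0.28 | 7.47 | 0.88 | −6.60 |
| 1WKD:A | A102D | 0.46 | 0.22 | −0.24 | 2.74 | 0.57 | −2.17 |
| 1KY0:A | M153F | 0.54 | 0.32 | −0.22 | 3.66 | 0.49 | −3.17 |
| 1NAG:A | G43N | 0.45 | 0.24 | −0.21 | 3.10 | 0.37 | −2.73 |
| 2BQJ:A | A125V | 0.44 | 0.24 | −0.20 | 2.50 | 0.42 | −2.08 |
| 1L0J:A | M153F | 0.34 | 0.24 | −0.10 | 3.61 | 0.24 | −3.37 |
| 1G7L:A | S92W | 0.34 | 0.45 | 0.11 | 3.71 | 0.81 | −2.90 |
| 2TOD:D | A69K | 0.16 | 0.32 | 0.16 | 4.36 | 1.36 | −3.00 |
| 1N7X:A | E45Y | 0.13 | 0.31 | 0.19 | 6.60 | 0.65 | −5.95 |
| 1KX0:C | V207I | 0.14 | 0.44 | 0.29 | 3.29 | 0.76 | −2.53 |
| 1L82:A | L153F | 0.48 | 0.93 | 0.45 | 3.70 | 0.98 | −2.72 |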

Test 3: Triosephosphate Isomerase Loop 6 Simulation

As a third, proof-of-principle test of the backrub sampling protocol, we investigated a much larger conformational change. The hinge motion of triosephosphate isomerase (TIM) loop 6 is a well-characterized example of a protein segment undergoing significant conformational change while maintaining a relatively rigid internal conformation. The Cα RMSD between an 11-residue segment (V167-T177) in the closed (PDB 2YPI38) and open (PDB 1YPI39) conformations is 4.6 Å. We found that by manually rotating the closed segment 50° about the Cα167–Cα177 axis, the Cα RMSD to the open form drops from 4.6 to 1.1 Å, indicating that a backrub simulation may capture much of the conformational variability of the TIM loop. However, such a large rotation introduces several significant steric clashes and adds sizable backbone bond angle strain at residue V167.

We wanted to determine whether generalized backrub sampling of the loop region could capture the same degree of conformational variability without producing energetically unreasonable conformations. Previous studies using molecular dynamics have had difficulty capturing the TIM loop 6 conformational transition40,41. Notably, the simulations required temperatures from 1000–1200 K to see transitions from one state to the other.

We ran simulations starting from both the open conformation (1YPI) and the closed conformation (2YPI). In each simulation, we allowed backrub moves of size 2–12 on residues 165–179. Residues 128–130 showed small but potentially significant changes between the two conformations, so we allowed backrub moves of size 2–3 for those residues. In addition to all of those residues, rotamer changes were allowed for residues whose side chains were in the vicinity of the loop using a 5 Å cutoff and by visual inspection (3, 7, 95, 96, 131, 134, 139, 164, 180, 183, 208, 211, 216, 219, 220, 223, 230). Each simulation was run without the ligand, making the atomic composition identical. We ran the simulations for 1.5 million Monte Carlo moves, using a temperature of 302 K in the Metropolis criterion. Each simulation took 14 hours to complete on a single 2.0GHz Xeon processor.
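The Metropolis acceptance test used at each Monte Carlo move can be sketched as follows. This is a generic illustration, not the Rosetta code: the function name is hypothetical, and interpreting the energy difference in kcal/mol (so that 302 K gives kT ≈ 0.60 kcal/mol) is an assumption.

```python
import math
import random

# Boltzmann constant in kcal/(mol K).
K_B = 0.0019872041


def metropolis_accept(delta_energy, temperature_k=302.0, rng=random):
    """Accept downhill moves always; accept uphill moves with
    probability exp(-delta_energy / kT)."""
    if delta_energy <= 0.0:
        return True
    kt = K_B * temperature_k
    return rng.random() < math.exp(-delta_energy / kt)


# kT at the simulation temperature, under the kcal/mol assumption.
kt_302 = K_B * 302.0
```

Under this assumption, a move that raises the energy by one kT is accepted about 37% of the time (e⁻¹), while strongly unfavorable moves such as severe steric clashes are almost never accepted.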

To analyze the simulation trajectories, we calculated the Cα RMSD of the loop from both the open (1YPI) and closed (2YPI) conformations. The sign of the difference between those RMSDs indicates whether the loop is closer to the open (positive sign) or closed (negative sign) conformation. Starting from the closed conformation with the ligand removed, the backrub simulations were able to oscillate between the open and closed forms of the loop many times during a 1.5 million step simulation. Eight example transitions are pictured in Figure 9, where the minimum RMSD for each approach to the open form ranged from 1.57 to 2.2 Å and the RMSD values for return to the closed form ranged from 1.37 to 2.36 Å. The loop structure (V167-T177) maintained a relatively stable internal conformation over the length of the simulation, with an average aligned Cα RMSD of 1.3 Å (0.3 Å standard deviation) from the starting structure.
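The sign-based classification described above can be written in a few lines. The sketch below assumes that the loop coordinates and both reference structures are already expressed in a common frame (no superposition is performed here); the function names and that assumption are not taken from the text.

```python
import math


def ca_rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equally sized lists of
    Ca coordinates, without any superposition."""
    squared = sum(
        (a - b) ** 2
        for point_a, point_b in zip(coords_a, coords_b)
        for a, b in zip(point_a, point_b)
    )
    return math.sqrt(squared / len(coords_a))


def loop_state(loop_coords, open_ref, closed_ref):
    """Return (RMSD_closed - RMSD_open, label). A positive difference
    means the loop is closer to the open reference."""
    difference = ca_rmsd(loop_coords, closed_ref) - ca_rmsd(loop_coords, open_ref)
    label = "open" if difference > 0.0 else "closed"
    return difference, label
```

Collapsing the two RMSD values into a single signed coordinate makes it easy to count transitions along a trajectory: each sign change corresponds to the loop crossing from one basin toward the other.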

Figure 9.

Generalized backrub sampling captures triosephosphate isomerase loop 6 opening. A) For every 200 accepted moves in the simulations, we calculated the Cα RMSD of the loop from both the open (1YPI, purple lines) and closed (2YPI, green lines) PDB structures. We defined a single reaction coordinate for the simulations as RMSDclosed – RMSDopen (black lines). The green and purple lines are plotted with modified axes such that the black line is the sum of those component lines. The simulation starting in the open conformation is on top and the simulation starting in the closed conformation is on the bottom. The simulation starting from the open conformation makes an initial excursion closer to the closed conformation but then stays open for the remainder. The simulation starting in the closed conformation alternates back and forth between the open and closed conformations at least eight times during the simulation. B) The open structure (1YPI) is shown in purple. The closed structure (2YPI) is shown in green. For reference the substrate analogue, 2-phosphoglycolate, is shown using space fill. (It was not present in either simulation.) The conformation at the numbered Monte Carlo step is shown in black.

We found that the motion of the loop depended on the starting structure, not always showing the opening and closing behavior. In the simulation starting from the open conformation, there was a transient excursion closer to the closed form (within 2.38 Å) at the beginning of the simulation. After that, the loop stayed in a predominantly open conformation for the remainder of the simulation, in some cases migrating to a “hyper open” state up to 8.29/4.23 Å from the closed and open structures, respectively.

There are several possible explanations for the difference between the simulations and the lack of convergence. In addition to possibly needing more sampling to equilibrate, it may be that the anchor points or other fixed regions of the different starting protein structures bias the loop motion. Another possible explanation is that the backrub motions are not sufficiently sampling the internal degrees of freedom in the loop. A likely limitation is that proline residues (at TIM residue positions 168 and 176) are not currently allowed as pivot residues in backrub sampling, thus keeping all of their internal angles fixed. This effect may be substantial in the TIM loop case, as P168 shows a ψ angle change of 40° between the two conformations (Supplemental Figure 12).

To assess the degree to which more complete backrub sampling (not limiting sampling to just the loop 6 region) captures the flexibility of the TIM structure, we ran multiple simulations of the complete TIM dimer (with the constraint of fixing the backbone coordinates of 17 core residues in each monomer). The loop 6 region does indeed show the largest conformational variability: three of the four largest calculated B-factors from those simulations are in the tip of loop 6 (G171–G173; Supplemental Figure 13). In addition to the high calculated B-factors for loop 6, some flexibility was observed in several other regions. These regions also showed structural differences between the open and closed crystal structures.

Discussion

We have shown that the backrub sampling method is useful for sampling small, high-resolution conformational fluctuations as well as a larger, functionally relevant conformational change. In addition to capturing the structural variability of single sequences, generalized backrub sampling also improves modeling of changes to protein structures upon point mutation. While many of the backbone movements are less than 1 Å, they can result in significant displacements of the attached side chains. In addition, the localized breathing motion that backrub sampling emphasizes can allow otherwise energetically unfavorable rotameric transitions.

This work supports the conclusion advanced by Davis et al28 that protein backbones are influenced by side-chain conformations in a predictable manner, complementing the accepted notion that side-chain conformations can be backbone dependent. In backbone-dependent rotamer libraries, the side chain conformation is influenced by the ϕ and ψ angles of the residue itself33. Our simulations and analysis support the notion of a second-order correlation between a central side-chain and the protein backbone in adjacent residues. The 3-residue simulations indicate that the energetic barriers between the relevant backbone conformations can be significantly less than those typically associated with side-chain rotamer transitions.

As a sampling method, the overall philosophy behind the move used here is somewhat different from other methods (although bearing similarities to local perturbation approaches highlighted earlier26,27). First, it is a generalization of movement that is observed in nature at both small and large amplitude. Rotameric sampling was likewise inspired by observations made from crystal structures. Second, instead of treating bond angles as inviolable, it takes advantage of the small but significant flexibility in the bond angle to move a set of backbone atoms through a single, unified rotation. As indicated by the 3-residue simulations, backrub sampling helps free the protein backbone to explore an ensemble of conformations around the native state. While molecular dynamics could be used to accomplish the same goal, correlated movement of atoms can take a considerable number of time steps, unless there is already a set of forces accelerating the atoms in a concerted direction. Simultaneous, correlated rotation of many atoms is one of the strengths of rotamer sampling. Backrub sampling shares that strength.

Backrub moves are biased towards sampling hinge-like protein motion. Another type of motion sometimes seen in proteins is a shearing move, where a subsection of the protein is translated laterally in relation to the remainder of the protein. That type of motion is almost completely orthogonal to the generalized backrub move described in this work. However, the same philosophy as defined originally for the backrub move28,29 and described here could be applied to model shearing moves directly. This would require four Cα atoms as pivot points. One embodiment may consist of rotating Cα2 about Cα1 in the Cα1-Cα2-Cα3 plane, and rotating Cα3 about Cα4 in the Cα2-Cα3-Cα4 plane, such that the Cα2-Cα3 distance is preserved. Though somewhat more complex, this type of move may help model subtle shifts of alpha helices and other structural elements by small but significant distances. While development of additional move sets may prove useful in capturing the full range of protein motion, another promising avenue might involve combination of backrub and rotamer sampling with traditional molecular dynamics in a hybrid Monte Carlo approach. Towards that end, a discussion and validation of the effect of backrub sampling on detailed balance is given in supplementary information.

The combination of backrub-inspired backbone flexibility with side chain sampling and protein design protocols has a number of useful practical applications. First, we show here that employing backrub motions in a high-resolution refinement protocol improves mutant side chain predictions in two large datasets, comprising 126 backrub motions in 19 high-resolution structures and 2,023 pairs of protein point mutant structures. These results suggest that backrub sampling may enhance the applicability and accuracy of methods to estimate the change in fold stability or binding affinity of proteins upon point mutation15,42–45. Second, we show in related work that incorporation of backbone flexibility using the backrub model significantly increases the agreement between modeled side chain conformational variability in folded proteins and side chain relaxation order parameters measured by NMR (Friedland et al., in press). Such simulations may provide insights into protein dynamics and mechanisms of correlated motions. Finally, the use of near-native backbone ensembles has been shown to broaden the set of sequences identified by computational methods and result in successful designs18. Similarly, we find that design simulations employing backrub-generated backbone ensembles predict protein sequence families more similar to those observed by experimental phage display selection methods than predictions using just the crystallographic backbone (Humphris & Kortemme, unpublished data). Given its relative simplicity in implementation and ability to capture relevant conformational changes inspired by observed alternative conformations in high-resolution structures, the backrub method may be generally useful for a broad spectrum of side chain sampling and protein design protocols.

Materials and Methods

Generalized Backrub Move

The backrub move (Figure 1) is applied to an internal protein segment two or more residues long and consists of a geometric rotation by a random angle, τ, about an axis defined by the flanking Cα atoms. The move simultaneously changes 6 internal backbone degrees of freedom in the protein, namely the ϕ and ψ angles at both pivot points and the N-Cα-C bond angle, α, at both pivots. (Variable names follow the conventions of Betancourt29 instead of Davis28, which uses τ for the N-Cα-C bond angle.)
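The geometric core of the move, a rigid rotation of the segment atoms about the axis joining the two flanking Cα atoms, can be sketched with Rodrigues' rotation formula. This is an illustrative sketch only, not the Rosetta implementation; the function names and the plain-tuple coordinate representation are assumptions made for the example.

```python
import math

def rotate_about_axis(point, axis_start, axis_end, tau_degrees):
    """Rotate `point` by tau_degrees about the axis from axis_start to axis_end."""
    tau = math.radians(tau_degrees)
    # Unit vector k along the rotation axis (Calpha to Calpha).
    k = [axis_end[i] - axis_start[i] for i in range(3)]
    norm = math.sqrt(sum(c * c for c in k))
    k = [c / norm for c in k]
    # Vector v from the first pivot to the point being rotated.
    v = [point[i] - axis_start[i] for i in range(3)]
    k_dot_v = sum(k[i] * v[i] for i in range(3))
    k_cross_v = [
        k[1] * v[2] - k[2] * v[1],
        k[2] * v[0] - k[0] * v[2],
        k[0] * v[1] - k[1] * v[0],
    ]
    cos_t, sin_t = math.cos(tau), math.sin(tau)
    # Rodrigues' formula: v cos(t) + (k x v) sin(t) + k (k . v)(1 - cos(t)).
    rotated = [
        v[i] * cos_t + k_cross_v[i] * sin_t + k[i] * k_dot_v * (1.0 - cos_t)
        for i in range(3)
    ]
    return [rotated[i] + axis_start[i] for i in range(3)]

def backrub_move(segment_atoms, ca_start, ca_end, tau_degrees):
    """Apply one rigid rotation to every atom lying between the two pivot Calpha atoms."""
    return [rotate_about_axis(p, ca_start, ca_end, tau_degrees) for p in segment_atoms]
```

Because the rotation is rigid and both pivots lie on the axis, all distances within the segment and to the two pivot Cα atoms are preserved; only the ϕ, ψ, and N-Cα-C angles at the pivots change.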

The sampling strategy employed here is similar to the one described by Betancourt29, in that three types of moves are used, namely backbone only, rotamer only, and rotamer/backbone. However, the move selection is significantly different. We were interested in selectively sampling backbone motion in specified local regions of the protein while keeping other regions fixed. Therefore, we devised a flexible scheme for specifying which parts of the protein structure were variable. At the highest level, the operator indicates for each residue whether to sample the backbone, side chain or both. Backrub moves are only allowed for segments where backbone sampling is enabled for both the beginning (i) and ending (j) residues, and all intervening residues. Because the proline side-chain rejoins to the backbone at the amide nitrogen, it has been excluded as a pivot point. In addition, the minimum and maximum segment size (j−i+1) can be varied. By default, the minimum segment size is 2, corresponding to a rotation of the atoms making up the peptide bond between two consecutive Cα atoms. The default maximum segment size is 12, although higher or lower values may be desired depending on the application.

Given that information, a sparse upper-triangular boolean matrix, B, is created where B[i,j] indicates whether a move starting at residue i and ending at residue j is permissible. B can then be further modified to enable or disable individual residue segments. Before beginning Monte Carlo sampling, a data structure is generated from B that lists each possible segment size, along with all starting residues compatible with that particular segment size. Segment selection then becomes the simple procedure of first selecting a random segment size uniformly from all allowed segment sizes, and then selecting a random segment from all allowed segments with the selected size. As there are fewer long segments than short segments, individual long segments will be selected slightly more often than individual short segments.
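The segment bookkeeping described above can be sketched as follows. This is a minimal illustration under stated assumptions: the sparse matrix B is represented as a set of (i, j) index pairs, residues are zero-indexed, and all function names are hypothetical rather than taken from the Rosetta source.

```python
import random
from collections import defaultdict

def allowed_segments(sample_backbone, is_proline, min_size=2, max_size=12):
    """Return the set of (i, j) pairs that would be true in the matrix B.

    A segment is allowed when backbone sampling is enabled for every residue
    from i to j and neither pivot residue (i or j) is a proline.
    """
    n = len(sample_backbone)
    allowed = set()
    for i in range(n):
        if is_proline[i]:
            continue
        for j in range(i + min_size - 1, min(n, i + max_size)):
            if not all(sample_backbone[i:j + 1]):
                break  # longer segments starting at i would also include the fixed residue
            if not is_proline[j]:
                allowed.add((i, j))
    return allowed

def build_segment_table(allowed):
    """Map each allowed segment size to the list of compatible starting residues."""
    table = defaultdict(list)
    for i, j in sorted(allowed):
        table[j - i + 1].append(i)
    return dict(table)

def choose_segment(table, rng=random):
    """Pick a size uniformly from the allowed sizes, then a start uniformly for that size."""
    size = rng.choice(sorted(table))
    start = rng.choice(table[size])
    return start, start + size - 1
```

For a five-residue stretch with segment sizes limited to 2 and 3, the table holds four size-2 segments and three size-3 segments, so an individual size-3 segment is drawn with probability 1/2 × 1/3 = 1/6 and an individual size-2 segment with probability 1/2 × 1/4 = 1/8, which illustrates the slight bias towards long segments noted above.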

Monte Carlo Sampling Protocol

During the course of an actual Monte Carlo simulation, the protocol described in Figure 2 is used to perform each move. At the beginning of a step, a decision to make a rotamer only move is made according to the adjustable probability, Protamer. The default value of Protamer is 0.25. If a rotamer only move is chosen, a single variable side-chain is randomly selected and a rotamer is chosen from a set generated using a backbone-dependent rotamer library16,46. The rotamer set is initialized using the ϕ/ψ angles from the starting structure and is not updated during the simulation. If a rotamer only move is not made, then a random segment and angle are selected as described previously, and the rotation is applied. At that point, the algorithm decides whether to terminate the move (leaving it as a backbone only move) according to the second adjustable probability, Pbackbone. The default value of Pbackbone is 0.75 to emphasize the more frequently accepted backbone only moves. If the move is not ended, then one or two residues (respective probabilities 0.75 and 0.25) are selected from along the length of the perturbed backbone segment and random rotamers are chosen for those residues. After all structural perturbations are complete, the move is evaluated using the Metropolis criterion and the Rosetta energy function. Constraints on the degree of angular perturbation and example acceptance probabilities are given in subsequent sections.
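The branching logic of a single step can be sketched as below. This is an illustrative outline only: the function names are hypothetical, the structural perturbations and the Rosetta energy evaluation are omitted, and the Metropolis test is written in its standard textbook form.

```python
import math
import random

def choose_move_type(p_rotamer=0.25, p_backbone=0.75, rng=random):
    """Return (move_type, number_of_side_chains_to_resample) for one step."""
    if rng.random() < p_rotamer:
        return "rotamer only", 1
    # A backrub rotation is applied here; then decide whether to stop.
    if rng.random() < p_backbone:
        return "backbone only", 0
    # Otherwise also resample one (p = 0.75) or two (p = 0.25) side chains
    # located along the perturbed segment.
    n_side_chains = 1 if rng.random() < 0.75 else 2
    return "rotamer/backbone", n_side_chains

def metropolis_accept(delta_energy, kT, rng=random):
    """Standard Metropolis criterion: always accept downhill, sometimes accept uphill."""
    if delta_energy <= 0.0:
        return True
    return rng.random() < math.exp(-delta_energy / kT)
```

With the default probabilities, the expected mix is 25% rotamer only moves, 0.75 × 0.75 ≈ 56% backbone only moves, and 0.75 × 0.25 ≈ 19% combined rotamer/backbone moves.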

Rosetta Scoring Function

In a previous implementation29 of the move described here, N-Cα-C bond angles were not energetically scored and were constrained to being within 10° of the median bond angle observed in PDB structures. In this work we used bond angle potentials from the Amber ff94 (ref. 47) and CHARMM22 (ref. 48) force fields.

In addition to the added bond angle term, the Rosetta full-atom scoring function49 uses several bonded terms including a ϕ/ψ angle term based on Ramachandran distributions and a χ angle term based on Dunbrack rotamer statistics. For evaluating non-bonded interactions, Rosetta uses a van der Waals term resembling a Lennard-Jones potential, an explicit geometry-dependent hydrogen bonding term44, a short-range electrostatics term approximated by a residue-specific pairwise distance potential, and the Lazaridis/Karplus implicit solvation model50.

Bond Angle Constraints

In order to reduce the amount of bond angle strain imposed, we sought to bracket the randomly chosen rotation angle such that the bond angle strain never exceeds a threshold value, αmax. We used a previously described method29 to analytically determine the set of τ intervals satisfying that constraint. Briefly, the method involves solving for τ a trigonometric equation that relates α to τ and then plugging in αideal − αmax and αideal + αmax for both the starting and ending residues. The resulting values of τ establish the intervals of allowed τ angles. We term that set of intervals Ibond angle.

To determine how the acceptance ratio decays for increasingly strained bond angles, we performed a long (10⁶ step) Rosetta Monte Carlo simulation using a PDZ domain structure (PDB 2H3L, ref. 51), imposing the Amber bond angle potential and limiting bond angles to within 10° of the overall bond angle minimum. Move attempts were binned by the maximum deviation (at either pivot point) from the Amber ideal bond angle and acceptance ratios were calculated (Supplemental Figure 1A). The acceptance rate remained above 20% for all moves where both bond angles remained within 6.25° of ideal. At the extreme, where one of the bond angles reached a 10° deviation from ideal, the acceptance rate dropped to 6.6%. Those rates may initially seem somewhat high, given the severity of the angular strain. However, there are always two bond angles changing during any move. At equilibrium, moves may transfer bond angle strain from one residue to the other, without increasing the total amount of strain in the system.

Rotation Angle Constraints

Examining the acceptance statistics further, we made the intuitive observation that as the magnitude of angular displacement increases, the acceptance statistics drop almost exponentially (Supplemental Figure 1B). This phenomenon is best explained through sterics, where the larger the rotation, the more likely a deleterious steric clash is encountered. We therefore imposed an additional constraint upon moves that restricted the maximum angular rotation to a given threshold, τmax. This was done by generating an additional interval, Irotation angle = [−τmax, τmax], and then calculating the intersection, I = Ibond angle ∩ Irotation angle, of that interval with the previously calculated intervals bracketing the bond angle. Importantly, this additional constraint can create an imbalance in the selection probabilities for the possible angles, as the total angular range of the intervals may be different before (l) and after (l’) moves (Supplemental Figure 2). Because the probability of selecting a given angle is inversely proportional to the number of possible values, the following acceptance criterion can be used to produce uniform selection probabilities:

$$P(\tau) = \min\left[1, \frac{l}{l'}\right]$$

Trial move τ angles are generated using this procedure:

  1. Calculate the total length, l, of the set of intervals, I.

  2. Choose a random threshold, t, uniformly from the interval [0, 1].

  3. Choose a random angle, τ, uniformly from I.

  4. At angle τ, calculate the new rotational interval, Irotation angle = [τ − τmax, τ + τmax].

  5. Calculate I’ = Ibond angle ∩ Irotation angle and the total length, l’, of I’.

  6. If l/l’ ≥ t, return τ. Otherwise go back to step 3.

Because l and l’ are generally quite similar, this procedure rarely iterates more than several times and is considerably less costly than other parts of the simulation.
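The six-step procedure above can be sketched as follows. This is a minimal illustration, assuming that the set of bond-angle-compatible intervals is supplied as a sorted list of disjoint (low, high) tuples in degrees, expressed relative to the current conformation at τ = 0. The function names and the `max_tries` guard are assumptions made for the example.

```python
import random

def intersect(intervals, low, high):
    """Clip a list of disjoint (a, b) intervals to the window [low, high]."""
    clipped = []
    for a, b in intervals:
        a2, b2 = max(a, low), min(b, high)
        if a2 < b2:
            clipped.append((a2, b2))
    return clipped

def total_length(intervals):
    return sum(b - a for a, b in intervals)

def sample_uniform(intervals, rng):
    """Draw a value uniformly from the union of disjoint intervals."""
    x = rng.uniform(0.0, total_length(intervals))
    for a, b in intervals:
        if x <= b - a:
            return a + x
        x -= b - a
    return intervals[-1][1]  # guard against floating-point round-off

def propose_tau(bond_angle_intervals, tau_max, rng=random, max_tries=1000):
    """Follow steps 1 to 6: sample tau and correct for unequal interval lengths."""
    current = intersect(bond_angle_intervals, -tau_max, tau_max)
    l = total_length(current)                      # step 1
    if l <= 0.0:
        raise ValueError("no allowed rotation angles")
    t = rng.random()                               # step 2
    for _ in range(max_tries):
        tau = sample_uniform(current, rng)         # step 3
        after = intersect(bond_angle_intervals,    # steps 4 and 5
                          tau - tau_max, tau + tau_max)
        l_after = total_length(after)
        if l / l_after >= t:                       # step 6
            return tau
    raise RuntimeError("no angle accepted within max_tries")
```

For example, `propose_tau([(-5.0, 100.0)], 20.0)` draws from the window [−5°, 20°]; angles that would open up a wider window for the reverse move (larger l’) are rejected more often.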

To ameliorate the reduction in acceptance ratio for large segment sizes, the τmax parameter is varied for each possible residue segment. This is distinguished from the Betancourt strategy of making equal magnitude displacements regardless of segment size. Different values of τmax are stored in another sparse upper triangular matrix, T. Based on empirical observation of the acceptance statistics, we devised the following rule relating τmax to segment size, s:

$$\tau_{max} = \begin{cases} 40^\circ; & s = 2 \\ (23 - s)^\circ; & s \geq 3 \end{cases}$$
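A lookup of τmax by segment size can be sketched as follows. The sketch reads the piecewise rule as 40° for s = 2 and (23 − s)° for s ≥ 3, which is an interpretation of the formula above rather than something confirmed against the Rosetta source. The function names and the dictionary standing in for the sparse matrix T are likewise assumptions.

```python
def tau_max_degrees(segment_size):
    """Maximum rotation angle (degrees) for a segment of the given size in residues.

    Assumed reading of the rule: 40 for size 2, and 23 - s for sizes of 3 or more.
    """
    if segment_size < 2:
        raise ValueError("a backrub segment spans at least two residues")
    if segment_size == 2:
        return 40.0
    return 23.0 - segment_size

def build_tau_max_table(allowed_segments):
    """Sparse stand-in for the matrix T: one entry per allowed (i, j) segment."""
    return {(i, j): tau_max_degrees(j - i + 1) for i, j in allowed_segments}
```

Under this reading, the limit falls from 20° for 3-residue segments to 11° at the default maximum segment size of 12.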

We found peptide bonds (size 2) to be significantly more flexible than other segment sizes. The large increase in flexibility is partially due to peptide bonds lacking the steric constraints of other segment sizes. However, when one looks at the distribution of allowable τ angles, given only a 10° bond angle cutoff, it is also clear that peptide bond segments have significantly more flexibility than larger segments (Supplemental Figure 1C). In addition to Davis et al.28, a similar type of motion has also been observed in unbiased computational simulations. A recent analysis of correlated ϕ/ψ motions in a large set of molecular dynamics trajectories also observed significant, localized peptide bond fluctuations52. Additionally, in pairs of structures of the same protein crystallized multiple times, larger “peptide flips” (involving rotations ~180°) are often observed53.

As a result of constraining both the N-Cα-C bond angles and maximum angular displacement during a move, the acceptance statistics remain relatively high for segment sizes from 2 to 12 (Supplemental Figure 1D). For the PDZ domain test simulation, backbone only moves showed an average acceptance ratio of 29%, and rotamer only moves showed an acceptance ratio of 34% (data not shown). When combined with the much less accepted simultaneous rotamer/backbone moves, the overall acceptance ratio drops to 26% (weighted mean of all move types). Elimination of simultaneous rotamer/backbone moves would increase the overall acceptance rate to 30%.
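The weighted mean quoted above can be checked with a short calculation. The sketch assumes that the PDZ test simulation used the default probabilities (Protamer = 0.25, Pbackbone = 0.75), which the text does not state explicitly. Because the inputs are rounded percentages, the implied acceptance ratio for combined moves is only a rough estimate.

```python
def move_type_fractions(p_rotamer=0.25, p_backbone=0.75):
    """Expected fraction of attempted moves of each type."""
    return {
        "rotamer": p_rotamer,
        "backbone": (1.0 - p_rotamer) * p_backbone,
        "both": (1.0 - p_rotamer) * (1.0 - p_backbone),
    }

fractions = move_type_fractions()

# Acceptance ratios quoted in the text (rounded to whole percent).
acc_rotamer, acc_backbone, acc_overall = 0.34, 0.29, 0.26

# Contribution of the two single-type moves to the overall weighted mean.
single = fractions["rotamer"] * acc_rotamer + fractions["backbone"] * acc_backbone

# Acceptance ratio that combined moves would need for the overall mean to be 26%.
implied_both = (acc_overall - single) / fractions["both"]

# Overall acceptance if combined moves were eliminated and weights renormalized.
without_both = single / (fractions["rotamer"] + fractions["backbone"])
```

Under these assumptions, `without_both` comes out at about 0.305, in line with the 30% stated above, and `implied_both` at roughly 0.06.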

Optimized Placement of Cβ and Hα Atoms

An important methodological consideration in a procedure that modulates Cα backbone bond angles is how the positions of the branching Cβ and hydrogen atoms are simultaneously varied. Placement of branching atoms means positioning of Cβ and Hα relative to the N, Cα, and C atoms. A number of angular bisecting heuristics can be applied to place those atoms in positions with acceptable geometries. In this work, to reduce bond angle strain, the branching atoms are placed in positions at the minimum of the force field bond angle potential, given the current N-Cα-C bond angle. Minimization after every Monte Carlo move would be computationally expensive. Fortunately, the minimized internal coordinates of those atoms follow a predictable pattern (Supplemental Figure 4). To enable fast updates of the position of a branching atom X, quadratic functions were fit that related a series of N-Cα-C backbone bond angles to the corresponding fully minimized branching atom internal coordinates, namely the C-N-Cα-X torsion offset from ϕ, and the N-Cα-X bond angle (Supplemental Table 2). These fits were very accurate even to highly unfavorable bond angle energies of 20 kcal/mol.

After every backrub move, the new branching atom positions are found using those quadratic fits. Subsequently, the coordinates of the side chain prior to the move are rotated about the Cα atom pivot point such that the old Cβ atom is collinear with the new Cα-Cβ axis. Finally, the whole side chain is rotated slightly about the Cα-Cβ axis to restore the χ1 angle to its original value.

Simulation of 3-Residue Backrubs (Test 1)

Davis et al28 identified 126 positions in 19 high resolution (≤ 1.0Å) crystal structures where there was evidence for a localized rotation of a 3-residue segment of the protein backbone. In some cases, the conformational variability in the backbone was only implied by alternate Cβ atom positions in the PDB file, with a single set of Cα atom coordinates representing the mean location of multiple Cα atom positions. In those cases, we used a single starting structure with the Cβ atom optimized according to the bond angle potential. If alternate Cα coordinates were present in the PDB file, we generated 2–3 starting structures, one for each variant letter (A, B, or C) in the contiguous set of atoms with alternate backbone coordinates. All other alternate atom coordinates were set to the A variants. In total, this procedure yielded 161 starting structures in the 3-residue backrub set.

The simulations for each identified backrub were as follows. Given a three residue backbone motion centered on residue i, angular perturbations were enabled for residue pairs (i−1, i+1), (i−1, i), and (i, i+1). The side chain of residue i was also allowed to sample different rotameric states. 200,000 Monte Carlo steps were run at a temperature of 302 K. 50% of the steps consisted of a random perturbation of the (i−1, i+1) angle and a simultaneous rotamer swap. The other 50% of steps consisted of a random perturbation of either the (i−1, i) or (i, i+1) angle and no rotameric sampling.
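As a consistency check, and assuming that energies are interpreted in kcal/mol (an assumption, since the unit of the Rosetta score is not stated here), the simulation temperature of 302 K corresponds to the following thermal energy, using the standard gas constant R:

$$kT = RT \approx \left(1.987 \times 10^{-3}\ \mathrm{kcal\,mol^{-1}\,K^{-1}}\right)\left(302\ \mathrm{K}\right) \approx 0.600\ \mathrm{kcal\,mol^{-1}}$$

This matches the value kT = 0.6 used in the point mutant protocol.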

Point Mutant Side Chain Prediction (Test 2)

We used a benchmark set of 2,141 pairs of protein structures for which the only difference was a single point mutation, aside from extra or missing residues at the N and C termini37. We removed 7 pairs from the set that had, at the mutated residue position, either missing side chain atoms or a non-canonical amino acid. We also removed 8 pairs for which the mutation was duplicated in another pair in the list. Finally we removed 103 pairs that had either missing or zero occupancy backbone atoms in the first structure in the pair. Structures with missing or zero occupancy backbone atoms in the second structure were removed during analysis (see below). That left 2,023 ordered pairs of structures.

During side-chain prediction, we sampled conformations (backbone and side-chain) for both the mutated residue and neighboring residues. Neighboring residues were selected that, prior to mutation, had any atom within a given radius of any atom in the mutated residue. Radial cutoffs of 4Å, 5Å, 6Å, 7Å, and 8Å were tested. At the beginning of each sampling run, the side-chain in the first PDB structure was mutated and then rotamer optimized along with all the neighboring side-chains using an energy-table based Monte Carlo simulated annealing protocol16. Subsequently, the backrub protocol was run for 10⁴ steps at a single temperature of kT = 0.6, maintaining either a fixed backbone (Protamer = 1) or allowing backbone sampling (Protamer = 0.25). The lowest energy structure found during ten separate executions was used as the prediction.
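The neighbor selection and the choice of the final prediction can be sketched as follows. This is an illustrative outline with hypothetical names: `residues` is assumed to be a dictionary mapping a residue identifier to a list of atom coordinates, and `run_once` stands in for one execution of the sampling protocol, returning an (energy, structure) pair.

```python
import math

def neighbors_within(residues, mutated_id, cutoff):
    """Residues with any atom within `cutoff` of any atom of the mutated residue."""
    target_atoms = residues[mutated_id]
    found = []
    for res_id, atoms in residues.items():
        if res_id == mutated_id:
            continue
        if any(math.dist(a, b) <= cutoff for a in atoms for b in target_atoms):
            found.append(res_id)
    return sorted(found)

def best_of_runs(run_once, n_runs=10):
    """Run the protocol n_runs times and keep the lowest-energy result."""
    results = [run_once(seed) for seed in range(n_runs)]
    return min(results, key=lambda result: result[0])
```

The all-pairs distance test is quadratic in the number of atoms, which is adequate for a sketch but would normally be replaced by a spatial grid or neighbor list in production code.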

To facilitate comparison of the prediction with the second PDB structure in the pair, we first superimposed the N, Cα, and C atoms from a set of residues around the mutated residue. The superimposed set was defined as all residues satisfying the following condition in both the first and second PDB structures: a heavy atom of the residue must be within 4Å of a heavy atom in the mutated residue. All subsequent RMSD calculations used this fixed superimposition. To compare effects of the mutation on surrounding side-chains, we used a similar set of residues. The set was defined as all non-mutated residues satisfying the following condition in either the first or second PDB structures: a non-backbone heavy atom of the residue must be within 4Å of a non-backbone heavy atom in the mutated residue. Any RMSD calculation in which all compared atoms in the second PDB structure had zero occupancy was ignored in calculating overall statistics. All superimposition, RMSD, and chi angle calculations were done using ICM Browser 3.5-1l (Molsoft). Sequence alignments for mapping atom selections from structure to structure were created using ClustalW 1.83 (ref. 54).

Code Availability

Source code for the implemented backrub model is available for download free-of-charge as part of the 2.2 release of the Rosetta molecular modeling software at http://www.rosettacommons.org/.

Acknowledgements

The authors thank Andrew Bordner for providing a text version of the point mutant benchmark. Greg Friedland gave helpful feedback about the backrub sampling code and method. Jerome Nilmeier provided useful discussions about detailed balance. Matt Jacobson gave valuable feedback as co-advisor of CAS. Christopher McClendon, Libusha Kelly, and Ian Davis provided useful comments about the manuscript. CAS was supported by NIH training grant GM067547, the Department of Defense Graduate Research Fellowship Program, and the Genentech Scholars Program. TK is an Alfred P. Sloan Scholar in Molecular Biology. This work was supported by SynBERC (NSF EEC-0540879).

References

  • 1.Ponder JW, Richards FM. Tertiary templates for proteins. Use of packing criteria in the enumeration of allowed sequences for different structural classes. J Mol Biol. 1987;193:775–791. doi: 10.1016/0022-2836(87)90358-5. [DOI] [PubMed] [Google Scholar]
  • 2.Fernández-Recio J, Totrov M, Abagyan R. ICM-DISCO docking by global energy optimization with fully flexible side-chains. Proteins. 2003;52:113–117. doi: 10.1002/prot.10383. [DOI] [PubMed] [Google Scholar]
  • 3.Gray JJ, Moughon S, Wang C, Schueler-Furman O, Kuhlman B, Rohl CA, Baker D. Protein-protein docking with simultaneous optimization of rigid-body displacement and side-chain conformations. J Mol Biol. 2003;331:281–299. doi: 10.1016/s0022-2836(03)00670-3. [DOI] [PubMed] [Google Scholar]
  • 4.Dahiyat BI, Mayo SL. De novo protein design: fully automated sequence selection. Science. 1997;278:82–87. doi: 10.1126/science.278.5335.82. [DOI] [PubMed] [Google Scholar]
  • 5.Dantas G, Kuhlman B, Callender D, Wong M, Baker D. A large scale test of computational protein design: folding and stability of nine completely redesigned globular proteins. J Mol Biol. 2003;332:449–460. doi: 10.1016/s0022-2836(03)00888-x. [DOI] [PubMed] [Google Scholar]
  • 6.Kortemme T, Joachimiak LA, Bullock AN, Schuler AD, Stoddard BL, Baker D. Computational redesign of protein-protein interaction specificity. Nat Struct Mol Biol. 2004;11:371–379. doi: 10.1038/nsmb749. [DOI] [PubMed] [Google Scholar]
  • 7.Ashworth J, Havranek JJ, Duarte CM, Sussman D, Monnat RJ, Stoddard BL, Baker D. Computational redesign of endonuclease DNA binding and cleavage specificity. Nature. 2006;441:656–659. doi: 10.1038/nature04818. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 8.Swain JF, Gierasch LM. The changing landscape of protein allostery. Curr Opin Struct Biol. 2006;16:102–108. doi: 10.1016/j.sbi.2006.01.003. [DOI] [PubMed] [Google Scholar]
  • 9.Li Y, Li H, Yang F, Smith-Gill SJ, Mariuzza RA. X-ray snapshots of the maturation of an antibody response to a protein antigen. Nat Struct Biol. 2003;10:482–488. doi: 10.1038/nsb930. [DOI] [PubMed] [Google Scholar]
  • 10.Li Z, Scheraga HA. Monte Carlo-minimization approach to the multiple-minima problem in protein folding. Proc Natl Acad Sci U S A. 1987;84:6611–6615. doi: 10.1073/pnas.84.19.6611. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 11.Abagyan R, Totrov M, Kuznetsov D. ICM - A new method for protein modeling and design: Applications to docking and structure prediction from the distorted native conformation. J Comput Chem. 1994;15:488–506. [Google Scholar]
  • 12.Rohl CA, Strauss CE, Misura KM, Baker D. Protein structure prediction using Rosetta. Methods Enzymol. 2004;383:66–93. doi: 10.1016/S0076-6879(04)83004-0. [DOI] [PubMed] [Google Scholar]
  • 13.Simons KT, Kooperberg C, Huang E, Baker D. Assembly of protein tertiary structures from fragments with similar local sequences using simulated annealing and Bayesian scoring functions. J Mol Biol. 1997;268:209–225. doi: 10.1006/jmbi.1997.0959. [DOI] [PubMed] [Google Scholar]
  • 14.Rohl CA, Strauss CE, Chivian D, Baker D. Modeling structurally variable regions in homologous proteins with rosetta. Proteins. 2004;55:656–677. doi: 10.1002/prot.10629. [DOI] [PubMed] [Google Scholar]
  • 15.Desjarlais JR, Handel TM. Side-chain and backbone flexibility in protein core design. J Mol Biol. 1999;290:305–318. doi: 10.1006/jmbi.1999.2866. [DOI] [PubMed] [Google Scholar]
  • 16.Kuhlman B, Baker D. Native protein sequences are close to optimal for their structures. Proc Natl Acad Sci U S A. 2000;97:10383–10388. doi: 10.1073/pnas.97.19.10383. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 17.Harbury PB, Plecs JJ, Tidor B, Alber T, Kim PS. High-resolution protein design with backbone freedom. Science. 1998;282:1462–1467. doi: 10.1126/science.282.5393.1462. [DOI] [PubMed] [Google Scholar]
  • 18.Fu X, Apgar JR, Keating AE. Modeling backbone flexibility to achieve sequence diversity: the design of novel alpha-helical ligands for Bcl-xL. J Mol Biol. 2007;371:1099–1117. doi: 10.1016/j.jmb.2007.04.069. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 19.Wick CD, Siepmann JI. Self-adapting fixed-end-point configurational-bias Monte Carlo method for the regrowth of interior segments of chain molecules with strong intramolecular interactions. Macromolecules. 2000;33:7207–7218. [Google Scholar]
  • 20.Canutescu AA, Dunbrack RL. Cyclic coordinate descent: A robotics algorithm for protein loop closure. Protein Sci. 2003;12:963–972. doi: 10.1110/ps.0242703. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 21.Jacobson MP, Pincus DL, Rapp CS, Day TJ, Honig B, Shaw DE, Friesner RA. A hierarchical approach to all-atom protein loop prediction. Proteins. 2004;55:351–367. doi: 10.1002/prot.10613. [DOI] [PubMed] [Google Scholar]
  • 22.Cahill M, Cahill S, Cahill K. Proteins wriggle. Biophys J. 2002;82:2665–2670. doi: 10.1016/S0006-3495(02)75608-7. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 23.Go N, Scheraga HA. Ring Closure and Local Conformational Deformations of Chain Molecules. Macromolecules. 1970;3:178–187. [Google Scholar]
  • 24.Bruccoleri RE, Karplus M. Chain closure with bond angle variations. Macromolecules. 1985;18:2767–2773. [Google Scholar]
  • 25.Dinner AR. Local deformations of polymers with nonplanar rigid main-chain internal coordinates. J Comput Chem. 2000;21:1132–1144. [Google Scholar]
  • 26.Ulmschneider JP, Jorgensen WL. Monte Carlo backbone sampling for polypeptides with variable bond angles and dihedral angles using concerted rotations and a Gaussian bias. J Chem Phys. 2003;118:4261–4271. [Google Scholar]
  • 27.Coutsias EA, Seok C, Jacobson MP, Dill KA. A kinematic view of loop closure. J Comput Chem. 2004;25:510–528. doi: 10.1002/jcc.10416. [DOI] [PubMed] [Google Scholar]
  • 28.Davis IW, Arendall WB, Richardson DC, Richardson JS. The backrub motion: how protein backbone shrugs when a sidechain dances. Structure. 2006;14:265–274. doi: 10.1016/j.str.2005.10.007. [DOI] [PubMed] [Google Scholar]
  • 29.Betancourt MR. Efficient Monte Carlo trial moves for polypeptide simulations. J Chem Phys. 2005;123:174905–174905. doi: 10.1063/1.2102896. [DOI] [PubMed] [Google Scholar]
  • 30.Meiler J, Baker D. ROSETTALIGAND: protein-small molecule docking with full side-chain flexibility. Proteins. 2006;65:538–548. doi: 10.1002/prot.21086. [DOI] [PubMed] [Google Scholar]
  • 31.Bradley P, Misura KM, Baker D. Toward high-resolution de novo structure prediction for small proteins. Science. 2005;309:1868–1871. doi: 10.1126/science.1113801. [DOI] [PubMed] [Google Scholar]
  • 32.Holm L, Sander C. Database algorithm for generating protein backbone and side-chain co-ordinates from a C alpha trace application to model building and detection of coordinate errors. J Mol Biol. 1991;218:183–194. doi: 10.1016/0022-2836(91)90883-8. [DOI] [PubMed] [Google Scholar]
  • 33.Dunbrack RL, Karplus M. Backbone-dependent rotamer library for proteins. Application to side-chain prediction. J Mol Biol. 1993;230:543–574. doi: 10.1006/jmbi.1993.1170. [DOI] [PubMed] [Google Scholar]
  • 34.Shenkin PS, Farid H, Fetrow JS. Prediction and evaluation of side-chain conformations for protein backbone structures. Proteins. 1996;26:323–352. doi: 10.1002/(SICI)1097-0134(199611)26:3<323::AID-PROT8>3.0.CO;2-E. [DOI] [PubMed] [Google Scholar]
  • 35.De Maeyer M, Desmet J, Lasters I. All in one: a highly detailed rotamer library improves both accuracy and speed in the modelling of sidechains by dead-end elimination. Fold Des. 1997;2:53–66. doi: 10.1016/s1359-0278(97)00006-0. [DOI] [PubMed] [Google Scholar]
  • 36.Formaneck MS, Ma L, Cui Q. Reconciling the "old" and "new" views of protein allostery: a molecular simulation study of chemotaxis Y protein (CheY) Proteins. 2006;63:846–867. doi: 10.1002/prot.20893. [DOI] [PubMed] [Google Scholar]
  • 37.Bordner AJ, Abagyan RA. Large-scale prediction of protein geometry and stability changes for arbitrary single point mutations. Proteins. 2004;57:400–413. doi: 10.1002/prot.20185. [DOI] [PubMed] [Google Scholar]
  • 38.Lolis E, Petsko GA. Crystallographic analysis of the complex between triosephosphate isomerase and 2-phosphoglycolate at 2.5-Å resolution: implications for catalysis. Biochemistry. 1990;29:6619–6625. doi: 10.1021/bi00480a010.
  • 39.Lolis E, Alber T, Davenport RC, Rose D, Hartman FC, Petsko GA. Structure of yeast triosephosphate isomerase at 1.9-Å resolution. Biochemistry. 1990;29:6609–6618. doi: 10.1021/bi00480a009.
  • 40.Joseph D, Petsko GA, Karplus M. Anatomy of a conformational change: hinged "lid" motion of the triosephosphate isomerase loop. Science. 1990;249:1425–1428. doi: 10.1126/science.2402636.
  • 41.Derreumaux P, Schlick T. The loop opening/closing motion of the enzyme triosephosphate isomerase. Biophys J. 1998;74:72–81. doi: 10.1016/S0006-3495(98)77768-9.
  • 42.Guerois R, Nielsen JE, Serrano L. Predicting changes in the stability of proteins and protein complexes: a study of more than 1000 mutations. J Mol Biol. 2002;320:369–387. doi: 10.1016/S0022-2836(02)00442-4.
  • 43.Kortemme T, Baker D. A simple physical model for binding energy hot spots in protein-protein complexes. Proc Natl Acad Sci U S A. 2002;99:14116–14121. doi: 10.1073/pnas.202485799.
  • 44.Kortemme T, Morozov AV, Baker D. An orientation-dependent hydrogen bonding potential improves prediction of specificity and structure for proteins and protein-protein complexes. J Mol Biol. 2003;326:1239–1259. doi: 10.1016/s0022-2836(03)00021-4.
  • 45.Yin S, Ding F, Dokholyan NV. Modeling Backbone Flexibility Improves Protein Stability Estimation. Structure. 2007;15:1567–1576. doi: 10.1016/j.str.2007.09.024.
  • 46.Dunbrack RL, Cohen FE. Bayesian statistical analysis of protein side-chain rotamer preferences. Protein Sci. 1997;6:1661–1681. doi: 10.1002/pro.5560060807.
  • 47.Cornell WD, Cieplak P, Bayly CI, Gould IR, Merz KM, Ferguson DM, Spellmeyer DC, Fox T, Caldwell JW, Kollman PA. A Second Generation Force Field for the Simulation of Proteins, Nucleic Acids, and Organic Molecules. J Am Chem Soc. 1995;117:5179–5197.
  • 48.MacKerell AD, Bashford D, Bellott M, Dunbrack RL, Evanseck JD, Field MJ, Fischer S, Gao J, Guo H, Ha S, Joseph-McCarthy D, Kuchnir L, Kuczera K, Lau FTK, Mattos C, Michnick S, Ngo T, Nguyen DT, Prodhom B, Reiher WE, Roux B, Schlenkrich M, Smith JC, Stote R, Straub J, Watanabe M, Wiorkiewicz-Kuczera J, Yin D, Karplus M. All-Atom Empirical Potential for Molecular Modeling and Dynamics Studies of Proteins. J Phys Chem B. 1998;102:3586–3616. doi: 10.1021/jp973084f.
  • 49.Kuhlman B, Dantas G, Ireton GC, Varani G, Stoddard BL, Baker D. Design of a novel globular protein fold with atomic-level accuracy. Science. 2003;302:1364–1368. doi: 10.1126/science.1089427.
  • 50.Lazaridis T, Karplus M. Effective energy function for proteins in solution. Proteins. 1999;35:133–152. doi: 10.1002/(sici)1097-0134(19990501)35:2<133::aid-prot1>3.0.co;2-n.
  • 51.Appleton BA, Zhang Y, Wu P, Yin JP, Hunziker W, Skelton NJ, Sidhu SS, Wiesmann C. Comparative structural analysis of the Erbin PDZ domain and the first PDZ domain of ZO-1. Insights into determinants of PDZ domain specificity. J Biol Chem. 2006;281:22312–22320. doi: 10.1074/jbc.M602901200.
  • 52.Fitzgerald JE, Jha AK, Sosnick TR, Freed KF. Polypeptide motions are dominated by peptide group oscillations resulting from dihedral angle correlations between nearest neighbors. Biochemistry. 2007;46:669–682. doi: 10.1021/bi061575x.
  • 53.Hayward S. Peptide-plane flipping in proteins. Protein Sci. 2001;10:2219–2227. doi: 10.1110/ps.23101.
  • 54.Thompson JD, Higgins DG, Gibson TJ. CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res. 1994;22:4673–4680. doi: 10.1093/nar/22.22.4673.
  • 55.Carr DB, Littlefield RJ, Nicholson WL, Littlefield JS. Scatterplot Matrix Techniques for Large N. Journal of the American Statistical Association. 1987;82:424–436.

