Abstract
To successfully design new proteins and understand the effects of mutations in natural proteins, we must understand the geometric and physicochemical principles underlying protein structure. The side chains of amino acids in peptides and proteins adopt specific dihedral angle combinations; however, we still do not have a fundamental quantitative understanding of why some side-chain dihedral angle combinations are highly populated and others are not. Here we employ a hard-sphere plus stereochemical constraint model of dipeptide mimetics to enumerate the side-chain dihedral angles of leucine (Leu) and isoleucine (Ile), and identify those conformations that are sterically allowed versus those that are not as a function of the backbone dihedral angles ϕ and ψ. We compare our results with the observed distributions of side-chain dihedral angles in proteins of known structure. With the hard-sphere plus stereochemical constraint model, we obtain agreement between the model predictions and the observed side-chain dihedral angle distributions for Leu and Ile. These results quantify the extent to which local, geometrical constraints determine protein side-chain conformations.
Introduction
Researchers in computational protein design seek to create new proteins with desirable properties, such as novel folds, enhanced stability, or tailored binding affinity and specificity (1). Although a number of successes in protein design have been achieved in recent years, the problem is by no means solved (2–12). In a recent study (13), for example, protein domains were designed to bind to a conserved region of the stem of influenza hemagglutinin protein. However, only 3% of the designed structures exhibited any binding when tested experimentally. That work both illustrated the state of the art in computational protein design and highlighted its limitations, as the authors themselves subsequently discussed (14).
There are several issues with current approaches to computational protein design. Current force fields mix knowledge-based and molecular-mechanics-based terms with relative weights that are determined ad hoc and are specific to each design problem (15,16). This approach also results in double counting of some energetic contributions. For example, including a knowledge-based helix propensity term double counts the energetics of van der Waals and hydrogen-bonding interactions. Moreover, many of the molecular-mechanics-based terms (e.g., van der Waals, electrostatics, and solvent-mediated interactions) do not need to be included in all applications. However, molecular-dynamics force fields have been optimized with all terms present, and with respect to a particular water model, which makes it difficult for researchers to assess the sensitivity of molecular-mechanics force fields to individual energetic terms. Instead of making the force fields more complicated, we seek a computational methodology in which the force fields are simplified to include only the dominant terms that are relevant to a particular application.
Exploration of the limits of a hard-sphere and stereochemical model for protein structure has a long history. More than 40 years ago, Ramakrishnan and Ramachandran (17) identified the allowed backbone conformations of an alanyl dipeptide given hard-sphere and stereochemical constraints. The sterically allowed combinations of the backbone dihedral angles ϕ and ψ predicted for the alanyl dipeptide match those observed in proteins of known structure.
The influence of steric and packing constraints in proteins has been investigated extensively in both experiments and computational studies (18–31). For example, in experiments, researchers have determined the structural and thermodynamic changes in response to cavity-forming large-to-small mutations and alternative core-packing arrangements. In addition, the Richardson group (32–35) developed a method to assess the quality of protein crystal structures and ameliorate incorrect ones. They found that the highest-resolution structures efficiently fill space with few steric clashes, whereas low-quality structures are less well packed and possess many steric clashes. Dunbrack and colleagues have extensively analyzed the side-chain dihedral angle distributions in high-resolution protein crystal structures (36–39). They emphasized that the side-chain dihedral angle distributions are rotameric, with high probabilities at specific χ1 and χ2 combinations that depend sensitively on the backbone dihedral angles ϕ and ψ. They also showed that certain rotamers are rare because of steric repulsions analogous to those that constrain the conformations of hydrocarbon chains.
Backbone (e.g., CMAP and Amber-NMR) and side-chain (e.g., Amber-ILDN) dihedral angle potentials and backbone-dependent rotamer libraries have been developed for implementation into molecular-dynamics simulation packages (40–42). However, even with these corrections, results from CHARMM and Amber still disagree with each other in their predictions for the distributions of the backbone and side-chain dihedral angles for dipeptide mimetics (43,44). Without the CMAP corrections, CHARMM predictions for the backbone dihedral angle distributions can be well outside the hard-sphere limits of the Ramachandran plot (45,46).
Given the importance of side-chain packing in specifying the stability of protein-protein interfaces (47,48) and protein cores, we argue that for computational approaches to protein design to be successful, one must quantitatively understand the form of the side-chain dihedral angle distributions, i.e., one must explain why particular side-chain dihedral angle combinations are more or less probable. In this work, we present the results of computational studies of Leu and Ile dipeptide mimetics. We explain the observed side-chain dihedral angle probabilities for these uncharged, nonpolar residues using a hard-sphere model with stereochemical constraints (i.e., the bond lengths, bond angles, and ω backbone dihedral angles set to experimental values) and no additional energetic terms.
Materials and Methods
Fig. 1 shows stick representations of the Leu and Ile dipeptide mimetics (N-acetylleucine-N′-methylamide and N-acetylisoleucine-N′-methylamide). Dipeptide conformations for both Leu and Ile are specified by the backbone dihedral angles ϕ and ψ, side-chain dihedral angles χ1 and χ2, 12 bond lengths, 15 bond angles, and two additional backbone dihedral angles ω1 and ω2 (without rotations of the hydrogen atoms; see the Supporting Material). We compare the results of our calculations with a subset (structures with resolution ≤1.0 Å and R factor ≤ 0.2) of Leu and Ile residues from the PDB provided by Dr. Roland Dunbrack, Jr., extracted from PISCES (49,50). From here on, we will refer to this database as the culled Dunbrack database. Note that this data set is not a subset of the set presented in Shapovalov and Dunbrack (39), even though a similar methodology was used to obtain it. Our selected subset includes 2204 Leu and 1555 Ile residues. The culled Dunbrack database is just one of several high-resolution protein databases that could have been used (33,52).
Figure 1.

Stick representation of Leu (left) and Ile (right) dipeptide mimetics. The backbone dihedral angles, ϕ and ψ, and the side-chain dihedral angles χ1 and χ2 are highlighted, with positive angles indicated by the arrows. The methyl hydrogen atoms were added using the REDUCE program (56). The Cα atoms of the central, proceeding (i + 1), and trailing (i − 1) amino acids are labeled. Carbon, nitrogen, oxygen, and hydrogen atoms are shaded pink, blue, red, and white, respectively. To see this figure in color, go online.
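Since every conformation in this work is specified by dihedral angles, it may help to see how such an angle can be computed from four atomic positions. The following is a minimal sketch, not the code used in this study; the function name `dihedral_deg` is hypothetical, and the atom quadruples mentioned in the comments follow the standard convention (e.g., N, Cα, Cβ, Cγ1 for χ1 of Ile), which is assumed here rather than stated in the text. The result is mapped to [0°, 360°), the range used for χ1 and χ2 in this article.

```python
import numpy as np

def dihedral_deg(p0, p1, p2, p3):
    """Dihedral angle in degrees, in [0, 360), defined by four atom positions.

    Assumed standard quadruples (not taken from the text):
      chi1 (Ile): N, CA, CB, CG1      chi2 (Ile): CA, CB, CG1, CD1
      phi: C(i-1), N, CA, C           psi: N, CA, C, N(i+1)
    """
    p0, p1, p2, p3 = (np.asarray(p, dtype=float) for p in (p0, p1, p2, p3))
    b1 = p1 - p0          # first bond vector
    b2 = p2 - p1          # central bond (rotation axis)
    b3 = p3 - p2          # last bond vector
    n1 = np.cross(b1, b2)  # normal to the plane of atoms 0, 1, 2
    n2 = np.cross(b2, b3)  # normal to the plane of atoms 1, 2, 3
    # atan2 form keeps the sign of the angle and is stable near 0 and 180 degrees
    y = np.linalg.norm(b2) * np.dot(b1, n2)
    x = np.dot(n1, n2)
    return float(np.degrees(np.arctan2(y, x)) % 360.0)
```

With this convention, a backbone value such as ϕ = −60° corresponds to 300° on the [0°, 360°) scale.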
The culled Dunbrack database, against which we compare our calculations, is carefully curated to include a large number of high-resolution and high-confidence structures. Some researchers have reported that only at resolutions less than ∼0.7 Å are x-ray crystal structures truly free of refinement bias (53). However, the extremely small number of available ultrahigh-resolution structures (only six) precludes a meaningful statistical analysis. Nevertheless, we performed a side-chain conformational analysis of the 51 Leu and 32 Ile residues in these ultrahigh-resolution structures. We observed no significant differences between these analyses and those based on the culled Dunbrack data set. See the Supporting Material and Figs. S1 and S2.
Fig. 2 shows the observed probability distributions for the backbone dihedral angles P(ϕ,ψ) and side-chain dihedral angles P(χ1, χ2) for Leu (Fig. 2, a and c) and Ile (Fig. 2, b and d) from protein crystal structures in the culled Dunbrack database. The probability distributions were binned in 5° × 5° boxes and normalized separately so that the sum over all ϕ and ψ, or over all χ1 and χ2, equals one. For ease of reference, we decomposed χ1 and χ2 space into nine boxes, labeled 1–9. Note that the majority (60%) of Ile residues have side-chain dihedral angles that fall near a single rotamer combination (300°, 180°) (box 6). The χ1 and χ2 combinations around (300°, 300°) (box 3), (60°, 180°) (box 4), (180°, 180°) (box 5), and (180°, 60°) (box 8) are sometimes observed, whereas the χ1 and χ2 combinations around (60°, 300°) (box 1), (180°, 300°) (box 2), (60°, 60°) (box 7), and (300°, 60°) (box 9) rarely occur (with probabilities ≤1%). For Leu residues, >90% of the side-chain dihedral angles are found with χ1 and χ2 combinations around (300°, 180°) (box 6) and (180°, 60°) (box 8). Side-chain dihedral angle combinations around (180°, 180°) (box 5) and (300°, 60°) (box 9) are sometimes observed, whereas all other χ1 and χ2 combinations are rarely observed.
Figure 2.

Observed probability distributions for the backbone dihedral angles P(ϕ,ψ) (top) and side-chain dihedral angles P(χ1, χ2) (bottom) binned in 5° × 5° increments for Leu (left) and Ile (right) from protein crystal structures in the culled Dunbrack database. (a–d) The sums of the probability distributions over all ϕ and ψ in a and b, or over all χ1 and χ2 in c and d, equal one. In c and d, the probability values within each of the nine χ1 and χ2 boxes are labeled.
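The nine-box labeling of χ1 and χ2 space can be illustrated with a short sketch. The box centers are taken from the text; the box edges at 0°, 120°, and 240° are an assumption made here for illustration (the text only says the boxes lie "around" each rotamer combination), and the names `chi_box` and `box_fractions` are hypothetical.

```python
from collections import Counter

def chi_box(chi1, chi2):
    """Return the box label (1-9) for a (chi1, chi2) pair given in degrees.

    Layout from the text: boxes 1-3 have chi2 near 300, boxes 4-6 near 180,
    boxes 7-9 near 60; within a row, chi1 is near 60, 180, 300.
    Assumption: each box spans 120 x 120 degrees with edges at 0, 120, 240.
    """
    col = min(int((chi1 % 360.0) // 120.0), 2)      # 0: ~60, 1: ~180, 2: ~300
    row = 2 - min(int((chi2 % 360.0) // 120.0), 2)  # 0: ~300, 1: ~180, 2: ~60
    return 3 * row + col + 1

def box_fractions(chi_pairs):
    """Fraction of (chi1, chi2) pairs that fall in each of the nine boxes."""
    counts = Counter(chi_box(c1, c2) for c1, c2 in chi_pairs)
    total = sum(counts.values())
    return {box: counts.get(box, 0) / total for box in range(1, 10)}
```

For example, `chi_box(300, 180)` returns 6 and `chi_box(180, 60)` returns 8, matching the labels used for the most populated Ile and Leu rotamers.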
To obtain a physical understanding of the observed side-chain dihedral angle distributions of Leu and Ile, we model the atoms in the dipeptide mimetics as hard spheres with specified radii and bond-length, bond-angle, and ω-backbone dihedral-angle constraints (54). Using this model, we exhaustively sample all backbone (ϕ,ψ) and side-chain dihedral angles (χ1, χ2) and determine which angle combinations give rise to steric overlaps and which ones do not. In this context, a steric overlap is defined as a clash between two nonbonded atoms (with both located on the side chain or one on the side chain and the other on the backbone, i.e., we do not consider clashes between backbone atoms) that satisfies rij < (σi + σj)/2, where rij is the center-to-center separation between atoms i and j with diameters σi and σj. We then calculate the probability distributions for sterically allowed combinations of the side-chain dihedral angles χ1 and χ2 for particular values of the backbone dihedral angles ϕ and ψ. Our calculations involve the following steps: First, we set the atom sizes for hydrogen, sp3 carbon, sp2 carbon, nitrogen, and oxygen to be 1.05, 1.5, 1.4, 1.4, and 1.45 Å, respectively. These values were calibrated in our previous studies of the side-chain dihedral angle distributions for Val and Thr (55). We then add the methyl hydrogens and position them using the REDUCE software package (56). To calculate the backbone and side-chain dihedral-angle distributions, we discretize the ϕ and ψ or χ1 and χ2 plane into 5° × 5° boxes, and for each box we sum the number of Leu or Ile backbone or side-chain conformations that are sterically allowed. The number of counts in each box normalized by the total number of rotamer combinations sampled gives P(ϕ,ψ) and P(χ1, χ2). Thus, the sum of P(ϕ,ψ) and P(χ1, χ2) over all ϕ and ψ, or over all χ1 and χ2, equals one. See the Supporting Material for additional details of the computational methods.
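The overlap criterion and the binning step described above can be sketched as follows. This is an illustrative outline under stated assumptions, not the implementation used in this work: the function names are hypothetical, the caller is assumed to supply atom diameters σ and the list of nonbonded pairs to test, and the histogram is normalized by the number of sterically allowed combinations so that it sums to one.

```python
import numpy as np

def has_steric_overlap(coords, sigma, pairs):
    """True if any listed nonbonded pair satisfies r_ij < (sigma_i + sigma_j) / 2.

    coords: (N, 3) array of atom centers in angstroms
    sigma:  length-N array of atom diameters in angstroms (supplied by the caller)
    pairs:  iterable of (i, j) index pairs to test, e.g. side chain/side chain
            and side chain/backbone pairs (backbone/backbone pairs are excluded)
    """
    coords = np.asarray(coords, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    for i, j in pairs:
        r_ij = np.linalg.norm(coords[i] - coords[j])  # center-to-center separation
        if r_ij < 0.5 * (sigma[i] + sigma[j]):
            return True
    return False

def allowed_distribution(chi_pairs, allowed, bin_deg=5.0):
    """Histogram of sterically allowed (chi1, chi2) pairs on a bin_deg grid.

    Assumption: normalized by the number of allowed combinations, so the
    returned array sums to one whenever at least one combination is allowed.
    """
    n = int(round(360.0 / bin_deg))
    hist = np.zeros((n, n))
    for (chi1, chi2), ok in zip(chi_pairs, allowed):
        if ok:
            i = int((chi1 % 360.0) // bin_deg) % n
            j = int((chi2 % 360.0) // bin_deg) % n
            hist[i, j] += 1.0
    total = hist.sum()
    return hist / total if total > 0 else hist
```

In a full calculation, `has_steric_overlap` would be evaluated for each sampled (ϕ, ψ, χ1, χ2) conformation, and its negation would populate the `allowed` flags passed to `allowed_distribution`.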
Results and Discussion
Fig. 3 summarizes the results obtained using the hard-sphere plus stereochemical constraint model for Ile dipeptide mimetics. In panels b, d, f, h, and j, we show the calculated probability distributions P(χ1, χ2) of sterically allowed side-chain dihedral-angle combinations χ1 and χ2 when the backbone dihedral angles ϕ and ψ are sampled according to the distributions shown in a, c, e, g, and i, respectively. When ϕ and ψ are sampled according to the observed Ile dipeptides in the culled Dunbrack database, where the majority of ϕ and ψ are in the α-helix region of the Ramachandran plot, the model predicts that the boxes with the most sterically allowed side-chain dihedral-angle combinations χ1 and χ2 are boxes 6 (35%), 4 (23%), 5 (20%), and 3 (16%), which is similar to the results from the culled Dunbrack database in Fig. 2 d, i.e., boxes 6 (60%), 4 (16%), 3 (15%), and 5 (6%). One interesting exception, which we will investigate in future studies, is box 5, for which we predict 20%, whereas the culled Dunbrack database gives 6%. This discrepancy suggests that the Dunbrack database does not uniformly weight the sterically allowed side-chain dihedral-angle combinations. Note that neither the calculated nor the Dunbrack distribution populates boxes 1, 2, 7, and 9.
Figure 3.

Calculated probability distributions of the sterically allowed side-chain dihedral-angle combinations χ1 and χ2 (5° × 5° bins) from the steric plus stereochemical constraint model (in b, d, f, h, and j) after averaging over all Ile configurations with the ϕ and ψ backbone dihedral angles given in a, c, e, g, and i, respectively. Panel a shows the distribution of ϕ and ψ from the culled Dunbrack database. Panel c indicates that the dipeptide mimetics derived from the culled Dunbrack database have ϕ and ψ set to the canonical α-helix values, ϕ = −60° and ψ = −45°. Panels e, g, and i represent uniform sampling of ϕ and ψ values in the shaded regions that coincide roughly with the α-helix, β-sheet, and α-helix plus β-sheet regions of the Ramachandran plot outer limits (dashed line) for τ = 115°. Note that sterically allowed conformations can occur outside the Ramachandran outer limits because we are not including clashes between backbone atoms. To see this figure in color, go online.
To determine the origin of the high-probability χ1 and χ2 combinations in box 6 centered around (300°, 180°), we investigated how the sampling of the backbone dihedral angles influences the side-chain dihedral-angle distributions. In Fig. 3 d, we show the sterically allowed probability distribution P(χ1, χ2) for Ile dipeptides derived from the culled Dunbrack database after setting the backbone dihedral angles to canonical α-helix values ϕ = −60° and ψ = −45°. Setting the ϕ and ψ backbone dihedral angles to canonical helix values increases the probability of box 6 from 35% to 49%. This result suggests that one reason for the large number of side-chain dihedral angle combinations near (300°, 180°) in the culled Dunbrack database is the preponderance of α-helical structures in the database.
To further investigate the interdependence between the backbone dihedral angles ϕ and ψ and side-chain dihedral angles χ1 and χ2, we also calculated the sterically allowed P(χ1, χ2) when uniformly sampling over different regions of ϕ and ψ space: the α-helix region (Fig. 3 e), β-sheet region (Fig. 3 g), and the combined α-helix and β-sheet regions (Fig. 3 i). The calculated results corresponding to each of these sampling methods are shown in Fig. 3, f, h, and j, respectively. Sampling different regions of ϕ and ψ space in this fashion has dramatic consequences for the sterically allowed side-chain dihedral-angle distributions. For example, we find that box 6 no longer contains the most sterically allowed χ1 and χ2 combinations when we sample uniformly over ϕ and ψ space. Boxes 4 and 5 now contain the largest number of sterically allowed χ1 and χ2 combinations, with >80% of the total contained in boxes 4, 5, and 6. This result emphasizes that χ1 and χ2 combinations in box 6 might be overweighted in rotamer libraries that do not account for the high α-helix content in the Protein Data Bank (PDB).
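The dependence of the box populations on how ϕ and ψ are sampled can be made concrete with a small sketch. It assumes, for illustration only, that the number of sterically allowed χ1 and χ2 combinations per box has already been tabulated at each backbone grid point; the function name `box_probabilities` and the array layout are hypothetical and are not taken from this work.

```python
import numpy as np

def box_probabilities(allowed_counts, weights):
    """Box probabilities after averaging over a chosen backbone sampling.

    allowed_counts[k, b]: number of sterically allowed (chi1, chi2)
        combinations in box b at backbone grid point k (assumed precomputed)
    weights[k]: sampling weight of backbone point k, e.g. uniform over a
        region of the Ramachandran plot, or proportional to how often that
        (phi, psi) bin occurs in a structural database
    """
    allowed_counts = np.asarray(allowed_counts, dtype=float)
    weights = np.asarray(weights, dtype=float)
    weighted = weights @ allowed_counts   # weighted sum over backbone points
    return weighted / weighted.sum()      # normalize so the boxes sum to one
```

Changing only `weights` (for instance from database frequencies to uniform sampling) changes the resulting box probabilities even though the underlying sterically allowed combinations are the same, which is the effect discussed above.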
We present the sterically allowed distributions P(χ1, χ2) for the relevant regions of ϕ and ψ space for Ile in Fig. 4. A close examination of these data makes it clear that the ψ dependence of P(χ1, χ2) is stronger than the ϕ dependence (except perhaps for values near ψ = −65°). For values in the range 35° ≤ ψ ≤ 55° (i.e., the top two rows of Fig. 4), box 4 contains the only sterically allowed χ1 and χ2 combinations over the full range, −180° ≤ ϕ ≤ −30°. As ψ decreases, sterically allowed χ1 and χ2 combinations populate box 5 as well as box 4. The most diverse collection of sterically allowed χ1 and χ2 combinations occurs in the range −65° ≤ ψ ≤ −25°, with boxes 3, 4, 5, 6, and 8 containing a significant number of sterically allowed combinations. For ψ ≤ −65°, the number of sterically allowed χ1 and χ2 combinations begins to decrease significantly.
Figure 4.

Calculated probability distributions P(χ1,χ2) of the sterically allowed side-chain dihedral-angle combinations χ1 and χ2 using the hard-sphere plus stereochemical constraint model for Ile dipeptides extracted from protein crystal structures in the culled Dunbrack database, after setting them to particular values of the backbone dihedral angles ϕ and ψ indicated in each panel. The sum of the P(χ1, χ2) distributions over all χ1 and χ2 equals one in each panel separately. To see this figure in color, go online.
Another illustrative way to display our data is to plot sterically allowed ϕ and ψ values for each box of χ1 and χ2 combinations. In Fig. 5, we count the number of sterically allowed χ1 and χ2 combinations that occur within 5° × 5° boxes in ϕ and ψ space for Ile. As expected, we find that there are very few ϕ and ψ combinations that admit sterically allowed χ1 and χ2 combinations in boxes 1, 2, 7, and 9. In addition, sterically allowed χ1 and χ2 combinations that populate boxes 3 and 6 are associated with ϕ and ψ combinations near canonical α-helix and β-sheet values. In contrast, sterically allowed χ1 and χ2 combinations that populate boxes 4 and 5 are associated with the bridge region and elevated ψ values in the β-sheet region of the Ramachandran plot. This behavior is also found in protein crystal structures from the culled Dunbrack database, as shown in Fig. S3.
Figure 5.

Calculated probability distributions P(ϕ, ψ) based on the sterically allowed combinations of Ile side-chain dihedral angles in boxes 1–9 (Fig. 2 d) in each panel. The Ramachandran plot inner (red) and outer (blue) limits for τ = 115° are indicated. The sums of the distributions P(ϕ, ψ) over all ϕ and ψ equal one for each panel separately. To see this figure in color, go online.
We also investigated the influence of correlations among the bond angles, bond lengths, and ω-backbone dihedral angles on the distribution of sterically allowed side-chain dihedral angles. In Fig. 6, we analyze the effects of the correlations between the 12 bond lengths, 15 bond angles, and 2 ω-backbone dihedral angles on the calculated sterically allowed probability distributions P(χ1, χ2) for Ile dipeptides when the backbone dihedral angles are fixed at the α-helix canonical values ϕ = −60° and ψ = −45°. In Fig. 6 a, we show the calculated P(χ1, χ2) for Ile residues from the culled Dunbrack database with ϕ and ψ at α-helix canonical values (same as Fig. 3 d). The correlation coefficients between the bond lengths and bond and dihedral angles for Ile residues from the culled Dunbrack database are shown in Fig. 6 d, with labels given in Table 1. The amplitudes of the fluctuating positive and negative correlations are above random noise (Fig. 6 f). In Fig. 6 b, we show P(χ1, χ2) for artificial Ile dipeptide mimetics with bond lengths, bond angles, and ω-backbone dihedral angles randomly selected from Gaussian distributions with means, standard deviations (SDs), and multivariate correlations that match those from the culled Dunbrack database. We find that the probability distributions P(χ1, χ2) shown in Fig. 6, a and b, are very similar to those obtained from Ile dipeptides constructed without building in multivariate correlations (Fig. 6 c). Thus, correlations in the bond lengths, bond angles, and ω-dihedral angles do not strongly influence the distribution of sterically allowed side-chain dihedral angles in dipeptides.
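The two ways of generating artificial dipeptide geometries, with and without multivariate correlations, can be sketched with NumPy. This is a minimal illustration, not the procedure used in this work: the function name `sample_geometry` is hypothetical, and the means, SDs, and correlation matrix are assumed to have been measured beforehand from the structural database. The default sample size of 8970 is the number of randomly generated mimetics quoted in the Fig. 6 caption.

```python
import numpy as np

def sample_geometry(mean, sd, corr=None, n=8970, seed=0):
    """Draw n sets of bond lengths, bond angles, and omega angles.

    mean, sd: per-variable means and standard deviations (length-d arrays)
    corr:     d x d correlation matrix; None means independent variables,
              i.e. only the means and SDs are matched
    Returns an (n, d) array of Gaussian samples.
    """
    mean = np.asarray(mean, dtype=float)
    sd = np.asarray(sd, dtype=float)
    if corr is None:
        corr = np.eye(len(mean))  # no correlations between variables
    # Covariance from SDs and correlations: cov_ab = sd_a * sd_b * corr_ab
    cov = np.outer(sd, sd) * np.asarray(corr, dtype=float)
    rng = np.random.default_rng(seed)
    return rng.multivariate_normal(mean, cov, size=n)
```

Calling the function once with the measured correlation matrix and once with `corr=None` would give the two ensembles whose sterically allowed P(χ1, χ2) are compared; for the 29 variables indexed in Table 1, `mean` and `sd` would each have length 29.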
Figure 6.

(a–c) Calculated probability distribution P(χ1, χ2) of the sterically allowed combinations of χ1 and χ2 for (a) Ile dipeptides extracted from protein crystal structures in the culled Dunbrack database with backbone dihedral angles for all residues rotated to ϕ = −60° and ψ = −45° (same as Fig. 3 d); (b) 8970 randomly generated Ile dipeptide mimetics with the backbone dihedral angles rotated to ϕ = −60° and ψ = −45°, bond lengths, bond angles, and dihedral angles ω (from residues i and i + 1) chosen randomly from Gaussian distributions with the same mean, SD, and multivariate correlations as found in the culled Dunbrack database; and (c) 8970 randomly generated Ile dipeptide mimetics with backbone dihedral angles set to ϕ = −60° and ψ = −45° and bond lengths, bond angles, and ω-dihedral angles chosen randomly from Gaussian distributions with only means and SDs that match the culled Dunbrack database. Panels d–f show the correlation coefficients between the 12 bond lengths, 15 bond angles, and two backbone ω-dihedral angles from the Ile dipeptide mimetics employed to calculate the probability distributions P(χ1, χ2) in a–c, respectively. The axes labels in d–f index the bond lengths, bond angles, and dihedral angles as shown in Table 1. To see this figure in color, go online.
Table 1.
Indexes that label the 12 bond lengths, 15 bond angles, and two backbone ω-dihedral angles that characterize the Ile dipeptide mimetic and appear in Fig. 6, d–f
| Index | Name |
|---|---|
| 1 | Cαi-1 – Ci-1 |
| 2 | Ci-1 – Oi-1 |
| 3 | Ci-1 – Ni |
| 4 | N – Cα |
| 5 | Cα – C |
| 6 | C – O |
| 7 | Cα – Cβ |
| 8 | Cβ – Cγ2 |
| 9 | Cβ – Cγ1 |
| 10 | Cγ1 – Cδ |
| 11 | Ci – Ni+1 |
| 12 | Ni+1 – Cαi+1 |
| 13 | Cαi-1 – Ci-1 – Oi-1 |
| 14 | Cαi-1 – Ci-1 – Ni |
| 15 | Oi-1 – Ci-1 – Ni |
| 16 | Ci-1 – Ni – Cαi |
| 17 | N – Cα – Cβ |
| 18 | N – Cα – C |
| 19 | C – Cα – Cβ |
| 20 | Cα – C – O |
| 21 | Cα – Cβ – Cγ1 |
| 22 | Cα – Cβ – Cγ2 |
| 23 | Cγ1 – Cβ – Cγ2 |
| 24 | Cβ – Cγ1 – Cδ |
| 25 | Cαi – Ci – Ni+1 |
| 26 | Oi – Ci – Ni+1 |
| 27 | Ci – Ni+1 – Cαi+1 |
| 28 | Cαi-1 – Ci-1 – Ni – Cαi |
| 29 | Cαi – Ci – Ni+1 – Cαi+1 |
We find qualitatively similar results for Leu dipeptides, with a few noteworthy differences. In Fig. 2 c, we show that most χ1 and χ2 combinations from Leu residues in the culled Dunbrack database occur in boxes 6 and 8 (Fig. 7, f and h), totaling >92% of the side-chain conformations. In Fig. 7, we plot the sterically allowed distributions P(χ1, χ2) for our model when we employ different sampling methods for ϕ and ψ. When we sample ϕ and ψ according to the culled Dunbrack database or when we set ϕ and ψ to canonical α-helix values, we find that 75% of the sterically allowed χ1 and χ2 combinations are found in boxes 6 and 8 (Fig. 7, f and h). An interesting difference between P(χ1, χ2) obtained from the culled Dunbrack database and that predicted from our model is that side-chain conformations in box 9 (Fig. 7, i) are more abundant in the model. This abundance occurs despite syn-pentane interactions (Dunbrack) that lead to strong overlaps between backbone and side-chain Cδ atoms for χ1 ≥ 300°. In future studies, we will investigate whether structures in coil libraries more frequently populate the sterically allowed conformations in box 9 (Fig. 7, i). In contrast to the results for Ile, the specific method used to sample ϕ and ψ does not strongly influence the calculated P(χ1, χ2) for Leu, i.e., uniform sampling of ϕ and ψ in Fig. 7, f, h, and j, gives results qualitatively similar to those obtained by sampling ϕ and ψ according to the culled Dunbrack distribution.
Figure 7.

(a–j) Calculated probability distributions of the sterically allowed side-chain dihedral-angle combinations χ1 and χ2 (5°×5° bins) from the steric plus stereochemical constraint model (in b, d, f, h, and j) after averaging over all Leu configurations with ϕ and ψ backbone dihedral angles given in a, c, e, g, and i, respectively. See Fig. 3 for additional information. To see this figure in color, go online.
The sterically allowed distributions P(χ1, χ2) for the relevant regions of ϕ and ψ space are plotted for Leu in Fig. S4. Again, we find that the ψ dependence of P(χ1, χ2) is somewhat stronger than the ϕ dependence. For values in the range 35° ≤ ψ ≤ 55° (i.e., the top two rows of Fig. S4), the model predicts few sterically allowed χ1 and χ2 combinations, with most occurring in box 6. As ψ decreases, sterically allowed χ1 and χ2 combinations populate more boxes, with most occurring in 6, 8, and 9. We also find sterically allowed χ1 and χ2 combinations that bridge boxes 5 and 6, as well as boxes 8 and 9, which suggests that these conformations enable transitions between rotamers (57).
In Fig. S5, we count the number of sterically allowed χ1 and χ2 combinations that occur within 5° × 5° boxes in ϕ and ψ space for Leu. For the rare χ1 and χ2 combinations (e.g., boxes 2, 3, 4, 5, and 7), the ϕ and ψ combinations that admit sterically allowed χ1 and χ2 combinations are fairly uniform. In contrast, the highly probable sterically allowed χ1 and χ2 combinations that populate boxes 6, 8, and 9 for the most part are associated with ϕ and ψ combinations in the canonical α-helix and β-sheet regions of the Ramachandran plot, although some conformations in box 6 exist in the bridge region. This predicted behavior is also found in the protein structures from the culled Dunbrack database (Fig. S6).
We also performed similar side-chain conformational analyses on the Leu and Ile residues in ultrahigh-resolution structures, and these gave results similar to those obtained with the calculations described above. See the Supporting Material and Figs. S1 and S2.
Conclusions
In summary, we have enumerated the sterically allowed side-chain dihedral-angle combinations for Leu and Ile dipeptide mimetics using a hard-sphere plus stereochemical constraint model. We find that the regions of the sterically allowed probability distributions P(χ1, χ2) correspond to side-chain dihedral-angle combinations that are observed in proteins of known structure. Thus, we emphasize that, in many cases, modeling steric and stereochemical constraints alone can quantitatively describe side-chain conformational statistics. The discrepancies between the side-chain dihedral-angle distributions calculated from our model and those extracted from the PDB are likely due to the particular nonuniform weighting of the sterically allowed side-chain conformations in the PDB and will be investigated in future studies.
Our complete enumeration approach may be contrasted with methods that rely exclusively on the PDB, which are overweighted by the ϕ and ψ combinations that occur frequently in structures deposited in the PDB. In contrast, with our model, we can interrogate side-chain conformations that are rarely sampled in the PDB as well as the highly probable ones. We are now in a position to calculate the side-chain dihedral-angle distributions for all other dipeptide mimetics and predict side-chain conformations in the context of proteins.
Acknowledgments
We thank R.L. Dunbrack, Jr., for providing a new high-resolution set of structures from the PDB, as well as for thought-provoking discussions. We also thank Jane and David Richardson for their valuable insights.
This work was supported by the National Science Foundation (grants DMR-1006537 and PHY-1019147) and the Raymond and Beverly Sackler Institute for Biological, Physical and Engineering Sciences. Alice Qinhua Zhou is a Howard Hughes Medical Institute International Research Fellow.
Footnotes
This is an Open Access article distributed under the terms of the Creative Commons-Attribution Noncommercial License (http://creativecommons.org/licenses/by-nc/2.0/), which permits unrestricted noncommercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
References
- 1.Kortemme T., Joachimiak L.A., Baker D. Computational redesign of protein-protein interaction specificity. Nat. Struct. Mol. Biol. 2004;11:371–379. doi: 10.1038/nsmb749. [DOI] [PubMed] [Google Scholar]
- 2.Shandler S.J., Korendovych I.V., DeGrado W.F. Computational design of a β-peptide that targets transmembrane helices. J. Am. Chem. Soc. 2011;133:12378–12381. doi: 10.1021/ja204215f. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 3.Korendovych I.V., Kulp D.W., DeGrado W.F. Design of a switchable eliminase. Proc. Natl. Acad. Sci. USA. 2011;108:6823–6827. doi: 10.1073/pnas.1018191108. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 4.Grove T.Z., Osuji C.O., Regan L. Stimuli-responsive smart gels realized via modular protein design. J. Am. Chem. Soc. 2010;132:14024–14026. doi: 10.1021/ja106619w. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 5.Cortajarena A.L., Wang J., Regan L. Crystal structure of a designed tetratricopeptide repeat module in complex with its peptide ligand. FEBS J. 2010;277:1058–1066. doi: 10.1111/j.1742-4658.2009.07549.x. [DOI] [PubMed] [Google Scholar]
- 6.Cortajarena A.L., Liu T.Y., Regan L. Designed proteins to modulate cellular networks. ACS Chem. Biol. 2010;5:545–552. doi: 10.1021/cb9002464. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 7. Chen T.S., Palacios H., Keating A.E. Structure-based redesign of the binding specificity of anti-apoptotic Bcl-x(L). J. Mol. Biol. 2013;425:171–185. doi: 10.1016/j.jmb.2012.11.009.
- 8. Humphris-Narayanan E., Akiva E., Kortemme T. Prediction of mutational tolerance in HIV-1 protease and reverse transcriptase using flexible backbone protein design. PLOS Comput. Biol. 2012;8:e1002639. doi: 10.1371/journal.pcbi.1002639.
- 9. King N.P., Sheffler W., Baker D. Computational design of self-assembling protein nanomaterials with atomic level accuracy. Science. 2012;336:1171–1174. doi: 10.1126/science.1219364.
- 10. Roberts K.E., Cushing P.R., Donald B.R. Computational design of a PDZ domain peptide inhibitor that rescues CFTR activity. PLOS Comput. Biol. 2012;8:e1002477. doi: 10.1371/journal.pcbi.1002477.
- 11. Privett H.K., Kiss G., Mayo S.L. Iterative approach to computational enzyme design. Proc. Natl. Acad. Sci. USA. 2012;109:3790–3795. doi: 10.1073/pnas.1118082108.
- 12. Murphy G.S., Mills J.L., Kuhlman B. Increasing sequence diversity with flexible backbone protein design: the complete redesign of a protein hydrophobic core. Structure. 2012;20:1086–1096. doi: 10.1016/j.str.2012.03.026.
- 13. Fleishman S.J., Whitehead T.A., Baker D. Computational design of proteins targeting the conserved stem region of influenza hemagglutinin. Science. 2011;332:816–821. doi: 10.1126/science.1202617.
- 14. Fleishman S.J., Baker D. Role of the biomolecular energy gap in protein design, structure, and evolution. Cell. 2012;149:262–273. doi: 10.1016/j.cell.2012.03.016.
- 15. Rohl C.A., Strauss C.E., Baker D. Protein structure prediction using Rosetta. Methods Enzymol. 2004;383:66–93. doi: 10.1016/S0076-6879(04)83004-0.
- 16. Guntas G., Purbeck C., Kuhlman B. Engineering a protein-protein interface using a computationally designed library. Proc. Natl. Acad. Sci. USA. 2010;107:19296–19301. doi: 10.1073/pnas.1006528107.
- 17. Ramakrishnan C., Ramachandran G.N. Stereochemical criteria for polypeptide and protein chain conformations. Biophys. J. 1965;5:909–933. doi: 10.1016/S0006-3495(65)86759-5.
- 18. Tsai J., Taylor R., Gerstein M. The packing density in proteins: standard radii and volumes. J. Mol. Biol. 1999;290:253–266. doi: 10.1006/jmbi.1999.2829.
- 19. Lee C., Levitt M. Accurate prediction of the stability and activity effects of site-directed mutagenesis on a protein core. Nature. 1991;352:448–451. doi: 10.1038/352448a0.
- 20. Chen J., Lu Z., Stites W.E. Proteins with simplified hydrophobic cores compared to other packing mutants. Biophys. Chem. 2004;110:239–248. doi: 10.1016/j.bpc.2004.02.007.
- 21. Benítez-Cardoza C.G., Stott K., Jackson S.E. Exploring sequence/folding space: folding studies on multiple hydrophobic core mutants of ubiquitin. Biochemistry. 2004;43:5195–5203. doi: 10.1021/bi0361620.
- 22. Willis M.A., Bishop B., Brunger A.T. Dramatic structural and thermodynamic consequences of repacking a protein’s hydrophobic core. Structure. 2000;8:1319–1328. doi: 10.1016/s0969-2126(00)00544-x.
- 23. Johnson E.C., Lazar G.A., Handel T.M. Solution structure and dynamics of a designed hydrophobic core variant of ubiquitin. Structure. 1999;7:967–976. doi: 10.1016/s0969-2126(99)80123-3.
- 24. Baldwin E., Xu J., Matthews B.W. Thermodynamic and structural compensation in “size-switch” core repacking variants of bacteriophage T4 lysozyme. J. Mol. Biol. 1996;259:542–559. doi: 10.1006/jmbi.1996.0338.
- 25. Harbury P.B., Tidor B., Kim P.S. Repacking protein cores with backbone freedom: structure prediction for coiled coils. Proc. Natl. Acad. Sci. USA. 1995;92:8408–8412. doi: 10.1073/pnas.92.18.8408.
- 26. Buckle A.M., Henrick K., Fersht A.R. Crystal structural analysis of mutations in the hydrophobic cores of barnase. J. Mol. Biol. 1993;234:847–860. doi: 10.1006/jmbi.1993.1630.
- 27. Sandberg W.S., Terwilliger T.C. Energetics of repacking a protein interior. Proc. Natl. Acad. Sci. USA. 1991;88:1706–1710. doi: 10.1073/pnas.88.5.1706.
- 28. Munson M., Balasubramanian S., Regan L. What makes a protein a protein? Hydrophobic core designs that specify stability and structural properties. Protein Sci. 1996;5:1584–1593. doi: 10.1002/pro.5560050813.
- 29. Munson M., O’Brien R., Regan L. Redesigning the hydrophobic core of a four-helix-bundle protein. Protein Sci. 1994;3:2015–2022. doi: 10.1002/pro.5560031114.
- 30. Lim W.A., Sauer R.T. Alternative packing arrangements in the hydrophobic core of λ repressor. Nature. 1989;339:31–36. doi: 10.1038/339031a0.
- 31. Ponder J.W., Richards F.M. Tertiary templates for proteins. Use of packing criteria in the enumeration of allowed sequences for different structural classes. J. Mol. Biol. 1987;193:775–791. doi: 10.1016/0022-2836(87)90358-5.
- 32. Word J.M., Lovell S.C., Richardson D.C. Visualizing and quantifying molecular goodness-of-fit: small-probe contact dots with explicit hydrogen atoms. J. Mol. Biol. 1999;285:1711–1733. doi: 10.1006/jmbi.1998.2400.
- 33. Lovell S.C., Word J.M., Richardson D.C. The penultimate rotamer library. Proteins. 2000;40:389–408.
- 34. Keedy D.A., Williams C.J., Richardson J.S. The other 90% of the protein: assessment beyond the Cαs for CASP8 template-based and high-accuracy models. Proteins. 2009;77(Suppl 9):29–49. doi: 10.1002/prot.22551.
- 35. Headd J.J., Immormino R.M., Richardson J.S. Autofix for backward-fit sidechains: using MolProbity and real-space refinement to put misfits in their place. J. Struct. Funct. Genomics. 2009;10:83–93. doi: 10.1007/s10969-008-9045-8.
- 36. Dunbrack R.L., Jr., Karplus M. Conformational analysis of the backbone-dependent rotamer preferences of protein sidechains. Nat. Struct. Biol. 1994;1:334–340. doi: 10.1038/nsb0594-334.
- 37. Bower M.J., Cohen F.E., Dunbrack R.L., Jr. Prediction of protein side-chain rotamers from a backbone-dependent rotamer library: a new homology modeling tool. J. Mol. Biol. 1997;267:1268–1282. doi: 10.1006/jmbi.1997.0926.
- 38. Dunbrack R.L., Jr., Cohen F.E. Bayesian statistical analysis of protein side-chain rotamer preferences. Protein Sci. 1997;6:1661–1681. doi: 10.1002/pro.5560060807.
- 39. Shapovalov M.V., Dunbrack R.L., Jr. A smoothed backbone-dependent rotamer library for proteins derived from adaptive kernel density estimates and regressions. Structure. 2011;19:844–858. doi: 10.1016/j.str.2011.03.019.
- 40. MacKerell A.D., Jr., Feig M., Brooks C.L., 3rd. Improved treatment of the protein backbone in empirical force fields. J. Am. Chem. Soc. 2004;126:698–699. doi: 10.1021/ja036959e.
- 41. Li D.-W., Brüschweiler R. NMR-based protein potentials. Angew. Chem. Int. Ed. Engl. 2010;49:6778–6780. doi: 10.1002/anie.201001898.
- 42. Lindorff-Larsen K., Piana S., Shaw D.E. Improved side-chain torsion potentials for the Amber ff99SB protein force field. Proteins. 2010;78:1950–1958. doi: 10.1002/prot.22711.
- 43. Beauchamp K.A., Lin Y.-S., Pande V.S. Are protein force fields getting better? A systematic benchmark on 524 diverse NMR measurements. J. Chem. Theory Comput. 2012;8:1409–1414. doi: 10.1021/ct2007814.
- 44. Vymětal J., Vondrášek J. Critical assessment of current force fields. Short peptide test case. J. Chem. Theory Comput. 2013;9:441–451. doi: 10.1021/ct300794a.
- 45. Hu H., Elstner M., Hermans J. Comparison of a QM/MM force field and molecular mechanics force fields in simulations of alanine and glycine “dipeptides” (Ace-Ala-Nme and Ace-Gly-Nme) in water in relation to the problem of modeling the unfolded peptide backbone in solution. Proteins. 2003;50:451–463. doi: 10.1002/prot.10279.
- 46. MacKerell A.D., Jr., Feig M., Brooks C.L., 3rd. Extending the treatment of backbone energetics in protein force fields: limitations of gas-phase quantum mechanics in reproducing protein conformational distributions in molecular dynamics simulations. J. Comput. Chem. 2004;25:1400–1415. doi: 10.1002/jcc.20065.
- 47. Reichmann D., Rahat O., Schreiber G. The molecular architecture of protein-protein binding sites. Curr. Opin. Struct. Biol. 2007;17:67–76. doi: 10.1016/j.sbi.2007.01.004.
- 48. Schreiber G., Keating A.E. Protein binding specificity versus promiscuity. Curr. Opin. Struct. Biol. 2011;21:50–61. doi: 10.1016/j.sbi.2010.10.002.
- 49. Wang G., Dunbrack R.L., Jr. PISCES: a protein sequence culling server. Bioinformatics. 2003;19:1589–1591. doi: 10.1093/bioinformatics/btg224.
- 50. Wang G., Dunbrack R.L., Jr. PISCES: recent improvements to a PDB sequence culling server. Nucleic Acids Res. 2005;33(Web Server issue):W94–W98. doi: 10.1093/nar/gki402.
- 51. Reference deleted in proof.
- 52. Lovell S.C., Davis I.W., Richardson D.C. Structure validation by Cα geometry: φ,ψ and Cβ deviation. Proteins. 2003;50:437–450. doi: 10.1002/prot.10286.
- 53. Tronrud D.E., Karplus P.A. A conformation-dependent stereochemical library improves crystallographic refinement even at atomic resolution. Acta Crystallogr. D Biol. Crystallogr. 2011;67:699–706. doi: 10.1107/S090744491102292X.
- 54. Zhou A.Q., O’Hern C.S., Regan L. Revisiting the Ramachandran plot from a new angle. Protein Sci. 2011;20:1166–1171. doi: 10.1002/pro.644.
- 55. Zhou A.Q., O’Hern C.S., Regan L. The power of hard-sphere models: explaining side-chain dihedral angle distributions of Thr and Val. Biophys. J. 2012;102:2345–2352. doi: 10.1016/j.bpj.2012.01.061.
- 56. Word J.M., Lovell S.C., Richardson D.C. Asparagine and glutamine: using hydrogen atom contacts in the choice of side-chain amide orientation. J. Mol. Biol. 1999;285:1735–1747. doi: 10.1006/jmbi.1998.2401.
- 57. Petrella R.J., Karplus M. The energetics of off-rotamer protein side-chain conformations. J. Mol. Biol. 2001;312:1161–1175. doi: 10.1006/jmbi.2001.4965.
