Author manuscript; available in PMC: 2016 Aug 11.
Published in final edited form as: J Chem Theory Comput. 2015 Jul 23;11(8):3696–3713. doi: 10.1021/acs.jctc.5b00255

ff14SB: Improving the accuracy of protein side chain and backbone parameters from ff99SB

James A Maier †,§, Carmenza Martinez ‡,§, Koushik Kasavajhala ‡,§, Lauren Wickstrom , Kevin E Hauser ‡,§, Carlos Simmerling †,‡,§,*
PMCID: PMC4821407  NIHMSID: NIHMS772276  PMID: 26574453

Abstract

Molecular mechanics is powerful for its speed in atomistic simulations, but an accurate force field is required. The Amber ff99SB force field improved protein secondary structure balance and dynamics from earlier force fields like ff99, but weaknesses in side chain rotamer and backbone secondary structure preferences have been identified. Here, we performed a complete refit of all amino acid side chain dihedral parameters, which had been carried over from ff94. The training set of conformations included multidimensional dihedral scans designed to improve transferability of the parameters. Improvement in all amino acids was obtained as compared to ff99SB. Parameters were also generated for alternate protonation states of ionizable side chains. Average errors in relative energies of pairs of conformations were under 1.0 kcal/mol as compared to QM, reduced 35% from ff99SB. We also took the opportunity to make empirical adjustments to the protein backbone dihedral parameters as compared to ff99SB. Multiple small adjustments of φ and ψ parameters were tested against NMR scalar coupling data and secondary structure content for short peptides. The best results were obtained from a physically motivated adjustment to the φ rotational profile that compensates for lack of ff99SB QM training data in the β-ppII transition region. Together, these backbone and side chain modifications (hereafter called ff14SB) not only better reproduced their benchmarks, but improved secondary structure content in small peptides, and reproduction of NMR χ1 scalar coupling measurements for proteins in solution. We also discuss the Amber ff12SB parameter set, a preliminary version of ff14SB that includes most of its improvements.

Keywords: force field, molecular mechanics, molecular dynamics, MD, ff99SB, ff12SB, ff14SB, AMBER, protein folding


Introduction

Computational studies of biopolymers such as proteins have become commonplace, supplementing experimental information with models that provide simultaneous resolution in time, space and energy. The results, however, strongly depend on the accuracy of the computed energies and forces. The constant chemical topology prescribed by molecular mechanics (MM) leads to dramatically enhanced efficiency over quantum mechanics (QM). As wide application of even the most approximate quantum methods to solvated biomolecular simulations remains computationally prohibitive, development of accurate force fields is a critical problem for in silico biomolecular studies. Polarizable force fields1 in principle should be more accurate than fixed-charge force fields, but for many systems of interest current fixed-charge models may provide results in aqueous solution that are comparably reasonable to those of their polarizable contemporaries2. This is particularly important as these fixed-charge models also typically have lower computational overhead, allowing for improvements to the conformational sampling that often limits comparison to experiment. We believe that there is still room for improvement in fixed-charge biomolecular MM force fields. Such optimization is the focus of this work, in which we aim to improve accuracy where it is needed most, without adding computational complexity over the widely used ff99SB model3.

ff99SB uses the functional form and many of the parameters derived in ff944 and ff995, largely associated with the Amber software6. A key difference among these force fields is the parameters of the Fourier series that adjust energy profiles for rotation around bonds. These corrections account for key orbital effects or weaknesses in other terms like 1–4 non-bonded interactions, and fitting them is typically the last step in fitting force field parameters. A key assumption in these force fields is that the dihedral corrections are uncoupled, and thus have no explicit dependence on values of neighboring dihedrals. In ff94, “generic” torsions applying to all sets of four atoms around a bond between two atom types (using a wildcard for the outer 2 atoms) were fit to a set of experimental small molecule barrier heights. In ff995, multiple-periodicity specific torsional parameters applicable to protein side chains were fit to a larger set of small molecules.
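As an illustration of the Fourier series correction described above, the following minimal sketch evaluates a torsion term in the commonly written form E(φ) = Σ (V_n/2)[1 + cos(nφ − γ_n)]. The function name and the amplitudes, periodicities, and phases are hypothetical placeholders chosen for illustration; they are not parameters from ff94, ff99, ff99SB, or ff14SB.

```python
import math

def torsion_energy(phi_deg, terms):
    """Return sum of (V_n / 2) * (1 + cos(n * phi - gamma_n)) in the units of V_n."""
    phi = math.radians(phi_deg)
    energy = 0.0
    for v_n, n, gamma_deg in terms:
        energy += 0.5 * v_n * (1.0 + math.cos(n * phi - math.radians(gamma_deg)))
    return energy

# Hypothetical terms: (amplitude in kcal/mol, periodicity, phase in degrees).
example_terms = [(2.0, 1, 0.0), (1.0, 2, 180.0), (0.5, 3, 0.0)]

# Energy profile for one full rotation, sampled every 30 degrees.
profile = [(phi, torsion_energy(phi, example_terms)) for phi in range(-180, 180, 30)]
```

Because each term depends only on its own dihedral angle, the sketch also makes the uncoupled assumption explicit: no term takes a neighboring dihedral as input.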

An important component of protein force fields is the “backbone” dihedral parameters that can alter secondary structure preferences. ff94 and ff99SB leverage unique corrections to multiple sets of 4 atom dihedral terms around φ and ψ, describing the multiple combinations of atoms bonded to the central 2 atoms. In ff94, the baseline backbone dihedral profile for φ (C–N–Cα–C) and ψ (N–Cα–C–N) dihedral corrections were fit to glycine dipeptide conformation energies from QM. Then, the influence of the side chain was added to other amino acids by fitting parameters for the so-called φ′ (C–N–Cα–Cβ) and ψ′ (Cβ–Cα–C–N) based on alanine dipeptide QM conformational energies. Importantly, the φ′ and ψ′ were fit as a correction on top of the φ and ψ parameters that had already been fit to glycine. Thus all amino acids except glycine had 2 full sets of “backbone” dihedral contributions, one using only backbone atoms and a second correction set using the Cβ atom. The ubiquity of force fields based on ff94 shows its overall effectiveness, despite specific weaknesses in performance for proteins, such as exaggerated helical propensity3. With ff99SB3, protein backbone dihedrals were refit by expanding upon the methods used in ff94 and ff99. A larger set of alanine tetrapeptide conformations was used in fitting φ′ and ψ′, as well as introducing glycine tetrapeptide conformations for fitting φ and ψ. The relative energies of conformations control populations and barriers in an MM model, and thus they were used as the direct targets in parameter optimization. The conformations were limited to local minima because of computational expense, but the fitting struck a balance of secondary structure suitable for a range of systems3, 7. ff99SB became widely adopted in the simulation community, and thus the same approach of conformation pair energy fitting is used in the present work for optimization of side chain dihedral parameters.

Limitations in ff99SB

Limitations in models often only become apparent after extensive use and testing. One advantage of the wide adoption of ff99SB is that trends in the weaknesses were noted, in contrast to single anecdotal failures for which the cause may be difficult to determine, or unknown weaknesses in force fields that are not widely distributed.

Most notably, rotamer preferences for several side chains were observed to be less accurate than others8. This likely arose since ff99SB inherited amino acid side chain dihedral parameters from ff99, which were derived against a limited set of relative energies for small organic compounds5. The transferability of energy correction parameters for small molecules with relatively simple energy landscapes to amino acids may be an issue. The atoms in the amino acids typically have different partial atomic charges than the reference compounds, as well as more complex coupling to neighboring fragments. Due to recent increases in computational power, more extensive calculations (including full rotational energy profiles rather than selected stable conformations) can now be used to train side chain parameters directly against QM data on complete amino acids8, 9.

Several studies noted room for improvement in the secondary structure preferences of ff99SB, and this is also investigated here. After ff99SB was published, solution scalar coupling data for short peptides10 became available, against which ff99SB and other force fields were compared11, and the potential for improvement was discussed11a. NMR scalar coupling constants, especially three-bond scalar ³J couplings, are particularly relevant for evaluation of dihedral distributions, as ³J couplings measure spin-spin interactions across three bonds. This allows one to utilize the simple Karplus relation12, a three-term series in the cosine of the dihedral angle, to convert directly from dihedral angles to scalar couplings. In practice, however, the Karplus relation fails to account for other features that may affect scalar coupling, such as bond length or angle, or neighboring spin systems. Recent DFT calculations suggest, for example, that nearly all peptide backbone scalar couplings may in fact depend on both φ and ψ13. Furthermore, scalar couplings calculated by a Karplus relation are sensitive to which empirical Karplus parameters are used11a, 11b. It also has been suggested that helical structures are not stable enough in ff99SB11b, 11c, 14.
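The conversion from dihedral angles to scalar couplings can be sketched as follows, using the common form of the Karplus relation, J(θ) = A cos²θ + B cos θ + C, with θ = φ − 60° for the HN–Hα coupling. The coefficients below are illustrative values of typical magnitude and are an assumption of this sketch; they are not necessarily any of the Karplus parameter sets evaluated in this work, and the function names are hypothetical.

```python
import math

def karplus_3j(phi_deg, a, b, c, offset_deg=-60.0):
    """Three-term Karplus relation evaluated at theta = phi + offset."""
    theta = math.radians(phi_deg + offset_deg)
    return a * math.cos(theta) ** 2 + b * math.cos(theta) + c

def ensemble_average_3j(phis_deg, a, b, c):
    """Average the coupling over a set of sampled phi values (equal weights)."""
    values = [karplus_3j(phi, a, b, c) for phi in phis_deg]
    return sum(values) / len(values)

# Illustrative coefficients in Hz (assumed, not taken from this paper).
A, B, C = 6.51, -1.76, 1.60

# With A > 0 and B < 0 the curve peaks where cos(theta) = -1, i.e. phi = -120 degrees.
phi_at_maximum = max(range(-180, 180, 5), key=lambda phi: karplus_3j(phi, A, B, C))
```

Since the ensemble average is taken over couplings rather than over angles, a modest population near the maximum of the curve can raise the averaged coupling noticeably.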

We hypothesize that two potential weaknesses in the ff99SB backbone parameter fitting strategy may be the dominant factors limiting accuracy: (1) the lack of backbone fitting data outside gas-phase minima and (2) using pre-polarized MM partial charges intended for aqueous solution simulations while fitting dihedral parameters against gas-phase QM data. Limiting the backbone parameter training to gas-phase local minima left potentially arbitrary energies for transition barriers or, importantly, in regions that become minima in solution or in the complex landscape of the protein interior. Additionally, the additive ff99SB model employs HF/6-31G* RESP partial atomic charges15 that overestimate gas-phase dipoles by a similar amount as obtained in water models such as TIP3P16, thus approximating the polarization expected in aqueous solution15. However, subsequent refitting of dihedral energy profiles to more accurate gas-phase energies calculated at the MP2 level results in dihedral parameters that may partially counteract the contribution of implicit polarization effects on the rotational energy profiles. Thus empirical corrections may provide additional benefit in reproducing experiments in water. While an alternative strategy to account for solvation effects in a more consistent way might be to develop an entirely new charge model17, the original ff94 RESP charge model4, 15 developed by Peter Kollman has been extensively tested, and retaining it also maintains compatibility with many other parameter sets such as those modeling nucleic acids and carbohydrates18. Likewise, refitting the entire backbone dihedral profile rather than just minima would potentially lose the advantage of extensive studies7, 11a, 11b, 19 evaluating ff99SB’s strengths and weaknesses. Here, we investigate the simpler strategy of developing a small empirical adjustment to the ff99SB backbone parameters to improve reproduction of the experimental data in solution.

We show below that ff14SB, the combination of ff99SB with these new QM-based side chain dihedral parameters and a small empirical adjustment to the backbone φ energy profile, improves upon ff99SB in ensemble distributions for short polyalanine conformations (backbone scalar couplings), protein side chains (χ1 scalar couplings), and secondary structure balance for peptides adopting α-helical and β-hairpin conformations, while maintaining the quality of ff99SB local dynamics (Lipari-Szabo S² order parameters). Recently, we also showed that updating the side chain parameters with those reported here results in a model that is able to accurately fold a wide variety of protein topologies up to nearly 100 amino acids20. It has also been shown that ff14SB maintains the crystal lattice and protein conformations of triclinic lysozyme better than ff99SB, ff14ipq, or CHARMM3621. We recommend use of ff14SB for protein simulations in Amber as well as in other biomolecular simulation software.

Fitting Strategy and Goals

Side chains

For the side chain update, we focus on improving the aspects that we feel are most likely to be the greatest weaknesses in the current model. Several recent reports of force field training have focused on application of more accurate quantum theory3, 8, 9, 22. Although the level of theory is certainly important, we feel that focusing on the conformational diversity and consistency in the training data set is more likely to improve the model. In principle, dihedral parameters account for orbital effects missing in a classical model for bond rotation, but in practice also serve as empirical corrections for all differences between the QM and MM models, including lack of charge polarization changes during rotation, as well as dihedral-dependent errors in other (bond, angle, and nonbonded) terms in the force field. As a result, the appropriate correction needed to match the MM torsion rotation energy profile to that obtained using QM may differ depending on chemical or conformational context, such as backbone conformation or other side chain torsions. In most biomolecular MM force fields, however, each rotatable bond is described by parameters that are independent of the conformation of the rest of the molecule (though exceptions exist where a subset of dihedral pairs are explicitly coupled, such as the CHARMM CMAP23). As a result, while the net energy profile for rotating about a given bond will likely depend on other dihedrals (through steric effects, for example), the lack of explicit coupling in dihedral parameters limits the parameters to an implicit account for any coupling missing in the classical model. Therefore, it is important that the structures used for dihedral fitting include neighboring regions of the molecule where the parameters will be used. It is paramount to include conformational variety in those regions to avoid implicit coupling to a limited subset of their phase space, for example, a single rotamer or backbone conformation.
In the present case, this led us to follow previous work8, 9 and use complete amino acids in the QM calculations for the side chain training data, as opposed to the small organic compounds used in ff995 and carried over to ff99SB. It is also crucial to ensure that the appropriate molecular degrees of freedom are consistent between the QM and MM, whereas other degrees of freedom may need to be optimized for each model. Thus we explored different restraints and optimization schemes.

Furthermore, implicit coupling was incorporated by fitting a single set of dihedral parameters using a large set of conformations that included multidimensional scans of all side chain χ rotatable bonds, with both α and β backbone conformations for the dipeptide. Thus, while the model lacks explicit coupling, the correction parameters for each dihedral are optimized in a mean field of extensive conformational variability for the remainder of the molecule. In a recent revision of a small number of ff99SB side chain torsions that were identified by comparison of rotamer preferences in a helical context against the PDB, training against quantum mechanics energies for conformations with extended backbone improved χ1 rotamer preferences in β-rich proteins. However, while two of four amino acids showed considerable improvement in the helical test case, the other two showed more modest reduction in errors8. Our goal is to derive parameters that are transferable across chemical and conformational diversity, thus we explicitly included dipeptide conformations with both α and β backbone when sampling side chain rotational profiles. Given the weaknesses associated with scalar couplings, we did not fit to side chain scalar coupling data, but used them only to evaluate results of parameter changes that were fit using QM. This differentiates our approach from CHARMM369, for example, where side chain parameters were finally adjusted to better reproduce χ1 scalar couplings.

Another choice concerns the importance of conformational diversity in the side chain rotamer training set. One option is to scan each dihedral rotamer separately, but as discussed above, this approach can fail to incorporate coupling needed in the correction terms, and may provide parameters that work well for some rotamer combinations, but fail for others that were absent in the training data. Since including coupling via multidimensional scans can generate large numbers of conformations, one option to reduce the size of the training set is to include only minima. However, the exact locations of side chain χ minima can depend on backbone conformation, solvent, and packing with nearby residues. We thus decided to sample side chain conformations via full two-dimensional scans for the thirteen amino acids (counting different protonation states for Asp and His) with two side chain dihedrals (Table S1). For larger amino acids, conformational diversity was generated via symmetry considerations or dynamics simulations (described below). Because positions of minima may change in the context of a more complex system, and because energies for transitions may be relevant, each point in these scans was considered equally important as compared to weighting data points by their energy. As side chain preferences are coupled to the backbone24, we performed these scans at both α (φ,ψ = −60°, −45°) and β (−135°, 135°) backbone conformations. Although additional backbone conformations could be employed, we considered only the archetypal α and β secondary structures due to computational cost. A separate ppII conformation (−75°, 150°) was not included, as the interaction of the side chain with the N-terminal peptide group is comparable in ppII and α conformations, while the interaction of the side chain with the C-terminal peptide is comparable in ppII and β conformations, thus these interactions are represented in the two backbone conformations already included in our set.

Our fitting targets for the side chains were gas-phase ab initio energies, as in ff99SB. To accommodate the 15082 dipeptide conformations in our training set, we employed a relatively modest level of theory, with geometries calculated at HF/6-31G* and single point energies calculated at MP2/6-31+G**. Given the fundamental approximations, such as additivity, fixed partial charges, r⁻¹² repulsion, and harmonic bond and angle vibrations, we do not expect the quantum theory to be the limiting factor in improving our model and focused on increasing the conformational diversity in the training set.
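Because the fitting targets are relative energies of conformation pairs, with every scan point weighted equally, one simple way to score a candidate parameter set is the mean absolute difference between QM and MM relative energies over all pairs. The sketch below is an assumed, simplified objective for illustration only; it is not the average normalized error (ANE) used in this work, and the function name and energies are hypothetical.

```python
from itertools import combinations

def mean_pairwise_error(e_qm, e_mm):
    """Mean |dE_QM(i, j) - dE_MM(i, j)| over all conformation pairs i < j."""
    pairs = list(combinations(range(len(e_qm)), 2))
    total = 0.0
    for i, j in pairs:
        total += abs((e_qm[i] - e_qm[j]) - (e_mm[i] - e_mm[j]))
    return total / len(pairs)

# Hypothetical energies (kcal/mol) for three conformations.
qm_energies = [0.0, 1.0, 3.0]
mm_energies = [0.0, 2.0, 3.0]

error = mean_pairwise_error(qm_energies, mm_energies)

# A constant offset between the QM and MM energy scales does not change the score.
shifted_error = mean_pairwise_error(qm_energies, [e + 5.0 for e in mm_energies])
```

Scoring pairs rather than absolute energies removes the arbitrary zero of energy between the two models, and equal weights avoid favoring only the low-energy minima.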

Additional choices that must be made relate to the generation of the QM and MM energies for conformations in the training set. First, we investigated what restraints to use in potential energy surface scans. Restraining the 4-atom set defining each χ dihedral, as well as those for φ and ψ, is natural given the goal of scanning combinations. Less obvious is whether other dihedral restraints should be included for the rotatable bond being scanned, such as those sharing the same 2 atoms defining the central bond, but varying the outer atoms. For example, the restraint for χ1 in Val is defined using N-Cα-Cβ-Cγ1, but the dihedral N-Cα-Cβ-Cγ2 could either be restrained or allowed to freely optimize in the presence of the χ1 restraint. Another choice is whether (and how strongly) to restrain other parts of the molecule, such as methyl rotations, or the peptide ω rotation. Next, given the fundamental differences in QM and MM models, and weakness in MM description of energetics beyond dihedral profiles, we investigated whether to optimize geometries once—calculating molecular mechanical energies of quantum mechanical structures—or to re-optimize the QM structures with the MM model prior to comparing energies. The energies could be calculated for identical structures, for example the quantum mechanical structures. An advantage of this approach is that all coordinates and non-bonded distances would be identical. Alternatively, energies could be compared for structures optimized with the corresponding method (i.e. MM energies for MM-optimized structures). 
The MM model may not reproduce small changes in bond and angle geometries for different rotamers in the QM model, and the stiffness of the MM quadratic function could cause these differences to make large contributions to the errors in relative energies of conformation pairs. These contributions could be relaxed with MM re-optimization (or dynamics), thus focusing the resulting energy profile on the rotamer changes rather than MM covalent structure approximations.

Like several other MM force fields, the Amber-related models have traditionally used atom types to apply a small number of bond, angle and dihedral parameters to similar fragments in different amino acids. Ideally, the parameters would be highly transferable, and show accuracy for a variety of contexts. This approach reduces the number of parameters needed, but also limits the accuracy of the model as the implicit coupling we seek is worsened when the parameters are averaged over too great a variety of neighboring functional groups that can influence charge distribution. More specifically, the dihedral parameters are added to the energy calculated using the 1–4 electrostatics, but the partial charges are atom-specific and need not be the same for atoms with the same atom type. Yet, many sets of four atoms in the amino acid backbone and side chains shared the same atom types (and therefore the same dihedral corrections) with each other and also with nucleic acids in ff99SB. To overcome this, new atom types were created when needed to improve specificity. For example, asparagine χ2 (Cα-Cβ-Cγ-Nδ), glutamine χ3 (Cβ-Cγ-Cδ-Nɛ), and ψ′ (Cβ-Cα-C-N) all shared atom types CT-CT-C -N, and therefore the same dihedral corrections applied to all three bonds. Here, additional atom types were created to allow independent adjustment of backbone parameters and different side chain parameters. A new atom type for the α carbon (CX) was created to separate main chain, χ1, χ2, and other χ parameters, as the backbone and side chain were corrected separately. Where cross-referencing simulation data and errors fitting quantum energies suggested that solving corrections for particular amino acids together led to inaccuracies that solving separately would alleviate, additional atom types were also introduced to segregate them. 
Within the side chains, atom types 2C and 3C were developed for carbons bound to two or three heavy atoms, respectively, more thoroughly describing branched amino acids while isolating the revisions to amino acids (and preventing application to nucleic acids, which was possible in previous models). The CO atom type was introduced to distinguish carboxylate carbon from other carbonyl carbons. The C8 atom type was added for arginine and lysine sp3 carbons, to distinguish them from glutamate, glutamine, and methionine. Each side chain atom type was added only if it had some chemical justification, allowed better reproduction of both quantum mechanics fitting targets, and improved dynamic properties, to verify that additional parameters are appropriate. Tables S1 and S2 provide atom types for all atoms in the amino acids.
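The effect of shared atom types on dihedral parameters can be illustrated with a small lookup sketch. Dihedral corrections are keyed by the quadruple of atom types, so torsions with identical quadruples necessarily share parameters. The sketch applies only the CX retyping of the α carbon described above; the complete ff14SB assignments also use the additional types listed in Tables S1 and S2, which are not reproduced here. Function and variable names are hypothetical.

```python
def group_by_types(torsion_types):
    """Group torsion names by their atom-type quadruple (the parameter lookup key)."""
    groups = {}
    for name, key in torsion_types.items():
        groups.setdefault(key, []).append(name)
    return groups

def retype(types, position, new_type):
    """Return a copy of a type quadruple with one position replaced."""
    updated = list(types)
    updated[position] = new_type
    return tuple(updated)

# In ff99SB all three torsions carry the same atom types.
shared = {name: ("CT", "CT", "C", "N") for name in ("Asn chi2", "Gln chi3", "psi'")}

# Illustration: retype only the alpha carbon as CX.
split = dict(shared)
split["Asn chi2"] = retype(shared["Asn chi2"], 0, "CX")  # C-alpha is the first atom
split["psi'"] = retype(shared["psi'"], 1, "CX")          # C-alpha is the second atom

groups_before = group_by_types(shared)
groups_after = group_by_types(split)
```

With a single shared key, any change fitted for one torsion is applied to all three; after retyping, each torsion has its own key and can be adjusted independently.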

Backbone

Another goal of this work is to develop empirical adjustments to the dihedral parameters in order to improve secondary structure balance, and agreement with NMR scalar coupling data for short peptides in solution. Particularly, HN–Hα scalar couplings calculated from the MD ensemble were too high11a, suggesting too much sampling in the region between β and ppII, where the Karplus curve suggests coupling constant values greater than the values observed in the NMR experiments (Figure 1A). Despite this region being a formal barrier region, and therefore not represented by the minimized ff99SB training set conformations (Figure 1A), free energy profiles for alanine dipeptide suggest that the barrier is low enough that population of the region can contribute significantly to the ensemble averaged coupling constant at 300K (Figure 1B). It seems reasonable that the optimized ff99SB energy profile may have been too low in this region yet still provided a good fit to the training data that lacked structures in the transition region. Rather than attempting to generate additional unminimized training data and refitting the entire backbone dihedral profile, we took the approach of developing a small empirical correction aimed at reducing the population in the transition region in order to improve the correspondence between the simulations and experiments. Importantly, while much of the rest of the force field (and certainly the other dihedral parameters) was fit to physics-based, QM data, this backbone dihedral adjustment is an empirical correction based on data obtained from simulations carried out in TIP3P explicit water. Therefore, transferability of this correction term to other implicit or explicit solvent models may need evaluation prior to production use.

Figure 1.

(A) Ramachandran plot with the structures in the ff99SB Ala3 training set3 shown as circles, with the Hu and Bax25 H-Hα Karplus curve data shown in the background as a color gradient. Vertical lines indicate the φ values where the Karplus curve matches the scalar coupling value from either NMR10 (black) or ff99SB simulations (gray). Note that ff99SB training data were limited near the maximum of the Karplus curve (φ=−120°), suggesting that the ff99SB energies may be poorly defined in this region. (B) Free energy surface for alanine dipeptide in ff99SB, showing that the β-ppII transition region near φ,ψ=−120°,160° has significant population despite lack of training data in Figure 1A.

We therefore developed modifications to φ′ and ψ′ torsional parameters to specifically address agreement between NMR and MD for short alanine peptides (see Supporting Information for complete details). We hypothesized that the main problem is that the β-ppII barrier is too low due to lack of fitting data, but as described below, we tested alternative strategies as well. Some modifications raise the barriers between β and ppII basins or between ppII and α basins. Other modifications stabilize ppII and α relative to β or stabilize α relative to β and ppII to account for the solvation inconsistencies described above (since aqueous solvation stabilizes the α-helical macrodipole). All pairs of individual φ and ψ modifications were combined and tested. To isolate backbone errors from side chain refitting, we continued to use alanine as a model system, and evaluated thirty candidate force fields against Ala5 scalar couplings. Besides decoupling the backbone and side chain, Ala5 is small, facilitating generation of precise conformational ensembles in explicit water, with error bars smaller than the differences resulting from changing the backbone parameters. Such reduction of precision errors to lower than the accuracy errors is crucial for quantitative force field validation. Scalar couplings were evaluated using multiple sets of Karplus parameters to seek consensus. But due to the limitations discussed above, evaluation of these coupling constants gives a qualitative guide to force field quality; we use this to filter parameter sets for more extensive (and computationally expensive) testing on larger systems. One potential strength of our approach is that we do not fit to scalar couplings directly (particularly since the values depend on the Karplus parameters selected), but use them to select a limited set among many physically-motivated empirical modifications.
This differs from recent work that has focused on modifying a single torsional term to reproduce solution measurements directly11c, 26 or deriving coupled corrections against protein chemical shifts27. We also desired parameters that were transferable between short disordered peptides (such as Ala5) and larger peptides with propensity to adopt stable secondary structure. This puts our approach between that of Nerenberg and Head-Gordon26 and those of Best and Hummer11c, and Li and Brüschweiler27.

Potential limitations in our approach

We retain many of the approximations present in ff99SB, such as weaknesses in the harmonic description of covalent bonds and angles, as well as the 6–12 Lennard-Jones function. We also retained the same 1–4 nonbonded scaling factors employed with ff99SB. We refit all side chain dihedral parameters except the “generic” terms applied to nonpolar hydrogens, which were left at the values from ff99. We continue to assume that backbone corrections for alanine are appropriate for all amino acids except glycine, and that explicit coupling between dihedral pairs can be neglected. For going beyond the fine-tuning applied here, this might not be the case, and per-residue, explicitly coupled backbone corrections may provide better accuracy. We also assume that comparing to scalar couplings by Karplus relations is rigorous enough to identify the best force field candidates for further screening. This naturally depends on errors in the approximate and empirical Karplus parameters or the experimentally measured scalar couplings. We additionally assume that gas-phase comparison against MP2/6-31+G**//HF/6-31G* quantum energies is sufficient to improve the side chain parameters. A brief test on aspartate conformations spaced every 60° indicated that our chosen level of theory agrees with MP2/aug-cc-pVTZ to within less than 0.3 kcal/mol average normalized error3 (ANE, described below) for both α and β backbone conformations. Ultimately, improving the level of theory or adding solvent in QM may alleviate some errors or inconsistencies in this approach. These assumptions, however, allowed us to overhaul the side chain dihedral parameters that had been carried over from ff94, in the context of the RESP charge model used in many force fields, and also to further refine the ff99SB protein backbone parameters.

Methods

Backbone dihedral empirical adjustment

Backbone parameter modifications were based on conformational ensembles for Ala3 in TIP3P explicit water at 300 K that we published previously11a. These ensembles were used to predict shifts in populations with dihedral parameter modification, and to predict the resulting change in ensemble-averaged scalar coupling values. A grid with 5° spacing was generated for both φ and ψ dihedral angles, and the normalized population and relative free energy of each grid bin were calculated using the ff99SB simulation ensemble. Next, the expected J value contribution for each grid bin was calculated using the population and Karplus parameters28. A spreadsheet was used to calculate ff99SB dihedral energies for each grid bin, as well as dihedral energies calculated with a Fourier series with altered amplitudes for backbone dihedrals. For each grid bin, the free energy was reweighted by the difference between the energies calculated using ff99SB and modified dihedral parameters. These altered free energies for the bins were converted to normalized populations, and then back to expected J values using the Karplus curve. Thus the potential impact of candidate dihedral parameter amplitude adjustments on the agreement between simulated and measured ³J(HN,Hα) could be interactively estimated. The details of each modification to phase, periodicity and amplitude are provided in Tables S5 and S6, and the rationale for each change is discussed in the Supporting Information.
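The reweighting procedure above can be sketched in a few lines. Adding an energy change ΔE to the free energy of a bin and converting back to a population is equivalent to multiplying the original population by exp(−ΔE/kT) and renormalizing. The sketch below assumes this reading of the procedure; the three-bin populations, energy shifts, and per-bin J values are hypothetical, whereas the actual calculation used a 5° grid in φ and ψ and was carried out in a spreadsheet.

```python
import math

KT_300K = 0.0019872041 * 300.0  # gas constant in kcal/(mol K) times 300 K

def reweight_populations(populations, delta_energies, kt=KT_300K):
    """Scale each bin by exp(-dE / kT), then renormalize to unit total population."""
    weights = [p * math.exp(-de / kt) for p, de in zip(populations, delta_energies)]
    norm = sum(weights)
    return [w / norm for w in weights]

def expected_j(populations, j_values):
    """Population-weighted average of the per-bin Karplus J values."""
    return sum(p * j for p, j in zip(populations, j_values))

# Hypothetical three-bin example; the middle bin stands in for the transition region.
old_populations = [0.5, 0.3, 0.2]
delta_energies = [0.0, 1.0, 0.0]  # modified minus original dihedral energy, kcal/mol
bin_j_values = [6.0, 9.8, 5.0]    # Hz, assumed per-bin Karplus values

new_populations = reweight_populations(old_populations, delta_energies)
j_before = expected_j(old_populations, bin_j_values)
j_after = expected_j(new_populations, bin_j_values)
```

In this toy case, raising the energy of the high-J bin lowers its population and therefore the averaged coupling, which is the direction of change sought for the β-ppII transition region. Such reweighting is only an estimate, since it cannot populate bins that the original ensemble never visited.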

Side chain dihedral training

Dipeptide structure generation

Acetyl and N-methyl capped dipeptides of the natural amino acids, except proline, alanine, and glycine, were built using LEaP29 at α (−60°, −45°) and β (−135°, 135°) backbone conformations. χ was explored by rotating in 10° increments, re-optimizing at each step, or by high temperature simulation (described in Results).

Quantum mechanics optimizations were performed with RHF/6-31G*. Scanned residues were optimized using GAMESS (US)30 with default options. Optimization continued until the RMS gradient was less than 1.0 × 10⁻⁴ Hartree Bohr⁻¹, with an initial trust radius of 0.1 Bohr that could then adjust between 0.05 and 0.5 Bohr. Minimization proceeded by the quadratic approximation. Residues sampled by high temperature simulations were optimized using Gaussian9831 with VTight convergence criteria. Quantum mechanics energies for training data were calculated with MP2/6-31+G**. Molecular mechanics re-optimizations were performed in the gas phase with ff99SB for a maximum of 1.0 × 10⁷ cycles or until the RMS gradient was less than 1.0 × 10⁻⁴ kcal mol⁻¹ Å⁻¹, with a non-bonded cutoff of 99.0 Å and initial step size of 10⁻⁴. Dihedral restraint force constants were 2.0 × 10⁵ kcal mol⁻¹ rad⁻². Minimization employed 10 steps of steepest descent followed by conjugate gradient. Molecular mechanics energies were calculated from the last step of ff99SB minimization.

Dipeptide energy calculations

Generating conformational diversity in the training set

To maximize transferability of the parameters, multidimensional structure scans were employed to generate conformational diversity. For smaller side chains, grid scans in dihedral space were used to generate side chain variety, including both α and β backbone conformations for each side chain rotamer. Grid scans were generated for Val in one dimension, as it only has χ1, at an interval of 10°. Grids were generated for Asp, Asn, Cys, Phe, His (δ-, ɛ-, and doubly-protonated), Ile, Leu, Ser, Thr, and Trp in two dimensions, as they have χ1 and χ2, at intervals of 20°, yielding 324 structures per amino acid.
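As a quick check of the grid sizes quoted above, the scan points can be enumerated directly. This is only an illustrative sketch; starting each scan at −180° is an assumption, and `dihedral_grid` is a hypothetical helper name.

```python
from itertools import product


def dihedral_grid(n_dihedrals, step_deg):
    """All combinations of dihedral values on a regular grid over [-180, 180)."""
    values = range(-180, 180, step_deg)
    return list(product(values, repeat=n_dihedrals))


val_grid = dihedral_grid(1, 10)      # chi1 only, 10 degree spacing
two_chi_grid = dihedral_grid(2, 20)  # chi1 and chi2, 20 degree spacing
print(len(val_grid), len(two_chi_grid))  # 36 324
```

A 20° spacing gives 18 values per dihedral, so two dihedrals give 18 × 18 = 324 grid points, matching the count stated in the text.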

We were unable to exhaustively explore side chain conformational space for side chains with more than two rotatable bonds. Tyrosine has 3 rotatable χ bonds, but dihedral space is reduced as 180° rotation of either the phenol (χ2) or of the hydroxyl produces the same effect when accounting for symmetry of the ring. We therefore fully scanned each tyrosine dihedral when the other two were at a stable rotamer, defined as any instance of that value in the rotamer library for this amino acid, rounded to the nearest 10° and limiting χ2 to (−90°, 90°] to account for symmetry. Stable rotamers for the hydroxyl, not in the rotamer library, were inferred from the QM energy profiles discussed above. Stable rotamers were 180° or ±60° for χ1, ±30° or 90° for χ2, and 0° or 180° for the hydroxyl. Conformations were generated using a full scan for each dihedral (at 20° increments), repeated for every combination of stable rotamer values for the other two dihedrals. As protonated aspartate has nearly the same dihedrals as Tyr (χ1, χ2 and hydroxyl), it was scanned in the same manner, but without χ2 restriction because aspartate does not have the same symmetry properties.

Cysteine presents a special case, as it can form disulfide bonds that bridge two amino acids. In addition to developing parameters for reduced Cys (no disulfide), a pair of Cys dipeptides with a disulfide bond was employed to scan the S-S energy profile. However, a disulfide between CysA and CysB has a total of five dihedrals: χ1A, χ2A, χSS, χ2B, and χ1B. As full sampling across five dihedrals is clearly intractable, conformation space was reduced by applying the same χ1 / χ2 values to both dipeptides. Using this symmetry, a two-dimensional scan was performed for all χ1 / χ2 combinations using 20° spacing; this scan was repeated with χSS restrained to 180°, ±60°, or ±90° (five 2D scans). Separately, the χSS profile was scanned with 20° spacing using χ1 of 180° or ±60° and χ2 of 180° or ±60° (nine 1D scans total). As with the other amino acids, the entire procedure was repeated with the backbone in α and β conformations; here, both dipeptides adopted the same backbone conformation.

The remaining side chains, Arg+, Gln, Glu (protonated), Glu−, Lys+, and Met, have at least three side chain dihedrals (Table S1). Rather than performing a grid search, MD simulations were used to generate diverse conformations of these side chains. Each dipeptide was simulated twice, with α or β backbone restraints, for 100 ns each. To overcome kinetic traps, these simulations were performed at 500 K and the dielectric was set to 4r. Next, a diverse subset was generated by mapping each conformation to a multidimensional grid spaced 10° in each χ. The five lowest energy conformations at each grid point were saved. From each simulation grid, five hundred structures were randomly selected (comparable to the number generated by the grid procedure described above for Tyr). Because the longer, more flexible side chains of these amino acids can adopt conformations with strong interactions between backbone and side chain, conformations where we suspected the in vacuo MM description may produce fitting artifacts were excluded, using electrostatic and distance cutoffs defined in the Supporting Information.
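The subset selection described above (bin on a 10° grid, keep the lowest-energy members of each bin, then draw a random sample) might be sketched as follows. The data layout, function names, and random seed are assumptions made for illustration; the electrostatic and distance filters from the Supporting Information are not included.

```python
import random


def bin_key(chis, spacing=10.0):
    """Map a tuple of chi angles (degrees) to integer grid indices, with wrap-around."""
    return tuple(int(((chi + 180.0) % 360.0) // spacing) for chi in chis)


def lowest_energy_per_bin(conformations, keep=5, spacing=10.0):
    """conformations: iterable of (chis, energy) pairs.
    Returns up to `keep` lowest-energy (energy, chis) entries per grid bin."""
    bins = {}
    for chis, energy in conformations:
        bins.setdefault(bin_key(chis, spacing), []).append((energy, chis))
    kept = []
    for members in bins.values():
        members.sort(key=lambda m: m[0])
        kept.extend(members[:keep])
    return kept


def random_subset(kept, n=500, seed=0):
    """Randomly select n structures (or all of them if fewer are available)."""
    rng = random.Random(seed)
    return rng.sample(kept, min(n, len(kept)))
```

Binning before sampling keeps heavily populated minima from dominating the training set, which is the point of mapping the trajectory onto a grid.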

Objective function for parameter optimization

As in ff99SB, the errors in relative energies between all pairs of conformations were evaluated to alleviate the bias of a single, potentially arbitrary reference conformation. We first defined the relative energy error (REE) between a single pair of conformations i and j:

$$\mathrm{REE}(i,j) = \left(E_{\mathrm{QM},i} - E_{\mathrm{QM},j}\right) - \left(E_{\mathrm{MM},i} - E_{\mathrm{MM},j}\right) \tag{1}$$

where $E_{\mathrm{QM},i}$ and $E_{\mathrm{MM},i}$ are the quantum and molecular mechanics energies of conformation i. $E_{\mathrm{MM}}$ is calculated either with ff99SB or, during the parameter search, as the ff99SB energy with the dihedral energy replaced using the candidate dihedral parameters, $E_{\mathrm{ff\_new}}$:

$$E_{\mathrm{ff\_new}} = E_{\mathrm{ff99SB}} + \sum_{\chi} \left(E_{\chi}^{\mathrm{ff\_new}} - E_{\chi}^{\mathrm{ff99SB}}\right) \tag{2}$$

where the sum is taken over all side chain rotatable bonds χ. $E_{\chi}^{\mathrm{ff}}$ is the sum of dihedral energy contributions of $N_{\chi}$ 4-atom sets around each rotatable bond, excluding those containing nonpolar hydrogens (Table S2), which remained at the values from ff99SB. For each dihedral, the periodicity n = 1–4 Fourier series contributions with amplitude $V^{\mathrm{ff}}_{\chi[c],n}$ and phase $\gamma^{\mathrm{ff}}_{\chi[c],n}$ were summed:

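$$E_{\chi}^{\mathrm{ff}} = \sum_{c=1}^{N_{\chi}} \sum_{n=1}^{4} V^{\mathrm{ff}}_{\chi[c],n} \left(1 + \cos\left(n\,\varphi_{\chi[c]} - \gamma^{\mathrm{ff}}_{\chi[c],n}\right)\right) \tag{3}$$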

We note that this equation is consistent with the Amber standard and lacks a factor of 1/2, and thus the true range of the energy for each term is twice the amplitude $V^{\mathrm{ff}}_{\chi[c],n}$. The fitting was limited to the fourth order term in each correction, consistent with ff99SB. Test fits using more terms resulted in noisier corrections without significantly improving fit quality (data not shown).
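The functional form of Eqn. 3 can be evaluated with a short Python sketch, shown here only to make the "twice the amplitude" remark concrete. The parameter values are made up for illustration and the function names are hypothetical.

```python
import math


def dihedral_energy(phi_deg, terms):
    """Energy of one four-atom set in the Amber form (no factor of 1/2).
    terms: list of (V, n, gamma_deg) with V in kcal/mol."""
    phi = math.radians(phi_deg)
    return sum(v * (1.0 + math.cos(n * phi - math.radians(g))) for v, n, g in terms)


def rotatable_bond_energy(phis_deg, term_sets):
    """Sum over the four-atom sets c = 1..N_chi that share one rotatable bond.
    phis_deg[c] is the dihedral angle of set c; term_sets[c] are its Fourier terms."""
    return sum(dihedral_energy(p, t) for p, t in zip(phis_deg, term_sets))


# Illustrative single term: V = 0.5 kcal/mol, n = 3, gamma = 0 degrees.
terms = [(0.5, 3, 0.0)]
print(dihedral_energy(0.0, terms), dihedral_energy(60.0, terms))
```

For this single term the energy spans 0 to 1.0 kcal/mol, that is, twice the 0.5 kcal/mol amplitude. With the phase restricted to 0° or 180°, each term is an even function of the dihedral angle, so mirror-image conformations receive the same energy.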

To focus the energy differences on side chain rotamer profiles, comparisons between structure pairs with different backbone conformations, or of different amino acids, were excluded. Alternate protonation states for ionizable amino acids were summed separately. For each amino acid, in either α or β backbone conformation, the magnitudes of REE over all pairs of N side chain conformations were summed, dividing by the number of pairs to obtain the average absolute error (AAE, as defined in Hornak et al.3) for this amino acid, in a given protonation state, in a specific backbone conformation:

$$\mathrm{AAE} = \frac{2}{N(N-1)} \sum_{i} \sum_{j<i} \left|\mathrm{REE}(i,j)\right| \tag{4}$$

The average of the AAEs for each residue and backbone conformation was minimized by adjusting the amplitude and phase parameters for all terms in Eqn. 3. Formally, optimization was performed by minimizing the objective function:

$$O = \frac{1}{n_{\mathrm{profiles}}} \sum_{r=1}^{\mathrm{amino\ acid}} \sum_{bb=\alpha,\beta} \mathrm{AAE}_{r,bb} \tag{5}$$

where $n_{\mathrm{profiles}}$ is the number of AAE profiles, resulting in a normalized O value that represents the error in energy differences for conformation pairs, averaged over all backbone contexts, amino acids and protonation states.
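Equations 1, 4, and 5 translate into a short Python sketch. The energies below are toy numbers, and the function names and data layout are assumptions made for illustration only.

```python
from itertools import combinations


def ree(e_qm, e_mm, i, j):
    """Relative energy error between conformations i and j (Eqn. 1)."""
    return (e_qm[i] - e_qm[j]) - (e_mm[i] - e_mm[j])


def aae(e_qm, e_mm):
    """Average absolute error over all N(N-1)/2 conformation pairs (Eqn. 4)."""
    pairs = list(combinations(range(len(e_qm)), 2))
    return sum(abs(ree(e_qm, e_mm, i, j)) for i, j in pairs) / len(pairs)


def objective(profiles):
    """Mean AAE over profiles (Eqn. 5).
    profiles: list of (e_qm, e_mm), one entry per amino acid,
    protonation state, and backbone conformation."""
    return sum(aae(q, m) for q, m in profiles) / len(profiles)


# Toy energies in kcal/mol for three conformations.
e_qm = [0.0, 1.0, 3.0]
e_mm = [0.0, 1.0, 1.0]
print(aae(e_qm, e_mm))
```

Because only energy differences enter REE, adding a constant offset to either the QM or the MM energies leaves the AAE unchanged, which is why no reference conformation needs to be chosen.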

The parameters for all non-hydrogen dihedrals in the protein side chains describing rotation around single bonds, as well as hydroxyl or sulfhydryl torsions are presented in Table S2. As discussed above, our structure training set was designed to include amino acid conformation pairs with simultaneous changes to more than one rotatable bond, thus necessitating concurrent optimization of parameters for multiple dihedrals (rather than the simpler approach32 of scanning parameter space one rotatable bond at a time). This enables the optimized energy corrections for each rotatable bond to incorporate implicit coupling to nearby conformational diversity. Furthermore, the presence of similar local structure (as described by atom types) in multiple amino acids often led to the requirement for fitting parameters using data from all amino acids where that functionality exists. This provides parameters that implicitly account for nearby chemical diversity, as opposed to training in a single amino acid and use in others. As a result of these two design factors, the parameter space for optimization is considerable.

To reduce problem size and accelerate convergence, amino acids were separated into the solving groups listed in Table S3 based on shared dihedral atom types, and a separate objective function (Equation 5) was constructed for each of the solving groups. Specific tables of dihedrals in each solving group are provided in Table S2. As each group shares no four-atom dihedrals with other groups, the full parameter space could be partitioned, with each solving group providing all conformations and energies necessary for separate optimization of each parameter subset. Optimized values of the objective function for each solving group are provided in Table S3.

Our goal was to use the AAEs to optimize a single set of parameters that minimizes the REE for multiple backbone conformations. However, the AAE for α and β are averages over all side chain pairs, while the ability of the optimization procedure to maximize transferability hinges on the backbone dependence of the QM-MM energy error for pairs of side chain conformations. Greater similarity of REEs among backbone conformations would indicate a better likelihood of being able to optimize a single set of parameters that is transferable among said backbone conformations. To quantify this, we subtracted the β REE from the α REE for each pair i,j of side chain conformations, averaging the magnitudes of these differences to obtain the intrinsic backbone dependence (BBD):

$$\mathrm{BBD} = \frac{2}{N(N-1)} \sum_{i} \sum_{j<i} \left|\mathrm{REE}(i,j)_{\alpha} - \mathrm{REE}(i,j)_{\beta}\right| \tag{6}$$

where the same notation is used as defined for Eqn. 4. We note that the BBD does not report on how well the QM and MM energies match, only on whether the differences are consistent as the backbone conformation changes. Thus BBD for each amino acid is a measure of the ultimate ability of side chain dihedral parameters to match QM data in the absence of explicit coupling between backbone and side chain parameters; any difference cannot be corrected with side chain dihedral parameters.
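A small Python sketch illustrates why the BBD measures something different from the AAE. It assumes the same N side chain conformations appear, in the same order, in both the α and β data sets; the energies are toy values and the function name is hypothetical.

```python
from itertools import combinations


def bbd(e_qm_alpha, e_mm_alpha, e_qm_beta, e_mm_beta):
    """Intrinsic backbone dependence (Eqn. 6): mean |REE_alpha - REE_beta| over pairs."""
    def ree(e_qm, e_mm, i, j):
        return (e_qm[i] - e_qm[j]) - (e_mm[i] - e_mm[j])

    pairs = list(combinations(range(len(e_qm_alpha)), 2))
    total = sum(
        abs(ree(e_qm_alpha, e_mm_alpha, i, j) - ree(e_qm_beta, e_mm_beta, i, j))
        for i, j in pairs
    )
    return total / len(pairs)


# MM errors are nonzero here but identical in both backbone contexts, so BBD is 0.
print(bbd([0.0, 1.0, 3.0], [0.0, 1.0, 1.0], [0.0, 2.0, 4.0], [0.0, 2.0, 2.0]))
```

In this example the MM model misjudges the third conformation by 2 kcal/mol in both backbone contexts, so a single side chain correction could in principle remove the error and the BBD is zero. A nonzero BBD marks error that no backbone-independent side chain term can remove.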

The AAEs were minimized using a genetic algorithm to perturb the amplitudes V and phase shifts γ in Eqn. 3. Full details are provided in the Supporting Information, but the procedure is briefly outlined here. Two populations each for ff99SB, zeroed force field parameters, and random force field parameters were generated with 63 individuals, and evolved independently. For the first 200000 generations, the amplitude was allowed to be perturbed by any value between −0.5 and 0.5 kcal/mol. Then, to focus the search, amplitude changes were restricted to 0.002 kcal/mol for another 100000 generations and then 0.001 kcal/mol until an ff99SB-initialized population found the same solution as populations starting with zero or random force field parameters. Phase shifts were restricted to 0° or 180° to produce parameters applicable for alternate enantiomers.

Simulation protocols

Initial helical conformations were defined as all amino acids having (φ, ψ)=(−60°, −40°). Initial extended conformations were defined as all (φ, ψ)=(180°, 180°). Native conformations, as appropriate, were defined for each system as below. Explicit solvation was achieved with truncated octahedra of TIP3P water16 with a minimum 8.0 Å buffer between solute atoms and box boundary. All structures were built via the LEaP module of Ambertools. Except where otherwise indicated, equilibration was performed with a weak-coupling (Berendsen) thermostat33 and barostat targeted to 1 bar with isotropic position scaling as follows. With 100 kcal mol−1 Å−2 positional restraints on protein heavy atoms, structures were minimized for up to 10000 cycles and then heated at constant volume from 100 K to 300 K over 100 ps, followed by another 100 ps at 300 K. The pressure was equilibrated for 100 ps and then 250 ps with time constants of 100 fs and then 500 fs on coupling of pressure and temperature to 1 bar and 300 K, and 100 kcal mol−1 Å−2 and then 10 kcal mol−1 Å−2 Cartesian positional restraints on protein heavy atoms. The system was again minimized, with 10 kcal mol−1 Å−2 force constant Cartesian restraints on only the protein main chain N, Cα, and C for up to 10000 cycles. Three 100 ps simulations with temperature and pressure time constants of 500 fs were performed, with backbone restraints of 10 kcal mol−1 Å−2, 1 kcal mol−1 Å−2, and then 0.1 kcal mol−1 Å−2. Finally, the system was simulated unrestrained with pressure and temperature time constants of 1 ps for 500 ps with a 2 fs time step, removing center-of-mass translation and rotation every picosecond.

SHAKE34 was performed on all bonds including hydrogen with the AMBER default tolerance of 10−5 Å for NVT and 10−6 Å for NVE. Non-bonded interactions were calculated directly up to 8 Å. Beyond 8 Å, electrostatic interactions were treated with cubic spline switching and the particle-mesh Ewald approximation35 in explicit solvent, with direct sum tolerances of 10−5 for NVT or 10−6 for NVE. A continuum model correction for energy and pressure was applied to long-range van der Waals interactions. The production timesteps were 2 fs for NVT and 1 fs for NVE.

System-specific details

Ala5

Ala5 was simulated with protonated N- and C-termini under NVT conditions. The system was solvated with 891 water molecules. Equilibration was performed as described previously11a. The structures were saved every 20 ps. Simulations with mod1ψ, mod2ψ, mod3ψ, mod4ψ, mod1φ, mod1φ1ψ, mod1φ2ψ, mod1φ3ψ, mod1φ4ψ, mod2φ, mod2φ1ψ, mod2φ2ψ, mod2φ3ψ, mod2φ4ψ, mod3φ, mod3φ1ψ, mod3φ3ψ, mod3φ4ψ, mod4φ, mod4φ1ψ, mod4φ3ψ, and mod5φ3ψ were run for 160 ns. Simulations with ff99SB, mod3φ2ψ, mod4φ2ψ, mod4φ4ψ, mod5φ, mod5φ1ψ, and mod5φ2ψ were run for 320 ns. The mod5φ4ψ simulation was run for 480 ns.

Helices

Simulations were performed for two helical peptide systems: a hydrogen bond surrogate peptide (HBSP) and K19. The HBSP sequence denoted 3a by Wang et al.36 (Ac-GQVARQLAEIY-NH2) was chosen, as it had the greatest measured helical content36. HBSP has a covalently pre-organized α turn, with the O of the first CO and the H of the NH of residue 5 substituted by carbons, with a covalent single bond between the substituted carbons. Modeling of this covalent modification was approximated by a harmonic distance restraint between the CO of the acetyl cap and the NH of A5 with force constant 100 kcal mol−1 Å−2. This restraint was chosen as it well reproduced the distribution of hydrogen bond distances present in a crystal structure of aquaporin (PDB ID: 3ZOJ37) (see Supporting Information). For K19, we chose the sequence Ac-GGG(KAAAA)3K-NH2, consistent with previous work38.

HBSP and K19 were solvated with 2643 and 3427 TIP3P water molecules, respectively, and simulated for 1.6 μs in the NVT ensemble. Each system had two independent runs. Initial structures were either all helical (as defined in Initial Structures) or semi-extended conformations. The HBSP semi-extended conformation was built with the first five residues helical to satisfy the covalent modification in the experiment, with the remaining residues extended. The K19 semi-extended conformation was a random coil conformation extracted from simulation using ff99SB in which helical content was absent.

CLN025

As a model system to carry out initial tests of secondary structure balance, we turned to CLN025, an engineered fast-folding hairpin that is a thermally optimized variant of Chignolin39. CLN025 contains N- and C-terminal glycine-to-tyrosine substitutions from Chignolin. Thus the CLN025 sequence was YYDPETGTWY. The native conformation was chosen as the fifth conformation in the NMR ensemble39b, as that conformation was closest to the average of the NMR ensemble.

Proteins

We simulated four folded proteins for comparison of dynamic properties to NMR. First was the third IgG-binding domain of protein G (GB3). The native structure was defined as a liquid crystal NMR structure (PDB ID: 1P7E40). Second was the bovine pancreatic trypsin inhibitor (BPTI). The native structure was defined as a joint neutron/X-ray diffraction structure (PDB ID: 5PTI41). Third was ubiquitin (Ubq), with the native structure defined as a crystal structure (PDB ID: 1UBQ42). Fourth was hen egg white lysozyme (HEWL), with the native structure defined as a crystal structure (PDB ID: 6LYT43). Owing to their larger size, the proteins were equilibrated as above, but with the unrestrained step extended to a full nanosecond, rather than 500 ps.

Analysis

Calculation of NMR observables

Scalar couplings were calculated from simulations using Karplus relations12. Backbone scalar couplings were calculated as by Best et al.11b: using the Orig parameters25, 44 also used by Graf et al.10 and the DFT1 and DFT2 parameters from Case et al.45. Side chain scalar couplings were calculated using Ile, Thr, and Val C/N-Cγ Karplus parameters from Chou et al.46, and Perez et al. Karplus parameters47 for all other χ1 scalar couplings. Backbone NH Lipari-Szabo S2 order parameters were calculated using the iRED method48 via cpptraj49.

NOE reproduction in CLN025 was evaluated by computing $r^{-6}$ for all interproton vectors and comparing $\langle r^{-6} \rangle^{-1/6}$ of each vector with the NOE-based restraints published by Honda et al.39b, downloaded from the BMRB50. For ambiguous restraints, contributions from each proton pair to the NOE were summed51. For each force field we generated two ensembles, one combining structures from the 4 initially folded simulations and the other combining the 4 initially extended simulations. These were used to calculate NOE deviations, with the difference between ensembles from different initial structures used to quantify precision.
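The r⁻⁶ averaging described above can be sketched as follows. This is a minimal illustration assuming per-frame interproton distances in Å are already available; the function names are hypothetical, and for ambiguous restraints the sketch assumes the per-pair ensemble averages of r⁻⁶ are summed before the −1/6 power is taken.

```python
def noe_distance(distances):
    """Effective distance <r^-6>^(-1/6) for one proton pair over an ensemble."""
    mean_inv6 = sum(r ** -6 for r in distances) / len(distances)
    return mean_inv6 ** (-1.0 / 6.0)


def ambiguous_noe_distance(per_pair_distances):
    """Effective distance for an ambiguous restraint.
    per_pair_distances: one list of per-frame distances for each candidate proton pair."""
    total = sum(sum(r ** -6 for r in d) / len(d) for d in per_pair_distances)
    return total ** (-1.0 / 6.0)


print(noe_distance([2.0, 10.0]))
```

The r⁻⁶ weighting strongly favors short distances: a pair that spends half its time at 2 Å and half at 10 Å has an effective distance far closer to 2 Å than to the 6 Å arithmetic mean.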

Calculation of helical content

When comparing to CD (as for HBSP), helical content was defined as the fraction of residues in α-helix (H) or 3–10 helix (G) as defined by DSSP52 as implemented in cpptraj49, with simulation averages calculated for the whole trajectory. DSSP is used for comparison to CD, since both are sensitive to formation of complete helical turns, rather than local backbone dihedral angles. When comparing to CSDs (as for K19), helical content was defined based on the Ramachandran surface, as done previously38, with the α basin encompassing φ ∈ [−90°, −30°] and ψ ∈ [−77°, −17°]. HBSP helical content included all amino acids, whereas K19 helical content was averaged over residues 8,10,11,12,16, and 17 to correspond to experimental helical measurements.38
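The Ramachandran-based definition used for K19 can be written as a short Python sketch. The basin boundaries and residue list are taken from the text; the data layout (one dictionary of backbone angles per frame) and the function names are assumptions for illustration. The DSSP-based definition used for HBSP is not covered here.

```python
K19_RESIDUES = (8, 10, 11, 12, 16, 17)  # residues with experimental helicity data


def in_alpha_basin(phi, psi):
    """True if (phi, psi) in degrees falls inside the alpha basin defined in the text."""
    return -90.0 <= phi <= -30.0 and -77.0 <= psi <= -17.0


def helical_fraction(frames, residues=K19_RESIDUES):
    """frames: list of dicts mapping residue number -> (phi, psi) in degrees.
    Returns the fraction of (frame, residue) samples inside the alpha basin."""
    hits = 0
    total = 0
    for frame in frames:
        for res in residues:
            phi, psi = frame[res]
            hits += in_alpha_basin(phi, psi)
            total += 1
    return hits / total
```

Unlike DSSP, this definition counts each residue independently of its neighbors, so it reports local backbone preferences rather than formation of complete helical turns.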

Hairpin clustering, etc

First, representative structures were extracted from each simulation to make the cluster analysis tractable, selecting frames spaced every 5 ns. For each 5 μs trajectory, this selected 1000 frames, yielding 16000 total frames for the two force fields, two initial structures, and four independent simulations for each force field and initial structure. The combined trajectories were clustered together to allow direct comparison of clusters between them. Clusters were formed using the hierarchical/agglomerative algorithm implemented in cpptraj49, with a 3 Å cutoff. The mask included all non-symmetric atoms to avoid the need to account for differences between clusters arising from symmetry. Thus atom names N, Cα, C, H, O, Cβ, Cγ, Cγ2, Cζ, Oγ1, Hγ1, Oη, Hη, all tryptophan non-hydrogens, and proline Cδ were included in the cluster analysis. Finally, the counts of each cluster were divided into four quarters representing each force field and initial structure. Hence, each cluster count accounts for four independent simulations with each force field/initial structure combination. The average values for each force field were computed, and the difference across initial structures was used to calculate uncertainties for each force field.

Further Methods details are provided in the Supporting Information.

Results

Reproduction of Ala5 NMR scalar coupling data is improved

We tested ff99SB and 29 backbone parameter modifications (Figure 2) by simulating Ala5 in explicit water, with each combination of φ and ψ modifications. We calculated scalar couplings from each ensemble using the Orig, DFT1, and DFT2 parameters used by Best et al.11b and us11a previously25, 45. We quantified the deviations between simulations and NMR using Best et al.’s χ2 metric11b:

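$$\chi^2 = \frac{1}{N} \sum_{i}^{N} \frac{\left(\langle J_i \rangle_{\mathrm{sim}} - J_{i,\mathrm{NMR}}\right)^2}{\sigma_i^2} \tag{7}$$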

where σi is the estimated systematic error in experimental constant i. The χ2 values for each force field, according to each set of Karplus parameters, are provided in Tables 1–3. Consistent with our previous report11a, our starting point of ff99SB had χ2 of 1.89±0.09, 1.45±0.04, and 1.70±0.09 according to Orig, DFT1, and DFT2 parameters, respectively (error bars are from independent simulations using different initial structures). Two modified force fields achieved χ2 less than 1.0, both with the Orig parameters: mod1φ had a χ2 of 0.89±0.04 and mod1φ2ψ had χ2 = 0.93±0.02. Significantly, none of the force fields tested by Best et al. had achieved χ2 values under 1.0.11b The mod1φ and mod1φ2ψ modifications also improved DFT2-based χ2 to 1.22±0.03 (comparable to 1.2 reported for C36 using DFT2-based parameters9) and 1.41±0.06, respectively, but actually worsened agreement according to DFT1 parameters. In fact, none of the modifications improved agreement via the DFT1 parameters. Potential sensitivity of Karplus parameter derivation to peptide length may be a relevant discrepancy between Ala5 and the DFT1 parameters trained against Ac–Ala–Nme26. The force field best reproducing Ala5 scalar couplings with the DFT2 parameters is mod3φ, with χ2 of 1.11±0.01, once again improved compared to all of the force fields tested by Best et al.11b and slightly improved relative to C36.9
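The χ2 metric of Eqn. 7 is straightforward to compute once ensemble-averaged couplings are in hand. The sketch below uses toy values; the function name and argument layout are assumptions for illustration.

```python
def chi_squared(j_sim, j_nmr, sigma):
    """Mean squared deviation between simulated and measured couplings,
    each scaled by its estimated error sigma_i (Eqn. 7)."""
    n = len(j_sim)
    return sum(((s - e) / sg) ** 2 for s, e, sg in zip(j_sim, j_nmr, sigma)) / n


# Toy values in Hz: one coupling off by exactly one sigma, one matching exactly.
print(chi_squared([5.0, 2.0], [6.0, 2.0], [1.0, 0.5]))  # 0.5
```

A χ2 below 1.0 therefore means that, on average, the simulated couplings deviate from experiment by less than the estimated error σi.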

Figure 2.


Ramachandran heat maps showing energy differences between ff99SB (lower left, all values 0) and each of the five φ (across) and the four ψ (up) modifications, and all combinations. Note that while these surfaces are graphed with φ and ψ axes, many modifications adjust the φ′ and ψ′ corrections, some with phase shifts, and thus the graphs may not be symmetric about the x and y axes. See main text for definition of “prime” dihedrals.

Table 1.

Scalar coupling χ2 using the “Orig” Karplus parameters for Ala5, in each backbone parameter combination, with error bars from independent simulations

ψ \ φ      ff99SBφ      mod1φ        mod2φ        mod3φ        mod4φ        mod5φ
ff99SBψ    1.89±0.09    0.89±0.04    1.09±0.14    1.17±0.01    1.38±0.04    1.78±0.02
mod1ψ      1.77±0.07    1.26±0.04    1.07±0.10    1.35±0.01    1.48±0.10    1.58±0.11
mod2ψ      2.11±0.30    0.93±0.02    1.19±0.07    2.26±0.59    1.99±0.33    1.65±0.22
mod3ψ      1.89±0.02    1.13±0.03    1.52±0.20    1.45±0.01    1.77±0.06    2.27±0.09
mod4ψ      2.23±0.21    1.49±0.02    1.56±0.08    1.51±0.12    2.11±0.06    2.51±0.35

Table 3.

Scalar coupling χ2 using DFT2 parameters for Ala5 in each backbone parameter combination

ψ \ φ      ff99SBφ      mod1φ        mod2φ        mod3φ        mod4φ        mod5φ
ff99SBψ    1.70±0.09    1.22±0.03    1.20±0.08    1.11±0.01    1.20±0.02    1.51±0.02
mod1ψ      1.59±0.05    1.79±0.03    1.31±0.02    1.39±0.03    1.40±0.05    1.41±0.05
mod2ψ      1.89±0.30    1.41±0.06    1.31±0.05    2.14±0.64    1.79±0.35    1.45±0.16
mod3ψ      1.71±0.03    1.46±0.03    1.52±0.19    1.36±0.04    1.56±0.05    2.03±0.05
mod4ψ      2.04±0.21    1.76±0.08    1.66±0.05    1.46±0.13    1.89±0.04    2.25±0.37

We selected parameter combinations for further testing in larger systems by taking the best performer for each Karplus parameter set: mod1φ for Orig, ff99SB for DFT1, and mod3φ for DFT2. We also carried over mod1φ2ψ because it achieved χ2 < 1.0 with Orig and was nearly within error bars of the performance with mod1φ; this also allowed us to include a ψ modification in further testing. While results for some other parameter sets are very close when considering uncertainties, we needed to limit the number of combinations carried over to testing of converged ensembles for larger systems in explicit water.

Side chain rotamer energies show improved match to QM data and better transferability between backbone conformations

Whereas the initial tests of backbone parameter improvement focused on the highly flexible alanine, testing with systems exhibiting more well-defined structural propensity requires use of sequences that include side chains with rotatable bonds. Therefore, we next derived new side chain parameters to provide a more accurate model for testing the impact of the backbone parameter changes.

An important question is how to define EQM,i and EMM,i used for calculating REE in Eqn. 1. As discussed above, restraints could be applied to dihedrals other than the specific 4-atom set defining the φ, ψ, and χ rotatable bonds. We tested several choices, including restraining only the 4-atom sets defining φ, ψ, and χ, as well as restraining all possible 4-atom dihedrals, or restraining all dihedrals in the backbone but only the defining dihedrals in the side chains (see Table S1 for dihedral classifications). We also tested the impact of MM re-optimization of QM geometries. As discussed above, these choices in the generation and comparison of structures can introduce artifacts in the energy profiles that hamper parameter optimization and weaken transferability. We evaluated the impact of these choices by calculating the intrinsic BBD as well as the AAE for various restraint and structure optimization options, using ff99SB as a baseline MM model. Restraining all possible backbone dihedrals and re-optimizing the QM structure with MM before calculating energy yielded both the lowest AAE (2.55±0.09 kcal mol−1 for Asp and 1.98±0.01 kcal mol−1 for Asn, error bars reflect difference between α and β backbone context) and lowest BBD (1.35±0.01 kcal mol−1 for Asp and 1.42±0.03 kcal mol−1 for Asn). Thus we restrained all possible backbone dihedrals and re-optimized QM structures with MM when building our training set. Further analysis of different options can be found in the Supporting Information.

As discussed in Methods, each solving group was optimized separately (values for each solving group are provided in Table S3), but here we average the individual objective function O values, weighted by the number of profiles, to facilitate their comparison between different parameter sets. The resulting O values quantify the magnitude of error in energy differences for conformation pairs, averaged over all amino acids and backbone conformations. In ff99SB, O was 1.52 kcal mol−1, while O for the final optimization parameter set was 0.98 kcal mol−1. This 35% improvement is decomposed by residue and by backbone conformation in Figure 3, and the distribution of all pair energy errors (REEs) is shown in Table S4. All of the amino acids with errors larger than 2 kcal/mol in ff99SB (tyrosine and protonated or deprotonated aspartic acid) were significantly improved with the new parameters. In addition to improvement for the ILDN residues previously addressed by Lindorff-Larsen et al.8, we observed better agreement with the QM training data for every residue compared to ff99SB. The only profile that didn’t improve was α-backbone Phe, in which the initial ff99SB error was close to the average final AAE, limiting the potential for improvement. It is remarkable to see that the optimization procedure was able to find a solution that simultaneously improved performance for all amino acids, and with little resulting backbone dependence.

Figure 3.


The AAE of each force field for each amino acid (single letter codes), with data for both α and β backbone conformation. For ionizable residues, the ionic form is indicated by a charge superscript. CC indicates the disulfide bridge. Data are shown for ff99SB, ff99SB-ILDN, and ff99SB with the reparametrized side chain corrections obtained using the procedure described in the text.

Although we noted improvement in the reproduction of QM results for Arg and Lys side chains (Figure 3), further testing of the type discussed later in this article showed that scalar coupling agreement was slightly worsened by application of the new Arg and Lys parameters. Given the risk of overfitting for amino acids with four side chain dihedrals, especially given the ability of these side chains to form hydrogen bonds with the backbone that may affect the fitting, we decided that arginine and lysine may need a stronger effort, with more conformations, perhaps with implicit solvent at the QM stage. We therefore decided not to include the refit parameters for Arg and Lys in the final ff14SB parameter set or in the further testing discussed below.

We refer to the combination of ff99SB with new side chain dihedral parameters as ff14SBonlysc; adding the updated backbone parameters (discussed below) will result in the ff14SB model. Although it is promising that the ff14SBonlysc parameters show improved reproduction of the QM data, several caveats apply. First, the performance in Figure 3 measures the ability of the parameters to reproduce energies for structures that were used in the training shown above, but not for the other force fields, thus better performance on the training data is expected. Second, closely reproducing gas-phase QM data does not guarantee reliable simulation properties53. As discussed above, it is possible that training against gas-phase QM data might counteract some of the influence of the “pre-polarized” partial charges in our model, potentially worsening performance for simulations in aqueous solution. Thus we followed the training against QM data with more rigorous testing in solution simulations, with comparison to experiments also in solution.

Figure 4.


RMSD to the NMR structure vs time for the four linear and four native runs of CLN025 with ff14SB and ff99SB, colored by cluster being sampled: black=0, blue=1, green=2, cyan=3, red=4, fuchsia=5, gold=6, and all other clusters light gray.

Testing Strategy

The fitting just presented rests on several key assumptions that raise important questions. One question is whether the backbone corrections that reproduce Ala5 scalar couplings through empirical Karplus equations will improve secondary structure balance in larger systems with more complex (and well-defined) structure than Ala5. Since the computational cost is greater for these longer peptides, and transition rates are slower between more stable minima, we utilized small test systems when possible. To test the impact of the backbone parameter changes on secondary structure balance, we first simulated several peptides that adopt modest amounts of helical structure, comparing results from simulation and experiment. Since we did not want to increase helical stability at the cost of destabilizing β structures, we also simulated a short β-hairpin system to evaluate the ability of the force field to provide balance in sequence-dependent secondary structure content. These peptides have more complex side chains than alanine, thus some of the tests also incorporated the updated side chain parameters described above.

A second question is whether the diversity and planned backbone-independence of our side chain training set will improve side chain rotamer preferences for proteins in solution, despite training against in vacuo dipeptide energies at a modest level of QM theory. To investigate the accuracy of side chain rotamer sampling, we compared against χ1 scalar couplings for a set of folded proteins including GB3, ubiquitin, lysozyme, and BPTI (collated by Lindorff-Larsen et al.8, 25, 46, 54). Importantly, we considered the performance of the new model relative to ff99SB and ff99SB-ILDN8 (ff99SB with new (I)le, (L)eu, Asp (D), and As(n) parameters) in different secondary structure contexts, to evaluate whether inclusion of multiple dipeptide backbone conformations in side chain training improved transferability between different backbone conformations in proteins. We also tested the benefit of re-optimizing parameters for side chains other than ILDN.

These side chain parameter evaluations are subject to all the caveats of scalar couplings outlined in Fitting Strategy. In fact, many reported experimental scalar couplings lie outside the range of the relevant Karplus curves, suggesting that reproducing the experimental observations using these curves would be impossible regardless of the ensemble of conformations sampled in simulation. In these cases, we adjusted the target value by adopting the value on the Karplus curve lying closest to the experimental value; otherwise, the experimental value was used as the target:

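$$
{}^{3}J_{i,\mathrm{NMR}} =
\begin{cases}
\min(J_{i,\mathrm{Karplus}}), & {}^{3}J_{i,\mathrm{NMR}} < \min(J_{i,\mathrm{Karplus}}) \\
\max(J_{i,\mathrm{Karplus}}), & {}^{3}J_{i,\mathrm{NMR}} > \max(J_{i,\mathrm{Karplus}}) \\
{}^{3}J_{i,\mathrm{NMR}}, & \text{otherwise}
\end{cases}
\qquad (8)
$$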

Additionally, because H-H scalar couplings reporting on some residues have a much larger range than C-C scalar couplings reporting on others, deviations were normalized by the magnitude of the Karplus curve range. The errors are summarized in terms of the average normalized error ANE:

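$$
\mathrm{ANE} = \frac{1}{N}\sum_{i}^{N}\frac{\left|J_{i}^{\mathrm{sim}} - {}^{3}J_{i,\mathrm{NMR}}\right|}{\max(J_{i,\mathrm{Karplus}}) - \min(J_{i,\mathrm{Karplus}})} \qquad (9)
$$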

The resulting metric is more intuitive than average error, as 0 indicates best possible agreement, while 1 indicates maximum deviation.
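The target adjustment of Eqn. 8 and the normalization of Eqn. 9 can be illustrated with a minimal Python sketch. It assumes the generic Karplus form J(θ) = A cos²θ + B cosθ + C. The function names and the coefficients in the usage lines are hypothetical placeholders, not the Karplus parameters used in this work.

```python
import math

def karplus(theta_deg, a, b, c, phase_deg=0.0):
    """Generic Karplus form J = A cos^2(t) + B cos(t) + C, with t = theta + phase."""
    t = math.radians(theta_deg + phase_deg)
    return a * math.cos(t) ** 2 + b * math.cos(t) + c

def karplus_range(a, b, c, step_deg=0.1):
    """Minimum and maximum of the Karplus curve from a dense scan over 360 degrees."""
    n = int(round(360.0 / step_deg))
    values = [karplus(i * step_deg, a, b, c) for i in range(n)]
    return min(values), max(values)

def clamp_target(j_nmr, j_min, j_max):
    """Eqn. 8: move an experimental coupling to the nearest value on the Karplus range."""
    return min(max(j_nmr, j_min), j_max)

def average_normalized_error(j_sim, j_nmr, ranges):
    """Eqn. 9: mean of |J_sim - adjusted J_NMR| divided by the Karplus curve range."""
    total = 0.0
    for sim, nmr, (j_min, j_max) in zip(j_sim, j_nmr, ranges):
        target = clamp_target(nmr, j_min, j_max)
        total += abs(sim - target) / (j_max - j_min)
    return total / len(j_sim)

# Hypothetical coefficients, for illustration only.
j_min, j_max = karplus_range(9.5, -1.6, 1.8)
ane = average_normalized_error([5.0, 11.0], [6.0, 13.5], [(j_min, j_max)] * 2)
```

Because each deviation is divided by the range of its own Karplus curve, couplings with a wide range (such as H-H) and a narrow range (such as C-C) contribute on the same 0 to 1 scale.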

In the peptides and proteins tested here, backbone and side chain dihedrals are coupled to each other within and between residues, making it difficult to determine exactly why a particular scalar coupling may disagree with experiment (assuming the error is not because of the experimental measurement or the Karplus curve). Likewise, this hinders ascribing credit for improvement to any specific backbone or side chain update. To aid this decomposition, we tested helical content in a model peptide with just backbone parameter updates, and then introduced the side chain modifications. For protein side chains, we tested χ1 scalar couplings with just side chain modifications and then introduced backbone updates, to help isolate the effects of intended and secondary changes. On the other hand, this dihedral coupling can mean that χ1 scalar couplings implicitly report on backbone, χ2 or χ3 torsions; thus reproducing χ1 data may suggest reasonable accuracy in other parameters as well.

Lastly, after testing whether ff14SB achieves its design goals of improving secondary structure balance and side chain dynamics, we tested the final combination of backbone and side chain improvements on folded proteins to ensure that the new force field maintained reasonably accurate protein order parameter reproduction as reported previously for ff99SB3. We calculated backbone NH order parameters using the same simulations used to analyze χ1 scalar couplings.

Testing Results

Helical stability is improved

For testing helical propensity, we employed the hydrogen bond surrogate peptide (HBSP) of Arora and colleagues36, 55. With a covalently pre-organized nucleus that avoids the limiting entropic cost of helix initiation, experiments indicate the presence of significant helical content despite the short length of only 10 amino acids, providing an ideal initial model system. A covalent link replaces what would be the first helical hydrogen bond between residues 1 and 5, but we wanted to avoid introducing additional new parameters other than those described above. We created a model for the HBSP by including only natural amino acids, but using a H-bond distance restraint as an analog for the covalent bond used in experiments (see Methods and Supporting Information); the sequence was otherwise the same in order to allow comparison of helix propagation propensity. We generated ensembles for the system using the backbone parameter modifications that performed well for the Ala5 scalar couplings, as discussed above, and compared helical content calculated with DSSP to that from experiment. Wang et al. reported ~46% helical content in PBS55c, but due to the potential for aggregation in that experiment, we followed the suggestion56 of the authors and used the value of 70.13% helical content in 10% TFE, adjusted downward by ~ 5–10% to obtain an estimate in water of ~ 65%.

Simulations with ff99SB exhibited 0.17±0.01 fraction helix (Table 4; uncertainties represent data from independent runs), compared with the 0.65 target value discussed above. The mod1φ correction, which had the lowest Ala5 χ2 among all parameter sets, tripled the helical content to 0.51±0.01. Adding the mod2ψ correction (mod1φ2ψ) yielded 0.72±0.01 helical content. This number is somewhat higher than experiment, suggesting that this ψ modification, introduced to improve helical stability, may do so too strongly. This is notable because the results show that improvements in helical content are already achieved through modification of only the φ energy profile, which was designed to improve Ala5 scalar couplings. The other parameter set carried over from the Ala5 test, mod3φ, showed a more modest improvement relative to ff99SB, with 0.26±0.01 fraction helix.

Table 4.

Helical content of HBSP and K19, from experiments and force fields, namely ff99SB and our modifications chosen based on Ala5 results. Simulated helical content was determined based on DSSP and Ramachandran analyses for HBSP and K19, respectively. The force field uncertainties were obtained from two independent simulations (see Methods).

| Force field | HBSP (only updated backbone parameters) | HBSP (adding updated side chain parameters) | K19 |
| --- | --- | --- | --- |
| Experimental | 0.65 | 0.65 | 0.31 |
| ff99SB | 0.17±0.01 | 0.26±0.01 | 0.08±0.01 |
| mod1φ | 0.51±0.01 | 0.60±0.01 | 0.26±0.05 |
| mod1φ2ψ | 0.72±0.01 | 0.79±0.01 | 0.87±0.03 |
| mod3φ | 0.26±0.01 | 0.46±0.01 | 0.10±0.01 |
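As a rough illustration of how a DSSP-based fraction helix like the HBSP values in Table 4 could be tallied, the sketch below counts helical assignments in per-frame DSSP strings. It makes two assumptions that the Methods may define differently: only the DSSP code "H" is counted as helix, and the uncertainty from two independent runs is taken as half the difference between the run averages. All names are hypothetical.

```python
def helix_fraction(dssp_frames, helix_codes=("H",)):
    """Fraction of residue assignments that are helical, pooled over all frames.

    dssp_frames: list of strings, one DSSP code per residue per frame.
    helix_codes: codes counted as helix (assumption: alpha-helix 'H' only).
    """
    total = sum(len(frame) for frame in dssp_frames)
    helical = sum(frame.count(code) for frame in dssp_frames for code in helix_codes)
    return helical / total

def mean_and_half_range(run_a, run_b):
    """Average of two run values and half their absolute difference (assumed error bar)."""
    return 0.5 * (run_a + run_b), 0.5 * abs(run_a - run_b)

# Toy input: two short "trajectories" of a 10-residue peptide.
run1 = helix_fraction(["CHHHHHHC--", "CCHHHH----"])
run2 = helix_fraction(["CHHHHHHHC-", "CCHHHHH---"])
fraction, uncertainty = mean_and_half_range(run1, run2)
```

Counting 3-10 or π helices as well would only require passing `helix_codes=("H", "G", "I")`.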

Since HBSP contains non-Ala amino acids, we next combined these backbone adjustments with the new side chain parameters discussed above. When the new side chain parameters were combined with mod1φ, we obtained a further increase in HBSP helical content, to 0.60±0.01 (compared to the experimental estimate of 0.65). Adding the new side chain parameters to mod1φ2ψ and mod3φ resulted in 0.79±0.01 and 0.46±0.01 fraction helix, respectively; in both cases the fractions were again higher than obtained with the backbone adjustments alone. Indeed, simply updating the side chain parameters while keeping the ff99SB backbone parameters (ff14SBonlysc) raised the helical content from 0.17±0.01 to 0.26±0.02. Previous studies have modified the ff99SB backbone parameters to quantitatively match experimental observations11c, 27. Our data suggest that doing so, without considering side chain errors, could lead to a non-transferable cancellation of error between side chain and backbone effects; updating side chain parameters in those models may actually worsen agreement with experiment.

Although reproduction of Ala5 scalar couplings and HBSP helicity is promising, we also performed tests on a longer peptide without covalent modification, the K19 Baldwin-type peptide that we had simulated previously in implicit solvent38. For this peptide sequence, none of the side chain parameters differ from those in ff99SB. K19 simulations with ff99SB produced very low helical content (average of 0.08±0.01 for residues with measured CSDs, see Methods), in disagreement with the experimental estimate of 0.31 (Table 4). With mod1φ, K19 helical content was significantly improved at 0.26±0.05. Moreover, examination of per-residue helicity for amino acids with measured CSDs (Figure S4) shows that the largest difference between the error bar range from mod1φ MD and the experimental value is 3% (absolute error). The other backbone modifications did not perform as well as mod1φ, with trends similar to those obtained for HBSP; use of mod1φ2ψ resulted in too much helix (0.87±0.03), while mod3φ resulted in small improvement over ff99SB (0.10±0.01).

Overall, mod1φ has three advantages: it was physically motivated based on analysis of the ff99SB training data, it provides the best reproduction of Ala5 scalar coupling data among the combinations that we tested, and when combined with the QM-based side chain parameters the helical content also reasonably matches experiment for two different systems. mod1φ was thus selected as the backbone parameter update for ff14SB.

Testing hairpin stability and structure

We next tested whether the improvement in helical content was obtained at the cost of less accurate performance on β systems. As a model system to carry out initial tests of secondary structure balance, we turned to CLN02539, an engineered fast-folding hairpin that is a thermally optimized variant of Chignolin39. CLN025 contains N- and C-terminal glycine-to-tyrosine substitutions from Chignolin, which already possesses one tyrosine and one tryptophan. The presence of four aromatic side chains in a short peptide suggests the potential for strong sensitivity of observed stability to accurate treatment of side chain conformational energy profiles, as well as of hydrophobicity. The system also presents a challenge due to the relatively slow folding of β-sheets compared to the helical systems (although estimates of 100 ns for CLN025 were obtained from T-jump IR experiments), and obtaining precise measures of population may be difficult. Still, use of CLN025 as a model presents a reasonable route to obtaining a qualitative view of whether ff14SB’s increased helical propensity also compromises β stability.

For each of ff99SB and ff14SB, we performed four MD runs starting from the NMR-based structure closest to the ensemble average, and four additional runs starting from fully linear structures to quantify convergence. We compared simulation snapshots against the initial NMR structure using all non-symmetric atoms (Figure 4). We also performed cluster analysis on the combined trajectories from both force fields so that the influence of force field on cluster populations could be directly compared. Simulations with ff99SB predominantly sampled structures within cluster 0 at around 3 Å RMSD (59±10%), or within cluster 1 at around 4.8 Å RMSD (34±9%; all error bars were calculated from the difference between initially extended/hairpin ensembles). Compared to ff99SB, the ff14SB simulations sampled cluster 0 with similar frequency (57±14%), but sampled cluster 1 much less than ff99SB (5±3%), though the comparisons are somewhat qualitative due to the uncertainties. Instead, the ff14SB simulations are more diverse when unfolded, sampling structures ranging from 4 Å to just over 9 Å RMSD. ff99SB simulations sampled 194 clusters with non-zero frequency, whereas ff14SB simulations sampled 843.
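The cluster population comparison can be sketched as follows, given one cluster label per frame for each ensemble. The error bar is assumed here to be half the difference between the populations from the native-start and extended-start ensembles; the text only states that it is derived from that difference, so the factor is an assumption, and the function names are hypothetical.

```python
from collections import Counter

def cluster_populations(labels):
    """Map each cluster id to its fractional population in a list of per-frame labels."""
    counts = Counter(labels)
    n = len(labels)
    return {cluster: count / n for cluster, count in counts.items()}

def population_with_spread(labels_native, labels_extended, cluster_id):
    """Mean population of one cluster over two ensembles, with half their difference."""
    p_native = cluster_populations(labels_native).get(cluster_id, 0.0)
    p_extended = cluster_populations(labels_extended).get(cluster_id, 0.0)
    return 0.5 * (p_native + p_extended), 0.5 * abs(p_native - p_extended)

def n_sampled_clusters(*label_lists):
    """Number of distinct clusters visited with non-zero frequency."""
    return len(set().union(*label_lists))

# Toy labels, for illustration only.
native_start = [0, 0, 0, 1]
extended_start = [0, 1, 1, 2]
pop0, err0 = population_with_spread(native_start, extended_start, 0)
```

Clustering the combined trajectories from both force fields, as described above, is what makes the cluster ids directly comparable between ff99SB and ff14SB.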

Inspection of the second major cluster of ff99SB (cluster 1, blue in Figure 4) reveals a hairpin in which the C-terminal strand is shifted one residue out of phase relative to the N-terminal strand (representative structures for clusters 0 and 1 are shown together with the NMR-derived structure in Figure S5). The populations suggest that ff14SB destabilizes this alternate conformation, although the populations are not well converged. The difference is also qualitatively apparent in the individual runs: this cluster is significantly sampled in 6 of 8 ff99SB simulations but in only 2–3 of 8 ff14SB simulations, where it typically persists for shorter times than with ff99SB (Figure 4). Whether the ff14SB parameter changes favor the native-like cluster over the alternate cluster can be probed by decomposing the dihedral energies of each cluster according to each force field. In particular, we evaluated how the difference in energies of the two main clusters depends on the force field:

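$$
\Delta\Delta E = \left(U_{\mathrm{ff14SB}}^{\mathrm{cluster\,0}} - U_{\mathrm{ff14SB}}^{\mathrm{cluster\,1}}\right) - \left(U_{\mathrm{ff99SB}}^{\mathrm{cluster\,0}} - U_{\mathrm{ff99SB}}^{\mathrm{cluster\,1}}\right) \qquad (10)
$$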

Analysis using Eqn. 10 indicates that the dihedral changes in ff14SB favor the native cluster over the alternate by 2.9 kcal mol−1 relative to ff99SB. Further decomposition of this difference suggests that parameter changes applied to Asp3 χ2 favor this native structure by 1.2 kcal mol−1, and then φ modifications favor the native structure by 0.5 kcal mol−1 in the backbone of Glu5. This is especially promising because the same modifications also improved agreement with experiment for the helical systems.
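Eqn. 10 is a difference of differences, which a few lines of Python make concrete. The sketch assumes each U is the average dihedral energy (kcal/mol) over the structures in a cluster, re-evaluated with the indicated force field; that averaging choice and the function names are assumptions for illustration.

```python
def mean(values):
    """Arithmetic mean of a non-empty sequence."""
    return sum(values) / len(values)

def delta_delta_e(u14_c0, u14_c1, u99_c0, u99_c1):
    """Eqn. 10 from per-structure dihedral energies of clusters 0 and 1.

    u14_*: energies evaluated with ff14SB; u99_*: the same structures with ff99SB.
    """
    return (mean(u14_c0) - mean(u14_c1)) - (mean(u99_c0) - mean(u99_c1))

def per_term_contributions(terms):
    """Apply Eqn. 10 to each dihedral term separately.

    terms: dict mapping a term name to a (u14_c0, u14_c1, u99_c0, u99_c1) tuple.
    Because the dihedral energy is a sum over terms, the contributions add up
    to the total from delta_delta_e.
    """
    return {name: delta_delta_e(*energies) for name, energies in terms.items()}

# Toy energies, for illustration only.
dde = delta_delta_e([10.0, 12.0], [14.0, 14.0], [11.0, 11.0], [12.0, 12.0])
```

With this sign convention, a negative ΔΔE means that ff14SB lowers the energy of cluster 0 relative to cluster 1 by more than ff99SB does.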

Although it may appear desirable for ff14SB to favor the native conformation more than ff99SB does, the presence of a similar strand-shifted structure in the simulated ensemble for the Chignolin hairpin was reported to improve agreement of simulations with experimental NOEs for that system57. We therefore calculated distances corresponding to NOEs from our CLN025 simulations, using the ‘naïve’ approach58 that was used for Chignolin57. The sum of all NOE deviations was 2.8±0.2 Å for ff99SB and 2.1±0.4 Å for ff14SB. Furthermore, NOE agreement is better for ff99SB native run 3, which sampled only the native cluster, than for ff99SB native runs 1 and 4 or extended runs 2 through 4, which sampled comparable amounts of the two clusters (Table S7). This suggests that in our simulations of CLN025, reduction in population of the non-native hairpin improves agreement with experiment. Although no definitive conclusions can be drawn about improvements of ff14SB based on these simulations alone, the simulations together with energy analysis suggest that ff14SB is at least as reasonable as ff99SB at hairpin modeling, and thus the desirable increase in α-helical content with ff14SB did not worsen β-hairpin simulation accuracy.

Agreement with side chain NMR scalar couplings is improved with ff14SB

In addition to indirectly testing side chain sampling accuracy in the context of overall conformational propensities, it is appropriate to evaluate side chain parameter changes more directly by comparing to experimental measures of side chain dynamics. We therefore simulated GB3, ubiquitin, lysozyme, and bovine pancreatic trypsin inhibitor (BPTI) to compare against experimental scalar couplings aggregated by Lindorff-Larsen et al.8, 25, 46, 54.

We tested ff99SB and ff99SB-ILDN as references, ff14SB which includes the backbone and side chain parameter updates described above, and also ff14SBonlysc, which includes the side chain updates described above while retaining the ff99SB φ and ψ parameters. This allows us to partially deconvolute the influence of improvements to the side chain and backbone. Simulations of each protein were carried out using each force field, and the ANE (Eqn. 9) was calculated for each amino acid where experimental data are available (Figure 5). The average error was 0.160±0.004 with ff99SB, 0.129±0.003 with ff99SB-ILDN, 0.127±0.003 with ff14SBonlysc, and 0.129±0.003 with ff14SB. The averages for ff99SB-ILDN, ff14SBonlysc, and ff14SB agree within statistical uncertainty, and all three show measurable improvement over ff99SB. Not surprisingly for these stably folded proteins, there is little difference between ff14SB and ff14SBonlysc, suggesting that the improvement over ff99SB observed in this test is largely due to side chain parameter updates.

Figure 5.

Average normalized errors (ANE) in side chain scalar couplings for all amino acids in GB3, ubiquitin (Ubq), lysozyme (HEWL), and bovine pancreatic trypsin inhibitor (BPTI), according to ff99SB, ff99SB-ILDN, ff14SBonlysc, and ff14SB. Amino acids are shown with single letter code, with charge state noted for ionizable side chains. Error bars are calculated from four independent simulations.

All of the variants improved significantly upon ff99SB on average; however, the specific improvements of each force field differed. For example, for isoleucine, leucine, aspartate, and asparagine (the four residues modified by ff99SB-ILDN), the errors obtained using ff14SB (ff99SB-ILDN values given in parentheses) were 0.11±0.01 (0.091±0.005), 0.16±0.02 (0.13±0.01), 0.111±0.009 (0.16±0.02), and 0.12±0.02 (0.154±0.009), respectively: slightly worse for isoleucine and leucine, and slightly better for aspartate and asparagine.

As discussed above, ff99SB-ILDN was fit using β backbone conformations, while our fitting procedure was designed to improve side chain energetics for multiple backbone conformations. We investigated whether explicit inclusion of dipeptide α backbone conformations for QM calculations in the gas phase was successfully translated to improvement in scalar couplings of helical residues in larger proteins. We performed further fitting, simulations and analysis in order to gain insight into the impact of these choices in the quality of reproduction of experimental data, with an aim toward guiding future optimization efforts. We found that the choice of restraints during the QM and MM optimizations played an important role, which became more important when only a single backbone conformation was used during fitting of side chain parameters (see Supporting Information for more details). Overall, the results suggest that more careful consideration of these issues should be a factor in future force field efforts, as these measures can impact performance of simulations using the resulting parameters. These choices include how finely geometric changes outside the scan region are controlled, what level of variation in this geometry is desirable between QM and MM energy evaluations, and how these decisions are affected by intentional inclusion of diversity in neighboring regions.

We analyzed residues refit by both ff99SB-ILDN and ff14SB that matched the following criteria: in a helix, solvent-exposed and therefore likely to represent the intrinsic preferences of the amino acid, and experimentally characterized by χ1 scalar couplings. Only three residues fit these criteria: N35 of GB3, D32 of ubiquitin, and N93 of lysozyme. All three are better reproduced with ff14SB than with ff99SB-ILDN, with ANEs for N35, D32, and N93 of 0.11±0.03/0.22±0.09 (ff14SB/ff99SB-ILDN), 0.15±0.04/0.47±0.02, and 0.16±0.02/0.31±0.04, respectively (although the N35 differences are within uncertainty ranges). We investigated these results further, and found that differences at the level of the QM and MM energy calculations are likely responsible, and that the ability to accurately predict quantum mechanics training energies correlates with reproduction of χ1 scalar couplings. See the Supporting Information for detailed analysis.

High quality of backbone dynamics in the native state is maintained

We also evaluated the ability of ff14SB to reproduce local dynamics in well-folded proteins as measured by backbone NH S2 Lipari-Szabo order parameters. We calculated NH order parameters from the same simulations used for side chain scalar coupling evaluation. This calculation was performed using iRED, which does not require separability of local and global motions48. For this analysis, we averaged iRED results calculated for windows of length 5 times the tumbling correlation time (τC), as has been suggested to best reproduce the model-free S2 order parameters59. This analysis was also repeated with window lengths found in previous publications, to facilitate comparison of results (see Supporting Information). The iRED-calculated order parameters, shown in Figure 6, are comparable among the different force fields, as indicated by low RMSD between simulation results. The greatest difference among simulations occurs for GB3, where the calculated ff99SB-ILDN and ff14SB order parameters differ by 0.05 RMSD. With ff14SB, order parameters are slightly improved for lysozyme (0.055±0.004 RMSD against NMR60, versus 0.07±0.01 for ff99SB and 0.065±0.004 for ff99SB-ILDN), and slightly worsened for GB3 (0.08±0.01 RMSD against NMR61, versus 0.060±0.005 for ff99SB and 0.061±0.004 for ff99SB-ILDN), though the statistical significance of these differences is limited. Meanwhile, ubiquitin S2 RMSDs were between 0.045 and 0.050 against NMR62 with all three force fields. We conclude that the high-quality order parameter reproduction previously reported for ff99SB3 is maintained with ff14SB. There are, however, subtle differences worth noting.

Figure 6.

Order parameters from NMR compared to those back calculated by iRED for ff99SB, ff99SB-ILDN, and ff14SB simulations of GB3, ubiquitin, and lysozyme. Error bars represent the standard deviation of average values from four independent runs. The top panels show differences between simulation and experiment, while the lowest panels show average data for each secondary structure region, following Hornak et al.3.

Firstly, loop 4 (L4) in lysozyme is better reproduced with ff14SB on average (0.09±0.05 RMSD against NMR versus 0.15±0.06 and 0.14±0.04 for ff99SB and ff99SB-ILDN). As with the overall S2 RMSDs, these differences are not highly statistically significant. It is interesting, however, that L4 connects two helices, and thus enhanced rigidity could stem from our goal of improving helical stability. On the other hand, the first hairpin in GB3 is reproduced less well with ff14SB on average (0.12±0.02 RMSD for residues 1–20 versus 0.08±0.01 for both ff99SB and ff99SB-ILDN). Although these RMSDs do not differ greatly once uncertainties are considered, ff14SB may model the stability of some hairpins slightly less accurately. Evaluating whether ff14SB reproduces S2 across different secondary structures more equitably in the general case would require a more comprehensive examination. Nonetheless, the differences between force fields are small. Thus, although ff14SB may slightly improve modeling of helical regions and slightly worsen modeling of hairpins, we conclude that ff14SB maintains ff99SB’s excellent order parameter reproduction overall.
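The window-averaging idea and the RMSD comparison against NMR in this section can be illustrated with a simplified sketch. It does not implement iRED. Instead it uses the common second-moment estimate of S2 from unit NH bond vectors, which assumes that overall tumbling has already been removed by superposition. The window length is given in frames and would correspond to 5τC divided by the frame spacing. All names are hypothetical.

```python
import math

def s2_from_unit_vectors(vectors):
    """S2 = 1.5 * sum_ij <u_i u_j>^2 - 0.5 for unit vectors in a common frame."""
    n = len(vectors)
    acc = 0.0
    for i in range(3):
        for j in range(3):
            moment = sum(v[i] * v[j] for v in vectors) / n
            acc += moment * moment
    return 1.5 * acc - 0.5

def windowed_s2(vectors, window):
    """Average S2 over consecutive, non-overlapping windows of `window` frames.

    Requires len(vectors) >= window; trailing frames that do not fill a window are ignored.
    """
    starts = range(0, len(vectors) - window + 1, window)
    values = [s2_from_unit_vectors(vectors[s:s + window]) for s in starts]
    return sum(values) / len(values)

def rmsd(calculated, reference):
    """Root-mean-square deviation between per-residue calculated and reference S2."""
    squared = [(c - r) ** 2 for c, r in zip(calculated, reference)]
    return math.sqrt(sum(squared) / len(squared))

# Toy check: a rigid vector gives S2 = 1, isotropic sampling gives S2 = 0.
rigid = [(0.0, 0.0, 1.0)] * 10
isotropic = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]
```

Limiting each window to a few multiples of τC prevents slow motions that NMR relaxation cannot detect from lowering the calculated S2.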

Conclusion

The weaknesses of ff99SB addressed in this work are the less than ideal agreement with polyalanine scalar couplings, helical propensity, and side chain preferences. We tackled the first two weaknesses with the best of an array of empirical adjustments to the backbone potentials (here denoted mod1φ) and the third by de novo fitting against a backbone-independent MP2 training set. The successor to ff99SB, ff14SB, increased the helical content of HBSP and K19 and improved side chain rotamer distributions as suggested by scalar couplings, while maintaining the reasonable reproduction of order parameters and hairpin structure characteristic of ff99SB. Interestingly, we were able to continue to improve the force field in solution by training against an in vacuo quantum mechanics benchmark, with performance that is similar between α and β contexts when compared against QM or against NMR scalar couplings. Establishing how general the ff14SB improvements are, and describing potential limitations more thoroughly, will require more testing than was possible here. Based on the benchmarks reported, however, we recommend ff14SB for the simulation of proteins and peptides.

Table 2.

Scalar coupling χ2 using DFT1 parameters for Ala5 in each backbone parameter combination

| | ff99SBφ | mod1φ | mod2φ | mod3φ | mod4φ | mod5φ |
| --- | --- | --- | --- | --- | --- | --- |
| ff99SBψ | 1.45±0.04 | 2.71±0.15 | 2.21±0.08 | 1.83±0.01 | 1.47±0.08 | 1.53±0.03 |
| mod1ψ | 1.63±0.01 | 3.60±0.02 | 2.61±0.13 | 2.33±0.08 | 1.94±0.14 | 1.76±0.16 |
| mod2ψ | 1.69±0.15 | 3.16±0.13 | 2.37±0.01 | 2.56±0.58 | 1.96±0.31 | 1.70±0.10 |
| mod3ψ | 1.65±0.03 | 2.92±0.15 | 2.31±0.15 | 1.99±0.12 | 1.71±0.02 | 1.96±0.09 |
| mod4ψ | 1.91±0.09 | 3.12±0.27 | 2.61±0.05 | 2.19±0.15 | 1.96±0.04 | 2.12±0.26 |

Acknowledgments

The authors gratefully acknowledge helpful suggestions from David Green.

Funding Sources

The authors acknowledge funding from the NIH (GM061678 and GM107104) and support from Henry and Marsha Laufer. This work is also supported by an NSF Petascale Computational Resource (PRAC) Award from the National Science Foundation (OCI-1036208).

ABBREVIATIONS

ff99SB: force field 99 Stony Brook
ff14SB: force field 14 Stony Brook
NMR: nuclear magnetic resonance
CD: circular dichroism
CSD: chemical shift deviation(s)
iRED: isotropic reorientational eigenmode dynamics
GB3: third immunoglobulin-binding domain of protein G
BPTI: bovine pancreatic trypsin inhibitor
Ubq: ubiquitin
HEWL: hen egg white lysozyme
HBSP: hydrogen bond surrogate peptide
QM: quantum mechanics
MM: molecular mechanics
MD: molecular dynamics
REMD: replica exchange MD
MP2: Møller-Plesset perturbation theory of the second order
HF: Hartree-Fock
RHF: restricted Hartree-Fock
6-31G*: 6-31G(d)
ESP: electrostatic potential
AAE: average absolute error
BBD: backbone dependence
ANE: average normalized error
CMAP: correction map
PDB: Protein Data Bank
RMS: root mean square
REE: relative energy error
Orig: “original” Karplus parameters
DFT1: density functional theory-based Karplus parameters, derived from Ala1
DFT2: density functional theory-based Karplus parameters, derived from Ala2
NOE: nuclear Overhauser effect
BMRB: Biological Magnetic Resonance Bank
ILDN: isoleucine, leucine, aspartate, and asparagine
ppII: polyproline helix, type II
K19: ac-G3(KAAAA)3K-NH2
ff14SBonlysc: ff99SB with the updated ff14SB side chain corrections
BB: backbone
RMSD: root-mean-square deviation

Footnotes

Supporting Information. Force field parameter files; additional information about structure generation and filtration and about fitting of new side chain dihedral parameters; analysis of side chain scalar coupling convergence; NOE violations divided into backbone-backbone, backbone-side chain, and side chain-side chain restraints; and testing of the AMBER12-bundled ff12SB in comparison with ff14SB presented in the main text. This material is available free of charge via the Internet at http://pubs.acs.org.

Author Contributions

The manuscript was written through contributions of all authors. All authors have given approval to the final version of the manuscript.

References

  • 1.(a) Patel S, Mackerell AD, Brooks CL. CHARMM fluctuating charge force field for proteins: II Protein/solvent properties from molecular dynamics simulations using a nonadditive electrostatic model. J Comp Chem. 2004;25(12):1504–1514. doi: 10.1002/jcc.20077. [DOI] [PubMed] [Google Scholar]; (b) Patel S, Brooks CL. CHARMM fluctuating charge force field for proteins: I parameterization and application to bulk organic liquid simulations. J Comp Chem. 2004;25(1):1–16. doi: 10.1002/jcc.10355. [DOI] [PubMed] [Google Scholar]; (c) Anisimov VM, Lamoureux G, Vorobyov IV, Huang N, Roux B, MacKerell AD. Determination of Electrostatic Parameters for a Polarizable Force Field Based on the Classical Drude Oscillator. J Chem Theory Comp. 2005;1(1):153–168. doi: 10.1021/ct049930p. [DOI] [PubMed] [Google Scholar]; (d) Lopes PEM, Huang J, Shim J, Luo Y, Li H, Roux B, MacKerell AD. Polarizable Force Field for Peptides and Proteins Based on the Classical Drude Oscillator. J Chem Theory Comp. 2013;9(12):5430–5449. doi: 10.1021/ct400781b. [DOI] [PMC free article] [PubMed] [Google Scholar]; (e) Warshel A, Levitt M. Theoretical studies of enzymic reactions: Dielectric, electrostatic and steric stabilization of the carbonium ion in the reaction of lysozyme. J Mol Biol. 1976;103(2):227–249. doi: 10.1016/0022-2836(76)90311-9. [DOI] [PubMed] [Google Scholar]; (f) Ren P, Ponder JW. Polarizable Atomic Multipole Water Model for Molecular Mechanics Simulation. J Phys Chem B. 2003;107(24):5933–5947. [Google Scholar]
  • 2.Fried SD, Wang LP, Boxer SG, Ren PY, Pande VS. Calculations of the Electric Fields in Liquid Solutions. J Phys Chem B. 2013;117(50):16236–16248. doi: 10.1021/jp410720y. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 3.Hornak V, Abel R, Okur A, Strockbine B, Roitberg A, Simmerling C. Comparison of multiple amber force fields and development of improved protein backbone parameters. Proteins: Struct, Funct Bioinf. 2006;65(3):712–725. doi: 10.1002/prot.21123. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 4.Cornell WD, Cieplak P, Bayly CI, Gould IR, Merz KM, Ferguson DM, Spellmeyer DC, Fox T, Caldwell JW, Kollman PA. A Second Generation Force Field For the Simulation of Proteins, Nucleic Acids, and Organic Molecules. J Am Chem Soc. 1995;117(19):5179–5197. [Google Scholar]
  • 5.Wang JM, Cieplak P, Kollman PA. How well does a restrained electrostatic potential (RESP) model perform in calculating conformational energies of organic and biological molecules? J Comp Chem. 2000;21(12):1049–1074. [Google Scholar]
  • 6.Case DA, Cheatham TE, Darden T, Gohlke H, Luo R, Merz KM, Onufriev A, Simmerling C, Wang B, Woods RJ. The Amber biomolecular simulation programs. J Comp Chem. 2005;26(16):1668–1688. doi: 10.1002/jcc.20290. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 7.(a) Showalter SA, Brüschweiler R. Validation of molecular dynamics simulations of biomolecules using NMR spin relaxation as benchmarks: Application to the AMBER99SB force field. J Chem Theory Comp. 2007;3(3):961–975. doi: 10.1021/ct7000045. [DOI] [PubMed] [Google Scholar]; (b) Li D-W, Brüschweiler R. Certification of molecular dynamics trajectories with NMR chemical shifts. Journal of Physical Chemistry Letters. 2009;1(1):246–248. [Google Scholar]; (c) Lange OF, van der Spoel D, de Groot BL. Scrutinizing Molecular Mechanics Force Fields on the Submicrosecond Timescale with NMR Data. Biophys J. 2010;99(2):647–655. doi: 10.1016/j.bpj.2010.04.062. [DOI] [PMC free article] [PubMed] [Google Scholar]; (d) Cerutti DS, Freddolino PL, Duke RE, Case DA. Simulations of a Protein Crystal with a High Resolution X-ray Structure: Evaluation of Force Fields and Water Models. J Phys Chem B. 2010;114(40):12811–12824. doi: 10.1021/jp105813j. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 8.Lindorff-Larsen K, Piana S, Palmo K, Maragakis P, Klepeis JL, Dror RO, Shaw DE. Improved side-chain torsion potentials for the Amber ff99SB protein force field. Proteins: Struct, Funct Bioinf. 2010;78(8):1950–1958. doi: 10.1002/prot.22711. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 9.Best RB, Zhu X, Shim J, Lopes PEM, Mittal J, Feig M, MacKerell AD., Jr Optimization of the Additive CHARMM all-atom protein force field targeting improved sampling of the backbone ϕ, ψ and side-chain χ1 and χ2 dihedral angles. J Chem Theory Comp. 2012;8(9):3257–3273. doi: 10.1021/ct300400x. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 10.Graf J, Nguyen PH, Stock G, Schwalbe H. Structure and dynamics of the homologous series of alanine peptides: A joint molecular dynamics/NMR study. J Am Chem Soc. 2007;129(5):1179–1189. doi: 10.1021/ja0660406. [DOI] [PubMed] [Google Scholar]
  • 11.(a) Wickstrom L, Okur A, Simmerling C. Evaluating the performance of the ff99SB force field based on NMR scalar coupling data. Biophys J. 2009;97(3):853–6. doi: 10.1016/j.bpj.2009.04.063. [DOI] [PMC free article] [PubMed] [Google Scholar]; (b) Best RB, Buchete NV, Hummer G. Are current molecular dynamics force fields too helical? Biophys J. 2008;95(1):L7–L9. doi: 10.1529/biophysj.108.132696. [DOI] [PMC free article] [PubMed] [Google Scholar]; (c) Best RB, Hummer G. Optimized Molecular Dynamics Force Fields Applied to the Helix-Coil Transition of Polypeptides. J Phys Chem B. 2009;113(26):9004–9015. doi: 10.1021/jp901540t. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 12.(a) Karplus M. Vicinal Proton Coupling in Nuclear Magnetic Resonance. J Am Chem Soc. 1963;85(18):2870–+. [Google Scholar]; (b) Karplus M. Contact Electron-Spin Coupling of Nuclear Magnetic Moments. J Chem Phys. 1959;30(1):11–15. [Google Scholar]
  • 13.Salvador P, Tsai IH, Dannenberg JJ. J-coupling constants for a trialanine peptide as a function of dihedral angles calculated by density functional theory over the full Ramachandran space. Phys Chem Chem Phys. 2011;13(39):17484–17493. doi: 10.1039/c1cp20520j. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 14.Seabra GD, Walker RC, Roitberg AE. Are Current Semiempirical Methods Better Than Force Fields? A Study from the Thermodynamics Perspective. J Phys Chem A. 2009;113(43):11938–11948. doi: 10.1021/jp903474v. [DOI] [PubMed] [Google Scholar]
  • 15.Bayly CI, Cieplak P, Cornell WD, Kollman PA. A Well-Behaved Electrostatic Potential Based Method Using Charge Restraints for Deriving Atomic Charges – the Resp Model. J Phys Chem. 1993;97(40):10269–10280. [Google Scholar]
  • 16.Jorgensen WL, Chandrasekhar J, Madura JD, Impey RW, Klein ML. Comparison of Simple Potential Functions for Simulating Liquid Water. J Chem Phys. 1983;79(2):926–935. [Google Scholar]
  • 17.(a) Duan Y, Wu C, Chowdhury S, Lee MC, Xiong G, Zhang W, Yang R, Cieplak P, Luo R, Lee T, Caldwell J, Wang J, Kollman P. A point-charge force field for molecular mechanics simulations of proteins based on condensed-phase quantum mechanical calculations. J Comp Chem. 2003;24(16):1999–2012. doi: 10.1002/jcc.10349. [DOI] [PubMed] [Google Scholar]; (b) Cerutti DS, Rice JE, Swope WC, Case DA. Derivation of Fixed Partial Charges for Amino Acids Accommodating a Specific Water Model and Implicit Polarization. The Journal of Physical Chemistry B. 2013;117(8):2328–2338. doi: 10.1021/jp311851r. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 18.Kirschner KN, Yongye AB, Tschampel SM, González-Outeiriño J, Daniels CR, Foley BL, Woods RJ. GLYCAM06: A generalizable biomolecular force field. Carbohydrates. J Comp Chem. 2008;29(4):622–655. doi: 10.1002/jcc.20820. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 19.Thompson EJ, DePaul AJ, Patel SS, Sorin EJ. Evaluating Molecular Mechanical Potentials for Helical Peptides and Proteins. PLoS One. 2010;5(3):e10056. doi: 10.1371/journal.pone.0010056. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 20.Nguyen H, Maier J, Huang H, Perrone V, Simmerling C. Folding Simulations for Proteins with Diverse Topologies Are Accessible in Days with a Physics-Based Force Field and Implicit Solvent. J Am Chem Soc. 2014;136(40):13959–13962. doi: 10.1021/ja5032776. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 21.Janowski PA, Liu C, Deckman J, Case DA. Molecular dynamics simulation of triclinic lysozyme in a crystal lattice. Protein Sci. 2015:n/a–n/a. doi: 10.1002/pro.2713. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 22.(a) Perez A, Marchan I, Svozil D, Šponer J, Cheatham TE, Laughton CA, Orozco M. Refinement of the AMBER force field for nucleic acids: Improving the description of alpha/gamma conformers. Biophys J. 2007;92(11):3817–3829. doi: 10.1529/biophysj.106.097782. [DOI] [PMC free article] [PubMed] [Google Scholar]; (b) Zgarbová M, Otyepka M, Šponer J, Mladek A, Banas P, Cheatham TE, 3rd, Jurečka P. Refinement of the Cornell et al. Nucleic Acids Force Field Based on Reference Quantum Chemical Calculations of Glycosidic Torsion Profiles. J Chem Theory Comp. 2011;7(9):2886–2902. doi: 10.1021/ct200162x. [DOI] [PMC free article] [PubMed] [Google Scholar]; (c) Zgarbová M, Luque FJ, Šponer J, Cheatham TE, 3rd, Otyepka M, Jurečka P. Toward Improved Description of DNA Backbone: Revisiting Epsilon and Zeta Torsion Force Field Parameters. J Chem Theory Comp. 2013;9(5):2339–2354. doi: 10.1021/ct400154j. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 23.MacKerell AD, Feig M, Brooks CL. Improved treatment of the protein backbone in empirical force fields. J Am Chem Soc. 2004;126(3):698–699. doi: 10.1021/ja036959e. [DOI] [PubMed] [Google Scholar]
  • 24.(a) Dunbrack RL, Jr, Karplus M. Backbone-dependent Rotamer Library for Proteins. Application to Side-chain Prediction. J Mol Biol. 1993;230(2):543–574. doi: 10.1006/jmbi.1993.1170. [DOI] [PubMed] [Google Scholar]; (b) Dunbrack RL, Jr, Karplus M. Conformational analysis of the backbone-dependent rotamer preferences of protein sidechains. Nat Struct Mol Biol. 1994;1(5):334–340. doi: 10.1038/nsb0594-334. [DOI] [PubMed] [Google Scholar]; (c) Lovell SC, Word JM, Richardson JS, Richardson DC. The penultimate rotamer library. Proteins: Struct, Funct Genet. 2000;40(3):389–408. [PubMed] [Google Scholar]
  • 25.Hu JS, Bax A. Determination of phi and chi(1) angles in proteins from C-13-C-13 three-bond J couplings measured by three-dimensional heteronuclear NMR. How planar is the peptide bond? J Am Chem Soc. 1997;119(27):6360–6368. [Google Scholar]
  • 26.Nerenberg PS, Head-Gordon T. Optimizing Protein-Solvent Force Fields to Reproduce Intrinsic Conformational Preferences of Model Peptides. J Chem Theory Comp. 2011;7(4):1220–1230. doi: 10.1021/ct2000183. [DOI] [PubMed] [Google Scholar]
  • 27.Li D-W, Brüschweiler R. Iterative Optimization of Molecular Mechanics Force Fields from NMR Data of Full-Length Proteins. J Chem Theory Comp. 2011;7(6):1773–1782. doi: 10.1021/ct200094b. [DOI] [PubMed] [Google Scholar]
  • 28.Brüschweiler R, Case DA. Adding Harmonic Motion to the Karplus Relation for Spin-Spin Coupling. J Am Chem Soc. 1994;116(24):11199–11200. [Google Scholar]
  • 29.Zhang W, Hou T, Schafmeister C, Ross WS, Case DA. LEaP and gleap. 2010 [Google Scholar]
  • 30.Schmidt MW, Baldridge KK, Boatz JA, Elbert ST, Gordon MS, Jensen JH, Koseki S, Matsunaga N, Nguyen KA, Su S, Windus TL, Dupuis M, Montgomery JA. General atomic and molecular electronic structure system. J Comp Chem. 1993;14(11):1347–1363. [Google Scholar]
  • 31.Frisch MJ, Trucks GW, Schlegel HB, Scuseria GE, Robb MA, Cheeseman JR, Zakrzewski VG, Montgomery JAJ, Stratmann RE, Burant JC, Dapprich S, Millam JM, Daniels AD, Kudin KN, Strain MC, Farkas O, Tomasi J, Barone V, Cossi M, Cammi R, Mennucci B, Pomelli C, Adamo C, Clifford S, Ochterski J, Petersson GA, Ayala PY, Cui Q, Morokuma K, Malick DK, Rabuck AD, Raghavachari K, Foresman JB, Cioslowski J, Ortiz JV, Baboul AG, Stefanov BB, Liu G, Liashenko A, Piskorz P, Komaromi I, Gomperts R, Martin RL, Fox DJ, Keith T, Al-Laham MA, Peng CY, Nanayakkara A, Gonzalez C, Challacombe M, Gill PMW, Johnson B, Chen W, Wong MW, Andres JL, Head-Gordon M, Replogle ES, Pople JA. Gaussian 98. Gaussian, Inc; Pittsburgh, PA: 1998. Revision A.7. [Google Scholar]
  • 32.Wang JM, Kollman PA. Automatic parameterization of force field by systematic search and genetic algorithms. J Comp Chem. 2001;22(12):1219–1228. [Google Scholar]
  • 33.Berendsen HJC, Postma JPM, van Gunsteren WF, DiNola A, Haak JR. Molecular-Dynamics with Coupling to an External Bath. J Chem Phys. 1984;81(8):3684–3690. [Google Scholar]
  • 34.Ryckaert JP, Ciccotti G, Berendsen HJC. Numerical-Integration of Cartesian Equations of Motion of a System with Constraints – Molecular-Dynamics of N-Alkanes. J Comput Phys. 1977;23(3):327–341. [Google Scholar]
  • 35.Darden T, York D, Pedersen L. Particle Mesh Ewald – an N·log(N) Method for Ewald Sums in Large Systems. J Chem Phys. 1993;98(12):10089–10092. [Google Scholar]
  • 36.Wang D, Chen K, Kulp JL, III, Arora PS. Evaluation of biologically relevant short alpha-helices stabilized by a main-chain hydrogen-bond surrogate. J Am Chem Soc. 2006;128(28):9248–56. doi: 10.1021/ja062710w. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 37.Kosinska Eriksson U, Fischer G, Friemann R, Enkavi G, Tajkhorshid E, Neutze R. Subangstrom Resolution X-Ray Structure Details Aquaporin-Water Interactions. Science. 2013;340(6138):1346–1349. doi: 10.1126/science.1234306. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 38.Song K, Stewart JM, Fesinmeyer RM, Andersen NH, Simmerling C. Structural insights for designed alanine-rich helices: Comparing NMR helicity measures and conformational ensembles from molecular dynamics simulation. Biopolymers. 2008;89(9):747–760. doi: 10.1002/bip.21004. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 39.(a) Davis CM, Xiao SF, Raleigh DP, Dyer RB. Raising the Speed Limit for beta-Hairpin Formation. J Am Chem Soc. 2012;134(35):14476–14482. doi: 10.1021/ja3046734. [DOI] [PMC free article] [PubMed] [Google Scholar]; (b) Honda S, Akiba T, Kato YS, Sawada Y, Sekijima M, Ishimura M, Ooishi A, Watanabe H, Odahara T, Harata K. Crystal Structure of a Ten-Amino Acid Protein. J Am Chem Soc. 2008;130(46):15327–15331. doi: 10.1021/ja8030533. [DOI] [PubMed] [Google Scholar]
  • 40.Ulmer TS, Ramirez BE, Delaglio F, Bax A. Evaluation of Backbone Proton Positions and Dynamics in a Small Protein by Liquid Crystal NMR Spectroscopy. J Am Chem Soc. 2003;125(30):9179–9191. doi: 10.1021/ja0350684. [DOI] [PubMed] [Google Scholar]
  • 41.Wlodawer A, Walter J, Huber R, Sjölin L. Structure of bovine pancreatic trypsin inhibitor: Results of joint neutron and X-ray refinement of crystal form II. J Mol Biol. 1984;180(2):301–329. doi: 10.1016/s0022-2836(84)80006-6. [DOI] [PubMed] [Google Scholar]
  • 42.Vijay-Kumar S, Bugg CE, Cook WJ. Structure of ubiquitin refined at 1.8 Å resolution. J Mol Biol. 1987;194(3):531–544. doi: 10.1016/0022-2836(87)90679-6. [DOI] [PubMed] [Google Scholar]
  • 43.Young ACM, Dewan JC, Nave C, Tilton RF. Comparison of radiation-induced decay and structure refinement from X-ray data collected from lysozyme crystals at low and ambient temperatures. J Appl Crystallogr. 1993;26(3):309–319. [Google Scholar]
  • 44.(a) Ding KY, Gronenborn AM. Protein backbone H-1(N)-C-13(alpha) and N-15-C-13(alpha) residual dipolar and J couplings: New constraints for NMR structure determination. J Am Chem Soc. 2004;126(20):6232–6233. doi: 10.1021/ja049049l. [DOI] [PubMed] [Google Scholar]; (b) Hennig M, Bermel W, Schwalbe H, Griesinger C. Determination of psi torsion angle restraints from (3)J(C-alpha,C-alpha) and 3J(C-alpha,H-N) coupling constants in proteins. J Am Chem Soc. 2000;122(26):6268–6277. [Google Scholar]; (c) Wirmer J, Schwalbe H. Angular dependence of 1J(Ni,Calphai) and 2J(Ni,Calpha(i-1)) coupling constants measured in J-modulated HSQCs. J Biomol NMR. 2002;23(1):47–55. doi: 10.1023/a:1015384805098. [DOI] [PubMed] [Google Scholar]
  • 45.Case DA, Scheurer C, Brüschweiler R. Static and dynamic effects on vicinal scalar J couplings in proteins and peptides: A MD/DFT analysis. J Am Chem Soc. 2000;122(42):10390–10397. [Google Scholar]
  • 46.Chou JJ, Case DA, Bax A. Insights into the mobility of methyl-bearing side chains in proteins from (3)J(CC) and (3)J(CN) couplings. J Am Chem Soc. 2003;125(29):8959–8966. doi: 10.1021/ja029972s. [DOI] [PubMed] [Google Scholar]
  • 47.Perez C, Lohr F, Ruterjans H, Schmidt JM. Self-consistent Karplus parametrization of (3)J couplings depending on the polypeptide side-chain torsion chi(1). J Am Chem Soc. 2001;123(29):7081–7093. doi: 10.1021/ja003724j. [DOI] [PubMed] [Google Scholar]
  • 48.Prompers JJ, Brüschweiler R. General framework for studying the dynamics of folded and nonfolded proteins by NMR relaxation spectroscopy and MD simulation. J Am Chem Soc. 2002;124(16):4522–4534. doi: 10.1021/ja012750u. [DOI] [PubMed] [Google Scholar]
  • 49.Roe DR, Cheatham TE, III. PTRAJ and CPPTRAJ: Software for Processing and Analysis of Molecular Dynamics Trajectory Data. J Chem Theory Comp. 2013;9(7):3084–3095. doi: 10.1021/ct400341p. [DOI] [PubMed] [Google Scholar]
  • 50.Ulrich EL, Akutsu H, Doreleijers JF, Harano Y, Ioannidis YE, Lin J, Livny M, Mading S, Maziuk D, Miller Z, Nakatani E, Schulte CF, Tolmie DE, Wenger RK, Yao HY, Markley JL. BioMagResBank. Nucleic Acids Res. 2008;36:D402–D408. doi: 10.1093/nar/gkm957. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 51.Nilges M. Calculation of Protein Structures with Ambiguous Distance Restraints – Automated Assignment of Ambiguous NOE Crosspeaks and Disulfide Connectivities. J Mol Biol. 1995;245(5):645–660. doi: 10.1006/jmbi.1994.0053. [DOI] [PubMed] [Google Scholar]
  • 52.Kabsch W, Sander C. Dictionary of Protein Secondary Structure – Pattern-Recognition of Hydrogen-Bonded and Geometrical Features. Biopolymers. 1983;22(12):2577–2637. doi: 10.1002/bip.360221211. [DOI] [PubMed] [Google Scholar]
  • 53.MacKerell AD, Feig M, Brooks CL. Extending the treatment of backbone energetics in protein force fields: Limitations of gas-phase quantum mechanics in reproducing protein conformational distributions in molecular dynamics simulations. J Comp Chem. 2004;25(11):1400–1415. doi: 10.1002/jcc.20065. [DOI] [PubMed] [Google Scholar]
  • 54.(a) Berndt KD, Guntert P, Orbons LPM, Wuthrich K. Determination of a High-Quality Nuclear-Magnetic-Resonance Solution Structure of the Bovine Pancreatic Trypsin-Inhibitor and Comparison with 3 Crystal-Structures. J Mol Biol. 1992;227(3):757–775. doi: 10.1016/0022-2836(92)90222-6. [DOI] [PubMed] [Google Scholar]; (b) Grimshaw SB. Novel approaches to characterizing native and denatured proteins by NMR. PhD Thesis. University of Oxford; 1999. [Google Scholar]; (c) Miclet E, Boisbouvier J, Bax A. Measurement of eight scalar and dipolar couplings for methine-methylene pairs in proteins and nucleic acids. J Biomol NMR. 2005;31(3):201–216. doi: 10.1007/s10858-005-0175-z. [DOI] [PubMed] [Google Scholar]; (d) Schwalbe H, Grimshaw SB, Spencer A, Buck M, Boyd J, Dobson CM, Redfield C, Smith LJ. A refined solution structure of hen lysozyme determined using residual dipolar coupling data. Protein Sci. 2001;10(4):677–688. doi: 10.1110/ps.43301. [DOI] [PMC free article] [PubMed] [Google Scholar]; (e) Smith LJ, Sutcliffe MJ, Redfield C, Dobson CM. Analysis of Phi and Chi-1 Torsion Angles for Hen Lysozyme in Solution from H-1-NMR Spin Spin Coupling-Constants. Biochemistry. 1991;30(4):986–996. doi: 10.1021/bi00218a015. [DOI] [PubMed] [Google Scholar]
  • 55.(a) Patgiri A, Jochim AL, Arora PS. A hydrogen bond surrogate approach for stabilization of short peptide sequences in alpha-helical conformation. Acc Chem Res. 2008;41(10):1289–300. doi: 10.1021/ar700264k. [DOI] [PMC free article] [PubMed] [Google Scholar]; (b) Chapman RN, Dimartino G, Arora PS. A highly stable short alpha-helix constrained by a main-chain hydrogen-bond surrogate. J Am Chem Soc. 2004;126(39):12252–3. doi: 10.1021/ja0466659. [DOI] [PubMed] [Google Scholar]; (c) Wang D, Chen K, Dimartino G, Arora PS. Nucleation and stability of hydrogen-bond surrogate-based α-helices. Org Biomol Chem. 2006;4(22):4074–4081. doi: 10.1039/b612891b. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 56.Arora P. personal communication. 2015 [Google Scholar]
  • 57.Kuhrova P, De Simone A, Otyepka M, Best RB. Force-Field Dependence of Chignolin Folding and Misfolding: Comparison with Experiment and Redesign. Biophys J. 2012;102(8):1897–1906. doi: 10.1016/j.bpj.2012.03.024. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 58.Feenstra KA, Peter C, Scheek RM, van Gunsteren WF, Mark AE. A comparison of methods for calculating NMR cross-relaxation rates (NOESY and ROESY intensities) in small peptides. J Biomol NMR. 2002;23(3):181–94. doi: 10.1023/a:1019854626147. [DOI] [PubMed] [Google Scholar]
  • 59.Gu Y, Li D-W, Brüschweiler R. NMR Order Parameter Determination from Long Molecular Dynamics Trajectories for Objective Comparison with Experiment. J Chem Theory Comp. 2014;10(6):2599–2607. doi: 10.1021/ct500181v. [DOI] [PubMed] [Google Scholar]
  • 60.Buck M, Boyd J, Redfield C, MacKenzie DA, Jeenes DJ, Archer DB, Dobson CM. Structural Determinants of Protein Dynamics: Analysis of 15N NMR Relaxation Measurements for Main-Chain and Side-Chain Nuclei of Hen Egg White Lysozyme. Biochemistry. 1995;34(12):4041–4055. doi: 10.1021/bi00012a023. [DOI] [PubMed] [Google Scholar]
  • 61.Hall JB, Fushman D. Characterization of the overall and local dynamics of a protein with intermediate rotational anisotropy: Differentiating between conformational exchange and anisotropic diffusion in the B3 domain of protein G. J Biomol NMR. 2003;27(3):261–275. doi: 10.1023/a:1025467918856. [DOI] [PubMed] [Google Scholar]
  • 62.Tjandra N, Feller SE, Pastor RW, Bax A. Rotational diffusion anisotropy of human ubiquitin from 15N NMR relaxation. J Am Chem Soc. 1995;117(50):12562–12566. [Google Scholar]
