Abstract
Rotamers, defined by χ torsional angles, describe the side-chain conformations of amino acid residues in a protein in terms of rotational isomers (hence the word rotamer). Rotamer libraries, constructed from either protein crystal structures or dynamics studies, are tools for classifying rotamers (torsional angles) in a way that reflects their frequency in nature. Rotamer libraries are routinely used in structure modeling and evaluation. In this perspective article, we would like to encourage researchers to apply rotamer analyses beyond their traditional use. Molecular dynamics (MD) simulations of proteins capture the in silico behavior of molecules in solution and thus can identify favorable side-chain conformations. In this article, we used simple computational tools to study rotamer dynamics (RD) in MD simulations. First, we saved each frame of the MD trajectories as a separate Protein Data Bank file via the cpptraj module in AMBER. Then, we extracted torsional angles via the Bio3D module in R language. The classification of torsional angles was also done in R according to the penultimate rotamer library. RD analysis is useful for various applications, such as protein folding; the study of rotamer-rotamer relationships in protein-protein interactions; real-time correlation between secondary structures and rotamers; the study of side-chain flexibility in binding sites in preparation for molecular docking; the use of RD as a guide in functional analysis and in the study of structural changes caused by mutations; the provision of parameters for improving the accuracy and speed of coarse-grained MD; and many others. Major challenges facing the emergence of RD as a new scientific field involve the validation of results via easy, inexpensive wet-lab methods. This realm is yet to be explored.
Introduction
Proteins play major structural and functional roles in all living beings. The molecular understanding of protein structures at the atomic level is important for explaining cellular mechanisms as well as for various medical and nonmedical applications. At the time of writing, more than 137,000 biological macromolecular structures have been deposited in the Protein Data Bank (PDB). Among these macromolecules, over 127,000 structures have been annotated as proteins (https://www.rcsb.org). This reservoir of experimentally verified crystal structures is a thriving environment for bioinformatics research and exploitation via computational study. Molecular dynamics (MD) analysis is a well-established computational method for studying structures in a controlled environment. Based on predefined parameters, known as force fields (FFs), it is possible to apply the laws of physics to calculate the effect of molecular forces (i.e., those arising from bond lengths, bond angles, torsional or dihedral angles, and electrostatic and van der Waals interactions) and predict the trajectories of all atoms in a system. MD analysis of proteins repeats these calculations many times to simulate the behavior of molecules in a well-controlled aqueous environment in silico. More than 20,000 ISI-indexed articles per year have been published on the topic of MD in the past 3 years (webofscience.com).
On the other hand, as early as the first protein structures were discovered, it was shown that particular bonds and angles were restricted toward ideal ranges. Torsional angles can either describe the dihedral rotation in the backbone (φ between Cα and N, ψ between Cα and C, ω between C and N) or side chain (χ1 between Cα and Cβ, χ2 between Cβ and Cγ, etc.) as shown in Fig. 1 A. Rotamers, the rotations of the side-chain torsional angles, were intensively studied by various methods to identify the ideal rotamer ranges occurring in nature. To construct a rotamer library, one method depends on collection of protein structures and statistical analysis of side-chain conformations, whereas another method applies clustering approach of the three possible carbon sp3–sp3 rotations (i.e., +60, 180, and −60°). These three torsional angles can be represented by IUPAC gauche-trans nomenclature (g+, t, g−, respectively); however, previous researchers used g+ and g− to represent either +60 or −60° without consistency. An alternative nomenclature (p, t, m for +60, 180, and −60°, respectively; shown in Fig. 1 B) was proposed by Lovell et al. (1).
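The p, t, m nomenclature can be illustrated with a short binning function. The following Python sketch is not taken from the article's R scripts; the 120°-wide windows centered on +60, 180, and −60° and the function names `wrap_angle` and `ptm_label` are assumptions made for illustration, whereas actual rotamer libraries define residue-specific ranges.

```python
def wrap_angle(angle_deg):
    """Map any angle in degrees onto the interval [-180, 180)."""
    return (angle_deg + 180.0) % 360.0 - 180.0


def ptm_label(chi_deg):
    """Return 'p', 't', or 'm' for a chi angle near +60, 180, or -60 degrees.

    Assumption: each staggered state owns a 120-degree window centered on
    its ideal value. This is a simplification for illustration only.
    """
    chi = wrap_angle(chi_deg)
    if 0.0 <= chi < 120.0:
        return "p"   # around +60 degrees
    if -120.0 <= chi < 0.0:
        return "m"   # around -60 degrees
    return "t"       # around 180 degrees (wraps across +/-180)


# Build a rotamer-style name from consecutive chi angles (chi1, chi2, chi3).
name = "".join(ptm_label(chi) for chi in (62.0, -178.0, 55.0))
```

With these assumed windows, the three angles in the last line are labeled p, t, and p, which mirrors how a name such as ptp is read in the p, t, m scheme.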
Figure 1.
(A) Nomenclature of the torsional angles in the backbone and side-chain structure. (B) Representation of the first three χ torsional angles in the side chain and the p, t, m nomenclature is given.
Rotamers usually represent a local energy minimum of torsional angles, and thus the backbone torsional angles φ and ψ can also be involved (2). The Dunbrack backbone-dependent rotamer library is among the most widely used rotamer libraries (3). Rotamer libraries that are dependent on secondary structure are useful for homology modeling (4). However, Dunbrack argues that his data favor backbone dependency rather than dependency on explicit secondary structures (2). In this work, we use the term “secondary structure” to broadly describe backbone torsional angles (φ and ψ) as well as interchain hydrogen bonds.
The “penultimate rotamer library” by the Richardson laboratory (1) attempted to avoid the internal atomic clashes resulting from ideal hydrogen atoms in the structure and to exclude uncertain residues with high B-factors. Thus, this library provided higher quality and coverage with a low number of rotamer classes (nearly 153 rotamers). The latter advantage is ideal for analysis and graphical representation, which is why this library was chosen for the protocol presented in this article.
Both libraries mentioned earlier focused on high-quality crystal structures of proteins, and thus little information is known about rotamer flexibility in solution. The dynameomics rotamer library employs MD simulation for at least 31 ns at 25°C to predict rotamers of proteins in solution environment (5). The library was compared to several structure data sets, and the researchers investigated the role of buried versus surface residues in both crystal and dynamic structures. Furthermore, the library was supported by experimental data from NMR relaxation to measure S2 side-chain order parameters of Ala Cβ, Ile Cγ and Cδ, Leu Cδ, Met Cε, Thr Cγ, and Val Cγ on a picosecond to nanosecond timescale. This NMR-based method is routinely used to directly probe methyl group mobility in the side chains of proteins (6).
Previous studies employing MD analysis and focusing on rotamers often represented their data via plotting changes in χ dihedral angles over time (7, 8) or through principal component analysis (9). Although few dihedral angles are easy to plot, a simple classification scheme is required when dealing with large number of heterogeneous residues. We hope that our work will address the graphical challenges faced by previous researchers. Watanabe and others resorted to decomposing rotamer histograms from MD simulations into Gaussians to make dihedral populations (10), which is equivalent to the construction of a new rotamer library. It is important to develop a simple, comparative, benchmarked, and easy-to-visualize rotamer analysis method in MD simulations that can be widely adapted by researchers. The penultimate rotamer library is an ideal choice for use in MD analysis because it is backbone independent (hence all possible rotamers are included at once), with a countable number of rotamers (thus easy to classify for graphical visualization) and simple nomenclature (for instance, ptp rotamer of Met residue describes torsions for χ angles in the order p, then t, then p, for χ1, χ2, and χ3, respectively). The ptm−85 and ptm180 rotamers of Arg residue describe the χ4 angle to be around the mean −85 and 180°, respectively. The penultimate rotamer library also describes all possible rotamer ranges predicted from a very stringent collection of highly resolved and refined structures.
The purpose of this perspective is to create a simple and easy RD analysis strategy that can be adapted and developed by researchers in the MD field. However, for a full proof of concept, it is also important to develop biophysically relevant graphical visualization. In the following protocol and example sections, we will also point out several unforeseen technical challenges and discuss the best ways to overcome them.
A wide range of programs have been developed for MD simulations. Many programs can already perform analysis related to RD. The CHARMM program uses a correlation function to study average and root-mean-square fluctuation for χ1 and χ2 angles (11). In GROMACS, it is possible to prepare an index file with the preferred dihedral angles and extract these data from a simulation in the form of trigonometric functions and perform principal component analysis (12). In this case, rotamer classification is performed postanalysis. RD analysis can still be done in the same way as our protocol; however, it might be more laborious, particularly in preparing the index file and performing the rotamer classification that is more biophysically relevant. We have noticed that extracting dihedral angles and assigning them to residues poses a challenge when performing analysis via many programs because it requires either selecting the four atoms per dihedral angle or defining them in an index file like in GROMACS. We do not have practical experience in these programs, but the list includes LAMMPS (13) and Python modules. Another example is the VMD Timeline plugin (University of Illinois at Urbana-Champaign, Urbana-Champaign, IL), which can produce some dihedral angle graphical representation in a trajectory. At this point, the Bio3D module (Grant lab, University of California, San Diego, CA) in R language (The R Foundation for Statistical Computing, Vienna, Austria) was a very attractive choice because we only had to define residues (not dihedral angles) for extraction of dihedral angles. The only limitation was that it can perform this for a single structure at a time and in PDB format.
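To make concrete why many tools ask for four atoms per dihedral angle, the calculation itself can be sketched as follows. This is a generic, textbook-style implementation in plain Python, not the routine used by Bio3D or any of the programs named above; the function name `dihedral_deg` and the toy coordinates are hypothetical. For χ1, the four points would conventionally be the N, Cα, Cβ, and Cγ atoms of a residue.

```python
import math


def _sub(a, b):
    """Component-wise difference a - b of two 3D points."""
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    """Dot product of two 3D vectors."""
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    """Cross product of two 3D vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral_deg(p0, p1, p2, p3):
    """Signed dihedral angle (degrees) defined by four points.

    The angle is measured about the p1-p2 bond; 0 is cis and 180 is trans.
    """
    b1 = _sub(p1, p0)
    b2 = _sub(p2, p1)
    b3 = _sub(p3, p2)
    n1 = _cross(b1, b2)          # normal of the plane p0-p1-p2
    n2 = _cross(b2, b3)          # normal of the plane p1-p2-p3
    y = _dot(b2, _cross(n1, n2)) / math.sqrt(_dot(b2, b2))
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))


# Toy coordinates (hypothetical, not from any PDB entry).
angle = dihedral_deg((1.0, 0.0, 0.0), (0.0, 0.0, 0.0),
                     (0.0, 0.0, 1.0), (0.0, 1.0, 1.0))
```

A tool that accepts residue selections instead of atom quadruples has to look up these four atoms for each χ angle of each residue type internally, which is the convenience highlighted in the text.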
Protocol
In brief, an MD simulation was first done using the sander module in the AMBER 14 program (University of California, San Francisco, CA). Because torsional angle calculations in the Bio3D module in R language can only be performed for a single structure at a time, the process was automated to perform calculations for all simulation frames. Firstly, the trajectory file was converted to PDB format, and all frames were saved as single PDB file per frame using cpptraj module in AMBER (Fig. 2 A). Torsional angles were calculated and saved for each residue using the Bio3D module in R. The data were transformed by collecting each angle value for each frame to final format of angles (in columns) and frames (in rows). Using the penultimate rotamer library, the torsional angle data were classified into rotamers using if/else statements (Fig. 2 B; Supporting Materials and Methods).
Figure 2.
(A) CPPTRAJ script. (B) R script is given. Comments are shown in green, counts and numbers in orange, strings in gray, commands in purple, and logic in blue.
To show a practical example of this protocol, we applied RD analysis on peptide-protein interaction in the next section. In this example, implicit water MD simulations of the neurotrophic pNGF peptide (SSSHPIFHRGEFSV-NH2) and its receptor (Ig2 extracellular domain of tropomyosin receptor kinase TrkA) were done in free and bound states. Atomistic coordinates were derived from the PDB model (PDB: 2IFG) and modified in UCSF Chimera (University of California, San Francisco). Input coordinate and topology files were prepared using the H++ server (14) with protonation states optimal for physiological pH (7.4). Canonical ensemble (NVT) MD simulations were performed using an improved generalized Born solvent model for protein simulations (15). The structures were minimized (maximum 5000 cycles) and equilibrated for 500 ps at 25°C, and production was performed in Langevin dynamics (25°C, 16 Å nonbonded cutoff, 0.002 ps time steps) for several consecutive periods of 50 ns. Coordinates were printed every 1000 steps. The bound complex separated after ∼150 ns, and therefore only part of the entire simulation will be shown for demonstrative purposes. Approximately 50,000 frames, each representing 2 ps time step and corresponding to a total 100 ns of MD simulation, were used for the example.
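The frame bookkeeping quoted above (0.002 ps time steps, coordinates printed every 1000 steps, about 50,000 frames) can be checked with a few lines of arithmetic. The variable and function names in this Python sketch are hypothetical and serve only to show how a frame index maps to simulation time.

```python
TIME_STEP_PS = 0.002      # integration time step, from the text
STEPS_PER_FRAME = 1000    # coordinates printed every 1000 steps
N_FRAMES = 50_000         # approximate number of frames analyzed

# Time between two saved frames, in picoseconds.
FRAME_INTERVAL_PS = TIME_STEP_PS * STEPS_PER_FRAME

# Total simulated time covered by the analyzed frames, in nanoseconds.
TOTAL_TIME_NS = N_FRAMES * FRAME_INTERVAL_PS / 1000.0


def frame_time_ns(frame_index):
    """Simulation time (ns) at a given 0-based frame index."""
    return frame_index * FRAME_INTERVAL_PS / 1000.0
```

The interval works out to 2 ps per frame and the total to 100 ns, in agreement with the figures stated in the paragraph above.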
There are a few important points to note when saving trajectories as multiple PDB files (Fig. 2 A). The number of frames (and hence the number of files) is determined by the time interval over which we expect the rotamers’ convergence to be visible. This can be controlled by taking strides between frames, for instance, using “offset 5” in the trajout command to write only every fifth frame (offset 1, which writes every frame, is the default). In the case of large molecule simulations, computational load is important (quick tip: opening a folder with a large number of files will slow down the computer). Removal of nonprotein atoms can speed up the processing. Further, to process large numbers of frames of large molecules, we also recommend selecting tens of residues at a time and saving them separately (e.g., the command “strip !(:9-26)” can be placed on the line before “trajout …” to keep only residues 9–26). Residues of Ala and Gly can be skipped because they have no rotamers. The reader is referred to the AMBER manual for further customization. In this example, we saved 50,000 frames of three simulations (peptide free, receptor free, peptide-receptor bound) in PDB format.
To open the PDB files in R language, the Bio3D library module was loaded in step 1 (Fig. 2 B). Commands for reading each PDB frame were generated via a FOR statement and captured in variable x, which was in turn saved in a file and executed. In step 2, the torsional angles of residue number 3 were collected for all frames by commands generated via a FOR statement and captured in variable y supplemented with z, which was in turn saved in a file and executed. Similar code automation can be done for collecting data for all residues (shown in steps 2–3a). At this stage, the angles are held in a variable called tor_residue3. In step 3, the angles are saved in a tab-delimited text file.
The classification of rotamers according to the penultimate rotamer library was justified in the previous section. In step 4, the angles that were saved in step 3 are read into a variable called tor. A variable for rotamers is created. In step 5, an IF/ELSE statement is used to classify the rotamers, and the rotamers variable Rota_residue1 is saved in a tab-delimited text file. IF/ELSE statement scripts for the rotamers of the remaining amino acids are shown in Supporting Materials and Methods. If the computational load exceeds the processing capacity, which might happen with Arg residues that have up to 34 rotamer groups, we recommend dividing the groups into two or more steps and saving each calculation in a separate column, e.g., Rota_residue1[i,1], Rota_residue1[i,2], etc. Alternatively, it is possible to combine groups together for easier visualization (e.g., focusing on χ1 and χ2).
The rotamer groups are represented by data strings (in accordance with the nomenclature in Fig. 1). There is an array of methods that can be used for data visualization and analysis. Graphical representation includes rotamer frequency distribution, distribution over time (time evolution), and other correlations with time, secondary structure, energy, ligands, and other rotamer combinations.
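The flavor of such a rule-based classification can be conveyed with a small Python sketch. It is not the authors' R script and does not reproduce the ranges of the penultimate rotamer library: the table below contains only the four His rotamer names quoted in the example section of this article, read as a χ1 class plus a χ2 center, and the nearest-center rule, the 40° tolerance, and all function names are assumptions made for illustration.

```python
# Illustrative subset only: His rotamer names quoted in this article,
# interpreted as (chi1 class, chi2 center in degrees). ASCII "-" replaces
# the typographic minus sign used in the text.
HIS_ROTAMERS = {
    "m-70": ("m", -70.0),
    "m80": ("m", 80.0),
    "m170": ("m", 170.0),
    "t-80": ("t", -80.0),
}


def wrap_angle(angle_deg):
    """Map any angle in degrees onto the interval [-180, 180)."""
    return (angle_deg + 180.0) % 360.0 - 180.0


def angular_distance(a_deg, b_deg):
    """Smallest absolute difference between two angles, in degrees."""
    diff = abs(a_deg - b_deg) % 360.0
    return min(diff, 360.0 - diff)


def chi1_class(chi1_deg):
    """Assumed 120-degree windows around +60 (p), 180 (t), and -60 (m)."""
    chi1 = wrap_angle(chi1_deg)
    if 0.0 <= chi1 < 120.0:
        return "p"
    if -120.0 <= chi1 < 0.0:
        return "m"
    return "t"


def classify_his(chi1_deg, chi2_deg, tolerance_deg=40.0):
    """Assign the nearest listed His rotamer, or 'unassigned'.

    The tolerance is a hypothetical cutoff, not a library value.
    """
    letter = chi1_class(chi1_deg)
    best_name, best_dist = "unassigned", tolerance_deg
    for name, (cls, center) in HIS_ROTAMERS.items():
        if cls != letter:
            continue  # chi1 class must match before chi2 is compared
        dist = angular_distance(chi2_deg, center)
        if dist <= best_dist:
            best_name, best_dist = name, dist
    return best_name


# One label per frame, given (chi1, chi2) pairs for a single His residue.
labels = [classify_his(c1, c2) for c1, c2 in [(-65.0, -72.0), (180.0, -85.0)]]
```

Keeping the rotamer definitions in a table rather than in nested if/else branches is a design choice of this sketch; it makes it easier to split a residue with many groups, such as Arg, into several passes or to merge groups for visualization.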
Example: RD analysis of pNGF peptide binding to TrkA receptor
Neurotrophic peptides are a new generation of synthetic neurotrophic factors derived from neurotrophins and can be used to induce neuron differentiation and prevent or reverse neuronal degeneration for treatment of various diseases (16). Because of the vast cross-interactions between neurotrophins and the three tropomyosin receptor kinases (TrkA, TrkB, and TrkC), therapeutic selectivity and specificity pose a challenge in controlling their side effects (17). Understanding the peptide binding process in detail is important for optimization and development of selective therapeutics. Here, we show a study of the binding between the pNGF peptide (Fig. 3 A) and its counterpart, the Ig2 extracellular domain of tropomyosin receptor kinase TrkA (Fig. 3 D). This example is shown for the purpose of graphical visualization only and should not be used to derive conclusions without experimental validation. Among the peptide residues on the interaction interface are H4 and P5, which face the residues S304 and H343 on the receptor, respectively. In free form, H4 exhibits a “bend” or no secondary structure and a variety of rotamers (m−70, m80, m170, and t−80), whereas in bound form, H4 exhibits a predominant t−80 rotamer (Fig. 3 B). Similarly, the P5 residue shifts toward the Cγ exo rotamer, which is stabilized with formation of α helix secondary structure (Fig. 3 C). On the receptor side, the stabilization of both the m−70 rotamer of H343 and the t−80 rotamer of H4 resulted in nearly equal dynamic interaction between the two histidines and the S304 residue, which formed both m and t rotamers (Fig. 3, E and H). The latter was a rare case in which a rotamer (S304 residue) is more fixed when the protein is free, whereas the other examples show the expected fixation of rotamers upon binding, namely, the t rotamer in V294 (Fig. 3 F), m−70 in H298 (Fig. 3 G), and p90 in F329 (Fig. 3 I).
The distribution of rotamer frequency provides a summary for RD and its relationship with secondary structure frequency.
Figure 3.
Representative example of rotamer analysis. An implicit MD simulation of neurotrophic peptide and its receptor done in free and bound states is shown. (A) pNGF peptide (SSSHPIFHRGEFSV-NH2) structure is shown in green ribbon. (B) Secondary structure-rotamer relationship in H4 residue from peptide is shown. (C) Secondary structure-rotamer relationship in P5 residue from peptide is shown. (D) Part of the TrkA receptor (in orange ribbon) binding to the pNGF peptide (in green ribbon) is shown. (E) Four residues at the binding interface with distinct rotamer relations are shown. H4 from peptide acquired the t−80 rotamer. P5 from peptide acquired the Cγ exo rotamer. S304 from the receptor acquired both m and t rotamers to accommodate both adjacent histidines. H343 from the receptor acquired the m−70 rotamer. (F) Secondary structure-rotamer relationship in V294 residue from receptor is shown. (G) Secondary structure-rotamer relationship in H298 residue from receptor is shown. (H) Secondary structure-rotamer relationship in S304 residue from receptor is shown. (I) Secondary structure-rotamer relationship in F329 residue from receptor is shown. The crystal structure (PDB: 2IFG) was used. The structure was edited using UCSF Chimera and processed via the H++ server. MD simulations were performed in implicit water in AMBER 14. Rotamer analysis was done in R language, as described in the text.
Further information about the dynamics of rotamer-rotamer interaction over time can be obtained via time-evolution plots (Fig. 4 A). Here, a detailed timescale of 2 ps per frame showed a stable rotamer conformation over scales of tens of nanoseconds. The frequency of rotamer combination for the four listed residues showed the highest occurrence of the t−80, Cγ exo, m, and m−70 rotamers of H4, P5, S304, and H343, respectively (Fig. 4 B).
Figure 4.
RD data visualization and analysis of the bound residues H4 and P5 in pNGF and H343 and S304 in TrkA. (A) An RD time-evolution graph for the most frequent rotamers is shown. The graph was generated using the image() function in the gplots module in R language. With the exception of the TrkA S304 residue, the residues settled into stable rotamer conformations after 10 ns and kept them for most of the simulation. (B) Using the count() function in the plyr module in R, the number of frames was calculated for each cluster of rotamers. The table shows the eight most populated clusters. (C) Multiple factor analysis for mixed data generated using the MFAmix() function in the PCAmixdata module in R is shown. The graph shows squared loadings of variables. Based on the vector angles, His4 and His343 were strongly correlated with each other in the two dimensions. (D) A component map of the levels showing the individual rotamers is given. Representative correlated rotamers are enclosed in a green ellipse.
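Counting how many frames share the same combination of rotamers, as done for the cluster table, amounts to tallying identical rows of a frames-by-residues table. The Python sketch below uses `collections.Counter` as a stand-in for the R workflow described in the caption; the toy rows are invented for illustration and are not simulation results, and the labels use ASCII spellings such as "Cg-exo".

```python
from collections import Counter

# Column order of the per-frame rotamer table.
RESIDUES = ("H4", "P5", "S304", "H343")

# Toy data: one tuple of rotamer labels per frame (invented for illustration).
toy_frames = [
    ("t-80", "Cg-exo", "m", "m-70"),
    ("t-80", "Cg-exo", "t", "m-70"),
    ("t-80", "Cg-exo", "m", "m-70"),
    ("m-70", "Cg-endo", "m", "m-70"),
    ("t-80", "Cg-exo", "m", "m-70"),
]


def combination_counts(frames, top=8):
    """Return the `top` most frequent rotamer combinations with frame counts."""
    return Counter(frames).most_common(top)


def per_residue_fraction(frames, column):
    """Fraction of frames spent in each rotamer for one residue column."""
    counts = Counter(frame[column] for frame in frames)
    total = len(frames)
    return {rotamer: n / total for rotamer, n in counts.items()}


clusters = combination_counts(toy_frames)
s304_fractions = per_residue_fraction(toy_frames, RESIDUES.index("S304"))
```

The same tally, taken one column at a time, gives the per-residue rotamer frequency distributions, whereas tallying whole rows gives the rotamer-combination clusters.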
In contrast to the principal component analysis for torsional angles that we mentioned in the Introduction, a special type of factor analysis for nominal and mixed types of variables is used (18). Here, the whole residue (i.e., side chain) is studied as one unit instead of a heterogeneous index of dihedral angles. Multiple factor analysis for mixed data showed direct correlation between pNGF His4 and TrkA His343 in the first two dimensions (Fig. 4 C). A detailed correlation of the possible rotamer combinations can be visualized as well (Fig. 4 D). For example, in the negative panel of the first component dimension, four interesting rotamers are correlated together: the t−80, Cγ exo, m, and m−70 rotamers of H4, P5, S304, and H343, respectively (Fig. 4 D). Trimming of the noise from the data (removing the first and last 10 ns) improved the component dimension 1 (from 8.86 to 12.03%) and 2 (from 6.37 to 11.39%) with similar outcomes. This highlights the benefit of time-evolution plots used in Fig. 4 A.
We believe that this factor analysis approach to the study of dihedral angles via RD is more relevant to the researchers nowadays than factor analysis of heterogeneous indices of dihedral angles. In the previous example, the objective is to understand the binding process to produce a more selective mutant peptide for therapy. We think that rotamers provide a biophysically relevant representation of the structure, particularly when they are studied in association with energetics and thermodynamics.
Applications and future prospects
Before discussing the applications of RD analysis, it is important to give this subject its modest and unexaggerated weight. When performing a MD simulation, the major forces are calculated from two kinds of energy: bonded and nonbonded. The changes in dihedral angles belong mostly to the bonded energy, whereas a great contribution to the total energy of the system comes from interactions mediated by nonbonded, viz., noncovalent interactions. These include the van der Waals and the electrostatic (ionic and hydrogen bonds). In protein-protein interactions and protein-ligand interactions, water also plays a significant role through networks of hydrogen bonds (19). During globular protein folding, hydrophobic side chains are confined inside the protein in tightly packed and more rigid fashion than the rest of the protein, whereas hydrophilic residues protrude to face the water surface (20). We hope that the usefulness of RD analysis in providing real-time insight on protein folding could be evaluated by analyzing the flexibility of rotamers in large data sets. It is worth a note that implicit water models might provide more rotamer flexibility than explicit water models because the latter would provide more realistic water-based hydrogen bonding. Comparative RD analysis can give a new perspective to improvements on implicit water models that are more representative of explicit water in the future.
As mentioned earlier, the relationship between the backbone and the side chains shifts toward the energy minimum. Thus, it is possible to visualize the correlation between secondary structures and rotamers (Fig. 3). Similarly, protein-protein interactions can involve rotamer-rotamer relationships, which can either involve switching or fixing of movement (Fig. 3 E). RD analysis can show these changes in real time (Fig. 4 A).
Molecular docking is a computational method used to study both protein-protein interactions and protein-ligand interactions in which the interaction is often scored using scoring functions based on free energy estimates via molecular mechanics or other methods (21). Most docking software employs predefined parameters that can increase the accuracy of the prediction. These include existing water molecules, protonation states of some residues, and “explicit flexibility” of amino acid side chains in the binding site or binding interface. However, the aforementioned parameters are the roots of many challenges in producing a universally accurate and reliable solution via MD (22). We hope that by controlling more factors involved in the interaction and by defining the restraints involved in rotamers, this method can improve its accuracy to some extent. In fact, similar innovations have been implemented to integrate MD in molecular docking aside from the usual postdocking validation procedures (23). The protein’s intrinsic flexibility is a major drawback for docking; however, a combination of MD method and sampling of multiple receptor conformations was shown to match and possibly outperform crystal structures in retrospective virtual screening experiments (23). Multiple-receptor-conformation-based methods are often referred to as ensemble docking, which implements “implicit flexibility” in both side chains and backbone (24). The role of side-chain flexibility is highlighted in peptide-protein docking (25).
Another plausible application of RD can be found in mutational scans that are used for functional analysis and also for exploratory industrial development of recombinant proteins and enzymes. The specificity and binding affinity are determined by structural and physicochemical properties at the interaction interface or in a binding site, even with a small number of amino acid substitutions (26). Nevertheless, laboratory mutagenesis methods can be time-consuming and highly costly. RD analysis can give a different detailed map of the changes in the three-dimensional landscape surrounding the mutated residue to provide further insight into its desired functionality. The same can be said regarding post-translational modifications of amino acids and addition of sugars, lipids, etc. On the other hand, increasing protein stability (a.k.a. protein engineering) is a desirable goal for different life science purposes ranging from basic research to clinical and industrial applications (27). We hope that further analysis of rotamers in MD simulation will help identify the local and distant factors that contribute to residues with rotamers of highly restricted dihedral angles range. The development of highly rigid protein structures is important for improving thermostability, nondegradability, and pressure tolerance.
Rotamer analysis in MD simulations was previously shown to be useful in predicting side-chain packing, which is important for developing coarse-grained MD (a technique that differs from atomistic MD by grouping atoms or residues into grains/beads of various sizes, thus reducing computational load). In fact, predicting χ1 rotamer states alone increased the speed of MD calculations and thus reduced MD simulation time significantly (28).
The number of articles on the topic of rotamer or rotamers has rarely exceeded 70–90 per year, compared to more than 20,000 articles published on the topic of MD (webofscience.com). Clearly, the development of state-of-the-art RD tools and awareness among scientists of possible applications of rotamers in MD analysis are required to narrow this gap. We hope that by improving the tools for RD study (computationally and experimentally) and by disseminating knowledge of the issue, researchers will gain a better background in this field and will be able to extend their analysis of MD beyond the backbone and secondary structure into more detailed study of side-chain structure. One strategy is to study the convergence of rotamers over time as triggered by certain events. Another strategy is to gain functional information from fluctuations in side chains through comparative study, as shown in the example (e.g., ligand free versus ligand bound, native proteins versus protein interaction, or any comparative conditions).
As mentioned earlier, validation of RD analysis with NMR measurement of methyl side-chain order parameter values (S2-values) is the gold standard; however, it is not the cheapest, easiest, nor most feasible among researchers. Such difficulty was obviously not an issue for experimental validation of torsional angles of secondary structures. Indicators of secondary structure can be detected by circular dichroism spectroscopy (29), Fourier transform infrared spectroscopy (30), Raman spectroscopy (31), and other methods. However, some relevant information is still attainable regarding side chains of proteins but with more laborious work in interpretation (32). For some aromatic residues such as tyrosine and tryptophan, the fluorescence lifetime (i.e., decay) can accurately analyze rotamer distributions (33, 34). Recent studies via vibrational spectrometry—complemented with computational chemistry—reported detailed assignments of the side-chain torsional vibrations of dipeptides such as Ala-Gln (35), Gly-Val (36), Gly-Leu (37), Gly-Tyr (38), Met-Ser (39), and His-Phe (40). Torsional vibrations were mostly featured in the spectral range below 1000 cm−1. On the other hand, alternative computational methods for prediction of rotamers without MD are less common, yet one noticeable example is the dead-end elimination algorithm method (41). The dead-end elimination algorithm method—often used for protein three-dimensional structure prediction—relies on calculating and minimizing potential energy and then limiting side-chain conformations to discrete set of rotamers. As the name implies, rotamers that cannot be grouped in the global minimal energy conformation are eliminated. On a different level, it is possible to derive information related to RD from the profiles of root mean-square deviations and root mean-square fluctuations of atomic coordinates. 
Unlike RD, these mathematical methods require a defined reference, and although they can give quick indications on the fixed versus flexible residues, the study of RD gives a better chemical and geometric description of the system.
As with all MD studies, the possible bias in results that originates from using certain FFs is also a concern for RD analysis. FF bias has been a critical problem, particularly for the studies of protein folding, even when using different versions of the same FF. For instance, Shao and Zhu reported that some versions of AMBER FF can have a preference for certain secondary structures in contrast to others (42). This issue has been previously addressed in Biophysical Journal (43, 44). The accuracy of FFs can be further improved by analyzing both backbone and side-chain torsional angles (45). Hopefully, in protein folding studies, RD analysis can be useful for assessment of MD bias resulting from FF. The field of protein structural biology is on the verge of accurate sequence and crystal-structure prediction, as shown by fast advances in de novo and homology modeling techniques (46). However, we are yet far from approaching the sequence and dynamic-structure prediction model, which better describes proteins in physiological conditions. We believe that a dynamic-structure model(s) can be established for proteins based on probabilistic distributions of torsional angles alone, in which case RD will play a pivotal role. Such quantitative and descriptive models will better exploit the infinite landscape of protein folding. Moreover, in 2014, a group of researchers were able to expand the genetic alphabet to include unnatural nucleotide basepairs (47). It is hoped that in the future, such expansion can be reflected in codons and eventually the expression of as many as 172 different synthetic amino acids (48). The development and understanding of both natural and synthetic amino acid side chains will require more attention by researchers in the coming decades.
In conclusion, computational methods are of wide interest among researchers, and they are becoming more feasible, accessible, integrable, and accurate day by day. Here, we have examined the feasibility of RD analysis and offered a proof of concept using freely available tools. RD analysis is highly descriptive and is chemically and biophysically more relevant than a description by raw torsional angles, in the same way that secondary structure is more relevant than backbone torsional angles. We think the time has come for a benchmarked and adaptable approach to performing RD analysis. The development of fast, cheap, and reliable experimental methods that validate rotamers in solution will be the breakthrough for this field.
Acknowledgments
Computational resources were provided by the Czech Education and Scientific Network LM2015042 and the CERIT Scientific Cloud LM2015085, under the program “Projects of Large Research, Development, and Innovations Infrastructures.” The European Research Council under the European Union’s Horizon 2020 research and innovation program (grant agreement No. 759585) and the Czech Science Agency (project No. 18-10251S) are gratefully acknowledged.
Editor: Brian Salzberg.
Footnotes
Supporting Material can be found online at https://doi.org/10.1016/j.bpj.2019.04.017.
Author Contributions
Y.H. performed the computation and writing. V.A. reviewed the manuscript, and Z.H. was principal investigator and contributed to the scheme and organization of the work. All authors have given approval to the final version of the manuscript.
References
- 1. Lovell, S. C., J. M. Word, …, D. C. Richardson. 2000. The penultimate rotamer library. Proteins. 40:389-408.
- 2. Dunbrack, R. L., Jr. 2002. Rotamer libraries in the 21st century. Curr. Opin. Struct. Biol. 12:431-440. doi: 10.1016/s0959-440x(02)00344-5.
- 3. Dunbrack, R. L., Jr., and M. Karplus. 1993. Backbone-dependent rotamer library for proteins. Application to side-chain prediction. J. Mol. Biol. 230:543-574. doi: 10.1006/jmbi.1993.1170.
- 4. Bates, P. A., and M. J. Sternberg. 1999. Model building by comparison at CASP3: using expert knowledge and computer automation. Proteins. 37(Suppl 3):47-54. doi: 10.1002/(sici)1097-0134(1999)37:3+<47::aid-prot7>3.3.co;2-6.
- 5. Scouras, A. D., and V. Daggett. 2011. The Dynameomics rotamer library: amino acid side chain conformations and dynamics from comprehensive molecular dynamics simulations in water. Protein Sci. 20:341-352. doi: 10.1002/pro.565.
- 6. Carbonell, P., and A. del Sol. 2009. Methyl side-chain dynamics prediction based on protein structure. Bioinformatics. 25:2552-2558. doi: 10.1093/bioinformatics/btp463.
- 7. Engh, R. A., L. X.-Q. Chen, and G. R. Fleming. 1986. Conformational dynamics of tryptophan: a proposal for the origin of the non-exponential fluorescence decay. Chem. Phys. Lett. 126:365-372.
- 8. Das, S., S. Das, …, N. C. Maiti. 2016. Orientation of tyrosine side chain in neurotoxic Aβ differs in two different secondary structures of the peptide. R. Soc. Open Sci. 3:160112. doi: 10.1098/rsos.160112.
- 9. Altis, A., P. H. Nguyen, …, G. Stock. 2007. Dihedral angle principal component analysis of molecular dynamics simulations. J. Chem. Phys. 126:244111. doi: 10.1063/1.2746330.
- 10. Watanabe, H., M. Elstner, and T. Steinbrecher. 2013. Rotamer decomposition and protein dynamics: efficiently analyzing dihedral populations from molecular dynamics. J. Comput. Chem. 34:198-205. doi: 10.1002/jcc.23119.
- 11. Schleif, R. 2013. A Concise Guide to Charmm and the Analysis of Protein Structure and Function. http://pages.jh.edu/~rschlei1/Random_stuff/publications/charmmbook.pdf.
- 12. Abraham, M. J., D. v. d. Spoel, …, B. Hess; The GROMACS Development Team. 2018. GROMACS User Manual version 2018. www.gromacs.org.
- 13. Sandia National Laboratories. 2018. LAMMPS User’s Manual: compute Dihedral/Local Command. https://lammps.sandia.gov/doc/compute_dihedral_local.html.
- 14. Gordon, J. C., J. B. Myers, …, A. Onufriev. 2005. H++: a server for estimating pKas and adding missing hydrogens to macromolecules. Nucleic Acids Res. 33:W368-W371. doi: 10.1093/nar/gki464.
- 15. Nguyen, H., D. R. Roe, and C. Simmerling. 2013. Improved generalized born solvent model parameters for protein simulations. J. Chem. Theory Comput. 9:2020-2034. doi: 10.1021/ct3010485.
- 16. Travaglia, A., A. Pietropaolo, …, E. Rizzarelli. 2015. A small linear peptide encompassing the NGF N-terminus partly mimics the biological activities of the entire neurotrophin in PC12 cells. ACS Chem. Neurosci. 6:1379-1392. doi: 10.1021/acschemneuro.5b00069.
- 17. Haddad, Y., V. Adam, and Z. Heger. 2017. Trk receptors and neurotrophin cross-interactions: new perspectives toward manipulating therapeutic side-effects. Front. Mol. Neurosci. 10:130. doi: 10.3389/fnmol.2017.00130.
- 18. Chavent, M., V. Kuentz-Simonet, …, J. Saracco. 2014. Multivariate analysis of mixed data: the PCAmixdata R package. arXiv, arXiv:1411.4911, http://arxiv.org/abs/1411.4911.
- 19. Ferreira, L. G., R. N. Dos Santos, …, A. D. Andricopulo. 2015. Molecular docking and structure-based drug design strategies. Molecules. 20:13384-13421. doi: 10.3390/molecules200713384.
- 20. Hurley, J. H. 1994. The Role of Interior Side-Chain Packing in Protein Folding and Stability. The Protein Folding Problem and Tertiary Structure Prediction, K. Merz and S. LeGrand, eds. (Birkhauser), pp. 549-578.
- 21. Liu, J., and R. Wang. 2015. Classification of current scoring functions. J. Chem. Inf. Model. 55:475-482. doi: 10.1021/ci500731a.
- 22. Cheng, T., Q. Li, …, S. H. Bryant. 2012. Structure-based virtual screening for drug discovery: a problem-centric review. AAPS J. 14:133-141. doi: 10.1208/s12248-012-9322-0.
- 23. De Vivo, M., M. Masetti, …, A. Cavalli. 2016. Role of molecular dynamics and related methods in drug discovery. J. Med. Chem. 59:4035-4061. doi: 10.1021/acs.jmedchem.5b01684.
- 24. Bonvin, A. M. 2006. Flexible protein-protein docking. Curr. Opin. Struct. Biol. 16:194-200. doi: 10.1016/j.sbi.2006.02.002.
- 25. Dagliyan, O., E. A. Proctor, …, N. V. Dokholyan. 2011. Structural and dynamic determinants of protein-peptide recognition. Structure. 19:1837-1845. doi: 10.1016/j.str.2011.09.014.
- 26. Allam, A., L. Maigre, …, I. Artaud. 2014. New peptides with metal binding abilities and their use as drug carriers. Bioconjug. Chem. 25:1811-1819. doi: 10.1021/bc500317u.
- 27. Buß, O., J. Rudat, and K. Ochsenreither. 2018. FoldX as protein engineering tool: better than random based approaches? Comput. Struct. Biotechnol. J. 16:25-33. doi: 10.1016/j.csbj.2018.01.002.
- 28. Jumper, J. M., K. F. Freed, and T. R. Sosnick. 2016. Rapid calculation of side chain packing and free energy with applications to protein molecular dynamics. arXiv, arXiv:1610.07277, http://arxiv.org/abs/1610.07277?context=physics. doi: 10.1371/journal.pcbi.1006342.
- 29. Greenfield, N. J. 2006. Using circular dichroism spectra to estimate protein secondary structure. Nat. Protoc. 1:2876-2890. doi: 10.1038/nprot.2006.202.
- 30. Yang, H., S. Yang, …, S. Yu. 2015. Obtaining information about protein secondary structures in aqueous solution using Fourier transform IR spectroscopy. Nat. Protoc. 10:382-396. doi: 10.1038/nprot.2015.024.
- 31. Rygula, A., K. Majzner, …, M. Baranska. 2013. Raman spectroscopy of proteins: a review. J. Raman Spectrosc. 44:1061-1076.
- 32. Barth, A. 2007. Infrared spectroscopy of proteins. Biochim. Biophys. Acta. 1767:1073-1101. doi: 10.1016/j.bbabio.2007.06.004.
- 33. Clayton, A. H., and W. H. Sawyer. 1999. Tryptophan rotamer distributions in amphipathic peptides at a lipid surface. Biophys. J. 76:3235-3242. doi: 10.1016/S0006-3495(99)77475-8.
- 34. Saraiva, M. A., C. D. Jorge, …, A. L. Maçanita. 2016. Earliest events in α-synuclein fibrillation probed with the fluorescence of intrinsic tyrosines. J. Photochem. Photobiol. B. 154:16-23. doi: 10.1016/j.jphotobiol.2015.11.006.
- 35. Kecel, S., A. E. Ozel, …, S. Celik. 2010. Conformational analysis and vibrational spectroscopic investigation of L-alanyl-L-glutamine dipeptide. J. Spectrosc. 24:219-232.
- 36. Celik, S., A. E. Ozel, …, G. Agaeva. 2011. Conformational preferences, experimental and theoretical vibrational spectra of cyclo (Gly-Val) dipeptide. J. Mol. Struct. 993:341-348.
- 37. Celik, S., A. E. Ozel, and S. Akyuz. 2016. Comparative study of antitumor active cyclo (Gly-Leu) dipeptide: a computational and molecular modeling study. Vib. Spectrosc. 83:57-69.
- 38. Celik, S., S. Akyuz, and A. E. Ozel. 2017. Vibrational spectroscopic and structural investigations of bioactive molecule Glycyl-Tyrosine (Gly-Tyr). Vib. Spectrosc. 92:287-297.
- 39. Kecel-Gunduz, S., B. Bicak, …, A. E. Ozel. 2017. Structural and spectroscopic investigation on antioxidant dipeptide, L-Methionyl-L-Serine: a combined experimental and DFT study. J. Mol. Struct. 1137:756-770.
- 40. Celik, S., A. E. Ozel, …, S. Akyuz. 2012. Structural and IR and Raman spectral analysis of cyclo (His-Phe) dipeptide. Vib. Spectrosc. 61:54-65.
- 41. Maglia, G., A. Jonckheer, …, Y. Engelborghs. 2008. An unusual red-edge excitation and time-dependent Stokes shift in the single tryptophan mutant protein DD-carboxypeptidase from Streptomyces: the role of dynamics and tryptophan rotamers. Protein Sci. 17:352-361. doi: 10.1110/ps.073147608.
- 42. Shao, Q., and W. Zhu. 2018. Assessing AMBER force fields for protein folding in an implicit solvent. Phys. Chem. Chem. Phys. 20:7206-7216. doi: 10.1039/c7cp08010g.
- 43. Mittal, J., and R. B. Best. 2010. Tackling force-field bias in protein folding simulations: folding of Villin HP35 and Pin WW domains in explicit water. Biophys. J. 99:L26-L28. doi: 10.1016/j.bpj.2010.05.005.
- 44. Freddolino, P. L., S. Park, …, K. Schulten. 2009. Force field bias in protein folding simulations. Biophys. J. 96:3772-3780. doi: 10.1016/j.bpj.2009.02.033.
- 45. Best, R. B., X. Zhu, …, A. D. Mackerell, Jr. 2012. Optimization of the additive CHARMM all-atom protein force field targeting improved sampling of the backbone φ, ψ and side-chain χ(1) and χ(2) dihedral angles. J. Chem. Theory Comput. 8:3257-3273. doi: 10.1021/ct300400x.
- 46. Krieger, E., S. B. Nabuurs, and G. Vriend. 2003. Homology modeling. Methods Biochem. Anal. 44:509-523. doi: 10.1002/0471721204.ch25.
- 47. Malyshev, D. A., K. Dhami, …, F. E. Romesberg. 2014. A semi-synthetic organism with an expanded genetic alphabet. Nature. 509:385-388. doi: 10.1038/nature13314.
- 48. Service, R. F. 2014. Synthetic biology. Designer microbes expand life’s genetic alphabet. Science. 344:571. doi: 10.1126/science.344.6184.571.