Abstract
Coarse-grained force fields for protein simulations are usually designed and parameterized to treat proteins composed of natural L-amino-acid residues. However, D-amino-acid residues occur in bacterial and fungal peptides (e.g., gramicidins), as well as in human-designed proteins. For this reason, we have extended the UNRES coarse-grained force field developed in our laboratory to treat systems with D-amino-acid residues. We developed the respective virtual-bond-torsional and double-torsional potentials for rotation about the Cα · · · Cα virtual-bond axis and two consecutive Cα · · · Cα virtual-bond axes, respectively, as functions of virtual-bond-dihedral angles γ. In turn, these were calculated as potentials of mean force (PMFs) from the diabatic energy surfaces of terminally-blocked model compounds for glycine, alanine, and proline. The potential-energy surfaces were calculated by using the ab initio method of molecular quantum mechanics at the Møller-Plesset (MP2) level of theory and the 6-31G(d,p) basis set, with the angles λ(1) and λ(2) for rotation of the peptide groups about the Cα · · · Cα virtual bonds used as variables, and the energy was minimized with respect to the remaining degrees of freedom. The PMFs were calculated by numerical integration for all pairs and triplets with all possible combinations of types (glycine, alanine, and proline) and chirality (D or L); however, symmetry relations reduce the number of non-equivalent torsional potentials to 13 and the number of double-torsional potentials to 63 for a given C-terminal blocking group. Subsequently, one-dimensional (for torsional potentials) and two-dimensional (for double-torsional potentials) Fourier series were fitted to the PMFs to obtain analytical expressions.
It was found that the torsional potentials of the x-Y and X-y types, where X and Y are Ala or Pro and a lowercase letter denotes D-chirality, have global minima for small absolute values of γ, accounting for the double-helical structure of gramicidin A, which is a dimer of two chains, each possessing an alternating D-Tyr-L-Tyr sequence, and similar peptides. The side-chain and correlation potentials for D-amino-acid residues were obtained by reflecting the respective potentials for the L-amino-acid residues in the plane defined by three consecutive Cα atoms.
1 Introduction
Less than 1% of all known protein and peptide structures contain D-amino acids.1 Nevertheless, D-amino acids are abundant in certain natural proteins and peptides,2 such as cone snail venoms,3 the murein of bacteria,4 and fungus-produced antibiotics,5,6 and it is estimated that the D-amino acid market will reach $3.7 billion by the year 2017.7 Therefore, molecular simulations must be able to treat polypeptides and proteins containing D-amino-acid residues. All-atom force fields, except those that contain sine torsional terms, handle the D-configuration without difficulty; coarse-grained force fields, however, require additional parameterization for this purpose. Because coarse-grained force fields offer an enormous advantage in reducing the cost and lengthening the time step of simulations,8,9 it is worth investing this effort in extending their range of application. However, most of the known coarse-grained force fields developed for proteins can treat only the natural L-amino acids.
In the last several years, we have been developing the coarse-grained UNited RESidue (UNRES) force field for simulation of protein structure and dynamics.10–22 Unlike most of the other force fields, UNRES has been derived13,23 and parameterized13–15,17,19–22,24 as a potential of mean force of polypeptide chains immersed in water. UNRES proved to be a powerful tool in predicting protein structure, as assessed in the Critical Assessment of Techniques for Protein Structure Prediction (CASP) experiments.25–29 In CASP4, we predicted, for the first time, the complete structure of a 70-residue protein, bacteriocin S (target T0102); with the introduction of correlation terms, we also were able to treat proteins with β-structure.27 During the CASP5 exercise in 2004, we predicted nearly the complete structure (203 out of 235 residues) of a six-α-helix bundle protein (target T0198), complete or nearly complete structures of three other smaller α- and α + β-proteins, and significant portions of structures of other proteins.28 UNRES has also been extended to run coarse-grained molecular dynamics8,9,30,31 and to study oligomeric proteins.32,33 Recently, trans-cis isomerization of peptide groups has been introduced into UNRES.22 UNRES has also been applied to study biological problems such as amyloid formation,34,35 signaling,36 and chaperone dynamics.37
In this work, we extend the UNRES force field to treat systems with D-amino-acid residues. For this purpose, we revise the local potentials that depend on amino-acid chirality, namely the virtual-bond torsional and double torsional potentials, the side-chain local potentials, and the correlation potentials. Of those, introduction of the torsional, double-torsional, and virtual-bond-angle potentials (which also depend on the adjacent virtual-bond-dihedral angles) requires reparameterization of the force field, while the other potentials for D-amino-acid residues are obtained in a trivial manner by applying symmetry operations to the respective potentials for the natural L-amino acid residues, as shown in section 2.2. In this work, we have determined the torsional and double-torsional potentials for D-amino-acid residues. The virtual-bond-angle potentials for D-amino-acid residues will be introduced later.
2 Methods
2.1 UNRES representation of polypeptide chain
In the UNRES model,10–13,15–23 a polypeptide chain is represented as a sequence of α-carbon (Cα) atoms with attached united side chains (SC’s) and united peptide groups (p’s) positioned halfway between two consecutive Cα’s. Only the united side chains and united peptide groups act as interaction sites, while the Cα atoms assist only in the definition of geometry (Figure 1). The effective energy function is defined as the restricted free energy (RFE) or the potential of mean force (PMF) of the chain constrained to a given coarse-grained conformation along with the surrounding solvent.13,18,23 This effective energy function is expressed by eq 1.
(1) |
where the U’s are energy terms, θi is the backbone virtual-bond angle, γi is the backbone virtual-bond-dihedral angle, αi and βi are the angles defining the location of the center of the united side chain of residue i (Figure 1), and di is the length of the ith virtual bond, which is either a Cα · · · Cα virtual bond or Cα · · · SC virtual bond. Each energy term is multiplied by an appropriate weight, wx, and the terms corresponding to factors of order higher than 1 are additionally multiplied by the respective temperature factors which were introduced in our recent work16 and which reflect the dependence of the first generalized-cumulant term in those factors on temperature, as discussed in refs 16 and 38. The factors fn are defined by eq 2.
(2) |
where T° = 300 K.
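The temperature factors can be illustrated with a short sketch. Because eq 2 itself is not reproduced in this text, the functional form below is an assumption taken from the published UNRES temperature-scaling scheme (ref 16), namely f_n(T) = ln[exp(1) + exp(−1)] / ln{exp[(T/T°)^(n−1)] + exp[−(T/T°)^(n−1)]}; the function name `temperature_factor` is hypothetical.

```python
import math

T0 = 300.0  # reference temperature T° in kelvin, as stated in the text


def temperature_factor(n: int, temperature: float) -> float:
    """Assumed form of f_n(T) (eq 2 is not shown in this text).

    f_n(T) = ln(e + 1/e) / ln(exp(x) + exp(-x)),  x = (T / T0) ** (n - 1)
    """
    x = (temperature / T0) ** (n - 1)
    numerator = math.log(math.exp(1.0) + math.exp(-1.0))
    denominator = math.log(math.exp(x) + math.exp(-x))
    return numerator / denominator


# Factors of order 1 are temperature independent; higher orders are
# scaled down above T0 and scaled up below it.
for order in (1, 2, 3):
    print(order, temperature_factor(order, 250.0), temperature_factor(order, 350.0))
```

Under this assumed form, every factor equals 1 at T = T°, so the energy-term weights keep their calibrated meaning at the reference temperature.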
The term USCiSCj represents the mean free energy of the hydrophobic (hydrophilic) interactions between the side chains, which implicitly contains the contributions from the interactions of the side chain with the solvent. The term USCipj denotes the excluded-volume potential of the side-chain – peptide-group interactions. The peptide-group interaction potential is split into two parts: the Lennard-Jones interaction energy between peptide-group centers (U^VDW_pipj) and the average electrostatic energy between peptide-group dipoles (U^el_pipj); the second of these terms accounts for the tendency to form backbone hydrogen bonds between peptide groups pi and pj. The terms Utor, Utord, Ub, Urot, and Ubond are the virtual-bond-dihedral angle torsional terms, virtual-bond dihedral angle double-torsional terms, virtual-bond angle bending terms, side-chain rotamer terms, and virtual-bond-deformation terms, respectively; these terms account for the local properties of the polypeptide chain. The terms U^(m)_corr represent correlation or multibody contributions from the coupling between backbone-local and backbone-electrostatic interactions, and the terms U^(m)_turn are correlation contributions involving m consecutive peptide groups; they are, therefore, termed turn contributions. The multibody terms are indispensable for reproduction of regular α-helical and β-sheet structures.13,23,39
The energy-term weights are determined by force-field calibration to reproduce the structure and folding thermodynamics of selected training proteins.16,40 The training proteins were the LysM domain from E. coli (an α + β protein; PDB code: 1E0G),41 the FBP28 WW domain from Mus musculus (a β protein; PDB code: 1E0L),42 the albumin-binding GA module (an α protein; PDB code: 1GAB),43 and the IgG-binding domain from streptococcal protein G (an α + β protein; PDB code: 1IGD),44 as well as two small peptides, namely the tryptophan cage (PDB code: 1L2Y)45 and the tryptophan zipper 2 (PDB code: 1LE1).46
The present UNRES is parameterized for L-amino acid residues. To extend it to D-amino-acid residues, the terms that depend on chirality must be changed. These are the torsional (Utor), double-torsional (Utord), side-chain-rotamer (Urot), correlation (Ucorr) and virtual-bond-angle (Ub) potentials. In the subsequent sections we will describe the modification of these potentials for D-amino acid residues. Because the torsional and double-torsional potentials require the greatest effort, we will devote most of the description to these potentials.
2.2 Effect of amino-acid chirality on the side-chain-rotamer and correlation potentials
As mentioned in section 2.1, amino-acid chirality affects the torsional, double-torsional, side-chain rotamer, virtual-bond-bending, and correlation potentials. We begin the discussion with the side-chain rotamer potentials, Urot, because their dependence on chirality is straightforward. The latest potentials derived in ref. 19 have the form of polynomials in the local coordinates of a side chain, in which, for the ith residue, the x axis is the bisector of the Cα(i−1) · · · Cα(i) · · · Cα(i+1) virtual-bond angle, the y axis is perpendicular to the x axis, lies in the plane of the three Cα atoms, and runs in the direction from Cα(i−1) to Cα(i+1), and the z axis forms a right-handed orthogonal coordinate system with the x and y axes (see Figure 1 in ref. 19 for the illustration of the definition of the local-coordinate system). The change of chirality is achieved simply by reflecting the side-chain center in the xy plane; consequently, the sign of all coefficients that pertain to the local z coordinate in eq 3 of ref 19 must be changed. Similarly, the D-chirality can be introduced in the older (statistical) side-chain-rotamer potentials12 by changing the sign of the coefficients in eq 8 of ref 12 corresponding to odd powers of the polar angle β, which changes sign when the chirality is changed.
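The reflection described above can be sketched for a generic polynomial in the local coordinates. The actual form of eq 3 of ref 19 is not reproduced here, so the dictionary layout, the coefficient values, and the function names are illustrative assumptions; the sketch only shows that replacing z by −z amounts to negating the coefficients of terms with an odd power of z.

```python
def reflect_in_xy_plane(coeffs):
    """Return coefficients of the mirror-image (D) potential.

    coeffs maps exponent tuples (px, py, pz) to polynomial coefficients.
    Reflection z -> -z flips the sign of every term with an odd power of z.
    """
    return {powers: (-c if powers[2] % 2 else c) for powers, c in coeffs.items()}


def evaluate(coeffs, x, y, z):
    """Evaluate the polynomial sum of c * x**px * y**py * z**pz."""
    return sum(c * x**px * y**py * z**pz for (px, py, pz), c in coeffs.items())


# Hypothetical coefficients for an L-residue side-chain potential.
l_coeffs = {(1, 0, 0): 0.5, (0, 0, 1): -1.2, (0, 1, 1): 0.3, (0, 0, 2): 0.8}
d_coeffs = reflect_in_xy_plane(l_coeffs)

# The D potential at (x, y, z) equals the L potential at (x, y, -z).
print(evaluate(d_coeffs, 0.4, -0.7, 1.1), evaluate(l_coeffs, 0.4, -0.7, -1.1))
```

Storing the exponents explicitly keeps the sign rule independent of the polynomial order, so the same helper would apply to any truncation of the expansion.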
Changing chirality is also straightforward for the correlation potentials, Ucorr. These potentials contain the coefficients of the second-order Fourier expansion of the local-interaction-energy surfaces, eX, where X denotes a basic type of amino-acid residue (glycine, alanine, and proline, where alanine represents all non-glycine and non-proline residues) in terms of the angles λ(1) and λ(2) for rotation of the peptide groups about the Cα · · · Cα virtual bonds of terminally-blocked amino-acid residues,13 as given by eq 3.
(3) |
with
(4) |
where b1X and b2X are the 2-dimensional vectors of first-order Fourier coefficients and CX, DX, and EX are the 2 × 2 matrices of second-order Fourier coefficients of the second-order expansion of the energy surface in the angles λ(1) and λ(2) for residue X. Additionally, CX and DX are symmetric traceless matrices; therefore, cX,22 = −cX,11 and dX,22 = −dX,11.
The energy surface of a D-amino-acid-residue, ex, where the lowercase letter denotes the D-chirality; x = AlaD or ProD (because glycine does not have a side chain and, consequently, is not chiral) is obtained from that of the corresponding L-amino-acid residue by applying the inversion operation to the angular variables, as given by eq 5.
(5) |
with
(6) |
where U is the matrix for reflection in the plane.
(7) |
Thus, introducing the D-chirality in the correlation terms is achieved by changing the signs of the coefficients of the second-order expansion of eX at the terms with odd sine powers.
2.3 Determination of torsional and double-torsional potentials
As opposed to Urot, the virtual-bond-torsional, Utor, and double-torsional potentials, Utord that contain residues with mixed chirality, must be redetermined. The procedure developed in our earlier work14 was used, in which the respective potentials of mean force are calculated first from the energy maps of terminally-blocked amino-acid residues and then a one- (for Utor) or two-dimensional (for Utord) Fourier series is fitted to them. The harmonic-entropy contribution is also considered. The potentials of mean force corresponding to Utor and Utord are thus defined by eqs 8 and 9, respectively. The terminally-blocked di- and tripeptide systems used for the derivation of the torsional and double-torsional potentials are illustrated in Figure 2A and B, respectively.
(8) |
where eX and eY are the energy surfaces for the terminally-blocked residues of type X and Y respectively, H* is the energy Hessian computed over all variables except for the angles λi and λj (the terms with det H* account for the harmonic-entropy contribution to the PMF), β = 1/RT where R is the universal gas constant and T is the absolute temperature. For Y = Pro, the C-terminal blocking group of the X residue is pyrrolidine; otherwise it is the methyl group.
(9) |
It should be noted that, because the single-torsional potentials UXY and UYZ are subtracted from the RFE of the terminally-blocked tripeptide, UXYZ contains only that part of the free energy of the rotation about the Cα · · · Cα virtual-bond axes that cannot be accounted for by the single-torsional contributions.
The presence of γ1 − π − λ2 in eqs 8 and 9 and γ2 − π − λ3 in eq 9 arises from the fact that λ2 is shared between residues X and Y in the dipeptide and in the tripeptide systems, and λ3 is shared between residues Y and Z in the tripeptide system. The diabatic energy surface, eX, of a terminally-blocked amino-acid residue X can be expressed as a function of two local angles λ(1) and λ(2) for rotation about the consecutive virtual Cα · · · Cα bonds, defined by Nishikawa et al.47 and also shown in Figure 3. The local angles of consecutive residues are, in turn, related to the chain-wide angles λ1, λ2, …, λn shown in Figure 2 by eq 10.
(10) |
The integrals in eqs 8 and 9 were calculated by numerical quadrature by summing the values of the integrand over the nodes of a multidimensional grid in the angles λ. The grid size was the same as that with which the corresponding potential-energy surfaces were obtained. The presence of a proline residue reduced the dimensions of a grid by one, because the λ(1) angle of the proline residue is determined by the requirement to obey the constraints accruing from the presence of the pyrrolidine ring. The presence of proline residues also made it necessary to interpolate linearly between the closest points of the original grid to estimate the energy and the logarithm of the determinant of Hessian values for residues preceding proline; otherwise, all energies were taken directly from the ab initio values computed at grid points. This was caused by the fact that λ(1) of a proline residue is a function of λ(2).48 Thus, even if λ(2) is on the grid, λ(1) need not be, if the residue is proline. The angle λ(1) is, in turn, related by eq 10 to λ(2) of the preceding residue; therefore, the last angle also, generally, takes values outside the grid points. For non-proline residues, λ(1) is unrestricted. Consequently, the virtual-bond dihedral angles γ are from a grid of the same spacing as the angles λ and, by eq 10, all points taken to evaluate the integrals that occur in eq 8 and eq 9 are on the grid.
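The grid summation described above can be sketched as follows. Since eq 8 is not reproduced in this text, the sketch rests on several assumptions that should be checked against the original equation: the harmonic-entropy (Hessian) term is omitted, the shared angle enters residue X as its second local angle with the value γ − 180° − λ2, the temperature is set to 298 K, and the energy maps are synthetic. All function names are hypothetical.

```python
import math

STEP = 15                              # grid spacing in degrees
ANGLES = list(range(-180, 180, STEP))  # 24 periodic grid nodes
RT = 0.0019872 * 298.0                 # kcal/mol; the temperature is an assumption


def idx(angle):
    """Map an angle in degrees (any multiple of STEP) to its periodic grid index."""
    return ((angle + 180) % 360) // STEP


def torsional_pmf(e_x, e_y, gamma):
    """Boltzmann-average two residue energy maps over lambda1, lambda2, lambda3.

    e_x[i][j] and e_y[i][j] hold energies (kcal/mol) on the (lambda(1), lambda(2))
    grid. Assumed coupling: e_X(lambda1, gamma - 180 - lambda2) + e_Y(lambda2, lambda3).
    """
    # The sums over lambda1 and lambda3 factorize, so they are done first.
    z_x = [sum(math.exp(-e_x[i][j] / RT) for i in range(len(ANGLES)))
           for j in range(len(ANGLES))]
    z_y = [sum(math.exp(-e_y[i][j] / RT) for j in range(len(ANGLES)))
           for i in range(len(ANGLES))]
    total = sum(z_x[idx(gamma - 180 - lam2)] * z_y[idx(lam2)] for lam2 in ANGLES)
    return -RT * math.log(total / len(ANGLES) ** 3)


def synthetic_map(depth):
    """Hypothetical smooth energy map used only to exercise the code."""
    return [[depth * (1.0 - math.cos(math.radians(la + 60)))
             + 0.5 * (1.0 - math.cos(math.radians(lb - 90)))
             for lb in ANGLES] for la in ANGLES]


e_x = synthetic_map(2.0)
e_y = synthetic_map(1.0)
pmf = [torsional_pmf(e_x, e_y, gamma) for gamma in ANGLES]
print(min(zip(pmf, ANGLES)))  # lowest PMF value and the gamma at which it occurs
```

Because γ and λ2 are both multiples of the grid spacing, the shared angle always falls on a grid node, which mirrors the statement above that no interpolation is needed for non-proline residues.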
In the UNRES force field, the torsional and double-torsional potentials are expressed as one- and two-dimensional Fourier series, respectively (eqs 11 and 12).
$$U_{XY}(\gamma) = a_0 + \sum_{i=1}^{n_1} \left[ a_i \cos(i\gamma) + b_i \sin(i\gamma) \right] \tag{11}$$
(12) |
where ai, bi, …, hij are coefficients and n1 and n2 are the orders of the expansions; we used n1 = 6 and n2 = 8. The coefficients in eqs 11 and 12 were calculated by fitting these expressions to the UXY and UXYZ surfaces (obtained by numerical integration), respectively. As in previous work,14 the target function F consisted of the sum of the squares of the errors over the grid points in γ1 and γ2 and an entropy-like or smoothing term preventing too large a variation of the fitted functional expression between the points of the grid (eq 13).
(13) |
where denotes the value of a torsional or a double-torsional potential at grid point i, obtained by numerical integration of the energy surfaces of terminally-blocked amino-acid residues (eq 8 or 9), Ui(X) denotes the value of U calculated by using the Fourier series (eqs 11 or 12), ΔU (γ) is the difference between the U expressed by eqs 11 or 12 and the value at that point obtained by linear interpolation between the grid points, and α is the weight of the Shannon-entropy term; we used α = 0.2. The value of α was set after several trial calculations to provide both smooth potentials and good fit to numerically calculated PMF values. The target function of eq 13 was minimized with respect to the coefficients a0, a1 …, b6 (for single-torsional potentials; eq 11) or c0, c1, …, h8,8 (for double-torsional potentials; eq 12) by using the Secant Unconstrained Minimization SoLver (SUMSL) procedure.49 The Shannon-entropy term (the last term in eq 13) prevents the fitted functions from oscillating between grid points, and therefore, enables one to obtain quite smooth functions despite using a large number of Fourier terms. It takes a zero value when the function is linear between the grid points and increases considerably if the function oscillates between the grid points.
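The least-squares part of the fit can be illustrated with a simplified sketch. It assumes the one-dimensional series has the form a0 + Σ [ai cos(iγ) + bi sin(iγ)], omits the entropy-like smoothing term of eq 13, and does not use SUMSL; on a uniform periodic grid the sines and cosines are discretely orthogonal, so the unregularized least-squares coefficients follow from simple projections. The function names and the test curve are hypothetical.

```python
import math


def fit_fourier(values, angles_deg, order=6):
    """Unregularized least-squares Fourier coefficients on a uniform periodic grid.

    Valid when the grid is uniform over a full period and order < len(values) / 2.
    Returns (a, b) with a[0] the constant term; b[0] is an unused placeholder.
    """
    n = len(values)
    a = [sum(values) / n]
    b = [0.0]
    for i in range(1, order + 1):
        a.append(2.0 / n * sum(u * math.cos(i * math.radians(g))
                               for u, g in zip(values, angles_deg)))
        b.append(2.0 / n * sum(u * math.sin(i * math.radians(g))
                               for u, g in zip(values, angles_deg)))
    return a, b


def eval_fourier(a, b, gamma_deg):
    """Evaluate a0 + sum_i [a_i cos(i*gamma) + b_i sin(i*gamma)]."""
    g = math.radians(gamma_deg)
    return a[0] + sum(a[i] * math.cos(i * g) + b[i] * math.sin(i * g)
                      for i in range(1, len(a)))


# Hypothetical PMF values on the 15-degree grid used for the energy maps.
grid = list(range(-180, 180, 15))
pmf = [1.0 + 0.5 * math.cos(math.radians(g)) - 0.3 * math.sin(math.radians(2 * g))
       for g in grid]
a, b = fit_fourier(pmf, grid)
print(a[0], a[1], b[2])
```

With 24 grid points and 13 coefficients the fit is overdetermined, which is the situation in which a smoothing term such as the one in eq 13 helps to suppress oscillations between grid points.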
It can easily be demonstrated that the symmetry relations expressed by eqs 14 and 15 hold; consequently, there are only 2 × 13 independent torsional potentials and 2 × 63 independent double-torsional potentials. The two sets of torsional and double-torsional potentials appear because either a regular or a proline residue can follow a di- or tripeptide fragment in a polypeptide chain; if the following residue is proline, the model system used to derive the respective potential is blocked by the N-pyrrolidine group.
$$U_{XY}(\gamma) = U_{\bar{X}\bar{Y}}(-\gamma) \tag{14}$$
$$U_{XYZ}(\gamma_1, \gamma_2) = U_{\bar{X}\bar{Y}\bar{Z}}(-\gamma_1, -\gamma_2) \tag{15}$$
where X, Y, and Z denote amino-acid-residue types of a given chirality (L or D), and a bar over a symbol denotes the change of enantiomer chirality; this operation does not affect glycine.
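The counts of 13 and 63 independent potentials can be checked by enumeration. The sketch below assumes only what the text states: five residue kinds (Gly, L-Ala, D-Ala, L-Pro, D-Pro), with a sequence and its all-residue mirror image sharing one potential and glycine being its own mirror image. The one-letter labels and the function name are illustrative.

```python
from itertools import product

# Uppercase = L, lowercase = D; glycine is achiral.
RESIDUES = ["G", "A", "a", "P", "p"]
MIRROR = {"G": "G", "A": "a", "a": "A", "P": "p", "p": "P"}


def count_independent(length):
    """Count sequences of the given length up to simultaneous chirality inversion."""
    classes = set()
    for seq in product(RESIDUES, repeat=length):
        mirrored = tuple(MIRROR[r] for r in seq)
        classes.add(min(seq, mirrored))  # one representative per mirror pair
    return len(classes)


print(count_independent(2))  # 13 torsional potentials per blocking group
print(count_independent(3))  # 63 double-torsional potentials per blocking group
```

Only the all-glycine sequence maps onto itself, so the counts are (25 + 1)/2 = 13 and (125 + 1)/2 = 63, in agreement with the numbers quoted above.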
2.4 Calculations of the energy surfaces for terminally-blocked amino-acid residues
In our previous work,14 we computed the energy maps of terminally-blocked amino-acid residues by using ab initio molecular quantum mechanics with the 6-31G(d,p) basis set. Because less powerful resources were available at that time, the energy was minimized in the Restricted Hartree-Fock (RHF) scheme and, subsequently, the single-point MP2 energy correction was computed for the energy-minimized conformation. Moreover, interfacing a residue to the following proline residue was modeled by blocking it with an N′,N′-dimethyl group. In this work, all energy maps were recalculated by minimizing the energy in the MP2/6-31G(d,p) scheme; additionally, as mentioned in section 2.3, we used the C-terminal pyrrolidine group and not the N,N-dimethylamine blocking group for residues preceding proline. Consequently, all diabatic energy maps of trans-N-acetyl-alanyl-N′-methyl amide, trans-N-acetyl-alanyl-N′-pyrrolidylamide, trans-N-acetyl-glycyl-N′-methyl amide, trans-N-acetyl-glycyl-N′-pyrrolidylamide, trans-N-acetyl-prolyl-N′-methyl amide, and trans-N-acetyl-prolyl-N′-pyrrolidylamide were computed. The angles λ(1) and λ(2) were used as grid variables; both varied from −180° to 180° with a 15° step. The energy was minimized with respect to all other degrees of freedom. Energy Hessians (second-derivative matrices or force-constant matrices) were calculated at every grid point to determine the harmonic-entropy contribution to the PMF (eqs 8 and 9).
3 Results and discussion
3.1 Energy maps of terminally-blocked amino-acid residues
The potential-energy surfaces of trans-N-acetyl-L-alanyl-N′-methyl amide, trans-N-acetyl-L-alanyl-N′-pyrrolidylamide, trans-N-acetyl-glycyl-N′-methyl amide, trans-N-acetyl-glycyl-N′-pyrrolidylamide, trans-N-acetyl-L-prolyl-N′-methyl amide, and trans-N-acetyl-L-prolyl-N′-pyrrolidylamide are shown in Figures 4 and 5. The potential-energy surfaces of terminally-blocked D-amino acid residues can be obtained from those shown in Figures 4 and 5 by applying the inversion-symmetry operation (eq 5). The potential-energy maps for the Ac-AlaL-NHMe and Ac-Gly-NHMe systems (Figure 4A and 4C) resemble those obtained in our earlier study (Figure 4a and 4c in ref 14). It should be noted that, in that study,14 we included the second-order electron-correlation contribution to the energy only as a correction calculated for RHF-energy-minimized conformations. However, the Ac-Gly-NMe2 and Ac-AlaL-NMe2 potential-energy surfaces used in our previous study14 to represent the residues preceding proline (Figure 5 in ref 14) exhibit some differences with respect to the Ac-Gly-Pyr and Ac-AlaL-Pyr potential-energy surfaces (Figure 4B and 4D). The most significant difference is that the high-energy regions around λ(1) = 0°, λ(2) = 0° and around λ(1) = 0°, λ(2) = 180° are noticeably smaller for both Ac-Gly-Pyr and Ac-AlaL-Pyr compared to those for Ac-Gly-NMe2 and Ac-AlaL-NMe2, respectively. This feature results from the lack of specific hydrogen-bonding interactions, especially those in the C7 conformation, which create narrow and deep energy minima. With the bulky pyrrolidine group in place of a C-terminal amide-hydrogen atom, a broader but shallower minimum is present. Moreover, the high-energy regions of Ac-Gly-Pyr and Ac-AlaL-Pyr have a more complex shape than those of Ac-Gly-NMe2 and Ac-AlaL-NMe2, which is understandable in view of the more complex shape of the Pyr blocking group compared to that of the NMe2 group.
Another difference between the potential-energy maps of NMe2- and Pyr-blocked residues is the shape and topology of the energy minima. In Ac-AlaL-NMe2, two clearly-separated minima at (λ(1) = −60°, λ(2) = 90°) and (λ(1) = −150°, λ(2) = 150°) can be observed, which correspond to the C7eq and extended conformations, respectively (it should be noted, though, that the hydrogen bond characteristic of the C7eq conformation is absent because of replacement of the amide-hydrogen atom with a methyl group). For Ac-AlaL-Pyr, these two minima are shifted to (λ(1) = −75°, λ(2) = 90°) and (λ(1) = −135°, λ(2) = 135°) and nearly merged to create a broad low-energy region comprising the C7eq and extended conformations. This feature is also likely to result from use of the full MP2/6-31G(d,p) energy function in the present study.
Even larger differences, resulting from use of a different C-terminal blocking group and minimization of a different energy function, can be observed between Ac-ProL-NMe2 (Figure 6 in ref 14) and Ac-ProL-Pyr (Figure 5). The Ac-ProL-Pyr potential-energy plot exhibits only one minimum, while that of Ac-ProL-NMe2 has two minima. The maximum that appeared at λ(2) = 30° for Ac-ProL-NMe2 has coalesced with the minimum at λ(2) ≈ 60° into an inflection point at λ(2) ≈ 20°. These differences and inclusion of the harmonic-entropy contribution result in different torsional and double-torsional potentials involving a proline residue.
3.2 Torsional potentials
The 13 independent torsional potentials (see section 2.3) for each of the C-terminal blocking groups (NHMe or Pyr; a total of 26 potentials) calculated from the MP2/6-31G(d,p) potential-energy maps by using eq 8 are shown in Figure 6. Subsequently, Fourier series (eq 11) were fitted to them. The Fourier coefficients are collected in Table S1 of Supplementary Material.
Despite the more elaborate methodology for calculating the energy surfaces of terminally-blocked amino-acid residues, the torsional potentials of the L-amino-acid-residue sequences blocked with the NHMe group at the C-terminus (Figure 6A) resemble the previously determined potentials (Figure 8a in ref 14). As previously observed,14 three minima are present in the Gly-Gly potentials at γ = 180° and γ = ±60°. These minima correspond to extended, type I′ or type II′ (for γ = −60°), and type I or type II (for γ = 60°) β-turns, respectively. As in our previous work,14 the only minimum of the AlaL-AlaL torsional potential corresponds to the extended structure (Figure 7A). This feature results from the fact that, for the extended structure, both residues can form the energetically-favorable C7eq conformation with a C=O· · · H-N hydrogen bond, as illustrated in Figure 7A. On the other hand, both residues can also be in the extended conformation, which forms a broad low-energy region (Figure 4A and 4C) and, consequently, has high entropy. No minimum corresponding to an α-helical structure (γ ≈ 60°) is present, which is understandable because that structure is stabilized only by interactions between the second- and third-neighbor residues, which are not included in the torsional potentials.
As mentioned in section 2.3, the torsional potentials for the system in which both amino-acid residues are in the D-configuration can be obtained by inverting the potentials for L-amino-acid-residue sequences about γ = 0° (eq 14); the same holds for the torsional potentials of the Gly-X and X-Gly sequences, where X denotes alanine or proline. Conversely, the potentials for the sequences of mixed-chirality dipeptides exhibit a different pattern, as shown in Figure 6. The energy minima occur for small absolute values of γ regardless of whether Ala or Pro is involved and regardless of the type of the C-terminal blocking group. This feature arises from the fact that, for small values of γ, both residues can occur in the energetically-favorable C7eq conformation (for the L-configuration) or C7ax conformation (for the D-configuration) or in the portion of the extended-structure region adjacent to the C7 region. This situation is illustrated in Figure 7B for the AlaL-AlaD dipeptide. Moreover, the presence of three minima for the Ac-Gly-Gly-NHMe system (Figure 6A) can also now be explained because glycine is an achiral amino acid and any sequence of C7eq and C7ax conformations corresponds to low energy of intra-residue interaction. It should be noted that four minima are also present for the AlaL-AlaD-NHMe systems; the global minimum occurs at γ ≈ −60° and the second one occurs at γ ≈ 30°. Of course, for the Ac-AlaD-AlaL-NHMe sequence, these minima occur at γ ≈ 60° and γ ≈ −30°, respectively, by eq 14.
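In terms of fitted Fourier coefficients, the inversion about γ = 0° is a simple sign change. The sketch below assumes the series form a0 + Σ [ai cos(iγ) + bi sin(iγ)] and uses a hypothetical single-harmonic potential with its minimum at γ = −60°; replacing γ by −γ leaves the cosine coefficients unchanged and negates the sine coefficients, which moves the minimum to γ = +60°. The coefficient values and function names are illustrative.

```python
import math


def mirror_coefficients(a, b):
    """Coefficients of U(-gamma): cosine terms unchanged, sine terms negated."""
    return list(a), [-x for x in b]


def eval_fourier(a, b, gamma_deg):
    """Evaluate a0 + sum_i [a_i cos(i*gamma) + b_i sin(i*gamma)]."""
    g = math.radians(gamma_deg)
    return a[0] + sum(a[i] * math.cos(i * g) + b[i] * math.sin(i * g)
                      for i in range(1, len(a)))


def grid_minimum(a, b, step=15):
    """Return the grid angle (degrees) at which the series is lowest."""
    return min(range(-180, 180, step), key=lambda g: eval_fourier(a, b, g))


# Hypothetical potential U(gamma) = -cos(gamma + 60 deg), minimum at gamma = -60 deg.
a_ld = [0.0, -math.cos(math.radians(60))]
b_ld = [0.0, math.sin(math.radians(60))]
a_dl, b_dl = mirror_coefficients(a_ld, b_ld)

print(grid_minimum(a_ld, b_ld), grid_minimum(a_dl, b_dl))  # -60 60
```

This is why only one member of each mirror-related pair of sequences needs its own set of fitted coefficients.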
The location of the minima for the alternating sequences of L- and D-amino-acid residues has very interesting structural implications, namely the twisting of a two-stranded β-sheet into a double helix. This feature arises because not only C7 (axial or equatorial) but also extended conformations, which are capable of forming inter-strand hydrogen bonds, contribute to the partition function at the minima of the torsional potentials for the DL- or LD-chirality sequences. The double-helical structure of gramicidin D (PDB code: 1AL4; Figure 8), which is a heterodimer of two chains, each with an alternating sequence of D- and L-residues, confirms this observation. Our observations of the behavior of the torsional potentials were confirmed by MREMD (multiplexed replica exchange molecular dynamics) simulations of gramicidin D (Figure 8C). The structure with the lowest potential energy obtained at 250 K is a left-handed double helix. The RMSD is 3.70 Å, which results from a one-residue shift of the turn and from the flexible N-terminal fragments.
As can be seen from Figure 6B, the torsional potentials for the Pyr-blocked dipeptides are different from those for NHMe-blocked ones. The most significant difference occurs for the Gly-Gly dipeptide, for which the two minima at γ ≈ ±60° coalesce into a broad minimum region centered at γ = 0° and the extended-chain minimum at γ = 180° becomes very shallow. Similar differences occur for all glycine-containing systems. For the other systems, the torsional potentials for Pyr-blocked dipeptides have less pronounced fine structure compared to those for the NHMe-blocked systems. This is understandable because the large blocking group, which does not have a proton donor, prevents the formation of local hydrogen bonds and, thus, their formation cannot be reflected in the minima of the torsional potentials.
An analysis of the plots (Figure 6) also reveals that L-proline is an efficient helix breaker. For the AlaL-ProL-NHMe dipeptide, the torsional-potential values are about 9 kcal/mol in the entire region from γ = 0° to γ = 90°, i.e., for all possible values of γ corresponding to α-helical conformations (Figure 6A). If D-proline is inserted in the sequence, such high values of the torsional potential do not occur. This result contradicts the experimental observation that D-proline is a strong α-helix breaker.50,51 Thus, longer-range than intraresidue interactions must determine helix destabilization by D-proline.
3.3 Double-torsional potentials
All 126 independent potentials (63 for each type of the C-terminal blocking group) were calculated from the MP2/6-31G(d,p) potential-energy surfaces of terminally-blocked glycine, alanine, and proline, by using eq 9. To provide an example, the four UAlaL–Ala(D,L)–Ala(D,L)–NHMe(γ1, γ2) and the four UAlaL–Ala(D,L)–Ala(D,L)–Pyr(γ1, γ2) PMF surfaces are shown in Figure 9; the surfaces for the system in which the first residue is AlaD are related to those by eq 15. The other PMF surfaces are shown in Figure S1 of Supplementary Material. Two-dimensional Fourier series (eq 12) were subsequently fitted to the PMFs; the coefficients are collected in Table S2 of Supplementary Material.
As observed for the torsional potentials (section 3.2), the double-torsional potentials are similar to those derived in our previous work,14 with some differences arising for proline-containing tripeptides and the tripeptides blocked with the pyrrolidine C-terminal group. As for the torsional potentials discussed in the previous section, the difference between PMF minima and maxima is usually larger when the tripeptide is blocked with the pyrrolidine group. Even though the double-torsional potentials account for the third-order correlations, the range of their variation is comparable to that of the torsional potentials and they, therefore, cannot be ignored in the UNRES effective energy function.
The global minimum for the Ac-AlaL-AlaL-AlaL-NHMe double-torsional potential (Figure 9A) occurs at (γ1 = 15°, γ2 = −15°) and the minimum region extends along the γ2 = γ1 axis towards positive values of γ1. Thus, the double-torsional potentials for all-L-residue polypeptide chains diminish the propensity towards extended chain conformations and drive the chain toward right-handed α-helical conformations. When the L-alanine residue in the middle of the sequence is replaced by D-alanine (Figure 9C), the global minimum is shifted towards γ1 = γ2 = 180° and a shallower minimum also appears at γ1 = 120° and γ2 = 120°. Thus, for sequences with alternating chirality, the effect of the double-torsional potentials is opposite to that of the torsional potentials, which favor small values of γ (Figure 6). For the Ac-AlaL-AlaL-AlaD-NHMe system (Figure 9E), the global minimum is located at γ1 = 15°, γ2 = −165°, and for the Ac-AlaL-AlaD-AlaD-NHMe system (Figure 9G) the global minimum occurs at γ1 = −15°, γ2 = 180°, as in β-bulge structures.52 The double-torsional potentials for the Pyr-blocked systems (Figure 9B, D, and F) are close to those of the NHMe-blocked systems except that the minimum regions are less structured. It should be stressed that both the torsional and the double-torsional potentials contribute to the final structure of a protein and that their contributions cannot easily be separated. The structure of gramicidin D obtained from the simulation (Figure 8) cannot be attributed solely to the torsional interactions; it results from the sum of all interactions.
4 Summary
In this work, the potential-energy surfaces of terminally-blocked glycine, L-alanine, and L-proline were calculated at the MP2/6-31G(d,p) ab initio level. The C-terminal blocking group was either the N-methylamino group to account for the situation in which the next residue in the chain is not a proline residue, or pyrrolidine to account for the connection of the residue to a proline residue next in the amino-acid sequence. For Gly and AlaL, the energy maps were constructed on the grid of the angles λ(1) and λ(2) for rotation about the Cα · · · Cα virtual bonds (Figure 3); for L-proline one-dimensional potential-energy surfaces in λ(2) were constructed because, for L-proline, λ(1) is a function of λ(2) due to the constraint imposed by the presence of the rigid pyrrolidine ring. For each grid point, the energy was minimized with respect to all other degrees of freedom. Energy Hessians were calculated at each grid point to account for harmonic-entropy contributions. Based on the calculated potential-energy surfaces, the energy surfaces of D-alanine and D-proline were constructed by applying the inversion operation.
On the basis of the calculated energy surfaces, the torsional and double-torsional potentials for rotation about one or two consecutive Cα · · · Cα virtual-bond axes (defined by the virtual-bond-dihedral angle γ or two consecutive virtual-bond-dihedral angles γ1 and γ2), respectively, were calculated, and one- or two-dimensional Fourier series, respectively, were fitted to them to obtain the potentials for use in the UNRES force field. The potentials were calculated for all possible combinations of D- and L-amino-acid residues and for both the N-methylamide and pyrrolidine blocking groups. Because changing the chirality of all residues is equivalent to inverting the sign of the angle γ or of the angles γ1 and γ2, respectively, there are only 13 independent torsional and 63 independent double-torsional potentials for each type of C-terminal blocking group. For L-amino-acid residues and the NHMe C-terminal blocking group, the potentials are very similar to those determined in our previous work;14 they are, however, more accurate than the earlier ones14 for two reasons: the MP2/6-31G(d,p) energy function was fully minimized, instead of minimizing the energy in the RHF/6-31G(d,p) approximation and then adding the second-order correction to the energy, and the pyrrolidine blocking group, instead of the N′,N′-dimethylamine group, was used to account for a succeeding proline residue.
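The counts of 13 and 63 can be checked with a short enumeration. The sketch below is illustrative only and is not part of the UNRES code: it assumes five residue types (glycine, L- and D-alanine, L- and D-proline, with lowercase letters denoting D-chirality as elsewhere in this paper), treats residue pairs and triplets as ordered, and assumes that inversion of the chirality of all residues is the only symmetry relation. The second part illustrates the sign inversion for a one-dimensional Fourier series of the generic form Σ [a_n cos(nγ) + b_n sin(nγ)]; this form and the coefficient values are hypothetical and are not the fitted UNRES parameters.

```python
from itertools import product
import math

# Residue types assumed here: G = Gly (achiral), A/a = L-/D-Ala, P/p = L-/D-Pro.
TYPES = ["G", "A", "a", "P", "p"]


def invert(seq):
    """Return the mirror-image sequence: swap L and D, leave glycine unchanged."""
    return tuple(r if r == "G" else r.swapcase() for r in seq)


def count_classes(length):
    """Count ordered sequences of the given length, with each sequence and its
    mirror image counted as one class."""
    classes = {min(seq, invert(seq)) for seq in product(TYPES, repeat=length)}
    return len(classes)


def fourier(gamma_deg, a, b):
    """Evaluate sum_n a[n]*cos(n*gamma) + b[n]*sin(n*gamma), with n starting at 0."""
    g = math.radians(gamma_deg)
    return sum(an * math.cos(n * g) + bn * math.sin(n * g)
               for n, (an, bn) in enumerate(zip(a, b)))


def mirror_coeffs(a, b):
    """Coefficients of U(-gamma): cosine terms are even, sine terms are odd."""
    return list(a), [-x for x in b]


if __name__ == "__main__":
    print(count_classes(2), count_classes(3))

    # Hypothetical coefficients, used only to illustrate the sign inversion.
    a = [0.0, 0.8, -0.3, 0.1]
    b = [0.0, -0.5, 0.2, 0.05]
    am, bm = mirror_coeffs(a, b)
    for gamma in (-150.0, -40.0, 15.0, 120.0):
        assert abs(fourier(-gamma, a, b) - fourier(gamma, am, bm)) < 1e-12
```

Under these assumptions, only the Gly-Gly pair and the Gly-Gly-Gly triplet are their own mirror images, so the 25 pairs and 125 triplets reduce to (25 − 1)/2 + 1 = 13 and (125 − 1)/2 + 1 = 63 classes, in agreement with the numbers quoted above. For a mirror-image sequence, a series of this form requires no refitting: the cosine coefficients are retained and the sine coefficients change sign.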
For sequences with alternating chirality, the low-energy regions of the torsional and double-torsional potentials are shifted with respect to those for all-L- or all-D-amino-acid residue sequences. The torsional potentials for the Ala-Ala sequence exhibit the global energy minimum at the extended conformation, which corresponds to the sequence of two or extended conformations of consecutive residues. When the chirality of one of the residues is changed, the preferred absolute value of the γ angle is small; this corresponds to a sequence of and conformations or vice versa, depending on the chirality of the first residue or of two extended conformations adjacent to the respective C7 region. This feature is reflected in the experimental conformation of gramicidin D, which forms a dimer with the conformation of a left-handed double helix; this double helix can be considered a distorted two-stranded intermolecular β-sheet and has the hydrogen-bond pattern characteristic of a β-sheet (Figure 8).
We have also demonstrated that the correlation and the side-chain-rotamer potentials of D-amino-acid residues can be obtained from those of L-amino-acid residues by reflection of the respective coefficients in the xy plane of a local coordinate system associated with the Cα · · · Cα · · · Cα unit of a given residue. To complete the extension of UNRES to systems containing D-amino-acid residues, the virtual-bond-angle potentials must be determined; these potentials also depend on the virtual-bond-dihedral angles γ1 and γ2 adjacent to a given virtual-bond angle θ. This work is currently underway in our laboratory. We also plan to run a series of simulations of proteins containing D-amino-acid residues, e.g., gramicidin D, to examine their folding pathways and properties.
The UNRES software is open source and can be obtained from www.unres.pl.
Acknowledgments
This work was supported by grants UMO-2011/01/N/ST4/01772 from the National Science Center of Poland, the National Institutes of Health (GM-14312 and GM-62838), and the National Science Foundation (MCB10-19767). This research was supported by an allocation of advanced computing resources provided by the National Science Foundation (http://www.nics.tennessee.edu/), and by the National Science Foundation through TeraGrid resources provided by the Pittsburgh Supercomputing Center. Computational resources were also provided by (a) the supercomputer resources at the Informatics Center of the Metropolitan Academic Network (IC MAN) in Gdańsk, (b) the 624-processor Beowulf cluster at the Baker Laboratory of Chemistry, Cornell University, (c) our 184-processor Beowulf cluster at the Faculty of Chemistry, University of Gdańsk, and (d) the SOONER cluster at the University of Oklahoma.
References
- 1. Berman HM, Westbrook J, Feng Z, Gilliland G, Bhat TN, Weissig H, Shindyalov IN, Bourne PE. Nucleic Acids Res. 2000;28:235–242. doi: 10.1093/nar/28.1.235.
- 2. Kreil G. Annu Rev Biochem. 1997;66:337–345. doi: 10.1146/annurev.biochem.66.1.337.
- 3. Craig AG, Bandyopadhayay P, Olivera BM. Eur J Biochem. 1999;264:271–275. doi: 10.1046/j.1432-1327.1999.00624.x.
- 4. Caparros M, Pisabarro AG, de Pedro MA. J Bacteriol. 1992;174:5549–5559. doi: 10.1128/jb.174.17.5549-5559.1992.
- 5. Wenger RM. Helv Chim Acta. 1984;67:502–525.
- 6. Wade D, Boman A, Wahlin B, Drain CM, Andreu D, Boman HG, Merrifield RB. Proc Natl Acad Sci USA. 1990;87:4761–4765. doi: 10.1073/pnas.87.12.4761.
- 7. Global Industry Analysts, I. PRWEB
- 8. Liwo A, Khalili M, Scheraga HA. Proc Natl Acad Sci USA. 2005;102:2362–2367. doi: 10.1073/pnas.0408885102.
- 9. Khalili M, Liwo A, Jagielska A, Scheraga H. J Phys Chem B. 2005;109:13798–13810. doi: 10.1021/jp058007w.
- 10. Liwo A, Pincus MR, Wawak RJ, Rackovsky S, Scheraga HA. Protein Sci. 1993;2:1715–1731. doi: 10.1002/pro.5560021016.
- 11. Liwo A, Ołdziej S, Pincus MR, Wawak RJ, Rackovsky S, Scheraga HA. J Comput Chem. 1997;18:849–873.
- 12. Liwo A, Pincus MR, Wawak RJ, Rackovsky S, Ołdziej S, Scheraga HA. J Comput Chem. 1997;18:874–887.
- 13. Liwo A, Czaplewski C, Pillardy J, Scheraga HA. J Chem Phys. 2001;115:2323–2347.
- 14. Ołdziej S, Kozłowska U, Liwo A, Scheraga HA. J Phys Chem A. 2003;107:8035–8046.
- 15. Liwo A, Ołdziej S, Czaplewski C, Kozłowska U, Scheraga HA. J Phys Chem B. 2004;108:9421–9438.
- 16. Liwo A, Khalili M, Czaplewski C, Kalinowski S, Ołdziej S, Wachucik K, Scheraga H. J Phys Chem B. 2007;111:260–285. doi: 10.1021/jp065380a.
- 17. Kozłowska U, Liwo A, Scheraga HA. J Phys: Cond Matter. 2007;19:285203.
- 18. Liwo A, Czaplewski C, Ołdziej S, Rojas AV, Kaźmierkiewicz R, Makowski M, Murarka RK, Scheraga HA. In: Coarse-Graining of Condensed Phase and Biomolecular Systems. Voth G, editor. Chapter 8. CRC Press; 2008. pp. 107–122.
- 19. Kozłowska U, Maisuradze GG, Liwo A, Scheraga HA. J Comput Chem. 2010;31:1154–1167. doi: 10.1002/jcc.21402.
- 20. Makowski M, Liwo A, Sobolewski E, Scheraga HA. J Phys Chem B. 2011;115:6119–6129. doi: 10.1021/jp111258p.
- 21. Makowski M, Liwo A, Scheraga HA. J Phys Chem B. 2011;115:6130–6137. doi: 10.1021/jp111259e.
- 22. Sieradzan AK, Scheraga HA, Liwo A. J Chem Theor Comput. 2012;8:1334–1343. doi: 10.1021/ct2008439.
- 23. Liwo A, Kaźmierkiewicz R, Czaplewski C, Groth M, Ołdziej S, Wawak RJ, Rackovsky S, Pincus MR, Scheraga HA. J Comput Chem. 1998;19:259–276.
- 24. Kozłowska U, Liwo A, Scheraga HA. J Comput Chem. 2010;31:1143–1153. doi: 10.1002/jcc.21399.
- 25. Liwo A, Lee J, Ripoll DR, Pillardy J, Scheraga HA. Proc Natl Acad Sci USA. 1999;96:5482–5485. doi: 10.1073/pnas.96.10.5482.
- 26. Lee J, Liwo A, Ripoll DR, Pillardy J, Saunders JA, Gibson KD, Scheraga HA. Int J Quant Chem. 2000;77:90–117.
- 27. Pillardy J, Czaplewski C, Liwo A, Lee J, Ripoll DR, Kaźmierkiewicz R, Ołdziej S, Wedemeyer WJ, Gibson KD, Arnautova YA, Saunders J, Ye YJ, Scheraga HA. Proc Natl Acad Sci USA. 2001;98:2329–2333. doi: 10.1073/pnas.041609598.
- 28. Ołdziej S, et al. Proc Natl Acad Sci USA. 2005;102:7547–7552. doi: 10.1073/pnas.0502655102.
- 29. Liwo A, He Y, Scheraga HA. Phys Chem Chem Phys. 2011;13:16890–16901. doi: 10.1039/c1cp20752k.
- 30. Khalili M, Liwo A, Rakowski F, Grochowski P, Scheraga H. J Phys Chem B. 2005;109:13785–13797. doi: 10.1021/jp058008o.
- 31. Rakowski F, Grochowski P, Lesyng B, Liwo A, Scheraga HA. J Chem Phys. 2006;125:204107. doi: 10.1063/1.2399526.
- 32. Saunders JA, Scheraga HA. Biopolymers. 2003;68:300–317. doi: 10.1002/bip.10226.
- 33. Rojas AV, Liwo A, Scheraga HA. J Phys Chem B. 2007;111:293–309. doi: 10.1021/jp065810x.
- 34. Rojas A, Liwo A, Browne D, Scheraga HA. J Mol Biol. 2010;404:537–552. doi: 10.1016/j.jmb.2010.09.057.
- 35. Rojas A, Liwo A, Scheraga H. J Phys Chem B. 2011;115:12978–12983. doi: 10.1021/jp2050993.
- 36. He Y, Weinstein AL, Scheraga HH. J Mol Biol. 2011;405:298–314. doi: 10.1016/j.jmb.2010.10.051.
- 37. Golas EI, Maisuradze GG, Senet P, Ołdziej S, Czaplewski C, Scheraga HA, Liwo A. J Chem Theory Comput. 2012;8:1334–1343. doi: 10.1021/ct200680g.
- 38. Shen H, Liwo A, Scheraga HA. J Phys Chem B. 2009;113:8738–8744. doi: 10.1021/jp901788q.
- 39. Kolinski A, Skolnick J. J Chem Phys. 1992;97:9412–9426.
- 40. He Y, Xiao Y, Liwo A, Scheraga HA. J Comput Chem. 2009;30:2127–2135. doi: 10.1002/jcc.21215.
- 41. Bateman A, Bycroft M. J Mol Biol. 2000;299:1113–1119. doi: 10.1006/jmbi.2000.3778.
- 42. Macias MJ, Gervais V, Civera C, Oschkinat H. Nat Struct Biol. 2000;7:375–379. doi: 10.1038/75144.
- 43. Johansson MU, de Chateau M, Wikstrom M, Forsen S, Drakenberg T, Bjorck L. J Mol Biol. 1997;266:859–865. doi: 10.1006/jmbi.1996.0856.
- 44. Derrick JP, Wigley DB. J Mol Biol. 1994;243:906–918. doi: 10.1006/jmbi.1994.1691.
- 45. Neidigh JW, Fesinmeyer RM, Andersen NH. Nat Struct Biol. 2002;9:425–430. doi: 10.1038/nsb798.
- 46. Cochran AG, Skelton NJ, Starovasnik MA. Proc Natl Acad Sci USA. 2001;98:5578–5583. doi: 10.1073/pnas.091100898.
- 47. Nishikawa K, Momany FA, Scheraga HA. Macromolecules. 1974;7:797–806. doi: 10.1021/ma60042a020.
- 48. Liwo A, Pincus MR, Wawak RJ, Rackovsky S, Scheraga HA. Protein Sci. 1993;2:1697–1714. doi: 10.1002/pro.5560021015.
- 49. Gay DM. ACM Trans Math Software. 1983;9:503–524.
- 50. Krause E, Bienert M, Schmieder P, Wenschuh H. J Am Chem Soc. 2000;122:4865–4870.
- 51. Mitchell JB, Smith J. Proteins: Struct Funct Bioinf. 2003;50:563–571. doi: 10.1002/prot.10320.
- 52. Shepherd NE, Hoang HN, Abbenante G, Fairlie DP. J Am Chem Soc. 2009;131:15877–15886. doi: 10.1021/ja9065283.