Development of a Polarizable Force Field For Proteins via Ab Initio Quantum Chemistry: First Generation Model and Gas Phase Tests

GEORGE A KAMINSKI; HARRY A STERN; B J BERNE; RICHARD A FRIESNER; YIXIANG X CAO; ROBERT B MURPHY; RUHONG ZHOU; THOMAS A HALGREN

doi:10.1002/jcc.10125

Author manuscript; available in PMC: 2014 Mar 24.

Published in final edited form as: J Comput Chem. 2002 Dec;23(16):1515–1531. doi: 10.1002/jcc.10125


PMCID: PMC3963406 NIHMSID: NIHMS417048 PMID: 12395421

Abstract

We present results of developing a methodology suitable for producing molecular mechanics force fields with explicit treatment of electrostatic polarization for proteins and other molecular systems of biological interest. The technique allows simulation of realistic-size systems. Employing high-level ab initio data as a target for fitting allows us to avoid the problem of the lack of detailed experimental data. Using fast and reliable quantum mechanical methods supplies robust fitting data for the resulting parameter sets. As a result, gas-phase many-body effects for dipeptides are captured within an average RMSD of 0.22 kcal/mol from their ab initio values, and conformational energies for the di- and tetrapeptides are reproduced within an average RMSD of 0.43 kcal/mol from their quantum mechanical counterparts. The latter is achieved in part because of the application of a novel torsional fitting technique recently developed in our group, which has already been used to greatly improve the accuracy of peptide conformational equilibrium prediction with the OPLS-AA force field.¹ Finally, we have employed the newly developed first-generation model in computing gas-phase conformations of real proteins, as well as in molecular dynamics studies of these systems. The results show that, although the overall accuracy is no better than what can be achieved with a fixed-charges model, the methodology produces robust results, permits reasonably low computational cost, and avoids other computational problems typical for polarizable force fields. It can be considered a solid basis for building a more accurate and complete second-generation model.

Keywords: polarizable force field, ab initio quantum chemistry, gas phase tests

Introduction

The explicit inclusion of polarization in molecular mechanics force fields is a long-standing objective of molecular modeling. Over the past 5 years, significant progress has been made in developing polarizable models for small molecules,² particularly water;³ such models now reproduce experimental gas phase and condensed phase data with a high degree of accuracy for many properties of interest. Although there is still much work to be done in the refinement of such models, as well as fundamental issues that must be addressed to better understand the comparisons between theory and experiment (e.g., influence of quantum effects on nuclear motion in predicting condensed phase dynamical properties, effects of Pauli exclusion upon polarizabilities and charges in the condensed phase), it is fair to say that success has been achieved in constructing an effective technology for small-molecule force field development.

However, interesting biological applications require treatment not only of small molecules but of larger medicinal compounds and peptides and also of macromolecules. Polarizable force field development for such systems is in a much less advanced state. There are few existing publications in which systems with more than five heavy atoms per molecule are addressed,^2f,4 and none to our knowledge in which a complete force field suitable for macromolecular simulations has been presented. There are several reasons for this. First, parametrization of larger systems is a formidable problem both technically and because of a lack of suitable experimental data. Second, one must take care to avoid a “polarization catastrophe” in which the variable part of the charge distribution grows without bound, leading to nonsensical structures and energies. Third, validation of such a force field is a difficult problem. Reliable testing requires an efficient simulation algorithm with appropriate boundary conditions and a model for aqueous solvation that is compatible with the new force field.

In the present article, we take an initial step towards these objectives by presenting a methodology based primarily on ab initio quantum chemistry, and a first generation polarizable protein force field, which is tested in the gas phase. Although we have described some of the technology previously and demonstrated applications to small molecule systems,⁵ we regard the scale-up to a complete protein force field as a nontrivial demonstration of the promise of our approach; the gas phase tests presented herein, while unable to definitively evaluate accuracy, provide substantial evidence that the model behaves reasonably under a variety of conditions. Accuracy of ca. 0.5 kcal/mol in evaluating the molecular interactions is similar to what was obtained in a previous article for the OPLS-AA fixed-charges force field,¹ and we consider such an accuracy to be a sufficiently good target in this project. Future work will involve improvement of the force field to incorporate liquid state simulation data into the fitting process and extensive testing in solvent for structural and energetic predictive capabilities.

The article is organized as follows. In the next section, we briefly review our methodology for constructing the electrostatic model of an arbitrary molecule (both the permanent and polarizable components of the model), and present a few examples examining accuracy when the method is applied to model dipeptides. We then explain how this model is deployed to build an electrostatic model for a polypeptide chain, computing parameters for all 20 amino acids in various protonation states and then transferring the parameters to assemble the macromolecular electrostatic model. In the following section we discuss how van der Waals parameters are determined by fitting to ab initio quantum chemical data. The Valence Part of the Force Field section presents methods for parametrization of the valence energy; we retain stretching and bending terms from the OPLS-AA force field, so the emphasis is on fitting of torsional parameters. The Applications of the Polarizable Force Field section describes computational methods for evaluating energies and forces of the protein model for use in molecular dynamics and presents results for gas phase protein minimizations and molecular dynamics simulations, examining RMS deviations of the computed structure from the native structure and comparing with fixed charge force field simulations. Note that one does not expect these simulation results to be superior to those of a fixed charge force field, because at this point we have not included an explicit or implicit solvent model and have not tested our parameters as extensively as was done, for example, for OPLS-AA over the course of many years; however, we can establish whether the native structure remains a reasonable local minimum and whether the model exhibits a polarization catastrophe.

Polarizable Electrostatics

The Model

The electrostatics of a molecule are represented by a set of fixed bond-charge increments between pairs of bonded sites, and polarizable dipoles placed on sites. We use the term “site” to denote either an atom or an off-atom virtual site. Bond-charge increments are convenient in that site charges result only from the assignment of equal (in magnitude) and opposite partial charges on bonded neighbors, thus ensuring that the molecule is always neutral (or always has a fixed nonzero charge, if additional fixed charges are added). We may represent transfer of (positive) charge from site i to site j by a bond-charge increment q_ij, which contributes a charge −q_ij to site i and +q_ij to site j. The total charge on a site is then the sum of the contributions from all bond-charge increments containing that site. In this article we employ fixed bond charge increments, and polarization response only springs from the polarizable sites with inducible dipoles interacting with these bond charges, external field, and each other. In other work we have also allowed the bond charge increments to fluctuate in response to changing electrostatic environment (for representative examples, see refs. 3a and 9).
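The bookkeeping behind bond-charge increments can be illustrated with a minimal Python sketch. The function name `site_charges`, the site labels, and the increment values are hypothetical and chosen only for illustration; they are not parameters of the force field.

```python
from collections import defaultdict

def site_charges(bond_increments, fixed_charges=None):
    """Sum bond-charge increments into per-site charges.

    bond_increments maps an ordered pair (i, j) to q_ij, read as a transfer of
    positive charge from site i to site j: site i receives -q_ij, site j +q_ij.
    fixed_charges optionally adds fixed charges for a molecule with net charge.
    """
    charges = defaultdict(float)
    for (i, j), q_ij in bond_increments.items():
        charges[i] -= q_ij
        charges[j] += q_ij
    for site, q in (fixed_charges or {}).items():
        charges[site] += q
    return dict(charges)

# Hypothetical three-site example; the increment values are illustrative only.
increments = {("O", "H1"): 0.40, ("O", "H2"): 0.40}
q = site_charges(increments)
total = sum(q.values())  # zero by construction, whatever the increments are
```

Because every increment adds equal and opposite charges to two sites, the total charge depends only on the optional fixed charges, never on the fitted increments.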

The expression for the energy of an induced dipole moment μ_i on a site i is

U(\underline{\mu}_i) = \underline{\chi}_i \cdot \underline{\mu}_i + \frac{1}{2}\, \underline{\mu}_i \cdot \underline{\alpha}_i^{-1} \cdot \underline{\mu}_i

(1)

The quadratic term is the familiar self-energy of an induced dipole; α_i is the polarizability of site i. The linear coefficient χ_i represents (the negative of) an “intrinsic” electric field at site i, that is, an electric field that exists even in the absence of any other sites or external fields. We would expect χ_i to be nonzero only if the site were part of an asymmetric molecule. The parameter χ_i is really just a way to introduce a “permanent” nonzero dipole moment in an isolated molecule; by completing the square we could have written eq. (1), up to a constant, in the form $\frac{1}{2} (\underline{\mu}_i - \underline{\mu}_i^0) \cdot \underline{\alpha}_i^{-1} \cdot (\underline{\mu}_i - \underline{\mu}_i^0)$, where the permanent dipoles are μ_i⁰ = −α_i · χ_i. However, eq. (1) is somewhat more convenient in that one need keep track of only one dipole moment on a site (rather than both a permanent and an induced dipole moment).
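For readers who want the intermediate step, the completed square can be expanded explicitly. The sketch below assumes only that the polarizability tensor is symmetric and invertible; the constant that is dropped is the last term on the second line.

```latex
\begin{aligned}
\frac{1}{2} (\underline{\mu}_i - \underline{\mu}_i^0) \cdot \underline{\alpha}_i^{-1} \cdot (\underline{\mu}_i - \underline{\mu}_i^0)
&= \frac{1}{2}\, \underline{\mu}_i \cdot \underline{\alpha}_i^{-1} \cdot \underline{\mu}_i
 - \underline{\mu}_i^0 \cdot \underline{\alpha}_i^{-1} \cdot \underline{\mu}_i
 + \frac{1}{2}\, \underline{\mu}_i^0 \cdot \underline{\alpha}_i^{-1} \cdot \underline{\mu}_i^0 \\
&= \frac{1}{2}\, \underline{\mu}_i \cdot \underline{\alpha}_i^{-1} \cdot \underline{\mu}_i
 + \underline{\chi}_i \cdot \underline{\mu}_i
 + \frac{1}{2}\, \underline{\chi}_i \cdot \underline{\alpha}_i \cdot \underline{\chi}_i ,
 \qquad \underline{\mu}_i^0 = -\underline{\alpha}_i \cdot \underline{\chi}_i
\end{aligned}
```

The first two terms of the second line are exactly eq. (1), so an isolated site has its energy minimum at the permanent dipole μ_i⁰.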

The electrostatics of a system of molecules is represented by a collection of interacting bond-charge increments and dipoles. We introduce a scalar coupling J_ij,kl between bond-charge increments on sites i,j and k,l; a vector coupling S_ij,k between a bond-charge increment on sites i,j and a dipole on site k; and a rank-two tensor coupling T_i,j between dipoles on sites i and j. Then the total energy is

U(\{q_{ij}\}, \{\underline{\mu}_i\}) = \sum_{i} \left( \underline{\chi}_i \cdot \underline{\mu}_i + \frac{1}{2}\, \underline{\mu}_i \cdot \underline{\alpha}_i^{-1} \cdot \underline{\mu}_i \right) + \frac{1}{2} \sum_{ij \neq kl} q_{ij}\, J_{ij,kl}\, q_{kl} + \sum_{ij,k} q_{ij}\, S_{ij,k} \cdot \underline{\mu}_k + \frac{1}{2} \sum_{i \neq j} \underline{\mu}_i \cdot T_{ij} \cdot \underline{\mu}_j

(2)

A natural choice for coupling of bond-charge increments and dipoles that are well-separated in space is the Coulomb interaction:

J_{i j, k l} = \frac{1}{r_{i k}} - \frac{1}{r_{i l}} - \frac{1}{r_{j k}} + \frac{1}{r_{j l}}

(3)

S_{i j, k} = \frac{{\underline{r}}_{i k}}{r_{i k}^{3}} - \frac{{\underline{r}}_{j k}}{r_{j k}^{3}}

(4)

T_{i, j} = \frac{1}{r_{i j}^{3}} (1 - 3 \frac{{\underline{r}}_{i j} {\underline{r}}_{i j}}{r_{i j}^{2}})

(5)
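Equations (3) to (5) translate directly into code. The following is a minimal Python sketch, not the implementation used in this work; it assumes units in which the Coulomb constant is 1, and it assumes the convention r_ik = r_k − r_i for the displacement vectors, which makes eq. (4) agree with the energy of a dipole in the field of the two charges −q_ij (site i) and +q_ij (site j). The function names are hypothetical.

```python
import math

def sub(a, b):
    """Component-wise a - b for 3-vectors stored as sequences."""
    return [a[k] - b[k] for k in range(3)]

def norm(v):
    return math.sqrt(sum(c * c for c in v))

def coupling_J(ri, rj, rk, rl):
    """Eq. (3): scalar coupling between bond-charge increments ij and kl."""
    dist = lambda a, b: norm(sub(a, b))
    return (1.0 / dist(ri, rk) - 1.0 / dist(ri, rl)
            - 1.0 / dist(rj, rk) + 1.0 / dist(rj, rl))

def coupling_S(ri, rj, rk):
    """Eq. (4): vector coupling between increment ij and a dipole on site k.

    Assumes r_ik = r_k - r_i (sign convention chosen for this sketch).
    """
    rik, rjk = sub(rk, ri), sub(rk, rj)
    dik, djk = norm(rik), norm(rjk)
    return [rik[a] / dik**3 - rjk[a] / djk**3 for a in range(3)]

def coupling_T(ri, rj):
    """Eq. (5): rank-two dipole-dipole tensor, returned as a nested 3x3 list."""
    rij = sub(rj, ri)
    d = norm(rij)
    return [[((1.0 if a == b else 0.0) - 3.0 * rij[a] * rij[b] / d**2) / d**3
             for b in range(3)] for a in range(3)]

# Two sites 2 length units apart along z: T_zz = -2/r^3 and T_xx = T_yy = 1/r^3.
T = coupling_T([0.0, 0.0, 0.0], [0.0, 0.0, 2.0])
J = coupling_J([0, 0, 0], [1, 0, 0], [0, 0, 10], [1, 0, 10])
S = coupling_S([0, 0, 0], [1, 0, 0], [0, 0, 10])
```

The tensor T is symmetric and traceless, which is a quick sanity check on any implementation of eq. (5).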

The Coulomb interaction diverges as the distance between bond-charge increments and dipoles goes to zero, so will not be appropriate if they are too close. Physically, this represents the fact that a point multipole description is only accurate from far enough away. This can be remedied by omitting 1,2- and 1,3-interactions (between sites connected with each other directly or through a third one), as is commonly done in molecular-mechanics force fields.
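One way to build the list of omitted pairs is to walk the bond graph. The sketch below is illustrative only: `excluded_pairs` and the four-site chain are hypothetical, and the code simply collects every pair of sites that are bonded directly (1,2) or share a common bonded neighbor (1,3).

```python
from itertools import combinations

def excluded_pairs(bonds):
    """Return the set of 1,2 and 1,3 site pairs implied by a bond list."""
    neighbors = {}
    for a, b in bonds:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    excluded = {frozenset(bond) for bond in bonds}      # 1,2 pairs
    for bonded in neighbors.values():
        for a, b in combinations(sorted(bonded), 2):    # 1,3 pairs share a neighbor
            excluded.add(frozenset((a, b)))
    return excluded

# Hypothetical four-site chain A-B-C-D.
pairs = excluded_pairs([("A", "B"), ("B", "C"), ("C", "D")])
```

For the chain A-B-C-D the pair A, D is a 1,4 pair and therefore still interacts.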

For every spatial configuration of atoms, the dipole moments are determined by minimizing the total energy as given in eq. (2); that is, requiring that

\nabla_{\underline{\mu}_i} U(\{q_{ij}\}, \{\underline{\mu}_i\}) = 0

(6)

for all i. This is equivalent to the usual “self-consistent field” determination of induced dipole moments. Note that eq. (6) only specifies a minimum if the matrix of second derivatives of eq. (2) with respect to the dipole moments,

\nabla_{\underline{\mu}_i} \nabla_{\underline{\mu}_j} U = \underline{\alpha}_i^{-1} \delta_{ij} + T_{ij} (1 - \delta_{ij}),

(7)

is positive definite. If this matrix is not positive definite, no minimum of the total energy exists; the polarization energy can become arbitrarily large and negative. This is the so-called “polarization catastrophe” and again is due physically to the fact that the point multipole description is only accurate from far enough away. It should be noted here that, although we did not use any electrostatic screening of the inducible dipoles in this work, the Lennard–Jones part of the Hamiltonian was enough to prevent the molecular systems involved from approaching the “polarization catastrophe” regions mentioned above.
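The positive-definiteness condition can be checked numerically. The sketch below is a minimal illustration under stated assumptions: isotropic polarizabilities in volume units, the Coulomb form of eq. (5) for every pair, and a hand-written Cholesky factorization as the test (it succeeds only for a positive definite matrix). For two isotropic sites the condition reduces to the closed form r⁶ > 4α₁α₂, which the example reproduces. The function names are hypothetical.

```python
import math

def dipole_tensor(ri, rj):
    """Eq. (5) between two sites, as a nested 3x3 list."""
    rij = [rj[k] - ri[k] for k in range(3)]
    d = math.sqrt(sum(c * c for c in rij))
    return [[((1.0 if a == b else 0.0) - 3.0 * rij[a] * rij[b] / d**2) / d**3
             for b in range(3)] for a in range(3)]

def second_derivative_matrix(positions, alphas):
    """Eq. (7) for isotropic alphas: 1/alpha_i on the diagonal, T_ij blocks elsewhere."""
    n = len(positions)
    H = [[0.0] * (3 * n) for _ in range(3 * n)]
    for i in range(n):
        for a in range(3):
            H[3 * i + a][3 * i + a] = 1.0 / alphas[i]
        for j in range(n):
            if i != j:
                T = dipole_tensor(positions[i], positions[j])
                for a in range(3):
                    for b in range(3):
                        H[3 * i + a][3 * j + b] = T[a][b]
    return H

def is_positive_definite(H):
    """Attempt a Cholesky factorization of the symmetric matrix H."""
    n = len(H)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = H[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                if s <= 0.0:
                    return False  # non-positive pivot: no energy minimum exists
                L[i][i] = math.sqrt(s)
            else:
                L[i][j] = s / L[j][j]
    return True

# Two sites with alpha = 1 (volume units); the critical distance is 4**(1/6), about 1.26.
stable = is_positive_definite(second_derivative_matrix([[0, 0, 0], [0, 0, 1.5]], [1.0, 1.0]))
catastrophe = not is_positive_definite(second_derivative_matrix([[0, 0, 0], [0, 0, 1.2]], [1.0, 1.0]))
```

In a simulation such a check would be a diagnostic only; as noted above, the Lennard–Jones repulsion kept the systems studied here away from the unstable region.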

Equation (6) may be solved by matrix diagonalization or by iterative methods. Alternatively, the dipole moments may be assigned fictitious masses and kinetic energies and integrated along with the spatial coordinates in the extended Lagrangian scheme.^6–9 The dynamics of inducible dipoles so generated is fictitious, and functions only as a way to keep the electronic degrees of freedom close to the minimum-energy “Born-Oppenheimer” surface.
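As an illustration of the iterative route, the sketch below solves eq. (6) by simple fixed-point (Jacobi-style) iteration. It is a sketch under assumptions, not the solver used in this work: polarizabilities are isotropic, and all dipole-independent terms of eq. (6) are collected into a hypothetical per-site vector F_i = −(χ_i + Σ q_jk S_jk,i), so that the stationarity condition reads μ_i = α_i (F_i − Σ_j T_ij · μ_j).

```python
import math

def dipole_tensor(ri, rj):
    """Eq. (5) between two sites, as a nested 3x3 list."""
    rij = [rj[k] - ri[k] for k in range(3)]
    d = math.sqrt(sum(c * c for c in rij))
    return [[((1.0 if a == b else 0.0) - 3.0 * rij[a] * rij[b] / d**2) / d**3
             for b in range(3)] for a in range(3)]

def solve_dipoles(positions, alphas, fields, tol=1e-10, max_iter=200):
    """Iterate mu_i = alpha_i * (F_i - sum_j T_ij mu_j) until self-consistent."""
    n = len(positions)
    mu = [[alphas[i] * f for f in fields[i]] for i in range(n)]  # non-interacting guess
    for _ in range(max_iter):
        new_mu = []
        for i in range(n):
            total = list(fields[i])
            for j in range(n):
                if j == i:
                    continue
                T = dipole_tensor(positions[i], positions[j])
                for a in range(3):
                    total[a] -= sum(T[a][b] * mu[j][b] for b in range(3))
            new_mu.append([alphas[i] * t for t in total])
        change = max(abs(new_mu[i][a] - mu[i][a]) for i in range(n) for a in range(3))
        mu = new_mu
        if change < tol:
            return mu
    raise RuntimeError("induced dipoles did not converge")

# Two identical sites on the z axis in a uniform field along z (illustrative values).
dipoles = solve_dipoles([[0.0, 0.0, 0.0], [0.0, 0.0, 2.0]], [1.0, 1.0],
                        [[0.0, 0.0, 0.1], [0.0, 0.0, 0.1]])
```

For this symmetric case the answer is known in closed form, μ_z = αF/(1 − 2α/r³), which gives a convenient check on the iteration.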

Parameterization

Parameterizing the electrostatic model for a given molecule involves the following steps: (1) choosing virtual sites, (2) choosing sites on which a dipole moment will be placed, (3) fitting the polarizabilities—the parameters that specify the electrostatic response, and (4) fitting the “intrinsic fields” and fixed bond-charge increments—the parameters that describe the electrostatics of an isolated molecule.

Massless virtual sites representing lone pairs were attached to oxygen atoms at a distance of ca. 0.47 Å. For sp³-hybridized oxygens (e.g., alcohols) the virtual sites mimic a tetrahedral geometry: they lie in the plane that is perpendicular to the plane of the oxygen and its two bonded atoms and that contains the bisector of the angle formed at the oxygen by those two atoms, and they make an angle of ±tan⁻¹(√2) ≈ ±54.74° with the plane containing these three atoms. For sp²-hybridized oxygen atoms (e.g., carbonyls) the virtual sites mimic a trigonal-planar geometry: they lie in the plane containing the oxygen, the carbon, and an atom bonded to the carbon, and make an angle of ±120° with the O–C bond.
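The placement rules can be sketched in a few lines of Python. This is an illustration under assumptions rather than the authors' code: the sp³ sites are assumed to point away from the two bonded atoms (opposite to the bisector), the helper names are hypothetical, and the example coordinates are arbitrary.

```python
import math

def sub(a, b): return [a[k] - b[k] for k in range(3)]
def add(a, b): return [a[k] + b[k] for k in range(3)]
def scale(a, s): return [s * c for c in a]
def dot(a, b): return sum(a[k] * b[k] for k in range(3))
def unit(a): return scale(a, 1.0 / math.sqrt(dot(a, a)))
def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1], a[2] * b[0] - a[0] * b[2], a[0] * b[1] - a[1] * b[0]]

TETRAHEDRAL_HALF_ANGLE = math.atan(math.sqrt(2.0))  # about 54.74 degrees

def _site_pair(origin, axis, side, angle, distance):
    """Two sites at +/-angle from the unit vector axis, tilted toward +/-side."""
    return [add(origin, scale(add(scale(axis, math.cos(angle)),
                                  scale(side, s * math.sin(angle))), distance))
            for s in (1.0, -1.0)]

def sp3_lone_pairs(o, a, b, distance=0.47):
    """Lone-pair sites for an sp3 oxygen o bonded to atoms a and b."""
    ua, ub = unit(sub(a, o)), unit(sub(b, o))
    away = scale(unit(add(ua, ub)), -1.0)   # assumed: opposite to the bisector
    normal = unit(cross(ua, ub))            # normal to the a-O-b plane
    return _site_pair(o, away, normal, TETRAHEDRAL_HALF_ANGLE, distance)

def sp2_lone_pairs(o, c, x, distance=0.47):
    """Lone-pair sites for a carbonyl oxygen o on carbon c; x is bonded to c."""
    u = unit(sub(c, o))                     # along the O-C bond
    v = sub(x, c)
    w = unit(sub(v, scale(u, dot(v, u))))   # in-plane, perpendicular to O-C
    return _site_pair(o, u, w, math.radians(120.0), distance)

# Arbitrary water-like and carbonyl-like test geometries (angstrom).
lp3 = sp3_lone_pairs([0, 0, 0], [0.757, 0.586, 0.0], [-0.757, 0.586, 0.0])
lp2 = sp2_lone_pairs([0, 0, 1.22], [0, 0, 0], [1.0, 0.0, -0.5])
```

With the half-angle tan⁻¹(√2), the two sp³ sites subtend the tetrahedral angle of about 109.47° at the oxygen, since cos(2 tan⁻¹√2) = −1/3.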

Computational experiments have indicated that the effect of oxygen lone pairs on structures and energies is in general greater than that of nitrogen, although it is certainly possible to find cases where nitrogen lone pairs are essential. We expect to add nitrogen lone pairs in the next generation of our polarizable force field effort.

Bond-charge increments were placed on all sites, but dipoles (carrying both the permanent and the induced contributions of eq. (1)) are placed only on polarizable atoms.

The electrostatic parameters of the model were fit in a manner similar to that described in refs. 5b and 10. We applied a series of electrostatic perturbations to the target molecule, in the form of dipolar probes consisting of two opposite charges of magnitude 0.78 e, 0.58 Å apart (for a dipole moment of 2.17 D, similar to that of nonpolarizable models for liquid water such as SPC/E¹¹), placed at various locations. The outcome of the fitting procedure was relatively insensitive to the exact form of the perturbations, i.e., the magnitude or position of the probe charges. For each perturbation, the change in the electrostatic potential (ESP) at a set of gridpoints outside the van der Waals surface of the molecule was computed using density-functional theory (DFT) with the B3LYP method^12,13 and cc-pVTZ(-f) basis set. All calculations were performed with the Jaguar electronic structure code.¹⁴ The polarizabilities α_i are assumed to be isotropic and are chosen to minimize the mean-square deviation between the change in the ESP as given by the model and by the DFT calculations. Next, χ_i and the fixed bond-charge increments q_ij are fit so as to best reproduce DFT calculations of the ESP of the charge distribution of the unperturbed target molecule. For each peptide molecule, the fitting was done using several conformers simultaneously. Numbers of conformations for each particular case are given in tables accompanying the Valence Part of the Force Field section. The vectors χ_i are expressed as a sum of vector parameters pointing along bonds connecting adjacent atoms, and as such, will change during the course of a simulation as a flexible molecule changes conformation.
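The quoted probe dipole moment follows from the charge and separation alone. The short check below assumes only the standard conversion 1 e·Å ≈ 4.8032 D; the function and constant names are hypothetical.

```python
E_ANGSTROM_TO_DEBYE = 4.80320  # 1 e*angstrom expressed in debye

def probe_dipole_debye(charge_e, separation_angstrom):
    """Dipole moment of two opposite point charges, in debye."""
    return charge_e * separation_angstrom * E_ANGSTROM_TO_DEBYE

moment = probe_dipole_debye(0.78, 0.58)  # close to the 2.17 D quoted in the text
```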

The choice of electronic structure method (DFT/B3LYP functional, cc-pVTZ(-f) basis set) yields quite accurate permanent charge distributions, but underestimates the gas phase polarizability compared to experiment. Closer agreement with gas phase experiments could be obtained by including diffuse functions in the DFT calculations. However, our computational experiments with liquid state simulations strongly suggest that these diffuse function contributions are considerably damped in the condensed phase, and that ignoring them is in fact a much better approximation than fully including them. Briefly, the theoretical argument is that in the condensed phase, Pauli repulsion from neighboring molecules raises the energies of diffuse functions and so diminishes their contribution to the polarization. Empirically, when diffuse functions are used to develop polarization responses for small molecules, liquid state simulations of these molecules manifest overpolarization of the solvent, in some cases leading to polarization catastrophes; quantitative properties such as the dielectric constant are also too large compared to experiment. We discuss this point further in a separate publication focused on liquid state simulation results.¹⁵

Results of Electrostatic Parameterization for Peptide Residues

The methodology described above was applied to all the residues in dipeptide form to produce the electrostatic part of the force field. Ability to reproduce two- and three-body energies of interaction of the dipeptides with the electrostatic probes described in the section above was used to validate the quality of the resulting parameters. The three-body energies were computed as follows:

E_{123} = E (1 + 2 + 3) - E (1 + 2) - E (1 + 3) - E (2 + 3) + E (1) + E (2) + E (3)

(8)

Here, E(1+2+3) is the energy of all three bodies put together (Fig. 1); E(1+2), E(1+3), and E(2+3) are the energies of the three pairs; and E(1), E(2), and E(3) are the energies of the molecule and the probes alone, respectively. The three-body response E₁₂₃ is independent of the permanent charge distribution and is only influenced by the polarizable part of the electrostatics. First, the probes were placed at hydrogen-bonding positions. Then more probes were placed at random locations around the molecules. The total number of probes for each dipeptide was ca. 30–40 for each conformer. The distance of hydrogen-bonded probes from the acceptor or donor on the target molecule was fixed at 1.8 Å. The randomly positioned probes were also constrained to be located at 1.8 Å from the closest atomic site of the molecule involved. Table 1 shows RMS deviations/maximum deviations of the three-body energies for all the dipeptides from their DFT/cc-pVTZ(-f) counterparts, along with the ab initio average and maximum values of the three-body energies. It should be noted that, although the deviations are often similar in magnitude to the averages, the accuracy of our fitting is still good because of their low absolute values. We have previously demonstrated that to capture the many-body effects one needs to use inducible dipoles; fluctuating charges alone do not suffice in some important cases.^5b (It is conceivable that one could also attempt to solve this problem by placing additional fluctuating charge sites at locations other than atomic centers.) The data in Table 1 demonstrate that the polarization based on inducible dipoles alone (without the use of any fluctuating charges) provides an adequate representation of the many-body responses, with the greatest RMSD being only 0.450 kcal/mol. Magnitudes of the three-body energies themselves are typically within a couple of kcal/mol. The agreement can be further improved by including both inducible dipoles and fluctuating charges in the model.
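Equation (8) can be illustrated with a small numerical sketch. The energies below are hypothetical values in kcal/mol, chosen only to show that a strictly pairwise-additive model (such as fixed charges) gives E₁₂₃ = 0, so that any nonzero three-body energy isolates the nonadditive, polarization-driven part.

```python
def three_body_energy(e123, e12, e13, e23, e1, e2, e3):
    """Eq. (8): nonadditive part of the energy of three bodies."""
    return e123 - e12 - e13 - e23 + e1 + e2 + e3

# Hypothetical monomer energies, pair interaction terms, and a nonadditive term.
e1, e2, e3 = -10.0, -2.0, -3.0
p12, p13, p23 = -4.0, -5.0, 1.5
nonadditive = -0.3  # present only when all three bodies are together

pairs = (e1 + e2 + p12, e1 + e3 + p13, e2 + e3 + p23)
additive_total = e1 + e2 + e3 + p12 + p13 + p23

e123_additive = three_body_energy(additive_total, *pairs, e1, e2, e3)
e123_polarizable = three_body_energy(additive_total + nonadditive, *pairs, e1, e2, e3)
```

Here `e123_additive` vanishes while `e123_polarizable` recovers the nonadditive term, which is why the three-body energies are a clean target for the polarizable part of the model.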

Figure 1. Definitions of two- and three-body energies used as the target/test of the polarizable force field quality.

Table 1.

RMS Deviations of Three-Body Energies for Dipeptides Computed with Quantum Mechanics and Polarizable Force Field and Average Three-Body Energies.

Dipeptide	E₁₂₃ RMS deviations/maximum deviations, kcal/mol	Average/maximum E₁₂₃, kcal/mol
Alanine	0.158/0.466	0.299/0.784
Serine	0.173/0.785	0.257/1.629
Phenylalanine	0.122/0.281	0.195/0.790
Cysteine	0.293/1.350	0.280/2.016
Asparagine	0.267/1.363	0.275/1.425
Glutamine	0.208/2.888	0.227/3.174
Histidine	0.280/1.614	0.249/4.267
Leucine	0.152/0.521	0.278/1.013
Isoleucine	0.258/1.984	0.307/2.592
Valine	0.136/0.340	0.286/1.154
Methionine	0.245/1.167	0.260/2.007
Proline	0.171/0.285	0.370/0.927
Tryptophan	0.276/0.949	0.196/4.178
Threonine	0.182/0.935	0.287/1.744
Tyrosine	0.450/1.775	0.179/1.402
Aspartic acid	0.333/1.155	0.376/3.206
Glutamic acid	0.244/1.146	0.282/2.161
Lysine	0.166/1.693	0.162/2.273
Protonated histidine	0.130/0.628	0.174/1.137
Arginine	0.120/1.253	0.196/1.798

Dipeptide	Number of points (N)	E₁₂ RMSD	Average E₁₂
Alanine	225	1.017	3.236
Serine	221	1.332	4.664
Phenylalanine	130	1.181	16.023
Cysteine	198	1.442	4.543
Asparagine	77	2.902	5.710
Glutamine	509	2.732	7.413
Histidine	351	3.021	5.567
Leucine	364	3.187	4.418
Isoleucine	318	3.073	6.149
Valine	114	3.317	20.301
Methionine	296	2.053	5.238
Proline	35	1.000	3.834
Tryptophan	439	2.342	10.548
Threonine	307	1.456	6.837
Tyrosine	286	2.488	9.136
Aspartic acid	125	1.679	14.732
Glutamic acid	327	1.809	15.515
Lysine	267	1.695	13.799
Histidine-H⁺	258	2.181	11.653
Arginine	269	3.278	11.527

Dimer	LMP2^a	MP2^a	CCSDT limit^b
H₂O–MeOH	4.80	4.99	4.90
H₂O–Me₂O	5.74	5.70	5.51
H₂O–H₂CO	5.15	5.21	5.17
MeOH–MeOH	5.54	5.58	5.45
HCOOH–HCOOH	13.92	13.79	13.93

Atom	σ/ε, PFF	σ/ε, OPLS-AA^a
O, sp², backbone and side chains	3.16/0.280	2.96/0.210
NH, backbone and side chains	3.30/0.280	3.25/0.170
OH, serine	3.25/0.280	3.12/0.170
OH, tyrosine	3.20/0.190	3.07/0.170
S, cysteine and methionine	3.60/0.425	3.60/0.425
N heterocycle, histidine	3.25/0.170	3.25/0.170
N3, lysine	3.0/0.080	3.25/0.170
O2, aspartic and glutamic acid	2.96/0.210	2.96/0.210
N2, arginine	3.20/0.100	3.25/0.170

Conformer	Ab initio ^a	PFF	OPLS-AA/L^b
C7_eq	0.00	−0.23/9.3	−0.11/9.1
C5	0.95	0.77/1.2	0.82/4.5
C7_ax	2.67	2.48/6.5	2.46/0.5
β ₂	2.75	—	—
α _L	4.31	—	—
α ′	5.51	6.11/8.6	5.97/8.0
RMS error^c	—	0.35/7.1	0.27/6.5

Conformer	Ab initio ^a	PFF	OPLS-AA/L^b
1	2.71	3.31/1.0	3.19/4.4
2	2.84	2.87/4.7	3.19/6.5
3	0.00	0.14/7.8	−0.32/8.4
4	4.13	3.85/4.0	4.40/5.8
5	3.88	3.24/16.7	3.14/9.3
6	2.20	0.80/13.9	0.96/12.7
7	5.77	6.91/16.0	5.82/6.6
8	4.16	4.12/47.2	4.83/18.8
9	6.92	7.69/8.8	7.14/8.2
10	6.99	6.69/23.0	7.25/14.2
RMS error^c	—	0.69/19.1^d	0.56/10.4

Conformer	Ab initio ^a	PFF	OPLS-AA/L 1^b	OPLS-AA/L 2^b
1	0.00	−0.26/5.8	0.49/7.9	0.30/7.8
2	2.76	2.67/8.7	3.30/1.3	2.83/1.6
3	3.75	3.72/12.9	3.08/1.7	3.45/2.2
4	3.95	4.50/3.5	4.12/6.7	3.69/6.7
5	5.13	5.44/7.3	4.90/4.0	4.76/3.7
6	7.43	6.95/7.0	7.13/4.2	7.98/4.0
RMS error^c	—	0.34/8.1	0.44/4.9	0.34/4.9

Conformer	Ab initio ^a	PFF	OPLS-AA/L^b
1	0.00	−0.02/8.4	−0.19/6.2
2	0.88	0.91/10.0	0.90/10.6
3	1.65	1.63/10.2	1.82/3.8
RMS error^c	—	0.02/9.5	0.15/7.5
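The energy entries in the RMS error row of the three-conformer table directly above can be reproduced from its energy columns. The footnote defining the RMS error is not part of this excerpt, so the sketch below assumes a plain root-mean-square of the energy differences over all listed conformers; the second number in each cell (after the slash) is not addressed.

```python
import math

def rms_error(reference, model):
    """Root-mean-square difference between two equally long lists of energies."""
    return math.sqrt(sum((m - r) ** 2 for r, m in zip(reference, model)) / len(reference))

# Energy columns of the table above, kcal/mol.
ab_initio = [0.00, 0.88, 1.65]
pff = [-0.02, 0.91, 1.63]
opls = [-0.19, 0.90, 1.82]

pff_rms = round(rms_error(ab_initio, pff), 2)    # tabulated value: 0.02
opls_rms = round(rms_error(ab_initio, opls), 2)  # tabulated value: 0.15
```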

Conformer	Ab initio ^a	PFF	OPLS-AA/L^b
1	0.00	−0.29/7.0	0.15/6.2
2	1.72	1.96/2.2	1.82/3.7
3	2.26	2.43/6.2	2.79/6.7
4	3.18	2.83/3.5	2.84/6.9
5	4.79	5.03/3.5	4.36/4.9
RMS error^c	—	0.27/4.8	0.35/5.8

Conformer	Ab initio ^a	PFF	OPLS-AA/L^b
1	0.00	0.02/9.4	−0.16/8.8
2	3.49	3.46/8.0	3.64/26.2
RMS error^c	—	0.02/8.7	0.16/19.5

Conformer	Ab initio ^a	PFF	OPLS-AA/L^b
1	0.19	−0.30/8.5	0.30/6.1
2	0.46	−1.03/3.1	0.73/9.7
3	0.00	−0.72/4.5	−0.60/10.4
4	1.07	1.53/13.6	0.52/9.0
5	0.92	1.10/10.8	0.16/21.7
6	1.80	3.53/21.9	1.29/9.8
7	2.83	4.17/12.4	3.91/8.8
8	4.02	3.70/43.0	5.90/8.8
9	5.29	4.66/10.2	5.83/7.9
10	5.32	5.91/22.5	5.72/5.6
11	8.54	7.90/7.6	6.68/31.6
RMS error^c	—	0.92/18.0	0.96/13.9

Conformer	Ab initio ^a	PFF	OPLS-AA/L^b
1	0.00	−0.14/4.5	−0.63/5.0
2	0.19	0.78/14.6	0.31/4.1
3	2.41	1.31/8.9	0.77/7.3
4	2.95	3.96/10.1	4.18/10.7
5	3.26	3.96/5.6	4.18/3.9
6	3.45	3.54/11.9	3.88/10.4
7	4.90	5.25/7.1	4.48/49.6
8	5.48	3.96/45.0	5.45/4.3
RMS error^c	—	0.83/18.2	0.85/18.7

Conformer	Ab initio ^a	PFF	OPLS-AA/L 1^b	OPLS-AA/L 3^b
1	0.00	−0.45/3.0	0.41/2.5	0.53/2.5
2	0.81	1.25/4.1	0.15/10.8	0.25/10.5
3	0.77	0.65/5.0	0.38/6.2	0.14/6.0
4	1.23	1.03/3.6	1.33/5.1	1.32/4.1
5	1.28	1.05/7.1	1.00/3.4	1.23/2.7
6	2.01	2.02/4.8	2.05/4.6	1.83/4.5
7	2.91	2.59/7.4	3.16/8.8	2.95/8.8
8	3.27	3.42/5.1	3.60/2.5	3.70/1.4
9	3.63	4.32/4.3	3.80/5.6	3.93/5.6
RMS error^c	—	0.35/5.1	0.34/6.1	0.38/5.9

Conformer	Ab initio ^a	PFF	OPLS-AA/L 2^b	OPLS-AA/L 3^b
1	0.00	0.00/3.9	0.06/6.2	−0.20/6.5
2	0.35	0.36/2.1	0.24/3.2	0.36/3.3
3	0.69	0.67/7.6	0.74/12.8	0.87/12.9
RMS error^c	—	0.01/5.1	0.08/8.4	0.16/8.6

Conformer	Ab initio ^a	PFF	OPLS-AA/L^b
1	0.00	−0.31/4.6	0.64/3.5
2	2.95	3.46/3.6	2.26/5.1
3	2.49	2.47/5.8	2.47/3.9
4	1.88	0.81/9.3	1.28/3.9
5	3.06	3.07/4.1	2.64/4.3
6	2.07	2.35/2.8	2.17/3.1
7	3.56	4.17/5.0	4.55/5.3
RMS error^c	—	0.53/5.4	0.59/5.2

Conformer	Ab initio ^a	PFF	OPLS-AA/L^b
Minimum	0.00	1.72	1.78
+60°^c	3.18	3.65	2.86
−60°^c	2.99	2.56	3.87
+180°^c	12.45	10.69	10.12
RMS error^d	—	1.27	1.54

Conformer	Ab initio ^a	PFF	OPLS-AA/L^b
1	0.00	0.19/6.9	−0.01/7.2
2	0.15	0.56/14.2	0.38/6.5
3	1.30	1.68/16.4	2.16/11.6
4	1.65	2.01/2.1	1.72/4.7
5	2.18	0.94/38.4	2.56/9.8
6	2.22	2.43/6.2	2.05/5.2
7	3.26	2.94/34.9	2.19/49.0
8	2.91	2.94/11.0	2.56/48.2
9	3.41	3.39/3.5	3.48/13.7
RMS error^c	—	0.49/19.4	0.50/24.2

Conformer	Ab initio ^a	PFF	OPLS-AA/L 1^b	OPLS-AA/L 2^b
1	0.00	0.77/4.3	−0.22/8.1	0.20/8.2
2	2.81	3.11/7.6	2.46/3.1	2.48/3.5
3	3.72	3.20/13.9	2.05/2.7	2.00/3.3
4	5.25	6.14/8.5	6.64/11.5	6.26/11.5
5	5.45	5.88/8.9	5.69/4.0	5.82/4.2
6	5.99	5.11/7.6	6.03/6.4	5.49/6.1
7	7.52	6.61/8.9	8.10/7.7	8.60/8.7
RMS error^c	—	0.75/8.9	0.87/6.9	0.87/7.1

Conformer	Ab initio ^a	PFF	OPLS-AA/L^b
1	0.00	−0.07/4.8	−0.09/4.4
2	0.34	0.81/13.4	1.13/5.7
3	0.39	0.12/7.0	0.07/9.2
4	1.67	1.88/1.8	1.73/3.2
5	2.17	1.86/2.4	1.78/5.5
6	2.64	2.61/14.8	2.30/14.9
RMS error^c	—	0.27/8.9	0.39/8.1

Conformer	Ab initio ^a	PFF	OPLS-AA/L, Ver. 1^b	OPLS-AA/L, Ver. 2^b
1	5.40	6.41	5.63	7.74
2	0.00	−0.84	−0.08	2.43
3	3.72	3.54	3.57	3.80
RMS error^c	—	0.77	0.16	1.95

Conformer	Ab initio ^a	PFF	OPLS-AA/L^b
1	0.00	0.58	−1.28
2	7.89	8.67	7.89
3	3.68	4.23	3.19
4	14.09	13.27	13.62
5	7.20	4.93	6.05
6	12.79	11.47	12.60
7	10.95	13.45	14.55
RMS error^c	—	1.47	1.53

Conformer	Ab initio ^a	PFF	OPLS-AA/L^b
1	17.16	17.65	16.87
2	21.45	20.68	20.08
3	16.70	16.10	17.48
4	0.00	0.38	1.39
5	15.21	16.01	15.05
6	13.25	12.92	12.90
RMS error^c	—	0.59	0.88

Conformer	Ab initio ^a	PFF	OPLS-AA/L^b
1	0.00	1.66	0.94
2	4.86	5.50	5.40
3	0.31	−1.08	0.56
4	7.20	7.13	6.92
5	4.48	3.78	5.03
6	4.67	4.54	2.68
RMS error^c	—	0.97	0.97

Conformer	Ab initio ^a	PFF	OPLS-AA/L^b
1	0.00	−0.88	−0.80
2	10.76	11.78	9.72
3	3.29	2.29	1.95
4	13.87	13.82	15.73
5	8.58	9.65	9.21
6	4.25	3.47	4.93
RMS error^c	—	0.79	1.15

Peptide	PFF	OPLS-AA/L^a	MMFF94^a
Tetrapeptide
Alanine	0.69	0.56
Dipeptides
Alanine	0.35	0.27
Serine	0.35	0.44/0.34	0.97
Phenylalanine	0.02	0.15	0.21
Cysteine	0.27	0.35	1.21
Asparagine	0.02	0.16	2.25
Glutamine	0.92	0.96	1.00
Histidine	0.83	0.85	1.60
Leucine	0.35	0.34/0.38	1.27
Isoleucine	0.57	0.38	0.66
Valine	0.01	0.08/0.16	1.01
Methionine	0.53	0.59	1.05
Proline	1.27	1.54
Tryptophan	0.49	0.50	0.83
Threonine	0.76	0.87	1.15
Tyrosine	0.27	0.39	0.28
Average^b	0.43	0.47	1.04

Peptide	PFF	OPLS-AA/L^a
Aspartic acid	0.77	0.16/1.95
Glutamic acid	1.47	1.53
Lysine	0.59	0.88
Protonated histidine	0.97	0.97
Arginine	0.79	1.15
Average	0.92	0.94/1.29

Molecule	OPLS (no H atoms)	PFF (no H atoms)	OPLS (backbone only)	PFF (backbone only)
155c	2.22	2.37	1.86	2.07
1bp2	1.93	1.90	1.53	1.55
1cc5	1.92	2.22	1.65	1.91
1crn	2.00	1.83	1.60	1.63
1ctf	2.26	2.18	1.76	1.65
1fdx	2.67	2.43	2.33	1.94
1fx1	2.46	2.21	1.71	1.47
1gcn	4.14	3.77	2.88	2.00
1gcr	1.53	1.54	1.07	1.11
1paz	1.73	1.71	0.96	1.04
1pcy	1.78	1.77	1.05	1.05
1pgx	4.10	4.02	2.43	2.15
1ppt	2.20	1.90	1.07	1.70
1r69	1.68	1.70	1.12	1.13
1rnt	2.32	2.30	1.27	1.30
1sn3	2.56	2.28	1.52	1.17
1ubq	2.08	1.97	1.35	1.16
2cdv	3.41	3.53	2.25	2.37
2fxb	2.32	2.01	1.87	1.61
2gn5	2.57	2.53	2.10	2.06
2lzm	1.65	1.81	1.21	1.37
2ovo	1.74	1.76	1.53	1.55
2prk	1.26	1.33	1.03	1.10
2rn2	1.92	1.54	1.34	1.09
2sns	2.09	2.27	1.37	1.53
2ssi	1.93	1.89	1.49	1.48
351c	1.64	1.77	1.20	1.38
3adk	1.65	1.62	1.26	1.28
3c2c	2.11	2.02	1.29	1.16
3fxc	3.66	3.92	2.67	3.11
3icb	2.24	1.82	1.90	1.47
3wrp	1.93	1.98	1.30	1.28
4fdl	2.51	2.48	1.82	1.76
4fin	2.89	2.50	2.20	1.79
4fxn	2.25	1.91	1.38	1.07
4pti	2.41	2.30	1.66	1.58
5cpv	1.97	1.97	1.11	1.29
5fin	3.83	2.87	3.32	2.40
7rxn	1.73	1.74	1.26	1.31
Average	2.29	2.20	1.66	1.57

Conformer	Ab initio ^a	PFF	OPLS-AA/L^b
1	0.00	0.56/29.7	0.26/4.9
2	0.69	1.18/5.5	0.58/5.7
3	0.88	1.18/4.2	0.75/2.8
4	1.00	0.67/9.3	0.40/8.8
5	1.11	0.11/3.9	0.80/3.0
6	1.80	1.54/4.8	2.19/6.6
7	2.18	1.69/5.3	2.84/6.0
8	3.49	4.21/6.0	3.32/3.6
RMS error^c	—	0.88/11.8	0.38/5.5