Author manuscript; available in PMC: 2024 Sep 12.
Published in final edited form as: J Chem Theory Comput. 2024 Feb 23;20(5):2098–2110. doi: 10.1021/acs.jctc.3c01347

Assessment of Amino Acid Electrostatic Parameterizations of the Polarizable Gaussian Multipole Model

Shiji Zhao, Piotr Cieplak, Yong Duan §,*, Ray Luo ∇,*
PMCID: PMC11060985  NIHMSID: NIHMS1987986  PMID: 38394331

Abstract

Accurate parameterization of amino acids is pivotal for the development of reliable force fields for molecular modeling of biomolecules such as proteins. This study assesses amino acid electrostatic parameterizations with the polarizable Gaussian Multipole (pGM) model by evaluating the performance of the pGM-perm (with atomic permanent dipoles) and pGM-ind (without atomic permanent dipoles) variants compared to the traditional RESP model. The 100-conf-combterm fitting strategy on tetrapeptides was adopted, in which (1) all peptide bond atoms (-CO-NH-) share an identical set of parameters, and (2) the total charge of the two terminal N-acetyl (ACE) and N-methylamide (NME) groups was set to neutral. The accuracy and transferability of the electrostatic parameters were examined across peptides of varying lengths and real-world examples. The results demonstrate the enhanced performance of the pGM-perm model in representing the electrostatic properties of amino acids accurately. This insight underscores the potential of the pGM-perm model and the 100-conf-combterm strategy for the future development of the pGM force field.


Introduction

An essential aspect of force field development involves addressing atomic electrostatic interactions. In widely used point-charge additive force fields, electrostatic components are described by the interactions of fixed partial charges positioned at atom centers, following Coulomb’s law. A frequently employed technique for deriving atomic partial charges involves using least-squares fitting to reproduce the electrostatic potentials (ESPs) determined through quantum mechanical (QM) calculations at numerous grid points surrounding molecules.1–5 For macromolecules with thousands of atoms, such as proteins, QM calculations cannot be performed on the whole molecule directly, so the molecule must be broken into fragments of suitable sizes. For example, partial atomic charges for amino acids are often obtained from calculations on dipeptide fragments in specific representative conformations.6–8

To overcome the limited accuracy and transferability inherent in the ESP fitting approach, Bayly et al. developed the restrained electrostatic potential (RESP) algorithm for deriving atomic charges.9–10 Moreover, the application of a multiple-conformation fitting strategy has notably improved the transferability of ESP-fitted charges.11–12 By combining the multiple-conformation fitting strategy with the RESP approach, Cieplak et al. derived partial charges for ribonucleotides, deoxyribonucleotides, and amino acids, employing ESPs calculated at the HF/6-31G* level of theory. Subsequently, these charges were incorporated into the AMBER ff94 force field.6–7 Over time, the charge set from the ff94 force field has served as the cornerstone for several successive AMBER force fields. These include the AMBER ff99 force field,8 in which notable modifications occurred mainly in the torsional parameters while the charges were retained mostly unchanged. The AMBER SB (Stony Brook) family of force fields for protein modeling13–15 and the AMBER OL (Olomouc) family of force fields for nucleic acid modeling16–18 also draw on this foundational charge set.

In spite of the improved accuracy and transferability achieved by the point-charge additive force fields employing charge parameters obtained via the RESP algorithm, these force fields encounter a significant limitation: they cannot represent atomic polarization effects, that is, the redistribution of atomic electron density in response to the electric field generated by adjacent charged atoms.19 Polarization effects are significant across diverse biological processes such as protein-ligand interactions,20–22 interactions between nucleic acids and ions,23–24 and the transportation of ions through transmembrane ion channels.25–26 Consequently, a range of approaches have been proposed to integrate polarization effects into polarizable force field frameworks, including induced dipole models,27–34 fluctuating charge models,35–36 Drude oscillator models,37–38 and continuum dielectric models.39–40

The induced dipole model is an extensively explored polarizable model and has been integrated into various AMBER polarizable force fields, such as ff02,27 ff02rl,28 and ff12pol.29–32 In this model, the induced dipole μi of atom i, caused by the external electric field Ei originating from all atoms except i, is represented as follows:

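$$\boldsymbol{\mu}_i = \alpha_i \left[ \mathbf{E}_i - \sum_{j \neq i}^{n} \mathbf{T}_{ij}\,\boldsymbol{\mu}_j \right] \tag{1}$$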

Here, αi denotes the isotropic polarizability of atom i, and Tij stands for the dipole field tensor, given in matrix form by:

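$$\mathbf{T}_{ij} = \frac{f_e}{r_{ij}^{3}}\,\mathbf{I} - \frac{3 f_t}{r_{ij}^{5}} \begin{bmatrix} x^2 & xy & xz \\ xy & y^2 & yz \\ xz & yz & z^2 \end{bmatrix} \tag{2}$$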

In this equation, I is the identity matrix, while x, y and z denote the Cartesian components along the vector between atoms i and j at a distance rij. The components fe and ft represent distance-dependent damping functions that modify Tij to circumvent the problem termed the “polarization catastrophe,” where induced dipole moments become divergent due to cooperative induction between closely positioned dipoles.19, 41 Several damping schemes proposed by Thole42 have been integrated into the AMBER ff12pol force field.29–32 However, Thole’s approaches have a drawback in that they solely address interactions between induced dipoles, leading to an inconsistent treatment of polarizations arising from fixed charges and permanent multipoles. Elking et al. later proposed an alternative damping scheme for modeling atomic electric multipoles using Gaussian distribution functions,43–45 which was named the polarizable Gaussian Multipole (pGM) model.46–52 The pGM model rectifies the limitation of Thole’s strategies by uniformly screening all short-range electrostatic interactions in a physically consistent manner, including charge-charge, charge-dipole, charge-quadrupole, and dipole-dipole interactions. The formulas for the damping functions fe and ft in the pGM model are as follows:

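$$\begin{aligned} S_{ij} &= \frac{\beta_i \beta_j\, r_{ij}}{\sqrt{2\left(\beta_i^{2} + \beta_j^{2}\right)}} = \frac{r_{ij}}{\sqrt{2\left(R_i^{2} + R_j^{2}\right)}} \\ f_e &= \operatorname{erf}(S_{ij}) - \frac{2}{\sqrt{\pi}}\, S_{ij} \exp\left(-S_{ij}^{2}\right) \\ f_t &= \operatorname{erf}(S_{ij}) - \frac{2}{\sqrt{\pi}}\, S_{ij} \exp\left(-S_{ij}^{2}\right) \left(1 + \frac{2}{3} S_{ij}^{2}\right) \end{aligned} \tag{3}$$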

In this equation, βi is the inverse of the pGM “radius” Ri of the Gaussian distribution function for atom i, calculated as βi = 1/Ri. The values of αi and the pGM “radius” Ri can be found at https://github.com/ShijiZ/PyRESP. The term erf(Sij) stands for the error function of Sij.
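As a minimal sketch of eqs 2 and 3, the damping factors and the damped dipole field tensor can be evaluated with the Python standard library alone. The function names and the example radii and distances in the usage note are illustrative assumptions, not values or code taken from PyRESP.

```python
import math


def pgm_damping(r_ij, R_i, R_j):
    """Return (S_ij, f_e, f_t) of eq 3 for a distance and two pGM radii."""
    s = r_ij / math.sqrt(2.0 * (R_i ** 2 + R_j ** 2))
    gauss = 2.0 / math.sqrt(math.pi) * s * math.exp(-s * s)
    f_e = math.erf(s) - gauss
    f_t = math.erf(s) - gauss * (1.0 + 2.0 * s * s / 3.0)
    return s, f_e, f_t


def dipole_field_tensor(dx, dy, dz, f_e=1.0, f_t=1.0):
    """Return T_ij of eq 2 as a 3x3 nested list.

    (dx, dy, dz) are the Cartesian components of the interatomic vector.
    The defaults f_e = f_t = 1 give the undamped tensor.
    """
    d = (dx, dy, dz)
    r = math.sqrt(dx * dx + dy * dy + dz * dz)
    return [
        [(f_e / r ** 3 if a == b else 0.0) - 3.0 * f_t * d[a] * d[b] / r ** 5
         for b in range(3)]
        for a in range(3)
    ]
```

For example, `pgm_damping(1.5, 1.0, 1.0)` (hypothetical radii) returns factors between 0 and 1 that can be passed to `dipole_field_tensor`; both factors approach 1 at large separations, where the tensor reduces to the undamped point-dipole form.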

In the present design of the pGM model, monopoles and induced dipoles are consistently included, while atomic permanent dipoles can be included optionally, resulting in two distinct pGM models, pGM-ind and pGM-perm. Within the pGM-ind model without atomic permanent dipoles, atomic dipoles are solely described by atomic induced dipoles. Within the pGM-perm model incorporating atomic permanent dipoles, atomic dipoles arise from both induced and permanent dipoles. To address the tendency of atomic permanent dipole moments to align with covalent bonding directions, a local reference frame based on covalent basis vectors (CBVs) has been introduced for the pGM-perm model. CBVs, unit vectors oriented along covalent bond directions, ensure that atomic permanent dipoles in the pGM-perm model consistently align with these bond directions.47 Consequently, the electric field Ei at atom i’s position in eq 1 has different formulas for the two pGM models. Within the pGM-ind model, Ei emerges exclusively from the monopoles of all atoms other than i, and its formula is shown in eq 4. In contrast, within the pGM-perm model, the Ei arises from both the monopoles and permanent dipoles of all atoms excluding i, whose formula is described in eq 5.

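$$\mathbf{E}_i = \sum_{j \neq i}^{n} f_e\, \frac{q_j}{r_{ij}^{3}}\, \mathbf{r}_{ji} \tag{4}$$

$$\mathbf{E}_i = \sum_{j \neq i}^{n} \left( f_e\, \frac{q_j}{r_{ij}^{3}}\, \mathbf{r}_{ji} + \mathbf{T}_{ij}\, \mathbf{p}_j \right) \tag{5}$$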

In these equations, qj represents the monopole of atom j, pj represents the permanent dipole of atom j in the global frame, and rji denotes the unit vector pointing from atom j to atom i.
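Because the induced dipoles appear on both sides of eq 1, they must be solved for self-consistently. The sketch below iterates eq 1 for a given set of fields Ei, which are supplied as inputs rather than computed from eq 4 or eq 5. It assumes the undamped tensor (fe = ft = 1) and arbitrary units; the function names are hypothetical and this is not the algorithm implemented in SANDER.

```python
def tensor(d, f_e=1.0, f_t=1.0):
    """Dipole field tensor of eq 2 for the displacement vector d (length 3)."""
    r = sum(c * c for c in d) ** 0.5
    return [
        [(f_e / r ** 3 if a == b else 0.0) - 3.0 * f_t * d[a] * d[b] / r ** 5
         for b in range(3)]
        for a in range(3)
    ]


def solve_induced_dipoles(coords, alphas, fields, n_iter=200, tol=1e-12):
    """Jacobi iteration of eq 1: mu_i = alpha_i * (E_i - sum_{j != i} T_ij mu_j)."""
    n = len(coords)
    # Start from the non-interacting guess mu_i = alpha_i * E_i.
    mu = [[alphas[i] * fields[i][a] for a in range(3)] for i in range(n)]
    for _ in range(n_iter):
        new_mu = []
        for i in range(n):
            total = list(fields[i])
            for j in range(n):
                if j == i:
                    continue
                d = [coords[i][a] - coords[j][a] for a in range(3)]
                T = tensor(d)
                for a in range(3):
                    total[a] -= sum(T[a][b] * mu[j][b] for b in range(3))
            new_mu.append([alphas[i] * t for t in total])
        change = max(abs(new_mu[i][a] - mu[i][a])
                     for i in range(n) for a in range(3))
        mu = new_mu
        if change < tol:
            break
    return mu


# Two identical atoms on the z axis in a uniform field along z.
example = solve_induced_dipoles(
    coords=[(0.0, 0.0, 0.0), (0.0, 0.0, 3.0)],
    alphas=[1.0, 1.0],
    fields=[(0.0, 0.0, 1.0), (0.0, 0.0, 1.0)],
)
```

For this two-atom case eq 1 can be solved by hand: the z component satisfies μ = α(E + 2μ/r³), so μ = αE/(1 − 2α/r³) = 27/25, which the iteration reproduces. The same expression diverges as r³ approaches 2α, which is the polarization catastrophe that the damping functions are designed to avoid.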

Recently, the programs PyRESP_GEN and PyRESP have been developed for pGM model electrostatic parameterizations.52–53 They extend the RESP program to accommodate both permanent and induced atomic dipoles and have been integrated into the AmberTools package.54–55 An efficient simulation algorithm for the pGM model has also been incorporated into the SANDER program and is presently being adapted for use on parallel and GPU platforms through the PMEMD program.47–48, 51 Numerous previous studies have highlighted the accuracy of the pGM models in modeling molecular properties, including molecular polarizability anisotropy,46 ESPs surrounding molecules,53 molecular electric moments,53 interaction energies and many-body interaction energies.49 Furthermore, the pGM model has been shown to possess transferability from training molecules and conformations to testing molecules (such as oligomers, molecule complexes or polymers) and different conformations.50

Amino acids are the fundamental building blocks of proteins, and the parameterization of amino acids is key to the wide application of the pGM models for molecular modeling. In this work, we explored the parameterization strategies of the pGM models for all standard amino acids, in the form of tetrapeptides. In addition, the performance of the pGM models in reproducing the electrostatic properties of amino acids was assessed in comparison with the RESP model. Finally, we assessed the transferability of electrostatic parameters of the pGM models derived from fitting tetrapeptides to designed penta-, hepta-, and nona-peptides, as well as real-world peptides.

Computational Details

Data Sets, Geometry Preparations, and ESP Calculations

A total of seven data sets were used in this work: Tet-badz, Tet-batz, Tet-matz, Penta, Hepta, Nona, and Eaton. Among the data sets, Tet-badz and Tet-batz were used for electrostatic parameterizations, and all other data sets were used to test the transferability of the derived parameters.

Each of the Tet-badz, Tet-batz, and Tet-matz data sets comprises 20 tetrapeptides (ACE-ALA-X-ALA-NME), where X corresponds to the 20 standard amino acids. For acidic and basic amino acids (ASP, GLU, ARG, LYS), the charged forms are used. For histidine, the epsilon-nitrogen-protonated form (HIE) is used. Each amino acid residue X is flanked by alanine (ALA) residues on both sides and capped by the N-acetyl (ACE) and N-methylamide (NME) groups at the N-terminus and C-terminus, respectively. Both the ALAs and ACE/NME are used to mimic the chemical environment within peptides, which is necessary for electrostatic parameterizations with ideal transferability to longer peptides, as illustrated in our previous work.50 Five conformations were prepared for each tetrapeptide, namely the right-handed helix (αR), left-handed helix (αL), β sheet (β), anti-β sheet (aβ), and polyproline II (pII) conformations, leading to 20 (tetrapeptide molecules) × 5 (conformations) = 100 entries in total. The initial coordinates of the 100 entries were obtained from the work of Jiang et al.56 and were optimized at the MP2/6-311++G(d, p) level of theory with the mainchain torsional angles fixed at (ϕ, ψ) = (−140°, 135°), (57°, 47°), (−57°, −47°), (−119°, 113°), and (−79°, 150°), corresponding to the aβ, αL, αR, β, and pII conformations, respectively. The tetrapeptide molecules of the Tet-badz, Tet-batz, and Tet-matz data sets share identical geometries, and the three data sets differ only in the QM level of theory used to calculate the QM ESPs surrounding the molecules. As indicated by the data set names, the B3LYP/aug-cc-pVDZ level of theory was used for molecules in Tet-badz, B3LYP/aug-cc-pVTZ for molecules in Tet-batz, and MP2/aug-cc-pVTZ for molecules in Tet-matz.

The Penta, Hepta, and Nona data sets comprise pentapeptides (ACE-ALA-XY-ALA-NME), heptapeptides (ACE-ALA-(XY)2-ALA-NME), and nonapeptides (ACE-ALA-(XY)3-ALA-NME), respectively, where XY are combinations of the following 6 different amino acid residues: ALA (nonpolar), ASP (negatively charged), PHE (aromatic), LYS (positively charged), LEU (nonpolar), SER (polar). As in the Tet-badz, Tet-batz, and Tet-matz data sets, both the ALAs and ACE/NME are used to mimic the chemical environment within peptides. By design, each of the Penta, Hepta, and Nona data sets contains 6² = 36 different polypeptide molecules. The geometry of each polypeptide was constructed from the above tetrapeptide geometries optimized at the MP2/6-311++G(d, p) level of theory. By rigid-body rotation and translation, polypeptides in the aβ, αL, αR, β, and pII conformations with the mainchain torsional angles (ϕ, ψ) = (−140°, 135°), (57°, 47°), (−57°, −47°), (−119°, 113°), and (−79°, 150°), respectively, were constructed, leading to 36 (polypeptide molecules) × 5 (conformations) = 180 entries in total for each data set. The great majority of these constructed polypeptides were free from steric clashes, and those with serious steric clashes were excluded from analysis. The included conformations for the Penta, Hepta, and Nona data sets are listed in Tables S1, S2, and S3, respectively. The QM ESPs surrounding molecules in the Penta, Hepta, and Nona data sets were calculated at the B3LYP/aug-cc-pVTZ level of theory.
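The composition of these three data sets can be enumerated directly from the description above. The following sketch only builds sequence strings and counts entries; the helper names and the ASCII conformation labels are illustrative assumptions, and no geometries are generated.

```python
from itertools import product

RESIDUES = ["ALA", "ASP", "PHE", "LYS", "LEU", "SER"]

# (phi, psi) in degrees for each conformation, as listed in the text.
# ASCII labels stand in for the Greek names used in the paper.
PHI_PSI = {
    "a-beta": (-140, 135),
    "alpha-L": (57, 47),
    "alpha-R": (-57, -47),
    "beta": (-119, 113),
    "pII": (-79, 150),
}


def build_sequence(x, y, repeats):
    """Return ACE-ALA-(XY)n-ALA-NME as a dash-separated string."""
    core = [x, y] * repeats
    return "-".join(["ACE", "ALA", *core, "ALA", "NME"])


def build_data_set(repeats):
    """All 6 x 6 ordered XY combinations for a given number of XY repeats."""
    return [build_sequence(x, y, repeats)
            for x, y in product(RESIDUES, repeat=2)]


penta = build_data_set(1)
hepta = build_data_set(2)
nona = build_data_set(3)

# Entries per data set before removing sterically clashing conformations.
entries_per_data_set = len(penta) * len(PHI_PSI)
```

Each list holds 36 sequences, so each data set has 36 × 5 = 180 candidate entries before the clashing conformations are removed.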

The Eaton data set is comprised of nonapeptides (ACE-X8-NME), where X8 are amino acid sequences from the N-terminal or C-terminal 8 amino acids of peptides used in the work of Kubelka et al.57 The selected nonapeptides are shown in Table S4. These “Eaton” peptides in aβ, αL, αR, β, and pII conformations with the mainchain torsional angles (ϕ, ψ) = (−140°, 135°), (57°, 47°), (−57°, −47°), (−119°, 113°), and (−79°, 150°) respectively were constructed from tetrapeptides by rigid-body translation and rotation. ACE/NME terminal blocking groups are used to mimic the chemical environment within peptides. Some of the constructed peptides have serious steric sidechain clashes and are excluded from analysis. The QM ESPs surrounding molecules in the Eaton data set were calculated with B3LYP/aug-cc-pVDZ level of theory.

For calculating QM ESPs surrounding molecules, the grid points were generated using the strategy developed by Singh et al. on molecular surfaces (with a density of 6 points/Å²) at each of 1.4, 1.6, 1.8 and 2.0 times the van der Waals radii.58–59 All QM calculations were performed using the Gaussian 16 software.60

Electrostatic Parameterizations

As stated above, only the Tet-badz and Tet-batz data sets were used for parameterizations, each containing 100 entries (20 tetrapeptide molecules × 5 conformations). For the polarizable models pGM-ind and pGM-perm, the isotropic atomic polarizabilities derived in our previous work were used to calculate the induced dipoles.46 Since PyRESP provides the functionality of multiple-conformation fitting,53 we designed four parameterization strategies: 2-conf, fitting 2 conformations (aβ and αR) of each amino acid, as suggested by our previous study;50 5-conf, fitting 5 conformations (aβ, αL, αR, β, and pII) of each amino acid; 40-conf, fitting all 20 (tetrapeptide molecules) × 2 (aβ and αR conformations) = 40 entries together; and 100-conf, fitting all 20 (tetrapeptide molecules) × 5 (aβ, αL, αR, β, and pII conformations) = 100 entries together. Therefore, for 2-conf and 5-conf fittings, 20 independent fittings need to be performed to parameterize all 20 amino acids, whereas for 40-conf and 100-conf fittings, a single fitting parameterizes all 20 amino acids at the same time. Another functionality provided by PyRESP is constrained fitting, which allows one to specify the total charge of molecular fragments. For all four parameterization strategies, each amino acid residue (-NH-CHR-CO-) needs to be constrained to have the correct net charge, e.g., ARG or LYS need to have a net charge of +1 a.u., and ASP or GLU need to have a net charge of −1 a.u. For the ACE and NME capping groups, we tested two parameterization strategies named septerm (separated terminal group neutralization) and combterm (combined terminal group neutralization). For the septerm strategy, each of the ACE and NME groups is constrained to have 0 charge. For the combterm strategy, no such constraint was applied, so that the sum of the charges of ACE and NME is 0, but each group may have a nonzero charge. Both strategies ensure the correct total charge of each tetrapeptide. Therefore, there are 8 combined parameterization strategies in total: 2-conf-combterm, 2-conf-septerm, 5-conf-combterm, 5-conf-septerm, 40-conf-combterm, 40-conf-septerm, 100-conf-combterm, and 100-conf-septerm.
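The bookkeeping behind the eight strategies can be summarized in a few lines. This is only an illustrative sketch: the dictionary layout and function names are assumptions and do not correspond to PyRESP or PyRESP_GEN input syntax.

```python
from itertools import product

# (tetrapeptide molecules per fit, conformations per molecule)
CONF_STRATEGIES = {
    "2-conf": (1, 2),
    "5-conf": (1, 5),
    "40-conf": (20, 2),
    "100-conf": (20, 5),
}
TERMINAL_STRATEGIES = ["combterm", "septerm"]

strategies = [f"{conf}-{term}"
              for conf, term in product(CONF_STRATEGIES, TERMINAL_STRATEGIES)]


def entries_per_fit(conf):
    """Number of molecule/conformation entries fitted together."""
    molecules, conformations = CONF_STRATEGIES[conf]
    return molecules * conformations


def fits_needed(conf, n_amino_acids=20):
    """Number of independent fittings required to cover all amino acids."""
    molecules, _ = CONF_STRATEGIES[conf]
    return n_amino_acids // molecules


def terminal_charge_constraints(term):
    """Fragments whose summed charge is constrained, with the target charge."""
    if term == "septerm":
        return [(("ACE",), 0.0), (("NME",), 0.0)]
    if term == "combterm":
        # Implied by the residue constraints plus the total molecular charge.
        return [(("ACE", "NME"), 0.0)]
    raise ValueError(f"unknown terminal strategy: {term}")
```

Under combterm the single combined constraint follows from the residue constraints and the total molecular charge, which is why the text describes it as applying no separate terminal constraint.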

A modified two-stage parameterization procedure was adopted for the RESP, pGM-ind, and pGM-perm models.52–53 Following the original RESP formulation,9–10 a hyperbolic restraining function $\chi^2 = a \sum_{i=1}^{n} \left( \sqrt{q_i^2 + b^2} - b \right)$ was applied, with b = 0.1 and a the restraining strength. For all three models, weak (a = 0.0005) and strong (a = 0.001) restraining strengths were applied to charges in the first and second stages, respectively. For the pGM-perm model, weak (a = 0.00005) and stronger (a = 0.0001) restraining strengths were applied to permanent dipoles in the first and second stages, respectively.53 In the first stage, all charges (and permanent dipoles) were free to change. In the second stage, only the charges (and permanent dipoles) of the methyl groups (CH3-) of the ACE and NME groups were refitted, and all other fitting centers were frozen at the values obtained from the first stage. In both stages, the restraints were applied to non-hydrogen heavy atoms only. Because proteins in solution are usually terminated by basic amine (-NH3+) and acidic carboxyl (-COO−) groups, whose parameters will be developed separately for the individual amino acids, the terminal ACE and NME groups play insignificant roles in protein modeling. Therefore, the strategy used in this work is more appropriate for developing parameters for proteins than for small peptides. For the parameterizations of the pGM-ind and pGM-perm models, both 1-2 and 1-3 polarization interactions were included in both stages for reasons elucidated before.46, 61
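The hyperbolic restraint and the stage-dependent strengths quoted above can be written out as a short sketch. Only the penalty term is shown, not the least-squares fit it is added to; the function and dictionary names are hypothetical.

```python
import math

# Restraining strengths a for (first stage, second stage), from the text.
RESTRAINT_STRENGTHS = {
    "charge": (0.0005, 0.001),
    "permanent_dipole": (0.00005, 0.0001),
}


def hyperbolic_restraint(values, a, b=0.1):
    """Penalty a * sum_i (sqrt(q_i^2 + b^2) - b) over the restrained values.

    In the procedure described above, only values on non-hydrogen heavy
    atoms would be passed in.
    """
    return a * sum(math.sqrt(v * v + b * b) - b for v in values)


stage1 = hyperbolic_restraint([0.5, -0.3], RESTRAINT_STRENGTHS["charge"][0])
stage2 = hyperbolic_restraint([0.5, -0.3], RESTRAINT_STRENGTHS["charge"][1])
```

The penalty vanishes when all restrained values are zero and grows roughly linearly once |q| is much larger than b, so large charges are penalized less severely than they would be by a harmonic restraint.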

Intra-molecular equivalencing was applied in both stages.52–53 In the first stage, the following five types of intra-molecular equivalencing were enforced. First, for each tetrapeptide (ACE-ALA-X-ALA-NME) except when X is PRO, all four sets of peptide bond atoms (-CO-NH-) were equivalenced. Second, due to the cyclic structure of proline, when X is PRO, only three intact sets of peptide bond atoms (-CO-NH-) are present in the tetrapeptide and were equivalenced. The carbonyl group atoms (-CO-) of PRO were equivalenced with the other three carbonyl groups forming peptide bonds. Third, for each tetrapeptide (ACE-ALA-X-ALA-NME) except when X is ALA, both ALAs present in the tetrapeptide were equivalenced. Fourth, when X is ALA, all three ALAs present in the tetrapeptide were equivalenced. Fifth, except for those in the ACE and NME groups, all other charges (and permanent dipoles) sharing an identical chemical environment were equivalenced, such as the hydrogens in -CH3, -CH2-, and -NH2 groups. In the second stage, intra-molecular equivalencing was applied only to the methyl hydrogens of the ACE and NME groups, and all other fitting centers were frozen at the values obtained from the first stage.

Inter-molecular equivalencing was also applied in both stages to ensure identical charges and permanent dipoles of the same atom in different conformations or molecules. For 2-conf and 5-conf fittings, the following three types of inter-molecular equivalencing were enforced in the first stage: First, for all conformations of each tetrapeptide (ACE-ALA-X-ALA-NME), the peptide bond atoms (-CO-NH-) between ACE and the N-terminal ALA were equivalenced. Together with the intra-molecular equivalencing on peptide bond atoms above, for 2-conf fittings, all 4 × 2 = 8 (or 3 × 2 = 6 when X is PRO) sets of peptide bond atoms were equivalenced; for 5-conf fittings, all 4 × 5 = 20 (or 3 × 5 = 15 when X is PRO) sets of peptide bond atoms were equivalenced. Second, for all conformations of each tetrapeptide (ACE-ALA-X-ALA-NME), the N-terminal ALAs were equivalenced. Together with the intra-molecular equivalencing on ALAs above, for 2-conf fittings, all 2 × 2 = 4 (or 3 × 2 = 6 when X is ALA) ALAs were equivalenced; for 5-conf fittings, all 2 × 5 = 10 (or 3 × 5 = 15 when X is ALA) ALAs were equivalenced. Third, for each tetrapeptide (ACE-ALA-X-ALA-NME) except when X is ALA, the two central amino acids X for 2-conf fittings or the five central amino acids X for 5-conf fittings were equivalenced. In the second stage of 2-conf fittings, the two ACEs and NMEs of both conformations of each tetrapeptide (ACE-ALA-X-ALA-NME) were equivalenced separately. Similarly, in the second stage of 5-conf fittings, the five ACEs and NMEs of five conformations of each tetrapeptide (ACE-ALA-X-ALA-NME) were equivalenced separately.

For 40-conf fittings and 100-conf fittings, the following three types of inter-molecular equivalencing were enforced in the first stage. First, the peptide bond atoms (-CO-NH-) between ACE and the N-terminal ALA were equivalenced for all 40 entries of 40-conf fittings, and for all 100 entries of 100-conf fittings. Together with the intra-molecular equivalencing on peptide bond atoms above, for 40-conf fittings, all 4 × 2 × 20 − 2 (when X is PRO) = 158 sets of peptide bond atoms were equivalenced; for 100-conf fittings, all 4 × 5 × 20 − 5 (when X is PRO) = 395 sets of peptide bond atoms were equivalenced. Second, the N-terminal ALAs were equivalenced for all 40 entries of 40-conf fittings and for all 100 entries of 100-conf fittings. Together with the intra-molecular equivalencing on ALAs above, for 40-conf fittings, all 2 × 2 × 20 + 2 (when X is ALA) = 82 ALAs were equivalenced; for 100-conf fittings, all 2 × 5 × 20 + 5 (when X is ALA) = 205 ALAs were equivalenced. Third, as in 2-conf and 5-conf fittings, for each tetrapeptide (ACE-ALA-X-ALA-NME) except when X is ALA, the two central amino acids X for 40-conf fittings or the five central amino acids X for 100-conf fittings were equivalenced. In the second stage of 40-conf fittings, the 40 ACEs and 40 NMEs of the 40 tetrapeptide (ACE-ALA-X-ALA-NME) entries were equivalenced separately. Similarly, in the second stage of 100-conf fittings, the 100 ACEs and 100 NMEs of the 100 tetrapeptide (ACE-ALA-X-ALA-NME) entries were equivalenced separately.
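The counts of equivalenced groups quoted above follow from simple arithmetic, reproduced here as a check. The function names are illustrative only.

```python
def equivalenced_peptide_bonds(n_conf, n_molecules=20, bonds_per_molecule=4):
    """Sets of -CO-NH- atoms sharing one parameter set in a combined fit.

    The X = PRO tetrapeptide has only three intact peptide bonds, so one
    set is subtracted for each of its conformations.
    """
    return bonds_per_molecule * n_conf * n_molecules - n_conf


def equivalenced_alanines(n_conf, n_molecules=20, flanking_ala=2):
    """ALA residues sharing one parameter set in a combined fit.

    The X = ALA tetrapeptide contributes a third ALA in each conformation.
    """
    return flanking_ala * n_conf * n_molecules + n_conf


counts = {
    "40-conf": (equivalenced_peptide_bonds(2), equivalenced_alanines(2)),
    "100-conf": (equivalenced_peptide_bonds(5), equivalenced_alanines(5)),
}
```

Evaluating `counts` gives (158, 82) for 40-conf fittings and (395, 205) for 100-conf fittings, matching the numbers in the text.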

The recently developed PyRESP_GEN program was used to prepare input files for parameterizing molecules from each data set for each electrostatic model.52 Because Python is an interpreted language and PyRESP is correspondingly slow, it is impractical to perform the 40-conf and 100-conf fittings described above using the PyRESP program. Therefore, an in-house Fortran program named POLRESP, translated directly from the original PyRESP program, was used to carry out all parameterization work.

Parameterization Quality Assessment

The quality and transferability of electrostatic parameters of all electrostatic models were measured by the root-mean-square-errors (RMSEs) and the relative-root-mean-square-errors (RRMSEs) of ESPs of each peptide, given by

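$$\mathrm{RMSE} = \sqrt{\frac{\sum_{i=1}^{N} \left( V_i^{\mathrm{QM}} - V_i \right)^{2}}{N}} \tag{6}$$

$$\mathrm{RRMSE} = \sqrt{\frac{\sum_{i=1}^{N} \left( V_i^{\mathrm{QM}} - V_i \right)^{2}}{\sum_{i=1}^{N} \left( V_i^{\mathrm{QM}} \right)^{2}}} \tag{7}$$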

where N is the total number of ESP points surrounding each peptide in all conformations; ViQM and Vi are the ESP values at point i given by QM calculations and MM calculations, respectively. The RMSEs and RRMSEs are measured for each peptide so that we can compare the performances of electrostatic models on different amino acid types.
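Eqs 6 and 7 translate directly into code. The sketch below assumes that the QM and model ESP values are supplied as equal-length sequences over the same grid points; it is not the PyRESP implementation, and the sample values are made up for illustration.

```python
import math


def rmse(v_qm, v_mm):
    """Root-mean-square error of eq 6."""
    if len(v_qm) != len(v_mm) or not v_qm:
        raise ValueError("ESP lists must be non-empty and of equal length")
    squared_error = sum((q - m) ** 2 for q, m in zip(v_qm, v_mm))
    return math.sqrt(squared_error / len(v_qm))


def rrmse(v_qm, v_mm):
    """Relative root-mean-square error of eq 7."""
    if len(v_qm) != len(v_mm) or not v_qm:
        raise ValueError("ESP lists must be non-empty and of equal length")
    squared_error = sum((q - m) ** 2 for q, m in zip(v_qm, v_mm))
    return math.sqrt(squared_error / sum(q * q for q in v_qm))


# Hypothetical ESP values at four grid points.
v_qm = [0.30, -0.40, 0.10, 0.20]
v_mm = [0.28, -0.37, 0.12, 0.21]
errors = (rmse(v_qm, v_mm), rrmse(v_qm, v_mm))
```

A model that predicts zero potential everywhere has an RRMSE of exactly 1, which gives the relative measure a natural scale.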

To calculate ESP values surrounding molecule A with the electrostatic parameters transferred from the parameterization of molecule B, the input file (-i) and qin file (-q) of molecule A are created using parameters derived for molecule B, which are provided as the inputs for the PyRESP program. The control parameter irstrnt of the PyRESP program is set to 2 so that no parameterization on molecule A is carried out. The RMSEs and RRMSEs of molecule A with the transferred parameters from molecule B are printed in the output file (-o).53

All bar plots were plotted using the Python package Matplotlib.

Results

Comparison of Parameters from 2-conf and 40-conf Fittings

In the initial attempt, we performed 2-conf and 40-conf fittings on amino acids in the aβ and αR conformations from the Tet-batz data set. For 2-conf fittings, inter-molecular equivalencing of peptide bond atoms (-CO-NH-) was applied only to the two conformers (aβ and αR) of each tetrapeptide, so that peptide bond parameters can differ between tetrapeptides. For 40-conf fittings, inter-molecular equivalencing of peptide bond atoms was applied to all 40 entries, leading to consistent parameters for these important atoms. Therefore, comparison between the two strategies can shed light on whether unique peptide bond parameters are needed for different amino acids as in Amber ff03,62 or two sets of peptide bond parameters for charged and neutral amino acids as in Amber ff94,7 or a uniform set of peptide bond parameters shared by all amino acids.

The RMSEs and RRMSEs of the second stage fittings are shown in Figure 1. The RMSEs range from 0 to 0.005, and the RRMSEs range from 0 to 0.175. However, within each amino acid, the RMSEs and RRMSEs produced by the 12 models show identical patterns. This is because the formulas of RMSE (eq 6) and RRMSE (eq 7) only differ in the denominators (the number of ESP points vs. the sum of squares of all ESP values). One consequence is that for charged amino acid species (ARG, ASP, GLU, LYS), the RRMSEs of each model are significantly lower than those of uncharged amino acid species, due to the presence of very high ESP values in their denominators. Other than that, it is sufficient to only focus on RMSEs when comparing the performance of the 12 models.
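The relationship between the two measures can be made explicit. Dividing eq 7 by eq 6 gives a factor that depends only on the QM reference values, which is a direct consequence of the two definitions:

$$\frac{\mathrm{RRMSE}}{\mathrm{RMSE}} = \sqrt{\frac{N}{\sum_{i=1}^{N} \left( V_i^{\mathrm{QM}} \right)^{2}}} = \frac{1}{\sqrt{\frac{1}{N}\sum_{i=1}^{N} \left( V_i^{\mathrm{QM}} \right)^{2}}}$$

For a given amino acid, all models are evaluated against the same QM ESP points, so this factor is a constant and the ranking of the models is the same under both measures. For charged species the root-mean-square QM potential in the denominator is large, which lowers the RRMSE relative to uncharged species without changing the ordering of the models.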

Figure 1.

RMSEs and RRMSEs of the second stage of 2-conf and 40-conf fittings on amino acids in aβ and αR conformations from the Tet-batz data set. The models are labeled as follows. Prefix 2: 2-conf fitting; Prefix 40: 40-conf fittings; Infix resp: The RESP model; Infix ind: The pGM-ind model; Infix perm: The pGM-perm model; Suffixes combterm and septerm: see Electrostatic Parameterizations section.

One general trend is that the RMSEs of the pGM-perm model are consistently lower than those of the RESP and pGM-ind models. However, despite its consideration of atomic induced dipoles, the pGM-ind model did not always outperform the RESP model, except for charged amino acid species. The second general trend is that for the pGM-ind and pGM-perm models, the RMSEs of 40-conf fittings are comparable to those of 2-conf fittings. The third general trend is that for the pGM-ind and pGM-perm models, fittings with combined terminal group neutralization (combterm) give higher RMSEs than those with separate terminal group neutralization (septerm). This unexpected trend might be explained by the modified fitting procedure, in which only the parameters of the methyl groups of the ACE and NME groups were refitted in the second stage. For the combterm strategy, each of the ACE and NME groups may have a nonzero charge, and the refitting that forces intra-molecular equivalence of the hydrogens of the two groups leads to slightly higher errors in the terminal groups than for the septerm strategy. In contrast to the second stage, lower RMSEs for the combterm strategy are observed in the first-stage fitting (Figure S1).

Parameters from 40-conf Fittings Show Unsatisfactory Transferability to Other Conformations

We tested whether the parameters from 40-conf fitting transfer well to amino acids in all 5 conformations by comparing their RMSEs and RRMSEs, when transferred to the entire Tet-batz data set, with those produced by 5-conf and 100-conf fittings. For 100-conf fitting, all 100 entries, comprising 20 amino acids in all 5 conformations (aβ, αL, αR, β, and pII), were combined for parameterization. In contrast, for 5-conf fittings, each amino acid was fit individually by combining only its own five conformers. The RMSEs of the second stage of 5-conf, 100-conf, and 40-conf fittings on the entire Tet-batz data set are shown in Figure 2. Once again, for each amino acid, the RRMSEs show patterns identical to those of the RMSEs (Figure S2), although the RRMSEs of charged amino acid species are significantly lower than those of uncharged species. The RMSEs produced by the pGM-perm model are consistently lower than those of the RESP and pGM-ind models, even for the transferred parameters from 40-conf fittings. For the pGM-ind and pGM-perm models, the RMSEs produced by the 100-conf combined fitting are comparable to those of 5-conf combined fittings. An advantage of 100-conf combined fitting is that it allows uniform peptide bond parameters across all amino acids and simplifies the treatment of this crucial group. In comparison, the 5-conf fittings require each amino acid to have individualized peptide bond parameters that can differ from those of other amino acids. Due to the modified second-stage fitting method, in which only terminal groups were refitted, pGM-ind models with combined terminal group neutralization (combterm) give higher RMSEs than those with separate terminal group neutralization (septerm), although the RMSEs of the first stage of 5-conf and 100-conf fittings (Figure S3) show the expected opposite trend. In contrast, the second-stage fittings of pGM-perm models show the expected trend, in which the RMSEs of the combterm strategy are consistently lower than those of the septerm strategy, indicating the robustness of the pGM-perm model.

Figure 2.

RMSEs of the second stage of 5-conf, 100-conf, and 40-conf fittings on the entire Tet-batz data set. The models are labeled as follows. Prefix 5: 5-conf fitting; Prefix 100: 100-conf fittings; Prefix 40-5: 40-conf fitting (transferred to all 5 conformations); Infixes and suffixes: see Figure 1 for notations.

Compared to the RMSEs of parameters from 5-conf and 100-conf fittings, the transferred parameters from 40-conf fittings show somewhat inferior results. For the combterm strategy, the RMSEs of 40-conf fittings are consistently higher than those of 5-conf and 100-conf fittings, for all amino acids and electrostatic models. For the septerm strategy, 40-conf fittings show ideal transferability for hydrophobic amino acids such as ALA, ILE, LEU, MET, and VAL. However, for charged amino acids, which play key roles in biological processes, the RMSEs of 40-conf fittings are still higher than those of 5-conf and 100-conf fittings, indicating unsatisfactory transferability to the other conformations (αL, β, and pII). Based on the above observations, and because uniform peptide bond parameters across all amino acids avoid potential mismatches when constructing longer peptides, we concluded that 100-conf-combterm fittings should be applied in later studies.

Parameters from 100-conf-combterm Fitting on Tet-batz Data Set Transfer Well to Tet-matz Data Set

Next, we tested the transferability of parameters from 100-conf-combterm and 40-conf-combterm fittings on the Tet-batz data set to the Tet-matz data set. That is, the parameters were derived from ESPs calculated at the B3LYP/aug-cc-pVTZ level of theory, and we tested whether they can reproduce ESPs calculated at the MP2/aug-cc-pVTZ level of theory. Figure 3 shows the RMSEs and RRMSEs of the second stage of 5-conf-combterm and 100-conf-combterm fittings on the Tet-matz data set, as well as those of 100-conf-combterm and 40-conf-combterm fittings on the Tet-batz data set transferred to the Tet-matz data set. Interestingly, for the pGM-ind model, the transferred parameters (batz-100-ind and batz-40-ind) consistently show lower RMSEs and RRMSEs than the parameters fitted on the Tet-matz data set (5-ind and 100-ind) for all amino acids except ASP. For the RESP and pGM-perm models, as expected, the transferred parameters show higher RMSEs and RRMSEs than the parameters fitted on the Tet-matz data set. In these two cases, 100-conf-combterm fittings consistently outperform 40-conf-combterm fittings in terms of transferability. In particular, the pGM-perm model with parameters from 100-conf-combterm fittings (batz-100-perm) gives RMSEs and RRMSEs very similar to those of the fitted parameters (5-perm and 100-perm).

Figure 3.

RMSEs and RRMSEs of the second stage of 5-conf-combterm and 100-conf-combterm fittings on the Tet-matz data set, as well as 100-conf-combterm and 40-conf-combterm fittings on the Tet-batz data set transferred to the Tet-matz data set. The models are labeled as follows. Prefix 5: 5-conf-combterm fitting; Prefix 100: 100-conf-combterm fitting; Prefix batz-100: 100-conf-combterm fitting on the Tet-batz data set (transferred to the Tet-matz data set); Prefix batz-40: 40-conf-combterm fitting on the Tet-batz data set (transferred to the Tet-matz data set); resp, ind, and perm: see Figure 1 for notation.

Parameters of pGM-perm Model from 100-conf-combterm Fitting Transfer Well to Penta, Hepta, and Nona Data Sets

We then tested the transferability of parameters from 100-conf-combterm fittings to the Penta, Hepta, and Nona data sets. Besides the parameters derived on the Tet-batz data set, we also performed 100-conf-combterm fittings on the Tet-badz data set; the RMSEs and RRMSEs of the first (Figure S4) and second (Figure S5) stages show patterns identical to those of the Tet-batz data set. As described in the Computational Details section, six amino acid residues were selected to construct the polypeptides in the Penta, Hepta, and Nona data sets: ALA (nonpolar), ASP (negatively charged), PHE (aromatic), LYS (positively charged), LEU (nonpolar), and SER (polar). Because some conformations are missing due to steric clashes (Tables S1–S3), the average RMSEs and RRMSEs over the included conformations are analyzed, where N in eq 6 and eq 7 becomes the number of ESP points surrounding each molecule in only one conformation, instead of all conformations. The average RMSEs of parameters from the Tet-badz and Tet-batz data sets transferred to the Penta, Hepta, and Nona data sets are shown in Figures 4–6. The first general trend is that the performances of parameters from the Tet-badz and Tet-batz data sets are indistinguishable. The second general trend is that the pGM-perm model consistently shows the best transferability to all data sets, regardless of peptide length. Finally, for charged polypeptides (with sequences containing at least one “K” or “D”), the pGM-ind model consistently outperforms the RESP model. Interestingly, the average RMSEs of the RESP model correlate positively with the number of charged residues within the polypeptides, whereas the average RMSEs of the pGM-perm and pGM-ind models do not fluctuate significantly with the number of charged residues. This highlights the disadvantage of the RESP model, which does not account for atomic polarization effects. For uncharged polypeptides, the RESP model generally performs slightly better than the pGM-ind model. However, for uncharged nonapeptides from the Nona data set, the pGM-ind model still outperforms the RESP model. The average RRMSEs of parameter sets transferred to the Penta, Hepta, and Nona data sets are shown in Figures S6–S8. For charged polypeptides, the average RRMSEs produced by each model are significantly lower than those of uncharged polypeptides, because of the very large ESP values in the RRMSE denominators.
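
The per-conformation averaging described above can be sketched as follows. Eq 6 and eq 7 are not reproduced in this section, so the forms used here are assumptions: RMSE as the root of the mean squared ESP residual over the N points of one conformation, and RRMSE as the root of the summed squared residuals divided by the summed squared QM ESPs. All names and numbers are hypothetical; missing conformations are represented by `None` and skipped.

```python
import math

def rmse(qm, calc):
    """RMSE over the ESP points of a single conformation (assumed form of eq 6)."""
    return math.sqrt(sum((v - c) ** 2 for v, c in zip(qm, calc)) / len(qm))

def rrmse(qm, calc):
    """Relative RMSE: squared residuals over squared QM ESPs (assumed form of eq 7)."""
    num = sum((v - c) ** 2 for v, c in zip(qm, calc))
    den = sum(v ** 2 for v in qm)
    return math.sqrt(num / den)

def average_over_conformations(conformations):
    """Average per-conformation RMSE and RRMSE, skipping missing entries (None)."""
    present = [c for c in conformations if c is not None]
    avg_rmse = sum(rmse(qm, calc) for qm, calc in present) / len(present)
    avg_rrmse = sum(rrmse(qm, calc) for qm, calc in present) / len(present)
    return avg_rmse, avg_rrmse

# Toy (QM ESP, model ESP) pairs with the same absolute residual (0.01)
# but different ESP magnitudes.
small_esp = ([0.01, -0.01], [0.02, 0.00])  # stands in for an uncharged peptide
large_esp = ([0.10, 0.10], [0.11, 0.11])   # stands in for a charged peptide
missing = None                              # conformation lost to a steric clash

avg_rmse, avg_rrmse = average_over_conformations([small_esp, missing, large_esp])
print(f"RRMSE, small ESPs: {rrmse(*small_esp):.2f}")
print(f"RRMSE, large ESPs: {rrmse(*large_esp):.2f}")
print(f"average RMSE = {avg_rmse:.3f}, average RRMSE = {avg_rrmse:.2f}")
```

With identical absolute errors, the entry with large ESP magnitudes yields a much smaller RRMSE, which is the denominator effect noted for charged polypeptides.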

Figure 4.

Average RMSEs of parameters from Tet-badz and Tet-batz data sets transferred to the Penta data set. The notations are as follows. Prefix resp: The RESP model; ind: The pGM-ind model; perm: The pGM-perm model. Suffix badz: Tet-badz data set; batz: Tet-batz data set.

Figure 6.

Average RMSEs of parameters from Tet-badz and Tet-batz data sets transferred to the Nona data set. See Figure 4 for notation.

Parameters of pGM-perm Model from 100-conf-combterm Fitting Transfer Well to Real World Peptides

We also tested the transferability of parameters from 100-conf-combterm fittings to the Eaton data set. In contrast to the artificial peptides from the Penta, Hepta, and Nona data sets, the sequences of polypeptides in the Eaton data set are taken from real-world peptides used to analyze the protein folding speed limit (Table S4).57 Because some conformations are missing due to steric clashes, the average RMSEs and RRMSEs over the existing conformations are analyzed. Figure 7 shows the average RMSEs and RRMSEs over the existing conformations for parameters from the Tet-badz and Tet-batz data sets transferred to the Eaton data set. Once again, the pGM-perm model outperforms all other models for all peptides. For charged peptides (with sequences containing at least one “R”, “K”, “D”, or “E”), the RESP model consistently gives the highest average RMSE and RRMSE, which correlate positively with the number of charged residues. For uncharged peptides, the RESP model performs similarly to the pGM-ind model.
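
The charged/uncharged grouping used above depends only on the one-letter sequence, so it can be sketched in a few lines. The function names and the example sequences are hypothetical and are not taken from the Eaton data set; the criterion (at least one “R”, “K”, “D”, or “E”) follows the text.

```python
# One-letter codes treated as charged, following the criterion in the text.
CHARGED_RESIDUES = set("RKDE")

def count_charged(sequence):
    """Number of R, K, D, or E residues in a one-letter sequence."""
    return sum(1 for residue in sequence.upper() if residue in CHARGED_RESIDUES)

def is_charged_peptide(sequence):
    """True if the sequence contains at least one charged residue."""
    return count_charged(sequence) > 0

# Hypothetical example sequences, for illustration only.
for seq in ["ALFSL", "AKDLS", "KEAAR"]:
    label = "charged" if is_charged_peptide(seq) else "uncharged"
    print(f"{seq}: {count_charged(seq)} charged residue(s), {label}")
```

Sorting peptides by `count_charged` is one simple way to inspect how a model's average RMSE varies with the number of charged residues.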

Figure 7.

Average RMSEs and RRMSEs of parameters from Tet-badz and Tet-batz data sets transferred to the Eaton data set. See Figure 4 for notation.

Discussion and Conclusions

This study is a critical assessment step in the development of the pGM force fields, focusing on the electrostatic parameterization of standard amino acids. One of the most notable findings is the superior performance of the pGM-perm model over the RESP and pGM-ind models. This suggests that incorporating atomic permanent dipoles in the pGM-perm model leads to more accurate representations of electrostatic interactions in amino acids. On the other hand, the RESP model shows significant limitations, especially for charged polypeptides, suggesting the need for models that better account for atomic polarization effects. The variations in RMSEs observed among different parameterization strategies and models underscore the complexity of accurately modeling electrostatic properties in biomolecules. Moreover, this study highlights the critical role of transferability in parameterization strategies. The 100-conf-combterm fitting strategy demonstrates excellent transferability across data sets containing polypeptides of varying lengths, making it a promising approach for future studies.

It is important to note that the parameters derived in this study are part of an assessment phase and are not intended for use in the final pGM force fields. Looking ahead, the actual parameterization work for the pGM force fields will incorporate a more comprehensive range of factors. First, the unexpected behavior of the pGM-ind model in Figures 1 and 2 suggests that the multistage fitting strategies need to be carefully designed to ensure the applicability of the parameters. Second, although the parameters derived from the Tet-batz data set transfer well to the Tet-matz data set, as shown in Figure 3, and those derived from the Tet-batz and Tet-badz data sets show indistinguishable transferability to longer polypeptides in Figures 4–7, the QM level of theory for ESP calculations will still be a crucial consideration, ensuring that the derived parameters are grounded in robust theoretical foundations. The third significant aspect that will be incorporated into future parameterization efforts is the application of the Polarizable Continuum Model (PCM) to accurately model solvation effects.63 This approach is expected to provide a more realistic representation of biomolecular interactions in aqueous and other solvent environments, which is essential for the accurate modeling of biological systems.

In addition, we evaluated the traditional two-stage fitting strategy, in which the methyl and methylene groups were refitted at the second stage, using the 5-conf ESPs. With the terminal groups individually constrained to be neutral, the average RRMSEs of the traditional two-stage fitting were 13.10%, 11.69%, and 14.73% for RESP, pGM-ind, and pGM-perm, respectively. In comparison, the two-stage fitting in which only the terminal groups were changed at the second stage yielded better fits, with average RRMSEs of 12.15%, 11.12%, and 6.11% for RESP, pGM-ind, and pGM-perm, respectively. The large difference between the average RRMSEs of pGM-perm in the two strategies suggests that refitting only the terminal groups at the second stage, while fixing all other atoms, is the more accurate approach. Furthermore, our present goal is to assess the transferability of the amino acid parameters across different peptide sequences, whereas the terminal groups play a diminishing role in longer peptides (and are rarely present in typical protein models). We therefore chose to base our discussion on the strategy in which only the terminal groups were refit at the second stage while all other parameters were fixed.
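
The size of the improvement for each model follows directly from the averages quoted above. The short calculation below uses only those six reported values; the variable names are illustrative.

```python
# Average RRMSEs (%) quoted in the text for the two second-stage strategies.
traditional = {"RESP": 13.10, "pGM-ind": 11.69, "pGM-perm": 14.73}
terminal_only = {"RESP": 12.15, "pGM-ind": 11.12, "pGM-perm": 6.11}

# Reduction in average RRMSE, in percentage points, when only the
# terminal groups are refitted at the second stage.
reduction = {
    model: round(traditional[model] - terminal_only[model], 2)
    for model in traditional
}

for model, delta in reduction.items():
    print(f"{model}: {delta:.2f} percentage points lower")
```

The reductions are 0.95, 0.57, and 8.62 percentage points for RESP, pGM-ind, and pGM-perm, respectively, so the choice of second-stage strategy matters most for the pGM-perm model.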

In essence, this study lays the groundwork for future advancements in the pGM force field development. By assessing the current parameterization strategies and models, it sets the stage for a more nuanced and sophisticated approach to molecular modeling to achieve greater accuracy and applicability in computational chemistry and biomolecular simulations.

Supplementary Material

SI

Figure 5.

Average RMSEs of parameters from Tet-badz and Tet-batz data sets transferred to the Hepta data set. See Figure 4 for notation.

Acknowledgement

The authors gratefully acknowledge the research support from NIH (GM79383 to Y.D. and GM130367 to R.L.).

Footnotes

Supporting Information. Supporting Information is available free of charge at https://pubs.acs.org/.

The authors declare no competing financial interest.

References

  • 1. Momany FA, Determination of partial atomic charges from ab initio molecular electrostatic potentials. Application to formamide, methanol, and formic acid. The Journal of Physical Chemistry 1978, 82 (5), 592–601.
  • 2. Cox S; Williams D, Representation of the molecular electrostatic potential by a net atomic charge model. Journal of Computational Chemistry 1981, 2 (3), 304–323.
  • 3. Chirlian LE; Francl MM, Atomic charges derived from electrostatic potentials: A detailed study. Journal of Computational Chemistry 1987, 8 (6), 894–905.
  • 4. Besler BH; Merz KM Jr; Kollman PA, Atomic charges derived from semiempirical methods. Journal of Computational Chemistry 1990, 11 (4), 431–439.
  • 5. Breneman CM; Wiberg KB, Determining atom-centered monopoles from molecular electrostatic potentials. The need for high sampling density in formamide conformational analysis. Journal of Computational Chemistry 1990, 11 (3), 361–373.
  • 6. Cieplak P; Cornell WD; Bayly C; Kollman PA, Application of the multimolecule and multiconformational RESP methodology to biopolymers: Charge derivation for DNA, RNA, and proteins. Journal of Computational Chemistry 1995, 16 (11), 1357–1377.
  • 7. Cornell WD; Cieplak P; Bayly CI; Gould IR; Merz KM; Ferguson DM; Spellmeyer DC; Fox T; Caldwell JW; Kollman PA, A second generation force field for the simulation of proteins, nucleic acids, and organic molecules. Journal of the American Chemical Society 1995, 117 (19), 5179–5197.
  • 8. Wang J; Cieplak P; Kollman PA, How well does a restrained electrostatic potential (RESP) model perform in calculating conformational energies of organic and biological molecules? Journal of Computational Chemistry 2000, 21 (12), 1049–1074.
  • 9. Bayly CI; Cieplak P; Cornell W; Kollman PA, A well-behaved electrostatic potential based method using charge restraints for deriving atomic charges: the RESP model. The Journal of Physical Chemistry 1993, 97 (40), 10269–10280.
  • 10. Cornell WD; Cieplak P; Bayly CI; Kollman PA, Application of RESP charges to calculate conformational energies, hydrogen bond energies, and free energies of solvation. Journal of the American Chemical Society 1993, 115 (21), 9620–9631.
  • 11. Reynolds CA; Essex JW; Richards WG, Atomic charges for variable molecular conformations. Journal of the American Chemical Society 1992, 114 (23), 9075–9079.
  • 12. Stouch T; Williams DE, Conformational dependence of electrostatic potential derived charges of a lipid headgroup: Glycerylphosphorylcholine. Journal of Computational Chemistry 1992, 13 (5), 622–632.
  • 13. Tian C; Kasavajhala K; Belfon KA; Raguette L; Huang H; Migues AN; Bickel J; Wang Y; Pincay J; Wu Q, ff19SB: Amino-acid-specific protein backbone parameters trained against quantum mechanics energy surfaces in solution. Journal of Chemical Theory and Computation 2019, 16 (1), 528–552.
  • 14. Maier JA; Martinez C; Kasavajhala K; Wickstrom L; Hauser KE; Simmerling C, ff14SB: improving the accuracy of protein side chain and backbone parameters from ff99SB. Journal of Chemical Theory and Computation 2015, 11 (8), 3696–3713.
  • 15. Hornak V; Abel R; Okur A; Strockbine B; Roitberg A; Simmerling C, Comparison of multiple Amber force fields and development of improved protein backbone parameters. Proteins: Structure, Function, and Bioinformatics 2006, 65 (3), 712–725.
  • 16. Zgarbová M; Šponer J; Jurečka P, Z-DNA as a Touchstone for Additive Empirical Force Fields and a Refinement of the Alpha/Gamma DNA torsions for AMBER. Journal of Chemical Theory and Computation 2021, 17 (10), 6292–6301.
  • 17. Zgarbová M; Šponer J; Otyepka M; Cheatham TE III; Galindo-Murillo R; Jurečka P, Refinement of the sugar–phosphate backbone torsion beta for AMBER force fields improves the description of Z- and B-DNA. Journal of Chemical Theory and Computation 2015, 11 (12), 5723–5736.
  • 18. Zgarbová M; Otyepka M; Šponer J; Mládek A; Banáš P; Cheatham TE III; Jurečka P, Refinement of the Cornell et al. nucleic acids force field based on reference quantum chemical calculations of glycosidic torsion profiles. Journal of Chemical Theory and Computation 2011, 7 (9), 2886–2902.
  • 19. Cieplak P; Dupradeau F-Y; Duan Y; Wang J, Polarization effects in molecular mechanical force fields. Journal of Physics: Condensed Matter 2009, 21 (33), 333102.
  • 20. Gresh N; Cisneros GA; Darden TA; Piquemal J-P, Anisotropic, polarizable molecular mechanics studies of inter- and intramolecular interactions and ligand–macromolecule complexes. A bottom-up strategy. Journal of Chemical Theory and Computation 2007, 3 (6), 1960–1986.
  • 21. Zhao S; Schaub AJ; Tsai S-C; Luo R, Development of a Pantetheine Force Field Library for Molecular Modeling. Journal of Chemical Information and Modeling 2021, 61 (2), 856–868.
  • 22. King E; Qi R; Li H; Luo R; Aitchison E, Estimating the roles of protonation and electronic polarization in absolute binding affinity simulations. Journal of Chemical Theory and Computation 2021, 17 (4), 2541–2555.
  • 23. Draper DE; Grilley D; Soto AM, Ions and RNA folding. Annual Review of Biophysics and Biomolecular Structure 2005, 34, 221–243.
  • 24. Gkionis K; Kruse H; Platts JA; Mládek A; Koča J; Šponer J, Ion binding to quadruplex DNA stems. Comparison of MM and QM descriptions reveals sizable polarization effects not included in contemporary simulations. Journal of Chemical Theory and Computation 2014, 10 (3), 1326–1340.
  • 25. Peng X; Zhang Y; Chu H; Li Y; Zhang D; Cao L; Li G, Accurate evaluation of ion conductivity of the gramicidin A channel using a polarizable force field without any corrections. Journal of Chemical Theory and Computation 2016, 12 (6), 2973–2982.
  • 26. Sun R-N; Gong H, Simulating the activation of voltage sensing domain for a voltage-gated sodium channel using polarizable force field. The Journal of Physical Chemistry Letters 2017, 8 (5), 901–908.
  • 27. Cieplak P; Caldwell J; Kollman P, Molecular mechanical models for organic and biological systems going beyond the atom centered two body additive approximation: aqueous solution free energies of methanol and N-methyl acetamide, nucleic acid base, and amide hydrogen bonding and chloroform/water partition coefficients of the nucleic acid bases. Journal of Computational Chemistry 2001, 22 (10), 1048–1057.
  • 28. Wang ZX; Zhang W; Wu C; Lei H; Cieplak P; Duan Y, Strike a balance: optimization of backbone torsion parameters of AMBER polarizable force field for simulations of proteins and peptides. Journal of Computational Chemistry 2006, 27 (6), 781–790.
  • 29. Wang J; Cieplak P; Li J; Hou T; Luo R; Duan Y, Development of polarizable models for molecular mechanical calculations I: parameterization of atomic polarizability. The Journal of Physical Chemistry B 2011, 115 (12), 3091–3099.
  • 30. Wang J; Cieplak P; Li J; Wang J; Cai Q; Hsieh M; Lei H; Luo R; Duan Y, Development of polarizable models for molecular mechanical calculations II: induced dipole models significantly improve accuracy of intermolecular interaction energies. The Journal of Physical Chemistry B 2011, 115 (12), 3100–3111.
  • 31. Wang J; Cieplak P; Cai Q; Hsieh M-J; Wang J; Duan Y; Luo R, Development of polarizable models for molecular mechanical calculations. 3. Polarizable water models conforming to Thole polarization screening schemes. The Journal of Physical Chemistry B 2012, 116 (28), 7999–8008.
  • 32. Wang J; Cieplak P; Li J; Cai Q; Hsieh M-J; Luo R; Duan Y, Development of polarizable models for molecular mechanical calculations. 4. van der Waals parametrization. The Journal of Physical Chemistry B 2012, 116 (24), 7088–7101.
  • 33. Ren P; Ponder JW, Consistent treatment of inter- and intramolecular polarization in molecular mechanics calculations. Journal of Computational Chemistry 2002, 23 (16), 1497–1506.
  • 34. Ren P; Ponder JW, Polarizable atomic multipole water model for molecular mechanics simulation. The Journal of Physical Chemistry B 2003, 107 (24), 5933–5947.
  • 35. Banks JL; Kaminski GA; Zhou R; Mainz DT; Berne B; Friesner RA, Parametrizing a polarizable force field from ab initio data. I. The fluctuating point charge model. The Journal of Chemical Physics 1999, 110 (2), 741–754.
  • 36. Patel S; Brooks CL III, CHARMM fluctuating charge force field for proteins: I parameterization and application to bulk organic liquid simulations. Journal of Computational Chemistry 2004, 25 (1), 1–16.
  • 37. Lamoureux G; Harder E; Vorobyov IV; Roux B; MacKerell AD Jr, A polarizable model of water for molecular dynamics simulations of biomolecules. Chemical Physics Letters 2006, 418 (1–3), 245–249.
  • 38. Lopes PE; Lamoureux G; Roux B; MacKerell AD, Polarizable empirical force field for aromatic compounds based on the classical Drude oscillator. The Journal of Physical Chemistry B 2007, 111 (11), 2873–2885.
  • 39. Tan Y-H; Luo R, Continuum treatment of electronic polarization effect. The Journal of Chemical Physics 2007, 126 (9), 094103.
  • 40. Tan Y-H; Tan C; Wang J; Luo R, Continuum polarizable force field within the Poisson-Boltzmann framework. The Journal of Physical Chemistry B 2008, 112 (25), 7675–7688.
  • 41. Applequist J; Carl JR; Fung K-K, Atom dipole interaction model for molecular polarizability. Application to polyatomic molecules and determination of atom polarizabilities. Journal of the American Chemical Society 1972, 94 (9), 2952–2960.
  • 42. Thole BT, Molecular polarizabilities calculated with a modified dipole interaction. Chemical Physics 1981, 59 (3), 341–350.
  • 43. Elking D; Darden T; Woods RJ, Gaussian induced dipole polarization model. Journal of Computational Chemistry 2007, 28 (7), 1261–1274.
  • 44. Elking DM; Cisneros GA; Piquemal J-P; Darden TA; Pedersen LG, Gaussian multipole model (GMM). Journal of Chemical Theory and Computation 2010, 6 (1), 190–202.
  • 45. Elking DM; Perera L; Duke R; Darden T; Pedersen LG, Atomic forces for geometry-dependent point multipole and Gaussian multipole models. Journal of Computational Chemistry 2010, 31 (15), 2702–2713.
  • 46. Wang J; Cieplak P; Luo R; Duan Y, Development of polarizable Gaussian model for molecular mechanical calculations I: Atomic polarizability parameterization to reproduce ab initio anisotropy. Journal of Chemical Theory and Computation 2019, 15 (2), 1146–1158.
  • 47. Wei H; Qi R; Wang J; Cieplak P; Duan Y; Luo R, Efficient formulation of polarizable Gaussian multipole electrostatics for biomolecular simulations. The Journal of Chemical Physics 2020, 153 (11), 114116.
  • 48. Wei H; Cieplak P; Duan Y; Luo R, Stress tensor and constant pressure simulation for polarizable Gaussian multipole model. The Journal of Chemical Physics 2022, 156 (11), 114114.
  • 49. Zhao S; Wei H; Cieplak P; Duan Y; Luo R, Accurate Reproduction of Quantum Mechanical Many-Body Interactions in Peptide Main-Chain Hydrogen-Bonding Oligomers by the Polarizable Gaussian Multipole Model. Journal of Chemical Theory and Computation 2022, 18 (10), 6172–6188.
  • 50. Zhao S; Cieplak P; Duan Y; Luo R, Transferability of the Electrostatic Parameters of the Polarizable Gaussian Multipole Model. Journal of Chemical Theory and Computation 2023, 19 (3), 924–941.
  • 51. Huang Z; Zhao S; Cieplak P; Duan Y; Luo R; Wei H, Optimal Scheme to Achieve Energy Conservation in Induced Dipole Models. Journal of Chemical Theory and Computation 2023, 19 (15), 5047–5057.
  • 52. Zhu Q; Wu Y; Zhao S; Cieplak P; Duan Y; Luo R, Streamlining and Optimizing Strategies of Electrostatic Parameterization. Journal of Chemical Theory and Computation 2023, 19 (18), 6353–6365.
  • 53. Zhao S; Wei H; Cieplak P; Duan Y; Luo R, PyRESP: A Program for Electrostatic Parameterizations of Additive and Induced Dipole Polarizable Force Fields. Journal of Chemical Theory and Computation 2022, 18 (6), 3654–3670.
  • 54. Case DA; Aktulga HM; Belfon K; Ben-Shalom I; Berryman JT; Brozell SR; Cerutti DS; Cheatham TE III; Cisneros GA; Cruzeiro VWD; Darden TA; Duke RE; Giambasu G; Gilson MK; Gohlke H; Goetz AW; Harris R; Izadi S; Izmailov SA; Kasavajhala K; Kaymak MC; King E; Kovalenko A; Kurtzman T; Lee T; LeGrand S; Li P; Lin C; Liu J; Luchko T; Luo R; Machado M; Man V; Manathunga M; Merz KM; Miao Y; Mikhailovskii O; Monard G; Nguyen H; O’Hearn KA; Onufriev A; Pan F; Pantano S; Qi R; Rahnamoun A; Roe DR; Roitberg A; Sagui C; Schott-Verdugo S; Shajan A; Shen J; Simmerling CL; Skrynnikov NR; Smith J; Swails J; Walker RC; Wang J; Wang J; Wei H; Wolf RM; Wu X; Xiong Y; Xue Y; York DM; Zhao S; Kollman PA, Amber 2022. University of California, San Francisco: 2022.
  • 55. Case DA; Aktulga HM; Belfon K; Cerutti DS; Cisneros GA; Cruzeiro VWD; Forouzesh N; Giese TJ; Götz AW; Gohlke H; Izadi S; Kasavajhala K; Kaymak MC; King D; Kurtzman T; Lee T-S; Li P; Liu J; Luchko T; Luo R; Manathunga M; Machado MR; Nguyen HM; O’Hearn KA; Onufriev AV; Pan F; Pantano S; Qi R; Rahnamoun A; Risheh A; Schott-Verdugo S; Shajan A; Swails J; Wang J; Wei H; Wu X; Wu Y; Zhang S; Zhao S; Zhu Q; Cheatham TE III; Roe DR; Roitberg A; Simmerling C; York DM; Nagan MC; Merz KM Jr, AmberTools. Journal of Chemical Information and Modeling 2023, 63 (20), 6183–6191.
  • 56. Jiang J; Wu Y; Wang Z-X; Wu C, Assessing the Performance of Popular Quantum Mechanics and Molecular Mechanics Methods and Revealing the Sequence-Dependent Energetic Features Using 100 Tetrapeptide Models. Journal of Chemical Theory and Computation 2010, 6 (4), 1199–1209.
  • 57. Kubelka J; Hofrichter J; Eaton WA, The protein folding ‘speed limit’. Current Opinion in Structural Biology 2004, 14 (1), 76–88.
  • 58. Connolly ML, Analytical molecular surface calculation. Journal of Applied Crystallography 1983, 16 (5), 548–558.
  • 59. Singh UC; Kollman PA, An approach to computing electrostatic charges for molecules. Journal of Computational Chemistry 1984, 5 (2), 129–145.
  • 60. Frisch MJ; Trucks GW; Schlegel HB; Scuseria GE; Robb MA; Cheeseman JR; Scalmani G; Barone V; Petersson GA; Nakatsuji H; Li X; Caricato M; Marenich AV; Bloino J; Janesko BG; Gomperts R; Mennucci B; Hratchian HP; Ortiz JV; Izmaylov AF; Sonnenberg JL; Williams-Young D; Ding F; Lipparini F; Egidi F; Goings J; Peng B; Petrone A; Henderson T; Ranasinghe D; Zakrzewski VG; Gao J; Rega N; Zheng G; Liang W; Hada M; Ehara M; Toyota K; Fukuda R; Hasegawa J; Ishida M; Nakajima T; Honda Y; Kitao O; Nakai H; Vreven T; Throssell K; Montgomery JA Jr; Peralta JE; Ogliaro F; Bearpark MJ; Heyd JJ; Brothers EN; Kudin KN; Staroverov VN; Keith TA; Kobayashi R; Normand J; Raghavachari K; Rendell AP; Burant JC; Iyengar SS; Tomasi J; Cossi M; Millam JM; Klene M; Adamo C; Cammi R; Ochterski JW; Martin RL; Morokuma K; Farkas O; Foresman JB; Fox DJ, Gaussian 16, Revision A.03. Gaussian, Inc.: Wallingford, CT, 2016.
  • 61. Xie W; Pu J; Gao J, A coupled polarization-matrix inversion and iteration approach for accelerating the dipole convergence in a polarizable potential function. The Journal of Physical Chemistry A 2009, 113 (10), 2109–2116.
  • 62. Duan Y; Wu C; Chowdhury S; Lee MC; Xiong G; Zhang W; Yang R; Cieplak P; Luo R; Lee T, A point-charge force field for molecular mechanics simulations of proteins based on condensed-phase quantum mechanical calculations. Journal of Computational Chemistry 2003, 24 (16), 1999–2012.
  • 63. Frecer V; Miertuš S, Polarizable continuum model of solvation for biopolymers. International Journal of Quantum Chemistry 1992, 42 (5), 1449–1468.
