available in PMC: 2014 Feb 1.
Published in final edited form as: J Comput Aided Mol Des. 2013 Jan 26;27(2):107–114. doi: 10.1007/s10822-013-9634-x

Limiting Assumptions in Molecular Modeling: Electrostatics

Garland R Marshall 1
PMCID: PMC3594449  NIHMSID: NIHMS439392  PMID: 23354627


Science is a game of successive approximations. Our current state of “understanding” is transient, and dogmas almost always require revision and/or refinement. Certainly, much of the underlying physics and chemistry of molecular recognition has been routinely minimized in order to force problems of atomic interactions within our paradigm/computer. Oversimplification is especially dangerous when the systems under study are too complex to easily validate experimentally. Unfortunately, biological systems are both complex and important, and much effort is spent to rationalize their known behavior at low resolution with inadequate methodology at atomic resolution. A case in point is the common use of force fields utilizing monopole electrostatics. Special-purpose computers have been constructed to tackle the complexity of biological processes such as protein folding at the molecular level [2,3]. Three recent examples of MD simulations of complex systems (Src kinase, Abl kinase, and beta(1)/beta(2)-adrenergic receptors) illustrate this approach [4–6]. While these long simulations provide results that were interpreted as consistent with the limited experimental observations available, the complexity of the systems under study precludes any significant validation of the details of molecular dynamics during the simulation. Nevertheless, insight into protein folding has been derived by long MD simulations using a specialized supercomputer [11]. Obviously, the computational methodology used must be validated on well-chosen experimental systems before lending any credibility to the details of the MD results on more complex systems. Unfortunately, most MD simulations of complex biological systems, including those by the special-purpose supercomputer Anton [11], have incorporated monopole electrostatics without polarizability in the force fields used (AMBER, CHARMM, OPLS, etc.), which limits their accuracy.

One reason we do molecular modeling and simulations is to gain access to molecular events that are difficult to observe experimentally. These approaches provide a way to extrapolate between experimental observations. In order to simulate molecular recognition and intermolecular interactions, one simplifies the underlying physics and chemistry due to their inherent complexity. The question one must face is whether the introduced simplifications produce results with adequate resolution for the problem being studied. Obviously, adequate resolution means the ability to distinguish between alternative hypotheses; unfortunately, structure-based drug design requires accurate results if one is to predict binding affinities due to the frustrated potential surface. On the other hand, many of the observed properties of folded proteins can be correlated with simplified lattice models as demonstrated by Dill and co-workers [12,13]. The parameterization of CHARMM [14–16], OPLS-AA [17] and AMBER [18] and the monopole water models demonstrated that many intensive and colligative molecular properties are adequately modeled with monopole electrostatics, including solvation free energies [19]. To reproduce the dynamical behavior of molecules, however, would appear to require more accurate force fields.

Monopole versus Multipole Electrostatics

Molecular mechanics attempts to represent intermolecular interactions in terms of classical physics. Initial efforts assumed a point charge located at the atom center and coulombic interactions. It has been recognized over multiple decades that simply representing electrostatics with a charge on each atom fails to reproduce the electrostatic potential surrounding a molecule as estimated by quantum mechanics. Molecular orbitals are not spherically symmetrical, an implicit assumption of monopole electrostatics. Despite the recognition of its inadequacies [20,21] and efforts to overcome them by the Darden group [22–26] and others in the modeling community [27–32], the more computationally efficient monopole approximation is still used in order to model more complex systems of biological interest [33,34]. On detailed analysis of systems with robust experimental characterization, however, one finds that the computational results do not accurately predict the observed experimental data.

To illustrate the error associated with the monopole approximation, the case of water published in 1988 is shown (Fig. 1). An RMS error of more than 8% for the electrostatic potential sampled at 363 grid points surrounding the water molecule was the best possible fit of a monopole model to the quantum calculation. Addition of a dipole moment reduced the error to 1% and further addition of a quadrupole moment reduced the error to less than 0.1%. Consider the interactions of two waters, each with an error of 8% in their electrostatic fields; unfortunately, such errors do not cancel and lead to significant deviations from the correct geometry of interaction. The inability of monopole electrostatics to reproduce the experimentally determined geometry of water clusters has been shown [35]. One might assume that implicit solvent models would have overcome this limitation of explicit monopole models of water; unfortunately, they have been calibrated primarily with results from explicit calculations with monopole force fields. One can understand the rise in popularity of statistically based potentials derived from experimental atomic proximities that inherently avoid energetic dichotomies and focus on free energy per se [36,37].
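The kind of fit described above can be sketched with a toy calculation. The following is a minimal sketch, not the procedure of Williams: the reference potential comes from a hypothetical water-like arrangement with two off-atom "lone-pair" charges (all coordinates and charges are assumptions made for illustration), standing in for a quantum-mechanical potential. Atom-centered charges, and then atom-centered charges plus dipoles, are fitted by unconstrained least squares to that potential on two spherical shells of grid points, and the relative RMS error of each fit is reported. Units are e/Å with the Coulomb constant omitted, since only relative errors are compared.

```python
import numpy as np

def fibonacci_sphere(n, radius):
    """Return n roughly uniformly spread points on a sphere of the given radius."""
    k = np.arange(n) + 0.5
    polar = np.arccos(1.0 - 2.0 * k / n)
    azimuth = np.pi * (1.0 + 5.0 ** 0.5) * k
    return radius * np.column_stack((np.sin(polar) * np.cos(azimuth),
                                     np.sin(polar) * np.sin(azimuth),
                                     np.cos(polar)))

def design_matrix(grid, sites, with_dipoles):
    """Potential at each grid point per unit charge (and per unit dipole component) on each site."""
    d = grid[:, None, :] - sites[None, :, :]      # displacement, shape (n_grid, n_sites, 3)
    r = np.linalg.norm(d, axis=2)                 # distance, shape (n_grid, n_sites)
    columns = [1.0 / r]                           # point-charge (monopole) terms
    if with_dipoles:
        # A unit dipole along axis a contributes d_a / r^3 to the potential.
        columns += [d[:, :, a] / r**3 for a in range(3)]
    return np.hstack(columns)

def relative_rms_error(grid, sites, reference, with_dipoles):
    """Least-squares fit of site multipoles to the reference potential; returns RMS error / RMS reference."""
    A = design_matrix(grid, sites, with_dipoles)
    coeffs, *_ = np.linalg.lstsq(A, reference, rcond=None)
    residual = reference - A @ coeffs
    return np.sqrt(np.mean(residual**2) / np.mean(reference**2))

# Hypothetical geometry (angstroms): O at the origin, two H atoms, two off-atom lone-pair sites.
atoms = np.array([[0.0, 0.0, 0.0], [0.757, 0.586, 0.0], [-0.757, 0.586, 0.0]])
lone_pairs = np.array([[0.0, -0.40, 0.57], [0.0, -0.40, -0.57]])

# Hypothetical reference charges: +0.24 on each H, -0.24 on each lone-pair site.
ref_sites = np.vstack((atoms[1:], lone_pairs))
ref_charges = np.array([0.24, 0.24, -0.24, -0.24])

# Sample the reference potential on two shells around the molecule.
grid = np.vstack((fibonacci_sphere(200, 2.5), fibonacci_sphere(200, 3.5)))
dist = np.linalg.norm(grid[:, None, :] - ref_sites[None, :, :], axis=2)
reference = (ref_charges / dist).sum(axis=1)

err_monopole = relative_rms_error(grid, atoms, reference, with_dipoles=False)
err_dipole = relative_rms_error(grid, atoms, reference, with_dipoles=True)
print(f"atom-centered charges only:      {100 * err_monopole:.2f}% relative RMS error")
print(f"atom-centered charges + dipoles: {100 * err_dipole:.2f}% relative RMS error")
```

Because the charge-plus-dipole basis contains the charge-only basis, its least-squares residual cannot be larger; the numbers printed here depend entirely on the assumed toy geometry and should not be compared with the percentages quoted for water.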

Figure 1.

Comparison of best fits of monopole, dipole and quadrupole models to electrostatic potentials calculated by quantum mechanics. Modified from D. E. Williams [1].

Another illustrative example (Fig. 2) was published by Prof. Anthony Stone in his article on intermolecular potentials in Science [8]. This graphical example of the errors in electrostatic potential with the monopole approximation, and its attenuation by inclusion of higher multipoles, is compelling. Williams [1,38], Hunter [39–42], Stone [8,43], Price [43,44] and others showed that reproduction of the electrostatic potential required a more complex representation of electrostatics, including dipole and quadrupole (simply four alternating charges at the corners of a parallelogram) moments as well as monopoles. The XED (extended electron distribution) force field developed by Vinter [45] also recognized the limitation of monopole force fields, and was the first to move toward a second-generation force field that reproduced aromatic interactions [46] and other complex interactions, such as cation-pi [47], much better [48]. This led to a relevant method for comparing molecules [49] based on the extrema (Fig. 3) in their electrostatic potentials [9,50]. In many ways, this approach is philosophically similar to the field comparisons available from CoMFA [51] and GRID [52] leading to subsequent approaches such as COMBINE [53,54] and COMBINEr [55,56].
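The parenthetical description of a quadrupole as four alternating charges at the corners of a parallelogram can be checked numerically. The sketch below uses an arbitrary, hypothetical parallelogram and unit charges, and computes the total charge, the dipole vector and the traceless Cartesian quadrupole tensor Q_ij = Σ q (3 x_i x_j − |x|² δ_ij); the function name and the coordinates are assumptions made for illustration.

```python
import numpy as np

def multipole_moments(charges, positions):
    """Return total charge, dipole vector, and traceless quadrupole tensor about the origin."""
    total = charges.sum()
    dipole = charges @ positions
    second = np.einsum("k,ki,kj->ij", charges, positions, positions)
    quadrupole = 3.0 * second - np.trace(second) * np.eye(3)
    return total, dipole, quadrupole

# Hypothetical parallelogram spanned by edge vectors u and v (arbitrary length units).
corner = np.array([0.0, 0.0, 0.0])
u = np.array([1.0, 0.0, 0.0])
v = np.array([0.5, 1.0, 0.0])
positions = np.array([corner, corner + u, corner + u + v, corner + v])

# Charges alternate in sign going around the perimeter.
charges = np.array([1.0, -1.0, 1.0, -1.0])

total, dipole, quadrupole = multipole_moments(charges, positions)
print("total charge:", total)
print("dipole:", dipole)
print("quadrupole tensor:")
print(quadrupole)
```

The monopole and dipole both vanish for this arrangement, so the quadrupole is its leading moment and does not depend on the choice of origin.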

Figure 2.

Errors (V) vs. QM for electrostatic potential on a surface at 1.8 times van der Waals radii around N-methyl propanamide for two models. (Left) Point charges; (right) point charge, dipole, and quadrupole on C, N, and O; charge and dipole on H. The errors are much reduced by inclusion of higher multipoles [8].

Figure 3.

Examples of electrostatic extrema used for molecular comparisons [9].

In a recent analysis of helices in the Protein Data Bank, Kuster et al. found that the classical view of α- and 3₁₀-helices disappeared as one examined high-resolution crystallographic data that was able to generate protein models without constraints from monopole-based force fields (Kuster et al., unpublished). Instead of a bifurcated distribution between α- and 3₁₀-helical torsion angles, a smooth, single-minimum distribution was found with intermediate backbone torsional angles. The helical parameters remained essentially identical with the α-helix due to a crankshaft-like motion of the amide bonds to support three-membered backbone hydrogen bonding. A major assumption by Pauling, Donohue, etc. was linear hydrogen bonds between amide groups in protein helices; this has been reinforced by model building programs that use monopole electrostatics, as the minimum energy orientation of two interacting dipoles is linear. This association of monopole electrostatics with linear hydrogen bonding is not new. Halgren and Damm pointed out in 2001 in their seminal review of polarizable force fields [57] “… that a proper account of hydrogen-bond directionality and, in cases like those examined here, hydrogen-bond energetics requires a representation of the permanent charge distribution that goes beyond the simple framework of atom-centered charges used in traditional force fields.” In a more recent example, Morozov et al. [58] stated in 2004 that “Current molecular mechanics force fields widely used in biomolecular simulations essentially model hydrogen bonding as a purely electrostatic interaction: positive partial charges are placed on the proton and the acceptor base and negative partial charges, on the acceptor and donor atoms … The hydrogen bond modeled in this way is dominated by dipole–dipole interaction and the energy of two dipoles is at a minimum when all four atoms are collinear.”
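The statement that two interacting dipoles are lowest in energy when collinear follows from the textbook point-dipole expression U = [p₁·p₂ − 3(p₁·r̂)(p₂·r̂)]/r³. The sketch below scans the in-plane orientations of two unit point dipoles at unit separation (reduced units, Coulomb constant omitted). It is an idealization offered for illustration: real force fields place partial charges on atoms rather than ideal point dipoles, and the helper names are hypothetical.

```python
import numpy as np

def dipole_dipole_energy(p1, p2, r_vec):
    """Interaction energy of two point dipoles separated by r_vec (reduced units)."""
    r = np.linalg.norm(r_vec)
    r_hat = r_vec / r
    return (p1 @ p2 - 3.0 * (p1 @ r_hat) * (p2 @ r_hat)) / r**3

def in_plane_dipole(angle):
    """Unit dipole in the xy-plane at the given angle from the x-axis."""
    return np.array([np.cos(angle), np.sin(angle), 0.0])

separation = np.array([1.0, 0.0, 0.0])          # second dipole sits along +x from the first
angles = np.linspace(0.0, 2.0 * np.pi, 73)      # 5 degree steps, endpoints included

# Energy for every pair of in-plane orientations.
energies = np.array([[dipole_dipole_energy(in_plane_dipole(a), in_plane_dipole(b), separation)
                      for b in angles] for a in angles])

head_to_tail = dipole_dipole_energy(in_plane_dipole(0.0), in_plane_dipole(0.0), separation)
antiparallel = dipole_dipole_energy(in_plane_dipole(np.pi / 2), in_plane_dipole(-np.pi / 2), separation)

print("lowest energy on the grid:", energies.min())
print("collinear head-to-tail:   ", head_to_tail)
print("side-by-side antiparallel:", antiparallel)
```

The collinear head-to-tail arrangement gives −2 in these units, twice as favorable as the side-by-side antiparallel arrangement at −1, which is why a purely dipolar description drives hydrogen bonds toward linearity.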

Electrostatic Anisotropy and Polarizability

To adequately represent the interaction and orientation between a carbonyl oxygen and an amide hydrogen, multipole electrostatics are essential. Furthermore, non-bonded interatomic interactions require polarizability to describe mutually induced charge perturbations [57]. One must distinguish, however, between electrostatic anisotropy, which requires multipoles, and polarizability; the two are different physical interactions. In particular, aromatic/aromatic and charge/aromatic interactions require more sophisticated electrostatics than found in the force fields (AMBER, CHARMM, OPLS, etc.) in common usage. Truchon et al. have shown that aromatic/charge interactions are dominated by polarization, and can be approximated by an internal continuum model that does not include multipoles per se [59].
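Mutual polarization can be illustrated with a generic induced point-dipole model, in which each site carries an isotropic polarizability α and its induced dipole responds to the field of a nearby charge and of the other induced dipoles, μ_i = α_i (E_i + Σ_j T_ij μ_j). The sketch below is an assumption-laden toy and is not the implementation used in AMOEBA or any other force field: the geometry, the polarizability of 1 Å³ and the function names are hypothetical, and no short-range damping is applied. Units are e, Å and e·Å with the Coulomb constant omitted.

```python
import numpy as np

def dipole_tensor(r_vec):
    """Tensor T such that T @ mu is the field at r_vec from a point dipole mu at the origin."""
    r = np.linalg.norm(r_vec)
    r_hat = r_vec / r
    return (3.0 * np.outer(r_hat, r_hat) - np.eye(3)) / r**3

def field_of_charge(q, source, point):
    """Electric field at `point` from a point charge q located at `source`."""
    d = point - source
    return q * d / np.linalg.norm(d)**3

def induced_dipoles(sites, alphas, charge, charge_pos, tol=1e-12, max_iter=200):
    """Iterate the induced dipoles to self-consistency; returns (direct-only, mutual) dipoles."""
    permanent = np.array([field_of_charge(charge, charge_pos, s) for s in sites])
    direct = alphas[:, None] * permanent          # response to the charge alone
    mu = direct.copy()
    for _ in range(max_iter):
        total = permanent.copy()
        for i in range(len(sites)):
            for j in range(len(sites)):
                if i != j:
                    total[i] += dipole_tensor(sites[i] - sites[j]) @ mu[j]
        new_mu = alphas[:, None] * total
        if np.max(np.abs(new_mu - mu)) < tol:
            return direct, new_mu
        mu = new_mu
    raise RuntimeError("induced dipoles did not converge")

# Hypothetical setup: a +1 charge and two polarizable sites on a line.
sites = np.array([[0.0, 0.0, 0.0], [3.0, 0.0, 0.0]])
alphas = np.array([1.0, 1.0])                     # assumed polarizabilities, cubic angstroms
direct, mutual = induced_dipoles(sites, alphas, charge=1.0,
                                 charge_pos=np.array([-3.0, 0.0, 0.0]))
print("direct induced dipoles (x):", direct[:, 0])
print("mutual induced dipoles (x):", mutual[:, 0])
```

In this collinear arrangement each induced dipole reinforces the other, so the self-consistent dipoles exceed the direct response to the charge alone; a fixed-charge force field has no term that can respond in this way.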

Fortunately, Ponder recognized the limitations of monopole force fields over a decade ago [60] and started the development of AMOEBA, a second-generation force field based on multipole electrostatics that includes polarizability [61–63]. AMOEBA is freely available online as part of the TINKER package [64]. Recent papers attest to the ability of AMOEBA to reproduce experimental thermodynamics [65,66]. Methodology for deriving parameters for AMOEBA to allow incorporation of novel ligands has recently been published [67]. The intrinsic improvements associated with AMOEBA calculations have prompted others to incorporate it in model building from experimental electron density [27,68]. A comparison of ligand binding of benzamidine-like inhibitors of trypsin used both explicit and implicit solvation with a polarizable force field [69]. The binding free energies calculated from explicit-solvent simulations were well within the accuracy of experimental measurements.

The opportunity to compare the ability of AMOEBA to reproduce the dynamics of intermolecular interactions with monopole force fields presented itself with the NMR experimental studies of Rieman and Waters [10] on a series of four β-hairpin peptides (Fig. 4). Stabilization of the hairpins occurred through cation/pi and aromatic/aromatic interactions between two tryptophan residues and a lysine ε-amino group with variations in the degree of methylation of the ε-ammonium group. The biological relevance of methylated lysine arises from its role in epigenetic control of gene expression [70], and from the homology with acetylcholine that traverses a deep gorge lined by aromatic residues to reach the active site of acetylcholinesterase [71]. Long molecular dynamics simulations (100 ns) of model peptides, Ac-R-W-V-W-V-N-G-Orn-K(Me)n-I-L-Q-NH2, where n = 0, 1, 2, or 3, were conducted in explicit solvent with AMBER, CHARMM, OPLS and AMOEBA to determine the ability to predict the experimentally observed NOE patterns that differed depending on the degree of methylation [7]. AMOEBA was able to predict over 80% of the observed NOEs from the MD simulation (Fig. 5); the three monopole force fields did not predict the same NOEs for any of the four peptides, emphasizing the differences in their internal parameterizations (for an example of optimization efforts of monopole force fields to fit experimental data, see Macias and MacKerell [72] and MacKerell et al. [73]). While the agreement between the observed NOEs and those predicted by the MD simulations with AMOEBA is impressive (Fig. 5) and clearly demonstrates the necessity for more complex electrostatics, the lack of full agreement raises a question. Is this an indication of a need for further improvement in the parameterization of AMOEBA, or simply some minor experimental error in the NMR experiments? Will accurate prediction of robust experimental data from dynamic systems require inclusion of many-body interactions as well?
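One common way to decide whether a trajectory "predicts" an NOE is to average the interproton distance as ⟨r⁻⁶⟩^(−1/6) over the frames and compare the result with an upper bound. The sketch below illustrates that convention on synthetic distances; it is not a description of the analysis actually performed in [7], and the 5 Å cutoff, the function names and the synthetic "trajectory" are all assumptions made for illustration.

```python
import numpy as np

def noe_effective_distance(distances):
    """r^-6-weighted average distance over trajectory frames (same units as the input)."""
    distances = np.asarray(distances, dtype=float)
    return np.mean(distances ** -6.0) ** (-1.0 / 6.0)

def predicted_noe(distances, cutoff=5.0):
    """True if the effective distance falls within the assumed NOE upper bound (angstroms)."""
    return noe_effective_distance(distances) <= cutoff

# Synthetic distances (angstroms): the proton pair is in contact in 20% of the frames only.
rng = np.random.default_rng(0)
close_frames = rng.normal(3.0, 0.2, size=200)
far_frames = rng.normal(9.0, 0.5, size=800)
trajectory = np.concatenate((close_frames, far_frames))

print("arithmetic mean distance:", trajectory.mean())
print("r^-6 effective distance: ", noe_effective_distance(trajectory))
print("NOE predicted:", predicted_noe(trajectory))
```

Because of the r⁻⁶ weighting, a contact populated for only a fraction of the simulation can still give rise to an NOE, which is why the predicted NOE pattern is a sensitive test of the conformational ensemble sampled by a force field rather than of a single average structure.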

Figure 4.

Diagnostic NOEs observed by Rieman and Waters [10] in the series of four peptides differing by the degree of lysine ε-N-methylation.

Figure 5.

Summary of the number of experimental NOEs predicted by 100 ns MD simulations in water for the four hairpin peptides by the four force fields [7].

Implications for other studies

Validation of force fields has been done primarily by comparison with static crystal structures or by estimation of intensive properties, such as solvent density and radial distribution functions, that are largely dependent on potential minima. A much more significant question is the ability of a force field to reproduce the dynamics of molecular systems, which requires reproduction of the potentials surrounding the minima. Biology does not occur at zero kelvin, and kinetic energy explores the potential surface beyond the minima.

AMOEBA has been shown to reproduce the geometry of water clusters where monopole force fields give linear hydrogen bonding [35]. The quantitative agreement between AMOEBA predictions and experimental measurements on water is good in general for density, heat of vaporization, radial distribution functions, magnetic shielding, self-diffusion, and static dielectric constant [74]. A new water potential DMIP based on AMOEBA has been developed to improve computational performance [75]. Kramer et al. have suggested a method by which atomic multipoles can be rigorously implemented into common biomolecular force fields using monopole electrostatics [76]. Since biology is aqueous in nature, the ability of a force field to reproduce the properties of the solvent is absolutely essential. Another example of the necessity for high resolution in force fields, the bifurcated hydrogen bonding of the amide bond in protein helices, such as crambin, seen in high-resolution structures requires multipole electrostatics to be preserved in MD simulations (Kuster et al., unpublished).


What often appears trivial on first evaluation becomes more difficult as the complexity of the problem is revealed in all its glory. Over the past decades, computer-aided drug design has progressed toward a more complete understanding of the complexities of molecular recognition by attempting to design ligands for protein-binding sites. While numerous success stories are found in the literature, there remain numerous failures, often unreported. Optimization by focused combinatorial chemistry would not be so common if our computational methodologies were truly predictive.

What is clear, however, is that the monopole-electrostatics approximation is inadequate for certain problems requiring accurate molecular modeling. Second-generation force fields, such as AMOEBA, that incorporate multipole electrostatic models and include polarizability are essential for reliably predicting thermodynamic observables, but require considerable effort to develop parameters for novel molecules [67]. XED (freely distributed by Cresset to academics) is certainly much more realistic than monopole force fields in its ability to reproduce molecular geometries of complexes at minimal computational expense. One may question whether the inclusion of electrostatic energy generated by monopole force fields is not sometimes misleading. The abundance of aromatic/aromatic [77–79] and aromatic/charge [80,81] interactions in biological systems makes this issue problematic. It clearly remains to be shown for the problem under consideration that the energies calculated including monopole interactions are relevant; certainly, the numerous efforts to modulate electrostatic interactions with distance-dependent dielectrics, for example, call into question the utility of monopole force fields. Historically, I first encountered this problem when attempting to reproduce minimization results obtained with zwitterionic amino-acid crystals [82]. Regardless of the constant dielectric we tried, the crystals would either expand or contract; a dielectric function was required to reproduce the experimental data. On reflection, lack of polarizability was the probable culprit. The prior results we were trying to reproduce had solved the problem by simply fixing the dimensions of the unit cell of the crystals (reference omitted on purpose). The ability of second-generation force fields to reproduce crystal structures of such systems characterized by high charge density remains to be shown.
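The difference between a constant dielectric and a distance-dependent dielectric can be made concrete with a few lines of code. The sketch below compares the Coulomb energy of two unit charges under a constant ε with that under the simple form ε(r) = s·r, which turns the 1/r decay into 1/r². The functional form, the scale factor and the function names are assumptions chosen for illustration; they are not the dielectric function used in the crystal study mentioned above.

```python
# Approximate conversion factor giving energies in kcal/mol for charges in e and distances in angstroms.
COULOMB_KCAL = 332.06

def coulomb_constant_dielectric(q1, q2, r, eps=1.0):
    """Coulomb energy with a constant dielectric eps."""
    return COULOMB_KCAL * q1 * q2 / (eps * r)

def coulomb_distance_dependent(q1, q2, r, scale=1.0):
    """Coulomb energy with an assumed dielectric eps(r) = scale * r, i.e. a 1/r^2 decay."""
    return COULOMB_KCAL * q1 * q2 / (scale * r * r)

for r in (1.0, 2.0, 4.0, 8.0):
    constant = coulomb_constant_dielectric(1.0, -1.0, r)
    screened = coulomb_distance_dependent(1.0, -1.0, r)
    print(f"r = {r:4.1f} A   constant eps: {constant:9.2f}   eps(r) = r: {screened:9.2f} kcal/mol")
```

The distance-dependent form leaves short-range interactions nearly untouched while suppressing long-range ones, an empirical correction for screening that a fixed-charge model cannot otherwise supply.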

Nevertheless, absolute truth is not essential in setting priorities in drug discovery. Often, the essential component in the competitive pharmaceutical world is speed. Despite the evidence that multipole electrostatics and polarizability are essential for accurate predictions, there is a price of increased complexity of computation (approximately 10-fold) to be paid. Force fields with monopole electrostatics can be useful in exploring a problem to determine where a more sophisticated approach is warranted. Certainly, molecular modeling provides a useful framework for hypothesis generation, regardless of the level of atomic resolution. The seductive models produced by modeling, however, must be subjected to validation by prediction and experimental tests. Predictive calculations of affinities, however, require both an accurate force field and exploration of the entropy of binding [83] – still a daunting task.

Nevertheless, it is difficult to ignore the obvious. When we limit the physical basis of molecular recognition to those aspects we can conveniently incorporate into computational structure-based design, then we can expect to have success only when our limiting assumptions are compatible with the system under study. Unfortunately, electrostatics plays an essential role in molecular recognition, in the dynamics of protein folding, and in protein/ligand interactions. Until molecular modeling routinely includes multipole electrostatics and polarizability through the use of more sophisticated, second-generation force fields such as AMOEBA, we must anticipate significant errors in predictions to occur. Results from MD simulations using force fields with monopole electrostatics may, in fact, be adequate at a given level of resolution, but how does one judge unless the system has been validated with robust experimental data? Fortunately, access to computational power sufficient to enable application of more complex force fields, such as AMOEBA, is available by distributed computing (for a comparison of the Markov State Model approach with ANTON, see the article by Lane et al. [84]) and the ever increasing power of CPUs available in clustered arrays.


Many have contributed to what we have collectively attempted in computer-aided molecular design; my thanks for sharing the dream. In particular I need to thank my colleagues, Drs. Xiange Zheng, for her MD simulations comparing monopole force fields with AMOEBA, and Dan Kuster for his analysis of high-resolution protein helices. Many (too numerous to mention) have generously pointed out critical mistakes along my ultimate path to humility. This includes the referees of the first draft of this manuscript. A long association with Prof. Andy Vinter (a founding editor of JCAMD) opened my eyes to the problems with monopole electrostatics, and generated a noticeable avoidance of modeling nucleic acids and membranes on my part due to their high charge density. In particular, however, my proximity to Prof. Jay Ponder during his development of AMOEBA has taught me the necessity to swim upstream, i.e., to do what is scientifically justified without concern for the myopia of the field. Hopefully, the validation of AMOEBA has reached an acceptable level of maturity, and the molecular-modeling community can reap the benefits in its application. Discussion on the problems of representing electrostatics with Prof. Anthony Stone of Cambridge University has clarified many of the issues. My thanks also to Prof. Rino Ragno and the Sapienza Università di Roma for being my host during the preparation of this perspective. Finally, my thanks to the Editors of JCAMD for providing me this opportunity to pontificate, so appropriate considering my location in Rome.


