Abstract
Atomic radii and charges are two major parameters used in implicit solvent electrostatics and energy calculations. The optimization problem for charges and radii is under-determined, leading to uncertainty in the values of these parameters and in the results of solvation energy calculations that use them. This paper presents a new method for quantifying this uncertainty in implicit solvation calculations of small molecules using surrogate models based on generalized polynomial chaos (gPC) expansions. Relatively few atom types are used to specify radii parameters in implicit solvation calculations; therefore, surrogate models for these low-dimensional spaces can be constructed using least-squares fitting. However, there are many more types of atomic charges; therefore, construction of surrogate models for the charge parameter space requires compressed sensing combined with an iterative rotation method to enhance problem sparsity. We demonstrate the method by presenting results for the uncertainties in small molecule solvation energies based on these approaches. The method presented in this paper is a promising approach for efficiently quantifying uncertainty in a wide range of force field parameterization problems, including those beyond continuum solvation calculations. The intent of this study is to provide a way for developers of implicit solvent model parameter sets to understand the sensitivity of their target properties (solvation energy) to underlying choices for solute radius and charge parameters.
1. Introduction
Implicit solvent models and their applications have been the subject of numerous previous reviews [1-3]. Such solvation models require the coordinates of the solute atoms as well as atomic charge distributions and a representation of the solute-solvent interface. Charges and interfaces are generally modeled through parameterized empirical representations; however, these parameterizations are often under-determined, leading to uncertainty in the resulting parameter sets [4-6]. The Poisson equation is a popular model for implicit solvent electrostatics and serves as a good example for exploring the influence of this uncertainty on properties such as molecular solvation energy [1-3, 7]. In this paper, we use the term “solvation energy” to refer to the energy returned from the Poisson equation and to emphasize that we are not sampling over solute conformational states for a true free energy. The Poisson equation is a partial differential equation for the electrostatic potential φ,
$$-\nabla \cdot \epsilon(x) \nabla \varphi(x) = \rho(x), \quad x \in \Omega \tag{1}$$
$$\varphi(x) = \varphi_D(x), \quad x \in \partial\Omega \tag{2}$$
where Ω is the problem domain, ∂Ω is the domain boundary, ϵ : Ω ↦ [1, ∞) is a dielectric coefficient, ρ is the charge distribution, and φD is a reference potential function (e.g., Coulomb’s law) used for the Dirichlet boundary condition. The dielectric coefficient ϵ is usually defined implicitly [8-11] with respect to the solute atomic radii {σi} and solvent properties such that the coefficient reaches two limiting constant values: ϵu inside the solute and ϵv away from the solute in bulk solvent. The solvation energy is calculated by
$$E = \frac{1}{2} \int_{\Omega} \rho(x) \left( \varphi(x) - \varphi_0(x) \right) dx \tag{3}$$
where φ is the Poisson equation solution for the system with a bulk value of ϵ corresponding to the solvent of interest and φ0 is the solution for the system with a bulk value of ϵ corresponding to a vacuum. For atomic monopoles, the solute charge distribution has the (numerically unfortunate) form $\rho(x) = \sum_{i=1}^{N_A} q_i \delta(x - x_i)$ for NA solute atoms with positions {xi} and charges {qi}. The δ terms are formally defined as Dirac delta functionals but are usually approximated by functions with finite support (e.g., when projected onto a grid or finite element basis). The delta functional form leads to a simplified expression for the solvation energy in Eq. 3,
$$E = \frac{1}{2} \sum_{i=1}^{N_A} q_i \left( \varphi(x_i) - \varphi_0(x_i) \right) \tag{4}$$
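As a small illustration of the point-charge form of the solvation energy, the sketch below evaluates one half of the sum of each charge times the difference between the solvated and vacuum potentials at that atom. The function name `solvation_energy` and the three-atom input values are hypothetical and chosen only for illustration; in practice the potentials would come from a Poisson solver.

```python
import numpy as np

def solvation_energy(charges, phi_solvent, phi_vacuum):
    """Return 0.5 * sum_i q_i * (phi(x_i) - phi0(x_i)) for point charges."""
    q = np.asarray(charges, dtype=float)
    dphi = np.asarray(phi_solvent, dtype=float) - np.asarray(phi_vacuum, dtype=float)
    return 0.5 * float(np.dot(q, dphi))

# Hypothetical three-atom solute; the numbers are illustrative, not real data.
q = [0.4, -0.8, 0.4]
phi = [1.0, -2.0, 1.0]    # potential at each atom, solvent dielectric
phi0 = [1.5, -3.0, 1.5]   # potential at each atom, vacuum dielectric
energy = solvation_energy(q, phi, phi0)
```

The units of the result follow from the units of the charges and potentials supplied; no unit conversion is performed here.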
Consider first the charge parameters. Atomic charge models are designed to approximate the “true” vacuum electrostatic potential due to quantum mechanical electron and nuclei charge distributions. While quantum mechanical charge distributions can be incorporated directly in implicit solvent models [12, 13], atomic point charge distributions are generally used [2]. These point charges can include inducible and fixed multipoles [14, 15], but monopoles are the most common form. For the purposes of assigning charges, atoms are grouped into sets based on molecular connectivity and environment [16]. The charge values for atoms in these sets are usually determined by numerical fitting to quantum mechanical vacuum electrostatic potentials. Such charge optimization is ill-posed, and fitting requires careful choice of the objective function and regularization constraints [17-20]. While sophisticated fitting procedures have been developed, significant information reduction occurs in the transformation of the continuous quantum mechanical electron density into a discrete set of atomic point charges.
Solute-solvent interface models are much more empirical than the charge distribution models; the definition of a solvent “interface” is imprecise at length scales comparable to the size of water molecules. Therefore, such models are generally developed to represent a reasonable description of the solute geometry while also optimizing agreement with experimental quantities such as solvation energy. A large number of solute-solvent interface models exist, including van der Waals [11], solvent-accessible [21], solvent-excluded (or Connolly) [22], Gaussian-based [23], spline-based [24], and differential geometry surfaces [10, 25-29]. All of these interface models represent atoms as spheres and require information about the radii of these spheres. These radii are generally assigned to sets of atoms based on their “type” as determined by the local molecular connectivity. Unlike atomic charges, there are relatively few sets of atom types used to assign radii [16, 30]. These radii parameters are determined by optimization of properties such as solvation energy against experimental data [16, 30]. Additionally, many of these models also require information about solvent characteristics, generally in the form of a solvent radius, characteristic solvent length scales, or bulk solvent pressure/surface tension properties.
The intent of this study is to provide a way for developers of implicit solvent model parameter sets to understand the sensitivity of their target properties (solvation energy) to underlying choices for solute radius and charge parameters. In the present work, we present a new method to quantify the uncertainty in the solvation energy calculated by the Poisson equation that is induced by the uncertainty of the input radii and charge parameters. In particular, we construct two surrogate (or statistical regression) models of the solvation energy in terms of the radii and the atomic charges, respectively. These surrogate models enable us to estimate the solvation energy for different input parameters quickly and to evaluate the statistical information of the target properties (e.g., the probability density function) efficiently. We model the input parameters as independent Gaussian random variables with different means and standard deviations; however, other probability distributions can also be used. To construct the surrogate of the Poisson model, we use a generalized polynomial chaos (gPC) [31, 32] expansion to represent the dependence of the solvation energy on uncertain parameters such as the atomic charges and radii. The efficacy of the gPC method for elliptic problems such as the Poisson equation has been extensively studied, with robust results for its efficiency and accuracy [33, 34]. This approach is straightforward to apply to relatively low-dimensional parameter sets. However, the main challenge of applying this method to implicit solvent calculation parameter uncertainty is the high dimensionality of the parameter sets (especially the atomic charges): the surrogate models require more basis functions and, therefore, more expansion coefficients need to be identified. To address this challenge, we adopt a compressive sensing method combined with the rotation-based sparsity-enhancing method first proposed by Lei et al. [35] and extended by Yang et al. [36], which enables us to construct the surrogate with relatively few sample outputs of the numerical Poisson solver.
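To give a feel for why sparsity lets a surrogate be identified from fewer samples than unknown coefficients, the sketch below recovers a sparse coefficient vector from an under-determined linear system. It uses orthogonal matching pursuit, a simple greedy stand-in, and not the ℓ1-minimization with iterative rotation described in this paper. The function `omp`, the random measurement matrix, and the problem sizes are all assumptions made for illustration.

```python
import numpy as np

def omp(psi, y, n_nonzero):
    """Greedy sparse recovery: pick n_nonzero columns of psi that best explain y."""
    residual = y.copy()
    support = []
    col_norms = np.linalg.norm(psi, axis=0)
    sol = np.zeros(0)
    for _ in range(n_nonzero):
        # Score each column by its normalized correlation with the residual.
        scores = np.abs(psi.T @ residual) / col_norms
        if support:
            scores[support] = 0.0  # never pick the same column twice
        support.append(int(np.argmax(scores)))
        # Refit all selected coefficients by least squares.
        sol, *_ = np.linalg.lstsq(psi[:, support], y, rcond=None)
        residual = y - psi[:, support] @ sol
    coeffs = np.zeros(psi.shape[1])
    coeffs[support] = sol
    return coeffs

rng = np.random.default_rng(0)
n_samples, n_basis = 60, 100            # fewer samples than unknown coefficients
psi = rng.standard_normal((n_samples, n_basis))
c_true = np.zeros(n_basis)
c_true[[5, 37, 80]] = [2.0, -1.5, 1.0]  # hypothetical sparse coefficients
y = psi @ c_true
c_recovered = omp(psi, y, n_nonzero=3)
```

In a gPC setting, the columns of `psi` would hold basis polynomials evaluated at the sampled inputs and `y` would hold the corresponding solver outputs.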
2. Methods
We demonstrate the framework using a test set of 17 compounds from the SAMPL computational challenge for solvation energy prediction [16] (see Table 1). This subset was chosen to illustrate how the method quantifies uncertainty in solvation energy due to implicit solvent parameter uncertainty on several different molecules; it was not chosen to support statistics computed over the set.
Table 1:
List of 17 compounds from the SAMPL computational challenge for solvation energy prediction with solvation energy [16] and solvent accessible volume from APBS [37].
| Ind. | Compound | Solvation energy (kJ/mol) | Molecular volume (Å³) |
|---|---|---|---|
| 1 | glycerol triacetate | −36.99 | 215.80 |
| 2 | benzyl bromide | −9.96 | 126.98 |
| 3 | benzyl chloride | −8.08 | 124.66 |
| 4 | m-bis(trifluoromethyl)benzene | 4.48 | 290.91 |
| 5 | N, N-dimethyl-p-methoxybenzamide | −46.07 | 188.44 |
| 6 | N, N, 4-trimethylbenzamide | −40.84 | 179.62 |
| 7 | bis-2-chloroethyl ether | −17.70 | 121.45 |
| 8 | 1, 1-diacetoxyethane | −20.79 | 148.50 |
| 9 | 1,1-diethoxyethane | −13.72 | 132.75 |
| 10 | 1, 4-dioxane | −21.13 | 87.86 |
| 11 | diethyl propanedioate | −25.10 | 165.44 |
| 12 | dimethoxymethane | −12.26 | 81.90 |
| 13 | ethylene glycol diacetate | −26.53 | 148.10 |
| 14 | 1, 2-diethoxyethane | −14.81 | 132.91 |
| 15 | diethyl sulfide | −6.49 | 108.15 |
| 16 | phenyl formate | −15.98 | 126.25 |
| 17 | imidazole | −41.05 | 67.36 |
2.1. Uncertain parameters
Many parameterization approaches for atomic charge use ESP (electrostatic potential) [38] or related methods (e.g., RESP [19]). These methods optimize atomic charges by least-squares fitting of the charges’ Coulombic potential to the electrostatic potential obtained from quantum mechanical calculations. This under-determined optimization is performed subject to various constraints, including the requirement that the atomic charges sum to the integer formal charge of the molecule. More specifically, the calculated ESP at the i-th grid point is $\hat{V}_i = \sum_j q_j / r_{ij}$, the electrostatic potential given by Coulomb’s law summed over the charges qj at the centers of the atoms j, where rij is the distance between grid point i and atom j. Least-squares fitting is performed by minimizing $\sum_i (V_i - \hat{V}_i)^2$ subject to these constraints, where Vi is the electrostatic potential computed by ab initio calculations. Least-squares fitting implies a Gaussian noise model wherein the atomic charges qj can be modeled as Gaussian random variables, as done in this study.
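The constrained least-squares fit described above can be sketched as follows. This is a minimal illustration, not the ESP or RESP procedure itself: it imposes only the total-charge constraint through a Lagrange multiplier, works in atomic units so that a unit charge at distance r contributes 1/r, and uses a hypothetical two-atom geometry with synthetic reference potentials. The function name `fit_esp_charges` is an assumption.

```python
import numpy as np

def fit_esp_charges(grid_points, atom_positions, esp_values, total_charge):
    """Minimize sum_i (V_i - sum_j q_j / r_ij)^2 subject to sum_j q_j = total_charge."""
    grid = np.asarray(grid_points, dtype=float)
    atoms = np.asarray(atom_positions, dtype=float)
    v = np.asarray(esp_values, dtype=float)
    # a[i, j] = 1 / r_ij: potential at grid point i from a unit charge on atom j.
    a = 1.0 / np.linalg.norm(grid[:, None, :] - atoms[None, :, :], axis=2)
    n = atoms.shape[0]
    # KKT system for the equality-constrained least-squares problem.
    kkt = np.zeros((n + 1, n + 1))
    kkt[:n, :n] = 2.0 * a.T @ a
    kkt[:n, n] = 1.0
    kkt[n, :n] = 1.0
    rhs = np.concatenate([2.0 * a.T @ v, [total_charge]])
    return np.linalg.solve(kkt, rhs)[:n]

rng = np.random.default_rng(0)
atoms = np.array([[0.0, 0.0, 0.0], [1.2, 0.0, 0.0]])   # hypothetical diatomic
true_q = np.array([0.3, -0.3])
# Grid points on a sphere of radius 3 around the molecular midpoint.
directions = rng.normal(size=(50, 3))
directions /= np.linalg.norm(directions, axis=1, keepdims=True)
grid = 3.0 * directions + np.array([0.6, 0.0, 0.0])
# Synthetic "reference" potential generated from the known charges.
esp = (1.0 / np.linalg.norm(grid[:, None, :] - atoms[None, :, :], axis=2)) @ true_q
fitted = fit_esp_charges(grid, atoms, esp, total_charge=0.0)
```

With real quantum mechanical potentials the fit is not exact, and the residual scatter is what motivates treating the fitted charges as Gaussian random variables.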
In the present work, we modeled the uncertainty in atomic charges by considering atomic charges obtained by 11 different approaches: AM1BCC [39], CHELP [40], CHELPG [41], CM2 [42], ESPMK [38], Gasteiger [43], PCMESP [44], Qeq [45], RESP [19], MMFF94 [46], and Mulliken [47]. The Hartree-Fock method and the 6-31G* basis set were used to optimize molecular geometries. The methods we selected here are popular for implicit solvation models and all-atom approaches. Although many of these charge methods are used in all-atom simulations, implicit solvent models have been used with several of them, including RESP and ESP(MK) [30, 48], AM1-BCC [48], Mulliken [49], CHELPG [49, 50], Gasteiger [51], and Qeq [52].
We have assumed that the variation of atomic charges across different methods can be modeled by a Gaussian random field with covariance kernel
$$C(q_i, q_j) = \eta_i \eta_j \exp\left( -\left( \frac{|x_i - x_j|}{\theta} \right)^{p} \right) \tag{5}$$
where ηi is the standard deviation of the i-th atomic charge, xi is the position of the i-th atom, and 0 < p < 2. The least-squares nature of most charge fitting methods makes Gaussian variables a natural choice; however, other probability distributions can also be used. We used atomic charges from 11 different methods to estimate ηi and then used maximum likelihood estimation (MLE) to estimate θ and p. Since the sum of the NA charges in a molecule is constrained (to its formal molecular charge Q), we modeled the Gaussian random field with NA − 1 atoms by removing the last hydrogen in the PDB file. Additionally, we used symmetry in the molecular structure to reduce the number of independent atomic charge types before applying the MLE to identify the random field. For example, in benzene, there is only one type of carbon and one type of hydrogen due to the symmetry of the molecule. Therefore, we considered the charges of its atoms as a Gaussian random field with only two entries instead of 12 (the total number of atoms in benzene).
After obtaining the covariance matrix by integrating across methods, we represented the atomic charge as
$$q = \langle q \rangle + L_c \gamma \tag{6}$$
where q = (q1, q2, ⋯ , qNA−1) are the atomic charges, ⟨q⟩ is the mean of q estimated from the 11 different charge values, γ = (γ1, γ2, ⋯ , γNA−1) are i.i.d. zero-mean unit-variance Gaussian random variables, and Lc is a lower triangular matrix from the Cholesky decomposition of the covariance matrix (Eq. 5). We note that for the atoms in the test set used in the present work, the covariance matrices of these random fields are almost diagonal: the off-diagonal entries are smaller than 10⁻¹². This suggests that the correlation between atomic charges is effectively removed during their symmetry-based grouping. The atomic charge for the remaining atom is obtained from the other random charge variables through the constraint that all charges sum to the formal molecular charge Q, $\sum_{i=1}^{N_A} q_i = Q$.
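The sampling step described above can be sketched as below: build a covariance matrix, take its Cholesky factor, map i.i.d. standard normal variables to correlated charges, and fix the final charge from the total-charge constraint. The power-exponential kernel form, the per-atom (rather than per-type) treatment of the constraint, and all numerical values are assumptions made for illustration; the names `charge_covariance` and `sample_charges` are hypothetical.

```python
import numpy as np

def charge_covariance(positions, std_devs, theta, p):
    """Assumed kernel: eta_i * eta_j * exp(-(|x_i - x_j| / theta)**p)."""
    x = np.asarray(positions, dtype=float)
    eta = np.asarray(std_devs, dtype=float)
    dist = np.linalg.norm(x[:, None, :] - x[None, :, :], axis=2)
    return np.outer(eta, eta) * np.exp(-(dist / theta) ** p)

def sample_charges(mean_free, cov, total_charge, n_samples, rng):
    """Draw q = <q> + L_c * gamma, then append the constrained last charge."""
    l_c = np.linalg.cholesky(cov)                      # lower triangular factor
    gamma = rng.standard_normal((n_samples, len(mean_free)))
    q_free = np.asarray(mean_free, dtype=float) + gamma @ l_c.T
    q_last = total_charge - q_free.sum(axis=1, keepdims=True)
    return np.hstack([q_free, q_last])

rng = np.random.default_rng(0)
positions = [[0.0, 0.0, 0.0], [1.1, 0.0, 0.0], [0.0, 1.1, 0.0]]  # hypothetical
cov = charge_covariance(positions, std_devs=[0.05, 0.10, 0.02], theta=0.5, p=1.0)
samples = sample_charges([-0.3, 0.2, 0.05], cov, total_charge=0.0,
                         n_samples=1000, rng=rng)
```

Each row of `samples` is one complete charge set that satisfies the total-charge constraint by construction.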
Similarly, we used multiple force fields (ZAP-9 [16], OPLSAA [53], Bondi [54] and PARSE [30]) to model uncertainty in the radii parameters in the same manner. Although radii are non-negative, we did not explicitly impose constraints on the radii. After obtaining the covariance matrix, we represented the radii as
$$\sigma = \langle \sigma \rangle + L_r \zeta \tag{7}$$
where σ = (σ1, ⋯ , σNA), σi is the radius of atom (type) i, ⟨σ⟩ is the mean of σ estimated from the different radii sets, ζ = (ζ1, ⋯ , ζNA) are independent zero-mean unit-variance Gaussian random variables, and Lr is a lower triangular matrix from the Cholesky decomposition of the covariance matrix. The small number of radii sets makes the selection of a probability distribution somewhat arbitrary. We have assumed that the radii follow a Gaussian distribution; however, other probability distributions can also be used. We note that the standard deviations here are smaller than 10% of the mean values, which implies very low probabilities for unphysical negative radii. Therefore, by employing Gaussian random variables truncated at 4 standard deviations (capturing more than 99.99% of the probability), we guaranteed that the radii are always positive and that the distributions of the truncated Gaussian variables were almost identical to those of the original Gaussian variates. We note that with this setting, no model assigns zero radius to protons or other atoms.
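The truncation described above can be illustrated with simple rejection sampling: any standard normal draw outside ±4 is redrawn, so every sampled radius stays within 4 standard deviations of its mean. The sketch assumes independent radii (a diagonal covariance) for brevity, and the mean and standard deviation values are hypothetical.

```python
import numpy as np

def sample_truncated_radii(mean, std, n_samples, rng, n_sigma=4.0):
    """Sample radii as mean + std * z with z standard normal, |z| <= n_sigma."""
    mean = np.asarray(mean, dtype=float)
    std = np.asarray(std, dtype=float)
    z = rng.standard_normal((n_samples, mean.size))
    outside = np.abs(z) > n_sigma
    while outside.any():
        # Redraw only the entries that fell outside the truncation window.
        z[outside] = rng.standard_normal(int(outside.sum()))
        outside = np.abs(z) > n_sigma
    return mean + std * z

rng = np.random.default_rng(0)
mean_radii = np.array([1.7, 1.2, 1.5])     # hypothetical mean radii in angstroms
std_radii = np.array([0.10, 0.10, 0.12])   # each below 10% of its mean
radii = sample_truncated_radii(mean_radii, std_radii, n_samples=10000, rng=rng)
```

Because each standard deviation is below 10% of the mean, the smallest possible sample is above 60% of the mean radius, so positivity is guaranteed.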
Although we use γ and ζ to denote the random variables used for modeling the uncertainties in qj and σj, in what follows, we still use ξ = (ξ1, ξ2, ⋯ ) to denote general uncertain inputs when introducing the algorithm and reporting results.
2.2. Solvation energy surrogate models
We used generalized polynomial chaos (gPC) expansions as surrogate models for the solvation energy. The goal of surrogate construction is to estimate the variations in quantities of interest, such as solvation energy, much more efficiently than solving the original problem, such as solving the Poisson equation. The details for these expansions are provided in Supporting Material.
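As a rough illustration of a gPC surrogate, the sketch below fits a total-degree expansion in normalized Hermite polynomials to samples of a toy function of two standard normal inputs by ordinary least squares. The toy function stands in for the Poisson solver and is purely hypothetical, as are the helper names. Because the basis is orthonormal under the standard normal measure, the surrogate mean is the constant coefficient and the variance is the sum of squares of the remaining coefficients.

```python
import itertools
import math
import numpy as np
from numpy.polynomial.hermite_e import hermeval

def total_degree_indices(dim, order):
    """Multi-indices alpha with alpha_1 + ... + alpha_dim <= order."""
    return [a for a in itertools.product(range(order + 1), repeat=dim)
            if sum(a) <= order]

def hermite_design_matrix(xi, indices):
    """Rows are samples; columns are products of normalized Hermite polynomials."""
    xi = np.atleast_2d(np.asarray(xi, dtype=float))
    order = max(max(a) for a in indices)
    one_d = []
    for n in range(order + 1):
        c = np.zeros(n + 1)
        c[n] = 1.0
        # He_n(xi) / sqrt(n!) has unit variance under the standard normal.
        one_d.append(hermeval(xi, c) / math.sqrt(math.factorial(n)))
    psi = np.ones((xi.shape[0], len(indices)))
    for col, alpha in enumerate(indices):
        for k, n in enumerate(alpha):
            psi[:, col] *= one_d[n][:, k]
    return psi

def toy_energy(xi):
    """Hypothetical stand-in for the solver output E(xi)."""
    return (1.0 + 0.5 * xi[:, 0] + 0.2 * xi[:, 0] * xi[:, 1]
            + 0.1 * (xi[:, 1] ** 2 - 1.0))

rng = np.random.default_rng(1)
indices = total_degree_indices(dim=2, order=3)
xi_train = rng.standard_normal((40, 2))
coeffs, *_ = np.linalg.lstsq(hermite_design_matrix(xi_train, indices),
                             toy_energy(xi_train), rcond=None)
surrogate_mean = coeffs[0]
surrogate_variance = float(np.sum(coeffs[1:] ** 2))
```

Once the coefficients are known, evaluating the surrogate at new inputs costs only a matrix-vector product, which is what makes large sample counts for PDF estimation affordable.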
2.3. Poisson equation solver
We used the Adaptive Poisson-Boltzmann Solver (APBS) [37] to solve the Poisson equation for solvation energies. Poisson calculations were performed with the finite difference solver using 97³ grid points, focused from a 25 Å to a 13 Å cubic domain. Charges were discretized onto the grids using linear interpolation. Boundary conditions were assigned using a sum of Coulomb potentials. The molecular interior and solvent were assigned dielectric values of 2.0 and 78.0, respectively. The solute-solvent boundary was defined using a “Connolly” molecular surface [55]. Energies were calculated using the standard approach for Poisson-Boltzmann calculations [56, 57].
3. Results and discussion
For each test case, we used Monte Carlo simulations to sample the parameter probability distributions and generate 10,000 samples of the input parameters ξq, and then solved the Poisson equation using APBS to obtain output samples of the solvation energy Eq = E(ξq). We used these outputs as ground-truth reference solutions to examine the performance of the surrogate models; these outputs will be referred to as “reference” in the remainder of this paper. More precisely, given a surrogate model, we use two different root-mean-squared error (RMSE) measures to examine its accuracy:
| (8) |
We also use box-whisker plots to demonstrate the statistics. The line in the middle is the median of 16 molecules, the tops and bottoms of the boxes are 25th and 75th percentiles, and the whisker plots cover more than 99% probability.
3.1. Influence of radii uncertainties on solvation energies
We investigated the effect of the uncertainties in the radii with fixed atomic charges obtained from AM1-BCC [39]. As an example, there are eight different sets of radii for N, N-dimethyl-p-methoxybenzamide across the ZAP-9, Bondi, OPLSAA, and PARSE parameter sets, as shown in the Supporting Material. We modeled the solvation energy as a function of eight i.i.d. Gaussian random variables. We constructed gPC surrogate models with multi-variate normalized Hermite polynomials up to third order. The surrogate model consisted of $\binom{8+3}{3} = 165$ basis functions. Figure 1 (a) presents the RMSE obtained by our method with respect to different numbers of samples Eq. Figure 1 (b) compares the solvation energy probability density function (PDF) obtained by our method and the reference solutions. The numerical results are obtained by first constructing the surrogate model with the 36 output samples and then sampling the surrogate model 10,000 times with random samples to estimate the PDF. The reference solution is computed from the 10,000 outputs of Eq.
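The size of a gPC basis grows quickly with the number of uncertain inputs. Assuming a total-degree truncation (all multi-variate polynomials whose total order does not exceed the maximum), the number of basis functions for d inputs and maximum order P is the binomial coefficient (d + P choose P). The short sketch below evaluates this count for the two settings discussed in this paper; the function name is hypothetical.

```python
import math

def gpc_basis_size(n_inputs, max_order):
    """Number of total-degree polynomials in n_inputs variables up to max_order."""
    return math.comb(n_inputs + max_order, max_order)

radii_basis = gpc_basis_size(8, 3)     # eight radii inputs, third order
charge_basis = gpc_basis_size(13, 4)   # thirteen charge inputs, fourth order
```

The gap between these counts and the number of solver runs that are affordable is what motivates sparse recovery for the charge parameter space.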
Figure 1:
Performance of the surrogate model for radii uncertainties for N, N-dimethyl-p-methoxybenzamide. (a): RMSE with different numbers of samples M. (b): comparison of the solvation energy PDFs estimated by the numerical surrogate method (“Numeric”) based on 40 output samples of APBS; the dashed line (“Experiment”) is the experimental result; diamonds are the results obtained using radii from ZAP-9, Bondi, OPLSAA, and PARSE, respectively. The diamond closest to the experiment was obtained from ZAP-9.
We performed the same analysis for all the molecules in the test set and present the results in Figure 2. For most molecules, we can build an accurate surrogate model (RMSE < 0.05) for the solvation energy with only a few samples (fewer than 40) of the input parameters. However, m-bis(trifluoromethyl)benzene (TFMB) required significantly more samples. In particular, the RMSE for the TFMB solvation energy surrogate model was close to 0.15 with 40 samples, and 100 samples were required to reduce the RMSE to less than 5%. This variability arises from the radius of fluorine: in the ZAP-9 parameter set it is 2.4 Å; however, it is only ~1.4 Å in the other force fields. Hence, the standard deviation of this radius is around 25% of the mean, so fluorine requires more terms in the surrogate model for an accurate description and therefore more samples to parameterize those terms. The influences of the uncertainties in the input radii on the solvation energy for each molecule are shown in box-whisker plots in Figure 3. The experimental results are presented for comparison. We note that some experimental results are “outliers” of the box-whisker plots. This is because the atomic charges were fixed at the AM1BCC values, which does not guarantee that the computed solvation energy is sufficiently close to the experimental result. For example, for m-bis(trifluoromethyl)benzene, AM1BCC charges yield a negative solvation energy while the experimental result is positive.
Figure 2:
Performance of surrogate models with respect to number of samples. Circles are the RMSE1 of m-bis(trifluoromethyl)benzene (TFMB); box-whisker plots are the RMSE1 of the remaining 16 molecules.
Figure 3:
Influence of radii uncertainties on molecular solvation energies for the 17-molecule test set. The red stars are the experimental results.
3.2. Influence of atomic charge uncertainties on solvation energies
We also examined the influence of charge perturbation on solvation energy calculations with fixed radii (ZAP-9). As an example, there are 14 different types of atoms in N, N-dimethyl-p-methoxybenzamide, as shown in the Supporting Material. We model the surrogate with 13 inputs because of the constraint on the sum of the charges. The mean and standard deviation are computed from the results of 11 different charge fitting approaches. We used no more than 3000 multi-variate normalized Hermite polynomials (up to fourth order) in the gPC surrogate model of the solvation energy for all the molecules. Figure 4 (a) presents the RMSE obtained by our method with respect to different numbers of samples Eq. It illustrates that 300 output samples are needed to reduce the RMSE to less than 5%. Figure 4 (b) compares the PDF obtained by our method and the reference solution. The numerical results are obtained by first constructing the surrogate model with the 300 output samples and then sampling the surrogate model 10,000 times with random samples to estimate the PDF. The reference solution is computed from the 10,000 outputs of Eq.
Figure 4:
Performance of surrogate models for charge uncertainties for N, N-dimethyl-p-methoxybenzamide. (a): RMSE for the surrogate model with different numbers of output samples. (b): comparison of the PDFs estimated by the numerical surrogate method (“Numeric”) based on 300 output samples of APBS; the dashed line (“Experiment”) is the experimental result; diamonds are the results obtained using atomic charges from AM1BCC, CHELP, CHELPG, CM2, ESPMK, Gasteiger, PCMESP, Qeq, RESP, MMFF94, and Mulliken.
The influences of the uncertainties in the input atomic charges on the solvation energy for each molecule are shown in Figure 5. For most molecules, the experimental results lie within the whiskers, and some of them lie within the box. Figure 6 shows the number of output samples needed to construct a surrogate with RMSE less than 5% as a function of the number of atom types.
Figure 5:
Results of atomic charge uncertainties. Box-whisker plots showing the uncertainties in the numerical results of the solvation energy for 17 compounds. The red stars are the experimental results.
Figure 6:
“o”: number of output samples needed to construct a surrogate model with RMSE less than 5% with respect to the number of atom types; “−” is the best-fit curve 1.4x² + 1.9x + 7.9.
3.3. Combined influence of radius and atomic charge uncertainties
We chose the charge methods and radii based on their popularity in the implicit solvent community. Not all of the radii and charges examined in this study would be expected to give accurate answers when used together. We could have chosen a more constrained set; however, we chose this diverse set as a more challenging example that illustrates our approach across significant variation in parameter values. Comparing the PDFs in Figures 1 (b) and 4 (b), we notice that the uncertainty in the solvation energy induced by the atomic charges is larger than that induced by the radii. A similar observation has been made previously by Chakavorty et al. [58], who also noted that conformation introduces another important source of uncertainty across force fields. We have investigated the influence of conformational uncertainty on solvation in a previous paper using similar methods [35]; however, combining parameter and conformational uncertainty is outside the scope of this manuscript.
The atomic charges vary significantly across different methods while the variation in the radii is much smaller. To understand the combined influence of charges and radii on solvation energies, we modeled the correlated uncertainties in these two types of parameters with i.i.d. Gaussian random variables. We use N, N-dimethyl-p-methoxybenzamide as an example, for which 480 output samples are needed to reduce the RMSE to less than 5%. Figure 7 (a) presents the RMSE obtained by our method with respect to different numbers of samples Eq. Figure 7 (b) compares the PDF obtained by our method and the reference solution. The numerical results are obtained by constructing the surrogate model from the 480 output samples and then sampling the surrogate model 10,000 times with random samples to estimate the PDF. The reference solution is computed from the 10,000 outputs of Eq. Not surprisingly, the number of output samples needed to construct an accurate surrogate increases when we take into account uncertainties in both the charges and the radii. The shape of the solvation energy PDF also changes only slightly, because the variations of the radii across different methods are much smaller than the charge variations.
Figure 7:
Results of radii and charge uncertainties for N, N-dimethyl-p-methoxybenzamide. (a): RMSE with different numbers of output samples M. (b): comparison of the PDFs estimated by the numerical method (“Numeric”) based on 480 output samples of APBS; the dashed line (“Experiment”) is the experimental result.
The influences of the combined uncertainties in the input radii and atomic charges on the solvation energy for each molecule are shown in Figure 8. This figure is similar to Figure 5 because the uncertainties in the atomic charges dominate the results. Figure 9 shows the number of output samples needed to construct a surrogate with less than 5% RMSE for all 17 molecules in the test set. This figure illustrates the approximately quadratic scaling with respect to the number of atom types in the molecule.
Figure 8:
Results of radius and atomic charge uncertainties. Box-whisker plots showing the uncertainties in the numerical results of the solvation energy for 17 compounds. The red stars are the experimental results.
Figure 9:
“o”: number of output samples needed to construct a surrogate model with RMSE less than 5% with respect to the number of atomic charge types plus radius types; “−” is the fitted curve −0.6x² + 45x − 188.
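The fitted curves reported in the captions of Figures 6 and 9 can be evaluated directly to estimate how many solver runs a molecule of a given size might need. The sketch below does this for N, N-dimethyl-p-methoxybenzamide, taking 14 charge types and 14 + 8 = 22 charge-plus-radius types from the text; the function names are hypothetical, and the curves are empirical fits over this test set rather than general scaling laws.

```python
import numpy as np

# Polynomial coefficients in decreasing order, copied from the figure captions.
CHARGE_FIT = [1.4, 1.9, 7.9]          # charges only (Figure 6)
COMBINED_FIT = [-0.6, 45.0, -188.0]   # charges plus radii (Figure 9)

def samples_for_charges(n_atom_types):
    """Estimated number of solver outputs for the charge-only surrogate."""
    return float(np.polyval(CHARGE_FIT, n_atom_types))

def samples_for_combined(n_charge_plus_radius_types):
    """Estimated number of solver outputs for the combined surrogate."""
    return float(np.polyval(COMBINED_FIT, n_charge_plus_radius_types))

charge_estimate = samples_for_charges(14)       # about 309
combined_estimate = samples_for_combined(22)    # about 512
```

These estimates are of the same order as the 300 and 480 output samples reported for this molecule in the charge-only and combined analyses.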
4. Conclusions
We have developed a new method for quantifying the uncertainty associated with parameterization of implicit solvent models. In particular, we used a newly developed extension of the compressive sensing method to construct surrogate models of solvation energy based on gPC expansions. These surrogate models allow us to efficiently and accurately estimate the variation in solvation energy due to uncertainty in charge and radius parameters. In this initial work, we used statistical distributions for radius and charge variation based on the observed differences in the parameter sets. However, in future studies, it may be useful to use the uncertainty quantification approach presented here with more physically motivated models that address the underlying uncertainties in determining charge and radius parameters. Our results demonstrate that for the data sets used in the present work, the variation of radii across different approaches is small. On the other hand, the variations of the atomic charges obtained by different methods are much larger, so the number of output samples needed for accurate UQ analysis is much larger, growing quadratically with respect to the number of atom types. This framework can be applied to estimate the statistics (e.g., mean, variance), PDF, confidence intervals, Chernoff-like bounds [59], etc. of solvation calculations and other chemical calculations when the inputs are uncertain. The current study focused on uncertainty in solute charges and radii; however, this framework could also be applied to other solvation model characteristics such as dielectric coefficient, solvent radius, and biomolecular surface definition. Likewise, this approach could also be used for quantities of interest other than solvation energy, e.g., dipole moments and titration states.
In the future, we anticipate that this approach could be used for a much wider range of force field parameterization activities, including both coarse-grained and atomistic representations of biomolecules. Uncertainty quantification methods have begun to be used in force field parameterization of simple alkane systems [60]; this paper demonstrates the ability to extend the methods to higher-dimensional systems with more diversity of atom types. Application of these methods offers the benefit of efficiently characterizing parameter space and understanding the impact of parameter variation on quantities of interest. Additionally, the iterative method we used in the present work is well suited to this type of problem, as the accuracy of the surrogate models improves significantly after the iterations. In particular, the error of the surrogate models for the uncertainties induced by atomic charges is reduced by 40% to 50% compared with the standard compressive sensing method. There is also significant room for development in the numerical methods. For example, the sparsity-enhancing approaches can be combined with other techniques including improved sampling strategies [61, 62], adaptive basis selection [63, 64], and advanced optimization methods [65, 66]. These approaches improve the accuracy of the compressive sensing method in different ways. As such, they will help to reduce the number of expensive simulations or quantum mechanics calculations needed for constructing accurate surrogates.
Supplementary Material
Acknowledgments
This work was supported by the U.S. Department of Energy, Office of Science, Office of Advanced Scientific Computing Research as part of the Collaboratory on Mathematics for Mesoscopic Modeling of Materials (CM4) and by NIH grant GM069702. Pacific Northwest National Laboratory is operated by Battelle for the DOE under Contract DE-AC05-76RL01830.
References
- [1] Lamm Gene. The Poisson-Boltzmann Equation, pages 147–365. John Wiley & Sons, Inc., 2003.
- [2] Ren PY, Chun JH, Thomas DG, Schnieders MJ, Marucho M, Zhang JJ, and Baker NA. Biomolecular electrostatics and solvation: a computational perspective. Quarterly Reviews of Biophysics, 45(4):427–491, 2012.
- [3] Grochowski Paweł and Trylska Joanna. Continuum molecular electrostatics, salt effects, and counterion binding: a review of the Poisson-Boltzmann theory and its modifications. Biopolymers, 89(2):93–113, 2008.
- [4] Ponder Jay W. and Case David A. Force fields for protein simulations. Advances in Protein Chemistry, 66:27–85, 2003.
- [5] Gosink Luke J., Overall Christopher C., Reehl Sarah M., Whitney Paul D., Mobley David L., and Baker Nathan Andrew. Bayesian model averaging for ensemble-based estimates of solvation free energies. The Journal of Physical Chemistry B, doi: 10.1021/acs.jpcb.6b09198, 2016.
- [6] Swanson Jessica M. J., Adcock Stewart A., and McCammon J. Andrew. Optimized radii for Poisson-Boltzmann calculations with the AMBER force field. Journal of Chemical Theory and Computation, 1(3):484–493, 2005.
- [7] Li Chuan, Li Lin, Petukh Marharyta, and Alexov Emil. Progress in developing Poisson-Boltzmann equation solvers. Molecular Based Mathematical Biology, 1:42–62, 2013.
- [8] Swanson Jessica M. J., Mongan John, and McCammon J. Andrew. Limitations of atom-centered dielectric functions in implicit solvent models. The Journal of Physical Chemistry B, 109(31):14769–14772, 2005.
- [9] Swanson Jessica M. J., Wagoner Jason A., Baker Nathan Andrew, and McCammon J. Andrew. Optimizing the Poisson dielectric boundary with explicit solvent forces and energies: Lessons learned with atom-centered dielectric functions. Journal of Chemical Theory and Computation, 3(1):170–183, 2007.
- [10] Bates PW, Wei GW, and Zhao Shan. Minimal molecular surfaces and their applications. Journal of Computational Chemistry, 29(3):380–391, 2008.
- [11] Dong Feng and Zhou Huan-Xiang. Electrostatic contribution to the binding stability of protein-protein complexes. Proteins, 65:87–102, 2006.
- [12] Eckert F, Diedenhofen M, and Klamt A. Towards a first principles prediction of pK(a): COSMO-RS and the cluster-continuum approach. Molecular Physics, 108(3-4):229–241, 2010.
- [13] Tomasi Jacopo, Mennucci Benedetta, and Cammi Roberto. Quantum mechanical continuum solvation models. Chemical Reviews, 105(8):2999–3094, 2005.
- [14] Schnieders Michael J. and Ponder Jay W. Polarizable atomic multipole solutes in a generalized Kirkwood continuum. Journal of Chemical Theory and Computation, 3(6):2083–2097, 2007.
- [15] Schnieders MJ, Baker NA, Ren P, and Ponder JW. Polarizable atomic multipole solutes in a Poisson-Boltzmann continuum. J Chem Phys, 126(12):124114, 2007.
- [16] Nicholls Anthony, Mobley David L, Guthrie J Peter, Chodera John D, Bayly Christopher I, Cooper Matthew D, and Pande Vijay S. Predicting small-molecule solvation free energies: an informal blind test for computational chemistry. Journal of Medicinal Chemistry, 51(4):769–779, 2008.
- [17] Besler Brent H, Merz Kenneth M, and Kollman Peter A. Atomic charges derived from semiempirical methods. Journal of Computational Chemistry, 11(4):431–439, 1990.
- [18] Haschka Thomas, Hénon Eric, Jaillet Christophe, Martiny Laurent, Etchebest Catherine, and Dauchez Manuel. Direct minimization: Alternative to the traditional L2 norm to derive partial atomic charges. Computational and Theoretical Chemistry, 1074:50–57, 2015.
- [19] Bayly Christopher I, Cieplak Piotr, Cornell Wendy, and Kollman Peter A. A well-behaved electrostatic potential based method using charge restraints for deriving atomic charges: the RESP model. The Journal of Physical Chemistry, 97(40):10269–10280, 1993.
- [20] Bader Richard F. W. A quantum theory of molecular structure and its applications. Chemical Reviews, 91(5):893–928, 1991.
- [21] Lee B and Richards FM. The interpretation of protein structures: Estimation of static accessibility. Journal of Molecular Biology, 55(3):379–400, 1971.
- [22] Connolly ML. Analytical molecular surface calculation. Journal of Applied Crystallography, 16(5):548–558, 1983.
- [23] Grant J. Andrew, Pickup Barry T., and Nicholls Anthony. A smooth permittivity function for Poisson-Boltzmann solvation methods. Journal of Computational Chemistry, 22(6):608–640, 2001.
- [24] Im W, Beglov D, and Roux B. Continuum solvation model: Computation of electrostatic forces from numerical solutions to the Poisson-Boltzmann equation. Computer Physics Communications, 111:59–75, 1998.
- [25] Bates PW, Chen Zhan, Sun Yuhui, Wei Guo-Wei, and Zhao Shan. Geometric and potential driving formation and evolution of biomolecular surfaces. Journal of Mathematical Biology, 59:193–231, 2009.
- [26] Chen Z, Baker NA, and Wei GW. Differential geometry based solvation model I: Eulerian formulation. J Comput Phys, 229(22):8231–8258, 2010.
- [27] Cheng Li-Tien, Dzubiella Joachim, McCammon J. Andrew, and Li Bo. Application of the level-set method to the implicit solvation of nonpolar molecules. The Journal of Chemical Physics, 127(8):084503–084503, 2007.
- [28] Dzubiella J, Swanson JMJ, and McCammon JA. Coupling hydrophobicity, dispersion, and electrostatics in continuum solvent models. 96, 2006.
- [29] Dzubiella J, Swanson JM, and McCammon JA. Coupling nonpolar and polar solvation free energies in implicit solvent models. The Journal of Chemical Physics, 124(8):84905–84905, 2006.
- [30] Sitkoff Doree, Sharp Kim A., and Honig Barry. Accurate calculation of hydration free energies using macroscopic solvent models. The Journal of Physical Chemistry, 98(7):1978–1988, 1994.
- [31] Ghanem Roger G. and Spanos Pol D. Stochastic finite elements: a spectral approach. Springer-Verlag, New York, 1991.
- [32] Xiu Dongbin and Karniadakis George Em. The Wiener-Askey polynomial chaos for stochastic differential equations. SIAM J. Sci. Comput, 24(2):619–644, 2002.
- [33] Todor Radu Alexandru and Schwab Christoph. Convergence rates for sparse chaos approximations of elliptic problems with stochastic coefficients. IMA J. Numer. Anal, 27(2):232–261, 2007.
- [34] Babuška Ivo, Nobile Fabio, and Tempone Raul. A stochastic collocation method for elliptic partial differential equations with random input data. SIAM Rev., 52(2):317–355, 2010.
- [35] Lei Huan, Yang Xiu, Zheng Bin, Lin Guang, and Baker Nathan A. Constructing surrogate models of complex systems with enhanced sparsity: quantifying the influence of conformational uncertainty in biomolecular solvation. SIAM Multiscale Model. Simul, 13(4):1327–1353, 2015.
- [36] Yang Xiu, Lei Huan, Baker Nathan A, and Lin Guang. Enhancing sparsity of Hermite polynomial expansions by iterative rotations. Journal of Computational Physics, 307:94–109, 2016.
- [37] Baker Nathan A, Sept David, Joseph Simpson, Holst Michael J, and McCammon J Andrew. Electrostatics of nanosystems: application to microtubules and the ribosome. Proceedings of the National Academy of Sciences, 98(18):10037–10041, 2001.
- [38] Singh U Chandra and Kollman Peter A. An approach to computing electrostatic charges for molecules. Journal of Computational Chemistry, 5(2):129–145, 1984.
- [39] Jakalian Araz, Bush Bruce L, Jack David B, and Bayly Christopher I. Fast, efficient generation of high-quality atomic charges. AM1-BCC model: I. Method. Journal of Computational Chemistry, 21(2):132–146, 2000.
- [40] Chirlian Lisa Emily and Francl Michelle Miller. Atomic charges derived from electrostatic potentials: A detailed study. Journal of Computational Chemistry, 8(6):894–905, 1987.
- [41] Breneman Curt M and Wiberg Kenneth B. Determining atom-centered monopoles from molecular electrostatic potentials. The need for high sampling density in formamide conformational analysis. Journal of Computational Chemistry, 11(3):361–373, 1990.
- [42] Li Jiabo, Zhu Tianhai, Cramer Christopher J, and Truhlar Donald G. New class IV charge model for extracting accurate partial charges from wave functions. The Journal of Physical Chemistry A, 102(10):1820–1831, 1998.
- [43] Gasteiger Johann and Marsili Mario. Iterative partial equalization of orbital electronegativity: a rapid access to atomic charges. Tetrahedron, 36(22):3219–3228, 1980.
- [44] Cammi Roberto and Tomasi Jacopo. Remarks on the use of the apparent surface charges (ASC) methods in solvation problems: Iterative versus matrix-inversion procedures and the renormalization of the apparent charges. Journal of Computational Chemistry, 16(12):1449–1458, 1995.
- [45] Rappe Anthony K and Goddard William A III. Charge equilibration for molecular dynamics simulations. The Journal of Physical Chemistry, 95(8):3358–3363, 1991.
- [46] Halgren Thomas A. Merck molecular force field. I. Basis, form, scope, parameterization, and performance of MMFF94. Journal of Computational Chemistry, 17(5-6):490–519, 1996.
- [47] Mulliken Robert S. Electronic population analysis on LCAO-MO molecular wave functions. I. The Journal of Chemical Physics, 23(10):1833–1840, 1955.
- [48] Knight Jennifer L. and Brooks Charles L. Surveying implicit solvent models for estimating small molecule absolute hydration free energies. Journal of Computational Chemistry, 32(13):2909–2923, October 2011.
- [49] Hou Guanhua, Zhu Xiao, and Cui Qiang. An implicit solvent model for SCC-DFTB with charge-dependent radii. J. Chem. Theory Comput, 6(8):2303–2314, August 2010.
- [50] Ginovska Bojana, Camaioni Donald M., Dupuis Michel, Schwerdtfeger Christine A., and Gil Quinn. Charge-dependent cavity radii for an accurate dielectric continuum model of solvation with emphasis on ions: Aqueous solutes with oxo, hydroxo, amino, methyl, chloro, bromo, and fluoro functionalities. J. Phys. Chem. A, 112(42):10604–10613, October 2008.
- [51] Czodrowski Paul, Dramburg Ingo, Sotriffer Christoph A., and Klebe Gerhard. Development, validation, and application of adapted PEOE charges to estimate pKa values of functional groups in protein-ligand complexes. Proteins: Structure, Function, and Bioinformatics, 65(2):424–437, August 2006.
- [52] Yang Qingyi and Sharp Kim A. Atomic charge parameters for the finite difference Poisson-Boltzmann method using electronegativity neutralization. J. Chem. Theory Comput, 2(4):1152–1167, July 2006.
- [53] Jorgensen William L., Maxwell David S., and Tirado-Rives Julian. Development and testing of the OPLS all-atom force field on conformational energetics and properties of organic liquids. J. Am. Chem. Soc, 118(45):11225–11236, 1996.
- [54] Bondi A. Van der Waals volumes and radii. The Journal of Physical Chemistry, 68(3):441–451, 1964.
- [55] Connolly Michael L. Computation of molecular volume. J. Am. Chem. Soc, 107(5):1118–1124, March 1985.
- [56] Micu Alexandru M., Bagheri Babak, Ilin Andrew V., Scott Ridgway, and Pettitt. Numerical considerations in the computation of the electrostatic free energy of interaction within the Poisson-Boltzmann theory. Journal of Computational Physics, 136(2):263–271, September 1997.
- [57] Sharp Kim A. and Honig Barry. Calculating total electrostatic energies with the nonlinear Poisson-Boltzmann equation. J. Phys. Chem, 94(19):7684–7692, September 1990.
- [58] Chakavorty Arghya, Li Lin, and Alexov Emil. Electrostatic component of binding energy: Interpreting predictions from Poisson-Boltzmann equation and modeling protocols. J. Comput. Chem, 37(28):2495–2507, October 2016.
- [59] Rasheed Muhibur, Clement Nathan, Bhowmick Abhishek, and Bajaj Chandrajit. Statistical framework for uncertainty quantification in computational molecular modeling. In Proceedings of the 7th ACM International Conference on Bioinformatics, Computational Biology, and Health Informatics, pages 146–155. ACM, 2016.
- [60] Messerly Richard A., Knotts Thomas A., and Wilding W. Vincent. Uncertainty quantification and propagation of errors of the Lennard-Jones 12-6 parameters for n-alkanes. Journal of Chemical Physics, 146(19):194110, 2017.
- [61] Rauhut Holger and Ward Rachel. Sparse Legendre expansions via ℓ1-minimization. J. Approx. Theory, 164(5):517–533, 2012.
- [62] Peng Ji, Hampton Jerrad, and Doostan Alireza. A weighted ℓ1-minimization approach for sparse polynomial chaos expansions. J. Comput. Phys, 267(0):92–111, 2014.
- [63] Yang Xiu, Choi Minseok, Lin Guang, and Karniadakis George Em. Adaptive ANOVA decomposition of stochastic incompressible and compressible flows. J. Comput. Phys, 231(4):1587–1614, 2012.
- [64] Jakeman John D, Eldred Michael S, and Sargsyan Khachik. Enhancing ℓ1-minimization estimates of polynomial chaos expansions using basis selection. J. Comput. Phys, 289:18–34, 2015.
- [65] Candès Emmanuel J., Wakin Michael B., and Boyd Stephen P. Enhancing sparsity by reweighted ℓ1 minimization. J. Fourier Anal. Appl, 14(5-6):877–905, 2008.
- [66] Yang Xiu and Karniadakis George Em. Reweighted ℓ1 minimization method for stochastic elliptic differential equations. J. Comput. Phys, 248(1):87–108, 2013.