Author manuscript; available in PMC: 2012 Oct 8.
Published in final edited form as: J Chem Theory Comput. 2009;5(12):3260–3278. doi: 10.1021/ct9001174

A “Reverse-Schur” Approach to Optimization With Linear PDE Constraints: Application to Biomolecule Analysis and Design

Jaydeep P. Bardhan, Michael D. Altman, B. Tidor, Jacob K. White
PMCID: PMC3465730  NIHMSID: NIHMS156852  PMID: 23055839

Abstract

We present a partial-differential-equation (PDE)-constrained approach for optimizing a molecule’s electrostatic interactions with a target molecule. The approach, which we call reverse-Schur co-optimization, can be more than two orders of magnitude faster than the traditional approach to electrostatic optimization. The efficiency of the co-optimization approach may enhance the value of electrostatic optimization for ligand-design efforts: in such projects, it is often desirable to screen many candidate ligands for their viability, and the optimization of electrostatic interactions can improve ligand binding affinity and specificity. The theoretical basis for electrostatic optimization derives from linear-response theory, most commonly continuum models, and simple assumptions about molecular binding processes. Although the theory has been used successfully to study a wide variety of molecular binding events, its implications have not yet been fully explored, in part due to the computational expense associated with the optimization. The co-optimization algorithm achieves improved performance by solving the optimization and electrostatic simulation problems simultaneously, and is applicable to both unconstrained and constrained optimization problems. Reverse-Schur co-optimization resembles other well-known techniques for solving optimization problems with PDE constraints. Model problems as well as realistic examples validate the reverse-Schur method, and demonstrate that our technique and alternative PDE-constrained methods scale very favorably compared to the standard approach. Regularization, which ordinarily requires an explicit representation of the objective function, can be included using an approximate Hessian calculated using the new BIBEE/P (boundary-integral-based electrostatics estimation by preconditioning) method.

1 INTRODUCTION

The problem of optimizing electrostatic interactions is a task of particular importance in molecular design. One asks whether a candidate designed molecule, or ligand, is optimal for binding the target molecule, which is called a receptor, and if not, what chemical modifications might be made to improve binding affinity or specificity. A variety of factors contribute to the binding free energy, including conformational entropy, although often the contributions are dominated by packing effects and electrostatics. Although the short-range packing interactions can be conceptualized relatively easily, analysis of the electrostatic component is more complex. The electrostatic component of the binding free energy can be particularly non-intuitive due to the interactions’ long range and the trade-off between favorable ligand–receptor interactions in the bound state and the unfavorable desolvation penalties paid on binding.1 These non-intuitive features have led to the important but challenging goal of designing optimal electrostatic interactions as an approach to designing useful molecular binding partners.1,2 Questions in molecular biology regarding the evolution of biomolecules, whether to serve specific functions or to bind targets with high affinity and specificity, may also be interpreted as questions regarding optimization of a particular objective function.24

Lee and Tidor presented the first work describing the possibility of optimizing electrostatic interactions between molecules,1 showing that linear-response theory and simple assumptions about binding events—in particular, that the ligand binds rigidly and that no charge redistribution occurs on binding—give rise to a quadratic model for the electrostatic contribution to the binding free energy. Their primarily analytical study used a multipole-based representation of the ligand charge distribution and spherical geometries for the unbound ligand and the ligand–receptor complex. Chong et al. applied this theory to an idealized model of the protein barnase and found that small sets of biochemically reasonable charge distributions resembled the computed optimal charge distribution.5 Kangas and Tidor later proved that the electrostatic component of the binding free energy is a convex function under reasonable assumptions and extended the theory to address nonspherical geometries, alternative basis sets, and measures of binding specificity.68

Following these developments, Lee and Tidor studied the interactions between two proteins, the extremely tight-binding partners barnase and barstar;3,9 their analysis suggested that the inhibitor barstar is electrostatically optimized to bind to the enzyme barnase. In another application of the optimization theory, Kangas and Tidor studied the enzyme B. subtilis chorismate mutase.2 This investigation indicated a particularly promising modification to improve the binding affinity of a transition-state analog inhibitor—the replacement of a carboxylate group by a nitro group. Mandal and Hilvert synthesized the proposed inhibitor; in agreement with the computational analysis, the resulting ligand bound the enzyme more tightly, and was in fact the tightest-binding chorismate mutase inhibitor reported to date in the literature.10

Several groups have applied the optimization theory to study a number of other molecular systems. Sulea and Purisima have studied cation–protein binding, the optimization of protein–protein interfaces, and the use of the charge optimization framework as a means to identify “hot spots” for binding.11,12 Sims et al. studied two protein kinases, protein kinase A (PKA) and cyclin-dependent kinase 2 (CDK2), and several inhibitors.13 Green and Tidor have applied charge optimization theory to two systems.4,14 In one, they demonstrated that glutaminyl-tRNA synthetase is optimized for its substrates;4 more recently, they proposed optimization-theory-based mutations to 5-Helix, which inhibits HIV-1 membrane fusion by gp41.14 Armstrong et al. have studied several inhibitors of neuraminidase and the relation between charge optimization and lead progression.15 Gilson has explored the theory allowing the optimization of flexible ligands.16 Schreiber and collaborators have also focused on optimizing ligand–receptor electrostatics.17,18 Brock et al. have used a theory similar to Lee and Tidor’s in their analysis of protein–protein complexes.19

The computational expense associated with optimization has limited the broad application of the electrostatic-optimization theory. Traditional approaches to optimization begin with an explicit representation of the second-derivative, or Hessian, matrix or a means by which to multiply a vector by the Hessian. For electrostatic optimization problems, a large cost associated with calculating the Hessian matrix explicitly is the successive simulation of the bound and unbound systems with each of the point charges (more generally, the basis functions) used to describe the ligand charge distribution.1 The cost to form the Hessian must be paid before optimization can be performed, and it scales essentially linearly with the number of basis functions for reasonably-sized problems. Lee and Tidor emphasized, in their original optimization paper, the importance of using sufficiently complete basis function sets to achieve convergence to the optimum affinity.1 However, the fixed-location point-charge basis sets used in most electrostatic optimization work offer little insight into geometric sensitivity or basis-set completeness. The availability of more efficient computational methods may therefore enable not only greater numbers of ligands to be optimized in design efforts, but also a more thorough exploration of the optimization theory itself and the extent to which biology may have employed electrostatic optimization to achieve desired binding affinities and specificities.

This paper presents a new, highly efficient approach, which we call reverse-Schur co-optimization, to solving the electrostatic optimization problem. The theory and implementation of efficient methods for optimization problems constrained by partial differential equations (PDEs) have become a progressively more important research topic over the past several years,2023 and we show that electrostatic optimization is actually a special case of a PDE-constrained optimization problem. Most PDE-constrained optimization techniques follow one of two approaches. All-at-once approaches incorporate the PDE state variables (for electrostatic optimization, the electrostatic potentials in the bound and unbound states) directly into the optimization problem. The state variables together with the decision variables (the point charge values) satisfy the PDE, which is included as an equality constraint.2225 Such techniques are often termed simultaneous analysis and design (SAND) approaches.20 The second general strategy, sometimes called a black-box approach, hides the PDE from the optimization algorithm.23 The nested analysis and design (NAND) paradigm is a black-box method,26,27 as are techniques that directly invert the PDE constraint before initiating optimization. The calculation of an explicit Hessian is effectively a black-box approach, because the mathematical details of the PDE simulation are entirely hidden from the optimization procedure. In electrostatic optimization problems, the decision variables and the state variables are related by a linear matrix equation. As we show in this paper, this linearity allows optimal charge distributions to be found without calculating the Hessian explicitly and without using the discretized PDE as an equality constraint.

To demonstrate the co-optimization method’s performance on problems of therapeutic relevance, we have applied the optimization methodology to two protein–small-molecule ligand complexes. The first is a complex between HIV-1 protease and the small-molecule inhibitor darunavir (TMC-114).28 HIV-1 protease is an essential enzyme in the life cycle of HIV, and small-molecule inhibitors of the protease have been successful components of combinatorial strategies for treating HIV infection.29,30 The second protein–ligand system studied is a complex between the protein cyclin-dependent kinase 2 (CDK2) and a small-molecule inhibitor.31 The CDK family of proteins is involved in regulating cell growth, and inhibitors of these enzymes are of interest as cancer therapies.32,33 It may be possible to use charge optimization to identify regions of these small-molecule ligands that are suboptimal for binding their protein target, and chemical modification at these locations may lead to improved inhibitors.

The following section describes a linear-response continuum model for biomolecule electrostatics, two boundary-integral formulations of the PDE problem, boundary-element methods (BEM) for solving the integral equations numerically, the optimization problem based on the linear-response model, and methods for convex quadratic optimization. Section 3 presents the new reverse-Schur co-optimization method. In addition, we describe two more widely used approaches to PDE-constrained optimization problems, partially to highlight differences between these methods and the reverse-Schur approach, and partially to illustrate that the performance gains are not necessarily specific to the reverse-Schur method. Techniques for constrained co-optimization are also presented. Important details of the implementation—regularization methods and preconditioning—are described in Section 4. In Section 5 we present computational results that validate the method, demonstrate its computational efficiency, and show that realistic problems in biomolecule design can be studied using PDE-constrained optimization methods. Section 6 summarizes the paper and suggests future research directions.

2 THEORY

2.1 A Linear-Response Model for Estimating the Electrostatic Contribution to the Free Energy of Binding Between Biomolecules

Free energies of binding are commonly estimated using a thermodynamic cycle such as that shown in Figure 1.34 The lower set of images represents the ligand, receptor, and ligand–receptor complex in aqueous solvent, and the lower horizontal arrow represents the binding free energy ΔG^0_bind to be estimated. The unbound ligand and receptor and the bound complex are assumed to be at infinite dilution. The upper cartoons represent the three species in a homogeneous low-dielectric environment with zero ionic strength throughout, and the horizontal arrow denoted by ΔG^{0,ref}_bind represents the free energy change on binding in the low-dielectric environment, the electrostatic component of which is simply the ligand–receptor Coulomb-interaction energy. The three steps illustrated by vertical arrows involve the transfer of a molecule or complex between the low-dielectric environment and the solvent. The difference in a molecule’s free energy as it is transferred into solvent from the reference low-dielectric medium is called its solvation free energy,34 and this free energy is frequently decomposed into non-polar and electrostatic terms so that

ΔG^0_solv = ΔG^{0,np}_solv + ΔG^{0,es}_solv. (1)

Figure 1.


A thermodynamic cycle for estimating binding free energies. The shaded regions on the lower set of cartoons represent aqueous solvent. The upper cartoons represent a uniform low dielectric (the same as that of the ligand and receptor) with zero ionic strength throughout.

In many models, the non-polar free energy ΔG^{0,np}_solv is proportional to the molecular surface area, although recently more sophisticated models have been developed and parameterized (see, for example,35,36). The electrostatic component ΔG^{0,es}_solv is often estimated using a macroscopic continuum electrostatic model,34,37 shown in Figure 2.

Figure 2.


A mixed discrete-continuum model for estimating the electrostatic component of a solute’s solvation free energy; εI and εII represent the dielectric constants of the solute and solvent regions, Γ is the boundary between the dielectric regions and is typically a molecular (solvent-excluded) surface, and q1 and q2 are representative discrete point charges in the solute.

The molecule–solvent boundary, denoted as Γ in Figure 2, is taken to be the Richards molecular surface38 and separates the molecular interior, region I, from the solvent exterior, region II. The interior is modeled as a homogeneous dielectric with low dielectric constant ε_I and a charge distribution ρ(r); in this work, we assume that the charge distribution consists of n_c discrete point charges, the i-th of which is located at r_i and has charge q_i. For many biomolecules, n_c ranges from a few dozen to several thousand. The electrostatic potential in region I, φ_I(r), satisfies a Poisson equation:

∇²φ_I(r) = −(1/ε_I) Σ_{i=1}^{n_c} q_i δ(r − r_i). (2)

The solvent region is modeled as a homogeneous dielectric with high dielectric constant εII; in this region, the electrostatic potential φII(r) satisfies the Laplace equation

∇²φ_II(r) = 0, (3)

for non-ionic solutions, or for dilute ionic solutions, the linearized Poisson–Boltzmann equation (LPBE):

∇²φ_II(r) = κ²φ_II(r), (4)

where κ is the inverse Debye length. The continuity of the potential and normal displacement furnish boundary conditions for both regions.39 In the remainder of this paper, we assume that κ = 0, noting that the PDE-constrained optimization techniques apply equally well when the LPBE is used to model the potential in the solvent.

This set of coupled partial differential equations (PDEs) cannot be solved analytically except for relatively simple geometries. For general problems and realistic treatments of molecular geometries, numerical methods such as the finite-difference, finite-element, or boundary-element methods must be employed.39–61 In contrast to the finite-difference and finite-element methods, which discretize the differential form of the PDE, boundary-element methods discretize boundary-integral-equation formulations of the PDE problem.39,40,42,43,47,53–55,59,62–66 Boundary-integral formulations possess attractive theoretical and numerical properties such as reduced dimensionality, the possibility of exact treatment of the dielectric boundary, and exact treatment of point-charge effects and boundary conditions at infinity.39 Many integral-equation approaches have been described in the literature;39,40,42,47,55,62,67–69 in this paper, we present only the polarizable continuum model (PCM) formulation introduced by Miertus et al.40,51,70,71 (which was independently derived by Shaw and Zauhar, who also called it the apparent-surface-charge (ASC) formulation42,43,45), and the formulation introduced by Yoon and Lenhoff.47

Using the electrostatic model in Figure 2, the computational challenge associated with calculating ΔG^{0,es}_solv, the electrostatic contribution to a solute’s solvation free energy, is evaluating the reaction potential induced by solvent polarization at each of the n_c charge locations in response to the charges themselves. Because we have assumed linear response, the vector of reaction potentials at the charge locations, φ_R, can be written as a weighted combination of the responses due to each of the individual point charges,

φ_R = S q, (5)

where we have defined S to be the reaction-potential, or solvation matrix. The electrostatic free energy is then a quadratic function of q:

ΔG^{0,es}_solv = (1/2) q^T S q. (6)

We now derive expressions for the reaction-potential matrix S such that it may be written as

S = M_3 M_2^{-1} M_1, (7)

where M_3, M_2^{-1}, and M_1 are linear operators.
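As a concrete illustration of Equation (7), the following numpy sketch applies S to a charge vector via a single linear solve with M2, rather than forming S or the inverse of M2 explicitly. The matrices here are random stand-ins with made-up sizes, not actual BEM operators:

```python
import numpy as np

# Sketch: applying S = M3 M2^{-1} M1 to a charge vector q costs one linear
# solve with M2; the solvation matrix S never needs to be formed explicitly.
# All matrices below are random stand-ins (illustrative sizes only).
rng = np.random.default_rng(0)
nc, npanels = 5, 40                      # charges, boundary elements (made up)
M1 = rng.standard_normal((npanels, nc))  # charges -> boundary right-hand side
M2 = np.eye(npanels) + 0.1 * rng.standard_normal((npanels, npanels))  # BEM operator stand-in
M3 = rng.standard_normal((nc, npanels))  # surface charge -> reaction potentials
q = rng.standard_normal(nc)

phi_R = M3 @ np.linalg.solve(M2, M1 @ q)    # Eq. (10): no explicit inverse
S_explicit = M3 @ np.linalg.inv(M2) @ M1    # what was avoided above
assert np.allclose(phi_R, S_explicit @ q)
dG_es = 0.5 * q @ phi_R                     # energy, as in Eq. (6)/(11)
print(dG_es)
```

In a production code the action of M2 would itself be applied matrix-free by a fast algorithm and the solve performed iteratively; the dense solve here is purely illustrative.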

2.1.1 The Apparent-Surface-Charge Formulation

Numerous groups have derived the boundary-integral equation for the surface charge that develops at a dielectric boundary in response to a distribution of charge;40,42,72,73 it is known variously as the polarizable-continuum model (PCM) and apparent-surface-charge (ASC) formulation,40,42,51,53 and has been widely used in biomolecular simulations.43,45,53,63,68,74 Rather than solving for the potential throughout space in the original mixed-dielectric PDE problem, one solves an equivalent problem with uniform dielectric constant εI everywhere, finding a distribution of charge σp(r) on Γ such that σp(r) reproduces the continuity conditions of the original mixed-dielectric problem. This surface charge satisfies the second-kind integral equation42

(ε_I + ε_II)/(2ε_I(ε_I − ε_II)) σ_p(r) + ∫̶_Γ (∂/∂n(r)) [σ_p(r′)/(4πε_I|r − r′|)] dA′ = −(∂/∂n(r)) Σ_{i=1}^{n_c} q_i/(4πε_I|r − r_i|), (8)

where ∫̶ denotes a principal-value integral75,76 and n(r) denotes the outward normal direction into solvent. The surface charge distribution σp(r) produces in the molecular interior (region I) a potential equal to that induced by the polarization of the solute. The reaction potential at a solute charge location ri is the result of convolving the free-space Green’s function with the surface charge distribution:

φ_R(r_i) = ∫_Γ σ_p(r′)/(4πε_I|r_i − r′|) dA′. (9)

The set of reaction potentials at all the charge locations is therefore the image of the charge distribution under three linear operators:

φ_R = M_3 M_2^{-1} M_1 q. (10)

The operator M1 maps the solute charge distribution to the induced normal-displacement field at the dielectric boundary; that is, the application of M1 to q generates the right-hand side (RHS) in Equation (8).

The operator M_2 generates the left-hand side in Equation (8) when applied to σp, and M_2^{-1} is used to denote the operator’s inverse. That is, M_2^{-1} applied to the RHS in Equation (8) generates σp(r). Finally, the integral operator M_3 maps the induced surface charge to the reaction potentials at the charge locations via Equation (9). Note that M_3 M_2^{-1} M_1 is an n_c-by-n_c matrix, even though M_1, M_2, and M_3 are operators.

Because the charge distribution is a set of discrete point charges, the difference in electrostatic free energy between the uniform εI domain and the mixed-dielectric problem is a finite-dimensional inner product:

ΔG^{0,es}_solv = (1/2) φ_R^T q, (11)

where φ_R denotes the vector of reaction potentials computed at the n_c charge locations. For problems in which ε_II > ε_I, S = M_3 M_2^{-1} M_1 is symmetric positive definite (SPD).

2.1.2 The Non-derivative Green’s-Theorem Formulation

Using Green’s theorem, Yoon and Lenhoff derived a pair of coupled integral equations capable of modeling solutes in dilute ionic solutions (that is, when the LPBE holds in the solvent).47 The integral equations are

(1/2)φ(r) + ∫̶_Γ φ(r′) [∂G_I(r; r′)/∂n(r′)] dA′ − ∫̶_Γ φ_n(r′) G_I(r; r′) dA′ = Σ_{i=1}^{n_c} (q_i/ε_I) G_I(r; r_i); (12)
(1/2)φ(r) − ∫̶_Γ φ(r′) [∂G_II(r; r′)/∂n(r′)] dA′ + (ε_I/ε_II) ∫̶_Γ φ_n(r′) G_II(r; r′) dA′ = 0, (13)

where G_I(r; r′) and G_II(r; r′) are the free-space Green’s functions in the solute and solvent regions and the unknown surface variables φ(r) and φ_n(r) are the surface potential and its normal derivative. These equations are derived by applying Green’s theorem in regions I and II, finding the potential at arbitrary points in these regions by substituting the relevant Green’s functions, and then letting the points approach the surface by taking appropriate limits.47,64,66 After solving Equation (12) and Equation (13) for φ and φ_n, the reaction potential at the ith charge location induced by solvent polarization can be written as

φ_R(r_i) = ∫_Γ [G_I(r_i; r′) φ_n(r′) − φ(r′) ∂G_I(r_i; r′)/∂n(r′)] dA′, (14)

and again the electrostatic solvation free energy can be written as a product of three linear operators as in Equation (7).

2.2 Numerical Solution of the Integral Equations Using Boundary-Element Methods and Fast Algorithms

The boundary-element method (BEM) is a popular technique for solving boundary-integral equations numerically. To solve an integral equation such as Equation (8) using the BEM, one first introduces a set of basis functions defined on the surface. Representing the unknown surface variable as a weighted combination of the basis functions reduces the exact infinite-dimensional problem to an approximation problem with a finite number of unknowns, the weights used to scale the basis functions. A set of constraints on the weights is then written to force the approximate representation of the surface variable to satisfy the discretized integral equation as closely as possible in some metric (see, for example,75). The resulting problem—that of finding the basis function weights that minimize some function of the residual—is a finite-dimensional matrix equation.

Usually, it is convenient to discretize the surface into a set of surface patches, or boundary elements, before defining the basis functions. In biomolecule electrostatic simulations, these elements are commonly planar triangles,47,63 although curved-element discretizations of molecule–solvent interfaces have been described by several groups.39,56,59,61,77,78 We present a boundary-element method for solving the ASC formulation. The Green’s-theorem formulation (Equation (12) and Equation (13)) can be solved analogously, but requires two weights for each basis function: one for the potential and one for its normal derivative. Full details for solving the Green’s-theorem formulation numerically can be found in refs 47, 64, and 66.

First, the molecule–solvent interface is discretized using n_p boundary elements, and then a set of n_p piecewise-constant basis functions is defined such that

χ_i(r) = 1 if r is on panel i, and 0 otherwise. (15)

The unknown surface charge density σp(r) is then represented approximately as

σ_p(r) ≈ Σ_{i=1}^{n_p} x_i χ_i(r), (16)

where the weights xi are unknown. Using a Galerkin discretization75 of the PCM/ASC formulation in which the inner integral is evaluated via one-point quadrature,68,79,80 one obtains the dense linear system M2x = M1q, with the entries of M2 and M1 given by

M_{2,ii} = (ε̂/(2ε_I)) α_i + ∫̶_{panel i} (∂/∂n(r)) [α_i/(4πε_I|r − r_ci|)] dA, (17)
M_{2,ij} = ∫̶_{panel i} (∂/∂n(r)) [α_j/(4πε_I|r − r_cj|)] dA, (i ≠ j) (18)
M_{1,ij} = −∫_{panel i} (∂/∂n(r)) [1/(4πε_I|r − r_j|)] dA, (19)

where α_i denotes the area of panel i, ε̂ = (ε_I + ε_II)/(ε_I − ε_II), n(r) denotes the outward normal at r, and r_ci denotes the centroid of panel i. The approach presented here differs slightly from the commonly used centroid-collocation method, which essentially approximates the outer Galerkin integral using one-point quadrature; the method described here offers superior accuracy.68,79 We note that the matrix entries of Equation (17), Equation (18), and Equation (19) are specific to the PCM/ASC formulation; if the Green’s theorem formulation,47,66 or other boundary-integral formulations, are employed to define the solvation matrix, similar matrices are defined that play analogous roles.80
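The discretization can be exercised on the one geometry with an analytic answer, the Born ion. The sketch below uses a simplified centroid-collocation variant of Equations (17)–(19): point panels on a Fibonacci sphere with the flat-panel self-term set to zero, not the paper’s Galerkin scheme or fast solver. It compares the computed solvation energy with the analytic Born result q²(1/ε_II − 1/ε_I)/(8πR):

```python
import numpy as np

# Born-ion test of the ASC/PCM integral equation (Eq. 8) via a simple
# point-panel collocation BEM. Illustrative only: the paper's method is a
# Galerkin discretization solved with FFTSVD; here a dense matrix is formed
# directly. Potential convention: q/(4*pi*eps*r).

def fibonacci_sphere(n, radius):
    """Roughly uniform points on a sphere; each carries area 4*pi*R^2/n."""
    k = np.arange(n) + 0.5
    polar = np.arccos(1.0 - 2.0 * k / n)
    azim = np.pi * (1.0 + 5.0**0.5) * k          # golden-angle spiral
    return radius * np.column_stack([np.sin(polar) * np.cos(azim),
                                     np.sin(polar) * np.sin(azim),
                                     np.cos(polar)])

eps_I, eps_II, R, q, n = 1.0, 80.0, 1.0, 1.0, 800
pts = fibonacci_sphere(n, R)
normals = pts / R                                # outward normals (radial)
area = 4.0 * np.pi * R**2 / n                    # area weight per "panel"

# Left-hand-side operator of Eq. (8) under centroid collocation.
diff = pts[:, None, :] - pts[None, :, :]         # r_i - r_j
dist = np.linalg.norm(diff, axis=-1)
np.fill_diagonal(dist, 1.0)                      # placeholder; self term set below
K = -np.einsum('ik,ijk->ij', normals, diff) / (4 * np.pi * eps_I * dist**3) * area
np.fill_diagonal(K, 0.0)                         # flat-panel principal value ~ 0
M2 = K + np.eye(n) * (eps_I + eps_II) / (2.0 * eps_I * (eps_I - eps_II))

# RHS: -d/dn of the Coulomb potential of charge q at the sphere center.
rhs = q * np.einsum('ik,ik->i', normals, pts) / (4 * np.pi * eps_I * R**3)

sigma = np.linalg.solve(M2, rhs)                 # apparent surface charge
phi_R = np.sum(sigma * area / (4 * np.pi * eps_I * R))   # Eq. (9) at the center
dG = 0.5 * q * phi_R                             # Eq. (6)/(11)

dG_born = q**2 / (8 * np.pi * R) * (1 / eps_II - 1 / eps_I)  # analytic Born
print(dG, dG_born)
```

With a few hundred points the two energies agree to within a few percent; convergence with panel count is slow because of the one-point quadrature, which is one motivation for the more careful panel integration described above.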

Protein-sized systems often require more than 10^5 unknowns and boundary elements to accurately represent the molecule–solvent interfaces and surface variables. Because solving the n-dimensional dense matrix equation M2x = M1q using LU factorization requires O(n^3) time, and even storing M2 requires prohibitively large O(n^2) memory, more efficient methods have been developed whose time and memory requirements scale linearly or near-linearly in the number of unknowns.54,63,65,81,82 These fast-solver approaches combine Krylov-subspace iterative methods83 such as GMRES84 with fast, approximate algorithms to apply the discretized integral operator matrix to a vector. At the kth iteration of a Krylov-subspace algorithm, one finds an approximate solution x^k that lies in the kth Krylov subspace, which is formed by repeatedly applying A to b:

x^k ∈ span{b, Ab, A²b, …, A^{k−1}b}. (20)

The fast multipole method81,85 and the precorrected-FFT algorithm86 represent two algorithms that compute BEM matrix–vector products in linear or near-linear time. The results in this paper were computed using the FFTSVD algorithm, which offers several advantages for biomolecule electrostatic problems.65,66

Krylov-subspace iterative methods are commonly preconditioned so that instead of solving Ax = b for x, one solves PAx = Pb, where P is a matrix that approximates A−1 such that the iterates x^k converge more rapidly towards the exact solution; that is, the use of a preconditioner reduces the number of matrix–vector products required to find a suitably accurate approximation to the actual solution x. Preconditioning the ASC formulation is easily accomplished using a diagonal matrix in which P_ii = M_{2,ii}^{-1}. Methods for preconditioning the Green’s-theorem formulation are presented in refs 64 and 66.
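The effect of diagonal preconditioning can be illustrated with a small self-contained solver. Because the BEM matrices discussed here are nonsymmetric and call for GMRES, the sketch below substitutes conjugate gradients on a synthetic SPD matrix with a strongly varying diagonal; the qualitative point, that P_ii = A_ii^{-1} sharply reduces the number of matrix–vector products, is the same:

```python
import numpy as np

# Jacobi (diagonal) preconditioning demo. CG on an SPD test matrix stands in
# for GMRES on the nonsymmetric BEM operator; the matrix is synthetic.
def cg(matvec, b, tol=1e-8, maxit=500, precond=None):
    """Preconditioned conjugate gradients; returns (solution, iterations)."""
    precond = precond or (lambda r: r)
    x = np.zeros_like(b)
    r = b.copy(); z = precond(r); p = z.copy()
    for k in range(maxit):
        Ap = matvec(p)
        alpha = (r @ z) / (p @ Ap)
        x = x + alpha * p
        r_new = r - alpha * Ap
        if np.linalg.norm(r_new) < tol * np.linalg.norm(b):
            return x, k + 1
        z_new = precond(r_new)
        beta = (r_new @ z_new) / (r @ z)
        p = z_new + beta * p
        r, z = r_new, z_new
    return x, maxit

rng = np.random.default_rng(1)
n = 200
B = rng.standard_normal((n, n))
A = B @ B.T / n + np.diag(rng.uniform(1.0, 1000.0, n))  # SPD, wild diagonal

b = rng.standard_normal(n)
x0, iters_plain = cg(lambda v: A @ v, b)
d = 1.0 / np.diag(A)                                    # P_ii = 1 / A_ii
x1, iters_jacobi = cg(lambda v: A @ v, b, precond=lambda r: d * r)
print(iters_plain, iters_jacobi)
assert iters_jacobi < iters_plain
assert np.allclose(A @ x1, b, atol=1e-5)
```

Note that the solver only needs the action of A on a vector (the `matvec` callable), which is exactly the interface a fast BEM algorithm provides.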

2.3 Biomolecule Electrostatic Optimization

Using the thermodynamic cycle in Figure 1, the total electrostatic contribution to the binding free energy can be written as

ΔG^{0,es}_bind = −(ΔG^{0,R,es}_solv + ΔG^{0,L,es}_solv) + ΔG^{0,ref,es}_bind + ΔG^{0,L:R,es}_solv, (21)

where the solvation free energies for the ligand, receptor, and complex are denoted by the subscripts L, R, and L:R, and the ligand–receptor Coulomb-interaction energy is written ΔG^{0,ref,es}_bind.1 Substituting the appropriate ligand, receptor, and ligand–receptor reaction potential matrices, one obtains

ΔG^{0,es}_bind = −(1/2) q_R^T R_unbound q_R − (1/2) q_L^T L_unbound q_L + (G q_R)^T q_L + (1/2) q_C^T C_bound q_C, (22)

where q_L and q_R denote the n_cL- and n_cR-length vectors of ligand and receptor charge values, q_C = (q_L, q_R)^T, and L_unbound, R_unbound, and C_bound denote the appropriate solvation matrices; the electrostatic component of the low-dielectric binding free energy has been written (G q_R)^T q_L, where the n_cL-by-n_cR Coulomb matrix G maps receptor-charge values to Coulomb potentials at the ligand-charge locations given the bound-state geometry.

The optimizable component of ΔG^{0,es}_bind, which is the portion of ΔG^{0,es}_bind that is dependent on the ligand charges, is called the variational electrostatic binding free energy ΔG^{0,var}_bind.6 The first term in Equation (22) does not contribute to ΔG^{0,var}_bind, nor does the component of the final term that depends only on the receptor charges. Writing (1/2) q_C^T C_bound q_C as

ΔG^{0,es}_solv,L:R = (1/2) [q_L^T q_R^T] [L_bound, C^{L,R}_bound; C^{R,L}_bound, R_bound] [q_L; q_R] (23)

and exploiting the symmetry of Cbound allows the variational electrostatic binding free energy to be written as

ΔG^{0,var}_bind = −(1/2) q_L^T L_unbound q_L + (1/2) q_L^T L_bound q_L + q_R^T G^T q_L + q_R^T C^{R,L}_bound q_L. (24)

The final two terms in Equation (24) are linear in the ligand charge values, and the vector

c = G q_R + C^{L,R}_bound q_R, (25)

which represents the total field induced by the receptor charges at the ligand-charge locations in the bound state, may be used to further simplify Equation (24):

ΔG^{0,var}_bind = (1/2) q_L^T (L_bound − L_unbound) q_L + c^T q_L. (26)

Equation (26) is the objective function for optimizing the electrostatic component of the free energy of binding. Kangas and Tidor showed that the difference L_bound − L_unbound, which is the Hessian of the objective function, is positive definite if one assumes that the ligand binds rigidly, that the ligand charge distribution is unchanged on binding, and that the molecules have finite size.6 The variational electrostatic binding free energy ΔG^{0,var}_bind is therefore a convex function with respect to the ligand charge distribution, and there exists a unique minimal free energy.

Often, it is of interest to enforce sum-of-charge constraints over subsets of the charges and possibly over the entire set.1,3,5 Defining the matrix H = L_bound − L_unbound and including the linear constraint Aq = b gives rise to the constrained optimization problem

minimize (1/2) q^T H q + c^T q
subject to A q = b. (27)

In Equation (27) and for the remainder of the paper, the vector q is used instead of q_L to represent the ligand charges. In addition to linear equality constraints, linear inequality constraints are sometimes imposed on the variables to ensure that the computed charges are physically reasonable. The resulting inequality-constrained problem is

minimize (1/2) q^T H q + c^T q
subject to A q = b and m_i ≤ q_i ≤ M_i, i ∈ {1, …, n_c}, (28)

where m_i and M_i represent the lower and upper bounds for q_i. Assuming without loss of generality that A has full rank, this problem can be transformed into the standard form for a convex quadratic problem,

minimize (1/2) x^T Ĥ x + ĉ^T x
subject to Â x = b̂ and x ≥ 0, (29)

using the substitutions

x = [t; r],  Ĥ = [H, 0; 0, 0],  ĉ = [c + Hm; 0],  Â = [A, 0; I, I],  b̂ = [b − Am; M − m], (30)

where the slack variables t and r satisfy m + t = q and q + r = M. This notation for inequality-constrained biomolecule optimization problems will be used throughout the rest of the paper.
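The substitutions in Equation (30) are easy to verify numerically on a small made-up instance: the transformed objective differs from the original by the constant f(m), and the bound constraints become nonnegativity of x = (t, r):

```python
import numpy as np

# Numerical check of the slack-variable substitution in Eq. (30): with
# q = m + t and q + r = M, the transformed objective equals the original
# minus the constant f(m), and Aq = b becomes A_hat x = b_hat with
# x = (t, r) >= 0. All data below are small made-up stand-ins.
rng = np.random.default_rng(2)
nc = 4
G = rng.standard_normal((nc, nc)); H = G @ G.T + nc * np.eye(nc)  # SPD Hessian
c = rng.standard_normal(nc)
A = np.ones((1, nc))                        # e.g. one total-charge constraint
m, M = -np.ones(nc), np.ones(nc)            # box bounds m <= q <= M

q = rng.uniform(-0.9, 0.9, nc)              # a feasible interior point
b = A @ q                                   # make Aq = b hold by construction
t, r = q - m, M - q                         # slacks, both nonnegative
x = np.concatenate([t, r])

I, Z = np.eye(nc), np.zeros((nc, nc))
H_hat = np.block([[H, Z], [Z, Z]])
c_hat = np.concatenate([c + H @ m, np.zeros(nc)])
A_hat = np.block([[A, np.zeros((1, nc))], [I, I]])
b_hat = np.concatenate([b - A @ m, M - m])

f = lambda v, Hm, cm: 0.5 * v @ Hm @ v + cm @ v
assert np.allclose(A_hat @ x, b_hat)
assert np.isclose(f(x, H_hat, c_hat), f(q, H, c) - f(m, H, c))
print("transformation consistent")
```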

Titratable chemical groups in the ligand warrant a brief discussion. One of the assumptions inherent in the electrostatic optimization theory is that the ligand geometry and charge distribution do not change on binding.1 Thus, in the charge optimization scheme presented here, the number of charges n_c remains unchanged during optimization, as do their locations, and charge optimization is therefore performed for a particular titration state. In reality, of course, the ligand–receptor binding event may perturb the ligand geometry, its charge distribution, its state of protonation, or any combination of these. Exploring the dependence of the optimal charges (and the optimized binding free energy) on the titration state is the most obvious way to assess the impact of the assumptions underlying the theory, although in general such an undertaking is likely to be computationally expensive.

2.4 Solving Convex Quadratic Optimization Problems

This section presents methods for minimizing the quadratic function

f(x) = (1/2) x^T H x + c^T x, (31)

where x is a vector of length n_primal and the matrix H, also known as the Hessian matrix, is symmetric and positive definite (SPD). The global minimizer x* can be found by setting ∇f(x) = 0 and solving the resulting linear system

H x* = −c. (32)
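A minimal numerical check of the unconstrained optimality condition H x* = −c, on a small random SPD instance (sizes are illustrative only):

```python
import numpy as np

# The unconstrained minimizer of a convex quadratic (Eq. 31) solves
# H x* = -c (Eq. 32). Random SPD stand-in data.
rng = np.random.default_rng(3)
n = 6
G = rng.standard_normal((n, n)); H = G @ G.T + n * np.eye(n)   # SPD Hessian
c = rng.standard_normal(n)
x_star = np.linalg.solve(H, -c)
assert np.allclose(H @ x_star + c, np.zeros(n))   # gradient vanishes at x*
```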

Optimization problems with constraints require more sophisticated approaches (see, for example,87,88). The quadratic program

minimize (1/2) x^T H x + c^T x
subject to A x = b (33)

can be solved using Lagrange multipliers.87 The optimal solution is a point x* and a corresponding vector of multipliers λ* that together satisfy the matrix equation

[H, −A^T; A, 0] [x*; λ*] = [−c; b]. (34)
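Equation (34) can be solved directly as one block linear system; the sketch below (random SPD H and full-rank A, sizes made up) checks stationarity and primal feasibility of the result:

```python
import numpy as np

# Equality-constrained QP (Eq. 33) solved through the KKT system (Eq. 34).
# H is a random SPD stand-in; A is a random full-rank constraint matrix.
rng = np.random.default_rng(4)
n, m = 6, 2
G = rng.standard_normal((n, n)); H = G @ G.T + n * np.eye(n)
c = rng.standard_normal(n)
A = rng.standard_normal((m, n)); b = rng.standard_normal(m)

KKT = np.block([[H, -A.T], [A, np.zeros((m, m))]])
sol = np.linalg.solve(KKT, np.concatenate([-c, b]))
x_star, lam = sol[:n], sol[n:]
assert np.allclose(A @ x_star, b)                   # primal feasibility
assert np.allclose(H @ x_star + c - A.T @ lam, 0)   # stationarity
```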

Inequality-constrained problems require the introduction of a vector of slack variables s in addition to the Lagrange multipliers λ. Because the optimization problem in Equation (29) satisfies a constraint qualification,87 an optimal solution (x*, λ*, s*) can be calculated by finding a point that satisfies the Karush-Kuhn-Tucker (KKT) optimality conditions:

s^* = H x^* + c - A^T \lambda^*, \quad A x^* = b, \quad x_i^* s_i^* = 0, \; i \in \{1, 2, \ldots, n_{primal}\}, \quad (x^*, s^*) \ge 0.   (35)

These conditions can be interpreted as the zeros of the nonlinear function

F(x, \lambda, s) = \begin{bmatrix} H x + c - A^T \lambda - s \\ b - A x \\ X s \end{bmatrix},   (36)

where X represents the diagonal matrix with X_{ii} = x_i. Primal-dual interior point methods find the roots of this equation using a modified Newton–Raphson iteration, with the Newton–Raphson updates biased to ensure convergence and scaled to ensure positivity of the elements of x and s.88 The kth update of a primal-dual iterative method satisfies the linear system of equations

\begin{bmatrix} H & -A^T & -I \\ A & 0 & 0 \\ S^k & 0 & X^k \end{bmatrix} \begin{bmatrix} \Delta x^{k+1} \\ \Delta \lambda^{k+1} \\ \Delta s^{k+1} \end{bmatrix} = \begin{bmatrix} -c + s^k - H x^k + A^T \lambda^k \\ b - A x^k \\ -X^k S^k e \end{bmatrix} + \begin{bmatrix} 0 \\ 0 \\ \sigma \frac{x^{k,T} s^k}{n_{primal}} e \end{bmatrix},   (37)

where e is the vector of ones, S^k is diagonal with S^k_{ii} = s^k_i, and the second term on the right-hand side (RHS) biases the update towards a point with equal pairwise products x_i s_i.88 The parameter σ, which lies between 0 and 1, determines the strength of the bias; it can be held fixed over all iterations or set dynamically based on the progress of previous iterations.89 Smaller values of σ allow faster convergence in most cases, but larger values offer superior robustness.88
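The update of Equation (37) can be exercised on a small dense problem. The following sketch implements a bare-bones primal-dual iteration with a fixed σ, omitting the Mehrotra-style heuristics and sparse structure a production solver would exploit; all problem data are random stand-ins.

```python
import numpy as np

rng = np.random.default_rng(1)
n, m = 5, 2

# Random convex QP in standard form:
#   minimize (1/2) x^T H x + c^T x  subject to  A x = b,  x >= 0.
G = rng.standard_normal((n, n))
H = G @ G.T + n * np.eye(n)          # symmetric positive definite
c = rng.standard_normal(n)
A = rng.standard_normal((m, n))

# Strictly interior starting point; b is chosen so x0 is primal feasible.
x, lam, s = np.ones(n), np.zeros(m), np.ones(n)
b = A @ x

sigma = 0.2                           # fixed centering parameter
for _ in range(100):
    mu = x @ s / n                    # duality measure
    if mu < 1e-11 and np.linalg.norm(H @ x + c - A.T @ lam - s) < 1e-9:
        break
    # Newton system of Equation (37).
    J = np.block([
        [H,          -A.T,              -np.eye(n)],
        [A,           np.zeros((m, m)),  np.zeros((m, n))],
        [np.diag(s),  np.zeros((n, m)),  np.diag(x)],
    ])
    rhs = np.concatenate([
        -c + s - H @ x + A.T @ lam,   # dual residual
        b - A @ x,                    # primal residual
        -x * s + sigma * mu,          # complementarity, biased toward sigma*mu
    ])
    d = np.linalg.solve(J, rhs)
    dx, dlam, ds = d[:n], d[n:n + m], d[n + m:]

    # Damp the step so x and s remain strictly positive.
    alpha = 1.0
    for v, dv in ((x, dx), (s, ds)):
        mask = dv < 0
        if mask.any():
            alpha = min(alpha, 0.95 * np.min(-v[mask] / dv[mask]))
    x, lam, s = x + alpha * dx, lam + alpha * dlam, s + alpha * ds
```

At convergence the iterate satisfies the KKT conditions of Equation (35) to the requested tolerance.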

The reverse-Schur optimization method is specialized to PDE-constrained problems in which the relationships between the decision variables x, the PDE state variables yI, and the external state variables yE are affine. That is, the three vectors satisfy a matrix equality

\begin{bmatrix} B & A & 0 \\ D & C & I \end{bmatrix} \begin{bmatrix} x \\ y_I \\ y_E \end{bmatrix} = \begin{bmatrix} z_I \\ z_E \end{bmatrix}   (38)

for some vectors zI and zE; in the electrostatic optimization, zI = 0 and zE = 0.

3 THE REVERSE-SCHUR METHOD FOR ELECTROSTATIC OPTIMIZATION

Some problems in computational science can be solved more efficiently using a Schur complement, such that one solves not a block linear system such as

\begin{bmatrix} A & B \\ C & D \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} a \\ b \end{bmatrix},   (39)

but rather a smaller (or better-conditioned) system like

(D - C A^{-1} B) y = b - C A^{-1} a.   (40)

The reverse-Schur co-optimization method uses the exactly opposite approach. An unconstrained quadratic program

\text{minimize} \quad \frac{1}{2} x^T H x + c^T x,   (41)

in which the Hessian is of the form H = M_3 M_2^{-1} M_1, can be solved by setting the gradient equal to zero, leading to the linear system

M_3 M_2^{-1} M_1 x^* = -c.   (42)

Equation (42) resembles Equation (40) with D = 0 and a = 0, and therefore the Hessian may be said to have the structure of a Schur complement. In reverse-Schur co-optimization, one solves the larger “reverse-Schur-complement” system

\begin{bmatrix} 0 & M_3 \\ M_1 & -M_2 \end{bmatrix} \begin{bmatrix} x^* \\ y \end{bmatrix} = \begin{bmatrix} -c \\ 0 \end{bmatrix}.   (43)

As discussed in Section 2, the Hessian matrix H = L_{bound} - L_{unbound} is a difference of two matrices, each of which has Schur structure. The reverse-Schur form of the unconstrained electrostatic optimization problem therefore has two reverse-Schur complements, and the optimizing ligand distribution q* can be found by solving

\begin{bmatrix} 0 & M_{3,b} & -M_{3,u} \\ M_{1,b} & -M_{2,b} & 0 \\ M_{1,u} & 0 & -M_{2,u} \end{bmatrix} \begin{bmatrix} q^* \\ y_b \\ y_u \end{bmatrix} = \begin{bmatrix} -c \\ 0 \\ 0 \end{bmatrix},   (44)

where the subscripts b and u denote the bound and unbound systems, and the variables y_b and y_u are the surface variables for the corresponding BEM problems when the ligand charge distribution is the optimizing distribution q*. In the ASC formulation, for instance, y_b represents the weights for the bound-state basis functions. The bound- and unbound-geometry state variables are therefore found simultaneously with the optimal decision variables q*. The co-optimization approach in this respect resembles “all-at-once” methods; however, the corresponding state variables are not explicitly included in the optimization problem. Section 3.1.2 details that approach to Hessian-implicit optimization. It is important to note that the matrices associated with reverse-Schur co-optimization are not necessarily symmetric. Also, we note that the bound-state and unbound-state surfaces are different, with the bound state representing the larger ligand–receptor complex; consequently, the vector y_b is typically longer than y_u.
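The algebraic equivalence between Equation (44) and the explicit-Hessian solve can be verified on small dense stand-ins for the BEM operators (in the application, M_{2,b} and M_{2,u} are far too large to form; with random data H is not symmetric positive definite, but the block-elimination identity holds regardless):

```python
import numpy as np

rng = np.random.default_rng(2)
nc, nb, nu = 4, 30, 20   # charges; bound- and unbound-surface unknowns

# Dense random stand-ins for the BEM operators (illustrative only).
M1b = rng.standard_normal((nb, nc)); M2b = rng.standard_normal((nb, nb)) + nb * np.eye(nb)
M1u = rng.standard_normal((nu, nc)); M2u = rng.standard_normal((nu, nu)) + nu * np.eye(nu)
M3b = rng.standard_normal((nc, nb)); M3u = rng.standard_normal((nc, nu))
c = rng.standard_normal(nc)

# Explicit-Hessian route: form H = L_bound - L_unbound, then solve H q = -c.
H = M3b @ np.linalg.solve(M2b, M1b) - M3u @ np.linalg.solve(M2u, M1u)
q_explicit = np.linalg.solve(H, -c)

# Reverse-Schur route: one larger block system, Equation (44).
K = np.block([
    [np.zeros((nc, nc)), M3b,                -M3u],
    [M1b,                -M2b,                np.zeros((nb, nu))],
    [M1u,                np.zeros((nu, nb)), -M2u],
])
sol = np.linalg.solve(K, np.concatenate([-c, np.zeros(nb + nu)]))
q_rs, yb, yu = sol[:nc], sol[nc:nc + nb], sol[nc + nb:]

# Same optimum; yb and yu are the surface variables at that optimum.
assert np.allclose(q_rs, q_explicit)
```

The second and third block rows force y_b and y_u to be the BEM solutions for the charge distribution q*, so eliminating them recovers exactly Equation (42) for each geometry.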

As discussed in Section 2, electrostatic charge optimization problems usually have many fewer decision variables than there are degrees of freedom associated with the BEM problems. Reverse-Schur co-optimization systems such as Equation (44) are therefore only slightly larger than the corresponding BEM systems. Efficient preconditioning strategies, which will be described in Section 4, allow systems such as Equation (44) to be solved with approximately the same computational cost as would be required to solve a single bound-state and a single unbound-state electrostatic problem. We emphasize that the dense boundary-element matrices M_{2,b} and M_{2,u} are almost always too large to be calculated or stored, and so their LU factorizations cannot be computed. Forming H in Equation (41) explicitly requires n_c solves of the bound and unbound geometries. The reverse-Schur method is therefore more computationally efficient than the explicit-Hessian approach.

The next section presents alternative Hessian-free PDE-constrained methods: a nested-Krylov method, which follows the NAND (nested analysis and design) paradigm, and a traditional PDE-constrained formulation, which follows the SAND (simultaneous analysis and design) paradigm. Section 3.2 describes biomolecule co-optimization techniques for constrained problems.

3.1 Alternative PDE-Constrained Approaches

3.1.1 Nested-Krylov Approach

One alternative to the co-optimization approach would be to use nested Krylov methods to solve the linear systems associated with the explicit-Hessian approach. The unconstrained problem

(M_{3,b} M_{2,b}^{-1} M_{1,b} - M_{3,u} M_{2,u}^{-1} M_{1,u}) q^* = -c   (45)

would then require two inner Krylov solves for every matrix–vector multiplication required for the outer Krylov method: one for the bound-state problem and one for the unbound-state problem. This approach represents an implementation of a nested analysis and design (NAND) approach to PDE-constrained optimization.

3.1.2 Incorporating the PDE as Constraints

The electrostatic optimization problem can also be formulated in a traditional PDE-constrained approach in which the surface variables of the bound- and unbound-state boundary-element problems are included as optimization variables, with the boundary-element method equations added as equality constraints. The resulting problem,

\text{minimize} \quad \frac{1}{2} \begin{bmatrix} q \\ y_b \\ y_u \end{bmatrix}^T \begin{bmatrix} 0 & \frac{1}{2} M_{3,b} & -\frac{1}{2} M_{3,u} \\ \frac{1}{2} M_{3,b}^T & 0 & 0 \\ -\frac{1}{2} M_{3,u}^T & 0 & 0 \end{bmatrix} \begin{bmatrix} q \\ y_b \\ y_u \end{bmatrix} + \begin{bmatrix} c \\ 0 \\ 0 \end{bmatrix}^T \begin{bmatrix} q \\ y_b \\ y_u \end{bmatrix} \qquad \text{subject to} \quad \begin{bmatrix} M_{1,b} & -M_{2,b} & 0 \\ M_{1,u} & 0 & -M_{2,u} \end{bmatrix} \begin{bmatrix} q \\ y_b \\ y_u \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix},   (46)

can be solved by setting the gradient to zero and solving the resulting linear system

\begin{bmatrix} 0 & \frac{1}{2} M_{3,b} & -\frac{1}{2} M_{3,u} & M_{1,b}^T & M_{1,u}^T \\ \frac{1}{2} M_{3,b}^T & 0 & 0 & -M_{2,b}^T & 0 \\ -\frac{1}{2} M_{3,u}^T & 0 & 0 & 0 & -M_{2,u}^T \\ M_{1,b} & -M_{2,b} & 0 & 0 & 0 \\ M_{1,u} & 0 & -M_{2,u} & 0 & 0 \end{bmatrix} \begin{bmatrix} q^* \\ y_b^* \\ y_u^* \\ \lambda_b^* \\ \lambda_u^* \end{bmatrix} = \begin{bmatrix} -c \\ 0 \\ 0 \\ 0 \\ 0 \end{bmatrix}.   (47)

The matrix in Equation (47) is symmetric, which allows the system to be solved using specialized Krylov-subspace iterative methods. However, every matrix–vector product requires twice as much computation as the matrix–vector products required for Krylov solution of Equation (44). It can also be difficult or impractical to apply the transposed matrices M_{2,b}^T and M_{2,u}^T. Thus, both the reverse-Schur method and the SAND-type approach have strengths and weaknesses.

3.2 Constrained Co-Optimization

It is straightforward to use the reverse-Schur method to solve the constrained optimization problems presented in Section 2.4. After transforming Equation (34), the co-optimization system for solving the problem with linear-equality constraints is

\begin{bmatrix} 0 & A^T & M_{3,b} & -M_{3,u} \\ A & 0 & 0 & 0 \\ M_{1,b} & 0 & -M_{2,b} & 0 \\ M_{1,u} & 0 & 0 & -M_{2,u} \end{bmatrix} \begin{bmatrix} q^* \\ \lambda^* \\ y_b \\ y_u \end{bmatrix} = \begin{bmatrix} -c \\ b \\ 0 \\ 0 \end{bmatrix}.   (48)

Similarly, the co-optimization system associated with the kth iteration of a primal-dual interior-point method is transformed from Equation (37) to

\begin{bmatrix} 0 & -\hat{A}^T & -I & \hat{M}_{3,b} & -\hat{M}_{3,u} \\ \hat{A} & 0 & 0 & 0 & 0 \\ S^k & 0 & X^k & 0 & 0 \\ \hat{M}_{1,b} & 0 & 0 & -M_{2,b} & 0 \\ \hat{M}_{1,u} & 0 & 0 & 0 & -M_{2,u} \end{bmatrix} \begin{bmatrix} \Delta x^{k+1} \\ \Delta \lambda^{k+1} \\ \Delta s^{k+1} \\ \Delta y_b^{k+1} \\ \Delta y_u^{k+1} \end{bmatrix} = \begin{bmatrix} -\hat{c} + s^k - \hat{H} x^k + \hat{A}^T \lambda^k \\ \hat{b} - \hat{A} x^k \\ -X^k S^k e + \sigma \frac{x^{k,T} s^k}{n_c} e \\ -(\hat{M}_{1,b} x^k - M_{2,b} y_b^k) \\ -(\hat{M}_{1,u} x^k - M_{2,u} y_u^k) \end{bmatrix},   (49)

where

\hat{M}_{1,b} = \begin{bmatrix} M_{1,b} & 0 \end{bmatrix}   (50)
\hat{M}_{3,b} = \begin{bmatrix} M_{3,b} \\ 0 \end{bmatrix}   (51)

with the zero submatrices of the appropriate size given the transformation of the inequality-constrained problem into standard form; M̂_{1,u} and M̂_{3,u} are defined similarly. The term Ĥx^k on the right-hand side is computed as

\hat{H} x^k = \hat{H} x^{k-1} + \hat{M}_{3,b} \Delta y_b^k - \hat{M}_{3,u} \Delta y_u^k   (52)

and Ĥx^0 must be found before the first iteration.

4 IMPLEMENTATION

In this section we present two implementation details that are important for the co-optimization method to achieve high efficiency and accuracy. First, regularization of the optimization presents a critical challenge. The need for regularization arises due to numerical error in simulation; although the exact Hessian is positive definite, typically a numerically computed H is not. Given H explicitly, the eigendecomposition or singular value decomposition can be used to penalize or eliminate the nonphysical part of the matrix. However, for the reverse-Schur (or any other implicit-Hessian) method to produce results comparable to those obtained by current methods, accurate regularization techniques must be found so that the appropriate directions can be penalized. Second, effective preconditioning schemes need to be developed because the co-optimization linear systems are solved using Krylov iterative methods.

A simple model geometry, shown in Figure 3, is used to demonstrate the performance of the presented regularization and preconditioning methods. The unbound ligand is a sphere of radius 8 Å; the ligand–receptor complex is modeled as the solvent-excluded surface produced by rolling a 1.4-Å probe sphere over the union of the ligand sphere and a 24-Å-radius sphere representing the receptor, where the sphere centers are separated by 20 Å.38 The internal dielectric constant is taken to be 4 and the external (solvent) dielectric constant to be 80. For simplicity in this paper, we assume that the Laplace equation holds in the solvent region (that is, that there are no mobile ions in solution). However, the implicit-Hessian methods can also be used when the LPBE holds in the solvent region.

Figure 3.


Schematic of a model ligand–receptor complex for studying implementation details for the co-optimization method. The surface of the ligand–receptor complex is defined by rolling a probe sphere of radius 1.4 Å over the union of the two spheres. All distances are in Angstroms.

The receptor charge distribution consists of 2000 randomly placed charges located such that no charge is within 1.5 Å of another charge, the dielectric boundary, or the ligand volume; the charge values have been chosen from a uniform distribution on [−1e, 1e]. The ligand charge distribution is built from a set of charge locations randomly placed in the ligand sphere subject to the constraints that no charges are within 1.5 Å of one another or of the ligand boundary. Several discretizations of each geometry have been generated. One surface discretization uses planar triangle boundary elements generated using MSMS,90 with 71922 and 7924 elements used to approximate the bound and unbound surfaces; the other uses 2132 and 1810 curved boundary elements that exactly represent the two geometries.78 Coarser discretizations have been used for some examples and are described where appropriate. Full computational details are deferred to Section 5.

4.1 Regularization

The explicit-Hessian approach to biomolecule electrostatic optimization allows straightforward regularization. The numerically calculated Hessian matrix is first symmetrized to remove numerical-error-based asymmetries. Because the Hessian is available explicitly, it is possible to calculate its eigendecomposition. The eigenspace corresponding to the smallest eigenvalues is then heavily penalized, but not removed explicitly, so that the existence of a feasible solution is assured even in the presence of other constraints. The co-optimization approach, in contrast, does not permit the use of direct eigendecomposition, because the Hessian matrix is not available. Co-optimization achieves its performance advantage by leaving the Hessian implicit, and therefore little information is available regarding its spectrum or corresponding eigenvectors. To solve the same optimization problems using the current protocols without sacrificing performance, the eigenspace associated with the smallest eigenvalues must be approximated inexpensively.

In earlier work,89 linear-equality-constrained co-optimization problems were preconditioned using an approximate Hessian Ĥ of the form

\hat{H} = M_{3,b} P_{2,b} M_{1,b} - M_{3,u} P_{2,u} M_{1,u},   (53)

where P2,b and P2,u represent the bound- and unbound-state BEM preconditioners. Using a diagonal approximation of the ASC integral equation as the preconditioner P2 corresponds to the recent BIBEE/P electrostatic model.91 When performing co-optimization using the Green’s theorem formulation, the operators M1, M2, M3, and P2 differ from those employed in the ASC formulation (i.e., the entries of M2 are no longer defined according to Equation (17)). However, the Green’s theorem co-optimization linear systems are written the same way, and an approximate Hessian can still be defined according to Equation (53).

To illustrate that an approximation to the ASC formulation, but not an approximation to the Green’s-theorem formulation, can be used to regularize the optimization method, explicit Hessians and their approximations were calculated using both integral formulations and both planar and curved boundary elements. The four Hessian matrices and their approximations were decomposed using the singular value decomposition (SVD), and the right singular vectors of the approximate Hessians were projected onto those from the corresponding explicitly calculated Hessians. The Green’s-theorem and ASC formulations produced very similar explicit Hessians (‖H_A − H_G‖/‖H_A‖ < 0.01), regardless of whether planar or curved boundary elements were used.

To ensure comparable regularization between explicit-Hessian and implicit-Hessian optimization procedures, implicit-Hessian methods must penalize not only the same number of search directions as explicit-Hessian methods, but also the same directions themselves. To illustrate how the singular vectors of an approximate Hessian Ĥ are aligned with the singular vectors of the actual Hessian H, we calculate the matrix

V_H^T V_{\hat{H}},   (54)

which represents the projection of the singular vectors of Ĥ onto those of H. Perfect alignment between the sets of vectors would produce a diagonal matrix X whose diagonal entries all have unit magnitude; the degree to which the singular vectors of Ĥ are imperfectly aligned with those of H is reflected in the presence of non-zero off-diagonal entries. Figure 4(a) is a pseudo-color plot of the magnitudes of the entries of X, using approximate and actual Hessians computed from the Green’s-theorem formulation and the planar-element discretization. The analogous plot, computed using the ASC formulation and planar elements, is shown in Figure 4(b). The ASC-based approximate singular vectors are clearly much better aligned with the corresponding singular vectors of the explicitly computed ASC Hessian. Similarly, plotted in Figures 5(a) and (b) are the results of curved-element simulations of the Green’s-theorem and ASC formulations, respectively, with the approximate-Hessian right singular vectors projected onto the explicit-Hessian right singular vectors. Because the ASC formulation generates superior approximate Hessians for both kinds of discretizations, we attribute this fidelity to the superior conditioning of purely second-kind integral operators (see, for instance,56). In principle, it is possible to regularize the Green’s-theorem co-optimization system using the penalty matrix derived from the ASC-based approximate Hessian. However, the results in Section 4.2 illustrate that the superior conditioning of the second-kind integral equation allows the co-optimization GMRES to converge in many fewer iterations than are required for the Green’s-theorem-based co-optimization.

Figure 4.


Comparison of the explicit and approximate Hessians of the sample problem in Figure 3 when discretized using planar boundary elements. The alignment between the singular vectors of exact and approximate Hessians is obtained by projecting the right singular vectors of an approximate Hessian Ĥ onto the right singular vectors of the explicit Hessian H and taking the magnitude of the resulting entries. Each row and column therefore has a 2-norm of one. (a) Explicit and approximate Hessians obtained using the Green’s theorem formulation. (b) Explicit and approximate Hessians obtained using the polarizable continuum model/apparent surface charge formulation.

Figure 5.


Comparison of the explicit and approximate Hessians of the sample problem in Figure 3 when discretized using curved boundary elements. The alignment between the singular vectors of exact and approximate Hessians is obtained by projecting the right singular vectors of an approximate Hessian Ĥ onto the right singular vectors of the explicit Hessian H and taking the magnitude of the resulting entries. Each row and column therefore has a 2-norm of one. (a) Explicit and approximate Hessians obtained using the Green’s theorem formulation. (b) Explicit and approximate Hessians obtained using the polarizable continuum model/apparent surface charge formulation.

Figure 6(a) and (b) are plots of the singular values for the planar- and curved-element discretizations. Predictability of the relation between the approximate singular values σ̂_i and the actual values σ_i is important so that the appropriate number of search directions can be penalized. The singular values of Ĥ_A were much closer to those of H_A than the singular values of Ĥ_G were to those of H_G.

Figure 6.


The magnitudes of the singular values of the explicit and approximate Hessians computed using (a) planar boundary elements and (b) curved boundary elements. The singular values of the approximate Hessians calculated using the Green’s-theorem formulation are less accurate than the singular values of the approximate Hessians computed using the apparent-surface-charge (ASC) formulation, regardless of whether planar or curved boundary elements are employed.

Based on the results in Figures 4, 5, and 6 we adopted the following scheme to regularize the co-optimization solutions. The ASC-based Hessian approximation Ĥ_A was computed first, and the eigendecomposition of the symmetrized matrix \frac{1}{2}(\hat{H}_A + \hat{H}_A^T) was taken. The first right singular vector ν̂_1 was multiplied by the Hessian H_A using BEM simulation of the bound and unbound states, and the Rayleigh quotient

\hat{\lambda}_1 = \hat{\nu}_1^T H_A \hat{\nu}_1   (55)

was then used as an estimate for the maximum eigenvalue of HA. The penalty matrix

W = \alpha V_{\{:,I\}} V_{\{:,I\}}^T   (56)

was then created, where the penalty parameter α = 100 kcal/mol/e², the eigenvalue tolerance γ = 10⁻⁴, and the set of penalized directions I = {i | σ̂_i < γ λ̂_1}. The quadratic penalty term \frac{1}{2} q^T W q was then added to the objective function, and optimization could begin. The unconstrained co-optimization system with a penalty matrix is

\begin{bmatrix} W & M_{3,b} & -M_{3,u} \\ M_{1,b} & -M_{2,b} & 0 \\ M_{1,u} & 0 & -M_{2,u} \end{bmatrix} \begin{bmatrix} q^* \\ y_b \\ y_u \end{bmatrix} = \begin{bmatrix} -c \\ 0 \\ 0 \end{bmatrix},   (57)

and the systems for constrained problems are similarly modified.
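The regularization recipe above (symmetrize Ĥ, eigendecompose, estimate λ̂_1 with one exact Hessian-vector product, and penalize directions whose approximate eigenvalues fall below γλ̂_1) can be sketched as follows. Here the "exact" and approximate Hessians are synthetic stand-ins rather than BEM-derived matrices; in the real method the exact matvec costs one bound/unbound BEM solve pair.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 20

# Synthetic stand-ins: an "exact" Hessian H with a wide spectrum (including
# near-null directions) and a cheap approximation H_hat of it; in the paper
# H_hat is the BIBEE/P-based approximation of Equation (53).
Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
H = Q @ np.diag(np.logspace(0, -8, n)) @ Q.T
H_hat = H + 1e-6 * rng.standard_normal((n, n))

alpha, gamma = 100.0, 1e-4                  # penalty weight, eigenvalue tolerance

# Symmetrize the approximate Hessian and take its eigendecomposition.
w, V = np.linalg.eigh(0.5 * (H_hat + H_hat.T))
order = np.argsort(w)[::-1]                 # sort eigenpairs in descending order
w, V = w[order], V[:, order]

# Rayleigh-quotient estimate of the largest exact eigenvalue, Equation (55):
# one exact Hessian-vector product.
v1 = V[:, 0]
lam1 = v1 @ (H @ v1)

# Penalty matrix of Equation (56) over the penalized directions.
I = w < gamma * lam1
W = alpha * V[:, I] @ V[:, I].T

c = rng.standard_normal(n)
q_plain = np.linalg.solve(H, -c)            # dominated by nonphysical directions
q_reg = np.linalg.solve(H + W, -c)          # regularized optimum, Equation (57) analogue
```

The penalized directions receive an effective eigenvalue of roughly α, so their contribution to the solution is suppressed rather than removed outright.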

Even though λ̂_1 usually approximated λ_1 to within a few percent, sometimes the number of directions penalized differed slightly from the number that would be penalized in an explicit-Hessian method using the same tolerance γ. As a result, it was desirable to have an inexpensive means of obtaining approximations to the optimal distributions for problems where different numbers of directions were penalized. For the unconstrained and linear-equality-constrained problems, such estimates could be obtained using an approximation based on the Sherman–Morrison–Woodbury formula

(H + U V^T)^{-1} = H^{-1} - H^{-1} U (I + V^T H^{-1} U)^{-1} V^T H^{-1},   (58)

which specifies how the inverse of a matrix H changes when H is perturbed by the low-rank update UV^T. Update-approximation methods for inequality-constrained problems represent an area of current research.
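The identity in Equation (58) is easy to verify numerically; the penalty matrix W = αV_I V_I^T is exactly such a low-rank update, which is why re-solves with a different penalized set are inexpensive once H⁻¹ (or its action) is available. The data below are illustrative stand-ins.

```python
import numpy as np

rng = np.random.default_rng(5)
n, k = 8, 2

H = rng.standard_normal((n, n)) + n * np.eye(n)   # well-conditioned, invertible
U = rng.standard_normal((n, k))
V = rng.standard_normal((n, k))

Hinv = np.linalg.inv(H)
# Right-hand side of Equation (58): only a k x k system is newly inverted.
correction = Hinv @ U @ np.linalg.inv(np.eye(k) + V.T @ Hinv @ U) @ V.T @ Hinv
smw = Hinv - correction

assert np.allclose(smw, np.linalg.inv(H + U @ V.T))
```

For a rank-k change, the update costs O(n²k) given H⁻¹, versus O(n³) for refactoring H + UV^T from scratch.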

4.2 Preconditioning

The Hessian-implicit linear systems in Equation (44), Equation (48), and Equation (49) share a similar structure, and therefore can be preconditioned by similar methods. For a problem with nprimal decision variables and ΣnSV total unknowns associated with the BEM simulations, we define the desolvation operators

\hat{J}_1 = \begin{bmatrix} M_{1,b} & 0 \\ M_{1,u} & 0 \end{bmatrix}   (59)
\hat{J}_2 = \begin{bmatrix} M_{2,b} & 0 \\ 0 & M_{2,u} \end{bmatrix}   (60)
\hat{P}_2 = \begin{bmatrix} P_{2,b} & 0 \\ 0 & P_{2,u} \end{bmatrix}   (61)
\hat{J}_3 = \begin{bmatrix} M_{3,b} & -M_{3,u} \\ 0 & 0 \end{bmatrix},   (62)

where the zero blocks are sized such that Ĵ_1 ∈ ℝ^{Σn_SV × n_primal} and Ĵ_3 ∈ ℝ^{n_primal × Σn_SV}.

We define the preconditioners by block-factorizing the corresponding linear systems using the BEM preconditioners rather than the inverses of the BEM matrices. The Hessian-implicit preconditioners would therefore be exact if the BEM preconditioners were actually the BEM matrix inverses. The resulting preconditioners can be written as the product of block factors, P̃ = P̃_4 P̃_3 P̃_2 P̃_1. Other preconditioners, which for example block-triangularize the Hessian-implicit linear systems, can also be used but are generally less effective than the block factorization. Figures 7(a) and (b) are plots of the preconditioned relative GMRES residuals as a function of iteration count for the unconstrained problem using the Green’s theorem formulation and the apparent-surface-charge formulation, respectively. Each solves the same 50-charge unconstrained problem using curved-element discretizations of the model problem in Figure 3, using different preconditioners. It is clear that the ASC co-optimization converges in many fewer iterations than the corresponding Green’s-theorem co-optimization, regardless of which approach to preconditioning is employed.

Figure 7.


Preconditioning effects on GMRES convergence for a 50-charge unconstrained optimization problem using (a) the non-derivative Green’s-theorem formulation of Yoon and Lenhoff47 and (b) the Shaw apparent-surface-charge formulation.42

5 COMPUTATIONAL RESULTS

5.1 Efficiency of Co-Optimization and PDE-constrained Approaches

The standard “all-at-once” approach and the nested-Krylov approach to the electrostatic optimization problem were implemented in MATLAB using the geometry in Figure 3 and relatively coarse discretizations of 142 and 124 curved elements for the bound and unbound geometries. The methods’ implementations were verified by direct inspection of the optimal charges computed by dense factorization of the systems in Equation (45) and Equation (47). These techniques and the co-optimization solver were then used with preconditioned GMRES to solve a set of unconstrained problems of varying dimension. Computational expense was measured by counting the total number of applications of the BEM operator Ĵ_2, because the computational cost of optimization is dominated by the application of the BEM operators. The all-at-once solver required two Ĵ_2 matrix–vector products for every Krylov iteration.

The number of required applications for the nested-Krylov method was estimated using the fact that the nested-Krylov method, which relies on an implicit Hessian, and an explicit-Hessian Krylov method require the same number of GMRES iterations to achieve convergence. Therefore we estimated the nested-Krylov computational expense using the explicitly calculated Hessian, rather than a true nested-Krylov code. The GMRES solve was preconditioned using the ASC-based Hessian approximation Ĥ_A. The all-at-once system was preconditioned using a block-factorization method similar to the co-optimization preconditioning schemes, so that the all-at-once preconditioner would be exact if the BEM preconditioner were exact. Every GMRES iteration for the all-at-once system requires two applications of the BEM operator Ĵ_2, as shown in Equation (47). Figure 8 is a plot of the computational cost of these methods for solving unconstrained problems as the number of optimization variables varies from 5 to 130. The PDE-constrained approaches scale very favorably compared to the explicit-Hessian approach, and exhibit essentially comparable reductions in cost. It should be noted that the performance differences between the PDE-constrained approaches may reflect the relatively unoptimized implementations of the methods, and no significant conclusions should be drawn regarding the merits of one PDE-constrained technique over another. We note also that constrained problems exhibit similar performance trends.89,92

Figure 8.


The cost to solve unconstrained optimization problems of varying dimension using the co-optimization method, the two alternative implicit-Hessian approaches presented in Section 3.1, and by calculating the Hessian explicitly.

5.2 Realistic Protein–Ligand Systems

The three-dimensional structures of a complex between HIV-1 protease and darunavir (accession code 1T3R) and a complex between CDK2 and a small-molecule inhibitor (accession code 1OIT) were obtained from the Protein Data Bank (PDB). For both structures, protein side chains with missing density were rebuilt in their default geometry using the CHARMM computer program,93 and in cases of multiple occupancy, the first entry listed was used. The final chi angles for asparagine, glutamine, and histidine side chains were flipped by 180 degrees as necessary to improve the hydrogen bonding network. Hydrogen atoms were added to both structures using the HBUILD module94 of CHARMM and the PARAM22 parameter set,95 using a distance-dependent dielectric constant of 4. Ionizable residues were left in their standard states at pH 7. In the case of HIV-1 protease, the catalytic dyad was left doubly deprotonated. The receptor protonation states are assumed to be the same in both the bound and unbound states. For electrostatic simulations, atomic radii were taken from the PARSE parameter set.96 Partial atomic charges for protein atoms were also taken from the PARSE parameter set; quantum-mechanically derived partial atomic charges for the small molecule inhibitors were calculated as follows. The geometry of each small molecule inhibitor was optimized using quantum mechanical calculations at the RHF/6-31G* level of theory as implemented in the program Gaussian 98.97 After geometry optimization, partial atomic charges were fit to the quantum mechanical electrostatic potential using the RESP methodology.98,99 Both the CDK2 and HIV-protease systems were optimized using the co-optimization method and an explicit Hessian; the same curved-element discretizations were used for both methods. As in Section 4, solute–solvent interfaces were defined as solvent-excluded (molecular) surfaces using a probe of radius 1.4 Å, and the solute and solvent dielectric constants were 4 and 80, respectively.

5.2.1 CDK2 and Inhibitor

The CDK2 inhibitor described by Anderson et al. has 40 atoms.31 Optimization of the partial atomic charges for a small-molecule inhibitor of CDK2 did not lead to significantly improved predicted electrostatic binding free energy; this inhibitor appears to already be very well optimized for its protein target. From finite-difference simulations, the total electrostatic contribution to binding for the quantum-mechanically-derived (wild-type) charge distribution, which has net zero charge, is 8.75 kcal/mol, and the optimal charge distribution leads to an electrostatic binding free energy of 5.54 kcal/mol.

Figure 9 is a plot of the inhibitor with RESP-derived nominal charge values (in red) and the co-optimization unconstrained optimal charge values (in blue) labeling each atom, computed using curved boundary elements. It can be seen that the inhibitor atoms that directly hydrogen bond to the protein, especially those in the aminopyrimidine core, have optimal partial atomic charges that closely match those determined through quantum mechanics.

Figure 9.


The Anderson et al. inhibitor of CDK2.31 The atoms are labeled with (red) partial atomic charge values derived from quantum mechanics via RESP98,99 fitting and (blue) optimized partial atomic charges computed with unconstrained minimization using curved-element BEM and co-optimization.

The wild-type molecule has zero net charge, and the co-optimization optimal solution from Figure 10 has a net charge of −0.26 e. For comparison, the explicit-Hessian boundary-element approach leads to an unconstrained optimum with −0.33 e net charge, and the finite-difference optimal charges sum to 0.06 e; the preponderance of the difference is localized in a small number of atoms whose optimal charges are of large magnitude (Figure 10). As expected, optimal charges computed using the explicit Hessian and the co-optimization methods were nearly identical; three unconstrained optimizations were performed, and the results are shown in Figure 10. Explicit-Hessian calculations were performed using both finite-difference methods100 and boundary-element methods,66 as was the boundary-element-based co-optimization approach. The first instructive comparison, of the explicit-Hessian approaches, demonstrates that even the vastly different approaches to numerical simulation produce optimal charges that correspond very closely. The methods must give exactly the same results in the limit of infinitely fine discretizations, of course, but it is valuable to know that such good agreement can be obtained even for discretizations that can be easily solved on a personal workstation. The excellent agreement between the boundary-element methods demonstrates the correct implementation of the co-optimization approach and that numerical errors associated with the implicit representation of the dense boundary-integral operators do not materially affect the computed optimal solution.

Figure 10.


The unconstrained optimal partial atomic charges computed using finite-difference and boundary-element explicit Hessians, as well as using the reverse-Schur co-optimization method. The boundary-element simulations employed curved boundary element discretizations.

The total charge on the inhibitor was also constrained to different net charge values. For these problems with different constraints, very little change was observed in the partial atomic charges for atoms directly interacting with the receptor, and the calculated optimal binding free energy (the objective function at the optimum) changed minimally. Figure 11 shows the box-constrained optimal charges when the total inhibitor charge was constrained to be −1, 0, or +1, computed using co-optimization. The co-optimization charges again correspond closely with calculations performed using either boundary-element or finite-difference methods with explicit Hessians (data not shown). The finite-difference calculations gave rise to optimized binding free energies of 6.88 kcal/mol for the −1 e-constrained problem, 5.54 kcal/mol for the zero-charge problem, and 6.55 kcal/mol for the +1 e-constrained problem.

Figure 11.


The Anderson et al. inhibitor of CDK2.31 Partial atomic charges have been optimized using box inequality constraints to enforce that charge values are less than 0.85e in magnitude, and sum-of-charge constraints have been imposed such that the total inhibitor charge is −1 (red label), 0 (blue label), or +1 (green label).

The partial atomic charge values of the solvent-exposed sulfonamide group changed the most to accommodate these different net charges, largely because their solvent-exposed nature results in small desolvation penalties on binding. It should be noted that the 0.85e bound on the sulfur partial charge may be too stringent. These box constraints were introduced following earlier work3 and the observation that few biomolecular systems are modeled as having partial charges of larger magnitude, regardless of whether the partial charges are taken from molecular mechanics force fields such as CHARMM95 or derived from electronic structure calculations and RESP fitting. Our primary purpose here, however, is to demonstrate that the co-optimization method is fully capable of treating inequality constraints using a primal-dual interior point method. Also, constraining the total charge to −1, 0, or +1 is not meant to test protonation states, given that the same number of charges and the same set of charge locations are used in each test. Optimization under these varying constraints indicates whether the optimized binding affinity or the optimal charge distribution is sensitive to the overall inhibitor charge, and for this problem neither appears to be the case.

The unbound- and bound-state geometries consisted of 3,821 and 138,770 curved boundary elements, respectively. Calculating the explicit Hessian required 461 applications of the bound-state integral operator M2,b. In contrast, using reverse-Schur co-optimization, the unconstrained and linear-equality-constrained problems each required at most 2 M2 matrix–vector products, which is essentially the same cost as 2 M2,b matrix–vector products owing to the small size of the unbound system. The more than 200-fold reduction in the number of matrix–vector products for such a small problem suggests that the model problems based on Figure 3 may actually be more computationally challenging than realistic problems, because a model problem of comparable size showed an acceleration of only a factor of ten using co-optimization over an explicit-Hessian method (see Figure 8).

This example illustrates one weakness of our chosen metric for performance improvement—the reduction in the number of M2 matrix–vector products. This metric neglects the startup cost that must be paid regardless of whether one uses an explicit-Hessian or a co-optimization method, so the overall compute time required to obtain an unconstrained optimum is not reduced by a factor of 200. Our decision to use the number of M2 matrix–vector products as the metric rests on the fact that it is impossible to perform the optimization without paying the initialization cost; from a theoretical perspective, it is therefore appropriate to neglect this cost when comparing optimization methods.

Nevertheless, as a practical matter, it is important to understand the impact of co-optimization on overall computational cost. On a 2-GHz Intel MacBook Pro, the planar-boundary-element simulations require approximately 100 seconds of setup time, and the bound-state simulations require an average of 5 seconds (6 GMRES iterations are generally required for bound-state simulations). The overall unconstrained co-optimization cost is thus about 102 seconds (100 seconds for initialization plus 2 GMRES iterations to solve the unconstrained co-optimization problem), whereas the overall optimization cost for the BEM explicit-Hessian approach is about 300 seconds (100 seconds for initialization plus 5 seconds to calculate each of the 40 columns of the Hessian). Thus the total reduction in computational cost is almost a factor of three. For large problems such as optimizing protein–protein interactions, the total reduction can be expected to be even larger.
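The timing arithmetic in this paragraph can be written out explicitly. The sketch below encodes only the numbers quoted above; the per-iteration cost is our inference from the stated 5-second, roughly 6-iteration bound-state solves.

```python
setup_s = 100.0                      # planar-BEM initialization, paid by both methods
bound_solve_s = 5.0                  # one bound-state solve (about 6 GMRES iterations)
gmres_iter_s = bound_solve_s / 6.0   # inferred cost of a single GMRES iteration
n_hessian_cols = 40                  # columns of the explicit Hessian

explicit_s = setup_s + n_hessian_cols * bound_solve_s   # 100 + 40 * 5 = 300 s
coopt_s = setup_s + 2 * gmres_iter_s                    # ~102 s for 2 iterations

speedup = explicit_s / coopt_s                          # almost a factor of three
```

Because the 100-second setup dominates the co-optimization cost, the end-to-end speedup (about 3x) is far smaller than the 200-fold reduction in matrix–vector products.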

5.2.2 HIV-1 Protease and Inhibitor

The 75-atom inhibitor darunavir binds tightly to HIV-1 protease.28 In Figure 12, the atoms are labeled with indices corresponding to the entries in Table 1, which lists the RESP-derived charge values, the unconstrained co-optimized partial atomic charges, the optimal charges under a zero-total-charge equality constraint, and the optimal charges computed with box constraints (no charge exceeding 0.85e in magnitude) and sum-of-charge constraints of −1, 0, and +1.

Figure 12.

The HIV-1 protease inhibitor darunavir28 with atom indices for reference to Table 1. Hydrogen atom indices are indicated in parentheses adjacent to the atoms to which they are bonded.

Table 1.

Partial atomic charges for darunavir. All charge values are given as multiples of the electron charge magnitude e and have been rounded to the nearest 0.01e. Box-constrained optimization enforced that each charge had maximum magnitude 0.85e.

Atom Index | RESP-fit | Unconstrained Optimal | Equality-Constrained (Σqi = 0) | Box-Constrained (Σqi = −1) | Box-Constrained (Σqi = 0) | Box-Constrained (Σqi = +1)

1 −0.87 −0.52 −0.45 −0.33 −0.44 −0.55
2 0.40 0.16 0.09 −0.05 0.06 0.16
3 −0.29 0.14 0.22 0.35 0.22 0.10
4 −0.18 −0.38 −0.46 −0.61 −0.48 −0.37
5 0.02 0.37 0.43 0.62 0.55 0.47

6 −0.18 −0.09 −0.12 −0.24 −0.20 −0.16
7 −0.29 −0.14 −0.12 −0.03 −0.08 −0.11
8 0.86 0.09 0.07 −0.33 −0.33 −0.35
9 −0.52 0.00 0.00 0.00 0.00 0.01
10 −0.50 0.07 0.06 0.08 0.09 0.10

11 −0.36 −2.35 −2.25 −0.85 −0.85 −0.85
12 −0.06 0.45 0.19 −0.75 −0.68 −0.48
13 0.15 0.55 0.80 0.85 0.81 0.49
14 −0.14 −0.19 −0.22 −0.02 −0.11 −0.09
15 −0.31 −0.22 −0.31 −0.16 −0.20 −0.09

16 −0.08 1.62 1.66 0.85 0.85 0.85
17 0.09 −0.17 −0.22 −0.21 −0.23 −0.19
18 −0.67 0.39 0.37 0.31 0.33 0.35
19 0.05 −1.08 −0.87 0.15 0.10 −0.11
20 −0.38 −0.41 −0.46 −0.85 −0.85 −0.79

21 0.68 0.76 0.76 0.82 0.85 0.85
22 −0.56 −0.38 −0.39 −0.42 −0.41 −0.39
23 −0.39 −0.13 −0.13 −0.03 −0.05 −0.07
24 0.12 −0.71 −0.67 −0.82 −0.85 −0.85
25 0.05 0.88 0.86 0.79 0.83 0.85

26 −0.49 −0.61 −0.58 −0.52 −0.56 −0.59
27 0.44 0.21 0.17 −0.03 0.06 0.16
28 −0.51 −0.34 −0.32 −0.26 −0.30 −0.33
29 0.07 −0.35 −0.48 −0.54 −0.37 −0.19
30 −0.12 1.70 1.89 0.85 0.85 0.85

31 −0.01 −0.14 −0.18 0.43 0.35 0.25
32 −0.08 1.36 1.26 0.84 0.85 0.85
33 −0.16 0.24 0.26 0.23 0.17 0.12
34 −0.19 −0.09 −0.13 −0.15 −0.07 −0.01
35 −0.18 0.11 0.16 0.20 0.10 0.01

36 −0.19 −0.11 −0.10 −0.04 −0.04 −0.05
37 −0.05 0.12 0.10 −0.03 −0.02 −0.00
38 −0.06 −0.63 −0.61 −0.43 −0.39 −0.35
39 0.39 0.22 0.19 0.16 0.19 0.22
40 0.39 0.10 0.03 −0.08 0.03 0.14

41 0.18 0.14 0.13 0.11 0.13 0.14
42 0.19 −0.13 −0.22 −0.37 −0.22 −0.07
43 0.21 0.03 0.03 0.04 0.04 0.04
44 0.21 0.20 0.20 0.21 0.21 0.21
45 0.07 0.14 0.17 0.25 0.26 0.25

46 0.11 0.06 0.09 0.17 0.18 0.18
47 0.02 0.07 0.03 0.05 0.05 0.11
48 0.03 0.04 0.03 −0.01 0.02 0.04
49 0.00 −0.07 −0.06 −0.09 −0.07 −0.07
50 0.06 0.04 0.03 −0.15 −0.11 −0.10

51 0.09 0.03 0.01 −0.04 0.02 0.06
52 0.06 −0.12 −0.19 −0.33 −0.18 −0.06
53 0.08 0.05 0.06 0.03 0.03 0.01
54 0.14 −0.04 −0.06 0.04 0.04 0.05
55 0.11 −0.28 −0.28 −0.15 −0.15 −0.16

56 0.14 0.34 0.34 0.28 0.29 0.30
57 0.41 0.25 0.25 0.28 0.29 0.29
58 0.18 0.13 0.08 −0.22 −0.22 −0.18
59 0.07 −0.21 −0.20 −0.16 −0.16 −0.15
60 0.07 −0.12 −0.11 −0.09 −0.08 −0.06

61 0.09 −0.06 −0.05 −0.03 −0.03 −0.04
62 0.16 0.01 0.00 0.01 0.02 0.03
63 0.17 −0.00 −0.01 −0.04 −0.02 −0.00
64 0.17 −0.08 −0.08 −0.09 −0.09 −0.08
65 0.17 −0.10 −0.22 −0.40 −0.20 −0.01

66 0.10 0.14 0.12 0.14 0.15 0.16
67 0.06 −0.05 −0.06 −0.05 −0.04 −0.03
68 0.11 −0.14 −0.14 −0.14 −0.14 −0.13
69 0.04 0.03 0.04 −0.04 −0.03 −0.02
70 0.06 0.22 0.18 0.12 0.19 0.25

71 0.07 −0.43 −0.43 −0.09 −0.17 −0.24
72 0.05 −0.44 −0.70 −0.69 −0.37 −0.05
73 0.11 0.08 0.11 0.19 0.13 0.07
74 0.09 0.13 0.16 0.22 0.18 0.13
75 0.23 0.24 0.24 0.27 0.28 0.29

We emphasize that the equality constraints have not been introduced to evaluate protonation effects, but only to estimate the influence of total charge on the optimal solution (and the associated binding free energy). Our results illustrate that the faster co-optimization method generates results consistent with the traditional approach. The unbound- and bound-state geometry discretizations consisted of 5,892 and 133,067 curved boundary elements, respectively. Computing the explicit Hessian required 576 applications of the operator M2,b, whereas unconstrained co-optimization required only 15 M2,b matrix–vector products. Figure 13 plots the unconstrained optimal charges computed using the explicit-Hessian and co-optimization methods; again, the answers agree extremely well.

Figure 13.

Unconstrained optimal partial atomic charges for the HIV-1 protease inhibitor darunavir computed using finite-difference and boundary-element explicit Hessians, as well as using the reverse-Schur co-optimization method. The boundary-element simulations employed curved boundary elements.

Electrostatic optimization of darunavir in the HIV-1 protease active site led to a significant improvement in the predicted electrostatic binding free energy. The ligand is net neutral; the wild-type charge distribution gives a finite-difference-calculated electrostatic binding free energy of 27.54 kcal/mol, whereas the unconstrained optimal solution computed using co-optimization gives 5.48 kcal/mol. The optimized binding free energies for the bound-constrained problems were 10.40 kcal/mol for the −1 e problem, 6.89 kcal/mol for the neutral problem, and 5.73 kcal/mol for the +1 e problem.

The improvement upon optimization can be attributed mainly to an accumulation of positive charge in the center of the ligand, near the negatively charged aspartyl dyad. These atoms are buried in both the bound and unbound states due to the molecular shape, and consequently can take larger charge values without incurring significant desolvation penalties. For the unconstrained problems, the net ligand charge was 0.64 e for the co-optimization method, 0.64 e for the curved-boundary-element explicit-Hessian method, and 1.00 e for the finite-difference method.

Inhibitor atoms that make direct hydrogen-bonding interactions with the protease, such as the aniline nitrogen and hydrogen atoms (atom indices 1, 39, and 40), the hydroxyl group (indices 18 and 57), and the bis-tetrahydrofuran oxygen atoms (indices 26 and 28), have optimal charges very similar to their quantum-mechanically derived values.

6 DISCUSSION

In this paper we have described an efficient technique, which we call reverse-Schur co-optimization, for calculating the molecular charge distribution that optimizes the electrostatic component of the free energy of binding to another molecule. The approach exhibits substantially better performance than traditional optimization approaches, which explicitly calculate the Hessian matrix before optimization. The co-optimization approach is a PDE-constrained optimization technique and exhibits comparable performance to alternative PDE-constrained optimization techniques that are well known in other areas of computational science and engineering. Although this paper has presented an approach based on a boundary-element method (BEM) for solving the electrostatics problem, no fundamental issues seem to preclude the use of other numerical methods in a co-optimization approach. The critical elements for efficient co-optimization appear to be the availability of good preconditioners and sufficiently accurate, but computationally inexpensive, Hessian approximations.

The reverse-Schur approach to PDE-constrained optimization is possible only because the PDE state variables and the decision variables are linearly related. This structure enables the PDE to be incorporated as a final algebraic manipulation before numerically solving the linear systems associated with quadratic programming. The all-at-once and nested approaches to PDE-constrained optimization, in contrast, are much more flexible with respect to the relationships between the state and decision variables.

Regularization—the penalization of certain search directions associated with the smallest eigenvalues—is more complicated for PDE-constrained approaches than for methods that rely on explicit Hessians. However, the BIBEE/P approach to estimating electrostatic interactions91 has been demonstrated to generate a sufficiently accurate Hessian approximation whose eigendecomposition can be used as the basis for deriving a penalty function. The superior conditioning of purely second-kind integral-equation formulations56,75 relative to first-kind or mixed first-second-kind equations has an unexpected consequence in that the Yoon and Lenhoff formulation cannot be used to generate a preconditioner-based Hessian approximation.
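The regularization strategy described above can be sketched as follows: eigendecompose the approximate Hessian and raise the modes associated with the smallest eigenvalues. The random symmetric matrix below is merely a placeholder for a BIBEE/P-style approximation, and the shift `beta` and the cutoff are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 10
G = rng.standard_normal((n, n))
H_approx = G @ G.T                    # placeholder for an approximate (symmetric PSD) Hessian

w, V = np.linalg.eigh(H_approx)       # eigendecomposition of the approximation
beta = 1.0                            # penalty strength (illustrative)
cutoff = 1e-2 * w.max()               # modes below this are considered "small"

# Penalize search directions associated with the smallest eigenvalues
w_reg = np.where(w < cutoff, w + beta, w)
H_reg = V @ np.diag(w_reg) @ V.T      # regularized operator for use in the optimization
```

The penalty leaves the well-conditioned modes untouched while preventing the optimizer from exploiting the nearly flat directions of the objective.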

A number of extensions to the co-optimization technique may make it still more efficient. For instance, primal-dual interior-point methods are less efficient than active-set methods for “warm start” problems, in which one begins optimizing from a neighborhood of the optimal solution. Coupling co-optimization to an active-set solver might therefore significantly reduce the cost of solving problems that differ only by the inclusion of varying constraints. Furthermore, a co-optimization warm-start method may be faster than the present implementation because there exist ways (such as Gasteiger–Marsili charges101) to rapidly estimate a wild-type charge distribution that could be used as an initial guess for the optimal distribution. One might also save the Hessian-vector products as they are formed, in essence allowing the Hessian to be “built” incrementally, such that after a sufficient number of optimizations have been performed the solver is using a completely explicit Hessian.
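The idea of saving Hessian-vector products until an explicit Hessian emerges can be sketched as below. This is a hypothetical helper, not part of the paper's implementation; once n linearly independent products have been recorded, the explicit matrix is recovered by solving H Q = Y.

```python
import numpy as np

class CachingHessian:
    """Wrap an implicit Hessian-vector product, recording every (input, output)
    pair; once n linearly independent products have been seen, recover the
    explicit n-by-n Hessian from H Q = Y and answer later products for free."""

    def __init__(self, matvec, n):
        self.matvec, self.n = matvec, n
        self.inputs, self.outputs = [], []
        self.explicit = None              # filled in once the cache spans R^n

    def __call__(self, q):
        if self.explicit is not None:
            return self.explicit @ q      # cheap dense product once H is known
        y = self.matvec(q)                # expensive implicit product (PDE solves)
        self.inputs.append(np.asarray(q, dtype=float))
        self.outputs.append(y)
        Q = np.column_stack(self.inputs)
        if np.linalg.matrix_rank(Q) == self.n:
            Y = np.column_stack(self.outputs)
            self.explicit = Y @ np.linalg.pinv(Q)   # solve H Q = Y for H
        return y

# Toy check against a known Hessian
rng = np.random.default_rng(3)
H_true = rng.standard_normal((4, 4))
cached = CachingHessian(lambda q: H_true @ q, 4)
for _ in range(4):
    cached(rng.standard_normal(4))
```

After the cache fills, repeated optimizations over the same geometry pay only dense matrix–vector product costs.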

The efficiency gains afforded by PDE-constrained methods in general, not just the reverse-Schur approach presented here, should allow significantly larger and more complex biological systems to be studied. Computational redesign of proteins, for instance, may produce optimization problems with dimension greater than one thousand. For these problems, co-optimization can offer a cost reduction of over two orders of magnitude;92 such an acceleration may allow the evaluation of many more candidate ligands or ligand poses. Also, geometry can now be varied to assess the best-case energetic cost of adding a functional group, as the addition changes a binding partner’s desolvation penalty, or to assess the effects of molecular flexibility.16 For example, it may be computationally feasible to use co-optimization to study the influence of different protonation states on binding free energies and on the optimal charge distributions associated with each state. Finally, the original electrostatic optimization paper by Lee and Tidor noted an unexpected asymmetry between the receptor charge distribution and the calculated optimal ligand distribution.1 Because co-optimization allows the use of substantially larger basis sets, it may enable the development of techniques that identify optimal charge placement as well as charge values.

Acknowledgement

The authors thank J. Nocedal, J.-H. Lee, D. J. Willis, D. F. Green, M. Knepley, and S. Leyffer for useful discussions. J. P. Bardhan gratefully acknowledges support from a Department of Energy Computational Science Graduate Fellowship, as well as from a Wilkinson Fellowship in Scientific Computing funded by the Mathematical, Information, and Computational Sciences Division Subprogram of the Office of Advanced Scientific Computing Research, Office of Science, U. S. Dept. of Energy, under Contract DE-AC02-06CH11357. J. White acknowledges support from grants from the National Science Foundation and the Singapore–MIT Alliance. This work was supported in part by the National Institutes of Health (GM065418).

Footnotes

Supporting Information Available

None available.

References

  • 1.Lee L-P, Tidor B. J. Chem. Phys. 1997;106:8681–8690. [Google Scholar]
  • 2.Kangas E, Tidor B. J. Phys. Chem. B. 2001;105:880–888. [Google Scholar]
  • 3.Lee L-P, Tidor B. Nat. Struct. Biol. 2001;8:73–76. doi: 10.1038/83082. [DOI] [PubMed] [Google Scholar]
  • 4.Green DF, Tidor B. J. Mol. Biol. 2004;342:435–452. doi: 10.1016/j.jmb.2004.06.087. [DOI] [PubMed] [Google Scholar]
  • 5.Chong LT, Dempster SE, Hendsch ZS, Lee L-P, Tidor B. Protein Sci. 1998;7:206–210. doi: 10.1002/pro.5560070122. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 6.Kangas E, Tidor B. J. Chem. Phys. 1998;109:7522–7545. [Google Scholar]
  • 7.Kangas E, Tidor B. Phys. Rev. E. 1999;59:5958–5961. doi: 10.1103/physreve.59.5958. [DOI] [PubMed] [Google Scholar]
  • 8.Kangas E, Tidor B. J. Chem. Phys. 2000;112:9120–9131. [Google Scholar]
  • 9.Lee L-P, Tidor B. Protein Sci. 2001;10:362–377. doi: 10.1110/ps.40001. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 10.Mandal A, Hilvert D. J. Am. Chem. Soc. 2003;125:5598–5599. doi: 10.1021/ja029447t. [DOI] [PubMed] [Google Scholar]
  • 11.Sulea T, Purisima EO. J. Phys. Chem. B. 2001;105:889–899. [Google Scholar]
  • 12.Sulea T, Purisima EO. Biophys. J. 2003;84:2883–2896. doi: 10.1016/S0006-3495(03)70016-2. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 13.Sims PA, Wong CF, McCammon JA. J. Comput. Chem. 2004;25:1416–1429. doi: 10.1002/jcc.20067. [DOI] [PubMed] [Google Scholar]
  • 14.Green DF, Tidor B. Proteins. 2005;60:644–657. doi: 10.1002/prot.20540. [DOI] [PubMed] [Google Scholar]
  • 15.Armstrong KA, Tidor B, Cheng AC. J. Med. Chem. 2006;49:2470–2477. doi: 10.1021/jm051105l. [DOI] [PubMed] [Google Scholar]
  • 16.Gilson MK. J. Chem. Theory Comput. 2006;2:259–270. doi: 10.1021/ct050226y. [DOI] [PubMed] [Google Scholar]
  • 17.Selzer T, Albeck S, Schreiber G. Nat. Struct. Biol. 2000;7:537–541. doi: 10.1038/76744. [DOI] [PubMed] [Google Scholar]
  • 18.Shaul Y, Schreiber G. Proteins. 2005;60:341–352. doi: 10.1002/prot.20489. [DOI] [PubMed] [Google Scholar]
  • 19.Brock K, Talley K, Coley K, Kundrotas P, Alexov E. Biophys. J. 2007;93:3340–3352. doi: 10.1529/biophysj.107.112367. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 20.Haftka R. AIAA J. 1985;23:1099–1103. [Google Scholar]
  • 21.Orozco C, Ghattas O. AIAA J. 1992;30:1877–1885. [Google Scholar]
  • 22.Biegler L, Nocedal J, Schmid C. SIAM J. Optim. 1995;5:314–347. [Google Scholar]
  • 23.Dennis J, Heinkenschloss M, Vicente L. SIAM J. Control Optim. 1998;36:1750–1794. [Google Scholar]
  • 24.Biros G, Ghattas O. SIAM J. Sci. Comput. 2005;27:687–713. [Google Scholar]
  • 25.Biros G, Ghattas O. SIAM J. Sci. Comput. 2005;27:714–739. [Google Scholar]
  • 26.Brayton RK, Hachtel GD, Sangiovanni-Vincentelli AL. Proc. IEEE. 1981;69:1334–1362. [Google Scholar]
  • 27.Fisher M, Nocedal J, Tremolet Y, Wright SJ. Data assimilation in weather forecasting: a case study in PDE-constrained optimization. [accessed August 1, 2009];Optimization and Engineering. 2008 DOI: 10.1007/s11081-008-9051-5. http://www.springerlink.com/content/e47441q543236t31/
  • 28.Surleraux DL, Tahri A, Verschueren WG, Pille GM, de Kock HA, Jonckers TH, Peeters A, Meyer SD, Azijn H, Pauwels R, de Bethune MP, King NM, Prabu-Jeyabalan M, Schiffer CA, Wigerinck PB. J. Med. Chem. 2005;48:1813–1822. doi: 10.1021/jm049560p. [DOI] [PubMed] [Google Scholar]
  • 29.Huff JR. J. Med. Chem. 1991;34:2305–2314. doi: 10.1021/jm00112a001. [DOI] [PubMed] [Google Scholar]
  • 30.Flexner C. N. Engl. J. Med. 1998;338:1281–1292. doi: 10.1056/NEJM199804303381808. [DOI] [PubMed] [Google Scholar]
  • 31.Anderson M, Beattie J, Breault G, Breed J, Byth K, Culshaw J, Ellston R, Green S, Minshull C, Norman R, Pauptit R, Stanway J, Thomas A, Jewsbury P. Bioorg. Med. Chem. Lett. 2003;13:3021. doi: 10.1016/s0960-894x(03)00638-3. [DOI] [PubMed] [Google Scholar]
  • 32.Vandenheuvel S, Harlow E. Science. 1993;262:2050–2054. doi: 10.1126/science.8266103. [DOI] [PubMed] [Google Scholar]
  • 33.Hartwell LH, Kastan MB. Science. 1994;266:1821–1828. doi: 10.1126/science.7997877. [DOI] [PubMed] [Google Scholar]
  • 34.Honig B, Sharp K, Yang AS. J. Phys. Chem. 1993;97:1101–1109. [Google Scholar]
  • 35.Levy RM, Zhang LY, Gallicchio E, Felts AK. J. Am. Chem. Soc. 2003;125:9523–9530. doi: 10.1021/ja029833a. [DOI] [PubMed] [Google Scholar]
  • 36.Wagoner JA, Baker NA. Proc. Natl. Acad. Sci. USA. 2006;103:8331–8336. doi: 10.1073/pnas.0600118103. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 37.Sharp KA, Honig B. Annu. Rev. Biophys. Biophys. Chem. 1990;19:301–332. doi: 10.1146/annurev.bb.19.060190.001505. [DOI] [PubMed] [Google Scholar]
  • 38.Richards FM. Annu. Rev. Biophys. Bioeng. 1977;6:151–176. doi: 10.1146/annurev.bb.06.060177.001055. [DOI] [PubMed] [Google Scholar]
  • 39.Juffer AH, Botta EFF, van Keulen BAM, van der Ploeg A, Berendsen HJC. J. Comput. Phys. 1991;97:144–171. [Google Scholar]
  • 40.Miertus S, Scrocco E, Tomasi E. J. Chem. Phys. 1981;55:117–129. [Google Scholar]
  • 41.Warwicker J, Watson HC. J. Mol. Biol. 1982;157:671–679. doi: 10.1016/0022-2836(82)90505-8. [DOI] [PubMed] [Google Scholar]
  • 42.Shaw PB. Phys. Rev. A. 1985;32:2476–2487. doi: 10.1103/physreva.32.2476. [DOI] [PubMed] [Google Scholar]
  • 43.Zauhar RJ, Morgan RS. J. Mol. Biol. 1985;186:815–820. doi: 10.1016/0022-2836(85)90399-7. [DOI] [PubMed] [Google Scholar]
  • 44.Klapper I, Hagstrom R, Fine R, Sharp K, Honig B. Proteins. 1986;1:47–59. doi: 10.1002/prot.340010109. [DOI] [PubMed] [Google Scholar]
  • 45.Zauhar RJ, Morgan RS. J. Comput. Chem. 1988;9:171–187. [Google Scholar]
  • 46.Gilson MK, Honig B. Proteins. 1988;4:7–18. doi: 10.1002/prot.340040104. [DOI] [PubMed] [Google Scholar]
  • 47.Yoon BJ, Lenhoff AM. J. Comput. Chem. 1990;11:1080–1086. [Google Scholar]
  • 48.Nicholls A, Honig B. J. Comput. Chem. 1991;12:435–445. [Google Scholar]
  • 49.Holst MJ. Ph.D.thesis. Univ. of Ill. at Urbana-Champaign: 1993. [Google Scholar]
  • 50.You TJ, Harvey SC. J. Comput. Chem. 1993;14:484–501. [Google Scholar]
  • 51.Cammi R, Tomasi J. J. Comput. Chem. 1995;16:1449–1458. [Google Scholar]
  • 52.Madura JD, Briggs JM, Wade RC, Davis ME, Luty BA, Ilin A, Antosiewicz J, Gilson MK, Bagheri B, Ridgway-Scott L, McCammon JA. Comput. Phys. Comm. 1995;91:57–95. [Google Scholar]
  • 53.Purisima EO, Nilar SH. J. Comput. Chem. 1995;16:681–689. [Google Scholar]
  • 54.Bharadwaj R, Windemuth A, Sridharan S, Honig B, Nicholls A. J. Comput. Chem. 1995;16:898–913. [Google Scholar]
  • 55.Cances E, Mennucci B, Tomasi J. J. Chem. Phys. 1997;107:3032–3041. [Google Scholar]
  • 56.Liang J, Subramaniam S. Biophys. J. 1997;73:1830–1841. doi: 10.1016/S0006-3495(97)78213-4. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 57.Holst M, Baker N, Wang F. J. Comput. Chem. 2000;21:1319–1342. [Google Scholar]
  • 58.Rocchia W, Alexov E, Honig B. J. Phys. Chem. B. 2001;105:6507–6514. [Google Scholar]
  • 59.Bordner AJ, Huber GA. J. Comput. Chem. 2003;24:353–367. doi: 10.1002/jcc.10195. [DOI] [PubMed] [Google Scholar]
  • 60.Boschitsch AH, Fenley MO, Zhou H-X. J. Phys. Chem. B. 2002;106:2741–2754. [Google Scholar]
  • 61.Boschitsch AH, Fenley MO. J. Comput. Chem. 2004;25:935–955. doi: 10.1002/jcc.20000. [DOI] [PubMed] [Google Scholar]
  • 62.Zhou HX. Biophys. J. 1993;65:955–963. doi: 10.1016/S0006-3495(93)81094-4. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 63.Purisima EO. J. Comput. Chem. 1998;19:1494–1504. [Google Scholar]
  • 64.Kuo SS, Altman MD, Bardhan JP, Tidor B, White JK. Fast Methods for Simulation of Biomolecule Electrostatics. Proc. Int. Conf. Comput.-Aided Des. (ICCAD); San Jose, CA. New York, NY: ACM; 2002. pp. 466–473. [Google Scholar]
  • 65.Altman MD, Bardhan JP, Tidor B, White JK. IEEE Trans. Comput.-Aided Des. 2006;25:274–284. [Google Scholar]
  • 66.Altman MD, Bardhan JP, White JK, Tidor B. J. Comput. Chem. 2009;30:132–153. doi: 10.1002/jcc.21027. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 67.Chipman DM. J. Chem. Phys. 2004;120:5566–5575. doi: 10.1063/1.1648632. [DOI] [PubMed] [Google Scholar]
  • 68.Altman MD, Bardhan JP, White JK, Tidor B. An Efficient and Accurate Surface Formulation for Biomolecule Electrostatics in Non-ionic Solution. Proc. 2005 IEEE Eng. Med. Biol. Conf. (IEEE-EMBS) 2005; Shanghai. Piscataway, NJ: IEEE; 2005. pp. 7591–7595. [DOI] [PubMed] [Google Scholar]
  • 69.Grandison S, Penfold R, Vanden-Broeck J-M. J. Comput. Phys. 2007;224:663–680. [Google Scholar]
  • 70.Barone V, Cossi M, Tomasi J. J. Chem. Phys. 1997;107:3210–3221. [Google Scholar]
  • 71.Mennucci B, Cancès E, Tomasi J. J. Phys. Chem. B. 1997;101:10506–10517. [Google Scholar]
  • 72.Levitt DG. Biophys. J. 1978;22:209–219. doi: 10.1016/S0006-3495(78)85485-X. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 73.Attard P. J. Chem. Phys. 2003;119:1365–1372. [Google Scholar]
  • 74.Boda D, Voliskó M, Eisenberg B, Nonner W, Henderson D, Gillespie D. J. Chem. Phys. 2006;125 doi: 10.1063/1.2212423. 034901. [DOI] [PubMed] [Google Scholar]
  • 75.Atkinson KE. The Numerical Solution of Integral Equations of the Second Kind. Cambridge University Press; 1997. [Google Scholar]
  • 76.Hsiao GC, Wendland WL. Encyclopedia of Computational Mechanics. 2004 [Google Scholar]
  • 77.Zauhar RJ. J. Comput.-Aided Mol. Des. 1995;9:149–159. doi: 10.1007/BF00124405. [DOI] [PubMed] [Google Scholar]
  • 78.Bardhan JP, Altman MD, White JK, Tidor B. J. Chem. Phys. 2007;127 doi: 10.1063/1.2743423. 014701. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 79.Tausch J, Wang J, White J. IEEE Trans. Comput.-Aided Des. 2001;20:1398–1405. [Google Scholar]
  • 80.Bardhan JP. J. Chem. Phys. 2009;130 doi: 10.1063/1.3080769. 094102. [DOI] [PubMed] [Google Scholar]
  • 81.Nabors K, Korsmeyer FT, Leighton FT, White J. SIAM J. Sci. Comput. 1994;15:713–735. [Google Scholar]
  • 82.Lu BZ, Cheng XL, Huang J, McCammon JA. Proc. Natl. Acad. Sci. USA. 2006;103:19314–19319. doi: 10.1073/pnas.0605166103. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 83.Golub GH, Loan CFV. Matrix Computations. 3rd ed. Baltimore, MD: The Johns Hopkins University Press; 1996. [Google Scholar]
  • 84.Saad Y, Schultz M. SIAM J. Sci. Stat. Comput. 1986;7:856–869. [Google Scholar]
  • 85.Greengard L, Rokhlin V. J. Comput. Phys. 1987;73:325–348. [Google Scholar]
  • 86.Phillips JR, White JK. IEEE Trans. Comput.-Aided Des. 1997;16:1059–1072. [Google Scholar]
  • 87.Bertsekas DP. Nonlinear Programming. 2nd ed. Nashua, NH: Athena Scientific; 1999. [Google Scholar]
  • 88.Wright SJ. Primal-Dual Interior Point Methods. Philadelphia, PA: SIAM; 1997. [Google Scholar]
  • 89.Bardhan JP, Lee JH, Altman MD, Benson S, Leyffer S, Tidor B, White JK. Biomolecule Electrostatic Optimization with an Implicit Hessian. In: Laudon M, Romanowicz B, editors. Tech. Proc. 2004 Nan-otech. Conf. Trade Show, v. 1; Boston MA. 2004; Cambridge MA: NSTI; 2004. pp. 164–167. [Google Scholar]
  • 90.Sanner M, Olson AJ, Spehner JC. Biopolymers. 1996;38:305–320. doi: 10.1002/(SICI)1097-0282(199603)38:3%3C305::AID-BIP4%3E3.0.CO;2-Y. [DOI] [PubMed] [Google Scholar]
  • 91.Bardhan JP. J. Chem. Phys. 2008;129 doi: 10.1063/1.2987409. 144105. [DOI] [PubMed] [Google Scholar]
  • 92.Bardhan JP, Lee JH, Kuo SS, Altman MD, Tidor B, White JK. Fast Methods for Biomolecule Charge Optimization. In: Laudon M, Romanowicz B, editors. Tech. Proc. 2003 Nanotech. Conf. Trade Show, v. 2; San Francisco, CA. 2003; Cambridge, MA: NSTI; 2003. pp. 508–511. [Google Scholar]
  • 93.Brooks BR, Bruccoleri RE, Olafson BD, States DJ, Swaminathan S, Karplus M. J. Comput. Chem. 1983;4:187–217. [Google Scholar]
  • 94.Brünger AT, Karplus M. Proteins. 1988;4:148–156. doi: 10.1002/prot.340040208. [DOI] [PubMed] [Google Scholar]
  • 95.MacKerell AD, Jr, Bashford D, Bellott M, Dunbrack RL, Jr, Evanseck JD, Field MJ, Fischer S, Gao J, Guo H, Ha S, Joseph–McCarthy D, Kuchnir L, Kuczera K, Lau FTK, Mattos C, Michnick S, Ngo T, Nguyen DT, Prodhom B, Reiher WE, III, Roux B, Schlenkrich M, Smith JC, Stote R, Straub J, Watanabe M, Wiorkiewicz–Kuczera J, Yin D, Karplus M. J. Phys. Chem. B. 1998;102:3586–3616. doi: 10.1021/jp973084f. [DOI] [PubMed] [Google Scholar]
  • 96.Sitkoff D, Sharp KA, Honig B. J. Phys. Chem. B. 1994;98:1978–1988. [Google Scholar]
  • 97.Frisch MJ, Trucks GW, Schlegel HB, Scuseria GE, Robb MA, Cheeseman JR, Zakrzewski VG, Montgomery JA, Jr, Stratmann RE, Burant JC, Dapprich S, Millam JM, Daniels AD, Kudin KN, Strain MC, Farkas O, Tomasi J, Barone V, Cossi M, Cammi R, Mennucci B, Pomelli C, Adamo C, Clifford S, Ochterski J, Petersson GA, Ayala PY, Cui Q, Morokuma K, Salvador P, Dannenberg JJ, Malick DK, Rabuck AD, Raghavachari K, Foresman JB, Cioslowski J, Ortiz JV, Baboul AG, Stefanov BB, Liu G, Liashenko A, Piskorz P, Komáromi I, Gomperts R, Martin RL, Fox DJ, Keith T, Al-Laham MA, Peng CY, Nanayakkara A, Challacombe M, Gill PMW, Johnson B, Chen W, Wong MW, Andres JL, Gonzalez C, Head-Gordon M, Replogle ES, Pople JA. Gaussian 98. Pittsburgh, PA: Gaussian, Inc.; 1998. [Google Scholar]
  • 98.Green DF, Tidor B. J. Phys. Chem. B. 2003;107:10261–10273. [Google Scholar]
  • 99.Bayly CI, Cieplak P, Cornell WD, Kollman PA. J. Phys. Chem. 1993;97:10269–10280. [Google Scholar]
  • 100.Altman MD. Unpublished results. [Google Scholar]
  • 101.Rizzo RC, Aynechi T, Case DA, Kuntz ID. J. Chem. Theory Comput. 2006;2:128–139. doi: 10.1021/ct050097l. [DOI] [PubMed] [Google Scholar]
