Abstract
Accurate representation of intermolecular forces has been the central task of classical atomic simulations, known as molecular mechanics. Recent advancements in molecular mechanics models have put forward the explicit representation of permanent and/or induced electric multipole (EMP) moments. The formulas developed so far to calculate EMP interactions tend to have complicated expressions, especially in Cartesian coordinates, and can only be applied to a specific kernel potential function; consequently, one needs to develop a new formula each time a new kernel function is encountered. The complication of these formalisms arises from an intriguing and yet obscured mathematical relation between the kernel functions and the gradient operators. Here, I uncover this relation via rigorous derivation and find that the formula to calculate EMP interactions is essentially invariant to the potential kernel functions as long as they are of the form f(r), i.e., any Green's function that depends on the inter-particle distance. I provide an algorithm for efficient evaluation of EMP interaction energies, forces, and torques for any kernel f(r) up to an arbitrary rank of EMP moments in Cartesian coordinates. The working equations of this algorithm are essentially the same for any kernel f(r). Recently, a few recursive algorithms were proposed to calculate EMP interactions. Depending on the kernel functions, the algorithm here is about 4–16 times faster than these algorithms in terms of the required number of floating point operations and is much more memory efficient. I show that it is even faster than a theoretically ideal recursion scheme, i.e., one that requires 1 floating point multiplication and 1 addition per recursion step. This algorithm has a compact vector-based expression that is optimal for computer programming. The Cartesian nature of this algorithm makes it fit easily into modern molecular simulation packages as compared with spherical coordinate-based algorithms.
A software library based on this algorithm has been implemented in C++11 and has been released.
I. INTRODUCTION
Classical molecular mechanics simulations use empirical potential energy functions, called force fields, to model molecular interactions at the atomic level. Force fields are typically divided into bonded and nonbonded terms, the latter of which are usually approximations of electrostatics, dispersion, and repulsion interactions. The most commonly used models for the nonbonded interactions are the Coulomb and Lennard-Jones potentials. These potentials are, in general, functions of the nuclear positions and of atomic properties such as the electron density; the latter are called parameters and are determined from quantum mechanical calculations.
Historically, the parameters for the Coulomb potential are the so-called atomic “partial charges,” which are fractional atom-centered charges representing the continuous electron density. Although the partial-charge model simplifies the calculation and is easy to implement, its failure to reproduce the molecular electrostatic potential has been known for decades,1–9 especially in the biomolecular simulation field. This issue mainly arises from the fact that molecular orbitals, in general, are not isotropic in space as implicitly assumed by the partial-charge model. We refer the reader to recent reviews10,11 on this issue. Most notably, this issue is more pronounced in the case of a coarse-grained force field, where a group of atoms are modeled as a super-atom whose partial charges sum to neutrality. Under this circumstance, the complete loss of electrostatics between neutral super-atoms has been shown to cause simulation artifacts.12,13 One intuitive way to improve the electrostatic model is to include higher order electric multipole (EMP) moments to better represent the electron cloud. Significant effort has been devoted to the development of force fields that explicitly represent permanent and polarizable point atomic EMP moments14–20 as well as continuous Gaussian EMP moments.21,22 EMP moments have also been used in recent coarse-grained force fields for proteins23 and lipids.24 Very recently, a polarizable Gaussian atomic EMP model was used in refining x-ray crystal structures,25–28 where the authors found that an expansion up to rank-4 EMP (hexadecapole) moments is sometimes necessary to describe the bond electron density in tetrahedral geometries.
Another concern in calculating molecular potentials is long-range interactions. This is most prominent in evaluating the Coulomb potential, although there have recently been reports on the importance of long-range dispersion interactions in computing fluid-fluid interfacial properties.29–32 Among the first and still most popular ways to evaluate long-range interactions are Ewald summation and the particle-mesh (PM) method based on it. The details of this method have been extensively reviewed.10,33,34 While the formalism for the PM method in the case of the partial-charge model, as well as its implementation, has been established for several decades, the equivalent for the case of higher-rank EMP moments was only developed recently.15,35–37 There are also real-space methods based on the Wolf summation38 that have been developed recently39,40 and that explicitly incorporate EMP moments.
While the importance of higher-rank EMP moments in improving force fields has become more appreciated, the additional equations beyond the partial-charge model have not been cast into an easily recognizable form. Also, the formalism to compute the EMP-EMP interaction energies and forces has not been generalized to encompass the variety of potential energy functions used under different situations, especially when additional, not necessarily complicated, mathematical or numerical manipulation is needed in the treatment of long-range interactions. For example, the formalism to compute the Ewald summation up to the quadrupole level was originally proposed by Smith41 but later recast by Aguado and Madden35 in a completely different form; neither formalism is readily generalizable to support EMP moments of higher ranks. Moreover, these formalisms use sparse and obscure expressions that have to be written out explicitly in computer programs, which makes them difficult to implement and debug.
The other aspect of the calculation involving EMP moments is efficiency. The canonical tensor-based method for calculating EMP interaction energies and forces has a vector-matrix-vector bilinear form and requires populating the central matrix. However, it is generally not the most efficient way to first populate the matrix and then carry out the bilinear contraction (or vector-matrix multiplication); I will show later that the algorithm developed here needs significantly fewer floating point operations to evaluate EMP interaction energies and forces than several algorithms developed recently where the central matrix is populated by recursion. Specifically, the algorithm reported here is about 4 times faster (in terms of the required number of floating-point operations) than the McMurchie-Davidson algorithm proposed by Sagui et al.15 and about 16 times faster than another one proposed by Boateng and Todorov37 for evaluating the damped Coulomb potential used in Ewald summation at the hexadecapole level. It is even faster than a theoretically ideal recursion scheme where only 1 multiplication and 1 addition are needed to construct each element of the central matrix.
In this study, I summarize the problems in electric multipole-multipole interactions and provide an efficient algorithm to evaluate the multipole interaction energies, forces, and torques in Cartesian coordinates for any kernel function f(r). This study is based on the work of Applequist42 and of Burgos and Bonadeo43 on Cartesian tensors applied to the solution of Poisson's equation, i.e., the Coulomb potential function. This algorithm is superior to the canonical tensor-based one in terms of computational efficiency and has a compact vector-based expression that makes the implementation in computer programs very easy. I compare its computational complexity to that of several recursive algorithms developed recently and show that the current method is, in general, more efficient. A C++11 template library based on this algorithm has been developed and released.
The rest of this paper is organized as follows: Section II presents the basics of multipole expansion and multipole interaction with a generalization to a wide range of kernel functions. Sections III and IV present a derivation to an efficient algorithm to calculate multipole interactions via any kernel function of the form f(r) and show that its mathematical expression is invariant with the kernel function. Sections V and VI provide insight into how the algorithm works and its novelty, respectively. Section VII shows how one can apply the algorithm to Ewald summation. Section VIII provides complexity analysis of this algorithm in comparison to various recursive algorithms developed recently. In Section IX, I will describe how to implement the algorithm in computer programs.
II. THE BASIC PROBLEM
First, I refer the reader to the Appendix for an explanation of the notation I use throughout this paper. Poisson's equation for the electrostatic potential ϕ(r) of a given source ρ(r) is
| Δϕ(r) = −ρ(r) | (2.1) |
where Δ ≡ ∇2 is the Laplace operator. For convenience, I denote the pair of ϕ(r) and ρ(r) as
| (2.2) |
The Green’s function for Equation (2.1) is
| G(r) = 1/(4πr) | (2.3) |
Now consider a cluster of N sources around a point (called the electric multipole or EMP site): rj ∈ ℝ3,
| (2.4) |
where δ(r) is the Dirac delta function, γj(r) represents the “shape” of the source function, and ∗ denotes convolution. Since convolution commutes with Δ, we have for ,
| (2.5) |
Define djk ≡ rk − rj ∀k = 1, 2, …, N so that at a point r outside the cluster, i.e., |r − rj| > max{|djk|}, the Taylor series of Equation (2.5) about (dj1, …, djN) = (0, …, 0) converges,
| (2.6) |
where ⋅ n ⋅ denotes n-fold contraction (see the Appendix for a description of the notation used) and are the Cartesian multipoles at site j,
| (2.7) |
with being the n-fold tensor product of djk:
| djk(n) ≡ djk ⊗ djk ⊗ ⋯ ⊗ djk (n factors) | (2.8) |
and is the order-n gradient (with respect to the coordinates of site j) operator
| ∇j(n) ≡ ∇j ⊗ ∇j ⊗ ⋯ ⊗ ∇j (n factors) | (2.9) |
Similarly, the electrostatic energy between EMP site j and another site i is
| (2.10) |
We define a rank-n point EMP operator mj and the corresponding shaped operator of site j as
| (2.11) |
| (2.12) |
so that as in Equation (2.6) and
| (2.13) |
as in Equation (2.10), which means that mj () acts as a point (shaped) EMP density distribution of site j, and the electrostatic energies, forces, and torques between a pair of EMP sites can be obtained from those between a pair of charge density distributions with the charge density replaced by the EMP operator defined in Equation (2.11). For example, the electrostatic energy between 2 point-EMP sites i and j is (by inserting Equation (2.6) into Equation (2.10))
| (2.14) |
where
| (2.15) |
and the force on site i is
| (2.16) |
and the torque on site i is (see Section S1 in the supplementary material for the derivation57)
| (2.17) |
where is the Levi-Civita symbol. In general, we need to evaluate the bi-contraction of the form
| (2.18) |
where the n-fold gradient of f(r) exists.
For most applications of EMP moments, such as Ewald sum or the Fast Multipole Method44,45 (FMM), f is a function of inter-particle distance only. If we restrict f to f ≡ f(r2), we can rewrite Equation (2.18) as
| (2.19) |
In Sec. III, I will develop an expansion of in terms of a function of so that expression (2.18) can be evaluated in the same expansion in terms of contractions of the form A(n) ⋅ k ⋅ r(k) and B(m) ⋅ k ⋅ r(k) without the need to populate the central tensor . However, I want to remind the reader here that the definitions of Equations (2.11) and (2.12) are independent of the linear operator Δ and its Green’s function and are valid as long as the Taylor series in Equation (2.6) converges. That said, I do assume the linear operator is translation invariant, which is true for simulations of particle systems. This means we can apply it to other linear equations (with appropriate shaping function γ(r) to guarantee the convergence of Equation (2.15)) such as Yukawa’s equation,
| (Δ − λ2)ϕ(r) = −ρ(r) | (2.20) |
or the Helmholtz equation,
| (Δ + k2)ϕ(r) = −ρ(r) | (2.21) |
III. THE n-FOLD GRADIENT OF f(r2)
The direct approach to evaluating expression (2.19) is to first populate the tensor as a matrix and then perform a vector-matrix-vector multiplication. However, the complexity of this approach could be very high depending on the complexity of calculating . For example, the complexity of populating this tensor is on the order of n6 based on the algorithm published previously.42
On the other hand, if we can decompose into , where t ≡ t(r) is a simple function of r, the order-2n bi-contraction can be split into two order-n contractions and the complexity is reduced to the order of n3. Although such a decomposition does not exist in general, a similar “divide-and-conquer” approach has been proposed previously for the Coulomb kernel using a diagrammatic method to evaluate EMP interactions.43 Here, I take one step further and generalize the approach to any function f ≡ f(r2), where f is a function of the squared norm of r.
First, we need the help of the following definition:
Definition 3.1.
Define r(n)δ(s) as a tensor of rank n + 2s,
(3.1) where δij is the Kronecker delta and the sum is over all the permutations of the set of indices {α1…αn+2s} that give distinct terms, and there are a total of terms. The permutations in the sum guarantee that r(n)δ(s) is totally symmetric.
The following lemmas are direct applications of Definition 3.1:
Lemma 3.1.
Given a tensor r(n)δ(1) of rank n + 2, as in Definition 3.1, and one of its indices i ∈ [1, n + 2], we have the following equality:
(3.2) where δir(n) is r(n)δ(1) with the specific index i fixed on δ.
Proof.
This can be obtained trivially by realizing that the sum in Definition 3.1 can be split into a sum over index i attached to r plus another sum where i is attached to δ.□
Lemma 3.2.
(3.3)
Proof.
Since r(n) = rα1rα2…rαn, the chain rule gives us
(3.4)
□
With the help of Definition 3.1 and Lemmas 3.1 and 3.2, we arrive at the following theorem:
Theorem 3.1.
Given a real-valued function f on , whose n-fold gradient exists, its n-fold gradient can be expanded into
(3.5) where ⌊n/2⌋ is the largest integer no larger than n/2 and
(3.6)
Proof.
We will proceed by induction. Equation (3.5) is obviously true for n = 0 and we have
where the 2nd equality follows directly from Lemma 3.2. Note that for k = 0 in the sum, the 1st term on the right-hand side is just 2n+1Fn+1r(n+1); for , the 2nd term on the right-hand side is just 2n−kFn−kδ(k+1) if n is odd and it vanishes if n is even; for , the kth and the (k + 1)th terms can be merged as
which follows from Lemma 3.1. Thus, we have for
and for
In general, we have
which is Equation (3.5) with n replaced by n + 1.□
Because is totally symmetric, it is more convenient to rewrite Equation (3.5) in its compressed tensor form. Notice that the non-vanishing terms in r(n−2k)δ(k) are those of the form , where , and 2ki ≤ ni. The redundancy of this term in the sum is the number of ways k1 distinct pairs of indices αiαj can be selected from those in α1…αn which are assigned the value of 1, which is , times that number for k2 and k3. Thus,
| (3.7) |
where the sum is over all the combinations of k1, k2, k3 whose sum is k. Inserting this into Equation (3.5), we get
| (3.8) |
One can verify that when , Equations (3.5) and (3.8) reduce to the solid spherical harmonics of degree n in Cartesian coordinates, which are central to EMP interactions and have been given before.42,43 Also, if , which is the Boys function of order 0, Equations (3.5) and (3.8) reduce to the Hermite Coulomb integral, which plays an important role in the evaluation of quantum molecular integrals in Gaussian-type orbital theory.46,47 Notice that Definition 3.1 and Theorem 3.1 are independent of the dimension N of r, which means they can be applied to cases N ≥ 3. For example, if f(r2) = 1/(2π)N/2exp(−r2/2), Equation (3.5) reduces to the N-dimensional Hermite polynomials.48
The factorization of the derivatives of f in Theorem 3.1 provides a way to evaluate expression (2.18) without the need to populate the central tensor since r(n−2k)δ(k) is a simple function of r only. This is given in Section IV.
IV. EVALUATION OF THE BI-CONTRACTION FORM
Given Equation (3.5), evaluating expression (2.18) is equivalent to evaluating
| (4.1) |
for , multiplying by the coefficients and summing over all k. The terms in expression (4.1) can be split into the following categories:
-
1.
terms arising from r contracted with A(n) or B(m);
-
2.
terms arising from δij contracted with both A(n) and B(m) where i comes from A(n) and j comes from B(m);
-
3.
terms arising from δij contracted with A(n), i.e., taking A(n)’s trace;
-
4.
terms arising from δij contracted with B(m), i.e., taking B(m)’s trace.
If we let lc, ln, and lm be the number of terms in categories 2–4, we arrive at the following expression:
| (4.2) |
where Om ≡ m − 2lm − lc, On ≡ n − 2ln − lc, and the sum is over all lc, ln, lm ∈ ℕ0 so that lc + ln + lm = k. The coefficient is the degeneracy of each term in the sum. We can insert Equation (4.2) into (3.5) and (2.19) to get
| (4.3) |
Similarly, we can obtain the expansion for the bi-contraction for forces as in Equation (2.16),
| (4.4) |
and torques as in Equation (2.17),
| (4.5) |
where the inner sum is over all lc, ln, lm, qr, qn, qm ∈ ℕ0 so that lc + ln + lm = k and qr + qn + qm = 1 and , . The coefficient is again the degeneracy. One can verify that Equations (4.3) and (4.4) in the case of and are equivalent to those given by Burgos and Bonadeo,43 and Smith,41 respectively, although the expression given here is not limited to those cases.
There are a few things I want to point out here before changing the topic. First, note that when A(n) or B(m) is traceless, the terms corresponding to ln≠0 or lm≠0 vanish and Equations (4.3)–(4.5) become much simpler. When neither A(n) nor B(m) is traceless, one can easily prove that the following equations hold if and only if the central tensor is totally symmetric and traceless:
| (4.6) |
and
| (4.7) |
where !! indicates the double factorial and 𝒯n is the so-called “detracer” operator defined by Applequist,42 which acts on a totally symmetric tensor A(n) to give a totally symmetric and traceless tensor 𝒯nA(n). In practice, the traceless EMP moments are often used in lieu of as defined in Equation (2.7). However, I want to remind the reader here that this , in general, is not traceless and Equation (4.7) is not necessarily true. It can be proved that is totally traceless if and only if
| (4.8) |
The proof is given in Section S2,57 but the reader can easily verify that the Coulomb kernel satisfies this condition while does not, the latter of which is the damped Coulomb potential used in the direct-space sum of the Ewald summation method. Second, some spherical coordinate-based algorithms for calculating EMP interactions also take advantage of the traceless tensor when Equation (4.8) is satisfied.36 One might therefore argue that a spherical coordinate-based algorithm is more efficient than a Cartesian coordinate-based one. However, I want to point out here that this very same trick in the spherical coordinate-based algorithms simply manifests itself in the current Cartesian coordinate-based approach by reducing the number of terms in Equations (4.3)–(4.5). Since Cartesian coordinate-based algorithms are generally more straightforward to implement, most molecular simulation packages use Cartesian rather than spherical coordinates. Therefore, a coordinate transformation at every time step is required to use a spherical coordinate-based algorithm in most simulation packages. Note that this transformation involves evaluating a large number of trigonometric functions, which, in general, are at least an order of magnitude slower than simple multiplications or additions and thus incur a significant overhead. Thus, the current algorithm is more efficient than the spherical coordinate-based ones in terms of implementation in modern simulation software.
V. INTERPRETATION OF THE ALGORITHM
The tensor algebra shown in Section IV might seem difficult to digest for non-specialists, so I also supply an intuitive explanation. Let us take Equation (4.3), for example. After a close examination, it is easy to show that the right-hand side of Equation (4.3) is a series of tensor contractions of the EMP moments with r(m+n) (Om and On), with themselves (lm and ln), and with each other (lc). As dot products can be interpreted as projections between vectors, the tensor contractions here can be read as projections of the EMP moments onto r(m+n), themselves, and each other. These projections arise from the contraction of the EMP moments with the r(m+n)δ(k) tensor, which in turn arises from the application of the gradient operator to r(m+n) by the chain rule of differentiation (see the derivation from Lemma 3.2 and Theorem 3.1). In fact, the multipole expansion (the right-hand side of Equation (2.6)) is equivalent to an expansion on the basis of spherical harmonics in Cartesian coordinates when the kernel function is the Coulomb kernel , with each term in the expansion being a projection of the EMP moment onto the spherical harmonic of the same degree. This has been discussed previously by Applequist.42 Here, I generalize such an expansion to any kernel function of the form f(r).
VI. NOVELTY OF THE ALGORITHM
There are three major novelties of the algorithm developed here. First of all, it shows that multipole interactions via all kernel functions of the form f(r), i.e., any function that depends on inter-particle distance, have essentially the same mathematical expression except for a few coefficients. This means the same set of working equations can be used for a wide variety of kernel potentials, e.g., direct Coulomb potential, reaction field potential, or the damped Coulomb potential in the Ewald summation, making it fit easily into modern molecular dynamics simulation packages for a broad range of applications.
Second, the number of floating-point operations required to perform the calculation is minimal as compared to the recursive algorithms developed previously;15,37 the algorithm developed here is even faster than the best possible recursion scheme. The comparison will be given in Section VIII along with the explanation for why the algorithm is fast. I want to point out here that this reduction in floating-point operations has nothing to do with the implementation of the algorithm; it stems from the fact that the algorithm only takes into account necessary operations.
Last but not least, the mathematical expression of this algorithm is highly compact and modularized, making it very easy to implement. One can easily cast these formulas into a set of matrix-vector multiplications, which can be performed using various high-performance linear algebra packages. Also, a close examination of working equations (4.3)–(4.5) reveals that the derivatives of the kernel functions enter only as coefficients (Fi) of the tensor contractions, and one only needs up to the (m + n)th scalar derivatives. These derivatives can be implemented as modules independent of the contraction, making the code highly reusable for a wide variety of kernel functions. The details of the implementation will be given in Section IX.
VII. THE EWALD SUMMATION
The idea of Ewald summation for point charges has been discussed extensively so I refer the reader to the excellent reviews for details.33,34 Here, I will apply the same idea to a system of general EMPs. I again refer the reader to the Appendix for the notation I use in what follows. Consider a system of N point EMP mj at position rj, j = 1, 2, …, N in a unit cell with mj defined as in Equation (2.11). The unit cell is replicated to form a lattice . The electrostatic energy of the unit cell is
| (7.1) |
where ϕj,p ≡ mj,p(r)∗γj(r)∗G(r) with and the ′ in the 3rd sum means exclusion of i = j if p = 0. This sum is conditionally convergent for the interacting EMPs where the sum of the ranks is no greater than 2, i.e., for charge-charge, charge-dipole, dipole-dipole, and charge-quadrupole interactions.49 However, in practice, the cutoff required for ranks greater than 2 might need to be very large in order to converge the sum. In fact, it has been shown that short cutoffs of dispersion interactions of the form r−6 cannot correctly reproduce the fluid-fluid interfacial properties.29–32 It is thus reasonable to treat EMP interactions with total rank up to 5 as long-range interactions. The idea of Ewald summation is to decompose ϕj,p into two functions, one of which decays rapidly in real space while the other decays rapidly in Fourier space, so that the sum can be evaluated with relatively small cutoffs in real and Fourier space. Because of the linearity of the Poisson equation, decomposition of the potential is equivalent to decomposition of the source density. The canonical choice of such a decomposition is to let γj(r) = δ(r) = [δ(r) − gα(r)] + gα(r) as in Equation (2.6), where gα(r) ≡ (α/π)3/2e−α|r|2, so that ϕj,p = mj,p∗G∗(δ − gα) + mj,p∗G∗gα. Let and so that Equation (7.1) can be rewritten as U = Ud + Uk, where
| (7.2) |
| (7.3) |
The second sum Uk is done in Fourier space (see the Appendix for a description of the notation used),
| (7.4) |
where , , and are the Fourier transform of gα, G, and , respectively. Uself is defined as
| (7.5) |
where is the error function. Uself is the interaction energy between mj∗G∗gα and mj at rj, which is a non-physical term included in the Fourier series of Ud and should be subtracted from the sum. The derivation will be given in Section S3 of the supplementary material.57 Because has a very simple form in Fourier space (see Section S357), the evaluation of Uk does not involve complicated tensor-tensor contractions and can be done efficiently using the Fast Fourier Transform (FFT) technique.15 On the other hand, Ud can be summed rapidly in real space with a relatively small cut-off distance rcut due to rapid decay of ,
| (7.6) |
where erfc ≡ 1 − erf is the complementary error function and rij ≡ |ri − rj|. Thus, aside from the need to maintain physical fidelity as discussed earlier in this section, efficient calculation is another argument for using Ewald summation to handle long-range EMP interactions.
Here, I will apply the results obtained from Sections III and IV to the evaluation of Uself and Ud. First, from Theorem 3.1, it is clear that if r = 0, as in Equation (7.5) with i = j, all the terms involving r(n−2k) with n − 2k≠0 vanish, i.e., for i = j and |ri − rj| = 0,
| (7.7) |
so that in the case of m + n = 2g, g ∈ ℕ0, Uself can be evaluated using Equation (4.3) with the constraints k = g, On = 0, and Om = 0.
On the other hand, Ud can be evaluated using Equation (4.3) with , where is the complementary Boys function of order k. A very useful downward recursion can be used to evaluate all Bn−k and Fn−k if we know Bn: Bk(x) = (2xBk+1(x) − exp(−x))/(2k + 1).
VIII. COMPLEXITY ANALYSIS OF THE ALGORITHM
A. Comparison to recently developed recursion schemes
Previously, the McMurchie-Davidson formalism46 has been exploited to populate the components of and make the evaluation of Ud efficient.15 More recently, similar recursion schemes for the kernel functions and were proposed by Boateng and Todorov.37 Since the algorithm I develop here (Equation (4.3)) does not require population of the central tensor in Equation (7.6), I would like to compare the efficiency of the current approach with the aforementioned recursive ones. For simplicity, the discussion here is restricted to the interaction energy between two EMP moments of the same rank p, but it is trivial to generalize the conclusions here to forces or torques as well as to EMP moments of different ranks.
In general, there are two major steps in the aforementioned recursion schemes:
-
1.
Construct a matrix representing the central tensor of where f(r2) is the kernel function.
-
2.
Evaluate the vector-matrix-vector bilinear form, where the two interacting EMP moments are represented by the two vectors.
Without going into the details, I simply give the numbers of multiplications and additions as functions of p for the algorithms under comparison here (the derivation is given in Section S4 of the supplementary material57). Note that I do not explicitly give the “big O” notation for complexity here because it only tells how the complexity scales as a function of the inputs and does not necessarily indicate the efficiency of the algorithm in solving a given problem. (However, one can easily fit a polynomial to the results I give later to obtain a “big O” estimate.) For example, the FMM, which scales as O(N), is significantly slower than Particle Mesh Ewald (PME), which scales as O(NlogN), for most reasonable system sizes.10,50 In molecular simulations with explicit representations of EMP moments, one usually cannot afford going higher than p = 4 even with state-of-the-art computation facilities; p ≤ 2 is more commonly seen.14–20,23,24 In fact, in the later comparison with the recursive algorithms developed previously (see below), one can verify that the algorithm here remains faster than the recursive ones until p reaches about 100. Therefore, the “big O” notation is irrelevant in the comparison among the methods under discussion.
The McMurchie-Davidson formalism costs
| (8.1) |
multiplications and
| (8.2) |
additions. The recursion scheme proposed by Boateng and Todorov for costs
| (8.3) |
multiplications and
| (8.4) |
additions, while that for costs
| (8.5) |
multiplications and
| (8.6) |
additions.
On the other hand, Equation (4.3) costs
| (8.7) |
multiplications and
| (8.8) |
additions, where
| (8.9) |
The ratios between the complexity of the 3 recursion schemes and that of Equation (4.3) are plotted as functions of p in Figure 1. While the recursion schemes are intuitively easy to interpret, they are significantly slower than the algorithm here. For example, to evaluate the energy between two hexadecapoles (p = 4) with the kernel , the algorithm here is about 9 times faster than that proposed by Boateng and Todorov (Figures 1(c) and 1(d)); to evaluate the direct-space energy between two hexadecapoles in the Ewald summation with the kernel function , the algorithm here is about 4 times faster than the McMurchie-Davidson formalism (Figures 1(a) and 1(b)) and about 16 times faster than the algorithm proposed by Boateng and Todorov (Figures 1(e) and 1(f)).
FIG. 1.
The ratio of computational complexity (Y-axis) between the two recursion schemes and Equation (4.3) ((a) and (b): McMurchie-Davidson;15 (c) and (d): recursion scheme for computing gradients of as in Ref. 37; (e) and (f): recursion scheme for computing gradients of as in Ref. 37; (g) and (h): ideal recursion scheme where constructing the central tensor requires 1 operation per element) in terms of multiplication ((a), (c), (e), and (g)) or addition ((b), (d), (f), and (h)) as a function of EMP ranks (X-axis).
What makes the recursion algorithms slow is the construction of the central matrix in step 1. For small p, the reader can easily verify that step 1 alone is already much more expensive than the complete computation of the energy via Equation (4.3). This can also be seen from the fact that for the same kernel , the McMurchie-Davidson formalism differs from that proposed by Boateng and Todorov only in step 1, and the former is faster than the latter (compare Figures 1(a) and 1(b) with 1(e) and 1(f)) because of a much simpler recursion scheme15,37 for constructing the central tensor. Also, from an optimization perspective, it is much easier to parallelize the computation in Equation (4.3) (see Section IX) than the recursive ones; the construction of the central tensor, which is the bottleneck step, is almost impossible to parallelize since its elements have to be computed in a specific order.
The need to construct the central tensor also requires a large amount of memory. In typical molecular simulations, the central tensor in Equation (7.6) has to be stored in memory for each pair of atoms, which costs 𝒪(Cn), where n is the total number of atoms and C depends on the distance cut-off scheme in the simulation. Care has to be taken to optimize memory access, which adds to the difficulty of implementation.15 On the other hand, Equation (4.3) only requires storage of a single small array whose size is independent of n, regardless of what cut-off scheme is used. This means Equation (4.3) is not only faster and more memory efficient but also easier to implement and optimize than the recursive algorithms for computing EMP interactions. Figure 2 shows the ratio between the memory consumption of the recursion schemes and the current algorithm. The recursion schemes consume about 3 orders of magnitude more memory than the current algorithm.
FIG. 2.
The ratio of memory consumption between the recursion schemes and Equation (4.3) as a function of EMP ranks. Note that in the recursion schemes, one needs to store a matrix for each pair of atoms. The number of pairs for each atom is assumed to be 140, which amounts to the typical size of the pair list with a distance cutoff of 10 Å in a simulation system that has the same density as water.
B. Comparison to the best possible recursion scheme
As the current algorithm is faster than the aforementioned recursive ones, an interesting question to ask is whether an improved recursion could close the gap. Let us assume that in an ideal recursion scheme, one would need only one multiplication and one addition to construct each element of the central tensor/matrix. To build the central tensor/matrix up to rank p, one must compute every one of its elements, which already fixes a lower bound on the number of multiplications and additions. Performing the bilinear form then requires additional multiplications and additions. This totals to
| (8.10) |
multiplications and
| (8.11) |
additions. Figures 1(g) and 1(h) show the complexity ratio between this ideal recursion scheme and the algorithm I develop here. It is obvious that the algorithm here is even more competitive than such an ideal (impossibly simple) recursion scheme. Also note that when the central tensor is traceless, the algorithm developed here can be even faster (see discussion in Section IV) while the recursion scheme cannot use this trick to speed up the calculation.
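The element counting behind this lower bound can be sketched numerically. The snippet below is a hedged illustration, not the paper's exact operation count (Equations (8.10) and (8.11)): it assumes the ideal recursion must fill every independent component of the totally symmetric central tensor for all ranks up to 2p (the gradient ranks needed for two rank-p multipoles), at one multiplication and one addition each. The standard fact used is that a rank-l totally symmetric 3-D Cartesian tensor has (l + 1)(l + 2)/2 independent components.

```python
def sym_tensor_size(l):
    """Independent components of a rank-l totally symmetric 3-D Cartesian tensor."""
    return (l + 1) * (l + 2) // 2

def ideal_recursion_ops(p):
    """Lower bound on multiplications (and, equally, additions) for a
    hypothetical ideal recursion that spends 1 mul + 1 add per element
    of the central tensor, for all ranks 0..2p."""
    return sum(sym_tensor_size(l) for l in range(2 * p + 1))

# hexadecapole-hexadecapole interaction (p = 4) needs ranks 0..8
print(ideal_recursion_ops(4))  # 165 elements, hence >= 165 mul + 165 add
```

Even before the bilinear form is carried out, this element count alone exceeds the full cost of Equation (4.3), which is the point Figures 1(g) and 1(h) make.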
C. Numerical validation
To validate the comparison in Section VIII A, I implemented the algorithm developed here and the recursive one by Boateng and Todorov37 for the Coulomb kernel (f(r) = 1/r) and did a performance test on both. The test was to evaluate the electrostatic potential of a system containing 5000 hexadecapoles randomly and uniformly distributed across a 20 × 20 × 20 box. The components of the hexadecapoles were chosen randomly between 0 and 0.05. Units are chosen such that the Coulomb constant is 1. The potential energies between all unique pairs of hexadecapoles are evaluated, totaling 5000 × 4999/2 = 12 497 500 pairwise energy evaluations. The same test is repeated 100 times, and each time the ratio between the execution times of the two algorithms is plotted in Figure 3. The recursive algorithm costs about 6.5 times as much as the algorithm developed here when the two algorithms give the same electrostatic potential energy (with 15-digit accuracy in double precision). Note that this ratio might vary somewhat with different implementations, but it agrees qualitatively with the result (a factor of 9) from the complexity analysis in Section VIII A. Again, this shows that the algorithm developed here is faster than the recursive one.
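The benchmark setup above can be sketched as follows. This is a hedged reconstruction of the test harness, not the paper's actual code; the dense (3, 3, 3, 3) storage of each hexadecapole is illustrative only (a symmetric rank-4 tensor needs just 15 independent components, as discussed in the Appendix).

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000

# 5000 sites uniformly distributed in a 20 x 20 x 20 box
positions = rng.uniform(0.0, 20.0, size=(n, 3))
# hexadecapole components drawn uniformly from [0, 0.05]
# (stored densely here for simplicity; 15 symmetric components suffice)
hexadecapoles = rng.uniform(0.0, 0.05, size=(n, 3, 3, 3, 3))

# all unique pairs are evaluated
n_pairs = n * (n - 1) // 2
print(n_pairs)  # 12497500 pairwise energy evaluations
```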
FIG. 3.
The ratio of execution time between the recursive algorithm by Boateng and Todorov37 and Equation (4.3) with the Coulomb kernel at the hexadecapole level (p = 4) for each of the 100 tests.
D. Reasons for the algorithm’s efficiency
The efficiency of the current algorithm stems from the nature of the multipole expansion, which is essentially a series of projections in multilinear space (see Section V), and from decomposing the bilinear form into a minimal set of these projection operations. Perhaps more importantly, each of these projections has very simple operands, namely the two interacting EMP moments and the distance-dependent r(m+n), so that no operations are wasted on computing intermediate variables. I will conclude this section with the textbook example of the dipole-dipole interaction. It is easy to recover the following well-known dipole-dipole interaction energy by plugging the Coulomb potential (f(r) = 1/r) into Equations (4.3) and (2.10):
| (8.12) |
where r̂ij is the unit vector of rij ≡ ri − rj. This requires only 3 dot products and a few scalar multiplications. On the other hand, if one were to use the recursive algorithm, e.g., the one proposed by Boateng and Todorov,37 the working equation would be
| (8.13) |
where one would first need to fill out the upper triangle of the 3 × 3 symmetric matrix M via recursion, at a cost of more than 20 floating-point operations per element, and then carry out the bilinear form with the equivalent of 4 dot products. Arguably, one could manually optimize away some multiplication-by-zero operations in the published equations;37 however, even in the ideal case of only 1 multiplication and 1 addition per element, such a method spends a significant number of operations on computing unnecessary intermediate variables, making it suboptimal compared to the algorithm developed here.
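The textbook dipole-dipole formula is simple enough to transcribe directly. The sketch below implements the well-known result for the Coulomb kernel f(r) = 1/r in units where the Coulomb constant is 1; the function name is illustrative, not from the author's library.

```python
import numpy as np

def dipole_dipole_energy(p_i, p_j, r_i, r_j):
    """Dipole-dipole interaction energy for the Coulomb kernel f(r) = 1/r,
    with the Coulomb constant set to 1:
    E = (p_i . p_j - 3 (p_i . r_hat)(p_j . r_hat)) / r^3
    This needs only 3 dot products and a few scalar multiplications."""
    r_vec = np.asarray(r_i, float) - np.asarray(r_j, float)
    r = np.linalg.norm(r_vec)
    r_hat = r_vec / r
    return (np.dot(p_i, p_j)
            - 3.0 * np.dot(p_i, r_hat) * np.dot(p_j, r_hat)) / r**3

# two unit dipoles aligned head-to-tail along z, separated by r = 2:
# E = (1 - 3) / 8 = -0.25
print(dipole_dipole_energy([0, 0, 1], [0, 0, 1], [0, 0, 2], [0, 0, 0]))
```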
IX. IMPLEMENTATION
Equations (4.3), (2.16), and (2.17) can be easily implemented as a series of matrix-vector products. Taking Equation (4.3) as an example, we can cast it into a triple sum over lc, lm, and ln as follows:
| (9.1) |
where
| (9.2) |
For a given triplet of m, n, and lc, the corresponding coefficients can be re-interpreted as a matrix C′{m,n,lc},
| (9.3) |
where the superscript {m, n, lc} is just a reminder of the fact that C′ is a function of m, n, and lc. Similarly, A(m:lm) ⋅ Om ⋅ r(Om) and B(n:ln) ⋅ On ⋅ r(On) can be treated as the elements of arrays of rank-lc tensors of lengths lm + 1 and ln + 1, respectively,
| (9.4) |
| (9.5) |
where the superscripts {m, lc} and {n, lc} again indicate that A and B are functions of m or n and lc. Equation (9.1) can then be cast into a vector-matrix-vector bilinear form
| (9.6) |
if we define the vector-vector dot product between the two tensor arrays A and C′B as element-wise lc-fold tensor contraction.
There are a few advantages to using the bilinear form in Equation (9.6) to calculate the EMP energy in Equation (2.14). First, as discussed in Section VIII, the number of operations is much smaller than that of directly evaluating the bi-contraction via the canonical matrix formalism, even when the central tensor can be populated recursively. Second, in applications where the EMP sites are represented by an EMP polytensor,51,52 e.g., in a molecular dynamics force field, one needs to evaluate Equation (9.6) for each A(n) against a series of B(m) at the same displacement r, where m varies. One can populate the array defined in Equation (9.4) once for the largest possible lc and reuse it for any m. This is in contrast with the canonical matrix formalism, where the central tensor has to be populated for each individual pair of m and n at extra cost. Last but not least, unlike the formalism proposed by Smith,41 in which a large number of scalar arithmetic terms have to be written out manually, the simplicity of Equation (9.6) makes it possible to use compact data structures such as arrays, which eases debugging and code maintenance. For example, for a list of EMP sites interacting with A(n), one can build a matrix D whose rows are the B arrays of the interacting sites, so that the total energy can be computed via Equation (9.6) with the matrix-vector product C′B replaced by the matrix-matrix product C′D. This also makes it easy to delegate the computation to high-performance linear algebra libraries such as BLAS53,54 or Blaze55,56.
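The batching trick described above can be sketched in a few lines. This is a hedged illustration of the shape of the computation only: the array lengths below are arbitrary placeholders, and building the actual a, C′, and b arrays from the multipole moments is specified by Equations (9.2)-(9.5), not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(1)
la, lb, n_sites = 5, 5, 140       # hypothetical array lengths / pair-list size

a = rng.standard_normal(la)          # array built from site A's moments
C = rng.standard_normal((la, lb))    # kernel-dependent central matrix C'
D = rng.standard_normal((n_sites, lb))  # rows: the b-array of each B site

# one pair: the vector-matrix-vector bilinear form of Eq. (9.6)
e_single = a @ C @ D[0]

# all pairs at once: one BLAS-friendly matrix-matrix product C'D,
# then a single matrix-vector product with a
energies = a @ C @ D.T
assert np.isclose(energies[0], e_single)
```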
For the force and torque in Equations (4.4) and (4.5), one can find expressions similar to Equation (9.6). Rather than elaborate the details here, I have implemented a C++11 template library for basic operations on symmetric Cartesian tensors, focused on the computation of EMP energies, forces, and torques. It is available for download at https://github.com/dejunlin/emp/releases. The implementation achieves consistency between the numerical and analytical forces and torques to 10-digit accuracy (see Section S5 and Figure S1 for details57).
X. CONCLUSIONS
In conclusion, I have established an algorithm to calculate the multipole-multipole interaction of a generalized potential that is a function of inter-site distance only. The method presented here is much faster than the canonical tensor-based formalism and has a compact expression that is easy to implement in computer programs. A comparison of the current method with various recently developed recursive algorithms showed that this method is generally faster. The formalism is applicable to various interaction potentials where an explicit representation of multipole moments is needed, and its Cartesian basis makes it a natural fit for modern molecular simulation schemes.
Acknowledgments
I thank Dr. Alan Grossfield at the University of Rochester for helpful comments on the paper. This work was supported by Grant No. GM095496 from the NIH and the Elon Huntington Hooker Graduate Fellowship from the University of Rochester.
APPENDIX: MATHEMATICAL NOTATION
1. Tensors
Here, I adopt the tensor notation used elsewhere.42 We denote a Cartesian tensor of rank n by the boldface symbol A(n), or by its components Aα1…αn(n), where the subscripts αi (i = 1, 2, …, n) denote the Cartesian axes 1, 2, 3 (corresponding to x, y, z, respectively). A(n) is called a totally symmetric tensor if its components are invariant under any permutation of the sequence of subscripts. A totally symmetric tensor can be denoted in compressed form as A(n)(n1, n2, n3), where ni is the number of times the index i occurs in the sequence of subscripts α1…αn and n1 + n2 + n3 = n. For simplicity, I abbreviate a rank-1 tensor v(1), which is a vector, as v unless specified otherwise. When v is defined as a vector, I denote its kth component by the corresponding non-bold character with subscript k, i.e., vk, and the norm of v by the same character with no subscript, i.e., v ≡ |v|.
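The compressed form can be made concrete by enumerating its index triplets. The sketch below (hypothetical helper names) lists all (n1, n2, n3) with n1 + n2 + n3 = n, i.e., the independent components of a rank-n totally symmetric tensor, and the multiplicity n!/(n1! n2! n3!) of full-index sequences that each compressed component represents.

```python
from math import factorial

def compressed_indices(n):
    """All compressed indices (n1, n2, n3) with n1 + n2 + n3 = n,
    one per independent component of a rank-n totally symmetric tensor."""
    return [(n1, n2, n - n1 - n2)
            for n1 in range(n, -1, -1)
            for n2 in range(n - n1, -1, -1)]

def multiplicity(n1, n2, n3):
    """Number of full subscript sequences a1...an mapping to (n1, n2, n3)."""
    n = n1 + n2 + n3
    return factorial(n) // (factorial(n1) * factorial(n2) * factorial(n3))

# rank 2: 6 independent components (xx, xy, xz, yy, yz, zz)
print(len(compressed_indices(2)))  # 6
# multiplicities over all compressed components recover the 3^n full entries
print(sum(multiplicity(*t) for t in compressed_indices(4)))  # 81 = 3^4
```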
2. Tensor contractions and traces
We denote an n-fold contraction by ⋅ n ⋅ , e.g.,
| (A1) |
where a summation is assumed over an index that appears twice in a product. Contraction can also be performed between tensors of different ranks, e.g.,
| (A2) |
or
| (A3) |
We abbreviate ⋅ 1 ⋅ as ⋅, so that a dot product between two vectors or a matrix-vector multiplication is written as
| (A4) |
or
| (A5) |
respectively.
The m-fold trace of A(n) is defined as
| (A6) |
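Numerically, n-fold contractions and traces map directly onto standard tensor routines. The sketch below is an illustration under the assumption that the contracted index pairs are the adjacent ones (the document's exact pairing is fixed by Equations (A1)-(A6)).

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((3, 3, 3))   # a rank-3 Cartesian tensor
B = rng.standard_normal((3, 3))      # a rank-2 Cartesian tensor

# 2-fold contraction A .2. B: sum the last two indices of A against
# both indices of B, leaving a rank-1 tensor (a vector)
c = np.tensordot(A, B, axes=2)
assert c.shape == (3,)
assert np.allclose(c, np.einsum('ijk,jk->i', A, B))

# 1-fold trace of a rank-2 tensor: contract one pair of its own indices
M = rng.standard_normal((3, 3))
tr = np.einsum('ii->', M)
assert np.isclose(tr, np.trace(M))
```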
3. Fourier transform
The Fourier transform of a function f(r) is denoted as
| (A7) |
where dr ≡ dr1dr2dr3. The corresponding inverse transform is denoted as
| (A8) |
so that 𝓕⁻¹[𝓕[f]] = f.
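The round-trip identity 𝓕⁻¹[𝓕[f]] = f has an obvious discrete analogue, shown below on a sampled 1-D signal (a sketch; the document's continuous convention is Equations (A7)-(A8), while the FFT uses the discrete counterpart).

```python
import numpy as np

# sample a smooth function on a uniform periodic grid
x = np.linspace(0.0, 1.0, 64, endpoint=False)
f = np.exp(-20.0 * (x - 0.5) ** 2)

# forward transform followed by the inverse transform recovers f
f_hat = np.fft.fft(f)
f_back = np.fft.ifft(f_hat).real
assert np.allclose(f_back, f)
```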
4. Lattice and Dirac comb
A lattice in ℝ3 is defined by 3 linearly independent bases u1, u2, u3 so that any lattice point p is expressed by the integer linear combination of the bases,
| (A9) |
We will denote such a lattice by ℒ. Any ℒ can be obtained from ℤ3 via an invertible linear transformation A, i.e.,
| (A10) |
A can be expressed as a rank-2 Cartesian tensor A(2) so that
| (A11) |
The dual lattice of ℒ is given by
| (A12) |
where A⁻ᵀ is the transpose of the inverse of A, so that
| (A13) |
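The defining property of the dual lattice can be verified directly: a point of ℒ = A(ℤ3) dotted with a point of the dual lattice A⁻ᵀ(ℤ3) is always an integer, since (Am)·(A⁻ᵀk) = mᵀk. The sketch below assumes the convention without a 2π factor, as in Equation (A12); the particular matrix A is an arbitrary example.

```python
import numpy as np

# an arbitrary invertible lattice transformation (columns are the bases)
A = np.array([[2.0, 0.5, 0.0],
              [0.0, 1.5, 0.3],
              [0.0, 0.0, 1.0]])
A_dual = np.linalg.inv(A).T          # dual-lattice transformation A^{-T}

m = np.array([1, -2, 3])             # integer coefficients of a lattice point
k = np.array([2, 0, -1])             # integer coefficients of a dual point
p = A @ m
q = A_dual @ k

# (A m) . (A^{-T} k) = m . k, an integer: here 1*2 + (-2)*0 + 3*(-1) = -1
print(p @ q)
```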
We denote the Dirac comb function associated with ℒ as
| (A14) |
where the sum is over all points in ℤ3. A function f(r) defined in 𝕍 (𝕍 ⊂ ℝ3) can be periodized by convolving it with the Dirac comb,
| (A15) |
where ∗ denotes convolution.
Define
| (A16) |
as a vector in the dual lattice. The Fourier transform of the Dirac comb is given by
| (A17) |
| (A18) |
where |detA(2)| is the determinant of A(2). Taking the inverse transform of both sides of Equation (A18) gives
| (A19) |
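Equation (A19) is a lattice form of the Poisson summation formula, which can be checked numerically. The 1-D sketch below uses a Gaussian f(x) = exp(-π(x/a)²), whose transform under the convention with 2π in the exponent is f̂(k) = a·exp(-πa²k²) (an assumption here; the document's convention is fixed by Equation (A7)), so that summing f over the unit lattice equals summing f̂ over its dual.

```python
import numpy as np

a = 1.5
n = np.arange(-50, 51)   # truncation of the lattice sums; tails are negligible

# sum of f over the lattice Z
lhs = np.sum(np.exp(-np.pi * (n / a) ** 2))
# sum of f-hat over the dual lattice (also Z for the unit lattice)
rhs = a * np.sum(np.exp(-np.pi * a**2 * n**2))

assert np.isclose(lhs, rhs)
```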
5. Poisson’s equation on a lattice
Poisson's equation for the electrostatic potential of a given source is
| (A20) |
where both the potential and the source are periodic on the lattice ℒ ≡ A(ℤ3) and Δ ≡ ∇2 is the Laplace operator. We denote the pair of the solution and the source function in Equation (A20) as
| (A21) |
By using Equation (A19), Equation (A20) has the Fourier-series solution
| (A22) |
| (A23) |
where the coefficients are the Fourier transforms of ϕ(r) and ρ(r), which are in turn one period of the periodic solution and source, respectively, and k and the dual lattice are defined as in Equations (A16) and (A12), respectively.
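The Fourier-series solution can be sketched in 1-D on a periodic grid. This assumes the sign convention Δϕ = −ρ (the precise convention and units are fixed by Equation (A20)); in Fourier space the Laplacian becomes multiplication by −|k|², so each nonzero mode is solved by dividing by |k|², and the k = 0 mode is set to zero (the source must be neutral).

```python
import numpy as np

N = 128
x = np.linspace(0.0, 2.0 * np.pi, N, endpoint=False)
rho = np.cos(3.0 * x)              # a zero-mean (neutral) periodic source

# angular wavenumbers; integers for a 2*pi-periodic box
k = np.fft.fftfreq(N, d=2.0 * np.pi / N) * 2.0 * np.pi

rho_hat = np.fft.fft(rho)
phi_hat = np.zeros_like(rho_hat)
nz = k != 0
phi_hat[nz] = rho_hat[nz] / k[nz] ** 2   # solve -k^2 phi_hat = -rho_hat

phi = np.fft.ifft(phi_hat).real
# for rho = cos(3x), the solution of Laplacian(phi) = -rho is cos(3x)/9
assert np.allclose(phi, np.cos(3.0 * x) / 9.0, atol=1e-10)
```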
REFERENCES
- 1.Buckingham A. D. and Fowler P. W., Can. J. Chem. 63, 2018–2025 (1985). 10.1139/v85-334 [DOI] [Google Scholar]
- 2.Hurst G. J. B., Fowler P. W., Stone A. J., and Buckingham A. D., Int. J. Quantum Chem. 29, 1223–1239 (1986). 10.1002/qua.560290520 [DOI] [Google Scholar]
- 3.Williams D. E., J. Comput. Chem. 9, 745–763 (1988). 10.1002/jcc.540090705 [DOI] [Google Scholar]
- 4.Spackman M. A., J. Chem. Phys. 85, 6587 (1986). 10.1063/1.451441 [DOI] [Google Scholar]
- 5.Colonna F., Evleth E., and Ángyán J. G., J. Comput. Chem. 13, 1234–1245 (1992). 10.1002/jcc.540131007 [DOI] [Google Scholar]
- 6.Sokalski W. A., Keller D. A., Ornstein R. L., and Rein R., J. Comput. Chem. 14, 970–976 (1993). 10.1002/jcc.540140812 [DOI] [PubMed] [Google Scholar]
- 7.Dykstra C. E., Chem. Rev. 93, 2339–2353 (1993). 10.1021/cr00023a001 [DOI] [Google Scholar]
- 8.Price S. L., Faerman C. H., and Murray C. W., J. Comput. Chem. 12, 1187–1197 (1991). 10.1002/jcc.540121005 [DOI] [Google Scholar]
- 9.Kong Y., “Multipole electrostatic methods for protein modeling with reaction field treatment,” Ph.D. thesis, Graduate School of Arts and Sciences of Washington University, 1997. [Google Scholar]
- 10.Cisneros G. A., Karttunen M., Ren P., and Sagui C., Chem. Rev. 114, 779–814 (2014). 10.1021/cr300461d [DOI] [PMC free article] [PubMed] [Google Scholar]
- 11.Marshall G., J. Comput.-Aided Mol. Des. 27, 107–114 (2013). 10.1007/s10822-013-9634-x [DOI] [PMC free article] [PubMed] [Google Scholar]
- 12.Vorobyov I., Li L., and Allen T. W., J. Phys. Chem. B 112, 9588–9602 (2008). 10.1021/jp711492h [DOI] [PubMed] [Google Scholar]
- 13.Bennett W. D. and Tieleman D. P., J. Chem. Theory Comput. 7, 2981–2988 (2011). 10.1021/ct200291v [DOI] [PubMed] [Google Scholar]
- 14.Toukmaji A., Sagui C., Board J., and Darden T., J. Chem. Phys. 113, 10913–10927 (2000). 10.1063/1.1324708 [DOI] [Google Scholar]
- 15.Sagui C., Pedersen L. G., and Darden T. A., J. Chem. Phys. 120, 73–87 (2004). 10.1063/1.1630791 [DOI] [PubMed] [Google Scholar]
- 16.Wang W. and Skeel R. D., J. Chem. Phys. 123, 164107 (2005). 10.1063/1.2056544 [DOI] [PubMed] [Google Scholar]
- 17.Cerda J. J., Ballenegger V., Lenz O., and Holm C., J. Chem. Phys. 129, 234104 (2008). 10.1063/1.3000389 [DOI] [PubMed] [Google Scholar]
- 18.Ponder J. W., Wu C., Ren P., Pande V. S., Chodera J. D., Schnieders M. J., Haque I., Mobley D. L., Lambrecht D. S., DiStasio R. A., Head-Gordon M., Clark G. N. I., Johnson M. E., and Head-Gordon T., J. Phys. Chem. B 114, 2549–2564 (2010). 10.1021/jp910674d [DOI] [PMC free article] [PubMed] [Google Scholar]
- 19.Shi Y., Xia Z., Zhang J., Best R., Wu C., Ponder J. W., and Ren P., J. Chem. Theory Comput. 9, 4046–4063 (2013). 10.1021/ct4003702 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 20.Laury M. L., Wang L.-P., Pande V. S., Head-Gordon T., and Ponder J. W., J. Phys. Chem. B 119, 9423–9437 (2015). 10.1021/jp510896n [DOI] [PMC free article] [PubMed] [Google Scholar]
- 21.Elking D., Darden T., and Woods R. J., J. Comput. Chem. 28, 1261–1274 (2007). 10.1002/jcc.20574 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 22.Elking D. M., Cisneros G. A., Piquemal J.-P., Darden T. A., and Pedersen L. G., J. Chem. Theory Comput. 6, 190–202 (2010). 10.1021/ct900348b [DOI] [PMC free article] [PubMed] [Google Scholar]
- 23.Wu J., Zhen X., Shen H., Li G., and Ren P., J. Chem. Phys. 135, 155104 (2011). 10.1063/1.3651626 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 24.Orsi M. and Essex J. W., PLoS One 6, e28637 (2011). 10.1371/journal.pone.0028637 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 25.Schnieders M. J., Fenn T. D., Pande V. S., and Brunger A. T., Acta Crystallogr., Sect. D: Biol. Crystallogr. 65, 952–965 (2009). 10.1107/S0907444909022707 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 26.Fenn T. D., Schnieders M. J., Brunger A. T., and Pande V. S., Biophys. J. 98, 2984–2992 (2010). 10.1016/j.bpj.2010.02.057 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 27.Schnieders M. J., Fenn T. D., and Pande V. S., J. Chem. Theory Comput. 7, 1141–1156 (2011). 10.1021/ct100506d [DOI] [PubMed] [Google Scholar]
- 28.Fenn T. D. and Schnieders M. J., Acta Crystallogr., Sect. D: Biol. Crystallogr. 67, 957–965 (2011). 10.1107/S0907444911039060 [DOI] [PubMed] [Google Scholar]
- 29.in’t Veld P. J., Ismail A. E., and Grest G. S., J. Chem. Phys. 127, 144711 (2007). 10.1063/1.2770730 [DOI] [PubMed] [Google Scholar]
- 30.Mazars M., J. Phys. A: Math. Theor. 43, 425002 (2010). 10.1088/1751-8113/43/42/425002 [DOI] [Google Scholar]
- 31.Isele-Holder R. E., Mitchell W., and Ismail A. E., J. Chem. Phys. 137, 174107 (2012). 10.1063/1.4764089 [DOI] [PubMed] [Google Scholar]
- 32.Míguez J. M., Piñeiro M. M., and Blas F. J., J. Chem. Phys. 138, 034707 (2013). 10.1063/1.4775739 [DOI] [PubMed] [Google Scholar]
- 33.Deserno M. and Holm C., J. Chem. Phys. 109, 7678–7693 (1998). 10.1063/1.477414 [DOI] [Google Scholar]
- 34.Sagui C. and Darden T. A., Annu. Rev. Biophys. Biomol. Struct. 28, 155–179 (1999). 10.1146/annurev.biophys.28.1.155 [DOI] [PubMed] [Google Scholar]
- 35.Aguado A. and Madden P. A., J. Chem. Phys. 119, 7471–7483 (2003). 10.1063/1.1605941 [DOI] [Google Scholar]
- 36.Simmonett A. C., Pickard F. C. IV, Schaefer H. F. III, and Brooks B. R., J. Chem. Phys. 140, 184101 (2014). 10.1063/1.4873920 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 37.Boateng H. A. and Todorov I. T., J. Chem. Phys. 142, 034117 (2015). 10.1063/1.4905952 [DOI] [PubMed] [Google Scholar]
- 38.Wolf D., Keblinski P., Phillpot S. R., and Eggebrecht J., J. Chem. Phys. 110, 8254–8282 (1999). 10.1063/1.478738 [DOI] [Google Scholar]
- 39.Lamichhane M., Gezelter J. D., and Newman K. E., J. Chem. Phys. 141, 134109 (2014). 10.1063/1.4896627 [DOI] [PubMed] [Google Scholar]
- 40.Lamichhane M., Newman K. E., and Gezelter J. D., J. Chem. Phys. 141, 134110 (2014). 10.1063/1.4896628 [DOI] [PubMed] [Google Scholar]
- 41.Smith W., CCP5 Info. Quart. 46, 18–30 (1998). [Google Scholar]
- 42.Applequist J., J. Phys. A: Math. Gen. 22, 4303 (1989). 10.1088/0305-4470/22/20/011 [DOI] [Google Scholar]
- 43.Burgos E. and Bonadeo H., Mol. Phys. 44, 1–15 (1981). 10.1080/00268978100102251 [DOI] [Google Scholar]
- 44.Greengard L. and Rokhlin V., J. Comput. Phys. 73, 325–348 (1987). 10.1016/0021-9991(87)90140-9 [DOI] [Google Scholar]
- 45.Greengard L. and Rokhlin V., Acta Numer. 6, 229–269 (1997). 10.1017/S0962492900002725 [DOI] [Google Scholar]
- 46.McMurchie L. E. and Davidson E. R., J. Comput. Phys. 26, 218–231 (1978). 10.1016/0021-9991(78)90092-X [DOI] [Google Scholar]
- 47.Helgaker T., Jørgensen P., and Olsen J., Molecular Electronic Structure Theory (John Wiley & Sons, Ltd., Chichester, 2000). [Google Scholar]
- 48.Grad H., Commun. Pure Appl. Math. 2, 325–330 (1949). 10.1002/cpa.3160020402 [DOI] [Google Scholar]
- 49.de Leeuw S. W., Perram J. W., and Smith E. R., Proc. R. Soc. London, Ser. A 373, 27–56 (1980). 10.1098/rspa.1980.0135 [DOI] [Google Scholar]
- 50.Pollock E. and Glosli J., Comput. Phys. Commun. 95, 93–110 (1996). 10.1016/0010-4655(96)00043-4 [DOI] [Google Scholar]
- 51.Applequist J., J. Math. Phys. 24, 736–741 (1983). 10.1063/1.525770 [DOI] [Google Scholar]
- 52.Applequist J., J. Chem. Phys. 83, 809–826 (1985). 10.1063/1.449496 [DOI] [Google Scholar]
- 53.Dongarra J. J., Du Croz J., Hammarling S., and Duff I. S., ACM Trans. Math. Software 16, 1–17 (1990). 10.1145/77626.79170 [DOI] [Google Scholar]
- 54.Dongarra J. J., Du Croz J., Hammarling S., and Hanson R. J., ACM Trans. Math. Software 14, 18–32 (1988). 10.1145/42288.42292 [DOI] [Google Scholar]
- 55.Iglberger K., Hager G., Treibig J., and Rüde U., in 2012 International Conference on High Performance Computing & Simulation (HPCS), 2012. [Google Scholar]
- 56.Iglberger K., Hager G., Treibig J., and Rüde U., SIAM J. Sci. Comput. 34, C42–C69 (2012). 10.1137/110830125 [DOI] [Google Scholar]
- 57.See supplementary material at http://dx.doi.org/10.1063/1.4930984 E-JCPSA6-143-026536 for the details of the derivation of some equations.