Accelerating self-consistent field convergence with the augmented Roothaan–Hall energy function

Xiangqian Hu; Weitao Yang

doi:10.1063/1.3304922

2010 Feb 4;132(5):054109.

PMCID: PMC2830258 PMID: 20136307

Abstract

Based on Pulay’s direct inversion iterative subspace (DIIS) approach, we present a method to accelerate self-consistent field (SCF) convergence. In this method, the quadratic augmented Roothaan–Hall (ARH) energy function, proposed recently by Høst and co-workers [J. Chem. Phys. 129, 124106 (2008)], is used as the object of minimization for obtaining the linear coefficients of Fock matrices within DIIS. This differs from the traditional DIIS of Pulay, which uses an object function derived from the commutator of the density and Fock matrices. Our results show that the present algorithm, abbreviated ADIIS, is more robust and efficient than the energy-DIIS (EDIIS) approach. In particular, several examples demonstrate that the combination of ADIIS and DIIS (“ADIIS+DIIS”) is highly reliable and efficient in accelerating SCF convergence.

INTRODUCTION

Quantum mechanical (QM) methods play an important role in many research areas, from elucidating the subtle reaction mechanisms of enzymes¹,² to designing novel structures of molecules and materials.³,⁴,⁵,⁶,⁷ Because of its good balance between computational cost and accuracy, Kohn–Sham (KS) density functional theory (KS-DFT),⁸,⁹ based on a single Slater determinant, is a very popular method for describing the electronic structures of molecules and materials. In both the Hartree–Fock (HF) and KS approaches, the self-consistent field (SCF) scheme is applied to obtain the total energies of systems. In this scheme, an initial guess for the density matrix (D) is employed to construct the Fock matrix (or KS matrix in DFT) [F(D)]. Subsequently, an updated density matrix is solved for and used to calculate the next Fock matrix, and this procedure is iterated until the density matrix becomes invariant (i.e., SCF convergence is achieved). Unfortunately, SCF convergence without any accelerating technique is problematic in many cases.¹⁰ To achieve and accelerate SCF convergence, a variety of methods have been developed, such as simple density mixing between the previous and current density matrices, level shifting,¹¹ fractional electron occupations,¹² the optimal damping algorithm (ODA),¹³,¹⁴,¹⁵ the direct inversion iterative subspace (DIIS) approach,¹⁶,¹⁷ energy-DIIS (EDIIS),¹⁸,¹⁹ density subspace minimization (DSM),²⁰,²¹ and the ground-state-directed optimization scheme.²²,²³ Since the DIIS-based algorithms are particularly robust and efficient in most molecular systems, we focus on the DIIS procedure for SCF convergence in this work.

As emphasized in Ref. 23, the standard SCF iterative procedure with the DIIS-based methods involves two separate steps. The first step is to diagonalize the Fock matrix in order to construct a new density matrix; the second step is to improve the new density matrix by using the DIIS scheme to combine it linearly with the density matrices obtained from previous iterations. In the standard DIIS approach developed by Pulay,¹⁶,¹⁷ the linear coefficients of each density matrix are optimized by minimizing the orbital rotation gradient based on the commutator matrix of the Fock and density matrices (i.e., [F(D),D]) in the orthonormal basis space. However, the minimization of this orbital rotation gradient does not always lead to a lower energy, particularly when the SCF is not close to convergence.¹⁸ This may cause large energy oscillations and divergence in the SCF procedure. To stabilize DIIS, the EDIIS approach developed by Scuseria and co-workers¹⁸ minimizes a quadratic energy function derived from the ODA¹³,¹⁴,¹⁵ to obtain the linear coefficients in DIIS. This energy-minimization-driven EDIIS method rapidly brings the density matrix from the initial guess to a convergent region. In addition, the combination of EDIIS and DIIS makes the SCF robust and efficient. Although “EDIIS+DIIS” is successful in accelerating SCF convergence in many cases, the quadratic energy function used in EDIIS is accurate only for HF. In KS-DFT, an approximate quadratic interpolation of energies¹⁸ is employed because of the nonlinearity of the exchange-correlation functional. Therefore, the reliability of EDIIS can be impaired by the interpolation accuracy.

Here, we present a new approach based on the augmented Roothaan–Hall (ARH) energy function²²,²³ to obtain the linear coefficients for the density matrices in DIIS. We first present the mathematical expressions defining the ARH energy in Sec. 2A. Then, we discuss the combination of the ARH energy and DIIS (ADIIS for short) and compare ADIIS with the standard DIIS and EDIIS algorithms in Sec. 2B. We performed calculations on several molecular systems to validate the efficiency of ADIIS in comparison with DIIS and EDIIS, and report these results in Sec. 3. Finally, we summarize our work in Sec. 4.

METHODS

The ARH energy function

In Ref. 22, Høst and co-workers presented a quadratic ARH energy function to optimize directly the density matrix in the SCF procedure. Here, the key points about the ARH energy function for a closed-shell system are explained. (For an open-shell system, the α and β spins need to be considered separately.) The total HF or KS-DFT energy for a closed-shell system can be expanded to second order, with respect to the density matrix, using the Taylor expansion,

E (D) \approx \tilde{E} (D) = E (D_{n}) + ⟨ D - D_{n} | E^{[1]} (D_{n}) ⟩ + \frac{1}{2} ⟨ D - D_{n} | E^{[2]} (D_{n}) | D - D_{n} ⟩,

(1)

where E(D) is the total energy of a density matrix D, $\tilde{E} (D)$ is the approximate ARH energy function, D_n is the density matrix of the nth SCF iteration, and ⟨A|B⟩=Tr(A^TB). In Eq. 1, the first derivative of E(D_n) [i.e., E^[1](D_n)] with respect to D_n is the corresponding Fock matrix [i.e., F(D_n)],

E^{[1]} (D_{n}) = {\frac{\partial E (D)}{\partial D} |}_{D = D_{n}} = 2 F (D_{n}) = 2 F_{n} .

(2)

The second derivative of E(D_n) with respect to D_n [i.e., E^[2](D_n)] is much more computationally expensive to evaluate exactly from the analytical expressions. A useful quasi-Newton approximation to E^[2](D_n) is imposed to circumvent this costly calculation,²²

E^{[2]} (D_{n}) (D - D_{n}) \approx E^{[1]} (D) - E^{[1]} (D_{n}) = 2 F (D) - 2 F (D_{n}) .

(3)

Thus Eq. 1 for a closed-shell system is rewritten as

E (D) \approx E (D_{n}) + 2 ⟨ D - D_{n} | F (D_{n}) ⟩ + ⟨ D - D_{n} | [F (D) - F (D_{n})] ⟩ .

(4)
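For readers who want the intermediate step, the passage from Eq. 1 to Eq. 4 can be written out term by term. This is only a restatement of Eqs. 1–3 in standard LaTeX notation (with \langle and \rangle for the trace inner product defined after Eq. 1); no new assumption is introduced.

```latex
\begin{align*}
\langle D - D_n \,|\, E^{[1]}(D_n) \rangle
  &= 2\,\langle D - D_n \,|\, F(D_n) \rangle
  && \text{(Eq. 2)} \\
\tfrac{1}{2}\,\langle D - D_n \,|\, E^{[2]}(D_n) \,|\, D - D_n \rangle
  &\approx \tfrac{1}{2}\,\langle D - D_n \,|\, 2F(D) - 2F(D_n) \rangle
  && \text{(Eq. 3)} \\
  &= \langle D - D_n \,|\, F(D) - F(D_n) \rangle \\
\Rightarrow\quad
E(D) &\approx E(D_n) + 2\,\langle D - D_n \,|\, F(D_n) \rangle
      + \langle D - D_n \,|\, F(D) - F(D_n) \rangle
  && \text{(Eq. 4)}
\end{align*}
```

The factor of 1/2 in the second-order term of Eq. 1 is cancelled by the factor of 2 carried by the Fock matrices in Eq. 3, which is why no prefactor remains on the last term of Eq. 4.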

Since the density matrix must satisfy the symmetry (D=D^T), trace [Tr(DS)=N_elec, where S is the overlap matrix of atomic basis sets and N_elec is the total number of electrons], and idempotency (DSD=D) constraints, the elements of D cannot be optimized directly in Eq. 4. In Ref. 22, an antisymmetric matrix is utilized to parametrize the elements of D and fulfill all three constraints on the density matrix. Combined with the trust-region method,²⁴ direct optimization of the density matrix without diagonalization was carried out. Their approach was demonstrated to be robust and efficient, particularly when the classical DIIS approaches with diagonalization fail to converge, or converge to a saddle point instead of a minimum.

However, the diagonalization step in DIIS is the most efficient method to obtain the coefficients and eigenvalues of molecular orbitals as well as the density matrix. In particular, the diagonalization cost is low for many interesting moderately sized systems, such as enzymatic active sites (treated using HF or KS-DFT) in QM∕MM simulations.¹ Therefore, in this work, we propose a new method combining the ARH energy function and the standard DIIS approach, abbreviated as ADIIS, to improve the efficiency and reliability of the SCF convergence.
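The diagonalization step referred to above can be sketched in a few lines of NumPy. This is an illustrative sketch, not the in-house implementation: the function name `density_from_fock`, the use of symmetric (Löwdin) orthogonalization, and the aufbau occupation of the lowest `n_occ` orbitals are all our assumptions. The density matrix is normalized here so that DSD = D, which makes Tr(DS) equal to the number of occupied orbitals; any factor of 2 for double occupancy is a convention choice that the text does not spell out.

```python
import numpy as np

def density_from_fock(F, S, n_occ):
    """Build an idempotent density matrix from a Fock matrix by diagonalization."""
    # Symmetric (Loewdin) orthogonalization: X = S^(-1/2), so that X S X = 1.
    s_vals, s_vecs = np.linalg.eigh(S)
    X = s_vecs @ np.diag(s_vals ** -0.5) @ s_vecs.T
    # Solve the ordinary symmetric eigenproblem in the orthonormal basis.
    orbital_energies, C_ortho = np.linalg.eigh(X @ F @ X)
    # Back-transform; the columns of C satisfy C^T S C = 1.
    C = X @ C_ortho
    # Occupy the n_occ lowest orbitals (eigh returns eigenvalues in ascending order).
    C_occ = C[:, :n_occ]
    D = C_occ @ C_occ.T
    return D, orbital_energies, C

# Synthetic check with a random symmetric F and a positive-definite S
# (hypothetical data, not a real molecule).
rng = np.random.default_rng(0)
A = rng.normal(size=(4, 4))
F = 0.5 * (A + A.T)
B = rng.normal(size=(4, 4))
S = np.eye(4) + 0.05 * B @ B.T
D, eps, C = density_from_fock(F, S, n_occ=2)
print(np.allclose(D @ S @ D, D), np.isclose(np.trace(D @ S), 2.0))
```

By construction the returned matrix satisfies the symmetry, trace, and idempotency constraints simultaneously, which is the practical appeal of diagonalization noted in the text.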

The ADIIS algorithm

Let us assume that D₁, D₂, …, and D_n are n density matrices and F₁, F₂, …, and F_n are the corresponding Fock matrices after n SCF iterations are performed. Similar to EDIIS,¹⁸ the approximate density matrix ${\tilde{D}}_{n + 1}$ for the (n+1)th iteration can be linearly expanded in terms of the previous density matrices and the linear coefficients can be obtained readily and efficiently via energy minimization

{\tilde{D}}_{n + 1} = arg min {E (\tilde{D}), \tilde{D} = \sum_{i = 1}^{n} c_{i} D_{i}, \sum_{i = 1}^{n} c_{i} = 1, c_{i} \geq 0} .

(5)

Here, a convex linear combination of density matrices (i.e., c_i∊[0,1])¹⁸ as in EDIIS is imposed to simplify and stabilize the optimization. Once {c_i,i=1,…,n} are obtained via energy minimization, the Fock matrix ${\tilde{F}}_{n + 1}$ can be evaluated by Pulay’s DIIS scheme

{\tilde{F}}_{n + 1} = \sum_{i = 1}^{n} c_{i} F_{i} .

(6)

Then the new density matrix D_{n+1}, fulfilling the symmetry, trace, and idempotency constraints, can be constructed by diagonalization of ${\tilde{F}}_{n + 1}$. In this work, to obtain the linear coefficients in Eq. 5, the ARH energy function [defined by Eq. 4] is used to compute $E (\tilde{D})$, which leads to the ARH-energy DIIS or ADIIS scheme. Using the notation $f^{ADIIS} = \tilde{E} (\sum_{i = 1}^{n} c_{i} D_{i})$, Eq. 4 can be rewritten as

f^{ADIIS} (c_{1}, \dots, c_{n}) = E (D_{n}) + 2 \sum_{i = 1}^{n} c_{i} ⟨ D_{i} - D_{n} | F (D_{n}) ⟩ + \sum_{i = 1}^{n} \sum_{j = 1}^{n} c_{i} c_{j} ⟨ D_{i} - D_{n} | [F (D_{j}) - F (D_{n})] ⟩ .

(7)
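Equation 7 is a quadratic form in the coefficients, so it can be evaluated from two precomputed arrays. The sketch below is one possible way to organize this in NumPy; the names `adiis_terms`, `f_adiis`, and `grad_adiis` are hypothetical, and the test data come from a synthetic energy model in which the Fock matrix is exactly linear in D (so that the quasi-Newton condition of Eq. 3 holds exactly). It is not taken from the authors' program.

```python
import numpy as np

def adiis_terms(dens, focks):
    """Precompute the linear and quadratic terms of Eq. 7.

    dens, focks: lists of stored density and Fock matrices; the last
    entries are D_n and F_n.  <A|B> = Tr(A^T B) = sum(A * B).
    """
    Dn, Fn = dens[-1], focks[-1]
    dD = [D - Dn for D in dens]
    dF = [F - Fn for F in focks]
    g = np.array([2.0 * np.sum(d * Fn) for d in dD])          # 2 <D_i - D_n | F_n>
    H = np.array([[np.sum(d * f) for f in dF] for d in dD])   # <D_i - D_n | F_j - F_n>
    return g, H

def f_adiis(c, e_n, g, H):
    """Value of Eq. 7 for coefficients c, given E(D_n) = e_n."""
    return e_n + g @ c + c @ H @ c

def grad_adiis(c, g, H):
    """First derivatives of Eq. 7 with respect to c_i."""
    return g + (H + H.T) @ c

# Synthetic model: E(D) = 2<D|h> + <D|W*D> with elementwise weights W,
# so that dE/dD = 2F with F(D) = h + W*D (linear in D).
rng = np.random.default_rng(0)
h = rng.normal(size=(3, 3))
W = rng.uniform(0.5, 1.5, size=(3, 3))
energy = lambda D: 2.0 * np.sum(D * h) + np.sum(D * W * D)
dens = [rng.normal(size=(3, 3)) for _ in range(4)]
focks = [h + W * D for D in dens]

g, H = adiis_terms(dens, focks)
c = np.array([0.1, 0.2, 0.3, 0.4])
D_mix = sum(ci * Di for ci, Di in zip(c, dens))
print(f_adiis(c, energy(dens[-1]), g, H) - energy(D_mix))
```

For this model the printed difference should vanish to rounding error, because the energy is exactly quadratic in D. For KS-DFT the Fock matrix is not linear in D, and Eq. 7 is then only as good as the quasi-Newton approximation.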

In contrast, for a closed-shell system, the energy expression $E (\tilde{D})$ used in EDIIS is given by

f^{EDIIS} (c_{1}, \dots, c_{n}) = \sum_{i = 1}^{n} c_{i} E (D_{i}) - \sum_{i = 1}^{n} \sum_{j = 1}^{n} c_{i} c_{j} ⟨ D_{i} - D_{j} | F_{i} - F_{j} ⟩ .

(8)

ADIIS is based on the second-order Taylor expansion of the total energy with respect to the density matrix, and hence f^ADIIS is accurate for both HF and KS-DFT calculations provided that the quasi-Newton condition [Eq. 3] is sufficiently accurate. In contrast, f^EDIIS is precise only for HF calculations, because the HF energy is quadratic in the density matrix; for KS-DFT, f^EDIIS is an approximate quadratic expression because the exchange-correlation term is nonlinear in the density matrix.¹⁸ (Note that the DSM-based method²⁰,²¹ uses the Taylor expansion only up to first order as the energy function to optimize the linear coefficients {c_i} in the purified density matrix.)

For EDIIS or ADIIS, the optimization problem is

min {f^{EDIIS} or f^{ADIIS}, \sum_{i = 1}^{n} c_{i} = 1, c_{i} \geq 0} .

(9)

For DIIS, the optimization problem is

min {f^{DIIS}, \sum_{i = 1}^{n} c_{i} = 1},

(10)

where f^DIIS is defined as

f^{DIIS} (c_{1}, \dots, c_{n}) = \sum_{i = 1}^{n} \sum_{j = 1}^{n} c_{i} c_{j} [F_{i}, D_{i} S] \cdot [F_{j}, D_{j} S] .

(11)
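The constrained minimization in Eqs. 10 and 11 reduces to a small linear system once a Lagrange multiplier is introduced for the normalization condition. The following NumPy sketch illustrates this; it assumes that the bracket in Eq. 11 denotes the usual commutator-type error matrix e_i = F_i D_i S − S D_i F_i and that the dot denotes the elementwise inner product, both of which are our reading rather than something stated explicitly in the text. The function names are hypothetical.

```python
import numpy as np

def diis_error(F, D, S):
    """Commutator-type error matrix (assumed form): F D S - S D F."""
    return F @ D @ S - S @ D @ F

def diis_coefficients(errors):
    """Minimize sum_ij c_i c_j <e_i|e_j> subject to sum_i c_i = 1."""
    n = len(errors)
    B = np.array([[np.sum(ei * ej) for ej in errors] for ei in errors])
    # Stationarity of the Lagrangian gives  B c + mu * 1 = 0,  1^T c = 1.
    A = np.zeros((n + 1, n + 1))
    A[:n, :n] = B
    A[:n, n] = 1.0
    A[n, :n] = 1.0
    rhs = np.zeros(n + 1)
    rhs[n] = 1.0
    solution = np.linalg.solve(A, rhs)
    return solution[:n]  # drop the multiplier

# Two collinear one-element "error matrices" (hypothetical numbers).
errors = [np.array([[1.0]]), np.array([[2.0]])]
coeffs = diis_coefficients(errors)
print(coeffs, coeffs.sum())
```

In this toy case the combined error c_1·1 + c_2·2 can be driven to zero only with c_1 = 2 and c_2 = −1. The negative coefficient is allowed because Eq. 10 carries no c_i ≥ 0 constraint, in contrast to Eq. 9.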

Based on Eqs. 9, 10, ADIIS, similar to EDIIS, is indeed an interpolation scheme with the convex constraints {c_i∊[0,1]}. Pulay’s standard DIIS is an extrapolation procedure without the convex constraints. Moreover, ADIIS utilizes a clearly defined energy expression [Eq. 1] to optimize the linear coefficients {c_i} toward the energy minimum, while DIIS only minimizes the orbital rotation gradient, which does not necessarily lead toward the energy minimum.¹⁸

As explained in Ref. 18 by Kudin et al., the energy functions in both ADIIS and EDIIS are less sensitive than the orbital rotation gradient in DIIS when the density matrix is varied. As such, ADIIS and EDIIS may be less efficient than DIIS in the region close to SCF convergence. Therefore, “ADIIS+DIIS,” similar to EDIIS+DIIS in Ref. 18, was also implemented in this work. In ADIIS+DIIS, ADIIS is carried out at the beginning of the SCF to obtain quickly a nearly converged density matrix, and then ADIIS is switched over to DIIS to achieve rapid final convergence.
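The hand-over from ADIIS to DIIS described above can be expressed as a small decision function. The sketch below uses the energy-difference criterion and the 0.01 a.u. default threshold reported in the Results section; the function name `select_scheme` and the choice to make the switch one-way (never returning to ADIIS once DIIS is active) are our assumptions.

```python
def select_scheme(energies, threshold=0.01, already_diis=False):
    """Return "ADIIS" or "DIIS" for the next SCF cycle.

    energies: total energies (a.u.) of the SCF cycles completed so far.
    threshold: energy difference (a.u.) below which DIIS is activated.
    already_diis: True once DIIS has been activated (assumed one-way switch).
    """
    if already_diis:
        return "DIIS"
    if len(energies) >= 2 and abs(energies[-1] - energies[-2]) < threshold:
        return "DIIS"
    return "ADIIS"

# Hypothetical energy history: large change first, then a small one.
print(select_scheme([-150.0, -152.5]))
print(select_scheme([-152.900, -152.905]))
```

A larger threshold hands control to DIIS earlier, which is the knob varied between 0.01, 1.0, and 2.0 a.u. in the calculations reported below.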

RESULTS

All the algorithms, including DIIS, EDIIS, and ADIIS, were implemented in an in-house program.²⁵ Several molecules were chosen to illustrate the performance of ADIIS and ADIIS+DIIS in comparison with DIIS, EDIIS, and EDIIS+DIIS: CH₃CHO,¹⁸ a water cluster,²² a cadmium complex,²² a polyalanine peptide,²² and two ruthenium compounds. (All of the geometries are available via http://www.chem.duke.edu/~xqhu/geoms.tar.gz) In the standard DIIS calculations, neither level shifting nor density damping was added. For CH₃CHO and the cadmium complex, the core Hamiltonian was used to obtain the initial density matrix. For the water cluster, the polyalanine peptide, and the two ruthenium compounds, the atomic density matrices were employed to construct the initial density matrix. SCF convergence is considered achieved when the energy difference between the current iteration and the previous one is less than 10⁻⁸ a.u. For ADIIS+DIIS and EDIIS+DIIS, we used a simple criterion to activate DIIS: when the energy difference between two SCF cycles is less than 0.01 a.u. (2.0 a.u. was also used for the polyalanine peptide, and 1.0 a.u. was used for the two ruthenium compounds), ADIIS or EDIIS is switched to DIIS. (Note that other criteria¹⁸ can also be applied to activate DIIS.) To obtain the linear coefficients for EDIIS and ADIIS, the L-BFGS algorithm²⁶,²⁷ was used, and the energy functions [Eq. 8 for EDIIS and Eq. 7 for ADIIS] were minimized according to Eq. 9 with the aid of first derivatives (∂f^EDIIS/∂c_i and ∂f^ADIIS/∂c_i, respectively). [Note that the variable substitutions $c_{i} = t_{i}^{2} / \sum_{j} t_{j}^{2}$ are applied to transform the constrained optimization problem into an unconstrained one in L-BFGS.] The averaged coefficients were taken as the initial guess during the L-BFGS optimizations. In this work, only six Fock matrices were combined linearly to construct ${\tilde{F}}_{n + 1}$ in Eq. 6.
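The variable substitution quoted above, together with the chain rule needed to turn ∂f/∂c_i into ∂f/∂t_k, can be sketched as follows. The helper names are hypothetical, and reading "averaged coefficients" as c_i = 1/n (that is, all t_i equal) is our assumption. Differentiating c_i = t_i²/Σ_j t_j² gives ∂c_i/∂t_k = (2 t_k/Σ_j t_j²)(δ_ik − c_i), which is what `grad_wrt_t` implements.

```python
import numpy as np

def coeffs_from_t(t):
    """Map unconstrained t to c with c_i >= 0 and sum_i c_i = 1."""
    t2 = t * t
    return t2 / t2.sum()

def grad_wrt_t(t, grad_c):
    """Chain rule: df/dt_k = (2 t_k / sum_j t_j^2) * (g_k - sum_i g_i c_i)."""
    c = coeffs_from_t(t)
    return 2.0 * t / np.sum(t * t) * (grad_c - grad_c @ c)

def objective_in_t(t, f, grad_f):
    """Value and gradient in t of a function f(c) with gradient grad_f(c)."""
    c = coeffs_from_t(t)
    return f(c), grad_wrt_t(t, grad_f(c))

# Starting point for six stored matrices: equal t_i give c_i = 1/6.
t0 = np.ones(6)
print(coeffs_from_t(t0))
```

The pair returned by `objective_in_t` is in the form expected by general-purpose quasi-Newton minimizers, so the c_i ≥ 0 and normalization constraints of Eq. 9 never have to be imposed explicitly. Note that the gradient component vanishes wherever t_k = 0, so a coefficient that starts at exactly zero stays there.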

Well-behaved molecular systems: CH₃CHO and a water cluster

We first chose CH₃CHO (acetaldehyde) as a simple and well-behaved small molecule to illustrate the performance of DIIS, EDIIS, ADIIS, EDIIS+DIIS, and ADIIS+DIIS. Calculations were performed at the RHF/6-31G* (Refs. 28 and 29) level of theory (53 Gaussian basis functions). As shown in Fig. 1, compared to the standard DIIS approach, both EDIIS and ADIIS are much slower to reach SCF convergence because both are based on minimizing approximate energy functions [Eq. 8 and Eq. 7] at each SCF cycle. The energy functions are not sensitive to small variations in the density matrices, particularly when the density matrix is close to the convergence region. In contrast, the orbital rotation gradient minimized in DIIS is more sensitive to variations of the density matrices. Overall, DIIS converges the energy of simple small organic molecules much faster than ADIIS and EDIIS, which is consistent with the discussion in Ref. 18. However, in the first few SCF iterations, EDIIS and ADIIS are capable of reaching a lower energy from the crude initial guess than DIIS (for instance, see the energies at the fourth cycle for EDIIS and ADIIS). Therefore, similar to the EDIIS+DIIS approach,¹⁸ we also combined ADIIS with DIIS (i.e., ADIIS+DIIS): at the beginning of the SCF, ADIIS is applied to accelerate the SCF procedure, and it is switched over to DIIS when the energy difference between two SCF cycles is less than the specified threshold (0.01 a.u. is the default value in this work). As such, ADIIS helps to obtain from the initial guess a superior density matrix close to the convergence region and makes DIIS more reliable and robust afterward. For CH₃CHO, when DIIS was switched on within ten SCF cycles, ADIIS+DIIS and EDIIS+DIIS significantly improved the speed of SCF convergence, as shown in Fig. 1. In addition, the efficiency of ADIIS+DIIS was the same as that of DIIS in this case.

Since ADIIS+DIIS or EDIIS+DIIS is much more efficient than ADIIS or EDIIS alone, we compared only these two combined methods with DIIS for the other molecules. A much bigger system, a water cluster with 51 monomers²² using B3LYP(VWN5)³⁰,³¹,³²,³³/6-31G** (1275 Gaussian basis functions), was also considered, and the results are shown in Fig. 2. Since a good initial guess for the density matrix was obtained from the atomic density matrices, DIIS was faster than ADIIS+DIIS and EDIIS+DIIS in terms of the number of SCF cycles required for convergence. Hence, for well-behaved systems such as CH₃CHO and the water cluster, DIIS is the most efficient algorithm for accelerating SCF convergence.¹⁸ Our ADIIS+DIIS approach is comparable in efficiency to DIIS and more efficient than EDIIS+DIIS in such well-behaved systems.

Figure 1. Convergence of the CH₃CHO energy with DIIS (the black solid line with circles), EDIIS (the red solid line with empty squares), ADIIS (the blue solid line with empty triangles), EDIIS+DIIS (the red dashed line with squares), and ADIIS+DIIS (the blue dashed line with triangles). A core Hamiltonian was applied to obtain the initial density matrix. The total energy was calculated with RHF/6-31G*. The converged energy is −152.914 4029 a.u.

Figure 2. Convergence of the energy for a water cluster consisting of 51 monomers with DIIS (the black solid line with circles), EDIIS+DIIS (the red dashed line with squares), and ADIIS+DIIS (the blue dashed line with triangles). The initial density matrix was constructed from the atomic density matrices. The total energy was calculated with B3LYP/6-31G**. The converged energy is −3898.232 5683 a.u.

A challenging system: A cadmium complex

A Cd²⁺-imidazole complex²² at both the RHF and B3LYP(VWN5)/3-21G³⁴,³⁵ levels (92 Gaussian basis functions) is a challenging system. For this cadmium-imidazole compound, the standard DIIS scheme failed to converge for both HF and B3LYP (see Fig. 3) when the core Hamiltonian was employed to obtain the initial density matrix. However, EDIIS+DIIS and ADIIS+DIIS achieve SCF convergence efficiently with HF, as shown in Fig. 3a. For DFT methods such as B3LYP, EDIIS+DIIS needed almost 60 SCF iterations to accumulate enough information to bring the density matrix close to convergence. These different convergence behaviors of EDIIS+DIIS are due to the energy expression in EDIIS, which is precise for HF but is only approximated by the quadratic ODA for DFT. The ODA scheme can be problematic for some molecules with DFT. Compared to DIIS and EDIIS+DIIS, ADIIS+DIIS minimized the energy robustly at the beginning of the SCF via the ADIIS scheme, and it proceeded through the SCF iterations smoothly and efficiently after DIIS was switched on in both the HF and DFT cases [see Figs. 3a and 3b]. This suggests that the ARH energy function [i.e., Eq. 7] is a good approximation for both HF and DFT. The Taylor expansion up to second order with respect to the density matrix is sufficiently accurate to predict the next improved Fock matrix ${\tilde{F}}_{n + 1}$ in Eq. 6. Hence, during the SCF procedure, the density matrix of the cadmium-imidazole system was rapidly optimized from the crude initial guess by ADIIS for both HF and DFT, which significantly improved the efficiency and reliability of the subsequent DIIS in ADIIS+DIIS.

Figure 3. Convergence of the energy for a cadmium-imidazole complex with DIIS (the black solid line with circles), EDIIS+DIIS (the red dashed line with squares), and ADIIS+DIIS (the blue dashed line with triangles). A core Hamiltonian was applied to obtain the initial density matrix. The total energy was calculated with RHF/3-21G (a) and with B3LYP/3-21G (b). The converged energies are −5663.143 3914 a.u. for RHF and −5666.922 5744 a.u. for B3LYP.

A typical biological system: A polyalanine peptide

To further illustrate the performance of DIIS, EDIIS+DIIS, and ADIIS+DIIS in a typical biological system, energy calculations were carried out for a polyalanine peptide with 29 residues²² with RHF and B3LYP(VWN5)/6-31G³⁶,³⁷ (1599 Gaussian basis functions). As shown in Fig. 4a, for all three algorithms, the energy decreased monotonically through the SCF iterations with HF, and convergence was achieved within 20 cycles. In contrast, for the B3LYP results in Fig. 4b, EDIIS+DIIS failed to converge because the error in Eq. 8 arising from the exchange-correlation energy in DFT accumulates as the chemical system becomes larger. In the first few SCF iterations, the energy obtained with EDIIS+DIIS actually increased, and the density matrix was trapped in an incorrect electronic state. Both DIIS and ADIIS+DIIS eventually converged to the energy minimum. However, DIIS first converged to a saddle point and required more than 15 SCF cycles to escape this critical point [see the broad peak in Fig. 4b for DIIS]. Compared with the 57 SCF cycles required in total by DIIS, ADIIS+DIIS needed fewer cycles to overcome the saddle point, and it reached the energy minimum within 40 SCF iterations. When a larger energy threshold (ΔE=2.0 a.u.) was used to switch on DIIS, ADIIS+DIIS also achieved SCF convergence. This indicates that the performance of ADIIS+DIIS does not depend critically on the energy threshold.

Figure 4. Convergence of the energy of a 29-residue polyalanine peptide with DIIS (the black solid line with circles), EDIIS+DIIS (the red dashed line with squares), ADIIS+DIIS with the energy difference threshold of 0.01 a.u. to activate DIIS (the blue dashed line with triangles), and ADIIS+DIIS with the threshold of 2.0 a.u. (the magenta dashed line with stars). The initial density matrix was constructed from the atomic density matrices. The total energy was calculated with RHF/6-31G (a) and with B3LYP/6-31G (b). The converged energies are −7127.568 1797 a.u. for RHF and −7171.566 7248 a.u. for B3LYP.

Problem cases: Two ruthenium compounds

We further studied the SCF convergence behavior for two ruthenium compounds, [Ru^V=O]^2.5+ and Ru₄(CO), with B3LYP(VWN5)/LANL2DZ;³⁸ the results are shown in Fig. 5. [Ru^V=O]^2.5+ is [Ru(tpy)(bpm)(OH₂)]^2.5+ (tpy is 2,2′:6′,2″-terpyridine; bpm is 2,2′-bipyrimidine), which is an open-shell system with a fractional electron, 363 Gaussian basis functions, and 56 157 TIP3P water molecules.³⁹ [Ru^V=O]^2.5+ has 123 α-spin electrons and 122.5 β-spin electrons in the presence of these TIP3P water molecules. The fractional electronic occupation has been successfully applied in redox potential simulations,⁴⁰ and it is very useful for studying the quality of approximate density functionals in DFT.⁴¹,⁴²,⁴³ However, systems with fractional electronic occupations often suffer from SCF convergence problems. As shown in Fig. 5a, EDIIS+DIIS failed to achieve convergence. Our ADIIS+DIIS method is more stable and robust than DIIS for such a fractional-charge open-shell system.

Figure 5. Convergence of the energies of (a) [Ru^V=O]^2.5+ and (b) Ru₄(CO) with DIIS (the black solid line with circles), EDIIS+DIIS (the red dashed line with squares), and ADIIS+DIIS with the energy difference threshold of 1.0 a.u. to activate DIIS (the blue dashed line with triangles). The initial density matrix was constructed from the atomic density matrices. The total energy was calculated with B3LYP/LANL2DZ. [Ru^V=O]^2.5+ has 123 α-spin electrons and 122.5 β-spin electrons. Ru₄(CO) is closed shell. The converged energies are −1595.849 0175 a.u. for [Ru^V=O]^2.5+ and −488.710 6840 a.u. for Ru₄(CO).

For the second example, Ru₄(CO) is a closed-shell system with 114 Gaussian basis functions, and its geometry was taken from Ref. 44. This compound was used to study CO adsorption on transition metal clusters.⁴⁴ Fig. 5b shows that both EDIIS+DIIS and DIIS failed to achieve SCF convergence for Ru₄(CO). However, for ADIIS+DIIS, the total energy of Ru₄(CO) decreased monotonically, reaching convergence after 90 iterations. Thus, ADIIS+DIIS appears to be more effective and robust than EDIIS+DIIS and DIIS for heavy transition metal systems in both open- and closed-shell cases.

In summary, various molecular systems, ranging from simple organic molecules to biomolecules, from closed-shell to open-shell cases, and from light atoms to transition metals, were computed by several SCF algorithms. ADIIS+DIIS with both HF and DFT is efficient enough to guide the construction of updated density matrices in the SCF process for the systems studied. The density matrices explored by ADIIS in the beginning stages of the SCF make the standard DIIS more robust and reliable afterward. ADIIS+DIIS provides a robust and efficient protocol to accelerate SCF convergence.

CONCLUSIONS

In the standard DIIS approach, the orbital rotation gradient is minimized to obtain the linear Fock matrix coefficients, and then a new Fock matrix is constructed for diagonalization in the SCF. However, the gradient minimum may not correspond to the correct energy minimum, which causes energy oscillations during the SCF [for instance, see Figs. 3 and 5b]. Although EDIIS minimizes the energy function derived from the ODA in each SCF cycle, the energy expression is accurate only for HF. For DFT, EDIIS can have large errors and fail to achieve SCF convergence in some molecular systems [for instance, see Fig. 4b]. In this work, we applied a more rigorously defined ARH energy function to optimize the linear coefficients of the Fock matrices using L-BFGS. This newly developed ADIIS scheme rapidly brings the initial density matrix close to the convergence region. Several examples in this paper demonstrate that after ADIIS is combined with DIIS to yield ADIIS+DIIS, it accelerates SCF convergence more reliably and efficiently than DIIS, EDIIS, and EDIIS+DIIS.

It is worth noting that the quadratic ARH energy function combined with trust-region optimization was originally developed to accelerate SCF convergence without Fock matrix diagonalization.²²,²³ In those studies, the two separate computational steps (i.e., diagonalization and DIIS averaging) were no longer involved in each SCF cycle, and direct optimization of the density matrix was performed. However, the diagonalization step can be considered a direct minimization of a quadratic model of the total energy and is still one of the most efficient schemes for obtaining useful molecular orbital coefficients and the corresponding density matrix in each SCF cycle. In particular, the diagonalization step is not too costly for many interesting molecular systems (even for systems with thousands of basis functions). To capitalize on the efficiency of diagonalization, the new ADIIS method makes use of the ARH energy function derived from the Taylor expansion to optimize the linear Fock matrix coefficients, while the Fock matrices are based on density matrices constructed from diagonalization in previous SCF steps. Furthermore, ADIIS+DIIS can be easily coded in current quantum chemistry packages. Our results indicate that ADIIS dramatically improves the quality of the density matrix at the beginning of the SCF and helps DIIS converge to the minimum rapidly in ADIIS+DIIS.

In addition to the acceleration techniques, the form of the exchange-correlation functional in DFT can also affect the SCF convergence behavior. In particular, the intrinsic delocalization and static correlation errors⁴¹,⁴²,⁴³ in HF and approximate DFT functionals can force the SCF to converge to an unphysical electronic state. We believe that the erratic or divergent behavior of the SCF procedure can be relieved by the next generation of DFT functionals with minimal delocalization and static correlation errors. More importantly, ADIIS is extendable to any new functional form because the ARH energy function is based only on the second-order Taylor expansion with respect to the density matrix, with the Hessian approximated by a quasi-Newton approach.

ACKNOWLEDGMENTS

We acknowledge the support from the National Institutes of Health and the National Science Foundation. Discussions with Dr. Erin R. Johnson have been helpful.

References

1. Hu H. and Yang W. T., Annu. Rev. Phys. Chem. 59, 573 (2008). doi:10.1146/annurev.physchem.59.032607.093618
2. Tantillo D. J., Chen J. G., and Houk K. N., Curr. Opin. Chem. Biol. 2, 743 (1998). doi:10.1016/S1367-5931(98)80112-9
3. Wang M. L., Hu X. Q., Beratan D. N., and Yang W. T., J. Am. Chem. Soc. 128, 3228 (2006). doi:10.1021/ja0572046
4. Keinan S., Hu X. Q., Beratan D. N., and Yang W. T., J. Phys. Chem. A 111, 176 (2007). doi:10.1021/jp0646168
5. Hu X. Q., Beratan D. N., and Yang W. T., J. Chem. Phys. 129, 9 (2008).
6. Jorgensen W. L., Science 303, 1813 (2004). doi:10.1126/science.1096361
7. Franceschetti A. and Zunger A., Nature (London) 402, 60 (1999). doi:10.1038/46995
8. Parr R. G. and Yang W. T., Density-Functional Theory of Atoms and Molecules (Oxford University Press, New York, 1989).
9. Kohn W. and Sham L. J., Phys. Rev. 140, A1133 (1965). doi:10.1103/PhysRev.140.A1133
10. Koch W. and Holthausen M. C., A Chemist’s Guide to Density Functional Theory (Wiley-VCH, Weinheim, 2002).
11. Saunders V. R. and Hillier I. H., Int. J. Quantum Chem. 7, 699 (1973). doi:10.1002/qua.560070407
12. Rabuck A. D. and Scuseria G. E., J. Chem. Phys. 110, 695 (1999). doi:10.1063/1.478177
13. Cancès E. and Le Bris C., Math. Modell. Numer. Anal. 34, 749 (2000). doi:10.1051/m2an:2000102
14. Cancès E. and Le Bris C., Int. J. Quantum Chem. 79, 82 (2000).
15. Cancès E., J. Chem. Phys. 114, 10616 (2001). doi:10.1063/1.1373430
16. Pulay P., Chem. Phys. Lett. 73, 393 (1980). doi:10.1016/0009-2614(80)80396-4
17. Pulay P., J. Comput. Chem. 3, 556 (1982). doi:10.1002/jcc.540030413
18. Kudin K. N., Scuseria G. E., and Cancès E., J. Chem. Phys. 116, 8255 (2002). doi:10.1063/1.1470195
19. Kudin K. N. and Scuseria G. E., Math. Modell. Numer. Anal. 41, 281 (2007). doi:10.1051/m2an:2007022
20. Thøgersen L., Olsen J., Yeager D., Jørgensen P., Sałek P., and Helgaker T., J. Chem. Phys. 121, 16 (2004). doi:10.1063/1.1755673
21. Thøgersen L., Olsen J., Kohn A., Jørgensen P., Sałek P., and Helgaker T., J. Chem. Phys. 123, 074103 (2005). doi:10.1063/1.1989311
22. Høst S., Olsen J., Jansik B., Thøgersen L., Jørgensen P., and Helgaker T., J. Chem. Phys. 129, 124106 (2008). doi:10.1063/1.2974099
23. Høst S., Jansik B., Olsen J., Jørgensen P., Reine S., and Helgaker T., Phys. Chem. Chem. Phys. 10, 5344 (2008). doi:10.1039/b807639a
24. Fletcher R., Practical Methods of Optimization, 2nd ed. (Wiley, Chichester, 1987).
25. An in-house program for QM/MM simulations (http://www.qm4d.info).
26. Nocedal J., Math. Comput. 35, 773 (1980). doi:10.2307/2006193
27. Liu D. and Nocedal J., Math. Program. Ser. B 45, 503 (1989). doi:10.1007/BF01589116
28. Petersson G. A., Bennett A., Tensfeldt T. G., Al-Laham M. A., Shirley W. A., and Mantzaris J., J. Chem. Phys. 89, 2193 (1988). doi:10.1063/1.455064
29. Petersson G. A. and Al-Laham M. A., J. Chem. Phys. 94, 6081 (1991). doi:10.1063/1.460447
30. Becke A. D., Phys. Rev. B 38, 3098 (1988).
31. Becke A. D., J. Chem. Phys. 98, 5648 (1993). doi:10.1063/1.464913
32. Lee C. T., Yang W. T., and Parr R. G., Phys. Rev. B 37, 785 (1988). doi:10.1103/PhysRevB.37.785
33. Vosko S. H., Wilk L., and Nusair M., Can. J. Phys. 58, 1200 (1980).
34. Binkley J. S., Pople J. A., and Hehre W. J., J. Am. Chem. Soc. 102, 939 (1980). doi:10.1021/ja00523a008
35. Dobbs K. D. and Hehre W. J., J. Comput. Chem. 8, 880 (1987). doi:10.1002/jcc.540080615
36. Hehre W. J., Ditchfield R., and Pople J. A., J. Chem. Phys. 56, 2257 (1972). doi:10.1063/1.1677527
37. Francl M. M., Pietro W. J., Hehre W. J., Binkley J. S., Gordon M. S., Defrees D. J., and Pople J. A., J. Chem. Phys. 77, 3654 (1982). doi:10.1063/1.444267
38. Wadt W. R. and Hay P. J., J. Chem. Phys. 82, 284 (1985). doi:10.1063/1.448800
39. Jorgensen W. L., Chandrasekhar J., Madura J. D., Impey R. W., and Klein M. L., J. Chem. Phys. 79, 926 (1983). doi:10.1063/1.445869
40. Zeng X., Hu H., Hu X. Q., Cohen A. J., and Yang W. T., J. Chem. Phys. 128, 124510 (2008). doi:10.1063/1.2832946
41. Cohen A. J., Mori-Sánchez P., and Yang W. T., J. Chem. Phys. 129, 121104 (2008). doi:10.1063/1.2987202
42. Cohen A. J., Mori-Sánchez P., and Yang W. T., Science 321, 792 (2008). doi:10.1126/science.1158722
43. Mori-Sánchez P., Cohen A. J., and Yang W. T., Phys. Rev. Lett. 100, 146401 (2008). doi:10.1103/PhysRevLett.100.146401
44. Zeinalipour-Yazdi C. D., Cooksy A. L., and Efstathiou A. M., Surf. Sci. 602, 1858 (2008). doi:10.1016/j.susc.2008.03.024
