Author manuscript; available in PMC: 2023 Feb 28.
Published in final edited form as: J Chem Theory Comput. 2022 Apr 13;18(5):2975–2982. doi: 10.1021/acs.jctc.2c00142

Regularized Localized Molecular Orbitals in a Divide-and-Conquer Approach for Linear Scaling Calculations

Liang Peng 1, Daoling Peng 2, Feng Long Gu 3, Weitao Yang 4
PMCID: PMC9972215  NIHMSID: NIHMS1872808  PMID: 35416665

Abstract

Non-orthogonal localized molecular orbitals (NOLMOs) have been employed as building blocks for the divide-and-conquer (DC) linear scaling method. The NOLMOs are calculated from subsystems and used for constructing the density matrix (DM) of the entire system, instead of the subsystem DM in the original DC approach. Also, unlike the original DC method, the inverse electronic temperature parameter β is not needed anymore. Furthermore, a new regularized localization approach for NOLMOs has been developed, in which the localization cost function is a sum of the spatial spread function, as in the Boys method, and the kinetic energy, as a regularization measure to limit the oscillation of the NOLMOs. The optimal weight of the kinetic energy can be determined by optimization with analytical gradients. The resulting regularized NOLMOs have enhanced smoothness and better transferability because of reduced kinetic energies. Compared with the original DC, while NOLMO-DC has a similar computational linear scaling cost, the accuracy of NOLMO-DC is better by several orders of magnitude for large conjugated systems and by about 1 order of magnitude for other systems. The NOLMO-DC method is thus a promising development of the DC approach for linear scaling calculations.

I. INTRODUCTION

To address complex problems with electronic structure calculations, linear scaling computational methods have been developed.1–13 Within conventional density functional theory (DFT) and Hartree–Fock (HF) calculations based on canonical molecular orbitals (CMOs) that extend over the whole system, the high scaling comes mainly from the matrix diagonalization or the orbital orthogonalization.8,11,14,15 Calculations for large systems therefore become challenging. The goal of theoretical developments has been linear scaling electronic structure methods that are faster and more efficient in parallel implementations.

The key to achieving linear scaling is to bypass the CMOs, which are delocalized in general, and to use density matrices1,3,7,8 or localized molecular orbitals (LMOs).2,4–6,8–11,16

The divide-and-conquer (DC) method developed by Yang was the first linear scaling approach for electronic structure calculations.1,14 Based on a density matrix (DM), it divides the whole system into small subsystems, performs calculations on each subsystem, and then constructs the DM of the whole system from the density matrices of the subsystems. Its original version, based on the electron density,1 was limited to Kohn–Sham DFT with the exchange–correlation energy as an explicit functional of the density because the electron density was used as the basic variable. Later, Yang and Lee developed a formulation based on the one-electron DM, which extended the DC approach to HF, semiempirical MO, and hybrid density functional calculations.17 Moreover, buffer regions were introduced in the DC method so that the accuracy increases with the size of the buffer regions.18 The buffer region is especially important for calculations that include HF exchange interactions because of their long-range behavior.19 For the efficient optimization of molecular structures, and possibly for molecular dynamics, Zhao and Yang developed an approach to compute analytical energy gradients with respect to the nuclear coordinates.20 Nakai and co-workers developed the density matrix-based DC-MP2 method,21 which proposes a scheme for partitioning the correlation energy, based on energy decomposition analysis,22 that eliminates the contribution of the buffer region from the subsystem correlation energy. Other DC post-SCF correlation methods, such as MP2, coupled cluster singles and doubles (CCSD), and CCSD(T), have also been developed.23–26 Further linear scaling correlation energy methods include the pair localization with the fragmentation approach27 and the approach based on local natural orbitals.28 Such methods have achieved fully linear scaling calculations with good accuracy. Recently, Nakai29–32 developed the DC-type density-functional tight-binding (DC-DFTB) method, which can simulate thousands to millions of atoms.

In LMO linear scaling approaches, there are two categories of LMOs: the orthogonal localized molecular orbitals (OLMOs)4,33 and the non-orthogonal localized molecular orbitals (NOLMOs).5,6,34 As the most localized representation of electronic degrees of freedom, NOLMOs are potentially the most efficient for linear scaling electronic structure calculations of large systems. Without the unnecessary constraint of orthogonality,35,36 NOLMOs can get rid of the long-range tails outside the localization area of the orbitals.16,37 Because NOLMOs only include information from the local environment, they should be more transferable than OLMOs from one system to other systems with similar local environments.16,37 Methods based on NOLMOs have been shown to give much better localization of the electronic density and significantly more accurate results than those using OLMOs.9,34,37–39 Yang developed an absolute energy minimum variational principle for carrying out linear scaling calculations with NOLMOs,6,9 which plays an important role in the present work. Paulus and co-workers showed that NOLMOs can be used at the CCSD level, and their results show that the NOLMOs have much improved transferability.39 Anikin and co-workers used a penalty function to enforce non-orthogonality while variationally optimizing the energy of the system.40 Their approach preserves orbital normalization analytically by varying virtual degrees of freedom that are orthogonal to the initial orbitals. To improve the convergence of the NOLMO method beyond the previous work,6,9 we developed an ab initio self-consistent-field method with an effective preconditioning approach that takes into account the diagonal and some important off-diagonal second-order derivatives of the energy with respect to the NOLMO variables.16 The convergence of the energy optimization was shown to be greatly improved and comparable with that of the conventional SCF approach, while achieving linear scaling in computational effort.16

NOLMOs are considered to be the most localized representation of electronic degrees of freedom.37,41 As such, NOLMOs are expected to be more efficient for linear scaling calculations of electronic structures for large systems. Transferability is one of the most useful properties of LMOs: it allows LMOs to be transferred from one molecule or fragment to another in a similar chemical environment. Combined with parallel computing, the transferability of LMOs should make calculations of very large molecular systems possible. One can divide the whole system into small subsystems, calculate the LMOs of each subsystem, and then use these subsystem LMOs to build the LMOs of the whole system. Several groups have tried this kind of approach. Sironi et al.42 tested a simple transfer of extremely localized molecular orbitals, followed by a simple relaxation process. Their results show that the errors in properties such as the electron density distribution and atomic charges were very small, but the errors in the total energy were very large even for a small molecule. Meyer et al.43,44 built libraries of extremely localized molecular orbitals. Their results show that such libraries yield completely acceptable reconstructions of electron distributions.

Up to now, the developments of the DC method and LMO methods have followed different pathways. In addition, DC SCF calculations can encounter convergence problems.19,45,46 The inverse electronic temperature parameter β, which enters the Fermi function used to keep the number of electrons constant, affects not only the total energy but also the SCF convergence. As β increases, the convergence of the total energy can become slow. Even when β is varied during the iterations,46 this problem has not been completely solved. For delocalized systems, the DC buffer region cut-off errors can be large, and a larger buffer region must be adopted to reduce the errors, making the calculations less efficient.45

In this work, we develop a NOLMO-based DC method, denoted as NOLMO-DC. It divides a system into subsystems in physical space, optimizes the NOLMOs instead of the density matrices for each subsystem separately, and then obtains the NOLMOs for the whole system directly from the NOLMOs of the subsystems. The whole DM can thus be constructed directly, without the subsystem partition functions of the original DC. We also develop a new regularized localization approach for obtaining NOLMOs to achieve greater efficiency for delocalized systems. In this approach, the cost function for the localization is a sum of two terms: the spatial spread function, as in the Boys localization method,47 and the kinetic energy, as a regularization to limit the oscillation of the NOLMOs. The weight of the kinetic energy can be optimized with analytical gradients. Because the kinetic energy of the NOLMOs is reduced, the resulting NOLMOs are much smoother and thus more transferable. Compared with the original DC method, the accuracy of NOLMO-DC is better by several orders of magnitude for large conjugated systems and by about 1 order of magnitude for other systems, with similar computational linear scaling costs.

II. METHODOLOGY

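We first review the conventional DC approach.1 The total electron density $\rho(\mathbf{r})$ is decomposed into subsystem densities by using the partition functions $p^{\alpha}(\mathbf{r})$

$$\rho^{\alpha}(\mathbf{r}) = p^{\alpha}(\mathbf{r})\,\rho(\mathbf{r}) \tag{1}$$

where α denotes the subsystems in real space. Because the partition functions are normalized, $\sum_{\alpha} p^{\alpha}(\mathbf{r}) = 1$, the sum of the subsystem densities is just the total electron density, $\sum_{\alpha} \rho^{\alpha}(\mathbf{r}) = \rho(\mathbf{r})$. One can also use the partition function in DM space, in which the electron density can then be expressed as

$$\rho(\mathbf{r}) = \sum_{\mu\nu} P_{\mu\nu}\,\phi_{\mu}(\mathbf{r})\,\phi_{\nu}(\mathbf{r}) \tag{2}$$

where μ and ν are used to denote the AO basis functions $\phi_{\mu}(\mathbf{r})$ and $\phi_{\nu}(\mathbf{r})$, respectively.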

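A simple choice of the partition function in the DM space is

$$p^{\alpha}_{\mu\nu} = \begin{cases} 1, & \text{if } \mu \in \alpha \text{ and } \nu \in \alpha \\ 1/2, & \text{if } (\mu \in \alpha \text{ and } \nu \notin \alpha) \text{ or } (\mu \notin \alpha \text{ and } \nu \in \alpha) \\ 0, & \text{otherwise} \end{cases} \tag{3}$$

where α could be a single atom or fragment.17 Although the value of the partition function $p^{\alpha}_{\mu\nu}$ is not 0 for $\mu \in \alpha$ and $\nu \in \beta$ according to the definition in eq 3, the associated DM elements between two distant subsystems are zero because the corresponding orbital products are zero.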
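The partition matrices of eq 3 can be built in a few lines. The following is a minimal sketch, not the authors' implementation: the input format (a list giving the subsystem that owns each AO) and the function name `partition_matrices` are assumptions made for illustration. It also checks numerically that the matrices sum to 1 for every AO pair, which is the DM-space analogue of the normalization of the real-space partition functions.

```python
import numpy as np

def partition_matrices(ao_to_subsystem):
    """Return one partition matrix p^alpha per subsystem, following eq 3.

    ao_to_subsystem[mu] is the index of the subsystem that owns AO mu
    (a hypothetical input format chosen for this sketch).
    """
    owner = np.asarray(ao_to_subsystem)
    n_sub = int(owner.max()) + 1
    mats = []
    for alpha in range(n_sub):
        in_alpha = (owner == alpha).astype(float)  # 1.0 if the AO belongs to alpha
        # 1 when both AOs are in alpha, 1/2 when exactly one is, 0 otherwise
        mats.append(0.5 * (in_alpha[:, None] + in_alpha[None, :]))
    return mats

# Toy example: 5 AOs distributed over 3 subsystems.
p = partition_matrices([0, 0, 1, 2, 2])
total = sum(p)  # should be a matrix of ones
```

Writing the matrix as the average of a row indicator and a column indicator reproduces the three cases of eq 3 without explicit branching, and it makes the sum rule evident: each AO belongs to exactly one subsystem, so each indicator sums to 1 over all subsystems.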

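While the subsystem matrices $\mathbf{F}^{\alpha}$ are directly obtained from the whole system Fock matrix including both subsystem α and its buffer region α′, the subsystem density matrices and electron densities $\rho^{\alpha}(\mathbf{r})$ are obtained from the eigenvectors of subsystem α by solving the eigenvalue equation

$$\mathbf{F}^{\alpha}\mathbf{C}^{\alpha} = \mathbf{S}^{\alpha}\mathbf{C}^{\alpha}\boldsymbol{\varepsilon}^{\alpha} \tag{4}$$

in the AO basis space including both subsystem α and buffer region α′. The local eigenfunctions can then be expressed as

$$\psi_{i}^{\alpha}(\mathbf{r}) = \sum_{\mu \in (\alpha \cup \alpha')} C^{\alpha}_{\mu i}\,\phi_{\mu}(\mathbf{r}) \tag{5}$$

Because only local AO functions are used to obtain subsystem eigenfunctions, the local DM is approximately constructed from the local eigenfunctions

$$P^{\alpha}_{\mu\nu} = 2\,p^{\alpha}_{\mu\nu} \sum_{i}^{\alpha} f_{\beta}(\varepsilon_{F} - \varepsilon_{i})\,C_{\mu i}\,C^{*}_{\nu i} \tag{6}$$

where $f_{\beta}$ is the Fermi function with inverse temperature β. The subsystem electron density becomes

$$\rho^{\alpha}(\mathbf{r}) = \sum_{\mu\nu} P^{\alpha}_{\mu\nu}\,\phi_{\mu}(\mathbf{r})\,\phi_{\nu}(\mathbf{r}) \tag{7}$$

Fermi level $\varepsilon_{F}$ is set by the electron normalization condition

$$N = \int \rho(\mathbf{r})\,d\mathbf{r} = \sum_{\alpha} \int \rho^{\alpha}(\mathbf{r})\,d\mathbf{r} \tag{8}$$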

Equations 4 to 7 must be solved iteratively to get the converged DC electron density and energy.
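The normalization condition of eq 8 is a one-dimensional root-finding problem for the Fermi level. The sketch below shows one way it could be solved by bisection; it is an illustration under stated assumptions, not the procedure used in the paper. The function names and the `weights` array are hypothetical: each weight stands for the precomputed contribution of one subsystem orbital to the electron count (the partition-weighted overlap sum that results from integrating eq 7), and the weights are assumed non-negative so that the electron count increases monotonically with the Fermi level.

```python
import numpy as np

def fermi(eps, eps_f, beta):
    """Fermi function f_beta(eps_F - eps): near 1 well below eps_F, near 0 well above."""
    x = np.clip(beta * (np.asarray(eps, float) - eps_f), -700.0, 700.0)  # avoid exp overflow
    return 1.0 / (1.0 + np.exp(x))

def electron_count(eps, weights, eps_f, beta):
    """Closed-shell electron count for a trial Fermi level (factor 2 as in eq 6)."""
    return 2.0 * float(np.sum(np.asarray(weights, float) * fermi(eps, eps_f, beta)))

def find_fermi_level(eps, weights, n_elec, beta, tol=1e-12):
    """Bisect on eps_F until the electron count matches n_elec (eq 8)."""
    eps = np.asarray(eps, float)
    pad = 50.0 / beta                      # beyond this margin the occupations are ~0 or ~1
    lo, hi = eps.min() - pad, eps.max() + pad
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if electron_count(eps, weights, mid, beta) < n_elec:
            lo = mid                       # too few electrons: raise the Fermi level
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Toy data: four orbital energies (hartree) with unit weights and four electrons.
eps = [-1.0, -0.5, 0.2, 0.9]
weights = [1.0, 1.0, 1.0, 1.0]
beta = 10.0
eps_f = find_fermi_level(eps, weights, n_elec=4, beta=beta)
```

For large β the electron count becomes almost flat between occupied and virtual levels, so the Fermi level is poorly determined numerically there; this is one way to see why β influences the behavior of the original DC iterations.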

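Instead of calculating the subsystem DM in eq 6, we now explore the use of the LMO from subsystems in the DC approach. In NOLMO-DC, that is, our new method, we determine the NOLMOs of subsystem α by minimizing the following quantity

$$\Theta[\varphi_{k}^{\alpha}] = \langle \varphi_{k}^{\alpha} | (\mathbf{r} - \mathbf{r}_{k}^{0})^{2} | \varphi_{k}^{\alpha} \rangle + \omega \langle \varphi_{k}^{\alpha} | \hat{T} | \varphi_{k}^{\alpha} \rangle \tag{9}$$

where $\varphi_{k}^{\alpha}$ are NOLMOs of subsystem α, $\mathbf{r}_{k}^{0} = (x_{k}^{0}, y_{k}^{0}, z_{k}^{0})$ is the fixed centroid of $\varphi_{k}^{\alpha}$, $\hat{T}$ is the kinetic energy operator, and ω, a positive constant, is the weight of the kinetic energy term. When ω = 0, eq 9 is just the widely used Boys localization cost function. The added term of the kinetic energy is to regularize the solution NOLMO, or to enhance its smoothness, by reducing the kinetic energy of the orbitals. It will be shown that this leads to significant enhancement in computational accuracy for the same buffer region. Because $\varphi_{k}^{\alpha}$ are occupied NOLMOs, they must be the linear combination of occupied CMOs of the subsystem; namely

$$\varphi_{k}^{\alpha} = \sum_{i}^{N_{\alpha}/2} A_{ik}\,\psi_{i}^{\alpha} \tag{10}$$

where $N_{\alpha}$ is the number of electrons in subsystem α, and ψ are CMOs calculated from eqs 4 and 5. $\Theta[\varphi_{k}^{\alpha}]$ can then be rewritten as the function of coefficients $A_{ik}$

$$\Theta[A] = \sum_{i,j} A_{ik}^{*} A_{jk} \left[ \langle \psi_{i}^{\alpha} | (\mathbf{r} - \mathbf{r}_{k}^{0})^{2} | \psi_{j}^{\alpha} \rangle + \omega \langle \psi_{i}^{\alpha} | \hat{T} | \psi_{j}^{\alpha} \rangle \right] = \sum_{i,j} A_{ik} A_{jk} \left[ \tilde{R}_{ij} - 2x_{k}^{0} X_{ij} - 2y_{k}^{0} Y_{ij} - 2z_{k}^{0} Z_{ij} + \tilde{r}_{k}^{0} + \omega T_{ij} \right] \tag{11}$$

where $\tilde{R}_{ij}$ is the integral over the square of the Cartesian vector, $\langle \psi_{i}^{\alpha} | \hat{x}^{2} + \hat{y}^{2} + \hat{z}^{2} | \psi_{j}^{\alpha} \rangle$; $X_{ij}$, $Y_{ij}$, and $Z_{ij}$ are integrals of the individual Cartesian vector components $\hat{x}$, $\hat{y}$, and $\hat{z}$, respectively; and $\tilde{r}_{k}^{0} = x_{k}^{0}x_{k}^{0} + y_{k}^{0}y_{k}^{0} + z_{k}^{0}z_{k}^{0}$ is the square of the Cartesian vector $\mathbf{r}_{k}^{0}$.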

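In order to obtain the optimal value of transformation matrix A, we need the first derivative of $\Theta[A]$ with respect to $A_{ml}$

$$\frac{\partial \Theta}{\partial A_{ml}} = 2 \sum_{j} \left( Q_{mj}^{l} + \omega T_{mj} \right) A_{jl} \tag{12}$$

where $Q_{mj}^{l} = \tilde{R}_{mj} - 2\left( x_{l}^{0} X_{mj} + y_{l}^{0} Y_{mj} + z_{l}^{0} Z_{mj} \right) + \tilde{r}_{l}^{0}$. We can see that different NOLMOs are not coupled. The NOLMOs can be optimized individually. For a given centroid l, finding the best NOLMOs of centroid l is equivalent to solving the following eigenvalue equation in subsystem α

$$\sum_{j} \Theta_{mj}^{l} A_{jl}^{p} = \theta_{p}^{l} A_{ml}^{p} \tag{13}$$

where $\Theta_{mj}^{l} = Q_{mj}^{l} + \omega T_{mj}$. The eigenvector with lowest eigenvalue $\theta_{0}^{l}$ is the best localized NOLMO with centroid l

$$\varphi_{l}^{\alpha} = \sum_{m} A_{ml}^{0}\,\psi_{m}^{\alpha} \tag{14}$$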
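The step from the gradient in eq 12 to the eigenvalue problem in eq 13 follows from minimizing the cost function under the normalization constraint on the expansion coefficients, which introduces the eigenvalue as a Lagrange multiplier. The sketch below illustrates the procedure on a toy model and is not the authors' code: the "CMOs" are the lowest particle-in-a-box states on a one-dimensional grid rather than molecular orbitals, the function names are hypothetical, and the centroid term is multiplied by the identity matrix because these toy CMOs are orthonormal.

```python
import numpy as np

def toy_cmos(n_grid=200, n_occ=6, length=20.0):
    """Lowest particle-in-a-box states on a 1D grid, a toy stand-in for subsystem CMOs."""
    x = np.linspace(-length / 2, length / 2, n_grid)
    h = x[1] - x[0]
    off = np.ones(n_grid - 1)
    # Finite-difference kinetic energy operator -1/2 d^2/dx^2 (atomic units)
    T = (np.eye(n_grid) - 0.5 * np.diag(off, 1) - 0.5 * np.diag(off, -1)) / h**2
    _, vecs = np.linalg.eigh(T)
    return x, T, vecs[:, :n_occ]          # columns are orthonormal grid vectors

def regularized_nolmo(psi, x, T, x0, omega):
    """Lowest eigenvector of Theta = Q + omega*T in the occupied space (1D analogue of eq 13)."""
    n_occ = psi.shape[1]
    R2 = psi.T @ (x[:, None] ** 2 * psi)  # <psi_i| x^2 |psi_j>
    X = psi.T @ (x[:, None] * psi)        # <psi_i| x |psi_j>
    T_occ = psi.T @ T @ psi               # <psi_i| T |psi_j>
    Q = R2 - 2.0 * x0 * X + x0**2 * np.eye(n_occ)
    _, A = np.linalg.eigh(Q + omega * T_occ)
    a0 = A[:, 0]                          # eigenvector with the lowest eigenvalue
    phi = psi @ a0                        # localized orbital on the grid (1D analogue of eq 14)
    return phi, float(a0 @ Q @ a0), float(a0 @ T_occ @ a0)

x, T, psi = toy_cmos()
phi0, spread0, kin0 = regularized_nolmo(psi, x, T, x0=0.0, omega=0.0)  # Boys-like limit
phi1, spread1, kin1 = regularized_nolmo(psi, x, T, x0=0.0, omega=5.0)  # with regularization
```

Two properties hold for any such minimization and can be verified on the toy model: the regularized orbital never has a higher kinetic energy than the ω = 0 orbital, and the ω = 0 orbital never has a larger spread than the regularized one. This is the trade-off between smoothness and compactness that the weight ω controls.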

Once we obtain the subsystem NOLMO with centroid l, and if this NOLMO belongs to the fragment of this subsystem, it is taken as part of the NOLMOs for the full system (for details on how the centroids of NOLMOs are defined, see the Supporting Information). The closed-shell DM of the full system can be directly evaluated from all NOLMOs

$$\hat{\rho} = 2 \sum_{k,l}^{N/2} | \varphi_{k} \rangle \left( \mathbf{S}^{-1} \right)_{kl} \langle \varphi_{l} | \tag{15}$$

where N is the total number of electrons of the full system, and $\mathbf{S}^{-1}$ is the inverse of the molecular orbital overlap matrix $\mathbf{S}$, $S_{ij} = \langle \varphi_{i} | \varphi_{j} \rangle$. Equation 15 indicates that the inverse electronic temperature parameter β of the original DC approach (e.g., eqs 6 and 7) is not needed in our new NOLMO-DC approach. To circumvent the $O(N^{3})$ operation normally associated with the matrix inversion for $\mathbf{S}^{-1}$ and achieve linear scaling, the expression $2\mathbf{X} - \mathbf{X}\mathbf{S}\mathbf{X}$ is used as an approximation of $\mathbf{S}^{-1}$, where the Hermitian matrix $\mathbf{X}$ (symmetric in the present case) is an auxiliary matrix that becomes $\mathbf{S}^{-1}$ at the minimum.6 The DM is calculated from the NOLMOs, which means that the total energy of NOLMO-DC is variational, in contrast to the original DC method. The total energy is a function of the DM operator $\hat{\rho}$

$$E[\hat{\rho}] = T[\hat{\rho}] + J[\hat{\rho}] + \int V(\mathbf{r})\,\hat{\rho}(\mathbf{r})\,d\mathbf{r} + E_{xc}[\hat{\rho}] \tag{16}$$

Equations 4, 5, 13, and 15 are solved iteratively to get the converged NOLMOs, the electron density, and the total energy.
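The construction of the DM from non-orthogonal orbitals (eq 15) and the role of the expression 2X − XSX can be illustrated numerically. The sketch below is an assumption-laden toy, not the method of the paper: the orbital coefficients are random perturbations of unit vectors in an orthonormal AO basis, dense matrices are used instead of sparse ones, and the auxiliary matrix X is obtained with the Newton–Schulz iteration X ← 2X − XSX rather than with the variational minimization of ref 6. The iteration is used here only because it shares the same fixed point, X = S⁻¹.

```python
import numpy as np

rng = np.random.default_rng(0)
n_ao, n_occ = 8, 3
# Columns of C: non-orthogonal occupied orbitals in an orthonormal AO basis (toy data).
C = np.eye(n_ao)[:, :n_occ] + 0.3 * rng.standard_normal((n_ao, n_occ))
S = C.T @ C                                # S_kl = <phi_k|phi_l>

def density_matrix(C, S_inv):
    """Closed-shell DM in the AO basis, P = 2 C S^{-1} C^T (matrix form of eq 15)."""
    return 2.0 * C @ S_inv @ C.T

def newton_schulz_inverse(S, n_iter=40):
    """Iterate X <- 2X - XSX; the fixed point is X = S^{-1}."""
    # Scaled starting guess, a standard choice that guarantees convergence
    X = S.T / (np.linalg.norm(S, 1) * np.linalg.norm(S, np.inf))
    for _ in range(n_iter):
        X = 2.0 * X - X @ S @ X
    return X

P_exact = density_matrix(C, np.linalg.inv(S))
P_iter = density_matrix(C, newton_schulz_inverse(S))
```

The expression 2X − XSX is attractive because its deviation from S⁻¹ equals (S⁻¹ − X) S (S⁻¹ − X), which is second order in the error of X. In the orthonormal AO basis assumed here, the resulting DM satisfies P·P = 2P and its trace equals the number of electrons, so no Fermi function is needed to fix the electron count.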

III. COMPUTATIONAL DETAILS

An illustration of the NOLMO-DC strategy, the division of a system into disjoint fragments in physical space, is given in Figure 1. Taking a hexatriene system containing three CH═CH units as an example, the whole system is divided into three fragments, each of which contains one CH═CH unit (in the solid line box). To improve the description of the subsystem, the neighboring region of the fragment is taken into consideration as a buffer region when calculating the subsystem. Here, we take one CH═CH unit on each side of every fragment as the buffer (in the dashed box but outside the solid line box). The bond between the buffer region and the other region is saturated by a hybrid sp3 orbital. The hybrid orbitals follow what was defined in the FMO method.48 The details on how to construct hybrid orbitals are presented in the Supporting Information. Atomic orbitals in the buffer of the subsystems are included in the construction of the subsystem NOLMOs for a better accuracy. The global NOLMOs are assembled directly from the NOLMOs obtained from subsystem calculations. SPARSKIT is employed for sparse matrix multiplication to achieve linear scaling in CPU time.49

Figure 1.

Schematic illustration of the NOLMO-DC method and the global NOLMOs’ construction.

In testing the methods, all HF and B3LYP DFT calculations with the STO-3G, 6-31G, 6-31G(d,p), and 6-31+G(d,p) basis sets were performed with a modified version of the Gamess US program.50 The buffer type for DC and NOLMO-DC is the so-called “RADSUB” type built into Gamess. With RADSUB and a predefined buffer size Rbf, a sphere of radius Rbf is placed on each atom of a fragment. If an atom of another fragment lies inside any of these spheres, the whole fragment containing that atom is treated as buffer. This type of buffer avoids cutting bonds within a fragment. The test systems are water clusters taken from the crystal structure of ice, polyglycine (n = 20) in alpha-helix and beta-sheet configurations, and a polyacetylene [H–(C═C)n–H, n = 20].
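The radius-based buffer selection described above can be sketched as follows. This is an illustration of the geometric rule only, written from the description in the text; it does not reproduce the Gamess implementation, and the function name `radius_buffer` and its input format are hypothetical.

```python
import numpy as np

def radius_buffer(coords, frag_of_atom, central, r_bf):
    """Return the fragments that act as buffer for fragment `central`.

    A fragment joins the buffer as a whole if any of its atoms lies within
    r_bf of any atom of the central fragment, so no fragment is ever split.
    """
    coords = np.asarray(coords, float)
    frag = np.asarray(frag_of_atom)
    inside = coords[frag == central]
    # Distance from every atom to every atom of the central fragment
    d = np.linalg.norm(coords[:, None, :] - inside[None, :, :], axis=2)
    hit = (d <= r_bf).any(axis=1)
    return sorted(set(frag[hit].tolist()) - {central})

# Toy chain: 10 atoms on a line, 1.4 Å apart, grouped into 5 two-atom fragments.
coords = [[1.4 * i, 0.0, 0.0] for i in range(10)]
frag_of_atom = [i // 2 for i in range(10)]
small = radius_buffer(coords, frag_of_atom, central=2, r_bf=3.0)
large = radius_buffer(coords, frag_of_atom, central=2, r_bf=4.5)
```

In this toy chain the smaller radius reaches only the nearest-neighbor fragments, while the larger radius also pulls in the next fragments on each side, which mirrors how increasing Rbf enlarges the subsystem in whole-fragment steps.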

IV. RESULTS

Figure 2 depicts the accuracy in the total energy of NOLMO-DC and DC, plotted on a log scale versus the buffer size, with and without the kinetic energy included in the localization cost function, for a conjugated system, polyacetylene [H–(C═C)n–H, n = 20], at the B3LYP/6-31G(d,p) and HF/6-31G(d,p) levels. With a small buffer size such as 4.0 Å, the accuracy of NOLMO-DC with the optimal ω at the HF level is higher than that of DC by 1 order of magnitude. The accuracy of NOLMO-DC is about $10^{-3}$ a.u. for a system of 40 carbon atoms. DC reaches the same level only with a much larger buffer size of 14.0 Å. With a buffer size larger than 9.0 Å, the accuracy of NOLMO-DC is higher than that of DC by more than 3 orders of magnitude. The accuracy of NOLMO-DC is around $10^{-6}$ a.u. when the buffer size is around 11.5 Å, while DC needs a buffer size larger than 24.0 Å to achieve the same accuracy. These results show that NOLMO-DC needs a buffer size of only about one-third of that of the DC method to achieve a similar accuracy level. Thus, NOLMO-DC is significantly more accurate than DC. As one can see from Figure 2, the accuracy of NOLMO-DC at the HF level with the optimal ω is improved by 1–3 orders of magnitude compared to ω = 0, while at the DFT level, it is improved by 0.5–1.5 orders of magnitude. The energy is much more accurate with the optimal ω than with ω = 0, which indicates that including the kinetic energy gives NOLMOs with better transferability.

Figure 2.

Accuracy of NOLMO-DC and DC tested on polyacetylene [H–(C═C)n–H, n = 20] at B3LYP/6-31G(d,p) and HF/6-31G(d,p) levels for different values of ω and buffer sizes. ω is the weight of the kinetic energy of the cost function, while opt ω is the optimal value of ω determined by optimization with the analytical gradients. dE is the absolute difference between conventional energy and the DC or NOLMO-DC energy.

Figure 3 depicts the real-space representation of a π NOLMO distribution with different weights of the kinetic energy. From Figure 3, one can see that without the kinetic energy (i.e., ω = 0) included in the cost function, the oscillation of the NOLMO persists almost in the whole system, leading to poor localization. When the weight of the kinetic energy becomes larger, the orbital becomes less oscillatory and thus more localized.

Figure 3.

Real-space representation of a π NOLMO with different weights of the kinetic energy, tested on polyacetylene [H–(C═C)n–H, n = 20] at the B3LYP/6-31G level; the buffer size is 19.0 Å, and the isosurface value is $10^{-5}$ e/bohr$^{3}$. ω is the weight of the kinetic energy of the cost function, while opt ω is the optimal value of ω determined by optimization with the analytical gradient. The effect of the regularization is evident: it greatly reduces the oscillation of the orbitals.

Figure 4 depicts the distribution of the NOLMO coefficients of the $p_z$ atomic orbitals on different carbon atoms with different weights of the kinetic energy. From Figure 4, one can see that without the kinetic energy included, the coefficient distribution curve is oscillatory, and it decays slowly. As the weight of the kinetic energy increases, the curve becomes smoother, and the number of stationary points decreases. Both the width of the main peak and the number of stationary points affect the decay rate of the coefficients.

Figure 4.

Molecular orbital coefficient distribution of the $p_z$ orbital with different values of ω from 0 to 1400, tested on polyacetylene [H–(C═C)n–H, n = 20] with a buffer size of 9.0 Å at the B3LYP/STO-3G level. The origin of the X-axis is the centroid of the bond, the X-axis represents the distance (in Å) of neighboring atoms from the centroid, and the Y-axis represents the molecular orbital coefficients.

As shown in Figure 4, with a small ω, the molecular orbital coefficient decreases from the center of the orbital to the edges in an oscillatory manner. With increasing values of ω, the coefficient decreases quickly and without oscillation, which means that the molecular orbital becomes smoother. Smoothness is a very important feature for NOLMOs: with reduced oscillation of the NOLMO, the buffer size can be smaller.

However, as the weight of the kinetic energy increases, the orbitals become more diffuse, as shown in Figure 3 and in the Supporting Information. The limit of infinite weight is easy to understand: the optimized orbital is then the eigenfunction of the kinetic energy operator with the lowest eigenvalue, and it is delocalized over the whole space. Thus, the weight of the kinetic energy term should not be too large. The weight can be determined by minimizing the total energy for a given buffer size. With a suitable kinetic energy weight in the localization procedure, the resulting NOLMOs become more compact. More NOLMOs in real space at different isosurface values are shown in the Supporting Information. With an optimized weight of the kinetic energy, one obtains the most compact NOLMO. This is the main reason why the total energy calculated by NOLMO-DC with the optimal kinetic energy weight as regularization is more accurate than that calculated without the kinetic energy in the cost function.

Figure 5 depicts the accuracy, plotted on a log scale, at different buffer sizes with different weights of the kinetic energy. The accuracy first increases with the weight and then decreases. The weight at which the error is smallest differs between buffer sizes, which indicates that the optimal weight depends on the buffer size. Figure 5 also shows that with an optimal weight, the accuracy of NOLMO-DC is about 1 order of magnitude higher than that of NOLMO-DC without the kinetic energy. This clearly shows that the kinetic energy in the localization procedure is very important for the improved accuracy, especially for conjugated systems.

Figure 5.

Accuracy of the NOLMO-DC method with different weights of kinetic energy (ω) at different buffer sizes (Rbf). dE denotes the energy difference between the NOLMO-DC method and the conventional one, tested on polyacetylene [H–(C═C)n–H, n = 20] at the B3LYP/6-31G level.

In a non-conjugated system, the kinetic energy is shown to be unnecessary for the localization procedure (see the Supporting Information). However, the accuracy of NOLMO-DC is still better than that of DC. Figure 6 shows that in a three-dimensional water cluster, with the same buffer size, the accuracy in the total energy of NOLMO-DC (plotted on a log scale) is still about 1.5 orders of magnitude better than that of DC. This result indicates that the NOLMO-DC method can be applied to general systems.

Figure 6.

Accuracy of the NOLMO-DC method at different buffer sizes (Rbf), tested on a three-dimensional water system with 104 water molecules at the HF/6-31+G(d,p) level. dE denotes the absolute energy difference between the DC or NOLMO-DC energy and the conventional one. The geometry of the water system is taken from the data bank for crystalline ice, with a radius of 9.2 Å. The DC calculation with a buffer size of Rbf = 2.9 Å did not converge.

Figures 7 and 8 show the SCF CPU time of the NOLMO-DC and DC methods with different buffer sizes, tested on a linear conjugated system and a three-dimensional system. For the linear conjugated system, the SCF CPU time of the NOLMO-DC method is slightly higher than that of the original DC method because of the cost of optimizing ω. For the three-dimensional non-conjugated system, however, the SCF CPU times of the NOLMO-DC method and the original DC method are comparable.

Figure 7.

CPU time per iteration of the NOLMO-DC and DC methods with different buffer sizes, tested on polyacetylene [H–(C═C)n–H, n = 100] at the HF/6-31G level. The tests were performed on a machine with a single CPU.

Figure 8.

CPU time per iteration of the NOLMO-DC and DC methods with different buffer sizes, tested on a three-dimensional water system with 104 water molecules at the HF/6-31G(d,p) level. The geometry of the water system is taken from the data bank for crystalline ice, with a radius of 9.2 Å. The tests were performed on a machine with a single CPU.

Figure 9 compares the scaling of the SCF CPU time of the NOLMO-DC and DC methods, tested on polyacetylene [H–(C═C)n–H] at the HF/6-31G level. To reach a similar accuracy in the total energy, a buffer size of 9.0 Å is used for NOLMO-DC, while a buffer size of 29 Å is required for DC. This clearly shows that NOLMO-DC is more efficient than the original DC method.

Figure 9.

Comparison of the scaling of the SCF CPU time of the NOLMO-DC versus DC method, tested on polyacetylene [H–(C═C)n–H] at the HF/6-31G level. For the same accuracy, a buffer size of 9.0 Å is used for NOLMO-DC, while a buffer size of 29 Å is required for DC. The tests were performed on a machine with a single CPU. A y = x line is plotted in the inset to compare the efficiency of NOLMO-DC and DC.

V. CONCLUSIONS

In this work, the DC approach based on NOLMOs is developed for linear scaling calculations of large molecular systems. The accuracy of the NOLMO-DC approach is improved by several orders of magnitude with respect to DC for large conjugated systems. The NOLMO-DC method has several advantages. (1) One can form the DM of the whole system from the NOLMOs obtained directly from subsystems, avoiding further localization and the use of the spatial partition function. (2) The calculation of $\mathbf{S}^{-1}$ needed to form the density of the whole system can achieve linear scaling by introducing a second minimization, for the matrix inverse $\mathbf{S}^{-1}$, in the NOLMO method. (3) The NOLMO-DC method does not need β to keep the number of electrons constant.

From the perspective of the DC method, the use of NOLMOs in the NOLMO-DC method from each subsystem calculation extracts directly local electronic structure information better than the DM of the subsystem. The total electron number normalization is replaced by more stringent idempotency and normalization conditions of the total DM. The use of the regularized NOLMOs, newly developed in this work, further enhances the transferability, which is needed in transferring the NOLMOs from subsystems to the whole system. This leads to better accuracy and also the satisfaction of the variational principle for the total energy. From the perspective of the NOLMO approach, the most expensive procedure of the optimization of NOLMOs for the whole systems has been replaced by the calculation of NOLMOs from subsystems. The accuracy of such subsystem NOLMOs can be systematically improved with the use of increasing size of the buffer regions. Thus, the NOLMO-DC method combines the appealing features of the DC and LMO linear scaling methods. Test results on accuracy demonstrate that it is a promising method for addressing large complex problems in electronic structure calculations.

For regular chemical systems, which make up most applications, the NOLMO-DC method outperformed the original DC method, as documented in detail in the present work, improving the accuracy by more than 1 order of magnitude with the same buffer size (and similar computational costs). This clearly demonstrates the power and potential of NOLMO-DC. For systems with delocalized charges and spins, specifying the number of electrons and the corresponding number of NOLMOs for each subsystem may be challenging. However, we believe that this challenge could be overcome with an alternative formulation for the NOLMO developed in ref 6, namely, using more NOLMOs than the number of occupied orbitals. In this way, the precise number of NOLMOs in each subsystem does not need to be specified. This will be implemented in future work.

ACKNOWLEDGMENTS

The authors are grateful for financial support from the National Key R&D Program of China (grant no. 2017YFB0203403). This work was also supported by the National Natural Science Foundation of China (grant no. 21673085) and the Guangdong–Hong Kong Technology Cooperation Funding Scheme (grant no. 2017A050506048). W.Y. was partially supported by the National Institutes of Health (award no. R01-GM061870).

Footnotes

The authors declare no competing financial interest.

ASSOCIATED CONTENT

Supporting Information

The Supporting Information is available free of charge at https://pubs.acs.org/doi/10.1021/acs.jctc.2c00142.

Detailed description of hybrid orbitals for bonded systems, centroids of NOLMOs, method for optimization of X, analytical gradient for the weight of kinetic energy, real-space distribution with different values of ω, influence of the kinetic energy on non-conjugated systems, accuracy tests for bonded systems, and all the data for figures (PDF)

Complete contact information is available at: https://pubs.acs.org/10.1021/acs.jctc.2c00142

Contributor Information

Liang Peng, Key Laboratory of Theoretical Chemistry of Environment, Ministry of Education; School of Environment, South China Normal University, Guangzhou 510006, People’s Republic of China.

Daoling Peng, Key Laboratory of Theoretical Chemistry of Environment, Ministry of Education; School of Environment, South China Normal University, Guangzhou 510006, People’s Republic of China.

Feng Long Gu, Key Laboratory of Theoretical Chemistry of Environment, Ministry of Education; School of Environment, South China Normal University, Guangzhou 510006, People’s Republic of China.

Weitao Yang, Department of Chemistry, Duke University, Durham, North Carolina 27708-0346, United States.

REFERENCES

  • (1) Yang W Phys. Rev. Lett. 1991, 66, 1438–1441.
  • (2) Galli G; Parrinello M Phys. Rev. Lett. 1992, 69, 3547–3550.
  • (3) Li X-P; Nunes RW; Vanderbilt D Phys. Rev. B 1993, 47, 10891–10894.
  • (4) Mauri F; Galli G; Car R Phys. Rev. B 1993, 47, 9973–9976.
  • (5) Hierse W; Stechel EB Phys. Rev. B 1994, 50, 17811–17819.
  • (6) Yang W Phys. Rev. B 1997, 56, 9294–9297.
  • (7) Challacombe M J. Chem. Phys. 1999, 110, 2332–2342.
  • (8) Goedecker S Rev. Mod. Phys. 1999, 71, 1085–1123.
  • (9) Burger SK; Yang W J. Phys.: Condens. Matter 2008, 20, 294209.
  • (10) Bowler DR; Miyazaki T Rep. Prog. Phys. 2012, 75, 036503.
  • (11) Jensen F Introduction to Computational Chemistry; John Wiley & Sons, 2017; pp 182–183.
  • (12) Martin RM Electronic Structure: Basic Theory and Practical Methods; Cambridge University Press, 2004.
  • (13) Yang WT; Perez-Jorda JM In Encyclopedia of Computational Chemistry; Schleyer P. v. R., Ed.; John Wiley & Sons: New York, 1998; pp 1496–1513.
  • (14) Parr RG; Yang W Annu. Rev. Phys. Chem. 1995, 46, 701–728.
  • (15) Sałek P; Høst S; Thøgersen L; Jørgensen P; Manninen P; Olsen J; Jansík B; Reine S; Pawłowski F; Tellgren E J. Chem. Phys. 2007, 126, 114110.
  • (16) Peng L; Gu FL; Yang W Phys. Chem. Chem. Phys. 2013, 15, 15518–15527.
  • (17) Yang W; Lee TS J. Chem. Phys. 1995, 103, 5674–5678.
  • (18) Yang W Phys. Rev. A 1991, 44, 7823–7826.
  • (19) Akama T; Kobayashi M; Nakai H J. Comput. Chem. 2007, 28, 2003–2012.
  • (20) Zhao Q; Yang W J. Chem. Phys. 1995, 102, 9598–9603.
  • (21) Kobayashi M; Imamura Y; Nakai H J. Chem. Phys. 2007, 127, 074103.
  • (22) Nakai H Chem. Phys. Lett. 2002, 363, 73–79.
  • (23) Kobayashi M; Akama T; Nakai H J. Chem. Phys. 2006, 125, 204106.
  • (24) Kobayashi M; Nakai H J. Chem. Phys. 2008, 129, 044103.
  • (25) Kobayashi M; Nakai H Int. J. Quantum Chem. 2009, 109, 2227–2237.
  • (26) Kobayashi M; Nakai H J. Chem. Phys. 2009, 131, 114108.
  • (27) Guo Y; Becker U; Neese F J. Chem. Phys. 2018, 148, 124117.
  • (28) Rolik Z; Szegedy L; Ladjánszki I; Ladóczki B; Kállay M J. Chem. Phys. 2013, 139, 094105.
  • (29) Nishizawa H; Nishimura Y; Kobayashi M; Irle S; Nakai H J. Comput. Chem. 2016, 37, 1983–1992.
  • (30) Sakti AW; Nishimura Y; Nakai H J. Phys. Chem. B 2017, 121, 1362–1371.
  • (31) Nakai H; Sakti AW; Nishimura Y J. Phys. Chem. B 2016, 120, 217–221.
  • (32) Nishimura Y; Nakai H J. Comput. Chem. 2019, 40, 1538–1549.
  • (33) Kim J; Mauri F; Galli G Phys. Rev. B 1995, 52, 1640–1648.
  • (34) Liu S; Pérez-Jordá JM; Yang W J. Chem. Phys. 2000, 112, 1634–1644.
  • (35) Payne PW J. Am. Chem. Soc. 1977, 99, 3787–3794.
  • (36) Mortensen JJ; Parrinello M J. Phys.: Condens. Matter 2001, 13, 5731.
  • (37) Feng H; Bian J; Li L; Yang W J. Chem. Phys. 2004, 120, 9458–9466.
  • (38) Krol MC; Altona C Mol. Phys. 1991, 72, 375–393.
  • (39) Paulus B; Rościszewski K; Stoll H; Birkenheuer U Phys. Chem. Chem. Phys. 2003, 5, 5523–5529.
  • (40) Anikin NA; Anisimov VM; Bugaenko VL; Bobrikov VV; Andreyev AM J. Chem. Phys. 2004, 121, 1266–1270.
  • (41) Cui G; Fang W; Yang W J. Phys. Chem. A 2010, 114, 8878–8883.
  • (42) Sironi M; Famulari A; Raimondi M; Chiesa S J. Mol. Struct.: THEOCHEM 2000, 529, 47–54.
  • (43) Meyer B; Guillot B; Ruiz-Lopez MF; Genoni A J. Chem. Theory Comput. 2016, 12, 1052–1067.
  • (44) Meyer B; Guillot B; Ruiz-Lopez MF; Jelsch C; Genoni A J. Chem. Theory Comput. 2016, 12, 1068–1081.
  • (45) Akama T; Fujii A; Kobayashi M; Nakai H Mol. Phys. 2007, 105, 2799–2804.
  • (46) Akama T; Kobayashi M; Nakai H Int. J. Quantum Chem. 2009, 109, 2706–2713.
  • (47) Boys SF Rev. Mod. Phys. 1960, 32, 296.
  • (48) Fedorov DG; Kitaura K Modern Methods for Theoretical Physical Chemistry of Biopolymers; Elsevier, 2006; pp 3–38.
  • (49) Saad Y SPARSKIT: A Basic Tool Kit for Sparse Matrix Computations (Version 2); SPARSKIT, 1994. https://wwwusers.cse.umn.edu/~saad/software/SPARSKIT/.
  • (50) Schmidt MW; Baldridge KK; Boatz JA; Elbert ST; Gordon MS; Jensen JH; Koseki S; Matsunaga N; Nguyen KA; Su S; Windus TL; Dupuis M; Montgomery JA Jr J. Comput. Chem. 1993, 14, 1347–1363.
