Abstract
Neural-network quantum states have been successfully used to study a variety of lattice and continuous-space problems. Despite a great deal of general methodological development, representing fermionic matter is, however, still an early research activity. Here we present an extension of neural-network quantum states to model interacting fermionic problems. Borrowing techniques from quantum simulation, we directly map fermionic degrees of freedom to spin ones, and then use neural-network quantum states to perform electronic structure calculations. For several diatomic molecules in a minimal basis set, we benchmark our approach against widely used coupled cluster methods, as well as many-body variational states. On some test molecules, we systematically improve upon coupled cluster methods and Jastrow wave functions, reaching chemical accuracy or better. Finally, we discuss routes for future developments and improvements of the methods presented.
Subject terms: Quantum chemistry, Quantum simulation
Despite the importance of neural-network quantum states, representing fermionic matter with them is yet to be fully achieved. Here the authors map fermionic degrees of freedom to spin ones and use neural networks to perform electronic structure calculations on model diatomic molecules, achieving chemical accuracy.
Introduction
Predicting the physical and chemical properties of matter from the fundamental principles of quantum mechanics is a central problem in modern electronic structure theory. In the context of ab-initio quantum chemistry (QC), a commonly adopted strategy to solve for the electronic wave-function is to discretize the problem on a finite set of basis functions, expanding the full many-body state in a basis of anti-symmetric Slater determinants. Because of the factorial scaling of the determinant space, exact approaches systematically considering all electronic configurations, such as the full configuration interaction (FCI) method, are typically restricted to small molecules and basis sets. A solution routinely adopted in the field is to consider systematic corrections over mean-field states. For example, in the framework of the coupled cluster (CC) method1,2, higher levels of accuracy can be obtained by considering electronic excitations up to doubles, in CCSD, and perturbative triples, in CCSD(T). CC techniques are routinely adopted in QC electronic calculations, and they are often considered the “gold standard” in ab-initio electronic structure. Despite this success, the accuracy of CC is intrinsically limited in the presence of strong quantum correlations, in turn restricting the applicability of the method to regimes of relatively weak correlation.
For strongly correlated molecules and materials, alternative, non-perturbative approaches have been introduced. Most notably, both stochastic and non-stochastic methods based on variational representations of many-body wave-functions have been developed and constantly improved over the past decades of research. Notable variational classes for QC are Jastrow–Slater wave-functions3, correlated geminal wave-functions4, and matrix product states5–7. Stochastic projection methods that systematically improve upon variational starting points are, for example, fixed-node Green’s function Monte Carlo8 and constrained-path auxiliary-field Monte Carlo9. The main limitations of these methods stem, directly or indirectly, from the choice of the variational form. For example, matrix product states are extremely efficient for quasi-one-dimensional systems, but suffer from exponential scaling when applied to higher dimensions. On the other hand, the variational forms considered so far for higher-dimensional systems typically rely on rigid variational classes and do not provide a systematic and computationally efficient way to increase their expressive power.
To help overcome some of the limitations of existing variational representations, ideas leveraging the power of artificial neural networks (ANN) have recently emerged in the more general context of interacting many-body quantum matter. These approaches are typically based on compact, variational parameterizations of the many-body wave-function in terms of an ANN10. Applications to fermionic problems are, however, comparatively less explored than for lattice spin systems. Two conceptually different implementations have been put forward. In the first, the fermionic symmetry is encoded directly at the mean-field level, and ANNs are used as a positive-definite correlator function11. The main limitation of this ansatz is that the nodal structure of the wave function is fixed, and the exact ground state cannot, in principle, be reached, even in the limit of an infinitely large ANN. The second approach is to use ANNs to indirectly parameterize and modify the fermionic nodal structure12–15. In this spirit, “backflow” variational wave functions16,17 with flexible symmetric orbitals have been introduced13,14, and only very recently applied to electronic structure18,19.
In this article, we provide an alternative representation of fermionic many-body quantum systems based on a direct encoding of electronic configurations. This is achieved by mapping the fermionic problem onto an equivalent spin problem, and then solving the latter with spin-based neural-network quantum states. Using techniques from quantum information, we analyze different model-agnostic fermion-to-spin mappings. We show results for several diatomic molecules in minimal Gaussian basis sets, where our approach reaches chemical accuracy (<1 kcal/mol) or better. The current challenges in extending the method to larger basis sets and molecules are also discussed.
Results
Electronic structure on spin systems
We consider many-body molecular fermionic Hamiltonians in second quantization formalism,
$$H = \sum_{ij} t_{ij}\, c^{\dagger}_i c_j + \sum_{ijkm} u_{ijkm}\, c^{\dagger}_i c^{\dagger}_k c_m c_j, \tag{1}$$
where we have defined fermionic annihilation and creation operators ci and ci†, obeying the anticommutation relation {ci, cj†} = δij, on N fermionic modes, and one- and two-body integrals tij and uijkm. The Hamiltonian in Eq. (1) can be mapped to interacting spin models via the Jordan–Wigner20 mapping, or the more recent parity or Bravyi–Kitaev21 encodings, which have been developed in the context of quantum simulations. These three encodings can all be expressed in the compact form
$$c^{\dagger}_j = \frac{1}{2}\Bigg[\prod_{i\in U(j)\setminus\{j\}} \sigma^x_i\Bigg]\Bigg(\sigma^x_j \prod_{i\in P(j)} \sigma^z_i \;-\; \mathrm{i}\,\sigma^y_j \prod_{i\in R(j)} \sigma^z_i\Bigg), \tag{2}$$
where we have defined an update U(j), parity P(j), and remainder R(j) set of spins, which depend on the particular mapping considered22,23, and σxi, σyi, σzi denote Pauli matrices acting on site i. In the familiar case of the Jordan–Wigner transformation, the update, parity, and remainder sets become U(j) = {j}, P(j) = {0, 1, ... j − 1}, R(j) = P(j), and the mapping takes the simple form
$$c^{\dagger}_j = \sigma^{-}_j \prod_{i=0}^{j-1} \sigma^z_i, \qquad c_j = \sigma^{+}_j \prod_{i=0}^{j-1} \sigma^z_i, \tag{3}$$
where σ±j = (σxj ± iσyj)/2. For all the spin encodings considered, the final outcome is a spin Hamiltonian with the general form
$$H_q = \sum_{j} h_j\, \sigma_j, \tag{4}$$
defined as a linear combination with real coefficients hj of σj, N-fold tensor products of single-qubit Pauli operators I, σx, σy, σz. Additionally, under such mappings, there is a one-to-one correspondence between spin configurations ∣σz1, …, σzN⟩ and the original particle occupations ∣n1, …, nN⟩. In the following, we will consider the interacting spin Hamiltonian in Eq. (4) as the starting point for our variational treatment.
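To make the mapping explicit, the short sketch below (in Python, with illustrative names; the production calculations used the quantum-simulation tools cited in the Acknowledgements) builds the Jordan–Wigner strings of Eq. (3) and multiplies them, so that a hopping term such as c2†c0 becomes a small sum of Pauli strings of the form entering Eq. (4).

```python
# Minimal sketch of the Jordan-Wigner mapping of Eq. (3); illustrative only,
# independent of the quantum-simulation toolkit used for the actual calculations.
# A Pauli string on N sites is stored as a pair (coefficient, "ZZX...I").

_PAULI_PRODUCT = {
    ("I", "I"): (1, "I"), ("I", "X"): (1, "X"), ("I", "Y"): (1, "Y"), ("I", "Z"): (1, "Z"),
    ("X", "I"): (1, "X"), ("X", "X"): (1, "I"), ("X", "Y"): (1j, "Z"), ("X", "Z"): (-1j, "Y"),
    ("Y", "I"): (1, "Y"), ("Y", "X"): (-1j, "Z"), ("Y", "Y"): (1, "I"), ("Y", "Z"): (1j, "X"),
    ("Z", "I"): (1, "Z"), ("Z", "X"): (1j, "Y"), ("Z", "Y"): (-1j, "X"), ("Z", "Z"): (1, "I"),
}

def jw_ladder(j, n_modes, dagger):
    """Return c_j (dagger=False) or c_j^dag (dagger=True) as [(coeff, string), ...]."""
    z_prefix, identities = "Z" * j, "I" * (n_modes - j - 1)
    sign = -1j if dagger else 1j          # c_j^dag -> sigma^-, c_j -> sigma^+ (Eq. (3))
    return [(0.5, z_prefix + "X" + identities), (0.5 * sign, z_prefix + "Y" + identities)]

def multiply(term_a, term_b):
    """Site-by-site product of two Pauli strings, tracking the accumulated phase."""
    (coeff_a, string_a), (coeff_b, string_b) = term_a, term_b
    coeff, out = coeff_a * coeff_b, []
    for pauli_a, pauli_b in zip(string_a, string_b):
        phase, pauli = _PAULI_PRODUCT[(pauli_a, pauli_b)]
        coeff *= phase
        out.append(pauli)
    return coeff, "".join(out)

# Example: the hopping term c_2^dag c_0 on N = 4 modes becomes four Pauli strings,
# with X/Y on sites 0 and 2 and a parity Z on the intermediate site 1, as in Eq. (4).
if __name__ == "__main__":
    for term_a in jw_ladder(2, 4, dagger=True):
        for term_b in jw_ladder(0, 4, dagger=False):
            print(multiply(term_a, term_b))
```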
Neural-network quantum states
Once the mapping is performed, we use neural-network quantum states (NQS), introduced in ref. 10, to parametrize the ground state of the Hamiltonian in Eq. (4). A conceptually appealing feature of NQS is that, because of the flexibility of the underlying non-linear parameterization, they can be adopted to study both equilibrium24,25 and out-of-equilibrium26–31 properties of diverse many-body quantum systems. In this work, we adopt a simple neural-network parameterization in terms of a complex-valued, shallow restricted Boltzmann machine (RBM)10,32. For a system of N spins, the many-body amplitudes take the compact form
$$\Psi(\sigma^z_1,\dots,\sigma^z_N) = \sum_{\{h_i = \pm 1\}} e^{\sum_j a_j \sigma^z_j + \sum_i b_i h_i + \sum_{ij} W_{ij} h_i \sigma^z_j} \tag{5}$$
$$= e^{\sum_j a_j \sigma^z_j}\, \prod_{i=1}^{M} 2\cosh\!\Big(b_i + \sum_j W_{ij} \sigma^z_j\Big). \tag{6}$$
Here, the hidden variables hi = ±1 are summed (traced) out exactly in Eq. (6), aj, bi, and Wij are complex-valued network parameters, and the expressivity of the network is determined by the hidden-unit density α = M/N, where M is the number of hidden units. The simple RBM ansatz can efficiently support volume-law entanglement33–36, and it has been recently used in several applications37.
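As a concrete illustration, a minimal NumPy sketch of the amplitude of Eqs. (5)–(6) in log form is given below; the parameter names mirror the equations, while the initialization scale and function names are our own illustrative choices.

```python
import numpy as np

# A minimal sketch of the RBM amplitude of Eqs. (5)-(6), using the log form
#   log Psi(sigma) = sum_j a_j sigma_j + sum_i log[2 cosh(b_i + sum_j W_ij sigma_j)],
# with sigma_j = +/-1 and complex parameters a, b, W.

rng = np.random.default_rng(0)

def init_rbm(n_visible, alpha=1, scale=0.05):
    """Random complex parameters; alpha = M / N is the hidden-unit density."""
    n_hidden = alpha * n_visible
    randc = lambda *shape: rng.normal(0.0, scale, shape) + 1j * rng.normal(0.0, scale, shape)
    return {"a": randc(n_visible), "b": randc(n_hidden), "W": randc(n_hidden, n_visible)}

def log_psi(sigma, params):
    """Logarithm of the (unnormalized) amplitude of a configuration of +/-1 spins."""
    theta = params["b"] + params["W"] @ sigma
    return params["a"] @ sigma + np.sum(np.log(2.0 * np.cosh(theta)))

# Example: amplitude of a random 20-spin configuration at hidden-unit density alpha = 2.
if __name__ == "__main__":
    N = 20
    params = init_rbm(N, alpha=2)
    sigma = rng.choice([-1.0, 1.0], size=N)
    print(np.exp(log_psi(sigma, params)))
```

Working with log-amplitudes rather than the amplitudes themselves avoids numerical overflow for large N, and all quantities used below are manipulated in this form.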
One can then train the ansatz given in Eq. (5) with a variational learning approach known as variational Monte Carlo (VMC), by minimizing the energy expectation value
$$E = \frac{\langle \Psi | H_q | \Psi \rangle}{\langle \Psi | \Psi \rangle}. \tag{7}$$
This expectation value can be evaluated by Monte Carlo sampling, using the fact that the energy (and, analogously, any other observable) can be written as
$$E = \sum_{\sigma} p(\sigma)\, E_{\mathrm{loc}}(\sigma), \qquad p(\sigma) = \frac{|\Psi(\sigma)|^2}{\sum_{\sigma'} |\Psi(\sigma')|^2}, \tag{8}$$
where we have defined the local energy
$$E_{\mathrm{loc}}(\sigma) = \frac{\langle \sigma | H_q | \Psi \rangle}{\langle \sigma | \Psi \rangle} = \sum_{\sigma'} \langle \sigma | H_q | \sigma' \rangle\, \frac{\Psi(\sigma')}{\Psi(\sigma)}. \tag{9}$$
Given samples drawn from the probability distribution p(σ) ∝ ∣Ψ(σ)∣2, the average of the local energy over the samples gives an unbiased estimator of the energy. Note that the computational cost of evaluating the local energy depends largely on the sparsity of the Hamiltonian Hq. In generic QC problems, this cost scales in the worst case with the O(N^4) Pauli terms generated by the two-body integrals, as compared to the linear scaling in typical condensed-matter systems with local interactions.
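A minimal sketch of how the local energy of Eq. (9) can be evaluated for a Pauli-string Hamiltonian is given below; the function and variable names are ours, and any amplitude function, such as the RBM sketch above, can be plugged in.

```python
import numpy as np

# Sketch of the local energy of Eq. (9) for a Hamiltonian given as a list of
# (h_j, pauli_string) pairs, Eq. (4). Any callable log_psi(sigma, params)
# can be supplied, e.g. the RBM sketch above.

def connected_config(sigma, pauli):
    """Return (sigma_prime, <sigma|P|sigma_prime>) for a single Pauli string P.

    A Pauli string connects sigma to exactly one configuration: sites carrying
    X or Y are flipped, sites carrying I or Z are left unchanged.
    """
    sigma_prime = sigma.copy()
    element = 1.0 + 0.0j
    for i, p in enumerate(pauli):
        if p == "Z":
            element *= sigma[i]
        elif p == "X":
            sigma_prime[i] = -sigma[i]
        elif p == "Y":
            sigma_prime[i] = -sigma[i]
            element *= -1j * sigma[i]
    return sigma_prime, element

def local_energy(sigma, hamiltonian, log_psi, params):
    """E_loc(sigma) = sum_j h_j <sigma|P_j|sigma'_j> Psi(sigma'_j) / Psi(sigma)."""
    log_psi_sigma = log_psi(sigma, params)
    e_loc = 0.0 + 0.0j
    for h_j, pauli in hamiltonian:
        sigma_prime, element = connected_config(sigma, pauli)
        e_loc += h_j * element * np.exp(log_psi(sigma_prime, params) - log_psi_sigma)
    return e_loc
```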
Sampling from p(σ) is performed using Markov chain Monte Carlo (MCMC), with a Markov chain constructed using the Metropolis–Hastings algorithm38. Specifically, at each iteration, a configuration σ′ is proposed and accepted with probability
$$A(\sigma \rightarrow \sigma') = \min\!\left(1,\; \left|\frac{\Psi(\sigma')}{\Psi(\sigma)}\right|^2\right). \tag{10}$$
The sample then corresponds to the configurations of the Markov chain downsampled at an interval K, i.e., {σ(0), σ(K), σ(2K), …}. For the simulations done in this work, we typically use K = 10N with a sample size of approximately 100,000.
Since the Hamiltonians we are interested in have an underlying particle-number conservation law, it is helpful to perform this sampling in the particle basis ∣n1, …, nN⟩ rather than in the corresponding spin basis. The proposed configuration at each iteration then corresponds to a particle hopping between orbitals. Once a stochastic estimate of the energy is available, together with its derivatives w.r.t. the network parameters, the ansatz can be optimized using the stochastic reconfiguration method39,40, closely related to the natural-gradient method used in machine learning applications10,41.
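The following sketch implements a particle-conserving Metropolis–Hastings sampler of the kind described above; the specific proposal and thinning logic are illustrative assumptions, and the names are ours.

```python
import numpy as np

# Sketch of a Metropolis-Hastings sampler with particle-hopping moves.
# A proposal exchanges the values of one site with sigma = +1 and one site with
# sigma = -1, which conserves the total magnetization and hence, under an
# occupation-number encoding such as Jordan-Wigner, the particle number.

rng = np.random.default_rng(1)

def metropolis_sample(log_psi, params, sigma0, n_samples, thin):
    """Return n_samples configurations, keeping every thin-th state of the chain."""
    sigma = sigma0.copy()
    log_prob = 2.0 * np.real(log_psi(sigma, params))
    samples = []
    for step in range(n_samples * thin):
        up_sites = np.flatnonzero(sigma == 1)
        down_sites = np.flatnonzero(sigma == -1)
        i, j = rng.choice(up_sites), rng.choice(down_sites)
        proposal = sigma.copy()
        proposal[i], proposal[j] = -1, 1                    # hop: exchange the two sites
        log_prob_new = 2.0 * np.real(log_psi(proposal, params))
        if np.log(rng.random()) < log_prob_new - log_prob:  # accept with Eq. (10)
            sigma, log_prob = proposal, log_prob_new
        if (step + 1) % thin == 0:
            samples.append(sigma.copy())
    return np.array(samples)
```

With the RBM sketch above, calling metropolis_sample(log_psi, params, sigma0, n_samples, thin=10 * N) reproduces the downsampling interval K = 10N quoted above; sigma0 must contain the desired number of occupied orbitals.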
Computational complexity
The main computational cost of the procedure arises from the evaluation of the local energy, Eq. (9), on the samples generated. This gives an overall computational complexity of O(Nsamp × Nop × Nvar), where Nvar = MN + M + N is the number of parameters in the network, Nop is the number of Pauli strings in the spin Hamiltonian defined by Eq. (4), and Nsamp is the sample size.
However, as can be seen in Fig. 2, there is only a small number of relevant configurations in the wavefunction, so each sample contains only a few unique configurations. By caching the amplitudes, the computational cost can be significantly reduced to O(Nunique × Nop × Nvar), where Nunique ≪ Nsamp is the average number of unique configurations in each sample. Typically, for a sample size of 10,000 there are only a few hundred unique configurations.
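A sketch of this caching strategy, assuming the samples are collected in a batch and a local-energy routine like the one above is available, could read as follows (names are illustrative):

```python
import numpy as np

def mean_local_energy(samples, hamiltonian, log_psi, params, local_energy):
    """Estimate <E_loc> over a batch, evaluating each unique configuration only once.

    `samples` is an (Nsamp, N) array of spin configurations. Because the molecular
    wave function is sharply peaked, Nunique << Nsamp, so computing the local
    energy only for the unique rows (and weighting by their multiplicity) gives
    a large reduction of the computational cost.
    """
    unique, counts = np.unique(samples, axis=0, return_counts=True)
    energies = np.array([local_energy(sigma, hamiltonian, log_psi, params)
                         for sigma in unique])
    return np.sum(counts * energies) / np.sum(counts)
```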
Fig. 2. Electronic correlations.
Probabilities (in logarithmic scale) of the 500 most probable configurations in the FCI (blue), RBM (orange), and CISD (green) wavefunctions for the equilibrium nitrogen N2 molecule in the STO-3G basis.
Potential energy surfaces
We first consider small molecules in a minimal basis set (STO-3G). We show in Fig. 1 the dissociation curves for C2 and N2, compared to CCSD and CCSD(T). It can be seen that, for these small molecules in their minimal basis, the RBM is able to generate accurate representations of the ground states, remarkably achieving an accuracy better than standard QC methods. To further illustrate the expressiveness of the RBM, we show in Fig. 2 the probability distribution of the most relevant configurations in the wavefunction. We compare the RBM with configuration interaction limited to single and double excitations (CISD). In CISD, the Hilbert space is truncated to include only states which are at most two excitations away from the Hartree–Fock configuration. It is clear from the histogram that the RBM is able to capture correlations beyond double excitations.
Fig. 1. Dissociation profiles.
The accuracy of fermionic neural-network quantum states compared with other quantum chemistry approaches. Shown here are dissociation curves for a C2 and b N2, in the STO-3G basis with 20 spin-orbitals. The RBM used has 40 hidden units, and it is compared to both coupled-cluster approaches (CCSD, CCSD(T)) and FCI energies.
Alternative encodings
The above computations were done using the Jordan–Wigner mapping. To investigate the effect of the mapping choice on the performance of the RBM, we also performed select calculations using the parity and Bravyi–Kitaev mappings. All the aforementioned transformations require a number of spins equal to the number of fermionic modes in the model. However, the support of the Pauli operators wj = ∣σj∣ in Eq. (4), i.e., the number of single-qubit Pauli operators in σj that are different from the identity I, depends on the specific mapping used. Jordan–Wigner and parity mappings have linear scalings wj = O(N), while the Bravyi–Kitaev encoding has a more favorable scaling wj = O(log N), due to the logarithmic spin support of the update, parity, and remainder sets in Eq. (2). Note that one could in principle use generalized superfast mappings42, which have a support scaling as good as wj = O(log d), where d is the maximum degree of the fermionic interaction graph defined by Eq. (1). However, such a mapping is not practical for the models considered here, because the typically large degree of molecular interaction graphs makes the number of spins required for the simulation too large compared to the other model-agnostic mappings.
While these encodings are routinely used as tools to study fermionic problems on quantum hardware43, their use in classical computing has not been systematically explored so far. Since they yield differently structured many-body wave functions, it is worth analyzing whether more local mappings can be beneficial for specific NQS representations. In Fig. 3, we analyze the effect of the different encodings on the accuracy of the variational ground-state energy for a few representative diatomic molecules. At fixed computational resources and network expressivity, we typically find that the RBM ansatz achieves consistent levels of accuracy, independent of the mapping type. While the Jordan–Wigner mapping yields the lowest energies in these examples, the RBM is nonetheless able to efficiently learn the ground state in the other representations as well, and chemical accuracy is achieved in all cases reported in Fig. 3.
Fig. 3. Comparison of different spin mappings.
Accuracy of the RBM (green star) representations for three different mapping types (Jordan–Wigner, Parity, and Bravyi–Kitaev) and three different molecules (LiH, C2, and N2) in their equilibrium configuration in the STO-3G basis. The geometries used are reported in the Methods section.
Sampling larger basis sets
The spin-based simulations of the QC problems studied here show a distinctive MCMC sampling behavior that is not usually found in lattice simulations of pure spin models. Specifically, the ground-state wave function of the diatomic molecules considered is typically sharply peaked around the Hartree–Fock state and neighboring excited states. This behavior is prominently shown also in Fig. 2, where the largest peaks are several orders of magnitude larger than the distribution tail. As a result of this structure, any uniform sampling scheme drawing states from the VMC distribution ∣Ψ(σ)∣2 is bound to repeatedly draw the most dominant states, while only rarely sampling less likely configurations. To exemplify this peculiarity, we study the behavior of the ground-state energy as a function of the number of MCMC samples used at each step of the VMC optimization. We concentrate on the water molecule in the larger 6-31g basis. In this case, the Metropolis sampling scheme exhibits acceptance rates as low as 0.1% or less, as a consequence of the presence of the dominating states previously discussed.
In Fig. 4, we vary the sample size and also compare MCMC sampling with exact sampling. We can see that the accuracy of the simulation depends quite significantly on the sample size. The large number of samples needed in this case, together with a very low acceptance probability for the Metropolis–Hastings algorithm, directly points to the inefficiency of uniform sampling from ∣Ψ(σ)∣2. At present, this represents the most significant bottleneck in the application of our approach to larger molecules and basis sets. This issue, however, is not a fundamental limitation, and alternatives to the standard VMC uniform sampling can be envisioned to efficiently sample the less likely states that are nonetheless important for chemical accuracy. Beyond sampling issues, representability is also a factor, as can be seen from the inset of Fig. 4. Enough hidden units are required to capture the wavefunction accurately; however, with more hidden units the optimization also becomes more challenging, so finding an appropriate network architecture is also crucial.
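For small determinant spaces, exact sampling of the kind used as a reference in Fig. 4 can be mimicked by enumerating the particle-conserving configurations and drawing from the normalized Born distribution; a sketch with illustrative names and conventions is given below.

```python
import itertools
import numpy as np

rng = np.random.default_rng(2)

def exact_sample(log_psi, params, n_orbitals, n_electrons, n_samples):
    """Draw configurations directly from |Psi|^2 in the fixed-particle-number sector.

    Feasible only when the number of determinants C(n_orbitals, n_electrons) is
    small enough to enumerate; useful as a cross-check of the MCMC estimates.
    Here sigma = +1 marks an occupied orbital (a convention choice).
    """
    configs = []
    for occupied in itertools.combinations(range(n_orbitals), n_electrons):
        sigma = -np.ones(n_orbitals)
        sigma[list(occupied)] = 1.0
        configs.append(sigma)
    configs = np.array(configs)
    log_prob = np.array([2.0 * np.real(log_psi(sigma, params)) for sigma in configs])
    prob = np.exp(log_prob - log_prob.max())      # stabilize before normalizing
    prob /= prob.sum()
    indices = rng.choice(len(configs), size=n_samples, p=prob)
    return configs[indices]
```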
Fig. 4. Sampling size dependence of the converged energies.
Converged energy of H2O in the 6-31g basis (26 spin-orbitals) as the number of samples used for each VMC iteration is varied. The converged energy for the samples obtained using the Metropolis algorithm (blue circles) matches that obtained using exact sampling (green crosses), beating the accuracy of CISD and approaching chemical accuracy (red line) for the largest sample size. In the inset, we also show the variational energy as the number of hidden units is increased from 2 to 26.
Discussion
In this work, we have shown that relatively simple shallow neural networks can be used to compactly encode, with high precision, the electronic wave function of model molecular problems in quantum chemistry. Our approach is based on the mapping between the fermionic quantum chemistry molecular Hamiltonian and a corresponding spin Hamiltonian. In turn, the ground state of the resulting spin model can be conveniently modeled with standard variational neural-network quantum states. On model diatomic molecules, we show that an RBM state is able to capture almost the entirety of the electronic correlations, improving on routinely used approaches such as CCSD(T) and the Jastrow ansatz (Table 1).
Table 1.
Equilibrium energies (in Hartree) as obtained by different methods.
Molecule | RBM | Jastrow | CISD | CCSD | CCSD(T) | FCI |
---|---|---|---|---|---|---|
H2 | −1.1373 | −1.1373 | −1.1373 | −1.1373 | −1.1373 | −1.1373 |
LiH | −7.8826 | −7.8814 | −7.8827 | −7.8828 | −7.8828 | −7.8828 |
NH3 | −55.5277 | −55.4770 | −55.5258 | −55.5280 | −55.5281 | −55.5282 |
H2O | −75.0232 | −74.9784 | −75.0221 | −75.0231 | −75.0232 | −75.0233 |
C2 | −74.6892 | −74.5001 | −74.6371 | −74.6745 | −74.6876 | −74.6908 |
N2 | −107.6767 | −107.5924 | −107.6591 | −107.6717 | −107.6738 | −107.6774 |
The basis set considered here is STO-3G, and the corresponding geometries are reported in the Methods section. Energies are reported in Hartree, and the statistical uncertainties on the RBM and Jastrow energies are on the last reported digit. The RBM used has a hidden-unit density α = 1 for all molecules apart from C2 and N2, where we use α = 2.
Several future directions can be envisioned. The distinctive peaked structure of the molecular wave function calls for the development of alternatives to uniform sampling from the Born probability. Such developments would allow larger basis sets than the ones considered here to be handled efficiently. Second, our study has explored only a very limited subset of possible neural-network architectures; most notably, the use of deeper networks might prove beneficial for more complex molecules. Another very interesting matter for future research is the comparison of different neural-network-based approaches to quantum chemistry. Concurrently with this work, approaches based on antisymmetric wave-functions in continuous space have been presented18,19. These have the advantage that they already work in the complete basis set limit. However, the discrete-basis approach has the advantage that boundary conditions and fermionic symmetry are much more easily enforced. As a consequence, simple shallow networks can already achieve comparatively higher accuracy than the deeper and substantially more complex networks so far adopted in the continuum case. On a different note, a recent article44 has proposed a unitary-coupled RBM suitable for noisy intermediate-scale quantum devices, which is also worth exploring.
Methods
Geometries for diatomic molecules
The equilibrium geometries for the molecules presented in this work were obtained from the CCCBDB database45. For convenience, we present them in Table 2.
Table 2.
Equilibrium configurations used for the ground-state calculations presented in the main text. The coordinates (x, y, z) are given in angstroms (Å).
Molecule | Basis | Geometry |
---|---|---|
H2 | STO-3G | H(0, 0, 0); H(0, 0, 0.734) |
LiH | STO-3G | Li(0, 0, 0); H(0, 0, 1.548) |
NH3 | STO-3G | N(0, 0, 0.149); H(0, 0.947, −0.348); H(0.821, −0.474, −0.348); H(−0.821, −0.474, −0.348) |
C2 | STO-3G | C(0, 0, 0); C(0, 0, 1.26) |
N2 | STO-3G | N(0, 0, 0); N(0, 0, 1.19) |
H2O | STO-3G | O(0, 0, 0.137); H(0, 0.769, −0.546); H(0, −0.769, −0.546) |
H2O | 6-31G | O(0, 0, 0.113); H(0, 0.795, −0.454); H(0, −0.795, −0.454) |
Computing matrix elements
A crucial requirement for the efficient implementation of the stochastic variational Monte Carlo procedure used to minimize the ground-state energy is the ability to efficiently compute the matrix elements ⟨σ′∣Hq∣σ⟩ of the spin Hamiltonian appearing in the local energy, Eq. (9). Since Hq is a sum of products of Pauli operators, the goal is to efficiently compute matrix elements of the form
$$\langle \sigma' |\, \sigma^{\nu_1}_1 \otimes \sigma^{\nu_2}_2 \otimes \cdots \otimes \sigma^{\nu_N}_N \,| \sigma \rangle, \tag{11}$$
where σνi denotes a Pauli matrix with ν = I, x, y, z acting on site i. Because of the structure of the Pauli operators, these matrix elements are non-zero only for a specific configuration σ′ such that
$$\sigma'_i = \begin{cases} \;\;\,\sigma_i & \text{if } \nu_i \in \{I, z\}, \\ -\sigma_i & \text{if } \nu_i \in \{x, y\}, \end{cases} \tag{12}$$
and the matrix element is readily computed as
$$\langle \sigma' |\, \sigma^{\nu_1}_1 \otimes \cdots \otimes \sigma^{\nu_N}_N \,| \sigma \rangle = (-\mathrm{i})^{n_y} \prod_{i\,:\,\nu_i \in \{y,z\}} \sigma'_i, \tag{13}$$
where ny is the total number of σy operators in the string of Pauli matrices.
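The rule of Eqs. (12)–(13) can be checked against a brute-force construction of the Pauli-string matrix; the short sketch below (illustrative, using the convention that σ = +1 corresponds to the basis state ∣0⟩) does so for a single string.

```python
import numpy as np
from functools import reduce

# Brute-force check of Eqs. (11)-(13): build the dense matrix of a Pauli string
# with Kronecker products and compare with the closed-form matrix element
# (-i)^{n_y} * prod_{i: nu_i in {y, z}} sigma'_i. Convention: sigma = +1 <-> |0>.

PAULI = {
    "I": np.eye(2, dtype=complex),
    "X": np.array([[0, 1], [1, 0]], dtype=complex),
    "Y": np.array([[0, -1j], [1j, 0]], dtype=complex),
    "Z": np.array([[1, 0], [0, -1]], dtype=complex),
}

def dense_pauli_string(pauli):
    """Full 2^N x 2^N matrix of a Pauli string such as 'XYZI'."""
    return reduce(np.kron, (PAULI[p] for p in pauli))

def closed_form(sigma, pauli):
    """Return (sigma_prime, <sigma_prime|P|sigma>) from Eqs. (12)-(13)."""
    sigma_prime = sigma.copy()
    for i, p in enumerate(pauli):
        if p in "XY":
            sigma_prime[i] = -sigma[i]
    n_y = pauli.count("Y")
    element = (-1j) ** n_y * np.prod([sigma_prime[i]
                                      for i, p in enumerate(pauli) if p in "YZ"])
    return sigma_prime, element

def basis_index(sigma):
    """Map sigma_i = +/-1 to the computational-basis index (+1 -> bit 0)."""
    bits = (1 - sigma) // 2
    return int("".join(str(int(b)) for b in bits), 2)

if __name__ == "__main__":
    pauli, sigma = "XYZI", np.array([1, -1, -1, 1])
    sigma_prime, element = closed_form(sigma, pauli)
    dense = dense_pauli_string(pauli)
    assert np.isclose(dense[basis_index(sigma_prime), basis_index(sigma)], element)
    print(sigma_prime, element)
```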
Simulation details
The optimization follows the stochastic reconfiguration scheme as detailed in the supplementary material of ref. 10. Given a variational ansatz depending on parameters {αk}, the parameter update δαk is given by the solution of the linear system
$$\sum_{k'} \big( S_{kk'} + \lambda\, \delta_{kk'} \big)\, \delta\alpha_{k'} = -\epsilon\, F_k, \qquad S_{kk'} = \langle O^{*}_k O_{k'} \rangle - \langle O^{*}_k \rangle \langle O_{k'} \rangle, \quad F_k = \langle O^{*}_k E_{\mathrm{loc}} \rangle - \langle O^{*}_k \rangle \langle E_{\mathrm{loc}} \rangle, \tag{14}$$
where Ok = ∂ log Ψ/∂αk are the logarithmic derivatives, ϵ is the step size, and λ is the regularization parameter. For the simulations done in this paper, we take ϵ = 0.05 and λ = 0.01. The expectation values 〈⋯〉 are estimated with Markov chain Monte Carlo sampling as described in the main text.
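A minimal sketch of one update step, assuming the simple diagonal-shift regularization written above and sample estimates of the required expectation values, could read as follows.

```python
import numpy as np

def sr_update(o_matrix, e_loc, epsilon=0.05, lam=0.01):
    """One stochastic-reconfiguration step in the form of Eq. (14).

    o_matrix: (Nsamp, Nvar) array of logarithmic derivatives O_k(sigma).
    e_loc:    (Nsamp,) array of local energies.
    Returns the parameter update delta_alpha.
    """
    n_samp = o_matrix.shape[0]
    o_mean = o_matrix.mean(axis=0)
    # Covariance matrix S_kk' = <O_k^* O_k'> - <O_k^*><O_k'>
    s_matrix = o_matrix.conj().T @ o_matrix / n_samp - np.outer(o_mean.conj(), o_mean)
    # Force vector F_k = <O_k^* E_loc> - <O_k^*><E_loc>
    f_vector = o_matrix.conj().T @ e_loc / n_samp - o_mean.conj() * e_loc.mean()
    # Diagonal regularization and linear solve
    s_matrix = s_matrix + lam * np.eye(s_matrix.shape[0])
    return -epsilon * np.linalg.solve(s_matrix, f_vector)
```

For the RBM of Eqs. (5)–(6), the logarithmic derivatives entering o_matrix follow directly from the log-amplitude: ∂logΨ/∂aj = σzj, ∂logΨ/∂bi = tanh θi, and ∂logΨ/∂Wij = σzj tanh θi, with θi = bi + Σj Wij σzj.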
The parameters of the RBM are initialized from a random normal distribution with a zero mean and a standard deviation of 0.05.
Acknowledgements
The Flatiron Institute is supported by the Simons Foundation. A.M. acknowledges support from the IBM Research Frontiers Institute. K.C. is supported by the European Union’s Horizon 2020 research and innovation program (ERC-StG-Neupert-757867-PARATOP). Neural-network quantum states simulations are based on the open-source software NetKet46. Coupled cluster and configuration interaction calculations are performed using the PySCF package47. The mappings from fermions to spins are done using Qiskit Aqua48. The authors acknowledge discussions with G. Booth, T. Berkelbach, M. Holtzmann, J. E. T. Smith, S. Sorella, J. Stokes, and S. Zhang.
Author contributions
K.C. performed the numerical simulations. K.C., A.M., and G.C. devised the algorithm and wrote the manuscript.
Data availability
The datasets generated during and/or analyzed during the current study are available from the authors on reasonable request.
Code availability
The code used in the current study is largely based on the open-sourced software NetKet46 with some custom modifications, which will be made available from the authors upon reasonable request.
Competing interests
The authors declare no competing interests.
Footnotes
Peer review information Nature Communications thanks Sabre Kais, Nicholas Mayhall, and Frank Noé for their contributions to the peer review of this work. Peer review reports are available.
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Contributor Information
Kenny Choo, Email: kenny.choo@uzh.ch.
Antonio Mezzacapo, Email: amezzac@us.ibm.com.
Giuseppe Carleo, Email: gcarleo@flatironinstitute.org.
Supplementary information
Supplementary information is available for this paper at 10.1038/s41467-020-15724-9.
References
1. Coester F, Kümmel H. Short-range correlations in nuclear wave functions. Nucl. Phys. 1960;17:477–485. doi: 10.1016/0029-5582(60)90140-1.
2. Čížek J. On the correlation problem in atomic and molecular systems. Calculation of wavefunction components in Ursell-type expansion using quantum-field theoretical methods. J. Chem. Phys. 1966;45:4256–4266. doi: 10.1063/1.1727484.
3. Jastrow R. Many-body problem with strong forces. Phys. Rev. 1955;98:1479–1484. doi: 10.1103/PhysRev.98.1479.
4. Casula M, Sorella S. Geminal wave functions with Jastrow correlation: a first application to atoms. J. Chem. Phys. 2003;119:6500–6511. doi: 10.1063/1.1604379.
5. White SR. Density matrix formulation for quantum renormalization groups. Phys. Rev. Lett. 1992;69:2863–2866. doi: 10.1103/PhysRevLett.69.2863.
6. White SR, Martin RL. Ab initio quantum chemistry using the density matrix renormalization group. J. Chem. Phys. 1999;110:4127–4130. doi: 10.1063/1.478295.
7. Chan GK-L, Sharma S. The density matrix renormalization group in quantum chemistry. Annu. Rev. Phys. Chem. 2011;62:465–481. doi: 10.1146/annurev-physchem-032210-103338.
8. Anderson JB. A random-walk simulation of the Schrödinger equation: H3+. J. Chem. Phys. 1975;63:1499–1503. doi: 10.1063/1.431514.
9. Zhang S, Krakauer H. Quantum Monte Carlo method using phase-free random walks with Slater determinants. Phys. Rev. Lett. 2003;90:136401. doi: 10.1103/PhysRevLett.90.136401.
10. Carleo G, Troyer M. Solving the quantum many-body problem with artificial neural networks. Science. 2017;355:602–606. doi: 10.1126/science.aag2302.
11. Nomura Y, Darmawan AS, Yamaji Y, Imada M. Restricted Boltzmann machine learning for solving strongly correlated quantum systems. Phys. Rev. B. 2017;96:205152. doi: 10.1103/PhysRevB.96.205152.
12. Xia R, Kais S. Quantum machine learning for electronic structure calculations. Nat. Commun. 2018;9:4195. doi: 10.1038/s41467-018-06598-z.
13. Ruggeri M, Moroni S, Holzmann M. Nonlinear network description for many-body quantum systems in continuous space. Phys. Rev. Lett. 2018;120:205302. doi: 10.1103/PhysRevLett.120.205302.
14. Luo D, Clark BK. Backflow transformations via neural networks for quantum many-body wave functions. Phys. Rev. Lett. 2019;122:226401. doi: 10.1103/PhysRevLett.122.226401.
15. Han J, Zhang L, Weinan E. Solving many-electron Schrödinger equation using deep neural networks. J. Comput. Phys. 2019;399:108929. doi: 10.1016/j.jcp.2019.108929.
16. Feynman RP, Cohen M. Energy spectrum of the excitations in liquid helium. Phys. Rev. 1956;102:1189–1204. doi: 10.1103/PhysRev.102.1189.
17. Tocchio LF, Becca F, Parola A, Sorella S. Role of backflow correlations for the nonmagnetic phase of the Hubbard model. Phys. Rev. B. 2008;78:041101. doi: 10.1103/PhysRevB.78.041101.
18. Pfau, D., Spencer, J. S., de Matthews, A. G. G. & Foulkes, W. M. C. Ab-initio solution of the many-electron Schrödinger equation with deep neural networks. Preprint at http://arxiv.org/abs/1909.02487 (2019).
19. Hermann, J., Schätzle, Z. & Noé, F. Deep neural network solution of the electronic Schrödinger equation. Preprint at http://arxiv.org/abs/1909.08423 (2019).
20. Wigner E, Jordan P. Über das Paulische Äquivalenzverbot. Z. Phys. 1928;47:631. doi: 10.1007/BF01331938.
21. Bravyi S, Kitaev A. Fermionic quantum computation. Ann. Phys. 2002;298:210–226. doi: 10.1006/aphy.2002.6254.
22. Seeley J, Richard M, Love P. The Bravyi–Kitaev transformation for quantum computation of electronic structure. J. Chem. Phys. 2012;137:224109. doi: 10.1063/1.4768229.
23. Tranter A, et al. The Bravyi–Kitaev transformation: properties and applications. Int. J. Quantum Chem. 2015;115:1431–1441. doi: 10.1002/qua.24969.
24. Choo K, Carleo G, Regnault N, Neupert T. Symmetries and many-body excitations with neural-network quantum states. Phys. Rev. Lett. 2018;121:167204. doi: 10.1103/PhysRevLett.121.167204.
25. Ferrari F, Becca F, Carrasquilla J. Neural Gutzwiller-projected variational wave functions. Phys. Rev. B. 2019;100:125131. doi: 10.1103/PhysRevB.100.125131.
26. Czischek S, Gärttner M, Gasenzer T. Quenches near Ising quantum criticality as a challenge for artificial neural networks. Phys. Rev. B. 2018;98:024311. doi: 10.1103/PhysRevB.98.024311.
27. Fabiani G, Mentink J. Investigating ultrafast quantum magnetism with machine learning. SciPost Phys. 2019;7:004. doi: 10.21468/SciPostPhys.7.1.004.
28. Hartmann MJ, Carleo G. Neural-network approach to dissipative quantum many-body dynamics. Phys. Rev. Lett. 2019;122:250502. doi: 10.1103/PhysRevLett.122.250502.
29. Nagy A, Savona V. Variational quantum Monte Carlo method with a neural-network Ansatz for open quantum systems. Phys. Rev. Lett. 2019;122:250501. doi: 10.1103/PhysRevLett.122.250501.
30. Vicentini F, Biella A, Regnault N, Ciuti C. Variational neural-network Ansatz for steady states in open quantum systems. Phys. Rev. Lett. 2019;122:250503. doi: 10.1103/PhysRevLett.122.250503.
31. Yoshioka N, Hamazaki R. Constructing neural stationary states for open quantum many-body systems. Phys. Rev. B. 2019;99:214306. doi: 10.1103/PhysRevB.99.214306.
32. Smolensky, P. in Information Processing in Dynamical Systems: Foundations of Harmony Theory Vol. 1, 194–281 (MIT Press, Cambridge, MA, USA, 1986).
33. Deng D-L, Li X, Das Sarma S. Quantum entanglement in neural network states. Phys. Rev. X. 2017;7:021021.
34. Huang, Y. & Moore, J. E. Neural network representation of tensor network and chiral states. Preprint at http://arxiv.org/abs/1701.06246 (2017).
35. Chen J, Cheng S, Xie H, Wang L, Xiang T. Equivalence of restricted Boltzmann machines and tensor network states. Phys. Rev. B. 2018;97:085104. doi: 10.1103/PhysRevB.97.085104.
36. Levine Y, Sharir O, Cohen N, Shashua A. Quantum entanglement in deep learning architectures. Phys. Rev. Lett. 2019;122:065301. doi: 10.1103/PhysRevLett.122.065301.
37. Melko RG, Carleo G, Carrasquilla J, Cirac JI. Restricted Boltzmann machines in quantum physics. Nat. Phys. 2019;15:887–892. doi: 10.1038/s41567-019-0545-1.
38. Hastings WK. Monte Carlo sampling methods using Markov chains and their applications. Biometrika. 1970;57:97–109. doi: 10.1093/biomet/57.1.97.
39. Sorella S. Green function Monte Carlo with stochastic reconfiguration. Phys. Rev. Lett. 1998;80:4558–4561. doi: 10.1103/PhysRevLett.80.4558.
40. Sorella S, Casula M, Rocca D. Weak binding between two aromatic rings: feeling the van der Waals attraction by quantum Monte Carlo methods. J. Chem. Phys. 2007;127:014105. doi: 10.1063/1.2746035.
41. Amari S-I. Natural gradient works efficiently in learning. Neural Comput. 1998;10:251–276. doi: 10.1162/089976698300017746.
42. Setia, K., Bravyi, S., Mezzacapo, A. & Whitfield, J. D. Superfast encodings for fermionic quantum simulation. Phys. Rev. Research 1, 033033 (2019).
43. Kandala A, et al. Hardware-efficient variational quantum eigensolver for small molecules and quantum magnets. Nature. 2017;549:242. doi: 10.1038/nature23879.
44. Hsieh, C.-y., Sun, Q., Zhang, S. & Lee, C. K. Unitary-coupled restricted Boltzmann machine ansatz for quantum simulations. Preprint at https://arxiv.org/abs/1912.02988 (2019).
45. Johnson, R. Computational Chemistry Comparison and Benchmark Database (CCCBDB), https://cccbdb.nist.gov/ (2019).
46. Carleo G, et al. NetKet: a machine learning toolkit for many-body quantum systems. SoftwareX. 2019;10:100311. doi: 10.1016/j.softx.2019.100311.
47. Sun, Q. et al. PySCF: the Python-based simulations of chemistry framework. WIREs Comput. Mol. Sci. 8, e1340 (2017).
48. Abraham, H. et al. Qiskit: An Open-Source Framework for Quantum Computing v0.7.2 (2019).