2024 Nov 2;15:9481. doi: 10.1038/s41467-024-53748-7

Fig. 2. Application of crystal variational autoencoders (VAE) and downstream GW predictions.

a Density functional theory (DFT) calculated wavefunctions of states A, B and C in MoS2 (see e), used as input to the VAE encoder, in crystal coordinates in units of the lattice vector. WFN denotes wavefunction. b VAE-reconstructed wavefunctions of states A, B and C obtained through latent space decoding. c Low-dimensional variational mean latent space for states A, B and C. d Parity plot comparing the exactly calculated values (x-axis) with the machine learning (ML) predicted values (y-axis) of the GW correction for individual states. Blue (orange) dots represent the training (test) set. The mean absolute errors (MAE) for the training and test sets are 0.06 and 0.11 eV, respectively. The training (test) set contains 19801 (2201) data points. e ML-predicted GW band structure (blue solid curve) and calculated PBE band structure (blue dashed line) for monolayer MoS2. The red circles are the exactly calculated quasiparticle (QP) energies from the GW self-energy. The red dashed lines are the interpolated GW band structure. f Whisker plot of the ML-predicted GW error without utilizing representations of KS states nk, DFT energies εDFT, superstates φsuper or charge density ρ. "All" denotes using all information. The orange (blue) boxes represent the test (training) set. Each box covers 19801 (2201) data points for the training (test) set. Each box plot displays the absolute error distribution of the self-energies Σ_nk^GW, highlighting the median (box center line, i.e. Q2), the 25th to 75th percentiles of the dataset (lower and upper boundaries of the box, i.e. Q1 and Q3), the mean (small square within the box) and the outlier cutoff (the lower and upper whisker marks are defined as Q1-(Q3-Q1)×0.2 and Q1 + (Q3-Q1)×0.2, so all data points beyond this range are considered outliers). The training process spans 1,000 epochs. Source data are provided as a Source Data file.
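
The error statistics quoted in panels d and f (MAE, median, Q1 and Q3 of the absolute error) can be computed from paired exact and predicted values. The following is a minimal sketch, not the authors' code: the function name `absolute_error_summary` and the sample numbers are hypothetical, and the quartiles use linear interpolation between data points, which is an assumption about how the percentiles were evaluated. The whisker cutoffs are left out.

```python
import statistics


def absolute_error_summary(exact, predicted):
    """Summarize |predicted - exact| the way a box plot of absolute errors would.

    Hypothetical helper for illustration; both inputs are sequences of
    energies in eV with one entry per state.
    """
    if len(exact) != len(predicted):
        raise ValueError("exact and predicted must have the same length")
    if len(exact) < 2:
        raise ValueError("need at least two data points to compute quartiles")

    abs_err = [abs(p - e) for e, p in zip(exact, predicted)]

    # Quartiles with linear interpolation between data points (assumed convention).
    q1, median, q3 = statistics.quantiles(abs_err, n=4, method="inclusive")

    return {
        "mae": statistics.fmean(abs_err),  # mean absolute error (the small square in each box)
        "q1": q1,                          # 25th percentile (lower box boundary)
        "median": median,                  # Q2 (box center line)
        "q3": q3,                          # 75th percentile (upper box boundary)
    }


if __name__ == "__main__":
    # Made-up GW corrections in eV, purely illustrative.
    exact_values = [1.20, 0.85, 1.05, 0.95, 1.10]
    ml_values = [1.25, 0.80, 1.15, 0.90, 1.00]
    print(absolute_error_summary(exact_values, ml_values))
```

Applied separately to the training and test sets, the `mae` entry corresponds to the kind of number reported in panel d, while `q1`, `median` and `q3` correspond to the box boundaries and center line in panel f.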