2021 Aug 13;121(16):10001–10036. doi: 10.1021/acs.chemrev.0c01303

Table 1. Overview: Synthetic Quantum Data Sets in Three Data Families of Chemical Compound Space: Generated Data Base (GDB; refs 33, 354, 361), Transition Metal Complexes (TMC), and Periodic Systems (Crystalline Solids or Surfaces) [a]

| family | data set | composition | size | method | properties | year | notes |
|---|---|---|---|---|---|---|---|
| GDB | QM7 (ref 386) | C, O, N, S | 7165 | PBE0 | E | 2012 | |
| | QM7b (ref 359) | C, O, N, S, Cl | 7211 | PBE0, ZINDO, GW | E, ε, α, E*, etc. | 2013 | |
| | QM9 (ref 171) | C, O, N, F | 134k | B3LYP/6-31G(2df,p) | E, μ, α, ε, Pthermo, etc. | 2014 | |
| | QM8 (ref 364) | C, H, O, N, F | 20k | TDDFT, CC2/def2-TZVP | E*, f1, f2 | 2015 | excited state |
| | ANI-1 (ref 367) | C, O, N, F | 20M | ωB97X/6-31G(d) | E | 2017 | off-equilibrium |
| | QM7bMl (ref 267) | C, O, N, S, Cl | 7211 | {HF, MP2, CCSD(T)}/{STO-3G, 6-31G, cc-pVDZ} | E | 2018 | multifidelity QML |
| | Alchemy (ref 363) | C, N, O, F, S, Cl | 119k | B3LYP/6-31G(2df,p) | E, μ, α, ε, Pthermo, etc. | 2019 | |
| | QM7-X (ref 360) | C, H, O, N, S, Cl | 4.2M | PBE0+MBD | E, f, ε, μ, α, qA, C6, etc. | 2020 | off-equilibrium |
| | ANI-1x (ref 368) | C, O, N, F | 5M | ωB97X/def2-TZVPP and CCSD(T)/CBS | E, f, μ, qA, etc. | 2020 | off-equilibrium |
| | AGZ7 (ref 366) | B, C, N, O, F, Si, P, S, Cl, Br, Sn, I | 140k | B3LYP/cc-pVTZ | E, μ, α, ε, Pthermo, etc. | 2020 | |
| TMC | tmQM (ref 383) | 3d, 4d, and 5d transition metals, B, Si, N, P, As, O, S, Se, halogens | 86k | TPSSh-D3BJ/def2-SVP | E, μ, qA, ε, etc. | 2020 | GFN2-xTB geometry |
| | (MIT) (refs 384, 387) | Cr, Fe, Mn, Co, Ni, C, N, O, S, Cl | >2M | B3LYP/LANL2DZ (6-31G*) | E, ΔEHL, redox potential | 2017, 2020 | |
| periodic | Materials Project (ref 165) | across periodic table | >600k | PBE | E, electronic and response properties | 2011 | |
| | AFlow (ref 388) | across periodic table | 3M | PBE | E, electronic and response properties | 2012 | |
| | OQMD (ref 389) | across periodic table | 300k | PBE | E, electronic and response properties | 2013 | |
| | OC20 (ref 390) | across periodic table | >1M | RPBE | E, Eads | 2020 | |
[a] Properties covered include E (total or atomization energy), f (atomic forces), qA (atomic charges), μ (dipole moments), α (polarizability), ε (eigenvalues), E* (excitation energy), fi (oscillator strength for the transition from the ground state to the ith excited state, i = 1 or 2), ΔEHL (energy difference between high- and low-spin states), C6 (London dispersion coefficients), Pthermo (thermochemical properties such as internal energy, enthalpy, free energy, and heat capacity), and Eads (chemisorption energy).
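The size column mixes plain counts with shorthand ("134k", "4.2M") and lower bounds (">1M"). For readers who want to sort or compare the data sets programmatically, the following is a minimal Python sketch of one way to normalize those entries. The helper name `parse_size` and the dictionary `SIZES` are hypothetical and introduced here for illustration only; the sketch assumes "k" means 10^3 and "M" means 10^6, and it copies just four rows from Table 1.

```python
import re

# Assumed multipliers for the size shorthand used in the table.
_SUFFIX = {"": 1, "k": 1_000, "M": 1_000_000}


def parse_size(text: str) -> tuple[int, bool]:
    """Return (count, is_lower_bound) for entries like '7165', '134k', '4.2M', '>2M'."""
    match = re.fullmatch(r"(>?)\s*(\d+(?:\.\d+)?)([kM]?)", text.strip())
    if match is None:
        raise ValueError(f"unrecognized size entry: {text!r}")
    bound, number, suffix = match.groups()
    # round() guards against floating-point noise in products such as 4.2 * 1e6.
    return round(float(number) * _SUFFIX[suffix]), bound == ">"


# A few (data set, size) pairs copied from Table 1; not the full table.
SIZES = {"QM7": "7165", "QM9": "134k", "QM7-X": "4.2M", "OC20": ">1M"}

# List the selected data sets from smallest to largest.
for name, raw in sorted(SIZES.items(), key=lambda kv: parse_size(kv[1])[0]):
    count, is_lower_bound = parse_size(raw)
    marker = ">" if is_lower_bound else " "
    print(f"{name:6s} {marker} {count:>9,d}")
```

Keeping the lower-bound flag separate from the count avoids silently treating an entry such as ">1M" as an exact size when ranking data sets.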