Table 1. Overview of Synthetic Quantum Data Sets in Three Data Families of Chemical Compound Space: Generated Data Base (GDB; refs 33, 354, 361), Transition Metal Complexes (TMC), and Periodic Systems (Crystalline Solids or Surfaces)^a
| family | data set | composition | size | method | properties | year | notes |
|---|---|---|---|---|---|---|---|
| GDB | QM7 (ref 386) | C, O, N, S | 7165 | PBE0 | E | 2012 | |
| GDB | QM7b (ref 359) | C, O, N, S, Cl | 7211 | PBE0, ZINDO, GW | E, ε, α, E*, etc. | 2013 | |
| GDB | QM9 (ref 171) | C, O, N, F | 134k | B3LYP/6-31G(2df,p) | E, μ, α, ε, Pthermo, etc. | 2014 | |
| GDB | QM8 (ref 364) | C, H, O, N, F | 20k | TDDFT, CC2/def2-TZVP | E*, f1, f2 | 2015 | excited state |
| GDB | ANI-1 (ref 367) | C, O, N, F | 20M | ωB97X/6-31G(d) | E | 2017 | off-equilibrium |
| GDB | QM7bMl (ref 267) | C, O, N, S, Cl | 7211 | {HF, MP2, CCSD(T)}/{STO-3G, 6-31G, cc-pVDZ} | E | 2018 | multifidelity QML |
| GDB | Alchemy (ref 363) | C, N, O, F, S, Cl | 119k | B3LYP/6-31G(2df,p) | E, μ, α, ε, Pthermo, etc. | 2019 | |
| GDB | QM7-X (ref 360) | C, H, O, N, S, Cl | 4.2M | PBE0+MBD | E, f, ε, μ, α, qA, C6, etc. | 2020 | off-equilibrium |
| GDB | ANI-1x (ref 368) | C, O, N, F | 5M | ωB97X/def2-TZVPP and CCSD(T)/CBS | E, f, μ, qA, etc. | 2020 | off-equilibrium |
| GDB | AGZ7 (ref 366) | B, C, N, O, F, Si, P, S, Cl, Br, Sn, I | 140k | B3LYP/cc-pVTZ | E, μ, α, ε, Pthermo, etc. | 2020 | |
| TMC | tmQM (ref 383) | 3d, 4d, and 5d transition metals, B, Si, N, P, As, O, S, Se, halogens | 86k | TPSSh-D3BJ/def2-SVP | E, μ, qA, ε, etc. | 2020 | GFN2-xTB geometry |
| TMC | (MIT) (refs 384, 387) | Cr, Fe, Mn, Co, Ni, C, N, O, S, Cl | >2M | B3LYP/LANL2DZ (6-31G*) | E, ΔEH–L, redox potential | 2017, 2020 | |
| periodic | Materials Project (ref 165) | across periodic table | >600k | PBE | E, electronic and response properties | 2011 | |
| periodic | AFlow (ref 388) | across periodic table | 3M | PBE | E, electronic and response properties | 2012 | |
| periodic | OQMD (ref 389) | across periodic table | 300k | PBE | E, electronic and response properties | 2013 | |
| periodic | OC20 (ref 390) | across periodic table | >1M | RPBE | E, Eads | 2020 | |
^a Properties covered include E (total energy or atomization energy), f (atomic forces), qA (atomic charges), μ (dipole moment), α (polarizability), ε (eigenvalues), E* (excitation energy), fi (oscillator strength for the transition from the ground state to the ith excited state, i = 1 or 2), ΔEH–L (energy difference between high- and low-spin states), C6 (London dispersion coefficients), Pthermo (thermochemical properties such as internal energy, enthalpy, free energy, and heat capacity), and Eads (chemisorption energy).