Organic molecular databases |
CEPDB [101,102] | 2.3M | DFT | Enumerated | Organic compounds for photovoltaics |
Materials Project [8–11] | 1.0M | DFT | ICSD & others | 153k bulk materials (main data), and 222k organic molecules, 4k battery materials, 25k battery electrolytes, 20k MOFs, 560k catalyst surfaces, 41k synthesis recipes |
OCELOT [103,104] | 56k | DFT | CSD, community | Crystalline organic semiconductors |
Organic + inorganic molecular datasets |
PubChemQC [21,99,100] | 86M | PM6 + DFT-sp | PubChem | Organic and organometallic molecules containing first-row transition metals |
SPICE [105] | 1.1M | DFT | Literature, PubChem, DES370K | Conformations of small molecules, dimers, dipeptides, and solvated amino acids |
DES370K [106] | 370k | DFT + CC-sp | Literature | 370k data points of dimer interactions of 392 mostly organic molecules |
Alexandria library [107] | 2.7k | DFT | PubChem, ChemSpider | Mostly organic molecules |
CCCBDB [108] | 2.2k | DFT | Literature | Gas-phase atoms and small molecules |
QuestDB [109,110] | >500 | CC & others | Literature | Vertical excitation energies for small- and medium-sized molecules |
Organic molecular datasets |
GEOM [111] | 37M | xTB | AICures, QM9 | 37M conformers of 450k organic molecules |
Transition1x [112] | 10M | DFT-sp | Grambow et al. [113] | Molecular configurations along the potential energy surface of 11 961 reactions |
ANI-1x [114] | 5.0M | DFT | GDB-11, ChEMBL, generated | Small molecules |
QM7-X [115] | 4.2M | DFT | QM7 | Equilibrium and non-equilibrium structures of small organic molecules |
QMugs [116] | 2.0M | xTB + DFT-sp | ChEMBL | 2M conformers of 665k biologically relevant organic molecules |
WS22 [117] | 1.2M | DFT | Literature | 1.2M data points of equilibrium and non-equilibrium geometries of 10 species |
VQ24 [118] | 836k | DFT & xTB | Generated | Enumerated molecules with up to 5 heavy atoms from C, N, O, F, Si, P, S, Cl, Br |
Frag20 [119] | 566k | DFT | ZINC, PubChem | Small organic molecules from ZINC and PubChem |
ANI-1ccx [114] | 500k | DFT + CC-sp | ANI-1x | Subset of ANI-1x recomputed with CC-sp |
John et al. [120] | 240k | DFT | PubChem | Open- and closed-shell small organic molecules |
QM-symex [121,122] | 173k | DFT & TD-DFT | Generated | Includes point group and excited states of small molecules |
QM9 [123] | 134k | DFT | GDB-17 | Small organic molecules with up to 9 heavy atoms |
Kim et al. [124] | 134k | G4MP2 | QM9 | Refinement of QM9 |
Narayanan et al. [125] | 133k | G4MP2 | QM9 | Refinement of QM9 |
FORMED [126] | 117k | xTB, DFT-sp & TD-DFT | CSD | Organic molecules from the CSD |
OE62 [127] | 62k | DFT | CSD | Organic molecules from the CSD |
MQMspin [128] | 13k | DFT & CASSCF | QM9 | Small organic carbene molecules |
HOPV15 [129] | 6.0k | DFT | Literature | 6k conformers of 353 p-type molecules for organic photovoltaics + exp. data |
VERDE Materials DB [130,131] | 1.8k | DFT | Generated | Light-responsive π-conjugated organic molecules |
HAB79 [132] | 921 | DFT & CASSCF | Literature | Benchmark dataset for DFT |
Transition metal complex (TMC) datasets |
tmQM [133] | 80k | xTB + DFT-sp | CSD | Monometallic TMCs |
tmQMg [134] | 60k | DFT | tmQM | Subset of tmQM with full DFT and graphs from natural bond orders |
SC1MC-2022 [135] | 7.0k | Hartree–Fock | Generated | TMCs assembled from ligands |
OHLDB [136] | 1.4k | DFT | Enumerated | Homoleptic TMCs |
divTMC [137] | 855 | DFT | CSD | Octahedral TMCs assembled from monodentate ligands |
16OSTM10 [138] | 160 | DFT | CSD | Open-shell TMCs for conformer benchmark |
ROST61 [139] | 61 | CC | Literature | Open-shell TMCs for DFT functional benchmark |
MOR41 [140] | 41 | CC | Literature | Closed-shell TMCs for DFT functional benchmark |
Organic + inorganic molecular repositories |
NOMAD [66–69] | 12M | DFT & others | Submissions, MP, OQMD, AFLOW, and others | 9M bulks, 75k surfaces, 5k 2D and 33k 1D materials, 2.8M organic and inorganic molecules |
ioChem-BD [36,37] | 356k | DFT mixed | Submissions | 38k materials and 318k molecules, chemically diverse |
Organic + inorganic molecular dataset repositories |
QCarchive [141,142] | 47 sets | Mixed | Mixed | Datasets from publications |