Table 2.
| Pre-training databases | Fine-tuning category | SMILES data | Task | # Tasks | # Compounds |
|---|---|---|---|---|---|
| ZINC [5], [56] | Physical Chemistry [61] | ESOL | R | 1 | 1128 |
| PubChem [57], [58] | Physical Chemistry | FreeSolv | R | 1 | 642 |
| ChEMBL [59], [60] | Physical Chemistry | Lipophilicity | R | 1 | 4200 |
| | Biophysics [61] | PCBA | C | 128 | 437929 |
| | Biophysics | MUV | C | 17 | 93087 |
| | Biophysics | HIV | C | 1 | 41127 |
| | Biophysics | PDBbind | R | 1 | 11908 |
| | Biophysics | BACE | C | 1 | 15513 |
| | Physiology [61] | BBBP | C | 1 | 2039 |
| | Physiology | Tox21 | C | 12 | 7831 |
| | Physiology | ToxCast | C | 617 | 8575 |
| | Physiology | SIDER | C | 27 | 1427 |
| | Physiology | ClinTox | C | 2 | 1478 |
| | Proposed [62], [63], [64] | Antimalarial | C | 1 | 4794 |
| | Proposed | Cocrystals | C | 1 | 3282 |
| | Proposed | Covid | C | 1 | 740 |
| | Proposed | Genes | R | 1 | 201 |

R - Regression.
C - Classification.