Table 3. Overall Performance for Classification Tasks on Toxicity-Related Datasets of MoleculeNet (ref 3)^a
Data set ↑ | Tox21 | ToxCast | SIDER | ClinTox | BACE | BBBP |
---|---|---|---|---|---|---|
No. molecules | 7,677 | 8,405 | 1,356 | 1,438 | 1,511 | 1,959 |
Label dist. | 0.06:0.77:0.17 | 0.03:0.27:0.70 | 0.57:0.43:- | 0.51:0.49:- | 0.54:0.46:- | 0.76:0.24:- |
No. tasks | 12 | 617 | 27 | 2 | 1 | 1 |
D-MPNN | 0.759 ± 0.007 | 0.655 ± 0.003 | 0.570 ± 0.007 | 0.906 ± 0.006 | 0.809 ± 0.006 | 0.724 ± 0.004 |
AttentiveFP | 0.761 ± 0.005 | 0.637 ± 0.002 | 0.606 ± 0.032 | 0.847 ± 0.003 | 0.784 ± 0.022 | 0.643 ± 0.018 |
GEM | 0.781 ± 0.001 | 0.692 ± 0.004 | 0.672 ± 0.004 | 0.901 ± 0.013 | 0.856 ± 0.011 | 0.724 ± 0.004 |
SMILES-T | 0.691 ± 0.011 | 0.578 ± 0.011 | 0.504 ± 0.028 | 0.819 ± 0.045 | 0.739 ± 0.075 | 0.931 ± 0.012 |
ET (single) | 0.780 ± 0.004 | 0.685 ± 0.009 | 0.606 ± 0.010 | 0.851 ± 0.027 | 0.832 ± 0.009 | 0.960 ± 0.030 |
ET (multi) | 0.789 ± 0.003 | 0.623 ± 0.008 | 0.560 ± 0.011 | 0.843 ± 0.012 | 0.816 ± 0.013 | 0.955 ± 0.008 |
^a Label distributions are normalized and given as active:inactive:nan. ET denotes the equivariant transformer; (single) and (multi) denote single- and multi-conformer training, respectively. Standard deviations over five runs with different random seeds follow the ± sign. Two numbers in a column are set in bold if their standard deviations overlap.
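The bolding rule in the footnote can be illustrated with a short sketch. It assumes that "overlap" means the intervals mean ± standard deviation intersect, and that each entry is compared against the highest mean in its column. Both the interpretation and the helper names `intervals_overlap` and `bold_candidates` are assumptions made for illustration, not the authors' procedure. The values are the ClinTox column of Table 3.

```python
from typing import Dict, List, Tuple

Score = Tuple[float, float]  # (mean, standard deviation)


def intervals_overlap(a: Score, b: Score) -> bool:
    """True if [mean - std, mean + std] of a and b intersect (assumed meaning of 'overlap')."""
    return abs(a[0] - b[0]) <= a[1] + b[1]


def bold_candidates(column: Dict[str, Score]) -> List[str]:
    """Models whose interval overlaps that of the highest mean in the column (assumed rule)."""
    best = max(column, key=lambda name: column[name][0])
    return [name for name, score in column.items()
            if intervals_overlap(score, column[best])]


# ClinTox column, copied from Table 3.
clintox = {
    "D-MPNN": (0.906, 0.006),
    "AttentiveFP": (0.847, 0.003),
    "GEM": (0.901, 0.013),
    "SMILES-T": (0.819, 0.045),
    "ET (single)": (0.851, 0.027),
    "ET (multi)": (0.843, 0.012),
}

print(bold_candidates(clintox))  # ['D-MPNN', 'GEM']
```

Under this reading, D-MPNN and GEM would both be bold for ClinTox, since 0.906 ± 0.006 and 0.901 ± 0.013 intersect. Other readings of the rule are possible and may select different entries.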