Table 3.
Models’ performance (root-mean-square error) on lipophilicity database
| Dataset (size) | Model | RMSE |
|---|---|---|
| Lipophilicity (4200) | RF | 0.824 ± 0.041 |
| | MPN (Deepchem)<sup>a</sup> | 0.630 ± 0.059 |
| | MPN (Deepchem)<sup>b</sup> | 0.652 ± 0.061 |
| | MPN | 0.630 ± 0.059 |
| | SAMPN | 0.579 ± 0.036 |
| | Multi-MPN | 0.594 ± 0.039 |
| | Multi-SAMPN | 0.571 ± 0.032 |
| Water solubility (1311) | RF | 1.096 ± 0.092 |
| | MPN (Deepchem-1128)<sup>a</sup> | 0.580 ± 0.030 |
| | MPN (Deepchem)<sup>b</sup> | 0.676 ± 0.022 |
| | MPN | 0.694 ± 0.050 |
| | SAMPN | 0.688 ± 0.057 |
| | Multi-MPN | 0.674 ± 0.074 |
| | Multi-SAMPN | 0.661 ± 0.063 |
Italics indicate the best performance in the results.
<sup>a</sup> Values were reported in [16]. For lipophilicity prediction, we use the same dataset as Deepchem. For water solubility prediction, our dataset is larger than the one Deepchem used (1128 molecules).
<sup>b</sup> Values were calculated using the same data and the same stratified cross-validation protocol as in our work.