Metrics for the best models found in the current study (upper section) and for other state-of-the-art models from the literature (lower section). Literature values were taken from the cited references; empty cells mark entries that the cited authors did not study. The SolChal columns refer to the solubility challenges: 2_1 is the tight dataset (set-1) and 2_2 is the loose dataset (set-2), as described in the original paper (see ref. 30). The best-performing model on each dataset has its RMSE value in bold.
| Model | Solubility challenge 1 | Solubility challenge 2_1 | Solubility challenge 2_2 | ||||||
|---|---|---|---|---|---|---|---|---|---|
| RMSE | MAE | r | RMSE | MAE | r | RMSE | MAE | r | |
| RF | 1.121 | 0.914 | 0.950 | 0.727 | 1.205 | 1.002 | |||
| DNN | 1.540 | 1.214 | 1.315 | 1.035 | 1.879 | 1.381 | |||
| DNNAug | 1.261 | 1.007 | 1.371 | 1.085 | 2.189 | 1.710 | |||
| kde4LSTMAug | 1.273 | 0.984 | 1.137 | 0.932 | 1.511 | 1.128 | 1.397 | 1.131 | |
| kde8LSTMAug | 1.247 | 0.984 | 1.044 | 0.846 | 1.418 | 1.118 | 1.676 | 1.339 | |
| kde10LSTMAug | 1.095 | 0.843 | 0.983 | 0.793 | 1.263 | 1.051 | 1.316 | 1.089 | |
| Linear regression (ref. 25) | 0.75 | ||||||||
| UG-RNN (ref. 34) | 0.90 | 0.74 | |||||||
| RF w/CDF descriptors (ref. 27) | 0.93 | ||||||||
| RF w/Morgan fingerprints (ref. 36) | 0.64 | ||||||||
| Consensus (ref. 88) | 0.91 | ||||||||
| GNN (ref. 89) | ∼1.10 | 0.91 | 1.17 | ||||||
| SolvBert (ref. 90) | 0.925 | ||||||||
| SolTranNet^a (ref. 41) | 1.004 | 1.295 | 2.99 | ||||||
| SMILES-BERT^b (ref. 91) | 0.47 | ||||||||
| MolBERT^b (ref. 40) | 0.531 | ||||||||
| RT^b (ref. 42) | 0.73 | ||||||||
| MolFormer^b (ref. 43) | 0.278 | ||||||||

^a Has overlap between training and test sets.
^b Pre-trained model was fine-tuned on ESOL.
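
For readers who want to reproduce the kind of metrics reported in the table, the following is a minimal sketch of how RMSE, MAE and the Pearson correlation coefficient r are commonly computed from measured and predicted values. The function names and the example numbers are illustrative assumptions; they are not taken from the study or from any cited reference.

```python
import math


def rmse(y_true, y_pred):
    """Root-mean-square error between two equal-length sequences."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def mae(y_true, y_pred):
    """Mean absolute error between two equal-length sequences."""
    absolute = [abs(t - p) for t, p in zip(y_true, y_pred)]
    return sum(absolute) / len(absolute)


def pearson_r(y_true, y_pred):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(y_true)
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mean_t) ** 2 for t in y_true)
    var_p = sum((p - mean_p) ** 2 for p in y_pred)
    return cov / math.sqrt(var_t * var_p)


# Hypothetical measured and predicted values, for illustration only.
measured = [-2.0, -3.5, -1.0, -4.2]
predicted = [-2.5, -3.0, -1.5, -4.0]

print(f"RMSE = {rmse(measured, predicted):.3f}")
print(f"MAE  = {mae(measured, predicted):.3f}")
print(f"r    = {pearson_r(measured, predicted):.3f}")
```

For the same set of predictions, RMSE is never smaller than MAE, and r always lies between -1 and 1. These two properties give a quick sanity check when reading an RMSE/MAE pair from a row of the table.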