Table 1.
Model | Description | R² | RMSE | Spearman ρ |
---|---|---|---|---|
– | AlogP | 0.83 [0.71,0.90] | 0.73 [0.55,0.93] | 0.90 [0.85,0.94] |
– | XlogP3 | 0.85 [0.75,0.92] | 0.67 [0.48,0.87] | 0.91 [0.87,0.95] |
– | S+ logP | 0.95 [0.91,0.97] | 0.40 [0.32,0.48] | 0.97 [0.94,0.98] |
1 | default | 0.93 [0.89,0.96] | 0.45 [0.36,0.57] | 0.96 [0.94,0.97] |
2 | 1 + rdkit | 0.93 [0.89,0.96] | 0.45 [0.37,0.55] | 0.96 [0.94,0.98] |
3 | rdkit only | 0.88 [0.82,0.92] | 0.60 [0.50,0.70] | 0.94 [0.91,0.96] |
4 | 1 + ChEMBL merged | 0.88 [0.81,0.92] | 0.60 [0.51,0.71] | 0.94 [0.92,0.96] |
5 | 1 + ChEMBL separate | 0.93 [0.88,0.95] | 0.47 [0.38,0.58] | 0.96 [0.94,0.98] |
6 | 5 + AZ_logD7.4 | 0.94 [0.91,0.96] | 0.42 [0.35,0.50] | 0.97 [0.95,0.97] |
7 | 5 + AZ_ADME | 0.94 [0.90,0.96] | 0.44 [0.36,0.51] | 0.97 [0.95,0.98] |
8 | 6 + hyperopt parameters | 0.93 [0.88,0.95] | 0.47 [0.39,0.58] | 0.96 [0.94,0.97] |
9 | 6 + S+ logP/logD7.4 as tasks | 0.95 [0.93,0.97] | 0.38 [0.32,0.44] | 0.97 [0.96,0.98] |
10 | 6 + S+ logP/logD7.4 as descriptors | 0.95 [0.92,0.97] | 0.39 [0.34,0.44] | 0.97 [0.96,0.98] |
11 | 1, ensemble of 10 | 0.94 [0.89,0.96] | 0.44 [0.35,0.55] | 0.96 [0.94,0.98] |
12 | 9, ensemble of 10 | 0.95 [0.92,0.97] | 0.39 [0.33,0.46] | 0.97 [0.96,0.98] |
The model numbers in the left-most column indicate the order in which the models were developed. For example, model 6 (5 + AZ_logD7.4) uses the settings and data of model 5 with the AZ_logD7.4 data added. The 95% confidence interval for each performance metric is shown in square brackets.
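
The table does not state how the bracketed intervals were obtained. As a minimal sketch of one common approach, the code below computes the three metrics and a percentile-bootstrap 95% interval for each. The bootstrap itself, the reading of R² as the coefficient of determination, the helper names (`r2_score`, `rmse`, `spearman`, `bootstrap_ci`) and the synthetic data are all assumptions made for illustration; none of them is taken from the source.

```python
import math
import random


def r2_score(y_true, y_pred):
    """Coefficient of determination, 1 - SS_res / SS_tot (assumed definition of R²)."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def rmse(y_true, y_pred):
    """Root-mean-square error between measured and predicted values."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def average_ranks(values):
    """1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


def spearman(y_true, y_pred):
    """Spearman rho: Pearson correlation of the rank-transformed values."""
    return pearson(average_ranks(y_true), average_ranks(y_pred))


def bootstrap_ci(metric, y_true, y_pred, n_boot=1000, alpha=0.05, seed=0):
    """Percentile-bootstrap interval: resample (measured, predicted) pairs with replacement."""
    rng = random.Random(seed)
    n = len(y_true)
    stats = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        stats.append(metric([y_true[i] for i in idx], [y_pred[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Synthetic stand-in data (NOT the test set behind Table 1).
_rng = random.Random(42)
measured = [_rng.uniform(-1.0, 6.0) for _ in range(60)]
predicted = [m + _rng.gauss(0.0, 0.4) for m in measured]

for name, metric in [("R2", r2_score), ("RMSE", rmse), ("Spearman rho", spearman)]:
    low, high = bootstrap_ci(metric, measured, predicted)
    print(f"{name}: {metric(measured, predicted):.2f} [{low:.2f},{high:.2f}]")
```

Each printed line follows the same `value [lower,upper]` layout as the table cells. Resampling pairs rather than individual values keeps each prediction matched to its measurement, which is what the metrics require.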