Table 3. Performances of Models Developed Using Different Descriptor Selection Proceduresᵃ
descriptor set | N (unsupervised selection) | training RMSE (unsupervised selection) | test RMSE (unsupervised selection) | N (neural network pruning) | training RMSE (neural network pruning) | test RMSE (neural network pruning) |
---|---|---|---|---|---|---|
CDK | 159 | 0.93 | 1.13 | 6 | 0.89 | 1.2 |
Dragon | 1824 | 0.93 | 1.15 | 18 | 0.87 | 1.19 |
Fragmentor | 631 | 0.98 | 1.18 | 12 | 0.92 | 1.21 |
GSFrag | 202 | 0.97 | 1.1 | 24 | 0.97 | 1.18 |
Mera, Mersy | 242 | 0.93 | 1.04 | 10 | 0.93 | 1.18 |
Chemaxon | 97 | 0.93 | 1.16 | 11 | 0.92 | 1.16 |
Inductive | 39 | 0.94 | 1.17 | 21 | 0.93 | 1.16 |
Adriana | 133 | 0.93 | 1.14 | 8 | 0.92 | 1.1 |
QNPR | 381 | 0.95 | 1.12 | 74 | 0.89 | 1.13 |
E-state | 185 | 0.96 | 1.16 | 11 | 0.9 | 1.24 |
Consensus | 4036 | 0.88 | 1.08 | 186 | 0.85 | 1.13 |
ᵃ N is the number of descriptors selected to develop the respective model. RMSE is the root mean squared error calculated for the training (n = 483) and full test set (n = 143).
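
For readers who want the RMSE definition spelled out, the following is a minimal sketch of the standard formula (the square root of the mean squared difference between observed and predicted values). The function name `rmse` and the example values are illustrative assumptions and are not taken from the study.

```python
import math


def rmse(observed, predicted):
    """Root mean squared error between two equal-length sequences."""
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    if not observed:
        raise ValueError("at least one value is required")
    # Square each residual, average them, then take the square root.
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Made-up values for illustration only (not data from the study).
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.0, 2.0, 3.0, 6.0]
print(rmse(observed, predicted))
```

Applied separately to the training and test predictions of a model, the same calculation would give the two RMSE columns reported for each descriptor set.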