Table 3. Top-Performing Models for DMSO Solubility Prediction Developed with the Enamine Data Set, for the Whole Data Set and at 10% Coverageᵃ
method | descriptors | number of descriptorsᵇ | BAC100% | PPV100% | ENR100% | PPV10% | ENR10% | BAC10% | SENS10% | SPEC10% | NSOL10% |
---|---|---|---|---|---|---|---|---|---|---|---|
J48 | CDK | 175 | 71.8 | 97.5 | 2.2 | 99.4 | 8.8 | 87.1 | 97.8 | 76.4 | 2.3 |
ASNN | E-state | 205 | 73.1 | 97.9 | 2.6 | 99.4 | 8.8 | 92 | 96 | 88 | 4.7 |
ASNN | Dragon6 | 1929 | 73.2 | 97.9 | 2.5 | 99.4 | 8.8 | 92.2 | 97 | 87.4 | 3.7 |
ASNN | Fragmentor | 872 | 73.3 | 97.9 | 2.5 | 99.3 | 7.6 | 90.2 | 95.8 | 84.6 | 4.3 |
ASNN | ChemAxon | 134 | 72.8 | 97.9 | 2.5 | 99.3 | 7.6 | 91.2 | 92.4 | 89.9 | 5.8 |
J48 | E-state | 205 | 71 | 97.3 | 2 | 99.2 | 7.6 | 77.5 | 99.4 | 55.6 | 1.7 |
ASNN | CDK | 175 | 73.2 | 97.9 | 2.5 | 99.2 | 6.6 | 91 | 94.8 | 87.2 | 5.3 |
J48 | Dragon6 | 1929 | 71 | 97.3 | 2 | 99.2 | 6.6 | 88.8 | 97 | 80.6 | 4 |
J48 | Fragmentor | 872 | 71 | 97.3 | 2 | 99.1 | 5.9 | 82.2 | 98 | 66.4 | 2.5 |
ASNN | Mera | 259 | 70.9 | 97.7 | 2.3 | 99.1 | 5.9 | 88.3 | 96.2 | 80.4 | 4.2 |
ASNN | Adriana | 119 | 72.9 | 97.9 | 2.5 | 99 | 5.3 | 86 | 82.2 | 89.8 | 7.8 |
J48 | Adriana | 119 | 69 | 97.1 | 1.8 | 99 | 5.3 | 79.4 | 98.8 | 60 | 2.5 |
RF | Fragmentor | 872 | 72 | 97.6 | 2.2 | 99 | 5.3 | 88.3 | 97.4 | 79.1 | 4.7 |
ASNN | GSFrag | 314 | 71.9 | 97.8 | 2.4 | 99 | 5.3 | 87.6 | 85.5 | 89.7 | 7.5 |
J48 | GSFrag | 314 | 69 | 97.1 | 1.8 | 99 | 5.3 | 83 | 98.8 | 67.1 | 2.9 |
RF | E-state | 205 | 73 | 97.8 | 2.4 | 99 | 5.3 | 88.5 | 96.3 | 80.6 | 5.1 |
RF | inductive | 38 | 67 | 97.2 | 1.9 | 99 | 5.3 | 84.9 | 94.8 | 74.9 | 3.5 |
consensus | all | | 74.2 | 97.9 | 2.5 | 99.6 | 13 | 87.5 | 98 | 77 | 1.2 |
ᵃ Models with PPV10% ≥ 99% were selected.
ᵇ After filtering; see the Methods section. NSOL10% is the percentage of nonsoluble molecules within the 10% coverage made up of the most accurate predictions.
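
The table does not restate how BAC is computed. Assuming the usual definition of balanced accuracy as the mean of sensitivity and specificity, the BAC10% column can be cross-checked against the SENS10% and SPEC10% columns. The sketch below does this for a few rows copied from the table; the function and variable names are illustrative and are not taken from the paper.

```python
def balanced_accuracy(sensitivity: float, specificity: float) -> float:
    """Mean of sensitivity and specificity, both given in percent.

    Assumption: this is the standard definition of balanced accuracy;
    the table itself does not spell it out.
    """
    return (sensitivity + specificity) / 2.0


# (method, descriptors, SENS10%, SPEC10%, reported BAC10%) copied from Table 3.
ROWS = [
    ("J48", "CDK", 97.8, 76.4, 87.1),
    ("ASNN", "E-state", 96.0, 88.0, 92.0),
    ("ASNN", "Dragon6", 97.0, 87.4, 92.2),
    ("J48", "E-state", 99.4, 55.6, 77.5),
    ("consensus", "all", 98.0, 77.0, 87.5),
]


def max_deviation(rows) -> float:
    """Largest absolute gap between the recomputed and the reported BAC10%."""
    return max(abs(balanced_accuracy(s, p) - bac) for _, _, s, p, bac in rows)


if __name__ == "__main__":
    for method, desc, sens, spec, reported in ROWS:
        computed = balanced_accuracy(sens, spec)
        print(f"{method:10s} {desc:8s} computed={computed:.2f} reported={reported}")
    print(f"max deviation: {max_deviation(ROWS):.3f}")
```

For these rows the recomputed values agree with the reported BAC10% to within the one-decimal rounding used in the table. The check also confirms that the consensus row's values line up with the BAC10%, SENS10%, and SPEC10% columns once the number-of-descriptors cell is left empty.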