TABLE 3.
Optimal feature subsets and prediction performance results.
| Feature set | Optimal no. of descriptors | AUC | ACC (%) | SEN (%) | SPE (%) | MCC |
|---|---|---|---|---|---|---|
| DS | 100 a | 0.8301 | 77.11 | 80.10 | 75.84 | 0.5203 |
| MOE | 123 b | 0.8264 | 76.94 | 79.89 | 75.68 | 0.5178 |
| RDKit | 59 a | 0.8225 | 75.95 | 82.61 | 73.11 | 0.5194 |
| DS + MOE | 205 b | 0.8394 | 76.79 | 80.58 | 75.18 | 0.5170 |
| DS + RDKit | 237 b | 0.8446 | 76.77 | 82.36 | 74.40 | 0.5263 |
| MOE + RDKit | 196 c | 0.8429 | 76.28 | 83.85 | 73.07 | 0.5252 |
| MOE + DS + RDKit | 328 b | 0.8288 | 76.79 | 78.77 | 75.94 | 0.5093 |
| FP | 1019 b | 0.7906 | 71.68 | 76.50 | 69.63 | 0.4269 |
| ExtFP | 1007 b | 0.7936 | 71.19 | 75.82 | 69.22 | 0.4173 |
| GraphFP | 969 b | 0.7674 | 70.53 | 71.74 | 70.02 | 0.3810 |
| EstateFP | 41 b | 0.7869 | 72.52 | 72.39 | 72.57 | 0.4191 |
| MACCSFP | 133 b | 0.8149 | 75.46 | 77.67 | 74.52 | 0.4814 |
| PubchemFP | 391 b | 0.8061 | 73.99 | 78.03 | 72.28 | 0.4604 |
| SubFP | 127 b | 0.8327 | 75.13 | 77.86 | 73.97 | 0.4797 |
| SubFPC | 125 b | 0.8459 | 76.10 | 83.06 | 73.14 | 0.5224 |
| KRFP | 1149 b | 0.8046 | 74.65 | 75.51 | 74.28 | 0.4666 |
| KRFPC | 1084 b | 0.8209 | 74.64 | 78.39 | 73.04 | 0.4758 |
| AP2DFP | 263 b | 0.7687 | 71.67 | 67.40 | 73.49 | 0.3795 |
| AP2DFPC | 241 b | 0.7791 | 71.49 | 75.04 | 69.99 | 0.4137 |
a Feature selection with recursive feature elimination with cross-validation (RFECV).
b Feature preprocessing by removing null values, redundant features, and irrelevant features.
c Feature selection with the mutual information (MI) technique.
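
For readers who want to relate the ACC, SEN, SPE, and MCC columns to one another, the following minimal sketch computes them from a binary confusion matrix using the standard textbook definitions. The function name `binary_metrics` and the confusion-matrix counts are hypothetical and chosen only for illustration; they are not taken from the study, and the source does not state how its metrics were implemented. AUC is omitted because it requires ranked prediction scores rather than counts.

```python
import math


def binary_metrics(tp, fn, tn, fp):
    """Return ACC, SEN, SPE (in %) and MCC from confusion-matrix counts.

    Hypothetical helper for illustration; assumes each class has at
    least one sample so that SEN and SPE are defined.
    """
    total = tp + fn + tn + fp
    acc = 100.0 * (tp + tn) / total   # accuracy: share of correct predictions
    sen = 100.0 * tp / (tp + fn)      # sensitivity: true-positive rate
    spe = 100.0 * tn / (tn + fp)      # specificity: true-negative rate
    # Matthews correlation coefficient; reported as 0.0 when undefined.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"ACC": acc, "SEN": sen, "SPE": spe, "MCC": mcc}


# Illustrative counts only (not data from the table above).
metrics = binary_metrics(tp=80, fn=20, tn=150, fp=50)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Because MCC uses all four cells of the confusion matrix, it can separate feature sets whose accuracies are nearly identical, which may explain why the table reports it alongside ACC.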