Table 2.
10-fold cross-validation results of selected classifiers built on the 4-class classification dataset
| Algorithm | Descriptor set | Accuracy (%) | Kappa | AUC |
|---|---|---|---|---|
| MultilayerPerceptron, training time = 600 | MOE2D | 55.3 | 0.39 | 0.81 |
| MultilayerPerceptron, training time = 600 | MACCS | 59.0 | 0.44 | 0.79 |
| K nearest neighbours, k = 5 | MOE2D | 60.2 | 0.46 | 0.80 |
| K nearest neighbours, k = 5 | MACCS | 57.1 | 0.41 | 0.78 |
| K nearest neighbours, k = 5 | ECFP | 54.7 | 0.40 | 0.80 |
| RotationForest, iterations = 50 | MOE2D | 60.9 | 0.47 | 0.85 |
| RotationForest, iterations = 50 | MACCS | 62.7 | 0.50 | 0.84 |
| SVM with polynomial kernel | MOE2D | 57.1 | 0.42 | 0.78 |
| SVM with polynomial kernel | MACCS | 62.7 | 0.49 | 0.80 |
| SVM with polynomial kernel | ECFP | 70.2 | 0.60 | 0.84 |
| RandomForest, 200 trees | MOE2D | 60.9 | 0.47 | 0.85 |
| RandomForest, 200 trees | MACCS | 60.9 | 0.47 | 0.86 |
| RandomForest, 200 trees | ECFP | 67.1 | 0.55 | 0.88 |
Where no results are shown for the ECFP descriptor set, the computation required too much time and/or memory. The model that gives the best cross-validation results is shown in italics.
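For readers unfamiliar with the Accuracy and Kappa columns of the table above, the following is a minimal sketch of how both statistics can be derived from a multi-class confusion matrix. It assumes Kappa refers to the standard Cohen's kappa; the function name `accuracy_and_kappa` and the 4-class matrix `toy` are hypothetical and invented purely for illustration. They are not data from this study.

```python
def accuracy_and_kappa(confusion):
    """Return (accuracy, Cohen's kappa) for a square confusion matrix.

    confusion[i][j] = number of samples of true class i predicted as class j.
    """
    n_classes = len(confusion)
    total = sum(sum(row) for row in confusion)

    # Observed agreement: fraction of samples on the diagonal (the accuracy).
    observed = sum(confusion[i][i] for i in range(n_classes)) / total

    # Agreement expected by chance, from the row and column marginals.
    row_totals = [sum(row) for row in confusion]
    col_totals = [sum(confusion[i][j] for i in range(n_classes))
                  for j in range(n_classes)]
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2

    kappa = (observed - expected) / (1 - expected)
    return observed, kappa


# Made-up 4-class confusion matrix (illustration only, NOT results from the paper).
toy = [
    [30, 5, 3, 2],
    [6, 25, 5, 4],
    [4, 6, 24, 6],
    [2, 4, 6, 28],
]

acc, kappa = accuracy_and_kappa(toy)
print(f"accuracy = {100 * acc:.1f}%, kappa = {kappa:.2f}")
```

Kappa rescales the accuracy by the agreement expected by chance, which is why the Kappa values are lower than the accuracies for the same model.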