Table 3. Results of FS Methods for SC-1 and SC-2.
| Feature selection | Preprocessing | Classifier | Wrapper | Number of features | AUPRC | AUROC |
|---|---|---|---|---|---|---|
| SC-1 Ph.1 | ||||||
| ReliefF* | STPE | LR | KNN | 55 | 0.9341 | 0.8235 |
| ReliefF | AF | LR | XGB | 10 | 0.9264 | 0.7745 |
| SC-1 Ph.3 | ||||||
| Fisher Score* | STPE | KNN | XGB | 60 | 0.9746 | 0.9167 |
| F Score* | STPE | KNN | XGB | 60 | 0.9746 | 0.9167 |
| mRMR* | STPE | SVM | XGB | 275 | 0.9706 | 0.9118 |
| mRMR* | STPE | KNN | KNN | 6,805 | 0.9632 | 0.8775 |
| Fisher Score* | STPE | RF | LR | 20,481 | 0.9628 | 0.8725 |
| Gini Index* | STPE | KNN | KNN | 12,302 | 0.9572 | 0.8627 |
| ReliefF* | STPE | XGB | KNN | 18,913 | 0.9502 | 0.8725 |
| ReliefF | STPE | LR | LR | 22,277 | 0.9498 | 0.8627 |
| Fisher Score* | AF | KNN | XGB | 40 | 0.9429 | 0.8235 |
| mRMR | AF | SVM | KNN | 16 | 0.9325 | 0.7745 |
| SC-2 Ph.1 | ||||||
| Fisher Score* | AF | KNN | KNN | 17,566 | 0.8515 | 0.8712 |
| F Score* | AF | KNN | KNN | 17,566 | 0.8515 | 0.8712 |
| Gini Index* | AF | LR | KNN | 14,673 | 0.8365 | 0.8561 |
| Chi Square* | AF | XGB | LR | 8 | 0.8187 | 0.7765 |
| Chi Square | STPE | KNN | XGB | 54 | 0.8112 | 0.7689 |
| Chi Square | AF | LR | KNN | 5 | 0.8039 | 0.7879 |
| SC-2 Ph.3 | ||||||
| Fisher Score* | AF | KNN | KNN | 18,084 | 0.8956 | 0.8561 |
| F Score* | AF | KNN | KNN | 18,084 | 0.8956 | 0.8561 |
| Gini Index* | AF | SVM | LR | 116 | 0.8908 | 0.8939 |
| Chi Square | AF | LR | LR | 19 | 0.8759 | 0.8712 |
| Chi Square | AF | LR | KNN | 6 | 0.8675 | 0.8561 |
| Chi Square | STPE | KNN | XGB | 180 | 0.8595 | 0.8333 |
| ReliefF | STPE | KNN | LR | 12,206 | 0.8518 | 0.8447 |
| Fisher Score* | STPE | KNN | KNN | 22,214 | 0.8497 | 0.8598 |
| Gini Index* | AF | LR | KNN | 8,495 | 0.8462 | 0.8333 |
| Gini Index* | AF | SVM | XGB | 4 | 0.8428 | 0.8258 |
| ReliefF | AF | LR | XGB | 92 | 0.8210 | 0.8106 |
Note:
Features are first ranked by a filter method; a wrapper algorithm then selects the best feature subset from that ranking. The Wrapper column gives the prediction algorithm used inside the wrapper. The Number of features column gives the number of distinct features selected. An asterisk (*) indicates that the hyper-parameters were not optimized.
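
The filter-then-wrapper procedure described in the note can be sketched roughly as follows. This is a minimal illustration, not the pipeline used to produce Table 3: it assumes a Fisher-score filter, a 1-nearest-neighbour wrapper scored by leave-one-out accuracy (rather than AUPRC or AUROC), and a search over the top-k prefixes of the ranking. The function names and the toy data are hypothetical.

```python
from statistics import mean, pvariance


def fisher_scores(X, y):
    """Filter step: per-feature ratio of between-class to within-class scatter."""
    classes = sorted(set(y))
    scores = []
    for j in range(len(X[0])):
        col = [row[j] for row in X]
        overall = mean(col)
        between = within = 0.0
        for c in classes:
            vals = [v for v, label in zip(col, y) if label == c]
            between += len(vals) * (mean(vals) - overall) ** 2
            within += len(vals) * pvariance(vals)
        scores.append(between / within if within > 0 else 0.0)
    return scores


def loo_accuracy_1nn(X, y, features):
    """Leave-one-out accuracy of a 1-NN classifier restricted to `features`."""
    correct = 0
    for i, xi in enumerate(X):
        best_dist, best_label = None, None
        for k, xk in enumerate(X):
            if k == i:
                continue  # hold out the sample being predicted
            dist = sum((xi[j] - xk[j]) ** 2 for j in features)
            if best_dist is None or dist < best_dist:
                best_dist, best_label = dist, y[k]
        correct += best_label == y[i]
    return correct / len(X)


def filter_then_wrapper(X, y, max_k):
    """Rank features with the filter, then let the wrapper pick the best top-k prefix."""
    scores = fisher_scores(X, y)
    ranking = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    best_subset, best_acc = [], -1.0
    for k in range(1, max_k + 1):
        subset = ranking[:k]
        acc = loo_accuracy_1nn(X, y, subset)
        if acc > best_acc:  # strict '>' keeps the smaller subset on ties
            best_subset, best_acc = subset, acc
    return ranking, best_subset, best_acc


# Hypothetical toy data: feature 0 separates the classes, features 1 and 2 are noise.
X = [
    [0.0, 5.0, 1.0],
    [0.1, 1.0, 2.0],
    [0.2, 4.0, 1.5],
    [1.0, 2.0, 1.2],
    [1.1, 5.5, 1.8],
    [1.2, 0.5, 1.1],
]
y = [0, 0, 0, 1, 1, 1]

ranking, subset, acc = filter_then_wrapper(X, y, max_k=3)
print("ranking:", ranking, "selected:", subset, "LOO accuracy:", acc)
```

In this sketch the length of the returned subset plays the role of the Number of features column, and swapping `loo_accuracy_1nn` for another scorer corresponds to changing the Wrapper column.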