Table 3. Performance of different machine learning pipelines.

| Machine learning pipeline | Training AUC | Cross-validation mean AUC | Test AUC |
|---|---|---|---|
| BorutaShap + RF* | 0.98* | 0.85* | 0.88* |
| BorutaShap + SVM | 0.99 | 0.82 | 0.84 |
| BorutaShap + LR | 0.96 | 0.82 | 0.83 |
| BorutaShap + MLP | 0.99 | 0.83 | 0.85 |
| Boruta + RF | 0.98 | 0.84 | 0.87 |
| Boruta + SVM | 0.97 | 0.84 | 0.86 |
| Boruta + LR | 0.95 | 0.83 | 0.84 |
| Boruta + MLP | 0.99 | 0.85 | 0.87 |
| LASSO + RF | 0.94 | 0.78 | 0.80 |
| LASSO + SVM | 0.94 | 0.78 | 0.80 |
| LASSO + LR | 0.98 | 0.84 | 0.86 |
| LASSO + MLP | 0.97 | 0.83 | 0.85 |
| RFE + RF | 0.97 | 0.84 | 0.86 |
| RFE + SVM | 0.96 | 0.85 | 0.87 |
| RFE + LR | 0.97 | 0.80 | 0.86 |
| RFE + MLP | 0.97 | 0.82 | 0.85 |

*, the best-performing pipeline. AUC, area under the receiver operating characteristic curve; RF, random forest; SVM, support vector machine; LR, logistic regression; MLP, multilayer perceptron; LASSO, least absolute shrinkage and selection operator; RFE, recursive feature elimination.
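
For readers less familiar with the metric reported in Table 3, the AUC can be read as the probability that a randomly chosen positive case receives a higher predicted score than a randomly chosen negative case. The sketch below illustrates that pairwise definition in plain Python. It is not the code used in the study: the function name `roc_auc` and the toy labels and scores are hypothetical and chosen only for illustration.

```python
from itertools import product


def roc_auc(labels, scores):
    """Return the AUC as the fraction of (positive, negative) pairs ranked correctly.

    A tie between a positive and a negative score counts as half a correct pair.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative sample")

    correct = 0.0
    for pos, neg in product(positives, negatives):
        if pos > neg:
            correct += 1.0   # positive ranked above negative
        elif pos == neg:
            correct += 0.5   # tie: half credit
    return correct / (len(positives) * len(negatives))


# Toy data for illustration only; these are NOT values from the study.
toy_labels = [1, 1, 1, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]
toy_auc = roc_auc(toy_labels, toy_scores)
print(f"Toy AUC: {toy_auc:.2f}")
```

Under this reading, a value of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation. The pairwise loop is quadratic in the number of samples, so it suits small illustrations; a rank-based implementation would be preferable for larger datasets.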