TABLE 7.
The best model class and its performance for each of the problems of interest: (i) normal v/s cancer using ten features, (ii) metastatic v/s non-metastatic using five features, (iii) molecular subtyping using 16 features, and (iv) histological subtyping using 24 features. Nested model selection was used to identify the best model class, with subsequent validation on external datasets. In the case of histological subtype, a voting ensemble of the two models shown was used for the external validation. The RF model for molecular subtyping was externally validated on another 26 TNBC samples, yielding 25 correct predictions. MCC and AUROC values of the best model in each case are scaled to the range [0,100].
| S.No | Model | Train: balanced acc. (%) | Test: balanced acc. (%) | External validation: balanced acc. (%) | External validation: specificity | External validation: sensitivity | External validation: precision (PPV) | External validation: MCC | External validation: AUROC |
|---|---|---|---|---|---|---|---|---|---|
| Normal v/s cancer | | | | | | | | | |
| 1 | NN (1 layer) | 99.82 | 100 | 97.42 | 95.74 | 99.09 | 95.74 | 94.84 | 97.42 |
| Non-metastatic v/s metastatic | | | | | | | | | |
| 2 | NN (1 layer) | 99.17 | 82.24 | 88.22 | 93.87 | 78.57 | 91.67 | 80.87 | 88.22 |
| Molecular subtype | | | | | | | | | |
| 3 | RF | 99.99 | 91.43 | 88.79 | 93.11 | 84.46 | 93.63 | 84.06 | 90.23 |
| Histological subtype | | | | | | | | | |
| 4 | XGBoost | 95.13 | 88.74 | 76.92 | 53.85 | 100 | 93.81 | 71.07 | 76.92 |
| 5 | NN (1 layer) | 96.97 | | | | | | | |
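For readers who wish to reproduce the kind of metrics reported in the table, the following is a minimal sketch of how balanced accuracy, specificity, sensitivity, precision (PPV), and MCC can be computed from binary confusion-matrix counts. The function name `binary_metrics` and the example counts are hypothetical and are not taken from the study; the sketch also assumes that "scaled to the range [0,100]" means multiplication by 100, which the caption does not state explicitly.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Compute binary classification metrics on a 0-100 scale.

    Assumption: scaling to [0, 100] is done by multiplying by 100.
    """
    sensitivity = tp / (tp + fn)          # true positive rate (recall)
    specificity = tn / (tn + fp)          # true negative rate
    precision = tp / (tp + fp)            # positive predictive value
    balanced_acc = (sensitivity + specificity) / 2

    # Matthews correlation coefficient; defined as 0 when the denominator is 0.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0

    raw = {
        "balanced_acc": balanced_acc,
        "specificity": specificity,
        "sensitivity": sensitivity,
        "precision": precision,
        "mcc": mcc,
    }
    return {name: 100 * value for name, value in raw.items()}


# Hypothetical counts, for illustration only (not from the study).
metrics = binary_metrics(tp=90, fp=5, tn=95, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

Because balanced accuracy is the mean of sensitivity and specificity, it can be checked directly against the two neighbouring columns of any row.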