Table 3. Performance of support vector machine (SVM) and random forest (RF) models developed using different sets of selected features, on the training dataset and the independent (external) validation dataset.
| Features | Dataset | Technique | Sensitivity (%) | Specificity (%) | Accuracy (%) | MCC | ROC |
|---|---|---|---|---|---|---|---|
| setA-1 (4 genes) | Training | SVM | 71.65 | 70.3 | 71.12 | 0.41 | 0.76 |
| setA-1 (4 genes) | Validation | SVM | 68.25 | 78.05 | 72.12 | 0.45 | 0.80 |
| setA-1 (4 genes) | Training | RF | 70.87 | 65.45 | 68.74 | 0.36 | 0.69 |
| setA-1 (4 genes) | Validation | RF | 73.02 | 58.54 | 67.31 | 0.32 | 0.74 |
| setB-1 (4 genes) | Training | SVM | 71.26 | 70.3 | 70.88 | 0.41 | 0.74 |
| setB-1 (4 genes) | Validation | SVM | 74.6 | 68.29 | 72.12 | 0.42 | 0.74 |
| setB-1 (4 genes) | Training | RF | 80.31 | 49.7 | 68.26 | 0.32 | 0.65 |
| setB-1 (4 genes) | Validation | RF | 82.54 | 51.22 | 70.19 | 0.36 | 0.68 |
| Combo-1 (8 genes) | Training | SVM | 75.20 | 70.30 | 73.27 | 0.45 | 0.77 |
| Combo-1 (8 genes) | Validation | SVM | 77.78 | 68.29 | 74.04 | 0.46 | 0.80 |
| Combo-1 (8 genes) | Training | RF | 81.1 | 55.15 | 70.88 | 0.38 | 0.73 |
| Combo-1 (8 genes) | Validation | RF | 82.54 | 51.22 | 70.19 | 0.36 | 0.74 |
The gene sets are setA-1 (4 overexpressed genes), setB-1 (4 underexpressed genes), and Combo-1 (the combination of both gene sets, i.e., setA-1 and setB-1).