Table 4.
Classification of SARS-CoV-2 versus non-SARS-CoV-2 using ten-fold cross-validation (CV) on the training data.
| Classifier | Genome biomarker selection method | Accuracy (%) (mean ± SD) | F-measure (mean ± SD) | Kappa score (mean ± SD) | Model build time (s) |
|---|---|---|---|---|---|
| k-NN | CFS | 96.47 ± 0.89 | 0.96 ± 0.01 | 0.92 ± 0.02 | 0.002 |
| k-NN | Correlation | 97.56 ± 1.35 | 0.97 ± 0.02 | 0.95 ± 0.03 | 0.01 |
| SVM-RBF | CFS (C = 100, gamma = 0.001) | 97.29 ± 1.96 | 0.96 ± 0.03 | 0.94 ± 0.04 | 0.2 |
| SVM-RBF | Correlation (C = 1000, gamma = 0.001) | 97.73 ± 1.43 | 0.96 ± 0.02 | 0.96 ± 0.03 | 0.29 |
| DT | CFS | 96.47 ± 1.74 | 0.95 ± 0.02 | 0.93 ± 0.04 | 0.14 |
| DT | Correlation | 97.46 ± 1.41 | 0.96 ± 0.02 | 0.94 ± 0.03 | 0.04 |
| RF | CFS | 97.92 ± 1.66 | 0.97 ± 0.02 | 0.96 ± 0.03 | 0.22 |
| RF | Correlation | 98.78 ± 1.09 | 0.98 ± 0.02 | 0.98 ± 0.02 | 0.27 |
k-NN: k-nearest neighbors; SVM-RBF: support vector machine with radial basis function kernel; DT: decision tree; RF: random forest; CFS: correlation-based feature selection; Correlation: Pearson correlation coefficient; SD: standard deviation.
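
For readers who want to see how the accuracy, F-measure, and kappa columns relate to per-fold confusion counts, the following is a minimal sketch, not the authors' pipeline. It assumes a binary problem, reports the F-measure of the positive class only (the source may have used a class-weighted average), and uses the sample standard deviation across folds. The function names `binary_metrics` and `summarize` and the fold counts in `example_folds` are hypothetical and chosen purely for illustration; they are not data from this study.

```python
from statistics import mean, stdev


def binary_metrics(tp, fp, fn, tn):
    """Return (accuracy in %, F-measure, Cohen's kappa) for one fold.

    tp, fp, fn, tn are the confusion-matrix counts of that fold.
    """
    n = tp + fp + fn + tn
    accuracy = (tp + tn) / n

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F-measure of the positive class: harmonic mean of precision and recall.
    if precision + recall:
        f_measure = 2 * precision * recall / (precision + recall)
    else:
        f_measure = 0.0

    # Agreement expected by chance, computed from the predicted and
    # actual class marginals.
    p_expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / (n * n)
    # Cohen's kappa: observed agreement corrected for chance agreement.
    kappa = (accuracy - p_expected) / (1 - p_expected) if p_expected != 1 else 0.0

    return 100 * accuracy, f_measure, kappa


def summarize(folds):
    """Return [(mean, SD), ...] for accuracy, F-measure, and kappa over folds."""
    per_fold = [binary_metrics(*counts) for counts in folds]
    # Transpose so that each tuple holds one metric across all folds.
    return [(mean(values), stdev(values)) for values in zip(*per_fold)]


# Hypothetical (tp, fp, fn, tn) counts for three folds; illustrative only.
example_folds = [(48, 2, 1, 49), (47, 1, 3, 49), (50, 0, 2, 48)]

for name, (m, sd) in zip(("Accuracy (%)", "F-measure", "Kappa"), summarize(example_folds)):
    print(f"{name}: {m:.2f} ± {sd:.2f}")
```

With ten folds instead of three, the same summary yields one "mean ± SD" entry per metric, which is the form reported in each row of Table 4.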