Table 3. Comparison of the predictive accuracies of models on the training and testing datasets.
Gene set | Genes | Statistic | Training dataset | | | | Testing dataset | | |
 | | | LDA | KNN | SVM | RF | LDA | KNN | SVM | RF
Single genes | | | | | | | | | |
 | | Mean | 0.667 | 0.731 | 0.740 | 0.999 | 0.662 | 0.650 | 0.641 | 0.603
 | | SD | 0.057 | 0.049 | 0.055 | 0.003 | 0.093 | 0.095 | 0.092 | 0.102
Combined biomarkers | | | | | | | | | |
1 | PLA2G2A, WRAP73 | Mean | 0.893 | 0.888 | 0.899 | 1.000 | 0.873 | 0.859 | 0.864 | 0.841
 | | SD | 0.026 | 0.035 | 0.031 | 0.000 | 0.077 | 0.060 | 0.055 | 0.088
2 | DOHH, SLC22A14 | Mean | 0.802 | 0.879 | 0.882 | 1.000 | 0.800 | 0.852 | 0.826 | 0.829
 | | SD | 0.026 | 0.026 | 0.028 | 0.000 | 0.071 | 0.057 | 0.070 | 0.062
3 | OXTR, FURIN | Mean | 0.860 | 0.840 | 0.879 | 1.000 | 0.851 | 0.783 | 0.808 | 0.789
 | | SD | 0.027 | 0.038 | 0.030 | 0.000 | 0.061 | 0.065 | 0.061 | 0.073
4 | SLC41A3, BBIP1 | Mean | 0.887 | 0.889 | 0.935 | 1.000 | 0.881 | 0.838 | 0.852 | 0.799
 | | SD | 0.028 | 0.028 | 0.022 | 0.000 | 0.055 | 0.060 | 0.055 | 0.065
5 | TBP, TICAM1 | Mean | 0.854 | 0.867 | 0.881 | 1.000 | 0.827 | 0.815 | 0.803 | 0.782
 | | SD | 0.041 | 0.030 | 0.029 | 0.000 | 0.071 | 0.066 | 0.070 | 0.066
6 | MGRN1, PDGFB, ZNF764 | Mean | 0.863 | 0.880 | 0.894 | 1.000 | 0.834 | 0.827 | 0.819 | 0.842
 | | SD | 0.025 | 0.027 | 0.026 | 0.000 | 0.065 | 0.049 | 0.067 | 0.057
7 | PSPC1, MPI, EIF5 | Mean | 0.832 | 0.866 | 0.889 | 1.000 | 0.810 | 0.854 | 0.853 | 0.829
 | | SD | 0.046 | 0.023 | 0.025 | 0.000 | 0.072 | 0.057 | 0.060 | 0.059
8 | WDR6, PFDN6, PSPC1 | Mean | 0.853 | 0.856 | 0.869 | 1.000 | 0.843 | 0.786 | 0.805 | 0.798
 | | SD | 0.030 | 0.033 | 0.027 | 0.000 | 0.059 | 0.069 | 0.059 | 0.071
9 | ADM2, MFSD10, PAFAH1B1 | Mean | 0.834 | 0.817 | 0.858 | 1.000 | 0.799 | 0.734 | 0.771 | 0.789
 | | SD | 0.028 | 0.034 | 0.029 | 0.000 | 0.070 | 0.075 | 0.082 | 0.060
10 | LPIN1, PFDN6, DOHH | Mean | 0.869 | 0.885 | 0.927 | 1.000 | 0.858 | 0.860 | 0.873 | 0.920
 | | SD | 0.036 | 0.026 | 0.015 | 0.000 | 0.037 | 0.020 | 0.048 | 0.036
LDA, Linear discriminant analysis; KNN, k-Nearest neighbors; SVM, Support vector machine; RF, Random forest.
Column headings indicate the machine learning (ML) method. Values are the mean (top) and standard deviation (bottom) of the predictive accuracy calculated from 100 repetitions.
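The summary statistics reported in the table, and the gap between training and testing accuracy, can be illustrated with a minimal Python sketch. It is an illustration only, not the authors' analysis code. The function names `summarize` and `train_test_gap` are hypothetical, the use of the sample (n − 1) standard deviation is an assumption because the source does not state which estimator was used, and the three-element accuracy list is a placeholder rather than study data. The mean accuracies for gene set 1 are copied from Table 3.

```python
from statistics import mean, stdev


def summarize(accuracies):
    """Return (mean, SD) of per-repetition accuracies, rounded to 3 decimals.

    Assumption: the sample (n - 1) standard deviation is used; the source
    does not say whether the sample or population estimator was applied.
    """
    return round(mean(accuracies), 3), round(stdev(accuracies), 3)


def train_test_gap(train_means, test_means):
    """Return training minus testing mean accuracy for each ML method."""
    return {m: round(train_means[m] - test_means[m], 3) for m in train_means}


# Placeholder accuracies from three hypothetical repetitions (not study data).
placeholder_mean, placeholder_sd = summarize([0.8, 0.9, 1.0])

# Mean accuracies for gene set 1 (PLA2G2A, WRAP73), copied from Table 3.
train = {"LDA": 0.893, "KNN": 0.888, "SVM": 0.899, "RF": 1.000}
test = {"LDA": 0.873, "KNN": 0.859, "SVM": 0.864, "RF": 0.841}

gaps = train_test_gap(train, test)
for method, gap in gaps.items():
    print(f"{method}: training - testing = {gap:.3f}")
```

For gene set 1, the gap is largest for RF, whose training accuracy of 1.000 falls to 0.841 on the testing dataset, whereas the LDA gap is only 0.020. A large gap of this kind is commonly read as a sign of overfitting to the training data.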