Table 1.
Feature | Method | Precision | Recall | F-Score | Accuracy |
---|---|---|---|---|---|
K-tuple features | Autoencoder(8-mer) + SVM | 0.3622 | 0.2709 | 0.2388 | 0.6650 |
K-tuple features | Autoencoder(8-mer) + RF | 0.3558 | 0.2701 | 0.2379 | 0.6654 |
K-tuple features | Autoencoder(8-mer) + LR | 0.2081 | 0.2506 | 0.2040 | 0.6460 |
K-tuple features | Autoencoder(8-mer) + XGBoost | 0.3271 | 0.2741 | 0.2487 | 0.6559 |
K-tuple features | Autoencoder(8-mer) + LightGBM | 0.3031 | 0.2649 | 0.2308 | 0.6573 |
K-tuple features | Autoencoder(8-mer) + EDP + SVM | 0.3888 | 0.2682 | 0.2331 | 0.6647 |
K-tuple features | Autoencoder(8-mer) + EDP + RF | 0.2938 | 0.2712 | 0.2376 | 0.6661 |
K-tuple features | Autoencoder(8-mer) + EDP + LR | 0.3787 | 0.2906 | 0.2790 | 0.6430 |
K-tuple features | Autoencoder(8-mer) + EDP + XGBoost | 0.3315 | 0.2716 | 0.2464 | 0.6522 |
K-tuple features | Autoencoder(8-mer) + EDP + LightGBM | 0.2946 | 0.2668 | 0.2325 | 0.6606 |
Properties of open reading frame | SVM | 0.1622 | 0.2500 | 0.1967 | 0.6488 |
Properties of open reading frame | RF | 0.3596 | 0.2863 | 0.2748 | 0.6387 |
Properties of open reading frame | LR | 0.2641 | 0.2575 | 0.2120 | 0.6598 |
Properties of open reading frame | XGBoost | 0.3023 | 0.2644 | 0.2404 | 0.6265 |
Properties of open reading frame | LightGBM | 0.2477 | 0.2526 | 0.2098 | 0.6457 |
Fickett nucleotide features | SVM | 0.2843 | 0.2560 | 0.2120 | 0.6497 |
Fickett nucleotide features | RF | 0.3108 | 0.2814 | 0.2633 | 0.6570 |
Fickett nucleotide features | LR | 0.1985 | 0.2633 | 0.2167 | 0.6539 |
Fickett nucleotide features | XGBoost | 0.3874 | 0.2946 | 0.2910 | 0.6366 |
Fickett nucleotide features | LightGBM | 0.3636 | 0.2904 | 0.2844 | 0.6338 |
Physicochemical properties | SVM | 0.3232 | 0.2564 | 0.2098 | 0.6549 |
Physicochemical properties | RF | 0.2740 | 0.2673 | 0.2495 | 0.6127 |
Physicochemical properties | LR | 0.3449 | 0.2629 | 0.2229 | 0.6636 |
Physicochemical properties | XGBoost | 0.2752 | 0.2649 | 0.2399 | 0.6268 |
Physicochemical properties | LightGBM | 0.4111 | 0.3913 | 0.3728 | 0.7018 |
Multi-scale secondary structures | SVM | 0.5076 | 0.4590 | 0.4356 | 0.7169 |
Multi-scale secondary structures | RF | 0.4204 | 0.4171 | 0.4000 | 0.6927 |
Multi-scale secondary structures | LR | 0.2648 | 0.2574 | 0.2133 | 0.6576 |
Multi-scale secondary structures | XGBoost | 0.4318 | 0.4122 | 0.4023 | 0.6928 |
Multi-scale secondary structures | LightGBM | 0.4248 | 0.4040 | 0.3870 | 0.7042 |

For testing purposes, the autoencoder converts the 65,536-dimensional 8-mer data (4^8 possible 8-mers) into a 128-dimensional output. The encoder takes the 65,536-dimensional input and passes it through three intermediate layers of 4,096, 1,024, and 256 nodes, respectively. The decoder mirrors the encoder, so the autoencoder maps the 8-mer representation of each sequence to a 128-dimensional real-valued vector. In the Method column, EDP denotes the combination of the EDP of the 2-mer and the EDP of the open reading frame (ORF).
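
The dimensions in the note above can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes the 8-mer data are normalized frequency counts over the alphabet A, C, G, T (with U mapped to T), and that the 128-dimensional layer is the bottleneck following the 4,096, 1,024, and 256-node layers. The names `kmer_frequency_vector`, `ENCODER_SIZES`, `DECODER_SIZES`, and `dense_parameter_count` are hypothetical.

```python
ALPHABET = "ACGT"  # assumption: RNA 'U' is mapped to 'T' before counting
K = 8


def kmer_frequency_vector(seq, k=K):
    """Return normalized k-mer frequencies as a list of length 4**k.

    Assumption: the "8-mer data" are frequency counts; the source does not
    say how the 65,536-dimensional input is built.
    """
    code = {base: i for i, base in enumerate(ALPHABET)}
    seq = seq.upper().replace("U", "T")
    counts = [0.0] * (4 ** k)
    total = 0
    for start in range(len(seq) - k + 1):
        window = seq[start:start + k]
        if any(base not in code for base in window):
            continue  # skip windows containing ambiguous bases such as N
        idx = 0
        for base in window:
            idx = idx * 4 + code[base]  # read the k-mer as a base-4 number
        counts[idx] += 1.0
        total += 1
    if total:
        counts = [c / total for c in counts]
    return counts


# Layer widths taken from the table note: input, three hidden layers, output.
ENCODER_SIZES = [4 ** K, 4096, 1024, 256, 128]
DECODER_SIZES = ENCODER_SIZES[::-1]  # the decoder mirrors the encoder


def dense_parameter_count(sizes):
    """Weights plus biases for a stack of fully connected layers."""
    return sum(n_in * n_out + n_out for n_in, n_out in zip(sizes, sizes[1:]))
```

Under these assumptions, `kmer_frequency_vector(seq)` yields the 65,536-dimensional input for one sequence. `dense_parameter_count(ENCODER_SIZES)` shows that the first fully connected layer accounts for nearly all of the encoder's parameters, which may be why the reduction to 128 dimensions is performed before the classifiers are trained.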