2020 Oct 1;21(19):7271. doi: 10.3390/ijms21197271

Table 1.

Comparison of the basic features across different models.

Feature	Method	Precision	Recall	F-Score	Accuracy
K-tuple features	Autoencoder(8-mer) + SVM	0.3622	0.2709	0.2388	0.6650
	Autoencoder(8-mer) + RF	0.3558	0.2701	0.2379	0.6654
	Autoencoder(8-mer) + LR	0.2081	0.2506	0.2040	0.6460
	Autoencoder(8-mer) + XGBoost	0.3271	0.2741	0.2487	0.6559
	Autoencoder(8-mer) + LightGBM	0.3031	0.2649	0.2308	0.6573
	Autoencoder(8-mer) + EDP + SVM	0.3888	0.2682	0.2331	0.6647
	Autoencoder(8-mer) + EDP + RF	0.2938	0.2712	0.2376	0.6661
	Autoencoder(8-mer) + EDP + LR	0.3787	0.2906	0.2790	0.6430
	Autoencoder(8-mer) + EDP + XGBoost	0.3315	0.2716	0.2464	0.6522
	Autoencoder(8-mer) + EDP + LightGBM	0.2946	0.2668	0.2325	0.6606
Properties of open reading frame	SVM	0.1622	0.2500	0.1967	0.6488
	RF	0.3596	0.2863	0.2748	0.6387
	LR	0.2641	0.2575	0.2120	0.6598
	XGBoost	0.3023	0.2644	0.2404	0.6265
	LightGBM	0.2477	0.2526	0.2098	0.6457
Fickett nucleotide features	SVM	0.2843	0.2560	0.2120	0.6497
	RF	0.3108	0.2814	0.2633	0.6570
	LR	0.1985	0.2633	0.2167	0.6539
	XGBoost	0.3874	0.2946	0.2910	0.6366
	LightGBM	0.3636	0.2904	0.2844	0.6338
Physicochemical properties	SVM	0.3232	0.2564	0.2098	0.6549
	RF	0.2740	0.2673	0.2495	0.6127
	LR	0.3449	0.2629	0.2229	0.6636
	XGBoost	0.2752	0.2649	0.2399	0.6268
	LightGBM	0.4111	0.3913	0.3728	0.7018
Multi-scale secondary structures	SVM	0.5076	0.4590	0.4356	0.7169
	RF	0.4204	0.4171	0.4000	0.6927
	LR	0.2648	0.2574	0.2133	0.6576
	XGBoost	0.4318	0.4122	0.4023	0.6928
	LightGBM	0.4248	0.4040	0.3870	0.7042

For testing purposes, the autoencoder compresses the 65,536-dimensional 8-mer data into a 128-dimensional output. The encoder consists of the 65,536-dimensional input followed by three intermediate layers of 4096, 1024, and 256 nodes, respectively. The decoder mirrors the encoder. The network thus converts the 8-mer sequence into a 128-dimensional real-valued vector. In the Method column, EDP denotes the combination of the EDP of the 2-mer and the EDP of the ORF.
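The 65,536 input dimensions correspond to the 4^8 possible 8-mers over a four-letter nucleotide alphabet. The following is a minimal sketch of how such an input vector and the layer widths quoted above could be laid out. It is not the authors' implementation. The function names (`kmer_index`, `kmer_frequencies`, `dense_parameter_count`), the normalization of counts to frequencies, the skipping of windows that contain ambiguous bases, and the use of fully connected layers with biases are all assumptions made for illustration.

```python
# Illustrative sketch only; all names and design choices here are assumptions.

CODE = {"A": 0, "C": 1, "G": 2, "T": 3}

# Layer widths taken from the table footnote: input, three intermediate
# layers, and the 128-dimensional code. The decoder is assumed to mirror this.
ENCODER_DIMS = [4 ** 8, 4096, 1024, 256, 128]
DECODER_DIMS = ENCODER_DIMS[::-1]


def kmer_index(kmer):
    """Map a k-mer to its base-4 index, or None if it has a non-ACGT symbol."""
    idx = 0
    for ch in kmer:
        if ch not in CODE:
            return None
        idx = idx * 4 + CODE[ch]
    return idx


def kmer_frequencies(seq, k=8):
    """Return a list of length 4**k holding normalized k-mer frequencies.

    Assumption: RNA 'U' is treated as 'T', and windows containing any other
    symbol (for example 'N') are skipped.
    """
    seq = seq.upper().replace("U", "T")
    counts = [0] * (4 ** k)
    total = 0
    for i in range(len(seq) - k + 1):
        idx = kmer_index(seq[i:i + k])
        if idx is not None:
            counts[idx] += 1
            total += 1
    if total == 0:
        return [0.0] * (4 ** k)
    return [c / total for c in counts]


def dense_parameter_count(dims):
    """Count weights and biases for a stack of fully connected layers."""
    return sum(a * b + b for a, b in zip(dims, dims[1:]))


if __name__ == "__main__":
    vec = kmer_frequencies("ACGUACGUACGGAUCC")
    print(len(vec))                             # input dimension of the encoder
    print(dense_parameter_count(ENCODER_DIMS))  # encoder size under the dense-layer assumption
```

Under the dense-layer assumption, almost all of the encoder's parameters sit in the first 65,536 × 4096 layer, which is one reason a compact 128-dimensional code is convenient as input to the downstream classifiers listed in the table.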