. 2022 Mar 14;13:800853. doi: 10.3389/fgene.2022.800853

TABLE 5.

Mean ± SD of five performance metrics over 50 different train–test-split experiments on the GEO GSE37745 data set. Accuracy, precision, recall, and F1-score were calculated at the optimal threshold selected using Youden’s J statistic.

Model	AUC	Accuracy	Precision	Recall	F1-score
DL-four-inputs	72.51 ± 6%	73.85 ± 6%	77.39 ± 14%	79.26 ± 7%	77.18 ± 7%
DL-three-inputs-age	70.77 ± 5%	71.03 ± 5%	68.96 ± 17%	81.26 ± 7%	72.60 ± 9%
DL-three-inputs-stage	72.36 ± 6%	72.46 ± 6%	71.04 ± 16%	81.32 ± 7%	74.39 ± 8%
DL-two-inputs	69.74 ± 6%	69.74 ± 6%	65.30 ± 17%	82.33 ± 9%	70.58 ± 10%
DL-one-input-BRITE	68.88 ± 5%	70.56 ± 5%	70.52 ± 14%	79.10 ± 8%	73.16 ± 7%
DL-one-input-pathway	67.37 ± 5%	68.05 ± 5%	62.70 ± 15%	80.91 ± 9%	68.89 ± 8%
KNN	55.76 ± 8%	63.85 ± 9%	56.35 ± 26%	82.35 ± 13%	60.84 ± 20%
SVM	54.32 ± 8%	61.33 ± 6%	63.13 ± 23%	72.04 ± 10%	63.28 ± 15%
Random-forest	55.59 ± 8%	60.72 ± 7%	52.78 ± 23%	77.37 ± 11%	58.21 ± 17%
Logistic-regression	54.08 ± 8%	58.51 ± 7%	49.83 ± 24%	75.82 ± 11%	55.07 ± 17%
MLP	54.69 ± 8%	59.03 ± 7%	49.04 ± 24%	75.89 ± 9%	55.56 ± 15%

The highest value in each column among all the models belongs to DL-four-inputs for AUC, accuracy, precision, and F1-score, and to KNN for recall.
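
The threshold selection named in the caption can be illustrated with a small sketch. Youden's J statistic is J = sensitivity + specificity - 1, and the optimal threshold is the cut-off that maximizes J. The Python below is a minimal illustration on made-up labels and scores, not the authors' code: the function names `youden_threshold` and `metrics_at` and the toy data are assumptions introduced here.

```python
from typing import Dict, Sequence, Tuple


def youden_threshold(y_true: Sequence[int], y_score: Sequence[float]) -> Tuple[float, float]:
    """Return (threshold, J) maximizing J = sensitivity + specificity - 1.

    A sample is predicted positive when its score is >= threshold.
    """
    n_pos = sum(1 for y in y_true if y == 1)
    n_neg = len(y_true) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("both classes must be present")

    best_t, best_j = 0.0, float("-inf")
    # Every distinct score is a candidate cut-off.
    for t in sorted(set(y_score), reverse=True):
        tp = sum(1 for y, s in zip(y_true, y_score) if y == 1 and s >= t)
        fp = sum(1 for y, s in zip(y_true, y_score) if y == 0 and s >= t)
        # TPR - FPR is the same quantity as sensitivity + specificity - 1.
        j = tp / n_pos - fp / n_neg
        if j > best_j:
            best_t, best_j = t, j
    return best_t, best_j


def metrics_at(y_true: Sequence[int], y_score: Sequence[float], threshold: float) -> Dict[str, float]:
    """Accuracy, precision, recall, and F1-score at a fixed threshold."""
    y_pred = [1 if s >= threshold else 0 for s in y_score]
    tp = sum(1 for y, p in zip(y_true, y_pred) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(y_true, y_pred) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(y_true, y_pred) if y == 1 and p == 0)
    tn = sum(1 for y, p in zip(y_true, y_pred) if y == 0 and p == 0)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    return {
        "accuracy": (tp + tn) / len(y_true),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Toy data (hypothetical, for illustration only).
y_true = [0, 0, 1, 1, 0, 1, 1, 0]
y_score = [0.1, 0.4, 0.35, 0.8, 0.2, 0.7, 0.9, 0.5]

threshold, j_stat = youden_threshold(y_true, y_score)
scores = metrics_at(y_true, y_score, threshold)
print(threshold, j_stat, scores)
```

On this toy data the sketch selects the threshold 0.7 with J = 0.75. Repeating such a computation on each of the 50 train–test splits and summarizing each metric by its mean and standard deviation would give entries of the form reported in the table.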