2022 Aug 31;22(17):6594. doi: 10.3390/s22176594

Table 4.

Mean and standard deviation of the 5-fold cross-validation results in terms of F1 score, precision, recall, and accuracy. The first three rows report the results of other competitive methods implemented on our dataset. For ensembled models, the type of machine learning model used is recorded in the Model column. The number of trainable parameters is displayed in thousands in the rightmost column. Models that show a significant difference from the TN stage (baseline) for all evaluation metrics are marked with an asterisk (*).

	F1 Score	Precision	Recall	Accuracy	Model	Trainable Parameters
Clinical (David Cox) [38]	68.11 (±2.4)	56.51 (±1.4)	85.84 (±5.5)	52.90 (±2.5)	Cox PH	-
HCR (Wen Yu et al.) [40]	72.19 (±3.4)	63.78 (±2.1)	83.23 (±5.5)	62.44 (±3.8)	RSF	-
DLR (André Diamant et al.) [41]	74.55 (±5.4)	69.12 (±6.3)	81.12 (±5.2)	67.35 (±7.1)	CNN	916 K
TN stage (baseline)	65.67 (±5.1)	70.09 (±8.0)	63.81 (±10.6)	61.54 (±3.8)	NN	1 K
Clinical	73.54 (±2.4)	68.99 (±5.1)	79.07 (±2.2)	66.46 (±3.8)	NN1	1 K
HCR	76.61 (±4.7)	73.07 (±5.2)	80.61 (±4.7)	71.08 (±5.7)	NN2	10 K
DLR	76.28 (±4.9)	70.11 (±3.1)	84.80 (±11.2)	69.54 (±4.2)	CNN	1046 K
Clinical & HCR *	77.58 (±5.0)	75.29 (±4.1)	80.08 (±6.4)	72.92 (±5.6)	QDA	11 K
Clinical & DLR *	76.86 (±4.7)	71.03 (±3.7)	83.75 (±6.1)	70.46 (±5.6)	LR	1047 K
HCR & DLR *	77.65 (±5.0)	73.70 (±4.5)	82.19 (±6.6)	72.31 (±5.9)	LR	1056 K
Clinical & HCR & DLR *	77.79 (±5.3)	75.71 (±4.8)	80.08 (±6.4)	73.23 (±6.0)	LDA	1057 K
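
As a reading aid, the following minimal sketch shows one way entries of the form "mean (±std)" could be derived from per-fold confusion counts. It is an illustration under stated assumptions, not the authors' pipeline: the function names (`fold_metrics`, `summarize`, `format_cell`) and the fold counts are hypothetical, and the use of the sample standard deviation is an assumption, since the table does not say which estimator was used.

```python
from statistics import mean, stdev


def fold_metrics(tp, fp, fn, tn):
    """Return F1, precision, recall and accuracy (in percent) for one fold."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    return {
        "f1": 100 * f1,
        "precision": 100 * precision,
        "recall": 100 * recall,
        "accuracy": 100 * accuracy,
    }


def summarize(folds):
    """Aggregate per-fold metrics into (mean, std) pairs.

    Assumption: sample standard deviation (n - 1 denominator).
    """
    per_fold = [fold_metrics(*counts) for counts in folds]
    summary = {}
    for name in per_fold[0]:
        values = [m[name] for m in per_fold]
        summary[name] = (mean(values), stdev(values))
    return summary


def format_cell(avg, std):
    """Format a value pair in the table's 'mean (±std)' style."""
    return f"{avg:.2f} (±{std:.1f})"


if __name__ == "__main__":
    # Hypothetical (tp, fp, fn, tn) counts for five folds; not data from the paper.
    folds = [
        (30, 10, 8, 17),
        (28, 12, 7, 18),
        (31, 9, 9, 16),
        (27, 11, 6, 21),
        (29, 13, 8, 15),
    ]
    for name, (avg, std) in summarize(folds).items():
        print(f"{name}: {format_cell(avg, std)}")
```

Because the statistics are taken over only five folds, the standard deviations are coarse estimates, which is worth keeping in mind when comparing rows whose means differ by less than one standard deviation.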