Author manuscript; available in PMC: 2021 Oct 1.

Published in final edited form as: Med Image Anal. 2020 Jul 18;65:101785. doi: 10.1016/j.media.2020.101785

Table 4.

Experimental results on clinical datasets (%, average (std) of cross-validation)

Method	Accuracy	AUC	F1	Recall	Precision	p-value
Ori CNN	84.80(2.43)	89.00(1.65)	70.29(4.26)	63.46(3.51)	78.83(5.70)	<0.05
MC-CNN	84.51(1.29)	90.85(1.13)	70.55(1.29)	62.85(1.53)	80.84(4.42)	<0.05
LSTM	86.27(1.29)	90.27(1.15)	74.17(2.47)	69.73(2.62)	79.56(5.69)	0.08
Time-LSTM	85.79(2.37)	90.81(1.57)	74.57(3.81)	71.08(3.56)	78.71(6.48)	0.42
tLSTM	86.42(1.48)	91.06(1.48)	74.36(1.99)	68.55(1.55)	81.49(5.28)	0.40

DLSTM1	86.97(1.45)	91.17(1.53)	76.11(2.68)	72.71(2.38)	80.04(5.18)	*(base)
DLSTM2	86.98(1.20)	91.41(1.51)	75.54(1.67)	71.24(5.01)	81.22(6.11)	--
DLSTM3	85.99(1.13)	91.10(1.69)	74.68(2.89)	70.51(6.07)	80.23(5.55)	--
DLSTM4	86.91(1.37)	91.07(1.28)	75.85(1.94)	72.39(3.65)	80.21(6.34)	--

The average and standard deviation (std) of five-fold test results are reported.
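As a rough illustration of how an "average (std)" cell could be produced from five per-fold scores, here is a minimal sketch. The function name `summarize_folds` and the fold values are hypothetical and are not taken from the paper. The source does not say whether the sample or the population standard deviation was used, so the sketch exposes that choice as a parameter and defaults to the sample form as an assumption.

```python
from statistics import mean, pstdev, stdev


def summarize_folds(fold_scores, sample=True):
    """Format per-fold scores (in %) as 'mean(std)', e.g. '87.00(1.58)'.

    sample=True uses the sample standard deviation (n - 1 denominator);
    sample=False uses the population form. Which one the table uses is
    not stated in the source, so this default is an assumption.
    """
    spread = stdev(fold_scores) if sample else pstdev(fold_scores)
    return f"{mean(fold_scores):.2f}({spread:.2f})"


# Made-up fold accuracies, for illustration only (not data from the paper).
example_folds = [85.0, 86.0, 87.0, 88.0, 89.0]
summary_sample = summarize_folds(example_folds)
summary_population = summarize_folds(example_folds, sample=False)
```

With only five folds the two conventions differ noticeably, so it is worth stating which one is reported.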

The best average results are shown in bold. A p-value < 0.05 indicates that our method significantly improves on the compared method (McNemar test).
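For readers unfamiliar with the McNemar test named in the note above, the following is a minimal sketch of how two classifiers can be compared on the same test subjects. It assumes the exact (binomial) two-sided variant; the source does not state which variant was used, and a chi-square approximation is also common. The function name `mcnemar_exact` and the toy labels are hypothetical and are not taken from the paper.

```python
from math import comb


def mcnemar_exact(y_true, pred_a, pred_b):
    """Two-sided exact McNemar test on paired predictions.

    Only discordant pairs matter:
      b = cases where model A is correct and model B is wrong
      c = cases where model A is wrong and model B is correct
    Under the null hypothesis, b follows Binomial(b + c, 0.5).
    Returns (b, c, p_value).
    """
    b = sum(1 for t, a, o in zip(y_true, pred_a, pred_b) if a == t and o != t)
    c = sum(1 for t, a, o in zip(y_true, pred_a, pred_b) if a != t and o == t)
    n = b + c
    if n == 0:
        # The two models never disagree in correctness: no evidence of a difference.
        return b, c, 1.0
    k = min(b, c)
    # Probability of a split at least this lopsided in one direction, then doubled.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return b, c, min(1.0, 2 * tail)


# Toy example (not data from the paper): model A is right on all 10 cases,
# model B is wrong on the first 8.
labels = [1] * 10
model_a = [1] * 10
model_b = [0] * 8 + [1] * 2
b, c, p_value = mcnemar_exact(labels, model_a, model_b)
```

Because the test is computed on paired per-subject predictions rather than on fold-level averages, it needs the individual predictions of both models on the same subjects.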