2024 Jan;91:103016. doi: 10.1016/j.media.2023.103016

Table 1.

Comparison of the test performance of the different methods on the OSIC dataset when trained on imaging data only and on combined imaging and clinical data. The mean and standard deviation are reported over five runs with different random train/val/test splits. The best results are highlighted in bold.

Data	Method	C-Index $\uparrow$	MAE $\downarrow$	RAE $\downarrow$
Imaging	DeepSurv (Cox)	67.441 ± 4.572	44.898 ± 19.505	2.286 ± 1.414
	CoxMB	71.067 ± 5.572	28.887 ± 2.315	1.762 ± 0.807
	DeepHit	53.165 ± 8.313	31.074 ± 7.765	1.830 ± 0.522
	DeepHit ($L_{lik.}^{c}$ only)	57.607 ± 4.813	29.862 ± 3.742	1.926 ± 0.869
	Classical censoring	68.844 ± 5.313	20.448 ± 4.787	1.407 ± 0.853
	CenTime	69.273 ± 0.946	19.319 ± 1.613	1.338 ± 0.665

Imaging + Clinical	DeepSurv (Cox)	72.100 ± 2.186	27.603 ± 3.345	1.718 ± 0.742
	CoxMB	68.877 ± 2.413	24.413 ± 2.548	1.892 ± 0.868
	DeepHit	54.980 ± 3.490	31.246 ± 4.599	2.240 ± 0.862
	DeepHit ($L_{lik.}^{c}$ only)	52.882 ± 3.843	28.718 ± 2.077	2.059 ± 0.722
	Classical censoring	70.350 ± 2.947	20.476 ± 1.85	1.546 ± 0.611
	CenTime	70.957 ± 3.048	19.178 ± 0.795	1.480 ± 0.671
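
As a reading aid for the C-Index column and the "mean ± standard deviation" entries, the following is a minimal sketch rather than the evaluation code behind this table. It assumes Harrell's concordance index on predicted survival times, with higher meaning better ranking, and the sample standard deviation across runs; neither detail is confirmed by this excerpt. The function names `concordance_index` and `summarize_runs` are hypothetical, and all numbers in the usage lines are toy values, not results from the table.

```python
import statistics


def concordance_index(times, events, predicted_times):
    """Harrell's C-index for right-censored data (assumed definition).

    A pair (i, j) is comparable when subject i had an observed event and
    a strictly shorter observed time than subject j. The pair is
    concordant when subject i also has the shorter predicted time; tied
    predictions count as half a concordant pair.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored subject cannot be the earlier member of a pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if predicted_times[i] < predicted_times[j]:
                    concordant += 1.0
                elif predicted_times[i] == predicted_times[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


def summarize_runs(values):
    """Format per-run metric values as 'mean ± std'.

    Uses the sample standard deviation (n - 1 denominator); this is an
    assumption about how the spread was computed.
    """
    mean = statistics.mean(values)
    std = statistics.stdev(values)
    return f"{mean:.3f} ± {std:.3f}"


# Toy data: four subjects, the third one is censored (event flag 0).
toy_times = [1, 2, 3, 4]
toy_events = [1, 1, 0, 1]
toy_predictions = [1.5, 1.0, 3.0, 4.0]
toy_c_index = concordance_index(toy_times, toy_events, toy_predictions)

# Toy per-run scores in percent, aggregated the way a table cell would be.
toy_cell = summarize_runs([70.0, 72.0, 74.0])
```

On the toy data, four of the five comparable pairs are ordered correctly, so `toy_c_index` is 0.8; multiplying by 100 gives the percentage scale that the C-Index column appears to use.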