Author manuscript; available in PMC: 2024 Mar 18.

Published in final edited form as: Med Image Comput Comput Assist Interv. 2023 Oct 1;14224:189–198. doi: 10.1007/978-3-031-43904-9_19

Table 2.

Results for the NLG metrics (BLEU (BL), METEOR (M), ROUGE-L ($R_{L}$)) and clinical efficacy (CE) metrics (Accuracy (A), Precision (P), Recall (R) and F-1 score) on the Longitudinal-MIMIC dataset. Best results are highlighted in bold.

Method	NLG metrics						CE metrics
Method	BL-1	BL-2	BL-3	BL-4	M	$R_{L}$	A	P	R	F-1
AoANet	0.272	0.168	0.112	0.080	0.115	0.249	0.798	0.437	0.249	0.317
CNN+Trans	0.299	0.186	0.124	0.088	0.120	0.263	0.799	0.445	0.258	0.326
Transformer	0.294	0.178	0.119	0.085	0.123	0.256	0.811	0.500	0.320	0.390
R2gen	0.302	0.183	0.122	0.087	0.124	0.259	0.812	0.500	0.305	0.379
R2CMN	0.305	0.184	0.122	0.085	0.126	0.265	0.817	0.521	0.396	0.449
Ours	0.343	0.210	0.140	0.099	0.137	0.271	0.823	0.538	0.434	0.480
Baseline	0.294	0.178	0.119	0.085	0.123	0.256	0.811	0.500	0.320	0.390
+ report	0.333	0.201	0.133	0.094	0.135	0.268	0.823	0.539	0.411	0.466
+ image	0.320	0.195	0.130	0.092	0.130	0.268	0.817	0.522	0.340	0.412
simple fusion	0.317	0.193	0.128	0.090	0.130	0.266	0.818	0.521	0.396	0.450
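
The F-1 column can be sanity-checked against the P and R columns. The sketch below assumes that the reported F-1 is the harmonic mean of the reported precision and recall; the source does not state the averaging scheme, so this is an assumption. The helper name `f1_score` and the `rows` dictionary are illustrative only, and the numbers are copied from Table 2. Because P and R are themselves rounded to three decimals, a recomputed value may differ from the reported one in the last digit.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (P, R, reported F-1) for a few rows of Table 2.
rows = {
    "AoANet": (0.437, 0.249, 0.317),
    "Transformer": (0.500, 0.320, 0.390),
    "R2CMN": (0.521, 0.396, 0.449),
    "Ours": (0.538, 0.434, 0.480),
    "+ report": (0.539, 0.411, 0.466),
}

for name, (p, r, reported) in rows.items():
    recomputed = f1_score(p, r)
    # Difference should stay within rounding error of the three-decimal inputs.
    print(f"{name}: recomputed={recomputed:.3f} "
          f"reported={reported:.3f} diff={abs(recomputed - reported):.4f}")
```

Under this assumption, the recomputed values for the rows above agree with the table to within rounding error of the three-decimal inputs.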