Author manuscript; available in PMC: 2024 Mar 18.
Published in final edited form as: Med Image Comput Comput Assist Interv. 2023 Oct 1;14224:189–198. doi: 10.1007/978-3-031-43904-9_19

Table 2.

Results for the NLG metrics (BLEU (BL), METEOR (M), ROUGE-L (RL)) and clinical efficacy (CE) metrics (accuracy (A), precision (P), recall (R), and F-1 score) on the Longitudinal-MIMIC dataset. Best results are highlighted in bold.

Columns BL-1 through RL are NLG metrics; columns A through F-1 are CE metrics.

| Method | BL-1 | BL-2 | BL-3 | BL-4 | M | RL | A | P | R | F-1 |
|---|---|---|---|---|---|---|---|---|---|---|
| AoANet | 0.272 | 0.168 | 0.112 | 0.080 | 0.115 | 0.249 | 0.798 | 0.437 | 0.249 | 0.317 |
| CNN+Trans | 0.299 | 0.186 | 0.124 | 0.088 | 0.120 | 0.263 | 0.799 | 0.445 | 0.258 | 0.326 |
| Transformer | 0.294 | 0.178 | 0.119 | 0.085 | 0.123 | 0.256 | 0.811 | 0.500 | 0.320 | 0.390 |
| R2gen | 0.302 | 0.183 | 0.122 | 0.087 | 0.124 | 0.259 | 0.812 | 0.500 | 0.305 | 0.379 |
| R2CMN | 0.305 | 0.184 | 0.122 | 0.085 | 0.126 | 0.265 | 0.817 | 0.521 | 0.396 | 0.449 |
| Ours | 0.343 | 0.210 | 0.140 | 0.099 | 0.137 | 0.271 | 0.823 | 0.538 | 0.434 | 0.480 |
| Baseline | 0.294 | 0.178 | 0.119 | 0.085 | 0.123 | 0.256 | 0.811 | 0.500 | 0.320 | 0.390 |
| + report | 0.333 | 0.201 | 0.133 | 0.094 | 0.135 | 0.268 | 0.823 | 0.539 | 0.411 | 0.466 |
| + image | 0.320 | 0.195 | 0.130 | 0.092 | 0.130 | 0.268 | 0.817 | 0.522 | 0.34 | 0.412 |
| simple fusion | 0.317 | 0.193 | 0.128 | 0.090 | 0.130 | 0.266 | 0.818 | 0.521 | 0.396 | 0.450 |
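
As a reading aid for the CE columns of Table 2, the sketch below recomputes F-1 from the reported precision and recall for a few rows and compares it with the reported F-1. It assumes that F-1 is the harmonic mean of the aggregate P and R shown in the table. The source does not state the averaging scheme, so small gaps from rounding or per-label averaging are expected. The names `ROWS`, `f1_score`, and `max_f1_gap` are hypothetical helpers introduced here for illustration.

```python
# Reported (precision, recall, F-1) triples copied from Table 2.
ROWS = {
    "AoANet": (0.437, 0.249, 0.317),
    "Transformer": (0.500, 0.320, 0.390),
    "R2CMN": (0.521, 0.396, 0.449),
    "Ours": (0.538, 0.434, 0.480),
}


def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def max_f1_gap(rows: dict) -> float:
    """Largest absolute gap between recomputed and reported F-1."""
    return max(abs(f1_score(p, r) - f1) for p, r, f1 in rows.values())


if __name__ == "__main__":
    # Assumption: reported F-1 is derived from the aggregate P and R columns.
    for name, (p, r, f1) in ROWS.items():
        print(f"{name}: recomputed {f1_score(p, r):.3f}, reported {f1:.3f}")
    print(f"max gap: {max_f1_gap(ROWS):.4f}")
```

Under this assumption the recomputed values stay within rounding distance of the reported ones, which suggests the F-1 column follows directly from the P and R columns rather than from a separate averaging step.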