Table 2.
Results on the Longitudinal-MIMIC dataset for the NLG metrics (BLEU (BL), Meteor (M), Rouge (R-L)) and the clinical efficacy (CE) metrics (Accuracy (A), Precision (P), Recall (R), and F-1 score). Best results are highlighted in bold.
| Method | NLG metrics | | | | | | CE metrics | | | |
|---|---|---|---|---|---|---|---|---|---|---|
| | BL-1 | BL-2 | BL-3 | BL-4 | M | R-L | A | P | R | F-1 |
| AoANet | 0.272 | 0.168 | 0.112 | 0.080 | 0.115 | 0.249 | 0.798 | 0.437 | 0.249 | 0.317 |
| CNN+Trans | 0.299 | 0.186 | 0.124 | 0.088 | 0.120 | 0.263 | 0.799 | 0.445 | 0.258 | 0.326 |
| Transformer | 0.294 | 0.178 | 0.119 | 0.085 | 0.123 | 0.256 | 0.811 | 0.500 | 0.320 | 0.390 |
| R2gen | 0.302 | 0.183 | 0.122 | 0.087 | 0.124 | 0.259 | 0.812 | 0.500 | 0.305 | 0.379 |
| R2CMN | 0.305 | 0.184 | 0.122 | 0.085 | 0.126 | 0.265 | 0.817 | 0.521 | 0.396 | 0.449 |
| Ours | 0.343 | 0.210 | 0.140 | 0.099 | 0.137 | 0.271 | 0.823 | 0.538 | 0.434 | 0.480 |
| Baseline | 0.294 | 0.178 | 0.119 | 0.085 | 0.123 | 0.256 | 0.811 | 0.500 | 0.320 | 0.390 |
| + report | 0.333 | 0.201 | 0.133 | 0.094 | 0.135 | 0.268 | 0.823 | 0.539 | 0.411 | 0.466 |
| + image | 0.320 | 0.195 | 0.130 | 0.092 | 0.130 | 0.268 | 0.817 | 0.522 | 0.34 | 0.412 |
| simple fusion | 0.317 | 0.193 | 0.128 | 0.090 | 0.130 | 0.266 | 0.818 | 0.521 | 0.396 | 0.450 |
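
As a quick sanity check on the CE columns, the following sketch recomputes F-1 as the harmonic mean of precision and recall for three rows of the table. It assumes the reported P and R are averaged so that this relation still holds; the table does not state the averaging scheme, so this is an assumption. The helper name `f1_score` is illustrative and not taken from the paper. Because P and R are themselves rounded to three decimals, a difference of about 0.001 from the reported F-1 is expected.

```python
# (precision, recall, reported F-1) copied from three rows of Table 2.
rows = {
    "AoANet": (0.437, 0.249, 0.317),
    "R2CMN": (0.521, 0.396, 0.449),
    "Ours": (0.538, 0.434, 0.480),
}


def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (hypothetical helper)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


for method, (p, r, reported) in rows.items():
    computed = f1_score(p, r)
    # Inputs are rounded to three decimals, so allow a small tolerance.
    print(f"{method}: computed {computed:.3f}, reported {reported:.3f}")
```

If the CE scores were instead macro-averaged per label, the harmonic-mean relation would not have to hold for the averaged values, so close agreement here is only indicative.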