Table 2.
ROUGEa F1 results of diagnoses with incorrect words.
| ROUGE-Lb | BERTc | BioBERTd | LSTMe | Proposed Model |
| --- | --- | --- | --- | --- |
| Diagnoses without incorrect words (n=451)f | 0.704 | 0.717 | 0.651 | 0.698 |
| Diagnoses with incorrect words (n=138) | 0.676 | 0.692 | 0.640 | 0.674 |
aROUGE: Recall-Oriented Understudy for Gisting Evaluation.
bROUGE-L: ROUGE for the longest common subsequence.
cBERT: Bidirectional Encoder Representations from Transformers.
dBioBERT: Bidirectional Encoder Representations from Transformers trained on a biomedical corpus.
eLSTM: Long Short-Term Memory.
fn represents the number of reference labels.
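Footnote b defines ROUGE-L in terms of the longest common subsequence (LCS). A minimal sketch of how the ROUGE-L F1 scores in the table are typically computed, assuming whitespace tokenization and no stemming (the example diagnosis strings are hypothetical, not from the study data):

```python
def lcs_length(a, b):
    # Dynamic-programming longest common subsequence length over token lists.
    dp = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i, x in enumerate(a, 1):
        for j, y in enumerate(b, 1):
            dp[i][j] = dp[i - 1][j - 1] + 1 if x == y else max(dp[i - 1][j], dp[i][j - 1])
    return dp[len(a)][len(b)]


def rouge_l_f1(candidate, reference):
    # ROUGE-L F1: harmonic mean of LCS-based precision and recall.
    cand, ref = candidate.split(), reference.split()
    lcs = lcs_length(cand, ref)
    if lcs == 0:
        return 0.0
    precision = lcs / len(cand)
    recall = lcs / len(ref)
    return 2 * precision * recall / (precision + recall)


# Hypothetical example: a model output missing one token of the reference label.
score = rouge_l_f1("acute infarction", "acute myocardial infarction")
```

Here the LCS is 2 tokens, giving precision 2/2, recall 2/3, and F1 = 0.8; table scores average such per-pair F1 values over all reference labels in each subset.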