Table 1. Corpora used for lexical analysis
Language | Tokens | Report Types
---|---|---
English | 1 137 035 | Radiology, history & physical exam, and emergency department reports |
French | 6 781 411 | Discharge summaries, consultation reports, and surgical reports from a cardiology unit |
German | 100 150 | FReiburg Annotated MEDical corpus (FRAMED): discharge summaries, pathology, histology, and surgery reports [23] |
Swedish | 4 644 850 | The Stockholm EPR Corpus: Assessment entries from all clinic types [24] |