Author manuscript; available in PMC: 2014 Feb 13.
Published in final edited form as: Stud Health Technol Inform. 2013;192:677–681.

Table 1.

Corpora used for lexical analysis

Language | Tokens | Report Types
English | 1 137 035 | Radiology, history & physical exam, and emergency department reports
French | 6 781 411 | Discharge summaries, consultation reports, and surgical reports from a cardiology unit
German | 100 150 | FReiburg Annotated MEDical corpus (FRAMED): discharge summaries, pathology, histology, and surgery reports [23]
Swedish | 4 644 850 | The Stockholm EPR Corpus: assessment entries from all clinic types [24]