Abstract
Medical Language Processing (MLP), especially in specific domains, requires fine-grained semantic lexica. We examine whether robust natural language processing tools applied to a representative corpus of a domain help in building and refining a semantic categorization. We test this hypothesis with ZELLIG, a corpus analysis tool. The first clusters we obtain are consistent with a model of the domain, as found in the SNOMED nomenclature. They correspond to coarse-grained semantic categories, but also isolate lexical idiosyncrasies belonging to the clinical sub-language. Moreover, they help categorize additional words.
