Abstract
The Unified Medical Language System (UMLS) [1, 2] Metathesaurus is concept-oriented; its goal is to unite all names with identical meaning in a single Concept. The names come from its constituent vocabularies or "sources": a wide variety of biomedical terminologies, including many controlled vocabularies and classifications used in patient records, administrative health data, bibliographic, research, full-text, and expert systems. Many offer little definitional information, and many are not themselves concept-oriented, so identifying synonymy is a challenging semantic task [3]. The rapidly increasing size of the Metathesaurus makes the task daunting and demands effective computational support; there are more than 1.5 million names for 730,000 concepts in the January 2000 release. Vocabularies are added and updated using sophisticated lexical matching, selective algorithms, and expert review [4, 5, 6]. Yet the result is imperfect; we have discovered and corrected missed synonymy in approximately 1% of previously released concepts each year. This paper reviews general methods for finding missed synonymy and describes several specific novel approaches that we have found effective.
Selected References
These references are in PubMed. This may not be the complete list of references from this article.
- Lindberg D. A., Humphreys B. L., McCray A. T. The Unified Medical Language System. Methods Inf Med. 1993 Aug;32(4):281–291. doi: 10.1055/s-0038-1634945.
- McCray A. T., Srinivasan S., Browne A. C. Lexical methods for managing variation in biomedical terminologies. Proc Annu Symp Comput Appl Med Care. 1994:235–239.
- Tuttle M. S., Suarez-Munist O. N., Olson N. E., Sherertz D. D., Sperzel W. D., Erlbaum M. S., Fuller L. F., Hole W. T., Nelson S. J., Cole W. G. Merging terminologies. Medinfo. 1995;8(Pt 1):162–166.