AMIA Annual Symposium Proceedings. 2005;2005:930.

Utility of Semantically Constrained Automated Extraction and Mapping of UMLS Concepts from Clinical Narratives

Kevin M Coonan
PMCID: PMC1560596  PMID: 16779217

Background

There is a broad need for tools to facilitate information extraction from narrative clinical reports. Efforts using natural language processing have recently shown promise.

Objective

Assess the utility of MetaMap Transfer (MMTx) for extracting UMLS concepts from narrative emergency department reports, and determine how its configuration can be optimized for this task.

Methods

Dictated textual documents were manually reviewed, clinically relevant concepts were identified, and the discovered concepts were mapped at the level of the UMLS concept unique identifier (CUI). The 2005AA release of the UMLS Metathesaurus, the RRF Browser, UMLSKS (online) and Semantic Navigator (online) were used. Various MMTx settings were applied to empirically derive optimal configurations. Semantic constraints specific to the section of the narrative (e.g. HPI, PMH, Exam) were then applied to the result set. Results were compared with those obtained using the subset of the UMLS representing the NCVHS recommended standards for clinical information systems.
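The section-specific semantic constraint step can be illustrated with a minimal Python sketch. It is not the author's implementation: the `Candidate` class, the `constrain` function, and the `ALLOWED_TYPES` table are hypothetical, and the semantic type abbreviations assigned to each section are assumptions chosen only for illustration, since the abstract does not list the constraints actually used.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """One candidate concept, as might be returned by a concept mapper."""
    cui: str                   # UMLS concept unique identifier
    name: str                  # preferred name of the concept
    semantic_types: frozenset  # semantic type abbreviations for the concept


# ASSUMPTION: illustrative section-to-semantic-type table; the abstract
# does not specify which semantic types were allowed in each section.
ALLOWED_TYPES = {
    "HPI": {"sosy", "dsyn", "fndg"},
    "PMH": {"dsyn", "topp"},
    "Exam": {"fndg", "sosy", "bpoc"},
}


def constrain(candidates, section):
    """Keep candidates with at least one semantic type allowed in `section`.

    Sections without an entry in ALLOWED_TYPES are returned unfiltered.
    """
    allowed = ALLOWED_TYPES.get(section)
    if allowed is None:
        return list(candidates)
    return [c for c in candidates if c.semantic_types & allowed]
```

For example, under these assumed constraints a candidate typed only as a pharmacologic substance would be dropped from the HPI result set, while one typed as a sign or symptom would be kept.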

Results

Initial results showed that >90% of concepts could be represented using terms from the UMLS. Significant conceptual overlap and unrecognized synonyms within the UMLS complicated the assignment. A small minority (5%) of concepts required complex postcoordination of terms to achieve expressions clinically equivalent to those in the dictated reports. Specific types of concepts for which no acceptable expression could be found within the UMLS were identified and characterized. Temporal-spatial concepts relating to patient/problem/medication status and concepts specific to the emergency medicine domain both proved difficult to express. Early experience with MMTx showed that it identified 4–19% of the specific UMLS concepts identified by hand, and that it generated frequent non-relevant matches. However, a less stringent review of the concepts generated by MMTx showed that, although the number of identical concepts (i.e. CUI from MMTx = CUI by hand) was relatively low, specific types of matches were considerably more accurate, and MMTx often returned a closely related but contextually or semantically different concept.
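The strict comparison described above (CUI from MMTx = CUI by hand) amounts to set overlap between two CUI lists. The following Python sketch shows one way such figures could be computed; the function names are hypothetical and the abstract does not state how the reported percentages were actually calculated.

```python
def exact_match_recall(manual_cuis, mmtx_cuis):
    """Fraction of hand-identified CUIs that the tool also returned."""
    manual = set(manual_cuis)
    if not manual:
        return 0.0
    return len(manual & set(mmtx_cuis)) / len(manual)


def exact_match_precision(manual_cuis, mmtx_cuis):
    """Fraction of tool-returned CUIs that were also identified by hand."""
    returned = set(mmtx_cuis)
    if not returned:
        return 0.0
    return len(returned & set(manual_cuis)) / len(returned)
```

Exact CUI equality counts a closely related but different concept as a miss, which is why a less stringent review can judge the same output more favorably.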

Conclusion

MMTx currently provides a useful degree of functionality for extracting clinical terms from narrative documents, but additional manipulation and contextual specification will be essential for optimal performance. Incomplete relationships and duplicative concepts in the UMLS currently impose several limitations.


Articles from AMIA Annual Symposium Proceedings are provided here courtesy of American Medical Informatics Association
