AMIA Annual Symposium Proceedings. 2006;2006:1131.

Maintaining Mappings from Source Systems in a Local Health Information Infrastructure

Daniel J Vreeman
PMCID: PMC1839296  PMID: 17238750

Abstract

We developed a program to assist in managing changes in source system observation terms. The program returns candidate matches based on approximate string comparator scores. A preliminary evaluation of the tool for managing radiology term updates demonstrates its usefulness: it identified exact matches for 61% of new terms and high-probability matches for another 25%.

INTRODUCTION

Achieving the goal of interoperable health information exchange is hindered by the numerous internal, idiosyncratic conventions for identifying data in separate electronic systems. A comprehensive information exchange must coalesce all of the various sources that produce health data. The Indiana Network for Patient Care is an example of an operational local health information infrastructure (LHII) that carries millions of entries from five health systems, including fifteen separate hospitals [1]. Ninety-two source systems sent HL7 clinical result messages to us in this collaborative during 2004.

In our network, we integrate these disparate data sources by mapping the internal codes from the source systems to a common master dictionary based on standardized vocabularies. Mapping the local observation codes from the various source systems requires extensive manual effort. This effort is largest during the initial system integration period, but maintainers of the common dictionary must also keep up with changes in source terms as they occur over time. In our LHII, institutions retain control over their local term naming conventions and oversee the evolution of their terms. Ideally, source systems would follow good terminology principles and give advance notice of all terminology changes. In our experience, watching for and managing these changes is a challenging aspect of operating an LHII.

We attempt to address these challenges by using automated tools for assistance wherever possible. We have developed a program to help automatically map new observation codes from master file updates and our message processor exception logs to our LHII master dictionary. Here we present a preliminary analysis of its performance for maintenance mapping of a radiology center’s test terms.

METHODS

Radiology Naming Conventions

A common feature of the codes and names we receive from the six radiology centers in our LHII is the invention of multiple codes for one test to distinguish among the facilities performing that test. These tests often have the same name (or a close string variant), but their codes have a prefix to identify the facility location.

Data Source

We evaluated the utility of our program for mapping a quarterly radiology master file update from an urban not-for-profit hospital.

Semi-automated Term Mapping

We developed a program in Perl to identify candidate matches in our master dictionary for new local terms. The program identifies new terms by searching either the master file updates we receive or extracts from our message processor exception logs. For each new term, it looks for existing terms from that source system with the same string in the observation identifier text field, and returns the master dictionary code mapping. For new terms without an exact match, the program calculates a root mean square (RMS) value of three approximate string comparators (longest common substring, Levenshtein distance, and the Ukkonen similarity) and returns the top three existing term matches with an RMS score greater than 0.75.
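
The matching steps described above can be sketched as follows. This is a minimal illustration in Python, not the original Perl program. Several details are assumptions, because the text does not specify them: how each comparator is normalized to a 0 to 1 similarity, and the reading of "Ukkonen similarity" as a normalized q-gram distance with q=2. The function names, term names, and master dictionary codes are hypothetical.

```python
from collections import Counter
from math import sqrt


def lcs_similarity(a, b):
    """Longest common substring length, scaled by the longer string (assumed normalization)."""
    if not a or not b:
        return 0.0
    best = 0
    prev = [0] * (len(b) + 1)
    for ch_a in a:
        curr = [0] * (len(b) + 1)
        for j, ch_b in enumerate(b, start=1):
            if ch_a == ch_b:
                curr[j] = prev[j - 1] + 1
                best = max(best, curr[j])
        prev = curr
    return best / max(len(a), len(b))


def levenshtein_similarity(a, b):
    """1 minus the edit distance divided by the longer length (assumed normalization)."""
    if not a and not b:
        return 1.0
    prev = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        curr = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return 1.0 - prev[-1] / max(len(a), len(b))


def qgram_similarity(a, b, q=2):
    """Normalized q-gram distance, used here as a stand-in for the Ukkonen comparator."""
    grams_a = Counter(a[i:i + q] for i in range(len(a) - q + 1))
    grams_b = Counter(b[i:i + q] for i in range(len(b) - q + 1))
    total = sum(grams_a.values()) + sum(grams_b.values())
    if total == 0:
        return 1.0 if a == b else 0.0
    diff = sum(abs(grams_a[g] - grams_b[g]) for g in grams_a.keys() | grams_b.keys())
    return 1.0 - diff / total


def rms_score(a, b):
    """Root mean square of the three similarity scores."""
    scores = (lcs_similarity(a, b), levenshtein_similarity(a, b), qgram_similarity(a, b))
    return sqrt(sum(s * s for s in scores) / len(scores))


def candidate_mappings(new_name, existing, threshold=0.75, top_n=3):
    """Return (existing name, master code, score) tuples for a new local term.

    `existing` maps a source system's existing term names to master dictionary codes.
    An exact string match is returned on its own; otherwise the top-scoring
    candidates with an RMS score above the threshold are returned.
    """
    if new_name in existing:
        return [(new_name, existing[new_name], 1.0)]
    scored = [(name, code, rms_score(new_name, name)) for name, code in existing.items()]
    scored = [item for item in scored if item[2] > threshold]
    return sorted(scored, key=lambda item: item[2], reverse=True)[:top_n]


# Hypothetical radiology term names and master dictionary codes, for illustration only.
existing = {
    "CT ABDOMEN W CONTRAST": "M1001",
    "CT ABDOMEN WO CONTRAST": "M1002",
    "XR CHEST 2 VIEWS": "M2001",
}
print(candidate_mappings("XR CHEST 2 VIEWS", existing))
print(candidate_mappings("CT ABDOMEN W/ CONTRAST", existing))
```

In this sketch the second lookup returns both CT abdomen terms as candidates, even though they are clinically distinct. This is why candidates from approximate matching still go to a domain expert for review rather than being mapped automatically.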

RESULTS

The master file update contained 3,814 terms, of which 638 were not present in our existing mappings. The program identified master dictionary term mappings based on exact string matches for 61% (n=386) of these new terms. After manual review by a domain expert, another 25% (n=162) of the terms could be mapped to one of the top three candidate terms ranked by RMS string comparator score. Another 4% (n=26) were mapped by term-by-term manual review, and the remaining 10% (n=64) required the creation of new master dictionary terms.
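
As a check on the breakdown above, the share of each outcome can be recomputed from the reported counts. The short Python sketch below uses only the counts given in this section. The category labels are paraphrases, not terms from the original program.

```python
# Counts reported in the Results section.
total_terms = 3814
new_terms = 638
outcomes = {
    "exact string match": 386,
    "top-three RMS candidate after expert review": 162,
    "term-by-term manual review": 26,
    "new master dictionary term created": 64,
}

# The four outcomes should account for every new term.
assert sum(outcomes.values()) == new_terms

print(f"new terms: {new_terms} of {total_terms} ({new_terms / total_terms:.1%})")
for label, count in outcomes.items():
    print(f"{label}: {count} ({count / new_terms:.1%})")

# Terms resolved with the program's help (exact match or reviewed candidate).
assisted = outcomes["exact string match"] + outcomes["top-three RMS candidate after expert review"]
print(f"program-assisted: {assisted} ({assisted / new_terms:.1%})")
```

The first two categories together cover 548 of the 638 new terms, which is the basis for the conclusion that the program helped map a majority of the term changes.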

CONCLUSION

We developed a program to assist in managing mappings from source vocabularies in our LHII. A preliminary evaluation of the program for radiology terms demonstrated its usefulness in mapping a majority of the term changes in the update period.

References

  • 1. McDonald CJ, Overhage JM, Barnes M, et al. The Indiana network for patient care: a working local health information infrastructure. Health Affairs. 2005;24(5):1214–1220. doi: 10.1377/hlthaff.24.5.1214.
