Author manuscript; available in PMC: 2016 Feb 9.
Published in final edited form as: MIXHS 12 (2012). 2012 Oct-Nov;2012:1–6. doi: 10.1145/2389672.2389674

Clinical Clarity versus Terminological Order – The Readiness of SNOMED CT Concept Descriptors for Primary Care

Zhe He, Michael Halper, Yehoshua Perl, Gai Elhanan
PMCID: PMC4747247  NIHMSID: NIHMS722891  PMID: 26870837

Abstract

As SNOMED usage becomes more ingrained within applications, its range of concept descriptors, and particularly its synonym adequacy, becomes more important. A simulated clinical scenario involving various term-based concept searches is used to assess whether SNOMED's concept descriptors provide sufficient differentiation to enable possible concept selection between similar terms. Four random samples from different SNOMED concept populations are utilized. Of particular interest are concepts mapped duplicately into UMLS concepts due to shared term patterns. While overall synonym problems are rare (1%), some concept populations exhibited a high rate of potential problems for clinical use (17–62%). The vast majority of issues are due to SNOMED's inherent structure and fine granularity. Many findings hint at a lack of clear delineation between reference and interface terminological qualities. Closer attention should be given to practical clinical use-case scenarios. Reducing SNOMED's structural complexity may alleviate many of the described findings and encourage clinical adoption.

Keywords: Terminology, SNOMED CT, Synonyms, Clinical use

1. INTRODUCTION

Concept descriptors are important in promoting the use of controlled medical terminologies. Among these descriptors, synonyms are particularly important, as indicated by Chute et al. [1]. Synonyms may be even more important when it comes to interface terminologies. In fact, Rosenbloom et al. [2] speculate that one of the cornerstones for usability of clinical interface terminologies is the adequacy of synonymy. Not only is the extent of synonym coverage important, but so is the depth. Medical concepts are often referred to using numerous names, acronyms, and various levels of local variation. While SNOMED CT (SCT) has emerged internationally as a leading terminology, it surprisingly has a relative paucity of synonyms. Of course, a reference terminology is not necessarily expected to include all synonyms, but only 36% of SCT's concepts have assigned synonyms, for an average of 0.51 synonyms per concept (103,996 out of a total of 291,205, January 2010 release). In a recent survey [3], more than half of the SCT users responding indicated that expanding synonym coverage is important to them. Missing synonyms were reported as the second most encountered deficiency in SCT (after missing concepts) by 17% of the respondents.

Making synonym adequacy more critical is the fact that the HITECH regulations [4, 5] and the “meaningful use” initiative portend nearly exponential growth in the adoption of Electronic Health Record (EHR) systems in the near future [5, 6]. In fact, SCT is slated to become the exclusive encoding system for problem lists by 2015 [6]. This puts SCT front and center, and a much wider range of users is expected to interact with SCT-based content in clinical applications. Such users will expect correct and appropriate synonyms to allow for ease of differentiation between similarly worded concepts in order to efficiently select the clinical concepts that best apply to their patients. Errors in synonyms, lack of synonyms, or insufficient concept information to decipher the exact meaning of concepts’ descriptors may prove detrimental to widespread clinical adoption.

In the integration of SCT into the UMLS, there were numerous cases where two or more SCT concepts were mapped to the same UMLS concept [7]. Specifically, this happened for 13.4% of SCT's concepts. Fung et al. [7] also highlight the fact that the methodology of synonym integration may inadvertently increase ambiguity. While Fung et al. [7] provide the reasoning for such occurrences, they did not systematically explore synonyms within SCT itself. This further raises questions about whether SCT concept descriptors offer sufficient information for effective clinical differentiation.

In this paper, we attempt to further evaluate and categorize aspects of concept descriptor issues across SCT from a practical use perspective. Four random samples from different SCT concept populations are utilized in our study. Of particular interest are concepts mapped duplicately into UMLS concepts due to shared term patterns. We use a simulated clinical scenario involving various term-based searches for concepts to assess whether SNOMED's synonyms, and other descriptors, provide sufficient differentiation to enable possible concept selection between similar concepts.

2. BACKGROUND

Each of SCT's concepts has (i) a fully specified name (FSN) that includes the semantic tag in parentheses (e.g., hematoma (morphologic abnormality)), and commonly (ii) the preferred term (PT) (e.g., hematoma). In many instances, the FSN and the PT are identical except for the semantic tag, which captures the semantic category to which the concept belongs. PTs are meant to capture the common word or phrase used by clinicians to name concepts [8].
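The relation between an FSN and its semantic tag can be made concrete with a short sketch. The Python function below is only an illustration, assuming that the semantic tag is the final parenthesized group of the FSN; the name `split_fsn` and the regular expression are hypothetical and are not part of any SCT tooling.

```python
import re

# Assumption: the semantic tag is the final parenthesized group of the FSN.
_FSN_PATTERN = re.compile(r"^(?P<term>.*\S)\s*\((?P<tag>[^()]+)\)\s*$")


def split_fsn(fsn):
    """Split a fully specified name into (term, semantic tag).

    Returns (stripped input, None) when no trailing parenthesized tag is found.
    """
    match = _FSN_PATTERN.match(fsn)
    if match is None:
        return fsn.strip(), None
    return match.group("term"), match.group("tag")


print(split_fsn("hematoma (morphologic abnormality)"))
```

Removing the tag in this way yields the bare term, which in many instances coincides with the PT.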

Occasionally, SCT concepts may be accompanied by one or more synonyms. Synonymous terms are intended to convey identical or nearly identical meaning [9], assuming similar semantics of certain words. In SCT, synonyms are acceptable alternatives to the preferred terms, and both are not necessarily unique [8]. Acronyms are also considered synonymous terms in SCT. For example, COPD and COLD are among the 15 synonyms of the concept chronic obstructive lung disease (disorder). SCT claims to include a large number of synonyms that provide flexibility of expression [8, 10]. On top of the included synonyms, SCT also offers a “word equivalent” table as part of its Developer Toolkit. This table supports enhanced searches that take into account semantically similar words and provides commonly used abbreviations without greatly increasing the volume of synonyms [11]. Thus, SCT strives to create a practical balance between synonym explosion and expressivity.

As part of our prior research to identify whether the UMLS is a reliable source for enhanced SCT synonymy, particularly regarding concepts covered by the NLM's published problem lists [12, 13], we encountered many cases where problematic synonyms in the UMLS were associated with instances where two SCT concepts were mapped to the same UMLS concept. For example, Ectopic beats and Premature beats are two distinct SCT concepts that are both mapped to the UMLS concept Premature cardiac complex. This is a known issue; as discussed in [7], the incorporation of SCT into the UMLS resulted in numerous instances of more than one SCT concept being mapped to the same UMLS concept. Several reasons are attributed to such occurrences [7]: (a) strict separation of hierarchies in SCT results in very similar concepts residing under different roots, (b) finer granularity in SCT, (c) “NOS” (“Not Otherwise Specified”) concepts, and (d) cases of missed synonymy in SCT. As an example of (a): concepts with the SCT semantic tags of {disorder} and {morphologic abnormality} may be considered synonymous by the UMLS. Clear errors that were detected during the editorial process by UMLS staff were reported to the editors of SCT. Although, as noted, the causes of most of these occurrences were explained in [7], they may still present a problem from a clinical use perspective, especially considering the size and fine granularity of SCT.

3. METHODS

A simulated clinical scenario is used to assess whether SCT's concept descriptors (especially its synonyms) provide sufficient differentiation to enable possible concept selection between similar terms. The evaluation is carried out with respect to single concepts or pairs of concepts within four randomly selected samples, described below. The scenario involves a clinical user doing a series of term-based searches for clinical content and being provided in the process with choices of concepts, displayed with the most closely matched PT or synonym according to the physician's search term and the application's search algorithm. We used the search mechanism of SCT's CliniClue browser [14] as our search tool, the functionality of which we consider similar to other acceptable standalone search tools or search mechanisms within clinical applications which may, or may not, use subsets of SCT. CliniClue offers several search options and we used the default “Words – any order” option without any constraints, and the “Flat list” results display option. Exact string matches are displayed at the top of the returned subset. The reviewer is instructed to evaluate no more than the topmost twenty items of any returned search results subset and to focus on exact matches. For example, there exist two aspirin concepts, aspirin (product) and aspirin (substance). If the user were to search by typing “aspirin” into our search tool, the highest ranking results would be these two seemingly identical aspirin concepts. In CliniClue, and most likely in any built-in search tool within clinical applications, search results are displayed without their respective semantic tags (e.g., {product} and {substance} shown for the aspirin concepts). Without additional information, our hypothetical user would have difficulty discerning which of the concepts is appropriate for his clinical need. In this case, it is obviously aspirin (product), not aspirin (substance).
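The scenario can be approximated in a few lines of Python. This is a toy sketch, not CliniClue's actual algorithm: the `search` function, its ranking rule, and the three-entry index are assumptions made for illustration, with the entries taken from examples quoted in this paper.

```python
# Illustrative entries only: (displayed term, FSN), taken from examples in the text.
DESCRIPTIONS = [
    ("aspirin", "aspirin (product)"),
    ("aspirin", "aspirin (substance)"),
    ("hematoma", "hematoma (morphologic abnormality)"),
]


def search(query, descriptions, limit=20):
    """Toy 'words, any order' search: every query word must occur in the term."""
    words = query.lower().split()
    hits = [
        (term, fsn)
        for term, fsn in descriptions
        if all(word in term.lower().split() for word in words)
    ]
    # Exact string matches are listed first; the stable sort keeps input order otherwise.
    hits.sort(key=lambda hit: hit[0].lower() != query.lower())
    return hits[:limit]


def indistinguishable_hits(query, descriptions):
    """Return FSNs of distinct concepts whose displayed term equals the query."""
    exact = [fsn for term, fsn in search(query, descriptions)
             if term.lower() == query.lower()]
    return exact if len(set(exact)) > 1 else []


print(indistinguishable_hits("aspirin", DESCRIPTIONS))
```

A non-empty result corresponds to the situation described above: the user sees identical strings and must consult further information, such as the FSN, to tell the concepts apart.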

We attempt to quantify the degree of difficulty that a user may face in deciding whether any concept returned for the search term is appropriate for clinical use. The analysis is performed even when the concepts are presented with their FSNs, which include the semantic tags (e.g., {finding}, {morphologic abnormality}, etc.). Our evaluation takes into consideration SCT's principle that PTs and synonyms are not required to be unique. We use a four-point scale, where 0 indicates a non-issue, 1 indicates a minimal issue, 2 indicates a moderate issue, and 3 indicates a significant/critical issue. In light of typically scarce terminological auditing resources, our evaluation involves a single auditor. To minimize the subjectivity of the evaluation, we convert the results of the four-point scale into a yes/no decision where Grades 0–2 are considered a “no (issue)” and Grade 3 is a “yes.” For example, the synonyms Arteriovenous catheterization and Arteriovenous cannulation were marked as Grade 3 because they were assigned to the concept Direct arteriovenous anastomosis.
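The collapse of the four-point scale into a yes/no decision can be sketched as follows. The function names and the sample grades are hypothetical; the grades are not data from the study.

```python
def is_significant(grade):
    """Collapse the four-point scale: only Grade 3 counts as an issue."""
    if grade not in (0, 1, 2, 3):
        raise ValueError(f"grade must be 0-3, got {grade!r}")
    return grade == 3


def issue_rate(grades):
    """Fraction of graded items that are Grade 3."""
    grades = list(grades)
    return sum(is_significant(g) for g in grades) / len(grades)


# Hypothetical grades for five reviewed items (not data from the study).
print(issue_rate([0, 3, 1, 2, 3]))
```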

In accordance with Fung et al. [7], we defined four data sets. Sample A (“Same String Pairs” – “SSP”) consists of 65 pairs of SCT concepts such that the concepts of each pair are mapped to the same UMLS concept and each shares at least one exact same string across their synonyms and/or their PTs. Sample B (“No Shared String Pairs” – “NoSSP”) comprises 81 concept pairs where each member of a pair is again mapped to the same UMLS concept, but in this case the pair members have completely different strings from one another across their synonyms and/or their PTs. Sample C (“Synonym Control” – “SynCtrl”) consists of 50 individual SCT concepts with at least one synonym such that each does not share a UMLS concept with any other SCT concept. Sample D (“Ctrl”) is made up of 100 individual SCT concepts without regard to their number of synonyms. The four randomly selected data sets used in our study are derived from the January 2010 release of SCT. All samples were chosen to be mutually disjoint, i.e., no concept appears in more than one of them. Excluded from Samples A and B are concept pairs with one member having the SCT semantic tag {substance} and the other having {product}, or one having {disorder} and the other, {morphologic abnormality}. We chose to impose this restriction due to the common occurrence of this kind of situation plus the fact that, a priori, they share identical or near identical strings. An example is aspirin (product) and aspirin (substance), both of which share “aspirin” as their PT. Many such clear-cut issues can potentially be resolved by using well-curated subsets. If such pairs were allowed to dominate Samples A and B, they might mask other potential issues.
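The distinction between Samples A and B, together with the exclusion rule, can be sketched in Python. This is a minimal illustration under stated assumptions: string comparison here ignores case and surrounding whitespace, which the study does not specify, and the function names are hypothetical.

```python
# Tag combinations excluded from Samples A and B, as described above.
EXCLUDED_TAG_PAIRS = {
    frozenset({"substance", "product"}),
    frozenset({"disorder", "morphologic abnormality"}),
}


def shared_strings(terms_a, terms_b):
    """Strings (PTs or synonyms) that two concepts have in common.

    Assumption: comparison ignores case and surrounding whitespace.
    """
    def normalize(terms):
        return {term.strip().lower() for term in terms}

    return normalize(terms_a) & normalize(terms_b)


def classify_pair(terms_a, tag_a, terms_b, tag_b):
    """Label a pair of SCT concepts that map to the same UMLS concept."""
    if frozenset({tag_a, tag_b}) in EXCLUDED_TAG_PAIRS:
        return "excluded"
    return "SSP" if shared_strings(terms_a, terms_b) else "NoSSP"


print(classify_pair(["aspirin"], "product", ["aspirin"], "substance"))
print(classify_pair(["Family Megapodiidae", "megapode"], "organism",
                    ["megapode"], "organism"))
```

The first call reproduces the excluded aspirin case; the second shows a pair that would qualify for Sample A because the two concepts share a string.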

4. RESULTS

All evaluations of the four samples were conducted by one of the authors (GE, a medical informatician experienced in curation and auditing of large terminologies). Table 1 provides general information regarding the synonym content of the concepts in our samples compared to the general population of SCT concepts. For example, in the general concept population, there are 291,205 (active) concepts (Column 1). Among these, 35.7% have synonyms, with an overall average of 0.51 per concept. Those concepts having synonyms have an average of 1.42 of them. The concept with the most has 27 synonyms. For Sample A (comprising 65 pairs or 130 concepts), we find 68.5% of the concepts having synonyms, with an overall average of 1.39 per concept. The average is 2.05 synonyms for those concepts with synonyms. The maximum number of synonyms for a concept is seven.

Table 1. General synonym characteristics in SCT and the concept samples

| | General SCT | Concepts with Synonyms | Sample A (SSP) | Sample B (NoSSP) | Sample C (SynCtrl) | Sample D (Ctrl) |
| --- | --- | --- | --- | --- | --- | --- |
| # concepts | 291,205 | 103,996 | 130 | 162 | 50 | 100 |
| % concepts w/synonyms | 35.7% | 100% | 68.5% | 50.6% | 100% | 31% |
| Avg # synonyms | 0.51 | 1.42 | 1.39 | 1.22 | 2.80 | 0.51 |
| Avg # synonyms for concepts w/synonyms | 1.42 | 1.42 | 2.05 | 2.40 | 2.80 | 1.65 |
| Min / max # synonyms | 0 / 27 | 1 / 27 | 0 / 7 | 0 / 8 | 2 / 8 | 0 / 5 |
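The rows of Table 1 are related by simple arithmetic: the overall average number of synonyms per concept should be roughly the share of concepts with synonyms multiplied by the average among those concepts. The sketch below checks this using values transcribed from the table; the variable and function names are ours for illustration, and small differences are expected because the published figures are rounded.

```python
# Values transcribed from Table 1: (share of concepts with synonyms,
# average synonyms per concept, average per concept that has synonyms).
TABLE_1 = {
    "General SCT": (0.357, 0.51, 1.42),
    "Sample A (SSP)": (0.685, 1.39, 2.05),
    "Sample B (NoSSP)": (0.506, 1.22, 2.40),
    "Sample C (SynCtrl)": (1.000, 2.80, 2.80),
    "Sample D (Ctrl)": (0.310, 0.51, 1.65),
}


def implied_overall_average(share_with_synonyms, avg_among_those):
    """Overall average implied by the share and the conditional average."""
    return share_with_synonyms * avg_among_those


for name, (share, overall, among) in TABLE_1.items():
    implied = implied_overall_average(share, among)
    print(f"{name}: reported {overall:.2f}, implied {implied:.2f}")

# Share of concepts with synonyms in the January 2010 release.
print(f"{103_996 / 291_205:.1%}")
```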

Table 2 summarizes our findings with respect to each sample. As discussed above, we display only Grade 3 findings (“significant/critical issues”). Overall, 442 unique SCT concepts were evaluated (146 concept pairs, 150 individual concepts). As can be seen in the table, 62% (40) of the Sample A (SSP) concept pairs were deemed to harbor significant issues (Grade 3): synonym errors, duplicate concepts, “container classes” (i.e., concepts that are too general), and other issues. In seven pairs, at least one of the concepts was found to contain an erroneous synonym. For instance, balanoplasty is a surgical repair of the glans penis. Therefore, it is an incorrect synonym for the concept repair of penis (procedure), a general concept representing any repair on any part of the penis. Thus, if users were to search for “balanoplasty” in an SCT-based clinical application, they would be faced with two “balanoplasty” options: (a) Repair of penis, and (b) Balanoplasty. Without further querying of SCT's content, they would not be able to differentiate between the two and might select a concept that does not correctly describe the circumstances of their patient. In eight pairs, the concepts were deemed to be duplicates. For example, the concept oxygen nasal cannula (physical object) and the concept nasal oxygen catheter, device (physical object) co-exist, with the latter having the synonym “nasal oxygen cannula.” In 11 of the pairs, issues resulted from the fact that one or both of the involved concepts were container classes that serve to group together and subsume collections of more refined, sibling concepts. More specifically, one of the concepts was more general than the other, yet shared a synonym. As an example, Family Megapodiidae (organism) is the parent of megapode (organism), but the former has the synonym “megapode.”

Fourteen other pairs, although they did not exhibit any of the above described issues, still lacked sufficient clarity to resolve potential clinical confusion. For example, a search for “tachycardia” returns two concepts, tachycardia as a {disorder} and tachycardia as a {finding}. The fine differentiation between a finding and a disorder may escape the common user.

Table 2. Grade 3 findings across the four samples

| | Sample A (SSP) | Sample B (NoSSP) | Sample C (SynCtrl) | Sample D (Ctrl) |
| --- | --- | --- | --- | --- |
| # | 65 (pairs) | 81 (pairs) | 50 | 100 |
| Grade 3 Issues | 40 | 14 | 1 | 1 |
| % Grade 3 | 62% | 17% | 2% | 1% |
| Synonym Errors | 7 | - | 1 | - |
| Duplicate Concepts | 8 | 7 | - | - |
| Container Classes | 11 | 3 | - | - |
| Other | 14 | 4 | - | 1 |

For Sample B (NoSSP), 17% of the pairs exhibited Grade 3 issues. Seven pairs were considered duplicates; three showed container-class issues; and four others still resulted in potential confusion. Samples C (SynCtrl) and D (Ctrl) each had only one concept considered to exhibit a Grade 3 issue.

The differences in the numbers of Grade 3 problems between Sample A and Sample B, and between Samples A or B and Samples C or D were all statistically significant (Fisher's Exact Test, 2-Tail p-value < 0.001 for all).
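For readers who wish to rerun such a comparison, the sketch below implements a two-sided Fisher's exact test in plain Python and applies it, as one illustration, to the Grade 3 counts of Samples A and B (40 of 65 pairs versus 14 of 81 pairs). It uses a common two-sided convention, summing the probabilities of all tables that are no more likely than the observed one. The study does not state which convention or software was used, so the function is an assumption and its output is not guaranteed to match the reported p-values.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins whose probability does not exceed that of the observed table.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    denominator = comb(row1 + row2, col1)

    def probability(x):
        # Probability of observing x in the top-left cell given fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / denominator

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    observed = probability(a)
    tolerance = 1 + 1e-9  # guards against floating-point ties
    total = sum(
        p for p in (probability(x) for x in range(lowest, highest + 1))
        if p <= observed * tolerance
    )
    return min(1.0, total)


# Sample A: 40 of 65 pairs with Grade 3 issues; Sample B: 14 of 81 pairs.
p_a_vs_b = fisher_exact_two_sided(40, 65 - 40, 14, 81 - 14)
print(f"Sample A vs. Sample B: p = {p_a_vs_b:.2e}")
```

The same function can be applied to any other pair of samples by substituting the corresponding counts.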

5. DISCUSSION

Our findings indicate that specific subsets of SCT concepts may exhibit a significant rate of synonym issues. However, the general population of SCT concepts with synonyms (35.7%) carries a relatively low rate (2%) of major issues with the overall quality of its synonyms. This finding is not in contradiction with the opinions collected in a recent survey of SCT users [3], where most of the issues raised were with missing concepts and a lack of synonyms, and not necessarily with erroneous ones. It should also be remembered that the relative paucity of SCT synonyms contributes towards this low rate and that our samples intentionally excluded most issues that may arise from the strict separation of hierarchies in SCT. However, when a specific population of concepts was examined, i.e., concepts that were deemed similar enough to be mapped to the same UMLS concept, a significantly higher rate of issues could be found. This subset (13.4%; see [7]) of SCT concepts is not negligible and deserves closer scrutiny. Such issues may lead users of SCT-based clinical applications to erroneously select a concept that does not necessarily apply to their patient. This, of course, may lead to subsequent errors by medical personnel and incorrect application of decision support and analytical tools.

From IHTSDO's perspective, most of these issues, in all likelihood, do not represent true problems. SCT's 19 hierarchies and its fine granularity virtually guarantee that strings with similar or identical word structure will reside under different roots, with different semantics. Indeed, SCT's User Guide [8] explicitly indicates that synonyms and PTs are not necessarily unique. As a result, the vast majority of SCT concepts duplicately mapped in the UMLS fall under such a category. For example, almost all drug names exist separately as {substance} and {product} concepts in SCT and correspondingly are almost invariably mapped to the same concept in the UMLS.

As much as this arrangement is logical within SCT's structure, it may present significant difficulties to the average user within clinical applications and even to software vendors. SCT is no longer considered the product of an “academic exercise.” Due to successful leadership and adoption initiatives, SCT has already passed the tipping point of clinical adoption. The accelerating adoption of EHRs and the regulatory emphasis on standardized encoding of clinical problems within such applications [4-6] will reasonably lead to an increased exposure of novice users to SCT, especially in primary care settings. These users cannot be expected to know the inherent structure and underlying logical modeling of such a terminology, and will be oblivious to many of the finer principles described in the SCT User Guide. Nor can we assume that such users have the desire to use terminological tools to discern the differences between the meanings of SCT's concepts prima facie.

Institutions like Kaiser Permanente (KP) and the Veterans Administration (VA) spent years and significant financial resources to be able to utilize aspects of SCT for their clinical needs [15, 16]. And while large EHR vendors can possibly match such an effort, most vendors of the 800 or so currently certified complete EHRs [17] cannot reasonably muster it. Many of the findings and examples presented in this study involve hierarchies that can be expected to be commonly used (diagnoses/findings, procedures). In this work, we excluded, a priori, readily identifiable issues such as aspirin (product) and aspirin (substance). Such issues can easily be dealt with using well-defined subsets of SCT. However, even with the use of limited subsets such as the CORE [12] and VA/KP [13] problem list subsets of SCT, or commercially available well-curated subsets, the described problems can still be expected to present themselves.

Let us consider a scenario where a community physician wishes to record the fact that a patient is undergoing chemotherapy. Intuitively, a user will type “chemotherapy” into a hypothetical search tool within the EHR that relies on SCT terms. Using CliniClue [14], for example, the two top-most terms returned are both synonyms named chemotherapy, each belonging to a different concept: (1) antineoplastic chemotherapy regimen (procedure), and (2) administration of antineoplastic agent (procedure). Clearly, there are more than subtle differences between these two concepts. Since both concepts belong to the same hierarchy, limiting the search to a specific subset is not likely to eliminate the confusion. How is our hypothetical user to select the correct one if all s/he is presented with are two identical strings, one a PT and the other a synonym? Are we always to present her/him with only the PT? The FSN? All of them? Obviously, there is no simple answer, but in this case, even with exposure to all the available information, the decision might be difficult and frustrating, and may require dwelling on the finer details of SCT's conceptual representations.

Another observation is related to the way that SCT uses container-class concepts clearly created to subsume a group of other concepts under the “same roof.” As an example, cow's milk specific immunoglobulin E antibody measurement (procedure) is a child of milk specific IgE antibody measurement (procedure). Each of the two concepts has at least one synonym indicating that they are related to cow's milk RAST. However, a closer examination reveals that for the parent, this is an error since goat and sheep milk RAST tests are children as well. Although this can be considered simply a synonym error, the phenomenon observed here and in other cases is, most likely, that a concept that was formerly a leaf node became a container class. Such instances can be algorithmically detected and avoided altogether by a disciplined editorial approach. We propose that IHTSDO formulate special editorial rules for container classes, especially for ones that are not specific enough to be used clinically, and thus may not require synonyms. Unintended use of higher-level concepts can lead to reasoning mistakes by algorithmic decision-support systems.
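One way such instances might be detected algorithmically is to flag every parent concept that shares a term with one of its own children. The sketch below is a simplified illustration, not a published auditing method: the function name, the dictionary-based hierarchy, and the case-insensitive comparison are assumptions, and the data come from the Megapodiidae example reported in the Results.

```python
def parents_sharing_terms_with_children(children_of, terms_of):
    """Flag (parent, child, shared terms) where a parent reuses a child's term.

    Assumption: terms are compared case-insensitively.
    """
    flagged = []
    for parent, children in children_of.items():
        parent_terms = {term.lower() for term in terms_of.get(parent, ())}
        for child in children:
            child_terms = {term.lower() for term in terms_of.get(child, ())}
            shared = parent_terms & child_terms
            if shared:
                flagged.append((parent, child, sorted(shared)))
    return flagged


# Example from the Results section: the parent carries the child's name as a synonym.
children_of = {"Family Megapodiidae (organism)": ["megapode (organism)"]}
terms_of = {
    "Family Megapodiidae (organism)": ["Family Megapodiidae", "megapode"],
    "megapode (organism)": ["megapode"],
}
print(parents_sharing_terms_with_children(children_of, terms_of))
```

Flagged pairs would still need editorial review, since a shared term only suggests, and does not prove, that the parent's synonym is too specific.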

Such scenarios were hypothesized when we evaluated our samples. And while most of our findings are not likely to be recognized by IHTSDO (except for potential duplicate concepts or erroneous synonyms), they may confound everyday clinical users. Even though we do not know how often such issues may arise under different clinical settings, the expanded role of SCT subsets suggests that the identified issues should be systematically addressed for better encoding and wider clinical adoption.

Aside from obvious errors that should be corrected, the most plausible mechanism that SCT offers to deal with such issues is the local extension mechanism [8]. However, extensions require a resource-intensive, coordinated effort [18, 19], most likely on a national level, and may still not resolve the majority of issues. If our hypothetical physician were to record the gender of a female patient by typing the term “female” into CliniClue, s/he would be presented with both “female” as female structure (body structure) and “female” as female (finding). Such complexity is by design and is not likely to be resolved by local extensions. Similar situations arise where the involved concepts carry the semantic tags of {finding} and {disorder}, the semantics of which is essential for problem lists. The selection made under clinical circumstances carries with it significance beyond the common meaning of the string that represents it. Each concept has a different conceptual representation, and future reasoning engines may be compromised and draw different conclusions due to hasty selections made under sub-optimal conditions.

Although SCT is a reference terminology and is not expected by IHTSDO to serve as an interface terminology per se, many others have attempted to utilize it that way. The dangers associated with the ambiguities described above should be addressed. However, the prospect of creating, within a practical timeframe, a dedicated, clinically specific extension that addresses such issues as well as many others is not promising, although some issues, such as the use of container class concepts, can be addressed algorithmically. Some of the issues highlighted in this study demonstrate a schism that already exists within SCT between reference and interface uses. Therefore, the complexity of a reference terminology, such as SCT, should be balanced against its clinical usefulness during the creation and editing process.

This study is qualitative, with a subjective aspect associated with the simulation and review process by a single expert. For that reason, we chose to expose only Grade 3 findings. As the examples above show, Grade 3 findings are non-arbitrary, clear-cut issues. However, medical decision making is often subjective as well, and it has been the authors’ experience, over many years of providing feedback to both the UMLS and the SCT governing bodies, that the error correction and content introduction process is not entirely objective and structured. Nevertheless, the findings expose aspects of SCT usability that were not considered before. Future work is required to systematically address and eliminate such confounding scenarios.

Our selection of the CliniClue search mechanism, although arbitrary, represents a reasonable approach and may affect only some types of errors, while others are independent of the search-and-display algorithm. Even though other search algorithms may display search results in a different manner, CliniClue is the prominent tool to view and investigate SCT [3] and offers a practical and satisfactory solution. It is unlikely that many of the vendors of the more than 800 currently certified complete EHRs will offer significantly better tools to explore medical ontologies.

A PubMed search reveals that the literature related to auditing of SCT synonyms is scant, with only two immediately relevant studies [7, 10]. Despite historical claims [8, 10], the overall paucity of synonyms mandates that a significant effort be directed at improving their coverage and depth. This is especially relevant for leaf-node concepts that are more likely to be used in clinical circumstances. This is particularly true in the short term for the proposed problem lists. Addressing specific subsets of SCT concepts, such as those covered by this study, can provide a good starting point.

6. CONCLUSION

SCT exhibits a low overall rate of synonym errors. However, its hierarchical structure and synonym content result in murky areas where non-expert users may find it difficult to choose the correct concepts in clinical settings. In this paper, we utilized a simulated clinical scenario to demonstrate some of the difficulties that could be encountered and to evaluate samples of SCT's conceptual content in this regard. While IHTSDO does not consider SCT an interface terminology, there is no immediately available alternative. Thus, it is desirable that IHTSDO pay closer attention to practical clinical use cases and formulate editorial policies to better address practical clinical needs and reduce structural complexity. Clearly marking container class concepts that are not intended for clinical use and possibly removing synonyms from such concepts might serve as a start. In light of SNOMED's increasing role in primary care, more attention should be focused on pragmatic usability aspects.

ACKNOWLEDGMENT

This work was partially supported by the NLM under grant R-01-LM008912-01A1.

Footnotes

General Terms

Design, Standardization.

Contributor Information

Zhe He, Computer Science Dept., NJIT Newark, NJ 07102 1-973-596-2867 zh5@njit.edu.

Michael Halper, Information Technology Department, NJIT Newark, NJ 07102 1-973-596-5752 michael.halper@njit.edu.

Yehoshua Perl, Computer Science Dept., NJIT Newark, NJ 07102 1-973-596-2867 yehoshua.perl@njit.edu.

Gai Elhanan, Halfpenny Technologies, Inc. Blue Bell, PA 19422 1-347-443-9741 gelhanan@gmail.com.

REFERENCES