Journal of the American Medical Informatics Association: JAMIA
2015 Oct 21;22(3):507–518. doi: 10.1136/amiajnl-2014-003151

Scalable quality assurance for large SNOMED CT hierarchies using subject-based subtaxonomies

Christopher Ochs 1, James Geller 1, Yehoshua Perl 1, Yan Chen 2, Junchuan Xu 3, Hua Min 4, James T Case 5, Zhi Wei 1
PMCID: PMC6283060  PMID: 25336594

Abstract

Objective Standards terminologies may be large and complex, making their quality assurance challenging. Some terminology quality assurance (TQA) methodologies are based on abstraction networks (AbNs), compact terminology summaries. We have tested AbNs and the performance of related TQA methodologies on small terminology hierarchies. However, some standards terminologies, for example, SNOMED, are composed of very large hierarchies. Scaling AbN TQA techniques to such hierarchies poses a significant challenge. We present a scalable subject-based approach for AbN TQA.

Methods An innovative technique is presented for scaling TQA by creating a new kind of subject-based AbN called a subtaxonomy for large hierarchies. New hypotheses about concentrations of erroneous concepts within the AbN are introduced to guide scalable TQA.

Results We test the TQA methodology for a subject-based subtaxonomy for the Bleeding subhierarchy in SNOMED's large Clinical finding hierarchy. To test the error concentration hypotheses, three domain experts reviewed a sample of 300 concepts. A consensus-based evaluation identified 87 erroneous concepts. The subtaxonomy-based TQA methodology was shown to uncover statistically significantly more erroneous concepts when compared to a control sample.

Discussion The scalability of TQA methodologies is a challenge for large standards systems like SNOMED. We demonstrated innovative subject-based TQA techniques by identifying groups of concepts with a higher likelihood of having errors within the subtaxonomy. Scalability is achieved by reviewing a large hierarchy by subject.

Conclusions An innovative methodology for scaling the derivation of AbNs and a TQA methodology were shown to perform successfully for the largest hierarchy of SNOMED.

Keywords: SNOMED CT, terminology quality assurance, standards quality assurance, abstraction network, scalable quality assurance, subject-based terminology quality assurance

INTRODUCTION

Biomedical terminologies, such as the Systematized Nomenclature of Medicine—Clinical Terms1 (SNOMED CT, SCT for short), are standards used to support electronic health record (EHR) encoding,2–9 meaningful use,6,10–12 interdisciplinary research,13–15 and many other applications. Quality assurance is an important part of standards maintenance.16 However, the size and complexity of modern biomedical terminologies make terminology quality assurance (TQA) difficult, requiring manual reviews of thousands of concepts by domain experts. Resources for comprehensive content reviews are limited. Effective methodologies are needed to target portions of a terminology that are more likely to contain errors, to increase TQA yield, measured by the ratio of the number of errors corrected to the number of concepts reviewed.

To support effective maintenance, we have developed a theory of abstraction networks (AbNs) to summarize the content and structure of various terminologies.16,17 In particular, we have shown that the area taxonomy,17 partial-area taxonomy,17 and disjoint partial-area taxonomy,18 kinds of AbNs, support maintenance of SCT.19–21 Taxonomies partition concepts into groups, based on similar relationship structure and semantics. Certain concept groups contain uncommonly classified or complex concepts. These groups were found more likely to contain errors19–21 when compared to control groups.

In our previous studies, taxonomies were used to support maintenance of the Specimen hierarchy of SCT,19,21 and the Biological process hierarchy of the National Cancer Institute thesaurus (NCIt).16 However, when we applied the same methodologies to large SCT hierarchies, for example, Procedure and Clinical finding with 53 147 and 99 440 concepts, respectively, we encountered significant issues that inhibited our taxonomy-based TQA approach. First, the Procedure and Clinical finding taxonomies contain 10 828 and 10 614 groups, respectively, too many to review individually. Second, thousands of concepts were categorized into overly general groups. These issues make our previously developed TQA methodologies impractical for large hierarchies.

To achieve scalability of our methodologies, we have developed taxonomy subsets, called subject-based subtaxonomies, which summarize a subhierarchy that covers a particular subject. A subtaxonomy is created by selecting a concept, for example, Bleeding or Heart disease, and all of its descendants. Since quality assurance for a whole hierarchy is not practical, we observed that auditors usually concentrate on subjects of high interest. Subject-based subtaxonomies allow an auditor to focus on manageable portions covering specific subjects of a large hierarchy.

Following our study of overlapping concepts21 (see ‘Background’ section), we test the hypothesis that overlapping concepts in a subject-based subtaxonomy of a large SCT hierarchy are more likely to have errors than non-overlapping concepts. Furthermore, we investigate three new hypotheses that identify groups of concepts that are even more likely to have errors. A methodology based on these hypotheses is introduced, which enables an auditor to obtain a high yield of corrections with a relatively small effort.

To test our methodology, we derive subject-based subtaxonomies for the Bleeding (and Cancer) subhierarchies in Clinical finding. We target Clinical finding due to its importance for standards encoding and widespread use.22 Three domain experts reviewing a sample of 300 concepts from the Bleeding subtaxonomy identified 87 erroneous concepts. The subject-based subtaxonomy quality assurance methodology is shown to uncover statistically significantly more erroneous concepts than a control experiment. Applying this methodology to many subjects, one at a time, can cover a substantial portion of a large hierarchy without overwhelming an auditor.

BACKGROUND

SNOMED CT TQA techniques

TQA is an important part of a terminology's lifecycle.16 Zhu et al.23 provide a comprehensive survey of manual, semi-automatic, and automatic TQA methodologies. SCT is a common target for TQA studies because of its importance. Jiang and Chute24 audited the semantic completeness of SCT concepts. Semantic, structural, and ontological techniques are offered by Rector et al.25,26 who identified major types of errors in SCT that were caused by problems in Description Logic modeling or in concept classification. Schulz et al.27–29 analyzed SCT's ‘health’ from an ontological and logical perspective, identifying major problem areas.

Ceusters et al.30 describe an ontology-based technique that utilized an external ontology. Mortensen et al.31 describe a crowdsourcing32 TQA methodology for verifying the correctness of axioms in SCT. Agrawal et al.33–35 utilized a combination of lexical and structural techniques to identify inconsistently modeled concepts. Bodenreider et al.36 analyzed SCT's IS-A hierarchy, found under-defined classes, and used lexical methods37 to evaluate consistency among SCT's terms.

SNOMED CT taxonomy AbNs

In previous work, we have developed various kinds of AbNs to support TQA16–18 for standards terminologies, for example, NCIt38 and SCT.1 AbNs are compact, hierarchical networks composed of nodes and links. Nodes summarize groups of similar concepts and links summarize hierarchical relationships between related groups. AbNs have been shown to highlight groups of concepts that are more likely to contain errors. We utilized the following three kinds of AbNs for SCT17,18 to uncover errors,19,21 briefly reviewed below: area taxonomy; partial-area taxonomy; and disjoint partial-area taxonomy. For a glossary of terms see online supplementary appendix I.

Figure 1A shows an excerpt of 29 concepts from the Clinical finding hierarchy. An area summarizes all concepts with the same set of outgoing attribute relationships (relationships for short). Only the types of the relationships are considered; target concepts are disregarded. An area taxonomy is a network where the nodes are areas and areas are connected by child-of links based on the underlying IS-A relationships.17 A root of an area is a concept that has no parents in its area. Areas are disjoint. Figure 1B shows the area taxonomy for the excerpt of figure 1A.
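The area definition above can be illustrated with a minimal Python sketch. The toy concept data and the function names `derive_areas` and `area_roots` are assumptions made for illustration; they are not taken from BLUSNO or from the SCT release files.

```python
from collections import defaultdict

# Hypothetical toy data: concept -> set of attribute relationship TYPES.
# Target concepts are deliberately ignored, as in the area definition.
relationships = {
    "Clinical finding": frozenset(),
    "Bleeding": frozenset({"Associated morphology"}),
    "Inflammatory disorder": frozenset({"Associated morphology"}),
    "Finding by site": frozenset({"Finding site"}),
    "Genitourinary tract hemorrhage": frozenset({"Associated morphology", "Finding site"}),
    "Hemorrhage of kidney": frozenset({"Associated morphology", "Finding site"}),
}

# Hypothetical IS-A links: concept -> set of parent concepts.
parents = {
    "Clinical finding": set(),
    "Bleeding": {"Clinical finding"},
    "Inflammatory disorder": {"Clinical finding"},
    "Finding by site": {"Clinical finding"},
    "Genitourinary tract hemorrhage": {"Bleeding", "Finding by site"},
    "Hemorrhage of kidney": {"Genitourinary tract hemorrhage"},
}


def derive_areas(relationships):
    """Group concepts that share the exact same set of relationship types."""
    areas = defaultdict(set)
    for concept, rel_types in relationships.items():
        areas[rel_types].add(concept)
    return dict(areas)


def area_roots(area_concepts, parents):
    """A root of an area is a concept with no parent inside the same area."""
    return {c for c in area_concepts if not (parents[c] & area_concepts)}


areas = derive_areas(relationships)
both = frozenset({"Associated morphology", "Finding site"})
roots = area_roots(areas[both], parents)
print(sorted(areas[frozenset({"Associated morphology"})]))
print(sorted(roots))
```

Because every concept has exactly one set of relationship types, each concept lands in exactly one group, which is why areas are disjoint.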

Figure 1.

(A) An excerpt of 29 concepts from the Clinical finding hierarchy. IS-A relationships are shown as upward arrows between concepts. Concepts with the exact same set of outgoing attribute relationships are grouped into dashed bubbles that are labeled with the set of relationships. For example, the concepts Bleeding and Inflammatory disorder both have one relationship, Associated morphology. (B) The area taxonomy for the concepts in (A). Areas are displayed as colored boxes, named by the common relationship(s). Areas are organized into color-coded levels according to their numbers of relationships. The 13 concepts with the Finding site relationship are now represented by the box named Finding site on level 1 (green) of the area taxonomy. Child-of links appear as bold arrows, for example, {Associated morphology, Finding site} is child-of both {Associated morphology} and {Finding site}. (C) The partial-area taxonomy for the concepts in (A). The five concepts in the {Associated morphology, Finding site} area are now refined into two partial-areas, Genitourinary tract hemorrhage (3) and Hemorrhage of abdominal cavity structure (4). These partial-areas are child-of both Bleeding and Finding by site. Child-of links are shown as arrows between partial-areas.

The partial-area taxonomy is a refinement of the area taxonomy. The (possibly multiple) root concepts of each area are used to define partial-areas. A partial-area consists of a root concept and all of its descendants in the area. Partial-areas summarize semantically similar concepts within each area, since all are descendants of this root. In a partial-area taxonomy the nodes are partial-areas, which are connected by child-of links.17 In a diagram, partial-areas are shown as white boxes within their areas. Each partial-area is labeled with the root concept's name and its number of concepts (in parentheses). Figure 1C shows the partial-area taxonomy for figure 1A.

Partial-areas sometimes overlap, that is, a concept can be contained in multiple partial-areas. This happens whenever a concept is a descendant of multiple roots. In figure 1A the concepts Hemorrhage of kidney and Bladder hemorrhage, with two parents each, are descendants of two roots: Genitourinary tract hemorrhage and Hemorrhage of abdominal cavity structure. In figure 1C, the partial-areas defined by these roots both contain Hemorrhage of kidney and Bladder hemorrhage, which explains why the number of concepts in the {Associated morphology, Finding site} area (5) is smaller than the sum of the concepts summarized by each partial-area (3+4 = 7). A concept that is summarized by more than one partial-area is called an overlapping concept. Summarization with overlapping concepts is not desirable.
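The following Python sketch reproduces the counts of this example under stated assumptions: the IS-A links are restricted to the {Associated morphology, Finding site} area, and the fifth concept, whose name is not given in the text, is represented by a placeholder. The function name `partial_areas` is hypothetical.

```python
from collections import defaultdict

both_roots = {"Genitourinary tract hemorrhage", "Hemorrhage of abdominal cavity structure"}

# IS-A parents restricted to one area (toy data modeled on figure 1A).
parents_in_area = {
    "Genitourinary tract hemorrhage": set(),
    "Hemorrhage of abdominal cavity structure": set(),
    "Hemorrhage of kidney": set(both_roots),
    "Bladder hemorrhage": set(both_roots),
    # Placeholder name: the text does not name the fifth concept of the area.
    "Placeholder abdominal hemorrhage": {"Hemorrhage of abdominal cavity structure"},
}


def partial_areas(parents_in_area):
    """Map each area root to the set containing the root and its descendants in the area."""
    children = defaultdict(set)
    for concept, ps in parents_in_area.items():
        for p in ps:
            children[p].add(concept)
    roots = [c for c, ps in parents_in_area.items() if not ps]
    result = {}
    for root in roots:
        members, stack = {root}, [root]
        while stack:
            for child in children[stack.pop()]:
                if child not in members:
                    members.add(child)
                    stack.append(child)
        result[root] = members
    return result


pas = partial_areas(parents_in_area)

# A concept is overlapping if it is summarized by more than one partial-area.
membership = defaultdict(set)
for root, members in pas.items():
    for concept in members:
        membership[concept].add(root)
overlapping = {c for c, rs in membership.items() if len(rs) > 1}

print({root: len(members) for root, members in pas.items()})
print(sorted(overlapping))
```

The partial-area sizes sum to 7 while the area holds only 5 distinct concepts; the difference is accounted for by the two overlapping concepts, each counted twice.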

To make sure that concepts appear only in one partial-area, we developed the disjoint partial-area taxonomy,18 which partitions the concepts of an area into disjoint partial-areas. ‘Disjoint’ means ‘free of overlap.’ For a formal definition see Wang et al.18 Intuitively, disjoint partial-areas are ‘carved out’ from partial-areas with overlapping concepts, thus eliminating the overlap. The roots of disjoint partial-areas are called ‘overlapping roots.’

In figure 2B, Genitourinary tract hemorrhage, Hemorrhage of abdominal cavity structure, Gastrointestinal hemorrhage, and Hemorrhage of anastomosis are the roots of {Associated morphology, Finding site}. Hemorrhage of kidney, Lower gastrointestinal hemorrhage, and Gastric hemorrhage are overlapping concepts, and they are overlapping roots because all of their parents are non-overlapping concepts. Hematoma of kidney and Gastrojejunal ulcer with hemorrhage are also overlapping roots, because they are descendants of different roots than their parents.

Figure 2.

(A) An example of a subhierarchy of 17 concepts in {Associated morphology, Finding site} grouped in partial-areas, which are enclosed by dashed colored bubbles. (B) The roots of the disjoint partial-areas are in color. Area roots are given a single color. Overlapping roots are multicolored according to the multiple area roots they are descendants of. (C) The disjoint partial-area taxonomy for (A). Disjoint partial-areas are color coded according to the colors of their root concept in (B). The nine disjoint partial-areas summarize the 17 concepts.

Level 1 disjoint partial-areas (single colored) are referred to as non-overlapping disjoint partial-areas, since their concepts are non-overlapping in the partial-area taxonomy. Disjoint partial-areas at higher indexed levels (multi-colored) are referred to as overlapping disjoint partial-areas, since their concepts overlap between multiple partial-areas in the partial-area taxonomy.

Figure 2C shows the disjoint partial-area taxonomy for the concepts in figure 2A. When compared to the partial-area taxonomy, the disjoint partial-area taxonomy provides a more accurate summary of the subhierarchies of concepts within an area when these subhierarchies overlap.

These three taxonomies were shown to support SCT maintenance,19–21 for example, small partial-areas were found to contain more errors than large partial-areas19,20 and overlapping concepts were found to contain more errors than non-overlapping concepts.21 However, the taxonomy for the large Clinical finding hierarchy has 10 614 partial-areas in 357 areas, which is too large for effective quality assurance.

Ochs et al.20 described the relationship subtaxonomy, which displays subsets of structurally similar groups from large partial-area taxonomies. This method was successfully used in support of TQA for the large Procedure hierarchy. However, anecdotal evidence reveals that terminology curators do not consider groups of concepts with structural similarity but groups of concepts from a specific subject (topic), for example, Bleeding or Lung cancer.

To support TQA, we developed the Biomedical Layout Utility for SCT (BLUSNO),39 a software tool for creating, visualizing, and exploring taxonomies. BLUSNO enables an auditor to view a concept in the context of a local neighborhood or a partial-area taxonomy. BLUSNO has been tested on many SCT hierarchies, creating taxonomies for 1000 to 100 000 concepts.

METHODS

Subject-based subtaxonomy derivation

The complete taxonomy for the Clinical finding hierarchy (zoomed-out excerpt in online supplementary appendix II) contains 10 614 partial-areas, far fewer than the 99 440 concepts of the underlying hierarchy. However, it is still too large to support TQA. It does not allow an auditor to focus on subject concepts, which are ‘hidden’ inside large partial-areas like Finding by site, with 9602 concepts. Thus, we now describe an innovative subject-based TQA approach in which an auditor selects a concept of current interest; for example, for bleeding-related disorders the concept Bleeding is selected.

A subtaxonomy is a taxonomy for a subhierarchy. Given an arbitrary concept c, a subtaxonomy is derived using the methodology described in the ‘Background’ section, but it is applied to the SCT subhierarchy rooted at c. The root area and unique root partial-area consist of c and all of its descendants with the same relationships. BLUSNO derives c's subtaxonomy.
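As a rough illustration of the first step, the subhierarchy rooted at c can be collected with a breadth-first traversal of the IS-A children. This is a minimal sketch on hypothetical toy data; the `children` map and the function name `subhierarchy` are assumptions and do not describe how BLUSNO is implemented.

```python
from collections import deque

# Hypothetical IS-A hierarchy: concept -> list of children.
children = {
    "Clinical finding": ["Bleeding", "Finding by site"],
    "Bleeding": ["Genitourinary tract hemorrhage"],
    "Finding by site": ["Genitourinary tract hemorrhage"],
    "Genitourinary tract hemorrhage": ["Hemorrhage of kidney"],
    "Hemorrhage of kidney": [],
}


def subhierarchy(root, children):
    """Return the chosen subject concept together with all of its descendants."""
    seen = {root}
    queue = deque([root])
    while queue:
        for child in children.get(queue.popleft(), []):
            if child not in seen:  # a concept with several parents is visited once
                seen.add(child)
                queue.append(child)
    return seen


bleeding = subhierarchy("Bleeding", children)
print(sorted(bleeding))
```

The area and partial-area derivations would then be applied to this concept set only, so ancestors and siblings of the subject concept (here Clinical finding and Finding by site) are excluded.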

The definition of the subtaxonomy is applicable to any SCT concept in a hierarchy with attribute relationships. This is important, because our previous approach, which allowed a taxonomy to be derived for an entire hierarchy only, led to many concepts being ‘hidden’ inside large partial-areas, for example, Finding by site contains several of the major causes of death listed in table 4.

Table 4.

Subject-based subtaxonomy metrics for the 10 leading causes of death in the USA

| Rank | Cause of death | Subject-based subtaxonomy concept | Concepts (n) | Partial-areas (n) | Areas (n) | Relative size (concepts/partial-areas) |
|---|---|---|---|---|---|---|
| 1 | Heart disease | Heart disease | 2402 | 316 | 61 | 2.4%/3.0% |
| 2 | Cancer | Malignant neoplastic disease | 3531 | 125 | 19 | 3.6%/1.2% |
| 3 | Chronic lower respiratory diseases | Disorder of lower respiratory system | 1414 | 354 | 51 | 1.4%/3.4% |
| 4 | Stroke | Cerebrovascular disease | 262 | 75 | 15 | 0.3%/0.7% |
| 5 | Accidents | Injury due to exposure to external cause | 267 | 65 | 11 | 0.3%/0.6% |
| 6 | Alzheimer's disease | Disorder of brain | 2300 | 396 | 67 | 2.3%/3.8% |
| 7 | Diabetes | Diabetes mellitus | 112 | 30 | 14 | 0.1%/0.2% |
| 8 | Nephritis, nephrotic syndrome, and nephrosis | Kidney disease | 909 | 243 | 47 | 0.9%/2.3% |
| 9 | Influenza and pneumonia | Pneumonitis | 334 | 73 | 24 | 0.3%/0.7% |
| 10 | Suicide | Suicide | 16 | 9 | 2 | 0.4%/29% |

By enabling the choice of an arbitrary concept as a root of a subtaxonomy, we enable an SCT editor to view a summary for a subhierarchy of concepts that meet some criteria. If an editor wants to concentrate on a subject area, she can choose a concept that best represents the subject area to be the root of a subtaxonomy. For example, Cancer and many of its descendant concepts are hidden in large partial-areas in the complete taxonomy. Cancer can be selected as the root of a subtaxonomy, as done in figure 5, making its subhierarchy of concepts more accessible for TQA. Hence, we introduce the subject-based subtaxonomy, a subtaxonomy that provides a compact view of the SCT subhierarchy rooted at a chosen concept that best represents the subject area.

Figure 5.

The Cancer subject-based subtaxonomy, following the graphical convention of figure 3. Levels have been organized into multiple rows due to space limitations. Areas of the same levels are color coded according to their number of relationships. Child-of links are not shown for readability. The Cancer subject-based subtaxonomy summarizes 3531 concepts by 125 partial-areas in 19 areas. The 64 partial-areas that do not appear in the complete Clinical finding taxonomy are highlighted in yellow. The concepts inside of the yellow partial-areas are found in the Mass of body structure (7010 concepts) partial-area in the complete taxonomy.

Subtaxonomies are not necessarily disjoint, because concepts may belong to multiple subtaxonomies. Additionally, subtaxonomy partial-areas are not always a subset of those in the complete taxonomy (see Cancer subtaxonomy in the ‘Results’ section).

Within a subtaxonomy, the disjoint partial-area taxonomy derivation methodology is altered to account for concepts in a subtaxonomy overlapping with partial-areas that are outside of the subtaxonomy. For example, the concept Intra-abdominal hematoma has two parents in its area in the complete taxonomy: Hemorrhage of abdominal cavity structure (in Bleeding's subtaxonomy) and Mass of abdominal cavity structure (in the partial-area Mass of body structure, outside the subtaxonomy). Intra-abdominal hematoma inherits the semantics of both partial-area roots and belongs in the disjoint taxonomy.

Thus, for the disjoint partial-area subtaxonomy we (1) ensure that all of the concepts in the disjoint partial-area taxonomy are semantically related to the subject c by considering only concepts that are descendants of c, and (2) consider overlapping concepts that overlap with partial-areas outside of the subtaxonomy, since such concepts are complex (figure 4).

Figure 4.

An excerpt of 23 disjoint partial-areas from the disjoint partial-area subtaxonomy derived for the concepts in {Associated morphology, Finding site}. The disjoint partial-areas Mass of body structure and Injury of anatomical site, shown in a gray box, are not part of the Bleeding subject-based subtaxonomy, but many Bleeding concepts overlap with them. Partial-areas outside of the subtaxonomy, such as Mass of body structure, which overlap with partial-areas in the subtaxonomy, for example, Hemorrhage of abdominal cavity structure, are not part of the subtaxonomy and can be hidden, but are important for terminology quality assurance (TQA) to capture the complexity of the overlapping concepts. For example, the disjoint partial-area Pelvic hematoma (3) would not exist if such overlap were not considered.

Based on our experience with the Specimen hierarchy,19,21 a subject-based subtaxonomy containing 500–1500 concepts is of reasonable size to avoid overwhelming an auditor.

Subject-based subtaxonomy TQA

Our previous SCT TQA studies21 have focused on complex concepts, for example, overlapping concepts,18 which were shown to have more errors with high statistical significance for the small Specimen hierarchy due to the difficulty in modeling complex concepts. Overlapping concepts are more complex than non-overlapping concepts, since they are specializations of all the roots of the partial-areas they are contained in.

In this scalability study, we repeat our analysis of overlapping concepts (hypothesis H1) and test three new refined hypotheses (H2–H4) for a subject-based subtaxonomy of a large hierarchy.

  • Hypothesis H1: Overlapping concepts are more likely to have errors than non-overlapping concepts.

Another group of concepts, which was also shown to have more errors with high statistical significance, are uncommonly classified concepts, for example, those in small partial-areas.19 A possible reason for their uncommon classification may be a modeling error. Once the error is corrected (eg, by adding a parent or relationship), a concept may join another common classification according to its revised modeling.

However, to account for concepts that overlap between a small partial-area and a large partial-area we introduce H2:

  • Hypothesis H2: Concepts in small disjoint partial-areas are more likely to have errors than concepts in large disjoint partial-areas.

H1 and H2 can be compounded into H3:

  • Hypothesis H3: Concepts in small overlapping disjoint partial-areas are more likely to have errors than concepts in large overlapping disjoint partial-areas.

H3 expresses that concepts that are both complex and uncommonly classified tend to have more errors than concepts that are just complex.

We call the number of partial-areas a concept belongs to the ‘degree of overlap.’

  • Hypothesis H4: Concepts with a higher degree of overlap exhibit a higher error rate.

Concepts that overlap between more partial-areas inherit the semantics of more roots, and thus, are more complex than concepts that overlap between fewer partial-areas.

Even the number of overlapping concepts may be overwhelming when only limited resources may be available to audit them. The above hypotheses can guide a TQA methodology by prioritizing which overlapping concepts should be reviewed first to maximize yield (see ‘Results’ section).
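One way such a prioritization could be sketched in Python is shown below. The ordering rule (small disjoint partial-areas first, then higher degree of overlap) is an assumption derived from H3 and H4, not a procedure specified by the methodology; the boundary of seven concepts is the one used in the ‘Results’ section, and all concept and root names are hypothetical.

```python
from dataclasses import dataclass

SMALL_THRESHOLD = 7  # assumed boundary: fewer than 7 concepts counts as 'small'


@dataclass(frozen=True)
class Candidate:
    name: str
    partial_area_roots: frozenset     # roots of the partial-areas containing the concept
    disjoint_partial_area_size: int   # size of the concept's disjoint partial-area

    @property
    def degree_of_overlap(self):
        """Number of partial-areas the concept belongs to."""
        return len(self.partial_area_roots)


def review_order(candidates):
    """Order overlapping concepts for review (illustrative rule, see lead-in)."""
    overlapping = [c for c in candidates if c.degree_of_overlap > 1]
    return sorted(
        overlapping,
        key=lambda c: (
            c.disjoint_partial_area_size >= SMALL_THRESHOLD,  # small groups first
            -c.degree_of_overlap,                             # then higher overlap
            c.name,                                           # stable tie-break
        ),
    )


# Hypothetical candidates.
candidates = [
    Candidate("concept A", frozenset({"root 1", "root 2"}), 1),
    Candidate("concept B", frozenset({"root 1", "root 2", "root 3"}), 2),
    Candidate("concept C", frozenset({"root 1", "root 2"}), 9),
    Candidate("concept D", frozenset({"root 1"}), 1),  # non-overlapping, skipped
]

ordered_names = [c.name for c in review_order(candidates)]
print(ordered_names)
```

An auditor with a fixed budget would review concepts from the front of this list and stop when resources run out.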

To test the hypotheses, a sample of 300 concepts was reviewed for errors by three of the authors (YC, JX, and HM), who are trained in medicine and have extensive terminology auditing experience. The review process consists of two phases. First, each auditor is given the complete sample as a list of concepts in alphabetical order and works independently. Each auditor then reports all errors found. As shown in Gu et al.40 TQA reports from different auditors show substantial differences and a report from one auditor is not reliable. However, a consensus among several auditors’ reports was shown to result in a reliable TQA report. Thus, we use the second phase for consensus building. Each auditor is given a complete list of errors from all auditors. Each auditor then marks ‘agree’ or ‘disagree’ for each error. A concept is considered erroneous if all auditors agree on the error. A similar consensus TQA protocol was used when auditing overlapping concepts in the Specimen hierarchy.21 If H2–H4 are confirmed, they could guide a TQA methodology that prioritizes the review of concepts summarized by a subtaxonomy according to the error rates for H2–H4.
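The consensus rule of the second phase can be expressed as a short Python sketch. The data layout (pairs of concept and error, one set of ‘agree’ marks per auditor) and the function name `consensus` are assumptions made for illustration; the concepts and errors shown are hypothetical.

```python
def consensus(reported_errors, agreements):
    """Keep only errors that every auditor marked 'agree'.

    reported_errors: set of (concept, error) pairs pooled from phase one.
    agreements: auditor -> set of (concept, error) pairs marked 'agree' in phase two.
    Returns the confirmed errors and the concepts considered erroneous.
    """
    confirmed = {
        err for err in reported_errors
        if all(err in agreed for agreed in agreements.values())
    }
    erroneous_concepts = {concept for concept, _ in confirmed}
    return confirmed, erroneous_concepts


# Hypothetical phase-one reports and phase-two votes.
reported = {
    ("concept A", "missing parent"),
    ("concept A", "incorrect relationship target"),
    ("concept B", "missing child"),
}
agreements = {
    "auditor 1": {("concept A", "missing parent"), ("concept B", "missing child")},
    "auditor 2": {("concept A", "missing parent"), ("concept A", "incorrect relationship target")},
    "auditor 3": {("concept A", "missing parent"), ("concept B", "missing child")},
}

confirmed, erroneous = consensus(reported, agreements)
print(confirmed, erroneous)
```

Requiring unanimity is the strictest possible rule: a concept reported by two of three auditors, such as concept B here, is not counted as erroneous.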

RESULTS

Bleeding subject-based subtaxonomy

We derived a subject-based subtaxonomy for the concept Bleeding from the January 2013 release of SCT (shown in figure 3). Compared to the Clinical finding partial-area taxonomy with 10 614 partial-areas, this subject-based subtaxonomy is significantly smaller.

Figure 3.

Top five (out of six) levels of the Bleeding subject-based subtaxonomy. Each level is color coded according to the number of relationships. Levels have been organized into multiple rows due to space limitations. Partial-areas in each area are listed in decreasing order, from left to right, according to their size. Child-of links are not shown for readability. A total of 932 bleeding-related concepts are summarized by 199 partial-areas in 42 areas. Over half (56% = 522/932) of the concepts summarized by this subtaxonomy are in {Associated morphology, Finding site}. The first row of larger partial-areas in this area indicates the major types of bleeding-related findings in SNOMED CT, such as Hemorrhage of abdominal cavity structure (186 concepts), Gastrointestinal hemorrhage (117), and Genitourinary tract hemorrhage (88), demonstrating the summary effect provided by the subject-based subtaxonomy.

The largest area in the Bleeding subtaxonomy, {Associated morphology, Finding site}, has 290 overlapping concepts (55.5%). Figure 4 shows an excerpt of 23 disjoint partial-areas from the disjoint partial-area subtaxonomy for {Associated morphology, Finding site}.

The disjoint partial-area taxonomy for {Associated morphology, Finding site} contains 236 disjoint partial-areas. Most disjoint partial-areas are small: 176 (78.8%) of them are singletons (one concept). The disjoint partial-area taxonomy more accurately summarizes the concepts in this area than the partial-area taxonomy (figure 3). For example, there are 186 concepts in the partial-area Hemorrhage of abdominal cavity structure, but only 10 are descendants of just this root. The other 176 concepts also belong to other partial-areas. The overlapping disjoint partial-areas are made explicit in figure 4.

TQA results

To test H1–H4, three auditors reviewed a sample of 300 concepts from the {Associated morphology, Finding site} area in the Bleeding subtaxonomy for errors: 200 randomly selected overlapping concepts (70% = 200/290) and 100 randomly selected non-overlapping concepts (43% = 100/232). The latter were taken from partial-areas that had overlapping concepts.

The auditors reviewed the January 2013 inferred version of SCT. Together, the auditors first found 131 erroneous concepts. Next, all auditors agreed that 87 (66%) of these concepts had at least one error in common (table 1). Among the erroneous concepts, 36 were primitives and 51 were fully defined. The auditors all agreed on 123 errors in these 87 concepts (1.41 errors per erroneous concept). For a breakdown by error type, see online supplementary appendix V.

Table 1.

Auditing results for overlapping concepts and non-overlapping concepts in small and large disjoint partial-areas

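| Disjoint partial-area size | Overlapping (levels 2–8): # Sample | Overlapping (levels 2–8): # Erroneous | Non-overlapping (level 1): # Sample | Non-overlapping (level 1): # Erroneous | Total: # Sample | Total: # Erroneous |
|---|---|---|---|---|---|---|
| Small (<7 concepts) | 194 | 78 (40.2%) | 34 | 7 (20.6%) | 228 | 85 (37.3%) |
| Large (≥7 concepts) | 6 | 0 (0%) | 66 | 2 (3.0%) | 72 | 2 (2.78%) |
| Total | 200 | 78 (39%) | 100 | 9 (9%) | 300 | 87 (29%) |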

All erroneous concepts and proposed corrections were reported to JTC, head of the US Extension of SCT and a co-author. JTC noted that several significant modeling errors and inconsistency patterns were uncovered by this study (see online supplementary appendix IV).

For H1 we found 39% ( = 78/200) of overlapping concepts erroneous, versus 9% ( = 9/100) of non-overlapping concepts. Thus, overlapping concepts are 4.33 times more likely to be erroneous. For our statistical analysis, we used the double bootstrap approach to account for potential dependency of errors in our sample. We found H1 to be statistically significant (p = 0.0016).

For H2, we tested several boundary points between small and large disjoint partial-areas (see online supplementary appendix III). Using a boundary point of seven,16,19 we found 37.3% ( = 85/228) of concepts in small disjoint partial-areas erroneous versus 2.78% ( = 2/72) of concepts in large disjoint partial-areas. H2 was also significant (p = 0.0394).

In the Bleeding subtaxonomy, the disjoint partial-area taxonomy for {Associated morphology, Finding site} had only one large overlapping disjoint partial-area. The six concepts sampled from this disjoint partial-area had no errors. Small overlapping disjoint partial-areas, on the other hand, had an error rate of 40.2% ( = 78/194). But due to the small sample size for large overlapping disjoint partial-areas, H3 was not significant (p = 0.2601).
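The error rates quoted for H1–H3 follow directly from the counts in table 1, as the short Python sketch below shows. It only reproduces the descriptive arithmetic; it does not implement the double bootstrap used for the significance tests, and the variable names are chosen for illustration.

```python
def error_rate(erroneous, sampled):
    """Fraction of sampled concepts that the auditors agreed were erroneous."""
    return erroneous / sampled


# Counts copied from table 1.
h1_overlapping = error_rate(78, 200)
h1_non_overlapping = error_rate(9, 100)
h1_ratio = h1_overlapping / h1_non_overlapping

h2_small = error_rate(85, 228)
h2_large = error_rate(2, 72)

h3_small_overlapping = error_rate(78, 194)
h3_large_overlapping = error_rate(0, 6)

print(f"H1: {h1_overlapping:.1%} vs {h1_non_overlapping:.1%}, ratio {h1_ratio:.2f}")
print(f"H2: {h2_small:.1%} vs {h2_large:.2%}")
print(f"H3: {h3_small_overlapping:.1%} vs {h3_large_overlapping:.1%}")
```

The H3 comparison rests on only six sampled concepts from large overlapping disjoint partial-areas, which is why a large difference in rates does not translate into statistical significance.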

Table 2 provides a breakdown of errors by overlap level of the disjoint partial-area taxonomy. To test H4, each level is compared to the previous level. From level 1 to level 7 the error rate increases, as expected. We found this hypothesis statistically significant when comparing level 1 to level 2 (p = 0.0322) and level 2 to level 3 (p = 0.0336). Other comparisons were not significant due to the smaller sample sizes of level 4 and above. However, when we compared level 3 to levels 4–8 combined (error rate of 25/39 = 64.1%), we found significance (p = 0.0116). Table 3 shows five examples of errors and their proposed solutions.

Table 2.

Auditing results broken down by disjoint partial-area taxonomy level

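| Level | Sample concepts (n) | Erroneous concepts (n) | Percentage of erroneous concepts |
|---|---|---|---|
| 1 | 100 | 9 | 9% |
| 2 | 90 | 24 | 26.7% |
| 3 | 71 | 29 | 40.8% |
| 4 | 18 | 9 | 50% |
| 5 | 10 | 7 | 70% |
| 6 | 6 | 5 | 83.3% |
| 7 | 2 | 2 | 100% |
| 8 | 3 | 2 | 66.7% |
| Total | 300 | 87 | 29% |
| Total for 4–8 | 39 | 25 | 64.1% |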
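The per-level percentages and the pooled figures can be recomputed from the level rows of table 2 with the Python sketch below. The dictionary layout and the helper name `pooled` are assumptions made for illustration.

```python
# (sampled, erroneous) per overlap level, copied from the level rows of table 2.
levels = {
    1: (100, 9),
    2: (90, 24),
    3: (71, 29),
    4: (18, 9),
    5: (10, 7),
    6: (6, 5),
    7: (2, 2),
    8: (3, 2),
}


def pooled(level_ids):
    """Pool several levels and return (sampled, erroneous, error rate)."""
    sampled = sum(levels[level][0] for level in level_ids)
    erroneous = sum(levels[level][1] for level in level_ids)
    return sampled, erroneous, erroneous / sampled


rates = {level: erroneous / sampled for level, (sampled, erroneous) in levels.items()}
total = pooled(levels)          # all eight levels
high = pooled(range(4, 9))      # levels 4-8 combined

for level, rate in rates.items():
    print(f"level {level}: {rate:.1%}")
print(f"all levels: {total[1]}/{total[0]} = {total[2]:.1%}")
print(f"levels 4-8: {high[1]}/{high[0]} = {high[2]:.1%}")
```

Pooling levels 4–8 trades resolution for sample size: the individual levels above 3 have at most 18 sampled concepts each, while the pooled group has 39.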

Table 3.

Five examples of errors reported by the auditors

| Concept | Error | Proposed solution |
|---|---|---|
| Bleeding varices of prostate | Missing relationships: Associated morphology and Finding site, with target concepts Varix and Venous structure, respectively | Add the two new relationships in a role group |
| Hemorrhage of cervix | Incorrect parent: Hemorrhage of abdominal cavity structure | Remove IS-A to Hemorrhage of abdominal cavity structure (corrected independently in Jan 2014 release) |
| Hematoma of pinna | Missing child: Chronic hematoma of pinna (which is incorrectly a synonym of the concept Cauliflower ear) | Add Chronic hematoma of pinna concept and remove the synonym from Cauliflower ear |
| Peptic ulcer with hemorrhage AND obstruction | Incorrect relationship target: Associated morphology relationship with a target concept Hemorrhage | Make target concept of Associated morphology relationship Bleeding ulcer to be consistent with Esophageal bleeding |
| Bleeding gastric varices | Missing parent: Venous hemorrhage | Add IS-A to Venous hemorrhage |

JTC confirmed all of these errors and forwarded the corrections to the International Health Terminology Standards Development Organisation (IHTSDO).

Cancer subject-based subtaxonomy

Table 4 documents the metrics of several other subject-based subtaxonomies. We derived the Cancer (Malignant neoplastic disease) subject-based subtaxonomy for the January 2014 SCT release (figure 5). The majority of the Cancer subtaxonomy's concepts (3124, 88.5%) are in {Associated morphology, Finding site} (like Bleeding). While this area has several large partial-areas, for example, Malignant neoplasm of soft tissue (804), it has 2398 overlapping concepts (76.8%). Thus, the disjoint partial-area taxonomy will better summarize its content.

The Cancer subject-based subtaxonomy includes 64 partial-areas (highlighted in yellow in figure 5) that are not in the complete Clinical finding taxonomy. These concepts are typically inside large partial-areas in the complete taxonomy, for example, all of the concepts in the {Associated morphology, Finding site} yellow partial-areas in figure 5 are inside the large Mass of body structure (7010 concepts) partial-area. This occurs because the relationships are introduced in the subtaxonomy at a lower descendant than in the complete taxonomy. Thus, the Cancer subject-based subtaxonomy summarizes SCT Cancer disorders in a view that is more useful for TQA.

DISCUSSION

We demonstrated scalability of taxonomy-based terminology maintenance to large SCT hierarchies using subject-based subtaxonomies. This represents a significant improvement over our previous approach of reviewing complete taxonomies, which may have thousands of partial-areas (eg, Clinical finding). Such large taxonomies are hard for humans to visualize, which hinders taxonomy-based TQA that relies on reviewing groups of concepts with higher error rates (eg, small partial-areas16,19,20). There are thousands of such concepts in a large hierarchy, for example, 14 450 (14.3%) concepts in ‘small’ partial-areas and 14 220 (14.5%) overlapping concepts in the Clinical finding hierarchy. Available TQA resources do not typically enable a thorough review of so many concepts.

We addressed these difficulties by combining several novel techniques. First, we concentrate on a subject-based subtaxonomy, which is intuitive for terminology curators because it summarizes all descendants of a chosen broad concept, for example, Bleeding or Cancer. In this way, the attention of a curator is focused on a comprehensible subtaxonomy that still summarizes a sizable subject-based portion of the hierarchy. Second, we formulate refined hypotheses regarding concepts with a high likelihood of errors. Third, we prioritize the review of concepts according to the error ratios observed for the hypotheses.
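The first step, collecting all descendants of a chosen subject concept, can be sketched as a breadth-first traversal of the IS-A hierarchy. The following Python fragment is a minimal illustration only: the edge list is toy data assembled from concept names mentioned in this article, not actual SCT content, and the function name subject_subhierarchy is hypothetical. Deriving the subtaxonomy itself (areas and partial-areas) from the collected concepts is not shown.

```python
from collections import defaultdict, deque

# Toy IS-A edges as (child, parent) pairs. Illustrative only; these are
# NOT the actual SCT relationships.
IS_A = [
    ("Bleeding", "Clinical finding"),
    ("Venous hemorrhage", "Bleeding"),
    ("Bleeding gastric varices", "Venous hemorrhage"),
    ("Hematoma of pinna", "Bleeding"),
    ("Malignant neoplasm disease", "Clinical finding"),
]


def subject_subhierarchy(is_a_edges, subject):
    """Return the subject concept together with all of its descendants."""
    children = defaultdict(set)
    for child, parent in is_a_edges:
        children[parent].add(child)

    seen = {subject}
    queue = deque([subject])
    while queue:
        concept = queue.popleft()
        # .get avoids adding empty entries for leaf concepts
        for child in children.get(concept, ()):
            if child not in seen:  # a concept may have several parents
                seen.add(child)
                queue.append(child)
    return seen


bleeding = subject_subhierarchy(IS_A, "Bleeding")
print(sorted(bleeding))
```

Because a concept may have more than one parent, the traversal tracks visited concepts so that each one is collected once.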

When previously developing taxonomy-based TQA methodologies for small hierarchies, we discovered two kinds of groups with a higher likelihood of errors: concepts in small partial-areas19,20 and overlapping concepts.21 The challenges for scalability were to determine whether this still holds for concepts in subject-based subtaxonomies and how to prioritize among the concepts of these groups.

We formulated and tested new hypotheses (H2–H4), while confirming the previously established hypothesis (H1), for the Bleeding subtaxonomy. When there are many overlapping concepts and an extensive level of overlap, as for the Bleeding and Cancer subtaxonomies, resources for reviewing overlapping concepts need to be prioritized.

According to this study, the TQA methodology steps corresponding to the hypotheses should be applied in decreasing error percentage order (table 5).

Table 5.

Order of TQA methodology steps

Order | Hypothesis | Overlap | Errors expected
1 | H4 | Overlap levels 4–8 | 64.1%
2 | H4 | Overlap level 3 | 40.8%
3 | H3 | Small overlapping disjoint partial-areas | 40.2%
4 | H4 | Overlap level 2 | 26.7%
5 | H2 | Small non-overlapping disjoint partial-areas | 20.6%

H, hypothesis; TQA, terminology quality assurance.

Thus, an editor will achieve a higher yield for a given effort. Future studies will investigate error rates in other subject-based subtaxonomies, for example, Cancer with 2398 overlapping concepts, to verify this order.
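The ordering in table 5 amounts to sorting the concept groups by their expected error percentage and reviewing them from the top. A minimal Python sketch of this prioritization follows. The percentages are those of table 5; the names ReviewGroup, review_order, and expected_errors are hypothetical and introduced only for illustration, and the yield estimate assumes that the error rates observed for Bleeding carry over to the group being reviewed, which remains to be verified for other subtaxonomies.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReviewGroup:
    hypothesis: str
    description: str
    expected_error_pct: float  # value reported in table 5


# Rows of table 5, deliberately listed out of order.
GROUPS = [
    ReviewGroup("H2", "Small non-overlapping disjoint partial-areas", 20.6),
    ReviewGroup("H4", "Overlap level 3", 40.8),
    ReviewGroup("H4", "Overlap level 2", 26.7),
    ReviewGroup("H4", "Overlap levels 4-8", 64.1),
    ReviewGroup("H3", "Small overlapping disjoint partial-areas", 40.2),
]


def review_order(groups):
    """Sort groups by decreasing expected error percentage."""
    return sorted(groups, key=lambda g: g.expected_error_pct, reverse=True)


def expected_errors(group, concepts_reviewed):
    """Rough yield estimate, assuming the table 5 rate applies unchanged."""
    return concepts_reviewed * group.expected_error_pct / 100


for rank, group in enumerate(review_order(GROUPS), start=1):
    print(rank, group.hypothesis, group.description, f"{group.expected_error_pct}%")
```

With a fixed review budget, an editor would work through the groups in this order and stop when the budget is exhausted.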

The study confirms most of our hypotheses and the feasibility of the subject-based subtaxonomy paradigm to support scalability of taxonomy-based maintenance of large SCT hierarchies. More experiments will be performed using other subtaxonomies to test the hypotheses for which the sample sizes in the Bleeding subtaxonomy were not sufficient to achieve statistical significance (ie, H3). Table 4 lists the metrics for subtaxonomies for the 10 most common causes of death,41 along with their sizes relative to the complete taxonomy in terms of number of concepts and partial-areas. Nine are from Clinical finding and Suicide is from Event.

CONCLUSIONS

We introduced the subject-based subtaxonomy, which summarizes a subhierarchy rooted at a subject concept within a large SCT hierarchy. The subject-based subtaxonomy supports effective terminology maintenance based on four hypotheses for groups of concepts expected to have a higher likelihood of errors. We derived the subject-based subtaxonomy and used it for terminology maintenance for the Bleeding subhierarchy of SCT. Directing TQA efforts towards more complex concepts achieves a higher error yield.

CONTRIBUTORS

CO, JG, and YP were the primary authors of this work. YC, JX, and HM reviewed the sample of 300 concepts for this study and participated in the consensus-based evaluation. JTC reviewed and verified our auditing results and provided the feedback for online supplementary appendix IV. ZW assisted with the statistical analysis.

COMPETING INTERESTS

None.

PROVENANCE AND PEER REVIEW

Not commissioned; externally peer reviewed.

SUPPLEMENTARY MATERIAL

Supplementary material is available online at http://jamia.oxfordjournals.org/.

REFERENCES

  • 1. Stearns MQ, Price C, Spackman KA, et al. SNOMED clinical terms: overview of the development process and project status. Proc AMIA Symp. 2001:662–6.
  • 2. Giannangelo K, Fenton SH. SNOMED CT survey: an assessment of implementation in EMR/EHR applications. Perspect Health Inf Manag. 2008;5:7.
  • 3. van der Kooij J, Goossen WT, Goossen-Baremans AT, et al. Using SNOMED CT codes for coding information in electronic health records for stroke patients. Stud Health Technol Inform. 2006;124:815–23.
  • 4. Elevitch FR. SNOMED CT: electronic health record enhances anesthesia patient safety. AANA J. 2005;73:361–6.
  • 5. Dougherty M. Standard terminology helps advance EHR. J AHIMA. 2003;74:59–60.
  • 6. Office of the National Coordinator for Health Information Technology, Department of Health and Human Services. Health information technology: initial set of standards, implementation specifications, and certification criteria for electronic health record technology; interim final rule. Fed Regist. 2010;75:44589–654.
  • 7. Hammond WE. eHealth interoperability. Stud Health Technol Inform. 2008;134:245–53.
  • 8. Giannangelo K, Berkowitz L. SNOMED CT helps drive EHR success. J AHIMA. 2005;76:66–7.
  • 9. Donnelly K. SNOMED-CT: The advanced terminology and coding system for eHealth. Stud Health Technol Inform. 2006;121:279–90.
  • 10. EHR Incentive Programs. 2014 [cited 9 September 2014]. http://www.cms.gov/Regulations-and-Guidance/Legislation/EHRIncentivePrograms/index.html?redirect=/ehrincentiveprograms/
  • 11. Mostashari F. Meaningful Use Stage 2: A Giant Leap in Data Exchange. 2012 [cited 9 September 2014]. http://www.healthit.gov/buzz-blog/meaningful-use/meaningful-use-stage-2/
  • 12. Eligible Professional Meaningful Use Core Measures Measure 3 of 13. 2014 [cited 9 September 2014]. http://www.cms.gov/Regulations-and-Guidance/Legislation/EHRIncentivePrograms/downloads/3_Maintain_Problem_ListEP.pdf.
  • 13. Rubin DL, Shah NH, Noy NF. Biomedical ontologies: a functional perspective. Brief Bioinform. 2008;9:75–90.
  • 14. Bodenreider O. Biomedical ontologies in action: role in knowledge management, data integration and decision support. Yearb Med Inform. 2008;1:67–79.
  • 15. Chute CG. Medical concept representation. Medical Informatics. 2005;8:163–82.
  • 16. Min H, Perl Y, Chen Y, et al. Auditing as part of the terminology design life cycle. J Am Med Inform Assoc. 2006;13:676–90.
  • 17. Wang Y, Halper M, Min H, et al. Structural methodologies for auditing SNOMED. J Biomed Inform. 2007;40:561–81.
  • 18. Wang Y, Halper M, Wei D, et al. Abstraction of complex concepts with a refined partial-area taxonomy of SNOMED. J Biomed Inform. 2012;45:15–29.
  • 19. Halper M, Wang Y, Min H, et al. Analysis of error concentrations in SNOMED. AMIA Annu Symp Proc. 2007:314–18.
  • 20. Ochs C, Perl Y, Geller J, et al. Scalability of abstraction-network-based quality assurance to large SNOMED hierarchies. AMIA Annu Symp Proc. 2013:1071–80.
  • 21. Wang Y, Halper M, Wei D, et al. Auditing complex concepts of SNOMED using a refined hierarchical abstraction network. J Biomed Inform. 2012;45:1–14.
  • 22. Elhanan G, Perl Y, Geller J. A survey of SNOMED CT direct users, 2010: impressions and preferences regarding content and quality. J Am Med Inform Assoc. 2011;18(Suppl 1):i36–44.
  • 23. Zhu X, Fan JW, Baorto DM, et al. A review of auditing methods applied to the content of controlled biomedical terminologies. J Biomed Inform. 2009;42:413–25.
  • 24. Jiang G, Chute CG. Auditing the semantic completeness of SNOMED CT using formal concept analysis. J Am Med Inform Assoc. 2009;16:89–102.
  • 25. Rector AL, Brandt S, Schneider T. Getting the foot out of the pelvis: modeling problems affecting use of SNOMED CT hierarchies in practical applications. J Am Med Inform Assoc. 2011;18:432–40.
  • 26. Rector AL, Iannone L. Lexically suggest, logically define: quality assurance of the use of qualifiers and expected results of post-coordination in SNOMED CT. J Biomed Inform. 2011;45:199–209.
  • 27. Schulz S, Hahn U, Rogers J. Semantic clarification of the representation of procedures and diseases in SNOMED® CT. Stud Health Technol Inform. 2005;116:773–8.
  • 28. Schulz S, Hanser S, Hahn U, et al. The semantics of procedures and diseases in SNOMED CT. Methods Inf Med. 2006;45:354–8.
  • 29. Schulz S, Suntisrivaraporn B, Baader F, et al. SNOMED reaching its adolescence: ontologists’ and logicians’ health check. Int J Med Inform. 2009;78(Suppl 1):S86–94.
  • 30. Ceusters W, Smith B, Kumar A, et al. Ontology-based error detection in SNOMED-CT. Stud Health Technol Inform. 2004;107(Pt 1):482–6.
  • 31. Mortensen JM, Musen MA, Noy NF. Crowdsourcing the verification of relationships in biomedical ontologies. AMIA Annu Symp Proc. 2013:1020–9.
  • 32. Kittur A, Chi EH, Suh B. Crowdsourcing user studies with Mechanical Turk. Proceedings of the SIGCHI Conference on Human Factors in Computing Systems 2008:453–6.
  • 33. Agrawal A, Perl Y, Elhanan G. Identifying problematic concepts in SNOMED CT using a lexical approach. Stud Health Technol Inform. 2013;192:773–7.
  • 34. Agrawal A, Perl Y, Chen Y, et al. Identifying inconsistencies in SNOMED CT problem lists using structural indicators. AMIA Annu Symp Proc. 2013:17–26.
  • 35. Agrawal A, Elhanan G, Halper M. Dissimilarities in the logical modeling of apparently similar concepts in SNOMED CT. AMIA Annu Symp Proc. 2010;2010:212–6.
  • 36. Bodenreider O, Smith B, Kumar A, et al. Investigating subsumption in SNOMED CT: an exploration into large description logic-based biomedical terminologies. Artif Intell Med. 2007;39:183–95.
  • 37. Bodenreider O, Burgun A, Rindflesch TC. Assessing the consistency of a biomedical terminology through lexical knowledge. Int J Med Inform. 2002;67:85–95.
  • 38. Fragoso G, de Coronado S, Haber M, et al. Overview and utilization of the NCI thesaurus. Comp Funct Genomics. 2004;5:648–54.
  • 39. Geller J, Ochs C, Perl Y, et al. New abstraction networks and a new visualization tool in support of auditing the SNOMED CT Content. AMIA Annu Symp Proc. 2012:237–46.
  • 40. Gu H, Elhanan G, Perl Y, et al. A study of terminology auditors’ performance for UMLS semantic type assignments. J Biomed Inform. 2012;45:1042–8.
  • 41. Deaths and Mortality. 2014 14 July 2014 [cited 9 September 2014]. http://www.cdc.gov/nchs/fastats/deaths.htm.

Articles from Journal of the American Medical Informatics Association : JAMIA are provided here courtesy of Oxford University Press
