AMIA Annual Symposium Proceedings. 2010 Nov 13;2010:842–846.

The Use of Semantic Distance Metrics to Support Ontology Audit

Jian Wang *, Roger Day *, Shyam Visweswaran *, William Hogan
PMCID: PMC3041307  PMID: 21347097

Abstract

This study explored the possibility that semantic distance metrics can be used to develop methods for auditing biomedical ontologies. We developed and tested an approach using the Foundational Model of Anatomy (FMA) and the body-structure taxonomy of SNOMED CT. We evaluated 190 class pairs in human anatomical structures using three semantic distance metrics: simple edge count, normalized path length, and information content. We applied principal component analysis (PCA) to study relationships between the semantic distance measurements so produced in FMA and SNOMED CT. We found that our application of PCA could detect significant discrepancies, but not necessarily outright mistakes, in the two ontologies. A review of discrepancies revealed that they often relate to multiple design perspectives employed in ontological definitions.

Introduction

Evaluation of the ontological representation of medical entities is an important part of ontology quality assurance. Different methods have been developed for this purpose; some are programmatic and others rely on input from domain experts. Bodenreider & Zhang [1] programmatically compared lexical matches and is-a relationship similarity between FMA and SNOMED CT and reported good agreement between the two in the modeling of anatomical entities: 97% of the mappings produced by lexical matching were supported by structural evidence. Héja, Surján & Varga [2] manually reviewed an arbitrary selection of SNOMED CT concepts against the formal top-level ontology DOLCE and identified several types of ontological errors in SNOMED CT, including mixing of the is-a relation with other relations. Kalet, Mejino, Wang, Whipple & Brinkley [3] manually reviewed the lymphatic representation in FMA for compliance with domain-specific constraints; they observed modeling problems with downstream relations (connections between entities of the same type) and paths, and corrected them in the ontology.

Quality assurance is a continuous task for ontology designers. Auditing methods explored by researchers [10] usually fall into one of these three categories: manual, automated systematic and automated heuristic. An automated method is often used to direct ontology designers to a small number of the most “problematic” classes and relationships in the ontologies.

Common semantic distance metrics include a path length metric and an information content metric. Published investigations demonstrate the promise of these metrics in general taxonomies of classes linked by an is-a relationship. Rada, Mili, Bicknell & Blettner [4] suggest that “when the paths are restricted to is-a links (in a good hierarchical semantic network), the shortest path length does measure conceptual distance.” Resnik [6] states that “the measure performs encouragingly well.”

To our knowledge, there is no published work on the use of semantic distance metrics and statistical methods for ontology auditing. Because semantic distance metrics typically count the number of is-a edges between a pair of nodes as part of the calculation, it stands to reason that errors in is-a relationships could influence semantic distance. Besides error and true dissimilarity between the entities that classes represent, another possible reason for variation in semantic distance metrics is incomplete modeling of classes. For example, consider two hypothetical ontologies, A and B. Suppose that ontology A states that dog is-a mammal, mammal is-a chordate, snake is-a reptile, and reptile is-a chordate. Ontology B, by contrast, states simply that snake is-a chordate and dog is-a chordate. Then the semantic distance between snake and dog is much shorter in ontology B as a result of the incomplete modeling.
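The dog/snake example can be checked with a few lines of code. The sketch below is illustrative only: the function name edge_count and the edge-list encoding are assumptions, and the search treats is-a edges as undirected, which is adequate for these small trees.

```python
from collections import deque

def edge_count(is_a, a, b):
    """Fewest is-a edges between a and b, treating edges as undirected."""
    # Build an undirected adjacency map from (child, parent) links.
    neighbors = {}
    for child, parent in is_a:
        neighbors.setdefault(child, set()).add(parent)
        neighbors.setdefault(parent, set()).add(child)
    # Breadth-first search from a until b is reached.
    queue = deque([(a, 0)])
    seen = {a}
    while queue:
        node, dist = queue.popleft()
        if node == b:
            return dist
        for nxt in neighbors.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None  # the two classes are not connected

# Ontology A models the intermediate classes; ontology B omits them.
ontology_a = [("dog", "mammal"), ("mammal", "chordate"),
              ("snake", "reptile"), ("reptile", "chordate")]
ontology_b = [("dog", "chordate"), ("snake", "chordate")]

dist_a = edge_count(ontology_a, "snake", "dog")  # 4 edges, via chordate
dist_b = edge_count(ontology_b, "snake", "dog")  # 2 edges, via chordate
```

The same pair is twice as far apart in ontology A as in ontology B, although neither ontology contains an incorrect is-a statement.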

The objective of this study is to develop methods that use semantic distance for medical ontology quality assurance. The methods compare the semantic distance produced by different calculation metrics and on different ontology platforms. The hypothesis is that comparing semantic distance between class pairs in two ontologies reveals a small number of classes that represent likely ontology mistakes. The two ontologies of interest are the Foundational Model of Anatomy Ontology (FMA) and the body structure hierarchy of SNOMED CT.

Data

The FMA is an ontology of structural human anatomy, mainly used in medical education and clinical decision making systems. SNOMED CT is a comprehensive healthcare terminology. These applications require the ontologies to accurately reflect medical science and various application purposes. Both FMA and SNOMED CT contain human anatomy classes and use is-a as a major relation to link these classes in the respective networks. The graphical representation of these ontologies is an acyclic network of anatomical class nodes connected by is-a relationship edges.

For this study we used the FMA release in .obo format (fma2_obo.obo), downloaded from http://www.obofoundry.org/ on 8/28/2009. We retrieved all terms and their is-a relationships from this .obo file and built them into a hierarchical tree. The tree contains 78,541 unique nodes, and has a depth (largest number of edges from a leaf node to the root) of 20.

We used the July 31, 2009 release of SNOMED CT. We included for this study every concept whose name contains “(body structure)” in its description. SNOMED CT has a directed acyclic graph (DAG) structure, because each class may have multiple parent classes. This can produce multiple paths from a class to the network’s root node. The network contained 25,853 unique SNOMED CT classes, with the root being the “Physical anatomical entity (body structure)” class (SNOMED ID: 91722005). Its hierarchy depth is 17.

Note that although the FMA and SNOMED CT use different words to refer to their representational units (‘class’ vs. ‘concept’), to avoid ambiguity in what follows, we use the word ‘class’ consistently to refer to the representational units in both.

Author WRH selected a “convenience” sample of 20 human anatomical classes that exist in both FMA and SNOMED CT. The sample was chosen specifically to include both “similar” and “dissimilar” human anatomical classes. The “similar” and “dissimilar” classification is subjective and based on his clinical training. Names of the 20 selected classes are: Aorta, Brain, Femoral artery, Gallbladder, Heart, Humerus, Inferior vena cava, Kidney, Liver, Long thoracic nerve, Lower extremity, Lower lobe of lung, Lung, Renal pelvis, Right ventricle of heart, Semilunar line, Subclavian vein, Thoracic cavity, Tibia, Upper extremity.

Because SNOMED CT uses Structure-Entire-Part triples to model anatomy [9], there is not an exact one-to-one correspondence, for example, between Liver in the FMA and Liver in SNOMED CT.

SNOMED CT has Entire liver and Liver structure, both of which have a synonym of ‘Liver’. Thus, we decided to include both SNOMED Entire and SNOMED Structure in the analysis, where Liver in the FMA was mapped to Entire liver and Liver structure, respectively. Author WRH performed all mappings manually between FMA/SNOMED Entire and FMA/SNOMED Structure.

Three lists of class pairs were therefore compiled from these 20 classes, one list each for FMA, SNOMED Structure, and SNOMED Entire. Each list contains 190 unique class pairs (20 x 19 / 2).
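The pair count can be reproduced directly from the 20 class names listed above. This is a minimal sketch; the variable names are illustrative.

```python
from itertools import combinations

# The 20 anatomical classes selected for the study.
classes = [
    "Aorta", "Brain", "Femoral artery", "Gallbladder", "Heart", "Humerus",
    "Inferior vena cava", "Kidney", "Liver", "Long thoracic nerve",
    "Lower extremity", "Lower lobe of lung", "Lung", "Renal pelvis",
    "Right ventricle of heart", "Semilunar line", "Subclavian vein",
    "Thoracic cavity", "Tibia", "Upper extremity",
]

# Unordered pairs of distinct classes: 20 * 19 / 2 = 190.
pairs = list(combinations(classes, 2))
```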

Methods

This investigation consists of two major tasks: to calculate the semantic distance of class pairs and to develop ontology audit methods based on this semantic distance.

The semantic distance calculation employs three metrics: simple edge count, normalized shortest-path length [5], and information content [6]. These metrics are applied to the three lists of 190 class pairs across the two ontology networks. In this study, we refer to these metrics as “SEC”, “Leacock” and “Resnik”, respectively.
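For orientation, the three metrics are commonly written as follows in the cited literature [4–6]. This is a sketch of the standard published forms, not necessarily the exact normalization used in this study, which is not spelled out here. The symbols are assumptions introduced for illustration: len(c1, c2) is the shortest path length counted in nodes, D is the maximum depth of the taxonomy, S(c1, c2) is the set of classes that subsume both c1 and c2, and p(c) is the probability of encountering an instance of class c.

```latex
\begin{align*}
d_{\mathrm{SEC}}(c_1, c_2) &= \text{number of is-a edges on the shortest path between } c_1 \text{ and } c_2 \\
\mathrm{sim}_{\mathrm{Leacock}}(c_1, c_2) &= -\log \frac{\mathrm{len}(c_1, c_2)}{2D} \\
\mathrm{sim}_{\mathrm{Resnik}}(c_1, c_2) &= \max_{c \in S(c_1, c_2)} \bigl[ -\log p(c) \bigr]
\end{align*}
```

The first quantity is a distance, while the other two are similarities, so larger Leacock and Resnik values correspond to shorter semantic distances.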

Calculation of the semantic distance with the three metrics is straightforward for the 190 FMA class pairs, as every pair is connected by a unique path in the FMA tree. In SNOMED CT, however, two classes might have multiple paths between them because of multiple inheritance. For simplicity’s sake, we chose the shortest path to compute the semantic distance between a class pair in SNOMED CT. Pedersen et al. also took this approach [7].
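One way to find the shortest is-a path in a multiple-inheritance hierarchy is to compute, for each class, its minimum edge count to every ancestor, and then minimize the summed distance over the shared ancestors. The sketch below assumes that approach; the function names and the toy hierarchy are hypothetical and are not taken from SNOMED CT.

```python
from collections import deque

def ancestor_distances(parents, start):
    """Map each ancestor of start (and start itself) to its minimum edge count."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for parent in parents.get(node, ()):
            if parent not in dist:  # in BFS the first visit is the shortest
                dist[parent] = dist[node] + 1
                queue.append(parent)
    return dist

def shortest_is_a_path(parents, a, b):
    """Return (edge count, common subsumer) for the shortest path from a to b."""
    up_a = ancestor_distances(parents, a)
    up_b = ancestor_distances(parents, b)
    common = set(up_a) & set(up_b)
    if not common:
        return None
    # Ties are broken alphabetically so the result is deterministic.
    best = min(common, key=lambda c: (up_a[c] + up_b[c], c))
    return up_a[best] + up_b[best], best

# Hypothetical toy hierarchy: child -> list of parents.
parents = {
    "gallbladder": ["digestive organ", "abdominal structure"],
    "liver": ["digestive organ", "abdominal structure"],
    "heart": ["organ"],
    "digestive organ": ["organ"],
    "abdominal structure": ["body region"],
    "organ": ["root"],
    "body region": ["root"],
}

gallbladder_liver = shortest_is_a_path(parents, "gallbladder", "liver")
gallbladder_heart = shortest_is_a_path(parents, "gallbladder", "heart")
```

In a tree such as the FMA hierarchy every class has one parent, so the same function returns the unique path length.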

Our hypothesis was that by comparing the semantic distance between two classes in two ontologies, we might uncover mistakes in one of the two ontologies. For example, if Liver and Lung have divergent semantic distance values in FMA vs. SNOMED CT, then there might be a problem with how one of the ontologies models the two classes.

We formalized this notion using principal component analysis. In this approach, we first compared the distance of the 190 pairs across the two ontologies. The first principal component (PC1) defines a relationship between semantic distance in FMA and the distance in SNOMED CT. The second principal component (PC2) measures how far each of the 190 class pairs lies from this relationship. One of our hypotheses is that pairs with the greatest deviation from the PC1 very likely represent errors in one of the two ontologies.

We repeated the principal component analysis for each of the semantic distance metrics. We plotted for each class pair the PC2 for one metric against the PC2 for another metric; we did so for each unique pair of metrics (SEC/Leacock, SEC/Resnik, Leacock/Resnik). We further hypothesized that discrepancies among metrics for a given class pair would help identify modeling mistakes.

The semantic distance of a class pair in the two ontologies is represented by an (x, y) point in a semantic distance scatter chart, where the X and Y axes correspond to the two ontologies of interest. In this scatter chart, PC1 is represented by the line fitted through the scatter points along their direction of greatest spread (also called the major axis), and a PC2 value is represented by a point’s orthogonal distance from the major axis. These orthogonal distances are then plotted on a second scatter chart whose axes correspond to two different metrics. Points that are far from both axes are “outliers” on which the two metrics agree, while points that are far from the origin but lie close to one axis are “outliers” on which the two metrics disagree.
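For two-dimensional data the major axis and the orthogonal distances can be computed in closed form. The sketch below is one possible implementation under stated assumptions: the function name pca_2d and the sample values are hypothetical, and the covariance uses the sample (n - 1) denominator, which the text does not specify.

```python
import math

def pca_2d(points):
    """Return the PC1 direction and the PC1/PC2 scores of 2-D points."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    # Entries of the 2x2 sample covariance matrix.
    sxx = sum((x - mean_x) ** 2 for x, _ in points) / (n - 1)
    syy = sum((y - mean_y) ** 2 for _, y in points) / (n - 1)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points) / (n - 1)
    # Angle of the leading eigenvector, i.e. the major axis.
    theta = 0.5 * math.atan2(2 * sxy, sxx - syy)
    ux, uy = math.cos(theta), math.sin(theta)
    # PC1 score: position along the major axis.
    pc1 = [(x - mean_x) * ux + (y - mean_y) * uy for x, y in points]
    # PC2 score: signed orthogonal distance from the major axis.
    pc2 = [-(x - mean_x) * uy + (y - mean_y) * ux for x, y in points]
    return (ux, uy), pc1, pc2

# Hypothetical (distance in ontology 1, distance in ontology 2) values.
sample = [(2.0, 1.0), (4.0, 3.0), (6.0, 4.0), (8.0, 6.0), (7.0, 2.0)]
direction, pc1_scores, pc2_scores = pca_2d(sample)
```

Because the points are centered before projection, the PC2 scores average to zero, and a large absolute PC2 score marks a class pair that departs from the overall relationship between the two ontologies.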

In this study, we defined “agreed outliers” as those class pairs whose absolute PC2 values are greater than 2 times the PC2 standard deviation in both semantic distance metrics, and “disagreed outliers” as those class pairs whose absolute PC2 value is greater than 2 times the standard deviation in one semantic distance metric and less than 20% of the standard deviation in the other. The factor of 2 and the 20% threshold are choices of convenience that we expected to provide good separation in our study.
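The two outlier definitions translate into a short classification routine. The following is a sketch only: the function name, the label strings, and the use of the sample standard deviation are assumptions made for illustration.

```python
import statistics

def classify_outliers(pc2_a, pc2_b, high=2.0, low=0.2):
    """Label each class pair from its PC2 values under two metrics."""
    sd_a = statistics.stdev(pc2_a)
    sd_b = statistics.stdev(pc2_b)
    labels = []
    for a, b in zip(pc2_a, pc2_b):
        far_a, far_b = abs(a) > high * sd_a, abs(b) > high * sd_b
        near_a, near_b = abs(a) < low * sd_a, abs(b) < low * sd_b
        if far_a and far_b:
            labels.append("agreed")      # both metrics flag the pair
        elif (far_a and near_b) or (far_b and near_a):
            labels.append("disagreed")   # one metric flags it, the other does not
        else:
            labels.append("none")
    return labels

# Hypothetical PC2 values for ten class pairs under two metrics.
metric_1 = [0.0] * 9 + [1.0]
metric_2 = [0.0] * 8 + [1.0, 0.0]
labels = classify_outliers(metric_1, metric_2)
```

Pairs whose PC2 value is large under one metric and moderate under the other fall into neither category under these thresholds.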

Results

Multiple inheritance in the SNOMED CT hierarchy produces a considerable number of different paths from a class to the network’s root class, “Physical anatomical entity (body structure).” The average number of such unique paths per class is 40 for the SNOMED Entire class group and 25 for the SNOMED Structure class group. This results in a very large number of paths between any two classes.
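Counting the unique paths from a class to the root is a standard memoized traversal of the is-a graph. The sketch below uses a hypothetical four-class hierarchy; the function name is illustrative.

```python
from functools import lru_cache

def count_root_paths(parents, node, root):
    """Count distinct is-a paths from node up to root in an acyclic hierarchy."""
    @lru_cache(maxsize=None)
    def count(n):
        if n == root:
            return 1
        # Each parent contributes all of its own paths to the root.
        return sum(count(p) for p in parents.get(n, ()))
    return count(node)

# Hypothetical toy hierarchy: child -> list of parents.
parents = {
    "d": ["b", "c"],
    "c": ["a", "b"],
    "b": ["a"],
    "a": [],
}

# Paths from d to a: d-b-a, d-c-a, d-c-b-a.
n_paths = count_root_paths(parents, "d", "a")
```

Multiplying the root-path counts of two classes gives an upper bound on the number of paths between them that pass through a shared ancestor.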

Calculating the semantic distance with the three metrics on the three lists of class pairs produces the basic statistical profile shown in Table 1. On all three semantic distance metrics, the average pairwise semantic distance is longer in the FMA than in SNOMED CT. Note that higher Leacock and Resnik values indicate shorter distances, and vice versa.

Table 1.

Profile of semantic distance results from the 190 class pairs.

Values are average / standard deviation.

Ontology            SEC            Leacock        Resnik
FMA                 9.17 / 2.78    0.67 / 0.18    0.08 / 0.14
SNOMED Structure    6.55 / 2.60    0.76 / 0.23    0.22 / 0.13
SNOMED Entire       6.58 / 2.56    0.76 / 0.23    0.29 / 0.18

The principal component analysis begins with the calculation of the covariance matrix and its eigenvalues and eigenvectors for each semantic distance metric. The leading eigenvector defines the major axis, which represents PC1 on the semantic distance scatter chart. See Figure 1 for examples.

Figure 1.


Plots of semantic distance of 190 class pairs between FMA and SNOMED Entire and their first principal component. The major axis is indicated by a line on each chart.

As expected for centered data, the average of the PC2 values for each metric is very close to zero. Their standard deviations differ across the three metrics: 2.35, 0.13 and 0.20 (SEC, Leacock and Resnik) for FMA / SNOMED Structure, and 2.12, 0.14 and 0.16 for FMA / SNOMED Entire. The PC2 values produced by the Leacock and Resnik metrics are on the same scale, so in the rest of this experiment we refer only to the results produced by these two metrics and omit the SEC.

The PC2 values derived from the semantic distance on the two metrics are plotted on scatter charts (Figure 2) where each point represents a class pair. A class pair’s location indicates the extent of its deviation from the major axis. The farther a point is from the origin (0, 0), the less contribution the class pair makes to the PC1 relationship.

Figure 2.


Plots of PC2 for Resnik and Leacock metrics. Arrows and circle indicate class pairs of “agreed outlier” and “disagreed outlier”, respectively.

We identified three “agreed outlier” class pairs in the PC2 distribution of FMA / SNOMED Entire. We also identified two “agreed outlier” class pairs and three “disagreed outlier” class pairs in the PC2 distribution of FMA / SNOMED Structure. Table 2 lists all the outlier class pairs found in this study, together with their outlier type and the ontology context in which they occurred.

Table 2.

Outlier class pairs, their type and ontology context.

Outlier class pairs                                        Outlier type and ontology context
Renal pelvis / Kidney; Liver / Kidney                      Agreed in FMA / SNOMED Structure
Lung / Heart; Gallbladder / Liver; Gallbladder / Kidney    Disagreed in FMA / SNOMED Structure; agreed in FMA / SNOMED Entire

Conclusion & Discussion

In this study, we found that there is a moderate-to-strong relationship between the semantic distance values in SNOMED CT and FMA for a convenience sample of 20 classes. In other words, the greater the semantic distance between two classes in SNOMED CT, the greater the semantic distance between those classes in FMA in general. The results of this study indicate that the PCA was effective in finding class pairs whose distance deviated significantly from the trend. The frequent appearance of certain classes in pairs with large deviations suggests classes for further review by experts to identify errors in modeling and/or differences in modeling between the two ontologies.

The semantic distance between two classes in an ontology network can be affected by many factors, among which are the construction of the ontology network, the choice of semantic distance metric, and the classes of interest. The plots of PC2 using the Leacock and Resnik metrics suggest that a significant deviation from the major axis is more often associated with some classes than with others. This association could be caused by several factors. One possible cause is that FMA and SNOMED CT follow different design philosophies. A review of some of the discrepancies suggests that factors such as the relative physical locations of two organs and the functional similarities between them are considered differently in the design of the two ontologies.

For example, the class “gallbladder” and the class “liver” have an SEC semantic distance of 7 in FMA and 2 in SNOMED CT for both the Structure and the Entire classes. FMA takes a single structural view in the two classes’ definitions, and the two classes are classified according to their physical structure. This produces 7 edges between the two classes, with their least common subsumer being the “Organ” class in the FMA. Figure 3 shows the path and the least common subsumer of the two classes in FMA.

Figure 3.


Class ontological definitions (is-a relationship) of “Gallbladder” and “Liver” and their connection path in FMA.

SNOMED CT, on the other hand, also classifies organs by their various functions and locations, which leads to multiple inheritance. The Entire gallbladder class has 45 unique paths to the hierarchy’s root and the Entire liver class has 49 unique paths. This yields up to 45 x 49 = 2205 unique paths between these two classes in the hierarchy, with SEC values ranging from 2 to 19. Along these 2205 paths, 16 different classes serve as least common subsumers in the path-based semantic distance metrics. These 16 classes are the shared classifications of the two classes concerned; among them, the Body region structure class speaks to the organs’ shared location and the Entire digestive organ class speaks to a shared function. Figure 4 illustrates these two perspectives employed in the SNOMED CT classification of the two classes. As for the Resnik metric, the “Entire digestive organ” class has only 10 sub-classes defined in SNOMED CT, while the “Organ” class has 3866 sub-classes in FMA.
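The effect of these sub-class counts on an information content measure can be illustrated with a short calculation. The sketch below assumes that the probability of a class is estimated as the fraction of classes at or below it in the hierarchy; this estimate, the function name intrinsic_ic, and the use of the natural logarithm are assumptions, since the exact estimate used in the study is not stated. The hierarchy sizes are the 78,541 FMA nodes and 25,853 SNOMED CT classes reported in the Data section.

```python
import math

def intrinsic_ic(subclass_count, total_classes):
    """Information content as -log of the share of classes at or below a class."""
    # The +1 counts the class itself along with its sub-classes (an assumption).
    return -math.log((subclass_count + 1) / total_classes)

# "Organ" in FMA: 3866 sub-classes out of 78,541 classes.
ic_organ_fma = intrinsic_ic(3866, 78541)

# "Entire digestive organ" in SNOMED CT: 10 sub-classes out of 25,853 classes.
ic_digestive_snomed = intrinsic_ic(10, 25853)
```

Under this estimate, a subsumer with few sub-classes carries much more information than one with thousands, which is consistent with Gallbladder / Liver appearing more similar in SNOMED CT than in FMA on the Resnik metric.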

Figure 4.


Excerpts of ontological definitions (is-a relationship) of class “Entire gallbladder” and “Entire liver” in SNOMED CT and their connection paths.

Also, we used the inferred is-a relationships from SNOMED CT and not the stated is-a relationships. Given that the classifier in SNOMED CT infers numerous is-a relationships and thus is the source of many additional is-a paths between class pairs, it would be interesting to repeat this study using the stated is-a relationships, since that is where modeling errors occur (assuming a bug-free classifier).

According to the realist approach to ontology, with which the FMA is consistent, the class “Entire digestive organ” is not a universal, because the attribute of having the function of digestion may be held by entities that are not organs, such as enzymes, which are classified as “substance” in SNOMED CT. Ceusters et al. [8] refer to these classes as “defined classes”, although they are not inappropriate for inclusion in an ontology under this perspective.

This study used a convenience sample of 20 anatomical entities and their representations in two ontologies. This sample is unlikely to be representative of the FMA and SNOMED CT in general. The study did not fully achieve the objective of developing a method that uses semantic distance to identify errors in ontological modeling. Nevertheless, we were able to demonstrate that our approach identified extreme structural differences between two ontologies in the representation of particular entities. Future work includes investigating additional algorithmic processing to distinguish structural discrepancies that represent errors from those that represent differences in perspective.

References

  • 1. Bodenreider Olivier, Zhang Songmao. Comparing the representation of anatomy in the FMA and SNOMED CT. AMIA 2006 Symposium Proceedings; 2006. pp. 46–50.
  • 2. Héja G, Surján G, Varga P. Ontological analysis of SNOMED CT. BMC Med Inform Decision Making. 2008 Oct 27;8(Suppl 1):S8. doi: 10.1186/1472-6947-8-S1-S8.
  • 3. Kalet Ira J, Mejino Jose LV, Wang Vania, Whipple Mark, Brinkley James F. Content-specific auditing of a large scale anatomy ontology. Journal of Biomedical Informatics. 2009;42:540–549. doi: 10.1016/j.jbi.2009.02.006.
  • 4. Rada Roy, Mili Hafedh, Bicknell Ellen, Blettner Maria. Development and application of a metric on semantic nets. IEEE Transactions on Systems, Man, and Cybernetics. 1989;19(1):17–30.
  • 5. Leacock Claudia, Chodorow Martin. Combining local context and WordNet similarity for word sense identification. Fellbaum. 1998:265–283.
  • 6. Resnik Philip. Using information content to evaluate semantic similarity. Proceedings of the 14th International Joint Conference on Artificial Intelligence; 1995. pp. 448–453. Montreal.
  • 7. Pedersen Ted, Pakhomov Serguei VS, Patwardhan Siddharth, Chute Christopher G. Measures of semantic similarity and relatedness in the biomedical domain. Journal of Biomedical Informatics. 2007;40:288–299. doi: 10.1016/j.jbi.2006.06.004.
  • 8. Ceusters Werner, Capolupo Maria, Smith Barry, De Moor Georges. An evolutionary approach to the representation of adverse events. Studies in Health Technology & Informatics. 2009;150:537–41.
  • 9. Schulz S, Romacker M, Hahn U. Part-whole reasoning in medical ontologies revisited--introducing SEP triplets into classification-based description logics. Proc AMIA Symp. 1998:830–834.
  • 10. Zhu X, Fan J, Baorto D, Weng C, Cimino J. A review of auditing methods applied to the content of controlled biomedical terminologies. Journal of Biomedical Informatics. 2009;42:413–425. doi: 10.1016/j.jbi.2009.03.003.

Articles from AMIA Annual Symposium Proceedings are provided here courtesy of American Medical Informatics Association
