AMIA Annual Symposium Proceedings. 2015 Nov 5;2015:2111–2120.

Biological Model Development as an Opportunity to Provide Content Auditing for the Foundational Model of Anatomy Ontology

Lucy L Wang 1, Eli Grunblatt 2, Hyunggu Jung 1, Ira J Kalet 1,3, Mark E Whipple 1,4
PMCID: PMC4765672  PMID: 26958311

Abstract

Constructing a biological model using an established ontology provides a unique opportunity to perform content auditing on the ontology. We built a Markov chain model to study tumor metastasis in the regional lymphatics of patients with head and neck squamous cell carcinoma (HNSCC). The model attempts to determine regions with high likelihood for metastasis, which guides surgeons and radiation oncologists in selecting the boundaries of treatment. To achieve consistent anatomical relationships, the nodes in our model are populated using lymphatic objects extracted from the Foundational Model of Anatomy (FMA) ontology.

During this process, we discovered several classes of inconsistencies in the lymphatic representations within the FMA. We were able to use this model building opportunity to audit the entities and connections in this region of interest (ROI). We found five subclasses of errors that are computationally detectable and resolvable, one subclass of errors that is computationally detectable but unresolvable, requiring the assistance of a content expert, and also errors of content, which cannot be detected through computational means. Mathematical descriptions of detectable errors along with expert review were used to discover inconsistencies and suggest concepts for addition and removal. Out of 106 organ and organ parts in the ROI, 8 unique entities were affected, leading to the suggestion of 30 concepts for addition and 4 for removal. Out of 27 lymphatic chain instances, 23 were found to have errors, with a total of 32 concepts suggested for addition and 15 concepts for removal. These content corrections are necessary for the accurate functioning of the FMA and provide benefits for future research and educational uses.

Introduction

The Foundational Model of Anatomy (FMA) ontology1 seeks to represent knowledge within the domain of human anatomy in a symbolic way. It attempts to formalize all parts and relationships in the body, and intends to be a “resource for developing the anatomy content of applications that target specific user groups,” which includes biomedical research applications.1 A human biological model may focus on the anatomy and physiology of a particular organ system or body part, and may therefore only concern itself with a subset of elements from the FMA. Isolating and using these specific concepts from the FMA allows for better linkage and recycling of models, as the terminology of the FMA can be used to bridge gaps between the authors of these models. More importantly, perhaps, and the focus of this paper, model building also helps to enhance the content of the FMA itself. Certain weaknesses in ontological structure as well as internal inconsistencies can be brought to light through the demands of practical application.

In this paper, we focus on a specific use case as an example: modeling tumor dissemination in the lymphatics of the head and neck. HNSCC is a major form of cancer that arises in the mucosa of the upper aerodigestive tract. This particular form of cancer is known for a high propensity towards metastasis, with the spread of tumor cells occurring nearly universally through the lymphatic system.2,3 For HNSCC, standard treatment involves surgical resection and/or irradiation of part of the neck to remove the mass and lymph nodes in which metastasis is suspected. Because both surgical removal and over-irradiation of surrounding tissue have negative consequences for the patient, it is in the clinician’s interest to minimize collateral damage. Ideally, only lymphatic regions with high probabilities of microscopic disease would be treated. Determining the locations within the head and neck with high likelihood for metastasis is critical in guiding surgeons and radiation oncologists in selecting the boundaries of treatment.4–6 In this region of the body, the metastatic trends generally follow known, but not precisely defined, paths.7,8 Markov chain models have shown promise in quantifying the probabilities of metastasis to particular nodal regions9; we attempt to build such a model using the lymphatic topology represented in the FMA.

The anatomical underpinning of the model is retrieved from the FMA ontology. This allows us to establish consistency between our model and existing knowledge about lymphatic anatomy. Nodes in our Markov chain model are populated using terms and relationships derived from the relevant portions of the FMA. In most cases, the ontology provides a sufficient representation of the regional lymphatics. However, we identified several areas of insufficiency. These include failures to distinguish between instances and superclasses of lymphatic objects, missing concepts, and to a much lesser degree, incorrect concepts or connections. For many of these issues, we were able to fill in the gaps of knowledge using existing information in the FMA (e.g., by relocating mis-located information, or inferring information from existing relationships). In other cases, we contended with issues in the fundamental organization of the FMA, and a more significant discourse was necessary.

Our strategies for error detection and correction can be used to improve other parts of the FMA ontology. Some of our methods are outlined below; it is our hope that these techniques can be used to assess other parts of the FMA and bring about changes which will ultimately improve its logical underpinnings, consistency and usefulness.

Methods

We evaluate errors in the FMA ontology that can be categorized into the following classes: (1) computationally detectable and resolvable, (2) computationally detectable and unresolvable, or (3) undetectable. All errors can be addressed through the addition or removal of select concepts. Errors of class 1 and 2 can be described in logical terms and algorithmically discovered. Of these, fixes for class 1 errors can be constructed from pre-existing information in the FMA ontology. Fixes for class 2 errors cannot be automatically generated, and these errors must be addressed by content experts. Lastly, errors of class 3 can neither be detected nor fixed using computational methods. These are errors of content, where the ontology disagrees with established literature. These errors can currently only be detected through systematic validation by an anatomical expert. In the construction of our model, we were able to discover and determine fixes for all three classes of errors described above. Computational methods for automatic error detection were attempted in all cases. Once all possible detectable errors were found and fixed, an anatomical expert assessed and completed the auditing of the edited ontology.

We begin by describing the role of the FMA in the construction of our Markov chain model. Our ROI includes parts of the head and neck which constitute the mucosa of the upper aerodigestive tract, the lymphatic objects that drain these parts, and their lymphatic connections to venous circulation. We follow by describing the classes of errors that we found in this region, and how we infer concepts for addition and removal for each error class.

All anatomical parts, their names and properties were retrieved from the latest release of the FMA in the Web Ontology Language (OWL), version 3.2.1 (University of Washington Structural Informatics Group (UW-SIG)).10 SPARQL (SPARQL Protocol and RDF Query Language) queries were written and stored in the Query Integrator (UW-SIG)11 and accessed through an interface written in Clozure Common Lisp version 1.7 (Clozure Associates, Brookline, MA). All graphs of the lymphatic network were generated using Graphviz version 2.36 (AT&T Labs, Austin, TX).

Using the FMA to construct our model

The process of cancer metastasis can be represented using Markov chains, a type of memoryless stochastic model useful for describing the probability of events based on previous known states. For our Markov chain model of metastasis, each node captures both the anatomical location and the severity of disease. In cases of HNSCC, metastasis occurs along known lymphatic drainage paths. We can therefore model locations as lymphatic concepts from the FMA and severity using the cancer T-stage grading system.12

Lymphatic flow can be modeled as a unidirectional process. The lymph fluid is assumed to flow only in the direction of the two great lymphatic vessels: the thoracic duct and the right lymphatic duct, before entering venous circulation. At any point in time, all nodes downstream of the primary tumor site have some probability of hosting metastatic disease. Our model therefore must consist of all lymphatic structures reachable by a particular primary tumor. Because there are known connections between identifiable lymphatic regions, we can construct a partial map of lymphatic drainage in the head and neck. We do this by extracting the relevant anatomical concepts and connections from the FMA.
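As a minimal sketch of how a memoryless model propagates probability along a unidirectional network, consider the toy chain below. The state names and all transition probabilities are invented placeholders, not values from this study, and the T-stage component of each node is omitted. Here a state stands for the most downstream location that disease has reached, which is a simplifying assumption of the sketch.

```python
# Hypothetical states; names and probabilities are placeholders, not FMA content.
STATES = ["origin", "chain_a", "chain_b", "great_vessel"]

# TRANSITIONS[s][t] is the assumed one-step probability of moving from s to t.
# Flow is unidirectional: no state ever points back upstream.
TRANSITIONS = {
    "origin": {"origin": 0.5, "chain_a": 0.3, "chain_b": 0.2},
    "chain_a": {"chain_a": 0.6, "chain_b": 0.4},
    "chain_b": {"chain_b": 0.7, "great_vessel": 0.3},
    "great_vessel": {"great_vessel": 1.0},
}


def step(dist):
    """Advance a probability distribution over STATES by one time step."""
    nxt = {s: 0.0 for s in STATES}
    for s, p in dist.items():
        for t, q in TRANSITIONS[s].items():
            nxt[t] += p * q
    return nxt


def propagate(start, n_steps):
    """Distribution after n_steps, starting with certainty in `start`."""
    dist = {s: 0.0 for s in STATES}
    dist[start] = 1.0
    for _ in range(n_steps):
        dist = step(dist)
    return dist


if __name__ == "__main__":
    for state, prob in propagate("origin", 3).items():
        print(f"{state}: {prob:.3f}")
```

Because the next distribution depends only on the current one, the model needs just the network topology and one row of probabilities per state; the topology is what the FMA is asked to supply.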

The lymphatic network in the FMA is organized under the terms trees, trunks, branches, tributaries, chains and nodes. The right and left lymphatic trees are considered organs, and the subbranches and subtrees that form them are considered organ regional parts. Within these lymphatic trees, the trunk identifies the main stem of the tree. Further bifurcation of the stem yields branches and tributaries. These further consist of chains, whose members are nodes.13 Although this mode of organization may not be analogous to all accepted conceptualizations of the lymphatic system, it does provide an appropriate number of subdivisions which can be used in modeling. The states in our Markov chain model refer to lymphatic chains, the smallest subdivision that is well-represented in the FMA. For the purposes of this article, specific entities and relationships from the FMA will be referred to in italics, with underscores in place of spaces.

Each tumor has an identifiable origin site, which corresponds to an entity in the FMA (e.g., Tongue, Soft_palate, Floor_of_mouth). Organ entities have a property lymphatic_drainage whose values are the lymphatic objects that drain the site. A SPARQL query can be written to extract this information, returning all lymphatic objects which directly drain an organ or organ part. For example, if a tumor is located in the tongue, we expect a query to retrieve the lymphatic objects {Right_jugulodigastric_lymphatic_chain, Left_jugulodigastric_lymphatic_chain, Submental_lymphatic_chain, Basal_lingual_lymphatic_tree, Central_lingual_lymphatic_tree, Right_marginal_lingual_lymphatic_tree, Left_marginal_lingual_lymphatic_tree}. The number of results can then be reduced by increasing the detail of the query, i.e., by querying on a regional part of the tongue, such as the Apex_of_tongue or Body_of_tongue. The FMA should be able to accommodate the most specific and anatomically relevant organ part, and return only those results appropriate for that entity. As shown below, the annotation on the tongue is incomplete, and some modifications must be made to achieve these expected results.

Once we determine the direct lymphatic drainage of the tumor origin, we set out to define all possible paths of drainage. The FMA defines a relationship efferent_to for all lymphatic objects. This property points to lymphatic chains which are downstream of the chain of interest, i.e., closer to the great lymphatic vessels. Chains with multiple values under this property represent branching points in the lymphatic network. A full drainage map can be constructed by querying recursively over all efferent lymphatic chains until each one terminates in one of the two great lymphatic vessels. By combining the lymphatic objects at the tumor origin and the chains defined by the efferent_to relationship, we form a map defining all possible routes of metastasis for a particular tumor through the lymphatic network. The lymphatic objects in the final map can then be used to define nodes in the Markov chain model. Figure 1 shows the lymphatic drainage map constructed using entity Soft_palate as the seed tumor origin.
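The recursive query described above amounts to a graph traversal. The following sketch illustrates it over in-memory dictionaries that stand in for the SPARQL query results; the organ and chain names (`Organ_X`, `Chain_A`, and so on) and their connections are hypothetical placeholders, and only the two great vessel names come from the text.

```python
from collections import deque

# Toy stand-ins for query results; entities and edges are hypothetical.
LYMPHATIC_DRAINAGE = {"Organ_X": ["Chain_A", "Chain_B"]}
EFFERENT_TO = {
    "Chain_A": ["Chain_C"],
    "Chain_B": ["Chain_C", "Chain_D"],  # two values: a branching point
    "Chain_C": ["Thoracic_duct"],
    "Chain_D": ["Right_lymphatic_duct"],
}
GREAT_VESSELS = {"Thoracic_duct", "Right_lymphatic_duct"}


def drainage_map(origin):
    """Return all (source, target) edges reachable from the origin site."""
    edges = set()
    seen = set()
    queue = deque()
    for chain in LYMPHATIC_DRAINAGE.get(origin, []):
        edges.add((origin, chain))
        queue.append(chain)
    while queue:
        node = queue.popleft()
        # Stop at the great vessels and never expand a chain twice.
        if node in seen or node in GREAT_VESSELS:
            continue
        seen.add(node)
        for target in EFFERENT_TO.get(node, []):
            edges.add((node, target))
            queue.append(target)
    return edges


def dead_ends(origin):
    """Chains in the map that neither are great vessels nor have efferents."""
    targets = {t for _, t in drainage_map(origin)}
    return {n for n in targets
            if n not in GREAT_VESSELS and not EFFERENT_TO.get(n)}


if __name__ == "__main__":
    for src, dst in sorted(drainage_map("Organ_X")):
        print(f"{src} -> {dst}")
```

The sketch uses an explicit queue with a visited set rather than direct recursion, so an unexpected cycle in the efferent_to data cannot cause an infinite loop. A non-empty result from `dead_ends` indicates a path that fails to terminate in a great lymphatic vessel.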

Figure 1.


Values under the lymphatic_drainage property of the FMA object Soft_palate are queried and used to generate a full map of lymphatic flow downstream from the origin site. All unlabeled arrows between lymphatic objects represent the efferent_to relationship.

Inconsistencies identified in the FMA representation of the lymphatics

The FMA provided in large part a description of the lymphatic network which was accurate and detailed enough for our needs, such as the output presented in Figure 1. However, in many cases, the methods described above did not generate drainage paths which correctly represented lymphatic anatomy. Inconsistencies arose in the connections between lymphatic chains, as well as in the definition of the lymphatic_drainage property. Five subclasses of class 1 (detectable and resolvable) errors were discovered: (1a) organs whose direct lymphatic drainage incompletely represents the drainage of their regional parts, (1b) organs with lymphatic drainage by lymphatic chain superclasses, (1c) lymphatic objects with efferent connections to inappropriate objects, i.e., superclass objects, (1d) connections between lymphatic chain superclasses, and (1e) erroneous connections between objects in the left and right side lymphatic trees. One subclass of class 2 (detectable and unresolvable) errors was discovered, (2a) lymphatic objects with no efferent connections. Class 3 (undetectable) errors were also found through anatomical expert review following the detection and correction of the previous error subclasses.

We use the following notation to indicate specific types of objects in the ROI. Let A be the set of all lymphatic objects in the ROI. A consists of S, the set of superclass lymphatic chains, C, the set of instances of superclass lymphatic chains, and G, the set of great lymphatic vessels, defined as {Thoracic_duct, Right_lymphatic_duct}. S, C and G are mutually disjoint, and S ∪ C ∪ G = A. The FMA consists of structural entities, which are material objects in the body (e.g., the left or right lung), and abstract entities (e.g., lung). The members of S are abstract entities, whose instances, the members of C, refer to structural entities in the body. Consequently, each member s_i in S has two instances representing bilaterally placed chains, s_iR and s_iL, the right and left side chains respectively, both of which are members of C.

C then consists of L, the set of lymphatic chains in the left side of the body, R, the set of lymphatic chains in the right side of the body, and M, the set of lymphatic chain instances that are located medially. L, R, and M are also mutually disjoint and L ∪ R ∪ M = C. For any lymphatic object c_i in C, let F(c_i) represent the set of all lymphatic objects reached from c_i by a one-step efferent_to relationship, i.e., the efferent_to values of c_i.

Any organ or organ part N consists of recursive regional parts P = {p_1, p_2, …, p_n}. Each regional part p_i in P is drained by some set of lymphatic objects Q_i = {q_i1, q_i2, …, q_im}. N also has lymphatic drainage objects D = {d_1, d_2, …, d_k}, acquired by querying the lymphatic_drainage property of N.

Error constraints are expressed using set theoretic notation in Table 1 for clarity and portability. These are the foundations of queries ultimately used to extract inconsistencies from the FMA. All errors are described in greater detail below.

  • Subclass 1a An organ is necessarily drained by all the lymphatic objects which drain the organ’s regional parts. Conversely, however, the lymphatic objects that drain all of an organ’s regional parts may not represent all of the chains that drain the whole organ. For example, a query on the entity Tongue returns the lymphatic drainage object {Jugulodigastric_lymphatic_chain}, yet a query on all of the tongue’s regional parts returns {Basal_lingual_lymphatic_tree, Marginal_lingual_lymphatic_tree, Central_lingual_lymphatic_tree}. The first is not a superset of the second, violating the constraint given above.

  • Subclass 1b No organ or organ part should be drained by superclass lymphatic objects. The entity Tongue is drained by the Jugulodigastric_lymphatic_chain, a superclass object, which violates this constraint.

  • Subclass 1c No lymphatic object should have efferent connections to inappropriate objects such as superclass lymphatic chains. For any lymphatic object c_i in C, F(c_i) must contain only appropriate objects such as members of C or G, and must also not contain c_i. An example violation is the Submental_lymphatic_chain, a medial lymphatic chain, which is efferently connected to the Submandibular_lymphatic_chain, a superclass object.

  • Subclass 1d Superclass chains should not have efferent_to relationships to other objects. The FMA is inconsistent with this distinction and some superclasses are connected erroneously. As an example, the Superior_lateral_deep_cervical_lymphatic_chain has an efferent connection to the Inferior_lateral_deep_cervical_lymphatic_chain. Both of these objects are superclass lymphatics, and the connection is erroneous.

  • Subclass 1e As described above, the right and left sides of this ROI are drained by distinct lymphatic trees. Right side lymphatic chains connect to other right side chains, and similarly for the left side. The starting lymphatic object Left_subscapular_axillary_lymphatic_chain, for example, has efferent chains {Right_central_axillary_lymphatic_chain, Right_apical_axillary_lymphatic_chain}. The starting object is a member of L, yet both of its efferent objects are members of R, violating our constraint. This example is outside of our ROI, and although no subclass 1e errors were detected within the scope of this paper, we felt it important to include the class definition here for future use.

  • Subclass 2a All lymphatic objects must have efferent relationships to other lymphatic objects. Examples of lymphatic objects in the FMA with no efferent objects are the left and right submandibular lymphatic chains. Querying the efferent_to property of either returns the empty set. Although we were able to detect these errors, we could not infer corrections from existing concepts in the ontology.

  • Class 3 These errors involve incorrect information in the FMA ontology. Certain efferent connections between lymphatic objects may not agree with the prevailing literature.14–19 These content errors must be identified and corrected through expert review.

Table 1.

Descriptions of all error class constraints using set theoretic notation

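| Error subclass | For any | Constraint |
|---|---|---|
| 1a | $N$ | $\bigcup_{i \in I} Q_i \subseteq D,\ I = \{1, 2, \ldots, n\}$ |
| 1b | $p_i$ | $Q_i \cap S = \emptyset$ |
| 1c | $c_i$ | $F(c_i) \subseteq \left( (G \cup C) \cap \{c_i\}^{c} \right)$ |
| 1d | $s_i$ | $F(s_i) = \emptyset$ |
| 1e | $r_i$, $l_i$, or $m_i$ | $F(l_i) \subseteq L \cup M \cup G$; $F(r_i) \subseteq R \cup M \cup G$; $F(m_i) \subseteq L \cup R \cup M \cup G$ |
| 2a | $c_i$ | $F(c_i) \neq \emptyset$ |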
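To make the constraints concrete, the sketch below restates each one as a Python set operation and applies it to a toy fragment built from the examples quoted in the subclass descriptions. It is an illustration under stated assumptions, not the SPARQL queries used in this work: the dictionary `F` stands in for efferent_to query results, the set memberships cover only the named examples, and the 1e check is restricted to contralateral instance chains so that superclass targets are left to the 1c check (an interpretation, since the text does not spell this out).

```python
# Toy fragment mirroring examples quoted in the text; not an FMA export.
S = {"Submandibular_lymphatic_chain", "Jugulodigastric_lymphatic_chain",
     "Superior_lateral_deep_cervical_lymphatic_chain",
     "Inferior_lateral_deep_cervical_lymphatic_chain"}
L = {"Left_submandibular_lymphatic_chain",
     "Left_subscapular_axillary_lymphatic_chain"}
R = {"Right_submandibular_lymphatic_chain",
     "Right_central_axillary_lymphatic_chain",
     "Right_apical_axillary_lymphatic_chain"}
M = {"Submental_lymphatic_chain"}
G = {"Thoracic_duct", "Right_lymphatic_duct"}
C = L | R | M

# F[x] holds the efferent_to values of x, for every object that was queried.
F = {
    "Submental_lymphatic_chain": {"Submandibular_lymphatic_chain"},
    "Superior_lateral_deep_cervical_lymphatic_chain":
        {"Inferior_lateral_deep_cervical_lymphatic_chain"},
    "Left_subscapular_axillary_lymphatic_chain":
        {"Right_central_axillary_lymphatic_chain",
         "Right_apical_axillary_lymphatic_chain"},
    "Left_submandibular_lymphatic_chain": set(),
    "Right_submandibular_lymphatic_chain": set(),
}


def check_1a(D, Q_parts):
    """Objects draining regional parts but absent from the organ's own D."""
    return set().union(*Q_parts) - D


def check_1b(Q_i):
    """Superclass objects listed as drainage for an organ or organ part."""
    return Q_i & S


def check_1c(c):
    """Efferents of instance c outside (G | C) minus c itself."""
    return F[c] - ((G | C) - {c})


def check_1d(s):
    """Any efferent of a superclass chain is a violation."""
    return F.get(s, set())


def check_1e(c):
    """Contralateral instance chains among the efferents of c (assumption)."""
    if c in L:
        return F[c] & R
    if c in R:
        return F[c] & L
    return set()  # medial chains may connect to either side


def check_2a():
    """Queried instance chains whose efferent_to property is empty."""
    return {c for c in C if c in F and not F[c]}
```

Each function returns the offending objects, so an empty set means the constraint holds. For example, `check_1c("Submental_lymphatic_chain")` returns the superclass Submandibular_lymphatic_chain, matching the subclass 1c example.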

Using existing information in the FMA to correct inconsistencies

Many of these inconsistencies can be corrected systematically. Subclass 1a errors in which an organ is not correctly labeled with a superset of the lymphatic drainage objects of all of its regional parts can be addressed by adding the missing relationships to the FMA. For each organ in the ROI, we construct two queries: one to find the lymphatics that drain the organ directly, D, and one to find the union of lymphatics which drain all of the organ’s recursive regional parts, ∪_{i∈I} Q_i, I = {1, 2, …, n}, which we refer to as Q. If the result of the first query is not a superset of the result of the second query, the lymphatic objects in Q \ D should be added as lymphatic_drainage objects of the organ. This process is then applied to all organ parts and parts of parts to achieve complete annotation.
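The subclass 1a correction can be sketched as follows. The part hierarchy and drainage annotations are hypothetical placeholders standing in for the two query results; the function names are likewise illustrative only.

```python
# Hypothetical part hierarchy and drainage annotations (not FMA data).
REGIONAL_PARTS = {"Organ_X": ["Part_1", "Part_2"], "Part_1": ["Part_1a"]}
DRAINAGE = {
    "Organ_X": {"Chain_A"},
    "Part_1": {"Chain_A"},
    "Part_1a": {"Chain_B"},
    "Part_2": {"Chain_C"},
}


def recursive_parts(entity):
    """All regional parts of an entity, including parts of parts."""
    parts = []
    for part in REGIONAL_PARTS.get(entity, []):
        parts.append(part)
        parts.extend(recursive_parts(part))
    return parts


def missing_drainage(entity):
    """Q minus D: objects draining the parts but not listed on the entity."""
    D = DRAINAGE.get(entity, set())
    Q = set()
    for part in recursive_parts(entity):
        Q |= DRAINAGE.get(part, set())
    return Q - D


def suggest_1a_additions(root):
    """Apply the check to the root and to every part beneath it."""
    suggestions = {}
    for entity in [root] + recursive_parts(root):
        missing = missing_drainage(entity)
        if missing:
            suggestions[entity] = missing
    return suggestions


if __name__ == "__main__":
    print(suggest_1a_additions("Organ_X"))
```

In this toy hierarchy the organ gains two drainage objects and `Part_1` gains one, which shows why the check has to be repeated at every level rather than only at the top.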

For subclass 1b errors, we detect all organs or organ parts which are drained by superclass lymphatic objects. For each organ, we use the same query results for Q given above. Members of the set Q ∩ S should be removed as lymphatic_drainage values for the organ. Additionally, for each inappropriate drainage relationship, two new drainage concepts can be inferred: lymphatic drainage relationships between the organ and the left and right side instances of the inappropriately connected superclass object.

Likewise, subclass 1c errors can be detected by querying for efferent connections between lymphatic instances and superclass objects. For each lymphatic object c_i, concepts in the set difference F(c_i) \ ((G ∪ C) ∩ {c_i}^c) are marked for removal as efferent_to values of c_i.

We can also use the inappropriate connections in this case to infer correct but missing connections. The entity Central_lingual_lymphatic_tree, for example, is a medial lymphatic chain (member of M). Querying on its efferent_to relationship returns {Superior_lateral_deep_cervical_lymphatic_chain, Submandibular_lymphatic_chain}, both of which are superclass lymphatics (members of S). Although this violates the constraint for subclass 1c errors, the intention of these connections may still be correct. If so, we extrapolate that Central_lingual_lymphatic_tree does in fact connect to both of these efferent chains, but to the right and left instances rather than the superclasses. We can then infer the addition of four efferent connections to: {Right_superior_lateral_deep_cervical_lymphatic_chain, Left_superior_lateral_deep_cervical_lymphatic_chain, Right_submandibular_lymphatic_chain, Left_submandibular_lymphatic_chain}.
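A minimal sketch of this inference is given below. The `INSTANCES` lookup from superclass to its left and right instances is a hypothetical helper, and sidedness is read from the `Left_`/`Right_` name prefix, which is an assumption about naming rather than a property of the FMA. For chains that are themselves lateral, the sketch keeps only the ipsilateral instance; this borrows the subclass 1e constraint and is not something the text states for 1c.

```python
# Hypothetical lookup: superclass -> (left instance, right instance).
INSTANCES = {
    "Superior_lateral_deep_cervical_lymphatic_chain": (
        "Left_superior_lateral_deep_cervical_lymphatic_chain",
        "Right_superior_lateral_deep_cervical_lymphatic_chain",
    ),
    "Submandibular_lymphatic_chain": (
        "Left_submandibular_lymphatic_chain",
        "Right_submandibular_lymphatic_chain",
    ),
}


def side_of(name):
    """Infer sidedness from the name prefix (an assumption of this sketch)."""
    if name.startswith("Left_"):
        return "left"
    if name.startswith("Right_"):
        return "right"
    return "medial"


def resolve_1c(chain, efferents):
    """Return (to_remove, to_add) for the efferent_to values of one chain."""
    to_remove = {e for e in efferents if e in INSTANCES}
    to_add = set()
    side = side_of(chain)
    for superclass in to_remove:
        left, right = INSTANCES[superclass]
        if side in ("left", "medial"):
            to_add.add(left)
        if side in ("right", "medial"):
            to_add.add(right)
    # Do not re-suggest connections that already exist.
    return to_remove, to_add - efferents


if __name__ == "__main__":
    removed, added = resolve_1c(
        "Central_lingual_lymphatic_tree",
        {"Superior_lateral_deep_cervical_lymphatic_chain",
         "Submandibular_lymphatic_chain"},
    )
    print(sorted(removed))
    print(sorted(added))
```

For the medial Central_lingual_lymphatic_tree, the function marks both superclass connections for removal and proposes the four instance connections listed above. Every suggestion produced this way still needs expert confirmation, since the original superclass connection may itself have been wrong.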

Subclass 1d errors can be identified and extraneous connections between lymphatic chain superclasses removed. In cases where the left and right side connections do not exist, the erroneous superclass connection can be used to infer them. After verifying the existence of appropriate connections between the right and left instances and their respective efferent objects, the connections between the superclass lymphatic objects are marked for removal.
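The subclass 1d procedure can be sketched in the same style. The superclass pair is the example quoted in the subclass 1d description; the instance names for the inferior chain, the `INSTANCES` lookup, and the contents of `F` (where only the right side connection is assumed to exist already) are hypothetical and serve only to exercise the logic.

```python
# Hypothetical lookup from superclass to its side-specific instances.
INSTANCES = {
    "Superior_lateral_deep_cervical_lymphatic_chain": {
        "left": "Left_superior_lateral_deep_cervical_lymphatic_chain",
        "right": "Right_superior_lateral_deep_cervical_lymphatic_chain",
    },
    "Inferior_lateral_deep_cervical_lymphatic_chain": {
        "left": "Left_inferior_lateral_deep_cervical_lymphatic_chain",
        "right": "Right_inferior_lateral_deep_cervical_lymphatic_chain",
    },
}

# Assumed efferent_to values for the instances (toy data).
F = {
    "Right_superior_lateral_deep_cervical_lymphatic_chain":
        {"Right_inferior_lateral_deep_cervical_lymphatic_chain"},
    "Left_superior_lateral_deep_cervical_lymphatic_chain": set(),
}


def resolve_1d(sup_src, sup_dst):
    """For a superclass edge, return (instance edges to add, edge to remove)."""
    to_add = set()
    for side in ("left", "right"):
        src = INSTANCES[sup_src][side]
        dst = INSTANCES[sup_dst][side]
        # Only suggest instance connections that do not already exist.
        if dst not in F.get(src, set()):
            to_add.add((src, dst))
    return to_add, (sup_src, sup_dst)


if __name__ == "__main__":
    additions, removal = resolve_1d(
        "Superior_lateral_deep_cervical_lymphatic_chain",
        "Inferior_lateral_deep_cervical_lymphatic_chain",
    )
    print(additions)
    print(removal)
```

With this toy data only the left side connection is suggested for addition, and the superclass connection is returned for removal once the instance connections are in place.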

Subclass 1e errors can be corrected by changing inappropriate efferent connections to point to the ipsilateral lymphatic chain instance. We identify all lymphatic objects that connect efferently to lymphatic objects on the contralateral side. These connections are marked for deletion. If the connection to the analogous ipsilateral lymph chain does not already exist, a new efferent connection is added.
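The subclass 1e correction can be sketched as a renaming step. The sketch assumes that sidedness is encoded in a `Left_`/`Right_` name prefix and that the analogous ipsilateral chain exists under the correspondingly renamed identifier; both are assumptions that would need to be checked against the ontology. The great lymphatic vessels are excluded explicitly, because Right_lymphatic_duct would otherwise be mistaken for a right side chain.

```python
GREAT_VESSELS = {"Thoracic_duct", "Right_lymphatic_duct"}


def fix_1e(chain, efferents):
    """Return (to_delete, to_add) for contralateral efferents of one chain."""
    if chain.startswith("Left_"):
        own, other = "Left_", "Right_"
    elif chain.startswith("Right_"):
        own, other = "Right_", "Left_"
    else:
        return set(), set()  # medial chains may connect to either side
    to_delete = {
        e for e in efferents
        if e.startswith(other) and e not in GREAT_VESSELS
    }
    # Swap the side prefix to name the analogous ipsilateral chain.
    renamed = {own + e[len(other):] for e in to_delete}
    return to_delete, renamed - efferents


if __name__ == "__main__":
    deleted, added = fix_1e(
        "Left_subscapular_axillary_lymphatic_chain",
        {"Right_central_axillary_lymphatic_chain",
         "Right_apical_axillary_lymphatic_chain"},
    )
    print(sorted(deleted))
    print(sorted(added))
```

Applied to the Left_subscapular_axillary_lymphatic_chain example, both right side connections are marked for deletion and their left side counterparts are proposed in their place.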

Subclass 2a errors cannot be addressed without input from a content expert. On occasion, the FMA harbors incorrect connections to superclasses (subclass 1c or 1d errors) that can be used to infer missing connections. In most cases, though, it is impossible to conclude from existing information in the FMA how disjoint lymphatic objects should connect to the lymphatic network. This information must instead be found in the literature and propagated back into the FMA following review. Some examples of disconnected instances in the FMA are the right and left submandibular and the right and left paratracheal lymphatic chains. An existing connection in the FMA between the superclass Submandibular_lymphatic_chain and the Jugulo-omohyoid_lymphatic_chain indicates that this is potentially the correct connection for the left and right instances. For the paratracheal chains, however, there is no available superclass connection from which to draw similar conclusions, and an anatomist must be consulted.

Diagrammatic representations of these error subclasses are given in Figure 2. Most errors could be detected and corrected computationally. Those errors that could not (class 3) were detected and corrected by a content expert.

Figure 2.


Illustrations of error subclasses. Edges marked with part indicate a regional part relationship, l.d. indicates lymphatic drainage, and all unmarked edges refer to efferent_to relationships between lymphatic objects. Dotted lines refer to missing connections which should exist; lines that are crossed out refer to erroneous connections that should be removed.

Results

Within the upper aerodigestive tract, we extracted 106 distinct organs and organ regional parts from the FMA. There were 27 distinct lymphatic chain instances (right, left or medial) and 11 distinct superclasses of lymphatic objects in the ROI. Table 2 gives the number of inconsistencies detected for each error subclass, as well as the numbers of concepts suggested for addition and removal. These suggestions were generated first by programmatic rule evaluation and then confirmed with an expert anatomist. Counts in the table labeled with * are based solely on expert review, and represent circumstances in which the FMA differs from anatomical literature.14–19 Some suggestion counts may disagree with the numbers implied in the Methods section; this is due either to repetition in suggested concepts or to removal of suggestions based on expert review.

Table 2.

Error counts for each class of error in the ROI.

| Error class | Affects | Objects affected | Total objects | Concepts to add | Concepts to remove |
|---|---|---|---|---|---|
| 1a | organ (parts) | 6 | 106 | 24 | |
| 1b | organ (parts) | 4 | 106 | 6 | 4 |
| 1c | lymphatic instances | 3 | 27 | 6 | 6 |
| 1d | lymphatic superclasses | 7 | 11 | 8 | 9 |
| 1e | lymphatic instances | 0 | 27 | 0 | 0 |
| 2a | lymphatic instances | 12 | 27 | 10* | |
| 3 | lymphatic instances | 7* | 27 | 8* | 4* |
| All | all entities | 37 | 144 | 62 | 23 |

Prior to this round of content auditing, many lymphatics within the ROI were disconnected from the rest of the system (12 out of 27) or were connected erroneously to superclass entities (3 out of 27). Following automated inference and auditing, the number of disconnected entities drops to 4 out of 27, with no entities connected to superclass objects. Further concepts were added or removed by an anatomist, resolving class 3 errors, resulting in a final lymphatic map with no disconnected lymphatic instances. The structure and interconnectivity of the lymphatic ROI is shown in Appendix A. The network is shown in its original form (preceding any auditing) and at the completion of our review. Only right-side and medial lymphatic instances are shown for clarity. Final suggested concepts are in the process of being propagated back into the FMA.

Discussion

The FMA has been open to both changes to its underlying structure as well as content auditing. Over the past several years, it has undergone a dramatic change from a Frames-based system to an OWL representation.20,21 This change did not resolve many content issues, but increased the flexibility of the ontology, allowing for easier content auditing. Several attempts have been made in this regard, all of which have led to incremental improvement. In 2009, Gu et al. detected potential incorrect relationships by studying the implicit relationship between the is-a statement and structural relations such as part-of. An object A which is part of another object B, for example, may not also be an instance of B. Entities found to violate the implicit logic of these relationships were vetted by an expert anatomist and removed or corrected in the FMA.22 In the same year, Kalet et al. specifically suggested the auditing of the FMA’s representation of the lymphatic system.23 A study in 2012 expanded on some of these approaches by specifying and detecting “graph motifs,” small sets of relationships known to be problematic. SPARQL queries were used to fetch fragments of the FMA matching these motifs, and these problem concepts were then validated or changed in the FMA.24 Most recently, Luo et al. used the assumption of structural self-bisimilarity to detect potential anomalies in the FMA. They contend that certain types of relationships are expected to be symmetric on the two sides of the body, and differences in bilateral connectivity may be a strong indicator of error.25

The techniques used in this paper share similarities with some of those mentioned above. One major difference, though, is that the current auditing happened in the context of model creation. The deficiencies found needed to be addressed to allow us to continue to use the FMA as a tool. Additions and removals of suggested concepts were immediately vetted not only by an anatomical expert, but also through the rigor of modeling use and testing. The corrected portion of the ontology is therefore less likely to contain significant further errors.

Overall, we believe that the structure, organization and intention of the FMA are a suitable foundation upon which to build our Markov chain model of cancer metastasis. However, the prior state of the FMA was unable to support our modeling needs. In our attempts to use the FMA to populate nodes in our model, we were able to systematically review relevant parts of the anatomical representation, specifically the lymphatic system in the mucosa of the upper aerodigestive tract. The classes of errors identified through this process generated a constraint set which may apply broadly to other parts of the lymphatic tree. It is our hope to extrapolate these auditing techniques to the rest of the lymphatics and other analogous systems such as arterial-venous circulation and the peripheral nerves.

Limitations

The methods described above have only been used to audit a small section of the FMA ontology. It is yet unknown whether they are applicable to the rest of the lymphatic system or other analogous systems as represented in the FMA.

For example, in the upper body, relationships between lymphatics generally show bilateral symmetry. In the abdomen and lower body, however, lymphatics drain only into the thoracic duct, and this property no longer holds. Also, the organs are asymmetric within the abdominal cavity, so there are no clear superclass and instance relationships. This more complex topology existing in the rest of the body will require not only a reevaluation of the error classes we have defined in this paper, but will likely introduce opportunities for new error definitions. Likewise, applying such methods to the arterial-venous circulatory system or the peripheral nervous system may yield an analogous but different set of constraints.

This paper focused primarily on classes of errors that can be defined and identified computationally. Some errors, although detectable, cannot be automatically corrected. These content errors benefit from automated content auditing by being discovered, but we have no way of proceeding without review by a content expert. We touched on this briefly when discussing subclass 2a errors. For these errors, we cannot infer the correct efferent connections to a disconnected lymphatic object; deductive reasoning will not suffice. We attempt to infer some possible solutions for subclass 2a errors using other pieces of information that already exist in the FMA, such as erroneous connections between superclasses. However, these inferences may not hold in all cases. Often, we cannot infer any new knowledge at all. This limitation is foreseen and expected. After all, we attempt only to detect connections which are potentially erroneous, and not to automatically generate novel content.

Likewise, errors of content, which are not logical in nature and cannot be expressed in mathematical formulae, are extremely difficult to find in an ontology the size of the FMA. Although we were able to discover and fix some of these content errors by hand, the process was laborious and would not scale well to larger subsets of the FMA. Our techniques may therefore face additional challenges when applied broadly to the FMA ontology.

Additionally, we were unable to audit certain other classes of content errors within the scope of this paper. For example, an organ may be fully annotated while none of its parts are. This is of concern when information is missing at our specified resolution, e.g., when we want to know the lymphatic drainage of an organ part, but there is no drainage information at that level in the FMA. We have yet to determine a cheap or systematic way to discover these types of ontological errors.

Conclusions

Many models of biological processes are fundamentally rooted in anatomy. In order to be “anatomically correct,” a model should satisfy the relational and spatial constraints imposed by human anatomy. For example, the human heart has a set of properties, such as having two atria, two ventricles and four valves, and would not be considered a proper heart if these statements were untrue, or if the relative locations of these objects were incorrect. Likewise, in building a functional model of a heart, the modeler would also want to include all of the above constituent parts and their relationships. Otherwise, there is a high probability that the model is inaccurate. Although this is an overly simple example, one could conceive of assessing the accuracy of a model by judging the correctness of its represented anatomy. A mismatch between a model and the known anatomical worldview would not necessarily invalidate the model, but would indicate potential points of improvement.
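As a rough illustration of judging a model by its represented anatomy, the sketch below compares the parts present in a model against the expected counts from the heart example above. The function name and the flat list representation of a model are assumptions made for this example, and spatial relationships are not checked.

```python
from collections import Counter
from typing import Dict, List, Tuple

# Expected part counts from the heart example in the text.
EXPECTED_HEART_PARTS: Dict[str, int] = {"atrium": 2, "ventricle": 2, "valve": 4}


def part_count_mismatches(
    model_parts: List[str], expected: Dict[str, int]
) -> Dict[str, Tuple[int, int]]:
    """Map each part kind to (found, expected) wherever the two counts differ."""
    counts = Counter(model_parts)
    return {
        kind: (counts.get(kind, 0), n)
        for kind, n in expected.items()
        if counts.get(kind, 0) != n
    }


# Hypothetical model that omits one valve.
model = ["atrium", "atrium", "ventricle", "ventricle", "valve", "valve", "valve"]

mismatches = part_count_mismatches(model, EXPECTED_HEART_PARTS)
print(mismatches)
```

A non-empty result would not by itself invalidate the model; it marks places where the model and the reference anatomy disagree and may deserve a closer look.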

An ideal anatomical ontology would be able to fulfill these needs. The FMA, in its conception, attempts to represent human anatomy correctly and faithfully. The current state of the FMA has certain insufficiencies that increase barriers to its use, but many of these issues can be addressed through systematic content auditing. Perhaps the best way to reach a correct anatomical description is to selectively use, validate, and correct subsets of the FMA ontology. Making edits in such a piecemeal way may not seem like the most efficient course of action, but what it lacks in breadth it makes up for in practicality.

Acknowledgments

Research reported in this publication was supported by the National Library of Medicine of the National Institutes of Health under Award Number R21-LM012075. This study was supported in part by the National Library of Medicine (NLM) Training Grant T15LM007442. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health. Many thanks to Onard Mejino and the UW Structural Informatics Group for providing insight into the content and organization of the FMA.

Appendix A

Connected graphs of the right side lymphatic network: the original extracted from FMA (top), and the final audited version (bottom).

References

  • 1. Foundational Model of Anatomy [Internet] Seattle, WA: University of Washington Structural Informatics Group; [cited 2014 Dec 10]. Available from: http://sig.biostr.washington.edu/projects/fm/AboutFM.html.
  • 2. Woolgar JA. Histological distribution of cervical lymph node metastases from intraoral/oropharyngeal squamous cell carcinoma. Br J Oral Maxillofac Surg. 1999;37(3):175–180. doi: 10.1054/bjom.1999.0036.
  • 3. Mukherji SK, Armao D, Joshi VM. Cervical nodal metastases in squamous cell carcinoma of the head and neck: what to expect. Head Neck. 2001;23(11):995–1005. doi: 10.1002/hed.1144.
  • 4. Hewitt M, Simone JV. Enhancing data systems to improve the quality of cancer care. Washington: National Academy Press; 2000.
  • 5. Nagl S. Cancer bioinformatics: from therapy design to treatment. London: Wiley Publishers; 2006.
  • 6. Preziosi L. Cancer modeling and simulation. New York: Chapman and Hall CRC; 2003.
  • 7. Candela FC, Kothari K, Shah JP. Patterns of cervical node metastases from squamous carcinoma of the oropharynx and hypopharynx. Head Neck. 1990 May-Jun;12(3):197–203. doi: 10.1002/hed.2880120302.
  • 8. Shah JP, Candela FC, Poddar AK. The patterns of cervical lymph node metastases from squamous carcinoma of the oral cavity. Cancer. 1990 Jul 1;66(1):109–113. doi: 10.1002/1097-0142(19900701)66:1<109::aid-cncr2820660120>3.0.co;2-a.
  • 9. Benson N, Whipple M, Kalet IJ. A Markov model approach to predicting regional tumor spread in the lymphatic system of the head and neck. AMIA Annu Symp Proc. 2006:31–35.
  • 10. FMA in OWL [Internet] University of Washington; Structural Informatics Group:
  • 11. Query Integrator [Internet] Seattle, WA: University of Washington Structural Informatics Group; [cited 2015 Jan 31]. Available from: http://www.si.washington.edu/projects/qi.
  • 12. Cancer Staging [Internet] Bethesda, MD: National Cancer Institute at the National Institutes of Health; [cited 2015 Feb 25]. Available from: http://www.cancer.gov/cancertopics/factsheet/detection/staging.
  • 13. Rosse C, Mejino JL. A reference ontology for bioinformatics: the foundational model of anatomy. J Biomed Inform. 2003 Dec;36:478–500. doi: 10.1016/j.jbi.2003.11.007.
  • 14. Fisch U. Lymphography of the Cervical Lymphatic System. London: W.B. Saunders Co; 1968.
  • 15. Mayerson HS. Lymph and the Lymphatic System. Springfield: Thomas Books; 1965.
  • 16. Rusznyak I. Lymphatics and Lymph Circulation. New York: Pergamon Press; 1967.
  • 17. Som PM, Curtin HD, Mancuso AA. An image based classification for the cervical nodes designed as an adjunct to recent clinically based nodal classification. Arch Otolaryngol Head Neck Surg. 1999;125(4):388–396. doi: 10.1001/archotol.125.4.388.
  • 18. Robbins KT. Integrating radiological criteria into the classification of cervical lymph node disease. Arch Otolaryngol Head Neck Surg. 1999;125(4):385–387. doi: 10.1001/archotol.125.4.385.
  • 19. Teng C, Shapiro LG, Kalet IJ. Head and neck lymph node region delineation with image registration. Biomed Eng Online. 2010;9(30):1–21. doi: 10.1186/1475-925X-9-30.
  • 20. Golbreich C, Zhang S, Bodenreider O. The foundational model of anatomy in OWL: experiences and perspectives. Web Seman. 2006;4(3):181–195. doi: 10.1016/j.websem.2006.05.007.
  • 21. Noy NF, Rubin DL. Translating the foundational model of anatomy into OWL. Web Seman. 2008;6(2):133–136. doi: 10.1016/j.websem.2007.12.001.
  • 22. Gu HH, Wei D, Mejino JL, Elhanan G. Relationship auditing of the FMA ontology. J Biomed Inform. 2009 Jun;42(3):550–557. doi: 10.1016/j.jbi.2009.01.001.
  • 23. Kalet IJ, Mejino JL, Wang V, Whipple M, Brinkley JF. Content-specific auditing of a large scale anatomy ontology. J Biomed Inform. 2009 Jun;42(3):540–549. doi: 10.1016/j.jbi.2009.02.006.
  • 24. Zhang GQ, Luo L, Ogbuji C, Joslyn C, Mejino J, Sahoo SS. An analysis of multi-type relational interactions in FMA using graph motifs with disjointness constraints. AMIA Annu Symp Proc. 2012:1060–1069.
  • 25. Luo L, Mejino JL, Zhang GQ. An analysis of FMA using structural self-bisimilarity. J Biomed Inform. 2013 Jun;46(3):497–505. doi: 10.1016/j.jbi.2013.03.005.

Articles from AMIA Annual Symposium Proceedings are provided here courtesy of American Medical Informatics Association
