AMIA Annual Symposium Proceedings. 2014 Nov 14;2014:496–505.

Evaluation of need for ontologies to manage domain content for the Reportable Conditions Knowledge Management System

Karen L Eilbeck 1, Julie Lipstein 2, Sunanda McGarvey 3, Catherine J Staes 1
PMCID: PMC4419892  PMID: 25954354

Abstract

The Reportable Condition Knowledge Management System (RCKMS) is envisioned to be a single, comprehensive, authoritative, real-time portal to author, view and access computable information about reportable conditions. The system is designed for use by hospitals, laboratories, health information exchanges, and providers to meet public health reporting requirements. The RCKMS Knowledge Representation Workgroup was tasked to explore the need for ontologies to support RCKMS functionality. The workgroup reviewed relevant projects and defined criteria to evaluate candidate knowledge domain areas for ontology development. The use of ontologies is justified for this project to unify the semantics used to describe similar reportable events and concepts between different jurisdictions and over time, to aid data integration, and to manage large, unwieldy datasets that evolve, and are sometimes externally managed.

Introduction

Public health surveillance, investigation and intervention are essential for the prevention and control of communicable and non-communicable diseases. Toward that end, every jurisdiction in the U.S. publishes a list of “reportable conditions” that functions as a communication tool between public health entities and reporting entities. When a reportable condition is identified, reporters (e.g., hospitals, laboratories) are required to report the event to public health authorities (1–5). Timely and accurate reporting is critical to identify, investigate, and control public health threats.

Accurate, timely and complete reporting of reportable conditions depends on reporters having correct and current information about a) the list of conditions that are reportable for the jurisdiction where they or their patients or clients are located, b) criteria that make a condition reportable, c) how quickly a report should be sent, d) what information should be included in a report, e) the preferred method to send a report, and f) where the report should be sent. This is referred to as the “who, what, when, where, and how” of reporting. Automated detection and electronic reporting depend on having this information available in machine-processable form to be stored as rules within a clinical decision support system, and used by an EHR or LIMS. Currently, this information is scattered across many different websites and documents(2). It is difficult for human users to navigate, and the information is not usually presented in a manner suitable for machine-processing. The Reportable Condition Knowledge Management System (RCKMS) is positioned to address this challenge by providing a single entry point to manage and access reportable condition specifications.
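To illustrate what “machine-processable form” could mean for elements (a) through (f), the following Python sketch stores them in a single record. It is illustrative only: the class name, field names, and sample values are assumptions made for this example and do not describe the actual RCKMS schema.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ReportingSpecification:
    """Hypothetical record for one condition in one jurisdiction."""
    jurisdiction: str           # (a) jurisdiction where the condition is reportable
    condition: str              # (a) the reportable condition
    criteria: List[str]         # (b) criteria that make the condition reportable
    timeframe_hours: int        # (c) how quickly a report should be sent
    required_fields: List[str]  # (d) information to include in the report
    preferred_method: str       # (e) preferred method to send the report
    recipient: str              # (f) where the report should be sent


def is_complete(spec: ReportingSpecification) -> bool:
    """Return True when every element needed for automated reporting is present."""
    return all([
        spec.jurisdiction, spec.condition, spec.criteria,
        spec.timeframe_hours > 0, spec.required_fields,
        spec.preferred_method, spec.recipient,
    ])


# Placeholder values for illustration; not actual jurisdictional requirements.
example_spec = ReportingSpecification(
    jurisdiction="Example State",
    condition="pertussis",
    criteria=["positive culture for the target organism"],
    timeframe_hours=24,
    required_fields=["patient name", "date of birth", "test result"],
    preferred_method="electronic laboratory report",
    recipient="Example State Health Department",
)
```

A record of this kind could be checked with `is_complete(example_spec)` before it is translated into rules; a specification that lacks any of the six elements cannot drive automated reporting.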

The RCKMS project is managed using a phased approach and is supported by the Centers for Disease Control and Prevention (CDC). The current phase (2014–2015) targets an authoring framework and administration screens that will capture reporting specifications, transform them to executable rules, and store them in an open source clinical decision support tool. Prior work included a view/query interface, accompanying printable reports, a user login/profile management interface, and subscription management capability. Jurisdictional reporting specifications were collected from six jurisdictions for three conditions, and default (or base) content was defined for 13 conditions, along with recommendations for resolving underlying issues in the Council of State and Territorial Epidemiologists (CSTE) Position Statements(6) that limit their use in defining computable criteria.

The RCKMS project promotes adherence to standards and reuse of existing authoritative resources. In keeping with this philosophy, the team participated in the Office of the National Coordinator’s (ONC) Standards and Interoperability (S&I) Health eDecisions (HeD) pilot as an artifact provider. For the HeD pilot, an output file of reporting specifications for pertussis was produced that was both human-readable and machine-processable. Project members participate in other S&I initiatives related to the reporting lifecycle (e.g., Public Health Reporting Initiative, Data Access Framework, and Structured Data Capture), and the project will continue to assess adoption options as standards evolve. Finally, RCKMS will use, not replace, existing value set repositories managed by the CDC (e.g., PHIN VADS(7)) or the National Library of Medicine (e.g., VSAC(8)).

Objective

Given the breadth and complexity of the data needed to automate the reporting process on a national level, the RCKMS Knowledge Representation Workgroup was tasked to a) explore the need for ontologies to support RCKMS functionality, and b) develop recommendations for ontology development if indicated.

Methods

The RCKMS Knowledge Representation workgroup convened weekly to bi-monthly by phone and web conference from April through September 2013. The workgroup included members from various backgrounds representing those a) who may author content or query for and use the system output; b) who have developed knowledge management systems or components that may address parts of the overall problem; and finally, c) ontology developers from outside of the reportable condition domain.

The workgroup represented a diverse set of members from within and beyond the reportable condition community, enabling a wide variety of viewpoints to be expressed. The processes and resources currently involved with public health reporting, available tools, other exploratory projects, and the prototype RCKMS were presented to the workgroup. Meetings were recorded for those who missed them, and consensus was gathered iteratively through discussion and email. The resulting artifacts provide a comprehensive survey into the kinds of knowledge management needed to automate much of the currently manual processes.

Specifically, the workgroup participated in the following tasks:

  1. Evaluate Content: The workgroup reviewed a variety of topics to understand the scope of content required and available for RCKMS, including the existing pilot application, other ontology and knowledge representation projects, related national initiatives, existing terminology applications, and research that may contribute toward solving the defined problems. The workgroup surveyed the domain of public health reporting and described and evaluated the status of resources necessary for the RCKMS project.

  2. Define user stories and identify required data resource linkages: The workgroup reviewed user stories submitted for the following stakeholder groups: epidemiologists at all levels of public health, knowledge curators or terminologists, clinicians, researchers, electronic health record vendors, laboratorians, and public health officials. The user stories were grouped according to the area of RCKMS affected. Next, the workgroup identified the linkages between resources that must be represented to meet the requirements observed in the user stories.

  3. Assess candidate knowledge domains for ontology development: The following questions were developed by the workgroup and used to evaluate whether ontologies were needed to manage one or more of the sub-domains of information in RCKMS. Sub-domains were identified as candidates for ontological representation when the answers to the questions below were affirmative.

    • How complex are the relationships between the data?

      ○ How many different types of relationships exist in the area under consideration?

      ○ Is there a standard discernible structure that exists between the data?

        ▪ Is there an inheritance of qualities between pieces of data?

        ▪ Are there conflated ideas in the data that should be separated?

    • Is there a logical structure to the data?

    • Are users interested in questions that can more easily be answered if the data relationships are ontology-based?

    • Does the data change over time?

      ○ Are new concepts added that need to be classified?

      ○ Is the structure of the data rearranged over time?

    • Do we need to standardize the way more than one group describes or uses the data to share a common understanding?

    • Are there concerns about the quantity and maintenance of the data?

      ○ Is there sufficient data to warrant the time, effort and cost to implement an ontology?

      ○ Is maintenance of the data a manual burden or bottleneck?

      ○ Is the speed of response to change an issue?

    • Does reasoning need to be done against the data structure?

      ○ Is there a need to enable tooling to validate rules?

  4. Develop recommendations: Recommendations to advance the integration of ontologies into the RCKMS project were drafted based on input from the group, and refined iteratively over a course of meetings and comments submitted by workgroup members.

Results

Evaluate content

The workgroup identified domains of knowledge necessary for the RCKMS project and existing external coded resources for coded concepts represented in RCKMS. For each domain of knowledge, we described existing data sources, problems and recommendations for use and incorporation, and the relevance for RCKMS.

The workgroup identified the following three knowledge resources for RCKMS, representing data that needs to be managed for the project:

  • State reporting rules: Reporting rules are available in HTML or PDF documents on state/county/city/tribal websites. While this information is authoritative, it is not computable. In addition, under the current paradigm, reporters are not notified of changes so it is time consuming and inefficient to remain current. Within RCKMS, there will be an authoring interface that will allow jurisdictions to manage their reporting specifications and make this information available in both human-readable and machine-processable formats. RCKMS will also support notification of changes to be sent to reporters who subscribe to receive changes for conditions or by jurisdictions.

  • Reporting logic for national surveillance: The logic is available in Position Statements published as PDF documents on the CSTE website(6). While this information can be used as a default to encourage uniformity, the reporting logic in the Position Statements is not uniformly adopted across jurisdictions. A detailed CSTE report concerning the content is available from the Public Health Data Standards Consortium(9). Within RCKMS, criteria described in the Position Statements can be used as base content that can be copied and directly used by a jurisdiction, or modified to address jurisdiction-specific requirements.

  • Nationally notifiable conditions: Once conditions are reported to a state or other jurisdictional public health agency, case definitions are used to determine which reports should be forwarded from the ‘local’ public health agency to the CDC for national surveillance. The logic for defining a confirmed or probable case (i.e. case definitions) for each condition tracked on a national level is available in HTML on the CDC website(10). In addition, the coded concepts for these nationally notifiable events and other selected relevant value sets of coded concepts are available on the PHIN VADS website managed by the CDC(7). While the nationally notifiable conditions are similar to reportable events, they do not include all conditions and criteria for reporting in a given jurisdiction. In addition, the conditions change over time, but the reasons for change and relationship between current and previous conditions are not captured.

The workgroup identified concepts in RCKMS that can be represented using coded values from existing external resources. For example, the following major domains of concepts, resources for coded values and key requirements were identified by the workgroup:

  • Clinical condition (the name of a disease or health condition under surveillance)

    ○ Coded values: Subsets from SNOMED-CT®/ICD-9 CM/ICD-10 CM

    ○ Requirement: Allow jurisdictional reportable conditions to link to clinical condition.

  • Lab test names (described as test names and often test methods)

    ○ Coded values: LOINC®

    ○ Requirement: Support default value sets for lab test names, and allow jurisdictions to update these to meet jurisdiction-specific needs.

  • Lab test results (specifically for results that are organisms or positive or negative)

    ○ Coded values: SNOMED-CT® for organisms

    ○ Coded values: SNOMED-CT® or HL7/PHIN VADS codes for positive/negative value sets

    ○ Requirement: Support default value sets for lab test results, and allow jurisdictions to update these to meet jurisdiction-specific needs.

  • Clinical observations

    ○ Coded values: LOINC®

    ○ Requirement: Align with the new agreement between LOINC® and SNOMED®(11)

  • Clinical values/findings

    ○ Coded values: SNOMED-CT® concepts for nominal and ordinal values

    ○ Requirement: Support default value sets for clinical findings, and allow jurisdictions to update these to meet jurisdiction-specific needs.

  • Jurisdiction context (Name of jurisdiction and designation of geographic coverage)

    ○ Coded values: Federal Information Processing Standard Code (FIPS)

    ○ Requirement: Flow of reporting within jurisdictions must be represented to assist reporters in identifying to whom a report should be sent. For example, if a report should be sent to a local health department, the reporter must be able to determine which local health department, and whether that is based on the residence of the patient, the site of care delivery, or the location of the servicing laboratory.

Define User Stories and identify required data resource linkages

The workgroup reviewed user stories provided by the following stakeholder groups: epidemiologists working at state or local health departments, CDC epidemiologists, knowledge curators/terminologists, reporters, researchers, EHR vendors, laboratorians, and public health officials. We grouped user stories according to the area of RCKMS that may be affected by the user story. Each set of stories includes a variety of stakeholders but similar data structure needs.

  1. User stories addressing relationships between reportable events by jurisdiction: Story example: “I am the epidemiologist for the communicable diseases branch of my health department and I would like to see the reporting specifications for all conditions in our neighboring jurisdictions to compare against ours. In order to compare the criteria for a given reportable topic, there needs to be a way to link similar reportable events with different names or different criteria for the topic.” Resource example: Show the criteria for reporting pertussis-related events from a lab or clinical setting for the following spatial contexts: Utah and Colorado. The system would need to know that pertussis and whooping cough were the same.

  2. User stories that require understanding about the relationship between semantic changes in reportable events. Story example: “I am an epidemiologist defining a new or updating a reportable event and need to decide which subset of a condition should be made reportable. To improve consistency across jurisdictions, I want to search through the RCKMS to see related conditions from other jurisdictions and be able to view hierarchical relationships across related conditions.” Resource example: In Utah, the condition ‘Hantavirus infection and pulmonary syndrome’ was retired and replaced by ‘Hantavirus pulmonary syndrome’. In Colorado however, the change has yet to be made. What is the relationship between the two terms and the nationally notifiable condition used for national surveillance?

  3. User stories related to linking national surveillance to reportable events. Story example: “I am the CDC epidemiologist with the National Notifiable Disease Surveillance System (NNDSS) and I need to know if states are collecting information, in other words does the state include relevant reportable events in their jurisdiction to assert that a given state is reporting Nationally Notifiable Conditions (NNCs) for their jurisdiction. Currently, this knowledge is manually derived from CSTE’s State Reportable Condition Assessment (SRCA), which is retrospective and difficult to interpret.” Resource example: Each year, CDC’s NNDSS needs to know if each nationally notifiable condition is or is not reportable in each jurisdiction. This requires knowledge of each reportable condition in each jurisdiction and to understand whether or not it should map to a specific nationally notifiable condition for the surveillance year.

  4. User story that combines semantic changes over time and national surveillance. Story example: I am a researcher reviewing notifiable condition reports for Shiga toxin-producing Escherichia coli over the last 10 years. I want to take into account changes in naming and classification at the national level as well as differences in how jurisdictions report the disease so my data is inclusive of all the reports that are applicable. Resource example: Three notifiable conditions (Enterohemorrhagic Escherichia coli (EHEC) shiga toxin+ (serogroup non-O157), Enterohemorrhagic Escherichia coli (EHEC) O157:H7, and Enterohemorrhagic Escherichia coli (EHEC) shiga toxin+ (not serogrouped)) were retired in 2006 and replaced by the supertype Shiga toxin-producing Escherichia coli (STEC).

  5. User stories related to criteria and mappings to code systems. Story example: “I am a terminologist and I need to select the LOINC® codes that meet the epidemiologist’s description of criteria using the organism, method and specimen source. How do I find all the relevant codes, and how do I reselect the codes as new versions of LOINC® are released.” Resource example: What lab test codes should be included in the value sets corresponding to the selection logic for tuberculosis where the epidemiologist has specified a target organism, laboratory method and specimen types to be used in the logic?

  6. User story that extends #5 and combines with earlier user stories. Story example: “I am a terminologist, and a reportable condition is being updated into two reportable conditions. I need a report to show all vocabulary that was referenced by the condition so I can determine how the references may need to be updated based on the split (or combination, or reclassification).” Resource example: Ehrlichiosis, human monocytic (HME) is being replaced by two new events: Ehrlichia ewingii and Ehrlichia chaffeensis.
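Several of these user stories come down to two kinds of links between reportable events: synonymy (pertussis and whooping cough) and replacement over time (the retired EHEC events and STEC). The Python sketch below shows one minimal way such links could be traversed to answer a query. The dictionary layout and function names are assumptions made for illustration; only the event names are taken from the resource examples above.

```python
from typing import Dict, List, Set

STEC = "Shiga toxin-producing Escherichia coli (STEC)"

# Retired event -> events that replaced it (from the resource examples above).
REPLACED_BY: Dict[str, List[str]] = {
    "Enterohemorrhagic Escherichia coli (EHEC) shiga toxin+ (serogroup non-O157)": [STEC],
    "Enterohemorrhagic Escherichia coli (EHEC) O157:H7": [STEC],
    "Enterohemorrhagic Escherichia coli (EHEC) shiga toxin+ (not serogrouped)": [STEC],
    "Ehrlichiosis, human monocytic (HME)": ["Ehrlichia ewingii", "Ehrlichia chaffeensis"],
}

# Alternate name (lower case) -> preferred name.
SYNONYMS: Dict[str, str] = {"whooping cough": "pertussis"}


def canonical_name(name: str) -> str:
    """Map an alternate name to its preferred name; other names pass through."""
    return SYNONYMS.get(name.lower(), name)


def predecessors(event: str) -> Set[str]:
    """Collect every retired event replaced, directly or indirectly, by `event`."""
    found: Set[str] = set()
    frontier = [event]
    while frontier:
        current = frontier.pop()
        for retired, successors in REPLACED_BY.items():
            if current in successors and retired not in found:
                found.add(retired)
                frontier.append(retired)
    return found


def events_for_query(event: str) -> Set[str]:
    """Return the event plus all of its predecessors, for queries that span years."""
    return {event} | predecessors(event)
```

With these links in place, the researcher in user story 4 could call `events_for_query(STEC)` to obtain the current event together with the three retired EHEC events, so that a ten-year query does not miss reports filed under older names.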

The workgroup then reviewed an existing concept map previously created by one of the authors (CJS) that shows many of the concepts represented in reporting requirements (Figure 1). The concepts were extracted from the websites and PDFs described above, and were organized to be illustrative. The major concepts of interest are shaded. The concept map was useful for illustrating subdomains of content and some of the linkages required between concepts. The workgroup articulated several relationships between concepts, some of which are shown in the additional blue boxes (Figure 1).

Figure 1.

Map of selected concepts needed to represent reporting requirements and identify candidate areas for use of ontology to manage RCKMS knowledge (Key concepts shaded. Newly-identified relationships in boxes).

There are existing resources such as SNOMED-CT that link diseases and causative agents, but most of the linkages required for clinical and laboratory reporters to know “what”, “when”, “where” and “how” to report to jurisdictional public health authorities do not exist in a computable format sufficient to meet the user stories. For example, there are no existing resources to support queries based on a user’s need to see reportable events related to a diagnosis (“show me all the influenza reportable events, such as the criteria for pediatric deaths or influenza hospitalizations”), or jurisdiction (“show me what is required to be reported in New York City and New Jersey”). In addition, while a Reportable Condition Mapping Table was published in 2011 that lists LOINC® and SNOMED-CT concepts for Electronic Laboratory Reporting, the mappings were created through a manual, labor-intensive process that requires manual updates as source data is updated, and the Mapping Table does not represent state reporting requirements(12).

Assess candidate knowledge domains

Using the criteria defined in the methods, the workgroup evaluated three subdomains of information within RCKMS. Ontologies were considered as tools to manage the following knowledge: reportable events, jurisdiction relationships, and reporting logic. The ontologies would be used for both standardizing the authoring of content and for querying RCKMS data. The findings are summarized in Table 2. The assessment illustrates a strong need for the use of ontologies to manage information about reporting logic and reportable events. There is not strong evidence for a need to manage jurisdiction information in an ontology. The hierarchical structure of jurisdictional information can be managed using mapping tables.

Table 2.

Evaluation of candidate domain areas for ontology development based on criteria for ontology use.

Sub-domains of RCKMS knowledge evaluated: Reportable event; Jurisdiction; Reporting logic (particularly laboratory-based logic)

1. How complex are the relationships in the data?
  • Reportable event: Complex relationships between reportable events, conditions and reporting criteria. Also, need to track semantic changes over time.
  • Jurisdiction: Simple relationships
  • Reporting logic: Complex relationships between selection criteria and value sets.

1(b). Is there an inheritance of qualities between the data?
  • Reportable event: Within a single jurisdiction, events tend to be mutually exclusive. When attempting to aggregate across jurisdictions or over time, inheritance becomes an issue
  • Jurisdiction: Cities/Counties inherit state-based rules and sometimes include additional reporting rules of their own.
  • Reporting logic: Lab tests have inherent hierarchical structure.

2. Is there a logical structure to the data?
  • Reportable event: Yes, events are about conditions, and have defining criteria. Events evolve over time and have relations to previous events. Jurisdictional events can be related to national events.
  • Jurisdiction: Yes. Spatial structure and reporting flow structure.
  • Reporting logic: Yes

3. Are users interested in questions that can more easily be answered if data relationships are ontology-based?
  • Reportable event: Yes - see user stories 1, 2, 3, 4, 6.
  • Jurisdiction: No user story described a need.
  • Reporting logic: Yes - see user story 5.

4. Does the data change over time?
  • Reportable event: Yes, jurisdictional and national reportable events change yearly (or more frequently).
  • Jurisdiction: Stable
  • Reporting logic: Yes, updates to reporting criteria occur yearly or more frequently. LOINC® files are updated every 6 months

5. Are there multiple user groups, and need for a common understanding?
  • Reportable event: Yes. Authoring and reporting users. Also, national and jurisdictional reportable event developers.
  • Jurisdiction: Yes. Authoring and reporting users.
  • Reporting logic: Yes. Authoring and reporting users. Multiple other uses for value set creation.

6. Is maintenance of the data a manual burden and creating a bottleneck?
  • Reportable event: Yes, Large manual burden. Currently, the State Reportable Condition Assessment(13) provides some capacity, but only retroactively with several year delay. No logical relationships used to manage tracking of event development, or mapping between national and jurisdictional events
  • Jurisdiction: No. Small burden
  • Reporting logic: Yes. Large manual burden. Updates to selection criteria or underlying data sources (i.e., LOINC®) result in large manual undertaking to produce new value sets.

7. Does reasoning need to be performed to utilize the knowledge?
  • Reportable event: Yes, traversal of terms in a well-developed ontology could aid in surveillance questions.
  • Jurisdiction: Reasoning may be valuable. Reasoning can be performed over existing spatial relations (e.g., “What events should be reported for X city in Y county in Z state?”)
  • Reporting logic: Yes, tests can be described logically and reasoned into hierarchical structure for management of value sets.
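The finding that jurisdictional hierarchy can be handled with mapping tables can be illustrated with a short Python sketch. It answers the question “What events should be reported for X city in Y county in Z state?” by walking a parent table and accumulating inherited rules. The table layout, function name, and event names are placeholders assumed for this example, not RCKMS content.

```python
from typing import Dict, Optional, Set

# Mapping table: jurisdiction -> parent jurisdiction (None at the top level).
PARENT: Dict[str, Optional[str]] = {
    "Z state": None,
    "Y county": "Z state",
    "X city": "Y county",
}

# Events each jurisdiction defines itself (placeholder event names).
LOCAL_EVENTS: Dict[str, Set[str]] = {
    "Z state": {"event A", "event B"},
    "Y county": set(),
    "X city": {"event C"},
}


def reportable_events(jurisdiction: str) -> Set[str]:
    """Combine a jurisdiction's own events with those inherited from its parents."""
    events: Set[str] = set()
    current: Optional[str] = jurisdiction
    while current is not None:
        events |= LOCAL_EVENTS.get(current, set())
        current = PARENT[current]
    return events
```

Here `reportable_events("X city")` combines the city's own event with the two state-level events. A simple lookup of this kind covers the inheritance described for cities and counties without requiring an ontology, which is consistent with the workgroup's assessment.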

Recommendations and Discussion

The workgroup evaluated the knowledge needed and the resources currently available to effectively address the user stories and necessary linkages between resources. Two areas of the reportable condition domain were agreed upon by the workgroup as candidates for ontology development. The undertaking of ontology development for a large project like RCKMS is not to be taken lightly and the workgroup made recommendations regarding implementation and ontology management.

Recommendations for domains to be managed using ontologies

The following areas of RCKMS knowledge were recommended to be managed using ontologies.

  • Reportable Events: An ontology of reportable events would allow RCKMS to meet the surveillance and querying requirements. The concepts to be managed by this ontology are the events, the relationships between active and retired events, the relationship between national and state-defined events and the criteria associated with given events. A reportable event ontology is a complex undertaking that involves management of concepts from the national and jurisdictional level. This is of high priority. The surveillance use cases require RCKMS to be able to track conditions over time, but the events that capture the conditions change. The querying use cases need to compare what is currently reported with what was previously reported, and to do so across changes at the state/local jurisdiction level, as well as at the national level. Therefore, the ontology would need to relate the reportable events to their criteria and to each other. A key component of this reportable event ontology will be defining the relationships that exist between terms, such as “replaced by”, that allow us to understand how the usage of a condition has evolved.

  • Reporting Logic: An ontology to manage reporting logic, particularly to select value sets using LOINC®, would provide consistency when authoring RCKMS content. The base content for this development may come from information already gathered by CSTE, including Position Statements and their associated technical implementation guides, and information gathered from the State Reportable Condition Assessment (SRCA). The technical implementation guides were a one-time effort and have not been updated since 2008 to reflect updates that may have occurred to CSTE Position Statements for Notifiable Conditions. After the initial population of terms, there will be a phase of jurisdictional curation. The content will need periodic updating as the referenced datasets (e.g., LOINC®) are updated, and as reporting logic sets are updated by each jurisdiction. These updates will be jurisdiction specific. Previous work(14–16) successfully demonstrated the use of ontologies to hierarchically query the LOINC® database, and highlighted issues that must be resolved: 1. LOINC® cannot effectively be queried hierarchically in its present form. 2. The component axis of LOINC® is not suited for epidemiological queries. 3. There are data restriction issues relating to public access of the hierarchical structure in LOINC®.
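The maintenance burden described for reporting logic arises because value sets must be re-derived whenever the selection criteria or the underlying code system change. The Python sketch below illustrates that re-selection step with a flat attribute filter. It is a simplification and not the hierarchical, ontology-based querying discussed above; the record layout, the `TEST-n` identifiers, and the catalog contents are hypothetical and are not real LOINC® codes.

```python
from dataclasses import dataclass
from typing import FrozenSet, List


@dataclass(frozen=True)
class LabTest:
    code: str      # placeholder identifier, NOT a real LOINC code
    organism: str
    method: str
    specimen: str


@dataclass(frozen=True)
class SelectionCriteria:
    organism: str
    methods: FrozenSet[str]
    specimens: FrozenSet[str]


def select_value_set(tests: List[LabTest], criteria: SelectionCriteria) -> List[str]:
    """Return the sorted codes of all tests matching organism, method and specimen."""
    return sorted(
        t.code for t in tests
        if t.organism == criteria.organism
        and t.method in criteria.methods
        and t.specimen in criteria.specimens
    )


# Two hypothetical releases of a test catalog; the second adds one test.
catalog_v1 = [
    LabTest("TEST-1", "Mycobacterium tuberculosis", "culture", "sputum"),
    LabTest("TEST-2", "Mycobacterium tuberculosis", "culture", "urine"),
    LabTest("TEST-3", "Bordetella pertussis", "culture", "nasopharynx"),
]
catalog_v2 = catalog_v1 + [
    LabTest("TEST-4", "Mycobacterium tuberculosis", "nucleic acid amplification", "sputum"),
]

# Criteria as an epidemiologist might state them (illustrative values only).
tb_criteria = SelectionCriteria(
    organism="Mycobacterium tuberculosis",
    methods=frozenset({"culture", "nucleic acid amplification"}),
    specimens=frozenset({"sputum"}),
)

value_set_v1 = select_value_set(catalog_v1, tb_criteria)
value_set_v2 = select_value_set(catalog_v2, tb_criteria)
newly_added = sorted(set(value_set_v2) - set(value_set_v1))
```

Because the criteria are stored separately from the resulting codes, rerunning the selection against a new release shows which codes were added (`newly_added`), which is the step that is currently performed by hand.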

Recommendations for ontology management

  • Allocate adequate resources to support the stages of ontology development. The first stage, exploration of necessity, has been completed by this workgroup. The use of ontologies is justified for this project to unify the semantics used to describe similar concepts between different groups, to aid data integration, to allow logical inference over data and to manage large, unwieldy datasets. Ontology development needs a team approach: it is a specialist task that requires domain knowledge as well as an understanding of ontology technology, the existing data infrastructure, and the semantics of the data. Creation of ontologies will be undertaken by knowledge curators with the assistance of technologists. The knowledge curators will be domain experts in the field of public health who have some training in ontology development. The technologists will be familiar with ontology development and application, and be trained in the use of an ontology management system. They will also be familiar with databases and the existing data. They will utilize input from a variety of sources to drive development (e.g., spreadsheets, textual information). To ensure that the ontologies accurately represent the semantics of the domain, a process of oversight and curation will be utilized. Testing and deployment tasks will be undertaken by the technologists and IT team. Maintenance will be vital to continue to support the community after development and deployment. This will involve updates coming from external data sources such as LOINC® and updates from public health data (national and jurisdictional). Maintenance should be less labor-intensive, but longer running, than the initial development.

  • Use a hybrid approach to ontology development for the RCKMS. There are three high level strategies for ontology development: top-down, bottom-up and hybrid; each approach has pros and cons and is suited to a different entry point in the process of elucidating and codifying ontologies. Top-down ontology development starts with defining and then extending top level core concepts, from very general terms to very specific ones. This approach forces the developer to identify the key foundational concepts and their relationships from the beginning, and often leads to a comprehensive ontological model of the data. Existing foundational ontologies may be used (for example, Basic Formal Ontology, BFO(17)) to form the basis of the domain ontology. This method is good for consensus building among multiple groups. Bottom-up approaches to ontology building use pre-existing data sources as the starting point of development and take the reverse approach, developing terms from specific to general. This is well suited for data integration projects, but often produces an incomplete model of the domain. The hybrid approach is pragmatic, in that it uses both of the previous approaches to come to a consensus view. It is suitable for projects where there is need to harmonize the semantics used by the community and help provide conversion to heterogeneous datasets. The problems being addressed by RCKMS are wide, and there are elements that are best suited to both top-down and bottom-up development. There are many existing resources used in this project, such as each jurisdiction’s reportable event criteria, that may best be handled with a bottom-up approach. It would, however, be of great benefit to this community to develop a comprehensive domain ontology that is rooted in a foundational ontology. There is also evidence that using a foundational ontology does not slow development and improves interoperability and quality (18).

The ontologies developed for this project have the potential to provide a valuable resource to other investigators in the wider biomedical community. To enable and promote reuse, it is recommended that an upper ontology be used to place the terms developed into a wider context. An upper ontology such as BFO provides very high level terms upon which more specific ontologies can be developed. For example, there is a division between continuants and occurrents, and between dependent and independent continuants. This upper level specification of the world forces the specific ontology developer to frame the new terms with the organizing principles in mind. The agreement imposed by the upper level ontology promotes reuse and interoperability of ontology terms. Providing consistency in the definitions of relationships used in the ontologies enables inference and reasoning over the ontologies. The Open Biomedical Ontology group’s Relations Ontology (19) provides a starting place to obtain and then extend relations for the ontologies built for RCKMS. Another option to promote the reuse of the ontologies developed here would be to host them on the National Center for Biomedical Ontology’s BioPortal website (20).

Recommendations for ontology development

The workgroup made specific recommendations regarding the management of ontology development. Ontology development should be transparent, open source and versioned, preferably with a version control system such as SVN, and hosted on a collaborative site such as Google Code.

The project should provide a system to capture change requests to the ontologies. It is envisioned that there will be two mechanisms for development and change to the ontologies. The first is batch sets of changes arising from updates to other datasets (e.g., new terms in LOINC®, yearly updates to selection logic by a jurisdiction). The second arises when a developer working in a given knowledge area discovers the need for a new concept. These term requests should be tracked and managed in a way that is transparent and traceable. Many ontology projects use a term request tracker to document this process.
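The kind of record a term request tracker might keep can be sketched as follows. This is a minimal Python sketch under stated assumptions: the class name, field names, status values and allowed transitions are all hypothetical and do not describe any existing tracker or the RCKMS design. The `origin` field distinguishes the two mechanisms described above, and the `history` list provides the traceability.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

# Hypothetical workflow: which status changes are permitted.
ALLOWED = {
    "open": {"accepted", "rejected"},
    "accepted": {"released"},
    "rejected": set(),
    "released": set(),
}


@dataclass
class TermRequest:
    request_id: int
    label: str       # proposed term label
    origin: str      # "batch" (external dataset update) or "developer"
    rationale: str
    status: str = "open"
    # Each entry records (old status, new status, who made the change).
    history: List[Tuple[str, str, str]] = field(default_factory=list)

    def transition(self, new_status: str, who: str) -> None:
        """Move the request to a new status and log the change."""
        if new_status not in ALLOWED[self.status]:
            raise ValueError(f"cannot go from {self.status} to {new_status}")
        self.history.append((self.status, new_status, who))
        self.status = new_status


request = TermRequest(1, "example new concept", "developer", "needed for authoring")
request.transition("accepted", "content manager")
request.transition("released", "content manager")
print(request.status, request.history)
```

Keeping the full history on each request, rather than only the current status, is what allows a reviewer to reconstruct later who approved a term and when it entered a release.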

Where applicable, existing ontologies and knowledge representations should be used. The necessary terms can be imported into the RCKMS ontologies, and strategies need to be in place for updating imported terms when their source changes. Tools such as OntoFox (21) exist for Web Ontology Language (OWL) ontologies and allow this process to be managed.
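The idea of importing only the necessary terms can be illustrated with a minimal Python sketch. It is not OntoFox's interface: the function name, the source ontology content and the ontology name `EXAMPLE_ONTO` are hypothetical. The sketch copies each requested term together with its chain of parents and records where each term came from, so that imported terms can later be compared against a newer release of the source.

```python
# Hypothetical external ontology: term -> parent (None marks the root).
SOURCE = {
    "organism": None,
    "bacterium": "organism",
    "Bordetella": "bacterium",
    "Bordetella pertussis": "Bordetella",
    "virus": "organism",
}


def extract_with_ancestors(source, wanted, source_name):
    """Copy the wanted terms plus their ancestors, tagging provenance."""
    imported = {}
    for term in wanted:
        if term not in source:
            raise KeyError(f"{term!r} not found in {source_name}")
        # Walk upward until the root or an already-imported term is reached.
        while term is not None and term not in imported:
            parent = source[term]
            imported[term] = {"parent": parent, "imported_from": source_name}
            term = parent
    return imported


subset = extract_with_ancestors(SOURCE, ["Bordetella pertussis"], "EXAMPLE_ONTO")
print(sorted(subset))
```

Only the branch leading to the requested term is copied; unrelated terms such as "virus" stay in the source ontology.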

The knowledge captured by the RCKMS project has the potential to be vast and complex. Large ontologies are not uncommon in the biomedical domain. Bada et al. evaluated the Gene Ontology for elements of success and concluded that clear goals, simple intuitive structure, early use, community engagement, and continuous evolution were the key factors (22). The following questions have been addressed to find solutions that will work for the particular requirements of RCKMS:

  • Clear goals: One large all-encompassing ontology or several small ontologies? The search for a practical solution to the knowledge management issues presented to this workgroup led to the delineation of two areas for focused ontology development within the scope of RCKMS. They are:
    1. An ontology about reportable events
    2. An ontology to query LOINC® and develop value sets for selection logic

These two areas are sufficiently distinct from each other to warrant separate ontologies, developed by expert content leaders.

  • Intuitive structure: Separate ontology and data. There are several examples in biomedicine of large ontology projects where the data is captured separately from the ontology, the Gene Ontology Consortium (23) being a good example. The ontology development is managed independently from the annotation of the data with ontology terms. In this way, a few developers maintain control of the content of the ontology while annotators manage the data. This is a good model for the RCKMS, where a small number of terminologists or content managers will be responsible for the terminology, but a larger group of interested parties will use the ontology, for example in the authoring use cases. This proposed work must capture the knowledge necessary to automate the processing of public health data. This knowledge falls into several domain areas. While it is possible to incorporate ontologies into systems, and to reason over and query them on the fly, it is equally feasible to manage the knowledge separately in an ontology and export the reasoned statements from the ontology to the system. Separating the ontology from the system has the benefits of easier maintenance, less software complexity and quicker response time when queried, and would offer a quicker path to operationalization within the current architecture.

  • Community engagement. Key stakeholders (in the downstream use of the knowledge) must be involved in the development of the ontologies.

  • Early use. RCKMS should provide ontologies and annotations to the public health community for use as early as possible. The ontologies and systems developed should be promoted and disseminated to the wider community, while providing a forum for feedback. There must be a visible, accessible website for RCKMS documentation, including links to the ontology resources and documentation to allow community use and feedback. The ontology should be released in formats digestible by current tools.
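The separation of ontology and data recommended above, with reasoned statements exported to the system rather than computed on the fly, can be sketched in a few lines of Python. Everything in the sketch is an assumption for illustration: the term names, the placeholder codes (which are not real LOINC® codes), and the function names. The ontology and the annotations are held in separate structures; the inferred concept-to-code pairs are computed once and written to a flat table that a downstream system could query without a reasoner.

```python
import csv
import io

# Hypothetical ontology, maintained by a few developers: child -> parents.
PARENTS = {
    "chlamydia test": ["laboratory test"],
    "chlamydia culture": ["chlamydia test"],
    "chlamydia NAAT": ["chlamydia test"],
    "pertussis PCR": ["laboratory test"],
}

# Hypothetical annotations, maintained separately: placeholder code -> term.
ANNOTATIONS = {
    "CODE-001": "chlamydia culture",
    "CODE-002": "chlamydia NAAT",
    "CODE-003": "pertussis PCR",
}


def all_ancestors(term):
    """Collect every term reachable upward from `term`."""
    seen = set()
    stack = list(PARENTS.get(term, []))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(PARENTS.get(current, []))
    return seen


def export_rows():
    """Pair each code with its annotated term and all inferred ancestors."""
    rows = []
    for code, term in sorted(ANNOTATIONS.items()):
        for concept in sorted(all_ancestors(term) | {term}):
            rows.append((concept, code))
    return rows


def to_csv(rows):
    """Serialize the exported statements as a flat two-column table."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["concept", "code"])
    writer.writerows(rows)
    return buffer.getvalue()


def value_set(rows, concept):
    """Look up codes for a concept using only the exported table."""
    return sorted(code for c, code in rows if c == concept)


rows = export_rows()
print(value_set(rows, "chlamydia test"))
```

The lookup in `value_set` touches only the exported rows, which is the source of the simpler software and quicker response time noted in the text; the cost is that the export must be regenerated whenever the ontology or the annotations change.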

Conclusion

Currently, the management of knowledge surrounding reportable conditions is a manual task, both for those providing the rules and for those interpreting them. The tasks of authoring are manually driven, as base content for conditions is not maintained electronically. Surveillance of conditions over multiple jurisdictions is an onerous manual task, as there is no connection between the national reportable event and what is reported locally. Reporting has a large manual component, as the jurisdictional rules must be interpreted and implemented locally, requiring each reporter, for example, to decide which positive LOINC® tests to report for a given condition. Given that the content of external resources and reporting rules changes periodically, the amount of manual mapping and updating is considerable. The RCKMS aims to provide efficient, automated solutions to these problems.

Complex information is required to support RCKMS and provide computable, viewable and usable information for reporters to know what, where, when and how to report to public health. Ontologies have been applied successfully to manage the knowledge of other large domains within the biomedical informatics community, and the workgroup concluded that they would be an appropriate tool to manage reportable condition knowledge. Ontologies are indicated for two specific domains of content: selection logic criteria and reportable events. The undertaking of ontology development for a large project like RCKMS is not to be taken lightly, and ontology development should be transparent, open source, and versioned, using strategies that leverage existing ontologies and allow reuse of the knowledge developed for the RCKMS effort.

Acknowledgments

The RCKMS project is funded by the CDC, Office of Public Health Scientific Services (OPHSS). We would like to thank members of the RCKMS Knowledge Representation Workgroup for voluntary participation in weekly discussions during 2013. Members include Sundak Ganesan, Anna Orlova, Austin Kreisler, Nikolay Lipskiy, Arun Srinivasan, Jeff Kriseman, Cecil Lynch, Scott Keller, Jerry Sable, Sheila Abner, Ted Klein, Ruth Ann Jajosky, Mary Hamilton, and Heather Patrick. In memory of Cynthia Vinion, who contributed use cases to our analysis.

References

  • 1. Chorba TL, Berkelman RL, Safford SK, Gibbs NP, Hull HF. Mandatory Reporting of Infectious Diseases by Clinicians. JAMA. 1989;262(21):3018–26.
  • 2. M’ikanatha NM, Welliver DP, Rohn DD, et al. Use of the Web by State and Territorial Health Departments to Promote Reporting of Infectious Disease. JAMA. 2004;291(9):1069–71. doi: 10.1001/jama.291.9.1069.
  • 3. Freund E, Seligman PJ, Chorba TL, Safford SK, Drachman JG, Hull HF. Mandatory Reporting of Occupational Diseases by Clinicians. JAMA. 1989;262(21):3041–4.
  • 4. Roush S, Birkhead G, Koo D, Cobb A, Fleming D. Mandatory Reporting of Diseases and Conditions by Healthcare Professionals and Laboratories. JAMA. 1999;282(2):164–70. doi: 10.1001/jama.282.2.164.
  • 5. Jajosky R, Rey A, Park M, Aranas A, Macdonald S, Ferland L. Findings from the Council of State and Territorial Epidemiologists’ 2008 Assessment of State Reportable and Nationally Notifiable Conditions in the United States and Considerations for the Future. J Public Health Manag Pract. 2011;17(3):255–64. doi: 10.1097/phh.0b013e318200f8da.
  • 6. Position Statements [Internet] 2014. Available from: http://www.cste.org/?page=PositionStatements.
  • 7. CDC. PHIN Vocabulary Access and Distribution System. Atlanta: CDC; 2011 [cited 2014 June 7]. Available from: https://phinvads.cdc.gov/vads/SearchVocab.action.
  • 8. NLM. NLM Value Set Authority Center (VSAC). 2014 [cited 2014 July 27]. Available from: https://vsac.nlm.nih.gov.
  • 9. Lipskiy L, Orlova A, Klein T, Huang M, Huang G, Minami M, et al. Assure Health IT Standards for Public Health: Enable Electronic Detection of Reportable Conditions Through Improved Codification of Public Health Reporting Criteria. Public Health Data Standards Consortium; 2014.
  • 10. CDC. National Notifiable Diseases Surveillance System (NNDSS). 2014 [cited 2014]. Available from: http://wwwn.cdc.gov/nndss/
  • 11. IHTSDO. SNOMED CT and LOINC to be linked by cooperative work. 2013. Available from: http://www.ihtsdo.org/about-ihtsdo/governance-and-advisory/harmonization/loinc.
  • 12. CDC. Reportable Condition Mapping Table (RCMT): Another step toward standardizing electronic laboratory reporting (ELR). 2014 [cited 2014 July 27]. Available from: http://www.cdc.gov/EHRmeaningfuluse/rcmt.html.
  • 13. Council of State and Territorial Epidemiologists. State Reportable Conditions Website. 2011 [cited 2014 June 5]. Available from: http://www.cste2.org/izenda/entrypage.aspx.
  • 14. Eilbeck K, Jacobs J, McGarvey S, Vinion C, Staes C, editors. Exploring the use of ontologies and automated reasoning to manage selection of reportable condition lab tests from LOINC®; International Conference for Biomedical Ontology; 2013; Montreal.
  • 15. Eilbeck K, Jacobs J, Staes CJ, editors. Optimize Querying of LOINC with an Ontology: Give Me the Chlamydia Tests the Epidemiologists Want Me to Use!; 46th Hawaii International Conference on System Sciences (HICSS); 7–10 Jan. 2013; Hawaii.
  • 16. Adamusiak T, Bodenreider O. Quality assurance in LOINC using Description Logic. AMIA Annual Symposium Proceedings. 2012;2012:1099–108.
  • 17. Grenon P, Smith B, Goldberg L. Biodynamic Ontology: Applying BFO in the Biomedical Domain. In: Pisanelli DM, editor. Ontologies in Medicine. Amsterdam: IOS Press; 2004. pp. 20–38.
  • 18. Keet M. The Use of Foundational Ontologies in Ontology Development: An Empirical Assessment. Lecture Notes in Computer Science. 2011;6643:321–35.
  • 19. Smith B, Ceusters W, Klagges B, Kohler J, Kumar A, Lomax J, et al. Relations in biomedical ontologies. Genome Biology. 2005;6(5):R46. doi: 10.1186/gb-2005-6-5-r46.
  • 20. Whetzel PL. NCBO Technology: Powering semantically aware applications. Journal of Biomedical Semantics. 2013;4(Suppl 1):S8. doi: 10.1186/2041-1480-4-S1-S8.
  • 21. Xiang Z, Courtot M, Brinkman RR, Ruttenberg A, He Y. OntoFox: web-based support for ontology reuse. BMC Research Notes. 2010;3:175. doi: 10.1186/1756-0500-3-175.
  • 22. Bada M, Stevens R, Goble C, Gil Y, Ashburner M, Blake JA, et al. A short study on the success of the Gene Ontology. Journal of Web Semantics. 2004;1(2)
  • 23. Ashburner M, Ball CA, Blake JA, Botstein D, Butler H, Cherry JM, et al. Gene ontology: tool for the unification of biology. The Gene Ontology Consortium. Nature Genetics. 2000;25(1):25–9. doi: 10.1038/75556.

Articles from AMIA Annual Symposium Proceedings are provided here courtesy of American Medical Informatics Association
