Abstract
Globalization of food networks increases opportunities for the spread of foodborne pathogens beyond borders and jurisdictions. High-resolution whole-genome sequencing (WGS) subtyping of pathogens promises to vastly improve our ability to track and control foodborne disease, but to do so it must be combined with epidemiological, clinical, laboratory and other health care data (called “contextual data”) to be meaningfully interpreted for regulatory and health interventions, outbreak investigation, and risk assessment. However, current multi-jurisdictional pathogen surveillance and investigation efforts are complicated by time-consuming data re-entry, curation and integration of contextual information owing to a lack of interoperable standards and inconsistent reporting. A solution to these challenges is the use of ‘ontologies’ - hierarchies of well-defined and standardized vocabularies interconnected by logical relationships. Terms are specified by universal IDs enabling integration into highly regulated areas and multi-sector sharing (e.g., food and water microbiology with the veterinary sector). Institution-specific terms can be mapped to a given standard at different levels of granularity, maximizing comparability of contextual information according to jurisdictional policies. Fit-for-purpose ontologies provide contextual information with the auditability required for food safety laboratory accreditation. Our research efforts include the development of a Genomic Epidemiology Ontology (GenEpiO) and a Food Ontology (FoodOn) that harmonize important laboratory, clinical and epidemiological data fields, as well as existing food resources. These efforts are supported by a global consortium of researchers and stakeholders. Since foodborne diseases do not respect international borders, uptake of such vocabularies will be crucial for multi-jurisdictional interpretation of WGS results and data sharing.
Keywords: genomic epidemiology, foodborne pathogen surveillance, outbreak investigations, ontology, contextual metadata
Introduction: The Importance of Metadata and Contextual Information in Foodborne Safety and Surveillance
Foodborne pathogens impact global health and can cost economies millions of dollars in lost productivity (Flynn, 2014; Minor et al., 2015; World Health Organization, 2015). “Integrated surveillance” combines data from different stages of the farm-to-fork food continuum to provide multi-sector information for infectious disease surveillance, and represents the most comprehensive strategy to improve food safety (Zaidi et al., 2008; Ammon and Makela, 2010; Danan et al., 2011). Central to public health microbiology, food safety, and disease surveillance activities, is the comparison of genetic relatedness between isolates from human, food, and environmental samples. Whole genome sequencing (WGS) provides the highest resolution evidence for inferring phylogenetic relationships among foodborne pathogens (Ashton et al., 2016; Kanagarajah et al., 2017; Waldram et al., 2017). However, genomic sequences can only be consistently interpreted for food safety and surveillance when the data are linked to standardized, fit-for-purpose contextual information suitable for use by data analysts, data consumers, and stakeholders (Lambert et al., 2017).
Contextual information in genomic epidemiology investigations includes critical knowledge about sequencing pipelines and sequence quality, sources of exposure and risk, clinical phenotypes, susceptible populations, geographical distribution and more. Reliable capture of parameters pertaining to sample provenance (specimen types and sources), sample processing (DNA extraction and sequencing library construction), quality control (sequence quality and contamination detection), and data analysis (bioinformatic pipelines) is critical for reproducibility, comparability, and calibration of genomic results (Kircher et al., 2011; Paszkiewicz et al., 2014; Lynch et al., 2016). In addition to sequencing and bioinformatics parameters, laboratory test results characterizing antimicrobial resistance and virulence phenotypes often reveal important pathogen determinants that help to inform source and risk (World Health Organization, 2008; Clark et al., 2016; Glasset et al., 2016; Sharma et al., 2016; Day et al., 2017; Kanengoni et al., 2017; Tagini et al., 2017). Clinical information about the host and epidemiological information about possible exposures (high-risk food types) are both useful to establish at-risk populations and hypothesize about likely sources of contamination (World Health Organization, 2008). This information is also used to establish the distribution of pathogenic strains across geographic regions and among populations, which is critical for determining transmission patterns (Moura et al., 2016; Njamkepo et al., 2016). Rich contextual information increases the utility of genomics data used for food safety surveillance, outbreak investigations, source attribution and risk assessments.
Risk analysis in particular requires precise data on pathogen hazards in food to be systematically linked to epidemiological data, in order to make assessments, implement interventions and monitor outcomes (Lammerding and Fazil, 2000; Hoornstra et al., 2001; Food and Agriculture Organization of the United Nations [FAO], 2005).
Unfortunately, the resource demands of collecting such information, inconsistencies in descriptors, and other political and technical barriers complicate data sharing and integration between agencies. Wide adoption of contextual information best practices, as well as storage and sharing practices, would enable rapid, on-demand comparison of sequences from different sources and agencies, enhancing pathogen detection, inter-agency communication and responses. Here, we describe these various challenges and explain how informatics innovations such as ontologies can provide much needed solutions to streamline data interpretation and exchange for improved food safety and public health.
Barriers to Integration and Sharing of Whole Genome Sequence Data and Contextual Information
Despite a growing global commitment to the use and sharing of public health microbiology data, implementation at local, regional, national, and international levels has proven challenging with both political and technological barriers (van Panhuis et al., 2014). Fundamental structural barriers embedded in public health governance systems arise as the result of lack of trust (Pisani and AbouZahr, 2010; Fidler and Gostin, 2011; van Panhuis et al., 2014). Perceptions of risk to patient privacy and intellectual property, as well as the fear of misinterpretation and potential misuse of data are some of the biggest challenges to the sharing of sequence data and the exchange of contextual information (van Panhuis et al., 2014). Risk aversion practices prompt health agencies to implement blanket policies restricting data sharing, which result in incomplete metadata attached to sequences in public data repositories (van Panhuis et al., 2014).
Technological barriers for electronic data interchange exacerbate issues of political distrust (van Panhuis et al., 2014). Contextual data are mostly expressed as free text or agency-specific terminology. While reports and guidelines exist in an effort to suggest minimum contextual information that should be attached to genomic sequences, these fields are rarely incorporated into Lab Information Management Systems (LIMS) and epidemiology surveillance forms (Field et al., 2014; Grad and Lipsitch, 2014; Aziz et al., 2015; McMahon and Denaxas, 2016; Lambert et al., 2017). Through user interviews and needs assessments, we and others have found that information is then “siloed” in different hard drives, agencies, in restrictive data formats (paper or antiquated electronic formats), and is often collected for short-term purposes (van Panhuis et al., 2014). Owing to such inconsistency, recoding of the data is often needed for data sharing across institutions participating in multi-jurisdictional surveillance, impacting response time. By relying on retrospective retrieval from different sources (as opposed to real-time collection), the quality and quantity of contextual information become eroded over time. Flow of contextual information from source to end user, as well as barriers to collection and sharing are illustrated in Figure 1.
Existing Resources for Metadata Standardization and Food Safety: From Checklists to Ontologies
One of the biggest challenges to the standardization of metadata capture for food safety is the large number of incompatible food classifications used worldwide. These classifications include lists of food types, descriptors of food production environments, codes of practice, guidelines, and other recommendations relating to foods, food production, and food safety. While these resources are certainly useful, they have been developed for specific uses, and fundamental differences in their architecture limit interoperability. A selection of such food dictionaries can be found in Table 1. For example, analysis of foodborne outbreak data for source attribution requires the categorization of reported food vehicles. Variation in the way aetiological agents and foods are defined and categorized, even within a single country or jurisdiction, has been shown to impede direct comparison of food attribution across countries within similar time periods (Greig and Ravel, 2009). While up-to-date food safety best practices prescribe data collection systems that are sufficiently precise to minimize uncertainty, in reality, inconsistencies in descriptors pertaining to the host, pathogen, environment, and the underlying attributes of potentially contaminated foods all contribute to uncertainty in data analyses and delay in public health action (Greig and Ravel, 2009).
Table 1.
Resource | Description | URL or reference |
---|---|---|
Codex Alimentarius | | http://www.fao.org/fao-who-codexalimentarius/codex-home/en/ |
LanguaL | | http://www.langual.org/ |
Food Ex2 | | https://www.efsa.europa.eu/en/data/data-standardisation |
USDA National Nutrient Database for Standard Reference | | https://ndb.nal.usda.gov/ndb/foods |
Compendium of Analytical Methods | | http://www.hc-sc.gc.ca/fn-an/res-rech/analy-meth/microbio/volume1-eng.php |
Food Commodity Classification Scheme | | http://www.ncbi.nlm.nih.gov/pubmed/19968563 |
The Agriculture Ontology (AgrO) | | http://www.obofoundry.org/ontology/agro.html |
Antimicrobial Resistance Ontology (ARO) | | https://card.mcmaster.ca/ |
Basic Formal Ontology (BFO) | | http://www.obofoundry.org/ontology/bfo.html |
BRENDA Tissue Ontology (BTO) | | http://www.obofoundry.org/ontology/bto.html |
Chemical Entities of Biological Interest Ontology (ChEBI) | | http://www.obofoundry.org/ontology/chebi.html |
Cell Ontology (CL) | | http://www.obofoundry.org/ontology/cl.html |
Human Disease Ontology (DOID) | | http://www.obofoundry.org/ontology/doid.html |
EMBRACE Data and Methods Ontology (EDAM) | | http://www.ontobee.org/ontology/EDAM |
Environment Ontology (ENVO) | | http://www.obofoundry.org/ontology/envo.html |
Epidemiology (EPO) | | http://www.obofoundry.org/ontology/epo.html |
Exposure (EXO) | | http://www.obofoundry.org/ontology/exo.html |
Foundational Model of Anatomy (FMA) | | http://www.obofoundry.org/ontology/fma.html |
FooDB Ontology (FoodO) | | http://aber-owl.net/ontology/FOODO |
Food Ontology (FoodOn) | | http://www.obofoundry.org/ontology/foodon.html http://foodontology.github.io/foodon/ |
Genomic Epidemiology Ontology (GenEpiO) | | http://www.genepio.org http://www.obofoundry.org/ontology/genepio.html |
Infectious Disease Ontology (IDO) | | https://bioportal.bioontology.org/ontologies/IDO |
Next-Generation Sequencing Ontology (NGSOnto) | | https://bioportal.bioontology.org/ontologies/NGSONTO |
Ontology for Biomedical Investigations (OBI) | | http://www.obofoundry.org/ontology/obi.html |
Phenotypic Quality Ontology (PATO) | | http://www.obofoundry.org/ontology/pato.html |
Relation Ontology (RO) | | http://www.obofoundry.org/ontology/ro.html |
The Sustainable Development Goals Interface Ontology (SDGIO) | | https://github.com/SDG-InterfaceOntology/sdgio |
Sequence Ontology (SO) | | http://www.obofoundry.org/ontology/so.html |
Systematized Nomenclature of Medicine (SNOMED) | | http://www.ihtsdo.org/snomed-ct |
Clinical Signs and Symptoms Ontology (SYMP) | | http://www.obofoundry.org/ontology/symp.html |
Pathogen Transmission Ontology (TRANS) | | http://www.obofoundry.org/ontology/trans.html |
Microbial Typing Ontology (TypOn) | | https://bioportal.bioontology.org/ontologies/TYPON |
Multi-Species Anatomy Ontology (UBERON) | | http://www.obofoundry.org/ontology/uberon.html |
MIxS | | Yilmaz et al., 2011 |
Project and Sample Application Standard | | Dugan et al., 2014 |
Minimum Information about a Phylogenetic Analysis (MIAPA) | | Leebens-Mack et al., 2006 |
STROME-ID guidelines | | Field et al., 2014 |
The Global Alliance for Genomics and Health (GA4GH) | | http://genomicsandhealth.org/ |
The Global Microbial Identifier (GMI) | | http://www.globalmicrobialidentifier.org/ |
The United Nations Environment Programme (UNEP) | | http://web.unep.org/ |
United Nations Environment Live | | https://uneplive.unep.org/sdgs |
In designing an approach to capture standardized metadata, it is critical to define what information about a sample is most informative for its intended use. This process is best achieved via engagement of a variety of end users - in this case food regulators, epidemiologists, lab analysts, bioinformaticians, at local, regional, national and international levels. Minimum Information (MI) checklists represent the sum of all essential data fields recommended by community experts and users, with controlled vocabularies used as ‘allowed values’ (Field and Sansone, 2006). A well-known genomic metadata standard is the MIxS checklist, a minimal metadata standard checklist developed by the Genomic Standards Consortium (GSC) for reporting information about any nucleotide sequence (Yilmaz et al., 2011). Similarly, the National Institute of Allergy and Infectious Diseases Genome Sequencing Center and Bioinformatics Resource Center (GSCID/BRC) Project and Sample Application Standard specifically addresses metadata types that should be attached to human pathogen genomic sequences (Dugan et al., 2014). Additionally, the Minimum Information about a Phylogenetic Analysis (MIAPA) represents a community-wide effort to develop minimal reporting standards for phylogenetic analyses (Leebens-Mack et al., 2006). These checklists contain a wide variety of descriptive fields; however, they currently lack standardized values to enter in the fields.
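The checklist idea can be made concrete with a small sketch. The Python below is illustrative only: the field names, the allowed values and the `validate` helper are hypothetical and are not taken from MIxS or any other published checklist. It shows how a checklist can enforce required fields and controlled values, and how a free-text field still admits inconsistent descriptors.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChecklistField:
    name: str
    required: bool = True
    # An empty set means the field accepts any free text.
    allowed_values: frozenset = frozenset()


# Hypothetical mini-checklist; names and values are invented for illustration.
CHECKLIST = [
    ChecklistField("collection_date"),
    ChecklistField(
        "specimen_source",
        allowed_values=frozenset({"food", "human", "environment"}),
    ),
    ChecklistField("food_product", required=False),  # free text
]


def validate(record, checklist=CHECKLIST):
    """Return a list of problems; an empty list means the record passes."""
    problems = []
    for spec in checklist:
        value = record.get(spec.name)
        if value is None or value == "":
            if spec.required:
                problems.append(f"missing required field: {spec.name}")
            continue
        if spec.allowed_values and value not in spec.allowed_values:
            problems.append(f"{spec.name}: '{value}' is not an allowed value")
    return problems
```

In this sketch, records with `food_product` set to "spinach" and to "Spinach, baby (bagged)" both pass validation, because nothing constrains the free-text field. This is the kind of inconsistency that standardized values are meant to remove.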
A more comprehensive mechanism for making metadata searchable and actionable is the use of ‘ontologies’ (Bodenreider and Stevens, 2006; Brinkman et al., 2010). Ontologies are hierarchies of well-defined and standardized vocabulary interconnected by logical relationships (Bodenreider and Stevens, 2006). These logical interconnections provide a layer of intelligence to query engines, making ontologies much more powerful than simple flat lists of terms. Terms and their definitions are specified by universal IDs (Uniform Resource Identifiers), which associate descriptors with particular usages and disambiguate meaning (Bodenreider and Stevens, 2006). Ontologies also associate synonyms with terms and their identifiers (IDs), e.g., biscuits (United Kingdom) and cookies (North America), enabling institutions to use their preferred terminology while simultaneously mapping terms to an ontology standard. The hierarchical structure enables comparison of entities at different levels of granularity (e.g., leafy greens and spinach), which represents an important feature for evolving food safety investigations in which the hypothesized food vehicle is a moving target. Mapping to an ontology-based standard and reuse of universal IDs makes software implementing the ontology framework interoperable, enabling faster and more efficient data exchange (Arp et al., 2015). The reuse of terms and their IDs enables integration of different data types across domains (epidemiology, food, disease, agriculture, antimicrobial resistance, etc.) and between agencies (Ferreira et al., 2013). Because they are both computer and human readable (in different natural languages), ontology hierarchies allow stakeholders to share data according to the level of granularity permitted by jurisdictional policies, and fields of information with legal or privacy issues can be flagged using ontology relations to increase security.
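As a rough illustration of these properties, the following Python sketch models a toy hierarchy with identifiers, synonyms and parent links. The `EX:` identifiers, the term list and the helper functions are all invented for this example; they are not real FoodOn or GenEpiO IDs, and real ontologies are encoded in OWL with richer relations than a single parent link.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Term:
    term_id: str                  # stands in for a URI; invented for this sketch
    label: str
    parent: Optional[str] = None  # single "is a" link to a broader term
    synonyms: tuple = ()


TERMS = [
    Term("EX:0001", "food product"),
    Term("EX:0002", "leafy greens", parent="EX:0001"),
    Term("EX:0003", "spinach", parent="EX:0002"),
    Term("EX:0004", "lettuce", parent="EX:0002"),
    Term("EX:0005", "cookie", parent="EX:0001", synonyms=("biscuit",)),
]

BY_ID = {t.term_id: t for t in TERMS}
# Labels and synonyms all resolve to the same identifier.
LOOKUP = {
    name.lower(): t.term_id for t in TERMS for name in (t.label, *t.synonyms)
}


def resolve(local_name):
    """Map an institution's preferred wording to a shared ID, or None."""
    return LOOKUP.get(local_name.strip().lower())


def ancestors(term_id):
    """IDs from the term itself up to the root of the hierarchy."""
    chain = []
    current = term_id
    while current is not None:
        chain.append(current)
        current = BY_ID[current].parent
    return chain


def is_a(term_id, ancestor_id):
    """True if term_id equals ancestor_id or sits anywhere beneath it."""
    return ancestor_id in ancestors(term_id)
```

Under these assumptions, `resolve("Biscuit")` and `resolve("cookie")` return the same identifier, and `is_a(resolve("spinach"), resolve("leafy greens"))` is true. An isolate recorded against "spinach" can therefore be retrieved by a query for "leafy greens" when the hypothesized food vehicle is still broad.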
Furthermore, fit-for-purpose ontologies provide contextual information with the auditability required for food safety and public health laboratory accreditation (Evans, 2015). Principles of good practice in ontology development have been put into practice within the framework of the Open Biomedical Ontologies consortium through its OBO Foundry initiative, which emphasizes collaborative development, interoperability and usability (Smith et al., 2007). Descriptors of genomic epidemiological processes have already been captured in a number of existing ontologies. Some examples include the Sequence Ontology (SO) (Eilbeck et al., 2005), the EDAM Bioinformatics Ontology (EDAM) (Ison et al., 2013), and DOID (Schriml et al., 2012), which describe sequences, genome assembly, and human disease. The Exposure, Epidemiology, Environment, Symptoms, and Transmission Ontologies (EXO, EPO, ENVO, SYMP, TRANS) describe types of exposures, facets of epidemiology, natural and built environments, clinical signs and symptoms, and modes of transmission (Mattingly et al., 2012; Pesquita et al., 2014; Buttigieg et al., 2016). Ontologies and other resources useful for genomic epidemiology are listed in Table 1.
Currently, no single resource integrates all the necessary components of a genomic epidemiology investigation. As such, our research efforts have focused on the development of a Genomic Epidemiology Ontology (GenEpiO), based on public health stakeholder interviews and the harmonization of important laboratory, clinical and epidemiological data fields, in collaboration with a consortium of researchers and end users. We are also actively developing, in collaboration with members of the international GenEpiO consortium, a Farm-to-Fork food ontology (FoodOn) aiming to harmonize existing food resources and describe food entities from point(s) of production/collection, through processing, distribution and consumption.
GenEpiO and FoodOn: New Developments in Food Safety Semantics
The Genomic Epidemiology Ontology (GenEpiO) is an ontology resource being developed according to the principles of the OBO Foundry, led by a partnership of Canadian scientists representing academic, provincial and federal public health interests. The objective of GenEpiO is to enable integration and propagation of all necessary contextual information required to interpret microbial pathogen genomics data, from the point-of-sample-intake, through sequencing, to end use (e.g., during a foodborne outbreak investigation). The GenEpiO hierarchy was constructed based on the Basic Formal Ontology (BFO) and Relation Ontology (RO) of the OBO Foundry, which delineate how things should be organized into higher level classes, and how things and classes should relate to one another (Smith et al., 2005; Arp et al., 2015). This architecture improves compatibility with other OBO biomedical ontologies, enriching vocabulary and data linkages, and facilitating the reuse of terminology and the integration of information across health and food safety domains (agriculture, veterinary care, environment, food production). The considerable consensus achieved by the OBO Foundry has paved the way for harmonization of complex content in a way that is unavailable with other disparate ontologies. GenEpiO terms are mapped to community standards and over 25 existing ontologies to ensure the accuracy of meaning and to facilitate interoperability (Figure 1B). GenEpiO also includes data models comprising disease/agency/reporting or analytical system/surveillance network-specific fields, which can be used to represent genomic epidemiology workflows, processes, disease progression and decision-making. GenEpiO currently contains over 2000 key fields and terms to harmonize sample metadata, lab analytics, wet lab and bioinformatics processes, quality control, clinical information as well as exposures and epidemiological data. 
As such, we anticipate that GenEpiO will better enable the calibration and validation of genomics for clinical and regulatory use. Controlled vocabulary and relationship logic are encoded in the Web Ontology Language, OWL. OWL files are publicly available, and can be implemented in different software applications (Table 1). The GenEpiO ontology is currently being implemented within the Integrated Rapid Infectious Disease Analysis (IRIDA) platform, an open source, secure web-based, end-to-end platform for infectious disease genomic epidemiology, spearheaded in Canada. Within IRIDA, GenEpiO is being used to generate NCBI BioSample-compliant submission-ready genome metadata files, and to create different Line List visualization tools for epidemiological investigations. The next phase of development will involve the complete integration of GenEpiO to enhance the platform’s analytical power.
FoodOn encompasses materials in natural ecosystems, as well as human-centric food items, food production environments and handling of food (Griffiths et al., 2016). We aim to develop semantics for food safety, food security, the agricultural and animal husbandry practices linked to food production, culinary, nutritional and chemical ingredients and processes. As such, FoodOn architecture is similarly based on BFO and RO schema, as well as the facet-based LanguaL (Langua aLimentaria, or language of food) classification system of the US Food and Drug Administration (US FDA) (Ireland and Møller, 2010). Facets include Food Products, which can be linked to Food Sources, Cooking and Preservation Methods, Consumer Groups, Cultural Origins, Taxonomy and more. Thousands of individual food products have already been indexed according to the LanguaL system, and are publicly available in a separate FoodOn import file (Table 1). The scope of FoodOn is ambitious and will require input and long-term development by multiple domain experts. Further details regarding GenEpiO and FoodOn design and content will be discussed elsewhere (manuscripts in preparation).
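To illustrate what a facet-based description makes possible, the short Python sketch below indexes a few food products by facet and retrieves them by any combination of facet values. The product names, facet names and descriptors are hypothetical and do not reproduce actual LanguaL codes or the FoodOn data model.

```python
from dataclasses import dataclass, field


@dataclass
class FoodProduct:
    name: str
    # Maps a facet name to the set of descriptors assigned under that facet.
    facets: dict = field(default_factory=dict)


# Invented examples; real LanguaL indexing uses coded descriptors.
PRODUCTS = [
    FoodProduct("frozen spinach", {
        "food source": {"spinach plant"},
        "preservation method": {"frozen"},
    }),
    FoodProduct("canned tuna", {
        "food source": {"tuna"},
        "preservation method": {"canned"},
        "cooking method": {"cooked"},
    }),
    FoodProduct("fresh spinach salad", {
        "food source": {"spinach plant"},
        "preservation method": {"none"},
    }),
]


def find(products, criteria):
    """Names of products whose facets contain every requested descriptor."""
    return [
        p.name
        for p in products
        if all(
            value in p.facets.get(facet, set())
            for facet, value in criteria.items()
        )
    ]
```

For example, `find(PRODUCTS, {"food source": "spinach plant"})` selects both spinach products regardless of how they were preserved, while adding `"preservation method": "frozen"` narrows the result to the frozen product only.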
In order to ensure utility, accuracy and usability, user engagement is a top priority for GenEpiO and FoodOn development. Feedback from engagement efforts has indicated that user-friendly tools for curation of terms, implementation, and mapping between interfaces and agencies would serve to mobilize these technologies. To that effect, we are currently developing software applications for ontology mapping and curation. Additionally, both ontologies can be searched using various widely used portals such as the EBI Ontology Look-up Service, Ontobee, and NCBO BioPortal (Table 1). As harmonization of both the GenEpiO and FoodOn ontologies can only be achieved by consensus and wide adoption, involving open source and open access initiatives, we have catalyzed the formation of international consortia to build partnerships and solicit contributions from domain experts. The GenEpiO consortium membership comprises over 70 participants from 15 countries, with leadership, technical and editorial working groups. The interaction of the consortia, tools, applications, ontologies, users and repositories will be important for soliciting term contributions, as well as integrating regional- and sector-specific vocabulary, and evolving strategies for international uptake (Figure 1C).
Broader Context of Food Genomics Metadata and Ontologies
Several frameworks for integrating genomics and other data currently exist for tackling the real-world problems of emerging diseases, environmental degradation, world hunger, and sustainability. Each of these global partnerships seeks to streamline the flow of genomics knowledge and its application for solving global challenges. The Global Alliance for Genomics and Health (GA4GH) and The Global Microbial Identifier (GMI) work to establish common frameworks and transdisciplinary networks to better monitor and control emerging public health threats (Knoppers, 2014; Wielinga et al., 2017). The Environmental Working Group of the United Nations (UNEP) has developed Sustainable Development Goals addressing climate change, renewable energy, food, health and water provision, all of which require coordinated global monitoring (United Nations, 2016). Each of these efforts involves highly negotiated language representing different disciplines and policies, which can be harmonized into a coherent system through the use of ontologies. GA4GH and UNEP currently implement OBO Foundry ontologies that have been integrated into GenEpiO (e.g., ENVO, UBERON, ChEBI). GenEpiO integrates the Minimal Data for Matching standards for matching pathogen isolates prescribed by the GMI consortium (Global Microbial Identifier, 2013), and GenEpiO and FoodOn standards are being considered for an upcoming ISO (International Organization for Standardization) guideline on the use of WGS for Food Safety. The standardized food and food environment descriptors being developed in FoodOn can fill a critical gap in community standards required to integrate food related data in each of these efforts. Global initiatives and associated ontologies can be found in Table 1.
Public health and genomics descriptors found in GenEpiO, combined with existing compatible ontologies for describing different environments (ENVO), agriculture (AgrO), and sustainable development (SDGIO), will greatly enable the integration of knowledge required to accomplish global health, equity and sustainability goals (Table 1).
Conclusion
Platforms implementing ontologies such as GenEpiO and FoodOn will be the work-engines ensuring the integration and reusability of genomics data from the collection of samples, through consumption by various end users. With the international nature of food distribution and food safety concerns, the most effective semantic resources must be open source, interoperable and collaboratively developed in order to best represent the needs of the international community. Global networks navigating the political challenges inherent in such community efforts will be crucial for the success of genomics as the new currency of food and waterborne pathogen typing. While no “one-size-fits-all” data dictionary for genomic epidemiology currently exists, harmonization of different vocabularies can be achieved through the use of ontologies and the flexibility they provide. With growing support of community-based development efforts, this foundational work can facilitate intra- and international data exchange, resulting in improved food safety and health outcomes globally, as well as promoting innovation and discovery.
Author Contributions
EG wrote the manuscript. EG and DD developed software, concepts and resources. MG and GVD contributed input, use cases and testing material for resource development. WH and FB conceived the project and supervised this work. DD, MG, GVD, FB, and WH provided feedback on the manuscript.
Conflict of Interest Statement
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
Acknowledgments
The authors would like to thank the GenEpiO Consortium for their contributions and support, as well as Pier Luigi Buttigieg, Robert Hoehndorf, Matthew Lange and Chris Mungall of the FoodOn Consortium, and Jane Ireland and Anders Møller of The Danish Food Informatics (DFI) group, for their ongoing development efforts.
Funding. This work was funded by Genome Canada Bioinformatics and Computational Biology (BCB) 2012 Grant #172PHM with co-funding from Genome BC and the federal Genomics Research and Development Initiative (GRDI) interdepartmental Food and Water Safety project. FoodOn is funded by Genome Canada BCB 2015 Grant #254EPI, with some additional support from AllerGen NCE, Inc., of the Government of Canada’s Networks of Centres of Excellence (NCE) program.
References
- Ammon A., Makela P. (2010). Integrated data collection on zoonoses in the European Union, from animals to humans, and the analyses of the data. Int. J. Food Microbiol. 139(Suppl. 1), S43–S47. doi: 10.1016/j.ijfoodmicro.2010.03.002
- Arp R., Smith B., Spear A. D. (2015). Building Ontologies with Basic Formal Ontology. Cambridge, MA: The MIT Press.
- Ashton P. M., Nair S., Peters T. M., Bale J. A., Powell D. G., Painset A., et al. (2016). Identification of Salmonella for public health surveillance using whole genome sequencing. PeerJ 4:e1752. doi: 10.7717/peerj.1752
- Aziz N., Zhao Q., Bry L., Driscoll D. K., Funke B., Gibson J. S., et al. (2015). College of American Pathologists’ laboratory standards for next-generation sequencing clinical tests. Arch. Pathol. Lab. Med. 139, 481–493. doi: 10.3760/cma.j.issn.0529-5815.2017.02.004
- Bodenreider O., Stevens R. (2006). Bio-ontologies: current trends and future directions. Brief. Bioinform. 7, 256–274. doi: 10.1093/bib/bbl027
- Brinkman R. R., Courtot M., Derom D., Fostel J. M., He Y., Lord P., et al. (2010). Modeling biomedical experimental processes with OBI. J. Biomed. Semant. 1(Suppl. 1), S7. doi: 10.1186/2041-1480-1-S1-S7
- Buttigieg P. L., Pafilis E., Lewis S. E., Schildhauer M. P., Walls R. L., Mungall C. J. (2016). The environment ontology in 2016: bridging domains with increased scope, semantic density, and interoperation. J. Biomed. Semant. 7:57. doi: 10.1186/s13326-016-0097-6
- Clark C. G., Berry C., Walker M., Petkau A., Barker D. O. R., Guan C., et al. (2016). Genomic insights from whole genome sequencing of four clonal outbreak Campylobacter jejuni assessed within the global C. jejuni population. BMC Genomics 17:990. doi: 10.1186/s12864-016-3340-8
- Danan C., Baroukh T., Moury F., Jourdan-Da Silva N., Brisabois A., Le Strat Y. (2011). Automated early warning system for the surveillance of Salmonella isolated in the agro-food chain in France. Epidemiol. Infect. 139, 736–741. doi: 10.1017/S0950268810001469
- Day M., Doumith M., Jenkins C., Dallman T. J., Hopkins K. L., Elson R., et al. (2017). Antimicrobial resistance in Shiga toxin-producing Escherichia coli serogroups O157 and O26 isolated from human cases of diarrhoeal disease in England, 2015. J. Antimicrob. Chemother. 72, 145–152. doi: 10.1093/jac/dkw371
- Dugan V. G., Emrich S. J., Giraldo-Calderón G. I., Harb O. S., Newman R. M., Pickett B. E., et al. (2014). Standardized metadata for human pathogen/vector genomic sequences. PLoS ONE 9:e99979. doi: 10.1371/journal.pone.0099979
- Eilbeck K., Lewis S. E., Mungall C. J., Yandell M., Stein L., Durbin R., et al. (2005). The Sequence ontology: a tool for the unification of genome annotations. Genome Biol. 6:R44. doi: 10.1186/gb-2005-6-5-r44
- Evans P. (2015). “International standards development for use of whole genome sequencing in food microbiology,” in Proceedings of the InFORM Meeting, Phoenix, AZ.
- Ferreira J. D., Paolotti D., Couto F. M., Silva M. J. (2013). On the usefulness of ontologies in epidemiology research and practice. J. Epidemiol. Commun. Health 67, 385–388. doi: 10.1136/jech-2012-201142
- Fidler D. P., Gostin L. O. (2011). The WHO pandemic influenza preparedness framework: a milestone in global governance for health. JAMA 306, 200–201. doi: 10.1001/jama.2011.960
- Field D., Sansone S.-A. (2006). A special issue on data standards. OMICS J. Integr. Biol. 10, 84–93. doi: 10.1089/omi.2006.10.84
- Field N., Cohen T., Struelens M. J., Palm D., Cookson B., Glynn J. R., et al. (2014). Strengthening the reporting of molecular epidemiology for infectious diseases (STROME-ID): an extension of the STROBE statement. Lancet Infect. Dis. 14, 341–352. doi: 10.1016/S1473-3099(13)70324-4
- Flynn D. (2014). USDA: U.S. foodborne illnesses cost more than $15.6 billion annually. Food Saf. News. Available at: http://www.foodsafetynews.com/2014/10/foodborne-illnesses-cost-usa-15-6-billion-annually/
- Food and Agriculture Organization of the United Nations [FAO] (2005). Food Safety Risk Analysis - An Overview and Framework Manual. Available at: https://www.fsc.go.jp/sonota/foodsafety_riskanalysis.pdf
- Glasset B., Herbin S., Guillier L., Cadel-Six S., Vignaud M.-L., Grout J., et al. (2016). Bacillus cereus-induced food-borne outbreaks in France, 2007 to 2014: epidemiology and genetic characterisation. Euro. Surveill. 21:30413. doi: 10.2807/1560-7917.ES.2016.21.48.30413
- Global Microbial Identifier (2013). 6th Annual Meeting on Global Microbial Identifier. Sacramento, CA: Global Microbial Identifier. Available at: http://www.globalmicrobialidentifier.org/news-and-events/previous-meetings/6th-meeting-on-gmi
- Grad Y. H., Lipsitch M. (2014). Epidemiologic data and pathogen genome sequences: a powerful synergy for public health. Genome Biol. 15:538. doi: 10.1186/s13059-014-0538-4
- Greig J. D., Ravel A. (2009). Analysis of foodborne outbreak data reported internationally for source attribution. Int. J. Food Microbiol. 130, 77–87. doi: 10.1016/j.ijfoodmicro.2008.12.031
- Griffiths E., Dooley D., Buttigieg P. L., Hoehndorf R., Brinkman F., Hsiao W. (2016). “FoodOn: a global farm-to-fork food ontology,” in Proceedings of the ICBO Conference Corvalis, OR. [Google Scholar]
- Hoornstra E., Northolt M. D., Notermans S., Barendsz A. W. (2001). The use of quantitative risk assessment in HACCP. Food Control 12 229–234. 10.1016/j.ijfoodmicro.2015.03.032 [DOI] [Google Scholar]
- Ireland J. D., Møller A. (2010). LanguaL food description: a learning process. Eur. J. Clin. Nutr. 64 S44–S48. 10.1038/ejcn.2010.209 [DOI] [PubMed] [Google Scholar]
- Ison J., Kalas M., Jonassen I., Bolser D., Uludag M., McWilliam H., et al. (2013). EDAM: an ontology of bioinformatics operations, types of data and identifiers, topics and formats. Bioinformatics 29 1325–1332. 10.1093/bioinformatics/btt113 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Kanagarajah S., Waldram A., Dolan G., Jenkins C., Ashton P. M., Carrion Martin A. I., et al. (2017). Whole genome sequencing reveals an outbreak of Salmonella Enteritidis associated with reptile feeder mice in the United Kingdom, 2012-2015. Food Microbiol. (in press). [DOI] [PubMed] [Google Scholar]
- Kanengoni A. T., Thomas R., Gelaw A. K., Madoroba E. (2017). Epidemiology and characterization of Escherichia coli outbreak on a pig farm in South Africa. FEMS Microbiol. Lett. 364:fnx010 10.1093/femsle/fnx010 [DOI] [PubMed] [Google Scholar]
- Kircher M., Heyn P., Kelso J. (2011). Addressing challenges in the production and analysis of illumina sequencing data. BMC Genomics 12:382 10.1186/1471-2164-12-382 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Knoppers B. M. (2014). Framework for responsible sharing of genomic and health-related data. HUGO J. 8:3 10.1186/s11568-014-0003-1 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Lambert D., Pightling A., Griffiths E., Van Domselaar G., Evans P., Berthelet S., et al. (2017). Baseline practices for the application of genomic data supporting regulatory food safety. J. AOAC Int. 100 721–731. 10.5740/jaoacint.16-0269 [DOI] [PubMed] [Google Scholar]
- Lammerding A. M., Fazil A. (2000). Hazard identification and exposure assessment for microbial food safety risk assessment. Int. J. Food Microbiol. 58 147–157. 10.1016/S0168-1605(00)00269-5 [DOI] [PubMed] [Google Scholar]
- Leebens-Mack J., Vision T., Brenner E., Bowers J. E., Cannon S., Clement M. J., et al. (2006). Taking the first steps towards a standard for reporting on phylogenies: minimum information about a phylogenetic analysis (MIAPA). Omics J. Integr. Biol. 10 231–237. 10.1089/omi.2006.10.231 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Lynch T., Petkau A., Knox N., Graham M., Domselaar G. V. (2016). A primer on infectious disease bacterial genomics. Clin. Microbiol. Rev. 29 881–913. 10.1128/CMR.00001-16 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Mattingly C. J., McKone T. E., Callahan M. A., Blake J. A., Hubal E. A. C. (2012). Providing the missing link: the exposure science ontology ExO. Environ. Sci. Technol. 46 3046–3053. 10.1021/es2033857 [DOI] [PMC free article] [PubMed] [Google Scholar]
- McMahon C., Denaxas S. (2016). A novel framework for assessing metadata quality in epidemiological and public health research settings. AMIA Summits Transl. Sci. Proc. 2016 199–208. [PMC free article] [PubMed] [Google Scholar]
- Minor T., Lasher A., Klontz K., Brown B., Nardinelli C., Zorn D. (2015). The per case and total annual costs of foodborne illness in the United States. Risk Anal. 35 1125–1139. 10.1111/risa.12316 [DOI] [PubMed] [Google Scholar]
- Moura A., Criscuolo A., Pouseele H., Maury M. M., Leclercq A., Tarr C., et al. (2016). Whole genome-based population biology and epidemiological surveillance of Listeria monocytogenes. Nat. Microbiol. 2:16185 10.1038/nmicrobiol.2016.185 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Njamkepo E., Fawal N., Tran-Dien A., Hawkey J., Strockbine N., Jenkins C., et al. (2016). Global phylogeography and evolutionary history of Shigella dysenteriae type 1. Nat. Microbiol. 1:16027 10.1038/nmicrobiol.2016.27 [DOI] [PubMed] [Google Scholar]
- Paszkiewicz K. H., Farbos A., O’Neill P., Moore K. (2014). Quality control on the frontier. Front. Genet. 5:157 10.3389/fgene.2014.00157 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Pesquita C., Ferreira J. D., Couto F. M., Silva M. J. (2014). The epidemiology ontology: an ontology for the semantic annotation of epidemiological resources. J. Biomed. Semant. 5:4 10.1186/2041-1480-5-4 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Pisani E., AbouZahr C. (2010). Sharing health data: good intentions are not enough. Bull. World Health Organ. 88 462–466. 10.2471/BLT.09.074393 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Schriml L. M., Arze C., Nadendla S., Chang Y.-W. W., Mazaitis M., Felix V., et al. (2012). Disease ontology: a backbone for disease semantic integration. Nucleic Acids Res. 40 D940–D946. 10.1093/nar/gkr972 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Sharma M., Nunez-Garcia J., Kearns A. M., Doumith M., Butaye P. R., Argudín M. A., et al. (2016). Livestock-associated methicillin resistant Staphylococcus aureus (LA-MRSA) clonal complex (CC) 398 isolated from UK animals belong to European lineages. Front. Microbiol. 7:1741 10.3389/fmicb.2016.01741 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Smith B., Ashburner M., Rosse C., Bard J., Bug W., Ceusters W., et al. (2007). The OBO foundry: coordinated evolution of ontologies to support biomedical data integration. Nat. Biotechnol. 25 1251–1255. 10.1038/nbt1346 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Smith B., Ceusters W., Klagges B., Köhler J., Kumar A., Lomax J., et al. (2005). Relations in biomedical ontologies. Genome Biol. 6:R46 10.1186/gb-2005-6-5-r46 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Tagini F., Aubert B., Troillet N., Pillonel T., Praz G., Crisinel P. A., et al. (2017). Importance of whole genome sequencing for the assessment of outbreaks in diagnostic laboratories: analysis of a case series of invasive Streptococcus pyogenes infections. Eur. J. Clin. Microbiol. Infect. Dis. 10.1007/s10096-017-2905-z [Epub ahead of print]. [DOI] [PMC free article] [PubMed] [Google Scholar]
- United Nations (2016). Biodiversity and the 2030 Agenda for Sustainable Development. Available at: http://www.undp.org/content/undp/en/home/librarypage/environment-energy/ecosystems_and_biodiversity/biodiversity-and-the-2030-agenda-for-sustainable-development---p.html [Google Scholar]
- van Panhuis W. G., Paul P., Emerson C., Grefenstette J., Wilder R., Herbst A. J., et al. (2014). A systematic review of barriers to data sharing in public health. BMC Public Health 14:1144 10.1186/1471-2458-14-1144 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Waldram A., Dolan G., Ashton P. M., Jenkins C., Dallman T. J. (2017). Epidemiological analysis of Salmonella clusters identified by whole genome sequencing, England and Wales 2014. Food Microbiol. (in press) 10.1016/j.fm.2017.02.012 [DOI] [PubMed] [Google Scholar]
- Wielinga P. R., Hendriksen R. S., Aarestrup F. M., Lund O., Smits S. L., Koopmans M. P., et al. (2017). “Global microbial identifier,” in Applied Genomics of Foodborne Pathogens eds Deng X., Bakker H. C., den Hendriksen R. S. (Cham: Springer International Publishing; ) 13–31. [Google Scholar]
- World Health Organization (2008). Foodborne Disease Outbreaks : Guidelines for Investigation And Control. Geneva: World Health Organization; Available at: http://www.who.int/iris/handle/10665/43771 [Google Scholar]
- World Health Organization (2015). WHO’s First Ever Global Estimates of Foodborne Diseases Find Children Under 5 Account for Almost One Third of Deaths. Geneva: World Health Organization; Available at: http://www.who.int/mediacentre/news/releases/2015/foodborne-disease-estimates/en/ [Google Scholar]
- Yilmaz P., Kottmann R., Field D., Knight R., Cole J. R., Amaral-Zettler L., et al. (2011). Minimum information about a marker gene sequence (MIMARKS) and minimum information about any (x) sequence (MIxS) specifications. Nat. Biotechnol. 29 415–420. 10.1038/nbt.1823 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Zaidi M. B., Calva J. J., Estrada-Garcia M. T., Leon V., Vazquez G., Figueroa G., et al. (2008). Integrated food chain surveillance system for Salmonella spp. in Mexico. Emerg. Infect. Dis. 14 429–435. 10.3201/eid1403.071057 [DOI] [PMC free article] [PubMed] [Google Scholar]