Version Changes
Revised. Amendments from Version 1
The most significant change in this revision is the addition of two simple example workflows
Supplementary File 1 was added which includes example result outputs for the two remaining API calls not shown in the main manuscript
Updated the Implementation section to clarify the WikiPathways RDF generation process
The summary section also discusses creating more complex workflows and supported platforms and libraries
Explanation for acceptable URIs that can be used in API calls was added
Updated the Methods section to specify what happens in the API call when an interaction direction is not specified
The Methods section now also explains how ontological interaction type information can be retrieved from the JSON returned by the API call
Abstract
Open PHACTS is a pre-competitive project to answer scientific questions that were recently developed by the pharmaceutical industry. High-quality biological interaction information in the Open PHACTS Discovery Platform is needed to answer multiple pathway-related questions. To address this, updated WikiPathways data has been added to the platform. This data includes information about biological interactions, such as stimulation and inhibition. The platform's Application Programming Interface (API) was extended with appropriate calls to reference these interactions. These new methods of the Open PHACTS API are available now.
Keywords: Open PHACTS, drug discovery, semantic, bioinformatics, WikiPathways, pathway database, API
Introduction
Targeting proteins, ideally to restore normal biological processes, is a common starting point in drug discovery 1. The Open PHACTS Discovery Platform (OPDP) was designed to help identify protein targets and information about their associations with each other 2–4. The OPDP supports target identification and validation by including target-target interactions from WikiPathways 5–7. Within these interaction networks, proteins that share a downstream path allow investigation of alternative drug target combinations. Even the knowledge of which biological pathways participate in disease-related processes provides insight into the pathway topology between the targets. The importance of, and need for, access to interaction information for real-world research questions was outlined in a recent Open PHACTS paper 8.
The Open PHACTS project was born out of the desire to integrate pharmacological data from multiple precompetitive sources to efficiently address scientific questions that cannot be answered with single data sources 8. It integrates data using linked data approaches 3 from chemical and biological sources such as ChEBI, ChEMBL, UniProt, and WikiPathways 6. However, the OPDP did not previously include calls to access specific up- and downstream interaction effects. This information is needed for questions related to drug repositioning and repurposing. Up- or downstream targets may be interesting alternatives, with a similar therapeutic effect, to targets for which it is particularly hard to develop a drug agent. Thus, finding a target that has already been drugged or is more drug tractable will be advantageous. Here we describe how to identify alternative targets in the same cellular pathway using the OPDP against the WikiPathways data.
Methods
Implementation
The WikiPathways Resource Description Framework data (WPRDF) is released as part of the monthly releases 5. The native format for WikiPathways is Graphical Pathway Markup Language (GPML) based on the eXtensible Markup Language (XML) standard. The RDF export is transformed from the original GPML. In the RDF representation we use two distinct controlled vocabularies to distinguish between the graphical notation of a pathway and the biological meanings expressed in the pathway. This is done to allow integration with other pathway repositories which use other graphical notations or none. The WikiPathways RDF also includes details about directed and undirected interactions. Directed biochemical interactions capture the source and target, which are depicted as an arrow in simple pathway drawings. WikiPathways adds biological meaning to interactions with Molecular Interaction Map (MIM) interaction types, like inhibitions, enzyme catalyzed reactions, and stimulations 9, as well as Systems Biology Graphical Notation (SBGN) interactions 10. Reactome pathways in WikiPathways use SBGN interactions 11, 12. However, because MIM and SBGN use different drawing styles, we normalize their inhibition types into a common inhibition type, defined by the WikiPathways ontology (https://vocabularies.wikipathways.org/wp).
The WikiPathways basic drawing tools also contain generic arrows and T-bar annotations that give the user the ability to create basic diagrams without the semantic meaning of MIM or SBGN notations. The interactions connecting these nodes are captured, but the only explicit information is that it is a directed interaction from a source to a target. To handle more complicated enzyme reaction drawings, where there is not a single line that directly connects targets in a cascade of enzymatic reactions, a query was developed that recognizes these types of reactions. However, this is not implemented in the current Open PHACTS Application Programming Interface (API).
Version 2.1 of the OPDP API contains three new calls for interactions and their pathways. The first call, /pathway/getInteractions, returns all interactions involved in a pathway. To use this feature, the user specifies a pathway URI and OPDP returns its interactions including information about direction and the connected entities. The direction information is relayed as a starting node having a wp:source annotation, while the end of the interaction has the wp:target annotation. In its simplest form, this means that if gene product A is interacting with a gene product B, then we have wp:source for product A and wp:target for product B. However, the presented new methods also support interactions with multiple sources and targets for more complex interactions that are more accurately represented this way.
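The wp:source and wp:target convention described above can be pictured with a small data model. The following Python sketch is illustrative only: the class name Interaction, its fields, and the example entity names are assumptions made for this example and do not reflect the structure of the JSON returned by the API.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Interaction:
    """Hypothetical in-memory view of one pathway interaction."""
    identifier: str
    # Entities carrying the wp:source annotation (start of the interaction).
    sources: List[str] = field(default_factory=list)
    # Entities carrying the wp:target annotation (end of the interaction).
    targets: List[str] = field(default_factory=list)

    def is_directed(self) -> bool:
        # In this sketch an interaction counts as directed when both
        # ends are annotated.
        return bool(self.sources) and bool(self.targets)


# Simplest case: gene product A interacts with gene product B.
simple = Interaction("example-1", sources=["A"], targets=["B"])

# More complex case: two sources acting on a single target.
complex_case = Interaction("example-2", sources=["A", "C"], targets=["B"])

print(simple.is_directed(), len(complex_case.sources))
```

Keeping sources and targets as lists, rather than single values, mirrors the statement that interactions with multiple sources and targets are supported.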
The second added call, /pathways/interactions/byEntity, returns the interactions involving a given entity and their direction. An entity is specified by a URI and can be a metabolite, protein, gene product, or RNA. API options allow the user to select only upstream or only downstream interactions. If a direction is not specified in the call, all the adjacent interactions will be retrieved regardless of their direction. The results also specify the interaction type (e.g. inhibition, stimulation, conversion). The WikiPathways vocabulary (vocabularies.wikipathways.org) also identifies catalysis and binding events, as well as a more generic directedInteraction for cases where the type of the interaction is not identified. This ability to select the interaction direction is specifically what allows users to answer scientific questions around upstream and downstream effects, such as those defined by Open PHACTS. The third API call, /pathways/interactions/byEntity/count, is a helper function that returns the number of interactions for a target.
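The direction semantics can be sketched locally in Python. This is a minimal illustration, not the platform's implementation: the record layout, the entity names X, Y and Z, and the direction labels "up" and "down" are assumptions chosen for this example and are not taken from the API documentation.

```python
from typing import Dict, List, Optional

# Hypothetical, simplified records: one source, one target, one type each.
interactions: List[Dict[str, str]] = [
    {"source": "X", "target": "AKT2", "type": "stimulation"},
    {"source": "AKT2", "target": "Y", "type": "inhibition"},
    {"source": "Z", "target": "AKT2", "type": "directedInteraction"},
]


def filter_by_direction(records: List[Dict[str, str]],
                        entity: str,
                        direction: Optional[str] = None) -> List[Dict[str, str]]:
    """Return the records that involve `entity`.

    direction="up"   keeps interactions in which the entity is the target,
    direction="down" keeps interactions in which the entity is the source,
    direction=None   keeps every adjacent interaction.
    """
    if direction not in (None, "up", "down"):
        raise ValueError("direction must be None, 'up' or 'down'")
    selected = []
    for rec in records:
        is_upstream = rec["target"] == entity
        is_downstream = rec["source"] == entity
        if direction == "up" and is_upstream:
            selected.append(rec)
        elif direction == "down" and is_downstream:
            selected.append(rec)
        elif direction is None and (is_upstream or is_downstream):
            selected.append(rec)
    return selected


upstream = filter_by_direction(interactions, "AKT2", "up")
downstream = filter_by_direction(interactions, "AKT2", "down")
adjacent = filter_by_direction(interactions, "AKT2")
```

Leaving the direction unset returns every adjacent interaction, which matches the default behaviour described for the call.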
Operation
The OPDP API calls are backed by SPARQL searches against the loaded WikiPathways RDF. The query parameters that are required or optional are given in the documentation of Open PHACTS (https://dev.openphacts.org/docs/2.1). As in previous versions, the API uses HTTP GET to call methods and needs a (free) application ID and key (see https://dev.openphacts.org/signup) 3.
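Because the methods are called with HTTP GET, a request is simply a URL with query parameters. The Python sketch below only assembles such a URL and does not contact the service. The base URL, the parameter names (uri, app_id, app_key, _format) and the pathway URI are assumptions for illustration and should be checked against the Open PHACTS documentation; the ID and key values are placeholders.

```python
from urllib.parse import urlencode

# Assumed base URL of the API; verify against the official documentation.
BASE_URL = "https://beta.openphacts.org/2.1"


def build_call_url(path: str, uri: str, app_id: str, app_key: str,
                   base_url: str = BASE_URL) -> str:
    """Assemble a GET URL for one API method (parameter names are assumed)."""
    query = urlencode({
        "uri": uri,            # entity or pathway being queried
        "app_id": app_id,      # application ID from the free account
        "app_key": app_key,    # application key from the free account
        "_format": "json",     # requested response format
    })
    return f"{base_url}{path}?{query}"


url = build_call_url(
    "/pathway/getInteractions",
    "http://identifiers.org/wikipathways/WP1544",  # assumed pathway URI form
    "YOUR_APP_ID",
    "YOUR_APP_KEY",
)
print(url)
```

The resulting string could be passed to any HTTP client; urlencode takes care of percent-encoding the URI given as a parameter value.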
To ensure multiple URI schemes can be used to identify genes, proteins, and metabolites, the Open PHACTS platform uses an Identifier Mapping Service (IMS) 6. This ensures that people can use Ensembl, NCBI Gene, and others for genes, UniProt, Ensembl, etc. for proteins, and HMDB, ChEBI, CAS registry number, and PubChem for metabolites. Furthermore, it supports identifiers.org formatted URIs, further simplifying entering identifiers 13.
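As an illustration of identifiers.org formatted URIs, the Python sketch below composes a URI from a namespace and an accession. The helper name identifiers_org_uri is hypothetical, and the namespace strings and accessions shown are examples only; which namespaces the IMS actually resolves should be confirmed in the platform documentation.

```python
def identifiers_org_uri(namespace: str, accession: str) -> str:
    """Compose an identifiers.org style URI from a namespace and an accession."""
    namespace = namespace.strip()
    accession = accession.strip()
    if not namespace or not accession:
        raise ValueError("namespace and accession must both be non-empty")
    return f"http://identifiers.org/{namespace}/{accession}"


# Example inputs; the accessions are illustrative values.
gene_uri = identifiers_org_uri("ensembl", "ENSG00000105221")
pathway_uri_example = identifiers_org_uri("wikipathways", "WP1544")
print(gene_uri)
print(pathway_uri_example)
```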
Example queries
We demonstrate the platform with three example calls. All the API calls require use of an application ID and an application key. This key and ID can be acquired by creating a free Open PHACTS account. The first example is an application to the PI3K/AKT pathway for cell growth regulation, which contains important targets for cancer treatment 14. The AKT protein has a central role and usefully shows the ability of the /pathways/interactions/byEntity and /pathway/getInteractions calls to return connected elements. The API calls can aid drug discovery by taking a target, in this case AKT, and easily identifying other connected proteins that could potentially be used as drug targets with a common downstream effect.
Figure 1 shows the web interface of the API call that returns the connectivity of the AKT2 target to both upstream and downstream proteins or gene products. This method allows the user to identify connections to other targets in the pathway. The results of that API call (Figure 2) show the AKT2 interaction with microRNA. A helper method, /pathways/interactions/byEntity/count (Figure 3), is also included. It returns the number of all interactions in which an entity participates. This helps the user get a sense of how prevalent the queried entity is in the interactions of pathways found on WikiPathways. An example result for this query can be found in Supplementary Figure 1.
The other call implemented, /pathway/getInteractions (Figure 4), returns all interactions in a pathway, demonstrated here with the MicroRNAs in cardiomyocyte hypertrophy pathway 15. This pathway has interaction details for AKT, mTOR, and PI3K, which are all important targets in cancer research 16. For each interaction, the participants are given, along with whether the interaction is directed or undirected. An example result for this query can be seen in Supplementary Figure 2.
Example workflows
To demonstrate the basic use of the introduced API methods, we developed two workflows, available in the Supplementary Material. One uses Python to return a file with the results in a table, and the other is an HTML webpage that uses the ops.js JavaScript client library 17. More involved workflows have been developed for KNIME and Pipeline Pilot 18, 19.
The Python script example uses the Open PHACTS /pathway/getInteractions API call and prompts the user to enter the number of the WikiPathways pathway that they wish to query, such as 1544 for WikiPathways pathway WP1544. Invocation of the API call with the pathway identifier returns information about the directed interactions that are involved with the pathway. The information that is returned is the interaction ID used by WikiPathways, the interaction type, and URIs for the source and target of the interaction. To convert the URIs into something more readable, a SPARQL query is then executed against the WikiPathways SPARQL endpoint to get labels for the source and target of the interaction. The results are written to a file with the interaction ID, interaction type, URIs for the source and target, as well as alias IDs, the curl for the API call, the pathway ID used, and the number of interactions returned.
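The steps of this workflow can be outlined in Python without contacting any service. The sketch below is not the script from the Supplementary Material: the function names, the pathway URI scheme, the use of rdfs:label in the label query, the column names, and the example row are all assumptions made for illustration.

```python
import csv
import io


def pathway_uri(number: str) -> str:
    """Turn user input such as '1544' or 'WP1544' into a pathway URI (assumed scheme)."""
    text = number.strip().upper()
    if text.startswith("WP"):
        text = text[2:]
    if not text.isdigit():
        raise ValueError(f"not a pathway number: {number!r}")
    return f"http://identifiers.org/wikipathways/WP{text}"


def label_query(entity_uri: str) -> str:
    """Build SPARQL text asking for the labels of one entity (rdfs:label is assumed)."""
    return (
        "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n"
        f"SELECT DISTINCT ?label WHERE {{ <{entity_uri}> rdfs:label ?label }}"
    )


def write_table(rows, handle) -> None:
    """Write interaction rows as a tab-separated table with a header line."""
    writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
    writer.writerow(["interaction_id", "interaction_type", "source_uri", "target_uri"])
    writer.writerows(rows)


# Hypothetical row standing in for one interaction returned by the API call.
example_rows = [
    ("example-id", "inhibition", "http://example.org/source", "http://example.org/target"),
]

buffer = io.StringIO()  # stands in for the output file
write_table(example_rows, buffer)
print(pathway_uri("1544"))
print(label_query("http://example.org/source"))
print(buffer.getvalue())
```

In a full script, the pathway number would come from a user prompt, the rows from the API response, and the query text would be sent to the SPARQL endpoint to replace URIs with readable labels.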
The second example uses an HTML5 webpage and the ops.js JavaScript client library to retrieve interactions for a particular gene, using the URI for the gene’s Ensembl identifier and the /pathways/interactions/byEntity API method. The ops.js library passes the returned JSON with interaction information to a callback function, where the interacting source and target are extracted and the interacting entity determined. For each interacting entity, which may be a protein, RNA, or small compound, a call to the /pathways/interactions/byEntity/count method is made to return the number of interactions that entity has.
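The logic of the callback can be expressed in Python as a rough sketch; it does not use ops.js and is not the code from the Supplementary Material. The record layout and entity names are hypothetical, and count_fn is a stand-in for a call to the /pathways/interactions/byEntity/count method, here replaced by a local lookup with made-up numbers.

```python
from typing import Callable, Dict, List


def interacting_partners(records: List[Dict[str, str]], entity: str) -> List[str]:
    """Return, in order of first appearance, the entities on the other end
    of each interaction that involves `entity`."""
    partners: List[str] = []
    for rec in records:
        if rec["source"] == entity:
            other = rec["target"]
        elif rec["target"] == entity:
            other = rec["source"]
        else:
            continue  # interaction does not involve the queried entity
        if other not in partners:
            partners.append(other)
    return partners


def partner_counts(records: List[Dict[str, str]], entity: str,
                   count_fn: Callable[[str], int]) -> Dict[str, int]:
    """Look up an interaction count for every partner of `entity`."""
    return {p: count_fn(p) for p in interacting_partners(records, entity)}


# Hypothetical records and counts, for illustration only.
records = [
    {"source": "GENE", "target": "P1"},
    {"source": "R1", "target": "GENE"},
    {"source": "GENE", "target": "P1"},
    {"source": "P2", "target": "P3"},
]
fake_counts = {"P1": 4, "R1": 2}
counts = partner_counts(records, "GENE", lambda e: fake_counts.get(e, 0))
print(counts)
```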
Summary
While the calls described here are simple, workflow tools make it possible to take advantage of the integrative nature of the OPDP by making API calls in succession. Two such workflow tools that work with the OPDP are KNIME and Pipeline Pilot. With these tools, it is possible to perform a directional query of a target and identify alternative targets that can then be queried against the chemistry calls to identify active compounds for these alternative targets. The client libraries ops.js, ops4j, and ropenphacts also support Open PHACTS and the interaction calls for pathways. This allows users to perform API calls to the OPDP using their preferred language or platform, such as JavaScript, Java, or R.
The addition of interactions with direction information allows the OPDP to answer more of the pre-defined scientific questions 2. The directional information allows the user to explore how proteins and gene products are connected with one another and to easily access this information. This is illustrated in the example queries using the cancer target AKT.
Software availability
Online service: https://dev.openphacts.org/docs/2.1
Latest source code is available at: https://github.com/openphacts/OPS_LinkedDataApi
Archived source code of discussed version: https://doi.org/10.5281/zenodo.1068252 20
License: Apache License 2.0
Acknowledgments
Special thanks go to all members of the Open PHACTS project who provided the necessary platform.
Funding Statement
This work was supported by the Innovative Medicines Initiative Joint Undertaking [115191], resources of which are composed of financial contribution from the European Union’s Seventh Framework Programme (FP7/2007-2013) and EFPIA companies’ in kind contribution.
[version 2; referees: 2 approved]
Supplementary material
Supplementary File 1. Contains two additional figures that show example API call output and contains the workflows to demonstrate the use of the APIs.
References
- 1. Schreiber SL: Target-oriented and diversity-oriented organic synthesis in drug discovery. Science. 2000;287(5460):1964–1969. 10.1126/science.287.5460.1964
- 2. Azzaoui K, Jacoby E, Senger S, et al.: Scientific competency questions as the basis for semantically enriched open pharmacological space development. Drug Discov Today. 2013;18(17–18):843–852. 10.1016/j.drudis.2013.05.008
- 3. Williams AJ, Harland L, Groth P, et al.: Open PHACTS: semantic interoperability for drug discovery. Drug Discov Today. 2012;17(21–22):1188–1198. 10.1016/j.drudis.2012.05.016
- 4. Digles D, Zdrazil B, Neefs JM, et al.: Open PHACTS computational protocols for in silico target validation of cellular phenotypic screens: knowing the knowns. Medchemcomm. 2016;7(6):1237–1244. 10.1039/c6md00065g
- 5. Waagmeester A, Kutmon M, Riutta A, et al.: Using the Semantic Web for Rapid Integration of WikiPathways with Other Biological Online Data Resources. PLoS Comput Biol. 2016;12(6):e1004989. 10.1371/journal.pcbi.1004989
- 6. Gray AJG, Groth P, Loizou A, et al.: Applying linked data approaches to pharmacology: Architectural decisions and implementation. Semant Web. 2014;5(2):101–113. 10.3233/SW-2012-0088
- 7. Kelder T, Pico AR, Hanspers K, et al.: Mining biological pathways using WikiPathways web services. PLoS One. 2009;4(7):e6447. 10.1371/journal.pone.0006447
- 8. Chichester C, Digles D, Siebes R, et al.: Drug discovery FAQs: workflows for answering multidomain drug discovery questions. Drug Discov Today. 2015;20(4):399–405. 10.1016/j.drudis.2014.11.006
- 9. Luna A, Karac EI, Sunshine M, et al.: A formal mim specification and tools for the common exchange of mim diagrams: an xml-based format, an api, and a validation method. BMC Bioinformatics. 2011;12(1):167. 10.1186/1471-2105-12-167
- 10. Le Novère NL, Hucka M, Mi H, et al.: The systems biology graphical notation. Nat Biotechnol. 2009;27(8):735–741. 10.1038/nbt.1558
- 11. Kutmon M, Riutta A, Nunes N, et al.: WikiPathways: capturing the full diversity of pathway knowledge. Nucleic Acids Res. 2016;44(D1):D488–D494. 10.1093/nar/gkv1024
- 12. Croft D, Mundo AF, Haw R, et al.: The Reactome pathway knowledgebase. Nucleic Acids Res. 2014;42(Database issue):D472–D477. 10.1093/nar/gkt1102
- 13. Juty N, Le Novère N, Hermjakob H, et al.: Towards the collaborative curation of the registry underlying Identifiers.org. Database (Oxford). 2013;2013:bat017. 10.1093/database/bat017
- 14. Vanhaesebroeck B, Stephens L, Hawkins P: PI3K signalling: the path to discovery and understanding. Nat Rev Mol Cell Biol. 2012;13(3):195–203. 10.1038/nrm3290
- 15. Levels M, Hanspers K, Kutmon M, et al.: Micrornas in cardiomyocyte hypertrophy (Homo sapiens). 2017.
- 16. Li H, Zeng J, Shen K: PI3K/AKT/mTOR signaling pathway as a therapeutic target for ovarian cancer. Arch Gynecol Obstet. 2014;290(6):1067–1078. 10.1007/s00404-014-3377-3
- 17. Dunlop I, Willighagen E, Elblood AW, et al.: ops.js: Ops.js 7.1.0 for Open PHACTS 2.2 api. Zenodo. 2018. 10.5281/zenodo.167595
- 18. Berthold MR, Cebron N, Dill F, et al.: KNIME: The Konstanz Information Miner. In Studies in Classification, Data Analysis, and Knowledge Organization (GfKL 2007) Springer, 2007;319–326. 10.1007/978-3-540-78246-9_38
- 19. Ratnam J, Zdrazil B, Digles D, et al.: The application of the open pharmacological concepts triple store (Open PHACTS) to support drug discovery research. PLoS One. 2014;9(12):e115460. 10.1371/journal.pone.0115460
- 20. fundatureanu-sever, Kerber R, Soiland-Reyes S, et al.: openphacts/OPS_LinkedDataApi: Open PHACTS Linked Data API 2.1.0 (Version 2.1.0). Zenodo. 2016. 10.5281/zenodo.1068252