Author manuscript; available in PMC: 2021 Aug 1.
Published in final edited form as: Toxicol In Vitro. 2020 May 6;66:104877. doi: 10.1016/j.tiv.2020.104877

Applying knowledge-driven mechanistic inference to toxicogenomics

Ignacio J Tripodi 1,*, Tiffany J Callahan 2, Jessica T Westfall 3, Nayland S Meitzer 4, Robin D Dowell 5, Lawrence E Hunter 6
PMCID: PMC7306473  NIHMSID: NIHMS1594356  PMID: 32387679

Abstract

When considering toxic chemicals in the environment, a mechanistic, causal explanation of toxicity may be preferred over a statistical or machine learning-based prediction by itself. Elucidating a mechanism of toxicity is, however, a costly and time-consuming process that requires the participation of specialists from a variety of fields, often relying on animal models. We present an innovative mechanistic inference framework (MechSpy), which can be used as a hypothesis generation aid to narrow the scope of mechanistic toxicology analysis. MechSpy generates hypotheses of the most likely mechanisms of toxicity, by combining a semantically-interconnected knowledge representation of human biology, toxicology and biochemistry with gene expression time series on human tissue. Using vector representations of biological entities, MechSpy seeks enrichment in a manually curated list of high-level mechanisms of toxicity, represented as biochemically- and causally-linked ontology concepts. Besides predicting the canonical mechanism of toxicity for many well-studied compounds, we experimentally validated some of our predictions for other chemicals without an established mechanism of toxicity. This mechanistic inference framework is an advantageous tool for predictive toxicology, and the first of its kind to produce a mechanistic explanation for each prediction. MechSpy can be modified to include additional mechanisms of toxicity, and is generalizable to other types of mechanisms of human biology.

Keywords: computational toxicology, mechanistic inference, artificial intelligence, mechanistic toxicology, adverse outcome pathways

1. Introduction

Several recent computational methods have displayed excellent performance in predicting toxicity outcomes [1, 2, 3] of chemicals. Yet, to our knowledge, there is to date no computational approach that generates mechanistic hypotheses to answer why these chemicals elicit a toxic response. There is great value in understanding the mechanism of toxicity for a chemical that appears to elicit an adverse response. Novel small molecule development is one example, where a chemical that failed initial toxicological screenings could be assessed to evaluate the actual mechanism of toxicity, greatly reducing research time and expenses on subsequent candidates. The value of a mechanistic awareness of toxicity also applies to pharmacovigilance, when researching rare adverse effects of a drug in subsets of the population. The development of oncological chemotherapeutics is another example, where certain mechanisms of cytotoxicity can actually be desirable to eliminate different types of cancer cells. More importantly, the costs, time expenditure, and ethical concerns of animal models of toxicity make in vitro and in silico approaches an enticing alternative. We present a solution that combines gene expression assays and biomedical knowledge to address this gap and answer the why question.

We currently have access to a wealth of biological, chemical and medical structured information represented in domain-specific ontologies. These ontologies elucidate diverse relationships between their constituent entities, such as protein-protein interactions or the participation of enzymes (or chemicals) in biological processes at specific cellular compartments. The Gene Ontology [4, 5] (GO) is the most commonly used, and a variety of tools exist to seek enrichment of particular concepts [6, 7, 8] or even specific pathways [9]. Our knowledge of molecular biology and biochemistry, however, goes well beyond what is described in GO, and it can be complemented by other ontologies and public databases that describe interactions between enzymes and/or chemicals.

In the case of in vitro toxicology studies involving gene expression assays (also referred to as “transcriptomics”), the simplest approach is to take the results of differential transcriptomics analysis and seek GO concept enrichment to gain clues about the underlying biological behavior of the treated cells. This is sometimes useful, yet often provides a somewhat disconnected list of terms that vary in specificity. The GO enrichment approach could be expanded by seeking enrichment of particular pathways, among databases like the Kyoto Encyclopedia of Genes and Genomes [10] (KEGG) or Reactome [11]. However, the output is a list of possibly related biological pathways that are not necessarily tailored to the study’s domain (e.g. toxicology). Moreover, current pathway enrichment strategies do not take into account the sequential order in which the experimentally-significant expression changes occur.

The use of artificial intelligence to infer mechanistic behavior in biology or other disciplines is still at a nascent stage. Most of the computational work on biological mechanisms has been focused on their representation [12, 13], and studies aimed at elucidating a mechanism of toxicity have generally been targeted to specific compounds, or narrow classes of them. Prior work in seeking enrichment of adverse outcome pathways (AOPs [14], which are the most common representation for chemical-specific toxicity) was focused on targets like pulmonary fibrosis [15] or fatty liver [16]. Computational approaches are sometimes faced with skepticism, particularly as the techniques become increasingly “black box”-like [17]. A hypothesis generation tool that produces mechanistic narratives backed by curated existing knowledge, however, would be a more attractive alternative, since scientists can validate the plausible explanations offered. Moreover, scientists could highlight any potential mistakes and adjust our web of knowledge accordingly, so that the entire community benefits from it.

The idea of computationally-generated explanations is not new, as Schank [18] proposed a framework for this in 1986. At that time, the breadth of scope and lack of computational power resulted in explanation patterns remaining mostly as a theoretical exercise. However, we now have the tools to make it possible. Here we present MechSpy, a computational framework (Fig. 1) to produce mechanistic hypotheses of toxicity from in vitro assays. MechSpy uses a directed graph representation of our current knowledge of molecular biology and biochemistry. Taking a time series gene expression experiment as input, it generates a transparent narrative for each of the three most likely mechanisms of toxicity taking place, linking experimental events to different mechanism steps.

Figure 1: Overview of MechSpy’s mechanistic inference process.

The knowledge graph (a) of semantically-integrated ontologies and databases and the transcriptomics data (d), in dashed purple frames, are our inputs. After adding new edges to the graph by deductively closing it (b), MechSpy uses node2vec to generate dense vector embeddings of each node (c). We also perform differential expression analysis on the transcriptomics data (d), and obtain a list of the top N most significant changes in gene expression (e). Based on these changes, and using the embeddings for genes and all mechanism steps, MechSpy generates an enrichment score for each mechanism (f) and ranks the top-three with the highest scores. Using the original knowledge graph (a) and the significant genes across time (e), it then produces both a narrative (g) and a graphical explanation (h) for each of the three most enriched mechanisms.

Materials and methods

Toxicity mechanisms

Many mechanisms in biology can be represented as an ordered sequence of ontology concepts. After an extensive literature review, we curated a list of 11 high-level mechanistic toxicology descriptions, based on a mechanistic toxicology textbook [19], which we were able to represent as causally-linked concepts from the Gene Ontology (GO). After parsing the textual description of a mechanism, we listed the sequence of key events that would summarize it at a high level. For each of these events, we looked for the GO concept that most closely represented it. The mechanisms with their respective steps, which represent events at a molecular level that must follow a sequential order, are listed in Table 1. With these, we attempted to capture some of the most common ways a toxicological insult results in an adverse cellular outcome.

Table 1: Mechanisms of toxicity evaluated.

M1: Triggering of caspase-mediated apoptosis via release of cytochrome C
1. Positive regulation of mitochondrial membrane permeability GO:0035794
2. Positive regulation of release of cytochrome C from mitochondria GO:0090200
3. Caspase activation GO:0006919
4. Apoptotic DNA fragmentation GO:0006309
5. Intrinsic apoptotic signaling pathway in response to DNA damage GO:0008630
M2: ATP depletion from calcium homeostasis disruption, resulting in necrosis
1. Regulation of calcium ion transport GO:0051924
2. Positive regulation of cytosolic calcium ion concentration GO:0007204
3. Positive regulation of mitochondrial membrane permeability GO:0035794
4. Negative regulation of ATP biosynthetic process GO:2001170
5. Necrotic cell death GO:0070265
M3: Increased cytosolic calcium, resulting in calpain-mediated cytoskeletal damage
1. Regulation of calcium ion transport GO:0051924
2. Positive regulation of cytosolic calcium ion concentration GO:0007204
3. Calcium-dependent cysteine-type endopeptidase (calpain) activity GO:0004198
4. Microtubule severing GO:0051013
5. Necrotic cell death GO:0070265
M4: Xenobiotic-induced oxidative stress
1. Membrane lipid catabolic process (peroxidation) GO:0046466
2. Aldehyde oxidase activity GO:0004031
3. Oxidation-reduction process GO:0055114
4. Cellular response to redox state GO:0071461
5. Cellular response to oxidative stress GO:0034599
6. Regulation of oxidative stress-induced cell death GO:1903201
M5: Mitochondria-mediated toxicity by inhibition of electron transport chain
1. Negative regulation of mitochondrial electron transport, NADH to ubiquinone GO:1902957
2. Negative regulation of mitochondrial ATP synthesis coupled proton transport GO:1905707
3. Negative regulation of ATP biosynthetic process GO:2001170
4. Cellular response to reactive oxygen species GO:0034614
5. Mitochondrial DNA repair GO:0043504
M6: Inhibition of tissue repair by cell cycle disruption
1. Negative regulation of G0 to G1 transition GO:0070317
2. Negative regulation of cell cycle G2/M phase transition GO:1902750
3. Negative regulation of mitotic cell cycle GO:0045930
4. Positive regulation of cell cycle arrest GO:0071158
5. Positive regulation of apoptotic process GO:0043065
M7: Endoplasmic reticulum stress (by chemical or a metabolite covalently bound to proteins)
1. Endoplasmic reticulum unfolded protein response GO:0030968
2. Positive regulation of signal transduction GO:0009967
3. Positive regulation of protein folding GO:1903334
4. Positive regulation of chaperone-mediated protein folding GO:1903646
5. Response to endoplasmic reticulum stress GO:0034976
6. Positive regulation of endoplasmic reticulum stress-induced intrinsic apoptotic signaling pathway GO:1902237
M8: Triggering of estrogen receptor (ER) activity
1. Protein homodimerization activity GO:0042803
2. Estrogen response element binding GO:0034056
3. Estrogen receptor activity GO:0030284
4. Intracellular estrogen receptor signaling pathway GO:0030520
5. Cellular response to estrogen stimulus GO:0071391
M9: Triggering of aryl hydrocarbon receptor (AHR) activity
1. Aryl hydrocarbon receptor binding GO:0017162
2. Protein heterodimerization activity GO:0046982
3. Glutathione transferase activity GO:0004364
4. Glucuronosyltransferase activity GO:0015020
5. Negative regulation of cell cycle phase transition GO:1901988
M10: Triggering of androgen receptor (AR) activity
1. Protein dimerization activity GO:0046983
2. Androgen receptor binding GO:0050681
3. Androgen receptor signaling pathway GO:0030521
M11: Triggering of peroxisome proliferator-activated receptor gamma (PPAR-γ) alteration of fatty acid metabolism
1. Peroxisome proliferator activated receptor binding GO:0042975
2. Fatty acid binding GO:0005504
3. Positive regulation of fatty acid biosynthetic process GO:0045723
4. Negative regulation of fatty acid beta-oxidation GO:0031999
5. Positive regulation of lipid storage GO:0010884
6. Oxidative phosphorylation uncoupler activity GO:0017077
These mechanisms of toxicity were manually curated from the literature, and every mechanism step was represented using an ontology concept. The enrichment of each mechanism step is expected to happen following the sequential order in which they are described here.
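
To make the "ordered sequence of ontology concepts" idea concrete, the following minimal Python sketch transcribes two rows of Table 1 (M2 and M3) as ordered lists of GO identifiers. The variable and function names are illustrative assumptions and are not taken from the MechSpy code base.

```python
# Two mechanisms from Table 1, written as ordered lists of GO identifiers.
# The list order encodes the expected sequential order of the steps.
MECHANISMS = {
    "M2": ["GO:0051924", "GO:0007204", "GO:0035794", "GO:2001170", "GO:0070265"],
    "M3": ["GO:0051924", "GO:0007204", "GO:0004198", "GO:0051013", "GO:0070265"],
}


def shared_steps(mech_a, mech_b):
    """Return the GO concepts that two mechanisms have in common, in mech_a's order."""
    other = set(MECHANISMS[mech_b])
    return [go_id for go_id in MECHANISMS[mech_a] if go_id in other]


# M2 and M3 start with the same two calcium-related steps and end in necrotic cell death.
common = shared_steps("M2", "M3")
```

Overlaps of this kind mean that some mechanisms differ only in their intermediate steps.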

The curated mechanisms can be considered cell-focused, unlike AOPs, which go beyond the cellular scope and towards organ- and individual-specific responses. These mechanisms were reviewed by several members of the Toxicology department at the University of Colorado, Anschutz Medical Campus. We surveyed the literature for evidence of mechanistic explanations of toxicity for every compound used in the exposure assays we tested. The possible mechanism labels and literature sources for each of these chemicals are listed in Supplemental Table S1. MechSpy selects the three most likely high-level mechanisms of toxicity for each transcriptomics time series, and produces a putative explanation for each.

Knowledge graph

A knowledge graph (KG) is a more powerful tool to employ than any ontology by itself, as it provides much richer contextual information for any concept, and can uncover relations between entities that would be missed in separate ontologies. We extended a KG [20] (Fig. 1.a) that was generated by semantically integrating multiple open biomedical ontologies (OBOs [21]) and other sources of publicly available linked open data. An ontology is a formal representation of entities or concepts, and the relations between them. Ontologies employ a directed acyclic graph (DAG) representation, usually described as a list of “triples” (subject, predicate, object). The KG included concepts from the Gene Ontology [4, 5], Protein Ontology [22], Cell Ontology [23], Human Phenotype Ontology [24], Human Disease Ontology [25], and Chemical Entities of Biological Interest [26], among others, in a semantically-consistent fashion. Only human entities and the relations between them were included. Several public databases were also used to incorporate additional directed edges into the KG, e.g. the Comparative Toxicogenomics Database [27], Reactome [28], the STRING database [29], the AOP Wiki [30], the National Cancer Institute thesaurus [31], and UniProt [32].

This KG focuses on concepts directly related to human biology, rather than those involving model organisms. Fig. 2 illustrates the richness of information that can be extracted from these interconnected sources. The ELK reasoner [33, 34] was utilized to deductively close the KG (Fig. 1.b). ELK was specifically designed for Web Ontology Language EL profiles, easily handles reasoning of large ontologies, utilizes multiple processors, and leverages a consequence-based reasoning engine enabling it to derive inferences over ontology class hierarchies, object properties, and instances of ontology classes. This adds new edges for transitive relations where applicable. For example, the “DFFA” gene participates in the “Apoptosis-induced DNA fragmentation” pathway, and the “Apoptosis-induced DNA fragmentation” pathway has part in the “Apoptotic DNA fragmentation” biological process, so a new edge is added to represent “DFFA participates in the apoptotic DNA fragmentation biological process”. The graph was used both in its original form to generate the mechanistic narratives, and in its deductively-closed form to generate vector embeddings for its nodes.
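
The DFFA example can be mimicked with a toy rule over triples. The sketch below is only an illustration of the kind of edge that deductive closure adds. It does not use ELK or OWL semantics, and the predicate names `participates_in` and `part_of` are assumptions chosen for readability.

```python
# Toy triples (subject, predicate, object) taken from the DFFA example in the text.
triples = {
    ("DFFA", "participates_in", "Apoptosis-induced DNA fragmentation"),
    ("Apoptosis-induced DNA fragmentation", "part_of", "Apoptotic DNA fragmentation"),
}


def close_participation(edges):
    """Add (x, participates_in, z) whenever x participates_in y and y part_of z.

    The rule is applied repeatedly until no new edge appears.
    """
    closed = set(edges)
    changed = True
    while changed:
        changed = False
        for (x, p1, y) in list(closed):
            if p1 != "participates_in":
                continue
            for (y2, p2, z) in list(closed):
                if p2 == "part_of" and y2 == y:
                    new_edge = (x, "participates_in", z)
                    if new_edge not in closed:
                        closed.add(new_edge)
                        changed = True
    return closed


closed = close_participation(triples)
inferred = closed - triples  # edges that exist only in the closed graph
```

Keeping both `triples` and `closed` mirrors the use of the original graph for narratives and the closed graph for embeddings.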

Figure 2: Illustrative knowledge graph sample.

Example of how ontology concepts and database sources are interconnected in the KG extended from the PheKnowLator project [20].

Data sources

Since the KG is focused on human biology, we sought out publicly available time series datasets that used human cells, from different tissue types. The raw human microarray data (Fig. 1.d) used as input was sourced from the Open TG-Gates [35] database, the carcinoGENOMICS [36] project and Netherlands Toxicogenomics Centre project available in the diXa Data Warehouse [37], as well as other publicly available tobacco-related human microarray datasets [38, 39, 40, 41] in ArrayExpress [42] that used human nasal, buccal and bronchial epithelial tissue. These sources provided a variety of time series gene expression experiments, at different time points and using different human cell types. Specifically, we used samples from Open TG-Gates that featured 3 time points, 3 replicates for each condition and 3 doses of exposure (low, medium, and high, varying by each chemical). Other samples from Open TG-Gates were also chosen as canonical examples of toxicity mechanisms from Urs A. Boelsterli’s textbook [19]. For all other data sources, we used any time series which had at least two time points with significant differentially-expressed genes.

We used gene expression time series from diXa studies DIXA-002 (liver cells HepaRG and HepG2), DIXA-003 (kidney cells RPTEC/TERT1), DIXA-004 (lung epithelial cells BR200, BR234, BR259, BR234/CDK4, and BR234/p16), and DIXA-078 (HepG2 liver cells). Using the Limma [43] R package with robust multiarray average (RMA) background correction and normalization, the control and treatment replicates were contrasted to determine the most significant gene expression changes at each time point, based on a p-value cutoff of 0.05 (Fig. 1.e). The only exception was for the DIXA-004 lung epithelium samples, since only a single replicate per condition was available, so we had to rely on fold change to determine the most significant genes. We evaluated a total of 234 different exposure time series, out of which 221 resulted in the prediction of at least one mechanism of toxicity with sufficient statistical confidence. A total of 113 distinct chemicals were used in these time series, 85 of them with hypothesized mechanisms of toxicity as detailed in Supplemental Table S1 and Supplemental Table S4, at different doses of exposure to different tissue types.

Mechanistic inference

The inference step (Fig. 1.f) consisted of scoring each curated mechanism, based on how much the affected gene nodes in the KG (corresponding to the differential analysis findings) related to the GO nodes describing the mechanism steps. The three mechanisms of toxicity with the highest ranking were then presented as the most likely candidates. We used a latent vector representation (embedding) of each node in the KG, a popular tool in natural language processing applications [44], to determine mechanism enrichment scores. The idea of generating dense real number vector representations of language tokens can be applied to KGs, with techniques to generate embeddings for either nodes or edges. These are generated by models that predict the most likely node based on the context (neighboring) nodes. Thus, two nodes with a similar vector representation in semantic space are expected to have a closely related meaning. A recent review [45] provides a good summary of the current embedding models. One such algorithm to generate node embeddings from a graph is node2vec [46], which utilizes random walks from the node in question up to a number of hops away, with a bias for depth or breadth configurable by hyperparameters. This was our method of choice to create vector embeddings to represent the different kinds of entities contained in our graph (biological processes, genes, proteins, biochemical reactions, etc). Using an emphasis on breadth (--directed --dimensions 32 --q 3, for best performance with our deductively-closed graph) we created 32-dimensional vector representations of all 109,255 nodes (Fig. 1.c).
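
As a rough illustration of how the q hyperparameter biases a node2vec walk, the sketch below computes the second-order transition probabilities for one step on a small undirected toy graph. It is not the node2vec implementation. The return parameter p is set to 1 here as an assumption (the text only reports q), and the graph and function name are hypothetical.

```python
def transition_probabilities(adj, prev, cur, p=1.0, q=3.0):
    """Normalized node2vec-style probabilities for the step that follows prev -> cur.

    Unnormalized weights: 1/p to return to prev, 1 for neighbors of prev,
    and 1/q for nodes that are farther away from prev.
    """
    weights = {}
    for nxt in adj[cur]:
        if nxt == prev:
            weights[nxt] = 1.0 / p
        elif nxt in adj[prev]:
            weights[nxt] = 1.0
        else:
            weights[nxt] = 1.0 / q
    total = sum(weights.values())
    return {node: w / total for node, w in weights.items()}


# Toy undirected graph as adjacency sets.
adj = {
    "a": {"b", "c"},
    "b": {"a", "c", "d"},
    "c": {"a", "b"},
    "d": {"b"},
}

# The walk arrived at "b" from "a"; "d" is the only move away from a's neighborhood.
probs = transition_probabilities(adj, prev="a", cur="b")
```

With q = 3 the outward move to "d" is three times less likely than staying near "a", which is the breadth-oriented behavior described above.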

Cosine distance is an established way to compare vector embeddings, and an appropriate metric for this study, since we focused on similar hyperdimensional orientation to denote similar meaning. For each experimental time point j, MechSpy collected the embeddings for the (up to) 100 most significant genes. These embeddings were then averaged to obtain the centroid (gj) of all those gene vectors, which represented the significant expression changes, as a whole, at that time point (Fig. 3). From this centroid gj, MechSpy calculated the cosine distance to each mechanism step of every curated mechanism. In order to use this as a weight rather than a distance (such that a larger magnitude equals a stronger enrichment), it subtracted the cosine distance value from 1 and used the result (the cosine similarity) as an enrichment score.
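
The centroid and similarity computation can be sketched in a few lines of plain Python. The two-dimensional vectors below are made-up stand-ins for the 32-dimensional embeddings, and the function names are illustrative assumptions.

```python
import math


def centroid(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(component) / n for component in zip(*vectors)]


def cosine_similarity(u, v):
    """Dot product of u and v divided by the product of their Euclidean norms."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Toy embeddings for two significant genes at one time point.
gene_vectors = [[1.0, 0.0], [0.0, 1.0]]
g_j = centroid(gene_vectors)

# Toy embedding for one mechanism step.
step_vector = [1.0, 1.0]

cosine_distance = 1.0 - cosine_similarity(g_j, step_vector)
enrichment = 1.0 - cosine_distance  # equals the cosine similarity
```

Because only orientation matters, the centroid `[0.5, 0.5]` and the step vector `[1.0, 1.0]` are treated as identical in meaning despite their different lengths.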

Figure 3: Summary of the mechanistic inference architecture, showing the use of vector embeddings generated using node2vec, and our sequential order penalty scheme to find the score of a particular mechanism.

In this hypothetical example we have 3 experimental time points, and a mechanism composed of 5 causally-linked steps. For every time point, the n most significant gene changes (where 1 ≤ n ≤ 100) are averaged into a single vector. MechSpy then generates a preliminary enrichment vector ex,j, which consists of the cosine distance between that time point’s gene aggregation and each mechanism step, subtracted from 1 (i.e. the cosine similarity). The sequential penalty filter (in purple shades, dividing the 5 mechanism steps into 3 bins) gives less weight to mechanism steps that do not correspond to the time point in question, with an increasing penalty the farther away we are from our corresponding bin. Finally, the weighted enrichment vectors for each time point are combined such that the maximum score for each mechanism step is kept (ex), and the score for this mechanism is the median of these maximum values.

The sequential order in which the mechanism steps are described (and enriched) matters. If a set of genes is closely related to the last step of a mechanism, for example, we value their contribution at the last time points of our experimental time series more than at the first ones. Therefore, we devised a weighting scheme that prioritized the contributions to each mechanism step corresponding to the current time point (Fig. 3, in purple). Given a mechanism Mx = [mx0, mx1, ⋯ , mxi] with i steps, and an experimental time series T = [t0, t1, ⋯ , tj] with j time points, MechSpy segmented Mx into ∣T∣ bins. These bins were used for a sequential weighting scheme, to penalize genes that enrich mechanism steps out of proper sequential order. A vector wj of size ∣Mx∣ was used to weight enrichment scores at each time point. With bj as the bin index corresponding to the current time point, and bs as the bin index being evaluated, the weight for each bin was calculated as:

$$w_{b_j,b_s} = 1 - \frac{\mathrm{abs}(b_j - b_s)}{2\,|T|}$$

Thus, each of the values exi,j of an enrichment vector ex,j of ∣Mx∣ dimensions (Fig. 3, bottom), containing the enrichment score for each of the i steps in mechanism x at time point j, was calculated as:

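$$e_{x_i,j} = \left[1 - \mathrm{cosine\_distance}(g_j, m_{x_i})\right] w_{b_j,b_i} = \left[\frac{g_j \cdot m_{x_i}}{\lVert g_j \rVert \, \lVert m_{x_i} \rVert}\right] w_{b_j,b_i}$$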
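
A minimal sketch of the sequential penalty follows, assuming the weight takes the form 1 − abs(bj − bs) / (2∣T∣) and that mechanism steps are split into contiguous, near-equal bins. The text does not spell out the binning rule, so `bin_of_step` is a hypothetical choice, as are all function names.

```python
def bin_of_step(step_index, n_steps, n_timepoints):
    """Assign a step to one of n_timepoints contiguous bins (assumed binning rule)."""
    return step_index * n_timepoints // n_steps


def weight(b_j, b_s, n_timepoints):
    """Penalty weight that decreases with the distance between two bin indices."""
    return 1.0 - abs(b_j - b_s) / (2.0 * n_timepoints)


def weighted_enrichment(similarities, j, n_timepoints):
    """Scale each step's cosine similarity at time point j by its sequential weight."""
    n_steps = len(similarities)
    return [
        s * weight(j, bin_of_step(i, n_steps, n_timepoints), n_timepoints)
        for i, s in enumerate(similarities)
    ]


# Five steps and three time points, as in the hypothetical example of Fig. 3.
bins = [bin_of_step(i, 5, 3) for i in range(5)]

# With identical similarities, only the penalty differs between steps.
first_timepoint = weighted_enrichment([1.0] * 5, j=0, n_timepoints=3)
```

Under these assumptions the largest possible penalty leaves a weight above 0.5, so out-of-order enrichment is discounted rather than discarded.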

To test whether sequential order of enrichment improved our predictions, we pseudo-randomly shuffled time points for all time series (ensuring they were always in incorrect order), and recalculated all mechanism enrichment scores. The resulting accuracy was lower across all data stratifications, particularly for the top-scoring mechanism, which demonstrated the need for MechSpy’s sequential penalty scheme. Across all time series, the precision for incorrectly ordered ones was lower by 0.041, 0.040 and 0.054 for the top, top-2 and top-3 mechanisms, respectively.

Since the cosine similarity and weight penalty are bounded in [0,1], so is the enrichment score for each mechanism step. The maximum values of the enrichment vectors for each experimental time point j were then combined into a single overall enrichment vector ex for mechanism Mx (Fig. 3, bottom). The ex vector featured the highest scores achieved for each mechanism step. The final enrichment score for mechanism x (scoreMx) was the median of ex elements. This procedure was repeated for every curated mechanism. Thus, our set of predictions P was defined as:

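$$\mathrm{score}_{M_x} = \operatorname{med}_i \left\{ \max_j e_{x_i,j} \right\}$$

$$P := \underset{M_x \in M,\ |P| = 3}{\arg\max} \left\{ \mathrm{score}_{M_x} \mid p\_val_x \le 0.05,\ \forall x \right\}$$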

The three mechanisms with the highest score in P that were deemed statistically significant (see below) became the ones suggested as most likely to occur.
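
The aggregation described above (maximum per step across time points, then the median across steps, then a significance filter and ranking) can be sketched as follows. The numbers are invented for illustration, the function names are assumptions, and the additional filter on the median of the random simulations is omitted for brevity.

```python
from statistics import median


def mechanism_score(enrichment_by_timepoint):
    """Median over steps of the per-step maximum across time points.

    enrichment_by_timepoint: one list of step scores per time point.
    """
    best_per_step = [max(column) for column in zip(*enrichment_by_timepoint)]
    return median(best_per_step)


def top_predictions(scores, p_values, k=3, alpha=0.05):
    """Keep mechanisms with p-value <= alpha and return the k highest-scoring ones."""
    significant = [m for m in scores if p_values[m] <= alpha]
    return sorted(significant, key=lambda m: scores[m], reverse=True)[:k]


# Made-up weighted enrichment vectors: 3 time points x 5 mechanism steps.
e_x = [
    [0.9, 0.8, 0.3, 0.2, 0.1],
    [0.4, 0.5, 0.7, 0.6, 0.2],
    [0.1, 0.2, 0.3, 0.5, 0.8],
]
score = mechanism_score(e_x)

# Made-up scores and p-values for four mechanisms.
scores = {"M1": 0.7, "M2": 0.9, "M4": 0.8, "M5": 0.6}
p_values = {"M1": 0.001, "M2": 0.2, "M4": 0.001, "M5": 0.03}
ranked = top_predictions(scores, p_values)
```

In this toy case M2 has the highest score but is excluded because its p-value exceeds the cutoff.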

Statistical significance

In order to determine the statistical significance of these predictions, we evaluated how they would compare to a pseudo-random assortment of genes. To this end, we employed a bootstrap approach where MechSpy calculated the final score for each mechanism, using random draws of the same number of genes originally used at each time point. Given that all public microarray samples used the same chip (Affymetrix Human Genome U133 Plus 2.0), the genes were randomly drawn from all probes available in it, represented by their ontology concepts in the KG. After 1000 simulations, MechSpy obtained an empirical distribution of mechanism scores under equivalent experimental conditions, and compared the real mechanism score against this distribution and its median. The resulting p-value for each mechanistic prediction was therefore calculated as:

$$p\_val_x = \frac{\left(\#\,\text{simulated scores} \ge \mathrm{score}_{M_x}\right) + 1}{1001}$$

This empirical p-value, which at 1000 iterations has a lower bound of 9.99 × 10^−4, was used to discard mechanistic predictions that were not significant, based on a cutoff of 0.05. Mechanisms that scored below the median of random simulations were also dropped. The rest of the mechanisms were sorted in descending order by their final enrichment score to determine the three most likely predictions.
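
The empirical p-value can be sketched as below, reading it as the number of simulated scores at least as large as the observed one, plus one, divided by the number of simulations plus one. The `bootstrap_scores` helper is a hypothetical outline of the random-draw loop: sampling without replacement and the `score_fn` callback are assumptions, not details given in the text.

```python
import random


def empirical_p_value(observed, simulated):
    """(number of simulated scores >= observed, plus 1) / (number of simulations + 1)."""
    at_least_as_large = sum(1 for s in simulated if s >= observed)
    return (at_least_as_large + 1) / (len(simulated) + 1)


def bootstrap_scores(score_fn, probe_pool, genes_per_timepoint, n_sim=1000, seed=0):
    """Score n_sim random gene draws that match the gene count used at each time point."""
    rng = random.Random(seed)
    return [
        score_fn([rng.sample(probe_pool, n) for n in genes_per_timepoint])
        for _ in range(n_sim)
    ]


# If no simulated score reaches the observed one, the p-value hits its lower bound.
lower_bound = empirical_p_value(0.9, [0.1] * 1000)

# 49 of 1000 simulated scores at or above the observed score give 50/1001.
borderline = empirical_p_value(0.5, [0.6] * 49 + [0.4] * 951)
```

The added 1 in the numerator and denominator keeps the estimate from ever being exactly zero, which is why the lower bound is 1/1001.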

Performance evaluation

Chemicals can elicit different mechanisms of toxicity, which can depend on dosage and tissue type. For nearly half of the assays evaluated, we have more than one possible correct label of expected mechanisms. We thus decided to present the top-three most likely mechanisms, considering that MechSpy is a hypothesis generation aid. The existence of multiple correct labels, in addition to the lack of a real gold standard, made the evaluation difficult in the traditional terms of precision and recall. To calculate precision, a “positive” sample was defined as having at least one of the expected mechanisms among the top, top-two or top-three predictions generated.

We used the precision for all assays of the top-scoring mechanism, then among the top-two, and among the top-three, as a global performance metric. Across all the evaluated experiments, these values were 0.406, 0.611, and 0.709, respectively. The Results section presents these statistics across multiple dataset stratification strategies. This metric was applicable to any experiment for which we had at least one statistically significant prediction. In other words, assays for which the gene expression data was not sufficient to produce at least one mechanistic prediction were ignored. The datasets were then stratified by two criteria: chemicals with only one known mechanism of toxicity, and chemicals with two known mechanisms of toxicity. Taking into account that not all assays we tested were conducted at toxic doses, these precision values represent a lower bound estimate of performance. Therefore, a third stratification dimension used was to consider only assays at the highest exposure dose available, for each chemical and cell type combination. The results at high exposure doses likely reflect the most realistic performance of MechSpy.
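
The top-k precision described here can be sketched as follows. The four example assays are transcribed from rows of Table 2. The dictionary layout and function name are illustrative assumptions, and the resulting values describe this four-row toy subset only, not the full evaluation.

```python
def top_k_precision(assays, k):
    """Fraction of assays with at least one known mechanism among the top-k predictions.

    Assays without any significant prediction are ignored, as described in the text.
    """
    evaluated = [a for a in assays if a["predicted"]]
    hits = sum(1 for a in evaluated if set(a["predicted"][:k]) & set(a["known"]))
    return hits / len(evaluated)


# Four rows transcribed from Table 2 (ranked predictions, most likely first).
assays = [
    {"chemical": "acetaminophen 200 uM", "known": ["M2", "M4", "M5"], "predicted": ["M1", "M4"]},
    {"chemical": "aflatoxin B1 0.24 uM", "known": ["M4"], "predicted": ["M5", "M1", "M2"]},
    {"chemical": "benzyl alcohol 10 mM", "known": ["M1"], "predicted": ["M1", "M4", "M5"]},
    {"chemical": "doxorubicin 2 uM", "known": ["M5"], "predicted": ["M4", "M1", "M2"]},
]

precision_by_k = {k: top_k_precision(assays, k) for k in (1, 2, 3)}
```

By construction the value cannot decrease as k grows, which matches the ordering of the reported top, top-two and top-three precision.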

Mechanistic narratives

Another aspect of MechSpy’s novelty is that it is not simply a mechanism prediction tool: for each toxicity mechanism, MechSpy produces a narrative (Fig. 1.g) of a putative explanation. Using the non-deductively-closed version of the KG, for each of the top-three ranked mechanisms, MechSpy searched for paths connecting the most significant genes at each time point j with each of the mechanism steps. It prioritized those up/downregulated genes at the time points that corresponded to each mechanism step (based on our binning strategy described in Methods). Fig. 4 presents an example narrative generated for liver cells exposed to a 400 μM dosage of diclofenac sodium for up to 24 hours, for which “ATP depletion due to calcium homeostasis disruption” (M2) was predicted as one of the most likely mechanisms of toxicity.

Figure 4: Mechanistic narrative generated for a time series of diclofenac sodium exposure.

Example of a generated mechanistic narrative for a particular transcriptomics time series, generated by MechSpy.

This narrative was limited by default to paths from genes to mechanism steps no farther than two hops away in our KG (to focus on closely related entities), or more if intersecting at Reactome entities, such as pathways or biochemical reactions. This limitation can, however, be easily relaxed to include many other possible pathways. The mechanistic explanation also includes a list of suggested gene knockouts, to help experimentally validate these claims. The list of suggested knockouts is sorted by significance among all time points, and could help discover new genes with a key role in the enriched mechanism. A visualization of each putative mechanistic explanation (Fig. 1.h) is also presented as a network diagram. Fig. 5 shows the generated diagram corresponding to the same example narrative presented in Fig. 4.
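
The default two-hop restriction can be illustrated with a bounded breadth-first search over a directed adjacency list. This is a sketch under simplifying assumptions: the toy graph reuses the DFFA example from the Knowledge graph section, the function name is hypothetical, and the relaxation for paths that intersect at Reactome entities is not modelled.

```python
from collections import deque


def paths_within_hops(graph, source, target, max_hops=2):
    """Return all simple directed paths from source to target with at most max_hops edges."""
    paths = []
    queue = deque([[source]])
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == target and len(path) > 1:
            paths.append(path)
            continue
        if len(path) - 1 == max_hops:
            continue  # hop budget exhausted
        for neighbor in graph.get(node, ()):
            if neighbor not in path:  # keep paths simple (no revisits)
                queue.append(path + [neighbor])
    return paths


# Toy directed graph: gene -> Reactome pathway -> GO mechanism step.
graph = {
    "DFFA": ["Apoptosis-induced DNA fragmentation"],
    "Apoptosis-induced DNA fragmentation": ["GO:0006309"],
}

two_hop = paths_within_hops(graph, "DFFA", "GO:0006309", max_hops=2)
one_hop = paths_within_hops(graph, "DFFA", "GO:0006309", max_hops=1)
```

Raising `max_hops` corresponds to relaxing the limitation mentioned above, at the cost of many more candidate paths.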

Figure 5: Graphical mechanistic explanation example, generated by MechSpy.

The sequence of events should be followed top to bottom, left to right. Nodes in dark gray (leftmost column) are significant gene changes along the multiple time points (in order), those in light gray (middle) are the intermediate concepts (genes, pathways, etc), and purple rectangular nodes (rightmost column) represent the enriched mechanism steps.

Mechanistic explanation for M2 of Diclofenac (400 μM) Open TG-Gates [liver] (double circles indicate one or more genes are known to be active in this tissue type)

Experimental validation

We sought out experimental validation for two of the chemicals without a well-known mechanism of toxicity: adapin and chlorpromazine. The selection of chemicals and mechanistic hypotheses to test was based on mechanisms consistently predicted by MechSpy at all concentrations, or at least at the two higher concentrations on the tissue used in the time series. The MITO-ID kit from Enzo Life Sciences was used to assess mitochondrial-mediated toxicity. This kit measures a shift in mitochondrial membrane potential from excited membrane to depolarized membrane (both markers of mitochondrial toxicity), and compromised plasma membrane integrity (a marker of cell death). We validated our predictions of mitochondrial toxicity using HUH7, a hepatocyte-derived carcinoma cell line. These results were also confirmed on an additional hepatocyte cell line, HepG2, and further evaluated on an orthogonal cell line (HCT116, human colorectal carcinoma cells) to assess tissue specificity. For experimental details and outcomes in all cell lines, see Supplemental Experimental Details.

Results

From the time series assays (several hours apart) MechSpy has been able to predict the most likely mechanisms of toxicity, starting with a set of compounds for which a “canonical” mechanism has been established (based on a mechanistic toxicology textbook [19]). As a proof of concept, we utilized assays from Open TG-Gates that used chemicals with an established most likely mechanism in the literature [19], and for which we had at least two time points with any significantly up/down-regulated genes. For this subset of experiments, we could predict the canonical mechanism(s) as the top choice with a precision of 0.594, and of 0.812 among both the top-two and top-three. This subset of experiments included exposure concentrations that were most likely not toxic. Looking only at the highest dose used for each compound, our precision for the most likely mechanism of toxicity was 0.615, and a correct prediction was always present among the top two. With these encouraging results, we moved on to include the rest of the chemicals in Open TG-Gates for which we had 3 time points, 3 replicates and 3 exposure doses (low, medium and high, which varied substantially depending on the compound), as well as other public gene expression time series (diXa-002, diXa-003, diXa-004, diXa-078, E-MTAB-4740, E-MTAB-4742, E-MTAB-5157, and E-MTAB-5697).

For all experiments using chemicals with at least one known mechanism of toxicity in the scientific literature, one or more of our top-three predictions (from all eleven mechanisms) matched its expected label with a precision of 0.709. If we only considered the highest dose per chemical and cell type, this precision increased to 0.852. A sample of MechSpy’s predictions with their respective empirical p-values is shown in Table 2. Besides looking at the highest dose for each chemical and cell type, we further stratified the assays regarding whether there was only one known mechanism of toxicity, or two at most. The performance for these various ways to segment the public datasets is displayed in Table 3. Taking the sequential order of the mechanism steps into account when calculating the enrichment scores contributed to this performance. The accuracy obtained was also significantly better than a baseline estimated from random draws of three mechanisms, for every compound with known mechanisms of toxicity (Fig. 6).

Table 2:

MechSpy predictions for a subset of the time series evaluated.

| Chemical used | Dose | Known mechanisms | #1 Predicted mechanism | #2 Predicted mechanism | #3 Predicted mechanism |
|---|---|---|---|---|---|
| acetaminophen | 200 μM | M2, M4, M5 | M1 (6.99E-03) | M4 (8.99E-03) | N/A |
| acetaminophen | 1 mM | M2, M4, M5 | M4 (9.99E-04) | M5 (9.99E-04) | M1 (9.99E-04) |
| acetaminophen | 5 mM | M2, M4, M5 | M4 (9.99E-04) | M2 (9.99E-04) | M1 (9.99E-04) |
| aflatoxin B1 | 0.24 μM | M4 | M5 (9.99E-04) | M1 (9.99E-04) | M2 (9.99E-04) |
| aflatoxin B1 | 1.2 μM | M4 | M4 (9.99E-04) | M5 (9.99E-04) | M1 (9.99E-04) |
| aflatoxin B1 | 6 μM | M4 | M4 (9.99E-04) | M5 (9.99E-04) | M1 (9.99E-04) |
| benzyl alcohol | 10 mM | M1 | M1 (9.99E-04) | M4 (9.99E-04) | M5 (9.99E-04) |
| cyclosporin A | 1.2 μM | M1, M5 | M5 (9.99E-04) | M4 (1.40E-02) | M1 (9.99E-04) |
| cyclosporin A | 6 μM | M1, M5 | M5 (9.99E-04) | M1 (9.99E-04) | M2 (3.00E-03) |
| diclofenac | 16 μM | M2, M7 | M2 (9.99E-04) | M4 (6.99E-03) | M8 (7.99E-03) |
| doxorubicin | 2 μM | M5 | M4 (9.99E-04) | M1 (9.99E-04) | M2 (9.99E-04) |
| doxorubicin | 10 μM | M5 | M5 (9.99E-04) | M4 (9.99E-04) | M1 (9.99E-04) |
| glibenclamide | 4 μM | M2 | M4 (9.99E-04) | M5 (3.00E-03) | M1 (3.00E-02) |
| glibenclamide | 20 μM | M2 | M4 (9.99E-04) | M2 (9.99E-04) | M5 (2.00E-03) |
| imipramine | 100 μM | M1, M11 | M9 (7.99E-03) | M10 (2.00E-03) | M11 (2.00E-02) |
| isoniazid | 400 μM | M4 | M4 (9.99E-04) | M2 (9.99E-04) | M1 (9.99E-04) |
| isoniazid | 2 mM | M4 | M4 (9.99E-04) | M2 (9.99E-04) | M5 (9.99E-04) |
| isoniazid | 10 mM | M4 | M4 (9.99E-04) | M2 (9.99E-04) | M5 (9.99E-04) |
| rotenone | 2 μM | M5 | M5 (9.99E-04) | M1 (9.99E-04) | M9 (9.99E-04) |
| sulfasalazine | 30 μM | M4 | M5 (4.30E-02) | M2 (1.20E-02) | M3 (9.99E-04) |
| sulfasalazine | 150 μM | M4 | M4 (9.99E-04) | M5 (9.99E-04) | M2 (9.99E-04) |

Mechanistic inference results for 21 of the 234 time series evaluated (including the empirical p-value), that utilized chemicals with well-established mechanisms of toxicity at a variety of concentrations, some of which are examples in Boelsterli’s textbook [19]. The bolded predictions are those that match the known mechanisms for that chemical. The complete list of predictions for all chemicals evaluated is available in Supplemental Table S2 and Supplemental Table S3.

Table 3:

MechSpy performance across multiple levels of stratification.

| Stratification | # time series | Top Precision | Top-2 Precision | Top-3 Precision |
|---|---|---|---|---|
| All assays | 234 | 0.406 | 0.611 | 0.709 |
| Only one known mechanism | 123 | 0.358 | 0.569 | 0.610 |
| Only two known mechanisms | 94 | 0.404 | 0.617 | 0.809 |
| Highest dose per chemical/tissue type | 108 | 0.463 | 0.704 | 0.852 |
| Highest dose per chemical/cell type, only one known mechanism | 60 | 0.450 | 0.717 | 0.767 |
| Highest dose per chemical/cell type, only two known mechanisms | 40 | 0.400 | 0.625 | 0.950 |
| Same tissue type as the established mechanism (on any organism) | 143 | 0.434 | 0.629 | 0.713 |
| Same organism as the established mechanism (human, on any tissue type) | 111 | 0.423 | 0.586 | 0.712 |
| Same tissue type as the established mechanism (also human-based) | 71 | 0.479 | 0.606 | 0.704 |
| Assays using lung epithelial cells | 26 | 0.385 | 0.769 | 0.846 |
| Assays using HepaRG cells | 13 | 0.385 | 0.462 | 0.615 |
| Assays using nasal cells | 3 | 0.333 | 0.333 | 0.333 |
| Assays using buccal cells | 2 | 1.000 | 1.000 | 1.000 |
| Assays using bronchial cells | 1 | 1.000 | 1.000 | 1.000 |
| Assays using HepG2 cells | 18 | 0.278 | 0.500 | 0.556 |
| Assays using primary hepatocytes | 151 | 0.397 | 0.596 | 0.682 |
| Assays using primary kidney cells | 20 | 0.550 | 0.700 | 0.950 |

Precision for the top-three scoring mechanisms across the evaluated time series experiments. All time series evaluated utilized human cells.
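The top-k precision reported in Table 3 can be illustrated with a minimal Python sketch. It assumes, following our reading of the text, that a time series counts as correct when at least one of its k highest-ranked mechanisms appears among the known mechanisms for that chemical. The function name `top_k_precision` and the dictionary layout are hypothetical and are not taken from the MechSpy code base; the four example rows are copied from Table 2.

```python
from typing import Dict, List, Set


def top_k_precision(
    predictions: Dict[str, List[str]],
    known: Dict[str, Set[str]],
    k: int,
) -> float:
    """Fraction of time series whose top-k predictions include a known mechanism.

    Hypothetical helper: `predictions` maps a time series label to its
    mechanisms ranked by enrichment; `known` maps the same label to the
    set of literature-derived mechanisms.
    """
    hits = 0
    for series_id, ranked in predictions.items():
        # A series is a hit if any of its first k mechanisms is a known one.
        if set(ranked[:k]) & known[series_id]:
            hits += 1
    return hits / len(predictions)


# Four rows copied from Table 2 (ranked predictions and known mechanisms).
predictions = {
    "acetaminophen 5 mM": ["M4", "M2", "M1"],
    "aflatoxin B1 0.24 uM": ["M5", "M1", "M2"],
    "doxorubicin 2 uM": ["M4", "M1", "M2"],
    "glibenclamide 20 uM": ["M4", "M2", "M5"],
}
known = {
    "acetaminophen 5 mM": {"M2", "M4", "M5"},
    "aflatoxin B1 0.24 uM": {"M4"},
    "doxorubicin 2 uM": {"M5"},
    "glibenclamide 20 uM": {"M2"},
}

for k in (1, 2, 3):
    print(k, top_k_precision(predictions, known, k))
```

On this small subset the glibenclamide series only becomes a hit once the second-ranked mechanism is considered, which is the effect the Top-2 and Top-3 columns capture for the full set of 234 time series.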

Figure 6: Simulations of baseline precision.

Comparison of actual precision values (black dots, top) to baseline estimations from random mechanism draws (violin plots, bottom), for different segmentations of the data (see Table 3). For each chemical used in the public datasets with one or more known mechanisms of toxicity, we randomly drew three mechanisms of the eleven curated (without replacement) to simulate the top-three enrichments. The accuracy across all chemicals was then calculated, and the process was repeated 1000 times. The violin plots show the distribution of baseline accuracy scores from those 1000 runs.
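The random baseline described in this caption can be sketched in a few lines of Python. This is an illustrative reconstruction under stated assumptions, not the authors' script: the names `simulate_baseline` and `expected_hit_probability`, the fixed seed, and the example label sets are all hypothetical. The closed-form check uses the fact that three draws without replacement from eleven mechanisms miss all of a chemical's m known mechanisms with probability C(11 − m, 3) / C(11, 3).

```python
import random
from math import comb
from typing import List, Sequence, Set

# The eleven curated mechanisms, labelled M1..M11 as in the tables.
MECHANISMS = [f"M{i}" for i in range(1, 12)]


def simulate_baseline(
    known_sets: Sequence[Set[str]],
    n_runs: int = 1000,
    k: int = 3,
    seed: int = 0,
) -> List[float]:
    """Return one baseline accuracy per run, drawing k random mechanisms per chemical."""
    rng = random.Random(seed)  # fixed seed only so the sketch is reproducible
    scores = []
    for _ in range(n_runs):
        hits = 0
        for known in known_sets:
            draw = set(rng.sample(MECHANISMS, k))  # sampling without replacement
            if draw & known:
                hits += 1
        scores.append(hits / len(known_sets))
    return scores


def expected_hit_probability(n_known: int, k: int = 3, n_total: int = 11) -> float:
    """Chance that k random mechanisms include at least one of n_known labels."""
    return 1 - comb(n_total - n_known, k) / comb(n_total, k)


# Hypothetical label sets: 60 chemicals with one known mechanism, 40 with two.
known_sets = [{"M4"}] * 60 + [{"M1", "M5"}] * 40
scores = simulate_baseline(known_sets)
mean_baseline = sum(scores) / len(scores)
expected = 0.6 * expected_hit_probability(1) + 0.4 * expected_hit_probability(2)
print(round(mean_baseline, 3), round(expected, 3))
```

For a chemical with a single known mechanism the closed form gives 3/11 (about 0.273), and for two known mechanisms it gives 1 − 84/165 (about 0.491), which is a useful reference point when reading the violin plots.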

It is worth reiterating that many of these chemicals don’t act via a single mechanism of toxicity, so the accuracy for the strongest enrichment score is actually a lower-bound estimate. The actual mechanistic landscape is likely better represented by a combination of these top enriched mechanisms. The full list of mechanistic hypotheses for all time series is available in Supplemental Table S2. Some of the predictions can be linked to known toxicity endpoint organs. Clonidine, for example, has known cardiotoxicity issues [47] at high doses. This is not unusual with mitochondria-mediated toxicity, since cardiomyocytes are highly energy-demanding cells and this adverse mechanism results in a sharp decrease in ATP synthesis. While the strongest hypothesis is that clonidine triggers apoptosis via caspase release [48] (M1), MechSpy predicted it acted via mitochondrial-mediated toxicity (M5) with a higher score than M1 for all concentrations and cell types. Due to the different number of paths connecting mechanism steps to genes, it is natural to observe some over-representation of certain mechanistic predictions (like “M4”, oxidative stress). This is why we implemented an empirical p-value calculation, to ensure only those predictions with a score significantly different than a random observation would be considered. Every prediction listed in the result tables was below an empirical p-value threshold of 0.05 compared to a random distribution at matching conditions, as described in the Statistical Significance section.
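As an illustration of the empirical p-value idea mentioned above, the following Python sketch compares an observed enrichment score against scores obtained from random observations. The Statistical Significance section is not part of this excerpt, so the add-one estimator used here, the function name `empirical_p_value`, and the choice of 1000 random scores are assumptions rather than the documented procedure. We note only that the smallest value in the tables, 9.99E-04, equals 1/1001, which is what this estimator returns when none of 1000 random scores reaches the observed one.

```python
import random
from typing import Sequence


def empirical_p_value(observed: float, null_scores: Sequence[float]) -> float:
    """Add-one empirical p-value: (r + 1) / (n + 1).

    r is the number of random scores at least as large as the observed
    score, and n is the number of random scores. Assumed estimator, not
    necessarily the one implemented in MechSpy.
    """
    n_extreme = sum(1 for score in null_scores if score >= observed)
    return (n_extreme + 1) / (len(null_scores) + 1)


# Hypothetical null distribution: 1000 scores drawn from a standard normal.
rng = random.Random(0)
null_scores = [rng.gauss(0.0, 1.0) for _ in range(1000)]

# An observed score far above every random score yields the minimum p-value.
p_min = empirical_p_value(100.0, null_scores)
print(f"{p_min:.2E}")

# A prediction is retained only if it falls below the 0.05 threshold.
print(p_min < 0.05)
```

With this estimator the p-value can never be exactly zero, which would explain why so many entries in the result tables share the same minimum value.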

Some of the chemicals used in the public datasets don’t have a well-established mechanism of toxicity to date. We experimentally validated one of the predicted mechanistic hypotheses for two of those, adapin and chlorpromazine. MechSpy consistently predicted mitochondrial-mediated toxicity for the two highest concentrations, as one of the three most likely mechanisms. MechSpy’s prediction that these chemicals affect liver tissue was confirmed using HUH7 and HepG2 hepatocyte-derived carcinoma cells. Additionally, we exposed a human colorectal carcinoma cell line (HCT116) to these chemicals, to evaluate whether the toxicity was specific to the cell types in question. Fig. 7 shows validation of our predictions using HUH7 cells, and the remaining assay outcomes are available in Supplemental Experimental Details.

Figure 7: Experimental validation of MechSpy’s mechanistic prediction of mitochondrial toxicity for chlorpromazine and adapin.

Fluorescence intensities of the three potential-sensitive MITO-ID dyes for HUH7 hepatocytes after 24-hour exposure to chlorpromazine (a) and adapin (b). The bars corresponding to treated cells marked with an asterisk (*) present a p-value smaller than 0.05 when compared to the untreated cells using a t-test.

The validation assays showed a significant decrease in mitochondrial membrane potential and increased depolarization for HUH7 cells over prolonged exposure to 75 μM adapin (Fig. 7.a). This was also confirmed in a different hepatocyte cell line (HepG2, Supplemental Experimental Details). Adapin appeared to elicit a similar response in our control HCT116 cells, suggesting the mitochondrial-mediated toxicity may not be limited to hepatocytes. MechSpy’s prediction of mitochondrial toxicity was also verified for chlorpromazine, on HepG2 and particularly HUH7 cells where there was a significantly increased depolarization (Fig. 7.b) when exposed to an 8 μM concentration. In HCT116 cells there was a significant loss of mitochondrial membrane potential compared to vehicle control, yet not an increased membrane depolarization nor compromised membrane, suggesting chlorpromazine’s effects may be more tissue-specific. We acknowledge that the hepatocyte cell lines used for these experiments (HepG2 and HUH7) may not be the most metabolically active ones available, and as such should be considered a proof of concept.

We hope our mechanistic predictions for chemicals without an established mechanism of toxicity (summary in Table 4, full list of predictions in Supplemental Table S3) can help guide further experimental work to seek hypothesis validation. Some interesting patterns were observed, such as hydroxyzine being enriched for the estrogen receptor-mediated mechanism (M8) in the two higher concentrations on hepatocytes. This is particularly interesting for hydroxyzine, as it has been linked to teratogenicity in rat models [49]. Many of these compounds exhibit common mechanistic predictions at different concentrations. Labetalol-exposed hepatocytes appear to primarily display oxidative stress (M4) as the most likely mechanism at toxic concentrations. In lung epithelial tissue, urea also appears to present a primarily oxidative stress (M4) mechanism. Nifedipine is a calcium channel antagonist that is actually administered to counter the toxicity of several other drugs, thus it’s unsurprising that in most cases MechSpy didn’t detect a significant enough enrichment of toxicity mechanisms. At the highest concentration to which hepatocytes were exposed, however, MechSpy predicted mitochondrial-mediated toxicity (M5) as the most likely mechanism, and as the second most likely for treated kidney cells. The rare reported toxicity events associated with nifedipine are related to cardiotoxicity [50], which is consistent with mitochondria-mediated toxicity, since cardiomyocytes are highly energy-demanding cells and this adverse mechanism results in depletion of ATP.

Table 4:

MechSpy predictions for chemicals without an established mechanism of toxicity.

| Chemical | Cell Type | #1 Predicted mechanism | #2 Predicted mechanism | #3 Predicted mechanism |
|---|---|---|---|---|
| 1-Amino-2,4-dibromoanthraquinone | kidney | M4 (9.99E-04) | M5 (9.99E-04) | M2 (9.99E-04) |
| 2-Amino-3-methylimidazo(4,5-f)quinoline | HepaRG | M4 (9.99E-04) | M5 (9.99E-04) | M1 (9.99E-04) |
| 2-nitrofluorene | HepaRG | M4 (9.99E-04) | M1 (9.99E-04) | M5 (8.99E-03) |
| 4-Acetylaminofluorene | kidney | M4 (9.99E-04) | M2 (9.99E-04) | M5 (9.99E-04) |
| 4-Acetylaminofluorene | HepaRG | M5 (7.99E-03) | N/A | N/A |
| adapin | hepatocytes | M4 (9.99E-04) | M1 (9.99E-04) | M5 (9.99E-04) |
| beclomethasone dipropionate | lung epithelial | M5 (9.99E-04) | M4 (9.99E-04) | M1 (9.99E-04) |
| benzofuran | lung epithelial | M4 (9.99E-04) | M2 (9.99E-04) | M1 (9.99E-04) |
| benzoin | kidney | M4 (9.99E-04) | M1 (9.99E-04) | M5 (9.99E-04) |
| bromodichloromethane | kidney | M2 (9.99E-04) | M4 (9.99E-04) | M9 (9.99E-04) |
| chlorpromazine | hepatocytes | M4 (9.99E-04) | M1 (9.99E-04) | M5 (9.99E-04) |
| cimetidine | hepatocytes | M4 (9.99E-04) | M2 (9.99E-04) | M1 (4.00E-03) |
| dimethyl sulfoxide | lung epithelial | M4 (9.99E-04) | M5 (9.99E-04) | M2 (9.99E-04) |
| ethionine | hepatocytes | M4 (9.99E-04) | M5 (9.99E-04) | M2 (9.99E-04) |
| hydrazine dihydrochloride | HepaRG | M4 (9.99E-04) | M5 (9.99E-04) | M1 (9.99E-04) |
| hydroxyzine | hepatocytes | M9 (9.99E-04) | M8 (9.99E-04) | N/A |
| interleukin-6,-human | hepatocytes | M4 (9.99E-04) | M5 (9.99E-04) | M2 (9.99E-04) |
| ipratropium bromide hydrate | lung epithelial | M4 (9.99E-04) | M5 (9.99E-04) | M1 (9.99E-04) |
| labetalol | hepatocytes | M1 (9.99E-04) | M4 (9.99E-04) | M2 (1.10E-02) |
| nifedipine | HepG2 | M5 (9.99E-04) | M4 (9.99E-04) | M2 (9.99E-04) |
| nifedipine | kidney | M4 (9.99E-04) | M5 (9.99E-04) | M2 (9.99E-04) |
| nitrilotriacetic-acid | kidney | M2 (2.60E-02) | M9 (9.99E-04) | M10 (2.00E-03) |
| N-Ethyl-N-(2-hydroxyethyl)nitrosamine | kidney | M4 (9.99E-04) | M1 (9.99E-04) | M5 (2.00E-03) |
| ochratoxin-A | kidney | M4 (9.99E-04) | M5 (9.99E-04) | M2 (9.99E-04) |
| phthalic anhydride | lung epithelial | M4 (9.99E-04) | M5 (9.99E-04) | M2 (9.99E-04) |
| THS 2.2 | nasal epithelial | M8 (4.00E-03) | N/A | N/A |
| THS 2.2 | buccal epithelial | M4 (9.99E-04) | M5 (9.99E-04) | M1 (9.99E-04) |
| THS 2.2 | bronchial epithelial | M4 (9.99E-04) | M5 (9.99E-04) | M1 (9.99E-04) |
| TGFβ1 | hepatocytes | M4 (9.99E-04) | M1 (9.99E-04) | M5 (9.99E-04) |
| urea | lung epithelial | M4 (9.99E-04) | M5 (9.99E-04) | M2 (9.99E-04) |

This table summarizes the mechanisms predicted at the highest dose of exposure for each chemical on the indicated tissue type. We experimentally tested some of the MechSpy-generated hypotheses for two of these chemicals (adapin and chlorpromazine). The detailed list of predictions for all time series with their corresponding concentrations can be found in Supplemental Table S3.

Some mechanisms from the curated list may be hard to detect from gene expression alone. Such is the case of cell cycle disruption (M6), which may require much narrower intervals between experimental time points to be detected, or a different kind of assay altogether. The fact that this particular mechanism (M6) was missed for all compounds evaluated may indicate that either the long inter-sampling times or the assay types are inappropriate to identify it. As can be observed in Supplemental Table S2, 14% of all assays (33 out of 234) are cases where MechSpy failed to predict any of the expected mechanisms and the dose was the lowest one tested for that chemical. In fact, when stratifying by chemicals at the lowest available dose (only for those where multiple doses were tested), MechSpy failed to predict the expected mechanisms of toxicity in 43% of the experiments (33 out of 76). For these cases, the dose may not elicit any toxic response at all, or the changes in expression may be too subtle for MechSpy to detect them. Other chemicals may pose a relatively low risk of toxicity to humans. One example is coumarin, which has proven hard to predict accurately and may require a much larger dose to elicit the expected oxidative stress response, which was extrapolated from animal models (a complex task, since coumarin presents different mechanisms of toxicity in mice, rats and humans [19]).

Not all exposure experiments to toxic compounds resulted in the expected predictions. Dibenz[a,h]anthracene is a particular case, given it’s generally known to result in an adverse response after exposure to ultraviolet (UV) light, leading to phototoxicity. Our evaluation was performed on exposure assays to lung epithelial cells without UV, which could potentially elucidate a different mechanism of toxicity. For methyltestosterone-treated cells, the choice of tissue (hepatocytes) could be a reason these assays were not enriched for estrogen receptor-mediated mechanisms. A stratification of mechanistic predictions by the cell/tissue type reveals that, in general, experiments employing HepG2 cells result in lower performance than those using HepaRG or primary hepatocytes. While there are many factors at play (different chemicals, doses, different number of examples, etc) this could also be attributed to the known reduced metabolic capacity of this cell line [51], particularly concerning cytochrome P450 (CYP) enzymes.

A challenge we faced in this study was the availability of public datasets that used chemicals known to elicit every mechanism of toxicity we curated. Despite the lack of public gene expression datasets involving chemicals known to act via mechanisms mediated by the aryl hydrocarbon receptor (M9) or androgen receptor (M10), we still included them in the process to show that these were not incorrectly predicted among most of the top-three results. It’s worth noting that all datasets consisted of microarray transcriptomics assays, which don’t necessarily have the best dynamic range of signal readouts, therefore other types of experiments like RNA-seq would be preferred. Furthermore, there may be certain mechanisms of toxicity that can only be detected using other assays than gene expression which, in the end, is a steady-state measurement of mature RNA, rather than a point-in-time measurement of transcription. This could depend on the molecular characteristics of each chemical. The similarities between chemicals that resulted in successful mechanistic predictions, and those that did not, are shared in Supplemental Table S4 for future work.

An inherent challenge of any kind of mechanistic study is also the dependence on the correct choices of time points and dosage. Multiple concentrations of the exposure dose must be tested, as cells could respond via different mechanisms. Such is the case of cyclosporin A, which can elicit a toxic response via caspase-mediated apoptosis but, if the dose is high enough, oxidative stress becomes the main reason for tissue damage [19]. As was mentioned earlier, 14% of all experimental doses displayed in the results (Supplemental Table 1) are likely not high enough to elicit a toxic response. Unsurprisingly, the concentration of exposure was critical to start detecting strong enrichment of the expected mechanisms of some chemicals. Some illustrative examples were allopurinol, aspirin, coumarin, imipramine, and N-methyl-N-nitrosourea, for which MechSpy started predicting the expected mechanisms only at the highest concentration.

Discussion

We present in this study a framework that holds great potential to aid the hypothesis generation process of mechanistic toxicology. Combining data from two sources, experimental results and existing knowledge, presents the best of both worlds: this is neither a purely data-driven inference without regard to context, nor a purely semantic knowledge-based exercise of what is plausible. Given the economic, practical and ethical burden of using animal models to elucidate mechanisms of toxicity, MechSpy can also serve as a potential animal testing reduction or replacement tool. Its richness of knowledge sources makes it useful for data originating in other assays beyond gene expression, like proteomics, metabolomics, or chromatin accessibility. MechSpy has a direct application to both pre-clinical drug development and pharmacovigilance later on, to study rare side effects on subsets of the population. We demonstrate that, using a coarse transcriptomics time series after exposure to a known toxicant, our method can identify the most likely mechanisms among the three most strongly enriched.

The predictions generated by MechSpy are in agreement with the literature for many of the chemicals we tested at different doses (see Results), even when many of the expected mechanisms came from animal models across a variety of species, sometimes using different tissue types. Out of the 133 unique chemical/mechanism labels gathered from the literature and used as ground truth, 74 (56%) were determined using animal models (Supplemental Table 1). To illustrate how each prediction relates to the way the mechanisms of toxicity have been determined, additional figures are provided in Supplemental Figures SF1. We focused on the top-three most highly enriched mechanisms not only because this is a hypothesis generation aid, but also because in many cases there is no single mechanism of toxicity for a compound. Such is the case of acetaminophen, which has been known to elicit a toxic response via at least three different mechanisms [52]. Moreover, a compound’s mechanism of toxicity may depend on the dose of exposure. Some of the predicted mechanisms for those chemicals without an established mechanism of toxicity were validated experimentally, which shows that a mechanistic inference framework like MechSpy is a robust and practical tool.

We took a conservative approach building the knowledge graph, using only human-curated relations (i.e. edges), therefore it will be worth exploring the improvements that can be achieved by incorporating computationally-inferred edges. We also acknowledge that not all mechanisms of toxicity are necessarily linear pathways, and may feature branches or cycles in their representation. A modification of the presented algorithms to seek enrichment in mechanisms of these other configurations is a topic for future development. A natural next step to this work will be to seek enrichment from other types of time series than from gene expression data, and to use instead (or in combination with) nascent transcription, proteomics and metabolomics assays at matching time points. We could also apply MechSpy to a proper semantic representation of existing AOPs from AOPwiki or Effectopedia [53], rather than high-level molecular mechanisms of toxicity. Therefore, an integration between MechSpy and the AOP Ontology [54] would also be desirable to explore.

We envision that this mechanistic inference framework can be applied beyond the scope of toxicology, into any other discipline with a rich enough background knowledge represented using ontologies. The framework we present in this study can be extended to include other mechanisms as well, as long as they can be defined in terms of ontology concepts. The application of MechSpy goes beyond safety assessment and novel drug development, and can also be used to identify small molecules to be used as cancer therapeutics, based on their toxicity mechanism. Moreover, this mechanistic inference framework spans not just to other problems in molecular biology, but even to disciplines outside of the biomedical realm.

Supplementary Material

Supplemental Experimental Details. Additional details of the experimental validation of MechSpy-generated mechanistic hypotheses for adapin and chlorpromazine.

Supplemental Figures SF1. Additional analysis of MechSpy predictions, in relation to the types of chemicals utilized and whether their currently accepted mechanisms of toxicity have been determined using the same tissue type and organism.

Supplemental Table S1. Literature sources used to determine each mechanism label.

Supplemental Table S2. MechSpy predictions for all time series with two or more time points with significant gene expression changes, utilizing chemicals with compelling mechanistic explanations in the literature.

Supplemental Table S3. MechSpy predictions for chemicals for which there is no strong enough evidence of a particular mechanism of toxicity.

Supplemental Table S4. Categorization of all chemicals used in the time series.

Highlights.

  • Several mechanisms of cellular toxicity can be predicted using experimental time series data and a multidimensional vector representation of semantic knowledge graph concepts.

  • We present a computational framework that produces a putative mechanistic explanation for each of the most likely hypotheses, as an ordered list of gene expression changes and how they relate to each mechanism step.

  • Mechanisms of cellular toxicity can be represented using ontology concepts, and enriched taking the sequential order of events into account.

Acknowledgements

We would like to thank Dr. Jared Brown, Dr. Melanie Joy, Dr. Kristina Brooks, and Dr. Manisha Patel at the toxicology department of University of Colorado, Denver, Anschutz medical campus, for the insightful discussions on mechanistic toxicology and feedback on the curated mechanisms. We are also very grateful for Dr. Jared Brown’s and Dr. Kristofer Fritz’s help with obtaining HepG2 cells for our validation assays.

Funding

This work was funded in part by the Olke C. Uhlenbeck Graduate Fellowship (IJT), NIH T15LM009451 (TJC, LEH), NSF 1350915 (JTW), NIH GM125871 (RDD) and NIH R01LM008111 (LEH).

Footnotes

Software availability

All the code from MechSpy to process the samples and perform mechanistic inference is publicly available at https://github.com/ignaciot/MechSpy. The original version of the knowledge graph [20] that was extended for this study can be found at https://github.com/callahantiff/PheKnowLator/wiki.

Competing interests

One author (RDD) of this publication is a founder and scientific advisor for Arpeggio Biosciences.

Contributor Information

Ignacio J. Tripodi, University of Colorado, Computer Science / Interdisciplinary Quantitative Biology, Boulder, Colorado, 80309, USA

Tiffany J. Callahan, University of Colorado Anschutz Medical Campus, Computational Bioscience, Denver, Colorado, 80045, USA

Jessica T. Westfall, University of Colorado, Molecular, Cellular and Developmental Biology, Boulder, Colorado, 80309, USA

Nayland S. Meitzer, University of Colorado, Chemical Engineering, Denver, Colorado, 80309, USA

Robin D. Dowell, University of Colorado, Molecular, Cellular and Developmental Biology / Interdisciplinary Quantitative Biology, Boulder, Colorado, 80309, USA

Lawrence E. Hunter, University of Colorado Anschutz Medical Campus, Computational Bioscience / Interdisciplinary Quantitative Biology, Denver, Colorado, 80045, USA

References

  • [1]. Luechtefeld Thomas, Marsh Dan, Rowlands Craig, and Hartung Thomas. Machine Learning of Toxicological Big Data Enables Read-Across Structure Activity Relationships (RASAR) Outperforming Animal Test Reproducibility. Toxicological Sciences, 165(1):198–212, September 2018.
  • [2]. Mayr Andreas, Klambauer Günter, Unterthiner Thomas, and Hochreiter Sepp. DeepTox: Toxicity Prediction using Deep Learning. Frontiers in Environmental Science, 3, 2016.
  • [3]. Pu Limeng, Naderi Misagh, Liu Tairan, Wu Hsiao-Chun, Mukhopadhyay Supratik, and Brylinski Michal. eToxPred: a machine learning-based approach to estimate the toxicity of drug candidates. BMC Pharmacology and Toxicology, 20(1):2, January 2019.
  • [4]. Ashburner Michael, Ball Catherine A., Blake Judith A., Botstein David, Butler Heather, Cherry J. Michael, Davis Allan P., Dolinski Kara, Dwight Selina S., Eppig Janan T., Harris Midori A., Hill David P., Issel-Tarver Laurie, Kasarskis Andrew, Lewis Suzanna, Matese John C., Richardson Joel E., Ringwald Martin, Rubin Gerald M., and Sherlock Gavin. Gene Ontology: tool for the unification of biology, May 2000.
  • [5]. The Gene Ontology Resource: 20 years and still GOing strong. Nucleic Acids Research, 47(D1):D330–D338, January 2019.
  • [6]. Mi Huaiyu, Muruganujan Anushya, Ebert Dustin, Huang Xiaosong, and Thomas Paul D. PANTHER version 14: more genomes, a new PANTHER GO-slim and improvements in enrichment analysis tools. Nucleic Acids Research, 47(D1):D419–D426, January 2019.
  • [7]. Martin David, Brun Christine, Remy Elisabeth, Mouren Pierre, Thieffry Denis, and Jacq Bernard. GOToolBox: functional analysis of gene datasets based on Gene Ontology. Genome Biology, 5(12):R101, 2004.
  • [8]. Al-Shahrour Fátima, Díaz-Uriarte Ramón, and Dopazo Joaquín. FatiGO: a web tool for finding significant associations of Gene Ontology terms with groups of genes. Bioinformatics, 20(4):578–580, March 2004.
  • [9]. Tarca Adi Laurentiu, Draghici Sorin, Khatri Purvesh, Hassan Sonia S., Mittal Pooja, Kim Jung-sun, Kim Chong Jai, Kusanovic Juan Pedro, and Romero Roberto. A novel signaling pathway impact analysis. Bioinformatics, 25(1):75–82, January 2009.
  • [10]. Kanehisa Minoru and Goto Susumu. KEGG: Kyoto Encyclopedia of Genes and Genomes. Nucleic Acids Research, 28(1):27–30, January 2000.
  • [11]. Joshi-Tope G, Gillespie M, Vastrik I, D’Eustachio P, Schmidt E, de Bono B, Jassal B, Gopinath GR, Wu GR, Matthews L, Lewis S, Birney E, and Stein L. Reactome: a knowledgebase of biological pathways. Nucleic Acids Research, 33(suppl_1):D428–D432, January 2005.
  • [12]. Darden Lindley, Pal Lipika R., Kundu Kunal, and Moult John. The Product Guides the Process: Discovering Disease Mechanisms. In Danks David and Ippoliti Emiliano, editors, Building Theories: Heuristics and Hypotheses in Sciences, Studies in Applied Philosophy, Epistemology and Rational Ethics, pages 101–117. Springer International Publishing, Cham, 2018.
  • [13]. Darden Lindley, Kundu Kunal, Pal Lipika R., and Moult John. Harnessing formal concepts of biological mechanism to analyze human disease. PLOS Computational Biology, 14(12):e1006540, December 2018.
  • [14]. Ankley Gerald T., Bennett Richard S., Erickson Russell J., Hoff Dale J., Hornung Michael W., Johnson Rodney D., Mount David R., Nichols John W., Russom Christine L., Schmieder Patricia K., Serrano Jose A., Tietge Joseph E., and Villeneuve Daniel L. Adverse outcome pathways: A conceptual framework to support ecotoxicology research and risk assessment. Environmental Toxicology and Chemistry, 29(3):730–741, 2010.
  • [15]. Nymark Penny, Rieswijk Linda, Ehrhart Friederike, Jeliazkova Nina, Tsiliki Georgia, Sarimveis Haralambos, Evelo Chris T., Hongisto Vesa, Kohonen Pekka, Willighagen Egon, and Grafström Roland C. A Data Fusion Pipeline for Generating and Enriching Adverse Outcome Pathway Descriptions. Toxicological Sciences, 162(1):264–275, March 2018.
  • [16]. Bell Shannon M., Angrish Michelle M., Wood Charles E., and Edwards Stephen W. Integrating Publicly Available Data to Generate Computationally Predicted Adverse Outcome Pathways for Fatty Liver. Toxicological Sciences, 150(2):510–520, April 2016.
  • [17]. Castelvecchi Davide. Can we open the black box of AI? Nature News, 538(7623):20, October 2016.
  • [18]. Schank Roger C. Explanation Patterns: Understanding Mechanically and Creatively. Psychology Press, Hillsdale, N.J, 1st edition, October 1986.
  • [19]. Boelsterli Urs A. Mechanistic Toxicology: The Molecular Basis of How Chemicals Disrupt Biological Targets, Second Edition. CRC Press, Boca Raton, FL, 2nd edition, June 2007.
  • [20]. Callahan TJ. PheKnowLator, March 2019.
  • [21]. Smith Barry, Ashburner Michael, Rosse Cornelius, Bard Jonathan, Bug William, Ceusters Werner, Goldberg Louis J., Eilbeck Karen, Ireland Amelia, Mungall Christopher J., The OBI Consortium, Leontis Neocles, Rocca-Serra Philippe, Ruttenberg Alan, Sansone Susanna-Assunta, Scheuermann Richard H., Shah Nigam, Whetzel Patricia L., and Lewis Suzanna. The OBO Foundry: coordinated evolution of ontologies to support biomedical data integration. Nature Biotechnology, 25(11):1251–1255, November 2007.
  • [22]. Natale Darren A., Arighi Cecilia N., Blake Judith A., Bona Jonathan, Chen Chuming, Chen Sheng-Chih, Christie Karen R., Cowart Julie, D’Eustachio Peter, Diehl Alexander D., Drabkin Harold J., Duncan William D., Huang Hongzhan, Ren Jia, Ross Karen, Ruttenberg Alan, Shamovsky Veronica, Smith Barry, Wang Qinghua, Zhang Jian, El-Sayed Abdelrahman, and Wu Cathy H. Protein Ontology (PRO): enhancing and scaling up the representation of protein entities. Nucleic Acids Research, 45(D1):D339–D346, 2017.
  • [23]. Cell Ontology. http://www.obofoundry.org/ontology/cl.html, 2012. Accessed: 2019-05-01.
  • [24].Köhler Sebastian, Carmody Leigh, Vasilevsky Nicole, Jacobsen Julius O. B., Danis Daniel, Gourdine Jean-Philippe, Gargano Michael, Harris Nomi L., Matentzoglu Nicolas, McMurry Julie A., Osumi-Sutherland David, Cipriani Valentina, Balhoff James P., Conlin Tom, Blau Hannah, Baynam Gareth, Palmer Richard, Gratian Dylan, Dawkins Hugh, Segal Michael, Jansen Anna C., Muaz Ahmed, Chang Willie H., Bergerson Jenna, Laulederkind Stanley J. F., Yüksel Zafer, Beltran Sergi, Freeman Alexandra F., Sergouniotis Panagiotis I., Durkin Daniel, Storm Andrea L., Hanauer Marc, Brudno Michael, Bello Susan M., Sincan Murat, Rageth Kayli, Wheeler Matthew T., Oegema Renske, Lourghi Halima, Rocca Maria G. Della, Thompson Rachel, Castellanos Francisco, Priest James, Cunningham-Rundles Charlotte, Hegde Ayushi, Lovering Ruth C., Hajek Catherine, Olry Annie, Notarangelo Luigi, Similuk Morgan, Zhang Xingmin A., Gómez-Andrés David, Lochmüller Hanns, Dollfus Hélène, Rosenzweig Sergio, Marwaha Shruti, Rath Ana, Sullivan Kathleen, Smith Cynthia, Milner Joshua D., Leroux Dorothée, Boerkoel Cornelius F., Klion Amy, Carter Melody C., Groza Tudor, Smedley Damian, Haendel Melissa A., Mungall Chris, and Robinson Peter N.. Expansion of the Human Phenotype Ontology (HPO) knowledge base and resources. Nucleic Acids Research, 47(D1):D1018–D1027, January 2019. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [25].Schriml Lynn Marie, Arze Cesar, Nadendla Suvarna, Chang Yu-Wei Wayne, Mazaitis Mark, Felix Victor, Feng Gang, and Kibbe Warren Alden. Disease Ontology: a backbone for disease semantic integration. Nucleic Acids Research, 40(D1):D940–D946, January 2012. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [26].Hastings J, Owen G, Dekker A, Ennis M, Kale N, Muthukrishnan V, Turner S, Swainston N, Mendes P, and Steinbeck C. ChEBI in 2016: Improved services and an expanding collection of metabolites. Nucleic Acids Research, 44(D1):D1214–9, January 2016. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [27].Davis Allan Peter, Grondin Cynthia J., Johnson Robin J., Sciaky Daniela, McMorran Roy, Wiegers Jolene, Wiegers Thomas C., and Mattingly Carolyn J.. The Comparative Toxicogenomics Database: update 2019. Nucleic Acids Research, 47(D1):D948–D954, January 2019. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [28].Fabregat Antonio, Jupe Steven, Matthews Lisa, Sidiropoulos Konstantinos, Gillespie Marc, Garapati Phani, Haw Robin, Jassal Bijay, Korninger Florian, May Bruce, Milacic Marija, Roca Corina Duenas, Rothfels Karen, Sevilla Cristoffer, Shamovsky Veronica, Shorser Solomon, Varusai Thawfeek, Viteri Guilherme, Weiser Joel, Wu Guanming, Stein Lincoln, Hermjakob Henning, and D’Eustachio Peter. The Reactome Pathway Knowledgebase. Nucleic Acids Research, 46(D1):D649–D655, 2018. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [29].von Mering Christian, Huynen Martijn, Jaeggi Daniel, Schmidt Steffen, Bork Peer, and Snel Berend. STRING: a database of predicted functional associations between proteins. Nucleic Acids Research, 31(1):258–261, January 2003. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [30].AOP wiki. https://aopwiki.org/, 2019. Accessed: 2019-05-01. [Google Scholar]
  • [31].NCI Thesaurus. https://ncit.nci.nih.gov/ncitbrowser/, 2007. Accessed: 2019-05-01. [Google Scholar]
  • [32].UniProt: a worldwide hub of protein knowledge. Nucleic Acids Research, 47(D1):D506–D515, January 2019. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [33].Kazakov Yevgeny, Krötzsch Markus, and Simančík František. Concurrent Classification of EL Ontologies. Springer; Berlin Heidelberg, 2011. [Google Scholar]
  • [34].Kazakov Yevgeny, Krötzsch Markus, and Simančík František. The Incredible ELK. Journal of Automated Reasoning, 53(1):1–61, June 2014. [Google Scholar]
  • [35].Igarashi Yoshinobu, Nakatsu Noriyuki, Yamashita Tomoya, Ono Atsushi, Ohno Yasuo, Urushidani Tetsuro, and Yamada Hiroshi. Open TG-GATEs: a large-scale toxicogenomics database. Nucleic Acids Research, 43(D1):D921–D927, January 2015. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [36].Vinken Mathieu, Doktorova Tatyana, Ellinger-Ziegelbauer Heidrun, Ahr Hans-Jürgen, Lock Edward, Carmichael Paul, Roggen Erwin, van Delft Joost, Kleinjans Jos, Castell José, Bort Roque, Donato Teresa, Ryan Michael, Corvi Raffaella, Keun Hector, Ebbels Timothy, Athersuch Toby, Sansone Susanna-Assunta, Rocca-Serra Philippe, Stierum Rob, Jennings Paul, Pfaller Walter, Gmuender Hans, Vanhaecke Tamara, and Rogiers Vera. The carcinoGENOMICS project: Critical selection of model compounds for the development of omics-based in vitro carcinogenicity screening assays. Mutation Research Reviews in Mutation Research, 659(3):202–210, September 2008. [DOI] [PubMed] [Google Scholar]
  • [37].diXa Data Warehouse. http://wwwdev.ebi.ac.uk/fg/dixa/index.html, 2019. Accessed: 2019-05-01.
  • [38].Iskandar Anita R., Martin Florian, Leroy Patrice, Schlage Walter K., Mathis Carole, Titz Bjorn, Kondylis Athanasios, Schneider Thomas, Vuillaume Grégory, Sewer Alain, Guedj Emmanuel, Trivedi Keyur, Elamin Ashraf, Frentzel Stefan, Ivanov Nikolai V., Peitsch Manuel C., and Hoeng Julia. Comparative biological impacts of an aerosol from carbon-heated tobacco and smoke from cigarettes on human respiratory epithelial cultures: A systems toxicology assessment. Food and Chemical Toxicology, 115:109–126, May 2018. [DOI] [PubMed] [Google Scholar]
  • [39].Zanetti Filippo, Sewer Alain, Mathis Carole, Iskandar Anita R., Kostadinova Radina, Schlage Walter K., Leroy Patrice, Majeed Shoaib, Guedj Emmanuel, Trivedi Keyur, Martin Florian, Elamin Ashraf, Merg Céline, Ivanov Nikolai V., Frentzel Stefan, Peitsch Manuel C., and Hoeng Julia. Systems Toxicology Assessment of the Biological Impact of a Candidate Modified Risk Tobacco Product on Human Organotypic Oral Epithelial Cultures. Chemical Research in Toxicology, 29(8):1252–1269, August 2016. [DOI] [PubMed] [Google Scholar]
  • [40].van der Toorn Marco, Sewer Alain, Marescotti Diego, Johne Stephanie, Baumer Karin, Bornand David, Dulize Remi, Merg Celine, Corciulo Maica, Scotti Elena, Pak Claudius, Leroy Patrice, Guedj Emmanuel, Ivanov Nikolai, Martin Florian, Peitsch Manuel, Hoeng Julia, and Luettich Karsta. The biological effects of long-term exposure of human bronchial epithelial cells to total particulate matter from a candidate modified-risk tobacco product. Toxicology in Vitro, 50:95–108, August 2018. [DOI] [PubMed] [Google Scholar]
  • [41].Malinska Dominika, Szymanski Jedrzej, Patalas-Krawczyk Paulina, Michalska Bernadeta, Wojtala Aleksandra, Prill Monika, Partyka Malgorzata, Drabik Karolina, Walczak Jaroslaw, Sewer Alain, Johne Stephanie, Luettich Karsta, Peitsch Manuel C., Hoeng Julia, Duszynski Jerzy, Szczepanowska Joanna, van der Toorn Marco, and Wieckowski Mariusz R.. Assessment of mitochondrial function following short- and long-term exposure of human bronchial epithelial cells to total particulate matter from a candidate modified-risk tobacco product and reference cigarettes. Food and Chemical Toxicology, 115:1–12, May 2018. [DOI] [PubMed] [Google Scholar]
  • [42].ArrayExpress functional genomics data. https://www.ebi.ac.uk/arrayexpress/, 2019. Accessed: 2019-05-01.
  • [43].Ritchie Matthew E., Phipson Belinda, Wu Di, Hu Yifang, Law Charity W., Shi Wei, and Smyth Gordon K.. limma powers differential expression analyses for RNA-sequencing and microarray studies. Nucleic Acids Research, 43(7):e47–e47, April 2015. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [44].Mikolov Tomas, Chen Kai, Corrado Greg, and Dean Jeffrey. Efficient Estimation of Word Representations in Vector Space. arXiv:1301.3781 [cs], January 2013. arXiv: 1301.3781. [Google Scholar]
  • [45].Nelson Walter, Zitnik Marinka, Wang Bo, Leskovec Jure, Goldenberg Anna, and Sharan Roded. To Embed or Not: Network Embedding as a Paradigm in Computational Biology. Frontiers in Genetics, 10, 2019. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [46].Grover Aditya and Leskovec Jure. node2vec: Scalable Feature Learning for Networks. arXiv:1607.00653 [cs, stat], July 2016. arXiv: 1607.00653. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [47].Ahmad Syed A., Scolnik Dennis, Snehal Vala, and Glatstein Miguel. Use of naloxone for clonidine intoxication in the pediatric age group: case report and review of the literature. American Journal of Therapeutics, 22(1):e14–16, February 2015. [DOI] [PubMed] [Google Scholar]
  • [48].Fan Dan and Fan Ting-Jun. Clonidine Induces Apoptosis of Human Corneal Epithelial Cells through Death Receptors-Mediated, Mitochondria-Dependent Signaling Pathway. Toxicological Sciences, 156(1):252–260, March 2017. [DOI] [PubMed] [Google Scholar]
  • [49].King CTG and Howell Joyce. Teratogenic effect of buclizine and hydroxyzine in the rat and chlorcyclizine in the mouse. American Journal of Obstetrics and Gynecology, 95(1):109–111, May 1966. [DOI] [PubMed] [Google Scholar]
  • [50].Thorp John M., Spielman FJ, Valea Fidel A., Payne FG, Mueller RA, and Cefalo Robert C.. Nifedipine enhances the cardiac toxicity of magnesium sulfate in the isolated perfused Sprague-Dawley rat heart. American Journal of Obstetrics and Gynecology, 163(2):655–656, August 1990. [DOI] [PubMed] [Google Scholar]
  • [51].Westerink Walter M. A. and Schoonen Willem G. E. J.. Cytochrome P450 enzyme levels in HepG2 cells and cryopreserved primary human hepatocytes and their induction in HepG2 cells. Toxicology in Vitro, 21(8):1581–1591, December 2007. [DOI] [PubMed] [Google Scholar]
  • [52].Hinson Jack A., Roberts Dean W., and James Laura P.. Mechanisms of Acetaminophen-Induced Liver Necrosis. Handbook of experimental pharmacology, 196(196):369–405, 2010. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [53].Aladjov Hristo. Effectopedia, the online encyclopedia of adverse outcome pathways. https://www.effectopedia.org/, 2019. Accessed: 2019-05-01. [Google Scholar]
  • [54].Burgoon Lyle D.. The AOPOntology: A Semantic Artificial Intelligence Tool for Predictive Toxicology. Applied In Vitro Toxicology, 3(3):278–281, September 2017. [Google Scholar]

Supplementary Materials

1. Supplemental Experimental Details. Additional details of the experimental validation of MechSpy-generated mechanistic hypotheses for adapin and chlorpromazine.

2. Supplemental Figures SF1. Additional analysis of MechSpy predictions, in relation to the types of chemicals utilized and whether their currently accepted mechanisms of toxicity have been determined using the same tissue type and organism.

3. Supplemental Table S1. Literature sources used to determine each mechanism label.

4. Supplemental Table S2. MechSpy predictions for all time series with two or more time points with significant gene expression changes, utilizing chemicals with compelling mechanistic explanations in the literature.

5. Supplemental Table S3. MechSpy predictions for chemicals for which there is not strong enough evidence of a particular mechanism of toxicity.

6. Supplemental Table S4. Categorization of all chemicals used in the time series.
