Abstract
The ability to predict which chemicals are of concern for environmental safety is dependent, in part, on the ability to extrapolate chemical effects across many species. This work investigated the complementary use of two computational new approach methodologies to support cross-species predictions of chemical susceptibility: the US Environmental Protection Agency Sequence Alignment to Predict Across Species Susceptibility (SeqAPASS) tool and Unilever’s recently developed Genes to Pathways – Species Conservation Analysis (G2P-SCAN) tool. These stand-alone tools rely on existing biological knowledge to help understand chemical susceptibility and biological pathway conservation across species. The utility and challenges of these combined computational approaches were demonstrated using case examples focused on chemical interactions with peroxisome proliferator activated receptor alpha (PPARα), estrogen receptor 1 (ESR1), and gamma-aminobutyric acid type A receptor subunit alpha (GABRA1). Overall, the biological pathway information enhanced the weight of evidence to support cross-species susceptibility predictions. Through comparisons of relevant molecular and functional data gleaned from adverse outcome pathways (AOPs) to mapped biological pathways, it was possible to gain a toxicological context for various chemical-protein interactions. The information gained through this computational approach could ultimately inform chemical safety assessments by enhancing cross-species predictions of chemical susceptibility. It could also help fulfill a core objective of the AOP framework by potentially expanding the biologically plausible taxonomic domain of applicability of relevant AOPs.
1. Introduction
Historical toxicology approaches using in vivo vertebrate tests for chemical safety purposes are resource-intensive, ethically questionable, and time-consuming. This is a challenge because of the rapid rate at which new chemicals are developed. Therefore, new approach methodologies (NAMs), which by nature are designed to be non-animal-based methods, are needed to ensure chemicals are sufficiently evaluated for safety (R. Judson et al., 2009). New approach methodologies are often defined broadly to include any technologies and methodologies that circumvent the dependency on animal testing by using in silico, in chemico, and in vitro approaches, or any combination of these, to aid in the characterization of chemical hazard and risk assessment (European Chemicals Agency, 2016; USEPA, 2018; Van Der Zalm et al., 2022). Such NAMs are poised to give rise to a Next Generation Risk Assessment (NGRA) paradigm in line with the goals defined within the pivotal 2007 report by the National Research Council (NRC), “Toxicity Testing in the 21st Century: A Vision and a Strategy” (“Toxicity Testing in the 21st Century: A Vision and a Strategy,” 2007). Frameworks that use different NAMs as part of a weight of evidence approach for safety decision making demonstrate that the strategic combination of NAMs, including those developed for different purposes, may enhance the strengths and reduce the limitations of any individual approach (Baltazar et al., 2020; Middleton et al., 2022; Rajagopal et al., 2022).
An assumption of common species extrapolation approaches used in toxicology is that taxonomic relatedness confers similar susceptibility to chemicals. This assumption underlies the use of information on surrogate species to predict potential chemical hazards for other species (Perkins et al., 2013). A key approach to ecological toxicity testing relies on information from representative species as surrogates to represent all other species, typically within a related ecological taxonomic group, namely producers, primary consumers, and apical consumers (Colbourne et al., 2022; Spurgeon et al., 2020). It would not be possible, permissible, or desirable to perform toxicity tests on each species that may be exposed to an environmental contaminant. Therefore, NAMs must be thoroughly evaluated in accordance with essential elements of the scientific confidence framework (Van Der Zalm et al., 2022), and their domain of applicability must be defined, to support prediction and, ultimately, safety evaluations for a wide range of chemicals.
Computational methods within chemical risk assessment offer an efficient means to fill knowledge gaps to inform decisions on chemical safety and keep pace with the rate at which chemicals are being produced and entering the environment without the need for generating new in vivo (vertebrate) animal data (Krewski et al., 2020; Wittwehr et al., 2017; Zhang et al., 2018). Specifically, bioinformatics approaches for extrapolating chemical effects across species are critical with regard to environmental risk assessment (LaLone et al., 2018; Rivetti et al., 2020). With the wealth of new and existing biological information and the increasing availability of computational tools to analyze and interpret this information, the use of these tools in combination is likely to enhance cross-species extrapolation.
The objective of the project described herein was to demonstrate the value of combining two computational NAMs, the Sequence Alignment to Predict Across Species Susceptibility (SeqAPASS) tool (LaLone et al., 2016) and the Genes to Pathways – Species Conservation Analysis (G2P-SCAN) tool (Rivetti et al., 2023), which make use of available omics data across species to support predictions of chemical susceptibility. This is achieved through the contribution of multiple related lines of evidence associated with chemical effects on biological pathways and taxonomic relevance. The potential improvements arising from the combination of these computational NAMs in predicting chemical susceptibility across species could ultimately help inform chemical safety assessment.
The main computational means of assessing taxonomic relatedness is by comparing gene or protein sequence and structural similarity. The US EPA SeqAPASS (v6.1; https://seqapass.epa.gov/seqapass/) tool and the recently developed G2P-SCAN (v0.0.1.0) tool rely on this technique to make predictions of an organism’s susceptibility to chemical toxicity (in the case of SeqAPASS) or to estimate biological pathway conservation (in the case of both SeqAPASS and G2P-SCAN) across species. The SeqAPASS tool utilizes protein sequence information to extrapolate chemical susceptibility across the diversity of species where protein sequence information is available (LaLone et al., 2013, 2016), which expands the biological space in which predictions of potential toxicity are possible. G2P-SCAN offers a way to efficiently obtain biological pathway-level information from human gene inputs and to support inferences of pathway conservation across a subset of 7 species commonly used for chemical safety evaluation: humans (Homo sapiens), mice (Mus musculus), rats (Rattus norvegicus), zebrafish (Danio rerio), fruit flies (Drosophila melanogaster), roundworms (Caenorhabditis elegans), and yeast (Saccharomyces cerevisiae).
Pathway-based approaches to toxicity assessments have been demonstrated as viable alternative models to whole animals with regard to protecting human and ecological health (Perkins et al., 2013; Xia et al., 2020). The essentiality of the identified molecular targets with respect to their mapped biological pathways was estimated along with the toxicological relevance of the pathways themselves. This was done by using network analyses (Zotenko et al., 2008) in combination with adverse outcome pathways (AOPs), where available. The AOP framework has been vital in the evolution of environmental safety assessment (Ankley et al., 2010). A key element of the AOP framework is the taxonomic domain of applicability (tDOA), which defines the taxonomic space to which an AOP applies (Jensen et al., 2022). By comparing relevant AOP molecular and functional data with mapped Reactome pathways, it was possible to gain a toxicological context for the chemical-protein interaction and to consider the list of species where the mapped pathways are likely to be conserved when extending the biologically plausible tDOA of related AOPs. Overall, this work demonstrated that, by combining the use of SeqAPASS and G2P-SCAN, additional lines of evidence and consensus data can be generated computationally to support cross-species extrapolation of chemical toxicity knowledge and predict susceptibility.
2. Methods
2.1. Target identification and evaluation
Forty chemicals were selected for evaluation that covered a wide range of chemical classes, use categories, and modes/mechanisms of action (see Table 1). Most of these chemicals were pharmaceuticals and ingredients in personal care products with characterized biological activity, which made them useful for demonstrating the combined use of computational NAMs. The molecular targets for these 40 chemicals were identified by leveraging: 1) EPA high-throughput in vitro data, 2) ToxCast (Richard et al., 2016) bioactivity data, 3) structural data available through the Research Collaboratory for Structural Bioinformatics Protein Data Bank (RCSB PDB) (Berman et al., 2000), and 4) existing chemical activity data available in the literature through manual searches (step A of Figure 1). The EPA high-throughput in vitro data was accessed through version 2 of RefChemDB (v2; publication in progress), and transcriptional points of departure (tPODs) (J. Harrill et al., 2019; J. A. Harrill et al., 2021) for a subset of the 40 chemicals of interest were extracted (refer to R. S. Judson et al., 2019 for information on RefChemDB v1). ToxCast data was obtained by searching the CompTox Chemicals Dashboard (available online at https://comptox.epa.gov/dashboard/) using DSSTox substance ID (DTXSID) chemical identifiers. Additionally, the RCSB PDB (https://www.rcsb.org/) was used to search for protein-ligand crystallization data by using a combination of protein target gene symbols and relevant chemical identifiers (SMILES or InChI string). Lastly, a literature search was performed manually by using Google Scholar and Boolean strings containing keywords (e.g., gene names, chemical names, “mechanism of action”, “binding”, “activation”, “inhibition”, “molecular docking”, “site-directed mutagenesis”, etc.) to find experimental data on the specific chemical-target interactions. All databases used in this study were accessed in September 2022.
Table 1.
A summary of the 40 chemicals selected for evaluation. Most of the compounds are pharmaceuticals and ingredients in personal care products and had known or theorized mechanisms of action, which made them useful for exploring computational toxicology approaches.
| Name | DSSTox Substance ID | CAS | Use Class |
|---|---|---|---|
| Butylparaben | DTXSID3020209 | 94-26-8 | Cosmetic additive |
| Coumarin | DTXSID7020348 | 91-64-5 | Cosmetic additive |
| Ethylzingerone | NA | 569646-79-3 | Cosmetic additive |
| HC Red 3 | DTXSID2021236 | 2871-01-4 | Cosmetic additive |
| Oxybenzone | DTXSID3022405 | 131-57-7 | Cosmetic additive |
| Aspartame | DTXSID0020107 | 22839-47-0 | Food additive |
| Sucrose acetate isobutyrate | DTXSID5029340 | 27216-37-1 | Food additive |
| 2-Ethylhexanoic acid | DTXSID9025293 | 149-57-5 | Industrial compound |
| Dibutyl phthalate | DTXSID2021781 | 84-74-2 | Industrial compound |
| Diethyl phthalate | DTXSID7021780 | 84-66-2 | Industrial compound / cosmetic and pharmaceutical additive |
| BHT | DTXSID2020216 | 128-37-0 | Industrial compound / food additive |
| Benzophenone-4 | DTXSID2042436 | 4065-45-6 | Industrial compound / pharmaceutical additive |
| Paraquat | DTXSID3034799 | 4685-14-7 | Pesticide |
| 1,2-Octanediol | DTXSID9036646 | 1117-86-8 | Pharmaceutical |
| all-trans-Retinoic acid | DTXSID7021239 | 302-79-4 | Pharmaceutical |
| Azathioprine | DTXSID4020119 | 446-86-6 | Pharmaceutical |
| Benzocaine | DTXSID8021804 | 94-09-7 | Pharmaceutical |
| Cetirizine dihydrochloride | DTXSID2044268 | 83881-52-1 | Pharmaceutical |
| Cyclophosphamide | DTXSID5020364 | 50-18-0 | Pharmaceutical |
| Dexamethasone | DTXSID3020384 | 50-02-2 | Pharmaceutical |
| Diethylstilbestrol | DTXSID3020465 | 56-53-1 | Pharmaceutical |
| Digoxin | DTXSID5022934 | 20830-75-5 | Pharmaceutical |
| Doxorubicin hydrochloride | DTXSID3030636 | 25316-40-9 | Pharmaceutical |
| Furosemide | DTXSID6020648 | 54-31-9 | Pharmaceutical |
| Glybenclamide | DTXSID0037237 | 10238-21-8 | Pharmaceutical |
| Hydralazine hydrochloride | DTXSID1044645 | 304-20-1 | Pharmaceutical |
| Ketoconazole | DTXSID7029879 | 65277-42-1 | Pharmaceutical |
| Methotrexate | DTXSID4020822 | 59-05-2 | Pharmaceutical |
| Niacinamide | DTXSID2020929 | 98-92-0 | Pharmaceutical |
| Nitrofurantoin | DTXSID7020972 | 67-20-9 | Pharmaceutical |
| Oxytetracycline hydrochloride | DTXSID5021097 | 2058-46-0 | Pharmaceutical |
| Paracetamol | DTXSID2020006 | 103-90-2 | Pharmaceutical |
| Rosiglitazone | DTXSID7037131 | 122320-73-4 | Pharmaceutical |
| Sodium salicylate | DTXSID5021708 | 54-21-7 | Pharmaceutical |
| Sulforaphane | DTXSID8036732 | 4478-93-7 | Pharmaceutical |
| Thalidomide | DTXSID9022524 | 50-35-1 | Pharmaceutical |
| Topiramate | DTXSID8023688 | 97240-79-4 | Pharmaceutical |
| Valproic acid (VPA) | DTXSID6023733 | 99-66-1 | Pharmaceutical |
| Verapamil hydrochloride | DTXSID2034095 | 152-11-4 | Pharmaceutical |
| Caffeine | DTXSID0020232 | 58-08-2 | Pharmaceutical / food and cosmetic additive |
Figure 1.
Diagram showing a proposed approach of combining the use of Sequence Alignment to Predict Across Species Susceptibility (SeqAPASS) and Genes to Pathways – Species Conservation Analysis (G2P-SCAN) tools to support cross-species predictions of chemical effects. The key steps of this approach are labeled A-F and are described in detail in the corresponding methods section of this report. This approach is useful in extending inferences of pathway conservation across species. Additionally, the information derived through this approach could be used to both expand the biology and extend the taxonomic domain of applicability (tDOA) of existing adverse outcome pathways (AOPs) by identifying overlaps between mapped Reactome pathways and the molecular initiating events (MIEs), key events (KEs), and key event relationships (KERs) of relevant AOPs. Additional abbreviation in the figure: protein-protein interaction (PPI).
All potential targets were assigned to chemicals if there was strong literature evidence to support a direct interaction with the compound of interest. In this way, high-throughput screening data from either ToxCast or RefChemDB was used as additional support for selecting targets but was not used to assign targets in the absence of literature evidence. Direct interactions were considered as having evidence of binding or alteration of protein activity (e.g., catabolic, anabolic, transport, macromolecule binding, etc.) in a concentration-dependent manner. Interactions that were solely based on impacts on mRNA or protein expression were considered as indirect, and therefore in these cases it was determined that there was not enough evidence to consider the gene or protein a molecular target (see SI for a summary of target selections and justifications).
2.2. Evaluating molecular targets using Genes to Pathways – Species Conservation Analysis (G2P-SCAN) tool
Using R software (v4.2.0), the “Genes2Pathways” R package (v0.0.1.0) was used to evaluate each of the identified human targets (step B of Figure 1). G2P-SCAN obtains pathway information from the Reactome Knowledgebase (https://reactome.org/) via an application programming interface (API). Reactome serves as a collection of molecular information that describes biological processes and helps discover functional relationships between biological processes and outcomes (Jassal et al., 2020). The G2P-SCAN R package then extracts, synthesizes, and structures the data available from different databases (i.e., orthologues, protein families, entities, and reactions) linked to the genes in the identified pathways to substantiate the identification of conservation at the pathway level. Gene orthology across the selected species is retrieved via HumanMine (www.humanmine.org/) (Kalderimis et al., 2014; Smith et al., 2012) using the database Protein Analysis Through Evolutionary Relationships (Panther; www.pantherdb.org) (Mi et al., 2021). The comprehensive resource UniProt (www.uniprot.org) (The UniProt Consortium, 2019) was selected for mapping genes to proteins and is queried through G2P-SCAN API connections (Nightingale et al., 2017). Identified proteins are then queried through InterPro’s (www.interpro.org) API to map proteins to family classifications to provide functional annotation of conservation within G2P-SCAN (Blum et al., 2021). Finally, the Reactome data source is also used by G2P-SCAN to retrieve the numbers of associated entities and reactions per pathway identified in the analysis for each selected species (via Reactome Analysis and Content Services) (Jassal et al., 2020). The number of pathway entities and reactions, genes, proteins, and protein families are summarized as count values per identified pathway. The analysis parameters used for each target evaluation were as follows: use of all 7 model species (H. sapiens, R. norvegicus, M. musculus, D. rerio, D. melanogaster, C. elegans, and S. cerevisiae), analysis over Least Divergent Orthologs (LDOs; which represent nearly ‘equivalent’ gene pairs between different organisms based on a phylogenetic tree analysis; Mi et al., 2010), and limiting the analysis to terminal pathways (defined as those which have parent pathways, but no children in Reactome’s pathway hierarchy).
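The terminal-pathway criterion (a pathway with a parent but no children) can be illustrated with a minimal sketch. The hierarchy below is a hypothetical parent-to-children mapping with placeholder names, not data retrieved from Reactome, and `terminal_pathways` is an illustrative helper rather than a function of the Genes2Pathways package.

```python
def terminal_pathways(children_of):
    """Return pathways that have a parent but no children, sorted by name."""
    # Every pathway listed as someone's child has at least one parent.
    has_parent = {child for kids in children_of.values() for child in kids}
    # Keep those whose own list of children is empty or missing.
    return sorted(p for p in has_parent if not children_of.get(p))


# Hypothetical hierarchy (placeholder names, for illustration only).
hierarchy = {
    "Top-level pathway": ["Intermediate pathway"],
    "Intermediate pathway": ["Leaf pathway 1", "Leaf pathway 2"],
    "Leaf pathway 1": [],
    "Leaf pathway 2": [],
    "Orphan pathway": [],  # no parent, so not terminal under this definition
}

print(terminal_pathways(hierarchy))
```

Under this reading, a pathway with neither parent nor children is excluded, which follows the definition given in the text.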
To determine the likelihood of pathway conservation across species using the information gathered through G2P-SCAN, several factors were considered. The first was whether the molecular target had orthologs identified for any of the 6 query species (all species other than human, as humans are by default the reference species for this tool). The second was whether the protein families identified for the molecular target in any of the other 6 species were identical to the human protein families. The last was how significant the differences in overall count values of the pathway elements (proteins, protein families, entities, and reactions) were across species. This was assessed by performing a one-way analysis of variance (ANOVA) followed by Tukey’s honestly significant difference (HSD) post-hoc test using the combined count data (i.e., the count value for each pathway element was considered as a sample within each species group) from each pathway element with respect to each species (using humans as the reference group). Using this test procedure, pathways were considered significantly different at p-values less than 0.10.
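As a minimal sketch of the first step of this test, the one-way ANOVA F statistic can be computed from per-species count values as below. The counts are hypothetical and the function name is illustrative. Only the F statistic and its degrees of freedom are computed here; obtaining the p-value and running the Tukey HSD post-hoc comparison would require a statistics library (for example, SciPy's `f_oneway` and, in recent versions, `tukey_hsd`), which is an assumption about tooling and not the procedure used in this study.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of sample lists."""
    all_values = [v for g in groups for v in g]
    grand_mean = mean(all_values)
    # Between-group sum of squares: group size times squared mean offset.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-group sum of squares: squared deviation from each group's mean.
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical counts of proteins, protein families, entities, and reactions
# for one pathway in three species (values are illustrative only).
counts = {
    "H. sapiens": [12, 9, 30, 14],
    "D. rerio": [11, 9, 27, 12],
    "S. cerevisiae": [3, 2, 8, 4],
}

f_stat, df_b, df_w = one_way_anova_f(list(counts.values()))
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```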
2.3. Pathway prioritization
It was important to identify mapped pathways where the molecular target is most likely to play an essential role in the overall pathway function. Two methods were relied upon for prioritizing mapped Reactome pathways from G2P-SCAN for further analysis. The first method was to simply select pathways comprised of a low number of genes. It was decided that pathways with ≤ 10 genes would be considered priority where the molecular target would represent at least 1 out of 10 (or 10%) of the pathway coverage. The assumption was that any chemical perturbation of the molecular target is likely to be more consequential to the function of the mapped pathway with fewer genes. The gene lists associated with pathways that were prioritized in this way were used to identify the corresponding protein accessions and queried in SeqAPASS.
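The size-based rule can be sketched as a simple partition of the pathways that contain the target. The pathway names and gene lists below are hypothetical placeholders, and `split_by_size` is an illustrative helper, not part of G2P-SCAN.

```python
MAX_GENES = 10  # threshold stated in the text: pathways with <= 10 genes


def split_by_size(pathway_genes, target):
    """Partition pathways containing `target` into (small, large) name lists."""
    small, large = [], []
    for name, genes in pathway_genes.items():
        if target not in genes:
            continue  # only pathways the target maps to are considered
        (small if len(genes) <= MAX_GENES else large).append(name)
    return small, large


# Hypothetical mapped pathways (placeholder gene symbols).
pathways = {
    "Pathway A": ["TARGET", "G1", "G2"],
    "Pathway B": ["TARGET"] + [f"G{i}" for i in range(1, 25)],
    "Pathway C": ["G1", "G2"],
}

small, large = split_by_size(pathways, "TARGET")
print("Prioritized directly:", small)
print("Needs network analysis:", large)
```

Pathways in the second list are the ones that would go on to the network-based prioritization.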
For mapped pathways that were comprised of more than 10 genes, a second approach was used that relied on the use of protein-protein interaction (PPI) networks to better determine the biological significance of the molecular target within the pathway (step C of Figure 1). This was done using an analysis method external to the G2P-SCAN tool. A critical feature of biological networks is that essentiality correlates positively with centrality (Albert, 2005). Genes or proteins found at the core of interaction networks with high interconnectivity have been shown to be significant in determining phenotypic states (Jeong et al., 2001; Mutwil et al., 2009; Zotenko et al., 2008). Through this framework, the significance of genes and their protein products within mapped Reactome pathways could be inferred.
The latest version (v11.5) of a web-based tool called STRING (https://string-db.org/) was used to construct PPI networks (Szklarczyk et al., 2019) from the gene lists of all mapped Reactome pathways that were obtained via G2P-SCAN and comprised of more than 10 genes. The network edges represented the type of interaction evidence, and the minimum required interaction score was set to 0.4 on a scale of approximate confidence from zero to one with one representing approximately 100% confidence of a true association. These networks were then visualized using Cytoscape software (v3.9.1) where only reciprocal edges (i.e., edges that mutually linked two nodes: A-B, B-A) were used (Shannon et al., 2003).
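The two edge filters described here (a minimum interaction score of 0.4 and retention of reciprocal edges only) can be sketched as follows. The edge list is hypothetical, scores are assumed to be on the zero-to-one scale used in the text, and the function is an illustration and not a reproduction of the STRING or Cytoscape interfaces.

```python
def reciprocal_edges(edges, min_score=0.4):
    """Keep undirected pairs whose edge appears in both directions at or above min_score.

    `edges` is an iterable of (source, target, score) tuples.
    """
    kept = {(a, b) for a, b, score in edges if score >= min_score}
    # A pair survives only if the reverse direction also passed the filter.
    pairs = {tuple(sorted((a, b))) for a, b in kept if a != b and (b, a) in kept}
    return sorted(pairs)


# Hypothetical directed edge list with confidence scores.
edges = [
    ("A", "B", 0.9),
    ("B", "A", 0.9),
    ("A", "C", 0.5),  # no reverse edge, so dropped
    ("B", "C", 0.2),  # below the score threshold
    ("C", "B", 0.2),
]

print(reciprocal_edges(edges))
```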
Molecular Complex Detection (MCODE), available as an automated Cytoscape plugin, was used to perform a connectivity-based cluster analysis on the PPI networks (Bader & Hogue, 2003). MCODE calculates the degree of interconnectivity between each node of a network to identify densely connected regions making it a particularly useful algorithm for analyzing PPI networks. The highest scoring clusters from each network were used to infer the essentiality of the relevant molecular target. If a molecular target was found in the highest scoring cluster, then the proteins within the cluster were considered for SeqAPASS evaluation to help further define the plausible tDOA of the pathway (Jensen et al., 2022). To balance analysis efficiency with prediction confidence, if a molecular complex contained more than 10 proteins, then only the top ten scoring proteins (i.e., ten most interconnected nodes within the PPI network) that were also directly connected to the target protein were used in SeqAPASS evaluations (step D of Figure 1). This was a pragmatic decision made based on the lines of evidence indicating these proteins as being the most essential to the pathway.
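The final selection step (at most ten proteins that are directly connected to the target, ranked by interconnectivity) can be sketched as below. Node degree within the cluster is used here as a simple stand-in for the MCODE node score, which is an assumption made for illustration; MCODE's own scoring is based on local neighborhood density and is not reimplemented. The edge list and names are hypothetical.

```python
def top_neighbours(cluster_edges, target, limit=10):
    """Rank the target's direct neighbours by degree and return up to `limit`."""
    neighbours = {}
    for a, b in cluster_edges:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)
    direct = neighbours.get(target, set())
    # Highest degree first; ties broken alphabetically for reproducibility.
    ranked = sorted(direct, key=lambda n: (-len(neighbours[n]), n))
    return ranked[:limit]


# Hypothetical undirected edges of the highest scoring cluster.
cluster = [("T", "A"), ("T", "B"), ("A", "B"), ("A", "C"), ("C", "D")]

print(top_neighbours(cluster, "T"))
```

In this toy cluster, proteins C and D are excluded because they are not directly connected to the target T, regardless of their own connectivity.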
2.4. Target evaluation using the Sequence Alignment to Predict Across Species Susceptibility (SeqAPASS) tool
The SeqAPASS (v6.1; https://seqapass.epa.gov/seqapass/) tool was queried using the National Center for Biotechnology Information (NCBI) protein accessions for each of the identified human targets. Protein isoforms were selected (step E in Figure 1) by using a weighted decision tree approach (Figure 2). If a target had multiple known isoforms, they were prioritized by considering whether the isoform 1) is a known reference sequence, 2) is the longest sequence of the available isoforms within the reference species (which was humans, in this case), 3) is the most recently modified sequence, and 4) is the first version of that sequence (i.e., isoform “a” or isoform 1). These last two factors were incorporated to help ensure that the sequence information of the isoform is the most current and common. Additional factors for isoform selection can be considered after evaluation by SeqAPASS. For instance, if the selected isoform yielded a Level 1 susceptibility cutoff at 100% similarity, this would suggest that the cross-species protein alignments were not ideal, and a different isoform should be evaluated. Other factors could supersede this weighted approach if, for example, there is known evidence of a chemical interaction with a specific protein isoform.
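One way to express the four prioritization criteria is as a lexicographic sort key, sketched below. Treating the criteria as strictly ordered (each one only breaking ties in the previous one) is an assumption; the actual weighting is defined by the decision tree in Figure 2. The accessions and field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Isoform:
    accession: str       # hypothetical identifier
    is_reference: bool   # criterion 1: known reference sequence
    length: int          # criterion 2: sequence length in residues
    last_modified: str   # criterion 3: ISO date, so string order is date order
    isoform_number: int  # criterion 4: 1 corresponds to isoform "a" or 1


def pick_isoform(isoforms):
    """Return the isoform ranked highest under the assumed criterion order."""
    return max(
        isoforms,
        key=lambda i: (i.is_reference, i.length, i.last_modified, -i.isoform_number),
    )


candidates = [
    Isoform("ISO_X", False, 500, "2022-01-01", 1),
    Isoform("ISO_Y", True, 468, "2021-05-01", 2),
    Isoform("ISO_Z", True, 468, "2022-03-01", 1),
]

print(pick_isoform(candidates).accession)
```

A manual override, such as known evidence of a chemical interaction with a specific isoform, would sit outside this ranking.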
Figure 2.
Weighted decision tree for selecting protein targets amongst multiple isoforms for SeqAPASS (LaLone et al., 2016) evaluations.
When isoforms for each target were determined, SeqAPASS Level 1 evaluations were performed (step F of Figure 1) using the default primary report settings: E-value ≤ 0.01, common domains = 1, sorted by the class taxonomic group, and species read-across for susceptibility predictions were set to “Yes.” SeqAPASS determines susceptibility by using a cut-off of percent similarity between protein isoforms across species. By default, SeqAPASS will set this cut-off to the percent similarity of the first identified ortholog candidate that is either equal to or higher than the percent similarity of the first local minimum of the distribution of the percent similarities calculated for each hit (LaLone et al., 2016). The susceptibility cut-off was adjusted where the default ortholog used for driving the percent similarity cutoff was a partial sequence of or not similar to the query accession in terms of protein annotation (see SI for a summary of the SeqAPASS evaluation results for all identified targets).
Level 2 evaluations were performed by first identifying conserved functional domains of the query protein using the NCBI Conserved Domain database (https://www.ncbi.nlm.nih.gov/cdd/). Conserved domains were selected for evaluation that were supported by evidence of direct interaction with the chemical stressor of interest (i.e., any of the 40 priority chemicals described above) or by showing evidence of interactions with similar compounds. For instance, if there was a conserved ligand binding site, this domain was often selected in cases where the chemical of interest was shown to mimic the action of known ligands or has been shown to competitively bind to the protein. The default primary report settings were used for Level 2 evaluations: E-value ≤ 10.0, sorted by the class taxonomic group, and species read-across for susceptibility predictions were set to “Yes.”
To perform a Level 3 evaluation, information on the chemical-protein interaction with respect to individual amino acid residues was needed. This information was obtained by relying on published studies that utilized techniques such as site-directed mutagenesis, molecular docking, crystallography, quantitative structure-activity relationships (QSAR), or gave evidence of mutational resistance. A useful first step was to search the RCSB Protein Data Bank (PDB) (https://www.rcsb.org/) by protein target name and filter for the chemical of interest to find structural data on chemical-bound crystal structures. In cases where there was no clear experimental evidence of key amino acids mediating a chemical-protein interaction, residues were selected for Level 3 evaluations where there were chemical interactions such as hydrogen bonding and stacking shown within the PDB crystal structure (SI). In general, humans were used as the query species for SeqAPASS evaluations in this study to maintain consistency with G2P-SCAN, which currently cannot extrapolate from any species other than humans. Also, there is often more reliable protein sequence and structural data available for humans.
2.5. SeqAPASS and G2P-SCAN comparisons
One way in which G2P-SCAN helps infer pathway conservation across 7 species is by the identification of orthologs using PantherDB (Mi et al., 2010). Ortholog species identified by G2P-SCAN were compared to susceptibility predictions by mapping them to the SeqAPASS output, which contains hundreds of species. Any consensus in the selection of an ortholog and a susceptibility call provides additional support for the conservation of those proteins across species. When considering the consensus across multiple proteins within a mapped biological pathway, this inference of conservation can be extrapolated to the pathway itself.
By using the full reports (i.e., as opposed to primary reports, full reports include all sequence alignment results) from SeqAPASS Level 1 evaluations, the data was filtered to include only the results for the 7 model organisms used by G2P-SCAN. The percent overlap of species with a “Yes” susceptibility call by SeqAPASS and determined as least divergent orthologs by G2P-SCAN was calculated for each of the 89 successfully mapped targets. The heatmaply (v1.3.0) and VennDiagram (v1.7.3) packages in R were used to make heatmaps and Venn diagrams to visualize these comparisons.
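A minimal sketch of the per-target comparison is given below. The text does not state the denominator of the percent overlap; here it is assumed to be the number of model species flagged by either tool, which is one plausible reading and should be treated as an assumption. The susceptibility calls and ortholog set are hypothetical.

```python
MODEL_SPECIES = [
    "Homo sapiens", "Mus musculus", "Rattus norvegicus", "Danio rerio",
    "Drosophila melanogaster", "Caenorhabditis elegans",
    "Saccharomyces cerevisiae",
]


def percent_overlap(seqapass_calls, ldo_species, model_species=MODEL_SPECIES):
    """Percent of species flagged by either tool that are flagged by both."""
    susceptible = {s for s in model_species if seqapass_calls.get(s) == "Yes"}
    orthologs = set(ldo_species) & set(model_species)
    either = susceptible | orthologs
    if not either:
        return 0.0  # neither tool flagged any model species
    return 100.0 * len(susceptible & orthologs) / len(either)


# Hypothetical results for one target.
calls = {
    "Mus musculus": "Yes",
    "Danio rerio": "Yes",
    "Saccharomyces cerevisiae": "No",
}
ldos = {"Mus musculus", "Rattus norvegicus"}

print(f"{percent_overlap(calls, ldos):.1f}% overlap")
```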
2.6. Pathway-level species susceptibility predictions
Mapped Reactome pathways were prioritized for further evaluation if the molecular target was a member of a pathway consisting of 10 or fewer proteins or if the target was identified within the highest scoring molecular complex (based on network connectivity) of the pathway’s PPI network. The full list of genes that comprised the prioritized pathways with 10 or fewer proteins were used for SeqAPASS evaluations or in the case of prioritized pathways with greater than 10 proteins, only the top ten highest scoring proteins (i.e., highest connectivity scores) found within the highest scoring molecular complexes (that were also directly connected to the target protein) were used. Level 1 evaluations were performed on each protein associated with the pathway. Level 2 and Level 3 evaluations were performed only on the target protein where a chemical-protein interaction was known and, in the case of Level 3, if key amino acid residues had been identified in the published literature.
Once each protein within the prioritized pathway or within the highest scoring molecular complex (that contained the molecular target) was evaluated by SeqAPASS, the common susceptible species across each of these protein lists were merged. The orthologs identified by G2P-SCAN were then identified from the susceptible-species lists to further support the conservation of those pathways across the identified susceptible species. Additional lines of evidence to support these predictions of pathway conservation were added by assessing whether the orthologs identified by G2P-SCAN also had shared functional protein families, and by comparing the counts of pathway proteins, protein families, reactions, and entities across those species. In this way, the outputs of G2P-SCAN and SeqAPASS provide multiple related lines of evidence to support the prediction of pathway conservation in those identified species.
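The merging step can be sketched as a set intersection across the per-protein susceptible-species lists, followed by a check of which G2P-SCAN ortholog species fall within that common set. Interpreting "common susceptible species" as a strict intersection is an assumption, and the protein and species entries below are hypothetical.

```python
def common_susceptible(species_by_protein):
    """Species predicted susceptible for every protein in the pathway."""
    sets = [set(species) for species in species_by_protein.values()]
    return set.intersection(*sets) if sets else set()


def supported_orthologs(species_by_protein, ortholog_species):
    """G2P-SCAN ortholog species that also appear in the common susceptible set."""
    return sorted(common_susceptible(species_by_protein) & set(ortholog_species))


# Hypothetical SeqAPASS Level 1 results for three pathway proteins.
susceptible = {
    "Protein 1": ["Mus musculus", "Danio rerio", "Xenopus laevis"],
    "Protein 2": ["Mus musculus", "Danio rerio"],
    "Protein 3": ["Mus musculus", "Danio rerio", "Gallus gallus"],
}
g2p_orthologs = ["Mus musculus", "Rattus norvegicus"]

print(sorted(common_susceptible(susceptible)))
print(supported_orthologs(susceptible, g2p_orthologs))
```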
2.7. Linking pathway data from G2P-SCAN with AOP data
Many AOP molecular initiating events (MIEs), key events (KEs), and key event relationships (KERs) can be described by specific genes and proteins. Much of this molecular information is made readily available through the AOP-Database (AOP-DB) (https://aopdb.epa.gov/) (Mortensen et al., 2021; Pittman et al., 2018). A batch search was performed within the AOP-DB using the ID numbers of all the available AOPs at the time of this analysis (467 in total as of September 2022) as an input and specifying genes as the output. This resulted in 137 unique AOPs that contained gene-level information according to the database. Only 26 of those AOPs were annotated with 10 or more gene names. Thirteen of the 89 molecular targets that were successfully mapped using G2P-SCAN were associated with at least one of the 137 AOPs that contained gene information. Of these 13 targets, three were selected for further evaluation as case examples (i.e., PPARα, ESR1, and GABRA1). These three targets were selected because they encompass a wide range of existing toxicological information. Perturbation of normal PPARα activity is related to various adverse outcomes including cancer (Klaunig et al., 2003). ESR1 has a well characterized role in mediating endocrine effects (Shanle & Xu, 2011). The chemical modulation of GABRA1 activity in relation to seizure-like behaviors remains less well characterized (Bai et al., 2022). The AOPs and the mapped Reactome pathways of these three targets were compared to identify overlapping biology. These comparisons were performed through a qualitative assessment of the molecular information and functional outcomes of the AOP MIEs, KEs, and KERs and the mapped Reactome pathways. More specifically, the number of genes that overlapped between the mapped Reactome pathways and the AOP was determined. Regarding functional comparisons, the biological outcomes of an MIE, KE, or KER were compared against those of the mapped Reactome pathways.
If a mapped pathway resulted in the activation of a specific protein, for instance, and this protein activation was also used to describe a particular KE within an AOP, then this was considered as an overlap. In this way, relatedness in either gene identity or biological function provided a means of deriving a toxicological context for the mapped Reactome pathway information. Therefore, in cases of AOP and Reactome pathway relatedness, the susceptible species lists obtained via SeqAPASS evaluations performed on those pathways could be used as a line of evidence for expanding the biologically plausible tDOA of the related AOP (as described by Jensen et al., 2023).
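The gene-level part of this comparison amounts to a set intersection between the genes annotated to an AOP and the genes of each mapped Reactome pathway. The following is a minimal sketch in Python, assuming the gene symbols have already been exported from the AOP-DB and G2P-SCAN as plain lists. The function name `gene_overlap` is hypothetical, and the gene sets are small illustrative subsets built from genes named in the PPARα case example, not the complete database records.

```python
def gene_overlap(aop_genes, pathway_genes):
    """Compare AOP gene symbols against the genes of mapped pathways.

    aop_genes: iterable of gene symbols annotated to one AOP.
    pathway_genes: dict mapping a pathway ID to an iterable of gene symbols.
    Returns (per_pathway, shared, aop_only), each holding sorted lists.
    """
    aop = {g.upper() for g in aop_genes}
    per_pathway = {}
    mapped = set()
    for pathway_id, genes in pathway_genes.items():
        genes = {g.upper() for g in genes}
        per_pathway[pathway_id] = sorted(aop & genes)
        mapped |= genes
    return per_pathway, sorted(aop & mapped), sorted(aop - mapped)


# Illustrative subsets only, not the full AOP-DB or Reactome gene lists.
aop_58_subset = ["FASN", "ACACA", "SCD", "CD36", "MLXIPL"]
pathways_subset = {
    "R-HSA-2426168": ["PPARA", "FASN", "ACACA", "SCD"],
    "R-HSA-1989781": ["PPARA", "CD36"],
}

per_pathway, shared, aop_only = gene_overlap(aop_58_subset, pathways_subset)
print(shared)    # AOP genes found in at least one mapped pathway
print(aop_only)  # AOP genes absent from every mapped pathway
```

Gene overlap of this kind only captures identity of gene symbols; the functional comparison of KEs against pathway outcomes described above remains a qualitative, expert-driven step.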
3. Results and Discussion
Identification of chemical molecular targets
Many of the 40 chemicals selected for evaluation had known mechanisms of action and had multiple molecular targets. For instance, it is well established that caffeine binds to Adenosine A1 (ADORA1) and A2A (ADORA2A) receptors, which contributes to its familiar psychoactive effects (Fisone et al., 2004). In total, 97 unique targets were identified across the 40 chemicals (see “Target list.xlsx”; column “selected” in SI). There were 35 targets that had resolved chemical-bound crystal structures readily available within the RCSB-PDB, such as methotrexate bound to dihydrofolate reductase (PDB ID: 1U72) (Cody et al., 2005). However, there were eight chemicals for which a molecular target could not be identified based on our criteria: 1,2-octanediol, benzophenone-4, BHT, diethyl phthalate, ethylzingerone, HC red 3, nitrofurantoin, and sucrose acetate isobutyrate. These cases highlight the domain of applicability of approaches such as SeqAPASS and G2P-SCAN, which require knowledge of the chemical-protein interaction. Advancement of high-throughput screening approaches to obtain reliable bioactivity information holds promise for the identification of molecular targets enabling the utility of such tools to cover an even larger chemical space (J. A. Harrill et al., 2021; Richard et al., 2016). Identified targets were evaluated using both SeqAPASS and G2P-SCAN. The data obtained from each of these tools was assessed to understand how they can complement one another to aid cross-species extrapolation of biological and toxicological information.
SeqAPASS species susceptibility overlaps G2P-SCAN derived orthologs
Eighty-nine of the 97 identified molecular targets were successfully mapped to a Reactome pathway in G2P-SCAN. Since SeqAPASS Level 3 evaluations offer the most species-specific predictions, these susceptibility calls were compared to the orthologs obtained through G2P-SCAN, which are specific to the 7 model organisms. Figure 3 provides an example of this comparative analysis using ADORA2A. These comparisons were made across all 89 of the mapped molecular targets, which led to an average agreement of 74.4% (n = 34) with respect to the 7 model organisms. It is worth noting, however, that this analysis included Homo sapiens, which was the query species used by both G2P-SCAN and SeqAPASS and was considered as an identified ortholog and susceptible species by default. The SeqAPASS evaluations of the 89 targets resulted in predictions of susceptibility for at least one of the 7 species that were not identified as orthologs by G2P-SCAN for 43 targets. Conversely, G2P-SCAN identified 8 orthologs that were not predicted as susceptible by SeqAPASS Level 3 evaluations.
Figure 3.
(A) Heatmap of SeqAPASS susceptibility calls from Levels 1 – 3 and G2P-SCAN ortholog determinations for the 7 G2P-SCAN model organisms using ADORA2A (protein accession: NP_000666.2) as an example. (B) Venn diagram of SeqAPASS Level 3 susceptible species and G2P-SCAN ortholog species. Four of the six ortholog species identified via G2P-SCAN were also identified as susceptible by SeqAPASS through a Level 3 evaluation (66.7% total).
In cases where SeqAPASS identified susceptible species that did not have orthologs identified by G2P-SCAN, it can be assumed that the mapped Reactome pathways from G2P-SCAN may not be conserved in the 7 species. However, those species may still be susceptible to the chemical of interest through interaction with the aligned protein target identified by SeqAPASS for that species. On the other hand, in cases where G2P-SCAN identified orthologs for the 7 species that were not found to be susceptible by SeqAPASS, those G2P-SCAN mapped Reactome pathways are likely conserved in those species. However, the chemical of interest may not interact with the protein target in the same manner as other known sensitive species.
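The two kinds of disagreement described above can be made explicit by sorting each species into one of four categories for a given target. The sketch below is illustrative only: the function name `compare_calls` is hypothetical, the species calls are invented for a hypothetical target, and the agreement score (fraction of species on which the two tools give matching calls) is one plausible definition assumed here, not necessarily the formula used in this study.

```python
def compare_calls(species, seqapass_susceptible, g2p_orthologs):
    """Sort species into agreement categories for one molecular target."""
    seq = set(seqapass_susceptible)
    g2p = set(g2p_orthologs)
    result = {"both": [], "seqapass_only": [], "g2p_only": [], "neither": []}
    for sp in species:
        if sp in seq and sp in g2p:
            key = "both"           # target and pathway both supported
        elif sp in seq:
            key = "seqapass_only"  # target conserved, pathway conservation unsupported
        elif sp in g2p:
            key = "g2p_only"       # pathway conserved, chemical interaction may differ
        else:
            key = "neither"
        result[key].append(sp)
    # Assumed definition: matching calls (both or neither) over all species.
    matches = len(result["both"]) + len(result["neither"])
    result["agreement"] = matches / len(species)
    return result


# Hypothetical calls for a hypothetical target (not results from this study).
species = ["Homo sapiens", "Mus musculus", "Rattus norvegicus",
           "Danio rerio", "Drosophila melanogaster"]
calls = compare_calls(
    species,
    seqapass_susceptible=["Homo sapiens", "Mus musculus",
                          "Rattus norvegicus", "Danio rerio"],
    g2p_orthologs=["Homo sapiens", "Mus musculus",
                   "Rattus norvegicus", "Drosophila melanogaster"],
)
print(calls["seqapass_only"], calls["g2p_only"], calls["agreement"])
```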
As an example, G2P-SCAN identified a fruit fly (D. melanogaster) ortholog for ADORA2A, but fruit flies were predicted as not likely susceptible through the Level 3 SeqAPASS evaluation based on the chemical-specific interactions of ADORA2A with caffeine (see Figure 3). This finding is supported by previous studies that showed the effects of caffeine in D. melanogaster are likely due to an adenosine receptor-independent mechanism (Nall et al., 2016). Therefore, it can be inferred that the fruit fly has similar pathways associated with ADORA2A-like proteins, but caffeine likely does not initiate these pathways as it would in other sensitive species. This illustrates the benefit of combining the tools: the chemical specificity of the SeqAPASS Level 3 evaluation aids interpretation of the pathway conservation inferences made with G2P-SCAN by accounting for the specific chemical-protein interaction. The underlying logic of this example is as follows: if it were known that 1) the activation of ADORA2A via caffeine led to a biological effect in one species, 2) other species show conservation of those critical amino acid residues known to mediate the caffeine interaction with ADORA2A, and 3) the ADORA2A-mediated biological pathways were conserved within other species, then it could be inferred through these multiple lines of evidence that caffeine would lead to a similar effect in all species where ADORA2A and its associated pathways are conserved.
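The three-part logic above can be written as a small decision rule. This is a sketch under stated assumptions, not an implementation from either tool: the `Evidence` class, the `infer` function, and the wording of the returned labels are all hypothetical, and real weight-of-evidence judgments would be graded rather than boolean.

```python
from dataclasses import dataclass


@dataclass
class Evidence:
    effect_known_in_reference: bool  # 1) chemical-target effect known in one species
    residues_conserved: bool         # 2) critical residues conserved (Level 3-type evidence)
    pathway_conserved: bool          # 3) target-mediated pathway conserved (G2P-SCAN-type evidence)


def infer(e: Evidence) -> str:
    """Combine the three lines of evidence into a qualitative statement."""
    if not e.effect_known_in_reference:
        return "no basis for extrapolation"
    if e.residues_conserved and e.pathway_conserved:
        return "similar effect plausible"
    if e.pathway_conserved:
        return "pathway conserved, chemical interaction unlikely"
    if e.residues_conserved:
        return "interaction plausible, pathway conservation unsupported"
    return "no supporting evidence"


# The fruit fly and caffeine example: ortholog found, Level 3 call not susceptible.
fruit_fly = Evidence(effect_known_in_reference=True,
                     residues_conserved=False,
                     pathway_conserved=True)
print(infer(fruit_fly))
```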
Three case examples were selected to demonstrate the utility of the combined approaches
To demonstrate the use of these combined computational approaches in aiding cross-species extrapolation of potential chemical susceptibility, the results for three of the identified molecular targets were highlighted: PPARα (target of 2-ethylhexanoic acid), ESR1 (target of diethylstilbestrol, butylparaben, oxybenzone, and dibutyl phthalate), and GABRA1 (target of topiramate) (see SI for summary of SeqAPASS and G2P-SCAN outputs for each case example). These targets were chosen to exemplify the approach because they were found to be associated with characterized AOPs and because they underscore the potential benefits and areas of improvement in using such combined computational approaches. To be clear, it is not being claimed that these chemical-protein interactions are the primary interactions responsible for reported adverse events. All molecular targets were selected based on the specific search procedure described herein.
As was done with all other identified targets, these three targets were used as inputs for G2P-SCAN using their official NCBI gene symbols as separate queries (see Table 2). The G2P-SCAN package then mapped these genes to existing Reactome pathways (Figure 4). Two pathways were mapped by both ESR1 and PPARA – “SUMOylation of intracellular receptors” and “Nuclear Receptor transcription pathway.” Only two of the ESR1 mapped pathways had coverage percentages of over 10% – “RUNX1 regulates transcription of genes involved in WNT signaling” (ESR1 represented 1 out of 5 genes in this pathway) and “RUNX1 regulates estrogen receptor mediated transcription” (ESR1 represented 1 out of 6 genes in this pathway).
Table 2.
Summary of SeqAPASS information used in Level 1 – 3 evaluations for the 3 case example targets.
| Chemical Name | Target | Target Gene Symbol | Level 1 Query Species (protein accession) | Level 2 Domain (domain accession) | Level 3 Template Species (protein accession) | Level 3 Amino Acids |
|---|---|---|---|---|---|---|
| Oxybenzone; Butylparaben; Dibutyl phthalate; Diethylstilbestrol | estrogen receptor 1 | ESR1 | Homo sapiens (NP_000116.2) | Ligand binding domain of Estrogen receptor (cd06949) | Homo sapiens (NP_000116.2) | Butylparaben (353E, 394R, 404F); Diethylstilbestrol (343M, 353E, 394R, 404F, 524H) |
| 2-Ethylhexanoic acid | peroxisome proliferator-activated receptor alpha | PPARA | Homo sapiens (NP_005027.2) | Ligand binding domain of peroxisome proliferator-activated receptors (cd06932) | NA | NA |
| Topiramate | gamma-aminobutyric acid receptor subunit alpha-1 precursor | GABRA1 | Homo sapiens (NP_001121116.1) | Neurotransmitter-gated ion-channel ligand binding domain (pfam02931) | NA | NA |
Figure 4.
(A) Bar plot of gene counts from all Reactome pathways that were mapped using either ESR1, PPARA, or GABRA1 as a G2P-SCAN input. Two pathways were mapped by both ESR1 and PPARA – “SUMOylation of intracellular receptors” and “Nuclear Receptor transcription pathway.” (B) Bar plot of the pathway coverage percentage for each mapped Reactome pathway. Two pathways that were mapped using ESR1 had coverage percentages of over 10% – “RUNX1 regulates transcription of genes involved in WNT signaling” and “RUNX1 regulates estrogen receptor mediated transcription.”
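The coverage percentages reported for the two RUNX1 pathways follow directly from the gene counts given in the text (1 of 5 and 1 of 6 genes). A minimal sketch of that arithmetic, assuming coverage is the share of a pathway's genes accounted for by the query gene; the function name, the threshold constant, and the third pathway entry are hypothetical additions for illustration.

```python
def pathway_coverage(n_query_genes, n_pathway_genes):
    """Percentage of a pathway's genes accounted for by the query gene(s)."""
    if n_pathway_genes <= 0:
        raise ValueError("pathway must contain at least one gene")
    return 100.0 * n_query_genes / n_pathway_genes


COVERAGE_THRESHOLD = 10.0  # percent; pathways above this were prioritized directly

# (query genes in pathway, total genes in pathway)
pathways = {
    "RUNX1 regulates transcription of genes involved in WNT signaling": (1, 5),
    "RUNX1 regulates estrogen receptor mediated transcription": (1, 6),
    "hypothetical large pathway": (1, 40),  # invented entry for contrast
}

for name, (hits, total) in pathways.items():
    pct = pathway_coverage(hits, total)
    print(f"{name}: {pct:.1f}% (prioritized: {pct > COVERAGE_THRESHOLD})")
```

With single-gene queries, coverage is driven entirely by pathway size, so only very small pathways can exceed the threshold.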
PPARA – 2-Ethylhexanoic acid
Peroxisome proliferator-activated receptor alpha (PPARα) is a nuclear transcription factor that has a known role of mediating the activity of peroxisome proliferators in rodents (Gonzalez et al., 1998) and in humans (Kersten & Stienstra, 2017). 2-Ethylhexanoic acid, which is an industrial compound used to make paints and plasticizers (Table 1), has been shown to target PPARα in previous studies. For example, 2-Ethylhexanoic acid was shown to activate PPARα with an EC2X (two-fold effect concentration) of 500 μM (Lampen et al., 2003), and this interaction with PPARα was shown to be more selective compared to other PPARs (Maloney & Waxman, 1999). Although there is currently no clear evidence of 2-ethylhexanoic acid directly binding to PPARα, these studies demonstrated a dose-dependent activation. Therefore, PPARα was considered as a molecular target of 2-ethylhexanoic acid for this study.
By using the isoform selection approach described in Figure 2, NP_005027.2 was used as the query accession for the SeqAPASS Level 1 evaluation representing the human PPARα. As there was (at the time of this study) no evidence of 2-ethylhexanoic acid binding directly to PPARα, it was assumed that the interaction would likely occur at the ligand binding site of the protein since previous studies had demonstrated an activating effect on PPARα similar to known ligands (Lampen et al., 2003; Maloney & Waxman, 1999). Therefore, the conserved domain cd06932 was used for the SeqAPASS Level 2 evaluation. No information on critical amino acids was available to support a Level 3 evaluation.
Using PPARA as an input for G2P-SCAN yielded 10 Reactome pathways (Figure 4). None of these mapped pathways had a percent coverage by PPARA of over 10%, so separate PPI networks were constructed using STRING for each of these pathways and an MCODE analysis was performed on each for the connectivity-based cluster analysis. The pathways in which PPARα was found within the highest scoring molecular complex were selected for further SeqAPASS evaluations. In some cases, the molecular complexes containing PPARα were too large (>10 proteins) to efficiently evaluate each protein of the complex within SeqAPASS. Therefore, the top ten scoring proteins contained within the complex that were also directly connected to PPARα were used in Level 1 evaluations. This approach to filtering key pathway proteins was intended to balance analysis efficiency with prediction confidence. The resulting “susceptible” species lists obtained from each of the Level 1 evaluations using the high scoring molecular complex proteins and the Level 2 evaluation of PPARα were merged to identify the identical “susceptible” species across each list (Figure 5).
Figure 5.
A diagram demonstrating the process of making across species predictions of biological pathway conservation via SeqAPASS. 1) The highest scoring molecular complex of the PPI network formed by the Reactome pathway R-HSA-2426168 was determined using the MCODE analysis. 2) The top ten scoring proteins (circled in red) connected to PPARα (enlarged) were evaluated in SeqAPASS. 3) The “susceptible” species lists from each of the SeqAPASS evaluations for these 11 proteins were merged to identify the identical, “susceptible” species across each list, which resulted in 187 mammalian species.
The prediction of chemical susceptibility can be further supported by combining it with the list of species that G2P-SCAN identified as having the pathway conserved. R. norvegicus, M. musculus, and D. rerio had orthologs of the human PPARA gene as well as identical protein families. If any of these three species were also identified as “susceptible” from the SeqAPASS results across all the high scoring molecular complex proteins of a prioritized, PPARα-mapped Reactome pathway, then this overlap would give added support to the prediction that those species would likely be impacted by exposure to 2-ethylhexanoic acid. For example, one of the mapped pathways, “Activation of gene expression by SREBF (SREBP)” (R-HSA-2426168), resulted in 187 mammalian species predicted as “susceptible”. Among these 187 species were R. norvegicus and M. musculus (2 of the 6 G2P-SCAN query species). Furthermore, the analysis of the pathway protein, protein family, reaction, and entity counts supported the conservation of this pathway within rats and mice, as the count values for these species did not differ significantly (p-values > 0.10) from the human counts (SI). In this way, both SeqAPASS and G2P-SCAN results provided separate lines of evidence to support the inference of this pathway as being conserved within these two species, and through principles of species read-across (LaLone et al., 2013) the results supported the hypothesis that this pathway is likely to be conserved within the mammalian taxon.
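The merging of “susceptible” species lists and the cross-check against G2P-SCAN orthologs can both be expressed as set intersections. The sketch below assumes the species lists have been extracted from SeqAPASS reports into plain lists; the function names, the placeholder protein labels, and the species lists are hypothetical and do not reproduce the results of this study.

```python
def shared_susceptible(species_lists):
    """Species called susceptible in every evaluation (set intersection)."""
    sets = [set(s) for s in species_lists.values()]
    if not sets:
        return set()
    return set.intersection(*sets)


def supported_model_species(shared, g2p_ortholog_species):
    """G2P-SCAN ortholog species that also appear in the merged SeqAPASS list."""
    return sorted(shared & set(g2p_ortholog_species))


# Hypothetical SeqAPASS outputs for a target and two pathway proteins.
lists = {
    "target (Level 2)": ["Homo sapiens", "Rattus norvegicus",
                         "Mus musculus", "Danio rerio"],
    "protein_A (Level 1)": ["Homo sapiens", "Rattus norvegicus",
                            "Mus musculus", "Bos taurus"],
    "protein_B (Level 1)": ["Homo sapiens", "Rattus norvegicus",
                            "Mus musculus", "Danio rerio", "Bos taurus"],
}

shared = shared_susceptible(lists)
supported = supported_model_species(
    shared, ["Rattus norvegicus", "Mus musculus", "Danio rerio"])
print(sorted(shared))
print(supported)
```

Requiring a species to appear in every list is a conservative choice: a single missing or poorly annotated sequence for one pathway protein removes that species from the merged list.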
However, the conservation of PPARα-related pathways alone does not necessarily provide additional evidence of species susceptibility to adverse outcomes from exposure to 2-ethylhexanoic acid. In fact, PPARα activation has been suggested as potentially neuroprotective (Deplanque et al., 2003). Therefore, comparisons to AOPs were made to understand the potential toxicity of PPARα activation and PPARα-mapped pathways. Ten AOPs that involved PPARα were identified from the AOP-DB. One of these AOPs was “NR1I3 (CAR) suppression leading to hepatic steatosis” (AOP 58; https://aopwiki.org/aops/58), which had the most gene-level information of any existing AOP (i.e., 33 genes in total). This high amount of gene-level information allowed for comparison of the genes contained within this AOP to those identified via G2P-SCAN from mapped Reactome pathways. Of the 33 genes that were associated with AOP 58, 11 were unique, human genes. Ten of these 11 human genes were found within at least one PPARα-mapped Reactome pathway derived from G2P-SCAN (Figure 6). The fact that most (10 out of 11) of the human genes known to be associated with AOP 58 were found within PPARα-mapped Reactome pathways indicated that these pathways may be useful in broadening the biological information used to describe the MIE, KEs, or KERs of this AOP. It also suggests that the SeqAPASS pathway conservation predictions for the related priority pathways could be extended to this AOP, specifically as evidence to define the biologically plausible tDOA. However, it is also important to consider the biological plausibility of these overlapping genes within the context of the specific AOP and Reactome pathways before expanding the biological information for the tDOA in this way.
Figure 6.
A Venn diagram showing a total overlap of 6.0% between the gene-level information associated with AOP 58 (“NR1I3 (CAR) suppression leading to hepatic steatosis”) and the genes associated with various PPARα-mapped Reactome pathways. Only one of the AOP 58 genes was not found in any of the Reactome pathways – MLXIPL. The Venn diagram was created using Venny (v2.1; https://bioinfogp.cnb.csic.es/tools/venny/).
Specifically, the pathways associated with the 10 overlapping genes included: “PPARA activates gene expression” (R-HSA-1989781), “Nuclear Receptor transcription pathway” (R-HSA-383280), and “Activation of gene expression by SREBF (SREBP)” (R-HSA-2426168). Only “Activation of gene expression by SREBF (SREBP)” (R-HSA-2426168) and “Nuclear Receptor transcription pathway” (R-HSA-383280) were determined as priority pathways. There were several key events (KEs) within AOP 58 (“NR1I3 (CAR) suppression leading to hepatic steatosis”) that related directly to these three Reactome pathways. For instance, the first KE of the AOP (“Activation, SREBF1”) related to the function of the Reactome pathway R-HSA-2426168, which describes transcriptional coactivation of various genes through the combined activity of SREBF1 and the PPARA:RXRA coactivator complex. Additionally, this Reactome pathway involves the regulation of the genes fatty acid synthase (FASN), acetyl-CoA carboxylase alpha (ACACA), and stearoyl-CoA desaturase (SCD). These three genes were associated with three separate KEs within AOP 58: “Up Regulation, SCD-1,” “Up Regulation, FAS,” and “Up Regulation, Acetyl-CoA carboxylase-1 (ACC-1).” There were also two Reactome pathways that involve the positive regulation of the gene CD36 by PPARα (R-HSA-1989781 and R-HSA-381340), which was associated with the KE “Up Regulation, CD36.”
Taken together, these overlaps in the function of the Reactome pathways and the AOP KEs offer support that the interaction between 2-ethylhexanoic acid and PPARα may lead to an adverse outcome. However, AOP 58 defines PPARα inhibition as a molecular initiating event (MIE), whereas studies showed that 2-ethylhexanoic acid activated PPARα (Lampen et al., 2003; Maloney & Waxman, 1999). The role of PPARα in the development of liver steatosis differs in AOP 61, “NFE2L2/FXR activation leading to hepatic steatosis,” which lists PPARα activation as an early KE. Therefore, it is still possible that 2-ethylhexanoic acid could lead to liver steatosis by activating PPARα, but this apparent discrepancy in the modulation of PPARα highlights a challenge in making predictions of apical outcomes from chemical perturbation through this computational approach alone and underscores the importance of considering chemical-protein interactions in the context of biological pathways. These differences could be further resolved through careful evaluation of pathway information in relation to AOPs, or otherwise through quantitative approaches (e.g., transcriptomics, proteomics) that assess the activity of additional pathway elements in response to chemical exposure. Regardless, by identifying biological pathways that can be used to better describe the molecular mechanisms underpinning the events of AOPs, the information of those related pathways could be used to provide additional lines of evidence to define the biologically plausible tDOA.
ESR1 – Butylparaben, Oxybenzone, Dibutyl phthalate, and Diethylstilbestrol
The estrogen receptor 1 (ESR1 or ERα) is a target for various endogenous estrogens and plays a critical role as a transcription factor related to many basic biological functions and disease states (Harris et al., 2002; Nilsson & Gustafsson, 2008). It has also been shown by previous studies to be a target for synthetic estrogenic compounds including 4 of the 40 chemicals assessed in this study: butylparaben (Delfosse et al., 2015), oxybenzone (Blüthgen et al., 2012; Matsumoto et al., 2005), dibutyl phthalate (Toda et al., 2004), and diethylstilbestrol (Shiau et al., 1998). There are existing PDB crystal structures of ESR1 bound to diethylstilbestrol (PDB ID: 3ERD) and butylparaben (PDB ID: 4MG9). These structures identify key amino acids involved in the binding of those chemicals to ESR1 (see Table 2), which allowed for SeqAPASS Level 3 evaluations querying those key residues. No key residues could be identified from the interaction of ESR1 and the other two active chemicals – oxybenzone and dibutyl phthalate. The crystal structures of diethylstilbestrol and butylparaben bound to ESR1 showed that these chemicals interacted with ESR1 at the conserved ligand binding domain cd06949, which was used for SeqAPASS Level 2 evaluations.
G2P-SCAN mapped 12 Reactome pathways using ESR1 as an input (see Figure 4). Two of the mapped pathways, “RUNX1 regulates transcription of genes involved in WNT signaling” (R-HSA-8939256) and “RUNX1 regulates estrogen receptor mediated transcription” (R-HSA-8931987), were prioritized for having ESR1 coverage of over 10%. All other pathways were evaluated using the PPI network approach applying STRING, which led to 4 additional prioritized pathways: “Estrogen-dependent gene expression” (R-HSA-9018519), “Extra-nuclear estrogen signaling” (R-HSA-9009391), “SUMOylation of intracellular receptors” (R-HSA-4090294), and “TFAP2 (AP-2) family regulates transcription of growth factors and their receptors” (R-HSA-8866910). The top 10 scoring proteins that were directly connected to ESR1 in the molecular complexes of these PPI networks produced by STRING were used to query SeqAPASS for evaluation of pathway conservation. As examples, the SeqAPASS evaluations on the key pathway proteins for “Estrogen-dependent gene expression” (R-HSA-9018519) resulted in 139 “susceptible” mammalian species, and the evaluations for “RUNX1 regulates transcription of genes involved in WNT signaling” (R-HSA-8939256) resulted in 272 “susceptible” species across 8 taxa (Actinopteri, Amphibia, Aves, Chondrichthyes, Crocodylia, Lepidosauria, Mammalia, and Testudinata).
The G2P-SCAN approach identified ESR1 orthologs for R. norvegicus, M. musculus, and D. rerio. Additionally, the protein, protein family, reaction, and entity counts for each of the 6 priority pathways within these three species were not significantly different from the counts in humans (p-values > 0.10), and the protein functional families across these species were identical to the human protein family for ESR1. When considered in combination with the SeqAPASS results, these findings provide multiple lines of evidence to support the inference of pathway conservation for these specific species and the hypothesis that the mapped pathways are conserved across species within identical taxonomic groups. For instance, species within the Actinopteri class were found to be “susceptible” when evaluating the key proteins of the “RUNX1 regulates transcription of genes involved in WNT signaling” (R-HSA-8939256) pathway, and the G2P-SCAN evaluation of this pathway supports the prediction of its conservation in D. rerio, which is a member of that taxon.
When searching for AOPs that involved ESR1, 9 AOPs were identified. However, the total amount of gene-level information contained within these AOPs was limited. The AOP with the greatest number of genes associated with it (AOP 67; https://aopwiki.org/aops/67) had 9 genes in total (which accounted for 6 vertebrate species associated with the AOP). Only 2 of these 9 genes were unique human genes – ESR1 and NR2F2. Moreover, NR2F2 was not found to be associated with any of the ESR1-mapped Reactome pathways. This lack of mapping reduced confidence in assigning a toxicological context to these pathways by simply assessing the amount of gene-level overlap there is between relevant AOPs. Instead, there was an increased reliance on the functional aspects of the pathways and the KEs of the relevant AOPs to make such comparisons.
This ESR1 case example was useful because the apical endpoints impacted due to perturbation by chemical stressors have been well characterized in comparison to the example with 2-ethylhexanoic acid and PPARα. Diethylstilbestrol, for instance, is a well-studied endocrine disruptor (Robotti, 2021). Given that the toxicity of ESR1 stressors is well defined, there is an opportunity to expand the descriptions of the biological mechanisms through the AOP framework. As an example, diethylstilbestrol is considered a prototypical stressor of AOP 167 (“Early-life estrogen receptor activity leading to endometrial carcinoma in the mouse”) and is known to activate ESR1 (Shiau et al., 1998). ESR1 activation is the first KE of AOP 167 (KE ID: 1065). The second KE is “Promotion, SIX-1 positive basal-type progenitor cells.” The key event relationship (KER) between these two events is considered “non-adjacent” and the steps by which the activation of ESR1 leads to the promotion of SIX-1 positive cells is not well described. However, one of the prioritized ESR1-mapped Reactome pathways, “RUNX1 regulates transcription of genes involved in WNT signaling” (R-HSA-8939256), contains detailed molecular information on how ESR1 activity may impact the promotion of SIX1 via the activation of the pathway “degradation of beta-catenin by the destruction complex” (R-HSA-195253) (Figure 7), which is also indirectly related to certain forms of cancer (Kimelman & Xu, 2006) and therefore is in line with AOP 167 (development of endometrial carcinoma in the mouse).
Figure 7.
A snapshot of the Reactome pathway “RUNX1 regulates transcription of genes involved in WNT signaling” (R-HSA-8939256; https://reactome.org/PathwayBrowser/#/R-HSA-8939256) showing the connections between ESR1 activity and the activation of the pathway “Degradation of beta-catenin by the destruction complex” (R-HSA-195253; https://reactome.org/PathwayBrowser/#/R-HSA-195253), which is associated with SIX1 gene expression according to Reactome. The molecular events described in these pathways could be used to enhance the description of the KER of AOP 167 that connects ESR1 activation and SIX1 promotion.
Unfortunately, similar to the PPARα example (AOP 61 and 58), there is also a divergence in the reported relationship between ESR1 activation and SIX1 expression. While AOP 167 described ESR1 activation leading to SIX1 upregulation in mice, another study found that ESR1 activation via diethylstilbestrol led to SIX1 downregulation in mice (Terakawa et al., 2020). The effect of ESR1 activity on SIX1 expression could be dependent on factors like developmental stage (Suen et al., 2016). Despite these discrepancies in the directional impact on SIX1 expression, both sources agree that ESR1 activation via diethylstilbestrol indeed alters SIX1 expression (in some way) and ultimately leads to the promotion of endometrial carcinoma (Suen et al., 2016; Terakawa et al., 2020). Therefore, the opportunity for using pathway information derived from these bioinformatics approaches to fill knowledge gaps in existing AOPs remains. This added molecular information derived from the combined computational approaches described here could then be used to enhance cross-species extrapolation of chemical susceptibility. As highlighted by the example of ESR1, the reliance on the mere presence of a protein or gene provides an initial understanding of the likelihood of the chemical interactions; however, to understand potential toxicity, other factors will need to be taken into consideration as well. For instance, in the case of AOP 167, which describes the development of endometrial carcinoma, it would be necessary to know if the species has an endometrium. Additionally, it would be useful to know if the other molecular entities of the related, ESR1-mapped Reactome pathways are also activated by the presence of diethylstilbestrol.
Transcriptomic data on the bioactivity of additional pathway elements (like SIX1) following chemical exposure could enhance understanding of these mechanisms (Xia et al., 2020), and such data could be efficiently obtained for many chemicals through an in vitro high-throughput transcriptomic (HTTr) screening approach (J. A. Harrill et al., 2021).
GABRA1 – Topiramate
Gamma-aminobutyric acid type A receptor subunit alpha1 (GABRA1) is a ligand-gated chloride channel that is associated with epilepsy syndromes and other neuropsychiatric diseases in humans (Musicoro et al., 2021). There is also some evidence that it is a likely target of the anticonvulsant drug, topiramate (Bai et al., 2022; White et al., 1997, 2000). However, unlike the previously discussed targets, PPARα and ESR1, the weight of evidence to support GABRA1 as a direct target of topiramate is less substantial. Rather, studies have shown that topiramate is able to modulate GABA-evoked currents in neurons via GABAA receptors (White et al., 2000) and this activity is dependent on the specific subunit combinations (Simeone et al., 2006). Thus, the exact mechanism underlying this activity is not fully understood.
This uncertainty in the mechanism of topiramate with regards to GABRA1 helps to underscore the domain of applicability for this pathway-based cross-species analysis approach and to highlight critical challenges. The overall confidence in the prediction of cross-species susceptibility to topiramate is dependent on the weight of evidence supporting GABRA1 as a true molecular target. This evidence needs to be experimentally derived and the quality of that evidence needs to be critically evaluated. Therefore, instead of applying this approach to cross-species extrapolation of chemical susceptibility, it was used for generating hypotheses of susceptibility. As topiramate is thought to act as a partial agonist of GABRA1 (Simeone et al., 2006), the conserved neurotransmitter-gated ion-channel ligand binding domain, pfam02931, was used as a basis for a SeqAPASS Level 2 evaluation. A Level 3 evaluation was not possible because there was no direct evidence of topiramate binding to GABRA1, and therefore, no data was found on critical amino acid residues.
There was considerably less pathway information available on GABRA1. G2P-SCAN mapped two Reactome pathways using GABRA1: “GABA receptor activation” (R-HSA-977443) and “Signaling by ERBB4” (R-HSA-1236394). GABRA1 did not represent ≥10% of either of these pathways. Therefore, the PPI network approach to prioritization was used. This resulted in consideration of “Signaling by ERBB4” (R-HSA-1236394) only, since GABRA1 was not found within the highest-scoring molecular complex of the “GABA receptor activation” (R-HSA-977443) PPI network. Unlike with the previously discussed targets, there is only one priority pathway available for GABRA1 to use for cross-species analysis. Orthologs were identified for R. norvegicus, M. musculus, and D. rerio using G2P-SCAN. Similarly, the protein families for GABRA1 were the same as the human families across these three species. The protein, protein family, reaction, and entity counts did not differ significantly (p-value > 0.10) from humans either (SI), giving support for the conservation of this pathway within these species. The SeqAPASS evaluation of the key pathway proteins resulted in a list of 199 “susceptible” mammalian species that included R. norvegicus and M. musculus (according to SeqAPASS full reports). Therefore, both tools gave support for the hypothesis that this pathway is likely to be conserved across mammalian species.
When searching for GABRA1 within the AOP-DB, only one AOP was found – “Binding to the picrotoxin site of ionotropic GABA receptors leading to epileptic seizures in adult brain” (AOP 10). This AOP had 12 genes total associated with it. Three of these 12 genes were unique human genes: GABRA1, GABRA5, and GABRA6. When comparing overlapping genes in the mapped Reactome pathways derived from G2P-SCAN, all three of these genes were found in at least one of the two mapped pathways, but only GABRA1 was found in the prioritized pathway. Like the analysis described using ESR1, this lack of overlap in molecular information led to a greater dependency on functional overlaps to understand potential toxicity. However, AOP 10 describes the biological steps that lead to epileptic seizures in the adult, human brain, and it is well known that topiramate is used in the treatment and prevention of seizures. The MIE of AOP 10 involves binding to GABRA1 and the first KE is the reduction in the conductance of the channel because of binding. Topiramate is known to enhance GABA-channel conductance. Therefore, the ability to understand the potential toxicity of topiramate exposure from an ecotoxicology perspective is severely limited with respect to its known targets using this approach.
Combination of empirical toxicology data with computational approaches
The central premise underlying all efforts to advance computational predictions of toxicity is that predictions will likely improve with more biological information and context (LaLone et al., 2018; Yahya et al., 2021). At present, SeqAPASS and G2P-SCAN rely on gathering evidence of the presence or absence of similar proteins across species, at the target and pathway level, respectively, as a predictor of chemical susceptibility. Combining these tools with approaches that consider additional factors, such as life stage, life history, biological sex, and toxicokinetic factors like absorption, distribution, metabolism, and excretion (ADME), will yield a more complete understanding of chemical exposure and the resulting biological impacts. Using biological pathway information and assessing protein interaction networks offers another promising way to move the predictive capabilities of computational approaches towards a systems-based view.
Currently, this combined approach has many limitations, including the lack of exposure data to adequately link a chemical-protein interaction to an adverse outcome, the poor ontological annotation of many AOP KEs and KERs, the dependency on a priori chemical-protein interaction information, and the lack of empirical evidence linking the modulatory effects of chemicals on target activity to downstream pathway elements. Several of these limitations could be mitigated through transcriptomic and proteomic methods. Toxicogenomic data allow for quantitative pathway detection in a chemical-specific and concentration-dependent manner (J. A. Harrill et al., 2021; Xia et al., 2020). Applying network approaches to toxicogenomic data could additionally help validate, and potentially identify, novel molecular targets (Audouze et al., 2010). More specific determinations of impacted pathways could also be achieved through topology-based pathway analysis, which incorporates quantitative molecular data into biological pathways to better understand system perturbations (Mitrea et al., 2013). Altogether, this would generate a deeper understanding of pathways and potentially support connectivity to AOPs to improve cross-species predictions of chemical susceptibility. Ultimately, improving the cross-species analysis of chemical susceptibility will further support NAMs in becoming established in the chemical safety assessment process (Van Der Zalm et al., 2022).
4. Conclusions
This work establishes an initial framework for using biological pathway information to enhance chemical susceptibility predictions across species and demonstrates the synergy that can be obtained through the combined use of existing methods. SeqAPASS uses protein sequence information to help make predictions of chemical susceptibility, while G2P-SCAN accesses information from various databases to help infer biological pathway conservation across select species. In combination, these tools were shown to expand the prediction of biological pathway conservation across all species with relevant protein data, aid in the prediction of cross-species susceptibility, extend the biologically plausible tDOA of relevant AOPs, and provide additional biological information to help better characterize KEs and KERs. That said, there is still room for improvement, as demonstrated by the three case examples discussed here. Incorporating additional factors such as exposure, life stage, life history, biological sex, toxicokinetic factors like absorption, distribution, metabolism, and excretion (ADME), and omics-based data into this approach would yield a more complete understanding of the biological impacts of chemicals. This would support molecular target identification for less studied chemicals and improve biological pathway prioritizations for more focused comparisons with related AOPs.
Footnotes
Any use of trade, firm, or product names is for descriptive purposes only and does not imply endorsement by the authors or the U.S. Government.
References
- Albert R (2005). Scale-free networks in cell biology. Journal of Cell Science, 118(21), 4947–4957. 10.1242/JCS.02714
- Ankley GT, Bennett RS, Erickson RJ, Hoff DJ, Hornung MW, Johnson RD, Mount DR, Nichols JW, Russom CL, Schmieder PK, Serrano JA, Tietge JE, & Villeneuve DL (2010). Adverse outcome pathways: A conceptual framework to support ecotoxicology research and risk assessment. Environmental Toxicology and Chemistry, 29(3), 730–741. 10.1002/ETC.34
- Audouze K, Juncker AS, Roque FJSSA, Krysiak-Baltyn K, Weinhold N, Taboureau O, Jensen TS, & Brunak S (2010). Deciphering Diseases and Biological Targets for Environmental Chemicals using Toxicogenomics Networks. PLOS Computational Biology, 6(5), e1000788. 10.1371/JOURNAL.PCBI.1000788
- Bader GD, & Hogue CWV (2003). An automated method for finding molecular complexes in large protein interaction networks. BMC Bioinformatics, 4(1), 1–27. 10.1186/1471-2105-4-2
- Bai YF, Zeng C, Jia M, & Xiao B (2022). Molecular mechanisms of topiramate and its clinical value in epilepsy. Seizure, 98, 51–56. 10.1016/J.SEIZURE.2022.03.024
- Baltazar MT, Cable S, Carmichael PL, Cubberley R, Cull T, Delagrange M, Dent MP, Hatherell S, Houghton J, Kukic P, Li H, Lee MY, Malcomber S, Middleton AM, Moxon TE, Nathanail AV, Nicol B, Pendlington R, Reynolds G, … Westmoreland C (2020). A next-generation risk assessment case study for coumarin in cosmetic products. Toxicological Sciences, 176(1). 10.1093/toxsci/kfaa048
- Berman HM, Westbrook J, Feng Z, Gilliland G, Bhat TN, Weissig H, Shindyalov IN, & Bourne PE (2000). The Protein Data Bank. In Nucleic Acids Research (Vol. 28, Issue 1). 10.1093/nar/28.1.235
- Blum M, Chang H-Y, Chuguransky S, Grego T, Kandasaamy S, Mitchell A, Nuka G, Paysan-Lafosse T, Qureshi M, Raj S, Richardson L, Salazar GA, Williams L, Bork P, Bridge A, Gough J, Haft DH, Letunic I, Marchler-Bauer A, … Finn RD (2021). The InterPro protein families and domains database: 20 years on. Nucleic Acids Research, 49(D1), D344–D354. 10.1093/nar/gkaa977
- Blüthgen N, Zucchi S, & Fent K (2012). Effects of the UV filter benzophenone-3 (oxybenzone) at low concentrations in zebrafish (Danio rerio). Toxicology and Applied Pharmacology, 263(2), 184–194. 10.1016/J.TAAP.2012.06.008
- Cody V, Luft JR, & Pangborn W (2005). Understanding the role of Leu22 variants in methotrexate resistance: comparison of wild-type and Leu22Arg variant mouse and human dihydrofolate reductase ternary crystal complexes with methotrexate and NADPH. Acta Crystallographica Section D Biological Crystallography, 61(2), 147–155. 10.1107/S0907444904030422
- Delfosse V, Grimaldi M, Cavaillès V, Balaguer P, & Bourguet W (2015). Structural and functional profiling of environmental ligands for estrogen receptors. Environmental Health Perspectives, 122(12), 1306–1313. 10.1289/EHP.1408453
- Deplanque D, Gelé P, Pétrault O, Six I, Furman C, Bouly M, Nion S, Dupuis B, Leys D, Fruchart JC, Cecchelli R, Staels B, Duriez P, & Bordet R (2003). Peroxisome Proliferator-Activated Receptor-α Activation as a Mechanism of Preventive Neuroprotection Induced by Chronic Fenofibrate Treatment. Journal of Neuroscience, 23(15), 6264–6271. 10.1523/JNEUROSCI.23-15-06264.2003
- European Chemicals Agency. (2016). New approach methodologies in regulatory science: proceedings of a scientific workshop: Helsinki, 19–20 April 2016. https://data.europa.eu/doi/10.2823/543644
- Fisone G, Borgkvist A, & Usiello A (2004). Caffeine as a psychomotor stimulant: mechanism of action. Cellular and Molecular Life Sciences, 61(7), 857–872. 10.1007/S00018-003-3269-3
- Gonzalez FJ, Peters JM, & Cattley RC (1998). Mechanisms of Action of the Nongenotoxic Peroxisome Proliferators: Role of the Peroxisome Proliferator-Activated Receptor α. https://academic.oup.com/jnci/article/90/22/1702/2519382
- Harrill JA, Everett LJ, Haggard DE, Sheffield T, Bundy JL, Willis CM, Thomas RS, Shah I, & Judson RS (2021). High-Throughput Transcriptomics Platform for Screening Environmental Chemicals. Toxicological Sciences, 181(1), 68–89. 10.1093/TOXSCI/KFAB009
- Harrill J, Shah I, Setzer RW, Haggard D, Auerbach S, Judson R, & Thomas RS (2019). Considerations for strategic use of high-throughput transcriptomics chemical screening data in regulatory decisions. In Current Opinion in Toxicology (Vol. 15). 10.1016/j.cotox.2019.05.004
- Harris HA, Katzenellenbogen JA, & Katzenellenbogen BS (2002). Characterization of the Biological Roles of the Estrogen Receptors, ERα and ERβ, in Estrogen Target Tissues in Vivo through the Use of an ERα-Selective Ligand. Endocrinology, 143(11), 4172–4177. 10.1210/en.2002-220403
- Jassal B, Matthews L, Viteri G, Gong C, Lorente P, Fabregat A, Sidiropoulos K, Cook J, Gillespie M, Haw R, Loney F, May B, Milacic M, Rothfels K, Sevilla C, Shamovsky V, Shorser S, Varusai T, Weiser J, … D’Eustachio P (2020). The reactome pathway knowledgebase. Nucleic Acids Research, 48. 10.1093/nar/gkz1031
- Jensen MA, Blatz DJ, & LaLone CA (2022). Defining the Biologically Plausible Taxonomic Domain of Applicability of an Adverse Outcome Pathway: A Case Study Linking Nicotinic Acetylcholine Receptor Activation to Colony Death. Environmental Toxicology and Chemistry. 10.1002/ETC.5501
- Jeong H, Mason SP, Barabási AL, & Oltvai ZN (2001). Lethality and centrality in protein networks. Nature, 411(6833), 41–42. 10.1038/35075138
- Judson R, Richard A, Dix DJ, Houck K, Martin M, Kavlock R, Dellarco V, Henry T, Holderman T, Sayre P, Tan S, Carpenter T, & Smith E (2009). The toxicity data landscape for environmental chemicals. Environmental Health Perspectives, 117(5), 685–695. 10.1289/EHP.0800168
- Judson RS, Thomas RS, Baker N, Simha A, Howey XM, Marable C, Kleinstreuer NC, & Houck KA (2019). Workflow for defining reference chemicals for assessing performance of in vitro assays. Altex, 36(2). 10.14573/altex.1809281
- Kalderimis A, Lyne R, Butano D, Contrino S, Lyne M, Heimbach J, Hu F, Smith R, Štěpán R, Sullivan J, & Micklem G (2014). InterMine: Extensive web services for modern biology. Nucleic Acids Research, 42(W1). 10.1093/nar/gku301
- Kersten S, & Stienstra R (2017). The role and regulation of the peroxisome proliferator activated receptor alpha in human liver. Biochimie, 136, 75–84. 10.1016/j.biochi.2016.12.019
- Kimelman D, & Xu W (2006). β-Catenin destruction complex: insights and questions from a structural perspective. Oncogene, 25(57), 7482–7491. 10.1038/sj.onc.1210055
- Klaunig JE, Babich MA, Baetcke KP, Cook JC, Corton JC, David RM, DeLuca JG, Lai DY, McKee RH, Peters JM, Roberts RA, & Fenner-Crisp PA (2003). PPARα Agonist-Induced Rodent Tumors: Modes of Action and Human Relevance. Critical Reviews in Toxicology, 33(6), 655–780. 10.1080/713608372
- Krewski D, Andersen ME, Tyshenko MG, Krishnan K, Hartung T, Boekelheide K, Wambaugh JF, Jones D, Whelan M, Thomas R, Yauk C, Barton-Maclaren T, & Cote I (2020). Toxicity testing in the 21st century: progress in the past decade and future perspectives. Archives of Toxicology, 94, 1–58. 10.1007/s00204-019-02613-4
- LaLone CA, Villeneuve DL, Burgoon LD, Russom CL, Helgen HW, Berninger JP, Tietge JE, Severson MN, Cavallin JE, & Ankley GT (2013). Molecular target sequence similarity as a basis for species extrapolation to assess the ecological risk of chemicals with known modes of action. Aquatic Toxicology, 144–145, 141–154. 10.1016/j.aquatox.2013.09.004
- LaLone CA, Villeneuve DL, Doering JA, Blackwell BR, Transue TR, Simmons CW, Swintek J, Degitz SJ, Williams AJ, & Ankley GT (2018). Evidence for Cross Species Extrapolation of Mammalian-Based High-Throughput Screening Assay Results. Environmental Science and Technology, 52(23), 13960–13971. 10.1021/acs.est.8b04587
- LaLone CA, Villeneuve DL, Lyons D, Helgen HW, Robinson SL, Swintek JA, Saari TW, & Ankley GT (2016). Sequence alignment to predict across species susceptibility (SeqAPASS): A web-based tool for addressing the challenges of cross-species extrapolation of chemical toxicity. Toxicological Sciences, 153(2), 228–245. 10.1093/toxsci/kfw119
- Lampen A, Zimnik S, & Nau H (2003). Teratogenic phthalate esters and metabolites activate the nuclear receptors PPARs and induce differentiation of F9 cells. Toxicology and Applied Pharmacology, 188(1), 14–23. 10.1016/S0041-008X(03)00014-0
- Maloney EK, & Waxman DJ (1999). trans-Activation of PPARα and PPARγ by Structurally Diverse Environmental Chemicals. Toxicology and Applied Pharmacology, 161(2), 209–218. 10.1006/TAAP.1999.8809
- Matsumoto H, Adachi S, & Suzuki Y (2005). Estrogenic effects of UV absorbers and their related compounds. YAKUGAKU ZASSHI, 125(8), 643–652. 10.1248/YAKUSHI.125.643
- Mi H, Dong Q, Muruganujan A, Gaudet P, Lewis S, & Thomas PD (2010). PANTHER version 7: improved phylogenetic trees, orthologs and collaboration with the Gene Ontology Consortium. Nucleic Acids Research, 38(suppl_1), D204–D210. 10.1093/NAR/GKP1019
- Mi H, Ebert D, Muruganujan A, Mills C, Albou LP, Mushayamaha T, & Thomas PD (2021). PANTHER version 16: A revised family classification, tree-based classification tool, enhancer regions and extensive API. Nucleic Acids Research, 49(D1). 10.1093/nar/gkaa1106
- Middleton AM, Reynolds J, Cable S, Baltazar MT, Li H, Bevan S, Carmichael PL, Dent MP, Hatherell S, Houghton J, Kukic P, Liddell M, Malcomber S, Nicol B, Park B, Patel H, Scott S, Sparham C, Walker P, & White A (2022). Are Non-animal Systemic Safety Assessments Protective? A Toolbox and Workflow. Toxicological Sciences, 189(1). 10.1093/toxsci/kfac068
- Mitrea C, Taghavi Z, Bokanizad B, Hanoudi S, Tagett R, Donato M, Voichita C, & Draghici S (2013). Methods and approaches in the topology-based analysis of biological pathways. Frontiers in Physiology, 4, 278. 10.3389/FPHYS.2013.00278
- Mortensen HM, Senn J, Levey T, Langley P, & Williams AJ (2021). The 2021 update of the EPA’s adverse outcome pathway database. Scientific Data, 8(1). 10.1038/s41597-021-00962-3
- Musicoro VB, Sortino V, Pecora G, Tosto M, Lo Bianco M, Soma R, Romano C, Falsaperla R, & Praticò AD (2021). Gamma-aminobutyric acid type a receptor genes and their related epilepsies. Journal of Pediatric Neurology. 10.1055/S-0041-1727269
- Mutwil M, Usadel B, Schütte M, Loraine A, Ebenhöh O, & Persson S (2009). Assembly of an Interactive Correlation Network for the Arabidopsis Genome Using a Novel Heuristic Clustering Algorithm. Plant Physiology, 152(1), 29–43. 10.1104/PP.109.145318
- Nall AH, Shakhmantsir I, Cichewicz K, Birman S, Hirsh J, & Sehgal A (2016). Caffeine promotes wakefulness via dopamine signaling in Drosophila. Scientific Reports, 6(1), 1–12. 10.1038/srep20938
- Nightingale A, Antunes R, Alpi E, Bursteinas B, Gonzales L, Liu W, Luo J, Qi G, Turner E, & Martin M (2017). The Proteins API: accessing key integrated protein and genome information. Nucleic Acids Research, 45(W1), W539–W544. 10.1093/nar/gkx237
- Nilsson S, & Gustafsson JÅ (2008). Biological Role of Estrogen and Estrogen Receptors. Critical Reviews in Biochemistry and Molecular Biology, 37(1), 1–28. 10.1080/10409230290771438
- Perkins EJ, Ankley GT, Crofton KM, Garcia-Reyero N, LaLone CA, Johnson MS, Tietge JE, & Villeneuve DL (2013). Current perspectives on the use of alternative species in human health and ecological hazard assessments. Environmental Health Perspectives, 121(9), 1002–1010. 10.1289/EHP.1306638
- Pittman ME, Edwards SW, Ives C, & Mortensen HM (2018). AOP-DB: A database resource for the exploration of Adverse Outcome Pathways through integrated association networks. Toxicology and Applied Pharmacology, 343. 10.1016/j.taap.2018.02.006
- Rajagopal R, Baltazar MT, Carmichael PL, Dent MP, Head J, Li H, Muller I, Reynolds J, Sadh K, Simpson W, Spriggs S, White A, & Kukic P (2022). Beyond AOPs: A Mechanistic Evaluation of NAMs in DART Testing. Frontiers in Toxicology, 4. 10.3389/ftox.2022.838466
- Richard AM, Judson RS, Houck KA, Grulke CM, Volarath P, Thillainadarajah I, Yang C, Rathman J, Martin MT, Wambaugh JF, Knudsen TB, Kancherla J, Mansouri K, Patlewicz G, Williams AJ, Little SB, Crofton KM, & Thomas RS (2016). ToxCast Chemical Landscape: Paving the Road to 21st Century Toxicology. Chemical Research in Toxicology. 10.1021/acs.chemrestox.6b00135
- Rivetti C, Allen TEH, Brown JB, Butler E, Carmichael PL, Colbourne JK, Dent M, Falciani F, Gunnarsson L, Gutsell S, Harrill JA, Hodges G, Jennings P, Judson R, Kienzler A, Margiotta-Casaluci L, Muller I, Owen SF, Rendal C, … Campos B (2020). Vision of a near future: Bridging the human health–environment divide. Toward an integrated strategy to understand mechanisms across species for chemical safety assessment. In Toxicology in Vitro (Vol. 62). 10.1016/j.tiv.2019.104692
- Rivetti C, Houghton J, Basili D, Hodges G, & Campos B (2023). Genes-to-Pathways Species Conservation ANalysis (G2P-SCAN): enabling the exploration of conservation of biological pathways and processes across species. Environmental Toxicology and Chemistry. 10.1002/etc.5600
- Robotti S (2021). The heritable legacy of diethylstilbestrol: a bellwether for endocrine disruption in humans. Biology of Reproduction, 105(3), 687–689. 10.1093/biolre/ioab146
- Shanle EK, & Xu W (2011). Endocrine Disrupting Chemicals Targeting Estrogen Receptor Signaling: Identification and Mechanisms of Action. Chemical Research in Toxicology, 24(1), 6–19. 10.1021/tx100231n
- Shannon P, Markiel A, Ozier O, Baliga NS, Wang JT, Ramage D, Amin N, Schwikowski B, & Ideker T (2003). Cytoscape: A Software Environment for Integrated Models of Biomolecular Interaction Networks. Genome Research, 13(11), 2498. 10.1101/GR.1239303
- Shiau AK, Barstad D, Loria PM, Cheng L, Kushner PJ, Agard DA, & Greene GL (1998). The Structural Basis of Estrogen Receptor/Coactivator Recognition and the Antagonism of This Interaction by Tamoxifen. Cell, 95(7), 927–937. 10.1016/S0092-8674(00)81717-1
- Simeone TA, Wilcox KS, & White HS (2006). Subunit selectivity of topiramate modulation of heteromeric GABAA receptors. Neuropharmacology, 50(7), 845–857. 10.1016/J.NEUROPHARM.2005.12.006
- Smith RN, Aleksic J, Butano D, Carr A, Contrino S, Hu F, Lyne M, Lyne R, Kalderimis A, Rutherford K, Stepan R, Sullivan J, Wakeling M, Watkins X, & Micklem G (2012). InterMine: A flexible data warehouse system for the integration and analysis of heterogeneous biological data. Bioinformatics, 28(23). 10.1093/bioinformatics/bts577
- Spurgeon D, Lahive E, Robinson A, Short S, & Kille P (2020). Species Sensitivity to Toxic Substances: Evolution, Ecology and Applications. In Frontiers in Environmental Science (Vol. 8). 10.3389/fenvs.2020.588380
- Suen AA, Jefferson WN, Wood CE, Padilla-Banks E, Bae-Jump VL, & Williams CJ (2016). SIX1 Oncoprotein as a Biomarker in a Model of Hormonal Carcinogenesis and in Human Endometrial Cancer. Molecular Cancer Research, 14(9), 849–858. 10.1158/1541-7786.MCR-16-0084
- Szklarczyk D, Gable AL, Lyon D, Junge A, Wyder S, Huerta-Cepas J, Simonovic M, Doncheva NT, Morris JH, Bork P, Jensen LJ, & Von Mering C (2019). STRING v11: protein–protein association networks with increased coverage, supporting functional discovery in genome-wide experimental datasets. Nucleic Acids Research, 47(D1), D607–D613. 10.1093/NAR/GKY1131
- Terakawa J, Serna VA, Nair DM, Sato S, Kawakami K, Radovick S, Maire P, & Kurita T (2020). SIX1 cooperates with RUNX1 and SMAD4 in cell fate commitment of Müllerian duct epithelium. Cell Death & Differentiation, 27, 3307–3320. 10.1038/s41418-020-0579-z
- The UniProt Consortium. (2019). UniProt: a worldwide hub of protein knowledge. Nucleic Acids Research, 47(D1), D506–D515. 10.1093/nar/gky1049
- Toda C, Okamoto Y, Ueda K, Hashizume K, Itoh K, & Kojima N (2004). Unequivocal estrogen receptor-binding affinity of phthalate esters featured with ring hydroxylation and proper alkyl chain size. Archives of Biochemistry and Biophysics, 431(1), 16–21. 10.1016/J.ABB.2004.07.028
- Toxicity testing in the 21st century: A vision and a strategy. (2007). National Academies Press. 10.17226/11970
- USEPA. (2018). Strategic Plan to Promote the Development and Implementation of Alternative Test Methods Within the TSCA Program.
- Van Der Zalm AJ, Barroso J, Browne P, Casey W, Gordon J, Henry TR, Kleinstreuer NC, Lowit AB, Perron M, & Clippinger AJ (2022). A framework for establishing scientific confidence in new approach methodologies. Archives of Toxicology, 96, 2865–2879. 10.1007/s00204-022-03365-4
- White HS, Brown SD, Woodhead JH, Skeen GA, & Wolf HH (1997). Topiramate enhances GABA-mediated chloride flux and GABA-evoked chloride currents in murine brain neurons and increases seizure threshold. Epilepsy Research, 28(3), 167–179. 10.1016/S0920-1211(97)00045-4
- White HS, Brown SD, Woodhead JH, Skeen GA, & Wolf HH (2000). Topiramate Modulates GABA-Evoked Currents in Murine Cortical Neurons by a Nonbenzodiazepine Mechanism. Epilepsia, 41(SUPPL. 1), 17–20. 10.1111/J.1528-1157.2000.TB02165.X
- Wittwehr C, Aladjov H, Ankley G, Byrne HJ, de Knecht J, Heinzle E, Klambauer G, Landesmann B, Luijten M, MacKay C, Maxwell G, Meek MEB, Paini A, Perkins E, Sobanski T, Villeneuve D, Waters KM, & Whelan M (2017). How Adverse Outcome Pathways Can Aid the Development and Use of Computational Prediction Models for Regulatory Toxicology. Toxicological Sciences, 155(2), 326–336. 10.1093/TOXSCI/KFW207
- Xia P, Zhang H, Peng Y, Shi W, & Zhang X (2020). Pathway-based assessment of single chemicals and mixtures by a high-throughput transcriptomics approach. Environment International, 136, 105455. 10.1016/J.ENVINT.2019.105455
- Yahya FA, Hashim NFM, Israf Ali DA, Chau Ling T, & Cheema MS (2021). A brief overview to systems biology in toxicology: The journey from in to vivo, in-vitro and –omics. Journal of King Saud University - Science, 33(1), 101254. 10.1016/J.JKSUS.2020.101254
- Zhang Q, Li J, Middleton A, Bhattacharya S, & Conolly RB (2018). Bridging the Data Gap From in vitro Toxicity Testing to Chemical Safety Assessment Through Computational Modeling. Frontiers in Public Health, 6, 261. 10.3389/fpubh.2018.00261
- Zotenko E, Mestre J, O’Leary DP, & Przytycka TM (2008). Why Do Hubs in the Yeast Protein Interaction Network Tend To Be Essential: Reexamining the Connection between the Network Topology and Essentiality. PLoS Computational Biology, 4(8), e1000140. 10.1371/journal.pcbi.1000140