EXTRACTING AND BENCHMARKING EMERGING ADVERSE OUTCOME PATHWAY KNOWLEDGE

Nathan L Pollesch; Daniel L Villeneuve; Jason M O’Brien

doi:10.1093/toxsci/kfz006

Author manuscript; available in PMC: 2023 Oct 2.

Published in final edited form as: Toxicol Sci. 2019 Apr 1;168(2):349–364. doi: 10.1093/toxsci/kfz006


PMCID: PMC10545168 NIHMSID: NIHMS1918157 PMID: 30715536

Abstract

As the community of toxicological researchers, risk assessors, and risk managers adopts the adverse outcome pathway (AOP) framework for organizing toxicological knowledge, the number and diversity of AOPs in the online AOP knowledgebase (KB) continue to grow. To track and investigate this growth, AOPs in the AOP-KB were assembled into a single network. Summary measures on the current state of the AOP-KB and the overall connectivity and structural features of the resulting network were calculated. Our results show that networking the 187 user-defined AOPs currently described in the AOP-KB resulted in the emergence of 9405 unique, previously undescribed, linear AOPs (LAOPs). To investigate patterns in this emerging knowledge, we assembled the AOP-KB network retrospectively by sequentially adding each of the 187 user-defined AOPs and found that the creation of new AOPs that borrowed components from AOPs already in the KB best described the emergence of new LAOPs. However, the introduction of non-adjacent key event relationships and cycles among key events (KEs) also plays a key role in the emergence of LAOPs. We provide examples of how to identify application-specific critical paths from this large number of LAOPs. Our research shows that the global AOP network may have considerable value as a source of emergent toxicological knowledge. These findings are not only helpful for understanding the nature of this emergent information, but can also be used to manage and guide future development of the AOP-KB and to tailor this wealth of information to specific applications.

Keywords: Adverse Outcome Pathway, AOP Networks, Network Analysis, Emergent Knowledge

INTRODUCTION

Adverse outcome pathways (AOPs) provide researchers and regulators a formal framework to link molecular level perturbations elicited by chemicals or other stressors (i.e., molecular initiating events, or MIEs) to adverse outcomes (AOs) of regulatory significance by defining a series of evidence-based causal relationships between measurable key events (KEs) (Ankley et al., 2010). AOP knowledge is contributed by the international scientific community via crowd-sourcing and stored online in the AOP knowledgebase (AOP-KB), which is currently accessed primarily via the AOP-Wiki (aopwiki.org). In the AOP-KB, AOPs are created by users as collections of sequential KE descriptions (including MIE and AO descriptions) and key event relationship (KER) descriptions, where each KER describes the causal evidence between one upstream KE and one downstream KE. The modular nature of the AOP framework permits the sharing of KEs and KERs between individual AOPs, and as such, AOPs with shared components naturally form into AOP networks (Villeneuve et al., 2014; Knapen et al., 2018; Villeneuve et al., 2018).

The resulting AOP networks capture the potential interactions between individual AOPs. They also capture the diversity of effects that may occur in different species, life stages, target organs, etc. following perturbation of a biological pathway or function and, conversely, the range of different MIEs that can plausibly contribute to a given adverse outcome. Considering this broader, systems-level scope in which AOPs operate is expected to be critical to their practical use in risk assessment and decision-making (Villeneuve et al. 2014). For example, chemicals or mixtures of chemicals and/or other stressors will often exert their adverse effects via more than one pathway. Most of these pathways do not occur in isolation, but may interact and influence one another. Thus, it is important to simultaneously consider all pathways induced by a stressor and any potential interactions among them. The modularity of the AOP framework was thus designed as a tractable and pragmatic approach for capturing and summarizing our current knowledge about induced pathway interactions in a complex systems context (LaLone et al., 2017).

As crowd-sourced contributions to the AOP-KB continue to accumulate, the resulting network of AOPs grows not only in complexity, but also in its potential for revealing previously undiscovered relationships and knowledge. The growing AOP-KB network and its components have attributes that emerge and evolve over time. These include basic properties such as the number of AOPs, KEs and KERs, or more complex measures of network connectivity that are determined using modern graph theory methods (Pavlopoulos et al., 2011; Villeneuve et al., 2018). In addition to these properties, previously undescribed pathways also emerge when AOPs are networked together through sharing of KEs (Figure 1). These “emergent” AOPs will often represent potential, previously uncharacterized, mechanisms of toxicity and/or points of AOP interaction. Identifying, characterizing, and interpreting these emergent properties and pathways will not only be useful for monitoring and managing the growth of the AOP-KB, but will also be integral for determining the most effective way to tailor AOP knowledge to any particular application.

Figure 1: Two linear AOPs are combined to form an AOP network. When two or more user-defined AOPs that share a key event are combined into a network, previously undefined AOPs can emerge.

The objectives of the current research were three-fold. First, we wanted to identify and develop baseline metrics that could be used to track the growth and development of the AOP-KB over time. This could be used as one indicator of success relative to the uptake and impact of the AOP framework (Carusi et al. 2018). Second, we aimed to apply computational approaches to characterize the emergence of new relationships as AOPs are created and linked together in the AOP-KB, including the identification of the major drivers. Finally, we wanted to provide further demonstration of how network analysis approaches could help isolate AOPs of interest from a broader network of AOPs. To achieve these aims, we assembled a network from all AOPs currently in the AOP-KB (from AOPWiki.org as of April 1, 2018). We then measured a variety of properties of the resulting network, including basic summary information, structural features, and connectivity metrics. We also identified and quantified the emergent AOPs, and identified several factors that influence the rate at which they occur. Lastly, we provided several examples of how to identify paths of interest from the large amount of emergent AOP information and tailor these paths for particular research questions.

METHODS

AOP data was downloaded in XML format from the “Permanent Downloads” section of the AOPWiki website (https://aopwiki.org/info_pages/5). We used the data set dated April 1, 2018. All analyses were conducted using R (R Core Team, 2017). The relevant data was parsed from the XML file using the xml2 library. Many of the functions used for network analysis were from the R package igraph (Csardi and Nepusz, 2006). Network analyses specific to AOP networks were derived as needed. All R scripts used for network assembly, analysis and visualization are available in the Supplementary Material.

AOP-KB NETWORK CONSTRUCTION

All AOPs were assembled into a single, directed network based on their KER information. The first step in creating the AOP-KB network was downloading the XML file from aopwiki.org. Next, the xml2 R package was used to import the XML into R. The non-redundant set of upstream and downstream KEs from each KER (KEup and KEdown) were represented as nodes in the network. Each unique KER was represented as a directed edge linking its respective KEup and KEdown. This was done by describing all KERs in the AOP-KB as an edge list table, such that each KER occurred only once in the table and was represented as a KEup and KEdown pair. This edge list was then used to construct the AOP-KB network using the graph_from_edgelist function from igraph. Two types of AOPs present in the KB were removed before analysis: archived and empty AOPs. Archived AOPs are from the very early design phase of the AOP-KB and have been replaced with updated versions. Empty AOPs are AOP pages that were created but had no KEs or KERs associated with them. No other properties (such as OECD status, taxonomic applicability, etc.) were considered in the initial construction of the AOP-KB network.
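The edge list step described above can be illustrated with a minimal sketch. The authors worked in R with xml2 and igraph; the Python below is only a stand-in for the idea of collapsing KER records into a non-redundant edge list. The record layout, the function name build_network, and all IDs are hypothetical and are not taken from the AOP-KB or the supplementary scripts.

```python
from collections import defaultdict

# Hypothetical KER records: (KER ID, upstream KE ID, downstream KE ID, AOP ID).
# These IDs are invented for illustration; they are not AOP-KB data.
ker_records = [
    ("ker1", "ke1", "ke2", "aop1"),
    ("ker2", "ke2", "ke3", "aop1"),
    ("ker1", "ke1", "ke2", "aop2"),  # the same KER reused by a second AOP
    ("ker3", "ke2", "ke4", "aop2"),
]


def build_network(records):
    """Collapse KER records into nodes, a non-redundant edge list, and AOP membership."""
    edges = {}                   # KER ID -> (KEup, KEdown); each KER kept once
    ker_aops = defaultdict(set)  # KER ID -> set of AOP IDs that use it
    for ker_id, ke_up, ke_down, aop_id in records:
        edges.setdefault(ker_id, (ke_up, ke_down))
        ker_aops[ker_id].add(aop_id)
    # Nodes are the non-redundant set of KEs appearing in any KER.
    nodes = sorted({ke for pair in edges.values() for ke in pair})
    return nodes, edges, dict(ker_aops)


nodes, edges, ker_aops = build_network(ker_records)
print(nodes)             # KEs that become network nodes
print(edges)             # one directed edge per unique KER
print(ker_aops["ker1"])  # AOPs that share this KER
```

In this toy input, the reused KER contributes a single edge but retains both AOP IDs, which is the property that later allows shared components to link AOPs into a network.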

After the AOP-KB network was constructed, several attributes from the AOP-KB dataset were assigned to each KE and KER (Table 1). For KEs, these attributes included: key event ID, AOP ID, key event designators (MIE, KE, or AO) and level of biological organization (LOBO). Attributes from the AOP-KB assigned to KERs included: KER ID, AOP ID, weight of evidence (WOE), and quantitative understanding (OECD, 2018). For some KEs and KERs, these attributes were not specified in the AOP-KB. In these instances, we assigned default attribute values. For example, if a KER did not have a quantitative understanding score in the AOP-KB, we assigned it a default value of “low” for that attribute (Table 1). The final assembled network is available in the Supplementary Material in two formats: as an igraph object that can be imported into R, or as a pair of tab-delimited tables (one table for KEs and KE attributes, and one table for KERs and all KER attributes).

Table 1:

Attributes from the AOP-KB that were assigned to each key event (KE) and key event relationship (KER)

KE Attribute	Description

KE ID	Unique identifier number for each KE
AOP ID	List of IDs for all AOPs that a KE is a member of
KE Designator	Indicates whether the event is a regular key event (KE), a molecular initiating event (MIE), or an adverse outcome (AO)
Level of Biological Organization	Indicates the level of biological organization as either: molecular, cellular, tissue, organ, individual, or population
KER Attribute	Description

KER ID	Unique identifier number for each KER
AOP ID	List of IDs for all AOPs that a KER is a member of
Weight of Evidence^*	A value of low, moderate or high that describes the level of confidence in the empirical evidence supporting the KER
Quantitative Understanding^*	A value of low, moderate or high that describes the level of quantitative understanding of the KER


^* For KERs that did not have a value specified for these attributes in the AOP-KB, a default value of “low” was assigned.
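The default-value rule for KER attributes can be sketched as follows. This is a minimal Python illustration, not the authors' R code. The field names and the function fill_ker_defaults are hypothetical, and the numeric mapping (low = 1, moderate = 2, high = 3) is an assumption made here to show how an average score "out of 3" could be computed; the source does not state the mapping it used.

```python
# Hypothetical field names for the two KER attributes that receive a default.
KER_SCORE_FIELDS = ("weight_of_evidence", "quantitative_understanding")
DEFAULT_SCORE = "low"  # default stated in the text for unspecified values

# ASSUMPTION: ordinal mapping used only for this illustration.
SCORE_VALUES = {"low": 1, "moderate": 2, "high": 3}


def fill_ker_defaults(ker):
    """Return a copy of a KER record with missing scores set to the default."""
    filled = dict(ker)
    for field in KER_SCORE_FIELDS:
        if not filled.get(field):  # missing key, None, or empty string
            filled[field] = DEFAULT_SCORE
    return filled


# Invented example records, not AOP-KB data.
raw_kers = [
    {"ker_id": "ker1", "weight_of_evidence": "high", "quantitative_understanding": None},
    {"ker_id": "ker2"},
]

filled_kers = [fill_ker_defaults(k) for k in raw_kers]
mean_woe = sum(SCORE_VALUES[k["weight_of_evidence"]] for k in filled_kers) / len(filled_kers)
print(filled_kers)
print(mean_woe)
```

Because unspecified values default to the lowest category, any average computed this way is pulled downward by incomplete records, which is worth keeping in mind when reading summary scores.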

NETWORK ANALYSIS AND VISUALIZATION

Organizing the AOP-KB information in a single global network enabled us to utilize many standard techniques developed in the fields of network analysis and graph theory, for which many resources exist (e.g., Kolaczyk and Csardi, 2014). Where standard techniques did not account for the unique aspects of AOP networks, we derived AOP-specific network analyses.

Key event relationships included in a given AOP description can be described as either adjacent or non-adjacent (OECD 2018). Adjacent KERs capture evidence connecting upstream and downstream events that are next to one another in the sequence defined for a specific AOP. Non-adjacent KERs are used to capture pair-wise evidence for a causal connection between an upstream and downstream KE, where one or more KEs lie in between in the sequence defined for the AOP. Importantly, because the number of KEs to include in a linear AOP description is subjective, edges (KERs) that are adjacent in one AOP may be non-adjacent in another. For network construction, edge adjacency was determined using a custom algorithm (Supplementary Figure 1). Briefly, the algorithm used a three-step process to identify non-adjacent KERs: 1) identify the KERs for which KEup and KEdown are also connected via an alternate path that is longer than one KER; 2) identify all the longest unique paths in the network; 3) identify a KER as non-adjacent if its KEup and KEdown are connected via an alternate path (from step 1) and its removal will not disrupt any unique longest paths (from step 2). The code and a more detailed explanation of the algorithm are provided in the supplemental R scripts.
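Step 1 of the adjacency algorithm can be illustrated with a short sketch. The Python below covers only the candidate search (KERs whose KEup and KEdown remain connected when the KER itself is ignored). Steps 2 and 3 are deliberately not reproduced, because their details are given only in the supplemental R scripts. The function names and the example network are hypothetical.

```python
from collections import deque


def reachable_without(adj, source, target, skip_edge):
    """Breadth-first search from source to target that ignores one edge."""
    seen = {source}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nxt in adj.get(node, ()):
            if (node, nxt) == skip_edge or nxt in seen:
                continue
            if nxt == target:
                return True
            seen.add(nxt)
            queue.append(nxt)
    return False


def candidate_non_adjacent(edge_list):
    """Step 1 only: KERs whose endpoints are also joined by a longer alternate path.

    Assumes edge_list holds each (KEup, KEdown) pair once, so any path found
    after skipping the direct edge must contain at least two KERs.
    """
    adj = {}
    for up, down in edge_list:
        adj.setdefault(up, []).append(down)
    return [(up, down) for up, down in edge_list
            if reachable_without(adj, up, down, (up, down))]


# Invented example: ke1 -> ke2 -> ke3 plus a direct ke1 -> ke3 shortcut.
example_edges = [("ke1", "ke2"), ("ke2", "ke3"), ("ke1", "ke3")]
candidates = candidate_non_adjacent(example_edges)
print(candidates)
```

A KER flagged here is only a candidate; under the procedure described above it is labeled non-adjacent only if removing it also leaves every longest unique path intact.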

Our study of the structure and attributes of the AOP-KB network included determining network components, centrality measures for KEs, path determinations, and connectivity measures between KE pairs, specifically, the MIE/AO pairs that describe AOPs. Weakly and strongly connected components in the AOP-KB network were identified using the components function of igraph. Linear AOPs were identified by determining all unique simple paths between all possible MIE and AO pairs using the all_simple_paths igraph function. A simple path is a list of sequentially connected nodes in which no node is repeated. Degree centrality, which counts the number of KERs entering and leaving a given KE, was determined for all KEs using degree in igraph. An AOP-specific betweenness metric called “AOP occurrence” was developed and determined for KEs by counting the number of times each KE occurred in a unique linear AOP. Edge connectivity between all possible MIE and AO pairs was determined using the edge_connectivity igraph function; the edge connectivity of two nodes is the minimum number of edges that, if removed, will eliminate all possible paths between them.
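The path-based measures in this paragraph can be sketched on a toy network. The authors used igraph functions in R; the Python below re-implements the same ideas (simple paths between MIE/AO pairs, AOP occurrence, and degree) with the standard library only. The network, the KE names, and the local all_simple_paths function are hypothetical illustrations, not the igraph API or AOP-KB data.

```python
from collections import Counter


def all_simple_paths(adj, source, target):
    """Yield every path from source to target in which no node is repeated."""
    stack = [(source, [source])]
    while stack:
        node, path = stack.pop()
        for nxt in adj.get(node, ()):
            if nxt in path:          # would repeat a node: not a simple path
                continue
            if nxt == target:
                yield path + [nxt]
            else:
                stack.append((nxt, path + [nxt]))


# Invented network: two MIEs, two intermediate KEs, one AO.
edge_list = [("m1", "k1"), ("k1", "k2"), ("k2", "a1"), ("k1", "a1"), ("m2", "k2")]
mies, aos = ["m1", "m2"], ["a1"]

adj = {}
for up, down in edge_list:
    adj.setdefault(up, []).append(down)

# Linear AOPs: all simple paths between every MIE/AO pair.
laops = [p for mie in mies for ao in aos for p in all_simple_paths(adj, mie, ao)]

# AOP occurrence: number of linear AOPs in which each KE appears.
aop_occurrence = Counter(ke for path in laops for ke in path)

# Degree: number of KERs entering plus leaving each KE.
degree = Counter()
for up, down in edge_list:
    degree[up] += 1
    degree[down] += 1

print(len(laops), dict(aop_occurrence), dict(degree))
```

Exhaustive path enumeration grows quickly with network size and connectivity, so a sketch like this is suited to small examples or filtered subnetworks.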

There are many possibilities for visualization of a network (see Kerren et al., 2014). Both the attributes of the network and the intended application influence the selection of an appropriate layout. All network visualizations, unless otherwise indicated, were plotted using the Fruchterman and Reingold layout (1991). When applicable (i.e., when no cycles were present in the network visualized) we utilized a custom AOP network layout based on the topological sorting of the network found with the topo_sort igraph function. Topological sorting of an AOP network lists its KEs in a linear sequence such that for every KER in the network, KEup always occurs before KEdown (see Villeneuve et al., 2017).
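Topological sorting as described above can be sketched with Kahn's algorithm. This is a standard technique swapped in for illustration; the source used the topo_sort function of igraph in R, and the Python function of the same name below is a local, hypothetical stand-in. The example KE names are invented.

```python
from collections import deque


def topo_sort(nodes, edge_list):
    """Order nodes so that every KEup precedes its KEdown (Kahn's algorithm)."""
    indegree = {n: 0 for n in nodes}
    adj = {n: [] for n in nodes}
    for up, down in edge_list:
        adj[up].append(down)
        indegree[down] += 1
    # Start from KEs that have no upstream KE.
    queue = deque(n for n in nodes if indegree[n] == 0)
    order = []
    while queue:
        node = queue.popleft()
        order.append(node)
        for nxt in adj[node]:
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                queue.append(nxt)
    if len(order) != len(nodes):
        # Some nodes never reached indegree 0, which only happens with a cycle.
        raise ValueError("network contains a cycle; no topological order exists")
    return order


example_nodes = ["ke3", "ke1", "ke2", "ke4"]
example_edges = [("ke1", "ke2"), ("ke2", "ke3"), ("ke1", "ke4")]
order = topo_sort(example_nodes, example_edges)
print(order)
```

The explicit error on cyclic input mirrors the condition stated in the text: the sorted layout applies only when the visualized network contains no cycles.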

RESULTS AND DISCUSSION

A recent survey of the regulatory toxicology community identified the analysis and application of AOPs in a network context as a key research need for advancing the regulatory utility of the AOP framework (LaLone et al., 2017). In response to this survey, an international work group of researchers, regulators and other stakeholders developed a vision and series of recommendations for AOP network applications (Knapen et al., 2018) and AOP network analysis (Villeneuve et al., 2018). Metrics to track the growth and development of the AOP-KB were also recommended among a set of critical indicators to measure the impact and success of the AOP framework over time (Carusi et al. 2018). Providing a set of baseline metrics is a critical first step in that process. Consequently, in this study, we applied several network analytical methods to the network derived from all AOPs currently stored in the AOP-KB (April 1, 2018). This work served three major purposes: 1) to measure baseline characteristics of the AOP-KB that could be used to track growth and development of the AOP-KB over time; 2) to extract and begin to interpret the new knowledge that emerges from combining discrete AOPs into a larger network (i.e., emergent AOP knowledge) as a result of collaborative contributions to the AOP-KB; and 3) to demonstrate how network analysis approaches could be used to find AOPs of interest from complex networks of AOPs.

We constructed the AOP-KB network from the list of all AOPs described in the AOP-KB as of April 1, 2018 (downloaded from AOPWiki.org/downloads). At any point in time, AOPs in the AOP-KB may be at different points of development. On April 1, 2018, only a relative handful of user-specified AOPs had been subjected to a formal OECD-coordinated review process. Indeed, a survey of all of the user-defined fields (e.g., title, description, empirical evidence) in the KB indicated that the majority of AOPs, KEs and KERs were only partially complete (Table 2).

Table 2:

AOP Knowledge base field attribute summary, including percent completed by attribute

AOP FIELDS

		#	%
Total AOPs		219	-
AOPs with:	title	219	100.0
	short-name	219	100.0
	authors	175	79.9
	abstract	116	53.0
	background	28	12.8
	at least 1 KE	205	93.6
	at least 1 KER	190	86.8
	MIE	182	83.1
	Evidence supporting MIE	51	23.3
	AO	200	91.3
	Regulatory examples of AO	55	25.1
	Essentiality of KEs	44	20.1
	taxonomic domain^*	n/a	n/a
	sex domain	117	53.4
	life-stage domain	101	46.1
	potential applications	36	16.4
	AOP stressors	78	35.6
	Overall assessment: description	57	26.0
	Overall assessment: domain of applicability	53	24.2
	Overall assessment: KE essentiality	44	20.1
	Overall assessment: WOE	47	21.5
	Overall assessment: Quantitative	44	20.1
Average percent fields with entry			49.2
KE FIELDS

		#	%
Total KEs		911	-
KEs with:	title	911	100.0
	short name	911	100.0
	level of biological organization	911	100.0
	organ/cell term	500	54.9
	description	257	28.2
	measurement method	231	25.4
	taxonomic domain	182	20.0
	sex domain	89	9.8
	life-stage domain	87	9.5
	evidence supporting domain	203	22.3
Average percent fields with entry			47.0
KER FIELDS

		#	%
Total KERs		1056	-
KERs with:	title	1056	100.0
	description	239	22.6
	biological plausibility	245	23.2
	empirical support	251	23.8
	quantitative understanding	213	20.2
	uncertainties and inconsistencies	213	20.2
	taxonomic domain	161	15.2
	sex domain	103	9.8
	life-stage domain	97	9.2
	evidence supporting domain	200	18.9
Average percent fields with entry			26.3


^* Field information currently unavailable from aopwiki.org XML data.

On average, AOPs contained information in only 49.2% of possible fields. Whereas most AOPs had title information and information about their component KEs and KERs, very few had entries in the evidence and assessment fields. Similarly, only 47% and 26.3% of user-defined fields contained information for KEs and KERs, respectively. Again, the fields with the lowest amount of information were those associated with describing the supporting evidence for the KEs and KERs. Consequently, because of their varying degrees of completion, many of the AOPs in the KB had unconventional or even incorrect structures. Nevertheless, we included all AOPs, regardless of their stage of development or adherence to accepted conventions, in our KB network and analysis. We felt this would provide an accurate depiction of the “baseline” condition of the AOP-KB (ca. April 2018). Further, it has been recognized that AOPs at nearly all levels of development have utility for at least some applications (Edwards et al. 2016). Given its open, crowd-sourced nature, even with ongoing curation, the AOP-KB may always contain partial or atypical AOPs at various stages of development. Describing an accurate baseline that represents the current conditions of the AOP-KB will help reveal whether ongoing outreach and guidance concerning AOP development approaches can improve the quality and structuring of AOP information being assembled in the AOP-KB.

Just as the AOP-KB network contains all AOPs regardless of their level of development or review, it also contains AOPs for all species, life stages, sexes, etc. currently represented in the AOP-KB. No attempts were made to restrict content based on biological attributes. Consequently, some of the emergent pathways within the network will not necessarily be biologically plausible (for example, if a pathway contains some segments that are only relevant to males and other segments that are only relevant to females). The proposed concepts of “filters” and “layers” (Knapen et al., 2018) would be useful tools for identifying the most relevant AOPs, and for tailoring an AOP network to any particular utilization. Although this approach was not employed in our current study, recent work towards the development of the AOP database tool stands out as a useful approach for AOP filtering in the future by using the stored AOP information on AOP-gene targets, chemical, disease, pathway, and species information to identify contextually and biologically relevant pathways (Pittman et al., 2018).

It is important to keep the broad inclusiveness of AOPs in mind when interpreting all downstream analyses of the network created for this study. Likewise, our inclusion of partial and unreviewed AOPs in our analysis should not be taken to indicate substantiation or critique of the specific AOP information used in or derived from the present analyses. Instead, our research develops and identifies techniques for determining the status of the AOP-KB and for understanding some of the properties of the emerging knowledge that results as new AOP information is included in the AOP-KB.

BASIC AOP-KB NETWORK SUMMARY METRICS

The AOP-KB is a crowd-sourced repository and AOPs are contributed by authors and developers from institutions all over the world. Supplementary Figure 2 shows the growth of unique AOP contributors to the AOP-KB. The AOP-KB is ever-evolving as contributors constantly add, remove, and refine AOPs. Thus, the number of AOPs, and component KEs and KERs, in the knowledgebase varies over time. At the time of our analysis there were a total of 219 unique user-specified AOPs in the knowledgebase. Of these, 6 were archived (i.e., early versions that were replaced as guidance evolved) and 26 were empty (did not contain any KERs and/or KEs); these were therefore not included in the network construction and were omitted from further analysis. The remaining 187 AOPs were composed of 840 unique KEs and 1050 KERs. The number of KEs per AOP ranged from 2 to 29, with an average of 7.3 and median of 7 (Supplementary Figure 3A). The number of KERs per AOP ranged from 1 to 26, with an average of 7.2 and median of 6 (Supplementary Figure 3B). Several partially described AOPs contained KEs that were not linked by any KERs. There were 15 such KEs. These KEs could not be used for network construction and were also omitted from further analysis. Thus, the remaining 825 KEs and 1050 KERs from 187 user-defined AOPs were used to construct the AOP-KB network. The result was a directed graph where all KEs were represented as nodes and all KERs were represented as directed edges (Figure 2). Summary information for the AOP-KB network is provided in Table 3.

Figure 2: All AOPs in the AOP-KB were assembled into a network. Key events (KEs) are represented as circular nodes. Molecular Initiating Events (MIEs) and Adverse Outcomes (AOs) are two specialized KEs represented as green and red nodes, respectively. Key event relationships (KERs) are represented as directed edges between nodes. Non-adjacent KERs are represented as orange edges.

Table 3:

AOP-KB Network Benchmarking Summary

AOP Attributes	#

User-specified AOPs	219
Archived AOPs	6
Empty AOPs	26
Networked	187
Key Events in network	825
Key Events not included in network	15
Key Event Relationships in network	1050
Key Event Attributes	#

Key Event Designator
Molecular Initiating Events	126
Regular Key Events	586
Adverse Outcomes	113
Level of Biological Organization
Molecular	226
Cellular	270
Tissue	138
Organ	78
Individual	88
Population	25
Key Event Relationship Attributes	#
Adjacency
Adjacent KERs	943
Non-adjacent KERs	107
Average weight of evidence score (out of 3)	1.9
Average quantitative understanding score (out of 3)	1.3
Network Attributes	#

Components
Weakly connected components	34
KEs in largest weakly connected component	578
Strongly connected components	4
KEs in largest strongly connected component	7
Unique Linear AOPs	9876
User-specified linear AOPs	471
Emergent linear AOPs	9405
Most linear AOPs for a single MIE/AO pair	292
Connectivity
Greatest degree-in for a KE	10
Greatest degree-out for a KE	10
Greatest AOP occurrence for a KE	6615
Greatest edge connectivity for an MIE/AO pair	5


Two specialized types of KEs, the MIE and AO, are used to identify the point on the pathway that can be initiated by molecular interaction with a stressor, and the points corresponding to adverse biological effects of regulatory concern, respectively. Among the AOP-KB network’s 825 KEs, 126 were designated by the AOP developers as MIEs and 113 as AOs. We assigned these designations as KE attributes in the constructed network to conduct a variety of downstream pathway analyses (Figure 2, MIEs and AOs are represented as green and red nodes, respectively).

AOP developers specify the KE level of biological organization as molecular, cellular, tissue, organ, individual, or population (Table 1). AOPs can contain KEs that span multiple levels of biological organization (Figure 3). Of the 825 total KEs, 270 KEs were at the “cellular” level, making this level of biological organization the most represented in the knowledgebase. The “population” level had the fewest KEs, with only 25. There were many more KEs at the “individual” level (88) than at the “population” level. This could, in part, reflect a bias in the AOP-KB toward human health-oriented AOPs, as there is generally a stronger need to link ecological effects to population-level outcomes. However, it is also likely that relevant links to population-level outcomes have often not been described because they are based largely on plausibility arguments and lack other types of empirical supporting data. The current baselining/benchmarking exercise should be helpful in evaluating whether more AOPs will be extended to the population level over time.

Figure 3: A) Histogram showing the distribution of the level of biological organization (LOBO) of all key events in the AOP-KB network, and B) their location in the network.

IDENTIFYING THE ADJACENT AND NON-ADJACENT KERS

Our analyses showed that KER adjacency plays an important role in the structure of the AOP-KB network. Two different types of KERs are defined during AOP development: adjacent KERs and non-adjacent KERs. By definition, an adjacent KER describes and provides evidence for the causal relationship between two KEs that occur immediately upstream or downstream of each other in an AOP. A linear AOP contains just one adjacent KER between each of its sequential KEs. Thus, the series of adjacent KERs in an AOP describes the sequence of causal relationships leading from MIE to AO. On the other hand, non-adjacent KERs describe evidence that links KEs further apart in an AOP sequence. Non-adjacent KERs are generally used as supporting evidence for AOPs, and do not necessarily describe alternative mechanisms between non-adjacent KEs. Rather, they are used as pragmatic placeholders for correlational evidence between KEs that are often measured together, but have intermediate KEs between them. It is important to distinguish between these two types of KERs in network analysis because non-adjacent KERs inflate the connectivity of the network region in which they occur and result in artificially short path lengths through the AOP network (Villeneuve et al. 2018).

One difficulty in identifying non-adjacent KERs is that adjacency is not a fixed property of a KER; a KER that is adjacent in one AOP can become non-adjacent when combined into a network of AOPs, and vice versa. Furthermore, prior to recent updates to the guidance on AOP development, the adjacency of KERs was often incorrectly defined by users. Therefore, the user-defined designations are not useful for discriminating adjacent and non-adjacent KERs in a networked context. To address this, we developed an algorithm for identifying adjacent and non-adjacent KERs after AOPs are joined into a network (see Methods). Of the 1050 unique KERs in the KB, 859 were defined by users as adjacent in all AOPs in which they occurred, 180 were defined as non-adjacent in all occurrences, and 11 were defined as adjacent in some AOPs and non-adjacent in others. By comparison, the algorithm that we applied identified 943 of the KERs as adjacent and 107 as non-adjacent after all AOPs were assembled into the network (Figure 2). We used these computationally derived adjacency designations for all downstream analyses.

COMPONENT ANALYSIS

As the number of KEs and KERs that are shared between more than one AOP in the AOP-KB increases, groups of AOPs will become connected and form into distinct connected components. AOP networks are directed, so components can be weakly or strongly connected. Weakly connected components are groups of nodes for which a path between any pair of nodes in the group exists without consideration of the directionality of their edges, or cause-effect orientation within an AOP network. In an AOP context, weakly connected components represent groups of AOPs with overlapping biology or that share any KEs. Strongly connected components are groups of KEs for which a directed path exists from each KE in the component to every other KE in the component. All feedbacks/cycles are contained within strongly connected components; therefore, the determination of these components in the AOP-KB provides a mechanism for the exhaustive identification of cycles in the network.
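The two kinds of components can be illustrated on a small directed network. The sketch below is plain Python rather than the igraph components function used by the authors; it uses an undirected search for weak components and Kosaraju's two-pass algorithm for strong components. The function names and example network are hypothetical. Reporting only strong components with more than one KE is a choice made here, on the assumption that single-node components are not of interest when looking for cycles.

```python
def weakly_connected(nodes, edge_list):
    """Group nodes that are joined when edge direction is ignored."""
    und = {n: set() for n in nodes}
    for up, down in edge_list:
        und[up].add(down)
        und[down].add(up)
    seen, comps = set(), []
    for start in nodes:
        if start in seen:
            continue
        seen.add(start)
        comp, stack = [], [start]
        while stack:
            node = stack.pop()
            comp.append(node)
            for nxt in und[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        comps.append(sorted(comp))
    return comps


def strongly_connected(nodes, edge_list):
    """Kosaraju's algorithm: groups in which every node reaches every other."""
    adj = {n: [] for n in nodes}
    radj = {n: [] for n in nodes}
    for up, down in edge_list:
        adj[up].append(down)
        radj[down].append(up)
    # Pass 1: record nodes in order of DFS completion on the original graph.
    seen, finished = set(), []
    for start in nodes:
        if start in seen:
            continue
        seen.add(start)
        stack = [(start, iter(adj[start]))]
        while stack:
            node, neighbors = stack[-1]
            for nxt in neighbors:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append((nxt, iter(adj[nxt])))
                    break
            else:  # all neighbors explored
                finished.append(node)
                stack.pop()
    # Pass 2: search the reversed graph in reverse completion order.
    assigned, comps = set(), []
    for start in reversed(finished):
        if start in assigned:
            continue
        assigned.add(start)
        comp, stack = [], [start]
        while stack:
            node = stack.pop()
            comp.append(node)
            for nxt in radj[node]:
                if nxt not in assigned:
                    assigned.add(nxt)
                    stack.append(nxt)
        comps.append(sorted(comp))
    return comps


# Invented network: a feedback loop between b and c, plus a separate pair e -> f.
example_nodes = ["a", "b", "c", "d", "e", "f"]
example_edges = [("a", "b"), ("b", "c"), ("c", "b"), ("c", "d"), ("e", "f")]

weak = weakly_connected(example_nodes, example_edges)
cycles = [c for c in strongly_connected(example_nodes, example_edges) if len(c) > 1]
print(weak)
print(cycles)
```

In this toy network the feedback loop between b and c is the only multi-node strong component, while the weak components separate the two unconnected groups of KEs.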

As of April 1, 2018, there were 35 weakly connected components in the AOP-KB network (Figure 4). The largest weakly connected component in the network contains 578 KEs, the majority of KEs in the network. The remaining 34 weakly connected components were much smaller, ranging in size from 2 to 17 KEs (brown backgrounds in Figure 4). The smaller components represent areas of toxicology with few AOPs; these may be novel areas of application or AOPs in early stages of development. The large component represents areas of toxicology with more highly developed AOPs, with interdependent mechanisms and biological descriptions.

There were 4 strongly connected components in the AOP-KB network, comprising 17 KEs (blue traces in Figure 4). Although some of these components contained non-adjacent KERs, the number of strongly connected components and their KE membership were not dependent on the adjacency of KERs (i.e., the same result was obtained when considering all KERs or only adjacent KERs). All 4 of these cycles represent feedback loops that were created intentionally by users; that is, these strongly connected components exist in individual user-defined AOPs, before they were combined into the network, although two of these components became larger as a result of combining multiple AOPs. Although it is beyond the scope of the present study to delve into the specific biology of these components, it is of interest to note, and perhaps not surprising, that the largest of the strongly connected components contains the KE for oxidative stress at its center. In addition to identifying important cycles and feedbacks in the network, identifying strongly connected components is critical for determining which types of analyses can be performed on a given network. The presence of these components can restrict which methods can be applied. For example, Bayesian networks, which represent a promising approach for creating quantitative AOP models (qAOPs) (Jaworska et al. 2013; Wittwehr et al. 2016), can only be constructed using directed acyclic graphs. Topological sorting, relevant for both analysis and visualization of complex causal dependencies (Villeneuve et al. 2018), is also only possible for directed networks that do not contain cycles/feedbacks.

PATH DETECTION AND EMERGENT AOPS

The progression from MIE to AO in a simple linear AOP is clear. However, MIE to AO path identification in an AOP network is often much more complex. There are often multiple paths between any given MIE and AO, and multiple combinations of MIEs and AOs to consider. To better understand how the organization of information according to the modular AOP framework can yield new insights and accelerate development of complex systems knowledge, including emergent novel LAOPs, we need to distinguish AOPs specified by users (i.e., user-defined AOPs) from those that emerge as KEs are linked together in the overall AOP-KB network (Figure 1).

Two terms need to be established: “linear AOP (LAOP)” and “user-defined AOP”. A LAOP is an AOP that begins with an MIE, ends with an AO, and where each key event in the progression has no more than one upstream and one downstream key event (Figure 1); a LAOP may contain more than a single MIE or AO and must contain at least an MIE and an AO with a KER between the two. The set of unique LAOPs for the AOP-KB network contains all the possible paths between every MIE/AO pair; this makes LAOPs the fundamental unit by which to quantify unique AOP knowledge. To computationally identify all unique LAOPs, we determined all the simple paths between every MIE/AO pair. A simple path in a network is a sequence of nodes, connected by edges, where no node is traversed more than once. In an AOP network context, simple path analysis between MIEs and AOs provides a mechanism to identify all possible unique LAOPs.
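The simple path enumeration described above can be sketched with a depth-first search. This is an illustrative sketch only, assuming the network is available as a list of (upstream KE, downstream KE) pairs; the function names and the toy network are hypothetical.

```python
from collections import defaultdict

def simple_paths(edges, source, target):
    """Yield every simple path (no node visited twice) from source to target."""
    graph = defaultdict(list)
    for src, dst in edges:
        graph[src].append(dst)

    def extend(path, visited):
        node = path[-1]
        if node == target:
            yield list(path)  # a path ends as soon as it reaches the AO
            return
        for nxt in graph[node]:
            if nxt not in visited:
                visited.add(nxt)
                path.append(nxt)
                yield from extend(path, visited)
                path.pop()
                visited.remove(nxt)

    yield from extend([source], {source})

def all_laops(edges, mies, aos):
    """Collect the simple paths between every MIE/AO pair."""
    return [p for m in mies for a in aos for p in simple_paths(edges, m, a)]

# Hypothetical toy network; KE2 -> AO4 plays the role of a non-adjacent KER.
toy_kers = [("MIE1", "KE2"), ("KE2", "KE3"), ("KE3", "AO4"),
            ("KE2", "AO4"), ("MIE5", "KE3")]
laops = all_laops(toy_kers, mies=["MIE1", "MIE5"], aos=["AO4"])
```

In this toy network three LAOPs are found, two from MIE1 and one from MIE5. The number of simple paths can grow very quickly with network size, which is consistent with the large LAOP counts reported for the full network.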

A “user-defined AOP” is defined as any set of KEs (including MIEs and AOs) and KERs that share the same AOP-ID. Whenever a user specifies a new AOP in the AOP-KB, it is assigned a unique AOP-ID. All KEs and KERs assigned to the created AOP are associated with this AOP-ID, although any KE or KER can be associated with multiple AOP-IDs. It is important to note that “user-defined AOPs” are not necessarily LAOPs. For example, it is permitted for users to include branched structures when adding AOPs to the AOP-KB (OECD, 2018). These branches can create small networks of LAOPs; thus, user-defined AOPs may contain more than one LAOP. Likewise, some user-defined AOPs currently under development may contain zero LAOPs (e.g., in instances where MIEs, AOs or KERs have not yet been defined).

A key distinction then arises between “user-defined LAOPs” and “emergent LAOPs”. We designate a “user-defined LAOP” as a LAOP for which all KEs and KERs share the same AOP-ID. These represent the LAOPs that are contained within the set of “user-defined AOPs”. In contrast, an “emergent LAOP” is one that was not formally entered into the KB by a user, but which emerges from sharing of KEs between two or more user-defined AOPs. Consequently, emergent LAOPs are LAOPs that contain at least one KE with an AOP-ID that is different from the rest of the KEs in that LAOP. Emergent LAOPs represent potentially novel pathways, resulting from the contribution of multiple AOP contributors to the AOP-KB.
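The distinction between user-defined and emergent LAOPs can be expressed as a set intersection over AOP-IDs. The sketch below is one possible reading of the definitions above, in which a LAOP is user-defined if at least one AOP-ID is common to every KE and KER along the path. The function name, the dictionaries, and the toy AOP-IDs are hypothetical.

```python
def classify_laop(path, ke_aop_ids, ker_aop_ids):
    """Label a path 'user-defined' if one AOP-ID covers all its KEs and KERs."""
    # AOP-IDs shared by every KE in the path.
    shared = set.intersection(*(set(ke_aop_ids[ke]) for ke in path))
    # Narrow further to AOP-IDs shared by every KER in the path.
    for ker in zip(path, path[1:]):
        shared &= set(ker_aop_ids[ker])
    return "user-defined" if shared else "emergent"

# Hypothetical toy data: KE2 is shared between AOP-ID 1 and AOP-ID 2.
ke_aop_ids = {"MIE1": {1}, "KE2": {1, 2}, "AO3": {1}, "MIE5": {2}, "AO4": {2}}
ker_aop_ids = {("MIE1", "KE2"): {1}, ("KE2", "AO3"): {1},
               ("MIE5", "KE2"): {2}, ("KE2", "AO4"): {2}}

within_one_aop = classify_laop(["MIE1", "KE2", "AO3"], ke_aop_ids, ker_aop_ids)
across_aops = classify_laop(["MIE1", "KE2", "AO4"], ke_aop_ids, ker_aop_ids)
```

Here the path MIE1 to AO4 exists only because KE2 is shared between the two AOPs, so it is labeled emergent.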

The AOP-KB network was assembled from 187 user-defined AOPs, each with a unique AOP-ID. Because KEs can be shared, there are on average 1.6 AOP-IDs assigned to each KE in the network (median = 1, max = 21). Similarly, KERs in the network had an average of 1.3 AOP-IDs (median = 1, max = 13). Among the 187 user-defined AOPs, there are 471 user-defined LAOPs in the AOP-KB network. The number of LAOPs per user-defined AOP is not evenly distributed (Supplementary Figure 4). Rather, of the 187 user-defined AOPs, 40 (21%) had zero LAOPs, 63 (34%) had only one LAOP, 38 (20%) had two LAOPs, and the remaining 46 (25%) had more than two. The user-defined AOP with the greatest number of LAOPs contained 32 LAOPs. This was an AOP for liver steatosis, which was designed as an AOP network (Angrish et al. 2017).

When all the user-defined AOPs were combined into the global AOP-KB network (Figure 2), a far greater number of LAOPs (9876) were identified. Of these, 9405 novel LAOPs (9876 total LAOPs minus 471 user-defined LAOPs) emerged because of network connectivity. This means that there are approximately 20 emergent LAOPs for each user-defined LAOP in the network (Supplementary Figure 4); we will show later, however, that this growth is not linear. Emergent LAOPs represent a large amount of novel and potentially useful AOP knowledge derived from a relatively small number of user-defined AOPs, which is only made possible through the collective effort of individual AOP developers in the crowd-sourced AOP-KB framework.

The algorithm to determine the set of LAOPs for the entire AOP-KB network considered all possible MIE/AO pairs in the network. Thus, the results of the global analysis could also be used to investigate the distribution of LAOPs for each MIE/AO pair (Supplementary Figure 5). Looking at each MIE/AO pair indicates the number of parallel routes from a given MIE (e.g., inhibition of a particular enzyme) to a given outcome (e.g., impaired reproduction). There were 913 MIE/AO pairs that were connected by at least one LAOP. Of these, 235 pairs were connected by only one unique LAOP, while 179 pairs had two LAOPs between them. However, the majority of pairs (499 MIE/AO pairs) had more than two LAOPs. Of these, 22 pairs had more than 50 LAOPs between them, with the greatest number of LAOPs per MIE/AO pair being 292. All the MIE/AO pairs that had more than 50 LAOPs were part of a large user-defined AOP network that was developed to investigate honey bee colony death (LaLone, 2017). This subnetwork of 292 unique LAOPs captures a myriad of ways in which activation of the nicotinic acetylcholine receptor can plausibly contribute to colony loss/failure (Supplementary Figure 6). It also contains one of the strongly connected components, which we identify as partially explaining the large number of LAOPs that result from the associated user-defined AOPs.

CONNECTIVITY AND ROBUSTNESS

To characterize the “baseline” state of the AOP-KB (April 2018), we also wanted to examine whether any KEs stood out by measures of connectivity and which MIE/AO pairs in the global AOP-KB network were the most robustly connected. These analyses provide insights into biological pathways that have been a focus for AOP development to date and can also provide insights into how updates to the guidance (OECD 2018), such as the use of structured ontology terms in defining KEs (Ives et al. 2017), and other forms of outreach are encouraging greater sharing of KE/KER content by AOP developers (rather than defaulting to de novo KE and KER description). We investigated several measures of centrality and connectivity for KEs in the global AOP-KB network, and we share here the results of those analyses that provided useful insight. Degree centrality for a KE is the number of incoming and/or outgoing KERs for that KE in a network. In an AOP context, degree measures identify regions of convergence and divergence in the AOP-KB network (Villeneuve et al. 2018; Figure 5). Two KEs shared the highest number of outgoing KERs (degree-out) with a score of 10. These were KE ID #18, AhR activation, and KE ID #167, LxR activation. High degree-out scores indicate that these KEs, which are both MIEs, are points of high divergence. KE ID #351, increased mortality, had the highest number of incoming KERs (degree-in) with a score of 10, indicating that increased mortality is a point of high convergence.
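Degree-in and degree-out can be computed directly from the list of KERs. The following minimal sketch assumes KERs are stored as (upstream KE, downstream KE) pairs; the function name and toy KE names are hypothetical.

```python
from collections import Counter

def degree_scores(edges):
    """Return (degree_in, degree_out) counters keyed by KE."""
    degree_out = Counter(src for src, _ in edges)  # outgoing KERs: divergence
    degree_in = Counter(dst for _, dst in edges)   # incoming KERs: convergence
    return degree_in, degree_out

# Hypothetical toy network: MIE1 diverges to three KEs that converge on AO5.
toy_kers = [("MIE1", "KE2"), ("MIE1", "KE3"), ("MIE1", "KE4"),
            ("KE2", "AO5"), ("KE3", "AO5"), ("KE4", "AO5")]
degree_in, degree_out = degree_scores(toy_kers)
most_divergent = degree_out.most_common(1)[0]
most_convergent = degree_in.most_common(1)[0]
```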

Figure 5: The size of the KE nodes in the AOP-KB network is scaled to various measures of network connectivity, and the KEs with the greatest values are indicated by arrows. A) KE nodes are scaled to degree-out. The KEs with the greatest degree-out were AhR activation and LxR activation; B) KE nodes are scaled to degree-in. The KE with the greatest degree-in was increased mortality; and C) KE nodes are scaled to the log of AOP occurrence. The KE with the greatest AOP occurrence was impaired learning and memory.

Involvement of a node within network paths is frequently quantified by a measure called betweenness centrality. However, this measure has limited use in AOP networks, so an AOP-specific measure of involvement was derived to identify which KEs were contained within the most LAOPs. This measure, called AOP occurrence, counts the number of unique LAOPs in which a KE occurs (Villeneuve et al. 2018). Our results show a wide distribution in AOP occurrence values in the global AOP-KB (Figure 5; Supplementary Figure 7). The average AOP occurrence per KE was 123 (median = 5). The greatest value was for KE ID #341 (impairment, learning and memory), which occurred in 6615 unique LAOPs (a surprising 67% of all LAOPs). This KE is an adverse outcome that is used in 12 different user-defined AOPs, including several AOPs that are part of the previously mentioned honey bee colony death subnetwork that contains the MIE/AO pairs with the largest number of LAOPs. It is also located downstream of one of the four strongly connected components. We have identified each of these factors as contributing to the large AOP occurrence value for this KE.
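Given a list of LAOPs, AOP occurrence reduces to counting how many paths contain each KE. The sketch below assumes each LAOP is represented as a list of KE identifiers; the function name and the toy LAOPs are hypothetical.

```python
from collections import Counter

def aop_occurrence(laops):
    """Count, for each KE, the number of LAOPs in which it appears."""
    counts = Counter()
    for path in laops:
        counts.update(set(path))  # each KE counted once per LAOP
    return counts

# Hypothetical toy set of three LAOPs that all end at the same AO.
toy_laops = [["MIE1", "KE2", "KE3", "AO4"],
             ["MIE1", "KE2", "AO4"],
             ["MIE5", "KE3", "AO4"]]
occurrence = aop_occurrence(toy_laops)
share_with_ao4 = occurrence["AO4"] / len(toy_laops)
```

An AO that terminates many parallel routes accumulates a high AOP occurrence, in the same way as described above for KE ID #341.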

Many MIE/AO pairs have at least one LAOP between them in the global AOP-KB network; however, we sought a measure to quantify the strength of the connections between MIEs and AOs. We saw that the number of LAOPs between MIE/AO pairs ranged from 0 to 292, but a large number of LAOPs between an MIE and an AO does not necessarily mean that the connection between them is robust.

To explore network robustness, we used a measure called edge connectivity. Edge connectivity of two nodes is the minimum number of edges that must be removed to disrupt all paths between the nodes. A histogram showing the edge connectivity of MIE/AO pairs that are connected by at least one LAOP is shown in Supplementary Figure 8. Most of these pairs had an edge connectivity of only one or two (859 and 42 pairs out of 913, respectively), whereas several MIE/AO pairs had an edge connectivity between 3 and 5. This indicates that the connection between most MIE/AO pairs is dependent on one or two KERs. For example, MIE:1486 and AO:566 have 80 unique paths (i.e., LAOPs) between them, yet have an edge connectivity of only 1 (Figure 6A). In contrast, MIE:167 and AO:459 have only 18 LAOPs, but an edge connectivity of 5 (Figure 6B), indicating more lines of evidence linking this MIE/AO pair. This information could be useful in identifying MIE/AO pairs that are supported by the most evidence, identifying weak points in AOP networks that could be strengthened with further research, and identifying the most critical KERs of an AOP or AOP network. For example, the KERs that comprise the less robust network regions would make ideal candidates for assays that target certain MIE/AO pairs.
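Edge connectivity between two nodes equals the maximum flow between them when every edge has unit capacity. The following sketch computes it with breadth-first augmenting paths; it is offered as an illustration under that standard equivalence, not as the implementation used in this study, and the function name and toy networks are hypothetical.

```python
from collections import defaultdict, deque

def edge_connectivity(edges, source, target):
    """Minimum number of edges whose removal disconnects source from target."""
    capacity = defaultdict(int)
    neighbors = defaultdict(set)
    for src, dst in set(edges):
        capacity[(src, dst)] += 1
        neighbors[src].add(dst)
        neighbors[dst].add(src)  # reverse arc for the residual network

    flow = 0
    while True:
        # Breadth-first search for an augmenting path in the residual network.
        parent = {source: None}
        queue = deque([source])
        while queue and target not in parent:
            node = queue.popleft()
            for nxt in neighbors[node]:
                if nxt not in parent and capacity[(node, nxt)] > 0:
                    parent[nxt] = node
                    queue.append(nxt)
        if target not in parent:
            return flow
        # Push one unit of flow along the path that was found.
        node = target
        while parent[node] is not None:
            prev = parent[node]
            capacity[(prev, node)] -= 1
            capacity[(node, prev)] += 1
            node = prev
        flow += 1

# Hypothetical toy networks, each with two paths from MIE to AO.
bottleneck = [("MIE", "A"), ("MIE", "B"), ("A", "C"), ("B", "C"), ("C", "AO")]
parallel = [("MIE", "A"), ("MIE", "B"), ("A", "AO"), ("B", "AO")]
weak = edge_connectivity(bottleneck, "MIE", "AO")
robust = edge_connectivity(parallel, "MIE", "AO")
```

Both toy networks contain two MIE-to-AO paths, but in the first every path depends on the single KER from C to AO, which mirrors the contrast drawn above between the number of LAOPs and edge connectivity.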

Figure 6: Two subnetworks demonstrating MIE/AO pairs with varying edge connectivity. A) MIE:1486 and AO:566 are connected by 80 unique LAOPs, but only have an edge connectivity of one. B) In contrast, MIE:167 and AO:459 are connected by only 18 LAOPs, but have an edge connectivity of 5.

GROWTH CHARACTERISTICS OF THE KNOWLEDGE BASE AND FACTORS AFFECTING CONNECTIVITY

The ratio of emergent to user-defined LAOPs was around 20:1 when we conducted our analysis. We sought to understand what factors influenced the emergence of LAOPs in the global AOP-KB network. To investigate both the rate of growth of emergent LAOPs and to try to identify network attributes that affect this growth, we conducted a network growth simulation that incrementally added the user-defined AOPs, one at a time, to the AOP-KB network. At each iteration of the simulation we measured various network statistics and compared them to the rate of emergence of LAOPs (Figure 7). As the AOP-KB network grows (as determined by the number of KEs in the network), there are several large increases in the total number of LAOPs in the network (Figure 7A). The occurrence of non-adjacent KERs is one significant driver, as the total number of LAOPs was approximately 69% lower when only adjacent KERs were considered (9876 total LAOPs vs. 3097 adjacent-only LAOPs). Interestingly, the trend in the rate of emergence of LAOPs and adjacent-only LAOPs was nearly identical throughout our growth simulation (Figure 7A). Not surprisingly, the trend in maximum AOP occurrence was also highly correlated to the number of LAOPs in the network (Figure 7A). Since these three measures had such similar trends, the remainder of our growth simulation analyses focused on the total number of LAOPs as the indicator of network connectivity.
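The growth simulation can be sketched as a loop that adds one user-defined AOP at a time and recounts LAOPs after each addition. The code below is a minimal illustration under the assumption that each AOP is available as a record of its MIEs, AOs, and KERs; the record layout, function names, and toy AOPs are hypothetical.

```python
def count_laops(edges, mies, aos):
    """Count simple paths between every MIE/AO pair."""
    graph = {}
    for src, dst in edges:
        graph.setdefault(src, set()).add(dst)

    def walk(node, target, visited):
        if node == target:
            return 1
        return sum(walk(nxt, target, visited | {nxt})
                   for nxt in graph.get(node, ()) if nxt not in visited)

    return sum(walk(m, a, {m}) for m in mies for a in aos)

def growth_simulation(aops):
    """Return (number of KEs, number of LAOPs) after each AOP is added."""
    edges, mies, aos, history = set(), set(), set(), []
    for aop in aops:
        edges |= set(aop["kers"])
        mies |= set(aop["mies"])
        aos |= set(aop["aos"])
        kes = {ke for ker in edges for ke in ker}
        history.append((len(kes), count_laops(edges, mies, aos)))
    return history

# Hypothetical toy AOPs; the third borrows KE2, KE5, and AO6 from the first two.
toy_aops = [
    {"mies": ["MIE1"], "aos": ["AO3"], "kers": [("MIE1", "KE2"), ("KE2", "AO3")]},
    {"mies": ["MIE4"], "aos": ["AO6"], "kers": [("MIE4", "KE5"), ("KE5", "AO6")]},
    {"mies": ["MIE7"], "aos": ["AO6"],
     "kers": [("MIE7", "KE2"), ("KE2", "KE5"), ("KE5", "AO6")]},
]
history = growth_simulation(toy_aops)
```

In this toy example the third AOP adds only one new KE, yet the LAOP count rises from 2 to 5 because it links the two previously separate AOPs.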

To better understand what variables have the greatest influence on the number of emergent LAOPs captured in the AOP-KB, we first examined the influence of non-adjacent KERs in more detail (Figure 7B). As one might expect, the number of adjacent KERs grew linearly with the number of KEs since each new KE introduced would require, on average, one new adjacent KER to connect it to the network. In contrast, non-adjacent KERs, which are an optional component of AOP descriptions, appeared to increase at a non-linear rate. Furthermore, it appeared that some of the increased rate of LAOP growth coincided with elevated rates of non-adjacent KER growth. However, not all increases in non-adjacent KERs were accompanied by increases in LAOPs and vice versa. This suggests that the addition of non-adjacent KERs may indeed affect the emergence of LAOPs in the network, but it is not the only contributing factor.

Next, we examined the influence of weakly or strongly connected components (Figure 7C). The number of weakly connected components did not appear to have a major effect on the emergence of LAOPs, with one exception (Figure 7C, pink squares). At one point during the growth of the network, when it was between 200 and 300 KEs in size, there was a large decrease in the number of weakly connected components, which appeared to be related to a large increase in LAOPs. This represented a moment during the growth of the AOP-KB network when user-defined AOPs first started to have shared components and form into a network, resulting in the emergence of new LAOPs. No other changes in LAOP growth appear to be related to the number of weakly connected components. The emergence of LAOPs was also compared with the point at which the four strongly connected components were introduced into the network (Figure 7C, purple squares). In many of the instances in which a strongly connected component was introduced into the network, there was a large increase in LAOPs, indicating that this is an important factor affecting network connectivity. However, as with the non-adjacent KERs, not every increase in LAOPs was accompanied by the introduction of a new strongly connected component, indicating a need to explore other factors.

Finally, we considered the number of times that a user created a new AOP that borrowed one or more KEs that already existed in the AOP-KB network (Figure 7D). We anticipated that LAOP emergence would correspond well with borrowing events. Our analyses showed that borrowing represents the main mechanism by which the AOP-KB can be formed into a network. The rate at which borrowing occurred was the most reliably predictive factor for determining the rate of LAOP emergence. This finding strongly supports the modular structure of the AOP framework and the importance of the crowd-sourced approach, which allows shared units of AOP description to maximize the knowledge captured relative to de novo contributions by users.
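Borrowing can be tallied by comparing the KEs of each newly added AOP against the KEs already present in the network. The sketch below assumes the AOPs are supplied in order of creation as lists of KE identifiers; the function name and toy data are hypothetical.

```python
def borrowing_events(aops_in_order):
    """For each AOP, count the KEs that already existed when it was added."""
    seen, borrowed = set(), []
    for kes in aops_in_order:
        kes = set(kes)
        borrowed.append(len(kes & seen))  # KEs reused from earlier AOPs
        seen |= kes
    return borrowed

# Hypothetical toy sequence: only the third AOP reuses existing KEs.
toy_sequence = [["MIE1", "KE2", "AO3"],
                ["MIE4", "KE5", "AO6"],
                ["MIE7", "KE2", "KE5", "AO6"]]
borrowed_per_aop = borrowing_events(toy_sequence)
aops_that_borrow = sum(1 for n in borrowed_per_aop if n > 0)
```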

CRITICAL PATH IDENTIFICATION

Given the nearly 10,000 unique paths between MIEs and AOs currently in the AOP-KB network, it is common that an MIE and AO will have multiple paths between them. When multiple paths are found between an MIE and AO, further inquiry may focus on determining critical paths and path filtering. The concept of a critical path in an AOP network will vary with context and application (Villeneuve et al. 2018). With the application defined, however, the criticality of a path can be determined using a variety of KE and/or KER attributes relevant to that problem or research question. In some applications, the path of highest interest may be the most concise connection between an MIE and an AO. In other applications it could be the most well-researched path between an MIE and an AO. In this section we present methods for using the information contained within the KERs for identifying such paths in the AOP-KB network.

In this section, we use Event ID #201 (binding of antagonist, NMDA receptors) and Event ID #341 (Impairment, learning and memory) as an example MIE/AO pair to illustrate how critical paths of interest can be identified. This pair has a moderate number of LAOPs that form a relatively simple subnetwork comprised of 14 KEs and 18 KERs (Figure 8A). This subnetwork contained no cycles and so a topological sorting and visualization was utilized. As noted earlier, topological sorting is a useful visualization tool for identifying which KEs occur upstream and downstream of each other and identifying causal dependencies, which are not obvious in the unsorted subnetwork (Figure 8B). Of note, in cases where cycles are present in an AOP network, there are computational techniques available for graph condensation to contract cycles and create an acyclic network (Villeneuve et al. 2018).
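Topological sorting of an acyclic subnetwork can be sketched with Kahn's algorithm, which repeatedly removes nodes that have no remaining upstream dependencies. This is an illustrative sketch only; the function name and the toy subnetwork are hypothetical, and the sketch raises an error rather than condensing cycles.

```python
from collections import deque

def topological_order(edges):
    """Return KEs ordered so that every KE appears after all of its upstream KEs."""
    nodes = {n for e in edges for n in e}
    indegree = {n: 0 for n in nodes}
    successors = {n: [] for n in nodes}
    for src, dst in edges:
        successors[src].append(dst)
        indegree[dst] += 1

    queue = deque(sorted(n for n in nodes if indegree[n] == 0))
    order = []
    while queue:
        node = queue.popleft()
        order.append(node)
        for nxt in sorted(successors[node]):
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                queue.append(nxt)

    if len(order) != len(nodes):
        raise ValueError("network contains a cycle; condense it first")
    return order

# Hypothetical toy subnetwork with one branch and one non-adjacent KER (A -> AO).
toy_kers = [("MIE", "A"), ("MIE", "B"), ("A", "C"),
            ("B", "C"), ("C", "AO"), ("A", "AO")]
order = topological_order(toy_kers)
```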

Figure 8: There are 35 unique linear AOPs (LAOPs) between MIE:201 and AO:341. A) These LAOPs form a small subnetwork. B) Topological sorting can be used to visualize which KEs occur upstream and downstream of each other, and can help identify causal relationships. A shortest path analysis can be used to identify LAOPs of interest in this subnetwork, such as C) the shortest LAOP from MIE to AO (highlighted in light blue), D) the shortest LAOPs that only include adjacent KERs (in this subnetwork there are 5 unique LAOPs of equal length that meet these criteria, each highlighted in a different shade of blue), and E) the LAOP that only includes adjacent KERs and has the greatest combined weight of evidence score (highlighted in purple).

The MIE, binding of antagonist NMDA receptors, and AO, Impairment, learning and memory, are connected by 35 unique LAOPs. In some cases, one might be interested in identifying the shortest possible path among these LAOPs. By using a shortest path analysis, we identified an LAOP connecting this MIE/AO pair that is only four KERs long (Figure 8C). The concept of a shortest path between two nodes is intuitive; in its simplest form the shortest path between two nodes is a sequence from the starting node (MIE: Event ID #201) to the ending node (AO: Event ID #341) that traverses the fewest nodes along the way. This example would be considered an unweighted shortest path because it does not consider any additional information associated with the KERs or, equivalently, assumes that each edge carries equal weight. Such a simply derived critical path would not likely be useful for very many applications. This shortest path from MIE:201 to AO:341 contains a non-adjacent KER that bypasses many KEs. Non-adjacent KERs will commonly be identified in this type of analysis, which may be desirable if the quantitative understanding of the non-adjacent KER is such that measurement of the intermediate KERs is unnecessary. However, in other cases the unweighted shortest path may be missing useful information or the appropriate level of detail and evidence for support.

A shortest path analysis can also be performed using edge weights that are dependent on the information encoded in the KERs. Specific to AOP networks, another slightly more complex example would be to identify the shortest path between this MIE/AO pair, but omit the non-adjacent KERs. This analysis would not bypass any potentially important information by jumping KEs. However, when considering only adjacent KERs in our example subnetwork, we identified 5 equally short unique LAOPs, each 7 KERs long (Figure 8D). It is very likely that for any application several paths of equal length will be identified, and further information may be required to filter paths of greatest interest for the application.
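An unweighted shortest path, with or without non-adjacent KERs, can be found by breadth-first search. The sketch below assumes each KER carries a Boolean adjacency flag; the tuple layout, function name, and toy subnetwork are hypothetical. It returns a single shortest path, so ties such as the 5 equally short LAOPs noted above would require enumerating all paths of minimum length.

```python
from collections import deque

def shortest_laop(kers, source, target, adjacent_only=False):
    """Return one path with the fewest KERs, or None if no path exists."""
    graph = {}
    for src, dst, adjacent in kers:
        if adjacent or not adjacent_only:  # optionally drop non-adjacent KERs
            graph.setdefault(src, []).append(dst)

    parent = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            path = []
            while node is not None:  # walk back to the MIE
                path.append(node)
                node = parent[node]
            return path[::-1]
        for nxt in graph.get(node, []):
            if nxt not in parent:
                parent[nxt] = node
                queue.append(nxt)
    return None

# Hypothetical toy subnetwork: (upstream KE, downstream KE, is_adjacent).
toy_kers = [("MIE", "KE1", True), ("KE1", "KE2", True),
            ("KE2", "AO", True), ("MIE", "KE2", False)]
shortest_any = shortest_laop(toy_kers, "MIE", "AO")
shortest_adjacent = shortest_laop(toy_kers, "MIE", "AO", adjacent_only=True)
```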

One approach to filter the 5 shortest adjacent-only paths is to consider the additional weight of evidence (WOE) information encoded as KER attributes in the AOP-KB. Each KER in the AOP-KB is assigned a WOE score of “low”, “moderate”, or “high”, to describe the strength of the empirical evidence that supports the causal relationship it describes. Recent work has outlined approaches for using AOPs in weight of evidence assessment and called for standardized WOE determinations for KERs (Garcia-Reyero and Murphy, 2018; Collier et al., 2016). Our analyses utilize the WOE rankings determined by AOP authors to differentiate between possible paths within a given AOP network. Specifically, we used KER WOE information as edge weights to identify the shortest path between MIE:201 and AO:341 that only traversed adjacent KERs with the greatest combined weight of evidence score (Figure 8E). By this approach, we isolated a single LAOP of potentially high relevance from the multiple paths possible in this complex subnetwork. Treating missing WOE KER attributes as “low” is a conservative assignment in the face of missing data. As actual values for these missing KER attributes are contributed by the AOP authors, the resulting higher weights, such as “moderate” or “high”, will then serve to identify better-supported KERs that may emerge within these critical path analyses. This approach can be applied to much more complex networks and can be based on almost any of the information that is stored as KER attributes (for example the “quantitative understanding” score, or experimentally derived time-course data) as long as it can be represented numerically.
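Filtering equally short paths by combined WOE can be sketched as follows. The numeric encoding of “low”, “moderate”, and “high” as 1, 2, and 3 is an assumption made for illustration and is not specified in the text; the function names and toy data are likewise hypothetical. Missing WOE values are treated as “low”, as described above.

```python
# Assumed numeric encoding of WOE labels (illustrative only).
WOE_SCORE = {"low": 1, "moderate": 2, "high": 3}

def woe_total(path, ker_woe):
    """Sum WOE scores over the KERs of a path; missing values count as 'low'."""
    return sum(WOE_SCORE[ker_woe.get(ker) or "low"]
               for ker in zip(path, path[1:]))

def best_supported(paths, ker_woe):
    """Return the candidate path with the greatest combined WOE score."""
    return max(paths, key=lambda p: woe_total(p, ker_woe))

# Hypothetical toy data: two equally short paths, one with a missing WOE value.
candidate_paths = [["MIE", "A", "AO"], ["MIE", "B", "AO"]]
ker_woe = {("MIE", "A"): "high", ("A", "AO"): "moderate",
           ("MIE", "B"): "high", ("B", "AO"): None}
totals = [woe_total(p, ker_woe) for p in candidate_paths]
critical_path = best_supported(candidate_paths, ker_woe)
```

Because the candidates here have equal length, the summed score is a fair basis for comparison; for paths of different lengths the sum favors longer paths.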

It is important to note that AOPs are intended to be pragmatic simplifications of complex biology and that the length of an AOP is not necessarily indicative of its quality. The path detection method based on edge weights utilized above did not take path length into consideration. As a result, this method has a bias for identifying longer paths as the critical path (e.g., an LAOP with many “low” WOE KERs can have a higher overall WOE score than an LAOP with only a few “high” WOE KERs). A critical path identification strategy that uses length-normalized measures of edge weights might be more appropriate for addressing this issue. A length-normalized approach gave us the same result as the unnormalized approach in our example subnetwork (data not shown) and may prove useful in more complex networks.
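The length bias and one possible length-normalized alternative (the mean WOE score per KER) can be illustrated with a short calculation. The numeric encoding of WOE labels is again an assumption made for illustration, and the function names and toy paths are hypothetical.

```python
# Assumed numeric encoding of WOE labels (illustrative only).
WOE_SCORE = {"low": 1, "moderate": 2, "high": 3}

def summed_woe(woe_labels):
    """Unnormalized score: grows with the number of KERs in the path."""
    return sum(WOE_SCORE[label] for label in woe_labels)

def mean_woe(woe_labels):
    """Length-normalized score: average WOE per KER."""
    return summed_woe(woe_labels) / len(woe_labels)

# Hypothetical toy paths described only by the WOE labels of their KERs.
long_path = ["low"] * 7
short_path = ["high", "high"]
summed = (summed_woe(long_path), summed_woe(short_path))
normalized = (mean_woe(long_path), mean_woe(short_path))
```

The summed score ranks the long, weakly supported path first (7 versus 6), whereas the mean score ranks the short, strongly supported path first (1.0 versus 3.0).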

If path length is considered, identifying the longest path may be desirable for specific applications. For example, in the AOP-KB network, the longest simple path between an MIE and AO might indicate the most detailed mechanistic description available in the KB. To explore this concept, we determined all the longest LAOPs in the AOP-KB (visually depicted in Supplementary Figure 9). We identified 7 unique LAOPs that were each 16 KERs in length. There was considerable overlap between several of these LAOPs, only differing by a single KER in some instances. These 7 LAOPs represent the most detailed AOPs currently in the AOP-KB.

For AOP networks, path criticality is linked to context and it is not possible to say which path detection method is best. Our goal was to describe a few different approaches for path filtering and the determination of critical paths. Our simple example of the MIE:201/AO:341 subnetwork demonstrates several ways in which weighted and unweighted path analyses can be used to identify paths of interest from the AOP-KB network.

CONCLUSIONS, CHALLENGES AND FUTURE DIRECTIONS

The AOP-KB represents a massive international research and epistemic effort. The AOP-KB has now accumulated sufficient contributions that individual user-defined AOPs can now be assembled into networks and assessed in a more integrated context. This is critical; at the inception of the AOP framework, it was assumed that AOP networks would be the functional unit of information that would be most relevant in practical applications (Villeneuve et al. 2014). However, continued adoption of the framework and growth of the AOP-KB is not assured. Significant investments in education and training, governance, incentives for contribution, and demonstrations of practical application are all needed to support ongoing growth of the AOP-KB (Carusi et al. 2018). Measures of the growth and properties of the AOP-KB network in turn are needed to quantify and evaluate the success of these efforts in promoting further development of the AOP-KB.

The current analysis of the AOP-KB network provides the critical baseline data (Table 3; Figure 9) for these future indicators of success (or lack thereof). As of April 2018, the AOP-KB contained just over 200 user-defined AOPs, composed of roughly 825 KEs, connected by over 1000 KERs. Collectively, those user-defined AOPs describe over 9000 simple paths linking a MIE to an AO via a unique sequence of KEs. Those paths organize into 34 weakly connected components. The anticipation is that over time, the number of user-defined AOPs will continue to increase and that the connectivity between them will continue to increase, resulting in fewer connected components with more nodes per component, more emergent LAOPs, and an increase in the number of highly connected “hub” KEs. Periodic analysis of the overall AOP-KB network will provide useful and informative tracking of the overall growth and development of the AOP-KB.

The AOP-KB is a resource derived from crowd-sourcing, depending on individual contributions for its continued growth. As an additional benefit, our benchmarking efforts can also be used to provide valuable feedback to individual AOP developers. Some of the baseline analyses presented in this paper can be used to track the influence that an AOP author’s contributions are having within the larger AOP community. Values for AOP occurrence and KE/KER borrowing are two examples of simple metrics that convey global impacts of single, fundamental KE or KER units. As the AOP-KB information technology develops, metrics such as these could be included as part of AOP-KB downloadable status reports, shown along with the KE/KER descriptions in the AOP-KB, or attached to an AOP contributor’s user profile within the KB and would ideally serve to encourage further refinement and contribution to the AOP-KB.

As it stands in April 2018, the AOP-KB is starting to become a formidable crowd-sourced repository of toxicological knowledge, with hundreds of AOPs contributed comprising approximately a thousand KEs and KERs spanning many levels of biological organization. Connectivity in the AOP-KB was initially low, but after addition of about 200-300 KEs to the AOP-Wiki and the practical onset of sharing of KEs and KERs among AOPs, the total number of LAOPs began to grow at a rate that averages out to over 20 times the rate at which new user-defined LAOPs are added. This illustrates one of the major advantages of assembling information in an integrated, modular platform as opposed to more traditional documents. Open access and sharing of AOP knowledge are critical for continued growth of the knowledge base, even for AOPs that are at early stages in their development. Crowd-sourcing and open access allow for sharing of KEs and KERs, creating efficiencies in development and avoiding redundant research effort and duplicative content in the knowledge base. Borrowing KEs and KERs, as our analysis shows, is also the single most significant factor contributing to growth of emergent LAOP information.

In addition to the knowledge capture itself, practical application of AOP networks requires that appropriate computational tools to extract, visualize, and analyze AOP networks be developed (Knapen et al. 2018; Villeneuve et al. 2018). While KEs and KERs have been identified, many of the current AOP descriptions still lack robust supporting information. Additionally, from a network development and benchmarking standpoint, the types of computational analyses employed here can only be conducted on information that is stored in the KB in a machine-readable format. Therefore, it is our recommendation that additional ontologies be incorporated into the KE, KER and AOP descriptions. Information such as “research area” (e.g., neurotoxicology, genetic toxicology) and demographic information on authors would be very useful for targeting areas in which to focus development and training, and for the identification of novel contributions to the AOP-KB.

Our research demonstrates the benefits of utilizing edge weights on KERs derived from machine-readable structured information captured in the AOP-KB. By combining computational methods for extracting shortest or longest paths from the network along with rules for calculating normalized weight of evidence or quantitative understanding metrics along those paths, it should be possible to extract AOPs (including emergent LAOPs) from more complex AOP networks. In some applications AOP path length alone may be of interest; in general, however, attributes of the edges in the path, such as WOE, quantitative understanding, or adjacency, will enable context-relevant critical path identification. As the amount of machine-readable information captured in the AOP-KB increases, more and more options to tailor critical path analyses for different research questions or risk assessments should be available.

As a repository alone the AOP-KB is useful for the toxicological community. However, the present study demonstrates that the AOP-KB has additional value as a source of novel emergent knowledge. We showed that combining 187 user-defined AOPs into a network context resulted in the emergence of 9405 unique, previously undescribed, linear AOPs (LAOPs). Although we acknowledge not all emergent LAOPs are biologically plausible, our research provides a quantification of unexplored potential AOP knowledge. The global AOP-KB network was constructed using largely uncurated data and was assembled without consideration for any species-, sex-, or life stage-specific information. Nevertheless, because the network was constructed based on the components that are shared between the user-defined AOPs, many of these resulting LAOPs are likely to be of some toxicological relevance. One major challenge will be to assess and curate this large amount of emergent AOP information, specifically when an application has been identified for the use of user-specified or emergent AOPs. Approaches developed for generating computationally-predicted AOPs (Bell et al. 2016; Oki et al. 2016) may also have use for ground-truthing emergent LAOPs within the overall AOP-KB network. Methods must be developed for appropriately filtering this information and for quality assurance and quality control (several filtering methods are discussed in Knapen et al. 2018 and Villeneuve et al. 2018). Tools, such as the critical path analyses presented here, provide some approaches to path filtering. The AOP DB, under ongoing development, is another promising new tool in this direction (Pittman et al., 2018).

We demonstrated the utility of applying established network analysis/graph theory techniques to benchmarking and extracting information from the AOP-KB. We have also derived and utilized AOP network specific analysis techniques, such as AOP occurrence; these AOP-specific variants on standard network analyses will likely continue to be developed and find use. For example, future analyses of edge connectivity may be modified to include WOE ratings along the edges identified in the analysis to help to add further contextual relevance to their identification of robust MIE/AO pairs for AOP applications. It is expected that as the AOP-KB continues to evolve, methods that leverage more sophisticated network analysis techniques will be required to fully capitalize on the wealth of information captured in this international toxicology knowledge sharing database.

ACKNOWLEDGEMENTS

The authors thank S. Edwards, E. Kazymova, and C. Ives for assistance with the AOP-Wiki and development of the data downloads used for this analysis. We thank the anonymous reviewers for their helpful comments. Mention of trade names or commercial products does not constitute endorsement or recommendation for use by either U.S. EPA or Environment and Climate Change Canada. Likewise, the contents of the paper neither constitute nor necessarily reflect official policy of the U.S. EPA or Environment and Climate Change Canada.

FUNDING INFORMATION

This work was supported by the U.S. EPA Office of Research and Development and Environment and Climate Change Canada’s Ecotoxicology and Wildlife Health Division.

REFERENCES

Angrish MM, McQueen CA, Cohen-Hubal E, Bruno M, Ge Y, Chorley BN (2017). Editor’s highlight: mechanistic toxicity tests based on an adverse outcome pathway network for hepatic steatosis. Toxicol. Sci 159, 159–169. doi: 10.1093/toxsci/kfx121.
Ankley GT, Bennett RS, Erickson RJ, Hoff DJ, Hornung MW, Johnson RD, Mount DR, Nichols JW, Russom CL, Schmieder PK, Serrano JA, Tietge JE, and Villeneuve DL (2010). Adverse outcome pathways: a conceptual framework to support ecotoxicology research and risk assessment. Environ. Toxicol. Chem 29, 730–741. doi: 10.1002/etc.34.
Bell SM, Angrish MM, Wood CE, Edwards SW (2016). Integrating publicly available data to generate computationally predicted adverse outcome pathways for fatty liver. Toxicol. Sci 150, 510–520. doi: 10.1093/toxsci/kfw017.
Carusi A, Davies MR, De Grandis G, Escher BI, Hodges G, Leung KM, Whelan M, Willett C, and Ankley GT (2018). Harvesting the promise of AOPs: An assessment and recommendations. Sci. Tot. Environ 628, 1542–1556.
Collier ZA, Gust KA, Gonzalez-Morales B, Gong P, Wilbanks MS, Linkov I, and Perkins EJ (2016). A weight of evidence assessment approach for adverse outcome pathways. Regulatory Toxicology and Pharmacology 75, 46–57.
Csardi G, and Nepusz T (2006). The igraph software package for complex network research. Inter. J. Complex Sys 1695, 1–9.
Edwards SW, Tan YM, Villeneuve DL, Meek ME, and McQueen CA (2016). Adverse outcome pathways—organizing toxicological information to improve decision making. J. Pharmacol. Exp. Ther 356, 170–181.
Fruchterman TMJ, and Reingold EM (1991). Graph drawing by force-directed placement. Software – Practice & Experience 21, 1129–1164. doi: 10.1002/spe.4380211102.
Ives C, Campia I, Wang R-L, Wittwehr C, and Edwards S (2017). Creating a structured AOP knowledgebase via ontology-based annotations. Appl. In vitro Toxicol 3, 298–311. doi: 10.1089/aivt.2017.0017.
Jaworska J, Dancik Y, Kern P, Gerberick F, and Natsch A (2013). Bayesian integrated testing strategy to assess skin sensitization potency: from theory to practice. J. Appl. Toxicol 33, 1353–1364.
Garcia-Reyero N, and Murphy CA (Eds.) (2018). A Systems Biology Approach to Advancing Adverse Outcome Pathways for Risk Assessment. Springer.
Kerren A, Purchase HC, and Ward MA (2014). Introduction to multivariate network visualization. In: Kerren A et al. (Eds.), Multivariate Network Visualization, Springer, New York, NY. Pp. 1–9.
Knapen D, Angrish MM, Fortin MC, Katsiadaki I, Leonard M, Margiotta-Casaluci L, Munn S, O’Brien JM, Pollesch N, Smith LC, Zhang X, and Villeneuve DL (2018). Adverse outcome pathway networks I: Development and applications. Environ. Toxicol. Chem 37, 1723–1733. doi: 10.1002/etc.4125.
Kolaczyk ED, and Csárdi G (2014). Descriptive analysis of network graph characteristics. In: Statistical Analysis of Network Data with R. Use R!, vol 65. Springer, New York, NY. doi: 10.1007/978-1-4939-0983-4_4.
LaLone CA, Ankley GT, Belanger SE, Embry MR, Hodges G, Knapen D, Munn S, Perkins EJ, Rudd MA, Villeneuve DL, Whelan M, Willett C, Zhang X, and Hecker M (2017). Advancing the adverse outcome pathway framework-An international horizon scanning approach. Environ. Toxicol. Chem 36, 1411–1421. doi: 10.1002/etc.3805.
Oki NO, Nelms MD, Bell SM, Mortensen HM, and Edwards SW (2016). Accelerating adverse outcome pathway development using publicly available data sources. Curr. Environ. Health Rep 3, 53–63. doi: 10.1007/s40572-016-0079-y.
Organisation for Economic Co-operation and Development (OECD) (2018). Users’ handbook supplement to the guidance document for developing and assessing AOPs. OECD Environment, Health, and Safety Publications, Series on Testing and Assessment, No. 233. ENV/JM/MONO(2016)12. https://one.oecd.org/document/ENV/JM/MONO(2016)12/en/pdf
Pavlopoulos GA, Secrier M, Moschopoulos CN, Soldatos TG, Kossida S, Aerts J, Schneider R, and Bagos PG (2011). Using graph theory to analyze biological networks. BioData Min 4, 10. doi: 10.1186/1756-0381-4-10.
Pittman ME, Edwards SW, Ives C, and Mortensen HM (2018). AOP-DB: A database resource for the exploration of Adverse Outcome Pathways through integrated association networks. Toxicol. Appl. Pharmacol 343, 71–83. doi: 10.1016/j.taap.2018.02.006.
R Core Team (2017). R: a language and environment for statistical computing. https://www.R-project.org/.
Villeneuve DL, Angrish MM, Fortin MC, Katsiadaki I, Leonard M, Margiotta-Casaluci L, Munn S, O’Brien JM, Pollesch NL, Smith LC, Zhang X, and Knapen D (2018). Adverse outcome pathway networks II: Network analytics. Environ. Toxicol. Chem 37, 1734–1748. doi: 10.1002/etc.4124.
Villeneuve DL, Crump D, Garcia-Reyero N, Hecker M, Hutchinson TH, LaLone CA, Landesmann B, Lettieri T, Munn S, Nepelska M, Ottinger MA, Vergauwen L, and Whelan M (2014). Adverse outcome pathway (AOP) development I: strategies and principles. Toxicol. Sci 142, 312–320. doi: 10.1093/toxsci/kfu199.
Wittwehr C, Aladjov H, Ankley G, Byrne HJ, de Knecht J, Heinzle E, Klambauer G, Landesmann B, Luijten M, MacKay C, Maxwell G, Meek ME, Paini A, Perkins E, Sobanski T, Villeneuve D, Waters KM, and Whelan M (2017). How adverse outcome pathways can aid the development and use of computational prediction models for regulatory toxicology. Toxicol. Sci 155, 326–336. doi: 10.1093/toxsci/kfw207.
