NeKo: A tool for automatic network construction from prior knowledge

Marco Ruscone; Eirini Tsirvouli; Andrea Checcoli; Denes Turei; Emmanuel Barillot; Julio Saez-Rodriguez; Loredana Martignetti; Åsmund Flobak; Laurence Calzone

doi:10.1371/journal.pcbi.1013300

2025 Sep 16;21(9):e1013300. doi: 10.1371/journal.pcbi.1013300

Editor: Mark Alber

PMCID: PMC12456767 PMID: 40956863

Abstract

Biological networks provide a structured framework for analyzing the dynamic interplay and interactions between molecular entities, facilitating deeper insights into cellular functions and biological processes. Network construction often requires extensive manual curation based on scientific literature and public databases, a time-consuming and laborious task. To address this challenge, we introduce NeKo, a Python package to automate the construction of biological networks by integrating and prioritizing molecular interactions from various databases. NeKo allows users to provide their molecules of interest (e.g., genes, proteins or phosphosites), select interaction resources and apply flexible strategies to build networks based on prior knowledge. Users can filter interactions by various criteria, such as direct or indirect links and signed or unsigned interactions, to tailor the network to their needs and downstream analysis. We demonstrate some of NeKo’s capabilities in two use cases: first we construct a network based on transcriptomics from medulloblastoma; in the second, we model drug synergies. NeKo streamlines the network-building process, making it more accessible and efficient for researchers.

Author summary

Biological networks are a powerful tool for understanding complex diseases, providing insight into molecular interactions and regulatory mechanisms. However, constructing these networks is a challenging and time-consuming task, particularly when relying on manual curation from existing literature. To reduce the burden of analyzing thousands of scientific articles, researchers can extract interactions from biological databases. While many online tools exist for building networks from databases, they often lack flexibility. What is missing is a versatile framework that allows users to construct networks using multiple approaches and from diverse sources, including offline datasets. In this article, we introduce NeKo, a Python package designed to automate network construction by integrating interactions from various sources while offering users control over the building process. We demonstrate its utility by applying NeKo to Medulloblastoma data and by comparing a manually curated network of gastric cancer (Cascade) with one generated automatically by NeKo. Our results highlight the potential of NeKo in streamlining network inference, bridging the gap between manual curation and automated network generation.

1. Introduction

Many biological processes, such as intracellular signaling, can be represented and studied as networks of interacting molecular entities. Networks can provide structured representations of the dynamic interplay between various biological levels and can facilitate a detailed appreciation of the underlying regulatory mechanisms governing cellular functions and decision-making.

Networks can be tailored to depict biological processes at varying levels of detail, ranging from broad and abstract overviews to highly detailed representations of molecular processes. Two approaches can be applied to construct networks: a data-driven and a knowledge-driven approach. Data-driven approaches rely on high-throughput experimental data, such as gene expression or phosphoproteomics, to infer the regulatory interaction networks that underpin cellular functions. In contrast, knowledge-driven network construction involves a comprehensive review of the scientific literature and extensive manual curation of information from databases that collect known interactions, such as SIGNOR [1] or Reactome [2]. While data-driven approaches adapt well to various data types and are particularly useful for large-scale analyses, they remain highly sensitive to data quality and can suffer from overfitting or inaccuracies due to noise and high-dimensional data complexity [3,4]. Knowledge-driven approaches, for their part, provide more comprehensive networks because they gather information from many different sources, but these networks might be biased or incomplete because they reflect the current extent of biological knowledge.

To account for those limitations, data-driven and knowledge-driven approaches are frequently combined during network construction [5–7]. To achieve such a combination, several tools and strategies have been developed to integrate omics data and prior knowledge, including SignalingProfiler [8], CARNIVAL [9] and Augusta [10]. However, even in hybrid approaches, manual curation is still required to a great extent and remains a cumbersome and laborious task. During the manual curation step, many different databases are typically queried, each collecting interactions based on different criteria, standards, biological focus, and approaches. To facilitate the harmonization of those sources of knowledge, efforts such as OmniPath [11] and Pathway Commons [12,13] have been developed to collect and harmonize interactions from various resources, making the integrated knowledge readily available to the user. However, this process could be further automated by implementing strategies that mimic how a curator builds a problem-specific network from a general compendium of interactions. Here, we present NeKo, a Python tool to automatically construct biological networks by employing a series of flexible strategies to extract, group, and merge molecular interactions from various databases. Given a list of molecular entities of interest, called seeds (e.g., a list of differentially expressed genes or proteins), and a pre-defined source of interactions (e.g., a public or proprietary database), NeKo enables users to select various strategies to connect the seeds. NeKo can either consider or ignore the direction and causality of the interactions. The final network can be used as is for further analysis, or amended by the user (e.g., removing nodes, edges or paths) before being exported in the various formats provided by NeKo.

In the following sections, we describe in detail the network-building strategies offered by the tool and provide two use cases to showcase potential workflows to construct and use NeKo-derived networks. NeKo is free software available at https://github.com/sysbio-curie/Neko.

2. Design and implementation

2.1. Requirements

NeKo is a Python package designed to be used with Jupyter Notebooks, offering a streamlined and intuitive approach for collecting and summarizing prior knowledge based on the input provided.

To install NeKo, we provide a step-by-step procedure in the Supplementary Material in S1 Text and at the following link: https://sysbio-curie.github.io/Neko/. To avoid issues with the different operating systems and versions of required packages, we strongly suggest installing and running the tool within a conda or Poetry environment.

2.2. Workflow

The input of NeKo can be a list of molecular entities (genes, proteins), the seeds. NeKo then uses knowledge reported in databases to construct a new network, or to expand an existing network provided by the user. NeKo searches in the specified databases for the seeds and establishes links between them, using various connection strategies. Moreover, NeKo provides the option to annotate the network and suggests how the resulting network connects to different ontology terms based on Gene Ontology (Sect 2.3.4). The result of the workflow is a network that can be further analyzed with other tools such as Cytoscape [14] for topological analysis, or a Boolean model in BNET format (Sect 2.3.5) for dynamical modeling. An example of a typical NeKo workflow is presented in Fig 1.

Fig 1. First, the universe of interactions and the seed nodes of interest are defined. Then, NeKo connects the seed nodes based on the user-defined algorithm. The resulting network can be manipulated by adding annotations or connected to phenotypes using Gene Ontology terms. Lastly, the final network can be visualized and exported in various formats. Created in BioRender. Ruscone, M. (2025) https://BioRender.com/z98d148

First, the user selects the database of interest and the seeds. By default, NeKo derives interactions from OmniPath [11], which allows the extraction of interactions from multiple sources; NeKo can also be configured to use other public or proprietary databases. We refer the reader to the OmniPath manuscript for additional information on the sources covered. We also added simple direct clients to SIGNOR and PhosphoSitePlus because of their frequent updates and high significance in the field. For a specification of these databases, see Sect 2.3.2. The seeds are the entities to be connected by NeKo and can be provided as gene symbols.

As a next step, the user chooses one or more connecting strategies to create the network (see Sect 2.3.3 for more details). NeKo also provides methods for the editing of the networks it created (e.g., removing nodes, interactions, or paths). Furthermore, NeKo is able to include in the networks biological processes or phenotypes, by directly accessing the Gene Ontology database (see Sect 2.3.4). The final network can be visualized and exported in various formats, supporting a broad variety of downstream applications (see Sect 2.3.5).

2.3. Package structure

The NeKo package consists of six Python classes, each one with a specific function:

Network: hosts the structure and the functionalities of a network built with NeKo.
Universe: provides methods to build a database from external sources, store it, and make it accessible to the other classes.
Connections: implements basic algorithms for path-finding (such as Breadth First Search [15] and Depth First Search [16]), as well as more advanced algorithms for identifying any paths (see the function find_upstream_cascades).
Ontology: fetches nodes associated to a given ontology term and connects them to the network.
Exports: exports the network in different formats, such as BNET and SIF (Simple Interaction Format).
NetworkVisualizer: visualizes the network and exports the figures to pdf format.

2.3.1. The network class.

The Network class is represented by two pandas data frames: Network.nodes and Network.edges (Fig 2A). The Network.nodes data frame contains the Gene Symbols and UniProt identifiers of the entities in the network. The Network.edges data frame stores the edges of the network, each defined by its source and target nodes, and characterized by an effect, which can be ‘stimulation’, ‘inhibition’, ‘bimodal’, ‘form_complex’, or ‘undefined’, depending on the available information of each resource and the user inputs. Additionally, Network.edges includes information about the edge type and literature references if these are available in the database.
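
To illustrate how the five effect labels relate to boolean interaction flags of the kind stored in the Universe (Sect 2.3.2), here is a minimal sketch in plain Python. The function name edge_effect and the order in which the flags are checked are assumptions made for this example and do not reproduce NeKo's internal code.

```python
def edge_effect(row):
    """Map boolean interaction flags to one of the five effect labels.

    Hypothetical helper: the precedence of the checks is an assumption.
    """
    if row.get("form_complex", False):
        return "form_complex"
    stimulation = row.get("is_stimulation", False)
    inhibition = row.get("is_inhibition", False)
    if stimulation and inhibition:
        return "bimodal"  # reported as both activating and inhibiting
    if stimulation:
        return "stimulation"
    if inhibition:
        return "inhibition"
    return "undefined"  # no sign information available


# Toy rows with placeholder node names
rows = [
    {"source": "A", "target": "B", "is_stimulation": True},
    {"source": "B", "target": "C", "is_inhibition": True},
    {"source": "C", "target": "D", "is_stimulation": True, "is_inhibition": True},
    {"source": "D", "target": "E", "form_complex": True},
    {"source": "E", "target": "F"},
]
labels = [edge_effect(row) for row in rows]
```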

To create a Network object two arguments are required: a list of seed nodes and a data frame representing the database adapted to the NeKo format. The seeds and any known interactions can also be provided as a network to be expanded in a SIF file format.

The Network class also includes methods for adding and removing nodes, edges, or entire paths. It also offers various strategies that users can invoke to find and merge paths within the network. Users can customize the path collection strategies, for instance by filtering for signed paths (more details in the Supplementary material (S1 Text), Additional Information, Sect S3 “Link between Python functions and NeKo strategies”).

2.3.2. The Universe class.

The Universe class stores the prior knowledge database in pandas data frames (Fig 2B). By default, the complete OmniPath interactions database is used.

The Universe data frame has a column layout similar to the OmniPath interaction data frames. By default, it contains the following columns: ‘source’, ‘target’, ‘is_directed’, ‘is_stimulation’, ‘is_inhibition’, ‘form_complex’, ‘consensus_direction’, ‘consensus_stimulation’, ‘consensus_inhibition’, ‘references’, and ‘sources’, as shown in Fig 2. Of these columns, only ‘source’ and ‘target’ are mandatory (defining an unsigned network by default). The Universe.add_database method fills the missing columns with empty values or False. Additionally, NeKo provides specific functions to import the SIGNOR database [1] and PhosphoSitePlus (PSP) [17], including the PSP regulatory sites dataset about the effects of phosphorylation sites on protein activities. These two databases were directly integrated into NeKo to support a higher update frequency and access to additional details compared to what is currently available in OmniPath. We chose to include support for these databases because of their high quality.
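
The column-filling behavior described above can be sketched as follows, with a list of dictionaries standing in for the pandas data frame. The function normalize_interactions is hypothetical, and the choice of an empty string for the text columns and False for the flag columns is an assumption of this example; the actual behavior of Universe.add_database may differ.

```python
UNIVERSE_COLUMNS = [
    "source", "target", "is_directed", "is_stimulation", "is_inhibition",
    "form_complex", "consensus_direction", "consensus_stimulation",
    "consensus_inhibition", "references", "sources",
]
MANDATORY_COLUMNS = ("source", "target")
TEXT_COLUMNS = ("references", "sources")  # assumed to default to ""


def normalize_interactions(rows):
    """Return rows with the full column layout, filling missing columns."""
    normalized = []
    for row in rows:
        missing = [col for col in MANDATORY_COLUMNS if not row.get(col)]
        if missing:
            raise ValueError(f"missing mandatory column(s): {missing}")
        full = {}
        for col in UNIVERSE_COLUMNS:
            if col in row:
                full[col] = row[col]
            elif col in TEXT_COLUMNS:
                full[col] = ""      # empty value for text columns
            else:
                full[col] = False   # unsigned, undirected by default
        normalized.append(full)
    return normalized


# A user resource that only provides the two mandatory columns
table = normalize_interactions([{"source": "A", "target": "B"}])
```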

Additional information on how to build a database for NeKo can be found in the notebook “Build network using user-defined resources” (https://sysbio-curie.github.io/Neko/notebooks/2_add_resources.html).

2.3.3. The connections class.

NeKo offers different approaches, or strategies, to aggregate paths to build a network, starting from the seeds provided by the user. These strategies are based on well-known algorithms for path finding, such as Breadth-First Search (BFS) and Depth-First Search (DFS) algorithms.

The Connections class in NeKo implements these algorithms and provides a set of methods to explore and analyze networks efficiently. This class utilizes a pandas data frame as its underlying data structure, representing the edges from the prior knowledge database with ‘source’ and ‘target’ columns. To optimize performance, the class preprocesses the resource data into dictionary-based data structures, allowing for quick lookups of neighboring nodes.
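
A minimal sketch of such a preprocessing step is given below. It is an illustration under assumptions, not the code of the Connections class: the function name build_neighbor_index and the use of plain (source, target) tuples in place of data frame rows are choices made for this example.

```python
from collections import defaultdict


def build_neighbor_index(edges):
    """Index (source, target) pairs for constant-time neighbor lookups."""
    targets_of = defaultdict(set)  # node -> nodes it points to
    sources_of = defaultdict(set)  # node -> nodes pointing to it
    for source, target in edges:
        targets_of[source].add(target)
        sources_of[target].add(source)
    return targets_of, sources_of


# Toy edge list with placeholder node names
edges = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D")]
targets_of, sources_of = build_neighbor_index(edges)
```

Building both dictionaries once avoids scanning the whole edge table each time the neighbors of a node are requested.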

Key features of the Connections class include methods for finding target and source neighbors, performing BFS and DFS to find one or multiple paths between selected nodes. The class also incorporates more advanced functionalities, such as finding upstream cascades and identifying minimal sets of regulators that cover specified target genes (with the function find_cascades). These methods enable an efficient exploration of complex relationships and regulatory patterns within biological networks.
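
The two path-finding approaches can be sketched in a few lines. The functions below are illustrative and their names and signatures are assumptions: BFS returns one shortest directed path, and DFS enumerates all simple directed paths up to a maximum number of edges.

```python
from collections import deque


def bfs_shortest_path(targets_of, start, goal, max_len=None):
    """Return one shortest directed path as a list of nodes, or None."""
    if start == goal:
        return [start]
    queue = deque([[start]])
    visited = {start}
    while queue:
        path = queue.popleft()
        if max_len is not None and len(path) - 1 >= max_len:
            continue  # extending this path would exceed max_len edges
        for nxt in sorted(targets_of.get(path[-1], ())):
            if nxt in visited:
                continue
            if nxt == goal:
                return path + [nxt]
            visited.add(nxt)
            queue.append(path + [nxt])
    return None


def dfs_all_paths(targets_of, start, goal, max_len):
    """Return all simple directed paths with at most max_len edges."""
    found = []

    def walk(path):
        node = path[-1]
        if node == goal:
            found.append(list(path))
            return
        if len(path) - 1 == max_len:
            return
        for nxt in sorted(targets_of.get(node, ())):
            if nxt not in path:  # keep paths simple (no repeated node)
                path.append(nxt)
                walk(path)
                path.pop()

    walk([start])
    return found


graph = {"A": {"B", "C"}, "B": {"C"}, "C": {"D"}}
shortest = bfs_shortest_path(graph, "A", "D")
all_paths = dfs_all_paths(graph, "A", "D", max_len=3)
```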

2.3.4. The ontology class.

This class is designed to map a subset of nodes in the network to a certain phenotype. It allows the user to specify a GO accession code and the corresponding ontology term, and fetches the associated marker nodes from Gene Ontology (https://geneontology.org/). In the same class, we also provide an option to map tissue-specific expression to the network nodes, using data from the Human Protein Atlas [18], accessed via the OmniPath Annotations database.

2.3.5. The exports class.

The Exports class outputs the networks in formats that are readily usable by popular downstream analysis tools. The SIF output is supported by Cytoscape [14] and many other software tools.
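
As an illustration of the SIF layout (one interaction per line: source, relationship type, target), the following sketch serializes a list of signed edges. The function to_sif is hypothetical, and using the effect label as the relationship type is an assumption of this example rather than a statement about the files NeKo writes.

```python
def to_sif(edges):
    """Serialize (source, target, effect) triples as tab-separated SIF lines."""
    return "\n".join(
        f"{source}\t{effect}\t{target}" for source, target, effect in edges
    )


# Toy edges with placeholder node names
sif_text = to_sif([
    ("A", "B", "stimulation"),
    ("B", "C", "inhibition"),
])
```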

NeKo is also able to generate Boolean models from the networks it created. These can be exported in BNET format. A Boolean model is a mathematical representation of a biological system where each variable (node) can have one of two states: ON (1) or OFF (0), and the state of each node is determined by Boolean functions that link its input nodes with logical connectors, AND, OR and NOT. To create these models, NeKo automatically generates Boolean equations associated with each node in the network. The process for generating these equations is heuristic: all activators of a node are chained by OR gates, as are all inhibitors; the groups of activators and inhibitors are then separated by an AND NOT gate.
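
The heuristic described above can be sketched as follows. This is a simplified illustration, not NeKo's export code: the function names are hypothetical, the operators follow the usual BNET notation (| for OR, & for AND, ! for NOT), and assigning each unregulated node its own value as rule is an assumption of this example.

```python
def _or_group(names):
    """Chain names with OR, adding parentheses only when needed."""
    names = sorted(names)
    return names[0] if len(names) == 1 else "(" + " | ".join(names) + ")"


def boolean_rule(activators, inhibitors):
    """Activators OR-ed, inhibitors OR-ed, groups joined by AND NOT."""
    if activators and inhibitors:
        return f"{_or_group(activators)} & !{_or_group(inhibitors)}"
    if activators:
        return _or_group(activators)
    if inhibitors:
        return f"!{_or_group(inhibitors)}"
    return None


def to_bnet(edges):
    """Build BNET text from (source, target, effect) triples."""
    regulators = {}
    nodes = set()
    for source, target, effect in edges:
        nodes.update((source, target))
        activators, inhibitors = regulators.setdefault(target, (set(), set()))
        if effect == "stimulation":
            activators.add(source)
        elif effect == "inhibition":
            inhibitors.add(source)
    lines = ["targets, factors"]
    for node in sorted(nodes):
        activators, inhibitors = regulators.get(node, (set(), set()))
        # Assumption: nodes without signed regulators keep their own value
        rule = boolean_rule(activators, inhibitors) or node
        lines.append(f"{node}, {rule}")
    return "\n".join(lines)


# Toy network with placeholder node names
bnet_text = to_bnet([
    ("A", "B", "stimulation"),
    ("B", "D", "stimulation"),
    ("C", "D", "stimulation"),
    ("E", "D", "inhibition"),
])
```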

If the network encompasses nodes that form a complex (especially when fetching interactions from SIGNOR), the Boolean equation of the complex will have its members chained by AND gates. Furthermore, if the network contains bimodal interactions (those that can both activate and inhibit), an ensemble of models is created by permuting across all possible combinations of activation and inhibition for all bimodal interactions in the network. These generic rules can provide a starting point for further refinement based on expert knowledge. Tools such as BoNeSiS [19] can be used to refine these equations based on experimental data or additional biological knowledge.
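
The ensemble construction for bimodal interactions and the AND rule for complexes can be sketched as below. Both functions are hypothetical illustrations; in particular, resolving each bimodal edge independently as either stimulation or inhibition, which yields 2^n edge lists for n bimodal edges, is how this example reads the description above.

```python
from itertools import product


def complex_rule(members):
    """Boolean rule of a complex: all members chained by AND."""
    return " & ".join(sorted(members))


def expand_bimodal(edges):
    """Yield one signed edge list per combination of bimodal resolutions."""
    fixed = [edge for edge in edges if edge[2] != "bimodal"]
    bimodal = [edge for edge in edges if edge[2] == "bimodal"]
    for signs in product(("stimulation", "inhibition"), repeat=len(bimodal)):
        resolved = [
            (source, target, sign)
            for (source, target, _), sign in zip(bimodal, signs)
        ]
        yield fixed + resolved


# Toy network with two bimodal edges, hence four candidate models
edges = [
    ("A", "B", "stimulation"),
    ("C", "B", "bimodal"),
    ("D", "B", "bimodal"),
]
models = list(expand_bimodal(edges))
```

Because the number of models doubles with each bimodal interaction, reducing the number of such edges before export keeps the ensemble tractable.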

The Exports class thus serves as a bridge between NeKo’s network construction capabilities and the wider ecosystem of network analysis and modeling tools.

2.3.6. The NetworkVisualizer.

The class NetworkVisualizer converts NeKo’s networks to Graphviz objects (https://graphviz.org/), which support visualization. The network can be visualized directly in the Jupyter Notebook or exported to PDF. Moreover, it provides functions to customize the visualization, such as coloring nodes of interest, or to highlight the inputs and outputs of a specific node.

2.4. Connection strategies

Building on the classes presented previously, the Network class implements advanced strategies for network construction and expansion, using the functions in the Connections class. These strategies are illustrated in Fig 3 and described in detail below:

Fig 3. Orange nodes denote the user-input nodes. Created in BioRender. Ruscone, M. (2025) https://BioRender.com/c93h960.

Reciprocal Pathway Extender (RPE) strategy: It ensures comprehensive connectivity between node pairs by searching for shortest paths and integrating neighboring interactions. This algorithm initiates by checking if each pair of nodes is connected in both directions using the BFS or DFS algorithm. If a connection in one or both directions is missing, the algorithm searches for the shortest paths to establish these connections, incorporating all intermediate nodes and edges into the network. It then integrates direct neighboring interactions of newly added nodes before moving to the next pair.
Iterative Neighbor Expansion (INE) strategy: It builds networks through progressive incorporation of neighboring nodes. This algorithm begins by exploring each node’s immediate neighbors and integrating them into the network. If the remaining nodes are disconnected, the network is expanded by adding the neighboring nodes of the newly added nodes from the previous iteration. The process is repeated until all nodes are connected or until the user-defined number of iterations is reached. After completing the designated loops, the network is refined by removing any disconnected nodes or non-regulated nodes (i.e., nodes without any sources).
Regulatory Cascade Explorer (RCE) strategy: It identifies and ranks upstream regulators of designated output nodes, allowing for deep exploration of regulatory networks. After identifying all regulators for each node, the regulators are then ranked based on the total number of output nodes they regulate. Importantly, the ranking is flexible and adjusts to the actual degree of influence: the more output nodes a regulator interacts with, the higher it ranks (see Fig 3 for an example). By modifying the “depth” argument of the function, the algorithm looks further back to the regulators of these initial regulators, applying the same flexible ranking method at each level, based on their influence over their immediate successors. This strategy is particularly useful for identifying and prioritizing common transcription factors of gene targets that correspond to seeds in the network.
Module Connection Mapper (MCM) strategy: It establishes connections between predefined groups of nodes, utilizing the RPE algorithm to refine pathways between groups. It comprehensively iterates through all possible pairings between the groups to list existing pathways. Each pathway’s intermediate nodes are aggregated into a separate group, Group C. Subsequently, the Reciprocal Pathway Extender (RPE) algorithm is applied exclusively to this intermediary group to refine and extend the network pathways between the original groups. The MCM algorithm allows for three modes of connections: IN, OUT, and ALL. For the IN and OUT modes, only directed interactions from the “source” towards the “target” group of nodes are included.
Phenotype Integration and Network Connectivity (PINC) strategy: It integrates phenotype-specific genes into existing networks using the Gene Ontology (GO) annotations and the MCM algorithm. Starting with a phenotype of interest, users input a corresponding GO accession ID and its name, allowing the algorithm to fetch all associated genes from the GO database. These genes are then organized into a new group, which is integrated into the user’s existing network using the afore-described MCM algorithm. Finally, to simplify and enhance the network’s interpretability, the user can decide to merge the connected phenotype-related genes into a single composite node. This node represents the phenotype and maintains all interactions inherited from the individual genes, offering a clear and consolidated view of the interactions between specific phenotypes and the existing network.
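
The steps of the Reciprocal Pathway Extender listed above can be sketched in simplified form. The code below is an illustration under assumptions and not NeKo's implementation: the function names are hypothetical, only BFS is used, and the neighbor-integration step is reduced to adding the universe edges between newly added nodes and nodes already in the network.

```python
from collections import deque
from itertools import permutations


def shortest_path(targets_of, start, goal, max_len=None):
    """Breadth-first search for one shortest directed path, or None."""
    queue = deque([[start]])
    visited = {start}
    while queue:
        path = queue.popleft()
        if max_len is not None and len(path) - 1 >= max_len:
            continue
        for nxt in sorted(targets_of.get(path[-1], ())):
            if nxt == goal:
                return path + [nxt]
            if nxt not in visited:
                visited.add(nxt)
                queue.append(path + [nxt])
    return None


def index(edges):
    targets_of = {}
    for source, target in edges:
        targets_of.setdefault(source, set()).add(target)
    return targets_of


def connect_reciprocally(seeds, universe_edges, max_len=3):
    """Connect every ordered seed pair, adding intermediates when needed."""
    universe_edges = set(universe_edges)
    universe_index = index(universe_edges)
    nodes = set(seeds)
    edges = {(s, t) for s, t in universe_edges if s in nodes and t in nodes}
    for a, b in permutations(seeds, 2):  # ordered pairs cover both directions
        if shortest_path(index(edges), a, b):
            continue  # already connected in the current network
        path = shortest_path(universe_index, a, b, max_len)
        if path is None:
            continue  # no path within max_len edges in the universe
        new_nodes = set(path) - nodes
        nodes |= new_nodes
        edges.update(zip(path, path[1:]))
        # Integrate interactions between new nodes and the current network
        edges |= {
            (s, t) for s, t in universe_edges
            if s in nodes and t in nodes and (s in new_nodes or t in new_nodes)
        }
    return nodes, edges


universe = [("A", "X"), ("X", "B"), ("B", "A"), ("X", "Y"), ("C", "X")]
rpe_nodes, rpe_edges = connect_reciprocally(["A", "B"], universe)
```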

2.5. NeKo strategies for building a network

The network construction strategies available in NeKo have been designed to work in both a complementary and synergistic fashion. NeKo users can apply one or multiple strategies to build and refine biological networks originating from a set of biological entities. As an example, a typical use of NeKo begins with creating a baseline network by applying a single iteration of the Iterative Neighbor Expansion (INE) strategy on a set of seed nodes, corresponding to genes or proteins of interest. This set of nodes could be a list of genes provided by experts, the list of top hits resulting from a differential analysis (either proteomics- or transcriptomics-based), or a signature derived from a statistical analysis. The resulting network may connect the nodes, either directly or indirectly, while an unconnected network suggests missing components.
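
A baseline network of this kind can be sketched as follows. The code is a simplified illustration of iterative neighbor expansion under assumptions: the function names are hypothetical, connectivity is tested on the undirected graph, and the final pruning of disconnected or non-regulated nodes is omitted.

```python
def is_connected(nodes, edges):
    """True if all nodes belong to a single undirected component."""
    nodes = set(nodes)
    if not nodes:
        return True
    neighbors = {node: set() for node in nodes}
    for source, target in edges:
        if source in neighbors and target in neighbors:
            neighbors[source].add(target)
            neighbors[target].add(source)
    start = next(iter(nodes))
    seen = {start}
    stack = [start]
    while stack:
        for nxt in neighbors[stack.pop()] - seen:
            seen.add(nxt)
            stack.append(nxt)
    return seen == nodes


def iterative_neighbor_expansion(seeds, universe_edges, max_iterations=1):
    """Add neighbors of the most recently added nodes, one layer per iteration."""
    nodes = set(seeds)
    frontier = set(seeds)
    edges = set()
    for _ in range(max_iterations):
        added = set()
        for source, target in universe_edges:
            if source in frontier or target in frontier:
                edges.add((source, target))
                added |= {source, target} - nodes
        nodes |= added
        if is_connected(nodes, edges) or not added:
            break  # everything is connected, or nothing new was found
        frontier = added
    return nodes, edges


universe = [("A", "X"), ("X", "Y"), ("Y", "B"), ("C", "D")]
one_nodes, one_edges = iterative_neighbor_expansion(["A", "B"], universe)
two_nodes, two_edges = iterative_neighbor_expansion(["A", "B"], universe, 2)
```

In this toy example a single iteration leaves the two seeds in separate components, which is the situation addressed by the strategies discussed next; a second iteration adds the edge between the two intermediates and connects the network.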

In the case of an unconnected network, several strategies can be proposed, independently from each other or consecutively.

If the nodes are not connected, there may exist intermediate entities in the database that would connect them. The search for intermediate nodes can be done with the Reciprocal Pathway Extender (RPE) strategy with the choice to include one, two, or more intermediates and connect them to the current network.

The Regulatory Cascade Explorer (RCE) strategy (Sect 2.3.3) can facilitate the retrieval of entities that regulate more than one node, and provides complementary information to the network by determining common regulators for a given set of nodes.
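
The ranking of common regulators can be sketched for a single level of the cascade. The function rank_regulators is a hypothetical illustration: it counts, for each direct regulator, how many of the output nodes it targets, and breaks ties alphabetically, which is an arbitrary choice of this example.

```python
def rank_regulators(outputs, universe_edges):
    """Rank direct regulators by the number of output nodes they regulate."""
    outputs = set(outputs)
    regulated = {}
    for source, target in universe_edges:
        if target in outputs:
            regulated.setdefault(source, set()).add(target)
    # Most outputs first; alphabetical order breaks ties
    return sorted(regulated.items(), key=lambda item: (-len(item[1]), item[0]))


# Toy universe: TF1 regulates all three outputs, TF2 two, TF3 one
universe = [
    ("TF1", "G1"), ("TF1", "G2"), ("TF1", "G3"),
    ("TF2", "G1"), ("TF2", "G2"),
    ("TF3", "G3"),
    ("TF3", "OTHER"),
]
ranking = rank_regulators(["G1", "G2", "G3"], universe)
```

Applying the same function to the top-ranked regulators, treated as new output nodes, would correspond to looking one level further upstream.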

The Module Connection Mapper (MCM) strategy aims at connecting two separate groups of nodes, for example, a set of receptors with a set of downstream transcription factors. This strategy does not handle all nodes together, but as two groups, creating paths from one to the other.

The Phenotype Integration and Network Connectivity (PINC) strategy connects phenotypes to the existing network by replacing all genes that belong to a phenotype with a single phenotype node. This method enhances biological interpretability and provides a more comprehensive view of the biological system.
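
The merging step can be sketched as a simple relabeling of edges. The function merge_into_phenotype is hypothetical, and dropping interactions between two genes of the same phenotype is an assumption of this example.

```python
def merge_into_phenotype(edges, phenotype_genes, phenotype_name):
    """Replace phenotype genes by one node that inherits their interactions."""
    genes = set(phenotype_genes)
    merged = set()
    for source, target, effect in edges:
        new_source = phenotype_name if source in genes else source
        new_target = phenotype_name if target in genes else target
        if new_source == new_target == phenotype_name:
            continue  # assumption: drop edges internal to the gene set
        merged.add((new_source, new_target, effect))
    return merged


# Toy network: P1 and P2 are the genes annotated to the phenotype
edges = [
    ("A", "P1", "inhibition"),
    ("B", "P2", "stimulation"),
    ("P2", "P1", "stimulation"),
    ("C", "A", "stimulation"),
]
merged = merge_into_phenotype(edges, ["P1", "P2"], "phenotype")
```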

Alternatively, modelers can base their reasoning on sets of nodes derived from gene lists associated with specific ontology terms, e.g., cell cycle or apoptosis. By combining multiple ontology terms, the corresponding nodes can be merged into a network (whose components can be connected following any of the available strategies presented above), thereby highlighting the relationships between these ontology terms and other nodes provided by the user.

Overall, the network construction strategies available in NeKo provide a flexible toolkit to handle different types of biological data and types of inferred networks (e.g., protein-protein interaction networks, regulatory networks, TF networks, etc.).

The methods presented here complete almost instantly when building small networks (≤50 nodes), while above 300 nodes the process may take up to two minutes (more details in the documentation, in the notebook “Re-creating famous pathways from SIGNOR and WIKIPATHWAYS using NeKo”).

3. Results

We showcase NeKo’s application to real-life questions in two case studies on cancer datasets.

3.1. Use Case 1 - Medulloblastoma subgroup network

Medulloblastoma, the most common pediatric brain tumor, can be classified into four distinct subgroups, each characterized by unique genetic signatures, clinical features, and outcomes [20]. The wingless (WNT) and sonic hedgehog (SHH) subgroups have well-defined profiles with abnormalities in developmental pathways. Group 3 and Group 4 are less well-defined, with Group 3 characterized by MYC amplification and Group 4 by CDK6 and SRC overexpression [21].

Previous analyses of patient stratification based on omics data [21] have shown that Groups 3 and 4 share some similarities but the underlying mechanisms are different. In Group 3, deregulations at the transcription level are reported, while in Group 4, alterations of post-transcriptional events are observed involving proteins like SRC or ERBB4.

Signatures based on transcriptome and DNA methylome profiles are adopted in clinical practice, but we lack a clear understanding of their causal relationship to the disease. We propose, with NeKo, to construct a network from these signatures and expect to see the emergence of new signaling pathways when connecting the genes of the signatures that could bring additional and insightful information.

Given that the assembled network topology is heavily influenced by the choice of the database, we opted to fetch causal interactions from SIGNOR (directed and signed interactions) and protein-protein interactions from HuRI (undirected and unsigned interactions), employing identical workflows and algorithms for both databases.

The network construction process starts with collecting genes/proteins of interest from major publications that focus on medulloblastoma [22]. The seed genes specific to each subgroup are reported in Table 1. It is important to note that not all the selected genes were present in the chosen databases. When querying the databases, NeKo ignores the genes of interest that are missing from the Universe.

Table 1. Seed genes for each subgroup.

WNT		SHH		Group 3	Group 4
ADGRB3	IRX2	BCOR	PRKAR1A	ATM	BRCA2	MYCN
ALX4	LEF1	BRCA2	PTCH1	CRX	CDK6	OTX2
APC	LHX8	CEBPA	PTEN	HLX	EOMES	POU2F1
ARID1A	MAF	CREBBP	RARB	IRX6	FOXP2	PRDM6
ATM	MSX2	DDX3X	SOX13	LHX9	GFI1	SIX1
CSNK2B	OSR2	FBXW7	SOX2	MYC	GFI1B	SIX6
CTNNB1	PAX3	GLI2	SUFU	PTEN	KDM6A	TAL1
DDX3X	PIK3CA	GSE1	TBX18	SMARCA4	KMT2C	TBR1
DLX3	PITX1	KMT2C	TCF4		KMT2D	ZIC1
EMX2	PRRX1	KMT2D	TERT		LHX1	ZMYM3
HOXC4	RUNX2	NFATC1	TP53		LHX2
HOXC5	SYNCRIP	PBX1			LMX1A
HOXD8	POU6F2	OTX2			MED12

Seed genes used to generate the networks for each subgroup. Those genes were selected from two major publications focused on medulloblastoma analysis, Northcott et al. [22] and Lin et al. [26].

In this use case, we applied the RPE strategy, using the depth-first search algorithm, to connect the seed genes while minimizing the number of intermediate nodes. We selected a maximum path length of 3, which was increased to 4 for disconnected nodes. If the resulting network still contained disconnected nodes, the RPE strategy was applied again using breadth-first search.

The networks were then exported in SIF format for visualization and topological analysis using Cytoscape.

An additional step was implemented for the networks built from SIGNOR to incorporate complex and protein family nodes, introducing all complex components and connecting them to the rest of the network.

We performed a GO enrichment analysis using the Cytoscape plug-in ClueGO [23] to assess whether the network constructed for each subgroup exhibited characteristic features (Fig 4) and compared it to the GO analysis before the network construction. The GO analysis was performed using the same settings for all subgroups (intermediate terms and p-value ≤ 0.05). The results show that after the network construction, mechanisms coherent with the known biology appear for each group, whereas the usual GO enrichment analysis with the seed genes falls short of providing a mechanistic explanation (S6 Fig). Indeed, the Wnt signaling pathway is strongly enriched in the NeKo network of the WNT subgroup, confirming that Wnt/Wingless pathway activation characterizes this distinct molecular subgroup of medulloblastomas [24]. Additionally, we find cell cycle-related terms significantly enriched in the network of Group 3, which is known for its accelerated proliferation driven by MYC activation [25]. Finally, the network of Group 4 shows an enrichment of kinase-receptor terms, consistent with the fact that this subgroup is particularly characterized by post-transcriptional regulatory events [21]. More details about the results are provided in Supplementary Material (S4, S5, and S6 Figs).

Fig 4. A) Comparison of the main topological network properties between the four subgroups. B) Venn diagram displaying the common nodes among each subgroup. C) Functional comparison of the four medulloblastoma subgroup networks generated from the HuRI database, using the ClueGO Cytoscape plug-in. Only leading terms with an adjusted p-value ≤ 0.05 are presented.

3.2. Use Case 2 - DrugLogics pipeline

The DrugLogics pipeline [27] is designed to predict drug synergy using a genetic algorithm to calibrate logical models to data. As input, the pipeline receives a network file in SIF format that contains signed and directed interactions. A manually curated logical model referred to as ‘Cascade’, describing the signaling interactions around seven drug targets in gastric cancer, previously demonstrated high predictive accuracy for identifying synergistic drug combinations [28]. To assess whether a NeKo-derived topology could achieve comparable performance with the manually curated topology, we constructed a NeKo topology using as an input the same seven drug targets as in the initial publication.

NeKo successfully connected all the input nodes, resulting in a network of 57 nodes and 140 edges. Compared to the manually curated network from [28], which consisted of 77 nodes and 217 edges, the NeKo-derived network was smaller but more densely connected, as shown by its higher average clustering coefficient (0.4 for the NeKo network and 0.07 for the manually curated one). A more detailed topological comparison of the two networks is presented in the Supplementary Information (S7 Fig).
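
For reference, the average clustering coefficient can be computed as sketched below. This is a generic illustration on a toy graph, not the analysis reported above: it assumes that edges are treated as undirected and that nodes with fewer than two neighbors contribute a coefficient of 0, and the function name is hypothetical.

```python
from itertools import combinations


def average_clustering(edges):
    """Mean local clustering coefficient of an undirected graph."""
    neighbors = {}
    for a, b in edges:
        if a == b:
            continue  # ignore self-loops
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    if not neighbors:
        return 0.0
    total = 0.0
    for node, nbrs in neighbors.items():
        k = len(nbrs)
        if k < 2:
            continue  # coefficient counted as 0 for degree below 2
        # Number of edges among the neighbors of this node
        links = sum(1 for u, v in combinations(nbrs, 2) if v in neighbors[u])
        total += 2 * links / (k * (k - 1))
    return total / len(neighbors)


# Toy graph: a triangle A-B-C with a pendant node D attached to A
coefficient = average_clustering([("A", "B"), ("B", "C"), ("A", "C"), ("A", "D")])
```

In this toy graph the local coefficients are 1/3 for A, 1 for B and C, and 0 for D, giving an average of 7/12.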

To assess whether the network generated by NeKo successfully captured the signaling events underlying drug responses, we evaluated its ability to identify experimentally validated synergistic drug combinations, as also presented in [28]. The network obtained via NeKo correctly classified 3 of the 4 synergies (i.e., true positives), compared to the initial model, which classified them all correctly.

For a more comprehensive comparison of their overall performance, the ROC (Receiver Operating Characteristic) curves of the two models are provided (Fig 5). The ROC-AUC (Receiver Operating Characteristic Area Under the Curve) value is a metric that assesses the model’s ability to distinguish between synergies and non-synergies. A value of 1 denotes a perfect model that correctly classifies all combinations. A value of 0.5 implies that the model is random, with no predictive ability. In the case of the NeKo model, an AUC value of 0.65 is reported, indicating that the model has predictive ability above random, even with no user intervention on the resulting topology. This also suggests that the NeKo model includes biologically relevant interactions that capture events triggered in drug responses.
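
The interpretation of the ROC-AUC can be made concrete with a short sketch. The function below uses the rank-based definition of the AUC, i.e., the probability that a randomly chosen positive receives a higher score than a randomly chosen negative, with ties counted as one half. The labels and scores are toy values invented for this illustration and are not data from this study.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of positive/negative pairs ranked correctly."""
    positives = [s for label, s in zip(labels, scores) if label]
    negatives = [s for label, s in zip(labels, scores) if not label]
    if not positives or not negatives:
        raise ValueError("both classes must be present")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))


# Toy example: 2 positives (e.g., synergies) and 3 negatives
labels = [1, 1, 0, 0, 0]
scores = [0.9, 0.4, 0.5, 0.3, 0.1]
auc = roc_auc(labels, scores)
```

Here five of the six positive/negative pairs are ordered correctly, so the AUC is 5/6; a model that assigned the same score to every combination would obtain 0.5.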

However, the manually curated model, which contains validated processes reported in the literature, performs better (Fig 5). This performance indicates that the network from NeKo is a useful starting point for further manual curation, saving substantial time compared to fully manual curation.

4. Discussion

NeKo offers users a versatile and efficient tool for building networks from prior knowledge databases. Its ability to automate the integration of molecular interactions and tailor network building to specific research questions reduces the burden of the otherwise labor-intensive and time-consuming process of manual curation.

NeKo’s adaptability was demonstrated by two use cases related to cancer applications. The first use case, on medulloblastoma, showed that molecular signatures can be interpreted with networks to gain insights into the processes that are deregulated in each subgroup, which may not have been obvious from the signatures alone. The next step could be the translation of this network into a mathematical model to study the four subgroups dynamically. NeKo facilitates this process by providing a first draft of the network and an initial list of logical equations with standard rules. In the second use case, we compared two models that were developed for the same purpose of studying drug synergies and from the same initial list of genes, one built with manual curation and the other with automatic inference by NeKo. We showed that NeKo can create biologically relevant models that can be used to identify synergistic drug combinations.

NeKo can be used as a first step for further analyses, such as topological studies, integration of omics data into these networks [29], or visualization of data onto the networks (with the SIF output), and modeling purposes (with BNET output). The type of downstream analyses that users intend to perform will also affect the network construction strategy. Depending on whether the aim is to conduct network analysis or build a predictive model, the choices regarding the number of genes, the selected database, and the level of detail in inferred interactions might differ significantly.

Despite these strengths, the effectiveness of NeKo is inherently limited by several factors, and NeKo should be considered as a building block of a pipeline that leads to further refinement and analyses. First, the scope of the analysis is confined by the selection of input entities (genes, proteins, phosphosites, etc.). The seed nodes significantly influence the resulting network, and any omission or misidentification of key molecular players can lead to incomplete or biased networks. Additionally, the choice of databases plays a crucial role in shaping the network. Different databases have varying levels of comprehensiveness, curation quality, and focus, which can impact the accuracy, completeness, and relevance of the constructed networks. Integration within a larger ecosystem of network biology, such as the Network Commons (https://github.com/saezlab/networkcommons), would allow synergizing NeKo with other components, including algorithms for network contextualization from omics data [7], complementary visualizations, and downstream benchmarking.

Another limitation resides in the Boolean models that can be built with NeKo. The heuristic approach used to shape the Boolean equations that characterize each node of the network does not lead to any significant attractor in most cases. This is because NeKo does not rely on data to support or constrain specific network states, meaning that the attractors reached when simulating the Boolean model often do not have biological relevance. The main issue arises from the large number of edges introduced into the network, which generates excessive noise and reduces the interpretability of the model. Future improvements could involve refining the edge selection process to reduce noise and improve the meaningfulness of attractors. Finally, the graphics library used to visualize the network, Graphviz, struggles with performance and readability when dealing with a high number of nodes in Jupyter notebooks. In future updates, we plan to support multiple libraries for graph visualization, such as igraph or NetworkX, to provide more efficient and interactive alternatives.
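To make the kind of heuristic Boolean equations discussed above concrete, here is a minimal sketch of one common default convention: a node is active when at least one of its activators is active and none of its inhibitors is. This rule is assumed here for illustration and is not necessarily the one implemented in NeKo; the edge list, the sign labels and the `to_bnet` helper are hypothetical, and the output follows the plain `target, expression` layout (with `&`, `|`, `!` operators) used by BNET files.

```python
def to_bnet(signed_edges):
    """Build BNET-style lines from (source, sign, target) triples.

    Assumed rule: target = (OR of activators) AND NOT (OR of inhibitors).
    Nodes without regulators keep their own value (treated as inputs).
    """
    activators, inhibitors, nodes = {}, {}, set()
    for source, sign, target in signed_edges:
        nodes.update((source, target))
        if sign == "activation":
            activators.setdefault(target, []).append(source)
        elif sign == "inhibition":
            inhibitors.setdefault(target, []).append(source)
        else:
            raise ValueError("unknown sign: " + sign)
    lines = ["targets, factors"]  # header line, assumed BNET convention
    for node in sorted(nodes):
        acts = sorted(activators.get(node, []))
        inhs = sorted(inhibitors.get(node, []))
        if not acts and not inhs:
            rule = node  # input node: value stays as initialised
        elif acts and inhs:
            rule = "(" + " | ".join(acts) + ") & !(" + " | ".join(inhs) + ")"
        elif acts:
            rule = " | ".join(acts)
        else:
            rule = "!(" + " | ".join(inhs) + ")"
        lines.append(node + ", " + rule)
    return lines


# Hypothetical signed, directed interactions between abstract nodes.
toy_edges = [
    ("A", "activation", "B"),
    ("B", "activation", "C"),
    ("D", "inhibition", "C"),
    ("C", "inhibition", "A"),
]
bnet_lines = to_bnet(toy_edges)
print("\n".join(bnet_lines))
```

Because such a rule is applied uniformly to every node, each additional edge changes the equation of its target without any data constraining the result, which is consistent with the observation that densely connected networks can yield attractors that are hard to interpret.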

In summary, NeKo represents a significant step forward in the automation and flexibility of biological network construction. Its ability to integrate diverse interaction data and provide customizable network-building strategies makes it a valuable tool for users aiming to explore complex biological systems.

5. Availability and future directions

NeKo is currently available on PyPI (refer to Supplementary Material (S1 Text) for detailed installation instructions). The package documentation, including the full API reference, is automatically generated using Sphinx [30]. This documentation includes several Jupyter Notebooks, each of which offers step-by-step guidance for building a NeKo network, providing practical examples to help users become familiar with the workflow. The use cases demonstrated in this manuscript are publicly accessible on GitHub (https://github.com/sysbio-curie/NeKo_Supp_Mat), with additional insights and analyses provided in the Supplementary Information (S1 Text). Regular updates will ensure continued support and expansion of NeKo’s features. Future versions will introduce new functionalities, including methods for the automated refinement and curation of NeKo networks based on user-specified criteria, such as data integration or prior knowledge input.

Additional export options will also be implemented to generate output files compatible with other modeling formalisms, such as Ordinary Differential Equation (ODE) models. Finally, we plan to refactor NeKo’s codebase to facilitate contributions from the community and enhance its accessibility. Since NeKo works best when integrated with other modeling and analysis tools, we will ensure that its structure remains modular and compatible with broader computational pipelines, including the aforementioned Network Commons.

Supporting information

S1 Text. This document contains additional details and analysis that complement the work presented in the main text.

The code and figures used for both the main article and the supplementary materials can be found at the following GitHub page: https://github.com/sysbio-curie/NeKo_Supp_Mat.

(PDF)

pcbi.1013300.s001.pdf (8.8 MB, PDF)

Data Availability

The source code and data used to produce the results and analyses presented in this manuscript are available on the following GitHub repositories: https://github.com/sysbio-curie/Neko; https://github.com/sysbio-curie/NeKo_Supp_Mat.

Funding Statement

AC was financed by the ModICeD project, financially supported by ITMO Cancer of Aviesan (2020). LM is part of ModPhosphoNet, a project financially supported by ITMO Cancer of Aviesan (2021-2030). LC and MR are partially supported by the Certainty project, which is part of the European Union’s Horizon Europe research and innovation program under grant agreement no. 101136379. DT was supported by the HPC/Exascale Centre of Excellence for Personalised Medicine in Europe [PerMedCoE; European Union Horizon 2020 program, grant no. 951773] and by the German Federal Ministry of Education and Research [Bundesministerium für Bildung und Forschung BMBF; grant no. 031L0181B] funds granted to JSR. EB and LC are part of PRAIRIE 3IA Institute, funded by the French government under the management of ANR as part of the Investissements d’avenir program (ANR-19-P3IA-0001). JSR reports funding from GSK, Pfizer, and Sanofi and fees/honoraria from Travere Therapeutics, Stadapharm, Astex, Owkin, Pfizer, Moderna, and Grunenthal.

References

1. Licata L, Lo Surdo P, Iannuccelli M, Palma A, Micarelli E, Perfetto L, et al. SIGNOR 2.0, the SIGnaling Network Open Resource 2.0: 2019 update. Nucleic Acids Res. 2020;48(D1):D504–10. doi: 10.1093/nar/gkz949
2. Jassal B, Matthews L, Viteri G, Gong C, Lorente P, Fabregat A, et al. The reactome pathway knowledgebase. Nucleic Acids Res. 2019. doi: 10.1093/nar/gkz1031
3. Pratapa A, Jalihal AP, Law JN, Bharadwaj A, Murali TM. Benchmarking algorithms for gene regulatory network inference from single-cell transcriptomic data. Nat Methods. 2020;17(2):147–54. doi: 10.1038/s41592-019-0690-6
4. Marbach D, Costello JC, Küffner R, Vega NM, Prill RJ, Camacho DM, et al. Wisdom of crowds for robust gene network inference. Nat Methods. 2012;9(8):796–804. doi: 10.1038/nmeth.2016
5. Barabási A-L, Gulbahce N, Loscalzo J. Network medicine: a network-based approach to human disease. Nat Rev Genet. 2011;12(1):56–68. doi: 10.1038/nrg2918
6. Le Novère N. Quantitative and logic modelling of molecular and gene networks. Nat Rev Genet. 2015;16(3):146–58. doi: 10.1038/nrg3885
7. Garrido-Rodriguez M, Zirngibl K, Ivanova O, Lobentanzer S, Saez-Rodriguez J. Integrating knowledge and omics to decipher mechanisms via large-scale models of signaling networks. Mol Syst Biol. 2022;18(7):e11036. doi: 10.15252/msb.202211036
8. Venafra V, Sacco F, Perfetto L. SignalingProfiler 2.0 a network-based approach to bridge multi-omics data to phenotypic hallmarks. NPJ Syst Biol Appl. 2024;10(1):95. doi: 10.1038/s41540-024-00417-6
9. Liu A, Trairatphisan P, Gjerga E, Didangelos A, Barratt J, Saez-Rodriguez J. From expression footprints to causal pathways: contextualizing large signaling networks with CARNIVAL. NPJ Syst Biol Appl. 2019;5:40. doi: 10.1038/s41540-019-0118-z
10. Musilova J, Vafek Z, Puniya BL, Zimmer R, Helikar T, Sedlar K. Augusta: From RNA-Seq to gene regulatory networks and Boolean models. Comput Struct Biotechnol J. 2024;23:783–90. doi: 10.1016/j.csbj.2024.01.013
11. Türei D, Korcsmáros T, Saez-Rodriguez J. OmniPath: guidelines and gateway for literature-curated signaling pathway resources. Nat Methods. 2016;13(12):966–7. doi: 10.1038/nmeth.4077
12. Rodchenkov I, Babur O, Luna A, Aksoy BA, Wong JV, Fong D, et al. Pathway Commons 2019 Update: integration, analysis and exploration of pathway data. Nucleic Acids Res. 2020;48(D1):D489–97. doi: 10.1093/nar/gkz946
13. Wong JV, Franz M, Siper MC, Fong D, Durupinar F, Dallago C, et al. Author-sourced capture of pathway knowledge in computable form using Biofactoid. Elife. 2021;10:e68292. doi: 10.7554/eLife.68292
14. Shannon P, Markiel A, Ozier O, Baliga NS, Wang JT, Ramage D, et al. Cytoscape: a software environment for integrated models of biomolecular interaction networks. Genome Res. 2003;13(11):2498–504. doi: 10.1101/gr.1239303
15. Bundy A, Wallen L. Breadth-first search. Catalogue of artificial intelligence tools. Berlin, Heidelberg: Springer; 1984. p. 13. doi: 10.1007/978-3-642-96868-6_25
16. Kozen DC. Depth-first and breadth-first search. The design and analysis of algorithms. New York: Springer; 1992. p. 19–24. doi: 10.1007/978-1-4612-4400-4_4
17. Hornbeck PV, Kornhauser JM, Tkachev S, Zhang B, Skrzypek E, Murray B, et al. PhosphoSitePlus: a comprehensive resource for investigating the structure and function of experimentally determined post-translational modifications in man and mouse. Nucleic Acids Res. 2012;40(Database issue):D261–70. doi: 10.1093/nar/gkr1122
18. Uhlén M, Fagerberg L, Hallström BM, Lindskog C, Oksvold P, Mardinoglu A, et al. Proteomics. Tissue-based map of the human proteome. Science. 2015;347(6220):1260419. doi: 10.1126/science.1260419
19. Chevalier S, Froidevaux C, Paulevé L, Zinovyev A. Synthesis of Boolean networks from biological dynamical constraints using answer-set programming. In: 2019 IEEE 31st International Conference on Tools with Artificial Intelligence (ICTAI); 2019. p. 34–41. https://ieeexplore.ieee.org/document/8995442
20. Northcott PA, Korshunov A, Witt H, Hielscher T, Eberhart CG, Mack S, et al. Medulloblastoma comprises four distinct molecular variants. J Clin Oncol. 2011;29(11):1408–14. doi: 10.1200/JCO.2009.27.4324
21. Forget A, Martignetti L, Puget S, Calzone L, Brabetz S, Picard D, et al. Aberrant ERBB4-SRC signaling as a hallmark of group 4 medulloblastoma revealed by integrative phosphoproteomic profiling. Cancer Cell. 2018;34(3):379-395.e7. doi: 10.1016/j.ccell.2018.08.002
22. Northcott PA, Buchhalter I, Morrissy AS, Hovestadt V, Weischenfeldt J, Ehrenberger T, et al. The whole-genome landscape of medulloblastoma subtypes. Nature. 2017;547(7663):311–7. doi: 10.1038/nature22973
23. Bindea G, Mlecnik B, Hackl H, Charoentong P, Tosolini M, Kirilovsky A, et al. ClueGO: a Cytoscape plug-in to decipher functionally grouped gene ontology and pathway annotation networks. Bioinformatics. 2009;25(8):1091–3. doi: 10.1093/bioinformatics/btp101
24. Clifford SC, Lusher ME, Lindsey JC, Langdon JA, Gilbertson RJ, Straughton D, et al. Wnt/Wingless pathway activation and chromosome 6 loss characterise a distinct molecular sub-group of medulloblastomas associated with a favourable prognosis. Cell Cycle. 2006;5(22):2666–70. doi: 10.4161/cc.5.22.3446
25. Kawauchi D, Ogg RJ, Liu L, Shih DJH, Finkelstein D, Murphy BL, et al. Novel MYC-driven medulloblastoma models from multiple embryonic cerebellar cells. Oncogene. 2017;36(37):5231–42. doi: 10.1038/onc.2017.110
26. Lin CY, Erkek S, Tong Y, Yin L, Federation AJ, Zapatka M, et al. Active medulloblastoma enhancers reveal subgroup-specific cellular origins. Nature. 2016;530(7588):57–62. doi: 10.1038/nature16546
27. Flobak Å, Zobolas J, Vazquez M, Steigedal TS, Thommesen L, Grislingås A, et al. Fine tuning a logical model of cancer cells to predict drug synergies: combining manual curation and automated parameterization. Front Syst Biol. 2023;3:1252961. doi: 10.3389/fsysb.2023.1252961
28. Flobak Å, Baudot A, Remy E, Thommesen L, Thieffry D, Kuiper M, et al. Discovery of drug synergies in gastric cancer cells predicted by logical modeling. PLoS Comput Biol. 2015;11(8):e1004426. doi: 10.1371/journal.pcbi.1004426
29. Chen H-H, Hsueh C-W, Lee C-H, Hao T-Y, Tu T-Y, Chang L-Y, et al. SWEET: a single-sample network inference method for deciphering individual features in disease. Brief Bioinform. 2023;24(2):bbad032. doi: 10.1093/bib/bbad032
30. Sphinx. Sphinx documentation. https://www.sphinx-doc.org/en/master/

PLoS Comput Biol. doi: 10.1371/journal.pcbi.1013300.r001

Decision Letter 0

Mark Alber, Alberto J M Martin

6 Jan 2025

PCOMPBIOL-D-24-01942

NeKo: a tool for automatic network construction from prior knowledge

PLOS Computational Biology

Dear Dr. Ruscone,

Thank you for submitting your manuscript to PLOS Computational Biology. After careful consideration, we feel that it has merit but does not fully meet PLOS Computational Biology's publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.

Please submit your revised manuscript within 60 days (by Mar 08 2025 11:59PM). If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at ploscompbiol@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pcompbiol/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

Please include the following items when submitting your revised manuscript:

* A rebuttal letter that responds to each point raised by the editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'. This file does not need to include responses to formatting updates and technical items listed in the 'Journal Requirements' section below.

* A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'.

* An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'.

If you would like to make changes to your financial disclosure, competing interests statement, or data availability statement, please make these updates within the submission form at the time of resubmission. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter.

We look forward to receiving your revised manuscript.

Kind regards,

Alberto J M Martin, Ph.D.

Academic Editor

PLOS Computational Biology

Mark Alber

Section Editor

PLOS Computational Biology

Additional Editor Comments:

Please address the issues raised by the reviewers, especially those regarding scalability to larger datasets, limitations, and validation of the tool.

Journal Requirements:

1) We ask that a manuscript source file is provided at Revision. Please upload your manuscript file as a .doc, .docx, .rtf or .tex. If you are providing a .tex file, please upload it under the item type ‘LaTeX Source File’ and leave your .pdf version as the item type ‘Manuscript’.

2) Please provide an Author Summary. This should appear in your manuscript between the Abstract (if applicable) and the Introduction, and should be 150-200 words long. The aim should be to make your findings accessible to a wide audience that includes both scientists and non-scientists. Sample summaries can be found on our website under Submission Guidelines:

https://journals.plos.org/ploscompbiol/s/submission-guidelines#loc-parts-of-a-submission

3) Please upload all main figures as separate Figure files in .tif or .eps format. For more information about how to convert and format your figure files please see our guidelines:

https://journals.plos.org/ploscompbiol/s/figures

4) Please ensure that all Figure files have corresponding citations and legends within the manuscript. Currently, Figure 2 in your submission file inventory does not have an in-text citation. If the figure is no longer to be included as part of the submission, please remove it from the file inventory.

5) We have noticed that you have uploaded Supporting Information files, but you have not included a list of legends. Please add a full list of legends for your Supporting Information files after the references list.

6) Some material included in your submission may be copyrighted. According to PLOS’s copyright policy, authors who use figures or other material (e.g., graphics, clipart, maps) from another author or copyright holder must demonstrate or obtain permission to publish this material under the Creative Commons Attribution 4.0 International (CC BY 4.0) License used by PLOS journals. Please closely review the details of PLOS’s copyright requirements here: PLOS Licenses and Copyright. If you need to request permissions from a copyright holder, you may use PLOS's Copyright Content Permission form.

Please respond directly to this email and provide any known details concerning your material's license terms and permissions required for reuse, even if you have not yet obtained copyright permissions or are unsure of your material's copyright compatibility. Once you have responded and addressed all other outstanding technical requirements, you may resubmit your manuscript within Editorial Manager.

Potential Copyright Issues:

- The following Figure contains a logo or branding: Figure 1. We are not permitted to publish this under our CC-BY 4.0 license, even with permission. We ask that you please remove or replace it.

7) Please amend your detailed Financial Disclosure statement. This is published with the article. It must therefore be completed in full sentences and contain the exact wording you wish to be published.

- State the initials, alongside each funding source, of each author to receive each grant. For example: "This work was supported by the National Institutes of Health (####### to AM; ###### to CJ) and the National Science Foundation (###### to AM)."

- State what role the funders took in the study. If the funders had no role in your study, please state: "The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.".

If you did not receive any funding for this study, please simply state: “The authors received no specific funding for this work.”

Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Authors:

Please note here if the review is uploaded as an attachment.

Reviewer #1: Please see the attachment in the online system.

Reviewer #2: The manuscript "NeKo: a tool for automatic network construction from prior knowledge" provides valuable insights into biological networks in the field of transcriptomics. The authors have done excellent work in developing a Python package for the construction of biological networks by integrating molecular interactions from various databases. The study aims to develop a Python package for the construction of molecular networks and also to model drug synergies. The tool is available on GitHub. However, the following issues need to be addressed before publication.

1) Which databases are integrated for genes and drugs should be addressed more specifically.

2) Comparison with similar packages and its advances should be highlighted.

3) Limitations of NeKo are to be listed.

Reviewer #3: Review of “NeKo: a tool for automatic network construction from prior knowledge”

The manuscript introduces NeKo, a Python-based tool designed to automate the construction of biological networks by integrating molecular interactions from multiple prior knowledge databases. The authors aim to reduce the time-consuming and labor-intensive manual curation traditionally required in knowledge-driven network construction while providing flexibility. The manuscript highlights two use cases: constructing a network for medulloblastoma subgroup-specific signatures and modeling drug synergies using logical models.

NeKo represents a significant step forward in streamlining network construction and demonstrates the potential for broad application in systems biology. However, several major and minor issues need to be addressed to enhance the manuscript's clarity, rigor, and utility.

Major Comments

1. The authors provide a solid section on strategy selection, including descriptions of available algorithms and their applications. However, as strategy selection is a key challenge for usability, this section could be broadened to include more detailed guidance, examples, or decision trees to assist users in selecting the optimal strategy for their specific research questions. Clearer recommendations would improve accessibility for non-expert users and reduce the learning curve.

2. NeKo’s model for predicting drug synergies performed relatively poorly compared to the manually curated model. The manuscript should discuss whether the chosen strategy was optimal for this task and provide insights into how strategy selection influences performance. Was the poor result due to inherent limitations in the method or suboptimal strategy alignment?

3. While the manuscript discusses NeKo’s integration with databases like OmniPath and SIGNOR, a systematic evaluation of how database choice impacts network quality is important. This is particularly substantial given NeKo's reliance on external resources, and the lack of such benchmarking may limit confidence in its results.

4. While NeKo is tested on small-to-medium networks, its performance on larger datasets (>1000 nodes) is not discussed. Computational benchmarks assessing runtime and memory usage across varying network sizes would be helpful, especially for researchers handling large omics datasets.

5. Comparing NeKo’s functionality and outputs with widely used alternatives, such as STRING or Ingenuity Pathway Analysis (IPA), would provide context for its capabilities. Highlighting NeKo's strengths (e.g., flexibility, export formats) and limitations relative to these tools would better position it within the field of network construction.

Minor Comments

- The manually curated network's AUC value is not reported in Figure 5, making it difficult to assess the performance gap between NeKo-derived and curated networks. This metric should be included for a complete comparison.

- On page 4, the text mentions "activation" edges, while Figure 2 refers to "stimulation."

- On page 9, a citation appears as "(REF)." This needs correction.

- On page 10, "Addittional" contains a typographical error.

**********

Have the authors made all data and (if applicable) computational code underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data and code underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data and code should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data or code —e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: Yes

Reviewer #2: Yes

Reviewer #3: Yes

**********

PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #2: Yes: Dr. Dicky John Davis G

Reviewer #3: Yes: Ábel Fóthi

[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.]

Figure resubmission:

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email PLOS at figures@plos.org. Please note that Supporting Information files do not need this step. If there are other versions of figure files still present in your submission file inventory at resubmission, please replace them with the PACE-processed versions.

Reproducibility:

To enhance the reproducibility of your results, we recommend that authors of applicable studies deposit laboratory protocols in protocols.io, where a protocol can be assigned its own identifier (DOI) such that it can be cited independently in the future. Additionally, PLOS ONE offers an option to publish peer-reviewed clinical study protocols. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols

Attachment

Submitted filename: Review.docx

pcbi.1013300.s002.docx (19.1 KB, DOCX)

PLoS Comput Biol. 2025 Sep 16;21(9):e1013300. doi: 10.1371/journal.pcbi.1013300.r002

Author response to Decision Letter 1

17 Mar 2025

Attachment

Submitted filename: Response_to_reviewers.pdf

pcbi.1013300.s003.pdf (304.7 KB, PDF)

PLoS Comput Biol. doi: 10.1371/journal.pcbi.1013300.r003

Decision Letter 1

Mark Alber

1 May 2025

PCOMPBIOL-D-24-01942R1

NeKo: a tool for automatic network construction from prior knowledge

PLOS Computational Biology

Dear Dr. Ruscone,

Please submit your revised manuscript within 60 days (by Jul 01 2025 11:59PM). If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at ploscompbiol@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pcompbiol/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

Please include the following items when submitting your revised manuscript:

* A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'.

* An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'.

We look forward to receiving your revised manuscript.

Kind regards,

Alberto J M Martin, Ph.D.

Academic Editor

PLOS Computational Biology

Mark Alber

Section Editor

PLOS Computational Biology

Journal Requirements:

Please amend your detailed Financial Disclosure statement. This is published with the article. It must therefore be completed in full sentences and contain the exact wording you wish to be published. Please ensure that the funders and grant numbers match between the Financial Disclosure field and the Funding Information tab in your submission form. Note that the funders must be provided in the same order in both places as well.

1) State the initials, alongside each funding source, of each author to receive each grant. For example: "This work was supported by the National Institutes of Health (####### to AM; ###### to CJ) and the National Science Foundation (###### to AM)."

2) State what role the funders took in the study. If the funders had no role in your study, please state: "The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.".

If you did not receive any funding for this study, please simply state: “The authors received no specific funding for this work.”

PLoS Comput Biol. 2025 Sep 16;21(9):e1013300. doi: 10.1371/journal.pcbi.1013300.r004

Author response to Decision Letter 2

26 Jun 2025

Attachment

Submitted filename: Answer_to_review_final.pdf

pcbi.1013300.s004.pdf (128.2 KB, PDF)

PLoS Comput Biol. doi: 10.1371/journal.pcbi.1013300.r005

Decision Letter 2

Mark Alber

7 Jul 2025

Dear Dr. Ruscone,

We are pleased to inform you that your manuscript 'NeKo: a tool for automatic network construction from prior knowledge' has been provisionally accepted for publication in PLOS Computational Biology.

Before your manuscript can be formally accepted you will need to complete some formatting changes, which you will receive in a follow up email. A member of our team will be in touch with a set of requests.

Please note that your manuscript will not be scheduled for publication until you have made the required changes, so a swift response is appreciated.

IMPORTANT: The editorial review process is now complete. PLOS will only permit corrections to spelling, formatting or significant scientific errors from this point onwards. Requests for major changes, or any which affect the scientific understanding of your work, will cause delays to the publication date of your manuscript.

Should you, your institution's press office or the journal office choose to press release your paper, you will automatically be opted out of early publication. We ask that you notify us now if you or your institution is planning to press release the article. All press must be co-ordinated with PLOS.

Thank you again for supporting Open Access publishing; we are looking forward to publishing your work in PLOS Computational Biology.

Best regards,

Mark Alber, Ph.D.

Section Editor

PLOS Computational Biology

***********************************************************

Reviewer's Responses to Questions

Comments to the Authors:

Please note here if the review is uploaded as an attachment.

Reviewer #3: Thank you for your answer.

**********

Have the authors made all data and (if applicable) computational code underlying the findings in their manuscript fully available?

Reviewer #3: None

**********

PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #3: Yes: Ábel Fóthi

PLoS Comput Biol. doi: 10.1371/journal.pcbi.1013300.r006

Acceptance letter

Mark Alber

PCOMPBIOL-D-24-01942R2

NeKo: a tool for automatic network construction from prior knowledge

Dear Dr Ruscone,

I am pleased to inform you that your manuscript has been formally accepted for publication in PLOS Computational Biology. Your manuscript is now with our production department and you will be notified of the publication date in due course.

The corresponding author will soon be receiving a typeset proof for review, to ensure errors have not been introduced during production. Please review the PDF proof of your manuscript carefully, as this is the last chance to correct any errors. Please note that major changes, or those which affect the scientific understanding of the work, will likely cause delays to the publication date of your manuscript.

Soon after your final files are uploaded, unless you have opted out, the early version of your manuscript will be published online. The date of the early version will be your article's publication date. The final article will be published to the same URL, and all versions of the paper will be accessible to readers.

You will receive an invoice from PLOS for your publication fee after your manuscript has reached the completed accept phase. If you receive an email requesting payment before acceptance or for any other service, this may be a phishing scheme. Learn how to identify phishing emails and protect your accounts at https://explore.plos.org/phishing.

Thank you again for supporting PLOS Computational Biology and open-access publishing. We are looking forward to publishing your work!

With kind regards,

Judit Kozma

Associated Data

Supplementary Materials

S1 Text. This document contains additional details and analysis that complement the work presented in the main text.

The code and figures used for both the main article and the supplementary materials can be found on the following GitHub page: https://github.com/sysbio-curie/NeKo_Supp_Mat.

(PDF)

pcbi.1013300.s001.pdf (8.8MB, pdf)

Attachment

Submitted filename: Review.docx

pcbi.1013300.s002.docx (19.1KB, docx)

Attachment

Submitted filename: Response_to_reviewers.pdf

pcbi.1013300.s003.pdf (304.7KB, pdf)

Attachment

Submitted filename: Answer_to_review_final.pdf

pcbi.1013300.s004.pdf (128.2KB, pdf)

Data Availability Statement

References

1. Licata L, Lo Surdo P, Iannuccelli M, Palma A, Micarelli E, Perfetto L, et al. SIGNOR 2.0, the SIGnaling Network Open Resource 2.0: 2019 update. Nucleic Acids Res. 2020;48(D1):D504–10. doi: 10.1093/nar/gkz949

2. Jassal B, Matthews L, Viteri G, Gong C, Lorente P, Fabregat A, et al. The reactome pathway knowledgebase. Nucleic Acids Res. 2019. doi: 10.1093/nar/gkz1031

3. Pratapa A, Jalihal AP, Law JN, Bharadwaj A, Murali TM. Benchmarking algorithms for gene regulatory network inference from single-cell transcriptomic data. Nat Methods. 2020;17(2):147–54. doi: 10.1038/s41592-019-0690-6

4. Marbach D, Costello JC, Küffner R, Vega NM, Prill RJ, Camacho DM, et al. Wisdom of crowds for robust gene network inference. Nat Methods. 2012;9(8):796–804. doi: 10.1038/nmeth.2016

5. Barabási A-L, Gulbahce N, Loscalzo J. Network medicine: a network-based approach to human disease. Nat Rev Genet. 2011;12(1):56–68. doi: 10.1038/nrg2918

6. Le Novère N. Quantitative and logic modelling of molecular and gene networks. Nat Rev Genet. 2015;16(3):146–58. doi: 10.1038/nrg3885

7. Garrido-Rodriguez M, Zirngibl K, Ivanova O, Lobentanzer S, Saez-Rodriguez J. Integrating knowledge and omics to decipher mechanisms via large-scale models of signaling networks. Mol Syst Biol. 2022;18(7):e11036. doi: 10.15252/msb.202211036

8. Venafra V, Sacco F, Perfetto L. SignalingProfiler 2.0 a network-based approach to bridge multi-omics data to phenotypic hallmarks. NPJ Syst Biol Appl. 2024;10(1):95. doi: 10.1038/s41540-024-00417-6

9. Liu A, Trairatphisan P, Gjerga E, Didangelos A, Barratt J, Saez-Rodriguez J. From expression footprints to causal pathways: contextualizing large signaling networks with CARNIVAL. NPJ Syst Biol Appl. 2019;5:40. doi: 10.1038/s41540-019-0118-z

10. Musilova J, Vafek Z, Puniya BL, Zimmer R, Helikar T, Sedlar K. Augusta: From RNA-Seq to gene regulatory networks and Boolean models. Comput Struct Biotechnol J. 2024;23:783–90. doi: 10.1016/j.csbj.2024.01.013

11. Türei D, Korcsmáros T, Saez-Rodriguez J. OmniPath: guidelines and gateway for literature-curated signaling pathway resources. Nat Methods. 2016;13(12):966–7. doi: 10.1038/nmeth.4077

12. Rodchenkov I, Babur O, Luna A, Aksoy BA, Wong JV, Fong D, et al. Pathway Commons 2019 Update: integration, analysis and exploration of pathway data. Nucleic Acids Res. 2020;48(D1):D489–97. doi: 10.1093/nar/gkz946

13. Wong JV, Franz M, Siper MC, Fong D, Durupinar F, Dallago C, et al. Author-sourced capture of pathway knowledge in computable form using Biofactoid. Elife. 2021;10:e68292. doi: 10.7554/eLife.68292

14. Shannon P, Markiel A, Ozier O, Baliga NS, Wang JT, Ramage D, et al. Cytoscape: a software environment for integrated models of biomolecular interaction networks. Genome Res. 2003;13(11):2498–504. doi: 10.1101/gr.1239303

15. Bundy A, Wallen L. Breadth-first search. Catalogue of artificial intelligence tools. Berlin, Heidelberg: Springer; 1984. p. 13. doi: 10.1007/978-3-642-96868-6_25

16. Kozen DC. Depth-first and breadth-first search. The design and analysis of algorithms. New York: Springer; 1992. p. 19–24. doi: 10.1007/978-1-4612-4400-4_4

17. Hornbeck PV, Kornhauser JM, Tkachev S, Zhang B, Skrzypek E, Murray B, et al. PhosphoSitePlus: a comprehensive resource for investigating the structure and function of experimentally determined post-translational modifications in man and mouse. Nucleic Acids Res. 2012;40(Database issue):D261–70. doi: 10.1093/nar/gkr1122

18. Uhlén M, Fagerberg L, Hallström BM, Lindskog C, Oksvold P, Mardinoglu A, et al. Proteomics. Tissue-based map of the human proteome. Science. 2015;347(6220):1260419. doi: 10.1126/science.1260419

19. Chevalier S, Froidevaux C, Paulevé L, Zinovyev A. Synthesis of Boolean networks from biological dynamical constraints using answer-set programming. In: 2019 IEEE 31st International Conference on Tools with Artificial Intelligence (ICTAI); 2019. p. 34–41. https://ieeexplore.ieee.org/document/8995442.

20. Northcott PA, Korshunov A, Witt H, Hielscher T, Eberhart CG, Mack S, et al. Medulloblastoma comprises four distinct molecular variants. J Clin Oncol. 2011;29(11):1408–14. doi: 10.1200/JCO.2009.27.4324

21. Forget A, Martignetti L, Puget S, Calzone L, Brabetz S, Picard D, et al. Aberrant ERBB4-SRC signaling as a hallmark of group 4 medulloblastoma revealed by integrative phosphoproteomic profiling. Cancer Cell. 2018;34(3):379–395.e7. doi: 10.1016/j.ccell.2018.08.002

22. Northcott PA, Buchhalter I, Morrissy AS, Hovestadt V, Weischenfeldt J, Ehrenberger T, et al. The whole-genome landscape of medulloblastoma subtypes. Nature. 2017;547(7663):311–7. doi: 10.1038/nature22973

23. Bindea G, Mlecnik B, Hackl H, Charoentong P, Tosolini M, Kirilovsky A, et al. ClueGO: a Cytoscape plug-in to decipher functionally grouped gene ontology and pathway annotation networks. Bioinformatics. 2009;25(8):1091–3. doi: 10.1093/bioinformatics/btp101

24. Clifford SC, Lusher ME, Lindsey JC, Langdon JA, Gilbertson RJ, Straughton D, et al. Wnt/Wingless pathway activation and chromosome 6 loss characterise a distinct molecular sub-group of medulloblastomas associated with a favourable prognosis. Cell Cycle. 2006;5(22):2666–70. doi: 10.4161/cc.5.22.3446

25. Kawauchi D, Ogg RJ, Liu L, Shih DJH, Finkelstein D, Murphy BL, et al. Novel MYC-driven medulloblastoma models from multiple embryonic cerebellar cells. Oncogene. 2017;36(37):5231–42. doi: 10.1038/onc.2017.110

26. Lin CY, Erkek S, Tong Y, Yin L, Federation AJ, Zapatka M, et al. Active medulloblastoma enhancers reveal subgroup-specific cellular origins. Nature. 2016;530(7588):57–62. doi: 10.1038/nature16546

27. Flobak Å, Zobolas J, Vazquez M, Steigedal TS, Thommesen L, Grislingås A, et al. Fine tuning a logical model of cancer cells to predict drug synergies: combining manual curation and automated parameterization. Front Syst Biol. 2023;3:1252961. doi: 10.3389/fsysb.2023.1252961

28. Flobak Å, Baudot A, Remy E, Thommesen L, Thieffry D, Kuiper M, et al. Discovery of drug synergies in gastric cancer cells predicted by logical modeling. PLoS Comput Biol. 2015;11(8):e1004426. doi: 10.1371/journal.pcbi.1004426

29. Chen H-H, Hsueh C-W, Lee C-H, Hao T-Y, Tu T-Y, Chang L-Y, et al. SWEET: a single-sample network inference method for deciphering individual features in disease. Brief Bioinform. 2023;24(2):bbad032. doi: 10.1093/bib/bbad032

30. Sphinx. Sphinx documentation. https://www.sphinx-doc.org/en/master/
