Abstract
Networks offer an intuitive visual representation of complex systems. Important network characteristics can often be recognized by eye and, in turn, patterns that stand out visually often have a meaningful interpretation. In conventional network layout algorithms, however, the precise determinants of a node’s position within a layout are difficult to decipher and to control. Here we propose an approach for directly encoding arbitrary structural or functional network characteristics into node positions. We introduce a series of two- and three-dimensional layouts, benchmark their efficiency for model networks, and demonstrate their power for elucidating structure-to-function relationships in large-scale biological networks.
Subject terms: Scientific data, Network topology, Regulatory networks
Networks offer a powerful visual representation of complex systems. This study introduces network visualizations that are easy to interpret and can help explore large datasets, such as the map of all molecular interactions in the cell.
Main
Networks are used to investigate a wide range of technological, social and biological systems1. Key factors for their success are not only the availability of powerful mathematical and computational analysis tools, but also their intuitive visual interpretation. For example, the central position of genes within molecular networks indicates essential cellular processes2, densely connected clusters represent functional complexes3, and global patterns, such as the ring-like architecture of co-regulation networks, have been found to reflect principles of cellular organization4. However, the full potential of network visualizations for exploring complex systems is limited by several conceptual and practical challenges. (1) Networks do not have a natural two- or three-dimensional (2D or 3D) embedding. Any network layout thus involves a choice of which aspects of the high-dimensional pairwise relationships are visually represented, and which are not. (2) In widely used layout algorithms, such as force-directed methods, this choice is made implicitly and thus non-transparently, often based on subjective, esthetic criteria. This lack of a clear relationship between structural network characteristics and node positioning makes the resulting layouts difficult to interpret. (3) Likewise, no available layout algorithm allows a given network characteristic to be represented explicitly. (4) Finally, the sheer size of many real-world networks is a key limiting factor for producing comprehensible layouts, leading to the proverbial hairball visualizations. In this Brief Communication we introduce a framework for generating network layouts that addresses these challenges by using dimensionality reduction to directly encode network properties into node positions. Not only structural network properties can be visually encoded in this fashion, but also external information reflecting the functional characteristics of nodes or links.
We propose the following procedure (Fig. 1a). For a given network, we first compile a set of F features for each of N nodes, incorporating any structural or functional characteristic we wish to be visually reflected in the final layout. The resulting (N × F) feature matrix is then converted into an (N × N) similarity matrix, which serves as input to dimensionality reduction methods to compute 2D or 3D embeddings. These embeddings can be used directly as node coordinates, resulting in network layouts we term portraits. Alternatively, embeddings on 2D surfaces can be extended to 3D topographic or geodesic maps by using the third dimension for an additional variable of choice. The topographic map extends a flat 2D embedding by an additional z coordinate, and the geodesic map introduces an additional radial coordinate in spherical embeddings. In total, our framework thus offers four different maps in two and three dimensions (Fig. 1b). The key advantage of our framework, offering both versatility and interpretability, is its ability to incorporate and explicitly display various desired node characteristics or node pair relationships. We implemented five examples that demonstrate the diversity of potential layouts. (1) The global layout uses network propagation for an efficient, high-resolution representation of pairwise network distances. (2) The local layout emphasizes similar connection patterns between node pairs. (3) The importance layout combines several metrics for the overall importance of a node, such as degree, betweenness, closeness and eigenvector centrality. (4) Functional layouts depict node similarities according to external node features. (5) Combined layouts allow for tuning between layouts that are dominated by either structural or functional features.
To illustrate and benchmark our framework, we first applied it to easily interpretable model networks: (1) a Cayley tree, (2) a cubic grid and (3) a torus lattice (Fig. 1c). The Cayley tree is organized in hierarchical levels. All nodes except for those in the outermost level have the same number of neighbors (degree k = 3), and all nodes within the same level have identical centrality values. The cubic lattice contains four structurally different node groups: nodes at the corner (k = 3), along the 12 edges (k = 4), on the six faces (k = 5) or in the interior (k = 6). In the torus lattice, all nodes are equivalent in terms of all structural characteristics, including their degree (k = 4) and centrality metrics. Note that the definition of none of the model networks involves any spatial embedding, so, in principle, no layout is in any formal sense more correct than any other. However, for all three network models, canonical layouts in two and three dimensions, respectively, exist, offering an intuitive visualization of their global architecture. Our global layout provides a good approximation for these idealizations (Fig. 1d). The local and importance layouts produce entirely different results, each highlighting distinct structural aspects of the model networks. In the local layouts, the nodes are sorted into groups with shared neighbors (Fig. 1e). This layout reveals bi- and multipartite network structures, resulting in two clusters in the lattice-based networks (cube and torus), and in alternating patterns reflecting the ternary structure of the Cayley tree. The importance layout identifies groups of nodes with the same network centralities (Fig. 1f). In the Cayley tree, all nodes of the same hierarchy are clustered, and in the cubic grid, nodes of the same type (corner, edge, face nodes) and layer are grouped. In the torus, all nodes have equivalent structural roles, thus resulting in a uniform point cloud.
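The four node groups of the cubic lattice can be checked with a few lines of plain Python. This is a minimal sketch, assuming a lattice without periodic boundaries and a side length of 4 nodes; the function name `cubic_lattice_degrees` and the chosen size are illustrative and not taken from the paper's code.

```python
from collections import Counter
from itertools import product

def cubic_lattice_degrees(side):
    """Return {node: degree} for a side x side x side cubic lattice (open boundaries)."""
    steps = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]
    degrees = {}
    for x, y, z in product(range(side), repeat=3):
        # Count the lattice neighbours that fall inside the cube.
        degrees[(x, y, z)] = sum(
            0 <= x + dx < side and 0 <= y + dy < side and 0 <= z + dz < side
            for dx, dy, dz in steps
        )
    return degrees

# Number of nodes per degree: corners (k = 3), edges (k = 4), faces (k = 5), interior (k = 6).
degree_counts = dict(Counter(cubic_lattice_degrees(4).values()))
print(degree_counts)
```

For a side length of 4 this yields 8 corner nodes, 24 edge nodes, 24 face nodes and 8 interior nodes, which are the groups that the importance layout separates.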
The global layout incorporates random walk-based features similar to the graph embedding method node2vec5. Also, for small to moderate network sizes, standard force-directed algorithms6 produce layouts that recapitulate network distances between node pairs. We can therefore use these algorithms as performance benchmarks. Figure 1g shows good overall correlations between network-based node distances in cubic lattice networks and the respective layout distances (Extended Data Fig. 1). A comparison of the correlations obtained for the same computational running time shows a substantial drop for force-directed algorithms as the network size increases (Fig. 1h). Conversely, force-directed methods are orders of magnitude slower for fixed layout quality (Fig. 1i).
We next apply our framework to a large real-world network. The human interactome consists of N = 16,376 nodes and M = 309,355 links, representing proteins and their physical interactions that underlie biological processes7,8. Although several structure-to-function relationships in the interactome are well documented9, they are difficult to decipher visually from conventional layouts. Our framework offers a solution to this challenge. Figure 2a shows a 2D network portrait of the interactome in the importance layout. Visual inspection of 2,918 known essential genes reveals a relationship between their structural importance within the interactome and their biological importance. Cancer driver genes, rare disease genes and genes involved in early development show the same trend (Extended Data Fig. 2a–c). Although this finding represents one of the cornerstones of network biology2, it could not be derived from standard layouts (Extended Data Fig. 3a). Similarly, the agglomeration of genes associated with the same disease in local interactome neighborhoods is well documented10, yet remains hidden in standard layouts (Extended Data Fig. 3b). We can use functional network portraits to visualize disease-associated genes and their interconnectivity (Fig. 2b). Although the node placement is purely driven by a functional characteristic, the underlying network structure can be inspected through the links. This supports the identification of structure-to-function relationships in an iterative cycle of data visualization, hypothesis generation and validation. In addition to disease gene interconnectivity, Fig. 2b also shows a prominent cluster of highly connected genes associated with multiple diseases (Extended Data Fig. 4). Finally, we can also generate layouts in which the node positions are determined by a combination of structural and functional features (see Extended Data Figs. 5 and 6 for applications to a model network and the interactome).
Network maps with an additional quantity of interest depicted in the third dimension can be used to build application-specific visualizations. Figure 2c shows a 3D topographic map of the interactome, with a global layout on the x–y plane and the number of disease associations on the z axis, highlighting, for example, the prominent role of the tumor suppressor TP53 in many cancers11. The top view reveals several localized node clusters, which correspond to provincial hubs and their respective neighbors12. The side view shows the prominent role of the provincial hubs for diseases and their relationships, such as amyloid precursor protein (APP) and ELAV-like RNA binding protein (ELAVL1), which are located at the center of the respective interactome neighborhoods that are perturbed in the associated diseases13.
Figure 2d demonstrates how our framework can be utilized for generating network maps customized to the interactive annotation of rare genomic variants in a virtual reality environment14. The center sphere of the geodesic map contains 13 candidate genes that are suspected to cause a rare genetic disease in a particular patient. The enclosing spheres represent genes implicated in similar phenotypes or involved in related biological pathways, respectively, in a functional layout reflecting biological similarity. This allows for an efficient manual inspection of the biological context of the candidate genes.
The flexibility of our framework enables the development of customized network visualizations for a broad range of applications. In biology, for example, the introduced layouts may enhance existing tools for the integration and interpretation of diverse omics datasets15–19. Note that visual inspection alone will rarely suffice to conclusively show the presence of an observed structure-to-function relationship in a given network. Any hypothesis derived from a particular visualization thus requires an additional, more rigorous evaluation outside of our framework, for example, by statistical or experimental means.
Methods
A framework for creating interpretable network layouts and maps
Our pipeline consists of four basic steps. (1) The network of N nodes and M links is supplied in the form of a link list. (2) For each node in the network, we construct a vector of F features, resulting in an (N × F) feature matrix. The particular features that are used determine the layout. We introduce five such layouts, termed ‘global’, ‘local’, ‘importance’, ‘functional’ and ‘combined’ layouts, as detailed in the next sections. (3) The feature matrix is converted into an (N × N) similarity matrix, which serves as input for dimensionality reduction algorithms. The utility of dimensionality reduction techniques for network embedding is increasingly recognized, in particular for classification tasks and more recently also for visualizations20. We implemented the popular tools t-distributed stochastic neighbor embedding (t-SNE)21 and uniform manifold approximation and projection (UMAP)22, which offer embeddings in 2D and 3D Euclidean space, as well as embeddings on 2D surfaces, such as a sphere. (4) The node coordinates can either be used directly to lay out the network or, in the case of 2D embeddings, be enhanced by an additional third dimension. We termed the direct layouts ‘portraits’. Flat embeddings in 2D Euclidean space can be expanded into 3D topographic maps by using an additional, freely selectable variable as the z coordinate. Similarly, we can enhance embeddings on the surface of a sphere by introducing an additional radial variable, resulting in geodesic maps.
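Step (4) can be sketched with NumPy alone, assuming the flat or spherical coordinates have already been produced by a dimensionality reduction method. The function names `topographic_map` and `geodesic_map` and the example values are hypothetical and are not taken from the CartoGRAPHs package.

```python
import numpy as np

def topographic_map(xy, z_values):
    """Extend a flat 2D embedding (N x 2) by a freely chosen per-node z coordinate."""
    xy = np.asarray(xy, dtype=float)
    z = np.asarray(z_values, dtype=float)
    return np.column_stack([xy, z])

def geodesic_map(sphere_points, radii):
    """Place each node of a spherical embedding (N x 3) on a shell of its own radius."""
    u = np.asarray(sphere_points, dtype=float)
    u = u / np.linalg.norm(u, axis=1, keepdims=True)  # project onto the unit sphere
    r = np.asarray(radii, dtype=float)
    return u * r[:, None]

# Illustrative input: three nodes with made-up coordinates and variables.
topo = topographic_map([[0, 0], [1, 0], [0, 1]], z_values=[5, 0, 2])
geo = geodesic_map([[1, 0, 0], [0, 2, 0], [0, 0, -1]], radii=[1, 2, 3])
```

In the topographic map the z value could, for example, be the number of disease associations of a gene; in the geodesic map the radius could encode membership in a shell of candidate or context genes.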
Global layout
In the global layout, each node is equipped with N features representing its network-based distances to all nodes in the network, computed using the random walk with restart propagation method23. These random walk-based distances indicate how frequently a walker starting from node i and traveling along randomly chosen links will visit a given node j. Formally, we first determine the vector pi containing the visiting frequencies pi,j for all nodes j ∈ [1, N], using node i as the seed of a random walk with restart probability r. These frequencies can be efficiently computed by matrix inversion according to the steady-state expression for a random walk with restart24. For all node pairs {n, m}, we then compute the cosine similarity S(n, m) between their respective visiting frequency vectors pn and pm and collect the results into an (N × N) similarity matrix Sglob that serves as input to the dimensionality reduction step of the pipeline.
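The computation of the global similarity matrix can be sketched as follows. This is a minimal example under stated assumptions: it uses the common steady-state form p_i = r (I − (1 − r) W)^(−1) e_i with a column-normalized adjacency matrix W, an illustrative restart probability r = 0.8, and hypothetical function names; the exact expression and parameter values used in the paper's code may differ.

```python
import numpy as np

def rwr_visiting_frequencies(A, r=0.8):
    """Steady-state visiting frequencies of a random walk with restart.

    Column i of the returned matrix is the vector p_i for seed node i.
    Assumes an undirected network without isolated nodes; r = 0.8 is an
    illustrative restart probability, not a value taken from the paper.
    """
    A = np.asarray(A, dtype=float)
    W = A / A.sum(axis=0)               # column-stochastic transition matrix
    n = A.shape[0]
    return r * np.linalg.inv(np.eye(n) - (1.0 - r) * W)

def cosine_similarity_rows(X):
    """Cosine similarity between all pairs of rows of X."""
    X = np.asarray(X, dtype=float)
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    norms[norms == 0] = 1.0             # all-zero rows keep similarity 0
    U = X / norms
    return U @ U.T

# Path graph 0 - 1 - 2 - 3 as a toy network.
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]])
P = rwr_visiting_frequencies(A)
S_glob = cosine_similarity_rows(P.T)    # row i of P.T is p_i
```

In this toy example, neighboring nodes 0 and 1 receive a higher similarity than the two end nodes 0 and 3, so the similarity matrix reflects network distance.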
Local layout
The local layout is based on the similarity of nodes in terms of shared neighbors. Two nodes that are connected to the exact same set of nodes are considered maximally similar, whereas nodes that do not have any common neighbors do not have any similarity. We can determine this similarity directly from the adjacency matrix A of the network, defined as Ai,j = 1 if nodes i and j are connected, and Ai,j = 0 otherwise. For all node pairs {n, m}, we compute the cosine similarity between their corresponding columns Ai,n and Ai,m, resulting in an (N × N) similarity matrix Sloc which serves as input to the dimensionality reduction step.
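For a binary, symmetric adjacency matrix, the dot product of two columns counts the shared neighbors of the two nodes and the norm of a column is the square root of the node's degree. The following sketch uses this to compute the local similarity matrix for a four-node ring; the function name `local_similarity` and the toy network are illustrative assumptions.

```python
import numpy as np

def local_similarity(A):
    """Cosine similarity between all pairs of columns of the adjacency matrix A."""
    A = np.asarray(A, dtype=float)
    shared = A.T @ A                     # shared[n, m] = number of common neighbours
    norms = np.sqrt(np.diag(shared))     # column norms, sqrt(degree) for binary A
    norms[norms == 0] = 1.0              # isolated nodes keep similarity 0
    return shared / np.outer(norms, norms)

# Ring of four nodes: 0 - 1 - 2 - 3 - 0 (a bipartite network).
A = np.array([[0, 1, 0, 1],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [1, 0, 1, 0]])
S_loc = local_similarity(A)
```

Nodes 0 and 2 have the same neighbors and are therefore maximally similar, whereas the directly connected nodes 0 and 1 share no neighbor and have similarity 0. This is the mechanism that separates the two partitions of bipartite lattices in the local layout.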
Importance layout
The importance layout reflects the similarity of nodes in terms of their network centralities1. Network centralities measure the importance of a particular node according to its position within the network. Numerous centrality measures have been proposed, and we incorporated four of the most widely used into a feature vector. For each node i we compute its (1) degree (the number of neighbors), (2) closeness (the inverse of its average network distance to all other nodes), (3) betweenness (how often it acts as a bridge along the shortest path between two other nodes) and (4) eigenvector centrality (measuring its dynamic influence), resulting in a 4D vector ci. For all node pairs {n, m}, we then compute the cosine similarity between their corresponding vectors cn and cm, resulting in an (N × N) similarity matrix Scent, which serves as input to the dimensionality reduction step.
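The centrality feature vectors can be assembled with NetworkX, as in the following sketch. The function name `centrality_features` and the toy network are hypothetical; whether the four centralities are rescaled before computing the cosine similarity is not stated in the text, so the raw values are used here as an assumption.

```python
import networkx as nx
import numpy as np

def centrality_features(G):
    """Return the node list and an (N x 4) matrix of centrality features."""
    nodes = list(G.nodes())
    deg = dict(G.degree())
    clo = nx.closeness_centrality(G)     # inverse of the average distance
    btw = nx.betweenness_centrality(G)
    eig = nx.eigenvector_centrality(G, max_iter=1000)
    C = np.array([[deg[v], clo[v], btw[v], eig[v]] for v in nodes], dtype=float)
    return nodes, C

# Toy network: hub 0 with four neighbours, two of which (1 and 2) are also linked.
G = nx.Graph([(0, 1), (0, 2), (0, 3), (0, 4), (1, 2)])
nodes, C = centrality_features(G)

# Cosine similarity between all pairs of centrality vectors.
U = C / np.linalg.norm(C, axis=1, keepdims=True)
S_cent = U @ U.T
```

Nodes 3 and 4 occupy equivalent positions in this network, so their centrality vectors coincide and their similarity is 1, regardless of how far apart they are in the network.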
Functional layouts
Functional layouts can be used to display node similarities in terms of external features, such as the disease annotations of genes in Fig. 2b. For a given feature matrix F with Fi,j = 1 if node i is annotated to feature j, and Fi,j = 0 otherwise, we compute the cosine similarity between all node pairs {n, m} using the respective rows Fn,j and Fm,j, resulting in an (N × N) similarity matrix Sfunc, which serves as input to the dimensionality reduction step.
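A small worked example of the functional similarity, assuming a made-up binary annotation matrix of four nodes and three features. Assigning similarity 0 to nodes without any annotation is an assumption of this sketch, since the cosine similarity is undefined for all-zero rows.

```python
import numpy as np

def functional_similarity(F):
    """Cosine similarity between all pairs of rows of a binary annotation matrix F."""
    F = np.asarray(F, dtype=float)
    norms = np.linalg.norm(F, axis=1)
    norms[norms == 0] = 1.0              # unannotated nodes: similarity 0 to all nodes
    U = F / norms[:, None]
    return U @ U.T

# Rows: nodes, columns: features (for example, disease annotations).
F = np.array([[1, 1, 0],    # node 0: features 0 and 1
              [1, 1, 0],    # node 1: same annotations as node 0
              [0, 0, 1],    # node 2: feature 2 only
              [0, 0, 0]])   # node 3: no annotation
S_func = functional_similarity(F)
```

Nodes 0 and 1 carry identical annotations and have similarity 1, while node 2 shares no feature with them and has similarity 0.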
Combined layouts
Combined layouts allow for interpolating between purely structural and functional layouts. We first construct a matrix with elements pi,j as in the global layout above, representing the structural aspect of the final layout. For each functional feature that we wish to include, for example annotations to different diseases, we then add an additional column containing the values Fi,j = 1 if node i is annotated to feature j, and Fi,j = 0 otherwise. These functional columns can now be scaled by a factor m ≥ 0, thereby modulating between purely structural layouts (m = 0) and layouts that are increasingly dominated by the functional annotations (m > 0). Finally, for all pairs of nodes, we compute the cosine similarity between their extended feature vectors, consisting of the visiting frequencies and the scaled functional columns, and collect the results into an (N × N) similarity matrix Scomb, which serves as input to the dimensionality reduction step of the pipeline.
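The effect of the scaling factor can be illustrated with a short sketch. The structural matrix below contains made-up visiting frequencies for three nodes, the single functional column is hypothetical, and the parameter `scale` plays the role of the factor m in the text.

```python
import numpy as np

def combined_similarity(P, F, scale):
    """Cosine similarity of structural rows P extended by scaled functional columns F."""
    X = np.hstack([np.asarray(P, dtype=float), scale * np.asarray(F, dtype=float)])
    U = X / np.linalg.norm(X, axis=1, keepdims=True)
    return U @ U.T

# Made-up visiting frequencies (row i = p_i) for a three-node path 0 - 1 - 2.
P = np.array([[0.8, 0.2, 0.0],
              [0.1, 0.8, 0.1],
              [0.0, 0.2, 0.8]])
# One functional feature shared by the two end nodes 0 and 2.
F = np.array([[1], [0], [1]])

S_structural = combined_similarity(P, F, scale=0.0)   # purely structural
S_functional = combined_similarity(P, F, scale=10.0)  # dominated by the annotation
```

With `scale=0.0` the distant nodes 0 and 2 are dissimilar, as in the global layout. With `scale=10.0` their shared annotation dominates and their similarity approaches 1.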
Implementation
We used the Python package NetworkX25 to generate the model networks and compute the network properties required in the different layouts, such as adjacency matrices and node centralities. The force-directed layouts were generated using the Fruchterman–Reingold algorithm6 as implemented in NetworkX and igraph26, as well as ForceAtlas227. Dimensionality reduction methods were implemented using the t-SNE21 and UMAP22 Python packages, and the node2vec algorithm was implemented using the StellarGraph library28. Note that the implemented dimensionality reduction methods are not strictly deterministic, so that repeated calls may lead to slightly different outputs. To maximize reproducibility, we therefore set a fixed random seed in the provided Python code.
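To evaluate how well a particular layout algorithm reproduces network-based distances between nodes, we computed for all node pairs {n, m} the length of the respective shortest path, dSP(n, m), and their Euclidean distance within the layout, dEuc(n, m). The agreement between the two was then quantified using the Pearson correlation coefficient:
$$\rho = \frac{\sum_{\{n,m\}} \left(d_{\mathrm{SP}}(n,m) - \mu_{\mathrm{SP}}\right)\left(d_{\mathrm{Euc}}(n,m) - \mu_{\mathrm{Euc}}\right)}{\sqrt{\sum_{\{n,m\}} \left(d_{\mathrm{SP}}(n,m) - \mu_{\mathrm{SP}}\right)^{2}}\,\sqrt{\sum_{\{n,m\}} \left(d_{\mathrm{Euc}}(n,m) - \mu_{\mathrm{Euc}}\right)^{2}}}$$
where µSP and µEuc denote the respective mean values of network-based and Euclidean distances across all node pairs. We used the implementation contained in the NumPy Python package29. Computational wall time was measured on computer hardware with a 2-GHz Quad-Core Intel Core i5 processor and 16 GB of RAM.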
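The evaluation described above can be sketched as follows, assuming a connected, unweighted network given as an adjacency list. The function names and the toy layouts are hypothetical; shortest paths are computed by breadth-first search to keep the example self-contained, whereas the paper's code relies on NetworkX.

```python
from collections import deque
import numpy as np

def shortest_path_lengths(adj):
    """All-pairs shortest path lengths by breadth-first search.

    adj[u] lists the neighbours of node u; the network must be connected.
    """
    n = len(adj)
    D = np.full((n, n), np.inf)
    for source in range(n):
        D[source, source] = 0.0
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if np.isinf(D[source, v]):
                    D[source, v] = D[source, u] + 1.0
                    queue.append(v)
    return D

def layout_quality(adj, coords):
    """Pearson correlation between shortest-path and Euclidean layout distances."""
    coords = np.asarray(coords, dtype=float)
    d_sp = shortest_path_lengths(adj)
    d_euc = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    pairs = np.triu_indices(len(adj), k=1)   # each unordered node pair once
    return np.corrcoef(d_sp[pairs], d_euc[pairs])[0, 1]

# Path graph 0 - 1 - 2 - 3 with a faithful and a scrambled one-dimensional layout.
adj = [[1], [0, 2], [1, 3], [2]]
faithful = layout_quality(adj, [[0.0], [1.0], [2.0], [3.0]])
scrambled = layout_quality(adj, [[0.0], [3.0], [1.0], [2.0]])
```

The faithful layout reproduces all network distances exactly and reaches a correlation of 1, whereas the scrambled layout yields a much lower value.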
Acknowledgements
This work was supported by the Vienna Science and Technology Fund (WWTF) through project VRG15-005 granted to J.M. and by the Austrian Science Fund (FWF) through project W1261. The funders had no role in study design, data collection and analysis, decision to publish or preparation of the manuscript. We thank S. Pirch and M. Chiettini for support with the virtual reality implementation, and I. Buljan and J. Pazmandi for information on neurofibromatosis. We affirm that the individual depicted in Fig. 2 provided informed consent for publication of their image.
Author contributions
C.V.R.H. and J.M. developed the concept. C.V.R.H. implemented the framework and conducted the analysis. F.M. provided data and supported the implementation of the framework. C.S. supported the web app development. J.M. supervised the study. C.V.R.H. and J.M. wrote the manuscript. All authors contributed to the manuscript.
Peer review
Peer review information
Nature Computational Science thanks the anonymous reviewers for their contribution to the peer review of this work. Handling editor: Jie Pan, in collaboration with the Nature Computational Science team.
Data availability
All input files, together with the complete source code, have been deposited in a Zenodo repository30. The human interactome network was extracted from the HIPPIE database31, filtering for protein–protein interactions with at least one supporting PubMed article. Disease gene associations were taken from the DisGeNET database32 and mapped to disease categories according to Disease Ontology (DO)33. Functional gene annotations were derived from the ‘biological processes’ branch of the Gene Ontology (GO) database34. Essential genes were obtained from the Online Gene Essentiality (OGEE) database35, rare disease genes from Orphanet36 and genes involved in early development from the EmExplorer database37. Source data are provided with this paper.
Code availability
Python source code and input data for reproducing the results in this paper are publicly available from the Zenodo repository30. We also provide the code as a Python package on GitHub at https://github.com/menchelab/CartoGRAPHs, together with Jupyter notebooks including a quickstarter, as well as separate notebooks for reproducing each figure. The CartoGRAPHs framework can also be used as an interactive web application at www.cartographs.xyz, and its source code is provided at https://github.com/menchelab/cartoGRAPHs_app (Extended Data Fig. 7). As output, interactive 2D and 3D network images can be generated and downloaded in HTML format. Layouts can also be exported as XGMML files that can be loaded for further processing in the Cytoscape software38. Finally, we offer export in Wavefront OBJ format for use in 3D printing or for exploring network maps in VRNetzer, a virtual reality platform14 for network visualization and analysis.
Competing interests
The authors declare no competing interests.
Footnotes
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Extended data is available for this paper at 10.1038/s43588-022-00199-z.
Supplementary information
The online version contains supplementary material available at 10.1038/s43588-022-00199-z.
References
- 1. Newman, M. Networks (Oxford Univ. Press, 2018).
- 2. Jeong H, Mason SP, Barabási AL, Oltvai ZN. Lethality and centrality in protein networks. Nature. 2001;411:41–42. doi: 10.1038/35075138.
- 3. Baryshnikova A. Systematic functional annotation and visualization of biological networks. Cell Syst. 2016;2:412–421. doi: 10.1016/j.cels.2016.04.014.
- 4. Köberlin MS, et al. A conserved circular network of coregulated lipids modulates innate immune responses. Cell. 2015;162:170–183. doi: 10.1016/j.cell.2015.05.051.
- 5. Grover, A. & Leskovec, J. node2vec: scalable feature learning for networks. In Proc. 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining 855–864 (ACM, 2016).
- 6. Fruchterman TMJ, Reingold EM. Graph drawing by force-directed placement. Softw. Pract. Exp. 1991;21:1129–1164. doi: 10.1002/spe.4380211102.
- 7. Huttlin EL, et al. Architecture of the human interactome defines protein communities and disease networks. Nature. 2017;545:505–509. doi: 10.1038/nature22366.
- 8. Luck K, et al. A reference map of the human binary protein interactome. Nature. 2020;580:402–408. doi: 10.1038/s41586-020-2188-x.
- 9. Caldera M, Buphamalai P, Müller F, Menche J. Interactome-based approaches to human disease. Curr. Opin. Syst. Biol. 2017;3:88–94. doi: 10.1016/j.coisb.2017.04.015.
- 10. Menche J, et al. Disease networks. Uncovering disease-disease relationships through the incomplete interactome. Science. 2015;347:1257601. doi: 10.1126/science.1257601.
- 11. Petitjean A, Achatz MIW, Borresen-Dale AL, Hainaut P, Olivier M. TP53 mutations in human cancers: functional selection and impact on cancer prognosis and outcomes. Oncogene. 2007;26:2157–2165. doi: 10.1038/sj.onc.1210302.
- 12. Guimerà, R. & Amaral, L. A. N. Cartography of complex networks: modules and universal roles. J. Stat. Mech. 2005, P02001-1–P02001-13 (2005).
- 13. Li H, et al. Integrated bioinformatics analysis identifies ELAVL1 and APP as candidate crucial genes for Crohn’s disease. J. Immunol. Res. 2020;2020:3067273. doi: 10.1155/2020/3067273.
- 14. Pirch S, et al. The VRNetzer platform enables interactive network analysis in Virtual Reality. Nat. Commun. 2021;12:2432. doi: 10.1038/s41467-021-22570-w.
- 15. Gehlenborg N, et al. Visualization of omics data for systems biology. Nat. Methods. 2010;7:S56–S68. doi: 10.1038/nmeth.1436.
- 16. Shi Z, Wang J, Zhang B. NetGestalt: integrating multidimensional omics data over biological networks. Nat. Methods. 2013;10:597–598. doi: 10.1038/nmeth.2517.
- 17. Czerwinska U, Calzone L, Barillot E, Zinovyev A. DeDaL: Cytoscape 3 app for producing and morphing data-driven and structure-driven network layouts. BMC Syst. Biol. 2015;9:46. doi: 10.1186/s12918-015-0189-4.
- 18. Reimand J, et al. Pathway enrichment analysis and visualization of omics data using g:Profiler, GSEA, Cytoscape and EnrichmentMap. Nat. Protoc. 2019;14:482–517. doi: 10.1038/s41596-018-0103-9.
- 19. Legeay M, Doncheva NT, Morris JH, Jensen LJ. Visualize omics data on networks with Omics Visualizer, a Cytoscape App. F1000Res. 2020;9:157. doi: 10.12688/f1000research.22280.1.
- 20. Yue X, et al. Graph embedding on biomedical networks: methods, applications and evaluations. Bioinformatics. 2020;36:1241–1251. doi: 10.1093/bioinformatics/btz718.
- 21. van der Maaten L, Hinton G. Visualizing data using t-SNE. J. Mach. Learn. Res. 2008;9:2579–2605.
- 22. McInnes L, Healy J, Saul N, Großberger L. UMAP: Uniform manifold approximation and projection. J. Open Source Softw. 2018;3:861. doi: 10.21105/joss.00861.
- 23. Cowen L, Ideker T, Raphael BJ, Sharan R. Network propagation: a universal amplifier of genetic associations. Nat. Rev. Genet. 2017;18:551–562. doi: 10.1038/nrg.2017.38.
- 24. Lovász, L. et al. in Combinatorics. Paul Erdős is Eighty (eds. Miklós, D., Sós, V. T. & Szőnyi, T.) Vol. 2, 1–46 (Bolyai Society, 1993).
- 25. Hagberg, A., Swart, P. & S Chult, D. Exploring network structure, dynamics, and function using networkx. In Proc. 7th Python in Science Conference, SCIPY 08 (eds. Varoquaux, G., Vaught, T. & Millman, J.) (Los Alamos National Laboratory, 2008).
- 26. Csardi G, Nepusz T. The igraph software package for complex network research. InterJ. Complex Syst. 2006;1695:1–9.
- 27. Jacomy M, Venturini T, Heymann S, Bastian M. ForceAtlas2, a continuous graph layout algorithm for handy network visualization designed for the Gephi software. PLoS ONE. 2014;9:e98679. doi: 10.1371/journal.pone.0098679.
- 28. CSIRO’s Data61. StellarGraph Machine Learning Library (GitHub, 2018).
- 29. Harris CR, et al. Array programming with NumPy. Nature. 2020;585:357–362. doi: 10.1038/s41586-020-2649-2.
- 30. Hütter, C. V. R., Sin, C., Müller, F. & Menche, J. cartoGRAPHs (Zenodo, 2022); 10.5281/zenodo.5883000
- 31. Alanis-Lobato G, Andrade-Navarro MA, Schaefer MH. HIPPIE v2.0: enhancing meaningfulness and reliability of protein-protein interaction networks. Nucleic Acids Res. 2017;45:D408–D414. doi: 10.1093/nar/gkw985.
- 32. Piñero J, et al. DisGeNET: a comprehensive platform integrating information on human disease-associated genes and variants. Nucleic Acids Res. 2017;45:D833–D839. doi: 10.1093/nar/gkw943.
- 33. Schriml LM, et al. Human Disease Ontology 2018 update: classification, content and workflow expansion. Nucleic Acids Res. 2019;47:D955–D962. doi: 10.1093/nar/gky1032.
- 34. The Gene Ontology Consortium. The Gene Ontology Resource: 20 years and still GOing strong. Nucleic Acids Res. 2019;47:D330–D338. doi: 10.1093/nar/gky1055.
- 35. Gurumayum S, et al. OGEE v3: Online GEne Essentiality database with increased coverage of organisms and human cell lines. Nucleic Acids Res. 2021;49:D998–D1003. doi: 10.1093/nar/gkaa884.
- 36. Rath A, et al. Representation of rare diseases in health information systems: the Orphanet approach to serve a wide range of end users. Hum. Mutat. 2012;33:803–808. doi: 10.1002/humu.22078.
- 37. Hu B, et al. EmExplorer: a database for exploring time activation of gene expression in mammalian embryos. Open Biol. 2019;9:190054. doi: 10.1098/rsob.190054.
- 38. Shannon P, et al. Cytoscape: a software environment for integrated models of biomolecular interaction networks. Genome Res. 2003;13:2498–2504. doi: 10.1101/gr.1239303.