Abstract
One of the key characteristics of cancer cells is an increased phenotypic plasticity, driven by underlying genetic and epigenetic perturbations. However, at a systems-level it is unclear how these perturbations give rise to the observed increased plasticity. Elucidating such systems-level principles is key for an improved understanding of cancer. Recently, it has been shown that signaling entropy, an overall measure of signaling pathway promiscuity, and computable from integrating a sample's gene expression profile with a protein interaction network, correlates with phenotypic plasticity and is increased in cancer compared to normal tissue. Here we develop a computational framework for studying the effects of network perturbations on signaling entropy. We demonstrate that the increased signaling entropy of cancer is driven by two factors: (i) the scale-free (or near scale-free) topology of the interaction network, and (ii) a subtle positive correlation between differential gene expression and node connectivity. Indeed, we show that if protein interaction networks were random graphs, described by Poisson degree distributions, that cancer would generally not exhibit an increased signaling entropy. In summary, this work exposes a deep connection between cancer, signaling entropy and interaction network topology.
One of the key features of cancer is an increased cellular plasticity, mediated by an increased promiscuity in signaling patterns, and driven by underlying genetic and epigenetic aberrations which cause a fundamental rewiring of the intracellular signaling network1,2,3,4,5,6,7,8. Every aberration found in a cancer cell can be thought of as a perturbation if the aberration affects the gene functionally. Such perturbations can be classed as activating, if they result in an increased functional activity of the gene (e.g. amplification and overexpression of ERBB2 in breast cancer), or inactivating, if it compromises gene function (e.g. silencing through promoter DNA methylation). Whilst the effect of certain specific perturbations on gene function can be predicted, it is much less clear how individual perturbations affect the cellular phenotype as a whole, since this depends on the collective nature of the other aberrations that are present in the same cell. Predicting the net effect of multiple perturbations in a signaling network is hard due to complex effects such as pathway redundancy and epistasis3,6. Moreover, in the context of cancer, although the effect of specific aberrations on cell function is known, it is yet unclear how individual cancer perturbations may contribute to the observed increased signaling promiscuity and phenotypic plasticity.
One way to approach this challenge computationally, is to anchor the analysis on global measures which capture salient features of the cellular phenotype, and which are computable from, say, a sample's molecular profile (e.g. a sample's gene expression profile). Here we are particularly interested in measuring signaling promiscuity since evidence is mounting that this underlies a sample's phenotypic plasticity8. In previous work we have started to explore a measure which approximates intra-sample signaling promiscuity, and which is known as network signaling entropy9,10,11. Signalling entropy is computed from integrating a sample's genome-wide gene expression profile with a protein interaction network and, as shown by us, provides a surprisingly good estimate of a sample's height in Waddingtons's differentiation landscape, with human embryonic stem cells (hESCs) exhibiting the highest levels of entropy10. Indeed, signalling entropy was able to discriminate cellular samples according to their differentiation potential within distinct lineages, including hematopoietic, mesenchymal and neural lineages, and with terminally differentiated cells within these lineages exhibiting the lowest levels of entropy10. Importantly, signaling entropy was also found to be higher in cancer compared to normal tissue, consistent with the view that cancer cells represent a more undifferentiated stem-cell like state, characterised by an increase in phenotypic plasticity8,10.
Given that increased signaling entropy is such a robust and characteristic feature of differentiation potency and cancer, and that it is also amenable to computation9,10, it is of great theoretical and biological interest to study the changes in entropy caused by cellular network perturbations. In the context of cancer, two well-known network perturbations are the overexpression and underexpression of oncogenes and tumour suppressor genes, respectively, and although these perturbations are known to result in the uncontrolled activation of cell-growth and cell-proliferation pathways, it remains unclear how these perturbations affect signalling promiscuity. In order to deepen our understanding, we here decided to study the effect of such perturbations on signaling entropy, using both simulated and real data, and using a variety of different network types in order to assess the impact of network topology. Specifically, we consider Erdos-Renyi random (Poisson) graphs12, scale free networks13, as well as real protein-protein interaction (PPI) networks14,15,16. In doing so, we discover that in Poisson networks, perturbations (be they activating or inactivating) lead to reductions in the global entropy, but that this is not true for scale-free and more realistic PPI networks. In networks exhibiting a scale-free, or near scale-free topology, we show that gene expression perturbations affecting hubs exhibit a striking bi-modality, leading to increases or decreases in the global entropy rate depending on the directionality of the expression change. We further expose a subtle yet significantly positive correlation between differential gene expression in cancer and node-degree, which we show drives the increased signaling promiscuity of cancer, but only if the underlying protein interaction network has a scale-free (or near scale-free) topology. Thus, this work makes a deep connection between a defining feature of the cancer phenotype, i.e. high signaling entropy, its differential gene expression pattern and the (near) scale-free topology of real PPI networks.
Although there are many studies on network perturbations, it is worth clarifying that the network perturbations and outcome of interest (i.e. the entropy rate) considered in this work are very different from the perturbations and outcomes of interest considered in previous studies17,18,19,20,21. Specifically, we consider network perturbations which only alter the local edge weights without altering the underlying network topology9,10,22. Moreover, our network perturbations can be both activating as well as inactivating, representing the two different types of cancer alterations affecting oncogenes and tumour suppressors, respectively. In contrast, much of the previous literature has dealt with the effects of removing specific nodes in unweighted networks17,18, a type of inactivating perturbation which alters the underlying network topology, focusing on tolerance and robustness as outcome measures17,18,19,21. Thus, from a network theoretical perspective, the important novel insights reported in this work are made possible by considering a novel type of network perturbation in the context of weighted networks defined by a stochastic matrix. We should also stress that our outcome of interest, signaling entropy, is a systems-level measure that is constructed from the genome-wide expression profile of a given sample, and therefore has little to do with the protein signaling disorder measures considered by other studies and which do not use gene expression data23.
Results
Increased signaling entropy in cancer is driven by overexpression of hub genes
In earlier work we demonstrated that signaling entropy, a measure of the signaling promiscuity in a cellular sample, is increased in cancer compared to normal tissue, irrespective of tissue type9,10,11. This increased signaling entropy is consistent with the observed increased phenotypic plasticity of cancer cells (see e.g. Ref. 8). Thus, increased signaling entropy has emerged as a cancer systems hallmark9,11. Signaling entropy is estimated as the entropy rate24 of a sample-specific stochastic matrix which models the signaling interactions in the sample (Methods). This stochastic matrix is computed by integrating the gene expression profile of the sample with a comprehensive PPI network, invoking the mass-action principle to define the edge-weights in the network (Methods). The mass-action principle is based on the assumption that two proteins, which have been reported to interact, are more likely to interact in a given sample if both are highly expressed in that sample.
Here we wanted to shed light on why, theoretically, we observe increased signaling entropy in cancer. We decided to use liver cancer as a model since liver represents a relatively homogeneous tissue, and is thus less affected by contaminating non-epithelial cells. We downloaded gene-normalised RNA-Seq data for a matched subset of 50 normal liver and 50 liver cancer samples from The Cancer Genome Atlas (TCGA). Confirming our earlier work using Affymetrix gene expression data9,10,11, liver cancer exhibited a significantly higher signaling entropy rate compared to normal liver tissue (Fig. 1A). Randomisation of the RNA-Seq profiles over the nodes in the network resulted in a significantly reduced difference in entropy rate between normal and cancer tissue (Fig. 1A), indicating (as pointed out by us previously11) that the entropy increase in cancer is driven by a subtle interplay between specific gene expression changes and where these happen on the network. Specifically, we posited that the topological properties of the genes undergoing the largest changes in gene expression would be key features dictating the change in signalling entropy.
Since each gene i contributes an amount πiLSi to the entropy rate of a given sample (Methods), we computed for each gene the difference in the means of its local entropy rate, πiLSi, between normal and cancer tissue. In order to help interpretation, we also computed for each gene the difference in the means of the invariant measure πi between normal and cancer, as well as the difference in the average local entropy LSi (Methods). All these changes were assessed in relation to the connectivity of the genes in the network. We observed that the entropy rate increase in cancer is driven mainly by hubs, i.e. the nodes of highest degree in the network (Fig. 1B). Changes to the local entropy rates were driven by concomitant changes in the average invariant measure (Fig. 1C). Thus, hubs exhibited preferential increases in their average invariant measure, whilst also demonstrating positive increases in the average local entropy (Fig. 1D). Since the invariant measure value at a node i represents the steady-state probability of finding a random walker at this node, the observed preferential increase in the invariant measure at hubs means that there is an increased signaling flux through these hub nodes in cancer.
To gain insight as to why there is an increased signaling flux through hubs in cancer, we focused on the hub gene exhibiting the largest increase in the local entropy rate. This was the gene BUB1 (Fig. 2A). A scatterplot of the expression values of BUB1 and that of its neighbors (813 neighbors) in a representative normal sample versus the corresponding expression values in a representative cancer sample, demonstrates that most of the expression differences involve increases in gene expression, implicating both the hub itself as well as some of its neighbors (Fig. 2B). Thus, for the majority of neighbors of BUB1, the increased expression of BUB1 will, according to the mass action principle, drive increased signaling through this hub. Indeed, for each one of BUB1's neighbors we ranked its neighbors according to the largest increase in gene expression, revealing that the original hub (i.e. BUB1) ranked among the top 2% centile for 99% of the hub neighbors (SI Appendix, fig. S1). Interestingly, this effect was not unique to BUB1 since high-degree hubs generally exhibited a significant skew towards increased gene expression in cancer (Fig. 2C–D).
Confirming the biological significance of these results, we reached very similar conclusions by repeating the above analysis in the independent Affymetrix gene expression data set of normal liver and liver cancer tissue25 (SI Appendix, fig. S2–S3). Thus, the increased entropy rate in liver cancer is driven mainly by the increased expression of the highest degree hubs in the PPI network.
Effect of cancer perturbations on signaling entropy
That the highest degree genes show preferential expression increases in cancer (Fig. 2C–D, SI Appendix, fig. S3) suggests an intricate link between network topology and differential expression. Confirming this further, in both liver expression sets we also observed that the genes exhibiting the largest, or most significant, decreases in expression preferentially mapped to low-degree nodes (Fig. 2C–D, SI Appendix fig. S3–S4).
This intricate correlation between differential expression and node degree motivated us to pursue a deeper understanding of the complex interplay between network topology, gene expression perturbations and entropy rate. Intuitively, and from the perspective of a gene i that interacts with an oncogenic hub, overexpression of the latter would lead to an increased outgoing signaling flux of node i towards the hub, potentially leading to an increase in the overall entropy rate (Fig. 3A). Interestingly, underexpression of a low-degree node, which may connect to a hub either directly or indirectly through an intermediate node i would also lead to an increased signaling flux through the hub (Fig. 3A). Thus, the two characteristic topological features of differential gene expression changes in cancer could synergize causing increased signaling flux through key hubs. To test whether this is indeed the case, we performed a perturbation analysis for the top 100 genes ranked according to fold-change between normal liver and liver cancer. The initial signaling distribution was defined by invoking the mass action principle on the average expression profile over all 50 normal liver samples. Next, each of the top 100 ranked genes was individually perturbed by changing its expression level according to the observed difference between normal and cancer tissue. Confirming our hypothesis, underexpressed genes (which generally did not target hubs) led to marginal increases in the entropy rate, whilst overexpressed hubs caused significant entropy increases (Fig. 3B). Interestingly however, overexpression led to marginal entropy decreases whenever it did not target the highest degree hubs, suggesting that such perturbations draw away signaling flux from the major hubs (Fig. 3B).
The effect of perturbations on signaling entropy is dependent on network topology
To further investigate the effect of individual perturbations on signaling entropy, as well as the role of the underlying network topology, we devised a simulation framework on toy networks, perturbing each node in turn, and recording the effect on the entropy rate (Fig. 4A, Methods). To simplify the analysis we considered an initial uniform edge weight configuration, defining an unbiased random walk on the graph. We note that this initial configuration represents a state of relatively high signaling entropy, but not of maximal entropy (see Methods). As activating perturbations we consider local increases in gene expression, whereby all the weights of edges converging on a perturbed node i are assigned a relatively large weight (Fig. 4B). Thus, as seen from the perspective of a neighboring node j, before perturbation, node j has maximal local entropy, given by log kj (where kj is the degree of node j), whilst after the perturbation, the node's local entropy is close to 0 (Fig. 4B). We emphasize again that although in the initial configuration all local entropies are maximal, that the initial entropy rate over the whole network is not maximal (see Methods). Thus, after the perturbation, the global entropy rate of the network could increase or decrease.
In order to understand the potential impact of network topology, we first conducted the perturbation analysis above on Erdos-Renyi (ER) random graphs, for which the degree distribution is Poisson. For such ER graphs, we observed that activating perturbations (i.e. increases in gene expression), always led to a reduction in the global entropy rate, irrespective of node degree (Fig. 4C). Repeating the analysis for inactivating perturbations, i.e causing nodes to undergo underexpression, we observed that almost all nodes led to a decrease in entropy. Thus, given that cancer is characterised by an increase in signaling entropy, this suggests that the emergence of an increased signaling promiscuity regime in cancer must be due either to specific topological features not present in random graphs, or to non-random combinations of perturbations.
To investigate this further, we next performed the same perturbation analysis above, but now on networks characterised by a scale-free (or near scale-free) topology, a key feature of real biological networks26. The scale-free networks were matched to the same size and average connectivity than the previously considered Erdos-Renyi graphs. Remarkably, in scale-free networks we observed that activating perturbations exhibited a bi-modal response, with perturbations at lower-degree nodes resulting in a reduction of the global entropy rate, whilst hubs exhibited increases (Fig. 4C). In fact, we observed two distinct regimes with an opposite functional relationship between entropy change and node-degree (Fig. 4C). In the low-degree regime, the entropy rate decreased as node degree increases, whereas in the high-degree regime one observes entropy increases (Fig. 4C). Interestingly, this bi-phasic behaviour was not seen for inactivating perturbations where we observed a monotonic decrease of entropy with node degree (Fig. 4C). In stark contrast to Poisson networks, high-degree nodes in the scale-free network exhibited a bi-modal response dependent on the directionality of the perturbation (Fig. 4C): overexpressed hubs led to entropy increases, while underexpressed hubs led to corresponding decreases.
Next, we wanted to test whether this bi-phasic and bi-modal behaviour is also seen in real PPI networks. We first checked that our PPI network exhibited an approximate scale-free topology (SI Appendix, fig. S5). Its clustering coefficient was also significantly higher than that of a degree-distribution matched scale-free network (SI Appendix, fig. S5). Performing the perturbation analysis on the PPI network, we observed once again two phases, which was particularly striking for activating perturbations, with one phase exhibiting a negative correlation between node degree and entropy, whilst the hub regime exhibited a positive correlation (Fig. 4C). Very interestingly, however, increases in entropy were only observed for the highest-degree hubs, with lower-degree hubs exhibiting decreases which were surprisingly also of a larger magnitude (Fig. 4C). Thus, in networks with a scale-free or an approximate scale-free topology, overexpression of the highest degree hubs leads to an increase in the entropy rate. But increasing signaling flux through lower-degree nodes, even if of relatively high degree, leads to an overall reduction in the diffusion rate.
From the combined perturbation analysis, we can thus see that individual perturbations on an Erdos-Renyi graph, be they activations or inactivations (but both causing a local reduction in entropy), invariably lead to a reduction in the global entropy rate. This is in stark contrast to networks with a scale-free or approximate scale-free topology, where we observe that gene activations can have opposite effects on entropy rate depending on the degree of the activating nodes.
Entropy rate increase in cancer requires a scale-free interaction network topology
The previous perturbation analysis strongly supports the view that a scale-free, or near scale-free network topology, is important for the observed increased entropy rate in cancer. To test this formally, we recomputed the entropy rate of all 50 normal liver and 50 liver cancer samples, but now using an underlying Erdos-Renyi (ER) interaction network matched to the same size and average connectivity of the full PPI network. In order to faithfully preserve the correlation between gene expression and node degree of the PPI network, nodes of the ER network were ranked according to degree and gene expression values assigned according to their corresponding rank/centile in the original PPI network. Thus, this node mapping between the two networks preserves the observed rank correlation between differential expression and node-degree, allowing us to objectively assess the importance of the scale-free property. Recomputation of the entropy rates of all 100 samples on the ER network revealed no significant difference between normal and cancer, thus demonstrating that the observed entropy rate increase in cancer requires the scale-free property of the interaction network (Fig. 5A). Supporting this further, we observed, in two other matched normal-cancer RNA-seq expression sets from the TCGA, that the entropy was no longer higher in cancer when the PPI network was replaced with an equivalent ER graph (Fig. 5B–C). In independent Affymetrix gene expression data, we observed that the cancer-associated increase in the entropy rate was reduced upon computing entropy on an equivalent ER network, in three out of four studies (SI Appendix, fig. S6). Thus, in 6/7 data sets, there was a reduction in the entropy rate difference between cancer and normal tissue (Binomial, P = 0.008), supporting the view that a scale-free interaction topology is indeed necessary for the higher entropy signaling dynamics of cancer.
Discussion
Signaling entropy, a measure of the overall uncertainty or promiscuity in signaling patterns within a cellular sample, has been shown to be of biological significance in a variety of different contexts9,10,11. In cellular differentiation it provides a proxy to the energy potential (i.e. height) of Waddington's epigenetic landscape, allowing the differentiation potential of a sample to be assessed purely from its genome-wide transcriptomic profile10. Similarly, signaling entropy also provided us with a useful framework in which to identify specific systems-level features characterising cancer, one of which being the increased signaling promiscuity of cancer compared to its corresponding normal tissue9,10,11. This is important because an increased signaling promiscuity could underlie the increased phenotypic plasticity of cancer, as observed e.g. by Pisco et al8.
In this work we aimed to obtain a deeper theoretical understanding as to (i) why signaling entropy is increased in cancer and (ii) why it is such a robust discriminatory feature. We have here demonstrated that the increase in signaling entropy is driven by two factors. First, a subtle positive correlation between differential gene expression and the degree of the corresponding proteins in the PPI network. This correlation amounts to hubs exhibiting preferential increases in gene expression, whilst those genes exhibiting the most significant underexpression map preferentially to low-degree nodes. Second, the observed increase of entropy in cancer requires the scale-free (or near scale-free) topology characterising PPI networks. Indeed, by considering a Poisson network with an identical rank correlation coefficient between differential expression and node-degree, we no longer consistently observed a significant increased entropy rate in cancer (Fig. 5). Given the demonstrated biological significance of the entropy rate10,11, this last result thus exposes a deep connection between the cancer phenotype and the underlying scale-free property of real PPI networks. It suggests that if the degree distribution of a PPI network were Poisson, that the transcriptomic changes seen in cancer would not define a highly promiscuous signaling regime. In other words, our data support the view that cancer “hijacks” the scale-free property of real signaling networks in order to facilitate increased signaling promiscuity and intra-tumour heterogeneity.
The novel insights described above also explain why the entropy rate provides such a robust discriminatory feature of the cancer phenotype. The robustness stems from the subtle correlation between differential expression and node-degree. Although gene expression data is notoriously noisy, there is generally speaking good agreement across independent studies when comparing the changes in differential gene expression between two marked phenotypes such as normal and cancer tissue27. Secondly, although current PPI networks only represent mere caricatures of the real interactions in a cell, the “hubness” of a protein is likely to be a very robust feature. Indeed, that a given protein has exceptionally many interactions, thus defining a hub in a network, is likely to be a very robust feature, despite the fact that the specific interaction space of the hub may contain many false negatives and false positives15. Thus, the relative robustness of differential expression and hubness drives the robustness of the observed correlation between differential expression and node degree, which in turn explains why increased signaling entropy is such a consistent feature of the cancer phenotype9,10. Given the robustness of signaling entropy as a marker of differentiation potency10, it is therefore tempting to speculate that a subtle correlation between differential expression and node degree also exists in the context of normal cellular differentiation. Furthermore, it will be interesting to explore if the scale-free or near scale-free topology of PPI networks is also a key element underlying the nature of pluripotency, multipotency and terminal differentiation.
Although many previous studies have explored differential gene expression changes in cancer and other diseases in relation to network topology28,29,30,31,32,33,34,35,36,37, most of these have either focused on global topological properties, or on finding differential gene modules, or on studying absolute changes in differential expression. Indeed, a number of studies agree in reporting that absolute differential expression correlates negatively with node degree, meaning that hubs exhibit, on the whole, much smaller changes in expression between disease phenotypes28,30. Interestingly, however, relatively little attention has been paid to studying the directionality of differential gene expression in cancer in relation to node degree. Here we have shown that there exists a subtle yet significantly positive correlation between differential expression and protein-degree. On its own, the biological significance of this correlation is unclear. However, by interpreting this correlation in the novel contextual framework of signalling entropy, we have here shown how, in the context of real (near) scale-free networks, it could underpin the increased phenotypic plasticity of cancer.
In summary, increased expression of oncogenic hubs, as well as reduced expression of network-peripheral tumour suppressor genes, in interaction networks characterised by a (near) scale-free topology, drives the high signaling entropy of cancer and could thus underpin cancer's phenotypic robustness and plasticity. Further in-depth study of the complex interplay between local protein activity changes, their interaction network topology and the effect on signaling entropy is warranted.
Methods
The protein protein interaction (PPI) network
We used a PPI network similar to that used in our previous publication38. Briefly, the human interaction network derives from the Pathway Commons Resource (www.pathwaycommons.org)15, which brings together protein interactions from several distinct sources, including the Human Protein Reference Database (HPRD)14, the National Cancer Institute Nature Pathway Interaction Database (NCI-PID) (pid.nci.nih.gov), the Interactome (Intact) http://www.ebi.ac.uk/intact/ and the Molecular Interaction Database (MINT) http://mint.bio.uniroma2.it/mint/. Protein interactions in this network include physical stable interactions such as those defining protein complexes, as well as transient interactions such as post-translational modifications and enzymatic reactions found in signal transduction pathways, including 20 highly curated immune and cancer signaling pathways from NetPath (www.netpath.org)39. The network focuses on non-redundant interactions, only included nodes with an Entrez gene ID annotation and on the maximally connected component thereof, resulting in a connected network of 8,434 nodes (unique Entrez IDs) and 303,600 documented interactions.
Normal and cancer tissue gene expression data sets
We focused on liver cancer because the associated normal tissue constitutes a relatively homogeneous mass of cells, and thus the entropy rate is less likely to be influenced by changes in tissue-type composition. We downloaded the level 3 gene normalized RNA-Seq data from the TCGA (www.cancergenome.nih.gov) for a matched subset of 50 normal liver and 50 liver cancer samples. As validation, we considered an Affymetrix expression data set, consisting of 37 normal livers (including normal liver, cirrhosis and dysplasia) + 38 liver cancers25. To test generalisability, we also downloaded level 3 RNA-Seq gene normalised data from the TCGA for prostate cancer (52 cancers & 52 matched normals) and colon cancer (27 cancers & 27 matched normals). The other normal/cancer Affymetrix expression sets used have been described previously10.
Construction of the sample specific stochastic matrix and entropy rate
The construction of the entropy rate follows the same method described in our earlier work10,11. Briefly, we use the mass action principle to define a stochastic matrix, pij, for each individual sample. In detail, let Ei denote the normalised expression level of gene i in a given sample. For a given neighbour j ∈ N(i) (where N(i) labels the neighbours of i in the PPI), the mass-action principle means that the probability of interaction with j is approximated by the product EiEj, i.e. pij ∝ EiEj. Normalising this to ensure that , we get for the stochastic matrix,
Clearly, if j ∉ N(i), then pij = 0. From this stochastic matrix one can then construct a local signaling entropy (LS) as
which reflects the level of uncertainty or redundancy in the local interaction probabilities. We note that the above expression for the local entropy is not normalised so that the maximum possible entropy depends on the degree (ki) of the node. In fact,
Finally, the signaling entropy rate, SR, is defined in terms of the stationary distribution (or invariant measure) π of the stochastic matrix (πp = π), as24,40
i.e. this global signaling entropy rate is a weighted average of the local entropies LSi. We note that although LSi is independent of the expression level of gene i, that the gene's contribution to the entropy rate, i.e. πiLSi, is not. This is because πi will depend on the gene i's expression level. In this work we refer to the term LSRi ≡ πiLSi as the local entropy rate of gene i, whereas LSi is just the gene i's local entropy.
The maximum entropy rate
Given a connected network, the maximum entropy rate, maxSR, over the network does not depend on the gene expression data but only on the adjacency matrix of the network. In fact, the maximum entropy rate is attained for a stochastic matrix pij given by41
where v and are the dominant right eigenvector and eigenvalue of the adjacency matrix A, respectively. Thus, it is important to note that the configuration of maximal local entropy, i.e. the configuration where for each node i, pij = Aij/ki and LSi = log ki, is not the configuration of maximal global entropy.
Perturbation simulation analysis
In what follows we describe the perturbation analysis performed on Erdös-Renyi and scale-free networks, as well as on the full real PPI network described earlier. The calculation of the global signaling entropy rate is simplified significantly by the fact that the stochastic matrix defined by equation 1 has the detailed balance property, i.e. the stationary distribution obeys not only πp = π, but the more restrictive condition πipij = πjpji. This detailed balance condition can be shown to imply
where F is a normalisation constant and .
The initial configuration for the perturbative analysis is that of maximal local entropy for each node in the network, which as explained previously, does not represent the state of global maximum entropy. To construct this initial configuration we set the expression level of each gene/node to be identical xi = x. Thus, in the initial configuration, xT,i = kix, and from detailed balance we obtain for the stationary distribution that
where V is the number of nodes in the network and where is the average degree. As far as the entropy is concerned, the local entropy of each node i is simply log ki, so the initial entropy rate is simply
Now let us consider perturbing a gene in the network by altering its expression level by an amount . Without loss of generality we label the perturbed node by the index “1”, so that after perturbation, the expression levels in the network are described by . The new stationary distribution then becomes
For the local entropies, we get
where for i ∈ N(1), and (j ≠ 1). Thus, the change in the entropy rate, ΔSR = SR′ − SRo, is easily computable following any perturbation.
In the actual analysis, when performing activating perturbations, we set x = 2 and , whilst, when modeling inactivating perturbations, we set x = 16 and . These values are typical for logged Affymetrix or Illumina data, with highly expressed genes normally exhibiting values larger than 12, and lowly expressed genes showing values smaller than 4.
Author Contributions
A.E.T. conceived and performed the research and wrote the manuscript. P.S. and R.K. helped with analytical computations. C.R.S.B. and S.S. contributed useful comments to an earlier version of manuscript.
Supplementary Material
Acknowledgments
A.E.T. is supported by the Chinese Academy of Sciences, Shanghai Institute for Biological Sciences and the Max-Planck Gesellschaft.
References
- Califano A. Rewiring makes the difference. Mol Syst Biol 7, 463 (2011). [DOI] [PMC free article] [PubMed] [Google Scholar]
- Dutkowski J. & Ideker T. Protein networks as logic functions in development and cancer. PLoS Comput Biol 7, e1002180 (2011). [DOI] [PMC free article] [PubMed] [Google Scholar]
- Creixell P., Schoof E. M., Erler J. T. & Linding R. Navigating cancer network attractors for tumor-specific therapy. Nat Biotechnol 30, 842–8 (2012). [DOI] [PubMed] [Google Scholar]
- Schramm G., Nandakumar K. & Konig R. Regulation patterns in signaling networks of cancer. BMC Syst Biol 4, 162 (2010). [DOI] [PMC free article] [PubMed] [Google Scholar]
- Bandyopadhyay S. et al. Rewiring of genetic networks in response to dna damage. Science 330, 1385–1389 (2010). [DOI] [PMC free article] [PubMed] [Google Scholar]
- Ideker T. & Krogan N. J. Differential network biology. Mol Syst Biol 8, 565 (2012). [DOI] [PMC free article] [PubMed] [Google Scholar]
- Csermely P. & Korcsmaros T. Cancer-related networks: a help to understand, predict and change malignant. Semin Cancer Biol 23, 209–12 (2013). [DOI] [PubMed] [Google Scholar]
- Pisco A. O. et al. Non-darwinian dynamics in therapy-induced cancer drug resistance. Nat Commun 4, 2467 (2013). [DOI] [PMC free article] [PubMed] [Google Scholar]
- West J., Bianconi G., Severini S. & Teschendorff A. E. Differential network entropy reveals cancer system hallmarks. Sci Rep 2, 802 (2012). [DOI] [PMC free article] [PubMed] [Google Scholar]
- Banerji C. R. et al. Cellular network entropy as the energy potential in waddington's differentiation landscape. Sci Rep 3, 3039 (2013). [DOI] [PMC free article] [PubMed] [Google Scholar]
- Teschendorff A. E., Sollich P. & Kuehn R. Signalling entropy: A novel network-theoretical framework for systems analysis and interpretation of functional omic data. Methods 67, 282–93 (2014). [DOI] [PubMed] [Google Scholar]
- Erdös P. & Rényi A. On random graphs. Pub Math 6, 290–297 (1959). [Google Scholar]
- Barabasi A. L. & Albert R. Emergence of scaling in random networks. Science 286, 509–512 (1959). [DOI] [PubMed] [Google Scholar]
- Prasad T. S., Kandasamy K. & Pandey A. Human protein reference database and human proteinpedia as discovery tools for systems biology. Methods Mol Biol 577, 67–79 (2009). [DOI] [PubMed] [Google Scholar]
- Cerami E. G. et al. Pathway commons, a web resource for biological pathway data. Nucleic Acids Res 39, D685–D690 (2011). [DOI] [PMC free article] [PubMed] [Google Scholar]
- Rolland T. et al. A proteome-scale map of the human interactome network. Cell 159, 1212–26 (2014). [DOI] [PMC free article] [PubMed] [Google Scholar]
- Albert R., Jeong H. & Barabasi A. L. Error and attack tolerance of complex networks. Nature 406, 378–382 (2000). [DOI] [PubMed] [Google Scholar]
- Jeong H., Mason S. P., Barabsi A. L. & Oltvai Z. N. Lethality and centrality in protein networks. Nature 411, 41–42 (2001). [DOI] [PubMed] [Google Scholar]
- Serra-Musach J. et al. Cancer develops, progresses and responds to therapies through restricted perturbation of the protein-protein interaction network. Integr Biol (Camb) 4, 1038–48 (2012). [DOI] [PMC free article] [PubMed] [Google Scholar]
- Kim J. et al. Robustness and evolvability of the human signaling network. PLoS Comput Biol 10, e1003763 (2014). [DOI] [PMC free article] [PubMed] [Google Scholar]
- Wang S. J., Wang Z., Jin T. & Boccaletti S. Emergence of disassortative mixing from pruning nodes in growing scale-free networks. Sci Rep 4, 7536 (2014). [DOI] [PMC free article] [PubMed] [Google Scholar]
- Teschendorff A. E. & Severini S. Increased entropy of signal transduction in the cancer metastasis phenotype. BMC Syst Biol 4, 104 (2010). [DOI] [PMC free article] [PubMed] [Google Scholar]
- Haynes C. et al. Intrinsic disorder is a common feature of hub proteins from four eukaryotic interactomes. PLoS Comput Biol 2, e100 (2006). [DOI] [PMC free article] [PubMed] [Google Scholar]
- Gomez-Gardenes J. & Latora V. Entropy rate of diffusion processes on complex networks. Phys Rev E Stat Nonlin Soft Matter Phys 78, 065102 (2008). [DOI] [PubMed] [Google Scholar]
- Wurmbach E. et al. Genome-wide molecular profiles of hcv-induced dysplasia and hepatocellular carcinoma. Hepatology 45, 938–947 (2007). [DOI] [PubMed] [Google Scholar]
- Barabasi A. L. & Oltvai Z. N. Network biology: understanding the cell's functional organization. Nat Rev Genet 5, 101–113 (2004). [DOI] [PubMed] [Google Scholar]
- Rhodes D. R. et al. Large-scale meta-analysis of cancer microarray data identifies common transcriptional profiles of neoplastic transformation and progression. Proc Natl Acad Sci U S A 101, 9309–9314 (2004). [DOI] [PMC free article] [PubMed] [Google Scholar]
- Lu X., Jain V. V., Finn P. W. & Perkins D. L. Hubs in biological interaction networks exhibit low changes in expression in experimental asthma. Mol Syst Biol 3, 98 (2007). [DOI] [PMC free article] [PubMed] [Google Scholar]
- Platzer A., Perco P., Lukas A. & Mayer B. Characterization of protein-interaction networks in tumors. BMC Bioinformatics 8, 224 (2007). [DOI] [PMC free article] [PubMed] [Google Scholar]
- Komurov K. & Ram P. T. Patterns of human gene expression variance show strong associations with signaling network hierarchy. BMC Syst Biol 4, 154 (2010). [DOI] [PMC free article] [PubMed] [Google Scholar]
- Komurov K., White M. A. & Ram P. T. Use of data-biased random walks on graphs for the retrieval of context-specific networks from genomic data. PLoS Comput Biol 6, pii: e1000889 (2010). [DOI] [PMC free article] [PubMed] [Google Scholar]
- Jonsson P. F. & Bates P. A. Global topological features of cancer proteins in the human interactome. Bioinformatics 22, 2291–2297 (2006). [DOI] [PMC free article] [PubMed] [Google Scholar]
- Tuck D. P., Kluger H. M. & Kluger Y. Characterizing disease states from topological properties of transcriptional regulatory networks. BMC Bioinformatics 7, 236 (2006). [DOI] [PMC free article] [PubMed] [Google Scholar]
- Hudson N. J., Reverter A. & Dalrymple B. P. A differential wiring analysis of expression data correctly identifies the gene containing the causal mutation. PLoS Comput Biol 5, e1000382 (2009). [DOI] [PMC free article] [PubMed] [Google Scholar]
- Ulitsky I. & Shamir R. Identification of functional modules using network topology and high-throughput data. BMC Syst Biol 1, 8 (2007). [DOI] [PMC free article] [PubMed] [Google Scholar]
- Chuang H. Y., Lee E., Liu Y. T., Lee D. & Ideker T. Network-based classification of breast cancer metastasis. Mol Syst Biol 3, 140 (2007). [DOI] [PMC free article] [PubMed] [Google Scholar]
- Yu H., Kim P. M., Sprecher E., Trifonov V. & Gerstein M. The importance of bottlenecks in protein networks: correlation with gene essentiality and expression dynamics. PLoS Comput Biol 3, e59 (2007). [DOI] [PMC free article] [PubMed] [Google Scholar]
- West J., Beck S., Wang X. & Teschendorff A. E. An integrative network algorithm identifies age-associated differential methylation interactome hotspots targeting stem-cell differentiation pathways. Sci Rep 3, 1630 (2013). [DOI] [PMC free article] [PubMed] [Google Scholar]
- Kandasamy K. et al. Netpath: a public resource of curated signal transduction pathways. Genome Biol 11, R3 (2010). [DOI] [PMC free article] [PubMed] [Google Scholar]
- Latora V. & Baranger M. Kolmogorov-sinai entropy rate versus physical entropy. Phys Rev Lett 82, 520–524 (1999). [Google Scholar]
- Demetrius L. & Manke T. Robustness and network evolution-an entropic principle. Physica A 346, 682–696 (2005). [Google Scholar]
Associated Data
This section collects any data citations, data availability statements, or supplementary materials included in this article.