Journal of Systems Science and Complexity. 2021 Aug 10;34(4):1454–1470. doi: 10.1007/s11424-020-0013-0

Statistical Identification of Important Nodes in Biological Systems

Pei Wang
PMCID: PMC8353063  PMID: 34393461

Abstract

Biological systems can be modeled and described by biological networks, which are typical complex networks with wide real-world applications. Many problems arising in biological systems boil down to the identification of important nodes. For example, biomedical researchers frequently need to identify important genes that potentially lead to disease phenotypes in animals and to explore crucial genes that are responsible for stress responsiveness in plants. To facilitate the identification of important nodes in biological systems, one needs to know the network structure or behavioral data of the nodes (such as gene expression data). If the network topology is known, various centrality measures can be developed to solve the problem; if only behavioral data of the nodes are given, some sophisticated statistical methods can be employed. This paper reviews some recent works on the statistical identification of important nodes in biological systems from three aspects: 1) in general complex networks, based on complex network theory and epidemic dynamic models; 2) in biological networks, based on network motifs; and 3) in plants, based on RNA-seq data. The identification of important nodes in a complex system can be seen as a mapping from the system to the ranking score vector of nodes; such a mapping does not necessarily have an explicit form. The three aspects reflect three typical approaches to ranking nodes in biological systems and can be integrated into one general framework. This paper also proposes some challenges and future work on related topics. The associated investigations have potential real-world applications in the control of biological systems, network medicine, and the cultivation of new crop varieties.

Keywords: Biological network, complex network, important node, network motif, RNA-seq.
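
As an illustration of the "mapping from the system to the ranking score vector" described in the abstract, the following minimal sketch ranks the nodes of a small network by degree, one of the simplest centrality measures. It is not the paper's method: the function names `degree_scores` and `rank_nodes`, the toy edge list, and the gene labels are all hypothetical placeholders chosen for this example.

```python
from collections import defaultdict


def degree_scores(edges):
    """Map an undirected edge list to a {node: degree} score vector."""
    neighbors = defaultdict(set)
    for u, v in edges:
        if u == v:
            continue  # self-loops are ignored in this sketch
        neighbors[u].add(v)
        neighbors[v].add(u)
    return {node: len(nbrs) for node, nbrs in neighbors.items()}


def rank_nodes(scores):
    """Return nodes sorted by descending score; ties are broken by name."""
    return sorted(scores, key=lambda node: (-scores[node], node))


# Hypothetical toy network; the labels g1..g5 are placeholders, not real data.
toy_edges = [("g1", "g2"), ("g1", "g3"), ("g1", "g4"), ("g2", "g3"), ("g4", "g5")]

scores = degree_scores(toy_edges)   # the score vector
ranking = rank_nodes(scores)        # the induced ranking of nodes
print(ranking)
```

Here the mapping has an explicit form (count the neighbors of each node). Any other scoring rule could be swapped in for `degree_scores` without changing `rank_nodes`, which reflects the abstract's point that different approaches share one general framework.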

Footnotes

This paper was supported by the National Natural Science Foundation of China under Grant No. 61773153, the Natural Science Foundation of Henan under Grant No. 202300410045, the Supporting Plan for Scientific and Technological Innovative Talents in Universities of Henan Province under Grant No. 20HASTIT025, and the Training Plan of Young Key Teachers in Colleges and Universities of Henan Province under Grant No. 2018GGJS021. It was also partly supported by the Supporting Grant of Bioinformatics Center of Henan University under Grant No. 2018YLJC03.

This paper was recommended for publication by Editor GUO Jin.

Articles from Journal of Systems Science and Complexity are provided here courtesy of Nature Publishing Group
