The utility of the vast amounts of data produced by modern genomic techniques is largely limited by our ability to functionally interpret them. Normally, genomic data are functionally annotated and modeled by comparison with previously established databases that need constant updating to prevent their obsolescence.1
The most widely used approach to extract information from transcriptomes is to map mRNA expression data into pathways, which allows visualization and is the first step to explore which signaling cascades or metabolic processes are affected. Immunology and cancer research have been using this methodology for years, leading to the documentation of an ever-increasing number of pathways in those fields. In nephrology, the picture is different. In a recent survey, nephrology trainees did not consider renal genomics as an area worth investing additional instruction in during their fellowship.2 Not surprisingly, in both clinical and basic kidney research, the newer “omics” are emerging very slowly.
Before starting any type of pathway analysis, pathways need to be collected into a database that can be searched and screened.3,4 The complex nature of the kidney makes most of the available pathways of limited usefulness in renal research, because they lack the tissue and/or cell specificity required to analyze different renal tissues and structures. Most pathways address cellular processes in a general manner and do not distinguish between isozymes, splice variants, etc. Furthermore, many of the existing pathways were created using purely bioinformatics techniques that can incorrectly include gene products from homologous processes. As an example, species interconversion of pathways containing the enzyme urate oxidase, present in nearly all mammals but not in humans, can lead to inaccurate results. Similarly, when transcriptome data are used without regard for the underlying physiology, physiologic variations in RNA levels could erroneously indicate the absence of a particular transcript in a tissue. To circumvent these problems, curation by experts is imperative, and no one is more knowledgeable than renal physiologists for the generation and curation of renal-specific pathways. Therefore, to produce quality renal genomics data, the renal community must become involved in the development and maintenance of renal databases. To encourage this process, we compiled a collection of kidney-specific pathways in the Renal Genomics Portal (http://renalgenomics.wikipathways.org) at the public repository WikiPathways.5 As an open platform, WikiPathways adopts the Creative Commons CC0 waiver, which makes its content freely available to download, build on, customize, and reuse. The platform is intuitive to use, and anyone with basic computer skills and a user registration can contribute to pathway curation or creation.
Four of the pathways that we created summarize the enzymes and metabolites involved in (1) hexose metabolism in proximal tubules, (2) fructose metabolism in proximal tubules, (3) lipid droplet metabolism, and (4) acute angiotensin signaling in thick ascending limbs. The other two are nonclassic pathways: an annotated list of proximal tubule transporters grouped by function and a collection of the main thick ascending limb transporters organized on a cell scheme for didactic purposes (Figure 1). These pathways were created using the drawing and analysis tool PathVisio.6 We used Rattus norvegicus as a model organism and drew data from multiple publications and databases, including segment-specific transcriptomes,7 to decide which nodes to include for each nephron segment. Nodes representing gene products were then annotated using their Entrez Gene ID, whereas nodes representing metabolic intermediaries and ions (termed “metabolites”) were annotated using the Human Metabolome Database. Annotation assigns a specific identifier to each node that serves to crosslink information contained in the pathway with external databases (Figure 1), either to access other repositories that store information about that node or to merge that node with experimental data.
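The role of annotation in merging pathway nodes with experimental data can be illustrated with a minimal sketch. It is not the PathVisio or WikiPathways implementation. The `Node` class, the `merge_expression` function, the node labels, the identifiers, and the expression value are all hypothetical placeholders chosen for illustration. The only idea taken from the text is that each node carries a database name and an identifier, and that this pair is the key used to join the node to a measurement.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    """A pathway node annotated with an external database identifier."""
    label: str        # name shown on the pathway drawing
    database: str     # annotation source, e.g. "Entrez Gene" or "HMDB"
    identifier: str   # identifier within that source (placeholders below)


def merge_expression(nodes, expression):
    """Map each node label to its measurement, or None if nothing was measured.

    `expression` is keyed by (database, identifier), so the join relies only
    on the annotation and never on the free-text label.
    """
    return {
        node.label: expression.get((node.database, node.identifier))
        for node in nodes
    }


# Hypothetical pathway with two gene products and one metabolite.
nodes = [
    Node("gene product A", "Entrez Gene", "0001"),
    Node("gene product B", "Entrez Gene", "0002"),
    Node("metabolite X", "HMDB", "PLACEHOLDER-X"),
]

# Hypothetical experimental data: a log2 fold change for one gene only.
expression = {("Entrez Gene", "0001"): 1.8}

merged = merge_expression(nodes, expression)
for label, value in merged.items():
    print(label, value)
```

Keying the join on the database and identifier pair rather than on the label is what lets the same drawing be combined with data sets that name genes or metabolites differently.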
Contributions from other groups include pathways for the renin-angiotensin system, polycystic kidney disease, primary FSGS, Wnt signaling in kidney disease, the polyol pathway in diabetes, and vasopressin regulation of renal water homeostasis via aquaporins, drawn for different species, including Homo sapiens, Mus musculus, and Bos taurus.
In nephrology, these pathways can serve as didactic and reference material, as analytical tools in precision medicine, and as the basis for functional analysis. There are three generations of methods to functionally analyze genomic data using pathways.8 The first generation, called over-representation analysis, uses a list of significantly changed gene products as input and analyzes the number of hits within a pathway. Pathways with more hits are expected to be more affected. The main advantage of this method is its simplicity and easy visualization, whereas its main limitation is that it does not take into account the additive effects of marginally affected genes.8 The second generation, functional class scoring (FCS), inputs a whole matrix of expression data into a pathway or pathway-containing database and computes statistics on all nodes (gene products) within each pathway. The outcome is a score assigned to individual pathways that takes into account genes that may be excluded if analyzed individually.8 An example of FCS is gene set enrichment analysis.9 FCS is currently the most used pathway analysis methodology. The third generation, which is still under development, is called pathway topology (PT) and has the advantage of crosslinking information from nodes shared between separate pathways. Thus, PT integrates expression data into signaling and metabolic networks.8 A disadvantage of PT is the complexity of both the analysis and the interpretation of its findings.10
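As an illustration of the first generation of methods, the sketch below scores over-representation with an upper-tail hypergeometric test. This is one common statistical choice and is assumed here, not prescribed by the text. The function names, the gene identifiers (`g1` to `g20`), and the two toy pathways are hypothetical and do not correspond to any pathway in the portal.

```python
from math import comb


def ora_p_value(n_universe, n_pathway, n_changed, n_hits):
    """Probability of at least n_hits pathway genes among the changed genes.

    Upper tail of the hypergeometric distribution: n_changed genes are drawn
    without replacement from n_universe genes, n_pathway of which belong to
    the pathway.
    """
    max_hits = min(n_pathway, n_changed)
    tail = sum(
        comb(n_pathway, k) * comb(n_universe - n_pathway, n_changed - k)
        for k in range(n_hits, max_hits + 1)
    )
    return tail / comb(n_universe, n_changed)


def over_representation(pathways, changed, universe):
    """Return (name, hits, pathway size, p value) tuples, most significant first."""
    universe = set(universe)
    changed = set(changed) & universe
    results = []
    for name, genes in pathways.items():
        genes = set(genes) & universe
        hits = genes & changed
        p = ora_p_value(len(universe), len(genes), len(changed), len(hits))
        results.append((name, len(hits), len(genes), p))
    return sorted(results, key=lambda row: row[3])


# Hypothetical data: 20 measured genes, two pathways of five genes each.
universe = [f"g{i}" for i in range(1, 21)]
pathways = {
    "pathway_a": ["g1", "g2", "g3", "g4", "g5"],
    "pathway_b": ["g6", "g7", "g8", "g9", "g10"],
}
changed = ["g1", "g2", "g3", "g4", "g11"]  # significantly changed genes

results = over_representation(pathways, changed, universe)
for name, hits, size, p in results:
    print(f"{name}: {hits}/{size} hits, p = {p:.4g}")
```

Because only the list of significantly changed genes enters the calculation, a pathway whose genes all shift slightly in the same direction would receive no hits here, which is the limitation that functional class scoring addresses.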
The Renal Genomics Portal needs to be enriched with pathways yet to be created, such as AKI, CKD, diabetic nephropathy, kidney transplants and rejection, mitochondria in renal disease, membranous nephropathy, podocyte biology, and the renal immune system, among many others. The complexity and the level of detail required to create such pathways make it a herculean task for a single laboratory or even a few, a limitation that can be overcome with crowd contributions from nephrologists, renal physiologists, and basic and clinical researchers. Contributing to the Renal Genomics Portal provides an opportunity to integrate the renal knowledge base using modern technology and drive the discipline forward.
Disclosures
None.
Acknowledgments
The authors thank Kristina Hanspers and Alexander Pico from the Gladstone Institutes (San Francisco, CA) for facilitating information and resources at WikiPathways.
This work was supported, in part, by National Heart, Lung and Blood Institute of the National Institutes of Health grant HL128053 (to J.L.G.).
A portion of this article was presented at the 2017 Experimental Biology Meeting held April 22–26, 2017 in Chicago, Illinois.
Footnotes
Published online ahead of print. Publication date available at www.jasn.org.
References
- 1. de Silva NHND: Relational databases and biomedical big data. Methods Mol Biol 1617: 69–81, 2017
- 2. Rope RW, Pivert KA, Parker MG, Sozio SM, Merell SB: Education in nephrology fellowship: A survey-based needs assessment. J Am Soc Nephrol 28: 1983–1990, 2017
- 3. AlAjlan A, Badr G: Data Mining in Pathway Analysis for Gene Expression. In: Advances in Data Mining: Applications and Theoretical Aspects. ICDM 2015. Lecture Notes in Computer Science, edited by Perner P, Cham, Springer, 2015, pp 69–77
- 4. Kelder T, Pico AR, Hanspers K, van Iersel MP, Evelo C, Conklin BR: Mining biological pathways using WikiPathways web services. PLoS One 4: e6447, 2009
- 5. Kutmon M, Riutta A, Nunes N, Hanspers K, Willighagen EL, Bohler A, Mélius J, Waagmeester A, Sinha SR, Miller R, Coort SL, Cirillo E, Smeets B, Evelo CT, Pico AR: WikiPathways: Capturing the full diversity of pathway knowledge. Nucleic Acids Res 44: D488–D494, 2016
- 6. Kutmon M, van Iersel MP, Bohler A, Kelder T, Nunes N, Pico AR, Evelo CT: PathVisio 3: An extendable pathway analysis toolbox. PLOS Comput Biol 11: e1004085, 2015
- 7. Lee JW, Chou CL, Knepper MA: Deep sequencing in microdissected renal tubules identifies nephron segment-specific transcriptomes. J Am Soc Nephrol 26: 2669–2677, 2015
- 8. Khatri P, Sirota M, Butte AJ: Ten years of pathway analysis: Current approaches and outstanding challenges. PLOS Comput Biol 8: e1002375, 2012
- 9. Subramanian A, Tamayo P, Mootha VK, Mukherjee S, Ebert BL, Gillette MA, Paulovich A, Pomeroy SL, Golub TR, Lander ES, Mesirov JP: Gene set enrichment analysis: A knowledge-based approach for interpreting genome-wide expression profiles. Proc Natl Acad Sci U S A 102: 15545–15550, 2005
- 10. Bayerlová M, Jung K, Kramer F, Klemm F, Bleckmann A, Beißbarth T: Comparative study on gene set and pathway topology-based enrichment methods. BMC Bioinformatics 16: 334, 2015