Skip to main content
eLife logoLink to eLife
. 2024 Jan 15;12:RP88863. doi: 10.7554/eLife.88863

Leveraging inter-individual transcriptional correlation structure to infer discrete signaling mechanisms across metabolic tissues

Mingqi Zhou 1,2,, Ian Tamburini 1,2,, Cassandra Van 2,, Jeffrey Molendijk 3, Christy M Nguyen 1,2, Ivan Yao-Yi Chang 1, Casey Johnson 1,2, Leandro M Velez 1,2, Youngseo Cheon 1,2, Reichelle Yeo 4, Hosung Bae 1,2, Johnny Le 1,2, Natalie Larson 1,2, Ron Pulido 1,2, Carlos HV Nascimento-Filho 1,2, Cholsoon Jang 1,2, Ivan Marazzi 1,2, Jamie Justice 5, Nicholas Pannunzio 6, Andrea L Hevener 7,8, Lauren Sparks 4, Erin E Kershaw 9, Dequina Nicholas 1,2,10, Benjamin L Parker 3, Selma Masri 1,2, Marcus M Seldin 1,2,
Editors: Evan Graehl Williams11, David E James12
PMCID: PMC10945578  PMID: 38224289

Abstract

Inter-organ communication is a vital process to maintain physiologic homeostasis, and its dysregulation contributes to many human diseases. Given that circulating bioactive factors are stable in serum, occur naturally, and are easily assayed from blood, they present obvious focal molecules for therapeutic intervention and biomarker development. Recently, studies have shown that secreted proteins mediating inter-tissue signaling could be identified by ‘brute force’ surveys of all genes within RNA-sequencing measures across tissues within a population. Expanding on this intuition, we reasoned that parallel strategies could be used to understand how individual genes mediate signaling across metabolic tissues through correlative analyses of gene variation between individuals. Thus, comparison of quantitative levels of gene expression relationships between organs in a population could aid in understanding cross-organ signaling. Here, we surveyed gene-gene correlation structure across 18 metabolic tissues in 310 human individuals and 7 tissues in 103 diverse strains of mice fed a normal chow or high-fat/high-sucrose (HFHS) diet. Variation of genes such as FGF21, ADIPOQ, GCG, and IL6 showed enrichments which recapitulate experimental observations. Further, similar analyses were applied to explore both within-tissue signaling mechanisms (liver PCSK9) and genes encoding enzymes producing metabolites (adipose PNPLA2), where inter-individual correlation structure aligned with known roles for these critical metabolic pathways. Examination of sex hormone receptor correlations in mice highlighted the difference of tissue-specific variation in relationships with metabolic traits. We refer to this resource as gene-derived correlations across tissues (GD-CAT) where all tools and data are built into a web portal enabling users to perform these analyses without a single line of code (gdcat.org). This resource enables querying of any gene in any tissue to find correlated patterns of genes, cell types, pathways, and network architectures across metabolic organs.

Research organism: Human

Introduction

Interaction and/or coordination between organs is central to maintaining physiologic homeostasis among multicellular organisms. Beginning with the discovery of insulin over a century ago, characterization of molecules responsible for signal between tissues has required careful and elegant experimentation where these observations have been integral to deciphering physiology and disease. Further, actions of these molecules have been the key focus for development of potent therapeutics. For example, physiologic dissection of the actions of soluble proteins such as proprotein convertase subtilisin/kexin type 9 (PCSK9) and glucagon-like peptide 1 (GLP1) have yielded among the most promising therapeutics to treat cardiovascular disease and obesity, respectively (Drucker, 2022; Trapp and Stanford, 2022; Dadu and Ballantyne, 2014; Lambert et al., 2012). A majority of our understanding in how organs and cells utilize these mechanisms of tissue communication has arisen from elegant biochemical and physiologic experimentation. While these targeted investigations exist as the most definitive way to demonstrate causality for mechanisms, scaling such approaches to deconvolute the actions of tens of thousands of unique molecules which circulate in the blood becomes an impossible task. A major obstacle in the characterization of such soluble factors is that defining their tissues and pathways of action requires extensive experimental testing in cells and animal models.

Recent technological advances have enabled more unbiased views of molecules in circulation. Next-generation technologies have quantified thousands of factors in the blood across large populations. For example, large-scale proteomic measures have prioritized disease biomarkers and suggested involvements in genome-wide association mechanisms (Anderson and Anderson, 2002; Ferkingstad et al., 2021; Sun et al., 2018). Similar studies focused on integrating genetic variation with metabolomics quantification have yielded similar insights (Nicholson et al., 2011; Kettunen et al., 2016; Harshfield et al., 2021). However, the challenge is understanding which organs are secreting these molecules, how fast they are produced/degraded, and also what are the recipient tissues processing and/or responding to these factors. Furthermore, it is important to also identify the receptors that sense the secreted factor and enable the target organ to respond. This is challenging because the abundance of secreted factors and target receptors are dynamic, and rapidly change throughout the day or in response to a variety of environmental changes (e.g. diet or time of day). In addition, it is well known that genetic- or sex-driven variation can also modulate endocrine signaling. Hence, the foundations of therapeutic discovery require a comprehensive understanding of the mechanisms of endocrine signaling and here lies massive potential and an unmet need.

Previous studies in mouse and human populations have demonstrated that, when sufficiently powered, several known and new mechanisms of organ communication can be identified through simple global analyses of gene-gene correlations (Seldin et al., 2018; Koplev et al., 2022; Seldin and Lusis, 2019; Cao et al., 2022; Velez et al., 2022). The intuition behind this approach is that correlations across tissues and individuals will show a relatively normal distribution and upper-limit skews reflecting highly significant relationships have the potential to capture direct signaling. In this study, we expanded on this intuition and tested the paralleled hypothesis that potential functions of signaling between tissues could be prioritized by focusing correlation analyses across individuals on specific genes. We highlight several areas where this approach was sufficient, as well as lacking in ability to recapitulate known tissue communication mechanisms. These analyses are contextualized by additional explorations of pathway-specific relationships (e.g. between go terms) and an example for context-specific gene-trait relationships for hormone receptors. In addition, we provide a user-friendly web tool to query these analyses in mouse and human population datasets at gdcat.org.

Results

Construction of a web tool to survey transcript correlations across tissues and individuals (GD-CAT)

Previous studies have established that ‘brute force’ analyses of correlation structure across tissues from population expression data can identify several known and new mechanisms of organ cross-talk. These were accomplished by surveying the global co-expression structure of all genes, where high correlation outliers highlighted proteins which elicit signaling (Seldin et al., 2018; Koplev et al., 2022; Seldin and Lusis, 2019; Cao et al., 2022; Velez et al., 2022). Following this intuition, we hypothesized that a paralleled but alternative approach to inter-individual correlation structure could be exploited to understand the functional consequences of specific genes. Our initial goal was to establish a user-friendly interface where all of these analyses and gene-centric queries could be performed without running any code. To accomplish this, we assembled a complete analysis pipeline (Figure 1A) as a shiny app and docker image hosted in a freely-available web address (gdcat.org). Here, users can readily-search gene correlation structure between individuals from filtered human (gene-by-tissue expression project [GTEx]) and mouse (hybrid mouse diversity panel [HMDP]) across tissues. GTEx is presently the most comprehensive pan-tissue dataset in humans (Battle et al., 2017), which was filtered for individuals where most metabolic tissues were sequenced (Velez et al., 2022). Collectively, this dataset contains 310 individuals, consisting of 210 male and 100 female (self-reported) subjects between the ages of 20–79. Data from the HMDP consisted of 96 diverse mouse strains fed a normal chow (5 tissues) or high-fat/high-sucrose (HFHS) diet (7 tissues) as well as carefully characterized clinical traits (Parks et al., 2015; Hui et al., 2015; Norheim et al., 2021; Org et al., 2015; Lusis et al., 2016; Bennett et al., 2010). Users first select a given species, followed by reported sex or diet (mouse) which loads the specified environment. Subsequent downstream analyses are then implemented accordingly from a specific gene in a given tissue. This selection prompts individual gene correlations across all other gene-tissue combinations using biweight midcorrelation (Langfelder and Horvath, 2008). From these charts, users are able to select a given tissue, where gene set enrichment analysis testing using clusterprofiler (Yu et al., 2012) and enrichR (Kuleshov et al., 2016) are applied to the correlated set of genes to determine the positively (activated) and negatively (suppressed) pathways which occur in each tissue. In addition to general queries of gene ~ gene correlation structure, comparison of expression changes is also visualized between age groups as well as reported sexes. In addition, we included the top cell-type abundance correlations with each gene. To compute cell abundance estimates from the same individuals, we used single-nucleus RNA-seq available from GTEx (Jones et al., 2022) and applied cellular deconvolution methods to the bulk RNA-seq (Danziger et al., 2019) (Materials and methods). Comparison of deconvolution methods (Danziger et al., 2019) showed that DeconRNA-Seq (Gong and Szustakowski, 2013) captured the most cell types within several tissues (Figure 1—figure supplements 13) and therefore was applied to all tissues where sn-RNA-seq was available. We note that visceral adipose, subcutaneous adipose, aortic artery, coronary artery, transverse colon, sigmoid colon, the heart left ventricle, the kidney cortex, liver, lung, skeletal muscle, spleen, and small intestine are the only tissues where sn-seq is available and not other tissues, such as brain, stomach, and thyroid.

Figure 1. Web tool overview and inter-individual correlation structure of established endocrine proteins.

(A) Web server structure for user-defined interactions, as well as server and shiny app implementation scheme for gene-derived correlations across tissues (GD-CAT). (B) All genes across the 18 metabolic tissues in 310 individuals were correlated with expression of ADIPOQ in subcutaneous adipose tissue, where a q-value cutoff of q<0.1 showed the strongest enrichments with subcutaneous and muscle gene expression (pie chart, left). Gene set enrichment analysis (GSEA) was performed using the bicor coefficient of all genes to ADIPOQ using gene ontology biological process annotations and network construction of top pathways using clusterprofiler, where pathways related to fatty acid oxidation were observed in adipose (left) and chemotaxis/ECM remodeling in skeletal muscle (right). (B–D) The same q-value binning, top within-tissue and top peripheral enrichments were applied to intestinal GCG (C), liver FGF21 (D), and muscle IL6 (E). For these analyses all 310 individuals (across both sexes) were used and q-value adjustments calculated using a Benjamini-Hochberg FDR adjustment.

Figure 1.

Figure 1—figure supplement 1. Performance across four methods of cell-type deconvolution where relative proportions of cells (y-axis) are shown for all cell types annotated in single-cell reference (x-axis) in liver.

Figure 1—figure supplement 1.

Figure 1—figure supplement 2. Performance across four methods of cell-type deconvolution where relative proportions of cells (y-axis) are shown for all cell types annotated in single-cell reference (x-axis) in heart.

Figure 1—figure supplement 2.

Figure 1—figure supplement 3. Performance across four methods of cell-type deconvolution where relative proportions of cells (y-axis) are shown for all cell types annotated in single-cell reference (x-axis) in skeletal muscle.

Figure 1—figure supplement 3.

Figure 1—figure supplement 4. Pancreatic INS expression correlations across tissues in gene-by-tissue expression project (GTEx) were binned according to q<0.1 (top) and corresponding pancreatic gene set enrichment analysis (GSEA) network graph is shown (bottom).

Figure 1—figure supplement 4.

We initially examined pan-tissue transcript correlation structures for several well-established mechanisms of tissue cross-talk via secreted proteins which contribute to metabolic homeostasis. Here, binning of the significant tissues and pathways related to each of these established secreted proteins resembled their known mechanisms of action (Figure 1B–E). For example, variation with subcutaneous adipose expression of ADIPOQ was enriched with genes in several metabolic tissues where it has been known to act (Figure 1B, left). In particular, subcutaneous adipose ADIPOQ expression correlated with fatty acid oxidative process within adipose (Figure 1B, middle) and was enriched with ECM, chemotaxis, and ribosomal biogenesis in skeletal muscle (Figure 1B, right). These correlated pathways align with the established physiologic roles of the protein in that fat-secreted adiponectin when oxidation is stimulated (da Silva Rosa et al., 2021; Straub and Scherer, 2019) and muscle is a major site of action (Ruan and Dong, 2016). Beyond adiponectin, inter-individual correlation structure additionally recapitulated broad signaling mechanisms for other relevant endocrine proteins. For example, intestinal GCG (encoding GLP1, Figure 1C), liver FGF21 (Figure 1D), and skeletal muscle IL6 (Figure 1E) showed binning patterns and pathway enrichments related to their known functions in pancreas (Drucker, 2022; McLean et al., 2021), adipose tissue (Fisher and Maratos-Flier, 2016; Flippo and Potthoff, 2021), and other metabolic organs (Pedersen and Febbraio, 2008), respectively. These analyses and web tool show some examples of exploring transcriptional correlation structure to confirm and identify mechanisms of signaling, where we note that additional limitations should be considered.

Pathway-based examination of gene correlation structure and significance thresholds across tissues

While the select observations shown in Figure 1 provide examples of support in exploring correlation structure of genes across inter-individual differences to investigate endocrinology, several limitations in these analyses should be considered. First, an additional explanation for a given gene showing strong correlation between the tissues could arise from a general pattern of correlation between the two tissues and not necessarily due to the discrete signaling mechanisms. In previous studies surveying correlation structure and network model architectures in the HMDP and STARNET populations, genes appeared generally stronger correlated between liver and adipose tissue compared to all other organ combinations explored (Seldin et al., 2018; Koplev et al., 2022; Seldin and Lusis, 2019). To investigate this global pattern of gene correlation structure between metabolic organs, we selected key gene ontology (GO) terms, KEGG pathways, and randomly sampled equal numbers of genes and evaluated relative significance of inter-tissue correlations across multiple statistical thresholds. These analyses suggested that usage of empirical Student’s correlation p-values recapitulated a clear pattern of inter-tissue correlations between pathways (Figure 2). For example, comparison of the number of genes achieving significance of correlation between tissues among select GO terms revealed that tissues such as adipose and muscle appeared more correlated than spleen and other tissues at p-values less than 1e-3 (Figure 2A, left column). These global patterns of gene correlation between tissues among select pathways were reduced when the p-value threshold was lowered to 1e-6 (Figure 2A, middle column) or q-value adjustments (Materials and methods) were performed (Figure 2A, right two columns). For these reasons, only q-value adjusted value was used and implemented into pie charts providing the tissue-specific occurrences of correlated genes at three thresholds (q<0.1, q<0.01, q<0.001) within the web tool. Next, in order to further evaluate these global patterns of innate transcript correlation structure and determine whether they reflected concordance between known metabolic pathways or innate to the dataset used, tissues were rank-ordered by the number of genes which meet p-value thresholds and compared to randomly sampled genes of similar pathway sized (Figure 2B). Among KEGG pathways selects (hsa04062 − chemokine signaling pathway, hsa04640 − hematopoietic cell lineage, and hsa00190 − oxidative phosphorylation), the top-ranked organs by correlated gene numbers differed (skeletal muscle, colon, and thyroid, respectively); however, a general trend of specific tissues ranking higher than others were observed (Figure 2B). For example, skeletal muscle and heart appeared among the strongest correlated across pathways and organs, compared to kidney cortex and spleen which were observed to rank among the lowest (Figure 2B, pathways). We note that when the same analysis was performed on randomly sampled genes from each organ consisting of the same number as genes within each KEGG pathway, these rankings and number of significant correlating genes were no longer observed (Figure 2B, random genes), suggesting that in certain instances differences between organs in general connectivity to others might reflect concordance between known pathways. It is important to consider here that for the organs ranking lower, the lack of relative correlating numbers is likely due to sparsity of available data and not necessarily general patterns of gene correlation. This point is supported by the fact that among the lowest-ranked 33% of tissues across pathways, we observed a significant negative overall correlation (bicor=–0.45, p-value=2.3e-5) between number of NA values per individual and the gene count for significance shown in Figure 2B. This negative correlation between missing data and number of significant correlations for pathways across tissues was not observed when binning the top 33% (bicor=0.09, p-value=0.42) or middle 33% (bicor=–0.12, p-value=0.27) of organs. Collectively, these analyses show that innate correlation structures exist between organs which differ depending on pathways investigated and that tissues which don’t show broad correlation structure could potentially be attributed to areas of missing data among GTEx.

Figure 2. Tissue-specific contributions to pan-organ gene-gene correlation structure.

Figure 2.

(A) Heatmap showing all the number of gene-gene correlations across tissues which achieve significance relative to total number of genes in each pathway at biweigth midcorrelation Student’s p-value <1e-3 (left column), p-value <1e-6 (left middle column) of BH-corrected q-value <0.1 (right middle column) or BH-corrected q-value <0.01 (right column). Within-tissue correlations are omitted from this analysis. (B–D) Genes corresponding to each KEGG pathway shown were correlated both within and across all other organs where the number of genes which meet each Student’s p-value threshold are shown (y-axis). Tissues (x-axis) are rank-ordered by the number of genes which correlate for hsa04062 − chemokine signaling pathway at p-value <0.01 and shown for other KEGG terms, hsa04640 − hematopoietic cell lineage (C) and hsa00190 − oxidative phosphorylation (D) and additionally p-value <1e-4 (right side).

PCSK9 signaling and lipid exchange between adipose and muscle apparent in simple network models of correlation structure

Next, we wanted to ask whether our approach of analyzing inter-individual correlation structure across tissue for endocrine proteins was also sufficient to define within-tissue signaling mechanisms or actions of enzymes producing metabolites that signal across organs. Dissimilar to the cross-tissue distributions of significance in Figure 1, the same analysis of liver PCSK9 highlighted exclusively liver genes which were varied together (Figure 3A), in particular those involved in cholesterol metabolism/homeostasis (Figure 3B). Consistent with the established role for PCSK9 as a primary degradation mechanism of LDLR (Lambert et al., 2012; Peterson et al., 2008), network model construction of correlated genes highlighted the gene as a central node linking cholesterol biosynthetic pathways with those involved in other metabolic pathways such as insulin signaling (Figure 3C). Given that organ signaling via metabolites comprises many critical processes among multicellular organisms, our next goal was to apply this gene-centric analyses to established mechanisms of metabolite signaling. The gene PNPLA2 encodes adipose triglyceride lipase which localizes to lipid droplets and breaks down triglycerides for oxidation or mobilization as free fatty acids for peripheral tissues (Zechner et al., 2009). Variation in expression of PNPLA2 showed highly significant enrichments with beta oxidation pathways in adipose tissue (Figure 3D). Muscle pathways enriched for the gene were represented by sarcomere organization and muscle contraction (Figure 3F). Construction of an undirected network from these expression data placed the gene as a central node between the two tissues, linking regulators of adipose oxidation (Figure 3F, red) to muscle contractile process (Figure 3F, purple) where additional strongly co-correlated genes were implicated as additional candidates (Figure 3F). In sum, these analyses provide two examples of within-liver signaling via PCSK9 and adipose-muscle communication through PNPLA2 where the top-correlated genes and network models recapitulate known mechanisms. Given the utility of these undirected network models, a function in gene-derived correlations across tissues (GD-CAT) was added to enable users to generate network models for any gene-tissue combination and select parameters such as number of within-tissue and peripheral correlated genes to include.

Figure 3. Inter-individual transcript correlation structure and network architecture of liver PCSK9 and adipose PNPLA2.

Figure 3.

(A) Distribution of pan-tissue genes correlated with liver PCSK9 expression (q<0.1), where 93% of genes were within liver (purple). (B) Gene ontology (GO) (BP) overrepresentation test for the top 500 hepatic genes correlated with PCSK9 expression in liver. (C) Undirected network constructed from liver genes (aqua) correlated with PCSK9, where those annotated for ‘cholesterol biosynthetic process’ are colored in red. (D–E) Overrepresentation tests corresponding to the top-correlated genes with adipose (subcutaneous) PNPLA2 expression residing in adipose (D) or peripherally in skeletal muscle (E). (F) Undirected network constructed from the strongest correlated subcutaneous adipose tissue (light aqua) and muscle genes (light brown) with PNPLA2 (black), where genes corresponding to GO terms annotated as ‘fatty acid beta oxidation’ or ‘muscle contraction’ are colored purple or red, respectively. For these analyses all 310 individuals (across both sexes) were used and q-value adjustments calculated using a Benjamini-Hochberg FDR adjustment. Network graphs generated based in biweight midcorrelation coefficients, where edges are colored blue for positive correlations or red for negative correlations. Network edges represent positive (blue) and negative (red) correlations and the thicknesses are determined by coefficients. They are set for a range of bicor=0.6 (minimum to include) to bicor=0.99.

Inter-individual correlation analysis of HMDP highlights tissue- and diet-specific phenotype relationships with sex hormones

Genetic reference panels in model organisms, such as mice, present appeal in studying complex traits in that environmental conditions can be tightly controlled, tissues and invasive traits readily accessible, and the same, often renewable, genetic background can be studied and compared among multiple exposures such as diets or drug treatments (Lusis et al., 2016; Li and Auwerx, 2020; Andreux et al., 2012; Seldin et al., 2019). For this resource, we utilized data from the HMDP fed a normal chow (Lusis et al., 2016; Bennett et al., 2010) or HFHS diet for 8 weeks (Parks et al., 2015; Hui et al., 2015; Norheim et al., 2021; Org et al., 2015). While the number of tissues available was less than in GTEx, these panels allow for comparison of how gene correlations shift depending on diet. Therefore, queries of gene correlation in mice were segregated into either chow or HFHS diet and an additional panel to download a table or visualize the relationship between genes and clinical measures was added. The inferred abundances of cell types from each individual are correlated across user-defined genes, with the bicor coefficient plotted for each cell type.

One advantage of HMDP data compared to GTEx is the abundance of phenotypic measures available within each cohort. To show the utility of examining correlations within this reference panel, we selected sex hormone receptors androgen receptor (Ar), estrogen receptor alpha (Esr1), or estrogen receptor beta (Esr2) and binned the top 10 phenotypes which were correlated. These analyses were stratified based on where sex hormones were expressed (either liver or adipose tissue) or dietary regiment of the ~100 strains (normal chow or HFHS diet). This analysis demonstrated the difference in relationships between tissue location of sex hormone receptor and dietary context with metabolic traits. For example, expression of Ar in adipose tissue among HMDP mice fed an HFHS diet was negatively correlated with fat mass and body weight traits, whereas expression in liver oppositely correlated with the same traits in a positive direction (Figure 4A). The top traits which correlated also differed by tissue or expression for Ar, such as plasma lipid parameters in adipose tissue compared to blood cell traits in chow-fed mice (Figure 4A). We note that among the three hormone receptors investigated, Esr2 appeared the most consistently correlated between tissues and diets with metabolic traits (Figure 4B). Expression of Esr1 also showed a clear tissue and diet difference in the traits which were the most strongly co-regulated. Under HFHS dietary conditions, a negative correlation with insulin and fat pad weights were observed exclusively with adipose expression, while positive correlations with liver lipids were observed with expression in liver (Figure 4C). These analyses highlight how phenotype correlations in mouse populations can help to determine contexts relevant for gene regulation and point to the diversity of potential contexts relevant for sex hormone receptors in metabolic tissues.

Figure 4. Hybrid mouse diversity panel (HMDP) tissue- and diet-specific correlations of sex hormone receptors.

Figure 4.

The top 10 phenotypic traits are shown for correlations with expression of androgen receptor (A), estrogen receptor 1 (B), or estrogen receptor 2 (C), colored by direction in the hybrid mouse diversity panel. Positive correlations are shown in light blue and negative correlations as sunset orange, where phenotypes (y-axis) are ordered by significance (x-axis, -log10(p-value) of correlation). Correlations are segregated by whether sex hormone receptors are expressed by gonadal adipose tissue (left two columns) in ~100 HMDP strains fed a high-fat/high-sucrose (HFHS) diet (left), normal chow diet (left middle), or liver-expressed receptors fed an HFHS diet (right middle) or normal chow diet (right). Dashed lines show a Student’s correlation p-value (from bicor) of p=0.05 (purple) or p=0.001 (beige).

Discussion

Limitations and conclusions

Here, we provide a new resource to explore correlations across organ gene expression in the context of inter-individual differences. We highlight areas where these align with established and relevant mechanisms of physiology and suggest that similar explorations could be used as a discovery tool. Several key limitations should be considered when exploring GD-CAT for mechanisms of inter-tissue signaling though. Primarily, the fact that correlation-based analyses could reflect both causal and reactive patterns of variation. While several statistical methods such as mediation (Richiardi et al., 2013; Zeng et al., 2021) and Mendelian randomization (Emdin et al., 2017; Sanderson et al., 2022) exist to further refine causal inferences, likely the only definitive method to distinguish is in carefully designed experimentation. Further, analyses of genetic correlation (e.g. correlations considering genetic loci to infer causality) also present appeal in refining some causal mechanisms. Correlation between molecular and phenotypic variables can occur for a variety of reasons, not just between their individual relationships, but often more broadly, from a variety of complex genetic and environmental factors. Further, many correlations tend to be dominated by genes expressed within the same organ. This could be due to the fact that, within-tissue correlations could capture both the pathways regulating expression of a gene and potential consequences of changes in expression/function, and distinguishing between the two presents a significant challenge. For example, a GD-CAT query of insulin (INS) expression in pancreas shows exclusive enrichments in pancreas and corresponding pathway terms reflect regulatory mechanisms such as secretion and ion transport (Figure 1—figure supplement 4). Representation of given genes may also differ significantly depending on the dataset used. For example, queries of other tissues correlated with the critical X inactive specific transcript (XIST), in liver show no significant correlations at qvalue cutoffs used. This is due to the fact that the gene operates in a sex-dependent manner, where females are significantly less represented in GTEx and liver exists as a sparser tissue compared to others (Figure 2). In addition, the analyses presented are derived from differences in gene expression across individuals which arise from complex interaction of genetic and environmental variables. Expression of a gene and its corresponding protein can show substantial discordances depending on the dataset used. These have been discussed in detail (Liu et al., 2016; Maier et al., 2009; Buccitelli and Selbach, 2020), but ranges of co-correlation can vary widely depending on the datasets used and approaches taken. We note that for genes encoding proteins where actions from acute secretion grossly outweigh patterns of gene expression, such as insulin, caution should be taken when interpreting results. As the depth and availability of tissue-specific proteomic levels across diverse individuals continues to increase, an exciting opportunity is presented to explore the applicability of these analyses and identify areas when gene expression is not a sufficient measure. For example, mass-spec proteomics was recently performed on GTEx (Jiang et al., 2020); however, given that these data represent 6 individuals, analyses utilizing well-powered inter-individual correlations such as ours which contain 310 individuals remain limited in applications.

The queries provided in GD-CAT use fairly simple linear models to infer organ-organ signaling; however, more sophisticated methods can also be applied in an informative fashion. For example, Koplev et al. generated co-expression modules from nine tissues in the STARNET dataset, where construction of a massive Bayesian network uncovered interactions between correlated modules (Koplev et al., 2022). These approaches expanded on analysis of STAGE data to construct network models using WGCNA across tissues and relating these resulting eigenvectors to outcomes (Talukdar et al., 2016). The generalized approach of constructing cross-tissue gene regulatory modules presents appeal in that genes are able to be viewed in the context of a network with respect to all other gene-tissue combinations. In searching through these types of expanded networks, individuals can identify where the most compelling global relationships occur. One challenge with this type of approach, however, is that co-regulated pathways and module members are highly subjective to parameters used to construct GRNs (e.g. reassignment threshold in WGCNA) and can be difficult in arriving at a ‘ground truth’ for parameter selection. We note that the WGCNA package is also implemented in these analyses, but solely to perform gene-focused correlations using biweight midcorrelation to limit outlier inflation. While the midweight bicorrelation approach to calculate correlations could also be replaced with more sophisticated models, one consideration would be a concern of overfitting models and thus, biasing outcomes.

In another notable example MultiCens was developed as a tool to uncover communication between genes and tissues and applied to suggest central processes which exist in multi-layered data relevant for Alzheimer’s disease (Kumar et al., 2022). In addition, Jadhav and colleagues adopted a machine learning approach to mine published literature for relationships between hormones and genes (Jadhav et al., 2022). Further, association mapping of plasma proteomics data has been extensively applied and intersection with genome-wide association disease loci has offered intriguing potential disease mechanisms (Ferkingstad et al., 2021; Suhre et al., 2021). Another common application to single-cell sequencing data is to search for overrepresentation of known ligand-receptor pairs between cell types (Armingol et al., 2021). These and additional applications to explore tissue communication/coordination present unique strengths and caveats, depending on the specific usage desired. Regardless of methods used to decipher, one important limitation to consider in all these analyses is the nature of underlying data. For example, our evaluation of GTEx data structure suggested that important organs such as spleen and kidney were insufficient due to availability in matching expression data between individuals. Further, GTEx sample varies as to the collection times, sample processing times, and other important parameters such as cause of death. Mouse population data such as the HMDP or BxD cohorts offer appeal in these regards, as environmental conditions and collection times are easily fixed. Regardless, careful consideration of how data was generated and normalized are fundamental to interpreting results.

In sum, we demonstrate that adopting a gene-centric approach to surveying correlation structure of transcripts across organs and individuals can inform mechanism of coordination between metabolic tissues. Initially, we queried several well-established and key mediators of physiologic homeostasis, such as FGF21, GCG, and PCSK9. These approaches are further suggested to be applicable to mechanisms of metabolite signaling, as evident by pan-tissue investigation of adipose PNPLA2. Exploration of HMDP data highlighted the diverse phenotype correlations depending on tissue and diet for sex hormone receptors. To facilitate widespread access and use of this transcript isoform-centric analysis of inter-individual correlations, a full suite of analyses such as those performed here can be performed from a lab-hosted server (gdcat.org) or in isolation from a shiny app or docker image.

Materials and methods

Availability of web tool and analyses

All analyses, datasets, and scripts used to generate the associated web tool (GD-CAT) can be accessed via GitHub (copy archived at Zhou and Seldin, 2023) or within the associated docker image. In addition, access to the GD-CAT web tool is also available through the web portal gdcat.org. This portal was created to provide a user-friendly interface for accessing and using the GD-CAT tool without the need to download or install any software or packages. Users can simply visit the website, process data, and start using the tool. Corresponding tutorial and the other resources were made available to facilitate the utilization of the web tool on GitHub. The interface and server of the web were built and linked based on the shiny package using R (v. 4.2.0). Shiny package provides a powerful tool for building interactive web applications using R, allowing for fast and flexible development of custom applications with minimal coding required.

Pathway-specific gene correlations across tissues

Detailed scripts and analyses for pathway-specific investigations across tissues in Figure 2 are provided in GitHub (copy archived at Tamburini, 2023). Briefly, to interrogate broad tissue correlation structure, the number of genes which passed each biweight midcorrelation p-value cutoff is shown normalized to the total number of genes corresponding to that pathway term. Pathways were selected by accessing all available GO annotations for all genes using the Universal Protein Resource (The UniProt Consortium, 2017) and subsetting genes where a given term is listed. To determine which tissues show the most co-correlation across genes and organs, KEGG terms shown were selected and each corresponding gene-tissue combinations were correlated. Tissues were then binned individually by the number of significant correlations which were observed across organs among each selected KEGG pathway at indicated correlation p-values. Rank-ordering on the figure was shown by chemokine signaling at p<0.01 and each term was compared to a randomly sampled set of genes corresponding to the same number contained in each pathway.

Data sources and availability

All human data used in this study can be immediately accessed via web tool or docker to facilitate analysis. Metabolic tissue data was accessed through GTEx V8 downloads portal on August 18, 2021, and previously described (Velez et al., 2022; Battle et al., 2017). These raw data can also be readily accessed from the associated R-based walkthrough via GitHub (copy archived at Velez, 2022). Briefly, these data were filtered to retain genes which were detected across tissues where individuals were required to show counts >0 across all data. Given that our goal was to look across tissues at enrichments, this was done to limit spurious influence of genes only expressed in specific tissues in specific individuals. HMDP data was collected from previously described studies (Parks et al., 2015; Lusis et al., 2016; Bennett et al., 2010; Seldin et al., 2019) and inter-individual differences were compared at the strain level to maximize possible comparisons between historical data.

Correlation analyses across tissues

Biweight midcorrelation coefficients and corresponding p-values within and across tissues were generated using WGCNA bicorandpvalue() function (Langfelder and Horvath, 2008). We note that while the WGCNA package was used to calculate coefficients and corresponding Student’s p-values, this generalized framework does not utilize any module generation. Associated q-value adjustments were applied using the Benjamini-Hochberg FDR from the R package ‘stats’. The BH procedure was selected instead of other FDR control methods because of its efficiency in CPU usage on the hosted server.

Pathway enrichment analyses

Pathway enrichments were generated using gene set enrichment analyses available from the R package clusterProfiler. Specifically, the bicor coefficients were used as the rank weight of each gene and enrichment tests performed by permuting against the human or mouse reference transcriptome. Terms used for the enrichment analyses were derived from GO (biological process, cellular component, and molecular function) which were accessed using the R package enrichR. For this analysis and on the available app, input genes were determined at indicated q-value threshold.

Deconvolution of bulk tissue seq data on web tool

All scripts and deconvolution data produced are available at GitHub (copy archived at Van, 2022). Briefly, sn-RNA-seq data was accessed from the Human Cell Atlas (Jones et al., 2022) for matching organ datasets with metabolic tissues. From these data, four deconvolution methods were applied using ADAPTS (Danziger et al., 2019) where DeconRNA-Seq (Gong and Szustakowski, 2013) was selected for its ability to capture the abundances of the most cell types across tissues such as liver, heart, and skeletal muscle (Figure 1—figure supplements 13). The full combined matrix was assembled for DeconRNA-Seq results across individuals in GTEx where correlations between cell types and genes were performed also using the bicorandpvalue() in WGCNA (Langfelder and Horvath, 2008).

Acknowledgements

We acknowledge the following funding sources for supporting these studies: MZ, CNF, IT, CMN, LMV, CV, CJ, and MMS were supported by NIH grants HL138193, DK130640, and DK097771. ALH is supported by NIH grants U54 DK120342, R01 DK109724, and P30 DK063491. NP is supported by NIH grant R37 CA266042. HB was supported by National Research Foundation of Korea (2021R1A6A3A14039132). JL was supported by an NIH grant F31DK134173-01A1. CJ was supported by AASLD Foundation Pinnacle Research Award in Liver Disease, Edward Mallinckrodt, Jr. Foundation Award, and an NIH grant AA029124.

Funding Statement

The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.

Contributor Information

Marcus M Seldin, Email: mseldin@uci.edu.

Evan Graehl Williams, University of Luxembourg, Luxembourg.

David E James, University of Sydney, Australia.

Funding Information

This paper was supported by the following grants:

  • National Institutes of Health HL138193 to Mingqi Zhou, Carlos HV Nascimento-Filho, Ian Tamburini, Christy M Nguyen, Leandro M Velez, Cassandra Van, Cholsoon Jang, Marcus M Seldin.

  • National Institutes of Health DK130640 to Mingqi Zhou, Carlos HV Nascimento-Filho, Ian Tamburini, Christy M Nguyen, Leandro M Velez, Cassandra Van, Cholsoon Jang, Marcus M Seldin.

  • National Institutes of Health DK097771 to Mingqi Zhou, Carlos HV Nascimento-Filho, Ian Tamburini, Christy M Nguyen, Leandro M Velez, Cassandra Van, Cholsoon Jang, Marcus M Seldin.

  • National Institutes of Health DK120342 to Andrea L Hevener.

  • National Institutes of Health DK109724 to Andrea L Hevener.

  • National Institutes of Health DK063491 to Andrea L Hevener.

  • National Institutes of Health CA266042 to Nicholas Pannunzio.

  • National Research Foundation of Korea 2021R1A6A3A14039132 to Hosung Bae.

  • National Institutes of Health F31DK134173-01A1 to Johnny Le.

  • AASLD Foundation Pinnacle Research Award in Liver Disease to Cholsoon Jang.

  • Edward Mallinckrodt, Jr Foundation to Cholsoon Jang.

  • National Institutes of Health AA029124 to Cholsoon Jang.

Additional information

Competing interests

No competing interests declared.

Reviewing editor, eLife.

Author contributions

Resources, Data curation, Software, Formal analysis, Validation, Investigation, Methodology, Writing – original draft, Writing – review and editing.

Software, Formal analysis, Visualization, Methodology, Writing – review and editing.

Software, Formal analysis, Writing – original draft.

Formal analysis, Writing – original draft, Writing – review and editing.

Formal analysis, Investigation, Methodology, Writing – review and editing.

Resources, Software, Methodology, Project administration.

Formal analysis, Funding acquisition, Visualization.

Resources, Data curation, Validation.

Supervision, Visualization, Methodology.

Formal analysis, Investigation, Writing – review and editing.

Data curation, Formal analysis, Validation.

Resources, Software, Investigation.

Resources, Software, Formal analysis.

Resources, Software, Investigation.

Validation, Investigation, Writing – original draft.

Conceptualization, Funding acquisition, Validation.

Conceptualization, Funding acquisition, Validation.

Validation, Investigation, Project administration.

Validation, Visualization, Writing – original draft.

Conceptualization, Data curation, Supervision, Funding acquisition, Validation, Writing – original draft.

Conceptualization, Supervision, Funding acquisition, Writing – original draft.

Conceptualization, Funding acquisition, Validation, Writing – original draft.

Conceptualization, Data curation, Validation, Visualization, Writing – original draft.

Conceptualization, Visualization, Writing – original draft.

Conceptualization, Validation, Methodology.

Conceptualization, Resources, Data curation, Software, Formal analysis, Supervision, Funding acquisition, Validation, Investigation, Writing – original draft, Writing – review and editing.

Additional files

MDAR checklist

Data availability

All previously published datasets are listed below. Please see the Materials and methods for further details (section ‘Data sources and availability’).

The following previously published datasets were used:

Lonsdale J, Thomas J, Salvatore M, Phillips R, Lo E, Shad S, Hasz R, Walters G, Garcia F, Young N, Foster B, Moser M, Karasik E, Gillard B, Ramsey K, Sullivan S, Bridge J, Magazine H, Syron J, Fleming J, Siminoff L, Traino H, Mosavel M, Barker L, Jewell S, Rohrer D, Maxim D, Filkins D, Harbach P, Cortadillo E, Berghuis B, Turner L, Hudson E, Feenstra K, Sobin L, Robb J, Branton P, Korzeniewski G, Shive C, Tabor D, Qi L, Groch K, Nampally S, Buia S, Zimmerman A, Smith A, Burges R, Robinson K, Valentino K, Bradbury D, Cosentino M, Diaz-Mayoral N, Kennedy M, Engel T, Williams P, Erickson K, Ardlie K, Winckler W, Getz G, DeLuca D, MacArthur D, Kellis M, Thomson A, Young T, Gelfand E, Donovan M, Meng Y, Grant G, Mash D, Marcus Y, Basile M, Liu J, Zhu J, Tu Z, Cox NJ, Nicolae DL, Gamazon ER, Im HK, Konkashbaev A, Pritchard J, Stevens M, Flutre T, Wen X, Dermitzakis ET, Lappalainen T, Guigo R, Monlong J, Sammeth M, Koller D, Battle A, Mostafavi S, McCarthy M, Rivas M, Maller J, Rusyn I, Nobel A, Wright F, Shabalin A, Feolo M, Sharopova N, Sturcke A, Paschal J, Anderson JM, Wilder EL, Derr LK, Green ED, Struewing JP, Temple G, Volpi S, Boyer JT, Thomson EJ, Guyer MS, Ng C, Abdallah A, Colantuoni D, Insel TR, Koester SE, Little AR, Bender PK, Lehner T, Yao Y, Compton CC, Vaught JB, Sawyer S, Lockhart NC, Demchok J, Moore HF. 2013. Adult Genotype-Tissue Expression (GTEx) project. GTExPortal. adult-gtex

Farber CR, Bennett BJ, Orozco L, Eskin E, Lusis AJ. 2011. Bone mineral density in males of 96 strains of mice in the Hybrid Mouse Diversity Panel (HMDP) Mouse Phenome Database. HMDPpheno1

Ghazalpour A, Bennett BJ, Orozco L, Lusis AJ. 2014. Liver metabolite levels: Nucleotides and peptides in males of 104 strains of the Hybrid Mouse Diversity Panel (HMDP) Mouse Phenome Database. HMDPpheno10

Bennett BJ, Lusis AJ. 2015. Diet effects on aortic lesion size in 103 Hybrid Mouse Diversity Panel strains that are transgenic for human APOE*3Leiden and human CETP (F1 hybrids) Mouse Phenome Database. HMDPpheno11

Bennett BJ, Farber CR, Orozco L, Eskin E, Lusis AJ. 2010. Plasma lipids and body composition in males of 99 strains of mice in the Hybrid Mouse Diversity Panel (HMDP) Mouse Phenome Database. HMDPpheno2

Korshunov VA. 2012. Heart rate and blood pressure in males of 58 strains of mice. Mouse Phenome Database. HMDPpheno3

Park CC, Gale GD, Lusis AJ, Smith DJ. 2011. Fear conditioning in males of 94 strains of mice in the Hybrid Mouse Diversity Panel (HMDP) Mouse Phenome Database. HMDPpheno4

Davis RC, van Nas A, Bennett BJ, Eskin E, Lusis AJ. 2013. Hematological parameters in males of 83 strains of mice in the Hybrid Mouse Diversity Panel (HMDP) Mouse Phenome Database. HMDPpheno5

Ghazalpour A, Bennett BJ, Orozco L, Lusis AJ. 2014. Liver metabolite levels: Amino acids in males of 104 strains of the Hybrid Mouse Diversity Panel (HMDP) Mouse Phenome Database. HMDPpheno6

Ghazalpour A, Bennett BJ, Orozco L, Lusis AJ. 2014. Liver metabolite levels: Carbohydrates, cofactors and vitamins, energy, and xenobiotics in males of 104 strains of the Hybrid Mouse Diversity Panel (HMDP) Mouse Phenome Database. HMDPpheno7

Ghazalpour A, Bennett BJ, Orozco L, Lusis AJ. 2014. Liver metabolite levels: Lipids (except lysolipids) in males of 104 strains of the Hybrid Mouse Diversity Panel (HMDP) Mouse Phenome Database. HMDPpheno8

Ghazalpour A, Bennett BJ, Orozco L, Lusis AJ. 2014. Liver metabolite levels: Lipids (lysolipids) in males of 104 strains of the Hybrid Mouse Diversity Panel (HMDP) Mouse Phenome Database. HMDPpheno9

References

  1. Anderson NL, Anderson NG. The Human Plasma Proteome. Molecular & Cellular Proteomics. 2002;1:845–867. doi: 10.1074/mcp.R200007-MCP200. [DOI] [PubMed] [Google Scholar]
  2. Andreux PA, Williams EG, Koutnikova H, Houtkooper RH, Champy MF, Henry H, Schoonjans K, Williams RW, Auwerx J. Systems genetics of metabolism: the use of the BXD murine reference panel for multiscalar integration of traits. Cell. 2012;150:1287–1299. doi: 10.1016/j.cell.2012.08.012. [DOI] [PMC free article] [PubMed] [Google Scholar]
  3. Armingol E, Officer A, Harismendy O, Lewis NE. Deciphering cell-cell interactions and communication from gene expression. Nature Reviews. Genetics. 2021;22:71–88. doi: 10.1038/s41576-020-00292-x. [DOI] [PMC free article] [PubMed] [Google Scholar]
  4. Battle A, Brown CD, Engelhardt BE, Montgomery SB, GTEx Consortium. Laboratory, Data Analysis &Coordinating Center (LDACC)—Analysis Working Group. Statistical Methods groups—Analysis Working Group. Enhancing GTEx (eGTEx) groups. NIH Common Fund. NIH/NCI. NIH/NHGRI. NIH/NIMH. NIH/NIDA. Biospecimen Collection Source Site—NDRI. Biospecimen Collection Source Site—RPCI. Biospecimen Core Resource—VARI. Brain Bank Repository—University of Miami Brain Endowment Bank. Leidos Biomedical—Project Management. ELSI Study. Genome Browser Data Integration &Visualization—EBI. Genome Browser Data Integration &Visualization—UCSC Genomics Institute, University of California Santa Cruz. Lead analysts. Laboratory, Data Analysis &Coordinating Center (LDACC) NIH program management. Biospecimen collection. Pathology. eQTL manuscript working group Genetic effects on gene expression across human tissues. Nature. 2017;550:204–213. doi: 10.1038/nature24277. [DOI] [PMC free article] [PubMed] [Google Scholar]
  5. Bennett BJ, Farber CR, Orozco L, Kang HM, Ghazalpour A, Siemers N, Neubauer M, Neuhaus I, Yordanova R, Guan B, Truong A, Yang W, He A, Kayne P, Gargalovic P, Kirchgessner T, Pan C, Castellani LW, Kostem E, Furlotte N, Drake TA, Eskin E, Lusis AJ. A high-resolution association mapping panel for the dissection of complex traits in mice. Genome Research. 2010;20:281–290. doi: 10.1101/gr.099234.109. [DOI] [PMC free article] [PubMed] [Google Scholar]
  6. Buccitelli C, Selbach M. mRNAs, proteins and the emerging principles of gene expression control. Nature Reviews. Genetics. 2020;21:630–644. doi: 10.1038/s41576-020-0258-4. [DOI] [PubMed] [Google Scholar]
  7. Cao Y, Wang Y, Zhou Z, Pan C, Jiang L, Zhou Z, Meng Y, Charugundla S, Li T, Allayee H, Seldin MM, Lusis AJ. Liver-heart cross-talk mediated by coagulation factor XI protects against heart failure. Science. 2022;377:1399–1406. doi: 10.1126/science.abn0910. [DOI] [PMC free article] [PubMed] [Google Scholar]
  8. Dadu RT, Ballantyne CM. Lipid lowering with PCSK9 inhibitors. Nature Reviews. Cardiology. 2014;11:563–575. doi: 10.1038/nrcardio.2014.84. [DOI] [PubMed] [Google Scholar]
  9. Danziger SA, Gibbs DL, Shmulevich I, McConnell M, Trotter MWB, Schmitz F, Reiss DJ, Ratushny AV. ADAPTS: Automated deconvolution augmentation of profiles for tissue specific cells. PLOS ONE. 2019;14:e0224693. doi: 10.1371/journal.pone.0224693. [DOI] [PMC free article] [PubMed] [Google Scholar]
  10. da Silva Rosa SC, Liu M, Sweeney G. Adiponectin Synthesis, Secretion and Extravasation from Circulation to Interstitial Space. Physiology. 2021;36:134–149. doi: 10.1152/physiol.00031.2020. [DOI] [PMC free article] [PubMed] [Google Scholar]
  11. Drucker DJ. GLP-1 physiology informs the pharmacotherapy of obesity. Molecular Metabolism. 2022;57:101351. doi: 10.1016/j.molmet.2021.101351. [DOI] [PMC free article] [PubMed] [Google Scholar]
  12. Emdin CA, Khera AV, Kathiresan S. Mendelian Randomization. JAMA. 2017;318:1925–1926. doi: 10.1001/jama.2017.17219. [DOI] [PubMed] [Google Scholar]
  13. Ferkingstad E, Sulem P, Atlason BA, Sveinbjornsson G, Magnusson MI, Styrmisdottir EL, Gunnarsdottir K, Helgason A, Oddsson A, Halldorsson BV, Jensson BO, Zink F, Halldorsson GH, Masson G, Arnadottir GA, Katrinardottir H, Juliusson K, Magnusson MK, Magnusson OT, Fridriksdottir R, Saevarsdottir S, Gudjonsson SA, Stacey SN, Rognvaldsson S, Eiriksdottir T, Olafsdottir TA, Steinthorsdottir V, Tragante V, Ulfarsson MO, Stefansson H, Jonsdottir I, Holm H, Rafnar T, Melsted P, Saemundsdottir J, Norddahl GL, Lund SH, Gudbjartsson DF, Thorsteinsdottir U, Stefansson K. Large-scale integration of the plasma proteome with genetics and disease. Nature Genetics. 2021;53:1712–1721. doi: 10.1038/s41588-021-00978-w. [DOI] [PubMed] [Google Scholar]
  14. Fisher FM, Maratos-Flier E. Understanding the Physiology of FGF21. Annual Review of Physiology. 2016;78:223–241. doi: 10.1146/annurev-physiol-021115-105339. [DOI] [PubMed] [Google Scholar]
  15. Flippo KH, Potthoff MJ. Metabolic Messengers: FGF21. Nature Metabolism. 2021;3:309–317. doi: 10.1038/s42255-021-00354-2. [DOI] [PMC free article] [PubMed] [Google Scholar]
  16. Gong T, Szustakowski JD. DeconRNASeq: a statistical framework for deconvolution of heterogeneous tissue samples based on mRNA-Seq data. Bioinformatics. 2013;29:1083–1085. doi: 10.1093/bioinformatics/btt090. [DOI] [PubMed] [Google Scholar]
  17. Harshfield EL, Fauman EB, Stacey D, Paul DS, Ziemek D, Ong RMY, Danesh J, Butterworth AS, Rasheed A, Sattar T, Saleem I, Hina Z, Ishtiaq U, Qamar N, Mallick NH, Yaqub Z, Saghir T, Rizvi SNH, Memon A, Ishaq M, Rasheed SZ, Memon FUR, Jalal A, Abbas S, Frossard P, Saleheen D, Wood AM, Griffin JL, Koulman A. Genome-wide analysis of blood lipid metabolites in over 5000 South Asians reveals biological insights at cardiometabolic disease loci. BMC Medicine. 2021;19:232. doi: 10.1186/s12916-021-02087-1. [DOI] [PMC free article] [PubMed] [Google Scholar]
  18. Hui ST, Parks BW, Org E, Norheim F, Che N, Pan C, Castellani LW, Charugundla S, Dirks DL, Psychogios N, Neuhaus I, Gerszten RE, Kirchgessner T, Gargalovic PS, Lusis AJ. The genetic architecture of NAFLD among inbred strains of mice. eLife. 2015;4:e05607. doi: 10.7554/eLife.05607. [DOI] [PMC free article] [PubMed] [Google Scholar]
  19. Jadhav A, Kumar T, Raghavendra M, Loganathan T, Narayanan M. Predicting cross-tissue hormone-gene relations using balanced word embeddings. Bioinformatics. 2022;38:4771–4781. doi: 10.1093/bioinformatics/btac578. [DOI] [PMC free article] [PubMed] [Google Scholar]
  20. Jiang L, Wang M, Lin S, Jian R, Li X, Chan J, Dong G, Fang H, Robinson AE, Snyder MP, GTEx Consortium A quantitative proteome Map of the Human Body. Cell. 2020;183:269–283. doi: 10.1016/j.cell.2020.08.036. [DOI] [PMC free article] [PubMed] [Google Scholar]
  21. Jones RC, Karkanias J, Krasnow MA, Pisco AO, Quake SR, Salzman J, Yosef N, Bulthaup B, Brown P, Harper W, Hemenez M, Ponnusamy R, Salehi A, Sanagavarapu BA, Spallino E, Aaron KA, Concepcion W, Gardner JM, Kelly B, Neidlinger N, Wang Z, Crasta S, Kolluru S, Morri M, Pisco AO, Tan SY, Travaglini KJ, Xu C, Alcántara-Hernández M, Almanzar N, Antony J, Beyersdorf B, Burhan D, Calcuttawala K, Carter MM, Chan CKF, Chang CA, Chang S, Colville A, Crasta S, Culver RN, Cvijović I, D’Amato G, Ezran C, Galdos FX, Gillich A, Goodyer WR, Hang Y, Hayashi A, Houshdaran S, Huang X, Irwin JC, Jang S, Juanico JV, Kershner AM, Kim S, Kiss B, Kolluru S, Kong W, Kumar ME, Kuo AH, Leylek R, Li B, Loeb GB, Lu W-J, Mantri S, Markovic M, McAlpine PL, de Morree A, Morri M, Mrouj K, Mukherjee S, Muser T, Neuhöfer P, Nguyen TD, Perez K, Phansalkar R, Pisco AO, Puluca N, Qi Z, Rao P, Raquer-McKay H, Schaum N, Scott B, Seddighzadeh B, Segal J, Sen S, Sikandar S, Spencer SP, Steffes LC, Subramaniam VR, Swarup A, Swift M, Travaglini KJ, Van Treuren W, Trimm E, Veizades S, Vijayakumar S, Vo KC, Vorperian SK, Wang W, Weinstein HNW, Winkler J, Wu TTH, Xie J, Yung AR, Zhang Y, Detweiler AM, Mekonen H, Neff NF, Sit RV, Tan M, Yan J, Bean GR, Charu V, Forgó E, Martin BA, Ozawa MG, Silva O, Tan SY, Toland A, Vemuri VNP, Afik S, Awayan K, Botvinnik OB, Byrne A, Chen M, Dehghannasiri R, Detweiler AM, Gayoso A, Granados AA, Li Q, Mahmoudabadi G, McGeever A, de Morree A, Olivieri JE, Park M, Pisco AO, Ravikumar N, Salzman J, Stanley G, Swift M, Tan M, Tan W, Tarashansky AJ, Vanheusden R, Vorperian SK, Wang P, Wang S, Xing G, Xu C, Yosef N, Alcántara-Hernández M, Antony J, Chan CKF, Chang CA, Colville A, Crasta S, Culver R, Dethlefsen L, Ezran C, Gillich A, Hang Y, Ho P-Y, Irwin JC, Jang S, Kershner AM, Kong W, Kumar ME, Kuo AH, Leylek R, Liu S, Loeb GB, Lu W-J, Maltzman JS, Metzger RJ, de Morree A, Neuhöfer P, Perez K, Phansalkar R, Qi Z, Rao P, Raquer-McKay H, Sasagawa K, Scott B, Sinha R, Song H, Spencer SP, Swarup A, Swift M, Travaglini KJ, Trimm E, Veizades S, Vijayakumar S, Wang B, Wang W, Winkler J, Xie J, Yung AR, Artandi SE, Beachy PA, Clarke MF, Giudice LC, Huang FW, Huang KC, Idoyaga J, Kim SK, Krasnow M, Kuo CS, Nguyen P, Quake SR, Rando TA, Red-Horse K, Reiter J, Relman DA, Sonnenburg JL, Wang B, Wu A, Wu SM, Wyss-Coray T, Tabula Sapiens Consortium* The Tabula Sapiens: A multiple-organ, single-cell transcriptomic atlas of humans. Science. 2022;376:eabl4896. doi: 10.1126/science.abl4896. [DOI] [PMC free article] [PubMed] [Google Scholar]
  22. Kettunen J, Demirkan A, Würtz P, Draisma HHM, Haller T, Rawal R, Vaarhorst A, Kangas AJ, Lyytikäinen LP, Pirinen M, Pool R, Sarin AP, Soininen P, Tukiainen T, Wang Q, Tiainen M, Tynkkynen T, Amin N, Zeller T, Beekman M, Deelen J, van Dijk KW, Esko T, Hottenga JJ, van Leeuwen EM, Lehtimäki T, Mihailov E, Rose RJ, de Craen AJM, Gieger C, Kähönen M, Perola M, Blankenberg S, Savolainen MJ, Verhoeven A, Viikari J, Willemsen G, Boomsma DI, van Duijn CM, Eriksson J, Jula A, Järvelin MR, Kaprio J, Metspalu A, Raitakari O, Salomaa V, Slagboom PE, Waldenberger M, Ripatti S, Ala-Korpela M. Genome-wide study for circulating metabolites identifies 62 loci and reveals novel systemic effects of LPA. Nature Communications. 2016;7:11122. doi: 10.1038/ncomms11122. [DOI] [PMC free article] [PubMed] [Google Scholar]
  23. Koplev S, Seldin M, Sukhavasi K, Ermel R, Pang S, Zeng L, Bankier S, Di Narzo A, Cheng H, Meda V, Ma A, Talukdar H, Cohain A, Amadori L, Argmann C, Houten SM, Franzén O, Mocci G, Meelu OA, Ishikawa K, Whatling C, Jain A, Jain RK, Gan LM, Giannarelli C, Roussos P, Hao K, Schunkert H, Michoel T, Ruusalepp A, Schadt EE, Kovacic JC, Lusis AJ, Björkegren JLM. A mechanistic framework for cardiometabolic and coronary artery diseases. Nature Cardiovascular Research. 2022;1:85–100. doi: 10.1038/s44161-021-00009-1. [DOI] [PMC free article] [PubMed] [Google Scholar]
  24. Kuleshov MV, Jones MR, Rouillard AD, Fernandez NF, Duan Q, Wang Z, Koplev S, Jenkins SL, Jagodnik KM, Lachmann A, McDermott MG, Monteiro CD, Gundersen GW, Ma’ayan A. Enrichr: a comprehensive gene set enrichment analysis web server 2016 update. Nucleic Acids Research. 2016;44:W90–W97. doi: 10.1093/nar/gkw377. [DOI] [PMC free article] [PubMed] [Google Scholar]
  25. Kumar T, Sethuraman R, Mitra S, Ravindran B, Narayanan M. MultiCens: Multilayer Network Centrality Measures to Uncover Molecular Mediators of Tissue-Tissue Communication. bioRxiv. 2022 doi: 10.1101/2022.05.15.492007. [DOI] [PMC free article] [PubMed]
  26. Lambert G, Sjouke B, Choque B, Kastelein JJP, Hovingh GK. The PCSK9 decade. Journal of Lipid Research. 2012;53:2515–2524. doi: 10.1194/jlr.R026658. [DOI] [PMC free article] [PubMed] [Google Scholar]
  27. Langfelder P, Horvath S. WGCNA: an R package for weighted correlation network analysis. BMC Bioinformatics. 2008;9:559. doi: 10.1186/1471-2105-9-559. [DOI] [PMC free article] [PubMed] [Google Scholar]
  28. Li H, Auwerx J. Mouse Systems Genetics as a Prelude to Precision Medicine. Trends in Genetics. 2020;36:259–272. doi: 10.1016/j.tig.2020.01.004. [DOI] [PMC free article] [PubMed] [Google Scholar]
  29. Liu Y, Beyer A, Aebersold R. On the dependency of Cellular Protein Levels on mRNA abundance. Cell. 2016;165:535–550. doi: 10.1016/j.cell.2016.03.014. [DOI] [PubMed] [Google Scholar]
  30. Lusis AJ, Seldin MM, Allayee H, Bennett BJ, Civelek M, Davis RC, Eskin E, Farber CR, Hui S, Mehrabian M, Norheim F, Pan C, Parks B, Rau CD, Smith DJ, Vallim T, Wang Y, Wang J. The Hybrid Mouse Diversity Panel: a resource for systems genetics analyses of metabolic and cardiovascular traits. Journal of Lipid Research. 2016;57:925–942. doi: 10.1194/jlr.R066944. [DOI] [PMC free article] [PubMed] [Google Scholar]
  31. Maier T, Güell M, Serrano L. Correlation of mRNA and protein in complex biological samples. FEBS Letters. 2009;583:3966–3973. doi: 10.1016/j.febslet.2009.10.036. [DOI] [PubMed] [Google Scholar]
  32. McLean BA, Wong CK, Campbell JE, Hodson DJ, Trapp S, Drucker DJ. Revisiting the complexity of GLP-1 action from sites of synthesis to receptor activation. Endocrine Reviews. 2021;42:101–132. doi: 10.1210/endrev/bnaa032. [DOI] [PMC free article] [PubMed] [Google Scholar]
  33. Nicholson G, Rantalainen M, Li JV, Maher AD, Malmodin D, Ahmadi KR, Faber JH, Barrett A, Min JL, Rayner NW, Toft H, Krestyaninova M, Viksna J, Neogi SG, Dumas ME, Sarkans U, Donnelly P, Illig T, Adamski J, Suhre K, Allen M, Zondervan KT, Spector TD, Nicholson JK, Lindon JC, Baunsgaard D, Holmes E, McCarthy MI, Holmes CC, MolPAGE Consortium A genome-wide metabolic QTL analysis in Europeans implicates two loci shaped by recent positive selection. PLOS Genetics. 2011;7:e1002270. doi: 10.1371/journal.pgen.1002270. [DOI] [PMC free article] [PubMed] [Google Scholar]
  34. Norheim F, Chella Krishnan K, Bjellaas T, Vergnes L, Pan C, Parks BW, Meng Y, Lang J, Ward JA, Reue K, Mehrabian M, Gundersen TE, Péterfy M, Dalen KT, Drevon CA, Hui ST, Lusis AJ, Seldin MM. Genetic regulation of liver lipids in a mouse model of insulin resistance and hepatic steatosis. Molecular Systems Biology. 2021;17:e9684. doi: 10.15252/msb.20209684. [DOI] [PMC free article] [PubMed] [Google Scholar]
  35. Org E, Parks BW, Joo JWJ, Emert B, Schwartzman W, Kang EY, Mehrabian M, Pan C, Knight R, Gunsalus R, Drake TA, Eskin E, Lusis AJ. Genetic and environmental control of host-gut microbiota interactions. Genome Research. 2015;25:1558–1569. doi: 10.1101/gr.194118.115. [DOI] [PMC free article] [PubMed] [Google Scholar]
  36. Parks BW, Sallam T, Mehrabian M, Psychogios N, Hui ST, Norheim F, Castellani LW, Rau CD, Pan C, Phun J, Zhou Z, Yang WP, Neuhaus I, Gargalovic PS, Kirchgessner TG, Graham M, Lee R, Tontonoz P, Gerszten RE, Hevener AL, Lusis AJ. Genetic architecture of insulin resistance in the mouse. Cell Metabolism. 2015;21:334–347. doi: 10.1016/j.cmet.2015.01.002. [DOI] [PMC free article] [PubMed] [Google Scholar]
  37. Pedersen BK, Febbraio MA. Muscle as an endocrine organ: focus on muscle-derived interleukin-6. Physiological Reviews. 2008;88:1379–1406. doi: 10.1152/physrev.90100.2007. [DOI] [PubMed] [Google Scholar]
  38. Peterson AS, Fong LG, Young SG. PCSK9 function and physiology. Journal of Lipid Research. 2008;49:1595–1599. doi: 10.1194/jlr.cx00001-jlr200. [DOI] [PMC free article] [PubMed] [Google Scholar]
  39. Richiardi L, Bellocco R, Zugna D. Mediation analysis in epidemiology: methods, interpretation and bias. International Journal of Epidemiology. 2013;42:1511–1519. doi: 10.1093/ije/dyt127. [DOI] [PubMed] [Google Scholar]
  40. Ruan H, Dong LQ. Adiponectin signaling and function in insulin target tissues. Journal of Molecular Cell Biology. 2016;8:101–109. doi: 10.1093/jmcb/mjw014. [DOI] [PMC free article] [PubMed] [Google Scholar]
  41. Sanderson E, Glymour MM, Holmes MV, Kang H, Morrison J, Munafò MR, Palmer T, Schooling CM, Wallace C, Zhao Q, Smith GD. Mendelian randomization. Nature Reviews. Methods Primers. 2022;2:6. doi: 10.1038/s43586-021-00092-5. [DOI] [PMC free article] [PubMed] [Google Scholar]
  42. Seldin MM, Koplev S, Rajbhandari P, Vergnes L, Rosenberg GM, Meng Y, Pan C, Phuong TMN, Gharakhanian R, Che N, Mäkinen S, Shih DM, Civelek M, Parks BW, Kim ED, Norheim F, Chella Krishnan K, Hasin-Brumshtein Y, Mehrabian M, Laakso M, Drevon CA, Koistinen HA, Tontonoz P, Reue K, Cantor RM, Björkegren JLM, Lusis AJ. A strategy for discovery of Endocrine interactions with application to whole-body Metabolism. Cell Metabolism. 2018;27:1138–1155. doi: 10.1016/j.cmet.2018.03.015. [DOI] [PMC free article] [PubMed] [Google Scholar]
  43. Seldin MM, Lusis AJ. Systems-based approaches for investigation of inter-tissue communication. Journal of Lipid Research. 2019;60:450–455. doi: 10.1194/jlr.S090316. [DOI] [PMC free article] [PubMed] [Google Scholar]
  44. Seldin M, Yang X, Lusis AJ. Systems genetics applications in metabolism research. Nature Metabolism. 2019;1:1038–1050. doi: 10.1038/s42255-019-0132-x. [DOI] [PMC free article] [PubMed] [Google Scholar]
  45. Straub LG, Scherer PE. Metabolic Messengers: Adiponectin. Nature Metabolism. 2019;1:334–339. doi: 10.1038/s42255-019-0041-z. [DOI] [PMC free article] [PubMed] [Google Scholar]
  46. Suhre K, McCarthy MI, Schwenk JM. Genetics meets proteomics: perspectives for large population-based studies. Nature Reviews. Genetics. 2021;22:19–37. doi: 10.1038/s41576-020-0268-2. [DOI] [PubMed] [Google Scholar]
  47. Sun BB, Maranville JC, Peters JE, Stacey D, Staley JR, Blackshaw J, Burgess S, Jiang T, Paige E, Surendran P, Oliver-Williams C, Kamat MA, Prins BP, Wilcox SK, Zimmerman ES, Chi A, Bansal N, Spain SL, Wood AM, Morrell NW, Bradley JR, Janjic N, Roberts DJ, Ouwehand WH, Todd JA, Soranzo N, Suhre K, Paul DS, Fox CS, Plenge RM, Danesh J, Runz H, Butterworth AS. Genomic atlas of the human plasma proteome. Nature. 2018;558:73–79. doi: 10.1038/s41586-018-0175-2. [DOI] [PMC free article] [PubMed] [Google Scholar]
  48. Talukdar HA, Foroughi Asl H, Jain RK, Ermel R, Ruusalepp A, Franzén O, Kidd BA, Readhead B, Giannarelli C, Kovacic JC, Ivert T, Dudley JT, Civelek M, Lusis AJ, Schadt EE, Skogsberg J, Michoel T, Björkegren JLM. Cross-Tissue regulatory Gene Networks in Coronary Artery Disease. Cell Systems. 2016;2:196–208. doi: 10.1016/j.cels.2016.02.002. [DOI] [PMC free article] [PubMed] [Google Scholar]
  49. Tamburini I. gtex-app-kegg-pathways. swh:1:rev:87d324da7bf196d72b143ddcefcdcc52c49e7160Software Heritage. 2023 https://archive.softwareheritage.org/swh:1:dir:a90ff9d5b6a4a45ae9a48f8465491869082e920d;origin=https://github.com/itamburi/gtex-app-kegg-pathways;visit=swh:1:snp:669f49929f008e3fbd9f0806e0a4369896c16850;anchor=swh:1:rev:87d324da7bf196d72b143ddcefcdcc52c49e7160
  50. The UniProt Consortium UniProt: the universal protein knowledgebase. Nucleic Acids Research. 2017;45:D158–D169. doi: 10.1093/nar/gkw1099. [DOI] [PMC free article] [PubMed] [Google Scholar]
  51. Trapp S, Stanford SC. New developments in the prospects for GLP-1 therapy. British Journal of Pharmacology. 2022;179:489–491. doi: 10.1111/bph.15788. [DOI] [PMC free article] [PubMed] [Google Scholar]
  52. Van C. Deconvolution. swh:1:rev:39c399083f5eea335924e65cf413ce68e6d2820aSoftware Heritage. 2022 https://archive.softwareheritage.org/swh:1:dir:ce6f7345337fe19269c75ae9738c193183af07f9;origin=https://github.com/cvan859/deconvolution;visit=swh:1:snp:a534dbe61e25177dcaa33e08cc496cd4a79ed5e4;anchor=swh:1:rev:39c399083f5eea335924e65cf413ce68e6d2820a
  53. Velez LM. Myokine-signaling. swh:1:rev:ebbc164fc5d6908a6cc63a98b786853e690d04baSoftware Heritage. 2022 https://archive.softwareheritage.org/swh:1:dir:516a1ca029d2e2bbc8bfd4435cfcb3def3ea67c5;origin=https://github.com/Leandromvelez/myokine-signaling;visit=swh:1:snp:f6e6fc7e2131764bf016cdc08478eab355cbba07;anchor=swh:1:rev:ebbc164fc5d6908a6cc63a98b786853e690d04ba
  54. Velez LM, Van C, Moore T, Zhou Z, Johnson C, Hevener AL, Seldin MM. Genetic variation of putative myokine signaling is dominated by biological sex and sex hormones. eLife. 2022;11:e76887. doi: 10.7554/eLife.76887. [DOI] [PMC free article] [PubMed] [Google Scholar]
  55. Yu G, Wang LG, Han Y, He QY. clusterProfiler: an R package for comparing biological themes among gene clusters. Omics. 2012;16:284–287. doi: 10.1089/omi.2011.0118. [DOI] [PMC free article] [PubMed] [Google Scholar]
  56. Zechner R, Kienesberger PC, Haemmerle G, Zimmermann R, Lass A. Adipose triglyceride lipase and the lipolytic catabolism of cellular fat stores. Journal of Lipid Research. 2009;50:3–21. doi: 10.1194/jlr.R800031-JLR200. [DOI] [PubMed] [Google Scholar]
  57. Zeng P, Shao Z, Zhou X. Statistical methods for mediation analysis in the era of high-throughput genomics: Current successes and future challenges. Computational and Structural Biotechnology Journal. 2021;19:3209–3224. doi: 10.1016/j.csbj.2021.05.042. [DOI] [PMC free article] [PubMed] [Google Scholar]
  58. Zhou M, Seldin MM. GD-CAT. swh:1:rev:f06f6ce5f5cccb0830e78bb365b7effd1fa6d97dSoftware Heritage. 2023 https://archive.softwareheritage.org/swh:1:dir:6a1d85728a4bc0bd58ff76f06de1330b62159098;origin=https://github.com/mingqizh/GD-CAT;visit=swh:1:snp:f94e2e7f5ff539f437b217156562395dc678261e;anchor=swh:1:rev:f06f6ce5f5cccb0830e78bb365b7effd1fa6d97d

eLife assessment

Evan Graehl Williams 1

This important paper provides web based interface for cross-tissue analysis of omics datasets from – so far – two different human populations, with compelling evidence that the tool can be used to make meaningful scientific discoveries. Conceptually, these analyses are relevant for any systems biologist or bioinformatician who is interested in integrating large population datasets. Currently, the resource is already of use for scientists studying the HMDP or using GTEx data, and we hope to see updates in the coming years that incorporate more populations and more datatypes, which could make it a general tool for a wide community.

Reviewer #1 (Public Review):

Anonymous

Zhou et al. have slightly expanded and improved their web tool from the previous submission, fixing some small issues and adding in additional sets of data from HMDP mice. Essentially, the authors have created a tool that facilitates the integrated analysis of omics datasets (particularly transcriptomics, but could be easily adapted to include proteomics) across tissues.

The strength is that this is new; as far as I know, any other multi-tissue analysis software is relatively ad hoc and it is not easily supported by e.g. SRA/GEO, but rather you'd need to download the multiple datasets and DIY. The authors have now shown some statistically significant (albeit expected from literature) results created using their pipeline. Whether the method will be generally useful for the community depends on its further development and support, but of course whether a project is supported also depends on whether its first publication is accepted - somewhat of a Catch-22 for a reviewer. Right now, the results shown are a convincing proof-of-concept that would likely be of utility mostly to the hosting laboratory and their direct collaborators, but which, with continued development at a similar level of effort, could be more generally useful for the growing number of groups interested in cross-tissue analysis.

Reviewer #2 (Public Review):

Anonymous

Summary:

Zhou et al. have revised their previous manuscript, which has greatly improved the quality of the work. Zhou et al. use publicly available GTEx data of 18 metabolic tissues from 310 individuals to explore gene expression correlation patterns within-tissue and across-tissues. Furthermore, they have added an analysis of data from a diverse panel of inbred mouse strains, which allows them to also incorporate data on physiological phenotypes relevant to metabolic signaling between tissues. They now focus on validating their approach to exploring signal in gene co-expression rather than emphasizing unvalidated discoveries. They provide a webtool (GD-CAT) to allow users to explore these data. Focusing more on known biology does result in the study making stronger conclusions from its data. The webtool is also improved, expanded with the mouse data, and of value to the scientific community. Their revision has also corrected key misconceptions from the initial submission and provides greater clarification of the methodologies used.

Strengths:

GTEx as well as the hybrid diversity mouse panel are powerful resource for many areas of biomedicine, and this study represents a valid use of gene co-expression network methodology. They have greatly improved its description and contextualization within the gene co-expression studies. The authors previously did a good job of providing examples confirming known signaling biology and have further improved these. They have largely removed the sections on discovery of novel biology, which is potentially for the better given a lack of follow-up validation, which could be beyond the scope of this manuscript anyway. The webtool, GD-CAT, is easy to use and allows researchers with genes and tissues of interest to perform the same analyses in the GTEx and HMDP data.

Weaknesses:

With the previous version, the primary weaknesses for me were key misconceptions and lack of detail in the methods, which have all been greatly improved. The manuscript could be considered more of a "Resource" than "Research", though there is value in showing how the known biology is reflected in the correlation data and could presumably be paired with validation to discover new biology. Finally, there are sentences here and there that could be rephrased to improve clarity, but overall it is greatly improved.

eLife. 2024 Jan 15;12:RP88863. doi: 10.7554/eLife.88863.3.sa3

Author Response

Mingqi Zhou 1, Ian Tamburini 2, Cassandra Van 3, Jeffrey Molendijk 4, Christy M Nguyen 5, Ivan Yao-Yi Chang 6, Casey Johnson 7, Leandro M Velez 8, Youngseo Cheon 9, Reichelle Yeo 10, Hosung Bae 11, Johnny Le 12, Natalie Larson 13, Ron Pulido 14, Carlos Viesi Nascimento Filho, Cholsoon Jang 15, Ivan Marazzi 16, Jamie Justice 17, Nicholas Pannunzio 18, Andrea L Hevener 19, Lauren Sparks 20, Erin Kershaw 21, Dequina Nicholas 22, Benjamin L Parker 23, Selma Masri 24, Marcus M Seldin 25

The following is the authors’ response to the original reviews.

We thank the reviewers and editors for their time and careful consideration of this study. Nearly every comment proved to be highly constructive and thoughtful, and as a result, the manuscript has undergone major revisions including the title, all figures, associated conclusions and web app. We feel that the revised resource provides a more systematic and comprehensive approach to correlating inter-individual transcript patterns across tissues for analysis of organ cross-talk. Moreover, the manuscript has been restructured to highlight utility of the web tool for queries of genes and pathways, as opposed to focused discrete examples of cherry-picked mechanisms. A few key revisions include:

• Manuscript: All figures have been revised to place to explore broad pathway representation. These analyses have replaced the previous circadian and muscle-hippocampal figures to emphasize ability to recapitulate known physiology and remove the discovery portion which has not been validate experimentally.

• Manuscript: The term “genetic correlation” or “genetically-derived” has been replaced throughout with “transcriptional”, “inter-individual”, or mostly just “correlations”.

• Manuscript: A new figure (revised fig 2) has been added to evaluate the innate correlation structure of data used for common metabolic pathways, in addition an exploration of which tissues generally show more co-correlation and centrality among correlations.

• Manuscript: A new figure (revised fig 4) has been added to highlight the utility of exploring gene ~ trait correlations in mouse populations, where controlled diets can be compared directly. These highlight sex hormone receptor correlations with the large amount of available clinical traits, which differ entirely depending on the tissue of expression and/or diet in mouse populations.

• Web tool: Addition of a mouse section to query expression correlations among diverse inbred strains and associated traits from chow or HFHS diet within the hybrid mouse diversity panel.

• Web tool: Overrepresentation analysis for pathway enrichments have been replaced with score-based gene set enrichment analyses and including network topology views for GSEA outputs.

• Web tool: Associated github repository containing scripts for apps now include a detailed walk-through of the interface and definitions for each query and term.

Public Reviews:

Reviewer #1 (Public Review):

Zhou et al. have set up a study to examine how metabolism is regulated across the organism by taking a combined approach looking at gene expression in multiple tissues, as well as analysis of the blood. Specifically, they have created a tool for easily analyzing data from GTEx across 18 tissues in 310 people. In principle, this approach should be expandable to any dataset where multiple tissues of data were collected from the same individuals. While not necessary, it would also raise my interest to see the "Mouse(coming soon)" selection functional, given that the authors have good access to multi-tissue transcriptomics done in similarly large mouse cohorts.

Summary

The authors have assembled a web tool that helps analyze multiple tissues' datasets together, with the aim of identifying how metabolic pathways and gene regulation are connected across tissues. This makes sense conceptually and the web tool is easy to use and runs reasonably quickly, considering the size of the data. I like the tool and I think the approach is necessary and surprisingly under-served; there is a lot of focus on multi-omics recently, but much less on doing a good job of integrating multi-tissue datasets even within a single omics layer.

What I am less convinced about is the "Research Article" aspect of this paper. Studying circadian rhythm in GTEx data seems risky to me, given the huge range in circadian clock in the sample collection. I also wonder (although this is not even remotely in my expertise) whether the circadian rhythm also gets rather desynchronized in people dying of natural causes - although I suppose this could be said for any gene expression pathway. Similarly for looking at secreted proteins in Figure 4 looking at muscle-hippocampus transcript levels for ADAMTS17 doesn't make sense to me - of all tissue pairs to make a vignette about to demonstrate the method, this is not an intuitive choice to me. The "within muscle" results look fine but panels C-E-G look like noise to me...especially panel C and G are almost certainly noise, since those are pathways with gene counts of 2 and 1 respectively.

I think this is an important effort and a good basis but a significant revision is necessary. This can devote more time and space to explaining the methodology and for ensuring that the results shown are actually significant. This could be done by checking a mix of negative controls (e.g. by shuffling gene labels and data) and a more comprehensive look at "positive" genes, so that it can be clearly shown that the genes shown in Fig 1 and 2 are not cherry-picked. For Figure 3, I suspect you would get almost an identical figure if instead of showing pan-tissue circadian clock correlations, you instead selected the electron transport chain, or the ribosome, or any other pathway that has genes that are expressed across all tissues. You show that colon and heart have relatively high connectivity to other tissues, but this may be common to other pathways as well.

Response: We are thankful to the reviewer in their detailed assessment of the manuscript. The comments raised in both the public and suggested reviews clearly improved the revised study and helped to identify limitations. In general, we have removed data suggesting “discovery” using these generalized analyses, such as removing figures evaluating circadian rhythm genes and muscle-hippocampus correlations. These have been replaced with more thorough investigations of tissue correlation structure and potentially identified regions of data sparsity which are important for users to consider. Also, we have added a similar full detailed pipeline of mouse (HMDP) data and highlighted in the manuscript by showing transcript ~ trait correlations of sex hormone receptor genes which differ between organs and diets. Further responses to individual points are also provided below.

Reviewer #2 (Public Review):

Summary:

Zhou et al. use publicly available GTEx data of 18 metabolic tissues from 310 individuals to explore gene expression correlation patterns within-tissue and across-tissues. They detect signatures of known metabolic signaling biology, such as ADIPOQ's role in fatty acid metabolism in adipose tissue. They also emphasize that their approach can help generate new hypotheses, such as the colon playing an important role in circadian clock maintenance. To aid researchers in querying their own genes of interest in metabolic tissues, they have developed an easy-to-use webtool (GD-CAT).

This study makes reasonable conclusions from its data, and the webtool would be useful to researchers focused on metabolic signaling. However, some misconceptions need to be corrected, as well as greater clarification of the methodology used.

Strengths:

GTEx is a very powerful resource for many areas of biomedicine, and this study represents a valid use of gene co-expression network methodology. The authors do a good job of providing examples confirming known signaling biology as well as the potential to discover promising signatures of novel biology for follow-up and future studies. The webtool, GD-CAT, is easy to use and allows researchers with genes and tissues of interest to perform the same analyses in the same GTEx data.

Weaknesses:

A key weakness of the paper is that this study does not involve genetic correlations, which is used in the title and throughout the manuscript, but rather gene co-expression networks. The authors do mention the classic limitation that correlation does not imply causation, but this caveat is even more important given that these are not genetic correlations. Given that the goal of their study aligns closely with multi-tissue WGCNA, which is not a new idea (e.g., Talukdar et al. 2016; https://doi.org/10.1016/j.cels.2016.02.002), it is surprising that the authors only use WGCNA for its robust correlation estimation (bicor), but not its latent factor/module estimation, which could potentially capture cross-tissue signaling patterns. It is possible that the biological signals of interest would be drowned out by all the other variation in the data but given that this is a conventional step in WGCNA, it is a weakness that the authors do not use it or discuss it.

Response: Thank you for the helpful and detailed suggestions regarding the study. The review raised some important points regarding methodological interpretations (ex. bicor-exclusive application as opposed to module-based approaches), as well as clarification of “genetic” inferences throughout the study. The comparison to module-based approaches has also now been discussed directly, pointing our considerations and advantages to each. We hope that the reviewer with our corrections to the misconceptions posed, many of which we feel were due to our insufficient description of methodological details and underlying interpretations. The revised manuscript, web portal and associated github provide much more detail and many more responses to specific points are provided below.

Reviewer #3 (Public Review):

Summary: A useful and potentially powerful analysis of gene expression correlations across major organ and tissue systems that exploits a subset of 310 humans from the GTEx collection (subjects for whom there are uniformly processed postmortem RNA-seq data for 18 tissues or organs). The analysis is complemented by a Shiny R application web service.

The need for more multisystems analysis of transcript correlation is very well motivated by the authors. Their work should be contrasted with more simple comparisons of correlation structure within different organs and tissues, rather than actual correlations across organs and tissues.

Strengths and Weaknesses: The strengths and limitations of this work trace back to the nature of the GTEx data set itself. The authors refer to the correlations of transcripts as "gene" and "genetic" correlations throughout. In fact, they name their web service "Genetically-Derived Correlations Across Tissues". But all GTEx subjects had strong exposure to unique environments and all correlations will be driven by developmental and environmental factors, age, sex differences, and shared and unshared pre- and postmortem technical artifacts. In fact we know that the heritability of transcript levels is generally low, often well under 25%, even studies of animals with tight environmental control.

This criticism does not comment materially detract for the importance and utility of the correlations-whether genetic, GXE, or purely environmental-but it does mean that the authors should ideally restructure and reword text so as to NOT claim so much for "genetics". It may be possible to incorporate estimates of chip heritability of transcripts into this work if the genetic component of correlations is regarded as critical (all GTEx cases have genotypes).

Appraisal of Work on the Field: There are two parts to this paper: 1. "case studies" of cross-tissue/organ correlations and 2. the creation of an R/Shiny application to make this type of analysis much more practical for any biologist. Both parts of the work are of high potential value, but neither is fully developed. My own opinion is that the R/Shiny component is the more important immediate contribution and that the "case studies" could be placed in the context of a more complete primer. Or Alternatively, the case studies could be their own independent contributions with more validation.

Response: We thank the reviewer for their supportive and helpful comments. The discussion of usage of the term “genetic” has been removed entirely from the manuscript as this point was made by all reviewers. Further, we have revised the previous study to focus on more detailed investigations of why transcript isoforms seemed correlated between tissues and areas where datasets are insufficient to provide sufficient information (ex. Kidney in GTEx). As the reviewer points out, the previous “case studies” were unvalidated and incomplete and as a result, have been replaced. Additional points below have been revised to present a more comprehensive analyses of transcript correlations across tissues and improved web tool.

(Recommendations For The Authors):

As this manuscript is focused on the analytical process rather than the biological findings, the reviewer concerns are not a fundamental issue to subsequent acceptance of the paper, but some of the examples will need to be replaced or double-checked to ensure their biological and statistical relevance. To raise the scope and interest of the method developed, it would be seen very positively to include additional datasets, as the authors seem to have intended to have done, with a non-functional (and highlighted as such) selection for mouse data. Establishing that the authors can easily - and will easily - add additional datasets into their tool would greatly raise the reviewers' confidence in the methodology/resource aspect of this paper. This may also help address the significant concerns that all three reviewers raised with the biological examples, e.g. that GTEx data is so uncontrolled that studying environmentally-influenced traits such as circadian rhythm may be challenging or even impossible to do properly. Adding in a more highly controlled set of cross-tissue mouse data may be able to address both these concerns at once, i.e. the resource concern (can the website easily be updated with new data) and the biological concern (are the results from these vignettes actually statistically significant).

Reviewer #1 (Recommendations For The Authors):

Comments, in approximately reverse order of importance

1. Some figure panels are not referenced in the text, e.g. Fig 1B and Figure 2E.Response: Thank you for pointing this out. We have revised every figure in the manuscript and additionally gone through to make sure every panel is referenced in the text.

1. The authors mention "genetic data" several times but I don't see anything about DNA. By "genetic data" do you mean "transcriptome expression data," or something else?

Response: This is an important point, also raised by all 3 reviewers. We have clarified in the abstract, results and discussion that correlations are between transcripts. As a result, all mentions of “genetics” or “genetic data” has been removed, with the exception of introducing mouse genetic reference panels.

1. For Figure 3, the authors look at circadian clock data, but the GTEx data is from all sorts of different times of day from across the patient cohort depending on when the donor died, and I don't see this metadata actually mentioned anywhere. I see Arntl Clock and all the other circadian genes are highly coexpressed in each tissue (except not so strong in liver) but correlation across tissue seems more random. Also hypothalamus seems to be very strongly negatively correlated with spleen, but this large green block doesn't have significance? That is surprising to me, since the sample sizes are all equivalent I would expect any correlation remotely close to -1.0 to be highly significant.

Response: The reviewer raises several important points with regard to the source of data and underlying interpretations. We have added a revised Fig 2, suggesting that representation of gene expression between tissues can be strongly biased by nature of samples (ex. differences in data that is available for each tissue) and also discussed considerations of the nature of sample origin in the limitations section. We have also used some of these points when introducing rationale for using mouse population data. As a result of comments from this reviewer and others, we have removed the circadian rhythm analysis and muscle-hippocampal figures from the revised study; however, specifically mentioned these cohort differences in the discussion section (lines 294-298). Circadian rhythm terms are also evaluated in Fig 2 and consistent with the reviewers concerns, less overall correlations are observed between transcripts across tissues when compared to other common GO terms assessed.

1. Figure 4, this is all transcript-level data, so it is confusing to see protein nomenclature used, e.g. "expression of muscle ADAMTS17" should be "expression of muscle ADAMTS17" (ADAMTS17 the transcript should be in italics, in case the formatting is removed by the eLife portal). Same for FNDC5. In the figures you do have those in italics, so it is just an issue in the manuscript text. In general please look through the text and make sure whether you are referring really to a "gene," "transcript," or "protein." For instance, Figure 1 legend I think should be "A, All transcripts across the ... with local subcutaneous and muscle transcript expression." I know people still sometimes use "gene expression" to refer to transcripts, but now that proteomics is pretty mainstream, I would push for more careful vocabulary here.

Response: Thank you for pointing these out. While we have replaced Fig 4 entirely as to limit the unvalidated discovery or research aspects of the paper, we have gone through the text and figures to check that the correct formatting is used for references to human genes (capitalized italics) or the newly-included mouse genes (lower-case italics).

1. "Briefly, these data were filtered to retain genes which were detected across individuals where individuals were required to show counts > 0 in 1.2e6 gene-tissue combinations across all data." I don't quite understand the filtering metric here - what is 1.2 million gene-tissue combinations referring to? 20k genes times 18 tissues times 310 people is ~100 million measurements, but for a given gene across 310 people * 18 tissues that is only ~6000 quantifications per gene.

Response: We apologize for this oversight, as the numbers were derived from the whole GTEx dataset in total and not the tissues used for the current study. We have clarified this point in the revised manuscript (methods section in Datasets used) and also removed confusing references to specific numbers of transcripts and tissues unless made clear.

1. Generally I think your approach makes sense conceptually but... for the specific example used in e.g. figure 4, this only makes sense to me if applied to proteins and not to transcripts. Looking at the transcript levels per tissue for genes which are secreted could be interesting but this specific example is confusing, as is the tissue selected. I would not really expect much crosstalk between the hippocampus and the muscle, especially not in terms of secreted proteins.

Response: This is a valid point, also raised by other reviewers. While we wanted to highlight the one potentially-new (ADAMTS7) and two established proteins (FNDC5 and ERFE) and their correlations, the fact that this direct circuit remains to be validated led us to replace the figure entirely. The point raised about inference of protein secretion compared to action; however, has been expanded upon in the results and discussion. We now show that complexities arise when using this approach to infer mechanisms of proteins which are primarily regulated post-transcriptionally. We provide a revised Supplemental Fig 4 showing that this general framework, when applied to expression of INS (insulin), almost exclusively captured pathways leading to its secretion and not action.

1. It's not clear to me how correction for multiple testing is working in the analyses used in this manuscript. You mention q-values so I am sure it was done, I just don't see the precise method mentioned in the Methods section.

Response: We apologize for this oversight and have included a specific mention of qvalue adjustment using BH methods, where our reasoning was the efficiency in run-time (compared to other qvalue methods). In addition, we provide a revised Fig 2 which suggests that innate correlation structure exists between tissues for a variety of pathways which should be considered. We also compare several empirical bicor pvalues and qvalue adjustments directly between these large pathways where much of the innate tissue correlation structure does appear present when BH qvalue adjustments are applied (revised Fig 2A).

1. The piecharts in Figure 1 are interesting - I would actually be curious which tissues generally have closer coexpression. This would be an absolutely massive number of pairwise correlations to test, but maybe there is a smarter way to do it? For instance, for ADIPOQ, skeletal muscle has the best typical correlation, but would that be generally true just that many adipose genes have closer relationship between the two tissues?

Response: This comment inspired us to perform a more systematic query of global gene-gene correlation structures, which is now shown as the revised Fig 2A. With respect to ADIPOQ, the reviewer is correct in that there does appear to be a general pattern of muscle genes showing stronger correlation with adipose genes. We emphasize and discuss there in the revised manuscript to point out that global trends of tissue correlation structure should be taken into account when looking at specific genes. Much of this innate co-correlation structure could be normalized by the BH qvalue adjustment (above); however, strongly correlated pathways like mitochondria showed selective patterns throughout thresholds (revised Fig 2A). Further, we analyze KEGG terms and general correlation structures (revised Fig 2B) to point out the converse, that some tissues are just poorly represented. Interpretation of correlated genes from these organ and pathway combinations should be especially considered in the framework that their poor representation in the dataset clearly impacted the global correlation structures. We have added these points to both results and discussion. In sum, we feel that this was a critical point to explore and attempted to provide a framework to identify/consider in the revised manuscript.

1. The pathway enrichments in Figure 1 are more difficult for me to interpret, e.g. for ADIPOQ, the scWAT pathways make sense, but the enriched skeletal muscle pathways are less clearly relevant (rRNA processing?? Not impossible but no clear relevance either). What are the significances for these pathway enrichments? Is it even possible to select a gene that has no peripheral pathway enrichment, e.g. if you take some random Gm#### or olfactory receptor gene and run the analysis, are you also going to see significant pathways selected, as pathway enrichment often has a trend to overfit? The "within organ" does seem to make sense, but I am also just looking at 4 anecdotes here and it is unclear whether they are cherry picked because they did make sense. That is, it's unclear why you selected ADIPOQ and not APOE or HMGCR or etc. I also don't figure out how I can make these pathway enrichment plots using your website. I do get the pie chart but when I try the enrichment analysis block (NB: typo on your website, it says "Enrich-E-ment Analysis" with an extra E) I always get that "the selected tissue do not contain enough genes to generate positive the enrichment." (Also two typos in that phrase; authors should check and review extensively for improvements to the use of English.) After trying several genes I eventually got it to work. I think there is some significant overfitting here, as I am pretty sure that XIST expression in the white adipose tissue has nothing to do with olfactory signalling pathways, which are the top positive network (but with an n = 4 genes).

Response: Several good points within this comment. (1) the pathway enrichments have been revised completely. The reviewer provided a helpful suggestion of a rank-based approach to query pathways, as opposed to the previous over-representation tests. After evaluating several different pathway enrichment tools based on correlated tissue expression transcripts, a rank- and weight-based test (GSEA) captured the most physiologic pathways observed from known actions of select secreted proteins. Therefore, revised pathway enrichments and web-tool queries unitize a GSEA approach which accounts for the rank and weight determined by correlation coefficient. In implementing these new pathway approaches, we feel that pathway terms perform significantly better at capturing mechanisms. (2) With respect to the selection genes, we wanted to provide a framework for investigating genes which encode secreted proteins that signal as a result of the abundance of the protein alone. This is a group-bias; however, and not necessarily reflective of trying to tackle the most important physiologic mechanisms underlying human disease. We agree with the reviewer in those evaluating genes such as APOE and cholesterol synthesis enzymes present an exciting opportunity, our expertise in interpretation and mechanistic confirmation is limited. (3) We have gone through the revised manuscript and attempted to correct all grammatical and/or spelling mistakes.

1. The network figures I get on your website look actually more interesting than the ones you have in Figure 2, which only stay within a tissue. Making networks within a tissue is pretty easy I think for any biologist today, but the cross-tissue analysis is still fairly hard due to the size of the datasets and correlation matrices.

Response: We greatly appreciate the reviewer’s enthusiasm for the network model generation aspect. We have tried to improve the figure generation and expanded the gene size selection for network generation in the web tool, both within and across tissues. We are working toward allowing users to select specific pathway terms and/or tissue genes to include in these networks as well, but will need more time to implement.

1. I get a bug with making networks for certain genes, e.g. XIST - Liver does not work for plotting network graphs. Maybe XIST is a suppressed gene because it has zero expression in males? It is an interesting gene to look at as a "positive control" for many analyses, since it shows that sample sexing is done correctly for all samples.

Response: The reviewer recognized a key consideration in underlying data structure for GTEx. In the revised manuscript, we evaluated tissue representation (or lack thereof) being a crucial factor in driving where significant relationships cannot be observed in tissues such as kidney, liver and spleen (Fig 2). Moreover, the representation of females (self-reported) in GTEx is less-than half of males (100 compared to 210 individuals). We have emphasized this point in the discussion where we specifically pointed out the lack of XIST Liver correlation being a product of data structure/availability and not reflecting real biologic mechanisms. We expanded on this point by highlighting the clear sex-bias in terms of representation.

1. On the network diagram on your website, there doesn't seem to be any way to zoom in on the website itself? You can make a PDF which is nice but the text is often very small and hard to read.

Response: We have revised the web interface plot parameters to create a more uniform graph.

1. On a related note, is it possible to output the raw data and gene lists for the network graph? I would want to know what are those genes and their correlation coefficient.

Response: We have enabled explore as .pdf or .svg graphics for the network and all plots. In addition, following pie chart generation at the top of the web app, users now have the ability to download a .csv file containing the bicor coefficients, regression pvalues and adjusted qvalues for all other gene-tissue combinations.

1. Some functionality issues, e.g. on the "Scatter plot" block, I input a gene name again here. Shouldn't this use the same gene selected already at the top of the page? It seems confusing to again select the gene and tissue here, but maybe there is a reason for that.

Response: It would be more intuitive to only display genes from a given selected tissue for scatterplots; however, we chose to keep all possible combinations with the [perhaps unnecessary] option of reselecting a tissue to allow users to query any specific gene without having to wait to run the pathways for all that correspond to a given tissues.

1. Figure 4H should also probably be Figure 1A.

Response: Good point, the revised Fig 1A is now a summary of the web tool

I realize I have written a fairly critical review that will require most of the figures to be redone, but I think the underlying method is sound and the implementation by and end-user is quite simple, so I think your group should have no trouble addressing these points.

Response: Your comments were really helpful and we feel that the tool has significantly improved as a result. So, we are thankful to the time and effort put toward helping here.

Reviewer #2 (Recommendations For The Authors)

Comments on the use of "genetic correlation"

• The use of "genetic correlation" in title and throughout the manuscript is misleading. Should broadly be replaced with "gene expression correlation". Within genetics, "genetic correlation" generally refers to the correlation between traits due to genetic variation, as would be expected under pleiotropy (genetic variation that affects multiple traits). Here, I think the authors are somewhat conflating "genetic" (normally referring to genetic variation) with "gene" (because the data are gene expression phenotypes). I don't think they perform any genetic analysis in the manuscript. I hope I don't sound too harsh. I think the paper still has merit and value, but it is important to correct the terminology.

Response: This was an important clarification raised by all reviewers. We apologize for the oversight. As a result, all mentions of “genetics” or “genetic data” has been removed, with the exception of introducing mouse genetic reference panels. These have generally been replaced with “transcript correlations”, “correlations” or “correlations across individuals” to avoid confusion.

• The authors note an important limitation in the Discussion that correlations don't imply a specific causal model between two genes, and furthermore note that statistical procedures (mediation and Mendelian randomization) are dependent on assumptions and really only a well-designed experiment can completely determine the relationship. This is a very important point that I greatly appreciate. I think they could even further expand this discussion. The potential relationships between gene A and gene B are more complex than causal and reactive. For example, a genetic variant or environmental exposure could regulate a gene that then has a cascade of effects on other genes, including A and B. They belong to a shared causal pathway (and are potentially biologically interesting), but it's good to emphasize that correlations can reflect many underlying causal relationships, some more or less interesting biologically.

Response: We thank the reviewer for pointing this out. We have expanded both the results and discussion sections to mention specifically how correlation between two genes can be due to a variety of parameters, often and not just encompassing their relationship. We mention the importance of considering genetic and environmental variables in these relationships as well which we feel will be an important “take-home message” for the reader. These points were also explored in the revised Fig 2 in terms of investigating broad pathway gene-gene correlation structures. As noted by the reviewer, contexts such as circadian rhythm or other variables in the data which are not fixed show much less overall significance in terms of broad relationships across organs.

• It would be good for the authors to provide more context for the methods they use, even when they are fully published. For example, stating that biweight midcorrelation (bicor) is an approach for comparing to variables that is more robust to outliers than traditional correlations and is commonly used with gene co-expression correlation.

Response: Thank you for pointing this out. A lack of method description was also an important reason for lack of clarity on other aspects so we have done our best to detail what exact approaches are being implemented and why. In the revised manuscript, we mention the usage if bicor values to limit influence of outlier individuals in driving regressions, but also point out that it is still a generalized linear model to assess relationships. We hope that the revised methods and expanded git repositories which detail each analysis provide much more transparency on what is being implemented.

• Performing a similar analysis based on genetic correlation is an interesting idea, as it would potentially simplify the underlying causal models (removing variation that doesn't stem from genetic variants). I don't expect the authors to do this for this paper because it would be a significant amount of work (fitting and testing genetic correlations are not as straightforward). But still, an interesting idea to think about, and individuals in GTEx are genotyped I believe. Could be mentioned in the Discussion.

Response: Absolutely. While we did not implement and models of genetic correlation (despite misusing the term) in this analysis. We have added to the discussion on how when genetic data is available, these approaches offer another way to tease out potentially causal interactions among the large amount of correlated data occurring for a variety of reasons.

Comments on use of the term "local" and "regression"

• "Local" is largely used to mean within-tissue, so how correlated gene X in tissue Y is with other genes in tissue Y. I think this needs to be defined explicitly early in the manuscript or possibly replaced with something like "within-tissue".

Response: We have replaced al “local” mentions with “within-tissue” or simply name the tissue that the gene is expressed to avoid confusion with other terms of local (ex a transcript in proximity to where it is encoded on the genome).

• "Regression" is also used frequently throughout, often when I think "correlation" would be more accurate. It's true that the regression coefficient is a function of the correlation between X and Y, but I don't think actual regression (the procedure) applies here. The coefficients being used are bicor, which I don't think relates as cleanly to linear regression.

Response: Thank you for pointing this out. A lack of method description was also an important reason for lack of clarity on other aspects so we have done our best to detail what exact approaches are being implemented and why. In the revised manuscript, we mention the usage if bicor values to limit influence of outlier individuals in driving correlations, but also point out that it is still a generalized linear model to assess relationships. Further, we have removed usage of “regression” when referencing bicor values. We hope that the revised methods and expanded git repositories which detail each analysis provide much more transparency on what is being implemented.

• "Further, pan-tissue correlations tend to be dominated by local regressions where a given gene is expressed. This is due to the fact that within-tissue correlations could capture both the regulatory and putative consequences of gene regulation, and distinguishing between the two presents a significant challenge" (lines 219-223). This sentence includes both "local" and "regressions" (and would be improved by my suggested changes I think), but I also don't fully understand the argument of "regulatory and putative consequences". I think the authors should elaborate further. In the examples, the within-tissue correlations do look stronger, suggesting within-tissue regulation that is quite strong and potentially secondary inter-tissue regulation. If that's the idea, I think it can be stated more clearly.

Response: Thank you for pointing this out. We have revised the sentence to state the following:

Further, many correlations tend to be dominated by genes expressed within the same organ. This could be due to the fact that, within-tissue correlations could capture both the pathways regulating expression of a gene, as well as potential consequences of changes in expression/function, and distinguishing between the two presents a significant challenge. For example, a GD-CAT query of insulin (INS) expression in pancreas shows exclusive enrichments in pancreas and corresponding pathway terms reflect regulatory mechanisms such as secretion and ion transport (Supplemental Fig 4).

We feel that this point might not be intuitive, so have included a new figure (Supplemental Fig 4) which contains the tissue correlations and pathways for INS expression in pancreas. These analyses show an example where co-correlation structure seems almost entirely dominated by genes within the same organ (pancreas) and GSEA enrichments highlight many known pathways which are involved in regulating the expression/secretion of the gene/protein. We hope that this makes the point more clearly to the reader.

Additional comments on Results:

• I would break the titled Results sections into multiple paragraphs. For example, the first section (lines 84-129) has a few natural breakpoints that I noticed that would potentially make it feel less over-whelming to the reader.

Response: We have broken up the results section into separate paragraphs in the revised manuscript. In addition, we have gone through to try and make sure that the amount of information per block/sentence focuses on key points.

• "Expression of a gene and its corresponding protein can show substantial discordances depending on the dataset used" (line 224 of Results). This is a good point, and the authors could include citations here of studies that show discordance between transcripts and proteins, of which there are a good number. They could also add some biological context, such as saying differences could reflect post-translational regulation, etc.

Response: Thank you for the supportive comment. We have referenced several comprehensive reviews of the topic, each of which contain tables summarizing details of mRNA-protein correlation. The revised discussion sentence is as follows:

Expression of a gene and its corresponding protein can show substantial discordances depending on the dataset used. These have been discussed in detail39–41, but ranges of co-correlation can vary widely depending on the datasets used and approaches taken. We note that for genes encoding proteins where actions from acute secretion grossly outweigh patterns of gene expression, such as insulin, caution should be taken when interpreting results. As the depth and availability of tissue-specific proteomic levels across diverse individuals continues to increase, an exciting opportunity is presented to explore the applicability of these analyses and identify areas when gene expression is not a sufficient measure.

1. Liu, Y., Beyer, A. & Aebersold, R. On the Dependency of Cellular Protein Levels on mRNA Abundance. Cell 165, 535–550 (2016).

2. Maier, T., Güell, M. & Serrano, L. Correlation of mRNA and protein in complex biological samples. FEBS Letters 583, 3966–3973 (2009).

3. Buccitelli, C. & Selbach, M. mRNAs, proteins and the emerging principles of gene expression control. Nat Rev Genet 21, 630–644 (2020).

• In many ways, this work has similar goals to many studies that have performed multi-tissue WGCNA (e.g., Talukdar et al. 2016; https://doi.org/10.1016/j.cels.2016.02.002). In this manuscript, WGCNA's conventional approach to estimating robust correlations (bicor) is used, but they do not use WGCNA's data reduction/clustering functionality to estimate modules. Perhaps the modules would miss the signaling relationships of interest, being sort of lost in the presence of stronger signals that aren't relevant to the biological questions here. But I think it would be good for the authors to explain why they didn't use the full WGCNA approach.

Response: This is an important point and we also feel that the previous lack of methodological details and discussion did a poor job at distinguishing why module-based approaches were not used. We wanted to be careful not to emphasize one approach being superior/inferior to another, rather point out the different considerations and when a direct correlation might inform a given question. As the reviewer points out, our general feeling is that adopting a simple gene-focused correlation approach allows users to view mechanisms through the lens of a single gene; however, this is limited in that these could be influenced by cumulative patterns of correlation structure (for example mitochondria in revised Fig 2A) which would be much more apparent in a module-based approach. This comment, in combination with the other listed above, was our motivation in exploring cumulative patterns of gene-gene correlations in the revised Fig 2. In the revised manuscript, we expanded on the results and discussion section to highlight utility of these types of approaches compared to module-based methods:

The queries provided in GD-CAT use fairly simple linear models to infer organ-organ signaling; however, more sophisticated methods can also be applied in an informative fashion. For example, Koplev et al generated co-expression modules from 9 tissues in the STARNET dataset, where construction of a massive Bayesian network uncovered interactions between correlated modules6. These approaches expanded on analysis of STAGE data to construct network models using WGCNA across tissues and relating these resulting eigenvectors to outcomes42. The generalized approach of constructing cross-tissue gene regulatory modules presents appeal in that genes are able to be viewed in the context of a network with respect to all other gene-tissue combinations. In searching through these types of expanded networks, individuals can identify where the most compelling global relationships occur. One challenge with this type of approach; however, is that coregulated pathways and module members are highly subjective to parameters used to construct GRNs (for example reassignment threshold in WGCNA) and can be difficult in arriving at a “ground truth” for parameter selection. We note that the WGCNA package is also implemented in these analyses, but solely to perform gene-focused correlations using biweight midcorrelation to limit outlier inflation. While the midweight bicorrelation approach to calculate correlations could also be replaced with more sophisticated models, one consideration would be a concern of overfitting models and thus, biasing outcomes.

Additional comments on Discussion:

• In the second paragraph of the Discussion (lines 231-244), the authors mention that GD-CAT uses linear models to compare data between organs and point to other methods that use more complex or elaborate models. It's good to cite these methods, but I think they could more directly state that there are limitations to high complexity models, such as over-fitting.

Response: Thank you for this suggestion. We have added a line (above) mentioning the overfitting concern.

Comments on Methods:

• The described gene filtration in the Methods of including genes with non-zero expression for 1.2e6 gene-tissue combinations is confusing. If there are 310 individuals and 18 tissues, for a given gene, aren't there only 5,580 possible data points? Might be helpful to contextualize the cut-off in terms of like the average number of individuals with non-zero expression within a tissue.

Response: We apologize for this error. This number was pasted from a previous dataset used and not appropriate for this manuscript. In general, we have removed specific mentions of total number of gene_tissue correlation combinations, as these numbers reflect large but almost meaningless quantifications. Instead, we expanded the methods in terms of how individuals and genes filtered.

• More details should be given about the gene ontology/pathway enrichment analysis. I suspect that a set-based approach (e.g., hypergeometric test) was used, rather than a score-based approach. The authors don't state what universe of genes were used, i.e., the overall set of genes that the reduced set of interest is compared to. Seems like this could or should vary with the tissues that are being compared. A score-based approach could be interesting to consider (https://www.biorxiv.org/content/10.1101/060012v3), using the genetic correlations as the score, as this would remove the unappealing feature of sets being dependent on correlation thresholds. This isn't something that I would demand of the published paper, but it could be an appealing approach for the authors to consider and confirm similar results to the set-based analysis.

Response: This is an important point. Following this suggestion, we evaluated several different rank- and weight-based pathway enrichment tools, including FGSEA and others. Ultimately, we concluded that GSEA performed significantly better at (1) recapitulating known biology of select secreted protein genes and (2) leveraging the large numbers of genes occurring at qvalue cutoffs without having to further refine (ex. in the previous overrepresentation tests). For this reason, all pathway enrichments in the web tools and manuscripts not contain GSEA outputs and corresponding pathway enrichments or network graph visualizations. Thank you for this suggestion.

Comments on figures:

• I think there is a bit of a missed opportunity to use the figures to introduce and build up the story for readers. For example, in Figure 1, plotting ADIPOQ expression against a correlated gene in adipose (local) as well as peripheral tissues. This doesn't need to be done for every example, but I think it would help readers understand what the data are, and what's being detected before jumping into higher level summaries.

Response: Thank you, this point also builds on others which recommended to restructure the manuscript and figures. In the revised manuscript, we first introduce the web tool (which was last previously), and immediately highlight comparisons of within- and across-organ correlations, such as ADIPOQ. We feel that the revised manuscript presents a superior structure in terms of demonstrating the key points and utility of looking at gene-gene correlations across tissues.

• Figures 1 and 4 are missing the color scale legend for the bar plots, so it's impossible to tell how significant the enrichments are.

Response: We apologize for the oversight. The pathways in the revised Fig 1 detail pathway network graphs among the top pathways which should make interpretation more intuitive. We have also gone through and made sure that GSEA enrichment pvalues are now present for all figures including pathways (revised Fig 1, Fig 3 and supplemental Fig 4).

• The Figure 2 caption says that edges are colored based on correlation sign? Are there any negative correlations (red)? They all look blue to me. The caption could also state that edge weight reflects correlation magnitude (I assume). It would be ideal to include a legend that links a range of the depicted edge weights to their genetic correlation, though I don't know how feasible that may be depending on the package being used to plot the networks.

Response: Good catch. We included in the revised manuscript the network edge parameters:Network edges represent positive (blue) and negative (red) correlations and the thicknesses are determined by coefficients. They are set for a range of bicor=0.6 (minimum to include) to bicor=0.99

Related to seeing a dominant pattern of positive correlations, we agree that this observation is fascinating and gene-gene correlations being dominated by positive coefficients will be the topic of a closely-following manuscript from the lab

• Figure 4A would be more informative as boxplots, which could still include Ssec score. This would allow the reader to get a sense of the variation in correlation p-value across all hippocampus transcripts.

Response: Related to comments from this reviewer and others, we have removed the previous Fig 4 entirely from the manuscript to emphasize the ability of these gene-gene correlations to capture known biology and limit the extend of unvalidated “suggested” new mechanisms.

Comments on GD-CAT

• The online webtool worked nicely for me. It was easy to use and produce figures like in the manuscript. One suggestion is show data points in the scatter plot rather than just the regression line (if that's possible currently, I didn't figure it out). A regression line isn't that interesting to look at, but seeing how noisy the data look around it is something humans can usually interpret intuitively.

Response: Thank you so much. We are excited that the web tool works sufficiently. We have also revised the individual gene-gene correlation tab to show individual data points instead of simple regression lines.

Minor comments:

Response: Thank you for these detailed improvements

• This sentence is awkwardly constructed: "Here, we surveyed gene-gene genetic correlation structure for ~6.1x10^12 gene pairs across 18 metabolic tissues in 310 individuals where variation of genes such as FGF21, ADIPOQ, GCG and IL6 showed enrichments which recapitulate experimental observations" (lines 68-70). It's an important sentence because it's where in the Abstract/Introduction the authors succinctly state what they did, thus I would re-work it to something like: "Here, we surveyed gene expression correlation structure..., identifying genes, such as FGF21, ADIPOQ, GCG and IL6, that possess correlation networks that recapitulate known biological pathways."

Response: The numbers of pairs examined and dataset size have been removed for clarity and we have revised this statement and results as a whole

• Prefer swapping "signal" for "signaling" in line 53 of Abstract/Introduction.

Response: Done

• Remove extra period in line 208 of Results.

Response: Removed

• Change "well-establish" to "well-established" in line 247 of Discussion.

Response: Replaced

• Missing commas in line 302 of Methods.

Response: added

• Missing comma in line 485 of Figure 3 caption.

Response: The previous Fig 3 has been removed

• Typo in title of Figure 3E (change "Perihperal" to "Peripheral")

Response: Thank you, changed

• Add y-axis label to y-axis labels (relative cell proportions) to Supplemental Figures 1-3.

Response: These labels have been added

Reviewer #3 (Recommendations For The Authors):

Minor technical comment: The authors refer to correlations between genes when they actually mean correlations between GTEX transcript isoform models. It is exceedingly important to keep this distinction clear in the reader's mind, a fact that is emphasized by the authors themselves when they comment on the potential value of similar proteomic assays to evaluate multiorgan system communication. GTEx has tried to do proteomics but I do not know of any open data yet.

Response: Thank you for this point. We have gone through the manuscript and replaced “gene correlations” with “transcript” or other similar mentions. Related to the comment on GTEx proteomics, this is an important point as well. As the reviewer mentions, proteomics has been performed on GTEx data; however, given that this dataset contains only 6 sparsely-represented individuals, analyses such as the ones highlighted in our study remain highly limited. We have added the following to the discussion:As the depth and availability of tissue-specific proteomic levels across diverse individuals continues to increase, an exciting opportunity is presented to explore the applicability of these analyses and identify areas when gene expression is not a sufficient measure. For example, mass-spec proteomics was recently performed on GTEx42; however, given that these data represent 6 individuals, analyses utilizing well-powered inter-individual correlations such as ours which contain 310 individuals remain limited n applications.

The R/Shiny companion application: The community utility of this application would be greatly improved by a link to a primer and more basic functionality. The Github site is a "work in progress" and does not include a readme file or explanation (that I could find) on the license.

Response: Thank you, we are excited that the apps operate sufficiently. We have revised the github repository entirely to contain a full walk-through of app details and parameter selections. These are meant to walk users through each step of the pipeline and discuss what is being done at each step. We agree that this updated github repository allows users to understand the details of the R/Shiny app in much more detail. We also made all the app scripts, datasets, markdown/walkthrough files and docker image fully available to enhance accessibility.

Associated Data

    This section collects any data citations, data availability statements, or supplementary materials included in this article.

    Data Citations

    1. Lonsdale J, Thomas J, Salvatore M, Phillips R, Lo E, Shad S, Hasz R, Walters G, Garcia F, Young N, Foster B, Moser M, Karasik E, Gillard B, Ramsey K, Sullivan S, Bridge J, Magazine H, Syron J, Fleming J, Siminoff L, Traino H, Mosavel M, Barker L, Jewell S, Rohrer D, Maxim D, Filkins D, Harbach P, Cortadillo E, Berghuis B, Turner L, Hudson E, Feenstra K, Sobin L, Robb J, Branton P, Korzeniewski G, Shive C, Tabor D, Qi L, Groch K, Nampally S, Buia S, Zimmerman A, Smith A, Burges R, Robinson K, Valentino K, Bradbury D, Cosentino M, Diaz-Mayoral N, Kennedy M, Engel T, Williams P, Erickson K, Ardlie K, Winckler W, Getz G, DeLuca D, MacArthur D, Kellis M, Thomson A, Young T, Gelfand E, Donovan M, Meng Y, Grant G, Mash D, Marcus Y, Basile M, Liu J, Zhu J, Tu Z, Cox NJ, Nicolae DL, Gamazon ER, Im HK, Konkashbaev A, Pritchard J, Stevens M, Flutre T, Wen X, Dermitzakis ET, Lappalainen T, Guigo R, Monlong J, Sammeth M, Koller D, Battle A, Mostafavi S, McCarthy M, Rivas M, Maller J, Rusyn I, Nobel A, Wright F, Shabalin A, Feolo M, Sharopova N, Sturcke A, Paschal J, Anderson JM, Wilder EL, Derr LK, Green ED, Struewing JP, Temple G, Volpi S, Boyer JT, Thomson EJ, Guyer MS, Ng C, Abdallah A, Colantuoni D, Insel TR, Koester SE, Little AR, Bender PK, Lehner T, Yao Y, Compton CC, Vaught JB, Sawyer S, Lockhart NC, Demchok J, Moore HF. 2013. Adult Genotype-Tissue Expression (GTEx) project. GTExPortal. adult-gtex [DOI] [PMC free article] [PubMed]
    2. Farber CR, Bennett BJ, Orozco L, Eskin E, Lusis AJ. 2011. Bone mineral density in males of 96 strains of mice in the Hybrid Mouse Diversity Panel (HMDP) Mouse Phenome Database. HMDPpheno1
    3. Ghazalpour A, Bennett BJ, Orozco L, Lusis AJ. 2014. Liver metabolite levels: Nucleotides and peptides in males of 104 strains of the Hybrid Mouse Diversity Panel (HMDP) Mouse Phenome Database. HMDPpheno10
    4. Bennett BJ, Lusis AJ. 2015. Diet effects on aortic lesion size in 103 Hybrid Mouse Diversity Panel strains that are transgenic for human APOE*3Leiden and human CETP (F1 hybrids) Mouse Phenome Database. HMDPpheno11
    5. Bennett BJ, Farber CR, Orozco L, Eskin E, Lusis AJ. 2010. Plasma lipids and body composition in males of 99 strains of mice in the Hybrid Mouse Diversity Panel (HMDP) Mouse Phenome Database. HMDPpheno2
    6. Korshunov VA. 2012. Heart rate and blood pressure in males of 58 strains of mice. Mouse Phenome Database. HMDPpheno3
    7. Park CC, Gale GD, Lusis AJ, Smith DJ. 2011. Fear conditioning in males of 94 strains of mice in the Hybrid Mouse Diversity Panel (HMDP) Mouse Phenome Database. HMDPpheno4
    8. Davis RC, van Nas A, Bennett BJ, Eskin E, Lusis AJ. 2013. Hematological parameters in males of 83 strains of mice in the Hybrid Mouse Diversity Panel (HMDP) Mouse Phenome Database. HMDPpheno5
    9. Ghazalpour A, Bennett BJ, Orozco L, Lusis AJ. 2014. Liver metabolite levels: Amino acids in males of 104 strains of the Hybrid Mouse Diversity Panel (HMDP) Mouse Phenome Database. HMDPpheno6
    10. Ghazalpour A, Bennett BJ, Orozco L, Lusis AJ. 2014. Liver metabolite levels: Carbohydrates, cofactors and vitamins, energy, and xenobiotics in males of 104 strains of the Hybrid Mouse Diversity Panel (HMDP) Mouse Phenome Database. HMDPpheno7
    11. Ghazalpour A, Bennett BJ, Orozco L, Lusis AJ. 2014. Liver metabolite levels: Lipids (except lysolipids) in males of 104 strains of the Hybrid Mouse Diversity Panel (HMDP) Mouse Phenome Database. HMDPpheno8
    12. Ghazalpour A, Bennett BJ, Orozco L, Lusis AJ. 2014. Liver metabolite levels: Lipids (lysolipids) in males of 104 strains of the Hybrid Mouse Diversity Panel (HMDP) Mouse Phenome Database. HMDPpheno9

    Supplementary Materials

    MDAR checklist

    Data Availability Statement

    All previously published datasets are listed below. Please see the Materials and methods for further details (section ‘Data sources and availability’).

    The following previously published datasets were used:

    Lonsdale J, Thomas J, Salvatore M, Phillips R, Lo E, Shad S, Hasz R, Walters G, Garcia F, Young N, Foster B, Moser M, Karasik E, Gillard B, Ramsey K, Sullivan S, Bridge J, Magazine H, Syron J, Fleming J, Siminoff L, Traino H, Mosavel M, Barker L, Jewell S, Rohrer D, Maxim D, Filkins D, Harbach P, Cortadillo E, Berghuis B, Turner L, Hudson E, Feenstra K, Sobin L, Robb J, Branton P, Korzeniewski G, Shive C, Tabor D, Qi L, Groch K, Nampally S, Buia S, Zimmerman A, Smith A, Burges R, Robinson K, Valentino K, Bradbury D, Cosentino M, Diaz-Mayoral N, Kennedy M, Engel T, Williams P, Erickson K, Ardlie K, Winckler W, Getz G, DeLuca D, MacArthur D, Kellis M, Thomson A, Young T, Gelfand E, Donovan M, Meng Y, Grant G, Mash D, Marcus Y, Basile M, Liu J, Zhu J, Tu Z, Cox NJ, Nicolae DL, Gamazon ER, Im HK, Konkashbaev A, Pritchard J, Stevens M, Flutre T, Wen X, Dermitzakis ET, Lappalainen T, Guigo R, Monlong J, Sammeth M, Koller D, Battle A, Mostafavi S, McCarthy M, Rivas M, Maller J, Rusyn I, Nobel A, Wright F, Shabalin A, Feolo M, Sharopova N, Sturcke A, Paschal J, Anderson JM, Wilder EL, Derr LK, Green ED, Struewing JP, Temple G, Volpi S, Boyer JT, Thomson EJ, Guyer MS, Ng C, Abdallah A, Colantuoni D, Insel TR, Koester SE, Little AR, Bender PK, Lehner T, Yao Y, Compton CC, Vaught JB, Sawyer S, Lockhart NC, Demchok J, Moore HF. 2013. Adult Genotype-Tissue Expression (GTEx) project. GTExPortal. adult-gtex

    Farber CR, Bennett BJ, Orozco L, Eskin E, Lusis AJ. 2011. Bone mineral density in males of 96 strains of mice in the Hybrid Mouse Diversity Panel (HMDP) Mouse Phenome Database. HMDPpheno1

    Ghazalpour A, Bennett BJ, Orozco L, Lusis AJ. 2014. Liver metabolite levels: Nucleotides and peptides in males of 104 strains of the Hybrid Mouse Diversity Panel (HMDP) Mouse Phenome Database. HMDPpheno10

    Bennett BJ, Lusis AJ. 2015. Diet effects on aortic lesion size in 103 Hybrid Mouse Diversity Panel strains that are transgenic for human APOE*3Leiden and human CETP (F1 hybrids) Mouse Phenome Database. HMDPpheno11

    Bennett BJ, Farber CR, Orozco L, Eskin E, Lusis AJ. 2010. Plasma lipids and body composition in males of 99 strains of mice in the Hybrid Mouse Diversity Panel (HMDP) Mouse Phenome Database. HMDPpheno2

    Korshunov VA. 2012. Heart rate and blood pressure in males of 58 strains of mice. Mouse Phenome Database. HMDPpheno3

    Park CC, Gale GD, Lusis AJ, Smith DJ. 2011. Fear conditioning in males of 94 strains of mice in the Hybrid Mouse Diversity Panel (HMDP) Mouse Phenome Database. HMDPpheno4

    Davis RC, van Nas A, Bennett BJ, Eskin E, Lusis AJ. 2013. Hematological parameters in males of 83 strains of mice in the Hybrid Mouse Diversity Panel (HMDP) Mouse Phenome Database. HMDPpheno5

    Ghazalpour A, Bennett BJ, Orozco L, Lusis AJ. 2014. Liver metabolite levels: Amino acids in males of 104 strains of the Hybrid Mouse Diversity Panel (HMDP) Mouse Phenome Database. HMDPpheno6

    Ghazalpour A, Bennett BJ, Orozco L, Lusis AJ. 2014. Liver metabolite levels: Carbohydrates, cofactors and vitamins, energy, and xenobiotics in males of 104 strains of the Hybrid Mouse Diversity Panel (HMDP) Mouse Phenome Database. HMDPpheno7

    Ghazalpour A, Bennett BJ, Orozco L, Lusis AJ. 2014. Liver metabolite levels: Lipids (except lysolipids) in males of 104 strains of the Hybrid Mouse Diversity Panel (HMDP) Mouse Phenome Database. HMDPpheno8

    Ghazalpour A, Bennett BJ, Orozco L, Lusis AJ. 2014. Liver metabolite levels: Lipids (lysolipids) in males of 104 strains of the Hybrid Mouse Diversity Panel (HMDP) Mouse Phenome Database. HMDPpheno9


    Articles from eLife are provided here courtesy of eLife Sciences Publications, Ltd

    RESOURCES