The impact of exercise on gene regulation in association with complex trait genetics

Nikolai G Vetr; Nicole R Gay; MoTrPAC Study Group; Stephen B Montgomery

doi:10.1038/s41467-024-45966-w

2024 May 1;15:3346. doi: 10.1038/s41467-024-45966-w

PMCID: PMC11063075 PMID: 38693125

Abstract

Endurance exercise training is known to reduce risk for a range of complex diseases. However, the molecular basis of this effect has been challenging to study and largely restricted to analyses of either few or easily biopsied tissues. Extensive transcriptome data collected across 15 tissues during exercise training in rats as part of the Molecular Transducers of Physical Activity Consortium has provided a unique opportunity to clarify how exercise can affect tissue-specific gene expression and further suggest how exercise adaptation may impact complex disease-associated genes. To build this map, we integrate this multi-tissue atlas of gene expression changes with gene-disease targets, genetic regulation of expression, and trait relationship data in humans. Consensus from multiple approaches prioritizes specific tissues and genes where endurance exercise impacts disease-relevant gene expression. Specifically, we identify a total of 5523 trait-tissue-gene triplets to serve as a valuable starting point for future investigations [Exercise; Transcription; Human Phenotypic Variation].

Subject terms: Data integration, Gene expression, Genome-wide association studies, Transcriptomics

It is known that exercise influences many human traits, but not which tissues and genes are most important. This study connects transcriptome data collected across 15 tissues during exercise training in rats as part of the Molecular Transducers of Physical Activity Consortium with human data to identify traits whose tissue-specific gene expression signatures resemble those of exercise.

Introduction

Endurance exercise is associated with multiple positive health outcomes^1,2. However, the molecular basis of these positive effects has been challenging to study, with past work restricted to molecular assays in either few or easily accessible tissues³. Even when prior differential analyses have identified exercise-responsive genes, there is often limited evidence for their shared molecular impact on disease. To address this challenge, we have combined the extensive, multi-tissue transcriptome data from the Molecular Transducers of Physical Activity Consortium (MoTrPAC) preclinical endurance exercise training (EET) study in rats⁴ with data from the Genotype-Tissue Expression (GTEx) project, where genetic differences in expression levels have been previously connected to 114 traits and diseases from publicly available Genome Wide Association Studies (GWAS) distributed across several phenotypic and plausibly exercise-responsive categories⁵ (note: acronyms and abbreviations used in this paper are summarized in Supplementary Table 1). The MoTrPAC EET study provided differential expression results after treadmill exercise training for both female and male F344 rats, with multiple tissues harvested at 1, 2, 4, and 8 weeks of training. All samples were harvested 48 h after the last exercise bout, and the 8-week time point was taken to correspond to the adapted state, as it allowed for the greatest degree of long-term adaptation to exercise to have occurred, as well as the least degree of unadapted acute response (e.g., inflammation). In rats, as in humans, exercise capacity is a genetic trait with well-studied relationships across a range of human-relevant complex traits and diseases^6–8. Combined, these data provided a cross-tissue, whole organism molecular view of adaptation to exercise that is unattainable in human participants.

To assess the relationship between exercise adaptation and complex disease risk in distinct tissues, we applied a combination of heritability and transcriptome-wide association study (TWAS) analyses (Fig. 1a). These analyses are state-of-the-art in human genetics but have yet to be broadly applied cross-species in the context of exercise adaptation. They allow us to investigate exercise adaptation genes and gene sets for their relationship to specific complex diseases. We applied LDSC⁹, which accounts for linkage disequilibrium, to estimate the SNP-heritability ($h_{SNP}^{2}$)¹⁰ captured by sets of exercise adaptation genes, alongside MESC¹¹, which incorporates both GWAS and Expression Quantitative Trait Loci (eQTL) summary statistics to estimate the proportion of $h_{SNP}^{2}$ mediated by gene expression within and across tissues. Together, these analyses assess the relationship between genetic variability and adaptive exercise training response. Finally, we leveraged published S-PrediXcan^12,13 output, which estimates gene by tissue-level associations and directions of effect for specific diseases, to identify genes where changes in gene expression due to exercise adaptation have the potential to alter disease risk.

Fig. 1 — In (a), we provide a general overview of the work described here, from human genetic and transcriptomic data and rats subjected to an Endurance Exercise Training (EET) experimental perturbation to triplets of causally entangled genes, tissues, and traits. In (b), we subset Differentially Expressed (DE) Genes to just those determined to be DE at 8W in a sex-consistent manner, and visualize their distribution and tissue-specific composition across uniquely DE genes (genes DE in only one tissue), pairs (in two tissues), triplets, etc. To check for overlap in these gene sets, we also plot the upper triangle of a Jaccard Similarity matrix. In (c), we present alternative ways to characterize Open Targets associations across these gene sets.

Using these data and approaches, we identify gene and tissue combinations where expression levels could mediate disease risk and where exercise training had the potential to induce expression differences capable of overwhelming the impact of human standing variation measured in the GTEx study, both overall and with respect to its genetic component. We further assess if specific diseases and traits are enriched for genes differentially expressed in exercise training, both in their overall occurrence and in their directionality of effect. Combining these approaches, we identify specific genes that lie at this intersection of biological relevance as candidates where exercise effects could override expression-mediated disease risk.

Results

Exercise training has unique disease gene signatures across tissues

Exercise training induces differential expression of rat genes across multiple body tissues, and many of these genes can be mapped to human orthologs: 94.5% of all unique, differentially expressed (DE) genes (87–98% across tissues), and 79% of all expressed genes (85–93% across tissues). However, most of these changes exhibit marked tissue specificity. As observed in the main MoTrPAC PASS1B paper⁴, we found that after long-term exercise training, there was limited overall concordance of adaptive differential expression across tissues in the subset of rat genes with identifiable human orthologs (hereafter ‘genes’, unless otherwise noted). Only two pairs of tissues in females—the skeletal muscles vastus lateralis and gastrocnemius, as well as white adipose and the colon—produced Spearman’s ‘ρ’s at a level greater than 0.3 (Supplementary Fig. 1b). Further, there was little overlap in differentially expressed gene sets (DEGs) corresponding to each tissue (Fig. 1b). Approximately 78% of DE genes were unique and differentially expressed in only one tissue, and 95% of genes matched at most a pair of tissues. Only one pair of tissues showed a Jaccard index > 0.1 (the gastrocnemius and the vastus lateralis, Jaccard Similarity ≈ 0.21). This indicates that unique genes and pathways adapt to exercise training in different tissues and likely impact different subsets of disease-relevant genes. To this point, we observed 370 high-scoring (Open Targets evidence score > 0.8, an arbitrary threshold chosen to select gene × trait relationships with high levels of supporting evidence) disease genes from 251 traits across our 15 surveyed tissues (Fig. 1c) that were consistently responsive to exercise training in both males and females, with an average of 18.2 genes per tissue. When we excluded easily biopsied tissues such as blood, skeletal muscles, and adipose, we found 178 well-established disease genes associated with an adaptive exercise training response. 
This corresponded to 143 traits and included 101 traits without any gene-trait associations in an easily biopsied tissue. Notably, these included well-studied genes such as LDLR (DE in {CORTEX, HIPPOC, SKM-GN, SKM-VL}) and APOB (DE in {COLON, KIDNEY, LUNG}), both confidently associated with hypercholesterolemia; SLC6A8 (DE in {HEART, LIVER, LUNG}), associated with creatine transporter deficiency; FOXP3 (DE in {HEART, SPLEEN}), associated with immune dysregulation; and BRCA2 (DE in ADRNL), associated with breast neoplasia.
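
The Jaccard index used above to compare differentially expressed gene sets between tissues can be illustrated with a minimal sketch. The gene sets below are hypothetical toy values (the `GENE_*` names are placeholders), not data from the study, and the function names are our own.

```python
def jaccard(a, b):
    """Jaccard similarity |A ∩ B| / |A ∪ B| between two gene sets."""
    a, b = set(a), set(b)
    union = a | b
    # Two empty sets share nothing to compare; return 0.0 by convention here.
    return len(a & b) / len(union) if union else 0.0


def pairwise_jaccard(sets_by_tissue):
    """Upper triangle of the Jaccard matrix, keyed by sorted tissue pairs."""
    names = sorted(sets_by_tissue)
    return {
        (x, y): jaccard(sets_by_tissue[x], sets_by_tissue[y])
        for i, x in enumerate(names)
        for y in names[i + 1:]
    }


# Hypothetical toy DE gene sets, for illustration only.
deg_sets = {
    "SKM-GN": {"LDLR", "GENE_A", "GENE_B", "GENE_C"},
    "SKM-VL": {"LDLR", "GENE_A", "GENE_D"},
    "LUNG": {"APOB", "SLC6A8"},
}
similarities = pairwise_jaccard(deg_sets)
```

In this toy case the two skeletal muscles share two of five distinct genes, so their index is 0.4, while the lung set overlaps neither.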

Exercise effects on regulation of gene expression

We sought to identify where changes in gene expression due to exercise training could potentially overcome either genetic or standing variability measured in GTEx. Here, our hypothesis was that exercise behavior may be more impactful than baseline variance or genetics at these loci for the component of disease risk mediated by gene expression. For each gene and tissue where we could detect non-zero genetic $h_{SNP}^{2}$ (IHW α = 0.10), we calculated genetic variance as the product of the estimated heritability and observed total phenotypic variance (Fig. 2). At 8W, we observed an average of 1 (range: 0–10) genes per tissue in at least one sex with effect sizes in trained rats that exceeded 2 SD of the genetic component of expression variability of the matched sex in humans (SD_geno), and 52 (range: 1–586) genes per tissue with DE exceeding 2 SD of overall expression variability (SD_pheno), with the latter set featuring ≈ 50 genes per tissue whose $h_{SNP}^{2}$ could not be significantly distinguished from 0 after multiplicity adjustment. Intersecting these genes with Open Targets, we observed 30 unique > 2 SD_pheno DE genes with > 0.8 evidence scores, though only 14 of these were expressed in less accessible tissues. APOB, mentioned above, was included in the latter group (DE in male lung at ≈ +9.7 SD_pheno, and in the female colon at ≈ +2.4 SD_pheno).
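
As a rough illustration of the standardization described above, the sketch below divides an exercise-induced log₂ fold change by the phenotypic and genetic standard deviations of expression, with genetic variance taken as heritability times phenotypic variance. The function name and input values are hypothetical, and the sketch omits the inverse-gamma regularization of expression variance shown in Fig. 2.

```python
import math


def standardized_effects(log2fc, h2_snp, var_pheno):
    """Return (log2FC / SD_pheno, log2FC / SD_geno).

    var_pheno is the total variance of log2 expression in the reference
    population; genetic variance is h2_snp * var_pheno. SD_geno is left
    undefined (None) when heritability is not positive.
    """
    sd_pheno = math.sqrt(var_pheno)
    effect_pheno = log2fc / sd_pheno
    if h2_snp > 0:
        sd_geno = math.sqrt(h2_snp * var_pheno)
        effect_geno = log2fc / sd_geno
    else:
        effect_geno = None
    return effect_pheno, effect_geno


# Hypothetical values: log2FC of 0.5, heritability 0.25, variance 0.04.
effect_pheno, effect_geno = standardized_effects(0.5, 0.25, 0.04)
```

With these toy inputs the gene sits 2.5 SD_pheno and 5 SD_geno from baseline, which shows why a gene can clear the 2 SD threshold on the genetic scale more easily than on the phenotypic one when heritability is low.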

Fig. 2 — In this figure we visualize the procedure used to obtain Standardized Effect Sizes. In the numerator of (a) lies a truncated kernel density estimate of the distribution of log₂FCs induced by exercise at the 8w timepoint. A stacked histogram of estimates for $h_{SNP}^{2}$ for expression scores from GTEx (p < 0.10 after IHW correction) is on the left in the denominator. On the right are the inverse-gamma distributions serving to regularize log₂-normalized expression scores. Together, this results in a value equal to the estimated genetic variance of expression. Taking its square root yields a standard deviation, which we use to divide exercise-induced log₂FC. In (b), we plot the empirical quantile function for distributions of ratios of each tissue’s exercise-induced log₂(gene expression) / SD(log₂(gene expression)). As most of the interesting behavior is contained in the tails of each distribution, we applied two separate transforms to the axes of each plot. The horizontal axis, corresponding to a given quantile in (0,1), was logit-transformed. The vertical axis, corresponding to the ratio of DE / SD(log₂expression), received an inverse hyperbolic sine transformation. The upper panels are in units of standardized phenotypic effect, and the lower in units of standardized genetic effect for those genes and tissues with significant non-zero $h_{SNP}^{2}$ (IHW α = 0.10, one-sided). Source data for this figure are provided as a Source Data file.

Heritability of complex disease enriched in or near training-responsive genes

We investigated whether exercise specifically modulates any traits or diseases by building on a previous approach¹⁴ to identify these effects. First, we computed the trait or disease heritability for gene sets that were differentially expressed due to exercise training in each tissue at 8W in both sexes and in the same direction. We observed the strongest magnitude of enrichments in blood phenotypes in the blood tissue, especially traits corresponding to densities of immune cells (Fig. 3).

Fig. 3 — Here, we visualize conditional heritability enrichments (LDSC) of multiple traits within differentially expressed, sex-independent gene sets corresponding to different tissues. Colors distinguish tissues, with opaque diamonds corresponding to IHW-significant hits (α = 0.05), and size proportional to the magnitude of log₁₀(p-value) (two-sided, adjusted for multiple comparisons with IHW). The horizontal axis corresponds to the heritability enrichment factor, and the vertical to GWAS traits, grouped into high-level categories. Traits lacking an IHW-significant (α = 0.05) hit in at least one tissue are excluded from this visualization, and the horizontal axis has been truncated to exclude non-significant enrichments above that of the maximum significant enrichment, as well as estimated enrichments < 0, which are strictly impossible. Source data for this figure are provided as a Source Data file.

Across the 43 traits with at least one significant enrichment at Bonferroni-adjusted α = 0.05, the largest significant enrichment factor corresponded most often to the spleen (22/43 ≈ 51%), especially in Blood (9/14) and Immune (6/7) phenotypes, with an average enrichment factor of ≈2.85 across significant enrichments. The proportion of heritability captured by these gene sets is on the order of ≈ 10% (Supplementary Fig. 2a) and corresponds to broadly independent signals across tissues (Supplementary Fig. 2b–c). This approach provides a general prioritization for assessing which traits or diseases could be most impacted by exercise training. However, we performed simulation experiments using randomly sampled gene sets of equivalent size to our original tissue-specific gene sets. These produced highly similar distributions of p-values to those observed for the empirical data. As such, these results (Fig. 3) should be interpreted less in the framework of null-hypothesis significance testing and more descriptively, as a relative ordering of estimated magnitudes of effect.
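
For orientation, heritability enrichment in stratified LDSC is conventionally reported as the proportion of heritability attributed to an annotation divided by the proportion of SNPs it contains. The sketch below assumes that definition applies here; the function name and numbers are hypothetical and are not taken from the study's source data.

```python
def enrichment_factor(h2_in_set, h2_total, snps_in_set, snps_total):
    """Proportion of heritability in a gene-set annotation divided by
    the proportion of SNPs that fall in that annotation."""
    prop_h2 = h2_in_set / h2_total
    prop_snps = snps_in_set / snps_total
    return prop_h2 / prop_snps


# Hypothetical example: an annotation holding 3.5% of SNPs that captures
# 10% of SNP-heritability.
example_enrichment = enrichment_factor(0.05, 0.5, 35_000, 1_000_000)
```

Under these toy inputs the enrichment is about 2.86, meaning the annotation carries nearly three times the heritability expected from its share of SNPs.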

PrediXcan-significant genes overlap adaptive training-response genes

We examined the intersection of genes that are differentially expressed at “8w_F1_M1” and “8w_F-1_M-1” (i.e., sex- and direction-consistent after 8 weeks of training) and IHW-significant PrediXcan hits (Fig. 4a). We were able to identify substantial enrichment in many of the traits, trait categories, tissues, and tissue-by-trait pairs through the use of a hierarchical Bayesian model able to partially pool estimates of difference effects towards the means of their respective populations. Here, we see confident (>95% posterior probability) enrichments across all levels of the model hierarchy (Fig. 4e–g). Specifically, we observed confident positive differences in the colon, kidneys, small intestines, spleen, hippocampus, lungs, and heart, in order of decreasing posterior mean, as well as in the Endocrine and Cardiometabolic categories. We also noted several specific trait enrichments across cardiometabolic markers, mainly cholesterol and saturated fatty acids. At the trait × tissue level, posterior outputs were broadly uncertain in most pairs’ directionality of enrichment, with a smaller number showing stronger confidence in positive enrichment (Fig. 4b).
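
The “confident” label used here can be read as a statement about posterior draws: a difference parameter is flagged when more than 95% of its marginal posterior mass lies on one side of 0. The sketch below shows that bookkeeping on hypothetical draws; it is not the hierarchical model itself, and the function names are our own.

```python
def posterior_mass_above_zero(draws):
    """Fraction of posterior draws strictly greater than 0."""
    draws = list(draws)
    return sum(d > 0 for d in draws) / len(draws)


def is_confident(draws, threshold=0.95):
    """True when more than `threshold` of the mass lies on one side of 0.

    Draws exactly equal to 0 are counted with the negative side, which is
    an arbitrary choice that rarely matters for continuous posteriors.
    """
    mass = posterior_mass_above_zero(draws)
    return max(mass, 1.0 - mass) > threshold


# Hypothetical draws standing in for MCMC output.
enriched_draws = [0.1] * 97 + [-0.1] * 3
ambiguous_draws = [0.1] * 60 + [-0.1] * 40
```

The same helper accommodates the looser 90% criterion applied to the directionality model later in the text by passing `threshold=0.90`.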

Fig. 4 — Here, we visualize fitted output from our PrediXcan-DEG intersect enrichment model (n = 10,000 nominal iterations across four independent chains). In (a), we show the sizes of gene sets in the intersect of PrediXcan hits (IHW α = 0.05) across different traits (horizontal axis) and sex-homogeneous, differentially expressed genes (DEGs) at 8W in different tissues (vertical axis). Cell colors correspond to the size of the intersecting gene set. Numbers in each cell give the size of each intersect, with cell corners labeling cells whose marginal posterior difference parameter has >95% of its mass to one side of 0. Marginal counts give the maximum number of PrediXcan hits in each trait (vertical margin) or DEGs in each tissue (horizontal margin), after constraining the total pool to mutually expressed genes. In (b), we plot the histogram of posterior masses > 0 for trait × tissue difference effects, with colors drawn from cell labels in (a). In (c), the vertical axis corresponds to logit-transformed frequencies of PrediXcan hits in the DEGs from (a), and the horizontal axis represents the corresponding frequency in all genes outside this set. Only traits from six trait categories are depicted, with colors corresponding to tissues and shapes to categories. In (d), the vertical axis maps to the proportion of positive effects in the PrediXcan-DEG intersect across traits and tissues, and the horizontal axis to the same proportion outside that intersect. Point diameter is proportional to the square root of intersect gene set size, with colors and shapes retaining their meaning from (c). In (e–g), marginal posterior distributions from our intersect enrichment model are shown as violin plots, with internal lines representing middle 90% credible intervals and internal points representing posterior means. Internal line and point colors are white when the credible interval overlaps with 0, and black otherwise. 
Violins are arranged in order of increasing posterior mean, with the horizontal axis on the logit scale. In (e), we plot these at the tissue level, in (f) at the trait category level, and in (g) at the trait level. Source data for this figure are provided as a Source Data file.

Conversely, none of the frequentist analyses of this overlap produced significant results at FWER α = 0.05 (-log₁₀(0.05) ≈ 1.30, one-sided), with the most significant result corresponding to the multi-tissue GSEA for high cholesterol at an adjusted p-value of ≈ 0.064 (Supplementary Fig. 3, ES ≈ 0.63, log2err ≈ 0.48). However, results were broadly concordant across the two approaches, and more confident posterior distributions corresponded to lower frequentist p-values, with intermediate positive Spearman’s ρs for pairwise and trait-wise comparisons (Supplementary Fig. 3a–c). Frequentist meta-analyses of tissue and trait-category enrichments were in less confident agreement, with the latter showing mild disagreement, though at p ≈ 0.53 (output from stats::cor.test in R).

Exercise induces both more and less disease-like differential gene expression

To identify the direction of training effect in these intersecting gene sets, we queried the posterior output from a second multilevel model, visualizing posterior means for each tissue and tissue-by-trait combinations as a dot plot (Fig. 5). Given the reduced capacity for signal in these data (focal totals no longer being the set of DEGs, but the set of DEGs ∩ PrediXcan hits), we report on confident effects when a posterior mass is >90% to one side of 0. As such, the strongest confident mean enrichments for positive effects were observed in body fat percentage, asthma, and body mass index (BMI), and the strongest mean depletions in standing height and high cholesterol, though of the latter only standing height was “confident”. Otherwise, body fat percentage was the only trait with posterior difference >95% in either direction. As traits varied in the degree to which they could be considered harmful or beneficial, we could not evaluate gross tissue effects across traits, but at the tails of each trait’s hyperdistribution, blood, spleen, and the two skeletal muscles had the strongest degree of deviation from null expectation. Additionally, ≈83% of the posterior mass of our G_SNP weight parameter θ fell above 0.5, with ≈28% falling above 0.9.

Fig. 5 — We visualize the posterior means of multilevel trait and trait × tissue terms from a Bayesian model corresponding to the proportion of genes imparting a positive effect on traits aligned along the horizontal axis. Diamonds mark traits or trait × tissue pairs whose difference effect’s posterior mass falls entirely to one side of 0, either prepending the trait name or else marking the trait × tissue symbol. Point sizes are in proportion to the square root of sample size, and traits are arranged on the horizontal axis according to the monotonic decrease in their posterior means. Source data for this figure are provided as a Source Data file.

When examining the direction of trajectories for 8-week gene sets linked to the two non-anthropometric traits, we noticed a regression towards a mean proportion of 0.5 across tissues. This is likely due to underlying genes only being differentially expressed at later time points (Fig. 6). Examining which genes and tissues correspond to both high deviation from the mean and relatively large amounts of DE, we observed blood genes associated with lower cholesterol in males (NDUFA13, FADS2, PNKD, AAMP, and OGDH), as well as the male training vastus lateralis gene TMBIM1, the female-specific training gene APOB in colon, and the female training gene ABCG8 in liver. With respect to increased risk of asthma, blood genes again had the largest relative effect sizes in males (BAG6, CCNG, CRAT, PTPA, and FAM89B), with female training genes exhibiting the largest effects in ATP6V1G2 in the vastus lateralis, ENDOU in white adipose, and CCNF in blood.

Fig. 6 — We visualize the observed proportion of positive effects for the two non-anthropometric traits that had the highest posterior mean enrichment in that proportion according to our proportion of positive effects enrichment model. Above, two panels correspond to self-reported high cholesterol, and below, self-reported asthma. Lines terminate at 8W on the right of each panel, splitting into tissues and genes. Additionally, we trace the proportion for the 8w gene set backwards in time, examining the effect of those genes at 1w, 2w, and 4w. Tissue names are followed by the total number of genes in the intersecting gene set in parentheses, and gene names are followed by their sign (a red + if the effect of DE on the trait is positive and a blue - if negative), and the standardized effect size of DE from Fig. 2b. Additionally, we plot a line corresponding to the set of all gene-tissue pairs in black, labeled ALL. Source data for this figure are provided as a Source Data file.

Discussion

In our study, we have identified multiple tissues and tissue-by-gene pairs where exercise may modify disease risk through gene expression. Despite human-rat differences, our unbiased approach identified multiple results that echo established exercise-disease relationships. However, some findings were unexpected.

Gene sets that responded to exercise were enriched for PrediXcan genes linked with cardiometabolic traits (Fig. 4e, f). The intersection of these genes seems to lean away from disease-like effects (Fig. 5), but we also found disease-like effects for genes associated with asthma and body fat percentage (Fig. 5). These associations, however, did not exhibit intersect sizes larger than expected by chance, and the latter showed only weak evidence of depletion (Fig. 4f). Additionally, when aggregating across traits, several of the most “classically” exercise-responsive tissues (the skeletal muscles and white adipose) appeared to be among the most depleted for PrediXcan hits (Fig. 4d), though no marginal posterior distributions for difference parameters there reached our 95% posterior mass threshold. Overall, estimates for enrichments and depletions in both intersect and directionality effects were small, even for confidently non-zero effects, predominantly varying within 0 ± 0.3 on the log-odds scale (Figs. 4e–g, 5). This corresponds to a maximum difference of ≈7.5% on the probability scale (inv-logit(0.15) - inv-logit(-0.15)), and is consistent with the relatively small deviations observed from the 1-to-1 lines in Fig. 4c–d. Interpretation of these findings on exercise biology should not lose sight of this context: small, subtle, but nevertheless discernible association.
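
The log-odds to probability conversion quoted above can be checked directly. The sketch below evaluates the stated expression, inv-logit(0.15) - inv-logit(-0.15), which treats the range as a window of total width 0.3 log-odds units centered on 0 (our reading of the expression, since the text writes 0 ± 0.3). For comparison it also evaluates a symmetric ±0.3 window. The function names are our own.

```python
import math


def inv_logit(x):
    """Inverse logit (logistic) function mapping log-odds to probability."""
    return 1.0 / (1.0 + math.exp(-x))


def probability_span(half_width):
    """Probability difference across a log-odds window of ±half_width around 0."""
    return inv_logit(half_width) - inv_logit(-half_width)


span_015 = probability_span(0.15)  # the expression given in the text
span_030 = probability_span(0.30)  # a symmetric ±0.3 window, for comparison
```

The first value is about 0.075, matching the ≈7.5% figure, while a full ±0.3 window would span about 0.149. In both cases the effects remain small on the probability scale.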

In the case of body fat percentage (BF%), it may be that absent dietary control (for example, when rats are fed ad libitum) genes are regulated in a manner that elicits increased fat storage as an adaptation to higher energy expenditure¹⁵. Thus, even though exercise may often be done by humans with the goal of reducing BF% through increased caloric expenditure, an interaction with diet modulates this response. However, white fat itself does not appear to be enriched for positive effects on BF%, with the most pronounced enrichments evident in blood and heart. Similarly, the strongest depletion for positive effects, both within anthropometric traits and overall, occurred in the standing height phenotype. Evidence for exercise effects on height is weak and ambiguous. However, particularly intense exercise may have an attenuating effect on growth^16–18, especially under nutritional stress, which may partially underlie the associations observed here.

Asthma also emerged as a trait whose transcriptional effects are shared with exercise. This may be due to similar etiology between the general asthmatic condition measured by self-report and exercise-induced bronchoconstriction (EIB), where lung epithelial stress from exercise and increased drying and cooling of the airways due to increased ventilation trigger an inflammatory response alongside shortness of breath¹⁹. Though no individual tissues were found to be confidently enriched in positive effects for this phenotype (Fig. 5), DE genes in the spleen, a key immune and inflammatory response regulator²⁰, emerged as having the greatest enrichment in $h_{SNP}^{2}$, comprising nearly four times baseline expectation and accounting for ≈10% of trait heritability overall (Fig. 3, Supplementary Fig. 2a). Moreover, DE genes in the spleen had the highest $h_{SNP}^{2}$ enrichment across a number of additional immune cell and disease phenotypes, and both eosinophil and basophil counts were found to have moderate genetic correlations with the asthma phenotype, highlighting the recently proposed roles these cell types play in structuring EIB^21,22. Finally, even where exercise regulates gene expression in ostensibly “disease-like” directions, it may be that many phenotypes such as those above manifest when inflammatory, hunger-regulating, or other effects of exercise occur without having first been induced by exercise. We hypothesize that by subjecting the body to disease-like stresses, regular exercise elicits adaptation to the symptoms of those diseases, reducing the risk of their manifestation from the disease itself. In this light, the presence of their signal here may also be expected.

At the gene level, several of the highlighted genes in the DEG / PrediXcan intersection were supported by prior literature. In the cholesterol phenotype, FADS2²³, PNKD²⁴, OGDH²⁵, TMBIM1²⁶, APOB²⁷, and ABCG8²⁸ have all been implicated previously, while NDUFA13 and AAMP have not. For asthma and/or reduced lung function, links have been drawn to BAG6²⁹, CCNF³⁰, and CRAT³¹, though other genes are mentioned in similar contexts to that explored in this work, also relying on integration of eQTL and GWAS association mapping (e.g., FAM89B³²).

A notable limitation of this study may be that, despite their well-established use as an exercise model, rats are separated from humans by nearly 140 million years of evolution³³. Comparison of exercise-independent age and sex effects, meanwhile, may be limited by differences in age between individuals in GTEx and MoTrPAC, as most humans in the GTEx v8 dataset were aged 50+^34,35 while trained F344 rats were uniformly under eight months of age and therefore well under the age of onset of the F344 rat equivalent of sex-specific, aging-related changes such as menopause³⁶. These results may also have limited portability to non-European populations, as the GTEx sample comprises mostly European-descendant individuals. Identification of rat-human gene orthology is another difficult problem, and important biology almost certainly lies within disease and exercise-responsive genes across species whose correspondence cannot be easily established. But while species differences can complicate interpretation of exercise-induced regulation of orthologous genes, these models remain crucial and provide high levels of experimental compliance and tissue accessibility from individuals who are far more straightforward to motivate. As such, a unique aspect of the MoTrPAC rat exercise training data is the availability of differential expression data across 15 distinct tissues, many impossible or impractical to collect in humans as part of an exercise study. Accelerated rat life history also makes it feasible to conduct experiments on exercise training adaptation on timescales relevant to their lifespan. It is also simpler to regulate rat behavior than human behavior, reducing biases linked to non-compliance and attrition.

We expect future studies can benefit and expand on this work in several ways. Qualitative sex-specificity, a notable hallmark of exercise adaptation in humans^37,38, fell outside the scope considered here, though is afforded closer treatment in companion publications³⁹. Future causal inferential work may use the genetic correlates of physical activity⁴⁰ as instruments to infer tissue-specific drivers of phenotypic adaptation⁴¹ in humans. But analysis of experimental data from animal models will complement these efforts where genetic effects are weak (Fig. 2a), targeting causality directly to identify how tissue and organ systems adapt to exercise and influence a large variety of human traits and diseases. Finally, we expect that future studies may benefit from our work by evaluating specific loci therein for GxE interactions within large-scale human population biobanks. Combined, MoTrPAC’s EET study provides a large-scale, cross-tissue map of changes in exercise adaptation that enables generating new mechanistic hypotheses on the disease impacts of exercise training.

Methods

This study did not generate novel data, instead relying on data published in previous or concurrent studies. Animal procedures from the concurrent MoTrPAC PASS1B study⁴ were approved by the University of Iowa’s Institutional Animal Care and Use Committee.

MoTrPAC EET study design

The MoTrPAC⁴² Endurance Exercise Training Study is described in detail in the landscape manuscript⁴ (data accessible at https://motrpac-data.org/data-access). In brief, both female and male F344 rats were subjected to treadmill exercise training, with tissues harvested at 1, 2, 4, and 8 weeks of training. All samples were taken 48 h after the last exercise bout, with the 8-week time point taken to correspond to the adapted state. In this work, we leverage data from a total of 738 extracted samples across 15 tissues and 47–50 rats per tissue that were subjected to RNA-sequencing and differential expression analysis.

Differential expression analysis

Differential expression analysis (DEA) is described in detail by the main MoTrPAC manuscript⁴. Briefly, DEA was performed separately in each sex and tissue using filtered raw counts as input for DESeq2⁴³. Likelihood ratio tests (DESeq2::nbinomLRT()) were used to identify genes that changed over the training time course in at least one sex while accounting for RNA-Seq technical covariates (RNA integrity number, median 5′-3′ bias, percent of reads mapping to globin, and percent of PCR duplicates as quantified with Unique Molecular Identifiers). For each gene, male- and female-specific p-values were combined using the Fisher’s sum of logs method. These meta-analytic p-values were adjusted across all RNA-Seq datasets using Independent Hypothesis Weighting (IHW) with tissue as a covariate⁴⁴. Training-differential genes were selected at 5% IHW α. Given the regression model of each gene described above, contrasts were made between each training timepoint (i.e., 1, 2, 4, or 8 weeks) and the sex-matched sedentary controls using DESeq2::DESeq() to calculate time- and sex-specific summary statistics.
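
Fisher’s sum of logs method, used above to combine the male- and female-specific p-values, can be sketched as follows. This is an illustrative Python implementation with a hypothetical function name, not the code used in the study; it relies on the closed form of the chi-square survival function for even degrees of freedom and assumes all input p-values are strictly positive.

```python
import math


def fisher_combine(p_values):
    """Combine independent p-values with Fisher's sum of logs method.

    The statistic -2 * sum(log p) follows a chi-square distribution with
    2k degrees of freedom under the null, where k is the number of
    p-values. For even degrees of freedom the survival function is
    exp(-x/2) * sum_{i<k} (x/2)^i / i!.
    """
    k = len(p_values)
    statistic = -2.0 * sum(math.log(p) for p in p_values)
    half = statistic / 2.0
    term, total = 1.0, 1.0
    for i in range(1, k):
        term *= half / i
        total += term
    return math.exp(-half) * total


# Hypothetical sex-specific p-values for one gene.
combined_p = fisher_combine([0.05, 0.05])
```

For two p-values this reduces to p₁p₂(1 - ln(p₁p₂)), so two nominal 0.05 results combine to roughly 0.017.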

Correlation of differential analysis results

The nominal p-values and log fold-changes from the time- and sex-specific differential expression analysis results were transformed into standard normal random variables using qnorm(p-value / 2, lower.tail = F) * sign(log₂FC) in base-R. These “z-scores” were organized into a gene-by-condition matrix, where conditions were tissue, sex, and timepoint combinations. The z-score matrix was filtered to include the set of genes that had no missing values across all conditions. We calculated the Spearman correlation between all pairs of conditions to quantify the concordance of the training effect across conditions.
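As a rough illustration of this transformation, the sketch below converts a two-sided p-value and a log fold-change into a signed z-score and computes a Spearman correlation between two conditions. It is a Python stand-in for the base-R calls named above, not the authors' implementation; `signed_z`, `average_ranks`, and `spearman` are hypothetical names, and the sketch uses the symmetry of the normal distribution, so that the upper-tail quantile of p/2 equals the negated lower-tail quantile.

```python
from statistics import NormalDist


def signed_z(p_value: float, log2_fc: float) -> float:
    """Signed z-score from a two-sided p-value and a log2 fold-change."""
    # By symmetry, the upper-tail quantile of p/2 is minus the lower-tail one.
    magnitude = -NormalDist().inv_cdf(p_value / 2.0)
    sign = (log2_fc > 0) - (log2_fc < 0)
    return magnitude * sign


def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2.0 + 1.0
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: the Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)
```

Each column of the gene-by-condition matrix would be built by applying `signed_z` gene by gene, and `spearman` would then be evaluated for every pair of columns.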

Graphical clustering of differential analysis results

Graphical clustering of differential analysis results is described in detail in the main MoTrPAC EET study manuscript⁴. All training-differential features at 5% IHW α were clustered into homogeneous patterns using their time- and sex-specific differential analysis z-scores. The statistical details are provided elsewhere^4,45–47. Briefly, the expectation-maximization (EM) process of the repfdr algorithm was used to assign one of three simplified states to each z-score: −1 for down-regulation, 0 for null (no change), or 1 for up-regulation⁴⁵. For each feature and timepoint, the simplified states from each sex were combined into one of nine possible states (−1, 0, or 1 for each sex). For example, the state “F1_M1” represented a feature that was up-regulated in both females (F1) and males (M1) at a given timepoint. Here, to focus on genes with sex-consistent training effects in the trained state, we selected genes that were assigned to the F1_M1 state (up-regulated in both sexes) or the F-1_M-1 state (down-regulated in both sexes) at 8 weeks. To enable comparison between genes expressed in rats and humans, we compiled a MoTrPAC rat-to-human ortholog map from GENCODE and RGD resources^4,48,49. The distribution of genes that could be matched to human orthologs across tissues is summarized in Fig. 1b.
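The state-combination and selection rule described here can be sketched as follows. This Python fragment only illustrates the labeling logic: the per-sex states are taken as given (the repfdr assignment itself is not reproduced), and the function names and gene identifiers are hypothetical.

```python
def state_label(female_state: int, male_state: int) -> str:
    """Join per-sex states (-1, 0, or 1) into a label such as 'F1_M1'."""
    for state in (female_state, male_state):
        if state not in (-1, 0, 1):
            raise ValueError("states must be -1, 0, or 1")
    return f"F{female_state}_M{male_state}"


def sex_consistent_genes(states_at_8w: dict) -> dict:
    """Keep genes that moved in the same direction in both sexes."""
    keep = {"F1_M1", "F-1_M-1"}
    labeled = {gene: state_label(f, m) for gene, (f, m) in states_at_8w.items()}
    return {gene: label for gene, label in labeled.items() if label in keep}


# Made-up gene identifiers mapped to (female, male) states at 8 weeks.
example = {"geneA": (1, 1), "geneB": (1, 0), "geneC": (-1, -1), "geneD": (1, -1)}
selected = sex_consistent_genes(example)
```

Of the nine possible labels, only the two fully concordant non-null ones pass the filter, so a gene that responds in one sex only, or in opposite directions, is excluded.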

Open targets intersection

The Open Targets⁵⁰ database (Release 22.04) was downloaded on June 8th, 2022. Entries in this database represent curated sets of human genes with disease relationships established from multiple sources of evidence. We used the R-package sparklyr⁵¹ to cross-reference differentially expressed rat genes to all orthologous Open Targets gene-trait direct associations at different evidence-score thresholds. The abundance of these associations was quantified on both tissue-specific and tissue-shared bases, the latter comprising genes differentially expressed in three or more tissues. A table listing all genes, top trait associations, and corresponding tissues is provided in the Supplementary Files folder of the GitHub repo.

Heritability analyses

We retrieved summary statistics (sumstats) for 114 published GWAS⁵. Using LDSC^9,10, we estimated SNP-heritability ($h_{SNP}^{2}$) for each GWAS, including the default baseline annotation of 53 functional categories. We further estimated $h_{SNP}^{2}$ using MESC¹¹ and, with the provided expression scores meta-analyzed over 48 GTEx tissues, estimated expression-mediated heritability ($h_{mediated}^{2}$) for our 114 traits, as well as the ratio $h_{mediated}^{2} / h_{SNP}^{2}$.

LDSC was used to estimate overall proportion of and enrichment in $h_{SNP}^{2}$ across loci within a 100 kb window of all sex-consistent 8w DE gene sets in each tissue following the “Cell type specific analyses” tutorial. We included here the baseline annotation, as well as an annotation comprising loci within 100 kb of all expressed genes in each tissue. Finally, to assess the sensitivity of tissue-specific results on overlaps in gene sets between tissues, we estimated heritability and heritability enrichment conditional on annotations corresponding to all other tissues alongside the baseline annotation.

Human expression data & effect standardization

To assess the degree that exercise effects could overcome genetic and phenotypic variability of gene expression in a tissue, we used the GTEx database (version 8)³⁴. To allow for a common scale between exercise DE and measures of gene expression in GTEx, we modified the GTEx pipeline to use a pseudolog (log₂(x + 1)) transformation in place of its default inverse-normal transform, otherwise keeping later steps in the pipeline intact. Next, we took the outputted expression matrices and residualized out the provided covariates (sex, the top 5 genotyping principal components, sequencing platform, sequencing protocol, and the number of PEER factors suggested in the GTEx documentation) using the lm() function in base-R. On a per-gene basis, we then computed sample variances for each gene in each tissue, pooled across sex to reflect the sex-independent nature of exercise-induced DE. To regularize outlying variance estimates due to sampling effects, we fit an inverse-gamma distribution to tissue-specific sample variances using a maximum goodness-of-fit estimator implemented in the R-package fitdistrplus⁵² by the function fitdist(). As the inverse-gamma is the conjugate prior of the variance term of a normal distribution with known mean, we adopted an Empirical Bayesian strategy to produce posterior estimates of each gene’s expression variance. To allow for heterogeneity in this term across sex, we did this separately for male- and female-coded individuals in the GTEx study population. Additional details are provided in the Supplementary Methods. Across tissues, these empirical priors are plotted in the denominator of Fig. 2a. For each gene, we then took the square root of the posterior mean of inferred log₂ expression variance ($\sqrt{\mathrm{Var}(\log_{2}(\text{gene expression}))}$) to estimate within-population standard deviation of the magnitude of gene expression. We then divided estimated exercise DE by these values to produce standardized estimates in units of within-tissue phenotypic standard deviation (SD_pheno). Further, these estimates were conditioned on both sex and population (quantile plots in upper panels of Fig. 2a).
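The conjugate update underlying this Empirical Bayes step can be sketched in a few lines of Python. This is the textbook inverse-gamma update for a normal variance with known mean (taken as zero here, since the expression values have been residualized), and it is only an assumption about the general form; the authors' exact procedure is given in their Supplementary Methods. The prior shape and scale in the example are placeholders rather than values fitted with fitdist(), and the function names are hypothetical.

```python
import math


def posterior_mean_variance(residuals, prior_shape: float, prior_scale: float) -> float:
    """Posterior mean of a normal variance under an inverse-gamma prior.

    With prior InvGamma(shape=a, scale=b) and n zero-mean observations, the
    posterior is InvGamma(a + n / 2, b + sum(x^2) / 2), whose mean is
    scale / (shape - 1) whenever shape > 1.
    """
    n = len(residuals)
    sum_sq = sum(r * r for r in residuals)
    post_shape = prior_shape + n / 2.0
    post_scale = prior_scale + sum_sq / 2.0
    if post_shape <= 1.0:
        raise ValueError("posterior mean is undefined for shape <= 1")
    return post_scale / (post_shape - 1.0)


def standardized_effect(log2_fc: float, variance: float) -> float:
    """Express a log2 fold-change in units of expression standard deviation."""
    return log2_fc / math.sqrt(variance)


# Placeholder prior (shape 3, scale 2) and four made-up residuals.
variance = posterior_mean_variance([1.0, -1.0, 1.0, -1.0], 3.0, 2.0)
effect_in_sd_pheno = standardized_effect(0.5, variance)
```

A gene with few samples, or with an outlying sample variance, is pulled towards the prior, which is the regularizing behavior the paragraph above describes.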

To estimate the scale of genetic influence on gene expression, we used the software PLINK⁵³ and GCTA⁵⁴, specifically GCTA-GRM⁵⁵, to estimate $h_{SNP}^{2}$ of each gene’s expression, using the same covariates as before. In contrast to prior work estimating $h_{SNP}^{2}$ in GTEx’s inverse-normal transformed gene expression matrices⁵⁶, we focused on obtaining estimates on a gene-specific basis, and so constrained output to be bounded between 0 and 1.

We then took these estimates, which represent the proportion of expression variance able to be explained by linear effects at the SNP level, and multiplied them by the estimates of expression variance, dividing our estimates of exercise-induced DE by the square root of that product to obtain exercise-effect sizes in units of genetic (SNP) standard deviation (SD_geno). Many of these $h_{SNP}^{2}$ point estimates were at or near 0, resulting in extreme standardized effect sizes. As a further filter, we thresholded on significance (IHW α = 0.10, with tissue as a covariate) to focus on confidently heritable genetically regulated expression. This removed ≈92% of gene x tissue pairs (583,238/632,738), leaving 49,500 for later analysis and use in figures.
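The arithmetic in this paragraph can be made concrete with a short Python sketch, which also shows why near-zero heritability estimates produce extreme standardized effects. The function name and the numbers are illustrative assumptions, not values from the study.

```python
import math


def genetic_sd_effect(log2_fc: float, h2_snp: float, expression_variance: float) -> float:
    """Exercise effect in units of genetic (SNP) standard deviation.

    The genetic variance is the SNP-heritability times the total expression
    variance; the effect is divided by the square root of that product.
    """
    genetic_variance = h2_snp * expression_variance
    if genetic_variance <= 0.0:
        raise ValueError("genetic variance must be positive")
    return log2_fc / math.sqrt(genetic_variance)


# The same fold-change looks modest for a heritable gene...
moderate = genetic_sd_effect(0.5, 0.25, 1.0)
# ...and very large when the heritability estimate is close to zero.
extreme = genetic_sd_effect(0.5, 0.0001, 1.0)
```

Because the divisor shrinks with the square root of the heritability, the significance filter described above matters most for genes whose point estimates sit near the lower bound of 0.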

Cross-referencing exercise-training genes with human TWAS

To identify specific genes where exercise-training effects may have the potential to mediate traits, we cross-referenced exercise-genes against transcriptome-wide association study (TWAS) results. Specifically, we downloaded S-PrediXcan¹³ output⁵ for 114 GWAS and MASHR-based expression models using GTEx v8, filtering by significance (IHW α = 0.05, with tissue x trait pairs as a covariate), and intersected with genes that were differentially expressed due to exercise at 8W in a sex-consistent manner, i.e., members of the nodes “8w_F1_M1” and “8w_F-1_M-1”. Of the 114 traits, 99 had a nonzero intersect in at least one tissue.

To assess potential enrichments in these intersections, we compared the observed count of S-PrediXcan hits in the DEG sets against those outside the DEG sets, adopting a tractable Binomial approximation to the Bernoulli distribution to test for enrichment or depletion of genes under a multilevel Bayesian model, following ref. ⁵⁷. This approach allowed us to partially pool information across tissues and traits, avoiding the need for post-hoc multiplicity adjustment⁵⁸, as multiplicity is explicitly built into the inference model itself through flexible regularization of model parameters towards 0. Specifically, we fit a model of the form:

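$$y_{i,j}^{DEG} \sim \text{Binomial}\left(n_{i,j}^{DEG}, f\left(\pi_{i,j}^{DEG}\right)\right) \tag{1.1}$$

$$y_{i,j}^{\neg DEG} \sim \text{Binomial}\left(n_{i,j}^{\neg DEG}, f\left(\pi_{i,j}^{\neg DEG}\right)\right) \tag{1.2}$$

$$\pi_{i,j}^{DEG} = \pi_{i,j} + \frac{\alpha + \beta_{i} + \gamma_{j} + \epsilon_{i,j}}{2} \tag{1.3}$$

$$\pi_{i,j}^{\neg DEG} = \pi_{i,j} - \frac{\alpha + \beta_{i} + \gamma_{j} + \epsilon_{i,j}}{2} \tag{1.4}$$

$$\alpha \sim \text{Normal}(0, 1) \tag{1.5}$$

$$\beta_{i} \sim \text{Multi-Normal}\left(\vec{0}, \sigma_{\beta}^{2} \Sigma_{i}\right) \tag{1.6}$$

$$\gamma_{j} \sim \text{Multi-Normal}\left(\vec{\mu}_{k}, \sigma_{\gamma}^{2} \Sigma_{j}\right) \tag{1.7}$$

$$\epsilon_{i,j} \sim \text{Multi-Normal}\left(\vec{0}, \sigma_{\epsilon} \Sigma_{i \times j}\right) \tag{1.8}$$

$$\mu_{k} \sim \text{Normal}(0, \sigma_{\mu}) \tag{1.9}$$

$$\pi_{i,j} \sim \text{Multi-Normal}\left(\eta_{j}, \sigma_{\pi} \Sigma_{i \times j}\right) \tag{1.10}$$

$$\eta_{j} \sim \text{Multi-Normal}\left(\vec{\lambda}_{k}, \sigma_{\eta}^{2} \Sigma_{j}\right) \tag{1.11}$$

$$\lambda_{k} \sim \text{Normal}(\mu, \sigma_{\lambda}) \tag{1.12}$$

$$\mu \sim \text{Normal}(0, 2) \tag{1.13}$$

$$\sigma_{\beta, \gamma, \epsilon, \mu, \pi, \eta, \lambda} \sim \text{Half-Normal}(0, 1) \tag{1.14}$$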

Notation for this model is summarized in Supplementary Table 2, but in brief: the intersect size $y_{i,j}^{DEG}$ in tissue i ∈ {1, 2, …, 15} and trait j ∈ {1, 2, …, 99} was binomially distributed, with $n_{i,j}^{DEG}$ giving the total number of genes in that tissue that were differentially expressed at 8W and expressed at any level in the PrediXcan analysis (i.e., disregarding genes that were not expressed in both samples). The function f() can be any function mapping $\mathbb{R} \to (0, 1)$, but here was the inverse-logit function. On the logit-scale, $\pi_{i,j}^{DEG}$ was expressed as a deviation from a mean $\pi_{i,j}$, with an equal and opposite deviation applied to the log-odds of observing a PrediXcan hit in the complementary set, defined as all expressed genes that were not differentially expressed at 8W in a sex-consistent manner. This deviation term had four components: a tissue difference $\beta_{i}$, a trait difference $\gamma_{j}$, a tissue x trait difference $\epsilon_{i,j}$, and an overall difference $\alpha$. Adding half of this deviation to $\pi_{i,j}$ and subtracting half from it, to produce $\pi_{i,j}^{DEG}$ and $\pi_{i,j}^{\neg DEG}$, respectively, was done to prevent specifying greater prior uncertainty on one of the two composite probability parameters.
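The symmetric split of the deviation term can be checked numerically. The following Python sketch assumes the inverse-logit link named above; the function names and input values are hypothetical. It shows that adding and subtracting half of the deviation leaves the difference in log-odds between the DEG set and its complement equal to the deviation itself.

```python
import math


def inv_logit(x: float) -> float:
    """Map a log-odds value on the real line to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def logit(p: float) -> float:
    """Map a probability in (0, 1) back to log-odds."""
    return math.log(p / (1.0 - p))


def group_probabilities(baseline: float, deviation: float):
    """Hit probabilities in the DEG set and in its complement.

    `baseline` plays the role of the shared mean on the logit scale and
    `deviation` the sum of the overall, tissue, trait, and tissue x trait
    difference terms.
    """
    p_deg = inv_logit(baseline + deviation / 2.0)
    p_not_deg = inv_logit(baseline - deviation / 2.0)
    return p_deg, p_not_deg


p_deg, p_not_deg = group_probabilities(-2.0, 0.4)
log_odds_difference = logit(p_deg) - logit(p_not_deg)
```

Because both probabilities are built from the same baseline and the same deviation, neither one carries more prior variance than the other, which is the motivation given in the text.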

The various scale parameters, σ, served to adaptively regularize estimates of each difference term towards their mean. Otherwise, we nested trait difference effects γ_j in trait category difference effects μ_k, where k ∈ {1, 2, …, 12} indexes previously designated trait categories⁵, i.e., members of the set {Psychiatric, Aging, Cardiometabolic, Allergy, Digestive, Immune, Endocrine, Skeletal, Anthropometric, Hair, Blood, Cancer}. If traits in a particular category showed consistent evidence of deviation, partial pooling shrunk estimates towards their respective mean hyperparameters, allowing them to share information to the extent the model could detect information to be shared. We use a similar model structure to express the overall location parameter, π_i,j.

Pseudo-replication across tissues and traits amplifies signals that inform higher-level parameters, leading the inference model to mistake interdependent effects as independent evidence for enrichment. When aggregating many Bernoulli random variables to a single binomial, signals of gene interdependence that would otherwise prevent this are lost. To address this, we introduced parameters Σ_i, Σ_j, and Σ_i×j, corresponding to i × i tissue, j × j trait, and (i ⋅ j) × (i ⋅ j) tissue × trait correlation matrices, respectively. For tractability, we then fixed these to maximum-likelihood estimates of each respective gene-wise correlation matrix under a bivariate probit, which we fit marginally across all DEGs, PrediXcan hits (jointly across tissues), and DEG × PrediXcan intersects using the nlm (non-linear minimization) algorithm⁵⁹ implemented in and accessed through the R-packages stats and optimx⁶⁰. As we fit each pairwise correlation individually, rather than simultaneously, there was no guarantee that the resulting correlation matrix was positive semi-definite. To ensure that this constraint was met and that all pairwise correlations were jointly possible, we transformed the pairwise-estimated correlation matrices with Higham’s algorithm⁶¹ implemented in the R-package Matrix⁶² function nearPD() before proceeding further.
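To see why separately estimated pairwise correlations may not be jointly possible, consider three variables. The Python sketch below does not implement Higham's algorithm or nearPD(); it only illustrates the problem that step repairs, using the fact that a 3 × 3 matrix with unit diagonal and off-diagonal entries in [−1, 1] is positive semi-definite exactly when its determinant is non-negative. The function names and correlation values are made up.

```python
def corr3_determinant(r12: float, r13: float, r23: float) -> float:
    """Determinant of the 3 x 3 correlation matrix with these off-diagonals."""
    return 1.0 + 2.0 * r12 * r13 * r23 - r12 ** 2 - r13 ** 2 - r23 ** 2


def is_jointly_possible(r12: float, r13: float, r23: float, tol: float = 1e-12) -> bool:
    """Check positive semi-definiteness of a 3 x 3 correlation matrix.

    The 1 x 1 and 2 x 2 principal minors are non-negative whenever every
    correlation lies in [-1, 1], so only the full determinant needs testing.
    """
    if max(abs(r12), abs(r13), abs(r23)) > 1.0:
        return False
    return corr3_determinant(r12, r13, r23) >= -tol


# Each value is a valid correlation alone, but the triple is contradictory:
# A and B agree, A and C agree, yet B and C strongly disagree.
contradictory = is_jointly_possible(0.9, 0.9, -0.9)
consistent = is_jointly_possible(0.5, 0.5, 0.5)
```

A projection to the nearest valid correlation matrix, as the authors describe, would adjust such entries so that the full matrix becomes usable as a covariance structure in the model.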

We fit this model in Stan⁶³ using CmdStanR⁶⁴ and in Fig. 4e–g visualize marginal posterior distributions for tissue, trait, and trait category difference effects as violin plots using the R-package vioplot⁶⁵. Additionally, where the composite difference effect for a particular cell in Fig. 4a finds > 95% of its posterior mass to one side of 0, we colored its upper or lower corner with a red or blue triangle to signify enrichment or depletion of that tissue x trait combination, respectively. To accommodate this and subsequent models’ challenging posterior geometry, we used a non-centered parameterization, running four separate and randomly initialized chains for 2.5 × 10³ warmup and 2.5 × 10³ sampling iterations, with a target acceptance rate δ of 0.95. To diagnose pathologies in the MCMC output and confirm adequate convergence, mixing, and sampling intensity, we used the posterior package⁶⁶ for MCMC diagnostics, requiring that all model parameters, as well as posterior density and likelihood, receive $\hat{R} < 1.01$ and both bulk and tail Effective Sample Size (ESS) > 500, in addition to requiring that < 0.05% of iterations end in a divergence.
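The posterior-mass rule used to flag cells in Fig. 4a can be expressed as a short function. This Python sketch assumes the posterior draws of the composite difference effect are already available as a list of numbers; the name `enrichment_call` and the example draws are hypothetical.

```python
def enrichment_call(draws, threshold: float = 0.95):
    """Flag a tissue x trait cell from posterior draws of its difference effect.

    Returns 'enriched' if more than `threshold` of the draws are positive,
    'depleted' if more than `threshold` are negative, and None otherwise.
    """
    if not draws:
        raise ValueError("at least one posterior draw is required")
    fraction_positive = sum(d > 0 for d in draws) / len(draws)
    fraction_negative = sum(d < 0 for d in draws) / len(draws)
    if fraction_positive > threshold:
        return "enriched"
    if fraction_negative > threshold:
        return "depleted"
    return None


# Made-up draws: 96% of the posterior mass lies above zero.
call = enrichment_call([0.1] * 96 + [-0.1] * 4)
```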

To complement this analysis, we also performed frequentist Gene Set Enrichment Analysis (GSEA) using the function fgsea() implemented in the R-package fgsea⁶⁷. Specifically, for each trait × tissue pair, we assessed enrichment of the set of DEGs in that tissue in the list of -log₁₀-transformed PrediXcan p-values of mutually expressed, orthologous genes, applying a Bonferroni correction to the output. To aggregate across these interdependent tests and assess tissue- and trait-level enrichment, we took the harmonic mean of subtest p-values corresponding to each grouping⁶⁸, applying a similar FWER adjustment to the output (α = 0.05). We leveraged the same meta-analytic procedure over trait-level enrichments to aggregate to trait categories. Finally, we explored an alternative approach to aggregating multi-tissue GSEA within traits. As a more stringent set of multi-tissue responsive genes, we took the set of all genes differentially expressed in three or more tissues. For PrediXcan p-values, we took the harmonic mean of PrediXcan p-values for each gene across all studied tissues prior to -log₁₀ transformation. These were input into conventional GSEA, and output from all of the above comparisons was visualized in Supplementary Fig. 3.
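For readers unfamiliar with the aggregation step, the unweighted harmonic mean of a set of p-values can be computed as below. This Python sketch shows only the raw, equally weighted statistic; any further calibration or weighting applied by the cited method is not reproduced here, and the function name is hypothetical.

```python
def harmonic_mean_p(p_values) -> float:
    """Unweighted harmonic mean of p-values: n divided by the sum of 1 / p."""
    if not p_values:
        raise ValueError("at least one p-value is required")
    if any(not (0.0 < p <= 1.0) for p in p_values):
        raise ValueError("p-values must lie in (0, 1]")
    return len(p_values) / sum(1.0 / p for p in p_values)


# The harmonic mean is dominated by the smallest p-values in the group.
aggregated = harmonic_mean_p([0.5, 0.25])
```

Unlike an arithmetic mean, a single very small p-value pulls the aggregate down sharply, which suits settings where only some subtests in a grouping are expected to carry signal.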

Proportion of disease-like effects

To assess the proportion of DE acting in disease-like directions relative to each phenotype (the product of the direction of DE and the direction of association from PrediXcan), we applied another Bayesian multilevel model, comparing the observed, unweighted frequency of positive effects against a “null” frequency of 0.5 (equivalently, against log-odds of 0):

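$$y_{i,j} \sim \text{Binomial}\left(n_{i,j}, f\left(\pi_{i,j}\right)\right) \tag{2.1}$$

$$\pi_{i,j} \sim \text{Normal}\left(\vec{\mu}_{j}, \sigma_{j}\right) \tag{2.2}$$

$$\vec{\mu}_{j} \sim \text{Multi-Normal}\left(\vec{0}, S R S^{T}\right) \tag{2.3}$$

$$R = G_{SNP} \times \theta + I \times (1 - \theta) \tag{2.4}$$

$$\theta \sim \text{Beta}(1, 1) \tag{2.5}$$

$$\text{diag}(S) = \delta e^{\gamma_{k}} \tag{2.6}$$

$$\gamma_{k} \sim \text{Normal}(0, \sigma_{\gamma}) \tag{2.7}$$

$$\sigma_{j} = \rho e^{\lambda_{j}} \tag{2.8}$$

$$\lambda_{j} \sim \text{Normal}(0, \sigma_{\lambda}) \tag{2.9}$$

$$\rho, \delta \sim \text{Half-Normal}(0, 2) \tag{2.10}$$

$$\sigma_{\gamma, \lambda} \sim \text{Half-Normal}(0, 1) \tag{2.11}$$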

Unlike for their overall frequency, signal for the directionality of effect (more or less disease-like) cannot be shared across traits or within trait categories, as the traits themselves vary in whether they are harmful, neutral, or beneficial. Instead, we perform partial pooling across the scale of these differences, across both trait categories and within traits themselves. Notation for this model is summarized in Supplementary Table 3, but in brief: we estimate overall scale parameters (ρ, δ), and then estimate a log-normally distributed multiplicative factor (λ_j, γ_k) to scale each on a trait-wise and trait-category-wise basis, respectively. To accommodate per-trait interdependence in the direction of deviation, we invert Cheverud’s conjecture⁶⁹, using our previously estimated G_SNP as a proxy for an environmental correlation matrix (Supplementary Fig. 1a). As these genetic correlations were estimated pairwise, positive semi-definiteness (PSD) of the whole correlation matrix is not guaranteed. To satisfy the PSD constraint of G_SNP, we substituted the nearest PSD correlation matrix output from Higham’s algorithm⁶¹, implemented in the R-package Matrix⁶² function nearPD(). To allow flexibility in this modeling assumption, we compute a linearly weighted average of this matrix and the identity matrix (I), estimating the weight parameter θ from a flat Beta prior. MCMC sampling parameters were specified and diagnostics performed as previously described.
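The weighted average of the genetic correlation matrix and the identity, and its scaling into a covariance matrix, can be sketched with plain Python lists. The function names and the 2 × 2 example are illustrative assumptions; in the model itself θ and the scales are estimated rather than fixed.

```python
def blend_with_identity(g_snp, theta: float):
    """Return theta * G + (1 - theta) * I for a square correlation matrix G."""
    if not (0.0 <= theta <= 1.0):
        raise ValueError("theta must lie in [0, 1]")
    n = len(g_snp)
    return [
        [theta * g_snp[r][c] + (1.0 - theta) * (1.0 if r == c else 0.0) for c in range(n)]
        for r in range(n)
    ]


def scale_to_covariance(corr, scales):
    """Return S R S^T, where S is the diagonal matrix of per-trait scales."""
    n = len(corr)
    return [[scales[r] * corr[r][c] * scales[c] for c in range(n)] for r in range(n)]


# Made-up genetic correlation of 0.6 between two traits.
g = [[1.0, 0.6], [0.6, 1.0]]
r_half = blend_with_identity(g, 0.5)  # correlations shrink halfway to zero
cov = scale_to_covariance(r_half, [2.0, 3.0])
```

At θ = 1 the traits are as correlated as G_SNP implies, and at θ = 0 they are treated as independent, so the flat Beta prior lets the data choose how far to trust the proxy.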

We examined the two non-anthropometric traits with the highest posterior means in Fig. 6, tracing the proportion of effects of the 8W gene set backwards to the first week. Where individual genes are not assigned to a graphical node signifying differential expression, their contribution to the count of positive effects was taken to be 0.5 when calculating the overall proportion. We include in these tissue-specific trajectories the set of genes corresponding to each tissue, their direction of effect on the trait, and their standardized effect size from Fig. 2b. Similar figures to Fig. 6 for all other traits may be found in the GitHub repository mentioned below.
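The counting convention for genes without a differential-expression node can be written out explicitly. In this Python sketch, +1 marks a positive (disease-like) effect, −1 a negative effect, and None a gene not assigned to such a node at that timepoint; the encoding and the function name are assumptions made for illustration.

```python
def proportion_positive(directions) -> float:
    """Proportion of positive effects, counting unassigned genes as 0.5."""
    if not directions:
        raise ValueError("at least one gene is required")
    count = 0.0
    for direction in directions:
        if direction is None:
            count += 0.5  # unassigned gene contributes half a positive effect
        elif direction == 1:
            count += 1.0
        elif direction == -1:
            count += 0.0
        else:
            raise ValueError("directions must be 1, -1, or None")
    return count / len(directions)


# Two positive genes, one negative gene, one gene not yet assigned.
early_timepoint = proportion_positive([1, 1, -1, None])
```

Under this convention, a gene set in which nothing has yet been assigned sits exactly at the null value of 0.5, so trajectories traced back to week 1 start from a neutral baseline.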

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.

Acknowledgements

We would like to thank Michael Gloudemans, Daniel Nachun, Bob Carpenter, Andrew Gelman, Laurens van de Wiel, Andrew Marderstein, Bruna Balliu, Kim Huffman, and Kate Gates for their valuable input on many parts of the analyses presented above. We would also like to thank Marty Walsh, John Williams, Matt Wheeler, and other members of MoTrPAC for their crucial feedback on this work. Finally, we would like to acknowledge the entire MoTrPAC team, including PASS, CAS and BIC, for their indispensable contributions in generating the exercise-response data used here. MoTrPAC is supported by NIH grants U24OD026629 (MSG, Bioinformatics Center), U24DK112349 (MSG), U24DK112342 (MSG), U24DK112340 (MSG), U24DK112341 (MSG), U24DK112326 (MSG), U24DK112331 (MSG), U24DK112348 (SBM, Chemical Analysis Sites), U01AR071133 (MSG), U01AR071130 (MSG), U01AR071124 (MSG), U01AR071128 (MSG), U01AR071150 (MSG), U01AR071160 (MSG), U01AR071158 (MSG, Clinical Centers), U24AR071113 (MSG, Consortium Coordinating Center), U01AG055133 (MSG), U01AG055137 (MSG), and U01AG055135 (MSG, PASS/Animal Sites). Research reported in this publication was supported by the National Library of Medicine of the National Institutes of Health under award number T15LM007033 (N.G.V.).

Author contributions

NGV, NRG, and SBM collectively conceived of and designed the analysis strategy underlying this work. NGV implemented most analysis and figure code, with NRG facilitating access and advising processing of GTEx, GWAS, and MoTrPAC data. NGV and NRG performed testing and validation, as well as compiling online materials. NGV drafted the manuscript, which then received extensive edits and suggestions from NRG and SBM. MSG provided substantive feedback and advice throughout all parts of this work. All authors approved the manuscript prior to submission.

Peer review

Peer review information

Nature Communications thanks Frank Booth, Taylor Head, Kangjin Kim and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. A peer review file is available.

Data availability

This study did not generate novel data, relying instead on previously or concurrently published data. MoTrPAC PASS1B data (10.1101/2022.09.21.508770) used here have been deposited at https://motrpac-data.org/data-access. Inquiries regarding access to these data should be sent to motrpac-helpdesk@lists.stanford.edu. Further resources are available at motrpac.org and motrpac-data.org. Where it would be difficult to re-host large datasets from GTEx³⁴, Open Targets⁵⁰, and PrediXcan⁵, we provide download links in the documentation of the associated code repository. Source data to generate all figures seen here are provided with this paper in the form of *.RData objects. These contain all necessary processed data to fully and quickly reproduce all paper figures using the scripts contained in https://github.com/NikVetr/MoTrPAC_Complex_Traits/tree/main/scripts/figures. Source data are provided with this paper.

Code availability

We provide end-to-end scripts to perform all analyses described above in a GitHub repository⁷⁰ located at the following URL: https://github.com/NikVetr/MoTrPAC_Complex_Traits. Additionally, we provide scripts to generate all figures, as well as intermediate data files corresponding to compiled results at each level of analysis (MCMC output, Open Targets associations, cross-referenced DEG-PrediXcan intersects, aggregated GCTA output, and relative effect sizes).

Competing interests

S.B.M. is a consultant for BioMarin, MyOme and Tenaya Therapeutics. These companies are broadly interested in treatments for rare and common genetic diseases but had no input on any component of this study. The authors have no other competing interests to declare.

Footnotes

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

A list of authors and their affiliations appears at the end of the paper.

Contributor Information

Nikolai G. Vetr, Email: nikgvetr@stanford.edu

Stephen B. Montgomery, Email: smontgom@stanford.edu

MoTrPAC Study Group:

Joshua N. Adkins, Brent G. Albertson, David Amar, Mary Anne S. Amper, Jose Juan Almagro Armenteros, Euan Ashley, Julian Avila-Pacheco, Dam Bae, Ali Tugrul Balci, Marcas Bamman, Nasim Bararpour, Elisabeth R. Barton, Pierre M. Jean Beltran, Bryan C. Bergman, Daniel H. Bessesen, Sue C. Bodine, Frank W. Booth, Brian Bouverat, Thomas W. Buford, Charles F. Burant, Tiziana Caputo, Steven Carr, Toby L. Chambers, Clarisa Chavez, Maria Chikina, Roxanne Chiu, Michael Cicha, Clary B. Clish, Paul M. Coen, Dan Cooper, Elaine Cornell, Gary Cutter, Karen P. Dalton, Surendra Dasari, Courtney Dennis, Karyn Esser, Charles R. Evans, Roger Farrar, Facundo M. Fernádez, Kishore Gadde, Nicole Gagne, David A. Gaul, Yongchao Ge, Robert E. Gerszten, Bret H. Goodpaster, Laurie J. Goodyear, Marina A. Gritsenko, Kristy Guevara, Fadia Haddad, Joshua R. Hansen, Melissa Harris, Trevor Hastie, Krista M. Hennig, Steven G. Hershman, Andrea Hevener, Michael F. Hirshman, Zhenxin Hou, Fang-Chi Hsu, Kim M. Huffman, Chia-Jui Hung, Chelsea Hutchinson-Bunch, Anna A. Ivanova, Bailey E. Jackson, Catherine M. Jankowski, David Jimenez-Morales, Christopher A. Jin, Neil M. Johannsen, Robert L. Newton, Jr, Maureen T. Kachman, Benjamin G. Ke, Hasmik Keshishian, Wendy M. Kohrt, Kyle S. Kramer, William E. Kraus, Ian Lanza, Christiaan Leeuwenburgh, Sarah J. Lessard, Bridget Lester, Jun Z. Li, Malene E. Lindholm, Ana K. Lira, Xueyun Liu, Ching-ju Lu, Nathan S. Makarewicz, Kristal M. Maner-Smith, D. R. Mani, Gina M. Many, Nada Marjanovic, Andrea Marshall, Shruti Marwaha, Sandy May, Edward L. Melanson, Michael E. Miller, Matthew E. Monroe, Samuel G. Moore, Ronald J. Moore, Kerrie L. Moreau, Charles C. Mundorff, Nicolas Musi, Daniel Nachun, Venugopalan D. Nair, K. Sreekumaran Nair, Michael D. Nestor, Barbara Nicklas, Pasquale Nigro, German Nudelman, Eric A. Ortlund, Marco Pahor, Cadence Pearce, Vladislav A. Petyuk, Paul D. Piehowski, Hanna Pincas, Scott Powers, David M. 
Presby, Wei-Jun Qian, Shlomit Radom-Aizik, Archana Natarajan Raja, Krithika Ramachandran, Megan E. Ramaker, Irene Ramos, Tuomo Rankinen, Alexander (Sasha) Raskind, Blake B. Rasmussen, Eric Ravussin, R. Scott Rector, W. Jack Rejeski, Collyn Z-T. Richards, Stas Rirak, Jeremy M. Robbins, Jessica L. Rooney, Aliza B. Rubenstein, Frederique Ruf-Zamojski, Scott Rushing, Tyler J. Sagendorf, Mihir Samdarshi, James A. Sanford, Evan M. Savage, Irene E. Schauer, Simon Schenk, Robert S. Schwartz, Stuart C. Sealfon, Nitish Seenarine, Kevin S. Smith, Gregory R. Smith, Michael P. Snyder, Tanu Soni, Luis Gustavo Oliveira De Sousa, Lauren M. Sparks, Alec Steep, Cynthia L. Stowe, Yifei Sun, Christopher Teng, Anna Thalacker-Mercer, John Thyfault, Rob Tibshirani, Russell Tracy, Scott Trappe, Todd A. Trappe, Karan Uppal, Sindhu Vangeti, Mital Vasoya, Elena Volpi, Alexandria Vornholt, Michael P. Walkup, Martin J. Walsh, Matthew T. Wheeler, John P. Williams, Si Wu, Ashley Xia, Zhen Yan, Xuechen Yu, Chongzhi Zang, Elena Zaslavsky, Navid Zebarjadi, Tiantian Zhang, Bingqing Zhao, and Jimmy Zhen

Supplementary information

The online version contains supplementary material available at 10.1038/s41467-024-45966-w.

References

1. Ruegsegger GN, Booth FW. Health benefits of exercise. Cold Spring Harbor Perspect. Med. 2018;8:a029694. doi: 10.1101/cshperspect.a029694.
2. Fiuza-Luces C, et al. Exercise benefits in cardiovascular disease: beyond attenuation of traditional risk factors. Nat. Rev. Cardiol. 2018;15:731–743. doi: 10.1038/s41569-018-0065-1.
3. Amar D, et al. Time trajectories in the transcriptomic response to exercise - a meta-analysis. Nat. Commun. 2021;12:3471. doi: 10.1038/s41467-021-23579-x.
4. MoTrPAC Study Group. Temporal dynamics of the multi-omic response to endurance exercise training across tissues. Preprint at https://www.biorxiv.org/content/10.1101/2022.09.21.508770v2 (2022).
5. Barbeira AN, et al. Exploiting the GTEx resources to decipher the mechanisms at GWAS loci. Genome Biol. 2021;22:49. doi: 10.1186/s13059-020-02252-4.
6. Koch LG, Britton SL. Rat models of exercise for the study of complex disease. Methods Mol. Biol. (Clifton, N.J.) 2019;2018:309–317. doi: 10.1007/978-1-4939-9581-3_15.
7. Xiao K, et al. Beneficial effects of running exercise on hippocampal microglia and neuroinflammation in chronic unpredictable stress-induced depression model rats. Transl. Psychiatry. 2021;11:461. doi: 10.1038/s41398-021-01571-9.
8. Koch LG, et al. Intrinsic aerobic capacity sets a divide for aging and longevity. Circul. Res. 2011;109:1162–1172. doi: 10.1161/CIRCRESAHA.111.253807.
9. Bulik-Sullivan BK, et al. LD Score regression distinguishes confounding from polygenicity in genome-wide association studies. Nat. Genet. 2015;47:291–295. doi: 10.1038/ng.3211.
10. Finucane HK, et al. Heritability enrichment of specifically expressed genes identifies disease-relevant tissues and cell types. Nat. Genet. 2018;50:621–629. doi: 10.1038/s41588-018-0081-4.
11. Yao DW, O’Connor LJ, Price AL, Gusev A. Quantifying genetic effects on disease mediated by assayed gene expression levels. Nat. Genet. 2020;52:626–633. doi: 10.1038/s41588-020-0625-2.
12. Gamazon ER, et al. A gene-based association method for mapping traits using reference transcriptome data. Nat. Genet. 2015;47:1091–1098. doi: 10.1038/ng.3367.
13. Barbeira AN, et al. Exploring the phenotypic consequences of tissue specific gene expression variation inferred from GWAS summary statistics. Nat. Commun. 2018;9:1825. doi: 10.1038/s41467-018-03621-1.
14. Balliu B, et al. An integrated approach to identify environmental modulators of genetic risk factors for complex traits. Am. J. Hum. Genet. 2021;108:1866–1879. doi: 10.1016/j.ajhg.2021.08.014.
15. Pontzer H, et al. Constrained total energy expenditure and metabolic adaptation to physical activity in adult humans. Curr. Biol. 2016;26:410–417. doi: 10.1016/j.cub.2015.12.046.
16. Daly RM, Bass S, Caine D, Howe W. Does training affect growth? Phys. Sportsmed. 2002;30:21–29. doi: 10.3810/psm.2002.10.488.
17. Borer KT. The effects of exercise on growth. Sports Med. 1995;20:375–397. doi: 10.2165/00007256-199520060-00004.
18. Godfrey RJ, Madgwick Z, Whyte GP. The exercise-induced growth hormone response in athletes. Sports Med. (Auckl. N.Z.) 2003;33:599–613. doi: 10.2165/00007256-200333080-00005.
19. Del Giacco SR, Firinu D, Bjermer L, Carlsen K-H. Exercise and asthma: an overview. Eur. Clin. Respir. J. 2015;2:27984. doi: 10.3402/ecrj.v2.27984.
20. Bronte V, Pittet MJ. The spleen in local and systemic regulation of immunity. Immunity. 2013;39:806–818. doi: 10.1016/j.immuni.2013.10.010.
21. Hallstrand TS, et al. Inflammatory basis of exercise-induced bronchoconstriction. Am. J. Respir. Crit. Care Med. 2005;172:679–686. doi: 10.1164/rccm.200412-1667OC.
22. Sastre B, et al. Distinctive bronchial inflammation status in athletes: basophils, a new player. Eur. J. Appl. Physiol. 2013;113:703–711. doi: 10.1007/s00421-012-2475-9.
23. Hayashi Y, et al. Ablation of fatty acid desaturase 2 (FADS2) exacerbates hepatic triacylglycerol and cholesterol accumulation in polyunsaturated fatty acid-depleted mice. FEBS Letters. 2021;595:1920–1932. doi: 10.1002/1873-3468.14134.
24. Ershov P, et al. Enzymes in the Cholesterol Synthesis Pathway: Interactomics in the Cancer Context. Biomedicines. 2021;9:895. doi: 10.3390/biomedicines9080895.
25. Fan Z, et al. Generation of an oxoglutarate dehydrogenase knockout rat model and the effect of a high-fat diet. RSC Adv. 2018;8:16636–16644. doi: 10.1039/c8ra00253c.
26. Zhao G-N, et al. Tmbim1 is a multivesicular body regulator that protects against nonalcoholic fatty liver disease in mice and monkeys by targeting the lysosomal degradation of Tlr4. Nat. Med. 2017;23:742–752. doi: 10.1038/nm.4334.
27. Davis RA. Cell and molecular biology of the assembly and secretion of apolipoprotein B-containing lipoproteins by the liver. Biochim. Biophys. Acta Mol. Cell Biol. Lipids. 1999;1440:1–31. doi: 10.1016/s1388-1981(99)00083-9.
28. Yu X-H, et al. ABCG5/ABCG8 in cholesterol excretion and atherosclerosis. Clin. Chim. Acta. 2014;428:82–88. doi: 10.1016/j.cca.2013.11.010.
29. Legaki E, Arsenis C, Taka S, Papadopoulos NG. DNA methylation biomarkers in asthma and rhinitis: are we there yet? Clin. Transl. Allergy. 2022;12:e12131. doi: 10.1002/clt2.12131.
30. Song M-K, Kim DI, Lee K. Causal relationship between humidifier disinfectant exposure and Th17-mediated airway inflammation and hyperresponsiveness. Toxicology. 2021;454:152739. doi: 10.1016/j.tox.2021.152739.
31. Lepeule J, et al. Gene promoter methylation is associated with lung function in the elderly: the normative aging study. Epigenetics. 2012;7:261–269. doi: 10.4161/epi.7.3.19216.
32. Zhu Z, et al. Shared genetic and experimental links between obesity-related traits and asthma subtypes in UK Biobank. J. Allergy Clin. Immunol. 2020;145:537–549. doi: 10.1016/j.jaci.2019.09.035.
33. Alvarez-Carretero S, et al. A species-level timeline of mammal evolution integrating phylogenomic data. Nature. 2022;602:263–267. doi: 10.1038/s41586-021-04341-1.
34. Lonsdale J, et al. The Genotype-Tissue Expression (GTEx) project. Nat. Genet. 2013;45:580–585. doi: 10.1038/ng.2653.
35. Oliva M, et al. The impact of sex on gene expression across human tissues. Science. 2020;369:eaba3066. doi: 10.1126/science.aba3066.
36. Sone K, et al. Changes of estrous cycles with aging in female F344/n rats. Exp. Anim. 2007;56:139–148. doi: 10.1538/expanim.56.139.
37. Landen S, et al. Genetic and epigenetic sex-specific adaptations to endurance exercise. Epigenetics. 2019;14:523–535. doi: 10.1080/15592294.2019.1603961.
38. Landen S, et al. Physiological and molecular sex differences in human skeletal muscle in response to exercise training. J. Physiol. 2023;601:419–434. doi: 10.1113/JP279499.
39. Many GM, et al. Sexual dimorphism and the multi-omic response to exercise training in rat subcutaneous white adipose tissue. bioRxiv: Preprint Server Biol. (2023).
40.Wang Z, et al. Genome-wide association analyses of physical activity and sedentary behavior provide insights into underlying mechanisms and roles in disease prevention. Nat. Genet. 2022;54:1332–1344. doi: 10.1038/s41588-022-01165-1. [DOI] [PMC free article] [PubMed] [Google Scholar]
41.Sanderson E, et al. Mendelian randomization. Nat. Rev. Methods Primers. 2022;2:1–21. doi: 10.1038/s43586-021-00092-5. [DOI] [PMC free article] [PubMed] [Google Scholar]
42.Sanford JA, et al. Molecular transducers of physical activity consortium (MoTrPAC): mapping the dynamic responses to exercise. Cell. 2020;181:1464–1474. doi: 10.1016/j.cell.2020.06.004. [DOI] [PMC free article] [PubMed] [Google Scholar]
43.Love MI, Huber W, Anders S. Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol. 2014;15:1–21. doi: 10.1186/s13059-014-0550-8. [DOI] [PMC free article] [PubMed] [Google Scholar]
44.Ignatiadis N, Klaus B, Zaugg JB, Huber W. Data-driven hypothesis weighting increases detection power in genome-scale multiple testing. Nat. Methods. 2016;13:577–580. doi: 10.1038/nmeth.3885. [DOI] [PMC free article] [PubMed] [Google Scholar]
45.Heller R, Yaacoby S, Yekutieli D. Repfdr: a tool for replicability analysis for genomewide association studies. Bioinformatics. 2014;30:2971–2972. doi: 10.1093/bioinformatics/btu434. [DOI] [PubMed] [Google Scholar]
46.Heller R, Yekutieli D. Replicability analysis for genome-wide association studies. Ann. Appl. Stat. 2014;8:481–498. [Google Scholar]
47.Efron B. Size, power and false discovery rates. Ann. Stat. 2007;35:1351–1377. [Google Scholar]
48.Frankish A, et al. GENCODE 2021. Nucleic Acids Res. 2021;49:D916–D923. doi: 10.1093/nar/gkaa1087. [DOI] [PMC free article] [PubMed] [Google Scholar]
49.Smith JR, et al. The year of the rat: the rat genome database at 20: a multi-species knowledgebase and analysis platform. Nucleic Acids Res. 2020;48:D731–D742. doi: 10.1093/nar/gkz1041. [DOI] [PMC free article] [PubMed] [Google Scholar]
50.Ochoa D, et al. Open targets platform: supporting systematic drug–target identification and prioritisation. Nucleic Acids Res. 2021;49:D1302–D1310. doi: 10.1093/nar/gkaa1027. [DOI] [PMC free article] [PubMed] [Google Scholar]
51.Luraschi, J. et al. sparklyr: R Interface to Apache Spark. R package version 1.7.7, https://CRAN.R-project.org/package=sparklyr (2022).
52.Delignette-Muller ML, Dutang C. Fitdistrplus: an R package for fitting distributions. J. Stat. Softw. 2015;64:1–34. [Google Scholar]
53.Chang CC, et al. Second-generation PLINK: rising to the challenge of larger and richer datasets. GigaScience. 2015;4:7. doi: 10.1186/s13742-015-0047-8. [DOI] [PMC free article] [PubMed] [Google Scholar]
54.Yang J, Lee SH, Goddard ME, Visscher PM. GCTA: a tool for genome-wide complex trait analysis. Am. J. Hum. Genet. 2011;88:76–82. doi: 10.1016/j.ajhg.2010.11.011. [DOI] [PMC free article] [PubMed] [Google Scholar]
55.Yang J, et al. Common SNPs explain a large proportion of the heritability for human height. Nat. Genet. 2010;42:565–569. doi: 10.1038/ng.608. [DOI] [PMC free article] [PubMed] [Google Scholar]
56.Wheeler HE, et al. Survey of the heritability and sparse architecture of gene expression traits across human tissues. PLoS Genet. 2016;12:e1006423. doi: 10.1371/journal.pgen.1006423. [DOI] [PMC free article] [PubMed] [Google Scholar]
57.Gelman, A. et al. Bayesian Data Analysis, 3E (Chapman and Hall/CRC, 2013).
58.Gelman A, Hill J, Yajima M. Why we (usually) don’t have to worry about multiple comparisons. J. Res. Educat. Effect. 2012;5:189–211. [Google Scholar]
59.Schnabel RB, Koontz JE, Weiss BE. A modular system of algorithms for unconstrained minimization. ACM Trans. Math. Softw. 1985;11:419–440.
60.Nash, J. C., Varadhan, R. & Grothendieck, G. optimx: Expanded Replacement and Extension of the ’optim’ Function. R package version 10.21, https://CRAN.R-project.org/package=optimx (2022).
61.Higham NJ. Computing the nearest correlation matrix - a problem from finance. IMA J. Numer. Anal. 2002;22:329–343.
62.Bates, D. & Maechler, M. Matrix. R package version 1.6-5, https://CRAN.R-project.org/package=Matrix (2019).
63.Team, S. D. Stan Modeling Language Users Guide and Reference Manual. Version 2.34, https://mc-stan.org (2023).
64.Gabry, J. & Češnovar, R. cmdstanr: R Interface to ’CmdStan’. R package version 0.3.0.9000, https://mc-stan.org/cmdstanr/ (2022).
65.Adler, D., Kelly, S. T. & Elliott, T. M. vioplot: Violin Plot. R package version 0.4.0, https://CRAN.R-project.org/package=vioplot (2021).
66.Bürkner, P., Gabry, J., Kay, M. & Vehtari, A. posterior: Tools for Working with Posterior Distributions. R package version 1.2.2, https://mc-stan.org/posterior/ (2022).
67.Korotkevich, G. et al. Fast gene set enrichment analysis. Preprint at https://www.biorxiv.org/content/10.1101/060012v3 (2021).
68.Wilson DJ. The harmonic mean p-value for combining dependent tests. Proc. Natl. Acad. Sci. 2019;116:1195–1200. doi: 10.1073/pnas.1814092116.
69.Sodini SM, Kemper KE, Wray NR, Trzaskowski M. Comparison of genotypic and phenotypic correlations: Cheverud’s conjecture in humans. Genetics. 2018;209:941–948. doi: 10.1534/genetics.117.300630.
70.Vetr, N., Gay, N. & Stephen, M. The impact of exercise on gene regulation in association with complex trait genetics. Version 1.0.0, https://zenodo.org/records/10211801 (2023).

Associated Data

Supplementary Materials

Peer Review File (5.2MB, pdf)

Reporting Summary (2MB, pdf)

Supplementary Information (847.1KB, pdf)

Source Data (12.3MB, zip)
