eLife. 2022 Jan 31;11:e68048. doi: 10.7554/eLife.68048

Inter-tissue convergence of gene expression during ageing suggests age-related loss of tissue and cellular identity

Hamit Izgi 1, Dingding Han 2, Ulas Isildak 1, Shuyun Huang 2, Ece Kocabiyik 1, Philipp Khaitovich 3, Mehmet Somel 1, Handan Melike Dönertaş 4,5
Editors: Bérénice A Benayoun6, Kathryn Song Eng Cheah7
PMCID: PMC8880995  PMID: 35098922

Abstract

Developmental trajectories of gene expression may reverse in their direction during ageing, a phenomenon previously linked to cellular identity loss. Our analysis of cerebral cortex, lung, liver, and muscle transcriptomes of 16 mice, covering development and ageing intervals, revealed widespread but tissue-specific ageing-associated expression reversals. Cumulatively, these reversals create a unique phenomenon: mammalian tissue transcriptomes diverge from each other during postnatal development, but during ageing, they tend to converge towards similar expression levels, a process we term Divergence followed by Convergence (DiCo). We found that DiCo was most prevalent among tissue-specific genes and associated with loss of tissue identity, which is confirmed using data from independent mouse and human datasets. Further, using publicly available single-cell transcriptome data, we showed that DiCo could be driven both by alterations in tissue cell-type composition and also by cell-autonomous expression changes within particular cell types.

Research organism: Human, Mouse

Introduction

Development and ageing in multicellular organisms are highly intertwined processes. On the one hand, certain ageing-related phenotypes, such as presbyopia and osteoporosis (Luegmayr et al., 2004), are believed to represent the continuation of developmental processes into adulthood (Blagosklonny, 2006; de Magalhães and Church, 2005). Such cases of ‘runaway development’ or higher than optimal function during ageing (recognised as the hyperfunction theory of ageing; Gems and Partridge, 2013) may arise due to declined natural selection pressure failing to optimise expression regulation after sexual reproduction starts (Fisher, 1930; Medawar, 1953; Williams, 1957). Indeed, recent experimental studies in Caenorhabditis elegans show that senescence phenotypes promoted by insulin-IGF-1 signalling pathways support the hyperfunction theory (Lind et al., 2019; Ezcurra et al., 2018). On the other hand, molecular studies have also reported a reversal of the ageing transcriptome towards pre-adult levels in various contexts, including primate brain regions (Somel et al., 2010; Dönertaş et al., 2017; Colantuoni et al., 2011), and mouse liver and kidney (Anisimova et al., 2020). Studying the functional consequences of this reversal pattern in the ageing human brain, we previously interpreted it as an indication of loss of cellular identity in neurons, possibly exacerbated by a reduction in the relative frequencies of neurons (Dönertaş et al., 2017). Such changes, in turn, could be caused by the accumulation of stochastic damage at the genetic, epigenetic, and proteomic levels over an adult lifetime, causing deregulation of gene expression networks.

Several major questions remain. First, the prevalence of reversal phenotypes across tissues is unclear as most research has been conducted in the brain (Somel et al., 2010; Dönertaş et al., 2017). A second question pertains to the similarity of reversal-exhibiting genes and pathways across tissues. Ageing-related expression changes are partly shared among organs (Zahn et al., 2007), and reversal trends are also shared across different regions of the primate brain (Dönertaş et al., 2017). Distinct tissues might hence show parallel reversal patterns. Alternatively, as mammalian tissues diverge from each other during development in their transcriptome profiles (Cardoso-Moreira et al., 2019), one may hypothesise that during ageing tissues converge back towards similar transcriptome profiles. Such a putative late-age convergence phenomenon would be consistent with the notion of ageing-related cellular identity loss (Yang et al., 2019; Dönertaş et al., 2017). A final question concerns the mechanism behind the observed reversal trends at the bulk tissue level. Specifically, the contribution of cell-type composition and cell-autonomous changes to the reversals at the tissue level remains unexplored.

Documenting the reversal phenomenon is critical to better understand the proximate mechanisms of mammalian ageing, and its ultimate mechanisms, such as the stochastic disruption versus continued expression of developmental genes. However, such work has been limited by the scarcity of studies that include both development and ageing periods of the same organism and across different tissues. This work presents an age-series analysis of bulk transcriptome profiles of mice, including samples of four tissues across postnatal development and ageing periods covering the whole postnatal lifespan. Using this dataset, we study the prevalence, mechanisms, and functional consequences of the reversal phenomenon in different mouse tissues. We further test the related hypothesis of tissue convergence during ageing and investigate the contribution of cell-type composition and cell-autonomous changes.

Results

We generated bulk RNA-seq data from 63 samples covering the cerebral cortex (which we refer to as cortex), liver, lung, and skeletal muscle (which we refer to as muscle) of 16 male C57BL/6J mice, aged between 2 and 904 days of postnatal age (Materials and methods). As mice reach sexual maturity by around 2 months (Tacutu et al., 2018), we treated samples from individuals aged between 2 and 61 days (n = 7) as the development series, and those aged between 93 and 904 days (which roughly correspond to 80-year-old humans; Flurkey et al., 2007) (n = 9) as the ageing series (Figure 1—figure supplement 1). The final dataset contained n = 15,063 protein-coding genes expressed in at least 25% of the 63 samples (one 904-day-old mouse lacked cortex data).

Tissues diverge during postnatal development

Consistent with earlier work (Brawand et al., 2011; Cardoso-Moreira et al., 2019), we found that variation in gene expression is largely explained by tissue differences, such that the first three principal components (PCs) separate samples according to tissue (ANOVA p<10⁻²⁰ for PC1–3, Figure 1—source data 1), with the cortex most distant from the others (Figure 1a). Meanwhile, PC4, which explains 8% of the total variance, displayed a shared age effect across tissues in development (Spearman’s correlation coefficient ρ = [−0.88, −0.99], nominal p<0.01 for each test; Figure 1b). Also, after the tissue effect was removed by standardisation, principal components analysis (PCA) showed a strong influence of age on the first two PCs, which together explain 31% of the variance (Figure 1—figure supplement 2). We further observed higher similarity among tissues at the juvenile stage compared to the young-adult stage. In other words, distances between tissues increased with age (change in mean Euclidean distance among tissues with age during development in PC1–PC4 space ρdev = 0.99, pdev = 1.5 × 10⁻⁵, Figure 1—source data 1), which resonates with previous reports of inter-tissue transcriptome divergence during development (Cardoso-Moreira et al., 2019). This divergence pattern was also observed when PCA was performed with developmental samples only (days 2–61: change in mean Euclidean distance among tissues in PC1–PC4 space; ρ = 0.95, p=0.0008; Figure 1—figure supplement 3a and b).
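As a rough illustration of the distance statistic used above, the sketch below computes, for each individual, the mean pairwise Euclidean distance among its tissues in PC space and then rank-correlates those distances with age. It is a minimal sketch rather than the authors' pipeline: the function names, the two-dimensional coordinates, and the toy ages are hypothetical, and Spearman's ρ is computed by hand as the Pearson correlation of average ranks.

```python
from itertools import combinations
from math import dist
from statistics import mean


def rank(values):
    """Average 1-based ranks; tied values share the mean of their positions."""
    ordered = sorted(values)
    return [ordered.index(v) + (ordered.count(v) + 1) / 2 for v in values]


def spearman(x, y):
    """Spearman's rho as the Pearson correlation of ranks (inputs must vary)."""
    rx, ry = rank(x), rank(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


def mean_tissue_distance(pc_scores):
    """Mean pairwise Euclidean distance among one individual's tissues.

    pc_scores maps tissue name -> tuple of PC coordinates (e.g. PC1-PC4).
    """
    return mean(dist(a, b) for a, b in combinations(pc_scores.values(), 2))


# Hypothetical toy data: age in days -> tissue coordinates in a 2-D PC space.
individuals = {
    2: {"cortex": (0, 0), "liver": (1, 0), "lung": (0, 1)},
    10: {"cortex": (0, 0), "liver": (2, 0), "lung": (0, 2)},
    61: {"cortex": (0, 0), "liver": (3, 0), "lung": (0, 3)},
}
ages = sorted(individuals)
distances = [mean_tissue_distance(individuals[age]) for age in ages]
rho = spearman(ages, distances)
print(distances, rho)
```

In this toy series the tissues drift apart monotonically, so ρ is positive; a negative ρ over the ageing samples would correspond to the convergence trend discussed later.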

Figure 1. Data summary and age-related expression patterns.

(a) Principal components analysis (PCA) of expression levels of 15,063 protein-coding genes across four tissues of 16 mice. Values in parentheses show the variation explained by each component. (b) Age trajectories of PC3 (left) and PC4 (right). Spearman’s correlation coefficients between PC4 and age in each tissue in development range between 0.88 and 0.99 (see Figure 1—source data 1 for all tests). The dashed vertical line indicates 90 days of age, separating development and ageing periods. The age distribution of samples is given in Figure 1—figure supplement 1. (c) Similarity between the age-related gene expression changes (Spearman’s correlation coefficient between expression and age without a significance cutoff) across tissues in development and ageing. Similarities were calculated using Spearman’s correlation coefficient between expression-age correlations across tissues. CTX, cortex; LV, liver; LNG, lung; MS, muscle. (d) The number of significant age-related genes in each tissue (false discovery rate [FDR]-corrected p-value<0.1). (e) Shared age-related genes among tissues identified without using a significance cutoff. The x-axis shows the number of tissues among which age-related genes are shared. Significant overlaps are indicated with an asterisk (Figure 1—figure supplement 4). (f) The proportion of age-related expression change trends (no significance cutoff was used) in each tissue across the lifetime. UpDown: upregulation in development and downregulation in ageing; DownUp: downregulation in development and upregulation in ageing; UpUp: upregulation in development and upregulation in ageing; DownDown: downregulation in development and downregulation in ageing. We confirmed the robustness of the results using variance stabilising transformation (VST normalisation in Figure 1—figure supplement 10).

Figure 1—source data 1. Data summary, age-related expression patterns, and reversal patterns.

Figure 1—figure supplement 1. Age distribution of samples.

The x-axis shows the age in days in a log2 scale, and the y-axis lists different tissues. The period from 2- to 61-day-old mice is considered as postnatal development (referred to as development for brevity in the main text) and above 90-day-old as the ageing period. Random jitter was added on the y-axis to avoid overlap between points.
Figure 1—figure supplement 2. Principal components analysis (PCA) with all samples (tissue effect removed).

PCA using all samples (n = 16) after each tissue is standardised separately (i.e. gene expression values for individuals are scaled to mean = 0, sd = 1). PC1 (x-axis) and PC2 (y-axis) are plotted, and the variation explained by each PC is denoted within parentheses on each axis. The size of the points indicates the age, and the colour shows the tissue. The plots on the right show the correlations between the PCs (y-axis) and age (x-axis, on the log2 scale) in development and ageing. PC1-age Spearman’s correlation test during development (n = 7 mice); abs(ρdev) = [0.88, 0.99], nominal pdev < 0.01 for each tissue, same test for PC2 vs. age; abs(ρdev) = [0.30, 0.99], nominal pdev < 0.01 except muscle (Figure 1—source data 1).
Figure 1—figure supplement 3. Principal components analysis (PCA) with development and ageing periods separately.

PCA using only the samples from the development period (2–61 days of age, n = 7) (a–c) and the ageing period (93–904 days of age, n = 9) (d–f). (a, d) PC1 (x-axis) vs. PC2 (y-axis) and (b, e) PC3 (x-axis) vs. PC4 (y-axis) are plotted, and the variation explained by each PC is denoted within parentheses on each axis. The size of the points indicates the age, and the colour shows the tissue. (c, f) Correlation between the PCs (y-axis) and age (x-axis, in the log2 scale) in development (c) and ageing (f). (c) Age effects can be observed in PC2 and PC4 in development: PC2-age Spearman’s correlation test, abs(ρ) = [0.72, 0.94], nominal p<0.05 in 3/4 tissues; PC4-age Spearman’s correlation test, abs(ρ) = [0.88, 0.99], nominal p<0.01 in all tissues. Inter-tissue transcriptome divergence can be observed as a trend in PC3–PC4 space (change in the mean Euclidean distance among tissues with age in PC1–4 space, ρ = 0.95, p=0.0008). (f) A small age effect can be observed in PC4 in ageing: PC4-age Spearman’s correlation test: abs(ρ) = [0.11, 0.77], nominal p<0.05 in 2/4 tissues. Inter-tissue transcriptome convergence can be observed as a subtle trend in PC1–4 space (change in mean Euclidean distance among tissues with age in PC1–4 space, ρ = −0.64, p=0.059). All PC-age correlation test results are given in Figure 1—source data 1.
Figure 1—figure supplement 4. Permutation test results for shared expression trends among tissues.

Permutation test results of shared up/down genes across tissues for development and ageing periods. ‘Up’ and ‘down’ indicate positive and negative expression-age correlations (ρ), respectively. No significance cutoff was applied for choosing up/down genes in tissues (i.e. only considering ρ > 0 or ρ < 0). The null distributions are created by permuting individual ages and calculating expression-age correlations in each tissue, then summing the number of genes changing in the same direction in 2, 3, and 4 tissues. The red dashed lines show the observed values, also noted as ‘Obs.’ The estimated false-positive proportion (eFPP) was calculated as the ratio between the median expected value from the permutations and the observed value. p-Values were calculated as the proportion of permutations that are higher than or equal to the observed value.
Figure 1—figure supplement 5. Shared age-related genes among tissues in development and ageing.

(a) Overlap between significant (false discovery rate [FDR]-corrected p-value<0.1) age-related gene sets among tissues. The x-axis shows the number of tissues compared; 2: overlap in two tissues, 3: overlap in three tissues, 4: overlap in four tissues. Cyan: downregulation with age; pink: upregulation with age. Significant overlaps (permutation test, p<0.05, see Figure 1—figure supplement 6 for test results) are indicated with asterisks. (b) The differences between the magnitude of age-related expression changes in development and ageing: (abs(ρdev)–abs(ρageing)), for each gene (n = 15,063 genes) in four tissues (Wilcoxon signed-rank test, p<10⁻¹⁶ for each tissue).
Figure 1—figure supplement 6. Permutation test results for significant trends shared among tissues.

Permutation test result for shared ‘up’ (or ‘down’) genes among tissues in development (a) and ageing (b). ‘Up’ and ‘down’ indicate positive and negative expression-age correlations (ρ), respectively. Significant up/down genes were chosen with false discovery rate [FDR]-corrected p-value<0.1, and their overlap across tissues was calculated. To create the null distributions, we chose as many up (or down) genes in permutations as the observed up (or down) genes in each tissue and then calculated the number of overlapping genes among tissues. The dashed red line shows the observed number of shared up (or down) genes between tissues, and estimated false-positive proportion (eFPP) was calculated as the ratio between the median expected value from the permutations and the observed value. ‘Obs’: number of genes displaying the same significant age-related change pattern among tissues. The p-value was calculated as the proportion of permutations that are higher than or equal to the observed value.
Figure 1—figure supplement 7. Similarities between age-related gene expression changes among tissues.

The similarity between the age-related gene expression changes (Spearman’s correlation coefficient between expression and age) across tissues in development and ageing. Similarities were calculated using Spearman’s correlation coefficient between expression-age correlations (with cutoff: |ρ| > 0.6) across tissues. No significance cutoff was used for expression change similarities. The intensity of the colours shows the magnitude of the correlation coefficient, where darker blue indicates a stronger negative correlation and darker red indicates a stronger positive correlation. Correlation values are written on the lower triangle. The colour of the tissue label indicates development (orange) and ageing (blue) datasets.
Figure 1—figure supplement 8. Permutation test results for reversal patterns in each tissue.

Permutation test result for up-down and down-up reversal genes in each tissue. Developmental up- (or down-) genes, that is, genes with expression-age ρ > 0 (or ρ < 0), were kept constant, and the age labels of the individuals in the ageing period were permuted (Materials and methods). No significance cutoff was used in choosing genes. The dashed red line shows the observed (‘Obs’) up-down (or down-up) proportions in tissues, and estimated false-positive proportion (eFPP) was calculated as the median expected value of the permutations divided by the observed value. p-Values were calculated as the proportion of permutations that are higher than or equal to the observed value. Left panel: up-down reversal proportions were calculated as UD/(UD + UU). Right panel: down-up reversal proportions were calculated as DU/(DU + DD).
Figure 1—figure supplement 9. Permutation test results for shared reversals among tissues.

Permutation test result for shared up-down (or down-up) reversal genes across tissues. Developmental up- (or down-) genes were kept constant (among 2255 shared up-genes and 2209 shared down-genes in development), and the age labels of the individuals in the ageing period were permuted (Materials and methods). The dashed red line shows the observed (‘Obs’) up-down (or down-up) proportions shared among tissues, and estimated false-positive proportion (eFPP) was calculated as the median of the permutations divided by the observed value. p-Values were calculated as the proportion of permutations that are higher than or equal to the observed value. Left panel: up-down reversal proportions were calculated as UD/(UD + UU). Right panel: down-up reversal proportions were calculated as DU/(DU + DD).
Figure 1—figure supplement 10. Replication of Figure 1 results using variance stabilising transformation (VST) normalisation.

To confirm the robustness of the results to the choice of normalisation method, the analysis was repeated using an alternative normalisation approach (VST) implemented in the DESeq2 package (see Materials and methods). (a) Principal components analysis (PCA) of expression levels of 14,973 protein-coding genes across four tissues of 16 mice. Values in parentheses show the variation explained by each component. (b) Age trajectories of PC3 (left) and PC4 (right). Spearman’s correlation coefficients between PC4 and age in each tissue in development range between 0.58 and 0.99 (see Figure 1—source data 1 for all tests). The dashed vertical line indicates 90 days of age, separating development and ageing periods. (c) Similarity between the age-related gene expression changes (Spearman’s correlation coefficient between expression and age without a significance cutoff) across tissues in development and ageing. Similarities were calculated using Spearman’s correlation coefficient between expression-age correlations across tissues. CTX, cortex; LV, liver; LNG, lung; MS, muscle. (d) The number of significant age-related genes in each tissue (false discovery rate [FDR]-corrected p-value<0.1). (e) Shared age-related genes among tissues identified without using a significance cutoff. The x-axis shows the number of tissues among which age-related genes are shared. (f) The proportion of age-related expression change trends in each tissue across the lifetime. No significance cutoff was used. UpDown: upregulation in development and downregulation in ageing; DownUp: downregulation in development and upregulation in ageing; UpUp: upregulation in development and upregulation in ageing; DownDown: downregulation in development and downregulation in ageing.
Figure 1—figure supplement 11. Correlation between quantile normalised (QN) and variance stabilising transformation (VST) normalisation methods using age-related expression changes.

Spearman’s correlation coefficient between expression trajectories of QN (x-axis) and VST (VST method from DESeq2 package, y-axis) normalised data. Expression trajectories were calculated using Spearman’s correlation coefficient between age and expression level for each gene in both periods (ndev = [14,705, 14,710], nageing = [14,689, 14,710]). Blue lines represent the regression lines.
Figure 1—figure supplement 12. Clustering of genes by expression levels in cortex tissue.

k-means clustering (k = 15) of genes (15,063) using expression levels in cortex tissue. Numbers in the parentheses show the number of genes in each cluster. Expression levels of genes were scaled across samples (mean = 0, sd = 1) before clustering. The optimal number of clusters was determined with gap statistics (see Materials and methods). Clusters enriched among divergence-convergence (DiCo) genes compared to all other clusters are indicated with red colour, and the ones depleted among DiCo genes are indicated with blue colour. The list of genes belonging to each cluster and their enrichment among DiCo genes are given in Figure 1—source data 1.
Figure 1—figure supplement 13. Clustering of genes by expression levels in lung tissue.

k-means clustering (k = 17) of genes (15,063) using expression levels in lung tissue. Numbers in the parentheses show the number of genes in each cluster. Expression levels of genes were scaled across samples (mean = 0, sd = 1) before clustering. The optimal number of clusters was determined with gap statistics (see Materials and methods). Clusters enriched among divergence-convergence (DiCo) genes are indicated with red colour, and the ones depleted among DiCo genes are indicated with blue colour. The list of genes belonging to each cluster and their enrichment among DiCo genes are given in Figure 1—source data 1.
Figure 1—figure supplement 14. Clustering of genes by expression levels in liver tissue.

k-means clustering (k = 14) of genes (15,063) using expression levels in liver tissue. Numbers in the parentheses show the number of genes in each cluster. Expression levels of genes were scaled across samples (mean = 0, sd = 1) before clustering. The optimal number of clusters was determined with gap statistics (see Materials and methods). Clusters enriched among divergence-convergence (DiCo) genes are indicated with red colour, and the ones depleted among DiCo genes are indicated with blue colour. The list of genes belonging to each cluster and their enrichment among DiCo genes are given in Figure 1—source data 1.
Figure 1—figure supplement 15. Clustering of genes by expression levels in muscle tissue.

k-means clustering (k = 17) of genes (15,063) using expression levels in muscle tissue. Numbers in the parentheses show the number of genes in each cluster. Expression levels of genes were scaled across samples (mean = 0, sd = 1) before clustering. The optimal number of clusters was determined with gap statistics (see Materials and methods). Clusters enriched among divergence-convergence (DiCo) genes are indicated with red colour, and the ones depleted among DiCo genes are indicated with blue colour. The list of genes belonging to each cluster and their enrichment among DiCo genes are given in Figure 1—source data 1.

Tissues involve common gene expression changes with age

We next characterised age-related changes in gene expression shared across tissues by (1) studying overall trends at the whole transcriptome level and testing their consistency using permutation tests, and (2) studying statistically significant changes at the single-gene level. First, we investigated similarities in overall trends of gene expression changes with age using Spearman’s correlation coefficient (ρ) between expression levels and age for each gene in each tissue, separately for the developmental and ageing periods (Materials and methods; tissue-specific age-related gene expression changes and functional enrichment test results are available in Supplementary file 1). We then examined transcriptome-wide similarities across tissues during development and ageing by comparing these gene-wise expression-age correlation coefficients (Figure 1c). Considering the whole transcriptome without a significance cutoff, we found a weak correlation of age-related expression changes in tissue pairs, both during development (ρ = [0.17, 0.39], permutation test p<0.05 for all the pairs, Figure 1—source data 1), and ageing (ρ = [0.23, 0.33], permutation test p<0.05 in 4/6 pairs, Figure 1—source data 1). We then tested whether developmental patterns among tissues may be shared more than ageing-associated patterns, but we did not find a significant difference between inter-tissue similarities within development and those within ageing (Wilcoxon signed-rank test, p=0.31). Moreover, the number of genes with the same direction of change (without applying a significance cutoff) across four tissues was consistently more than expected by chance (permutation test p<0.05), except for genes upregulated in ageing (Figure 1e, Figure 1—figure supplement 4). This attests to overall similarities across tissues both during postnatal development and during ageing, albeit of modest magnitudes.
We obtained similar results using another normalisation approach, variance stabilising transformation (VST) from the DESeq2 package (Love et al., 2014), and confirmed that the observed patterns are not affected by the choice of normalisation method (Figure 1—figure supplements 10 and 11).
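The two-level correlation described above can be sketched as follows: first a gene-wise Spearman's ρ between expression and age within each tissue, then a second Spearman's ρ across genes between the ρ vectors of two tissues. This is only an illustrative sketch under stated assumptions: the gene names, expression values, ages, and helper names are hypothetical, and the permutation-based significance testing is not reproduced.

```python
from statistics import mean


def rank(values):
    """Average 1-based ranks; tied values share the mean of their positions."""
    ordered = sorted(values)
    return [ordered.index(v) + (ordered.count(v) + 1) / 2 for v in values]


def spearman(x, y):
    """Spearman's rho as the Pearson correlation of ranks (inputs must vary)."""
    rx, ry = rank(x), rank(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


def age_correlations(expression, ages):
    """Map gene -> Spearman's rho between its expression levels and age."""
    return {gene: spearman(levels, ages) for gene, levels in expression.items()}


def tissue_similarity(rho_a, rho_b):
    """Spearman's rho between two tissues' gene-wise expression-age correlations."""
    genes = sorted(set(rho_a) & set(rho_b))
    return spearman([rho_a[g] for g in genes], [rho_b[g] for g in genes])


# Hypothetical toy data: four individuals, three genes, two tissues.
ages = [2, 10, 30, 61]
liver = {"g1": [1, 2, 3, 4], "g2": [4, 3, 2, 1], "g3": [1, 3, 2, 4]}
lung = {"g1": [2, 3, 5, 9], "g2": [9, 7, 8, 1], "g3": [1, 2, 4, 3]}

liver_rho = age_correlations(liver, ages)
lung_rho = age_correlations(lung, ages)
similarity = tissue_similarity(liver_rho, lung_rho)
print(liver_rho, lung_rho, similarity)
```

Because the second step uses every gene's ρ regardless of its p-value, it captures transcriptome-wide trends rather than a list of individually significant genes.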

In the second approach, we focused on genes showing a significant age-related expression change, identified separately during development or during ageing (using Spearman’s correlation coefficient and false discovery rate [FDR]-corrected p-value<0.1, Figure 1d). We found that the developmental period was accompanied by a large number of significant changes (n = [1,941, 6,151], 13–41% across tissues), with the most manifest changes detected in the cortex. The genes displaying significant developmental changes across all four tissues also showed significant overlap (Figure 1—figure supplement 5a, Figure 1—figure supplement 6; permutation test: pshared_up = 0.027, pshared_down < 0.001). Using the Gene Ontology (GO), we found that shared developmentally upregulated genes were enriched in functions such as hormone signalling pathways and lipid metabolism (FDR-corrected p-value<0.1). Meanwhile, shared developmentally downregulated genes were enriched in functions such as cell cycle and cell division (FDR-corrected p-value<0.1; Supplementary file 2). Contrary to widespread expression change during development (13–41%), the proportion of genes undergoing significant expression change during ageing was between 0.013% and 15% (Figure 1d). This contrast between postnatal development and ageing was also observed in previous work on the primate brain (Somel et al., 2010; Işıldak et al., 2020). In terms of the number of genes with a significant ageing-related change, the most substantial effect we found was in the lung (n = 2319), while close to no genes showed a statistically significant change in the muscle (n = 2), a tissue previously noted for displaying a weak ageing transcriptome signature across multiple datasets (Turan et al., 2019). Not unexpectedly, we found no common significant ageing-related genes across tissues (Figure 1—figure supplement 5a).
Considering the similarity between the ageing and development datasets (Figure 1c) and the similar sample sizes in development (n = 7) and ageing periods (n = 9), the lack of overlap in significant genes in ageing might be due to low signal-to-noise ratios in the ageing transcriptome as ageing-related changes are subtler compared to those in development (Figure 1—figure supplement 5b).
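To make the FDR threshold concrete, the sketch below adjusts a small set of p-values and keeps those below 0.1. The text here does not name the correction procedure, so the Benjamini-Hochberg method is an assumption, and the p-values and function name are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order.

    Assumption: BH is used for illustration; the section only says
    'FDR-corrected' without naming the procedure.
    """
    m = len(p_values)
    # Walk from the largest p-value down, enforcing monotone adjusted values.
    order = sorted(range(m), key=lambda i: p_values[i], reverse=True)
    adjusted = [0.0] * m
    running_min = 1.0
    for position, idx in enumerate(order):
        ascending_rank = m - position  # rank of this p-value when sorted ascending
        running_min = min(running_min, p_values[idx] * m / ascending_rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical nominal p-values from gene-wise expression-age tests.
nominal = [0.01, 0.04, 0.03, 0.20]
adjusted = benjamini_hochberg(nominal)
significant = [q < 0.1 for q in adjusted]
print(adjusted, significant)
```

With few samples per period and subtle ageing effects, few nominal p-values are small enough to survive such an adjustment, which is in line with the low counts reported for ageing.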

Gene expression reversal is a common phenomenon in multiple tissues

We then turned to investigate the prevalence of the reversal phenomenon (i.e. an opposite direction of change during development and ageing) across the four tissues. We first compared the trends of age-related expression changes between development and ageing periods in the same tissue, without a significance cutoff, to assess transcriptome-wide reversal patterns (Figure 1c). This revealed weak negative correlation trends in liver and muscle (though not in the lung and cortex), that is, genes up- or downregulated during development tended to be down- or upregulated during ageing, respectively. These reversal trends were comparable when the analysis was repeated with the genes showing relatively high levels of age-related expression change (|ρ| > 0.6 in both periods; Figure 1—figure supplement 7). We further studied the reversal phenomenon by classifying each gene expressed per tissue (n = 15,063) into those showing up- or downregulation during development and during ageing. Here, again, we did not use a statistical significance cutoff and summarised trends of continuous change versus reversal in each tissue. This approach follows Dönertaş et al., 2017 and focuses on global trends instead of single genes. In line with the above results, as well as earlier observations in the brain, kidney, and liver (Dönertaş et al., 2017; Anisimova et al., 2020), we found that ~50% (43–58%) of expressed genes showed reversal trends (Figure 1f), although these proportions were not significantly more than randomly expected in permutation tests (Figure 1—figure supplement 8, Materials and methods). Overall, we conclude that although the reversal pattern is not ubiquitous, the expression trajectories of the genes do not necessarily continue linearly into the ageing period.
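The classification and the associated permutation test can be sketched roughly as below. Genes are labelled by the sign of their expression-age ρ in each period, the up-down reversal proportion is taken as UD/(UD + UU), and a null distribution is built by keeping developmental directions fixed while shuffling the ages of the ageing-period individuals. All names and values are hypothetical, ρ = 0 is arbitrarily treated as "Down", and the number of permutations is kept small for illustration.

```python
import random
from collections import Counter
from statistics import mean


def rank(values):
    """Average 1-based ranks; tied values share the mean of their positions."""
    ordered = sorted(values)
    return [ordered.index(v) + (ordered.count(v) + 1) / 2 for v in values]


def spearman(x, y):
    """Spearman's rho as the Pearson correlation of ranks (inputs must vary)."""
    rx, ry = rank(x), rank(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    return cov / (sum((a - mx) ** 2 for a in rx) * sum((b - my) ** 2 for b in ry)) ** 0.5


def trend(rho_dev, rho_ageing):
    """Label a gene UpUp, UpDown, DownUp or DownDown (rho == 0 counts as Down)."""
    return ("Up" if rho_dev > 0 else "Down") + ("Up" if rho_ageing > 0 else "Down")


def up_down_proportion(dev_rhos, ageing_rhos):
    """UD / (UD + UU): share of developmentally upregulated genes that reverse."""
    counts = Counter(trend(d, a) for d, a in zip(dev_rhos, ageing_rhos))
    return counts["UpDown"] / (counts["UpDown"] + counts["UpUp"])


def reversal_permutation_test(dev_rhos, ageing_expr, ageing_ages, n_perm=200, seed=0):
    """Return (observed proportion, share of permutations >= observed)."""
    rng = random.Random(seed)
    observed = up_down_proportion(dev_rhos, [spearman(e, ageing_ages) for e in ageing_expr])
    hits = 0
    for _ in range(n_perm):
        shuffled = list(ageing_ages)
        rng.shuffle(shuffled)  # permute age labels in the ageing period only
        permuted = [spearman(e, shuffled) for e in ageing_expr]
        if up_down_proportion(dev_rhos, permuted) >= observed:
            hits += 1
    return observed, hits / n_perm


# Hypothetical toy data: four genes, four ageing-period individuals.
dev_rhos = [0.9, 0.5, -0.7, 0.3]
ageing_ages = [93, 200, 400, 904]
ageing_expr = [[4, 3, 2, 1], [1, 2, 3, 4], [1, 2, 3, 4], [5, 4, 2, 3]]

labels = [trend(d, spearman(e, ageing_ages)) for d, e in zip(dev_rhos, ageing_expr)]
observed, p_value = reversal_permutation_test(dev_rhos, ageing_expr, ageing_ages)
print(labels, observed, p_value)
```

Because roughly half of all genes would be expected to reverse by chance alone, a reversal proportion near 50% need not be significant under this kind of null, which matches the cautious conclusion above.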

Pathways related to development, metabolism, and inflammation are associated with the reversal pattern

We then asked whether genes displaying reversal patterns in each tissue may be enriched in functional categories. Our earlier study focusing on different brain regions had revealed that up-down genes, that is, genes showing developmental upregulation followed by downregulation during ageing, were enriched in tissue-specific pathways, such as neuronal functions (Dönertaş et al., 2017). Analysing up-down genes compared to all genes upregulated during development, we also found significant enrichment (FDR-corrected p-value<0.1) in functions such as ‘synaptic signalling’ in the cortex, as well as ‘tube development’ and ‘tissue morphogenesis’ in the lung, ‘protein catabolic process’ in the liver, and ‘cellular respiration’ pathways in the muscle (Supplementary file 3). Meanwhile, down-up genes (downregulation during development followed by upregulation during ageing) showed significant enrichment in functions such as ‘wound healing’ and ‘peptide metabolic process’ in the cortex, ‘translation’ and ‘nucleotide metabolic process’ in the lung, ‘inflammatory response’ in the liver, and ‘leukocyte activation’ in the muscle (Supplementary file 3).

Genes showing a reversal pattern are not shared among tissues

As tissues displayed modest positive correlations in their development- or ageing-related expression change trends (Figure 1c, Figure 1—figure supplement 7), and as we had previously observed that distinct brain regions show similarities in their reversal patterns (i.e. the same genes showing the same reversal type), different tissues might also be expected to show similarities in their reversal patterns. Interestingly, we found no overlap between gene sets with the reversal pattern (up-down or down-up genes) across tissues, relative to random expectation (permutation test, pup-down = 0.08, pdown-up = 0.53; Figure 1—figure supplement 9). Such a lack of overlap might be explained if genes showing reversal patterns in each tissue tend to be tissue-specific. It would also be consistent with the notion that reversals involve loss of cellular identities gained in development, during which tissue transcriptomes appear to diverge from each other (Figure 1a, Figure 1—figure supplement 3; Cardoso-Moreira et al., 2019). This result led us to ask whether, in accordance with the reversal phenomenon, inter-tissue transcriptome divergence may be followed by increasing inter-tissue similarity, or convergence, during ageing.

Inter-tissue divergence during development and convergence during ageing

We studied the inter-tissue divergence/convergence question using two approaches. In the first, we analysed how transcriptome-wide expression variation among tissues changes with age regardless of their age-related expression patterns in any particular tissue. To do this, for each individual, we calculated the coefficient of variation (CoV) across the four tissues for each commonly expressed gene (n = 15,063), which represents a measure of expression variation among tissues. Then, we assessed how such inter-tissue variation changes over the lifetime by calculating the Spearman’s correlation coefficient between CoV and age separately for development and ageing periods (correlation values for all genes are given in Figure 2—source data 1).
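
The per-gene calculation described above can be sketched as follows. This is a minimal illustration in plain Python, not the authors' pipeline: the function names, the toy expression values, and the use of the sample standard deviation are assumptions made for the example.

```python
from statistics import mean, stdev

def cov(values):
    """Coefficient of variation (sd / mean) of one gene across tissues
    within one individual. Assumes positive expression values."""
    return stdev(values) / mean(values)

def ranks(xs):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    out = [0.0] * len(xs)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and xs[order[j + 1]] == xs[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            out[order[k]] = avg
        i = j + 1
    return out

def spearman(x, y):
    """Spearman's rho computed as the Pearson correlation of the ranks."""
    rx, ry = ranks(x), ranks(y)
    mx, my = mean(rx), mean(ry)
    num = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    den = (sum((a - mx) ** 2 for a in rx) * sum((b - my) ** 2 for b in ry)) ** 0.5
    return num / den

# Hypothetical gene: one row per individual, one column per tissue.
dev_ages = [2, 10, 30, 60]  # days, development period
dev_expr = [[5, 5, 5, 5], [4, 5, 6, 5], [3, 5, 7, 5], [2, 5, 8, 5]]
old_ages = [90, 300, 600, 900]  # days, ageing period
old_expr = [[2, 5, 8, 5], [3, 5, 7, 5], [4, 5, 6, 5], [5, 5, 5, 5]]

rho_dev = spearman(dev_ages, [cov(row) for row in dev_expr])
rho_old = spearman(old_ages, [cov(row) for row in old_expr])
print(rho_dev, rho_old)
```

In this toy case the gene has a positive ρ in development (divergence) and a negative ρ in ageing (convergence). The `spearman` helper is undefined when either input is constant, which a real analysis would need to handle.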

Using the CoV values calculated across all 15,063 genes (excluding one 904-day-old individual for which we lacked the cortex data), we observed a significant mean CoV increase in development (Spearman’s correlation coefficient ρ = 0.77, two-sided p=0.041), confirming that tissues diverge as development progresses (Figure 2a). Interestingly, during ageing, we observed a decrease in mean CoV with age, albeit not significant (ρ = −0.50, p=0.204, Figure 2a), suggesting that tissues may tend to converge during ageing. This was also supported by the PCA in which we observed a trend of ageing-associated decrease in mean Euclidean distance among tissues (using PC1–PC4 space with quantile-normalised data: ρ = −0.87, p=0.0026; with VST-normalised data ρ = −0.58, p=0.102, Figure 1—source data 1). We obtained the same divergence-convergence pattern by calculating the median CoV values for each individual instead of the mean (Figure 2—figure supplement 1). Figure 2b exemplifies this pattern of increasing and then decreasing CoV through lifetime for the gene displaying the strongest such signal.
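
The PCA-based summary mentioned above (mean Euclidean distance among tissues of the same individual in PC1–PC4 space) could be computed along these lines. The coordinates below are invented for illustration, and the helper name is an assumption.

```python
from itertools import combinations
from math import dist
from statistics import mean

def mean_pairwise_distance(points):
    """Mean Euclidean distance over all pairs of tissue coordinates."""
    return mean([dist(p, q) for p, q in combinations(points, 2)])

# Hypothetical PC1-PC4 coordinates for the four tissues of one individual.
tissues = {
    "cortex": (3.0, 0.0, 0.0, 0.0),
    "lung": (0.0, 4.0, 0.0, 0.0),
    "liver": (0.0, 0.0, 0.0, 0.0),
    "muscle": (3.0, 4.0, 0.0, 0.0),
}
d = mean_pairwise_distance(list(tissues.values()))
print(d)
```

Repeating this per individual and correlating the resulting distances with age gives one value of ρ per period.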

Figure 2. Age-related change in gene expression variation among tissues estimated with coefficient of variation (CoV).

(a) Transcriptome-wide mean CoV trajectory with age. Each point represents the mean CoV value of all protein-coding genes (15,063) for each mouse (n = 15) except the one that lacks expression data in the cortex. (b) Age effect on CoV value of the Cd93 gene which has the highest rank for the divergence-convergence (DiCo) pattern in four tissues (Materials and methods). CoV increases during development and decreases during ageing, indicating expression levels show DiCo patterns among tissues. (c) Expression trajectories of the gene Cd93 in four tissues. (d) The number of significant CoV changes with age (false discovery rate [FDR]-corrected p-value<0.1) during development (left, nconverge = 772, ndiverge = 1809) and ageing (right, nconverge = 42, ndiverge = 20). Converge: genes showing a negative correlation (ρ) between CoV and age; diverge: genes showing a positive correlation between CoV and age. (e) Log2 ratio of convergent/divergent genes in development and in ageing. The graph represents only genes showing significant CoV changes (FDR-corrected p-value<0.1, given in panel d). Error bars represent the range of log2 ratios calculated from leave-one-out samples using the jackknife procedure (Materials and methods, values are given in Figure 2—source data 1).

Figure 2—source data 1. All the data related to divergence-convergence (DiCo) pattern: age-related coefficient of variation (CoV) change of genes, pairwise tissue expression correlations, analysis of independent datasets; GSE34378 (Jonker et al.), GSE132040 (Schaum et al.), and GTEx.

Figure 2—figure supplement 1. Age-related change in coefficient of variation (CoV) summarised across genes using median CoV values.

Each point represents the median CoV value (instead of the mean given in Figure 2a) of all protein-coding genes (15,063) for each mouse except the one that lacks expression data in the cortex (n = 15). x-axis is in log2 scale. The dashed grey line shows the start of the ageing period. The Spearman’s correlation coefficient and p-value for each period are indicated separately on the plot.
Figure 2—figure supplement 2. Clustering of divergence-convergence (DiCo) genes by expression variations (coefficient of variation [CoV]) among tissues.

k-means clustering (k = 7) of DiCo genes (n = 4802) using CoV values. Numbers in the parentheses show the number of genes in each cluster. CoV values were scaled across genes (mean = 0, sd = 1) before clustering. The optimal number of clusters was determined with gap statistics (Materials and methods). The list of genes belonging to each cluster and their age-related CoV change correlations are given in Figure 2—source data 1.
Figure 2—figure supplement 3. Clustering of divergence-convergence (DiCo) genes by expression levels in tissues.

k-means clustering (k = 25) of DiCo genes (n = 4802) using gene expression levels. Numbers in the parentheses show the number of genes in each cluster. Expression levels of genes were scaled across tissues (mean = 0, sd = 1) before clustering. The optimal number of clusters was determined with gap statistics (Materials and methods). The list of genes belonging to each cluster and their age-related CoV change correlations are given in Figure 2—source data 1.
Figure 2—figure supplement 4. Number of genes with inter-tissue divergence and convergence tendencies in development and ageing.

The number of coefficient of variation (CoV) changes with age (without a significance cutoff) during development and ageing. Converge: genes showing negative correlation (ρ < 0) between CoV and age; diverge: genes showing positive correlation (ρ > 0) between CoV and age (development: nconverge = 5939, ndiverge = 9058; ageing: nconverge = 7748, ndiverge = 7187).
Figure 2—figure supplement 5. Pairwise tissue expression correlations.

Age-related changes in pairwise Spearman’s correlation coefficients for the expression levels (y-axis) between tissues of the same individual mouse in our dataset. The dashed grey line indicates the start of the ageing period. The Spearman’s correlation coefficients and p-values for each period are indicated separately on the plot.
Figure 2—figure supplement 6. Summary of pairwise expression correlations among tissues.

Age-related change in the mean (left) or the median (right) pairwise expression correlations among tissues. Each point represents the mean (left) or the median (right) of pairwise expression correlations among tissues of the same mouse (mean/median values are calculated from Figure 2—figure supplement 5). (a) Absolute expression correlations were used to calculate the mean or the median. (b) Expression correlations were scaled within each tissue pair (mean = 0, sd = 1) before calculating the mean and median. The Spearman’s correlation coefficients and p-values for each period are indicated separately on the plot.
Figure 2—figure supplement 7. Coefficient of variation (CoV) and pairwise correlation analysis of Jonker dataset.

(a, b) Principal components analysis (PCA) of expression values of 17,661 protein-coding genes across five tissues (brain [cortex], liver, lung, kidney, spleen) of 18 individuals in the Jonker dataset (contains samples only from the ageing period). Values in parentheses show the variance explained by each PC. (c) The change in mean pairwise Euclidean distance between the PC values for the tissues of the same individuals (y-axis) with age (x-axis). Transcriptome-wide (d) mean and (e) median CoV changes with age across five tissues. The x-axis shows age in days. Each point represents the mean or median CoV value of all protein-coding genes for each individual. (f) Association between age (x-axis) and gene expression correlations of each individual in pairwise tissues (y-axis). Spearman’s correlation coefficient and p-values are indicated in each plot.
Figure 2—figure supplement 8. Principal components analysis (PCA) of GTEx dataset covering cortex, liver, lung, and muscle tissues.

(a, b) PCA of expression values of 16,197 genes across four tissues (cortex, liver, lung, muscle) of 47 individuals in GTEx. Values in parentheses show the variance explained by each PC. (c) The change in mean pairwise Euclidean distance between the PC values for the tissues of the same individuals (y-axis) with age (x-axis). (d–g) Association between the first four PCs (y-axis) and age (x-axis). The tissue and age of the samples are indicated by the colour and size of the points, respectively. Spearman’s correlation test results are indicated in each plot.
Figure 2—figure supplement 9. Coefficient of variation (CoV) and pairwise correlation analysis of GTEx dataset covering cortex, liver, lung, and muscle tissues.

(a, b) Transcriptome-wide mean (a) and median (b) CoV change with age across four tissues (cortex, liver, lung, muscle) in GTEx. Each point represents the mean or median CoV value of all protein-coding genes (16,197) for each individual (n = 47) in GTEx. Spearman’s correlation coefficients and p-values are also presented in the plot. (c) The change in pairwise Spearman’s correlation coefficient between gene expression values of the same individual across ages (y-axis) with age (x-axis). Spearman’s correlation coefficient and p-values between the pairwise tissue correlations and age are also presented in each plot.
Figure 2—figure supplement 10. Principal components analysis (PCA) of GTEx dataset with 10 tissues.

(a, b) PCA of expression values of 16,290 genes across 10 tissues of 35 individuals in GTEx. Values in parentheses show the variance explained by each PC. (c) The change in mean pairwise Euclidean distance between the PC values for the tissues of the same individuals (y-axis) with age (x-axis). (d–g) Association between the first four PCs (y-axis) and age (x-axis). The tissue and age of the samples are indicated by the colour and size of the points, respectively.
Figure 2—figure supplement 11. Coefficient of variation (CoV) and pairwise correlation analysis of GTEx dataset with 10 tissues.

(a, b) Transcriptome-wide mean (a) and median (b) CoV change with age across 10 tissues in GTEx. Each point represents the mean or median CoV value of all protein-coding genes (16,290) for each individual (n = 35) in GTEx. Spearman’s correlation coefficients and p-values are also presented in the plot. (c) Age-related changes in pairwise Spearman’s correlation coefficient between gene expression values of the same individual. The colour of points shows the correlations between age and pairwise correlations, where darker red colour indicates an increased correlation with age and darker blue indicates a decreased correlation. The size of points shows the mean similarity (correlation) between tissues using all ages. None of the correlations is significant after multiple testing correction (using Benjamini–Hochberg [BH]).
Figure 2—figure supplement 12. Permutation test result for the proportion of divergence-convergence (DiCo) genes.

DiCo genes (n = 4802) were tested with a permutation-based test explained in Materials and methods. We kept the divergent genes (n = 9058) in development constant and permuted age labels of individuals in the ageing period. Then, we calculated the DiCo proportion among those genes in permutations. ‘Obs’: observed DiCo proportion (Obs = 4802/9058, i.e. DiCo/(DiCo + Di–); Di–: divergence across lifetime). Estimated false-positive proportion (eFPP) was calculated as the median expected proportion divided by the observed value. The p-value was calculated as the proportion of permutations that are higher than or equal to the observed value.
Figure 2—figure supplement 13. Clustering of tissues by the presence of samples from the same individuals.

Heatmap showing whether individuals (columns) have samples (light blue colour) in tissues (y-axis).
Figure 2—figure supplement 14. Reproducing Figure 2 results with variance stabilising transformation (VST) normalisation.

(a) Transcriptome-wide mean coefficient of variation (CoV) trajectory with age. Each point represents the mean CoV value of all protein-coding genes (14,973) for each mouse (n = 15) except the one that lacks expression data in the cortex. (b) Age effect on CoV value of the Cd93 gene which has the highest rank for the divergence-convergence (DiCo) pattern in four tissues (Materials and methods). CoV increases during development and decreases during ageing, indicating expression levels show DiCo patterns among tissues. (c) Expression trajectories of the gene Cd93 in four tissues. (d) The number of significant CoV changes with age (false discovery rate [FDR]-corrected p-value<0.1) during development (left, nconverge = 398, ndiverge = 3,078) and ageing (right, nconverge = 13, ndiverge = 6). Converge: genes showing a negative correlation (ρ) between CoV and age; diverge: genes showing a positive correlation between CoV and age. (e) Log2 ratio of convergent/divergent genes in development and in ageing. The graph represents only genes showing significant CoV changes (at FDR-corrected p-value<0.1, given in panel d). Error bars represent the range of log2 ratios calculated from leave-one-out samples in jackknife procedure.
Figure 2—figure supplement 15. Effect of heteroscedasticity to divergence-convergence (DiCo) pattern.

Two different heteroscedasticity tests were performed to compare DiCo (n = 4802) vs. divergent-divergent (DiDi) (n = 4182, divergent throughout the lifetime) genes to test whether the convergence pattern is a result of the regression towards the mean. (a) Density plots of Spearman’s correlation coefficients (x-axis) between heterogeneity and age for DiCo and DiDi genes in each tissue. Heterogeneity was calculated as the absolute residuals of the linear regression between age (log2 scale) and expression (see Materials and methods). Only in muscle tissue the two-sided Kolmogorov–Smirnov (KS) test result was marginally significant in the direction of higher heterogeneity change for DiDi genes (p=0.0496). (b) Density plots of chi-square test statistics (x-axis) from Breusch–Pagan test (from ‘car’ package in R) between expression level and age (log2 scale) for DiCo and DiDi genes in each tissue. Only in muscle tissue the two-sided KS test result was significant in the direction of higher heterogeneity change for DiDi genes (p=0.0423). p-Values of KS test results between DiCo and DiDi genes are given within each plot.
Figure 2—figure supplement 16. Sex effect on coefficient of variation (CoV) analysis using GTEx.

(a, b) Transcriptome-wide mean (a) and median (b) CoV change with age across four tissues (cortex, liver, lung, muscle) in GTEx for female (n = 11) and male (n = 36) individuals separately. Each point represents the mean or median CoV value of all protein-coding genes (16,197) for each individual. Spearman’s correlation coefficients and p-values are also presented in the plots. (c, d) The change in pairwise Spearman’s correlation coefficient between gene expression values of the same individual (y-axis) for (c) females (n = 11) and (d) males (n = 36), across ages (x-axis). Spearman’s correlation coefficient and p-values between the pairwise tissue correlations and age are also presented in each plot.
Figure 2—figure supplement 17. Principal components analysis (PCA) of Schaum dataset covering cortex, liver, lung, and muscle tissues.

(a, b) PCA of expression values of 16,806 genes across four tissues (cortex, liver, lung, muscle) of 37 individuals in the Schaum dataset. Values in parentheses show the variance explained by each PC. (c) The change in mean pairwise Euclidean distance between the PC values for the tissues of the same individuals (y-axis) with age (x-axis, in months). (d–g) Association between the first four PCs (y-axis) and age (x-axis, in months). The tissue and age of the samples are indicated by the colour and size of the points, respectively. Spearman’s correlation test results are indicated in each plot.
Figure 2—figure supplement 18. Coefficient of variation (CoV) and pairwise correlation analysis of Schaum dataset covering cortex, liver, lung, and muscle tissues.

(a, b) Transcriptome-wide mean (a) and median (b) CoV change with age (in months) across four tissues (cortex, liver, lung, muscle) in Schaum dataset. Each point represents the mean or median CoV value of all protein-coding genes (16,806) for each individual (n = 37). Spearman’s correlation coefficients and p-values are also presented in the plot. (c) The change in pairwise Spearman’s correlation coefficient between gene expression values of the same individual (y-axis) with age (x-axis, in months). Spearman’s correlation coefficient and p-values between the pairwise tissue correlations and age are also presented in each plot.
Figure 2—figure supplement 19. Principal components analysis (PCA) of Schaum dataset with eight tissues.

(a, b) PCA of expression values of 17,619 genes across eight tissues of 26 individuals in the Schaum dataset. Values in parentheses show the variance explained by each PC. (c) The change in mean pairwise Euclidean distance between the PC values for the tissues of the same individuals (y-axis) with age (x-axis, in months). (d–g) Association between the first four PCs (y-axis) and age (x-axis, in months). The tissue and age of the samples are indicated by the colour and size of the points, respectively.
Figure 2—figure supplement 20. Coefficient of variation (CoV) and pairwise correlation analysis of Schaum dataset with eight tissues.

(a, b) Transcriptome-wide mean (a) and median (b) CoV change with age (in months) across eight tissues (brain [cortex], heart, kidney, liver, lung, muscle, spleen, subcutaneous fat) in Schaum dataset. Each point represents the mean or median CoV value of all protein-coding genes (17,619) for each individual (n = 26). Spearman’s correlation coefficients and p-values are also presented in the plot. (c) Age-related changes in pairwise Spearman’s correlation coefficient between gene expression values of the same individual. The colour of points shows the correlations between age and pairwise correlations, where darker red colour indicates an increased correlation with age and darker blue indicates a decreased correlation. The size of points shows the mean similarity (correlation) between tissues using all ages. Significant correlations after multiple testing correction using the Benjamini–Hochberg (BH) procedure are indicated with circles around the points (5/7 of significant correlations were positive).

We identified n = 9058 genes showing divergent trends among tissues in development based on their CoV change with age (without using a significance cutoff per gene). Among these, n = 4802 showed convergent trends in ageing, which we refer to as DiCo genes. We next studied the transition points between divergence and convergence by clustering genes showing the DiCo pattern (n = 4802) based on their CoV values (Figure 2—figure supplement 2). Notably, cluster 1, which shows a slightly delayed divergence starting after 8 days and peaks around 3 months, was associated with metabolic and respiration-related processes (FDR-corrected p-value<0.1), and cluster 5, which shows a relatively delayed convergence after 4 months, was enriched in categories related to vascular development (FDR-corrected p-value<0.1) (Supplementary file 4). To assess the contribution of different tissues to the DiCo pattern, we further clustered DiCo-displaying genes (n = 4802) based on their expression levels (Figure 2—figure supplement 3). Not surprisingly, the clusters with relatively higher expression levels of a tissue (e.g. muscle in cluster 9) were enriched in functional categories (FDR-corrected p-value<0.1) related to that tissue (e.g. muscle cell development) (Supplementary file 5).
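
The proportion of DiCo genes among developmentally divergent genes (4802/9058) was assessed with a permutation test, summarised in the caption of the corresponding figure supplement: age labels in the ageing period are permuted while the gene set is held fixed. A minimal sketch under stated assumptions is given below. The helper names, the toy CoV values, and the tie-free Spearman shortcut are all illustrative choices, not the authors' implementation.

```python
import random
from statistics import median

def spearman_no_ties(x, y):
    """Spearman's rho via the rank-difference formula (assumes no ties)."""
    n = len(x)
    rank_x = {v: r for r, v in enumerate(sorted(x), start=1)}
    rank_y = {v: r for r, v in enumerate(sorted(y), start=1)}
    d2 = sum((rank_x[a] - rank_y[b]) ** 2 for a, b in zip(x, y))
    return 1 - 6 * d2 / (n * (n * n - 1))

def convergent_proportion(ages, cov_by_gene):
    """Fraction of genes whose CoV correlates negatively with age."""
    negative = sum(spearman_no_ties(ages, cov) < 0 for cov in cov_by_gene)
    return negative / len(cov_by_gene)

def permutation_test(ages, cov_by_gene, n_perm=1000, seed=1):
    """Permute age labels; genes (and their co-variation) stay fixed."""
    rng = random.Random(seed)
    observed = convergent_proportion(ages, cov_by_gene)
    permuted = []
    for _ in range(n_perm):
        shuffled = list(ages)
        rng.shuffle(shuffled)
        permuted.append(convergent_proportion(shuffled, cov_by_gene))
    # p-value: share of permutations at least as large as the observed value.
    p_value = sum(p >= observed for p in permuted) / n_perm
    # eFPP: median permuted proportion divided by the observed proportion.
    efpp = median(permuted) / observed
    return observed, p_value, efpp

# Hypothetical CoV values in the ageing period for four divergent genes.
ageing_ages = [90, 300, 600, 900, 1200]
cov_by_gene = [
    [0.5, 0.4, 0.3, 0.2, 0.1],
    [0.9, 0.7, 0.8, 0.5, 0.4],
    [0.3, 0.35, 0.2, 0.25, 0.1],
    [0.1, 0.2, 0.3, 0.4, 0.5],
]
observed, p_value, efpp = permutation_test(ageing_ages, cov_by_gene)
print(observed, p_value, efpp)
```

Permuting individuals rather than genes keeps correlations among genes intact, so the null distribution reflects the non-independence of gene expression. The sketch divides by the observed proportion and would therefore fail if no gene were convergent.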

We then studied DiCo at the single-gene level. We tested each gene for a significant CoV change in its expression levels (i.e. divergence or convergence) in development and ageing (Spearman’s correlation test with FDR-corrected p-value<0.1). We found that the ratio of divergent to convergent genes differed significantly between development (70% divergence among 2581 significant genes) and ageing (68% convergence among 62 significant genes) (Figure 2d and e). The same pattern was observed without using a significance cutoff (Figure 2—figure supplement 4). We also confirmed that this pattern holds with VST-normalised data (Materials and methods), and is thus not affected by the data preprocessing approach (Figure 2—figure supplement 14).
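
The percentages above and the log2 ratios plotted in Figure 2e follow directly from the gene counts reported in the Figure 2d caption. The short calculation below reproduces that arithmetic; the dictionary layout and function name are illustrative, and the jackknife error bars are not reproduced here.

```python
from math import log2

# Counts of genes with significant CoV change (Figure 2d caption).
counts = {
    "development": {"converge": 772, "diverge": 1809},
    "ageing": {"converge": 42, "diverge": 20},
}

def summarise(c):
    """Totals, percentages and log2(convergent / divergent) for one period."""
    total = c["converge"] + c["diverge"]
    return {
        "total": total,
        "pct_diverge": 100 * c["diverge"] / total,
        "pct_converge": 100 * c["converge"] / total,
        "log2_ratio": log2(c["converge"] / c["diverge"]),
    }

summary = {period: summarise(c) for period, c in counts.items()}
for period, s in summary.items():
    print(period, s["total"], round(s["pct_diverge"]), round(s["pct_converge"]),
          round(s["log2_ratio"], 2))
```

A negative log2 ratio indicates an excess of divergent genes (development), and a positive one an excess of convergent genes (ageing).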

To our knowledge, inter-tissue convergence during ageing is a novel phenomenon. We first considered the possibility that convergence during ageing could be explained by heteroscedasticity, which could arise due to increased inter-individual variability in gene expression during ageing (Somel et al., 2006). To test this hypothesis, we compared expression-age heteroscedasticity levels between two gene sets: (1) genes with the DiCo pattern and (2) genes showing divergent patterns throughout lifetime (divergent-divergent [DiDi, n = 4182]) for each tissue separately (Materials and methods). We did not observe any significant difference in heteroscedasticity between DiCo and DiDi genes in any of the tissues (two-sided Kolmogorov–Smirnov [KS] test, p>0.05 in all tissues, Figure 2—figure supplement 15), which suggests that heteroscedasticity due to increased inter-individual variability probably does not drive the observed convergence during ageing. Visual inspection of gene expression clusters also suggested that the DiCo pattern is not particularly associated with nonlinear changes in gene expression with age (Figure 1—figure supplements 12–15).
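
According to the caption of the heteroscedasticity supplement, heterogeneity was measured as the absolute residuals of a linear regression between age (log2 scale) and expression. A minimal sketch of that first step, for one gene in one tissue, might look as follows; the function name and the toy values are assumptions, and the subsequent steps (correlating the residuals with age and comparing DiCo with DiDi genes using a KS test) are not shown.

```python
from math import log2
from statistics import mean

def abs_residuals(ages, expression):
    """Absolute residuals from an ordinary least-squares fit of
    expression on log2(age)."""
    x = [log2(a) for a in ages]
    mx, my = mean(x), mean(expression)
    sxx = sum((xi - mx) ** 2 for xi in x)
    slope = sum((xi - mx) * (yi - my) for xi, yi in zip(x, expression)) / sxx
    intercept = my - slope * mx
    return [abs(yi - (intercept + slope * xi)) for xi, yi in zip(x, expression)]

# Hypothetical gene whose scatter around the trend grows with age.
ages = [2, 4, 8, 16, 32, 64]  # days
expression = [1.0, 1.0, 1.2, 0.8, 1.5, 0.5]
heterogeneity = abs_residuals(ages, expression)
print([round(r, 3) for r in heterogeneity])
```

A positive correlation between these residuals and age would indicate increasing inter-individual variability for that gene.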

To further verify the DiCo pattern, we used a second approach to test it in our mouse dataset. For each individual, we calculated correlations between pairs of tissues across their gene expression profiles. Under the DiCo pattern, we would expect pairwise correlations to decrease during development and increase during ageing. Among all pairwise comparisons, we observed strong negative correlations with age during development (ρ = [−0.61, −0.9], nominal p<0.05 in five out of six tests), while during ageing, four out of six comparisons showed a moderate positive correlation (ρ = [0.16, 0.69], nominal p<0.05 in one out of six comparisons, Figure 2—figure supplement 5). Calculating the mean of pairwise correlations among tissues for each individual, we observed the same DiCo pattern (nominal p<0.05 for both periods, Figure 2—figure supplement 6).
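
This second approach amounts to computing, for each individual, one correlation coefficient per tissue pair (six pairs for four tissues) and then relating those coefficients to age. The sketch below covers the per-individual step only. The expression profiles are invented, and the tie-free Spearman shortcut is an assumption that would not hold for real expression data.

```python
from itertools import combinations
from statistics import mean

def spearman_no_ties(x, y):
    """Spearman's rho via the rank-difference formula (assumes no ties)."""
    n = len(x)
    rank_x = {v: r for r, v in enumerate(sorted(x), start=1)}
    rank_y = {v: r for r, v in enumerate(sorted(y), start=1)}
    d2 = sum((rank_x[a] - rank_y[b]) ** 2 for a, b in zip(x, y))
    return 1 - 6 * d2 / (n * (n * n - 1))

# Hypothetical expression of five genes in four tissues of one individual.
profiles = {
    "cortex": [1.0, 2.0, 3.0, 4.0, 5.0],
    "lung": [2.0, 1.0, 3.0, 4.0, 5.0],
    "liver": [5.0, 3.0, 1.0, 2.0, 4.0],
    "muscle": [4.0, 5.0, 2.0, 1.0, 3.0],
}
pairwise = {
    (a, b): spearman_no_ties(profiles[a], profiles[b])
    for a, b in combinations(profiles, 2)
}
mean_similarity = mean(list(pairwise.values()))
print(pairwise, mean_similarity)
```

Under the DiCo pattern, `mean_similarity` would be expected to fall with age in development and rise with age in ageing.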

The DiCo pattern indicates loss of tissue specificity during ageing

Potential explanations of the DiCo pattern involve two scenarios consistent with the age-related loss of identity: (1) decreased expression of tissue-specific genes in their native tissues or (2) non-specific expression of tissue-specific genes in other tissues. To test these predictions, we first identified tissue-specific gene sets based on relatively high expression of that gene in a particular tissue (cortex: 1175; lung: 839; liver: 986; muscle: 766 genes). We noted that tissue-specific genes show clear up-down reversal patterns, being mostly upregulated during development, and downregulated during ageing (Figure 3, 57–89%). The up-down reversal pattern was particularly strong among tissue-specific genes for three of the four tissues tested (OR = [1.65, 6.52], p<0.05 for each tissue except in liver: OR = 0.87, p=0.09, Figure 3—source data 1). Tissue-specific genes were also enriched among DiCo genes (Figure 3—source data 1, OR = 1.56, Fisher’s exact test p<10⁻¹⁶).
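
Tissue specificity was quantified with an effect size based on the Cohen's d formula (Figure 3 caption). One common form of that statistic, comparing expression in a focal tissue against the remaining tissues, is sketched below. The grouping, the pooled-variance form, and the toy values are assumptions, and the cutoff used to call a gene tissue-specific is not given in this section.

```python
from statistics import mean, variance

def cohens_d(group, others):
    """Cohen's d with a pooled standard deviation."""
    n1, n2 = len(group), len(others)
    pooled_var = ((n1 - 1) * variance(group) + (n2 - 1) * variance(others)) / (n1 + n2 - 2)
    return (mean(group) - mean(others)) / pooled_var ** 0.5

# Hypothetical expression of one gene: focal tissue vs. all other tissues.
focal = [8.0, 9.0, 10.0]
rest = [1.0, 2.0, 3.0]
d = cohens_d(focal, rest)
print(d)
```

A large positive value indicates that the gene is expressed at a markedly higher level in the focal tissue than elsewhere.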

Figure 3. Reversal patterns among tissue-specific genes.

Age-related expression changes of the tissue-specific genes. In each panel (a–d), the upper-left subpanels show effect size (ES) calculated with the Cohen’s d formula, using expression levels of each gene among tissues (Materials and methods). The IQR (line range) and median (point) ES for each tissue are shown. The number of tissue-specific genes is indicated inside each subpanel. The lower-left subpanels show violin plots of the distribution of age-related expression change values (Materials and methods) among tissue-specific genes in development and in ageing. Each quadrant represents the plots for each tissue-specific gene group. The red and blue lines connect gene expression changes for the same genes in development and ageing. DU: percentage of down-up reversal genes among downregulated, tissue-specific genes in development; UD: percentage of up-down reversal genes among upregulated, tissue-specific genes in development. Tissue-specific genes are enriched among UD reversal genes except in the liver (Fisher’s exact test; ORcortex = 1.65, ORlung = 6.52, ORliver = 0.87, ORmuscle = 1.26, p<0.05 for each test except in liver).

Figure 3—source data 1. Effect sizes for determination of tissue-specific genes, enrichment of divergence-convergence (DiCo), and reversal genes within tissue-specific genes.

We then tested our initial prediction that the DiCo pattern is related to tissue-specific genes losing their expression in their native tissue and/or gaining expression in non-native tissues during ageing. We first tested this hypothesis by considering all tissue-specific genes. We found a positive association between loss of expression in native tissue and gain in other tissues during ageing (OR = 5.50, Fisher’s exact test p=2.1 × 10⁻¹²⁹, Figure 4a). The same analysis conducted with only the DiCo genes yielded a much stronger association (OR = 74.81, Fisher’s exact test p=5.9 × 10⁻²⁰³, Figure 4b). This suggests that loss of tissue-specific expression is observed across the transcriptome, with a particularly strong association among DiCo genes. Figure 4d–g exemplifies the expression trajectories of genes chosen from each group defined in Figure 4b.
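
The association test above operates on a 2×2 table that crosses where a gene shows its maximal ageing-related change (native vs. non-native tissue) with the direction of that change (down vs. up). A self-contained sketch of a two-sided Fisher's exact test with a sample odds ratio is given below. The counts are invented, and the table orientation is an assumption based on the description of Figure 4a.

```python
from math import comb

def fisher_exact(a, b, c, d):
    """Two-sided Fisher's exact test for the table [[a, b], [c, d]].
    Returns (sample odds ratio, p-value)."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(k):
        # Hypergeometric probability of k in the top-left cell.
        return comb(row1, k) * comb(row2, col1 - k) / denom

    lo, hi = max(0, col1 - row2), min(row1, col1)
    p_obs = prob(a)
    # Sum the probabilities of all tables no more likely than the observed one.
    p_value = sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-9))
    odds_ratio = (a * d) / (b * c) if b * c else float("inf")
    return odds_ratio, min(1.0, p_value)

# Hypothetical counts: rows = native / non-native, columns = down / up.
odds_ratio, p_value = fisher_exact(30, 10, 12, 28)
print(odds_ratio, p_value)
```

An odds ratio above 1 here means that genes changing most in their native tissue tend to be downregulated, while those changing most in a non-native tissue tend to be upregulated.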

Figure 4. The loss of tissue-specific expression during ageing and functional enrichment of divergence-convergence (DiCo) genes.

(a) Mosaic plot showing the association between maximal expression change in native vs. non-native tissues (x-axis) vs. down- (cyan) or upregulation (pink) during ageing across all tissue-specific genes (n = 3766). The highly significant odds ratio indicates that genes native to a tissue tend to be downregulated during ageing in that native tissue if they show maximal expression change during ageing in that tissue. Conversely, if they show maximal expression change during ageing in non-native tissue, those genes are upregulated during ageing. Consequently, tissue-specific expression patterns established during development will tend to be lost during ageing. (b) The same as (a) but using only the tissue-specific genes that show the DiCo pattern (n = 1287). (c) Summary of the association tests for ‘direction of maximal expression change in native vs. non-native tissues’ across all datasets analysed. The y-axis shows log2-transformed odds ratio (OR) for each dataset (x-axis) – Schaum4: using the same four tissues as our dataset. Schaum8: using eight tissues. GTEx4: using the same four tissues as our dataset. GTEx10: using 10 tissues. ***False discovery rate (FDR)-corrected p-value<10⁻⁸⁷. p-Values are given in Table 2. The four groups are annotated as GR1–4 and gene expression changes for each group in our dataset are exemplified in (d–g). (h) Trends of expression change with age of genes (x-axis) in categories enriched in DiCo (Gene Set Enrichment Analysis [GSEA]). Enriched categories (n = 184) are summarised into representatives (y-axis) using hierarchical clustering and Jaccard similarities (Materials and methods). Categories are ordered by the number of genes they contain from highest (bottom, n = 290) to lowest (top, n = 26). The most distant cluster with low within-cluster similarity in the hierarchical clustering (other Gene Ontology [GO]) was clustered separately and given in Figure 4—figure supplement 1.

Figure 4—source data 1. Gene Set Enrichment Analysis (GSEA) result of divergence-convergence (DiCo) genes, DiCo enrichment with tissue-specific expression loss, age-related expression change correlations, and convergence overlaps among datasets.

Figure 4—figure supplement 1. Age-related expression change trends in divergence-convergence (DiCo)-enriched categories denoted as ‘Other GO’ in the first clustering.

Age-related expression change trends of genes (x-axis) in categories enriched in DiCo (Gene Set Enrichment Analysis [GSEA]) that were grouped into one cluster ‘Other GO’ in Figure 4h. These categories (n = 69) were again summarised into representatives (y-axis) using hierarchical clustering and Jaccard similarities (see Materials and methods). Categories are ordered by the number of genes they contain from highest (bottom, n = 97) to lowest (top, n = 21). One cluster containing unrelated categories (n = 17) was again denoted as ‘Other GO’.
Figure 4—figure supplement 2. Comparison of datasets.

(a) Heatmap using Spearman’s correlation coefficients among expression trajectories (Spearman’s correlation coefficients between expression and age) across datasets during ageing. As the pairwise tissue correlations range between –0.2 and 0.52, the colour palette was restricted to –0.52 to 0.52 range. The same tissues of our dataset and Jonker dataset were clustered together (cortex, lung, liver) in the lower-right corner. (b) Enrichment of convergent genes among datasets during ageing. GTEx10 and GTEx4: coefficient of variation (CoV) calculation was performed with 10 tissues and with the same 4 tissues as our dataset in GTEx. Schaum8 and Schaum4: CoV calculation was performed with eight tissues and with the same four tissues as our dataset in Schaum dataset. ***False discovery rate (FDR)-corrected p-value<0.001; **FDR-corrected p-value<0.01; *FDR-corrected p-value<0.1. All log2(OR) values were positive except for our data vs. GTEx10 (log2(OR) = –0.04) and Jonker vs. Schaum8 (log2(OR) = –0.06), both of which were non-significant.

We then asked whether genes displaying the DiCo pattern may be related to specific functional pathways or share specific regulators. Using GO, we searched for functional enrichment among convergent genes during ageing using developmentally divergent genes as the background (Materials and methods). We found enrichment for 184 GO Biological Process (BP) categories for the DiCo pattern (KS test, FDR-corrected p-value<0.1, Figure 4—source data 1) and summarised enriched categories by clustering them based on the number of genes they share. We then studied the trends of gene expression changes with age (without a significance cutoff) in each representative category for each tissue (Materials and methods) (Figure 4h; detailed clustering for the categories in ‘Other GO’ is provided in Figure 4—figure supplement 1). On average, energy metabolism, mitochondria, and tissue function-related categories, as well as immune response-related categories, exhibit DiCo-type expression changes over time and across tissues, where temporal changes in different tissues occur in opposite directions. Notably, for the majority of representative GO categories, the lung had the most distinct expression patterns in both periods (Figure 4h, Figure 4—figure supplement 1).
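
Enriched categories were summarised by their shared genes, using Jaccard similarities as input to hierarchical clustering (Figure 4 caption). The similarity measure itself can be sketched as below. The category names and gene identifiers are placeholders, and the clustering step is not shown.

```python
from itertools import combinations

def jaccard(a, b):
    """Jaccard similarity: shared genes divided by genes in either set."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0

# Hypothetical GO categories with placeholder gene identifiers.
categories = {
    "category_A": {"g1", "g2", "g3"},
    "category_B": {"g2", "g3", "g4"},
    "category_C": {"g7", "g8"},
}
similarity = {
    (x, y): jaccard(categories[x], categories[y])
    for x, y in combinations(categories, 2)
}
print(similarity)
```

Categories with high similarity would be merged under one representative, while categories sharing no genes remain separate.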

In contrast to the functional enrichment results, we did not find any specific regulators (miRNAs or transcription factors [TFs]) associated with DiCo using the same background as above (235 tests for miRNAs and 158 tests for TFs; FDR-corrected p-value>0.1 for both sets of tests) (Materials and methods). This suggests that the DiCo pattern may not be driven by a limited number of specific regulators, but may instead be a transcriptome-wide phenomenon.

Additional mouse and human datasets confirm the association between loss of tissue specificity and inter-tissue convergence during ageing

We investigated inter-tissue convergence during ageing in three additional datasets where multiple tissue samples were available for the same individuals (Table 1). We conducted the analysis using a subset of the same four tissues in our dataset and also larger sets when additional samples were available. Age-related expression changes showed small to moderate correlations among all datasets analysed, with our dataset being most similar to the mouse dataset from Jonker et al., while the GTEx human dataset was the most distinct (Figure 4—figure supplement 2a).

Table 1. Dataset characteristics summarising species, tissues, number of individuals, age range, sex, and platform used for measuring gene expression values.

Dataset | Species | Tissues | N | Age range | Sex | Method
Izgi et al., four tissues | Mice | Brain, lung, liver, muscle | 8 | 3–30 months | Male | RNA-seq
Jonker et al., five tissues | Mice | Brain, lung, liver, kidney, spleen | 18 | 3–30 months | Female | Microarray
Schaum et al., four tissues | Mice | Brain, lung, liver, muscle | 37 | 3–27 months | Male (n = 26), Female (n = 11) | RNA-seq
Schaum et al., eight tissues | Mice | Brain, lung, liver, muscle, subcutaneous fat, kidney, heart, spleen | 26 | 3–27 months | Male (n = 20), Female (n = 6) | RNA-seq
GTEx, four tissues | Humans | Brain, lung, liver, muscle | 47 | 20–75 years | Male (n = 36), Female (n = 11) | RNA-seq
GTEx, 10 tissues | Humans | Adipose, tibial artery, cerebellum, lung, skeletal muscle, tibial nerve, pituitary, sun-exposed skin, thyroid, whole blood | 35 | 20–75 years | Male (n = 27), Female (n = 8) | RNA-seq

First, using the Jonker et al. dataset (Jonker et al., 2013) comprising five tissues (Table 1), we observed transcriptome-wide convergence during ageing with a significant decline in mean Euclidean distance between PCs (ρ = –0.57, p=0.014, Figure 2—figure supplement 7a–c) and a significant decrease in mean CoV during ageing (ρ = –0.48, p=0.044, Figure 2—figure supplement 7d). Moreover, we found that 7/10 tissue pairs showed increased pairwise tissue correlations during ageing, although none of them was significant after multiple testing correction (Figure 2—figure supplement 7f). Among the genes with a significant change in CoV, 66% were convergent, comparable to the 68% convergence among significant changes in our dataset. We also tested the association between the loss of identity and the convergence pattern by repeating the same analysis as in Figure 4b with the Jonker et al. dataset, using only the genes convergent in ageing because this dataset lacks the developmental period. We again found a strong association, consistent with convergent genes losing expression in their native tissue and gaining expression in other tissues during ageing (OR = 7.52, p<10⁻¹⁶, Figure 4c). The results are summarised in Table 2.
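
The mean-CoV measure of convergence used above can be illustrated with a minimal sketch. It assumes that CoV is the sample standard deviation across tissues divided by the mean, computed per gene and per individual, and that the per-individual mean CoV is then correlated with age using Spearman's ρ. The data, gene names, and helper functions (`rank`, `spearman`, `cov`) are hypothetical and are not taken from the study's code.

```python
from statistics import mean, stdev

def rank(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman's rho computed as the Pearson correlation of the ranks."""
    rx, ry = rank(x), rank(y)
    mx, my = mean(rx), mean(ry)
    num = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    den = (sum((a - mx) ** 2 for a in rx) * sum((b - my) ** 2 for b in ry)) ** 0.5
    return num / den

def cov(levels):
    """Coefficient of variation across tissues (assumed: sample SD / mean)."""
    return stdev(levels) / mean(levels)

# Hypothetical toy data: expression[gene][i] holds individual i's levels in four tissues.
ages_days = [93, 200, 400, 600, 900]
expression = {
    "geneA": [[2, 6, 10, 14], [3, 6, 10, 13], [4, 7, 9, 12], [5, 7, 9, 11], [6, 7, 9, 10]],
    "geneB": [[5, 5, 6, 6]] * 5,
}

# Mean CoV over genes for each individual, then its correlation with age.
mean_cov = [mean(cov(expression[g][i]) for g in expression) for i in range(len(ages_days))]
rho = spearman(ages_days, mean_cov)  # a negative rho indicates inter-tissue convergence
```

In this toy example the tissues' expression levels for `geneA` approach each other with age, so the mean CoV declines and ρ is negative.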

Table 2. Summary of results for all datasets analysed.

First column shows the names of datasets analysed. Numbers in parentheses show the sample sizes. ‘Among all genes’ column refers to the analyses performed using all genes relevant to those analyses (subcolumns) without a significance cutoff. ‘Within significant CoV changes’: genes show significant CoV change with age with FDR-corrected p-value<0.1. In the ‘DiCo vs. tissue specificity (Di- as background)’ column, divergent genes in development (Di-) were chosen as background. ‘Co vs. expression change in native tissue association (Figure 4b)’ column refers to the analysis performed in Figure 4b for each dataset, and the results are presented in Figure 4c. The association tests were performed among convergent genes in ageing except in our dataset, which was performed with DiCo genes. Significant test results are indicated with italic fonts. Bold fonts show the results that support convergence or tissue-specific expression loss in ageing whether as a significant result or as a trend. Unsupportive test results and inapplicable tests are written in normal font.

Among all genes Within significant CoV changes
Dataset | PCA change in Euclidean distance | Mean CoV change | Median CoV change | Pairwise tissue correlations | DiCo vs. tissue specificity (Di- as background) | Co vs. expression change in native tissue association (Figure 4b) | Co vs. Di proportions
Izgi 2022 | ρ = −0.87, p=0.0026 | ρ = −0.5, p=0.2 | ρ = −0.48, p=0.23 | 4/6 positive, none significant* | OR = 1.56, p=1.3 × 10⁻¹⁸ | OR = 74.81, p=5.9 × 10⁻²⁰³ (among 1287 DiCo genes) | 68% convergence (among 62 significant genes*)
Jonker 2013, five tissues, two different than ours (n = 18) | ρ = −0.57, p=0.014 | ρ = −0.48, p=0.044 | ρ = −0.03, p=0.91 | 7/10 positive, none significant* | Di- background missing | OR = 7.52, p=6.5 × 10⁻¹⁰⁹ (among 2967 convergent genes) | 66% convergence (among 1735 significant genes*)
Schaum 2020, same four tissues (n = 37) | ρ = 0.13, p=0.46 | ρ = 0.25, p=0.14 | ρ = 0.13, p=0.43 | 4/6 positive, two significant* | OR = 1.33, p=1.07 × 10⁻⁸ | OR = 58.03, p=1.5 × 10⁻¹⁹⁷ (among 2124 convergent genes) | 53% convergence (among 319 significant genes*)
Schaum 2020, eight tissues (n = 26) | ρ = 0.1, p=0.62 | ρ = 0.16, p=0.43 | ρ = 0.04, p=0.86 | 16/28 positive, five significant* | Di- background missing | OR = 84.2, p=9.7 × 10⁻⁹⁶ (among 2380 convergent genes) | 54% convergence (among 244 significant genes*)
GTEx, same four tissues | ρ = −0.23, p=0.12 | ρ = −0.12, p=0.42 | ρ = −0.18, p=0.23 | 5/6 positive, none significant* | Di- background missing | OR = 7.21, p=7 × 10⁻⁸⁷ (among 2407 convergent genes) | (no significant CoV changes)
GTEx, 10 tissues | ρ = −0.26, p=0.13 | ρ = −0.14, p=0.44 | ρ = −0.3, p=0.08 | 29/45 positive, none significant* | Di- background missing | OR = 13.01, p=5.7 × 10⁻¹¹⁴ (among 2195 convergent genes) | (all three significant genes were convergent)

ρ = Spearman’s correlation coefficient; OR = odds ratio; FDR = false discovery rate; CoV = coefficient of variation; DiCo = divergence-convergence; PCA = principal components analysis.

* FDR-corrected p-value<0.1.

Next, we used another mouse dataset by Schaum et al., 2020 (Table 1). Repeating the analysis on the same four tissues and also a larger set of eight tissues, we did not find support for transcriptome-wide convergence (Table 2, Figure 2—figure supplements 17 and 19). Only 4/6 tissue pairs in the 4-tissue comparison and 16/28 tissue pairs in the 8-tissue comparison showed positive correlations, that is, a trend consistent with inter-tissue convergence during ageing (Figure 2—figure supplements 18c and 20c). Interestingly, 75% of the negative correlations involved muscle and subcutaneous fat. Convergence ratios among genes showing a significant change in CoV (FDR-corrected p-value<0.1) were marginally above 50%. Although we did not observe widespread convergence during ageing in this dataset, we still detected strong associations between convergence in ageing and tissue specificity (OR4-tissue = 1.33, p=1.08 × 10⁻⁸) and identity loss (OR4-tissue = 58.3, p<10⁻¹⁶; OR8-tissue = 84.2, p<10⁻¹⁶) (Figure 4c).
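
The odds ratios reported for these association tests can be read through a minimal sketch of a 2×2 contingency analysis. The layout of the table below (native vs. other tissue in rows, decrease vs. increase of expression with age in columns), the counts, and the function names are illustrative assumptions only; the exact table construction follows Figure 4b and the Materials and methods, and is not reproduced here.

```python
from math import comb, log2

def odds_ratio(a, b, c, d):
    """Odds ratio of the 2x2 table [[a, b], [c, d]]."""
    return (a * d) / (b * c)

def fisher_one_sided(a, b, c, d):
    """One-sided Fisher's exact test: P(X >= a) with all margins fixed."""
    row1, col1, n = a + b, a + c, a + b + c + d
    upper = min(row1, col1)
    favourable = sum(comb(row1, k) * comb(n - row1, col1 - k) for k in range(a, upper + 1))
    return favourable / comb(n, col1)

# Hypothetical counts of convergent genes:
#                  decreases with age   increases with age
# native tissue            a = 30               b = 10
# other tissue             c = 20               d = 40
a, b, c, d = 30, 10, 20, 40
oratio = odds_ratio(a, b, c, d)
log2_or = log2(oratio)
p_value = fisher_one_sided(a, b, c, d)
```

An odds ratio above 1 in this layout would mean that decreases with age are over-represented in the native tissue, which is the direction of the association described above.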

Lastly, we used the GTEx dataset to investigate inter-tissue convergence during ageing in humans. Calculating the change in mean Euclidean distance based on PCA and mean CoV values, we found a non-significant tendency towards convergence across the whole transcriptome in the same 4 tissues and a larger set of 10 tissues (Table 2, Figure 2—figure supplements 8 and 10). We also performed the four-tissue comparison with female and male individuals separately and observed relatively strong inter-tissue convergence among ageing females (ρfemale = –0.58, pfemale = 0.059) but weaker convergence among males (ρmale = –0.052, pmale = 0.77), for whom individuals in the youngest and oldest age groups are lacking (Figure 2—figure supplement 16). Moreover, 5/6 and 29/45 tissue pairs showed increased correlation with age in the 4-tissue and 10-tissue comparisons, respectively, consistent with inter-tissue convergence during ageing (Figure 2—figure supplements 9 and 11). Notably, 8 of the 16 negative correlations in the 10-tissue comparison involved the skin tissue (Figure 2—figure supplement 11c). We also studied significant changes in CoV per gene, but found no significant genes in the 4-tissue comparison and only three genes in the 10-tissue comparison, all of which were convergent. Finally, we tested the association between the loss of expression in native tissue and gain in other tissues during ageing among convergent genes, confirming the association with the loss of tissue identity (Figure 4c, Table 2).
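
The pairwise tissue correlation analysis referred to above (e.g. "5/6 tissue pairs") can be sketched as follows, assuming that for each individual the expression profiles of two tissues are correlated across genes, and that these per-individual similarities are then correlated with age. The data and helper names are hypothetical; with four tissues the loop would produce six pairs, while the toy example uses two tissues for brevity.

```python
from itertools import combinations
from statistics import mean

def rank(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman's rho computed as the Pearson correlation of the ranks."""
    rx, ry = rank(x), rank(y)
    mx, my = mean(rx), mean(ry)
    num = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    den = (sum((a - mx) ** 2 for a in rx) * sum((b - my) ** 2 for b in ry)) ** 0.5
    return num / den

# Hypothetical toy data: profiles[tissue][i] is individual i's expression over five genes.
ages = [93, 300, 600, 900]
profiles = {
    "lung": [[1, 2, 3, 4, 5]] * 4,
    "liver": [[5, 4, 3, 2, 1], [3, 5, 4, 1, 2], [2, 1, 3, 5, 4], [1, 2, 3, 4, 5]],
}

trend = {}
for t1, t2 in combinations(sorted(profiles), 2):
    # Similarity of the two tissues within each individual.
    similarity = [spearman(p1, p2) for p1, p2 in zip(profiles[t1], profiles[t2])]
    # Positive values mean the tissue pair becomes more similar with age.
    trend[(t1, t2)] = spearman(ages, similarity)

n_positive = sum(r > 0 for r in trend.values())
```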

Overall, analysis of these three additional datasets indicates that inter-tissue convergence during ageing is commonly, but not always, observed at the transcriptome-wide level in mice and in humans. Notably, the transcriptome-wide trend was weak in the Jonker et al. and GTEx datasets and not evident in the Schaum et al. dataset. The association between the loss of identity and convergence, on the other hand, was strong across all datasets (Table 2).

We further asked whether convergent gene sets identified in different datasets overlap. 11 of 15 comparisons were significant, but the effect sizes (ESs) were small (Figure 4—figure supplement 2b). We reason that the low overlap across datasets might reflect that transcriptome-wide convergence was weak and that we lack developmental samples for the external datasets, that is, we can only compare convergence during ageing but not the DiCo pattern. Notably, only 62% of the genes convergent in ageing are divergent during development in our dataset, so low overlap between convergent gene sets does not rule out overlap across DiCo genes.

These results suggest that inter-tissue convergence in ageing may be a weak but widespread phenomenon and associated with the loss of tissue identity. Overall, while mouse and human tissues display divergence in development (Figures 1a and 2a, Cardoso-Moreira et al., 2019), this appears to be followed by a trend towards inter-tissue convergence in ageing (Figure 2a, Figure 2—figure supplements 1–20) and could be linked to loss of tissue identity.

Changes in cellular composition and cell-autonomous expression can both explain the DiCo pattern

Ageing-related transcriptome changes observed using bulk tissue samples may be explained by temporal changes in cell-type proportions within tissues, by cell-autonomous expression changes, or both. To explore whether the observed inter-tissue DiCo patterns may be attributed to changes in cell-type proportions, we used published data from a mouse single-cell RNA-sequencing experiment (Tabula Muris Consortium, 2020). For each of the four tissues in our original experiment, we collected cell-type-specific expression profiles from 3-month-old young adult mice in the Tabula Muris Senis dataset. We deconvoluted bulk tissue expression profiles in our mouse dataset using the corresponding tissue’s cell-type-specific expression profiles by regression analysis (Materials and methods) and studied the relative contributions of each cell type to tissue transcriptomes and how these change with age. The analysis was performed with three gene sets; all genes (n = [12,492, 12,849]), DiCo (n = [4007, 4106]), and non-DiCo genes (n = [8485, 8743]). Studying these deconvolution patterns, we observed a weak but consistent trend involving the most common cell types in different tissues. For instance, analysing DiCo genes in the liver and lung, we found that the most common cell type’s contribution (hepatocyte in the liver, and bronchial smooth muscle cell in the lung) tends to increase during development (Spearman’s correlation coefficient ρliver = 0.95, ρlung = 0.81, nominal p<0.05). This contribution then decreases during ageing (ρliver = –0.77, ρlung = –0.86, nominal p<0.05) (Figure 5a, Figure 5—figure supplement 1). This pattern was also observed in muscle and cortex, albeit not significantly (Figure 5a, Figure 5—figure supplement 1). These changes most likely reflect shifts in cellular composition, some of which were demonstrated directly in mice using in situ RNA staining (Schaum et al., 2020). 
Repeating the analysis with non-DiCo genes resulted in highly similar patterns considering the most common cell types in tissues, except in muscle ageing in which the age-related decrease was significantly higher with DiCo genes than the non-DiCo genes (permutation test with resampling all genes, pskeletal-muscle-satellite-cell = 0.04) (Figure 5a, Figure 5—figure supplement 1, Figure 5—figure supplements 2–5). These results indicate that the observed cellular composition changes may partly explain DiCo, although the influence of composition changes is not exclusive to genes displaying the DiCo pattern.
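
The regression-based deconvolution described above can be sketched as follows. This is a simplified illustration, not the study's implementation: it assumes ordinary least squares without an intercept or non-negativity constraint, in which a bulk profile is regressed on cell-type-specific profiles and the coefficients are scaled to sum to one to give relative contributions. The signatures, the bulk profile, and the function names are hypothetical.

```python
def solve(matrix, rhs):
    """Solve a small square linear system by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    m = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def deconvolve(bulk, signatures):
    """Least-squares coefficients of bulk on cell-type signatures, scaled to sum to 1."""
    k = len(signatures)
    # Normal equations: (X^T X) beta = X^T y, with one column of X per cell type.
    xtx = [[sum(a * b for a, b in zip(signatures[i], signatures[j])) for j in range(k)]
           for i in range(k)]
    xty = [sum(a * y for a, y in zip(signatures[i], bulk)) for i in range(k)]
    coefs = solve(xtx, xty)
    total = sum(coefs)
    return [c / total for c in coefs]

# Hypothetical signatures of two cell types over four genes, and a bulk profile
# built as a 75%/25% mixture of them.
cell_type_a = [10.0, 0.0, 5.0, 1.0]
cell_type_b = [0.0, 8.0, 5.0, 2.0]
bulk_profile = [0.75 * a + 0.25 * b for a, b in zip(cell_type_a, cell_type_b)]

contributions = deconvolve(bulk_profile, [cell_type_a, cell_type_b])
```

Repeating such a fit for each bulk sample and correlating a cell type's relative contribution with age would give trends of the kind reported for hepatocytes and bronchial smooth muscle cells.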

Figure 5. Contribution of tissue composition and cell-autonomous changes to the divergence-convergence (DiCo) pattern.

(a) Deconvolution analysis of our mouse dataset with the 3-month-old scRNA-seq data (Tabula Muris Senis) using DiCo (n = [4007, 4106]) and non-DiCo (n = [8485, 8743]) genes. Only the cell types with the highest relative contributions to each tissue bulk transcriptome are shown (cell-type names are given within each plot). Contributions of all cell types to bulk tissue transcriptomes are shown in Figure 5—figure supplement 1. (b) Distribution of correlations for minimally (left) and maximally (right) correlated cell-type pairs among tissues (n = 54 pairs). For each cell type of a given tissue, one minimally (or maximally) correlated cell type is chosen from other tissues among the 3-month age group of the Tabula Muris Senis dataset (density plots with solid line edges). Dashed lines show the correlation distributions in 24-month age of minimally or maximally correlated cell-type pairs identified in the 3-month age group. Bottom panel shows age-related expression similarity (ρ) changes of minimally (left) and maximally (right) correlated cell-type pairs. The correlation between age and tissue similarity (expression correlations) was calculated for each pair of cell types identified in the 3-month age group. All pairwise cell-type correlations and their age-related changes are given in Figure 5—source data 1.

Figure 5—source data 1. Cell-type proportion estimation and cell-autonomous changes using the Tabula Muris Senis dataset.

Figure 5—figure supplement 1. Age-related changes in cell-type proportions calculated using divergence-convergence (DiCo) and non-DiCo genes.

Deconvolution of bulk tissue expression profiles of the mice in our dataset with regression analysis using the single-cell expression profiles of the 3-month-old mice in the Tabula Muris Senis dataset for cortex (a), liver (b), lung (c), and muscle (d). The contribution of each cell type was measured using three gene sets: all genes (n = [12,492, 12,849]), DiCo (n = [4007, 4106]), and non-DiCo genes (n = [8485, 8743]). Age-related changes of the relative contribution of each cell type in each tissue are given in Figure 5—source data 1.
Figure 5—figure supplement 2. Permutation-based comparison between divergence-convergence (DiCo) and non-DiCo-related cell-type proportion changes with age in the cortex.

The difference between DiCo (4106) and non-DiCo (8743)-related cell-type proportion changes with age was tested in the cortex tissue. The x-axis is the Spearman’s correlation coefficient between age and relative contribution of a given cell type. The red vertical lines show the cell-type proportion changes calculated with DiCo genes (observed value), and the blue vertical lines indicate the same but with non-DiCo genes. Overlapping DiCo and non-DiCo values are indicated with blue. Null distributions for non-DiCo genes (density plots) were created with resampling among all genes (n = 12,849) (Materials and methods). Significant results are represented with yellow density plots, and the nominal p-values for permutation tests are indicated on the left side of the density plots. Permutation test results are also provided in Figure 5—source data 1.
Figure 5—figure supplement 3. Permutation-based comparison between divergence-convergence (DiCo) and non-DiCo-related cell-type proportion changes with age in the liver.

The difference between DiCo (4007) and non-DiCo (8485)-related cell-type proportion changes with age was tested in the liver tissue. The x-axis is the Spearman’s correlation coefficient between age and relative contribution of a given cell type. The red vertical lines show the cell-type proportion changes calculated with DiCo genes, and the blue vertical lines indicate the same but with non-DiCo genes. Overlapping DiCo and non-DiCo values are indicated with blue. Null distributions for non-DiCo genes (density plots) were created with resampling among all genes (n = 12,492) (see Materials and methods). Significant results are represented with yellow density plots, and the nominal p-values for permutation tests are indicated on the left side of the density plots. Permutation test results are provided in Figure 5—source data 1.
Figure 5—figure supplement 4. Permutation-based comparison between divergence-convergence (DiCo) and non-DiCo-related cell-type proportion changes with age in the lung.

The difference between DiCo (4084) and non-DiCo (8670)-related cell-type proportion changes with age was tested in the lung tissue. The x-axis is the Spearman’s correlation coefficient between age and relative contribution of a given cell type. The red vertical lines show the cell-type proportion changes calculated with DiCo genes, and the blue vertical lines indicate the same but with non-DiCo genes. Overlapping DiCo and non-DiCo values are indicated with blue. Null distributions for non-DiCo genes (density plots) were created with resampling among all genes (n = 12,754) (see Materials and methods). Significant results are represented with yellow density plots, and the nominal p-values for permutation tests are indicated on the left side of the density plots. Permutation test results are provided in Figure 5—source data 1.
Figure 5—figure supplement 5. Permutation-based comparison between divergence-convergence (DiCo) and non-DiCo-related cell-type proportion changes with age in the muscle.

The difference between DiCo (4055) and non-DiCo (8568)-related cell-type proportion changes with age was tested in the muscle tissue. The x-axis is the Spearman’s correlation coefficient between age and relative contribution of a given cell type. The red vertical lines show the cell-type proportion changes calculated with DiCo genes, and the blue vertical lines indicate the same but with non-DiCo genes. Overlapping DiCo and non-DiCo values are indicated with blue. Null distributions for non-DiCo genes (density plots) were created with resampling among all genes (n = 12,623) (see Materials and methods). Significant results are represented with yellow density plots, and the nominal p-values for permutation tests are indicated on the left side of the density plots. Permutation test results are provided in Figure 5—source data 1.
Figure 5—figure supplement 6. Intra-tissue coefficient of variation (CoV) changes between cell types using the Tabula Muris Senis dataset.

Intra-tissue CoV: CoV is calculated among cell types within each tissue for each individual mouse and in three age groups. Y-axis shows the mean CoV value of genes for each individual. The horizontal line on each age group shows the median of points. Cell types found in at least two individuals at every time point were considered.

Next, we investigated the possible role of cell-autonomous changes in the DiCo pattern. Cell-autonomous changes could contribute to inter-tissue convergence during ageing in two ways. First, expression profiles of similar cell types shared across different tissues, such as immune cells, might converge with age. Another possible scenario, consistent with the notion of age-related cellular identity loss, is that the expression profiles of unrelated cell types, such as tissue-specific cell types in different tissues, converge with age. To test these scenarios, we first ordered the pairwise correlations between cell types in different tissues at the 3-month age group to determine the most similar and dissimilar cell types across tissues (Materials and methods). Then, we studied how these similarities (i.e. pairwise correlations) change with age (Figure 5b). Intriguingly, we found that pairs of similar cell types (i.e. those with the highest correlations) among tissues tend to become less similar with age (36/54 [67%] of pairwise comparisons, Figure 5—source data 1). On the contrary, the most distinct cell types (i.e. those with the lowest correlations) among tissues become more similar with age (45/54 [83%], Figure 5—source data 1). Repeating the analysis considering DiCo genes only yielded a similar trend (30/54 [56%] decrease in correlation among the most similar cell types, permutation test with resampling non-DiCo genes, p>0.1; and 47/54 [87%] increase in correlation among the most distinct cell types, permutation test, p>0.1). These trends are consistent with age-related cellular identity loss, and they suggest that cell-autonomous changes may also contribute to inter-tissue convergence during ageing, although further data and analyses would be needed to fully establish their validity.
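
The selection of minimally and maximally correlated cell-type pairs can be sketched as follows. This simplified illustration assumes that pairwise correlations between cell types have already been computed for the 3-month and 24-month groups, and it compares only these two ages rather than correlating similarity with age across all time points. Cell-type labels, correlation values, and the `extreme_partner` helper are hypothetical.

```python
def extreme_partner(cell, tissue_of, corr, pick):
    """Return the cell type from another tissue with the min or max correlation to `cell`."""
    others = [c for c in tissue_of if tissue_of[c] != tissue_of[cell]]
    return pick(others, key=lambda c: corr[frozenset((cell, c))])

# Hypothetical cell types and their tissues of origin.
tissue_of = {"liver_A": "liver", "liver_B": "liver", "cortex_A": "cortex", "cortex_B": "cortex"}

# Hypothetical cross-tissue expression correlations at 3 and 24 months.
corr_3m = {
    frozenset(("liver_A", "cortex_A")): 0.20,
    frozenset(("liver_A", "cortex_B")): 0.40,
    frozenset(("liver_B", "cortex_A")): 0.30,
    frozenset(("liver_B", "cortex_B")): 0.80,
}
corr_24m = {
    frozenset(("liver_A", "cortex_A")): 0.30,
    frozenset(("liver_A", "cortex_B")): 0.45,
    frozenset(("liver_B", "cortex_A")): 0.35,
    frozenset(("liver_B", "cortex_B")): 0.70,
}

min_pairs_more_similar = 0  # most distinct pairs that gain similarity with age
max_pairs_less_similar = 0  # most similar pairs that lose similarity with age
for cell in tissue_of:
    # Partners are chosen in the 3-month group and then followed to 24 months.
    least = frozenset((cell, extreme_partner(cell, tissue_of, corr_3m, min)))
    most = frozenset((cell, extreme_partner(cell, tissue_of, corr_3m, max)))
    min_pairs_more_similar += corr_24m[least] > corr_3m[least]
    max_pairs_less_similar += corr_24m[most] < corr_3m[most]
```

Because each cell type receives its own partner, the same pair can be counted from both sides, which is why the number of comparisons equals the number of cell types.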

Finally, we tested the possibility of intra-tissue convergence of cell types in the Tabula Muris Senis dataset by calculating expression variation among cell types using the CoV measure for each individual. However, we did not observe a consistent trend of increasing similarity among cell types within tissues from 3-month-old to 24-month-old mice (Figure 5—figure supplement 6).

Discussion

Our findings confirm a number of ageing-associated phenomena identified earlier, while also revealing new patterns. First, we report parallel age-related expression changes among the four tissues studied, during development, as well as in ageing. The inter-tissue correlation distributions were modest and also comparable between development and ageing (Figure 1c). This last point may appear surprising at first glance, given the stochastic nature of ageing relative to development (Bahar et al., 2006; Martinez-Jimenez et al., 2017; Angelidis et al., 2019; Somel et al., 2006; Feser et al., 2010; Kim et al., 1996; Enge et al., 2017), and also given earlier observations that developmental expression changes tend to be evolutionarily conserved, while ageing-related changes much less so (Zahn et al., 2007; Somel et al., 2010). At the same time, when we consider that tissues diverge during development, and also that ageing is characterised by parallel expression changes among tissues related to damage response, inflammation, and reduced energy metabolism (Zahn et al., 2007; Yang et al., 2015), similar magnitudes of correlations during development and ageing may be expected.

Second, we verify the generality of the reversal pattern, that is, up-down or down-up expression change patterns across the lifetime, among distinct mouse tissues that include both highly mitotic (lung and liver) and less mitotic ones (skeletal muscle and cortex). Consistent with earlier observations in fewer tissues (Anisimova et al., 2020; Dönertaş et al., 2017), we find that about half the expressed genes display reversal in all cases studied. Importantly, expression reversal is not ubiquitous across all genes and our findings do not necessarily contradict the hyperfunction theory. Instead, we suggest that reversal is a common phenomenon that influences a notable fraction of the transcriptome and is a likely contributor to mammalian ageing.

Two observations here are notable. One is that reversal-displaying genes, especially those displaying the up-down pattern in each tissue, can be associated with tissue-specialisation-related pathways (e.g. morphogenesis) and tissue-specific functions (e.g. synaptic activity). The second observation is the lack of significant overlap of reversal genes across tissues. We thus hypothesised that reversals might reflect tissue specialisation during development (hence the lack of overlap among tissues) and loss of specialisation during ageing. These processes could manifest themselves as inter-tissue divergence and convergence patterns over the lifetime. We indeed observed that the up-down reversal pattern is enriched in tissue-specific genes, except in the liver. Studying inter-tissue similarity across the mouse lifespan, we found that the four tissues’ transcriptomes diverged during postnatal development, and we detected a trend towards inter-tissue convergence during ageing. We then investigated this phenomenon through different approaches: (1) by studying overall trends using PCA, (2) by analysing transcriptome-wide trends of inter-tissue CoV without considering gene-wise significance cutoffs, (3) by focusing on genes with significant age-related changes in inter-tissue CoV, (4) by studying age-related changes in pairwise tissue correlations, (5) by analysing different cell types using scRNA-seq data, and (6) by repeating the same analysis using independent mouse and human ageing datasets. The patterns we found were mostly consistent with inter-tissue convergence, but the majority of transcriptome-wide results were associated with low ESs, and some were not statistically significant. Importantly, all significant results suggested convergence during ageing. We therefore conclude that (1) developmental inter-tissue divergence does not continue into ageing and (2) convergence during ageing may be common although possibly not ubiquitous.

The weakness of the inter-tissue convergence signal per dataset and the limited overlap between convergent gene sets among datasets could have multiple reasons. These include the low signal-to-noise ratios characterising ageing-related expression patterns, the lack of old-age individuals in our mouse dataset (>3-year-old mice) and the GTEx dataset (>90-year-old humans), limited overlap of tissues between our mouse dataset (cortex, liver, lung, and muscle) and the Jonker et al. dataset (cortex, liver, lung, spleen, kidney), as well as differences in ageing patterns between species or between sexes. Further research involving larger sample sizes and diverse species is needed to confirm the generalisability of the observations.

Finally, we report a number of interesting observations on DiCo. First, we determine that tissue-specific genes tend to be downregulated in the tissues that they belong to during ageing, while non-tissue-specific genes are upregulated, which was confirmed by all external datasets (Figure 4c). Second, using deconvolution, we infer that cell types most common in a tissue (e.g. hepatocytes in the liver) tend to increase in frequency during development, but then decrease in frequency during ageing, as also shown recently using immunohistochemistry in a number of mouse tissues (Schaum et al., 2020). Accordingly, the DiCo phenomenon may at least partly be explained by shifts in cellular composition. This is intriguing as both highly mitotic and low mitotic tissues share this trend, indicating that an explanation based on stem cell exhaustion may not be applicable here. Third, we find increased expression similarity between distinct cell types in different tissues during ageing, but decreased similarity between similar cell types. Cell-autonomous expression changes, therefore, likely also contribute to the DiCo phenomenon. We note that higher expression variability among cells at old age (Hernando-Herraez et al., 2019; Enge et al., 2017) could also lead to inter-tissue convergence during ageing. A fourth interesting observation was the absence of significant enrichment for specific transcription factor or microRNA targets among DiCo genes. This result may not be surprising if inter-tissue convergence is mostly driven by stochastic damage accumulation, such as loss of epigenetic marks. It is also possible that, instead of specific regulators, their interaction and cooperativity are associated with DiCo. Future experimental studies could test both the mechanistic aspects and the functional link to tissue specificity.

We also note two major limitations of our study. One is related to the fact that our dataset represents bulk tissue samples, which may suffer from infiltration of foreign cell types into tissues. Indeed, one of the external datasets, Schaum et al., included samples from perfused mice (Schaum et al., 2020), and in this dataset we did not find support for transcriptome-wide convergence during ageing, even though the association between tissue identity loss and convergence was also evident. The scRNA-seq dataset we analysed further suggested that DiCo is associated with tissue-specific genes and not immune- or blood-related categories, but we still cannot rule out possible infiltration artefacts that may affect our results. A second limitation is related to ageing being highly sex-dimorphic in mammals (Yuan et al., 2012; Sampathkumar et al., 2020). Hence, in-depth analysis of sex specificity of the DiCo pattern could be relevant. Our mouse dataset included only male mice, while that of Jonker et al. was female-only. The fact that both revealed DiCo patterns suggests that DiCo is not particular to one sex, but there could still exist sex-specific effects. In fact, when we analysed DiCo among human male and female individuals in the GTEx dataset separately, we observed slightly stronger inter-tissue convergence among ageing females than in males, although the GTEx male samples also have a drastically narrower age range (Figure 2—figure supplement 16). Accordingly, the prevalence of DiCo among humans and between sexes remains to be determined.

Despite the open questions that remain, our results consistently support a model where ageing mammals suffer from loss of specialisation at the tissue level, and possibly also at the cellular level, which are observed as expression reversals and the newly discovered DiCo phenomenon we report here.

Materials and methods

Sample collection

We collected bulk tissue samples from 16 male C57BL/6J mice. The samples were snap frozen in liquid nitrogen and stored at –80°C. No perfusion was applied. The mice were of different ages covering the whole lifespan of Mus musculus, comprising both postnatal development and ageing periods. The samples included four different tissues: cerebral cortex, liver, lung, and skeletal muscle. One 904-day-old mouse had no cortex tissue sample, so the cortex analyses lack this individual. As a result, we generated 63 RNA-seq libraries in total.

Separation of development and ageing periods

In order to compare gene expression changes during postnatal development and ageing, we studied the samples before sexual maturation (covering 2–61 days of age, n = 7) as the postnatal development period, and samples covering 93–904 days (n = 9 in all tissues except in cortex where we had n = 8) as the ageing period.

RNA-seq library preparation

RNA sequencing was performed as previously described (Liu et al., 2016) with slight modifications. Briefly, total RNA was extracted using the Trizol reagent (Invitrogen) from frozen tissue samples. For sequencing library construction, we randomised all samples to avoid batch effects and used the TruSeq RNA Sample Preparation Kit (Illumina) according to the manufacturer’s instruction. Libraries were then sequenced on the Illumina HiSeq 4000 system in three lanes within one flow cell using the 150 bp paired-end module.

RNA-seq data preprocessing

The quality assessment of the raw RNA-seq data was performed using FastQC v.0.11.5 (Andrews, 2010). Adapters were removed using Trimmomatic v.0.36 (Bolger et al., 2014). The low-quality reads were filtered using the parameters: ‘PE ILLUMINACLIP: TruSeq3-PE-2.fa:2:30:1:0:8:true, SLIDINGWINDOW:4:15, MINLEN:25’. The remaining high-quality reads were aligned to the mouse reference genome GRCm38 using STAR-2.5.3 (Dobin et al., 2013) with parameters: ‘--sjdbOverhang 99 --outSAMattrIHstart 0 --outSAMstrandfield intronMotif --sjdbGTFfile GRCm38.gtf’. The percentage of uniquely mapped reads in libraries ranged from 80% to 93%. We used cufflinks v.2.2.1 (Trapnell et al., 2010) to generate read counts for uniquely aligned reads (samtools ‘-q 255’ filter) and calculated expression levels as fragments per kilobase of transcript per million mapped reads (FPKM). In total, we quantified expression levels for 51,608 genes in the GRCm38.gtf file. We identified 50 duplicated genes with more than one FPKM value assigned, and for each of them the sum of the FPKM values was used.

All the remaining analyses were performed in R v.4.1. We restricted the whole analysis to only protein-coding genes obtained by the ‘biotype’ feature of the biomaRt library v.2.48.2 (Durinck et al., 2009). We also excluded genes which were not detected (0 FPKM) in 25% or more of the samples (at least 15 of 63), resulting in 15,063 protein-coding genes in total. As FPKM normalisation does not effectively account for cross-library variability, we additionally performed two normalisation approaches:

  1. Quantile normalisation: Using all the samples together (n = 63, regardless of their age or tissue), FPKM values were log2 transformed (after adding 1) and quantile normalised with ‘normalize.quantiles’ function from ‘preprocessCore’ library v.1.54 (Bolstad, 2020). This approach equalises the distributions of different libraries. The assumption is that any large-scale differences in expression-level distributions reflect technical factors.

  2. VST: To assess the robustness of quantile normalisation on downstream analysis, we additionally implemented this approach, which ensures homoscedasticity, that is, variances of expression levels are independent of the mean (Anders and Huber, 2010). Uniquely aligned reads obtained from the STAR alignment were used to calculate read counts by HTSeq v.0.13.5 (Anders et al., 2015) with parameters: ‘--format=bam --order=pos --stranded=no --type=exon --mode=union --nonunique=none’. Read counts were then imported into R using the ‘DESeqDataSetFromHTSeqCount’ function in DESeq2 v.1.32.0 package (Love et al., 2014). The same filtration steps were applied as above, resulting in 14,973 protein-coding genes in total. Normalisation was performed with the ‘vst’ function and ‘blind = T’ option in the DESeq2 package. The VST-normalised expression matrix was used to reproduce results of Figures 1 and 2, which are given in Figure 1—figure supplements 10 and 11 and Figure 2—figure supplement 14.
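The quantile normalisation step (approach 1) can be illustrated with a minimal sketch. It is not the ‘preprocessCore’ implementation: the function name `quantile_normalise` and the toy FPKM values are hypothetical, and ties are broken by position rather than handled as in ‘normalize.quantiles’.

```python
import math

def quantile_normalise(columns):
    """Quantile-normalise equally long sample columns (one column per library).

    Values are log2(x + 1) transformed first. Ties are broken by position,
    which is a simplification and an assumption of this sketch.
    """
    logged = [[math.log2(v + 1.0) for v in col] for col in columns]
    n_genes = len(logged[0])
    sorted_cols = [sorted(col) for col in logged]
    # Reference distribution: mean of the k-th smallest value across libraries.
    reference = [sum(col[k] for col in sorted_cols) / len(sorted_cols)
                 for k in range(n_genes)]
    normalised = []
    for col in logged:
        order = sorted(range(n_genes), key=lambda g: col[g])
        out = [0.0] * n_genes
        for rank, gene in enumerate(order):
            # The gene with the k-th smallest value receives the k-th reference value.
            out[gene] = reference[rank]
        normalised.append(out)
    return normalised

# Toy FPKM values: three hypothetical libraries, four genes each.
fpkm = [[0.0, 3.0, 7.0, 1.0],
        [1.0, 15.0, 31.0, 3.0],
        [3.0, 0.0, 63.0, 7.0]]
norm = quantile_normalise(fpkm)
```

After this step every library has the same set of values, and only the assignment of values to genes differs between libraries.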

Principal components analysis

We studied the main sources of variation in the whole dataset using PCA on the scaled expression matrix with ‘prcomp’ function in the R base. The first four components, PC1–PC4, explained 31, 20, 17, and 8% of the total variance. We observed a clear separation of tissues in PC1 and PC2 and a strong age effect in PC4. To statistically confirm tissue differences, we performed ANOVA on individual PC scores with tissue as explanatory variable; this was run on each of the first four PCs (PC1–PC4) separately. The magnitude of the age effect on PCA was measured with Spearman’s correlation test between individual age and each individual’s PC score separately in each tissue. PCA was also repeated for development and ageing periods separately (Figure 1—figure supplement 3). We further calculated Euclidean distance in pairwise manner among tissues of each individual in PC1–4 space constructed in three different ways: (1) using all the samples together, (2) using only the developmental samples, and (3) using only the ageing samples. Then, we tested the effect of age on mean Euclidean distance among tissues using the Spearman’s correlation test. To study only the age effect on PC scores without the tissue effect, we performed the following: (1) we removed the tissue-specific effects from the data by scaling the expression levels of each gene to mean = 0 and sd = 1 in each tissue separately, (2) we combined the four scaled expression matrices, and (3) we conducted PCA on the combined dataset (Figure 1—figure supplement 2).

Age-related gene expression change

To identify genes showing age-related expression change in each tissue, we used Spearman’s correlation coefficient between individual age and expression level separately for development and ageing periods. To capture potential nonlinear but monotonic changes in expression, we chose the non-parametric two-sided Spearman’s correlation test for both periods. We have used two-sided tests for all statistical tests throughout the article except the permutation tests. Significance of age-related genes was assessed with the FDR (FDR-corrected p-value<0.1 cutoff, calculated with the Benjamini–Hochberg [BH] procedure; Benjamini and Hochberg, 1995) using the ‘p.adjust’ function in the R base library. Throughout the article, BH procedure with 0.1 cutoff was used for multiple test corrections of all statistical tests.
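As an illustration of the two ingredients named above, the sketch below computes Spearman’s correlation coefficient with average ranks for ties and applies the BH adjustment. The analysis itself used R; the function names and the toy ages, expression values, and p-values here are hypothetical, and the p-value of the correlation test is not computed.

```python
def average_ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2.0 + 1.0
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Spearman's rho as the Pearson correlation of the two rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5

def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values (step-up, kept monotone)."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i], reverse=True)
    adjusted = [0.0] * m
    running_min = 1.0
    for position, idx in enumerate(order):
        rank = m - position  # rank of this p-value in ascending order
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted

# Toy example: one gene measured at four ages (days), and four raw p-values.
rho = spearman_rho([2, 10, 30, 61], [1.0, 2.5, 2.4, 4.0])
fdr = bh_adjust([0.01, 0.04, 0.03, 0.5])
significant = [q < 0.1 for q in fdr]
```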

Functional associations

We tested the functional associations of age-related gene expression change in separate tissues for each period (development and ageing) separately, employing the gene set over-representation analysis (GORA) procedure with GO (Ashburner et al., 2000) BP categories using the ‘topGO’ package v.2.44 (Alexa and Rahnenfuhrer, 2019). We applied the ‘classical’ algorithm and performed Fisher’s exact test on categories containing a minimum of 10 and a maximum of 500 genes. We used the whole set of expressed genes (n = 15,063) as the background. p-Values were corrected for multiple testing using the BH procedure. Categories with FDR-corrected p-value<0.1 were considered as significant.

Correlation between age-related gene expression changes in different tissues

We calculated Spearman’s correlation coefficients between age-related gene expression change ρgene values (i.e. correlation between gene expression levels and age) calculated per gene in each tissue pair (Figure 1c). In order to test the statistical significance of the correlations, we used a permutation scheme as the expression levels across tissues are not independent but belong to the same mice. In order to account for the dependence, the individual ages were permuted in each round, but the permuted values were kept constant across tissues (similar to permutation tests applied in Dönertaş et al., 2017; Işıldak et al., 2020; Dönertaş et al., 2018). Specifically, we performed 1000 permutation rounds. In each round, we randomised the individual ages using the ‘sample’ function in R, while keeping the permuted age labels constant for individuals across tissues. We calculated the age-related gene expression changes with permuted ages in development and ageing datasets separately, thus simulating the null distribution with no age effect in each period. We then calculated the Spearman’s correlation coefficient between the age-related expression changes from the permutations across tissues and assigned the p-value by calculating the proportion of permuted calculations with a more extreme correlation. All permutation tests in the article were performed as one-sided tests. The estimated false-positive proportion (eFPP; proportion of false positives among all true non-significant results (true negatives + false positives)) was calculated as the median value of expected values divided by the observed value (Figure 1—source data 1).
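The key point of this scheme, one shuffle of the ages per round that is reused for every tissue, can be sketched as follows. This is an illustration under stated assumptions rather than the original R code: the function names, the toy expression values, the seed, and the choice of counting permuted correlations that are at least as large as the observed one are all hypothetical.

```python
import random

def average_ranks(values):
    """1-based ranks with ties given their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2.0 + 1.0
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman's rho; returns 0.0 for a constant vector (sketch-only guard)."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        return 0.0
    return sum((a - mx) * (b - my) for a, b in zip(rx, ry)) / (var_x * var_y) ** 0.5

def age_rhos(expr_by_gene, ages):
    """Per-gene correlation between expression and age in one tissue."""
    return [spearman(expr, ages) for expr in expr_by_gene]

def shared_age_permutation(tissue_a, tissue_b, ages, n_perm=1000, seed=1):
    rng = random.Random(seed)
    observed = spearman(age_rhos(tissue_a, ages), age_rhos(tissue_b, ages))
    null = []
    for _ in range(n_perm):
        shuffled = rng.sample(ages, len(ages))  # one shuffle per round ...
        null.append(spearman(age_rhos(tissue_a, shuffled),   # ... used for tissue A
                             age_rhos(tissue_b, shuffled)))  # ... and for tissue B
    p_value = sum(1 for v in null if v >= observed) / n_perm
    return observed, p_value, null

# Toy data: five genes (rows) measured in six mice, in two tissues.
ages = [2, 10, 21, 30, 45, 61]
tissue_a = [[1.0, 1.5, 2.1, 2.8, 3.0, 3.9], [5.0, 4.1, 4.4, 3.2, 2.9, 2.0],
            [2.0, 2.6, 2.2, 3.1, 2.9, 3.5], [4.0, 3.7, 3.9, 3.1, 3.3, 2.5],
            [1.0, 3.0, 2.0, 2.5, 1.5, 2.2]]
tissue_b = [[0.5, 0.9, 1.4, 1.2, 2.0, 2.6], [9.0, 8.0, 8.5, 7.0, 6.0, 6.5],
            [3.0, 3.2, 3.9, 3.5, 4.4, 4.1], [7.0, 7.5, 6.0, 6.2, 5.0, 5.5],
            [2.0, 1.0, 3.0, 1.5, 2.5, 2.2]]
observed, p_value, null = shared_age_permutation(tissue_a, tissue_b, ages, n_perm=200)
```

Because the same shuffled ages are applied to both tissues, the null distribution retains the dependence between tissues of the same mouse.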

Shared gene expression changes across tissues

We summarised the number of shared age-related genes among tissues for up- and downregulated genes separately using FDR-corrected p-value<0.1 (Figure 1—figure supplement 5). The development and ageing datasets were tested separately. For each gene, we counted the number of tissues with the same direction of expression change with age. We calculated this overlap statistic among tissues (1) using genes with FDR-corrected p-value<0.1 and (2) with all genes without using any significance cutoff (Figure 1e, Figure 1—figure supplement 4).

Permutation test

We again used a permutation scheme to assess the significance of shared age-related genes to account for the dependence among tissues. We tested the significance of shared up- and downregulated genes, selected with or without an FDR cutoff, in development and in ageing periods separately. We used the age-related expression change values (ρ′gene) calculated by permuting individual ages, 1000 times. To test the significance of the overlap of significantly up- or downregulated genes (FDR-corrected p-value<0.1) among tissues, we used the following procedure: (1) for each permutation round, we ranked the ρ′gene values for each tissue in each period separately. (2) We chose the highest Nu (to test the upregulation) or lowest Nd (to test the downregulation) number of genes, where Nu and Nd are the number of significantly up- or downregulated genes, respectively, in a given tissue (FDR-corrected p-value<0.1). (3) For each permutation round, we calculated the number of overlaps across tissues using the chosen gene sets, that is, the number of tissues with the same direction of expression changes with age for those genes. Doing this for 1000 permutation results yielded a null distribution representing the expected overlaps if there were no age effects. (4) We calculated the p-value as the proportion of 1000 permutations where the number of overlaps was higher than the observed value. The eFPP was calculated as the median number of overlaps in permutations divided by the observed value.

Likewise, to test the significance of the overlap of shared up- and downregulated genes selected without FDR cutoff, we used the same permutation scheme explained above, but this time using all the age-related expression changes created using permutations (ρ′gene), without applying a significance cutoff for any tissue, and calculating the overlap across tissues in the same way.

Functional associations

We tested the functional associations of shared expression change trends among tissues in each period separately following the GORA procedure using the same criteria and algorithms explained in the previous section. To test shared upregulated (n = 45) or downregulated genes (n = 138) in development, we chose all significant age-related genes across tissues (n = 10,305) in the development period as background. Since we could not identify any shared ageing-related genes across tissues (Figure 1—figure supplement 5), we did not perform a functional test for the ageing period.

Analysis of gene expression reversals

We compared the direction of gene expression change during development and during ageing to identify reversal genes in each tissue separately. Genes showing upregulation (positive correlation with age) in development and downregulation (negative correlation with age) in ageing were assigned as up-down (UD) reversal genes, while the genes with the opposite trend (downregulation in development and upregulation in ageing) were assigned as down-up (DU) reversal genes. Without using any significance level for expression-age correlation values, we calculated the proportion of genes showing reversal by keeping the expression change direction in development the same, that is, UD% = UD/(UU + UD) and DU% = DU/(DD + DU).
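The two proportions defined above can be computed from per-gene correlation signs as in the following sketch. The function name and the toy correlation values are hypothetical, and skipping genes with a correlation of exactly zero is an assumption of the sketch.

```python
def reversal_proportions(rho_dev, rho_age):
    """Count UU, UD, DU, DD genes from the signs of the age correlations."""
    counts = {"UU": 0, "UD": 0, "DU": 0, "DD": 0}
    for dev, age in zip(rho_dev, rho_age):
        if dev == 0 or age == 0:
            continue  # no direction to assign; skipped in this sketch
        label = ("U" if dev > 0 else "D") + ("U" if age > 0 else "D")
        counts[label] += 1
    # The developmental direction is held fixed in each denominator.
    ud_fraction = counts["UD"] / (counts["UU"] + counts["UD"])
    du_fraction = counts["DU"] / (counts["DD"] + counts["DU"])
    return counts, ud_fraction, du_fraction

# Toy correlations for seven genes in one tissue.
rho_dev = [0.8, 0.5, 0.3, -0.6, -0.2, -0.9, 0.4]
rho_age = [-0.4, 0.2, -0.1, 0.3, -0.5, 0.7, -0.6]
counts, ud_fraction, du_fraction = reversal_proportions(rho_dev, rho_age)
```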

Permutation test

To test the significance of reversal proportions, we kept the developmental changes constant and randomly permuted the individual ages only in the ageing period (as described earlier). Among developmental upregulated genes, we calculated the UD% in each permutation, simulating a null distribution for UD reversal. We applied the same principle for the DU genes. Thus, we created a null distribution with the expected reversal ratios and tested the significance of observed values for each tissue separately (Figure 1—figure supplement 8).

Functional associations

We used the GORA procedure as described earlier to test functional associations of reversal genes in each tissue but kept the developmental changes constant in the background. More specifically, we tested the functional enrichment of UD reversal genes against UU genes, and DU genes against DD genes. We thereby specifically test the functions associated with the reversal pattern, but not development-associated functions.

Overlap of reversal genes: Permutation test

We tested the significance of overlap using the same permutation scheme described above. Specifically, among developmental up- (or down-) regulated genes shared among tissues, we constructed null distributions by calculating the ratio of UD vs. UD + UU (or DU vs. DU + DD) genes shared among tissues, identified in 1000 random permutations of individual ages only in the ageing period (Figure 1—figure supplement 9). The number of shared upregulated genes was nup = 2255 (one gene excluded since it has constant expression in one tissue in ageing period), and the number of shared downregulated genes was ndown = 2209.

Tissue convergence and divergence calculations using CoV

For each individual mouse, for each gene (n = 15,063), we calculated the inter-tissue CoV estimate using normalised expression levels from the four tissues, dividing the standard deviation by the mean. We studied inter-tissue expression-variation change with age in development and ageing periods separately using two approaches: (1) using the change in mean or median CoV across genes and (2) studying significant CoV patterns at the single-gene level.
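A minimal sketch of the CoV calculation for one mouse is given below. The function names and expression values are hypothetical, and the use of the sample standard deviation (n − 1 denominator, as in R’s ‘sd’) is an assumption.

```python
import statistics

def inter_tissue_cov(expression_by_tissue):
    """CoV of one gene in one mouse: standard deviation across tissues / mean."""
    values = list(expression_by_tissue.values())
    return statistics.stdev(values) / statistics.mean(values)

def mean_cov(genes):
    """Transcriptome-wide summary for one mouse: mean CoV across genes."""
    return statistics.mean(inter_tissue_cov(g) for g in genes.values())

# Toy normalised expression levels of two genes in the four tissues of one mouse.
mouse = {
    "geneA": {"cortex": 2.0, "lung": 4.0, "liver": 4.0, "muscle": 6.0},
    "geneB": {"cortex": 5.0, "lung": 5.0, "liver": 5.0, "muscle": 5.0},
}
cov_a = inter_tissue_cov(mouse["geneA"])
cov_b = inter_tissue_cov(mouse["geneB"])
summary = mean_cov(mouse)
```

A gene expressed at identical levels in all tissues has a CoV of 0, and the CoV grows as tissues differ more in their expression of that gene.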

Mean/median CoV across all genes

We assessed transcriptome-wide variation among the tissues of each individual mouse by calculating the mean (or median) CoV of genes and then performing the Spearman’s correlation test between mean-CoV (or median-CoV) and individual age.

CoV at the single-gene level

In the second approach, we tested the correlation between the CoV value of a gene and individual age for each commonly expressed gene using the Spearman’s correlation test. p-Values were corrected for multiple testing using the ‘BH’ procedure. We used FDR-corrected p-value<0.1 as cutoff. The genes showing positive correlation between CoV and age were called ‘divergent,’ and the ones showing negative correlation were called ‘convergent’ (Figure 2b). Genes that display a divergent pattern during development and convergent pattern in ageing (without using a significance level) were called divergent-convergent (DiCo) genes (n = 4802).

Permutation test

To test the significance of DiCo genes (n = 4802), we kept the developmental divergent genes constant (n = 9058, without a significance cutoff) and randomly permuted the individual ages only in the ageing period (as described earlier). Among developmental divergent genes, we calculated the DiCo% for each permutation, simulating a null distribution for the DiCo pattern (Figure 2—figure supplement 12).

Clustering of DiCo genes

We used the k-means algorithm to cluster DiCo genes according to their CoV or expression changes with age separately (Figure 2—figure supplements 2 and 3). To find the optimum number of clusters for both procedures, we applied gap statistics using the ‘clusGap’ function in the ‘cluster’ package v.2.1.2 with 500 simulations (Tibshirani et al., 2001). We used the ‘kmeans’ function in base R with ‘iter.max = 20’ and ‘nstart = 50’ parameters to cluster CoV values or expression levels which were standardised to mean = 0 and sd = 1 across genes.

Effect of gene expression trajectories on DiCo

To identify potential non-monotonic expression changes with age that could not be detected with the Spearman’s correlation coefficient, we clustered all expressed genes (n = 15,063) in each tissue separately using the k-means algorithm following the same steps explained above (Figure 1—figure supplements 12–15). The list of genes belonging to each cluster is given in Figure 2—source data 1. Then, for each cluster, separately in each tissue, we performed a Fisher’s exact test to assess if a particular cluster pattern is enriched or depleted in DiCo genes relative to all other expressed genes (the background).

Functional association analysis

To test the functional associations of the genes showing the DiCo pattern among tissues, we performed GSEA using GO BPs. We retrieved developmental divergent genes (with ρCoV-age > 0, n = 9058) and multiplied these ρCoV-age values with the ones calculated in the ageing period. Therefore, the genes with a negative value represent a DiCo pattern, while the ones with a positive value represent a DiDi pattern. We then ranked the genes according to the calculated product values and sought enrichment for the upper and lower tail of the distribution using the KS test implemented in the ‘clusterProfiler’ package v.4.0.0 (Yu et al., 2012). The ‘gseGO’ function was used with parameters: ‘nPerm = 1000, minGSSize = 10, maxGSSize = 500 and pValueCutoff = 1’. Therefore, the enriched categories for the genes in the lower tail of the distribution would represent DiCo enrichment. Categories with FDR-corrected p-value<0.1 were considered as significant.
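The ranking statistic used for GSEA can be sketched as follows. The gene names, correlation values, and function name are hypothetical, and the enrichment test itself (performed with ‘clusterProfiler’) is not reproduced.

```python
def dico_ranking(rho_dev, rho_age):
    """Rank developmentally divergent genes by the product of CoV-age correlations.

    rho_dev and rho_age map gene -> Spearman's rho between CoV and age in
    development and in ageing, respectively.
    """
    scores = {gene: rho_dev[gene] * rho_age[gene]
              for gene in rho_dev if rho_dev[gene] > 0}  # keep divergent genes only
    # Most negative products first: the lower tail represents the DiCo pattern.
    return sorted(scores.items(), key=lambda item: item[1])

rho_dev = {"geneA": 0.7, "geneB": 0.4, "geneC": -0.3, "geneD": 0.9}
rho_age = {"geneA": -0.6, "geneB": 0.5, "geneC": -0.2, "geneD": -0.1}
ranked = dico_ranking(rho_dev, rho_age)
patterns = {gene: ("DiCo" if score < 0 else "DiDi") for gene, score in ranked}
```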

We summarised DiCo-enriched categories into representative ones following Dönertaş et al., 2021 and used hierarchical clustering on gene similarities among categories. The tree was cut into 25 clusters. For each cluster, we chose as representative the category that has the highest mean Jaccard similarity to the other categories in the same cluster. Then, we calculated the mean age-expression correlation across all the genes in each representative category in each tissue and in each period. As unrelated categories (those with low within-cluster similarity) were grouped into one cluster, we denoted them ‘Other GO’ and performed the same clustering steps to further summarise them (Figure 4—figure supplement 1).
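The choice of a representative category within one cluster can be illustrated as below. Category names, gene sets, and function names are hypothetical, and the sketch assumes a cluster with at least two categories.

```python
def jaccard(a, b):
    """Jaccard similarity of two gene sets: shared genes / all genes."""
    return len(a & b) / len(a | b)

def representative(categories):
    """Pick the category with the highest mean Jaccard similarity to the others.

    categories maps category name -> set of genes, for one cluster.
    """
    def mean_similarity(name):
        sims = [jaccard(categories[name], genes)
                for other, genes in categories.items() if other != name]
        return sum(sims) / len(sims)
    return max(categories, key=mean_similarity)

cluster = {
    "category_1": {"g1", "g2", "g3"},
    "category_2": {"g2", "g3", "g4"},
    "category_3": {"g3", "g4", "g5"},
}
chosen = representative(cluster)
```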

We further sought functional enrichment among DiCo genes that were clustered with the k-means algorithm for both CoV and expression clusters separately (Figure 2—figure supplements 2 and 3). Genes in each cluster were tested among all DiCo genes using the same GORA procedure as described before.

Jackknife to test the Di/Co ratio between development and ageing

We tested the significance of divergent/convergent gene ratios using a jackknife resampling procedure in development and in ageing periods separately. Leaving out an individual in each iteration, we recalculated the number of significant divergent and convergent genes and their ratios. As we could not obtain any gene with significant CoV changes when the youngest adults were left out due to the decreased power, standard error and confidence interval calculation was not possible. Instead, we report the range of pseudovalues. We note that the range of ratios in leave-out samples does not contain the value 1 either in the development (0.41–0.49) or in the ageing (1.20–2.83) period (Figure 2e).

Pairwise tissue DiCo test

In order to further verify the inter-tissue DiCo pattern that we observed between development and ageing periods, we used a different approach based on expression correlations among tissues. We calculated pairwise Spearman’s correlation coefficients among tissues of the same individual mouse using all commonly expressed genes among the tissues (n = 15,063). For each tissue pair, we tested the correlation between age and inter-tissue expression correlations using the Spearman’s correlation test in development and in ageing periods separately. In addition, we calculated the mean (or median) of all six pairwise tissue correlations for each individual mouse and tested the correlation between age and average inter-tissue expression correlations using the Spearman’s correlation test (Figure 2—figure supplement 6).

Determination of tissue-specific genes

To identify which tissue(s) contribute to the reversal pattern, we assigned each gene to a tissue to identify tissue-specific expression patterns. First, we calculated an ES between the expression of a gene in a tissue versus other three tissues using the development samples only, and repeated this procedure for all tissues. Hence, we obtained ES for each commonly expressed gene in each tissue. ES was calculated using the ‘Cohen’s d’ formula defined as the difference between the two means divided by the pooled standard deviation. We then assigned each gene to a tissue in which the gene has the highest ES. Finally, we retrieved only the fourth quartile (>Q3) of genes assigned to a tissue to define tissue-specific expression. Using this approach, we identified 3766 tissue-specific genes in total (cortex: 1175; lung: 839; liver: 986; muscle: 766 genes).
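The effect-size step can be sketched as follows, using Cohen’s d with a pooled standard deviation. Function names and expression values are hypothetical, and the subsequent fourth-quartile filter is left out of the sketch.

```python
import statistics

def cohens_d(group, rest):
    """Difference of the two means divided by the pooled standard deviation."""
    n1, n2 = len(group), len(rest)
    pooled_var = ((n1 - 1) * statistics.variance(group)
                  + (n2 - 1) * statistics.variance(rest)) / (n1 + n2 - 2)
    return (statistics.mean(group) - statistics.mean(rest)) / pooled_var ** 0.5

def assign_tissue(expr_by_tissue):
    """Assign a gene to the tissue with the highest ES versus the other tissues."""
    effect = {}
    for tissue, values in expr_by_tissue.items():
        rest = [v for other, vals in expr_by_tissue.items()
                if other != tissue for v in vals]
        effect[tissue] = cohens_d(values, rest)
    best = max(effect, key=effect.get)
    return best, effect[best]

# Toy developmental expression levels of one gene (three mice per tissue).
gene = {
    "cortex": [8.0, 9.0, 10.0],
    "lung": [1.0, 2.0, 3.0],
    "liver": [2.0, 3.0, 4.0],
    "muscle": [1.0, 3.0, 2.0],
}
tissue, es = assign_tissue(gene)
d_check = cohens_d([2.0, 4.0], [1.0, 3.0])
```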

Enrichment test with the direction of age-related change

We tested the association between tissue specificity and age-related expression change during ageing using Fisher’s exact test. Specifically, we constructed a contingency table with two categorical variables: the first variable defines the direction (either positive or negative) of maximum expression change during ageing identified in a tissue-specific gene, which is determined by the slope of the regression between log2 age and expression. The second variable defines whether this maximum expression change identified in a tissue-specific gene occurs in its native tissue or not (either yes or no). Hence, an odds ratio (OR) above 1 suggests that (1) either the expression of genes decreases the most in their native tissue and/or (2) the expression of genes increases the most in a non-native tissue during ageing.

Enrichment of tissue-specific genes in DiCo genes

We tested the association between tissue specificity (being either tissue-specific [n = 3766] or not [n = 11,297]) and the DiCo pattern (either showing DiCo [n = 4802] or not [n = 10,261]) using the Fisher’s exact test, calculating the enrichment of tissue-specific genes within DiCo genes.
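For a 2×2 table such as the one described above, a two-sided Fisher’s exact test can be sketched with exact integer arithmetic. The counts in the example are hypothetical and do not come from our data; the reported odds ratio is the simple sample odds ratio, which differs from the conditional estimate returned by R’s ‘fisher.test’.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the table [[a, b], [c, d]].

    Sums the probabilities of all tables with the same margins that are no
    more likely than the observed one.
    """
    row1, col1, n = a + b, a + c, a + b + c + d
    lowest, highest = max(0, col1 - (n - row1)), min(row1, col1)

    def weight(k):
        # Unnormalised hypergeometric probability of k in the top-left cell.
        return comb(row1, k) * comb(n - row1, col1 - k)

    observed = weight(a)
    extreme = sum(weight(k) for k in range(lowest, highest + 1)
                  if weight(k) <= observed)
    p_value = extreme / comb(n, col1)
    odds_ratio = (a * d) / (b * c) if b * c else float("inf")
    return odds_ratio, p_value

# Hypothetical counts: rows = tissue-specific (yes, no), columns = DiCo (yes, no).
odds_ratio, p_value = fisher_exact_two_sided(3, 1, 1, 3)
```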

Additional publicly available bulk tissue transcriptome datasets

Jonker

We downloaded the raw data from the GEO database with GSE34378 accession number (Jonker et al., 2013) and followed the same analysis pipeline described above using all the samples from five tissues (‘brain – cortex,’ ‘lung,’ ‘liver,’ ‘kidney,’ ‘spleen’) of 18 female mice comprising 90 samples in total. This dataset represents the ageing period of the mouse, ranging from 90 to 900 days. Using the oligo package v.1.56.0 (Carvalho and Irizarry, 2010), we retrieved the expression matrices and performed ‘rma’ normalisation followed by removing the probesets that were annotated to more than one gene. We confined the analysis to only the protein-coding genes expressed in at least 25% of all samples. The resulting 17,661 genes were log2 transformed (after adding 1) and quantile normalised using the preprocessCore library (Bolstad, 2020) across all samples. Downstream analysis was the same as described above.

Schaum

We downloaded the raw count matrix from the GEO database with GSE132040 accession number (Schaum et al., 2020) and performed the same filtering steps as described above. We discarded the samples that have fewer than 4 million reads, which was the cutoff used in the article. We restricted the analysis to only protein-coding genes expressed in at least 25% of the samples that have expression in four tissues (‘brain,’ ‘lung,’ ‘liver,’ ‘muscle’). One individual was removed from the analysis due to being an outlier in PCA after visual inspection (mouse ID: ‘3m7,’ PCA plots before and after outlier removal are present in our GitHub repository; hmtzg, 2022). The final dataset contained 16,806 protein-coding genes from 37 mice that range from 3 to 27 months of age covering the ageing period. There were 11 female mice ranging from 3 to 21 months of age and 26 male mice ranging from 3 to 27 months of age. We performed the same normalisation method and downstream analyses described above. We extended the analysis to eight tissues (‘brain,’ ‘heart,’ ‘kidney,’ ‘liver,’ ‘lung,’ ‘muscle,’ ‘spleen,’ ‘subcutaneous fat’) which were chosen based on the highest number of individuals that have the same tissue samples and that cover the whole ageing period (3–27 months). For the fat tissue, ‘subcutaneous fat’ was chosen as representative tissue which has the highest number of samples among all minor fat tissues. After performing the same preprocessing steps explained above, the final dataset contained 17,619 genes from 26 mice. Downstream analysis was the same as above.

GTEx

We downloaded the processed GTEx v8 dataset (Battle et al., 2017) from the data portal and repeated the analysis in human tissues. We first confirmed our results in the same 4 tissues (‘brain – cortex,’ ‘lung,’ ‘liver,’ ‘muscle – skeletal’) and then expanded the analysis to 10 tissues (‘adipose – subcutaneous,’ ‘artery – tibial,’ ‘brain – cerebellum,’ ‘lung,’ ‘muscle – skeletal,’ ‘nerve – tibial,’ ‘pituitary,’ ‘skin – sun exposed [lower leg],’ ‘thyroid,’ ‘whole blood’). In order to choose which tissues to analyse, we first chose the minor tissues with the highest number of samples for each major tissue, which prevents the representation of the same tissue multiple times. We then performed hierarchical clustering of tissues based on the presence of samples from the same individuals (Figure 2—figure supplement 13) and cut the tree into three clusters based on visual inspection. We selected the cluster with the highest number of overlapping individuals to analyse. The same procedure was followed for both 4- and 10-tissue analyses. In particular, we restricted the analysis to the individuals with samples in all tissues analysed and with a death circumstance of 1 (violent and fast deaths due to an accident) or 2 (fast death of natural causes) on the Hardy scale (n = 47 for 4 tissues, n = 35 for 10 tissues). We removed duplicated genes from the analysis. Similar to our analysis with the mouse data, we used only the protein-coding genes that are expressed in at least 25% of all samples, totalling 16,197 for 4 tissues and 16,305 for 10 tissues. The TPM values obtained from the GTEx data portal were log2 transformed (after adding 1) and quantile normalised using the preprocessCore library (Bolstad, 2020) in R. Downstream analysis was the same as other datasets. To study the sex-specific convergence patterns, we repeated the same analysis separating female (n = 11) and male (n = 36) individuals.

Comparison of datasets

We compared the age-related expression change patterns across tissues of all datasets analysed using Spearman’s correlation coefficient. We used the ‘pheatmap’ function from pheatmap package v1.0.12 (Raivo, 2019) using hierarchical clustering (Figure 4—figure supplement 2a).

We performed Fisher’s exact test to test the enrichment of convergent genes among datasets during ageing. We used only the convergent genes in ageing in our dataset (n = 7748) for comparison. For GTEx and Schaum et al. datasets, we performed enrichment for the same four tissues as our dataset and also for the larger sets, indicated as GTEx10 and Schaum8, respectively (Figure 4—figure supplement 2b).

Regulatory analysis

We used MiRTarBase (downloaded on 03/08/2021; Hsu et al., 2011, Hsu et al., 2014) and TRANSFAC (downloaded on 03/08/2021; Matys et al., 2003; Matys et al., 2006) resources from the Ma’ayan lab database (Rouillard et al., 2016) for miRNA and transcription factor binding site (TFBS) enrichment analyses, respectively. As the database contains target information only for human HGNC IDs, we first converted those IDs to human Ensembl IDs and then to mouse Ensembl IDs only for the one-to-one ortholog genes using ‘getBM’ and ‘getLDS’ functions from the biomaRt package. In total, we analysed 235 miRNAs associated with 5458 target genes and 158 TFs associated with 7427 target genes. We conducted the overrepresentation analysis in the same way as for the DiCo functional enrichment analysis: specifically, we tested the targets of each regulator for enrichment in -Co genes (convergent genes in ageing) among Di- genes (divergent genes in development) used as background to keep developmental patterns fixed. We restricted the analysis to miRNAs and TFs that have at least five target genes. After multiple testing correction with the BH procedure, we found no enrichment among either of the regulator types. Enrichment results are given in Figure 4—source data 1.

Heteroscedasticity tests on the DiCo pattern

To test the hypothesis that the convergence pattern observed in the ageing period could be explained by the increased noise with age, thus regression towards the mean, we performed two distinct heteroscedasticity tests to compare DiCo genes against the lifelong-divergent genes (DiDi). In the first, we followed the method used to measure heteroscedasticity in Işıldak et al., 2020 and Kedlian et al., 2019. We first fit a linear model between log2-transformed age and expression level for each gene in each tissue (Kedlian et al., 2019; Işıldak et al., 2020; Somel et al., 2006). This represents the variability of error along the explanatory variable, age. Then, we calculated Spearman’s correlation coefficient between the absolute residual values and age, which can be used as an estimate of heterogeneity change with age. We compared the heterogeneity change values of DiCo and DiDi genes using a two-sided KS test in each tissue. In the second approach, we used the ‘ncvTest’ function from the ‘car’ package v.3.0.11 (Fox and Weisberg, 2018), which is a chi-squared test for heteroscedasticity estimated using a linear model. Again, we compared the heteroscedasticity measures of DiCo and DiDi genes using a two-sided KS test in each tissue.
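The first heteroscedasticity measure can be sketched for a single gene as follows: fit expression on log2 age by least squares, then correlate the absolute residuals with age. The function names, ages, and expression values are hypothetical toy inputs, and ties are not handled in the simplified rank function.

```python
import math

def simple_ranks(values):
    """1-based ranks; assumes no ties (a simplification of this sketch)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        ranks[idx] = rank
    return ranks

def spearman_no_ties(x, y):
    """Spearman's rho via the rank-difference formula (valid without ties)."""
    n = len(x)
    d2 = sum((a - b) ** 2 for a, b in zip(simple_ranks(x), simple_ranks(y)))
    return 1.0 - 6.0 * d2 / (n * (n * n - 1))

def heterogeneity_change(ages, expression):
    """Correlation between age and absolute residuals of expression ~ log2(age)."""
    x = [math.log2(a) for a in ages]
    mx, my = sum(x) / len(x), sum(expression) / len(expression)
    slope = (sum((xi - mx) * (yi - my) for xi, yi in zip(x, expression))
             / sum((xi - mx) ** 2 for xi in x))
    intercept = my - slope * mx
    abs_residuals = [abs(yi - (intercept + slope * xi))
                     for xi, yi in zip(x, expression)]
    return spearman_no_ties(abs_residuals, ages)

# Toy gene: the spread around the fitted line grows with age.
ages = [64, 128, 256, 512, 1024]
expression = [0.0, 0.1, -0.3, 0.9, -1.5]
change = heterogeneity_change(ages, expression)
```

A positive value indicates that the deviation from the fitted trend tends to grow with age, and a negative value that it tends to shrink.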

Single-cell RNA-seq

Preprocessing

We used the Tabula Muris Senis dataset (Schaum et al., 2020) for scRNA-seq analysis as it is the only dataset to our knowledge that includes time-series samples covering old age and the tissues present in our dataset. Seurat-processed FACS data of the tissues lung, liver, skeletal muscle, and non-myeloid brain were downloaded from the figshare database (Pisco, 2020). The Seurat package v.4.0.0 (Stuart et al., 2019) was used to retrieve the expression matrix of the cells that are annotated to cell types in the original article. Each tissue contains samples from three time points: 90- (3 months), 540- (18 months), and 720-day-old (24 months) mice, totalling 14 samples each in lung, liver, and brain, and 9 samples in liver. We excluded cell types with fewer than 15 cells among all samples and excluded genes if the expression level is 0 for all cells at a given age. This resulted in a median number of 99–382 cells assigned to cell types, 6–24 cell types and 16,951–22,122 genes across tissues. Using 3-month-old mice, we calculated cell-type-specific expressions in each tissue. Specifically, we first calculated the mean expression levels among cells of an individual mouse for each cell type, and then calculated the mean among individuals to obtain an average expression value for each cell type. Uniprot gene symbols were converted to Ensembl gene IDs using the ‘biomaRt’ R package (Durinck et al., 2009).

Deconvolution

We used cell-type-specific expression profiles of 3-month-old mice to estimate relative contributions of cell types to the transcriptome profiles of tissues in our mouse dataset. For a given tissue in our mouse dataset, we used single-cell expression profiles of that tissue from the Tabula Muris Senis dataset. We used a linear regression-based deconvolution method for each tissue using three gene sets: all genes (n = [12,492, 12,849]), DiCo genes (n = [4007, 4106]), and non-DiCo genes (n = [8485, 8743]). Regression coefficients were used as relative contributions of cell types according to the following linear model:

$$Y_i = a + b_{j1}X_{i1} + b_{j2}X_{i2} + \dots + b_{jn}X_{in},$$

where $i$ represents the tissue, $Y_i$ is the expression level of a sample in a tissue, $b_{j1}, \dots, b_{jn}$ represent the relative contributions of the $n$ cell types in a tissue, and $X_{i1}, \dots, X_{in}$ represent the expression levels of the $n$ cell types in a tissue.

We then tested the effect of age on cell-type contributions ($b_{j1}, \dots, b_{jn}$) using the Spearman’s correlation test in development and in ageing.
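As a rough illustration of the regression-based deconvolution above, the sketch below fits the linear model by ordinary least squares across genes for a single bulk sample. It is an assumption-laden toy, not the authors' implementation: the helper names (`solve_linear_system`, `deconvolve`), the cell-type labels, and the expression values are all hypothetical, and the solver uses the normal equations purely to stay within the standard library.

```python
def solve_linear_system(matrix, vector):
    """Solve matrix @ x = vector by Gauss-Jordan elimination with partial pivoting."""
    n = len(vector)
    aug = [row[:] + [value] for row, value in zip(matrix, vector)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(n):
            if row != col:
                factor = aug[row][col] / aug[col][col]
                aug[row] = [x - factor * y for x, y in zip(aug[row], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def deconvolve(bulk, cell_type_profiles):
    """Least-squares fit of bulk = a + sum_k b_k * profile_k, with genes as observations.

    bulk: expression of one bulk sample, one value per gene.
    cell_type_profiles: list of per-gene expression lists, one list per cell type.
    Returns (intercept a, [b_1, ..., b_n]).
    """
    design = [[1.0] + [profile[g] for profile in cell_type_profiles]
              for g in range(len(bulk))]
    p = len(design[0])
    xtx = [[sum(row[i] * row[j] for row in design) for j in range(p)] for i in range(p)]
    xty = [sum(row[i] * y for row, y in zip(design, bulk)) for i in range(p)]
    coefficients = solve_linear_system(xtx, xty)
    return coefficients[0], coefficients[1:]


# Toy profiles for two hypothetical cell types over five genes.
neuron = [1.0, 4.0, 2.0, 8.0, 5.0]
astrocyte = [3.0, 1.0, 6.0, 2.0, 7.0]
# Simulated bulk sample built from known contributions (a = 0.5, b = 0.7 and 0.3).
bulk_sample = [0.5 + 0.7 * n + 0.3 * a for n, a in zip(neuron, astrocyte)]

intercept, contributions = deconvolve(bulk_sample, [neuron, astrocyte])
print(intercept, contributions)
```

Repeating this fit for every sample yields one coefficient per cell type per sample, and those coefficients are what would then be correlated with age.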

Cell-type similarities and their change during ageing

To investigate the contribution of cell-autonomous changes to inter-tissue convergence in ageing, we calculated pairwise cell-type expression correlations among tissues and studied how these correlations change with age. Based on pairwise correlations in the 3-month age group, we identified the maximally and minimally correlated cell-type pairs among tissues. Specifically, for each cell type in a given tissue, we chose the minimally correlated cell type in each of the other three tissues. For example, for each of the 10 cell types in the liver, we chose the minimally correlated cell type among the 15 cortex cell types, the minimally correlated cell type among the 24 lung cell types, and the minimally correlated cell type among the 6 muscle cell types. We repeated this procedure for all cell types in all four tissues, resulting in 54 cell-type pairs. Then, we calculated Spearman’s correlation coefficients between age and minimally correlated cell-type pairs identified in the 3-month-age group. Likewise, we repeated the same analysis for the maximally correlated cell-type pairs among tissues.
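The pair-selection step and the subsequent age trend can be sketched as follows. This is only an illustrative sketch under stated assumptions: the function names, the cell-type labels, the expression vectors, and the per-age correlation values are invented for the example, and Spearman's coefficient is computed by hand (Pearson correlation of average ranks) to avoid external dependencies.

```python
def average_ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared_rank = (start + end) / 2 + 1
        for position in range(start, end + 1):
            ranks[order[position]] = shared_rank
        start = end + 1
    return ranks


def pearson(x, y):
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return covariance / (var_x * var_y) ** 0.5


def spearman(x, y):
    return pearson(average_ranks(x), average_ranks(y))


def extreme_partner(profile, other_tissue, pick=min):
    """Name of the cell type in other_tissue whose correlation with profile is
    lowest (pick=min) or highest (pick=max)."""
    return pick(other_tissue, key=lambda name: spearman(profile, other_tissue[name]))


# Toy 3-month profiles over five genes (hypothetical values).
hepatocyte = [1.0, 2.0, 3.0, 4.0, 5.0]
lung_cell_types = {
    "type II pneumocyte": [2.0, 3.0, 4.0, 5.0, 6.0],
    "endothelial cell": [5.0, 4.0, 3.0, 2.0, 1.0],
    "fibroblast": [1.0, 3.0, 2.0, 5.0, 4.0],
}
least_similar = extreme_partner(hepatocyte, lung_cell_types, pick=min)
most_similar = extreme_partner(hepatocyte, lung_cell_types, pick=max)

# For a chosen pair, the correlation is recomputed at each age and tested against age.
ages_in_months = [3, 18, 24]
pair_correlation = [-0.9, -0.5, -0.2]  # hypothetical values for the least similar pair
age_trend = spearman(ages_in_months, pair_correlation)
print(least_similar, most_similar, age_trend)
```

A positive age trend for a minimally correlated pair would indicate that the two cell types become more similar with age, which is the pattern the analysis looks for.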

Permutation tests

To test whether DiCo genes are significantly more associated with cell-type proportion changes than non-DiCo genes, we performed a permutation test based on a resampling procedure. For each tissue, we took random samples among all genes (n = [12,492, 12,849]) with size N, where N is the number of DiCo genes in that tissue, and repeated the deconvolution analysis as explained above. By calculating cell-type proportion changes with age for each of 1000 random samples, we created the null distribution for each cell type. Then, we calculated the p-values as the fraction of random samples whose cell-type proportion change values were equal to or higher than the observed value (cell-type proportion changes with DiCo genes).

We applied a similar permutation scheme as explained above to test cell-type similarity change differences between DiCo and non-DiCo genes. For each random sample of non-DiCo genes with size N, we calculated the pairwise correlations among cell types of tissues and identified maximally and minimally correlated cell types in the 3-month-age group. Then, we calculated age-related changes of those correlations using Spearman’s correlation coefficient to construct the null distribution.
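The resampling logic shared by both permutation schemes can be sketched generically. The sketch assumes a one-sided empirical p-value (the share of random gene sets whose statistic is at least as large as the observed one); the function name, the gene identifiers, and the stand-in statistic (a mean of made-up per-gene scores in place of the deconvolution-based or correlation-based statistic) are hypothetical.

```python
import random


def permutation_p_value(background_genes, test_genes, statistic,
                        n_permutations=1000, seed=0):
    """One-sided empirical p-value for statistic(test_genes).

    The null distribution is built from random gene sets of the same size as
    test_genes, drawn without replacement from background_genes.
    """
    rng = random.Random(seed)
    observed = statistic(test_genes)
    null_values = [
        statistic(rng.sample(background_genes, len(test_genes)))
        for _ in range(n_permutations)
    ]
    n_as_extreme = sum(1 for value in null_values if value >= observed)
    return n_as_extreme / n_permutations, observed, null_values


# Toy setup: 100 genes with made-up scores; the "DiCo" set is the 10 top-scoring genes.
gene_scores = {f"gene{i}": float(i) for i in range(100)}
all_genes = list(gene_scores)
dico_genes = all_genes[90:]


def mean_score(genes):
    # Stand-in for the real statistic computed from a gene set.
    return sum(gene_scores[g] for g in genes) / len(genes)


p_value, observed, null_values = permutation_p_value(all_genes, dico_genes, mean_score)
print(p_value, observed)
```

With 1000 permutations the smallest non-zero p-value this estimator can return is 0.001, which is worth keeping in mind when interpreting results close to that bound.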

Analysis of within-tissue convergence of cell types

Analogous to the inter-tissue convergence analysis, we also studied intra-tissue convergence of cell types in scRNA-seq data by calculating CoV among cell types within a tissue for each individual of ages 3 months, 18 months, and 24 months, separately. We filtered the data to retain cell types present in at least two individual mice at every time point for each tissue, which yielded 4, 7, 20, and 6 cell types in brain, liver, lung, and muscle, respectively. We then tested the mean CoV (or CoV per gene) change with age using Spearman’s correlation test.
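A minimal sketch of the within-tissue analysis is given below, assuming CoV is the sample standard deviation across cell types divided by their mean, computed per gene and then averaged per mouse. The function names, the mouse ages, and the expression values are toy assumptions; the Spearman coefficient is computed by hand with tie-aware ranks because several mice share the same age.

```python
from statistics import mean, stdev


def mean_cov(expression_by_gene):
    """Mean coefficient of variation across genes for one mouse.

    expression_by_gene: {gene: [expression in each cell type]} (hypothetical layout).
    """
    return mean(stdev(values) / mean(values) for values in expression_by_gene.values())


def average_ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        for position in range(start, end + 1):
            ranks[order[position]] = (start + end) / 2 + 1
        start = end + 1
    return ranks


def spearman(x, y):
    rx, ry = average_ranks(x), average_ranks(y)
    mean_x, mean_y = mean(rx), mean(ry)
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    return covariance / (var_x * var_y) ** 0.5


# Toy data: (age in months, {gene: expression in three cell types}) for four mice.
mice = [
    (3, {"geneA": [2.0, 10.0, 18.0], "geneB": [1.0, 5.0, 9.0]}),
    (3, {"geneA": [3.0, 10.0, 17.0], "geneB": [2.0, 5.0, 8.0]}),
    (18, {"geneA": [6.0, 10.0, 14.0], "geneB": [3.0, 5.0, 7.0]}),
    (24, {"geneA": [8.0, 10.0, 12.0], "geneB": [4.0, 5.0, 6.0]}),
]
ages = [age for age, _ in mice]
covs = [mean_cov(expression) for _, expression in mice]
rho = spearman(ages, covs)
print(covs, rho)
```

In this toy example the cell types drift towards the same expression level in older mice, so the mean CoV falls with age and the correlation is negative, which is how intra-tissue convergence would show up.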

Acknowledgements

We thank Wolfgang Enard and Wulf Hevers for help with the mouse experiments and sharing samples, Nurcan Tuncbag, Nihal Terzi Çizmecioğlu, and the whole METU CompEvo team for helpful comments and fruitful discussions, and Zeliha Gözde Turan and Melih Yıldız for the critical reading of the manuscript and their suggestions. This work was supported by EMBL (HMD), the Scientific and Technological Research Council of Turkey (TÜBİTAK 2232, MS), the Science Academy (of Turkey) BAGEP Award (MS), and a METU Internal Grant (BAP, MS). The publication of this article was funded by the Open Access Fund of the Leibniz Association and the Leibniz Institute on Aging – Fritz Lipmann Institute (FLI), Jena, Germany. The FLI is a member of the Leibniz Association and is financially supported by the Federal Government of Germany and the State of Thuringia.

Funding Statement

The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.

Contributor Information

Philipp Khaitovich, Email: p.khaitovich@skoltech.ru.

Mehmet Somel, Email: somel.mehmet@googlemail.com.

Handan Melike Dönertaş, Email: melike.donertas@leibniz-fli.de.

Bérénice A Benayoun, University of Southern California, United States.

Kathryn Song Eng Cheah, The University of Hong Kong, Hong Kong.

Funding Information

This paper was supported by the following grants:

  • European Molecular Biology Laboratory to Handan Melike Dönertaş.

  • Scientific and Technological Council of Turkey 2232 to Mehmet Somel.

  • Science Academy (Turkey) BAGEP Awards to Mehmet Somel.

  • METU Internal Grant to Mehmet Somel.

  • Leibniz Institute on Aging – Fritz Lipmann Institute (FLI) Open Access Fund to Handan Melike Dönertaş.

  • Leibniz Association Open Access Fund to Handan Melike Dönertaş.

Additional information

Competing interests

No competing interests declared.

Author contributions

Conceptualization, Data curation, Formal analysis, Methodology, Software, Visualization, Writing – original draft, Writing – review and editing.

Investigation, Resources, Writing – review and editing.

Data curation, Formal analysis, Validation, Visualization, Writing – review and editing.

Investigation, Writing – review and editing.

Data curation, Formal analysis, Writing – review and editing.

Conceptualization, Project administration, Resources, Supervision, Writing – review and editing.

Conceptualization, Funding acquisition, Methodology, Project administration, Resources, Supervision, Writing – original draft, Writing – review and editing.

Conceptualization, Data curation, Formal analysis, Funding acquisition, Methodology, Project administration, Supervision, Validation, Visualization, Writing – original draft, Writing – review and editing.

Ethics

Human subjects: Data involving human subjects were obtained from a published dataset, GTEx portal (https://www.gtexportal.org/home/datasets, with accession phs000424.v8.p2). Hence, no ethical statement is required.

Post-mortem samples were obtained from 16 C57BL/6J mice aged between 2 days and 904 days. All mouse experiments were overseen by the Institutional Animal Welfare Officer of the Max Planck Institute for Evolutionary Anthropology (MPI-EVA). They were performed according to the German Animal Welfare Legislation ("Tierschutzgesetz") and registered with the Federal State Authority Landesdirektion Sachsen (No. 24-9162. 11-01 (T62/08)). The mice were sacrificed for reasons independent of this study; their tissues were harvested, frozen immediately, and stored at -80°C.

Additional files

Supplementary file 1. Gene set over-representation analysis (GORA) of age-related genes in tissues.

Tissue-specific age-related gene expression changes and functional enrichment test results, performed with GORA using ‘topGO’ package.

elife-68048-supp1.xlsx (2.5MB, xlsx)
Supplementary file 2. Gene set over-representation analysis (GORA) of shared age-related genes among tissues.

Functional enrichment for shared genes across tissues. The same GORA that was performed for Supplementary file 1 was used to test the enrichment of shared up-/downregulated genes in development among the background genes which are chosen as the all-significant age-related genes across tissues in development. We did not apply the test for the ageing period as there were no shared ageing-related expression changes.

elife-68048-supp2.xlsx (152KB, xlsx)
Supplementary file 3. Gene set over-representation analysis (GORA) of reversal patterns.

Functional enrichment for gene expression reversals. GORA was performed with the same criteria as explained above. Up-down reversal genes were tested against up-up genes, and down-up reversal genes were tested against down-down genes in each tissue.

elife-68048-supp3.xlsx (1.2MB, xlsx)
Supplementary file 4. Gene set over-representation analysis (GORA) of divergence-convergence (DiCo) gene clusters determined with coefficient of variation (CoV) values.

Functional enrichment of DiCo genes clustered with k-means algorithm according to their CoV values. GORA was performed using gene sets in each cluster (Figure 2—figure supplement 2) which were tested among all DiCo genes.

elife-68048-supp4.xlsx (814.7KB, xlsx)
Supplementary file 5. Gene set over-representation analysis (GORA) of divergence-convergence (DiCo) gene clusters determined with expression levels.

Functional enrichment of DiCo genes clustered with k-means algorithm according to their expression levels. GORA was performed using gene sets in each cluster (Figure 2—figure supplement 3) which are tested among all DiCo genes.

elife-68048-supp5.xlsx (1.9MB, xlsx)
Transparent reporting form

Data availability

Sequencing data generated for this study have been deposited in GEO under accession code GSE167665. All data analysed during this study are included in the manuscript and supporting files. Source data files have been provided for all figures and figure supplements. Four additional and previously published datasets are used in this study: Jonker et al. 2013, GTEx Consortium et al. 2017, Schaum et al. 2020, and Tabula Muris Consortium 2020. All the code used to perform analyses is available in GitHub: https://github.com/hmtzg/geneexp_mouse (copy archived at swh:1:rev:1f2434f90404a79c87d545eca8723d99b123ac1c).

The following dataset was generated:

Izgi H, Han D, Isildak U, Huang S, Kocabiyik E, Khaitovich P, Somel M, Donertas HM. 2021. Bulk RNA-seq of mice covering the whole lifespan (2 days to 904 days) from four tissues. NCBI Gene Expression Omnibus. GSE167665

The following previously published datasets were used:

Jonker MJ, Melis JP, Kuiper RV, van der Hoeven TV, Robinson J, van der Horst GT, Breit TM, Vijg J, Dollé ME, Hoeijmakers JH, van Steeg H. 2013. Aging Experiment. NCBI Gene Expression Omnibus. GSE34378

GTEx Consortium et al 2017. Gene TPMs. GTEx Portal. phs000424.v8.p2

Pisco A. 2020. Official data release for Tabula Muris Senis. figshare.

Schaum et al 2020. Tabula Muris Senis: Bulk sequencing. NCBI Gene Expression Omnibus. GSE132040

References

  1. Alexa A, Rahnenfuhrer J. topGO: Enrichment Analysis for Gene Ontology. TopGO 2019
  2. Anders S, Huber W. Differential expression analysis for sequence count data. Genome Biology. 2010;11:R106. doi: 10.1186/gb-2010-11-10-r106.
  3. Anders S, Pyl PT, Huber W. HTSeq--a Python framework to work with high-throughput sequencing data. Bioinformatics (Oxford, England) 2015;31:166–169. doi: 10.1093/bioinformatics/btu638.
  4. Andrews S. FastQC: A Quality Control Tool for High Throughput Sequence Data. FastQC. 2010 http://www.bioinformatics.babraham.ac.uk/projects/fastqc
  5. Angelidis I, Simon LM, Fernandez IE, Strunz M, Mayr CH, Greiffo FR, Tsitsiridis G, Ansari M, Graf E, Strom T-M, Nagendran M, Desai T, Eickelberg O, Mann M, Theis FJ, Schiller HB. An atlas of the aging lung mapped by single cell transcriptomics and deep tissue proteomics. Nature Communications. 2019;10:963. doi: 10.1038/s41467-019-08831-9.
  6. Anisimova AS, Meerson MB, Gerashchenko MV, Kulakovskiy IV, Dmitriev SE, Gladyshev VN. Multifaceted deregulation of gene expression and protein synthesis with age. PNAS. 2020;117:15581–15590. doi: 10.1073/pnas.2001788117.
  7. Ashburner M, Ball CA, Blake JA, Botstein D, Butler H, Cherry JM, Davis AP, Dolinski K, Dwight SS, Eppig JT, Harris MA, Hill DP, Issel-Tarver L, Kasarskis A, Lewis S, Matese JC, Richardson JE, Ringwald M, Rubin GM, Sherlock G. Gene Ontology: tool for the unification of biology. Nature Genetics. 2000;25:25–29. doi: 10.1038/75556.
  8. Bahar R, Hartmann CH, Rodriguez KA, Denny AD, Busuttil RA, Dollé MET, Calder RB, Chisholm GB, Pollock BH, Klein CA, Vijg J. Increased cell-to-cell variation in gene expression in ageing mouse heart. Nature. 2006;441:1011–1014. doi: 10.1038/nature04844.
  9. Battle A, Brown CD, Engelhardt BE, Montgomery SB, GTEx Consortium. Genetic effects on gene expression across human tissues. Nature. 2017;550:204–213. doi: 10.1038/nature24277.
  10. Benjamini Y, Hochberg Y. Controlling the False Discovery Rate: A Practical and Powerful Approach to Multiple Testing. Journal of the Royal Statistical Society. 1995;57:289–300. doi: 10.1111/j.2517-6161.1995.tb02031.x.
  11. Blagosklonny MV. Aging and immortality: quasi-programmed senescence and its pharmacologic inhibition. Cell Cycle (Georgetown, Tex.) 2006;5:2087–2102. doi: 10.4161/cc.5.18.3288.
  12. Bolger AM, Lohse M, Usadel B. Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics (Oxford, England) 2014;30:2114–2120. doi: 10.1093/bioinformatics/btu170.
  13. Bolstad B. preprocessCore: A Collection of Pre-Processing Functions. PreprocessCore. 2020 https://github.com/bmbolstad/preprocessCore
  14. Brawand D, Soumillon M, Necsulea A, Julien P, Csárdi G, Harrigan P, Weier M, Liechti A, Aximu-Petri A, Kircher M, Albert FW, Zeller U, Khaitovich P, Grützner F, Bergmann S, Nielsen R, Pääbo S, Kaessmann H. The evolution of gene expression levels in mammalian organs. Nature. 2011;478:343–348. doi: 10.1038/nature10532.
  15. Cardoso-Moreira M, Halbert J, Valloton D, Velten B, Chen C, Shao Y, Liechti A, Ascenção K, Rummel C, Ovchinnikova S, Mazin PV, Xenarios I, Harshman K, Mort M, Cooper DN, Sandi C, Soares MJ, Ferreira PG, Afonso S, Carneiro M, Turner JMA, VandeBerg JL, Fallahshahroudi A, Jensen P, Behr R, Lisgo S, Lindsay S, Khaitovich P, Huber W, Baker J, Anders S, Zhang YE, Kaessmann H. Gene expression across mammalian organ development. Nature. 2019;571:505–509. doi: 10.1038/s41586-019-1338-5.
  16. Carvalho BS, Irizarry RA. A framework for oligonucleotide microarray preprocessing. Bioinformatics (Oxford, England) 2010;26:2363–2367. doi: 10.1093/bioinformatics/btq431.
  17. Colantuoni C, Lipska BK, Ye T, Hyde TM, Tao R, Leek JT, Colantuoni EA, Elkahloun AG, Herman MM, Weinberger DR, Kleinman JE. Temporal dynamics and genetic control of transcription in the human prefrontal cortex. Nature. 2011;478:519–523. doi: 10.1038/nature10524.
  18. de Magalhães JP, Church GM. Genomes optimize reproduction: aging as a consequence of the developmental program. Physiology (Bethesda, Md.) 2005;20:252–259. doi: 10.1152/physiol.00010.2005.
  19. Dobin A, Davis CA, Schlesinger F, Drenkow J, Zaleski C, Jha S, Batut P, Chaisson M, Gingeras TR. STAR: ultrafast universal RNA-seq aligner. Bioinformatics (Oxford, England) 2013;29:15–21. doi: 10.1093/bioinformatics/bts635.
  20. Dönertaş HM, İzgi H, Kamacıoğlu A, He Z, Khaitovich P, Somel M. Gene expression reversal toward pre-adult levels in the aging human brain and age-related loss of cellular identity. Scientific Reports. 2017;7:5894. doi: 10.1038/s41598-017-05927-4.
  21. Dönertaş HM, Fuentealba Valenzuela M, Partridge L, Thornton JM. Gene expression-based drug repurposing to target aging. Aging Cell. 2018;17:e12819. doi: 10.1111/acel.12819.
  22. Dönertaş HM, Fabian DK, Valenzuela MF, Partridge L, Thornton JM. Common genetic associations between age-related diseases. Nature Aging. 2021;1:400–412. doi: 10.1038/s43587-021-00051-5.
  23. Durinck S, Spellman PT, Birney E, Huber W. Mapping identifiers for the integration of genomic datasets with the R/Bioconductor package biomaRt. Nature Protocols. 2009;4:1184–1191. doi: 10.1038/nprot.2009.97.
  24. Enge M, Arda HE, Mignardi M, Beausang J, Bottino R, Kim SK, Quake SR. Single-Cell Analysis of Human Pancreas Reveals Transcriptional Signatures of Aging and Somatic Mutation Patterns. Cell. 2017;171:321–330. doi: 10.1016/j.cell.2017.09.004.
  25. Ezcurra M, Benedetto A, Sornda T, Gilliat AF, Au C, Zhang Q, van Schelt S, Petrache AL, Wang H, Guardia Y, Bar-Nun S, Tyler E, Wakelam MJ, Gems D. C. elegans Eats Its Own Intestine to Make Yolk Leading to Multiple Senescent Pathologies. Current Biology. 2018;28:3352. doi: 10.1016/j.cub.2018.10.003.
  26. Feser J, Truong D, Das C, Carson JJ, Kieft J, Harkness T, Tyler JK. Elevated histone expression promotes life span extension. Molecular Cell. 2010;39:724–735. doi: 10.1016/j.molcel.2010.08.015.
  27. Fisher RA. The Genetical Theory of Natural Selection. OUP; 1930.
  28. Flurkey K, Currer JM, Harrison DE. In: The Mouse in Biomedical Research. Fox James G, Davisson Muriel T, Quimby Fred W, Barthold Stephen W, Newcomer Christian E, Smith Abigail L., editors. Academic Press; 2007. Chapter 20 - Mouse Models in Aging Research; pp. 637–672.
  29. Fox J, Weisberg S. An R Companion to Applied Regression. SAGE Publications; 2018.
  30. Gems D, Partridge L. Genetics of longevity in model organisms: debates and paradigm shifts. Annual Review of Physiology. 2013;75:621–644. doi: 10.1146/annurev-physiol-030212-183712.
  31. Hernando-Herraez I, Evano B, Stubbs T, Commere P-H, Jan Bonder M, Clark S, Andrews S, Tajbakhsh S, Reik W. Ageing affects DNA methylation drift and transcriptional cell-to-cell variability in mouse muscle stem cells. Nature Communications. 2019;10:4361. doi: 10.1038/s41467-019-12293-4.
  32. hmtzg. geneexp_mouse. swh:1:rev:1f2434f90404a79c87d545eca8723d99b123ac1c. Software Heritage. 2022. https://archive.softwareheritage.org/swh:1:dir:b8c43e421f7216167380682c06ed9040db053627;origin=https://github.com/hmtzg/geneexp_mouse;visit=swh:1:snp:5a896cb4722794c85f464a75d459caf84021ffa0;anchor=swh:1:rev:1f2434f90404a79c87d545eca8723d99b123ac1c
  33. Hsu SD, Lin FM, Wu WY, Liang C, Huang WC, Chan WL, Tsai WT, Chen GZ, Lee CJ, Chiu CM, Chien CH, Wu MC, Huang CY, Tsou AP, Huang HD. miRTarBase: a database curates experimentally validated microRNA-target interactions. Nucleic Acids Research. 2011;39:D163–D169. doi: 10.1093/nar/gkq1107.
  34. Hsu SD, Tseng YT, Shrestha S, Lin YL, Khaleel A, Chou CH, Chu CF, Huang HY, Lin CM, Ho SY, Jian TY, Lin FM, Chang TH, Weng SL, Liao KW, Liao IE, Liu CC, Huang HD. miRTarBase update 2014: an information resource for experimentally validated miRNA-target interactions. Nucleic Acids Research. 2014;42:D78–D85. doi: 10.1093/nar/gkt1266.
  35. Işıldak U, Somel M, Thornton JM, Dönertaş HM. Temporal changes in the gene expression heterogeneity during brain development and aging. Scientific Reports. 2020;10:4080. doi: 10.1038/s41598-020-60998-0.
  36. Jonker MJ, Melis JP, Kuiper RV, van der Hoeven TV, Wackers PFK, Robinson J, van der Horst GT, Dollé ME, Vijg J, Breit TM, Hoeijmakers JH, van Steeg H. Life spanning murine gene expression profiles in relation to chronological and pathological aging in multiple organs. Aging Cell. 2013;12:901–909. doi: 10.1111/acel.12118.
  37. Kedlian VR, Donertas HM, Thornton JM. The widespread increase in inter-individual variability of gene expression in the human brain with age. Aging. 2019;11:2253–2280. doi: 10.18632/aging.101912.
  38. Kim S, Villeponteau B, Jazwinski SM. Effect of replicative age on transcriptional silencing near telomeres in Saccharomyces cerevisiae. Biochemical and Biophysical Research Communications. 1996;219:370–376. doi: 10.1006/bbrc.1996.0240.
  39. Lind MI, Ravindran S, Sekajova Z, Carlsson H, Hinas A, Maklakov AA. Experimentally reduced insulin/IGF-1 signaling in adulthood extends lifespan of parents and improves Darwinian fitness of their offspring. Evolution Letters. 2019;3:207–216. doi: 10.1002/evl3.108.
  40. Liu X, Han D, Somel M, Jiang X, Hu H, Guijarro P, Zhang N, Mitchell A, Halene T, Ely JJ, Sherwood CC, Hof PR, Qiu Z, Pääbo S, Akbarian S, Khaitovich P. Disruption of an Evolutionarily Novel Synaptic Expression Pattern in Autism. PLOS Biology. 2016;14:e1002558. doi: 10.1371/journal.pbio.1002558.
  41. Love MI, Huber W, Anders S. Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biology. 2014;15:550. doi: 10.1186/s13059-014-0550-8.
  42. Luegmayr E, Glantschnig H, Wesolowski GA, Gentile MA, Fisher JE, Rodan GA, Reszka AA. Osteoclast formation, survival and morphology are highly dependent on exogenous cholesterol/lipoproteins. Cell Death and Differentiation. 2004;11 Suppl 1:S108–S118. doi: 10.1038/sj.cdd.4401399.
  43. Martinez-Jimenez CP, Eling N, Chen H-C, Vallejos CA, Kolodziejczyk AA, Connor F, Stojic L, Rayner TF, Stubbington MJT, Teichmann SA, de la Roche M, Marioni JC, Odom DT. Aging increases cell-to-cell transcriptional variability upon immune stimulation. Science (New York, N.Y.) 2017;355:1433–1436. doi: 10.1126/science.aah4115.
  44. Matys V, Fricke E, Geffers R, Gössling E, Haubrock M, Hehl R, Hornischer K, Karas D, Kel AE, Kel-Margoulis OV, Kloos D-U, Land S, Lewicki-Potapov B, Michael H, Münch R, Reuter I, Rotert S, Saxel H, Scheer M, Thiele S, Wingender E. TRANSFAC: transcriptional regulation, from patterns to profiles. Nucleic Acids Research. 2003;31:374–378. doi: 10.1093/nar/gkg108.
  45. Matys V, Kel-Margoulis OV, Fricke E, Liebich I, Land S, Barre-Dirrie A, Reuter I, Chekmenev D, Krull M, Hornischer K, Voss N, Stegmaier P, Lewicki-Potapov B, Saxel H, Kel AE, Wingender E. TRANSFAC and its module TRANSCompel: transcriptional gene regulation in eukaryotes. Nucleic Acids Research. 2006;34:D108–D110. doi: 10.1093/nar/gkj143.
  46. Medawar PB. Unsolved problem of biology. The Medical Journal of Australia. 1953;1:854–855. doi: 10.5694/j.1326-5377.1953.tb84985.x.
  47. Pisco A. Tabula Muris Senis Data Objects. Figshare. 2020 doi: 10.6084/m9.figshare.12654728.v1.
  48. Raivo K. Pheatmap: Pretty Heatmaps. R Package Version 2019
  49. Rouillard AD, Gundersen GW, Fernandez NF, Wang Z, Monteiro CD, McDermott MG, Ma’ayan A. The harmonizome: a collection of processed datasets gathered to serve and mine knowledge about genes and proteins. Database. 2016;2016:baw100. doi: 10.1093/database/baw100.
  50. Sampathkumar NK, Bravo JI, Chen Y, Danthi PS, Donahue EK, Lai RW, Lu R, Randall LT, Vinson N, Benayoun BA. Widespread sex dimorphism in aging and age-related diseases. Human Genetics. 2020;139:333–356. doi: 10.1007/s00439-019-02082-w.
  51. Schaum N, Lehallier B, Hahn O, Pálovics R, Hosseinzadeh S, Lee SE, Sit R, Lee DP, Losada PM, Zardeneta ME, Fehlmann T, Webber JT, McGeever A, Calcuttawala K, Zhang H, Berdnik D, Mathur V, Tan W, Zee A, Tan M, Tabula Muris Consortium, Pisco AO, Karkanias J, Neff NF, Keller A, Darmanis S, Quake SR, Wyss-Coray T. Ageing hallmarks exhibit organ-specific temporal signatures. Nature. 2020;583:596–602. doi: 10.1038/s41586-020-2499-y.
  52. Somel M, Khaitovich P, Bahn S, Pääbo S, Lachmann M. Gene expression becomes heterogeneous with age. Current Biology. 2006;16:R359–R360. doi: 10.1016/j.cub.2006.04.024.
  53. Somel M, Guo S, Fu N, Yan Z, Hu HY, Xu Y, Yuan Y, Ning Z, Hu Y, Menzel C, Hu H, Lachmann M, Zeng R, Chen W, Khaitovich P. MicroRNA, mRNA, and protein expression link development and aging in human and macaque brain. Genome Research. 2010;20:1207–1218. doi: 10.1101/gr.106849.110.
  54. Stuart T, Butler A, Hoffman P, Hafemeister C, Papalexi E, Mauck WM, Hao Y, Stoeckius M, Smibert P, Satija R. Comprehensive Integration of Single-Cell Data. Cell. 2019;177:1888–1902. doi: 10.1016/j.cell.2019.05.031.
  55. Tabula Muris Consortium. A single-cell transcriptomic atlas characterizes ageing tissues in the mouse. Nature. 2020;583:590–595. doi: 10.1038/s41586-020-2496-1.
  56. Tacutu R, Thornton D, Johnson E, Budovsky A, Barardo D, Craig T, Diana E, Lehmann G, Toren D, Wang J, Fraifeld VE, de Magalhães JP. Human Ageing Genomic Resources: new and updated databases. Nucleic Acids Research. 2018;46:D1083–D1090. doi: 10.1093/nar/gkx1042.
  57. Tibshirani R, Walther G, Hastie T. Estimating the number of clusters in a data set via the gap statistic. Journal of the Royal Statistical Society. 2001;63:411–423. doi: 10.1111/1467-9868.00293.
  58. Trapnell C, Williams BA, Pertea G, Mortazavi A, Kwan G, van Baren MJ, Salzberg SL, Wold BJ, Pachter L. Transcript assembly and quantification by RNA-Seq reveals unannotated transcripts and isoform switching during cell differentiation. Nature Biotechnology. 2010;28:511–515. doi: 10.1038/nbt.1621.
  59. Turan ZG, Parvizi P, Dönertaş HM, Tung J, Khaitovich P, Somel M. Molecular footprint of Medawar’s mutation accumulation process in mammalian aging. Aging Cell. 2019;18:e12965. doi: 10.1111/acel.12965.
  60. Williams GC. PLEIOTROPY, NATURAL SELECTION, AND THE EVOLUTION OF SENESCENCE. Evolution. 1957;11:398–411. doi: 10.1111/j.1558-5646.1957.tb02911.x.
  61. Yang J, Huang T, Petralia F, Long Q, Zhang B, Argmann C, Zhao Y, Mobbs CV, Schadt EE, Zhu J, Tu Z, GTEx Consortium. Synchronized age-related gene expression changes across multiple tissues in human and the link to complex diseases. Scientific Reports. 2015;5:15145. doi: 10.1038/srep15145.
  62. Yang J, Griffin P, Vera D, Apostolides J, Hayano M, Meer M, Salfati E. Erosion of the Epigenetic Landscape and Loss of Cellular Identity as a Cause of Aging in Mammals. Cold Spring Harbor Laboratory; 2019.
  63. Yu G, Wang L-G, Han Y, He Q-Y. clusterProfiler: an R package for comparing biological themes among gene clusters. Omics. 2012;16:284–287. doi: 10.1089/omi.2011.0118.
  64. Yuan Y, Chen Y-PP, Boyd-Kirkup J, Khaitovich P, Somel M. Accelerated aging-related transcriptome changes in the female prefrontal cortex. Aging Cell. 2012;11:894–901. doi: 10.1111/j.1474-9726.2012.00859.x.
  65. Zahn JM, Poosala S, Owen AB, Ingram DK, Lustig A, Carter A, Weeraratna AT, Taub DD, Gorospe M, Mazan-Mamczarz K, Lakatta EG, Boheler KR, Xu X, Mattson MP, Falco G, Ko MSH, Schlessinger D, Firman J, Kummerfeld SK, Wood WH, Zonderman AB, Kim SK, Becker KG. AGEMAP: a gene expression database for aging in mice. PLOS Genetics. 2007;3:e201. doi: 10.1371/journal.pgen.0030201.

Editor's evaluation

Bérénice A Benayoun

In this study, Izgi et al. investigated age-dependent gene expression pattern changes in male mice by analysing new bulk RNA-seq data from four different tissues collected at different ages covering postnatal development and ageing. Gene expression patterns observed before sexual maturity show inter-tissue divergence, whereas convergence of gene expression profiles is observed after sexual maturity and during ageing, in a pattern that the authors call divergence-convergence or ‘DiCo.’ This observation may suggest that ageing results in at least a partial loss of tissue identity acquired developmentally.

Decision letter

Editor: Bérénice A Benayoun

Our editorial process produces two outputs: i) public reviews designed to be posted alongside the preprint for the benefit of readers; ii) feedback on the manuscript for the authors, including requests for revisions, shown below. We also include an acceptance summary that explains what the editors found interesting or important about the work.

Decision letter after peer review:

Thank you for submitting your article "Inter-tissue convergence of gene expression and loss of cellular identity during ageing" for consideration by eLife. Your article has been reviewed by 3 peer reviewers, one of whom is a member of our Board of Reviewing Editors, and the evaluation has been overseen by Matt Kaeberlein as the Senior Editor. The reviewers have opted to remain anonymous.

The reviewers have discussed their reviews with one another, and the Reviewing Editor has drafted this letter to help you prepare a revised submission.

Essential revisions:

1. The reviewers noted some potential issues with the normalization of the RNA-seq data that could affect the results, including the use of FPKM normalization, which may drive some of the observations due to lack of inter-sample-normalization (see specific comments by Reviewers #2 and 3). The authors would need to show that using appropriate count-based methods yields similar conclusions.

2. The authors often base their conclusions on low effect sizes and significance above usual thresholds. It would be important to explicitly discuss these as caveats, as well as strongly tone down these conclusions (including in the title). More generally, the reviewers felt that potential alternative hypotheses needed to be clearly laid out, and potentially ruled out, for the manuscript to stand more strongly (see specific comments by Reviewers #1, 2 and 3).

3. In general, the reviewers felt that additional information on methods was required for reproducibility (see individual reviewer reports for specific points).

Reviewer #1 (Recommendations for the authors):

1. "Such "runaway development" phenomena may be due to insufficient natural selection pressure to optimise regulation after sexual reproduction starts." As stated, this sentence suggests that maladaptive traits that could lower an organism's total reproductive success would not be selected against as long as they only appear after the beginning of reproduction. I would perhaps ask that here the authors cite, not a general review of evolutionary theories of aging, but a publication that specifically proposes that alleles that are deleterious as early as the beginning of sexual reproduction nevertheless are not selected against.

2. Lines 84-91, 145-153, 229-240, 245-250, etc. Here the font size changes in the manuscript which I assume is not intended.

3. Figure 1 caption. "Similarities were calculated using Spearman correlations coefficient between expression-age correlations across tissues." Here I suspect that "Spearman correlations coefficient" was intended to be "Spearman's correlation coefficient".

Reviewer #2 (Recommendations for the authors):

Although the authors report an intriguing finding, there are major issues in the manuscript as it stands, notably concerning the clarity and rigor of the data analysis and manuscript.

1. For the analysis of their RNA-seq dataset across samples, the authors describe using the FPKM framework (methods, line 438). This is very problematic for a number of reasons – use of FPKM for differential gene analysis across samples is deprecated because it poorly controls for inter-sample variability [PMID: 22988256; PMID: 32284352]. FPKM are not comparable between samples because the normalization is sample-specific – thus, since no cross-library normalization step is performed, it should never be used for between sample comparison. Thanks to community benchmarking efforts, TMM or VST normalized counts are widely believed to be the superior choice for DGE analysis [PMID: 22988256; PMID: 32284352]. Thus, it is crucial that the authors correct their analysis to avoid the use of FPKM, and show that their conclusions would hold with an analysis on TMM or VST normalized counts.

2. There are issues regarding consistency in analytical choices throughout the paper.

a. Application of significance cutoffs vary widely in the manuscript when selecting specific gene sets. Reasoning should be provided for each case in which significance cutoff was (or not) applied (e.g. Line 111). For example, it is rather concerning that in Figure 1d, no gene was significantly up-regulated with age in the muscle (significance cutoff applied), but in Figure 1f, a large proportion of genes are shown to increase in expression with age in the muscle (significance cutoff not applied).

b. Authors are not consistent with the FDR threshold applied (e.g. FDR < 10% in Line 123, FDR 20% in Line 127). The reasoning for the chosen thresholds, as well as the different choices should be provided for cases in which higher than standard FDR is applied.

c. In Figure 1, authors show that the muscle presents with the least number of significantly changed gene expression patterns (up and down) with age. Additionally, in the manuscript, the authors explain that the lack of genes that show statistically significant change in the muscle is supported by a previous study that also showed "weak ageing transcriptome signature across multiple datasets (Line 137)." Thus, it is rather concerning that excluding the cortex, a tissue with a large number of differentially expressed genes, for CoV calculation resulted in greater significance.

d. The authors use a mix of multiple correction methods without clear explanations as to why a consistent measure isn't used (i.e. FDR/BH, BY) – BY test is justified for GO analysis line 472, but why BY was not used for DGE analysis is not clear (line 462).

e. A number of statistical analyses performed in the manuscript do not pass the standard significance thresholds (FDR < 5% or p < 0.05). The standard applied in the field is usually p-value or FDR < 5%, although relaxing this threshold can be done when dealing with noisier datasets. When using more relaxed FDR or p-value thresholds, the authors should also be careful to strongly tone down statements proportionally to the lower statistical support for their claims (e.g. Line 240: "a SIGNIFICANT divergence-convergence pattern").

3. The authors use a dataset from exclusively male mice. Since aging is known to be extremely sex-dimorphic [PMID: 27304504], the authors need to explicitly discuss this caveat/limitation in the main text (i.e. these observations may not hold in female animals). If the authors wish to make a more general statement, a suggestion would be to reanalyze data from the recent Tabula Muris Senis Companion paper Schaum et al., [PMID: 32669715], which includes similar tissues, "developmental" time points (1 vs 3 months) and aging course (3 to 27 months), and both sexes to extend the findings. In addition, the Schaum dataset is also RNA-seq, and thus more directly comparable to the author's dataset compared to the microarray Jonker dataset used for partial validation (Figure 2 – supplement 8).

4. The authors should consider revising the title, since the current form seems to indicate that the work discovered a significant loss of cellular identity during aging. This statement is too strong based on the analyses of the paper. We suggest toning down the title, for instance to "Transcriptomic analyses suggest inter-tissue convergence of gene expression and loss of cellular identity during ageing".

Reviewer #3 (Recommendations for the authors):

Authors only analyze the gene sets that are correlated with age i.e. linearly go up or down with age. However, ageing trajectories for different genes can be very dynamic. This is the likely reason why only a few age-related changes were detected in some tissues. To assess the robustness of some of the key observations, authors should consider an orthogonal and unbiased clustering analysis similar to Figure 2 supplement to get the patterns for gene expression changes with age and to what extent DiCo is observed with different clusters.

Some of the key observations have either low effect size or are not statistically significant. For example, inter-tissue expression similarity during ageing and development (lines 115-119), the expression reversal (lines 158-161), changes in coefficient of variation with age (lines 202-207) etc. Authors clearly mention these in the manuscript, but it raises concerns on the extent to which some of these observations might be random, or due to idiosyncrasies of RNA-seq itself. These caveats should also be discussed.

In the global PCA-based analysis for Figures 1 and 2 combining both development and ageing data, the variance can be driven by one of the two states. For example, transcription divergence analysis in PCA in Figure 1 based on development alone is not significant, which suggests that the divergence of the developmental samples is in fact driven by ageing. For PCA-based analysis, using separate PCAs for development and ageing would be more statistically sound.

Are there any known gene categories that have stronger expression reversal trends? For example, do known transcription factors or pathways that regulate development show DiCo if analyzed separately? Or is this trend only seen at the genome-wide level and completely stochastic?

For functional enrichment analyses, the authors only mention some selected functions in the text, but GO terms have a lot of redundancy and it is unclear if there are any emerging functional trends looking beyond the redundancy or selected examples. It will be useful to thoroughly analyze and summarize the enriched GO terms to assess for any functional patterns that connect functional decline with ageing and convergent expression or DiCo.

Is there a significant overlap between mouse and human ageing related changes at the gene or pathway level?

Line 299 – linking the functional categories with cellular identity appears a bit hand-wavy. Stronger support would be needed to make this conclusion.

Some of the aspects need more methodological details:

For RNA-seq analysis, were all the samples normalized together, or were development and ageing samples normalized separately? It should be clarified in the methods.

Were the mice perfused? The details should be added in methods. If the mice were not perfused, a significant shared proportion of gene expression can be due to blood components. Also, the possibility that some of the shared changes could be due to infiltration of immune and other cell types in tissues should be mentioned.

More details for single cell deconvolution should be added in methods.

It is unclear to me how the age-related gene expression and PC correlation was performed. Were the loadings underlying the PC axes used for correlation analysis? It should be elaborated in methods.

Some of the supplemental figures are missing.

eLife. 2022 Jan 31;11:e68048. doi: 10.7554/eLife.68048.sa2

Author response


Essential revisions:

1. The reviewers noted some potential issues with the normalization of the RNA-seq data that could affect the results, including the use of FPKM normalization, which may drive some of the observations due to lack of inter-sample-normalization (see specific comments by Reviewers #2 and 3). The authors would need to show that using appropriate count-based methods yields similar conclusions.

We believe this comment mainly stems from the lack of clarity of our methods, which we improved in this version. We had used quantile normalisation on FPKM normalised values, which is an inter-sample-normalisation and normalises for overall library sizes between samples. In this revised version, we further repeat the analysis with VST normalisation following the DESeq2 pipeline and confirm the robustness of our results. The details are provided under the relevant comments below.

2. The authors often support their conclusions on low effect sizes and significance above usual thresholds. It would be important to explicitly discuss these as caveats, as well as strongly tone down these conclusions (including in the title). More generally, the reviewers felt that potential alternative hypotheses needed to be clearly laid out, and potentially ruled out, for the manuscript to stand more strongly (see specific comments by Reviewers #1, 2 and 3).

We thank the reviewers for raising this issue. We have now (1) explained the difference between FPR (now called eFPP to prevent confusion) and FDR in the methods, to clarify that we use the same significance cutoff across different analyses, and also provide an additional measure that estimates the false positive proportion (not a p-value) based on permutations; (2) toned down the manuscript, including the title; (3) included additional discussion to summarise all the limitations laid out throughout the manuscript; (4) specifically tested the alternative hypothesis suggested by Reviewer #1; (5) proposed ageing-related increase in inter-cellular noise (heteroskedasticity) as a possible explanation for the convergence observed during ageing.

3. In general, the reviewers felt that additional information on methods was required for reproducibility (see individual reviewer reports for specific points).

We have re-written a significant proportion of the methods and included detailed explanations in the figure legends where applicable to improve reproducibility.

Reviewer #1 (Recommendations for the authors):

1. "Such "runaway development" phenomena may be due to insufficient natural selection pressure to optimise regulation after sexual reproduction starts." As stated, this sentence suggests that maladaptive traits that could lower an organism's total reproductive success would not be selected against as long as they only appear after the beginning of reproduction. I would perhaps ask that here the authors cite, not a general review of evolutionary theories of aging, but a publication that specifically proposes that alleles that are deleterious as early as the beginning of sexual reproduction nevertheless are not selected against.

We thank the reviewer for the suggestion. The article by de Magalhaes and Church is the review article in which the authors revised the developmental theory of ageing, especially in the light of genomic studies, and thus is the original article where the ‘runaway development’ phenomenon was introduced in these exact terms. We have now also included the original article by Mikhail V. Blagosklonny, where a related phenomenon, the hyperfunction theory, was first proposed (Blagosklonny, 2006) – however, this is again a conceptual review where the theory was first proposed based on observations on many different grounds. Therefore, we have now also included papers with experimental support for this notion (Ezcurra et al., 2018; Lind et al., 2019). Moreover, we have now included a statement in the discussion to convey the message that we observe a reversal in the majority of the genes, but this does not necessarily negate the developmental or hyperfunction theory, which might be relevant for a limited number of pathways and may contribute to the ageing phenotype (lines 550-553). Finally, we now cite (Fisher, 1930; Medawar, 1952; Williams, 1957), which suggest that the force of natural selection declines with age.

2. Lines 84-91, 145-153, 229-240, 245-250, etc. Here the font size changes in the manuscript which I assume is not intended.

Thank you for pointing this out – we have now corrected it.

3. Figure 1 caption. "Similarities were calculated using Spearman correlations coefficient between expression-age correlations across tissues." Here I suspect that "Spearman correlations coefficient" was intended to be "Spearman's correlation coefficient".

Thank you – we have changed it and carefully checked the manuscript one more time to prevent such typos.

Reviewer #2 (Recommendations for the authors):

Although the authors report an intriguing finding, there are major issues in the manuscript as it stands, notably concerning the clarity and rigor of the data analysis and manuscript.

1. For the analysis of their RNA-seq dataset across samples, the authors describe using the FPKM framework (methods, line 438). This is very problematic for a number of reasons – use of FPKM for differential gene analysis across samples is deprecated because it poorly controls for inter-sample variability [PMID: 22988256; PMID: 32284352]. FPKM are not comparable between samples because the normalization is sample-specific – thus, since no cross-library normalization step is performed, it should never be used for between sample comparison. Thanks to community benchmarking efforts, TMM or VST normalized counts are widely believed to be the superior choice for DGE analysis [PMID: 22988256; PMID: 32284352]. Thus, it is crucial that the authors correct their analysis to avoid the use of FPKM, and show that their conclusions would hold with an analysis on TMM or VST normalized counts.

We thank the reviewer for their comment. We had used FPKM followed by quantile normalisation, which performs an inter-sample normalisation. We had mentioned this at lines 443, 662, and 682 in the previous version, but had not explained this normalisation procedure in detail. Now the relevant section in the methods is updated to explain the quantile normalisation method (line 690). The main reason we chose quantile normalisation is that we analysed both RNA-seq and microarray data and this normalisation method can be applied to both, following other background normalisations and transformations (i.e. FPKM + log2 transformation for RNA-seq and RMA correction and log2 transformation for microarray). However, we agree with the reviewer that VST and TMM are more conventional, as these are already implemented in the widely used analysis pipelines of DESeq2 and edgeR.
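To make the inter-sample step concrete, the following is a minimal sketch of textbook quantile normalisation applied to a genes × samples matrix of log2-transformed values. It is not the authors' code: the function name `quantile_normalise`, the toy matrix, and the arbitrary handling of ties are assumptions made for illustration only.

```python
def quantile_normalise(matrix):
    """Quantile-normalise a genes x samples matrix (list of rows).

    Illustrative sketch only: ties within a sample are broken by their
    original order rather than averaged, which is an assumption.
    """
    n_genes = len(matrix)
    n_samples = len(matrix[0])

    # Sort the values of each sample (column) independently.
    sorted_cols = [sorted(row[j] for row in matrix) for j in range(n_samples)]

    # Reference distribution: mean across samples at each rank.
    reference = [
        sum(col[rank] for col in sorted_cols) / n_samples
        for rank in range(n_genes)
    ]

    # Replace every value by the reference value at its within-sample rank.
    normalised = [[0.0] * n_samples for _ in range(n_genes)]
    for j in range(n_samples):
        order = sorted(range(n_genes), key=lambda i: matrix[i][j])
        for rank, gene in enumerate(order):
            normalised[gene][j] = reference[rank]
    return normalised


# Toy example: three genes, two samples with different overall scales.
log2_fpkm = [[2.0, 4.0], [5.0, 14.0], [4.0, 8.0]]
normalised = quantile_normalise(log2_fpkm)
```

After this step every sample shares the same distribution of values, which is why the procedure removes library-wide differences between samples and can be applied to RNA-seq and microarray data alike.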

We thus repeated the analysis of our dataset using VST and confirmed the main results.

We replicated our findings presented in Figure 1 using VST-normalised data. The results are presented in Figure 1—figure supplement 10 and 11:

i) We first compared VST- and QN-processed data in terms of age-related expression change trends in each tissue and period separately, without applying a significance cutoff (across n=14,973 genes). We found high correlations in all 8 comparisons (⍴dev=[0.81, 0.94], ⍴ageing=[0.83, 0.93], Figure 1—figure supplement 11).

ii) Repeating PCA analysis with VST-normalised data, we show that variation in gene expression (n=14,973 genes) is largely explained by tissue differences across PC1, 2, and 3 (ANOVA p<10⁻¹⁶ for PC1-3). In PC4, which explains 7% of the total variation (as opposed to 8% in Figure 1b), we observe an age-effect across tissues in development (Spearman’s correlation coefficient ⍴=[0.57, 0.99], nominal p<0.01 for each test except in muscle; Figure 1—figure supplement 10b). We also confirmed that tissues diverge during postnatal development, although the convergence during ageing was not significant (change in mean Euclidean distance among tissues with age in PC1-4 space, ⍴dev=0.937, pdev=0.00185; during ageing ⍴ageing=-0.58, pageing=0.102, Figure 1—source data).

iii) Similar to the Figure 1c results, we find that tissue pairs show weak correlations in their age-related expression change trends measured with VST-normalised data (n=14,973 genes), both during development (⍴=[0.19, 0.39]) and during ageing (⍴=[0.25, 0.34]). Furthermore, we find that the number of genes with the same direction of change (with or without significance cutoff) across four tissues with VST-normalised data is comparable to the Figure 1d and Figure 1e results (Figure 1—figure supplement 10d-e).

iv) Reversal analysis with VST-normalised data revealed that ~50% (42-59%) of expressed genes (n=14,973) showed reversal patterns (without significance cutoff) in tissues, comparable to the Figure 1f result (Figure 1—figure supplement 10f). Comparing VST- and QN-normalised data, we found that 70-79% of the genes showed similar reversal or continuous patterns across tissues.
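The reversal analysis in point iv compares, gene by gene, the direction of expression change in development with the direction in ageing, with no significance cutoff. A minimal sketch of that comparison is given below; the function names, the pattern labels, and the toy correlation values are hypothetical, and genes with a correlation of exactly zero are simply left unclassified here, which is an assumption.

```python
def classify_trend(rho_dev, rho_ageing):
    """Label a gene by the signs of its expression-age correlations.

    Returns (pattern, kind), where kind is 'reversal' when the direction
    of change flips between development and ageing, and 'continuous'
    when it stays the same. Labels are illustrative.
    """
    if rho_dev == 0 or rho_ageing == 0:
        return ("flat", "unclassified")
    dev = "up" if rho_dev > 0 else "down"
    ageing = "up" if rho_ageing > 0 else "down"
    kind = "reversal" if dev != ageing else "continuous"
    return (f"{dev}-{ageing}", kind)


def reversal_proportion(correlations):
    """Proportion of classified genes showing a reversal pattern.

    correlations: iterable of (rho_dev, rho_ageing) pairs, one per gene.
    """
    kinds = [classify_trend(d, a)[1] for d, a in correlations]
    classified = [k for k in kinds if k != "unclassified"]
    return sum(k == "reversal" for k in classified) / len(classified)


# Toy input: four genes, two of which flip direction after development.
toy = [(0.5, -0.2), (0.3, 0.4), (-0.6, 0.1), (-0.2, -0.3)]
proportion = reversal_proportion(toy)
```

Applying the same classification to two differently normalised versions of the data and counting genes with matching labels would give an agreement figure of the kind reported above.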

We confirmed our findings from Figure 2 with VST normalisation and presented the results in Figure 2—figure supplement 14:

i) We calculated CoV values across all expressed genes (n=14,973) among tissues in VST-normalised data and confirmed a significant mean CoV increase in development (Spearman’s correlation coefficient ⍴=0.99, p=10⁻⁵), consistent with QN normalised data that yielded ⍴=0.77 (p=0.041). During ageing, the decrease in mean CoV with age (⍴=-0.48, p=0.23) was comparable to the results shown in Figure 2a (⍴=-0.50, p=0.204).

ii) We replicated the difference between development and ageing in the proportion of genes showing divergent versus convergent patterns (at gene-wise FDR<0.1) with the VST-normalised data (Figure 2—figure supplement 14d-e). Specifically, VST-normalised data displayed 89% divergence in development (among 3,476 significant genes) as opposed to 70% divergence in QN normalised data (among 2,581 significant genes). Likewise, during ageing, VST-normalised data displayed 68% convergence (among 19 significant genes) similar to 68% convergence in QN normalised data (among 62 significant genes).

We conclude that the choice of the normalisation method does not affect our results. We now present the VST-based results briefly in the text (lines 143-146, 263-265) to prevent redundancy and provide detailed explanations in the supplementary figure legends (Figure 1—figure supplement 10, Figure 2—figure supplement 14). We now also explain the VST normalisation steps in the Methods (line 675).

2. There are issues regarding consistency in analytical choices throughout the paper.

a. Application of significance cutoffs vary widely in the manuscript when selecting specific gene sets. Reasoning should be provided for each case in which significance cutoff was (or not) applied (e.g. Line 111). For example, it is rather concerning that in Figure 1d, no gene was significantly up-regulated with age in the muscle (significance cutoff applied), but in Figure 1f, a large proportion of genes are shown to increase in expression with age in the muscle (significance cutoff not applied).

We thank the reviewer for their comment. We actually analyse all the results in two ways: (1) using a consistent significance cutoff (multiple testing (FDR) adjusted p-value<0.1, or if no multiple testing was applied, p-value<0.05), and (2) without using a significance cutoff to assess transcriptome-wide trends. We emphasise this as soon as the first sentence of the Results section titled “Tissues involve common gene expression changes with age”. We now added a second sentence to emphasise this point:

“We next characterised age-related changes in gene expression shared across tissues by (i) studying overall trends at the whole transcriptome level and testing their consistency using permutation tests, and (ii) studying statistically significant changes at the single gene level.”

This has two motivations. First, given the low signal/noise ratios during mammalian ageing, identifying statistically significant patterns in individual genes is frequently difficult, but transcriptome-wide patterns can be more effectively identified. Second, observing the same trends using both approaches increases the robustness of our conclusions and may suggest that the observations are a genome-wide phenomenon, not restricted to a small number of genes. However, since transcriptome-wide trends are more prone to technical artefacts, we further test the genes that show statistically significant changes.

We now attempted to further clarify where a significance cutoff was and was not used in the analyses, by updating the text to emphasise the gene set of interest for each analysis (lines 131, 140).

We also thank the reviewer for pointing out the ambiguity about muscle genes. In Figure 1f, we study global reversal trends in each tissue for all the genes without any significance cutoff, i.e. only analysing expression change trends (following (Dönertaş et al., 2017)). Thus, although there are no statistically significant ageing-related genes in muscle tissue (Figure 1d), we observe a transcriptome-wide reversal trend (Figure 1f). We now updated the figure text to explain clearly what is presented (lines 183-187).

b. Authors are not consistent with the FDR threshold applied (e.g. FDR < 10% in Line 123, FDR 20% in Line 127). The reasoning for the chosen thresholds, as well as the different choices should be provided for cases in which higher than standard FDR is applied.

We thank the reviewer for pointing out that the difference between FDR (e.g. line 123 in the previous version) and FPR (e.g. line 127 in the previous version) was not clearly explained in the text. FDR is the multiple testing correction ‘false discovery rate’ applied using the ‘p.adjust’ function with BH procedure in R, which we did not clearly state in the first submission. We updated the text to explain that FDR is applied to correct the p-values for all statistical tests (functional association tests were corrected with BY procedure in the previous version but for consistency, we used FDR in all the tests in the current version – see point d.) (lines 711-714).
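For readers less familiar with the BH procedure mentioned above, the sketch below reimplements the standard Benjamini-Hochberg adjustment in plain Python. It is meant only to illustrate the kind of adjusted p-values that a BH correction yields; the function name `bh_adjust` and the example p-values are hypothetical, and this is not the authors' R code.

```python
def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values (step-up procedure).

    Each p-value is scaled by m / rank, then a running minimum taken
    from the largest p-value downwards enforces monotonicity.
    """
    m = len(pvalues)
    # Indices ordered from the largest to the smallest p-value.
    order = sorted(range(m), key=lambda i: pvalues[i], reverse=True)
    adjusted = [0.0] * m
    running_min = 1.0
    for position, index in enumerate(order):
        rank = m - position  # 1-based rank in ascending order
        running_min = min(running_min, pvalues[index] * m / rank)
        adjusted[index] = running_min
    return adjusted


# Toy example with four tests; genes with an adjusted value below 0.1
# would pass the cutoff described in the response.
raw = [0.01, 0.04, 0.03, 0.5]
adjusted = bh_adjust(raw)
significant = [q < 0.1 for q in adjusted]
```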

We have also updated the text to replace all “FDR<0.1” with “FDR corrected p-value<0.1”.

We had used the term FPR as an estimate of the false-positive rate calculated using random permutations (FPR = median number of positives observed across permutations / number of positives observed in the data). We provide this metric with permutation test results as additional information on the effect size, not as an alternative to FDR corrected p-values. However, thanks to the reviewer’s comment we realised that the phrase FPR could be confused with other metrics. We, therefore, redefined it as eFPP, ‘estimated false-positive proportion’ (lines 740-743). We now updated the relevant sections to make sure that these two definitions are not interchangeable. We also removed eFPP from the main text to avoid confusion and provide this information only in the supplementary figures.
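The eFPP definition given above (median number of positives across permutations divided by the number of positives observed in the data) can be written out as a short sketch. The function name and the example counts are hypothetical, and how the permutations themselves are generated is left outside the sketch.

```python
import statistics


def estimated_fpp(observed_positives, permuted_positives):
    """Estimated false-positive proportion from permutation counts.

    observed_positives: number of positives found in the real data.
    permuted_positives: number of positives found in each permutation.
    """
    if observed_positives <= 0:
        raise ValueError("eFPP is undefined without observed positives")
    return statistics.median(permuted_positives) / observed_positives


# Toy example: 200 genes pass in the real data, while permuted data
# typically yield about 40, so roughly a fifth may be false positives.
efpp = estimated_fpp(200, [30, 50, 40, 45, 35])
```

Unlike a p-value, this ratio describes how much of the observed signal would be expected by chance, which is why it is reported as supplementary information on effect size.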

c. In Figure 1, authors show that the muscle presents with the least number of significantly changed gene expression patterns (up and down) with age. Additionally, in the manuscript, the authors explain that the lack of genes that show statistically significant change in the muscle is supported by a previous study that also showed "weak ageing transcriptome signature across multiple datasets (Line 137)." Thus, it is rather concerning that excluding the cortex, a tissue with a large number of differentially expressed genes, for CoV calculation resulted in greater significance.

We believe that the reviewer’s concern perhaps stems from our poor wording of CoV calculation in the text. For each individual, we calculated CoV across tissues using all expressed genes (n=15,063) regardless of their specific age-related expression changes (significant or not) in any tissue. Then, we summarised these gene-wise CoV values by calculating their mean to assess a genome-wide variation score among tissues of an individual and how these scores change with age (Figure 2a). We have now updated the relevant Results section to explain the CoV calculation in more detail (lines 221-223). We believe that the significant age-related expression changes in one tissue should not have a direct impact on the results as the number of significantly changed genes during ageing in the cortex was also very small (n=68).
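The CoV summary described above can be sketched as follows: for one individual, compute the coefficient of variation of each gene across tissues, average over genes, and then correlate these per-individual means with age using Spearman's coefficient. The function names, the use of the sample standard deviation, and the toy numbers are assumptions for illustration; expression values are assumed to have positive means.

```python
import statistics


def mean_cov(expr_by_tissue):
    """Mean across genes of the across-tissue CoV for ONE individual.

    expr_by_tissue: one list per tissue, each holding one value per gene.
    Uses the sample standard deviation (an assumption).
    """
    covs = []
    for gene_values in zip(*expr_by_tissue):
        mean = statistics.fmean(gene_values)
        covs.append(statistics.stdev(gene_values) / mean)
    return statistics.fmean(covs)


def ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            result[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return result


def spearman(x, y):
    """Spearman's correlation: Pearson correlation of the ranks."""
    rx, ry = ranks(x), ranks(y)
    mx, my = statistics.fmean(rx), statistics.fmean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Toy example: one individual, three tissues, two genes.
one_mouse = mean_cov([[10.0, 20.0], [10.0, 40.0], [10.0, 30.0]])

# Toy example: per-individual mean CoV against age within one period.
rho = spearman([2, 10, 30, 90], [0.1, 0.3, 0.2, 0.4])
```

A positive coefficient within the developmental period would indicate divergence among tissues, and a negative one within the ageing period would indicate convergence.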

Despite this, we repeated the same analysis, this time excluding the muscle tissue, which resulted in a nonsignificant but similar pattern during ageing (Spearman’s correlation test between mean CoV and age (n=15): ⍴dev=0.85, pdev=0.016; ⍴ageing=-0.29, pageing=0.49, Author response image 1a, right panel), and the lack of significance again might be driven by the lack of one cortex sample.

Author response image 1. Age-related change in CoV summarised across genes excluding cortex or muscle.

The exclusion of the cortex data was motivated by the fact that we were missing an aged individual in this tissue. However, estimating CoV using only three points may be suboptimal, and we, therefore, decided to be conservative and thus removed this analysis (i.e. results excluding cortex) from the manuscript.

Change in CoV using 15 samples from three tissues (one oldest individual (904 days of age) excluded due to its missing data in the cortex), either excluding cortex (left column) or muscle (right column) tissue. Each point represents the (a) mean CoV, (b) median CoV value of all protein-coding genes (15,063) for each mouse. x-axis is in log2 scale. The dashed grey line shows the start of the ageing period. The Spearman’s correlation coefficients and p values for each period are indicated separately on the plots.

d. The authors use a mix of multiple correction methods without clear explanations as to why a consistent measure isn't used (i.e. FDR/BH, BY) – BY test is justified for GO analysis line 472, but why BY was not used for DGE analysis is not clear (line 462).

We thank the reviewer for this point. For consistency, we now use FDR (BH) for multiple testing correction for all statistical tests throughout the text. We now updated the relevant sections (lines 711, 713-714).

e. A number of statistical analyses performed in the manuscript do not pass the standard significance thresholds (FDR < 5% or p < 0.05). The standard applied in the field is usually p-value or FDR < 5%, although relaxing this threshold can be done when dealing with noisier datasets. When using more relaxed FDR or p-value thresholds, the authors should also be careful to strongly tone down statements proportionally to the lower statistical support for their claims (e.g. Line 240: "a SIGNIFICANT divergence-convergence pattern").

We thank the reviewer for the comment. We believe that the 10% FDR cutoff allows a compromise between overall false positives and false negatives, and is commonly used as a threshold in the transcriptomics literature, including recent analyses on the association between gene expression and splicing (DOI: 10.7554/eLife.67077), expression and neurodegeneration (DOI: 10.7554/eLife.64564), or expression and psychosocial experiences (DOI: 10.7554/eLife.63852).

Using a more stringent threshold would reduce the false positive probability per gene. However, our emphasis in this manuscript is not individual genes but transcriptome-wide patterns, which we test using the whole transcriptome without significance threshold, and also with significant genes (with an FDR corrected p-value<0.1). The particular instance where we use ‘significant’ in line 240 refers to the result of Spearman’s correlation test between mean pairwise correlations among tissues and age in development and ageing periods.

That said, we did follow the reviewer’s suggestion and toned down our conclusions. Accordingly, we changed the referred line to ‘we observed the same divergence-convergence pattern’ (lines 286-287).

3. The authors use a dataset from exclusively male mice. Since aging is known to be extremely sex-dimorphic [PMID: 27304504], the authors need to explicitly discuss this caveat/limitation in the main text (i.e. these observations may not hold in female animals). If the authors wish to make a more general statement, a suggestion would be to reanalyze data from the recent Tabula Muris Senis Companion paper Schaum et al., [PMID: 32669715], which includes similar tissues, "developmental" time points (1 vs 3 months) and aging course (3 to 27 months), and both sexes to extend the findings. In addition, the Schaum dataset is also RNA-seq, and thus more directly comparable to the author's dataset compared to the microarray Jonker dataset used for partial validation (Figure 2 – supplement 8).

We thank the reviewer for this valuable comment. In fact, we identify convergent trends in ageing in both male and female animals, in our dataset (only males) and in the Jonker et al. dataset (only females), respectively. We further find that convergent genes overlap between the two datasets, although modestly (OR = 1.22, p = 1.67×10⁻⁸, Figure 4—figure supplement 2b). This suggests that DiCo is not unique to a single sex in mice. We now explain this in lines 430-434, 614-616.

We would also have been interested in testing possible sex dimorphism in DiCo, but this is not possible given technical differences between the two datasets (i.e. Jonker et al.'s and ours).

The GTEx dataset includes both male and female humans and displays patterns consistent with convergence during ageing, although not significantly (Figure 2—figure supplement 8-11). Following the reviewer's point, we repeated the GTEx dataset analysis separately for females and males. We observed a stronger convergence in females than in males, although none of these results were significant (⍴female=-0.58, pfemale=0.059; ⍴male=-0.052, pmale=0.77) (Figure 2—figure supplement 16). The female and male GTEx samples differ both in sample size (n=11 vs n=36, respectively) and age range (no male individuals within 20-29 and 70-79 age groups), which could explain the observed differences between sexes.

We now discuss possible sex-dimorphism in Discussion as a limitation of this study and suggest a more comprehensive analysis with different datasets in the future (line 612-620).

Finally, we thank the reviewer for suggesting including the Schaum et al. dataset. We now included this dataset, analysing both the same 4 tissues we have and a broader set of 8 tissues (Figure 2—figure supplement 17-20). However, we could not include the development period as the sample size was very small for the two time points, 1 and 3 months of age. We could not confirm the transcriptome-wide trend towards convergence during ageing in this dataset and found that the muscle and subcutaneous fat tissues showed the most divergent patterns.

However, we confirmed a strong association between the loss of identity and convergence during ageing. Not having the developmental period for the external datasets limited the complete replication of our original study, which focuses on the DiCo pattern, divergence followed by convergence. Nevertheless, a trend towards convergence during ageing was observed at the transcriptome level in Jonker et al., and GTEx and the association between convergence and loss of identity was evident across all datasets. We now updated the text to reflect that the support for the transcriptome-wide trend is weak and our most robust result is the association between identity loss and convergence.

We now add Figure 4—figure supplement 2 and Table 1 to summarise all results across datasets.

4. The authors should consider revising the title, since the current form seems to indicate that the work discovered a significant loss of cellular identity during aging. This statement is too strong based on the analyses of the paper. We suggest toning down the title, for instance to "Transcriptomic analyses suggest inter-tissue convergence of gene expression and loss of cellular identity during ageing".

Thank you for the suggestion – we have now toned down the title: ‘Inter-tissue convergence of gene expression during ageing suggests age-related loss of tissue and cellular identity’

Reviewer #3 (Recommendations for the authors):

Authors only analyze the gene sets that are correlated with age i.e. linearly go up or down with age. However, ageing trajectories for different genes can be very dynamic. This is the likely reason why only a few age-related changes were detected in some tissues. To assess the robustness of some of the key observations, authors should consider an orthogonal and unbiased clustering analysis similar to Figure 2 supplement to get the patterns for gene expression changes with age and to what extent DiCo is observed with different clusters.

We thank the reviewer for the comment. We used Spearman’s correlation in order to capture potential non-linear but monotonic changes. However, we agree with the reviewer that Spearman’s correlation cannot identify non-monotonic changes. As suggested, we have now performed k-means clustering for each tissue to identify age-related expression patterns and then performed an enrichment test to assess whether DiCo genes are enriched or depleted in any cluster. The results are now presented as supplementary figures (Figure 1—figure supplement 12-15). Notably, in none of the tissues did we observe a bias towards clusters with or without a linear pattern with age. For example, the cortex tissue displayed 15 clusters of expression changes, among which 5 are enriched and another 5 are depleted in DiCo (Figure 1—figure supplement 12). Two of the DiCo enriched clusters (2/5) do not have linear trajectories with age (clusters 12 and 14) whereas three of the DiCo depleted clusters (3/5) display linear patterns (clusters 6, 7 and 15). We have now presented this result in the relevant section (lines 276-277).
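
The enrichment step described above can be illustrated with a minimal sketch. It assumes a one-sided hypergeometric test on a 2x2 table of cluster membership against DiCo status; the response does not state which test was used, so the test choice, the function names and the toy gene identifiers below are all illustrative assumptions rather than the authors' implementation.

```python
from math import comb


def hypergeom_tail(k, cluster_size, n_dico, n_total, upper=True):
    """P(X >= k) if upper else P(X <= k), where X is the number of DiCo genes
    among `cluster_size` genes drawn without replacement from `n_total` genes,
    `n_dico` of which are DiCo."""
    lo, hi = (k, min(cluster_size, n_dico)) if upper else (0, k)
    total = comb(n_total, cluster_size)
    hits = sum(
        comb(n_dico, i) * comb(n_total - n_dico, cluster_size - i)
        for i in range(lo, hi + 1)
    )
    return hits / total


def cluster_enrichment(cluster_genes, dico_genes, background_genes):
    """Odds ratio and one-sided p-values for DiCo genes within one cluster."""
    background = set(background_genes)
    cluster = set(cluster_genes) & background
    dico = set(dico_genes) & background
    a = len(cluster & dico)          # in cluster, DiCo
    b = len(cluster - dico)          # in cluster, not DiCo
    c = len(dico - cluster)          # outside cluster, DiCo
    d = len(background) - a - b - c  # outside cluster, not DiCo
    odds_ratio = float("inf") if b * c == 0 else (a * d) / (b * c)
    return {
        "table": (a, b, c, d),
        "odds_ratio": odds_ratio,
        "p_enriched": hypergeom_tail(a, len(cluster), len(dico), len(background)),
        "p_depleted": hypergeom_tail(a, len(cluster), len(dico), len(background), upper=False),
    }


# Toy example with hypothetical gene identifiers (not data from the study).
toy_background = [f"g{i}" for i in range(1, 11)]
toy_dico = ["g1", "g2", "g3", "g4"]
toy_cluster = ["g1", "g2", "g5"]
toy_result = cluster_enrichment(toy_cluster, toy_dico, toy_background)
```

In practice the p-values from all clusters of a tissue would also need multiple-testing correction, which this sketch omits.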

Some of the key observations have either low effect size or are not statistically significant. For example, inter-tissue expression similarity during ageing and development (lines 115-119), the expression reversal (lines 158-161), changes in coefficient of variation with age (lines 202-207) etc. Authors clearly mention these in the manuscript, but it raises concerns on the extent to which some of these observations might be random, or due to idiosyncrasies of RNA-seq itself. These caveats should also be discussed.

We have now added a limitations paragraph in the Discussion, detailing both this aspect and other limitations raised by the other reviewers (lines 605-620).

In the global PCA based analysis for figure 1 and 2 combining both development and ageing data, the variance can be driven by one of the two states. For example, transcription divergence analysis in PCA in figure 1 based on development alone is not significant which suggests that the divergence of the developmental samples is driven in-fact by ageing. For PCA based analysis using separate PCAs for development and ageing will be more statistically sound.

We thank the reviewer for this suggestion. We have followed it and analysed age-related convergence using PCA performed only on ageing samples. We now analyse Euclidean distance in PC1-4 (i) using all samples together (across the lifetime – Figure 1a), (ii) using samples from the developmental period only (Figure 1—figure supplement 3a-b), and (iii) using samples from the ageing period only (Figure 1—figure supplement 3d-e). For the PCA based on all the samples together, the change in mean Euclidean distance among tissues with age was significant both during development and during ageing (development: ⍴dev=0.99, pdev=1.5×10⁻⁵; ageing: ⍴age=-0.87, page=0.0026). When we performed the PCA separately for the two periods, we still observed a significant divergence among tissues during development (⍴=0.95, p=0.0008) and a marginally significant convergence during ageing (⍴=-0.64, p=0.0596). We hope that this analysis addresses the reviewer’s point. We also note that our earlier analysis had an error (we had used the PC3-4 space to calculate the Euclidean distance for the mouse dataset, mistakenly excluding PC1-2), which we have now corrected (Figure 1, Figure 1—figure supplement 3).
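
As a rough illustration of the distance-based test described above, the sketch below takes PC scores as given (the PCA itself is not shown), computes the mean pairwise Euclidean distance among tissues for each individual, and correlates that distance with age using Spearman's coefficient. The tissue coordinates, ages and helper names are toy assumptions, and no p-value is computed.

```python
from itertools import combinations
from math import dist, sqrt


def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman's rho, computed as Pearson's correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sqrt(sum((a - mx) ** 2 for a in rx))
    sy = sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)


def mean_pairwise_distance(tissue_scores):
    """Mean Euclidean distance over all tissue pairs of one individual.
    `tissue_scores` maps tissue name -> tuple of PC scores (e.g. PC1-4)."""
    pairs = list(combinations(tissue_scores.values(), 2))
    return sum(dist(p, q) for p, q in pairs) / len(pairs)


# Toy data (hypothetical): four tissues whose PC1-4 coordinates shrink
# towards the origin with age, so inter-tissue distances decrease.
toy_base = {
    "cortex": (4.0, 0.0, 0.0, 0.0),
    "lung": (0.0, 4.0, 0.0, 0.0),
    "liver": (0.0, 0.0, 4.0, 0.0),
    "muscle": (0.0, 0.0, 0.0, 4.0),
}
toy_ages = [100, 400, 900]      # toy ages in days
toy_scales = [1.0, 0.8, 0.5]    # toy shrinkage per individual
toy_distances = [
    mean_pairwise_distance({t: tuple(s * v for v in pc) for t, pc in toy_base.items()})
    for s in toy_scales
]
toy_rho = spearman(toy_ages, toy_distances)  # negative rho indicates convergence
```

A positive coefficient over the developmental samples and a negative one over the ageing samples would correspond to the divergence and convergence trends reported above.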

Are there any known gene categories that have stronger expression reversal trends? For example, do known transcription factors or pathways that regulate development show DiCo if analyzed separately? Or is this trend only seen at the genome-wide level and completely stochastic?

We thank the reviewer for the suggestion. We have now used the MiRTarBase database for miRNAs and TRANSFAC for transcription factors, and tested whether any specific regulator is associated with DiCo pattern genes. We did not find any significant association (lines 353-357). This may be consistent with DiCo being a transcriptome-wide and diffuse phenomenon caused by stochastic changes, as the reviewer indicates. It is also possible that the pattern is influenced not by specific regulators but by a change in their cooperative action, or that we lack sufficient power with the current datasets to identify subtle effects of individual regulators. We now also mention this in the Discussion (lines 601-603).

For functional enrichment analyses authors only mention some selected functions in the text but GO terms have a lot of redundancy and it is unclear if there is any emerging functional trends looking beyond the redundancy or selected examples. It will be useful to thoroughly analyze and summarize the enriched GO terms to assess for any functional patterns that connect functional decline with ageing and convergent expression or DiCo.

In the previous version, enriched GO terms were summarised according to their Jaccard similarities using the ‘emapplot’ function of the ‘enrichplot’ package. However, we agree with the reviewer that the enrichment network did not remove redundancy among the categories, and might have hidden unique and interesting functions.

We have now used another approach to summarise and interpret DiCo enriched categories in more detail. First, we clustered the categories based on the number of genes they share using hierarchical clustering and cut the tree into 25 clusters. For each cluster, we chose a representative category that has the highest mean Jaccard similarity to the other categories in the same cluster (Dönertaş et al., 2021). Then, we calculated the mean expression changes (⍴ between expression and age) of the genes in the representative categories for each tissue, separately (Figure 4h). We believe that this new analysis provides better insight into our main findings: tissue-related functions gained during development display opposing expression trends during ageing, i.e. a reversal pattern, supporting the loss of tissue-specific expression and thus contributing to convergence during ageing. We have now updated the main text and methods (lines 343-351, 874-881).
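
The representative-selection step above can be sketched as follows. The sketch assumes that the cluster memberships are already available from the hierarchical clustering (which is not reproduced here), and the category names and gene sets are hypothetical.

```python
def jaccard(a, b):
    """Jaccard similarity between two gene sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def representative_category(categories, members):
    """Return the member whose mean Jaccard similarity to the other
    members of the same cluster is highest.
    `categories` maps category name -> set of genes;
    `members` lists the category names in one cluster."""
    if len(members) == 1:
        return members[0]

    def mean_similarity(name):
        others = [m for m in members if m != name]
        return sum(jaccard(categories[name], categories[m]) for m in others) / len(others)

    return max(members, key=mean_similarity)


# Toy categories (hypothetical names and genes, not results from the study).
toy_categories = {
    "cat_A": {"g1", "g2", "g3"},
    "cat_B": {"g2", "g3", "g4"},
    "cat_C": {"g3", "g4", "g5"},
}
toy_representative = representative_category(toy_categories, ["cat_A", "cat_B", "cat_C"])
```

Here `cat_B` shares the most genes with both neighbours, so it would stand in for the whole cluster when summarising expression changes per tissue.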

Is there a significant overlap between mouse and human ageing related changes at the gene or pathway level?

We thank the reviewer for the suggestion. We now compare the correlations between ageing-related expression changes and the overlap between convergent genes during ageing (Figure 4—figure supplement 2). While the mouse datasets, especially our data and Jonker et al., showed higher correlations among each other, the correlations between age-related expression changes in the human and mouse datasets were smaller but overall positive when we consider the same tissues in both datasets (average correlations between GTEx and mouse datasets are ⍴Jonker=0.12, ⍴Our_dat=0.13, ⍴Schaum=0.01). Convergent genes in ageing also showed weak but significant overlaps across datasets. The human dataset showed significant overlap with the Jonker et al. and Schaum et al. datasets but not with ours. Nevertheless, the overlap for these datasets was also small. Importantly, apart from the differences in species, sex, age distribution and the technology used to generate the data, the small overlap between datasets may stem from the fact that the external datasets lack developmental samples. Thus, we can only compare convergence during ageing but not the DiCo pattern, the main focus of our study. Since only 62% of the convergent genes during ageing are divergent in development in our dataset, we should emphasise that low overlap for convergence does not rule out overlap across DiCo genes. Similarly, we use divergent genes during development as the background for our functional enrichment tests to find functional associations of DiCo. Since the external datasets lack this developmental background, we performed functional enrichment for only the convergent genes but did not find any significant association for the GTEx dataset.

Line 299 – linking the functional categories with cellular identity appears a bit hand-wavy. Stronger support would be needed to make this conclusion.

Thank you for the suggestion. We have now toned down that particular statement (lines 347-350), and also revised the manuscript in general to make sure we do not overstate our findings.

Some of the aspects need more methodological details:

For RNA-seq analysis, were all the samples normalized together, or were development and ageing samples normalized separately? It should be clarified in the methods.

We thank the reviewer for pointing out this ambiguity. We combined all the samples (n=63) regardless of their age (development and ageing together) or tissue, and performed quantile normalisation on log2 transformed (after adding 1) FPKM values. As FPKM normalisation does not account for inter-sample variability, we adopted this approach to control for cross-library variability between different tissues and ages. We have rewritten the relevant section in the methods to explain the normalisation in more detail (lines 666-685). The same approach is used for the VST-normalisation in this version, and development and ageing periods are normalised together.
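
The normalisation described above can be sketched in a few lines. This is a minimal illustration of log2(FPKM + 1) transformation followed by quantile normalisation across samples, not the code used for the study; the function names and toy values are assumptions, and ties within a sample are broken by input order rather than averaged.

```python
from math import log2


def log2_plus_one(samples):
    """Apply log2(x + 1) to every expression value.
    `samples` is a list of samples, each a list of per-gene values."""
    return [[log2(value + 1) for value in sample] for sample in samples]


def quantile_normalise(samples):
    """Give every sample the same distribution: the value at each rank is
    replaced by the mean of the values at that rank across all samples."""
    n_genes = len(samples[0])
    sorted_samples = [sorted(sample) for sample in samples]
    rank_means = [
        sum(sample[r] for sample in sorted_samples) / len(samples)
        for r in range(n_genes)
    ]
    normalised = []
    for sample in samples:
        order = sorted(range(n_genes), key=lambda g: sample[g])
        out = [0.0] * n_genes
        for rank, gene in enumerate(order):
            out[gene] = rank_means[rank]
        normalised.append(out)
    return normalised


# Toy example: two samples with three genes each (hypothetical values).
toy_logged = log2_plus_one([[0.0, 1.0, 3.0]])
toy_normalised = quantile_normalise([[2.0, 4.0, 6.0], [9.0, 1.0, 5.0]])
```

Normalising all samples in a single call, rather than per tissue or per age period, is what makes values comparable across tissues and ages in the downstream analyses.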

Were the mice perfused? The details should be added in methods. If the mice were not perfused, a significant shared proportion of gene expression can be due to blood components. Also, the possibility that some of the shared changes could be due to infiltration of immune and other cell types in tissues should be mentioned.

We thank the reviewer for this question. Indeed, infiltration of other cell types is a serious consideration in most bulk tissue analyses. Since we observe a similar trend between different cell types using scRNAseq data, and since DiCo is particularly associated with tissue-specific genes and not with any blood or immune-related signature, we believe the effect should be minimal in our analysis. However, the new dataset we included in the analysis, Schaum et al., did use perfused mice, and although the association between the loss of tissue identity and convergence was particularly strong, the transcriptome-wide trend did not support convergence. We have now included the potential influence of infiltration of other cell types as a limitation of our study in the Discussion (lines 606-612).

More details for single cell deconvolution should be added in methods.

We have now updated the deconvolution section in the methods to explain how we conducted the analysis in more detail (lines 1054-1064).

It is unclear to me how the age related gene expression and PC correlation was performed. Were the loadings underlying PC axes used for correlation analysis? It should be elaborated in methods.

We performed PCA on individuals’ gene expression values and used these PC scores of the individuals (not the loadings) and their ages to perform the correlation analysis. We have now updated the relevant section in the methods (lines 693-696).

Some of the supplemental figures are missing.

We are sorry for this confusion, but having checked our submission we could not detect any missing files. In case this might be an issue with the submission portal, for the second round we will update the bioRxiv version of the article where the reviewers can check all files separately if needed.

Associated Data

    Data Citations

    1. Izgi H, Han D, Isildak U, Huang S, Kocabiyik E, Khaitovich P, Somel M, Donertas HM. 2021. Bulk RNA-seq of mice covering the whole lifespan (2 days to 904 days) from four tissues. NCBI Gene Expression Omnibus. GSE167665
    2. Jonker MJ, Melis JP, Kuiper RV, van der Hoeven TV, Robinson J, van der Horst GT, Breit TM, Vijg J, Dollé ME, Hoeijmakers JH, van Steeg H. 2013. Aging Experiment. NCBI Gene Expression Omnibus. GSE34378
    3. GTEx Consortium et al 2017. Gene TPMs. GTEx Portal. phs000424.v8.p2
    4. Pisco A. 2020. Official data release for Tabula Muris Senis. figshare.
    5. Schaum et al 2020. Tabula Muris Senis: Bulk sequencing. NCBI Gene Expression Omnibus. GSE132040

    Supplementary Materials

    Figure 1—source data 1. Data summary, age-related expression patterns, and reversal patterns.
    Figure 2—source data 1. All the data related to divergence-convergence (DiCo) pattern: age-related coefficient of variation (CoV) change of genes, pairwise tissue expression correlations, analysis of independent datasets; GSE34378 (Jonker et al.), GSE132040 (Schaum et al.), and GTEx.
    Figure 3—source data 1. Effect sizes for determination of tissue-specific genes, enrichment of divergence-convergence (DiCo), and reversal genes within tissue-specific genes.
    Figure 4—source data 1. Gene Set Enrichment Analysis (GSEA) result of divergence-convergence (DiCo) genes, DiCo enrichment with tissue-specific expression loss, age-related expression change correlations, and convergence overlaps among datasets.
    Figure 5—source data 1. Cell-type proportion estimation and cell-autonomous changes using the Tabula Muris Senis dataset.
    Supplementary file 1. Gene set over-representation analysis (GORA) of age-related genes in tissues.

    Tissue-specific age-related gene expression changes and functional enrichment test results, performed with GORA using ‘topGO’ package.

    elife-68048-supp1.xlsx (2.5MB, xlsx)
    Supplementary file 2. Gene set over-representation analysis (GORA) of shared age-related genes among tissues.

    Functional enrichment for shared genes across tissues. The same GORA that was performed for Supplementary file 1 was used to test the enrichment of shared up-/downregulated genes in development among the background genes which are chosen as the all-significant age-related genes across tissues in development. We did not apply the test for the ageing period as there were no shared ageing-related expression changes.

    elife-68048-supp2.xlsx (152KB, xlsx)
    Supplementary file 3. Gene set over-representation analysis (GORA) of reversal patterns.

    Functional enrichment for gene expression reversals. GORA was performed with the same criteria as explained above. Up-down reversal genes were tested against up-up genes, and down-up reversal genes were tested against down-down genes in each tissue.

    elife-68048-supp3.xlsx (1.2MB, xlsx)
    Supplementary file 4. Gene set over-representation analysis (GORA) of divergence-convergence (DiCo) gene clusters determined with coefficient of variation (CoV) values.

    Functional enrichment of DiCo genes clustered with k-means algorithm according to their CoV values. GORA was performed using gene sets in each cluster (Figure 2—figure supplement 2) which were tested among all DiCo genes.

    elife-68048-supp4.xlsx (814.7KB, xlsx)
    Supplementary file 5. Gene set over-representation analysis (GORA) of divergence-convergence (DiCo) gene clusters determined with expression levels.

    Functional enrichment of DiCo genes clustered with k-means algorithm according to their expression levels. GORA was performed using gene sets in each cluster (Figure 2—figure supplement 3) which are tested among all DiCo genes.

    elife-68048-supp5.xlsx (1.9MB, xlsx)
    Transparent reporting form

    Data Availability Statement

    Sequencing data generated for this study have been deposited in GEO under accession code GSE167665. All data analysed during this study are included in the manuscript and supporting files. Source data files have been provided for all figures and figure supplements. Four additional and previously published datasets are used in this study: Jonker et al. 2013, GTEx Consortium et al. 2017, Schaum et al. 2020, and Tabula Muris Consortium 2020. All the code used to perform analyses is available in GitHub: https://github.com/hmtzg/geneexp_mouse (copy archived at swh:1:rev:1f2434f90404a79c87d545eca8723d99b123ac1c).

    The following dataset was generated:

    Izgi H, Han D, Isildak U, Huang S, Kocabiyik E, Khaitovich P, Somel M, Donertas HM. 2021. Bulk RNA-seq of mice covering the whole lifespan (2 days to 904 days) from four tissues. NCBI Gene Expression Omnibus. GSE167665

    The following previously published datasets were used:

    Jonker MJ, Melis JP, Kuiper RV, van der Hoeven TV, Robinson J, van der Horst GT, Breit TM, Vijg J, Dollé ME, Hoeijmakers JH, van Steeg H. 2013. Aging Experiment. NCBI Gene Expression Omnibus. GSE34378

    GTEx Consortium et al 2017. Gene TPMs. GTEx Portal. phs000424.v8.p2

    Pisco A. 2020. Official data release for Tabula Muris Senis. figshare.

    Schaum et al 2020. Tabula Muris Senis: Bulk sequencing. NCBI Gene Expression Omnibus. GSE132040


    Articles from eLife are provided here courtesy of eLife Sciences Publications, Ltd
