Abstract
Integrating transcriptional profiles yields gene expression signatures that are more robust than those obtained from individual datasets. However, a direct comparison of datasets derived from heterogeneous experimental conditions is problematic, hence their integration requires applying specific meta-analysis techniques. The transcriptional response to hypoxia has been the focus of intense research due to its central role in tissue homeostasis and prevalent diseases. Accordingly, many studies have determined the gene expression profile of hypoxic cells. Yet, despite this wealth of information, little effort has been made to integrate these datasets to produce a robust hypoxic signature. We applied a formal meta-analysis procedure to datasets comprising 430 RNA-seq samples from 43 individual studies including 34 different cell types, to derive a pooled estimate of the effect of hypoxia on gene expression in human cell lines grown in vitro. This approach revealed that a large proportion of the transcriptome is significantly regulated by hypoxia (8556 out of 20,888 genes identified across studies). However, only a small fraction of the differentially expressed genes (1265 genes, 15%) show an effect size that, according to comparisons to gene pathways known to be regulated by hypoxia, is likely to be biologically relevant. By focusing on genes ubiquitously expressed, we identified a signature of 291 genes robustly and consistently regulated by hypoxia. Overall, we have developed a robust gene signature that characterizes the transcriptomic response of human cell lines exposed to hypoxia in vitro by applying a formal meta-analysis to gene expression profiles.
Keywords: transcription, hypoxia, RNAseq, meta-analysis
1. Introduction
Oxygen homeostasis is essential to sustain cellular metabolism in eukaryotes. Hypoxia triggers multiple adaptive mechanisms, from metabolic reprogramming to tissue restructuring, aimed at re-balancing oxygen supply and demand [1]. In multicellular organisms this response can be very diverse, depending on cell type, extent and degree of the oxygen deprivation, or pathological state.
Most of these responses are orchestrated at the transcriptional level, with the Hypoxia Inducible Factors (HIFs) being the main drivers of the hypoxic gene expression pattern [2]. The heterodimeric HIF transcription factor consists of a constitutively expressed β subunit (ARNT) and an α subunit (HIF1A, EPAS1, HIF3A) which, in normoxic conditions, is marked for degradation by the concerted action of a family of oxygen-dependent enzymes (EGLN family) and the von Hippel-Lindau (VHL) ubiquitylation complex [3,4,5]. When oxygen concentration decreases, the α subunits escape degradation due to the reduced activity of the EGLNs, translocate to the nucleus and bind to Hypoxia Response Elements along with the β subunit. Transcriptional activity of HIFs also depends on interaction with co-activators such as CREB-binding protein or p300, whose binding is also regulated in an oxygen-dependent manner [6,7].
Given the importance of the transcriptional response for tissue oxygen homeostasis and its alteration in disease, a large number of works have attempted to identify the full set of genes regulated by hypoxia through gene profiling experiments. Since these studies were performed in a wide variety of experimental conditions (cell types, oxygen tension, exposure time), integrating their results could lead to the identification of a set of genes ubiquitously regulated by hypoxia, as well as genes whose alteration is restricted to specific situations in addition to hypoxia. However, little effort has been made in this regard and, to the best of our knowledge, only two attempts to integrate all the hypoxic gene profiling experiments have been reported [8,9]. The first analysis of this type, based on gene profiles generated by means of DNA microarrays, produced the first list of genes universally induced by hypoxia and revealed that the set of genes induced by hypoxia was more conserved than the set of repressed genes [8]. A second, more recent study exploited the information derived from RNA-seq experiments, producing a more comprehensive list of hypoxia-regulated genes, and characterized HIF-isoform common and specific targets [9]. In spite of their merit, neither of these works employed a formal meta-analysis approach which, given the heterogeneous nature of the data, is critical to draw statistically sound conclusions [10].
Among the various meta-analysis methods applicable to transcriptomic data [11], we employed a model that combines the effect sizes (log fold change, that is, the logarithm of the ratio of gene expression under hypoxia to that under normoxia) across the individual studies. This allowed us not only to identify the set of genes differentially regulated in response to hypoxia, but also to estimate the magnitude of change for each gene. Instead of assuming a fixed effect of hypoxia on any given gene across the different studies, we used a random effects model that considers that the true effect could vary from study to study to reflect, for example, the different response in distinct cell types. In this study we aim to define core components of the transcriptional response to hypoxia, taking advantage of the wider public availability of next generation sequencing data, RNA-seq in particular. By applying a random effects model to the expression data gathered, we were able to define a molecular signature representing the early (≤48 h) transcriptional response to hypoxia, independently of cell type.
2. Materials and Methods
2.1. RNA-seq Data Download and Processing
Raw reads of the RNA-seq experiments were downloaded from Sequence Read Archive [12]. Pseudocounts for each gene were obtained with salmon [13] using RefSeq [14] mRNA sequences for human genome assembly GRCh38/hg38 as reference.
Differential expression in individual subsets was calculated with the R package DESeq2 [15] using a local dispersion fit and the apeglm [16] method for effect size shrinkage.
2.2. Meta-Analysis
The meta-analysis was intended to identify the effect of sustained hypoxia on early gene expression in human cells compared to normoxic controls. To identify studies to be included in the meta-analysis, the Gene Expression Omnibus (GEO) repository was searched with the terms ‘hypoxia[Description] AND “expression profiling by high throughput sequencing” [DataSet Type]’ on 11 February 2021. The search resulted in a total of 394 studies. We only kept studies performed in human cells that determined steady-state RNA levels in total (poly-A) RNA samples and excluded analyses that did not include replicates, employed treatments other than reduced oxygen tension (e.g., chemical inhibitors or other hypoxia mimetics) or analyzed gene expression after more than 48 h. We also excluded studies that used cycling/intermittent hypoxia or that were performed in non-human cell lines. A total of 46 studies (independent GSE entries) remained after application of the inclusion/exclusion criteria and were used for the meta-analyses (Supplementary Table S1). A pooled estimate of the effect size of hypoxia on expression was determined for each gene using the R packages metafor [17] and meta [18], taking as input the log2-fold change value and its associated standard error computed for each individual RNA-seq experiment using the R package DESeq2 [15]. Given that the individual estimates derive from a heterogeneous group of experiments, including different cell types and experimental conditions, we assumed that these individual estimates derive from a distribution of true effect sizes rather than a single one and thus applied a random-effects model for the meta-analysis. Since some of the selected studies included several cell types and/or experimental conditions (see Results for details), we fitted a 3-level model [19] that, in addition to sampling error and between-study heterogeneity, takes into account possible dependencies between data subsets derived from a single study.
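The pooling step for a single gene can be illustrated with a minimal Python sketch. This is not the three-level REML model fitted with metafor in this study: it swaps in the simpler two-level DerSimonian-Laird random-effects estimator, and the function name `pool_random_effects` and the input values in the usage note are hypothetical.

```python
import math

def pool_random_effects(lfc, se):
    """Pool per-subset log2 fold changes for one gene.

    Assumption: two-level DerSimonian-Laird estimator, used here only as a
    simplified stand-in for the three-level model described in the text.
    """
    if len(lfc) != len(se) or len(lfc) < 2:
        raise ValueError("need matching lists with at least two subsets")
    var = [s ** 2 for s in se]
    w = [1.0 / v for v in var]                      # inverse-variance weights
    fixed = sum(wi * y for wi, y in zip(w, lfc)) / sum(w)
    # Cochran's Q: weighted dispersion of subsets around the fixed-effect mean
    q = sum(wi * (y - fixed) ** 2 for wi, y in zip(w, lfc))
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (len(lfc) - 1)) / c)       # between-subset variance
    w_re = [1.0 / (v + tau2) for v in var]          # random-effects weights
    pooled = sum(wi * y for wi, y in zip(w_re, lfc)) / sum(w_re)
    pooled_se = math.sqrt(1.0 / sum(w_re))
    z = pooled / pooled_se
    p = math.erfc(abs(z) / math.sqrt(2))            # two-sided normal p-value
    return {"lfc": pooled, "se": pooled_se, "tau2": tau2, "p": p}
```

For example, `pool_random_effects([2.1, 1.4, 3.0], [0.2, 0.3, 0.4])` (made-up values) returns the pooled log2 fold change, its standard error, the estimated between-subset variance and a p-value. Running it once per gene and then adjusting the p-values for multiple testing would mirror the gene-by-gene strategy described above.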
2.3. Functional Enrichment Analysis
Enrichment of Gene Ontology terms was performed with the Bioconductor clusterProfiler package [20] using a q cut-off value of 0.05. The background list included the genes expressed in at least 90% of the datasets, and the foreground list the subset of these genes significantly regulated by hypoxia (FDR < 0.01) with |Log2FC| ≥ 0.7. To reduce redundancy, highly similar GO terms were removed, keeping a single representative, by means of the “simplify” function with a cut-off value of 0.6 (up-regulated genes) or 0.7 (down-regulated genes). The much larger number of enriched terms found for up-regulated genes justified the use of a slightly more lenient cutoff value for the simplify function. Gene Set Enrichment Analysis [21] was performed using the preranked tool of the Broad Institute’s application for Linux (version v4.2.3). The Canonical pathways subset was used as gene set database and the gene list was ranked according to the pooled LFC estimate derived from the meta-analysis. Genes expressed in less than 5% of the studies’ subsets were removed from the list. Pathways with an were considered significantly enriched.
3. Results
3.1. Hypoxia-Induced Transcriptional Profiles Show Limited Overlap
In order to identify genes consistently regulated by hypoxia across a wide range of cell types and experimental conditions, we compared the results from 46 studies analyzing the transcriptional response to hypoxia by means of RNA-seq (Supplementary Table S1). Since some studies included several cell types, oxygen tensions or times of exposure to hypoxia, we took subsets of the study’s data so that each one included a single cell line and set of experimental conditions (Figure 1). Thus, our initial data set included a total of 81 subsets of normoxia-hypoxia paired samples, each one comprising a single cell line, exposure time and oxygen tension (Table 1).
Table 1.
Dataset | Studies | Samples | Subsets | Cell Lines |
---|---|---|---|---|
Initial | 46 | 472 | 81 | 38 |
Filtered | 43 | 430 | 70 | 34 |
For each of these 81 subsets, we identified the genes significantly regulated () in response to hypoxia and recorded the number of times each gene was found to be down- or up-regulated across the 81 subsets (Figure 1A). This analysis revealed that the majority of genes showed significant changes in a small number of datasets (Figure 1B). Thus, of a total of 15,362 genes found significantly () repressed by hypoxia across all the 81 analyzed datasets, over 50% of them were found in a maximum of five datasets (Figure 1B, first red bar). Similarly, a total of 16,872 genes were significantly () induced in at least one dataset, but 60% of them were shared in a maximum of five datasets (Figure 1B, first blue bar). Conversely, not a single gene was found consistently down- or up-regulated across all the datasets, but induced genes tend to be more consistently regulated (Figure 1B). The most frequently repressed genes were present in at most 50–55 datasets while several genes were found significantly up-regulated in 65–70 of them (Figure 1B and Supplementary Tables S2 and S3). These results suggested a reduced overlap of the analyzed transcriptional profiles. Next, we performed all possible pair-wise comparisons between the 81 lists of DEGs (Figure 1A, bottom right panel). As shown in Figure 1C, the overlap between lists of DEGs was below 10% in the vast majority of pair-wise comparisons with a median value of 3.6% of shared genes between lists of repressed genes and a median value of 6.8% in the case of the induced genes. Altogether these results indicate a considerable heterogeneity in the transcriptional response to hypoxia, which is more pronounced in the case of repressed genes.
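The pair-wise comparison of DEG lists can be sketched as follows. The text does not state how the percentage of shared genes was normalized, so this sketch assumes the Jaccard index (shared genes over the union of both lists). The function names and the toy gene lists are hypothetical.

```python
from itertools import combinations
from statistics import median

def percent_overlap(a, b):
    """Shared genes as a percentage of the union of two DEG lists.

    Assumption: Jaccard index x 100; the normalization used in the study
    is not specified in the text.
    """
    a, b = set(a), set(b)
    union = a | b
    return 100.0 * len(a & b) / len(union) if union else 0.0

def median_pairwise_overlap(deg_lists):
    """Median overlap over all pairs of DEG lists, keyed by subset name."""
    pairs = combinations(sorted(deg_lists), 2)
    return median(percent_overlap(deg_lists[x], deg_lists[y]) for x, y in pairs)

# Toy input with three hypothetical subsets of induced genes
toy = {
    "S1": ["VEGFA", "CA9", "EGLN3"],
    "S2": ["CA9", "EGLN3", "LDHA"],
    "S3": ["MYC"],
}
print(median_pairwise_overlap(toy))
```

With 81 subsets this loop evaluates 3240 pairs per direction of regulation, which is trivial to compute and makes the median a convenient summary of the whole comparison.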
3.2. Identification of Robust Transcriptional Responses to Hypoxia
Given the heterogeneity in the transcriptional response to hypoxia, we decided to apply a formal meta-analysis to identify genes significantly regulated by hypoxia across all the datasets and estimate the magnitude of the change in their expression. To this end, for each of these 81 subsets we estimated the difference in expression levels between normoxic and hypoxic conditions for all genes (“LFC”, Figure 2). From these analyses we extracted the statistics (effect size, “LFC”, and the standard error associated with this estimate, “SE”) for individual genes across the different studies and performed a meta-analysis on each one of these gene-specific datasets to estimate the pooled effect of hypoxia (Figure 2 Meta-analysis). Thus, an independent meta-analysis was performed for each individual gene by integrating the effects of hypoxia on that particular gene across the different studies and conditions. The results provide a pooled estimate of the effect of hypoxia on the expression of the gene under analysis and its statistical significance. As an example, the results of the meta-analysis for the EGLN3 gene, encoding a cellular oxygen sensor known to be directly regulated by HIF in response to hypoxia [22], are shown in Supplementary Figure S1. Finally, we compiled the pooled estimates for all genes detected in more than one subset, together with the statistical significance value, to produce a table representing the overall effect of hypoxia on gene expression (Figure 2 “Compiled MA results”).
The quality of the meta-analysis results is critically dependent on the original data fed to the model. In this regard, correlation analyses revealed a few incoherent datasets (Supplementary Figure S2). In those cases where the lack of positive correlation was clearly due to a mistake in the labeling of samples in public databases, as indicated by large negative correlation coefficients (Supplementary Figure S2, subsets “S42” and “S58”), the treatment labels were corrected and the study was kept. The remaining incoherent studies, having a correlation coefficient not significantly different from zero (), were discarded. After these data sanity checks, the whole analysis strategy (Figure 2) was repeated on this corrected and filtered dataset. Table 1 shows the statistics of the data set after filtering, and Supplementary Table S4 gives the full description of each of the samples included in the final analysis.
3.3. Identification of a Universal Core of Hypoxia-Inducible Genes
The results of the meta-analysis on the clean dataset, after filtering out the outlier subsets and removing genes detected in less than 5% of the subsets, revealed 6242 genes (out of a total of 20,918) whose expression was significantly () altered in response to hypoxia (Figure 3A and Supplementary Table S5), with a similar number of genes being induced (3043) and repressed (3199). These numbers are larger than the typical values obtained in individual experiments, with median values of 1294 and 1442 genes significantly down- and up-regulated respectively (Figure 3B). The large number of DEGs identified by the meta-analyses is probably a consequence of the increased power to detect small effect sizes due to the integration of a large number of samples. In agreement, the median effect sizes (LFC) observed for the genes differentially expressed (DE) according to the meta-analyses are −0.31 and 0.42 for down- and up-regulated genes respectively, contrasting with the median effect sizes observed in individual studies of −0.76 and 0.86 for down- and up-regulated genes respectively (Figure 3C). Accordingly, the identification of DEGs based only on statistical significance yields a large number of genes barely changing in response to hypoxia (Figure 3A, genes labelled “FDR” in blue colour). As an example, the smallest effect size found among significantly up-regulated genes is 0.11, corresponding to a fold induction over normoxia of about 1.1 times.
In view of these results we tried to identify a minimum effect size likely to represent biologically relevant changes in gene expression. To this end we explored the relationship between effect size and belonging to biological processes known to be regulated by hypoxia, testing whether increasing the Log2FC cut-off would also increase the proportion of genes with hypoxia related annotations. As shown in Supplementary Figure S3, the p-value for the association of biological function and regulation by hypoxia reached a minimum at effect size (Log2FC) values between 0.3 and 1.7. Since the choice of effect size values only affects the distribution of genes into categories (i.e., differentially expressed versus not altered) but does not change their total number, the minimum p-value corresponds to the least likely distribution expected by chance. Thus, we decided to take these values as the lower boundary required to produce a biological response to hypoxia. The median value of the effect sizes is 0.7, corresponding to an induction of 1.6 times over basal levels (0.6 times the normoxic level for repressed genes).
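The conversion between effect sizes in log2 units and fold changes used in this section is plain arithmetic, shown here as a short sketch (the helper name `fold_change` is ours):

```python
def fold_change(log2fc):
    """Convert a log2 fold change into a linear fold change over normoxia."""
    return 2.0 ** log2fc

# Threshold discussed in the text: 0.7 log2 units in either direction
print(round(fold_change(0.7), 2))    # induction relative to normoxia
print(round(fold_change(-0.7), 2))   # remaining fraction of the normoxic level
```

An effect size of 0.7 therefore corresponds to roughly 1.6 times the normoxic level for induced genes and roughly 0.6 times for repressed genes, matching the values quoted above.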
Thus, in response to hypoxia a total of 926 genes, 167 repressed and 759 induced, show a statistically significant change in expression (FDR < 0.01) of a magnitude likely to be biologically meaningful (|Log2FC| ≥ 0.7) (Figure 3A, labeled in green and red colours).
The difference in the number of repressed and induced genes is a consequence of the distribution of effect size values having a longer tail in the latter case (rug plot of the x-axis in Figure 3A,D left panel). The different shapes of the distribution of effect size values also suggest that hypoxia has a relatively weak effect on gene repression. In fact, while the number of significantly induced genes is about four times higher than that of repressed genes (759 vs. 167) for an effect size higher than 0.7, the ratio increases to seventeen times more up-regulated than down-regulated genes (424 vs. 25) for effect sizes larger than 1. Since the meta-analyses included experiments done at relatively short exposure times (27% of the subsets correspond to exposure times ranging from 1–12 h), it could be argued that the smaller effect size observed for repressed genes is a consequence of short-time experiments failing to detect the effect on mRNA levels due to the relatively long half-life of mRNAs. To test this hypothesis, we repeated the meta-analyses selecting only subsets corresponding to treatments of 24–48 h, significantly longer than the median half-life of 5.7 h observed for human mRNAs under hypoxia [23]. As shown in Figure 3D left panel, both distributions show a small shift toward higher absolute effect size values, but the difference between them remains unaltered. Thus, the relatively smaller effect of hypoxia on gene repression does not appear to be due to the persistence of mRNA molecules present prior to hypoxia exposure.
Finally, in order to get a list of core hypoxia-responsive genes we identified those that were ubiquitously expressed. To that end, we selected those genes whose expression, averaged across conditions, was detectable in at least 90% of the analyzed subsets (Figure 3A labeled in red color). The resulting list included a total of 295 genes (114 down- and 181 up-regulated) consistently altered by hypoxia across conditions. These genes correspond to those most frequently found significantly regulated across individual datasets (Supplementary Tables S2 and S3). The top 5 most frequently down- and up-regulated genes are labelled in Figure 3A.
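The successive filters that lead from the compiled meta-analysis results to the core signature can be summarized in a small sketch. The 0.7 effect size and 90% detection thresholds come from the text; the FDR cut-off of 0.01 is taken from Section 2.3 and its use here is an assumption, as is the treatment of values exactly at a threshold. The class, function and category names, and the example values, are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class PooledResult:
    gene: str
    lfc: float          # pooled log2 fold change from the meta-analysis
    fdr: float          # multiple-testing adjusted p-value
    detected_in: float  # fraction of subsets in which the gene is expressed

def classify(r, fdr_max=0.01, lfc_min=0.7, detected_min=0.9):
    """Assign a gene to one of the categories discussed in the text."""
    if r.fdr >= fdr_max:
        return "not_significant"
    if abs(r.lfc) < lfc_min:
        return "significant_small_effect"   # significant but below 0.7
    direction = "up" if r.lfc > 0 else "down"
    if r.detected_in >= detected_min:
        return f"core_{direction}"          # ubiquitously expressed
    return f"relevant_{direction}"          # large effect, restricted expression

# Made-up values for illustration only
print(classify(PooledResult("GENE_A", 1.8, 0.0001, 0.97)))
```

Keeping the three criteria as separate, named parameters makes it easy to check how sensitive the size of the signature is to each threshold.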
Functional enrichment of Gene Ontology terms indicated that core hypoxia-induced genes are mainly involved in metabolic reprogramming but also in differentiation and morphogenesis, with the development of the circulatory system being particularly prominent (Figure 4A). On the other hand, the genes consistently repressed by hypoxia across conditions are involved in cell cycle progression, DNA replication and repair, ribosome/rRNA biogenesis and metabolism of amino acids (Figure 4B). Similar results were obtained by Gene Set Enrichment Analysis (GSEA) using the meta-analysis derived LFC pooled estimates as ranking factor and pathway databases (Biocarta, KEGG, PID, Reactome and Wikipathways) as source of gene sets [24]. GSEA results showed that cell cycle, DNA replication and DNA repair pathways were repressed by hypoxia (Supplementary Table S6). In addition, mitochondrial respiratory electron transport and complex I biogenesis were also found repressed (Supplementary Table S6). On the other hand, HIF-related pathways and glucose metabolism were upregulated by hypoxia (Supplementary Table S7).
In summary, the application of a formal meta-analysis to hypoxia gene expression profiles using a random effects model led to the identification of a core set of 295 ubiquitously expressed genes whose expression is significantly altered by hypoxia by at least 0.7 log2 units. The identity of these genes, along with their response to hypoxia across individual subsets, can be found in Supplementary Table S8.
3.4. Consistency of Meta-Analysis Results
To test the consistency of the pooled estimates described above, we applied a leave-one-out cross-validation, a common method to estimate how accurately a predictive model will perform on new data. To this end, we performed a set of meta-analyses using as input all data subsets except for one and then compared the estimated effect sizes with the actual LFC observed in the subset that was left out (Figure 5A). The process was repeated until all possibilities were exhausted. This approach yielded a list of 70 correlation coefficients, one for each iteration. As shown in Figure 5B, in almost all cases there was a strong correlation between the pooled estimates and the actual effect sizes observed in the individual subset that was left out of the meta-analyses, with 50% of the instances showing a Pearson’s correlation coefficient over 0.81 and 75% of the cases above 0.72. We also analyzed the overlap between the DEGs derived from each meta-analysis and those from the individual experiment that was left out of it and found a median value of 19% of shared genes between lists of repressed genes and a median value of 18% in the case of the induced genes (Figure 5B). These values contrast with the low overlap found in pairwise comparisons between individual experiments (Figure 1C), in particular in the case of down-regulated genes. Finally, we analyzed the percentage of core genes (FDR < 0.01, |Log2FC| ≥ 0.7 and present in at least 90% of the subsets included in the meta-analysis) that were present in the DEGs () from the subset not included in the meta-analysis. This analysis showed that core genes are consistently found among the DEGs identified in individual studies, with median values of 55% of the core repressed genes and 65% of the core induced genes (Figure 1D). Altogether these results indicate that the ensemble of pooled estimates predicts with high accuracy the response to hypoxia and the identity of DEGs in new experiments not included in the meta-analysis.
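The leave-one-out procedure can be sketched in a few lines. To keep the example short, the pooling step is a fixed-effect inverse-variance weighted mean rather than the three-level random-effects model used in this study, and the input layout (`{subset: {gene: (lfc, se)}}`) and function names are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def inverse_variance_mean(estimates):
    """Pool (lfc, se) pairs; simplified stand-in for the random-effects model."""
    weights = [1.0 / se ** 2 for _, se in estimates]
    return sum(w * lfc for w, (lfc, _) in zip(weights, estimates)) / sum(weights)

def leave_one_out(subsets):
    """Correlate pooled estimates from all-but-one subsets with the held-out LFCs."""
    result = {}
    for held_out, observed_genes in subsets.items():
        others = [s for name, s in subsets.items() if name != held_out]
        pooled, observed = [], []
        for gene, (lfc, _) in observed_genes.items():
            estimates = [s[gene] for s in others if gene in s]
            if not estimates:
                continue  # gene not measured anywhere else, nothing to compare
            pooled.append(inverse_variance_mean(estimates))
            observed.append(lfc)
        result[held_out] = pearson(pooled, observed)
    return result
```

Applied to the 70 filtered subsets, a loop of this kind produces one correlation coefficient per held-out subset, which is the quantity summarized in Figure 5B.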
3.5. Comparison of Meta-Analyses Results with a Reference Hypoxia Signature
The core set of genes identified in the analyses described above can be considered a signature of the transcriptional response to hypoxia. Thus, we next compared the core of hypoxia-inducible genes derived from the meta-analysis with the MSigDB’s Hallmark hypoxia geneset [24], a widely used gene signature composed of 200 genes up-regulated in response to low oxygen levels. As shown in Figure 6A, the overlap between both gene sets was relatively small, with less than one third (64 out of 200) of the genes in the Hallmark hypoxia signature being present in the meta-analysis-derived geneset, in spite of both genesets being similar in size, median Log2FC and nearly universal expression (Supplementary Table S9). Moreover, the overlap was only moderately increased when the Hallmark hypoxia signature was compared to the geneset derived from the meta-analysis without restricting to ubiquitously expressed genes (Figure 6B). In order to understand the cause of the reduced overlap, we analyzed the effect of hypoxia on the expression of the 109 genes present in the Hallmark hypoxia signature only (Supplementary Table S10). Five of the genes in this group (CCN5, CCN1, BRS3, CCN2 and LALBA) were not among the 22,182 genes considered in the meta-analyses, probably due to the lack of detectable expression in the RNA-seq datasets. The effect of hypoxia on the remaining 104 genes is shown in Figure 6C. This analysis revealed that 40% of these genes (43 out of 104) were not present in the meta-analysis-derived signature because the pooled estimate of their induction by hypoxia was below the threshold value of 0.7, in spite of being statistically significant (labeled as “DEG_UP” and shown in green color).
However, the remaining 60% of the genes (62 out of 104) did not show a statistically significant induction by hypoxia (57 out of 104; Figure 6C labeled as “Non_DEG” and shown in red color) or were repressed (5 out of 104 genes; Figure 6C labeled as “DEG_DN” and shown in blue color). A forest plot representing the pooled estimate of the LFC for the genes labeled as “Non_DEG” or “DEG_DN” shows that they cluster around the value of zero and that, in many cases, the confidence interval for the point estimate is wide (Figure 6D) and overlaps the value of zero (no regulation). These results indicate that these 62 genes are either not consistently induced by hypoxia or show a cell-type/condition specific induction. An example of the latter is HMOX1, whose expression is induced, repressed or left unaltered depending on the cell type and/or experimental conditions (Supplementary Figure S4). Interestingly, among the genes whose regulation by hypoxia depends on the specific experimental conditions are ALDOB and LDHC, which are tissue-specific paralogs of genes strongly and consistently induced by hypoxia, ALDOA and LDHA respectively (Figure 6D, labelled in red).
Altogether, these results suggest that the meta-analysis-derived gene signature improves on the gene sets derived from individual studies by excluding genes whose regulation is cell type or condition specific and those with effect sizes of small magnitude.
4. Discussion
The integration of multiple datasets representing the transcriptional response to a given stimulus allows for the identification of consistent changes in gene expression. However, transcriptional profiles are noisy, and the correlation between them is poor [25,26]. Thus, the number of common DEGs decreases rapidly with the number of studies taken into consideration (Figure 1). To identify genes commonly regulated by hypoxia one can set a minimum number of studies where the gene needs to be found as a DEG [9]. Then again, there are no objective criteria to select minimal thresholds, and this approach results in a list of commonly regulated genes which does not provide information regarding the magnitude of their regulation. Fortunately, applying meta-analysis methods appears to be a good and practical solution to reduce noise and increase signal across different studies [10].
Herein we describe the application of a formal meta-analysis procedure to identify genes whose expression is significantly modulated across a number of different gene profiling studies. This approach not only provides the identity of the genes but also a pooled estimate of the effect of the condition on their expression. Moreover, by applying a random effects model, this strategy takes into account the wide variability in gene expression expected from the integration of transcriptomes derived from different experimental conditions. The application of this approach to 70 paired normoxic/hypoxic transcriptomes representing a total of 430 samples resulted in the identification of 6242 genes, roughly 30% of the detectable genes, as significantly () regulated in response to changes in oxygen tension. These results raise the question of the biological relevance of statistically significant but small changes in gene expression. For example, the median effect size for significantly up-regulated genes was 0.42, corresponding to a fold induction of 1.34 times over normoxia. Thus, for half of the significantly induced genes, the level of mRNA in hypoxia is at most 1.34 times higher than control levels.
For some genes this small increase in expression could have important consequences, but statistical significance by itself does not warrant biological relevance. For this reason we sought to identify an effect size that is likely to have an impact on cellular biochemistry. To this end we recorded the changes in expression of genes known to have an impact on different biological processes upon exposure to hypoxia and took the median effect size value, 0.7 log2 units, as the threshold. As only a fraction of the genes in a category are induced to initiate the biological response, this value is likely to be an underestimation and thus could be considered a lower boundary to identify biologically relevant changes. The importance of considering the effect size (Log2FC), and not only the statistical significance, to identify DEGs is a strong argument in favor of formal meta-analysis instead of other integrative methods that yield a consensus list of genes without an associated estimate of the effect of hypoxia. The effect of hypoxia on gene repression is apparently weaker than on gene induction, as indicated by the relatively smaller number of repressed genes above the effect size threshold. This could be a consequence of repression being indirectly regulated by HIFs [8,27,28] and would explain the higher heterogeneity of the response observed for repressed genes (Figure 1). The indirect regulation could occur via transcriptional regulators acting downstream of HIF [29] or be a consequence of the cellular adaptations to hypoxia. For example, alteration of energy availability during hypoxia could prevent cell division, with the concomitant down-regulation of genes involved in DNA synthesis and cell cycle progression. Thus, further work is required to fully understand the mechanisms responsible for gene repression during hypoxia.
As a further advantage, a formal meta-analysis approach allows the use of all associated statistical tests, including moderator analysis, to study the effect of different factors on the regulation of gene expression. Through this analysis we found that endothelial cells are deficient in the induction of a relatively large set of genes in response to hypoxia. Many of those genes encode enzymes involved in glucose metabolism, particularly in glycolysis and glycogen synthesis (data not shown). Since HIF1A, but not EPAS1, is known to be responsible for the hypoxic induction of glycolytic genes [30], it is tempting to speculate that the specific pattern of expression observed in endothelial cells could be a consequence of the relative importance of the EPAS1 isoform over HIF1A in this cell type. However, since most endothelial datasets consist of experiments performed on Human Umbilical Vein Endothelial Cells (HUVEC), we cannot rule out that the blunted induction of these genes is specific to this cell type rather than a general feature of endothelial cells. In agreement with this latter possibility, preliminary experiments showed a feeble induction of glycolytic genes in several, but not all, endothelial cells tested (data not shown).
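As a rough illustration of a moderator analysis with a categorical factor such as cell type, the sketch below compares the pooled effect of two subgroups of studies with a Z-test. It is a deliberate simplification: it uses fixed-effect inverse-variance weights within each subgroup, whereas a full moderator analysis would typically fit a mixed-effects meta-regression. All names and values are hypothetical.

```python
import math


def inverse_variance_mean(effects, variances):
    """Inverse-variance weighted mean and its standard error."""
    weights = [1.0 / v for v in variances]
    total = sum(weights)
    mean = sum(w * y for w, y in zip(weights, effects)) / total
    return mean, math.sqrt(1.0 / total)


def subgroup_difference(group_a, group_b):
    """Z-test for the difference between two subgroup pooled effects.

    Each group is a tuple (effects, variances).
    """
    mean_a, se_a = inverse_variance_mean(*group_a)
    mean_b, se_b = inverse_variance_mean(*group_b)
    diff = mean_a - mean_b
    se = math.sqrt(se_a ** 2 + se_b ** 2)
    z = diff / se
    # Two-sided p-value under the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return diff, z, p


# Hypothetical log2 fold changes of one glycolytic gene (not study data).
endothelial = ([0.2, 0.2], [0.02, 0.02])
other_cells = ([1.0, 1.0], [0.02, 0.02])
diff, z, p = subgroup_difference(endothelial, other_cells)
```

A negative difference with a small p-value would indicate a weaker induction in the first subgroup, which is the kind of pattern described above for endothelial cells.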
The Hallmark subset of MSigDB contains signatures generated by a computational method that identifies overlaps across different gene sets and retains those genes that display coordinate expression [24]. Although it is an invaluable and widely used resource, our results suggest that the MSigDB hypoxia signature has some shortcomings. For one thing, the signature lacks many hypoxia-regulated genes, containing only 37% of the genes strongly and consistently regulated by hypoxia across different cell types and experimental conditions (Figure 6A). In addition, the 114 core genes identified by the meta-analysis and not present in the Hallmark signature include well-characterized hypoxia-induced genes such as BCKDHA, EGLN1, several KDM family members and LOXL2, among others (Supplementary Table S10). On the other hand, the Hallmark signature includes some genes that are induced only in specific cell types or experimental conditions (Figure 6D) and thus cannot be considered general hypoxia-responsive genes. This result is particularly interesting as it explains the contradictory reports regarding the effect of hypoxia on specific genes such as HMOX1 [31,32,33,34,35] and PPARG1 [36,37,38,39,40,41,42].
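The comparison between two signatures reduces to set operations, sketched below. The gene sets are toy examples: the named genes follow the examples given in the text, while `SHARED_1` and `SHARED_2` are hypothetical placeholders, so the resulting fraction is illustrative only.

```python
def signature_overlap(core, reference):
    """Summarize how a reference signature covers a core signature."""
    core, reference = set(core), set(reference)
    shared = core & reference
    return {
        "shared": sorted(shared),
        "core_only": sorted(core - reference),
        "reference_only": sorted(reference - core),
        "fraction_of_core_covered": len(shared) / len(core) if core else 0.0,
    }


# Toy gene sets; SHARED_1 and SHARED_2 are placeholders, not real genes.
core_signature = {"BCKDHA", "EGLN1", "LOXL2", "SHARED_1", "SHARED_2"}
hallmark_like = {"HMOX1", "SHARED_1", "SHARED_2"}
summary = signature_overlap(core_signature, hallmark_like)
```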
In summary, herein we describe a formal meta-analysis approach that identifies the core transcriptional response to hypoxia. In addition to the identity of the genes, the approach yields an estimate of the magnitude of their change in expression in response to hypoxia. We also describe an approach to determine a minimum effect size to be used in combination with statistical significance to identify biologically relevant changes in response to hypoxia.
Acknowledgments
We thank Yosra Berrouayel Dahour for her comments and suggestions about application of meta-analysis methods to transcriptomic data.
Abbreviations
The following abbreviations are used in this manuscript:
DEG(s) | Differentially expressed gene(s) |
Supplementary Materials
The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/biomedicines10092229/s1. Figure S1. Meta-analysis of the effect of hypoxia on EGLN3 gene expression. Figure S2. Identification of outlier subsets. Figure S3. Identification of an effect size value that maximizes the association between gene expression and cellular responses to hypoxia. Figure S4. Meta-analysis of the effect of hypoxia on HMOX1 gene expression. Table S1. Studies selected from database search. Table S2. Frequency of hypoxia-repressed genes. Table S3. Frequency of hypoxia-induced genes. Table S4. Metadata of studies kept after filtering original datasets. Table S5. Compiled meta-analysis results. Table S6. Biological pathways repressed by hypoxia. Table S7. Biological pathways induced by hypoxia. List of pathways repressed by hypoxia according to GSEA. Table S8. Effect of hypoxia on the expression of the genes identified in the meta-analysis as the core hypoxic signature. Table S9. Comparison of meta-analysis-derived core genes and Hallmark hypoxia signature. Table S10. Effect of hypoxia on the expression of the genes included in gene signatures.
Author Contributions
Conceptualization, L.d.P.; data curation, L.P.-S. and L.S.-G.; formal analysis L.d.P. and L.P.-S.; funding acquisition L.d.P. and R.R.-R.; investigation N.P. and O.M.-C.; methodology, L.d.P. and L.P.-S.; software, L.d.P. and L.P.-S.; writing—original draft preparation, L.d.P.; writing—review and editing, L.P.-S.; visualization, L.d.P.; supervision, L.d.P. and R.R.-R.; project administration, L.d.P. and R.R.-R. All authors have read and agreed to the published version of the manuscript.
Institutional Review Board Statement
Not Applicable.
Informed Consent Statement
Not Applicable.
Data Availability Statement
Data supporting the reported results can be found in NCBI’s Gene Expression Omnibus (GEO) repository; Supplementary Table S1 lists the IDs of the studies used herein.
Conflicts of Interest
The authors declare no conflict of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript; or in the decision to publish the results.
Funding Statement
This research was funded by Ministerio de Economia, industria y Competitividad (Spain) grant number SAF2017-88771-R; Ministerio Ciencia e Innovacion (MCIN/AEI/10.13039/501100011033 “ERDF A way of making Europe”, Spain) grant number PID2020-118821RB-I00 and Consejeria de Ciencia, Universidades e Innovacion de la CAM (Madrid, Spain) grant number IND2019/BMD-17134.
Footnotes
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
References
- 1. Semenza G.L. Hypoxia-inducible factors in physiology and medicine. Cell. 2012;148:399–408. doi: 10.1016/j.cell.2012.01.021.
- 2. Elvidge G.P., Glenny L., Appelhoff R.J., Ratcliffe P.J., Ragoussis J., Gleadle J.M. Concordant regulation of gene expression by hypoxia and 2-oxoglutarate-dependent dioxygenase inhibition: The role of HIF-1alpha, HIF-2alpha, and other pathways. J. Biol. Chem. 2006;281:15215–15226. doi: 10.1074/jbc.M511408200.
- 3. Ivan M., Kondo K., Yang H., Kim W., Valiando J., Ohh M., Salic A., Asara J.M., Lane W.S., Kaelin W.G. HIFalpha targeted for VHL-mediated destruction by proline hydroxylation: Implications for O2 sensing. Science. 2001;292:464–468. doi: 10.1126/science.1059817.
- 4. Jaakkola P., Mole D.R., Tian Y.M., Wilson M.I., Gielbert J., Gaskell S.J., von Kriegsheim A., Hebestreit H.F., Mukherji M., Schofield C.J., et al. Targeting of HIF-alpha to the von Hippel-Lindau ubiquitylation complex by O2-regulated prolyl hydroxylation. Science. 2001;292:468–472. doi: 10.1126/science.1059796.
- 5. Maxwell P.H., Wiesener M.S., Chang G.W., Clifford S.C., Vaux E.C., Cockman M.E., Wykoff C.C., Pugh C.W., Maher E.R., Ratcliffe P.J. The tumour suppressor protein VHL targets hypoxia-inducible factors for oxygen-dependent proteolysis. Nature. 1999;399:271–275. doi: 10.1038/20459.
- 6. Kasper L.H., Boussouar F., Boyd K., Xu W., Biesen M., Rehg J., Baudino T.A., Cleveland J.L., Brindle P.K. Two transactivation mechanisms cooperate for the bulk of HIF-1-responsive gene expression. EMBO J. 2005;24:3846–3858. doi: 10.1038/sj.emboj.7600846.
- 7. Lando D., Peet D.J., Gorman J.J., Whelan D.A., Whitelaw M.L., Bruick R.K. FIH-1 is an asparaginyl hydroxylase enzyme that regulates the transcriptional activity of hypoxia-inducible factor. Genes Dev. 2002;16:1466–1471. doi: 10.1101/gad.991402.
- 8. Ortiz-Barahona A., Villar D., Pescador N., Amigo J., del Peso L. Genome-wide identification of hypoxia-inducible factor binding sites and target genes by a probabilistic model integrating transcription-profiling data and in silico binding site prediction. Nucleic Acids Res. 2010;38:2332–2345. doi: 10.1093/nar/gkp1205.
- 9. Bono H., Hirota K. Meta-Analysis of Hypoxic Transcriptomes from Public Databases. Biomedicines. 2020;8:10. doi: 10.3390/biomedicines8010010.
- 10. Hong F., Breitling R. A comparison of meta-analysis methods for detecting differentially expressed genes in microarray experiments. Bioinformatics. 2008;24:374–382. doi: 10.1093/bioinformatics/btm620.
- 11. Makinde F.L., Tchamga M.S.S., Jafali J., Fatumo S., Chimusa E.R., Mulder N., Mazandu G.K. Reviewing and assessing existing meta-analysis models and tools. Brief. Bioinform. 2021;22:bbab324. doi: 10.1093/bib/bbab324.
- 12. Leinonen R., Sugawara H., Shumway M. The sequence read archive. Nucleic Acids Res. 2011;39:148–162. doi: 10.1093/nar/gkq1019.
- 13. Patro R., Duggal G., Love M.I., Irizarry R.A., Kingsford C. Salmon provides fast and bias-aware quantification of transcript expression. Nat. Methods. 2017;14:417–419. doi: 10.1038/nmeth.4197.
- 14. O’Leary N.A., Wright M.W., Brister J.R., Ciufo S., Haddad D., McVeigh R., Rajput B., Robbertse B., Smith-White B., Ako-Adjei D., et al. Reference sequence (RefSeq) database at NCBI: Current status, taxonomic expansion, and functional annotation. Nucleic Acids Res. 2016;44:D733–D745. doi: 10.1093/nar/gkv1189.
- 15. Love M.I., Huber W., Anders S. Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol. 2014;15:550. doi: 10.1186/s13059-014-0550-8.
- 16. Zhu A., Ibrahim J.G., Love M.I. Heavy-Tailed prior distributions for sequence count data: Removing the noise and preserving large differences. Bioinformatics. 2019;35:2084–2092. doi: 10.1093/bioinformatics/bty895.
- 17. Viechtbauer W. Conducting meta-analyses in R with the metafor package. J. Stat. Softw. 2010;36:1–48. doi: 10.18637/jss.v036.i03.
- 18. Balduzzi S., Rücker G., Schwarzer G. How to perform a meta-analysis with R: A practical tutorial. Evid.-Based Ment. Health. 2019;22:153–160. doi: 10.1136/ebmental-2019-300117.
- 19. Harrer M., Cuijpers P., Furukawa T.A., Ebert D.D. Doing Meta-Analysis with R: A Hands-On Guide. 1st ed. Chapman and Hall/CRC Press; Boca Raton, FL, USA: London, UK: 2021.
- 20. Yu G., Wang L.G., Han Y., He Q.Y. clusterProfiler: An R Package for Comparing Biological Themes Among Gene Clusters. OMICS J. Integr. Biol. 2012;16:284–287. doi: 10.1089/omi.2011.0118.
- 21. Subramanian A., Tamayo P., Mootha V.K., Mukherjee S., Ebert B.L., Gillette M.A., Paulovich A., Pomeroy S.L., Golub T.R., Lander E.S., et al. Gene set enrichment analysis: A knowledge-based approach for interpreting genome-wide expression profiles. Proc. Natl. Acad. Sci. USA. 2005;102:15545–15550. doi: 10.1073/pnas.0506580102.
- 22. Pescador N., Cuevas Y., Naranjo S., Alcaide M., Villar D., Landázuri M., del Peso L. Identification of a functional hypoxia-responsive element that regulates the expression of the egl nine homologue 3 (egln3/phd3) gene. Biochem. J. 2005;390:189–197. doi: 10.1042/BJ20042121.
- 23. Tiana M., Acosta-Iborra B., Hernandez R., Galiana C., Fernandez-Moreno M.A., Jimenez B., del Peso L. Metabolic labeling of RNA uncovers the contribution of transcription and decay rates on hypoxia-induced changes in RNA levels. RNA. 2020;26:1006–1022. doi: 10.1261/rna.072611.119.
- 24. Liberzon A., Birger C., Thorvaldsdóttir H., Ghandi M., Mesirov J.P., Tamayo P. The Molecular Signatures Database Hallmark Gene Set Collection. Cell Syst. 2015;1:417–425. doi: 10.1016/j.cels.2015.12.004.
- 25. Irizarry R.A., Warren D., Spencer F., Kim I.F., Biswal S., Frank B.C., Gabrielson E., Garcia J.G., Geoghegan J., Germino G., et al. Multiple-laboratory comparison of microarray platforms. Nat. Methods. 2005;2:345–350. doi: 10.1038/nmeth756.
- 26. Kuo W.P., Jenssen T.K., Butte A.J., Ohno-Machado L., Kohane I.S. Analysis of matched mRNA measurements from two different microarray technologies. Bioinformatics. 2002;18:405–412. doi: 10.1093/bioinformatics/18.3.405.
- 27. Mole D.R., Blancher C., Copley R.R., Pollard P.J., Gleadle J.M., Ragoussis J., Ratcliffe P.J. Genome-wide association of hypoxia-inducible factor (HIF)-1alpha and HIF-2alpha DNA binding with expression profiling of hypoxia-inducible transcripts. J. Biol. Chem. 2009;284:16767–16775. doi: 10.1074/jbc.M901790200.
- 28. Xia X., Lemieux M.E., Li W., Carroll J.S., Brown M., Liu X.S., Kung A.L. Integrative analysis of HIF binding and transactivation reveals its role in maintaining histone methylation homeostasis. Proc. Natl. Acad. Sci. USA. 2009;106:4260–4265. doi: 10.1073/pnas.0810067106.
- 29. Batie M., Del Peso L., Rocha S. Hypoxia and Chromatin: A Focus on Transcriptional Repression Mechanisms. Biomedicines. 2018;6:47. doi: 10.3390/biomedicines6020047.
- 30. Hu C.J., Wang L.Y., Chodosh L.A., Keith B., Simon M.C. Differential roles of hypoxia-inducible factor 1alpha (HIF-1alpha) and HIF-2alpha in hypoxic gene regulation. Mol. Cell Biol. 2003;23:9361–9374. doi: 10.1128/MCB.23.24.9361-9374.2003.
- 31. Dunn L.L., Kong S.M., Tumanov S., Chen W., Cantley J., Ayer A., Maghzal G.J., Midwinter R.G., Chan K.H., Ng M.K., et al. Hmox1 (Heme Oxygenase-1) Protects against Ischemia-Mediated Injury via Stabilization of HIF-1α (Hypoxia-Inducible Factor-1α). Arterioscler. Thromb. Vasc. Biol. 2021;41:317–330. doi: 10.1161/ATVBAHA.120.315393.
- 32. Fan J., Lv H., Li J., Che Y., Xu B., Tao Z., Jiang W. Roles of Nrf2/HO-1 and HIF-1α/VEGF in lung tissue injury and repair following cerebral ischemia/reperfusion injury. J. Cell. Physiol. 2019;234:7695–7707. doi: 10.1002/jcp.27767.
- 33. Yu H., Chen B., Ren Q. Baicalin relieves hypoxia-aroused H9c2 cell apoptosis by activating Nrf2/HO-1-mediated HIF1α/BNIP3 pathway. Artif. Cells Nanomed. Biotechnol. 2019;47:3657–3663. doi: 10.1080/21691401.2019.1657879.
- 34. Chen D., Wu Y.X., Qiu Y.B., Wan B.B., Liu G., Chen J.L., Lu M.D., Pang Q.F. Hyperoside suppresses hypoxia-induced A549 survival and proliferation through ferrous accumulation via AMPK/HO-1 axis. Phytomedicine. 2020;67:153138. doi: 10.1016/j.phymed.2019.153138.
- 35. Shibahara S., Han F., Li B., Takeda K. Hypoxia and heme oxygenases: Oxygen sensing and regulation of expression. Antioxid. Redox Signal. 2007;9:2209–2225. doi: 10.1089/ars.2007.1784.
- 36. Zhao Y.Z., Liu X.L., Shen G.M., Ma Y.N., Zhang F.L., Chen M.T., Zhao H.L., Yu J., Zhang J.W. Hypoxia induces peroxisome proliferator-activated receptor γ expression via HIF-1-dependent mechanisms in HepG2 cell line. Arch. Biochem. Biophys. 2014;543:40–47. doi: 10.1016/j.abb.2013.12.010.
- 37. Itoigawa Y., Kishimoto K.N., Okuno H., Sano H., Kaneko K., Itoi E. Hypoxia induces adipogenic differentitation of myoblastic cell lines. Biochem. Biophys. Res. Commun. 2010;399:721–726. doi: 10.1016/j.bbrc.2010.08.007.
- 38. Krishnan J., Suter M., Windak R., Krebs T., Felley A., Montessuit C., Tokarska-Schlattner M., Aasum E., Bogdanova A., Perriard E., et al. Activation of a HIF1α-PPARγ Axis Underlies the Integration of Glycolytic and Lipid Anabolic Pathways in Pathologic Cardiac Hypertrophy. Cell Metab. 2009;9:512–524. doi: 10.1016/j.cmet.2009.05.005.
- 39. Xu J., Xiang Q., Lin G., Fu X., Zhou K., Jiang P., Zheng S., Wang T. Estrogen improved metabolic syndrome through down-regulation of VEGF and HIF-1α to inhibit hypoxia of periaortic and intra-abdominal fat in ovariectomized female rats. Mol. Biol. Rep. 2012;39:8177–8185. doi: 10.1007/s11033-012-1665-1.
- 40. Ezzeddini R., Taghikhani M., Amir S.F., Somi M.H., Samadi N., Esfahani A., Rasaee M.J. Downregulation of fatty acid oxidation by involvement of HIF-1α and PPARγ in human gastric adenocarcinoma and related clinical significance. J. Physiol. Biochem. 2021;77:249–260. doi: 10.1007/s13105-021-00791-3.
- 41. Lane S.L., Blair Dodson R., Doyle A.S., Park H., Rathi H., Matarrazo C.J., Moore L.G., Lorca R.A., Wolfson G.H., Julian C.G. Pharmacological activation of peroxisome proliferator-activated receptor γ (PPAR-γ) protects against hypoxia-associated fetal growth restriction. FASEB J. 2019;33:8999–9007. doi: 10.1096/fj.201900214R.
- 42. Ameshima S., Golpon H., Cool C.D., Chan D., Vandivier R.W., Gardai S.J., Wick M., Nemenoff R.A., Geraci M.W., Voelkel N.F. Peroxisome proliferator-activated receptor gamma (PPARγ) expression is decreased in pulmonary hypertension and affects endothelial cell growth. Circ. Res. 2003;92:1162–1169. doi: 10.1161/01.RES.0000073585.50092.14.