RNA. 2017 Nov;23(11):1648–1659. doi: 10.1261/rna.062224.117

Cis-regulatory elements explain most of the mRNA stability variation across genes in yeast

Jun Cheng 1,2, Kerstin C Maier 3, Žiga Avsec 1,2, Petra Rus 3, Julien Gagneur 1,2
PMCID: PMC5648033  PMID: 28802259

Abstract

The stability of mRNA is one of the major determinants of gene expression. Although a wealth of sequence elements regulating mRNA stability has been described, their quantitative contributions to half-life are unknown. Here, we built a quantitative model for Saccharomyces cerevisiae based on functional mRNA sequence features that explains 59% of the half-life variation between genes and predicts half-life at a median relative error of 30%. The model revealed a new destabilizing 3′ UTR motif, ATATTC, which we functionally validated. Codon usage proves to be the major determinant of mRNA stability. Nonetheless, single-nucleotide variations have the largest effect when occurring on 3′ UTR motifs or upstream AUGs. Analyzing mRNA half-life data of 34 knockout strains showed that the effect of codon usage requires not only functional decapping and deadenylation, but also the 5′-to-3′ exonuclease Xrn1 and the nonsense-mediated decay genes, whereas it does not require no-go decay. Altogether, this study quantitatively delineates the contributions of mRNA sequence features to stability in yeast, reveals their functional dependencies on degradation pathways, and allows accurate prediction of half-life from mRNA sequence.

Keywords: cis-regulatory elements, codon optimality, mRNA half-life

INTRODUCTION

The stability of messenger RNAs is an important aspect of gene regulation. It influences the overall cellular mRNA concentration, as mRNA steady-state levels are the ratio of synthesis and degradation rate. Moreover, low stability confers high turnover to mRNA and, therefore, the capacity to rapidly reach a new steady-state level in response to a transcriptional trigger (Shalem et al. 2008). Hence, stress genes, which must rapidly respond to environmental signals, show low stability (Miller et al. 2011; Zeisel et al. 2011; Marguerat et al. 2014; Rabani et al. 2014). In contrast, high stability provides robustness to variations in transcription. Accordingly, a wide range of mRNA half-lives is observed in eukaryotes, with typical variations in a given genome spanning one to two orders of magnitude (Schwanhäusser et al. 2011; Eser et al. 2016; Schwalb et al. 2016). Also, significant variability in mRNA half-life among human individuals could be demonstrated for about a quarter of genes in lymphoblastoid cells and estimated to account for more than a third of the gene expression variability (Duan et al. 2013).
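
The two roles of stability described above (setting the steady-state level and setting the response speed) can be illustrated with a minimal sketch. It assumes simple first-order kinetics, i.e., a constant synthesis rate and decay proportional to abundance; this kinetic model and all function names are illustrative assumptions and are not taken from the study.

```python
import math

def degradation_rate(half_life):
    """First-order degradation rate constant for a given half-life."""
    return math.log(2) / half_life

def steady_state_level(synthesis_rate, half_life):
    """Steady-state abundance: synthesis rate divided by degradation rate."""
    return synthesis_rate / degradation_rate(half_life)

def level_after(t, start_level, synthesis_rate, half_life):
    """Abundance at time t after the synthesis rate switched to a new value.

    Assumes dm/dt = synthesis_rate - k * m with constant k (an assumption).
    """
    k = degradation_rate(half_life)
    target = synthesis_rate / k
    return target + (start_level - target) * math.exp(-k * t)
```

Under this assumption, an mRNA covers half of the distance to its new steady state after one half-life, whatever the synthesis rate, which is why short-lived mRNAs respond faster to a transcriptional trigger.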

How mRNA stability is encoded in a gene sequence has long been a subject of study. Cis-regulatory elements (CREs) affecting mRNA stability are mainly encoded in the mRNA itself. Here we use the formal definition of CRE, i.e., a regulatory element affecting expression of the gene it belongs to in an allele-specific manner (Rockman and Kruglyak 2006; Skelly et al. 2009). CREs affecting mRNA stability include but are not limited to secondary structure (Rabani et al. 2008; Geisberg et al. 2014), sequence motifs present in the 3′ UTR including binding sites of RNA-binding proteins (Olivas and Parker 2000; Duttagupta et al. 2005; Shalgi et al. 2005; Hogan et al. 2008; Hasan et al. 2014), and, in higher eukaryotes, microRNAs (Lee et al. 1993). Moreover, translation-related features are frequently associated with mRNA stability. For instance, inserting strong secondary structure elements in the 5′ UTR or modifying the translation start codon context strongly destabilizes the long-lived PGK1 mRNA in S. cerevisiae (Muhlrad et al. 1995; LaGrandeur and Parker 1999). Codon usage, which affects the translation elongation rate, also regulates mRNA stability (Hoekema et al. 1987; Presnyak et al. 2015; Bazzini et al. 2016; Mishima and Tomari 2016). Further correlations between codon usage and mRNA stability have been reported in E. coli and S. pombe (Boël et al. 2016; Harigaya and Parker 2016). Adjacent codon pairs were also demonstrated to associate with mRNA decay in addition to individual codons in S. cerevisiae (Harigaya and Parker 2017).

Since the RNA degradation machineries are well conserved among eukaryotes, the pathways have been extensively studied using S. cerevisiae as a model organism (Garneau et al. 2007; Parker 2012). The general mRNA degradation pathway starts with the removal of the poly(A) tail by the Pan2/Pan3 (Brown et al. 1996) and Ccr4/Not complexes (Tucker et al. 2001). Subsequently, mRNA is subjected to decapping carried out by Dcp2 and promoted by several factors, including Dhh1 and Pat1 (Pilkington and Parker 2008; She et al. 2008). The decapped and deadenylated mRNA can be rapidly degraded in the 3′ to 5′ direction by the exosome (Anderson and Parker 1998) or in the 5′ to 3′ direction by Xrn1 (Hsu and Stevens 1993). Further mRNA degradation pathways are triggered when aberrant translational status is detected, including nonsense-mediated decay (NMD), no-go decay (NGD), and nonstop decay (NSD) (Garneau et al. 2007; Parker 2012).

Despite all this knowledge, prediction of mRNA half-life from a gene sequence is still not established. Moreover, most of the mechanistic studies so far were only performed on individual genes or reporter genes. It is therefore unclear how the measured effects generalize genome-wide. A recent study showed that translation-related features can be predictive for mRNA stability (Neymotin et al. 2016). Although this analysis supported the general correlation between translation and stability (Lackner et al. 2007), the model was not based purely on sequence-derived features. It also contained measured transcript properties such as ribosome density and normalized translation efficiencies. Hence, the question of how half-life is genetically encoded in mRNA sequence remains to be addressed.

Additionally, the dependencies of sequence features on distinct mRNA degradation pathways have not been systematically studied. One example of this is codon-mediated stability control. Although a causal link from codon usage to mRNA half-life has been shown for a wide range of organisms (Hoekema et al. 1987; Presnyak et al. 2015; Bazzini et al. 2016; Mishima and Tomari 2016), the underlying mechanism remains poorly understood. In S. cerevisiae, reporter gene experiments showed that codon-mediated stability control depends on the RNA helicase Dhh1 (Radhakrishnan et al. 2016). However, it is unclear whether this generalizes to all mRNAs genome-wide. Also, the role of other closely related degradation pathways has not been systematically assessed with genome-wide half-life data.

Here, we mathematically modeled mRNA half-life as a function of its sequence. Applied to S. cerevisiae, our model can explain most of the between-gene half-life variance from sequence alone. Using a semimechanistic model, we could interpret individual sequence features in the 5′ UTR, coding region, and 3′ UTR. Quantification of the respective contributions revealed that codon usage is the major contributor to mRNA stability. Applying the modeling approach to S. pombe supports the generality of these findings. Moreover, we systematically assessed the dependencies of these sequence elements on mRNA degradation pathways using half-life data for 34 knockout strains. This analysis revealed in particular novel pathways through which codon usage affects half-life.

RESULTS

To study cis-regulatory determinants of mRNA stability in S. cerevisiae, we chose the data set by Sun et al. (2013), which provides genome-wide half-life measurements for 4388 expressed genes of a wild-type laboratory strain and 34 strains knocked out for RNA degradation pathway genes (Fig. 1; Supplemental Table S1). When applicable, we also investigated half-life measurements of S. pombe for 3614 expressed mRNAs in a wild-type laboratory strain from Eser et al. (2016). We considered sequence features within five overlapping regions: the 5′ UTR, the start codon context, the coding sequence, the stop codon context, and the 3′ UTR. We assessed their effects in the wild type and in the 34 knockout strains (Fig. 1). Finally, we fitted a joint model to assess the contribution of individual sequence features and their single-nucleotide effects (Fig. 1). In all analyses, we considered the logarithm of half-life as the response variable rather than half-life in the natural scale. The primary motivation for choosing a logarithmic scale is that measurement noise for half-life is typically multiplicative. Also, the data did not provide supportive evidence discriminating between multiplicative or additive effects of the cis-regulatory elements on half-life (Supplemental Information). For simplicity, we used linear regressions, i.e., due to the logarithmic response, multiplicative models.
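
The statement that a linear regression on the logarithm of half-life amounts to a multiplicative model can be made explicit with a small sketch. The coefficients below are hypothetical values on the natural-log scale (the base of the logarithm is an assumption here), and the feature names are placeholders rather than fitted quantities from the study.

```python
import math

def predict_half_life(intercept, coefficients, features):
    """Half-life predicted by a linear model fitted to log(half-life)."""
    log_half_life = intercept + sum(
        coefficients[name] * value for name, value in features.items()
    )
    return math.exp(log_half_life)

# Hypothetical coefficients (not values reported in the study).
coefficients = {"uAUG": -0.3, "motif_count": -0.2}

baseline = predict_half_life(2.0, coefficients, {"uAUG": 0, "motif_count": 0})
with_both = predict_half_life(2.0, coefficients, {"uAUG": 1, "motif_count": 1})

# Additive effects on the log scale multiply on the natural scale:
# with_both / baseline equals exp(-0.3) * exp(-0.2).
fold_change = with_both / baseline
```

Each coefficient therefore translates into a fold-change of exp(coefficient) per unit of its feature, independently of the values of the other features.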

FIGURE 1.

Study overview. The goal of this study is to discover and integrate cis-regulatory mRNA elements affecting mRNA stability and assess their dependence on mRNA degradation pathways. (Data) We obtained S. cerevisiae genome-wide half-life data from wild-type (WT) as well as from 34 knockout strains from Sun et al. (2013). Each of the knockout strains has one gene closely related to mRNA degradation pathways knocked out. (Analysis) We systematically searched for novel sequence features associating with half-life from 5′ UTR, start codon context, CDS, stop codon context, and 3′ UTR. Effects of previously reported cis-regulatory elements were also assessed. Moreover, we assessed the dependencies of different sequence features on degradation pathways by analyzing their effects on the knockout strains. (Integrative model) We built a statistical model to predict genome-wide half-life solely from mRNA sequence. This allowed the quantification of the relative contributions of the sequence features to the overall variation across genes and assessing the sensitivity of mRNA stability with respect to single-nucleotide variants.

The correlations between sequence lengths, GC contents and folding energies (Materials and Methods) with half-life and corresponding P-values are summarized in Supplemental Table S2 and Supplemental Figures S1–S3. In general, sequence lengths correlated negatively with half-life and folding energies correlated positively with half-life in both yeast species, whereas correlations of GC content varied with species and gene regions.

In the following subsections, we describe first the findings for each of the five gene regions and then a model that integrates all these sequence features.

Upstream AUGs destabilize mRNAs by triggering nonsense-mediated decay

Occurrence of an upstream AUG (uAUG) associated significantly with shorter half-life (median fold-change = 1.37, P < 2 × 10−16). This effect was strengthened for genes with two or more uAUGs (Fig. 2A,B). Among the 34 knockout strains, the association between uAUG and shorter half-life was almost lost only for mutants of UPF2 and UPF3, two essential components of nonsense-mediated mRNA decay (NMD) (Leeds et al. 1992; Cui et al. 1995), and for the general 5′ to 3′ exonuclease Xrn1 (Fig. 2A; Supplemental Fig. S6). The dependence on NMD suggested that the association might be due to the occurrence of a premature stop codon. Consistent with this hypothesis, the association of uAUG with decreased half-lives was only found for genes with a premature stop codon cognate with the uAUG (Fig. 2C). This held not only for cognate premature stop codons within the 5′ UTR, leading to a potential upstream ORF, but also for cognate premature stop codons within the ORF, which occurred almost always for uAUGs out-of-frame with the main ORF (Fig. 2C). This finding likely holds for many other eukaryotes, as we found the same trends in S. pombe (Fig. 2D). These observations are consistent with a single-gene study demonstrating that translation of upstream ORFs can lead to RNA degradation by NMD (Gaba et al. 2005) and with the finding that uORFs are enriched in NMD substrates (Celik et al. 2017). Altogether, these results show that uAUGs are mRNA destabilizing elements because they are almost always paired with cognate premature stop codons, which trigger NMD whether or not they are in frame with the gene and whether they lie within the UTR or in the coding region.
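
The grouping of uAUGs by the position of their cognate stop codon can be sketched as follows. This is a minimal illustration under stated assumptions: sequences are given as DNA, the coding sequence ends with its annotated stop codon, the 3′ UTR is ignored, and the category labels and the function name are hypothetical rather than the ones used in the study.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def classify_uaugs(utr5, cds):
    """Classify each ATG in the 5' UTR by where its first in-frame stop lies.

    Returns a list of (position, label) tuples; positions are 0-based
    offsets within the concatenated 5' UTR + CDS sequence.
    """
    transcript = (utr5 + cds).upper()
    cds_start = len(utr5)
    main_stop = len(transcript) - 3  # start of the annotated stop codon
    calls = []
    pos = transcript.find("ATG", 0, cds_start)
    while pos != -1:
        # Walk codon by codon in the reading frame set by this uAUG.
        stop = None
        for i in range(pos + 3, len(transcript) - 2, 3):
            if transcript[i:i + 3] in STOP_CODONS:
                stop = i
                break
        if stop is None:
            label = "no_cognate_stop"      # frame runs past the annotated stop
        elif stop + 3 <= cds_start:
            label = "uORF"                 # stop lies entirely in the 5' UTR
        elif stop < main_stop:
            label = "PTC_in_CDS"           # premature stop before the main stop
        else:
            label = "in_frame_no_PTC"      # in frame with the main ORF
        calls.append((pos, label))
        pos = transcript.find("ATG", pos + 1, cds_start)
    return calls
```

In this sketch a stop codon that straddles the 5′ UTR/CDS boundary is counted as a premature stop in the CDS; a different convention would be equally defensible.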

FIGURE 2.

Upstream AUG codons (uAUG) destabilize mRNA. (A) Distribution of mRNA half-lives for mRNAs without uAUG (left) and with at least one uAUG (right). From left to right: wild type, XRN1, UPF2, and UPF3 knockout S. cerevisiae strains. Median fold-change (Median FC) was calculated by dividing the median of the group without uAUG by the median of the group with uAUG. A complete view of the effect of uAUG across different knockouts is provided in Supplemental Figure S6. (B) Distribution of mRNA half-lives for mRNAs with zero (left), one (middle), or more (right) uAUGs in S. cerevisiae. (C) Distribution of mRNA half-lives for S. cerevisiae mRNAs with, from left to right: no uAUG, with one in-frame uAUG but no cognate premature termination codon, with one out-of-frame uAUG and one cognate premature termination codon in the CDS, and with one uAUG and one cognate stop codon in the 5′ UTR (uORF). (D) Same as in C for S. pombe mRNAs. All P-values were calculated with Wilcoxon rank-sum test. Numbers in the boxes indicate number of members in the corresponding group. Boxes represent quartiles, whiskers extend to the highest or lowest value within 1.5 times the interquartile range, and horizontal bars in the boxes represent medians. Data points falling further than 1.5-fold the interquartile distance are considered outliers and are shown as dots.

Translation initiation sequence features associate with mRNA stability

Several sequence features in the 5′ UTR including the start codon context associated with mRNA half-life (Supplemental Information; Supplemental Figs. S4–S5). This indicates that 5′ UTR elements may affect mRNA stability by altering translation initiation. However, none of these sequence features remained significant in the final joint model. Our analysis is therefore not conclusive on this point. A detailed analysis is provided in the Supplemental Information for interested readers.

Codon usage regulates mRNA stability through common mRNA decay pathways

When using the frequency of each codon as an independent covariate, codon usage marginally explained 55% of the between-gene half-life variation in S. cerevisiae on test data (linear regression, Materials and Methods, Fig. 3A). The species-specific tRNA adaptation index (sTAI) (Sabi and Tuller 2014) correlated significantly and positively with the coefficients for codons in this regression [Supplemental Fig. S4E, r = 0.48 with log(sTAI), P = 0.0001, Materials and Methods], confirming the association between codon optimality and mRNA stability (Presnyak et al. 2015; Harigaya and Parker 2016). We also performed regression against gene-level sTAI. However, it yielded significant yet less accurate predictions (40% explained variance on test data). We therefore proceeded with modeling the frequency of each codon as an independent covariate.
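
The codon covariates can be illustrated with a short sketch that turns a coding sequence into a vector of 61 sense-codon frequencies. The normalization (count divided by the number of sense codons in the gene) and the function name are assumptions made for illustration; the exact definition used in the study is given in its Materials and Methods.

```python
from collections import Counter
from itertools import product

STOP_CODONS = {"TAA", "TAG", "TGA"}
# The 61 sense codons, in a fixed alphabetical order.
SENSE_CODONS = sorted(
    "".join(c) for c in product("ACGT", repeat=3) if "".join(c) not in STOP_CODONS
)

def codon_frequencies(cds):
    """Return the relative frequency of each sense codon in a coding sequence."""
    cds = cds.upper()
    # Split into complete codons, ignoring any trailing partial codon.
    codons = [cds[i:i + 3] for i in range(0, len(cds) - len(cds) % 3, 3)]
    counts = Counter(c for c in codons if c in SENSE_CODONS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("coding sequence contains no sense codons")
    return {codon: counts[codon] / total for codon in SENSE_CODONS}
```

Each gene then contributes one row of 61 values to the design matrix of the regression.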

FIGURE 3.

Codon usage regulates mRNA stability through common mRNA decay pathways. (A) Predicted mRNA half-life using only codons as features (linear regression) versus measured mRNA half-life. (B) mRNA half-life explained variance (y-axis, Materials and Methods) in wild-type (WT) and across all 34 knockout strains (grouped according to their functions). Each blue dot represents one replicate; bar heights indicate means across replicates. Bars with a red star are significantly different from the wild-type level (FDR < 0.1, Wilcoxon rank-sum test, followed by Benjamini–Hochberg correction).

Next, we quantified how much variation of mRNA half-life can be explained by codons in different knockout strains using the out-of-folds explained variance as a summary statistic (Supplemental Methods). The effect of codon usage exclusively depended on the genes from the common deadenylation- and decapping-dependent 5′ to 3′ mRNA decay pathway and the NMD pathway (all FDR < 0.1, Fig. 3B). In particular, all assessed genes of the Ccr4–Not complex, including CCR4, NOT3, CAF40, and POP2, were required for wild-type level effects of codon usage on mRNA decay. Among them, CCR4 has the largest effect. This confirmed a recent study in zebrafish showing that accelerated decay of nonoptimal codon genes requires deadenylation activities of Ccr4–Not (Mishima and Tomari 2016). In contrast to genes of the Ccr4–Not complex, PAN2/3 genes that also encode deadenylation enzymes were not found to be essential for the coupling between codon usage and mRNA decay (Fig. 3B).
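
The FDR thresholds quoted here rely on the Benjamini–Hochberg correction named in the legend of Figure 3. A minimal pure-Python sketch of that standard procedure is given below; the function name is hypothetical and the code is not the authors' implementation.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted P-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Go from the largest P-value to the smallest, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

A knockout strain would then be flagged as different from wild type when its adjusted P-value falls below the chosen threshold, here 0.1.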

Furthermore, our results confirm the dependence on Dhh1 (Radhakrishnan et al. 2016) and additionally reveal a dependence on its interacting partner Pat1. The difference might come from the fact that we analyzed genome-wide half-life data, whereas the mRNA half-life measurements of Radhakrishnan and colleagues were performed only on reporter genes.

Our systematic analysis revealed two additional novel dependencies: First, on the common 5′ to 3′ exonuclease Xrn1, and second, on UPF2 and UPF3 genes, which are essential players of NMD (all FDR < 0.1, Fig. 3B). Consistently, previous studies have shown that UPF genes are involved in more than just the degradation of nonsense messages, but rather target a wide range of mRNAs, including aberrant and normal ones (He et al. 2003; Hug et al. 2015). In line with this, substrates of Upf proteins have lower codon optimality (Celik et al. 2017). Furthermore, we did not observe any change of effect upon knockout of DOM34 and HBS1 (Fig. 3B), which are essential genes for the No-Go decay pathway. This implies that the effect of codon usage is unlikely due to stalled ribosomes at nonoptimal codons.

Altogether, our analysis indicates that the so-called “codon-mediated decay” (Mishima and Tomari 2016) is not an mRNA decay pathway itself, but a regulatory mechanism of the common mRNA decay pathways.

Stop codon context associates with mRNA stability

The first nucleotide 3′ of the stop codon significantly associated with mRNA stability. This association was observed for each of the three possible stop codons, and for each codon a cytosine significantly associated with lower half-life (Supplemental Fig. S4, also for P-values and fold-changes). However, this feature was not significant in the joint model, and analysis of the knockout strains did not reveal clear pathway dependencies for it (Supplemental Fig. S6). A detailed description is provided in the Supplemental Information for interested readers.

Sequence motifs in 3′ UTR

De novo motif search identified four motifs in the 3′ UTR to be significantly associated with mRNA stability (Fig. 4A, Materials and Methods). These include three described motifs: the Puf3 binding motif TGTAAATA (FDR = 3.2 × 10−5, median fold-change 1.29) (Gerber et al. 2004; Gupta et al. 2014), the Whi3 binding motif TGCAT (FDR = 7 × 10−4, median fold-change 1.24) (Colomina et al. 2008; Cai and Futcher 2013), and a poly(U) motif TTTTTTA (FDR = 0.09, median fold-change 1.20), which can be bound by Pub1 (Duttagupta et al. 2005), or is part of the long poly(U) stretch that forms a looping structure with a poly(A) tail (Geisberg et al. 2014). Moreover, an uncharacterized motif, ATATTC, was associated with lower mRNA half-life (FDR = 2 × 10−5, median fold-change 1.24). Genes harboring the ATATTC motif are significantly enriched for genes involved in oxidative phosphorylation (Bonferroni corrected P < 0.01, 4.4-fold enrichment, Gene Ontology analysis, Supplemental Methods; Supplemental Table S3). The motif ATATTC preferentially localizes in the vicinity of the poly(A) site (Fig. 4B), and functionally depends on Ccr4 (FDR < 0.1, Supplemental Fig. S6), suggesting a potential interaction with deadenylation factors. Notably, the motif ATATTC was found in 13% of the genes (591 out of 4388) and significantly co-occurred with the other two destabilizing motifs found in the 3′ UTR: the Puf3 (FDR = 0.01) and Whi3 (FDR = 3 × 10−3) binding motifs (Fig. 4F). This 3′ UTR motif had been computationally identified by conservation analysis (Kellis et al. 2003), by regression of steady-state expression levels (Foat et al. 2005), and by enrichment analysis within gene expression clusters (Elemento et al. 2007). It was proposed to name the motif PRSE (positive response to starvation element) because of its enrichment among genes that are up-regulated upon starvation (Foat et al. 2005). However, it had not been experimentally validated as a regulator of mRNA stability.
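
The per-motif statistics quoted in this paragraph rest on two elementary operations: counting motif occurrences in a 3′ UTR and comparing median half-lives between groups of genes. The sketch below illustrates both under simple assumptions (exact matching on DNA sequence, overlapping occurrences counted, fold-change defined as in the legend of Figure 2); the function names are hypothetical.

```python
import statistics

def count_motif(utr3, motif):
    """Count exact occurrences of a motif, allowing overlapping matches."""
    count = 0
    start = utr3.find(motif)
    while start != -1:
        count += 1
        start = utr3.find(motif, start + 1)
    return count

def median_fold_change(half_lives_without, half_lives_with):
    """Median half-life of genes without the motif over genes with it."""
    return statistics.median(half_lives_without) / statistics.median(half_lives_with)
```

With this convention, a value above 1 indicates that genes carrying the motif tend to have shorter half-lives.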

FIGURE 4.

3′ UTR half-life determinant motifs in S. cerevisiae. (A) Distribution of half-lives for mRNAs grouped by the number of occurrence(s) of the motif ATATTC, TGCAT (Whi3), TGTAAATA (Puf3), and TTTTTTA (Pub1), respectively, in their 3′ UTR sequence. Numbers in the boxes represent the number of members in each box. FDR were reported from the linear mixed effect model (Materials and Methods). (B) Fraction of transcripts containing the motif (y-axis) within a 20-bp window centered at a position (x-axis) with respect to poly(A) site for different motifs (facet titles). Positional bias was not observed when aligning 3′ UTR motifs with respect to the stop codon. (C) Prediction of the relative effect on half-life (y-axis) for single-nucleotide substitution in the motif with respect to the consensus motif (y = 1, horizontal line). The motifs were extended two bases at each flanking site (positions +1, +2, −1, −2). (D) Nucleotide frequency within motif instances, when allowing for one mismatch compared with the consensus motif. (E) Mean conservation score (phastCons, Materials and Methods) of each base in the consensus motif with two flanking nucleotides (y-axis). (F) Co-occurrence significance (FDR, Fisher test P-value corrected with Benjamini–Hochberg) between different motifs (left). Number of occurrences among the 4388 mRNAs (right). (G) Steady-state expression level of SFG1 and NYV1 (normalized by ACT1 and TUB2 expression, Supplemental Methods). Bar height represents mean of each group, error bars represent ± one standard error of the mean, each dot represents one biological replicate (jittered at x-axis to avoid overlapping). P-values were calculated by comparing the normalized expression level of constructs with two scrambled motifs embedded versus that with two functional ATATTC motifs embedded (Wilcoxon rank-sum test).

We validated the 3′ UTR motif ATATTC with a reporter assay on two different genes, SFG1 and NYV1. Given the predicted small effect of a single motif, we generated constructs with two instances of the motif and compared them to constructs harboring two scrambled motifs at the same locations (Fig. 4G, Materials and Methods). Both reporter genes showed decreased expression levels compared to scrambled controls (P = 0.019 for SFG1, P = 0.00016 for NYV1, Wilcoxon rank-sum test). Since the 3′ UTR motif ATATTC is not significantly associated with mRNA synthesis rate (P = 0.38, Wilcoxon rank-sum test, synthesis rate of genes without motif versus genes with motif), we conclude that this decreased expression is due to decreased stability.

Consistent with the role of Puf3 in recruiting deadenylation factors, Puf3 binding motif localized preferentially close to the poly(A) site (Fig. 4B). The effect of the Puf3 motifs was significantly lower in the knockout of PUF3 (FDR < 0.1, Supplemental Fig. S6). We also found a significant dependence on the deadenylation (CCR4, POP2) and decapping (DHH1, PAT1) pathways (all FDR < 0.1, Supplemental Fig. S6), consistent with previous single gene experiments showing that Puf3 binding promotes both deadenylation and decapping (Olivas and Parker 2000; Goldstrohm et al. 2007). Strikingly, the Puf3 binding motif switched to a stabilization motif in the absence of Puf3 and Ccr4 (all FDR < 0.1, Supplemental Fig. S6), suggesting that deadenylation of the Puf3 motif containing mRNAs is not only facilitated by Puf3 binding, but also depends on it.

Whi3 plays an important role in cell cycle control (Garí et al. 2001). Binding of Whi3 leads to destabilization of the CLN3 mRNA (Cai and Futcher 2013). A subset of yeast genes are up-regulated in the Whi3 knockout strain (Cai and Futcher 2013). However, so far it was unclear whether Whi3 generally destabilizes mRNAs upon its binding. Our analysis showed that mRNAs containing the Whi3 binding motif (TGCAT) have a significantly shorter half-life (FDR = 6.9 × 10−4, median fold-change 1.24). Surprisingly, this binding motif is extremely widespread: 896 of the 4388 genes that we examined (20%) contain the motif in their 3′ UTR, and these genes are enriched for several biological processes (Supplemental Table S3). Functionality of the Whi3 binding motif was found to be dependent on Ccr4 (FDR < 0.1, Supplemental Fig. S6).

The mRNAs harboring the TTTTTTA motif tended to be more stable (FDR = 0.086, median fold-change 1.22) and were enriched for genes involved in translation (P = 1.34 × 10−3, twofold enrichment; Supplemental Table S3). No positional preferences were observed for this motif (Fig. 4B). The effect of this motif depends on genes of the Ccr4–Not complex and on Xrn1 (Supplemental Fig. S6).

An additional four lines of evidence further supported the functionality of our identified motifs. First, single-nucleotide deviations from the motif's consensus sequence associated with decreased effects on half-life (Fig. 4C, linear regression allowing for one mismatch, Materials and Methods). Moreover, the flanking nucleotides did not show further associations indicating that the whole lengths of the motifs were recovered (Fig. 4C). Second, when allowing for one mismatch, the motif still showed strong preferences (Fig. 4D). Third, the motif instances were more conserved than their flanking bases from the 3′ UTR (Fig. 4E). Fourth, all four motifs show significant effects in the RNA half-life data set generated by Miller et al. (2011), which is also based on 4sU labeling, as well as in the data set of Presnyak et al. (2015), which is in contrast based on transcriptional arrest (Supplemental Fig. S7).
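
The analyses that allow for one mismatch require locating motif instances that deviate from the consensus at no more than one position. A minimal sketch of such a search, based on the Hamming distance between each sequence window and the consensus, is shown below; the function name and the return format are assumptions made for illustration.

```python
def motif_instances(sequence, motif, max_mismatches=1):
    """Find windows that differ from the motif at no more than max_mismatches positions.

    Returns a list of (position, matched_sequence, n_mismatches) tuples.
    """
    hits = []
    width = len(motif)
    for i in range(len(sequence) - width + 1):
        window = sequence[i:i + width]
        mismatches = sum(a != b for a, b in zip(window, motif))
        if mismatches <= max_mismatches:
            hits.append((i, window, mismatches))
    return hits
```

For example, `motif_instances("GGATATTCGGATACTC", "ATATTC")` reports the exact instance at position 2 and the single-mismatch instance ATACTC at position 10. Instances found this way could then be grouped by the position and identity of the deviating base before estimating their effects.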

Fifty-nine percent of between-gene half-life variation can be explained by sequence features

We next asked how well one could predict mRNA half-life from these mRNA sequence features, and what their respective contributions were when considered jointly. To this end, we performed a multivariate linear regression of the logarithm of the half-life against the identified sequence features. The predictive power of the model on unseen data was assessed using 10-fold cross-validation (Materials and Methods; a complete list of model features and their P-values is provided in Supplemental Table S4). To prevent overfitting, we performed motif discovery on each of the 10 training sets and observed the same set of motifs across all the folds. Altogether, 59% of S. cerevisiae half-life variance in the logarithmic scale can be explained by simple linear combinations of the above sequence features (Fig. 5A; Supplemental Table S5). The median out-of-folds relative error across genes is 30%. A median relative error of 30% for half-life is remarkably low because it is on the order of the expression variation that is typically physiologically tolerated, and it is also about the amount of variation observed between replicate experiments (Eser et al. 2016). To make sure that our findings are not biased toward a specific data set, we fitted the same model to a data set obtained with RATE-seq (Neymotin et al. 2014), a modified version of the protocol used by Sun et al. (2013). On these data, the model was able to explain 51% of the variance (Supplemental Fig. S8). Moreover, the same procedure applied to S. pombe explained 45% of the total half-life variance, suggesting the generality of this approach. Because the measurements also contain measurement noise, these numbers are conservative underestimates of the total biological variance explained by our model.
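
The two summary statistics used here, out-of-folds explained variance and median relative error, can be illustrated with a deliberately simplified sketch. It uses a single covariate instead of the full multivariate model, deterministic interleaved folds instead of random ones, and natural logarithms; these choices, the metric definitions, and all function names are assumptions for illustration and may differ from the definitions in the Materials and Methods.

```python
import math
import statistics

def fit_line(x, y):
    """Ordinary least squares for one covariate; returns (intercept, slope)."""
    mean_x, mean_y = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def out_of_fold_predictions(x, log_half_life, k=10):
    """Predict each gene from a model fitted without the fold containing it."""
    n = len(x)
    predictions = [0.0] * n
    for fold in range(k):
        held_out = set(range(fold, n, k))  # every k-th gene forms one fold
        train = [i for i in range(n) if i not in held_out]
        intercept, slope = fit_line(
            [x[i] for i in train], [log_half_life[i] for i in train]
        )
        for i in held_out:
            predictions[i] = intercept + slope * x[i]
    return predictions

def explained_variance(observed, predicted):
    """One minus residual sum of squares over total sum of squares."""
    mean_obs = statistics.fmean(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot

def median_relative_error(log_observed, log_predicted):
    """Median of |predicted - observed| / observed on the natural scale."""
    return statistics.median(
        abs(math.exp(p) - math.exp(o)) / math.exp(o)
        for o, p in zip(log_observed, log_predicted)
    )
```

Because every prediction comes from a model that never saw the corresponding gene, both statistics estimate performance on unseen data rather than goodness of fit.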

FIGURE 5.

Genome-wide prediction of mRNA half-life from sequence features and analysis of the contributions. (A,B) mRNA half-life predicted (x-axis) versus measured (y-axis) for S. cerevisiae (A) and S. pombe (B), respectively. (C) Contribution of each sequence feature individually (Individual), cumulatively when sequentially added into a combined model (Cumulative), and explained variance drop when each single feature is removed from the full model separately (Drop). Values reported are the mean of 100 times of cross-validated evaluation (Materials and Methods). (D) Expected half-life fold-change of single-nucleotide variations on sequence features. For length and GC, dots represent median half-life fold-change of one nucleotide shorter or one G/C to A/T transition, respectively. For codon usage, each dot represents median half-life fold-change of one type of synonymous mutation; all kinds of synonymous mutations are considered. For uAUG, each dot represents median half-life fold-change of mutating out one uAUG. For motifs, each dot represents median half-life fold-change of one type of nucleotide transition at one position on the motif (Materials and Methods). Medians are calculated across all mRNAs.

The uAUG, 5′ UTR length, 5′ UTR GC content, 61 coding codons, CDS folding energy, all four 3′ UTR motifs, and 3′ UTR length remained significant in the joint model, indicating that they contributed individually to half-life (Supplemental Table S4). Most of them showed decreased effect in a joint model compared to marginal effects (Fig. 5C), likely because they correlate with each other. In contrast, start codon context, stop codon context, 5′ folding energy, the 5′ UTR motif AAACAAA (Supplemental Fig. S5), CDS length, and 3′ UTR GC content dropped below significance when considered in the joint model (Supplemental Table S4). This loss of statistical significance may be due to lack of statistical power. Another possibility is that the marginal association of these sequence features with half-life is a consequence of a correlation with other sequence features. Among all sequence features, codon usage as a group is the best predictor both in a univariate model (55.29%) and in the joint model (44.63%) (Fig. 5C). This shows that, quantitatively, codon usage is the major determinant of mRNA stability in yeast. This explains why only a small fraction of mRNA stability variation can be explained by RNA-binding proteins (Hasan et al. 2014). The variance analysis quantifies the contribution of each sequence feature to the variation across genes. Features that vary a lot between genes, such as UTR length and codon usage, contribute strongly to the variation. However, this does not reflect the effect on a given gene of elementary sequence variations in these features. For instance, a single-nucleotide variant can lead to the creation of a uAUG with a strong effect on half-life, but a single-nucleotide variant in the coding sequence may have little impact on overall codon usage.

We used the joint model to assess the sensitivity of each feature to single-nucleotide mutations as median fold-change across genes, simulating single-nucleotide deletions for the length features and single-nucleotide substitutions for the remaining ones (Materials and Methods). Single-nucleotide variations typically altered half-life by <10%. The largest effects were observed in the 3′ UTR motifs and uAUG (Fig. 5D). Notably, although codon usage was the major contributor to the variance, synonymous variation on codons typically affected half-life by <2% (Fig. 5D; Supplemental Fig. S9). For those synonymous variations that changed half-life by more than 2%, most of them were variations that involved the most nonoptimized codons CGA or ATA (Supplemental Fig. S9; Presnyak et al. 2015).
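
The contrast between a large contribution to variance and a small single-nucleotide effect follows directly from a log-linear model in which codons enter as frequencies. The sketch below assumes that the codon covariate is the count divided by the number of codons in the gene and that coefficients are on the natural-log scale; the coefficient values are hypothetical and the function names are illustrative.

```python
import math

def synonymous_fold_change(beta_old, beta_new, n_codons):
    """Fold-change in half-life when one codon is replaced by a synonymous one.

    The frequency of the old codon drops by 1/n_codons and that of the new
    codon rises by 1/n_codons, so the log half-life shifts by their scaled
    coefficient difference.
    """
    return math.exp((beta_new - beta_old) / n_codons)

def binary_feature_fold_change(beta):
    """Fold-change when a presence/absence feature (e.g., a uAUG) is created."""
    return math.exp(beta)

# Hypothetical coefficients, chosen only to illustrate the orders of magnitude.
codon_effect = synonymous_fold_change(beta_old=-1.0, beta_new=2.0, n_codons=400)
uaug_effect = binary_feature_fold_change(-0.3)
```

With these hypothetical numbers the synonymous substitution changes half-life by less than 1%, because its effect is diluted over 400 codons, whereas creating a uAUG reduces half-life by about a quarter.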

Altogether, our results show that most of yeast mRNA half-life variation can be predicted from mRNA sequence alone, with codon usage being the major contributor. However, single-nucleotide variation at 3′ UTR motifs or uAUG had the largest expected effect on mRNA stability.

DISCUSSION

We systematically searched for mRNA sequence features associating with mRNA stability and estimated their effects at single-nucleotide resolution in a joint model. Except for GC content and length, all elements of the joint model are causal. One of them, the 3′ UTR motif ATATTC, has been validated in this study. Overall, the joint model showed that 59% of the variance could be predicted from mRNA sequence alone in S. cerevisiae. This analysis showed that translation-related features, in particular codon usage, contributed most to the explained variance. This finding further strengthens the importance of the coupling between translation and mRNA degradation (Roy and Jacobson 2013; Huch and Nissan 2014; Radhakrishnan and Green 2016). Moreover, we assessed the dependencies of each sequence element on RNA degradation pathways. Remarkably, we identified that codon-mediated decay is a regulatory mechanism of the canonical decay pathways, including deadenylation- and decapping-dependent 5′ to 3′ decay and NMD (Figs. 3B, 6).

FIGURE 6.

Overview and summary of conclusions from this study.

Predicting various steps of gene expression from sequence alone has long been a subject of study (Beer and Tavazoie 2004; Vogel et al. 2010; Zur and Tuller 2013; Wang et al. 2016). To this end, two distinct classes of models have been proposed: biophysical models on the one hand and machine learning models on the other (Zur and Tuller 2016). Biophysical models provide detailed understanding of the processes. Machine learning approaches, in contrast, can reach much higher predictive accuracy but are more difficult to interpret. Also, machine learning approaches can pick up signals with predictive power that are correlative but not causal. Here we adopted an intermediate, semimechanistic modeling approach. We used a simple linear model that is interpretable. Also, all elements are functional, except for two covariates: GC content and length.

Our approach was based on the analysis of endogenous sequence, which allowed the identification of a novel cis-regulatory element. An alternative approach to the modeling of endogenous sequence is to use large-scale synthetic libraries (Dvir et al. 2013; Shalem et al. 2015; Wissink et al. 2016). Although such libraries are very powerful for dissecting known cis-regulatory elements or for investigating small variations around select genes, the sequence space is so large that these large-scale perturbation screens cannot uncover all regulatory motifs. It would be interesting to combine both approaches and design large-scale validation experiments guided by insights coming from modeling of endogenous sequences as we developed here.

Recently, Neymotin et al. (2016) showed that several translation-related transcript properties associated with half-life. This study derived a model explaining 50% of the total variance using many transcript properties including some not based on sequence (ribosome profiling, expression levels, etc.). Although non-sequence-based predictors can facilitate prediction, they may do so because they are consequences rather than causes of half-life. For instance, increased half-life causes higher expression level. Also, increased cytoplasmic half-life provides a higher ratio of cytoplasmic over nuclear RNA, and thus more RNAs available to ribosomes. Hence both expression level and ribosome density may help make good predictions of half-life, but not necessarily because they causally increase half-life. In contrast, we aimed here to understand how mRNA half-life is encoded in mRNA sequence and derived a model that is based on functional elements. This avoided using transcript properties that could be consequences of mRNA stability. Hence, our present analysis confirms the quantitative importance of translation in determining mRNA stability that Neymotin and colleagues quantified, and anchors it into pure sequence elements.

Confounding associations of sequence elements with mRNA stability could arise because of selection on expression levels acting at multiple stages of gene expression. For instance, genes that are selected for high protein expression levels may be enriched for elements that enhance translation and for elements that enhance mRNA stability. Functional validations are therefore needed to disentangle causality from co-selection. The sequence elements of our joint model, except for GC content and length, are all functional. However, we reported further elements that associate marginally with half-life. One of the interesting sequence elements that we found associated with half-life but that did not turn out to be significant in the joint model is the start codon context. Given its established effect on translation initiation (Kozak 1986; Dvir et al. 2013), the general coupling between translation and mRNA degradation (Roy and Jacobson 2013; Huch and Nissan 2014; Radhakrishnan and Green 2016), as well as several observations directly on mRNA stability for single genes (LaGrandeur and Parker 1999; Schwartz and Parker 1999), the start codon context may nonetheless functionally affect mRNA stability. Consistent with this hypothesis, large-scale experiments that perturb 5′ sequence secondary structure and start codon context indeed showed a wide range of mRNA level changes in the direction that we would predict (Dvir et al. 2013).

We are not aware of previous studies that systematically assessed the effects of cis-regulatory elements in the context of knockout backgrounds, as we did here. This part of our analysis turned out to be very insightful. By assessing the dependencies of codon usage-mediated mRNA stability control systematically and comprehensively, we generalized results from recent studies on the Ccr4–Not complex and Dhh1, but also identified important novel ones including NMD factors, Pat1 and Xrn1. With the growing availability of knockout or mutant backgrounds in model organisms and human cell lines, we anticipate this approach to become a fruitful methodology to unravel regulatory mechanisms.

MATERIALS AND METHODS

Data and genomes

Wild-type and knockout genome-wide S. cerevisiae half-life data were obtained from Sun et al. (2013), whereby all strains are histidine, leucine, methionine, and uracil auxotrophs. A complete list of knockout strains used in this study is provided in Supplemental Table S1. S. cerevisiae gene boundaries were taken from the boundaries of the most abundant isoform quantified by Pelechano et al. (2013). The reference genome fasta file and genome annotation were obtained from the Ensembl database (release 79). UTR regions were defined by subtracting the gene body (exons and introns from the Ensembl annotation) from the gene boundaries. The processed S. cerevisiae UTR annotation is provided in Supplemental Table S6.
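The UTR definition above amounts to an interval subtraction, which can be sketched as follows. This is an illustration only, not the authors' script: the function name `utr_intervals` and the coordinate convention (0-based, half-open, on the forward reference strand) are assumptions made for the example.

```python
def utr_intervals(tx_start, tx_end, body_start, body_end, strand):
    """Return (five_prime_utr, three_prime_utr) as half-open (start, end) tuples.

    tx_start/tx_end: transcript (gene) boundaries, e.g., from isoform data.
    body_start/body_end: gene body (exons plus introns) from the annotation.
    A UTR of length zero is reported as None.
    Coordinates are assumed to be 0-based and half-open (hypothetical convention).
    """
    if not (tx_start <= body_start <= body_end <= tx_end):
        raise ValueError("gene body must lie within the transcript boundaries")

    # What remains on each side after removing the gene body.
    left = (tx_start, body_start) if body_start > tx_start else None
    right = (body_end, tx_end) if tx_end > body_end else None

    if strand == "+":
        return left, right
    if strand == "-":
        # On the reverse strand the 5' UTR lies at the higher coordinates.
        return right, left
    raise ValueError("strand must be '+' or '-'")
```

For a forward-strand transcript spanning 100 to 1000 with a gene body from 150 to 900, this gives a 5′ UTR of (100, 150) and a 3′ UTR of (900, 1000); on the reverse strand the two are swapped.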

Genome-wide half-life data of S. pombe as well as refined transcription unit annotation were obtained from Eser et al. (2016). Reference genome version ASM294v2.26 was used to obtain sequence information. Half-life outliers of S. pombe (half-life less than 1 or larger than 250 min) were removed.

For both half-life data sets, only mRNAs with mapped 5′ UTR and 3′ UTR were considered. mRNAs with 5′ UTR length shorter than 6 nt were further filtered out.
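The filtering rules stated above can be written as a single predicate. The sketch below assumes a hypothetical record layout (a dictionary with `half_life`, `utr5_len`, and `utr3_len` keys, where a missing UTR is `None`); these names are not taken from the original analysis scripts.

```python
MIN_UTR5_LEN = 6                 # mRNAs with a 5' UTR shorter than 6 nt are dropped
HALF_LIFE_BOUNDS = (1.0, 250.0)  # minutes; applied to the S. pombe data only


def passes_filters(record, apply_half_life_bounds=False):
    """Return True if an mRNA record survives the filters described in the text.

    record: dict with keys 'half_life' (min), 'utr5_len', 'utr3_len'
            (hypothetical layout; None marks an unmapped UTR).
    apply_half_life_bounds: set True for the S. pombe data set.
    """
    utr5 = record.get("utr5_len")
    utr3 = record.get("utr3_len")

    # Both UTRs must be mapped.
    if utr5 is None or utr3 is None:
        return False
    # Very short 5' UTRs are excluded.
    if utr5 < MIN_UTR5_LEN:
        return False
    # Outliers are half-lives below 1 min or above 250 min.
    if apply_half_life_bounds:
        low, high = HALF_LIFE_BOUNDS
        if record["half_life"] < low or record["half_life"] > high:
            return False
    return True
```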

Codon-wise species-specific tRNA adaptation indices (sTAIs) of yeasts were obtained from Sabi and Tuller (2014). The sTAI of each gene was calculated as the geometric mean of the sTAIs of all its codons (stop codon excluded).

Analysis of knockout strains

The effect level of an individual sequence feature was compared against the wild-type with a Wilcoxon rank-sum test followed by multiple hypothesis testing P-value correction (FDR < 0.1). For details, see Supplemental Methods.
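As an illustration of the multiple-testing step, the following sketch computes FDR-adjusted P-values. The text here does not name the correction procedure; the Benjamini–Hochberg method is assumed because it is the one stated for motif discovery in this study. The function name is hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted P-values in the input order.

    Assumption: BH is used for the FDR correction (not stated explicitly
    for the knockout analysis in the text).
    """
    m = len(pvalues)
    # Indices of the P-values sorted in ascending order.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def significant(pvalues, fdr=0.1):
    """Flag tests whose adjusted P-value is below the FDR threshold."""
    return [q < fdr for q in benjamini_hochberg(pvalues)]
```

A feature-by-knockout comparison would then be called significant when its adjusted P-value falls below 0.1.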

Motif discovery

Motif discovery was conducted for the 5′ UTR, the CDS, and the 3′ UTR regions. A linear mixed effect model was used to assess the effect of each individual k-mer while controlling for the effects of the others and for the region length as a covariate, as described previously (Eser et al. 2016). For the CDS, we also used codons as further covariates. In contrast to Eser and colleagues, we tested the effects of all possible k-mers with lengths from 3 to 8. The linear mixed model for motif discovery was fitted with GEMMA software (Zhou et al. 2013). P-values were corrected for multiple testing using Benjamini–Hochberg's FDR. Motifs were subsequently manually assembled based on overlapping significant (FDR < 0.1) k-mers.
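The input to such a k-mer scan is a count of every k-mer of length 3 to 8 in each region. The sketch below shows only this feature-extraction step under simple assumptions (DNA alphabet, overlapping occurrences counted); it does not reproduce the mixed model fitted with GEMMA, and the function name is hypothetical.

```python
from collections import Counter


def count_kmers(seq, k_min=3, k_max=8):
    """Count all overlapping k-mers with k_min <= k <= k_max in a sequence."""
    seq = seq.upper()
    counts = Counter()
    for k in range(k_min, k_max + 1):
        for i in range(len(seq) - k + 1):
            counts[seq[i:i + k]] += 1
    return counts


# Size of the search space: all possible DNA k-mers with lengths 3 to 8.
N_POSSIBLE_KMERS = sum(4 ** k for k in range(3, 9))

# Example: two copies of the 3' UTR motif ATATTC placed back to back.
example_counts = count_kmers("ATATTCATATTC")
```

With four nucleotides, the number of candidate k-mers tested per region is 87,360, which is why the multiple-testing correction is needed.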

Folding energy calculation

RNA sequence folding energy was calculated with RNAfold from ViennaRNA version 2.1.9 (Lorenz et al. 2011), with default parameters.

S. cerevisiae conservation analysis

The phastCons (Siepel et al. 2005) conservation track for S. cerevisiae was downloaded from the UCSC Genome Browser (http://hgdownload.cse.ucsc.edu/goldenPath/sacCer3/phastCons7way/). Motif single-nucleotide level conservation scores were computed as the mean conservation score of each nucleotide (including two extended nucleotides at each side of the motif) across all motif instances genome-wide (removing NA values).
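The per-position averaging described above can be sketched as follows. Each motif instance is assumed to be supplied as a list of conservation scores covering the motif plus two flanking nucleotides on each side, with `None` standing for NA; this data layout and the function name are assumptions made for illustration.

```python
def mean_conservation_profile(instances):
    """Average conservation scores position by position across motif instances.

    instances: list of equal-length score lists, one per motif instance
               (motif positions plus flanks); None marks a missing value.
    Returns a list with one mean per position, or None where no instance
    has a score at that position.
    """
    if not instances:
        return []
    width = len(instances[0])
    if any(len(inst) != width for inst in instances):
        raise ValueError("all instances must cover the same number of positions")

    profile = []
    for pos in range(width):
        # NA values are removed before averaging.
        values = [inst[pos] for inst in instances if inst[pos] is not None]
        profile.append(sum(values) / len(values) if values else None)
    return profile
```

For a 6-nt motif such as ATATTC, each instance would hold 10 scores (2 + 6 + 2).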

Linear regression model for codon usage

Throughout the study, we modeled codon usage in the linear model with each codon as an independent covariate using its frequency.

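$$\log(y_g) = \beta_0 + \sum_{c \in \mathrm{Codons}} \beta_c x_c + \varepsilon_g, \tag{1}$$

where $x_c = n_c / L_g$, $n_c$ is the number of occurrences of codon $c$ in gene $g$, and $L_g$ is the CDS length of gene $g$.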
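The codon covariates of Equation 1 can be computed from a coding sequence as sketched below. Two choices here are assumptions rather than statements from the text: the terminal stop codon is excluded, and the CDS length is measured in nucleotides over the remaining coding codons, so that it equals three times the number of codons as in the sTAI derivation of this section. The function name is hypothetical.

```python
from collections import Counter

STOP_CODONS = {"TAA", "TAG", "TGA"}


def codon_frequencies(cds):
    """Return {codon: n_c / L_g} for a coding sequence.

    n_c is the count of codon c; L_g is the CDS length in nucleotides
    (stop codon excluded, by assumption).
    """
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("CDS length must be a multiple of 3")

    codons = [cds[i:i + 3] for i in range(0, len(cds), 3)]
    # Drop the terminal stop codon if present.
    if codons and codons[-1] in STOP_CODONS:
        codons = codons[:-1]
    if not codons:
        raise ValueError("CDS contains no coding codons")

    length_nt = 3 * len(codons)
    return {codon: n / length_nt for codon, n in Counter(codons).items()}
```

Under this convention the frequencies of a gene sum to 1/3, because each codon spans three nucleotides.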

Relation between codon regression coefficient and sTAI

The coefficients of codon frequencies have an interpretation analogous to that of the species-specific tRNA adaptation index (sTAI). The same also applies to tAI. The sTAI of a gene is defined as the geometric mean of the sTAIs of all its coding codons (Sabi and Tuller 2014). For a gene g with N codons, its sTAI is defined as follows:

$$\mathrm{sTAI}_g = \left(\prod_{i=1}^{N} w_i\right)^{1/N} = \sqrt[N]{w_1 w_2 \cdots w_N}, \tag{2}$$

where $w_i$ represents the sTAI of the $i$th codon in the gene.

The logarithm of the sTAI of a gene with N codons is

$$\log(\mathrm{sTAI}_g) = \frac{1}{N}\left(\sum_{i=1}^{N} \log(w_i)\right) = \sum_{c \in \mathrm{Codons}} 3\log(w_c)\,\frac{n_c}{3N} = \sum_{c \in \mathrm{Codons}} 3\log(w_c)\,x_c, \tag{3}$$

where $x_c$ is defined in Equation 1, $3N = L_g$ is the CDS length, $n_c$ is the number of occurrences of codon $c$ in gene $g$, and $w_c$ is the sTAI of codon $c$. Hence, in a linear model the regression coefficient $\beta_c$ of Equation 1 has an interpretation analogous to the log of sTAI [$\log(w_c)$].
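The identity in Equation 3 can be checked numerically. In the sketch below, the codon weights are made-up values chosen only for illustration (they are not sTAI values from Sabi and Tuller 2014), and the function names are hypothetical.

```python
import math


def log_stai_geometric(codons, weights):
    """log(sTAI_g) computed directly as the log of the geometric mean (Eq. 2)."""
    return sum(math.log(weights[c]) for c in codons) / len(codons)


def log_stai_from_frequencies(codons, weights):
    """log(sTAI_g) computed as sum_c 3 * log(w_c) * x_c (right side of Eq. 3)."""
    cds_length = 3 * len(codons)  # L_g = 3N
    total = 0.0
    for c in set(codons):
        x_c = codons.count(c) / cds_length  # x_c = n_c / L_g
        total += 3.0 * math.log(weights[c]) * x_c
    return total


# Hypothetical codon weights, for illustration only.
example_weights = {"GCT": 0.9, "CGA": 0.1, "AAG": 0.6}
example_gene = ["GCT", "GCT", "CGA", "AAG", "AAG", "GCT"]

direct = log_stai_geometric(example_gene, example_weights)
via_frequencies = log_stai_from_frequencies(example_gene, example_weights)
```

Both routes give the same value up to floating-point error, which is the sense in which a fitted coefficient $\beta_c$ plays the role of $3\log(w_c)$ in Equation 3.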

Linear model for genome-wide half-life prediction

Multivariate linear regression models were used to predict genome-wide mRNA half-life on the logarithmic scale from sequence features. Only mRNAs that contain all features were used to fit the models, resulting in 3838 mRNAs for S. cerevisiae and 3360 mRNAs for S. pombe. Out-of-fold predictions were applied with 10-fold cross validation for any prediction task in this study. For each fold, a linear model was first fitted to the training data with all sequence features as covariates, then a stepwise model selection procedure was applied to select the best model with the Bayesian Information Criterion as the criterion [step function in R, with k = log(n)]. L1 or L2 regularization was not necessary, as neither improved the out-of-fold prediction accuracy (tested with the glmnet R package [Friedman et al. 2010]). Motif discovery was performed again at each fold, using the training set only; the same set of motifs was identified in each case. For details, see Supplemental Methods.
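The out-of-fold scheme can be illustrated with a deliberately reduced example: a single covariate fitted by ordinary least squares, with each observation predicted by a model that never saw it. This is a sketch of the cross-validation logic only. It omits the multivariate model, the BIC-based stepwise selection, and the per-fold motif discovery, and all function names are hypothetical.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares for y = b0 + b1 * x; returns (b0, b1)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def out_of_fold_predictions(xs, ys, n_folds=10, seed=0):
    """Predict every observation from a model trained on the other folds."""
    indices = list(range(len(xs)))
    random.Random(seed).shuffle(indices)
    # Assign shuffled indices to folds in round-robin fashion.
    folds = [indices[i::n_folds] for i in range(n_folds)]

    predictions = [None] * len(xs)
    for held_out in folds:
        held = set(held_out)
        train = [i for i in indices if i not in held]
        b0, b1 = fit_line([xs[i] for i in train], [ys[i] for i in train])
        for i in held_out:
            predictions[i] = b0 + b1 * xs[i]
    return predictions
```

In the study itself, any data-dependent step (feature selection, motif discovery) is repeated inside each training fold, so that the held-out genes do not influence the model used to predict them.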

Analysis of sequence feature contribution

Linear models were first fitted on the complete data with all sequence features as covariates; nonsignificant sequence features were then removed from the final models, ending up with 69 features for the S. cerevisiae model and 76 features for S. pombe (each single-coding codon was fitted as a single covariate). The contribution of each sequence feature was analyzed individually as a univariate regression and also jointly in a multivariate regression model. The contribution of each feature individually was calculated as the variance explained by a univariate model. Features were then added in descending order of their individual explained variance to a joint model; “cumulative” variances explained were then calculated. The “drop” quantifies the decrease in variance explained when one feature at a time is left out of the full model. All contribution statistics were quantified by averaging over 100 repetitions of 10-fold cross-validation.
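The "individual" and "drop" statistics can be expressed in terms of explained variance computed from out-of-fold predictions. The sketch below uses one common definition of explained variance (one minus the ratio of residual to total sum of squares); the exact definition used in the study is not given in this section, so this choice, like the function names, is an assumption.

```python
def explained_variance(observed, predicted):
    """Fraction of variance explained: 1 - SS_residual / SS_total (assumed definition)."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def drop_in_explained_variance(observed, pred_full, pred_without_feature):
    """Decrease in explained variance when one feature is left out of the full model."""
    return (explained_variance(observed, pred_full)
            - explained_variance(observed, pred_without_feature))
```

Here `pred_full` would hold the out-of-fold predictions of the full model and `pred_without_feature` those of the same model refitted without the feature of interest; the "cumulative" statistic is the explained variance of successively larger models.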

Single-nucleotide variant effect predictions

The same model used in sequence feature contribution analysis was used for single-nucleotide variant effect prediction. For motifs, effects of single-nucleotide variants were predicted with the linear model modified from Eser et al. (2016). When assessing the effect of a given motif variation, instead of estimating the marginal effect size, we controlled for the effect of all other sequence features using a linear model with the other features as covariates. For details, see Supplemental Methods. For other sequence features, effects of single-nucleotide variants were predicted by introducing a single-nucleotide perturbation into the full prediction model for each gene, and summarizing the effect with the median half-life change across all genes. For details, see Supplemental Methods.
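For a log-linear model, the effect of a single perturbation on half-life follows directly from the change in the linear predictor. The sketch below works through the case of a synonymous substitution under the codon model of Equation 1: replacing one codon lowers the frequency of the old codon by 1/L_g and raises that of the new codon by the same amount. The natural logarithm is assumed for the model, the coefficient values in the example are invented, and the function name is hypothetical; motif variants are handled differently in the study, as described above.

```python
import math


def synonymous_fold_change(beta_old, beta_new, cds_length_nt):
    """Predicted half-life fold change for swapping one codon for a synonymous one.

    beta_old, beta_new: regression coefficients of the replaced and the new codon.
    cds_length_nt: CDS length L_g in nucleotides.
    Assumes half-life is modeled on the natural-log scale.
    """
    # Change in the linear predictor: (beta_new - beta_old) * (1 / L_g).
    delta_log_half_life = (beta_new - beta_old) / cds_length_nt
    return math.exp(delta_log_half_life)


# Invented coefficients for illustration: a strongly nonoptimal codon replaced
# by an optimal one in a 1500-nt CDS.
example_fold_change = synonymous_fold_change(beta_old=-10.0, beta_new=20.0,
                                             cds_length_nt=1500)
```

Because the coefficient difference is divided by the CDS length, a single synonymous change moves the prediction only slightly unless the two coefficients differ greatly, in line with the small effects of most synonymous variants reported in the Results.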

Construction of SFG1 and NYV1 mutant strains

One hundred base pair primers (IDT) containing the respective 3′ UTR mutations were used to amplify the kanMX cassette from plasmid pFA6a-KanMX6 (Euroscarf). PCR products were used for transformation of strain BY4741 (MATa his3Δ1 leu2Δ0 met15Δ0 ura3Δ0, Euroscarf) by homologous recombination, and transformants were selected on G418 plates. Correct clones were confirmed by sequencing. Details of the reporter assay design are provided in the Supplemental Methods. Sequences of the constructs are given in Supplemental Table S7.

Quantitative PCR

Cells were grown to OD600 0.8 in YPD from overnight cultures inoculated from single colonies. Cells were centrifuged at 4000 rpm for 1 min at 30°C and pellets were flash-frozen in liquid nitrogen. RNA was phenol/chloroform purified. cDNA synthesis was performed with 1.5 µg RNA using the Maxima Reverse Transcriptase (Thermo Fisher). qPCR was performed on a qTower 2.2 (Analytik Jena) using a 2-min denaturing step at 95°C, followed by 39 cycles of 5 sec at 95°C, 10 sec at 64°C, and 15 sec at 72°C with a final step at 72°C for 5 min. qPCR was performed using the SensiFAST SYBR No-ROX Kit (Bioline). Primer efficiencies were determined by performing standard curves for all primer combinations. All primer pairs had efficiencies of 95% or higher. Sequence information of primer pairs and efficiencies are provided in Supplemental Table S7. Ct data from nine biological and three technical replicates were used for analysis. Details of analyzing qPCR data are described in Supplemental Methods.
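Primer efficiency is conventionally derived from the slope of a standard curve of Ct against the log10 of template input, using E = 10^(−1/slope) − 1. The sketch below applies this standard relation; whether the authors used exactly this formula is not stated here (their analysis is described in the Supplemental Methods), and the function name is hypothetical.

```python
def primer_efficiency(log10_inputs, ct_values):
    """Estimate amplification efficiency from a qPCR standard curve.

    log10_inputs: log10 of relative template amount for each dilution.
    ct_values: measured Ct for each dilution.
    Returns efficiency as a fraction (1.0 corresponds to 100%, i.e., perfect doubling).
    Uses the conventional relation E = 10 ** (-1 / slope) - 1 (assumed here).
    """
    n = len(log10_inputs)
    mean_x = sum(log10_inputs) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in log10_inputs)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(log10_inputs, ct_values))
    slope = sxy / sxx  # Ct change per 10-fold change in input
    return 10 ** (-1.0 / slope) - 1.0
```

A slope of about −3.32 cycles per 10-fold dilution corresponds to an efficiency of 100%; the threshold of 95% quoted above corresponds to a slightly steeper slope.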

DATA DEPOSITION

Analysis scripts are available at https://github.com/gagneurlab/Manuscript_Cheng_RNA_2017.

SUPPLEMENTAL MATERIAL

Supplemental material is available for this article.

ACKNOWLEDGMENTS

We thank Patrick Cramer for supporting the motif validation experiment. We thank Fabien Bonneau (Max Planck Institute of Biochemistry) for helpful discussions on motifs and RNA degradation pathways, as well as useful feedback on the manuscript. We thank Björn Schwalb for communication on analyzing the knockout data. We thank Vicente Yépez for useful feedback on the manuscript and Patrick Cramer for institutional support. J.C. and Ž.A. are supported by a Deutsche Forschungsgemeinschaft fellowship through QBM.

Footnotes

Freely available online through the RNA Open Access option.

REFERENCES

  1. Anderson JS, Parker RP. 1998. The 3′ to 5′ degradation of yeast mRNAs is a general mechanism for mRNA turnover that requires the SKI2 DEVH box protein and 3′ to 5′ exonucleases of the exosome complex. EMBO J 17: 1497–1506.
  2. Bazzini AA, Del Viso F, Moreno-Mateos MA, Johnstone TG, Vejnar CE, Qin Y, Yao J, Khokha MK, Giraldez AJ. 2016. Codon identity regulates mRNA stability and translation efficiency during the maternal-to-zygotic transition. EMBO J 35: 2087–2103.
  3. Beer MA, Tavazoie S. 2004. Predicting gene expression from sequence. Cell 117: 185–198.
  4. Boël G, Letso R, Neely H, Price WN, Wong K, Su M, Luff JD, Valecha M, Everett JK, Acton TB, et al. 2016. Codon influence on protein expression in E. coli correlates with mRNA levels. Nature 529: 358–363.
  5. Brown CE, Tarun SZ Jr, Boeck R, Sachs AB. 1996. PAN3 encodes a subunit of the Pab1p-dependent poly(A) nuclease in Saccharomyces cerevisiae. Mol Cell Biol 16: 5744–5753.
  6. Cai Y, Futcher B. 2013. Effects of the yeast RNA-binding protein Whi3 on the half-life and abundance of CLN3 mRNA and other targets. PLoS One 8: e84630.
  7. Celik A, Baker R, He F, Jacobson A. 2017. High-resolution profiling of NMD targets in yeast reveals translational fidelity as a basis for substrate selection. RNA 23: 735–748.
  8. Colomina N, Ferrezuelo F, Wang H, Aldea M, Garí E. 2008. Whi3, a developmental regulator of budding yeast, binds a large set of mRNAs functionally related to the endoplasmic reticulum. J Biol Chem 283: 28670–28679.
  9. Cui Y, Hagan KW, Zhang S, Peltz SW. 1995. Identification and characterization of genes that are required for the accelerated degradation of mRNAs containing a premature translational termination codon. Genes Dev 9: 423–436.
  10. Duan J, Shi J, Ge X, Dölken L, Moy W, He D, Shi S, Sanders AR, Ross J, Gejman PV. 2013. Genome-wide survey of interindividual differences of RNA stability in human lymphoblastoid cell lines. Sci Rep 3: 1318.
  11. Duttagupta R, Tian B, Wilusz CJ, Khounh DT, Soteropoulos P, Ouyang M, Dougherty JP, Peltz SW. 2005. Global analysis of Pub1p targets reveals a coordinate control of gene expression through modulation of binding and stability. Mol Cell Biol 25: 5499–5513.
  12. Dvir S, Velten L, Sharon E, Zeevi D, Carey LB, Weinberger A, Segal E. 2013. Deciphering the rules by which 5′-UTR sequences affect protein expression in yeast. Proc Natl Acad Sci 110: E2792–E2801.
  13. Elemento O, Slonim N, Tavazoie S. 2007. A universal framework for regulatory element discovery across all genomes and data types. Mol Cell 28: 337–350.
  14. Eser P, Wachutka L, Maier KC, Demel C, Boroni M, Iyer S, Cramer P, Gagneur J. 2016. Determinants of RNA metabolism in the Schizosaccharomyces pombe genome. Mol Syst Biol 12: 857.
  15. Foat BC, Houshmandi SS, Olivas WM, Bussemaker HJ. 2005. Profiling condition-specific, genome-wide regulation of mRNA stability in yeast. Proc Natl Acad Sci 102: 17675–17680.
  16. Friedman J, Hastie T, Tibshirani R. 2010. Regularization paths for generalized linear models via coordinate descent. J Stat Softw 33: 1–22.
  17. Gaba A, Jacobson A, Sachs MS. 2005. Ribosome occupancy of the yeast CPA1 upstream open reading frame termination codon modulates nonsense-mediated mRNA decay. Mol Cell 20: 449–460.
  18. Garí E, Volpe T, Wang H, Gallego C, Futcher B, Aldea M. 2001. Whi3 binds the mRNA of the G1 cyclin CLN3 to modulate cell fate in budding yeast. Genes Dev 15: 2803–2808.
  19. Garneau NL, Wilusz J, Wilusz CJ. 2007. The highways and byways of mRNA decay. Nat Rev Mol Cell Biol 8: 113–126.
  20. Geisberg JV, Moqtaderi Z, Fan X, Ozsolak F, Struhl K. 2014. Global analysis of mRNA isoform half-lives reveals stabilizing and destabilizing elements in yeast. Cell 156: 812–824.
  21. Gerber AP, Herschlag D, Brown PO. 2004. Extensive association of functionally and cytotopically related mRNAs with Puf family RNA-binding proteins in yeast. PLoS Biol 2: E79.
  22. Goldstrohm AC, Seay DJ, Hook BA, Wickens M. 2007. PUF protein-mediated deadenylation is catalyzed by Ccr4p. J Biol Chem 282: 109–114.
  23. Gupta I, Clauder-Münster S, Klaus B, Järvelin AI, Aiyar RS, Benes V, Wilkening S, Huber W, Pelechano V, Steinmetz LM. 2014. Alternative polyadenylation diversifies post-transcriptional regulation by selective RNA-protein interactions. Mol Syst Biol 10: 719.
  24. Harigaya Y, Parker R. 2016. Analysis of the association between codon optimality and mRNA stability in Schizosaccharomyces pombe. BMC Genomics 17: 895.
  25. Harigaya Y, Parker R. 2017. The link between adjacent codon pairs and mRNA stability. BMC Genomics 18: 364.
  26. Hasan A, Cotobal C, Duncan CDS, Mata J. 2014. Systematic analysis of the role of RNA-binding proteins in the regulation of RNA stability. PLoS Genet 10: e1004684.
  27. He F, Li X, Spatrick P, Casillo R, Dong S, Jacobson A. 2003. Genome-wide analysis of mRNAs regulated by the nonsense-mediated and 5′ to 3′ mRNA decay pathways in yeast. Mol Cell 12: 1439–1452.
  28. Hoekema A, Kastelein RA, Vasser M, de Boer HA. 1987. Codon replacement in the PGK1 gene of Saccharomyces cerevisiae: experimental approach to study the role of biased codon usage in gene expression. Mol Cell Biol 7: 2914–2924.
  29. Hogan DJ, Riordan DP, Gerber AP, Herschlag D, Brown PO. 2008. Diverse RNA-binding proteins interact with functionally related sets of RNAs, suggesting an extensive regulatory system. PLoS Biol 6: e255.
  30. Hsu CL, Stevens A. 1993. Yeast cells lacking 5′–3′ exoribonuclease 1 contain mRNA species that are poly(A) deficient and partially lack the 5′ cap structure. Mol Cell Biol 13: 4826–4835.
  31. Huch S, Nissan T. 2014. Interrelations between translation and general mRNA degradation in yeast. Wiley Interdiscip Rev RNA 5: 747–763.
  32. Hug N, Longman D, Cáceres JF. 2015. Mechanism and regulation of the nonsense-mediated decay pathway. Nucleic Acids Res 44: 1483–1495.
  33. Kellis M, Patterson N, Endrizzi M, Birren B, Lander ES. 2003. Sequencing and comparison of yeast species to identify genes and regulatory elements. Nature 423: 241–254.
  34. Kozak M. 1986. Point mutations define a sequence flanking the AUG initiator codon that modulates translation by eukaryotic ribosomes. Cell 44: 283–292.
  35. Lackner DH, Beilharz TH, Marguerat S, Mata J, Watt S, Schubert F, Preiss T, Bähler J. 2007. A network of multiple regulatory layers shapes gene expression in fission yeast. Mol Cell 26: 145–155.
  36. LaGrandeur T, Parker R. 1999. The cis acting sequences responsible for the differential decay of the unstable MFA2 and stable PGK1 transcripts in yeast include the context of the translational start codon. RNA 5: 420–433.
  37. Lee RC, Feinbaum RL, Ambros V. 1993. The C. elegans heterochronic gene lin-4 encodes small RNAs with antisense complementarity to lin-14. Cell 75: 843–854.
  38. Leeds P, Wood JM, Lee B, Culbertson MR. 1992. Gene products that promote mRNA turnover in Saccharomyces cerevisiae. Mol Cell Biol 12: 2165–2177.
  39. Lorenz R, Bernhart SH, Höner Zu Siederdissen C, Tafer H, Flamm C, Stadler PF, Hofacker IL. 2011. ViennaRNA Package 2.0. Algorithms Mol Biol 6: 26.
  40. Marguerat S, Lawler K, Brazma A, Bähler J. 2014. Contributions of transcription and mRNA decay to gene expression dynamics of fission yeast in response to oxidative stress. RNA Biol 11: 702–714.
  41. Miller C, Schwalb B, Maier K, Schulz D, Dümcke S, Zacher B, Mayer A, Sydow J, Marcinowski L, Dölken L, et al. 2011. Dynamic transcriptome analysis measures rates of mRNA synthesis and decay in yeast. Mol Syst Biol 7: 458.
  42. Mishima Y, Tomari Y. 2016. Codon usage and 3′ UTR length determine maternal mRNA stability in zebrafish. Mol Cell 61: 874–885.
  43. Muhlrad D, Decker CJ, Parker R. 1995. Turnover mechanisms of the stable yeast PGK1 mRNA. Mol Cell Biol 15: 2145–2156.
  44. Neymotin B, Athanasiadou R, Gresham D. 2014. Determination of in vivo RNA kinetics using RATE-seq. RNA 20: 1645–1652.
  45. Neymotin B, Ettore V, Gresham D. 2016. Multiple transcript properties related to translation affect mRNA degradation rates in Saccharomyces cerevisiae. G3 (Bethesda) 6: 3475–3483.
  46. Olivas W, Parker R. 2000. The Puf3 protein is a transcript-specific regulator of mRNA degradation in yeast. EMBO J 19: 6602–6611.
  47. Parker R. 2012. RNA degradation in Saccharomyces cerevisae. Genetics 191: 671–702.
  48. Pelechano V, Wei W, Steinmetz LM. 2013. Extensive transcriptional heterogeneity revealed by isoform profiling. Nature 497: 127–131.
  49. Pilkington GR, Parker R. 2008. Pat1 contains distinct functional domains that promote P-body assembly and activation of decapping. Mol Cell Biol 28: 1298–1312.
  50. Presnyak V, Alhusaini N, Chen Y-H, Martin S, Morris N, Kline N, Olson S, Weinberg D, Baker KE, Graveley BR, et al. 2015. Codon optimality is a major determinant of mRNA stability. Cell 160: 1111–1124.
  51. Rabani M, Kertesz M, Segal E. 2008. Computational prediction of RNA structural motifs involved in posttranscriptional regulatory processes. Proc Natl Acad Sci 105: 14885–14890.
  52. Rabani M, Raychowdhury R, Jovanovic M, Rooney M, Stumpo DJ, Pauli A, Hacohen N, Schier AF, Blackshear PJ, Friedman N, et al. 2014. High-resolution sequencing and modeling identifies distinct dynamic RNA regulatory strategies. Cell 159: 1698–1710.
  53. Radhakrishnan A, Green R. 2016. Connections underlying translation and mRNA stability. J Mol Biol 428: 3558–3564.
  54. Radhakrishnan A, Chen Y-H, Martin S, Alhusaini N, Green R, Coller J. 2016. The DEAD-box protein Dhh1p couples mRNA decay and translation by monitoring codon optimality. Cell 167: 122–132.e9.
  55. Rockman MV, Kruglyak L. 2006. Genetics of global gene expression. Nat Rev Genet 7: 862–872.
  56. Roy B, Jacobson A. 2013. The intimate relationships of mRNA decay and translation. Trends Genet 29: 691–699.
  57. Sabi R, Tuller T. 2014. Modelling the efficiency of codon-tRNA interactions based on codon usage bias. DNA Res 21: 511–526.
  58. Schwalb B, Michel M, Zacher B, Frühauf K, Demel C, Tresch A, Gagneur J, Cramer P. 2016. TT-seq maps the human transient transcriptome. Science 352: 1225–1228.
  59. Schwanhäusser B, Busse D, Li N, Dittmar G, Schuchhardt J, Wolf J, Wei C, Selbach M. 2011. Global quantification of mammalian gene expression control. Nature 473: 337–342.
  60. Schwartz DC, Parker R. 1999. Mutations in translation initiation factors lead to increased rates of deadenylation and decapping of mRNAs in Saccharomyces cerevisiae. Mol Cell Biol 19: 5247–5256.
  61. Shalem O, Dahan O, Levo M, Martinez MR, Furman I, Segal E, Pilpel Y. 2008. Transient transcriptional responses to stress are generated by opposing effects of mRNA production and degradation. Mol Syst Biol 4: 223.
  62. Shalem O, Sharon E, Lubliner S, Regev I, Lotan-Pompan M, Yakhini Z, Segal E. 2015. Systematic dissection of the sequence determinants of gene 3′ end mediated expression control. PLoS Genet 11: e1005147.
  63. Shalgi R, Lapidot M, Shamir R, Pilpel Y. 2005. A catalog of stability-associated sequence elements in 3′ UTRs of yeast mRNAs. Genome Biol 6: R86.
  64. She M, Decker CJ, Svergun DI, Round A, Chen N, Muhlrad D, Parker R, Song H. 2008. Structural basis of Dcp2 recognition and activation by Dcp1. Mol Cell 29: 337–349.
  65. Siepel A, Bejerano G, Pedersen JS, Hinrichs AS, Hou M, Rosenbloom K, Clawson H, Spieth J, Hillier LW, Richards S, et al. 2005. Evolutionarily conserved elements in vertebrate, insect, worm, and yeast genomes. Genome Res 15: 1034–1050.
  66. Skelly DA, Ronald J, Akey JM. 2009. Inherited variation in gene expression. Annu Rev Genomics Hum Genet 10: 313–332.
  67. Sun M, Schwalb B, Pirkl N, Maier KC, Schenk A, Failmezger H, Tresch A, Cramer P. 2013. Global analysis of eukaryotic mRNA degradation reveals Xrn1-dependent buffering of transcript levels. Mol Cell 52: 52–62.
  68. Tucker M, Valencia-Sanchez MA, Staples RR, Chen J, Denis CL, Parker R. 2001. The transcription factor associated Ccr4 and Caf1 proteins are components of the major cytoplasmic mRNA deadenylase in Saccharomyces cerevisiae. Cell 104: 377–386.
  69. Vogel C, Abreu R de S, Ko D, Le SYY, Shapiro BA, Burns SC, Sandhu D, Boutz DR, Marcotte EM, Penalva LO. 2010. Sequence signatures and mRNA concentration can explain two-thirds of protein abundance variation in a human cell line. Mol Syst Biol 6: 400.
  70. Wang X, Hou J, Quedenau C, Chen W. 2016. Pervasive isoform-specific translational regulation via alternative transcription start sites in mammals. Mol Syst Biol 12: 875.
  71. Wissink EM, Fogarty EA, Grimson A. 2016. High-throughput discovery of post-transcriptional cis-regulatory elements. BMC Genomics 17: 177.
  72. Zeisel A, Köstler WJ, Molotski N, Tsai JM, Krauthgamer R, Jacob-Hirsch J, Rechavi G, Soen Y, Jung S, Yarden Y, et al. 2011. Coupled pre-mRNA and mRNA dynamics unveil operational strategies underlying transcriptional responses to stimuli. Mol Syst Biol 7: 529.
  73. Zhou X, Carbonetto P, Stephens M. 2013. Polygenic modeling with Bayesian sparse linear mixed models. PLoS Genet 9: e1003264.
  74. Zur H, Tuller T. 2013. Transcript features alone enable accurate prediction and understanding of gene expression in S. cerevisiae. BMC Bioinformatics 14: 1.
  75. Zur H, Tuller T. 2016. Predictive biophysical modeling and understanding of the dynamics of mRNA translation and its evolution. Nucleic Acids Res 44: 9031–9049.

Articles from RNA are provided here courtesy of The RNA Society
