Abstract
Models of mRNA translation usually presume that transcripts are linear; upon reaching the end of a transcript each terminating ribosome returns to the cytoplasmic pool before initiating anew on a different transcript. A consequence of linear models is that faster translation of a given mRNA is unlikely to generate more of the encoded protein, particularly at low ribosome availability. Recent evidence indicates that eukaryotic mRNAs are circularized, potentially allowing terminating ribosomes to preferentially reinitiate on the same transcript. Here we model the effect of ribosome reinitiation on translation and show that, at high levels of reinitiation, protein synthesis rates are dominated by the time required to translate a given transcript. Our model provides a simple mechanistic explanation for many previously enigmatic features of eukaryotic translation, including the negative correlation of both ribosome densities and protein abundance with transcript length, the importance of codon usage in determining protein synthesis rates, and the negative correlation between transcript length and both codon adaptation and 5' mRNA folding energies. In contrast to linear models where translation is largely limited by initiation rates, our model reveals that all three stages of translation (initiation, elongation, and termination/reinitiation) determine protein synthesis rates even at low ribosome availability.
Author summary
Recent advances in proteomics show that translation is strongly dependent on transcript length, but current theoretical models fail to capture this relationship. Here, we propose that the high initiation rates and protein yields of short transcripts result from terminating ribosomes reinitiating on the same transcript. The frequency of reinitiation depends on the time required to complete one full transit of a transcript, coupling transcript lengths and elongation rates to protein yield. Any slow step reduces the protein yield of shorter transcripts more than the yield of longer transcripts, generating stronger selective pressure to eliminate slow steps in shorter transcripts and explaining the widespread negative correlations in eukaryotes between transcript length and both 5' mRNA folding energy and codon adaptation. Our reinitiation-based model reconciles conflicting results from previous initiation-limited models with recent advances in biotechnology and identifies the mechanism underlying length-dependent translation, allowing powerful prediction of translational regulation across eukaryotes.
Introduction
The physiological state of a cell is largely determined by the identity and abundance of the proteins encoded by its genome. Understanding how genetic information is first transcribed into messenger RNA and then translated into protein is therefore fundamental to our understanding of biological systems. A wide variety of technologies has allowed detailed investigations of transcription, but—until very recently—a lack of similar tools for empirical research on translation has meant that the study of post-transcriptional regulation has been largely restricted to mathematical models with little opportunity for parameterization or evaluation. Recent advances in both sequencing technology and mass spectrometry have now produced large amounts of data on the translation of eukaryotic mRNA, revealing how transcript features, RNA-binding proteins, and non-coding RNAs influence translation [1,2]. While many of the determinants of translation rates revealed by these empirical studies were predicted by existing models, some remain difficult to explain. Perhaps the most striking correlate of translation rate is the length of the transcript itself. Multiple experimental studies, across a wide range of eukaryotic organisms, have demonstrated a steep negative correlation between the length of a given coding sequence (CDS) and three different measures of translation: translation initiation rates [3–5], the density of ribosomes on a transcript [5–15], and the abundance of the encoded protein [16–19].
Ribosome and polysome profiling experiments have shown a positive relationship between ribosome density and protein abundance, leading to the conclusion that transcripts with higher ribosome densities have higher translation rates [9,11,20]. A positive relationship between ribosome density and translation rate can occur when translation is limited by low initiation rates. In traditional models of translation, initiation can be limiting when other steps in translation, such as elongation, occur quickly enough to prevent collisions between ribosomes [20]. Consistent with this key role of initiation rates in determining translation rates, Arava et al [6] found that the higher densities of ribosomes on shorter transcripts were most consistent with shorter transcripts having exponentially higher initiation rates than longer transcripts, estimating a halving of the initiation rate with every 400-codon increase in CDS length. More recent analyses [3,4] have revealed that the relationship between CDS length and initiation rates is better described by a power law: the initiation rate is roughly halved for every doubling of CDS length (i.e. a log-log slope of approximately -1). However, the assumption of initiation-limitation leaves little room for variation in elongation rates to influence translation rates, which is at odds with recent work demonstrating that codon usage can be an important determinant of protein yields [21,22].
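The two length dependences described above can be compared numerically. The following is a minimal sketch rather than a fit from the cited studies: the function names, the base rate of 1.0, and the 400-codon reference length used for the power law are illustrative assumptions.

```python
def exponential_rate(cds_length, base_rate=1.0, halving_length=400.0):
    """Initiation rate that halves with every `halving_length` additional codons."""
    return base_rate * 0.5 ** (cds_length / halving_length)


def power_law_rate(cds_length, base_rate=1.0, ref_length=400.0, slope=-1.0):
    """Initiation rate following a power law in CDS length.

    With slope = -1 the rate halves for every doubling of CDS length.
    """
    return base_rate * (cds_length / ref_length) ** slope


# Compare the two forms over a range of CDS lengths (in codons).
for length in (200, 400, 800, 1600):
    print(length, exponential_rate(length), power_law_rate(length))
```

Under the exponential form the fold-change between two lengths depends on their difference, whereas under the power law it depends only on their ratio.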
If translation is limited by the ability of transcripts to capture ribosomes from the cytoplasmic pool (the de novo initiation rate), mechanisms that allow transcripts to retain terminating ribosomes for subsequent rounds of translation should improve translation rates. The closed-loop model of translation was first proposed as a hypothetical mechanism to improve translation efficiency through intrapolysomal ribosome reinitiation [23,24]. By bringing the sites of termination and initiation into close proximity through circularization of the mRNA, the closed-loop complex allows ribosomes that have finished translating to reinitiate translation on the same mRNA molecule rather than returning to the cytoplasmic pool. The closed-loop model was initially based on the appearance of many polysomes in electron micrographs as circular, rather than linear, structures (detailed high resolution tomographic analyses of circular polysomes are now available [25]). Recent theoretical and experimental studies have shown that secondary structures in single-stranded RNAs bring the 5' and 3' ends close together (equivalent to the distance spanned by 9–16 nucleotides), meaning that mRNAs are effectively circularized [26,27]. Interactions between initiation factors bound to the 5' end and proteins associated with the 3' end, including release and recycling factors and the poly(A)-binding proteins, are thought to facilitate translation, possibly by stabilizing the closed-loop structure or by actively promoting reinitiation [28,29].
The importance of reinitiation of ribosomes on circular transcripts in determining protein yield is well established in vitro [23,24,30–32]. Measuring translation of the luminescent protein luciferase in a eukaryotic cell-free system, Kopeina et al [31] showed that circular polysomes rarely exchanged ribosomes with the free pool or lost ribosomes to other transcripts, but linear polysomes did so frequently. On circular polysomes, most terminating ribosomes immediately reinitiated on the same mRNA molecule (see also [30]). Alekhina et al [32] found that protein production in a similar cell-free system does not rapidly reach a steady state, as would be expected under a linear model of translation, but rather accelerates over the lifetime of the transcript, consistent with reinitiation on the same transcript. They proposed that the translation rate initially depends on slow de novo initiation of ribosomes from the free pool but soon becomes dominated by the much faster process of reinitiation.
Here, we use a minimal computational model to investigate the consequences of ribosome reinitiation on translation, with particular focus on transcript length and codon usage. We find that reinitiation causes ribosome densities, overall initiation rates, and protein yields to decrease with increasing transcript length. Furthermore, higher levels of reinitiation increase the importance of codon usage in determining translation rates in a length-dependent manner, even at low ribosome densities or low de novo initiation rates. Reinitiation therefore provides a potential mechanistic explanation for multiple previously enigmatic patterns observed in empirical studies of translation.
Model
We use a totally asymmetric simple exclusion process (TASEP, reviewed in [33]) to investigate the closed-loop model of translation. The TASEP (Fig 1) models each transcript as a one-dimensional lattice consisting of a number of sites equal to the number of codons in the CDS: each site represents a single codon. Each site can be either free or occupied by a ribosome. Ribosomes move along the transcript in the 5' to 3' direction and cannot occupy the same codon(s) as any other ribosome. In our model, the transcript is circularized, meaning that terminating ribosomes can not only be released into the cytoplasmic pool (as in a linear TASEP) but can also move to the initiation site of the same transcript (reinitiation).
Four different types of reactions can take place in the TASEP: (i) de novo initiation: a free ribosome can be placed onto the 5' end of the transcript (the initiation site) at the de novo initiation rate; (ii) elongation: a ribosome at any codon on the transcript (except the termination site) can move forward one codon in the 3' direction at the elongation rate; (iii) release: a ribosome occupying the termination site can leave the transcript at the release rate; or (iv) reinitiation: a ribosome occupying the termination site can move to the initiation site at the reinitiation rate.
We model ribosomes as extended particles that occupy ten codons each: the A-sites (where each codon is translated) of adjacent ribosomes must be spaced apart by at least 10 codons. Thus, the elongation reaction is only possible when the A-site of the next ribosome in the 3' direction is > 10 codons downstream. Similarly, neither de novo initiation nor reinitiation is possible if any of the first 10 codons is occupied by an A-site.
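The spacing rule above can be written as two small predicates. This is a minimal sketch under stated assumptions: the function names are hypothetical, codons are indexed from 0 at the start codon, and the additional rule that a ribosome on the termination site cannot elongate is left to the caller.

```python
FOOTPRINT = 10  # codons occupied by one ribosome (value taken from the text)


def can_elongate(a_site, next_a_site, footprint=FOOTPRINT):
    """Return True if the ribosome whose A-site is at `a_site` may advance one codon.

    `next_a_site` is the A-site of the nearest downstream ribosome, or None
    if there is no downstream ribosome. The downstream A-site must be more
    than `footprint` codons away, so that the spacing after the step is
    still at least `footprint`.
    """
    return next_a_site is None or next_a_site - a_site > footprint


def can_initiate(a_sites, footprint=FOOTPRINT):
    """Return True if no A-site lies within the first `footprint` codons.

    Applies equally to de novo initiation and to reinitiation.
    """
    return all(site >= footprint for site in a_sites)
```

For example, `can_elongate(0, 10)` is false while `can_elongate(0, 11)` is true, and `can_initiate([9])` is false while `can_initiate([10])` is true.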
Analytical solutions of the TASEP are possible, but currently can only be applied to the steady state. Consequently, most TASEP models, including a recent study of reinitiation [34], investigate translation at the steady state, where the rate at which ribosomes join a transcript equals the rate at which they leave, and the translation rate is constant. However, every real transcript spends some proportion of its lifetime outside of the steady state, where these solutions do not apply; the assumption of a perpetual steady state is therefore an approximation. A new transcript does not instantly acquire ribosomes distributed along its length. Instead, ribosomes join at the 5' end and gradually progress towards the stop codon, where they can be released. In the absence of reinitiation, the steady state can be reached once the first ribosome to join a transcript is released. The duration of this "pioneer round" [35] increases with transcript length, but generally represents a small proportion of eukaryotic transcript lifetimes. In the absence of reinitiation the steady state is therefore a good approximation (although it can be inappropriate for prokaryotes with short-lived transcripts [36,37]). However, under reinitiation, ribosomes do not necessarily leave the transcript upon termination, which causes the effective initiation rate (and the translation rate) to increase over time [32]. The time to reach the steady state therefore increases with both transcript length and reinitiation probability, and the time spent outside of the steady state thus represents a greater proportion of transcript lifetimes (S1 Fig). The steady state assumption consequently becomes a much worse approximation of translation at higher levels of reinitiation, overestimating translation rates on long transcripts and underestimating translation on short transcripts (S1 Fig). It is therefore impossible to make a fair comparison of translation at different reinitiation levels using the steady state approximation, particularly for transcripts of different lengths.
Since we do not assume that translation on any given transcript is always at the steady state, we cannot use the steady state analytical solutions of the TASEP. Instead, we perform stochastic simulations using the Gillespie algorithm [38], which capture both the steady state and the non-steady state. In models that assume the steady state, all translation that occurs in simulations prior to the steady state is ignored. For example, in a recent reinitiation-based model of translation in yeast, the first 10⁵ s of simulations was discarded [34]. Given that the average lifetime of yeast transcripts is on the scale of 10³–10⁴ s [39], this means that all translation occurring over biologically plausible lifetimes was excluded from the analysis. Here, we make no assumptions about the steady state; we simply account for all translation that occurs during the lifetime of a transcript (both before and after the steady state is achieved). We simulated translation on each transcript independently. Each run generated a time evolution of the ribosome occupancy at each codon on a given transcript. We computed three measures of translation: ribosome density (the average number of ribosomes on a transcript over its lifetime divided by one tenth of CDS length, because each ribosome occupies 10 codons), effective initiation rate (the total number of initiations occurring through either de novo initiation or reinitiation divided by transcript lifetime) and protein yield (the total number of ribosomes reaching the stop codon of a transcript). We averaged the results of 1000 runs to produce results that are not subject to large stochastic fluctuations. We do not consider untranslated regions and our transcripts therefore represent only the CDS. The code for our TASEP is available at: https://github.com/marvinboe/reTASEP
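To make the procedure concrete, the following is a minimal Gillespie sketch of a single circularized transcript that returns the three measures defined above. It is an illustration written for this text, not the code in the reTASEP repository: the function name, argument names, the choice to count a protein when a ribosome arrives at the last codon, and the rate values in the usage note are all assumptions.

```python
import random

FOOTPRINT = 10  # codons occupied by one ribosome (value taken from the text)


def simulate(length, lifetime, denovo, elong, release, reinit, seed=None):
    """Simulate one transcript from t = 0 (empty) to t = lifetime.

    Returns (ribosome_density, effective_initiation_rate, protein_yield).
    Rates are per second; `length` is the CDS length in codons.
    """
    if length <= FOOTPRINT:
        raise ValueError("this sketch assumes a CDS longer than one footprint")
    rng = random.Random(seed)
    pos = []                  # A-site indices, ascending; pos[0] is nearest the 5' end
    last = length - 1         # termination site
    t = 0.0
    ribosome_seconds = 0.0    # time integral of the number of bound ribosomes
    initiations = 0
    protein_yield = 0
    while True:
        site_free = all(p >= FOOTPRINT for p in pos)
        events = []           # (rate, kind, index into pos)
        if site_free:
            events.append((denovo, "denovo", None))
        for i, p in enumerate(pos):
            if p == last:
                events.append((release, "release", i))
                if site_free:
                    events.append((reinit, "reinit", i))
            elif i == len(pos) - 1 or pos[i + 1] - p > FOOTPRINT:
                events.append((elong, "elong", i))
        events = [e for e in events if e[0] > 0]
        total = sum(rate for rate, _, _ in events)
        # Waiting time to the next reaction is exponential with rate `total`.
        dt = rng.expovariate(total) if total > 0 else float("inf")
        if t + dt >= lifetime:
            ribosome_seconds += len(pos) * (lifetime - t)
            break
        ribosome_seconds += len(pos) * dt
        t += dt
        # Choose one reaction with probability proportional to its rate.
        pick = rng.uniform(0.0, total)
        for rate, kind, i in events:
            pick -= rate
            if pick <= 0.0:
                break
        if kind == "denovo":
            pos.insert(0, 0)
            initiations += 1
        elif kind == "elong":
            pos[i] += 1
            if pos[i] == last:
                protein_yield += 1   # assumption: count on arrival at the last codon
        elif kind == "release":
            pos.pop(i)
        else:                        # reinitiation on the same transcript
            pos.pop(i)
            pos.insert(0, 0)
            initiations += 1
    density = (ribosome_seconds / lifetime) / (length / FOOTPRINT)
    return density, initiations / lifetime, protein_yield
```

A call such as `simulate(100, 500.0, 0.05, 10.0, 1.0, 999.0, seed=1)` corresponds to a reinitiation probability of 99.9% at termination when the initiation site is free; these rate values are placeholders, and averaging over many seeds would be needed to reduce stochastic fluctuations.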
Changing any transcriptome-wide parameter can dramatically alter global ribosome usage. For instance, at a given de novo initiation rate, increasing the reinitiation probability increases the total number of actively translating ribosomes. While this effect may be real, given that reinitiation is expected to allow more efficient use of ribosomes (see Discussion), it makes parameterizing the model difficult because the actual level of reinitiation is unknown. To keep all simulations consistent with empirical values, we have adjusted the de novo initiation rate to maintain the empirically observed average ribosome density. For simplicity, we have kept the number of ribosomes on a 400-codon-long transcript constant (at 6 ribosomes) for all transcriptome-wide reinitiation probabilities and elongation rates. See S1 Text for details on parameter estimates used in each simulation.
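One way such an adjustment could be carried out is by bisection on the de novo initiation rate. The text does not state which procedure was used, so the following is only a sketch: `calibrate` and `toy_occupancy` are hypothetical names, and the saturating curve stands in for a simulation and is not a result from this study.

```python
def calibrate(mean_ribosomes, target, low, high, iterations=60):
    """Bisect for the de novo rate at which `mean_ribosomes(rate)` equals `target`.

    Assumes `mean_ribosomes` increases monotonically with the rate and that
    `target` lies between mean_ribosomes(low) and mean_ribosomes(high).
    """
    for _ in range(iterations):
        mid = 0.5 * (low + high)
        if mean_ribosomes(mid) < target:
            low = mid
        else:
            high = mid
    return 0.5 * (low + high)


def toy_occupancy(rate):
    """Placeholder for a simulation: saturates at 40 ribosomes (400 codons / 10)."""
    return 40.0 * rate / (rate + 1.0)


# Find the rate giving 6 ribosomes on the placeholder 400-codon transcript.
rate = calibrate(toy_occupancy, target=6.0, low=0.0, high=10.0)
print(rate, toy_occupancy(rate))
```

With a stochastic simulation in place of the toy curve, the occupancy estimate would need to be averaged over enough runs for the comparison with the target to be reliable.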
Results
High levels of reinitiation generate length-dependent translation
Our model captures the negative correlation between ribosome density and CDS length observed in empirical studies, but only if the probability of reinitiation is high (Fig 2). This result is intuitive; if reinitiation were perfect, all ribosomes that initiate would continue to reinitiate and translate, never leaving a transcript until it degrades. The density of ribosomes on a transcript of a given length and age would therefore be determined exclusively by the de novo initiation rate. If the de novo initiation rate is the same for all transcripts, then all transcripts of a given age should carry the same number of ribosomes and ribosome density will be inversely proportional to CDS length (with a log-log slope of -1). At a given elongation rate, the time required for a ribosome to complete one cycle (travel from the start codon to the stop codon) is less for short transcripts than for long transcripts. This means that, prior to the steady state, reinitiation occurs more frequently on shorter transcripts, resulting in higher protein yields for short transcripts than long transcripts. When all or nearly all terminating ribosomes reinitiate, the effective initiation rate is much higher for shorter transcripts, providing a simple mechanism that could explain the length-dependence of initiation rates predicted by recent studies of translation [3–5,8]. The higher ribosome densities on shorter transcripts increase the likelihood of collisions between ribosomes, resulting in deviations from the expected power law relationship between measures of translation and CDS length (Fig 2) through two related mechanisms. First, more frequent collisions between elongating ribosomes on shorter transcripts slow down translation, generating less steep slopes for the effective initiation rate and protein yield at low CDS lengths. Second, the initiation site is more likely to be occupied on a short transcript than on a long transcript, resulting in higher levels of initiation interference [21] on shorter transcripts, further flattening length-dependence for all measures of translation on short transcripts.
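The perfect-reinitiation argument can be checked with a back-of-the-envelope calculation. The sketch below is an idealization that ignores collisions and initiation interference (the two sources of deviation just described); the function names and the example rates are illustrative assumptions.

```python
def ideal_ribosome_count(age, denovo_rate):
    """Ribosomes on a transcript of a given age if none ever leave."""
    return denovo_rate * age


def ideal_density(cds_length, age, denovo_rate, footprint=10):
    """Ribosome density: bound ribosomes divided by one tenth of CDS length."""
    return ideal_ribosome_count(age, denovo_rate) / (cds_length / footprint)


def ideal_reinitiation_rate(cds_length, age, denovo_rate, elongation_rate):
    """Reinitiations per second at a given age.

    Each retained ribosome completes one cycle every
    cds_length / elongation_rate seconds.
    """
    return ideal_ribosome_count(age, denovo_rate) * elongation_rate / cds_length


# Same age and de novo rate, different CDS lengths (codons).
for length in (100, 200, 400, 800):
    print(length,
          ideal_density(length, age=600.0, denovo_rate=0.01),
          ideal_reinitiation_rate(length, 600.0, 0.01, elongation_rate=10.0))
```

In this idealization both quantities halve with every doubling of CDS length, which is the log-log slope of -1 referred to in the text.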
When reinitiation is not perfect, ribosomes can return to the cytoplasmic pool after termination, and the effect of CDS length on ribosome density, effective initiation rate, and protein yield is diminished. Even small reductions in reinitiation probability greatly weaken length-dependence (Fig 2). This is because short transcripts have more opportunities to lose ribosomes than do long transcripts. While a successful reinitiation event only guarantees that a ribosome remains associated with the transcript until the next termination event, ribosome loss is permanent. In the complete absence of reinitiation, length-dependence is therefore abolished.
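The asymmetry in ribosome loss can be illustrated with a short calculation. This is a simplified sketch, assuming uninterrupted transits and an independent reinitiation decision at each termination; the function name and the example values (elongation at 10 codons per second over 1000 s) are illustrative assumptions.

```python
def retention_probability(reinit_probability, cds_length, elongation_rate, elapsed):
    """Chance that a ribosome is still on the transcript after `elapsed` seconds.

    The ribosome terminates once every cds_length / elongation_rate seconds
    and is retained at each termination with `reinit_probability`.
    """
    terminations = elapsed * elongation_rate / cds_length
    return reinit_probability ** terminations


# Short transcripts terminate more often, so they lose ribosomes sooner.
for p in (0.999, 0.99, 0.9):
    for length in (100, 1000):
        print(p, length, retention_probability(p, length, 10.0, 1000.0))
```

Because the exponent scales inversely with CDS length, a small drop in reinitiation probability reduces retention far more on the 100-codon transcript than on the 1000-codon transcript.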
Reinitiation, but not de novo initiation, has a larger effect on short transcripts than long transcripts
While changing transcriptome-wide parameters can dramatically affect global ribosome usage (see Model), altering parameters of transcripts encoded by a single gene will have little effect on global ribosome usage. This is because nearly all endogenous genes are expressed at low levels, so changing the translation parameters of the transcripts produced by a single gene will have a negligible effect on global ribosome availability [4,20,41]. By studying transcripts of individual genes, we can therefore investigate the consequences of changing a single parameter while holding all other values constant. We first tested the effects of altering the reinitiation rate of transcripts encoded by a single gene (Fig 3A and 3B). Doubling the reinitiation rate results in an extremely similar increase in all three measures of translation (ribosome density, effective initiation rate, and protein yield; results are therefore only shown for ribosome density), but the effects are greater for short transcripts than long transcripts. These effects are mirrored by a length-dependent decrease in translation when the reinitiation rate is halved (Fig 3B). Furthermore, the length-dependent effects of changing the reinitiation rate of a single transcript species are generally stronger at higher transcriptome-wide reinitiation probabilities, except when reinitiation is so high that ribosomes rarely leave the transcript (e.g. 99.9%).
We next tested the effects of altering the de novo initiation rate of a single transcript species (Fig 3C and 3D). In the absence of reinitiation, doubling the de novo initiation rate had an equal effect on ribosome density for transcripts of all lengths. However, at higher levels of reinitiation, doubling the de novo initiation rate resulted in a smaller increase in ribosome density on short transcripts than on long transcripts, caused by increased initiation interference; the higher density of ribosomes on short transcripts under reinitiation increases the probability that the initiation site is blocked, preventing successful de novo initiation. The effects of altering the de novo initiation rate on the effective initiation rate and protein yield are very similar to the effects on ribosome density.
High levels of reinitiation couple effective initiation rates and protein yields to the elongation rate
So far, we have assumed that all transcripts have identical elongation rates, but in reality the elongation rate varies between transcripts encoded by different genes [42]. We therefore investigated the consequences of changing the elongation rate of a single CDS from 10 s⁻¹ to either 20 s⁻¹ or 5 s⁻¹ (Fig 4). Increasing the elongation rate reduces the amount of time between initiation and termination. In the absence of reinitiation, this causes ribosomes to spend less time on the altered transcript resulting in decreased ribosome density, but has little effect on the initiation rate or protein yield since these elongation rates are generally not limiting. Altered elongation rates do affect how long it takes to clear the initiation site and therefore the amount of initiation interference, explaining the relatively small differences in initiation rates and protein yields seen at 0% reinitiation [21].
Under perfect reinitiation, terminating ribosomes explicitly reinitiate on the same transcript. Changing the elongation rate of a single gene therefore has no effect on the density of ribosomes on the altered transcript. However, by altering the time between reinitiation events, changing the elongation rate results in an equal change in the effective initiation rate of the altered transcript (Fig 4). The protein yield of any endogenous gene is therefore exquisitely sensitive to changes in elongation rate under perfect reinitiation. Under perfect reinitiation, this effect is seen at all CDS lengths. The importance of the elongation rate decreases dramatically when reinitiation levels are reduced: faster elongation results in more opportunities to lose ribosomes, particularly on short transcripts.
Length-dependent consequences of a single slow step on translation
So far, we have only considered the effects of changing the average elongation rate of a transcript. However, it is difficult to imagine a mechanism that could simultaneously alter the elongation rate of all codons in a single transcript species without affecting the global elongation rate. Instead, transcripts are likely altered by mutations affecting a single codon at a time. Codon usage can affect elongation by determining the stability of secondary structures in the mRNA, but different codons are also decoded at different rates depending on the cellular availability of the appropriate tRNA. Most amino acids are encoded by multiple codons, and some codons (including synonymous codons that code for the same amino acid) are decoded faster than others [42,43]. We therefore investigated the consequences of a single slow step on translation of transcript species of different lengths (Fig 5). Here, we only examined translation at 99.9% reinitiation; similar results would be expected for other models of length-dependent translation including those that omit reinitiation. Introducing a single slow step into any transcript reduces its effective initiation rate and protein yield, but the effects are much larger for short transcripts than for long transcripts (Fig 5). The length-dependence of a single slow step arises from two sources. First, a single site represents a larger proportion of a short transcript than a long transcript and consequently results in a greater decrease in the average elongation rate [44]. Second, short transcripts have higher ribosome densities and are therefore more prone to collisions or "traffic jams" than are long transcripts. Effective initiation rates and protein yields are particularly sensitive to single slow steps near the start codon, with larger effects on shorter transcripts: slow clearance of the initiation site delays reinitiation and blocks de novo initiation resulting in lower ribosome densities on affected transcripts.
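The first of the two sources of length-dependence can be quantified directly. The following sketch considers a single unobstructed ribosome, so it deliberately leaves out the second source (collisions and traffic jams); the function names and the slow-codon rate of 1 s⁻¹ are illustrative assumptions.

```python
def mean_transit_time(cds_length, elongation_rate, slow_rate=None):
    """Mean time for an unobstructed ribosome to traverse the CDS.

    If `slow_rate` is given, one codon is decoded at that rate and all
    other codons at `elongation_rate` (rates in codons per second).
    """
    if slow_rate is None:
        return cds_length / elongation_rate
    return (cds_length - 1) / elongation_rate + 1.0 / slow_rate


def relative_slowdown(cds_length, elongation_rate, slow_rate):
    """Fold-increase in mean transit time caused by one slow codon."""
    slowed = mean_transit_time(cds_length, elongation_rate, slow_rate)
    return slowed / mean_transit_time(cds_length, elongation_rate)


# One codon decoded ten times more slowly than the rest.
for length in (100, 400, 1000):
    print(length, relative_slowdown(length, elongation_rate=10.0, slow_rate=1.0))
```

The same slow codon lengthens the cycle time proportionally more on a short transcript, which under efficient reinitiation translates into a larger drop in reinitiation frequency.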
A yeast-specific model of translation with reinitiation
Given the importance of variation in elongation rates to translation under reinitiation, we used our model to simulate translation in S. cerevisiae using codon-specific decoding rates. We used decoding rates (see S1 Table) estimated by Gilchrist & Wagner [45], which are based on tRNA availability and wobble pairing rules and scaled so that the average decoding rate is 10 s⁻¹; they are correlated with measures of codon occupancy reported in [5] (r = 0.494, n = 61, P < 0.0001). Since efficient reinitiation couples protein production to elongation rates, synonymous codon usage should have detectable consequences for protein yield at high levels of reinitiation. We tested the effects of synonymous codon usage at 99.9% reinitiation by predicting the yields of nine different synthetic GFP constructs [46] that differ only in their synonymous codon usage (Fig 6A). We compared these predictions to observed protein abundances measured in S. cerevisiae expressing each construct, and found a strong positive correlation between predicted yields and observed abundances (r = 0.750, n = 9, P = 0.020); our model predicted approximately half of the observed effect of using different synonymous codons (relative expression of highest vs. lowest construct, model = 2.4-fold, observed = 5.4-fold). Thus, efficient reinitiation correctly predicts a role for synonymous codon usage in determining yield.
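The rescaling of codon-specific rates to a chosen average can be sketched as follows. This assumes an unweighted mean over codons, which the text does not specify; the function name is hypothetical and the two relative rates are arbitrary placeholders, not values from S1 Table.

```python
def scale_to_mean(rates, target_mean=10.0):
    """Rescale codon decoding rates so that their unweighted mean is `target_mean`.

    `rates` maps codon -> relative decoding rate; ratios between codons
    are preserved.
    """
    factor = target_mean * len(rates) / sum(rates.values())
    return {codon: rate * factor for codon, rate in rates.items()}


# Arbitrary placeholder values for two codons (NOT taken from S1 Table).
relative_rates = {"AAA": 1.0, "AAG": 3.0}
scaled_rates = scale_to_mean(relative_rates)
print(scaled_rates)
```

A usage-weighted mean would be an equally plausible reading and would require codon frequencies as an additional input.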
Having established that Gilchrist & Wagner’s [45] codon-specific elongation rates are realistic, we used them to simulate the entire budding yeast translatome. The results of our simulations at 99.9% reinitiation are strongly correlated (Fig 6B) with experimental measures of ribosome densities (r = 0.932, n = 5542, using data from [6]) and calculated initiation rates (r = 0.742, n = 5348, using estimates from [3]; r = 0.618, n = 3728, using estimates from [4]). Our yield predictions are less strongly correlated with measured protein abundances (r = 0.478, n = 4686, data from Peptide Atlas 2013). This weaker correlation is unsurprising as our predictions of yield omit many important determinants of protein abundance including transcript abundance and protein stability. Results of simulations at other reinitiation levels are included in S3 Fig (fixed transcript lifetime) and S4 Fig (experimentally measured transcript lifetimes).
Discussion
We have shown that a fixed transcriptome-wide level of ribosome reinitiation can generate both length-dependent translation and a powerful transcript-specific role for codon usage, but only when reinitiation is extremely efficient. The level of reinitiation in live cells is unknown, but multiple studies have established that reinitiation is much more frequent than de novo initiation in cell-free systems. Furthermore, if reinitiation benefits the cell, we would expect it to evolve to become highly efficient. Maintaining a large pool of ribosomes consumes a substantial part of a cell's energy budget and selection will favor mechanisms that allow ribosomes to be used efficiently [47]. If, as reported by Nelson & Winkler [30] and Kopeina et al [31], reinitiation of post-termination ribosomes is faster than de novo initiation from the free ribosome pool, then efficient reinitiation reduces the amount of "dead time" ribosomes spend in the pool waiting to be recruited [34,48]. Each ribosome in a cell is therefore able to complete more rounds of translation in a given time interval under high levels of reinitiation compared to low levels of reinitiation. Reinitiation levels should be very closely associated with fitness: the translation initiation rate is thought to be the principal determinant of the rate of cell division [49,50]. Consequently, if reinitiation does occur in living cells, it is hard to imagine why it would not work very efficiently. Direct measurement of the level of reinitiation in vivo may soon be possible thanks to recent technological advancements enabling selective labeling of ribosomes [51] and the visualization of translation on individual mRNAs [52,53].
A single fixed level of reinitiation is not necessary to explain length-dependent translation; efficient reinitiation is only required on short transcripts (S5 Fig). Studies in living cells have shown that some transcripts are more likely to be associated with translation factors required to form the closed-loop complex than others [54]. If the closed-loop complex is required for efficient reinitiation, then reinitiation levels likely vary between transcripts. More specifically, shorter transcripts likely experience higher levels of reinitiation since they are more likely to be enriched with closed-loop factors [15,55], form more stable closed-loop complexes [56], and may exhibit shorter end-to-end distances allowing increased levels of reinitiation by passive diffusion [57]. Additionally, cellular depletion of both the closed-loop factor eIF4G and the translational regulator Asc1/RACK1 has been shown to have a greater effect on the translation of short transcripts than on long transcripts [13,15]. Using length-dependent reinitiation levels in our simulations allows the empirical relationship between CDS length and ribosome density, effective initiation rate, and protein yield to be captured at a much lower average reinitiation level (~90%; S5 Fig) than is needed with a fixed reinitiation level (99.9%; S5 Fig).
Beyond acting on global mechanisms, natural selection also operates to maximize the protein yield of transcripts encoded by individual genes (translational efficiency [44]). Selection for increased translational efficiency can not only increase the abundance of a given protein in a cell, but can also maintain protein levels while minimizing the cost of transcription, which has been shown to be an important determinant of fitness in yeast [58]. The strength of selection depends on the magnitude of the effect of a given mutation on translational efficiency; mutations with larger effects are subject to stronger selection. We have shown that the magnitude of the effect on translational efficiency of altering a given parameter by an equal amount can vary with the length of the altered transcript species. Thus, the strength of selection on mutations that affect a given parameter can be length-dependent [44]. For instance, doubling the reinitiation rate of a single transcript species results in a bigger increase in translational efficiency for shorter transcripts (Fig 3). Mutations affecting the reinitiation rate of short transcripts are therefore more likely to be selected than are those that occur on long transcripts, potentially contributing to higher levels of reinitiation on shorter transcripts as discussed above (S5 Fig). In contrast, doubling the de novo initiation rate does not result in higher translational efficiency on shorter transcripts and, under reinitiation, can actually have smaller effects on shorter transcripts due to increased initiation interference (Fig 3). Selection for increased translational efficiency on individual transcript species is therefore not predicted to result in higher de novo initiation rates on shorter transcripts. Instead, selection under reinitiation will be more effective at reducing initiation interference on shorter transcripts.
At high levels of reinitiation, we have shown that a single slow step in translation causes a greater reduction in the translational efficiency of short transcripts than that of long transcripts (Fig 5). Eliminating slow steps has larger effects on the translation of short transcripts compared to long transcripts and therefore selection to eliminate slow steps will be most effective in genes encoding short transcripts. Length-dependent selection against slow steps under reinitiation therefore offers an explanation for the negative correlation between codon adaptation and CDS length observed across eukaryotes ([4, 44, 59–63] but see also [64]). Translational efficiency is particularly sensitive to slow sites near the start codon (Fig 5, see also [21]): slow clearance of the initiation site delays reinitiation (promoting ribosome loss) and blocks de novo initiation resulting in lower ribosome densities on affected transcripts. Multiple mechanisms can determine how slowly ribosomes vacate the initiation site including the presence of one or more slow codons [21] or the presence of stable 5' secondary structures in the transcript [65]. As both features reduce yield to a greater extent on short transcripts compared to long transcripts (Fig 5), selection should be more efficient at eliminating them on shorter transcripts, consistent with the negative correlations between CDS length and both 5' mRNA folding energy and 5' codon adaptation [59]. Thus, length-dependent translation generated by high levels of reinitiation will generate length-dependent selection against slow steps [44], which will in turn reinforce patterns of length-dependent translation.
Reinitiation provides a simple mechanistic explanation for empirically observed patterns of length-dependent translation including negative correlations between CDS length and ribosome density, effective initiation rate, protein yield, transcript codon adaptation, 5' codon adaptation, 5' folding energy, and association with closed-loop factors. Under reinitiation, these patterns are expected to emerge through selection for efficient ribosome usage, maximizing protein yield, and translational efficiency on individual transcript species. This is in sharp contrast to linear models in which, at low ribosome availability, length-dependence arises through direct selection for higher de novo initiation rates on shorter transcripts [3,4]. Our model is consistent with the emerging view that translation is controlled not only by initiation, but also by elongation and termination/reinitiation [21,22,66]. This conceptual shift makes clear that manipulating any of these stages can have profound consequences for translation, and presents factors associated with elongation, release, and recycling as new targets for therapeutic intervention (cf. [67]).
Supporting information
Data Availability
The code for our TASEP is available at: https://github.com/marvinboe/reTASEP. Data sources for Fig 2: Yeast ribosome densities were taken from Arava et al 2003, available at http://genome-www.stanford.edu/yeast_translation/data.shtml. Yeast initiation rates were taken from Ciandrini et al 2013, available in Table S1 at http://dx.doi.org/10.1371/journal.pcbi.1002866. Yeast protein abundances were taken from the Peptide Atlas 2013 dataset from PaxDb, available at http://pax-db.org/dataset/4932/242/. HEK293T ribosome densities were taken from Hendrickson et al 2009, available in Dataset S3 at http://dx.doi.org/10.1371/journal.pbio.1000238. HEK293T protein abundances were taken from the Geiger MCP 2012 dataset (based on spectral counting) from PaxDb, available at http://pax-db.org/dataset/9606/485/.
Funding Statement
AT and DG received funding from the Max Planck Society https://www.mpg.de/en. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
References
- 1. Lackner DH, Bähler J. Translational control of gene expression: from transcripts to transcriptomes. Int Rev Cell Mol Biol. 2008;271: 199–251. doi: 10.1016/S1937-6448(08)01205-7
- 2. Kuersten S, Radek A, Vogel C, Penalva LOF. Translation regulation gets its 'omics' moment. Wiley Interdiscip Rev RNA. 2013;4: 617–630. doi: 10.1002/wrna.1173
- 3. Ciandrini L, Stansfield I, Romano MC. Ribosome traffic on mRNAs maps to gene ontology: genome-wide quantification of translation initiation rates and polysome size regulation. PLoS Comput Biol. 2013;9: e1002866. doi: 10.1371/journal.pcbi.1002866
- 4. Shah P, Ding Y, Niemczyk M, Kudla G, Plotkin JB. Rate-limiting steps in yeast protein translation. Cell. 2013;153: 1589–1601. doi: 10.1016/j.cell.2013.05.049
- 5. Weinberg DE, Shah P, Eichhorn SW, Hussmann JA, Plotkin JB, Bartel DP. Improved ribosome-footprint and mRNA measurements provide insights into dynamics and regulation of yeast translation. Cell Rep. 2016;14: 1787–1799. doi: 10.1016/j.celrep.2016.01.043
- 6. Arava Y, Wang Y, Storey JD, Brown PO, Herschlag D. Genome-wide analysis of mRNA translation profiles in Saccharomyces cerevisiae. Proc Natl Acad Sci USA. 2003;100: 3889–3894. doi: 10.1073/pnas.0635171100
- 7. MacKay VL, Li X, Flory MR, Turcott E, Law GL, Serikawa KA, et al. Gene expression analyzed by high-resolution state array analysis and quantitative proteomics. Mol Cell Proteomics. 2004;3: 478–489. doi: 10.1074/mcp.M300129-MCP200
- 8. Arava Y, Boas FE, Brown PO, Herschlag D. Dissecting eukaryotic translation and its control by ribosome density mapping. Nucl Acids Res. 2005;33: 2421–2432. doi: 10.1093/nar/gki331
- 9. Lackner DH, Beilharz TH, Marguerat S, Mata J, Watt S, Schubert F, et al. A network of multiple regulatory layers shapes gene expression in fission yeast. Mol Cell. 2007;26: 145–155. doi: 10.1016/j.molcel.2007.03.002
- 10. Qin X, Ahn S, Speed TP, Rubin GM. Global analyses of mRNA translational control during early Drosophila embryogenesis. Genome Biol. 2007;8: R63. doi: 10.1186/gb-2007-8-4-r63
- 11. Hendrickson DG, Hogan DJ, McCullough HL, Myers JW, Herschlag D, Ferrell JE, et al. Concordant regulation of translation and mRNA abundance for hundreds of targets of a human microRNA. PLoS Biol. 2009;7: e1000238. doi: 10.1371/journal.pbio.1000238
- 12. Ingolia NT, Ghaemmaghami S, Newman JRS, Weissman JS. Genome-wide analysis in vivo of translation with nucleotide resolution using ribosome profiling. Science. 2009;324: 218–223. doi: 10.1126/science.1168978
- 13. Park EH, Zhang F, Warringer J, Sunnerhagen P, Hinnebusch AG. Depletion of eIF4G from yeast cells narrows the range of translational efficiencies genome-wide. BMC Genomics. 2011;12: 68. doi: 10.1186/1471-2164-12-68
- 14. Lauria F, Tebaldi T, Lunelli L, Struffi P, Gatto P, Pugliese A, et al. RiboAbacus: a model trained on polyribosome images predicts ribosome density and translational efficiency from mammalian transcriptomes. Nucl Acids Res. 2015;43: e153. doi: 10.1093/nar/gkv781
- 15. Thompson MK, Rojas-Duran MF, Gangaramani P, Gilbert WV. The ribosomal protein Asc1/RACK1 is required for efficient translation of short mRNAs. eLife. 2016;5: e11154. doi: 10.7554/eLife.11154
- 16. Ghaemmaghami S, Huh WK, Bower K, Howson RW, Belle A, Dephoure N, et al. Global analysis of protein expression in yeast. Nature. 2003;425: 737–741. doi: 10.1038/nature02046
- 17. Gunaratne J, Schmidt A, Quandt A, Neo SP, Saraç OS, Gracia T, et al. Extensive mass spectrometry-based analysis of the fission yeast proteome. Mol Cell Proteomics. 2013;12: 1741–1751. doi: 10.1074/mcp.M112.023754
- 18. Vogel C, de Sousa Abreu R, Ko D, Le S-Y, Shapiro BA, Burns SC, et al. Sequence signatures and mRNA concentration can explain two-thirds of protein abundance variation in a human cell line. Mol Syst Biol. 2010;6: 400. doi: 10.1038/msb.2010.59
- 19. Wang T, Cui Y, Jin J, Guo J, Wang G, Yin X, et al. Translating mRNAs strongly correlate to proteins in a multivariate manner and their translation ratios are phenotype specific. Nucl Acids Res. 2013;41: 4743–4754. doi: 10.1093/nar/gkt178
- 20. Plotkin JB, Kudla G. Synonymous but not the same: the causes and consequences of codon bias. Nat Rev Genet. 2011;12: 32–42. doi: 10.1038/nrg2899
- 21. Chu D, Kazana E, Bellanger N, Singh T, Tuite MF, von der Haar T. Translation elongation can control translation initiation on eukaryotic mRNAs. EMBO J. 2014;33: 21–34. doi: 10.1002/embj.201385651
- 22. Tarrant D, von der Haar T. Synonymous codons, ribosome speed, and eukaryotic gene expression regulation. Cell Mol Life Sci. 2014;71: 4195–4206. doi: 10.1007/s00018-014-1684-2
- 23. Phillips GR. Haemoglobin synthesis and polysomes in intact reticulocytes. Nature. 1965;205: 567–570.
- 24. Baglioni C, Vesco C, Jacobs-Lorena M. The role of ribosomal subunits in mammalian cells. Cold Spring Harb Symp Quant Biol. 1969;34: 555–565.
- 25. Afonina ZA, Myasnikov AG, Shirokov VA, Klaholz BP, Spirin AS. Conformation transitions of eukaryotic polyribosomes during multi-round translation. Nucl Acids Res. 2015;43: 618–628. doi: 10.1093/nar/gku1270
- 26. Yoffe AM, Prinsen P, Gelbart WM, Ben-Shaul A. The ends of a large RNA molecule are necessarily close. Nucl Acids Res. 2011;39: 292–299. doi: 10.1093/nar/gkq642
- 27. Leija-Martínez N, Casas-Flores S, Cadena-Nava R, Roca JA, Mendez-Cabañas JA, Gomez E, Ruiz-Garcia J. The separation between the 5'-3' ends in long RNA molecules is short and nearly constant. Nucl Acids Res. 2014;42: 13963–13968. doi: 10.1093/nar/gku1249
- 28. Mazumder B, Seshadri S, Fox PL. Translational control by the 3'-UTR: the ends specify the means. Trends Biochem Sci. 2003;28: 91–98. doi: 10.1016/S0968-0004(03)00002-1
- 29. Wilkie GS, Dickson KS, Gray NK. Regulation of mRNA translation by 5'- and 3'-UTR-binding factors. Trends Biochem Sci. 2003;28: 182–188. doi: 10.1016/S0968-0004(03)00051-3
- 30. Nelson EM, Winkler MM. Regulation of mRNA entry into polysomes. J Biol Chem. 1987;262: 11501–11506.
- 31. Kopeina GS, Afonina ZA, Gromova KV, Shirokov VA, Vasiliev VD, Spirin AS. Step-wise formation of eukaryotic double-row polyribosomes and circular translation of polysomal mRNA. Nucl Acids Res. 2008;36: 2476–2488. doi: 10.1093/nar/gkm1177
- 32. Alekhina OM, Vassilenko KS, Spirin AS. Translation of non-capped mRNAs in a eukaryotic cell-free system: acceleration of initiation rate in the course of polysome formation. Nucl Acids Res. 2007;35: 6547–6559. doi: 10.1093/nar/gkm725
- 33. Chou T, Mallick K, Zia RKP. Non-equilibrium statistical mechanics: from a paradigmatic model to biological transport. Rep Prog Phys. 2011;74: 116601. doi: 10.1088/0034-4885/74/11/116601
- 34. Marshall E, Stansfield I, Romano MC. Ribosome recycling induces optimal translation rate at low ribosomal availability. J R Soc Interface. 2014;11: 20140589. doi: 10.1098/rsif.2014.0589
- 35. Maquat LE, Tarn WY, Isken O. The pioneer round of translation: features and functions. Cell. 2010;142: 368–374. doi: 10.1016/j.cell.2010.07.022
- 36. Nagar A, Valleriani A, Lipowsky R. Translation by ribosomes with mRNA degradation: exclusion processes on aging tracks. J Stat Phys. 2011;145: 1385–1404. doi: 10.1007/s10955-011-0347-z
- 37. Gorissen M, Vanderzande C. Ribosome dwell times and the protein copy number distribution. J Stat Phys. 2012;148: 628–636. doi: 10.1007/s10955-012-0452-7
- 38. Gillespie DT. Exact stochastic simulation of coupled chemical reactions. J Phys Chem. 1977;81: 2340–2361. doi: 10.1021/j100540a008
- 39. Pelechano V, Chávez S, Pérez-Ortín JE. A complete set of nascent transcription rates for yeast genes. PLoS ONE. 2010;5: e15442. doi: 10.1371/journal.pone.0015442
- 40. Wang M, Weiss M, Simonovic M, Haertinger G, Schrimpf SP, Hengartner MO, et al. PaxDb, a database of protein abundance averages across all three domains of life. Mol Cell Proteomics. 2012;11: 492–500. doi: 10.1074/mcp.O111.014704
- 41. Charneski CA, Hurst LD. Positively charged residues are the major determinants of ribosomal velocity. PLoS Biol. 2013;11: e1001508. doi: 10.1371/journal.pbio.1001508
- 42. Yu CH, Dang Y, Zhou Z, Wu C, Zhao F, Sachs MS, et al. Codon usage influences the local rate of translation elongation to regulate co-translation protein folding. Mol Cell. 2015;59: 744–754. doi: 10.1016/j.molcel.2015.07.018
- 43. Presnyak V, Alhusaini N, Chen YH, Martin S, Morris N, Kline N, et al. Codon optimality is a major determinant of mRNA stability. Cell. 2015;160: 1111–1124. doi: 10.1016/j.cell.2015.02.029
- 44. Comeron JM, Kreitman M, Aguadé M. Natural selection on synonymous sites is correlated with gene length and recombination in Drosophila. Genetics. 1999;151: 239–249.
- 45. Gilchrist MA, Wagner A. A model of protein translation including codon bias, nonsense errors, and ribosome recycling. J Theor Biol. 2006;239: 417–434. doi: 10.1016/j.jtbi.2005.08.007
- 46. Lanza AM, Curran KA, Rey LG, Alper HS. A condition-specific codon optimization approach for improved heterologous gene expression in Saccharomyces cerevisiae. BMC Syst Biol. 2014;8: 33. doi: 10.1186/1752-0509-8-33
- 47. Warner JR. The economics of ribosome biosynthesis in yeast. Trends Biochem Sci. 1999;24: 437–440. doi: 10.1016/S0968-0004(99)01460-7
- 48. Chu D, Thompson J, von der Haar T. Charting the dynamics of translation. BioSystems. 2014;119: 1–9. doi: 10.1016/j.biosystems.2014.02.005
- 49. Soifer I, Barkai N. Systematic identification of cell size regulators in budding yeast. Mol Syst Biol. 2014;10: 761. doi: 10.15252/msb.20145345
- 50. Polymenis M, Aramayo R. Translate to divide: control of the cell cycle by protein synthesis. Microbial Cell. 2015;2: 94–104. doi: 10.15698/mic2015.04.198
- 51. Jan CH, Williams CC, Weissman JS. Principles of ER cotranslational translocation revealed by proximity-specific ribosome profiling. Science. 2014;346: 1257521. doi: 10.1126/science.1257521
- 52. Wang C, Han B, Zhou R, Zhuang X. Real-time imaging of translation on single mRNA transcripts in live cells. Cell. 2016;165: 990–1001. doi: 10.1016/j.cell.2016.04.040
- 53. Yan X, Hoek TA, Vale RD, Tanenbaum ME. Dynamics of translation of single mRNA molecules in vivo. Cell. 2016;165: 976–989. doi: 10.1016/j.cell.2016.04.034
- 54. Archer SK, Shirokikh NE, Hallwirth CV, Beilharz TH, Preiss T. Probing the closed-loop model of mRNA translation in living cells. RNA Biol. 2015;12: 248–254. doi: 10.1080/15476286.2015.1017242
- 55. Costello J, Castelli LM, Rowe W, Kershaw CJ, Talavera D, Mohammed-Qureshi SS, et al. Global mRNA selection mechanisms for translation initiation. Genome Biol. 2015;16: 10. doi: 10.1186/s13059-014-0559-z
- 56. Amrani N, Ghosh S, Mangus DA, Jacobson A. Translation factors promote the formation of two states of the closed-loop mRNP. Nature. 2008;453: 1276–1280. doi: 10.1038/nature06974
- 57. Fernandes LD, de Moura A, Ciandrini L. Gene length as a regulator for ribosome recruitment and protein synthesis: theoretical insights. 2017. arXiv 1702.00632v1
- 58. Kafri M, Metzl-Raz E, Jona G, Barkai N. The cost of protein production. Cell Rep. 2016;14: 22–31. doi: 10.1016/j.celrep.2015.12.015
- 59. Ding Y, Shah P, Plotkin JB. Weak 5'-mRNA secondary structures in short eukaryotic genes. Genome Biol Evol. 2012;4: 1046–1053. doi: 10.1093/gbe/evs082
- 60. Duret L, Mouchiroud D. Expression pattern and, surprisingly, gene length shape codon usage in Caenorhabditis, Drosophila, and Arabidopsis. Proc Natl Acad Sci USA. 1999;96: 4482–4487.
- 61. Marais G, Duret L. Synonymous codon usage, accuracy of translation, and gene length in Caenorhabditis elegans. J Mol Evol. 2001;52: 275–280.
- 62. Kliman RM, Irving N, Santiago M. Selection conflicts, gene expression, and codon usage trends in yeast. J Mol Evol. 2003;57: 98–109. doi: 10.1007/s00239-003-2459-9
- 63. Waldman YY, Tuller T, Shlomi T, Sharan R, Ruppin E. Translation efficiency in humans: tissue specificity, global optimization and differences between developmental stages. Nucl Acids Res. 2010;38: 2964–2974. doi: 10.1093/nar/gkq009
- 64. Coghlan A, Wolfe KH. Relationship of codon bias to mRNA concentration and protein length in Saccharomyces cerevisiae. Yeast. 2000;16: 1131–1145. doi: 10.1002/1097-0061(20000915)16:12<1131::AID-YEA609>3.0.CO;2-F
- 65. Gu W, Zhou T, Wilke CO. A universal trend of reduced mRNA stability near the translation-initiation site in prokaryotes and eukaryotes. PLoS Comput Biol. 2010;6: e1000664. doi: 10.1371/journal.pcbi.1000664
- 66. Merrick WC, Harris ME. Control not at initiation? Bah, humbug! EMBO J. 2014;33: 3–4. doi: 10.1002/embj.201387388
- 67. Richter JD, Coller J. Pausing on polyribosomes: make way for elongation in translational control. Cell. 2015;163: 292–300. doi: 10.1016/j.cell.2015.09.041
- 68. Lipson D, Raz T, Kieu A, Jones DR, Giladi E, Thayer E, Thompson JF, Letovsky S, Milos P, Causey M. Quantification of the yeast transcriptome by single-molecule sequencing. Nat Biotech. 2009;27: 652–658. doi: 10.1038/nbt.1551
- 69. Miura F, Kawaguchi N, Yoshida M, Uematsu C, Kito K, Sakaki Y, Ito T. Absolute quantification of the budding yeast transcriptome by means of competitive PCR between genomic and complementary DNAs. BMC Genomics. 2008;9: 574. doi: 10.1186/1471-2164-9-574