Abstract
Mitochondria play critical roles in cellular metabolism, primarily by serving as the site of assembly and function of the oxidative phosphorylation (OXPHOS) machinery. The OXPHOS proteins are encoded by mitochondrial DNA (mtDNA) and nuclear DNA, which reside and are regulated within separate compartments. To unravel how the two gene expression systems collaborate to produce the OXPHOS complexes, the regulatory principles controlling the production of mtDNA-encoded proteins need to be elucidated. In this study, we performed a quantitative analysis of the mitochondrial messenger RNA (mt-mRNA) life cycle to determine which steps of gene expression experience strong regulatory control. Our analysis revealed that the high accumulation of mt-mRNA despite their rapid turnover was made possible by a 700-fold higher transcriptional output than nuclear-encoded OXPHOS genes. In addition, we observed that mt-mRNA processing and its association with the mitochondrial ribosome occur rapidly and that these processes are linked mechanistically. Based on these data, we developed a model of mtDNA expression that is predictive across human cell lines, revealing that differences in turnover and translation efficiency are the major contributors to mitochondrial-encoded protein synthesis. Applying this framework to a disease model of Leigh syndrome, French-Canadian type, we found that the disease-associated nuclear-encoded gene, LRPPRC, acts predominantly by stabilizing mt-mRNA. Our findings provide a comprehensive view of the intricate regulatory mechanisms governing mtDNA-encoded protein synthesis, highlighting the importance of quantitatively analyzing the mitochondrial RNA life cycle in order to decode the regulatory principles of mtDNA expression.
Introduction
A major goal of genomics is determining how DNA and RNA govern protein abundance, and the main principles underlying these relationships have been quantitatively investigated for both bacterial and eukaryotic nuclear genomes1,2. In these systems, regulation of transcription initiation is a primary determinant of protein production, with other layers of gene regulation supplementing or augmenting the control exerted at gene promoters. Animal cells maintain and express two genomes: the nuclear chromosomes and the small, circular, and highly polyploid mitochondrial DNA (mtDNA). Mitochondrial and nuclear gene expression processes are spatially separated and governed by evolutionarily distinct transcription, RNA processing, and translation machinery3. However, because the nuclear and mitochondrial genomes each encode distinct essential components of the dual-origin oxidative phosphorylation (OXPHOS) complexes responsible for cellular respiration and ATP production3–5, the expression of nuclear and mitochondrial DNA must be coordinated. In human cells, the variations in the relative abundance of mtDNA-encoded RNAs and proteins across cell types and conditions match those of their nuclear-encoded OXPHOS counterparts3,5,6. Deregulation of protein synthesis between the two compartments causes proteostatic stress5,7, and mutations in genes involved in mitochondrial gene expression can cause severe human disease syndromes8.
Nuclear-encoded genes are transcribed into monocistronic transcripts that are spliced, polyadenylated, and exported into the cytoplasm for translation. By contrast, mitochondrial genes are transcribed into two long polycistronic primary transcripts encoded on each strand of the mtDNA, the heavy and light strands (Figure 1A). The primary transcripts are cleaved into individual mt-mRNAs, transfer RNAs (mt-tRNA), and ribosomal RNAs (mt-rRNA). After processing, mt-mRNA is typically adorned with short poly(A) tails before being translated by the mitochondrial ribosomes (mitoribosomes) at the inner mitochondrial membrane9. Although the steps of mitochondrial gene expression are well-defined, the quantitative and kinetic rules that govern them have not been fully resolved. Because mt-RNAs are produced from only three promoters10, it is clear that regulation of transcription initiation cannot contribute much to the relative levels of mitochondrial-encoded proteins. Accordingly, the steady-state mitochondrial transcriptome must be established by post-transcriptional processes. However, the timing of the post-transcriptional steps in the mitochondrial RNA life cycle, such as processing, ribosome association, and turnover, remains obscure, and the contributions of these processes to mitochondrial gene expression have not been quantified6,11. To understand how dual-origin OXPHOS complexes are co-regulated, we first need to determine the relative contributions of core regulatory modules to mitochondrial gene expression. The recent development of tools such as nucleotide-conversion-based strategies to measure RNA turnover (Fig 1B)12,13 and direct RNA sequencing to measure RNA abundance and processing (Fig 1G) enables the quantitative and kinetic analysis of the mt-RNA life cycle with sufficient resolution to answer these questions for the first time14.
Figure 1.
RNA turnover is the main determinant of the relative abundance of the heavy strand–encoded transcriptome. A) The heavy strand of the mitochondrial genome encodes two ribosomal RNAs, 10 mt-mRNAs (two bicistronic), and several tRNAs, whereas the light strand encodes one mRNA as well as tRNAs. Mitochondrial transcription is polycistronic, with all transcription start sites located upstream of the coding regions10, resulting in long primary transcripts that are cleaved by processing enzymes into individual rRNAs, tRNAs, and mRNAs (color-coded by OXPHOS complex). B) Overview of TimeLapse-seq, which produces measurements of fraction new RNA. Green dots indicate metabolically incorporated 4sU, which is modified during TL chemistry and misread as C (purple dots) during reverse transcription. C) Turnover profiles for two mt-mRNAs, COX1 and ND1, and one mt-rRNA, RNR2. Dots show the measured data, and lines indicate the best fit for an exponential model. D) Distribution of half-lives derived from the exponential model for the mt-rRNA RNR2, all heavy strand–encoded mt-mRNAs (“Mito”, n=10), and all nuclear mRNAs encoding OXPHOS subunits (“Nuc”, n= 76). E) Same as D), but for transcript abundance estimated based on the RPKMs from the same experiment. Zero values were excluded. F) Production rate calculated from the turnover rate in D) and the abundance in E) by dividing the RPKM by the half-life. G) Schematic of nanopore direct RNA sequencing to acquire minimally biased RNA abundances. 3’-end tailing was accomplished through in vitro addition of a poly(A) tail or ligation of a short DNA linker. H) Reads from one direct RNA-seq experiment mapping to the mitochondrial heavy strand (in black), with the results from the transcription model (shown in the right red box) overlaid in red. The model takes into account turnover and transcription parameters to estimate RNA abundance. Firing rate = slope of 3’ ends of mapped reads from direct RNA-seq/elongation rate20,21. 
I) ND1-normalized abundance from nanopore direct RNA-seq is plotted against ND1-normalized turnover rates from mito-TL-seq. Error bars represent standard deviation. Pearson R2, calculated in non-log space, is displayed for all transcripts (top left corner) and all mt-mRNAs (excluding RNR2, top left corner in dashed line box). Two promoter distal genes are highlighted. J) LRPPRC is a multifunctional protein that stabilizes most mitochondrial mRNAs. K) Scatter plot showing a strong correlation between the change in RNA abundance and the change in turnover in HEK293T cells depleted of LRPPRC (LRPPRCKO) relative to wild-type (WT) cells. Data from two replicates are shown. Error bars show the standard deviation. The Pearson R2 is shown. All data in the figure are from HeLa cells except those in K), which includes data from HEK293T WT and LRPPRCKO cells.
In this study, we leveraged these tools to quantitatively measure the lives of mitochondrial RNAs by determining the rates of each major life-cycle step. We found that the rate of production of mt-mRNA was 700-fold higher than that of nuclear-encoded OXPHOS transcripts, and that differential turnover can explain steady-state mt-mRNA abundance, with transcriptional mechanisms playing little if any role. Mt-mRNA processing, which includes cleavage out of the primary transcript and poly(A) tailing, generally occurred rapidly (<1 min). Mitoribosome association occurred on the same time scale as processing, and most mt-mRNAs were mitoribosome-bound. In certain cases, mt-mRNA 5’-end processing was rate-limiting for ribosome association. A quantitative gene expression model combining all our kinetic measurements revealed that mitochondrial protein synthesis can be explained by mt-mRNA turnover rates (~60% of variability), translation efficiency (~40%), and ribosome-association rates (<1%). Intriguingly, mitochondrial translational efficiencies were largely constant across the cell lines we analyzed. Finally, we used our model to disentangle the role of the disease-associated LRPPRC protein (leucine rich pentatricopeptide repeat containing) in the control of gene expression and found that the critical impact of LRPPRC loss is the dysregulation of mt-RNA stability. Our model of mitochondrial gene expression control is the first systematic, quantitative, and comparative kinetic analysis of the mitochondrial transcriptome, and sheds light on how mitochondrial gene expression can be controlled by and coordinated with nuclear-encoded genes.
Results
Differential mt-mRNA degradation rates determine the heavy-strand transcriptome
The mtDNA heavy strand encodes two mitochondrial rRNAs, RNR1 and RNR2, and 10 of the 11 mt-mRNAs (Fig 1A), which are produced in a single primary transcript and processed into individual RNAs. Although produced at equimolar proportions, heavy-strand RNAs accumulate at levels that vary over two orders of magnitude6,11. RNA stability has been proposed to be a leading contributor to the differential steady-state levels of the individual transcripts15,16. However, testing this hypothesis has been challenging due to a lack of methods to quantify RNA turnover that do not rely on strong perturbations, e.g., transcription inhibition, which leads to compensatory changes to RNA turnover rates11,15. Recent advances in the detection of modified bases by next-generation sequencing have led to the development of nucleotide-conversion-based strategies to measure RNA turnover (Fig 1B)12,13. We leveraged this strategy and adapted TimeLapse-seq (TL-seq)13 for the quantitative measurement of mitochondrial RNA (mito-TL-seq). In our approach, the nucleotide analog 4-thiouridine (4sU) is introduced to cells for various amounts of time. After RNA extraction, incorporated 4sU is chemically modified, causing it to be misread during reverse transcription and resulting in T-to-C mismatches during sequencing (Fig 1B). The fraction of new transcripts for each gene is determined by the analysis of mismatches that occur after sequence alignment, and a simple exponential decay model is used to estimate the half-lives of all transcripts.
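The exponential model described above can be sketched in a few lines of Python. This is a minimal illustration, not the fitting procedure used in the study: it assumes steady state, so that the fraction of new RNA follows 1 − exp(−k·t), and it fits k by a log-linearized least-squares line through the origin. The function names and the synthetic time course (generated from a hypothetical 45-minute half-life) are assumptions introduced for illustration only.

```python
import math


def fit_turnover_rate(times_min, fraction_new):
    """Fit k in fraction_new(t) = 1 - exp(-k * t).

    The model is linearized as -ln(1 - f) = k * t and fitted as a
    least-squares line through the origin (an assumption of this sketch).
    """
    num = 0.0
    den = 0.0
    for t, f in zip(times_min, fraction_new):
        if not 0.0 <= f < 1.0:
            continue  # values outside [0, 1) cannot be log-transformed
        y = -math.log(1.0 - f)
        num += t * y
        den += t * t
    if den == 0.0:
        raise ValueError("need at least one usable time point with t > 0")
    return num / den


def half_life(k):
    """Convert a first-order rate constant into a half-life."""
    return math.log(2.0) / k


# Hypothetical labeling time course (minutes), simulated from a
# 45-minute half-life; these are not measured values.
true_k = math.log(2.0) / 45.0
times = [15, 30, 60, 120]
fractions = [1.0 - math.exp(-true_k * t) for t in times]

k_hat = fit_turnover_rate(times, fractions)
estimated_half_life = half_life(k_hat)
```

With noisy data, the log transform weights late time points more heavily than a direct nonlinear fit would, so this shortcut is best read as an illustration of the model rather than a recommended estimator.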
In developing mito-TL-seq, we found that 4sU was incorporated at a lower frequency in mitochondrial-encoded RNAs than in nuclear-encoded transcripts (Fig S1A). Because high levels of 4sU lead to numerous secondary effects17, rather than increasing the concentration of 4sU, we re-established computational approaches used in the analysis of these data14,18 to properly quantify T-to-C conversions in mitochondrial RNA sequencing reads. To decrease background, we created reference genomes for each cell line from which single nucleotide polymorphisms had been removed, analyzed reads aligning to the mitochondrial genome separately, and determined the T-to-C conversion rates using a custom binomial mixture model14. Using this approach, we analyzed HeLa cells labeled with as little as 50 μM 4sU, a concentration that causes minimal secondary effects17. Turnover rates (Fig 1C, Table S1) were reproducible between replicates and independent of 4sU concentration (Fig S1D). To test whether 4sU incorporation impacted the stability of labeled RNA, as well as whether sequencing affected our estimates of RNA stability, we applied an alternative method in which we analyzed native RNA depleted of nascent 4sU-labeled RNA using a probe-based approach, MitoStrings (Fig S1B, C, and D, Table S1)14,19. We observed a strong correlation between the MitoStrings and mito-TL-seq half-lives (Figure S1D, R2 > 0.9).
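To illustrate the general idea of a binomial mixture model for T-to-C conversions, the sketch below estimates the fraction of new reads for one transcript with a basic expectation-maximization loop. It is a generic two-component mixture and not the custom model used in the study: the conversion probabilities for new and old reads (`p_new`, `p_old`) are assumed to be known, and the read counts are invented for illustration.

```python
import math


def binom_pmf(k, n, p):
    """Probability of k conversions among n sequenced T positions."""
    return math.comb(n, k) * p**k * (1.0 - p) ** (n - k)


def estimate_fraction_new(reads, p_new, p_old, n_iter=200):
    """EM estimate of the fraction of reads derived from new RNA.

    reads: list of (number of T positions, number of T-to-C conversions).
    p_new, p_old: per-T conversion probabilities in new and old reads,
    treated as known constants in this sketch.
    """
    pi = 0.5  # initial guess for the mixing weight
    for _ in range(n_iter):
        responsibility_sum = 0.0
        for n_t, n_conv in reads:
            new = pi * binom_pmf(n_conv, n_t, p_new)
            old = (1.0 - pi) * binom_pmf(n_conv, n_t, p_old)
            responsibility_sum += new / (new + old)
        pi = responsibility_sum / len(reads)
    return pi


# Hypothetical reads: 70 without conversions and 30 with three
# conversions, each covering 50 T positions.
reads = [(50, 0)] * 70 + [(50, 3)] * 30
fraction_new = estimate_fraction_new(reads, p_new=0.05, p_old=0.001)
```

The estimate comes out slightly above the naive 30% of reads carrying conversions, because a new read covering 50 T positions still has an appreciable chance of showing no conversion at a 5% incorporation rate. This is the reason a mixture model is preferable to simply counting converted reads when incorporation is low.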
We compared the half-lives of nuclear- and mitochondrial heavy strand–encoded OXPHOS mRNA, as well as RNR2 (RNR1 is removed during an rRNA depletion step). Half-lives ranged broadly, with mitochondrial-encoded mRNAs turning over about 5-fold more rapidly than nuclear-encoded OXPHOS mRNAs (Fig 1D). By contrast, RNR2 was turned over >20-fold slower than any OXPHOS mRNA. Relative abundances measured by mito-TL-seq showed that mt-mRNAs were >100-fold more abundant than nuclear-encoded OXPHOS mRNAs (Fig 1E, Table S2). Moreover, RNR2 was 100-fold more abundant than the mt-mRNAs (Fig 1E). To estimate production levels for each RNA, we divided the abundance estimates by the half-lives for each gene (Fig 1F), revealing that mitochondrial RNA production is more than 700-fold higher than that of nuclear-encoded OXPHOS mRNAs. Importantly, production rates for mitochondrial rRNA and mRNAs were comparable (Figure 1F), consistent with the polycistronic nature of mitochondrial transcription.
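The production-rate calculation follows from the steady-state balance between synthesis and decay: abundance = production / k_deg, so production = abundance × ln(2) / half-life. The short sketch below shows the arithmetic. The abundance and half-life values are hypothetical placeholders chosen only to mirror the reported orders of magnitude; they are not values from Table S1 or S2.

```python
import math


def production_rate(abundance, half_life_min):
    """Steady-state production rate in abundance units per minute."""
    k_deg = math.log(2.0) / half_life_min
    return abundance * k_deg


# Hypothetical relative values: a mitochondrial mRNA that is 150-fold
# more abundant and turns over 5-fold faster than a nuclear-encoded one.
mito = production_rate(abundance=150.0, half_life_min=60.0)
nuclear = production_rate(abundance=1.0, half_life_min=300.0)
fold_difference = mito / nuclear
```

Because the ln(2) factor cancels in any ratio, dividing abundance by half-life, as done for Fig 1F, yields the same fold differences as the full rate expression.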
To obtain higher-accuracy measurements of RNA abundance with minimal biases from library-generation steps, such as PCR amplification, we used nanopore direct RNA-sequencing to quantify the steady-state heavy-strand transcriptome (Fig 1G, H, S1E, Table S2). By comparing the half-lives from mito-TL-seq with the abundance from direct RNA-seq, we sought to determine to what extent heavy-strand encoded mRNA abundance could be explained by differential RNA stability. The coefficient of determination (R2) was 0.858, indicating that more than 85% of the variability in mRNA abundance can be explained by degradation alone (Fig 1I). Importantly, the dynamic ranges of the two measurements are matched (with ND1-relative abundance and half-lives both ranging from 0 to 100), which must be the case if turnover is truly a driver of the steady-state abundance of mitochondrial RNAs.
To test our approach in a system with different turnover rates, we analyzed LRPPRC knockout and matched control cells5. Mutations in LRPPRC cause the French-Canadian type of Leigh syndrome, and the LRPPRC protein is involved in mt-mRNA stabilization, poly(A) tailing, and translation (Fig 1J)5,22,23. Depletion of LRPPRC had profound effects on the steady-state levels of most heavy-strand encoded RNAs, as seen previously (Fig S1O). Moreover, we observed a linear relationship between the stabilities of wild-type mt-mRNAs and their degree of destabilization in the absence of LRPPRC (R2 = 0.59), indicating that LRPPRC plays a major role in establishing the relative abundance of the mitochondrial transcriptome (Fig S1P). Importantly, changes in RNA half-lives correlated strongly with changes in the steady-state levels of RNA in these cells (R2 = 0.92) (Fig 1K). Manipulating transcript stability led to predictable alterations in RNA abundance, emphasizing the prominent role of transcript stability in establishing the mitochondrial transcriptome.
Slow mt-RNA polymerase elongation contributes a detectable RNA fraction to the mitochondrial transcriptome
Many of the transcripts that deviate most from a direct one-to-one correspondence between turnover and abundance levels are encoded on the promoter-distal end (e.g., CYB and ND4/4L) (Fig 1I). When transcription is polycistronic, more actively transcribing RNA polymerases will have traversed promoter-proximal genes than promoter-distal genes at any point in time (Fig S1Q, 1H). If transcription is slow but RNA turnover is fast, transcription elongation will lead to more promoter-proximal than distal nascent RNA. We used a mathematical modeling approach20 to predict total RNA abundance from the estimates of mature RNA levels using our turnover measurements and the nascent RNA levels (Fig 1H). The transcript abundance differences of all heavy-strand genes could be well explained by the minimal model assumptions of a single transcription initiation site and rate, gene-specific RNA turnover rates, and an mt-RNA polymerase elongation rate in the range 0.06–0.12 kb/min (Fig 1H, S1R). This range is lower than in vitro estimates of the mt-RNA polymerase elongation rate, which range from 0.22 to 0.72 kb/min24–26; the difference could indicate a transcriptional slow-down in vivo due to nucleoid-associated DNA binding factors. In summary, our kinetic measurements suggest that the steady-state mitochondrial transcriptome is shaped largely by differential degradation, with transcription elongation providing a differential contribution for genes encoded proximal to the heavy strand transcription start site, leaving little room for other mechanisms such as early transcription termination.
The light strand-encoded transcriptome is also determined by degradation kinetics
The light strand-encoded transcriptome consists of one mRNA, ND6, the 7S regulatory RNA, eight tRNAs, and numerous antisense (or ‘mirror’) RNAs with unknown functions. Quantifying the abundance of light-strand RNAs is challenging due to the strong expression of the opposite strand and slow or incomplete RNA processing. Strand-switching or index hopping in RNA-seq library generation27 converted ~3% of the heavy-strand RNAs into contaminating reads mapping to the light-strand (Fig S1 F–I). Moreover, the accumulation of multiple light-strand RNA processing intermediates complicates their robust quantification by nanopore direct RNA-seq due to the method’s limited read lengths of ~1 kb. Nevertheless, we reasoned that we could use our 4sU-induced sequencing mismatches to detect strand-switched reads (Figure S1J and K). After correcting both the turnover (Figure S1L) and abundance (Figure S1K) measurements for contaminating reads, we found a strong correlation (R2>0.9) between abundance and turnover (Fig S1M, N), suggesting that the light strand–encoded transcriptome is also predominantly regulated by degradation kinetics.
Mitochondrial RNA processing is rapid and occurs predominantly co-transcriptionally
Mitochondrial precursor RNAs are processed into individual RNA molecules mainly through the excision of tRNAs that are encoded at the 5’ and 3’ ends of mRNAs28. We next investigated how RNA processing kinetics relate to transcription and turnover rates. To estimate processing rates in HeLa cells, we used two complementary approaches. First, nanopore direct RNA sequencing reads quantitatively capture both processed and unprocessed transcripts (Fig. 2A and B, Fig S2A), enabling measurements of the fraction of unprocessed transcripts at steady state, which we used to estimate processing rates (Fig 2C, Table S3). Second, we directly measured 5’-end processing for COX2, COX3, and ATP8/6 mRNAs by performing a time-series 4sU-labeling experiment capturing time points from 3 to 240 minutes. We then biotinylated and enriched the newly synthesized RNA and estimated the fraction of processed transcripts using MitoStrings probes that spanned processing junctions (Fig S2D, E). The indirect estimates based on nanopore direct RNA sequencing and the direct measurements made with MitoStrings agreed (Fig S2F). These analyses revealed that processing occurs rapidly, with unprocessed transcripts having a median half-life of less than 1 minute (Fig 2D and Fig S2B). However, some processing steps took considerably longer (9–39 min). For example, the processing of ATP8/6 and COX3, previously observed to accumulate as a translation-competent tricistronic transcript5,19,29, took over 20 minutes (Fig 2A, D). We also observed the accumulation of the processing intermediate RNR2-tRNALeu-ND1, referred to as RNA1929 (Fig 2B), and a form of this transcript containing only tRNALeu-ND1, as well as significant amounts of unprocessed 5’ ends of COX1 (Fig S2A). The same processing intermediates accumulated in direct RNA-seq libraries from HeLa cells, K562 cells, human myocytes, and myoblasts (Fig S2C).
Because the mitochondrial transcription elongation rate is less than 1 kb/min (Figure 1H), we conclude that processing must occur predominantly co-transcriptionally.
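The link between a steady-state fraction of unprocessed transcripts and a processing rate can be illustrated with a two-state model. The sketch below is one possible derivation and is not necessarily the equation shown in Fig 2C: it assumes that unprocessed and processed transcripts are degraded at the same rate k_deg, in which case the unprocessed fraction is f = k_deg / (k_deg + k_proc) and therefore k_proc = k_deg × (1 − f) / f. The function name and the example values are hypothetical.

```python
import math


def processing_half_life(fraction_unprocessed, mrna_half_life_min):
    """Time until half of new transcripts are processed at one site.

    Assumes a two-state model (unprocessed -> processed) in which both
    forms are degraded at the same rate; this is an assumption of the
    sketch rather than a statement about the published model.
    """
    f = fraction_unprocessed
    if not 0.0 < f < 1.0:
        raise ValueError("fraction_unprocessed must lie strictly between 0 and 1")
    k_deg = math.log(2.0) / mrna_half_life_min
    k_proc = k_deg * (1.0 - f) / f
    return math.log(2.0) / k_proc


# Hypothetical example: 2% of reads unprocessed at a junction of an
# mRNA with a 49-minute half-life gives a 1-minute processing half-life.
example_half_life = processing_half_life(0.02, 49.0)
```

Under these assumptions, a small unprocessed fraction combined with a known turnover rate is sufficient to place the processing step on a time scale of minutes, which is how a steady-state snapshot can yield kinetic information.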
Figure 2.
RNA processing is generally rapid, with notable exceptions. A) 100 randomly sampled reads from nanopore direct RNA-seq that align to ATP8/6 and/or COX3 include many transcripts (orange reads) that are unprocessed at the 5’ and/or 3’ ends. B) Same as A), but for 50 reads mapping to ND1. C) Equation used to estimate processing rates from nanopore direct RNA seq and mito-TL-seq turnover measurements as shown in D). D) The amount of time required until half of the newly synthesized transcripts have been processed at a specific site. Processing sites are subdivided into 3’ or 5’ ends of the mRNA, as well as into groups of canonical (tRNA-flanking) or non-canonical junctions. E) Distribution of poly(A) tail lengths measured by direct RNA-seq of poly(A)+ RNA. All pairwise comparisons between transcripts are significant (p-value < 0.05) by the Wilcoxon rank-sum test, except ND3 vs. ATP8/6. F) Density plot showing the distribution of poly(A) tail lengths for four genes. Orange lines show mRNAs with unprocessed 5’ ends (more nascent on average) and blue lines show mRNAs with processed 5’ ends. The numbers of reads underlying each distribution (in blue or yellow, respectively) and the p-value from the Wilcoxon rank-sum test comparing processed and unprocessed reads are displayed. All data in the figure are from HeLa cells.
Poly(A) tailing occurs rapidly after 3’-end processing and can precede 5’-end processing
Mitochondrial mRNAs, except ND6, have short poly(A) tails30. For most transcripts, the poly(A) tail is required to complete the stop codon30. To estimate poly(A) tailing kinetics, we adapted nanopore direct RNA sequencing to sequence RNAs with and without a poly(A) tail by ligating a short DNA linker to the 3’ ends of all RNAs. Using these data, we compared the number of mRNAs that had been cleaved at the 3’-end but not yet polyadenylated to the amount that is both cleaved and polyadenylated. Consistent with previous measurements30, we found that most mRNAs had poly(A) tail lengths between 40 and 60 nucleotides, with ND5 having shorter poly(A) tails than other transcripts (median: 28–37 nt, depending on the library preparation method, Fig 2E, Fig S2G, Table S4). Only a small fraction of cleaved transcripts (< 4%) still lacked poly(A) tails (Fig. S2H). Due to the low level of non-polyadenylated reads, there was not sufficient coverage to calculate polyadenylation rates. Nevertheless, our results indicate that polyadenylation occurs very rapidly following 3’-end cleavage.
To analyze the timing of 3’-end polyadenylation relative to 5’-end processing, we compared the lengths of poly(A) tails as a function of the 5’ processing status. We saw a significant difference for several mRNAs (Fig. 2F). ND1, COX1, and CYB mRNAs with unprocessed 5’-ends had shorter tails than the corresponding mRNAs with processed 5’-ends, an effect that was observed across different cell types (Fig. S2I). Nevertheless, the tail lengths for the unprocessed transcripts were still sizable, with a median length only 4–9 nt shorter than those of the processed transcripts. mRNAs with unprocessed 5’-ends are younger on average than processed transcripts and their average poly(A) tail length is shorter. The fact that we captured a difference in poly(A) tail length suggests that the lengthening of the poly(A) tail, relative to the initiation of polyadenylation, is a slower process31. By contrast, COX3 mRNA showed the opposite trend, with unprocessed transcripts having slightly longer tails (median 2.3 nt longer) than the processed transcripts. If poly(A) tail length reflects the age of the transcripts, this observation is consistent with the tricistronic ATP8/6-COX3 mRNA being more stable than the processed COX3 mRNA. In sum, poly(A) tailing is initiated rapidly on the same time scales as processing, and nascent mRNAs (with unprocessed 5’ ends) bear almost full-length tails, consistent with poly(A) tailing being a one-step process31.
mt-mRNAs arrive rapidly to mitoribosomes with rates that vary by 4-fold
mt-mRNAs are recruited directly to the monosome, and the only known ribosomal recruitment signal is the start codon32. However, it remains unknown how long mt-mRNAs take before they associate with the mitoribosome. To address this question, we performed mito-TL-seq on RNA associated with ribosomes. To enrich for mitoribosomes, we performed sucrose gradient fractionation (Fig S3A). Because mtDNA and associated nascent RNA co-purify with the mitoribosome in sucrose gradient fractions33, we performed DNase treatment to remove any contaminating mtDNA and immunoprecipitated the mitoribosome from the fractions containing mitochondrial monosomes and polysomes, an approach that faithfully reflects the full mitochondrial translatome5 (Fig 3A, Fig S3B). These additional purification steps resulted in strong enrichment (>20-fold) of mt-mRNA and mt-rRNA (Fig 3B) relative to the extraction of RNA directly from polysomes (compare Fig. 3B with S3C and D). Due to the challenges of analyzing light-strand RNAs, described above, ND6 was not included in these analyses.
Figure 3.
Mitoribosome-association kinetics differ between transcripts. A) Cell lysates from 4sU-labeled cells were fractionated using sucrose gradients, and the mitoribosomes were immunoprecipitated (IP) out of the gradient. RNA was extracted from the IPs and input (whole-cell lysate) and sequenced by mito-TL-seq. B) Scatter plot showing the reads per kilobase per million reads (RPKM) in libraries from the mitoribosome-IP experiment and the input samples for the same experiment. Mt-ribosomal mRNA and mt-rRNA are enriched in the IP and shown in red. C) Model used to calculate ribosome association rates. kdeg = degradation rate and kr= transfer rate from the unbound to the ribosome-bound state. RNA can be either unbound (free) or bound by ribosomes. D) Three example profiles from one replicate experiment performed using 100 μM 4sU. Black squares show the fraction new RNA in the mitoribosome-IP and black circles show the fraction new RNA in the input (Total). Dotted purple and yellow lines show the best fit for the total and mitoribosome-IP, respectively, deploying the model in C). Ribosome-association half-lives (half-life unbound, the time it takes for half of the transcripts to bind to the ribosome), in minutes, are estimated by the model for each gene. E) Heatmap showing the rank of mt-RNAs in each replicate based on their ribosome-association half-lives. All the data in the figure are from HeLa cells.
To determine ribosome association rates, we developed a mathematical model (Fig 3C) that assumed equal mRNA degradation rates both for the unbound (free) and ribosome-bound transcripts. An Akaike information criterion (AIC) test consistently preferred this simpler model to a slightly more advanced model with independent degradation rates for the free and ribosome-bound state. Regardless of which model we applied, the estimated ribosome-association rates were very similar (Fig S3E). RNR2 was estimated to take >200 minutes to reach the monosome, consistent with estimates of mitoribosome assembly of 2–3 hours (Fig 3D, E, Table S5)34. For mt-mRNA, the association times varied, with half-lives ranging from 13 minutes for COX1 to 3 minutes for COX3. The variability in mitoribosome association rates for mt-mRNA was surprising, because no physical barrier such as the nuclear envelope separates nascent RNA and ribosomes in mitochondria, and mt-mRNAs have minimal or absent 5’ UTRs. Nonetheless, three repetitions of these measurements in HeLa cells, twice using 100 μM 4sU and once with 500 μM 4sU to achieve higher signal-to-noise in the detection of T-to-C conversions (Fig 3E), yielded consistent results (Fig S3F). Measurements in HEK293T cells produced results similar to those obtained in HeLa cells (Fig S3G). These trends were only weakly related to the size of the 5’ UTR, most of which are 0 or 1 nt, or the start codon (Fig S3H). Overall, our measurements showed that ribosome association kinetics differed by a factor of ~4 across mt-mRNAs.
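The simpler of the two models can be written out explicitly. In the sketch below, RNA is synthesized in the free state, moves to the ribosome-bound state at rate k_r, and is degraded at the same rate k_deg in both states. Solving the resulting linear differential equations gives the fraction of new RNA in the total pool, 1 − exp(−k_deg·t), and in the bound pool, 1 − exp(−k_deg·t) − (k_deg/k_r)·exp(−k_deg·t)·(1 − exp(−k_r·t)). The grid-search fit and all numerical values are assumptions made for illustration; they do not reproduce the fitting procedure of the study.

```python
import math

LN2 = math.log(2.0)


def fraction_new_total(t, k_deg):
    """Fraction of new RNA in the whole-cell (input) pool at time t."""
    return 1.0 - math.exp(-k_deg * t)


def fraction_new_bound(t, k_deg, k_r):
    """Fraction of new RNA in the ribosome-bound pool at time t."""
    decay = math.exp(-k_deg * t)
    return 1.0 - decay - (k_deg / k_r) * decay * (1.0 - math.exp(-k_r * t))


def fit_association_half_life(times, observed_bound, k_deg):
    """Grid search (0.5 to 60 min) for the association half-life."""
    candidates = [0.5 * i for i in range(1, 121)]

    def sum_squared_error(t_half):
        k_r = LN2 / t_half
        return sum(
            (fraction_new_bound(t, k_deg, k_r) - obs) ** 2
            for t, obs in zip(times, observed_bound)
        )

    return min(candidates, key=sum_squared_error)


# Hypothetical transcript: 45-minute mRNA half-life, 5-minute
# ribosome-association half-life, sampled at five labeling times.
k_deg = LN2 / 45.0
times = [7.5, 15, 30, 60, 120]
observed = [fraction_new_bound(t, k_deg, LN2 / 5.0) for t in times]
best_half_life = fit_association_half_life(times, observed, k_deg)
```

The bound pool always lags behind the total pool, and the size of that lag is what carries the information about k_r. A transcript that reaches the ribosome quickly shows nearly identical labeling curves in the IP and the input.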
mt-mRNA 5’ but not 3’ processing is a prerequisite for mitoribosome association
Because no physical barriers separate transcription and translation in mitochondria, we sought to determine why ribosome association rates varied so widely. We reasoned that RNA processing, the immediately preceding step, might be rate-limiting for ribosome association. We found that ribosome association rates (median time until half of mRNA associated with ribosome: 5 min) were generally slower than processing rates (median 5’ and 3’ processing half-life: 100 s), with three exceptions (Figure 4A, B). For both ND1 and COX1, 5’-end processing took the same amount of time as ribosome association, and for the tricistronic ATP8/6-COX3 transcript, processing was slower than ribosome association (Fig 4A and B). These observations suggested that for ND1 and COX1, 5’-end processing might be a prerequisite for ribosome association. To test this, we analyzed mitoribosome profiling data to determine whether ribosome-protected fragments (RPF) contain uncleaved mt-mRNA (Fig 4C–F)5. We found several transcripts with unprocessed 3’-ends in the ribosomes (Fig 4D, F and Fig S4C, Table S6), consistent with the observation that mRNAs lacking stop codons are nevertheless commonly translated by mitoribosomes35. However, only one transcript, COX3, had unprocessed 5’ ends in the ribosome (Fig 4C, E, and S4B), in stark contrast to the nanopore direct RNA-seq, in which we found a large fraction of unprocessed transcripts also for ND1 and COX1 (Fig 2). Accordingly, the uncleaved 3’-end of ATP8/6 was by far the most common form of unprocessed 3’-end in the RPF data, consistent with the preference for COX3 translation initiation to occur on the tricistronic transcript5. We interrogated ribosome profiling data and RNA-seq data from primary human fibroblasts and found consistent results (Fig S4D–E).
Figure 4.
5’ RNA-processing might be a prerequisite for mitoribosome association. A) Scatter plot correlating 5’-end processing and mitoribosome-association half-lives. Error bars show standard deviations for three replicates. B) Same as A) but showing the 3’-end processing half-lives. C) Subsampling of 100 ribosome-protected fragments (RPF) with 5’ ends within −30 and +3 nt from the 5’ end of COX3. The scale bar shows the distance in nucleotides, and the color code refers to the processing status of each read. D) Same as C) but for ND1. E) Quantification of all ribosome-protected fragments that align to the 5’ ends of mitochondrial genes. The plot shows the fraction of the unprocessed reads (i.e. orange in D) and C)) over all reads aligning to the 5’ end of the gene (i.e. blue and orange reads in D) and C)). Error bars show ranges of two replicates. F) Same as in E) but for 3’ ends. G) Cartoon showing that COX3 is processed slowly but the processing status does not affect ribosome association. H) Cartoon illustrating that ND1 and COX1 mRNAs are both processed slowly at the 5’-end and that only processed transcripts are associated with the ribosome. All data in the figure are from HeLa cells.
In general, mt-mRNA 5’-end processing was fast; a few sites were cleaved more slowly, and among transcripts with unprocessed 5’ ends, only COX3 was detected in translating mitoribosomes (Fig 4G). Thus, for ND1 and COX1, 5’ processing is a likely prerequisite for translation (Fig 4H). These results indicate that translation in mitochondria, in contrast to many bacteria36, is unlikely to be co-transcriptional, even though transcription and translation occur in the same compartment.
Ribosome association is independent of mRNA turnover and fine-tunes mitochondrial translational output
Due to their compartment-specific transcription, nuclear export, and translation kinetics, nuclear-encoded OXPHOS transcripts spend a long time unbound by ribosomes14. Comparing mitochondrial and cytosolic ribosome association rates, we found that nuclear-encoded transcripts spent >10x more time in a non-ribosome-bound state than mitochondrial-encoded transcripts (median time: nuclear-encoded 113 min, mitochondrial-encoded 5 min; Fig 5A). Furthermore, mitochondrial RNAs were unbound by ribosomes for only one-third of the time that nuclear-encoded RNAs spent between nuclear export and ribosome association (i.e., median residence time in the cytoplasm when not bound to ribosomes: 13 min, Fig 5A), illustrating how rapid the process is in mitochondria14. Using the ribosome association rates, we calculated the fraction of each mRNA expected to be associated with the ribosome. We found that 90% of mt-mRNA should be ribosome-bound on average (Fig 5B), consistent with biochemical measurements37 and far higher than for nuclear-encoded transcripts (Fig 5B, ~50%).
Figure 5.
Ribosome association fine-tunes mitochondrial protein synthesis but has little predictive power relative to RNA turnover and translation efficiency. A) Distribution of ribosome association half-lives for mitochondrial-encoded mRNA or nuclear-encoded OXPHOS subunits, measured either from nuclear export (shorter half-lives) or from synthesis (longer). Mito data are from this study and HeLa cells, whereas nuclear-encoded data are from14 and K562 cells. B) Distribution of the fraction of each mRNA that is bound by ribosomes. Mitochondrial mRNAs are on average more frequently associated with the mitoribosome than nuclear-encoded transcripts with the cytosolic ribosomes. C) Correlation between either relative mRNA abundance at steady state (black dots) or mRNA abundance in the ribosome (green dots) and relative protein synthesis as measured by ribosome profiling (ribosome protected fragments, RPF)5. In most cases, the green dots are closer to the midline than the black dots, showing that ribosome association brings the total mRNA levels closer to relative protein synthesis levels. D) Quantitative gene expression model using the measured kinetic rates from this study. RPF: ribosome profiling, TE: translation efficiency, ktranscription = transcription rate, kdeg = turnover rate. E) Proportion of protein synthesis explained by different steps of gene expression in HeLa cells. RNA turnover and translation efficiency explain the vast majority of protein synthesis as determined by ribosome profiling, whereas ribosome association and transcription play minor roles. All data in the figure are from HeLa cells if nothing else is stated.
Nevertheless, we wondered whether mitoribosome association might act as an independent regulatory point for mitochondrial OXPHOS subunit production. Interestingly, ribosome association rates were not correlated with mRNA turnover (Fig S5A). Considering the high level of ribosome-bound mt-mRNA, we next asked whether the steady-state abundance of mRNA or the fraction of mitoribosome-associated transcript (ranging between 73% and 98%) could predict levels of protein synthesis as measured by ribosome profiling (Fig 5C). Using the fraction of mRNA that was bound by ribosomes modestly improved the correlation with the RPF data relative to using total RNA levels (from R2 = 0.63 to R2=0.65), suggesting that ribosome association rates fine-tune gene expression in mitochondria.
To test the overall importance of ribosome-association kinetics in determining protein synthesis, we developed a quantitative model that incorporated our measured RNA turnover rates (Fig 1D), transcription initiation and elongation rates (Fig 1H), and ribosome-association rates to predict protein synthesis levels calculated from ribosome profiling data (Fig 5D). The remaining variability we attributed to translation efficiency (TE = Ribosome profiling/RNA-abundance). Applying this model, we found that RNA turnover had the largest impact, predicting 59.4% of the variability in protein synthesis levels, followed by translational efficiency, which explained 40.1% (Fig 5E). Transcription initiation and elongation did not contribute to the variation, and ribosome association increased the predictive power only slightly, by 0.5%.
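As a small illustration of the quantities entering this model, the following Python sketch computes the translation efficiency defined above (TE = ribosome profiling / RNA abundance) and a steady-state RNA level from a production and a turnover rate. The steady-state relation (abundance = ktranscription / kdeg) is the standard first-order assumption and is used here only for illustration; the gene names and numbers are hypothetical placeholders, not measurements from this study.

```python
def steady_state_rna(k_transcription, k_deg):
    """Steady-state RNA level under first-order turnover (assumed relation)."""
    return k_transcription / k_deg


def translation_efficiency(rpf, rna):
    """TE per gene = ribosome profiling signal / RNA abundance."""
    return {gene: rpf[gene] / rna[gene] for gene in rpf if rna.get(gene, 0) > 0}


# Hypothetical relative values, for illustration only.
rpf = {"geneA": 8.0, "geneB": 3.0}
rna = {"geneA": 4.0, "geneB": 6.0}
te = translation_efficiency(rpf, rna)
abundance = steady_state_rna(k_transcription=10.0, k_deg=0.5)
```

In this toy case, geneA has a higher TE than geneB even though its RNA level is lower, which is the kind of residual variation the model assigns to translation efficiency.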
A minimal model of gene expression predicts nuclear- and mitochondrial-encoded OXPHOS expression variation across cell types
Having identified multiple key regulatory points of mitochondrial gene regulation, we simplified our quantitative model and compared the impact of different gene expression levels in setting the protein synthesis output for both mitochondrial- and nuclear-encoded OXPHOS genes (Fig 6A). The model parameters were estimated independently for the mitochondrial and nuclear genes, using only mRNA turnover, transcription, and translation efficiencies from HeLa cells (Fig 6A). Because the model was fitted to explain 100% of the variability in the ribosome profiling data for a single biological replicate, we applied the model with the estimated parameters to predict RPF in a replicate dataset (Fig 6B); the coefficient of determination was >0.94. For nuclear genes, variability in the transcription rate, and not mRNA turnover, largely predicted variation in expression (Figure 6B), whereas the reverse was true for mitochondrial genes: mRNA turnover, rather than transcription, was the predominant determinant (Fig 6B), consistent with our findings (Figs 1, 5). Translation efficiencies were less predictive for nuclear-encoded OXPHOS genes than for mitochondrial-encoded genes, explaining ~25% and ~47% of the variation, respectively.
Figure 6.
Mitochondrial gene expression is largely explained by RNA turnover and translation efficiencies even in situations where RNA levels are perturbed. A) Minimal model of mitochondrial gene expression used in B and C. B) The ability of the nuclear and mitochondrial models to predict variability in ribosome profiling data for nuclear or mitochondrial-encoded OXPHOS genes, respectively. The “HeLa Model” by definition explains 100% of the variability, whereas the “HeLa Replicate” deploys the same estimated parameters to predict a replicate HeLa RPF dataset. The same model parameters are also used to predict RPF in K562 and HEK293T cells. Variability in nuclear gene expression is mostly explained by transcription levels whereas variability in mitochondrial gene expression is mostly explained by RNA degradation. TE was more predictive for mitochondrial genes than for nuclear-encoded genes. C) Same estimated HeLa parameters as in B), but predicting RPF in cells depleted of LRPPRC (LRPPRCKO). Estimating the RNA degradation parameter (kdeg) using LRPPRCKO mRNA turnover data, instead of the HeLa kdeg, increased the predictive power.
To better understand the role of translation regulation, we compared published translation efficiencies (TEs) across five cell types, including human primary fibroblasts and myoblasts. The TEs were well correlated (Fig S6A), suggesting that translation regulation could be fairly fixed in human mitochondria. To test this idea further, we used our minimal model of mitochondrial protein synthesis with all parameters, i.e. transcription, mRNA degradation, and TE, estimated from our HeLa cell fits. HeLa mRNA turnover rates alone could predict ~46% of the variability in mitochondrial RPF in K562 and HEK293T cells (Fig 6B). However, when we applied the full model, including the HeLa TE data, the predictive power increased to 74% of protein synthesis in K562 and 85% in HEK293T cells (Fig 6B), consistent with the comparable translation efficiencies in these cell lines. The mitochondrial model slightly outperformed the nuclear model, which explained 68% and 78% of RPF variability for K562 and HEK293T cells, respectively. The underperformance of the nuclear gene expression model was due to the lower predictive power of the estimated TE parameter (~20% of RPF variability) relative to the mitochondrial model (~30%). Together, these models reveal that translation regulation is surprisingly robust and hardwired in mitochondria.
As a final test of our model, we analyzed its performance in predicting mitochondrial protein synthesis in cells lacking the ribosome-associated protein LRPPRC. Loss of LRPPRC leads to disrupted polyadenylation and destabilized mt-mRNA15,22,23. In our analysis of HEK293T cells lacking LRPPRC (LRPPRCKO), our quantitative HeLa-based model, which performed well for other cell lines, had little predictive power (Fig 6C). However, if we substituted LRPPRCKO degradation rates for HeLa mt-mRNA degradation rates, the model was again able to explain half of the variability in protein synthesis (Fig 6C), similar to what we observed in other cell lines (Fig 6B). Interestingly, the model using the HeLa TE parameters and the LRPPRCKO mRNA degradation rates increased the predictive power to 73%. These results indicate that translation regulation is fixed even in cells with vastly different RNA abundance and that LRPPRC predominantly acts on RNA stability. Although we found differences in LRPPRCKO TEs relative to wild-type cells (Fig 6C), they had a relatively small impact on the predictive power of our model.
Discussion
In this study, we developed a quantitative kinetic framework to analyze the human mitochondrial transcriptome, revealing many insights into mitochondrial gene regulation. Our measurements revealed that RNA processing, ribosome association, and turnover rates differed starkly from those of nuclear-encoded OXPHOS genes, highlighting the unique features of mitochondrial gene expression. The production rates of mt-mRNA were the same as those of mt-rRNA, likely the most abundant transcript in mitochondria. These observations are explained by our heavy strand polycistronic transcription model that includes a single transcription start site10 and rate. Our data reveal that RNA processing is very rapid relative to transcription elongation rates in human mitochondria, indicating that it occurs co-transcriptionally. Consistent with this, TEFM, the mitochondrial transcription elongation factor, is associated with RNA cleavage enzymes, and RNA processing is impaired in the absence of TEFM38.
In bacteria, transcription factors directly interact with the ribosome, creating an RNAP-ribosome expressome that co-transcriptionally translates the nascent mRNA39. In eukaryotes, by contrast, the physical separation between nuclear transcription and cytosolic translation prevents transcription–translation coupling. We found that in mitochondria, ribosome association is slower than processing, and is most likely initiated post-transcriptionally. Although an earlier model proposed that mt-mRNA first interacts with the small mitoribosome subunit40, recent work showed that mt-mRNAs interact directly with the monosome32 and that binding of the initiator tRNA stabilizes the mRNA–ribosome complex. Our ribosome-association analysis indicated that 5’ but not 3’ processing is a prerequisite for monosome association, linking 5’ processing status to translation. Consistent with this, when COX1 is poorly processed at the 5’ end, it is also poorly translated41. In terms of the mechanism, we propose that the uncleaved tRNA (or antisense tRNA) upstream of the mt-mRNA blocks the ribosome from binding prematurely to an mt-mRNA by blocking the start codon from entering the ribosome. In addition, mt-mRNAs that do not directly begin with a start codon (i.e. retain a short leader sequence) or use alternate start codons tended to be slower to associate with the mitoribosome (Fig S3H).
We observed that transcript abundance differed ~140-fold between the mitochondrial- and nuclear-encoded OXPHOS mRNA (Fig 1), and mitochondrial transcripts were predominantly associated with ribosomes (Fig 5). If all mitochondrial ribosomes that associate with mRNA are translationally active, mitochondrial ribosomes should have a translation rate constant (number of proteins synthesized per mRNA per time unit) ~100-fold smaller than cytosolic ribosomes, assuming stoichiometric synthesis. Alternatively, mitochondrial-encoded proteins could be synthesized at the same rate and then turned over much faster than their nuclear-encoded counterparts. However, because mitochondrial-encoded proteins are stable with half-lives much longer than the cell doubling time42, a low translation rate is the likeliest explanation. In HeLa cells, the translational output of the cytosolic ribosome has been estimated to be 1400 proteins/mRNA per hour, so the equivalent number for mitochondria would be 14 proteins/mRNA per hour. Considering that some mt-mRNAs exist for less than an hour, many mt-mRNAs would be translated only a few times.
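The arithmetic behind this estimate can be written out as a short Python sketch. The cytosolic rate (1400 proteins/mRNA per hour) and the ~100-fold difference are taken from the text above; the example mRNA lifetime is a hypothetical value chosen only to illustrate the final point.

```python
# Values stated in the text.
cytosolic_rate_per_h = 1400.0  # proteins per mRNA per hour (cytosolic ribosome)
fold_lower = 100.0             # approximate fold difference in rate constant

mito_rate_per_h = cytosolic_rate_per_h / fold_lower


def proteins_per_mrna(rate_per_h, lifetime_h):
    """Expected number of proteins made from one mRNA over its lifetime."""
    return rate_per_h * lifetime_h


# Hypothetical lifetime of half an hour, for illustration only.
example_lifetime_h = 0.5
example_yield = proteins_per_mrna(mito_rate_per_h, example_lifetime_h)
```

Under these assumptions, an mt-mRNA that persists for half an hour would be translated only a handful of times.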
Using our kinetic measurements over the mt-mRNA lifetime, we estimated parameters for a minimal gene expression model for mitochondria. Our findings indicate that RNA turnover is the prominent regulatory expression module, which alone explains up to two-thirds of the variability in mitochondrial protein synthesis. This is in stark contrast with nuclear-encoded OXPHOS subunits, which show variation due to their RNA production rates instead of mRNA turnover. Mitochondrial gene expression is then fine-tuned by ribosome association times and translation regulation. However, questions remain regarding the mechanisms that determine mt-RNA half-lives and the extent to which mt-RNA turnover is regulated between physiological conditions. mt-mRNA levels are affected by the depletion of hundreds of factors and also vary across cell lines, indicating that these mechanisms are likely to be complex, multifaceted, and important for cell type–specific metabolism. Nonetheless, our analysis indicates that LRPPRC plays a predominant role (Figs 1K, S1O, and S1P)6,19,43.
Model parameters estimated in HeLa cells translated well to HEK293T and K562 cells, consistent with the comparable rates of transcription and mRNA turnover in these cell types. This suggests that mt-mRNA may be largely hard-coded for translation efficiency, resolving the apparent contradiction between the outsized importance of translation regulation in mitochondria and the lack of traditional regulatory features of nuclear-encoded or bacterial mRNA. In LRPPRC KO cells, RNA turnover remained a strong predictor of translational output, and the HeLa TE parameter robustly increased the predictive power of the model. Thus, our modeling indicates that LRPPRC acts predominantly via its role in regulating RNA stability, illustrating how our approach can aid in identifying the core impact of regulatory factors. Applications of our models to other key regulatory factors, many of which play a role in disease8, would illuminate the underlying mechanistic regulation of distinct mitochondrial gene expression modules. More broadly, our quantitative and kinetic framework for studying mitochondrial gene expression provides a critical foundation for gaining a deeper understanding of the mitochondrial and nuclear co-regulation driving OXPHOS biogenesis.
Methods
Cell culture
HeLa cells were grown in DMEM containing glucose and pyruvate (ThermoFisher, 11995073), supplemented with 10% FBS (Thermo Fisher, CAT A3160402). LRPPRC KO and matched WT HEK293T cells from5 were cultured in DMEM supplemented with 20 μg/mL of uridine (Sigma, CAT U3750), 3 mM sodium formate (Sigma, CAT 247596), 10% FBS, 1 mM GlutaMAX (Thermo Fisher Scientific, CAT 35050061) and 1 mM sodium pyruvate (Thermo Fisher Scientific, CAT 11360070). Human myoblasts from anonymous healthy donors were kindly provided by Dr. Brendan Battersby (Institute of Biotechnology, University of Helsinki). Myoblasts were grown in Human Skeletal Muscle Cell Media with the provided growth supplement (HSkMC Growth Medium Kit, Cell Applications, 151K-500). For differentiation into myotubes, the media was replaced with DMEM (ThermoFisher, 11995073) with 2% heat-inactivated horse serum and 0.4 μg/mL dexamethasone. The differentiation media was replaced every two days.
4sU labeling
HeLa or HEK293T cells were grown to 70–80% confluency and then pulsed with 4-thiouridine (4sU, Sigma T4509) at a final concentration of 50, 100, or 500 μM in pre-conditioned cell culture media under standard cell culture conditions. For mitoribosome-IP experiments, one 15 cm plate of cells was used per time point, whereas for all other experiments, one 10 cm plate was used per time point.
Cell lysis and RNA extraction for standard time course experiments
Labeled cells were washed in ice-cold PBS (ThermoFisher) before being directly scraped in Trizol LS (ThermoFisher, 10296010). RNA extraction was performed using Trizol LS according to the manufacturer’s protocol, except that DTT (Sigma-Aldrich) was added to the isopropanol at a final concentration of 0.2 mM.
Undifferentiated myoblasts (day 0) and differentiated myotubes (day 7) were collected and lysed in 1 mL of Qiazol (Qiagen). RNA from undifferentiated and differentiated myoblasts was extracted using the Qiagen miRNeasy kit (Qiagen) according to the manufacturer’s protocol.
mtPolysome fractionation, mitoribosome-IP, and RNA extraction from mitoribosome-IP samples
After labeling, cells were quickly washed in ice-cold PBS before being scraped in mt-polysome lysis buffer (0.25% lauryl maltoside, 10 mM Tris pH 7.5, 0.5 mM DTT, 20 mM magnesium chloride, 50 mM ammonium chloride, 1× EDTA-free protease inhibitor cocktail (Roche)). Cell lysates were dounced 7 times in a 1 mL dounce with the tight piston and then flash frozen. 1 mL of thawed lysate was clarified by spinning twice at 10,000 rcf. To avoid contamination of the higher molecular weight sucrose gradient fractions with nascent RNA still attached to mtDNA, we digested all DNA with 150 units of DNase I (NEB) in the presence of SuperaseIn (ThermoFisher) and 0.5 mM calcium chloride at room temperature for 1 h. To isolate mitoribosomes, lysates were loaded on 10–50% linear sucrose gradients and centrifuged in a Beckman ultracentrifuge at 40,000 RPM for 3 h at 4°C using a SW41Ti rotor. Some of the input for the sucrose gradient fractionation was kept and treated as a standard sample as described above. Gradients were mixed and fractionated using a BioComp instrument. To identify the mitoribosome-containing fractions, we western blotted for MRPL12 (Proteintech 14795-1-AP) and MRPS18B (Proteintech 16139-1-AP) as described below. Monosome- and polysome-containing fractions were pooled and the mitoribosomes were immunoprecipitated out of the pooled fractions. For the immunoprecipitation, MRPL12 antibodies were conjugated to Protein A Dynabeads (ThermoFisher) for 1 h in mt-polysome lysis buffer. After washing the beads, lysates were added and incubated for 3 h at 4°C. After 3 h, the supernatant was removed and the beads were washed three times in mt-polysome lysis buffer before the mitoribosomes were eluted in 0.2% SDS, 100 mM NaCl, 10 mM Tris pH 7.5, 1× EDTA-free protease inhibitor cocktail, and SuperaseIn. IP efficiency was confirmed by western blotting as described below.
RNA was extracted from the eluates using Trizol LS as described above and then further cleaned by using 1.8x volume of RNAClean XP beads (Beckman Coulter A63987) and washed in 80% ethanol.
Western blotting
Samples were mixed with 4x LDS sample buffer (ThermoFisher NP0007) and 0.1 M DTT. Samples were loaded onto a 4–12% Bis-Tris gel (Invitrogen NP0321BOX) in 1x MOPS buffer and run at 160 V for 1 hour. The gel was transferred to a nitrocellulose membrane using the wet transfer method in 1x transfer buffer (25 mM Tris base, 192 mM glycine, 20% methanol) at 400 mA for 75 minutes at 4°C. The membrane was blocked in blocking buffer (5% non-fat milk powder in 1x Tris-buffered saline with 0.1% Tween 20, TBST) for at least 60 minutes. Primary antibodies were diluted in blocking buffer and incubated with membranes overnight at 4°C. Membranes were washed 3 × 20 minutes with 1x TBST, incubated for 1 hour at 25°C with secondary HRP-conjugated antibodies against rabbit IgG (Cell Signaling Technology, 7074S), washed again 3 × 20 minutes with 1x TBST, and developed using ECL Western Blotting Detection Reagents (Cytiva) and Amersham Hyperfilm ECL (Cytiva) using an automated M35 X-OMAT Processor developer (Kodak).
TimeLapse-sequencing
Extracted 4sU-labeled RNA from HeLa or HEK293T cells was treated with TimeLapse chemistry as described in13,14. In short, 2 μg of RNA was treated with 0.1 M sodium acetate pH 5.2, 4 mM EDTA, 5.2% 2,2,2-trifluoroethylamine, and 10 mM sodium periodate at 45°C for 1 hour. RNA was then cleaned using an equal volume of RNAClean XP beads (Beckman Coulter A63987) and washed in 80% ethanol. The cleaned RNA was reduced in 0.1 M DTT, 0.1 M Tris pH 7.5, 1 M NaCl, and 0.01 M EDTA for 30 min at 37°C before once again being cleaned with RNAClean XP beads. Sequencing libraries from treated RNA were created using the SMARTer Stranded Total RNA HI Mammalian kit (Takara 634873) following the manufacturer’s instructions. Libraries were sequenced on the NextSeq (Illumina, San Diego, CA) by the Biopolymers Facility at Harvard Medical School.
For the mitoribosome-IP samples, the rRNA depletion step was omitted for the immunoprecipitated RNA, as cytosolic rRNA had already been depleted by the IP step. In addition, the SMARTer RNA unique dual index kits (Takara 634452) were used instead of the primers that came with the SMARTer Stranded Total RNA HI Mammalian kit to allow for sequencing on the NovaSeq (Illumina, San Diego, CA) by the Biopolymers Facility at Harvard Medical School.
Creation of SNP-masked cell-line-specific reference genomes
To create cell-line-specific genomic single nucleotide polymorphism-corrected reference sequences, we used reads from samples without 4sU treatment. Reads were first aligned to the reference hg38 genome with STAR using parameters --outFilterMultimapNmax 100 --outFilterMismatchNoverLmax 0.09 --outFilterMismatchNmax 15 --outFilterMatchNminOverLread 0.66 --outFilterScoreMinOverLread 0.66 --outFilterMultimapScoreRange 0 --outFilterMismatchNoverReadLmax 1. Variants were then called with BCFtools (Li 2011) ‘mpileup’ and ‘call’ using two bam files as input. The resulting variant call file (VCF) was then split into a file with INDEL records only and a file without INDEL records (substitutions only). The “no INDEL” VCF was further split by frequency of substitution: loci covered by >= 5 reads and with a variant frequency >75% to a single alternate base were assigned the alternate base; loci with variants with an ambiguous alternate base were masked by “N” assignment. The reference FASTA was modified for these non-INDEL substitutions using GATK (McKenna et al 2010) FastaAlternateReferenceMaker. Finally, rf2m (https://github.com/LaboratorioBioinformatica/rf2m) was used with the INDEL-only VCF file to further modify the FASTA genome reference as well as the corresponding GTF annotation file.
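The substitution rule described above can be sketched in Python as follows. This is a minimal illustration, not the pipeline used in the study: the function name and input format (a dictionary of per-base read counts at one locus) are hypothetical, and the definition of an "ambiguous" locus (non-reference bases together exceed the threshold but no single alternate base does) is an assumption made for this sketch.

```python
def call_reference_base(ref_base, base_counts, min_depth=5, min_alt_freq=0.75):
    """Return the base to write into the cell-line-specific reference.

    base_counts maps each observed base to its read count at one locus.
    """
    depth = sum(base_counts.values())
    alt_counts = {b: c for b, c in base_counts.items() if b != ref_base and c > 0}
    if depth < min_depth or not alt_counts:
        return ref_base  # too little coverage or no variant: keep reference
    top_alt, top_count = max(alt_counts.items(), key=lambda kv: kv[1])
    if top_count / depth > min_alt_freq:
        return top_alt  # one dominant alternate base: substitute it
    if sum(alt_counts.values()) / depth > min_alt_freq:
        return "N"  # assumed meaning of "ambiguous": variant, no dominant base
    return ref_base


clear_snp = call_reference_base("T", {"C": 9, "T": 1})
ambiguous = call_reference_base("T", {"C": 5, "G": 4, "T": 1})
low_depth = call_reference_base("T", {"C": 3})
```

Masking ambiguous loci with "N" keeps genuine genomic variants from being miscounted as 4sU-induced T to C conversions downstream.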
Read Alignments
First, fastq files shared by the sequencing facility were concatenated to combine data across the sequencer lanes. The adapter sequences were trimmed from both of the paired-end reads using Cutadapt v2.5 (Martin. EMBnet, 2011). Three more nucleotides were trimmed from the 5 prime end of the fragment to remove template switch bases, and five more from the 3 prime end to remove mismatches associated with random hexamer priming that result from the SMARTer Stranded library preparation. Read alignment was done using STAR 2.7.3 to cell-line-specific genomic reference sequences (see above) with different parameters for experiments with high and low 4sU treatment. For 500 μM 4sU experiments, parameters were --outFilterMismatchNmax 90 --outFilterMismatchNoverLmax 0.3 --outFilterMatchNminOverLread 0.2 --outFilterScoreMinOverLread 0.2. For 50 μM 4sU experiments, parameters were --outFilterMismatchNmax 10 --outFilterMismatchNoverLmax 0.05 --outFilterMatchNminOverLread 0.66 --outFilterScoreMinOverLread 0.66 --outFilterMultimapScoreRange 0.
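For readers who want to reproduce the two alignment regimes, the parameter sets listed above can be kept side by side in a small Python helper that assembles the STAR argument list. The filter values are copied from the text; the helper name, regime labels, and file paths are hypothetical, and only the standard STAR options --genomeDir and --readFilesIn are added.

```python
# Mismatch-tolerant settings for 500 uM 4sU and stricter ones for 50 uM 4sU.
STAR_FILTERS = {
    "high_4sU": [
        "--outFilterMismatchNmax", "90",
        "--outFilterMismatchNoverLmax", "0.3",
        "--outFilterMatchNminOverLread", "0.2",
        "--outFilterScoreMinOverLread", "0.2",
    ],
    "low_4sU": [
        "--outFilterMismatchNmax", "10",
        "--outFilterMismatchNoverLmax", "0.05",
        "--outFilterMatchNminOverLread", "0.66",
        "--outFilterScoreMinOverLread", "0.66",
        "--outFilterMultimapScoreRange", "0",
    ],
}


def build_star_command(genome_dir, read1, read2, regime):
    """Assemble a STAR argument list for one paired-end sample."""
    base = ["STAR", "--genomeDir", genome_dir, "--readFilesIn", read1, read2]
    return base + STAR_FILTERS[regime]


# Hypothetical paths, for illustration only.
cmd = build_star_command("ref/hela_masked", "s1_R1.fastq", "s1_R2.fastq", "high_4sU")
```

The resulting list can be passed to subprocess.run; the looser mismatch limits in the high-4sU regime allow heavily converted reads to align.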
T to C mismatch counting
For each sample, a subset of mitochondrial-aligned reads was used to determine the rate of T to C mismatches. Unless otherwise specified, this subset included all mitochondrial reads except for those aligning to MT-rRNA. For this analysis, reads were first filtered to remove non-primary alignments, reads with unmapped mates, and reads that map to more than 4 positions, and soft-clipped bases were removed. We then determined the number of T to C mismatches and the total number of T nts across each fragment (considering both read1 and read2 in each pair) using custom scripts from14. Two-dimensional distributions were generated containing the number of fragments with n Ts and k T to C mismatches.
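The counting step described above can be illustrated with a short Python sketch. It is not the custom script from14: it assumes each mate is supplied as a pair of equal-length strings (reference bases and read bases over a gapless aligned segment, already oriented to the transcript strand) and that the two mates do not overlap. All names are hypothetical.

```python
from collections import Counter


def count_t_and_tc(ref_seq, read_seq):
    """Count reference Ts covered by a read and how many are read as C."""
    n_t = 0
    k_tc = 0
    for ref_base, read_base in zip(ref_seq, read_seq):
        if ref_base == "T":
            n_t += 1
            if read_base == "C":
                k_tc += 1
    return n_t, k_tc


def tc_distribution(fragments):
    """Build the two-dimensional (n Ts, k T-to-C mismatches) fragment counts.

    Each fragment is a list of (ref_seq, read_seq) pairs, one per mate.
    """
    dist = Counter()
    for mates in fragments:
        n = 0
        k = 0
        for ref_seq, read_seq in mates:
            dn, dk = count_t_and_tc(ref_seq, read_seq)
            n += dn
            k += dk
        dist[(n, k)] += 1
    return dist


fragments = [
    [("ATTG", "ATCG"), ("TTA", "TTA")],  # 4 Ts in total, one read as C
    [("ATTG", "ATTG")],                  # 2 Ts, no conversion
]
dist = tc_distribution(fragments)
```

The resulting counter, keyed by (n, k), is the kind of distribution a binomial mixture model can then be fitted to.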
Binomial mixture model to calculate T to C conversion rates
To estimate the 4sU-induced as well as background T to C conversion rates, we used a binomial mixture model we previously developed14 to handle samples with low 4sU concentrations. In short, the binomial mixture model first estimates two background T to C conversion rates ($p_{e1}$ and $p_{e2}$) and a global fraction parameter $\pi_e$, with Binom a binomial distribution, given the measured n Ts and k T to C mismatches in the unlabeled control sample. This gives the background T to C conversion distribution as follows: $P_{bg}(k \mid n) = \pi_e\,\mathrm{Binom}(k; n, p_{e1}) + (1-\pi_e)\,\mathrm{Binom}(k; n, p_{e2})$. The three parameters in this model were fitted to the above T to C distributions using least-squares fitting (Python v3.7.4, package lmfit v1.0.2, function minimize). We tested if our binomial mixture error model better fitted the T to C conversions of the untreated samples using the Akaike Information Criterion and adjusted our background model to include one or two background rates depending on the test outcome44. The 4sU sample T to C distributions were then modeled as the 4sU-induced T to C conversions, with rate $p_c$, plus the background population, once again with a global fraction parameter $\pi_c$: $P_{4sU}(k \mid n) = \pi_c\,\mathrm{Binom}(k; n, p_c) + (1-\pi_c)\,P_{bg}(k \mid n)$. We note that it is necessary to calculate the T to C conversion rate from nuclear- and mitochondrial-encoded transcripts separately, as the mitochondrial 4sU-incorporation rate is significantly lower than for nuclear-encoded genes. In experiments deploying high (500 μM) 4sU concentration with no uridine supplemented to the media, we had robust T to C conversion rates >1.5% and directly used GRAND-SLAM to estimate conversion rates. The T to C conversion rates in the mitoribosome IP and input samples as well as samples from HEK293T cells were calculated from all reads aligning to the mitochondrial genome except reads aligning to COX1 and RNR2.
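The structure of such a binomial mixture can be sketched in plain Python. This is an illustrative reimplementation under stated assumptions, not the authors' code: the parameter names (p_e1, p_e2, pi_e for the background; p_c, pi_c for the labeled population) are labels chosen for this sketch, and the objective shown is a negative log-likelihood rather than the lmfit least-squares objective mentioned in the text.

```python
import math


def binom_pmf(k, n, p):
    """Probability of k conversions among n Ts at per-T conversion rate p."""
    return math.comb(n, k) * p**k * (1.0 - p) ** (n - k)


def background_pmf(k, n, p_e1, p_e2, pi_e):
    """Two-rate background mixture for the unlabeled control."""
    return pi_e * binom_pmf(k, n, p_e1) + (1.0 - pi_e) * binom_pmf(k, n, p_e2)


def labeled_pmf(k, n, p_c, pi_c, p_e1, p_e2, pi_e):
    """4sU-induced conversions mixed with the background population."""
    bg = background_pmf(k, n, p_e1, p_e2, pi_e)
    return pi_c * binom_pmf(k, n, p_c) + (1.0 - pi_c) * bg


def neg_log_likelihood(dist, pmf, *params):
    """dist maps (n, k) to fragment counts; rates must lie strictly in (0, 1)."""
    return -sum(c * math.log(pmf(k, n, *params)) for (n, k), c in dist.items())


# Hypothetical control distribution: 100 fragments with 20 Ts each.
control = {(20, 0): 90, (20, 1): 10}
nll_good = neg_log_likelihood(control, background_pmf, 0.005, 0.005, 0.5)
nll_bad = neg_log_likelihood(control, background_pmf, 0.1, 0.1, 0.5)
```

In this toy control, a background rate near the observed mismatch frequency (10 mismatches over 2000 Ts) gives a lower negative log-likelihood than a rate of 10%, which is the comparison an optimizer would exploit.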
T to C conversion rate estimates for cases with very low 4sU incorporation
In two cases we ended up with T to C conversion rates lower than 0.6%: the earliest (15 min) time point for the HeLa mitoribosome IP experiments deploying 100 μM 4sU (Fig 3D, E) and all the time points in the HEK293T mitoribosome IP experiments (Fig S3F). In these two cases, our binomial method failed and we instead calculated the T to C conversion rate based on the assumption that the ribosomal RNA would be turned over consistently. Practically, we set the T to C conversion rate such that it resulted in the fraction of new RNR2 being the same as at the equivalent time point in the high confidence time course experiments in Fig 1. In the IP we set the T to C conversion rate so that the fraction of new RNR2 was lower than 0.5%, to allow for some noise. This method gave similar results to the mitoribosome IP experiment with higher 4sU levels where we did not use any corrections (Fig 3E).
GRAND-SLAM parameters
Alignment files (.bam) containing all mitochondrial-aligned reads were converted into a .cit file using GRAND-SLAM v2.0.5f18. The no-4sU (0 min) alignment file was copied and both copies (“no4sU” and “0m”) were included in the GRAND-SLAM analysis so that the “no4sU” sample would be used for the background T to C mismatch rate and the sample details for the “0m” sample would also be output. GRAND-SLAM was run twice: first for .cit file creation and background rate determination, and a second time after modifying the T to C mismatch rates according to the output of the binomial mixture model. Before the second run, the *.tsv and *ext.tsv files were removed from the output directory and the *rates.tsv file was modified to add in new rates for single_new and double_new. For both runs, the parameters used included -full -mode All -strandness Sense -snpConv 0.4 -trim5p 10 -trim3p 5.
Cell doubling correction of fraction new values
The fraction new RNA values for HeLa cells were corrected for dilution by cell doubling. The cell doubling time was estimated by growing HeLa cells under standard cell culture conditions to 70% confluency (0h). At 0h, 12h, and 24h, five replicates of 10cm plates were washed in PBS, and cells were collected by trypsinization. Cells were mixed with trypan blue and counted using an automated cell counter (Countess, Invitrogen). Cell doubling time was estimated to be 26.5 h by linear regression of the log2-fold change in cell number over time. The fraction new values were corrected as follows:
Kinetic modeling of mtRNA degradation
Based on the maximum a posteriori (MAP) values obtained from GRAND-SLAM for each time point of the 4sU pulse, we tested two simple models to estimate the degradation rates from the time t dependence of newly synthesized Total mtRNA T(t) 14,18,45.
Model 1: exponential (one-state) decay - RNA is produced with a transcription (production) rate $k_p$ and eventually degraded at the rate $k_{deg}$ (Fig S1F): $\frac{dT}{dt} = k_p - k_{deg}\,T$. To solve this ordinary differential equation analytically, as described previously14,18,45, we set all newly synthesized RNA levels to zero at t=0, i.e. before any 4sU pulse, as the boundary condition. Next, the integrating factor method was used to obtain the solution: $T(t) = \frac{k_p}{k_{deg}}\left(1 - e^{-k_{deg} t}\right)$. Since the observed quantities are fractions of new RNA rather than RNA levels, we derived the model fraction of newly synthesized RNA: $\Lambda(t) = T(t)/T_{ss}$, with $T_{ss}$ the steady state levels of all (newly synthesized and pre-existing unlabeled) RNA, which equals the steady state solution to the ODE: $T_{ss} = k_p/k_{deg}$. Both $T(t)$ and $T_{ss}$ are linear in $k_p$, so $\Lambda(t)$ no longer depends on $k_p$. So, we arrive at our solution:
$$\Lambda(t) = 1 - e^{-k_{deg} t}$$
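Model 2: non-exponential (two-state) decay - RNA first assumes state 1, upon which it can either undergo exponential decay at the rate $k_{d1}$ or it can transition to state 2 with rate $k_{12}$, from which it exponentially degrades with rate $k_{d2}$ (Fig S1F):
$$\frac{dT_1}{dt} = k_p - (k_{d1} + k_{12})\,T_1, \qquad \frac{dT_2}{dt} = k_{12}\,T_1 - k_{d2}\,T_2$$
Here, the total amount of mtRNA is then given by the sum of both states: $T = T_1 + T_2$. Using the same procedure as described above for the 1 state model, the fraction new total RNA then becomes (equation 1)45:
$$\Lambda(t) = \frac{k_{d2}}{k_{d2} + k_{12}} \left[ 1 - e^{-(k_{d1}+k_{12}) t} + k_{12} \left( \frac{1 - e^{-k_{d2} t}}{k_{d2}} - \frac{e^{-(k_{d1}+k_{12}) t} - e^{-k_{d2} t}}{k_{d2} - k_{d1} - k_{12}} \right) \right] \quad (1)$$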
To estimate the degradation parameters in model 1 (one parameter) and model 2 (three parameters), we used a non-linear least-squares fitting framework (Python v3.7.4, package lmfit v1.0.2, function minimize). Note, for all experiments we directly used the MAP values from GRAND-SLAM for the modeling except for the LRPPRC KO cells, where 4sU incorporation was very low and turnover very high (and thus the precision in the measurement was low). In this case, we decreased the measured values of 100% new at the earliest time point down to 95% new to allow for the fitting to proceed smoothly (see supplemental list for details on which genes and values were modified).
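The two decay models can be sketched in Python as below. The closed forms are derived here from the one-state and two-state schemes described above, assuming zero labeled RNA at t = 0; the rate names (k_deg, k_d1, k_12, k_d2) are labels chosen for this sketch. The fitting routine is a deliberately simpler stand-in for the lmfit non-linear least-squares fit: a log-linear regression through the origin that applies only to the one-state model.

```python
import math


def fraction_new_one_state(t, k_deg):
    """Fraction of new RNA under simple exponential turnover."""
    return 1.0 - math.exp(-k_deg * t)


def fraction_new_two_state(t, k_d1, k_12, k_d2):
    """Fraction of new total RNA (state 1 + state 2); requires k_d2 > 0."""
    a = k_d1 + k_12  # total exit rate from state 1
    b = k_d2
    if abs(b - a) < 1e-12:
        cross = t * math.exp(-a * t)  # limit of the expression below as b -> a
    else:
        cross = (math.exp(-a * t) - math.exp(-b * t)) / (b - a)
    state1 = 1.0 - math.exp(-a * t)
    state2 = k_12 * ((1.0 - math.exp(-b * t)) / b - cross)
    return (state1 + state2) * b / (b + k_12)


def half_life(k_deg):
    return math.log(2.0) / k_deg


def fit_one_state(times, fractions):
    """Slope through the origin of -ln(1 - fraction new) versus time."""
    num = 0.0
    den = 0.0
    for t, f in zip(times, fractions):
        if t > 0 and 0.0 <= f < 1.0:
            num += t * -math.log(1.0 - f)
            den += t * t
    return num / den


# Hypothetical time course (minutes) generated from a known rate.
times = [15, 30, 60, 120]
true_k = 0.02
observed = [fraction_new_one_state(t, true_k) for t in times]
k_hat = fit_one_state(times, observed)
```

As a consistency check, the two-state expression collapses to the one-state curve when k_12 is zero or when both states decay at the same rate.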
To select the model to make predictions about the RNA decay parameters, we used the Akaike information criterion $AIC = N \ln(\chi^2/N) + 2N_{varys}$, where N denotes the number of data points, χ2 is the residual sum of squares, and $N_{varys}$ is the number of variable parameters. To further estimate which model has a higher probability, we defined model weights that give the probability of model 1 over model 2. The difference in AIC for a given model i is given by $\Delta_i = AIC_i - AIC_{min}$, where $AIC_{min}$ is the minimum AIC across all models. Based on this, Akaike weights are estimated by $w_i = e^{-\Delta_i/2} / \sum_{k=1}^{K} e^{-\Delta_k/2}$, where K is the number of models. Since we are comparing two models, we defined the normalized probability π as
$$\pi = \frac{w_1}{w_1 + w_2}$$
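A minimal Python sketch of this model comparison is given below. The AIC expression follows the least-squares form N ln(χ2/N) + 2 × (number of varied parameters), which is the form reported by lmfit; that the authors used exactly this form is an assumption, and the residual values are hypothetical.

```python
import math


def aic(n_points, chi_square, n_varys):
    """Least-squares AIC from the residual sum of squares."""
    return n_points * math.log(chi_square / n_points) + 2 * n_varys


def akaike_weights(aic_values):
    """Relative support for each model, normalized to sum to one."""
    aic_min = min(aic_values)
    rel = [math.exp(-0.5 * (a - aic_min)) for a in aic_values]
    total = sum(rel)
    return [r / total for r in rel]


# Hypothetical fits to six time points.
aic_one_state = aic(n_points=6, chi_square=0.012, n_varys=1)
aic_two_state = aic(n_points=6, chi_square=0.010, n_varys=3)
weights = akaike_weights([aic_one_state, aic_two_state])
```

In this toy case the two-state fit has a slightly smaller residual, but the penalty for its two extra parameters leaves the one-state model with the larger weight.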
Kinetic modeling of mtRNA association to the mitoribosome
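To estimate the ribosome association rate, we reused the two-state model as described above, with state 1 the free mtRNA state, i.e. not bound to the ribosome, $R_1$, and state 2 the ribosome-bound mtRNA state, $R_2$. The ribosome association rate is then the transfer rate $k_{12}$ from the free state to the ribosome-bound state, and $k_{d1}$ and $k_{d2}$ are the degradation rates of the free and ribosome-bound states. In addition to the total RNA (equation 1 above), we now also fit the experimentally observed fraction of newly synthesized ribosome-bound mtRNA: $f^{rib}_{new}(t) = R_{2,new}(t)/R_{2,ss} = 1 - \frac{k_{d2} e^{-(k_{d1} + k_{12}) t} - (k_{d1} + k_{12}) e^{-k_{d2} t}}{k_{d2} - k_{d1} - k_{12}}$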
We can further simplify these equations by making the assumption that :
To estimate the degradation and mitoribosome association rates in the simplified model (two parameters) and the full model (three parameters), we used a non-linear least-squares fitting framework (Python v3.7.4, package lmfit v1.0.2, function minimize) to fit the total and ribosome-bound mtRNA. To select the model to make predictions about the mtRNA kinetic parameters, we used the Akaike information criterion as discussed above.
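For the ribosome-association model, the quantity of interest is the fraction of new RNA within the ribosome-bound state alone. The sketch below treats state 1 as free mtRNA and state 2 as ribosome-bound mtRNA, derives the bound-state fraction from the same two rate equations, and checks it against numerical integration. Symbol names (k_d1 for free-state decay, k_12 for ribosome association, k_d2 for bound-state decay) and all values are assumptions for illustration.

```python
import math


def fraction_new_ribosome_bound(t, k_d1, k_12, k_d2):
    """Closed-form fraction of new RNA in the ribosome-bound state."""
    a = k_d1 + k_12  # total exit rate from the free state
    return 1.0 - (k_d2 * math.exp(-a * t) - a * math.exp(-k_d2 * t)) / (k_d2 - a)


def fraction_new_bound_euler(t_end, k_p, k_d1, k_12, k_d2, steps=50000):
    """Forward-Euler integration; returns new bound RNA over its steady state."""
    dt = t_end / steps
    free = bound = 0.0
    for _ in range(steps):
        d_free = k_p - (k_d1 + k_12) * free
        d_bound = k_12 * free - k_d2 * bound
        free += dt * d_free
        bound += dt * d_bound
    bound_ss = k_12 * k_p / ((k_d1 + k_12) * k_d2)
    return bound / bound_ss


# Hypothetical rates (per minute) and pulse length (minutes).
rates = dict(k_d1=0.03, k_12=0.01, k_d2=0.005)
bound_closed = fraction_new_ribosome_bound(60.0, **rates)
bound_numeric = fraction_new_bound_euler(60.0, k_p=10.0, **rates)
```

Because new RNA must pass through the free state first, the bound fraction rises with a lag relative to the free pool, and that lag is what carries the information about the association rate.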
Lastly, to calculate the fraction of nuclear-encoded mRNA that was bound by the cytosolic polysome, we derived the following equation, with nuclear residence and , cytoplasmic turnover and cyto-polysome entry rates as determined in14:
Transcription elongation model
In order to model the mitochondrial transcription elongation process, we first processed the Nanopore sequencing reads according to the assumption that total RNA consists of nascent and mature RNAs. Reads with a 3 prime end within a 100 nt window of annotated mature RNA 3 prime ends contribute to the mature RNA fraction with coverage across the whole of the gene. Reads with 3 prime ends within the annotated genes, but outside of the 3 prime end window, contribute to the nascent RNA fraction with coverage across the whole genome from the single TSS in the mitochondrial genome, i.e. the HSP2 3 prime end site, up to the read 3 prime end. The sum of nascent and mature RNA fractions makes up the total RNA coverage across the genome. The model mature RNA fractions equal $F/d_i$ and the model nascent RNA fraction equals $F(L - x)/v$, with $x$ the distance (unit: nt) downstream of the TSS and $L$ the position of the 3 prime end of the genome, as previously described in20,21. $F$ is the transcriptional initiation (Firing) rate, $v$ the MT Polymerase elongation rate, and $d_i$ the mature RNA degradation rate, specific to gene $i$, as estimated above through TimeLapse sequencing. The initiation rate $F$ can then be estimated from the observed mature RNA counts from each gene $i$: $F_i = d_i \cdot mature_i$. This leads to an ensemble of $F$ estimates: $\{F_i\}$. Since RNR2 is the most upstream and highly expressed, we used this “best estimate”: $F_{best} = F_{RNR2}$. We also calculate $F_{min}$ and $F_{max}$ as the estimate bounds. To estimate $v$, we use linear regression to fit the nascent RNA fraction model, whilst utilizing our $F$ estimates, to get an elongation rate ensemble (Python package: lmfit, function: minimize, velo_seed = 3.81 * 60, equalling the previous estimate by25, velo_max = 100*1000 #units: nt / min) from the RNR2 3 prime end to the 3 prime end of the genome. $v_{best}$ is then the elongation rate paired with $F_{best}$, and $v_{min}$ with $F_{min}$, likewise for $v_{max}$. Altogether, our estimates are then used to predict total RNA coverage with $F_{best}$ and $v_{best}$ as best estimate (red lines) and the $min$ and $max$ estimates as error bands (Figure 1H).
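The two estimation steps in this paragraph can be sketched as follows. The first function turns mature RNA counts and degradation rates into per-gene initiation-rate estimates. The second fits an elongation rate under the assumption that nascent coverage falls off linearly with distance from the TSS, i.e. coverage at x equals the initiation rate times the remaining distance to the end of the transcribed region divided by the elongation rate; that functional form is an assumption of this example, as are the gene names used as dictionary keys, the counts, the rates, and the end position. A closed-form through-origin regression is used in place of lmfit.

```python
def initiation_estimates(mature_counts, degradation_rates):
    """Per-gene initiation rate estimates: degradation rate x mature count."""
    return {g: degradation_rates[g] * mature_counts[g] for g in mature_counts}


def fit_elongation_rate(positions, nascent_cov, f_rate, end_pos):
    """Fit v in nascent(x) = f_rate * (end_pos - x) / v by least squares.

    The regression has no intercept: coverage is proportional to the
    remaining distance (end_pos - x), with slope f_rate / v.
    """
    remaining = [end_pos - x for x in positions]
    slope = sum(r * y for r, y in zip(remaining, nascent_cov)) / sum(r * r for r in remaining)
    return f_rate / slope


# Hypothetical mature counts and degradation rates (per minute).
mature = {"RNR2": 5000.0, "ND1": 300.0, "COX1": 900.0}
deg = {"RNR2": 0.002, "ND1": 0.02, "COX1": 0.012}

f_estimates = initiation_estimates(mature, deg)
f_best = f_estimates["RNR2"]
f_min, f_max = min(f_estimates.values()), max(f_estimates.values())

# Synthetic nascent coverage generated with a known elongation rate (nt/min).
true_v = 3.81 * 60
end_pos = 16000.0
positions = [3000.0, 6000.0, 9000.0, 12000.0, 15000.0]
coverage = [f_best * (end_pos - x) / true_v for x in positions]

v_best = fit_elongation_rate(positions, coverage, f_best, end_pos)
```

Repeating the fit with f_min and f_max in place of f_best gives the paired lower and upper elongation-rate estimates used for error bands.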
Ribosome profiling of K562 cells
50 × 10^6 K562 cells were grown at 5 × 10^5 cells/mL for mitoribosome profiling. Upon removing media, cells were rinsed with ice-cold PBS and resuspended in 600 μL of mitoribosome lysis buffer (0.25% lauryl maltoside, 50 mM NH4Cl, 20 mM MgCl2, 0.5 mM DTT, 10 mM Tris, pH 7.5, and 1× EDTA-free protease inhibitor cocktail (Roche)). Before nuclease digestion, 5% NIH-3T3 lysate, based on OD260 equivalence, was spiked in for normalization. The lysates were digested for 30 min at room temperature with 8 units/μL of RNase If (NEB) to generate mitoribosome footprints. The digestion was then inhibited by the addition of 80 units SUPERaseIn (Thermo Fisher Scientific), followed by centrifugation at 10,000 RPM for 5 min to clarify the lysate. For isolating mitoribosomes, 450 μL of clarified lysate was loaded onto 12 mL of a 10−50% linear sucrose gradient and centrifuged at 40,000 RPM for 3 h at 4°C using an SW41Ti rotor. The gradient was fractionated into 800 μL fractions using the BioComp instrument, and the mitoribosomes were tracked by western blot with antibodies against MRPL12 and MRPL18B. From the mt-monosome fraction, RNA was extracted using 1:1 phenol/chloroform extraction and separated on a 15% polyacrylamide TBE-urea gel to collect RNA fragments of 28–40 nucleotides. Libraries were prepared as previously described along with the modification for mitoribosome library preparations5. Sequencing was performed on an Illumina NextSeq 500 instrument at the Bauer sequencing facility (Harvard University).
Ribosome footprinting analysis
Mitochondrial ribosome profiling data from above and from5 (GEO accession number GSE173283) were initially processed as previously described5. To analyze the 5 prime and 3 prime-end processing status of ribosome-loaded mt-mRNA, the positions of both the 5 prime and 3 prime ends of each read were recorded. Soft-clipped bases were ignored for recording positions but flagged if they were determined to come from a 3 prime poly(A) tail (>80% of soft-clipped bases were A). A read was considered processed if its terminus was within two nucleotides of the processed transcript terminus. A read was considered unprocessed if it spanned at least three nucleotides beyond a processed transcript terminus and was not flagged as containing a poly(A) tail.
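The classification rules for footprint 3 prime ends can be written as a small decision function. This is a sketch under simplifying assumptions: plus-strand coordinates only, a single processed terminus per transcript, and a third "undetermined" label (not named in the text) for reads that satisfy neither rule. Function and argument names are hypothetical.

```python
def is_poly_a(soft_clipped):
    """Flag soft-clipped bases as a poly(A) tail if more than 80% are A."""
    if not soft_clipped:
        return False
    return soft_clipped.upper().count("A") / len(soft_clipped) > 0.8


def classify_three_prime(read_start, read_end, transcript_end, soft_clipped=""):
    """Classify a read relative to a processed transcript 3 prime terminus.

    Coordinates are plus-strand positions of aligned bases (soft clips excluded).
    """
    offset = read_end - transcript_end
    if abs(offset) <= 2:
        return "processed"
    spans_terminus = read_start <= transcript_end and offset >= 3
    if spans_terminus and not is_poly_a(soft_clipped):
        return "unprocessed"
    return "undetermined"


# Hypothetical reads around a transcript that ends at position 130.
label_a = classify_three_prime(100, 131, 130)
label_b = classify_three_prime(110, 140, 130)
label_c = classify_three_prime(110, 140, 130, soft_clipped="AAAAAAAAAA")
```

The same logic applies to 5 prime ends with the sign of the offset reversed and without the poly(A) check.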
MitoStrings experiments
mt-RNA turnover in HeLa cells was determined by measuring the decrease of the old/unlabeled mtRNA after a 4sU pulse as described in14. In short, HeLa cells were labeled with 500 μM 4sU for 0, 60, or 120 min, cells were lysed in Trizol LS, and RNA was extracted according to the manufacturer’s protocol. A spike-in of in vitro transcribed ERCC-00148 was added to all the cell lysates during the RNA-extraction step. The RNA was denatured at 60°C for ten minutes before being biotinylated by the addition of 5 μg/mL biotin-MTS (Biotium) in 20% dimethylformamide (Sigma), 20 mM Hepes pH 7.4 and 1 mM EDTA for 30 min at room temperature. Free biotin was removed using phase-lock heavy gel tubes (5prime) following the manufacturer’s instructions. Biotinylated RNA was removed by incubation with streptavidin beads from the μMACS Streptavidin kit (Miltenyi Biotec) for 15 min at room temperature. Beads and RNA were loaded onto a μMACS column. Beads were washed with 100 mM Tris pH 7.5, 10 mM EDTA, 1 M NaCl, 0.1% Tween 20, and unlabeled RNA was collected in the flow-through. RNA in the flow-through was purified using the miRNeasy Nano kit following the manufacturer’s instructions, including the DNase treatment step (Qiagen). 30 ng of RNA was incubated for 16 h at 67°C with the XT Tagset-24 (NanoString Technologies) and with DNA probes specific for mitochondrial RNA (original MitoString probes19 were modified as in46) in hybridization buffer (NanoString Technologies) according to the manufacturer’s protocol before being loaded onto an nCounter Sprint Cartridge and quantified using the nCounter SPRINT Profiler (NanoString Technologies) at the Boston Children’s Hospital Molecular Genetics Core.
For direct measurements of processing kinetics, HeLa cells were labeled with 500 μM 4sU for 0, 1.875, 3.75, 7.5, 15, 30, 60, 120, or 240 min. RNA was extracted as described above with the addition of a second in vitro transcribed, 4sU-labeled spike-in, ERCC-00136 (10% of UTP was exchanged for 4sUTP in the T7 RNA polymerase in vitro reaction (NEB)). Biotinylation and streptavidin-bead binding were performed as above, with the exception that in this case the flow-through was discarded and bead-bound nascent RNA was eluted by incubating the beads for 2 × 5 min at 65°C in 100 mM TCEP (Thermo Scientific), 10 mM EDTA, 500 mM Tris at a final pH of 7. Eluted RNA was purified, hybridized with MitoString probes, and analyzed on an nCounter Sprint Profiler as above.
Direct nanopore RNA sequencing
RNA was extracted as described above using Trizol LS and chloroform. Extracted RNA was rRNA-depleted by hybridization with a custom set of biotinylated DNA probes complementary only to cytosolic rRNA (riboPOOL, siTOOLS). In short, 6 μg of RNA was mixed with 5 μM riboPOOL probes in 20 μL of 2.5 mM Tris-HCl (pH 7.5), 0.25 mM EDTA, and 500 μM NaCl and hybridized by incubating at 68°C for 10 min followed by a slow cooling step decreasing the temperature to 37°C. 20 μL of hybridized samples were mixed with Dynabeads MyOne Streptavidin C1 beads (ThermoFisher) in 80 μL of 5 mM Tris-HCl (pH 7.5), 0.5 mM EDTA and 1 M NaCl. Dynabeads had been pretreated with sodium hydroxide according to the manufacturer’s instructions to minimize RNase contamination. The sample and bead mix was incubated at 37°C for 15 min followed by a 5 min step at 50°C before beads were collected on a magnet and the supernatant was transferred to a new tube with beads and the incubation steps were repeated. RNA collected in the supernatant from the second round of bead depletion was cleaned up by isopropanol precipitation.
To include both non-polyadenylated and polyadenylated transcripts, 500 ng of rRNA-depleted RNA was polyadenylated in vitro using E. coli poly(A) polymerase as outlined in47. As an alternative strategy, we ligated a DNA linker (/5rApp/(N)6CTGTAGGCACCATCAAT, IDT) to the 3 prime-end of total RNA as previously described48. Briefly, 1 μg of total RNA was denatured for 2 minutes at 80°C and incubated with 500 ng DNA linker, 8 μL 50% PEG8000, 2 μL DMSO, 2 μL of T4 RNA Ligase Buffer (NEB) and 1 μL T4 RNA Ligase 2 (made in-house) in a 20 μL reaction for 16 hours at 16°C. The RNA was purified with the Zymo RNA Clean & Concentrator Kit, including the optional removal of molecules < 200 nt to exclude the leftover linker. Direct RNA library preparation was performed using the kit SQK-RNA002 (Oxford Nanopore Technologies) with 500 ng of poly(A)-tailed or 3 prime-end ligated RNA according to the manufacturer’s instructions with the following exceptions: the RCS was omitted and replaced with 0.5 μL water and the ligation of the reverse transcription adapter (RTA) was performed for 15 minutes. For 3 prime-end ligated samples, the standard RTA was replaced by a custom adapter that was generated by annealing two oligonucleotides (IDT): /5PHOS/GGCTTCTTCTTGCTCTTAGGTAGTAGGTTC (ONT_oligoA) and GAGGCGAGCGGTCAATTTTCCTAAGAGCAAGAAGAAGCCGATTGATGGT (ONT_oligoB_linker), where the underlined bases are complementary to the DNA linker. Oligonucleotides were diluted to 1.4 μM in 10 mM Tris-HCl pH 7.5, 50 mM NaCl and annealed by heating to 95°C and slowly cooling to room temperature.
Poly(A)+ RNA from myoblasts and HeLa samples was purified using the Dynabeads mRNA purification kit (ThermoFisher, 61006) according to the manufacturer’s instructions, starting with up to 75 μg of total RNA. Direct RNA library preparation was performed as described above with 500–700 ng of poly(A)+ RNA. Direct total poly(A)+ RNA sequencing data from K562 cells were obtained from14. Libraries were sequenced on a MinION device (Oxford Nanopore Technologies) for up to 72 hours.
Nanopore sequencing data analysis
Live base-calling of nanopore sequencing data was performed with MinKNOW (release 20.10.3 or later). Reads with a base-calling threshold > 7 were converted into DNA sequences by substituting T for U bases prior to alignment. Reads were aligned to the HeLa-specific single nucleotide polymorphism-corrected reference hg38 genome (see below) using minimap249 with parameters -ax splice -uf -k14. Multi-mapping reads were included in all downstream analyses. All subsequent analyses were performed using custom Python scripts.
For measuring transcript abundance, the alignment BAM file was converted to a BED file using pybedtools50 bamtobed with options cigar=True, tag=‘NM’. Reads mapping to the mitochondrial genome were extracted using pybedtools intersect with options wo=True, s=True, and a BED file containing the coordinates of all mitochondrial transcripts. Read/transcript pairs that contained the cigar string “N” within 25 nt of the transcript start or end were excluded. Unique reads intersecting each transcript by at least 25 nt were counted. For measuring the abundance of reads mapping to the last 100 nt of the transcript, the same strategy was used with a BED file containing the coordinates of the 3 prime-end most 100 nt for each transcript. For transcripts that are frequently detected in the unprocessed state (e.g. ATP8-6/COX3), the abundance of the upstream transcript was corrected to take into account reads that are not long enough to reach it. For each read, the genomic coordinates of the read start and the read end were extracted and intersected with the following genomic features within the transcripts of interest using pybedtools with options wo=True, s=True: the transcript start region (0 to +25 nt from transcript start), the transcript end region (−25 to +25 nt from the transcript end), the gene body (+25 nt from transcript start to –25 nt from transcript end) and the 5 prime upstream region (−1500 to 0 nt from transcript start). For the downstream transcript (e.g. COX3), the 5 prime-end fraction unprocessed was calculated as the number of reads mapping to this transcript that start in the 5 prime upstream region divided by the number of reads mapping to this transcript that start in the 5 prime upstream region or in the transcript start region. Reads mapping to this transcript that start in the gene body were defined as those lacking information about the 5 prime-end and that potentially extend into the upstream transcript (e.g. ATP8-6).
The abundance of the upstream transcript was corrected using the following formula: reads mapping to upstream transcript + (fraction unprocessed downstream transcript * number of reads lacking information about 5 prime-end downstream transcripts).
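As a worked example of this correction, the sketch below computes the 5 prime-end fraction unprocessed for a downstream transcript and applies it to the reads that start in the gene body. The read counts are invented for illustration and the function names are hypothetical.

```python
def fraction_unprocessed_5p(n_start_upstream, n_start_in_start_region):
    """Fraction of informative reads whose 5 prime end lies upstream of the
    downstream transcript's start (i.e. reads that are unprocessed)."""
    return n_start_upstream / (n_start_upstream + n_start_in_start_region)


def corrected_upstream_abundance(upstream_reads, n_start_upstream,
                                 n_start_in_start_region, n_start_in_gene_body):
    """Upstream transcript reads plus the share of uninformative downstream
    reads expected to have extended into the upstream transcript."""
    frac = fraction_unprocessed_5p(n_start_upstream, n_start_in_start_region)
    return upstream_reads + frac * n_start_in_gene_body


# Hypothetical counts for an upstream/downstream pair such as ATP8-6/COX3.
frac = fraction_unprocessed_5p(n_start_upstream=30, n_start_in_start_region=70)
corrected = corrected_upstream_abundance(
    upstream_reads=1000,
    n_start_upstream=30,
    n_start_in_start_region=70,
    n_start_in_gene_body=200,
)
```

With these numbers, 30% of informative reads are unprocessed, so 60 of the 200 gene-body reads are attributed to the upstream transcript as well.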
For analysis of processing, the genomic coordinates of the read start and the read end were extracted and intersected with the following genomic features using pybedtools with options wo=True, s=True: the transcript start region (−15 to +50 nt from transcript start), the transcript end region (−15 to +15 nt from transcript end) and the gene body (transcript start to transcript end). Next, the entire reads were intersected with the following genomic features using pybedtools with options wo=True, s=True: the upstream region (−60 to −15 nt from transcript start), the downstream region (+15 to +60 nt from transcript end), the transcript end region (−15 to +15 nt from transcript end) and the gene body (transcript start to transcript end). Reads/feature pairs that overlapped by less than 15 nt or that contained the cigar string “N” within 15 nt of the feature start or end were excluded. Reads that mapped to a protein-coding gene and that ended in the transcript end region of that gene were classified as “processed” at the 3 prime-end. Reads that mapped to a protein-coding gene and to the downstream region of that gene were classified as “unprocessed” at the 3 prime-end. Reads that mapped to a protein-coding gene and that started in the transcript start region of that gene were classified as “processed” at the 5 prime-end. Reads that mapped to a protein-coding gene and to the upstream region of that gene were classified as “unprocessed” at the 5 prime-end.
Poly(A) tail lengths were estimated using nanopolish51. For 3 prime-end ligated samples, reads were classified as non-polyadenylated if the qc_tag from nanopolish was “NO_REGION” or polyadenylated if the qc_tag was “PASS” or “ADAPTER”. For poly(A)+ samples, only reads classified as polyadenylated were analyzed. Only reads with the 3 prime-end processing status “processed” were included in poly(A) tail length analyses.
For modeling transcription, the genomic coordinates of the read 3 prime-end were extracted. Using pybedtools with options wo=True, s=True, read ends were classified as “processed” if they intersected with a 25 nt window around the transcript end site or “actively transcribing” if they mapped more than 25 nt upstream of the transcript end site.
Quantitative gene expression models
To estimate the importance of each level of gene expression for which we had acquired kinetic measurements, we developed a quantitative gene-expression model (Fig 5) that we defined as follows: Where the from the transcription elongation model defined above, “fraction mRNA in ribosome” as defined above and the . The coefficient of determination from correlating the RPF values predicted by the model with the measured RPF values was used to measure the predictive power of the model. By definition, the model explains all of the variability in RPF. The importance of each gene-expression level was determined by setting the succeeding parameters to 1. For example, to estimate the predictive value of RNA levels we set both Fraction mRNA in ribosome and the TE to 1.
We also developed a simpler, minimal gene-expression model to directly compare nuclear and mitochondrial gene-expression control, where the and . RNA levels were from direct RNA-seq (Fig 1) for mitochondria and TPMs for nuclear-encoded genes. All the model parameters were estimated using measured rates and RPF data from HeLa cells and were kept constant except for the example of using the from LRPPRC KO cells. The predictive power was determined as the coefficient of determination from correlating the RPF values predicted by the model with the measured RPF values, either from the same HeLa data set (Model), from a HeLa replicate (Replicate), or from other cell lines (HEK293T wt, HEK293T LRPPRC KO, or K562 cells). The importance of each gene-expression node was again determined by sequentially setting the TE and then parameters to 1. All rates, RPF, TPM and direct RNA-seq values can be found in the supplemental tables. For this analysis, we used RPF, half-lives, and RNA abundances averaged between two replicates for each cell line, except for the HeLa model and replicate. For the nuclear side, we excluded two outliers, ATP5MC3 and ATP5F1E, out of 75 genes total, as they single-handedly depressed the RPF correlation between the different cell lines.
Supplementary Material
Acknowledgements
We thank Anna Wredenberg (Karolinska Institute), Christoph Freyer (Karolinska Institute), Jake Bridgers, and Chantal Guegler for their careful review of the manuscript. We thank members of the Churchman lab, Matthias Selbach (Max Delbrück Center for Molecular Medicine), and Karl McShane (Malmö Stad) for fruitful discussions. We thank Ari Sarfatis for help developing the data analysis pipeline; the Biopolymers Core Facility at Harvard Medical School for sequencing services and the Boston Children’s Hospital Molecular Genetics Core for NanoStrings services.
Funding
This work was supported by National Institutes of Health grant R01-GM123002 (L.S.C.). E.M. is supported by a European Molecular Biology Organization Long-Term Fellowship (ALTF 143-2018). K.C. is supported by post-doctoral fellowships from the Fonds de Recherche du Québec - Santé and the Canadian Institutes of Health Research. R.I. is supported by NIH/NIGMS T32 postdoctoral training grant GM007748-44.
Footnotes
Declaration of Interests
R.I. is a founder, board member, and shareholder of Cellforma, unrelated to the present work.
Availability of data and materials
Raw and processed data will be available from GEO. Code for analysis of all data will be made publicly available at http://github.com/churchmanlab/mitoRNA_kinetics.
References
1. Balakrishnan R. et al. Principles of gene regulation quantitatively connect DNA to RNA and proteins in bacteria. Science 378, eabk2066 (2022).
2. Schwanhäusser B. et al. Global quantification of mammalian gene expression control. Nature 473, 337–342 (2011).
3. Isaac R. S., McShane E. & Churchman L. S. The Multiple Levels of Mitonuclear Coregulation. Annu. Rev. Genet. 52, 511–533 (2018).
4. Couvillion M. T., Soto I. C., Shipkovenska G. & Churchman L. S. Synchronized mitochondrial and cytosolic translation programs. Nature 533, 499–503 (2016).
5. Soto I. et al. Balanced mitochondrial and cytosolic translatomes underlie the biogenesis of human respiratory complexes. Genome Biol. 23, 170 (2022).
6. Mercer T. R. et al. The human mitochondrial transcriptome. Cell 146, 645–658 (2011).
7. Köhler F., Müller-Rischart A. K., Conradt B. & Rolland S. G. The loss of LRPPRC function induces the mitochondrial unfolded protein response. Aging 7, 701–717 (2015).
8. Suomalainen A. & Battersby B. J. Mitochondrial diseases: the contribution of organelle stress responses to pathology. Nat. Rev. Mol. Cell Biol. (2017) doi: 10.1038/nrm.2017.66.
9. Rackham O. & Filipovska A. Organization and expression of the mammalian mitochondrial genome. Nat. Rev. Genet. (2022) doi: 10.1038/s41576-022-00480-x.
10. Tan B. G. et al. The human mitochondrial genome contains a second light strand promoter. Mol. Cell 82, 3646–3660.e9 (2022).
11. Gelfand R. & Attardi G. Synthesis and turnover of mitochondrial ribonucleic acid in HeLa cells: the mature ribosomal and messenger ribonucleic acid species are metabolically unstable. Mol. Cell. Biol. 1, 497–511 (1981).
12. Herzog V. A. et al. Thiol-linked alkylation of RNA to assess expression dynamics. Nat. Methods 14, 1198–1204 (2017).
13. Schofield J. A., Duffy E. E., Kiefer L., Sullivan M. C. & Simon M. D. TimeLapse-seq: adding a temporal dimension to RNA sequencing through nucleoside recoding. Nat. Methods 15, 221–225 (2018).
14. Smalec B. M. et al. Genome-wide quantification of RNA flow across subcellular compartments reveals determinants of the mammalian transcript life cycle. bioRxiv 2022.08.21.504696 (2022) doi: 10.1101/2022.08.21.504696.
15. Chujo T. et al. LRPPRC/SLIRP suppresses PNPase-mediated mRNA decay and promotes polyadenylation in human mitochondria. Nucleic Acids Res. 40, 8033–8047 (2012).
16. Piechota J. et al. Differential stability of mitochondrial mRNA in HeLa cells. Acta Biochim. Pol. 53, 157–168 (2006).
17. Burger K. et al. 4-thiouridine inhibits rRNA synthesis and causes a nucleolar stress response. RNA Biol. 10, 1623–1630 (2013).
18. Jürges C., Dölken L. & Erhard F. Dissecting newly transcribed and old RNA using GRAND-SLAM. Bioinformatics 34, i218–i226 (2018).
19. Wolf A. R. & Mootha V. K. Functional genomic analysis of human mitochondrial RNA processing. Cell Rep. 7, 918–931 (2014).
20. Wu Z. et al. Quantitative regulation of FLC via coordinated transcriptional initiation and elongation. Proc. Natl. Acad. Sci. U. S. A. 113, 218–223 (2016).
21. Ietswaart R., Rosa S., Wu Z., Dean C. & Howard M. Cell-Size-Dependent Transcription of FLC and Its Antisense Long Non-coding RNA COOLAIR Explain Cell-to-Cell Expression Variation. Cell Syst 4, 622–635.e9 (2017).
22. Ruzzenente B. et al. LRPPRC is necessary for polyadenylation and coordination of translation of mitochondrial mRNAs. EMBO J. 31, 443–456 (2012).
23. Singh V., Itoh Y., Huynen M. A. & Amunts A. Activation mechanism of mitochondrial translation by LRPPRC-SLIRP. bioRxiv 2022.06.20.496763 (2022) doi: 10.1101/2022.06.20.496763.
24. Sultana S., Solotchi M., Ramachandran A. & Patel S. S. Transcriptional fidelities of human mitochondrial POLRMT, yeast mitochondrial Rpo41, and phage T7 single-subunit RNA polymerases. J. Biol. Chem. 292, 18145–18160 (2017).
25. Yu H. et al. TEFM Enhances Transcription Elongation by Modifying mtRNAP Pausing Dynamics. Biophys. J. 115, 2295–2300 (2018).
26. Posse V., Shahzad S., Falkenberg M., Hällberg B. M. & Gustafsson C. M. TEFM is a potent stimulator of mitochondrial transcription elongation in vitro. Nucleic Acids Res. 43, 2615–2624 (2015).
27. Costello M. et al. Characterization and remediation of sample index swaps by non-redundant dual indexing on massively parallel sequencing platforms. BMC Genomics 19, 332 (2018).
28. Ojala D., Montoya J. & Attardi G. tRNA punctuation model of RNA processing in human mitochondria. Nature 290, 470 (1981).
29. Temperley R. J., Wydro M., Lightowlers R. N. & Chrzanowska-Lightowlers Z. M. Human mitochondrial mRNAs--like members of all families, similar but different. Biochim. Biophys. Acta 1797, 1081–1085 (2010).
30. Rackham O., Mercer T. R. & Filipovska A. The human mitochondrial transcriptome and the RNA-binding proteins that regulate its expression. Wiley Interdiscip. Rev. RNA 3, 675–695 (2012).
31. Bratic A. et al. Mitochondrial Polyadenylation Is a One-Step Process Required for mRNA Integrity and tRNA Maturation. PLoS Genet. 12, e1006028 (2016).
32. Remes C. et al. Translation initiation of leaderless and polycistronic transcripts in mammalian mitochondria. Nucleic Acids Res. (2023) doi: 10.1093/nar/gkac1233.
33. Kühl I. et al. POLRMT regulates the switch between replication primer formation and gene expression of mammalian mtDNA. Sci Adv 2, e1600963 (2016).
34. Bogenhagen D. F., Ostermeyer-Fay A. G., Haley J. D. & Garcia-Diaz M. Kinetics and Mechanism of Mammalian Mitochondrial Ribosome Assembly. Cell Rep. 22, 1935–1944 (2018).
35. Ng K. Y. et al. Nonstop mRNAs generate a ground state of mitochondrial gene expression noise. Sci Adv 8, eabq5234 (2022).
36. Irastortza-Olaziregi M. & Amster-Choder O. Coupled Transcription-Translation in Prokaryotes: An Old Couple With New Surprises. Front. Microbiol. 11, 624830 (2020).
37. Lagouge M. et al. SLIRP Regulates the Rate of Mitochondrial Protein Synthesis and Protects LRPPRC from Degradation. PLoS Genet. 11, e1005423 (2015).
38. Jiang S. et al. TEFM regulates both transcription elongation and RNA processing in mitochondria. EMBO Rep. (2019) doi: 10.15252/embr.201948101.
39. Webster M. W. et al. Structural basis of transcription-translation coupling and collision in bacteria. Science 369, 1355–1359 (2020).
40. Kummer E. et al. Unique features of mammalian mitochondrial translation initiation revealed by cryo-EM. Nature 560, 263–267 (2018).
41. Ohkubo A. et al. The FASTK family proteins fine-tune mitochondrial RNA processing. PLoS Genet. 17, e1009873 (2021).
42. Morgenstern M. et al. Quantitative high-confidence human mitochondrial proteome and its dynamics in cellular context. Cell Metab. (2021) doi: 10.1016/j.cmet.2021.11.001.
43. Replogle J. M. et al. Mapping information-rich genotype-phenotype landscapes with genome-scale Perturb-seq. Cell 185, 2559–2575.e28 (2022).
44. Akaike H. A new look at the statistical model identification. IEEE Trans. Automat. Contr. 19, 716–723 (1974).
45. Sin C., Chiarugi D. & Valleriani A. Degradation Parameters from Pulse-Chase Experiments. PLoS One 11, e0155028 (2016).
46. Stefan Isaac R. et al. Single-nucleoid architecture reveals heterogeneous packaging of mitochondrial DNA. bioRxiv 2022.09.25.509398 (2022) doi: 10.1101/2022.09.25.509398.
47. Drexler H. L. et al. Revealing nascent RNA processing dynamics with nano-COP. Nat. Protoc. 16, 1343–1375 (2021).
48. Mayer A. & Churchman L. S. Genome-wide profiling of RNA polymerase transcription at nucleotide resolution in human cells with native elongating transcript sequencing. Nat. Protoc. 11, 813–833 (2016).
49. Li H. Minimap2: pairwise alignment for nucleotide sequences. Bioinformatics 34, 3094–3100 (2018).
50. Dale R. K., Pedersen B. S. & Quinlan A. R. Pybedtools: a flexible Python library for manipulating genomic datasets and annotations. Bioinformatics 27, 3423–3424 (2011).
51. Workman R. E. et al. Nanopore native RNA sequencing of a human poly(A) transcriptome. Nat. Methods 16, 1297–1305 (2019).