Author manuscript; available in PMC: 2023 Nov 1.
Published in final edited form as: Trends Genet. 2022 May 28;38(11):1112–1122. doi: 10.1016/j.tig.2022.05.002

Gene product diversity: adaptive or not?

Jianzhi Zhang 1,*, Chuan Xu 2
PMCID: PMC9560964  NIHMSID: NIHMS1808046  PMID: 35641344

Abstract

One gene does not equal one RNA or protein. The genomic revolution has revealed numerous different RNA and protein molecules that can be produced from one gene, such as circular RNAs generated by back-splicing, proteins with residues mismatching the genomic encoding because of RNA editing, and proteins extended in the C-terminus via stop codon readthrough in translation. Are these diverse products results of exquisite gene regulations or imprecise biological processes? While there are cases where the gene product diversity appears beneficial, genome-scale patterns suggest that much of this diversity arises from nonadaptive, molecular errors. This finding has important implications for studying the functions of diverse gene products and for understanding the fundamental properties and evolution of cellular life.

Keywords: Molecular error, natural selection, transcription, translation, posttranscriptional modification

Gene product diversity

Gene expression produces RNAs and/or proteins according to the blueprint stored in the genome. While the basic process of gene expression including canonical transcription, RNA processing, and translation generates one mature mRNA and one protein per gene (Fig. 1A), a surprisingly large diversity in gene product per gene was recently revealed, thanks to the explosive development of genome biology. Is this diversity reflective of exquisite gene regulations that are adaptive, or molecular errors resulting from imprecise gene expression? In this article, we review steps in gene expression that produce diverse gene products. We argue based on genomic evidence that gene product diversity originates largely from molecular errors and discuss the implications of this understanding.

Fig. 1. Gene product diversity.

(A) The canonical process of gene expression, showing transcription, RNA processing, and translation. (B) Diversities created in transcription and RNA processing, showing alternative transcriptional initiation (ATI), alternative splicing (AS) including back-splicing that creates circRNAs, and alternative polyadenylation (APA). (C) Diversities created in posttranscriptional modification, showing A-to-I editing, C-to-U editing, and m6A modification. (D) Diversities created in translation, showing alternative translation initiation (ATLI), mistranslation, and stop codon readthrough. Exons are colored and numbered, whereas introns are in grey and not numbered.

Diverse gene products generated in transcription and RNA processing/modification

It is now known that the transcription of a gene often starts from one of several transcription start sites (TSSs) due to the existence of multiple core promoters per gene [1,2] (Fig. 1B). This phenomenon of alternative transcriptional initiation (ATI) generates transcripts varying in the 5′ untranslated region (UTR) and sometimes even the coding region. ATI is widespread (e.g., in humans, >50% of genes show ATI [3] and an average gene has four TSSs [4]) and varies among tissues and developmental stages [5,6]. During transcriptional elongation, RNA polymerases may incorporate into transcripts nucleotides unpaired with those in the genome (i.e., mistranscription), at a rate of 10⁻⁶ to 10⁻⁵ per nucleotide incorporation [7]. Upon the completion of transcription, a typical eukaryotic mRNA is cleaved at its 3′ end followed by an addition of a poly(A) tail, but the cleavage and polyadenylation can occur at one of several sites [8] (Fig. 1B). Such alternative polyadenylation (APA) gives rise to multiple mRNA isoforms that differ in the 3′ UTR and sometimes even the coding region [9]. APA is highly abundant (e.g., ~70% of human genes show APA [10]) and can vary among tissues [11] and developmental stages [12].

Intron-containing transcripts are spliced to generate mature mRNAs. Alternative splicing (AS) connects exons in different combinations, resulting in different mature mRNAs, which encode different protein isoforms [13] (Fig. 1B). While generating linear mRNAs most of the time, splicing occasionally creates circular RNAs (circRNAs) by covalently linking a downstream splice-donor site to an upstream splice-acceptor site (i.e., back-splicing) [14] (Fig. 1B). AS is highly prevalent in eukaryotes. For example, ~95% of human multi-exonic genes are alternatively spliced [15].

Over 160 different types of posttranscriptional modifications of RNA have been reported [16,17]. Among them, the conversion of adenosine to inosine (A-to-I editing) and cytosine to uridine (C-to-U editing) and the methylation at the nitrogen-6 position of the adenosine (m6A) have received the widest attention because of their relative prevalence in mRNAs [17,18] (Fig. 1C). These and other modifications can result in changes in RNA and/or protein sequences and properties [17,18].

Diverse gene products generated in translation

The translation of a gene often initiates from one of several positions, sometimes using non-AUG codons (typically deviating from AUG at one nucleotide) as start codons (Fig. 1D). This phenomenon of alternative translation initiation (ATLI) results in multiple proteins with distinct N-termini, some even with different reading frames, from the same gene [19]. ATLI is prevalent; even within one cell line, about 90% of genes exhibit ATLI [20] and on average 2.5 and 2 translation initiation sites per gene have been reported in human and mouse, respectively [19,21]. During translational elongation, an amino acid not encoded by the mRNA may be incorporated into the peptide synthesized (i.e., mistranslation), with a probability of 10⁻⁴ to 10⁻³ per amino acid incorporation [22,23]. The termination of translation is also variable, because the ribosome sometimes reads through the stop codon without stopping, generating protein isoforms that differ in their C-termini [24] (Fig. 1D). For instance, ribosome-profiling experiments uncovered about 300 fruit fly genes subject to stop codon readthrough [25].

The adaptive and molecular error hypotheses

Why are so many different RNAs or proteins produced at each step of the expression of a gene? The prevailing view is that the production of an astronomically large number of functionally distinct RNAs and proteins from a limited number of genes may be necessary to support complex forms of life such as ourselves. That is, most of the observed gene product diversity is adaptive [23,26,27]. There are indeed examples of functional diversity among gene products that appears to be adaptive (Box 1).

Box 1. Potentially adaptive gene product diversity.

Human LEF1 encodes lymphoid enhancer binding factor 1 that regulates the transcription of Wingless/Integrated (Wnt)/β-catenin genes. LEF1 produces two different protein isoforms via ATI; the longer isoform recruits β-catenin to Wnt target genes, whereas the shorter isoform cannot interact with β-catenin and instead suppresses the Wnt regulation of target genes [76].

APA could impact mRNA stability, translation, protein function, and protein subcellular localization. For example, the mouse immunoglobulin heavy constant mu (Ighm) gene expresses a secreted form using a proximal polyadenylation site and a membrane-bound form using a distal site [77].

AS can play important physiological roles. For example, Sxl, Tra, and Dsx in the fruit fly sex determination pathway are alternatively spliced, resulting in either male or female individuals [78]. Back-splicing could be functional too. Some circRNAs can act as microRNA sponges [14] and some circRNAs bind to and titrate out RNA-binding proteins [79]. Furthermore, some circRNAs can be translated via cap-independent translation, potentially taking effect through their protein products [80].

An A-to-I recoding event in a K+ channel has been shown to underlie the cold adaptations of polar octopuses [81], and m6A has been suggested to impact mRNA stability, translational efficiency, cell fate, spermatogenesis, sex determination, and many other processes [82,83].

During translation, due to ATLI, human mitochondrial antiviral-signaling protein gene MAVS produces a full-length MAVS and a truncated mini-MAVS that are functionally different from each other; although both proteins positively regulate cell death, mini-MAVS interferes with interferon production induced by full-length MAVS [84]. It has been reported that CUG mistranslation in the human fungal pathogen Candida albicans creates cell surface variation that might help the pathogen escape from the host immune surveillance [85].

Stop codon readthrough is a common strategy of viruses to encode proteins with an extended C-terminus [86]. For instance, a ~5% probability of readthrough of the gag UAG stop codon is required for producing the Gag-Pol polyprotein necessary for the assembly of the murine leukemia virus [87]. Stop codon readthrough can also influence protein localization by adding a signal peptide [88].

Because the fitness effect of producing diverse gene products is generally unknown in the above cases, the biochemical and potentially functional effects may not mean that these diversities are adaptive. Furthermore, even when removing a diversity lowers fitness, the creation of the diversity in evolution may not be adaptive, as in the harm-permitting model of RNA editing [54] described in the main text.

Nonetheless, the number of known, apparently adaptive cases is far smaller than the total number of variants known. While this disparity could mean that the adaptive values of most variants are yet to be identified, an alternative hypothesis is that much of the diversity arises from molecular errors in gene expression and is nonadaptive. Because the ultimate outcome of evolution is fitness maximization (given mutations), we define any fitness-reducing molecular event as a molecular error, regardless of whether it has a dedicated machinery and whether it shows tissue- or developmental stage-specificity.

Molecular error is expected to be abundant in part because cellular life relies on chemical reactions, which have a stochastic nature and cannot completely avoid error. This problem is especially serious in cells, where the number of molecules of each kind tends to be quite small (e.g., only two molecules of DNA exist for each gene in a diploid cell). Consequently, an enzyme molecule may occasionally act on a wrong substrate molecule, leading to an erroneous reaction that reduces fitness. Another cause for abundant molecular errors is that they have not been eliminated by natural selection. Why this is so is difficult to know for individual biological processes, but in principle this could occur for the following reasons. First, reducing molecular error is not cost-free, and natural selection will not favor error reduction if the cost of reducing the error exceeds the benefit. Second, the net benefit of reducing error may be positive, but it would become smaller as the error diminishes; eventually the selection for error reduction would be too weak to overcome the effect of genetic drift [28]. Third, the cell may have evolved mechanisms to lessen the damage caused by the error such that the benefit of reducing error is minimized. For example, the molecular chaperone GroEL helps proteins fold. Consequently, natural selection against mistranslation, which can cause protein misfolding, is relaxed for the obligate targets of GroEL [29]. Finally, there may simply exist no mutation that could eliminate error.

Genomic evidence for the error hypothesis

The molecular error hypothesis for diverse gene products makes a series of predictions that are unexpected a priori under the adaptive hypothesis, allowing the two hypotheses to be distinguished [30] (Box 2). Furthermore, one can estimate the fraction of gene product diversity that is deleterious [31] (Box 2). Applying this rationale to the analysis of diverse gene products generated at various steps of gene expression reveals overwhelming evidence for the error hypothesis (Table 1). As examples, below we describe such evidence on the product diversity generated by alternative splicing, including back-splicing, and A-to-I editing.

Box 2. Predictions and tests of the molecular error hypothesis.

Molecular errors in gene expression can reduce fitness for several reasons. First, they lower the average biological activity of the RNA/protein molecules produced. Second, they can create RNA/protein molecules that are cytotoxic. Third, they waste energy and other resources in degrading and containing these toxic molecules. For a gene, the fitness cost of errors associated with the above second and third mechanisms rises with the gene expression level because the number of errors should be proportional to the expression level, while that associated with the first mechanism should not decrease with the expression level. Hence, natural selection against error is stronger, and the observed error rate is therefore lower, in more highly expressed genes. Consequently, gene product diversity caused primarily by molecular error should be lower for more highly expressed genes (Fig. 1A in Box 2). The error hypothesis further predicts that gene product diversity is lower in the tissue or environment where the gene is more highly expressed (Fig. 1B in Box 2). By contrast, the adaptive hypothesis does not make these predictions, because the fitness advantage of an adaptive diversity should depend on the specific function and regulation of the gene concerned. In addition, depending on the type of error considered, gene product diversity is predicted by the error hypothesis to be negatively correlated with the amount of splicing (AS and back-splicing) or amount of translation (ATLI, mistranslation, and stop codon readthrough). Furthermore, because the strength of genetic drift declines as the effective population size (Ne) increases, the power of selection relative to that of drift rises with Ne. The error hypothesis thus predicts that gene product diversity decreases with Ne (Fig. 1C in Box 2), while the opposite may be predicted if the diversity is mostly beneficial (or no prediction if different species adapt with different molecular mechanisms).
Finally, the error hypothesis predicts that sequence motifs underlying the generation of gene product diversity, such as the signals for minor polyadenylation sites, should not be evolutionarily conserved, while the adaptive hypothesis predicts conservation (Fig. 1D in Box 2). These contrasting predictions allow distinguishing between the adaptive and error hypotheses.
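
The expression-level prediction above can be checked with a rank correlation between expression level and product diversity across genes. What follows is a minimal sketch and not the procedure of the cited studies; the function names (average_ranks, spearman_rho) and the five toy values are hypothetical and serve only to show the direction of the test. A clearly negative coefficient is what the error hypothesis predicts.

```python
def average_ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rank correlation, computed as Pearson's r on the ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have equal length >= 2")
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("rank variance is zero; correlation undefined")
    return cov / (var_x * var_y) ** 0.5


# Toy, made-up values for five hypothetical genes (not real data).
expression = [1.0, 5.0, 20.0, 80.0, 300.0]
diversity = [0.40, 0.35, 0.20, 0.10, 0.05]
rho = spearman_rho(expression, diversity)
```

The same function could be applied per tissue (across genes) or per gene (across tissues), which correspond to panels A and B of Fig. 1 in Box 2.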

To estimate the fraction of gene product diversity that is deleterious, we assume that all deleterious diversity has been selectively removed in the most highly expressed genes. Hence, the observed diversity in these genes would be the non-deleterious amount (ND). Similarly, we assume that none of the deleterious diversity has been selectively purged in the least expressed genes; hence their diversity reflects the total diversity (T). Therefore, the fraction of deleterious diversity is Fdel = (T − ND)/T = 1 − ND/T. This Fdel estimate is conservative because T is an underestimate of all diversity before selection and ND is an overestimate of all non-deleterious diversity. Note that Fdel measures the fraction of deleterious diversity before the action of purifying selection. One can estimate the fraction (Odel) after the action of selection by replacing T with the overall diversity observed from all genes.
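
The two estimators differ only in the denominator, which a short sketch makes explicit. The function name deleterious_fraction and all numerical values below are hypothetical and chosen purely for illustration; they are not taken from any of the cited datasets.

```python
def deleterious_fraction(nd, reference):
    """Return 1 - nd / reference.

    nd: diversity in the most highly expressed genes, taken as the
        non-deleterious amount (ND).
    reference: T (diversity in the least expressed genes) to obtain Fdel,
        or the overall diversity observed from all genes to obtain Odel.
    """
    if reference <= 0:
        raise ValueError("reference diversity must be positive")
    if not 0 <= nd <= reference:
        raise ValueError("nd must lie between 0 and the reference diversity")
    return 1.0 - nd / reference


# Hypothetical diversity values, for illustration only.
nd = 0.02        # most highly expressed genes
t = 0.25         # least expressed genes
overall = 0.10   # all genes pooled

f_del = deleterious_fraction(nd, t)        # before purifying selection
o_del = deleterious_fraction(nd, overall)  # after purifying selection
```

With these made-up inputs, Fdel is about 0.92 and Odel about 0.80. Odel cannot exceed Fdel as long as the overall diversity does not exceed T.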

Fig. 1 in Box 2. Predictions of the error hypothesis.

(A) In a tissue, gene product diversity declines with gene expression level. (B) For a gene, product diversity in a tissue declines with the expression level of the gene in the tissue. (C) Gene product diversity decreases with effective population size (Ne). (D) Sequence motifs for major sites of action such as major polyadenylation sites (PASs), but not those for minor sites of action such as minor PASs, are evolutionarily conserved. Pseudo-PASs are neutral controls.

Table 1.

Gene product diversity generated primarily by molecular errors in various steps of gene expression.

Columns: Diversity-generating process; Genomic evidence for the error hypothesis; Deleterious fraction^a (species); Reference

Alternative transcriptional initiation (ATI)
Genomic evidence:
• The TSS diversity of a gene decreases with the gene expression level.
• The fractional use of the major TSS increases, but that of each minor TSS decreases with the gene expression level.
• For a pair of human-mouse orthologous genes, the one with the higher expression level tends to show a lower TSS diversity.
• Cis-elements for major TSSs are selectively constrained while those for minor TSSs are not.
Deleterious fraction (species): 88% (human)
Reference: [70]

Mistranscription
Genomic evidence:
• Lower mistranscription rates in more highly expressed genes.
Deleterious fraction (species): NA
Reference: [71]

Alternative (linear) splicing (AS)
Genomic evidence:
• Lower rates of AS in genes with more introns and those with higher expression levels.
Deleterious fraction (species): 92–98% (Paramecium tetraurelia); 68% (human)
Reference: [31]

Back-splicing
Genomic evidence:
• Back-splicing is orders of magnitude rarer than linear splicing.
• In a tissue, back-splicing rate and splicing amount are negatively correlated across genes.
• For a given gene, the back-splicing rate in a tissue tends to decrease with its splicing amount in that tissue.
• CircRNAs are overall evolutionarily unconserved.
• Splice sites are not selectively constrained for back-splicing.
• The overall prevalence of back-splicing in a species declines with its Ne.
Deleterious fraction (species): 97% (human)
Reference: [42]

Alternative polyadenylation (APA)
Genomic evidence:
• As the expression level of a gene rises, its polyadenylation diversity declines, relative use of the major polyadenylation site increases, and that of each minor site decreases.
• The mean number of polyadenylation signals per gene is below one half the random expectation.
• Signals for major but not minor polyadenylation sites are under purifying selection.
Deleterious fraction (species): 93% (human)
Reference: [30,72]

A-to-I editing
Genomic evidence:
• Nonsynonymous editing frequency and level are respectively lower than the synonymous counterparts.
• Nonsynonymous editing is rarer and editing level is lower in essential genes than in nonessential genes.
• Nonsynonymous editing is rarer and editing level is lower in genes under stronger evolutionary constraints or of higher expression levels.
• Edited As are more likely than unedited As to be replaced with Gs in genome evolution.
• The vast majority of coding site editing is not shared between human and mouse.
Deleterious fraction (species): NA
Reference: [50–52]

C-to-U editing
Genomic evidence:
• Nonsynonymous editing frequency and level are respectively lower than the synonymous counterparts.
• Nonsynonymous editing frequency decreases with gene importance and evolutionary constraint.
• Evolutionarily conserved sites tend to have lower nonsynonymous editing levels.
Deleterious fraction (species): NA
Reference: [73]

m6A modification
Genomic evidence:
• Relative to comparable unmethylated As, m6As are overall no more conserved in yeasts and only slightly more conserved in mammals.
• m6As and comparable unmethylated As have no significant difference in single nucleotide polymorphism (SNP) density or SNP site frequency spectrum.
• The methylation status of a gene, not necessarily the specific sites methylated in the gene, is subject to purifying selection for no more than ∼20% of m6A-modified genes.
Deleterious fraction (species): NA
Reference: [74]

Alternative translation initiation (ATLI)
Genomic evidence:
• The extent of ATLI for a gene is negatively correlated with its translational amount.
• Kozak regions of alternative initiation sites are unconserved.
• The fractional use of non-AUG start codons in a gene decreases with the translational amount.
Deleterious fraction (species): 75% (human)
Reference: [20]

Mistranslation
Genomic evidence:
• Lower mistranslation rates are observed in more highly expressed genes.
Deleterious fraction (species): NA
Reference: [75]

Stop codon readthrough
Genomic evidence:
• Readthrough rates decrease with gene expression levels.
• Readthrough motifs are underrepresented in highly expressed genes.
• Readthrough regions do not show higher sequence conservations than comparable regions that are untranslated.
Deleterious fraction (species): 72% (yeast); 68% (fruit fly)
Reference: [63]

^a Shown are Fdel except for alternative splicing and back-splicing, where Odel is shown (see Box 2).

Alternative (linear) splicing and back-splicing

Several studies suggested that many alternative mRNAs result from erroneous splicing [32–34]. In particular, Saudemont et al. [31] observed lower rates of alternative splicing in genes with higher expression levels, as predicted by the error hypothesis (Box 2). The error hypothesis further predicts that the rate of alternative splicing per intron decreases with the number of introns in the gene. This is because, given the splicing error rate per intron, the total splicing error rate per gene rises with the intron number; consequently, selection against splicing error is stronger in genes with more introns. Indeed, alternative splicing per intron is lower in genes with more introns [31]. Saudemont et al. estimated that, for genes with median expression levels in the intron-rich unicellular eukaryote Paramecium tetraurelia, at least 92–98% of splice variants are errors, and the corresponding number is at least two thirds in humans. The scarcity of alternative isoforms in proteomic data suggests that the vast majority of mature mRNA isoforms do not result in stable proteins [35], further supporting the error hypothesis.
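
The intron-number argument can be made concrete with a small calculation. The sketch below assumes, purely for illustration, that every intron is mis-spliced independently and with the same probability; neither assumption comes from the cited work, and the function name per_gene_error_rate and the rate of 0.01 are hypothetical.

```python
def per_gene_error_rate(per_intron_rate, n_introns):
    """Probability that a transcript carries at least one mis-spliced intron,
    assuming introns are mis-spliced independently at the same rate."""
    if not 0.0 <= per_intron_rate <= 1.0:
        raise ValueError("per_intron_rate must be in [0, 1]")
    if n_introns < 0:
        raise ValueError("n_introns must be non-negative")
    return 1.0 - (1.0 - per_intron_rate) ** n_introns


# Hypothetical per-intron error rate, for illustration only.
rate = 0.01
by_intron_count = {n: per_gene_error_rate(rate, n) for n in (1, 5, 20)}
```

Under these assumptions the per-gene error rate grows from about 1% with one intron to about 18% with 20 introns, which is why selection on the per-intron rate is expected to be stronger in intron-rich genes.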

First discovered in 1979 [36], circRNAs are generated by back-splicing, which was initially regarded as a splicing error [37]. Since the 2010s, however, numerous circRNAs have been discovered via high-throughput RNA sequencing (e.g., >50% of human protein-coding genes produce circRNAs [38]). This prevalence has led to the common belief that circRNAs are a large group of functional RNAs widely used in gene regulation [14,39–41], reflected by a high research activity: “circRNA” appeared in the title or abstract of over 2900 papers in 2021 alone. However, six lines of evidence supporting the error hypothesis were found from an analysis of 11 shared tissues among human, macaque, and mouse [42]. First, the probability of back-splicing is minute: the median proportion of splicing events that are back-splicing is <0.2% across all genes. The orders-of-magnitude lower rate of back-splicing than that of linear splicing is expected if back-splicing is erroneous. Second, as predicted by the error hypothesis (Box 2), a negative correlation across genes was observed between the total splicing amount of a gene and its back-splicing rate, in each tissue of each species studied. Third, across tissues, the back-splicing rate of a gene tends to decrease with its splicing amount, as the error hypothesis predicts (Box 2). Fourth, the fraction of human (or macaque) back-splicing that is shared with the mouse is only slightly greater than the chance expectation, suggesting that almost all back-splicing is evolutionarily unconserved. Fifth, linear and back-splicing share splice sites; both intraspecific polymorphism and interspecific divergence of splice sites are negatively correlated with the amount of linear but not back-splicing, suggesting that back-splicing does not impose selective constraints on splice sites.
Sixth, the overall back-splicing rate is lower in mouse than that in macaque, which is in turn lower than that in human, as expected under the error hypothesis given these species’ effective population sizes (Ne) (Box 2). It was estimated that, across human tissues, a median of 98.8% of back-splicing is deleterious prior to the action of selection [42]. Among the observed back-splicing, which is the remainder upon the action of selection, the corresponding value is 97.5% [42]. The above two percentages are similar, indicating that only a small fraction of deleterious back-splicing has been selectively purged. This is at least in part because the intensity of selection against errors in gene expression rises with the expression level of the gene (Box 2), but the expression level exhibits a power-law distribution across genes, with only a small fraction of genes being highly expressed [43]. Together, these observations provide strong evidence that an overwhelming majority of circRNAs arise from splicing errors and so may be called junk RNAs.

A-to-I editing

One of >160 types of posttranscriptional modifications of RNA, A-to-I editing is catalyzed by adenosine deaminases acting on RNA (ADARs). Because I is recognized by the ribosome as guanine (G), A-to-I editing of a coding site in mRNA is equivalent to an A-to-G change. Such editing is referred to as nonsynonymous editing or recoding when it leads to an amino acid change in the protein; otherwise, it is referred to as synonymous editing. ADARs and A-to-I editing of mRNAs transcribed from the nuclear genome appear to exist in all animals [44,45], but the origin and adaptive value of the editing are unclear. For a handful of sites, disrupting A-to-I recoding is lethal or strongly deleterious [46], prompting the view that recoding offers an “extreme advantage” [27]. It is commonly stated that coding RNA editing expands transcriptome and proteome diversities such that the same gene codes for proteins of different functions, which could be deployed in different tissues or at different times [47,48]. Interestingly, however, most of the over 1 million edited sites in the human genome are in noncoding regions [49]. In coding regions, about 2000 sites are edited, but at each site the fraction of mRNA molecules edited (i.e., editing level) is typically much lower than 50% [49]. Xu and Zhang asked whether coding A-to-I editing is generally advantageous [50]. Under the assumption that synonymous editing is more or less neutral but nonsynonymous editing has potential fitness effects, they compared the fraction of A sites subject to nonsynonymous editing (fN) and that subject to synonymous editing (fS) in human coding regions. They found that fN is about two thirds of fS. Because, in the absence of selection, the probability of editing should be equal between synonymous and nonsynonymous sites, the above finding suggests that about one third of recoding events have been purged by natural selection.
At the observed editing sites, the median nonsynonymous editing level (LN) is significantly lower than that of synonymous editing (LS), indicating that selection has suppressed the nonsynonymous editing level. Moreover, fN/fS and LN/LS are significantly lower in essential genes than in nonessential genes, and these ratios decrease with the evolutionary constraint as well as the expression level of the gene. These patterns suggest the overall deleterious nature of nonsynonymous editing, which is probably because of ADARs’ limited substrate specificity [50]. The conclusion that A-to-I editing in human coding regions is mostly nonadaptive is further supported by the findings that (i) during evolution, edited As are more likely than unedited As to be replaced with Gs but not with Ts or Cs [50], (ii) among nonsynonymously edited As, those that are evolutionarily least conserved exhibit the highest editing levels [50], (iii) only a handful of editing events have known functions [46], and (iv) only 1.8% of human coding RNA editing events are shared with mouse [51,52].
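
The step from the fN/fS ratio to the fraction of recoding events purged can be written out explicitly. This is a minimal sketch of that arithmetic, under the stated assumption that synonymous editing is neutral; the function name purged_fraction is hypothetical, and the absolute frequencies are placeholders chosen only so that fN is two thirds of fS, as reported above.

```python
def purged_fraction(f_n, f_s):
    """Estimate the share of nonsynonymous editing removed by selection.

    Assumes synonymous editing is neutral, so f_s approximates the editing
    frequency expected at nonsynonymous sites in the absence of selection.
    """
    if f_s <= 0:
        raise ValueError("f_s must be positive")
    if f_n < 0:
        raise ValueError("f_n must be non-negative")
    return 1.0 - f_n / f_s


# Only the ratio matters; these absolute frequencies are placeholders.
f_s = 0.003
f_n = 0.002
purged = purged_fraction(f_n, f_s)
```

A ratio of two thirds gives a purged fraction of about one third. A negative result (fN greater than fS) would instead indicate an excess of nonsynonymous editing relative to the neutral expectation.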

While the above human-based findings are probably true in most animals, some organisms exhibit a different pattern. Tens of thousands of coding A-to-I editing events, including a large fraction of recoding events, occur in the neural tissues of coleoids (octopuses, squids, and cuttlefishes) [53]. Subsequent analysis, however, suggests that most of the recoding events, especially species-specific recoding events, which constitute the great majority, are nonadaptive [54]. Interestingly, a large number of A-to-I recoding events have also been observed in filamentous ascomycetes with features distinct from those in animals, although the editing enzyme has yet to be identified [55,56]. It is currently unknown whether most of the recoding events in this group of fungi are adaptive as suggested [56] or nonadaptive as in coleoids [54].

Although the functional importance of editing at a site can be assessed by disrupting editing at the site, we emphasize that the editing may not be adaptive even when disrupting it is harmful [54]. Consider, for example, an exonic G site where A is functionally prohibited. Gaining the editing capability may permit the fixation of a G-to-A mutation at the site, because A can be converted to I (equivalent to G) in at least a fraction of mRNA molecules of the gene. Upon the G-to-A substitution, the functionality of the gene becomes dependent on editing, so disrupting editing at the site would be deleterious. Nonetheless, no adaptation has occurred in this case, because the current genotype with an editable A is no fitter than the original one with G. In this so-called harm-permitting model, all that editing does is permit previously prohibited substitutions, without increasing fitness [54].

Implications

Paradigm shift in studying gene product diversity

The current research on gene product diversity typically starts with the discovery of a new variation such as a previously unknown posttranscriptional modification, which is followed by a genomic-scale survey and a mechanistic probe of this modification, that is, identifying all sites in the transcriptome that are modified and the enzymes responsible for the modification. The prevalence of the modification and especially the finding of dedicated enzymes often lead to a suggestion that the modification is functionally important. Various phenotypes are then examined upon the deletion of the enzyme genes or prohibition of the modification at a particular site; the modification is declared generally important (and adaptive) if any phenotypic effect is detected. However, detecting a phenotypic effect that has a fitness consequence in these experiments shows only that the modification of at least one site is important; it does not prove that the modification is generally important, nor does it prove that the modification is adaptive at a specific site or in general (e.g., under the harm-permitting model explained above). The existence of dedicated enzymes or mechanisms for the modification does not prove its general importance either, because the action of the dedicated enzymes may be functionally needed for only a small fraction of the observed diversity and because even erroneous biological processes that decrease fitness have a mechanistic basis. Genome-wide patterns of gene product diversities (Table 1) strongly suggest that most of them arise from molecular errors and are nonadaptive. This understanding requires a paradigm shift in research: these diversities should be considered molecular errors unless proven otherwise, not the other way around. This understanding also implies that most diversities in gene products are slightly deleterious instead of beneficial and that searching for their benefits is likely futile.
It is interesting to note that, besides gene product diversity, the amount of each product also varies among isogenic individuals in the same environment, a phenomenon known as gene expression noise [57]. There is a general agreement that gene expression noise is caused by molecular error in gene expression and is generally deleterious [58], although under special circumstances, expression noise of certain genes can be beneficial and important [59,60]. Gene product diversity is not fundamentally different from gene expression noise.

Selection against error as a primary force shaping features of gene expression

In many steps of gene expression, the gene product diversity generated (e.g., the fraction of protein molecules with stop codon readthrough) decreases with the expression level of the gene, splicing amount, or translational amount of the gene concerned, creating genome-scale trends that are predicted by the error hypothesis (Table 1 and Box 2). This observation suggests that selection against molecular error plays a major role in shaping gene product diversity and features of gene expression, echoing proposals that such selection shapes gene, protein, and genome evolution [61,62]. Because gene expression is central to gene function, error minimization (and mitigation) must be a guiding principle in functional genomics. Because the splicing amount and translational amount of a gene are both proportional to the expression level of the gene, all of the diversities discussed here scale with the gene expression level. Just as the variation in Ne explains the variation in genome architecture among species [28], the variation in gene expression level explains the across-gene variation in gene product diversity. In both cases, it is the strength of selection, which can be modulated by expression level, relative to the power of genetic drift determined by Ne that plays a predominant role.

Mechanisms for error minimization

Generally speaking, the rate of a particular type of molecular error such as stop codon readthrough is determined by both cis-acting and trans-acting factors. Cis-factors are usually sequence motifs surrounding the site where the error occurs, such as the particular stop codon and neighboring nucleotides that influence the error rate locally [63], whereas trans-factors are usually diffusible factors in the cell that impact the error rate globally such as ribosomes. This distinction between cis- and trans-factors helps predict different evolutionary outcomes depending on the strength of selection against error [64]. For example, when Ne is large and gene expression is high, even selection for a reduced local error rate could be strong enough to drive adaptive cis-factor changes. But when Ne is small or the expression is low, selection for a reduced local error rate may be too weak to drive adaptive cis-factor changes.
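The dependence of cis-factor evolution on Ne and expression level can be illustrated with a toy calculation. The sketch below rests on two simplifying assumptions that are not taken from the text: the selective advantage of a cis-change is the product of the expression level, the reduction in local error rate, and a fixed fitness cost per erroneous molecule, and selection overcomes drift when Ne|s| exceeds 1, a common rule of thumb. All names and parameter values are hypothetical.

```python
def selection_coefficient(expression_level, error_rate_reduction,
                          cost_per_error_molecule):
    """Assumed toy model: the benefit of a cis-change scales linearly with
    how many erroneous molecules it prevents."""
    return expression_level * error_rate_reduction * cost_per_error_molecule


def selection_overcomes_drift(ne, s):
    """Rule-of-thumb criterion (an assumption here): Ne * |s| > 1."""
    return ne * abs(s) > 1.0


# Hypothetical parameter values, chosen only to illustrate the contrast.
COST = 1e-9        # fitness cost per erroneous molecule
REDUCTION = 0.01   # drop in local error rate caused by the cis-change

scenarios = {
    "large Ne, high expression": (1e8, 1e4),
    "small Ne, high expression": (1e4, 1e4),
    "large Ne, low expression": (1e8, 1e1),
}

for label, (ne, expression) in scenarios.items():
    s = selection_coefficient(expression, REDUCTION, COST)
    print(label, "->", selection_overcomes_drift(ne, s))
```

Under these assumed values, only the combination of large Ne and high expression satisfies the criterion, which mirrors the qualitative prediction in the paragraph above.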

When different error rates are observed in the same tissue for different genes, this among-gene variation in error rate must be due to their different cis-factors because all genes in the same cell share trans-factors. When different error rates are observed for the same gene in different tissues, this difference must be caused by a difference in trans-factors in different tissues, because the cis-factors of a gene do not vary across tissues. Because the error hypothesis is supported by both among-gene comparisons in the same tissue and among-tissue comparisons for the same gene (Table 1), both adaptive cis- and trans-changes for lower error rates must have occurred. The specific molecular changes, however, are usually less well understood and are worth studying in the future.

Implications for the molecular basis of organismal complexity

Given that organismal complexity, often quantified by the number of cell types, is correlated with neither the genome size nor gene number, some authors have hypothesized that the generation of diverse products from each gene is a primary mechanism behind organismal complexity because it could substantially increase transcriptome and proteome diversities both within and between cell types [65–67]. This hypothesis is based on the implicit assumption that the diverse gene products from each gene generally have distinct and beneficial functions. The finding that much of this diversity results from molecular error and is nonadaptive casts doubt on this hypothesis.

Implications for health and disease

The overall deleterious nature of gene product diversity suggests that studying the diversity may shed light on disease mechanisms. For example, a particular position in human AZIN1 is subject to A-to-I editing, causing the recoding of Ser to Gly. In 45 vertebrate genomes examined, Ser, Arg, Gln, His, Asn, and Lys, but not Gly, are found at the site [50], suggesting that the Ser-to-Gly editing is deleterious. Indeed, an increase in the editing level of the site causes hepatocellular carcinoma [68]. In another example, gaining a proximal polyadenylation site in IRF5 causes systemic lupus erythematosus [69]. These anecdotes call for systematic explorations of noncanonical gene products as a general mechanism of disease.
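The reasoning applied to the AZIN1 site can be written as a simple check of whether the post-editing residue is ever observed at the orthologous site in other species. The sketch below uses the residues reported above for that site; the function name is hypothetical, and absence from the observed set is only suggestive of a deleterious effect, not proof.

```python
def edited_residue_is_observed(observed_residues, edited_residue):
    """Return True if the post-editing residue occurs at the orthologous
    site in at least one of the surveyed genomes."""
    return edited_residue in set(observed_residues)


# Residues found at the AZIN1 editing site across the vertebrate genomes
# examined, as listed in the text.
azin1_site_residues = ["Ser", "Arg", "Gln", "His", "Asn", "Lys"]

# Editing recodes Ser to Gly; Gly is absent from the observed set.
print(edited_residue_is_observed(azin1_site_residues, "Gly"))
```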

Concluding remarks

Molecular error in the transmission of genetic information, namely mutation, is a long-standing and well-appreciated subject in biology. As shown here, molecular error in the expression of genetic information is also prevalent and important, but studies of this type of error have just begun and many key questions remain open (see Outstanding Questions). Regardless, patterns of gene product diversity have revealed that cellular life is full of error even in gene expression, one of the most important biological processes, in contrast with the commonly depicted picture of an orderly and harmonious cellular life. Appreciating this reality will be important for understanding many aspects of life and its evolution.

Outstanding Questions.

  • Many forms of gene product diversity have not been studied in the context of the error hypothesis, including the vast majority of >160 types of posttranscriptional modifications. Part of the reason is that reliable genome-scale data are lacking for most of these diversities. It would be interesting to test the error hypothesis when such data become available.

  • Specific molecular mechanisms that lower rates of various types of error in gene expression are poorly understood. Having this mechanistic understanding could help synthetic biologists design organisms with lower error rates.

  • The error rate appears to vary across different steps of gene expression (e.g., the mistranscription rate is generally lower than the mistranslation rate). What general principles and factors govern this variation of error rate among different processes?

  • As mentioned, in theory, there are four reasons why the rate of a particular molecular error has not been selectively reduced to zero. In reality, which reason is the most important one?

  • While molecular errors could turn into adaptations, such events are likely a minority. Individual cases defying general patterns predicted by the error hypothesis (e.g., a high back-splicing rate found in a gene with a high splicing amount [42] or a high nonsynonymous editing level at a conserved site [52]) may be considered candidate adaptations. In general, however, detecting potentially adaptive diversities from the sea of largely deleterious ones remains challenging.

  • While this article focuses on molecular errors in gene expression, molecular errors in other cellular processes presumably abound as well. However, these other errors are more difficult to study because they are generally harder to quantify and may not exhibit a simple scaling as found for errors in gene expression. The production of tens of thousands of long intergenic non-coding RNAs [70,71] is of particular interest, because it is unclear whether most of them are functional or results of erroneous transcription.

  • How does molecular error in gene expression and other cellular processes constrain and/or channel macroevolution? Do they play a major role in determining what types of organisms and biological phenomena exist on this planet?

HIGHLIGHTS.

  • Patterns of gene product diversity suggest that the diversity arises mostly from nonadaptive, molecular errors in gene expression.

  • That gene product diversity largely arises from molecular errors requires the diversity to be considered nonadaptive unless proven otherwise.

  • Natural selection against molecular error is a main force shaping genome-wide trends of features of gene expression.

  • Abundant molecular errors in the expression of genetic information contrast with the commonly depicted picture of an orderly and harmonious cellular life, requiring consideration of the roles of molecular errors in biology, medicine, and evolution.

Acknowledgments

We thank members of the Zhang laboratory and two anonymous reviewers for valuable comments. This work was supported by U.S. National Institutes of Health (R35GM109484 to J.Z.), National Natural Science Foundation of China (32100518 to C.X.), and Medical-Engineering Crossover Fund of Shanghai Jiao Tong University (YG2022QN084 to C.X.).

Glossary

A-to-I editing

a type of posttranscriptional modification that converts adenosine to inosine at specific positions of a transcript

Adenosine deaminases acting on RNA (ADARs)

a group of enzymes that perform A-to-I editing in animals

Alternative polyadenylation (APA)

the phenomenon that polyadenylation of a transcript occurs at one of several possible positions

Alternative splicing

the process of selecting different combinations of splice sites within a pre-mRNA to produce variably spliced mRNAs

Alternative transcriptional initiation (ATI)

the phenomenon that transcription of a gene initiates at one of several possible genomic positions

Alternative translation initiation (ATLI)

the phenomenon that translation of a gene starts at one of several possible positions that may or may not be AUG codons. ATLI is in part attributable to ATI

Back-splicing

spliceosome-mediated splicing that joins a 5’ splice site (splice donor) of a downstream exon with a 3’ splice site (splice acceptor) of an upstream exon to yield a circular RNA

C-to-U editing

a type of posttranscriptional modification that converts cytosine to uridine at specific positions of a transcript

Circular RNA (circRNA)

a type of single-stranded RNA that, unlike linear RNA, forms a covalently closed continuous loop. CircRNAs are the result of back-splicing

Effective population size (Ne)

number of individuals in an idealized population that would have the same effect of random sampling on gene frequency as that in the actual population

Genetic drift

changes in the frequency of an allele in a population due to random sampling of organisms or gametes

m6A

methylation at the nitrogen-6 position of certain adenosines in RNA molecules

Mistranscription

incorporation of nucleotides that are not encoded by the genome into transcripts during transcription

Mistranslation

incorporation of amino acids that are not encoded by the mRNA into proteins during translation

Natural selection

process that promotes the spread of beneficial mutations (aka positive selection) or hinders the spread of deleterious mutations (aka negative selection)

Posttranscriptional modifications (aka RNA editing)

alterations of RNA molecules after transcription, through insertion, deletion, or modification of nucleotides, not including RNA processing events such as splicing, capping, or polyadenylation

Stop codon readthrough

the phenomenon that a translating ribosome goes past the stop codon and continues translating into the otherwise untranslated region of a transcript

The adaptive hypothesis of gene product diversity

a supposition that gene product diversity results mostly from gene regulations that are generally beneficial

The molecular error hypothesis of gene product diversity

a supposition that gene product diversity is largely attributable to imprecisions of the molecular processes of gene expression and is mostly deleterious

Transcription start site (TSS)

the genomic location where the first DNA nucleotide is transcribed into RNA

Untranslated regions (UTRs)

untranslated regions in mRNAs, including 5’ UTRs and 3’ UTRs

Footnotes

Declaration of interests

The authors declare no competing interests.

Publisher's Disclaimer: This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.

References

  • 1. Davuluri RV et al. (2008) The functional consequences of alternative promoter use in mammalian genomes. Trends Genet 24, 167–177
  • 2. Sandelin A et al. (2007) Mammalian RNA polymerase II core promoters: insights from genome-wide studies. Nat Rev Genet 8, 424–436
  • 3. Kimura K et al. (2006) Diversification of transcriptional modulation: large-scale identification and characterization of putative alternative promoters of human genes. Genome Res 16, 55–65
  • 4. Forrest AR et al. (2014) A promoter-level mammalian expression atlas. Nature 507, 462–470
  • 5. Shephard EA et al. (2007) Alternative promoters and repetitive DNA elements define the species-dependent tissue-specific expression of the FMO1 genes of human and mouse. Biochem J 406, 491–499
  • 6. Pozner A et al. (2007) Developmentally regulated promoter-switch transcriptionally controls Runx1 function during embryonic hematopoiesis. BMC Dev Biol 7, 84
  • 7. Gout JF et al. (2017) The landscape of transcription errors in eukaryotic cells. Sci Adv 3, e1701484
  • 8. Elkon R et al. (2013) Alternative cleavage and polyadenylation: extent, regulation and function. Nat Rev Genet 14, 496–506
  • 9. Di Giammartino DC et al. (2011) Mechanisms and consequences of alternative polyadenylation. Mol Cell 43, 853–866
  • 10. Derti A et al. (2012) A quantitative atlas of polyadenylation in five mammals. Genome Res 22, 1173–1183
  • 11. Lianoglou S et al. (2013) Ubiquitously transcribed genes use alternative polyadenylation to achieve tissue-specific expression. Genes Dev 27, 2380–2396
  • 12. Ulitsky I et al. (2012) Extensive alternative polyadenylation during zebrafish development. Genome Res 22, 2054–2066
  • 13. Kalsotra A and Cooper TA (2011) Functional consequences of developmentally regulated alternative splicing. Nat Rev Genet 12, 715–729
  • 14. Kristensen LS et al. (2019) The biogenesis, biology and characterization of circular RNAs. Nat Rev Genet 20, 675–691
  • 15. Pan Q et al. (2008) Deep surveying of alternative splicing complexity in the human transcriptome by high-throughput sequencing. Nat Genet 40, 1413–1415
  • 16. Boccaletto P et al. (2022) MODOMICS: a database of RNA modification pathways. 2021 update. Nucleic Acids Res 50, D231–D235
  • 17. Li S and Mason CE (2014) The pivotal regulatory landscape of RNA modifications. Annu Rev Genomics Hum Genet 15, 127–150
  • 18. Gilbert WV et al. (2016) Messenger RNA modifications: Form, distribution, and function. Science 352, 1408–1412
  • 19. Lee S et al. (2012) Global mapping of translation initiation sites in mammalian cells at single-nucleotide resolution. Proc Natl Acad Sci U S A 109, E2424–2432
  • 20. Xu C and Zhang J (2020) Mammalian alternative translation initiation is mostly nonadaptive. Mol Biol Evol 37, 2015–2028
  • 21. Wan J and Qian SB (2014) TISdb: a database for alternative translation initiation in mammalian cells. Nucleic Acids Res 42, D845–850
  • 22. Drummond DA and Wilke CO (2009) The evolutionary consequences of erroneous protein synthesis. Nat Rev Genet 10, 715–724
  • 23. Ribas de Pouplana L et al. (2014) Protein mistranslation: friend or foe? Trends Biochem Sci 39, 355–362
  • 24. von der Haar T and Tuite MF (2007) Regulated translational bypass of stop codons in yeast. Trends Microbiol 15, 78–86
  • 25. Dunn JG et al. (2013) Ribosome profiling reveals pervasive and regulated stop codon readthrough in Drosophila melanogaster. Elife 2, e01179
  • 26. de Klerk E and ’t Hoen PA (2015) Alternative mRNA transcription, processing, and translation: insights from RNA sequencing. Trends Genet 31, 128–139
  • 27. Nishikura K (2010) Functions and regulation of RNA editing by ADAR deaminases. Annu Rev Biochem 79, 321–349
  • 28. Lynch M (2007) The Origins of Genome Architecture. Sinauer
  • 29. Warnecke T and Hurst LD (2010) GroEL dependency affects codon usage--support for a critical role of misfolding in gene evolution. Mol Syst Biol 6, 340
  • 30. Xu C and Zhang J (2018) Alternative polyadenylation of mammalian transcripts is generally deleterious, not adaptive. Cell Syst 6, 734–742
  • 31. Saudemont B et al. (2017) The fitness cost of mis-splicing is the main determinant of alternative splicing patterns. Genome Biol 18, 208
  • 32. Pickrell JK et al. (2010) Noisy splicing drives mRNA isoform diversity in human cells. PLoS Genet 6, e1001236
  • 33. Melamud E and Moult J (2009) Structural implication of splicing stochastics. Nucleic Acids Res 37, 4862–4872
  • 34. Stepankiw N et al. (2015) Widespread alternative and aberrant splicing revealed by lariat sequencing. Nucleic Acids Res 43, 8488–8501
  • 35. Tress ML et al. (2017) Alternative splicing may not be the key to proteome complexity. Trends Biochem Sci 42, 98–110
  • 36. Hsu MT and Coca-Prados M (1979) Electron microscopic evidence for the circular form of RNA in the cytoplasm of eukaryotic cells. Nature 280, 339–340
  • 37. Cocquerelle C et al. (1993) Mis-splicing yields circular RNA molecules. FASEB J 7, 155–160
  • 38. Ji P et al. (2019) Expanded expression landscape and prioritization of circular RNAs in mammals. Cell Rep 26, 3444–3460
  • 39. Memczak S et al. (2013) Circular RNAs are a large class of animal RNAs with regulatory potency. Nature 495, 333–338
  • 40. Li X et al. (2018) The biogenesis, functions, and challenges of circular RNAs. Mol Cell 71, 428–442
  • 41. Salzman J (2016) Circular RNA expression: Its potential regulation and function. Trends Genet 32, 309–316
  • 42. Xu C and Zhang J (2021) Mammalian circular RNAs result largely from splicing errors. Cell Rep 36, 109439
  • 43. Ueda HR et al. (2004) Universality and flexibility in gene expression from bacteria to human. Proc Natl Acad Sci U S A 101, 3765–3769
  • 44. Porath HT et al. (2017) Massive A-to-I RNA editing is common across the Metazoa and correlates with dsRNA abundance. Genome Biol 18, 185
  • 45. Hung LY et al. (2018) An evolutionary landscape of A-to-I RNA editome across metazoan species. Genome Biol Evol 10, 521–537
  • 46. Maas S et al. (2006) A-to-I RNA editing and human disease. RNA Biol 3, 1–9
  • 47. Nishikura K (2016) A-to-I editing of coding and non-coding RNAs by ADARs. Nat Rev Mol Cell Biol 17, 83–96
  • 48. Gommans WM et al. (2009) RNA editing: a driving force for adaptive evolution? Bioessays 31, 1137–1145
  • 49. Ramaswami G et al. (2013) Identifying RNA editing sites using RNA sequencing data alone. Nat Methods 10, 128–132
  • 50. Xu G and Zhang J (2014) Human coding RNA editing is generally nonadaptive. Proc Natl Acad Sci U S A 111, 3769–3774
  • 51. Pinto Y et al. (2014) Mammalian conserved ADAR targets comprise only a small fragment of the human editosome. Genome Biol 15, R5
  • 52. Xu G and Zhang J (2015) In search of beneficial coding RNA editing. Mol Biol Evol 32, 536–541
  • 53. Liscovitch-Brauer N et al. (2017) Trade-off between transcriptome plasticity and genome evolution in cephalopods. Cell 169, 191–202
  • 54. Jiang D and Zhang J (2019) The preponderance of nonsynonymous A-to-I RNA editing in coleoids is nonadaptive. Nat Commun 10, 5411
  • 55. Liu H et al. (2016) Genome-wide A-to-I RNA editing in fungi independent of ADAR enzymes. Genome Res 26, 499–509
  • 56. Liu H et al. (2017) A-to-I RNA editing is developmentally regulated and generally adaptive for sexual reproduction in Neurospora crassa. Proc Natl Acad Sci U S A 114, E7756–E7765
  • 57. Raser JM and O’Shea EK (2005) Noise in gene expression: origins, consequences, and control. Science 309, 2010–2013
  • 58. Wang Z and Zhang J (2011) Impact of gene expression noise on organismal fitness and the efficacy of natural selection. Proc Natl Acad Sci U S A 108, E67–76
  • 59. Zhang Z et al. (2009) Positive selection for elevated gene expression noise in yeast. Mol Syst Biol 5, 299
  • 60. Chang HH et al. (2008) Transcriptome-wide noise controls lineage choice in mammalian progenitor cells. Nature 453, 544–547
  • 61. Warnecke T and Hurst LD (2011) Error prevention and mitigation as forces in the evolution of genes and genomes. Nat Rev Genet 12, 875–881
  • 62. Zhang J and Yang JR (2015) Determinants of the rate of protein sequence evolution. Nat Rev Genet 16, 409–420
  • 63. Li C and Zhang J (2019) Stop-codon read-through arises largely from molecular errors and is generally nonadaptive. PLoS Genet 15, e1008141
  • 64. Rajon E and Masel J (2011) Evolution of molecular error rates and the consequences for evolvability. Proc Natl Acad Sci U S A 108, 1082–1087
  • 65. Graveley BR (2001) Alternative splicing: increasing diversity in the proteomic world. Trends Genet 17, 100–107
  • 66. Merkin J et al. (2012) Evolutionary dynamics of gene and isoform regulation in Mammalian tissues. Science 338, 1593–1599
  • 67. Barbosa-Morais NL et al. (2012) The evolutionary landscape of alternative splicing in vertebrate species. Science 338, 1587–1593
  • 68. Chen L et al. (2013) Recoding RNA editing of AZIN1 predisposes to hepatocellular carcinoma. Nat Med 19, 209–216
  • 69. Graham RR et al. (2007) Three functional variants of IFN regulatory factor 5 (IRF5) define risk and protective haplotypes for human lupus. Proc Natl Acad Sci U S A 104, 6758–6763
  • 70. Hon CC et al. (2017) An atlas of human long non-coding RNAs with accurate 5’ ends. Nature 543, 199–204
  • 71. Palazzo AF and Koonin EV (2020) Functional long non-coding RNAs evolve from junk transcripts. Cell 183, 1151–1161
  • 72. Xu C et al. (2019) Evidence that alternative transcriptional initiation is largely nonadaptive. PLoS Biol 17, e3000197
  • 73. Meer KM et al. (2020) High transcriptional error rates vary as a function of gene expression level. Genome Biol Evol 12, 3754–3761
  • 74. Xu C and Zhang J (2020) A different perspective on alternative cleavage and polyadenylation. Nat Rev Genet 21, 63
  • 75. Liu Z and Zhang J (2018) Human C-to-U coding RNA editing is largely nonadaptive. Mol Biol Evol 35, 963–969
  • 76. Liu Z and Zhang J (2018) Most m6A RNA modifications in protein-coding regions are evolutionarily unconserved and likely nonfunctional. Mol Biol Evol 35, 666–675
  • 77. Mordret E et al. (2019) Systematic detection of amino acid substitutions in proteomes reveals mechanistic basis of ribosome errors and selection for translation fidelity. Mol Cell 75, 427–441
  • 78. Arce L et al. (2006) Diversity of LEF/TCF action in development and disease. Oncogene 25, 7492–7504
  • 79. Peterson ML (2007) Mechanisms controlling production of membrane and secreted immunoglobulin during B cell development. Immunol Res 37, 33–46
  • 80. Salz HK and Erickson JW (2010) Sex determination in Drosophila: The view from the top. Fly (Austin) 4, 60–70
  • 81. Abdelmohsen K et al. (2017) Identification of HuR target circular RNAs uncovers suppression of PABPN1 translation by CircPABPN1. RNA Biol 14, 361–369
  • 82. Pamudurti NR et al. (2017) Translation of CircRNAs. Mol Cell 66, 9–21
  • 83. Garrett S and Rosenthal JJ (2012) RNA editing underlies temperature adaptation in K+ channels from polar octopuses. Science 335, 848–851
  • 84. Zhao BS et al. (2017) Post-transcriptional gene regulation by mRNA modifications. Nat Rev Mol Cell Biol 18, 31–42
  • 85. Zaccara S et al. (2019) Reading, writing and erasing mRNA methylation. Nat Rev Mol Cell Biol 20, 608–624
  • 86. Brubaker SW et al. (2014) A bicistronic MAVS transcript highlights a class of truncated variants in antiviral immunity. Cell 156, 800–811
  • 87. Miranda I et al. (2013) Candida albicans CUG mistranslation is a mechanism to create cell surface variation. mBio 4, e00285–13
  • 88. Firth AE and Brierley I (2012) Non-canonical translation in RNA viruses. J Gen Virol 93, 1385–1409
  • 89. Honigman A et al. (1991) Cis acting RNA sequences control the Gag Pol translation readthrough in murine leukemia-virus. Virology 183, 313–319
  • 90. Schueren F et al. (2014) Peroxisomal lactate dehydrogenase is generated by translational readthrough in mammals. Elife 3, e03640
