Most investigators would agree that the most memorable achievement in the RNA field in the past 20 years has been the discovery of RNA interference and the subsequent elucidation of the diverse roles of noncoding RNAs, such as siRNAs and miRNAs. Since the seminal paper by Fire and Mello in 1998, the field has exploded, with over 47,000 papers falling within this category in PubMed. Opinions will vary on which major questions remain to be addressed in the RNA field in the next 20 years. However, if I were a betting person, I would put my money on RNA modifications and how they create the epitranscriptome.
The MODOMICS database for RNA modification was established by Henri Grosjean and colleagues in 2013, and the number of RNA modifications was estimated to be 144 within all phyla, with approximately 50 modifications in mammals. The majority of these modifications occur in tRNA; however, the number of enzymatic modification types and the incidence of modifications are rising within all classes of RNA. One of the next major challenges will be to sequence RNA directly, and not just cDNA, so that the true complexity of RNA can be revealed. It is sometimes forgotten that reverse transcriptase does not accurately copy modified nucleotides into cDNA. For example, inosine is read as guanosine, whereas N6-methyladenosine (m6A) does not affect base pairing, so it is read as adenosine. Often the reverse transcriptase stops or stalls when it encounters a modified base. Therefore, modified RNAs may be underrepresented in cDNA when traditional sequencing methods are applied. Currently, the best method of detecting the variety and extent of RNA modification is to perform mass spectrometry on the RNA sample. It is likely that within the next five years new sequencing techniques and machines will be developed that can identify the most common modifications in RNA; however, it may take longer before the full spectrum of modifications can be routinely identified.
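The loss of information during reverse transcription described above can be illustrated with a toy model. The sketch below is only an illustration under stated assumptions: the one-letter symbols "I" (inosine) and "m" (m6A) and the function name cdna_readout are hypothetical conventions chosen here, and the model ignores reverse transcriptase stalling altogether.

```python
# Toy model of how a cDNA-based readout reports modified nucleotides.
# Assumption: "I" stands for inosine and "m" for m6A; these symbols are
# illustrative only and are not a standard sequence alphabet.
RT_READOUT = {
    "A": "A",
    "C": "C",
    "G": "G",
    "U": "T",
    "I": "G",  # inosine is read as guanosine
    "m": "A",  # m6A does not affect base pairing, so it is read as adenosine
}


def cdna_readout(rna):
    """Return the sequence as it would be reported from cDNA (sense strand)."""
    return "".join(RT_READOUT[base] for base in rna)


unmodified = "GGACU"
methylated = "GGmCU"  # same RNA carrying m6A at the third position
edited = "GGICU"      # same RNA with the adenosine edited to inosine

print(cdna_readout(unmodified))
print(cdna_readout(methylated))
print(cdna_readout(edited))
```

In this model the methylated and unmodified RNAs give the same readout, so the m6A is invisible, whereas the edited RNA shows up only as an apparent A-to-G change.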
The Levanon group has estimated that there are over a hundred million editing sites with inosine in the human genome, which occur predominantly in transcripts encoding Alu elements. The most common modification present internally within mRNA is m6A, which occurs at more than three sites per mRNA molecule on average. There are three classes of proteins associated with m6A modification: “writers” that catalyze the reactions, “readers” that recognize the modification, and “erasers” that remove it. If other RNA modifications are as dynamic as this, then elucidating the true complexity of the RNA population will require analysis of RNA in different tissues under different conditions.
Once the amount and variation of RNA modification are uncovered, the next question to address is its biological function. Modification of nucleic acids is fundamental to genome defense, so one would postulate that RNA modification has a role in innate immunity; however, it would be naïve to presume that this would be the only biological function. For example, it has recently been demonstrated that the editing enzyme ADAR1 (adenosine deaminase that acts on RNA) is essential for innate immunity in mammals; however, this enzyme also edits transcripts encoding subunits of key receptors in the central nervous system, thereby modulating their activity. Some modifications may be important for stabilization and folding of the RNA, as observed in tRNA. Thus, modifications in mRNA could influence RNA structure, folding the RNA into unique structures not predicted from the primary sequence, which would influence the availability of the RNA for binding to other RNAs, such as miRNAs, or to proteins. Also, the stability of a particular RNA could vary from tissue to tissue depending on the presence or absence of a modification that could be tissue specific or vary with disease conditions such as cancer.
In addition to the traditional roles of modification in the folding and stability of RNA, modification can also generate protein diversity. Both the ADAR and APOBEC (apolipoprotein B mRNA editing enzyme, catalytic polypeptide-like) families of proteins create different protein isoforms through deamination of adenosine to inosine or cytosine to uracil, respectively. Yi-Tao Yu and colleagues demonstrated that pseudouridine (Ψ) in mRNA can also change codons; uridines in nonsense codons in yeast were replaced by Ψ and, surprisingly, ΨAA and ΨAG were found to code for serine and threonine, whereas ΨGA codes for tyrosine and phenylalanine.
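The recoding of nonsense codons by Ψ described above can be written out as a small lookup. This is a minimal sketch, not a model of translation: the table holds only the three pseudouridylated codons and the amino acids named in the text, and the names PSI_RECODING, pseudouridylate and decode are hypothetical helpers introduced for illustration.

```python
# The three nonsense (stop) codons of the standard genetic code.
STOP_CODONS = {"UAA", "UAG", "UGA"}

# Recoding reported for pseudouridylated nonsense codons in yeast,
# restricted to the amino acids named in the text above.
PSI_RECODING = {
    "ΨAA": ("Ser", "Thr"),
    "ΨAG": ("Ser", "Thr"),
    "ΨGA": ("Tyr", "Phe"),
}


def pseudouridylate(codon):
    """Replace the uridine of a nonsense codon with pseudouridine (Ψ)."""
    if codon not in STOP_CODONS:
        raise ValueError("only nonsense codons are handled in this sketch")
    return codon.replace("U", "Ψ", 1)


def decode(codon):
    """Return the possible readings of a nonsense or Ψ-containing codon."""
    if codon in PSI_RECODING:
        return PSI_RECODING[codon]
    if codon in STOP_CODONS:
        return ("Stop",)
    raise ValueError("codon not covered by this sketch")


for stop in sorted(STOP_CODONS):
    modified = pseudouridylate(stop)
    print(stop, decode(stop), "->", modified, decode(modified))
```

Each unmodified codon decodes as a stop, while its Ψ-containing counterpart decodes as one of two amino acids, which is the sense in which a single modification can change the protein product.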
Already the list of disorders associated with different RNA modifications is long: cancer, obesity, diabetes, hepatitis, neurological disorders, autoimmune disorders, hypoxia, metabolic diseases, and viral infections. Given that the true complexity of RNA modification is still unknown, the impact it may have on disease could be profound. Only time will tell.
Footnotes
Article and publication date are at http://www.rnajournal.org/cgi/doi/10.1261/rna.050260.115.
Freely available online through the RNA Open Access option.