Biophysical Journal. 2021 Dec 10;121(1):7–10. doi: 10.1016/j.bpj.2021.12.004

A novel algorithm for ranking RNA structure candidates

Anastacia Wienecke 1,2, Alain Laederach 1,2
PMCID: PMC8758412  PMID: 34896370

Abstract

RNA research is advancing at an ever-increasing pace. State-of-the-art instruments and techniques have enabled the discovery of new RNAs and have carried the field to new frontiers of disease research, vaccine development, therapeutics, and architectonics. Like proteins, RNAs show a marked relationship between structure and function, so a deeper grasp of RNAs requires a finer understanding of their elaborate structures. In pursuit of this, cutting-edge experimental and computational structure-probing techniques output several candidate geometries for a given RNA, each of which agrees equally well with the experimentally determined parameters. Identifying which structure is the most accurate, however, remains a major obstacle. In recent years, several algorithms have been developed for ranking candidate RNA structures from most to least probable, though their accuracy and transparency leave room for improvement. Most recently, advances in both areas have been demonstrated by rsRNASP, a novel algorithm proposed by Tan et al. rsRNASP is a residue-separation-based statistical potential for three-dimensional structure evaluation, and it outperforms the leading algorithms in the field.

Introduction

RNA is a single-stranded biomolecule with myriad key roles in regulating gene activity (1), catalyzing chemical reactions (2), and encoding and decoding genetic information (3). Its single-stranded nature enables intramolecular base pairing, which allows the polymer to fold into intricate three-dimensional (3D) structures. A ubiquitous example is the amino acid carrier tRNA. Via base pairing, tRNAs adopt a two-dimensional cloverleaf topology. Interactions between the leaves then create a 3D “L”-shaped architecture, the exact dimensions of which precisely fit through a ribosome’s entry portal and facilitate protein synthesis (4). Alongside tRNAs, a whole host of other structured RNAs play leading and supporting roles in almost every process on the cellular stage; examples include ribozymes (2), mRNAs (5,6), riboswitches (7), spliceosomes (8), ribosomes (9), and micro-RNAs (10). The rapid development of new structural techniques such as cryoelectron microscopy (11, 12, 13), SHAPE-JuMP (14), single-molecule fluorescence resonance energy transfer (15), and small-angle x-ray scattering (16), as well as the refinement of more traditional techniques such as NMR (17) and crystallography (18), has focused the stage lights. Nonetheless, each of these methods has experimental limits on its resolution, so computational techniques are required to refine atomic-resolution models from these data.

As in proteins, RNA structure and RNA function are closely tied (19,20). Even certain single-point mutations can alter RNA structures and lead to disease states (21). Although the number of genes encoding functional RNAs greatly exceeds the number encoding proteins, to date, only 1% of Protein Data Bank entries include RNAs. A sharper understanding of RNA structure, its various roles, and its patterns of evolution holds much promise in informing the functions of uncharacterized RNAs, clarifying mechanisms of noncoding disease mutations (22), guiding aptamer-based (23) and other therapies (24), and buttressing the science of RNA nanostructures (25,26).

Critical to understanding RNAs is the ability to probe their structures reliably and accurately. This is no simple task, as RNAs are quite flexible and could even be said to exhibit an RNA version of the Levinthal paradox (27); that is, even a 20-base RNA composed of 10 A’s and 10 U’s has almost 10 million distinct folding conformations. In the experimental realm, frequently used methods such as NMR (17), cryoelectron microscopy (11, 12, 13), and SHAPE-JuMP (14) yield constraints in the form of either distances or density that, when combined with computational structural modeling, produce a cloud of probable structures, each of which agrees equally well with the experimental data. On the computational side, methods such as fragment assembly of RNA (28) and homology modeling piece together an RNA’s structure on the basis of its chemical similarity to already-known local RNA structures. Whatever the method used, the output is a set of candidate structures, and ranking these structures to identify the most accurate, or native-like, structure remains computationally challenging (see Fig. 1 A).

Figure 1.

An in-depth look at rsRNASP’s ranking of native and decoy 3LA5 RNA structures and a visualization of the performance of Tan et al.’s rsRNASP relative to four other energy functions. (A) Commonly used experimental and computational techniques output several candidate RNA structures. A reliable way of choosing the most accurate, or native-like, structure is crucial. (B) Visualization of 3D 3LA5 RNA structures at ten positions of the rsRNASP ranking. These structures include the native and nine computationally generated decoys (taken from Tan et al.’s test set III). rsRNASP determines rank by computing an energy value, in units of kBT, which is reported below each structure; the lower the energy value, the higher the rank and the higher the predicted similarity to the native. As this is a test case, this ranking scheme is compared with an independent reference list, which is determined by the deformation index. This index measures the deviation of a given structure from the accepted native structure in the Protein Data Bank. The gray box highlights the native 3LA5 structure, which rsRNASP correctly identifies and gives a rank of 1. In practice, the true native structure is unknown. (C) Scatterplot of the rsRNASP rank versus the reference list rank for all 42 3LA5 structures. The Pearson correlation coefficient (PCC) is 0.88, indicating a strong predictive ability of rsRNASP. Datapoints outlined in black are featured in (B). (D) Scatterplot of the ranking accuracy, as measured by the PCC, of rsRNASP, RNA3DCNN, 3dRNAscore, DFIRE-RNA, and RASP for each of the 42 RNAs in Tan et al.’s most realistic test set (test set III). Each RNA has an associated array of structures. The energy functions have no knowledge of which structure is the “native” and which are the “decoys.” A PCC of 1 indicates that the energy function correctly identifies the native structure and ranks the decoys in order of their structural similarity to the native deposited in the Protein Data Bank. (E) Violin plot displaying the distribution of RNA lengths in test set III, with the middle bar representing the median.

A great need in the field of RNA structure prediction is an accurate, reliable, and efficient way of evaluating and ranking candidate RNA structures. One of the most common methods for achieving a similar end in the protein world is a certain type of energy function: the knowledge-based statistical potential (29, 30, 31). Such a function relies on a reference state, which in the RNA world can be determined by information latent in the sequences, chemical bonds, and configurations of a well-characterized training set of RNA structures. The energy function outputs a potential energy value for each input geometry; the geometry with the lowest energy is taken as the best estimate of the native structure. Several such functions have been proposed and are currently in use: RASP (32), 3dRNAscore (33), DFIRE-RNA (34), and RNA3DCNN (35).
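
To make the idea concrete, the following is a minimal Python sketch of a generic distance-binned statistical potential. It is not the implementation of RASP, 3dRNAscore, DFIRE-RNA, RNA3DCNN, or rsRNASP. The function names, the bin width, the pseudocount, and all counts and distances are hypothetical values chosen only for illustration.

```python
import math

def statistical_potential(observed_counts, reference_counts, pseudocount=1.0):
    """Per-bin energies (in units of kBT) from binned distance counts.

    Each energy is -ln(P_obs / P_ref). The pseudocount is an assumption
    added here to avoid taking the logarithm of zero for empty bins.
    """
    n_bins = len(observed_counts)
    obs_total = sum(observed_counts) + pseudocount * n_bins
    ref_total = sum(reference_counts) + pseudocount * n_bins
    energies = []
    for obs, ref in zip(observed_counts, reference_counts):
        p_obs = (obs + pseudocount) / obs_total
        p_ref = (ref + pseudocount) / ref_total
        energies.append(-math.log(p_obs / p_ref))
    return energies

def score_structure(distances, energies, bin_width=1.0):
    """Sum the bin energies over all pairwise distances of one candidate."""
    total = 0.0
    for d in distances:
        bin_index = int(d // bin_width)
        # Distances outside the binned range contribute nothing.
        if 0 <= bin_index < len(energies):
            total += energies[bin_index]
    return total

def pick_native_like(candidates, energies, bin_width=1.0):
    """Return the name of the candidate with the lowest total energy."""
    return min(
        candidates,
        key=lambda name: score_structure(candidates[name], energies, bin_width),
    )

# Hypothetical counts per distance bin: training set versus reference state.
observed_counts = [1, 8, 20, 6]
reference_counts = [5, 9, 10, 11]
energies = statistical_potential(observed_counts, reference_counts)

# Hypothetical pairwise distances for two candidate structures.
candidates = {
    "model_a": [2.4, 2.7, 2.1],
    "model_b": [0.5, 3.6, 2.2],
}
best = pick_native_like(candidates, energies)
```

In this toy setup, distances that are more common in the training set than in the reference state receive negative energies, so a candidate whose distances fall in those bins scores lower and is preferred.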

The new and noteworthy

In this issue of Biophysical Journal, Tan et al. propose a new energy function: a residue-separation-based statistical potential (rsRNASP) for 3D RNA structure evaluation. Unlike RNA3DCNN, which relies on the “black box” of 3D convolutional neural networks, rsRNASP has transparent inner workings. On the basis of the inverse Boltzmann law, rsRNASP relies on three factors: the temperature, the probabilities of nucleotide separation for the native state, and the probabilities of nucleotide separation for the reference state. These parameters are weighted to factor in distance, an important consideration given the hierarchical nature of RNA folding patterns (36). The output is an energy value in units of kBT; the lower a candidate structure’s energy, the higher its predicted similarity to the true native structure.
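
For orientation, the inverse Boltzmann relation that underlies statistical potentials of this kind can be written in a generic form. The symbols below are introduced here as assumptions: P^obs(r) is the probability of observing a separation r in native structures, P^ref(r) is the corresponding probability in the reference state, and the sum runs over the pairs considered in a candidate structure. The exact functional form, pair definitions, and distance-dependent weighting used by rsRNASP are those of Tan et al. and are not reproduced here.

```latex
\Delta E(r) = -k_B T \,\ln\!\left[\frac{P^{\mathrm{obs}}(r)}{P^{\mathrm{ref}}(r)}\right],
\qquad
E_{\mathrm{total}} = \sum_{\text{pairs}} \Delta E\!\left(r_{\text{pair}}\right)
```

A separation that is more probable in native structures than in the reference state gives a ratio above 1 and therefore a negative (favorable) contribution, which is why lower total energies indicate more native-like candidates.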

Tan et al.’s comprehensive testing on three large datasets indicates the high quality of rsRNASP in parsing native from decoy RNA structures. These decoys were computationally generated by normal mode perturbation, fragment assembly, replica exchange, shifting atom distances, or rearranging dihedral angles. Each RNA in these datasets has one accepted native structure and several associated decoy structures. Although not perfectly accurate, rsRNASP identified the correct native structure and ranked the decoys more successfully and more consistently than the other energy functions tested. Its performance remained high for small RNAs with very similar decoys, for large RNAs with highly variable decoys, and for the most realistic RNA-puzzles dataset. This approach appears to achieve a better balance of short- versus long-range interactions, and it functions at a higher resolution, perhaps explaining why it outperforms 3dRNAscore and RASP, as noted by Tan et al.

Tan et al. consider their test set III the most realistic. This set includes a known native structure and multiple computer-generated decoy structures for (1) each of the 22 RNAs in the RNA-puzzles dataset and (2) each of 20 selected RNAs with known structures in the Protein Data Bank (see Fig. 1 A for an example). For a given RNA, the energy function ranks the associated array of one native and several decoy structures (see Fig. 1 B). It receives no input on which structure is native and which is decoy. During testing, this ranking is then compared with a reference list of structures, ordered by a measure of structural deformation. This “deformation index” quantifies the divergence between the shape of each structure and the known native structure; a high deformation index implies a high divergence. Thus, for a given RNA, if the Pearson correlation coefficient between the energy function ranking and the reference list is 1, the energy function orders the structures exactly by their similarity to the known native structure (see Fig. 1 B and C for a visualization of rsRNASP’s performance on the 71-base-long mc6 RNA riboswitch; PDB: 3LA5). For every RNA in test set III, Fig. 1 D shows how rsRNASP, RASP, 3dRNAscore, DFIRE-RNA, and RNA3DCNN compare in ranking the structures of RNAs with a variety of lengths (see Fig. 1 E).
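
The rank comparison described above can be sketched in a few lines of Python. This is an illustrative sketch only: the function names are hypothetical, and the energies and deformation-index values are invented and are not taken from Tan et al. or from Fig. 1.

```python
import math

def ranks(values):
    """Rank values in ascending order (1 = smallest); ties share the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j across a run of equal values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)

# Hypothetical data for one RNA: the first entry is the native structure
# (deformation index 0), followed by four decoys.
energies_kbt = [-310.2, -280.4, -295.0, -288.1, -250.7]
deformation_index = [0.0, 4.1, 9.8, 6.3, 15.2]

energy_ranks = ranks(energies_kbt)          # lowest energy gets rank 1
reference_ranks = ranks(deformation_index)  # smallest deformation gets rank 1
pcc = pearson(energy_ranks, reference_ranks)
```

Here the native structure receives rank 1 from both lists, but two decoys are ordered differently by energy than by deformation index, so the correlation falls below 1.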

As more and more RNA structures are probed, an efficient, accurate, and reliable method for ranking their candidate structures becomes imperative. rsRNASP meets this need.

Author contributions

A.W. wrote the manuscript and created Fig. 1. A.L. provided critical input and feedback. Both authors reviewed the final manuscript.

Acknowledgments

This work was supported by National Institutes of Health grants R35 GM140844 and R01 HL111527 to A.L.

Editor: Alan Grossfield.

References

  • 1. O’Brien J., Hayder H., et al. Peng C. Overview of MicroRNA biogenesis, mechanisms of actions, and circulation. Front. Endocrinol. (Lausanne). 2018;9:402. doi: 10.3389/fendo.2018.00402.
  • 2. Lilley D.M., Eckstein F., editors. Ribozymes and RNA Catalysis. Royal Society of Chemistry; 2007.
  • 3. Li J., Liu C. Coding or noncoding, the converging concepts of RNAs. Front. Genet. 2019;10:496. doi: 10.3389/fgene.2019.00496.
  • 4. Agirrezabala X., Valle M. Structural insights into tRNA dynamics on the ribosome. Int. J. Mol. Sci. 2015;16:9866–9895. doi: 10.3390/ijms16059866.
  • 5. Hiller M., Zhang Z., et al. Stamm S. Pre-mRNA secondary structures influence exon recognition. PLoS Genet. 2007;3:e204. doi: 10.1371/journal.pgen.0030204.
  • 6. Mustoe A.M., Corley M., et al. Weeks K.M. Messenger RNA structure regulates translation initiation: a mechanism exploited from bacteria to humans. Biochemistry. 2018;57:3537–3539. doi: 10.1021/acs.biochem.8b00395.
  • 7. Serganov A., Nudler E. A decade of riboswitches. Cell. 2013;152:17–24. doi: 10.1016/j.cell.2012.12.024.
  • 8. Wilkinson M.E., Charenton C., Nagai K. RNA splicing by the spliceosome. Annu. Rev. Biochem. 2020;89:359–388. doi: 10.1146/annurev-biochem-091719-064225.
  • 9. Watson Z.L., Ward F.R., et al. Cate J.H. Structure of the bacterial ribosome at 2 Å resolution. Elife. 2020;9:e60482. doi: 10.7554/eLife.60482.
  • 10. Roden C., Gaillard J., et al. Lu J. Novel determinants of mammalian primary microRNA processing revealed by systematic evaluation of hairpin-containing transcripts and human genetic variation. Genome Res. 2017;27:374–384. doi: 10.1101/gr.208900.116.
  • 11. Kappel K., Zhang K., et al. Das R. Accelerated cryo-EM-guided determination of three-dimensional RNA-only structures. Nat. Methods. 2020;17:699–707. doi: 10.1038/s41592-020-0878-9.
  • 12. Koning R., Gomez-Blanco J., et al. Koster A.J. Asymmetric cryo-EM reconstruction of phage MS2 reveals genome structure in situ. Nat. Commun. 2016;7:12524. doi: 10.1038/ncomms12524.
  • 13. Zhang K., Li S., et al. Chiu W. Cryo-EM structure of a 40 kDa SAM-IV riboswitch RNA at 3.7 Å resolution. Nat. Commun. 2019;10:5511. doi: 10.1038/s41467-019-13494-7.
  • 14. Christy T.W., Giannetti C.A., et al. Weeks K.M. Direct mapping of higher-order RNA interactions by SHAPE-JuMP. Biochemistry. 2021;60:1971–1982. doi: 10.1021/acs.biochem.1c00270.
  • 15. Manz C., Kobitski A., et al. Nienhaus G.U. Single-molecule FRET reveals the energy landscape of the full-length SAM-I riboswitch. Nat. Chem. Biol. 2017;13:1172–1178. doi: 10.1038/nchembio.2476.
  • 16. Chen Y., Pollack L. SAXS studies of RNA: structures, dynamics, and interactions with partners. Wiley Interdiscip. Rev. RNA. 2016;7:512–526. doi: 10.1002/wrna.1349.
  • 17. Barnwal R.P., Yang F., Varani G. Applications of NMR to structure determination of RNAs large and small. Arch. Biochem. Biophys. 2017;628:42–56. doi: 10.1016/j.abb.2017.06.003.
  • 18. Reyes F.E., Garst A.D., Batey R.T. Strategies in RNA crystallography. Methods Enzymol. 2009;469:119–139. doi: 10.1016/S0076-6879(09)69006-6.
  • 19. Mortimer S., Kidwell M., Doudna J. Insights into RNA structure and function from genome-wide studies. Nat. Rev. Genet. 2014;15:469–479. doi: 10.1038/nrg3681.
  • 20. Piao M., Sun L., Zhang Q.C. RNA regulations and functions decoded by transcriptome-wide RNA structure probing. Genomics Proteomics Bioinformatics. 2017;15:267–278. doi: 10.1016/j.gpb.2017.05.002.
  • 21. Halvorsen M., Martin J.S., et al. Laederach A. Disease-associated mutations that alter the RNA structural ensemble. PLoS Genet. 2010;6:e1001074. doi: 10.1371/journal.pgen.1001074.
  • 22. Waldern J.M., Kumar J., Laederach A. Disease-associated human genetic variation through the lens of precursor and mature RNA structure. Hum. Genet. 2021. doi: 10.1007/s00439-021-02395-9.
  • 23. Ni S., Zhuo Z., et al. Zhang G. Recent progress in aptamer discoveries and modifications for therapeutic applications. ACS Appl. Mater. Interfaces. 2021;13:9500–9519. doi: 10.1021/acsami.0c05750.
  • 24. Corley M., Solem A., et al. Laederach A. An RNA structure-mediated, posttranscriptional model of human α-1-antitrypsin expression. Proc. Natl. Acad. Sci. U S A. 2017;114:E10244–E10253. doi: 10.1073/pnas.1706539114.
  • 25. Grabow W.W., Jaeger L. RNA self-assembly and RNA nanotechnology. Acc. Chem. Res. 2014;47:1871–1880. doi: 10.1021/ar500076k.
  • 26. Jaeger L., Chworos A. The architectonics of programmable RNA and DNA nanostructures. Curr. Opin. Struct. Biol. 2006;16:531–543. doi: 10.1016/j.sbi.2006.07.001.
  • 27. Levinthal C. How to fold graciously. Mossbauer Spectroscopy in Biological Systems: Proceedings of a Meeting Held at Allerton House. 1969; 22–24.
  • 28. Das R., Baker D. Automated de novo prediction of native-like RNA tertiary structures. Proc. Natl. Acad. Sci. U S A. 2007;104:14664–14669. doi: 10.1073/pnas.0703836104.
  • 29. Sankar K., Jia K., Jernigan R.L. Knowledge-based entropies improve the identification of native protein structures. Proc. Natl. Acad. Sci. U S A. 2017;114:2928–2933. doi: 10.1073/pnas.1613331114.
  • 30. Rykunov D., Fiser A. New statistical potential for quality assessment of protein models and a survey of energy functions. BMC Bioinformatics. 2010;11:128. doi: 10.1186/1471-2105-11-128.
  • 31. Sippl M.J. Knowledge-based potentials for proteins. Curr. Opin. Struct. Biol. 1995;5:229–235. doi: 10.1016/0959-440x(95)80081-6.
  • 32. Capriotti E., Norambuena T., et al. Melo F. All-atom knowledge-based potential for RNA structure prediction and assessment. Bioinformatics. 2011;27:1086–1093. doi: 10.1093/bioinformatics/btr093.
  • 33. Wang J., Zhao Y., et al. Xiao Y. 3dRNAscore: a distance and torsion angle dependent evaluation function of 3D RNA structures. Nucleic Acids Res. 2015;43:e63. doi: 10.1093/nar/gkv141.
  • 34. Zhang T., Hu G., et al. Zhou Y. All-atom knowledge-based potential for RNA structure discrimination based on the distance-scaled finite ideal-gas reference state. J. Comput. Biol. 2019;27:856–867. doi: 10.1089/cmb.2019.0251.
  • 35. Li J., Zhu W., et al. Wang W. RNA3DCNN: local and global quality assessments of RNA 3D structures using 3D deep convolutional neural networks. PLoS Comput. Biol. 2018;14:e1006514. doi: 10.1371/journal.pcbi.1006514.
  • 36. Brion P., Westhof E. Hierarchy and dynamics of RNA folding. Annu. Rev. Biophys. Biomol. Struct. 1997;26:113–137. doi: 10.1146/annurev.biophys.26.1.113.

Articles from Biophysical Journal are provided here courtesy of The Biophysical Society
