ACS Omega. 2025 Oct 26;10(43):51011–51027. doi: 10.1021/acsomega.5c05377

Molecular Dynamics Simulations of RNA Stem-Loop Folding Using an Atomistic Force Field and a Generalized Born Implicit Solvent

Tadashi Ando †,‡,*
PMCID: PMC12593105  PMID: 41222232

Abstract

Accurate modeling of the structural dynamics of ribonucleic acid (RNA) molecules, including common stem-loop motifs, remains challenging. This study presents de novo folding simulations of a diverse set of 26 RNA stem-loops, ranging from 10 to 36 residues, with and without bulges or internal loops, starting from their extended conformations. These simulations employed conventional molecular dynamics using an atomistic force field extensively refined by the Shaw group (Tan, D., et al. Proc. Natl. Acad. Sci. U.S.A. 2018, 115, E1346−E1355, 10.1073/pnas.1713027115) and an implicit solvent model developed by the Simmerling group (Nguyen, H., et al. J. Chem. Theory Comput. 2015, 11, 3714−3728, 10.1021/acs.jctc.5b00271). The 18 stem-loops without bulges or internal loops were folded into their respective structures, retaining all native base pairs in the stem regions. For most of these models, root mean square deviation (RMSD) values relative to experimentally determined structures were <2 Å for stem regions and <5 Å for the molecules. Furthermore, five of the eight stem-loops containing bulges or internal loops were successfully folded into structures with all respective native base pairs in the stem regions. The models initially formed stems directly connected to hairpin loops, followed by the remaining duplex stems between the bulge or internal loop and the terminal. The RMSD values for these structures were 0.9–4.5 Å for the stem regions and 2.8–8.3 Å for the molecules. The RMSD values for the loop regions were approximately 4 Å for all models. Accurate modeling of loop structures remains challenging in simulations using the implicit solvent model. However, our success in recapitulating the RNA stem folding of fundamental stem-loop motifs represents a pivotal step toward enhancing reliable and accurate modeling of RNA structural dynamics.



Introduction

Ribonucleic acids (RNAs) are important molecules in living systems. They are versatile and involved in numerous biological processes that are essential for the maintenance, regulation, and processing of genetic information. RNA-based medicine has recently advanced significantly with the development of numerous applications, including therapeutics, vaccines, and diagnostics. RNA molecules are linear polymers composed of four nucleotides (nt) with nitrogenous bases: adenine (A), uracil (U), guanine (G), and cytosine (C). These molecules can adopt a variety of secondary structures, including various structural elements, such as stems, loops, internal loops, bulges, and pseudoknots, which are defined by the canonical Watson–Crick base pairs (A-U and G-C) and the noncanonical wobble base pair (G-U). Interactions between distinct secondary structural elements form more complex tertiary structures. As with other biomolecules, the biological functions of RNAs are closely linked to their structures and dynamics. Deep learning-based methods have revolutionized the research field of RNA structure prediction with increasingly higher accuracy and efficiency. However, the conformational dynamics and flexibility of RNA are inherently coupled with its functions, as observed in riboswitches, ribozymes, and RNA–protein interactions. Therefore, developing methods to efficiently and accurately model RNA structures and dynamics will significantly impact RNA biology and medicinal chemistry.

All-atom molecular dynamics (MD) simulations, which use physics-based potentials, are powerful tools that can be used to investigate the relationships among the structures, dynamics, and biological functions of biomolecules. MD simulations provide the energetics of interactions and conformational dynamics of biomolecules in atomic detail and at fine temporal resolution. The AMBER RNA force field has been widely used for MD simulations of atomistic RNA systems. Currently, bsc0χOL3 is the standard force field used for RNA systems in AMBER, hereafter called the AMBER-OL3 RNA force field. Since the AMBER-OL3 force field was parametrized in 2011, significant efforts have been made to improve its accuracy. Steinbrecher et al. modified the phosphate Lennard-Jones parameters with the aid of a thermodynamic cycle consisting of experimentally determined pKa values, solvation energies from free energy calculations, and quantum mechanical (QM) calculations. Although originally developed for bioinorganic phosphates, these parameters improved the quantitative agreement between MD-simulated conformational ensembles of r(GACC) and r(CCCC) tetranucleotides and experimental NMR data. Refinements of backbone and glycosidic torsion parameters based on QM calculations have been performed. Chen and García adjusted the Lennard-Jones parameters for base–base and base–water interactions to weaken base stacking, which is overstabilized in the original force field. Kührová et al. introduced a structure-specific local correction potential selectively modifying native hydrogen bonds of RNA, denoted as gHBfix, to improve the performance of AMBER RNA force fields while minimizing undesired side effects. The same group proposed a scheme for automatically adjusting gHBfix weights, and the resulting new parameter set was used in MD simulations, generating a structural ensemble of tetraloop RNAs that was in better agreement with experiments. Tan et al. introduced an extensive revision of electrostatic, van der Waals, and torsional parameters of the AMBER RNA force field based on QM calculations and existing experimental information to more accurately reproduce the energetics of nucleobase stacking, base pairing, and key torsional conformers. This force field (called the DESRES-RNA force field hereafter) combined with the TIP4P-D water model could effectively reproduce the structural and thermodynamic properties of various RNA systems. Specific adjustments for CH···O interactions were also proposed to improve RNA simulations. In a recent study, Mlýnský et al. conducted a comprehensive evaluation of various RNA force fields, including a polarizable force field, using a UUCG tetraloop as a benchmark system. They concluded that substantial hurdles remain in achieving reliable and accurate modeling of the diverse structural dynamics of RNA systems.

In addition to the remarkable improvements in RNA force field accuracy in recent years, advances in computational resources and algorithms have greatly enhanced our ability to simulate RNA dynamics on various time scales. RNA dynamics encompass a wide range of time scales, reflecting the rugged free energy landscape of RNA folding, unfolding, and interactions. Depending on the specific RNA sequence and environmental conditions, these time scales can vary from microseconds to seconds. Recent experimental studies have shown that simple stem-loop folding also occurs on time scales ranging from tens of microseconds to several seconds. To tackle these long-time-scale dynamics by MD simulations, enhanced sampling techniques, such as replica-exchange MD (REMD) and metadynamics, are widely used. In parallel to enhanced sampling techniques, the development of optimized algorithms for the effective use of graphical processing units (GPUs) and single-instruction multiple-data (SIMD) architectures has significantly accelerated MD simulations, enabling the exploration of biomolecular dynamics at microsecond time scales in explicit solvents. A specialized supercomputer, named ANTON, has further extended MD simulations of RNA in explicit solvent molecules to hundreds of microseconds.

Implicit solvent models can also be used to speed up atomistic simulations by approximating the explicit solvent as a continuum. A drastic reduction in the number of particles in a simulation system can reduce the computational cost. In addition, the low solvent viscosity in implicit solvent simulations accelerates the rate of conformational sampling compared with explicit solvent simulations. The generalized Born (GB) model, which was first introduced by Still et al. in 1990, is the most widely used implicit solvent model for atomistic MD simulations of biomolecules. Although the GB model is less realistic than an explicit solvent model, its accuracy has been significantly improved, and GPU-accelerated GB calculations have also been implemented. Nguyen et al. optimized the parameters of the original GB-neck model to reproduce more accurate Poisson–Boltzmann solvation energies for a broad range of peptide and protein systems. The GB-neck2 model combined with the AMBER ff14SBonlysc protein force field successfully simulated the folding of 16 proteins with diverse topologies. Notably, the AMBER ff14SBonlysc force field combined with the GB-neck2 model not only folds more proteins but also provides a better balance of different secondary structures than ff14SB combined with the explicit TIP3P water model. The GB-neck2 model parameters were refined for nucleic acids, and the resulting parameter sets enabled the successful folding of small DNA and RNA hairpins to near native structures.

While remarkable advances have been made in RNA force fields and sampling techniques, simulating RNA folding, even for secondary structures, remains challenging. Chen and García reported that three hyperstable 8 nt RNA tetraloops folded into the structure with a non-hydrogen root mean square deviation (RMSD) of 1–3 Å from their experimental structures, where they used temperature-REMD starting from their unfolded states and an optimized force field based on AMBER-99 with the TIP3P explicit solvent. Sponer and Bussi performed an extensive set of folding simulations for two tetranucleotides and one 8 nt tetraloop using various enhanced sampling techniques and the gHBfix-applied AMBER RNA force field with explicit solvent models to show an improved agreement between experimental data and conformational ensembles from simulations compared with the original AMBER force field. The DESRES-RNA force field combined with the TIP4P-D water model can simulate the reversible folding of three tetraloops (two 10 nt and one 14 nt) using a simulated tempering enhanced sampling method. These simulations successfully sampled conformations with an overall non-hydrogen RMSD of 1–2 Å from their experimentally determined structures at low temperatures. With the same force field, magnesium ion-dependent folding of an 8 nt tetraloop was observed in a conventional MD simulation. An implicit solvent model captured the folding of three tetraloops (two 12 nt and one 14 nt) using deep boosted enhanced sampling MD with the DESRES-RNA force field and the GB-neck2 model. Linzer et al. reported unfolding/refolding events of stem regions for several stem-loops ranging from 10 to 27 nt in conventional MD simulations using the AMBER-OL3 force field and the GB-neck2 model, close to the predicted melting temperatures.

Here, we present the results of conventional MD simulations for 26 RNA stem-loops, ranging from 10 to 36 nt, with and without bulges or internal loops, starting from their extended conformations. Similar to MD simulations of protein systems, the folding simulation of RNA molecules serves as a stringent test to assess whether current molecular mechanics force fields are sufficiently accurate to enable long-time-scale MD simulations, as a powerful tool for characterizing large conformational changes in RNA. Among these examined models, 23 RNA molecules were successfully folded into structures with native base pairs and non-hydrogen RMSD values of <2 and 4 Å for the stem and loop regions, respectively. These structures were generally stable and formed dominant clusters in their trajectories. The loop structures sampled in their folded states were less accurate than the stem regions, and further improvement of the force field parameters is required. We expect that this force field and GB model combination will facilitate future MD simulation studies on the complex dynamics of a wide range of RNA systems, such as tertiary structure folding, conformational changes of riboswitches, and kissing-loop complexes.

Models and Simulation Methods

RNA Stem-Loop Models

A total of 26 RNA stem-loops with varying sequences and lengths were simulated, 15 of which corresponded to the models previously examined by Linzer et al. Their sequences, secondary structures, and Protein Data Bank (PDB) IDs are listed in Figure 1. Hereafter, all models are referred to using their PDB IDs. The stem-loops without bulges or internal loops (lengths in parentheses) are as follows (Figure 1A): 1R4H (10 nt), 1IDV (10 nt), 1ZIH (12 nt), 1I46 (13 nt), 1ESH (13 nt), 1F85 (14 nt), 1FHK (14 nt), 2EVY (14 nt), 2KOC (14 nt), 2Y95 (14 nt), 1OQ0 (15 nt), 1XWP (15 nt), 1JTW (16 nt), 1KKA (17 nt), 2RPK (20 nt), 1SZY (21 nt), 1PJY (22 nt), and 2LDL (27 nt). The stem-loops with bulges or internal loops are as follows (Figure 1B): 1ESY (19 nt), 17RA (21 nt), 2KF0 (24 nt), 1ANR (29 nt), 2JWV (29 nt), 1R2P (34 nt), 1R7W (34 nt), and 1N8X (36 nt). The stem-loop structures of the 26 RNAs were solved by NMR. The loops ranged in length from 3 to 8 nt, and the bulges and internal loops ranged in length from 1 to 6 nt. A5 and U14 in 1ESY and A6 and U24 in 1ANR can form base pairs, but these pairs were not formed in the NMR structures. Six stem-loop models (1F85, 2EVY, 1OQ0, 1JTW, 17RA, and 1N8X) contained the G-U wobble base pair. 2LDL and 2KF0 contained the noncanonical C-A wobble base pair, and 1N8X had a G-A mismatched base pair. Hereafter, we refer to stem-loops without a bulge or internal loop as “class I” stem-loops and those with a bulge or internal loop as “class II” stem-loops.

Figure 1. Secondary structures of 26 RNA stem-loops simulated in this study and their corresponding Protein Data Bank IDs. (A) Eighteen class I stem-loops without bulges or internal loops. (B) Eight class II stem-loops with bulges or internal loops. The G-C/A-U Watson–Crick and the G-U wobble base pairs are represented by black and red lines, respectively. Wobble base pairs A7-C21 in 2LDL and C6-A18 in 2KF0 were observed by NMR spectroscopy and are denoted by plus signs (+). A G8-A27 mismatch base pair in 1N8X was observed by NMR spectroscopy and is denoted by an asterisk (*).

Implicit Solvent MD Simulations for Stem-Loops

All simulations were performed using the pmemd.cuda module of AMBER22 with an NVIDIA GeForce RTX 2080 Ti or RTX 3090 GPU. The DESRES-RNA force field and the GB-neck2 implicit solvent model with the mbondi3 intrinsic radii set for nucleic acids were used, where mbondi3 and mbondi2 are equivalent for nucleic acid simulations. The tleap module of AMBER22 generated extended conformations of RNA molecules as initial structures for the MD simulations, which were subjected to energy minimization for 1000 steps, consisting of 500 steps of the steepest descent followed by 500 steps of the conjugate gradient. Production simulations were performed at 298 K with a time step of 2 fs using the SHAKE algorithm to constrain the bond lengths involving hydrogen atoms. The temperature was controlled using a Langevin thermostat with a collision frequency of 1 ps⁻¹. The salt concentration in the GB model was set at 0.15 M. The nonpolar contribution to the solvation free energy, which is commonly approximated as a linear function of the solute surface area, was not computed in this study. No cutoff value was used for the calculations of nonbonded interactions and the effective Born radii. The simulation time was 8 μs for the 2EVY and 2LDL class I stem-loops as well as the class II stem-loops and 4 μs for the remaining class I stem-loops.
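As a rough illustration of the production settings described above, the following Python sketch renders an AMBER-style `&cntrl` namelist. The mapping from the text to input flags (`igb = 8` for GB-neck2, `saltcon`, `gbsa = 0`, `ntt = 3` with `gamma_ln`, large `cut` and `rgbmax` for "no cutoff") is an assumption based on standard AMBER keywords; the author's actual input files are not given here, and output-frequency settings are omitted. The function name `gb_production_mdin` is hypothetical.

```python
def gb_production_mdin(sim_time_us: float, dt_ps: float = 0.002) -> str:
    """Render a pmemd-style &cntrl namelist for a GB-neck2 production run.

    Illustrative only: flag choices are assumptions inferred from the
    protocol text, not the author's actual input file.
    """
    nstlim = round(sim_time_us * 1.0e6 / dt_ps)  # 1 us = 1e6 ps
    settings = {
        "imin": 0,          # molecular dynamics, not minimization
        "nstlim": nstlim,   # number of MD steps
        "dt": dt_ps,        # 2 fs time step
        "ntc": 2,           # SHAKE on bonds involving hydrogen
        "ntf": 2,           # skip force evaluation for constrained bonds
        "ntt": 3,           # Langevin thermostat
        "gamma_ln": 1.0,    # collision frequency, 1 ps^-1
        "temp0": 298.0,     # target temperature in K
        "ntb": 0,           # no periodic box (implicit solvent)
        "igb": 8,           # GB-neck2
        "saltcon": 0.15,    # salt concentration in M
        "gbsa": 0,          # no surface-area (nonpolar) term
        "cut": 9999.0,      # effectively no nonbonded cutoff
        "rgbmax": 9999.0,   # effectively no cutoff for Born radii
    }
    body = ",\n".join(f"  {key} = {value}" for key, value in settings.items())
    return f"GB-neck2 production run (sketch)\n &cntrl\n{body},\n /\n"


if __name__ == "__main__":
    # 4 us was used for most class I stem-loops, 8 us for the others.
    print(gb_production_mdin(4.0))
```

With a 2 fs time step, 4 μs corresponds to 2 × 10⁹ steps and 8 μs to 4 × 10⁹ steps, which is why the step count is computed rather than hard-coded.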

Analysis was performed using the cpptraj program in AmberTools24. For reference structures, the first NMR structural models were used if multiple structural models were deposited. For 1I46, 1ESH, 1XWP, and 2RPK, minimized average structures were used as the reference structures. The RMSD was calculated for the heavy atoms of the molecules. Base pairs were detected by using the nastruct command in cpptraj with the default parameters. Although nastruct judged the G1 and C10 pair in the 1R4H reference structure not to be base paired, G1-C10 was considered a native base pair in the stem region, as reported by the Rijnbrand group that originally determined the structure by NMR. The fraction of native base pairs (Q) was evaluated for the stem regions. The k-means algorithm was used to sort the conformations sampled in three independent simulations into 20 clusters (that is, k = 20) for each model, where the RMSD for the entire molecule was used as a metric for comparing RNA structures. The clusters were numbered in order of decreasing population, i.e., the first cluster was the most populated, and the second cluster was the second most populated. For analyzing the folding of class II stem-loops, stems were divided into two regions: stem 1 was a duplex region formed between the hairpin loop and the bulge or internal loop, and stem 2 was a duplex region formed from the bulge or internal loop to the RNA terminal. Images of the molecular structures and simulation trajectory movies were created using ChimeraX.
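The fraction of native base pairs Q can be illustrated with a short Python sketch. It assumes that base pairs have already been detected for a frame (for example, by a tool such as nastruct) and are available as pairs of residue numbers; the function name and the example pair list are hypothetical and only loosely modeled on a seven-base-pair stem.

```python
def fraction_native(native_pairs, formed_pairs):
    """Return Q: the fraction of native stem base pairs present in a frame.

    Pairs are given as (i, j) residue numbers; order within a pair is ignored.
    """
    native = {frozenset(pair) for pair in native_pairs}
    formed = {frozenset(pair) for pair in formed_pairs}
    if not native:
        raise ValueError("at least one native base pair is required")
    return len(native & formed) / len(native)


if __name__ == "__main__":
    # Hypothetical 21 nt hairpin with a seven-base-pair stem (i pairs with 22 - i).
    native = [(i, 22 - i) for i in range(1, 8)]
    # A frame with three native pairs and one non-native (misaligned) pair.
    frame = [(7, 15), (16, 6), (5, 17), (1, 9)]
    print(f"Q = {fraction_native(native, frame):.2f}")
```

Non-native pairs such as the (1, 9) pair in the example do not contribute to Q, so a misfolded frame with many base pairs can still have Q = 0.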

Explicit Solvent MD Simulations for Stem-Loops

The initial structures were built from the first models of their NMR structures deposited in the PDB. Each system was solvated in TIP4P-D water boxes with 150 mM NaCl, maintaining a minimum distance of 15 Å between the solute and box edges. The DESRES-RNA and CHARMM22 force fields were used for RNA and ions, respectively. The numbers of Na⁺ and Cl⁻ ions were calculated using SPLIT. Energy minimization was carried out in two stages: 1000 steps of the steepest descent, followed by 1000 steps of the conjugate gradient method. The systems were then heated linearly from 0 to 298 K over 500 ps at 1 bar, followed by 500 ps of equilibration at 298 K and 1 bar. A single 1 μs production simulation was conducted for each model. Long-range electrostatic interactions were calculated by the particle mesh Ewald method. Short-range electrostatic and Lennard-Jones interactions were calculated with a cutoff distance of 10 Å. Pressure was controlled using a Berendsen barostat with a relaxation time of 1 ps. All other simulation parameters were identical to those used in the GB-neck2 implicit solvent simulations.

Implicit Solvent MD Simulations for rU40

MD simulations of rU40 single-stranded RNA (ssRNA) were performed three times, each for 1 μs, using the same protocol as that for stem-loops. The Förster resonance energy transfer (FRET)-averaged end-to-end distance $\langle R_{\mathrm{FRET}} \rangle$ of the rU40 was calculated from MD simulations according to the procedure described by Tan et al. The distance between the O5′ atom of U1 and the O3′ atom of U40, $R$, was converted into the FRET efficiency, $E_{\mathrm{FRET}}$, using the equation $E_{\mathrm{FRET}} = [1 + (R/R_0)^6]^{-1}$, where $R_0$ was set to 55.0 Å according to the experimental results. The $\langle R_{\mathrm{FRET}} \rangle$ was then calculated using the equation $\langle R_{\mathrm{FRET}} \rangle = R_0 (\langle E_{\mathrm{FRET}} \rangle^{-1} - 1)^{1/6}$, where $\langle E_{\mathrm{FRET}} \rangle$ is the ensemble-averaged value of $E_{\mathrm{FRET}}$. The standard error of the mean for $\langle E_{\mathrm{FRET}} \rangle$ was estimated by using three independent simulations.
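The two-step conversion described above (distance to FRET efficiency, then averaged efficiency back to a distance) can be sketched in Python as follows. The function names are hypothetical, the input is assumed to be a plain list of end-to-end distances in Å, and the example distances are made up for illustration.

```python
def fret_efficiency(distance, r0=55.0):
    """FRET efficiency for one end-to-end distance (same units as r0)."""
    return 1.0 / (1.0 + (distance / r0) ** 6)


def fret_averaged_distance(distances, r0=55.0):
    """FRET-averaged distance: average the efficiencies, then invert."""
    if not distances:
        raise ValueError("at least one distance is required")
    mean_e = sum(fret_efficiency(r, r0) for r in distances) / len(distances)
    return r0 * (1.0 / mean_e - 1.0) ** (1.0 / 6.0)


if __name__ == "__main__":
    # Made-up distances in angstroms, standing in for O5'(U1)-O3'(U40) samples.
    sample = [40.0, 55.0, 70.0]
    print(f"<R_FRET> = {fret_averaged_distance(sample):.1f} A")
```

Because the efficiency is averaged before inversion, the result generally differs from the arithmetic mean of the distances; short distances are weighted more heavily.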

Implicit Solvent MD Simulations for RNA Duplex Formation

Three non-self-complementary RNA duplexes, CGCGG, ACUGUCA, and CGACGCAG, were simulated using the DESRES-RNA force field with the GB-neck2 implicit solvent model, starting from separate complementary strands. Initial structures were generated by using the tleap module of AMBER22, where the two complementary strands were positioned 30 Å apart along the x-axis. To prevent the two strands from diffusing away, a half-harmonic restraining potential was applied to the distance between the phosphorus atoms of the central residues in each strand. The ionic concentration was set to 1 M. Each simulation was conducted for 2 μs, and three independent runs were performed. All other simulation parameters were identical to those used in the stem-loop folding simulations. Full methodological details are provided in the Supporting Information.
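A half-harmonic restraint of the kind mentioned above is zero while the two strands are close and grows quadratically once their separation exceeds a threshold. The Python sketch below shows the functional form only; the threshold `r0`, the force constant `k`, and the 0.5 prefactor convention are assumptions for illustration, since the actual values are given in the Supporting Information and not in this section.

```python
def half_harmonic_energy(r, r0, k):
    """Restraint energy: zero for r <= r0, 0.5 * k * (r - r0)**2 beyond r0."""
    excess = max(0.0, r - r0)
    return 0.5 * k * excess * excess


def half_harmonic_force(r, r0, k):
    """Force along the separation coordinate (negative pulls the strands together)."""
    return -k * max(0.0, r - r0)


if __name__ == "__main__":
    # Hypothetical parameters: r0 in angstroms, k in kcal/mol/A^2.
    r0, k = 40.0, 1.0
    for r in (30.0, 40.0, 45.0):
        print(r, half_harmonic_energy(r, r0, k), half_harmonic_force(r, r0, k))
```

The one-sided form leaves the strands free to associate and dissociate at short range while preventing them from drifting apart indefinitely in the absence of a periodic box.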

Results and Discussion

Folding Trajectories of RNA Stem-Loops

The time evolutions of the fraction of native base pairs Q in the stem regions of the 18 class I stem-loops (Figure 1A) and the 8 class II stem-loops (Figure 1B) are shown in Figures 2 and 3, respectively. The RMSD time series data for the stem and loop regions, Q, base pairs, and probability distributions of the RMSDs are presented in the Supporting Information (Figures S1–S26). Notably, all 18 class I stem-loops were able to fold into their conformations with Q = 1.0 in three independent 4 or 8 μs simulations starting from their extended conformations (Figure 2). The 10 nt molecules of 1R4H and 1IDV with three base pairs in their stem regions showed frequent folding and unfolding transitions during the simulation time. In the case of 2EVY, the folded state was unstable and lasted only 1 μs out of a total simulation time of 24 μs. Except for the above three RNA models, once the molecules folded into conformations with all of their native base pairs, their folded structures were stable until the end of the simulations. For class II stem-loops, five of the eight models (17RA, 2KF0, 1ANR, 1R2P, and 1N8X) were able to fold to the conformations with Q = 1.0 (Figure 3). In most cases, these RNA molecules reached the folded states with high Q values (Q > 0.6) and remained stable until the end of the simulations.

Figure 2. Time evolution of the fraction of native base pairs Q for 18 class I stem-loops without bulges or internal loops (Figure 1A). Three independent simulations (red, green, and blue lines) were performed for each model, starting from the extended conformations.

Figure 3. Time evolution of the fraction of native base pairs Q for eight class II stem-loops with bulges or internal loops (Figure 1B). Three independent simulations (red, green, and blue lines) were performed for each model, starting from the extended conformations.

Cluster Analysis

Cluster analysis of the sampled conformations was performed using the k-means method to extract representative structures from the trajectories and to determine whether the folded structures formed a dominant cluster. In the cluster analysis, the overall RMSD was used to evaluate the structural differences between two conformations. The centroid structures for the class I stem-loops with the lowest RMSD values as well as the cluster numbers, RMSDs for the molecules, and Q values are presented in Figure 4. The fractions of the total trajectory, RMSDs for all, stem, and loop regions, and Q values for the top three clusters for the class I stem-loops are listed in Table 1. For 15 of the 18 class I stem-loops, the centroid structures with the lowest RMSD values among the 20 clusters belonged to their most populated first clusters (Figure 4). The centroid structures of the second and third most populated clusters for 1FHK and 1JTW, respectively, had the lowest RMSD values. The fractions of these clusters were 0.21 and 0.16 for 1FHK and 1JTW, respectively, which are still major components of their trajectories. For 2EVY, the centroid with the lowest RMSD value of 3.7 Å was in the 17th cluster due to the instability of the folded state of the model. The centroid structures with the lowest RMSD always have a Q of 1.0 and RMSD values for the molecules of <4.4 Å (Figure 4). In the stem region, the RMSD values of the centroids were <2.0 Å, except for 1I46 (2.2 Å) and 2LDL (3.0 Å) (Table 1). In contrast to the stem regions, the loop regions tended to have high RMSD values of approximately 4 Å (Table 1). This trend of inaccuracy in loop regions is clearly visible in Figure 4, where the loops were disrupted relative to their respective experimental structures despite the stem regions being nearly perfectly formed. Tan et al. reported folding simulations for 1ZIH and 2KOC with the DESRES-RNA force field and the TIP4P-D explicit solvent model using the simulated tempering technique, where the structures with stem and loop RMSDs <2 Å were substantially sampled at low temperatures. Thus, using the GB-neck2 implicit solvent model decreased the modeling accuracy of the loop region.

Figure 4. Comparison of class I stem-loop structures between the experiment (blue) and cluster centroid (red) exhibiting the lowest non-hydrogen root mean square deviation (RMSD) values for the entire molecule. Below the RNA name of each structure, the cluster number, RMSD value, and fraction of native base pairs (Q) are shown and separated by slashes. The k-means method with k = 20 was used to cluster the RNA structures sampled from the three independent molecular dynamics simulations starting from extended conformations.

Table 1. Fraction of the Total Trajectory, Non-Hydrogen Root Mean Square Deviation (RMSD; Å) for the Entire Molecule, Stem, and Loop Regions, and the Fraction of Native Base Pairs Q of the Centroids for the Three Most Populated Clusters of the 18 Class I Stem-Loops

| Model | 1st: Fraction | 1st: Overall | 1st: Stem | 1st: Loop | 1st: Q | 2nd: Fraction | 2nd: Overall | 2nd: Stem | 2nd: Loop | 2nd: Q | 3rd: Fraction | 3rd: Overall | 3rd: Stem | 3rd: Loop | 3rd: Q |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1R4H | 0.16 | 3.5 | 1.8 | 4.5 | 1.0 | 0.10 | 4.9 | 1.8 | 5.5 | 1.0 | 0.09 | 4.2 | 2.9 | 4.5 | 0.0 |
| 1IDV | 0.25 | 3.7 | 1.4 | 4.5 | 1.0 | 0.16 | 3.7 | 2.9 | 3.8 | 0.3 | 0.05 | 8.9 | 9.8 | 2.8 | 0.0 |
| 1ZIH | 0.94 | 2.5 | 1.5 | 2.5 | 1.0 | 0.02 | 7.0 | 6.6 | 5.7 | 0.0 | 0.01 | 11.7 | 13.6 | 4.3 | 0.0 |
| 1I46 | 0.42 | 3.6 | 2.2 | 4.7 | 1.0 | 0.07 | 6.9 | 6.5 | 4.1 | 0.0 | 0.05 | 13.7 | 14.7 | 4.8 | 0.0 |
| 1ESH | 0.74 | 3.5 | 2.0 | 4.5 | 1.0 | 0.09 | 6.0 | 5.9 | 4.8 | 0.0 | 0.03 | 15.6 | 17.1 | 3.9 | 0.0 |
| 1F85 | 0.75 | 3.3 | 1.3 | 4.8 | 1.0 | 0.16 | 4.2 | 3.4 | 4.8 | 0.8 | 0.02 | 5.1 | 4.0 | 4.5 | 0.6 |
| 1FHK | 0.28 | 4.4 | 1.1 | 5.3 | 1.0 | 0.21 | 3.1 | 0.9 | 3.7 | 1.0 | 0.12 | 8.4 | 5.1 | 7.2 | 0.0 |
| 2EVY | 0.07 | 15.8 | 17.2 | 5.4 | 0.0 | 0.07 | 13.2 | 14.7 | 4.3 | 0.0 | 0.06 | 15.3 | 17.0 | 4.8 | 0.0 |
| 2KOC | 0.46 | 3.1 | 1.3 | 4.5 | 1.0 | 0.18 | 4.2 | 1.5 | 6.5 | 1.0 | 0.03 | 14.5 | 15.1 | 4.3 | 0.0 |
| 2Y95 | 0.56 | 2.3 | 1.2 | 3.6 | 1.0 | 0.18 | 5.5 | 4.7 | 5.2 | 0.0 | 0.05 | 6.4 | 6.6 | 4.1 | 0.0 |
| 1OQ0 | 0.24 | 3.0 | 1.9 | 4.2 | 1.0 | 0.07 | 14.8 | 15.8 | 4.5 | 0.0 | 0.06 | 17.8 | 18.8 | 4.7 | 0.0 |
| 1XWP | 0.32 | 4.2 | 1.4 | 4.9 | 1.0 | 0.06 | 18.8 | 20.0 | 8.3 | 0.0 | 0.05 | 18.0 | 19.4 | 6.7 | 0.0 |
| 1JTW | 0.27 | 7.3 | 6.9 | 5.6 | 0.0 | 0.17 | 9.0 | 9.3 | 5.7 | 0.0 | 0.16 | 2.2 | 1.4 | 2.1 | 1.0 |
| 1KKA | 0.55 | 4.1 | 1.3 | 4.8 | 1.0 | 0.05 | 20.0 | 22.5 | 5.3 | 0.0 | 0.05 | 16.2 | 17.9 | 6.0 | 0.0 |
| 2RPK | 0.55 | 3.8 | 1.1 | 4.4 | 1.0 | 0.04 | 13.6 | 14.1 | 7.1 | 0.0 | 0.04 | 22.8 | 25.2 | 8.1 | 0.0 |
| 1SZY | 0.36 | 3.0 | 1.1 | 2.9 | 1.0 | 0.05 | 21.6 | 24.0 | 7.2 | 0.0 | 0.04 | 20.3 | 22.1 | 7.3 | 0.0 |
| 1PJY | 0.83 | 2.6 | 1.8 | 3.9 | 1.0 | 0.02 | 23.6 | 25.1 | 5.6 | 0.0 | 0.02 | 26.8 | 28.6 | 6.0 | 0.0 |
| 2LDL | 0.15 | 4.4 | 3.0 | 3.9 | 1.0 | 0.07 | 28.9 | 32.1 | 6.0 | 0.0 | 0.06 | 28.3 | 31.2 | 8.0 | 0.0 |
a Cluster analysis was performed using the k-means method with k = 20. For each model, the cluster whose centroid structure had the lowest RMSD value among the 20 clusters is highlighted in bold.

b For 2EVY, the cluster with the lowest RMSD value was the 17th cluster, where the fraction of the cluster, the RMSD for the entire molecule, stem, and loop regions, and the Q value of the centroid were 0.04, 3.7 Å, 1.4 Å, 4.6 Å, and 1.0, respectively.
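As a small worked example of how the lowest-RMSD centroid is identified among population-ranked clusters, the Python sketch below uses the tabulated top-three values for 1FHK and 1JTW. It is illustrative only: the function name is hypothetical, and only the three most populated clusters are included, whereas the analysis in the text considers all 20 clusters (which is why 2EVY, whose best centroid lies in the 17th cluster, cannot be reproduced from the tabulated rows alone).

```python
def best_cluster_rank(centroids):
    """Return the 1-based population rank of the centroid with the lowest overall RMSD.

    `centroids` is a list of (fraction, overall_rmsd) tuples in any order;
    ranks are assigned by decreasing fraction, as in the cluster numbering.
    """
    ranked = sorted(centroids, key=lambda c: c[0], reverse=True)
    rmsds = [rmsd for _, rmsd in ranked]
    return rmsds.index(min(rmsds)) + 1


if __name__ == "__main__":
    # (fraction, overall RMSD in angstroms) for the three most populated clusters.
    top_three = {
        "1FHK": [(0.28, 4.4), (0.21, 3.1), (0.12, 8.4)],
        "1JTW": [(0.27, 7.3), (0.17, 9.0), (0.16, 2.2)],
    }
    for model, centroids in top_three.items():
        print(model, best_cluster_rank(centroids))
```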

The centroid structures with the lowest RMSD values for the class II stem-loops, along with their cluster numbers, RMSD values for the molecules, and Q values are shown in Figure 5. The fraction of the trajectory, RMSDs for all, stem, and hairpin loop regions, and Q values for the top three clusters for the class II stem-loops are listed in Table 2. The RMSD values for the stems that directly connect to hairpin loops (stem 1) and those close to the 5′ and 3′ termini of the RNA (stem 2) are also listed. The conformations with Q = 1.0 were sampled for five class II stem-loops: 17RA, 2KF0, 1ANR, 1R2P, and 1N8X (Figure 5). Among the five folded models, the centroid structures of the most populated clusters for 17RA, 2KF0, 1ANR, and 1N8X gave the lowest RMSD values, with 17RA and 2KF0 at ≤3 Å. For 1ANR and 1N8X, the centroid structures had slightly higher RMSD values at 5.9 and 4.8 Å, with Q values of 1.0 and 0.7, respectively. For 1R2P, the centroid structure of the 16th cluster had the lowest RMSD value of 8.3 Å with Q = 1.0. These results showed that, with the current DESRES-RNA force field in the GB-neck2 implicit solvent model, the modeling accuracy was high for the stem regions but low for the hairpin and internal loop regions and for the relative orientation between the two stem regions.

Figure 5. Comparison of the class II stem-loop structures with bulges or internal loops between the experiment (blue) and cluster centroid (red) exhibiting the lowest non-hydrogen root mean square deviation (RMSD) for the entire molecule. For models 1ESY, 2JWV, and 1R7W, which did not correctly fold, the centroid structure of the first cluster is shown. Below the RNA name of each structure, the cluster number, RMSD value, and fraction of native base pairs (Q) are shown and separated by slashes. The k-means method with k = 20 was used to cluster the RNA structures sampled from the three independent molecular dynamics simulations starting from extended conformations.

Table 2. Fraction of the Total Trajectory, Non-Hydrogen Root Mean Square Deviation (RMSD; Å) for the Entire Molecule, Stems 1 and 2, and Loop Regions, and the Fraction of Native Base Pairs Q of the Centroids for the Three Most Populated Clusters of the Eight Class II Stem-Loops

| Model | 1st: Fraction | 1st: Overall | 1st: Stem 1 | 1st: Stem 2 | 1st: Loop | 1st: Q | 2nd: Fraction | 2nd: Overall | 2nd: Stem 1 | 2nd: Stem 2 | 2nd: Loop | 2nd: Q | 3rd: Fraction | 3rd: Overall | 3rd: Stem 1 | 3rd: Stem 2 | 3rd: Loop | 3rd: Q |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1ESY | 0.23 | 14.2 | 9.5 | 7.5 | 6.2 | 0.1 | 0.10 | 9.1 | 6.6 | 5.8 | 5.4 | 0.0 | 0.08 | 11.0 | 8.7 | 9.4 | 6.1 | 0.0 |
| 17RA | 0.33 | 2.8 | 0.9 | 1.5 | 3.4 | 1.0 | 0.11 | 4.2 | 0.7 | 1.6 | 3.9 | 1.0 | 0.05 | 21.4 | 9.4 | 24.1 | 4.1 | 0.0 |
| 2KF0 | 0.47 | 3.0 | 1.0 | 0.7 | 4.7 | 0.9 | 0.04 | 25.3 | 17.4 | 30.2 | 6.5 | 0.0 | 0.04 | 22.3 | 14.0 | 25.3 | 5.9 | 0.0 |
| 1ANR | 0.35 | 5.9 | 1.4 | 1.3 | 5.1 | 1.0 | 0.10 | 22.6 | 16.2 | 25.2 | 7.1 | 0.0 | 0.08 | 25.0 | 18.3 | 25.5 | 8.1 | 0.0 |
| 2JWV | 0.07 | 27.6 | 15.9 | 33.6 | 6.3 | 0.0 | 0.07 | 32.9 | 15.7 | 42.2 | 6.9 | 0.0 | 0.07 | 32.6 | 16.0 | 39.6 | 7.7 | 0.0 |
| 1R2P | 0.09 | 32.7 | 17.0 | 38.3 | 6.2 | 0.0 | 0.07 | 37.1 | 16.7 | 46.0 | 5.5 | 0.0 | 0.07 | 29.5 | 15.9 | 33.7 | 4.3 | 0.0 |
| 1R7W | 0.24 | 18.9 | 14.2 | 15.5 | 4.2 | 0.0 | 0.20 | 13.9 | 0.8 | 16.9 | 2.6 | 0.4 | 0.09 | 18.6 | 15.5 | 14.6 | 4.0 | 0.0 |
| 1N8X | 0.32 | 4.8 | 3.5 | 2.0 | 2.6 | 0.7 | 0.16 | 5.9 | 1.8 | 6.4 | 2.1 | 0.5 | 0.13 | 9.7 | 3.9 | 10.6 | 3.2 | 0.4 |
a Cluster analysis was performed using the k-means method with k = 20. For each model, the cluster with a centroid structure that had the lowest RMSD value among the 20 clusters is highlighted in bold. For the models 1ESY, 2JWV, and 1R7W, the RNA did not fold into the structures with a Q value of 1.0; therefore, the clusters with the lowest RMSD values are not highlighted in bold in this table.

b For 1R2P, the cluster with the lowest RMSD value was the 16th cluster, where the fraction of the cluster, the RMSD for the entire molecule, stem 1, stem 2, and loop regions, and the Q value of the centroid were 0.04, 8.3 Å, 1.6 Å, 4.5 Å, 3.0 Å, and 1.0, respectively.

Folding Kinetics

Conventional MD simulations were performed at 298 K to investigate the folding of various RNA stem-loops. One advantage of conventional MD simulations when studying biomolecular folding compared with enhanced sampling simulations, such as replica-exchange MD simulations, is the relatively easy analysis of folding/unfolding kinetics and mechanisms in a time-dependent manner.

Recent experimental studies have shown that the folding of simple stem-loops can occur on time scales ranging from microseconds to seconds. In the MD simulations using the Langevin thermostat with the implicit GB solvent model presented in this report, the folding of the class I stem-loops occurred on a time scale of several microseconds. With a collision frequency γ of 1 ps⁻¹, GB implicit solvent models accelerate conformational changes by approximately 10–100-fold relative to experiments. Therefore, our simulations correspond to estimated folding times of tens to hundreds of microseconds for simple class I stem-loops, which lie at the lower end of the experimental range.
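The rescaling argument above is simple arithmetic, sketched here in Python. The function name is hypothetical, and the example input of 2.4 μs is only an illustrative "several microseconds" value; the 10–100-fold range is the acceleration factor quoted in the text.

```python
def estimated_folding_time_us(simulated_time_us, speedup_range=(10.0, 100.0)):
    """Scale a folding time observed in GB simulations by the assumed speedup range.

    Returns (lower, upper) estimates in microseconds.
    """
    low, high = speedup_range
    if not 0 < low <= high:
        raise ValueError("speedup_range must satisfy 0 < low <= high")
    return simulated_time_us * low, simulated_time_us * high


if __name__ == "__main__":
    lower, upper = estimated_folding_time_us(2.4)
    print(f"Estimated folding time: {lower:.0f}-{upper:.0f} us")
```

A folding event seen after a few microseconds of implicit-solvent simulation therefore maps to roughly tens to hundreds of microseconds of real time under this assumption.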

Folding Mechanisms and Pathways

We show two representative trajectories of RNA stem-loops and discuss their folding mechanisms: 1SZY from the class I stem-loops and 1ANR from the class II stem-loops. The trajectories of the other stem-loops are shown in the Supporting Information (Figures S1–S26).

The time evolutions of the RMSD for the stem region and base pairs in the first trajectory of 1SZY in the class I stem-loops, as well as snapshots at selected time points, are shown in Figure 6. A simulation movie of this trajectory is provided in the Supporting Information (Movie S1). The RNA has seven base pairs in the stem region and a loop length of 7 nt. In the trajectory, folding occurred at approximately 2.4 μs from random conformations via a two-state transition without an intermediate state (Figure 6A). The stem-loop zipped from the loop-closing base pair (G7-C15) toward the terminal side of the chain (Figure 6A,C). The loop-closing G7-C15 base pair formed first at 2.4314 μs from a random conformation, followed by the two adjacent base pairs G6-C16 and G5-C17 at 2.433 μs. The remaining stem base pairs formed by 2.4736 μs. Before folding at approximately 2.4 μs, the molecule formed almost no base pairs, and the RMSD values for the molecule and stem region fluctuated between 10 and 30 Å, except at 1.3–1.9 μs, where three misaligned base pairs (G1-U9, G2-C8, and C3-G7) were clearly formed. Misaligned base pairs were also observed in the second and third trajectories of the molecule (Figure S16).

Figure 6. Folding trajectory of the 1SZY class I stem-loop from the first molecular dynamics simulation using the DESRES-RNA force field and the GB-neck2 model. (A) (Upper) Non-hydrogen root mean square deviation (RMSD) for the stem region. The dashed line indicates an RMSD of 2 Å. (Lower) Base pairs formed during simulation, where the native and non-native base pairs are represented by red and blue, respectively. (B) Secondary structure of 1SZY. The stem region is indicated in red. (C) Snapshots of the trajectory at various simulation time points. RNA backbones are represented by ribbons. The nucleotide ring structures are colored according to nucleotide type: adenine, red; cytosine, yellow; guanine, green; and uracil, cyan.

The time evolutions of the RMSD for the stem regions and base pairs in the third trajectory of 1ANR, as a representative trajectory for class II stem-loops, are shown in Figure 7. The figure also includes snapshots during the folding transition observed in this trajectory. A simulation movie of the trajectory is provided in the Supporting Information (Movie S2). The RNA consists of 29 nucleotides with an asymmetrical internal loop. Folding of the molecule from its extended conformation clearly occurred in a two-step manner: stem 1 formed first, followed by stem 2. Both stems likely formed via the zipping mechanism, starting from the hairpin loop-closing base pair for stem 1 and from the internal loop-closing base pair for stem 2. In the trajectory, four base pairs in the stem 1 region (C13-G20, G12-C21, A11-U22, and G10-C23) were initially formed by 0.998 μs from a random conformation (Figure 7C). This stem 1 duplex persisted for approximately 1 μs while the two strands in the stem 2 region fluctuated widely, corresponding to an intermediate state in the folding process. At 1.998 μs, the G5-C25 base pair in stem 2 and the A6-U24 base pair in the internal loop region of the molecule were formed, which promoted the formation of the remaining base pairs in the stem 2 region. The details of the A6-U24 base pair in the folded structures are described in the Supporting Information. All stem 2 base pairs were formed by 2.4 μs.

Figure 7. Folding trajectory of the 1ANR class II stem-loop from the third molecular dynamics simulation using the DESRES-RNA force field and the GB-neck2 model. (A) (Upper) Non-hydrogen root mean square deviation (RMSD) for the two stem regions, where stem 1 (red lines) and stem 2 (orange lines) represent the double helical regions directly connecting the hairpin loop and those near the 5′ and 3′ termini of the RNA, respectively. The dashed line indicates an RMSD of 2 Å. (Lower) Base pairs formed during simulations, where the native and non-native base pairs are represented in red and blue, respectively. (B) Secondary structure of 1ANR. Stems 1 and 2 regions are indicated in red and orange, respectively. (C) Snapshots of the trajectory at various simulation time points. RNA backbones are represented by ribbons. The nucleotide ring structures are colored according to nucleotide type: adenine, red; cytosine, yellow; guanine, green; and uracil, cyan.

The folding mechanisms of the two-state transition and zipping from the loop-closing base pairs observed in 1SZY were consistently reproduced across other class I stem-loops (Figures S1–S18). For class II stem-loops, the two-step folding pathway observed in 1ANR was essentially the same for the successful folding trajectories of 17RA, 2KF0, 1R2P, and 1N8X (Figures S20, S21, S24, and S26, respectively).
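The zipping order described above can be read off a trajectory by tracking when each native base pair first appears. The following is a minimal sketch under stated assumptions: base pairs are represented as hypothetical `(i, j)` residue-index tuples, the seven native pairs of 1SZY are inferred here from the G7-C15 loop-closing pair and the seven-pair stem (only the first three are named in the text), and the four frames are a toy stand-in for a real trajectory using the time points quoted for 1SZY. The function names are illustrative and are not the author's analysis code.

```python
def fraction_native(formed_pairs, native_pairs):
    """Fraction of native base pairs (Q) present in one frame."""
    return len(formed_pairs & native_pairs) / len(native_pairs)


def first_formation_times(frames, native_pairs):
    """Map each native pair to the first time (in microseconds) it is observed."""
    first_seen = {}
    for time_us, formed_pairs in frames:
        for pair in formed_pairs & native_pairs:
            first_seen.setdefault(pair, time_us)
    return first_seen


def formation_order(first_seen):
    """Native pairs sorted by first formation time.

    Ties are broken by placing the pair nearer the hairpin loop
    (larger 5'-side index) first.
    """
    return sorted(first_seen, key=lambda pair: (first_seen[pair], -pair[0]))


# Assumed native stem of 1SZY: G7-C15 closes the loop, six more pairs follow.
native = {(7, 15), (6, 16), (5, 17), (4, 18), (3, 19), (2, 20), (1, 21)}

# Toy frames: (time in microseconds, set of formed base pairs).
frames = [
    (1.5000, {(1, 9), (2, 8), (3, 7)}),       # misaligned, non-native pairs
    (2.4314, {(7, 15)}),                      # loop-closing pair forms
    (2.4330, {(7, 15), (6, 16), (5, 17)}),    # two adjacent pairs follow
    (2.4736, set(native)),                    # full stem formed
]

q_values = [fraction_native(pairs, native) for _, pairs in frames]
order = formation_order(first_formation_times(frames, native))
print("Q per frame:", [round(q, 2) for q in q_values])
print("Formation order:", order)
```

In this toy input the loop-closing pair appears first and the terminal pair last, which is the pattern the text calls zipping. Non-native pairs such as the misaligned G1-U9 do not contribute to Q.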

Three to four misaligned base pairs were frequently observed for other class I stem-loops. Without continuously misaligned base pairs, class I stem-loops displayed substantial molecular fluctuations, with RMSD values for the molecules and stem regions fluctuating rapidly over 10 Å, except for 1JTW. For 1JTW, the overall RMSD values remained stable at approximately 7 and 9 Å for most of the simulation time in the second and first trajectories, respectively (Figure S13a). The corresponding structures were found in the first (RMSD ≈ 7 Å) and second clusters (RMSD ≈ 9 Å), where the guanosine bases formed stable hydrogen bonds with the backbone phosphate and were simultaneously stacked with neighboring residues (Figure S27).

Trajectories that Failed to Form Stable Native Stems: 2EVY, 1ESY, 2JWV, and 1R7W

The MD simulations for 2EVY, 1ESY, 2JWV, and 1R7W failed to form stable native stem-loop conformations. The MD simulations of the class I stem-loop 2EVY with the GB-neck2 model sampled structures with Q = 1.0; however, these structures were unstable and belonged to the 17th cluster (Figure and Table ). The centroid structures of the first and second clusters adopted expanded conformations (Figure S28). The stem of the molecule comprises five base pairs, beginning with a U-G wobble base pair at the loop-closing position followed by two A-U base pairs extending toward the terminal end. Given that other stem-loops tended to initiate base pair formation from the loop-closing pair and proceed toward the terminus via the zipping mechanism, stable base pairing near the loop region likely plays a crucial role in proper stem-loop folding. The current simulation model using DESRES-RNA and GB-neck2 may underestimate the stability of the G-U wobble and A-U pairs, which could limit the sampling of stably folded structures for 2EVY.

Class II stem-loop 2JWV could not fold into a structure with Q = 1.0 within the 3 × 8 μs MD simulations (Figure S23). While transient formation of the stem 1 region was observed, stable base pairs, including non-native base pairs, did not form during the simulations (Figure S23). The centroid structure of the first cluster adopted an expanded conformation (Figure ), and its population fraction was only 0.07, which is lower than the values of other stem-loop models. Stem 1 consists of four base pairs: a loop-closing U-G wobble pair followed by one C-G and two A-U base pairs to the internal loop. Because this region contains only a single G-C pair, like 2EVY, the current simulation model failed to form a stable duplex in stem 1, resulting in unsuccessful global folding of the molecule.

Class II stem-loops 1ESY and 1R7W were also unable to fold into a structure with Q = 1.0 within the 3 × 8 μs MD simulations (Figures S19 and S25, respectively). In the unsuccessful folding trajectories of 1ESY and 1R7W, misaligned base pairs were stably formed with reduced RMSD fluctuations. These misfolded structures formed dominant clusters, which are shown in Figure S29 along with their secondary structure diagrams. The persistence of the misfolded states likely prevents the successful folding of both RNA molecules.

These cases underscore the need for further refinement of the force field parameters, longer simulation durations, and the integration of enhanced sampling techniques to achieve more reliable and accurate RNA folding in MD simulations.

Comparison of Loop and Bulge Structures with Explicit Solvent Simulations and NMR Data

We investigated the potential causes of the loop and bulge structure modeling inaccuracies observed with the DESRES-RNA force field and the GB-neck2 implicit solvent model. Residues directly involved in binding to proteins, nucleic acids, and small metabolite molecules are predominantly found in single-stranded loop regions. Enhancing the modeling accuracy of these loop and bulge regions is crucial for advancing the use of MD simulations in RNA biology. Here, using four representative stem-loops, 1F85, 1FHK, 2KOC (class I), and 1R2P (class II), the sampled loop and bulge structures from the MD simulations using the DESRES-RNA force field with the GB-neck2 implicit solvent model were compared with those obtained from 1 μs MD simulations using the same RNA force field combined with the TIP4P-D explicit solvent model. These RNA molecules were chosen based on their distinct structural features. The analysis aimed to determine whether the observed inaccuracies could be primarily attributable to the RNA force field or the implicit solvent model.

Trajectories of the RMSDs for these stem-loops in the explicit solvent MD simulations are shown in Figure S30. The stem regions of the three selected class I stem-loops remained stable, with RMSD values consistently <2.0 Å throughout the 1 μs explicit solvent simulations. For 2KOC, the loop conformation remained stable, with an RMSD < 1.5 Å for most of the simulation time, averaging 0.7 Å. In contrast, the loop regions of 1F85 and 1FHK exhibited a marked increase in RMSD to approximately 3.5 Å at around 0.5 μs, which persisted until the end of the simulations. For class II stem-loop 1R2P, the overall RMSD increased to approximately 8.0 Å (see Figure S33 for the resulting overall structure). Elevated RMSD values were observed in the terminal stem 2 (∼4.5 Å) and bulge regions (∼3.3 Å), whereas the stem 1 and loop regions remained stably formed, with RMSD values consistently below 2.0 Å.

The loop structures of four stem-loops derived from reference NMR models, explicit solvent MD snapshots at 1 μs, and cluster centroids with the lowest overall RMSD values in the GB-neck2 implicit solvent MD are shown in Figure 8. For 1F85, the NMR structure features a sheared G6-A9 base pair, a critical interaction governing the loop conformation of the molecule. This noncanonical pairing was found in the structure of the explicit solvent MD simulation. However, U8 adopted a different orientation relative to the NMR structure, contributing to the elevated RMSD values in the loop region. In the structure of the GB-neck2 implicit solvent MD simulations, the noncanonical sheared base pair G6-A9 was completely disrupted, and A9 pointed toward the outside of the loop. In the first cluster of the implicit solvent simulations, the fraction of structures forming the sheared G6-A9 base pair was less than 1%.

Figure 8. Comparison of the loop and bulge structures in four representative RNA stem-loops from NMR and molecular dynamics (MD) simulations with explicit and implicit solvent models. Loop regions of 1F85, 1FHK, 2KOC, and 1R2P, as well as the bulge region of 1R2P, are shown for three structural sources: NMR-derived structures (left column), simulations using the DESRES-RNA force field with the TIP4P-D explicit solvent model (middle column), and simulations using the same force field with the GB-neck2 implicit solvent model (right column). Dashed black lines indicate hydrogen bonds. In the NMR structures, key hydrogen bonds stabilizing the loop conformations are highlighted with red dashed circles. For the TIP4P-D explicit solvent MD simulations, snapshots at 1 μs are shown with non-hydrogen atom root mean square deviation (RMSD) values for the loop and bulge regions. For the GB-neck2 implicit solvent MD simulations, the centroid structures with the lowest RMSD values for the entire molecule are shown with their cluster numbers and RMSD values for the loop and bulge regions. In the structures from the explicit and implicit solvent MD simulations, hydrogen bonds consistent with those observed in the NMR structures are highlighted with red dashed circles. In contrast, hydrogen bonds that are formed exclusively in the simulated structures are highlighted with blue dashed circles.

For 1FHK, the NMR structure has sheared G4-U11 and G5-A10 mismatch base pairs and a contiguous base stacking from G7 to U11. The reported absence of a nuclear Overhauser effect (NOE) between the imino protons of G4 and U11 in NMR experiments suggests that G4 and U11 do not form a wobble base pair. In the explicit solvent MD simulation, although the contiguous base stacking from G7 to U11 was preserved, the two sheared base pairs, G4-U11 and G5-A10, were not formed. In the implicit solvent MD simulations using the GB-neck2, G4-U11 adopted a wobble base pair, and A10 pointed outward from the loop without pairing with G5. The contiguous base stacking pattern was also disrupted. In the second cluster of the implicit solvent simulations, whose centroid structure exhibited the lowest RMSD, less than 1% of the structures formed the sheared G4-U11 and G5-A10 mismatch base pairs. In contrast, G4 and U11 formed wobble base pairs in 87% of the structures within the cluster.

For 2KOC, the NMR structure features a trans-wobble base pair, U6-G9, which includes an unusual 2′-OH hydrogen bond. In addition, a hydrogen bond between the 2′-OH of U7 and the N7 of G9 is also formed. In the structure from the explicit solvent MD simulation, these key interactions between U6 and G9, as well as between U7 and G9, were stably maintained, contributing to the low RMSD values for the loop region. In contrast, the implicit solvent MD simulation led to the disruption of these interactions, with the C8 and G9 bases adopting orientations that diverged from those in the NMR structure. In the first cluster of the implicit solvent simulations, the trans-wobble U6-G9 base pair was formed in less than 1% of the structures within the cluster.

For the 1R2P class II stem-loop, chemical shift and NOE data from NMR experiments indicate that the GAAA tetraloop (G15-A18) adopts a stable conformation, with A16 tightly stacked. In the NMR structure, G15 and A18 are in close proximity, forming a partial hydrogen bond between G15 (N3) and A18 (N6-H). This conformation was preserved in the explicit solvent MD simulation, yielding a low loop RMSD of 1.4 Å. Although the G15-A18 hydrogen bond observed in the NMR structure was disrupted, three new hydrogen bonds were formed: two involving G15 (N2-H2), one with A18 (N3) and another with its backbone phosphate oxygen atom, and a third between G15 (2′-OH) and A17 (N7), potentially contributing to loop rigidity. In contrast, the implicit solvent MD simulation showed weakened stacking of A16 with A17 and an outward orientation of A18. The nucleobase hydrogen bond between G15 and A18 observed in the NMR structure was lost and replaced by hydrogen bonds between G15 (N1-H/N2-H) and a backbone phosphate oxygen of A18, resulting in an elevated loop RMSD of 3.0 Å.

The structures of the bulge region in the 1R2P class II stem-loop obtained from the explicit and implicit solvent MD simulations are also shown in Figure 8. In the bulge region of 1R2P, comprising residues U9, A24, C25, and G26, the U9 residue can, in principle, form either a Watson–Crick base pair with A24 or a wobble base pair with G26. However, NMR experiments revealed no evidence of an interaction between U9 and A24 or between U9 and G26. In the NMR structure, the G26 residue adopts a syn conformation and is flipped into the major groove of duplex stem 2, precluding canonical base pairing. In the explicit solvent MD simulation, the bulge region exhibited a slightly elevated average RMSD of 3.3 Å. The G26 residue consistently adopted the syn conformation throughout the trajectory. Over the 1 μs simulation, U9-A24 and U9-G26 base pairs were observed in 55% and 20% of the sampled structures, respectively. The U9-A24 pairing frequency in the explicit solvent MD simulation may be overestimated relative to experimental observations. In contrast, the GB-neck2 implicit solvent MD simulation showed markedly different behavior: in only 5% of the structures in the 16th cluster did G26 adopt the syn conformation, and 81% formed a U9-G26 base pair. The RMSD of the bulge region in the centroid structure reached 6.5 Å, which is substantially higher than that observed in the explicit solvent simulation. These structural features observed in the GB-neck2 implicit solvent MD simulation are clearly inconsistent with experiments, suggesting limitations of the implicit solvent model in accurately capturing native structural preferences within the bulge region.

MD simulations using the DESRES-RNA force field with the TIP4P-D explicit solvent model partially reproduced key hydrogen bonds and base stacking, stabilizing distinct loop conformations for four representative stem-loops. Conversely, MD simulations using the same RNA force field with the GB-neck2 implicit solvent model completely disrupted these key interactions in the loop regions. Although only a single bulge model was examined in this study, the bulge conformation generated by the explicit solvent MD simulation more closely resembled the NMR structure in comparison to that obtained from the implicit solvent MD simulation. While longer simulations and application of enhanced sampling techniques across a broader set of RNA stem-loop structures are needed in explicit solvent simulations for a more rigorous comparison, the results presented here indicate that the implicit solvent representation substantially reduces modeling accuracy in the loop and bulge regions of the RNA stem-loops.

Comparison of Noncanonical and Transient Base Pairing with NMR Data

Several of the RNA stem-loops examined in this study include regions with potential base pairing that are not detected experimentally or whose base pairing behavior is modulated by pH conditions or ligand binding. We analyzed these base pairing patterns in the MD simulations using the DESRES-RNA force field and the GB-neck2 implicit solvent model and compared them with NMR data, which are described in the Supporting Information. Since NMR data reflect ensemble-averaged conformations in solution, whereas MD simulations yield time-resolved trajectories, direct comparison remains nontrivial due to differences in temporal resolution and conformational averaging. Nevertheless, many of the simulated base pairing patterns appeared to deviate from those observed in the NMR experiments. These findings, together with the results described above, suggest that further refinement of both the force field and the implicit solvent model is necessary to improve the accuracy of the RNA loop modeling.

Flexibility of the ssRNA rU40

To further investigate potential causes of loop structure modeling inaccuracies observed with the DESRES-RNA force field and GB-neck2 implicit solvent model, the flexibility of a single-stranded RNA was evaluated from MD simulations. Single-stranded loop regions generally exhibit greater flexibility than duplex stem regions. Consequently, the intrinsic flexibility of RNA models for a given force field in MD simulations can significantly affect the results of loop structure modeling. We evaluated the FRET-averaged end-to-end distance ⟨R FRET⟩ of rU40 from implicit solvent MD simulations with the DESRES-RNA and AMBER-OL3 RNA force fields for comparison (see Models and Simulation Methods for details of ⟨R FRET⟩). Lower ⟨R FRET⟩ values indicate higher flexibility in the RNA model. The ⟨R FRET⟩ values of rU40 at various ionic concentrations have previously been evaluated by experiments and by MD simulations using the DESRES-RNA and AMBER-OL3 force fields with explicit water models.

The ⟨R FRET⟩ values calculated from the GB-neck2 implicit solvent and explicit solvent MD simulations and experiments at various ionic concentrations are shown in Figure 9. The end-to-end distance trajectories in the MD simulations with GB-neck2 at a 0.15 M ionic concentration are provided in the Supporting Information (Figure S31). The ⟨R FRET⟩ values calculated from the simulations with the DESRES-RNA force field and GB-neck2 were consistently higher than the experimental values across all ionic concentrations examined in this study, particularly at lower ionic concentrations. At 0.2 M ionic concentration, for example, the ⟨R FRET⟩ value for the DESRES-RNA with GB-neck2 was 82 Å, which is 1.3 times the experimental value of 64 Å. At 1.0 M ionic concentration in the GB-neck2 implicit solvent, the simulation with the DESRES-RNA yielded a ⟨R FRET⟩ value of 70 Å, which is close to the experimental value of 68 Å observed at 0.05 M NaCl. Similarly, the AMBER-OL3 force field combined with GB-neck2 also yielded higher ⟨R FRET⟩ values than the experimental results at ionic concentrations of 0.15 and 0.2 M. These values were slightly lower than those of the DESRES-RNA with GB-neck2. In contrast, the ⟨R FRET⟩ values obtained using explicit solvent models were significantly lower than the experimental values. For the DESRES-RNA force field with the TIP4P-D water model, the ⟨R FRET⟩ value at 0.05 M NaCl was close to the experimental value, whereas the value at 0.2 M NaCl was 41 Å, 0.64 times the experimental result. For the AMBER-OL3 force field with the TIP3P water model, the ⟨R FRET⟩ value at 0.2 M NaCl was 24 Å, corresponding to 0.38 times the experimental value. These results indicate that the flexibility of the RNA model is highly sensitive to solvent models, with ⟨R FRET⟩ values observed in the implicit solvent model being significantly higher than those in explicit solvent models.
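The ratios quoted in this paragraph follow directly from the reported ⟨R FRET⟩ values at 0.2 M. A short sketch of that arithmetic is given below. The dictionary keys and the helper name `ratios_to_experiment` are illustrative choices, and the numbers are the values stated in the text, not newly derived data.

```python
# <R_FRET> values (in angstroms) at 0.2 M ionic concentration, as quoted in the text.
r_fret_at_0p2_molar = {
    "experiment": 64.0,
    "DESRES-RNA + GB-neck2": 82.0,
    "DESRES-RNA + TIP4P-D": 41.0,
    "AMBER-OL3 + TIP3P": 24.0,
}


def ratios_to_experiment(values, reference_key="experiment"):
    """Return each model's <R_FRET> divided by the experimental value."""
    reference = values[reference_key]
    return {
        name: value / reference
        for name, value in values.items()
        if name != reference_key
    }


ratios = ratios_to_experiment(r_fret_at_0p2_molar)
for name, ratio in ratios.items():
    # A ratio above 1 means the chain is more extended (less flexible) than in experiment.
    trend = "more extended" if ratio > 1.0 else "more compact"
    print(f"{name}: {ratio:.2f} x experiment ({trend})")
```

The implicit solvent model lies above the experimental value and both explicit solvent models lie below it, which is the contrast the paragraph draws.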

Figure 9. FRET-averaged end-to-end distances ⟨R FRET⟩ (Å) of rU40 single-stranded RNA obtained from molecular dynamics (MD) simulations using the DESRES-RNA and AMBER-OL3 force fields with the GB-neck2 implicit and explicit solvent models at various ionic concentrations. Standard errors for the simulation data obtained using the GB-neck2 model are too small to be visible in the plot. Values for explicit solvent simulations were obtained from the plot in ref , where the TIP4P-D and TIP3P water models were used with the DESRES-RNA and AMBER-OL3 force fields, respectively, for the MD simulations of rU40 at 300 K. The experimental values were obtained from the plot reported in ref . In both the experiment and the explicit solvent MD simulations, NaCl was used as the background electrolyte.

Tan et al. reported that the single-stranded loop regions of the tetraloops 2KOC and 1ZIH repeatedly adopted conformations that were <0.8 Å RMSD from the experimental structures in simulated tempering MD simulations at 0.15 M NaCl using the DESRES-RNA force field and the TIP4P-D explicit water model. In our MD simulations using the DESRES-RNA force field and the GB-neck2 implicit solvent model, the loop region of 1ZIH frequently adopted conformations with an RMSD of less than 1.0 Å (Figure S3). In contrast, the loop region of 2KOC rarely adopted conformations with RMSD < 1.0 Å in the MD simulations using GB-neck2; instead, the RMSD values fluctuated around 5 Å for the conformations with Q > 0.8, which were almost indistinguishable from those with Q < 0.2 (Figure S9). The combination of DESRES-RNA and the TIP4P-D model showed higher RNA strand flexibility than observed experimentally, but its modeling accuracy for the loop region was higher than that of the combination of DESRES-RNA and the GB-neck2 model. Therefore, adjusting force field parameters to enhance the ssRNA flexibility in the implicit solvent model may improve the accuracy of loop region modeling.

Toward More Accurate RNA Stem-Loop Folding Simulations

This section highlights several factors that require refinement to improve the accuracy of RNA modeling. As discussed above, RNA structural dynamics in MD simulations using the DESRES-RNA force field are highly sensitive to the choice of the solvent model. In simulations with the GB-neck2 implicit solvent model, hydrogen bonding between bases in loop regions tends to be disrupted, and RNA molecules exhibit reduced flexibility. In contrast, simulations employing the TIP4P-D explicit solvent demonstrate partial preservation of base–base hydrogen bonding within loop regions and greater conformational flexibility. Furthermore, analyses of 2EVY and 2JWV suggest that the current model may underestimate the stability of G-U wobble base pairs and canonical A-U base pairs. It is worth noting that the GB-neck2 parameters for nucleic acids were optimized against Poisson–Boltzmann calculations using a training set composed exclusively of stable DNA and RNA duplex conformations formed by canonical Watson–Crick base pairing. This may contribute to deficiencies in modeling loop region interactions and in accurately capturing the folding behavior of diverse RNA sequences. Since loop structures are often stabilized by noncanonical base pairs with diverse conformations, expanding the GB-neck2 training set to include a broader range of conformations, such as single-stranded RNA, loop structures, and noncanonical base pairs, is essential for improving its accuracy and generalizability in modeling complex RNA structures beyond regular duplexes.

In the simulations conducted in this study, the nonpolar contribution to the solvation free energy was not considered. Linzer et al. approximated the nonpolar solvation effect by strengthening Lennard-Jones interactions between pairs of heavy atoms in bases, which improved loop region modeling accuracy with the AMBER-OL3 RNA force field and GB-neck2 implicit solvent model. Therefore, incorporating the nonpolar solvation term in energy calculations may also improve the performance of the DESRES-RNA force field with the GB-neck2 implicit solvent model, particularly in accurately capturing loop region conformations.

Finally, it is worth mentioning the necessity of incorporating the effects of divalent cations in MD simulations for RNA systems to accurately capture biologically relevant conformations and interactions. Divalent cations, such as Mg2+, play critical roles in RNA structures, folding, and functions. The ionic strength dependence of the persistence length of rU40 differs markedly between NaCl and MgCl2, with Mg2+ significantly increasing the rigidity of the ssRNA even at the same ionic strength. To assess the influence of Mg2+ ions on the RNA structure, explicit solvent MD simulations of the 1R2P stem-loop were performed in the presence and absence of Mg2+ ions. The simulation protocol and results are provided in the Supporting Information (Figures S32 and S33). The results underscore both the importance and the inherent challenges of accurately modeling Mg2+ ion interactions in MD simulations. In the current standard GB implicit solvent models, including the GB-neck2 model, the effects of monovalent ions are introduced using a mean-field approximation (the linearized Debye–Hückel approximation), whereas multivalent ions remain out of reach. Hybrid explicit ions/implicit solvent models that use the GB calculation framework may help tackle this problem.
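To illustrate what the mean-field treatment of monovalent salt amounts to, the sketch below computes the Debye screening length of a 1:1 electrolyte from the standard Debye–Hückel expression, κ² = 2·N_A·e²·I / (ε₀·ε_r·k_B·T), at the ionic concentrations discussed in this work. It is a textbook calculation and not the GB-neck2 or AMBER implementation. The function name is hypothetical, and the relative permittivity of 78.5 for water is an assumed value rather than a setting reported in the text.

```python
import math

# Physical constants in SI units.
ELEMENTARY_CHARGE = 1.602176634e-19      # C
BOLTZMANN = 1.380649e-23                 # J/K
AVOGADRO = 6.02214076e23                 # 1/mol
VACUUM_PERMITTIVITY = 8.8541878128e-12   # F/m


def debye_length_angstrom(ionic_strength_molar, temperature_k=298.0,
                          relative_permittivity=78.5):
    """Debye screening length (angstroms) of a bulk 1:1 electrolyte.

    relative_permittivity=78.5 is an assumed value for water near 298 K.
    """
    if ionic_strength_molar <= 0:
        raise ValueError("ionic strength must be positive")
    # Convert mol/L to number of ion pairs per cubic meter.
    number_density = ionic_strength_molar * 1000.0 * AVOGADRO
    kappa_squared = (2.0 * number_density * ELEMENTARY_CHARGE ** 2
                     / (VACUUM_PERMITTIVITY * relative_permittivity
                        * BOLTZMANN * temperature_k))
    return 1e10 / math.sqrt(kappa_squared)  # meters -> angstroms


# Ionic concentrations (M) that appear in the rU40 comparison.
for concentration in (0.05, 0.15, 0.2, 1.0):
    length = debye_length_angstrom(concentration)
    print(f"{concentration:.2f} M: Debye length ~ {length:.1f} angstroms")
```

In this picture salt enters only through a single screening length that depends on ionic strength, so site-specific binding of ions such as Mg2+ has no counterpart in the model, consistent with the limitation noted above.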

Conclusions

Atomic-level computer simulations of RNA systems are crucial for understanding RNA structure–dynamics–function relationships. RNA folding simulations provide stringent tests of whether current molecular mechanics force fields and simulation methodologies can accurately characterize large conformational changes in RNA over long-time-scale MD simulations. In this study, we report that the DESRES-RNA force field combined with the GB-neck2 implicit solvent model can successfully simulate the folding of stem regions in 23 RNA stem-loops with varying sequences and lengths, including those with bulges, internal loops, and noncanonical wobble base pairs. The combination of the DESRES-RNA force field and the GB-neck2 implicit solvent model also successfully reproduced RNA duplex formation of CGCGG, ACUGUCA, and CGACGCAG, starting from two separate complementary strands, with details of the simulation protocols and results included in the Supporting Information (Figures S34–S36 and Table S1). Accurate modeling of loop structures remains challenging in simulations using the implicit solvent model. In this study, only conventional MD simulations at room temperature were performed. Enhanced sampling MD simulations with the DESRES-RNA force field and GB-neck2 would provide a free energy landscape for RNA folding, offering valuable insights into folding mechanisms and potentially facilitating further optimization of RNA force field parameters. In MD simulations employing the DESRES-RNA force field and the GB-neck2 implicit solvent model, improving the conformational flexibility of single-stranded RNAs and the reproducibility of hydrogen bonds in loop regions is expected to increase the predictive accuracy of stem-loop folding and loop conformation modeling. This improvement could be achieved through optimization of GB parameters using a structurally diverse set of RNA molecules and by incorporating a nonpolar term in solvation free energy calculations. Accurate treatment of the interactions between RNA and divalent cations is also necessary for further advancements. The ability to recapitulate RNA stem folding of fundamental stem-loop motifs, which is reported in this study, represents a significant milestone toward accurately modeling RNA tertiary structures using MD simulations.

Acknowledgments

This research was financially supported by the Tokyo University of Science.

The input files used for MD simulations with AMBER22 are described in the Supporting Information. All data will be made available upon request.

The Supporting Information is available free of charge at https://pubs.acs.org/doi/10.1021/acsomega.5c05377.

  • Time evolutions of RMSD, fraction of native base pairs Q, formed base pairs, and probability distributions of RMSD values for class I and class II stem-loops; centroid structures of the first and second clusters of 1JTW and 2EVY; misfolded structures and corresponding secondary structure diagrams forming the first clusters for 1ESY and 1R7W; trajectories of 1 μs MD simulations for three class I stem-loops and one class II stem-loop; comparison of noncanonical and transient base pairing with NMR data; trajectories of the rU40 ssRNA end-to-end distance; explicit water molecular dynamics simulation of an RNA stem-loop in the presence of divalent magnesium ions; trajectories of 1 μs MD simulations for 1R2P in the absence and presence of magnesium ions; comparison of overall, loop, and bulge structures of 1R2P; MD simulations of RNA duplex formation; trajectories of three independent molecular dynamics simulations of CGCGG, ACUGUCA, and CGACGCAG RNA duplex formation; non-hydrogen atom RMSD, base pairing, and C3′-endo sugar pucker presence evaluated over the last 0.5 μs of the RNA duplex model simulations; descriptions of simulation movies; and input files used for AMBER MD simulations (PDF)

  • Movie S1: Simulation movie of a 1SZY class I stem-loop folding observed in the first molecular dynamics simulation using the DESRES-RNA force field and the GB-neck2 implicit solvent model (MP4)

  • Movie S2: Simulation movie of a 1ANR class II stem-loop folding observed in the third molecular dynamics simulation using the DESRES-RNA force field and the GB-neck2 implicit solvent model (MP4)

  • Movie S3: Simulation movie of a CGACGCAG RNA duplex formation in the first molecular dynamics simulation using the DESRES-RNA force field and the GB-neck2 implicit solvent model (MP4)

The author declares no competing financial interest.

