NAR Genomics and Bioinformatics. 2022 Nov 29;4(4):lqac088. doi: 10.1093/nargab/lqac088

Investigating RNA–protein recognition mechanisms through supervised molecular dynamics (SuMD) simulations

Matteo Pavan 1, Davide Bassani 2, Mattia Sturlese 3, Stefano Moro 4
PMCID: PMC9706429  PMID: 36458023

Abstract

Ribonucleic acid (RNA) plays a key regulatory role within the cell, cooperating with proteins to control the genome expression and several biological processes. Due to its characteristic structural features, this polymer can mold itself into different three-dimensional structures able to recognize target biomolecules with high affinity and specificity, thereby attracting the interest of drug developers and medicinal chemists. One successful example of the exploitation of RNA’s structural and functional peculiarities is represented by aptamers, a class of therapeutic and diagnostic tools that can recognize and tightly bind several pharmaceutically relevant targets, ranging from small molecules to proteins, making use of the available structural and conformational freedom to maximize the complementarity with their interacting counterparts. In this scientific work, we present the first application of Supervised Molecular Dynamics (SuMD), an enhanced sampling Molecular Dynamics-based method for the study of receptor–ligand association processes in the nanoseconds timescale, to the study of recognition pathways between RNA aptamers and proteins, elucidating the main advantages and limitations of the technique while discussing its possible role in the rational design of RNA-based therapeutics.

Graphical Abstract

We present the first application of supervised molecular dynamics (SuMD), an enhanced sampling molecular dynamics-based method for the study of receptor–ligand association processes in the nanoseconds timescale, to the study of recognition pathways between RNA aptamers and proteins.

INTRODUCTION

According to the central dogma of molecular biology, ribonucleic acid (RNA) is considered the functional link between deoxyribonucleic acid (DNA), which is involved in the storage of genetic information, and proteins, which are the effectors of most pivotal cell functions (1). Complementing this ancestral and simplistic description of the biological role of RNA, in recent times this polymer has been linked with a variety of regulatory activities within the cell, cooperating with proteins to finely tune the genome expression and other biological processes (2). The consideration that the vast majority of human RNA is not translated into proteins (3) in conjunction with the fact that a large number of newly discovered non-coding RNAs are associated with various pathologies and illnesses (4,5), caused an increase in the popularity of RNA among the scientific community, both from a biological and a therapeutic perspective (6,7).

From a structural point of view, RNA exists mainly as a single-stranded molecule with a chain length that can span from a few tens of nucleobases, as in small hairpins (8), to a few thousand nucleotides, as in long non-coding sequences (9). Compared to DNA, the higher conformational freedom of ribonucleic acids implies that they can assume a wide variety of three-dimensional structures in solution (10,11), organizing themselves in functional domains specifically designed to recognize other nucleic acids (12), proteins (13), glycated derivatives (14) or small organic molecules (15).

From a functional perspective, the successful exploitation of RNA’s ability to mold itself into different three-dimensional structures able to recognize target biomolecules with high affinity and specificity is represented by aptamers (16). This class of single-stranded oligonucleotides that fold into defined and complex architectures including stems, loops, bulges, hairpins, pseudoknots, triplexes or quadruplexes, can bind several molecular targets, such as proteins, small organic molecules, and ions, thus classifying as a useful tool both for a therapeutic and diagnostic purpose (17,18).

As is the case for other nucleic acids, interactions between aptamers and proteins are characterized by a complex network of van der Waals, hydrogen bond, stacking, and general non-polar interactions that define the complementarity of shape and electrostatic properties at the surface between the two interactors and determine the specificity of binding to a certain target (19,20). To the present date, the rising amount of experimentally solved three-dimensional RNA structures, especially concerning their complexes with both macro and small molecules, has led to an increased interest by the scientific community in the investigation of RNA structures at an atomistic level of detail, to apply structure-based drug design (SBDD) strategies for the rational design of novel therapeutic entities (21,22).

Among the tools that are routinely used, both in the academic and industrial environment, for investigating the structural determinants of biological entities’ recognition, molecular docking is by far the most widely and successfully adopted (23). Originally developed for predicting the interaction between small organic molecules and proteins (24), throughout the years this computational technique has also been applied to the investigation of protein-protein (25), protein-peptide (26), and antigen-antibody (27) complexes with various degrees of success. In contrast, the application of molecular docking to complexes involving nucleic acids has so far been very limited, and mainly restricted to the prediction of the binding mode of small molecules (28). Compared to the aforementioned established docking protocols, a smaller number of methods are available for the investigation of nucleic acid-protein complexes, due to some intrinsic structural peculiarities of nucleic acids, particularly in the case of RNA (29). The first limitation is the distinctive charge distribution that characterizes the RNA surface compared to that of proteins (30), the second is the neglected role of the solvent (31), and the third is the absent or limited consideration of the structural flexibility and dynamics of ribonucleic acids (32).

One possible approach to overcome the limitations of molecular docking is represented by molecular dynamics (MD) simulations. Despite the enhanced description of the binding event derived from the explicit treatment of solvent molecules and the consideration of both receptor and ligand flexibility, MD simulations are rarely carried out to investigate the whole binding event due to the long simulation times and computational effort required to sample these infrequent events and are therefore mostly exploited for the refinement of docking results (33).

To mitigate the time constraints of classic molecular dynamics, one possible strategy is the exploitation of enhanced sampling techniques that increase the frequency of observing desired events, such as the receptor–ligand association (34). Among the plethora of enhanced sampling protocols that have been developed throughout the years, supervised molecular dynamics (SuMD) has proven to be particularly successful in investigating ligand-receptor recognition pathways at an atomistic level of detail without applying any energetic bias to the system, while at the same time reducing by orders of magnitude the simulation time compared to classic MD simulations (35). Particularly, from a ligand perspective, SuMD simulations have proven useful to work with a variety of molecular entities, ranging from small organic molecules (both fragments (36–41) and mature, lead-like compounds (42–44)), to more complex chemical species such as macrocycles (45) and peptide ligands (46). From a receptor point of view, instead, SuMD was successfully applied to the study of both soluble (47,48) and membrane (49–51) systems, including both protein (52,53) and nucleic (54) targets.

In this scientific work, SuMD simulations were applied for the first time to the study of the recognition process between RNA macromolecules and proteins, to extend the applicability domain of the methodology. Particularly, we decided to focus our attention on RNA aptamers, due to their relevance as diagnostic and therapeutic tools and due to the variety of their structural landscape.

Briefly, we present four different applications of the SuMD methodology to RNA aptamer–protein complexes. Three test cases, involving systems in which the three-dimensional structure of the complex is known and deposited in the Protein Data Bank (55), are used to validate the ability of the SuMD protocol to correctly reproduce the experimental data while giving additional useful information that goes beyond the final state of the recognition process. The fourth and final case, concerning instead a complex whose structure has not yet been experimentally determined, is presented to show a possible prospective application of the SuMD protocol, discussing at the same time the advantages and limitations of the technique as well as its possible role in a typical drug discovery pipeline.

MATERIALS AND METHODS

Hardware overview

General molecular modeling operations, such as the preparation of RNA aptamer–protein complex structures, the system setup for molecular dynamics simulations, and trajectory analysis were performed on an 8 CPU Linux workstation equipped with an Intel Xeon E5-1620 3.50 GHz processor. All molecular dynamics simulations were performed on a GPU cluster composed of 20 NVIDIA devices ranging from GTX980 to Titan V.

Structures preparation

The three-dimensional coordinates of the three RNA aptamer–protein complexes used as control cases in this study were retrieved from the Protein Data Bank (56) (PDB ID: 3DD2, 4PDB, 5VOE) and prepared for subsequent simulations exploiting several modules from the Molecular Operating Environment (MOE) 2019.01 suite (57). At first, structures were pre-processed through the ‘Structure Preparation’ tool, to assign each residue with alternate conformations to the one with the highest occupancy, build missing loops through homology modeling, and correct inconsistencies between the primary sequence and the tertiary structure. For compatibility with each piece of software used in this work and for consistency with previous studies involving the SuMD approach, structures were manually edited to mutate each non-natural nucleic residue (e.g. fluorinated nucleotides) to the corresponding natural alternative. Afterward, titratable residues were assigned to the most probable protonation state at pH = 7.40 exploiting the ‘Protonate3D’ tool. Finally, each non-protein or non-nucleic residue was removed, and the nucleic ligand was moved away from the binding site at a distance of at least 30 Å from the nearest receptor atom, to explore the conformational degrees of freedom of the ligand throughout the recognition process.

For the investigation of the recognition process between the SARS-CoV-2 Spike RNA aptamer, the structure of the Spike Receptor Binding Domain (RBD) was retrieved from the Protein Data Bank (accession code: 6M0J (58)). Concerning the RNA aptamer, the experimental structure was not available, therefore the primary sequence was obtained from the Supplementary Material of the original work from Valero et al. (59) and sequentially submitted to the NUPACK (60) and 3DRNA (61) webservers to retrieve the predicted secondary and tertiary structure respectively. In both cases, default parameters were chosen, and the lowest energy structure coming from the 3DRNA webserver was used for both docking calculations and MD simulations. Every other preparation passage is identical to the ones executed for the three control cases.

System setup and equilibration protocol

Nucleic-protein systems coming from the preliminary preparation stage were then further processed making use of both Visual Molecular Dynamics (VMD) 1.9.2 (62) and several tools from the AmberTools14 suite (63). Each protein or nucleic atom was parameterized according to the ff14SB force field with χ modification tuned for RNA (χOL3) (64–66). At first, each system was solvated in a cubic box of TIP3P (67) water molecules with a padding of 35 Å. Afterward, each solvated system was neutralized through the addition of an appropriate number of Na+ and Cl− counterions until a salt concentration of 0.154 M was reached. Before molecular dynamics (MD) simulations, each system was subjected to a 1500-step energy minimization phase with the conjugate-gradient algorithm.
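As a rough illustration of the ion bookkeeping described above, the number of Na+/Cl− pairs needed to reach 0.154 M can be estimated from the box volume. The following is a minimal sketch, not the AmberTools procedure: the function names are hypothetical, the whole box volume is treated as solvent volume (the solute volume is ignored), and the box edge and net charge used in the usage lines are made-up values.

```python
AVOGADRO = 6.02214076e23        # particles per mole
ANGSTROM3_TO_LITRE = 1e-27      # 1 Å^3 = 1e-30 m^3 = 1e-27 L


def ion_pairs_for_concentration(box_edge_angstrom, molarity=0.154):
    """Estimate the Na+/Cl- pairs giving `molarity` in a cubic box.

    Assumption: the full box volume counts as solvent volume.
    """
    volume_litre = box_edge_angstrom ** 3 * ANGSTROM3_TO_LITRE
    return round(molarity * volume_litre * AVOGADRO)


def neutralizing_counterions(net_charge):
    """Counterions needed to cancel the integer net charge of the solute."""
    if net_charge < 0:
        return {"Na+": -net_charge, "Cl-": 0}
    return {"Na+": 0, "Cl-": net_charge}


# Hypothetical example: a 100 Å cubic box and a solute with net charge -20.
pairs = ion_pairs_for_concentration(100.0)
neutral = neutralizing_counterions(-20)
print(pairs, neutral)
```

For a 100 Å box the estimate is 0.154 mol/L × 10⁻²¹ L × 6.022 × 10²³ ≈ 93 ion pairs, added on top of the counterions that neutralize the solute.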

Each minimized system then underwent a two-stage equilibration protocol. The first stage consisted of 1 ns of simulation in the canonical ensemble (NVT), applying a 5 kcal mol−1 Å−2 harmonic positional restraint on each protein and nucleic atom. The second stage consisted instead of a 2 ns simulation in the isothermal-isobaric ensemble (NPT), with the same restraints applied only to the backbone atoms of both the protein and the nucleic acid. For each MD simulation performed in this scientific work, an integration step of 2 fs was used, the temperature was kept at a constant value of 310 K through a Langevin thermostat (68), and the M-SHAKE algorithm (69) was used to constrain the length of bonds involving hydrogen atoms. The particle-mesh Ewald (PME) (70) method was exploited to compute electrostatic interactions using cubic spline interpolation and a 1 Å grid spacing, while a 9.0 Å cutoff was set for the calculation of Lennard–Jones interactions. For simulations in the NPT ensemble, the pressure was kept at a constant value of 1 atm through a Monte Carlo barostat (71). Finally, all MD simulations were run through the ACEMD 3 (72) engine, which is based on OpenMM 7 (73).

Supervised molecular dynamics (SuMD) simulations

SuMD is a well-established enhanced-sampling molecular dynamics approach that has been successfully applied to the study of the recognition process between various molecular entities at an atomic level of details on the nanosecond timescale (35).

The main advantage of the SuMD approach compared to traditional molecular dynamics simulations is the improved ability to sample infrequent events such as molecular association processes, thus reducing the timescale of the simulation that is required to spontaneously observe a binding event from the microseconds range to a few nanoseconds.

In detail, this task is accomplished by performing a sequence of short, unbiased MD simulations, each followed by an evaluation of the simulation progress by a tabu-like algorithm. In this case, each of these MD simulations, defined as a ‘SuMD-step’, is run in the canonical ensemble at a constant temperature of 310 K for 300 ps, but the length of the ‘SuMD-step’ can vary and is chosen according to the system that is studied. At the end of each ‘SuMD-step’, the distance between the center of mass of the ligand and that of the user-defined binding site is computed throughout the step, and these data are then fitted to a linear function: if the slope of the resulting straight line is negative, indicating that the ligand is approaching the binding site, the ‘SuMD-step’ is considered productive and retained for the generation of the final trajectory, while the final state of the simulation is used as the starting point for the successive step. On the contrary, if the slope is positive, thereby indicating that the ligand is not approaching the binding site, the ‘SuMD-step’ is considered not productive and therefore discarded: in this case, the step is repeated by randomly reassigning the particle velocities through the Langevin thermostat and retaining the final coordinates from the end of the previous ‘SuMD-step’. The supervision algorithm is switched off when the distance between the two centers of mass falls below a threshold value (10 Å, in this case): from that point on, the simulation proceeds for a further 10 ns of classic molecular dynamics, allowing the system to relax and reach the final state of the simulation without any external geometric bias imposed by the supervision.
For each control case study, 10 SuMD simulations were collected: while every single one was visually inspected, only the best one according to the geometrical agreement with the reference (based on the RMSD between the ligand coordinates in the final step of the simulation and the ligand coordinates in the reference experimental structure after optimal superposition of the protein backbone) was thoroughly analyzed and discussed in the manuscript. In the case where the structure of the complex was not available, the best replica was chosen based on the MMGBSA interaction energy instead. In the current implementation, the SuMD code is written in Python and exploits the NumPy and ProDy (74) modules to perform the aforementioned geometrical supervision throughout the simulation. A list of residues utilized to define the ligand and protein binding site for each case is provided in Supplementary Table S4 (Supplementary Materials).
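The accept/reject logic of the supervision can be sketched in a few lines of Python. This is only an illustrative toy, not the actual SuMD code: the short MD run is replaced by a hypothetical random walk of the center-of-mass distance (`toy_md_step`), the frame spacing of 10 ps is an assumption, and the linear fit is written out by hand instead of relying on NumPy.

```python
import random


def slope(times, distances):
    """Least-squares slope of the distance-versus-time straight line."""
    n = len(times)
    mean_t = sum(times) / n
    mean_d = sum(distances) / n
    num = sum((t - mean_t) * (d - mean_d) for t, d in zip(times, distances))
    den = sum((t - mean_t) ** 2 for t in times)
    return num / den


def toy_md_step(start_distance, rng, n_frames=30, frame_ps=10.0):
    """Stand-in for one 300 ps SuMD-step: a random walk of the
    center-of-mass distance (Å). This is NOT a real MD simulation."""
    times, distances = [], []
    d = start_distance
    for i in range(n_frames):
        d = max(0.0, d + rng.gauss(0.0, 0.5))
        times.append((i + 1) * frame_ps)
        distances.append(d)
    return times, distances


def supervise(start_distance=30.0, threshold=10.0, max_attempts=10000, seed=0):
    """Chain productive steps until the distance drops below `threshold`."""
    rng = random.Random(seed)
    accepted = []            # (times, distances) of retained SuMD-steps
    current = start_distance
    attempts = 0
    while current > threshold and attempts < max_attempts:
        attempts += 1
        times, distances = toy_md_step(current, rng)
        if slope(times, distances) < 0:
            # Productive step: keep it and restart from its final state.
            accepted.append((times, distances))
            current = distances[-1]
        # Otherwise the step is discarded and retried from the same
        # starting distance with fresh random numbers ("new velocities").
    return accepted, current, attempts


accepted, final_distance, attempts = supervise()
print(len(accepted), round(final_distance, 2), attempts)
```

Once the loop exits because the threshold has been crossed, the real protocol would continue with unsupervised classic MD; that stage is not modeled here.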

Trajectory analysis

The SuMD trajectories were analyzed by making use of an in-house tool written in Python 3, which represents an evolution and customization of the original one described in the work of Salmaso et al. (46).

Initially, trajectories representing each single ‘SuMD-step’ were merged into a single collective trajectory. Then, obtained trajectories were pre-processed by applying a stride and retaining one frame every 20 ps, superposing and aligning each frame on the protein backbone atoms of the first frame, and wrapping it into an image of the system simulated under periodic boundary conditions (PBC). Both geometric and energetic analyses were performed on the so-obtained SuMD trajectories.

Concerning the geometric properties of the system, regarding both the nucleic ligand and the protein receptor, the time-dependent evolution of both backbone RMSD and radius of gyration, a global and a time-dependent per-residue decomposition of the backbone RMSF were collected and reported in an aggregated panel. Furthermore, the geometric performance of the SuMD protocol in reproducing the experimental bound conformation of the ligand was evaluated by computing the ligand backbone RMSD compared to the experimental reference throughout the entire simulation. All these geometric analyses were performed making use of the appropriate functions of the MDAnalysis (75,76) Python library and plotted through the Matplotlib (77) module.
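The two scalar descriptors used above have simple definitions, sketched below in plain Python for clarity. This is not the MDAnalysis implementation used in the work: the function names are hypothetical, coordinates are plain lists of (x, y, z) tuples in Å, and the RMSD is computed without any refitting, under the assumption that frames were already superposed on the protein backbone as described for the pre-processing stage.

```python
import math


def rmsd(coords_a, coords_b):
    """RMSD (Å) between two equally ordered atom lists, without refitting."""
    if len(coords_a) != len(coords_b):
        raise ValueError("coordinate sets must contain the same atoms")
    squared = sum(
        (a - b) ** 2
        for point_a, point_b in zip(coords_a, coords_b)
        for a, b in zip(point_a, point_b)
    )
    return math.sqrt(squared / len(coords_a))


def radius_of_gyration(coords, masses=None):
    """Mass-weighted radius of gyration (Å); unit masses if none are given."""
    if masses is None:
        masses = [1.0] * len(coords)
    total = sum(masses)
    center = [
        sum(m * p[k] for m, p in zip(masses, coords)) / total for k in range(3)
    ]
    weighted = sum(
        m * sum((p[k] - center[k]) ** 2 for k in range(3))
        for m, p in zip(masses, coords)
    )
    return math.sqrt(weighted / total)


def best_replica(final_ligand_coords, reference):
    """Index of the replica whose final ligand pose is closest to `reference`."""
    return min(
        range(len(final_ligand_coords)),
        key=lambda i: rmsd(final_ligand_coords[i], reference),
    )
```

The `best_replica` helper mirrors the selection criterion stated for the control cases, where the replica with the lowest final RMSD to the experimental ligand pose is the one discussed.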

Regarding the energetic analysis, an estimation of the ligand-receptor interaction energy alongside the SuMD trajectory was obtained both through the MMGBSA protocol, as implemented in AMBER 14, and through the ‘NAMD Energy’ plugin for VMD, which exploits the NAMD (78) package to retrieve an estimate of the interaction energy defined as the sum of the van der Waals and electrostatic contribution calculated according to the user-defined force field (AMBER 14, in this case). The energy values were then plotted both as a function of the simulation time and of the RMSD to the reference pose, giving both a time-dependent and a geometry-dependent energetic profile of the trajectory.

Finally, a per-residue interaction energy decomposition analysis was carried out exploiting once again the ‘NAMD Energy’ plugin for VMD: plots report a time-dependent per-residue decomposition of the interaction energy for both the receptor and the ligand and a bidimensional interaction energy matrix in which interacting residues on the ligand side are correlated with the corresponding interacting residues on the receptor side. For all these per-residue analyses, the 25 most frequently contacted residues throughout the trajectory are considered (25 for the receptor, as well as for the ligand), defining contacting residues as the ones that are at a maximum distance of 4.5 Å from the nearest atom of the counterpart, either the ligand or the receptor.
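The selection of the most frequently contacted residues can be sketched as follows. This is a minimal illustration rather than the in-house tool: the data layout (a dictionary of residue identifiers mapped to atom coordinates, one per frame) and the function names are assumptions, while the 4.5 Å cutoff and the limit of 25 residues come from the text.

```python
import math
from collections import Counter


def contacting_residues(residues, partner_atoms, cutoff=4.5):
    """Residues having at least one atom within `cutoff` Å of the partner.

    residues: dict mapping residue id -> list of (x, y, z) atom coordinates
    partner_atoms: list of (x, y, z) coordinates of the counterpart
    """
    hits = set()
    for res_id, atoms in residues.items():
        if any(math.dist(p, q) <= cutoff for p in atoms for q in partner_atoms):
            hits.add(res_id)
    return hits


def most_contacted(frames, top_n=25, cutoff=4.5):
    """Rank residues by the number of frames in which they are in contact.

    frames: iterable of (residues, partner_atoms) pairs, one per frame
    """
    counts = Counter()
    for residues, partner_atoms in frames:
        counts.update(contacting_residues(residues, partner_atoms, cutoff))
    return [res_id for res_id, _ in counts.most_common(top_n)]
```

Calling the same function twice, once with the receptor residues against the ligand atoms and once with the roles swapped, would give the two residue lists used for the per-residue plots and the interaction energy matrix.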

A movie representation of the trajectory alongside the dynamic evaluation of its geometric and energetic features is also provided by the same analysis tool, which exploits VMD for the visual rendering of the simulated system. For uniformity reasons, in each plot and video, residue numbering is related to the fasta sequence for the wild-type receptor, as retrieved from the UniProt database.

Docking

To evaluate SuMD’s ability to reproduce the native conformation of the RNA aptamer–protein complex, we decided to compare its performance with that of molecular docking. The program chosen to accomplish this task was HADDOCK (25) (‘High Ambiguity Driven protein-protein DOCKing’, version 2.4) since it has already been extensively used for dealing with protein-nucleic acid complexes (79) and it uses a priori information to steer the docking calculation in a similar way to how SuMD works.

Each one of the crystal structures used as control (the ones coming from PDB codes 3DD2, 4PDB, and 5VOE) was subjected to a docking run. For all these cases, the nucleic acid was treated as the ligand, while the protein was considered as the receptor. The binding site was defined based on residues at the contact surface in the crystallographic structures, both on the protein and nucleic acid sides.

Concerning the SARS-CoV-2 Spike RBD RNA aptamer, the selected protein residues were chosen instead based on the contact surface with human ACE2 in the structure 6M0J, while for the aptamer the residues were selected based on the information coming from the original paper by Valero et al. The list of protein and aptamer residues used as input for each docking calculation is reported in Supplementary Table S3 (Supplementary Material). All user-definable parameters for molecular docking were kept as default.

HADDOCK starts with a randomization stage, in which the docking partners are placed far in space from one another (about 150 Å) and randomly rotated around their centers of mass. The following step consists of a rigid-body energy minimization, which is followed by the rigid-body docking of the ligand and the receptor, yielding 1000 complexes. The 200 best solutions in terms of intermolecular energies obtained at this stage are subjected to simulated annealing refinements. Both the intra- and inter-molecular energies are evaluated by HADDOCK using full electrostatic and van der Waals energy terms with an 8.5 Å distance cutoff using the OPLS (80) nonbonded parameters. The final complexes are then clustered based on the Fraction of Common Contacts (FCC) (81) with a 0.6 similarity cutoff, and the clusters are then ranked by their energetics.
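The similarity measure behind the clustering step can be illustrated with a short sketch. This is not HADDOCK's own code: contacts are represented here as hypothetical (receptor residue, ligand residue) pairs, and requiring the 0.6 cutoff to hold in both directions is an assumption made for this example.

```python
def fraction_of_common_contacts(contacts_a, contacts_b):
    """Fraction of the contacts of model A that are also found in model B.

    Each argument is a set of (receptor_residue, ligand_residue) pairs.
    The measure is asymmetric: swapping the arguments can change the value.
    """
    if not contacts_a:
        return 0.0
    return len(contacts_a & contacts_b) / len(contacts_a)


def are_similar(contacts_a, contacts_b, cutoff=0.6):
    """Assumed criterion: the cutoff must be met in both directions."""
    return (
        fraction_of_common_contacts(contacts_a, contacts_b) >= cutoff
        and fraction_of_common_contacts(contacts_b, contacts_a) >= cutoff
    )


# Hypothetical contact sets of two docking models.
model_1 = {(101, 4), (101, 5), (165, 7), (233, 15)}
model_2 = {(101, 4), (101, 5), (165, 7), (240, 18)}
print(fraction_of_common_contacts(model_1, model_2))  # 3 of 4 contacts shared
print(are_similar(model_1, model_2))
```

Models that pass the similarity test would then be grouped into the same cluster before the clusters are ranked.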

RNA aptamer–protein complexes molecular dynamics simulations

To evaluate the dynamic behavior of aptamer-bound protein complexes, we performed several classic molecular dynamics simulations. At first, each system was subjected to a preparation step exploiting both AmberTools14 and VMD 1.9.2, as previously mentioned in the preparation stage for SuMD simulations. Specifically, each protein-nucleic acid complex was individually solvated in an explicit TIP3P water box with a 40 Å padding. Each of these simulation boxes was then neutralized using Na+/Cl− ions until reaching a physiological salt concentration of 0.154 M. The preparation phase was followed by a two-step equilibration protocol. The first equilibration was carried out in the canonical ensemble (NVT) and was composed of 1500 steps of energy minimization with a conjugate-gradient algorithm followed by a 1 ns MD simulation. In this first stage, harmonic positional restraints of 5 kcal mol−1 Å−2 were applied on both the protein and the nucleic acid, while the temperature was kept at the constant value of 310 K exploiting a Langevin thermostat (with friction coefficient set to 0.1 ps−1). The second equilibration was performed in the isothermal-isobaric ensemble (NPT), again for 1 ns of MD simulation, in which the harmonic positional restraints of 5 kcal mol−1 Å−2 were applied just on the protein and nucleic acid backbones. The pressure was kept constant at the value of 1.0 atm using a Monte Carlo barostat. In each of the equilibration steps, a 2 fs integration step was adopted, the bonds involving the hydrogen atoms were constrained through the M-SHAKE algorithm, and a 9.0 Å cutoff was used for the calculation of the Lennard-Jones interaction. For the electrostatic interaction, the particle-mesh Ewald (PME) method was used. After this preparation phase, three different 50 ns replicates of classic MD simulation in the NVT ensemble at 310 K were executed.

Free RNA–aptamer molecular dynamics simulations

To complement the investigation of the structural dynamicity of investigated RNA aptamer–protein complexes, we also performed a classic MD simulation of the free RNA aptamer. To accomplish this task, we retrieved the coordinates for each nucleic acid molecule from the previously mentioned complexes. Each of these aptamers was then subjected to the same protocol described before for RNA aptamer–protein complexes, except for the parts related to the protein which, in this case, was not part of the system.

RESULTS

To assess the applicability domain and accuracy of Supervised Molecular Dynamics simulation in the context of the nucleic acid-protein recognition processes, we opted for a retrospective validation approach, evaluating the ability of the protocol to correctly reproduce the binding mode of nucleic ligands found in experimentally solved complex structures, focusing both on the sampling and ranking capabilities of the protocol. Particularly, we decided to focus our attention on the class of RNA aptamers, both for their therapeutic relevance and for their challenging nature due to their peculiar structural features, such as intrinsically higher flexibility and density of negative charge compared to ligands considered in past applications of the SuMD protocol. In the following paragraphs, we present the application of SuMD to three different case studies for which the experimental structure of the RNA aptamer–protein complex is available in the Protein Data Bank, focusing on the geometrical accuracy of the technique in reproducing the experimentally determined binding mode and monitoring both the geometric and energetic features of the recognition process, stretching beyond the final state of the simulation. The three test cases are reported in chronological order, from the oldest structure to the most recent one. Furthermore, we also present a prospective application of the SuMD protocol to the investigation of a complex whose structure has not yet been experimentally determined, to present and discuss the role, advantages, and limitations of implementing the SuMD protocol in a pipeline for the rational design of RNA-based therapeutics. Information about each SuMD simulation reported in the manuscript is summarized in Supplementary Table S2 (Supplementary Material).

RNA aptamer bound to human thrombin (PDB ID: 3DD2)

Due to its ability to process several proteins that are part of the coagulation cascade, including the cleavage of soluble fibrinogen into fibrin, which is responsible for the formation of clots, human thrombin is a serine protease that exerts a pivotal role in blood coagulation and is, therefore, a target of interest for anticoagulation therapy (82–84). Two surface regions of thrombin (exosite-1 and exosite-2), which are located on opposite sides of the molecule and away from the catalytic site, are responsible for its ability to interact with various macromolecular substrates. Particularly, exosite-2 is responsible for the binding of thrombin to heparin, a clinically used oligosaccharide that mediates its anticoagulant effect by facilitating the interaction of thrombin with its endogenous inhibitor antithrombin (85).

In 2001, White et al. reported the discovery of Toggle-25, an RNA aptamer that was developed to bind with a high affinity to both human and porcine thrombin, leading to the inhibition of both plasma clot formation and platelet activation (86). In 2008, Long et al. were able to solve the crystal structure of Toggle-25t, a 25 nucleotide truncated version of Toggle-25, bound to the exosite-2 of human thrombin at a resolution of 1.90 Å (PDB ID: 3DD2 (87)), thereby allowing a structural characterization of the complex which nicely complements previous biochemical analysis (87). As reported in the original publication, the aptamer recognizes the thrombin exosite-2 in its native state, since no conformational changes can be observed between the unbound and bound state. Despite a relatively simple secondary structure, defined by a stem-loop with an internal bulge, Toggle-25t can achieve a selective and high-affinity binding to human thrombin (Kd = 0.54 ± 0.1 nM) thanks to the good complementarity of shape and electrostatic properties between the negatively charged aptamer and the basic protein region responsible for its recognition. The absence of significant structural alteration of the protein upon binding, the therapeutic relevance of the target, and the relatively modest size of the aptamer (25 residues), make this complex an ideal target for the application of the SuMD protocol to the study of RNA aptamer–protein interactions. As previously mentioned, two simplifications were introduced in the system investigated through SuMD: firstly, each 2′ fluoro substituted pyrimidine residue was retro-mutated to the corresponding naturally occurring nucleotide, and secondly, the divalent Mg2+ ion was not included in the system. The choice of retro-mutating the fluorine-containing nucleotides was made for compatibility reasons, since some of the software used in this scientific work (e.g. HADDOCK) could not work with non-natural nucleic residues.
According to the original publication by Long et al. (87), the introduction of fluorine mainly impacted the aptamer resistance to ribonuclease rather than the binding affinity, as also underlined by the fact that only one single 2′ fluoro group is in direct contact with the protein, specifically at the level of the U17-Arg126 interaction. Concerning the presence of Mg2+ ions, despite their indisputable importance in the field of RNA folding, we opted not to include them at all in the simulations due to some intrinsic molecular mechanics limitations that hamper the possibility of fully and accurately describing their interaction with RNA (88), in addition to the difficulties in accurately predicting their locations without hints from experimental data (89).

As can be seen in Video 1 (Supplementary Material), about 15 ns of simulation time was sufficient to sample a putative molecular recognition event between the Toggle-25t RNA aptamer and the human thrombin exosite-2. This is a quite remarkable result, considering the usual hundreds of nanoseconds that are necessary to spontaneously sample a binding event with classic MD simulations. As illustrated by Video 1 and summarized in Figure 1, the final state of the SuMD simulation converged quite well both from a geometrical and an interactive point of view toward the crystal reference.

Figure 1.

This panel encompasses the putative recognition pathway between the RNA aptamer Toggle-25t and the exosite-2 of human thrombin described by the best trajectory obtained through the SuMD protocol (the one with the lowest RMSD to the crystal reference). (A) Visual representation of the Toggle-25t conformation sampled in the last frame of the SuMD trajectory (green) superposed with the native Toggle-25t conformation (yellow), as found in the crystal structure deposited in the Protein Data Bank with accession code 3DD2 (RMSDSuMD-Crystal: 6.41 Å). The aptamer is represented as a ribbon, while the protein is represented as a Connolly surface colored according to the electrostatic potential as calculated with the APBS software (90), where red indicates a negatively charged area while blue indicates a positively charged one. (B) Profile of the ligand-receptor interaction energy (defined as the sum of the electrostatic and van der Waals contribution) throughout the recognition process as a function of both the simulation time and the RMSD between the ligand position during the trajectory and the ligand position in the crystal. (C) Receptor per-residue decomposition of the receptor–ligand interaction energy throughout the SuMD trajectory as a function of the simulation time: the 25 most-contacted residues are reported in the plot. (D) Per-residue interaction energy matrix: the 25 most-contacted residues for both the receptor and the ligand are considered, while each square composing the heatmap represents the average value of the interaction energy between the two paired residues alongside the trajectory.

Concerning the geometric accuracy, the RMSD between the ligand backbone conformation in the final state of the simulation and the native ligand backbone conformation observed in the crystal structure is 6.41 Å. This is a notable result considering the intrinsic structural flexibility of these molecules, which is confirmed by the structural deviation that can be observed in classic MD simulations of both the crystal complex (average RMSD of the nucleic acid backbone in the final step of the simulation across three MD replicates: 2.87 Å) and the free aptamer (average RMSD of the nucleic acid backbone in the final step of the simulation across three MD replicates: 3.85 Å). It is not surprising, therefore, that SuMD performs worse than molecular docking from a geometrical point of view (RMSD between the best docking pose and the crystal binding mode of the aptamer: 2.51 Å).
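To make the geometric metric used throughout this section explicit, the following minimal sketch computes the RMSD between two sets of backbone coordinates. It assumes that the two frames have already been superposed on the receptor and that atoms are listed in the same order; the function name `rmsd` and the toy coordinates are illustrative and are not taken from the analysis scripts used in this work.

```python
import math

def rmsd(coords_a, coords_b):
    """RMSD between two equally sized lists of (x, y, z) points.

    The result is in the same unit as the input (e.g. angstrom).
    No superposition is performed: frames are assumed pre-aligned.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate sets must be non-empty and of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))

# Toy example (made-up coordinates): three atoms, each displaced by 2 along x.
crystal = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
final_frame = [(2.0, 0.0, 0.0), (3.0, 0.0, 0.0), (4.0, 0.0, 0.0)]
print(rmsd(crystal, final_frame))  # 2.0
```

In a real analysis the coordinates would be read from the trajectory and the reference structure with a dedicated library rather than typed by hand.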

Despite a lower geometrical accuracy of the method compared to the one of docking, which derives mainly from the high intrinsic flexibility of the nucleic ligand during the simulation (see also Supplementary Figure S4, Supplementary Material, which reports the pairwise RMSD matrix of the free RNA aptamer during classic MD simulations), SuMD can correctly pose the negatively charged RNA–aptamer in a native-like conformation that maximizes the complementarity of shape and electrostatic features with the electropositive concave surface of the thrombin exosite-2, as highlighted in Figure 1 (panel A).

Concerning the capability of the protocol to recapture the pivotal binding features despite a suboptimal geometric accuracy, SuMD accurately describes the main interaction determinants, as illustrated by a comparison between the per-residue energy decomposition of the first 300 ps of the classic MD simulation of the crystal complex and the last 300 ps of the SuMD simulation (Supplementary Figure S5, Supplementary Material). In particular, as can be seen in Figure 1 (panels C and D), SuMD correctly captures the pivotal role of both Arg 101 (461) and Arg 233 (608) in driving the recognition mechanism: they serve as electrostatic recruiters in the initial phases of the process and act as anchors that stabilize the bound state in the final part of the simulation, in agreement with mutagenesis data showing that mutation of either of these two residues completely abrogates the aptamer's effect (91). Moreover, Arg 233 (608), along with Arg 165 (533), is responsible for the formation of a stacked interaction motif defined as an ‘A-Arg zipper’, which involves five unpaired adenine residues on the ligand side: A4, A5, A7, A15 and A18. As can be seen in Figure 1, panel D, SuMD identifies all five adenine residues as key interaction determinants, in agreement with the experimental data. Finally, SuMD also captures the non-relevance of Gln 239 (614), which is not reported in the analysis because it is not one of the 25 most-contacted residues during the trajectory, consistent with mutagenesis data showing that a mutation of this residue does not affect aptamer binding (91).

On the ligand side, SuMD once again correctly captures the different roles played by different nucleotides. As can be seen in Figure 1 (panel D), the interaction between U17 and Arg 126 (486) is retrieved by the SuMD simulation: mutagenesis data show that the mutation of U17, one of the flipped-out nucleotides, to adenine has essentially no effect on the aptamer's affinity for the target. Our ligand-based interaction map shows that this is the only interaction in which this nucleotide is involved and that it is less intense than more prominent interactions such as the aforementioned ‘A-Arg zipper’, which suggests a non-pivotal contribution to the binding affinity. This can be related to the fact that U17 interacts with Arg 126 through its backbone and not through its nucleobase, so that, as pointed out by the work of Jeter et al., this interaction would be maintained even when substituting the base (91). Substitution of U12 with adenine results in a nearly three-orders-of-magnitude diminished binding affinity: this is due to the stabilizing role that U12 plays, through a non-Watson-Crick base pairing, towards A15, one of the bases involved in the formation of the ‘A-Arg zipper’ (91). Once again, as depicted in Figure 1, panel D, SuMD correctly recognizes the pivotal role played by A15, while at the same time elucidating the indirect role of U12, which is not involved in any major interactions with protein residues. All other geometric and energetic analyses performed on the trajectory are summarized in Supplementary Figures S1–S3 (Supplementary Materials).
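The per-residue interaction matrix discussed above (Figure 1, panel D) can be illustrated with a minimal sketch that averages pairwise interaction energies along a trajectory and selects the most-contacted residues. The data layout, the function names and the definition of "most-contacted" (number of frames in which a residue appears in at least one interacting pair) are assumptions made for illustration; the energy values are made up and only the residue labels are borrowed from the text.

```python
from collections import defaultdict

def average_pair_energies(frames):
    """Average each (receptor residue, ligand residue) energy over all frames.

    frames: list of dicts {(receptor_residue, ligand_residue): energy}.
    A pair that is absent from a frame contributes 0 to the average.
    """
    totals = defaultdict(float)
    for frame in frames:
        for pair, energy in frame.items():
            totals[pair] += energy
    return {pair: total / len(frames) for pair, total in totals.items()}

def most_contacted(frames, side, top_n):
    """Rank residues by the number of frames in which they form a contact.

    side: 0 for receptor residues, 1 for ligand residues.
    Ties are broken alphabetically so the ranking is deterministic.
    """
    counts = defaultdict(int)
    for frame in frames:
        for residue in {pair[side] for pair in frame}:
            counts[residue] += 1
    ranked = sorted(counts, key=lambda residue: (-counts[residue], residue))
    return ranked[:top_n]

# Toy trajectory of three frames (energies are illustrative, not real data).
frames = [
    {("Arg233", "A15"): -12.0, ("Arg101", "A4"): -8.0},
    {("Arg233", "A15"): -16.0, ("Arg126", "U17"): -3.0},
    {("Arg233", "A15"): -14.0, ("Arg101", "A4"): -10.0},
]
matrix = average_pair_energies(frames)
print(matrix[("Arg233", "A15")])     # -14.0
print(most_contacted(frames, 0, 2))  # ['Arg233', 'Arg101']
```

Each entry of `matrix` corresponds to one square of the heatmap; restricting rows and columns to the output of `most_contacted` reproduces the "25 most-contacted residues" filter with `top_n=25`.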

To assess the predictive power of the method, we retrospectively analyzed all trajectories using two different metrics, i.e. the electrostatic interaction energy and the MMGBSA interaction energy of the final state of the simulation. The idea to use these two metrics stems from the consideration that RNA binding to proteins usually requires a good level of complementarity of steric and electrostatic properties at the binding interface. As reported in Supplementary Figure S19, both metrics can successfully distinguish the native and native-like poses, i.e. the ones with a superimposable interaction pattern with the reference (measured through the mean signed error and the root-mean squared error of the per-residue interaction energy decomposition, panels C and D respectively), from the decoys. This observation suggests that both metrics could be utilized in a prospective application of SuMD to rank poses coming from different simulations prioritizing the ones that are most similar to the native binding mode.
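As an illustration of how the similarity between two interaction patterns can be quantified, the sketch below computes the mean signed error and the root-mean-squared error between two per-residue interaction energy profiles. The function names, the convention of treating a residue missing from one profile as contributing 0, and all numeric values are assumptions made for this example rather than details of the analysis performed in this work.

```python
import math

def _differences(pose, reference):
    """Per-residue differences (pose minus reference) over the union of residues."""
    residues = sorted(set(pose) | set(reference))
    if not residues:
        raise ValueError("at least one residue is required")
    return [pose.get(r, 0.0) - reference.get(r, 0.0) for r in residues]

def mean_signed_error(pose, reference):
    """Average signed deviation: reveals systematic over- or underestimation."""
    diffs = _differences(pose, reference)
    return sum(diffs) / len(diffs)

def root_mean_squared_error(pose, reference):
    """Magnitude of the deviation, irrespective of its sign."""
    diffs = _differences(pose, reference)
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))

# Made-up per-residue energies (kcal/mol) for a reference and a sampled pose.
reference = {"Arg101": -40.0, "Arg165": -20.0, "Arg233": -60.0}
pose = {"Arg101": -36.0, "Arg165": -22.0, "Arg233": -56.0}

print(mean_signed_error(pose, reference))        # 2.0
print(root_mean_squared_error(pose, reference))  # about 3.46
```

A pose with both values close to zero reproduces the reference interaction pattern; a small mean signed error combined with a large root-mean-squared error indicates deviations of opposite sign that cancel out on average.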

RNA aptamer bound to Bacillus anthracis ribosomal protein S8 (PDB ID: 4PDB)

Thanks to its central function in the life cycle of bacteria, the bacterial ribosome, a complex machinery responsible for protein synthesis in prokaryotic organisms, has been extensively studied, both from a structural and a functional perspective, and has been validated as a target for multiple antibiotic drugs (92). Protein–RNA interactions play a key role in the assembly, maturation, and function of the bacterial ribosome (93). Among these, association processes involving ribosomal protein S8 are particularly relevant, since it not only participates in the 30S subunit assembly by binding to 16S rRNA but also serves as a translational repressor of the spc operon mRNA, which encodes 11 ribosomal proteins including S8 itself (94).

Due to its relevance, the complex formed between bacterial ribosomal protein S8 and 16S rRNA has been thoroughly characterized through different techniques, allowing us to establish that most of the protein–RNA contacts involve helices 21 and 25 and that a small RNA portion located in helix 21 is sufficient to confer specificity and high affinity to the S8-RNA interaction (95). Furthermore, the interaction determinants between S8 and its RNA targets are largely conserved, and the same degree of conservation applies also to the overall fold of various S8 proteins (96–98). Finally, the complementarity of shape and electrostatic properties that is required for the binding entails a high level of nucleotide sequence and secondary structure conservation, to impose an RNA shape that optimizes interaction properties with the protein surface (99).

To identify RNA secondary structures that deviate from the conserved bacterial motif while retaining the ability to bind the S8 protein, in 2014 Davlieva et al. performed a SELEX experiment that led to the discovery of a 38-mer RNA aptamer that can bind the Bacillus anthracis S8 protein with high affinity (Kd = 110 ± 30 nM), determining at the same time the structure of the bound complex between the RNA aptamer and its protein target (99). The selection process was based on an RNA stem-loop scaffold containing symmetric and asymmetric internal loops of 16 randomized nucleotides, with the resulting aptamer sequence forming a secondary structure with a symmetric internal loop (99). The absence of major structural rearrangements on the protein side of the interaction (backbone RMSD of 0.65 Å between the free and the bound form) and the relevance of this interaction from a mechanistic perspective led us to consider this complex as a suitable case study to validate the SuMD protocol.

As can be deduced from Video 2 (Supplementary Materials), less than 15 ns of simulation time was needed to sample a putative recognition mechanism between the RNA aptamer and the ribosomal S8 protein. As underlined by both Video 2 and Figure 2, the final state of the SuMD simulation converged very well with the experimental data, both from a geometric and an interactive point of view.

Figure 2.


This panel encompasses the putative recognition pathway between the RNA aptamer and the S8 ribosomal protein of Bacillus anthracis described by the best trajectory obtained through the SuMD protocol (the one with the lowest RMSD to the crystal reference). (A) Visual representation of the RNA–aptamer conformation sampled in the last frame of the SuMD trajectory (green) superposed with the native RNA aptamer conformation (yellow), as found in the crystal structure deposited in the Protein Data Bank with accession code 4PDB (RMSDSuMD-Crystal: 2.61 Å). The aptamer is represented as a ribbon, while the protein is represented as a Connolly surface colored according to the electrostatic potential as calculated with the APBS software (90), where red indicates a negatively charged area while blue indicates a positively charged one. (B) Profile of the ligand-receptor interaction energy (defined as the sum of the electrostatic and van der Waals contribution) throughout the recognition process as a function of both the simulation time and the RMSD between the ligand position during the trajectory and the ligand position in the crystal. (C) Receptor per-residue decomposition of the receptor–ligand interaction energy throughout the SuMD trajectory as a function of the simulation time: the 25 most-contacted residues are reported in the plot. (D) Per-residue interaction energy matrix: the 25 most-contacted residues for both the receptor and the ligand are considered, while each square composing the heatmap represents the average value of the interaction energy between the two paired residues alongside the trajectory.

Specifically, regarding the geometric accuracy of the method, the RMSD between the ligand backbone conformation in the final state of the simulation and the native ligand backbone conformation observed in the crystal structure is only 2.61 Å, a result comparable with the performance of molecular docking (RMSD between the best docking pose and the crystal binding mode of the aptamer: 2.20 Å). The higher geometric accuracy of the SuMD protocol, compared to the 3DD2 case, can be partially explained by the lower degree of conformational freedom available to the free aptamer, as can be seen in Supplementary Figure S9 (Supplementary Materials; average RMSD of the nucleic acid backbone in the final step of the simulation across three MD replicates: 2.27 Å), despite a similar stability of the bound state (average RMSD of the nucleic acid backbone in the final step of the simulation across three MD replicates: 2.80 Å).

Intriguingly, despite the impressive geometric convergence of the trajectory with the experimental data, the comparison between the per-residue energy decomposition from the last 300 ps of the SuMD simulation and the first 300 ps of the classic MD simulation of the crystal complex reveals a slightly lower congruence of the binding mode compared to the 3DD2 case (Supplementary Figure S10, Supplementary Materials). It is important to note that, in this case, the difference is not due to the interaction pattern, which for the most part is correctly depicted by the SuMD simulation analysis, but instead to the relative strength of the interactions. Indeed, as can be noticed in Figure 2, panel A, the final state of the SuMD simulation is slightly shifted compared to the crystal reference: considering that the predominant electrostatic component of the total interaction energy is inversely proportional to the distance between the two interactors, even small differences in the relative position of interacting residues can alter the quantitative estimation of the interaction energy. Consistent with this interpretation of the simulation data, our SuMD protocol can qualitatively describe the vast majority of the key interaction determinants.
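The sensitivity of the electrostatic term to small positional shifts can be illustrated with a simple point-charge Coulomb model, in which the pairwise energy scales with the inverse of the distance. This is a deliberately simplified sketch: real force-field energies sum over many partial charges and depend on the treatment of long-range electrostatics. The function name, the unit charges and the two distances are assumptions chosen for illustration.

```python
# Coulomb constant in kcal*angstrom/(mol*e^2), a commonly used approximate value.
COULOMB_CONSTANT = 332.06

def coulomb_energy(q1, q2, distance, dielectric=1.0):
    """Point-charge Coulomb energy (kcal/mol) for charges in units of e.

    distance is in angstrom; dielectric is the relative permittivity.
    """
    if distance <= 0:
        raise ValueError("distance must be positive")
    return COULOMB_CONSTANT * q1 * q2 / (dielectric * distance)

# Hypothetical ion pair: a +1 side chain and a -1 phosphate group.
e_reference = coulomb_energy(1.0, -1.0, 4.0)  # pair at 4 angstrom
e_shifted = coulomb_energy(1.0, -1.0, 5.0)    # same pair shifted by 1 angstrom

# Fraction of the reference energy retained after the shift.
print(e_shifted / e_reference)  # 0.8
```

In this toy model a 1 Å displacement of a single ion pair already removes 20% of its interaction energy, which is consistent with an interaction pattern being qualitatively preserved while the relative interaction strengths differ.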

In the crystal structure, the interaction between the S8 protein and the RNA aptamer involves a strip of electropositive charge (as is also visible in Video 2 and Figure 2, panel A) along which the phosphate backbone of the aptamer traverses from residues A4–C7 and U27–A29: this behavior is correctly captured by our simulation, as depicted in Figure 2, panel D. Moreover, as can be noticed in Figure 2, panels C and D, SuMD correctly intercepts the polar interactions that form between the backbone of aptamer residues C16, C17, A24, U25, A26 and U27 and their counterparts on the protein side, such as Glu 126, Ser 107, Gly 124, Lys 110, Ser 109, Ala 91 and Thr 123. Additionally, as depicted in Figure 2, panel D, the SuMD protocol is also able to spot some water-mediated interactions, such as the contact between U27 and Glu 126. Finally, as depicted in Figure 2, panel D, SuMD can retrieve the stacked interaction between the peptide bond of highly conserved residues Ser 107–Thr 108–Ser 109 and the purine ring of A26: an analogous stacking interaction, involving A642, is also present in the complexes between the S8 protein and its natural RNA interactors, where it represents the only base-specific contact of the complex (99). All other geometric and energetic analyses performed on the trajectory are summarized in Supplementary Figures S6–S8 (Supplementary Materials).

As for the previous case, we retrospectively analyzed all trajectories using the same metrics utilized before (electrostatic interaction energy and MMGBSA interaction energy) to assess the predictive power of the method. As reported in Supplementary Figure S20, in this case too both metrics can successfully distinguish the native and native-like poses from the decoys. This observation further supports the idea that either of the two metrics could be utilized in a prospective application of SuMD to rank poses coming from different simulations and prioritize the ones most similar to the native binding mode.
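The retrospective check described above can be sketched as follows: the final states of different simulations are ranked by an energy metric, and the top-ranked pose is compared with the most geometrically accurate one. The data structure, the function names and all numeric values are hypothetical and serve only to illustrate the logic; they do not reproduce the data in the Supplementary Figures.

```python
def rank_poses(poses, metric):
    """Sort poses from most to least favorable (most negative energy first)."""
    return sorted(poses, key=lambda pose: pose[metric])

def top_ranked_is_most_accurate(poses, metric):
    """True if the best pose by energy is also the one with the lowest RMSD."""
    best_by_energy = rank_poses(poses, metric)[0]
    best_by_rmsd = min(poses, key=lambda pose: pose["rmsd"])
    return best_by_energy["name"] == best_by_rmsd["name"]

# Made-up final states of three replicas (RMSD in angstrom, energies in kcal/mol).
poses = [
    {"name": "replica_a", "rmsd": 3.0, "electrostatic": -900.0, "mmgbsa": -110.0},
    {"name": "replica_b", "rmsd": 12.0, "electrostatic": -400.0, "mmgbsa": -60.0},
    {"name": "replica_c", "rmsd": 20.0, "electrostatic": -250.0, "mmgbsa": -35.0},
]

for metric in ("electrostatic", "mmgbsa"):
    ranking = [pose["name"] for pose in rank_poses(poses, metric)]
    print(metric, ranking, top_ranked_is_most_accurate(poses, metric))
```

In a prospective application the RMSD column would not be available, so only `rank_poses` would be used; the comparison with the RMSD is what makes the validation retrospective.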

RNA aptamer bound to human factor Xa (PDB ID: 5VOE)

One of the key events of the coagulation cascade is the formation of the prothrombinase complex, a macromolecular machinery formed by the serine protease factor Xa (FXa) and its cofactor factor Va (FVa) (100,101). This membrane-mediated interaction enhances the catalytic activity of FXa, increasing the conversion of prothrombin into thrombin by a factor of ∼10⁵, and is therefore an interesting target for anticoagulation therapy (102,103). Despite the great effort devoted to the development of both small-molecule inhibitors of the FXa catalytic site and peptide inhibitors directed at epitopes on the binding interface with FVa, both approaches have led to disappointing therapeutic results, since the interaction surface is large and involves multiple hotspots (104,105) and inhibition at the catalytic site interferes with natural regulatory agents such as antithrombin III (106).

To avoid the limitations of traditional small-molecule ligands, in 2010 Buddai et al. developed 11F7t, an RNA aptamer that exerts a potent anticoagulant effect by binding to FXa with high affinity (Kd = 1.1 ± 0.2 nM) and selectivity (∼3000-fold over other coagulation proteases) and by inhibiting its interaction with FVa (107).

In 2018, Gunaratne et al. solved the crystal structure of 11F7t bound to FXa, allowing a better understanding of the key structural features that characterize this interaction (108). Specifically, in agreement with previous biochemical data, the analysis of the structure confirmed that the interaction occurs at a protein site that is implicated in the binding of both the anticoagulant drug heparin and coagulation factor Va (FVa) (109), with the interaction surface involving a central aptamer loop formed by residues C8, A10, A21 and C28–C30 and a protease exosite formed by Leu 59, Arg 64, Val 88, Ile 89, Asn 92, Arg 93, Lys 236 and Arg 240 (108). The absence of notable structural alterations of the protease upon aptamer binding, the therapeutic relevance of the target, and the relatively contained size of the aptamer (36 residues) led us to select this complex to validate the SuMD protocol. As in the case of structure 3DD2, 2′-fluoro-modified nucleotides were retro-mutated to the corresponding natural alternatives, and the presence of the two Mg²⁺ ions was not considered in the simulations.

As can be seen in Video 3 (Supplementary Materials), in this case about 20 ns of simulation time was sufficient to sample a presumptive association pathway between the RNA aptamer 11F7t and human coagulation factor Xa. As can be noticed in Figure 3 (panel A), the geometric accuracy of the SuMD protocol in reproducing the crystal complex was worse than in the first two cases, as also denoted by the 9.12 Å RMSD value between the final state of the simulation and the crystal reference. A first explanation of the lower geometric accuracy of the method can be found in the analysis of the classic MD simulations performed on both the crystal complex and the free aptamer: as highlighted by Supplementary Figure S14 (Supplementary Materials) and by the RMSD of the nucleic backbone in the crystal complex MD (5.98 Å) and the free aptamer MD (4.93 Å), this aptamer has a significantly higher degree of conformational freedom compared to the previous two cases, which increases the difficulty of sampling the binding event due to the reduced amount of time that the aptamer spends in the binding-competent conformation. Intriguingly, molecular docking also performed worse in this case than in the previous ones, with the most accurate pose having an RMSD of 3.17 Å to the crystal reference, indicating that the structural flexibility of the aptamer (which is not considered by molecular docking) is not the only determinant of the performance of the protocol.

Figure 3.


This panel encompasses the putative recognition pathway between the 11F7t RNA aptamer and human factor Xa described by the best trajectory obtained through the SuMD protocol (the one with the lowest RMSD to the crystal reference). (A) Visual representation of the RNA aptamer conformation sampled in the last frame of the SuMD trajectory (green) superposed with the native RNA aptamer conformation (yellow), as found in the crystal structure deposited in the Protein Data Bank with accession code 5VOE (RMSDSuMD-Crystal: 9.12 Å). The aptamer is represented as a ribbon, while the protein is represented as a Connolly surface colored according to the electrostatic potential as calculated with the APBS software (90), where red indicates a negatively charged area while blue indicates a positively charged one. (B) Profile of the ligand-receptor interaction energy (defined as the sum of the electrostatic and van der Waals contribution) throughout the recognition process as a function of both the simulation time and the RMSD between the ligand position during the trajectory and the ligand position in the crystal. (C) Receptor per-residue decomposition of the receptor–ligand interaction energy throughout the SuMD trajectory as a function of the simulation time: the 25 most-contacted residues are reported in the plot. (D) Per-residue interaction energy matrix: the 25 most-contacted residues for both the receptor and the ligand are considered, while each square composing the heatmap represents the average value of the interaction energy between the two paired residues alongside the trajectory.

Contrary to expectations, SuMD converged quite well with the experimental data from an interactive point of view: as can be seen in Supplementary Figure S15 (Supplementary Materials), SuMD can qualitatively retrieve most of the native crystal interactions even if, as in the case of complex 4PDB, the estimation of the relative interaction strength is not always congruent.

Specifically, as can be noticed in Figure 3, panel D, the SuMD protocol can correctly retrieve the pivotal part played by residues A10, A21 and C29–C30, while slightly underestimating the importance of contacts with residues C8 and C28. On the protein side, as illustrated by Figure 3, panel C, SuMD precisely describes the central role played by residues Arg 64 (283), Val 88 (308), Ile 89 (309), Asn 92 (312), Arg 93 (313), Lys 236 (460) and Arg 240 (464), missing only the interaction with Leu 59 (278). Interestingly, Arg 240 (464) and Lys 236 (460) are key residues for the binding of heparin, according to mutagenesis studies (110). Finally, SuMD analysis highlights how Arg 165 (387) and Lys 169 (391), two critical residues in the recognition of FXa by factor Va and/or prothrombin, are not contacted during the trajectory, in agreement with both the crystal structure and previous observations that pointed to the possibility that the abrogation of factor Va binding happens through an indirect effect rather than through occlusion of the interaction surface (107,108). All other geometric and energetic analyses performed on the trajectory are summarized in Supplementary Figures S11–S13 (Supplementary Materials).

Finally, encouraged by the promising insights provided by the first two cases, we retrospectively analyzed all trajectories using the same scoring metrics defined before, to establish whether they were once again able to distinguish the native and native-like poses from the decoys. As can be noticed in Supplementary Figure S21, both metrics disappointingly fail to prioritize the most geometrically accurate solution, preferring instead the second-best one. Curiously, the reference crystal is also scored poorly by both metrics, indicating that even if the SuMD protocol had been able to sample it, we would not have been able to prioritize it. As can be noticed in Supplementary Table S1, in this case HADDOCK also fails to rank the most geometrically accurate pose as the top solution, attributing to it a lower rank than a completely incorrect pose (RMSD: 19.79 Å), which ranks as the second-best one. The observation that HADDOCK, despite incorporating information about native contacts in its scoring protocol, has trouble correctly ranking poses for this case, combined with the intrinsic instability of the crystal complex, as indicated by the high RMSD value (5.98 Å) in the classic MD simulations and the low interaction energy values attributed by both metrics, indicates that this case might be an outlier, thus supporting the hypothesis that the previously proposed metrics could be used in prospective applications of the SuMD protocol.

RNA aptamer bound to SARS-CoV-2 spike glycoprotein receptor binding domain (RBD)

The outbreak of the COVID-19 pandemic in December 2019 caused an unprecedented worldwide public health crisis, leading to the death of more than six million people all over the world (111,112). This illness is caused by the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), a betacoronavirus able to infect human cells by expressing a surface glycoprotein known as spike (S) glycoprotein, which interacts through its receptor-binding domain (RBD) with the human angiotensin-converting enzyme 2 (hACE2) that mediates the viral uptake process in conjunction with the associated transmembrane serine protease 2 (TMPRSS2) (113). Due to the central role that this interaction plays in SARS-CoV-2 infectivity, the vast majority of therapeutic and prophylactic efforts in combating the COVID-19 pandemic have been directed towards the inhibition of the spike–hACE2 interaction, either through vaccination or administration of monoclonal antibodies (114,115).

To overcome the main disadvantages of monoclonal antibodies, such as their high production costs, poor room temperature stability, and immunogenicity, in 2021 Valero et al. performed a SELEX experiment for the identification of a serum-stable RNA aptamer that could tightly bind the RBD of SARS-CoV-2 spike protein preventing the interaction with hACE2 thereby neutralizing viral infectivity (59). This experiment led to the identification of RBD–PB6, an elongated stem-loop RNA aptamer that can selectively interact with the RBD with low nanomolar affinity (KD ≈ 18 nM), inhibiting the binding of RBD to hACE2 in a concentration-dependent manner (59).

Considering the encouraging results shown by the three applications of the SuMD protocol to the study of recognition processes between RNA aptamers and proteins and to illustrate a possible application of the protocol in a prospective scenario, we used SuMD simulations to shed light on the possible association pathway between the RBD–PB6 aptamer and the SARS-CoV-2 spike RBD. A model of the RNA aptamer structure was obtained through the 3dRNA webserver, based on the input primary sequence retrieved from the original publication and on the secondary structure prediction by the NUPACK webserver, while the structure of the SARS-CoV-2 spike RBD was retrieved from the crystal structure of the complex between the RBD and hACE2, deposited in the PDB with accession code 6M0J. The RBD structure that is present in this crystal comprises protein residues ranging from Thr 333 to Gly 526, slightly shorter than the construct used in the SELEX experiment which included residues from Arg 319 to Asn 532. However, residues that are not experimentally solved in the crystal structure are on the opposite side relative to the hACE2 interaction interface, so their absence should not impact the validity of the simulation. In agreement with experimental data that indicate how the RBD–PB6 RNA aptamer and hACE2 compete for the same binding site on the RBD surface, SuMD simulations were carried out to sample a putative recognition mechanism between RBD–PB6 and RBD surface that is responsible for interaction with hACE2.

As can be observed in Video 4 (Supplementary Materials), about 20 ns of simulation time was enough to sample a putative recognition mechanism between the RBD–PB6 aptamer and the SARS-CoV-2 spike RBD. Interestingly, as can be noticed in Figure 4, panel A, there is a fair level of convergence between the final state of the simulation and the docking-predicted binding mode, especially in the region ranging from A10 to A61, which is the minimal portion of the full-length aptamer that fully retains its binding capabilities to the RBD (backbone RMSD between the SuMD final state and the docking pose: 11.69 Å). Regarding the recognition mechanism proposed by the SuMD protocol, an analysis of the interaction pattern reveals that the first contacts on the aptamer side involve residues A10–A14 and G48–G52, all of which fall into the conserved aptamer moiety that is required for binding to RBD. After these initial contacts take place and steer the binding event, other ancillary stabilizing interactions occur, such as the one with residues U24–U26 and the one involving residues U62–U66. On the protein side, instead, the main contacts involve polar and charged residues, such as Thr 376, Lys 378, Arg 403, Arg 408, Tyr 449, Gln 498, Thr 500, Asn 501 and Tyr 505. Interestingly, the most-contacted residues during the trajectory do not include either Lys 417 or Glu 484, which are involved in the mutations K417N and E484K that characterize most viral variants with augmented infectivity compared to the wild-type virus. This evidence is in agreement with experimental data showing that the RBD–PB6 affinity for the spike protein is practically unaffected by these mutations (59). Conversely, SuMD simulations indicate Asn 501 (involved in the N501Y mutation) as one of the most important residues in the recognition process: unlike the other previously mentioned residues, Asn 501 is surrounded by other interacting residues that can be found both in the complex between hACE2 and RBD and in the final state of the SuMD simulation, such as Gln 498, Thr 500 and Tyr 505, which could justify the limited impact of this mutation on the binding affinity of the aptamer compared to the other two. Interestingly, the most recent viral variants of concern include mutations such as Q498R, Y505H, and D405N that increase the positive charge on the spike surface: based on the analysis of the interaction pattern predicted by SuMD (Figure 4, panel C), all these mutations should increase the affinity for the negatively charged RBD–PB6 RNA aptamer, justifying its affinity not only towards the alpha and beta variants of SARS-CoV-2 but also towards those from omicron onward (116,117). All other geometric and energetic analyses performed on the trajectory are summarized in Supplementary Figures S16–S18 (Supplementary Materials).

Figure 4.


This panel encompasses the putative recognition pathway between the RBD–PB6 RNA aptamer and the SARS-CoV-2 Spike RBD described by the best trajectory obtained through the SuMD protocol according to the MMGBSA interaction energy. (A) Visual representation of the RNA aptamer conformation sampled in the last frame of the SuMD trajectory (green) superposed with the best docking-predicted RNA aptamer conformation (yellow). The aptamer is represented as a ribbon, while the protein is represented as a Connolly surface colored according to the electrostatic potential as calculated with the APBS software (90), where red indicates a negatively charged area while blue indicates a positively charged one. (B) Profile of the ligand-receptor interaction energy (defined as the sum of the electrostatic and van der Waals contribution) throughout the recognition process as a function of both the simulation time and the RMSD between the ligand position during the trajectory and the ligand position in the crystal. (C) Receptor per-residue decomposition of the receptor–ligand interaction energy throughout the SuMD trajectory as a function of the simulation time: the 25 most-contacted residues are reported in the plot. (D) Per-residue interaction energy matrix: the 25 most-contacted residues for both the receptor and the ligand are considered, while each square composing the heatmap represents the average value of the interaction energy between the two paired residues alongside the trajectory.

DISCUSSION

In this scientific work, we presented the first-ever application of Supervised Molecular Dynamics (SuMD) to the study of recognition processes between RNA macromolecules and proteins. Specifically, we concentrated our efforts on the aptamer class, due to their relevance as both therapeutic and diagnostic tools.

Three different retrospective case studies, where the structure of the RNA aptamer–protein complex was available, were presented to validate the ability of the SuMD protocol to reproduce the experimental data. In all three cases, despite the intrinsic limitations derived from the relatively high conformational freedom of the ribonucleic ligands, SuMD was able to converge quite well, both from a geometric and interactive point of view, with the experimentally solved complex structures. The lower geometric accuracy did not impair the ability to retrieve most if not all the binding determinants of the native complex, with the increased RMSD compared to docking being related to the portion of the ligand not directly involved in the binding interface. Despite the increased complexity of the considered system compared to the usually investigated protein-small molecule complexes, the simulation times required to sample a putative recognition mechanism between the RNA aptamers and their protein target were comparable: in all presented cases, indeed, 10–20 ns of simulation time were sufficient to capture the entire association pathway, from the unbound state to the final complex. The reduced computational effort that the SuMD platform provides compared to classic, unsupervised, molecular dynamics simulation, makes it more suitable for its implementation in a drug discovery pipeline, flanking and complementing the role of already established approaches such as molecular docking. Due to the limited sampling capability of molecular dynamics-based methods compared to molecular docking, now the optimal strategy would be to combine the two techniques rather than using them in a mutually exclusive fashion: the rapidity of molecular docking could be useful to generate a series of a reasonable binding hypothesis that could be then more thoroughly investigated through SuMD simulations. 
The rapid increase in computational power available to scientists will hopefully make it possible to solve the sampling issue of MD-based methods, allowing them to fully replace more physically approximate methods such as molecular docking.
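The observation that the RMSD increase stems from the portion of the ligand outside the binding interface can be made concrete with a minimal sketch that compares a whole-ligand RMSD with an interface-restricted one. The function names, the 5 Å contact cutoff, and the toy coordinates are illustrative assumptions, not part of the published protocol; the sketch also assumes that pose and reference are already superposed on the receptor and share the same atom order.

```python
import numpy as np

def rmsd(a, b):
    """RMSD between two (N, 3) coordinate arrays with matching atom order (no fitting)."""
    diff = np.asarray(a, dtype=float) - np.asarray(b, dtype=float)
    return float(np.sqrt((diff ** 2).sum(axis=1).mean()))

def interface_mask(ligand_ref, receptor_ref, cutoff=5.0):
    """True for ligand atoms within `cutoff` (Å, assumed value) of any receptor atom."""
    lig = np.asarray(ligand_ref, dtype=float)
    rec = np.asarray(receptor_ref, dtype=float)
    dist = np.linalg.norm(lig[:, None, :] - rec[None, :, :], axis=2)
    return dist.min(axis=1) <= cutoff

# Hypothetical toy system: two ligand atoms contact the receptor, two are distal.
receptor_ref = np.array([[0.0, 3.0, 0.0], [1.0, 3.0, 0.0]])
ligand_ref = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0],
                       [10.0, 0.0, 0.0], [11.0, 0.0, 0.0]])

# Simulated pose: the interface atoms match the reference, the distal atoms are displaced.
ligand_pose = ligand_ref.copy()
ligand_pose[2:] += np.array([0.0, 4.0, 0.0])

mask = interface_mask(ligand_ref, receptor_ref)
whole_rmsd = rmsd(ligand_pose, ligand_ref)
interface_rmsd = rmsd(ligand_pose[mask], ligand_ref[mask])
print(f"whole ligand: {whole_rmsd:.2f} Å, interface only: {interface_rmsd:.2f} Å")
```

In this toy case the whole-ligand value is dominated by the displaced distal atoms, while the interface-restricted value stays low, which is the pattern described above for the aptamer simulations.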

Concerning the applicability of the SuMD protocol in a prospective scenario, we also presented a case study for which no experimental structure of the RNA aptamer–protein complex was available. In particular, because of its therapeutic relevance, we investigated the recognition process between the RBD–PB6 RNA aptamer, developed by Valero et al. (59), and the receptor-binding domain (RBD) of the SARS-CoV-2 spike protein. We showed that, even in the absence of an experimentally solved structure of the aptamer, the SuMD protocol can be coupled with structure prediction tools to give a structural prediction congruent with experimental evidence.

A crucial point for the prospective application of the SuMD protocol is the ability to discriminate between the native-like binding mode and decoys. In the three retrospective cases, the simulation that was analyzed and discussed in detail in the manuscript was the one with the best geometric agreement with the crystal reference, but such a metric cannot be applied to a prospective investigation of complexes whose structures have not yet been solved. By retrospectively analyzing the geometric and energetic profiles of the generated trajectories, we noticed that the electrostatic component of the interaction energy plays a fundamental role in steering the association process. In particular, as can be seen in Supplementary Figures S19–S21 (Supplementary Materials), both the MMGBSA interaction energy and its electrostatic component alone can discriminate and prioritize the native complex and native-like poses over decoys in two out of three case studies. The only exception is the complex between the RNA aptamer 11F7t and human factor Xa (PDB ID: 5VOE), for which both the electrostatic and MMGBSA scoring metrics indicate the second most geometrically accurate pose as the one with the most favorable energetic profile. Interestingly, in this case, molecular docking also fails to prioritize the most geometrically accurate pose, ranking it as the third-best one (Supplementary Table S1, Supplementary Materials). The fact that docking and SuMD, despite using different scoring metrics, both failed to prioritize the native-like conformation suggests that there is still room for improvement in the scoring of complexes involving nucleic acids.
Nevertheless, the MMGBSA and/or electrostatic interaction energy can still be relatively accurate as scoring metrics for suggesting reasonable binding mode hypotheses that are congruent with experimental evidence, as previously pointed out in a 2018 benchmark study by Chen et al. (118). For these reasons, we opted to use MMGBSA as the scoring metric for our prospective study of the interaction between the RNA aptamer RBD–PB6 and the SARS-CoV-2 spike RBD.
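As a rough illustration of how an electrostatic interaction term can be used to prioritize poses, the sketch below evaluates a plain pairwise Coulomb sum between two sets of point charges and ranks poses from most to least favorable. This is a simplified stand-in under stated assumptions (point charges in units of e, distances in Å, a uniform dielectric, no cutoff or solvation term); it is not the MMGBSA calculation used in the study, and the function names and toy charges are hypothetical.

```python
import numpy as np

# Coulomb constant in kcal·Å/(mol·e²), the value commonly used in molecular mechanics.
COULOMB_KCAL = 332.0637

def coulomb_interaction(q_a, xyz_a, q_b, xyz_b, dielectric=1.0):
    """Pairwise Coulomb energy (kcal/mol) between two groups of point charges."""
    q_a = np.asarray(q_a, dtype=float)
    q_b = np.asarray(q_b, dtype=float)
    xyz_a = np.asarray(xyz_a, dtype=float)
    xyz_b = np.asarray(xyz_b, dtype=float)
    r = np.linalg.norm(xyz_a[:, None, :] - xyz_b[None, :, :], axis=2)
    return float(COULOMB_KCAL * (np.outer(q_a, q_b) / r).sum() / dielectric)

def rank_by_energy(energies):
    """Pose labels sorted from the most favorable (lowest) to the least favorable energy."""
    return sorted(energies, key=energies.get)

# Hypothetical toy poses: one negative "phosphate" charge facing one positive "lysine" charge.
protein_q, protein_xyz = [1.0], [[0.0, 0.0, 0.0]]
poses = {
    "pose_near": [[0.0, 0.0, 4.0]],   # ligand charge 4 Å from the protein charge
    "pose_far": [[0.0, 0.0, 12.0]],   # ligand charge 12 Å from the protein charge
}
energies = {
    name: coulomb_interaction(protein_q, protein_xyz, [-1.0], xyz)
    for name, xyz in poses.items()
}
ranking = rank_by_energy(energies)
print(ranking)
```

In a retrospective setting, such a ranking can be compared with the RMSD ranking against the experimental reference; in a prospective setting only the energy-based ranking is available.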

The last aspect worth addressing is the choice of residues used to supervise the association process throughout the SuMD trajectory. As in molecular docking, where the user defines the binding site through a sphere or a box wrapped around the area of interest, SuMD requires the user to specify a residue selection on both the receptor and the ligand side. These selections are used to compute the distance between the centers of mass of the binding site and of the ligand, which is then fed to the supervision algorithm. The choice of residues is usually based on prior knowledge of the interaction site derived from experimental evidence, but in some cases this choice is not obvious. A first possible solution is to analyze the electrostatic potential of the receptor surface: as can be seen in Videos 1–4, the recognition between the RNA aptamers and their protein targets usually involves a high level of electrostatic complementarity, with the negatively charged ribonucleic surface accommodated by positively charged patches on the protein side. A second possible solution is to perform a docking calculation to obtain a first indication of the preferred binding mode of the ligand, followed by a more extensive characterization of that binding mode through SuMD simulations. This second solution was used in this article for the study of the interaction between the RBD–PB6 RNA aptamer and the SARS-CoV-2 spike RBD.
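The supervised variable described above can be sketched in a few lines: compute the mass-weighted centers of the two residue selections, take their distance, and check whether that distance tends to decrease over a short simulation window. The slope-based acceptance test follows the general description of SuMD in the literature and is an assumption here, as are the function names; the released SuMD code may implement the supervision differently.

```python
import numpy as np

def center_of_mass(xyz, masses):
    """Mass-weighted center of an (N, 3) coordinate array."""
    xyz = np.asarray(xyz, dtype=float)
    masses = np.asarray(masses, dtype=float)
    return (xyz * masses[:, None]).sum(axis=0) / masses.sum()

def com_distance(site_xyz, site_masses, lig_xyz, lig_masses):
    """Distance between the centers of mass of the binding-site and ligand selections."""
    delta = center_of_mass(site_xyz, site_masses) - center_of_mass(lig_xyz, lig_masses)
    return float(np.linalg.norm(delta))

def window_is_productive(distances):
    """Assumed acceptance rule: keep a short MD window if a straight line fitted to
    the sampled distances has a negative slope, i.e. the ligand is approaching."""
    steps = np.arange(len(distances))
    slope = np.polyfit(steps, np.asarray(distances, dtype=float), 1)[0]
    return bool(slope < 0)

# Hypothetical selections: two equal-mass site atoms and a single ligand atom.
site_xyz, site_masses = [[0.0, 0.0, 0.0], [2.0, 0.0, 0.0]], [12.0, 12.0]
lig_xyz, lig_masses = [[1.0, 0.0, 20.0]], [31.0]
print(com_distance(site_xyz, site_masses, lig_xyz, lig_masses))

# Distances (Å) sampled within two hypothetical windows.
print(window_is_productive([20.0, 19.2, 18.9, 18.1]))  # ligand approaching
print(window_is_productive([18.1, 18.6, 19.4, 20.3]))  # ligand drifting away
```

Because the supervision acts only on this single distance, the residue selection determines which region of the receptor the ligand is steered toward, which is why the choice discussed above matters.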

The possibility of investigating different binding sites and binding mode hypotheses can also be viewed as a strength of the SuMD technique. For example, in the case of ribosomal protein S8, two different RNA recognition sites are available for the aptamer: the site involved in the interaction with helix 21 and the site that mediates the interaction with helix 25 (99). SuMD simulations would allow both possibilities to be investigated, elucidating the mechanistic details that determine the preferential recognition of the primary binding site and thereby helping the rational development of selective binders.

Furthermore, concerning the exploration of different binding hypotheses, SuMD would make it possible to probe alternative stoichiometries. For example, in the case of aptamer 11F7t, Buddai et al. noticed a peculiar and difficult-to-rationalize binding stoichiometry, in addition to a strong Ca2+ dependence of the interaction (107). The authors discussed various possibilities, including an effect on the protein and/or aptamer structure, as well as a possible calcium-induced aptamer dimerization (107). In this case, the SuMD technique would have allowed the exploration of all these hypotheses, which cannot be investigated through static, time-independent techniques such as molecular docking.

DATA AVAILABILITY

The code to perform SuMD simulations is available free of charge at https://github.com/molecularmodelingsection/SuMD and at https://doi.org/10.5281/zenodo.7289442. The script utilized to perform analysis on SuMD trajectories is also available at https://github.com/molecularmodelingsection/SuMD-analyzer and at https://doi.org/10.5281/zenodo.728944. All trajectories presented in the article can be found at https://doi.org/10.5281/zenodo.6973437.

ACKNOWLEDGEMENTS

MMS lab is very grateful to Chemical Computing Group, OpenEye, and Acellera for their scientific and technical partnership. MMS lab gratefully acknowledges the support of NVIDIA Corporation with the donation of the Titan V GPU used for this research. This work has been supported by “PNRR M4C2-Investimento 1.4-CN00000041”, financed by NextGenerationEU.

Contributor Information

Matteo Pavan, Molecular Modeling Section (MMS), Department of Pharmaceutical and Pharmacological Sciences, University of Padova, via Marzolo 5, 35131 Padova, Italy.

Davide Bassani, Molecular Modeling Section (MMS), Department of Pharmaceutical and Pharmacological Sciences, University of Padova, via Marzolo 5, 35131 Padova, Italy.

Mattia Sturlese, Molecular Modeling Section (MMS), Department of Pharmaceutical and Pharmacological Sciences, University of Padova, via Marzolo 5, 35131 Padova, Italy.

Stefano Moro, Molecular Modeling Section (MMS), Department of Pharmaceutical and Pharmacological Sciences, University of Padova, via Marzolo 5, 35131 Padova, Italy.

SUPPLEMENTARY DATA

Supplementary Data are available at NARGAB Online.

FUNDING

None declared.

Conflict of interest statement. None declared.

REFERENCES

  • 1. Gilbert W. Origin of life: the RNA world. Nature. 1986; 319:618–618. [Google Scholar]
  • 2. Breaker R.R., Joyce G.F.. The expanding view of RNA and DNA function. Chem. Biol. 2014; 21:1059–1065. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 3. Hangauer M.J., Vaughn I.W., McManus M.T.. Pervasive transcription of the human genome produces thousands of previously unidentified long intergenic noncoding RNAs. PLoS Genet. 2013; 9:e1003569. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 4. Cheetham S.W., Gruhl F., Mattick J.S., Dinger M.E.. Long noncoding RNAs and the genetics of cancer. Br. J. Cancer. 2013; 108:2419–2425. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 5. Esteller M. Non-coding RNAs in human disease. Nat. Rev. Genet. 2011; 12:861–874. [DOI] [PubMed] [Google Scholar]
  • 6. Morris K.V., Mattick J.S.. The rise of regulatory RNA. Nat. Rev. Genet. 2014; 15:423–437. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 7. Luo M., Groaz E., De Jonghe S., Snoeck R., Andrei G., Herdewijn P.. Amidate prodrugs of cyclic 9-(S)-(3-Hydroxy-2-(phosphonomethoxy)propyl)adenine with potent anti-herpesvirus activity. ACS Med. Chem. Lett. 2018; 9:381–385. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 8. Bartel D.P. MicroRNAs: genomics, biogenesis, mechanism, and function. Cell. 2004; 116:281–297. [DOI] [PubMed] [Google Scholar]
  • 9. Ponting C.P., Oliver P.L., Reik W.. Evolution and functions of long noncoding RNAs. Cell. 2009; 136:629–641. [DOI] [PubMed] [Google Scholar]
  • 10. Salmon L., Yang S., Al-Hashimi H.M.. Advances in the determination of nucleic acid conformational ensembles. Annu. Rev. Phys. Chem. 2014; 65:293–316. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 11. Cruz J.A., Westhof E.. The dynamic landscapes of RNA architecture. Cell. 2009; 136:604–609. [DOI] [PubMed] [Google Scholar]
  • 12. Bartel D.P. MicroRNAs: target recognition and regulatory functions. Cell. 2009; 136:215–233. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 13. Draper D.E. Protein-RNA recognition. Annu. Rev. Biochem. 1995; 64:593–620. [DOI] [PubMed] [Google Scholar]
  • 14. Lorger M., Engstler M., Homann M., Göringer H.U.. Targeting the variable surface of african trypanosomes with variant surface glycoprotein-specific, serum-stable RNA aptamers. Eukaryot. Cell. 2003; 2:84–94. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 15. Disney M.D. Targeting RNA with small molecules to capture opportunities at the intersection of chemistry, biology, and medicine. J. Am. Chem. Soc. 2019; 141:6776–6790. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 16. Stoltenburg R., Reinemann C., Strehlitz B.. SELEX—A (r)evolutionary method to generate high-affinity nucleic acid ligands. Biomol. Eng. 2007; 24:381–403. [DOI] [PubMed] [Google Scholar]
  • 17. Keefe A.D., Pai S., Ellington A.. Aptamers as therapeutics. Nat. Rev. Drug Discov. 2010; 9:537–550. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 18. Jayasena S.D. Aptamers: an emerging class of molecules that rival antibodies in diagnostics. Clin. Chem. 1999; 45:1628–1650. [PubMed] [Google Scholar]
  • 19. Jones S., Daley D.T.A., Luscombe N.M., Berman H.M., Thornton J.M.. Protein–RNA interactions: a structural analysis. Nucleic Acids Res. 2001; 29:943–954. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 20. Morozova N., Myers J., Shamoo Y.. Protein–RNA interactions: exploring binding patterns with a three-dimensional superposition analysis of high resolution structures. Bioinformatics. 2006; 22:2746–2752. [DOI] [PubMed] [Google Scholar]
  • 21. Reynolds A., Leake D., Boese Q., Scaringe S., Marshall W.S., Khvorova A.. Rational siRNA design for RNA interference. Nat. Biotechnol. 2004; 22:326–330. [DOI] [PubMed] [Google Scholar]
  • 22. Boese Q., Leake D., Reynolds A., Read S., Scaringe S.A., Marshall W.S., Khvorova A.. Mechanistic insights aid computational short interfering RNA design. Methods Enzymol. 2005; 392:73–96. [DOI] [PubMed] [Google Scholar]
  • 23. Meng X.Y., Zhang H.X., Mezei M., Cui M.. Molecular docking: a powerful approach for structure-based drug discovery. Curr. Comput. Aided-Drug Des. 2012; 7:146–157. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 24. Kuntz I.D., Blaney J.M., Oatley S.J., Langridge R., Ferrin T.E.. A geometric approach to macromolecule-ligand interactions. J. Mol. Biol. 1982; 161:269–288. [DOI] [PubMed] [Google Scholar]
  • 25. Dominguez C., Boelens R., Bonvin A.. HADDOCK: a protein-protein docking approach based on biochemical or biophysical information. J. Am. Chem. Soc. 2003; 125:1731–1737. [DOI] [PubMed] [Google Scholar]
  • 26. Ciemny M., Kurcinski M., Kamel K., Kolinski A., Alam N., Schueler-Furman O., Kmiecik S.. Protein–peptide docking: opportunities and challenges. Drug Discovery Today. 2018; 23:1530–1537. [DOI] [PubMed] [Google Scholar]
  • 27. Pedotti M., Simonelli L., Livoti E., Varani L.. Computational docking of antibody-antigen complexes, opportunities and pitfalls illustrated by influenza hemagglutinin. Int. J. Mol. Sci. 2011; 12:226. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 28. David Morley S., Afshar M.. Validation of an empirical RNA-ligand scoring function for fast flexible docking using ribodock. J. Comput. Aided Mol. Des. 2004; 18:189–208. [DOI] [PubMed] [Google Scholar]
  • 29. Nithin C., Ghosh P., Bujnicki J.M.. Bioinformatics tools and benchmarks for computational docking and 3D structure prediction of RNA-Protein complexes. Genes. 2018; 9:432. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 30. Disney M.D. Targeting RNA with small molecules to capture opportunities at the intersection of chemistry, biology, and medicine. J. Am. Chem. Soc. 2019; 141:6776. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 31. Fulle S., Gohlke H.. Molecular recognition of RNA: challenges for modelling interactions and plasticity. J. Mol. Recognit. 2010; 23:220–231. [DOI] [PubMed] [Google Scholar]
  • 32. Hermann T. Rational ligand design for RNA: the role of static structure and conformational flexibility in target recognition. Biochimie. 2002; 84:869–875. [DOI] [PubMed] [Google Scholar]
  • 33. De Vivo M., Masetti M., Bottegoni G., Cavalli A.. Role of molecular dynamics and related methods in drug discovery. J. Med. Chem. 2016; 59:4035–4061. [DOI] [PubMed] [Google Scholar]
  • 34. Bernardi R.C., Melo M.C.R., Schulten K.. Enhanced sampling techniques in molecular dynamics simulations of biological systems. Biochim. Biophys. Acta Gen. Subj. 2015; 1850:872–877. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 35. Sabbadin D., Moro S.. Supervised molecular dynamics (SuMD) as a helpful tool to depict GPCR-ligand recognition pathway in a nanosecond time scale. J. Chem. Inf. Model. 2014; 54:372–376. [DOI] [PubMed] [Google Scholar]
  • 36. Deganutti G., Moro S., Ciruela F., Sotelo E.. Supporting the identification of novel fragment-based positive allosteric modulators using a supervised molecular dynamics approach: a retrospective analysis considering the human A2A adenosine receptor as a key example. Molecules. 2017; 22:818. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 37. Bolcato G., Bissaro M., Sturlese M., Moro S.. Comparing fragment binding poses prediction using HSP90 as a key study: when bound water makes the difference. Molecules. 2020; 25:4651. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 38. Ferrari F., Bissaro M., Fabbian S., De Almeida Roger J., Mammi S., Moro S., Bellanda M., Sturlese M.. HT-SuMD: making molecular dynamics simulations suitable for fragment-based screening. a comparative study with NMR. J. Enzyme. Inhib. Med. Chem. 2020; 36:1–14. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 39. Bissaro M., Bolcato G., Pavan M., Bassani D., Sturlese M., Moro S.. Inspecting the mechanism of fragment hits binding on SARS-CoV-2 Mpro by using supervised molecular dynamics (SuMD) simulations. ChemMedChem. 2021; 16:2075–2081. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 40. Bolcato G., Cescon E., Pavan M., Bissaro M., Bassani D., Federico S., Spalluto G., Sturlese M., Moro S.. A computational workflow for the identification of novel fragments acting as inhibitors of the activity of protein kinase CK1δ. Int. J. Mol. Sci. 2021; 22:9741. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 41. Grosjean H., Işık M., Aimon A., Mobley D., Chodera J., von Delft F., Biggin P.C.. SAMPL7 protein-ligand challenge: a community-wide evaluation of computational methods against fragment screening and pose-prediction. J. Comput. Aided Mol. Des. 2022; 36:291–311. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 42. Bolcato G., Bissaro M., Pavan M., Sturlese M., Moro S.. Targeting the coronavirus SARS-CoV-2: computational insights into the mechanism of action of the protease inhibitors lopinavir, ritonavir and nelfinavir. Sci. Rep. 2020; 10:20927. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 43. Pavan M., Bolcato G., Bassani D., Sturlese M., Moro S.. Supervised molecular dynamics (SuMD) insights into the mechanism of action of SARS-CoV-2 main protease inhibitor PF-07321332. J. Enzyme Inhib. Med. Chem. 2021; 36:1646–1650. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 44. Bolcato G., Pavan M., Bassani D., Sturlese M., Moro S.. Ribose and non-ribose A2A adenosine receptor agonists: do they share the same receptor recognition mechanism?. Biomedicines. 2022; 10:515. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 45. Hassankalhori M., Bolcato G., Bissaro M., Sturlese M., Moro S.. Shedding light on the molecular recognition of sub-kilodalton macrocyclic peptides on thrombin by supervised molecular dynamics. Front. Mol. Biosci. 2021; 8:730. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 46. Salmaso V., Sturlese M., Cuzzolin A., Moro S.. Exploring protein-peptide recognition pathways using a supervised molecular dynamics approach. Structure. 2017; 25:655–662. [DOI] [PubMed] [Google Scholar]
  • 47. Bissaro M., Federico S., Salmaso V., Sturlese M., Spalluto G., Moro S.. Targeting protein kinase CK1δ with riluzole: could it be one of the possible missing bricks to interpret its effect in the treatment of ALS from a molecular point of view?. ChemMedChem. 2018; 13:2601–2605. [DOI] [PubMed] [Google Scholar]
  • 48. Panday S.K., Sturlese M., Salmaso V., Ghosh I., Moro S.. Coupling supervised molecular dynamics (SuMD) with entropy estimations to shine light on the stability of multiple binding sites. ACS Med. Chem. Lett. 2019; 10:444–449. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 49. Deganutti G., Cuzzolin A., Ciancetta A., Moro S.. Understanding allosteric interactions in g protein-coupled receptors using supervised molecular dynamics: a prototype study analysing the human A3 adenosine receptor positive allosteric modulator LUF6000. Bioorg. Med. Chem. 2015; 23:4065–4071. [DOI] [PubMed] [Google Scholar]
  • 50. Paoletta S., Sabbadin D., von Kügelgen I., Hinz S., Katritch V., Hoffmann K., Abdelrahman A., Straßburger J., Baqi Y., Zhao Q.et al.. Modeling ligand recognition at the P2Y12 receptor in light of X-ray structural information. J. Comput. Aided Mol. Des. 2015; 29:737–756. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 51. Palazzotti D., Bissaro M., Bolcato G., Astolfi A., Felicetti T., Sabatini S., Sturlese M., Cecchetti V., Barreca M.L., Moro S.. Deciphering the molecular recognition mechanism of multidrug resistance staphylococcus aureus NorA efflux pump using a supervised molecular dynamics approach. Int. J. Mol. Sci. 2019; 20:4041. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 52. Cuzzolin A., Sturlese M., Deganutti G., Salmaso V., Sabbadin D., Ciancetta A., Moro S.. Deciphering the complexity of ligand-protein recognition pathways using supervised molecular dynamics (SuMD) simulations. J. Chem. Inf. Model. 2016; 56:687–705. [DOI] [PubMed] [Google Scholar]
  • 53. Deganutti G., Moro S., Reynolds C.A.. A supervised molecular dynamics approach to unbiased ligand-protein unbinding. ACS Appl. Mater. Interfaces. 2020; 2020:1804–1817. [DOI] [PubMed] [Google Scholar]
  • 54. Bissaro M., Sturlese M., Moro S.. Exploring the RNA-Recognition mechanism using supervised molecular dynamics (SuMD) simulations: toward a rational design for ribonucleic-targeting molecules?. Front. Chem. 2020; 8:107. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 55. Berman H.M., Westbrook J., Feng Z., Gilliland G., Bhat T.N., Weissig H., Shindyalov I.N., Bourne P.E.. The protein data bank. Nucleic Acids Res. 2000; 28:235. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 56. Berman H.M. The protein data bank. Nucleic Acids Res. 2000; 28:235–242. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 57. Molecular Operating Environment. Chemical Computing Group ULC, 1010 Sherbrooke St. West, Suite #910, Montreal, QC, Canada, H3A 2R7, 2022. 2021; https://www.chemcomp.com/Research-Citing_MOE.htm (19 January 2021, date last accessed). [Google Scholar]
  • 58. Lan J., Ge J., Yu J., Shan S., Zhou H., Fan S., Zhang Q., Shi X., Wang Q., Zhang L.et al.. Structure of the SARS-CoV-2 spike receptor-binding domain bound to the ACE2 receptor. Nature. 2020; 581:215–220. [DOI] [PubMed] [Google Scholar]
  • 59. Valero J., Civit L., Dupont D.M., Selnihhin D., Reinert L.S., Idorn M., Israels B.A., Bednarz A.M., Bus C., Asbach B.et al.. A serum-stable RNA aptamer specific for SARS-CoV-2 neutralizes viral entry. Proc. Natl. Acad. Sci. U.S.A. 2021; 118:e2112942118. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 60. Zadeh J.N., Steenberg C.D., Bois J.S., Wolfe B.R., Pierce M.B., Khan A.R., Dirks R.M., Pierce N.A.. NUPACK: analysis and design of nucleic acid systems. J. Comput. Chem. 2011; 32:170–173. [DOI] [PubMed] [Google Scholar]
  • 61. Wang J., Wang J., Huang Y., Xiao Y.. 3dRNA v2.0: an updated web server for RNA 3D structure prediction. Int. J. Mol. Sci. 2019; 20:4116. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 62. Humphrey W., Dalke A., Schulten K.. VMD: visual molecular dynamics. J. Mol. Graph. 1996; 14:33–38. [DOI] [PubMed] [Google Scholar]
  • 63. Case D.A., Cheatham T.E. 3rd, Darden T., Gohlke H., Luo R., Merz K.M. Jr, Onufriev A., Simmerling C., Wang B., Woods R.J.. The amber biomolecular simulation programs. J. Comput. Chem. 2005; 26:1668–1688. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 64. Maier J.A., Martinez C., Kasavajhala K., Wickstrom L., Hauser K.E., Simmerling C.. ff14SB: improving the accuracy of protein side chain and backbone parameters from ff99SB. J. Chem. Theory Comput. 2015; 11:3696–3713. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 65. Pérez A., Marchán I., Svozil D., Sponer J., Cheatham T.E. 3rd, Laughton C.A., Orozco M.. Refinement of the AMBER force field for nucleic acids: improving the description of α/γ conformers. Biophys. J. 2007; 92:3817–3829. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 66. Zgarbová M., Otyepka M., Sponer J., Mládek A., Banáš P., Cheatham T.E. 3rd, Jurečka P.. Nucleic acids force field based on reference quantum chemical calculations of glycosidic torsion profiles. J. Chem. Theory Comput. 2011; 7:2886–2902. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 67. Jorgensen W.L., Chandrasekhar J., Madura J.D., Impey R.W., Klein M.L.. Comparison of simple potential functions for simulating liquid water. J. Chem. Phys. 1983; 79:926–935. [Google Scholar]
  • 68. Davidchack R.L., Handel R., Tretyakov M.V.. Langevin thermostat for rigid body dynamics. J. Chem. Phys. 2009; 130:234101. [DOI] [PubMed] [Google Scholar]
  • 69. Kräutler V., Van Gunsteren W.F., Hünenberger P.H.. A fast SHAKE algorithm to solve distance constraint equations for small molecules in molecular dynamics simulations. J. Comput. Chem. 2001; 22:501–508. [Google Scholar]
  • 70. Essmann U., Perera L., Berkowitz M.L., Darden T., Lee H., Pedersen L.G.. A smooth particle mesh ewald method. J. Chem. Phys. 1998; 103:8577. [Google Scholar]
  • 71. Faller R., De Pablo J.J.. Constant pressure hybrid molecular dynamics–monte carlo simulations. J. Chem. Phys. 2001; 116:55. [Google Scholar]
  • 72. Harvey M.J., Giupponi G., De Fabritiis G.. ACEMD: accelerating biomolecular dynamics in the microsecond time scale. J. Chem. Theory Comput. 2009; 5:1632–1639. [DOI] [PubMed] [Google Scholar]
  • 73. Eastman P., Swails J., Chodera J.D., McGibbon R.T., Zhao Y., Beauchamp K.A., Wang L.P., Simmonett A.C., Harrigan M.P., Stern C.D. et al. OpenMM 7: rapid development of high performance algorithms for molecular dynamics. PLoS Comput. Biol. 2017; 13:e1005659. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 74. Bakan A., Meireles L.M., Bahar I.. ProDy: protein dynamics inferred from theory and experiments. Bioinformatics. 2011; 27:1575–1577. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 75. Michaud-Agrawal N., Denning E.J., Woolf T.B., Beckstein O.. MDAnalysis: a toolkit for the analysis of molecular dynamics simulations. J. Comput. Chem. 2011; 32:2319–2327. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 76. Gowers R.J., Linke M., Barnoud J., Reddy T.J.E., Melo M.N., Seyler S.L., Domanski J., Dotson D.L., Buchoux S., Kenney I.M.et al.. MDAnalysis: a python package for the rapid analysis of molecular dynamics simulations. Proceedings of the 15th Python in Science Conference. 2016; 98–105. [Google Scholar]
  • 77. Hunter J.D. Matplotlib: a 2D graphics environment. Comput. Sci. Eng. 2007; 9:90–95. [Google Scholar]
  • 78. Phillips J.C., Hardy D.J., Maia J.D.C., Stone J.E., Ribeiro J.V., Bernardi R.C., Buch R., Fiorin G., Hénin J., Jiang W.et al.. Scalable molecular dynamics on CPU and GPU architectures with NAMD. J. Chem. Phys. 2020; 153:044130. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 79. van Zundert G.C.P., Rodrigues J., Trellet M., Schmitz C., Kastritis P.L., Karaca E., Melquiond A.S.J., van Dijk M., de Vries S.J., Bonvin A.. The HADDOCK2.2 web server: user-friendly integrative modeling of biomolecular complexes. J. Mol. Biol. 2016; 428:720–725. [DOI] [PubMed] [Google Scholar]
  • 80. Jorgensen W.L., Tirado-Rives J.. The OPLS (optimized potentials for liquid simulations) potential functions for proteins, energy minimizations for crystals of cyclic peptides and crambin. J. Am. Chem. Soc. 1988; 110:1657–1666. [DOI] [PubMed] [Google Scholar]
  • 81. Rodrigues J.P., Trellet M., Schmitz C., Kastritis P., Karaca E., Melquiond A.S., Bonvin A.M.. Clustering biomolecular complexes by residue contacts similarity. Proteins Struct. Funct. Bioinf. 2012; 80:1810–1817. [DOI] [PubMed] [Google Scholar]
  • 82. Stubbs M.T., Bode W.. The clot thickens: clues provided by thrombin structure. Trends Biochem. Sci. 1995; 20:23–28. [DOI] [PubMed] [Google Scholar]
  • 83. Hoffman M., Monroe D.M.. A cell-based model of hemostasis. Thromb. Haemostasis. 2001; 85:958–965. [PubMed] [Google Scholar]
  • 84. Di Cera E. Thrombin interactions. Chest. 2003; 124:11S–17S. [DOI] [PubMed] [Google Scholar]
  • 85. Rau J.C., Beaulieu L.M., Huntington J.A., Church F.C.. Serpins in thrombosis, hemostasis and fibrinolysis. J. Thromb. Haemost. 2007; 5:102–115. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 86. White R., Rusconi C., Scardino E., Wolberg A., Lawson J., Hoffman M., Sullenger B.. Generation of species Cross-reactive aptamers using “Toggle” SELEX. Mol. Ther. 2001; 4:567–573. [DOI] [PubMed] [Google Scholar]
  • 87. Long S.B., Long M.B., White R.R., Sullenger B.A.. Crystal structure of an RNA aptamer bound to thrombin. RNA. 2008; 14:2504–2512. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 88. Vangaveti S., Ranganathan S.V., Chen A.A.. Advances in RNA molecular dynamics: a simulator's guide to RNA force fields. Wiley Interdiscipl. Rev.: RNA. 2017; 8:e1396. [DOI] [PubMed] [Google Scholar]
  • 89. Giambaşu G.M., Case D.A., York D.M.. Predicting site-binding modes of ions and water to nucleic acids using molecular solvation theory. J. Am. Chem. Soc. 2019; 141:2435–2445. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 90. Jurrus E., Engel D., Star K., Monson K., Brandi J., Felberg L.E., Brookes D.H., Wilson L., Chen J., Liles K. et al. Improvements to the APBS biomolecular solvation software suite. Protein Sci. 2018; 27:112–128. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 91. Jeter M.L., Ly L.V., Fortenberry Y.M., Whinna H.C., White R.R., Rusconi C.P., Sullenger B.A., Church F.C.. RNA aptamer to thrombin binds anion-binding exosite-2 and alters protease inhibition by heparin-binding serpins. FEBS Lett. 2004; 568:10–14. [DOI] [PubMed] [Google Scholar]
  • 92. Poehlsgaard J., Douthwaite S.. The bacterial ribosome as a target for antibiotics. Nat. Rev. Microbiol. 2005; 3:870–881. [DOI] [PubMed] [Google Scholar]
  • 93. Nierhaus K.H. The assembly of the prokaryotic ribosome. Biosystems. 1980; 12:273–282. [DOI] [PubMed] [Google Scholar]
  • 94. Nomura M., Yates J.L., Dean D., Post L.E.. Feedback regulation of ribosomal protein gene expression in escherichia coli: structural homology of ribosomal RNA and ribosomal protein mRNA. Proc. Natl. Acad. Sci. U.S.A. 1980; 77:7084–7088. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 95. Wu H., Jiang L., Zimmermann R.A.. The binding site for ribosomal protein S8 in 16S rRNA and spc mRNA from escherichia coli: minimum structural requirements and the effects of single bulged bases on S8-RNA interaction. Nucleic Acids Res. 1994; 22:1687. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 96. Merianos H.J., Wang J., Moore P.B.. The structure of a ribosomal protein S8/spc operon mRNA complex. RNA. 2004; 10:954–964. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 97. Tishchenko S., Nikulin A., Fomenkova N., Nevskaya N., Nikonov O., Dumas P., Moine H., Ehresmann B., Ehresmann C., Piendl W. et al. Detailed analysis of RNA-protein interactions within the ribosomal protein S8-rRNA complex from the archaeon Methanococcus jannaschii. J. Mol. Biol. 2001; 311:311–324. [DOI] [PubMed] [Google Scholar]
  • 98. Brodersen D.E., Clemons W.M., Carter A.P., Wimberly B.T., Ramakrishnan V.. Crystal structure of the 30 s ribosomal subunit from thermus thermophilus: structure of the proteins and their interactions with 16 s RNA. J. Mol. Biol. 2002; 316:725–768. [DOI] [PubMed] [Google Scholar]

Associated Data

Supplementary Materials

lqac088_Supplemental_Files

Data Availability Statement

The code to perform SuMD simulations is freely available at https://github.com/molecularmodelingsection/SuMD and at https://doi.org/10.5281/zenodo.7289442. The script used to analyze SuMD trajectories is available at https://github.com/molecularmodelingsection/SuMD-analyzer and at https://doi.org/10.5281/zenodo.728944. All trajectories presented in the article can be found at https://doi.org/10.5281/zenodo.6973437.


Articles from NAR Genomics and Bioinformatics are provided here courtesy of Oxford University Press
