Mining for Ligandable Cavities in RNA

Jingru Xie; Aaron T Frank

doi:10.1021/acsmedchemlett.1c00068

ACS Med Chem Lett. 2021 Jun 1;12(6):928–934. doi: 10.1021/acsmedchemlett.1c00068


PMCID: PMC8201482 PMID: 34141071

Abstract


Identifying potential ligand binding cavities is a critical step in structure-based screening of biomolecular targets. Cavity mapping methods can detect such binding cavities; however, for ribonucleic acid (RNA) targets, determining which of the detected cavities are “ligandable” remains an unsolved challenge. In this study, we trained a set of machine learning classifiers to distinguish ligandable RNA cavities from decoy cavities. Application of our classifiers to two independent test sets demonstrated that we could recover ligandable cavities from decoys with an AUC > 0.83. Interestingly, when we applied our classifiers to a library of modeled structures of the HIV-1 transactivation response (TAR) element RNA, we found that several of the conformers that harbored cavities with high ligandability scores resembled known holo-TAR structures. On the basis of our results, we envision that our classifiers could find utility as a tool to parse RNA structures and prospectively mine for ligandable binding cavities and, in so doing, facilitate structure-based virtual screening efforts against RNA drug targets.

Keywords: RNA, small molecule, structure analysis, ligandability, atomic fingerprinting, machine learning

The discovery of bioactive small molecules that target noncoding RNAs has resulted in renewed interest in exploring them as drug targets.¹⁻⁶ Structure-based virtual screening (SBVS) can be used to identify RNA-targeting small molecules that are likely to bind to an RNA with high affinity.⁷⁻⁹ A critical first step in SBVS is to map¹⁰⁻¹⁴ and then score the “ligandability” of individual cavities on a target. For proteins, the ligandability of a binding cavity is typically quantified as its propensity for ligand binding^15,16 or its maximal achievable ligand binding affinity, as estimated either from structural analysis¹⁷ or molecular dynamics simulations.¹⁸ Analogous methods for estimating the ligandability of cavities in RNA, however, have yet to be reported.

Here, we employed a machine learning approach to develop RNA ligandability predictors. Specifically, we cast the problem of predicting the ligandability of a binding cavity in an RNA as a classification problem¹⁶ in which we attempt to predict the likelihood that a given cavity would accommodate a small-molecule ligand. To accomplish this, we first compiled a data set of experimentally determined RNA–ligand complex structures. For each RNA in the data set, we used cavity mapping to detect cavities within their structure, defining the ligandable cavity as the one that accommodates the ligand and all else as decoys (Figure 1). We then used a distance-based cavity fingerprinting method that simultaneously encodes the physicochemical composition and geometry of the cavity to first map the binary cavity labels (i.e., ligandable vs decoy) to cavity fingerprints (FPs) and then train classifiers to distinguish the ligandable cavities from decoys based on their FPs.

Figure 1. In this study, we defined a binding cavity found in known RNA–ligand complex structures as (a) ligandable if the center of geometry of the cavity’s bounding box is within 6.0 Å of the center of geometry of the known ligand and (b) a decoy if it is more than 6.0 Å away. Shown in the figure are the centers of geometry of the ligandable (top; green ball) and decoy (bottom; red balls) cavities of an RNA. The distances between the pseudoatom placed at the cavity center and nearby RNA atoms (i.e., atoms in the pocket residues) are used to define the fingerprint (FP) that we employ to describe individual cavities.
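The labeling rule described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function names and the toy coordinates are hypothetical, and only the 6.0 Å cutoff and the use of centers of geometry come from the text.

```python
import math

def center_of_geometry(points):
    """Unweighted mean position of a list of (x, y, z) coordinates."""
    n = len(points)
    return tuple(sum(p[k] for p in points) / n for k in range(3))

def label_cavities(cavity_centers, ligand_atoms, cutoff=6.0):
    """Label each cavity center as 'ligandable' or 'decoy'.

    A cavity is ligandable if its center lies within `cutoff` (in Å) of the
    ligand's center of geometry; every other cavity is treated as a decoy.
    """
    ligand_center = center_of_geometry(ligand_atoms)
    labels = []
    for center in cavity_centers:
        distance = math.dist(center, ligand_center)
        labels.append("ligandable" if distance <= cutoff else "decoy")
    return labels

# Toy example (hypothetical coordinates, in Å).
ligand = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)]      # center of geometry: (1, 0, 0)
cavities = [(1.0, 3.0, 0.0), (10.0, 0.0, 0.0)]   # 3 Å and 9 Å from the ligand
labels = label_cavities(cavities, ligand)
```

Under this rule, the decoy label only means that the cavity was not occupied by the ligand in that particular structure.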

The data we used to train the classifiers was composed of a set of 88 RNA–ligand complexes for which high-resolution crystal structures are available in the Protein Data Bank. The RNAs in the training data set contained pseudoknotted and stem-loop motifs (see Supporting Information). The distance-based fingerprinting technique we used to encode an RNA cavity is a function of the atomic distances between a pseudoatom placed at the center of the cavity and atoms in nearby RNA residues. Specifically, for a given cavity on an RNA, we define its binding cavity FP as the FP of a pseudoatom, i, placed at the center of the box bounding the cavity (Figure 1). For this pseudoatom, the atomic FP is a vector, {V_i}, whose elements V_i(η) are given by

[Equation 1: definition of V_i(η); equation image not recovered]

where r_ij is the distance between the pseudoatom i and the heavy atom j in the RNA, η is a tunable width parameter, ν is a set of unique RNA atom types, and f_d(r_ij) is the damping function given by

[Equation 2: definition of the damping function f_d(r_ij); equation image not recovered]

The cavity FP has two tunable hyperparameters: η, the width parameter, and R_c, the distance cutoff beyond which contributions to the FP are ignored.
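To make the structure of such a fingerprint concrete, the following Python sketch builds a per-atom-type vector for a pseudoatom. The functional forms are assumptions, since the equations themselves are not reproduced in this text: a Gaussian radial term exp(−r²/η²) and a tanh³(1 − r/R_c) damping function are borrowed from common atom-centered fingerprints and may differ from the authors' definitions. The function names and the toy atoms are likewise hypothetical.

```python
import math

def damping(r, r_c):
    """Assumed smooth cutoff: tanh^3(1 - r/R_c) inside R_c, zero beyond it."""
    if r > r_c:
        return 0.0
    return math.tanh(1.0 - r / r_c) ** 3

def cavity_fingerprint(pseudoatom, atoms, atom_types, eta=4.0, r_c=20.0):
    """Return one fingerprint element per atom type.

    pseudoatom : (x, y, z) position at the center of the cavity's bounding box
    atoms      : iterable of (atom_type, (x, y, z)) for RNA heavy atoms
    atom_types : ordered list of the atom types that define the vector
    """
    fp = {atom_type: 0.0 for atom_type in atom_types}
    for atom_type, xyz in atoms:
        if atom_type not in fp:
            continue  # skip atom types outside the chosen vocabulary
        r = math.dist(pseudoatom, xyz)
        # Assumed radial term: Gaussian in r with width eta, damped near R_c.
        fp[atom_type] += math.exp(-(r / eta) ** 2) * damping(r, r_c)
    return [fp[atom_type] for atom_type in atom_types]

# Toy cavity (hypothetical atom types and coordinates, in Å).
atoms = [("N1", (3.0, 0.0, 0.0)), ("O2'", (0.0, 5.0, 0.0)), ("N1", (0.0, 0.0, 25.0))]
fp = cavity_fingerprint((0.0, 0.0, 0.0), atoms, atom_types=["N1", "O2'", "P"])
```

In this sketch, the third atom lies beyond R_c and contributes nothing, and the "P" element is zero because no phosphorus atom is nearby. The vector therefore encodes both which atom types line the cavity and how far they are from its center.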

We first asked which combination of the FP hyperparameters (η and R_c) resulted in the largest separation between the distributions of the cavity FPs for ligandable and decoy cavities. To answer this question, we varied η and R_c and, for each combination, embedded the cavity FPs into a two-dimensional (2D) space. We then computed the Kullback–Leibler (KL) divergence between the distributions of the known ligandable and decoy cavities within the 2D space.¹⁹ The KL divergence grows as two probability distributions diverge, so we used it to measure the extent to which our cavity FPs were able to separate the known ligandable cavities from decoys. For a given R_c, we observed that the KL divergence between the distributions of the known ligandable and decoy cavities in the 2D space increased when η increased from 2.0 to 4.0 Å (Figure 2). The largest KL divergence (3.86) was observed when η and R_c were set to 4.0 and 20.0 Å, respectively (Figure 2b).
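As an illustration of how a KL divergence between two sets of 2D-embedded points might be estimated, the sketch below bins both point sets on a shared grid and sums p·ln(p/q) over the cells. The text does not state how the densities were estimated, so the histogram estimator, the pseudocount, the grid size, and the function names are all assumptions made for this example.

```python
import math
from collections import Counter

def histogram_2d(points, lo, hi, bins):
    """Count points per cell of a bins x bins grid spanning [lo, hi] on both axes."""
    width = (hi - lo) / bins
    counts = Counter()
    for x, y in points:
        # Clamp indices so points on or outside the boundary land in an edge cell.
        ix = min(max(int((x - lo) / width), 0), bins - 1)
        iy = min(max(int((y - lo) / width), 0), bins - 1)
        counts[(ix, iy)] += 1
    return counts

def kl_divergence_2d(p_points, q_points, lo, hi, bins=10, pseudocount=1e-6):
    """Histogram estimate of KL(P || Q) for two sets of 2D points."""
    p_counts = histogram_2d(p_points, lo, hi, bins)
    q_counts = histogram_2d(q_points, lo, hi, bins)
    n_cells = bins * bins
    p_total = len(p_points) + pseudocount * n_cells
    q_total = len(q_points) + pseudocount * n_cells
    kl = 0.0
    for ix in range(bins):
        for iy in range(bins):
            # The pseudocount keeps empty cells from producing log(0).
            p = (p_counts[(ix, iy)] + pseudocount) / p_total
            q = (q_counts[(ix, iy)] + pseudocount) / q_total
            kl += p * math.log(p / q)
    return kl

# Toy embeddings (hypothetical 2D coordinates).
ligandable = [(1.0, 1.2), (1.5, 0.8), (0.9, 1.1)]
decoys = [(8.0, 8.5), (7.5, 9.0), (8.8, 7.9)]
separated = kl_divergence_2d(ligandable, decoys, lo=0.0, hi=10.0)
identical = kl_divergence_2d(ligandable, ligandable, lo=0.0, hi=10.0)
```

Here `separated` is large and `identical` is zero, which matches the way the divergence is used in the text: larger values indicate that the embedding separates the two classes more cleanly. The absolute value depends on the binning and the pseudocount, so numbers from this sketch are not comparable to the 3.86 reported above.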

Figure 2. 2D scatter plots of the cavity FP space within the data set used to train the classifiers in this study. These 2D plots were generated by applying t-distributed stochastic neighbor embedding (t-SNE) to the cavity FPs in the data set. Shown are plots with the FP hyperparameter R_c set to 5.0, 10.0, 15.0, and 20.0 Å (left to right) and the hyperparameter η set to (a) 2.0 and (b) 4.0 Å, respectively (eq 1). Points are colored based on whether the associated cavity is ligandable (light green) or a decoy (red).

Using the optimal combination of η and R_c (Figure 2b), we next trained classifiers to discriminate ligandable cavities from decoys based on their cavity FPs. Here, the objective was to use the training data to learn a function that transforms the cavity FPs into ligandability scores. We trained five independent classifiers, namely, an extreme gradient boosting (XGB), a random forest (RF), a multilayer perceptron (MLP), a logistic regression (LR), and an extra-randomized trees (ERT) classifier.

To quantify our ability to discriminate ligandable cavities from decoys, we used the area under the receiver operating characteristic (ROC) curve (AUC); AUC values approaching 1 correspond to near-perfect classification. For this analysis, the classifiers were deemed successful if the cavity with the highest ligandability score was within 6.0 Å of the native cavity. Shown in Figure 3 are the ROC curves we obtained when we applied our ligandability classifiers to two distinct test sets: a set of 23 RNA–ligand systems for which X-ray crystal structures were available (test set 1; Figure 3a) and a set of 20 RNA–ligand systems for which NMR structures were available (test set 2; Figure 3b). The AUC values of the five ligandability classifiers ranged between 0.89 and 0.93 on test set 1 (Figure 3a) and between 0.81 and 0.89 on test set 2 (Figure 3b). In addition to the five classifiers, we also made predictions using a consensus classifier, whose score is the mean of the predictions of all five classifiers. For both test sets, the consensus classifier exhibited the highest AUC values (0.94 and 0.90, respectively). We obtained almost identical results when cutoffs of 2.0 or 4.0 Å, instead of 6.0 Å, were used to determine whether the classifiers were successful (Figure S2).
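The consensus score and the AUC can be illustrated with a short, dependency-free Python sketch. The consensus is the mean of the individual classifier scores, as stated above; the AUC is computed here with the standard pairwise (rank-based) formulation. The function names and the toy scores are hypothetical, and only two stand-in "classifiers" are shown rather than five.

```python
def consensus_scores(per_classifier_scores):
    """Mean score per cavity across classifiers.

    per_classifier_scores : list of score lists, one list per classifier,
                            all ordered over the same cavities.
    """
    n_classifiers = len(per_classifier_scores)
    return [sum(column) / n_classifiers for column in zip(*per_classifier_scores)]

def roc_auc(labels, scores):
    """AUC as the probability that a positive outscores a negative (ties count 0.5)."""
    positives = [s for label, s in zip(labels, scores) if label == 1]
    negatives = [s for label, s in zip(labels, scores) if label == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))

# Toy example: one ligandable cavity (label 1) and three decoys (label 0).
labels = [1, 0, 0, 0]
classifier_a = [0.9, 0.2, 0.4, 0.1]
classifier_b = [0.7, 0.4, 0.8, 0.3]   # ranks a decoy above the ligandable cavity
consensus = consensus_scores([classifier_a, classifier_b])
auc_b = roc_auc(labels, classifier_b)
auc_consensus = roc_auc(labels, consensus)
```

In this toy case the second classifier alone misranks one decoy, whereas the averaged score places the ligandable cavity first, which shows how averaging can smooth over the errors of individual models.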

Figure 3. ROC curves for recovering ligandable cavities in (a) test set 1 (X-ray structures) and (b) test set 2 (NMR structures). Shown are examples of cases in (c) the X-ray and (d) the NMR test sets where the known ligandable cavity was not ranked in the top three of all cavities. With the exception of 3SLM, the pseudoatom used for fingerprinting the cavity (light green spheres and dots) had a large solvent-accessible surface area (SASA). For comparison, the classifiers were also applied to pseudoatoms placed at the position of each heavy atom in the ligand. The ligand-pseudoatom that exhibited the highest ligandability score is shown in red spheres and dots. In all six cases, the ligandability score was higher for this ligand-pseudoatom than for the pseudoatom placed at the center of the native binding cavity.

We next examined the predictions for individual RNAs in the test sets. In particular, we determined the ranking of the known ligandable cavity relative to the decoys based on their consensus ligandability scores. The rankings of the known ligandable cavities ranged between 1 and 6, with means of 1.8 and 2.1 for test sets 1 and 2, respectively (Tables S2, S3). By comparison, the average numbers of cavities in each RNA system within the X-ray and NMR test sets were 22.5 and 14.5, respectively. As such, based on the consensus ligandability score, known ligandable cavities were typically among the top-ranked cavities. In both the X-ray and NMR test sets, there were three instances where the native ligandable cavity was not within the top three places among all cavities (Figure 3c,d; Tables S2, S3). We found that, with the exception of 3SLM, the pseudoatoms used to fingerprint these cavities tended to be more solvent exposed than the pseudoatoms in the training set. Interestingly, for these examples, if pseudoatoms were placed at the same positions as the heavy atoms in the ligand that occupied these cavities instead of at the center of the cavity, then the cavities tended to be highly ranked, indicating that the lower ranking of these ligandable cavities may be due in part to the positioning of the pseudoatoms used for fingerprinting the native cavity (Figure 3c,d).
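The per-RNA ranking analysis can be sketched as follows. This is an illustrative helper rather than the authors' code: the function names and the toy scores are hypothetical, and ties are resolved in favor of the known cavity, which is an assumption.

```python
def rank_of_known_cavity(scores, known_index):
    """1-based rank of the known ligandable cavity; higher scores rank first."""
    known_score = scores[known_index]
    return 1 + sum(1 for s in scores if s > known_score)

def summarize_ranks(ranks, top_n=3):
    """Mean rank and number of systems whose known cavity falls outside the top_n."""
    mean_rank = sum(ranks) / len(ranks)
    n_outside = sum(1 for r in ranks if r > top_n)
    return mean_rank, n_outside

# Toy example: consensus scores for the cavities of three hypothetical RNAs,
# each paired with the index of its known ligandable cavity.
systems = [
    ([0.91, 0.40, 0.12, 0.33], 0),
    ([0.20, 0.75, 0.80, 0.10], 1),
    ([0.60, 0.70, 0.65, 0.90, 0.30], 4),
]
ranks = [rank_of_known_cavity(scores, idx) for scores, idx in systems]
mean_rank, n_outside_top3 = summarize_ranks(ranks)
```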

The structured, 27-nt HIV-1 transactivation response (TAR) element RNA regulates the transcription and translation of HIV-1. Because of its small size and ability to control the transcription and translation of HIV-1, TAR has emerged as an excellent model system to elucidate links between RNA function and dynamics²⁰ and to develop strategies for the design and discovery of RNA-targeting small molecules.^8,21,22 TAR is a highly dynamic RNA that appears to bind its ligands via conformational selection.^23,24 The high flexibility of RNAs like TAR poses a significant challenge to acquiring the structural data needed to target them using SBVS. While structure prediction methods can be used to generate 3D models of RNAs directly from their sequence, currently lacking are in silico methods for mining these structures for bound-like conformers that harbor ligandable cavities. To determine if we could use our classifiers to mine for bound-like conformers within a library of modeled structures, we used the modeling program SimRNA to generate 3D models of HIV-1 TAR. After clustering the set of 64 000 generated structures, we mapped the cavities on each structure and then used our consensus classifier to score each detected cavity (Supporting Information). We then selected the 20 SimRNA structures that harbored the highest-scoring binding cavities.

The 20 SimRNA structures that harbored the cavities with the highest ligandability scores cluster within the cavity FP-space (Figure 4a). Interestingly, they also colocalize with the cavities detected in several holo-TAR structures. Moreover, the high-scoring cavities in these conformers are frequently located near the bulge region of TAR and feature irregular structural motifs that often involve long-range, bulge-apical loop contacts (Figures S4, S5). In Figure 4b, we compare these 20 SimRNA structures with 12 experimental holo-TAR structures. Across the set of 12 holo-TAR structures, the heavy-atom RMSD relative to the closest SimRNA structure ranged between 3.58 and 4.84 Å (Figure 4b). This result indicates that the collection of 20 SimRNA structures included several bound-like TAR conformers (Figures 4b, 5).

Figure 4. Ligandability analysis of 1445 low-energy SimRNA structures of the HIV-1 TAR RNA. (a) Shown are the locations of cavities detected in all of the SimRNA structures (red), those in the 20 SimRNA structures that harbored the cavities with the highest ligandability scores (light green), and those detected in 12 experimental holo-TAR structures (cyan). The cavity locations are shown within a t-SNE-generated 2D cavity FP map. (b) Shown is the pairwise RMSD comparison between the 20 highest-scoring SimRNA structures and the 12 holo-TAR structures.

Figure 5. SimRNA-TAR structures versus holo-TAR structures. Shown are comparisons between SimRNA-TAR structures that harbor ligandable binding cavities and 12 experimentally determined holo-TAR structures. Indicated under each comparison are the ligandability score (Score) and the heavy-atom root-mean-square deviation (RMSD) between the SimRNA and the experimental structure.

To test whether the 20 SimRNA structures that harbored the cavities with the highest ligandability scores could be of utility in virtual screening, we carried out ensemble docking against a library of 102 326 small-molecule ligands. The library contained 26 known hits against HIV-1 TAR.⁹ When we sorted the library based on the Boltzmann-weighted, ensemble-averaged docking scores, the AUC for the recovery of the known hits was 0.84, and the 1% and 2% enrichment factors (EFs) were 23.1 and 21.1, respectively (Tables 1, S4). When we used the docking scores against the individual SimRNA conformers, the highest AUC was 0.86, and the highest 1% EF and 2% EF were 26.9 (SimRNA 4, 5, and 8; Table S4) and 17.3 (SimRNA 8; Table S4), respectively. Based on the AUC, these results are similar to those obtained when docking against NMR-optimized docking ensembles.⁹ The NMR-optimized ensembles did, however, exhibit considerably larger enrichment factors.

Table 1. Ensemble-Docking Results against the 20 SimRNA TAR Conformers That Harbored the Highest Scoring Binding Cavities^a.

ROC AUC	1% EF	2% EF
0.84	23.1	21.1


^a The ROC AUC and 1% and 2% EFs when using the ensemble-averaged docking scores to sort compounds in the TAR screening library. The virtual screening results for individual SimRNA conformers are listed in Table S4.
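The enrichment factors in Table 1 can be related to hit counts with a short calculation. The sketch below assumes the common definition EF = (hits in top x% / compounds in top x%) / (total hits / library size), which is not spelled out in the text. The hit counts of 6 and 11 in the top 1% and 2% are back-calculated from the reported EFs and are an inference for illustration, not values stated in the source. The function name and the synthetic ranking are hypothetical.

```python
def enrichment_factor(ranked_is_hit, fraction):
    """EF at a given fraction of a ranked library (best-scoring compound first)."""
    n_total = len(ranked_is_hit)
    n_top = max(1, round(fraction * n_total))
    hits_total = sum(ranked_is_hit)
    hits_top = sum(ranked_is_hit[:n_top])
    return (hits_top / n_top) / (hits_total / n_total)

# Synthetic ranking with the library size and hit count given in the text.
N_LIBRARY, N_HITS = 102326, 26
ranking = (
    [True] * 6 + [False] * 1017      # top 1% (1023 compounds): assumed 6 hits
    + [True] * 5 + [False] * 1019    # rest of top 2% (2047 compounds): 5 more hits
    + [True] * 15                    # remaining hits, placed arbitrarily
)
ranking += [False] * (N_LIBRARY - len(ranking))

ef_1 = enrichment_factor(ranking, 0.01)
ef_2 = enrichment_factor(ranking, 0.02)
```

With these assumed counts, `ef_1` and `ef_2` round to 23.1 and 21.1, matching Table 1. Because only 26 hits are present, the theoretical maximum EF at 1% is 100 in this setting, and each additional hit in the top 1% shifts the EF by roughly 3.8.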

To assess the value added by docking against the 20 SimRNA conformers harboring the cavities with the highest ligandability scores, we compared the docking scores of the 26 TAR hits docked to these 20 SimRNA conformers with the docking scores of the hits docked to the 20 conformers harboring the cavities with the lowest ligandability scores. In general, the top 20 TAR conformers yielded lower docking scores, indicating that these conformers are better receptors for the hit compounds (Figure S6). Taken together, the results of our virtual screening assessment indicate that identifying and then docking against structures that harbor ligandable binding cavities could be a viable ensemble-based virtual screening strategy. This strategy could be particularly useful when the experimental data needed to construct optimized ensembles are unavailable.

In this study, we sought to develop a strategy to estimate the ligandability of binding cavities found in RNA structures. Using a cavity fingerprinting technique in which the physicochemical composition and geometry of individual cavities are projected onto a pseudoatom placed at their centers, we trained a set of machine learning classifiers to discriminate known ligandable cavities from decoys. Our consensus classifier (which is an ensemble of five independent classifiers) exhibited high AUCs on two independent test sets (0.94 and 0.90), indicating that we could accurately estimate whether a given cavity is likely to accommodate ligands (Figure 3). Furthermore, using the ligandability scores from our classifiers, we mined within a library of modeled structures of the HIV-1 TAR RNA for conformers that harbor cavities with high ligandability scores. We discovered that several of these high-scoring conformers also resembled known holo-TAR structures (Figures 4 and 5).

A limitation of our approach is the definition of ligandability we used to train our classifiers. In this study, any binding cavity that is known to accommodate a small-molecule ligand is considered ligandable. A more rigorous definition would include some consideration of the thermodynamic and kinetic parameters of small-molecule ligands that are likely to bind to a specific cavity. For example, with knowledge of the maximal binding affinities of a specific cavity,^17,18 machine learning models could be trained to not only predict whether a ligand is likely to bind to a cavity, but also to provide an estimate of the maximal ligand affinity for the cavity. The challenge of training models that adopt a more rigorous definition of ligandability is the lack of existing training data. One strategy that we are now exploring to address this limitation is to use cosolvent simulations of RNA to generate thermodynamic and kinetic profiles of individual binding cavities and then train multitask machine learning models to simultaneously predict whether a cavity is likely to bind to small-molecule ligands and the likely thermodynamic and kinetic parameters of such ligands.

Another limitation is that we defined unoccupied cavities in a given RNA as decoy cavities. In actuality, a fraction of such ‘decoy cavities’ might be ligandable by a molecule other than the molecule present in a particular RNA–ligand complex. One could begin to address this limitation by filtering out cavities defined as decoys in one structure if they resemble a ligand-occupied cavity in another structure. We note that the training of classifiers using occupancy data derived from the cosolvent simulations we alluded to above would effectively alleviate this limitation.

We further note that ligandability does not equate to druggability, which one can define as the likelihood of a given cavity to accommodate high-affinity, drug-like molecules. To develop druggability predictors, one would need access to an extensive database of structures of RNAs bound to drug-like molecules. Currently, there are only a handful of examples of RNA bound to truly drug-like molecules. Though we expect that a subset of the cavities our ligandability classifiers predict to be ligandable would also be druggable, we cannot make any guarantees regarding the extent of the overlap between the cavities that are predicted by our classifiers to be ligandable and those that are genuinely druggable. This limitation notwithstanding, we envision that as the database of structures of RNA bound to drug-like molecules continues to expand, the framework we now have in place could be leveraged to train druggability predictors.

Despite there being room for improvements, the results we obtained using our first-order definition of ligandability, and the current data set available to us, suggest that our classifiers can be combined with existing cavity mapping approaches to identify RNA cavities that are likely to bind to a small-molecule ligand. Though the cavity mapping tool’s ability to detect cavities on the surface of an RNA will inherently limit the ability to recover true ligandable cavities using our classifiers, the results presented above suggest that if ligandable cavities are included among all cavities detected by the cavity mapping tool, our classifiers will tend to identify them as high-scoring cavities. We envision several immediate applications of our ligandability classifiers. First, our classifiers can be used to corroborate RNA–ligand site prediction results.^25,26 Second, as done in this study, our classifiers can be used to mine for RNA structures with ligandable cavities within conformations generated using RNA conformational sampling tools or within collections of experimental RNA structures. Third, our classifiers can help prioritize cavities within large and complex RNAs such as riboswitches and ribozymes, for which cavity mapping tools tend to detect many possible cavities to choose from. Fourth, our classifiers can be used to analyze the spatiotemporal properties of cavity formation along molecular dynamics trajectories of RNA, which can facilitate characterization of transient or cryptic binding cavities. Fifth, our classifiers can also be integrated with conformational sampling tools to refine RNA structures on-the-fly.²⁷ Such an approach could help generate energetically favorable structures that also harbor ligandable binding cavities. 
Finally, though beyond the scope of this current study, our results suggest that the fingerprints we used for classifying binding cavities might also be useful for comparing the similarity of binding cavities across multiple RNA targets, which might have applications in anticipating off-target effects.

Acknowledgments

The authors thank Prof. Hashim Al-Hashimi for making available the TAR screening data used in the virtual screening analysis.

Glossary

Abbreviations

AUC: area under the curve
EF: enrichment factor
ERT: extra-randomized trees
FP: fingerprint
KL: Kullback–Leibler
LR: logistic regression
MLP: multilayer perceptron
RF: random forest
ROC: receiver operating characteristic
SBVS: structure-based virtual screening
TAR: transactivation response
XGB: extreme gradient boosting

Supporting Information Available

The Supporting Information is available free of charge at https://pubs.acs.org/doi/10.1021/acsmedchemlett.1c00068.

Information on the training and testing sets, importance analysis, and virtual screening against the HIV-1 TAR RNA (PDF)

Author Contributions

A.T.F. conceived project. J.X. and A.T.F. conducted the computational experiments. J.X. and A.T.F. analyzed the results and wrote the manuscript. All authors reviewed the manuscript.

We created a tool called RNACavityMiner that exposes the classifiers we described in this study. A standalone version of RNACavityMiner is available at https://github.com/atfrank/RNACavityMiner.git. A Web server version of RNACavityMiner can be accessed through the SMALTR Science Gateway (smaltr.org, accessed 2021-5-24). The data and code used to train the RNACavityMiner classifiers are available at 10.5281/zenodo.4049068 (accessed 2021-5-24).

The authors declare no competing financial interest.


References

1. Howe J. A.; Wang H.; Fischmann T. O.; Balibar C. J.; Xiao L.; Galgoci A. M.; Malinverni J. C.; Mayhood T.; Villafania A.; Nahvi A.; Murgolo N.; Barbieri C. M.; Mann P. A.; Carr D.; Xia E.; Zuck P.; Riley D.; Painter R. E.; Walker S. S.; Sherborne B.; de Jesus R.; Pan W.; Plotkin M. A.; Wu J.; Rindgen D.; Cummings J.; Garlisi C. G.; Zhang R.; Sheth P. R.; Gill C. J.; Tang H.; Roemer T. Selective small-molecule inhibition of an RNA structural element. Nature 2015, 526, 672–677. doi: 10.1038/nature15542.
2. Velagapudi S. P.; Vummidi B. R.; Disney M. D. Small molecule chemical probes of microRNA function. Curr. Opin. Chem. Biol. 2015, 24, 97–103. doi: 10.1016/j.cbpa.2014.10.024.
3. Palacino J.; Swalley S. E.; Song C.; Cheung A. K.; Shu L.; Zhang X.; Van Hoosear M.; Shin Y.; Chin D. N.; Keller C. G.; Beibel M.; Renaud N. A.; Smith T. M.; Salcius M.; Shi X.; Hild M.; Servais R.; Jain M.; Deng L.; Bullock C.; McLellan M.; Schuierer S.; Murphy L.; Blommers M. J. J.; Blaustein C.; Berenshteyn F.; Lacoste A.; Thomas J. R.; Roma G.; Michaud G. A.; Tseng B. S.; Porter J. A.; Myer V. E.; Tallarico J. A.; Hamann L. G.; Curtis D.; Fishman M. C.; Dietrich W. F.; Dales N. A.; Sivasankaran R. SMN2 splice modulators enhance U1-pre-mRNA association and rescue SMA mice. Nat. Chem. Biol. 2015, 11, 511. doi: 10.1038/nchembio.1837.
4. Cheung A. K.; Hurley B.; Kerrigan R.; Shu L.; Chin D. N.; Shen Y.; O’Brien G.; Sung M. J.; Hou Y.; Axford J.; Cody E.; Sun R.; Fazal A.; Fridrich C.; Sanchez C. C.; Tomlinson R. C.; Jain M.; Deng L.; Hoffmaster K.; Song C.; van Hoosear M.; Shin Y.; Servais R.; Towler C.; Hild M.; Curtis D.; Dietrich W. F.; Hamann L. G.; Briner K.; Chen K. S.; Kobayashi D.; Sivasankaran R.; Dales N. A. J. Med. Chem. 2018, 61, 11021. doi: 10.1021/acs.jmedchem.8b01291.
5. Fedorova O.; Jagdmann G. E.; Adams R. L.; Yuan L.; van Zandt M. C.; Pyle A. M. Small molecules that target group II introns are potent antifungal agents. Nat. Chem. Biol. 2018, 14, 1073. doi: 10.1038/s41589-018-0142-0.
6. Disney M. D. RNA with small molecules to capture opportunities at the intersection of chemistry, biology, and medicine. J. Am. Chem. Soc. 2019, 141, 6776–6790. doi: 10.1021/jacs.8b13419.
7. Daldrop P.; Reyes F. E.; Robinson D. A.; Hammond C. M.; Lilley D. M.; Batey R. T.; Brenk R. Novel ligands for a purine riboswitch discovered by RNA–ligand docking. Chem. Biol. 2011, 18, 324–335. doi: 10.1016/j.chembiol.2010.12.020.
8. Stelzer A. C.; Frank A. T.; Kratz J. D.; Swanson M. D.; Gonzalez-Hernandez M. J.; Lee J.; Andricioaei I.; Markovitz D. M.; Al-Hashimi H. M. Discovery of selective bioactive small molecules by targeting an RNA dynamic ensemble. Nat. Chem. Biol. 2011, 7, 553. doi: 10.1038/nchembio.596.
9. Ganser L. R.; Lee J.; Rangadurai A.; Merriman D. K.; Kelly M. L.; Kansal A. D.; Sathyamoorthy B.; Al-Hashimi H. M. High-performance virtual screening by targeting a high-resolution RNA dynamic ensemble. Nat. Struct. Mol. Biol. 2018, 25, 425–434. doi: 10.1038/s41594-018-0062-4.
10. Morley S. D.; Afshar M. Validation of an empirical RNA–ligand scoring function for fast flexible docking using RiboDock®. J. Comput.-Aided Mol. Des. 2004, 18, 189–208. doi: 10.1023/B:JCAM.0000035199.48747.1e.
11. Laurie A. T.; Jackson R. M. Q-SiteFinder: an energy-based method for the prediction of protein-ligand binding sites. Bioinformatics 2005, 21, 1908–1916. doi: 10.1093/bioinformatics/bti315.
12. Huang B.; Schroeder M. LIGSITE csc: predicting ligand binding sites using the Connolly surface and degree of conservation. BMC Struct. Biol. 2006, 6, 19. doi: 10.1186/1472-6807-6-19.
13. Dundas J.; Ouyang Z.; Tseng J.; Binkowski A.; Turpaz Y.; Liang J. CASTp: computed atlas of surface topography of proteins with structural and topographical mapping of functionally annotated residues. Nucleic Acids Res. 2006, 34, W116–W118. doi: 10.1093/nar/gkl282.
14. le Guilloux V.; Schmidtke P.; Tuffery P. Fpocket: an open source platform for ligand pocket detection. BMC Bioinf. 2009, 10, 168. doi: 10.1186/1471-2105-10-168.
15. Soga S.; Shirai H.; Kobori M.; Hirayama N. Use of amino acid composition to predict ligand-binding sites. J. Chem. Inf. Model. 2007, 47, 400–406. doi: 10.1021/ci6002202.
16. Halgren T. New method for fast and accurate binding-site identification and analysis. Chem. Biol. Drug Des. 2007, 69, 146–148. doi: 10.1111/j.1747-0285.2007.00483.x.
17. Cheng A. C.; Coleman R. G.; Smyth K. T.; Cao Q.; Soulard P.; Caffrey D. R.; Salzberg A. C.; Huang E. S. Structure-based maximal affinity model predicts small-molecule druggability. Nat. Biotechnol. 2007, 25, 71–75. doi: 10.1038/nbt1273.
18. Seco J.; Luque F. J.; Barril X. Binding site detection and druggability index from first principles. J. Med. Chem. 2009, 52, 2363–2371. doi: 10.1021/jm801385d.
19. Kullback S.; Leibler R. A. On information and sufficiency. Ann. Math. Stat. 1951, 22, 79–86. doi: 10.1214/aoms/1177729694.
20. Ganser L. R.; Chu C.-C.; Bogerd H. P.; Kelly M. L.; Cullen B. R.; Al-Hashimi H. M. conformational equilibria within the functional cellular context. Cell Rep. 2020, 30, 2472–2480. doi: 10.1016/j.celrep.2020.02.004.
21. Borkar A. N.; Bardaro M. F.; Camilloni C.; Aprile F. A.; Varani G.; Vendruscolo M. Structure of a low-population binding intermediate in protein-RNA recognition. Proc. Natl. Acad. Sci. U. S. A. 2016, 113, 7171–7176. doi: 10.1073/pnas.1521349113.
22. Ganser L. R.; Kelly M. L.; Herschlag D.; Al-Hashimi H. M. The roles of structural dynamics in the cellular functions of RNAs. Nat. Rev. Mol. Cell Biol. 2019, 20, 474–489. doi: 10.1038/s41580-019-0136-0.
23. Frank A. T.; Stelzer A. C.; Al-Hashimi H. M.; Andricioaei I. Constructing RNA dynamical ensembles by combining MD and motionally decoupled NMR RDCs: new insights into RNA dynamics and adaptive ligand recognition. Nucleic Acids Res. 2009, 37, 3670–3679. doi: 10.1093/nar/gkp156.
24. Lu J.; Kadakkuzha B. M.; Zhao L.; Fan M.; Qi X.; Xia T. Dynamic ensemble view of the conformational landscape of HIV-1 TAR RNA and allosteric recognition. Biochemistry 2011, 50, 5042–5057. doi: 10.1021/bi200495d.
25. Zeng P.; Li J.; Ma W.; Cui Q. Rsite: a computational method to identify the functional sites of noncoding RNAs. Sci. Rep. 2015, 5, 9179. doi: 10.1038/srep09179.
26. Wang K.; Jian Y.; Wang H.; Zeng C.; Zhao Y. RBind: computational network method to predict RNA binding sites. Bioinformatics 2018, 34, 3131–3136. doi: 10.1093/bioinformatics/bty345.
27. Guterres H.; Lee H. S.; Im W. Ligand-binding-site structure refinement using molecular dynamics with restraints derived from predicted binding site templates. J. Chem. Theory Comput. 2019, 15, 6524–6535. doi: 10.1021/acs.jctc.9b00751.

Associated Data

This section collects any data citations, data availability statements, or supplementary materials included in this article.

Supplementary Materials

ml1c00068_si_001.pdf^{(10.6MB, pdf)}
