Abstract
RNA recognition frequently results in conformational changes that optimize intermolecular binding. As a consequence, the overall binding affinity of RNA to its binding partners depends not only on the intermolecular interactions formed in the bound state but also on the energy cost associated with changing the RNA conformational distribution. Measuring these “conformational penalties” is, however, challenging because bound RNA conformations tend to have equilibrium populations in the absence of the binding partner that fall outside detection by conventional biophysical methods. In this study we employ as a model system HIV-1 TAR RNA and its interaction with the ligand argininamide (ARG), a mimic of TAR’s cognate protein binding partner, the transactivator Tat. We use NMR chemical shift perturbations and relaxation dispersion in combination with Bayesian inference to develop a detailed thermodynamic model of coupled conformational change and ligand binding. Starting from a comprehensive 12-state model of the equilibrium, we estimate the energies of six distinct detectable thermodynamic states that are not accessible by currently available methods. Our approach identifies a minimum of four RNA intermediates that differ in terms of the TAR conformation and ARG occupancy. The dominant bound TAR conformation features two bound ARG ligands and has an equilibrium population in the absence of ARG that is below the detection limit. Consequently, even though ARG binds to TAR with an apparent overall weak affinity, it binds the prefolded conformation with a Kd in the nM range. Our results show that conformational penalties can be major determinants of RNA-ligand binding affinity as well as a source of binding cooperativity, with important implications for a predictive understanding of how RNA is recognized and for RNA-targeted drug discovery.
INTRODUCTION
The ability of non-coding RNAs to adopt a multitude of conformations underlies their cellular roles in gene expression and regulation.1–6 These dynamic transitions occur on time scales ranging from picoseconds to tens of seconds and can feature global changes in the orientation of helical elements and local base pair reshuffling that remodel both secondary and tertiary structure.5,7 Although such alternative conformational states can be short-lived and/or poorly populated in the isolated RNA, they can become long-lived and appreciably populated to play regulatory roles through interactions with binding partners.8 The high prevalence of conformational changes in RNA on molecular recognition9–11 implies that binding energies are highly dependent on RNA conformational penalties.12,13 However, the relative magnitudes of these energetic penalties are difficult to experimentally quantify due to challenges in detecting extremely low population species in the absence of ligand. Detection and characterization of such transient species is further complicated by highly cooperative interactions that can deceivingly appear two-state.14,15 Significant advances in nuclear magnetic resonance (NMR) spectroscopy16–20 have allowed characterization of protein and more recently RNA conformational dynamics occurring on the microsecond to millisecond time scale by relaxation dispersion (RD) techniques including R1ρ,21–23 CPMG,24,25 and CEST,8,26 but challenges remain with respect to detection limitations and complicated data analysis. Developing a predictive understanding regarding the role of RNA conformational thermodynamics is of critical importance27 for RNA-targeted drug discovery28–30 and bioengineering.31–33
Prior studies have successfully determined complex mechanisms by globally fitting data collected as a function of multiple independent variables in order to reshape the population distribution of relevant conformational states.34–38 We apply this strategy in our study of trans-activation response element (TAR) RNA from human immunodeficiency virus type-1 (HIV-1), which is one of the first paradigmatic examples of RNA adaptive recognition. Its conformation-dependent binding interaction with the viral protein Tat is required for full-length transcription elongation39–41 and has therefore been the subject of numerous studies aimed at understanding the principles behind dynamic ligand recognition. TAR is also a drug target for the development of anti-HIV therapeutics and it has been shown that the ligand-bound RNA conformation is an important determinant of the ligand’s inhibitory activity.42–44 Previous NMR-derived structures show that unliganded TAR adopts a mainly bent (average bend angle ~47°) and flexible inter-helical conformation in which bulge residue U23 stacks on A22, while spacer bulge residues C24 and U25 are looped out and flexible.45–48 The ligand argininamide (ARG) has been used extensively in TAR binding studies because it recapitulates the functional structural changes that TAR undergoes upon binding to the HIV-1 Tat arginine-rich motif (ARM).45,49–53 ARG binding arrests inter-helical motions and stabilizes a coaxial conformation in which bulge residue U23 forms a base triple with helical residues A27 and U38, and junctional residues A22 and U40 become stably hydrogen bonded.49,53
Several studies measuring TAR binding to Tat-like peptides or ARG-derived compounds have reported evidence for two binding sites,43,54–57 while others report single apparent affinities,44,50,58–61 and it remains unclear what role the other Tat arginine residues located in the arginine-rich RNA binding domain (RBD, residues G48-R57)62 play in vivo. Overall, the results in the literature hint at conformational heterogeneity present in the TAR·ARG bound state that has not been fully explored. Here, we show that bulge-dependent binding occurs at two sites on TAR and that the tightest affinity occurs on the base tripled TAR state. The apical loop of the TAR construct used for all experiments in this study is replaced with a UUCG loop to simplify the analysis and ensure that no dispersion is present in the unliganded state.63 This substitution has been shown to have minimal effects on ARG binding or ARG-dependent conformational equilibria.53,64 We globally fit NMR chemical shift perturbation (CSP) data to a six-state model and then test the predictions generated from our model using NMR RD in the presence of ligand.
RESULTS AND DISCUSSION
Qualitative Inspection of Chemical Shift Perturbations’ Ligand and Temperature Dependence Reveals Three Conformations and Two Binding Sites.
We carried out NMR 2D HSQC experiments as a function of both ligand concentration and temperature to quantitatively dissect the coupled conformational change and binding reactions that occur when the small-molecule ligand, ARG (Figure 1B), binds to TAR RNA (Figure 1A). Chemical shifts are exceptionally sensitive to local electronic environment, making them excellent reporters of conformational change and proximity to ligands, although distinguishing the two can be challenging. Close inspection of the raw CSP data (Figure 1C) enables us to build basic elements of a binding model on the basis of the following three criteria.
1. Nonlinear trajectories observed in the 2D HSQC spectra65 in Figure 1C under the fast-exchange limit are indicative of multi-site binding.66–68 The data are consistent with fast exchange because no significant line broadening is observed at any of the ligand concentrations and temperatures tested (Figures 1C and S2).
2. Temperature-dependent chemical shifts69–72 at low ligand concentration (see residue U23-C1′ in Figure 1C) are indicative of conformational equilibria between multiple unliganded RNA species.
3. Temperature-dependent chemical shifts at high ligand concentration (see residue G28-C8 in Figure 1C) are indicative of conformational equilibria between multiple ARG-bound RNA species.
In addition to the perturbations addressed by these observations, ARG-dependent perturbations are also observed across a wide variety of residues including helical regions outside the apparent binding sites, raising the immediate question of whether these distant perturbations are due to long-range conformational effects or whether they reflect weak interaction with three or more ARG molecules. Strong temperature but no ligand dependencies are observed in the chemical shifts of resonances in the UUCG loop even though bulge-dependent equilibria are expected to be independent of the apical loop for the TAR sequence used here.64 This observation suggests that more than two conformational equilibria may contribute to the observed chemical shifts. A model that accounts for all of these additional RNA species would be under-determined without a substantial amount of additional data, and the equations may not have analytical solutions (for ligand binding to ≥ two non-identical sites).73 To circumvent this issue and focus on ligand-dependent CSPs that probe ARG binding, we used a bulgeless TAR variant to help identify the resonances reporting only on bulge-dependent ARG binding and conformational change.
Bulge Deletion Identifies Chemical Shifts Not Perturbed by Two Primary Binding Sites.
Quantitative interpretation of the observed CSP temperature and ligand dependencies depends on the assumption that the underlying data reports on the explicit equilibria included in the binding model. Ideally, the binding model would include all equilibria but due to experimental and computational limitations, we are forced to build minimal complexity models. Because we have chosen to model the bulge-dependent binding and conformational equilibria presumed to be critical for functionally important TAR-Tat interactions in the cell,39,40,74,75 it is important for accurate modeling that we exclude CSPs from global fitting that are bulge-independent. To identify these resonances, we carried out NMR CSP mapping experiments on a construct lacking the UCU bulge (ΔbulgeTAR) as a function of temperature and ARG concentration. Specifically, comparison of the NMR spectra between the two constructs both in the absence (Figure S3A) and presence (Figure S3B) of ARG reveals that ARG-induced perturbations in the lower helix do not depend on the bulge. Importantly, when we compare the NMR spectra of TAR and ΔbulgeTAR in the presence of ARG, any peaks in the ARG-bound spectra that do not overlay between TAR and ΔbulgeTAR identify a specific bulge-dependent binding site or a binding-dependent conformational change that can only occur in the construct with the trinucleotide bulge. The aromatic resonances for A20 and G21 are nearly identical for both RNA variants under ARG-saturating conditions (Figure S3B), which indicates that the lower helix does not report on bulge-dependent binding. As expected, resonances near the bulge that are known to play a role in ARG recognition (A22, G26, A27, U38, C39, U40)49 have large chemical shift differences (Δδ = δΔbulge·ARG − δTAR·ARG) in the presence of ARG. More interesting, however, are resonances located farther away from the bulge that do not overlay well in the ARG-bound spectra, namely G28-C8 and C29-C6. 
This result provides support for a second bulge-dependent binding site in the upper helix above the bulge (Figure 2B and Figure S4), consistent with a recently published NMR study that showed the TAR·Tat RBD complex was stabilized by two arginine residues: R52 below the bulge and R49 above the bulge.76 Additionally, these results suggest that using a simplified small-molecule system to study RNA–protein binding can faithfully capture key features of the larger system and is an effective approach for dissecting complex binding equilibria. For the purpose of accurately modeling the TAR bulge-dependent binding equilibria, we exclude resonances from global fitting whose chemical shifts are sensitive to bulge-independent binding or to melting, as discussed further in the Supporting Information.
Global Bayesian Fitting Provides Posterior Probability Distributions of Thermodynamic and Structural Parameters.
Global fitting of the TAR CSPs caused by changes in ligand concentration and temperature to a thermodynamic model can provide estimates of reaction energies and chemical shift fingerprints (i.e., structural information) of the various populated RNA species. Each data point collected at a given temperature and ARG concentration for a given resonance is modeled as the observed chemical shift position δobs. For systems exchanging between n states on the fast NMR time scale, the δobs at a given ligand concentration and temperature is an ensemble-averaged value given by
(1) $\delta_{\mathrm{obs}} = \sum_{i=1}^{n} p_i\,\delta_i$
for states i = 1, …, n. The population of a given species pi, expressed as
(2) $p_i = \dfrac{e^{-\Delta G_i/RT}}{Q}$
is the probability of the system being in state i, where $Q = \sum_{j=1}^{n} e^{-\Delta G_j/RT}$, R is the ideal gas constant, T is the absolute temperature in Kelvin, and ΔGi is the Gibbs free energy difference between state i and a reference state. The partition function Q represents the sum of the statistical weights of all microstates distinguishable by binding site occupancy and cooperatively formed conformational minima (see Methods for model-dependent Q derivations).
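As a minimal sketch of the population-weighted averaging described above, the following Python snippet computes Boltzmann populations from free energies and the resulting fast-exchange chemical shift. The function names and the basis shifts (140 and 142 ppm) are hypothetical and chosen only for illustration; the free energy of 0.91 kcal/mol is of the size reported for A ⇌ B at 25 °C.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal mol^-1 K^-1

def populations(delta_g, temp_k):
    """Boltzmann populations p_i from free energies (kcal/mol) relative to a reference state."""
    weights = [math.exp(-g / (R_KCAL * temp_k)) for g in delta_g]
    q = sum(weights)  # partition function Q
    return [w / q for w in weights]

def observed_shift(pops, shifts):
    """Population-weighted chemical shift (ppm) under fast exchange."""
    return sum(p * d for p, d in zip(pops, shifts))

# Hypothetical two-state example: state 0 is the reference, state 1 lies 0.91 kcal/mol higher.
# The basis shifts are made up for illustration and are not fitted values.
pops = populations([0.0, 0.91], 298.15)
delta_obs = observed_shift(pops, [140.0, 142.0])
```

With these inputs the higher-energy state carries a population of a little under 20%, so the averaged shift sits close to the reference-state value.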
Because the raw data establish a mechanism that involves multiple RNA conformational states and multiple liganded states, we employed a Bayesian approach that has several advantages over least-squares fitting methods, particularly for estimation of correlated parameters in multi-component biophysical models.77–80 Least-squares methods that obtain point estimates for parameter values rely on the assumption that the error in the data is normally distributed around the mean value, but this assumption need not be true for more complicated systems in which the variance in the data has complex sources (Methods) and parameters are often correlated. A Bayesian approach, by contrast, estimates complete joint probability distributions for each parameter and in so doing estimates the associated uncertainty over all potential parameter space, providing more reliable estimates of parameter uncertainty. Additionally, Bayesian inference has the advantage over standard approaches in its ability to formally incorporate prior knowledge with the observed data, which together are used to maximize the log likelihood of the proposed model, or the conditional probability of observing the data given the model. This approach is represented by Bayes’s rule, p(θ|y) ∝ p(y|θ) p(θ), where p(θ|y) is the posterior probability distribution, p(y|θ) is the likelihood, and p(θ) is the prior probability distribution for a specified model with parameters θ and data y. Our implementation of Bayesian fitting is described in Methods.
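To make Bayes's rule concrete, here is a toy sketch that evaluates an unnormalized posterior on a grid for a single parameter. It is not the implementation described in Methods: the linear model, the flat prior, the noise level, and the data points are all hypothetical and serve only to show how likelihood and prior combine into a posterior distribution rather than a point estimate.

```python
import math

def log_posterior(theta, data, sigma, prior_lo, prior_hi):
    """Unnormalized log p(theta | y) for y = theta * x with Gaussian noise and a flat prior."""
    if not (prior_lo <= theta <= prior_hi):
        return -math.inf  # zero prior probability outside the allowed range
    # Gaussian log-likelihood up to an additive constant; the flat prior adds another constant.
    return sum(-0.5 * ((y - theta * x) / sigma) ** 2 for x, y in data)

def grid_posterior(data, sigma, prior_lo, prior_hi, n=201):
    """Normalized posterior probabilities on an evenly spaced grid of theta values."""
    grid = [prior_lo + (prior_hi - prior_lo) * k / (n - 1) for k in range(n)]
    logp = [log_posterior(t, data, sigma, prior_lo, prior_hi) for t in grid]
    m = max(logp)
    weights = [math.exp(lp - m) for lp in logp]  # subtract the maximum for numerical stability
    z = sum(weights)
    return grid, [w / z for w in weights]

# Made-up data scattered around a slope of 2.
data = [(1.0, 2.0), (2.0, 4.1), (3.0, 5.9)]
grid, post = grid_posterior(data, sigma=0.1, prior_lo=0.0, prior_hi=4.0)
mode = grid[post.index(max(post))]
```

A grid is practical only for one or two parameters; for correlated, many-parameter models such as the one fitted here, sampling-based methods are the usual choice.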
A Six-State Thermodynamic Model Matches the Information Content of the CSP Data.
The model we developed for data fitting assumes explicit conformational changes and binding equilibria to quantitatively interpret the observed CSP temperature and ligand dependencies. As noted earlier, the curvature observed in the ligand-dependent 2D CSP trajectories provides evidence for at least two binding sites, and the temperature-dependent CSPs at both low and high ligand concentrations indicate conformational equilibria between at least three species. Taken together, these observations suggest the existence of a minimum of 12 distinct RNA species: three conformational states (referred to herein as states A, B, and C) and three bound forms of each conformational state, corresponding to ARG bound to site α or β, or both. This scheme is depicted in Figure 1D.
Ideally, there would be sufficient information in the experimental data to determine all of the equilibrium constants. Unfortunately, because some states are insufficiently populated to allow estimation of their contributions to observed CSPs, we are forced to reduce the complexity of the model to six RNA species: only the unliganded form of the A conformation (A), only the fully liganded form of the C conformation (CL2), and all possible liganded states of the B conformation (B, BLα, BLβ, and BL2). Our rationale for including these states and not the other six is described in detail in the Supporting Information. Although our model-fitting approach does not strictly require any prior assumptions regarding the structural identities of the A, B and C conformations, we propose that they correspond to the flexible, bent inter-helical form of TAR10,46,47,81 (state A); the coaxially stacked state observed at high Mg2+ concentration82–85 where junctional residues A22 and U40 below the bulge remain unpaired (state B); and the coaxially stacked state stabilized with a A22·U40 base pair and U23-A27·U38 base triple49 (state C), as depicted in Figure 1E. These predictions are based on earlier NMR studies that report hydrogen bond alignments between A22-U40 and U23-A27 in excess ARG but not in Mg2+,53 where both ARG and Mg2+ similarly reduce the TAR helix bend angle relative to the unliganded state according to transient electric birefringence, NMR, and circular dichroism studies.47,59,84,86
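The bookkeeping behind the 12-state scheme and its six-state reduction can be sketched by enumerating conformations and site occupancies. The string labels below are illustrative shorthand rather than notation taken from the fitting code.

```python
from itertools import product

conformations = ["A", "B", "C"]
occupancies = ["", "Lα", "Lβ", "L2"]  # unliganded, site α only, site β only, both sites

# Every conformation paired with every occupancy gives the full scheme.
all_species = [conf + occ for conf, occ in product(conformations, occupancies)]

# Species retained in the reduced model described in the text.
six_state = ["A", "B", "BLα", "BLβ", "BL2", "CL2"]
omitted = [s for s in all_species if s not in six_state]
```

Here `all_species` holds the 12 species of the full scheme and `omitted` the six that are too poorly populated to constrain.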
At intermediate ARG concentrations there must be significantly populated singly bound species, which must have either the B or C conformations, or both. As noted above, the minimal model must include one or more singly bound species in the same conformational state as the doubly bound species. The non-sigmoidal binding curves observed for several resonances would not be observed if the two binding sites were on different conformational states. Given the assumption that the unliganded C state is negligible, the B conformation is the best candidate for the singly bound species, i.e., BLα and/or BLβ as opposed to CLα and/or CLβ. We fit for δBLα and δBLβ distributions but also consider the possibility that CLα and/or CLβ become sufficiently populated to contribute to the observed CSPs and therefore also test fits including δCLα and/or δCLβ. We conclude that the models including CLα and/or CLβ do not fit the data significantly better than the simpler and less parametrized model with only BLα and BLβ. This conclusion is consistent with our preference for a model whose complexity is matched to the information content of the data. Additional experiments beyond the scope of the present study might provide sufficient information to determine the populations of some of the six species not included in our minimal model. An example of such an experiment would be relaxation kinetics, which have the capacity to detect transiently populated species whose equilibrium populations are undetectable.37
The Unliganded C Conformation of TAR Is Poorly Populated But Has High Affinity for ARG.
For the six-state model, the fitting equation is given by
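(3) $\delta_{\mathrm{obs}} = p_{A}\delta_{A} + p_{B}\delta_{B} + p_{BL_\alpha}\delta_{BL_\alpha} + p_{BL_\beta}\delta_{BL_\beta} + p_{BL_2}\delta_{BL_2} + p_{CL_2}\delta_{CL_2}$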
where the population of each species is found as described in Methods. The populations and therefore the observed chemical shifts are a function of the site binding affinities for the α and β sites on state B and the equilibrium constants for A ⇌ B (KAB) and BL2 ⇌ CL2. Because the slopes of chemical shifts versus ligand concentration do not vary significantly with temperature for all resonances (representatives shown in the inset in Figure 1C), we assume that the site binding constants are insensitive to temperature at the experimental conditions tested and that the only detectable temperature dependency arises from conformational equilibria given by
(4) $K = \exp\left[\dfrac{\Delta H}{R}\left(\dfrac{1}{T_m} - \dfrac{1}{T}\right)\right]$
where ΔH is the enthalpy change and Tm is the melting temperature in Kelvin for a given conformational transition. Here, the parameter vector θ comprises the enthalpy change and melting temperature of each conformational transition, the two site binding constants, the basis chemical shifts δr,i, and the error terms ϵj, where δr,i represents the vector of basis chemical shifts for species i (i = A, B, BLα, BLβ, BL2, and CL2) per resonance r, and ϵj is the measurement error in δobs in units of ppm for chemical shift groups j (j = C6/C8, C1′, C2, H2/H6/H8/H1′, H1/H3, and overlapped resonances). The error terms were grouped according to scale of raw chemical shift values, as described in Methods.
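The six-state fitting function can be sketched from statistical weights taken relative to state A. This sketch rests on assumptions not spelled out in this section: the two sites on B are treated as independent (so the BL2 weight is the product of the two site constants), `lig` is the free rather than total ligand concentration, and the conformational constant takes the two-state van 't Hoff form with K = 1 at Tm. Function names and the numerical inputs in the usage lines are illustrative.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal mol^-1 K^-1

def k_conf(d_h, t_m, temp_k):
    """Conformational equilibrium constant from dH (kcal/mol) and Tm (K); equals 1 at temp_k == t_m."""
    return math.exp((d_h / R_KCAL) * (1.0 / t_m - 1.0 / temp_k))

def six_state_populations(lig, k_ab, k_alpha, k_beta, k_bl2_cl2):
    """Populations of A, B, BLα, BLβ, BL2, CL2; binding constants in M^-1, lig (free ligand) in M."""
    w = {
        "A": 1.0,
        "B": k_ab,
        "BLα": k_ab * k_alpha * lig,
        "BLβ": k_ab * k_beta * lig,
        "BL2": k_ab * k_alpha * k_beta * lig ** 2,  # assumes independent sites on B
    }
    w["CL2"] = w["BL2"] * k_bl2_cl2
    q = sum(w.values())  # partition function
    return {species: weight / q for species, weight in w.items()}

def delta_obs(pops, basis_shifts):
    """Population-weighted shift for one resonance; basis_shifts maps species name to ppm."""
    return sum(pops[s] * basis_shifts[s] for s in pops)

# Illustrative inputs of roughly the magnitude reported in the text (not the fitted posterior).
pops_free = six_state_populations(0.0, 0.2, 1 / 66e-6, 1 / 550e-6, 0.84)
pops_sat = six_state_populations(10e-3, 0.2, 1 / 66e-6, 1 / 550e-6, 0.84)
```

In the absence of ligand only A and B carry weight, while at 10 mM free ligand the doubly bound species BL2 and CL2 dominate, which mirrors the qualitative behavior described for the fitted model.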
As listed in Table 1, the site binding affinities estimated from the marginal posterior distributions are 66 (+7/−6) μM and 550 (+50/−40) μM for the α and β sites on state B, respectively. The enthalpy change of the stacking equilibrium (A ⇌ B) is estimated to be −7.9 (+0.5/−0.4) kcal mol−1, while the derived entropy change for the same reaction is quite unfavorable at −29.4 (+0.4/−1.2) cal mol−1 K−1. This reduction in entropy is expected for a stacking reaction. The BL2 ⇌ CL2 equilibrium is slightly less exothermic at −2.9 (+0.4/−0.4) kcal mol−1 and also less entropically unfavorable at −9.9 (+1.3/−1.6) cal mol−1 K−1. The derived free energy change for both reactions is slightly positive (unfavorable) at 25 °C (Figure 3C, red), but as depicted in Figure 4, both A and B are significantly populated in the absence of ligand and both BL2 and CL2 are significantly populated at high ARG concentrations. Figure 4 also shows that the only other significantly populated species is BLα, whose population reaches ~35% at the midpoint of the apparent binding curve, but is poorly populated at low and high ARG concentrations.
Table 1.
Parameter | estimate | 5% C.I. | 95% C.I. |
---|---|---|---|
ΔHAB (kcal mol−1) | −7.86 | −8.33 | −7.45 |
ΔH(BL2 ⇌ CL2) (kcal mol−1) | −2.85 | −3.27 | −2.47 |
Tm(A ⇌ B) (°C) | −6.08 | −9.63 | −2.46 |
Tm(BL2 ⇌ CL2) (°C) | 14.55 | 6.92 | 22.05 |
ΔSAB (cal mol−1 K−1)a | −29.44 | −30.83 | −28.23 |
ΔS(BL2 ⇌ CL2) (cal mol−1 K−1)a | −9.90 | −11.45 | −8.55 |
ΔGAB (kcal mol−1)a,b | 0.91 | 0.84 | 0.99 |
ΔG(BL2 ⇌ CL2) (kcal mol−1)a,b | 0.11 | 0.03 | 0.19 |
Kd, site α on state B (mM) | 0.066 | 0.060 | 0.073 |
Kd, site β on state B (mM) | 0.55 | 0.51 | 0.60 |
a Derived parameter.
b At 25 °C.
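The derived quantities for A ⇌ B can be checked with the standard relations ΔG = ΔH − TΔS and Tm = ΔH/ΔS (the temperature at which ΔG = 0). The sketch below uses the ΔHAB and ΔSAB point estimates from Table 1; the function names are illustrative, and small differences from the tabulated ΔG are expected because posterior point estimates need not combine exactly.

```python
def gibbs_free_energy(d_h_kcal, d_s_cal, temp_k):
    """dG in kcal/mol from dH (kcal/mol) and dS (cal mol^-1 K^-1) at temperature temp_k (K)."""
    return d_h_kcal - temp_k * d_s_cal / 1000.0

def melting_temperature_c(d_h_kcal, d_s_cal):
    """Temperature in degrees Celsius at which dG = 0 for a two-state transition."""
    return d_h_kcal * 1000.0 / d_s_cal - 273.15

d_h_ab = -7.86    # kcal/mol, Table 1
d_s_ab = -29.44   # cal/(mol K), Table 1

dg_ab_25c = gibbs_free_energy(d_h_ab, d_s_ab, 298.15)
tm_ab_c = melting_temperature_c(d_h_ab, d_s_ab)
```

The computed ΔG is about 0.9 kcal/mol at 25 °C and the implied midpoint lies near −6 °C, so the stacked B state is the minor unliganded species at all temperatures studied.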
In addition to the six-state model, we tested several models (Figure S5) as potential fits to the data. Although fits to models with species n ≥ 7 were not statistically superior to the simpler, less parametrized six-state model, they provide additional support for the existence of tight binding to an undetectable C species. For instance, the Kd to at least one site on the C conformation was consistently estimated to be below 0.009 mM for fits to n ≥ 7-state models. Under such conditions, the population of the singly bound C conformation would be below 5%, so its contribution to δobs would be negligible and largely indeterminable, which is consistent with the highly uncertain distributions obtained for δCLα and/or δCLβ in those fits. Overall, these results are consistent with a model in which binding to at least one of the sites on the C conformation is much tighter than binding to that site in the B conformation.
Fitted CSPs Give Insight into Structures of Intermediate Species.
To answer the question of whether the A ⇌ B equilibrium reports on coaxial stacking, we compared the fitted chemical shifts for the B state with measured chemical shifts for TAR in Mg2+ and additionally for ΔbulgeTAR in ionic strength conditions comparable to those in the CSP titration (Figure 5A). It is known that Mg2+ shifts the equilibrium to >90% stacked,84,85,87 so we would expect the chemical shifts of the fitted B state to match those seen in Mg2+ if temperature perturbs the same stacking equilibrium. However, a complication with this previously used assumption is that Mg2+ also makes direct contacts with resonances near the bulge, so it may be difficult to separate the chemical shift contributions due to stacking/unstacking versus those due to localization of Mg2+. The bulgeless variant, however, provides a cleaner fingerprint of the chemical shift for the stacked state since it is presumably never unstacked. The disadvantage of this comparison is that it provides no information on the stacked chemical shift for bulge residues U23, C24, and U25 because they are absent in the variant. As shown in Figure 5A, many of the B-state shifts agree in direction and approximate magnitude with shifts for ΔbulgeTAR, supporting the hypothesis that the A ⇌ B equilibrium is the unstacked ⇌ stacked equilibrium. Interestingly but perhaps not surprisingly, the chemical shifts of resonances thought to be in close contact with Mg2+ ions (C24-C1′, C24-C6, U25-C1′, U25-C6, G26-C8, A27-C8) are in opposite direction to the chemical shifts for TAR·Mg2+. Relative to the A state, C39-C6 shifts upfield and C39-C1′ shifts downfield in B, consistent with the base becoming more stacked and the sugar pucker becoming more A-form, as would be expected for residues directly above the bulge in the coaxially stacked B state.
Comparing the chemical shifts for CL2 relative to BL2 gives insight into the structural differences between the stacked/base triple state and the stacked state. Relative to the BL2 state, resonances in CL2 that lose stacking interactions on the basis of downfield chemical shifts include U23-C6, G26-C8, G28-C8/H8, C29-H6, while resonances that become more stacked in the base-tripled state include A22-H2, C24-C6, U25-C6, and C29-C6 (Figure 5B). U25-C1′ has a large upfield chemical shift in CL2 indicating a shift from C3′-endo to C2′-endo (deviating away from RNA A-form geometry). C39-C6 remains unchanged between the BL2 and CL2 states, while its sugar (C39-C1′) becomes slightly more helical (downfield chemical shift), although it is already adopting a more A-form geometry in BL2 relative to unliganded A and B. The upfield shift in H2 of A22 is consistent with increased stacking and stabilization of the A22-U40 base pair in CL2.
Testing the Model and Characterizing a Binding Intermediate by NMR RD.
Our model predicts detectable populations of the intermediate species BLα at intermediate ligand concentrations, so we can exploit this information to gain insight into the kinetics of ligand binding using NMR R1ρ RD. Additionally, R1ρ experiments allow independent determination of chemical shifts for comparison to those obtained from analysis of the CSPs. Measurements were first carried out in the absence of ARG. Consistent with previous results,63 no dispersion is observed in the free TAR construct at temperatures between 5 and 25 °C. Because our model predicts populations of roughly 30% and 15% for the unliganded B state at 5 and 25 °C, respectively (Figure 6A), and sufficiently large 13C chemical shifts (Δω = ωB − ωA > 1 ppm) for several resonances (Figure 5A), the absence of dispersion indicates that the A ⇌ B transition must be fast (τ = 1/kex ≤ 1 μs) and outside the detection limits. Our CSP experiments revealed that binding occurs on the fast exchange NMR time scale (kex (s−1) > Δω (rad s−1))68 at all tested temperatures and therefore we can predict that for average chemical shift differences of 2 ppm in 13C on a 700 MHz spectrometer, the exchange process (kex = k1 + k−1) will be faster than ~2000 s−1. A simplified description of ligand binding between one ligand and one receptor with a single site yields k1 = kon[L] and k−1 = koff. Thus, the total observed exchange (kex = kon[L] + koff) scales with increasing free ligand concentration [L]. Experiments were carried out at low ligand concentration to avoid reaching exchange rates too fast to be detected by RD.
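The fast-exchange threshold quoted above follows from converting a shift difference in ppm to an angular frequency. In this sketch the 13C/1H gyromagnetic ratio of about 0.2514 is a standard approximate value; the function names and the rate constants in the last line are hypothetical.

```python
import math

GAMMA_RATIO_13C_1H = 0.25145  # approximate ratio of 13C to 1H gyromagnetic ratios

def delta_omega_rad_s(delta_ppm, proton_mhz, gamma_ratio=GAMMA_RATIO_13C_1H):
    """Convert a 13C chemical shift difference in ppm into an angular frequency in rad/s."""
    larmor_mhz = proton_mhz * gamma_ratio  # 13C Larmor frequency in MHz
    delta_hz = delta_ppm * larmor_mhz      # 1 ppm corresponds to larmor_mhz Hz
    return 2.0 * math.pi * delta_hz

def k_ex(k_on, free_ligand, k_off):
    """Exchange rate for one ligand binding one site: kex = kon[L] + koff."""
    return k_on * free_ligand + k_off

threshold = delta_omega_rad_s(2.0, 700.0)        # 2 ppm in 13C on a 700 MHz spectrometer
example_kex = k_ex(1.0e6, 1.0e-4, 5.0e3)         # hypothetical rate constants, [L] in M
```

For 2 ppm at 700 MHz the threshold evaluates to roughly 2200 rad/s, consistent with the ~2000 s−1 figure in the text, and the second function shows why kex grows with free ligand concentration.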
To test the predictions from the CSP analysis, we carried out off-resonance R1ρ measurements at 1.5 mM RNA and 0.2 mM ARG which should give rise to chemical exchange corresponding to 9% of the BLα species and 1% of the BLβ species for resonances with large enough 13C chemical shifts relative to the ground state (Δω = ωBLα − ωGS> 1 ppm and Δω = ωBLβ − ωGS > 1 ppm). We note here that because the observed ground state (GS) is in fast exchange between A ⇌ B as described above, the GS chemical shift is not simply ωA; rather it is given by ωGS = ωA(1 − pB) + ωBpB, where pB is calculated from the Bayesian-fitted parameters according to
(5) $p_B = \dfrac{K_{AB}}{1 + K_{AB}}$
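A short sketch of the ground-state averaging, assuming the two-state form pB = KAB/(1 + KAB) for the unliganded A ⇌ B equilibrium. The shift values and KAB below are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def p_b(k_ab):
    """Population of B within the fast-exchanging unliganded A/B ground state."""
    return k_ab / (1.0 + k_ab)

def ground_state_shift(omega_a, omega_b, k_ab):
    """Population-weighted ground-state shift: wGS = wA (1 - pB) + wB pB."""
    pb = p_b(k_ab)
    return omega_a * (1.0 - pb) + omega_b * pb

# Hypothetical example: basis shifts of 140 and 142 ppm and K_AB = 0.5.
omega_gs = ground_state_shift(140.0, 142.0, 0.5)
```

Because the ground state is itself an average, the shift difference probed by RD is measured from `omega_gs` rather than from the pure A-state shift.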
At 5 °C, we observe apparent two-state dispersion profiles at four resonances located near the primary bulge binding site (Figure 6B). No ligand-dependent dispersion was observed at 25 °C, likely because the reaction is too fast for detection by R1ρ, as reaction rates scale with increasing temperature. Global fitting of the data at 5 °C revealed the apparent on-rate constant to be 3 × 10⁶ M−1 s−1 and the apparent off-rate for binding to be about 8000 s−1. The kon and koff values were calculated using eq 19 and eq 20 as described in Methods. The population obtained by RD (5.4 ± 0.5%) is in fairly close agreement with the predicted 9%, although it is important to note that there is no RD evidence for the 1% predicted exchange with a third species. This could be explained by the chemical shifts for the BLβ state being too small (<1 ppm) for RD detection, as we do not have accurate estimations for the BLβ chemical shifts from the global fitting of the CSP data (Supporting Information). Nevertheless, the chemical shifts obtained by RD are in strikingly good agreement with the chemical shifts for species BLα determined from the CSP fitting (Figure 6C), further supporting that the RD is reporting on the binding to the α site on the B conformation.
Furthermore, the apparent dissociation constant can be calculated from the RD fitted parameters using the ratio of the off- and on-rate constants (koff/kon) for a two-state system. In this case, because of the fast A ⇌ B pre-equilibrium, the apparent dissociation constant is scaled by the population of the B state, which is determined by KAB (at 5 °C) obtained from the CSP results. The resulting dissociation constant obtained by RD (0.22 ± 0.05 mM) is in fairly good agreement with the 0.1 mM obtained by the global CSP analysis. Measurements were also carried out at high ligand concentration (10 mM ARG) to probe the kinetics for base triple formation (BL2 ⇌ CL2) since our model predicts a sufficiently high population of CL2, but no dispersion was observed at any of the tested temperatures (Figure S11). It is possible that base triple formation is too fast and outside the detection limits of RD, and more experiments in the future will be needed to explore this hypothesis further.
CONCLUSION
Returning to the complete 12-state model introduced earlier, we can begin to make predictions about this more detailed mechanism based on the analysis of our simpler six-state model. As a consequence of thermodynamic linkage and free energy conservation, not all equilibrium constants in the full 12-state model are independent, i.e.,
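(6) $K_{BC}\,K_{\alpha}^{C}\,K_{\beta}^{C} = K_{\alpha}^{B}\,K_{\beta}^{B}\,K_{BL_2CL_2}$
where $K_{\alpha}^{B}$ and $K_{\beta}^{B}$ are the association constants for the α and β sites on state B, $K_{\alpha}^{C}$ and $K_{\beta}^{C}$ are the corresponding constants on state C, and $K_{BC}$ and $K_{BL_2CL_2}$ are the equilibrium constants for B ⇌ C and BL2 ⇌ CL2.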
Our fitted minimal model provides estimates at 25 °C for the association constants of the α and β sites on state B and for the BL2 ⇌ CL2 equilibrium constant (1.4 × 10⁴ M−1, 1.8 × 10³ M−1, and 0.84, respectively). Our inability to detect any form of C except CL2 gives an upper limit for KBC of 0.05, and substitution of these estimates into eq 6 gives an estimate for the product of the two C-state binding constants of at least 3 × 10¹² M−2. This estimate implies that the upper limit for both C-state dissociation constants must be 0.6 μM, assuming that they are equal. If they are not the same, then one of the C sites must be even tighter (Figure S12).
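The step from a lower bound on the product of the two C-state association constants to an upper bound on the dissociation constants can be sketched as follows. The function names are illustrative, and the second function simply redistributes the same product between unequal sites.

```python
import math

def equal_site_kd(product_m2):
    """Dissociation constant (M) of each site if both sites share the product equally."""
    return 1.0 / math.sqrt(product_m2)

def other_site_kd(product_m2, kd_one_site):
    """Dissociation constant (M) of the second site given the product and the first site's Kd."""
    # K_a * K_b = product and K = 1/Kd, so Kd_b = 1 / (product * Kd_a).
    return 1.0 / (product_m2 * kd_one_site)

product_c = 3e12                              # M^-2, lower bound quoted in the text
kd_equal = equal_site_kd(product_c)           # about 0.6 micromolar
kd_tight = other_site_kd(product_c, 10.0 * kd_equal)  # if one site is tenfold weaker
```

If one site is tenfold weaker than the equal-site value, the other must be tenfold tighter to preserve the product, which is the sense in which unequal sites imply an even tighter site.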
Because this affinity is much tighter than that of the B conformation (by 4 orders of magnitude), the probability of finding a doubly bound C species (CL2) is much higher than finding a singly bound C species (CLα or CLβ). In other words, it is much more thermodynamically favorable for ARG to bind to both sites on conformation C than it is to have only one ligand bound. The same is not true for the B conformation, whose ligand binding is sufficiently weak that the most energetically favorable species is the singly bound B state (BLα) at intermediate ARG concentrations (see population of BLα in Figure 4).
Considering our results in the context of energetic penalties previously measured for RNA systems, the enthalpy we measure for A ⇌ B (ΔHAB ≈ −7.9 kcal mol−1) is in fairly close agreement with the enthalpy reported for formation of a single terminal AU-AU base stack (ΔHstack = −6.82 ± 0.72 kcal mol−1).88 Since formation of the TAR B state involves coaxial stacking between the upper and lower helix, our data are consistent with the conclusion that the ΔH measured for A ⇌ B reflects the enthalpy of stacking. This occurs between junctional bases at the bulge that are likely to be as flexible as the terminal bases studied previously.88 Additionally, consistent with our result of a ~7 kcal mol−1 penalty for forming the C conformation in the absence of ligand (Figure 7), no evidence has ever been observed suggesting the base triple is detectable in the free RNA, and an RDC-informed MD ensemble also never observed the base triple for either HIV-1 or HIV-2 TAR.89,90
Our results suggest that preferentially tight affinity to high-energy RNA states may be a ubiquitous mechanism by which ligands confer selectivity to otherwise promiscuous RNA binding partners. The free energy released on binding a high-affinity ligand can explain how the RNA overcomes the high conformational energetic penalty for formation of the binding-competent conformation. Lack of selectivity is among the main challenges in RNA drug design and in understanding processes regulated by RNA. Our study provides direct evidence that undetectable RNA species are the key thermodynamic players in a paradigmatic example of RNA adaptive recognition and underscores the importance of developing methods to study low-population states that have gone undetected. While the presented approach provides only thermodynamic information regarding binding intermediates, its integration with other methods such as NMR RD and stopped-flow techniques may afford a deeper and more complete kinetic characterization of the binding reaction. In addition, extending NMR measurements to other sugar resonances should be straightforward and will afford the opportunity to structurally characterize the intermediates at higher resolution.91,92 We predict that the mechanism uncovered here is not limited to TAR and anticipate that further experiments using the approach outlined here can help to quantitatively characterize multi-state RNA–ligand interactions.
METHODS
Sample Preparation.
HIV-1 TAR RNA capped with a UUCG loop (sequence GGCAGAUCUGAGCUUCGGCUCUCUGCC) and a bulgeless mutant (sequence GGCAGAGAGCUUCGGCUCUCUGCC) were prepared by in vitro transcription using DNA templates containing the T7 promoter (Integrated DNA Technologies). DNA templates were annealed in 3 mM MgCl2 by heating to 95 °C for 5 min and cooling on ice for 30 min. The transcription reaction was carried out at 37 °C for 12 h with T7 RNA polymerase (New England BioLabs) using 13C/15N-labeled nucleotide triphosphates (Cambridge Isotope Laboratories, Inc.). RNA was purified by 20% (w/v) denaturing polyacrylamide gel electrophoresis (PAGE) with 8 M urea and 1× TBE buffer, excised from the gel, recovered by electroelution in 1× TAE buffer, and further purified by ethanol precipitation overnight at −20 °C. Purified RNA was dissolved in water to 50 μM and annealed by heating to 95 °C for 5 min and cooling on ice for 1 h. Prior to NMR experiments, 13C/15N-labeled RNA samples were exchanged into phosphate NMR buffer (15 mM NaH2PO4/Na2HPO4, 25 mM NaCl, 0.1 mM EDTA, 10% (v/v) D2O at pH 6.4) to the desired RNA concentrations (0.2 mM for HSQC titrations and 1.5 mM for RD). Concentrations were measured on a NanoDrop UV spectrophotometer from the absorbance of heat-denatured RNA at 260 nm.
NMR CSP Experiments.
NMR experiments were carried out on Bruker Avance III 600 MHz and Bruker Avance III 700 MHz spectrometers equipped with a 5 mm triple resonance cryogenic probe. Assignments for TAR in the unbound and ARG-bound states were previously published.93 CSPs as a function of temperature and ligand concentration were monitored for aromatic and aliphatic resonances using 2D [13C,1H] heteronuclear single quantum coherence (HSQC) spectra and for imino resonances using 1D [1H] SOFAST spectra. For each titration point, ARG (L-argininamide dihydrochloride, Sigma) from a concentrated stock dissolved in phosphate NMR buffer was added directly to an NMR sample tube containing 450 μL of 0.2 mM 13C/15N-labeled RNA, and the tube was inverted at least six times to ensure homogeneous mixing. The increase in total sample volume due to ARG addition remained less than 5% of the starting volume and therefore had a negligible effect on the RNA concentration. A total of seven titration points per temperature were used for the TAR titration and six for the bulgeless mutant titration. Experiments for both constructs were carried out at six different temperatures (1, 2.5, 5, 10, 15, and 25 °C). The temperature of the NMR probe was calibrated using a methanol sample, and spectra were internally referenced relative to a trace amount of 2,2-dimethyl-2-silapentane-5-sulfonate (DSS). Spectra were processed using NMRPipe,94 and peak positions were measured using Sparky.95
NMR RD Experiments.
13C R1ρ measurements were carried out at 25 and 5 °C on a Bruker Avance III 700 MHz spectrometer using 1.5 mM 13C/15N-labeled RNA. On- and off-resonance measurements were collected at the spin lock powers (ω1) and spin lock offsets (Ω) listed in Table S1 using a 1D acquisition scheme that uses Hartmann–Hahn cross-polarization transfer to selectively excite a single peak at a time, as described previously.22,63,92 To account for sample heating effects due to high-power spin locks,96 the pulse sequence contains a heat-compensation element that applies far off-resonance 13C spin locks during the recycle delay after acquisition at the highest field strength for a time such that the net amount of radiofrequency (RF) power introduced into the sample is constant across all relaxation delay times.21,64 For preparation of samples, ARG from a concentrated stock dissolved in phosphate NMR buffer was added directly to the NMR sample tube containing 1.5 mM 13C/15N-labeled RNA, following the procedure used for titration experiments, to achieve the desired ARG concentrations (0.2 and 10 mM). Raw data were processed using NMRPipe to determine peak intensities at each delay time point, which were then fit to a mono-exponential function to estimate the R1ρ values needed for further analysis via eq 18.
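The mono-exponential step can be illustrated with a minimal sketch. It assumes the decay model I(t) = I0·exp(−R1ρ·t) and, for simplicity, uses an unweighted linear regression of ln(I) against delay rather than the nonlinear fit a production analysis would more likely use. The function name and the synthetic intensities are hypothetical and are not taken from the measured data.

```python
import math

def fit_monoexponential(delays_s, intensities):
    """Estimate (R1rho in s^-1, I0) from I(t) = I0 * exp(-R1rho * t).

    Simplification: unweighted linear least squares on ln(I) versus t,
    which requires strictly positive intensities.
    """
    if len(delays_s) != len(intensities) or len(delays_s) < 2:
        raise ValueError("need at least two (delay, intensity) pairs")
    log_i = [math.log(v) for v in intensities]
    n = len(delays_s)
    t_mean = sum(delays_s) / n
    y_mean = sum(log_i) / n
    s_tt = sum((t - t_mean) ** 2 for t in delays_s)
    s_ty = sum((t - t_mean) * (y - y_mean) for t, y in zip(delays_s, log_i))
    slope = s_ty / s_tt
    intercept = y_mean - slope * t_mean
    return -slope, math.exp(intercept)

# Synthetic, noise-free illustration (made-up values, not experimental data).
delays = [0.0, 0.004, 0.008, 0.016, 0.024, 0.032]
synthetic = [1000.0 * math.exp(-25.0 * t) for t in delays]
r1rho, i0 = fit_monoexponential(delays, synthetic)
```

With noisy intensities the log transform distorts the error weighting, so a weighted or nonlinear fit would be preferable in practice.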
Statistical Thermodynamic Framework.
As described in the Results and Discussion section, qualitative analysis of the CSP data led us to conclude that the minimum complexity system necessary to explain the temperature and ligand dependencies must include three conformational states (A, B, C) each potentially capable of binding two ligands. To further minimize complexity, the two binding sites (α and β) are assumed to be independent; that is, occupancy at one site does not detectably influence the intrinsic affinity at the other site. The partition function Q for such a system can be derived by rewriting the following microscopic association and conformational equilibrium constants:
(7) |
in terms of RNA concentrations, summing together all 12 RNA forms,
(8) |
and dividing by a chosen reference state (most simply set as the free A state) to obtain
(9) |
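As a sketch of how the independent-site assumption shapes the result, the 12-state partition function can be written as a sum of three conformer weights, each multiplied by a product of two single-site binding polynomials. The notation below follows the parameter convention described later in the text (superscript for conformation, subscript for site); the symbol $K_{AB}$ and the exact layout are assumptions of this sketch and may differ from the form of eqs 7–9.

```latex
K_{AB} = \frac{[B]}{[A]}, \qquad
K_{BC} = \frac{[C]}{[B]}, \qquad
K_{s}^{X} = \frac{[XL_{s}]}{[X][L]}
\quad (X \in \{A, B, C\},\ s \in \{\alpha, \beta\})

Q_{12} = \left(1 + K_{\alpha}^{A}[L]\right)\left(1 + K_{\beta}^{A}[L]\right)
       + K_{AB}\left(1 + K_{\alpha}^{B}[L]\right)\left(1 + K_{\beta}^{B}[L]\right)
       + K_{AB}K_{BC}\left(1 + K_{\alpha}^{C}[L]\right)\left(1 + K_{\beta}^{C}[L]\right)
```

Expanding the three products gives 3 × 4 = 12 terms, one per RNA form, each expressed relative to free A.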
Although this 12-state model (Figure 1D) was used for simulations, extensive testing revealed that the information content of the data is insufficient to fit more parameters than those necessary to quantify the interconversion of the six states highlighted in Figure 1D. The summation of all RNA forms in the six-state model is
(10) |
which gives the partition function
(11) |
The population term for each species is found by taking its statistical weight and dividing by the partition function. Note that the equilibrium constants KBC and are related through
(12) |
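The rule "population = statistical weight / partition function" can be sketched in a few lines. This sketch assumes that the six retained species are A, B, BLα, BLβ, BL2, and CL2 (an inference from the text; the states highlighted in Figure 1D are authoritative) and lumps the CL2 weight into a single hypothetical parameter `K_CL2`. All numerical values are illustrative and are not the fitted parameters.

```python
def six_state_populations(L, K_AB, Ka_B_alpha, Ka_B_beta, K_CL2):
    """Populations from statistical weights relative to free A.

    L          free ligand concentration (M)
    K_AB       A <-> B conformational equilibrium constant (dimensionless)
    Ka_B_*     association constants of the two B-state sites (M^-1)
    K_CL2      lumped CL2 weight per [L]^2 (M^-2); hypothetical parameter
               standing in for K_AB * K_BC * Ka_C_alpha * Ka_C_beta
    """
    weights = {
        "A": 1.0,
        "B": K_AB,
        "BLa": K_AB * Ka_B_alpha * L,
        "BLb": K_AB * Ka_B_beta * L,
        "BL2": K_AB * Ka_B_alpha * Ka_B_beta * L ** 2,
        "CL2": K_CL2 * L ** 2,
    }
    Q = sum(weights.values())  # partition function
    return {name: w / Q for name, w in weights.items()}

# Illustrative (made-up) parameters: no ligand, then 1 mM free ligand.
pops_free = six_state_populations(0.0, 1.0, 1e4, 1e3, 1e11)
pops_1mM = six_state_populations(1e-3, 1.0, 1e4, 1e3, 1e11)
```

With no ligand only A and B are populated; as [L] rises, the quadratic CL2 term eventually dominates even when its conformational weight is tiny.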
The free ligand concentration term [L] is a solution to a cubic equation of the form
(13) |
where the coefficients a, b, c, and d are defined as
(14) |
for the 12-state model and
(15) |
for the six-state model. The general, model-independent algebraic solution97,98 for free ligand concentration [L] is given by
(16) |
where
(17) |
and a, b, c, and d are model-dependent terms as shown in eqs 14 and 15.
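The cubic arises because the partition function is at most quadratic in [L]. Writing Q = q0 + q1[L] + q2[L]², ligand mass balance is [LT] = [L] + [RT]·n̄, where n̄ = (q1[L] + 2q2[L]²)/Q is the mean number of ligands bound per RNA. The sketch below solves this balance numerically by bisection rather than with the closed-form cubic solution used in the text. The function names and the q0, q1, q2 parameterization are assumptions of the sketch.

```python
def bound_per_rna(L, q0, q1, q2):
    """Mean ligands bound per RNA for Q = q0 + q1*L + q2*L**2."""
    return (q1 * L + 2.0 * q2 * L ** 2) / (q0 + q1 * L + q2 * L ** 2)

def free_ligand(L_total, R_total, q0, q1, q2, tol=1e-15, max_iter=200):
    """Solve L + R_total * nbar(L) = L_total for free ligand L by bisection.

    The left-hand side increases monotonically with L, so exactly one root
    lies in [0, L_total]. Clearing denominators gives the equivalent cubic
    a*L**3 + b*L**2 + c*L + d = 0 with (derived for this sketch only)
      a = q2, b = q1 + q2*(2*R_total - L_total),
      c = q0 + q1*(R_total - L_total), d = -q0*L_total.
    """
    lo, hi = 0.0, L_total
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        if mid + R_total * bound_per_rna(mid, q0, q1, q2) > L_total:
            hi = mid
        else:
            lo = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)

# Single-site check (q2 = 0) with made-up values: K = 1e4 M^-1,
# 0.2 mM RNA, 1 mM total ligand.
L_free = free_ligand(1e-3, 0.2e-3, 1.0, 1e4, 0.0)
balance = L_free + 0.2e-3 * bound_per_rna(L_free, 1.0, 1e4, 0.0)
```

For the six-state sketch with species A, B, BLα, BLβ, BL2, and CL2, the coefficients would be q0 = 1 + K_AB, q1 = K_AB(Kα + Kβ) for the B-state sites, and q2 = K_AB·Kα·Kβ plus the lumped CL2 weight. This mapping is again an assumption rather than a restatement of eq 15.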
Bayesian Analysis of CSP Data.
Bayesian inference was used to fit the CSP data. This fitting method applies Bayes’s rule, p(θ|y) = p(y|θ) p(θ)/p(y), which states that for a specified mathematical model of the data y based on parameters θ, the probability distributions of parameter values given the data, p(θ|y) (called the posterior distributions), are the product of the likelihood of the data given the parameters, p(y|θ) (a measure of goodness-of-fit), and user-specified prior probability distributions, p(θ), normalized by the probability of the data, p(y). In simpler terms, the posterior ∝ prior × likelihood. The calculation of the p(y) term needed to normalize the posterior involves a multivariate integral that does not have an analytical solution except in the simplest of cases such as single-parameter models,80,99 so Bayesian techniques have been developed that use Markov-chain Monte Carlo (MCMC) sampling to approximate the posterior distributions without evaluating p(y) explicitly. Here, we use the software package Stan, which employs the No-U-Turn Sampler (NUTS), a variant of Hamiltonian Monte Carlo (HMC) that is efficient and robust for models with complicated posteriors.100,101 The analysis was carried out using in-house scripts written in the Stan statistical programming language run under the R programming interface (RStan).
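For a single parameter, the relation posterior ∝ prior × likelihood can be evaluated directly on a grid, which makes the role of the normalizing term p(y) concrete. The sketch below is a toy example, not the CSP model: it infers the mean of Gaussian data under a Gaussian prior, and every name and number in it is illustrative.

```python
import math

def normal_pdf(x, mu, sigma):
    """Density of a normal distribution with mean mu and SD sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def grid_posterior(grid, data, prior_mu, prior_sigma, noise_sigma):
    """Return (posterior density on the grid, evidence p(y))."""
    step = grid[1] - grid[0]
    unnormalized = []
    for theta in grid:
        likelihood = 1.0
        for y in data:
            likelihood *= normal_pdf(y, theta, noise_sigma)
        unnormalized.append(normal_pdf(theta, prior_mu, prior_sigma) * likelihood)
    evidence = sum(unnormalized) * step  # numerical integral standing in for p(y)
    return [u / evidence for u in unnormalized], evidence

grid = [-5.0 + 0.001 * i for i in range(10001)]
data = [0.9, 1.1, 1.0]  # made-up observations
posterior, evidence = grid_posterior(grid, data, 0.0, 1.0, 0.5)
post_mean = sum(t * p for t, p in zip(grid, posterior)) * 0.001
```

A grid of this kind grows exponentially with the number of parameters, which is why sampling methods are used for models with many parameters.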
The Stan interface allows the user to define a likelihood function given by the model equations and conditioned on the data. For each unknown parameter, a prior distribution is specified with a corresponding mean μ and standard deviation σ. Briefly, at each sampling iteration, the algorithm takes a number of discretized leapfrog steps in parameter space to simulate the trajectory of a particle moving across a potential energy field. The potential energy at a point in this field is the negative log of the posterior density at the corresponding parameter values, so low-energy regions correspond to high posterior probability. Essentially, the leapfrog steps are a discrete approximation to the continuous trajectory defined by the model’s equations. Highly curved posteriors (i.e., complicated nonlinear functions) are especially difficult to sample from, partly because a very large number of very small steps would be required to traverse the trajectory accurately. In the work here, the polynomial terms in the free ligand expression (eq 16) give rise to a highly nonlinear dependence of the model on free ligand concentration. This variable is model-dependent and based on the combination of binding and equilibrium constants (as shown in eqs 7–17), which makes the selection of informative priors very important.
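The leapfrog idea can be shown with a minimal one-dimensional sketch. It is a generic textbook integrator under simple assumptions (unit mass, a standard normal target so that the potential is U(q) = q²/2) and is not Stan's implementation, which also adapts the step size and trajectory length. All names are illustrative.

```python
def leapfrog(q, p, grad_U, step, n_steps):
    """Integrate Hamiltonian dynamics with the leapfrog scheme (unit mass).

    q is the position (parameter value), p the auxiliary momentum, and
    grad_U the gradient of the potential U(q) = -log posterior(q).
    """
    p = p - 0.5 * step * grad_U(q)          # initial half step in momentum
    for i in range(n_steps):
        q = q + step * p                    # full step in position
        if i < n_steps - 1:
            p = p - step * grad_U(q)        # full step in momentum
    p = p - 0.5 * step * grad_U(q)          # final half step in momentum
    return q, p

def grad_U(q):
    return q  # gradient of U(q) = q**2 / 2 (standard normal target)

def hamiltonian(q, p):
    return 0.5 * q * q + 0.5 * p * p

q0, p0 = 1.0, 0.0
q1, p1 = leapfrog(q0, p0, grad_U, step=0.1, n_steps=20)
energy_error = abs(hamiltonian(q1, p1) - hamiltonian(q0, p0))
# Reversing the momentum and integrating again returns to the start.
q_back, p_back = leapfrog(q1, -p1, grad_U, step=0.1, n_steps=20)
```

The small but nonzero energy error is the discretization error referred to above; it grows with step size and with the curvature of the posterior.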
The Bayesian fitting algorithm aims to maximize the log-likelihood for master eq 1 and in doing so treats each model parameter as a random variable with its own probability distribution for parameter vector θ. The general parameter vector for all models tested can be expressed as , where superscript X refers to a conformational state (A, B, or C), subscript s refers to a binding site (α or β), Y1 ⇌ Y2 refers to a given conformational equilibrium (such as A ⇌ B), δr,i represents the vector of basis chemical shifts for species i per resonance r, and ϵj is the measurement error for chemical shift categories discussed in the following paragraph.
For all thermodynamic parameters, Gaussian prior distributions were used with standard deviations up to 50% of their means. Gaussian priors were also used for chemical shift parameters with standard deviations of 1.0 ppm for carbon and 0.25 ppm for proton. For a given resonance, the prior distributions used for all species’ chemical shifts were identical to avoid biasing the posteriors. In addition to the thermodynamic and chemical shift parameters, the Bayesian fitting algorithm treats the measurement error as a random unknown variable and therefore fits for an error term referred to as ϵj. As mentioned in the Results and Discussion, Bayesian techniques are suitable for handling data whose variance may have several sources. Because our fitting approach uses the raw measured chemical shift values of all included resonance types and avoids scaling or normalizing the data, the scale of the measurement error varies depending on the nucleus type (13C vs 1H) and even atom position (C6/C8 vs C1′). For this reason, we fit for six error terms corresponding to the following categories of chemical shifts: C6/C8, C1′, C2, non-exchangeable protons, exchangeable protons, and resonances with significant overlap at any point in the titration. An Inverse-Gamma distribution with shape and scale parameters (α, β) was chosen for the error priors, where the shape and scale terms are related to the mean μ and standard deviation σ by α = (μ² + 2σ²)/σ² and β = μ(μ² + σ²)/σ².
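The shape and scale relations can be checked against the standard Inverse-Gamma moments, mean = β/(α − 1) and variance = β²/[(α − 1)²(α − 2)]. The sketch below does that round trip; the function names and the example mean and standard deviation are illustrative, not the priors used in the fits.

```python
def inverse_gamma_params(mean, sd):
    """Shape (alpha) and scale (beta) from the desired mean and SD."""
    alpha = (mean ** 2 + 2.0 * sd ** 2) / sd ** 2
    beta = mean * (mean ** 2 + sd ** 2) / sd ** 2
    return alpha, beta

def inverse_gamma_moments(alpha, beta):
    """Mean and SD of an Inverse-Gamma(alpha, beta); requires alpha > 2."""
    mean = beta / (alpha - 1.0)
    variance = beta ** 2 / ((alpha - 1.0) ** 2 * (alpha - 2.0))
    return mean, variance ** 0.5

# Made-up example: an error prior centred at 0.05 ppm with SD 0.02 ppm.
alpha, beta = inverse_gamma_params(0.05, 0.02)
mean_back, sd_back = inverse_gamma_moments(alpha, beta)
```

Because α = μ²/σ² + 2 is always greater than 2, the variance of the resulting prior is guaranteed to be finite.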
MCMC sampling was carried out using four independent chains and 4000 iterations per chain. The first half of each chain comprises the warm-up phase, during which the algorithm tunes itself for sampling efficiency, and is therefore discarded before posterior analysis. The post-warm-up samples used for approximating the parameter posterior distributions consist of 2000 draws per chain, i.e., 8000 total draws. Convergence was assessed (Figure S6) via the potential scale reduction factor (R̂)102 for each parameter, where values approaching 1 (≤1.05) indicate convergence and larger values indicate lack of convergence.103 Prior to data fitting, the model was extensively tested for robustness by subjecting simulated data to the fitting procedure and evaluating the effects of various prior distributions on posterior distributions. As a benchmark, data with noise were simulated for 20 resonances under experimental conditions identical to those of the collected data (seven ligand concentrations and six temperatures), and the simulated data were subjected to global fitting using the equations described above. The fitting was able to converge within 2000 iterations only when the correct parameter distributions were found. Setting priors very far from simulated values prevented convergence, indicating that the algorithm did not falsely converge to incorrect parameter values.
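The convergence diagnostic can be sketched with the classic Gelman–Rubin formulation, which compares between-chain and within-chain variance. This is a simplified version for illustration; Stan reports refined (split-chain) variants, so values from this sketch need not match Stan's output exactly. The chains below are synthetic random draws, not posterior samples from the fits.

```python
import random

def sample_variance(values):
    """Unbiased sample variance."""
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values) / (len(values) - 1)

def gelman_rubin_rhat(chains):
    """Classic (non-split) potential scale reduction factor for one parameter."""
    n = len(chains[0])
    if len(chains) < 2 or n < 2 or any(len(c) != n for c in chains):
        raise ValueError("need at least two chains of equal length >= 2")
    chain_means = [sum(c) / n for c in chains]
    within = sum(sample_variance(c) for c in chains) / len(chains)
    between_over_n = sample_variance(chain_means)
    pooled = (n - 1) / n * within + between_over_n
    return (pooled / within) ** 0.5

rng = random.Random(0)
# Four chains of 2000 draws, mirroring the post-warm-up layout in the text.
mixed = [[rng.gauss(0.0, 1.0) for _ in range(2000)] for _ in range(4)]
# Same draws, but the last chain is shifted to mimic a chain stuck elsewhere.
stuck = [[x + (5.0 if i == 3 else 0.0) for x in chain]
         for i, chain in enumerate(mixed)]
rhat_mixed = gelman_rubin_rhat(mixed)
rhat_stuck = gelman_rubin_rhat(stuck)
```

Chains that sample the same distribution give a value close to 1, whereas a single displaced chain pushes the statistic well above the 1.05 threshold.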
We note that the Bayesian approach was not only advantageous over a standard nonlinear least-squares approach but also integral to the quantitative analysis of a model that matched the level of complexity present in the raw data described here. For data sets simulated from the six-state model both with and without noise, least-squares global fitting failed to recover the simulated parameters, whereas Bayesian global fitting correctly recovered the parameter distributions for converged fits.
Analysis of RD Data.
Data for two-state exchange were fit to
(18) |
where R1 and R2 are the longitudinal and transverse relaxation rates, Δω is the change in chemical shift between the low population excited state (ES) and highly populated ground state (GS), expressed as Δω = ωES − ωGS. The strength of the carrier spinlock power is ωSL and the tilt angle in the rotating frame is defined as θ = tan−1(ωSL/ΔΩ) given by ΔΩ = Ω − ωrf, the difference between average spin lock offset (Ωave) and reference frequency (ωrf). Average spin lock offset is calculated by Ωave = pGSωGS + pESωES where pGS and pES are populations of the GS and ES, respectively. The spinlock strengths at the GS and ES are given by and , respectively, and their resonance offsets from the spinlock carrier correspond to ωGS and ωES. ΔωES = ωES − ΩGS defines the difference in offset for transitions from ES to GS. The effective spinlock field strength is calculated as . Finally, the exchange rate constant is expressed as kex = k1 + k−1 = pESkex + pGSkex and can be converted to the form to determine the on and off rate constants. Free ligand concentration [L] is calculated from the relation [L] = [LT] − pbound[RT], where the total ligand and RNA concentrations are known a priori, and the population of bound RNA (pbound) is obtained from the RD fitted value for pES in the case of low ligand concentration where the excited species is the bound state, or pGS in the case of excess ligand where the excited state is the free RNA. This allows us to rewrite and as
(19) |
and
(20) |
Note that the rate constants are considered apparent rather than intrinsic due to the presence of RD-invisible pre-equilibrium. The apparent association and dissociation rate constants and 104 are measured, whereas the intrinsic kon and koff cannot be determined here because kAB and kBA are unknown.
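The bookkeeping described above can be sketched as follows. The relations k1 = pES·kex, k−1 = pGS·kex, and [L] = [LT] − pbound[RT] are taken from the text. The final step, dividing the binding-direction rate constant by [L] to obtain an apparent second-order association rate constant, assumes pseudo-first-order 1:1 binding and is an assumption of this sketch, not a restatement of eqs 19 and 20. The function name and all numbers are illustrative.

```python
def apparent_rate_constants(k_ex, p_es, L_total, R_total, bound_is_excited=True):
    """Return (k_on_app in M^-1 s^-1, k_off_app in s^-1, free ligand in M).

    bound_is_excited=True  : low ligand, the excited state is the bound RNA.
    bound_is_excited=False : excess ligand, the excited state is the free RNA.
    """
    if not 0.0 < p_es < 1.0:
        raise ValueError("p_es must lie strictly between 0 and 1")
    p_gs = 1.0 - p_es
    k_gs_to_es = p_es * k_ex   # k1 in the text
    k_es_to_gs = p_gs * k_ex   # k-1 in the text
    p_bound = p_es if bound_is_excited else p_gs
    L_free = L_total - p_bound * R_total
    if L_free <= 0.0:
        raise ValueError("computed free ligand concentration is not positive")
    if bound_is_excited:
        k_on_app = k_gs_to_es / L_free   # assumed pseudo-first-order binding
        k_off_app = k_es_to_gs
    else:
        k_on_app = k_es_to_gs / L_free
        k_off_app = k_gs_to_es
    return k_on_app, k_off_app, L_free

# Made-up fit results: kex = 1000 s^-1, pES = 5%, 0.2 mM ARG, 1.5 mM RNA.
k_on_app, k_off_app, L_free = apparent_rate_constants(1000.0, 0.05, 0.2e-3, 1.5e-3)
```

In this illustration the forward rate constant of 50 s−1 divided by 0.125 mM free ligand gives an apparent association rate constant of about 4 × 10⁵ M−1 s−1.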
ACKNOWLEDGMENTS
We thank Dr. Jonathan Li for help in getting started with the Stan software used for Bayesian fitting, all members of the Al-Hashimi lab for insightful discussions, and the Duke NMR Center for maintenance and assistance with the instruments. This work was supported by the U.S. National Institutes of Health (P50 GM103297).
Footnotes
Supporting Information
The Supporting Information is available free of charge at https://pubs.acs.org/doi/10.1021/jacs.9b10535.
Additional results and discussion, Tables S1–S4, and Figures S1–S12 (PDF)
The authors declare the following competing financial interest(s): H.M.A. is an advisor and holds an ownership interest in Nymirum Inc., an RNA-based drug discovery company.
REFERENCES
- (1).Mattick JS RNA regulation: a new genetics? Nat. Rev. Genet 2004, 5, 316–23. [DOI] [PubMed] [Google Scholar]
- (2).Sharp PA The centrality of RNA. Cell 2009, 136, 577–80. [DOI] [PubMed] [Google Scholar]
- (3).Cruz JA; Westhof E The dynamic landscapes of RNA architecture. Cell 2009, 136, 604–9. [DOI] [PubMed] [Google Scholar]
- (4).Rinnenthal J; Buck J; Ferner J; Wacker A; Fürtig B; Schwalbe H Mapping the landscape of RNA dynamics with NMR spectroscopy. Acc. Chem. Res 2011, 44, 1292–1301. [DOI] [PubMed] [Google Scholar]
- (5).Dethoff EA; Chugh J; Mustoe AM; Al-Hashimi HM Functional complexity and regulation through RNA dynamics. Nature 2012, 482, 322. [DOI] [PMC free article] [PubMed] [Google Scholar]
- (6).Ganser LR; Kelly ML; Herschlag D; Al-Hashimi HM The roles of structural dynamics in the cellular functions of RNAs. Nat. Rev. Mol. Cell Biol 2019, 20, 474–489. [DOI] [PMC free article] [PubMed] [Google Scholar]
- (7).Bothe JR; Nikolova EN; Eichhorn CD; Chugh J; Hansen AL; Al-Hashimi HM Characterizing RNA dynamics at atomic resolution using solution-state NMR spectroscopy. Nat. Methods 2011, 8, 919–931. [DOI] [PMC free article] [PubMed] [Google Scholar]
- (8).Zhao B; Guffy SL; Williams B; Zhang Q An excited state underlies gene regulation of a transcriptional riboswitch. Nat. Chem. Biol 2017, 13, 968–974. [DOI] [PMC free article] [PubMed] [Google Scholar]
- (9).De Guzman RN; Turner RB; Summers MF Protein–RNA recognition. Biopolymers 1998, 48, 181–195. [DOI] [PubMed] [Google Scholar]
- (10).Patel DJ Adaptive recognition in RNA complexes with peptides and protein modules. Curr. Opin. Struct. Biol 1999, 9, 74–87. [DOI] [PubMed] [Google Scholar]
- (11).Hermann T; Patel DJ Adaptive recognition by nucleic acid aptamers. Science 2000, 287, 820–5. [DOI] [PubMed] [Google Scholar]
- (12).Varani G RNA–protein intermolecular recognition. Acc. Chem. Res 1997, 30, 189–195. [Google Scholar]
- (13).Becker WR; Jarmoskaite I; Vaidyanathan PP; Greenleaf WJ; Herschlag D Demonstration of protein cooperativity mediated by RNA structure using the human protein PUM2. RNA 2019, 25, 702–712. [DOI] [PMC free article] [PubMed] [Google Scholar]
- (14).Solomatin SV; Greenfeld M; Herschlag D Implications of molecular heterogeneity for the cooperativity of biological macromolecules. Nat. Struct. Mol. Biol 2011, 18, 732–4. [DOI] [PMC free article] [PubMed] [Google Scholar]
- (15).Motlagh HN; Wrabl JO; Li J; Hilser VJ The ensemble nature of allostery. Nature 2014, 508, 331–9. [DOI] [PMC free article] [PubMed] [Google Scholar]
- (16).Palmer AG; Kroenke CD; Loria JP In Nuclear Magnetic Resonance of Biological Macromolecules - Part B; James TL, Dötsch V, Schmitz U, Eds.; Methods in Enzymology 339; Academic Press, 2001; pp 204–238. [DOI] [PubMed] [Google Scholar]
- (17).Korzhnev DM; Skrynnikov NR; Millet O; Torchia DA; Kay LE An NMR experiment for the accurate measurement of heteronuclear spin-lock relaxation rates. J. Am. Chem. Soc 2002, 124, 10743–10753. [DOI] [PubMed] [Google Scholar]
- (18).Abergel D; Palmer AG Approximate solutions of the Bloch-McConnell equations chemical equations for two-site chemical exchange. ChemPhysChem 2004, 5, 787–793. [DOI] [PubMed] [Google Scholar]
- (19).Trott O; Palmer AG Theoretical study of R1ρ rotating-frame and R1free-precession relaxation in the presence of n-site chemical exchange. J. Magn. Reson 2004, 170, 104–112. [DOI] [PubMed] [Google Scholar]
- (20).Palmer AG; Massi F Characterization of the dynamics of biomacromolecules using rotating-frame spin relaxation NMR spectroscopy. Chem. Rev 2006, 106, 1700–1719. [DOI] [PubMed] [Google Scholar]
- (21).Hansen AL; Nikolova EN; Casiano-Negroni A; Al-Hashimi HM Extending the range of microsecond-to-millisecond chemical exchange detected in labeled and unlabeled nucleic acids by selective carbon R1ρ NMR spectroscopy. J. Am. Chem. Soc 2009, 131, 3818–9. [DOI] [PubMed] [Google Scholar]
- (22).Xue Y; Kellogg D; Kimsey IJ; Sathyamoorthy B; Stein ZW; McBrairty M; Al-Hashimi HM Characterizing RNA excited states using NMR relaxation dispersion. Methods Enzymol 2015, 558, 39–73. [DOI] [PMC free article] [PubMed] [Google Scholar]
- (23).Rangadurai A; Szymaski ES; Kimsey IJ; Shi H; Al-Hashimi HM Characterizing micro-to-millisecond chemical exchange in nucleic acids using off-resonance R1ρ relaxation dispersion. Prog. Nucl. Magn. Reson. Spectrosc 2019, 112–113, 55–102. [DOI] [PMC free article] [PubMed] [Google Scholar]
- (24).Vallurupalli P; Bouvignies G; Kay LE Increasing the exchange time-scale that can be probed by CPMG relaxation dispersion NMR. J. Phys. Chem. B 2011, 115, 14891–14900. [DOI] [PubMed] [Google Scholar]
- (25).Strebitzer E; Nußbaumer F; Kremser J; Tollinger M; Kreutz C Studying sparsely populated conformational states in RNA combining chemical synthesis and solution NMR spectroscopy. Methods 2018, 148, 39–47. [DOI] [PubMed] [Google Scholar]
- (26).Zhao B; Hansen AL; Zhang Q Characterizing slow chemical exchange in nucleic acids by carbon CEST and low spin-lock field R1ρ NMR spectroscopy. J. Am. Chem. Soc 2014, 136, 20–23. [DOI] [PMC free article] [PubMed] [Google Scholar]
- (27).Herschlag D; Allred BE; Gowrishankar S From static to dynamic: the need for structural ensembles and a predictive model of RNA folding and function. Curr. Opin. Struct. Biol 2015, 30, 125–133. [DOI] [PMC free article] [PubMed] [Google Scholar]
- (28).Thomas JR; Hergenrother PJ Targeting RNA with small molecules. Chem. Rev 2008, 108, 1171–1224. [DOI] [PubMed] [Google Scholar]
- (29).Connelly CM; Moon MH; Schneekloth JS The emerging role of RNA as a therapeutic target for small molecules. Cell Chemical Biology 2016, 23, 1077–1090. [DOI] [PMC free article] [PubMed] [Google Scholar]
- (30).Ganser LR; Lee J; Rangadurai A; Merriman DK; Kelly ML; Kansal AD; Sathyamoorthy B; Al-Hashimi HM High-performance virtual screening by targeting a high-resolution RNA dynamic ensemble. Nat. Struct. Mol. Biol 2018, 25, 425–434. [DOI] [PMC free article] [PubMed] [Google Scholar]
- (31).Liang JC; Bloom RJ; Smolke CD Engineering biological systems with synthetic RNA molecules. Mol. Cell 2011, 43, 915–926. [DOI] [PMC free article] [PubMed] [Google Scholar]
- (32).Chang AL; Wolf JJ; Smolke CD Synthetic RNA switches as a tool for temporal and spatial control over gene expression. Curr. Opin. Biotechnol 2012, 23, 679–688. [DOI] [PMC free article] [PubMed] [Google Scholar]
- (33).Berens C; Groher F; Suess B RNA aptamers as genetic control devices: the potential of riboswitches as synthetic elements for regulating gene expression. Biotechnol. J 2015, 10, 246–257. [DOI] [PubMed] [Google Scholar]
- (34).Henkels CH; Kurz JC; Fierke CA; Oas TG Linked folding and anion binding of the Bacillus subtilis ribonuclease P protein. Biochemistry 2001, 40, 2777–2789. [DOI] [PubMed] [Google Scholar]
- (35).Reining A; Nozinovic S; Schlepckow K; Buhr F; Fürtig B; Schwalbe H Three-state mechanism couples ligand and temperature sensing in riboswitches. Nature 2013, 499, 355–359. [DOI] [PubMed] [Google Scholar]
- (36).Daniels KG; Tonthat NK; McClure DR; Chang Y-C; Liu X; Schumacher MA; Fierke CA; Schmidler SC; Oas TG Ligand concentration regulates the pathways of coupled protein folding and binding. J. Am. Chem. Soc 2014, 136, 822–825. [DOI] [PMC free article] [PubMed] [Google Scholar]
- (37).Daniels KG; Suo Y; Oas TG Conformational kinetics reveals affinities of protein conformational states. Proc. Natl. Acad. Sci. U. S. A 2015, 112, 9352–9357. [DOI] [PMC free article] [PubMed] [Google Scholar]
- (38).Frankel EA; Strulson CA; Keating CD; Bevilacqua PC Cooperative interactions in the hammerhead ribozyme drive pKa shifting of G12 and its stacked base C17. Biochemistry 2017, 56, 2537–2548. [DOI] [PubMed] [Google Scholar]
- (39).Feng S; Holland EC HIV-1 Tat trans-activation requires the loop sequence within TAR. Nature 1988, 334, 165–7. [DOI] [PubMed] [Google Scholar]
- (40).Garcia JA; Harrich D; Soultanakis E; Wu F; Mitsuyasu R; Gaynor RB Human immunodeficiency virus type 1 LTR TATA and TAR region sequences required for transcriptional regulation. EMBO J 1989, 8, 765–78. [DOI] [PMC free article] [PubMed] [Google Scholar]
- (41).Marciniak RA; Calnan BJ; Frankel AD; Sharp PA HIV-1 Tat protein trans-activates transcription in vitro. Cell 1990, 63, 791–802. [DOI] [PubMed] [Google Scholar]
- (42).Murchie AIH; Davis B; Isel C; Afshar M; Drysdale MJ; Bower J; Potter AJ; Starkey ID; Swarbrick TM; Mirza S; Prescott CD; Vaglio P; Aboul-ela F; Karn J Structure-based drug design targeting an inactive RNA conformation: exploiting the flexibility of HIV-1 TAR RNA. J. Mol. Biol 2004, 336, 625–638. [DOI] [PubMed] [Google Scholar]
- (43).Davis B; Afshar M; Varani G; Murchie AIH; Karn J; Lentzen G; Drysdale M; Bower J; Potter AJ; Starkey ID; Swarbrick T; Aboul-ela F Rational design of inhibitors of HIV-1 TAR RNA through the stabilisation of electrostatic “hot spots. J. Mol. Biol 2004, 336, 343–356. [DOI] [PubMed] [Google Scholar]
- (44).Davidson A; Patora-Komisarska K; Robinson JA; Varani G Essential structural requirements for specific recognition of HIV TAR RNA by peptide mimetics of Tat protein. Nucleic Acids Res 2011, 39, 248–256. [DOI] [PMC free article] [PubMed] [Google Scholar]
- (45).Aboul-ela F; Karn J; Varani G The structure of the human immunodeficiency virus type-1 TAR RNA reveals principles of RNA recognition by Tat protein. J. Mol. Biol 1995, 253, 313–332. [DOI] [PubMed] [Google Scholar]
- (46).Aboul-ela F; Karn J; Varani G Structure of HIV-1 TAR RNA in the absence of ligands reveals a novel conformation of the trinucleotide bulge. Nucleic Acids Res 1996, 24, 3974–3981. [DOI] [PMC free article] [PubMed] [Google Scholar]
- (47).Long KS; Crothers DM Characterization of the solution conformations of unbound and Tat peptide-bound forms of HIV-1 TAR RNA. Biochemistry 1999, 38, 10059–10069. [DOI] [PubMed] [Google Scholar]
- (48).Al-Hashimi HM; Gosser Y; Gorin A; Hu W; Majumdar A; Patel DJ Concerted motions in HIV-1 TAR RNA may allow access to bound state conformations: RNA dynamics from NMR residual dipolar couplings. J. Mol. Biol 2002, 315, 95–102. [DOI] [PubMed] [Google Scholar]
- (49).Puglisi JD; Chen L; Frankel AD; Williamson JR Role of RNA structure in arginine recognition of TAR RNA. Proc. Natl. Acad. Sci. U. S. A 1993, 90, 3680–3684. [DOI] [PMC free article] [PubMed] [Google Scholar]
- (50).Long KS; Crothers DM Interaction of human immunodeficiency virus type 1 Tat-derived peptides with TAR RNA. Biochemistry 1995, 34, 8885–95. [DOI] [PubMed] [Google Scholar]
- (51).Tao J; Chen L; Frankel AD Dissection of the proposed base triple in human immunodeficiency virus TAR RNA indicates the importance of the Hoogsteen interaction. Biochemistry 1997, 36, 3491–3495. [DOI] [PubMed] [Google Scholar]
- (52).Brodsky A NMR evidence for a base triple in the HIV-2 TAR C-G.C+ mutant- argininamide complex. Nucleic Acids Res 1998, 26, 1991–1995. [DOI] [PMC free article] [PubMed] [Google Scholar]
- (53).Pitt SW; Majumdar A; Serganov A; Patel DJ; Al-Hashimi HM Argininamide binding arrests global motions in HIV-1 TAR RNA: comparison with Mg2+-induced conformational stabilization. J. Mol. Biol 2004, 338, 7–16. [DOI] [PMC free article] [PubMed] [Google Scholar]
- (54).Ferner J; Suhartono M; Breitung S; Jonker HRA; Hennig M; Wöhnert J; Göbel M; Schwalbe H Structures of HIV TAR RNA-ligand complexes reveal higher binding stoichiometries. ChemBioChem 2009, 10, 1490–1494. [DOI] [PubMed] [Google Scholar]
- (55).Schneeberger EM; Breuker K Native top-down mass spectrometry of TAR RNA in complexes with a wild-type Tat peptide for binding site mapping. Angew. Chem., Int. Ed 2017, 56, 1254–1258. [DOI] [PMC free article] [PubMed] [Google Scholar]
- (56).Kumar S; Maiti S Effect of different arginine methylations on the thermodynamics of Tat peptide binding to HIV-1 TAR RNA. Biochimie 2013, 95, 1422–1431. [DOI] [PubMed] [Google Scholar]
- (57).Suryawanshi H; Sabharwal H; Maiti S Thermodynamics of peptide-RNA recognition: the binding of a tat peptide to TAR RNA. J. Phys. Chem. B 2010, 114, 11155–11163. [DOI] [PubMed] [Google Scholar]
- (58).Boudier C; Humbert N; Chaminade F; Chen Y; de Rocquigny H; Godet J; Mauffret O; Fosse P; Mely Y Dynamic interactions of the HIV-1 Tat with nucleic acids are critical for Tat activity in reverse transcription. Nucleic Acids Res 2014, 42, 1065–1078. [DOI] [PMC free article] [PubMed] [Google Scholar]
- (59).Tan R; Frankel AD Circular dichroism studies suggest that TAR RNA changes conformation upon specific binding of arginine or guanidine. Biochemistry 1992, 31, 10288–10294. [DOI] [PubMed] [Google Scholar]
- (60).Churcher MJ; Lamont C; Hamy F; Dingwall C; Green SM; Lowe AD; Butler PJG; Gait MJ; Karn J High affinity binding of TAR RNA by the human immunodeficiency virus type-1 tat protein requires base-pairs in the RNA stem and amino acid residues flanking the basic region. J. Mol. Biol 1993, 230, 90–110. [DOI] [PubMed] [Google Scholar]
- (61).Stelzer AC; Kratz JD; Zhang Q; Al-Hashimi HM RNA dynamics by design: biasing ensembles towards the ligand-bound state. Angew. Chem., Int. Ed 2010, 49, 5731–3. [DOI] [PMC free article] [PubMed] [Google Scholar]
- (62).Muniz L; Egloff S; Ughy B; Jady BE; Kiss T Controlling cellular P-TEFb activity by the HIV-1 transcriptional transactivator Tat. PLoS Pathog 2010, 6, No. e1001152. [DOI] [PMC free article] [PubMed] [Google Scholar]
- (63).Lee J; Dethoff EA; Al-Hashimi HM Invisible RNA state dynamically couples distant motifs. Proc. Natl. Acad. Sci. U. S. A 2014, 111, 9485–9490. [DOI] [PMC free article] [PubMed] [Google Scholar]
- (64).Dethoff EA; Hansen AL; Musselman C; Watt ED; Andricioaei I; Al-Hashimi HM Characterizing complex dynamics in the transactivation response element apical loop and motional correlations with the bulge by NMR, molecular dynamics, and mutagenesis. Biophys. J 2008, 95, 3906–3915. [DOI] [PMC free article] [PubMed] [Google Scholar]
- (65).Note that the absence of 2D curved trajectories does not prove single-site binding; however, the existence of curvature under fast exchange requires multi-site binding.
- (66).Arai M; Ferreon JC; Wright PE Quantitative analysis of multisite protein-ligand interactions by NMR: binding of intrinsically disordered p53 transactivation subdomains with the TAZ2 domain of CBP. J. Am. Chem. Soc 2012, 134, 3792–3803. [DOI] [PMC free article] [PubMed] [Google Scholar]
- (67).Quinternet M; Starck JP; Delsuc MA; Kieffer B Unraveling complex small-molecule binding mechanisms by using simple NMR spectroscopy. Chem. - Eur. J 2012, 18, 3969–74. [DOI] [PubMed] [Google Scholar]
- (68).Furukawa A; Konuma T; Yanaka S; Sugase K Quantitative analysis of protein-ligand interactions by NMR. Prog. Nucl. Magn. Reson. Spectrosc 2016, 96, 47–57. [DOI] [PubMed] [Google Scholar]
- (69).Freier SM; Albergo DD; Turner DH Solvent effects on the dynamics of (dG-dC)3. Biopolymers 1983, 22, 1107–31. [DOI] [PubMed] [Google Scholar]
- (70).Pieters JML; Mellema J-R; Van Den Elst H; Van Der Marel GA; Van Boom JH; Altona C Thermodynamics of the various forms of the dodecamer d(ATTACCGGTAAT) and of its constituent hexamers from proton NMR chemical shifts and UV melting curves: three-state and four-state thermodynamic models. Biopolymers 1989, 28, 717–740. [DOI] [PubMed] [Google Scholar]
- (71).Turner DL; Williams RJP 1H and 13C-NMR investigation of redox-state-dependent and temperature-depentent conformation changes in horse cytochrome c. Eur. J. Biochem 1993, 211, 555–562. [DOI] [PubMed] [Google Scholar]
- (72).Davies DB; Pahomov VI; Veselkov AN NMR determination of the conformational and drug binding properties of the DNA heptamer d(GpCpGpApApGpC) in aqueous solution. Nucleic Acids Res 1997, 25, 4523–31.
- (73).Onufriev A; Ullmann GM Decomposing complex cooperative ligand binding into simple components: connections between microscopic and macroscopic models. J. Phys. Chem. B 2004, 108, 11157–11169.
- (74).Berkhout B; Jeang K-T Trans activation of human immunodeficiency virus type 1 is sequence specific for both the single-stranded bulge and loop of the trans-acting-responsive hairpin: a quantitative analysis. J. Virol 1989, 63, 5501–5504.
- (75).Roy S; Delling U; Chen CH; Rosen CA; Sonenberg N A bulge structure in HIV-1 TAR RNA is required for Tat binding and Tat-mediated trans-activation. Genes Dev 1990, 4, 1365–1373.
- (76).Pham VV; Salguero C; Khan SN; Meagher JL; Brown WC; Humbert N; de Rocquigny H; Smith JL; D’Souza VM HIV-1 Tat interactions with cellular 7SK and viral TAR RNAs identifies dual structural mimicry. Nat. Commun 2018, 9, 4266.
- (77).Wang S-Q; Li H-X Bayesian inference based modelling for gene transcriptional dynamics by integrating multiple source of knowledge. BMC Syst. Biol 2012, 6, S3.
- (78).Thijssen B; Dijkstra TM; Heskes T; Wessels LF BCM: toolkit for Bayesian analysis of computational models using samplers. BMC Syst. Biol 2016, 10, 100.
- (79).Epstein M; Calderhead B; Girolami MA; Sivilotti LG Bayesian statistical inference in ion-channel models with exact missed event correction. Biophys. J 2016, 111, 333–348.
- (80).Saa PA; Nielsen LK Construction of feasible and accurate kinetic models of metabolism: a Bayesian approach. Sci. Rep 2016, 6, 29635.
- (81).Musselman C; Al-Hashimi HM; Andricioaei I iRED analysis of TAR RNA reveals motional coupling, long-range correlations, and a dynamical hinge. Biophys. J 2007, 93, 411–422.
- (82).Ippolito JA; Steitz TA A 1.3-Å resolution crystal structure of the HIV-1 trans-activation response region RNA stem reveals a metal ion-dependent bulge conformation. Proc. Natl. Acad. Sci. U. S. A 1998, 95, 9819–9824.
- (83).Olejniczak M; Gdaniec Z; Fischer A; Grabarkiewicz T; Bielecki Ł; Adamiak RW The bulge region of HIV-1 TAR RNA binds metal ions in solution. Nucleic Acids Res 2002, 30, 4241–4249.
- (84).Casiano-Negroni A; Sun X; Al-Hashimi HM Probing Na(+)-induced changes in the HIV-1 TAR conformational dynamics using NMR residual dipolar couplings: new insights into the role of counterions and electrostatic interactions in adaptive recognition. Biochemistry 2007, 46, 6525–35.
- (85).Merriman DK; Yuan J; Shi H; Majumdar A; Herschlag D; Al-Hashimi HM Increasing the length of poly-pyrimidine bulges broadens RNA conformational ensembles with minimal impact on stacking energetics. RNA 2018, 24, 1363–1376.
- (86).Zacharias M; Hagerman PJ The bend in RNA created by the trans-activation response element bulge of human immunodeficiency virus is straightened by arginine and by Tat-derived peptide. Proc. Natl. Acad. Sci. U. S. A 1995, 92, 6052–6056.
- (87).Merriman DK; Xue Y; Yang S; Kimsey IJ; Shakya A; Clay M; Al-Hashimi HM Shortening the HIV-1 TAR RNA bulge by a single nucleotide preserves motional modes over a broad range of time scales. Biochemistry 2016, 55, 4445–56.
- (88).Wang Y; Gong S; Wang Z; Zhang W The thermodynamics and kinetics of a nucleotide base pair. J. Chem. Phys 2016, 144, 115101.
- (89).Frank AT; Stelzer AC; Al-Hashimi HM; Andricioaei I Constructing RNA dynamical ensembles by combining MD and motionally decoupled NMR RDCs: new insights into RNA dynamics and adaptive ligand recognition. Nucleic Acids Res 2009, 37, 3670–3679.
- (90).Salmon L; Bascom G; Andricioaei I; Al-Hashimi HM A general method for constructing atomic-resolution RNA ensembles using NMR residual dipolar couplings: the basis for interhelical motions revealed. J. Am. Chem. Soc 2013, 135, 5457–5466.
- (91).Cherepanov AV; Glaubitz C; Schwalbe H High-resolution studies of uniformly 13C,15N-labeled RNA by solid-state NMR spectroscopy. Angew. Chem., Int. Ed 2010, 49, 4747–4750.
- (92).Clay MC; Ganser LR; Merriman DK; Al-Hashimi HM Resolving sugar puckers in RNA excited states exposes slow modes of repuckering dynamics. Nucleic Acids Res 2017, 45, No. e134.
- (93).Zhang Q; Sun X; Watt ED; Al-Hashimi HM Resolving the motional modes that code for RNA adaptation. Science 2006, 311, 653–6.
- (94).Delaglio F; Grzesiek S; Vuister GW; Zhu G; Pfeifer J; Bax A NMRPipe: a multidimensional spectral processing system based on UNIX pipes. J. Biomol. NMR 1995, 6, 277–293.
- (95).Lee W; Tonelli M; Markley JL NMRFAM-SPARKY: enhanced software for biomolecular NMR spectroscopy. Bioinformatics 2015, 31, 1325–7.
- (96).Wang AC; Bax A Minimizing the effects of radio-frequency heating in multidimensional NMR experiments. J. Biomol. NMR 1993, 3, 715–720.
- (97).Nickalls RWD A new approach to solving the cubic: Cardan’s solution revealed. Mathematical Gazette 1993, 77, 354.
- (98).Wang ZX; Jiang RF A novel two-site binding equation presented in terms of the total ligand concentration. FEBS Lett 1996, 392, 245–249.
- (99).Annis J; Miller BJ; Palmeri TJ Bayesian inference with Stan: a tutorial on adding custom distributions. Behav. Res. Methods 2017, 49, 863–886.
- (100).Carpenter B; Hoffman MD; Brubaker M; Lee D; Li P; Betancourt M The Stan math library: reverse-mode automatic differentiation in C++. arXiv:1509.07164 [cs.MS], arXiv Preprint, 2015. https://arxiv.org/abs/1509.07164.
- (101).Carpenter B; Gelman A; Hoffman MD; Lee D; Goodrich B; Betancourt M; Brubaker M; Guo J; Li P; Riddell A Stan: a probabilistic programming language. J. Statistical Software 2017, 76, 1–32.
- (102).Gelman A; Rubin DB Inference from iterative simulation using multiple sequences. Statistical Science 1992, 7, 457–472.
- (103).Brooks SP; Gelman A General methods for monitoring convergence of iterative simulations. J. Comput. Graphical Statistics 1998, 7, 434–455.
- (104).Sugase K; Dyson HJ; Wright PE Mechanism of coupled folding and binding of an intrinsically disordered protein. Nature 2007, 447, 1021–1025.
Associated Data