Abstract
RNA chemical probing experiments are broadly used to reveal the structure of RNA and to identify protein binding sites. This expands our understanding of biological processes governed by protein-RNA interactions and facilitates the identification of complex-inhibiting molecules. The reagents commonly used in chemical probing experiments are highly reactive, methylating or acylating flexible RNA nucleotides. Because of this high reactivity, the chemical probes can also react with nucleophilic amino acid side chains, subsequently affecting protein-RNA binding events. We combine molecular dynamics (MD) simulations, matrix-assisted laser desorption ionization mass spectrometry (MALDI-MS), nuclear magnetic resonance (NMR) experiments, and binding assays to demonstrate that commonly used RNA chemical probes react with protein amino acids and that this modification alters protein-RNA binding. We discuss the implications of this phenomenon for elucidating the protein-RNA interaction interface using chemical probing experiments.
Keywords: RNA structure, RNA binding protein, nuclear magnetic resonance (NMR), molecular dynamics, chemical probing, gel electrophoresis
RNA structure is best described as an ensemble of various conformations (1). The dynamic nature of RNA makes experimentally characterizing its structure challenging. Biochemical experiments such as selective 2′-hydroxyl acylation analyzed by primer extension (SHAPE), and dimethyl sulfate (DMS) chemical probing are used to determine the secondary structure of conformationally dynamic RNA molecules, as well as to identify tertiary structural contacts of RNAs both in vitro and in cells (2, 3, 4, 5, 6, 7, 8). This method has been used to elucidate the secondary structure of both small and large RNA molecules, from micro-RNAs to entire viral genomes (9, 10). Further, chemical probing is often used to identify protein and other ligand binding sites on RNA molecules: regions of the RNA that demonstrate different chemical reactivities in the presence of a binding partner versus in the absence of it are typically interpreted as binding sites (11, 12, 13).
The chemical probes used in SHAPE experiments react rapidly with the 2′-hydroxyl group of the ribose sugar of unpaired or everted nucleotides, covalently modifying the ribose (14). DMS functions similarly: it binds and methylates the N1 position of adenine and the N3 position of cytosine (15). At higher pH, uridines and guanosines are also susceptible to DMS methylation (16). The covalently modified RNA is then sequenced, using either reverse transcription-stop (RT-stop) or next-generation sequencing (RT-map) methods, which permits identification of the nucleotides that reacted with the chemical probes (9, 15). The reactivities are then used to improve the prediction of RNA structure with in silico folding approaches.
One of the benefits of chemical probing experiments is that they can be used to identify protein and other ligand binding sites in an RNA. For example, in a recent study, SHAPE was used to identify where two RNA-binding proteins (RBPs), hnRNP U and hnRNP L, bound to the precursor messenger RNA MALT1 to regulate how it is spliced (17). In another study, DMS probing was used to identify how proteins bind to increase the flexibility of the 7SK noncoding RNA, thereby increasing the accessibility of neighboring binding sites (18). Chemical probing has also been used to assess ligand-binding affinity when carried out at variable ligand concentrations (19, 20, 21). These experiments have enabled researchers to better understand a plethora of biological processes that rely on ligand-RNA interactions.
One aspect of using chemical probing to map protein/ligand interactions that has received little attention is the ability of the chemical probes to interact with the ligand, especially given the on-off rates of the RNA-ligand complex and the rate at which the chemical probes interact. Relative to an RNA-only control (Fig. 1A), when a protein ligand is present, four scenarios can occur (Fig. 1, B–E). First, binding of the protein to the RNA can prevent the probe from interacting with the RNA molecule (Fig. 1B). In the second scenario, the probe is bound to the RNA and prevents the recognition of the RNA by the protein (Fig. 1C). Most poorly understood is the ability of the chemical probes to interact with the protein (and/or the RNA), impeding complex formation (Fig. 1, D and E). Finally, combinations of these scenarios may occur. It is important to note that if chemical probing is to be used to investigate binding affinities, only one type of interaction, scenario B, is desired.
Figure 1.
Mechanisms of chemical probing experiments. A, chemical probes modify flexible or single-stranded nucleotides. When protein is present in solution: B, the protein can prevent binding by the probe; C, the probe can prevent binding of the protein; D, the probe can bind to the protein to prevent an interaction; E, the interaction site can be doubly modified, preventing interaction.
The lack of data for the interactions between reagents used for RNA chemical probing and proteins is remarkable considering the same reactive moieties forming the basis for RNA probing reagents have been employed to react with proteins repeatedly in the past. The isatoic anhydride scaffold has been used extensively to covalently tag proteins for fluorescence spectroscopy (22), biotinylation (23, 24), introduction of chemical handles for further reactions (25), and gel pre-labeling (26, 27). Initial studies had suggested that the nucleobase modifying DMS was suitable for use in the presence of proteins, and only N-Cyclohexyl-N′-(2-morpholinoethyl)carbodiimide methyl-p-toluenesulfonate (CMCT) needs to be avoided due to side reactions (28). However, a more recent study employed DMS-induced methylations to establish protein-protein interaction footprints, highlighting the ability of DMS to modify proteins as well as RNA (29).
Several proteinogenic amino acids possess nucleophilic functional groups in their side chains (Figs. 2 and S1). Some of these are known to be post-translationally modified by acylations (30) and methylations (31, 32). The chemical probing reagents are capable of reacting with side chain nucleophiles in a similar manner: isatoic anhydride and its derivatives have been reported to react with lysine (Lys) side chains (22, 23, 25). However, in line with the typical reactivity of the isatoic anhydride scaffold, chemical probing reagents like 1-methyl-7-nitroisatoic anhydride (1M7) and N-methylisatoic anhydride (NMIA) may also acylate the side chains of cysteine (Cys) as well as the hydroxyl groups of tyrosine (Tyr), serine (Ser), and threonine (Thr) (Fig. 2A) (33). The products of acylations of cysteine would be thioesters, which are known to be unstable. However, some thioesters, such as acetyl-CoA, are found in the cellular context. DMS-induced methylations have been reported for Lys, histidine (His), and glutamate (Glu) residues (29). By extension, Cys and aspartate (Asp) might also be susceptible (Fig. 2B). However, identifying potential modification sites and reactivities for these reagents is substantially more complex in a protein than it is for a peptide or monomer. The nucleophilicity of a side chain in the context of a protein is determined not just by the type of amino acid itself: neighboring amino acid residues can significantly alter a side chain's nucleophilicity. The catalytic triad is a typical example of an architecture enhancing the nucleophilicity of a specific residue (34). Accessibility for reagents is another major factor to consider when dealing with side chain modifications. Thus, the three-dimensional structure of a protein will have an influence on the reactivity of a specific residue when presented with a modifying reagent. In this sense, proteins may behave similarly to nucleic acids when presented with chemical probing reagents.
Figure 2.
Nucleophilic amino acids that react with RNA-modifying probes. A, 1M7 acylates Cys, His, Tyr, Ser, Thr, and Lys residues. B, DMS methylates Cys, His, Asp, and Glu residues.
In this study, we combine binding assays, NMR, MALDI-MS, and MD simulations to demonstrate that the chemical probes frequently used in RNA chemical probing experiments can bind to and react with nucleophilic amino acids that reside on the surface of proteins. Our results show that the covalent modification of amino acids by the chemical probing reagents alters the protein binding surface, and can obscure the accurate deduction of protein–RNA interactions. We discuss the implications of the chemical probes reacting with amino acids in interpreting protein-RNA binding events, thus highlighting the importance of further validating ligand-RNA interactions with alternative methods following chemical probing experiments.
Results
MALDI-MS reveals covalent modification of proteins
We used MALDI-TOF (matrix-assisted laser desorption and ionization-time of flight) mass spectrometry to determine whether commonly used RNA chemical probing reagents could covalently modify proteins. MALDI-MS is perfectly suited for the facile determination of intact protein molecular weights. As a model system, we used the KH3 domain of the heterogeneous nuclear ribonucleoprotein K (hnRNP K). hnRNP K possesses three RNA-binding KH domains and has several cellular roles, including regulation of dosage compensation through an interaction with XIST and regulation of mRNA splicing (35, 36).
We treated protein solutions of 50 to 200 μM with a final concentration of 18 mM 1M7, NMIA (N-methyl isatoic anhydride), BzCN (benzoyl cyanide), or 2A3 ((2-aminopyridin-3-yl)(1H-imidazol-1-yl)methanone), or with 20 mM DMS, and compared the molecular weight with that of the unmodified protein. These concentrations mimic the concentrations used in chemical probing experiments. The resulting mass spectra (Fig. 3A) clearly show multiple species with increased molecular weights for hnRNP K KH3 modified by 1M7, NMIA, and 2A3, corresponding to a complex mixture of singly, doubly, and multiply modified proteins. The observed increase in molecular weight is a result of covalent modification of the protein, not non-covalent adduct formation, since adding the 1M7 hydrolysis product to hnRNP K KH3 did not produce the same pattern in the mass spectrum (Fig. S2). The overall modification rates are substantial, given that the unmodified protein accounts for only a minor fraction in the mass spectra. We observed that 2A3 is a particularly powerful acylation reagent for the protein: while the unmodified protein still accounts for a major fraction for the other probes, it is almost undetectable for 18 mM 2A3 (Fig. 3A, bottom). At higher concentrations of 2A3 (36 mM instead of 18 mM), MALDI-MS detects even higher modification rates (Fig. 3B). We did not observe a similar concentration dependence for 1M7 (Fig. S3). In stark contrast, the observed m/z is not changed significantly after treatment with DMS and BzCN, indicating that these probes may not modify KH3 side chains (Fig. 3A). The reactivity of DMS, in the context of RNA chemical probing, is dependent on the pH of the reaction. At pH 8.0, all four nucleotides are susceptible to DMS (16). This inspired us to repeat the DMS reaction at pH 8.0 and compare it to a control conducted at pH 6.5 by MALDI-TOF (Fig. S4). However, the results were the same as at pH 6.5. This was surprising since DMS-induced methylations have been observed in the literature for other proteins (29).
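The ladder of singly, doubly, and multiply modified species described above can be anticipated with a short calculation. The following is a minimal sketch, not the authors' analysis: it assumes that each 1M7 acylation adds the reagent mass (C9H6N2O5) minus one CO2, that ions are singly protonated, and it uses a hypothetical intact protein mass of 10,000 Da purely for illustration.

```python
# Average atomic masses (g/mol), rounded to three decimals.
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "N": 14.007, "O": 15.999}
PROTON = 1.00728  # mass of a proton in Da


def formula_mass(formula):
    """Average mass of an elemental composition given as {element: count}."""
    return sum(ATOMIC_MASS[el] * n for el, n in formula.items())


# Assumption: 1M7 (C9H6N2O5) acylates a nucleophile with release of CO2,
# so each modification adds the reagent mass minus CO2.
ONE_M7 = {"C": 9, "H": 6, "N": 2, "O": 5}
CO2 = {"C": 1, "O": 2}
DELTA_1M7 = formula_mass(ONE_M7) - formula_mass(CO2)


def adduct_ladder(protein_mass, delta, max_mods, charge=1):
    """Expected m/z for 0..max_mods modifications of a protonated protein."""
    return [
        (protein_mass + n * delta + charge * PROTON) / charge
        for n in range(max_mods + 1)
    ]


if __name__ == "__main__":
    # Hypothetical protein mass; replace with the measured intact mass.
    for n, mz in enumerate(adduct_ladder(10000.0, DELTA_1M7, 3)):
        print(f"{n} modification(s): m/z {mz:.1f}")
```

Comparing such a predicted ladder with the observed peak spacing is one way to check that the mass increments are consistent with covalent acylation rather than non-covalent adducts.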
Figure 3.
MALDI MS of proteins treated with RNA chemical probing reagents. A, MALDI MS of hnRNP K KH3 (top) treated with 1M7, DMS, NMIA, BzCN, 2A3. Single, double and multiple modifications are observed for 1M7, NMIA, and 2A3. B, increasing the 2A3 concentration from 18 mM (top) to 36 mM (bottom) leads to heavily modified protein species with the unmodified variant no longer observed. C, MALDI MS of SHARP RRM1 (top) treated with 1M7 (middle) and DMS (bottom).
To ensure the observed covalent modifications are not a specific phenomenon observed in hnRNP K KH3, we treated the RNA recognition motif 1 (RRM1) domain of SMRT/HDAC1 Associated Repressor Protein (SHARP) with 1M7 and DMS as examples for nucleobase and 2′OH modifying reagents for comparison (Fig. 3C). SHARP possesses four RNA recognition motifs (RRMs) that have been demonstrated to interact with several RNAs including Xist and SRA to regulate gene expression (37, 38, 39). Again, 1M7 caused covalent modification of the protein, while DMS did not cause a significant increase in molecular weight.
MD simulations reveal 1M7 and DMS occupancy near nucleophilic amino acids
We turned to MD simulations to investigate the site selectivity of the covalent modification of the proteins. In addition, we aimed to understand why we did not observe clear signs of DMS-induced methylations. The simulations allow us to dissect the interaction between the chemical probing reagents and the protein, since the initial binding event of the chemical probe cannot be captured by experiments due to the subsequent reaction. As an example of a common nucleobase and 2′-OH modifying reagent, we simulated the interaction of 1M7 and DMS with both RRM1 of SHARP and the KH3 domain of hnRNP K. Both proteins possess surface and buried amino acids that could potentially interact with 1M7 and DMS.
Three 1.0 μs MD simulations of the RRM1 domain of SHARP, or the KH3 domain of hnRNP K, were performed with either 1M7 or DMS, to give a total of four different systems (SHARP + 1M7; SHARP + DMS; hnRNP K + 1M7; hnRNP K + DMS). To ensure the protein structures were stable in our simulations, we monitored the root mean square fluctuation (RMSF) of the protein backbone during the simulations with 1M7 (Fig. S5). The proteins do not deviate significantly from the reference structures in our simulations and exhibit the greatest flexibility in the N- and C-terminal residues. The simulations of SHARP RRM1 exhibit additional flexibility in the loop region, connecting β-strands 2 and 3 (residues P39-A48 in Fig. S5B). We conclude that both SHARP RRM1 and the KH3 domain of hnRNP K are stable in our MD simulations.
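As an illustration of the stability check described above, the per-atom RMSF can be computed from a set of aligned coordinate frames. This is a minimal sketch under stated assumptions: the frames are already superimposed on a reference structure, coordinates are supplied as plain Python tuples, and the function name `rmsf` is hypothetical rather than taken from the simulation package used in the study.

```python
import math


def rmsf(frames):
    """Root mean square fluctuation per atom.

    frames: list of frames, each a list of (x, y, z) tuples, one per atom.
    Assumes every frame has already been aligned to a common reference.
    """
    n_frames = len(frames)
    n_atoms = len(frames[0])
    result = []
    for i in range(n_atoms):
        # Time-averaged position of atom i.
        mean = [sum(f[i][k] for f in frames) / n_frames for k in range(3)]
        # Mean squared deviation from that average position.
        msd = sum(
            sum((f[i][k] - mean[k]) ** 2 for k in range(3)) for f in frames
        ) / n_frames
        result.append(math.sqrt(msd))
    return result


if __name__ == "__main__":
    # Two toy frames: atom 0 is static, atom 1 moves along x.
    toy = [[(0, 0, 0), (1, 0, 0)], [(0, 0, 0), (3, 0, 0)]]
    print(rmsf(toy))
```

High values at the first and last residues, and in flexible loops, would correspond to the terminal and loop flexibility reported for the two domains.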
To understand which amino acids the 1M7 or DMS molecules might associate with on the two proteins—and by extension, which amino acids may be accessible for covalent modification by these chemical probes—we monitored the occupancy of the two chemical probes around the proteins throughout the MD simulations. We calculated the radial distribution function (RDF) of 1M7 or DMS against potentially reactive amino acids, which measures the population of 1M7 or DMS at each distance from the designated amino acids. For both SHARP RRM1 and the KH3 domain of hnRNP K, the 1M7 molecules preferentially associate with aromatic residues (Phe, Tyr, Trp, and His) on the protein surfaces. In the simulations of SHARP RRM1 with 1M7, the 1M7 molecules form aromatic stacking interactions with His7, Trp9, His63, and Tyr78 (Fig. 4A), which are all potentially reactive towards 1M7 and may facilitate a reaction in vitro. In the MD simulations of hnRNP K KH3 with 1M7, the 1M7 molecules localize near Tyr75 and Tyr84, as well as solvent-exposed residues on helix-3 such as Gln71 and Leu76—which brings 1M7 molecules adjacent to Glu42 and Ser80 (Fig. 4B).
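The radial distribution function used above can be sketched as a normalized distance histogram. The example below is an assumption-laden illustration, not the authors' pipeline: it takes probe-to-residue distances pooled over all frames, normalizes each bin by the spherical shell volume and the bulk probe density, and ignores periodic-boundary and excluded-volume corrections. All parameter names are hypothetical.

```python
import math


def rdf(distances, n_frames, n_probes, box_volume, r_max, n_bins):
    """Radial distribution function g(r) of probe molecules around one site.

    distances: probe-to-site distances pooled over all frames.
    n_probes / box_volume is taken as the bulk probe density.
    Returns (bin_centers, g).
    """
    dr = r_max / n_bins
    counts = [0] * n_bins
    for d in distances:
        if 0 <= d < r_max:
            # Guard against floating-point rounding at the upper edge.
            counts[min(int(d / dr), n_bins - 1)] += 1
    density = n_probes / box_volume
    centers, g = [], []
    for b, c in enumerate(counts):
        r_lo, r_hi = b * dr, (b + 1) * dr
        shell = 4.0 / 3.0 * math.pi * (r_hi ** 3 - r_lo ** 3)
        centers.append((b + 0.5) * dr)
        # Observed count divided by the count expected for an ideal gas.
        g.append(c / (n_frames * density * shell))
    return centers, g


if __name__ == "__main__":
    centers, g = rdf([0.5, 0.5, 1.5], n_frames=1, n_probes=1,
                     box_volume=100.0, r_max=2.0, n_bins=2)
    for r, value in zip(centers, g):
        print(f"r = {r:.2f}: g(r) = {value:.2f}")
```

A peak in g(r) at short distance from a given residue indicates that the probe is enriched there relative to bulk solvent, which is how preferred association sites can be read from such plots.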
Figure 4.
Analysis of the MD simulations of the RRM domain of SHARP, and the KH3 domain of hnRNP K with 1M7 and DMS. A, simulation structure of the RRM1 domain of SHARP (blue) with 1M7, and contours representing regions of high 1M7 density (magenta). B, simulation structure of the third KH domain of hnRNP K (green) with 1M7, and contours representing the regions of high 1M7 density (blue). C, simulation structure of the RRM1 domain of SHARP (blue) with DMS, and contours representing regions of high DMS density (green). D, simulation structure of the third KH domain of hnRNP K (green) and contours representing regions of high DMS density (blue). The plots represent the radial distribution function (RDF) of DMS or 1M7 to specific amino acids, averaged across the three simulation trials and reported with standard deviations.
DMS is smaller than 1M7, is not aromatic, and does not appear to associate with the protein surfaces with as long of a lifetime as 1M7 does in our MD simulations (Table S1). Instead, DMS is observed to transiently associate with polar or aliphatic residues on the protein surfaces. In the simulations of SHARP RRM1 with DMS, the DMS molecules localize near Tyr29, Asn64, and Asp77, which brings them adjacent to the potentially reactive amino acids His7, His25, and His63 (Fig. 4C). In the simulations of the KH3 domain of hnRNP K, the DMS molecules associate with Gln38, Leu76, and occasionally Gln78, bringing them adjacent to His41 and Tyr75 (Fig. 4D).
In summary, in both sets of simulations containing 1M7, with either SHARP or hnRNP K, the 1M7 molecules associate near potentially catalytically active amino acids with lifetimes of 50 to 100 ns (Table S1). Conversely, in our simulations of both proteins, the DMS molecules appear to interact with aliphatic or polar amino acids on the protein surfaces with much shorter lifetimes, binding and unbinding on the order of 10 to 30 ns. Thus, the DMS molecules exhibit a more diffuse occupancy cloud around the protein than 1M7, as reflected by the greater average distance of DMS probes to KH3 compared with 1M7 (Fig. S6). The more transient binding interactions between DMS and the two proteins seen in our MD simulations offer a potential rationale for why no clear signs of DMS-induced methylation were observed for either protein.
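One simple way to estimate association lifetimes of the kind summarized above is to measure the length of contiguous stretches during which a probe stays within a contact cutoff of the protein. The sketch below assumes a per-frame boolean contact series and a fixed frame spacing; the source does not state how the lifetimes in Table S1 were computed, so this is an illustrative approach only, with hypothetical function names.

```python
def residence_times(in_contact, dt_ns):
    """Durations (ns) of contiguous bound stretches in a contact series.

    in_contact: sequence of booleans, one per frame (True = probe in contact).
    dt_ns: time between frames in nanoseconds.
    Note: a stretch still bound at the last frame is counted as-is,
    which underestimates its true duration.
    """
    times = []
    run = 0
    for bound in in_contact:
        if bound:
            run += 1
        elif run:
            times.append(run * dt_ns)
            run = 0
    if run:
        times.append(run * dt_ns)
    return times


def mean_residence_time(in_contact, dt_ns):
    """Average bound duration, or 0.0 if the probe never binds."""
    times = residence_times(in_contact, dt_ns)
    return sum(times) / len(times) if times else 0.0


if __name__ == "__main__":
    series = [False, True, True, False, True, True, True, True]
    print(residence_times(series, dt_ns=10.0))
    print(mean_residence_time(series, dt_ns=10.0))
```

Applied per probe molecule and averaged over replicate trajectories, such a measure allows a direct comparison of long-lived and transient binders.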
Using these two cases as models for chemical probes that are reactive and unreactive towards proteins, we sought to explain why BzCN is inert toward the protein. Because 1M7, NMIA, and 2A3 are all aromatic activated carboxylic acids, we expected a similar reactivity towards the protein for BzCN, which is also the most reactive chemical probe in the context of SHAPE experiments. However, as our MS data show, BzCN does not introduce covalent modifications to the protein. We expanded our MD study with simulations of BzCN and hnRNP K KH3 and compared them to the results with 1M7 and DMS. Interestingly, the aromatic BzCN associates with similar regions of KH3 as 1M7 (Fig. S7). However, the average occupancy time is 38.6 ns, which is significantly shorter than that of 1M7 and more closely resembles the behavior of DMS (Table S1). This more diffuse binding could offer a rationale for why BzCN does not introduce covalent modifications in our experiments.
NMR spectroscopy sheds light on modification sites in hnRNP K KH3
We used NMR to experimentally validate the interaction of 1M7 and DMS with the hnRNP K KH3 protein. As a control, we first carried out a 2D 1H 15N HMQC on hnRNP K KH3 and observed good dispersion of amide resonance peaks, indicating a properly folded protein. We then titrated a 15-nucleotide-long polyC RNA oligonucleotide (its known target) into hnRNP K KH3 and observed line broadening and chemical shift perturbations (CSP) for several residues (Fig. 5A) (40). Then, we recorded NH backbone spectra for hnRNP K KH3 treated with 1M7 and DMS (Fig. 5, B and C).
Figure 5.
Backbone HMQC spectra recorded at 298 K demonstrate binding of hnRNP K KH3 by RNA and interactions with chemical probes. Assignments refer to apo protein since CSPs and line broadening impede transfer of the assignments to bound/modified protein for many residues. Overlays show hnRNP K KH3 backbone spectrum (green) upon addition of (A) 1.0 eq. of 15 nt polyC RNA, (B) 12.5 mM 1M7, and (C) 20 mM DMS. D, residues severely affected by 1M7 (orange) mapped onto the X-ray structure of hnRNP K KH3. A stacking interaction between Arg35 and Tyr84 potentially catalyzes the reaction. The full structure is shown in Figure S7. E, residues of hnRNP K KH3 affected by treatment with 1M7 identified by MD simulations (top) and NMR spectroscopy (bottom).
Modification of the protein with 1M7 results in substantial changes in the observed NMR spectra (Fig. 5B). While the presence of several distinct resonances and overall good amide backbone distribution indicates a properly folded protein, almost all resonances are affected by the modification. Most backbone resonances experience some degree of line broadening or CSP; some are broadened to undetectability or escape assignment by severe chemical shift perturbations. We mapped the most severely affected resonances on the published crystal structure of hnRNP K KH3 to elucidate potential modification sites (Figs. 5, D, E and S8) (40). These are in good agreement with the association sites identified by MD simulations. Our simulations showed a preference for occupancy at helix three and its aromatic amino acid side chains. Some of the protein backbone NH resonances most affected by 1M7 are also found in this region. Additionally, we observe occupancy at helix two, wherein 1M7 molecules associate with a region including residues Gln34, Arg35, Ile36, Gln38, and Ile39. This overlaps nicely with a second cluster of NH backbone resonances affected by 1M7.
As shown, many of the residues most affected by 1M7 cluster in defined regions of the protein, suggesting a degree of systematic modification by the 1M7 chemical probe. One cluster in particular caught our attention: a multitude of amino acid residues (Gly30, Gln34, Lys37, Gln38, Val81, Gln83, and Ser85) strongly affected by 1M7 are found in close proximity to Arg35 and Tyr84 (Fig. 4B). Our simulations already suggest a preferred binding site for the 1M7 reagent near this site due to the aromatic nature of the tyrosine side chain. Interestingly, the crystal structure shows an interaction between Tyr84 and Arg35. Considering the reaction mechanism by which 1M7 modifies biomolecules, this tyrosine-arginine motif should be highly susceptible to 1M7. We speculate that the positively charged arginine side chain stabilizes a transiently deprotonated tyrosine. The resulting phenolic anion is highly nucleophilic and can attack 1M7. Catalysts promoting deprotonation have been shown to enhance the reactivity of target RNA residues for 1M7 in a similar manner (14). To elucidate additional modification sites, we turned to MALDI-MS of the tryptic peptides derived from the 1M7-modified KH3 domain protein (Fig. S9, Table S2). Interestingly, proteins bearing 1M7 modifications yield more peptide fragments featuring missed cleavage sites. This indicates that 1M7-modified R/K residues are no longer recognized by trypsin. Hence, the missed cleavage sites are likely to be the ones bearing the modifications. Using this approach, we identified Lys31 and Lys48 as additional modification sites. Lys37 is another candidate overlapping with our NMR and MD results but lacks definitive assignment in the mass spectra due to close proximity to another candidate peptide (Fig. S9, Table S2). Notably, we did not observe some of the unmodified peptides expected from our reference experiment. We attribute this to the purification procedure, which uses C18 spin columns and therefore favors binding of modified peptides bearing the non-polar 1M7 moiety. In the future, we plan to use top-down MS/MS techniques, which avoid this bias, to investigate the modification sites in more detail.
Most notably, a significant overlap exists between residues involved in the binding of polyC RNA to hnRNP K KH3 and residues affected by the 1M7 chemical probe. This suggests that such modifications can potentially impede important binding interactions (Fig. 1, D and E), thus reducing affinity. We attempted to titrate a polyC oligonucleotide into the 1M7-modified protein to study the impact on binding residues by NMR; however, the extensive line broadening made it difficult to obtain unambiguous data (Fig. S10). Treatment with 1M7 causes line broadening and shifts, leading to a complex apo spectrum. Shifts can be observed upon the addition of DNA, but due to a lack of assignments and line broadening, no clear conclusions can be drawn from the spectra.
In line with our MALDI-MS data on the intact protein, treatment with DMS results in only minor changes of linewidths in the NMR spectra (Fig. 5C). When we incubated DMS-treated protein with trypsin, we largely observed the same peptides as for the unmodified KH3 domain (Fig. S9). However, we observed one unknown peptide at m/z 2130. We speculate that this peptide could be a missed cleavage peptide or other non-tryptic peptide bearing methyl modifications. It has been shown in the literature that trypsin has varying activities depending on the number of methyl groups the lysine residue is bearing (41). Whether the peptide is in fact bearing methyl modifications cannot be determined definitively based on our data. However, these data show that treatment with DMS has some effect on hnRNP K KH3, which we aim to investigate in the future.
Reducing agents are not a solution to undesired protein modifications
We were interested in potential solutions to the undesired side reactions of chemical probing in the presence of proteins. In our study, the use of DTT in protein buffers mitigates, if only partially, the modification of hnRNP K KH3 by 1M7 (Fig. S11). However, the reason for this is the quenching of the chemical probing reaction by DTT (i.e., acylation of DTT instead of the protein). This process is sometimes even used by design to terminate SHAPE reactions (42). Consequently, DTT does not specifically inhibit or reverse the reaction of the probing reagents with proteins but rather with all susceptible biomolecules, thus also impeding the desired reactivity. The same applies to all commonly used reducing agents.
Chemical probes impede RNA binding of WDR5 protein
Considering that chemical probing experiments are typically carried out on RNAs longer than the short oligonucleotides recognized by the RBPs described in the aforementioned sections, we turned our attention to a system that could more accurately reflect the implications of protein-modification by the chemical probing reagents. We chose the interaction between the protein WD repeat-containing protein 5 (WDR5) and the long noncoding RNA (lncRNA) UMLILO, which has been previously characterized (43).
Based on our previous results, we chose 1M7 as an exemplary SHAPE reagent. As before, we used MALDI-MS to investigate the covalent modification of WDR5 by 1M7 (Fig. 6A). Given the substantially larger molecular weight of this protein, the resolution of the MALDI mass spectra does not permit the observation of distinct modification patterns seen for hnRNP K KH3, and SHARP RRM1 (Fig. 3). However, an increased average molecular weight is observed for WDR5 treated with 3 mM 1M7. The amount of 1M7 was reduced because our standard procedure (18 mM) did not yield sufficient quality data (Fig. S12). Based on our observations on hnRNP K KH3 and SHARP RRM1, we attribute this increase in molecular weight to covalent modification of the protein by 1M7. We did notice the presence of an apparent dimer peak in the treated WDR5 sample (Fig. 6A, bottom), but further research is required to pinpoint the cause of this observation in response to treatment with chemical probes.
Figure 6.
Treatment with chemical probes perturbs RNA binding of WDR5. A, MALDI-MS shows an increase in molecular weight of WDR5 treated with 1M7. Potential dimer formation of modified protein is highlighted with an asterisk. B, radial distribution plot of 1M7 against specific residues of WDR5, averaged across three simulation trials and reported with standard deviations. C, preferred 1M7 occupancy sites (orange) cluster on the structure of WDR5 (yellow). D, filter dot-blot assay of WDR5 and lncRNA UMLILO. WDR5 loses its binding ability to the target RNA when treated with 1M7. Errors from triplicate experiments shown in Figure S13. E, filter dot-blot assay of pre-formed UMLILO-WDR5 complex treated with DMSO or 1M7. Plot of median and standard deviation error bars from triplicate experiments shown in Figure S14.
We then performed MD simulations of WDR5 in the presence of 1M7 to investigate sites with a propensity for 1M7-induced modifications. 1M7 molecules localize in two distinct regions with high occupancy, preferentially interacting with aromatic surface residues of WDR5—indicating which residues may exhibit a greater propensity for covalent modification. Specifically, we observe a strong tendency for 1M7 to localize near an area of WDR5 enriched in aromatic residues: Tyr131, Phe133, Phe149, Pro152, Tyr170, Phe219, and Tyr260—which frequently positions 1M7 molecules near the potentially nucleophilic Cys240 (Fig. 6B). Additionally, we observe a significant localization of 1M7 molecules near a known RNA binding site of WDR5, which harbors the potentially reactive residues Tyr228 and Tyr243 (44). The association of 1M7 in this RNA binding region encouraged us to investigate whether the introduction of a chemical probe could inhibit the protein’s interaction with RNA, as it would in a chemical probing experiment carried out in the presence of protein.
To this end, we performed protein-RNA binding assays with WDR5 and UMLILO. Indeed, our filter dot blot data clearly shows how 1M7 modified WDR5 loses its binding competence for the lncRNA UMLILO. Unmodified WDR5 binds UMLILO with 4.4 μM affinity (Fig. 6D). When treated with 1M7, however, the binding affinity is greatly reduced (Figs. 6D, S13) to a point where the KD could not be determined within the concentration range of the filter dot blot (>25 μM). The same behavior can be observed by electrophoretic mobility shift assays (Fig. S13). This demonstrates that chemical probes can introduce modifications to proteins that impede their binding to target RNAs.
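For readers less familiar with how a dissociation constant is extracted from filter-binding data, the following sketch shows a single-site binding isotherm and a coarse grid-search fit. It rests on assumptions not stated in the source: one protein binds one RNA, the RNA concentration is low relative to KD so that free protein approximates total protein, and the data points are synthetic. The function names are hypothetical and the fitting model actually used in the study may differ.

```python
def fraction_bound(protein_uM, kd_uM):
    """Single-site isotherm: fraction of RNA bound at a given protein level.

    Assumes the RNA is at trace concentration relative to KD.
    """
    return protein_uM / (kd_uM + protein_uM)


def fit_kd(protein_uM, observed, kd_grid):
    """Return the KD from kd_grid that minimizes the sum of squared errors."""
    def sse(kd):
        return sum(
            (fraction_bound(p, kd) - y) ** 2
            for p, y in zip(protein_uM, observed)
        )
    return min(kd_grid, key=sse)


if __name__ == "__main__":
    # Synthetic, noise-free data generated with KD = 4.4 uM for illustration.
    concentrations = [0.5, 1.0, 2.0, 5.0, 10.0, 25.0]
    synthetic = [fraction_bound(p, 4.4) for p in concentrations]
    grid = [0.1 * i for i in range(1, 501)]
    print(f"best-fit KD: {fit_kd(concentrations, synthetic, grid):.1f} uM")
```

When binding is weak, the measured fractions remain far from saturation across the whole titration range and the fitted KD is poorly constrained, which is why only a lower bound can be reported in such cases.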
Next, we tried to mimic the procedure of SHAPE experiments in the presence of proteins as closely as possible: establishing an UMLILO–WDR5 complex, followed by treatment of the complex with the 1M7 chemical probe. We combined 300 nM UMLILO RNA and 15 μM protein and incubated the complex for 20 min. Under these conditions, unmodified WDR5 fully saturates the RNA as shown by our EMSA and dot blot data (Fig. S13). We treated the complex with 1M7 (or DMSO as a control) and incubated for another 20 min before running a filter dot blot assay. Treatment of the UMLILO-WDR5 complex with 1M7 results in dissociation of the complex, indicated by the large fraction of free RNA on the nylon membrane. Signs of the breakdown of the complex can also be observed by native gel electrophoresis (Fig. S14). In summary, our results demonstrate that the protein is likely subject to modifications, which occur within timescales that encompass both the on-off rate of complex formation and the rate at which RNA is modified. This modification can inhibit the protein's interaction with the RNA even when the complex is pre-formed. Further research is required to identify the kinetics of WDR5 complex formation and its link to modification.
Discussion
The experimental design of chemical probing against RNA-protein complexes generally involves the formation of the RNA-protein complex of interest to protect protein-binding nucleotides before treatment with chemical probing reagents. Using MALDI-MS, NMR, and binding assays complemented by MD simulations, we established that RNA chemical probes are capable of selectively modifying proteins and interfering with their binding to target RNAs. We posit that this modification is possible given the lifetime of the RNA-protein complex: if the lifetime is sufficiently short (i.e., shorter than the half-life of the chemical probing reagent), the free protein can be modified by the chemical probing reagent, potentially causing a decrease in affinity over time or obscuring accurate deduction of binding affinities in general. The most commonly used chemical probing reagents, with the exception of BzCN, have half-lives that are orders of magnitude longer than the lifetimes of many RNA-protein complexes, which have been reported to range widely between 0.1 s and 55 h (45, 46, 47).
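The timescale argument above can be made concrete with two first-order expressions. This is a minimal sketch assuming exponential decay of the reagent and exponentially distributed complex lifetimes; the 10 s reagent half-life is a hypothetical placeholder, while 0.1 s and 55 h are the bounds of the complex lifetime range quoted above.

```python
import math


def reagent_remaining(t_s, half_life_s):
    """Fraction of active reagent left after t_s seconds (first-order decay)."""
    return 0.5 ** (t_s / half_life_s)


def dissociated_at_least_once(t_s, complex_lifetime_s):
    """Probability that a complex has dissociated at least once within t_s.

    Assumes exponentially distributed lifetimes with the given mean.
    """
    return 1.0 - math.exp(-t_s / complex_lifetime_s)


if __name__ == "__main__":
    half_life = 10.0  # hypothetical reagent half-life in seconds
    window = half_life  # look at one reagent half-life
    for label, lifetime in (("0.1 s", 0.1), ("55 h", 55 * 3600.0)):
        p = dissociated_at_least_once(window, lifetime)
        left = reagent_remaining(window, half_life)
        print(f"complex lifetime {label}: P(dissociated) = {p:.2e}, "
              f"reagent remaining = {left:.2f}")
```

Under these assumptions, a short-lived complex is almost certain to release free protein while half of the reagent is still active, whereas a very long-lived complex rarely does so within the same window.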
In this study, we investigated a series of chemical probes and their reactivity with proteins: 2A3, BzCN, NMIA, 1M7, and DMS. While covalent modification of the proteins by 2A3, NMIA, and 1M7 is supported by our experimental results, we did not observe strong modification patterns with BzCN or DMS. The lack of reactivity of these probes for the proteins in our study may be due to a more diffuse occupancy and a faster dissociation rate, as seen in our MD simulations with DMS and BzCN, when compared to 1M7. Extending this idea, we hypothesize that non-covalent binding of the chemical probes to the protein surface influences the preferred modification sites but is not the primary driving force behind the chemical modification reaction (i.e., not all of the binding events lead to chemical modification). This also offers a rationale for why DMS-induced methylations were only observed in some of the previous studies mentioning the effect of DMS on proteins (28, 29), and highlights how a protein’s 3D structure is a critical factor for the susceptibility to chemical probing reagents. We speculate that chemical probes could potentially be used as a simple means to reveal nucleophilically reactive and accessible amino acid side chains in proteins. This technique would, similar to chemical probing of RNA, involve treating a protein of interest with varying concentrations of different reagents. The introduced modifications can then be identified by a method of choice, such as high-resolution MS/MS or NMR, revealing the reactive amino acids within the protein. Similar techniques have been successfully employed to investigate properties such as hydrophobic pockets as well as structure-function relationships of proteins (48, 49).
Our UMLILO-WDR5 results demonstrate that chemical probing experiments with protein-modifying probes like 1M7 should be used with caution because they can interfere with complex formation. Furthermore, chemical probing experiments cannot be reliably used to determine binding affinities of RNA-protein complexes without further validation using orthogonal measures. Interestingly, we found a pronounced concentration dependence of the 2A3 modification rate (Fig. 3B), which we did not observe for 1M7 (Fig. S3). Thus, the ability of a chemical probe to modify the protein relates to an intricate balance between sufficient chemical reactivity toward amino acid side chains and a half-life long enough to access reactive sites in the protein. Keeping the concentration of the chemical probes as low as possible will also reduce the rate of undesired side reactions. We suggest that, in the future, each chemical probing reagent be tested for reactivity with protein side chains before use with RNA-protein complexes.
While not explicitly observed in this study, the possibility of non-covalent binding of chemical probing reagents and their hydrolysis products perturbing RNA binding cannot be ruled out based on our data. The potential impact of non-covalent binding largely depends on the chemical nature of the compounds. However, we expect the hydrolysis products of chemical probing reagents to be more influential than non-covalent binding in disrupting RNA-protein interactions since the supply of chemical probe will deplete over time. DMS hydrolysis yields sulfuric acid and methanol via monomethyl sulfate (50, 51). Both compounds, given sufficient buffer capacity, should not impede binding events between proteins and nucleic acids. 1M7 forms 2-(methylamino)-4-nitrobenzoic acid when reacting with water (52). This small molecule still possesses a variety of functional groups that may potentially interact with the protein, such as a carboxylic acid and an amine group. Thus, we conclude that non-covalent binding of small-molecule side products of a SHAPE reaction can still be relevant in some cases.
In summary, we demonstrate that chemical probes that are used to identify single-stranded and flexible nucleotides can also interact with amino acids. Many protein–RNA interactions occur over time scales from sub-seconds to hours (45). Considering that the half-lives of chemical probing reagents fall in the same range, the probe can interact with the protein, potentially inhibiting its binding (42). Critically, the reaction of a chemical probe with amino acid residues that are known to facilitate interactions with RNA (e.g., tyrosine, lysine) can diminish the binding interaction. On a similar note, RNA-binding molecules (e.g., therapeutic agents) can also be modified by a chemical probe, which can inhibit their interaction with the RNA target. As has been observed in multiple studies, even at sites of ligand binding, nucleotides can still possess chemical probe reactivity (53, 54, 55, 56). Thus, the use of additional orthogonal approaches is mandatory to confirm binding sites identified by chemical probing experiments and to avoid any potential bias introduced by side reactions. As a silver lining, RNA-modifying probes could serve as a useful tool to identify surface residues of proteins and could potentially be used as an alternative to introducing mutations to amino acids on the RNA-binding interface. Future studies will investigate the factors that govern the reactivity of protein amino acids with chemical probing reagents, and the mechanisms by which they perturb RNA-protein interactions.
Experimental procedures
Protein expression
Plasmids encoding SHARP RRM1 (9.5 kDa) and the KH3 domain of hnRNP K (9.0 kDa), each preceded by an N-terminal 6× histidine tag (his-tag) and a tobacco etch virus (TEV) cleavage site, were transformed into BL21 DE3 E. coli chemically competent cells and expressed in LB or M9 minimal media solution supplemented with 15N NH4Cl. Cells were grown at 37 °C; upon reaching an optical density (OD600) of 0.9, they were induced to express protein with 0.5 mM isopropyl β-D-1-thiogalactopyranoside (IPTG) at 18 °C. Cells were lysed by sonication, and the protein was purified using immobilized metal (nickel) affinity chromatography (IMAC). Briefly, the protein was washed in a buffer containing 50 mM Tris pH 8, 300 mM NaCl, 5 mM beta-mercaptoethanol, and 10 mM imidazole. An on-column cleavage of the his-tag was performed by adding 10 mg/ml of TEV protease. The flow through was assessed by sodium dodecyl-sulfate (SDS) polyacrylamide gel electrophoresis (PAGE) followed by size exclusion chromatography (HiLoad 16/60 Superdex 75) in a buffer containing 150 mM NaCl, 5 mM dithiothreitol (DTT), and 25 mM sodium phosphate, pH 6.5. Purified fractions were assessed by SDS PAGE, followed by flash freezing in liquid nitrogen and storage at −80 °C until further use.
Plasmid pGEX-4T1-WDR5 was a gift from Debu Chakravarti, obtained from Addgene (Addgene plasmid # 59969; http://n2t.net/addgene:59969; RRID:Addgene_59969) (57). The sequence encoding for WDR5 was amplified by PCR and ligated into pETM11 with an N-terminal 6x-his tag and TEV cleavage site. The plasmid was transformed into BL21 DE3 E. coli competent cells. LB was inoculated with an overnight culture and incubated at 37 °C until OD600 reached 0.8. Expression was induced with 1 mM IPTG at 18 °C overnight. Cells were lysed by sonication in a buffer containing 25 mM Tris, pH 7.5, 300 mM NaCl, and 5 mM DTT, then purified by IMAC, eluted with 300 mM imidazole, followed by dialysis and cleavage with TEV protease. Protein was further purified by size exclusion chromatography on a Superdex S75 column in a buffer containing 25 mM sodium phosphate, pH 7.5, 150 mM NaCl, and 5 mM DTT. Fractions were analyzed by SDS-PAGE, flash-frozen, and stored at −80 °C.
RNA synthesis
The DNA template encoding for UMLILO was amplified by PCR from a plasmid with the exonic sequence (a gift from Dr Musa Mhlanga) using a forward primer containing the T7 promoter sequence. The template DNA was cleaned using AmpureXP bead-based reagent. The transcription reaction consisted of 10 to 20 ng/μl template DNA, 8 mM of each rNTP, 10% polyethylene glycol 8000, 20 mM MgCl2, 1× transcription buffer (5 mM Tris, pH 8, 5 mM spermidine, 10 mM DTT), and in-house expressed and purified T7. RNA was in vitro transcribed at 37 °C for 3 h, then purified by high-performance liquid chromatography (HPLC) using 12.5 mM Tris HCl and 6 M urea and eluted with a gradient of 0 to 500 mM sodium perchlorate.
Protein modification
Protein samples with a concentration of 50 to 200 μM of the respective protein in a buffer (25 mM sodium phosphate, 150 mM NaCl, pH 6.5) were prepared. Stock solutions of 1M7 (a gift from Michael Sattler), NMIA, BzCN, and 2A3 in anhydrous DMSO were added for a final concentration of 18 mM (12.5 mM for NMR studies of KH3 and 3 mM for MALDI-MS of WDR5) in the reaction mixture. For 2A3, an additional sample modified by 36 mM was prepared. For KH3, additional experiments with 3, 6, 9, 18, and 36 mM 1M7 were conducted. For DMS, 40 μl of DMS was dissolved in 90 μl of anhydrous ethanol and diluted with 870 μl of water. Aliquots of this solution were then added to 100 to 200 μM protein solutions for a final concentration of 20 mM DMS. The mixtures were incubated for 15 min at room temperature prior to use.
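The DMS dilution described above can be checked with a short calculation. The sketch below is an illustration and not the authors' own worksheet: it assumes literature values for the density (1.33 g/ml) and molar mass (126.13 g/mol) of DMS, assumes that volumes are additive on mixing, and uses hypothetical function names.

```python
DMS_DENSITY_G_PER_ML = 1.33    # assumed literature value
DMS_MOLAR_MASS_G_PER_MOL = 126.13  # assumed literature value


def dms_stock_mM(dms_ul: float, ethanol_ul: float, water_ul: float) -> float:
    """Approximate DMS concentration (mM) of the diluted stock, assuming additive volumes."""
    total_l = (dms_ul + ethanol_ul + water_ul) * 1e-6
    moles = (dms_ul * 1e-3) * DMS_DENSITY_G_PER_ML / DMS_MOLAR_MASS_G_PER_MOL
    return moles / total_l * 1e3


def aliquot_ul(protein_ul: float, stock_mM: float, final_mM: float) -> float:
    """Volume of stock to add to a protein sample so the mixture reaches final_mM."""
    return protein_ul * final_mM / (stock_mM - final_mM)


if __name__ == "__main__":
    stock = dms_stock_mM(40.0, 90.0, 870.0)
    print(f"stock: {stock:.0f} mM DMS")
    print(f"add {aliquot_ul(100.0, stock, 20.0):.2f} ul stock to 100 ul protein for 20 mM")
```

With these assumed constants the stock comes out at roughly 0.42 M, so an aliquot of about 5 μl per 100 μl of protein solution gives a final concentration near 20 mM.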
Trypsin digests
To eliminate contamination by side products of the modification reactions, the modified proteins were loaded onto a 3 kDa cutoff spin column and washed thrice with water. 20 μl of modified (18 mM 1M7 and 20 mM DMS) and unmodified solutions of hnRNP K KH3 were diluted with 60 μl 50 mM Tris pH 8.0, treated with Pierce Trypsin Protease (ThermoFisher), and incubated at 37 °C for 20 h according to the manufacturer's protocol. The peptides were purified using Pierce C18 Spin Columns (ThermoFisher) and analyzed by MALDI-MS using α-cyano-4-hydroxycinnamic acid as a matrix. Peptide spectra were calibrated using angiotensin II (Sigma Aldrich).
NMR analysis
To eliminate contamination by side products of the modification reactions, the modified proteins were loaded onto a 3 kDa cutoff spin column. Four subsequent washings with fresh NMR buffer (25 mM sodium phosphate, 150 mM NaCl, pH 6.5) restored the initial solute composition. After recovery, the samples were supplemented with 10% D2O and transferred into 3 mm NMR tubes. Sample concentrations were in the range of 100 to 120 μM (absorption of 1M7 inhibits concentration determination of 1M7-modified proteins by UV). NMR experiments were conducted on a Bruker 800 MHz Avance NEO equipped with a cryo-probe at 298 K. The 2D 1H-15N correlation spectra were acquired using a SOFAST-HMQC pulse sequence (58). Spectra were processed in Topspin 3.6 and analyzed with CCPN2.5.3 (59). Protein backbone resonance assignments for hnRNP K KH3 have been published elsewhere and could be transferred to our spectra for most backbone resonances without complications (60, 61). For the RNA titration experiments, 5′-(CCC)5-3′ RNA and DNA oligonucleotides were supplied by Integrated DNA Technologies.
MALDI mass spectrometry
All mass spectrometric experiments were conducted on a Bruker Autoflex maX MALDI-TOF instrument. All spectra were acquired in positive linear mode. Aliquots of the (modified) protein samples were diluted to <6 μM with water and combined with α-cyano-4-hydroxycinnamic acid for hnRNP K KH3 and SHARP RRM1 and 2,5-dihydroxybenzoic acid matrices for WDR5 using the dried droplet technique. WDR5 samples were additionally desalted by washing with water on a 3 kDa cutoff spin column. This step was omitted for hnRNP K KH3 and SHARP RRM1 since we found no beneficial effect on spectral quality for these proteins. Mass spectra were recorded for SHARP RRM1 modified with 1M7 and DMS; hnRNP K KH3 modified by 1M7, DMS, NMIA, BzCN, and 2A3; and WDR5 modified by 1M7, as well as for the unmodified proteins.
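When reading such spectra, it can help to tabulate the masses expected for a protein carrying a given number of modifications. The Python sketch below does this under stated assumptions: the per-modification mass shifts are computed here from elemental formulas (acylation by 1M7 taken as addition of the 2-(methylamino)-4-nitrobenzoyl group, C8H6N2O3, about 178.15 Da; methylation by DMS taken as addition of CH2, about 14.03 Da) and are not values reported in this section. The 9000 Da starting mass is the nominal size of the KH3 construct and is used only as a placeholder; the function name is hypothetical.

```python
ACYL_1M7_SHIFT_DA = 178.15  # assumed average mass added per 1M7 acylation
METHYL_SHIFT_DA = 14.03     # assumed average mass added per methylation


def expected_masses(unmodified_da: float, shift_da: float, max_mods: int) -> list[float]:
    """Average masses for 0..max_mods copies of one modification on a protein."""
    return [round(unmodified_da + n * shift_da, 2) for n in range(max_mods + 1)]


if __name__ == "__main__":
    nominal_kh3_da = 9000.0  # placeholder, not a measured mass
    print("1M7 ladder:", expected_masses(nominal_kh3_da, ACYL_1M7_SHIFT_DA, 3))
    print("DMS ladder:", expected_masses(nominal_kh3_da, METHYL_SHIFT_DA, 3))
```

The much smaller spacing of a methylation ladder relative to an acylation ladder is one reason why low levels of DMS modification may be harder to resolve in linear-mode spectra of intact proteins.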
Molecular dynamics simulations
1M7 and DMS were separately docked to WDR5, the RRM1 domain of SHARP, and the KH3 domain of hnRNP K. A structure for the RRM1 domain of SHARP was modeled using AlphaFold2 (62). The WDR5 protein and the KH3 domain of hnRNP K were taken from their respective X-ray structures (PDB IDs 8G3C and 1ZZK, respectively) (40, 63); the GB1 solubility tag was removed from the KH3 domain. Structures for 1M7 and DMS were procured from the CSD (64). Parameters for 1M7 and DMS were obtained using GAFF and Antechamber (65). The 1M7 and DMS molecules were docked to the RRM1 domain of SHARP using the flexible docking algorithm of UCSF’s Dock6 (66). For each ligand, the top four ranked poses were used.
Six systems were then prepared for simulation: (i) hnRNP K KH3 + DMS, (ii) hnRNP K KH3 + 1M7, (iii) SHARP RRM1 + DMS, (iv) SHARP RRM1 + 1M7, (v) WDR5 + DMS, and (vi) WDR5 + 1M7. Additionally, hnRNP K KH3 was also simulated with BzCN. The tleap module of Amber22 (67) was used to solvate the system in a truncated octahedron box of 9833 TIP3P (68) water molecules before neutralizing the total charge with Na+ ions; a 50 mM NaCl buffer was then added. The total system size consisted of 30,901 atoms. The protein was simulated using the ff14SB (69) force field, and the ions were modeled with the Joung-Cheatham monovalent ion set in TIP3P (70). The 1M7, DMS, and BzCN molecules were modeled using GAFF (65). The masses of non-water hydrogen atoms were repartitioned to permit a 4 fs timestep in production runs (71). A distance cutoff of 10 Å was applied to nonbonded interactions, with long-range electrostatics calculated using Particle Mesh Ewald (72). A Langevin thermostat (73) was used to maintain the temperature with a collision frequency of 1 ps−1. Simulations were carried out using the GPU variant of the pmemd module of Amber22.
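The system composition quoted above allows two quick consistency checks, sketched below in Python. The ion estimate uses a common rule of thumb (salt molarity times the number of waters divided by the molarity of pure water, taken here as 55.5 M); this is an assumption made for illustration, and the number of ion pairs actually added is not stated in the text. The function names are hypothetical.

```python
WATER_MOLARITY_M = 55.5  # approximate molarity of pure water (assumed)


def salt_pairs(n_waters: int, salt_molar: float) -> int:
    """Rule-of-thumb number of ion pairs for a target salt concentration."""
    return round(salt_molar * n_waters / WATER_MOLARITY_M)


def non_water_atoms(total_atoms: int, n_waters: int) -> int:
    """Atoms left for protein, ligands, and ions after removing 3-site waters."""
    return total_atoms - 3 * n_waters


if __name__ == "__main__":
    print("approx. NaCl pairs for 50 mM:", salt_pairs(9833, 0.050))
    print("non-water atoms:", non_water_atoms(30901, 9833))
```

For 9833 waters this estimate gives about nine NaCl pairs, and the quoted total of 30,901 atoms leaves roughly 1,400 atoms for the solute and ions, which is the order of magnitude expected for a protein domain of about 9 kDa.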
The systems were relaxed identically using a 10-step equilibration protocol designed to prepare the systems for simulation conditions. The first step included 1000 steps of steepest descent minimization, followed by 9000 steps of conjugate gradient minimization with only water molecules and hydrogen atoms unrestrained and 100.0 kcal/(mol∗Å2) Cartesian positional restraints on the rest of the system. The second step involved heating from 100 K to 298.15 K over 1 ns at constant NVT, again with all atoms except hydrogen atoms and water molecules restrained with a 100 kcal/(mol∗Å2) force constant, before maintaining the temperature at 298.15 K for an additional 4 ns. The third step was 1 ns MD simulation at constant NPT with 100 kcal/(mol∗Å2) positional restraints on all atoms except hydrogen and water molecules. The fourth step was 1 ns MD simulation at constant NVT with 100 kcal/(mol∗Å2) restraints on all atoms except hydrogen and water molecules. The fifth step was 1000 steps of conjugate gradient minimization with only the protein backbone atoms (Cα, C, N) restrained with a 10 kcal/(mol∗Å2) force constant. The sixth step was 1 ns MD simulation at constant NPT with 10 kcal/(mol∗Å2) restraints on the protein backbone, followed by another 1 ns MD at constant NPT with 1 kcal/(mol∗Å2) restraints on the protein backbone, then another 1 ns with 0.1 kcal/(mol∗Å2) restraints, followed by a final 10 ns of unrestrained MD at constant NPT. The four 1M7 and DMS molecules in each system were unrestrained during the final four equilibration steps. The final coordinates and velocities from the last relaxation step were used to seed triplicate independent MD simulations at 298.15 K for each system.
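The relaxation protocol can be written down as a simple schedule, which makes the restraint ramp and the total equilibration time easy to verify. The Python sketch below is one reading of the description above in which heating and the subsequent temperature hold are listed as separate stages; the `Stage` class and its field names are hypothetical, and the ensemble of the 0.1 kcal/(mol·Å2) stage is assumed to be NPT because the text does not state it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stage:
    kind: str          # "min" (minimization) or "md"
    ensemble: str      # "NVT", "NPT", or "-" for minimization
    duration_ns: float # 0.0 for minimization stages
    restraint: float   # force constant in kcal/(mol*A^2); 0.0 means unrestrained
    restrained: str    # which atoms carry the positional restraint


STAGES = [
    Stage("min", "-", 0.0, 100.0, "solute heavy atoms"),  # 1000 SD + 9000 CG steps
    Stage("md", "NVT", 1.0, 100.0, "solute heavy atoms"), # heat 100 K -> 298.15 K
    Stage("md", "NVT", 4.0, 100.0, "solute heavy atoms"), # hold at 298.15 K
    Stage("md", "NPT", 1.0, 100.0, "solute heavy atoms"),
    Stage("md", "NVT", 1.0, 100.0, "solute heavy atoms"),
    Stage("min", "-", 0.0, 10.0, "backbone"),             # 1000 CG steps
    Stage("md", "NPT", 1.0, 10.0, "backbone"),
    Stage("md", "NPT", 1.0, 1.0, "backbone"),
    Stage("md", "NPT", 1.0, 0.1, "backbone"),             # ensemble assumed
    Stage("md", "NPT", 10.0, 0.0, "none"),
]


def total_md_ns(stages: list[Stage]) -> float:
    """Total simulated time across all MD stages."""
    return sum(s.duration_ns for s in stages if s.kind == "md")


if __name__ == "__main__":
    print(f"{len(STAGES)} stages, {total_md_ns(STAGES):.0f} ns of equilibration MD")
```

Read this way, the schedule contains ten stages and 20 ns of equilibration dynamics before the triplicate production runs are seeded.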
Analysis was carried out using the cpptraj (74) module of Amber22. The AlphaFold2 model was used as the reference structure for SHARP RRM1, the first model of the NMR structure was used as the reference structure for hnRNP K KH3, and the X-ray structure was used as the reference for WDR5 (40, 43). The RMSF calculations were performed by fitting the non-terminal backbone atoms of the simulated structures to the reference structure before calculating the per-residue RMSF using all non-hydrogen atoms. The RDFs of 1M7, DMS, and BzCN were calculated against particular amino acids on each protein to monitor which residues the chemical probes interact with the most during the simulations. Structures were visualized using VMD (75).
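The per-residue RMSF calculation can be sketched in a few lines of plain Python. This is a simplified illustration and not the cpptraj implementation: it assumes the frames have already been fitted to the reference structure, it averages atomic fluctuations within a residue without mass weighting (cpptraj may weight by mass), and all function and variable names are hypothetical.

```python
import math
from collections import defaultdict


def per_atom_rmsf(frames):
    """RMSF of each atom about its mean position.

    frames: list of frames; each frame is a list of (x, y, z) tuples,
    one per atom, already superimposed on the reference structure.
    """
    n_frames = len(frames)
    n_atoms = len(frames[0])
    rmsf = []
    for a in range(n_atoms):
        mean = [sum(f[a][k] for f in frames) / n_frames for k in range(3)]
        msd = sum(
            sum((f[a][k] - mean[k]) ** 2 for k in range(3)) for f in frames
        ) / n_frames
        rmsf.append(math.sqrt(msd))
    return rmsf


def per_residue_rmsf(atom_rmsf, residue_ids):
    """Unweighted mean of atomic RMSF values within each residue."""
    grouped = defaultdict(list)
    for value, res in zip(atom_rmsf, residue_ids):
        grouped[res].append(value)
    return {res: sum(vals) / len(vals) for res, vals in grouped.items()}


if __name__ == "__main__":
    # Toy trajectory: two frames, two heavy atoms belonging to residue 1.
    toy = [[(0.0, 0.0, 0.0), (5.0, 0.0, 0.0)],
           [(2.0, 0.0, 0.0), (5.0, 0.0, 0.0)]]
    print(per_residue_rmsf(per_atom_rmsf(toy), [1, 1]))
```

In the toy example the first atom moves by 2 Å between frames and the second does not move, so the atomic values are 1 Å and 0 Å and the residue average is 0.5 Å.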
Filter dot blot assays
Filter dot blot binding assays were performed in a buffer of 25 mM sodium phosphate, pH 7.5, and 100 mM sodium chloride. 80 nM 3′ Cy5-labeled UMLILO RNA was combined with dilutions of WDR5, unmodified or modified with 18 mM 1M7, for final concentrations of 0, 1.1, 2.2, 4.4, 6.7, 10, 15, and 25 μM in 10 μl. Samples were incubated at room temperature for 20 min, diluted in 60 μl cold buffer, then rapidly applied to a filter dot blot apparatus with nitrocellulose and nylon membranes (pore sizes 0.45 μm) and gently drawn through by vacuum. Membranes were imaged using an Amersham Typhoon laser scanner. Dot blots were quantified in ImageJ, and binding curves were fitted to the Hill equation in Prism 10. Interference experiments were performed in a buffer of 25 mM sodium phosphate, pH 6.5, and 150 mM sodium chloride. 15 μM WDR5 was incubated with 100 nM 3′ Cy5-labeled UMLILO RNA for 15 min at room temperature, followed by the addition of DMSO or 50 mM 1M7 (10% final volume). Samples were then applied to the filter dot blot apparatus and quantified as before.
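For readers who want to see how Hill parameters relate to a binding curve, the Python sketch below fits the Hill equation by the classic log-linearization (a Hill plot) using only the standard library. It is an illustration and not the procedure used here, since the fits in this work were done by regression in Prism 10. The parameter names `kd` (half-saturation concentration) and `n_hill`, the function names, and the synthetic data are hypothetical; the concentration series is the one listed above.

```python
import math


def hill(conc: float, kd: float, n_hill: float) -> float:
    """Fraction bound according to the Hill equation."""
    return conc ** n_hill / (kd ** n_hill + conc ** n_hill)


def fit_hill_linearized(concs, fractions):
    """Estimate (kd, n_hill) from ln(f / (1 - f)) = n*ln(c) - n*ln(kd).

    Points with c <= 0 or f outside (0, 1) cannot be log-transformed
    and are skipped.
    """
    pts = [(math.log(c), math.log(f / (1.0 - f)))
           for c, f in zip(concs, fractions) if c > 0 and 0.0 < f < 1.0]
    if len(pts) < 2:
        raise ValueError("need at least two usable data points")
    mx = sum(x for x, _ in pts) / len(pts)
    my = sum(y for _, y in pts) / len(pts)
    sxx = sum((x - mx) ** 2 for x, _ in pts)
    sxy = sum((x - mx) * (y - my) for x, y in pts)
    n_hill = sxy / sxx
    intercept = my - n_hill * mx
    return math.exp(-intercept / n_hill), n_hill


if __name__ == "__main__":
    concs_uM = [0, 1.1, 2.2, 4.4, 6.7, 10, 15, 25]
    synthetic = [hill(c, kd=5.0, n_hill=2.0) for c in concs_uM]  # noise-free toy data
    print(fit_hill_linearized(concs_uM, synthetic))
```

The linearized form weights points near 0 and 1 heavily once noise is present, so a nonlinear fit is generally preferable for real dot blot data; the sketch is meant only to make the meaning of the two fitted parameters explicit.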
Electrophoretic mobility shift assays
EMSAs were conducted in a buffer consisting of 25 mM sodium phosphate, pH 6.5, and 150 mM sodium chloride. 300 nM UMLILO RNA was combined with dilutions of WDR5 (unmodified or modified with 18 mM 1M7) for final concentrations of 0, 0.625, 1.25, 2.50, 5.00, 10.0, and 15.0 μM in 10 μl. The samples were incubated at room temperature for 20 min before adding 3 μl of 30% glycerol and loading on a 1% native agarose gel stained with SYBR Safe. Gels were run at 60 V for 40 min and analyzed using ImageJ (76). The decay of the free RNA band was quantified for unmodified WDR5, and curves were fitted to the Hill equation using MATLAB.
Data availability
All data are included in the article or available from the corresponding author, A. N. J.
Supporting information
This article contains supporting information (40).
Conflict of interest
The authors declare that they have no conflicts of interest with the contents of this article.
Acknowledgments
The authors would like to thank Emma Gogarnoiu for assistance with figures, Alex Nazzaro for assistance with MALDI experiments, and members of the Jones lab for thoughtful comments and suggestions.
Author contributions
D. C., D. K., and L. F. visualization; D. C., D. K., and L. F. writing–original draft; L. M., A. C., D. C., D. K., and L. F. investigation; A. N. J. conceptualization; A. N. J. funding acquisition; A. N. J. resources; A. N. J. supervision; A. N. J., D. K., and L. F. writing–review & editing.
Funding and additional information
This work was supported by the National Science Foundation (2243667 to A. N. J.), the Austrian Science Fund (FWF) 10.55776/J4869 to D. K., and the NYU Dean’s Undergraduate Research Fund (DURF) to D. C.
Reviewed by members of the JBC Editorial Board. Edited by Karin Musier-Forsyth
References
- 1.Vicens Q., Kieft J.S. Thoughts on how to think (and talk) about RNA structure. Proc. Natl. Acad. Sci. U. S. A. 2022;119 doi: 10.1073/pnas.2112677119. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 2.Wilkinson K.A., Merino E.J., Weeks K.M. Selective 2’-hydroxyl acylation analyzed by primer extension (SHAPE): quantitative RNA structure analysis at single nucleotide resolution. Nat. Protoc. 2006;1:1610–1616. doi: 10.1038/nprot.2006.249. [DOI] [PubMed] [Google Scholar]
- 3.Peattie D.A., Gilbert W. Chemical probes for higher-order structure in RNA. Proc. Natl. Acad. Sci. U. S. A. 1980;77:4679–4682. doi: 10.1073/pnas.77.8.4679. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 4.Peattie D.A. Direct chemical method for sequencing RNA. Proc. Natl. Acad. Sci. U. S. A. 1979;76:1760–1764. doi: 10.1073/pnas.76.4.1760. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 5.Spitale R.C., Crisalli P., Flynn R.A., Torre E.A., Kool E.T., Chang H.Y. RNA SHAPE analysis in living cells. Nat. Chem. Biol. 2013;9:18–20. doi: 10.1038/nchembio.1131. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 6.De Bisschop G., Allouche D., Frezza E., Masquida B., Ponty Y., Will S., et al. Progress toward SHAPE constrained computational prediction of tertiary interactions in RNA structure. Noncoding RNA. 2021 doi: 10.3390/ncrna7040071. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 7.Arnold E.B., Cohn D., Bose E., Klingler D., Wolfe G., Jones A.N. Investigating the interplay between RNA structural dynamics and RNA chemical probing experiments. Nucleic Acids Res. 2025 doi: 10.1093/nar/gkaf290. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 8.Bose E., Xiong S., Jones A.N. Probing RNA structure and dynamics using nanopore and next generation sequencing. J. Biol. Chem. 2024;300 doi: 10.1016/j.jbc.2024.107317. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 9.Siegfried N.A., Busan S., Rice G.M., Nelson J.A.E., Weeks K.M. RNA motif discovery by SHAPE and mutational profiling (SHAPE-MaP) Nat. Methods. 2014;11:959–965. doi: 10.1038/nmeth.3029. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 10.Smola M.J., Weeks K.M. In-cell RNA structure probing with SHAPE-MaP. Nat. Protoc. 2018;13:1181–1195. doi: 10.1038/nprot.2018.010. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 11.Smola M.J., Calabrese J.M., Weeks K.M. Detection of RNA–protein interactions in living cells with SHAPE. Biochemistry. 2015;54:6867–6875. doi: 10.1021/acs.biochem.5b00977. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 12.Jones A.N., Pisignano G., Pavelitz T., White J., Kinisu M., Forino N., et al. An evolutionarily conserved RNA structure in the functional core of the lincRNA Cyrano. RNA. 2020;26:1234–1246. doi: 10.1261/rna.076117.120. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 13.Stoddard C.D., Gilbert S.D., Batey R.T. Ligand-dependent folding of the three-way junction in the purine riboswitch. RNA. 2008;14:675–684. doi: 10.1261/rna.736908. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 14.McGinnis J.L., Dunkle J.A., Cate J.H.D., Weeks K.M. The mechanisms of RNA SHAPE chemistry. J. Am. Chem. Soc. 2012;134:6617–6624. doi: 10.1021/ja2104075. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 15.Zubradt M., Gupta P., Persad S., Lambowitz A.M., Weissman J.S., Rouskin S. DMS-MaPseq for genome-wide or targeted RNA structure probing in vivo. Nat. Methods. 2017;14:75–82. doi: 10.1038/nmeth.4057. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 16.Mustoe A.M., Lama N.N., Irving P.S., Olson S.W., Weeks K.M. RNA base-pairing complexity in living cells visualized by correlated chemical probing. Proc. Natl. Acad. Sci. U. S. A. 2019;116:24574–24582. doi: 10.1073/pnas.1905491116. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 17.Jones A.N., Graß C., Meininger I., Geerlof A., Klostermann M., Zarnack K., et al. Modulation of pre-mRNA structure by hnRNP proteins regulates alternative splicing of. Sci. Adv. 2022;8 doi: 10.1126/sciadv.abp9153. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 18.Luo L., Chiu L.-Y., Sugarman A., Gupta P., Rouskin S., Tolbert B.S. HnRNP A1/A2 proteins assemble onto 7SK snRNA via context dependent interactions. J. Mol. Biol. 2021;433 doi: 10.1016/j.jmb.2021.166885. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 19.Martin S., Blankenship C., Rausch J.W., Sztuba-Solinska J. Using SHAPE-MaP to probe small molecule-RNA interactions. Methods. 2019;167:105–116. doi: 10.1016/j.ymeth.2019.04.009. [DOI] [PubMed] [Google Scholar]
- 20.Wang Y., Parmar S., Schneekloth J.S., Tiwary P. Interrogating RNA-small molecule interactions with structure probing and artificial intelligence-augmented molecular simulations. ACS Cent. Sci. 2022;8:741–748. doi: 10.1021/acscentsci.2c00149. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 21.Banijamali E., Baronti L., Becker W., Sajkowska-Kozielewicz J.J., Huang T., Palka C., et al. RNA:RNA interaction in ternary complexes resolved by chemical probing. RNA. 2023;29:317–329. doi: 10.1261/rna.079190.122. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 22.Churchich J.E. Fluorescence properties of o-aminobenzoyl-labeled proteins. Anal. Biochem. 1993;213:229–233. doi: 10.1006/abio.1993.1414. [DOI] [PubMed] [Google Scholar]
- 23.Fessler A.B., Fowler A.J., Ogle C.A. Directly quantifiable biotinylation using a water-soluble isatoic anhydride platform. Bioconjug. Chem. 2021;32:904–908. doi: 10.1021/acs.bioconjchem.1c00150. [DOI] [PubMed] [Google Scholar]
- 24.Ursuegui S., Chivot N., Moutin S., Burr A., Fossey C., Cailly T., et al. Biotin-conjugated N-methylisatoic anhydride: a chemical tool for nucleic acid separation by selective 2′-hydroxyl acylation of RNA. Chem. Commun. 2014;50:5748–5751. doi: 10.1039/c4cc01134a. [DOI] [PubMed] [Google Scholar]
- 25.Hooker J.M., Esser-Kahn A.P., Francis M.B. Modification of aniline containing proteins using an oxidative coupling strategy. J. Am. Chem. Soc. 2006;128:15558–15559. doi: 10.1021/ja064088d. [DOI] [PubMed] [Google Scholar]
- 26.Asadollahi K., Rafiee S., Riazi G. Sensitive detection of proteins in polyacrylamide gel via isatoic anhydride derivatization: introduction of a low-cost fluorescent prelabeling procedure. Electrophoresis. 2016;37:2610–2614. doi: 10.1002/elps.201600237. [DOI] [PubMed] [Google Scholar]
- 27.Asadollahi K., Rafiee S., Riazi G. In: Protein Gel Detection and Imaging: Methods and Protocols. Kurien B.T., Scofield R.H., editors. Springer; New York, New York, NY: 2018. Detection of proteins in polyacrylamide gels via prelabeling by isatoic anhydride; pp. 173–177. [DOI] [PubMed] [Google Scholar]
- 28.Krol A., Carbon P. A guide for probing native small nuclear RNA and ribonucleoprotein structures. Methods Enzymol. 1989;180:212–227. doi: 10.1016/0076-6879(89)80103-x. [DOI] [PubMed] [Google Scholar]
- 29.Moio P., Kulyyassov A., Vertut D., Camoin L., Ramankulov E., Lipinski M., et al. Exploring the use of dimethylsulfate for in vivo proteome footprinting. Proteomics. 2011;11:249–260. doi: 10.1002/pmic.200900832. [DOI] [PubMed] [Google Scholar]
- 30.Shang S., Liu J., Hua F. Protein acylation: mechanisms, biological functions and therapeutic targets. Signal. Transduct Target. Ther. 2022;7:396. doi: 10.1038/s41392-022-01245-y. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 31.Wang X., Xie H., Guo Q., Cao D., Ru W., Zhao S., et al. Molecular basis for METTL9-mediated N1-histidine methylation. Cell Discov. 2023;9:38. doi: 10.1038/s41421-023-00548-w. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 32.Murn J., Shi Y. The winding path of protein methylation research: milestones and new frontiers. Nat. Rev. Mol. Cell Biol. 2017;18:517–527. doi: 10.1038/nrm.2017.35. [DOI] [PubMed] [Google Scholar]
- 33.Staiger R.P., Miller E.B. Isatoic anhydride. IV. Reactions with various nucleophiles. J. Org. Chem. 1959;24:1214–1219. [Google Scholar]
- 34.Dodson G., Wlodawer A. Catalytic triads and their relatives. Trends Biochem. Sci. 1998;23:347–352. doi: 10.1016/s0968-0004(98)01254-7. [DOI] [PubMed] [Google Scholar]
- 35.Pintacuda G., Wei G., Roustan C., Kirmizitas B.A., Solcan N., Cerase A., et al. hnRNPK recruits PCGF3/5-PRC1 to the Xist RNA B-repeat to establish polycomb-mediated chromosomal silencing. Mol. Cell. 2017;68:955–969.e10. doi: 10.1016/j.molcel.2017.11.013. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 36.Venables J.P., Koh C.-S., Froehlich U., Lapointe E., Couture S., Inkel L., et al. Multiple and specific mRNA processing targets for the major human hnRNP proteins. Mol. Cell. Biol. 2008;28:6033–6043. doi: 10.1128/MCB.00726-08. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 37.Arieti F., Gabus C., Tambalo M., Huet T., Round A., Thore S. The crystal structure of the Split End protein SHARP adds a new layer of complexity to proteins containing RNA recognition motifs. Nucleic Acids Res. 2014;42:6742–6752. doi: 10.1093/nar/gku277. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 38.Monfort A., Di Minin G., Postlmayr A., Freimann R., Arieti F., Thore S., et al. Identification of spen as a crucial factor for Xist function through forward genetic screening in haploid embryonic stem cells. Cell Rep. 2015;12:554–561. doi: 10.1016/j.celrep.2015.06.067. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 39.Button A.C., Hall S.D., Ashley E.L., McHugh C.A. Dissection of protein and RNA regions required for SPEN binding to XIST A-repeat RNA. RNA. 2023 doi: 10.1261/rna.079713.123. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 40.Backe P.H., Messias A.C., Ravelli R.B.G., Sattler M., Cusack S. X-ray crystallographic and NMR studies of the third KH domain of hnRNP K in complex with single-stranded nucleic acids. Structure. 2005;13:1055–1067. doi: 10.1016/j.str.2005.04.008. [DOI] [PubMed] [Google Scholar]
- 41.Chen M., Zhang M., Zhai L., Hu H., Liu P., Tan M. Tryptic peptides bearing C-terminal dimethyllysine need to be considered during the analysis of lysine dimethylation in proteomic study. J. Proteome Res. 2017;16:3460–3469. doi: 10.1021/acs.jproteome.7b00373. [DOI] [PubMed] [Google Scholar]
- 42.Busan S., Weidmann C.A., Sengupta A., Weeks K.M. Guidelines for SHAPE reagent choice and detection strategy for RNA structure probing studies. Biochemistry. 2019;58:2655–2664. doi: 10.1021/acs.biochem.8b01218. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 43.Fanucchi S., Fok E.T., Dalla E., Shibayama Y., Börner K., Chang E.Y., et al. Immune genes are primed for robust transcription by proximal long noncoding RNAs located in nuclear compartments. Nat. Genet. 2019;51:138–150. doi: 10.1038/s41588-018-0298-2. [DOI] [PubMed] [Google Scholar]
- 44.Yang Y.W., Flynn R.A., Chen Y., Qu K., Wan B., Wang K.C., et al. Essential role of lncRNA binding for WDR5 maintenance of active chromatin and embryonic stem cell pluripotency. Elife. 2014;3 doi: 10.7554/eLife.02046. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 45.Licatalosi D.D., Ye X., Jankowsky E. Approaches for measuring the dynamics of RNA-protein interactions. Wiley Interdiscip. Rev. RNA. 2020;11 doi: 10.1002/wrna.1565. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 46.Webb A.E., Rose M.A., Westhof E., Weeks K.M. Protein-dependent transition states for ribonucleoprotein assembly. J. Mol. Biol. 2001;309:1087–1100. doi: 10.1006/jmbi.2001.4714. [DOI] [PubMed] [Google Scholar]
- 47.Khan M.A., Ma J., Walden W.E., Merrick W.C., Theil E.C., Goss D.J. Rapid kinetics of iron responsive element (IRE) RNA/iron regulatory protein 1 and IRE-RNA/eIF4F complexes respond differently to metal ions. Nucleic Acids Res. 2014;42:6567–6577. doi: 10.1093/nar/gku248. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 48.Tamura T., Hamachi I. Chemistry for covalent modification of endogenous/native proteins: from test tubes to complex biological systems. J. Am. Chem. Soc. 2019;141:2782–2799. doi: 10.1021/jacs.8b11747. [DOI] [PubMed] [Google Scholar]
- 49.Lai C., Tang Z., Liu Z., Luo P., Zhang W., Zhang T., et al. Probing the functional hotspots inside protein hydrophobic pockets by in situ photochemical trifluoromethylation and mass spectrometry. Chem. Sci. 2024;15:2545–2557. doi: 10.1039/d3sc05106d. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 50.Guzowski J.P., Jr., Delaney E.J., Humora M.J., Irdam E., Kiesman W.F., Kwok A., et al. Understanding and control of dimethyl sulfate in a manufacturing process: Kinetic modeling of a fischer esterification catalyzed by H2SO4. Org. Process. Res. Dev. 2012;16:232–239. [Google Scholar]
- 51.Claesson P. Ueber die neutralen und sauren Sulfate des Methyl- und Aethylalkohols. J. Prakt. Chem. 1879;19:231–265. [Google Scholar]
- 52.Merino E.J., Wilkinson K.A., Coughlan J.L., Weeks K.M. RNA structure analysis at single nucleotide resolution by selective 2′-hydroxyl acylation and primer extension (SHAPE). J. Am. Chem. Soc. 2005;127:4223–4231. doi: 10.1021/ja043822v.
- 53.Stoddard C.D., Montange R.K., Hennelly S.P., Rambo R.P., Sanbonmatsu K.Y., Batey R.T. Free state conformational sampling of the SAM-I riboswitch aptamer domain. Structure. 2010;18:787–797. doi: 10.1016/j.str.2010.04.006.
- 54.Garst A.D., Héroux A., Rambo R.P., Batey R.T. Crystal structure of the lysine riboswitch regulatory mRNA element. J. Biol. Chem. 2008;283:22347–22351. doi: 10.1074/jbc.C800120200.
- 55.Tyrrell J., McGinnis J.L., Weeks K.M., Pielak G.J. The cellular environment stabilizes adenine riboswitch RNA structure. Biochemistry. 2013;52:8777–8785. doi: 10.1021/bi401207q.
- 56.Watters K.E., Strobel E.J., Yu A.M., Lis J.T., Lucks J.B. Cotranscriptional folding of a riboswitch at nucleotide resolution. Nat. Struct. Mol. Biol. 2016;23:1124–1131. doi: 10.1038/nsmb.3316.
- 57.Kim J.-Y., Banerjee T., Vinckevicius A., Luo Q., Parker J.B., Baker M.R., et al. A role for WDR5 in integrating threonine 11 phosphorylation to lysine 4 methylation on histone H3 during androgen signaling and in prostate cancer. Mol. Cell. 2014;54:613–625. doi: 10.1016/j.molcel.2014.03.043.
- 58.Schanda P., Brutscher B. Very fast two-dimensional NMR spectroscopy for real-time investigation of dynamic events in proteins on the time scale of seconds. J. Am. Chem. Soc. 2005;127:8014–8015. doi: 10.1021/ja051306e.
- 59.Vranken W.F., Boucher W., Stevens T.J., Fogh R.H., Pajon A., Llinas M., et al. The CCPN data model for NMR spectroscopy: development of a software pipeline. Proteins. 2005;59:687–696. doi: 10.1002/prot.20449.
- 60.Baber J.L., Libutti D., Levens D., Tjandra N. High precision solution structure of the C-terminal KH domain of heterogeneous nuclear ribonucleoprotein K, a c-myc transcription factor. J. Mol. Biol. 1999;289:949–962. doi: 10.1006/jmbi.1999.2818.
- 61.Baber J., Libutti D., Levens D., Tjandra N. Biological Magnetic Resonance Data Bank (BMRB), hosted by UCONN Health; Farmington, CT: 1999. C-TERMINAL KH domain of HNRNP K (KH3).
- 62.Jumper J., Evans R., Pritzel A., Green T., Figurnov M., Ronneberger O., et al. Highly accurate protein structure prediction with AlphaFold. Nature. 2021;596:583–589. doi: 10.1038/s41586-021-03819-2.
- 63.Ding J., Liu L., Chiang Y.-L., Zhao M., Liu H., Yang F., et al. Discovery and structure-based design of inhibitors of the WD repeat-containing protein 5 (WDR5)-MYC interaction. J. Med. Chem. 2023;66:8310–8323. doi: 10.1021/acs.jmedchem.3c00787.
- 64.Groom C.R., Bruno I.J., Lightfoot M.P., Ward S.C. The Cambridge Structural Database. Acta Crystallogr. B Struct. Sci. Cryst. Eng. Mater. 2016;72:171–179. doi: 10.1107/S2052520616003954.
- 65.Wang J., Wolf R.M., Caldwell J.W., Kollman P.A., Case D.A. Development and testing of a general amber force field. J. Comput. Chem. 2004;25:1157–1174. doi: 10.1002/jcc.20035.
- 66.Allen W.J., Balius T.E., Mukherjee S., Brozell S.R., Moustakas D.T., Lang P.T., et al. DOCK 6: impact of new features and current docking performance. J. Comput. Chem. 2015;36:1132–1156. doi: 10.1002/jcc.23905.
- 67.Case D.A., Metin Aktulga H., Belfon K., Ben-Shalom I.Y., Berryman J.T., Brozell S.R., et al. University of California; San Francisco: 2023. Amber 2023.
- 68.Jorgensen W.L., Chandrasekhar J., Madura J.D., Impey R.W., Klein M.L. Comparison of simple potential functions for simulating liquid water. J. Chem. Phys. 1983;79:926–935.
- 69.Maier J.A., Martinez C., Kasavajhala K., Wickstrom L., Hauser K.E., Simmerling C. ff14SB: improving the accuracy of protein side chain and backbone parameters from ff99SB. J. Chem. Theor. Comput. 2015;11:3696–3713. doi: 10.1021/acs.jctc.5b00255.
- 70.Joung I.S., Cheatham T.E., 3rd. Determination of alkali and halide monovalent ion parameters for use in explicitly solvated biomolecular simulations. J. Phys. Chem. B. 2008;112:9020–9041. doi: 10.1021/jp8001614.
- 71.Hopkins C.W., Le Grand S., Walker R.C., Roitberg A.E. Long-time-step molecular dynamics through hydrogen mass repartitioning. J. Chem. Theor. Comput. 2015;11:1864–1874. doi: 10.1021/ct5010406.
- 72.Darden T., York D., Pedersen L. Particle mesh Ewald: an N⋅log(N) method for Ewald sums in large systems. J. Chem. Phys. 1993;98:10089–10092.
- 73.Davidchack R.L., Handel R., Tretyakov M.V. Langevin thermostat for rigid body dynamics. J. Chem. Phys. 2009;130. doi: 10.1063/1.3149788.
- 74.Roe D.R., Cheatham T.E., 3rd. PTRAJ and CPPTRAJ: software for processing and analysis of molecular dynamics trajectory data. J. Chem. Theor. Comput. 2013;9:3084–3095. doi: 10.1021/ct400341p.
- 75.Humphrey W., Dalke A., Schulten K. VMD: visual molecular dynamics. J. Mol. Graph. 1996;14:27–28. doi: 10.1016/0263-7855(96)00018-5.
- 76.Schneider C.A., Rasband W.S., Eliceiri K.W. NIH Image to ImageJ: 25 years of image analysis. Nat. Methods. 2012;9:671–675. doi: 10.1038/nmeth.2089.
Data Availability Statement
All data are included in the article or available from the corresponding author, A. N. J.






