Author manuscript; available in PMC: 2024 Sep 25.
Published in final edited form as: Angew Chem Int Ed Engl. 2023 Aug 18;62(39):e202308650. doi: 10.1002/anie.202308650

An Encodable Scaffold for Sequence-Specific Recognition of Duplex RNA

Jonathan G Kwok 1, Zhi Yuan 1, Paramjit S Arora 1,*
PMCID: PMC10528708  NIHMSID: NIHMS1924116  PMID: 37548640

Abstract

RNA, unlike DNA, folds into a multitude of secondary and tertiary structures. This structural diversity has impeded the development of ligands that can sequence-specifically target this biomolecule. We sought to develop ligands for double-stranded RNA (dsRNA) segments, which are ubiquitous in RNA tertiary structure. The major groove of double-stranded DNA is sequence-specifically recognized by a range of dimeric helical transcription factors, including the basic leucine zipper (bZIP) and basic helix-loop-helix (bHLH) proteins; however, such simple structural motifs are not prevalent in RNA-binding proteins. We interrogated the high-resolution structures of DNA and RNA to identify requirements for a helix fork motif to occupy dsRNA major grooves akin to dsDNA. Our analysis suggested that the rigidity and angle of approach of dimeric helices in bZIP/bHLH motifs are not ideal for the binding of dsRNA major grooves. This investigation revealed that the replacement of the leucine zipper motifs in bHLH proteins with synthetic crosslinkers would allow recognition of dsRNA. We show that a model bHLH DNA-binding motif does not bind dsRNA but can be reengineered as an RNA ligand. Based on this hypothesis, we rationally designed a miniature synthetic crosslinked helix fork (CHF) as a generalizable proteomimetic scaffold for targeting dsRNA. We evaluated several CHF constructs against a set of RNA and DNA hairpins to probe the specificity of the designed construct. Our studies reveal a new class of proteomimetics as an encodable platform for sequence-specific recognition of dsRNA.

Keywords: proteomimetic, peptides, dsRNA, RNA recognition, helix fork

Graphical Abstract

Recognition of nucleic acids by designed ligands has been a long-term objective due to the biological potential of these ligands. The significant goal of developing small molecule and protein motifs that target duplex DNA has been met, but a recognition code for duplex RNA remains in its infancy. We show that naturally occurring DNA-binding helix fork motifs, which do not target equivalent duplex RNA sequences, can be engineered to recognize RNA.

Introduction

Sequence-specific recognition of nucleic acids by synthetic ligands has been a focal point of vigorous research efforts for several decades. These efforts have yielded pyrrole-imidazole polyamides[1] and zinc-finger proteins[2] as two textbook examples of scaffolds that afford sequence-specific recognition of DNA (Fig. 1A). Sequence-specific recognition of DNA by proteins and small molecules requires shape-selective recognition of one of the DNA grooves by the ligand followed by specific hydrogen-bonding interactions with the bases.[3] The Hoogsteen surface of an adenine base can be recognized by carboxamide side chains of asparagine and glutamine, while the guanidine group of arginine forms specific hydrogen bonding contacts with guanine (Fig. 1B).[4] Efforts to recognize RNA have been complicated by the multitude of secondary and tertiary structures adopted by this biopolymer.[5] Base-specific recognition of RNA is possible by antisense strategies,[6] including synthetic oligomers such as morpholinos[7] and peptide nucleic acids (PNAs),[8] and classes of small molecules[9] that target RNA are growing, but a synthetic scaffold that allows atom-specific engineering to sequence-specifically target a double-stranded RNA motif is not yet available. Synthetic strategies to target RNA have included aminoglycosides and their derivatives,[10] small molecules emerging from screens,[11] and multivalent intercalators[12] or hydrogen bonding scaffolds with potential to target repeat sequences.[13] Significant efforts have focused on the mimicry of RNA-binding peptides and proteins as starting points for synthetic scaffolds. Structural efforts to characterize HIV RNAs provided two seductive examples for potential mimicry: the Rev peptide adopts an α-helical conformation to target RRE RNA[14] and Tat may adopt a β-hairpin motif to bind TAR RNA (Fig. 1A).[15] These examples suggest that synthetic helix or β-hairpin mimics can potentially access the dsRNA major groove and may serve as encodable scaffolds for RNA recognition. Indeed, several studies to develop α-helix[16] and β-hairpin[17] mimics as specific RNA binding reagents have been detailed, but these efforts have failed to translate to a recognition code akin to a zinc-finger (ZNF) protein-based modular platform for DNA.[18] ZNF, and many other DNA-binding proteins, form specific side-chain interactions with the Hoogsteen surface of DNA bases.[3]

Figure 1.

Nucleic acid recognition by peptides, proteins, and synthetic ligands. (A) Two codes for sequence-specific recognition of DNA have been established: pyrrole-imidazole (Py-Im) polyamides bind DNA in the minor groove (PDB code 3omj) while zinc-finger proteins recognize the DNA major groove (PDB code 1mey). RNA recognition by protein secondary structures is illustrated by the Rev α-helix peptide (PDB code 1etf) and Tat β-hairpin mimetic (PDB code 2kdq). (B) Specific nucleic acid recognition by protein and peptides requires nucleobase - side chain contacts; two canonical examples of Hoogsteen surface recognition by guanidine and amide protein side chain functionality are shown.

The α-helix represents a canonical DNA major groove binding element – several classes of transcription factors that utilize the α-helical domain for DNA recognition are known,[19] and the geometry of helix-DNA major groove interactions is understood.[20] The RNA major groove can also host an α-helical domain, as illustrated by the Rev-RRE interaction[14] and general classes of RNA recognition proteins.[21] However, a single α-helical domain is limited in the number of side chains it can project into the groove for hydrogen bonding contacts with nucleobases, which in turn lowers its potential as a general platform for sequence-specific nucleic acid binding. For example, natural nucleic acid binding proteins utilize - at minimum - two helical regions for recognition. Based on our hypothesis that the engagement of multiple RNA major grooves would lead to a specific recognition modality, we engineered a proteomimetic platform for RNA binding where the contiguous RNA major grooves are contacted by linked α-helices.

Results and Discussion

Basic helix-loop-helix Max can be re-engineered to bind duplex RNA

The basic leucine zipper (bZIP) and basic helix-loop-helix (bHLH) families of DNA binding proteins serve as inspirations for our proteomimetic design. The helix dimer represents the simplest helical tertiary structure and forms the basis of important DNA recognition motifs in the bZIP and bHLH families (Fig. 1B). Both transcription factor families feature a Y-shaped scissor-grip or helix fork model of DNA recognition.[22] We rationalized that a synthetic helix fork (HF) motif may provide a generalized scaffold to target dsRNA. Unlike a single α-helical motif, a helix fork scaffold would have the advantage of clamping neighboring major grooves with two helices.

The challenge with reengineering these DNA binding proteins as RNA binders is that these motifs are not generally known to bind dsRNA. We began our studies by analyzing why bHLH may recognize duplex DNA but not RNA (Fig. 2A). We started our explorations with the Max bHLH protein because it is a well-studied example of DNA binding dimers.[23] We confirmed that the Max•Max homodimer recognizes a fluorescein-labeled hairpin DNA motif, referred to as Ebox DNA, that encompasses the Max binding site, but not the analogous RNA hairpin, Ebox RNA (Supplementary Information Fig S6 and Table S2). Max homodimer binds Ebox DNA with high affinity (KD = 0.04 ± 0.02 μM) in a fluorescence polarization assay but fails to show any binding to RNA at concentrations up to 0.5 μM.
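To put the reported Max affinity for Ebox DNA in context, the sketch below evaluates a simple 1:1 binding isotherm at several protein concentrations. It is illustrative only: the function name and the assumption that the labeled hairpin is present well below KD (so free protein approximates total protein) are introduced here and are not the fitting procedure used in the study.

```python
def fraction_bound(protein_uM: float, kd_uM: float) -> float:
    """Fraction of labeled hairpin bound under a 1:1 model.

    Assumes the labeled hairpin concentration is far below Kd, so the
    free protein concentration is approximately the total concentration.
    """
    if protein_uM < 0 or kd_uM <= 0:
        raise ValueError("concentration must be >= 0 and Kd must be > 0")
    return protein_uM / (kd_uM + protein_uM)


KD_MAX_EBOX_DNA_UM = 0.04  # reported KD for Max homodimer with Ebox DNA
TOP_CONCENTRATION_UM = 0.5  # highest concentration at which RNA showed no binding

for conc in (0.01, 0.04, 0.1, TOP_CONCENTRATION_UM):
    bound = fraction_bound(conc, KD_MAX_EBOX_DNA_UM)
    print(f"{conc:5.2f} uM protein: {bound:.2f} of Ebox DNA bound")
```

Under this model the DNA hairpin would be more than 90% bound at 0.5 μM protein, which is why the absence of any RNA signal at that concentration indicates a substantially weaker interaction.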

Figure 2.

Analysis of dsDNA and dsRNA recognition by Max homodimer, a basic helix-loop-helix motif. (A) Max homodimer binds to fluorescein-labeled Ebox DNA hairpin sequence (5’-FAM-CAC CAC GTG GTT TTT ACC ACG TGG TG) with high affinity (Kd = 0.04 ± 0.02 μM) in a fluorescence polarization assay but shows no affinity for the analogous RNA hairpin sequence (5’-FAM-CAC CAC GUG GUU UUU ACC ACG UGG UG). (B) Comparison of dsDNA and dsRNA major groove interactions with α-helices. (i) Positioning of base pairs within major grooves. (ii-iv) Structural analysis of DNA in complex with Max homodimer (PDB code 1hlo) and RNA in complex with Rev peptide (PDB code 1etf) and TAV2b (PDB code 2zi0) reveals that the angles of entry of dimeric α-helical domains into dsDNA and dsRNA diverge. (C) Replacement of the leucine zipper region in Max homodimer with a flexible linker leads to CHF-Max (SI Appendix, Figs. S1 and S2). CHF-Max binding to Ebox DNA and RNA was evaluated in a fluorescence polarization assay. Kd(CHF-Max/RNA): 0.48 ± 0.22 μM; Kd(CHF-Max/DNA): 0.69 ± 0.21 μM. Error bars represent the standard deviation of three independent experiments.
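The hairpin architecture of the two Ebox sequences in the caption can be checked with a short script. The sketch below pairs the ends of each sequence inward and counts consecutive Watson-Crick pairs; it is a simple consistency check, not a folding prediction, and the helper name `split_hairpin` is hypothetical. The E-box hexamer CACGTG is taken here as the Max binding site, which is an assumption, since the text only states that the hairpin encompasses the site.

```python
# Watson-Crick pairs for DNA (A-T) and RNA (A-U); G-U wobble pairs are excluded.
WATSON_CRICK = {("A", "T"), ("T", "A"), ("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}


def split_hairpin(seq: str) -> tuple[str, str, str]:
    """Split a hairpin into (5' stem, loop, 3' stem) by pairing ends inward."""
    i, j = 0, len(seq) - 1
    while i < j and (seq[i], seq[j]) in WATSON_CRICK:
        i += 1
        j -= 1
    return seq[:i], seq[i:j + 1], seq[j + 1:]


# Sequences from the Figure 2 caption, without the 5'-FAM label and spaces.
EBOX_DNA = "CACCACGTGGTTTTTACCACGTGGTG"
EBOX_RNA = "CACCACGUGGUUUUUACCACGUGGUG"

for name, seq in (("Ebox DNA", EBOX_DNA), ("Ebox RNA", EBOX_RNA)):
    stem5, loop, stem3 = split_hairpin(seq)
    print(f"{name}: {len(stem5)} bp stem, loop {loop}, 3' arm {stem3}")
```

By this count both sequences form an 11-base-pair stem closed by a four-nucleotide loop, so the DNA and RNA hairpins differ only in backbone and T versus U.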

Comparison of the A-form and B-form configurations in nucleic acid-protein complexes reveals the difference between RNA and DNA recognition by bHLH proteins in their respective major grooves. Our structural analyses used DNA in complex with Max homodimer (PDB code 1hlo)[24] and RNA in complex with Rev peptide (PDB code 1etf)[14] and TAV2b (PDB code 2zi0).[25] The geometries of the B-form versus A-form helices dictate that the base pairs lie at the center of the helical DNA major groove but towards the edge of the minor groove in the RNA duplex (Fig. 2Bi). The Hoogsteen surface in the DNA major groove can be readily accessed by side chains projecting from the α-helix backbone, but a similar hydrogen-bonding contact with the nucleobase requires the α-helix to bury itself deeper into the RNA major groove.[14] This critical aspect in the α-helix-nucleobase recognition of DNA and RNA major grooves is illustrated by complexes of Max-DNA and Rev-RNA (Fig. 2Bii). The Rev α-helix is buried deeper into the major groove than the Max α-helix.

The relative orientation of the α-helix positioned into the A-form major groove is distinct from that of the DNA B-form axis. This geometrical difference is most clearly observed when two α-helices are positioned into neighboring major grooves of RNA and DNA. A set of non-parallel dimeric α-helices would adopt inverted acute angles to access contiguous DNA as opposed to RNA major grooves (Fig. 2Biii). The DNA-binding region of Max emerges from a rigid helix-loop region (Fig. 2Biv) to complex with DNA. However, the leucine zipper dimerization domain imposes an angle restriction that hinders complex formation with RNA. There are limited examples of dimeric α-helices that bind RNA; our search of the literature and structural data repositories yielded one example of a Y-shaped α-helical motif from tomato aspermy virus protein 2b (TAV2b) bound to an siRNA duplex (Fig. 2Biv and Fig. 3).[25] TAV2b α-helices mimic the trajectory of Rev α-helix binding geometry in A-form RNA and adopt the wider entry angle required for RNA complexation.

Figure 3.

Design of the TAV2b-derived crosslinked helix fork (CHF) scaffold. (A) From wild-type TAV2b protein, the α1 helix was retained and truncated to an 8-residue sequence containing R26-R33 (yellow and blue ribbon). The individual α1-derived peptides were covalently crosslinked to obtain the CHF scaffold. (B) Side-chain interactions of R26-R33 with duplex RNA from the native complex, PDB 2zi0. Black dashes represent side-chain interactions with the RNA phosphate backbone (cyan). Red wedges represent Hoogsteen hydrogen bonds between amino acid side chains and nucleobases (red dotted box). (C) Binding affinities of TAV2b-α1Flu and CHF-1Flu for designed oligonucleotide hairpins, RNA 1 and DNA 1. TAV2b-α1Flu and CHF-1Flu sequences are listed in brackets. Yellow highlights on helices and amino acid sequences represent residues involved in Hoogsteen recognition. Nle = Norleucine. R1 = Fluorescein-β-Alanine. R2 = Fluorescein-β-Alanyl-glycine. R3 = Acetyl-tyrosine. All peptide C-termini are carboxamides. Designed hairpin oligonucleotides RNA 1 and DNA 1 are shown with the oligonucleotide sequences listed in Table S1. (D) Comparison of monomeric and dimeric peptide recognition of RNA 1. Binding affinities of fluorescein-labeled peptides for RNA 1 as measured in a fluorescence polarization assay. HBS-1 = hydrogen bond surrogate stabilized Peptide-1. Ac = Acetyl. Flu = Fluorescein. X = 4-pentenoic acid. G* = N-allylglycine. Φ = melamine. Error bars represent the standard deviation of three independent experiments.

Our analysis suggests that the Max dimer could be engineered to bind RNA if the angle between the major groove binding Max α-helices was modified by the replacement of the leucine zipper domain with artificial crosslinkers. We crosslinked the DNA binding basic domain of Max with a flexible GlyGlyCys-benzyl crosslinker to span the ~16 Å distance between major grooves (Supplementary Information Fig. S1). We modified the C-terminus with cysteine residues and crosslinked the two α-helices with 1,3-dibenzyl bromide to obtain a crosslinked helix fork (CHF) from the Max sequence (CHF-Max). The chemical structure of CHF-Max is depicted in Fig. S2. Titration of CHF-Max with Ebox DNA and Ebox RNA shows that the zipper-less synthetic motif binds the RNA and DNA sequences with similar affinity (KD ≈ 0.5 and 0.7 μM, respectively) in a fluorescence polarization assay (Fig. 2C). These results support our hypothesis that the rigidity of the zipper domain restricts the entry angle required for RNA binding, but that introduction of a flexible linker affords RNA binding.

Crosslinked Helix Fork that mimics an RNA binding viral protein

Encouraged by our preliminary success in converting the Max homodimer into an RNA binder, we hypothesized that CHFs may be developed as a general class of encodable scaffolds for duplex RNA recognition by mimicking known bHLH and bZIP DNA binding motifs and modulating the angle required for RNA binding. As a first step towards this larger goal, we sought to identify the determinants of sequence selectivity in interactions between dsRNA and CHF motifs. We began by developing synthetic mimics of a known dimeric dsRNA binder (TAV2b) to analyze their potential to recognize duplex RNA. TAV2b consists of two segmented α-helical regions, defined as α1 and α2, that homodimerize in the major groove of the siRNA duplex (Fig. 3A). The binding motif observed in the two α1 helices closely resembles that of a dimeric bZIP/bHLH transcription factor but without the leucine zipper domain. We generated a crosslinked mimic of the α1 dimer by excising the α2 region and connecting the α1 dimer with a dibenzyl linker (TAV2b-α1).[26] We also developed a minimal TAV2b mimic by crosslinking the RNA-contacting 8-mer sequence RKRHKLNR (CHF-1). While the native peptide-RNA complex features several electrostatic contacts with the negatively charged RNA backbone, it contains two hydrogen bonding interactions on the Hoogsteen face of two guanine bases in the duplex (5’-GC-3’) sequence (Fig. 3B). CHF-1 captures these nucleobase contacts, which are critical for specificity in protein-nucleic acid recognition.[27] The minimal RNA-binding homodimeric region in TAV2b is separated by a distance of 19 Å spanning the major groove; we deployed a cis-stilbene linker to traverse the requisite distance for the construction of CHF-1 (Fig. 3C and Supplementary Information, Fig. S1).

We synthesized fluorescein-linked analogs of TAV2b-α1 and CHF-1 (TAV2b-α1Flu and CHF-1Flu, Supplementary Information, Fig S3 and Table S1) and analyzed their binding to a hairpin RNA sequence derived from TAV2b’s cognate siRNA duplex (RNA 1). A fluorescence polarization (FP) assay was developed to probe the peptide-RNA binding affinities. The 8mer CHF-1Flu dimer binds RNA 1 with a dissociation constant of 210 ± 15 nM; in comparison, the longer 21mer native dimer, TAV2b-α1Flu, binds RNA 1 with Kd = 17 ± 1 nM (Fig. 3C). The higher binding affinity of TAV2b-α1Flu to RNA 1 is not surprising as the longer peptide contains an additional 8 cationic residues that are not present in CHF-1Flu. The additional cationic side chains potentially form non-specific electrostatic interactions with the oligonucleotide phosphate groups.
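The affinity gap between the minimal and the longer construct can also be expressed as a free energy difference. The sketch below converts the two reported dissociation constants into standard binding free energies with ΔG° = RT ln(Kd / 1 M). The temperature of 298.15 K is an assumption made here for illustration, since the assay temperature is not stated in this section, and the function name is hypothetical.

```python
import math

R_KCAL_PER_MOL_K = 1.987204e-3  # gas constant in kcal/(mol*K)
ASSUMED_TEMPERATURE_K = 298.15  # assumption: assay temperature is not given in the text


def binding_free_energy(kd_molar: float, temperature_k: float = ASSUMED_TEMPERATURE_K) -> float:
    """Standard binding free energy (kcal/mol) relative to a 1 M standard state."""
    if kd_molar <= 0:
        raise ValueError("Kd must be positive")
    return R_KCAL_PER_MOL_K * temperature_k * math.log(kd_molar)


KD_CHF1_RNA1_M = 210e-9   # CHF-1Flu with RNA 1 (reported)
KD_TAV2B_RNA1_M = 17e-9   # TAV2b-alpha1Flu with RNA 1 (reported)

dg_chf1 = binding_free_energy(KD_CHF1_RNA1_M)
dg_tav2b = binding_free_energy(KD_TAV2B_RNA1_M)
ddg = dg_chf1 - dg_tav2b  # positive value: CHF-1Flu binds more weakly

print(f"dG CHF-1Flu:  {dg_chf1:.2f} kcal/mol")
print(f"dG TAV2b-a1:  {dg_tav2b:.2f} kcal/mol")
print(f"ddG:          {ddg:.2f} kcal/mol")
```

At the assumed temperature, the roughly 12-fold difference in Kd corresponds to about 1.5 kcal/mol, a modest penalty for removing 13 residues per helix, including the eight additional cationic residues.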

The dimeric helix is required for high affinity duplex RNA recognition

To ascertain that the dimeric nature of CHF-1Flu is essential to its RNA recognition properties, we prepared a monomeric unconstrained peptide (Peptide-1). Peptide-1 showed moderate affinity towards RNA 1 (Kd = 15 ± 4.2 μM), which corresponds to a 75-fold decrease in affinity compared to CHF-1 (Fig. 3D and Fig. S4). We hypothesized that the conformational flexibility of the unconstrained peptide may be contributing to its lower affinity.[28] We incorporated a conformational constraint – a hydrogen bond surrogate – in the backbone of the peptide to promote α-helicity.[29] However, HBS-1 (Kd = 16 ± 0.8 μM) did not lead to an enhancement in hairpin oligonucleotide recognition. We next tested whether conjugation of a melamine group to the peptide, to obtain Peptide-Mel, would provide enhanced affinity. Melamine has been proposed to engage the U-U noncanonical pair in the CUG bulge[9g] and to improve affinity and recognition for RNA substrates.[30] Unfortunately, the melamine modification also did not result in substantial improvement in hairpin oligonucleotide affinity (Peptide-Mel, Kd = 8.4 ± 0.5 μM) (Fig. 3D). Overall, these results suggest that the dimeric motif has a distinct advantage over the monomeric constructs for duplex RNA targeting.

The RNA binding specificity of crosslinked helix forks can be tuned

CHF-1 is designed to contact the Hoogsteen face of a 5’-GC-3’ stretch. We hypothesized that this peptide may bind to a CUG repeat sequence featuring multiple 5’-GC-3’ regions with higher affinity because a higher population of the cognate binding site is presented in the repeat sequence. Triplet repeat RNA and DNA sequences have been implicated in several human genetic diseases; the CUG repeat RNA is known to be critical for the pathogenesis of myotonic dystrophy type 1.[31] The CUG repeat RNA, r(CUG)10, and RNA 1 (Fig. S5 and Table S2) differ by one nucleotide in spacing between the two sets of diagonal G’s recognized by the peptide dimer – the 5’-GC-3’ stretches are spaced by five bases in RNA 1 and by four bases in r(CUG)10. CHF-1 contains a flexible region attached to cis-stilbene, which we predicted would allow recognition of a designed hairpin featuring CUG repeats. As expected from our binding model, CHF-1Flu bound r(CUG)10 with improved affinity (Kd = 74 ± 17 nM) compared to RNA 1 (Fig. 4). Conversely, CHF-1Flu binds the DNA analog d(CTG)10 with six-fold lower binding affinity (Kd = 440 ± 79 nM).
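The fold differences quoted for CHF-1Flu follow directly from ratios of the reported dissociation constants. The sketch below makes that arithmetic explicit; the dictionary layout and function name are hypothetical, and the ratios ignore the reported standard deviations.

```python
# Reported Kd values for CHF-1Flu, in nM.
KD_CHF1_NM = {
    "r(CUG)10": 74.0,
    "RNA 1": 210.0,
    "d(CTG)10": 440.0,
}


def fold_weaker(reference: str, other: str, kds: dict[str, float]) -> float:
    """How many times weaker `other` binds relative to `reference` (Kd ratio)."""
    return kds[other] / kds[reference]


for target in ("RNA 1", "d(CTG)10"):
    ratio = fold_weaker("r(CUG)10", target, KD_CHF1_NM)
    print(f"{target} binds {ratio:.1f}-fold more weakly than r(CUG)10")
```

The d(CTG)10 ratio of about 5.9 matches the six-fold figure in the text, and RNA 1 binds roughly 2.8-fold more weakly than the CUG repeat.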

Figure 4.

Analysis of specific and non-specific contacts in CHF-1·RNA recognition. (A) CHF-1 contains residues that potentially engage the RNA Hoogsteen surface along with two solvent-exposed lysine residues (K27 and K30). In CHF-2, solvent-exposed cationic residues are replaced with alanine. In CHF-3, residues (R26 and H29) that potentially engage the RNA bases have been substituted with alanine. Models were generated with UCSF ChimeraX using PDB 2zi0. (B) Binding affinities of fluorescein-labeled CHF compounds as determined in a fluorescence polarization assay. Bar graph of affinities represented in log(Ka). (C) Kd values of CHF compounds. nb = no binding observed. N.D. = not determined. (D) Hairpin nucleotide sequences used in the analysis. Error bars represent the standard deviation of three independent experiments.
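Because the bar graph reports affinities as log(Ka) while the text quotes Kd, it can help to convert between the two. The sketch below uses Ka = 1/Kd, so log10(Ka) = -log10(Kd) with Kd in molar units. It assumes a base-10 logarithm, which the caption does not specify, and the function name is hypothetical.

```python
import math


def log_ka_from_kd(kd_molar: float) -> float:
    """Return log10 of the association constant (M^-1) for a given Kd (M)."""
    if kd_molar <= 0:
        raise ValueError("Kd must be positive")
    return -math.log10(kd_molar)


# Kd values for CHF-1Flu quoted in the main text, converted to molar.
KD_CHF1_M = {
    "r(CUG)10": 74e-9,
    "RNA 1": 210e-9,
    "d(CTG)10": 440e-9,
}

for target, kd in KD_CHF1_M.items():
    print(f"{target}: log10(Ka) = {log_ka_from_kd(kd):.2f}")
```

On this scale a taller bar means tighter binding, and one log unit corresponds to a ten-fold change in Kd.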

CHF-1Flu binds to RNA 1 and r(CUG)10 with nanomolar affinity, providing robust ligands for these hairpin RNA oligonucleotides. Our overall goal in this project is to develop a scaffold that allows sequence-specific targeting of duplex RNA. This goal requires that the designed CHF can detect single base mismatches. We determined the potential of CHF-1Flu to discriminate between two closely related RNA sequences that differ by one contact nucleotide and tested the potential of this ligand to differentiate between r(CUG)10 and r(CUG)Mut. r(CUG)Mut is a single mismatch control in which one target guanine nucleotide is changed to an adenine. The sequence of r(CUG)Mut contains an A-U base pair in place of the G-C pair in contact with Arg-26 (Fig. 4). The full set of oligonucleotide sequences used in the current analysis is listed in Supplementary Information Table S2; the predicted secondary structures are shown in Fig. S5.

CHF-1Flu bound r(CUG)Mut with roughly the same affinity as r(CUG)10 (Fig. 4). Based on this result, we conjectured that the specific peptide side chain – nucleotide interactions are being countered by non-specific interactions of solvent-exposed cationic side chains. The TAV2b α1 helix contains two lysine residues, K27 and K30, which project away from the RNA major groove and into the solvent but could engage in non-specific interactions, distorting the binding profile of a minimal scaffold (Fig. 4A). To test the impact of these cationic side chains, we synthesized CHF-2Flu, in which both lysine residues are replaced with alanine. CHF-2Flu proved to be >50-fold more selective for r(CUG)10 than for the single base mismatch sequence r(CUG)Mut. Surprisingly, CHF-2Flu also showed decreased binding for RNA 1, DNA 1, and d(CTG)10 hairpins, suggesting potential non-specific interactions between CHF-1Flu and these oligonucleotides (Fig. 4). We utilized a fluorescence polarization assay with fluorescein-labeled peptides for binding affinity analysis. To rule out any dye-specific results, we confirmed the binding interaction between CHF-2 and r(CUG)10 using microscale thermophoresis (MST), in which r(CUG)10 was 5’-labeled with Cy5. In the MST assay, the unlabeled CHF-2 bound Cy5-r(CUG)10 with Kd = 3.4 ± 0.6 μM, in agreement with the binding constant obtained from the FP assay (Fig. S6).

The premise of the designed dimeric proteomimetics is that side chains engage the Hoogsteen face of two base-paired guanine nucleotides as part of r(CUG)10 recognition. We designed an alanine mutant, CHF-3Flu, where R26 of CHF-1Flu was replaced with alanine (Fig. 4A). This negative control shows a loss of binding affinity to r(CUG)10 and RNA 1, demonstrating the importance of interactions between the Hoogsteen surface of RNA and designed peptides. Arginine and asparagine/glutamine side chains can make two hydrogen bonding contacts with the Hoogsteen face of G and A, respectively – these hydrogen bonding interactions are often observed in nucleic acid-protein complexes.[4, 32] TAV2b makes two side chain contacts with two guanine Hoogsteen surfaces of the siRNA duplex. Besides R26, TAV2b uses a water-mediated interaction with histidine-29.[25] We determined the impact of the H29/guanine contact by generating a mutant (CHF-4) in which H29 has been replaced with an alanine residue (Figure S3). CHF-4 binds (CUG)10 with an affinity (Kd = 2.2 ± 0.1 μM) similar to that of CHF-2, suggesting that H29 is not a strong contributor to binding RNA in TAV2b or the designed CHFs (Figure S4). This result further supports the idea that scaffolds bearing guanidine groups would be better hydrogen bonding partners for the guanine Hoogsteen face. We are pursuing analogs of arginine and other nonnatural side chains to further improve the specificity of CHFs; the results of these studies will be described in due course.

RNA footprinting analysis supports duplex RNA binding

The CHF scaffold was rationally designed to bind double-stranded RNA. To support our hypothesis that the CHF scaffold engages the double-stranded region of RNA 1 over the loop region or an unfolded segment, we performed hydroxyl radical footprinting with 5’-32P-radiolabeled RNA 1 and CHF-2. We resolved the cleavage pattern of RNA 1 on a denaturing electrophoresis gel (Fig. 5). The footprinting pattern indicates that regions (i), (ii), and (iv) of the hairpin RNA were not cleaved in the presence of CHF-2, as expected from the binding of CHF-2 to the stem region of the hairpin RNA. Consistent with the design, the footprinting result revealed that CHF-2 did not bind at the apical tetraloop, region (iii).

Figure 5.

Hydroxyl radical footprinting of 5’-32P-radiolabeled RNA 1 with CHF-2. Lane 1, 5’-32P-radiolabeled RNA 1; lane 2, NaHCO3 hydrolysis ladder; lane 3, RNase T1 standard; lane 4, hydroxyl radical (·OH); lanes 5-20, 2-fold serial dilution of CHF-2 from 1 mM to 0.03 μM. Footprints corresponding to the RNA 1 sequence are labeled with Roman numerals (i) to (iv) and red dashed boxes. Overlay of CHF-2 corresponds to the binding site for RNA 1.
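The concentration range in the caption is consistent with a 2-fold dilution across the sixteen lanes 5-20. The sketch below reproduces that series; it is a check of the arithmetic only, and the function name and the direction of the dilution across the lanes (highest concentration in lane 5) are assumptions.

```python
def serial_dilution(start: float, factor: float, n_points: int) -> list[float]:
    """Concentrations for an n-point serial dilution, starting at `start`."""
    if factor <= 1 or n_points < 1:
        raise ValueError("factor must be > 1 and n_points must be >= 1")
    return [start / factor**i for i in range(n_points)]


START_UM = 1000.0          # 1 mM CHF-2, expressed in uM
LANES = list(range(5, 21)) # lanes 5-20 inclusive, 16 lanes

concentrations = serial_dilution(START_UM, 2.0, len(LANES))
for lane, conc in zip(LANES, concentrations):
    print(f"lane {lane:2d}: {conc:10.4f} uM")
```

Fifteen successive halvings of 1 mM give approximately 0.0305 μM in the last lane, in agreement with the 0.03 μM endpoint stated in the caption.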

A U·U mismatch is not required for dsRNA binding by helix fork mimics

Early structural efforts to characterize recognition of RNA hairpins by α-helical motifs noted that the RNA major groove must significantly widen to accommodate the α-helix.[33] For example, the Rev-RRE structure featured non-Watson-Crick base pairs and bulges to potentially loosen the RNA hairpin stem.[14] Based on these analyses, we conjectured that U-U base pairs in our designed hairpins (Figure 4) may be aiding the binding of the CHF. A requirement for mismatched base pairs in the dsRNA would limit the generality of the CHF scaffold. We were encouraged that CHF-Max can bind an RNA hairpin that contains only Watson-Crick base pairs in its stem region (Figure S5). We tested the binding of CHF-2 to an analog of (CUG)10 in which the U-U mismatches have been converted to A-U pairs; this hairpin is denoted as (CAGCUG)4 in Figure 6. CHF-2 binds (CAGCUG)4 with an affinity similar to that for (CUG)10, suggesting that the helix fork motif can recognize dsRNA with and without full Watson-Crick base pairing in the duplex RNA.

Figure 6.

Potential of the crosslinked helix fork to target duplex RNA with and without bulges or mismatched base pairs. (CAGCUG)4 represents a fully base-paired analog of r(CUG)10. The fluorescence polarization binding assay with CHF-2 provides similar dissociation constants for both RNAs: Kd (CUG)10 = 2.2 ± 0.1 μM; Kd (CAGCUG)4 = 5.9 ± 4.7 μM. Error bars represent the standard deviation of three independent experiments.
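One rough way to judge whether the two dissociation constants in this caption are distinguishable is to compare their difference with the combined standard deviation. The sketch below assumes independent, roughly normal errors and adds them in quadrature; this is an illustrative check introduced here, not a statistical analysis reported in the study, and the function name is hypothetical.

```python
import math


def difference_with_sd(a: float, sd_a: float, b: float, sd_b: float) -> tuple[float, float]:
    """Return (b - a) and its standard deviation, assuming independent errors."""
    return b - a, math.hypot(sd_a, sd_b)


# Reported Kd values for CHF-2, in uM.
KD_CUG10, SD_CUG10 = 2.2, 0.1
KD_CAGCUG4, SD_CAGCUG4 = 5.9, 4.7

diff, sd = difference_with_sd(KD_CUG10, SD_CUG10, KD_CAGCUG4, SD_CAGCUG4)
print(f"difference = {diff:.1f} uM, combined SD = {sd:.1f} uM")
print("within one combined SD" if abs(diff) < sd else "beyond one combined SD")
```

The 3.7 μM difference is smaller than the combined standard deviation of about 4.7 μM, which is dominated by the uncertainty of the (CAGCUG)4 measurement.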

Conclusion

We have rationally designed a dsRNA-binding scaffold, dubbed crosslinked helix fork (CHF), that mimics a protein tertiary structure often engaged in double-stranded DNA recognition. Our efforts complement vigorous research undertakings in chemical biology to develop or screen ligands for RNA. Our studies began with a question: why do dimeric basic helix-loop-helix (bHLH) transcription factor family proteins, which have been well characterized as duplex DNA binders, not recognize double-stranded RNA? We analyzed RNA and DNA binding by α-helices to probe the geometrical parameters that allow the binding of A- and B-form nucleic acid duplexes by protein secondary and tertiary helical structures. We learned that bHLHs can be made to bind dsRNA with near equal affinity to dsDNA if the crossing angle between the dimeric α-helical motif enforced by the leucine zipper domain is relaxed. In this study, we incorporated a flexible linker between the individual bHLH peptide units to test the hypothesis that RNA binding with engineered bHLH motifs is possible. In ongoing projects, we are designing precise crosslinkers to achieve selective DNA or RNA recognition. In this initial effort, we showed that the specificity and affinity of the scaffold for dsRNA can be tuned through judicious modifications of the amino acid sequence, net charge, and crosslinker. Given the growing importance of RNA as a therapeutic target, we envision broad potential for the CHF scaffold capable of recognizing hairpin regions in RNA tertiary structure. For this potential to be met, the current CHF design needs to be improved. Our model dictates helical motifs binding in the RNA major groove. Synthetic methods previously described by our group for stabilizing α-helices[34] or replacing peptides with small molecule topographical helix mimics[35] may lead to enhanced affinity, specificity, and cellular activity. We also anticipate that the replacement of canonical amino acid side chains with nonnatural functional groups or nucleotides to form triplex motifs would lead to improved recognition of the RNA Hoogsteen surface.[36] The CHF scaffold provides a platform for these and other modifications to develop a new class of programmable ligands for duplex RNA.

Supplementary Material

Supinfo

Acknowledgements

We thank the NIH (R35 GM130333) for support of this work.

Footnotes

Supporting information for this article is given via a link at the end of the document.

References
