Nucleic Acids Research. 2024 Mar 7;52(7):e37. doi: 10.1093/nar/gkae126

Identification of G-quadruplex-interacting proteins in living cells using an artificial G4-targeting biotin ligase

Ziang Lu 1,c, Shengjie Xie 2,c, Haomiao Su 3, Shaoqing Han 4, Haiyan Huang 5, Xiang Zhou 6,7
PMCID: PMC11040147  PMID: 38452210

Abstract

G-quadruplexes (G4s) are noncanonical nucleic acid structures pivotal to cellular processes and disease pathways. Deciphering G4-interacting proteins is imperative for unraveling the biological significance of G4s. In this study, we developed a G4-targeting biotin ligase named G4PID and assessed its binding affinity and specificity both in vitro and in vivo. Capitalizing on G4PID, we devised a tailored approach termed G-quadruplex-interacting proteins specific biotin-ligation procedure (PLGPB) to profile G4-interacting proteins. Implementing this strategy in live cells, we identified a cohort of 149 potential G4-interacting proteins, which exhibit multifaceted functionalities. We then substantiated the direct binding of 7 candidate G4-interacting proteins (SF3B4, FBL, PP1G, BCL7C, NDUV1, ILF3, GAR1) in vitro. Remarkably, we verified that splicing factor 3B subunit 4 (SF3B4) binds preferentially to the G4-rich 3′ splice site and that the corresponding splicing sites are modulated by the G4 stabilizer PDS, indicating a regulatory role of G4s in mRNA splicing. The PLGPB strategy can biotinylate multiple proteins simultaneously, providing an opportunity to map the G4-interacting protein network in living cells.

Graphical Abstract


Introduction

G-quadruplexes (G4s) are four-stranded nucleic acid structures formed by sequences containing tandem guanine repeats. With the advancement of G4 detection tools, including G4 imaging (1–3) and G4 sequencing technologies (4–6) such as G4-seq, G4 ChIP-seq and G4P-ChIP, the presence of G4s has been verified not only in the human genome and transcriptome but also in other species. Putative G-quadruplex forming sequences (PQSs) are particularly abundant in gene regulatory regions such as promoters (7), 5ʹ untranslated regions (8), and splicing sites (9). Emerging evidence has linked these endogenous G4s to pathological processes such as cancer (10), immune disorders (11) and neurodegeneration (12). While significant progress has been made in mapping G4 structures, the mechanisms through which these structures act remain to be investigated more comprehensively. Exploring G4-interacting proteins holds the promise of shedding light on the biological functions of G4s.

Recent advancements have yielded multiple methods for identifying G4-interacting proteins, broadly categorized into three types. The first method employs biotin-modified G4s, incubating them with cell lysates and subsequently purifying the complexes using streptavidin-coated beads (13). The second method utilizes a trifunctional pyridostatin (PDS) derivative, in which PDS is tethered to a photoreactive aliphatic diazirine group for crosslinking with adjacent proteins and to a click alkyne handle for introducing a biotin moiety for purification (14,15). The third approach involves systematically exploring particular G4-binding motifs such as the OB-fold-like subdomain (16) and the RGG motif (17), or searching for known protein-binding DNA or RNA sequences that exhibit regular guanine repeats (7). These methodologies have unveiled diverse G4-interacting proteins such as helicases (7,18,19), telomere-binding proteins (20–22), and epigenetic modulators (11,23,24). The development of novel enrichment methods holds the potential to expand the pool of G4-interacting proteins, complementing the existing methods. Given the dynamic nature of G4 formation in live cells, it is imperative to establish in vivo procedures for pinpointing G4-interacting proteins, enabling a more precise comprehension of the physiological function of G4s. Essential to this pursuit is the identification of G4 ‘readers’ that can operate in living cells. The conserved 23-aa domain of DHX36 (also known as RHAU), a helicase with unique G4 resolvase activity, emerges as a key player, recognizing G4s with high affinity and specificity (25,26).

Proximity labeling (PL) has emerged as a powerful technique for studying molecular interactions in living cells, including nucleotide-protein interactions (27). This method involves the use of promiscuous biotin ligases that enzymatically adenylate biotin molecules, forming a biotin-5-AMP intermediate (28). This reactive intermediate then diffuses from the enzyme to within approximately 20 nanometers, covalently labeling exposed lysine residues of nearby proteins. Consequently, only proteins in close proximity to the target are biotin-labeled, minimizing background labeling in the cellular environment. Enzymes designed for proximity labeling offer notable advantages, including high catalytic efficiency, small size, and programmable editing capabilities. Thus, a G4-targeting biotin ligase could serve as a robust probe for identifying G4-interacting proteins in live cells. Notably, miniTurbo, a promiscuous biotin ligase, is a promising candidate due to its high activity and compact molecular structure (29). In comparison to TurboID, miniTurbo is smaller and exhibits lower biotin affinity, allowing tighter control over labeling through modulation of the exogenous biotin incubation time.

In this study, we have crafted a bifunctional probe termed G4PID, which targets G4 structures while concurrently marking G4-interacting proteins. This is achieved through the fusion of the G4-binding domain of RHAU (RHAU23) with miniTurbo. Studies of G4–protein binding have shown that the binding modes vary: a G4 can fit into a protein cavity, and a protein can dock into a G4 groove (30) or stack on the terminal G-quartet (31). In order to minimize the influence of the probe itself on the original G4-binding proteins, we chose RHAU, which binds only one terminal G-quartet, leaving the G-quartet at the other terminus free for proteins with a similar binding mode (32). Firstly, we demonstrated the effectiveness of G4PID in targeting and biotinylating the G4 antibody (BG4) (2) in vitro. Harnessing the potential of this probe, we devised a proximity labeling mediated G-quadruplex-interacting proteins specific biotin-ligation procedure (PLGPB). Applying this PLGPB method to HEK293T cells, we unveiled 149 candidate G4-interacting proteins under physiological conditions. Seven selected novel G4-binding proteins were characterized in vitro to validate the accuracy of PLGPB. Remarkably, we identified the splicing factor 3b subunit 4 (SF3B4) as a noteworthy player, demonstrating a pronounced preference for binding to G4-enriched RNA splicing sites and modulating alternative RNA splicing in a G4-dependent manner. Collectively, our findings underscore the critical role of G4 structures in cellular processes and illuminate a potential avenue for the regulation of RNA splicing through G4-dependent mechanisms.

Materials and methods

Plasmid construction

The G4PID coding sequence was synthesized by GENEWIZ (Suzhou, China). All plasmids used were constructed via Gibson assembly with the pEASY®-Basic Seamless Cloning and Assembly Kit (TransGen Biotech, cat. #CU201). To generate the eukaryotic expression vector, the IRES and eGFP coding sequences of the pIRES2-EGFP plasmid were deleted using the Q5® Site-Directed Mutagenesis Kit (NEB, cat. #E0554S). The G4PID coding sequence was inserted into pET-24b (+) between the NdeI and HindIII sites to construct the prokaryotic expression plasmid (pET24b-G4PID), and into the IRES- and eGFP-deleted pIRES2-eGFP plasmid between the NheI and EcoRI sites to construct the eukaryotic expression plasmid (p-G4PID). An HA-tag was introduced into p-G4PID by PCR amplification to form p-G4PID-HA. The miniTurbo sequence was obtained from miniTurbo-V5-pRS415 (Addgene plasmid cat. #107168) by PCR amplification and used to construct the eukaryotic expression plasmid (p-miniTurbo). An HA-tag was introduced into p-miniTurbo by PCR amplification to form p-miniTurbo-HA. To obtain the cDNAs of selected proteins, total RNA of HeLa cells was extracted with TRIzol™ Reagent (Ambion, cat. # 15596018), reverse transcribed with M-MuLV Reverse Transcriptase (NEB, cat. #M0253L) and then amplified with PrimeSTAR HS Premix (Takara, cat. #R040A) using the corresponding primers (Supplementary Table S2). The cDNAs were inserted into pET24b(+) between the NdeI and HindIII sites or into the pCold™ TF DNA plasmid (Takara, cat. #3365) between the NdeI and EcoRI sites via Gibson assembly. All the ligated plasmids were confirmed by Sanger sequencing (Beijing Tsingke Biotech Co., Ltd.). The p-G4PID and p-G4PID-HA expression vectors reported here will be deposited at Addgene.

Recombinant protein expression and purification

For the expression of recombinant protein, the pET24b-G4PID plasmid was transformed into the Escherichia coli strain BL21 (DE3). Cells were grown in LB medium containing 50 μg/ml kanamycin at 37°C until the OD600 reached 0.6. Cultures were cooled down to 16°C, and 0.5 mM isopropyl β-d-1-thiogalactopyranoside (IPTG) was added to induce expression for 16 h. Cell pellets were collected by centrifugation, then suspended in ice-cold lysis buffer (20 mM Tris–HCl, pH 7.5, 500 mM NaCl, 10% glycerol, 10 mM imidazole, 1 mM DTT and 1 mM PMSF) and sonicated for 30 min (5 s on and 25 s off) with ice cooling. The lysate was clarified by centrifugation at 10000 rpm for 2 h. The hexahistidine fusion protein in the supernatant was purified using a HisTrap FF prepacked column (GE, cat. #17525501) and desalted with a HiTrap Desalting prepacked column (GE, cat. #17140801). The purified proteins were stored at −80°C in buffer containing 10 mM Tris–HCl, pH 7.5, 150 mM NaCl, 50% glycerol and 1 mM DTT. The purified proteins were confirmed by SDS-PAGE.

Circular dichroism (CD) spectroscopy

Circular dichroism spectroscopy was used to determine whether each sequence formed a G4 structure. The DNA and RNA oligonucleotides (10 μM) were dissolved in 20 mM Tris–HCl (pH 7.5 at 25°C), 150 mM KCl. The oligonucleotides were heated at 95°C for 5 min and annealed to 25°C at a rate of −1°C/min. CD experiments were performed at 25°C using a Jasco-810 spectropolarimeter (Jasco, Easton, MD, USA). The bandwidth was set to 1.0 nm and the path length was 1.0 mm. CD spectra were recorded from 320 to 220 nm with a scanning speed of 0.5 nm/s. CD melting profiles were recorded at 264 nm as a function of temperature. The temperature range was set from 4°C to 95°C. The heating rate was set to 1°C/min and the CD values were collected every 1°C. All CD spectra were corrected for the blank buffer signal and measured in three replicates.

Electrophoretic mobility shift assay (EMSA)

The FAM-labelled oligonucleotides (detailed in Supplementary Table S1) were heated at 95°C for 5 min in 20 mM Tris, pH 7.5 buffer with 150 mM KCl and annealed to 25°C at a rate of −1°C/min. For G4PID, the binding reactions were performed in a 10 μl mix containing 20 mM Tris–HCl (pH 7.5 at 25°C), 50 mM KCl, 10 μg/ml BSA, 100 μg/ml yeast tRNA, 0.01% Igepal CA-630 and 5% glycerol. FAM-labeled oligonucleotides (40 nM) and recombinant G4PID at increasing concentrations were incubated at 4°C for 1 h. The sample was loaded on a 10% non-denaturing polyacrylamide gel (79:1 (wt/wt) acrylamide:bisacrylamide) containing 70 mM KCl and electrophoresed in 1× TBE buffer at 4°C for 2 h. The bands were visualized with a ChemiDoc MP (Bio-Rad) and quantitated with the Image Lab software. The equilibrium dissociation constant (Kd) was determined by fitting the fraction of bound oligonucleotide (Y), assuming a 1:1 stoichiometry, against the free G4PID concentration (X) to the equation: Y = X/(Kd + X). For the candidate G4-binding proteins, 10 nM annealed oligos were incubated with 100 nM recombinant proteins at 37°C for 30 min. For the PDS competition EMSA, 10 nM annealed oligos were incubated with 200 nM SF3B4 and 10 nM PDS at 4°C for 30 min. The sample was separated on a 10% non-denaturing polyacrylamide gel (79:1 (wt/wt) acrylamide:bisacrylamide) containing 10% PEG200 by electrophoresis at 100 V in 0.5× TBE buffer for 40 min.
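The fit to Y = X/(Kd + X) can be illustrated with a minimal sketch. It is not the authors' fitting procedure, which the text does not specify beyond the equation: the golden-section search, the function names, the search bounds and the titration values are all assumptions made for illustration, and the bound fractions are generated from the model rather than taken from the paper.

```python
import math

def fraction_bound(free_protein_nm, kd_nm):
    """1:1 binding isotherm from the text: Y = X / (Kd + X)."""
    return free_protein_nm / (kd_nm + free_protein_nm)

def sum_squared_error(kd_nm, conc_nm, bound):
    """Squared residuals between measured and modelled bound fractions."""
    return sum((fraction_bound(x, kd_nm) - y) ** 2 for x, y in zip(conc_nm, bound))

def fit_kd(conc_nm, bound, lo_nm=1e-3, hi_nm=1e5, tol=1e-9):
    """Golden-section search over log10(Kd).

    Assumes the squared error has a single minimum between lo_nm and hi_nm.
    """
    ratio = (math.sqrt(5) - 1) / 2
    a, b = math.log10(lo_nm), math.log10(hi_nm)
    while b - a > tol:
        c = b - ratio * (b - a)
        d = a + ratio * (b - a)
        if sum_squared_error(10 ** c, conc_nm, bound) < sum_squared_error(10 ** d, conc_nm, bound):
            b = d  # minimum lies in [a, d]
        else:
            a = c  # minimum lies in [c, b]
    return 10 ** ((a + b) / 2)

# Hypothetical titration (not data from the paper): free protein in nM and
# bound fractions generated from the model itself with Kd = 15 nM.
conc = [0, 12.5, 25, 50, 100, 200, 400]
bound = [fraction_bound(x, 15.0) for x in conc]
print(round(fit_kd(conc, bound), 2))
```

Searching on a logarithmic scale keeps the fit stable across several orders of magnitude of Kd. With real gel quantifications, X should be the free rather than the total protein concentration, as the equation requires.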

Streptavidin–HRP blotting analysis

Proteins or cell lysates were denatured in 1× SDS loading buffer, separated on 10% SDS-PAGE gels and transferred to a PVDF membrane (Millipore). The PVDF membrane was then blocked in 5% BSA in TBST (Tris-buffered saline, 0.1% Tween 20) for 1 h at room temperature. The membrane was incubated with streptavidin-HRP in 5% BSA in TBST for 1 h at room temperature. Next, the membrane was washed five times with TBST for 5 min, followed by ECL detection with the BeyoECL Plus kit (Beyotime). The blot images were recorded with a ChemiDoc XRS+ imager (Bio-Rad).

G4PID mediated proximity labeling of BG4 in vitro

5 μl of annealed Bcl2 G4 (1 μM) was incubated with 2 μl of the G4-specific protein BG4 (4 μM) and 2 μl of the G4-nonspecific protein BSA at room temperature for 15 min, followed by the addition of 1.25 μl of G4PID (4 μM). Proximity biotinylation of the above complex was carried out in 10 μl of reaction buffer (30 mM Tris–HCl, pH 7.5, 150 mM KCl, 1 mM ATP, 50 μM biotin) at 37°C for 1 h. Next, the reaction was terminated by boiling in 1× SDS loading buffer for 10 min. The sample was divided into two parts for Coomassie staining and streptavidin–HRP blotting. The corresponding Bcl2 mutant oligonucleotide was also tested according to the above method.

Cell culture

HeLa and HEK293T cells were cultured in Dulbecco's modified Eagle's medium (DMEM; Gibco) supplemented with 100 units/ml penicillin, 100 μg/ml streptomycin sulfate, and 10% fetal bovine serum (Gibco) at 37°C in a humidified 5% CO2 incubator. MCF10A cells were cultured in DMEM/F12 + 5% HS + 20 ng/ml EGF + 0.5 μg/ml hydrocortisone + 10 μg/ml insulin + 1% NEAA + 1% P/S (Procell Life Science & Technology Co., Ltd) at 37°C with 5% CO2.

Immunofluorescence imaging

HeLa cells were transfected with pIRES2-HA-G4PID or pIRES2-HA-miniTurbo via Lipofectamine™ 3000 (Invitrogen, L3000015) and positive clones were selected by G418 treatment. Cells were seeded on glass-bottom dishes (20 mm) 24 h before being fixed with 4% paraformaldehyde and permeabilized with 0.3% Triton X-100. Cells were blocked with 5% BSA before being incubated with HA-Tag (C29F4) Rabbit mAb (Cell Signaling Technology, Inc., 3724S) and then with Anti-rabbit IgG (H + L), F(ab')2 Fragment (Alexa Fluor® 555 Conjugate) (Cell Signaling Technology, Inc., 4413). Cell nuclei were stained with DAPI. Images of stained cells were obtained using an Andor Revolution XD confocal laser scanning microscope.

Optimization of the proximity labelling efficiency of G4PID in live cells

HEK293T cells were grown in 6-well dishes to 80% confluency. Cells were transfected with the p-G4PID-HA or p-miniTurbo-HA plasmid using Lipofectamine 3000 (Thermo Scientific), and the medium was replaced with fresh DMEM after 6 h. The transfected cells were cultured for 24 h before labeling. Biotin (Sigma-Aldrich) was dissolved in DMSO as a 500 mM stock solution and added to warm DMEM to a final concentration of 500 μM. The cells were treated with biotin (500 μM) for 10, 30 or 60 min at 37°C. The labeling reaction was stopped by removing the biotin-containing medium and moving the cells onto ice, and the cells were washed five times with 1 ml ice-cold PBS. The cells were lysed with 1× RIPA lysis buffer (50 mM Tris, pH 7.4, 150 mM NaCl, 1% Triton X-100, 1% sodium deoxycholate, 0.1% SDS, 1× protease inhibitor cocktail and 1 mM PMSF) and analyzed by western blotting.
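The dilution from the 500 mM stock to the 500 μM working concentration follows C1·V1 = C2·V2. The short sketch below works through it. The function name and the 2 ml medium volume per well are assumptions for illustration; the text does not state the volume used.

```python
def stock_volume_ul(stock_mm, final_um, final_volume_ml):
    """Volume of stock (in microlitres) needed so that C1 * V1 = C2 * V2."""
    stock_um = stock_mm * 1000.0              # mM -> uM
    final_volume_ul = final_volume_ml * 1000.0  # ml -> ul
    return final_um * final_volume_ul / stock_um

# Assumed medium volume of 2 ml per well (not stated in the text).
medium_ml = 2.0
biotin_ul = stock_volume_ul(stock_mm=500, final_um=500, final_volume_ml=medium_ml)
dmso_percent = 100.0 * biotin_ul / (medium_ml * 1000.0)
print(biotin_ul, dmso_percent)
```

Because the stock is 1000-fold more concentrated than the working solution, the DMSO carried over is 0.1% (v/v) whatever the medium volume.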

Enrichment of biotinylated proteins

HEK293T cells were transfected with the p-G4PID or p-miniTurbo plasmid using Lipofectamine 3000. Labeled cells from 15 cm dishes were used as the input of each sample for proteomics MS analysis. The cell pellets were lysed in 1 ml 1× RIPA lysis buffer by pipetting several times and then incubated on ice for 10 min. The lysates were clarified by centrifugation at 13 000 × g for 10 min at 4°C. Streptavidin-coated magnetic beads (Pierce) were washed twice with RIPA buffer, and the total protein of each sample lysate was incubated with 250 μl of streptavidin magnetic beads at room temperature for 1 h. After enrichment, the beads were washed twice with 1 ml of RIPA lysis buffer, once with 1 M KCl (1 ml, 2 min, RT), once with 0.1 M Na2CO3 (1 ml, ∼10 s), once with 2 M urea in 10 mM Tris·HCl (pH 8.0) (1 ml, ∼10 s) and twice with 1 ml RIPA lysis buffer. 5% of the washed protein–streptavidin beads was taken to verify protein enrichment by silver staining and streptavidin–HRP blotting. The remaining samples were subjected to on-bead trypsin digestion for proteomics MS analysis.

Label-free quantitative proteomics data analysis

The protein–streptavidin beads were sent to SpecAlly (Wuhan, China) for label‐free quantitative liquid chromatography–tandem mass spectrometry (LC–MS/MS) to identify differential proteins. Briefly, the data were collected in SWATH/DIA mode (Sequential Window Acquisition of all Theoretical Mass Spectra/Data-Independent Acquisition). In the first round of detection, DDA mode was used to collect data to build a spectral database; in the second round, DIA mode was used to collect quantitative data on peptides and proteins for biostatistical and bioinformatics analysis. Missing values in each group were imputed using the nearest neighbour method after removing peptides with more than two missing values. For the G4PID groups, peptides with more than two missing values were removed. For the control groups, peptides with more than two missing values were filled with the minimum value, and peptides with one missing value were filled with an approximate value. The gene ontology enrichment analysis was conducted on DAVID (33); related biological process GO terms were classified and their p-values were calculated with the hypergeometric test in R.
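The hypergeometric test used for GO term enrichment can be written out explicitly. The sketch below is a Python restatement using only the standard library, not the authors' R script, and the gene counts in the example are hypothetical. In R, the same upper tail is given by `phyper(k - 1, n, M - n, N, lower.tail = FALSE)`.

```python
from math import comb

def hypergeom_upper_tail(k, M, n, N):
    """P(X >= k) for X ~ Hypergeometric.

    M: genes in the background, n: background genes annotated to the GO term,
    N: genes in the candidate list, k: candidate genes annotated to the term.
    """
    upper = min(n, N)
    favourable = sum(comb(n, i) * comb(M - n, N - i) for i in range(k, upper + 1))
    return favourable / comb(M, N)

# Hypothetical counts for illustration only (not values from the paper):
# 20000 background genes, 300 annotated to a term, 149 candidates, 12 hits.
p_value = hypergeom_upper_tail(k=12, M=20000, n=300, N=149)
print(f"{p_value:.3e}")
```

Exact integer arithmetic via `math.comb` avoids overflow and rounding problems for genome-sized backgrounds, at the cost of speed when many terms are tested.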

eCLIP-seq experiments and data analysis

Cells (∼20 million) were crosslinked with UV light (254 nm, 400 mJ/cm2) and then scraped off the dishes. The eCLIP-seq libraries were constructed as previously described with an antibody and sequenced on the Illumina HiSeq X Ten platform. The upstream eCLIP-seq data analysis pipeline was based on the published article (6) and included demultiplexing based on barcodes, two rounds of adapter trimming, removal of repetitive elements, mapping to the hg38 genome, removal of PCR duplicates based on the randomer, peak calling and normalization of peaks against the paired size-matched input. The peak sets of the two replicates were overlapped using bedtools. Considering that a G-quadruplex (G4) sequence is longer than 20 bases, overlapping peaks were merged, and merged peaks shorter than 20 bases were then removed. Final peaks were converted to FASTA sequences and classified by G4 sequence type as explained below. Coverage distributions and peaks were visualized in IGV. Peak annotation and distribution were obtained with the R package Guitar. For the SF3B4 eCLIP data, the splicing regions were extracted from all transcripts and divided into four regions (5′ splicing-exon, 5′ splicing-intron, 3′ splicing-intron and 3′ splicing-exon). Each region was divided into 50 bins by bedtools makewindows. Reads in each bin were counted by bedtools multicov and normalized by the total read number and the region length.
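The peak merging, length filtering and 50-bin coverage steps can be sketched in plain Python. This is an illustration only: the authors used bedtools, strand is ignored here, intervals are treated as half-open BED-style coordinates, and the scaling in `binned_coverage` (reads per million per base of region) is an assumption, since the text states only that counts were normalized by total read number and region length.

```python
def merge_and_filter(peaks, min_len=20):
    """Merge overlapping or touching (chrom, start, end) peaks, then drop short ones."""
    merged = []
    for chrom, start, end in sorted(peaks):
        if merged and merged[-1][0] == chrom and start <= merged[-1][2]:
            merged[-1][2] = max(merged[-1][2], end)  # extend the previous peak
        else:
            merged.append([chrom, start, end])
    return [(c, s, e) for c, s, e in merged if e - s >= min_len]

def make_windows(start, end, n_bins=50):
    """Split the half-open interval [start, end) into n_bins near-equal windows."""
    length = end - start
    edges = [start + (length * i) // n_bins for i in range(n_bins + 1)]
    return [(edges[i], edges[i + 1]) for i in range(n_bins)]

def binned_coverage(reads, start, end, total_reads, n_bins=50):
    """Count reads overlapping each window, scaled per million reads per base of region.

    The scaling is an assumed convention, not one stated in the text.
    """
    region_len = end - start
    values = []
    for w_start, w_end in make_windows(start, end, n_bins):
        count = sum(1 for r_start, r_end in reads if r_start < w_end and r_end > w_start)
        values.append(count * 1e6 / (total_reads * region_len))
    return values

# Hypothetical peaks: the two overlapping chr1 peaks merge into one 30-base peak,
# and the 10-base peak is discarded.
example_peaks = [("chr1", 100, 115), ("chr1", 110, 130), ("chr1", 200, 210), ("chr2", 5, 30)]
print(merge_and_filter(example_peaks))
```

Merging before filtering matters here: two short overlapping peaks can together exceed the 20-base minimum, as the first two example peaks do.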

Predicted G-quadruplex sites analysis

Considering that G4s are relatively long and diverse sequences, we searched for PQS within 25 nt of flanking sequence both upstream and downstream of the binding sites using hierarchical assignment. Predicted G-quadruplex sites (PQS) were classified into four kinds of G4 sequence, namely Normal Loop, Long Loop, Simple Bulge and Complex Bulge/Two-tetrads, based on the published articles (4,34,35). The basic G4 sequence pattern was G{3,}N{1,7}G{3,}N{1,7}G{3,}N{1,7}G{3,}. Normal Loop and Long Loop were G4 sequences without bulges and with at least one loop of length 1–7 bases or >7 bases (up to 12 for any loop and 21 for the middle loop), respectively. Simple Bulge was a G4 sequence with multiple 1-base bulges or one 1–7-base bulge. Complex Bulge/Two-tetrads was a G4 sequence with multiple 1–5-base bulges or with 2 repeated G in every G-repeat. The latter two classes were G4 sequences with 1–7-base loops. PQS types in peaks were assigned following the priority rules above, whereas PQS in the whole transcriptome were identified without any priority rules. All PQS analyses were performed by regular expression matching using custom Perl scripts. RNA G4 sites from rG4-seq and DMS-seq were downloaded from the QUADRatlas database (https://rg4db.cibio.unitn.it/) (6,36,37). Venn diagrams were generated with Intervene (38).
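The basic pattern G{3,}N{1,7}G{3,}N{1,7}G{3,}N{1,7}G{3,} translates directly into a regular expression. The sketch below uses Python's `re` module in place of the authors' custom Perl scripts, which are not shown in the text, and covers only this basic pattern. The long-loop and bulge classes and the priority rules are not implemented. The function names and the test sequence are made up for illustration.

```python
import re

# N is any nucleotide (including G), as in the pattern quoted in the text;
# both T and U are accepted so that DNA and RNA sequences can be scanned.
BASIC_PQS = re.compile(
    r"G{3,}[ACGTU]{1,7}G{3,}[ACGTU]{1,7}G{3,}[ACGTU]{1,7}G{3,}"
)

def find_basic_pqs(sequence):
    """Return (start, end, match) for each non-overlapping basic PQS, 0-based half-open."""
    seq = sequence.upper()
    return [(m.start(), m.end(), m.group()) for m in BASIC_PQS.finditer(seq)]

def extend_with_flanks(start, end, seq_len, flank=25):
    """Widen a binding site by `flank` nt on each side, clipped to the sequence."""
    return max(0, start - flank), min(seq_len, end + flank)

# Made-up sequence for illustration: four G-tracts separated by single-base loops.
example = "AAGGGAGGGTGGGCGGGTT"
print(find_basic_pqs(example))
```

`finditer` reports non-overlapping matches only, so nested or overlapping PQS within one long G-rich stretch are reported once. A hierarchical classifier along the lines described above would apply one such pattern per class in priority order.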

PDS regulated exon inclusion in cell

MCF10A cells were seeded in 6 cm dishes 24 h before being treated with 10 μM PDS, and an equal volume of DMSO was added as a control. After incubation for 24 h, total RNA was extracted with TRIzol™ Reagent. 1 μg of total RNA was used for reverse transcription with the PrimeScript™ RT reagent Kit with gDNA Eraser (Perfect Real Time) (TaKaRa, RR047A). The alternatively spliced products were obtained by PCR amplification of 1 μl of the reverse transcription product with PrimeSTAR® Max (TaKaRa, R045B). The products were separated on a 1.5% agarose gel containing Gel-Red nucleic acid dye and recorded with a ChemiDoc XRS+ imager (Bio-Rad). The alternatively spliced products were quantitated with the Image Lab software.

Results

Generation and characterization of G4PID in vitro

The structural model of G4PID, as depicted in Figure 1A and Supplementary Figure S1, shows its composition: RHAU23 positioned at the N-terminus, miniTurbo at the C-terminus, and an intermediary flexible linker bridging the two components. A 6× His affinity tag was added at the C-terminus for purification. The G4PID probe was subsequently expressed in the BL21(DE3) strain and confirmed by molecular weight (Figure 1A). The computer-simulated structure of G4PID showed that RHAU23 is located independently at the N-terminus (Figure 1A). To verify the binding specificity of G4PID, we conducted electrophoretic mobility shift assays (EMSA) employing purified G4PID and 5'-FAM labeled G4 oligonucleotides. The nucleotide sequences used in EMSA are detailed in Supplementary Table S1. Circular dichroism spectroscopy was utilized to confirm the G4 conformation of each sequence (Figure 1B and Supplementary Figure S2). The results demonstrated that G4PID bound specifically to BCL2 G4 with a Kd value of 15 ± 7 nM (Figure 1B and C). We also confirmed that G4PID binds to G4s exhibiting different conformations and shows negligible binding affinity for single-stranded oligos and double-stranded hairpins (Figure 1B, C and Supplementary Figure S2).

Figure 1.

G4PID mediated specific biotinylation of G4-interacting proteins in vitro and in vivo. (A) Illustration of the G4PID probe: the structure illustration (upper left), the Coomassie blue staining of G4PID protein (bottom left), and the computer-simulated structure of G4PID (right). (B) EMSA experiments of G4PID with different RNA sequences. Native gel electrophoresis of 40 nM RNA G4 (left), single-stranded RNA (middle) and double-stranded RNA hairpin (right) in the presence of G4PID (0–400 nM) (upper row). The conformations of the G4s were confirmed by CD spectra (bottom row). (C) Analysis and fitting results of the dissociation constant Kd of G4PID for BCL2 G4 from the EMSA gel in (B). (D) Schematic illustration of the specific proximity biotinylation of BG4 by G4PID in vitro. (E) Western blot analysis and Coomassie staining of the labeling system in the presence of 50 μM biotin. Lane 1 represents the markers, lane 2 illustrates the self-biotinylation of G4PID, lane 3 illustrates the absence of biotinylation of BG4 and BSA in the absence of G4PID, lane 4 illustrates the preferential biotinylation of BG4 in the presence of G4PID and G4, lane 5 illustrates the biotinylation of BG4 in the presence of G4PID and mutant G4 oligo, and lane 6 illustrates the biotinylation of BG4 in the presence of G4PID without G4.

Furthermore, we proceeded to test the biotinylation specificity of G4PID. As a proof-of-concept study, the G4 antibody BG4 and the RNA G4 BCL2 were selected to mimic the interaction of G4 structures and their interacting proteins in vitro (Figure 1D). As shown in the streptavidin blotting result (Figure 1E), the biotin labeling signal of BG4 is much stronger than that of BSA, which does not target G4s. Intriguingly, the biotinylation signal of BG4 was substantially diminished in the presence of the single-stranded oligonucleotide BCL2-mut or in the absence of oligonucleotide altogether. Additionally, Coomassie blue staining was performed concurrently to provide insight into the protein composition of each group (Figure 1E). Taken together, these results demonstrate that the probe specifically biotinylates G4-interacting proteins in the presence of a G4 structure in vitro.

Functional analysis validates the binding preference and biotinylating activity of G4PID in vivo

To visualize the distribution pattern of G4PID in cells, HeLa cell lines stably expressing HA-tagged G4PID (HA-G4PID) or miniTurbo (HA-miniTurbo) were constructed for immunofluorescence assays. As shown in Supplementary Figure S3, G4PID predominantly localized within the nucleus, with a secondary presence in the cytoplasm, which differs from the homogeneous distribution of miniTurbo. This distinction might be attributed to the higher prevalence of DNA G4 structures within the nucleus (3). Furthermore, the selective binding preference of G4PID for RNA G4 was confirmed through enhanced crosslinking and immunoprecipitation (eCLIP). As shown in Figure 2A, 57.60% of G4PID-binding sites were found to contain PQS, while only 16.85% of random sequences contained PQS (Supplementary Figure S4). We also compared the PQS from G4PID eCLIP with PQS data from rG4-seq and DMS-seq (Supplementary Figure S5). The G4PID eCLIP results contain far fewer putative RNA G4 sites than these two reported RNA G4 sequencing methods, because both methods are based on RT-stop profiling in vitro (4,36). RNA G4 regions can be very stable in vitro, particularly in the presence of K+, and previous reports have shown that most RNA G-quadruplexes confirmed in vitro are unfolded in eukaryotic cells (36). The G4PID eCLIP results also overlap little with those of these two methods, which may be because structures with stabilities insufficient to block reverse transcription cannot be captured by RT-stop profiling. Collectively, these findings demonstrate the cellular distribution of G4PID and its capacity to selectively interact with RNA G4s.
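The two percentages quoted above imply a fold enrichment of PQS in G4PID-binding sites over random sequences. The sketch below simply takes their ratio. Whether Figure 2A computes its enrichment fold in exactly this way is an assumption, and the function name is illustrative.

```python
def fold_enrichment(observed_pct, background_pct):
    """Ratio of the PQS-containing fraction in binding sites to that in random sequences."""
    return observed_pct / background_pct

# Percentages taken from the text: 57.60% of binding sites vs 16.85% of random sequences.
fold = fold_enrichment(57.60, 16.85)
print(f"{fold:.2f}")  # prints 3.42
```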

Figure 2.

G4PID binds RNA G4s preferentially, and optimization of the labeling efficiency of G4PID in HEK293T cells. (A) Percentage of G4-containing G4PID-binding sites and enrichment fold by eCLIP analysis. (B) The labeling efficiency in HEK293T cells was optimized by incubating probe-transfected cells with 500 μM biotin for different times. Streptavidin-HRP blotting was used to detect biotinylated proteins, and anti-HA blotting to detect ligase expression. Lane 1 represents the markers, lanes 2 and 3 illustrate the biotinylated proteins of HEK293T cells without supplementary biotin and with the addition of 500 μM biotin, lanes 4–6 illustrate the labeling efficiency of G4PID-transfected cells incubated with 500 μM biotin for 10 min, 30 min and 1 h, and lanes 7–9 illustrate the labeling efficiency of miniTurbo-transfected cells incubated with 500 μM biotin for 10 min, 30 min and 1 h.

We established that G4PID does not require any cofactors for its activity: the introduction of biotin alone triggers the labeling process within cells. As shown in Figure 2B, compared with the negligible signal of the control group without added biotin, the addition of biotin markedly increased protein labeling, and the signal grew with incubation time. A discernible biotinylation signal of G4PID relative to miniTurbo was observed after 30 min of incubation (Figure 2B). The 30 min time point was therefore selected for subsequent experiments. In summary, the G4PID probe was properly expressed and functional within living cells, showing its dual capabilities of G4 targeting and protein labeling.

Detecting G4-interacting proteins in living cells by PLGPB

Having characterized the function of G4PID in vitro and in vivo, we next used G4PID to detect the composition of G4-interacting proteins within living cells, a procedure termed proximity labeling mediated G-quadruplex-interacting proteins specific biotin-ligation procedure (PLGPB) (Figure 3A). We transiently expressed G4PID in HEK293T cells; in parallel, miniTurbo was transfected into HEK293T cells as a negative control to promiscuously label proteins. The pulled-down proteins were visualized by streptavidin–HRP blotting (Supplementary Figure S6). For the identification of pulled-down proteins, a quantitative LC–MS/MS approach employing data-independent acquisition (DIA) mode was adopted. Three independent biological replicates were performed, resulting in a coverage of 578 proteins with an 80% consistency rate, thus affirming the repeatability of our methodology (Figure 3B, Supplementary Table S3). Peptide data observed at least twice within the G4PID group were retained. To distinguish significantly enriched proteins, we set a threshold of fold change (FC) >2 and a false discovery rate (FDR) <0.05. Ultimately, our approach yielded a total of 149 candidate G4-interacting proteins (Supplementary Table S3, Figure 3C). These results substantiate the viability and robustness of our methodology in identifying proteins that interact with G4 structures.
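The enrichment criterion (FC > 2 and FDR < 0.05) can be expressed as a simple filter. In the sketch below the FDR is obtained with the Benjamini-Hochberg procedure, which is an assumption: the text does not state how the FDR was estimated. The protein names and values are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

def select_enriched(proteins, fc_cutoff=2.0, fdr_cutoff=0.05):
    """Keep names whose fold change exceeds fc_cutoff and whose FDR is below fdr_cutoff.

    proteins: list of (name, fold_change, p_value) tuples.
    """
    fdr = benjamini_hochberg([p for _, _, p in proteins])
    return [name for (name, fc, _), q in zip(proteins, fdr)
            if fc > fc_cutoff and q < fdr_cutoff]

# Hypothetical proteins and statistics (not values from the paper).
example_proteins = [
    ("protA", 3.0, 0.01),
    ("protB", 5.0, 0.04),
    ("protC", 1.5, 0.03),
    ("protD", 4.0, 0.50),
]
print(select_enriched(example_proteins))
```

In this toy example only protA passes. protB has a large fold change but an adjusted value just above 0.05, which shows why the two thresholds are applied jointly.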

Figure 3.

(A) Schematic of the proximity labeling mediated G-quadruplex-interacting proteins specific biotin-ligation procedure (PLGPB) in living cells. (B) The biotinylated proteins enriched from HEK293T cells by PLGPB and analyzed by silver staining. (C) Volcano plot displaying the enriched proteins of G4PID versus miniTurbo (up-regulated in red, n = 149). (D) Overlap of G4-interacting proteins detected by enzyme-catalyzed PLGPB, the photo-crosslinking methods CMPP and LIMCAP, and the known G4-associated protein databank G4IPDB. (E) GO term analysis of biological process, cellular component and molecular function for the high-confidence G4-interacting proteins (fold change (FC) > 2, false discovery rate (FDR) < 0.05).

The PLGPB method successfully enriched previously reported G4-interacting proteins such as DHX36 (16), nucleolin (39) and DKC1 (40). The proteins identified through PLGPB showed a notable overlap with existing G4-interacting protein datasets (Figure 3D). For instance, six proteins overlapped with the G4IPDB (41), 12 proteins were in common with the CMPP method (15), and 14 proteins were shared with the LIMCAP method (14). In aggregate, 31 of the 149 proteins (20.8%) had been previously reported in the literature (Supplementary Table S3). These findings support the credibility and reliability of the PLGPB approach.

Moreover, we conducted a gene ontology (GO) term enrichment analysis to gain insights into the biological processes (BP), cellular components (CC), and molecular functions (MF) of the identified proteins. The top 10 GO terms, ranked by count, are shown in Figure 3E. The enrichment analysis of biological processes revealed that the proteins were significantly associated with transcription regulation, mRNA splicing, and chromatin remodeling, in line with previously recognized G4 landscapes (6,42). Additionally, molecular function terms related to RNA binding, chromatin binding, and DNA binding were significantly enriched. The most highly ranked cellular components were the ‘nucleus’ and ‘nucleoplasm’, where G4 structures have been reported to be concentrated (5).

Seven novel G4-binding proteins were validated in vitro

The PLGPB method enriched a diverse array of novel candidate G4-interacting proteins spanning a wide spectrum of functional categories. Examples include telomere-associated proteins such as DKC1, GAR1, NHP2, NAF1, TCAB1 and NOP10; transcription factors and transcriptional coactivators such as SMRD1, PHF10, SMCE1, TAF12 and ARI1A; and epigenetic and chromatin remodeling enzymes including SMARCD1, PBRM1, HDAC2, ACTB, DPY30 and ARID1A. Furthermore, our results revealed the enrichment of functional classes that have been less commonly associated with G4 interactions, such as certain protein-binding proteins. This observation may arise from the inherent nature of the protein probe itself, which could influence the interactions observed. In essence, the PLGPB method not only corroborated previously known G4-interacting proteins but also expanded the pool by identifying novel candidates across diverse functional categories.

To validate the G4 binding affinity of the newly identified candidate proteins, we selected several of them and expressed them as recombinant proteins in Rosetta2 Escherichia coli (Supplementary Figure S7). Our selection encompassed proteins with diverse biological functions: SF3B4, an RNA splicing factor; PP1G, annotated as a phosphatase and a reported mRNA binder; BCL7C, a transcription regulator; NDUV1, a mitochondrial electron transport component; GAR1, a protein associated with telomere maintenance; ILF3, a regulator of transcription and translation; and FBRL, a methyltransferase involved in pre-rRNA processing with RNA methylation capacity. Electrophoretic mobility shift assays (EMSA) were used to assess their G4 binding specificity. As shown in Figure 4A, all seven selected proteins showed a distinct preference for binding to RNA G4 over the mutant single-stranded RNA and the hairpin.

Figure 4.


EMSA of the selected candidate proteins with (A) the RNA G4 BCL2 oligonucleotide, the single-stranded BCL2 mutant oligonucleotide and a double-stranded RNA hairpin, and (B) DNA G4 structures of different conformations. ‘Bound’ refers to the G4–protein complex. The bands below are free G4 and non-G4 oligonucleotides.

Considering the diverse conformations that DNA G4 structures adopt and the propensity of certain proteins to interact with specific conformations, we investigated the binding affinity of the selected proteins for multiple DNA G4 structures (Figure 4B). Binding assays with single-stranded DNA and a double-stranded DNA hairpin were performed as controls (Supplementary Figure S8). Our findings revealed that the binding ability of these seven proteins to DNA G4s of different conformations was heterogeneous. SF3B4 bound preferentially to the mixed-conformation DNA G4 (hTelo), with a binding efficiency even higher than for RNA G4. Except for SF3B4, the proteins exhibited lower binding affinity for all DNA G4s than for RNA G4 (BCL2). Notably, the binding of PP1G, NDUV1, GAR1, ILF3 and FBRL to the mixed-conformation DNA G4 (hTelo) was the weakest. These outcomes suggest that the interaction between these G4-interacting proteins and G4 structures could be regulated by manipulating G4 conformation.

The binding sites of a new G4-interacting protein SF3B4 were validated in vivo

The G4 landscape is dynamic and depends on the functional state of the cell. Although we demonstrated that these proteins bind G4 structures in vitro, understanding their precise G4 binding pattern within the complex in vivo environment is essential. G-quadruplexes have been reported to play a critical role in mediating alternative splicing (9,43,44), and 17 of the 149 proteins we identified were associated with mRNA splicing (Figure 3E). Since splicing factors such as hnRNPF (44) and SF3B2 (45) have been reported to interact with G4s, we focused on SF3B4, an RNA-binding protein involved in pre-mRNA splicing as a fundamental component of the splicing factor SF3B complex.

First, we investigated the cellular RNA binding pattern of SF3B4 using an eCLIP assay. By merging data from two biological replicates (fold change (FC) > 8, P-value < 10⁻⁵), we identified 4289 high-confidence SF3B4 binding sites. The preference of SF3B4 for RNA G4 structures was assessed by direct comparison with previously reported RNA putative G-quadruplex sequences (PQS) (6). As shown in Figure 5A, 49.36% of SF3B4 binding sites contained a PQS, supporting its preference for G4 structures. Furthermore, in line with prior findings on the RNA G4 landscape (6), 45.44% of these sites were associated with noncanonical G4s. Figure 5B shows representative genes in which SF3B4 binding sites lie in close spatial proximity to RNA PQSs.
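The fraction of binding sites containing a PQS can be illustrated with a simple motif scan. The sketch below is an assumption-laden simplification: it searches only for the widely used canonical pattern (four runs of at least three guanines separated by loops of 1 to 7 nucleotides), whereas the study compared binding sites with previously reported PQSs (6) and also counted noncanonical classes. The function names and sequences are hypothetical.

```python
import re

# Canonical PQS: G3+ N1-7 G3+ N1-7 G3+ N1-7 G3+ (RNA or DNA alphabet).
# Long-loop, bulged and two-tetrad motifs are NOT covered by this pattern.
CANONICAL_PQS = re.compile(r"(?:G{3,}[ACGTU]{1,7}){3}G{3,}")


def has_canonical_pqs(sequence):
    """True if the sequence contains at least one canonical PQS."""
    return CANONICAL_PQS.search(sequence.upper()) is not None


def fraction_with_pqs(site_sequences):
    """Fraction of binding-site sequences that contain a canonical PQS."""
    if not site_sequences:
        return 0.0
    hits = sum(1 for seq in site_sequences if has_canonical_pqs(seq))
    return hits / len(site_sequences)


# Placeholder sequences for illustration only (not eCLIP data).
toy_sites = [
    "AAGGGAGGGUGGGCGGGAA",       # four G-runs, 1-nt loops
    "AUAUAUAUAU",                # no G-runs
    "GGGAAAAAAAAAGGGAGGGAGGG",   # first loop is 9 nt, too long
]
print(f"{fraction_with_pqs(toy_sites):.2%}")
```

Because the pattern is restricted to canonical motifs, a scan like this would be expected to report a lower PQS fraction than an analysis that includes noncanonical G4 classes.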

Figure 5.


Identification of the G4 binding preference of SF3B4 in cells by eCLIP. (A) PQS percentage of SF3B4 binding sites (n = 3057). Normal loop is defined as canonical G4; long loop, simple bulge and 2-tetrad/complex bulges are defined as noncanonical G4s; others are non-G4 sites. (B) Example genes (CAD and WDR90) showing spatial coincidence between SF3B4 binding sites and PQSs. Row 1, SF3B4 binding sites; Row 2, PQS; Rows 3-4, SF3B4 read density in reads per million (RPM); Row 5, RPM of the SF3B4 input sample; Row 6, gene annotation. (C) Distribution of SF3B4 binding sites, showing enrichment near the 3′ splice site.

Furthermore, we analyzed the distribution of SF3B4 binding sites within introns. The reads were mapped within the region from 50 nucleotides upstream to 500 nucleotides downstream of 5′ splice sites, and from 500 nucleotides upstream to 50 nucleotides downstream of 3′ splice sites. As shown in Figure 5C, the reads of the SF3B4 IP sample are clearly enriched near the 3′ splice site, consistent with previously reported results in K562 cells. There are 3922 binding sites enriched within the 100 nucleotides upstream of the 3′ splice site. As SF3B4 binding sites both overlapped with PQSs and were enriched near the 3′ splice site, we asked whether G4 structures play a role in alternative splicing.
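The windowing around splice sites can be sketched as a strand-aware offset calculation. This is a minimal illustration assuming single-nucleotide site positions in one shared coordinate system; the function names and toy coordinates are hypothetical and do not reproduce the authors' pipeline.

```python
def relative_offset(position, splice_site, strand):
    """Signed distance from a splice site in transcript orientation.

    Negative values are upstream of the splice site, positive downstream.
    """
    if strand == "+":
        return position - splice_site
    if strand == "-":
        return splice_site - position
    raise ValueError(f"unknown strand: {strand!r}")


def in_window(offset, upstream, downstream):
    """True if the offset lies within [-upstream, +downstream]."""
    return -upstream <= offset <= downstream


# Windows quoted in the text, as (upstream, downstream) in nucleotides.
FIVE_PRIME_WINDOW = (50, 500)
THREE_PRIME_WINDOW = (500, 50)


def count_upstream_of_3ss(sites, max_distance=100):
    """Count sites lying within `max_distance` nt upstream of a 3' splice site.

    `sites` is an iterable of (position, splice_site, strand) tuples.
    """
    return sum(
        1
        for position, splice_site, strand in sites
        if -max_distance <= relative_offset(position, splice_site, strand) < 0
    )


# Placeholder coordinates for illustration only (not eCLIP data).
toy_sites = [
    (950, 1000, "+"),   # 50 nt upstream
    (1020, 1000, "+"),  # 20 nt downstream
    (2060, 2000, "-"),  # 60 nt upstream on the minus strand
    (2300, 2000, "-"),  # 300 nt upstream on the minus strand
    (400, 1000, "+"),   # 600 nt upstream, outside the 3' window
]
in_3ss_window = [
    site for site in toy_sites
    if in_window(relative_offset(*site), *THREE_PRIME_WINDOW)
]
print(len(in_3ss_window), count_upstream_of_3ss(toy_sites))
```

Flipping the sign for minus-strand genes keeps "upstream" meaning the intronic side of the 3′ splice site regardless of genomic orientation.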

G4s within the SF3B4 binding sites regulate exon inclusion of G-quadruplex-containing pre-mRNAs

To test whether the PQSs within SF3B4 binding sites regulate exon splicing, we screened the binding sites for normal (canonical) PQSs involved in alternative splicing events and chose two genes, INPPL1 and PPP6R2. The G-rich sequences G4-INPPL1 and G4-PPP6R2 were first verified to form G4 structures by circular dichroism (Figure 6A, top panel), and the melting temperature (Tm) curves indicated that both sequences have high melting temperatures (Figure 6A, bottom panel). We further verified that SF3B4 binds specifically to the G4 structures formed by G4-INPPL1 and G4-PPP6R2, and not to their mutant sequences, which have a similar nucleotide composition but cannot form G4 structures (Figure 6B and Supplementary Figure S9). The G4 ligand pyridostatin (PDS) has been reported to stabilize DNA and RNA G4s, and we found that PDS incubation decreased the CD signal of G4-INPPL1 and G4-PPP6R2 in vitro (Figure 6A, top panel). The Tm measurements were consistent with the CD spectra: as shown in Figure 6A (bottom panel), PDS had little effect on Tm but altered the ellipticity at temperatures below 65°C. Moreover, EMSA analysis showed that the binding of SF3B4 to G4-INPPL1 and G4-PPP6R2 is inhibited by PDS (Figure 6B).

Figure 6.


The SF3B4-bound G4 structures influence alternative mRNA splicing. (A) The sequences, CD spectra (at 25°C) and CD thermal melting curves of G4-INPPL1 and G4-PPP6R2. (B) EMSA results showing that G4-INPPL1 and G4-PPP6R2 bind SF3B4 (Lane 2). PDS inhibits the binding of SF3B4 to G4-INPPL1 and G4-PPP6R2 (Lane 4). (C) PDS-induced alternative splicing changes of INPPL1 and PPP6R2 mRNA. Diagram of G4-INPPL1 and G4-PPP6R2 next to the cassette exon (top row). The indicated PCR primer sets were designed across the alternative exon (middle row). The enhanced exon inclusion levels of INPPL1 and PPP6R2 were identified by PCR amplification. (D) Statistical analysis of the relative inclusion levels of INPPL1 and PPP6R2 in MCF10A cells. ****P < 0.0001. Results were obtained from two biological replicates and two technical replicates and are shown as mean ± SD.

Then, MCF10A cells were incubated directly with PDS to survey changes in alternative splicing events. As shown in Figure 6C, both G4-INPPL1 and G4-PPP6R2 are located in the intron adjacent to the cassette exon. Primers for amplifying the alternatively spliced products were designed to span more than two exon junctions. As shown in Figure 6C and D, PDS incubation increased the exon inclusion of INPPL1 and PPP6R2 and had little influence on the GAPDH gene. This suggests that G4s interacting with splicing factors may play roles in alternative splicing events.
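A relative exon inclusion level of the kind plotted in Figure 6D is commonly computed from the intensities of the inclusion and skipping PCR products. The sketch below assumes that simple ratio, inclusion / (inclusion + skipping), with no correction for amplicon length; the exact quantification used in the study is not stated in the text. The function names and intensity values are hypothetical placeholders, not measured data.

```python
from statistics import mean, stdev


def inclusion_level(inclusion_band, skipping_band):
    """Fraction of product that includes the cassette exon."""
    total = inclusion_band + skipping_band
    if total <= 0:
        raise ValueError("band intensities must sum to a positive value")
    return inclusion_band / total


def summarize(replicates):
    """Mean and sample SD of the inclusion level across replicates.

    `replicates` is a list of (inclusion_band, skipping_band) pairs.
    """
    levels = [inclusion_level(inc, skip) for inc, skip in replicates]
    spread = stdev(levels) if len(levels) > 1 else 0.0
    return mean(levels), spread


# Placeholder band intensities for illustration only (not measured data).
untreated = [(30.0, 70.0), (32.0, 68.0), (29.0, 71.0), (31.0, 69.0)]
pds_treated = [(55.0, 45.0), (58.0, 42.0), (54.0, 46.0), (57.0, 43.0)]

for label, replicates in (("untreated", untreated), ("PDS", pds_treated)):
    avg, sd = summarize(replicates)
    print(f"{label}: {avg:.3f} ± {sd:.3f}")
```

Reporting the ratio rather than raw band intensity makes lanes with different loading comparable, which is why a control gene such as GAPDH is checked separately.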

Discussion

In conclusion, we have constructed a G4-targeting biotin ligase named G4PID. First, we validated G4PID’s capability to selectively and strongly bind various structures of DNA G4 and RNA G4 in vitro. In vitro labeling experiments also demonstrated specific biotin labeling of the BG4 antibody by G4PID. Subsequently, the eCLIP experiment verified the affinity of G4PID for RNA G4 within cells, and SA-Blot experiments showed effective biotin labeling in vivo. Using G4PID, we introduced a novel enzyme-catalyzed proximity labeling strategy, named PLGPB, designed for the enrichment and identification of G4-interacting proteins within living cells. By employing PLGPB, we identified 149 candidate G4-interacting proteins, with results overlapping with known G4-interacting proteins from various other identification methods, lending credence to the reliability of our approach. Seven novel G4-interacting proteins were validated in vitro. The splicing factor SF3B4 was confirmed to bind G4s in vivo, and two G4 structures within the SF3B4 binding sites were found to regulate exon inclusion; the underlying regulatory mechanism is still under investigation.

This method, characterized by its manageability and reproducibility, offers a powerful means of unraveling the intricate landscape of G4-interacting proteins. The programmability of the protein probe, G4PID, holds promise for targeting diverse G4 sites by modifying the protein sequence, enabling the enrichment of proteins specific to certain G4 sequences. Given the diverse G4 sequences in cells and the complexity of interacting proteins, the probe requires substantial sequence tolerance and robust proximity labeling capability. Compared with small chemical probes and G4 oligonucleotide baits, the protein probe used in this method may interfere with G4-interacting proteins, so a more accurate analysis of the regulatory network of G4-interacting proteins requires further experimental verification.

In summary, we developed a novel PLGPB method for the identification of G4-interacting proteins in living cells, which led to the identification of several previously unknown G4-interacting proteins. The discovery of native or engineered G4 ‘readers’ with improved affinities and specificities would expand the scope of this method and enable a wider range of G4 interactors to be analyzed.

Supplementary Material

gkae126_Supplemental_Files

Acknowledgements

Author contributions: X.Z. and H.-Y.H. initiated this project. H.-Y.H. conceived and designed the experiments. H.-Y.H. and S.-J.X. carried out all the experiments. Z.-A.L. performed the computational analysis of the NGS sequencing data and the search for alternative splicing sites. H.-M.S. and S.-Q.H. helped check the code. All the authors interpreted the results. H.-Y.H. wrote the manuscript. H.-Y.H., H.-M.S. and Z.-A.L. discussed and revised the manuscript.

Contributor Information

Ziang Lu, College of Chemistry and Molecular Sciences, Wuhan University, Wuhan, Hubei 430072, P.R. China.

Shengjie Xie, College of Chemistry and Molecular Sciences, Wuhan University, Wuhan, Hubei 430072, P.R. China.

Haomiao Su, Department of Chemistry, Yale University, 600 West Campus Drive West Haven, West Haven, CT 06516, USA.

Shaoqing Han, College of Chemistry and Molecular Sciences, Wuhan University, Wuhan, Hubei 430072, P.R. China.

Haiyan Huang, College of Chemistry and Molecular Sciences, Wuhan University, Wuhan, Hubei 430072, P.R. China.

Xiang Zhou, College of Chemistry and Molecular Sciences, Wuhan University, Wuhan, Hubei 430072, P.R. China; Department of Hematology of Zhongnan Hospital, Taikang Center for Life and Medical Sciences, Wuhan University, Wuhan, Hubei 430072, P.R. China.

Data availability

Sequencing data have been deposited into the Gene Expression Omnibus under accession code GSE223574. The original mass spectra are available via ProteomeXchange using the identifier PXD045634. Any additional data that support the findings of this study are available from the corresponding author upon reasonable request.

Supplementary data

Supplementary Data are available at NAR Online.

Funding

National Natural Science Foundation of China [22107086 to H.-Y.H., 22037004, 21721005, 92153303 to X.Z.]; China Postdoctoral Science Foundation [2021M692468 to H.-Y.H.]. Funding for open access charge: National Natural Science Foundation of China.

Conflict of interest statement. None declared.

References

  • 1. Biffi G., Di Antonio M., Tannahill D., Balasubramanian S. Visualization and selective chemical targeting of RNA G-quadruplex structures in the cytoplasm of human cells. Nat. Chem. 2014; 6:75–80.
  • 2. Biffi G., Tannahill D., McCafferty J., Balasubramanian S. Quantitative visualization of DNA G-quadruplex structures in human cells. Nat. Chem. 2013; 5:182–186.
  • 3. Di Antonio M., Ponjavic A., Radzevicius A., Ranasinghe R.T., Catalano M., Zhang X., Shen J., Needham L.M., Lee S.F., Klenerman D. et al. Single-molecule visualization of DNA G-quadruplex formation in live cells. Nat. Chem. 2020; 12:832–837.
  • 4. Chambers V.S., Marsico G., Boutell J.M., Di Antonio M., Smith G.P., Balasubramanian S. High-throughput sequencing of DNA G-quadruplex structures in the human genome. Nat. Biotechnol. 2015; 33:877–881.
  • 5. Hänsel-Hertsch R., Spiegel J., Marsico G., Tannahill D., Balasubramanian S. Genome-wide mapping of endogenous G-quadruplex DNA structures by chromatin immunoprecipitation and high-throughput sequencing. Nat. Protoc. 2018; 13:551–564.
  • 6. Kwok C.K., Marsico G., Sahakyan A.B., Chambers V.S., Balasubramanian S. rG4-seq reveals widespread formation of G-quadruplex structures in the human transcriptome. Nat. Methods. 2016; 13:841–844.
  • 7. Wu G., Xing Z., Tran E.J., Yang D. DDX5 helicase resolves G-quadruplex and is involved in MYC gene transcriptional activation. Proc. Natl Acad. Sci. U.S.A. 2019; 116:20453–20461.
  • 8. Sauer M., Juranek S.A., Marks J., De Magis A., Kazemier H.G., Hilbig D., Benhalevy D., Wang X., Hafner M., Paeschke K. DHX36 prevents the accumulation of translationally inactive mRNAs with G4-structures in untranslated regions. Nat. Commun. 2019; 10:2421.
  • 9. Zhang J., Harvey S.E., Cheng C. A high-throughput screen identifies small molecule modulators of alternative splicing by targeting RNA G-quadruplexes. Nucleic Acids Res. 2019; 47:3667–3679.
  • 10. Hänsel-Hertsch R., Simeone A., Shea A., Hui W.W.I., Zyner K.G., Marsico G., Rueda O.M., Bruna A., Martin A., Zhang X. et al. Landscape of G-quadruplex DNA structural regions in breast cancer. Nat. Genet. 2020; 52:878–883.
  • 11. Shukla V., Samaniego-Castruita D., Dong Z., Gonzalez-Avalos E., Yan Q., Sarma K., Rao A. TET deficiency perturbs mature B cell homeostasis and promotes oncogenesis associated with accumulation of G-quadruplex and R-loop structures. Nat. Immunol. 2022; 23:99–108.
  • 12. Wang E., Thombre R., Shah Y., Latanich R., Wang J. G-quadruplexes as pathogenic drivers in neurodegenerative disorders. Nucleic Acids Res. 2021; 49:4816–4830.
  • 13. Herdy B., Mayer C., Varshney D., Marsico G., Murat P., Taylor C., D'Santos C., Tannahill D., Balasubramanian S. Analysis of NRAS RNA G-quadruplex binding proteins reveals DDX3X as a novel interactor of cellular G-quadruplex containing transcripts. Nucleic Acids Res. 2018; 46:11592–11604.
  • 14. Su H., Xu J., Chen Y., Wang Q., Lu Z., Chen Y., Chen K., Han S., Fang Z., Wang P. et al. Photoactive G-quadruplex ligand identifies multiple G-quadruplex-related proteins with extensive sequence tolerance in the cellular environment. J. Am. Chem. Soc. 2021; 143:1917–1923.
  • 15. Zhang X., Spiegel J., Martinez Cuesta S., Adhikari S., Balasubramanian S. Chemical profiling of DNA G-quadruplex-interacting proteins in live cells. Nat. Chem. 2021; 13:626–633.
  • 16. Chen M.C., Tippana R., Demeshkina N.A., Murat P., Balasubramanian S., Myong S., Ferré-D’Amaré A.R. Structural basis of G-quadruplex unfolding by the DEAH/RHA helicase DHX36. Nature. 2018; 558:465–469.
  • 17. Huang Z.L., Dai J., Luo W.H., Wang X.G., Tan J.H., Chen S.B., Huang Z.S. Identification of G-quadruplex-binding protein from the exploration of RGG motif/G-quadruplex interactions. J. Am. Chem. Soc. 2018; 140:17945–17955.
  • 18. Ribeiro de Almeida C., Dhir S., Dhir A., Moghaddam A.E., Sattentau Q., Meinhart A., Proudfoot N.J. RNA helicase DDX1 converts RNA G-quadruplex structures into R-loops to promote IgH class switch recombination. Mol. Cell. 2018; 70:650–662.
  • 19. Lerner L.K., Holzer S., Kilkenny M.L., Svikovic S., Murat P., Schiavone D., Eldridge C.B., Bittleston A., Maman J.D., Branzei D. et al. Timeless couples G-quadruplex detection with processing by DDX11 helicase during DNA replication. EMBO J. 2020; 39:e104185.
  • 20. Moye A.L., Porter K.C., Cohen S.B., Phan T., Zyner K.G., Sasaki N., Lovrecz G.O., Beck J.L., Bryan T.M. Telomeric G-quadruplexes are a substrate and site of localization for human telomerase. Nat. Commun. 2015; 6:7643.
  • 21. Paeschke K., Juranek S., Simonsson T., Hempel A., Rhodes D., Lipps H.J. Telomerase recruitment by the telomere end binding protein-β facilitates G-quadruplex DNA unfolding in ciliates. Nat. Struct. Mol. Biol. 2008; 15:598–604.
  • 22. Saha D., Singh A., Hussain T., Srivastava V., Sengupta S., Kar A., Dhapola P., Dhople V., Ummanni R., Chowdhury S. Epigenetic suppression of human telomerase (hTERT) is mediated by the metastasis suppressor NME2 in a G-quadruplex-dependent fashion. J. Biol. Chem. 2017; 292:15205–15215.
  • 23. Mao S.Q., Ghanbarian A.T., Spiegel J., Martinez Cuesta S., Beraldi D., Di Antonio M., Marsico G., Hänsel-Hertsch R., Tannahill D., Balasubramanian S. DNA G-quadruplex structures mold the DNA methylome. Nat. Struct. Mol. Biol. 2018; 25:951–957.
  • 24. Yoshida A., Oyoshi T., Suda A., Futaki S., Imanishi M. Recognition of G-quadruplex RNA by a crucial RNA methyltransferase component, METTL14. Nucleic Acids Res. 2022; 50:449–457.
  • 25. Heddi B., Cheong V.V., Martadinata H., Phan A.T. Insights into G-quadruplex specific recognition by the DEAH-box helicase RHAU: solution structure of a peptide-quadruplex complex. Proc. Natl Acad. Sci. U.S.A. 2015; 112:9608–9613.
  • 26. Lattmann S., Giri B., Vaughn J.P., Akman S.A., Nagamine Y. Role of the amino terminal RHAU-specific motif in the recognition and resolution of guanine quadruplex-RNA by the DEAH-box RNA helicase RHAU. Nucleic Acids Res. 2010; 38:6219–6233.
  • 27. Qin W., Cho K.F., Cavanagh P.E., Ting A.Y. Deciphering molecular interactions by proximity labeling. Nat. Methods. 2021; 18:133–143.
  • 28. Cho K.F., Branon T.C., Udeshi N.D., Myers S.A., Carr S.A., Ting A.Y. Proximity labeling in mammalian cells with TurboID and split-TurboID. Nat. Protoc. 2020; 15:3971–3999.
  • 29. Branon T.C., Bosch J.A., Sanchez A.D., Udeshi N.D., Svinkina T., Carr S.A., Feldman J.L., Perrimon N., Ting A.Y. Efficient proximity labeling in living cells and organisms with TurboID. Nat. Biotechnol. 2018; 36:880–887.
  • 30. Gallo A., Lo Sterzo C., Mori M., Di Matteo A., Bertini I., Banci L., Brunori M., Federici L. Structure of nucleophosmin DNA-binding domain and analysis of its complex with a G-quadruplex sequence from the c-MYC promoter. J. Biol. Chem. 2012; 287:26539–26548.
  • 31. Chen M.C., Tippana R., Demeshkina N.A., Murat P., Balasubramanian S., Myong S., Ferré-D’Amaré A.R. Structural basis of G-quadruplex unfolding by the DEAH/RHA helicase DHX36. Nature. 2018; 558:465–469.
  • 32. Heddi B., Cheong V.V., Martadinata H., Phan A.T. Insights into G-quadruplex specific recognition by the DEAH-box helicase RHAU: solution structure of a peptide-quadruplex complex. Proc. Natl Acad. Sci. U.S.A. 2015; 112:9608–9613.
  • 33. Sherman B.T., Hao M., Qiu J., Jiao X., Baseler M.W., Lane H.C., Imamichi T., Chang W. DAVID: a web server for functional enrichment analysis and functional annotation of gene lists (2021 update). Nucleic Acids Res. 2022; 50:W216–W221.
  • 34. Zheng K.W., Zhang J.Y., He Y.D., Gong J.Y., Wen C.J., Chen J.N., Hao Y.H., Zhao Y., Tan Z. Detection of genomic G-quadruplexes in living cells using a small artificial protein. Nucleic Acids Res. 2020; 48:11706–11720.
  • 35. Hänsel-Hertsch R., Beraldi D., Lensing S.V., Marsico G., Zyner K., Parry A., Di Antonio M., Pike J., Kimura H., Narita M. et al. G-quadruplex structures mark human regulatory chromatin. Nat. Genet. 2016; 48:1267–1272.
  • 36. Guo J.U., Bartel D.P. RNA G-quadruplexes are globally unfolded in eukaryotic cells and depleted in bacteria. Science. 2016; 353:aaf5371.
  • 37. Bourdon S., Herviou P., Dumas L., Destefanis E., Zen A., Cammas A., Millevoi S., Dassi E. QUADRatlas: the RNA G-quadruplex and RG4-binding proteins database. Nucleic Acids Res. 2023; 51:D240–D247.
  • 38. Khan A., Mathelier A. Intervene: a tool for intersection and visualization of multiple gene or genomic region sets. BMC Bioinf. 2017; 18:287.
  • 39. González V., Guo K., Hurley L., Sun D. Identification and characterization of nucleolin as a c-myc G-quadruplex-binding protein. J. Biol. Chem. 2009; 284:23622–23635.
  • 40. Ko E., Kim J.S., Ju S., Seo H.W., Chang Y., Kang J.A., Park S.G., Jung G. Oxidatively modified protein-disulfide isomerase–Associated 3 promotes dyskerin pseudouridine synthase 1–mediated malignancy and survival of hepatocellular carcinoma cells. Hepatology. 2018; 68:1851–1864.
  • 41. Mishra S.K., Tawani A., Mishra A., Kumar A. G4IPDB: a database for G-quadruplex structure forming nucleic acid interacting proteins. Sci. Rep. 2016; 6:38144.
  • 42. Tu J., Duan M., Liu W., Lu N., Zhou Y., Sun X., Lu Z. Direct genome-wide identification of G-quadruplex structures by whole-genome resequencing. Nat. Commun. 2021; 12:6014.
  • 43. Kamura T., Katsuda Y., Kitamura Y., Ihara T. G-quadruplexes in mRNA: a key structure for biological function. Biochem. Biophys. Res. Commun. 2020; 526:261–266.
  • 44. Huang H., Zhang J., Harvey S.E., Hu X., Cheng C. RNA G-quadruplex secondary structure promotes alternative splicing via the RNA-binding protein hnRNPF. Genes Dev. 2017; 31:2296–2309.
  • 45. Matsumoto K., Okamoto K., Okabe S., Fujii R., Ueda K., Ohashi K., Seimiya H. G-quadruplex-forming nucleic acids interact with splicing factor 3B subunit 2 and suppress innate immune gene expression. Genes Cells. 2021; 26:65–82.



Articles from Nucleic Acids Research are provided here courtesy of Oxford University Press
