Abstract
Guanine (G)-rich sequences in nucleic acids can form non-canonical secondary structures such as R-loops and G-quadruplexes (G4) during transcription. The R-loop formed on the template strand promotes and stabilizes G4 in the non-template strand. However, the precise role of G4/R-loop-forming sequences in transcription remains poorly understood. In this study, we investigated the effect of different potential G4-forming sequences (PQSs) on G4/R-loop formation and transcription dynamics. We employed gel-based assays and single-molecule fluorescence resonance energy transfer (smFRET) to measure RNA synthesis and concomitant formation of G4 and R-loop during transcription by T7 RNA polymerase. We reveal two types of R-loop that form successively: an R-loop with an intramolecular DNA G4 (IG4) forms first during transcription, followed by an R-loop with an intermolecular DNA:RNA hybrid G4 (HG4). We found that IG4 R-loops inhibit, whereas HG4 R-loops enhance, transcription. The HG4/IG4 ratio correlates strongly with transcriptional activity. PQSs with short linkers favor IG4, reducing transcription, while PQSs with long linkers, which fold loosely, favor HG4, increasing transcription. Since IG4 formation precedes HG4, a tightly folded PQS forms IG4 quickly and stably, slowing its conversion to HG4 and reducing transcriptional enhancement.
Graphical Abstract
Introduction
G-quadruplexes (G4s) and R-loops are non-canonical nucleic acid structures involved in cellular processes such as DNA replication, transcription, and translation [1–4] with implications for diseases including cancer [5–7] and neurological disorders [8, 9]. G4s, formed in guanine-rich sequences through Hoogsteen base pairing, result in four-stranded guanine-tetrad structures [10, 11]. Sequencing methods, such as G4-seq and G4-miner, have identified between 700 000 and 1 000 000 potential G4-forming sequences in the human genome [12, 13]. Computational studies and G4 ChIP-seq data reveal that G4 peaks are primarily enriched in regions upstream of the transcription start site (TSS), followed by enrichment in the 5′ UTRs [14–18]. Their presence correlates with gene expression through mechanisms such as modulating transcription factor binding and RNA polymerase activity [19–21].
R-loops are three-stranded nucleic acid structures that form in GC-rich regions during transcription, where a nascent RNA strand hybridizes with the DNA template strand, displacing the non-template strand. R-loops are not simply byproducts of transcription but play an active role in regulating gene expression by modulating transcription initiation, elongation, and termination [22–24]. DRIP-seq and R-ChIP studies have shown that R-loops are enriched in gene promoters of transcriptionally active genes [25–27]. However, persistent or misregulated R-loops are associated with DNA damage and genomic instability [2].
Recent studies have highlighted the functional interplay between G4s and R-loops [28]. The displaced, G-rich non-template strand within an R-loop can fold into a G4 structure, particularly in regions of high transcription activity and negative supercoiling, such as gene promoters and 5′ UTRs. This interplay suggests that G4s and R-loops act together in transcriptional regulation [29, 30].
In our previous study using a T7 RNA polymerase (T7 RNAP)-based in vitro transcription system [21], we investigated the potential G4-forming sequence (PQS) within the c-MYC proto-oncogene, which participates in key processes such as cell growth, differentiation, and apoptosis [31]. While PQSs are predominantly enriched in gene promoters [14], we focused on those located in the 5′UTR, which exhibits the second-highest level of PQS enrichment and remains relatively understudied [14]. Our goal was to explore the interplay between G4 and R-loop structures, rather than G4s alone. Since R-loops typically form downstream of promoter regions, the 5′UTR serves as a relevant context for studying these dynamics within our transcription system. We demonstrated that the R-loop forms during transcription when the c-MYC PQS is placed on the non-template strand within the 5′UTR. The R-loop exposes the PQS, enabling it to fold into a G4 structure, which is then stabilized. The simultaneous presence of G4 on the non-template strand and R-loop on the template strand promotes transcription through a mechanism involving successive rounds of R-loop formation [32].
Large-scale G4 and R-loop sequencing studies have correlated their presence upstream of the TSS and 5′UTR regions to enhanced gene expression [14, 15, 26, 27], and our in vitro studies have provided mechanistic insights into how PQS contributes to this in the 5′UTR region [33]. However, the effects of PQS sequence variations on these structures and their transcriptional outcomes remain poorly understood. G4s are defined by a core motif of four guanine tracts separated by loops of varying lengths and compositions. Variations in these motifs lead to structural diversity, influencing G4 stability and folding topology. For example, longer loops tend to increase structural flexibility [34, 35], while shorter loops impose constraints that enhance stability [36–38]. Although genomic surveys have ranked PQS sequences by loop length prevalence [39], the molecular mechanisms by which these variations regulate transcription in the 5′UTR are still not fully understood.
In this study, we investigate how PQS sequence variations modulate G4 and R-loop structures and thereby affect transcriptional outcomes. Using a T7 RNAP in vitro transcription system, electrophoretic mobility shift assays (EMSA), and single-molecule fluorescence resonance energy transfer (smFRET), we examined G4 and R-loop formation alongside RNA production across various PQS sequences placed downstream of the TSS to mimic the 5′UTR [40]. Our findings demonstrate that sequence variations impact transcription by generating two distinct G4/R-loop structures: an intramolecular G4 with R-loop (IG4 R-loop), which lowers transcription, and an intermolecular hybrid DNA:RNA G4 with R-loop (HG4 R-loop), which enhances transcription. The PQS sequence plays a role in determining the proportion of these structures: (i) the shortest-loop PQS generates a high level of IG4 R-loop, which severely diminishes transcription; (ii) a mid-length-loop PQS forms an unstable IG4 R-loop, which gradually gives rise to the HG4 R-loop, resulting in a moderate level of transcription; and (iii) a long-loop PQS favors the formation of HG4 R-loops, leading to higher transcriptional output. Therefore, the balance between HG4 R-loops and IG4 R-loops dictates transcriptional activity, highlighting the role of G4/R-loop structures in tuning transcription.
Materials and methods
DNA construct preparation
HPLC-purified DNA oligonucleotides containing biotin for immobilization and either Cy3, Cy5, or amine modifications were procured from Integrated DNA Technologies (IDT, USA). Amine-modified oligonucleotides were labeled with NHS ester-conjugated fluorescent dyes following established protocols [41, 42]. Duplex DNA constructs were prepared by mixing top, bottom, and biotin-conjugated 18-mer oligonucleotides at a molar ratio of 1:1.2:1.5. DNA strands were annealed in T50 buffer (10 mM Tris–HCl, pH 7.5, and 50 mM NaCl) using a thermocycler. The mixture was heated to 95°C for 2 min, then gradually cooled at a rate of 2°C/min until 40°C, followed by cooling at 5°C/min until 4°C [43]. The annealed constructs were stored at −20°C and freshly re-annealed before use. All buffers were prepared using Milli-Q water and filtered through 0.22 μm membrane filters.
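As a quick planning aid, the duration of the annealing program can be worked out from the stated temperatures and cooling rates. The sketch below is only an arithmetic illustration; the function and variable names are hypothetical and not part of the original protocol.

```python
# Minimal sketch: total duration of the annealing program described above.
# Names are illustrative assumptions, not taken from the original protocol.

def ramp_minutes(start_c: float, end_c: float, rate_c_per_min: float) -> float:
    """Time needed to cool from start_c to end_c at a constant rate."""
    if rate_c_per_min <= 0:
        raise ValueError("cooling rate must be positive")
    return (start_c - end_c) / rate_c_per_min

hold_min = 2.0                                   # hold at 95 °C for 2 min
slow_ramp_min = ramp_minutes(95.0, 40.0, 2.0)    # 95 -> 40 °C at 2 °C/min
fast_ramp_min = ramp_minutes(40.0, 4.0, 5.0)     # 40 -> 4 °C at 5 °C/min
total_min = hold_min + slow_ramp_min + fast_ramp_min

print(f"slow ramp: {slow_ramp_min:.1f} min, fast ramp: {fast_ramp_min:.1f} min")
print(f"total program: {total_min:.1f} min")
```

Under these numbers the slow ramp dominates the program, so the full cycle takes a little over half an hour.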
Single molecule FRET data acquisition and analysis
A home-built prism-type total-internal-reflection inverted fluorescence (TIRF) microscope (Olympus IX 71) was used for smFRET studies, as described previously [43–45]. DNA molecules were immobilized on polyethylene glycol (PEG)-passivated quartz slides via biotin–NeutrAvidin interactions. Transcription reactions were performed with 1 mM RNAP and 1 mM NTP mix in a buffer containing 40 mM Tris–HCl, pH 7.8, 50 mM KCl, 6 mM MgCl2, 1 mM DTT, 2 mM spermidine, and 0.1% bovine serum albumin. All smFRET measurements were carried out at room temperature (∼23°C ± 2°C) in an imaging buffer containing an oxygen scavenging system (10 mM trolox, 0.5% glucose, 1 mg/mL glucose oxidase, and 4 μg/mL catalase) to minimize photobleaching of the dyes [46, 47].
An evanescent field was generated by total internal reflection of a solid-state 532 nm diode laser (Compass 315 M, Coherent) to excite the fluorophores in the sample chamber. Fluorescence from Cy3 (donor) and Cy5 (acceptor) was simultaneously collected using a water immersion objective and projected onto an EMCCD camera (Andor) after passing through a dichroic mirror (cutoff = 630 nm). Data were recorded with a time resolution of 100 ms and analyzed using scripts written in Interactive Data Language (IDL) (http://www.exelisvis.co.uk/ProductsServices/IDL.aspx) and MATLAB (https://www.mathworks.com/).
The FRET efficiency (EFRET) was calculated using IA/(ID + IA), where ID and IA represent the intensity of donor and acceptor, respectively. FRET histograms were generated from more than 4000 molecules (21 frames from 20 short movies) across different imaging surfaces. Alternating green and red laser excitation (10 frames each, separated by a dark frame) excluded donor-only molecules in low EFRET regions. Donor leakage was corrected based on donor-only EFRET values. Histograms were normalized and fitted with multi-peak Gaussian distributions. Long movies (2000 frames, i.e. 200 s) were recorded to observe molecular behavior [42, 46].
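The efficiency calculation above can be illustrated with a short sketch. The formula IA/(ID + IA) is taken from the text; the form of the donor-leakage correction (subtracting a fixed fraction of the donor signal from the acceptor channel) is an assumption, since the text only states that leakage was corrected from donor-only values. The function name and the example intensities are hypothetical.

```python
# Minimal sketch of the apparent FRET efficiency, E = I_A / (I_D + I_A).
# The leakage correction form below is an assumption for illustration.

def fret_efficiency(i_donor: float, i_acceptor: float, leakage: float = 0.0) -> float:
    """Return apparent FRET efficiency.

    leakage: assumed fraction of donor emission detected in the acceptor
             channel, e.g. estimated from donor-only molecules.
    """
    i_acceptor_corr = i_acceptor - leakage * i_donor  # remove donor bleed-through
    total = i_donor + i_acceptor_corr
    if total <= 0:
        raise ValueError("total intensity must be positive")
    return i_acceptor_corr / total

# Hypothetical background-subtracted intensities (arbitrary units).
e_raw = fret_efficiency(700.0, 300.0)
e_corr = fret_efficiency(700.0, 300.0, leakage=0.1)
print(f"uncorrected E = {e_raw:.3f}, leakage-corrected E = {e_corr:.3f}")
```

Per-molecule values computed this way would then be pooled into the histograms described in the text.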
In vitro T7 transcription assay
Transcription reactions were carried out in a total volume of 20 μL containing the following components: RNAP transcription buffer (40 mM Tris–HCl pH 8.3, 50 mM KCl, 6 mM MgCl2, 2 mM spermidine, 1 mM dithiothreitol), 50 mM KCl, RNase Inhibitor Murine (200 U/mL, NEB), T7 RNAP (1250 U/mL, NEB), and 10 nM DNA template. For transcription in different salt conditions, the 50 mM KCl was replaced with 50 mM of LiCl or water for the no monocation condition.
Transcription was initiated by adding 1 mM rNTPs. For inosine triphosphate (ITP) transcription, where GTP was substituted with ITP, the reaction was initiated with 1 mM ATP, CTP, UTP, 1 mM ITP, and 4 mM GMP (guanosine monophosphate). Similarly, for transcription with 7-deaza-rGTP, where GTP was replaced by 7-deaza-rGTP, the reaction was initiated with 1 mM ATP, CTP, UTP, 1 mM 7-deaza-rGTP, and 4 mM GMP.
The reaction mixture was incubated at 25°C for the desired transcription duration. Transcription was terminated by first adding 0.5 μL of 0.5 M ethylenediaminetetraacetic acid (EDTA), then 4 μL of solution containing 50% glycerol and 10% sodium dodecyl sulfate (SDS) in the ratio 20:1, respectively.
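A short dilution calculation shows why this amount of EDTA is enough to stop the reaction: T7 RNAP requires Mg2+, and EDTA chelates Mg2+ at 1:1 stoichiometry. The sketch below uses the volumes and concentrations stated above; the helper name is hypothetical, and the calculation ignores the glycerol/SDS solution added afterwards.

```python
# Minimal sketch: final EDTA versus Mg2+ concentration after the stop step.
# Helper name is illustrative; values come from the protocol text.

def final_conc_mM(stock_mM: float, added_uL: float, initial_uL: float) -> float:
    """Concentration of an added component after mixing into initial_uL."""
    return stock_mM * added_uL / (initial_uL + added_uL)

reaction_uL = 20.0
edta_mM = final_conc_mM(stock_mM=500.0, added_uL=0.5, initial_uL=reaction_uL)
mg_mM = 6.0 * reaction_uL / (reaction_uL + 0.5)   # Mg2+ diluted by the EDTA addition

print(f"EDTA: {edta_mM:.2f} mM, Mg2+: {mg_mM:.2f} mM, ratio: {edta_mM / mg_mM:.2f}")
```

With these values EDTA ends up in roughly twofold molar excess over Mg2+.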
Transcription with pre-folded G-quadruplex
To pre-fold G4, the DNA was annealed in a solution containing 10 mM Tris–HCl, 100 mM KCl, and 40% PEG-200. The DNA was annealed by heating to 95°C and gradually cooling at a rate of 1°C per minute until reaching 25°C. Once the pre-folding process was complete, transcription was carried out as described in the previous section.
RNase H digestion
To terminate transcription prior to RNase H digestion, 2 μL of 10 μM T7 promoter DNA was added to competitively inhibit T7 RNAP. Following transcription termination, 2.5 units of RNase H (NEB) were added to the reaction mixture. The sample was incubated at 25°C for 5 min to allow for digestion. Digestion was terminated by adding 0.5 μL of 0.5 M EDTA, 4 μL of 50% glycerol, and 0.1 μL of 10% SDS.
RNase A digestion
RNase A (10 mg/mL) was diluted 1:1000 in T50 buffer (10 mM Tris–HCl, 50 mM NaCl). Prior to transcription termination, 1 μL of the diluted RNase A was added to the transcription reaction and incubated at 25°C for 10 min. The reaction was then terminated by adding 0.5 μL of 0.5 M EDTA followed by 4 μL of a 20:1 mixture of 50% glycerol and 10% SDS.
Electrophoretic mobility shift assay
All EMSAs were performed using a 10% polyacrylamide gel. The gel composition included 10% acrylamide/bis-acrylamide solution (29:1), 1× TBE buffer, 1% ammonium persulfate, and 1% TEMED. A total of 7 μL of transcribed samples and the low molecular weight ladder (New England Biolabs) were loaded into each gel, and electrophoresis was performed at 4°C under a constant current of 12 mA per gel for 1 h. After electrophoresis, the gel was stained for 10 min in 0.5× SYBR™ Green II RNA gel stain, prepared by diluting 2.5 μL of 10 000× SYBR™ Green II RNA Gel Stain concentrate in 50 mL deionized water, and subsequently destained in deionized water for 5 min. Imaging was conducted using the Azure 400 Imager, with Cy5 fluorescence detected at 628 nm for 3 min to visualize the DNA template and SYBR™ Green II fluorescence detected at 472 nm for 3 s to visualize RNA.
FRET gel imaging
Prior to SYBR Green II RNA staining, the gel was imaged for 1 min with Cy3 excitation and Cy5 emission detection.
Gel quantification
Quantification of gels was performed using ImageJ software. The intensities of the IG4 and HG4 bands were determined from the Cy5 signals corresponding to their shifted positions on the gel. These intensities were normalized to the total Cy5 signal in the respective lane to calculate the fraction of IG4 and HG4 relative to the total DNA signal. For RNA quantification, RNA bands were identified as those with faster mobility than the linear DNA template band and exhibiting SYBR™ Green II fluorescence without Cy5 signal. This distinction was made because SYBR™ Green II dye also stains DNA. The intensities of these RNA bands were summed and normalized to the combined SYBR™ Green II signal of DNA, IG4, and HG4 in the same lane. Uncropped gel images and corresponding normalized quantification values are provided in the data source file.
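The two normalizations described above can be sketched as follows. This is an illustration of the arithmetic only, not the authors' ImageJ workflow; the function names and the band intensities are hypothetical.

```python
# Minimal sketch of the lane-wise normalizations described above.
# All names and numbers are illustrative assumptions.

def band_fractions(cy5_bands: dict) -> dict:
    """Fraction of each Cy5 band relative to the total Cy5 signal in the lane."""
    total = sum(cy5_bands.values())
    if total <= 0:
        raise ValueError("total Cy5 signal must be positive")
    return {band: value / total for band, value in cy5_bands.items()}

def normalized_rna(rna_sybr_bands: list, sybr_dna: float,
                   sybr_ig4: float, sybr_hg4: float) -> float:
    """Summed RNA SYBR signal divided by the SYBR signal of DNA + IG4 + HG4."""
    denominator = sybr_dna + sybr_ig4 + sybr_hg4
    if denominator <= 0:
        raise ValueError("DNA-associated SYBR signal must be positive")
    return sum(rna_sybr_bands) / denominator

# Hypothetical intensities for one lane (arbitrary units).
fractions = band_fractions({"DNA": 600.0, "IG4": 300.0, "HG4": 100.0})
rna_level = normalized_rna([150.0, 50.0], sybr_dna=250.0, sybr_ig4=100.0, sybr_hg4=50.0)
print(fractions, rna_level)
```

Normalizing within each lane makes the values comparable across lanes that differ in loading.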
Data analysis
For transcription rate measurements, RNA band intensities were quantified from the SYBR™ Green II fluorescence signals. They were normalized to the total nucleic acid signal in each lane by dividing the RNA intensity by the combined intensities of DNA, IG4, and HG4 bands. RNA levels were then plotted over time (0, 0.5, 3, 5, 10, 15, 20, and 30 min post transcription). A linear regression was applied to determine the transcription rate, with the slope representing the rate. For kinetic modeling, the normalized intensities of the DNA, IG4 R-loop, and HG4 R-loop bands were used as inputs to the chem_kin Python fitting package. All source code is provided in the “Data Availability” section.
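The rate estimate described above amounts to an ordinary least-squares fit of normalized RNA level against time. The sketch below uses the stated time points with hypothetical, perfectly linear RNA values so that the recovered slope is easy to check; it is not the authors' analysis script, and the function name is an assumption.

```python
# Minimal sketch: transcription rate as the slope of a least-squares line.
# RNA values are synthetic; function name is illustrative.

def least_squares_line(times, values):
    """Return (slope, intercept) of the ordinary least-squares fit."""
    n = len(times)
    if n != len(values) or n < 2:
        raise ValueError("need at least two paired observations")
    t_mean = sum(times) / n
    v_mean = sum(values) / n
    sxx = sum((t - t_mean) ** 2 for t in times)
    sxy = sum((t - t_mean) * (v - v_mean) for t, v in zip(times, values))
    slope = sxy / sxx
    return slope, v_mean - slope * t_mean

TIME_POINTS_MIN = [0, 0.5, 3, 5, 10, 15, 20, 30]       # from the text
rna_norm = [0.01 + 0.02 * t for t in TIME_POINTS_MIN]  # hypothetical data

rate, intercept = least_squares_line(TIME_POINTS_MIN, rna_norm)
print(f"transcription rate: {rate:.3f} normalized units per min")
```

With real data the fit would typically be repeated per replicate so that rates can be compared statistically.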
Statistical analysis
For all quantified gels, at least three independent replicates were performed. The mean values were plotted, and error bars represent the standard deviation of the three replicates. Two-tailed t-tests were used to calculate P-values, with statistical significance defined as P < .05. Raw data and exact P-values are provided in the accompanying data source file.
Results
Two distinct R-loops form in PQS during transcription
To investigate G4 and R-loop formation during transcription of varying PQS, we designed a DNA construct containing a T7 promoter and a PQS on the non-template strand positioned downstream of the TSS after a 16-bp spacer sequence. We focused on the non-template strand because our previous study demonstrated that PQS on the non-template strand positively influences transcription [21]. To monitor R-loop and G4 formation, we labeled the 5′ and 3′ ends of the PQS sequence with Cy3 and Cy5, respectively, as done previously (Fig. 1A). The constructs were named based on the number of thymidine linkers between the runs of guanine. For example, 111 is characterized by one thymidine linker between each G-run, i.e. [GGG T GGG T GGG T GGG], while 123 refers to [GGG T GGG TT GGG TTT GGG].
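The naming convention can be made concrete with a small helper that expands a construct name into its core PQS sequence. The function is a hypothetical illustration; it generates only the G-runs and thymidine linkers, not the promoter, spacers, or dye positions.

```python
# Minimal sketch: expand a construct name such as "123" into its core PQS.
# The helper is illustrative and covers only G-runs plus thymidine linkers.

def pqs_from_name(name: str, g_run: str = "GGG", linker_base: str = "T") -> str:
    """Each digit gives the length of one linker between consecutive G-runs."""
    if not name.isdigit():
        raise ValueError("construct name must consist of digits only")
    parts = [g_run]
    for digit in name:
        parts.append(linker_base * int(digit))  # linker of the stated length
        parts.append(g_run)                     # followed by the next G-run
    return "".join(parts)

print(pqs_from_name("111"))  # GGGTGGGTGGGTGGG
print(pqs_from_name("123"))  # GGGTGGGTTGGGTTTGGG
print(pqs_from_name("11"))   # GGGTGGGTGGG (three G-runs)
```

A name with n digits therefore yields n + 1 G-runs, which is why the two-digit 11 construct has only three.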
Figure 1.
Formation of two distinct R-loops during transcription of PQS-containing DNA sequences. (A) Schematic of the DNA construct containing a T7 promoter, a 16 base pair spacer, a PQS region flanked by Cy3 and Cy5 dyes, and a 14 base pair spacer downstream of the PQS. (B) EMSA gel showing transcription of PQS 111, a non-G4 control sequence, and PQS 121, from initiation to 30 min after the addition of transcription components. This gel is a merged overlay with red indicating DNA bands and green indicating RNA bands. PQS sequences are named based on linker length. (C) Quantification of RNA levels from the EMSA at the 30-min time point for PQS 111, the non-G4 control sequence, and PQS 121. (D–F) Quantification of EMSA results over transcription time for PQS 111 and PQS 121, showing the relative amounts of DNA, R-loop 1, and R-loop 2 bands. Data represent the mean ± standard deviation from three independent replicates. Raw values are provided in the data source file. The gray horizontal line indicates RNA levels from the non-G4 control sequence after 30 min of transcription.
We performed in vitro transcription using T7 polymerase and monitored time-dependent G4/R-loop formation and RNA production through EMSA. Due to the increased hydrodynamic radius and molecular weight, R-loops were observed as upshifted bands with reduced mobility compared to linear DNA [21]. Cy5 fluorescence allowed us to directly visualize DNA constructs, while RNA was detected using SYBR Green II RNA staining.
Before transcription initiation, the linear DNA band for PQS 111 was observed as a single band exhibiting a Cy5 fluorescence signal. After adding rNTPs to initiate transcription, two distinct low-mobility bands appeared. We designated these as R-loop 1, which forms first and exhibits slightly higher mobility than R-loop 2, which forms later and shows lower mobility (Fig. 1B). In our previous study using the PQS-cMyc sequence [GGG T GGG TA GGG T GGG], only a single upshifted G4/R-loop band was observed. However, in this study, PQS 111 and PQS 121 produced two distinct R-loop bands, particularly noticeable when the gel-running duration was extended (Fig. 1B, left). Concurrently, RNA production was detected, with bands increasing in intensity over time. Transcription using a control DNA construct, which lacked a PQS sequence but contained a scrambled sequence with 50% GC density, produced RNA but no R-loop bands (Fig. 1B, middle), confirming that PQS is necessary for R-loop formation.
We compared transcription outcomes between PQS 111 and PQS 121, which has a similar linker length to the PQS found in the c-MYC promoter sequence from our previous study, but with a TT linker in the middle instead of TA. Both PQS 111 and 121 produced G4/R-loops and formed two distinct R-loop structures during transcription (Fig. 1B, left and right), suggesting that the presence of two R-loops is a common feature of PQS-mediated transcription. We observed that transcripts from the PQS-containing sequences appeared as smeared RNA bands, while the control sequence produced a single, distinct RNA band. This smearing likely reflects secondary structures formed by G-rich RNA. A denaturing gel revealed two distinct RNA bands (Supplementary Fig. S1A): one corresponding to the expected full-length RNA and another with lower mobility, likely due to G-rich RNA folding.
Quantitative analysis showed that after 30 min of transcription, the 111 PQS construct produced less RNA than both the control sequence and the 121 PQS construct, indicating that even a one-nucleotide difference in loop length can affect transcription yield (Fig. 1C). RNA production kinetics differed between the two constructs, with PQS 111 exhibiting a lower RNA production rate than PQS 121 (Fig. 1D). Interestingly, the R-loop 1 and R-loop 2 formation dynamics also differed between the two constructs. R-loop 1 levels were broadly similar in both constructs, although PQS 111 plateaued at around 15 min whereas PQS 121 peaked at around 10 min and decreased after 20 min (Fig. 1E). R-loop 2 showed a continuous linear increase in both constructs, with PQS 121 displaying a higher formation rate, reaching a higher level earlier than PQS 111 (Fig. 1F).
These findings demonstrate that PQS sequences influence the kinetics of two different R-loop formations and RNA production. We hypothesize that R-loop 1 and R-loop 2 levels that arise from the variations in PQS sequences are responsible for different transcription outcomes.
Formation of two R-loops either with intramolecular or hybrid G-quadruplexes
We conducted a series of tests to identify R-loop 1 and R-loop 2. Our previous study demonstrated that R-loops form before G4 and that G4 formation stabilizes R-loop structure. Together, G4 and R-loops coexist to facilitate transcription.
To verify the identity of R-loop 1 and R-loop 2, we treated the samples with RNase H, which specifically digests DNA:RNA hybrids. Following treatment, R-loop 2 completely disappears, while R-loop 1 is partially reduced, suggesting it may be more resistant to RNase H cleavage (Supplementary Fig. S1B). To further verify whether R-loop 1 and R-loop 2 contain R-loops, we substituted GTP with ITP during transcription. ITP inhibits R-loop formation due to the reduced stability of inosine-cytosine base pairing [48–50]. Transcription with ITP eliminated both R-loop 1 and R-loop 2, indicating that both bands are R-loops (Fig. 2A).
Figure 2.
Identification of R-loop 1 and R-loop 2 as distinct G4-associated structures. R-loop 1 is composed of an intramolecular DNA G4 and an R-loop, while R-loop 2 consists of a DNA:RNA hybrid G4 structure with an R-loop. (A) EMSA comparing transcription under normal conditions with rNTPs versus with ITP substituted for rGTP to assess R-loop formation. (B) EMSA comparing transcription using normal linear duplex DNA versus pre-folded G4 DNA. (C) EMSA comparing transcription under normal conditions with rNTPs versus with 7-deaza-rGTP substituted for rGTP to test for DNA:RNA hybrid G4 formation. (D) Gel imaged using FRET, showing Cy3 excitation and Cy5 emission to distinguish the IG4 band. (E) Schematic representation of R-loop 1 as an intramolecular DNA G4 R-loop structure and R-loop 2 as a DNA:RNA hybrid G4 R-loop structure.
Next, we tested whether R-loop 1 or R-loop 2 contains an IG4, where the PQS in the non-template strand folds into a G4 while the DNA:RNA hybrid of the R-loop forms on the template strand, i.e. IG4/R-loop. We define the IG4 R-loop as a G4 R-loop structure formed exclusively by G-runs within the DNA. Based on our previous finding that pre-folded G4 in non-template DNA leads to rapid formation of high levels of IG4/R-loop, we compared transcription activity between a normal duplex DNA and the construct with a pre-folded G4 structure. Before transcription initiation, i.e. without NTPs (−NTP), the pre-folded G4, due to its bulkier secondary structure, exhibited slightly lower mobility than the linear DNA construct (Fig. 2B, right). As the transcription progressed, pre-folded G4 generated a prominent R-loop 1 band with no formation of R-loop 2 (Fig. 2B, right). The predominant formation of the R-loop 1 band under pre-folded G4 conditions indicates that R-loop 1 contains IG4. Additionally, the R-loop 1 band exhibited a significantly higher shift than the IG4-alone band (Fig. 2B, right gel, −NTP), further supporting the presence of IG4 associated with the R-loop.
Based on previous studies and the lower mobility shift of R-loop 2, we conjectured that R-loop 2 could contain an intermolecular hybrid G4 (HG4), formed by the G-rich non-template DNA strand pairing with the transcribed G-rich RNA [51–53]. This hybrid structure likely arises because the transcribed RNA shares the same G-runs as the non-template DNA strand. These G-runs in the RNA can potentially form Hoogsteen base pairs with the G-runs in the PQS sequence of the non-template DNA strand, forming an intermolecular hybrid DNA:RNA G4 (HG4). We define the HG4 R-loop as a G4 R-loop structure formed by G-runs contributed by both DNA and RNA strands. To test for HG4, we replaced rGTP with 7-deaza-rGTP, an rGTP analog that cannot form the hydrogen bonds required for G4 formation. Therefore, transcription with 7-deaza-rGTP produces RNA with G-runs incapable of participating in G4 formation with the DNA strand, thereby preventing HG4 formation [53]. Transcription with 7-deaza-rGTP eliminated R-loop 2, while R-loop 1 remained, indicating that R-loop 2 contains an HG4 structure along with the R-loop (Fig. 2C).
Furthermore, we performed FRET imaging of the gel to distinguish between the IG4 R-loop and HG4 R-loop. Our DNA construct includes Cy3 donor and Cy5 acceptor dyes flanking the PQS sequence. When IG4 forms, the reduced distance between the two dyes is expected to produce higher FRET than HG4. Excitation of Cy3 fluorescence and detection of Cy5 signals revealed a higher FRET intensity in the R-loop 1 band compared to R-loop 2, further confirming R-loop 1 and R-loop 2 as IG4 R-loop and HG4-containing R-loop, respectively (Fig. 2D).
To rule out the possibility that a head-to-head dimeric IG4 may form [54], which comprises one DNA and one RNA strand, we performed transcription using the “1” construct, i.e. GGGTGGG, which cannot form any type of G4. When transcription was carried out with the “1” construct, we observed no low-mobility bands (Supplementary Fig. S2A), indicating that head-to-head IG4 dimers do not form under these conditions.
We repeated these structural experiments to determine the identity of R-loop 1 and R-loop 2 bands with additional PQS constructs, and the results were consistent across all constructs. R-loop 1 was prominent when pre-folded DNA constructs were used, while R-loop 2 disappeared (Supplementary Fig. S2B), supporting the identity of R-loop 1 as an IG4 R-loop. In experiments where rGTP was replaced with 7-deaza-rGTP, R-loop 2 diminished across constructs, consistent with it being an HG4 R-loop (Supplementary Fig. S2C). FRET gel imaging also showed strong signal at the R-loop 1 position, further supporting its assignment as an IG4 R-loop (Supplementary Fig. S2D). These findings demonstrate that R-loop 1 is an IG4 R-loop that forms first during transcription, followed by R-loop 2, which is the DNA:RNA HG4 R-loop (Fig. 2E). However, the specific role of these structures in regulating transcription remains unclear.
IG4 R-loop inhibits while HG4 R-loop enhances transcription
Having assigned the two types of R-loops, we sought to investigate the roles of IG4 R-loop and HG4 R-loop in transcription by choosing experimental conditions that favor IG4 R-loop or HG4 R-loop formation and measuring the corresponding RNA output. We used the 121 PQS construct for comparison, as it yields higher RNA levels than 111, providing a more reliable baseline for detecting transcriptional changes.
As before, we used pre-folded G4 to favor IG4 R-loop formation during transcription [21, 55, 56]. Under this condition, RNA production was reduced compared to the linear PQS construct (Fig. 3A and B). As expected, pre-folded G4 promoted significantly higher IG4 formation than the linear PQS construct and eliminated HG4 formation (Fig. 3C). These findings suggest that IG4 formation lowers transcription but does not completely block it.
Figure 3.
IG4 R-loop reduces while HG4 R-loop facilitates transcription. (A) EMSA comparing transcription using normal linear duplex DNA versus pre-folded G4 DNA. (B, C) Quantification of RNA, IG4, and HG4 levels from the EMSA in panel (A), comparing transcription using normal linear duplex DNA versus pre-folded G4 DNA. (D) EMSA comparing transcription under different ionic conditions: transcription buffer without monovalent cations, 50 mM LiCl, or 50 mM KCl. (E, F) Quantification of RNA, IG4, and HG4 levels from the EMSA in panel (D), comparing transcription under varying salt conditions. (G) Two-axis plot showing the correlation between RNA levels and the HG4/IG4 ratio under different ionic conditions. (H) EMSA of the PQS 11 construct, which contains 3 runs of G separated by a single thymidine linker, compared to PQS 111. (I, J) Quantification of RNA, IG4, and HG4 levels from the EMSA, comparing transcription between PQS 111 and PQS 11 constructs. All data represent the mean ± standard deviation from three independent replicates. Raw values are available in the source data file. EMSA gels in panels (A), (D), and (H) are merged overlays with DNA bands shown in red and RNA bands in green.
To further test the role of IG4 R-loop, we modulated IG4 stability by performing transcription in solutions containing no monovalent cation, LiCl, or KCl, listed in order from the least to the most IG4-stabilizing conditions [57–60]. In the absence of monovalent cations, IG4 stability was reduced, as indicated by a decrease in IG4 band intensity over longer transcription time (Fig. 3D). Concurrently, RNA production was increased compared to LiCl- and KCl-containing conditions, confirming that IG4 reduces transcription yield (Fig. 3E). The slight difference in RNA production between LiCl and KCl suggests that the 121 PQS sequence forms relatively stable IG4 even in LiCl conditions. While the IG4 levels remained similar across all three conditions, HG4 levels varied significantly, with the no-monovalent cation condition yielding the highest HG4 formation (Fig. 3F). This indicates that destabilizing IG4 may promote HG4 formation and that HG4 may enhance transcription.
Since the IG4 and HG4 R-loops play opposite roles in transcription, we reasoned that the HG4/IG4 ratio could serve as an index that correlates with transcription output. Indeed, we observed a positive correlation between the HG4/IG4 ratio and RNA production (Supplementary Fig. S3A). Likewise, the increased presence of HG4 in the no-monovalent cation condition raised the HG4/IG4 ratio, which correlated with higher RNA production. These results suggest that HG4 facilitates transcription (Fig. 3G).
To isolate the effect of HG4 R-loop, we used the “11” construct (-GGGTGGGTGGG-), which contains only three G-runs. Since four G-runs are required for IG4 formation, the 11 construct cannot form IG4, but can still form HG4 through Hoogsteen base pairing between G-runs in the RNA and the G-rich non-template DNA strand. Strikingly, the 11 construct produced significantly more RNA than the 111 PQS construct (Fig. 3H and I) with an increased HG4/IG4 ratio (Supplementary Fig. S3B). The IG4 levels were higher in the 111 PQS construct, while HG4 levels were similar between the two constructs (Fig. 3J).
These findings suggest that transcriptional output is influenced by the ratio between HG4 and IG4, with higher IG4 R-loop levels associated with reduced RNA production and higher HG4 R-loop levels promoting increased RNA production.
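The relationship between the HG4/IG4 ratio and RNA output can be examined with a simple ratio and a Pearson correlation coefficient. The sketch below is an illustration only: the text does not state which correlation measure was used, the function names are hypothetical, and the band fractions and RNA levels are made-up placeholders rather than measured data.

```python
# Minimal sketch: HG4/IG4 ratio per condition and its Pearson correlation
# with RNA level. All numbers are hypothetical placeholders.
import math

def hg4_ig4_ratio(hg4: float, ig4: float) -> float:
    """Ratio of HG4 to IG4 band fractions; undefined when no IG4 is detected."""
    if ig4 <= 0:
        raise ValueError("IG4 signal must be positive to form a ratio")
    return hg4 / ig4

def pearson_r(xs, ys) -> float:
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two paired observations")
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Hypothetical (IG4 fraction, HG4 fraction, normalized RNA) per condition.
conditions = [(0.40, 0.10, 0.20), (0.35, 0.20, 0.35), (0.30, 0.30, 0.55)]
ratios = [hg4_ig4_ratio(hg4, ig4) for ig4, hg4, _ in conditions]
rna = [r for _, _, r in conditions]
print(f"ratios: {ratios}, Pearson r = {pearson_r(ratios, rna):.3f}")
```

A ratio is used rather than either band alone because the two structures have opposite effects on transcription.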
RNA production positively correlates with PQS loop length and HG4/IG4 ratio
To investigate whether the loop length of PQS sequences affects transcription, we measured transcription using 111, 122, 133, and 144 PQS constructs, each with progressively longer thymine linker lengths (Fig. 4A). We observed that increasing the linker length corresponded to increased RNA production (Supplementary Fig. S4A) and transcription rate (Fig. 4B). A positive correlation between the HG4/IG4 ratio and RNA production was consistent across the constructs. PQS 111, with the shortest linker, produced the least RNA and exhibited the lowest HG4/IG4 ratio, while PQS 144, with the most extended linker, showed the highest RNA production with the highest HG4/IG4 ratio (Fig. 4C). These results suggest that longer linker lengths may destabilize IG4 R-loop formation, shifting the balance toward HG4 R-loop. The increased HG4/IG4 ratio amplifies HG4’s promoting effect on transcription, driving higher RNA production.
Figure 4.
Longer linker PQS increases RNA production. (A) EMSA gels showing transcription of PQS 111, 122, 133, and 144, each containing progressively longer linker lengths. EMSA gels are merged overlays with DNA bands shown in red and RNA bands in green. (B) Quantification of transcription rate. The transcription rate was determined from the slope of RNA intensity over time, based on gel quantification. Two-tailed t-test was used to calculate statistical significance. (C) Two-axis plots showing the correlation between RNA levels and the HG4/IG4 ratio for each construct, demonstrating a similar pattern of increase with longer linker lengths.
Single molecule analysis reveals unstable IG4 R-loops in longer loop lengths
Next, we sought to probe the loop length dependence in terms of IG4- and HG4 R-loop formation by smFRET analysis. We used the same set of DNA constructs with one biotinylated strand for surface tethering (Fig. 5A). As before, the Cy3 and Cy5 dyes are positioned at either end of the PQS sequence downstream of the TSS. The FRET histogram is plotted by collecting FRET values from over 4000 molecules. Before transcription, the FRET peak is ∼0.3 for 111 and 122 (Fig. 5B and C, DNA). For 144 and 11, the FRET peak is ∼0.2 and ∼0.6, respectively, corresponding to longer and shorter linker lengths (Fig. 5D and E, DNA).
Figure 5.
Single molecule data reveal unstable IG4 formation in PQS with longer loop length. (A) Schematic of the single molecule DNA construct, featuring a biotinylated strand for surface tethering, T7 promoter, a 16-bp spacer, a PQS region flanked by Cy3 and Cy5 dyes, and a 14-bp spacer downstream of the PQS. (B–E) smFRET histograms indicating high-, mid-, and low-FRET peaks, corresponding to IG4, HG4, and DNA. (F–I) Representative traces for PQS 111, 122, 144, and 11.
The FRET histograms taken after 10, 20, and 30 min of transcription show a progressive shift from low to higher FRET states, to a degree that depends on the linker length (Fig. 5B–E, mid-, high FRET). We assign the mid-FRET peak to both the R-loop without G4 and the HG4 R-loop, and the high-FRET peak to the IG4 R-loop, based on ensemble gel-based FRET analysis (Fig. 2E) and our previous study [21]. As demonstrated previously, the R-loop without G4 is mostly a short-lived transient state; thus, the HG4 R-loop likely contributes more to the mid-FRET peak.
The high-FRET IG4 R-loop forms most in 111, followed by 122, and does not form in 144 or 11. In contrast, the HG4 R-loop forms far less in 111 than in 122 and 11. The mid-FRET state is not well resolved in 144 because of the long dye-to-dye distance. This pattern of IG4 and HG4 R-loop formation is consistent with the ensemble results, in which 111 displayed the lowest HG4/IG4 ratio and produced the least transcript.
The representative smFRET trace for 111 shows low FRET shifting first to a transient mid-FRET state and then to high FRET, representing a transient R-loop followed immediately by IG4 folding (Fig. 5F). Strikingly, 122 displays a transition from low FRET to a mid-FRET R-loop, followed by rapid FRET fluctuations that persist for a long duration as the molecule frequently exchanges between the R-loop and IG4 R-loop states, likely due to a less stable IG4 in 122 (Fig. 5G). This contrasts with the steady high-FRET state (the IG4 R-loop) seen in 111, which forms a highly stable IG4. For 144 and 11, the FRET level switches between a long-lived low-FRET state and mid-FRET without transitioning to high FRET, consistent with the prominent HG4 and the lack of IG4 R-loop (Fig. 5H and I). Together, these data indicate that IG4 R-loop stability is a primary factor influencing transcription output: a highly stable IG4 R-loop lowers HG4 formation, thereby reducing transcription, whereas an unstable IG4 can give rise to HG4, enhancing transcription.
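The peak assignment described above can be illustrated with a short sketch that converts donor and acceptor intensities into apparent FRET efficiency, E = I_A / (I_D + I_A), and sorts each value into one of the three populations. The cutoffs of 0.45 and 0.75 and the function names are assumptions made for illustration only; they are not taken from this study, and because the pre-transcription DNA peak differs between constructs (for example ∼0.2 for 144 and ∼0.6 for 11), cutoffs would have to be chosen separately for each construct.

```python
from collections import Counter


def fret_efficiency(donor, acceptor):
    """Apparent FRET efficiency from donor (Cy3) and acceptor (Cy5) intensities."""
    total = donor + acceptor
    if total <= 0:
        raise ValueError("total intensity must be positive")
    return acceptor / total


def assign_state(e, low_mid_cut=0.45, mid_high_cut=0.75):
    """Map a FRET value onto a population; both cutoffs are assumed values."""
    if e < low_mid_cut:
        return "DNA (low FRET)"
    if e < mid_high_cut:
        return "R-loop / HG4 R-loop (mid FRET)"
    return "IG4 R-loop (high FRET)"


def state_fractions(fret_values):
    """Fraction of FRET values that fall into each population."""
    counts = Counter(assign_state(e) for e in fret_values)
    n = len(fret_values)
    return {state: count / n for state, count in counts.items()}


# Placeholder (donor, acceptor) intensity pairs for illustration only.
example_pairs = [(700, 300), (400, 600), (100, 900), (720, 280)]
example_fret = [fret_efficiency(d, a) for d, a in example_pairs]
for state, fraction in state_fractions(example_fret).items():
    print(f"{state}: {fraction:.2f}")
```

Hard cutoffs are the simplest possible choice; fitting the histogram with a sum of Gaussian peaks would be an alternative when the populations overlap.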
RNA production is independent of linker positions
To determine whether the position of loop length variation within the PQS sequence affects transcription, we performed in vitro transcription with PQS 112, 121, and 211 constructs. We observed no differences in RNA production, IG4 formation, or HG4 formation across these constructs (Fig. 6A and B). Similarly, transcription experiments with PQS 122, 212, and 221 also showed no changes in RNA production, IG4, or HG4 levels (Fig. 6C and D). These results indicate that varying the loop positions within the PQS sequence does not influence transcription.
Figure 6.
Linker position does not contribute to transcription. (A) EMSA analysis of transcription for PQS 112, 121, and 211, where the increased linker length is positioned at different regions. EMSA gels are merged overlays with DNA bands shown in red and RNA bands in green. (B) Quantification of RNA, IG4, and HG4 levels for PQS 112, 121, and 211. (C) EMSA analysis of transcription for PQS 122, 212, and 221, where the increased linker length is positioned at different regions. (D) Quantification of RNA, IG4, and HG4 levels for PQS 122, 212, and 221. (E) smFRET histogram and representative traces for PQS 121, 211, 212, and 221. Quantification in panels (B) and (D) is based on three independent replicates; data are presented as mean ± standard deviation. Two-tailed t-tests were used to assess statistical significance (P > .05). Raw data are provided in the accompanying source file.
We conducted single molecule measurements on four constructs, 121, 211, 212, and 221, all containing loop lengths of 1 and 2 in different positional combinations (Fig. 6E). The results reveal that all constructs behave similarly. First, they all generate a lower level of IG4 R-loop and a higher level of HG4 R-loop, consistent with the EMSA analysis. The difference between the IG4 R-loop and HG4 R-loop levels is slightly less evident in EMSA, likely because the constricting gel matrix may stabilize the IG4 R-loop state. Second, they exhibit fluctuating FRET, indicative of an unstable IG4 R-loop that oscillates between a plain R-loop and an IG4 R-loop, as in the case of 122 (Fig. 5G).
Regulation of RNA production by competing IG4 and HG4 formation
To investigate how the kinetics of IG4 and HG4 formation affect transcription across varying PQS sequences, we sought to develop a simple model to explain their formation dynamics during transcription. After testing several models, we found the best fit to be DNA ⇌ IG4 → HG4 (Supplementary Fig. S5A–C). This model accurately describes our data (Fig. 7A).
Figure 7.
Competing dynamics between HG4 and IG4 determine transcription outcome. (A) Proposed model that best fits the data, where DNA reversibly transitions to IG4 with forward (k1) and reverse (k1r) rate constants. IG4 then irreversibly transitions to HG4 at rate k2. Fitting of the quantified levels of DNA, IG4, and HG4 for PQS 111 using the model. (B) Correlation plot illustrating relationships between different variables based on data from various PQS sequences. “Variance” represents the variation in loop length at different sequence positions. k1 and k2 are rate constants derived from fitting the model to experimental data. (C) Plot showing a strong positive correlation between the HG4/IG4 ratio and RNA production, with specific PQS sequences labeled at corresponding data points. The gray horizontal line indicates RNA levels produced by the non-G4 control sequence at 30 min. (D) Plot showing a strong negative correlation between IG4 levels and k2, with specific PQS sequences labeled. Marker size represents loop length, with shorter loops depicted as smaller markers and longer loops as larger markers. (E) Summary schematic illustrating how transcription of PQS sequences with varying loop lengths leads to either a dominant IG4 or HG4 structure, ultimately influencing transcriptional outcomes.
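The scheme in panel (A), in which DNA converts reversibly to IG4 (k1, k1r) and IG4 converts irreversibly to HG4 (k2), can be explored numerically. The sketch below integrates the corresponding first-order rate equations with a fourth-order Runge-Kutta step in plain Python. It is an illustration under stated assumptions rather than the fitting code used in this study: the rate constants are hypothetical, all material is assumed to start as DNA, and populations are expressed as fractions that sum to 1.

```python
def derivatives(state, k1, k1r, k2):
    """Rate equations for DNA <-> IG4 -> HG4 with first-order steps."""
    dna, ig4, hg4 = state
    d_dna = -k1 * dna + k1r * ig4
    d_ig4 = k1 * dna - (k1r + k2) * ig4
    d_hg4 = k2 * ig4
    return (d_dna, d_ig4, d_hg4)


def simulate(k1, k1r, k2, t_end=30.0, dt=0.01):
    """Integrate with RK4 from all-DNA; returns a list of (t, DNA, IG4, HG4)."""
    state = (1.0, 0.0, 0.0)
    steps = int(round(t_end / dt))
    trajectory = [(0.0, *state)]
    for i in range(1, steps + 1):
        a = derivatives(state, k1, k1r, k2)
        b = derivatives(tuple(s + 0.5 * dt * x for s, x in zip(state, a)), k1, k1r, k2)
        c = derivatives(tuple(s + 0.5 * dt * x for s, x in zip(state, b)), k1, k1r, k2)
        d = derivatives(tuple(s + dt * x for s, x in zip(state, c)), k1, k1r, k2)
        state = tuple(
            s + dt * (p + 2 * q + 2 * r + w) / 6
            for s, p, q, r, w in zip(state, a, b, c, d)
        )
        trajectory.append((i * dt, *state))
    return trajectory


def hg4_ig4_ratio(k1, k1r, k2, t_end=30.0):
    """HG4/IG4 ratio at the end of the simulated time course."""
    _, _, ig4, hg4 = simulate(k1, k1r, k2, t_end)[-1]
    return hg4 / ig4


# Hypothetical rate constants (per minute), chosen only to contrast a slow
# and a fast IG4 -> HG4 conversion step.
for label, k2 in (("slow conversion", 0.01), ("fast conversion", 0.10)):
    print(label, round(hg4_ig4_ratio(0.2, 0.05, k2), 2))
```

In this sketch, raising k2 while holding k1 and k1r fixed increases the HG4/IG4 ratio at 30 min, which is the qualitative behavior the model is meant to capture.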
To further test the roles of IG4 R-loop and HG4 R-loop in RNA production, we conducted in vitro transcription with additional PQS DNA constructs featuring varying loop lengths (Supplementary Table S1). Ranking RNA production across these constructs revealed differences in RNA output and IG4 and HG4 amounts (Supplementary Fig. S6A–C). To determine the driving factor(s) responsible for the differences, we analyzed correlations between RNA production and several parameters: HG4 R-loop, IG4 R-loop, loop length variance, loop length, and the kinetic parameters k1 (rate of IG4 formation) and k2 (rate of HG4 formation). Among the pairwise correlations, the HG4/IG4 ratio shows the strongest positive correlation with RNA production, followed by k2 (Fig. 7B). In contrast, IG4 displayed the strongest negative correlation with RNA production. HG4 alone did not strongly correlate with RNA output; the HG4/IG4 ratio better explains transcriptional enhancement, with higher HG4 relative to IG4 associated with increased RNA production (Fig. 7C).
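The pairwise comparison described above can be sketched as a Pearson correlation between RNA output and each candidate predictor. The example below is illustrative only: `pearson_r` is a hypothetical helper, the choice of Pearson's coefficient is an assumption (the correlation measure is not stated here), and the per-construct values are made-up placeholders, not measurements from this study.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two paired values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / math.sqrt(sxx * syy)


# Placeholder per-construct values for illustration only.
ig4 = [0.50, 0.35, 0.20, 0.10]
hg4 = [0.20, 0.25, 0.25, 0.30]
rna = [1.0, 1.6, 2.3, 3.1]

ratio = [h / i for h, i in zip(hg4, ig4)]
r_ratio = pearson_r(ratio, rna)
r_ig4 = pearson_r(ig4, rna)
print(f"HG4/IG4 ratio vs RNA: r = {r_ratio:.2f}")
print(f"IG4 vs RNA:           r = {r_ig4:.2f}")
```

Repeating the calculation for every pair of variables would give the full correlation matrix of the kind shown in Fig. 7B.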
We also observed a strong negative correlation between IG4 levels and k2, suggesting that a stable IG4 has a lower propensity to convert to HG4. Since HG4 facilitates transcription, strong IG4 formation lowers transcriptional enhancement by limiting conversion of IG4 to HG4. Examining sequence features more closely, we found that PQS constructs with longer loop lengths generally exhibited reduced IG4 levels and higher k2 values (Fig. 7D). This further supports the idea that longer loop lengths weaken IG4 stability, facilitating its conversion to HG4 and promoting transcriptional enhancement.
To further assess IG4 R-loop stability, we calculated the equilibrium constant (k1/k1r) and ranked PQS sequences accordingly (Supplementary Fig. S7A). IG4 R-loops exhibited varying degrees of resistance to RNase H digestion depending on the PQS sequence, leading us to use RNase H resistance as a proxy for IG4 stability. Plotting IG4 R-loop band intensity after RNase H digestion against the equilibrium constant (k1/k1r) revealed a positive correlation, indicating that IG4 R-loops with shorter loop lengths are more resistant to RNase H digestion and have higher k1/k1r values, whereas those with longer loop lengths show the opposite trend (Supplementary Fig. S7B).
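For reference, the rate equations implied by the scheme in Fig. 7A, together with the equilibrium constant used above, can be written as follows. This is a sketch that assumes first-order kinetics for every step; square brackets denote the fraction of molecules in each state, and the symbol K is introduced here only for illustration.

```latex
\begin{aligned}
\frac{d[\mathrm{DNA}]}{dt} &= -k_1[\mathrm{DNA}] + k_{1r}[\mathrm{IG4}] \\
\frac{d[\mathrm{IG4}]}{dt} &= k_1[\mathrm{DNA}] - (k_{1r} + k_2)[\mathrm{IG4}] \\
\frac{d[\mathrm{HG4}]}{dt} &= k_2[\mathrm{IG4}] \\
K &= \frac{k_1}{k_{1r}}
\end{aligned}
```

Under these assumptions, K equals the [IG4]/[DNA] ratio that the reversible step alone would reach if no IG4 were drained to HG4 (k2 = 0), so a larger K corresponds to a more stable IG4 R-loop.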
Furthermore, the ratio of IG4 remaining after RNase H digestion to total IG4 before digestion decreases as loop length increases (Supplementary Fig. S7C and D). This supports the idea that IG4 stability decreases with increasing loop length. These findings further confirm that PQS loop length influences IG4 R-loop stability and suggest that weaker IG4 R-loop structures facilitate their conversion to HG4 R-loops, ultimately promoting transcriptional enhancement.
IG4 R-loop reduces transcription by trapping RNAP
To ask how the IG4 R-loop reduces transcription, we ran an EMSA with and without SDS treatment. SDS treatment removes the protein, allowing the nucleic acid structure to be probed in its absence, whereas the sample without SDS (EDTA only) retains the protein-nucleic acid interaction. The samples without SDS exhibited bands above the IG4 and HG4 R-loop bands, suggesting a higher-order complex (Supplementary Fig. S8A). When the same samples were treated with SDS, the upper bands disappeared and shifted downward to the IG4 R-loop bands (Supplementary Fig. S8B and C), strongly suggesting that the extra upper bands in the no-SDS condition arise from the IG4 R-loop complexed with RNAP. Notably, the HG4 R-loop band remained unchanged in both conditions, suggesting that HG4 R-loops are not involved in the same RNAP-associated complexes as IG4 R-loops (Supplementary Fig. S8C).
We conclude that the loop length of the PQS modulates transcription outcome through two different R-loops: the IG4 and HG4 R-loops. When the PQS has short loops, as in 111, transcription readily produces an IG4 R-loop because of the stability of the G4 structure in the non-template DNA. The IG4 R-loop structure traps RNAP, likely lowering transcription activity by preventing successive RNAP loading. In contrast, long loops in the PQS lead to an unstable IG4 R-loop, which converts to the HG4 R-loop as more RNA is produced. The HG4 R-loop facilitates transcription, likely by opening the transcription bubble and promoting RNAP translocation (Fig. 7E).
Discussion
Our study investigated the effects of various PQS sequences on transcription, identifying two distinct R-loop-associated G4 structures. During transcription, an intramolecular G4 with an R-loop (IG4 R-loop) forms first, followed by an intermolecular hybrid DNA:RNA G4 with an R-loop (HG4 R-loop) (Figs 1 and 2). These structures have opposing roles in transcription: IG4 R-loops limit RNA production, while HG4 R-loops enhance it. To confirm HG4’s transcription-enhancing role, we destabilized IG4 R-loops in weak or no monovalent cation conditions and observed enhanced RNA production. Additionally, using the 11 PQS sequence, which exclusively forms HG4 R-loops, we recorded high RNA output, further validating HG4’s positive impact on transcription (Fig. 3). Overall, RNA production is modulated by both IG4 and HG4 R-loops, with a higher HG4/IG4 ratio correlating strongly with increased RNA output (Fig. 4).
To understand the formation kinetics of IG4 and HG4 R-loops, we tested different models and found that the DNA ⇌ IG4 → HG4 model best fit our data. We also considered an alternative model (DNA ⇌ IG4 ⇌ HG4, which includes the reverse HG4-to-IG4 step), but the difference in fitting was minimal (Supplementary Fig. S5C). Given that HG4 R-loops exhibit greater stability than IG4 R-loops [61, 62], and the reverse conversion from HG4 to IG4 appears negligible, we excluded this step from our model. One limitation is that our model is based on a 30-min endpoint, which may miss slower or higher-order transitions. Although we performed transcription reactions up to 45 min, we excluded this time point from analysis due to plateaued transcription caused by rNTP depletion. However, we continued to observe IG4 R-loop to HG4 R-loop conversion at 45 min (Supplementary Fig. S9A). While this model is a simplification and may not fully capture all aspects of the process, it allows us to analyze the relationship between IG4 and HG4 formation rates and assess how different PQS sequences influence these dynamics.
In our experiments with the 11 construct, which cannot form an IG4 R-loop, we observed a faint band migrating at the same position as the IG4 R-loop band. We hypothesize that, due to the G-rich nature of the sequence, it may form an incomplete G4 structure or another secondary structure. Additionally, we consistently observed a third band above the HG4 R-loop structure, emerging at later time points, typically after HG4 R-loop formation. Since this band had low intensity, we did not include it in our quantification. At this stage, we are unable to determine its exact nature, but it may represent a complex secondary structure involving an R-loop, G4, and RNA.
HG4 R-loops form later during transcription and exhibit distinct biophysical properties. Unlike IG4, HG4 R-loops are structurally more flexible and exhibit greater stability [61, 62]. HG4 can potentially form larger and more stable G4 structures through the participation of more than four G-tracts, contributed by both the DNA and RNA strands, thereby abrogating RNAP stalling. This enhanced flexibility and stability may facilitate RNA production by promoting the opening of the transcription bubble and alleviating DNA torsional constraints, thereby enabling more efficient RNAP progression.
Longer PQS loops favor HG4 R-loop formation and are associated with increased RNA production (Fig. 4). The reduced stability of longer loops may facilitate the conversion of IG4 R-loops to HG4 R-loops (Fig. 5). Additionally, the lower GC density of longer loops weakens R-loop stability, making them easier for RNAP to unwind and enabling successive transcription cycles. These contrasting effects highlight the critical role of PQS loop length in transcriptional regulation, influenced by the interplay between G4 stability and GC density, which affects R-loop stability. While our study focused on PQS constructs containing thymine (T) linkers for consistency, variation in linker composition—such as substituting adenine (A) for thymine—can alter G4 topology and R-loop formation patterns (Supplementary Fig. S9B and C).
In contrast, IG4 R-loops act as physical barriers (Supplementary Fig. S8), impeding RNAP progression and increasing torsional strain when formed on the non-template strand [63]. This strain arises because all G-tracts originate from the same DNA strand, which must twist and fold to form the G4 structure. The stability and inhibitory nature of IG4 R-loops are particularly pronounced in PQS sequences with short loop lengths. For example, among the PQS tested, the 111 PQS—with the highest GC content and the most stable IG4 R-loop structure—exhibited the lowest RNA production. The high GC density of short-loop PQS results in a stable R-loop that is difficult for RNAP to unwind, limiting transcription efficiency [64, 65]. However, in addition to loop length and GC density, the position of the PQS with respect to the promoter may also influence its regulatory effects, potentially affecting RNAP stalling, backtracking, or the efficiency of R-loop resolution. Further studies are needed to explore how PQS positioning modulates transcriptional outcomes.
Our findings suggest that HG4 R-loops play a significant role in transcriptional regulation. Given that HG4 R-loop formation is transcription-dependent, we hypothesize that HG4 R-loops may function as a positive feedback mechanism to enhance gene expression or as a sensor to modulate RNA levels in cells. The susceptibility of HG4 R-loops to RNase H digestion, which targets DNA:RNA hybrids (Supplementary Fig. S7C), along with the recent discovery that the transcription-coupled helicase CSB (Cockayne syndrome B protein) preferentially unfolds intermolecular G4s over intramolecular ones [66], suggests that HG4 R-loops may be dynamically regulated or resolved by specific cellular factors.
While previous studies suggest a 1:3 RNA:DNA ratio may be optimal for HG4 formation, the structural variability and functional roles of HG4 require further investigation. These include understanding how HG4 interacts with proteins, responds to cellular ionic conditions, and contributes to transcriptional regulation within the cellular environment. Investigating the structural and functional diversity of HG4 will provide deeper insights into its role in genome regulation and broader cellular processes.
Our study highlights the contrasting roles of IG4 R-loops and HG4 R-loops in transcriptional regulation, emphasizing the importance of PQS features, such as loop length, in shaping their formation and effects. By elucidating the mechanisms underlying these structures, we provide new insights into how R-loop-associated G4s influence transcription. While our in vitro T7 RNAP transcription system provides valuable mechanistic insights, we acknowledge that this system lacks the chromatin context and regulatory elements present in cells and may not fully reflect endogenous G4-mediated processes. These findings enhance understanding of how PQS commonly found in the human genome may behave based on their sequence characteristics. Future research exploring the function of these structures in cellular contexts and their regulatory mechanisms could offer insights for developing therapeutic interventions targeting G4s.
Acknowledgements
We sincerely thank Dr Jimin Kang for his insightful discussions, valuable suggestions, and contribution of code for data analysis, as well as his support in interpreting the results.
Author contributions: Leya Yang: Conceptualization, Formal analysis, Investigation, Methodology, Writing – original draft. Chun-Ying Lee: Conceptualization, Resources, Data Curation, Validation, Writing – review & editing. Tapas Paul: Investigation, Resources, Validation. Sua Myong: Conceptualization, Supervision, Visualization, Writing – review & editing.
Contributor Information
Leya Yang, Program in Cellular and Molecular Medicine, Boston Children’s Hospital, Harvard Medical School, Boston, MA 02115, United States.
Chun-Ying Lee, Program in Cellular and Molecular Medicine, Boston Children’s Hospital, Harvard Medical School, Boston, MA 02115, United States.
Tapas Paul, Program in Cellular and Molecular Medicine, Boston Children’s Hospital, Harvard Medical School, Boston, MA 02115, United States.
Sua Myong, Program in Cellular and Molecular Medicine, Boston Children’s Hospital, Harvard Medical School, Boston, MA 02115, United States.
Supplementary data
Supplementary data is available at NAR online.
Conflict of interest
None declared.
Funding
National Institute of General Medical Sciences (R01-GM149729-04). Funding to pay the Open Access publication charges for this article was provided by the National Institute of General Medical Sciences (R01-GM149729-04).
Data availability
Custom code is available on GitHub (https://github.com/Ha-SingleMoleculeLab) and Zenodo (https://doi.org/10.5281/zenodo.16990017).
References
- 1. Spiegel J, Adhikari S, Balasubramanian S The structure and function of DNA G-quadruplexes. Trends Chem. 2020; 2:123–36. 10.1016/j.trechm.2019.07.002.
- 2. Brickner JR, Garzon JL, Cimprich KA Walking a tightrope: the complex balancing act of R-loops in genome stability. Mol Cell. 2022; 82:2267–97. 10.1016/j.molcel.2022.04.014.
- 3. Lee C, Joshi M, Wang A et al. 5′UTR G-quadruplex structure enhances translation in size dependent manner. Nat Commun. 2024; 15:3963. 10.1038/s41467-024-48247-8.
- 4. Pandiyan A, Mallikarjun J, Maheshwari H et al. Pathological R-loops in bacteria from engineered expression of endogenous antisense RNAs whose synthesis is ordinarily terminated by Rho. Nucleic Acids Res. 2024; 52:12438–55. 10.1093/nar/gkae839.
- 5. Hänsel-Hertsch R, Simeone A, Shea A et al. Landscape of G-quadruplex DNA structural regions in breast cancer. Nat Genet. 2020; 52:878–83. 10.1038/s41588-020-0672-8.
- 6. Richl T, Kuper J, Kisker C G-quadruplex-mediated genomic instability drives SNVs in cancer. Nucleic Acids Res. 2024; 52:2198–211. 10.1093/nar/gkae098.
- 7. Li F, Zafar A, Luo L et al. R-loops in genome instability and cancer. Cancers. 2023; 15:4986. 10.3390/cancers15204986.
- 8. Cuartas J, Gangwani L R-loop mediated DNA damage and impaired DNA repair in spinal muscular atrophy. Front Cell Neurosci. 2022; 16:826608. 10.3389/fncel.2022.826608.
- 9. Richard P, Manley JL R loops and links to human disease. J Mol Biol. 2017; 429:3168–80. 10.1016/j.jmb.2016.08.031.
- 10. Gellert M, Lipsett MN, Davies DR Helix formation by guanylic acid. Proc Natl Acad Sci USA. 1962; 48:2013–8. 10.1073/pnas.48.12.2013.
- 11. Monsen RC, Trent JO, Chaires JB G-quadruplex DNA: a longer story. Acc Chem Res. 2022; 55:3242–52. 10.1021/acs.accounts.2c00519.
- 12. Chambers VS, Marsico G, Boutell JM et al. High-throughput sequencing of DNA G-quadruplex structures in the human genome. Nat Biotechnol. 2015; 33:877–81. 10.1038/nbt.3295.
- 13. Tu J, Duan M, Liu W et al. Direct genome-wide identification of G-quadruplex structures by whole-genome resequencing. Nat Commun. 2021; 12:6014. 10.1038/s41467-021-26312-w.
- 14. Lago S, Nadai M, Cernilogar FM et al. Promoter G-quadruplexes and transcription factors cooperate to shape the cell type-specific transcriptome. Nat Commun. 2021; 12:3885. 10.1038/s41467-021-24198-2.
- 15. Huppert JL, Balasubramanian S G-quadruplexes in promoters throughout the human genome. Nucleic Acids Res. 2007; 35:406–13. 10.1093/nar/gkl1057.
- 16. Zyner KG, Simeone A, Flynn SM et al. G-quadruplex DNA structures in human stem cells and differentiation. Nat Commun. 2022; 13:142. 10.1038/s41467-021-27719-1.
- 17. Hänsel-Hertsch R, Beraldi D, Lensing SV et al. G-quadruplex structures mark human regulatory chromatin. Nat Genet. 2016; 48:1267–72. 10.1038/ng.3662.
- 18. Poulet-Benedetti J, Tonnerre-Doncarli C, Valton A et al. Dimeric G-quadruplex motifs-induced NFRs determine strong replication origins in vertebrates. Nat Commun. 2023; 14:4843. 10.1038/s41467-023-40441-4.
- 19. Lago S, Nadai M, Cernilogar FM et al. Promoter G-quadruplexes and transcription factors cooperate to shape the cell type-specific transcriptome. Nat Commun. 2021; 12:3885. 10.1038/s41467-021-24198-2.
- 20. Varshney D, Spiegel J, Zyner K et al. The regulation and functions of DNA and RNA G-quadruplexes. Nat Rev Mol Cell Biol. 2020; 21:459–74. 10.1038/s41580-020-0236-x.
- 21. Lee C, McNerney C, Ma K et al. R-loop induced G-quadruplex in non-template promotes transcription by successive R-loop formation. Nat Commun. 2020; 11:3392. 10.1038/s41467-020-17176-7.
- 22. Sanz LA, Hartono SR, Lim YW et al. Prevalent, dynamic, and conserved R-loop structures associate with specific epigenomic signatures in mammals. Mol Cell. 2016; 63:167–78. 10.1016/j.molcel.2016.05.032.
- 23. Hegazy YA, Fernando CM, Tran EJ The balancing act of R-loop biology: the good, the bad, and the ugly. J Biol Chem. 2020; 295:905–13. 10.1016/S0021-9258(17)49903-0.
- 24. Niehrs C, Luke B Regulatory R-loops as facilitators of gene expression and genome stability. Nat Rev Mol Cell Biol. 2020; 21:167–78. 10.1038/s41580-019-0206-3.
- 25. Promonet A, Padioleau I, Liu Y et al. Topoisomerase 1 prevents replication stress at R-loop-enriched transcription termination sites. Nat Commun. 2020; 11:3940. 10.1038/s41467-020-17858-2.
- 26. Ginno PA, Lott PL, Christensen HC et al. R-loop formation is a distinctive characteristic of unmethylated human CpG island promoters. Mol Cell. 2012; 45:814–25. 10.1016/j.molcel.2012.01.017.
- 27. Chen L, Chen J, Zhang X et al. R-ChIP using inactive RNase H reveals dynamic coupling of R-loops with transcriptional pausing at gene promoters. Mol Cell. 2017; 68:745–57. 10.1016/j.molcel.2017.10.008.
- 28. Wulfridge P, Sarma K Intertwining roles of R-loops and G-quadruplexes in DNA repair, transcription and genome organization. Nat Cell Biol. 2024; 26:1025–36. 10.1038/s41556-024-01437-4.
- 29. Kumar C, Batra S, Griffith JD et al. The interplay of RNA:DNA hybrid structure and G-quadruplexes determines the outcome of R-loop-replisome collisions. eLife. 2021; 10:e72286. 10.7554/eLife.72286.
- 30. Miglietta G, Russo M, Capranico G G-quadruplex-R-loop interactions and the mechanism of anticancer G-quadruplex binders. Nucleic Acids Res. 2020; 48:11942–57. 10.1093/nar/gkaa944.
- 31. Amati B, Frank SR, Donjerkovic D et al. Function of the c-Myc oncoprotein in chromatin remodeling and transcription. Biochim Biophys Acta. 2001; 1471:M135–M145. 10.1016/S0304-419X(01)00020-8.
- 32. Liu J, Xiao S, Hao Y et al. Strand-biased formation of G-quadruplexes in DNA duplexes transcribed with T7 RNA polymerase. Angew Chem Int Ed. 2015; 54:8992–6. 10.1002/anie.201503648.
- 33. Lee C, McNerney C, Ma K et al. R-loop induced G-quadruplex in non-template promotes transcription by successive R-loop formation. Nat Commun. 2020; 11:3392.
- 34. Tippana R, Xiao W, Myong S G-quadruplex conformation and dynamics are determined by loop length and sequence. Nucleic Acids Res. 2014; 42:8106–14. 10.1093/nar/gku464.
- 35. Guédin A, Gros J, Alberti P et al. How long is too long? Effects of loop size on G-quadruplex stability. Nucleic Acids Res. 2010; 38:7858–68.
- 36. Kreig A, Calvert J, Sanoica J et al. G-quadruplex formation in double strand DNA probed by NMM and CV fluorescence. Nucleic Acids Res. 2015; 43:7961–70. 10.1093/nar/gkv749.
- 37. Piazza A, Adrian M, Samazan F et al. Short loop length and high thermal stability determine genomic instability induced by G-quadruplex-forming minisatellites. EMBO J. 2015; 34:1718–34. 10.15252/embj.201490702.
- 38. Bugaut A, Balasubramanian S A sequence-independent study of the influence of short loop lengths on the stability and topology of intramolecular DNA G-quadruplexes. Biochemistry. 2008; 47:689–97. 10.1021/bi701873c.
- 39. Huppert JL, Balasubramanian S Prevalence of quadruplexes in the human genome. Nucleic Acids Res. 2005; 33:2908–16. 10.1093/nar/gki609.
- 40. Paul T, Yang L, Lee C et al. Simultaneous probing of transcription, G-quadruplex, and R-loop. Methods Enzymol. 2024; 705:377–96.
- 41. Paul T, Voter AF, Cueny RR et al. E. coli rep helicase and RecA recombinase unwind G4 DNA and are important for resistance to G4-stabilizing ligands. Nucleic Acids Res. 2020; 48:6640–53. 10.1093/nar/gkaa442.
- 42. Johnson SA, Paul T, Sanford SL et al. BG4 antibody can recognize telomeric G-quadruplexes harboring destabilizing base modifications and lesions. Nucleic Acids Res. 2024; 52:1763–78. 10.1093/nar/gkad1209.
- 43. Paul T, Myong S Protocol for generation and regeneration of PEG-passivated slides for single-molecule measurements. STAR Protocols. 2022; 3:101152. 10.1016/j.xpro.2022.101152.
- 44. Lee C, McNerney C, Myong S G-quadruplex and protein binding by single-molecule FRET microscopy. Methods Mol Biol. 2019; 2035:309–22.
- 45. Hwang H, Kim H, Myong S Protein induced fluorescence enhancement as a single molecule assay with short distance sensitivity. Proc Natl Acad Sci USA. 2011; 108:7414–8. 10.1073/pnas.1017672108.
- 46. Lee H, Sanford S, Paul T et al. Position-dependent effect of guanine base damage and mutations on telomeric G-quadruplex and telomerase extension. Biochemistry. 2020; 59:2627–39. 10.1021/acs.biochem.0c00434.
- 47. Paul T, Liou W, Cai X et al. TRF2 promotes dynamic and stepwise looping of POT1 bound telomeric overhang. Nucleic Acids Res. 2021; 49:12377–93. 10.1093/nar/gkab1123.
- 48. Wright DJ, Force CR, Znosko BM Stability of RNA duplexes containing inosine·cytosine pairs. Nucleic Acids Res. 2018; 46:12099–108. 10.1093/nar/gky907.
- 49. Alseth I, Dalhus B, Bjørås M Inosine in DNA and RNA. Curr Opin Genet Dev. 2014; 26:116–23. 10.1016/j.gde.2014.07.008.
- 50. Srinivasan S, Torres AG, Ribas de Pouplana L Inosine in biology and disease. Genes. 2021; 12:600. 10.3390/genes12040600.
- 51. Zhang J, Zheng K, Xiao S et al. Mechanism and manipulation of DNA:RNA hybrid G-quadruplex formation in transcription of G-rich DNA. J Am Chem Soc. 2014; 136:1381–90. 10.1021/ja4085572.
- 52. Ren C, Duan R, Wang J et al. Dominant and genome-wide formation of DNA:RNA hybrid G-quadruplexes in living yeast cells. Proc Natl Acad Sci USA. 2024; 121:e2401099121. 10.1073/pnas.2401099121.
- 53. Choi B, Lee H DNA–RNA hybrid G-quadruplex tends to form near the 3′ end of telomere overhang. Biophys J. 2022; 121:2962–80. 10.1016/j.bpj.2022.06.026.
- 54. Adrian M, Ang DJ, Lech CJ et al. Structure and conformational dynamics of a stacked dimeric G-quadruplex formed by the human CEB1 minisatellite. J Am Chem Soc. 2014; 136:6297–305. 10.1021/ja4125274.
- 55. Zheng K, Chen Z, Hao Y et al. Molecular crowding creates an essential environment for the formation of stable G-quadruplexes in long double-stranded DNA. Nucleic Acids Res. 2010; 38:327–38. 10.1093/nar/gkp898.
- 56. Kreig A, Calvert J, Sanoica J et al. G-quadruplex formation in double strand DNA probed by NMM and CV fluorescence. Nucleic Acids Res. 2015; 43:7961–70. 10.1093/nar/gkv749.
- 57. Kim BG, Shek YL, Chalikian TV Polyelectrolyte effects in G-quadruplexes. Biophys Chem. 2013; 184:95–100. 10.1016/j.bpc.2013.10.003.
- 58. Kim BG, Evans HM, Dubins DN et al. Effects of salt on the stability of a G-quadruplex from the human c-MYC promoter. Biochemistry. 2015; 54:3420–30. 10.1021/acs.biochem.5b00097.
- 59. Tateishi-Karimata H, Kawauchi K, Sugimoto N Destabilization of DNA G-quadruplexes by chemical environment changes during tumor progression facilitates transcription. J Am Chem Soc. 2018; 140:642–51. 10.1021/jacs.7b09449.
- 60. Luo Y, Živković ML, Wang J et al. A sodium/potassium switch for G4-prone G/C-rich sequences. Nucleic Acids Res. 2024; 52:448–61. 10.1093/nar/gkad1073.
- 61. Wanrooij PH, Uhler JP, Shi Y et al. A hybrid G-quadruplex structure formed between RNA and DNA explains the extraordinary stability of the mitochondrial R-loop. Nucleic Acids Res. 2012; 40:10334–44. 10.1093/nar/gks802.
- 62. Zhao Y, Zhang J, Zhang Z et al. Real-time detection reveals responsive cotranscriptional formation of persistent intramolecular DNA and intermolecular DNA:RNA hybrid G-quadruplexes stabilized by R-loop. Anal Chem. 2017; 89:6036–42. 10.1021/acs.analchem.7b00625. [DOI] [PubMed] [Google Scholar]
- 63. Belotserkovskii BP, Liu R, Tornaletti S et al. Mechanisms and implications of transcription blockage by guanine-rich DNA sequences. Proc Natl Acad Sci USA. 2010; 107:12816–21. 10.1073/pnas.1007580107. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 64. Belotserkovskii BP, Soo Shin JH, Hanawalt PC Strong transcription blockage mediated by R-loop formation within a G-rich homopurine–homopyrimidine sequence localized in the vicinity of the promoter. Nucleic Acids Res. 2017; 45:6589–99. 10.1093/nar/gkx403. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 65. Hazel P, Huppert J, Balasubramanian S et al. Loop-length-dependent folding of G-quadruplexes. J Am Chem Soc. 2004; 126:16405–15. 10.1021/ja045154j. [DOI] [PubMed] [Google Scholar]
- 66. Liano D, Chowdhury S, Di Antonio M Cockayne syndrome B protein selectively resolves and interact with intermolecular DNA G-quadruplex structures. J Am Chem Soc. 2021; 143:20988–1002. 10.1021/jacs.1c10745. [DOI] [PubMed] [Google Scholar]
Data Availability Statement
Custom code is available on GitHub (https://github.com/Ha-SingleMoleculeLab) and on Zenodo (https://doi.org/10.5281/zenodo.16990017).








