The Journal of Biological Chemistry. 2008 May 9;283(19):12756–12762. doi: 10.1074/jbc.M705003200

G4-forming Sequences in the Non-transcribed DNA Strand Pose Blocks to T7 RNA Polymerase and Mammalian RNA Polymerase II*

Silvia Tornaletti, Shaun Park-Snyder, Philip C. Hanawalt
PMCID: PMC2442332  PMID: 18292094

Abstract

DNA sequences rich in runs of guanine have the potential to form G4 DNA, a four-stranded non-canonical DNA structure stabilized by formation and stacking of G quartets, planar arrays of four hydrogen-bonded guanines. It was reported recently that G4 DNA can be generated in Escherichia coli during transcription of plasmids containing G-rich sequences in the non-transcribed strand. In addition, a stable RNA/DNA hybrid is formed with the transcribed strand. These novel structures, termed G loops, are suppressed in recQ+ strains, suggesting that their persistence may generate genomic instability and that the RecQ helicase may be involved in their dissolution. However, little is known about how such non-canonical DNA structures are processed when encountered by an elongating polymerase. To assess whether G4-forming sequences interfere with transcription, we studied their effect on transcription elongation by T7 RNA polymerase and mammalian RNA polymerase II. We used a reconstituted transcription system in vitro with purified polymerase and initiation factors and with substrates containing G-rich sequences in either the transcribed or non-transcribed strand downstream of the T7 promoter or the adenovirus major late promoter. We report that G-rich sequences located in the transcribed strand do not affect transcription by either polymerase, but when the sequences are located in the non-transcribed strand, they partially arrest both polymerases. The efficiency of arrest increases with negative supercoiling and also with multiple rounds of transcription compared with single events.


Several genomic sequences have the ability to assume non-B form DNA structures as distinguished from the canonical right-handed double helix (1). G4 DNA is a four-stranded structure that readily forms in vitro in guanine-rich DNA sequences through the stacking of planar G tetrads (2), and it may form in vivo as a consequence of transient DNA strand separation during transcription (3, 4) or replication of G-rich sequences (5). This unusual DNA structure is very stable as a result of hydrogen bonding within each quartet, stacking of the hydrophobic quartets, and charge coordination by monovalent cations in the central cavity (6). G-rich DNA sequences occur naturally in several biologically relevant genomic regions, such as telomeres and centromeres, immunoglobulin switch regions, and mutational hot spots, as well as in repeat elements in some triplet repeat expansion diseases (7). The biological relevance of these structures is further suggested by the finding that several proteins specifically target G4 DNA structures, including the highly conserved RecQ family helicases (8), enzymes essential for maintaining genomic stability, and nucleases like mammalian GQN1 (9) and Saccharomyces cerevisiae Kem1/Sep1 (10). Formation of G4 DNA in vivo has been documented recently in Escherichia coli, in which transcription of plasmids containing G-rich sequences in the non-transcribed strand produced novel structures called G loops; G4 DNA formed in the non-transcribed strand and a stable RNA/DNA hybrid in the other (3). G loop formation was suppressed in recQ+ strains, indicating a role for RecQ in the processing of these structures and a possible role for transcription-induced G loops in promoting genomic instability (3). 
In support of this notion, G4 DNA-containing G loops localize to translocation and hypermutation sites in the human c-myc gene (4) and correspond to binding sites for activation-induced deaminase and the mismatch repair protein MutSα in mutagenic and recombinogenic G-rich regions found in immunoglobulin genes (11, 12).

Transcription-coupled DNA repair (TCR)2 is a pathway of excision repair that operates on DNA lesions located in transcribed strands of expressed genes. A current model for TCR proposes transcription arrest at the damage site as a prerequisite for initiation of TCR (13). In support of this model, a strong correlation between transcription arrest by a lesion in vitro and TCR of that lesion in vivo has been found in several cases (14–16). Although TCR has been well documented for transcription-arresting damage, little is known about whether this repair pathway may also be involved in processing other types of transcription impediments, such as those generated by natural DNA sequences that can adopt non-B form structures and that may be responsible for transcription pausing. If such sequences inhibit transcription, they might induce “gratuitous” TCR in a futile cycle, in which the original sequence would be regenerated, or DNA alterations might occur because of faulty processing in the area of transcription arrest (17). As a first step to determining whether G4-forming sequences might interfere with RNAP progression and consequently initiate TCR, we studied the effect of G-rich sequences on transcription elongation by T7 RNAP and mammalian RNAP II.

EXPERIMENTAL PROCEDURES

Proteins and Reagents—T7 RNAP was purchased from Promega. T4 polynucleotide kinase, calf thymus DNA topoisomerase I, and proteinase K were from Invitrogen. RNAP II and transcription initiation factors were purified from rat liver or recombinant sources as described previously (18, 19). Highly purified NTPs and radiolabeled nucleotides were purchased from Amersham Biosciences. D44 IgG anti-RNA antibodies were purified from ascites fluid as described previously (20). Formalin-fixed Staphylococcus aureus cells were obtained from Calbiochem. RNases A and H were purchased from Ambion.

Preparation of DNA Templates for Transcription—DNA templates used for transcription reactions with T7 RNAP and RNAP II consisted of closed circular supercoiled plasmids unless stated otherwise (see Fig. 1A). Plasmids pUC15RTS and pUC15RNTS, containing G-rich sequences in the transcribed and non-transcribed strands, respectively, were generated by ligating a 350-bp fragment from plasmid pRX15F (3), consisting of 15 repeats of the 20-mer murine immunoglobulin heavy chain Sμ consensus sequence 5′-GCTGAGCTGGGGTGAGCTGA-3′, into the BamHI-XhoI sites of pUCGTGTS (21) in one orientation or in the opposite orientation. pUC15RTS and pUC15RNTS were transformed into the F′ E. coli strain MV1184 to produce plasmids (21). Closed circular relaxed templates used in T7 RNAP and RNAP II transcription reactions were obtained by digesting supercoiled plasmids with calf thymus DNA topoisomerase. To obtain linearized templates for transcription, supercoiled plasmids were digested with HindIII.
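The strand asymmetry of the insert can be illustrated with a minimal Python sketch. It builds only the 15 tandem copies of the 20-mer consensus quoted above (300 nucleotides); the remaining sequence of the 350-bp fragment is not given in the text and is not modeled. The helper names and the choice of three or more consecutive guanines as the definition of a G run are assumptions made for illustration.

```python
import re

# 20-mer murine Smu consensus repeat quoted in the text.
SMU_REPEAT = "GCTGAGCTGGGGTGAGCTGA"
N_REPEATS = 15


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def count_g_runs(seq: str, min_len: int = 3) -> int:
    """Count non-overlapping runs of at least min_len consecutive guanines.

    The threshold of 3 is an illustrative assumption, not a value from the paper.
    """
    return len(re.findall("G{%d,}" % min_len, seq))


# G-rich strand of the repeat tract (flanking sequence is unknown and omitted).
repeat_tract = SMU_REPEAT * N_REPEATS
# Complementary strand of the same tract, read 5' to 3'.
opposite_strand = reverse_complement(repeat_tract)

print(len(repeat_tract), count_g_runs(repeat_tract), count_g_runs(opposite_strand))
# prints: 300 15 0
```

Under these assumptions, every guanine run lies on one strand of the tract and none on its complement, which is why cloning the fragment in the two orientations places the G4-forming potential on either the transcribed or the non-transcribed strand.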

FIGURE 1.

A, DNA substrate used in this study. DNA template pUC15RNTS for T7 and RNAP II transcription consisted of closed circular DNA molecules containing the Ig Sμ repeat cloned downstream of the T7 or adenovirus major late promoter (AdMLP), constructed as described under “Experimental Procedures.” The transcription start sites (+1) are represented by bent arrows. B, T7 RNAP transcription of substrates containing G-rich sequences in the transcribed (TS) or non-transcribed (NTS) strand. DNA templates were transcribed in vitro such that the transcripts were radioactively labeled. Elongation was allowed to proceed for 30 min after addition of NTPs to the reaction. Lanes 1 and 3, templates containing G-rich sequences in the transcribed strand; lanes 2 and 4, templates containing G-rich sequences in the non-transcribed strand. G4, transcripts arrested near G4-forming sequences; RO, full-length runoff transcript; M, 100-bp ladder.

T7 RNAP Transcription Reactions—For single round transcription reactions, the DNA templates (10 ng) were incubated at 37 °C for 5 min in a mixture of 50 units of T7 RNAP, 40 mM Tris-HCl (pH 7.9), 6 mM MgCl2, 2 mM spermidine, 10 μCi of [α-32P]GTP, 10 mM dithiothreitol, 212 units of RNasin, and 200 μM UTP. Elongation proceeded in the presence of UTP and [α-32P]GTP until T7 RNAP reached the end of the C-less cassette (nucleotide 5), at which the first CTP is necessary for incorporation. Heparin was added to prevent further initiation, and 200 μM CTP, ATP, and GTP were added to allow elongation to continue. Incubation continued at 37 °C for 30 min. Reactions were stopped by addition of 5 μg of proteinase K, 1% SDS, 100 mM Tris-HCl (pH 7.5), 50 mM EDTA, and 150 mM NaCl, followed by incubation for 15 min at room temperature. The nucleic acids were precipitated with ethanol, resuspended in formamide dye, and denatured at 90 °C for 3 min. The transcription products were resolved on a 5% denaturing polyacrylamide gel in Tris borate/EDTA containing 8.3 M urea. Gels were dried and autoradiographed using intensifying screens. Transcripts were quantified by PhosphorImager analysis. All transcripts were labeled up to nucleotide 5, making quantitation independent of length and G content. A time course for transcription was performed over 30 min for the G-rich sequence-containing substrates. Elongation was stopped at intervals during the time course by addition of proteinase K. Nucleic acids were precipitated with ethanol and resolved on a 5% denaturing polyacrylamide gel in Tris borate/EDTA containing 8.3 M urea.
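Because every transcript carries label only within its first five nucleotides, band intensities can be compared directly without correcting for transcript length. The following Python sketch shows one way the extent of arrest could be computed from two quantified bands under that assumption. The function name and the intensity values are hypothetical and are not taken from the paper's data or analysis software.

```python
def arrest_fraction(arrested_signal: float, runoff_signal: float) -> float:
    """Fraction of transcripts that stopped at the arrest site.

    Both arguments are background-subtracted band intensities in the same
    arbitrary units. No length correction is applied, on the assumption that
    each transcript carries the same amount of label.
    """
    total = arrested_signal + runoff_signal
    if total <= 0:
        raise ValueError("total signal must be positive")
    return arrested_signal / total


# Hypothetical intensities chosen only to illustrate the arithmetic.
example = arrest_fraction(arrested_signal=23.0, runoff_signal=77.0)
print(f"{example:.0%}")
# prints: 23%
```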

Reaction conditions for multiple round transcription reactions were identical to those for single rounds except that instead of the initial 5-min incubation with UTP and [α-32P]GTP, followed by addition of heparin, all four nucleotides were included in the transcription reaction at concentrations of 0.02 mM ATP, CTP, and UTP; 0.02 mM GTP; and 10 μCi of [α-32P]GTP. When indicated, 0.1 μg of RNase A or 1 unit of RNase H was included in the transcription reaction or added after the elongation step was completed, followed by further incubation for 30 min at 37 °C.

RNAP II Transcription Reactions—Preinitiation complexes were assembled on 20 ng of DNA template at 28 °C, followed by a 45-min preincubation of 54-μl reaction mixtures containing 20 mM HEPES-NaOH (pH 7.9); 20 mM Tris-HCl (pH 7.9); 50 mM KCl; 0.2 mM EDTA; 1 mM dithiothreitol; 0.5 mg/ml bovine serum albumin; 2% polyvinyl alcohol; 6% glycerol; and 212 units of RNasin, recombinant yeast TATA-binding protein, recombinant transcription factor IIB, recombinant transcription factor IIF, recombinant transcription factor IIE, highly purified transcription factor IIH, and RNAP II. 7 mM MgCl2, 20 μM ATP, 20 μM UTP, and 0.8 μM [α-32P]CTP (800 Ci/mmol) were added, and incubation was continued for 15 min. Elongation proceeded until RNAP II reached nucleotide 15, at which the first GTP is required for incorporation. Heparin was added to prevent further initiation, and then 800 μM each ATP, CTP, UTP, and GTP were added to allow elongation to continue, typically for 30 min. Elongation complexes were immunoprecipitated with anti-RNA antibodies and formalin-fixed S. aureus cells and then washed three times in reaction buffer containing 20 mM Tris-HCl (pH 7.9), 3 mM HEPES-NaOH (pH 7.9), 60 mM KCl, 0.5 mM EDTA, 2 mM dithiothreitol, 0.2 mg/ml acetylated bovine serum albumin, and 2.2% (w/v) polyvinyl alcohol. Washed complexes were resuspended in 60 μl of reaction buffer for further treatment. Reactions were stopped by addition of 5 μg of proteinase K, 1% SDS, 100 mM Tris-HCl (pH 7.5), 50 mM EDTA, and 150 mM NaCl, followed by incubation for 15 min at room temperature. Nucleic acids were precipitated with ethanol. Samples were resuspended in formamide dye, heat-denatured, and electrophoresed through a polyacrylamide gel. Gels were dried and autoradiographed using intensifying screens.

For multiple round transcription reactions, heparin was omitted, and all four NTPs were included in the transcription reaction at final concentrations of 800 μM ATP, GTP, and UTP and 60 μM CTP, followed by incubation for 1 h at 28 °C. When indicated, 0.1 μg of RNase A or 1 unit of RNase H was included in the transcription reaction or added after the elongation step was completed, followed by further incubation for 30 min at 28 °C.

RESULTS

Effect of G-rich Sequences in the Transcribed or Non-transcribed Strand of Template DNA on Transcription Elongation by T7 RNAP—DNA substrates for T7 RNAP transcription consisted of supercoiled plasmids (Fig. 1A; see “Experimental Procedures”) carrying a G-rich region 350 bp in length including 15 repeats of the murine immunoglobulin heavy chain Sμ consensus sequence 5′-GCTGAGCTGGGGTGAGCTGA-3′. This particular G-rich region has been reported previously by Duquette et al. (3) to generate G4 DNA in the non-transcribed strand and a stable RNA/DNA hybrid on the other when transcribed by T7 RNA polymerase in vitro and in vivo. Formation of G4 DNA was monitored by sensitivity to cleavage by GQN1 endonuclease, an enzyme specific to G4 DNA, and by binding to nucleolin, a protein that specifically interacts with G4 DNA. Further evidence of G4 DNA formation was obtained by dimethyl sulfate footprinting (3). On the basis of these observations, we initiated an investigation of the behavior of translocating T7 RNA polymerase when encountering this G4-forming sequence. In single round transcription, each template was transcribed only once by a single molecule of RNAP so that the transcription products represented a single promoter-dependent elongation event (22). Under these conditions, the G4 DNA, if formed, would be expected to block the polymerase ahead of the structure so that no RNA/DNA hybrid would be formed through the G-rich sequence. After synthesis of a short 32P-labeled RNA through the C-less cassette, T7 RNAP was stalled, and heparin was added to prevent further initiation events. All NTPs were added to allow elongation to continue. G-rich sequences located in the transcribed strand were not an encumbrance to T7 RNAP, as indicated by the production of only full-length runoff transcripts (Fig. 1B, lanes 1 and 3).
When the G-rich sequences were located in the non-transcribed strand, transcription by T7 RNAP produced both full-length products and transcripts shorter than the full-length RNA present in the control (Fig. 1B, lane 2). Furthermore, the total amount of products was less than that produced when the G-rich sequences were in the transcribed strand, suggesting an effect not only upon transcription elongation but also upon initiation. When KCl was included in the transcription reaction, the amount of short RNA increased from 15 to 23%, as expected from the KCl stabilization of the G4 DNA structure (6). The shorter transcript was ∼300 nucleotides in length, as judged by comparison with a marker of similar size. Surprisingly, the termination of this transcript mapped ∼90 bp upstream from the first Sμ repeat, in a region of high thymine content in the non-transcribed strand. To determine whether the rate of synthesis of the full-length and truncated transcripts varied with time, we monitored the amounts of transcripts generated from 0 to 30 min after initiation of the elongation phase (Fig. 2). The templates containing the G-rich sequences in the transcribed strands produced only the full-length runoff product as expected (Fig. 2, lanes 1–7). However, when the G-rich sequences were in the non-transcribed strand, shorter transcripts were produced in addition to full-length runoff transcripts (Fig. 2, lanes 8–14), at a frequency of up to 23%. The extent of arrest did not change with time, indicating that RNA polymerase was arrested rather than paused by the presence of the G-rich sequence in the non-transcribed strand. When heparin was omitted from the transcription reaction to allow additional rounds of transcription (Fig. 3), a second short transcript (lane 8), in addition to that observed after a single round (lane 4), was obtained that mapped ∼10 bp downstream from the first one.
This result suggests that the G loop might have a greater effect on transcription than the G quartet alone. The extent of arrest increased ∼2-fold in supercoiled DNA (Fig. 3, lanes 4 and 8) compared with that in relaxed (lanes 3 and 7) or linear (data not shown) DNA, confirming that DNA topology influences formation of these non-canonical structures (1, 3).
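The distinction drawn above between arrest and pausing rests on whether the fraction of short transcripts stays constant over the time course: a paused polymerase would eventually resume, so the short band would be chased into runoff product. A minimal Python sketch of that reasoning follows. The data points, the function names, and the tolerance value are hypothetical and do not come from the paper.

```python
from statistics import mean


def arrest_fractions(time_course):
    """Convert (minutes, arrested_signal, runoff_signal) tuples to (minutes, fraction)."""
    return [(t, a / (a + r)) for t, a, r in time_course]


def looks_arrested(time_course, tolerance=0.05):
    """Return True if the arrested fraction stays within `tolerance` of its mean.

    A roughly constant fraction is read as arrest; a steadily falling fraction
    is read as pausing. The tolerance is an arbitrary illustrative threshold.
    """
    fractions = [f for _, f in arrest_fractions(time_course)]
    avg = mean(fractions)
    return all(abs(f - avg) <= tolerance for f in fractions)


# Hypothetical band intensities (arbitrary units) at 5, 10, 20, and 30 min.
stable = [(5, 20, 80), (10, 21, 79), (20, 22, 78), (30, 21, 79)]
chased = [(5, 40, 60), (10, 25, 75), (20, 10, 90), (30, 3, 97)]

print(looks_arrested(stable), looks_arrested(chased))
# prints: True False
```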

FIGURE 2.

Time course of T7 transcription of DNA templates containing G4-forming sequences in the non-transcribed strand. Templates containing G-rich sequences in either the transcribed (TS; lanes 1–7) or non-transcribed (NTS; lanes 8–14) strand were transcribed in vitro. Samples were removed from each reaction mixture at the indicated times. G4, transcripts arrested near G4-forming sequences; RO, full-length runoff transcript.

FIGURE 3.

Effect of DNA topology on T7 transcription of DNA templates containing G-rich sequences in the transcribed or non-transcribed strand. DNA templates were transcribed as described in the legend to Fig. 1B. Lanes 1, 2, 5, and 6, templates containing G-rich sequences in the transcribed strand (TS); lanes 3, 4, 7, and 8, templates containing G-rich sequences in the non-transcribed strand (NTS). G4, transcripts arrested near G4-forming sequences; RO, full-length runoff transcript; M, 100-bp ladder.

Generation of G Loops by T7 Transcription of G-rich Sequences Located in the Non-transcribed Strand of Template DNA—To determine whether G loops formed in our in vitro transcription system, transcripts obtained from T7 transcription of G-rich sequences located in the transcribed or non-transcribed strand were digested with RNase A to remove any single-stranded RNA or with RNase H to digest the RNA strand in the RNA/DNA hybrid. After RNase A digestion of transcripts obtained from transcription of G-rich sequences located in the non-transcribed strand, RNase A-resistant transcripts ranging in size from 350 to 400 nucleotides were visible (Fig. 4, lane 8). These transcripts are the approximate size of the DNA insert predicted to adopt G4 structures and thus the size of the G loop formed. Digestion with RNase H produced a significant increase of one of the bands corresponding to the truncated transcripts that result from blockage of the polymerase (Fig. 4, lanes 6 and 7). The length of this shorter product, which would be expected if a G loop originated behind the first elongating polymerase, is consistent with formation of an RNA/DNA hybrid starting near the 5′ end of the G-rich insert (3).

FIGURE 4.

RNase A and H digestion of transcripts obtained from transcription of G-rich sequences located in the transcribed or non-transcribed strand of template DNA. Lanes 1–4, templates containing G-rich sequences in the transcribed strand (TS); lanes 5–8, templates containing G-rich sequences in the non-transcribed strand (NTS). Templates were transcribed in vitro as described in the legend to Fig. 1B. Elongation was allowed for 30 min, followed by addition of RNase A (lanes 4 and 8) or RNase H (lanes 2, 3, 6, and 7) and further incubation for 30 min. RNA was then isolated and electrophoresed through a 5% polyacrylamide gel. Transcripts arrested near G4-forming sequences are indicated by G4. RO, full-length runoff transcript.

Effect of G-rich Sequences in the Transcribed or Non-transcribed Strand of Template DNA on Transcription Elongation by Mammalian RNAP II—DNA substrates for RNAP II transcription consisted of supercoiled templates constructed as described (Fig. 1A and see “Experimental Procedures”). A 350-bp fragment including the murine immunoglobulin heavy chain Sμ consensus repeat 5′-GCTGAGCTGGGGTGAGCTGA-3′ was positioned in the transcribed or non-transcribed strand downstream of the adenovirus major late promoter (Fig. 1A). To study the effect of G-rich sequences on transcription by mammalian RNAP II, we utilized an in vitro reconstituted system containing purified RNAP II and initiation factors (18). Using this system, we found that G-rich sequences located in the non-transcribed strand produced shorter transcripts in addition to the full-length RNA products (Fig. 5, lanes 9–12). Comparison of the migration of these RNA products with a marker of similar size indicated that these transcripts were ∼300 nucleotides long, corresponding to transcription arrest ∼10 bp upstream from the first Sμ repeat. When transcription with supercoiled DNA (Fig. 5, lanes 9 and 12) was compared with that with linear (lane 10) or relaxed (lane 11) DNA, we observed an increase in the extent of arrest. Furthermore, transcription arrest with linear or relaxed DNA was detected only under multiple round transcription conditions, suggesting that under single round conditions, the positive supercoiling generated ahead of RNAP II destabilized the G4 structure (23). The presence of G-rich sequences in the transcribed strand did not affect transcription by RNAP II (Fig. 5, lanes 1–6).

FIGURE 5.

Effect of DNA topology on RNAP II transcription of DNA templates containing G-rich sequences in the transcribed or non-transcribed strand. Lanes 1–6, templates containing G-rich sequences in the transcribed strand (TS); lanes 7–12, templates containing G-rich sequences in the non-transcribed strand (NTS). Templates were transcribed in vitro such that transcripts were labeled with 32P as described under “Experimental Procedures.” Elongation was allowed for 30 min. RNA was then isolated and electrophoresed through a 5% polyacrylamide gel. Transcripts arrested near G4-forming sequences are indicated by G4. RO, full-length runoff transcript; M, 100-bp ladder.

RNase A Digestion of Transcripts Obtained from RNAP II Transcription of G-rich Sequences in the Transcribed or Non-transcribed Strand—As a first step to determining whether RNAP II transcription generated G loops, as observed previously with T7 RNA polymerase, transcripts obtained from RNAP II transcription of substrates containing G-rich sequences were digested with RNase A. If G loops were formed, we expected to detect RNase A-resistant products. Indeed, we found that RNase A-resistant transcripts were obtained after extensive RNase A digestion of transcripts obtained from transcription of G-rich sequences in the non-transcribed strand (Fig. 6, lane 8). These transcripts ranged in size from ∼350 to 450 nucleotides. RNase A digestion of transcripts from substrates with G-rich sequences in the transcribed strand did not produce any RNase A-resistant products (Fig. 6, lanes 2–5), confirming that G loops form only when G-rich sequences are located in the non-transcribed strand.

FIGURE 6.

RNase A digestion of transcripts produced after RNAP II transcription of templates containing G-rich sequences in the transcribed or non-transcribed strand. Lanes 1–5, templates containing G-rich sequences in the transcribed strand (TS); lanes 6–10, templates containing G-rich sequences in the non-transcribed strand (NTS). Templates were transcribed in vitro such that transcripts were labeled with 32P as described under “Experimental Procedures.” Elongation was allowed for 30 min, followed by addition of RNase A and further incubation for 30 min. RNA was then isolated and electrophoresed through a 5% polyacrylamide gel. Transcripts arrested near G4-forming sequences are indicated by G4. RO, full-length runoff transcript; M, 100-bp ladder (left) or 10-bp ladder (right). Triangles represent decreasing concentrations (from left to right) of RNase A.

Next, we asked whether G loops formed co-transcriptionally or instead resulted from strand displacement in a DNA duplex by an RNA molecule synthesized on a different template. Substrates containing G-rich sequences in the transcribed or non-transcribed strand were transcribed in the presence of RNase A to digest any single-stranded RNA product formed during transcription (Fig. 7). Interestingly, the pattern of RNase A-resistant products was not significantly affected by whether RNase A was present during transcription (Fig. 7, lane 9) or added afterward (lane 10), indicating that an RNA/DNA hybrid formed co-transcriptionally.

FIGURE 7.

Co-transcriptional formation of RNA/DNA hybrids. Lanes 1–5, templates containing G-rich sequences in the transcribed (TS) strand; lanes 6–10, templates containing G-rich sequences in the non-transcribed (NTS) strand. Templates were transcribed in vitro as described in the legend to Fig. 5. RNase A was either included in the transcription reaction (lanes 4 and 9) or added after a 30-min elongation (lanes 5 and 10). RNA was then isolated and electrophoresed through a 5% polyacrylamide gel. Transcripts arrested near G4-forming sequences are indicated by G4. RO, full-length runoff transcript; M, 100-bp ladder.

DISCUSSION

We have studied the effect of G-rich DNA sequences on transcription elongation by T7 RNA polymerase and mammalian RNA polymerase II. We used an in vitro transcription system with purified RNAP and initiation factors and substrates containing G-rich sequences either in the transcribed or non-transcribed strand downstream of the T7 promoter or the adenovirus major late promoter. We found that G-rich sequences in the non-template strand partially arrested both polymerases. Surprisingly, the T7 RNAP arrest site mapped ∼90 bp upstream of the first G repeat, corresponding to a DNA sequence very rich in Ts in the non-transcribed strand. It is unclear why we did not observe a similar effect when the G-rich sequence was in the transcribed strand (Figs. 3, lane 2, and 5, lane 3). We speculate that the location of the T-rich sequence in the non-transcribed strand when T7 RNAP is arrested at the G-rich insert may constitute a relevant factor in determining T7 RNAP arrest. T-rich sequences located in the non-transcribed strand are known to cause arrest of both RNAP II (24) and T7 RNAP (25). However, it is intriguing that T7 RNAP, but not RNAP II, was affected by this T-rich sequence, and we do not understand why the presumed existence of G4 DNA many base pairs downstream should trigger arrest of transcription by T7 RNAP. Experiments are currently under way to characterize the mechanism of arrest at this T-rich sequence. By comparison, the RNAP II arrest site was only 10 bp upstream of the first G repeat. It has been shown previously that the G-rich repeat contained in the murine immunoglobulin heavy chain Sμ region (used in this study) forms G4 DNA in vitro and in vivo during transcription in E. coli (3), suggesting that formation of G4 DNA in the non-transcribed strand exerts an inhibitory effect on RNAP II translocation. However, other studies have reported that the G-rich sequence in the Sμ repeat could also potentially assume a DNA cruciform structure (26). 
Furthermore, in addition to guanine-guanine base pairing in the G quartet, intrastrand base pairing can also occur between bases located near the G quartets, raising the possibility that several different alternative DNA secondary structures might contribute to the arrest of RNAP transcription we observed at G repeats (26, 27).

The location of the RNAP II arrest site ∼10 bp upstream of the first G-rich sequence is in good agreement with the expected location of the downstream edge of an arrested RNAP II as obtained from footprinting experiments (28, 29), suggesting that the unusual DNA structural properties of G-rich sequences might be sufficient to interfere with RNAP II progression. Similar to the effect observed at natural RNAP II arrest sites, resulting for the most part from T-rich sequences located in the non-transcribed strand, we found that G-rich sequences only partially arrested RNAP II. The presumed formation of G4 DNA in the non-transcribed strand appeared to be sufficiently stable to overcome the positive supercoiling produced ahead of RNA polymerase, as indicated by the ability of this non-B DNA structure to block transcription even under single round conditions, when only one RNA polymerase molecule is present per DNA template (Figs. 3, lane 4, and 5, lane 9). It is also likely that the length of this G repeat may have had an effect on the extent of blockage, as shown previously in the case of DNA triplex-forming sequences located in the frataxin gene (30, 31). In agreement with the observation that quadruplex DNA is further stabilized by transcription (3), we found that the extent of RNAP II arrest at G-rich sequences increases under multiple round transcription conditions compared with that for a single round. In addition, consistent with the observation that the extent of G loop formation increased in supercoiled DNA compared with relaxed and linear DNA (3), we found that the extent of transcription arrest was dependent on DNA topology (Figs. 3 and 5). Our finding that linear DNA also caused polymerase arrest suggests, as observed previously by Duquette et al. (3), that G loops can form in our linear substrates.

It was shown previously that transcription by T7 RNAP of G-rich sequences located in the non-transcribed strand in E. coli produced G loops, consisting of G4 DNA in the non-transcribed strand and an RNA/DNA hybrid in the transcribed strand. We found that transcription of G-rich sequences produced RNase A-resistant products ranging in size from 350 to 450 nucleotides, consistent with G loop formation in the Sμ repeat. Furthermore, these RNA species were still visible when RNase A was included in the transcription reaction (Fig. 7), indicating that they were formed co-transcriptionally.

It was reported that formation of G loops in E. coli is inhibited in recQ+ strains (3), suggesting that the RecQ helicase might have a role in processing these structures and that their formation could result in genetic instability (3, 32). In addition, it was shown that large deletions occur within long polyguanine tracts in the nematode Caenorhabditis elegans lines mutant in the protein DOG-1 (for deletion of guanine-rich DNA). Similar to the RecQ helicase function in unwinding G4 DNA, DOG-1 has been suggested to resolve G4 DNA structures in vivo (33).

TCR is a specialized repair pathway that removes DNA lesions from transcribed strands of expressed genes. This repair pathway has been well documented for transcription-blocking DNA damage, including UV-induced cyclobutane pyrimidine dimers and cisplatin-induced intrastrand cross-links. A correlation between transcription arrest by a lesion in vitro and TCR of that type of lesion in vivo has been noted (14, 15). It was speculated that TCR might also be elicited by arrest of transcription at natural DNA sequences that can adopt non-B DNA structures (17). The resulting reiterative and futile repair replication might then generate a significant level of mutagenesis in a frequently transcribed gene because of faulty processing in the area of transcription arrest. For example, there may be a lack of mismatch repair in the short tracts of repair replication, and the repair replication fidelity could also be affected by the unusual DNA sequence. Our finding that G-rich sequences in the non-transcribed strand cause RNAP II arrest raises the intriguing possibility that G repeats might trigger TCR in vivo. If this occurs, it would represent the first example of TCR triggered by the presence of an encumbrance in the non-transcribed strand. Experiments designed to characterize the stability and structural properties of these DNA-RNA-RNA polymerase arrested complexes will further elucidate the role of transcription and TCR in processing G-rich sequences in genomic DNA.

Acknowledgments

We thank Ann K. Ganesan, C. Allen Smith, Jennifer VanOverbeke, and Boris P. Belotserkovskii for helpful discussions and critical reading of this manuscript. We are indebted to Joan and Ron Conaway for providing rat liver RNAP II and initiation factors. We are most grateful to Nancy Maizels for providing plasmid pRX15F and for helpful discussions throughout the project.

* This work was supported, in whole or in part, by National Institutes of Health Grant CA-77712 from NCI, United States Department of Health and Human Services (to S. T. and P. C. H.). The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked “advertisement” in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

Footnotes

2 The abbreviations used are: TCR, transcription-coupled DNA repair; RNAP, RNA polymerase.


Articles from The Journal of Biological Chemistry are provided here courtesy of American Society for Biochemistry and Molecular Biology
