Significance
The ends of eukaryotic linear chromosomes are capped by telomeres which terminate with a single-stranded overhang. Telomeric overhangs fold into compact structures, called G-quadruplexes, that inhibit access to these critical genomic sites. We report single-molecule measurements and computational modeling studies probing the accessibility of a set of human telomeric overhangs that cover a significant portion of the physiologically relevant length scale. We observe accessibility patterns which have a well-defined periodicity and show that certain regions are significantly more accessible than others. These accessibility patterns also suggest the underlying folding frustration of G-quadruplexes depends on telomere length. These patterns have significant implications for regulating the access of DNA-processing enzymes and DNA-binding proteins that can target telomeric overhangs.
Keywords: telomere, G-quadruplex, FRET-PAINT, single molecule
Abstract
We present single-molecule experimental and computational modeling studies investigating the accessibility of human telomeric overhangs of physiologically relevant lengths. We studied 25 different overhangs that contain 4–28 repeats of GGGTTA (G-Tract) sequence and accommodate one to seven tandem G-quadruplex (GQ) structures. Using the FRET-PAINT method, we probed the distribution of accessible sites via a short imager strand, which is complementary to a G-Tract and transiently binds to available sites. We report accessibility patterns that periodically change with overhang length and interpret these patterns in terms of the underlying folding landscape and folding frustration. Overhangs that have [4n]G-Tracts, (12, 16, 20…) demonstrate the broadest accessibility patterns where the peptide nucleic acid probe accesses G-Tracts throughout the overhang. On the other hand, constructs with [4n+2]G-Tracts, (14, 18, 22…) have narrower patterns where the neighborhood of the junction between single- and double-stranded telomeres is most accessible. We interpret these results as the folding frustration being higher in [4n]G-Tract constructs compared to [4n+2]G-Tract constructs. We also developed a computational model that tests the consistency of different folding stabilities and cooperativities between neighboring GQs with the observed accessibility patterns. Our experimental and computational studies suggest the neighborhood of the junction between single- and double-stranded telomeres is least stable and most accessible, which is significant as this is a potential site where the connection between POT1/TPP1 (bound to single-stranded telomere) and other shelterin proteins (localized on double-stranded telomere) is established.
The ends of linear chromosomes, called telomeres, contain repeating sequences and play vital roles in promoting the integrity of these important genomic sites. Telomeres are involved in safeguarding the chromosome ends by differentiating them from DNA double-strand breaks, which would otherwise trigger unwanted DNA damage response (1, 2). Human telomeric DNA is composed of tandem repeats of hexanucleotide d(GGGTTA) arranged into a long double-stranded region followed by a 50–300 nucleotides (nts) long 3′ single-stranded overhang (3). These unique telomeric repeats are conserved across vertebrates and beyond (4). The length of telomeres is important for their protective functions (5, 6); however, the inability of DNA replication machinery to complete the replication and processing of chromosome ends causes progressive telomere shortening during successive cell divisions (7). When a critical telomere length is reached, apoptosis or cell senescence is activated in somatic cells (8). Certain oncogenic cells continue to proliferate by overexpressing telomerase ribonucleoprotein complex, which elongates telomeres, eventually leading to development of malignant tumors (6, 9). Telomerase includes a reverse transcriptase enzyme that synthesizes telomeric DNA repeats d(GGTTAG) using an internal RNA template (3, 10).
The G-rich telomeric overhangs fold into compact noncanonical structures called G-quadruplexes (GQs). A GQ is formed when four repeats of d(GGGTTA) are arranged into planar quartet geometry, stabilized by Hoogsteen hydrogen bonding and a centrally located monovalent cation. For brevity, the d(GGGTTA) sequence will be referred to as a “G-Tract.” Potentially GQ-forming sequences are concentrated in certain regions of the human genome, including telomeres and promoters, and GQs have been confirmed to form in vitro (11, 12) and in vivo (13–15). GQs are implicated in crucial biological processes, such as transcription, recombination, and replication (16, 17). Telomeric GQs can inhibit telomerase from elongating telomeres (18). While much is known about telomeres, many fundamental questions remain about their structure, the nature of the interactions between folded GQs, and accessibility of different regions of the telomeric overhang.
Most studies on GQs have typically focused on understanding the structure and function of “single” DNA or RNA GQ molecules (19–21) and their interactions with DNA-binding proteins (22–24) and helicases (25–28). However, the presence of many G-Tracts in human telomeric overhangs allows formation of 2–10 tandem GQs with varying separations. How these tandem GQs interact with each other and their folding characteristics would clearly impact telomere accessibility. Prior studies on DNA constructs that form multiple telomeric GQs have typically investigated the thermodynamic, kinetic, or structural characteristics of isolated DNA molecules but have not directly studied telomere accessibility, which was the primary goal of this study. The studies on isolated tandem GQs have not reached a consensus and have concluded either negligible interactions, stabilizing stacking interactions (positive cooperativity), or destabilizing interactions (negative cooperativity) between neighboring GQs (29–34). Variations in ionic conditions, temperature, sensitivity of probes to different thermodynamic parameters, or the design of the DNA constructs might have played a role in these different conclusions.
In this study, we investigated the accessibility patterns of telomeric single-stranded DNA (ssDNA) molecules that contain 4–28 G-Tracts, a total of 25 constructs, using the single-molecule FRET-PAINT (Förster Resonance Energy Transfer-Point Accumulation for Imaging in Nanoscale Topography) method (34). These constructs can form one to seven tandem GQs or could have multiple unfolded G-Tracts at different segments of the overhang. FRET-PAINT combines attributes of the superresolution microscopy technique DNA-PAINT (35) with those of single-molecule FRET (smFRET) (36). This methodology takes advantage of transient binding of a labeled probe (called imager strand) to a target strand. As the imager strand, we used an acceptor (Cy5) labeled peptide nucleic acid (PNA) that was complementary to a 7-nt segment of telomeric sequence. The imager strand is introduced to a microfluidic chamber that contains surface-immobilized, donor (Cy3) labeled partial duplex DNA (pdDNA) constructs that have a single-stranded telomeric overhang (Fig. 1A). This partial duplex design mimics the physiological structure. Binding of Cy5-PNA to an available G-Tract results in a FRET signal (Fig. 1B), which is used as an indicator for telomere accessibility. This FRET signal depends on the position of the binding site within the overhang and the folding pattern between the binding site and the ssDNA/dsDNA (double-stranded DNA) junction, where Cy3 is located. The resulting FRET distributions represent the accessibility patterns of the overhang. The distinguishing feature of this approach is providing a direct probe of telomere accessibility for overhangs with physiologically relevant lengths.
Fig. 1.
Schematic of FRET-PAINT assay. (A) Partial duplex DNA constructs (labeled with donor fluorophore Cy3) are immobilized on the PEGylated surface via biotin-streptavidin attachment. The pdDNA construct has a telomeric overhang with multiple G-Tract repeats, which can form GQs separated from each other with unfolded regions of varying length. Different FRET levels are observed when the imager strand (Cy5-PNA) transiently binds to different segments of the overhang. (B) An example smFRET time trace showing five binding events with low, mid, and high FRET levels. The red line is a fit to the FRET trace to indicate the corresponding FRET levels. (C) Schematics of overhangs that contain four GQs with a modified sequence [(GGGT)3GGG for each GQ] and two specific binding sites (BS1 and BS2) for Cy5-PNA. (D) FRET-PAINT data demonstrating the accessibility patterns for constructs in which BS1 and BS2 are separated from each other by one to four GQs. The FRET level representing binding to BS1 remains constant, while that representing binding to BS2 gradually decreases but remains within detectable range. The blue and red curves are Gaussian peak fits to the data (listed in SI Appendix, Table S5). (E) Schematics of overhangs that contain four modified GQs and a specific BS for Cy5-PNA. (F) FRET-PAINT data on constructs in which BS is separated from the junction by one to five modified GQs. The FRET level representing binding to BS systematically decreases as the overhang length increases yet it remains within detectable range even after five GQs. The blue curves are Gaussian peak fits to the data (listed in SI Appendix, Table S6), and dashed lines indicate the peak positions.
As accessibility patterns depend on the underlying GQ folding landscape, we also studied their implications for folding characteristics. For this, we developed a computational model that calculates the resulting accessibility patterns when folding stabilities for different segments of the overhang and nature of interactions between neighboring GQs are modulated. With this approach, we mapped the resulting patterns for a broad range of stability constants and investigated the resulting patterns for positive, negative, and negligible cooperativity between neighboring GQs.
Results
Before presenting the measurements on telomeric overhangs, we start with important control measurements that establish the validity and sensitivity of the FRET-PAINT approach for sequences that form tandem GQ structures. In these studies, we replaced the telomeric repeats (GGGTTA)3GGG with the (GGGT)3GGG sequence, which forms a very stable GQ structure that attains only a parallel folding conformation (22). This modified sequence is complementary to only four consecutive nts on Cy5-PNA, which is not long enough to form detectable PNA-DNA duplexes. To enable Cy5-PNA binding to specific locations, we inserted the TTAGGGTTA sequence as a binding site (BS) in one location (Fig. 1E and F) or two locations (BS1 and BS2 in Fig. 1C and D) on the overhang. In Fig. 1C and D, the two binding sites were separated by one to four GQs, while in Fig. 1E and F, the binding site was separated from the junction by one to five GQs. To illustrate, the overhang shown in Fig. 1C has the following sequence: TTAGGGTTA [(GGGT)4TA(GGGT)4TA(GGGT)4TA(GGGT)4] TAGGGTTA. The sequence within the square brackets [] forms four GQs, while the flanking sequences outside the brackets serve as the binding sites (BS1 and BS2 in Fig. 1C) for Cy5-PNA. Complete sequences for these constructs are given in SI Appendix, Table S1.
In the accessibility patterns of constructs with BS1 and BS2, we expect to have a high FRET peak that remains constant (representing binding to BS1) and a lower FRET peak (representing binding to BS2) that gradually shifts to lower levels as the overhang length increases. We also expect the lower FRET peak to broaden with length as the number of potential folding patterns increases, resulting in variability in detected FRET levels. The data in Fig. 1D agrees with these expectations. The high FRET peak at FRET efficiency (EFRET) ∼0.95 remains unchanged in all constructs, while the lower FRET peak gradually shifts to lower levels and broadens. Significantly, the FRET levels remain well within the detectable range, even when BS2 is separated from the donor molecule by an unfolded G-Tract (BS1) and four GQs.
Fig. 1F shows accessibility patterns for the constructs that have a single binding site separated from the junction by one to five GQs. To illustrate, the sequence of the overhang in Fig. 1E is TTA[(GGGT)4TA(GGGT)4TA(GGGT)4TA(GGGT)4] TAGGGTTA, where the 3′-terminal sequence following the brackets serves as the binding site (BS in Fig. 1E) for Cy5-PNA. As shown in Fig. 1F, even after five GQs, the binding events to BS remain well within the detectable FRET range. We conclude that the FRET-PAINT approach can be used to study the accessibility patterns of long telomeric sequences, which will be presented next.
Fig. 2 shows the normalized FRET distributions obtained from transient bindings of Cy5-PNA to unfolded G-Tracts in telomeric overhangs. The data are grouped in sets of four histograms, with the shortest construct in each set having an integer multiple of four G-Tracts, [4n], while the longest construct has an additional three G-Tracts, [4n+3]. These FRET distributions reveal surprising accessibility patterns that emerge when the telomeric overhang reaches a length of about 10 G-Tracts. The patterns have a periodicity of four G-Tracts and persist in the 10–28 G-Tract range. While the [4n]G-Tract constructs have the broadest distributions, the [4n+2]G-Tract constructs are typically the narrowest. These narrower distributions are concentrated at high FRET levels, which represent binding to sites closer to the ssDNA/dsDNA junction (5′-side). The broad distributions of [4n]G-Tract constructs suggest the binding sites, i.e., unfolded G-Tracts, are distributed throughout the overhang.
Fig. 2.
Normalized FRET histograms for telomeric overhangs with 4–28 G-Tracts. The histograms are grouped in sets of four with [4n]G-Tracts in brown, [4n+1]G-Tracts in blue, [4n+2]G-Tracts in violet, and [4n+3]G-Tracts in orange. The 28G-Tract construct is added to the last histogram in green. The FRET histograms, which represent accessibility patterns, are broadest in [4n]G-Tract constructs and narrowest in [4n+2]G-Tract constructs.
Fig. 3A shows a contour plot of the combined data for all 25 constructs that illustrates alternating broad (marked with white dashed lines) and narrow (marked with white dotted lines) distributions. To quantify the broadness of FRET distributions, we defined an S-parameter (Shannon entropy divided by the Boltzmann constant). The S-parameter describes the uncertainty associated with identifying the binding site of Cy5-PNA, which is greater for broader distributions. Therefore, the maxima of S-parameter occur at [4n]G-Tract constructs and the minima at [4n+2]G-Tract constructs (Fig. 3B).
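To make the S-parameter concrete, the following is a minimal Python sketch of a Shannon entropy computed from a FRET histogram. It assumes that the S-parameter takes the form S = −Σ p_i ln p_i over the normalized histogram bins with the natural logarithm; the function name `s_parameter` and the example bin counts are hypothetical and are not taken from the measured data.

```python
import math

def s_parameter(counts):
    """Shannon entropy (natural log) of a histogram given as raw bin counts.

    Assumption: S = -sum_i p_i ln p_i, where p_i is the fraction of binding
    events in FRET bin i. Empty bins contribute nothing to the sum.
    """
    total = sum(counts)
    if total <= 0:
        raise ValueError("histogram must contain at least one event")
    probs = [c / total for c in counts if c > 0]
    return -sum(p * math.log(p) for p in probs)

# Hypothetical histograms with the same number of events:
broad = [5, 5, 5, 5]     # events spread evenly over four FRET bins
narrow = [0, 0, 20, 0]   # all events concentrated in one FRET bin

print(s_parameter(broad))   # equals ln(4), the maximum for four bins
print(s_parameter(narrow))  # zero: no uncertainty about the binding site
```

Under this assumed form, a broad histogram yields a larger S than a narrow one with the same number of events, which is the sense in which the S-parameter quantifies the uncertainty in identifying the binding site. The value depends on the bin size, so comparisons across constructs are meaningful only when the same binning is used.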
Fig. 3.
Contour plot of accessibility patterns and variation of S-parameter with overhang length. (A) The histograms in Fig. 2 are combined in a contour plot that demonstrates broad histograms for [4n]G-Tract constructs (for 12G-Tract and longer) and narrower histograms for other constructs, especially [4n+2]G-Tract constructs. The overhangs with [4n]G-Tracts are marked with dashed white lines, while [4n+2]G-Tract constructs are marked with dotted white lines. The same bin-size was used in this contour plot as that used in the histograms in Fig. 2. (B) To quantify the broadness of the histograms in Fig. 2, an S-parameter (defined on top) is introduced. S-parameter is a measure of folding frustration: the larger the S-parameter the greater the folding frustration. At lengths longer than 12G-Tract, a pattern is established where the maxima of S-parameter occur at [4n]G-Tract constructs, while minima are at [4n+2]G-Tract constructs.
The observed accessibility patterns have implications about the underlying folding topography. We explored this connection through a computational model introduced by Carrino et al. (33). In this two-state GQ model with nearest neighbor interactions (analogous to the Ising model), combinatorial factors associated with the location of the folded GQs along the telomere and its finite length conspire to produce nontrivial accessibility patterns. This model predicts folding probability (Pi) of a G-Tract based on folding cooperativity (Kc) between neighboring GQs and the statistical weight of a folded GQ (Kf). Due to reasons described in the Materials and Methods section, the following parameters were adopted from the Carrino et al. study: Kf = 60 for the stability of a GQ in the interior, Kf,N = 229 for a GQ ending at the 3′-end (for an overhang that includes N G-Tracts), and Kf,N-1 = 80 for a GQ ending at the neighboring G-Tract (33). Our model predictions of folding patterns are robust as long as Kf,N-1 is greater than Kf, and Kf,N is significantly greater than Kf,N-1. The G-Tracts on the 5′ side (vicinity of ssDNA/dsDNA junction) are significantly less stable due to the flanking duplex regions, similar to a recent observation in KIT promoter (37). We estimated the folding parameters for these 5′ G-Tracts (Kf,1 and Kf,2) and the cooperativity parameter (Kc) by comparing the patterns observed for the experimentally measured and computationally calculated S-parameters, as described below.
The FRET signal measured in the experiment probes the location of unfolded G-Tracts. To compare the outcomes of the model with the experimental distribution measured as a function of telomere length, we calculate how delocalized the distribution of unfolded G-Tracts is for a given length via the S-parameter, as defined in the Materials and Methods section. We find that the model accommodates a variety of patterns of the calculated S-parameter as a function of telomere length depending on folding and cooperativity parameters used in the model. Fig. 4 shows a phase diagram as a function of the stability for a GQ starting with the 5′ G-Tract, Kf,1, and the cooperativity Kc. In this figure, we choose the weight for a GQ starting at the second G-Tract from the junction to be twice that of starting at the first G-Tract, i.e., Kf,2 = 2Kf,1. Similar patterns are observed as long as Kf,2 is greater than Kf,1 and significantly smaller than Kf (SI Appendix, Fig. S1). For simplicity, we focus on a window composed of N = 20 to N = 24 G-Tracts to characterize the pattern of the S-parameter and monitor whether the S-parameter increases or decreases as N increases within this range. With positive cooperativity, four patterns are observed for the S-parameters. The pattern observed in the experiment (colored green in Fig. 4) occurs within a specific (although rather broad) range for parameters Kf,1; Kf,2; and Kc. For Kc > 6, the stabilities of GQs starting near the ssDNA/dsDNA junction must be below an upper threshold to observe the appropriate pattern, and for smaller cooperativity, these stabilities must lie between a lower and upper bound. The S-parameter as a function of N, shown in Fig. 4, agrees qualitatively with the measured pattern. Because of this similarity to the experimentally measured S-parameter, we will use parameters (Kf,1 = 5, Kf,2 = 10, and Kc = 5) as an illustration in what follows.
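The structure of such a two-state, nearest-neighbor calculation can be sketched in Python as follows. This is an illustrative reconstruction rather than the implementation used in this study, and it rests on several assumptions that are not stated in the text: a GQ occupies four consecutive G-Tracts; the cooperativity factor Kc is applied only when two GQs are directly adjacent with no unfolded G-Tract between them; the junction weights (Kf,1, Kf,2) take precedence over the 3′-end weights (Kf,N, Kf,N-1) when both could apply in short overhangs; and the S-parameter is taken as the Shannon entropy of the normalized unfolding probabilities. All function names are hypothetical.

```python
import math

def gq_weight(start, n_tracts, kf1=5.0, kf2=10.0, kf=60.0, kf_n1=80.0, kf_n=229.0):
    """Statistical weight of a GQ spanning G-Tracts start..start+3 (1 = junction)."""
    end = start + 3
    if start == 1:            # assumed precedence: junction weights first
        return kf1
    if start == 2:
        return kf2
    if end == n_tracts:       # GQ ending at the 3'-end
        return kf_n
    if end == n_tracts - 1:   # GQ ending one G-Tract before the 3'-end
        return kf_n1
    return kf                 # interior GQ

def unfolded_probabilities(n_tracts, kc=5.0, **weights):
    """Return ([P(G-Tract i unfolded) for i = 1..N], partition function Z)."""
    n = n_tracts
    # fwd[i][0]: weight of tracts 1..i with tract i unfolded (or i == 0)
    # fwd[i][1]: weight of tracts 1..i with a GQ ending exactly at tract i
    fwd = [[0.0, 0.0] for _ in range(n + 1)]
    fwd[0][0] = 1.0
    for i in range(1, n + 1):
        fwd[i][0] = fwd[i - 1][0] + fwd[i - 1][1]
        if i >= 4:
            w = gq_weight(i - 3, n, **weights)
            fwd[i][1] = w * (fwd[i - 4][0] + kc * fwd[i - 4][1])
    z = fwd[n][0] + fwd[n][1]
    # bwd[i][a]: weight of tracts i+1..N given the state a of tract i
    bwd = [[1.0, 1.0] for _ in range(n + 1)]
    for i in range(n - 1, -1, -1):
        for a in (0, 1):
            total = bwd[i + 1][0]                      # tract i+1 unfolded
            if i + 4 <= n:                             # GQ on tracts i+1..i+4
                w = gq_weight(i + 1, n, **weights)
                total += w * (kc if a == 1 else 1.0) * bwd[i + 4][1]
            bwd[i][a] = total
    p_unfolded = [fwd[i][0] * bwd[i][0] / z for i in range(1, n + 1)]
    return p_unfolded, z

def s_parameter(p_unfolded):
    """Assumed form: entropy of the normalized distribution of unfolded sites."""
    total = sum(p_unfolded)
    return -sum((p / total) * math.log(p / total) for p in p_unfolded if p > 0)

for n_tracts in range(20, 25):
    p, _ = unfolded_probabilities(n_tracts)
    print(n_tracts, round(sum(p), 3), round(s_parameter(p), 3))
```

The loop prints, for N = 20–24, the average number of unfolded G-Tracts and the assumed S-parameter for the illustrative parameters Kf,1 = 5, Kf,2 = 10, and Kc = 5. Whether the printed values reproduce the reported patterns depends on how closely the assumptions above match the model of Carrino et al. (33).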
Fig. 4.
Computational patterns observed for the S-parameter for N = 20–24 G-Tracts for varying Kf,1 and Kc parameters. In these calculations, Kf,1 and Kc are varied, while Kf,2 = 2Kf,1, Kf = 60, Kf,N-1 = 80, and Kf,N = 229. The Left panel shows the four patterns observed depending on whether the S-parameter increases or decreases as the length of the overhang is increased by one G-Tract. The phase diagram in the Middle shows the parameter range where different patterns are observed in colors that match the patterns on the Left panel. The green range corresponds to the pattern observed in the experiment. The Right panel shows the calculated S-parameters for Kc = 5 and Kf,1 = 5, which can be compared to the experimentally observed pattern in Fig. 3B.
Fig. 5 shows that the parameters identified by comparison of the computational and experimental S-parameters correspond to highly stable GQs. This is consistent with the experiment, where most of the telomeric constructs do not show any binding events during the experimental observation time (∼2–3 min). Although this stability weakens as the number of G-Tracts increases, the average number of folded GQs (<nGQ>) is nearly maximal for overhangs with N = 4–28 G-Tracts (Fig. 5A). Nevertheless, constructs with 4n and 4n+1 G-Tracts have a higher average number of unfolded G-Tracts (<nuf>) than the minimum associated with fully folded GQs (Fig. 5B). Also, constructs with 4n and 4n+1 G-Tracts have large relative fluctuations, suggesting that the number of sites available to PNA binding can be significantly larger than indicated by <nuf> for these constructs.
Fig. 5.
Folding levels for the set of parameters identified in Fig. 4: Kf,1 = 5; Kf,2 = 10; Kf = 60; Kf,N-1 = 80; Kf,N = 229; and Kc = 5. These parameters result in high levels of folding for all constructs investigated, although periodic patterns were observed for probability of being unfolded depending on the position of the G-Tract and length of overhang. (A) For the set of parameters used in these studies, the average number of GQs (<nGQ>) is close to the maximum possible throughout 4–28 G-Tract overhang length. (B) Average number of unfolded G-Tracts (<nuf>) as a function of overhang length. As expected, constructs with 4n+1, 4n+2, and 4n+3 G-Tracts have at least one, two, and three unfolded G-Tracts, respectively. As can be most clearly seen for 4n constructs, <nuf> increases as the overhang length increases. The orange vertical lines are SD in <nuf>. (C) Probability of ith G-Tract being unfolded as a function of its position for overhangs with N = 20–24 G-Tracts. In this diagram, i = 1 for the G-Tract at ssDNA/dsDNA junction and i = N for the G-Tract at 3′-end. (D) The plots that are overlaid in (C) are separated for each N, and the y axis scale is limited to unfolding probability in 0.0–0.2 range (0.0 < Pi < 0.2) to better illustrate the oscillations in the unfolding probability. The amplitude of oscillations in Pi varies depending on overhang length, which can be related to the patterns observed for S-parameter.
Patterns in the S-parameter can be understood as balancing the population of unfolded G-Tracts within the destabilized junction region and the rest of the telomeric G-Tracts given the underlying frustration of the landscape and the constraint that all states have a minimum number of unfolded G-Tracts when the length of the construct differs from 4n. Because GQs involving the first two G-Tracts near the junction (at the 5′ side) are destabilized compared to others, the unfolded distribution is shifted toward the first two repeats. As shown in Fig. 5C and D, the G-Tracts are mostly folded for N = 4n, with a weak pattern in unfolding propensity along the chain. For N = 4n+1, which has at least one unfolded G-Tract, the first G-Tract is much more likely unfolded, and the pattern for the unfolded G-Tract along the chain is more prominent. This localization of the unfolded G-Tracts is reflected in a decrease in the S-parameter. For N = 4n+2, the first two G-Tracts are unfolded with high probability, and the remaining unfolding propensity decreases along the chain. Thus, the distribution of Pi is more localized and results in a local minimum of the S-parameter. For N = 4n+3, unfolding of the first three G-Tracts is more likely, although the propensity of unfolding for other G-Tracts increases, giving rise to a more delocalized distribution along the chain and hence an increase in the S-parameter. A similar analysis of the patterns of the S-parameters shows that the pattern observed in the experiment does not develop when folding cooperativity is negative (SI Appendix, Fig. S2). Also, looking at a broader window, N = 12–24, we see that the pattern can shift from one type to another as a function of N near the boundaries of phases of the S-parameter (SI Appendix, Fig. S3). Furthermore, these kinds of patterns occur not only in the S-parameter, which explicitly depends on the distribution of unfolded G-Tracts, but also in the entropy as a function of N (SI Appendix, Fig. S4).
Discussion
This study differs in significant aspects from most studies in literature and provides important findings that were not accessible previously. First, we employ DNA constructs that contain a duplex DNA on one side (similar to the physiological telomeric DNA) rather than two free ends, as done in most studies. Second, we directly probe accessibility of the telomeric overhang to a complementary PNA probe and interpret the results in terms of their implications for relative stability of different segments of the overhang and interactions between neighboring GQs. Most studies in literature follow the opposite route, where telomeric DNA is studied in isolation and stabilities of different segments are determined. These stabilities are then interpreted in terms of their implications for the accessibility of the overhang. The accessibility maps shown in Fig. 3 demonstrate this kind of a direct approach.
The PNA probe we employ in our studies creates a small disturbance to the underlying folding topography. This approach has physiological relevance, as is evident from the cases of telomerase or TERRA, which contain nucleic acid templates that interact with telomeres and impact the underlying GQ structures. Our Cy5-PNA probe base pairs with 7-nt telomeric overhang (TTAGGGT), which is smaller than the 9-nt-long minimum unfolded site (TTAGGGTTA). To determine the thermodynamic stability of the PNA/DNA duplex that would form upon binding of Cy5-PNA to a telomeric sequence, we performed FRET melting measurements under identical ionic conditions to those of smFRET measurements and obtained a melting temperature of Tm = 12.7 ± 0.3 °C (SI Appendix, Fig. S5). This is consistent with the estimate of a model (Tm = 13.7 °C) that predicts Tm of a PNA/DNA duplex based on the melting point of a DNA/DNA duplex of the same sequence (38). Compared to the stability of a telomeric GQ (Tm = 68 °C) (39), this is a very weak disturbance. This is also evident in the low frequency of the binding events we observe and the fact that the majority of DNA molecules that pass the single-molecule screening test do not show any binding event during the experimental observation time (2–3 min) (SI Appendix, Table S4). To provide a quantitative comparison, the frequency of binding events to a single G-Tract (an exposed G-Tract in the absence of GQ) is over an order of magnitude greater than the frequency of binding events for the 4G-Tract construct (SI Appendix, Fig. S6). One might expect that there should not be any binding events for a 4G-Tract construct since all G-Tracts would be protected within a GQ. However, this assumes all the DNA molecules are perfectly folded and remain so throughout the observation time. 
Another piece of evidence that the Cy5-PNA probe does not destabilize the GQs in KCl comes from the observation that the frequency of binding events is much higher in LiCl than in KCl, as we demonstrated in an earlier study (34).
The folding patterns of Fig. 2 could be interpreted in terms of folding frustration where higher frustration would be expected to result in broader distributions. With this identification in mind, our data suggest [4n]G-Tract constructs have significantly higher frustration compared to other constructs. Adding one to three G-Tracts resolves this frustration to different extents, resulting in a concentration of unfolded sites in the vicinity of the ssDNA/dsDNA junction on the 5′-side. The [4n]G-Tract constructs have an exact number of G-Tracts that can accommodate n GQs. Therefore, attaining complete folding requires a perfect progression of the folding throughout the overhang. For these constructs, nucleation of GQ folding from multiple sites will likely result in leaving unfolded G-Tracts between neighboring GQs. Also, skipping of even one G-Tract during progression of the folding (e.g., two neighboring GQs separated by an unfolded G-Tract) would result in three more unfolded G-Tracts at different regions of the overhang. We assume GQs with long loops that incorporate one or more G-Tracts (9-nt or longer loops instead of the canonical 3-nt loops) are highly unlikely due to entropic considerations (33). On the other hand, constructs with additional G-Tracts ([4n+1], [4n+2], and [4n+3]) are more tolerant for shifts in the G-Tract register during folding or initiation of folding from different sites.
Our data suggest the G-Tracts at the 3′-end or intermediate regions have higher folding stability compared to those at the 5′-side, which might be due to proximity of these sites to a duplex DNA. Therefore, nucleation of folding from the 3′-end or intermediate regions would more likely result in unfolded G-Tracts at the 5′-side. Once a particular folding conformation is established, it persists for long periods of time due to the high stability of associated GQ structures, kinetically hindering resolution of the frustration in folding (33). These observations have physiological significance as the free 3′-ends of telomeres are more prone to degradation by exonuclease activity, and folding into GQ helps in protecting these ends (40). In addition, both telomerase and alternative lengthening of telomere mechanisms require unfolded 3′-ends for the extension of telomeres (41), and presence of GQs could potentially render telomeres inaccessible. In addition, the ssDNA/dsDNA junction region, which is more likely to be unfolded, is a potential site where POT1-TPP1 might connect to other shelterin proteins, which are localized on duplex DNA (42). Therefore, having unfolded G-Tracts at this region would facilitate binding of POT1 and establishing this connection, which would also reduce accessibility of these regions to other DNA-processing enzymes.
The conclusions of this study rely on the validity of the FRET-PAINT approach to detect binding events throughout the long telomeric overhangs. For unstructured ssDNA, it would not have been possible to detect binding events 100 nt away from a reference site (location of donor fluorophore) using Cy3/Cy5 as FRET pairs. However, the telomeric overhangs are highly folded and stable (most DNA molecules do not show any binding events during the 2–3 min observation time) under our assay conditions. Therefore, the overhangs are much more compact than unstructured DNA. These considerations were supported with the experiments on modified constructs in Fig. 1C–F. However, despite their low levels of abundance, the unfolded G-Tracts are physiologically very significant as they might serve as the nucleation sites for triggering DNA damage response, nuclease activity, or telomerase-mediated telomere extension.
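The distance argument above can be illustrated with the standard Förster relation, E = 1 / (1 + (r/R0)^6). The Python sketch below assumes a Förster radius of R0 ≈ 5.4 nm for the Cy3/Cy5 pair, a commonly quoted approximate value that was not measured in this study, and the donor-acceptor distances are hypothetical examples rather than distances inferred from the data.

```python
def fret_efficiency(r_nm, r0_nm=5.4):
    """Foerster relation E = 1 / (1 + (r / R0)^6).

    r_nm: donor-acceptor distance in nm.
    r0_nm: Foerster radius in nm (assumed ~5.4 nm for Cy3/Cy5).
    """
    return 1.0 / (1.0 + (r_nm / r0_nm) ** 6)

# Hypothetical donor-acceptor separations (nm)
for r in (2.0, 4.0, 5.4, 8.0, 10.8):
    print(f"r = {r:5.1f} nm -> E = {fret_efficiency(r):.3f}")
```

Because the efficiency falls off with the sixth power of distance, it drops to 1/65 (about 0.015) at twice the Förster radius. This is why binding events far from the donor remain detectable only if the intervening overhang is compacted by folding.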
Along these lines, it is also important to ensure that the compact form and 3D structure of the telomeric overhang do not eliminate the correlation between the observed FRET levels and telomere length. Potential interactions between neighboring GQs and the flexible segments (e.g., the TTA sequences that link consecutive GQs) in the telomeric overhang could impact this relationship. However, our data show that the correlation between the observed FRET efficiencies and telomere length is maintained. This is particularly evident in the histograms for [4n]G-Tract constructs where the binding sites are distributed throughout the overhang and hence provide more detailed information about the 3D structure compared to constructs where binding sites are concentrated in a particular region. In the [4n]G-Tract constructs, the FRET distributions systematically shift to lower FRET levels as the overhang length is increased, suggesting the correlation between FRET efficiency and location of binding site (with respect to the junction where the donor fluorophore is located) is not eliminated. We quantified this by studying the fraction of FRET population for EFRET < 0.50 and demonstrate that this population increases as telomere length is increased (SI Appendix, Fig. S7) (34). We note that, depending on the folding pattern, binding to different sites might result in similar FRET efficiencies or binding to a particular site could result in different FRET values in our FRET-PAINT measurements. Therefore, we based our conclusions on the patterns and periodicities observed in the FRET distributions (Fig. 3) rather than exact identification of binding sites. Also, we limited our statements to distinguishing three regions in the overhang: the junction region, the broad intermediate region, and the vicinity of the 3′-end.
Whether accessibility of a particular open site is impacted by its neighborhood is another relevant issue. To illustrate, an open site (a G-Tract that is not part of a GQ) could have folded GQs on both sides of it, a folded GQ on one side and another open site on the other, or open sites on both sides of it. The possible arrangements multiply if neighbors beyond the nearest ones are considered. Since we do not have any control over which of these arrangements prevails for a given open site, we did not quantify the impact of the neighborhood structures on accessibility. However, since over 100 binding events are used to create each of the histograms in Fig. 2, these effects are expected to average out. Attaining more precise information would require performing systematic measurements on many constructs where certain configurations are biased by introducing mutations in the sequence, which will be the focus of future studies.
While the energy landscape of DNA GQ folding has been studied less extensively than that of proteins, it appears that the folding landscape of human telomeric GQ is not funneled (43) and has a high degree of frustration, as evidenced by stable folding intermediates (44), alternative conformers (45), and sensitivity of the GQ topology to perturbations such as small changes in loop length (46). In this article, frustration does not refer to the folding of a single GQ (which we take as two-state in the model) but rather to the folding patterns along the telomere, in the sense introduced by Mittermaier et al. to describe the diversity of folding patterns in states with similar energy (33). This entropic and topological frustration has thermodynamic and kinetic consequences for accessibility of telomeres. In the present article, we find that this simple model supports a surprising diversity of thermodynamic accessibility patterns, which are sensitive to GQ interaction strength, stability, and telomere length. More broadly, frustration in biomolecules (47) is perhaps more familiar as a component of the funneled energy landscape of proteins (48). While natural protein sequences are minimally frustrated, landscape ruggedness and frustration are evident in many proteins (49), with locally frustrated regions often associated with allosteric conformational change (50) and protein-ligand binding sites (49). The frustration associated with folding of repeat proteins with coupled neighboring repeat units is perhaps most closely related to the frustration of GQ folding in telomeres. One-dimensional Ising models similar to the GQ model employed in this article have been used to understand the folding thermodynamics and kinetics of repeat proteins (51, 52).
In repeat proteins, the balance between the free energy of folding each element and the interactions between them is subtle, leading to a rich variety of behaviors. Consequently, proteins poised at a particular balance may be susceptible to local perturbations that cause large structural effects (51). The sensitivity of the accessibility patterns to the number of repeats (modulo 4) may be an analogous illustration of frustration in the telomeric landscape.
To get a feel for the stabilities suggested by this model, at T = 25 °C, the stability of an interior GQ relative to an unfolded G-Tract is ΔG = −2.43 kcal/mol, and that of a GQ involving the first G-Tract at the ssDNA/dsDNA junction is ΔG = −0.95 kcal/mol. Two adjacent GQs have a combined free energy of ΔG = −5.81 kcal/mol, of which −0.95 kcal/mol is associated with cooperative stabilizing interactions between the GQs. Destabilization of GQs at the ssDNA/dsDNA junction and positive cooperativity are consistent with the high FRET peak observed in many constructs. The free energies for GQ stability calculated from our model are consistent with those reported in literature (53). Several studies have investigated the interactions between neighboring GQs and possible higher-order structures that might form in telomeric overhangs using different constructs and experimental methods. However, these studies have not reached a consensus on whether the interaction between neighboring GQs is stabilizing or destabilizing, with conclusions varying from negligible stacking interactions (beads on a string-type arrangement) (29) and destabilizing interactions (negative cooperativity) (33, 54) to higher-order structures, with multiple GQs condensing into compact structures (30, 34, 55, 56), which would suggest positive cooperativity. Our results are more consistent with these latter studies that propose positive cooperativity.
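The free energies quoted above can be related to the statistical weights of the model with a short calculation. The sketch below assumes the standard relation ΔG = −RT ln K, which the text does not state explicitly; the interior weight of 60 is the value given in the Computational Model section, while the junction weight is back-calculated here from ΔG = −0.95 kcal/mol and is therefore an inference rather than a reported parameter. All function and variable names are illustrative.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)
T_KELVIN = 298.15     # 25 degrees C, the temperature quoted in the text


def delta_g(weight, temperature=T_KELVIN):
    """Free energy (kcal/mol) of a state with the given statistical weight,
    relative to a reference state of weight 1. Assumes dG = -RT ln K."""
    return -R_KCAL * temperature * math.log(weight)


def weight_from_delta_g(dg, temperature=T_KELVIN):
    """Inverse relation: statistical weight implied by a free energy."""
    return math.exp(-dg / (R_KCAL * temperature))


# Interior GQ: weight 60 is taken from the Computational Model section.
dg_interior = delta_g(60.0)

# Junction GQ and cooperativity: weights inferred from the quoted -0.95 kcal/mol.
k_junction = weight_from_delta_g(-0.95)

# Two adjacent interior GQs: two folding terms plus one cooperative term.
dg_pair = 2.0 * dg_interior + (-0.95)

print(f"interior GQ: {dg_interior:.2f} kcal/mol")
print(f"implied junction weight: {k_junction:.1f}")
print(f"two adjacent GQs: {dg_pair:.2f} kcal/mol")
```

Under these assumptions the interior value reproduces the quoted −2.43 kcal/mol, and the pair value agrees with the quoted −5.81 kcal/mol to within rounding of the cooperative term.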
In this study, we demonstrate a direct probe of the accessible sites within telomeric overhangs of physiologically relevant lengths. This was made possible by the sensitivity of the single-molecule methods employed in this study to minority populations, as these sites are only a small fraction of all potential sites on the telomeric overhangs. Our results demonstrate repeating accessibility and folding frustration patterns in these constructs. Overhangs with [4n]G-Tracts demonstrate elevated levels of frustration, where the PNA probe is able to access sites throughout the overhang. On the other hand, overhangs with two additional G-Tracts, [4n+2]G-Tracts, show minimal frustration, where most accessible sites are concentrated in the vicinity of the ssDNA/dsDNA junction. Our computational studies also capture the requirement for lower stability at the ssDNA/dsDNA junction to attain patterns consistent with those observed in the experiment. These accessibility patterns would help protect the free 3′-end against exonuclease activity and inhibit telomerase-mediated elongation, while facilitating the binding of POT1/TPP1 in the vicinity of the ssDNA/dsDNA junction and its connection with other shelterin proteins.
Materials and Methods
Nucleic Acid Constructs.
DNA strands were purchased either from Eurofins Genomics or Integrated DNA Technologies (IDT), and their sequences are given in SI Appendix, Tables S1–S3. The oligonucleotides were purified in-house using denaturing polyacrylamide gel electrophoresis. The corresponding gel images are shown in SI Appendix, Fig. S8. High-performance liquid chromatography (HPLC)-purified Cy5-PNA strand was purchased from PNA-Bio Inc. The sequence of the Cy5-PNA probe was TAACCCTT-Cy5; seven of its eight nt were complementary to a 7-nt stretch of the telomeric sequence. The pdDNA constructs (Fig. 1A) were formed by annealing a long strand that contained the telomeric overhang (4–28 G-Tracts) with a short strand (18, 24, or 30 nt) that had a biotin at the 3′-end and Cy3 at the 5′-end. The two strands were annealed in a thermal cycler in 150 mM KCl and 10 mM MgCl2 at 95 °C for 3 min, followed by slow cooling to 30 °C (1 °C decrease every 3 min). This slow cooling allowed the folding pattern to reach a thermodynamic steady state. It is important to note that 10 mM MgCl2 was used only during the annealing process (to improve hybridization), while all measurements were performed at 2 mM MgCl2. During annealing, excess long strand (500 nM for 100 nM of short strand) was used to ensure that all biotinylated and Cy3-labeled short strands had a matching long strand. The unpaired long strands were washed out of the channel after the DNA was immobilized on the surface via a biotin-streptavidin linker.
To minimize the impact of the acceptor fluorophore on the telomeric overhang, the Cy5 was attached to the PNA via a flexible linker and an additional thymine that did not hybridize with telomeric DNA. If Cy5-PNA binds in the immediate vicinity of a GQ, there will be 3 nt (TT from the DNA and T from the PNA) and a 6-carbon flexible linker that are not part of the GQ or the PNA/DNA duplex. Similarly, the Cy3 fluorophore at the junction (placed on the short strand) was separated from the nearest GQ by at least 3 nt (TTA in the junction) and a 6-carbon flexible linker. Earlier studies have shown that Cy3/Cy5 fluorophores placed at the terminal nt tend to stack on the dsDNA (57). With a 3-nt separation, Cy3 is not expected to have a significant impact on the nearest GQ. In most studies that employ fluorescent dyes (including bulk FRET melting assays), the fluorophores are separated from the GQ by 0–2 nt overhangs and a flexible linker. Even though these separations do not guarantee zero interaction between the fluorophore and the GQ (58), the impact should be small and should be considered as an overall impact of the probe on the telomeric overhang.
The smFRET Assay and Imaging Setup.
A home-built prism-type total internal reflection fluorescence microscope was used for these measurements following protocols described in earlier work (59). The slides and coverslips were initially cleaned with 1 M potassium hydroxide and acetone, followed by piranha etching, surface functionalization by amino silane, and surface passivation by polyethylene glycol (PEG). To reduce nonspecific binding, the surfaces were treated with a mixture of m-PEG-5kDa and biotin-PEG-5kDa in the ratio 40:1, followed by another round of passivation with 333 Da PEG to increase the density of the PEG brush. The microfluidic chamber was created by sandwiching a PEGylated slide and a coverslip with double-sided tape, followed by sealing the chamber with epoxy. The chamber was treated with 2% (vol./vol.) Tween-20 to further reduce nonspecific binding.
After washing the excess detergent from the chamber, 0.01 mg/mL streptavidin was incubated in the chamber for 2 min. The pdDNA samples were diluted to 10 pM in a buffer containing 150 mM KCl and 2 mM MgCl2 and incubated in the chamber for 2–5 min, resulting in surface density of ∼300 molecules per imaging area (∼50 µm × 100 µm). The excess DNA was removed from the chamber with a buffer containing 150 mM KCl and 2 mM MgCl2. Unless otherwise specified, the imaging buffer contained 50 mM Tris⋅HCl (pH 7.5), 2 mM Trolox, 0.8 mg/mL glucose, 0.1 mg/mL glucose oxidase, 0.1 mg/mL bovine serum albumin, 2 mM MgCl2, 150 mM KCl, and 40 nM Cy5-PNA. The Cy5-PNA strand was heated to 85 °C for 10 min prior to adding it to the imaging buffer. We recorded 1,500- to 2,000-frame-long movies with 100 ms frame integration time using an Andor Ixon EMCCD camera. A green laser beam (532 nm) was used to excite the donor fluorophore, and the fluorescence signal was collected by an Olympus water objective (60x, 1.20 NA).
Data Analysis.
Custom-written C++ software was used to record and analyze the movies and to generate single-molecule time traces of donor and acceptor intensities. The time traces of each molecule were further analyzed by a custom MATLAB code to select the molecules that passed a single-molecule screening test. The background was subtracted for each of these molecules based on the remnant donor and acceptor intensities after the donor photobleached. The EFRET was calculated using EFRET = Acceptor Intensity/(Acceptor Intensity + Donor Intensity). The molecules that did not show any binding event formed a donor-only (DO) peak at EFRET = 0.06, which was used as a reference to rescale the FRET distribution such that the DO peak corresponded to EFRET = 0.00. After this rescaling, the DO peak was subtracted from the histograms. Therefore, the FRET histograms presented in Fig. 2 were generated from single molecules that showed at least one Cy5-PNA binding event. The numbers of molecules that contributed to each histogram in Fig. 2 are presented in SI Appendix, Table S4, but typically, each histogram was constructed from binding events to 100–150 DNA molecules, resulting in several hundred binding events in each histogram. The constructs that do not have any unfolded G-Tracts or those whose unfolded sites are not accessible to the probe do not contribute to the histograms in Fig. 2.
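The per-frame arithmetic described above can be sketched in a few lines. This is not the authors' C++ or MATLAB code; it is an illustrative Python version with hypothetical intensity values. In particular, the text does not specify the functional form of the rescaling, so the linear map used here (sending the donor-only peak to 0 while leaving 1 unchanged) is an assumption.

```python
DONOR_ONLY_PEAK = 0.06  # donor-only peak position reported in the text


def subtract_background(trace, bleach_frame):
    """Subtract the mean intensity that remains after donor photobleaching.

    `bleach_frame` is the index of the first frame after the donor has
    photobleached; only frames before it are returned."""
    tail = trace[bleach_frame:]
    background = sum(tail) / len(tail)
    return [value - background for value in trace[:bleach_frame]]


def apparent_fret(donor, acceptor):
    """E_FRET = acceptor / (acceptor + donor), as defined in the text."""
    return acceptor / (acceptor + donor)


def rescale_fret(e_raw, donor_only_peak=DONOR_ONLY_PEAK):
    """ASSUMPTION: linear rescaling that maps the donor-only peak to 0.00
    and leaves E = 1 unchanged. The text does not give the exact form."""
    return (e_raw - donor_only_peak) / (1.0 - donor_only_peak)


# Hypothetical traces (arbitrary units); the donor photobleaches at frame 4.
donor_trace = [1040.0, 1030.0, 600.0, 610.0, 100.0, 100.0]
acceptor_trace = [110.0, 100.0, 540.0, 530.0, 50.0, 50.0]
bleach_frame = 4

donor = subtract_background(donor_trace, bleach_frame)
acceptor = subtract_background(acceptor_trace, bleach_frame)
e_values = [rescale_fret(apparent_fret(d, a)) for d, a in zip(donor, acceptor)]
print([round(e, 3) for e in e_values])
```

In this toy trace the first two frames sit near the donor-only level and map close to zero after rescaling, while the later frames represent a bound probe at intermediate FRET.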
The FRET distributions were normalized such that each molecule contributed equally to the histogram, and the total population across the entire FRET range was normalized to 100%, i.e., the population of a particular FRET bin represented its percent population. The probability of a particular FRET level (a FRET bin of width 0.02) was obtained by dividing population at that FRET level by 100. To quantify the spread in the FRET histograms, we defined an S-parameter: $S = -\sum_i p_i \ln p_i$, where the summation was carried over the entire FRET range and $p_i$ was the unfolded probability, normalized so that $\sum_i p_i = 1$. These calculations were performed in Origin 2015.
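The normalization and the S-parameter can be illustrated with a minimal sketch. The original calculations were performed in Origin 2015; the Python below is only an illustration with hypothetical data. It assumes that the S-parameter has the Shannon-entropy form S = −Σ p_i ln p_i, that bins cover the range 0.00 to 1.00, and that values outside this range are clipped into the edge bins.

```python
import math

BIN_WIDTH = 0.02  # FRET bin width stated in the text
N_BINS = 50       # ASSUMPTION: bins span 0.00 <= E < 1.00


def molecule_histogram(e_values):
    """Histogram of one molecule's binding events, normalized to sum to 1.

    Each molecule must have at least one event, as in the text."""
    counts = [0.0] * N_BINS
    for e in e_values:
        index = min(max(int(e / BIN_WIDTH), 0), N_BINS - 1)  # clip to edge bins
        counts[index] += 1.0
    total = sum(counts)
    return [c / total for c in counts]


def percent_population(molecules):
    """Pool molecules with equal weight; the result sums to 100 (percent)."""
    pooled = [0.0] * N_BINS
    for e_values in molecules:
        for i, fraction in enumerate(molecule_histogram(e_values)):
            pooled[i] += fraction
    return [100.0 * p / len(molecules) for p in pooled]


def s_parameter(probabilities):
    """Assumed Shannon-entropy form; empty bins contribute nothing."""
    return -sum(p * math.log(p) for p in probabilities if p > 0.0)


# Hypothetical data: one molecule with a single event, one with three events.
molecules = [[0.81], [0.31, 0.31, 0.31]]
percent = percent_population(molecules)
probabilities = [p / 100.0 for p in percent]
spread = s_parameter(probabilities)
print(round(spread, 3))
```

Because each molecule is normalized before pooling, the molecule with three events carries the same total weight as the molecule with one. A distribution concentrated in a single bin gives S = 0, and broader distributions give larger values.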
Computational Model.
We considered a simple model introduced by Mittermaier et al. to analyze the folding patterns in a telomeric sequence consisting of $N$ G-Tracts, each of which can either be unfolded or folded with three neighboring G-Tracts in a GQ (33). The G-Tracts were counted from the junction ($i = 1$) toward the 3′-end ($i = N$). Mittermaier et al. used ssDNA constructs with two free ends (33), while we used pdDNA constructs in which the telomeric overhang had a free 3′-end but bordered a double-stranded DNA on the 5′-side, similar to the physiological telomeres. Due to the similarity of our pdDNA constructs to the ssDNA constructs of the Mittermaier et al. study in the middle and 3′-ends, we kept the stabilities of a GQ in the interior ($K_f = 60$), a GQ ending at the 3′-end ($K_{f,N} = 229$, for an overhang that included $N$ G-Tracts), and a GQ ending at the neighboring G-Tract ($K_{f,N-1} = 80$) similar to those reported in the Mittermaier et al. study (33). On the 5′ side, the stability of the G-Tracts in our pdDNA constructs was significantly lower than that in the ssDNA constructs of the Mittermaier et al. study (33), as we had a flanking duplex region rather than a free end.
The statistical weight for a state α with $n_\alpha$ folded GQs and $m_\alpha$ interfaces between nearest neighbor GQs is given by $W_\alpha = K_f^{n_\alpha} K_c^{m_\alpha}$, where $K_f$ is the weight associated with a folded GQ and $K_c$ models interactions between adjacent GQs. Folding cooperativity between neighboring GQs can be positive ($K_c > 1$), negative ($K_c < 1$), or neutral ($K_c = 1$). In the study by Mittermaier et al., the DNA constructs were in the form of single-stranded DNA with symmetrical 3′- and 5′-ends. This study demonstrated higher folding stability for GQs beginning or ending on the two terminal G-Tracts on each end: initiating on the first and second G-Tract at the 5′-end or ending on the (N-1)th and Nth G-Tract at the 3′-end. The folding weights for these terminal GQs were greater than $K_f$ for internal G-Tracts. Since the free 3′-end and interior regions of our pdDNA constructs were identical to those of the Mittermaier et al. study, we used similar statistical weights for $K_f$, $K_{f,N-1}$, and $K_{f,N}$ as those estimated in that study. However, the vicinity of the dsDNA/ssDNA junction was significantly different from the rest in our constructs, which required using much lower folding parameters. These considerations were implemented as reduced folding weights for GQs at the junction in our study.
The probability for a state α is given by $P_\alpha = W_\alpha / Z$, where $W_\alpha$ is the statistical weight of the state and $Z = \sum_\alpha W_\alpha$ is the partition function. The total number of states can be written as $\Omega = \sum_{n=0}^{\lfloor N/4 \rfloor} \binom{N-3n}{n}$, where the combinatorial factor $\binom{N-3n}{n}$ is the number of states with $n$ folded GQs and $N-4n$ unfolded G-Tracts. For telomeric overhangs of interest, $\Omega$ is relatively small. For example, a telomere of 32 G-Tracts has only 16,493 possible states, assuming long loops that contain one or more G-Tracts (such loops would be 9-, 15-, or 21-nt long compared to the canonical 3-nt loops) are not allowed due to their lower stability (33). This allows the partition sum to be computed explicitly by generating all states and evaluating the statistical weight of each state in the sum. The probability that the ith G-Tract is unfolded is given by $p_u(i) = \sum_\alpha \delta_i^{\alpha} P_\alpha$, where $\delta_i^{\alpha} = 1$ if the ith G-Tract is unfolded and $\delta_i^{\alpha} = 0$ if the G-Tract is in a folded GQ. Overall, average properties of the telomere can be computed, such as the average number of folded GQs, $\langle n \rangle$; the average number of unfolded G-Tracts, $\langle n_u \rangle = N - 4\langle n \rangle$; and the fluctuations of the unfolded G-Tracts, $\langle n_u^2 \rangle - \langle n_u \rangle^2$.
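The explicit enumeration described above can be sketched as follows. This is not the authors' code. The interior and 3′-end weights (60, 229, 80) are those stated in the text, whereas the junction weight and the cooperativity weight (both set to 5) are assumptions back-calculated from the quoted ΔG = −0.95 kcal/mol at 25 °C. How a GQ that both starts at the junction and ends at the 3′-end should be weighted is not specified in the text; here the junction weight takes precedence, and a GQ starting on the second G-Tract is treated as interior. All names are illustrative.

```python
from math import comb


def enumerate_states(n_tracts):
    """Yield each folding state as a tuple of 1-based GQ start positions.

    Every GQ spans four consecutive G-Tracts (no long loops)."""
    def extend(position, starts):
        if position > n_tracts:
            yield tuple(starts)
            return
        # Option 1: the G-Tract at `position` stays unfolded.
        yield from extend(position + 1, starts)
        # Option 2: a GQ starts here and uses tracts position..position+3.
        if position + 3 <= n_tracts:
            yield from extend(position + 4, starts + [position])
    yield from extend(1, [])


def gq_weight(start, n_tracts, k_interior=60.0, k_end=229.0,
              k_next_to_end=80.0, k_junction=5.0):
    """Position-dependent folding weight; k_junction is an ASSUMED value."""
    end = start + 3
    if start == 1:              # GQ at the ssDNA/dsDNA junction
        return k_junction
    if end == n_tracts:         # GQ ending on the last G-Tract
        return k_end
    if end == n_tracts - 1:     # GQ ending on the next-to-last G-Tract
        return k_next_to_end
    return k_interior


def unfolded_probabilities(n_tracts, k_coop=5.0):
    """Probability that each G-Tract is unfolded (k_coop is ASSUMED)."""
    partition = 0.0
    unfolded_weight = [0.0] * n_tracts
    for starts in enumerate_states(n_tracts):
        weight = 1.0
        for s in starts:
            weight *= gq_weight(s, n_tracts)
        # Two GQs share an interface when no unfolded tract separates them.
        interfaces = sum(1 for a, b in zip(starts, starts[1:]) if b == a + 4)
        weight *= k_coop ** interfaces
        partition += weight
        folded = {t for s in starts for t in range(s, s + 4)}
        for i in range(1, n_tracts + 1):
            if i not in folded:
                unfolded_weight[i - 1] += weight
    return [w / partition for w in unfolded_weight]


n_states = sum(1 for _ in enumerate_states(32))
closed_form = sum(comb(32 - 3 * n, n) for n in range(32 // 4 + 1))
print(n_states, closed_form)

p_unfolded = unfolded_probabilities(12)
print([round(p, 4) for p in p_unfolded])
```

The state count from direct enumeration and the closed-form sum of binomial coefficients both give the 16,493 states quoted for 32 G-Tracts, which serves as a check on the enumeration. The resulting unfolded probabilities depend on the assumed junction and cooperativity weights and should be read as illustrative only.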
For short chains, this model predicts a nonuniform unfolding probability along the sequence of G-Tracts under conditions strongly favoring folding (33). Here, combinatorial factors associated with the different ways in which specific G-Tracts can participate in GQ formation, as well as competing states with similar free energy, conspire to produce patterns in $p_u(i)$ as a function of G-Tract index $i$; that is, some G-Tracts are more likely to be unfolded than others depending on their position. The experimental FRET distribution reflects binding of PNA to unfolded G-Tracts modeled by the distribution $p_u(i)$, which depends on the location of the G-Tract along the chain and the total number of repeats. The degree of localization of the unfolded G-Tracts is characterized by computing the S-parameter $S = -\sum_i p_i \ln p_i$, where $p_i$ is the unfolded probability normalized to $\sum_i p_i = 1$.
Acknowledgments
This work was supported by the National Institutes of Health (1R15GM123443 to H.B.). We thank Dr. Soumitra Basu for use of his laboratory to perform the polyacrylamide gel electrophoresis experiments.
Footnotes
The authors declare no competing interest.
This article is a PNAS Direct Submission.
This article contains supporting information online at https://www.pnas.org/lookup/suppl/doi:10.1073/pnas.2202317119/-/DCSupplemental.
Data Availability
Some study data available (The manuscript has the analyzed data but not the raw data. Raw data and analysis code will be made available).
References
- 1. Blackburn E. H., Structure and function of telomeres. Nature 350, 569–573 (1991).
- 2. Blackburn E. H., Greider C. W., Szostak J. W., Telomeres and telomerase: The path from maize, Tetrahymena and yeast to human cancer and aging. Nat. Med. 12, 1133–1138 (2006).
- 3. Lim C. J., Cech T. R., Shaping human telomeres: From shelterin and CST complexes to telomeric chromatin organization. Nat. Rev. Mol. Cell Biol. 22, 283–298 (2021).
- 4. Meyne J., Ratliff R. L., Moyzis R. K., Conservation of the human telomere sequence (TTAGGG)n among vertebrates. Proc. Natl. Acad. Sci. U.S.A. 86, 7049–7053 (1989).
- 5. Cech T. R., Beginning to understand the end of the chromosome. Mod. Biopharm. Des. Dev. Optim. 1, 36–48 (2008).
- 6. Shay J. W., Role of telomeres and telomerase in aging and cancer. Cancer Discov. 6, 584–593 (2016).
- 7. Harley C. B., Futcher A. B., Greider C. W., Telomeres shorten during ageing of human fibroblasts. Nature 345, 458–460 (1990).
- 8. Stewart S. A., et al., Erosion of the telomeric single-strand overhang at replicative senescence. Nat. Genet. 33, 492–496 (2003).
- 9. Wu R. A., Upton H. E., Vogan J. M., Collins K., Telomerase mechanism of telomere synthesis. Annu. Rev. Biochem. 86, 439–460 (2017).
- 10. Jafri M. A., Ansari S. A., Alqahtani M. H., Shay J. W., Roles of Telomeres and Telomerase in Cancer, and Advances in Telomerase-Targeted Therapies (BioMed Central Ltd., 2016).
- 11. Henderson E., Hardin C. C., Walk S. K., Tinoco I. Jr., Blackburn E. H., Telomeric DNA oligonucleotides form novel intramolecular structures containing guanine-guanine base pairs. Cell 51, 899–908 (1987).
- 12. Phan A. T., Mergny J. L., Human telomeric DNA: G-quadruplex, i-motif and Watson-Crick double helix. Nucleic Acids Res. 30, 4618–4625 (2002).
- 13. Henderson A., et al., Detection of G-quadruplex DNA in mammalian cells. Nucleic Acids Res. 42, 860–869 (2014).
- 14. Di Antonio M., et al., Single-molecule visualization of DNA G-quadruplex formation in live cells. Nat. Chem. 12, 832–837 (2020).
- 15. Biffi G., Tannahill D., McCafferty J., Balasubramanian S., Quantitative visualization of DNA G-quadruplex structures in human cells. Nat. Chem. 5, 182–186 (2013).
- 16. Maizels N., G4-associated human diseases. EMBO Rep. 16, 910–922 (2015).
- 17. Neidle S., Quadruplex nucleic acids as novel therapeutic targets. J. Med. Chem. 59, 5987–6011 (2016).
- 18. Zaug A. J., Podell E. R., Cech T. R., Human POT1 disrupts telomeric G-quadruplexes allowing telomerase extension in vitro. Proc. Natl. Acad. Sci. U.S.A. 102, 10864–10869 (2005).
- 19. Banco M. T., Ferré-D’Amaré A. R., The emerging structural complexity of G-quadruplex RNAs. RNA 27, 390–402 (2021).
- 20. Lane A. N., Chaires J. B., Gray R. D., Trent J. O., Stability and kinetics of G-quadruplex structures. Nucleic Acids Res. 36, 5482–5515 (2008).
- 21. Tran P. L., Mergny J. L., Alberti P., Stability of telomeric G-quadruplexes. Nucleic Acids Res. 39, 3282–3294 (2011).
- 22. Ray S., et al., RPA-mediated unfolding of systematically varying G-quadruplex structures. Biophys. J. 104, 2235–2245 (2013).
- 23. Ray S., Bandaria J. N., Qureshi M. H., Yildiz A., Balci H., G-quadruplex formation in telomeres enhances POT1/TPP1 protection against RPA binding. Proc. Natl. Acad. Sci. U.S.A. 111, 2990–2995 (2014).
- 24. Hwang H., et al., Telomeric overhang length determines structural dynamics and accessibility to telomerase and ALT-associated proteins. Structure 22, 842–853 (2014).
- 25. Budhathoki J. B., Stafford E. J., Yodh J. G., Balci H., ATP-dependent G-quadruplex unfolding by Bloom helicase exhibits low processivity. Nucleic Acids Res. 43, 5961–5970 (2015).
- 26. Maleki P., et al., Quantifying the impact of small molecule ligands on G-quadruplex stability against Bloom helicase. Nucleic Acids Res. 47, 10744–10753 (2019).
- 27. Chen M. C., et al., Structural basis of G-quadruplex unfolding by the DEAH/RHA helicase DHX36. Nature 558, 465–469 (2018).
- 28. Mendoza O., Bourdoncle A., Boulé J. B., Brosh R. M. Jr., Mergny J. L., G-quadruplexes and helicases. Nucleic Acids Res. 44, 1989–2006 (2016).
- 29. Yu H. Q., Miyoshi D., Sugimoto N., Characterization of structure and stability of long telomeric DNA G-quadruplexes. J. Am. Chem. Soc. 128, 15461–15468 (2006).
- 30. Petraccone L., et al., Structure and stability of higher-order human telomeric quadruplexes. J. Am. Chem. Soc. 133, 20951–20961 (2011).
- 31. Kar A., Jones N., Arat N. Ö., Fishel R., Griffith J. D., Long repeating (TTAGGG)n single-stranded DNA self-condenses into compact beaded filaments stabilized by G-quadruplex formation. J. Biol. Chem. 293, 9473–9485 (2018).
- 32. Abraham Punnoose J., et al., Interaction of G-quadruplexes in the full-length 3′ human telomeric overhang. J. Am. Chem. Soc. 136, 18062–18069 (2014).
- 33. Carrino S., Hennecker C. D., Murrieta A. C., Mittermaier A., Frustrated folding of guanine quadruplexes in telomeric DNA. Nucleic Acids Res. 49, 3063–3076 (2021).
- 34. Mustafa G., Shiekh S., Gc K., Abeysirigunawardena S., Balci H., Interrogating accessibility of telomeric sequences with FRET-PAINT: Evidence for length-dependent telomere compaction. Nucleic Acids Res. 49, 3371–3380 (2021).
- 35. Jungmann R., et al., Single-molecule kinetics and super-resolution microscopy by fluorescence imaging of transient binding on DNA origami. Nano Lett. 10, 4756–4761 (2010).
- 36. Auer A., Strauss M. T., Schlichthaerle T., Jungmann R., Fast, background-free DNA-PAINT imaging using FRET-based probes. Nano Lett. 17, 6428–6434 (2017).
- 37. Vesco G., et al., Double-stranded flanking ends affect the folding kinetics and conformational equilibrium of G-quadruplexes forming sequences within the promoter of KIT oncogene. Nucleic Acids Res. 49, 9724–9737 (2021).
- 38. Giesen U., et al., A formula for thermal stability (Tm) prediction of PNA/DNA duplexes. Nucleic Acids Res. 26, 5004–5006 (1998).
- 39. Qureshi M. H., Ray S., Sewell A. L., Basu S., Balci H., Replication protein A unfolds G-quadruplex structures with varying degrees of efficiency. J. Phys. Chem. B 116, 5588–5594 (2012).
- 40. Tang J., et al., G-quadruplex preferentially forms at the very 3′ end of vertebrate telomeric DNA. Nucleic Acids Res. 36, 1200–1208 (2008).
- 41. Wang Q., et al., G-quadruplex formation at the 3′ end of telomere DNA inhibits its extension by telomerase, polymerase and unwinding by helicase. Nucleic Acids Res. 39, 6229–6237 (2011).
- 42. Paul T., Liou W., Cai X., Opresko P. L., Myong S., TRF2 promotes dynamic and stepwise looping of POT1 bound telomeric overhang. Nucleic Acids Res. 49, 12377–12393 (2021).
- 43. Šponer J., et al., Folding of guanine quadruplex molecules-funnel-like mechanism or kinetic partitioning? An overview from MD simulation studies. Biochim. Biophys. Acta, Gen. Subj. 1861 (5 Pt B), 1246–1263 (2017).
- 44. Bessi I., Jonker H. R. A., Richter C., Schwalbe H., Involvement of long-lived intermediate states in the complex folding pathway of the human telomeric G-quadruplex. Angew. Chem. Int. Ed. Engl. 54, 8444–8448 (2015).
- 45. Müller D., Bessi I., Richter C., Schwalbe H., The folding landscapes of human telomeric RNA and DNA G-quadruplexes are markedly different. Angew. Chem. Int. Ed. Engl. 60, 10895–10901 (2021).
- 46. Guédin A., Gros J., Alberti P., Mergny J. L., How long is too long? Effects of loop size on G-quadruplex stability. Nucleic Acids Res. 38, 7858–7868 (2010).
- 47. Ferreiro D. U., Komives E. A., Wolynes P. G., Frustration in biomolecules. Q. Rev. Biophys. 47, 285–363 (2014).
- 48. Bryngelson J. D., Onuchic J. N., Socci N. D., Wolynes P. G., Funnels, pathways, and the energy landscape of protein folding: A synthesis. Proteins 21, 167–195 (1995).
- 49. Ferreiro D. U., Hegler J. A., Komives E. A., Wolynes P. G., Localizing frustration in native proteins and protein assemblies. Proc. Natl. Acad. Sci. U.S.A. 104, 19819–19824 (2007).
- 50. Ferreiro D. U., Hegler J. A., Komives E. A., Wolynes P. G., On the role of frustration in the energy landscapes of allosteric proteins. Proc. Natl. Acad. Sci. U.S.A. 108, 3499–3503 (2011).
- 51. Ferreiro D. U., Walczak A. M., Komives E. A., Wolynes P. G., The energy landscapes of repeat-containing proteins: Topology, cooperativity, and the folding funnels of one-dimensional architectures. PLoS Comput. Biol. 4, e1000070 (2008).
- 52. Kloss E., Courtemanche N., Barrick D., Repeat-protein folding: New insights into origins of cooperativity, stability, and topology. Arch. Biochem. Biophys. 469, 83–99 (2008).
- 53. Chaires J. B., Human telomeric G-quadruplex: Thermodynamic and kinetic studies of telomeric quadruplex stability. FEBS J. 277, 1098–1106 (2010).
- 54. Vorlícková M., Chládková J., Kejnovská I., Fialová M., Kypr J., Guanine tetraplex topology of human telomere DNA is governed by the number of (TTAGGG) repeats. Nucleic Acids Res. 33, 5851–5860 (2005).
- 55. Xu Y., Ishizuka T., Kurabayashi K., Komiyama M., Consecutive formation of G-quadruplexes in human telomeric-overhang DNA: A protective capping structure for telomere ends. Angew. Chem. Int. Ed. Engl. 48, 7833–7836 (2009).
- 56. Wang H., Nora G. J., Ghodke H., Opresko P. L., Single molecule studies of physiologically relevant telomeric tails reveal POT1 mechanism for promoting G-quadruplex unfolding. J. Biol. Chem. 286, 7479–7489 (2011).
- 57. Iqbal A., et al., Orientation dependence in fluorescent energy transfer between Cy3 and Cy5 terminally attached to double-stranded nucleic acids. Proc. Natl. Acad. Sci. U.S.A. 105, 11176–11181 (2008).
- 58. Søndergaard S., et al., Dynamics of fluorescent dyes attached to G-quadruplex DNA and their effect on FRET experiments. ChemPhysChem 16, 2562–2570 (2015).
- 59. Maleki P., Budhathoki J. B., Roy W. A., Balci H., A practical guide to studying G-quadruplex structures using single-molecule FRET. Mol. Genet. Genomics 292, 483–498 (2017).