Abstract
G-quadruplex and i-motif nucleic acid structures are believed to fold through kinetic partitioning mechanisms. Such mechanisms explain the structural heterogeneity of G-quadruplex metastable intermediates which have been extensively reported. On the other hand, i-motif folding is regarded as predictable, and research on alternative i-motif folds is limited. While TC5 normally folds into a stable tetrameric i-motif in solution, we report that 2′-deoxy-2′-fluoroarabinocytidine (araF-C) substitutions can prompt TC5 to form an off-pathway and kinetically-trapped dimeric i-motif, thereby expanding the scope of i-motif folding landscapes. This i-motif is formed by two strands, associated head-to-head, and featuring zero-nucleotide loops which have not been previously observed. Through spectroscopic and computational analyses, we also establish that the dimeric i-motif is stabilized by fluorine and non-fluorine hydrogen bonds, thereby explaining the superlative stability of araF-C modified i-motifs. Comparative experimental findings suggest that the strength of these interactions depends on the flexible sugar pucker adopted by the araF-C residue. Overall, the findings reported here provide a new role for i-motifs in nanotechnology and also pose the question of whether unprecedented i-motif folds may exist in vivo.
Subject terms: Nucleic acids, Solution-state NMR, Molecular modelling, Computational chemistry, Biophysical chemistry
The oligonucleotide d(TC5) forms a well-characterized tetrameric i-motif in solution; however, the isolation of dimeric and trimeric intermediates remains challenging. Here, the authors report that 2′-deoxy-2′-fluoroarabinocytidine substitutions can prompt TC5 to form dimeric i-motif folding intermediates through fluorine and oxygen hydrogen bonds.
Introduction
Kinetic partitioning mechanisms are believed to govern the folding of quadruplex nucleic acid structures. These mechanisms support the formation of multiple, coexisting conformations from the same sequence, differing in their thermodynamic stability. G-quadruplexes, in particular, are notorious for their structural polymorphism resulting from folding landscapes with several energy minima1. On the other hand, i-motifs, which consist of two parallel duplexes intercalated by cytosine:cytosine (C:C+) base pairs, also seem to fold through kinetic partitioning mechanisms2–8 but are deemed less structurally heterogeneous9–11. Apart from intermediates of unknown structure or i-motif conformers differing in their C:C+ stacking order3–7,12, no other alternative i-motif folding intermediates have been isolated and characterized to date.
Here, we report the first high-resolution structure of a metastable dimeric i-motif intermediate, thereby expanding the scope of i-motif folding landscapes. This finding is timely owing to the relevance of i-motifs to disease (regulating DNA transcription, protooncogene transcription, and telomere homeostasis)13–19, as well as their nanotechnological applications (molecular rotors20, cellular sensors21 and hydrogel components22) which rely on their pH-dependent folding.
Oligonucleotide d(TC5) forms a well-characterized tetrameric i-motif in solution, which represents the first i-motif structure ever reported23. Studies have suggested that the tetrameric structure assembles through a sequential folding pathway with dimeric and trimeric intermediates that have evaded isolation2. Isolating the off-pathway dimeric i-motif intermediate reported here is only possible once four (Fig. 1, ON4a-e) or all five (Fig. 1, ON5) 2′-deoxycytidine (dC) residues are substituted with 2′-deoxy-2′-fluoroarabinocytidine (araF-C). We have previously shown that substituting dC with araF-C in i-motif-forming sequences dramatically slows down i-motif unfolding kinetics24. In line with these previous findings, we show here that araF-C can also stabilize non-native structures, allowing the observation of i-motif folding intermediates. We have also demonstrated that araF-C substitutions superlatively increase the thermal stability of i-motifs at both acidic and neutral pH24–26. However, elucidating the structural reasons for this stabilization has been precluded by the absence of model systems with the NMR resolution required for quantum mechanics-based computational methods. Therefore, in addition to characterizing its folding, we determine that intra-residual organic fluorine- and oxygen-hydrogen bonds contribute to the stability of the structure, which persists in solution for months.
Fig. 1. Oligonucleotide sequences used in this study.

i-Motif forming oligonucleotides are listed in the table, and the chemical structures of dC and the substituent araF-C are represented above the table.
Results
Isolating the dimeric ON5 species
Sample preparation influences the nature of the araF-C-modified ON5 species formed in solution. Non-denaturing gel electrophoresis reveals that, at higher concentrations (500 μM), slow annealing (SA) (see Methods) leads to the formation of a low-mobility species, previously characterized as the tetrameric T(araF-C)5 i-motif26, and a small population of a higher-mobility species that we show here to be an off-pathway dimeric i-motif (Fig. 2A). Interestingly, rapid (snap-cool) annealing (RA) results in a band with intermediate mobility, presumably corresponding to a trimeric species or a less-structured dimer. At concentrations below 200 μM, ON5 still forms a mixture of species under SA but forms the high-mobility dimeric species exclusively under RA (Fig. 2A). In contrast, unmodified TC5 (ON0) fails to form the high-mobility species regardless of the annealing method (Supplementary Figure 1). As expected, it forms the low-mobility tetrameric i-motif under SA and, surprisingly, forms an additional species of intermediate mobility under RA, similar to that which forms in ON5 at high concentrations and RA (Supplementary Figure 1).
Fig. 2. Influence of oligonucleotide concentration and annealing method on ON5 structural populations.

A Non-denaturing gel electrophoresis of ON5 across different concentrations upon RA and SA. B CD spectra of 100 μM RA (blue) and SA (gray) ON5 at 5 °C. C 1H NMR imino signals of 150 μM RA ON5 and SA ON5 at 5 °C. Buffer conditions are aqueous 10 mM NaPi pH 5 (10% D2O for NMR spectra).
Determining the structure of the dimeric ON5 species
We used circular dichroism (CD) and NMR spectroscopy to characterize the structure of the isolated dimeric ON5 species. Regardless of annealing conditions, the CD spectra of ON5 (Fig. 2B) display the same maximum and minimum at around 285 nm and 265 nm, respectively, and only differ in signal intensity. These spectral features are characteristic of i-motifs and are similar to those of the tetrameric ON0 i-motif (Supplementary Figure 2)26. Moreover, both RA and SA ON5 exhibit signals in the 15-16 ppm region of the 1H NMR spectrum (Fig. 2C), which are characteristic of imino protons involved in C:C+ base pairs. While the SA sample shows multiple, overlapping signals, the RA sample shows three sharp and well-dispersed signals, consistent with the formation of a single species and in agreement with observations from non-denaturing electrophoretic experiments. Similarly, RA ON5 shows a single sharp signal at 11.3 ppm, which is characteristic of a T:T base pair, while SA ON5 exhibits several. In assessing the stability of the dimeric RA ON5 species, we found that most imino signals remain visible at pH 7.0 (Supplementary Figure 3), thereby demonstrating the stabilization incurred by araF-C residues.
To determine the structure of RA ON5, 2D NMR spectroscopy was used. The NOESY and TOCSY spectra (Fig. 3A) confirmed the presence of a single species, particularly through our observations of a single cross-peak between the methyl protons of the thymine of residue T1 and its aromatic proton, as well as five intense cross-peaks between the aromatic protons (H5 and H6) of each of the araF-C (C2 through C6). While only three imino-amino cross-peaks of C:C+ homo-base pairs are observed between 15 and 16 ppm (Fig. 3A), the amino protons of all five araF-C residues appear in the range of chemical shifts characteristic of protonated cytosines (8.2–9.5 ppm) (Supplementary Table 1), suggesting they are all involved in base-pairing. The absence of other sequential cross-peaks and a large number of sugar-sugar NOEs ruled out the possibility of parallel duplex formation. To assign C3 through C6, we recorded the 1D 19F-NMR spectra (Fig. 3B) of a series of oligonucleotides, ON4a-e (Fig. 1), which is based on the sequential substitution of one araF-C of ON5 with a dC residue. Non-denaturing gel electrophoresis (Supplementary Figure 4), CD (Supplementary Figure 5) and 1H-NMR (Supplementary Figure 6) spectra of all the oligonucleotides indicate that they adopt the same conformation as ON5 when subjected to RA. The 1D 19F-NMR spectrum of ON5 shows five sharp and dispersed signals in a range of −188 to −202 ppm (Fig. 3B) corresponding to each of the five 19F nuclei in the molecule. As expected, the spectra of ON4a-e show four signals, meaning that each missing fluorine signal could be assigned to the araF-C residue of ON5 that has been substituted with deoxycytidine. In ON4b specifically, two 19F signals appear close together at −193.10 and −193.14 ppm and are resolved in the 1H-19F-HOESY (Supplementary Figure 7). After assigning each of its fluorine signals (Supplementary Table 1), the 1H-19F-HOESY spectrum of ON5 (Fig. 3C) was used to identify the H2′ protons through their intense intra-residual cross-peaks with 19F, thereby facilitating the complete, unambiguous assignment of C3-C6 residues. Consequently, we identified key i-motif minor groove NOE cross-peaks (H1′-H1′ between stacked cytosines and reciprocal H2′-H1′ between stacked cytosines facing their 3′ edges), which serve as conclusive evidence that RA ON5 folds into a dimeric i-motif with the stacking order C2-C6-C3-C5-C4 (top to bottom) (Fig. 4A).
Fig. 3. Characterization of ON5 dimeric structure by NMR.
A Overlapping NOESY (black) and TOCSY (blue) spectra corresponding to RA ON5. Top left: imino-amino cross-peaks. Bottom-left: cross-peaks between methyl protons of T1 and its own aromatic proton. Sequential cross-peaks between H2′/H2′′ of T1 and H6 of C2. Middle: five intense TOCSY signals correlating H5 and H6 of the five araF-C residues. Low-intensity intra-residual H1′-H6 and H2′-H6 cross-peaks. Right: Inter-residual H1′-H1′ and H1′-H2′ cross-peaks across the minor groove. B 1D 19F NMR spectra of RA ON5 and RA ON4a-ON4e. C 1H-19F HOESY spectrum of RA ON5. All spectra were acquired with 150 μM RA ON5 in 10 mM NaPi pH 5, at 5 °C.
Fig. 4. ON5 dimeric i-motif structural characteristics.
A Schematic view (top left), and three-dimensional structure viewed from the major groove (top middle) and minor groove (top right). B 3′-3′ interface across the major groove showing inter-residual F···H-amino and intra-residual F···H6 contacts. C Structures of C2′ and C3′ residues with C2′-endo and C3′-endo sugar pucker, respectively. dT and araF-C are in gray and slate blue, respectively. Atom color code: fluorine: green; nitrogen: blue; oxygen: red; hydrogen: white. Atom type is not color-differentiated along the backbone. ′ symbols denote residues of the strand with protonated araF-Cs.
Structural features of the dimeric ON5 i-motif
We used 50 structurally-relevant distance restraints (Supplementary Table 2) to resolve the structure of the dimeric i-motif. These were derived from NOESY and 1H-19F-HOESY spectra (see Methods) and were complemented by torsion angle restraints obtained from qualitative analysis of DQF-COSY (Supplementary Figure 8).
H2′-H1′ cross-peak intensities in the COSY spectrum indicate that the araF-C sugars of C3 and C5 adopt a north conformation, those of T1, C2 and C4 a south conformation, and that of C6 a northeast conformation. Supplementary Table 3 displays the average pseudorotation parameters of the dimeric ON5 i-motif and Supplementary Table 4 displays its average dihedral angles and order parameters.
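As a reading aid for the pucker labels used here, the following minimal sketch maps a pseudorotation phase angle P onto the coarse north/east/south/west labels. The 90°-wide quadrant boundaries and the function name `classify_pucker` are simplifying assumptions for illustration, not the criteria used in this study; the reference values (C3′-endo near P ≈ 18°, O4′-endo near 90°, C2′-endo near 162°) follow the usual pseudorotation convention.

```python
def classify_pucker(phase_deg: float) -> str:
    """Map a pseudorotation phase angle P (degrees) to a coarse pucker label.

    Quadrant boundaries at 45, 135, 225 and 315 degrees are an illustrative
    assumption; intermediate labels such as "northeast" are not resolved here.
    """
    p = phase_deg % 360.0
    if p >= 315.0 or p < 45.0:
        return "north"   # includes C3'-endo (P ~ 18 deg)
    if p < 135.0:
        return "east"    # includes O4'-endo (P ~ 90 deg)
    if p < 225.0:
        return "south"   # includes C2'-endo (P ~ 162 deg)
    return "west"


# Canonical reference puckers and their coarse labels.
for name, phase in [("C3'-endo", 18.0), ("O4'-endo", 90.0), ("C2'-endo", 162.0)]:
    print(name, "->", classify_pucker(phase))
```

A sugar described as northeast would fall near the north/east boundary of this coarse scheme.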
The dimeric i-motif is formed by two tightly turned strands, associated head-to-head, and stabilized by five C:C+ and one T:T inter-strand homo-base pairs. The change in strand orientation occurs through an unprecedented nucleotide-free loop at the C4-p-C5 step (Fig. 4A). Interestingly, the T:T base pair does not follow the alternating intercalation pattern of the C:C+ base pairs, so the intercalated unit of the structure consists of five C:C+ base pairs, with C2:C2+ and C4:C4+ being the most external ones. Importantly, it was possible to detect HOE cross-peaks correlating the fluorine atom of C3 with the amino protons of C5 and vice versa (Fig. 3C). The same correlations are observed between C2 2′F and C6 amino protons, but they are not observed between C6 2′F and C2 amino protons, likely because the C2:C2+ base pair is the most external, with its amino protons exchanging more rapidly with solvent. We also detect intense HOE cross-peaks correlating 2′F and H6 aromatic protons of all araF-C residues (Fig. 3C). Overall, the presence of these HOE correlations is indicative of short distances between 2′F atoms and inter-strand amino protons or intra-residual aromatic protons (Fig. 4B). These results led us to investigate the formation of 2′F···H hydrogen bonds, which we have previously postulated to be important in araF-C stabilized i-motifs24–26.
F···H and O···H bonds contribute to the stability of araF-C-modified i-motifs
The establishment of organic fluorine-hydrogen bonds in nucleic acid systems has seldom been conclusive, so we resorted to experimental (NMR) and computational methods to determine their existence in the dimeric i-motif. We found that all araF-C aromatic proton (H6) signals of RA ON5 exhibit significant broadening in linewidth in the 19F-coupled spectrum (Fig. 5A). Given the long through-bond distance (5 bonds) between the 2′F and H6 atoms (Fig. 4B), such dramatic broadening is unambiguously attributed to a through-space scalar coupling occurring between these atoms. C3H6, in particular, exhibits clear splitting in the 19F-coupled 1H NMR spectrum and resonates as a doublet of doublets, allowing us to estimate the magnitude of the J coupling of the H-bond (≈3.8 Hz) (Fig. 5A).
Fig. 5. Analysis of hydrogen bonds in ON5 dimeric i-motif by NMR and NCI.

Overlapped 19F-coupled (black) and 19F-decoupled (blue) 1D 1H-NMR spectra showing aromatic signals of A RA ON5 and B RA ON4b. (A- and B-insets): Zoomed view of the dashed square region of A- and B-left, where the coupling constant associated with signal splitting of C3 H6 upon 19F coupling has been measured. C NCI plots for C2–C6′ two-residue system. Hydrogen bonds can be visualized as blue spikes in the NCI plot regions at sign(λ2)*ρ ≤ −0.01 au. D Model of an inter-strand, capped araF-C pair evaluated by the NCI method, showing the s = 0.04 isosurface in shades of red (repulsive interactions) and shades of blue (attractive interactions), the s = 0.013 isosurface in black, and the most relevant intra- and inter-residual interactions detected.
To determine whether other araF-C stabilized i-motifs also feature intra-residual fluorine-hydrogen bonds, we performed 19F-coupled and decoupled 1H-NMR on SA ON4b (Fig. 1; Supplementary Figure 9A), which folds exclusively into a tetrameric i-motif at high concentrations (Supplementary Figures 4 and 6). Due to the larger size of the tetramer, the linewidth of the signals is markedly broader than that of the dimer. Nevertheless, specific differences in the fine structure of the aromatic signals of the araF-C residues are detected between the two spectra. Again, all—and only—the aromatic proton signals of araF-C broaden when 19F-coupling is enabled. Similar results (Supplementary Figure 9B) are obtained for the previously-studied HC3 centromeric sequence (Fig. 1), which forms a dimeric i-motif26,27. Consequently, we infer that intra-residual 2′F···H6 hydrogen bonding contributes to the enhanced stability of i-motifs containing araF-C.
We next sought to understand whether the sugar pucker adopted by the araF-C residues could influence the magnitude of the through-space J constant and therefore the strength of the H-bond. Just like C3, C5 adopts a north (C3′-endo) sugar pucker (Fig. 4C). However, the H6 signal of C5 overlaps with other signals in the 19F-coupled spectrum (Fig. 5A), making the evaluation of the J constant difficult. This obstacle is overcome in the 1D 1H-NMR spectrum of the ON4b oligonucleotide (Fig. 5B), which also folds into the same dimeric i-motif upon RA (Supplementary Figure 7). In the spectrum, the H6 signal of C5 appears isolated and splits in the 19F-coupled spectrum with the same J coupling of 3.8 Hz as the signal of C3H6 in the RA ON5 sample. Given that C3 and C5 are the only residues that adopt a north (C3′-endo) sugar pucker (Supplementary Figure 8 and Supplementary Table 3), we postulate that i-motif araF-C residues with a north conformation exhibit intra-residual H-bonds of higher strength than those with other conformations.
To complement our experimental findings on 2′F···H bonding, we used the NCI (non-covalent interaction) computational method, which is effective in detecting weak interactions, such as intra-residual hydrogen bonds28 or those involving a fluorine acceptor29. The NCI framework is based on the analysis of the reduced density gradient (s) across a molecule, which is plotted as a function of sign(λ2)ρ, where ρ is the electronic density and λ2 is the second eigenvalue of the Hessian of ρ. In regions of hydrogen bonds, sign(λ2)ρ takes low negative values while s is close to 0 (refs. 30,31). Accordingly, we constructed NCI plots (Supplementary Figures 10–13) for a total of 20 structures consisting of models of two capped cytosines (C2-C6′, C3-C5′, C6-C2′, and C5-C3′) (Supplementary Figure 14) extracted from NMR-restrained molecular dynamics simulations of the dimeric i-motif. To associate molecular interactions with spikes in the NCI plots, we chose a representative structure for each of the two-residue models and reported the corresponding NCI plot (Fig. 5C) along with the s = 0.04 isosurface in the molecular representation (Fig. 5D). Additionally, NCI plots of single-residue models (Supplementary Figures 15–18) were used to discern intra- from inter-residual interactions (a detailed assignment is discussed in the Supplementary Information).
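For reference, the reduced density gradient used in the NCI framework is commonly defined as shown below. This is the standard literature definition and is given here only as a reading aid; it is not restated by the authors in this section.

```latex
s(\mathbf{r}) = \frac{1}{2\,(3\pi^{2})^{1/3}}\,
                \frac{\lvert \nabla \rho(\mathbf{r}) \rvert}{\rho(\mathbf{r})^{4/3}}
```

Non-covalent contacts appear where s approaches zero at low ρ. The sign of λ2 then separates attractive interactions such as hydrogen bonds (λ2 < 0) from repulsive ones (λ2 > 0).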
Using the NCI method, we detected intra-residual fluorine-hydrogen bonds involving C2, C3, C5, and C6 residues and their protonated (′) equivalents (Fig. 5D), thereby corroborating our NMR-based results showing the splitting of araF-C H6 aromatic signals upon 19F coupling (Fig. 5A, B). In each of the pairs of residues examined, we also detected inter-residual fluorine-hydrogen bonds between 2′F and cytosine H4 amino protons, as well as intra-residual O2···H1′ and O4′···H6 hydrogen bonds. The networks of hydrogen bonds detected likely contribute to the stabilization of i-motifs upon modification with araF-C.
The dimeric ON5 i-motif is an off-pathway kinetic trap
We conducted thermal hysteresis (TH) experiments using a temperature-controlled UV spectrometer to understand the folding of the dimeric T(araF-C)5 i-motif and compare it to the folding of the well-characterized tetrameric i-motif. In TH experiments, the temperature ramp rate is fast compared to the folding and unfolding rates of the sample, leading to melting and annealing traces that are shifted to higher and lower temperatures, respectively. Two scan rates were applied to ON0 and ON5 samples at two concentrations (50 and 250 μM), representing either RA (5 °C/min) or SA (0.5 °C/min). The resulting TH profiles (Fig. 6A) show biphasic melting curves and pronounced hysteresis, suggesting that both oligonucleotides can form dimeric and tetrameric species with lower- and higher-temperature melting transitions, respectively26.
Fig. 6. Modeling ON5 dimeric i-motif formation by thermal hysteresis analysis.

A TH traces for ON0 (top) and ON5 (bottom) at 50 μM (left) and 250 μM (right). Cooling traces are shown in blue (5 °C/min) and cyan (0.5 °C/min). Heating traces are shown in red (5 °C/min) and yellow (0.5 °C/min). Experimental data are shown as points and modeled experiments corresponding to the best fit are shown as solid lines. ON0 was fit to Model 1 and ON5 was fit to Model 2 shown in B. B Schemes of folding models. Model 1: Sequential tetrameric assembly of monomer (M) into dimer (D), trimer (Tri), and finally tetramer (T). Model 2: Sequential tetrameric assembly including an off-pathway dimeric species (D*).
The tetrameric ON0 i-motif was previously proposed to fold through sequential monomer additions, proceeding through dimer and trimer intermediates2. Owing to its folded nature, we hypothesized that the dimeric ON5 i-motif we isolated was an off-pathway, kinetically-trapped intermediate resulting from a structural rearrangement of the on-pathway dimeric intermediate. To test this hypothesis, the TH traces for ON0 and ON5 were globally fit to two different assembly pathways (Fig. 6B): (1) following the sequential intermolecular assembly mechanism and (2) allowing for an off-pathway dimeric species. The dimer spectral absorption coefficient was assumed to be half-way between those of the monomer and tetramer, as raw absorbance changes were found to be smaller in magnitude when the dimer was formed. The improvement in the residual sum of squares (RSS) given by mechanism 2, which accounts for the formation of an off-pathway dimer, relative to mechanism 1 was evaluated using F-test statistics32 and found to be significant at p ≤ 0.01 for ON5, whereas ON0 showed no improvement in RSS with mechanism 2 (Supplementary Table 5).
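The model comparison described above can be illustrated with the generic F statistic for nested least-squares models. The sketch below is not the authors' fitting code: the function name, the RSS values, the parameter counts and the number of data points are all hypothetical placeholders chosen to make the arithmetic easy to follow.

```python
def f_statistic(rss_simple: float, rss_complex: float,
                n_params_simple: int, n_params_complex: int,
                n_points: int) -> float:
    """F statistic for comparing two nested least-squares models.

    F = ((RSS_simple - RSS_complex) / (p_complex - p_simple))
        / (RSS_complex / (n_points - p_complex))
    """
    df_num = n_params_complex - n_params_simple
    df_den = n_points - n_params_complex
    if df_num <= 0 or df_den <= 0:
        raise ValueError("degrees of freedom must be positive")
    return ((rss_simple - rss_complex) / df_num) / (rss_complex / df_den)


# Hypothetical numbers: Model 1 (sequential) vs Model 2 (with off-pathway dimer).
f_value = f_statistic(rss_simple=2.0, rss_complex=1.0,
                      n_params_simple=6, n_params_complex=8,
                      n_points=108)
print(f"F = {f_value:.1f}")  # (1.0 / 2) / (1.0 / 100) = 50.0
```

The p-value follows from the upper tail of the F distribution with (df_num, df_den) degrees of freedom, which a statistics library can supply. A large F means that the extra parameters of the more complex model reduce the RSS by more than chance alone would explain.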
Furthermore, by modeling the assembly of the tetrameric and dimeric i-motifs from TH experiments, we can extract information on the populations of individual species in the sample. For instance, isothermal kinetic simulations of RA ON0 and ON5 at low concentrations (50 μM) show a large degree of dimer assembly at 4 °C (Supplementary Figure 19), which is consistent with previous experiments showing that the rate-limiting step of tetramer assembly for ON0 is the formation of the trimeric species2. In the case of ON0, this dimer intermediate quickly converts to the tetrameric species. In contrast, the ON5 on-pathway dimeric species folds into the off-pathway dimeric i-motif, which acts as a kinetic trap: it further slows the assembly of the tetrameric species and remains folded for days at low temperatures.
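To make the kinetic-trap argument concrete, the following minimal sketch integrates Model 2 (M + M ⇌ D, D ⇌ D*, D + M ⇌ Tri, Tri + M ⇌ T) at constant temperature with a simple forward-Euler scheme. The rate constants in `HYPOTHETICAL_RATES` are invented for illustration and are not the fitted values from this work; only the reaction scheme and the 50 μM starting concentration are taken from the text.

```python
# Hypothetical rate constants (uM^-1 s^-1 for bimolecular steps, s^-1 otherwise).
HYPOTHETICAL_RATES = {
    "k1f": 1e-3, "k1r": 1e-2,   # M + M <-> D
    "k2f": 5e-2, "k2r": 1e-5,   # D <-> D* (off-pathway trap)
    "k3f": 1e-5, "k3r": 1e-4,   # D + M <-> Tri (slow step)
    "k4f": 1e-3, "k4r": 1e-5,   # Tri + M <-> T
}


def simulate_model2(m0_uM, k, t_end_s=1000.0, dt_s=0.01):
    """Forward-Euler integration of Model 2; returns final concentrations in uM."""
    m, d, dstar, tri, tet = m0_uM, 0.0, 0.0, 0.0, 0.0
    for _ in range(int(round(t_end_s / dt_s))):
        # Net forward rate of each reversible step, from the current state.
        r1 = k["k1f"] * m * m - k["k1r"] * d
        r2 = k["k2f"] * d - k["k2r"] * dstar
        r3 = k["k3f"] * d * m - k["k3r"] * tri
        r4 = k["k4f"] * tri * m - k["k4r"] * tet
        # Update every species with its stoichiometry.
        m += dt_s * (-2.0 * r1 - r3 - r4)
        d += dt_s * (r1 - r2 - r3)
        dstar += dt_s * r2
        tri += dt_s * (r3 - r4)
        tet += dt_s * r4
    return {"M": m, "D": d, "D*": dstar, "Tri": tri, "T": tet}


result = simulate_model2(50.0, HYPOTHETICAL_RATES)
for species, conc in result.items():
    print(f"{species:>3}: {conc:8.3f} uM")
```

With these placeholder constants, strands accumulate in D* far faster than they reach T, which is the qualitative behavior expected of an off-pathway trap. The total strand count M + 2D + 2D* + 3Tri + 4T is conserved by construction.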
Discussion
We have previously shown that dC-to-araF-C substitutions result in a dramatic enhancement in i-motif stability at acidic and neutral pH26. Here, we show that araF-C substitutions can also promote the formation of an unprecedented dimeric T(araF-C)5 i-motif, which consists of two looping strands, associated by intermolecular C:C+ base pairs. Interestingly, the loops connecting the two antiparallel segments of each strand are devoid of any nucleotide, indicating that a single phosphate linker is long enough to span the exceptionally narrow i-motif minor groove. Replacing the fourth araF-C residue with dC (as in ON4d) reduces the propensity for the sequence to fold into a dimeric i-motif (Supplementary Figure 4), indicating that the presence of an araF-C in the 3′-closing turn position is important for the integrity of the structure. To our knowledge, this study is the first to report on i-motifs with zero-nucleotide loops, a feature that highlights the unique folding requirements and capabilities of i-motifs compared to other noncanonical structures such as G-quadruplexes. While it is unclear whether such loops can form in unmodified i-motifs, the 1H NMR spectrum of RA ON0 (Supplementary Figure 6) shows the formation of a minor species which we hypothesize is the same dimeric i-motif as ON5. Additional studies are being conducted on other cytosine-rich sequences to explore whether unmodified i-motifs could also form with zero-nucleotide loops.
In addition to determining the structure of the dimeric i-motif, we capitalized on the quality of its NMR signals (high dispersion, low overlap, and high resolution) to experimentally assess the presence of fluorine-hydrogen bonds. Previously, we attributed araF-C-induced i-motif stability to favorable electrostatic interactions based on structural information obtained for a substituted tetrameric TC5 i-motif26. Nevertheless, the potential role of fluorine-hydrogen bonds remained an open question. The ability of organic fluorine to serve as a hydrogen bond acceptor has been debated, and only a few studies have suggested the formation of fluorine-hydrogen bonds in nucleic acid systems. Often, F-H interactions have been deemed pseudohydrogen bonding when the only experimental evidence available consists of short distances (or ‘contacts’) and favorable angles (120–180°)33–36. Here, we observed 19F-induced broadening of H6 aromatic proton signals across three different i-motifs, suggesting the presence of intra-residual hydrogen bonding. While the coupling could occur through the five bonds connecting 2′F to the aromatic protons37, we observe no splitting of signals corresponding to protons four bonds away from the 2′F, such as C3H4′. This corroborates our hypothesis that the coupling between 2′F and H6 is transferred through space and arises from a hydrogen bond. In the case of residues C3 and C5 of ON5 and ON4b, which adopt a north conformation, the H6 signals are split with a magnitude of 3.8 Hz (Fig. 5A, B). This J coupling is significantly higher than those measured between 2′F and H8 in araF-G residues in a G-quadruplex (3.0 Hz)38 and in an araF-N:RNA hybrid duplex (2.7 Hz)39. Similarly, the J coupling reported here is more than two-fold higher than those measured for corresponding residues in araF-C (1.7 Hz) and araF-T (1.5 Hz) in the same araF-N:RNA hybrid duplex40. 
Another study from our group has shown 2′F-induced splitting of araF-G H8 protons in a fully modified araF-N duplex (araF:araF) but did not measure coupling constants41. Based on the remarkable splitting of the aromatic signals in C3 and C5 residues, we postulate that the north pucker of araF-C is conducive to stronger intra-residual fluorine-hydrogen bonds. While araF nucleosides are typically oriented in the south (C2′-endo) conformation, the pucker flexibility displayed by the araF-C residues in the dimeric i-motif reported here is not surprising42. In araF-N:RNA duplexes, araF residues adopt an O4′-endo (east) pucker, which provides an ideal geometry for favorable fluorine-hydrogen interactions, while minimizing steric stress33,43. Owing to this geometry and the resulting interactions, araF-N:RNA duplexes are superlatively stable compared to other hybrid duplexes, including DNA:RNA43,44. Meanwhile, araF-G residues adopting the south pucker in G-quadruplexes have favorable 2′F···H8 distances and geometries suggestive of pseudohydrogen bonds35. Lastly, araF-G residues have also been reported in the north pucker in a G-quadruplex with V-loop topology, thereby enabling a favorable 2′F···H-N bond45. To our knowledge, this study is the first to report an araF pyrimidine residue adopting a north pucker, and it is conceivable that C3 and C5 adopt such a geometry to optimize the stability of the dimeric i-motif.
In this study, we also use computational methods based on NCI analyses to corroborate our experimental evidence for fluorine-hydrogen bonds. To our knowledge, there are no reports that succeed in reconciling experimental and computational results to describe fluorine-hydrogen bonds in nucleic acids. A recent study showed that computations at the QTAIM-NBO (quantum theory of atoms in molecules—natural bonding orbitals) level did not provide evidence for intra-residual 2′F∙∙∙H6/8 bonds in 2′-fluorinated nucleosides, despite the detection of scalar coupling by NMR46. However, these computational analyses are regarded as too stringent and limited to describing strong hydrogen bonds, including those with significant covalent character28. The NCI analyses (also based on QTAIM) used here are more suitable for describing weaker fluorine-hydrogen bonds, including those of intramolecular nature28,47. Consistent with the NMR data for ON5, we detect intra-residual 2′F···H6 hydrogen bonds. We also find inter-residual fluorine-hydrogen bonds between 2′F and cytosine amino protons (Fig. 4B). Unlike H6 proton signals, these amino protons do not exhibit any 19F-induced broadening or splitting in the 1H NMR spectrum (Supplementary Figure 20). Meanwhile, 19F-induced splitting of inter-residual amino protons has been previously observed when araF-G was incorporated into a G-quadruplex45. Given that NCI analyses are based on optimized structures, it is possible that the inter-residual fluorine-hydrogen bonds detected (Fig. 5D) are very weak in nature and may not contribute significantly to the stability of the dimeric i-motif. NCI analyses also reveal non-fluorine hydrogen bonds. The O2···H-C1′ bonds detected have not been reported in nucleic acids before; it is likely that the electronegativity of 2′F makes H1′ of araF-C more electropositive, rendering it a better hydrogen bond donor. 
Interestingly, NCI analyses also reveal O4′···H6 hydrogen bonds; given that H6 is also hydrogen bonded intra-residually to 2′F, we hypothesize that both interactions could form a bifurcated hydrogen bond system48,49. Overall, we believe that the geometry adopted by araF-C residues, optimized through their flexible puckering, is conducive to creating a network of fluorine and non-fluorine hydrogen bonds, thereby contributing to the unparalleled stability of araF-modified i-motifs.
In addition to studying the stability of araF-C modified i-motifs, we used TH analyses to study the folding mechanism of the dimeric ON5 i-motif. Several studies suggest that i-motif folding follows kinetic partitioning models4–8,12. Most of these studies have been carried out with monomeric i-motifs3–5,7,8,12. In the case of the tetrameric TC5 i-motif, the folding pathway has been previously described to proceed via the stepwise association of oligonucleotide strands2. While the dimeric and trimeric intermediates have been postulated2, they have never been detected. Hence, aside from conformers differing in C:C+ stacking order (3′E vs 5′E)4,5,26, alternative i-motif folds stemming from the same sequence have never been reported. Here, we show that ON5 folding is best described by a model which leads to a kinetically-trapped off-pathway dimeric i-motif and a more thermodynamically-stable tetrameric i-motif (Fig. 6B). These results strongly support the hypothesis that i-motif folding follows a kinetic partitioning mechanism. We also show that the simple sequential folding model adequately explains ON0 folding, but not ON5 folding; the latter is fit significantly better when an off-pathway dimeric intermediate is introduced to the model. Therefore, we deduce that the araF-C modifications can change i-motif folding landscapes and can be used to sequester transient structures that would go unnoticed in the native sequence.
In conclusion, we show that the incorporation of araF-C in TC5 i-motifs slows down their folding kinetics, thus allowing the observation of metastable folding intermediates. By isolating and determining the structure of the dimeric T(araF-C)5 i-motif, we reveal the fundamental reasons for the enhanced stability provoked by araF-C substitutions in native and non-native i-motifs. While i-motifs have been used extensively in nanotechnology for their pH sensitivity20, we demonstrate that i-motif molecularity can be modulated and controlled through rates of temperature change, which is useful for constructing novel molecular switches50,51, thermally-tunable hydrogels52,53, and DNA nanostructures with increased 3D complexity54. These i-motifs also lend themselves to additional strand functionalization (for example, with fluorophores or cargoes), which expands the scope of potential applications. Lastly, given that i-motifs have been recently detected in cells and that their formation seems to be transitory14, the results of this study strengthen the possibility that alternative i-motif conformations may fold in vivo and may be promoted in the absence of araF-C substitutions, through factors such as protein binding or molecular crowding effects.
Methods
Oligonucleotide synthesis and purification
Oligonucleotide synthesis was performed on an ABI 3400 DNA synthesizer (Applied Biosystems) at 1 μmol scale on Unylinker (Chemgenes) CPG solid support. Thymidine (dT), deoxycytidine (N-acetyl) (dC), and deoxyadenosine (N-Bz) (dA) phosphoramidites were used at 0.1 M concentration in acetonitrile and coupled for 200 s. araF-C was used at 0.1 M concentration and coupled for 600 s. Oligonucleotide deprotection and cleavage from the solid support were achieved using aqueous AMA (30% ammonium hydroxide/40% methylamine, 1:1) at 65 °C for 1 h. Oligonucleotide sequences were purified by anion exchange HPLC on a Waters 1525 instrument using a Source 15Q Resin column (11.5 cm × 3 cm). The aqueous buffer system consisted of solution A (25% acetonitrile, 15 mM sodium acetate) and solution B (0.5 M lithium perchlorate, 25% acetonitrile, 15 mM sodium acetate) at a flow rate of 10 mL/min. The gradient was 0−100% lithium perchlorate over 50 min at 60 °C. Under these conditions, the desired peaks eluted at roughly 23 min. The purified oligonucleotides were desalted using Glen-pack desalt columns (Glen Research), and their masses were confirmed by high-resolution LC-MS.
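As a rough worked example of the purification gradient described above, the following sketch estimates the mobile-phase composition at the reported elution time. It assumes an ideal linear gradient with no dwell volume or delay, which the text does not state; the function names are hypothetical.

```python
def percent_b(t_min, gradient_min=50.0, start_pct=0.0, end_pct=100.0):
    """Percent of solution B at time t_min for an ideal linear gradient.

    Assumption: no dwell volume or mixing delay is modelled.
    """
    fraction = min(max(t_min / gradient_min, 0.0), 1.0)
    return start_pct + (end_pct - start_pct) * fraction


def liclo4_molar(t_min, stock_molar=0.5):
    """Approximate lithium perchlorate concentration (M) reaching the column."""
    return stock_molar * percent_b(t_min) / 100.0


pct_at_elution = percent_b(23.0)      # about 46% solution B
conc_at_elution = liclo4_molar(23.0)  # about 0.23 M lithium perchlorate
```

Under these assumptions, the desired peaks elute at roughly 46% solution B, that is, near 0.23 M lithium perchlorate.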
Annealing conditions
Rapid or ‘snap-cool’ annealing (RA) involves heating the sample to 90 °C over 5 min and placing it immediately afterwards on ice for 10 min.
Slow-annealing (SA) involves heating the sample to 90 °C over 5 min, cooling it over 3 h to room temperature, and placing it in the fridge overnight.
Native polyacrylamide gel electrophoresis (PAGE)
Oligonucleotide samples were analyzed using native gels consisting of acrylamide/bis 19:1 (20%), 10 mM sodium phosphate pH 5, and 1× TAE (Tris-Acetate-EDTA). The final gel mix solution was adjusted to pH 5 prior to casting.
Oligonucleotide samples (range of 50–200 μM, as specified in the main text) were annealed in 10 mM sodium phosphate pH 5, with a final temperature of 5 °C (by either snap-cooling or slow-annealing methods). They were mixed with 50% glycerol to attain a final glycerol concentration of ~11.5% before loading them in the gels.
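The glycerol dilution above follows from a simple mass balance. The sketch below shows the arithmetic; the 10 μL sample volume is a hypothetical value chosen only for illustration, and the function names are not from the original protocol.

```python
def stock_volume_needed(sample_ul, stock_pct=50.0, target_pct=11.5):
    """Volume of glycerol stock (uL) to add to a sample to reach target_pct.

    Solves stock_pct * v = target_pct * (sample_ul + v) for v.
    """
    return sample_ul * target_pct / (stock_pct - target_pct)


def glycerol_pct_after_mixing(sample_ul, stock_ul, stock_pct=50.0):
    """Final glycerol percentage after mixing sample and stock."""
    return stock_pct * stock_ul / (sample_ul + stock_ul)


# Hypothetical 10 uL sample: about 3 uL of 50% glycerol is needed.
needed_ul = stock_volume_needed(10.0)
final_pct = glycerol_pct_after_mixing(10.0, 3.0)  # about 11.5%
```

Adding 3 μL of 50% glycerol to a 10 μL sample gives about 11.5% glycerol, and it also dilutes the oligonucleotide by a factor of 10/13.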
The gels were cast, loaded with sample, and run at 19 V/cm for 2 h at 7 °C, using 1× TAE pH 5 as the running buffer. Gel results were visualized by UV-shadowing. The oligonucleotide controls were dT12 and dT24 strands. Xylene cyanol and bromothymol blue dyes were used to monitor the progress of electrophoresis.
Circular dichroism (CD) spectroscopy
CD studies were performed on a Chirascan VX spectrometer using a 1 mm path-length cuvette. Temperature was maintained at 5 °C (unless otherwise specified) using the Peltier unit within the instrument. Spectra were recorded from 320–220 nm at a scan rate of 100 nm/min with three acquisitions recorded for each spectrum. The buffer spectrum was subtracted from each sample’s spectrum, and the sample spectra were subsequently smoothed using the Savitzky-Golay function within the Chirascan graphing software. Oligonucleotide solutions for CD measurements were 100 μM in concentration and were prepared in 10 mM sodium phosphate buffer pH 5.0. Acquisitions were conducted after snap-cooling or slow-annealing conditions.
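The buffer subtraction and smoothing steps can be sketched as follows. This is an illustration only: the window length and polynomial order used by the Chirascan software are not stated, so a 5-point quadratic Savitzky-Golay filter is assumed here, and the function names are hypothetical.

```python
# Standard 5-point quadratic/cubic Savitzky-Golay smoothing weights (divide by 35).
SG5_WEIGHTS = (-3.0, 12.0, 17.0, 12.0, -3.0)
SG5_NORM = 35.0


def subtract_buffer(sample, buffer):
    """Point-by-point subtraction of the buffer spectrum from a sample spectrum."""
    if len(sample) != len(buffer):
        raise ValueError("spectra must be sampled at the same wavelengths")
    return [s - b for s, b in zip(sample, buffer)]


def savgol5(values):
    """Smooth interior points with a 5-point Savitzky-Golay filter.

    The two points at each end are left unsmoothed (an assumption of this sketch).
    """
    out = list(values)
    for i in range(2, len(values) - 2):
        window = values[i - 2:i + 3]
        out[i] = sum(w * v for w, v in zip(SG5_WEIGHTS, window)) / SG5_NORM
    return out


# Toy data: a parabola plus a constant buffer offset.
raw = [x * x + 0.5 for x in range(-4, 5)]
baseline = [0.5] * len(raw)
smoothed = savgol5(subtract_buffer(raw, baseline))
```

A quadratic filter of this kind leaves a purely quadratic band shape unchanged at interior points, so it removes high-frequency noise while largely preserving peak height.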
NMR spectroscopy
Samples for NMR experiments were dissolved in 10 mM sodium phosphate aqueous buffer containing 10% D2O. The pH of the samples was adjusted to 5 by adding concentrated HCl solution. Oligonucleotide concentrations were 150 μM, except for the SA ON4b and HC-3 samples, which were at 500 μM. All NMR spectra were acquired on Bruker spectrometers operating at 600 MHz and equipped with cryoprobes with 1H and 19F channels. NMR data were processed using TOPSPIN software. One-dimensional 1H-NMR spectra were acquired using excitation sculpting for water suppression. Decoupling of 19F in 1D 1H-NMR experiments was achieved with the same pulse program as the coupled experiments, but with a 180° radiofrequency pulse applied during acquisition. NOESY spectra were acquired at mixing times of 150 and 250 ms. NOESY and DQF-COSY experiments used excitation sculpting for water suppression and a selective pulse during acquisition for 19F decoupling. TOCSY spectra were recorded with the DIPSI-2 sequence and a mixing time of 80 ms. 1H-19F-HOESY experiments were set with detection in 19F, decoupling in both dimensions, and a mixing time of 150 ms. The software NMRFAM-Sparky was used for the assignment of NOESY and HOESY cross-peaks and quantitative evaluation of the signal intensities55.
NMR restraints for structure determination
Distance restraints were obtained from intensities of the signals in the NOESY and HOESY spectra. NOE and HOE intensities were classified as very strong, strong, medium, weak or very weak, and distances were restrained to 3, 3.5, 4, 4.5 or 5 Å, respectively. In addition to these restraints derived from NMR spectra, target values for distances and angles related to hydrogen bonds in base pairs were set to values obtained from related structures determined by X-ray crystallography56. Force constants were set to 29 kcal/mol∙Å2 and 20 kcal/mol∙Å2 for base-pair hydrogen bonds and experimental distance restraints, respectively. The angular restraints for dihedral angles were obtained from qualitative analysis of intra-residual H1′-H2′ (for araF-C residues) and H1′-H2′ and H1′-H2′′ (for the dT residue) cross-peaks in the DQF-COSY spectra. Loose values of ν0, ν1 and ν2 were set to restrain the deoxyribose and 2′F-arabinose conformations to the north or south domains, which is equivalent to restraining the sugar pseudorotation phase angles to between 0° and 36° for the north conformation and between 144° and 180° for the south conformation. The H6-F2′ distance and C6-H6···F2′ angle of residues C3 and C5 were set to 2.0 Å and higher than 100°, respectively.
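The intensity classes and pucker ranges above translate naturally into a lookup table and a range check. The following is a minimal sketch of that bookkeeping, not the restraint files used in this work; the names are hypothetical.

```python
# Restraint distance (in angstroms) for each NOE/HOE intensity class, as listed above.
NOE_DISTANCE_A = {
    "very strong": 3.0,
    "strong": 3.5,
    "medium": 4.0,
    "weak": 4.5,
    "very weak": 5.0,
}


def noe_distance(intensity_class):
    """Return the restraint distance for a qualitative intensity class."""
    return NOE_DISTANCE_A[intensity_class.strip().lower()]


def pucker_domain(phase_deg):
    """Classify a pseudorotation phase angle against the restrained ranges."""
    p = phase_deg % 360.0
    if 0.0 <= p <= 36.0:
        return "north"
    if 144.0 <= p <= 180.0:
        return "south"
    return "outside restrained ranges"


example_distance = noe_distance("Medium")  # 4.0
example_domain = pucker_domain(160.0)      # "south"
```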
Structure determination
Three oligonucleotide structures were initially calculated with the program CYANA 3.0 (ref. 57) and used as starting points for further refinement with the SANDER module of the molecular dynamics package AMBER 18 (ref. 58). Each refined structure was placed in the center of a cubic water box sized to ensure a minimum distance of 8 Å between any solute atom and the edge of the box. Five sodium counterions were added to neutralize the total charge of the system. The BSC1 force field59 and suitable parameters for araF-C were used to describe the oligonucleotide. The TIP3P model was used to simulate water molecules60. Hemiprotonated C:C+ base pairs were modeled as base pairs formed between neutral and protonated cytosines, the parameters of which are included in the BSC1 force field.
The system was minimized through 4000 steps of the steepest descent algorithm, followed by 16,000 steps of conjugate gradient. The cartesian coordinates of the oligonucleotide were restrained by applying a harmonic potential of 100 kcal/mol∙Å2. The system was slowly heated at constant volume from 0 to 298.15 K using the Berendsen thermostat61, keeping the same cartesian restraints as for the minimization. Then, five NPT equilibrations of 20 ps each were performed, during which the cartesian restraints on the oligonucleotide atoms were decreased from 100 to 5 kcal/mol∙Å2. Finally, 5 ns of restrained Molecular Dynamics (rMD) were run using the NMR restraints described in the previous section.
Equilibration and rMD runs were simulated at a pressure of 1 bar and a temperature of 298.15 K, applying the Berendsen barostat and thermostat algorithms61. For all simulations, an integration step of 2 fs was used, and the SHAKE algorithm was applied to constrain the bonds involving hydrogen atoms. The particle mesh Ewald method was used to evaluate long-range electrostatic interactions62. The same protocol was applied to all three simulations starting from the different CYANA structures. Final structures were obtained by extracting ten structures from the rMD trajectories and relaxing them further while keeping the same restraints used during the rMD simulations. Analysis of the representative structures was carried out with the programs MOLMOL63, x3DNA64, and Pymol65.
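For orientation, the stated run lengths and the 2 fs integration step fix the number of MD steps per stage, as the sketch below shows. The intermediate restraint weights of the five equilibration stages are not given in the text; the evenly spaced schedule here is purely an assumption for illustration, and all names are hypothetical.

```python
STEP_FS = 2  # integration step in femtoseconds, as stated in the text


def n_steps(duration_fs):
    """Number of MD steps needed to cover duration_fs femtoseconds."""
    if duration_fs % STEP_FS:
        raise ValueError("duration is not a whole number of integration steps")
    return duration_fs // STEP_FS


EQUILIBRATION_STEPS = n_steps(20_000)    # 20 ps per NPT equilibration stage
RMD_STEPS = n_steps(5_000_000)           # 5 ns restrained MD production run


def restraint_schedule(start=100.0, end=5.0, stages=5):
    """Evenly spaced restraint weights from start to end.

    ASSUMPTION: only the end points (100 and 5) are given in the text;
    the linear spacing of the intermediate stages is hypothetical.
    """
    delta = (end - start) / (stages - 1)
    return [start + i * delta for i in range(stages)]


schedule = restraint_schedule()  # 100.0, 76.25, 52.5, 28.75, 5.0
```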
Quantum Mechanics (QM) calculations
Five representative structures, obtained as described above, were used to evaluate the potential presence of 2′F···H intra- and inter-residual hydrogen bonds. To do so at an accessible computational cost, we built two model systems for each of the five structures: one consisting of araF-cytosines C2, C6, C2′, C6′ (“ ′” denotes the strand containing the protonated araF-Cs) and another one comprising araF-cytosines C3, C5, C3′, C5′. The PO42- phosphate groups at the 3′ and 5′ positions were removed in the four nucleotides; the sugar 3′ ends were capped with a hydroxyl group and the 5′ ends with a hydrogen atom (Supplementary Figure 14). C2-C2′, C6-C6′ and C3-C3′, C5-C5′ base pair hydrogen bonds, as well as base-stacking interactions, contribute to the stability of the two model systems.
T1 and T1′ were excluded from QM calculations as they lack a fluorine atom and an amino group and, consequently, cannot form either intra- or inter-residual 2′F···H hydrogen bonds. C4 and C4′ were also excluded, as the position of their fluorine atoms rules out the formation of any inter-residual 2′F···H hydrogen bond.
All the model systems were optimized using Gaussian 16 Rev. B.01 electronic structure software66. To do so, the M06-2X67 functional and the 6-31G(d) basis set were applied. During the optimization, the positions of the O3′ and C5′ atoms were kept frozen. The optimized structures were used to perform a non-covalent interaction (NCI) analysis and detect intra- and inter-residual 2′F···H hydrogen bonds. The NCI approach is a method derived from the “atoms in molecules” (AIM) theory that defines chemical bonds on the basis of topological characteristics of the electron density ρ (ref. 68). The quantity evaluated by the NCI analysis is the reduced density gradient s, where
$$s = \frac{1}{2\,(3\pi^{2})^{1/3}}\,\frac{|\nabla\rho|}{\rho^{4/3}}$$
The quantity s will assume large values in the density tails far from the nuclei, where ρ is decreasing exponentially. On the other hand, it will assume small values close to the nuclei (large density and a gradient approaching zero). Finally, s will vanish at AIM critical points (CPs), where ∇ρ = 0, but also at so-called “non-AIM-CPs”, in which ∇ρ ≈ 0 (ref. 31). If the densities of two atoms overlap, e.g., in the presence of covalent bonds, the total density will exhibit a typical AIM-CP. On the other hand, if the densities overlap in a region in which the exponential decay is asymptotic, the total density may feature a non-AIM-CP, indicating the presence of NCIs31.
Pairs of residues containing the following cytosines were extracted from previously optimized structures: C2-C6′, C3-C5′, C6-C2′ and C5-C3′. For each of them, the electronic wave function was evaluated using the M06-2X functional and the 6-311+G(d) basis set. NCI plots, reporting the reduced density gradient (s) as a function of sign(λ2)*ρ (where λ2 is the second eigenvalue of the Hessian of ρ), were calculated using NCIPLOT4 (ref. 69). Note that sign(λ2) is negative for attractive interactions and positive for repulsive ones. Multiplying ρ by sign(λ2) is particularly useful in the presence of hydrogen bonds (sign(λ2) < 0), which may otherwise overlap with repulsive steric clashes (sign(λ2) > 0) if s is plotted only as a function of ρ. ρ values close to zero are typical of weak dispersion interactions, while hydrogen bonds are generally characterized by larger values47.
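To make the behaviour of the two NCI descriptors concrete, the sketch below evaluates the reduced density gradient, s = |∇ρ| / [2(3π²)^(1/3) ρ^(4/3)], and sign(λ2)ρ for a toy density. This is not the NCIPLOT4 workflow or the M06-2X density used here: the density is a hypothetical sum of two hydrogen-like 1s densities placed 4 bohr apart, derivatives are taken by finite differences, and the Hessian eigenvalues are read from its diagonal, which is valid only for points on the internuclear axis of this symmetric model.

```python
import math

# Hypothetical model: two hydrogen-like 1s densities on the z axis (atomic units).
CENTERS = [(0.0, 0.0, -2.0), (0.0, 0.0, 2.0)]


def rho(p):
    """Toy promolecular density: sum of exp(-2r)/pi over both centres."""
    return sum(math.exp(-2.0 * math.dist(p, c)) / math.pi for c in CENTERS)


def shifted(p, axis, h):
    """Copy of point p displaced by h along one cartesian axis."""
    q = list(p)
    q[axis] += h
    return tuple(q)


def gradient(p, h=1e-3):
    """Central finite-difference gradient of rho."""
    return [(rho(shifted(p, a, h)) - rho(shifted(p, a, -h))) / (2.0 * h)
            for a in range(3)]


def hessian_diagonal(p, h=1e-3):
    """Diagonal Hessian elements; these are the eigenvalues only on the z axis."""
    r0 = rho(p)
    return [(rho(shifted(p, a, h)) - 2.0 * r0 + rho(shifted(p, a, -h))) / h ** 2
            for a in range(3)]


def reduced_gradient(p):
    """Reduced density gradient s at point p."""
    norm = math.sqrt(sum(g * g for g in gradient(p)))
    return norm / (2.0 * (3.0 * math.pi ** 2) ** (1.0 / 3.0) * rho(p) ** (4.0 / 3.0))


def signed_density(p):
    """sign(lambda2) * rho, with lambda2 the middle Hessian eigenvalue."""
    lam2 = sorted(hessian_diagonal(p))[1]
    return math.copysign(1.0, lam2) * rho(p)


midpoint = (0.0, 0.0, 0.0)
s_mid = reduced_gradient(midpoint)                # vanishes at the critical point
sd_mid = signed_density(midpoint)                 # negative: attractive-type contact
s_near = reduced_gradient((0.0, 0.0, 3.0))        # 1 bohr from a nucleus
s_tail = reduced_gradient((0.0, 0.0, 12.0))       # far in the density tail
```

In this model, s vanishes at the midpoint between the two centres, sign(λ2)ρ is negative there, and s grows steadily on moving into the density tail, which is the qualitative pattern an NCI plot exploits.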
UV-Vis spectroscopy and global analysis of TH profiles
UV-based thermal denaturation experiments were performed on a Cary 100 UV-Vis spectrophotometer equipped with a Peltier temperature controller. Samples were prepared in 10 mM sodium phosphate buffer (pH 5) at either 50 or 250 μM concentrations, and acquisitions were performed at 265 nm in 1 mm path-length cuvettes. Absorbance values were acquired between 5 and 85 °C at two scanning rates: 0.5 °C/min and 5.0 °C/min. The total change in absorbance depended on the scanning rate applied. We hypothesize that this is because the dimeric species has a different absorption coefficient than the tetrameric species; for the purpose of this analysis, we assumed that the dimer had an absorption coefficient equal to half that of the tetrameric species and that the trimer had an absorption coefficient equal to three-quarters that of the tetrameric species. The TH profiles were globally fit to different models for the i-motif folding pathway. The mathematical relationships and assumptions made are elaborated upon in the Supplementary Information. Kinetic and thermodynamic parameters obtained from TH traces are in Supplementary Table 5.
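Two pieces of arithmetic from this paragraph can be sketched briefly: the duration of each temperature ramp, and the relative weighting of the multimeric species in the absorbance signal. The fitted model itself is described in the Supplementary Information and is not reproduced here. The sketch assumes that the stated coefficients apply per complex and that concentrations are complex concentrations; both are assumptions of this illustration, and the names are hypothetical.

```python
# Absorption coefficients relative to the tetramer, as assumed in the text.
RELATIVE_COEFFICIENT = {"dimer": 0.5, "trimer": 0.75, "tetramer": 1.0}


def scan_minutes(rate_c_per_min, t_low=5.0, t_high=85.0):
    """Time needed for one heating or cooling ramp at the given scan rate."""
    return (t_high - t_low) / rate_c_per_min


def relative_signal(complex_conc):
    """Weighted signal in units of (tetramer coefficient x path length).

    ASSUMPTION: complex_conc maps species name to the concentration of
    complexes, not of strands.
    """
    return sum(RELATIVE_COEFFICIENT[name] * conc
               for name, conc in complex_conc.items())


slow_ramp = scan_minutes(0.5)   # 160 min per ramp
fast_ramp = scan_minutes(5.0)   # 16 min per ramp
signal = relative_signal({"dimer": 10.0, "tetramer": 5.0})  # 10.0
```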
Reporting summary
Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.
Acknowledgements
We acknowledge Dr. Javier López Prados, the manager of the NMR facility at the Biomolecular Interactions Platform at cicCartuja (Seville, Spain) for his contribution to obtaining high-resolution 19F- and 1H-NMR data. This investigation was supported by research grants from the Spanish “Ministerio de Ciencia e Innovación” [PID2020-116620GB-I00, to C.G., RTI2018-096704-B-100, and PID2021-122478NB-I00 to M.O.], the Center of Excellence for HPC H2020 European Commission, “BioExcel-2 Centre of Excellence for Computational Biomolecular Research” [823830], the European Regional Development Fund under the framework of the ERFD Operative Program for Catalunya, the Catalan Government AGAUR [SGR2017-134], the European Union Marie Sklodowska Curie Action [799693, to M.G.], the Canadian Natural Sciences and Engineering Research Council of Canada (Discovery Grant, to M.J.D. and A.K.M.), and the Fonds de Recherche du Québec—Nature et Technologies Doctoral Scholarship (DE, to R.E.K). The IRB Barcelona is the recipient of a Severo Ochoa Award of Excellence from the MINECO. M.O. is an ICREA Academy scholar.
Author contributions
R.E.K. and M.G. designed the concept of the manuscript and wrote it. R.E.K. prepared the oligonucleotide samples and characterized them by gel electrophoresis, CD spectroscopy, and UV-vis spectroscopy. V.M. carried out the computational calculations and contributed to writing and editing the manuscript. C.H. carried out thermal hysteresis analyses and kinetic simulations and contributed to writing and editing the manuscript. T.M. supervised thermal hysteresis and kinetic simulation analyses and contributed to editing the manuscript. M.O. supervised the computational analyses and contributed to editing the manuscript. C.G. supervised the project and contributed to editing the manuscript. M.G. supervised the project and carried out NMR spectroscopy and structure determination. M.J.D. supervised the project and contributed to editing the manuscript.
Peer review
Peer review information
Communications Chemistry thanks the anonymous reviewers for their contribution to the peer review of this work. Peer reviewer reports are available.
Data availability
Atomic coordinates and structure factors for the reported nucleic acid structure have been deposited with the Protein Data Bank under accession number 7ZYX.
Competing interests
All authors declare no competing interest.
Footnotes
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
These authors jointly supervised this work: Miguel Garavís, Masad J. Damha.
Contributor Information
Miguel Garavís, Email: mgaravis@iqfr.csic.es.
Masad J. Damha, Email: masad.damha@mcgill.ca
Supplementary information
The online version contains supplementary material available at 10.1038/s42004-023-00831-7.
References
- 1. Grün JT, Schwalbe H. Folding dynamics of polymorphic G-quadruplex structures. Biopolymers. 2022;113:e23477. doi: 10.1002/bip.23477.
- 2. Leroy J-L. The formation pathway of i-motif tetramers. Nucleic Acids Res. 2009;37:4127–4134. doi: 10.1093/nar/gkp340.
- 3. Khamari L, Mukherjee S. Deciphering the nanoconfinement effect on the folding pathway of c-MYC promoter-based intercalated-motif DNA by single-molecule Förster resonance energy transfer. J. Phys. Chem. Lett. 2022;13:8169–8176. doi: 10.1021/acs.jpclett.2c01893.
- 4. Lieblein AL, Buck J, Schlepckow K, Fürtig B, Schwalbe H. Time-resolved NMR spectroscopic studies of DNA i-motif folding reveal kinetic partitioning. Angew. Chem. Int. Ed. 2012;51:250–253. doi: 10.1002/anie.201104938.
- 5. Lieblein AL, Fürtig B, Schwalbe H. Optimizing the kinetics and thermodynamics of DNA i-motif folding. ChemBioChem. 2013;14:1226–1230. doi: 10.1002/cbic.201300284.
- 6. Cao Y, et al. Formation and dissociation of the interstrand i-motif by the sequences d(XnC4Ym) monitored with electrospray ionization mass spectrometry. J. Am. Soc. Mass. Spectrom. 2015;26:994–1003. doi: 10.1007/s13361-015-1093-2.
- 7. Garabedian A, et al. Structures of the kinetically trapped i-motif DNA intermediates. Phys. Chem. Chem. Phys. 2016;18:26691–26702. doi: 10.1039/C6CP04418B.
- 8. Dhakal S, et al. Coexistence of an ILPR i-motif and a partially folded structure with comparable mechanical stability revealed at the single-molecule level. J. Am. Chem. Soc. 2010;132:8991–8997. doi: 10.1021/ja100944j.
- 9. Abou Assi H, Garavís M, González C, Damha MJ. i-Motif DNA: structural features and significance to cell biology. Nucleic Acids Res. 2018;46:8038–8056. doi: 10.1093/nar/gky735.
- 10. Benabou S, Aviñó A, Eritja R, González C, Gargallo R. Fundamental aspects of the nucleic acid i-motif structures. RSC Adv. 2014;4:26956–26980. doi: 10.1039/C4RA02129K.
- 11. Day HA, Pavlou P, Waller ZAE. i-Motif DNA: Structure, stability and targeting with ligands. Bioorg. Med. Chem. 2014;22:4407–4418. doi: 10.1016/j.bmc.2014.05.047.
- 12. Mustafa G, et al. A single molecule investigation of i-motif stability, folding intermediates, and potential as in-situ pH sensor. Front. Mol. Biosci. 2022;9:977113. doi: 10.3389/fmolb.2022.977113.
- 13. Dzatko S, et al. Evaluation of the stability of DNA i-motifs in the nuclei of living mammalian cells. Angew. Chem. Int. Ed. 2018;57:2165–2169. doi: 10.1002/anie.201712284.
- 14. Zeraati M, et al. I-motif DNA structures are formed in the nuclei of human cells. Nat. Chem. 2018;10:631–637. doi: 10.1038/s41557-018-0046-3.
- 15. King JJ, et al. DNA G-quadruplex and i-motif structure formation is interdependent in human cells. J. Am. Chem. Soc. 2020;142:20600–20604. doi: 10.1021/jacs.0c11708.
- 16. Kang H-J, Kendrick S, Hecht SM, Hurley LH. The transcriptional complex between the BCL2 i-motif and hnRNP LL is a molecular switch for control of gene expression that can be modulated by small molecules. J. Am. Chem. Soc. 2014;136:4172–4185. doi: 10.1021/ja4109352.
- 17. Takahashi S, Brazier JA, Sugimoto N. Topological impact of noncanonical DNA structures on Klenow fragment of DNA polymerase. Proc. Natl Acad. Sci. USA. 2017;114:9605–9610. doi: 10.1073/pnas.1704258114.
- 18. Chen Y, et al. Insights into the biomedical effects of carboxylated single-wall carbon nanotubes on telomerase and telomeres. Nat. Commun. 2012;3:1074. doi: 10.1038/ncomms2091.
- 19. Kendrick S, et al. The dynamic character of the BCL2 promoter i-motif provides a mechanism for modulation of gene expression by compounds that bind selectively to the alternative DNA hairpin structure. J. Am. Chem. Soc. 2014;136:4161–4171. doi: 10.1021/ja410934b.
- 20. Dong Y, Yang Z, Liu D. DNA nanotechnology based on i-Motif structures. Acc. Chem. Res. 2014;47:1853–1860. doi: 10.1021/ar500073a.
- 21. Modi S, et al. A DNA nanomachine that maps spatial and temporal pH changes inside living cells. Nat. Nanotechnol. 2009;4:325–330. doi: 10.1038/nnano.2009.83.
- 22. Cheng E, et al. A pH-triggered, fast-responding DNA hydrogel. Angew. Chem. Int. Ed. 2009;48:7660–7663. doi: 10.1002/anie.200902538.
- 23. Gehring K, Leroy J-L, Guéron M. A tetrameric DNA structure with protonated cytosine-cytosine base pairs. Nature. 1993;363:561–565. doi: 10.1038/363561a0.
- 24. Abou Assi H, El-Khoury R, González C, Damha MJ. 2′-Fluoroarabinonucleic acid modification traps G-quadruplex and i-motif structures in human telomeric DNA. Nucleic Acids Res. 2017;45:11535–11546. doi: 10.1093/nar/gkx838.
- 25. Abou Assi H, Lin YC, Serrano I, González C, Damha MJ. Probing synergistic effects of DNA methylation and 2′-β-fluorination on i-motif stability. Chem. Eur. J. 2018;24:471–477. doi: 10.1002/chem.201704591.
- 26. Assi HA, et al. Stabilization of i-motif structures by 2′-β-fluorination of DNA. Nucleic Acids Res. 2016;44:4998–5009. doi: 10.1093/nar/gkw402.
- 27. Garavís M, Escaja N, Gabelica V, Villasante A, González C. Centromeric alpha-satellite DNA adopts dimeric i-motif structures capped by AT Hoogsteen base pairs. Chem. Eur. J. 2015;21:9816–9824. doi: 10.1002/chem.201500448.
- 28. Lane JR, Contreras-García J, Piquemal J-P, Miller BJ, Kjaergaard HG. Are bond critical points really critical for hydrogen bonding? J. Chem. Theory Comput. 2013;9:3263–3266. doi: 10.1021/ct400420r.
- 29. Mishra SK, Suryaprakash N. Intramolecular hydrogen bonding involving organic fluorine: NMR investigations corroborated by DFT-based theoretical calculations. Molecules. 2017;22:423. doi: 10.3390/molecules22030423.
- 30. Narth, C. et al. in Applications of topological methods in molecular chemistry (eds Remi Chauvin, Christine Lepetit, Bernard Silvi, & Esmail Alikhani) 491–527 (Springer International Publishing, 2016).
- 31. Laplaza R, et al. NCIPLOT and the analysis of noncovalent interactions using the reduced density gradient. WIREs Comput. Mol. Sci. 2021;11:e1497. doi: 10.1002/wcms.1497.
- 32. Bevington P. & Robinson, D. K. Data reduction and error analysis for the physical sciences. (McGraw-Hill, 2003).
- 33. Anzahaee MY, Watts JK, Alla NR, Nicholson AW, Damha MJ. Energetically Important C−H···F−C pseudohydrogen bonding in water: evidence and application to rational design of oligonucleotides with high binding affinity. J. Am. Chem. Soc. 2011;133:728–731. doi: 10.1021/ja109817p.
- 34. Martin-Pintado N, et al. Backbone FC−H···O hydrogen bonds in 2′F-substituted nucleic acids. Angew. Chem. Int. Ed. 2013;52:12065–12068. doi: 10.1002/anie.201305710.
- 35. Martín-Pintado N, et al. Dramatic effect of furanose C2′ substitution on structure and stability: directing the folding of the human telomeric quadruplex with a single fluorine atom. J. Am. Chem. Soc. 2013;135:5344–5347. doi: 10.1021/ja401954t.
- 36. Egli M. The steric hypothesis for DNA replication and fluorine hydrogen bonding revisited in light of structural data. Acc. Chem. Res. 2012;45:1237–1246. doi: 10.1021/ar200303k.
- 37. Bergstrom DE, Swartling DJ, Wisar A, Hoffmann MR. Evaluation of thymidine, dideoxythymidine and fluorine substituted deoxyribonucleoside geometry by the MINDO/3 technique the effect of fluorine substitution on nucleoside geometry and biological activity. Nucleosides Nucleotides. 1991;10:693–697. doi: 10.1080/07328319108046575.
- 38. Dickerhoff J, Weisz K. Fluorine-mediated editing of a G-quadruplex folding pathway. ChemBioChem. 2018;19:927–930. doi: 10.1002/cbic.201800099.
- 39. Trempe J-F, et al. NMR solution structure of an oligonucleotide hairpin with a 2′F-ANA/RNA stem: implications for RNase H specificity toward DNA/RNA hybrid duplexes. J. Am. Chem. Soc. 2001;123:4896–4903. doi: 10.1021/ja003859p.
- 40. Denisov AY, et al. Solution structure of an arabinonucleic acid (ANA)/RNA duplex in a chimeric hairpin: comparison with 2′-fluoro-ANA/RNA and DNA/RNA hybrids. Nucleic Acids Res. 2001;29:4284–4293. doi: 10.1093/nar/29.21.4284.
- 41. Martín-Pintado N, et al. The solution structure of double helical arabino nucleic acids (ANA and 2′F-ANA): effect of arabinoses in duplex-hairpin interconversion. Nucleic Acids Res. 2012;40:9329–9339. doi: 10.1093/nar/gks672.
- 42. El-Khoury R, Damha MJ. 2′-Fluoro-arabinonucleic acid (FANA): a versatile tool for probing biomolecular interactions. Acc. Chem. Res. 2021;54:2287–2297. doi: 10.1021/acs.accounts.1c00125.
- 43. Watts JK, et al. Differential stability of 2′F-ANA•RNA and ANA•RNA hybrid duplexes: roles of structure, pseudohydrogen bonding, hydration, ion uptake and flexibility. Nucleic Acids Res. 2010;38:2498–2511. doi: 10.1093/nar/gkp1225.
- 44. Damha MJ, et al. Hybrids of RNA and arabinonucleic acids (ANA and 2′F-ANA) are substrates of ribonuclease H. J. Am. Chem. Soc. 1998;120:12976–12977. doi: 10.1021/ja982325+.
- 45. Haase L, Weisz K. Switching the type of V-loop in sugar-modified G-quadruplexes through altered fluorine interactions. Chem. Commun. 2020;56:4539–4542. doi: 10.1039/D0CC01285H.
- 46. O’Reilly D, et al. Exploring atypical fluorine–hydrogen bonds and their effects on nucleoside conformations. Chem. Eur. J. 2018;24:16432–16439. doi: 10.1002/chem.201803940.
- 47. Dalvit C, Invernizzi C, Vulpetti A. Fluorine as a hydrogen-bond acceptor: experimental evidence and computational calculations. Chem. Eur. J. 2014;20:11058–11068. doi: 10.1002/chem.201402858.
- 48. Jeffrey, G. A. & Saenger, W. Hydrogen Bonding in Biological Structures. (Springer-Verlag, 1994).
- 49. Rozas I, Alkorta I, Elguero J. Bifurcated hydrogen bonds: three-centered interactions. J. Phys. Chem. A. 1998;102:9925–9932. doi: 10.1021/jp9824813.
- 50. Ranallo S, Porchetta A, Ricci F. DNA-based scaffolds for sensing applications. Anal. Chem. 2019;91:44–59. doi: 10.1021/acs.analchem.8b05009.
- 51. Harroun SG, et al. Programmable DNA switches and their applications. Nanoscale. 2018;10:4607–4641. doi: 10.1039/C7NR07348H.
- 52. Koetting MC, Peters JT, Steichen SD, Peppas NA. Stimulus-responsive hydrogels: theory, modern advances, and applications. Mater. Sci. Eng.: R: Rep. 2015;93:1–49. doi: 10.1016/j.mser.2015.04.001.
- 53. Morya V, Walia S, Mandal BB, Ghoroi C, Bhatia D. Functional DNA based hydrogels: development, properties and biological applications. ACS Biomater. Sci. Eng. 2020;6:6021–6035. doi: 10.1021/acsbiomaterials.0c01125.
- 54. Lo PK, Metera KL, Sleiman HF. Self-assembly of three-dimensional DNA nanostructures and potential biological applications. Curr. Opin. Chem. Biol. 2010;14:597–607. doi: 10.1016/j.cbpa.2010.08.002.
- 55. Lee W, Tonelli M, Markley JL. NMRFAM-SPARKY: enhanced software for biomolecular NMR spectroscopy. Bioinformatics. 2014;31:1325–1327. doi: 10.1093/bioinformatics/btu830.
- 56. Cai L, et al. Intercalated cytosine motif and novel adenine clusters in the crystal structure of the Tetrahymena telomere. Nucleic Acids Res. 1998;26:4696–4705. doi: 10.1093/nar/26.20.4696.
- 57. Güntert P, Mumenthaler C, Wüthrich K. Torsion angle dynamics for NMR structure calculation with the new program Dyana. J. Mol. Biol. 1997;273:283–298. doi: 10.1006/jmbi.1997.1284.
- 58. AMBER (University of California, San Francisco, 2018).
- 59. Ivani I, et al. Parmbsc1: a refined force field for DNA simulations. Nat. Methods. 2016;13:55–58. doi: 10.1038/nmeth.3658.
- 60. Jorgensen WL, Chandrasekhar J, Madura JD, Impey RW, Klein ML. Comparison of simple potential functions for simulating liquid water. J. Chem. Phys. 1983;79:926–935. doi: 10.1063/1.445869.
- 61. Berendsen HJC, Postma JPM, Gunsteren WFV, DiNola A, Haak JR. Molecular dynamics with coupling to an external bath. J. Chem. Phys. 1984;81:3684–3690. doi: 10.1063/1.448118.
- 62. Darden T, York D, Pedersen L. Particle mesh Ewald: an N⋅log(N) method for Ewald sums in large systems. J. Chem. Phys. 1993;98:10089–10092. doi: 10.1063/1.464397.
- 63. Koradi R, Billeter M, Wüthrich K. MOLMOL: a program for display and analysis of macromolecular structures. J. Mol. Graph. 1996;14:51–55. doi: 10.1016/0263-7855(96)00009-4.
- 64. Lu X-J, Olson WK. 3DNA: a versatile, integrated software system for the analysis, rebuilding and visualization of three-dimensional nucleic-acid structures. Nat. Protoc. 2008;3:1213–1227. doi: 10.1038/nprot.2008.104.
- 65. The PyMOL Molecular Graphics System, Version 1.3 (Schrödinger, LLC).
- 66. Gaussian 16 Rev. C.01 (Wallingford, CT, 2016).
- 67. Zhao Y, Truhlar DG. The M06 suite of density functionals for main group thermochemistry, thermochemical kinetics, noncovalent interactions, excited states, and transition elements: two new functionals and systematic testing of four M06-class functionals and 12 other functionals. Theor. Chem. Acc. 2008;120:215–241. doi: 10.1007/s00214-007-0310-x.
- 68. Bader, R. F. W. Atoms in molecules: a quantum theory. (Clarendon Press, 1994).
- 69. Boto RA, et al. NCIPLOT4: fast, robust, and quantitative analysis of noncovalent interactions. J. Chem. Theory Comput. 2020;16:4150–4158. doi: 10.1021/acs.jctc.0c00063.


