Abstract
Vibrational spectroscopy, in particular infrared spectroscopy, has been widely used to probe the three-dimensional structures and conformational dynamics of nucleic acids. As commonly used chromophores, the C=O and C=C stretch modes in the nucleobases exhibit distinct spectral features for different base pairing and stacking configurations. To elucidate the origin of their structural sensitivity, in this work, we develop transition charge coupling (TCC) models that allow one to efficiently calculate the interactions or couplings between the C=O and C=C chromophores based on the geometric arrangements of the nucleobases. To evaluate their performances, we apply the TCC models to DNA and RNA oligonucleotides with a variety of secondary and tertiary structures and demonstrate that the predicted couplings are in quantitative agreement with the reference values. We further elucidate how the interactions between the paired and stacked bases give rise to characteristic IR absorption peaks and show that the TCC models provide more reliable predictions of the coupling constants as compared to the transition dipole coupling scheme. The TCC models, together with our recently developed through-bond coupling constants and vibrational frequency maps, provide an effective theoretical strategy to model the vibrational Hamiltonian, and hence the vibrational spectra of nucleic acids in the base carbonyl stretch region directly from atomistic molecular simulations.
I. INTRODUCTION
Interactions between nucleobases play essential roles in maintaining the three-dimensional structures of nucleic acids and modulating their biological functions. Watson–Crick pairing of the purine and pyrimidine bases drives the association of complementary strands of DNA to form regular double helix structures. The alternative Hoogsteen hydrogen bonding arrangements allow nucleic acids to form triple helices and G-quadruplexes, which possibly regulate cellular pathways ranging from gene transcription to DNA replication and DNA–protein recognition.1–4 As common building blocks of the RNA structure, wobble base pairs such as those between guanine and uracil are of fundamental importance to the translation of the genetic code.5–7 These pairing schemes hold the nucleobases in planar configurations, which stack with one another through π–π interactions to construct a variety of functional structures of nucleic acids.
Vibrational spectroscopy, in particular infrared (IR) spectroscopy, provides a powerful tool to probe these interactions and unravel how they give rise to the characteristic structures and conformational dynamics of nucleic acids. These experiments often focus on the base carbonyl stretch modes, which absorb in the spectral region between 1600 cm−1 and 1800 cm−1, and are typically performed in D2O to avoid the interference of the H2O bending modes (∼1640 cm−1). As illustrated in Fig. 1, except adenine, all the other nucleobases contain at least one carbonyl group and their spectral features are highly sensitive to the hydrogen bonding patterns and base stacking configurations. For example, guanosine 5′-monophosphate has a strong C=O stretch peak at 1663 cm−1.8 Upon formation of Watson–Crick base pairs in double-stranded DNA, the absorption peak of guanine blue shifts to the 1678–1689 cm−1 region.9 Instead, when guanine uses the O6 and N7 atoms as hydrogen bonding sites to form Hoogsteen base pairs in DNA triple helices, its C=O stretch absorbs at a much higher frequency of 1715 cm−1.9 Due to this structural sensitivity, IR spectroscopy in the base carbonyl stretch region has been utilized to distinguish between A-, B-, and Z-form DNA double helices, identify G-C+ and A-T Hoogsteen base pairs in duplex DNA, and track the fast denaturation kinetics of RNA tetraloops.9–13
Advancements in two-dimensional IR (2D IR) spectroscopy have significantly enhanced the spectral resolution and enabled the detection of nucleic acid dynamics over a broad range of time scales.14–21 By spreading the IR absorption information over two frequency axes, 2D IR experiments directly probe the interactions, or couplings, between the vibrational modes. For example, in their pioneering studies, Zanni and co-workers applied 2D IR spectroscopy to A- and B-form DNA oligonucleotides, determined the intra- and interstrand couplings between the C=O groups in the guanine and cytosine bases, and elucidated the origin of the cross peaks in the spectra.22,23 More recently, Tokmakoff and co-workers monitored the cross peaks that arose from the hydrogen bonding interactions between Watson–Crick base pairs and unraveled the thermal dehybridization mechanism of model DNA double helices with base-pair-specific resolution.16 In addition to the interactions between carbonyl groups, they also observed intense cross peaks between the C=O and C=C stretches in the 2D IR spectra of the pyrimidine bases, revealing their coupled motions.8 Using 2D IR spectroscopy, Hunt and co-workers uncovered that a ring mode of adenine mainly associated with the stretch of its C5=C6 group strongly interacts with the C=O groups in thymine in the A-T base pairs of DNA duplexes.18,19,24
Given the importance of inter-base interactions, it is desirable to develop a theoretical scheme that efficiently predicts the coupling constants based on the structure of an oligonucleotide. To reduce the computational complexity, researchers often consider a vibrational subspace composed of all the chromophores of interest, in this case, the base C=O groups, and treat it quantum mechanically. We will also include the C5=C6 stretches in the vibrational subspace when dealing with the pyrimidine bases C, T, and U and the purine base A, considering their strong couplings with the C=O vibrations.8,18,19,24 The chromophores in the nucleobases are highlighted in Fig. 1. Within this subspace, the vibrational Hamiltonian of a system containing N chromophores is an N × N matrix, in which the diagonal elements are the site frequencies of each C=O or C=C stretch and the off-diagonal elements are the couplings between them. By invoking the Hessian Matrix Reconstruction (HMR) method, Cho and co-workers effectively obtained the vibrational Hamiltonian of nucleobases and base pairs and used them to simulate the 2D IR spectra of model DNA double helices.25–28 From the definition of the Hamiltonian elements, one can also calculate the coupling constants (in cm−1) as
$$\beta_{ij} = \left.\frac{\partial^2 V}{\partial x_i\,\partial x_j}\right|_{x_i = x_j = 0} \tag{1}$$
Here, xi and xj are the vibrational displacements of chromophores i and j, respectively, and V is the interaction energy between them. Using electronic structure calculations, both the HMR and analytical differentiation approaches naturally incorporate the through-bond couplings, which arise from the charge flow between covalently linked chromophores, and through-space couplings such as those between hydrogen bonded nucleobases. However, the application of these methods is limited by the high computational costs for performing such calculations on large nucleic acids in the condensed phase.
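The vibrational-subspace picture described above can be illustrated with a minimal sketch: a small Hamiltonian is assembled from site frequencies and couplings and then diagonalized. The function name `exciton_hamiltonian` and all numerical values are hypothetical and chosen only for illustration; they are not results from this work.

```python
import numpy as np

def exciton_hamiltonian(site_freqs, couplings):
    """Assemble an N x N vibrational Hamiltonian (cm^-1).

    site_freqs: length-N sequence of site frequencies (diagonal elements).
    couplings:  dict {(i, j): beta_ij} with i != j (off-diagonal elements).
    """
    h = np.diag(np.asarray(site_freqs, dtype=float))
    for (i, j), beta in couplings.items():
        h[i, j] = h[j, i] = beta  # the Hamiltonian is symmetric
    return h

# Hypothetical two-chromophore example (values are illustrative only).
h = exciton_hamiltonian([1660.0, 1650.0], {(0, 1): -8.0})
mode_freqs, mode_vectors = np.linalg.eigh(h)  # ascending eigenvalues
print(mode_freqs)
```

The coupling pushes the two normal-mode frequencies apart relative to the site frequencies, while their sum is conserved.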
In the modeling of proteins, it has been well established that the through-space couplings between the amide I vibrations, which are mainly the peptide bond C=O stretches, are dominated by electrostatic interactions.29–33 Similarly, as two nucleobases in the same or complementary strands of nucleic acids approach each other, the electron densities of their C=O and C=C groups interact electrostatically. When the distance between the chromophores is much larger than their sizes, the leading term in the interactions is between their transition dipole moments. This has led to the development of the transition dipole coupling (TDC) scheme,29,34 a simple model that has been extensively used in the study of the protein amide I band.31,35–39 When the TDC model is applied to the carbonyl groups in oligonucleotides, it underestimates the coupling constants between the hydrogen bonded G-C base pairs by about 50% as compared to those calculated from electronic structure methods or fitting to 2D IR experiments.23 This is not surprising because the distance between the C=O groups in a standard Watson–Crick G-C pair, which is about 5 Å, is comparable to the C=O bond length of 1.2 Å. Hence, it is likely that the dipole approximation is not valid for the closely spaced base pairs, consistent with the well-known phenomenon that the TDC scheme is not adequate to describe the couplings between adjacent peptide groups.31,36
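For reference, the standard point-dipole interaction that underlies the TDC scheme can be sketched as follows. This is the textbook dipole–dipole expression written in units where 1/(4πε0) = 1; the function name `tdc_coupling` and the unit dipoles are illustrative assumptions and are not parameters from this work.

```python
import numpy as np

def tdc_coupling(mu_i, mu_j, r_ij):
    """Point-dipole coupling in units where 1/(4*pi*eps0) = 1.

    mu_i, mu_j: transition dipole vectors of the two chromophores.
    r_ij:       vector connecting the two dipole positions.
    """
    mu_i, mu_j, r_ij = (np.asarray(a, dtype=float) for a in (mu_i, mu_j, r_ij))
    d = np.linalg.norm(r_ij)
    return (mu_i @ mu_j) / d**3 - 3.0 * (mu_i @ r_ij) * (mu_j @ r_ij) / d**5

# Two parallel unit dipoles, 2 length units apart (illustrative geometry).
side_by_side = tdc_coupling([0, 0, 1], [0, 0, 1], [2, 0, 0])  # 1/8
head_to_tail = tdc_coupling([0, 0, 1], [0, 0, 1], [0, 0, 2])  # 1/8 - 3/8
```

The sign change between the two arrangements shows how strongly the coupling depends on the relative orientation of the chromophores, even within the dipole approximation.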
To tackle this problem, a few approaches have been designed to incorporate higher-order multipoles in the expansion of Eq. (1). These include the transition density derivative distribution method and the transition charge coupling (TCC) scheme, which have been successfully applied to model the couplings between the protein amide I modes.31–33,38,39 In particular, the TCC method mimics the charge densities of the chromophores by assigning point charges and charge flows to individual atoms and provides an effective way to calculate the couplings directly from the distances and relative orientations of the chromophores.33 In this work, we extend the TCC scheme to nucleobases and determine the parameters for the C=O and C=C vibrations from density functional theory (DFT) calculations. We then evaluate the performance of the TCC models by applying them to a series of model oligonucleotides with characteristic secondary and tertiary structures and elucidate the origin of the structural sensitivity of IR spectroscopy. We further show that in comparison with the TDC scheme, the TCC models are more reliable in predicting the interactions between nucleobases.
II. THEORETICAL AND COMPUTATIONAL METHODS
A. The TCC scheme
The electron density in the vicinity of an oscillator changes when it vibrates around the equilibrium geometry. If a quantum harmonic oscillator with a reduced mass of μ vibrates at a frequency of ω, the amplitude of its vibration is40
$$A = \sqrt{\frac{\hbar}{2\mu\omega}} \tag{2}$$
where ℏ is the reduced Planck constant. One can displace an oscillator i by an amount of xiA along its normal mode coordinate, where xi is a dimensionless scalar, and the resulting charge density at position $\mathbf{r}$ is $\rho_i(\mathbf{r}; x_i)$. As two chromophores i and j move closer to each other, there is a strong electrostatic interaction between their charge densities $\rho_i$ and $\rho_j$. From Eq. (1), the vibrational coupling between chromophores i and j is31,33
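$$\beta_{ij} = \frac{1}{4\pi\varepsilon_0} \left.\frac{\partial^2}{\partial x_i\,\partial x_j} \iint \frac{\rho_i(\mathbf{r}; x_i)\,\rho_j(\mathbf{r}'; x_j)}{|\mathbf{r} - \mathbf{r}'|}\, d\mathbf{r}\, d\mathbf{r}' \right|_{x_i = x_j = 0} \tag{3}$$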
In this equation, ε0 is the vacuum permittivity, which takes the value of 8.854 × 10−22 F/Å, and the coupling constant βij is evaluated at the equilibrium geometry of the chromophores with xi = xj = 0. In a molecule, all the atoms involved in a normal mode move as the chromophore vibrates. As such, when a chromophore i is displaced by xi, the Cartesian coordinates of atom n in the chromophore move by $x_i \mathbf{v}_n$, where $\mathbf{v}_n$ is the component for atom n in the normalized normal mode coordinates multiplied by A [Eq. (2)].
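To give a sense of scale for the amplitude of a quantum harmonic oscillator, A = sqrt(ℏ/2μω), the sketch below evaluates it for a hypothetical diatomic-like C=O oscillator. The reduced mass (that of an isolated 12C–16O pair) and the 1700 cm−1 wavenumber are illustrative assumptions; the reduced masses used in this work come from the DFT normal mode analysis and may differ.

```python
import math

HBAR = 1.054571817e-34   # reduced Planck constant, J s
C_CM = 2.99792458e10     # speed of light, cm/s
AMU = 1.66053906660e-27  # atomic mass unit, kg

def vibrational_amplitude(reduced_mass_amu, wavenumber_cm):
    """Return A = sqrt(hbar / (2 * mu * omega)) in angstroms."""
    omega = 2.0 * math.pi * C_CM * wavenumber_cm  # angular frequency, rad/s
    mu = reduced_mass_amu * AMU                   # reduced mass, kg
    return math.sqrt(HBAR / (2.0 * mu * omega)) * 1.0e10

# Hypothetical C=O oscillator: mu = 12*16/28 amu, 1700 cm^-1.
amp = vibrational_amplitude(12.0 * 16.0 / 28.0, 1700.0)
print(f"A = {amp:.4f} angstrom")
```

Under these assumptions the amplitude is a few hundredths of an ångström, much smaller than the bond length itself.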
To simplify the evaluation of Eq. (3), one can approximate the charge density as the sum of the point charges qn(xi) on the atoms constituting the chromophore,33
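$$\rho_i(\mathbf{r}; x_i) = \sum_{n} q_n(x_i)\, \delta\!\left(\mathbf{r} - \mathbf{r}_n(x_i)\right) \tag{4}$$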
When chromophore i is in the equilibrium geometry, we assume that atom n is at the position of $\mathbf{r}_n^0$ with a partial charge of $q_n^0$. As the chromophore is displaced by xi, the position of atom n in Eq. (4) becomes $\mathbf{r}_n = \mathbf{r}_n^0 + x_i \mathbf{v}_n$. Its partial charge is taken as the first-order correction to the equilibrium value, $q_n = q_n^0 + dq_n\, x_i$, where $dq_n$ is the derivative of qn with respect to the displacement xi.33 Note that both $\mathbf{r}_n$ and qn are functions of xi because it determines the position of each atom in the chromophore, which, in turn, determines the charge distribution in a normal mode vibration.
One can insert Eq. (4) in Eq. (3) and obtain the coupling constant in the TCC model,
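$$\beta_{ij} = \frac{1}{4\pi\varepsilon_0} \left.\frac{\partial^2}{\partial x_i\,\partial x_j} \sum_{n,m} \frac{q_n(x_i)\, q_m(x_j)}{\left|\mathbf{r}_n(x_i) - \mathbf{r}_m(x_j)\right|} \right|_{x_i = x_j = 0} \tag{5}$$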
Here, the summation is over all possible combinations of atom n in chromophore i and atom m in chromophore j. Using the expressions of $\mathbf{r}_n$ and qn and taking the double derivative at xi = 0 and xj = 0, one can calculate βij analytically,33
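$$\beta_{ij} = \frac{1}{4\pi\varepsilon_0} \sum_{n,m} \left[ \frac{dq_n\, dq_m}{|\mathbf{r}_{nm}|} + \frac{dq_n\, q_m^0\, (\mathbf{v}_m \cdot \mathbf{r}_{nm}) - q_n^0\, dq_m\, (\mathbf{v}_n \cdot \mathbf{r}_{nm}) + q_n^0 q_m^0\, (\mathbf{v}_n \cdot \mathbf{v}_m)}{|\mathbf{r}_{nm}|^3} - \frac{3\, q_n^0 q_m^0\, (\mathbf{v}_n \cdot \mathbf{r}_{nm})(\mathbf{v}_m \cdot \mathbf{r}_{nm})}{|\mathbf{r}_{nm}|^5} \right] \tag{6}$$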
In Eq. (6), $\mathbf{r}_{nm} = \mathbf{r}_n^0 - \mathbf{r}_m^0$, which links atoms n and m when the chromophores are at their equilibrium geometries. Equation (6) can also be used in conjunction with molecular dynamics (MD) simulations to provide on-the-fly predictions of the coupling constants, and in this case, $\mathbf{r}_{nm}$ is taken as the displacement vector connecting atoms n and m in each snapshot of the simulation. Equation (6) provides an efficient way to compute the couplings between two chromophores based on their distances and relative orientations. To implement it, we need a parameter set of q0, dq, and $\mathbf{v}$ for all the atoms in the chromophores. We determine these parameters from DFT calculations of model molecules, as shown in Fig. 2. The procedure to obtain the TCC parameters is described in Sec. III A.
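The analytical double derivative of the point-charge interaction can be checked numerically. The sketch below is a minimal illustration, not the implementation used in this work: it assumes the linear forms r_n = r_n0 + x v_n and q_n = q_n0 + dq_n x described above, takes r_nm = r_n0 − r_m0, works in units where 1/(4πε0) = 1, and uses randomly generated (hypothetical) chromophores. The function names are illustrative.

```python
import numpy as np

def pair_energy(x_i, x_j, chrom_i, chrom_j):
    """Point-charge interaction energy for displacements x_i, x_j (1/(4*pi*eps0) = 1)."""
    r0_i, q0_i, dq_i, v_i = chrom_i
    r0_j, q0_j, dq_j, v_j = chrom_j
    r_i, q_i = r0_i + x_i * v_i, q0_i + x_i * dq_i
    r_j, q_j = r0_j + x_j * v_j, q0_j + x_j * dq_j
    dist = np.linalg.norm(r_i[:, None, :] - r_j[None, :, :], axis=-1)
    return float(np.sum(q_i[:, None] * q_j[None, :] / dist))

def tcc_coupling(chrom_i, chrom_j):
    """Analytical mixed second derivative of pair_energy at x_i = x_j = 0."""
    r0_i, q0_i, dq_i, v_i = chrom_i
    r0_j, q0_j, dq_j, v_j = chrom_j
    r = r0_i[:, None, :] - r0_j[None, :, :]      # r_nm = r_n0 - r_m0
    d = np.linalg.norm(r, axis=-1)
    vn_r = np.einsum("nk,nmk->nm", v_i, r)       # v_n . r_nm
    vm_r = np.einsum("mk,nmk->nm", v_j, r)       # v_m . r_nm
    vn_vm = v_i @ v_j.T                          # v_n . v_m
    qn, qm = q0_i[:, None], q0_j[None, :]
    dqn, dqm = dq_i[:, None], dq_j[None, :]
    terms = (dqn * dqm / d
             + (dqn * qm * vm_r - qn * dqm * vn_r + qn * qm * vn_vm) / d**3
             - 3.0 * qn * qm * vn_r * vm_r / d**5)
    return float(terms.sum())

def numerical_coupling(chrom_i, chrom_j, h=1e-3):
    """Central finite-difference estimate of the same mixed derivative."""
    e = lambda a, b: pair_energy(a, b, chrom_i, chrom_j)
    return (e(h, h) - e(h, -h) - e(-h, h) + e(-h, -h)) / (4.0 * h * h)

rng = np.random.default_rng(0)

def random_chromophore(n_atoms, origin):
    """Hypothetical chromophore: positions, charges, charge flows, displacements."""
    r0 = rng.uniform(-1.0, 1.0, (n_atoms, 3)) + np.asarray(origin, dtype=float)
    q0 = rng.uniform(-0.5, 0.5, n_atoms)
    dq = rng.uniform(-0.03, 0.03, n_atoms)
    v = rng.uniform(-0.5, 0.5, (n_atoms, 3))
    return r0, q0, dq, v

chrom_a = random_chromophore(3, (0.0, 0.0, 0.0))
chrom_b = random_chromophore(4, (5.0, 0.0, 0.0))
beta_analytic = tcc_coupling(chrom_a, chrom_b)
beta_numeric = numerical_coupling(chrom_a, chrom_b)
```

If charges are expressed in units of e and distances in Å, multiplying the result by roughly 1.1614 × 10^5 converts it to cm−1; whether tabulated dq and v values can be inserted directly depends on their unit conventions, which this sketch does not assume.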
B. Hessian matrix reconstruction
To validate the TCC models, we apply them to a series of oligonucleotides with different structures and hydrogen bonding patterns. These biological systems are schematically shown in Fig. 3. The predicted coupling constants from the TCC models are compared to the reference values, as obtained from the HMR method.41,42 As each of the purine bases contains only one chromophore, we consider the C5=C6 and C6=O vibrations for adenine and guanine, respectively, in their vibrational subspaces. In contrast, we incorporate both C=O and C=C stretches in the vibrational subspaces for the pyrimidine bases since these vibrations are strongly coupled.8,43 Specifically, we treat the C2=O and C5=C6 groups as chromophores for cytosine, and the C2=O, C4=O, and C5=C6 groups for thymine and uracil (Fig. 1). For an oligonucleotide, the size of its vibrational subspace is determined by the number of purine and pyrimidine bases that it contains. For example, when we consider an oligonucleotide with two complementary strands of GAAC and GUUC, the vibrational subspace contains a total of 14 chromophores. This comes from one C=C vibration of each adenine, one C=O vibration of each guanine, two vibrations of each cytosine, and three vibrations of each uracil. As a result, the vibrational Hamiltonian, κ, is a 14 × 14 matrix.
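The chromophore counting described above can be written as a short sketch. The dictionary and function names are hypothetical; the chromophore assignments per base follow the text.

```python
# Chromophores assigned to each nucleobase, as described in the text.
CHROMOPHORES = {
    "A": ["C5=C6"],
    "G": ["C6=O"],
    "C": ["C2=O", "C5=C6"],
    "T": ["C2=O", "C4=O", "C5=C6"],
    "U": ["C2=O", "C4=O", "C5=C6"],
}

def subspace_size(*strands):
    """Number of chromophores (dimension of the Hamiltonian) for the given strands."""
    return sum(len(CHROMOPHORES[base]) for strand in strands for base in strand)

n_sites = subspace_size("GAAC", "GUUC")
print(n_sites)  # 14, matching the 14 x 14 Hamiltonian in the text
```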
Diagonalizing κ gives the normal mode frequency matrix, Ω, and the eigenvector matrix, U,
$$U^{-1} \kappa\, U = \Omega \tag{7}$$
The HMR method takes the reverse process to generate the vibrational Hamiltonian,41,42
$$\kappa = U\, \Omega\, U^{-1} \tag{8}$$
Using DFT methods, we performed geometry optimization and frequency analysis of each system and collected all the normal mode frequencies to form Ω for the vibrational subspace. We then used the normal mode coordinates and the amplitude of the vibration, A [Eq. (2)], to obtain U. From its delocalized nature, each normal mode incorporates the contributions from multiple C=O or C=C chromophores. We assume that the eigenvector components are proportional to the changes of bond lengths in the normal mode,
$$U_{i\alpha} \propto r_{i\alpha} - r_i^{0} \tag{9}$$
where Uiα is the matrix element of chromophore i in normal mode α, $r_i^0$ is the equilibrium length of a C=O or C=C bond in chromophore i, and riα is the disturbed length of this bond in normal mode α. After obtaining the values of Uiα, we used the Gram–Schmidt method44 to make U orthonormal to ensure that κ from Eq. (8) was symmetric. In the reconstructed κ, the diagonal terms are the site frequencies of each chromophore and the off-diagonal elements are the coupling constants that we use as reference values in this work.
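The reconstruction step can be sketched as follows. This is a minimal illustration under stated assumptions rather than the procedure used in this work: the eigenvector matrix is taken as the bond-length changes (rows are chromophores, columns are normal modes), its columns are orthonormalized with a Gram–Schmidt sweep, and the Hamiltonian is rebuilt as U Ω U^T. The bond-length changes and mode frequencies below are hypothetical.

```python
import numpy as np

def gram_schmidt(columns):
    """Orthonormalize the columns of a square, non-singular matrix."""
    columns = np.asarray(columns, dtype=float)
    q = np.zeros_like(columns)
    for k in range(columns.shape[1]):
        vec = columns[:, k].copy()
        for j in range(k):
            vec -= (q[:, j] @ vec) * q[:, j]  # remove components along earlier columns
        q[:, k] = vec / np.linalg.norm(vec)
    return q

def reconstruct_hamiltonian(bond_changes, mode_freqs):
    """Rebuild kappa = U Omega U^T from bond-length changes and mode frequencies."""
    u = gram_schmidt(bond_changes)            # U_ia proportional to r_ia - r_i0
    return u @ np.diag(np.asarray(mode_freqs, dtype=float)) @ u.T

# Hypothetical data: 3 chromophores x 3 normal modes (angstrom, cm^-1).
bond_changes = np.array([
    [0.010,  0.006,  0.001],
    [0.008, -0.007,  0.002],
    [0.001,  0.003, -0.009],
])
mode_freqs = [1640.0, 1665.0, 1690.0]
kappa = reconstruct_hamiltonian(bond_changes, mode_freqs)
print(np.round(kappa, 2))
```

Because the orthonormalized U is an orthogonal matrix, its inverse equals its transpose, so the reconstructed κ is symmetric and its eigenvalues reproduce the input mode frequencies.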
C. Model systems to validate the TCC parameters
We constructed a series of model oligonucleotides to validate the TCC models. First, we built double helical structures with the oligonucleotide sequences GGG, CCC, AAA, TTT, ATTA, and TAAT using the nucleic acid builder program45 in the Amber 16 software.46 Their structures are schematically shown in Figs. 3(a)–3(c). We then took segments of DNA and RNA oligonucleotides from the Protein Data Bank (PDB). Specifically, we chose residues 301–304 in chain C and residues 409–412 in chain D from a Z-DNA structure with the PDB ID 6AQV [Fig. 3(d)].47 We included residues 3, 4, 11, 12, 17, and 18 in a model DNA triplex with the PDB ID 134D [Fig. 3(e)].48 From a G-quadruplex structure (PDB ID 6FQ2), we chose residues 1, 4, 7, and 10 from both chains A and B and included the potassium ion between the two layers [Fig. 3(f)].49 We also took residues 4–7 in chain A and residues 13–16 in chain B from an RNA double helix with PDB ID 6IA2 [Fig. 3(g)].50 In addition, we took residues 3–6 in chain A and residues 11–14 in chain B from an RNA oligomer with G-U wobble pairs [PDB ID 315D, Fig. 3(h)].51 Finally, we considered an RNA duplex with the PDB ID 6N8F and chose residues 5–8 in chain A and residues 17–20 in chain B from the first structure of its PDB file [Fig. 3(i)].52 As some of the hydrogen atoms were missing in the crystal structures, we added them using the LEaP program in Amber 16.46 For all model oligonucleotides, we replaced the backbone sugars and phosphates with methyl groups and used the resulting structures for the electronic structure evaluations.
D. DFT calculations
DFT calculations were first performed to determine the parameters q0, dq, and $\mathbf{v}$ for the TCC models. In addition, they were used to calculate the reference couplings of the oligonucleotides. In all cases, the electronic structure was described using the B3LYP density functional,53 the D3 dispersion correction,54 and the 6-311G(d,p) basis set, as implemented in the Gaussian 16 program.55 Considering that the experimental IR measurements of nucleic acids were performed in D2O, we replaced the labile H’s that were covalently bonded to nitrogen atoms in all the structures with D. A correction factor of 0.9679 was applied to all the normal mode frequencies.56
As described in Secs. II B and II C, we used the HMR method to obtain the reference couplings from DFT calculations. For each oligonucleotide (Fig. 3), we carried out geometry optimization and frequency analysis. The phosphate and sugar groups of the oligonucleotides were replaced with methyl groups to reduce the computational cost, as they have been shown to have minor influences on the vibrational properties of the C=O and C=C groups in the nucleobases.25 To maintain the base pairing and stacking configurations of the model systems, we performed constrained geometry optimization where the positions of all the hydrogen bond donor and acceptor atoms as well as the atoms that were covalently bonded to the oligonucleotide backbone were fixed. For example, if an adenine formed a Watson–Crick pair with thymine, we would fix the position of its N1, N6, and N9 atoms (Fig. 1). If it formed both Watson–Crick and Hoogsteen base pairs with thymine in the triad T·AT, we would fix the positions of its N1, N6, N7, and N9 atoms. From analyzing the vibrational normal modes, we confirmed that all the imaginary frequencies came from the inter-base movements and were not related to the stretching of the C=O and C=C bonds. All the DFT calculations were performed in vacuum. As the couplings between the chromophores are dominated by electrostatic interactions, we expect them to depend only on the geometry of the oligonucleotides and inclusion of solvent molecules will have minor influences on their values.
III. RESULTS AND DISCUSSION
A. The TCC models for nucleobase C=O and C=C stretches
From Fig. 1, the C=O groups in the nucleobases G, C, T, and U are in two distinct chemical environments. In the first case, C6=O in guanine and C4=O in thymine and uracil are bonded to only one amino group. To mimic their structures, we use deuterated cis-N-methylacetamide (cis-NMAD) as a model molecule [Fig. 2(a)] and refer to the resulting TCC model as G44CO. In the second type, the C2=O group in cytosine, thymine, and uracil is sandwiched between two N-containing functional groups. Accordingly, we use a deuterated compound 1,1-dimethyl-3-methyleneurea [Fig. 2(b)] to mimic the structures of these C=O chromophores and construct another TCC model, C22CO. Using these molecules, we define an internal coordinate system as shown in Fig. 2. The y axis is along the bond direction, and the orthogonal x axis is in the molecular plane and points toward the N atom. In the C22CO model, atom N3 is used to determine the x axis. The z axis is placed to be perpendicular to the molecular plane.
From the vibrational analyses, we observe that the atoms highlighted in red in Figs. 2(a) and 2(b) contribute to over 90% of the carbonyl stretch normal modes. As such, we incorporate these atoms in the TCC models and determine their q0, dq, and $\mathbf{v}$ from DFT calculations. For each molecule, we first carry out geometry optimization and compute the Mulliken charges57,58 of the atoms as q0. To keep the overall system neutral, we add the charges of the terminal atoms not included in the TCC model to the C or N atoms that are covalently bonded to them. With the DFT calculations, we obtain the reduced mass, vibrational frequency, and coordinates of the carbonyl stretch normal mode of each system, from which we compute the vibrational amplitude using Eq. (2). We then determine $\mathbf{v}$ as the product of the normal mode coordinates, normalized and rotated according to the internal coordinate system, and the vibrational amplitude. From the optimized geometry of a model molecule, we slightly displace the atoms involved in the normal mode by $x\mathbf{v}$, where x takes the values of ±0.005 and ±0.010, and conduct single-point energy calculations to acquire their partial charges q′. By plotting q′ (including q0) against x, we obtain the charge flows dq using a linear regression algorithm. The resulting parameters for the G44CO and C22CO models are listed in Tables I and II, respectively.
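The regression step for the charge flows can be sketched as below. The charges here are synthetic stand-ins for DFT partial charges: they are generated from the q0 and dq values listed for the O atom in Table I, plus a small hypothetical curvature term, so the fit simply recovers the slope it was built from. The variable names are illustrative.

```python
import numpy as np

# Displacements used in the text: x = 0 (equilibrium), +/-0.005, +/-0.010.
x = np.array([-0.010, -0.005, 0.0, 0.005, 0.010])

# Synthetic charges (NOT DFT output): Table I values for O plus a
# hypothetical quadratic term to mimic weak nonlinearity.
q0_ref, dq_ref, curvature = -0.35979, 0.02880, 0.5
q_prime = q0_ref + dq_ref * x + curvature * x**2

# Linear regression of q' against x: the slope is the charge flow dq.
dq_fit, q0_fit = np.polyfit(x, q_prime, 1)
print(f"dq = {dq_fit:.5f}, intercept = {q0_fit:.5f}")
```

On a displacement grid that is symmetric about zero, an even (quadratic) term does not change the fitted slope, which is one reason a symmetric set of displacements is convenient for extracting dq.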
TABLE I.
Atom | q0 | dq | vx | vy | vz |
---|---|---|---|---|---|
C5 | 0.031 08 | −0.015 50 | −0.008 | −0.057 | 0 |
C6/4 | 0.311 75 | 0.004 30 | −0.011 | 0.815 | −0.002 |
O | −0.359 79 | 0.028 80 | −0.006 | −0.497 | 0.001 |
N1/3 | −0.405 27 | −0.012 54 | 0.038 | −0.114 | 0.007 |
D | 0.226 57 | −0.004 70 | −0.135 | 0.150 | −0.011 |
C2 | 0.195 66 | −0.000 36 | 0.007 | 0.007 | −0.006 |
TABLE II.
Atom | q0 | dq | vx | vy | vz |
---|---|---|---|---|---|
N1 | −0.186 35 | −0.018 08 | −0.090 | −0.145 | 0 |
C2 | 0.482 62 | 0.000 56 | 0.119 | 0.788 | 0 |
O | −0.393 51 | 0.027 70 | −0.007 | −0.459 | 0 |
N3 | −0.320 29 | −0.002 26 | −0.108 | −0.117 | 0 |
C4 | 0.215 20 | −0.006 50 | 0.097 | 0.070 | 0 |
C6 | 0.202 33 | −0.001 42 | 0.011 | 0.010 | 0 |
From Table I, the C=O stretch of cis-NMAD accounts for 91.1% of its vibrational normal mode in the 1600–1800 cm−1 region and the C6/4–N1/3–D bending motion also plays a role. As shown in Table II, the C=O stretch contributes to 84.5% of the corresponding normal mode in 1,1-dimethyl-3-methyleneurea. Due to the conjugation between the C=O and N3=C4 groups, this normal mode also involves the C2–N3 and N3=C4 stretches. For both molecules, the sum of $|\mathbf{v}_n|^2$ over the listed atoms is less than 1 because the normal mode coordinates of a few terminal atoms are not included [Figs. 2(a) and 2(b)]. Note that in Tables I and II, the heavy atoms are numbered according to their positions in the nucleobases G, C, T, and U (Fig. 1). Considering that the purine base G and the pyrimidine bases T and U have different numbering for the amide groups, we include both notations in Table I.
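The numbers in Table I can be checked with a few lines of arithmetic. The sketch below assumes that the quoted percentage corresponds to the summed squared displacement components of the carbonyl C and O atoms; this reading is inferred from the tabulated values and is an assumption of the example. It also verifies that the listed charges and charge flows sum to zero.

```python
import numpy as np

# Table I entries: atom -> (q0, dq, (vx, vy, vz)).
TABLE_I = {
    "C5":   (0.03108, -0.01550, (-0.008, -0.057, 0.000)),
    "C6/4": (0.31175,  0.00430, (-0.011,  0.815, -0.002)),
    "O":    (-0.35979, 0.02880, (-0.006, -0.497, 0.001)),
    "N1/3": (-0.40527, -0.01254, (0.038, -0.114, 0.007)),
    "D":    (0.22657, -0.00470, (-0.135,  0.150, -0.011)),
    "C2":   (0.19566, -0.00036, (0.007,   0.007, -0.006)),
}

# Squared norm of each atom's displacement vector.
weights = {atom: float(np.sum(np.square(v))) for atom, (_, _, v) in TABLE_I.items()}
co_fraction = weights["C6/4"] + weights["O"]   # carbonyl C and O atoms
total_weight = sum(weights.values())           # all listed atoms

net_charge = sum(q0 for q0, _, _ in TABLE_I.values())
net_flow = sum(dq for _, dq, _ in TABLE_I.values())

print(f"C=O contribution: {100 * co_fraction:.1f}%")
print(f"sum over listed atoms: {total_weight:.3f}")
```

Under this reading, the carbonyl atoms reproduce the 91.1% quoted for cis-NMAD, and the total over the listed atoms falls slightly below 1.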
From previous 2D IR experiments and DFT calculations, the C=O vibrations in cytosine, thymine, and uracil are strongly coupled to a ring mode mainly attributed to the C5=C6 stretches.8,25,43 In addition, our calculations show that when adenine becomes part of the base pairs, its ring mode around 1600 cm−1 also interacts with the chromophores in thymine or uracil. Therefore, we will develop TCC models for the C5=C6 stretches in C, T, U, and A to fully model the vibrational spectra of nucleobases in the 1600–1800 cm−1 region.
Considering that the electrons are quantum mechanically delocalized over the C5, C6, and other atoms in the conjugated purine and pyrimidine rings, it is necessary to incorporate all the ring atoms in the development of their TCC models. Our DFT calculations further show that the partial charges and charge flows on these atoms are particularly sensitive to the electron-withdrawing groups covalently bonded to the rings. Accordingly, we use three model molecules to develop the parameters for the C5=C6 stretches and name the resulting TCC models as CCC, TUCC, and ACC. As demonstrated in Figs. 2(c)–2(e), these include deuterated cytosine for the CCC model, deuterated uracil for the C5=C6 group in both thymine and uracil (TUCC), and deuterated 4-aminopyrimidine to mimic the ring structure of adenine (ACC). The internal coordinate system for these molecules is defined such that the molecule is in the xy plane, with the y axis along the C5=C6 bond direction and the orthogonal x axis pointing toward N1 in the CCC and TUCC models and toward N6 in ACC. The z axis is perpendicular to the molecular plane. Following the same procedure as in the C=O stretch case, we determine the partial charges q0, the charge flows dq, and the normal modes $\mathbf{v}$ for the three model molecules and list their parameters in Tables III–V. From these tables, we observe that the ring modes mainly concentrate in the xy plane and are delocalized over all the atoms. The C5=C6 stretch contributes to 43.3% of the normal mode in cytosine and 45.2% in 4-aminopyrimidine, whereas this ratio increases to 59.5% in uracil.
TABLE III.
Atom | q0 | dq | vx | vy | vz |
---|---|---|---|---|---|
N1 | −0.182 42 | −0.012 26 | 0.136 | −0.179 | 0 |
C2 | 0.060 84 | 0.002 32 | 0.139 | 0.045 | 0 |
N3 | −0.391 50 | −0.006 00 | 0.182 | −0.219 | 0 |
C4 | 0.366 15 | −0.004 06 | −0.262 | 0.320 | −0.001 |
N4 | 0.006 16 | 0.012 66 | 0.026 | −0.070 | 0.007 |
C5 | −0.176 28 | 0.003 34 | −0.030 | −0.388 | 0 |
C6 | 0.317 05 | 0.004 00 | −0.215 | 0.485 | 0 |
TABLE IV.
Atom | q0 | dq | vx | vy | vz |
---|---|---|---|---|---|
N1 | −0.174 56 | −0.014 46 | 0.045 | −0.212 | 0 |
C2 | 0.167 97 | 0.002 54 | 0.146 | 0.111 | 0 |
N3 | −0.200 21 | 0.002 14 | 0.007 | −0.076 | 0 |
C4 | 0.410 82 | −0.001 90 | −0.026 | −0.062 | 0 |
O4 | −0.342 61 | 0.000 12 | 0.027 | 0.110 | 0 |
C5 | −0.175 35 | 0.000 56 | −0.086 | −0.473 | 0 |
C6 | 0.313 94 | 0.011 00 | −0.164 | 0.581 | 0 |
TABLE V.
Atom | q0 | dq | vx | vy | vz |
---|---|---|---|---|---|
N1 | −0.349 67 | −0.000 40 | 0.262 | −0.134 | −0.008 |
C2 | 0.240 50 | 0.000 88 | −0.210 | 0.324 | 0.007 |
N3 | −0.307 16 | 0.008 06 | 0.081 | −0.133 | −0.002 |
C4 | 0.184 78 | 0.000 50 | −0.337 | 0.082 | 0.010 |
C5 | −0.106 01 | 0.004 86 | 0.317 | −0.384 | −0.010 |
C6 | 0.343 62 | 0.000 90 | −0.308 | 0.330 | 0.011 |
N6 | −0.006 06 | −0.014 80 | 0.094 | −0.089 | −0.013 |
B. Validation of the TCC models
To assess the performance of the TCC models, we apply them to a set of model oligonucleotides with well-defined pairing and stacking patterns, as schematically shown in Fig. 3. We start with DNA oligomers of the sequences GGG, AAA, and ATTA that form B- and A-type double helices with their complementary strands [Figs. 3(a)–3(c)]. We then consider a Z-DNA oligonucleotide from the PDB (PDB ID 6AQV), which contains a self-complementary sequence of CGCG and adopts the left-handed Z-form double helix structure [Fig. 3(d)].47 As demonstrated in Figs. 3(e) and 3(f), we further consider a model DNA triplex that contains the hydrogen bonded G·GC and T·AT triples held together by both Watson–Crick and Hoogsteen base pairings (taken from PDB ID 134D)48 and a two-layer G-quadruplex stabilized by an inter-layer K+ ion (from PDB ID 6FQ2).49 Apart from the DNA oligomers, we incorporate three model RNA oligonucleotides. As shown in Figs. 3(g) and 3(h), these include a double-helical oligomer formed from complementary strands of sequences GAAC and GUUC (from PDB ID 6IA2)50 and a duplex with a self-complementary sequence of AUGU that contains both Watson–Crick and wobble base pairs (from PDB ID 315D).51 Finally, we consider an RNA internal loop 5′-GCUU/3′-UUCG that forms the non-canonical G-U and C–U base pairs (from PDB ID 6N8F).52
Each test case contains 8–18 C=O and C=C groups in the nucleobases. We calculate the coupling constants between each pair of the chromophores and compare the values predicted from the TCC models, βTCC, with those from the HMR method, βHMR. As demonstrated in Fig. 4, βHMR spans a broad range between −16 cm−1 and +14 cm−1, with the negative and positive values corresponding to attractive and repulsive interactions between the vibrations of the chromophores, respectively. These couplings give rise to concerted vibrations in the normal modes of the oligonucleotides, and their signs determine the relative intensity of the IR absorption peaks.59 The TCC models are capable of correctly reproducing the reference values in the full region. For example, when we consider the C=O/C=O interactions in Fig. 4(a), the average βTCC is only 0.07 cm−1 larger than that of βHMR and the root-mean-square deviation (RMSD) is 1.55 cm−1. From all the test cases, we find only one outlier with a βTCC of −5.75 cm−1 and a βHMR of 3.65 cm−1. This corresponds to the interaction between the C2=O groups of U3 in Fig. 3(i), in which the two bases reside in strands 1 and 2 in the RNA internal loop and form a “shifted stacking” conformation. As these C=O groups are separated by a very short distance of 3.2 Å, we expect their electron densities to overlap strongly with each other and the TCC scheme to be inadequate to fully describe the interactions, resulting in this qualitatively incorrect prediction. To further examine the impact of the inter-chromophore distances, we consider the C2=O groups between the U3 and U4 bases in chains 1 and 2 in Fig. 3(i) and find the distances between their midpoints to be 4.7 Å and 4.9 Å, respectively. Accordingly, their βTCC are 3.87 cm−1 and 3.98 cm−1, respectively, in reasonable agreement with the βHMR values of 1.84 cm−1 and 1.71 cm−1.
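The two summary statistics used in this comparison, the mean signed difference and the RMSD, can be computed as sketched below. The input consists only of the three (βTCC, βHMR) pairs quoted explicitly in the text for the RNA internal loop, so the output illustrates how a single outlier affects the statistics and does not reproduce the values reported for the full data set. The function name is illustrative.

```python
import numpy as np

def compare_couplings(beta_tcc, beta_hmr):
    """Return (mean signed difference, RMSD) of beta_TCC relative to beta_HMR, in cm^-1."""
    diff = np.asarray(beta_tcc, dtype=float) - np.asarray(beta_hmr, dtype=float)
    return float(diff.mean()), float(np.sqrt(np.mean(diff**2)))

# Pairs quoted in the text: two C2=O/C2=O couplings and the single outlier.
beta_tcc = [3.87, 3.98, -5.75]
beta_hmr = [1.84, 1.71, 3.65]

mean_all, rmsd_all = compare_couplings(beta_tcc, beta_hmr)
mean_no_outlier, rmsd_no_outlier = compare_couplings(beta_tcc[:2], beta_hmr[:2])
print(f"with outlier:    mean = {mean_all:.2f}, RMSD = {rmsd_all:.2f}")
print(f"without outlier: mean = {mean_no_outlier:.2f}, RMSD = {rmsd_no_outlier:.2f}")
```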
For the C=O/C=C couplings in Fig. 4(b), the average βTCC is 0.16 cm−1 smaller than that of βHMR and the RMSD between them is 1.54 cm−1. We notice that the TCC models slightly underestimate the C=O/C=C coupling constants when βHMR is above +8 cm−1 or below −8 cm−1. Comparing the TCC parameters in Tables I–V, we find that the atomic charge flows dq accompanying the C=C vibrations are much smaller than those with the C=O vibrations. Therefore, factors other than electrostatics might contribute considerably to the C=C vibrational modes, especially when they are in close proximity to other chromophores and the interactions are strong. This underestimation is even more prominent for the C=C/C=C couplings. As shown in Fig. 4(c), the correlation between βTCC and βHMR is much weaker compared to the C=O/C=O and C=O/C=C interactions. The absolute values of βTCC are, in general, smaller than those of βHMR, particularly when the couplings become larger. This phenomenon again indicates that non-electrostatic forces such as dispersion and polarization might play a role in determining how the C=C groups interact with each other. Despite this lack of agreement, we note that the C=C/C=C couplings are usually within 5 cm−1 and the RMSD between βTCC and βHMR is 1.68 cm−1, comparable to the other two cases. In addition, only 17% of all the inter-base interactions are between the C=C vibrations. Therefore, we will still implement the TCC model to describe the C=C/C=C interactions, and their impact on the overall spectra in the 1600–1800 cm−1 region is expected to be small.
In Tables I–V, we obtained the atomic charges and charge flows, q0 and dq, from the Mulliken population analysis.57,58 To further validate the TCC models, we repeat the parameterization using natural population analysis (NPA)60 and apply the resulting models to compute the coupling constants in the model oligonucleotides, which we will refer to as $\beta_{\mathrm{TCC}}^{\mathrm{NPA}}$. Taking the B-form double helix [Fig. 3(a)] as an example, we analyze 33 inter-base interactions that involve both the C=O and C=C chromophores and observe a maximal difference of 1.8 cm−1 between the predicted βTCC and $\beta_{\mathrm{TCC}}^{\mathrm{NPA}}$ values. Compared to βHMR, the RMSD is 1.53 cm−1 for βTCC and 1.65 cm−1 for $\beta_{\mathrm{TCC}}^{\mathrm{NPA}}$, demonstrating that the TCC parameterization does not have a strong dependence on the method for partial charge assignment.
C. Through-bond couplings in the pyrimidine bases
As demonstrated in Fig. 1, the pyrimidine bases C, T, and U contain multiple chromophores. In cytosine, the C2=O and C5=C6 vibrations both absorb in the 1600–1800 cm−1 region. Thymine and uracil further have the C4=O stretch as a chromophore. These C=O and C=C groups are part of the conjugated rings and are linked by covalent bonds, leading to significant overlap between their electron densities. As such, the electrostatic TCC scheme is not sufficient to model the through-bond couplings between them, and methods that account for the full electronic quantum effects are required to provide the correct description.
Constrained by the pyrimidine ring structures, the C=O and C=C groups have relatively fixed positions and orientations with respect to each other throughout the MD simulations. Hence, it is reasonable to assume that these intra-base couplings take constant values. In a previous work, we used deoxycytidine, deoxythymidine, and uridine 5′-monophosphates (CMP, TMP, and UMP, respectively) as model molecules to calculate the through-bond coupling constants.43 Specifically, we performed MD simulations of CMP, TMP, and UMP in water and extracted a total of 1200 solute–solvent clusters, for which we combined DFT calculations and the HMR method to obtain the off-diagonal elements of the vibrational Hamiltonian. The average values of these intramolecular couplings are listed in Table VI.43 In the pyrimidine bases, stretching the C2=O and C5=C6 groups simultaneously decreases their interaction energy, giving rise to negative coupling constants [Eq. (1)]. In contrast, vibrations of the other chromophores in thymine and uracil exhibit repulsive interactions and their corresponding coupling constants are positive.
TABLE VI.
Nucleobase | C2=O/C4=O | C2=O/C5=C6 | C4=O/C5=C6 |
---|---|---|---|
C | … | −10.57 | … |
T | 17.27 | −8.16 | 6.26 |
U | 18.55 | −9.35 | 16.20 |
As a validation, we compute the intramolecular couplings in the oligonucleotide test cases (Fig. 3) and compare them to the values in Table VI. We first consider the C2=O/C4=O interactions in thymine and uracil and find the average βHMR from the model oligonucleotides to be 14.40 cm−1 and 16.45 cm−1, respectively. These values are close to the coupling constants of 17.27 cm−1 and 18.55 cm−1 from the calculations of TMP and UMP, respectively, as shown in Table VI. Furthermore, βHMR of the intramolecular C2=O/C4=O interactions samples a narrow range with a standard deviation of 0.63 cm−1 for thymine and 1.27 cm−1 for uracil, justifying our assumption that these couplings can be approximately treated as constants. Similarly, the average βHMR of the C2=O/C5=C6 and C4=O/C5=C6 interactions in thymine and uracil are −9.26 cm−1, 7.70 cm−1, −10.15 cm−1, and 16.32 cm−1, respectively, and their standard deviations are all within 1.6 cm−1. In all four of these cases, the average βHMR differs from the corresponding value in Table VI by less than 1.5 cm−1, demonstrating that these interactions stay more or less constant in different environments around the nucleobases. For cytosine, the average βHMR of the C2=O/C5=C6 interaction is −9.66 cm−1, very close to the value of −10.57 cm−1 in Table VI. However, the standard deviation is 3.60 cm−1, suggesting that the intra-base coupling in cytosine depends more on the chemical environment than the other types of through-bond couplings.
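The comparison above is simple bookkeeping, and a minimal Python sketch of it may help. The numbers are those quoted in Table VI and in the preceding paragraph; the dictionary layout and names are our own, and we assume the four C=C-containing averages are listed in the order thymine (C2=O/C5=C6, C4=O/C5=C6) followed by uracil.

```python
# Through-bond coupling constants from Table VI (cm^-1).
TABLE_VI = {
    ("T", "C2=O/C4=O"): 17.27,
    ("T", "C2=O/C5=C6"): -8.16,
    ("T", "C4=O/C5=C6"): 6.26,
    ("U", "C2=O/C4=O"): 18.55,
    ("U", "C2=O/C5=C6"): -9.35,
    ("U", "C4=O/C5=C6"): 16.20,
    ("C", "C2=O/C5=C6"): -10.57,
}

# Average beta_HMR in the oligonucleotide test cases, as quoted in the text (cm^-1).
# The assignment of the four C=C-containing values to T and U is our assumption.
OLIGO_MEAN = {
    ("T", "C2=O/C4=O"): 14.40,
    ("T", "C2=O/C5=C6"): -9.26,
    ("T", "C4=O/C5=C6"): 7.70,
    ("U", "C2=O/C4=O"): 16.45,
    ("U", "C2=O/C5=C6"): -10.15,
    ("U", "C4=O/C5=C6"): 16.32,
    ("C", "C2=O/C5=C6"): -9.66,
}

# Deviation of the oligonucleotide average from the tabulated constant.
deviation = {key: OLIGO_MEAN[key] - TABLE_VI[key] for key in TABLE_VI}

for (base, pair), delta in sorted(deviation.items()):
    print(f"{base}  {pair:<11s}  {delta:+.2f} cm^-1")
```

With these inputs, the four C=C-containing couplings in thymine and uracil deviate by less than 1.5 cm−1, while the C2=O/C4=O couplings deviate by 2.87 cm−1 (thymine) and 2.10 cm−1 (uracil).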
D. Sensitivity of the coupling constants to the stacking and pairing of nucleobases
From Fig. 4, the vibrational couplings vary over a range of 30 cm−1 depending on the three-dimensional arrangements of the nucleobases, and hence they provide a sensitive probe of the structures of nucleic acids. Among the factors that determine the coupling constants, the separation and relative angle between two chromophores, R and ϕ, are of particular interest. As an example, we plot βTCC of the C=O/C=O interactions as a function of R, calculated as the distance between the midpoints of the C=O groups (Fig. 5). In general, the magnitude of the coupling constants follows an inverse relationship with R, in good agreement with the findings of previous DFT studies.25,26 As R lengthens from 3.4 Å to 7 Å, the average coupling constant exhibits a 92% reduction in magnitude from 11.28 cm−1 to −0.94 cm−1. From Fig. 5, βTCC samples different signs and magnitudes at a given R, which arises from the variation of ϕ in the model oligonucleotides. In the following, we will examine the systems in which R between the C=O groups is short to elucidate how the distance and angle between two chromophores influence their coupling constant.
When R < 3.55 Å, the nucleobases are mostly in stacking configurations. Despite the short distances, there are considerable fluctuations in the C=O/C=O coupling constants, as shown in Fig. 5. For example, βTCC is 12.69 cm−1 for the stacked G1 and G2 in a model A-DNA with an R of 3.4 Å [Fig. 3(a)]. However, the coupling constant becomes 4.23 cm−1 when the C4=O groups in U3 and U4 of strand 1 in Fig. 3(i) have a distance of 3.5 Å. As the stacked bases are almost parallel to each other, this fluctuation in βTCC mainly arises from the twist angle between their molecular planes. To elucidate its impact, we construct a model system consisting of two 9-methylguanine (mG) molecules stacked together, as depicted in Fig. 6. We place the two mG molecules in a parallel conformation so that the twist angle can be well represented by the angle between the two C=O groups, ϕ. We then fix R between the C=O groups at 3.5 Å and rotate one of the mG molecules about an axis perpendicular to its molecular plane. As shown in Fig. 6, the resulting βTCC exhibits a strong variation with the angle and is symmetric about ϕ = 0. At a constant R of 3.5 Å, βTCC reaches a maximum of 11.26 cm−1 at ϕ = 0 and decreases to 6.58 cm−1 when ϕ = 60°. As one further increases ϕ to over 100°, βTCC becomes negative. Therefore, together with R, ϕ determines the sign and magnitude of the coupling constants. In the two cases described above, ϕ is 32° and 50° between the C=O groups in the stacked guanine [Fig. 3(a)] and uracil [Fig. 3(i)] bases, respectively, and hence their couplings are both positive and differ in magnitude by a factor of 3.
One can use the mG dimer in Fig. 6 to elucidate how the stacking interactions influence the absorption peaks in the IR spectra. In a standard B-DNA, the twist angle is 36° (Ref. 61) and it gives a βTCC of 9.45 cm−1. As the C=O stretch frequency of mG is 1746.14 cm−1 from the DFT calculations in vacuum, the vibrational Hamiltonian of the mG dimer in a B-form double helix is

$$
\kappa_{\mathrm{GG}} =
\begin{pmatrix}
1746.14 & 9.45 \\
9.45 & 1746.14
\end{pmatrix}\ \mathrm{cm^{-1}}.
$$

After diagonalizing κGG, one obtains the normal mode frequencies of 1736.69 cm−1 and 1755.59 cm−1. Therefore, at an R of 3.5 Å, the stacking of two mG bases gives rise to two absorption peaks separated by 2 × βTCC = 18.90 cm−1.
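The splitting quoted above follows from the closed-form eigenvalues of a symmetric 2 × 2 matrix. The short Python sketch below reproduces the arithmetic with the site frequency and coupling given in the text; the function name `two_level_modes` is ours and is only illustrative.

```python
import math


def two_level_modes(w1, w2, beta):
    """Return the (low, high) eigenvalues of [[w1, beta], [beta, w2]]."""
    mean = 0.5 * (w1 + w2)
    # Half of the eigenvalue gap: sqrt(((w1 - w2) / 2)^2 + beta^2).
    half_gap = math.hypot(0.5 * (w1 - w2), beta)
    return mean - half_gap, mean + half_gap


# Stacked mG dimer: degenerate C=O site frequencies coupled by beta_TCC (cm^-1).
low, high = two_level_modes(1746.14, 1746.14, 9.45)
print(f"{low:.2f} {high:.2f} splitting = {high - low:.2f}")
# prints "1736.69 1755.59 splitting = 18.90"
```

For degenerate site frequencies the gap is exactly twice the coupling; if the two sites had different frequencies, the same function would return a splitting larger than their frequency difference.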
In addition to stacking, hydrogen bonding interactions bring the nucleobases together, resulting in well-defined R and ϕ between the C=O groups and characteristic coupling patterns. We will use the Watson–Crick G-C pair as an example [Fig. 7(a)]. From the model systems in Fig. 3, we find an average R of 4.9 Å between the C=O chromophores in the G-C pairs, giving an average βTCC of −7.56 cm−1, consistent with previous values obtained from DFT calculations and 2D IR experiments.23,25 Here, the coupling constant is negative because the C=O groups have almost antiparallel alignment with an average ϕ of 171°. Similar arrangements of the C=O groups are observed in non-canonical hydrogen bonding patterns such as G-U wobble pairs and Hoogsteen pairs, producing large and negative coupling constants. For example, Fig. 7(b) demonstrates that the C–U base pair in an RNA internal loop [Fig. 3(i)] has an average R of 4.7 Å and ϕ of 145° between the C2=O of cytosine and C4=O of uracil, giving an average βTCC of −6.98 cm−1. In the model DNA triplex [Fig. 3(e)], the G·GC triple involves a Hoogsteen pair between the two guanine bases. As shown in Fig. 7(c), this base pair is held together by a hydrogen bond between the N2–H group of one guanine (G1) and the O6 atom of the other guanine (G2), and a second hydrogen bond between the N1–H group of G1 and the N7 atom of G2. The resulting R is 5.2 Å and ϕ is 169° between the C=O groups in G1 and G2, giving a βTCC of −5.95 cm−1.
While the C=O/C=O coupling constants in most of the base pairs are negative, they can also take positive values when the chromophores are arranged such that ϕ is less than 100°. For example, compared to Fig. 7(c), G1 has a flipped configuration in the Hoogsteen base pairs of the G-quadruplex [Fig. 7(d)]. This leads to an average R of 3.9 Å and ϕ of 90° between the two C=O groups in the G-quadruplex, giving a βTCC of +3.66 cm−1. Furthermore, the average R and ϕ between the C2=O groups in cytosine and uracil in the C–U base pair [Fig. 7(b)] are 3.8 Å and 62°, respectively. Due to their closer distance and more parallel orientation, the corresponding βTCC of +10.08 cm−1 is larger than that in the G-quadruplex. These analyses show that the vibrational couplings depend strongly on the base pairing geometries. They are sensitive to the hydrogen bonding patterns in Watson–Crick and C–U base pairs and have distinct values and signs for different types of Hoogsteen base pairs.
We now consider a Watson–Crick G-C pair, which is composed of mG and 1-methylcytosine (mC), to demonstrate the impact of hydrogen bonding on the IR absorption peaks. As shown in Fig. 7(a), this dimer system contains the C6=O chromophore of mG and the C2=O and C5=C6 chromophores of mC, and hence its vibrational Hamiltonian is a 3 × 3 matrix. To obtain the diagonal elements of the Hamiltonian, we replace C6=O in mG and C2=O in mC, one at a time, by the C=S group to remove their interactions while keeping the hydrogen bonding environment. After optimizing the mutated G-C dimers in vacuum and applying the HMR method, we find the site frequencies of the C6=O stretch in mG and the C2=O and C5=C6 stretches in mC to be 1675.55 cm−1, 1652.62 cm−1, and 1636.67 cm−1, respectively. For the off-diagonal elements of the Hamiltonian, we acquire the intermolecular coupling constants from the TCC models and the intramolecular coupling constant in mC from Table VI. The resulting vibrational Hamiltonian of the G-C pair is

$$
\kappa_{\mathrm{GC}} =
\begin{pmatrix}
1675.55 & \beta_{12} & \beta_{13} \\
\beta_{12} & 1652.62 & -10.57 \\
\beta_{13} & -10.57 & 1636.67
\end{pmatrix}\ \mathrm{cm^{-1}},
$$

where the basis is ordered as C6=O of mG, C2=O of mC, and C5=C6 of mC, and β12 and β13 denote the TCC couplings of the mG C6=O stretch with the mC C2=O and C5=C6 stretches, respectively.
By diagonalizing κGC, we obtain the eigenvalue matrix ΩGC and the normal mode matrix UGC,
Here, we observe three prominent influences of the hydrogen bonding interactions on the vibrational Hamiltonian, and hence the vibrational spectra of the G-C pair. First, they significantly reduce the site frequencies of the C=O groups. From DFT calculations of isolated mG and mC in the gas phase, the site frequencies of C6=O in mG and C2=O in mC are 1746.14 cm−1 and 1710.21 cm−1, respectively. This means that the formation of hydrogen bonds between the carbonyl and amino groups in guanine and cytosine red shifts their C=O frequencies by 70.59 cm−1 and 57.59 cm−1, respectively. In contrast, as the C5=C6 group does not participate in the base pairing interactions, its site frequency is almost unchanged as compared to the gas-phase value of 1635.65 cm−1 in isolated mC. These observations are in good agreement with previous DFT calculations on paired bases.26 Second, the hydrogen bonding interactions bring the C=O and C=C groups close together, leading to coupling constants with magnitudes of around 10 cm−1. These large couplings further increase the separation of the normal mode frequencies, giving three absorption peaks at 1681.00 cm−1, 1652.93 cm−1, and 1630.91 cm−1. Finally, due to the strong couplings, the vibrational normal modes of the G-C pair are delocalized over all three chromophores. For example, the peak at 1652.93 cm−1 arises mostly (72%) from the C2=O stretch of mC, while the contributions of C6=O in mG and C5=C6 in mC are 17% and 11%, respectively. We note that the vibrational Hamiltonian is computed from the paired mG and mC in vacuum, and one can obtain more accurate predictions by incorporating the impact of the condensed phase environment and the structural fluctuations.
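To make the delocalization analysis concrete, the following Python sketch diagonalizes a 3 × 3 Hamiltonian for the G-C pair and reports the squared eigenvector coefficients as site contributions. The site frequencies and the intramolecular coupling of −10.57 cm−1 are taken from the text and Table VI. The two intermolecular couplings (−7.68 and 8.81 cm−1) are our own assumptions, back-solved so that the quoted peak positions and mode compositions are approximately reproduced; they are not values quoted in this section. The Jacobi routine is a generic standard-library stand-in for any symmetric eigensolver.

```python
import math


def jacobi_eigh(matrix, sweeps=30):
    """Diagonalize a small real symmetric matrix with cyclic Jacobi rotations.

    Returns (eigenvalues, eigenvectors), sorted from high to low eigenvalue;
    eigenvectors[k] is the normalized eigenvector for eigenvalues[k].
    """
    n = len(matrix)
    a = [list(row) for row in matrix]
    v = [[float(i == j) for j in range(n)] for i in range(n)]
    for _ in range(sweeps):
        for p in range(n - 1):
            for q in range(p + 1, n):
                if a[p][q] == 0.0:
                    continue
                diff = a[q][q] - a[p][p]
                # Small-angle rotation that zeroes the (p, q) element.
                theta = math.pi / 4 if diff == 0.0 else 0.5 * math.atan(2.0 * a[p][q] / diff)
                c, s = math.cos(theta), math.sin(theta)
                for k in range(n):  # A <- A J (mix columns p and q)
                    akp, akq = a[k][p], a[k][q]
                    a[k][p], a[k][q] = c * akp - s * akq, s * akp + c * akq
                for k in range(n):  # A <- J^T A (mix rows p and q)
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k], a[q][k] = c * apk - s * aqk, s * apk + c * aqk
                for k in range(n):  # V <- V J accumulates the eigenvectors
                    vkp, vkq = v[k][p], v[k][q]
                    v[k][p], v[k][q] = c * vkp - s * vkq, s * vkp + c * vkq
    order = sorted(range(n), key=lambda k: a[k][k], reverse=True)
    return [a[k][k] for k in order], [[v[i][k] for i in range(n)] for k in order]


SITES = ["mG C6=O", "mC C2=O", "mC C5=C6"]
BETA_GO_CO = -7.68   # ASSUMED (back-solved): mG C6=O / mC C2=O coupling, cm^-1
BETA_GO_CC = 8.81    # ASSUMED (back-solved): mG C6=O / mC C5=C6 coupling, cm^-1
BETA_INTRA = -10.57  # Table VI: C2=O / C5=C6 through-bond coupling in cytosine

kappa_gc = [
    [1675.55, BETA_GO_CO, BETA_GO_CC],
    [BETA_GO_CO, 1652.62, BETA_INTRA],
    [BETA_GO_CC, BETA_INTRA, 1636.67],
]

freqs, modes = jacobi_eigh(kappa_gc)
# Squared coefficients give the fractional contribution of each site to a mode.
weights = [[coeff * coeff for coeff in mode] for mode in modes]

for freq, weight in zip(freqs, weights):
    parts = ", ".join(f"{site}: {100 * w:.0f}%" for site, w in zip(SITES, weight))
    print(f"{freq:8.2f} cm^-1  ({parts})")
```

Under these assumed couplings, the middle mode is dominated by the C2=O stretch of mC, in line with the composition described above. Changing the sign of both assumed couplings together leaves the spectrum unchanged, so the peak positions alone do not fix their signs.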
E. Comparison to the TDC scheme
In the TDC scheme, one assumes that the distances between two chromophores are sufficiently long that the interactions are mainly due to their transition dipole moments.29,34 Within this scheme, the coupling between two chromophores i and j is35,37
$$
\beta_{ij} = \frac{A}{\varepsilon}\left[\frac{\vec{\mu}_i \cdot \vec{\mu}_j}{R^{3}} - \frac{3\,(\vec{\mu}_i \cdot \vec{R}_{ij})(\vec{\mu}_j \cdot \vec{R}_{ij})}{R^{5}}\right]. \tag{10}
$$

Here, μi is the transition dipole of the ith C=O or C=C chromophore and Rij is the vector connecting the two chromophores, which points from i to j. The transition dipoles are placed at the midpoint of the C=O or C=C bond. R is the distance between the chromophores, which is calculated as the magnitude of Rij and has the unit of Å. ε is the dielectric constant of the medium and is taken to be 1. The conversion factor A, which depends on the unperturbed frequencies ωi of the chromophores, gives the coupling in cm−1; we take ωi to be 1650 cm−1 (Ref. 30) and 1627 cm−1 for the C=O and C=C vibrations, respectively. To compute βij using Eq. (10), one needs the direction and magnitude of the transition dipole of each chromophore. From our previous DFT calculations, we find that μi of a C=O chromophore points from O toward the N atom that is covalently bonded to the carbonyl group with an angle of 3.65° from the direction. For the pyrimidine bases, the N3 atom is used to define the direction of the transition dipoles. The magnitude of the carbonyl transition dipole is 2.57 D·Å−1·u−1/2,43 where u represents the atomic mass unit, consistent with the values determined from previous DFT calculations and 2D IR experiments.23,26 For a C=C group in cytosine, thymine, and uracil, μi is along the chemical bond pointing from C5 to C6 with a magnitude of 1.44 D·Å−1·u−1/2.43
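As an illustration of how a TDC coupling of the kind in Eq. (10) can be evaluated, the Python sketch below implements the standard point-dipole interaction, with the dipole–dipole term divided by R³ and the projection term divided by R⁵. We assume this standard form here. The conversion factor is left as a user-supplied `prefactor` because its expression is not reproduced in this section, so with the default value of 1 the output is in unconverted units rather than cm−1. The stacked geometry, function names, and the use of the 2.57 D·Å−1·u−1/2 carbonyl magnitude are illustrative choices of ours.

```python
import math


def _dot(u, w):
    """Dot product of two 3-vectors given as sequences."""
    return sum(x * y for x, y in zip(u, w))


def tdc_coupling(mu_i, mu_j, pos_i, pos_j, prefactor=1.0, epsilon=1.0):
    """Point-dipole coupling between transition dipoles placed at pos_i, pos_j.

    prefactor stands in for the unit-conversion factor (an assumption here);
    epsilon is the dielectric constant of the medium.
    """
    r_vec = [b - a for a, b in zip(pos_i, pos_j)]  # points from i to j
    r = math.sqrt(_dot(r_vec, r_vec))
    angular = (_dot(mu_i, mu_j) / r**3
               - 3.0 * _dot(mu_i, r_vec) * _dot(mu_j, r_vec) / r**5)
    return prefactor / epsilon * angular


MU_CO = 2.57    # carbonyl transition dipole magnitude quoted in the text
R_STACK = 3.5   # separation in Angstrom, as in the stacked mG dimer scan


def stacked_pair(phi_deg):
    """Two parallel-plane dipoles separated along z and twisted by phi."""
    phi = math.radians(phi_deg)
    mu_1 = (MU_CO, 0.0, 0.0)
    mu_2 = (MU_CO * math.cos(phi), MU_CO * math.sin(phi), 0.0)
    return tdc_coupling(mu_1, mu_2, (0.0, 0.0, 0.0), (0.0, 0.0, R_STACK))


for angle in (0, 60, 120):
    print(f"phi = {angle:3d} deg  coupling = {stacked_pair(angle):+.4f}")
```

In this idealized geometry both dipoles are perpendicular to the connecting vector, so the second term vanishes and the coupling scales as cos ϕ / R³, changing sign at ϕ = 90°.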
Using the TDC models, we repeat the calculations on the oligonucleotide test cases. We then use the coupling constants from the HMR method as reference values and calculate the error of the TCC and TDC models as ΔβTCC = βTCC − βHMR and ΔβTDC = βTDC − βHMR, respectively. As shown in Fig. 8, over the whole range of βHMR, ΔβTCC mostly fluctuates between −5 cm−1 and +5 cm−1 with an average of −0.06 cm−1. There is one outlier with a ΔβTCC of −9.40 cm−1, which corresponds to the U3 base from both strands in Fig. 3(i). In contrast, the TDC model systematically underestimates the coupling constants predicted from the HMR method, with ΔβTDC varying from −12.74 cm−1 to +11.00 cm−1. This underestimation is most prominent when the magnitude of βHMR is larger than 10 cm−1, which arises in three types of spatial arrangements of the nucleobases.
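The error measures used in this comparison can be sketched in a few lines of Python. The function below computes the signed errors Δβ = βmodel − βHMR, their average, their RMSD, and their range. The input values are arbitrary placeholders chosen by us for illustration, not data from this work.

```python
import math


def error_stats(beta_model, beta_ref):
    """Return (mean error, RMSD, min error, max error) of model vs reference."""
    errors = [m - r for m, r in zip(beta_model, beta_ref)]
    mean = sum(errors) / len(errors)
    rmsd = math.sqrt(sum(e * e for e in errors) / len(errors))
    return mean, rmsd, min(errors), max(errors)


# Placeholder couplings in cm^-1 (NOT values from the paper).
beta_hmr = [10.0, -8.0, 2.0, -1.0]
beta_model = [8.0, -7.0, 2.5, -1.5]

mean, rmsd, low, high = error_stats(beta_model, beta_hmr)
print(f"mean = {mean:+.2f}, RMSD = {rmsd:.2f}, range = [{low:+.2f}, {high:+.2f}]")
```

Note that a positive reference coupling that is underestimated gives a negative error, while a negative reference coupling that is underestimated in magnitude gives a positive error, which is why a systematic underestimation appears as errors of both signs.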
The first type is base pairing. For example, in a C–U base pair [Fig. 7(b)], βHMR between the C2=O groups of cytosine and uracil is +13.66 cm−1. Compared to a ΔβTCC of −3.91 cm−1, the TDC model underestimates the coupling constant by 8.41 cm−1. Similarly, in a G-U wobble pair with a βHMR of −15.87 cm−1, ΔβTDC is as large as 9.44 cm−1. These observations are further supported by the findings that the TDC method cannot quantitatively capture the C=O/C=O couplings between Watson–Crick base pairs.23,25 In these cases, R between the C=O groups is below 3.8 Å. The underestimation indicates that the dipole approximation is not sufficient to capture the full interactions and higher-order multipoles must be included to provide the correct description. We observe similar behavior for the C=O/C=C couplings in paired bases. As an example, the C–U base pair in Fig. 7(b) has a βHMR of +13.39 cm−1 between the C5=C6 group in cytosine and the C4=O group in uracil, for which ΔβTDC is −12.28 cm−1. Likewise, in the B-form double helix formed from DNA sequences AAA and TTT [Fig. 3(b)], the average βHMR between the C5=C6 group of adenine and the C4=O group of thymine is −11.59 cm−1, while the average ΔβTDC is 10.41 cm−1. Here, R between the C=O and C=C groups is 4.0–5.9 Å, slightly longer than that between the C=O groups. From Fig. 2, the atoms involved in the G44CO and C22CO models are around the C=O chromophore, whereas the atoms in the C=C models are distributed over the entire pyrimidine rings. For example, from DFT calculations of a Watson–Crick G-C pair in its optimized conformation, atoms N3 and N4 in cytosine contribute 83% of the C=O/C=C interactions. As a result, these C=O/C=C coupling constants remain large even when R between the C=O and C=C groups is long, which cannot be explained by the TDC model.
The second type is base stacking. For example, in an A-form double helix in Fig. 3(c), βHMR between the C5=C6 group of A1 and the C4=O group of T2 in strand 1 is −12.00 cm−1, for which ΔβTCC is 2.33 cm−1 and ΔβTDC is 9.94 cm−1. In another model A-DNA as shown in Fig. 3(b), βHMR of the C5=C6 vibrations of bases A2 and A3 is +10.29 cm−1, and the TDC model underestimates it by 9.00 cm−1. The third type consists of a few shifted stacking configurations that result in large coupling constants. As shown in Fig. 3(d), C1 of strand 1 partially stacks with C3 of strand 2, giving rise to a βHMR of +11.48 cm−1 between their C=O groups. For this case, ΔβTCC is comparable to ΔβTDC with values of −2.52 cm−1 and −3.51 cm−1, respectively. Similarly, in an A-form double helix [Fig. 3(c)], A4 of strand 1 and A2 of strand 2 form the shifted stacking geometry with a βHMR of 10.61 cm−1. For this system, ΔβTDC is −10.07 cm−1, much larger in magnitude than the ΔβTCC value of −4.50 cm−1.
From Fig. 8, the TDC models can quantitatively predict the coupling constants only when the interactions of the chromophores are relatively weak. To further evaluate their applicability, we compute the average ΔβTDC for a given βHMR and assume that the TDC models are adequate if the average ΔβTDC is less than 3 cm−1. This gives βHMR to be in a range of −5 to +5 cm−1, which corresponds to R between 3.2 Å and 15.7 Å. When R is relatively short (<6 Å), configurations with these small coupling constants have an average ϕ of 82.6°. As R increases to above 8 Å, both the TCC and TDC models give quantitatively correct predictions. For example, for these long-range interactions between the C=O groups, the RMSD of βTCC and βTDC are 0.53 cm−1 and 0.52 cm−1, respectively, as compared to the reference βHMR values. Likewise, the RMSD from the two models are 0.63 cm−1 and 0.80 cm−1, respectively, for the C=C/C=C interactions. Note that in our implementation of the TDC scheme, we placed the origin of the transition dipole moments at the midpoint of the C=O and C=C bonds and determined their magnitudes and directions from the DFT calculations of methylated guanine and thymine.43 One can possibly improve the performance of the TDC method by optimizing the location, magnitude, and direction of the transition dipoles using a set of model oligonucleotides.
IV. CONCLUSIONS
In this work, we develop a set of TCC models to describe the interactions of the C=O and C=C groups between nucleobases. By applying them to oligonucleotides with well-defined base pairing and stacking patterns, we demonstrate that the TCC models quantitatively capture the coupling constants obtained from DFT calculations and the HMR method, and thus provide an efficient way to obtain the inter-base interactions between the chromophores based on the structures of nucleic acids. As the TCC scheme is electrostatic in nature, it is not sufficient to describe the through-bond interactions between the chromophores in pyrimidine bases, and we provide these intra-base coupling constants by approximating them as constant values.43 Combining the inter- and intra-base couplings, we demonstrate how the stacking and pairing of bases result in characteristic absorption peaks in the IR spectra. We further compare the TCC and TDC models and show that the dipole approximation is only valid when βHMR is between −5 cm−1 and +5 cm−1, and that higher-order multipoles are required to capture stronger interactions between chromophores.
A widely used approach in spectroscopy modeling is the mixed quantum/classical approximation. For nucleobases in the 1600–1800 cm−1 region, we treat the vibrational subspace constituted by the C=O and C=C chromophores quantum mechanically, ignore the higher-frequency modes, and treat all the other degrees of freedom as a classical bath. Within this approach, the Hamiltonian in the vibrational subspace is of central importance. In a previous work, we developed vibrational frequency maps that effectively model the site frequencies of the C=O and C=C stretches from their local electrostatic environment, which provide the diagonal terms of the vibrational Hamiltonian.43 The TCC models, in conjunction with the through-bond coupling constants, allow one to accurately and efficiently acquire the off-diagonal elements. Together, these methods provide a theoretical framework to predict the vibrational Hamiltonian, and hence the vibrational spectra of nucleic acids, directly from classical MD simulations. This framework thus bridges atomistic simulations and vibrational spectroscopy experiments and helps elucidate, at the molecular level, the structure and dynamics of nucleic acids that give rise to the observed spectral features.
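The assembly of the vibrational Hamiltonian described above can be sketched as follows. This is only a schematic under our own assumptions: the function `build_hamiltonian`, the site tuples, and the two callables are hypothetical names, the frequency map and TCC model are represented by placeholder functions, and only the intra-base constants are taken from Table VI.

```python
from itertools import combinations

# Through-bond (intra-base) couplings from Table VI, in cm^-1.
INTRA_BASE = {
    ("C", frozenset({"C2=O", "C5=C6"})): -10.57,
    ("T", frozenset({"C2=O", "C4=O"})): 17.27,
    ("T", frozenset({"C2=O", "C5=C6"})): -8.16,
    ("T", frozenset({"C4=O", "C5=C6"})): 6.26,
    ("U", frozenset({"C2=O", "C4=O"})): 18.55,
    ("U", frozenset({"C2=O", "C5=C6"})): -9.35,
    ("U", frozenset({"C4=O", "C5=C6"})): 16.20,
}


def build_hamiltonian(sites, site_frequency, inter_base_coupling):
    """Assemble a symmetric Hamiltonian for one simulation snapshot.

    sites: list of (residue_id, base, chromophore) tuples.
    site_frequency: callable giving the diagonal term of a site
        (stands in for a frequency map).
    inter_base_coupling: callable giving the coupling of two sites on
        different bases (stands in for a TCC model).
    """
    n = len(sites)
    h = [[0.0] * n for _ in range(n)]
    for i, site in enumerate(sites):
        h[i][i] = site_frequency(site)
    for i, j in combinations(range(n), 2):
        res_i, base_i, chrom_i = sites[i]
        res_j, _, chrom_j = sites[j]
        if res_i == res_j:
            # Same base: constant through-bond coupling.
            beta = INTRA_BASE[(base_i, frozenset({chrom_i, chrom_j}))]
        else:
            # Different bases: geometry-dependent through-space coupling.
            beta = inter_base_coupling(sites[i], sites[j])
        h[i][j] = h[j][i] = beta
    return h


# Placeholder inputs: site frequencies of the vacuum G-C dimer quoted in the
# text, and a constant stand-in for the inter-base coupling model.
DEMO_FREQ = {"C6=O": 1675.55, "C2=O": 1652.62, "C5=C6": 1636.67}
demo_sites = [(1, "G", "C6=O"), (2, "C", "C2=O"), (2, "C", "C5=C6")]
hamiltonian = build_hamiltonian(
    demo_sites,
    site_frequency=lambda site: DEMO_FREQ[site[2]],
    inter_base_coupling=lambda a, b: -5.0,  # placeholder, not a TCC value
)
for row in hamiltonian:
    print("  ".join(f"{x:9.2f}" for x in row))
```

In an actual workflow the two callables would be evaluated for every MD snapshot, so that the diagonal and inter-base off-diagonal elements fluctuate with the structure while the intra-base couplings stay fixed.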
ACKNOWLEDGMENTS
L.W. would like to thank Professor Wilma Olson for the helpful discussions of non-canonical base pairing in nucleic acids. L.W. acknowledges the support from the National Institutes of Health (Award No. R01GM130697). The authors acknowledge the Office of Advanced Research Computing at Rutgers University for providing access to the Amarel server.
Note: This paper is part of the JCP Emerging Investigators Special Collection.
REFERENCES
- 1. Buske F. A., Mattick J. S., and Bailey T. L., RNA Biol. 8, 427 (2011). 10.4161/rna.8.3.14999
- 2. Nikolova E. N., Zhou H., Gottardo F. L., Alvey H. S., Kimsey I. J., and Al-Hashimi H. M., Biopolymers 99, 955 (2013). 10.1002/bip.22334
- 3. Rhodes D. and Lipps H. J., Nucleic Acids Res. 43, 8627 (2015). 10.1093/nar/gkv862
- 4. Hänsel-Hertsch R., Di Antonio M., and Balasubramanian S., Nat. Rev. Mol. Cell Biol. 18, 279 (2017). 10.1038/nrm.2017.3
- 5. Crick F. H. C., J. Mol. Biol. 19, 548 (1966). 10.1016/s0022-2836(66)80022-0
- 6. Varani G. and McClain W. H., EMBO Rep. 1, 18 (2000). 10.1093/embo-reports/kvd001
- 7. Agris P. F., Eruysal E. R., Narendran A., Väre V. Y. P., Vangaveti S., and Ranganathan S. V., RNA Biol. 15, 537 (2018). 10.1080/15476286.2017.1356562
- 8. Peng C. S., Jones K. C., and Tokmakoff A., J. Am. Chem. Soc. 133, 15650 (2011). 10.1021/ja205636h
- 9. Banyay M., Sarkar M., and Graslund A., Biophys. Chem. 104, 477 (2003). 10.1016/s0301-4622(03)00035-8
- 10. Taillandier E. and Liquier J., “DNA structures part A: Synthesis and physical analysis of DNA,” Methods Enzymol. 211, 307 (1992). 10.1016/0076-6879(92)11018-e
- 11. Liquier J. and Taillandier E., in Infrared Spectroscopy of Biomolecules, edited by Mantsch H. H. and Chapman D. (Wiley-Liss, New York, 1996), Chap. 6, p. 131.
- 12. Stancik A. L. and Brauns E. B., Biochemistry 47, 10834 (2008). 10.1021/bi801170c
- 13. Stelling A. L., Xu Y., Zhou H., Choi S. H., Clay M. C., Merriman D. K., and Al-Hashimi H. M., FEBS Lett. 591, 1770 (2017). 10.1002/1873-3468.12681
- 14. Szyc L., Yang M., Nibbering E. T. J., and Elsaesser T., Angew. Chem., Int. Ed. 49, 3598 (2010). 10.1002/anie.200905693
- 15. Peng C. S., Baiz C. R., and Tokmakoff A., Proc. Natl. Acad. Sci. U. S. A. 110, 9243 (2013). 10.1073/pnas.1303235110
- 16. Sanstead P. J., Stevenson P., and Tokmakoff A., J. Am. Chem. Soc. 138, 11792 (2016). 10.1021/jacs.6b05854
- 17. Dai Q., Sanstead P. J., Peng C. S., Han D., He C., and Tokmakoff A., ACS Chem. Biol. 11, 470 (2016). 10.1021/acschembio.5b00762
- 18. Hithell G., Gonzalez-Jimenez M., Greetham G. M., Donaldson P. M., Towrie M., Parker A. W., Burley G. A., Wynne K., and Hunt N. T., Phys. Chem. Chem. Phys. 19, 10333 (2017). 10.1039/c7cp00054e
- 19. Hithell G., Donaldson P. M., Greetham G. M., Towrie M., Parker A. W., Burley G. A., and Hunt N. T., Chem. Phys. 512, 154 (2018). 10.1016/j.chemphys.2017.12.010
- 20. Sanstead P. J. and Tokmakoff A., J. Phys. Chem. B 122, 3088 (2018). 10.1021/acs.jpcb.8b01445
- 21. Fritzsch R., Greetham G. M., Clark I. P., Minnes L., Towrie M., Parker A. W., and Hunt N. T., J. Phys. Chem. B 123, 6188 (2019). 10.1021/acs.jpcb.9b04354
- 22. Krummel A. T., Mukherjee P., and Zanni M. T., J. Phys. Chem. B 107, 9165 (2003). 10.1021/jp035473h
- 23. Krummel A. T. and Zanni M. T., J. Phys. Chem. B 110, 13991 (2006). 10.1021/jp062597w
- 24. Hithell G., Shaw D. J., Donaldson P. M., Greetham G. M., Towrie M., Burley G. A., Parker A. W., and Hunt N. T., J. Phys. Chem. B 120, 4009 (2016). 10.1021/acs.jpcb.6b02112
- 25. Lee C., Park K.-H., and Cho M., J. Chem. Phys. 125, 114508 (2006). 10.1063/1.2213257
- 26. Lee C. and Cho M., J. Chem. Phys. 125, 114509 (2006). 10.1063/1.2213258
- 27. Lee C., Park K.-H., Kim J.-A., Hahn S., and Cho M., J. Chem. Phys. 125, 114510 (2006). 10.1063/1.2213259
- 28. Lee C. and Cho M., J. Chem. Phys. 126, 145102 (2007). 10.1063/1.2715602
- 29. Krimm S. and Abe Y., Proc. Natl. Acad. Sci. U. S. A. 69, 2788 (1972). 10.1073/pnas.69.10.2788
- 30. Torii H. and Tasumi M., in Infrared Spectroscopy of Biomolecules, edited by Mantsch H. H. and Chapman D. (Wiley-Liss, 1996), Chap. 1, pp. 1–18.
- 31. Hamm P. and Woutersen S., Bull. Chem. Soc. Jpn. 75, 985 (2002). 10.1246/bcsj.75.985
- 32. Moran A. and Mukamel S., Proc. Natl. Acad. Sci. U. S. A. 101, 506 (2004). 10.1073/pnas.2533089100
- 33. Jansen T. L. C., Dijkstra A. G., Watson T. M., Hirst J. D., and Knoester J., J. Chem. Phys. 125, 044312 (2006). 10.1063/1.2218516
- 34. Moore W. H. and Krimm S., Proc. Natl. Acad. Sci. U. S. A. 72, 4933 (1975). 10.1073/pnas.72.12.4933
- 35. Krimm S. and Bandekar J., Adv. Protein Chem. 38, 181 (1986). 10.1016/s0065-3233(08)60528-8
- 36. Torii H. and Tasumi M., J. Raman Spectrosc. 29, 81 (1998).
- 37. Zhuang W., Abramavicius D., Hayashi T., and Mukamel S., J. Phys. Chem. B 110, 3362 (2006). 10.1021/jp055813u
- 38. Woys A. M., Almeida A. M., Wang L., Chiu C.-C., McGovern M., de Pablo J. J., Skinner J. L., Gellman S. H., and Zanni M. T., J. Am. Chem. Soc. 134, 19118 (2012). 10.1021/ja3074962
- 39. Cunha A. V., Bondarenko A. S., and Jansen T. L. C., J. Chem. Theory Comput. 12, 3982 (2016). 10.1021/acs.jctc.6b00420
- 40. Jansen T. L. C. and Knoester J., J. Chem. Phys. 124, 044502 (2006). 10.1063/1.2148409
- 41. Ham S., Cha S., Choi J.-H., and Cho M., J. Chem. Phys. 119, 1451 (2003). 10.1063/1.1581855
- 42. Choi J.-H., Ham S., and Cho M., J. Phys. Chem. B 107, 9132 (2003). 10.1021/jp034835i
- 43. Jiang Y. and Wang L., J. Phys. Chem. B 123, 5791 (2019). 10.1021/acs.jpcb.9b04633
- 44. Cheney E. and Kincaid D., Linear Algebra: Theory and Applications (Jones and Bartlett Publishers, Sudbury, MA, 2009).
- 45. Macke T. J. and Case D. A., in Molecular Modeling of Nucleic Acids, edited by Leontis N. B. and SantaLucia J. Jr. (American Chemical Society, Washington, D. C., 1997), Vol. 682, pp. 379–393.
- 46. Case D., Betz R., Cerutti D., Cheatham T., Darden T., Duke R., Giese T., Gohlke H., Goetz A., Homeyer N., Izadi S., Janowski P., Kaus J., Kovalenko A., Lee T., LeGrand S., Li P., Lin C., Luchko T., Luo R., Madej B., Mermelstein D., Merz K., Monard G., Nguyen H., Nguyen H., Omelyan I., Onufriev A., Roe D., Roitberg A., Sagui C., Simmerling C., Botello-Smith W., Swails J., Walker R., Wang J., Wolf R., Wu X., Xiao L., and Kollman P., AMBER 2016 (University of California, San Francisco, 2016).
- 47. Luo Z., Dauter Z., and Gilski M., Acta Crystallogr., Sect. D: Struct. Biol. 73, 940 (2017). 10.1107/s2059798317014954
- 48. Radhakrishnan I. and Patel D. J., Structure 1, 135 (1993). 10.1016/0969-2126(93)90028-f
- 49. Bakalar B., Heddi B., Schmitt E., Mechulam Y., and Phan A. T., Angew. Chem., Int. Ed. 58, 2331 (2019). 10.1002/anie.201812628
- 50. Nowacka M., Fernandes H., Kiliszek A., Bernat A., Lach G., and Bujnicki J. M., PLoS One 14, e0214481 (2019). 10.1371/journal.pone.0214481
- 51. Biswas R., Wahl M. C., Ban C., and Sundaralingam M., J. Mol. Biol. 267, 1149 (1997). 10.1006/jmbi.1997.0936
- 52. Berger K. D., Kennedy S. D., and Turner D. H., Biochemistry 58, 1094 (2019). 10.1021/acs.biochem.8b01027
- 53. Becke A. D., J. Chem. Phys. 98, 5648 (1993). 10.1063/1.464913
- 54. Grimme S., Antony J., Ehrlich S., and Krieg H., J. Chem. Phys. 132, 154104 (2010). 10.1063/1.3382344
- 55. Frisch M. J., Trucks G. W., Schlegel H. B., Scuseria G. E., Robb M. A., Cheeseman J. R., Scalmani G., Barone V., Petersson G. A., Nakatsuji H., Li X., Caricato M., Marenich A. V., Bloino J., Janesko B. G., Gomperts R., Mennucci B., Hratchian H. P., Ortiz J. V., Izmaylov A. F., Sonnenberg J. L., Williams-Young D., Ding F., Lipparini F., Egidi F., Goings J., Peng B., Petrone A., Henderson T., Ranasinghe D., Zakrzewski V. G., Gao J., Rega N., Zheng G., Liang W., Hada M., Ehara M., Toyota K., Fukuda R., Hasegawa J., Ishida M., Nakajima T., Honda Y., Kitao O., Nakai H., Vreven T., Throssell K., J. A. Montgomery, Jr., Peralta J. E., Ogliaro F., Bearpark M. J., Heyd J. J., Brothers E. N., Kudin K. N., Staroverov V. N., Keith T. A., Kobayashi R., Normand J., Raghavachari K., Rendell A. P., Burant J. C., Iyengar S. S., Tomasi J., Cossi M., Millam J. M., Klene M., Adamo C., Cammi R., Ochterski J. W., Martin R. L., Morokuma K., Farkas O., Foresman J. B., and Fox D. J., Gaussian 16 Revision A.03, Gaussian Inc, Wallingford CT, 2016.
- 56. Andersson M. P. and Uvdal P., J. Phys. Chem. A 109, 2937 (2005). 10.1021/jp045733a
- 57. Mulliken R. S., J. Chem. Phys. 23, 1833 (1955). 10.1063/1.1740588
- 58. Mulliken R. S., J. Chem. Phys. 36, 3428 (1962). 10.1063/1.1732476
- 59. Hamm P. and Zanni M. T., Concepts and Methods of 2D Infrared Spectroscopy (Cambridge University Press, 2011).
- 60. Reed A. E., Weinstock R. B., and Weinhold F., J. Chem. Phys. 83, 735 (1985). 10.1063/1.449486
- 61. Watson J. D. and Crick F. H. C., Nature 171, 737 (1953). 10.1038/171737a0