Abstract

Ribonucleotide:protein interactions play crucial roles in a number of biological processes. Unlike the RNA:protein interface where van der Waals contacts are prevalent, the recognition of a single ribonucleotide such as ATP by a protein occurs predominantly through hydrogen-bonding interactions. As a first step toward understanding the role of hydrogen bonding in ribonucleotide:protein recognition, the present work employs density functional theory to provide a detailed quantum-mechanical analysis of the structural and energetic characteristics of 18 unique hydrogen-bonded pairs involving the nucleobase/nucleoside moiety of four canonical ribonucleotides and the side chains of three polar amino-acid residues (arginine, glutamine, and glutamic acid) of proteins. In addition, we model five new pairs that have not yet been observed in crystallographically identified ribonucleotide:protein complexes but may be identified in complexes crystallized in the future. We critically examine the characteristics of each pair in its ribonucleotide:protein crystal structure occurrence and in its (gas phase and water phase) optimized intrinsic structure. We further evaluate the interaction energy of each pair and characterize the associated hydrogen bonds using a number of quantum mechanics-based relationships including natural bond orbital analysis, quantum theory of atoms in molecules analysis, Iogansen relationships, Nikolaienko–Bulavin–Hovorun relationships, and noncovalent interaction–reduced density gradient analysis. Our analyses reveal rich variability in hydrogen bonds in the crystallographic as well as intrinsic structure of each pair, which includes conventional O/N–H···N/O and C–H···O hydrogen bonds as well as donor/acceptor-bifurcated hydrogen bonds. Further, we identify five combinations of nucleobase and amino acid moieties, each of which exhibits at least two alternate (i.e., multimodal) structures that interact through the same nucleobase edge. In fact, one such pair exhibits four multimodal structures, one of which possesses unconventional “amino–acceptor” hydrogen bonding with comparable (−9.4 kcal mol–1) strength to the corresponding conventional (i.e., amino–donor) structure (−9.2 kcal mol–1). This points to the importance of amino–acceptor hydrogen bonds in RNA:protein interactions and suggests that such interactions must be considered in the future while studying the dynamics in the context of molecular recognition. Overall, our study provides preliminary insights into the intrinsic features of ribonucleotide:amino acid interactions, which may help frame a clearer picture of the molecular basis of RNA:protein recognition and further appreciate the role of such contacts in biology.
1. Introduction
Ribonucleotide:protein interactions play vital roles in the catalytic activity of a number of enzymes where ribonucleotides (rNs) such as adenosine triphosphate and cytidine triphosphate act as cofactors. Such enzymes include kinases, G-proteins, motor proteins, and chaperones.1,2 In contrast to RNA:protein recognition that primarily occurs at the broad and nonspecific macromolecular interface, the interaction of a single rN occurs at the specific, well-defined deep binding site of a protein.3 Consequently, although RNA:protein complexes involve significant van der Waals contacts at the interface and a relatively smaller proportion of hydrogen-bonding contacts, rN:protein interactions predominantly involve hydrogen bonding between rNs and amino acid residues of proteins.4−7
Considerable effort has been put forth in the literature to understand the structural principles of RNA:protein recognition. This has resulted in a number of studies that identified the noncovalent contacts between RNA and protein residues in the crystal structures of RNA:protein complexes.4,8−19 In addition to revealing the preponderance of van der Waals contacts,9,10,20,21 a few studies have specifically focused on the diversity of hydrogen bonds at the RNA:protein interface. For example, Allers et al. revealed that RNA:protein contacts mainly involve the interaction of hydrogen-bonding edges of nucleobases and the amide or carbonyl groups of the protein backbone.4 Similarly, Jeong et al.12 and Kim et al.13 carried out a statistical analysis on the nature of hydrogen bonds between amino acid residues and rNs in RNA:protein complexes and observed a number of interactions involving polar or charged amino acids. Further, Lejeune et al.15 observed that, in contrast to DNA:protein hydrogen-bonding interactions that frequently involve the phosphate groups, analogous RNA:protein contacts more commonly involve nucleobase edges and the ribose sugar. Subsequent work by Ellis et al.16 revealed that proteins complexed with mRNA, tRNA, or viral RNA have a greater propensity to form nucleobase-specific contacts than those complexed with rRNA. Along similar lines, Morozova et al.17 suggested that RNA:protein recognition through a small number of specific hydrogen-bonding contacts depends on the uniqueness in shape and donor–acceptor locations on each RNA nucleobase. Furthermore, the intrinsic structures and strengths of π–π contacts22,23 and hydrogen-bonding interactions5,24,25 between nucleobases and proteins in specific nucleic acid:protein complexes have been explored using quantum mechanical methods.
In contrast to the substantial number of studies on RNA:protein recognition, very few studies have specifically focused on the characteristics of the interaction of a single rN with a protein. Nevertheless, Kondo and Westhof carried out a statistical analysis of 446 crystal structures of rN:protein complexes in an attempt to understand the structural principles of hydrogen-bonding contacts between rNs and proteins.3 The study revealed that the hydrogen-bonding interactions of a nucleotide can occur with the amino acid side chains as well as the peptide backbone of a protein. The interactions were further classified based on the identity of the nucleobase edge (i.e., the Watson–Crick (W) edge, Hoogsteen (H) edge, or sugar (S) edge) that interacts with the protein (Figure 1a–d).
Figure 1.
(a–d) Structures, chemical numbering, and interacting edges of the four RNA ribonucleosides. (e–g) Structures and carbon numbering of the amino acids considered for studying ribonucleotide:amino acid hydrogen-bonding interactions in nucleotide:protein complexes. The side-chain atoms of R, E, or Q, the amino acids that interact with ribonucleosides, are marked.
Despite the existence of Kondo and Westhof’s classification scheme,3 information on the intrinsic structure and strength of rN:protein interactions is currently missing in the literature. This stems from the fact that, although structure determination techniques such as NMR, X-ray crystallography, and electron microscopy provide useful information on the noncovalent interactions within the macromolecular context, these techniques are unable to determine the intrinsic (i.e., isolated) nature of interactions between the two noncovalently bonded constituent entities (e.g., nucleobase:amino acid pairs). Further, such techniques cannot determine the strength of noncovalent interactions in the macromolecular as well as in the isolated context. However, the structure and strength of such contributing interactions can be quantified with reasonable accuracy and reliability using nonempirical QM methods. Indeed, QM methods have played a pivotal role in understanding the nature of contacts involved in RNA base pairing,26−32 nucleobase:protein hydrogen bonding in DNA:protein complexes,5,33 and nucleobase:amino acid stacking in RNA:protein complexes.21 This suggests that QM calculations on rN:protein complexes can contribute toward providing an improved understanding of the RNA:protein interaction landscape.
As a first step toward understanding the nature of rN:protein interactions, our recent study carried out QM analysis on the pairs of RNA bases and the side chains of aspartic acid (D) or asparagine (N) residues occurring in rN:protein crystal structures.25 Through comparison of the crystal-structure geometries and isolated (i.e., fully optimized) geometries of these complexes, our analysis revealed the rich variability in the structure and strengths of rN:protein hydrogen bonding. Although analogous studies involving other amino acids can provide a comparative analysis of the role of amino-acid side chains in determining the structure and strength of nucleotide:protein contacts, such an analysis is currently missing in the literature.
To fill this void, in the present work, we extend the previous efforts toward understanding the interaction landscape of nucleobases25−28,34−37 by analyzing the structure and strength of hydrogen bonding between four canonical rNs and the side chains of three polar amino acids, arginine (R), glutamic acid (E), and glutamine (Q). Specifically, the analyzed pairs involve the interaction of R, E, or Q through the amide (Amd), carboxylic (Car), or guanidinium (Gnm) group of the amino acid side chains with one of the rN edges (i.e., W edge, H edge, or S edge, Figure 1). These pairs are denoted as A/G/C/U:R/E/Q W/H/S:Amd/Car/Gnm, depending on the identity of the nucleobase constituting the rN (adenine (A), guanine (G), cytosine (C), or uracil (U)), interacting amino acid (R, E, or Q), nucleobase edge (W, H, or S), and the amino acid side chain (Amd, Car, or Gnm, Figures 1 to 9). In addition to the analysis of structure and strength of complexes in their crystal structure occurrences (Table S1) and isolated gas-phase or solvent (water)-phase optimized geometries, we propose structures and reveal the strength of five hitherto unknown pairs by modeling them based on other available pairs. Overall, our analysis is expected to help derive a better understanding of the rules governing the rN:protein interaction at the atomic level and thus contribute toward further understanding the structural principles of RNA:protein recognition.
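The pair nomenclature described above lends itself to a small parser. The following is a minimal sketch, assuming labels are written as in the text (for example "G:E W:Car (IIb)" or "C:R W:Gnm"); the names `PairLabel` and `parse_pair` are hypothetical and introduced here only for illustration.

```python
import re
from typing import NamedTuple, Optional


class PairLabel(NamedTuple):
    """Components of a label such as 'G:E W:Car (IIb)' (hypothetical helper type)."""
    base: str            # nucleobase: A, G, C, or U
    residue: str         # amino acid: R, E, or Q
    edge: str            # nucleobase edge: W, H, or S
    group: str           # side-chain group: Amd, Car, or Gnm
    mode: Optional[str]  # multimodal designation (I, II, IIa, IIb) or None


# Optional trailing "(I)", "(II)", "(IIa)", or "(IIb)", with or without a space.
_LABEL = re.compile(
    r"^([AGCU]):([REQ]) ([WHS]):(Amd|Car|Gnm)(?: ?\((I{1,2}[ab]?)\))?$"
)


def parse_pair(label: str) -> PairLabel:
    """Split a pair label into its five components."""
    match = _LABEL.match(label.strip())
    if match is None:
        raise ValueError(f"unrecognized pair label: {label!r}")
    return PairLabel(*match.groups())


if __name__ == "__main__":
    print(parse_pair("G:E W:Car (IIb)"))
    print(parse_pair("C:R W:Gnm"))
```

Grouping parsed labels by `base` or `edge` would then reproduce the kind of tallies reported for the crystal-structure dataset.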
Figure 9.

IEFPCM-B3LYP/6-31G(d,p) solvent (water)-phase fully optimized geometries of nucleobase:amino acid corresponding to the gas-phase optimized geometries of the pairs shown in Figures 6 and 7. Respective superpositions of gas-phase fully optimized (red) and solvent-phase fully optimized (blue) geometries are provided. Hydrogen–acceptor distances (Å) and IEFPCM-B3LYP/6-311+G(2df,p) interaction energies (in parentheses, kcal mol–1) are provided for each solvent-phase fully optimized pair, and rmsd values (parentheses, Å) are provided for each structural superposition.
2. Results and Discussion
2.1. Occurrence of rN:Amino Acid Pairs in rN:Protein Crystal Structures
The 16 unique rN:amino acid pairs considered in the present work represent the rich diversity in combinations of nucleobases and nucleobase edges that interact with proteins and form rN:amino acid pairs (Table S1). Specifically, in terms of the identity of the interacting nucleobase, eight pairs involve A, one pair involves C, four pairs involve G, and three pairs involve U (Figure 2). Alternatively, in terms of the interacting nucleobase edge, nine pairs involve the W edge, six pairs involve the H edge, and one pair involves the S edge.
Figure 2.

(a) Hydrogen-bonded ribonucleotide:amino acid pairs showing a single occurrence in the crystal structures of ribonucleotide:protein complexes (see Table S1 for details). (b–d) Superposition of the multiple occurrences of each considered ribonucleotide:amino acid pair containing (b) E, (c) Q, and (d) R residues present in the crystal structures of ribonucleotide:protein complexes.3 Only the nucleobase moieties of the nucleotides are shown for clarity. The nucleobase moieties of different occurrences of the pairs are superposed, while deviation is shown for the amino-acid moiety. Pairs designated with I and II constitute multimodal structures.
Four of the pairs show a single occurrence in the crystal structure dataset;3 these include two pairs involving E, one pair involving Q, and one pair involving R (Figure 2a). However, the remaining 12 pairs exhibit multiple occurrences3 and include 4 pairs involving E (Figure 2b), 5 pairs involving Q (Figure 2c), and 3 pairs involving R (Figure 2d). Structural superimposition of the heavy-atom coordinates of the multiple crystal occurrences of each pair reveals the relative conformational differences in each context (Figure 2).
Six of the total 16 pairs possess unique hydrogen-bonding patterns corresponding to each edge interaction and are thus designated as “unimodal pairs” (Figure 2). In contrast, the remaining 10 pairs can be divided into five categories where the two structures within each category (designated with symbols I and II, Figure 2) interact through the same nucleobase edge, albeit with different hydrogen-bonding patterns.3 These structures exhibit multimodality in nucleobase edge interactions and are therefore designated as “multimodal pairs.”
2.2. Hydrogen-Only Optimizations
Hydrogen-only optimizations were carried out to identify the hydrogen bonds between R, E, or Q and rNs within the native crystal environment. These calculations reveal rich variability in hydrogen bonds among the six unimodal pairs (Figure 3a). Specifically, one of these pairs possesses a single N–H···O hydrogen bond, one pair possesses donor-bifurcated hydrogen bonding, and one pair possesses acceptor-bifurcated hydrogen bonding (Figure 3a). However, each of the remaining three pairs possesses two conventional donor–acceptor hydrogen bonds (Figure 3a). Due to variation in the nature of hydrogen bonds, these pairs exhibit a wide range (−3.6 to −34.8 kcal mol–1) in binding energies (Figure 3a).
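For readers unfamiliar with how such binding energies are typically obtained, the sketch below shows the standard supermolecular definition, in which the energies of the isolated monomers are subtracted from the energy of the pair. This is an assumption made for illustration: this section does not state whether counterpoise or deformation corrections were applied, and the function name and the input energies are hypothetical.

```python
# Conversion factor from hartree to kcal/mol.
HARTREE_TO_KCAL = 627.5095


def interaction_energy(e_pair: float, e_base: float, e_residue: float) -> float:
    """Supermolecular interaction energy in kcal/mol.

    All inputs are electronic energies in hartree. A negative result
    indicates a stabilizing interaction. No BSSE or deformation
    correction is included in this sketch.
    """
    return (e_pair - e_base - e_residue) * HARTREE_TO_KCAL


if __name__ == "__main__":
    # Hypothetical energies chosen only to illustrate the arithmetic.
    print(round(interaction_energy(-1.02, -0.50, -0.50), 2))
```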
Figure 3.

B3LYP/6-31G(d,p) gas-phase hydrogen-only-optimized geometries and associated B3LYP/6-311+G(2df,p) gas-phase interaction energies (in parentheses, kcal mol–1) of the (a) six unimodal structures that involve a unique interaction with the same edge of nucleobases and (b) five pairs of (10) multimodal structures where structures within each multimodal pair are designated as I and II. Hydrogen–acceptor distances (Å) are provided for each hydrogen bond.
In contrast to the unimodal pairs, one structure within each of the five multimodal pairs (Figure 3b) possesses either a single (N–H···O) bond or acceptor-bifurcated hydrogen bond, whereas the other structure invariably possesses two conventional donor–acceptor hydrogen bonds (Figure 3b). In terms of binding energies, the two structures of G:E W:Car (i.e., I and II, Figure 3b) possess similar (i.e., within 0.4 kcal mol–1) interaction energies, mainly since both structures possess two hydrogen bonds of a conventional/bifurcated type. In contrast, due to a difference in the number of hydrogen bonds, the two structures within each of the remaining four multimodal pairs exhibit a significant (4.4 to 21.1 kcal mol–1, Figure 3b) difference in binding energies.
In addition to the above pairs, two unique variants could be derived for G:E W:Car (II), depending on the initial positioning of the added hydrogen atoms. Specifically, although the parent structure (G:E W:Car (II)) possesses a hydrogen on the carboxylic oxygen that interacts with the N1 of G, a pair of alternate structures (IIa and IIb) involves attachment of the hydrogen atom to the carboxylic oxygen that interacts with the N2 of G (Figure 4). Although structures II and IIa are similar in terms of the planarity of the N2 amino group of G and the involvement of this amino group as a hydrogen-bond donor, structure IIb utilizes the amino nitrogen as a hydrogen-bond acceptor, where the amino hydrogens attain a nonplanar orientation with respect to the nucleobase skeleton due to pyramidalization. As a result, the nitrogen interacts with the hydroxyl donor of the carboxylic group of E (Figure 4). Binding energies reveal that the “amino–acceptor” structure (IIb) is 2.9 kcal mol–1 more stable than the competing “conventional” structure (IIa). Regardless, the structural similarity and close energetic separation suggest that the interconversion between the conventional and amino–acceptor structures may not require any significant geometrical or energetic penalty. Consequently, such low-barrier interconversions may be helpful in dynamics at the RNA:protein interface. Owing to their potential importance in RNA dynamics, such interactions must be investigated in detail in the future. Similar interactions have previously been described in RNA base pairs38 and more recently in contextual RNA:protein hydrogen bonding.25
Figure 4.

B3LYP/6-31G(d,p) gas-phase hydrogen-only-optimized geometries of two additional variants (i.e., IIa and IIb) of the G:E W:Car (II) pair. Hydrogen–acceptor distances (Å) for each hydrogen bond and B3LYP/6-311+G(2df,p) gas-phase interaction energies (in parentheses, kcal mol–1) are provided.
2.3. Full Optimizations
Full optimizations were carried out to reveal the intrinsic hydrogen-bonding characteristics of the pairs in their free (i.e., isolated) forms. In the absence of the macromolecular environment, full optimizations maximize the number and strength of hydrogen bonds in each pair. Thus, the fully optimized geometries represent the ideal geometries that would be obtained in the gas phase in the absence of macromolecular effects.
On full optimization, 11 of the total 18 pairs (16 original pairs and 2 G:E W:Car (II) variants (IIa and IIb)) retain the hydrogen-bonding pattern present in the starting crystal structures (Figure 5). Except A:E W:Car (II), which possesses a single hydrogen bond, all these pairs involve two strong (i.e., N–H···N/O type) hydrogen bonds in the crystal structures and fully optimized structures. However, due to substantial changes in amino acid conformation on full optimization, many of these pairs show substantial (up to 1.757 Å in rmsd, Figure 5) deviation between hydrogen-only-optimized and fully optimized geometries. Regardless, due to significant stability provided by the hydrogen bond(s), these pairs possess high (−9.4 to −33.6 kcal mol–1, Figure 5) interaction energies, which are comparable to the interaction energies of the corresponding hydrogen-only-optimized pairs (−8.2 to −34.8 kcal mol–1, Figure 3).
Figure 5.

B3LYP/6-31G(d,p) gas-phase fully optimized geometries of the pairs that retain the crystallographically determined hydrogen-bonding pattern. Optimized hydrogen–acceptor distances (Å) and B3LYP/6-311+G(2df,p) gas-phase interaction energies (in parentheses, kcal mol–1) are provided. Superposition of the fully optimized (blue) and hydrogen-only-optimized geometries (red) and respective rmsd values (Å) are also provided for each pair.
Figure 6.

B3LYP/6-31G(d,p) gas-phase fully optimized geometries of the pairs that form an additional hydrogen bond on full optimization of the crystal geometry. Optimized hydrogen–acceptor distances (Å) and B3LYP/6-311+G(2df,p) gas-phase interaction energies (in parentheses, kcal mol–1) are provided. Superposition of the fully optimized (blue) and hydrogen-only-optimized geometries (red) and respective rmsd values (Å) are also provided for each pair.
In contrast, five pairs involve the formation of additional hydrogen bonds on full optimization (Figure 6). Due to the greater change in structure resulting from the change in hydrogen bonding, these pairs possess higher (0.950–2.694 Å, Figure 6) rmsd values between hydrogen-only-optimized and fully optimized structures compared to the pairs that do not change in hydrogen bonding on full optimization (0.395–1.757 Å, Figure 5). Further, these pairs possess a significantly stronger interaction (−5.1 to −23.0 kcal mol–1, Figure 6) than that of their hydrogen-only-optimized counterparts (−0.9 to −5.5 kcal mol–1, Figure 3), mainly due to an increase in the number of hydrogen bonds.
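The rmsd values quoted for each superposition follow the usual root-mean-square definition over matched atoms. The following is a minimal sketch, assuming the two geometries list the same atoms in the same order and have already been superposed; the fitting procedure used for the superposition is not described in this section, so none is implemented here, and the function name is hypothetical.

```python
import math
from typing import Sequence, Tuple

Coord = Tuple[float, float, float]


def rmsd(geom_a: Sequence[Coord], geom_b: Sequence[Coord]) -> float:
    """Root-mean-square deviation (same units as the input coordinates).

    Assumes both geometries are already superposed and that atom i in
    geom_a corresponds to atom i in geom_b.
    """
    if len(geom_a) != len(geom_b) or not geom_a:
        raise ValueError("geometries must be non-empty and of equal length")
    squared = sum(
        (p - q) ** 2
        for atom_a, atom_b in zip(geom_a, geom_b)
        for p, q in zip(atom_a, atom_b)
    )
    return math.sqrt(squared / len(geom_a))


if __name__ == "__main__":
    # Hypothetical two-atom geometries displaced by 1 Å along z.
    a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
    b = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
    print(rmsd(a, b))
```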
On the other hand, two pairs (i.e., G:E W:Car (I and II)) undergo change in their hydrogen bonding pattern on full optimization. Specifically, full optimization of these structures leads to breaking of hydrogen bonds involving the amino group of G and formation of hydrogen bonds involving O6 of G (Figure 7). Further, due to optimization of the hydrogen-bonding pattern, the fully optimized geometries of these complexes possess significantly higher binding energies (−24.2 to −24.4 kcal mol–1, Figure 7) compared to hydrogen-only-optimized geometries (−9.9 to −10.3 kcal mol–1, Figure 3).
Figure 7.

B3LYP/6-31G(d,p) gas-phase fully optimized geometries of the pairs that completely change the hydrogen-bonding pattern on full optimization of the crystal geometry. Optimized hydrogen–acceptor distances (Å) and B3LYP/6-311+G(2df,p) gas-phase interaction energies (in parentheses, kcal mol–1) are provided. Superposition of the fully optimized (blue) and hydrogen-only-optimized geometries (red) and respective rmsd values (Å) are also provided for each pair.
Regardless of whether new hydrogen bonds are formed or not on full optimization, the hydrogen-bonding A–H distances within the retained hydrogen bonds of all 18 complexes are shorter in fully optimized geometries (1.563–2.588 Å) than in the respective hydrogen-only-optimized geometries (1.471–2.664 Å). Further, the average interaction energy of the fully optimized geometries (−16.4 kcal mol–1) is higher than that of the hydrogen-only-optimized geometries (−11.3 kcal mol–1, Figures 3–7). This suggests that the binding strength within the macromolecular crystal geometries of the pairs is enhanced on average by 5.1 kcal mol–1 on full optimization.
Among the three (R, E, and Q) amino acids considered, the average interaction energy of pairs involving R (−26.7 kcal mol–1) is significantly greater than those involving Q (−7.6 kcal mol–1) and E (−7.3 kcal mol–1) as well as the previously studied pairs involving D (−11.8 kcal mol–1) and N (−11.4 kcal mol–1). Alternatively, when the variation in interaction energies among the four different nucleobases is considered, C possesses the highest average interaction energy (−33.6 kcal mol–1), followed by G (−20.1 kcal mol–1), U (−17.6 kcal mol–1), and A (−10.9 kcal mol–1).
Classification of the optimized pairs in terms of the nature of hydrogen bonds reveals that 13 pairs possess a conventional O/N–H···N/O type of hydrogen bond (Figures 5 and 6) and one (A:E H:Car (I)) pair possesses one N–H···O and one C–H···O hydrogen bond. Further, two (U:R S:Gnm and A:Q H:Amd) pairs possess amino–donor bifurcated bonding, and one (U:R H:Gnm) pair possesses acceptor-bifurcated hydrogen bonding. In addition, one (G:E W:Car (IIb)) pair involves one conventional N–H···N/O hydrogen bond and one amino–acceptor hydrogen bond. More importantly, the binding of the amino–acceptor structure (−9.4 kcal mol–1) is comparable to that of the corresponding conventional structure (G:E W:Car (IIa), −9.2 kcal mol–1). This reiterates the importance of amino–acceptor structures in RNA:protein interactions involving the guanine nucleobase and suggests that such interactions must be considered in detail in the future while studying noncovalent interactions in the context of molecular recognition.
2.4. Solvent-Phase Calculations
We reoptimized the structures in implicit water to estimate the influence of bulk solvent screening on their geometries and binding strengths. The structural characteristics of the pairs optimized in the solvent phase do not deviate significantly from their gas-phase counterparts (rmsd of 0.051–0.968 Å), although solvent-phase structures possess a slight change in the A–H distances (1.579–2.397 Å, Figures 8 and 9) compared to their gas-phase counterparts (1.563–2.588 Å, Figures 5–7).
Figure 8.

IEFPCM-B3LYP/6-31G(d,p) solvent (water)-phase fully optimized geometries of nucleobase:amino acid corresponding to the gas-phase optimized geometries of the pairs shown in Figure 5. Respective superpositions of gas-phase fully optimized (red) and solvent-phase fully optimized (blue) geometries are provided. Hydrogen–acceptor distances (Å) and IEFPCM-B3LYP/6-311+G(2df,p) interaction energies (in parentheses, kcal mol–1) are provided for each solvent-phase fully optimized pair, and rmsd values (parentheses, Å) are provided for each structural superposition.
The interaction energies of the pairs in the solvent phase (−1.1 to −12.7 kcal mol–1) are significantly reduced compared to the gas-phase values (−5.5 to −33.6 kcal mol–1, Figures 5–9). Although 11 pairs show a small (1 to 6 kcal mol–1) reduction in interaction energy on going from the gas phase to the solvent phase, 7 pairs show a substantial (11 to 23 kcal mol–1) reduction (Figures 8 and 9). Specifically, the lowest (1 kcal mol–1) reduction in interaction energy on inclusion of solvent occurs in A:Q H:Amd (II). However, the highest (23 kcal mol–1) reduction occurs in C:R W:Gnm, where the solvent-phase interaction energy (−10.6 kcal mol–1) is more than three times weaker than the gas-phase interaction energy (−33.6 kcal mol–1).
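The C:R W:Gnm comparison can be checked with a few lines of arithmetic. The sketch below uses only the two energies quoted above; the function name `solvent_screening` is hypothetical.

```python
from typing import Tuple


def solvent_screening(e_gas: float, e_solvent: float) -> Tuple[float, float]:
    """Return (reduction, ratio) for a pair of interaction energies.

    reduction: loss of binding strength in kcal/mol (positive when the
               solvent-phase interaction is weaker than the gas-phase one).
    ratio:     gas-phase energy divided by solvent-phase energy.
    """
    reduction = e_solvent - e_gas
    ratio = e_gas / e_solvent
    return reduction, ratio


if __name__ == "__main__":
    # Energies quoted in the text for C:R W:Gnm (kcal/mol).
    reduction, ratio = solvent_screening(-33.6, -10.6)
    print(f"reduction = {reduction:.1f} kcal/mol, ratio = {ratio:.2f}")
```

With these inputs, the reduction is 23.0 kcal mol–1 and the ratio is about 3.17, consistent with the "more than three times" statement.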
Our analysis further reveals that the difference in interaction energies between each type of multimodal structure is smaller in the solvent phase (0.1 to 11.2 kcal mol–1) compared to the gas phase (0.6 to 15.2 kcal mol–1, Figures 5–9). This suggests that variable hydrogen-bonding patterns through same nucleobase edge interactions become closer in energy in the solvent phase. This may, in turn, lead to a greater possibility of switching between such structurally similar interaction states and may thus help in macromolecular dynamics during RNA:protein recognition. The analogous role of multimodal base pairs in RNA dynamics has previously been proposed in the context of RNA base pairing.39
Although the conventional structures (I and II) of G:E W:Car possess up to 15 kcal mol–1 higher interaction energy than that of the amino acceptor structure (IIb) in the gas phase, the difference is reduced to only 6 kcal mol–1 in the solvent phase (Figures 8 and 9). More importantly, the amino–acceptor structure possesses 1.0 kcal mol–1 stronger binding than that of the third conventional structure (IIa) in the solvent phase (Figures 8 and 9). This reiterates our proposition from gas-phase calculations that both conventional and amino acceptor structures may play an important role in the stability of rN:amino acid interactions.
2.5. Modeled Interactions
We modeled five additional interactions that have not been observed in the rN:protein crystal structures (Figure 10). Specifically, we modeled C:E W:Car, U:E W:Car (I), and U:E W:Car (II) based on the optimized structure of A:E W:Car, whereas A:R H:Gnm and A:Q W:Amd were modeled on the optimized geometries of G:R H:Gnm and C:Q W:Amd, respectively.
Figure 10.

(a) B3LYP/6-31G(d,p) gas-phase fully optimized geometries and B3LYP/6-311+G(2df,p) gas-phase interaction energies (kcal mol–1) of the modeled nucleobase:amino acid pairs. Optimized hydrogen-bond donor–acceptor distances (Å) are provided. (b) IEFPCM-B3LYP/6-31G(d,p) solvent-phase fully optimized modeled geometries and IEFPCM-B3LYP/6-311+G(2df,p) solvent-phase interaction energies (in parentheses, kcal mol–1) of the studied modeled pairs of nucleobases with E, Q, or R. Hydrogen–acceptor distances (Å) for hydrogen-bonding interactions are provided. (c) Superposition of the B3LYP/6-31G(d,p) fully optimized gas-phase (red) and IEFPCM-B3LYP/6-31G(d,p) fully optimized solvent (water)-phase (blue) geometries of the modeled structures; rmsd values (Å) between the two sets of structures (parentheses) are provided.
All the modeled interactions remain stable both in the gas phase and solvent phase, which points toward their intrinsic stability (Figure 10). Further, the structural characteristics of all five pairs remain similar in gas and solvent phases (rmsd of 0.086–0.137 Å, Figure 10), and the hydrogen-bonding pattern remains unaltered in both phases. However, as observed for fully optimized structures derived from crystal structures, the interaction energies of the modeled structures are reduced in the solvent phase (−7.3 to −13.4 kcal mol–1) compared to the gas phase (−13.8 to −20.0 kcal mol–1, Figure 10). Further, since the magnitude of reduction in binding energies in the solvent phase is variable and depends on the type of hydrogen bonds present in each pair, the trend in the gas-phase relative stability of the pairs changes on inclusion of the solvent (Figure 10). Regardless, the magnitude of stability of the modeled interactions points toward their candidature for likely occurrence in nucleotide:protein or RNA:protein complexes that might be crystallized in the future.
2.6. Characteristics of Hydrogen Bonds in the Optimized Complexes
NBO analysis of each gas-phase optimized pair reveals that the average delocalization energy (E(2)) associated with the n(A) → σ*(D–H) charge delocalization of each hydrogen bond follows the order O–H···N/O bonds (24.2 kcal mol–1) > N–H···N/O hydrogen bonds (13.6 kcal mol–1, Table S3). A similar trend is obtained for the ρHBCP or ∇²ρHBCP values (0.049 or 0.115 a.u. for O–H···N/O bonds and 0.029 or 0.081 a.u. for N–H···O/N bonds, Table S3). Similarly, the average Δν associated with O–H···N/O bonds (713.5 cm–1) is greater than that with N–H···O/N hydrogen bonds (280.17 cm–1, Table S3). Further, a similar trend is observed in EHB values calculated from the Iogansen relationship (EHB(O–H···O/N) (8.3 kcal mol–1) > EHB(N–H···O/N) (5.0 kcal mol–1)) and the Nikolaienko–Bulavin–Hovorun relationships (EHB(O–H···O/N) (8.7 kcal mol–1) > EHB(N–H···O/N) (4.6 kcal mol–1), Table S3).
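To make the link between Δν and the hydrogen-bond energy concrete, the sketch below uses the commonly quoted form of the Iogansen relationship, E_HB = 0.33·√(Δν − 40), with Δν in cm–1 and E_HB in kcal mol–1. The formula is not written out in this section, so its use here is an assumption, and the function name is hypothetical.

```python
import math


def iogansen_energy(delta_nu: float) -> float:
    """Hydrogen-bond energy (kcal/mol) from the stretching red shift (cm^-1).

    Uses the commonly quoted form E_HB = 0.33 * sqrt(delta_nu - 40),
    assumed here rather than taken from this section.
    """
    if delta_nu <= 40.0:
        raise ValueError("relationship is defined only for shifts above 40 cm^-1")
    return 0.33 * math.sqrt(delta_nu - 40.0)


if __name__ == "__main__":
    # Average red shifts quoted in the text (cm^-1).
    for label, shift in (("O-H...N/O", 713.5), ("N-H...O/N", 280.17)):
        print(f"{label}: {iogansen_energy(shift):.2f} kcal/mol")
```

Applying the formula to the average shifts gives roughly 8.6 and 5.1 kcal mol–1, close to but not identical with the reported average energies (8.3 and 5.0 kcal mol–1). A small difference of this kind is expected, because the energy of an average shift is not the same as the average of per-bond energies.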
To find a correlation between the hydrogen-bond energies obtained from the Iogansen relationship, the Nikolaienko–Bulavin–Hovorun (NBH) relationships, and E(2) values from NBO analysis, we added the energies of all hydrogen bonds in each pair to obtain ΣEHB(Iogansen), ΣEHB(NBH), and ΣE(2) (Table S4 and Figure 11). ΣEHB(Iogansen) and ΣEHB(NBH) values agree with ΣE(2) for only three complexes, where |ΣE(2) – ΣEHB(Iogansen)| or |ΣE(2) – ΣEHB(NBH)| lies within 5 kcal mol–1 (Table S4). On the other hand, ΣEHB(Iogansen) and ΣEHB(NBH) correlate very well for 13 of the 23 complexes, with a coefficient of determination (R2) of 0.94 and |ΣEHB(Iogansen) – ΣEHB(NBH)| ranging from 0.0 to 2.6 kcal mol–1 (Figure 12 and Table S5). However, the remaining 10 complexes show weak correlation, where |ΣEHB(Iogansen) – ΣEHB(NBH)| ranges from 3.9 to 6.0 kcal mol–1 (Table S5).
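The coefficient of determination used in such comparisons can be computed as the squared Pearson correlation of two energy series, which equals R2 for a simple linear fit. The following is a minimal sketch; the input values are hypothetical placeholders and are not taken from Tables S4 to S8.

```python
from typing import Sequence


def r_squared(x: Sequence[float], y: Sequence[float]) -> float:
    """Squared Pearson correlation coefficient of two equal-length series.

    For a simple linear regression of y on x, this equals the
    coefficient of determination (R^2).
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two series of equal length with at least 2 points")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    s_yy = sum((b - mean_y) ** 2 for b in y)
    if s_xx == 0.0 or s_yy == 0.0:
        raise ValueError("series must not be constant")
    return (s_xy * s_xy) / (s_xx * s_yy)


if __name__ == "__main__":
    # Hypothetical summed hydrogen-bond energies (kcal/mol) from two schemes.
    sum_scheme_1 = [1.0, 2.0, 3.0, 4.0]
    sum_scheme_2 = [1.0, 3.0, 2.0, 4.0]
    print(r_squared(sum_scheme_1, sum_scheme_2))
```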
Figure 11.
Comparison of the sum of hydrogen-bonding energies of all hydrogen bonds within each pair deduced using NBO analysis (ΣE(2)), the Iogansen relation (ΣEHB(Iogansen)), and Nikolaienko–Bulavin–Hovorun relations (ΣEHB(NBH)) with the gas-phase interaction energy.
Figure 12.
Correlation between the sums of hydrogen-bonding energies of each pair calculated using the Iogansen relation (ΣEHB(Iogansen)) and Nikolaienko–Bulavin–Hovorun relations (ΣEHB(NBH)) for 13 pairs that show good correlation. See Table S5 for details.
In contrast, the gas-phase interaction energies (Egas phase) correlate well with ΣEHB(Iogansen) for 10 of the 23 complexes (R2 of 0.92 and |Egas phase – ΣEHB(Iogansen)| of up to 3 kcal mol–1), whereas the remaining complexes show weak correlation (R2 of 0.52 and |Egas phase – ΣEHB(Iogansen)| of up to 21 kcal mol–1, Figure 13 and Table S6). Regardless, the contribution of the hydrogen-bonding energy (ΣEHB(Iogansen)) to Egas phase ranges between 20 and 107% for all complexes (Table S4). Similarly, Egas phase correlates with ΣEHB(NBH) for nine complexes (R2 of 0.92 and |Egas phase – ΣEHB(NBH)| of up to 5 kcal mol–1), although the remaining 13 complexes show weak correlation (R2 of 0.22 and |Egas phase – ΣEHB(NBH)| of up to 26 kcal mol–1, Figure 13 and Table S7). However, irrespective of the strength of correlation, ΣEHB(NBH) contributes 21 to 120% to the Egas phase values (Table S4). In contrast, although ΣE(2) correlates with Egas phase in 10 complexes (R2 of 0.94 and |Egas phase – ΣE(2)| of up to 4 kcal mol–1), the remaining 13 complexes show very weak correlation (R2 of 0.94 and |Egas phase – ΣE(2)| of up to 37 kcal mol–1, Figure 13 and Table S8). Out of these, seven complexes also show correlation between Egas phase and ΣEHB(Iogansen) values (Table S6). However, all three energies correlate with gas-phase interaction energies for only one complex (A:Q H:Amd (II)).
Figure 13.
Correlation between the sums of hydrogen-bonding energies of each pair calculated using NBO analysis (ΣE(2)), the Iogansen relation (ΣEHBIogansen), or Nikolaienko–Bulavin–Hovorun relations (ΣEHB) and the gas-phase interaction energy. (a, c, e) Plots corresponding to pairs that show good correlation (R2 = 0.92); (b, d, f) plots corresponding to pairs that do not show correlation. See Tables S6–S8 for details.
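The comparisons above rest on three simple quantities: the coefficient of determination between two sets of summed energies, the largest absolute difference between them, and the percentage contribution of a summed hydrogen-bond energy to the interaction energy. A minimal Python sketch of these quantities is given below. The function names are hypothetical, and the use of absolute values in the percentage is an assumption made here because the summed hydrogen-bond energies and the interaction energies carry opposite signs; the text does not state how the published values were computed.

```python
from math import fsum


def r_squared(x, y):
    """Coefficient of determination of the least-squares line through (x, y).

    For a straight-line fit this equals the squared Pearson correlation.
    """
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = fsum(x) / n
    mean_y = fsum(y) / n
    sxx = fsum((xi - mean_x) ** 2 for xi in x)
    syy = fsum((yi - mean_y) ** 2 for yi in y)
    sxy = fsum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    if sxx == 0.0 or syy == 0.0:
        raise ValueError("both data sets must vary")
    return sxy ** 2 / (sxx * syy)


def max_abs_difference(x, y):
    """Largest |x_i - y_i| over paired values (same units as the input)."""
    return max(abs(xi - yi) for xi, yi in zip(x, y))


def percent_contribution(sum_e_hb, e_interaction):
    """Summed hydrogen-bond energy as a percentage of the interaction energy.

    Absolute values are used (an assumption, see the lead-in).
    """
    return 100.0 * abs(sum_e_hb) / abs(e_interaction)
```

Applied to one column of Table S4 against another, `r_squared` and `max_abs_difference` would give the kind of R2 and energy-difference figures quoted in the text.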
In addition, we performed NCI-RDG analysis to provide a graphical visualization of hydrogen bonds and distinguish them from other weak interactions. Specifically, the plot of S and sign(λ2)ρ reveals troughs, which correspond to different noncovalent interactions. The troughs corresponding to attractive interactions (such as hydrogen bonds) lie in the region of negative sign(λ2)ρ, whereas repulsive interactions lie in the positive region. NCI plots for representative complexes are provided in Figure 14, whereas those for all the complexes are given in Figures S3 and S4. NCI plots reveal that O–H···N/O hydrogen bonds appear in the region of higher ρ values followed by N–H···O/N bonds. In addition to hydrogen bonds, certain secondary interactions could be visualized in the NCI-RDG plot, which could be ascribed to nonspecific dispersion contacts. Overall, in conjunction with ρHBCP values obtained from QTAIM analysis (Figures S1 and S2), NCI-RDG analysis helps in characterization and confirmation of hydrogen-bonding interactions in the studied pairs.
Figure 14.
Plots of reduced density gradient (S) and sign(λ2)ρ for pairs involving the R amino-acid moiety.
3. Conclusions
In the present work, we analyzed 23 hydrogen-bonded pairs derived from canonical rNs and R, E, or Q residues of proteins. These include 16 pairs directly derived from crystal structures of rN:protein complexes, 2 additional alternate geometries of one pair that differ in the position of the hydrogen atoms, and 5 modeled pairs.
Analysis of pairs present in rN:protein crystal structures reveals two categories of pairs, unimodal and multimodal. Although each unimodal pair possesses a unique hydrogen-bonding pattern, the multimodal pairs involve multiple hydrogen-bonding possibilities through the same interacting nucleobase edge. Hydrogen-only optimizations reveal that all unimodal pairs as well as one type of geometry of each multimodal pair possess a single or bifurcated hydrogen bond. However, the second type of geometry of each multimodal pair invariably possesses two conventional donor–acceptor hydrogen bonds.
Gas-phase and solvent-phase full optimizations reveal three categories of pairs: (i) pairs that retain the crystal-structure hydrogen-bonding pattern on full optimization, (ii) pairs that form additional hydrogen bonds on full optimization, and (iii) pairs that undergo a complete change in hydrogen-bonding pattern on full optimization. Further, interaction energy calculations reveal that the average binding strength of the pairs present within the macromolecular crystal environment is enhanced on gas-phase optimization, although the binding strength in the solvent phase is smaller than that in the gas phase. More importantly, the difference in interaction energies between the alternate structures of each multimodal pair is smaller in the solvent phase than in the gas phase. This points toward the possibility of switching between such structurally similar and energetically close interaction states, which may be significant for the dynamics at the RNA:protein interface. Overall, structural analysis of the pairs reveals significant variability in their geometrical patterns and associated stabilities.
Our analysis further reveals one example (G:E W:Car) where the alternate positioning of the amino hydrogens leads to either amino–donor or amino–acceptor optimized structures, which lie very close in energy especially in the solvent phase. This suggests that both these types of interactions may play a role in the stability of rN:amino acid interactions involving guanine and should be considered in detail in the future while studying the noncovalent interactions in the context of molecular recognition.
Analysis of the five modeled pairs reveals their intrinsic stability in both the gas phase and the solvent phase, although the binding strength is reduced in the solvent phase compared to the gas phase. Regardless, the stability of the modeled interactions suggests that they are likely candidates for occurrence in future crystal structures of nucleotide:protein or RNA:protein complexes.
The strength of hydrogen bonds within each pair was analyzed in terms of NBO analysis, vibrational frequency-based Iogansen relationships, and QTAIM analysis-based Nikolaienko–Bulavin–Hovorun relationships. These analyses show that the summed Iogansen and Nikolaienko–Bulavin–Hovorun energies agree closely for 13 of the 23 pairs, whereas each of the three summed estimates correlates well with the gas-phase interaction energy for only 9 or 10 pairs. In addition, NCI-RDG analysis provides a graphical visualization of hydrogen bonds and helps distinguish them from other secondary interactions.
In conclusion, this study characterizes and highlights the rich diversity of hydrogen-bonding interactions at the rN:protein interface. Owing to their significant intrinsic strength, we propose that such interactions should be analyzed in detail while performing molecular dynamics simulations on rN:protein and RNA:protein macromolecular complexes.
4. Computational Methods
4.1. Starting Structures
A set of 16 X-ray crystal structures of rN:protein complexes with resolution better than 2 Å, which contain pairs involving R, E, or Q and rNs and were identified in a previous study,3 was extracted from the Protein Data Bank (PDB, Table S1). rN:amino acid pairs were isolated from these crystal structures using the Rasmol program.40
4.2. Model Building
The protein chain connected to the interacting amino-acid moiety was removed, and the backbone carbonyl and amide groups were capped with methyl groups. Methyl capping eliminated the possibility of non-native interactions between the rN and protein backbone that could potentially occur during full optimizations of the isolated models in the absence of the rest of the protein chain. Further, in analogy with previous work,25 the neutral states of both the side-chain carboxylic group of E and the amide group of Q were used for model building, whereas the side chain of R was taken in its cationic state. Each rN was modeled as a nucleobase by truncating the sugar–phosphate backbone and replacing it with a hydrogen atom attached to the N9 atom (purine) or N1 atom (pyrimidine). However, the sugar moiety was retained in one pair, since the amino-acid residue interacts through the 2′ OH group of U. Here, the nucleotide was truncated at the 5′ oxygen, which was capped with a hydrogen atom, and the 3′ oxygen was capped with a methyl group. Finally, hydrogen atoms absent in the crystal structures were added to each pair to complete the atomic valences.
4.3. Gas-Phase Calculations
4.3.1. Hydrogen-Only Optimization and Full Optimization
Due to ambiguity in the positions of the hydrogen atoms added to complete the atomic valences, the hydrogens were optimized first while fixing the heavy atoms at the coordinates derived from the crystal structures. This “frozen state” optimization is denoted as “hydrogen-only” optimization. Each complex was then fully optimized after removing all crystallographic constraints. Both hydrogen-only optimization and full optimization were performed at the B3LYP/6-31G(d,p) level using Gaussian 09.41 This method was chosen in analogy with previous studies on hydrogen-bonded complexes involving RNA base pairs28−32,38 and base:amino acid complexes.25 The hydrogen-only optimized and fully optimized geometries were then overlaid to estimate the root-mean-square deviation (rmsd) values using the visual molecular dynamics (VMD) program.42
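For readers unfamiliar with the rmsd measure used to compare the hydrogen-only and fully optimized geometries, a minimal Python sketch follows. It is not the VMD procedure itself: it assumes that the two geometries have already been superposed and that their atoms are listed in the same order, and the function name is hypothetical.

```python
from math import sqrt


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two sets of (x, y, z) coordinates.

    Assumes both sets are already superposed and equally ordered;
    the result has the same length unit as the input (e.g., angstrom).
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate sets must be non-empty and of equal length")
    squared_sum = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        squared_sum += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return sqrt(squared_sum / len(coords_a))
```

A small rmsd indicates that full optimization leaves the crystal-structure arrangement largely intact, whereas a large value signals a rearrangement of the pair.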
4.3.2. Interaction Energies
Basis set superposition error (BSSE)-corrected interaction energies of both hydrogen-only optimized and fully optimized complexes were calculated at the B3LYP/6-311+G(2df,p) level with the counterpoise method, as implemented in Gaussian 09.41 This method was chosen in analogy with previous studies on hydrogen-bonded systems.25,31,32,43,44 Representative interaction energies were also calculated using the B3LYP-D3/6-311+G(2df,p) method and compared with those calculated using B3LYP/6-311+G(2df,p) to estimate the effect of the dispersion correction. The energies differ by only 2.7 to 3.7 kcal mol–1 (Table S2). More importantly, the trend in relative energies does not change upon inclusion of the dispersion correction.
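As an illustration of the counterpoise scheme, the sketch below evaluates the interaction energy as the energy of the complex minus the energies of the two monomers, each computed in the full basis set of the complex. This is the standard textbook form with monomers kept at their in-complex geometries; whether monomer deformation energies were included in the reported values is not stated in the text, so that part is an assumption. The function name is hypothetical.

```python
HARTREE_TO_KCAL_PER_MOL = 627.5095  # conversion factor, hartree -> kcal/mol


def counterpoise_interaction_energy(e_complex, e_monomer_a, e_monomer_b):
    """BSSE-corrected interaction energy in kcal/mol.

    All three inputs are electronic energies in hartree. The monomer
    energies are assumed to be computed in the complete basis of the
    complex (ghost functions on the partner), at the in-complex geometry.
    """
    e_int_hartree = e_complex - e_monomer_a - e_monomer_b
    return e_int_hartree * HARTREE_TO_KCAL_PER_MOL
```

A negative value corresponds to a bound pair, in line with the sign convention of the interaction energies quoted in the abstract.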
4.4. Solvent-Phase Optimizations and Interaction Energies
The gas-phase optimized structures were reoptimized in the solvent (water) phase at the B3LYP/6-31G(d,p) level using the integral equation formalism polarizable continuum model (IEFPCM)45 to mimic the solvent environment. These calculations were carried out using Gaussian 09.41 Further, the solvent-phase optimized geometries were superposed on the gas-phase fully optimized geometries to analyze the relative structural deviations using VMD.42
4.5. Analysis of the Strength of Individual Hydrogen Bonds in Gas-Phase Optimized Geometries
4.5.1. Natural Bond Orbital (NBO) Analysis
To understand the characteristics and relative strength of hydrogen bonds occurring in the analyzed complexes, NBO analysis46 was carried out. Specifically, the magnitude of charge transfer from the donor NBO i (i.e., nonbonding (lone pair) orbital (n) located on the hydrogen bond acceptor (A)) to the acceptor NBO j (i.e., antibonding (σ*) orbital located on the donor–hydrogen (D–H) bond) associated with each (D–H···A) hydrogen bond was estimated using the second-order perturbation energy (E(2)), which was deduced from the following equation:
$$E^{(2)} = \Delta E_{ij} = q_i \frac{F(i,j)^2}{E_j - E_i}$$
Here, qi is the occupancy of the donor orbital, Ei and Ej are the energies (diagonal Fock matrix elements) of orbitals i and j, and F(i, j) is the off-diagonal NBO Fock matrix element. NBO analysis was carried out using Gaussian 09.41
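The second-order perturbation energy can be sketched in a few lines of Python using the standard NBO expression, E(2) = qi F(i, j)^2/(Ej – Ei), built from the quantities defined above. The sketch assumes that F(i, j) and the orbital energy difference are supplied in atomic units (hartree), which is how NBO output is commonly tabulated, and converts the result to kcal mol–1. The function name and the input values in the comment are hypothetical.

```python
HARTREE_TO_KCAL_PER_MOL = 627.5095  # conversion factor, hartree -> kcal/mol


def second_order_energy(q_i, f_ij, e_j_minus_e_i):
    """Second-order perturbation energy E(2) in kcal/mol.

    q_i            occupancy of the donor (lone pair) orbital
    f_ij           off-diagonal NBO Fock matrix element, in hartree
    e_j_minus_e_i  acceptor minus donor orbital energy, in hartree
    """
    if e_j_minus_e_i <= 0.0:
        raise ValueError("acceptor orbital must lie above the donor orbital")
    return q_i * f_ij ** 2 / e_j_minus_e_i * HARTREE_TO_KCAL_PER_MOL


# Illustrative call with made-up numbers (not taken from the paper):
# second_order_energy(q_i=1.9, f_ij=0.08, e_j_minus_e_i=0.75)
```

Summing the values returned for every n → σ* interaction of a pair gives the quantity denoted ΣE(2) in the text.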
4.6. Iogansen Relationship
The strength of each hydrogen-bonding interaction was further estimated in terms of the hydrogen-bond energy (EHB, kcal mol–1), which was calculated using the Iogansen relationship,47 recently used to successfully analyze hydrogen bonding in nucleobase pairs.48 Specifically,
$$E_{\mathrm{HB}} = 0.33\sqrt{\Delta\nu - 40}$$
Here, Δν is the difference in the vibrational frequency of the D–H bond stretching in the isolated monomer (νmonomer) and hydrogen-bonded complex (νcomplex). The vibrational frequency calculations were carried out on the gas-phase fully optimized pairs and isolated monomers using Gaussian 09.41
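A minimal Python sketch of this estimate is given below. It assumes the commonly quoted form of the Iogansen relation, EHB = 0.33 (Δν – 40)^(1/2), with Δν in cm–1 and EHB in kcal mol–1; the constants 0.33 and 40 come from that commonly quoted form and should be checked against ref 47. The function names are hypothetical, and returning no value for shifts of 40 cm–1 or less is a choice made here, since the relation is undefined in that range.

```python
from math import sqrt


def iogansen_energy(nu_monomer, nu_complex):
    """Hydrogen-bond energy (kcal/mol) from D-H stretching frequencies (cm^-1).

    Uses E_HB = 0.33 * sqrt(delta_nu - 40), where delta_nu is the red shift
    nu_monomer - nu_complex. Returns None when delta_nu <= 40 cm^-1.
    """
    delta_nu = nu_monomer - nu_complex
    if delta_nu <= 40.0:
        return None
    return 0.33 * sqrt(delta_nu - 40.0)


def summed_iogansen_energy(frequency_pairs):
    """Sum of E_HB over all hydrogen bonds of a pair, skipping undefined ones."""
    energies = (iogansen_energy(nu_m, nu_c) for nu_m, nu_c in frequency_pairs)
    return sum(e for e in energies if e is not None)
```

The second function corresponds to the per-pair sum ΣEHBIogansen that is compared with the other energy estimates.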
4.6.1. Quantum Theory Atoms in Molecules (QTAIM) Analysis
The strength of hydrogen-bonding interactions within the pairs was further analyzed using QTAIM analysis.48,49 Specifically, each hydrogen bond was characterized in terms of the electron density (ρHBCP) and its Laplacian (∇2ρHBCP) at the hydrogen-bond critical point (HBCP). The wave functions obtained for B3LYP/6-31G(d,p)-optimized geometries were used as input for these calculations, which were carried out using the Multiwfn program.50
4.6.2. Nikolaienko–Bulavin–Hovorun Relationships
EHB was further estimated for the three (i.e., N–H···O, O–H···O, and O–H···N) types of hydrogen bonds using the ρ values obtained from QTAIM analysis and the following Nikolaienko–Bulavin–Hovorun relations:51
4.6.3. Noncovalent Interaction–Reduced Density Gradient (NCI-RDG) Analysis
For each pair, the reduced density gradient (S) was calculated from the electron density (ρ) and its gradient (∇ρ) using the following relationship:52
$$S = \frac{1}{2(3\pi^2)^{1/3}} \frac{|\nabla\rho|}{\rho^{4/3}}$$
In addition, the three components of ∇2ρ along the three principal axes of maximum variation (i.e., λ1, λ2, and λ3) were calculated as the eigenvalues of the electron density Hessian matrix. Since λ2 is negative for hydrogen bonds, the product of the sign of λ2 and ρ (i.e., sign(λ2)ρ) is used to characterize different types of hydrogen bonds.52 NCI-RDG analysis was performed using the Multiwfn program.50
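The two quantities plotted in an NCI-RDG analysis can be sketched at a single grid point as follows, using the standard definition S = |∇ρ|/[2(3π^2)^(1/3) ρ^(4/3)]. This is an illustration only, not the Multiwfn implementation: the density, the norm of its gradient, and λ2 are assumed to be available already (all in atomic units), and the function names are hypothetical.

```python
from math import copysign, pi

# Prefactor 1 / (2 * (3 * pi^2)^(1/3)) of the reduced density gradient
RDG_PREFACTOR = 1.0 / (2.0 * (3.0 * pi ** 2) ** (1.0 / 3.0))


def reduced_density_gradient(rho, grad_rho_norm):
    """Reduced density gradient S at one point (dimensionless)."""
    if rho <= 0.0:
        raise ValueError("electron density must be positive")
    return RDG_PREFACTOR * grad_rho_norm / rho ** (4.0 / 3.0)


def signed_density(rho, lambda2):
    """sign(lambda2) * rho: negative for attractive contacts such as
    hydrogen bonds, positive for repulsive ones."""
    return copysign(rho, lambda2)
```

Plotting S against sign(λ2)ρ over a grid of points gives the troughs discussed in the results, with hydrogen bonds appearing at negative sign(λ2)ρ.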
Acknowledgments
P.S. thanks the Department of Science and Technology (DST) and University Grants Commission (UGC), New Delhi, for financial support through the DST INSPIRE (no. IFA14-CH162) and UGC FRP (no. F.4-5(176-FRP/2015(BSR))) programs, respectively.
Supporting Information Available
The Supporting Information is available free of charge at https://pubs.acs.org/doi/10.1021/acsomega.9b04083.
QTAIM analysis for crystallographically derived and modeled pairs of RNA nucleobases with R, E, and Q, NCI-RDG analysis for pairs involving E and Q, PDB codes of source crystal structures, comparison of interaction energies calculated at B3LYP and B3LYP-D3 levels, E(2), Δν, ρ, ∇2ρ, EHBIogansen, and EHB analysis of fully optimized pairs, comparison of ΣEHBIogansen, ΣEHB and ΣE(2), correlation between ΣEHBIogansen and ΣEHB, correlation between Egas phase and ΣEHBIogansen, correlation between Egas phase and ΣEHB, correlation between Egas phase and ΣE(2), and Cartesian coordinates of the gas phase and solvent phase fully optimized geometries (PDF)
Author Present Address
∥ Present address: Department of Chemistry, University of Tennessee, Knoxville, Tennessee 37996, United States.
§ Present address: Department of Chemistry and Biochemistry, University of Lethbridge, Lethbridge, Alberta T1K3M4, Canada.
The authors declare no competing financial interest.
References
- Leipe D. D.; Wolf Y. I.; Koonin E. V.; Aravind L. Classification and evolution of P-loop GTPases and related ATPases. J. Mol. Biol. 2002, 317, 41–72. 10.1006/jmbi.2001.5378.
- Vetter I. R.; Wittinghofer A. Nucleoside triphosphate-binding proteins: different scaffolds to achieve phosphoryl transfer. Q. Rev. Biophys. 1999, 32, 1–56. 10.1017/S0033583599003480.
- Kondo J.; Westhof E. Classification of pseudo pairs between nucleotide bases and amino acids by analysis of nucleotide–protein complexes. Nucleic Acids Res. 2011, 39, 8628–8637. 10.1093/nar/gkr452.
- Allers J.; Shamoo Y. Structure-based analysis of protein-RNA interactions using the program ENTANGLE. J. Mol. Biol. 2001, 311, 75–86. 10.1006/jmbi.2001.4857.
- Stasyuk O. A.; Jakubec D.; Vondrášek J.; Hobza P. Noncovalent interactions in specific recognition motifs of protein–DNA complexes. J. Chem. Theory Comput. 2017, 13, 877–885. 10.1021/acs.jctc.6b00775.
- Elstner M.; Hobza P.; Frauenheim T.; Suhai S.; Kaxiras E. Hydrogen bonding and stacking interactions of nucleic acid base pairs: a density-functional-theory based treatment. J. Chem. Phys. 2001, 114, 5149–5155. 10.1063/1.1329889.
- Lin M.; Guo J.-t. New insights into protein–DNA binding specificity from hydrogen bond based comparative study. Nucleic Acids Res. 2019, 47, 11103–11113. 10.1093/nar/gkz963.
- Mukherjee S.; Majumdar S.; Bhattacharyya D. Role of hydrogen bonds in protein–DNA recognition: effect of nonplanar amino groups. J. Phys. Chem. B 2005, 109, 10484–10492. 10.1021/jp0446231.
- Jones S.; Daley D. T.; Luscombe N. M.; Berman H. M.; Thornton J. M. Protein–RNA interactions: a structural analysis. Nucleic Acids Res. 2001, 29, 943–954. 10.1093/nar/29.4.943.
- Treger M.; Westhof E. Statistical analysis of atomic contacts at RNA–protein interfaces. J. Mol. Recognit. 2001, 14, 199–214. 10.1002/jmr.534.
- Babitzke P.; Baker C. S.; Romeo T. Regulation of translation initiation by RNA binding proteins. Annu. Rev. Microbiol. 2009, 63, 27–44. 10.1146/annurev.micro.091208.073514.
- Jeong E.; Kim H.; Lee S. W.; Han K. Discovering the interaction propensities of amino acids and nucleotides from protein-RNA complexes. Mol. Cells 2003, 16, 161–167.
- Kim H.; Jeong E.; Lee S. W.; Han K. Computational analysis of hydrogen bonds in protein–RNA complexes for interaction patterns. FEBS Lett. 2003, 552, 231–239. 10.1016/S0014-5793(03)00930-X.
- Cheng A. C.; Chen W. W.; Fuhrmann C. N.; Frankel A. D. Recognition of nucleic acid bases and base-pairs by hydrogen bonding to amino acid side-chains. J. Mol. Biol. 2003, 327, 781–796. 10.1016/S0022-2836(03)00091-3.
- Lejeune D.; Delsaux N.; Charloteaux B.; Thomas A.; Brasseur R. Protein–nucleic acid recognition: statistical analysis of atomic interactions and influence of DNA structure. Proteins: Struct., Funct., Bioinformatics 2005, 61, 258–271. 10.1002/prot.20607.
- Ellis J. J.; Broom M.; Jones S. Protein–RNA interactions: structural analysis and functional classes. Proteins: Struct., Funct., Bioinformatics 2007, 66, 903–911. 10.1002/prot.21211.
- Morozova N.; Allers J.; Myers J.; Shamoo Y. Protein–RNA interactions: exploring binding patterns with a three-dimensional superposition analysis of high resolution structures. Bioinformatics 2006, 22, 2746–2752. 10.1093/bioinformatics/btl470.
- Bahadur R. P.; Zacharias M.; Janin J. Dissecting protein–RNA recognition sites. Nucleic Acids Res. 2008, 36, 2705–2716. 10.1093/nar/gkn102.
- Barik A.; Pilla S. P.; Bahadur R. P. Molecular architecture of protein-RNA recognition sites. J. Biomol. Struct. Dyn. 2015, 33, 2738–2751. 10.1080/07391102.2015.1004652.
- Guallar V.; Borrelli K. W. A binding mechanism in protein–nucleotide interactions: implication for U1A RNA binding. Proc. Natl. Acad. Sci. U. S. A. 2005, 102, 3954–3959. 10.1073/pnas.0500888102.
- Wilson K. A.; Holland D. J.; Wetmore S. D. Topology of RNA–protein nucleobase–amino acid π–π interactions and comparison to analogous DNA–protein π–π contacts. RNA 2016, 22, 696–708. 10.1261/rna.054924.115.
- Wilson K. A.; Kellie J. L.; Wetmore S. D. DNA–protein π-interactions in nature: abundance, structure, composition and strength of contacts between aromatic amino acids and DNA nucleobases or deoxyribose sugar. Nucleic Acids Res. 2014, 42, 6726–6741. 10.1093/nar/gku269.
- Churchill C. D. M.; Wetmore S. D. Noncovalent interactions involving histidine: the effect of charge on π–π stacking and T-shaped interactions with the DNA nucleobases. J. Phys. Chem. B 2009, 113, 16046–16058. 10.1021/jp907887y.
- Cheng A. C.; Frankel A. D. Ab initio interaction energies of hydrogen-bonded amino acid side chain–nucleic acid base interactions. J. Am. Chem. Soc. 2004, 126, 434–435. 10.1021/ja037264g.
- Kagra D.; Preethi S.; Sharma P. Interaction of aspartic acid and asparagine with RNA nucleobases: a quantum chemical view. J. Biomol. Struct. Dyn. 2019, 1–13. 10.1080/07391102.2019.1592025.
- Sharma P.; Mitra A.; Sharma S.; Singh H.; Bhattacharyya A. Quantum chemical studies of structures and binding in noncanonical RNA base pairs: the trans Watson–Crick: Watson–Crick family. J. Biomol. Struct. Dyn. 2008, 25, 709–732. 10.1080/07391102.2008.10507216.
- Sharma P.; Mitra A.; Sharma S.; Singh H. Base pairing in RNA structures: A computational analysis of structural aspects and interaction energies. J. Chem. Sci. 2007, 119, 525–531. 10.1007/s12039-007-0066-9.
- Sharma P.; Singh H.; Mitra A. Noncanonical Base Pairing in RNA: Topological and NBO Analysis of Hoogsteen Edge-Sugar Edge Interactions. Int. Conf. Comput. Sci. 2008, 5102, 379–386. 10.1007/978-3-540-69387-1_42.
- Seelam P. P.; Sharma P.; Mitra A. Structural landscape of base pairs containing post-transcriptional modifications in RNA. RNA 2017, 23, 847–859. 10.1261/rna.060749.117.
- Preethi P. S.; Sharma P.; Mitra A. Higher order structures involving post transcriptionally modified nucleobases in RNA. RSC Adv. 2017, 7, 35694–35703. 10.1039/C7RA05284G.
- Kaur S.; Sharma P.; Wetmore S. D. Can Cyanuric Acid and 2,4,6-Triaminopyrimidine Containing Ribonucleosides be Components of Prebiotic RNA? Insights from QM Calculations and MD Simulations. ChemPhysChem 2019, 20, 1425–1436. 10.1002/cphc.201900237.
- Kaur S.; Sharma P.; Wetmore S. D. Structural and electronic properties of barbituric acid and melamine-containing ribonucleosides as plausible components of prebiotic RNA: implications for prebiotic self-assembly. Phys. Chem. Chem. Phys. 2017, 19, 30762–30771. 10.1039/C7CP06123D.
- Černý J.; Hobza P. Non-covalent interactions in biomacromolecules. Phys. Chem. Chem. Phys. 2007, 9, 5291–5303. 10.1039/b704781a.
- Brovarets’ O. O.; Oliynyk T. A.; Hovorun D. M. Novel tautomerisation mechanisms of the biologically important conformers of the reverse Löwdin, Hoogsteen, and reverse Hoogsteen G*-C* DNA base pairs via proton transfer: Quantum mechanical survey. Front. Chem. 2019, 7, 597. 10.3389/fchem.2019.00597.
- Brovarets’ O. O.; Voiteshenko I. S.; Pérez-Sánchez H.; Hovorun D. M. A QM/QTAIM detailed look at the Watson–Crick ↔ wobble tautomeric transformations of the 2-aminopurine:pyrimidine mispairs. J. Biomol. Struct. Dyn. 2018, 36, 1649–1665.
- Brovarets’ O. O.; Tsiupa K. S.; Hovorun D. M. Surprising conformers of the biologically important A·T DNA base pairs: QM/QTAIM proofs. Front. Chem. 2018, 6, 8. 10.3389/fchem.2018.00008.
- Brovarets’ O. O.; Tsiupa K. S.; Hovorun D. M. Novel pathway for mutagenic tautomerization of classical A·T DNA base pairs via sequential proton transfer through quasi-orthogonal transition states: A QM/QTAIM investigation. PLoS One 2018, 13, e0199044. 10.1371/journal.pone.0199044.
- Šponer J. E.; Špačková N.; Leszczynski J.; Šponer J. Principles of RNA base pairing: structures and energies of the trans Watson–Crick/sugar edge base pairs. J. Phys. Chem. B 2005, 109, 11399–11410. 10.1021/jp051126r.
- Bhattacharyya D.; Koripella S. C.; Mitra A.; Rajendran V. B.; Sinha B. Theoretical analysis of noncanonical base pairing interactions in RNA molecules. J. Biosci. 2007, 32, 809–825. 10.1007/s12038-007-0082-4.
- Sayle R. A.; Milner-White E. J. RASMOL: biomolecular graphics for all. Trends Biochem. Sci. 1995, 20, 374–376. 10.1016/S0968-0004(00)89080-5.
- Frisch M. J.; Trucks G. W.; Schlegel H. B.; Scuseria G. E.; Robb M. A.; Cheeseman J. R.; Scalmani G.; Barone V.; Mennucci B.; Petersson G. A.; Nakatsuji H.; Caricato M.; Li X.; Hratchian H. P.; Izmaylov A. F.; Bloino J.; Zheng G.; Sonnenberg J. L.; Hada M.; Ehara M.; Toyota K.; Fukuda R.; Hasegawa J.; Ishida M.; Nakajima T.; Honda Y.; Kitao O.; Nakai H.; Vreven T.; Montgomery J. A.; Peralta J. E.; Ogliaro F.; Bearpark M.; Heyd J. J.; Brothers E.; Kudin K. N.; Staroverov V. N.; Kobayashi R.; Normand J.; Raghavachari K.; Rendell A.; Burant J. C.; Iyengar S. S.; Tomasi J.; Cossi M.; Rega N.; Millam J. M.; Klene M.; Knox J. E.; Bakken V.; Adamo C.; Jaramillo J.; Gomperts R.; Stratmann R. E.; Yazyev O.; Austin J.; Cammi R.; Pomelli C.; Ochterski J. W.; Martin R. L.; Morokuma K.; Zakrzewski V. G.; Voth G. A.; Salvador P.; Dannenberg J. J.; Dapprich S.; Daniels A. D.; Farkas O.; Foresman J. B.; Ortiz J. V.; Cioslowski J.; Fox D. J. Gaussian 09; Gaussian, Inc.: Wallingford, CT, 2009.
- Humphrey W.; Dalke A.; Schulten K. VMD: visual molecular dynamics. J. Mol. Graphics 1996, 14, 33–38. 10.1016/0263-7855(96)00018-5.
- Sharma P.; Lait L. A.; Wetmore S. D. Exploring the limits of nucleobase expansion: computational design of naphthohomologated (xx-) purines and comparison to the natural and xDNA purines. Phys. Chem. Chem. Phys. 2013, 15, 15538–15549. 10.1039/c3cp52656a.
- Sharma P.; Lait L. A.; Wetmore S. D. yDNA versus yyDNA pyrimidines: computational analysis of the effects of unidirectional ring expansion on the preferred sugar–base orientation, hydrogen-bonding interactions and stacking abilities. Phys. Chem. Chem. Phys. 2013, 15, 2435–2448. 10.1039/c2cp43910g.
- Tomasi J.; Mennucci B.; Cammi R. Quantum mechanical continuum solvent models. Chem. Rev. 2005, 105, 2999–3094. 10.1021/cr9904009.
- Reed A. E.; Weinstock R. B.; Weinhold F. Natural population analysis. J. Chem. Phys. 1985, 83, 735–746. 10.1063/1.449486.
- Iogansen A. V. Direct proportionality of the hydrogen bonding energy and the intensification of the stretching ν (XH) vibration in infrared spectra. Spectrochim. Acta, Part A 1999, 55, 1585–1612. 10.1016/S1386-1425(98)00348-5.
- Halder A.; Datta D.; Seelam P. P.; Bhattacharyya D.; Mitra A. Estimating Strengths of Individual Hydrogen Bonds in RNA Base Pairs: Toward a Consensus between Different Computational Approaches. ACS Omega 2019, 4, 7354–7368. 10.1021/acsomega.8b03689.
- Kumar P. S. V.; Raghavendra V.; Subramanian V. Bader’s theory of atoms in molecules (AIM) and its applications to chemical bonding. J. Chem. Sci. 2016, 128, 1527–1536. 10.1007/s12039-016-1172-3.
- Lu T.; Chen F. Multiwfn: a multifunctional wavefunction analyzer. J. Comput. Chem. 2012, 33, 580–592. 10.1002/jcc.22885.
- Nikolaienko T. Y.; Bulavin L. A.; Hovorun D. M. Bridging QTAIM with vibrational spectroscopy: The energy of intramolecular hydrogen bonds in DNA-related biomolecules. Phys. Chem. Chem. Phys. 2012, 14, 7441–7447. 10.1039/c2cp40176b.
- Johnson E. R.; Keinan S.; Mori-Sánchez P.; Contreras-García J.; Cohen A. J.; Yang W. Revealing noncovalent interactions. J. Am. Chem. Soc. 2010, 132, 6498–6506. 10.1021/ja100936w.