Skip to main content
Nucleic Acids Research logoLink to Nucleic Acids Research
. 2014 Apr 15;42(10):6726–6741. doi: 10.1093/nar/gku269

DNA–protein π-interactions in nature: abundance, structure, composition and strength of contacts between aromatic amino acids and DNA nucleobases or deoxyribose sugar

Katie A Wilson 1, Jennifer L Kellie 1, Stacey D Wetmore 1,*
PMCID: PMC4041443  PMID: 24744240

Abstract

Four hundred twenty-eight high-resolution DNA–protein complexes were chosen for a bioinformatics study. Although 164 crystal structures (38% of those searched) contained no interactions, 574 discrete π–contacts between the aromatic amino acids and the DNA nucleobases or deoxyribose were identified using strict criteria, including visual inspection. The abundance and structure of the interactions were determined by unequivocally classifying the contacts as either π–π stacking, π–π T-shaped or sugar–π contacts. Three hundred forty-four nucleobase–amino acid π–π contacts (60% of all interactions identified) were identified in 175 of the crystal structures searched. Unprecedented in the literature, 230 DNA–protein sugar–π contacts (40% of all interactions identified) were identified in 137 crystal structures, which involve C–H···π and/or lone–pair···π interactions, contain any amino acid and can be classified according to sugar atoms involved. Both π–π and sugar–π interactions display a range of relative monomer orientations and therefore interaction energies (up to –50 (–70) kJ mol−1 for neutral (charged) interactions as determined using quantum chemical calculations). In general, DNA–protein π-interactions are more prevalent than perhaps currently accepted and the role of such interactions in many biological processes may yet to be uncovered.

INTRODUCTION

DNA–protein interactions are essential to life. Indeed, the genetic information contained in the sequence of DNA nucleobases (A, C, T and G) must be processed by enzymes, which transcribe the nucleobase code into RNA and subsequently generate new proteins. Alternatively, proteins can bind to DNA in order to replicate the nucleobase sequence as cells grow and divide. DNA–protein interactions are also evident in other critical cellular processes, such as the repair of DNA damage caused by carcinogenic compounds or UV light (1–4). Contacts between DNA and proteins are typically noncovalent, which allows the resulting complex to perform necessary biological functions, yet readily degrade such that both biomolecules can provide additional function to the cell (5,6). The noncovalent contacts between DNA and proteins have traditionally been categorized as (direct or water-mediated) hydrogen bonding, ionic (salt bridges or DNA backbone interactions) and other forces, including van der Waals and hydrophobic interactions (7–9). Understanding each class of DNA–protein contacts will provide a greater appreciation of critical cell functions and open the door for the development of new medicinal and biological applications, including rational drug design (10–12) and the control of gene expression (13–16).

To gain an understanding of the interactions between DNA and proteins, previous work has searched crystal structures published in the protein data bank (PDB) and determined the relative frequency of different types of contacts. Early studies in this area were limited by the lack of high-resolution crystal structures of DNA–protein complexes (17–20). While this problem has been overcome in the past decade (7,21–23), more recent works disagree about the relative frequency of different types of contacts. Indeed, characterization of 129 DNA–protein complexes suggests that van der Waals interactions are more common than (direct or water-mediated) hydrogen bonding (7). In contrast, a survey of 139 DNA–protein complexes suggests that hydrogen bonding is more frequent than van der Waals, hydrophobic or electrostatic interactions (22). Such discrepancies may arise since, unlike hydrogen bonding, there are relatively undefined guidelines for the structure of van der Waals interactions, and therefore there is likely substantial variation among the interactions included in this category. Regardless, both studies determined that van der Waals interactions compose more than 30% of DNA–protein contacts (7,22).

In addition to traditional classifications of DNA–protein interactions, careful examination of the list of contacts identified in previous works suggests that many interactions occur between the DNA nucleobases and the aromatic amino acids (Supplementary Figure S1) (7,22). In general, interactions between aromatic rings are known to be widespread throughout chemistry and biology (24,25). Indeed, the prevalence and potential importance of interactions between aromatic side chains in proteins (26–31), as well as at protein–protein interfaces (32), have been documented through PDB searches. Furthermore, investigation of 89 RNA–protein complexes suggests that RNA–protein van der Waals interactions are more prevalent than hydrogen bonding, with the most favoured nucleotide–amino acid pairs including the aromatic amino acids (specifically, the U:Tyr, A:Phe and G:Trp pairs) (33), while a search of 61 structures revealed an abundance of interactions between Trp and the purines (8). Collectively, these studies suggest that closer investigations of DNA–protein π–π interactions are warranted.

Among the first studies to specifically consider DNA–protein π–π contacts, Mao et al. investigated the molecular recognition of adenosine 5’-triphosphate (ATP) by different proteins, and determined that π–π interactions between A and the aromatic amino acids are essential for substrate binding, with a 2.7:1.0 DNA–protein hydrogen bonding:π–π contact ratio (34). Subsequently, Baker and Grant identified a large number of π–π interactions between the DNA nucleobases and Tyr, Phe, His or Trp in 141 DNA–protein complexes (8). Unfortunately, the overall trends in the relative abundances of A–amino acid pairs are significantly different in these two studies. This discrepancy may arise due to differences in the structures searched, but is more likely an artefact of the (distance only) search criteria implemented. Indeed, ring proximity alone does not guarantee a suitable relative orientation of two residues, and therefore not all previously characterized interactions correspond to π–π (stacking or T-shaped) contacts (Supplementary Figure S2). Thus, the true frequency and structure of these interesting aromatic interactions between DNA and proteins remain unclear. Nevertheless, the proximity of the nucleobases and aromatic amino acids suggests that aromatic–aromatic (π–π or C/N–H···π) interactions may help stabilize DNA–protein complexes or may be involved in nucleic acid recognition.

Recent works corroborate that modern computational techniques can provide important information about π–π interactions (see, for example, references 35–39). In terms of DNA–protein contacts, quantum chemical calculations have been used to clarify the strength of π–π contacts between the nucleobases and aromatic amino acids found in experimental crystal structures (8,34,40–42). To complement this data, the preferred (lowest energy) relative monomer orientations have been identified for isolated dimers by systematically changing the relative orientations of monomers of fixed geometry (41,43–46) or fully relaxed systems (40–42). Both π–π stacking (face-to-face) (41,43–46) and π–π T-shaped (edge-to-face) (41,43–46) contacts have been considered in these studies (Figure 1A and B). Our group has completed the most extensive investigations, where over 1000 relative monomer orientations were considered for each nucleobase–aromatic amino acid pair to determine the preferred relative monomer orientation (46–49). Our highly accurate calculations suggest that the strengths of these π–π stacking and T-shaped interactions are up to approximately –43 kJ mol−1 (46,50), which were calculated as the energy difference between the dimer and individual monomers. This suggests that π–π contacts can contribute to DNA–protein binding and/or stabilize DNA–protein complexes to the same extent as hydrogen bonding. Furthermore, our group investigated the enhancement in the binding energy due to charge by considering dimers involving cationic His (49) or a damaged (cationic alkylated) nucleobase (47,51,52), as well as the effects of water molecules on the stability of charged dimers (53). Although most of these studies were performed on model systems that only include aromatic rings, the extension of the computational model to include the biological backbone (54–56) or additional π–π contacts (57) has been determined to minimally affect the strength of individual contacts. Together, these works provide important details about the preferred structure and magnitude of DNA–protein π–π interactions, and their potential biological roles.

Figure 1.

Figure 1.

Examples of (A) nucleobase–amino acid π–π T-shaped interaction (PDB ID: 2WQ7), (B) nucleobase–amino acid π–π stacking interaction (PDB ID: 3MR5) and (C) deoxyribose–amino acid sugar–π interaction (PDB ID: 3BKZ).

In addition to interactions with the DNA nucleobases, analysis of crystal structures reveals a significant number of short distances between the aromatic amino acids and the DNA backbone (7,22). Although many of these likely correspond to ionic contacts or hydrogen bonding with the phosphate moiety, a significant number of interactions were deemed to specifically involve the deoxyribose sugar. Indeed, all aromatic amino acids were found to participate in these interactions in nature. Despite short distances between the sugar and the aromatic amino acids, the nature of these contacts has yet to be explicitly discussed in the literature.

In contrast to π-interactions involving the DNA sugar moiety, contacts between various carbohydrates and the aromatic amino acids have been identified in crystal structures (58–61), and the importance of these contacts has been accepted in many fields, including glycobiology (see, for example, (62)-(68) and reference therein) and nanotechnology (see, for example, (69)-(74) and references therein). The significant strength of carbohydrate–π contacts in crystal structures has been verified using computational methods (58–61). Other modeling studies have characterized the binding strengths of dimers between different carbohydrates and aromatic amino acids modeled as benzene (Phe) (73,75–79), toluene (Phe) (80–82), phenol (Tyr) (83) and/or indole (His) (80,83), or with the protein backbone included (84,85). Complexes involving naphthalene have also been considered in an effort to better understand the properties of carbohydrate C–H···π interactions (86). These works have collectively determined that the amino acid can interact with either side (face) of the carbohydrate. The strengths of the carbohydrate–π interactions are dependent on the carbohydrate, the amino acid and relative monomer orientation, and are up to approximately –50 kJ mol−1, with the most stable structures containing both carbohydrate–π contacts and hydrogen bonding (with an exocyclic hydroxyl group). Interestingly, carbohydrate–π interactions involving a DNA nucleobase have also been characterized (87–90).

By analogy to the importance of carbohydrate–π interactions to glycobiology, it is reasonable to propose that π–contacts between the DNA deoxyribose moiety and the aromatic amino acids in proteins may provide stability and/or function in DNA–protein complexes. Furthermore, previous work on carbohydrate–π interactions suggests that deoxyribose contacts could involve C–H···π and/or hydrogen-bonding interactions (via the hydroxyl groups) with the amino acid π–system. From a fundamental perspective, the ring size is notably different between deoxyribose and the most widely studied carbohydrates (pyranoses), which could substantially affect the structure and energetics of the π-interactions. Although interactions predominantly involve one of the two carbohydrate faces, contacts may also occur with the sides of deoxyribose due to the relative positions of the ring hydrogen atoms.

In the current study, over 400 high-resolution DNA–protein complexes available in the PDB were searched to definitively determine the frequency and characterize the nature (structure, composition and strength) of contacts between the aromatic amino acids (including cationic His) and the DNA nucleobases (π–π contacts, Figure 1A and B) or the deoxyribose moiety (sugar–π contacts, Figure 1C). Unprecedented in the DNA–protein interaction literature, all nucleobase–aromatic amino acid dimers identified were visually inspected to unequivocally verify each contact represents a π–π interaction, and to classify the contact as either a nucleobase–amino acid stacking or T-shaped interaction (Figure 1A and B), which could involve either a nucleobase edge interacting with an amino acid π–system (face) or an amino acid edge interacting with the nucleobase face. Although experimental data can be used to identify contacts in nature, no information is obtained about the strength of these interactions. Therefore, accurate quantum chemical methods were used to evaluate the binding energy of each dimer system found in the crystal structures. Our study thereby clarifies previous literature by providing the most complete information to date on DNA–protein π–π interactions in nature. Using the same thorough approach, deoxyribose–aromatic amino acid sugar–π interactions in experimental crystal structures have been quantified for the first time, and determined to be based on many different types of noncovalent interactions that are known in structural chemistry, including C–H···π (Figure 1C) and lone–pair···π contacts. As a result, a novel classification system is developed based on the nature of the edge of the sugar. Combining data on the natural occurrence and strength of these two broad classes of DNA–protein interactions provides important information that will help unveil their potential roles in many biological systems.

MATERIALS AND METHODS

Datasets

DNA–protein complexes were identified in the PDB using similar criteria to those previously used in the literature to detect nucleobase–amino acid π–π contacts (Supplementary Figure S3) (8,30). Specifically, X–ray crystal structures published before 24 May 2011 with a resolution better than 2.0 Å and less than 90% sequence identity were chosen for analysis (428 crystal structures total).

Selecting systems for analysis

Pymol (91) was used to select all aromatic amino acids and nucleobase or deoxyribose moieties separated by less than 5.0 Å in each crystal structure. This choice of distance is supported by computational studies that determined the optimal vertical separation in DNA–protein nucleobase–aromatic amino acid dimers is typically 3.5 Å (45,46). As outlined in the Introduction, the qualifying DNA–protein dimers were then visually inspected to indisputably verify the contact is a π-interaction and classify the contact as either a nucleobase–amino acid stacking, nucleobase–amino acid T-shaped (nucleobase or amino acid edge) or deoxyribose sugar–π interaction. The PDB IDs for the crystal structures searched in the present work, as well as the type(s) of interactions identified and the nucleobase/sugar–amino acid residues involved, are provided in the SI.

Geometries used for quantum mechanical calculations

For the nucleobase–amino acid π–π interactions, the interplanar angle between the two rings, denoted as tilt (ω, Figure 1), was measured using Mercury (92), and used to further classify the π–π interaction as stacked (ω = 0–20°), T-shaped (ω = 70–90°) or inclined (20° < ω < 70°). Mercury was also used to measure the closest heavy atom distance between monomers. The dimer binding strengths were determined using truncated models obtained by replacing the DNA or protein backbone with a hydrogen atom (Supplementary Figure S1). Previous research has shown that neglect of the DNA or protein backbone does not significantly affect the magnitude of the π–π contact (52,54,55). For His interactions, both a cationic (His+) and two neutral (Hisδ and Hisϵ; Supplementary Figure S1) models were considered due to the unique pKa of this amino acid, and therefore varied protonation states adopted in biological systems (93). Additionally, the hydroxyl group of Tyr was orientated in two directions, denoted as clockwise (CW) and counter–clockwise (CCW) according to the direction of the hydroxyl moiety when the dimer is oriented with Tyr below the nucleobase (see Supplementary Figure S1). The planar (Cs symmetric) monomers were aligned by overlaying MP2/6–31G(d) optimized geometries onto the crystal structure orientation according to root-mean-square (RMS) fitting of the ring heavy atoms using HyperChem 8.0.8 (94).

For all identified sugar–π interactions, the amino acid was initially overlaid (using RMS fitting) onto the crystal structure geometry as discussed for the nucleobase–amino acid interactions (94). However, due to variations in the sugar pucker throughout the crystal structures, and the anticipated effect of sugar puckering on the binding energy, a fully optimized isolated sugar could not be overlaid onto the crystal structure. Instead, the sugar moiety was first truncated by replacing the nucleobase, as well as the 5’ and 3’ phosphorus atoms, with hydrogen atoms (Supplementary Figure S1). Subsequently, all protons in the sugar–amino acid dimer were then optimized at the MP2/6–31G(d) level of theory, while fixing the heavy atoms. The ∠(C4–C5′–O5′–H) and ∠(C4′–C3′–O3′–H) dihedral angles in the sugar (Supplementary Figure S1) were also frozen to the crystal structure geometry during the optimizations, in order to constrain the orientation of the hydrogen atoms at the O5′ and O3′ truncation points. This approach for sugar–π contacts is justified by studies revealing that neither structures nor binding strengths of carbohydrate–π interactions deviate significantly (< 2 kJ/mol) when crystal structures or fully optimized geometries are considered (58).

Interaction energies

Quantum chemical calculations were used to determine the strength of the intermolecular forces acting between the nucleobase and amino acid (π–π interactions) and the intermolecular forces acting between the sugar and amino acid (sugar–π interactions) based on the dimer geometries discussed in the previous section. Specifically, the interaction or binding energy (ΔE) was calculated according to Equation (1).

graphic file with name M2.gif (1)

In this equation, Edimer stands for the electronic energy of the π–π stacking, T-shaped or sugar–π dimer, while Eaa and Ent stand for the electronic energies of the isolated subsystems (aromatic amino acid (aa) and nucleobase or deoxyribose subunit of the nucleotide (nt), respectively). The geometry of each monomer in the dimer is the same as the structure of the isolated monomer. The calculated interaction energy does not include zero-point vibrational or Gibbs energy correction. Furthermore, the binding energies were calculated in the gas phase and are therefore relevant to DNA–protein binding environments of low polarity (95). We acknowledge that polar environments will likely decrease the magnitude of the reported interaction energies, as well as diminish the impact of His protonation. Nevertheless, previous work has shown that π–π and πcation–π interactions are of significant strength in more polar environments (41,49,51). Future work should consider the effects of solvation and thereby extend our conclusions to all DNA–protein binding environments including the rarer high polarity active sites.

To identify a quantum chemical method that best balances accuracy and computational cost due to the large number of contacts identified, the binding strength of select dimers that span the range of interactions found in the PDB search was calculated with several levels of theory (Supplementary Table S1). The M06–2X density functional theory (DFT) functional was chosen (with both 6–31+G(d,p) and aug-cc-pVTZ basis sets) based on literature testing the ability of this functional to accurately describe carbohydrate–π contacts (96), as well as DNA–protein nucleobase–amino acid π–contacts (48,50). However, other DFT functionals were also considered that were originally developed to account for dispersion interactions and have proven to work well for noncovalent contacts (97,98), namely B3LYP-D3, B97-D3 and ωB97-D (with aug-cc-pVTZ basis sets). The DFT results were validated using the highly accurate CCSD(T) calculations at the complete basis set (CBS) limit. To obtain CCSD(T)/CBS estimates, MP2/CBS energies were determined using the aug-cc-pVDZ and aug-cc-pVTZ basis sets with Helgaker's extrapolation scheme (99,100), and the differences in the (counterpoise-corrected) MP2 and CCSD(T) energies were calculated with aug-cc-pVDZ and added to the MP2/CBS values. We note that these energies are denoted as CCSD(T)/CBS for consistency with our previous work on other DNA–protein interactions (46,48,50) despite some literature referring to these extrapolated values as CBS(T) (44,101–106). Furthermore, only slight changes in the interaction energies of nucleobase pairs have been reported upon considering a higher-level triple to quadruple-zeta extrapolation (107,108).

Upon changing the M06–2X basis set from 6–31+G(d,p) to aug-cc-pVTZ, the MUD (mean unsigned deviation) for the sugar–π interactions decreases (Supplementary Table S1). However, due to significant errors in the nucleobase–aromatic amino acid π–π interactions, the overall MUD increases with respect to the CCSD(T)/CBS estimate from 1.5 to 2.4 upon basis set expansion along with a substantial increase in computational time. Indeed, M06–2X has been shown to accurately describe other DNA–protein noncovalent interactions with a moderately sized basis set (48,50). In contrast, ωB97x-D/aug-cc-pVTZ describes both broad classes of contacts as accurately as M06–2X/6–31+G(d,p), leading to the same overall MUD at an increased computational cost. Among the functionals tested, B3LYP-D3/aug-cc-pVTZ performs the best, but again this is coupled with significantly increased computational cost compared to the efficient M06–2X/6–31+G(d,p) combination. Most importantly, the trends in the interaction energies and the large magnitude of the nucleobase and sugar–aromatic amino acid π-interactions predicted by M06–2X/6–31+G(d,p) are preserved upon consideration of the CCSD(T)/CBS estimates. Thus, M06–2X/6–31+G(d,p) was confidently used in the present study to compare the strength of many different types of DNA–protein π–π interactions.

Software

All M06–2X, MP2 and CCSD(T) calculations were performed with program defaults using Gaussian 09 (revisions A.02 and C.01) (109), while all DFT–D and DFT–D3 calculations were performed using Q-Chem 4.0.1.0 (110).

RESULTS

Crystal structure analysis of nucleobase–aromatic amino acid contacts in nature

Overall distribution of contacts in DNA–protein complexes

Among the 428 crystal structures considered in the present work, 175 (41%) contain at least one nucleobase–amino acid stacking or T-shaped interaction, with 344 total nucleobase–amino acid stacking or T-shaped interactions identified. Most of the 175 crystal structures contain one or two interactions, but as many as 13 contacts can be found in a single structure (Figure 2A). These interactions occur in a wide variety of proteins, including DNA–binding and transcription proteins, with approximately 38% of the π–π contacts being identified in transferase proteins and 25% in hydrolase proteins (Figure 2B).

Figure 2.

Figure 2.

(A) Number of nucleobase–amino acid stacking/T-shaped interactions identified in PDB structures in the present study. (B) Types of proteins in which nucleobase–amino acid stacking/T-shaped interactions were found. (C) Overall composition of the proteins in the crystal structures considered in the present work.

Occurrence of nucleobases and aromatic amino acids in contacts

Pyrimidines are involved in more π–π interactions than purines (Figure 3A), where the population trend with respect to the nucleobase decreases according to T > C > A ∼ G. Specifically, 37% of the contacts involve T, with the remaining being relatively equally distributed among the other bases (∼20%). When the distribution is considered as a function of the amino acid (Figure 3B), significantly more interactions are found with Phe (44%) and Tyr (32%) than either His (11%) or Trp (13%). Nevertheless, Trp is the least common amino acid (∼1% abundance), which may explain the fewer contacts identified with this residue. On the other hand, Tyr, Phe and His have similar natural abundances (3–4%) and therefore our results suggest that His is less likely to form π–π stacking or T-shaped interactions with a DNA nucleobase. When all nucleobase–amino acid combinations are considered (Figure 3C), Phe, Tyr and Trp contacts decrease in abundance with respect to the nucleobase as T > C > A ∼ G, while His forms the most contacts with C (the second most frequently observed interaction with respect to the nucleobase) and does not form any contacts with G.

Figure 3.

Figure 3.

The proportions of (A) nucleobases, (B) amino acids and (C) nucleobase–amino acid combinations in DNA–protein π–π stacked and T-shaped orientations found in nature.

Relative abundance of face-to-face and face-to-edge π–π binding arrangements

The nucleobase–amino acid π–π contacts adopt conformations ranging from stacked (ω = 0–20°) to T-shaped (ω = 70–90°) orientations (Figure 4). However, the stacked orientation is substantially more common (58%) than the T-shaped configuration (13%). The T-shaped interactions are also less frequent than the inclined structures (ω = 20–70°, 29%, Figure 4), but this is due to the large number of angles in the inclined category, while the frequency for a given angle in the T-shaped and inclined categories are nearly equal (approximately <5%). Within the π–π stacking interactions, the dimers more commonly adopt a tilt of 5–10° rather than a perfectly parallel orientation (ω = 0). Conversely, the perfectly perpendicular arrangement (ω = 90°) is the preferred T-shaped configuration. The most common inclined structures (ω = 20–70°) involve either a ω = 25–30° or a maximum tilt of 45–50° (Figure 4).

Figure 4.

Figure 4.

Frequency of tilt angle (degrees) between the ring planes for all interactions according to the (A) nucleobase or (B) amino acid.

Dependence of π–π binding arrangement on the nucleobase

A correlation exists between the nucleobase in the dimer and the tilt angle adopted (Figure 4A). Specifically, although all nucleobases prefer a stacked orientation, the largest frequency occurs with ω = 5–10° for T, C and A, but with ω = 10–15° for G. Among the inclined orientations, C and G prefer only slight deviations from stacking (ω = 25–35°), T prefers the maximum degree of tilt (ω = 45–50°) and A rarely adopts an inclined orientation (< 5% frequency for ω = 30–70°). Cytosine is the most likely nucleobase to adopt a T-shaped structure (15% frequency for ω = 85–90°). Although A and T also adopt T-shaped orientations with > 10% frequency, G rarely forms a T-shaped dimer (< 5% frequency). Interestingly, A is only found in a T-shaped orientation with Phe. Furthermore, 74% of the identified T-shaped interactions and 21% of the inclined interactions involve a nucleobase edge and an amino acid face.

Dependence of π–π binding arrangement on the amino acid

As discussed for the nucleobases, all amino acids show a preference for the ω = 5–10° stacked orientation, except His which equally prefers a 0–5° tilt (Figure 4B). In fact, His and Trp are rarely found in any orientation besides a stacked structure (5 and 8% frequency for ω = 20–90°, respectively). Although Tyr adopts almost the full range of tilt angles, a stacked or slightly tilted orientation is most frequent adopted. Unlike the other amino acids, Phe exhibits a substantial occupancy of both inclined (ω = 45–50°) and T-shaped (ω = 85–90°) orientations (32 and 20%, respectively).

Trends in the distances between monomers

In addition to the varied tilt angles adopted by the nucleobase–amino acid dimers, many different separation distances are observed (Supplementary Figure S4A). Overall, the closest heavy atom distances fall between 3.0 and 4.2 Å in the nucleobase–amino acid π–π dimers, with nearly a quarter of all interactions adopting a 3.5 Å separation. Interestingly, there is no clear correlation between the separation distance and tilt angle (Supplementary Figure S4B). Furthermore, unlike the stacking angle, which preferentially adopts a different value for each nucleobase, all bases have the same trend in the preferred separation distance (Supplementary Figure S4C). Conversely, the amino acids do not follow a particular trend in the separation distance. Specifically, Tyr adopts a large range of distances and His general adopts shorter distances (< 5% occupancy of distances greater than 3.7 Å; Supplementary Figure S4D), while Phe and Trp display the same overall trend as across all π–π contacts.

Quantum chemical calculations of nucleobase–aromatic amino acid interaction energies

The discussion above shows that nucleobase–amino acid dimers adopt a wide range of π–π structures and therefore it is not surprising that the dimers also span a significant range of binding strengths (Figure 5). The magnitude of the nucleobase–amino acid stacking or T-shaped π–π interaction depends on several factors such as the relative monomer orientation (including tilt angle), and the identity of the nucleobase and amino acid. For all DNA–protein pairs, the largest (most negative) binding energy occurs when the amino acid and nucleobase adopt a stacked (ω = 0–20°), not T-shaped (ω = 70–90°), orientation. With the exception of the fact that the maximum interaction energies generally occur for T and G, the most dominant trends depend on the amino acid. Therefore, interesting features of the binding energies will be discussed below as a function of the amino acid.

Figure 5.

Figure 5.

Binding energy of nucleobase–amino acid π–π interactions with respect to the tilt angle (degrees) for dimers involving (A) Phe, (B) Trp, (C) Tyr (for TyrCW (diamonds) and TyrCCW (squares) and (D) His (for Hisδ (diamonds), Hisϵ (squares) and His+ (triangles) (see Supplementary Figure S1, SI, for the definition of different Tyr and His conformations).

Phenylalanine

Phe interactions are up to –26.3 kJ mol−1. In the stacked orientation, G or T generally leads to stronger contacts than A or C, while G or C interactions are generally stronger than T or A T-shaped interactions (Figure 5A). This leads to, for example, an 18.8 kJ mol−1 energy difference between the strongest T:Phe stacking and T-shaped dimers (Figure 5A).

Tryptophan

Similarly, the Trp interactions are up to –31.3 kJ mol−1, with the strongest stacking interactions occurring with T or G (Figure 5B). However, no general conclusions about the strength of Trp T-shaped interactions can be drawn since only one such contact was identified (Figure 5B).

Tyrosine

Unlike Trp and Phe, Tyr can adopt multiple conformations when stacked with the nucleobases, which differ in the orientation of the hydroxyl moiety (Supplementary Figure S1). However, the hydroxyl orientation has a negligible effect on the binding energy, with less than a 5 kJ mol−1 energy difference between the two conformations for 74% of the interactions considered (Figure 5C). As discussed for Phe and Trp, Tyr interactions are stronger in the stacked rather than T-shaped orientation, with the largest deviation (up to 28.7 kJ mol−1) occurring for T dimers (Figure 5C). The overall strongest Tyr interaction occurs with C (–31.6 kJ mol−1, Figure 5C). Tyr nucleobase interactions are similar in strength to the corresponding Phe contact. Furthermore, although Tyr, Phe and Trp bind strongest to the pyrimidines, there is only a 5 kJ mol−1 difference in the corresponding strongest interaction energies for these three amino acids.

Histidine

Similar to Tyr, (neutral) His can adopt two orientations (protonation states) with respect to the nucleobase (Supplementary Figure S1). However, unlike Tyr interactions, His contacts are highly dependent on the amino acid orientation, with 60% of the structures considered displaying a greater than 10 kJ mol−1 energy difference with a change in His orientation and the largest difference (18 kJ mol−1) occurring in a C dimer (Figure 5D). The greatest number of contacts and strongest interactions (–27.1 kJ mol−1) with (neutral) His occur when stacked with C, which contrasts the greatest number and strongest interactions found with T for all other amino acids. As previously mentioned, very few His contacts were found to adopt a T-shaped orientation in nature (Figure 5D), where the only T-shaped interaction is –5.0 kJ mol−1 and occurs with A. Interactions with cationic His are up to –48.7 kJ mol−1, which is 21.6 kJ mol−1 stronger than the neutral dimer. As for neutral His, the strongest interaction for cationic His occurs when stacked with C. Interestingly, although the interaction strengths between His and A, G or C always increase, and the interaction strengths with T decrease upon protonation. The different behaviour of T:His dimers upon protonation has been previously noted in the literature (49) and is attributed to the more positive π–system of T compared to the other nucleobases.

Crystal structure analysis of deoxyribose sugar–aromatic amino acid contacts in nature

Overall distribution of sugar–π contacts in DNA–protein complexes

Among the 428 crystal structures searched in the present study, 230 sugar–π contacts were identified in 137 structures. Although crystal structures containing sugar–π contacts typically have only one such interaction, up to six sugar–π contacts can be observed in a single structure (Figure 6A). The sugar–π contacts occur in a wide variety of DNA–binding proteins (Figure 6B). Interestingly, 68% of the structures do not contain a sugar–π interaction (Figure 6A), which is more than the 59% that do not contain a nucleobase–amino acid contact (Figure 2A), while 38% of the structures do not contain any nucleobase π–π or sugar–π interactions (Figure 6C). Nevertheless, both types of amino acid interactions can be found in 11% of the structures, with these DNA–protein complexes typically possessing one of each type, but can contain up to six of one and two of the other class (Figure 6C).

Figure 6.

Figure 6.

(A) The number of sugar–π contacts found in each structure. (B) Types of proteins in which sugar–π interactions were found. (C) The number of sugar–π and nucleobase–amino acid interactions observed in crystal structures considered in the present work.

Occurrence of aromatic amino acids in sugar–π contacts

Sugar–π interactions occur with all four aromatic amino acids (Figure 7A). However, most sugar–π contacts involve Tyr (45%), which is closely followed by Phe (36%). In contrast, few sugar–π interactions are found with His (4%) despite a similar natural abundance as Phe and Tyr (3–4%). Trp interactions make up 14% of all sugar–π interactions, which is consistent with the relative natural abundance of Trp (1%) in comparison to Tyr and Phe.

Figure 7.

Figure 7.

(A) Composition of sugar–π interactions found in nature as a function of amino acid. (B) Frequency of sugar–π interactions found in nature with respect to the class of contact.

Classification of sugar–π contacts in DNA–protein complexes

A variety of contacts occur between the π–systems (faces) of the aromatic amino acids and deoxyribose in nature, which can be classified according to the sugar “edge” (Figure 8). The sugar edge that interacts with the π–system can involve a single proton, two protons (a bridge), three protons (a face), a lone pair, or both a lone pair and a proton (lone pair–proton). Furthermore, these contacts can involve any of the hydrogen atoms in the sugar ring. The bridged and face interactions are the most common in the structures searched, with overall abundances of 33 and 30%, respectively (Figure 7B). While lone pair–proton interactions are fairly uncommon (4%), distinction between lone pair–proton and lone pair interactions is difficult, which collectively account for 17% of the contacts and is similar to the proportion of single proton interactions (20%, Figure 7B). Example orientations of the four most common interactions from select crystal structures are provided in Figure 9, which further clarifies the geometry of these contacts in nature.

Figure 8.

Figure 8.

(A) Numbering scheme of the sugar moiety. Representative sugar–π interactions identified in crystal structures for (B) single proton, (C) face, (D) bridged, (E) lone pair and (F) lone pair–proton interactions (the amino acid is represented by a solid black line below the sugar).

Figure 9.

Figure 9.

Example dimer and overlay of all dimers for the four most common sugar–π contacts identified in crystal structures, including calculated binding strengths.

Relative monomer orientations in sugar–π contacts

Figure 9 displays overlays of all contacts identified for each of the four most common sugar–π contacts, which were obtained using RMS fitting of the sugar atoms involved in the interaction. From these representative examples, it can be seen that the sugar–π interactions display significant variation in the amino acid position, which covers nearly all relative monomer orientations for a given sugar–edge type and leads to a continuum between the edges. Variations in the sugar are also evident from the overlays, which mainly arise due to different puckering in the crystal structures.

Dependence of binding arrangement on the sugar atoms involved

Within each category of sugar–π interactions, there is a clear preference for contacts with certain atoms (Figure 10). For example, single proton interactions occur with H5a more than twice as frequently as any other proton. Similarly, the H1a–H2b bridged contact occurs more than three times as often as any other contact in this category and the H4–H5a–H5b contact dominates the face class, which is in fact the overall most frequent sugar–π interaction (25% frequency). All lone pair interactions identified involve O4′ (rather than O5′ or O3′ phosphate backbone atoms) and more frequently do not involve a proton. When O4′ lone pair–proton interactions occur, contacts involving H4 are twice as likely as those involving H1a.

Figure 10.

Figure 10.

Frequency of sugar–π interactions found in nature with respect to the type of contact and the amino acid.

Dependence of binding arrangement on the amino acid

Within a given type of interaction, certain amino acids are more prevalent (Figure 10). Specifically, the single proton interactions are most common with Tyr. On the other hand, lone pair and bridged interactions involving each of the four aromatic amino acids can be identified, with Tyr or Phe involved in the majority of the contacts. Conversely, Trp and Tyr compose approximately two-third of all face interactions. When the trend is instead considered as a function of amino acid and interaction adopted (Supplementary Figure S5), substantial variation in the types of contacts identified for each amino acid is noted. Trp only forms four types of sugar–π interactions in the crystal structures searched, which is fewer than for any other amino acid and does not include a single proton contact. The H4–H5a–H5b face interaction makes up 76% of all sugar–Trp interactions, while the other three Trp interactions include two O4′ interactions and the H2a–H5b bridged interaction. Unlike Trp, His forms seven different sugar–π interactions that span all four categories of sugar–π contacts, with the O4′ interaction being the most common (30%) and the H5b interaction also prevalent (20%, Supplementary Figure S5). In addition to being significantly more common, interactions with Phe and Tyr are markedly more varied, with more than 8 and 15 types of contacts found, respectively (Supplementary Figure S5). The most prevalent sugar–π Phe interaction is the H1a–H2b bridged interaction (43%), where Phe bridged interactions are in general considerably more common (59%) than face, lone pair and single proton contacts (19%, 16% and 13%, respectively). Unlike the other amino acids, Tyr does not substantially prefer one specific interaction. However, Tyr has some similarities to the other amino acids, where three of the four most common Tyr interactions include H4–H5a–H5b (most common for Trp), O4′ (most common for His) and H1a–H2b (most common for Phe).

Quantum chemical calculations of deoxyribose sugar–aromatic amino acid interaction energies

The previous section shows that sugar–π interactions with the aromatic amino acids can adopt many different orientations in DNA–protein complexes. This structural variation leads to binding strengths for (neutral) sugar–π interactions between approximately 0 and –30 kJ mol−1 (Figure 11). Interactions with Trp are particularly strong, with magnitudes of up to –29.3 kJ mol−1 and generally more stable than –20 kJ mol−1. Interactions with Tyr can also be strong (up to –31.6 kJ mol−1), but cover the full range of binding energies (i.e. from 0 to –30 kJ mol−1). In general, the Tyr interactions do not greatly depend on the orientation of the hydroxyl moiety, with 86% of all sugar–Tyr interactions displaying a less than 5 kJ mol−1 difference between the two orientations, but the dependence can be up to 22.1 kJ mol−1 when a hydrogen bond forms in addition to the sugar–π interaction. Conversely, although Phe and (neutral) His contacts are generally weaker, they exhibit a significant range (from 0 to –20 kJ mol−1, Figure 11). Similar to Tyr, the His binding strength depends on the amino acid orientation by 0.1–20 kJ mol−1. The overall strongest sugar–π contacts typically occur when His is cationic (especially when interacting with O4′), with binding strengths up to –68.2 kJ mol−1.

Figure 11.

Figure 11.

Binding energies of sugar–π interactions with respect to the type of contact for dimers involving (A) Phe, (B) Trp, (C) Tyr (for TyrCW (red) and TyrCCW (blue)), and (D) His (for Hisδ (red), Hisϵ (blue) and His+ (green)), (see Supplementary Figure S1, for the definition of different Tyr and His conformations).

Dependence on sugar edge

Among all sugar edge–aromatic amino acid combinations, only interactions with H2a, H2b, O4′–H4, H2a–H3, H1a–H2b and H4–H5a–H5b have (neutral) interaction energies stronger than –20 kJ mol−1 and only occur with Trp and Tyr. The strongest interactions with Trp, Tyr, Phe and (neutral) His occur for H4–H5a–H5b (–29.3 kJ mol−1), H2a (–31.6 kJ mol−1), H1a–H2b–H4 (–16.2 kJ mol−1) and H1a–H2b (–18.9 kJ mol−1), respectively. The overall four strongest interactions are the H4–H5a–H5b dimer (–29.3 kJ mol−1), followed by the H1a–H2b (–24.1 kJ mol−1), O4′ (–22.3 kJ mol−1) and H5a (–18.4 kJ mol−1) contacts (Figure 11). Furthermore, the binding strength of these four structures can vary by up to approximately 25 kJ mol−1 due to differences in the relative orientation of the amino acid residue (Figure 11).

DISCUSSION

Abundance of nucleobase–aromatic amino acid π–π interactions

In the 428 crystal structures containing DNA–protein π-interactions (see Supplementary Data), 344 nucleobase–aromatic amino acid π–π contacts were identified and, for the first time in the literature, unambiguously confirmed through visual inspection. These contacts were found in all types of proteins (Figure 2B). However, the protein distribution directly correlates with the protein composition of the DNA complexes investigated (Figure 2C), which suggests that the observed distribution is a consequence of the structures searched rather than one protein class being more likely to rely on nucleobase–amino acid π–π interactions.

Structure of nucleobase–aromatic amino acid π–π interactions

Among the nucleobase interactions identified, stacked orientations (with a 5–10° angle (tilt) between ring planes) are more prevalent than T-shaped arrangements in a 3:2 ratio (Figure 4). Nevertheless, structures ranging from perfectly parallel to perfectly perpendicular relative monomer orientations appear in nature. Interestingly, the typical closest heavy atom–heavy atom distance between the two monomers (3.5 Å; Supplementary Figure S4) matches the preferred distance previously identified in computational studies of isolated monomers (45,46), and therefore some features of the relative monomer orientations in crystal structures may arise due to the inherent nature of the interactions.

Composition of nucleobase–aromatic amino acid π–π interactions

The pyrimidines are more likely to be involved in π–π interactions with aromatic amino acids than the purines (Figure 3A), which contrasts expectations that a larger ring size may lead to more π-interactions in nature due to greater possible overlap. In terms of the amino acids, more interactions occur with Phe and Tyr than with Trp and His in nature (Figure 3B), which does not directly relate to the relative natural abundances of these amino acids. This finding also contrasts previous literature that reports His to be the most likely aromatic amino acid to be involved in DNA–protein π–π interactions (8). Furthermore, our observation that Phe, Tyr and Trp contacts decrease in abundance with respect to the nucleobase as T > C > A ∼ G. His was found to form the most contacts with C. No contacts between His and G were identified (Figure 3C). These findings contrast previous reports that His selectively binds to T and G, while Phe selectively binds to T and A (7,8). Discrepancies between the present study and previous work may arise due to the careful visual inspection implemented herein as additional verification prior to classifying the π–π interactions.

Strength of nucleobase–aromatic amino acid π–π interactions

Since there is a large variation in the geometry of nucleobase–amino acid π–π interactions in nature (Figure 4), it is not surprising that there is also significant variation in the calculated binding strengths (Figure 5), as reported previously in computational studies of isolated dimers (40–47) or select crystal structure geometries (8,34,40–42). The magnitude of the nucleobase–amino acid π–π interactions are up to approximately –30 kJ mol−1 and vary with the monomers involved and their relative orientation (with stacked structures being more stable than T-shaped). However, the trends in the binding strengths are not always the same as those found by considering two monomers in the absence of geometrical constraints imposed by an enzyme (45–47,49–57). Interestingly, most interactions identified in nature are on average 4.9 kJ mol−1 weaker than the corresponding optimal interaction previously reported between two monomers in the absence of geometrical constraints imposed by the enzyme (Supplementary Table S2) (45–47,49–57). This difference arises due to deviations in the geometries (Supplementary Table S2), including greater separation distances and tilt in the crystal structures, which likely arise due to constraints imposed by the protein versus the perfectly parallel (stacked) or perpendicular (T-shaped) monomer arrangements implemented in the potential energy surface searches. The perfectly stacked or T-shaped orientations, as well as the step size implemented, in previous calculations also explain why three of the interaction energies calculated in the natural orientations are slightly stronger than the “optimal” values identified by searching the potential energy surface. These features underscore the influence of the relative monomer orientations on the binding strengths. In agreement with previous studies of charged DNA–protein interactions (41,49,50,53) and reports that π–π and πcation–π interactions are distinct (111), cationic His has significantly stronger interactions than the neutral amino acids, with interaction energies up to approximately –50 kJ mol−1.

Biological relevance of nucleobase–aromatic amino acid π–π interactions

Nucleobase–aromatic amino acid π–π interactions have been implicated in the discriminatory and catalytic removal of damaged bases from the human genetic code by the DNA repair enzyme alkyladenine DNA glycosylase (AAG) (4,112). Specifically, unlike other DNA repair enzymes in the same glycosylase family, the active site of AAG is lined with three aromatic amino acids and there is limited hydrogen bonding to the substrate (Figure 12A). Although the resolution of the associated crystal structure (PDB ID: 1EWN) is lower than the criteria used to select PDB structures in this study, and the interactions occur with a damaged nucleobase, the strengths of contacts between AAG and the bound substrate, ethenoadenine (ϵA), were evaluated using the same methodology employed in the present work. Specifically, the interactions were determined to be –24.4 kJ mol−1 for the ϵA:Tyr127 stacking interaction, –6.9 kJ mol−1 for the ϵA:His136 tilted (inclined) contact and –1.0 kJ mol−1 for the ϵA:Tyr159 T-shaped (amino acid–edge) interaction. In particular, the strength of the ϵA:Tyr127 contact suggests that such active site π–π interactions could be involved in substrate identification and/or binding.

Figure 12.

Figure 12.

(A) The damaged nucleobase–amino acid π–π interactions in the AAG active site (PDB ID: 1EWN), (B) the natural nucleobase–amino acid π–π in the active site (PDB ID: 1G38) and (C) the sugar–π interaction in the Dpo4 active site (PDB ID: 3QZ8).

The broader implications of the DNA–protein π–π contacts in the AAG active site were determined by a computational study of the associated catalytic mechanism using a full DNA–AAG model and different substrates (112). Specifically, the individual effects of sequentially removing each AAG active site amino acid suggest that the π–rings are catalytic (by approximately 30 kJ mol−1) for the removal of neutral damaged nucleobases, but anti-catalytic for the removal of charged (cationic) alkylated nucleobases (by up to 35 kJ mol−1). Coupled with previous work studying the strength of isolated dimers between a natural/damaged DNA base and an aromatic amino acid (47,51,52,57), a proposal was developed that AAG has evolved to take advantage of active site amino acid π–systems in several ways. First, the flexibility provided by the active composition (lack of discriminatory hydrogen bonding) explains why AAG can excise many different substrates. Second, the π–π interactions with the substrate maximize the catalytic power towards neutral lesions that are inherently difficult to excise. Finally, although the ability to remove neutral DNA lesions comes at the expense of the excision of cationic lesions, the inherent nature of πcation–π interactions (47,51,52,57) allows AAG to more strongly attract and bind cationic lesions.

Although AAG provides an exemplary example of the multiple roles π–π contacts can play in biology, interactions between damaged nucleobases and an aromatic amino acid residue may also be involved in the catalytic mechanism of other enzymes. Repair enzymes such as hUNG2 (113,114) and hOgg1 (115,116) are known to have π–π interactions in their active sites (involving Phe or Tyr), which may contribute towards the catalytic function of these enzymes. Notably, although AAG, hUNG2 and hOgg1 all involve damaged DNA nucleobase active site π–π interactions, π–π interactions are also known to contribute to the binding and catalytic function of proteins that process natural DNA. For example, the extrahelical target A of N6-adenine DNA methyltransferase (PDB ID: 1G38; Figure 12B) forms an active site stacking interaction with Tyr108 (–21.6 kJ mol−1) and a T-shaped interaction with Phe196 (–7.7 kJ mol−1). Furthermore, as discussed for the DNA repair enzymes, the π–π interactions in the active site of N-DNA methyltransferases (including N6-adenine DNA methyltransferase) have been proposed to contribute to catalysis (117).

Abundance of deoxyribose–aromatic amino acid sugar–π interactions

Among the 428 crystal structures searched in the present work, 230 sugar–π contacts between the deoxyribose moiety and an aromatic amino acid were identified. Although a considerable number of nucleobase π–π interactions were expected based on previous literature (7,8,21–23,34), this is the first time that the significance of sugar–π contacts has been highlighted. Indeed, sugar–π contacts represent approximately 40% of all DNA–protein π–contacts found in the present work, and therefore occur with nearly the same frequency as nucleobase–amino acid π–π interactions. As discussed for the nucleobase–aromatic amino acid interactions, the sugar–π contacts are found in a variety of different proteins, with the relative abundances equal to the types of proteins searched (Figures 2C and 6B).

Structure of deoxyribose–aromatic amino acid sugar–π interactions

Although only π-interactions between the entire sugar face of pyranose and the aromatic amino acid were considered in previous work (61,62,67,76), a range of sugar–π contacts were identified for deoxyribose in the present study, which can involve a single proton, two protons (a bridge), three protons (a face), a lone pair, or both a lone pair and a proton (lone pair–proton; Figures 7B, 8 and 10). As a result, we introduce a classification system for DNA–protein sugar–π interactions based on the sugar edge participating in the contact, which can yield C–H···π and/or lone–pair···π interactions. In the literature, pyranoses involved in stacking interactions simultaneously participated in hydrogen bonding via a hydroxyl group and/or other van der Waals contact(s) (82–84). Although this preference was not explicitly examined in the present work, such hydrogen-bonding contacts are likely less important in the case of deoxyribose due to the lack of hydroxyl substituents on the sugar in DNA helices (except at the terminal positions). Interestingly, for each class of sugar–π interactions, the amino acid adopts a continuum of positions with respect to the sugar moiety (Figure 9).

Composition of deoxyribose–aromatic amino acid sugar–π interactions

Across the deoxyribose contacts identified in nature, each hydrogen atom in the sugar ring is involved in an interaction with the π–system of an aromatic amino acid (Figure 10). Nevertheless, certain atoms are more prone to participate in particular types of contacts (H5a dominates the single proton, H1a–H2b the bridged and H4–H5a–H5b the face interactions). Furthermore, although the bridged and face interactions are the most common overall relative monomer arrangements (Figure 10), interactions with the ring oxygen (rather than the O3′ or O5′ phosphate atoms) are also prevalent and are sometimes accompanied by a C–H···π contact.

The abundance of interactions with respect to the amino acid involved (Figure 7A) is similar to that discussed for the amino acid–nucleobase contacts (Figure 3B), with most interactions involving Tyr and Phe. The preferred binding arrangement is different for each amino acid, which likely occurs due to differences in the relative size of the π–systems. Specifically, Trp displays a preference for face interactions, Phe prefers bridged contacts, and His adopts the most lone pair–π contacts (Figure 10). Although Tyr assumes a wide variety of conformations with respect to the sugar moiety, most single proton interactions occur with Tyr (Figure 10).

Strength of deoxyribose–aromatic amino acid sugar–π interactions

The variation in the sugar–π conformations leads to a significant range in the binding energies (Figure 11), which are as strong as, or even stronger than, nucleobase–amino acid interactions (Figure 5). Indeed, the magnitude of sugar–π contacts found in nature can be up to approximately –70 kJ mol−1. Among the neutral dimers, the sugar interactions with Trp are the strongest (most negative), which is consistent with the highly stable nucleobase–Trp interactions found in the present work and reported previously (45,46,50), as well as carbohydrate–Trp contacts (83). Nevertheless, the strongest interactions overall occur with cationic His, as discussed for the nucleobase π–contacts, which typically represent lone pair binding arrangements.

Interestingly, although the strongest interactions occur when a pyranose C–H is directed at the center of the aromatic face (76), the amino acid displays a wide range of locations with respect to the sugar in DNA sugar–π contacts. This implies that the sugar composition plays a large role in determining the preferred geometry of the interaction. To gain further fundamental information about sugar–π contacts, calculations as previously conducted for nucleobase–amino acid pairs (45,46,49) that consider the preferred relative orientation of isolated dimers in the absence of an enzyme, as well as the associated inherent interaction energy, should be considered for sugars of varying composition.

Biologically relevance of deoxyribose–aromatic amino acid sugar–π interactions

Despite the fact that DNA sugar–π contacts with aromatic amino acid residues are rarely discussed in the literature, the importance of analogous carbohydrate–π interactions in many fields (62–74) coupled with the number of contacts found in nature in the present study suggests that these interactions may also be important for biological processes, either by providing stability to DNA–protein complexes, facilitating DNA binding/recognition, or possibly even having a greater (catalytic) role. As an example, the DNA polymerases in the RT, Y, X and B-families that are involved in crucial cell replication have a conserved Tyr/Phe in their active sites. It has been proposed that the conserved π–containing amino acid uses stacking with the deoxyribose sugar through the R-group and hydrogen bonding with the 3′–OH through the backbone to select DNA deoxyribose nucleotide triphosphates (dNTPs) over RNA ribose nucleotide triphosphates (rNTPs) in a 1 000 000 (118) to 100 ratio (119). Indeed, the conserved Tyr/Phe has been referred to as a ‘steric gate’ since steric clashes may prevent incorporate of rNTP (enhance dNTP incorporation) (120). Nevertheless, the only support for this proposal comes from crystal structures (119,121) or mutational studies (120,122–125) that replace Tyr/Phe by Gly/Ala/Val, which significantly reduces the size of the R-group and removes the π–system.

In the present work, the sugar–π interactions in crystal structures with a nucleoside triphosophate bound in the active site were re-evaluated and determined to almost exclusively represent either H1a–H2b or H1a–H2b–H4 contacts with Tyr or Phe depending on the dNTP orientation. A representative example is the H1a–H2b sugar–π interaction between Tyr12 and the incoming dCTP in the Dpo4 active site (a Y-family polymerase; PDB ID: 3QZ8; Figure 12C), which has a corresponding calculated binding energy of –15.6 kJ/mol (TyrCW or –12.6kJ/mol TyrCCW; see Supplementary Figure S1 for definition of Tyr orientations). This is a significant magnitude and indicates that the sugar–π contact with Tyr12 may be more than simply a steric constraint and, for example, may contribute to the selection of dNTP over rNTP. Indeed, modification of the sugar to the corresponding ribose analogue severely impacts this interaction in the polymerase active site, decreasing the closest heavy atom contact distance between the sugar and Tyr planes to 2.126 Å (3.397 Å with deoxyribose present) and is repulsive by approximately 95 kJ mol−1 (with same hydroxyl orientation, which makes the sugar–π interaction highly repulsive). Although the RNA sugar–π interaction is repulsive compared to the stabilizing interaction with the DNA analogue in the Dpo4 example discussed above, this calculation was performed on a structure obtained by replacing the sugar without geometry relaxation. Therefore, it is possible that different relative monomer orientations in RNA–protein complexes allow sugar–π contacts to be capitalized for cellular RNA processing. Nevertheless, this example illustrates the potential importance of DNA sugar–π contacts in human biology.

CONCLUSIONS

In summary, our calculations yield important insight into the abundance and strength of over 500 DNA–protein interactions in nature. This in turn can be used to estimate the magnitude of similar contacts identified in lower resolution or newly released crystal structures. Most importantly, the present contribution suggests that nucleobase–amino acid contacts are wider spread than perhaps originally believed and highlights the role of novel interactions between the deoxyribose moiety and the aromatic amino acids, which parallel the carbohydrate–π contacts identified in glycobiology (62–68). Furthermore, we confirm for the first time that both broad classes of DNA–protein π–contacts are varied in structure and can provide significant stability to DNA–protein complexes. We therefore propose that the critical role of nucleobase–aromatic amino acids π–π interactions and deoxyribose–aromatic amino acid sugar–π contacts in many biological processes may yet to be uncovered. Indeed, examples can be found of both types of DNA–protein contacts in the active sites of enzymes crucial for human survival. Understanding the DNA–protein π-interactions in such systems may lead to advances in nanotechnology (69–74) and (anticancer (4,126,127) or antiviral (128–130)) drug development.

SUPPLEMENTARY DATA

Supplementary Data are available at NAR Online.

SUPPLEMENTARY DATA

Acknowledgments

Computational resources from the Upscale and Robust Abacus for Chemistry in Lethbridge (URACIL) and those provided by Westgrid and Compute/Calcul Canada are greatly appreciated.

FUNDING

Natural Sciences and Engineering Research Council of Canada (249598-07); Canada Research Chair Program (950-228175); Canadian Foundation for Innovation (22770); Natural Sciences and Engineering Research Council of Canada Undergraduate Summer Research Award (to K.A.W.); Natural Sciences and Engineering Research Council of Canada Alexander Graham Bell Canada Graduate Scholarship-Doctoral and the University of Lethbridge (to J.L.K.).

Conflict of interest statement. None declared.

REFERENCES

  • 1.Ciccia A., Elledge S.J. The DNA damage response: making it safe to play with knives. Mol. Cell. 2010;40:179–204. doi: 10.1016/j.molcel.2010.09.019. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 2.Hoeijmakers J.H.J. Genome maintenance mechanisms for preventing cancer. Nature. 2001;411:366–374. doi: 10.1038/35077232. [DOI] [PubMed] [Google Scholar]
  • 3.Jackson S.P., Bartek J. The DNA-damage response in human biology and disease. Nature. 2009;461:1071–1078. doi: 10.1038/nature08467. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 4.Stivers J.T., Jiang Y.L. A mechanistic perspective on the chemistry of DNA repair glycosylases. Chem. Rev. 2003;103:2729–2759. doi: 10.1021/cr010219b. [DOI] [PubMed] [Google Scholar]
  • 5.von Hippel P.H., Berg O.G. On the specificity of DNA-protein interactions. Proc. Natl. Acad. Sci. U.S.A. 1986;83:1608–1612. doi: 10.1073/pnas.83.6.1608. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 6.Rohs R., Jin X., West S.M., Joshi R., Honig B., Mann R.S. Origins of specificity in protein-DNA recognition. Ann. Rev. Biochem. 2010;79:233–269. doi: 10.1146/annurev-biochem-060408-091030. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 7.Luscombe N.M., Laskowski R.A., Thornton J.M. Amino acid–base interactions: a three-dimensional analysis of protein–DNA interactions at an atomic level. Nucleic Acids Res. 2001;29:2860–2874. doi: 10.1093/nar/29.13.2860. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 8.Baker C.M., Grant G.H. Role of aromatic amino acids in protein–nucleic acid recognition. Biopolymers. 2007;85:456–470. doi: 10.1002/bip.20682. [DOI] [PubMed] [Google Scholar]
  • 9.Rutledge L.R., Wetmore S.D. Quantum biochemistry. Wiley-VCH Verlag GmbH & Co. KGaA; 2010. pp. 307–336. [Google Scholar]
  • 10.Mavromoustakos T., Durdagi S., Koukoulitsa C., Simcic M., G. Papadopoulos M., Hodoscek M., Golic Grdadolnik S. Strategies in the rational drug design. Curr. Med. Chem. 2011;18:2517–2530. doi: 10.2174/092986711795933731. [DOI] [PubMed] [Google Scholar]
  • 11.Speck-Planche A., Kleandrova V.V., Luan F., Cordeiro M.N.D.S. Rational drug design for anti-cancer chemotherapy: multi-target QSAR models for the in silico discovery of anti-colorectal cancer agents. Bioorg. Med. Chem. 2012;20:4848–4855. doi: 10.1016/j.bmc.2012.05.071. [DOI] [PubMed] [Google Scholar]
  • 12.Bultinck J., Lievens S., Tavernier J. Protein-protein interactions: network analysis and applications in drug discovery. Curr. Pharm. Des. 2012;18:4619–4629. doi: 10.2174/138161212802651562. [DOI] [PubMed] [Google Scholar]
  • 13.Dermitzakis E.T. From gene expression to disease risk. Nat. Genet. 2008;40:492–493. doi: 10.1038/ng0508-492. [DOI] [PubMed] [Google Scholar]
  • 14.Cuccato G., Gatta G.D., di Bernardo D. Systems and synthetic biology: tackling genetic networks and complex diseases. Heredity. 2009;102:527–532. doi: 10.1038/hdy.2009.18. [DOI] [PubMed] [Google Scholar]
  • 15.Stranger B.E., Dermitzakis E.T. From DNA to RNA to disease and back: the ‘central dogma’of regulatory disease variation. Coron. Artery Dis. 2006;2:383–390. doi: 10.1186/1479-7364-2-6-383. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 16.Stranger B.E., Raj T. Genetics of human gene expression. Curr. Opin. Genet. Dev. 2013;23:627–634. doi: 10.1016/j.gde.2013.10.004. [DOI] [PubMed] [Google Scholar]
  • 17.Matthews B.W. No code for recognition. Nature. 1988;335:294–295. doi: 10.1038/335294a0. [DOI] [PubMed] [Google Scholar]
  • 18.Pabo C.O., Sauer R.T. Transcription factors: structural families and principles of DNA recognition. Annu. Rev. Biochem. 1992;61:1053–1095. doi: 10.1146/annurev.bi.61.070192.005201. [DOI] [PubMed] [Google Scholar]
  • 19.Suzuki M. A framework for the DNA–protein recognition code of the probe helix in transcription factors: the chemical and stereochemical rules. Structure. 1994;2:317–326. doi: 10.1016/s0969-2126(00)00033-2. [DOI] [PubMed] [Google Scholar]
  • 20.Mandel-Gutfreund Y., Schueler O., Margalit H. Comprehensive analysis of hydrogen bonds in regulatory protein DNA-complexes: in search of common principles. J. Mol. Biol. 1995;253:370–382. doi: 10.1006/jmbi.1995.0559. [DOI] [PubMed] [Google Scholar]
  • 21.Luscombe N.M., Thornton J.M. Protein–DNA interactions: amino acid conservation and the effects of mutations on binding specificity. J. Mol. Biol. 2002;320:991–1009. doi: 10.1016/s0022-2836(02)00571-5. [DOI] [PubMed] [Google Scholar]
  • 22.Lejeune D., Delsaux N., Charloteaux B., Thomas A., Brasseur R. Protein–nucleic acid recognition: statistical analysis of atomic interactions and influence of DNA structure. Proteins: Struct. Funct. Bioinform. 2005;61:258–271. doi: 10.1002/prot.20607. [DOI] [PubMed] [Google Scholar]
  • 23.Sathyapriya R., Vijayabaskar M., Vishveshwara S. Insights into protein–DNA interactions through structure network analysis. PLoS Comput. Biol. 2008;4:e1000170. doi: 10.1371/journal.pcbi.1000170. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 24.Meyer E.A., Castellano R.K., Diederich F. Interactions with aromatic rings in chemical and biological recognition. Angew. Chem. Int. Ed. Engl. 2003;42:1210–1250. doi: 10.1002/anie.200390319. [DOI] [PubMed] [Google Scholar]
  • 25.Salonen L.M., Ellermann M., Diederich F. Aromatic rings in chemical and biological recognition: energetics and structures. Angew. Chem. Int. Ed. 2011;50:4808–4842. doi: 10.1002/anie.201007560. [DOI] [PubMed] [Google Scholar]
  • 26.Brandl M., Weiss M.S., Jabs A., Sühnel J., Hilgenfeld R. C-H···π interactions in proteins. J. Mol. Biol. 2001;307:357–377. doi: 10.1006/jmbi.2000.4473. [DOI] [PubMed] [Google Scholar]
  • 27.Thomas A., Meurisse R., Charloteaux B., Brasseur R. Aromatic side-chain interactions in proteins. I. Main structural features. Proteins: Struct. Funct. Bioinform. 2002;48:628–634. doi: 10.1002/prot.10190. [DOI] [PubMed] [Google Scholar]
  • 28.Thomas A., Meurisse R., Brasseur R. Aromatic side-chain interactions in proteins. II. Near- and far-sequence phe-X pairs. Proteins: Struct. Functi. Bioinform. 2002;48:635–644. doi: 10.1002/prot.10191. [DOI] [PubMed] [Google Scholar]
  • 29.Meurisse R., Brasseur R., Thomas A. Aromatic side-chain interactions in proteins. Near- and far-sequence his–X pairs. Biochim. Biophys. Acta. 2003;1649:85–96. doi: 10.1016/s1570-9639(03)00161-4. [DOI] [PubMed] [Google Scholar]
  • 30.Chourasia M., Sastry G.M., Sastry G.N. Aromatic–aromatic interactions database, A2ID: an analysis of aromatic π-networks in proteins. Int. J. Biol. Macromol. 2011;48:540–552. doi: 10.1016/j.ijbiomac.2011.01.008. [DOI] [PubMed] [Google Scholar]
  • 31.Jenkins D.D., Harris J.B., Howell E.E., Hinde R.J., Baudry J. STAAR: statistical analysis of aromatic rings. J. Comput. Chem. 2013;34:518–522. doi: 10.1002/jcc.23164. [DOI] [PubMed] [Google Scholar]
  • 32.Yan C., Wu F., Jernigan R.L., Dobbs D., Honavar V. Characterization of protein–protein interfaces. The Protein Journal. 2008;27:59–70. doi: 10.1007/s10930-007-9108-x. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 33.Ellis J.J., Broom M., Jones S. Protein–RNA interactions: structural analysis and functional classes. Proteins: Struct. Funct. Bioinform. 2007;66:903–911. doi: 10.1002/prot.21211. [DOI] [PubMed] [Google Scholar]
  • 34.Mao L., Wang Y., Liu Y., Hu X. Molecular determinants for ATP-binding in proteins: a data mining and quantum chemical analysis. J. Mol. Biol. 2004;336:787–807. doi: 10.1016/j.jmb.2003.12.056. [DOI] [PubMed] [Google Scholar]
  • 35.Řezáč J., Riley K.E., Hobza P. Evaluation of the performance of post-hartree-fock methods in terms of intermolecular distance in noncovalent complexes. J. Comput. Chem. 2012;33:691–694. doi: 10.1002/jcc.22899. [DOI] [PubMed] [Google Scholar]
  • 36.Sherrill C.D. Energy component analysis of π interactions. Acc. Chem. Res. 2012;46:1020–1028. doi: 10.1021/ar3001124. [DOI] [PubMed] [Google Scholar]
  • 37.Riley K.E., Hobza P. Noncovalent interactions in biochemistry. Wiley Interdiscip. Rev. Comput. Mol. Sci. 2011;1:3–17. [Google Scholar]
  • 38.Grimme S., Steinmetz M. Effects of london dispersion correction in density functional theory on the structures of organic molecules in the gas phase. Phys. Chem. Chem. Phys. 2013;15:16031–16042. doi: 10.1039/c3cp52293h. [DOI] [PubMed] [Google Scholar]
  • 39.Johnson E.R., Otero-de-la-Roza A., Dale S.G., DiLabio G.A. Efficient basis sets for non-covalent interactions in XDM-corrected density-functional theory. J. Chem. Phys. 2013;139:214109. doi: 10.1063/1.4832325. [DOI] [PubMed] [Google Scholar]
  • 40.Copeland K.L., Anderson J.A., Farley A.R., Cox J.R., Tschumper G.S. Probing phenylalanine/adenine pi-stacking interactions in protein complexes with explicitly correlated and CCSD(T) computations. J. Phys. Chem. B. 2008;112:14291–14295. doi: 10.1021/jp805528v. [DOI] [PubMed] [Google Scholar]
  • 41.Cauet E., Rooman M., Wintjens R., Lievin J., Biot C. Histidine-aromatic interactions in proteins and protein-ligand complexes: quantum chemical study of X-ray and model structures. J. Chem. Theory Comput. 2005;1:472–483. doi: 10.1021/ct049875k. [DOI] [PubMed] [Google Scholar]
  • 42.Copeland K.L., Pellock S.J., Cox J.R., Cafiero M.L., Tschumper G.S. Examination of tyrosine/adenine stacking interactions in protein complexes. J. Phys. Chem. B. 2013;117:14001–14008. doi: 10.1021/jp408027j. [DOI] [PubMed] [Google Scholar]
  • 43.Cysewski P. A Post-SCF complete basis set study on the recognition patterns of uracil and cytosine by aromatic and π-aromatic stacking interactions with amino acid residues. Phys. Chem. Chem. Phys. 2008;10:2636–2645. doi: 10.1039/b718394a. [DOI] [PubMed] [Google Scholar]
  • 44.Riley K.E., Pitoňák M., Černý J.I., Hobza P. On the structure and geometry of biomolecular binding motifs (hydrogen-bonding, stacking, X−H···π): WFT and DFT calculations. J. Chem. Theory Comput. 2010;6:66–80. doi: 10.1021/ct900376r. [DOI] [PubMed] [Google Scholar]
  • 45.Rutledge L.R., Campbell-Verduyn L.S., Wetmore S.D. Characterization of the stacking interactions between DNA or RNA nucleobases and the aromatic amino acids. Chem. Phys. Lett. 2007;444:167–175. [Google Scholar]
  • 46.Rutledge L.R., Durst H.F., Wetmore S.D. Evidence for stabilization of DNA/RNA-protein complexes arising from nucleobase-amino acid stacking and T-shaped interactions. J. Chem. Theory Comput. 2009;5:1400–1410. doi: 10.1021/ct800567q. [DOI] [PubMed] [Google Scholar]
  • 47.Rutledge L.R., Campbell-Verduyn L.S., Hunter K.C., Wetmore S.D. Characterization of nucleobase-amino acid stacking interactions utilized by a DNA repair enzyme. J. Phys. Chem. B. 2006;110:19652–19663. doi: 10.1021/jp061939v. [DOI] [PubMed] [Google Scholar]
  • 48.Wells R.A., Kellie J.L., Wetmore S.D. Significant strength of charged DNA–protein π–π interactions: a preliminary study of cytosine. J. Phys. Chem. B. 2013;117:10462–10474. doi: 10.1021/jp406829d. [DOI] [PubMed] [Google Scholar]
  • 49.Churchill C.D.M., Wetmore S.D. Noncovalent interactions involving histidine: the effect of charge on π-π stacking and T-shaped interactions with the DNA nucleobases. J. Phys. Chem. B. 2009;113:16046–16058. doi: 10.1021/jp907887y. [DOI] [PubMed] [Google Scholar]
  • 50.Rutledge L.R., Wetmore S.D. The Assessment of density functionals for DNA-protein stacked and T-shaped complexes. Can. J. Chem. 2010;88:815–830. [Google Scholar]
  • 51.Rutledge L.R., Durst H.F., Wetmore S.D. Computational comparison of the stacking interactions between the aromatic amino acids and the natural or (Cationic) methylated nucleobases. Phys. Chem. Chem. Phys. 2008;10:2801–2812. doi: 10.1039/b718621e. [DOI] [PubMed] [Google Scholar]
  • 52.Rutledge L.R., Wetmore S.D. Remarkably strong T-shaped interactions between aromatic amino acids and adenine: their increase upon nucleobase methylation and a comparison to stacking. J. Chem. Theory Comput. 2008;4:1768–1780. doi: 10.1021/ct8002332. [DOI] [PubMed] [Google Scholar]
  • 53.Leavens F.M.V., Churchill C.D.M., Wang S., Wetmore S.D. Evaluating how discrete water molecules affect protein–DNA π–π and π+–π stacking and T-shaped interactions: the case of histidine-adenine dimers. J. Phys. Chem. B. 2011;115:10990–11003. doi: 10.1021/jp205424z. [DOI] [PubMed] [Google Scholar]
  • 54.Churchill C.D.M., Navarro-Whyte L., Rutledge L.R., Wetmore S.D. Effects of the biological backbone on DNA-protein stacking interactions. Phys. Chem. Chem. Phys. 2009;11:10657–10670. doi: 10.1039/b910747a. [DOI] [PubMed] [Google Scholar]
  • 55.Churchill C.D.M., Rutledge L.R., Wetmore S.D. Effects of the biological backbone on sacking interactions at DNA-protein interfaces: the interplay between the backbone-π and π-π components. Phys. Chem. Chem. Phys. 2010;12:14515–14526. doi: 10.1039/c0cp00550a. [DOI] [PubMed] [Google Scholar]
  • 56.Rutledge L.R., Navarro-Whyte L., Peterson T.L., Wetmore S.D. Effects of extending the computational model on DNA-protein T-shaped interactions: the case of adenine-histidine dimers. J. Phys. Chem. A. 2011;115:12646–12658. doi: 10.1021/jp203248j. [DOI] [PubMed] [Google Scholar]
  • 57.Rutledge L.R., Churchill C.D.M., Wetmore S.D. A preliminary investigation of the additivity of π−π or π+−π stacking and T-shaped interactions between natural or damaged DNA nucleobases and histidine. J. Phys. Chem. B. 2010;114:3355–3367. doi: 10.1021/jp911990g. [DOI] [PubMed] [Google Scholar]
  • 58.Spiwok V., Lipovová P., Skálová T., Buchtelová E., Hašek J., Králová B. Role of CH/π interactions in substrate binding by escherichia coli β-galactosidase. Carbohydr. Res. 2004;339:2275–2280. doi: 10.1016/j.carres.2004.06.016. [DOI] [PubMed] [Google Scholar]
  • 59.Spiwok V., Lipovová P., Skálová T., Vondráčková E., Dohnálek J., Hašek J., Králová B. Modelling of carbohydrate–aromatic interactions: ab initio energetics and force field performance. J. Comput. Aided Mol. Des. 2006;19:887–901. doi: 10.1007/s10822-005-9033-z. [DOI] [PubMed] [Google Scholar]
  • 60.Sujatha M.S., Sasidhar Y.U., Balaji P.V. Energetics of galactose– and glucose–aromatic amino acid interactions: implications for binding in galactose-specific proteins. Protein Science. 2004;13:2502–2514. doi: 10.1110/ps.04812804. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 61.Wimmerová M., Kozmon S., Nečasová I., Mishra S.K., Komárek J., Koča J. Stacking interactions between carbohydrate and protein quantified by combination of theoretical and experimental methods. PLoS ONE. 2012;7:e46032. doi: 10.1371/journal.pone.0046032. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 62.Asensio J.L., Ardá A., Cañada F.J., Jiménez-Barbero J. Carbohydrate–aromatic interactions. Acc. Chem. Res. 2012;46:946–954. doi: 10.1021/ar300024d. [DOI] [PubMed] [Google Scholar]
  • 63.Tatko C. Sugars stack up. Nat. Chem. Biol. 2008;4:586–587. doi: 10.1038/nchembio1008-586. [DOI] [PubMed] [Google Scholar]
  • 64.Laughrey Z.R., Kiehna S.E., Riemen A.J., Waters M.L. Carbohydrate-π interactions: what are they worth. J. Am. Chem. Soc. 2008;130:14625–14633. doi: 10.1021/ja803960x. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 65.Lee Y.C., Lee R.T. Carbohydrate-protein interactions: basis of glycobiology. Acc. Chem. Res. 1995;28:321–327. [Google Scholar]
  • 66.Stanca-Kaposta E.C., Çarçabal P., Cocinero E.J., Hurtado P., Simons J.P. Carbohydrate–aromatic interactions: vibrational spectroscopy and structural assignment of isolated monosaccharide complexes with p-hydroxy toluene and N-acetyl l-tyrosine methylamide. J. Phys. Chem. B. 2013;117:8135–8142. doi: 10.1021/jp404527s. [DOI] [PubMed] [Google Scholar]
  • 67.Kumari M., Sunoj R.B., Balaji P.V. Exploration of CH···π mediated stacking interactions in saccharide: aromatic residue complexes through conformational sampling. Carbohydr. Res. 2012;361:133–140. doi: 10.1016/j.carres.2012.08.015. [DOI] [PubMed] [Google Scholar]
  • 68.V. Balaji P. Contribution of C-H···π; interactions to the affinity and specificity of carbohydrate binding sites. Mini-Rev. Org. Chem. 2011;8:222–228. [Google Scholar]
  • 69.Cao M., Fu A., Wang Z., Liu J., Kong N., Zong X., Liu H., Gooding J.J. Electrochemical and theoretical study of π–π stacking interactions between graphitic surfaces and pyrene derivatives. J. Phys. Chem. C. 2014;118:2650–2659. [Google Scholar]
  • 70.Cho Y., Min S.K., Yun J., Kim W.Y., Tkatchenko A., Kim K.S. Noncovalent interactions of DNA bases with naphthalene and graphene. J. Chem. Theory Comput. 2013;9:2090–2096. doi: 10.1021/ct301097u. [DOI] [PubMed] [Google Scholar]
  • 71.Gazit E. Self-assembled peptide nanostructures: the design of molecular building blocks and their technological utilization. Chem. Soc. Rev. 2007;36:1263–1269. doi: 10.1039/b605536m. [DOI] [PubMed] [Google Scholar]
  • 72.Kumar R.M., Elango M., Subramanian V. Carbohydrate-aromatic interactions: the role of curvature on XH···π interactions. J. Phys. Chem. A. 2010;114:4313–4324. doi: 10.1021/jp907547f. [DOI] [PubMed] [Google Scholar]
  • 73.Sharma R., McNamara J.P., Raju R.K., Vincent M.A., Hillier I.H., Morgado C.A. The interaction of carbohydrates and amino acids with aromatic systems studied by density functional and semi-empirical molecular orbital calculations with dispersion corrections. Phys. Chem. Chem. Phys. 2008;10:2767–2774. doi: 10.1039/b719764k. [DOI] [PubMed] [Google Scholar]
  • 74.Cozens C., Pinheiro V.B., Vaisman A., Woodgate R., Holliger P. A short adaptive path from DNA to RNA polymerases. Proc. Natl. Acad. Sci. U.S.A. 2012;109:8067–8072. doi: 10.1073/pnas.1120964109. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 75.Tsuzuki S., Uchimaru T., Mikami M. Magnitude and nature of carbohydrate−aromatic interactions: ab initio calculations of fucose−benzene complex. J. Phys. Chem. B. 2009;113:5617–5621. doi: 10.1021/jp8093726. [DOI] [PubMed] [Google Scholar]
  • 76.Kozmon S., Matuška R., Spiwok V., Koča J. Three-dimensional potential energy surface of selected carbohydrates’ CH/π dispersion interactions calculated by high-level quantum mechanical methods. Chem. Eur. J. 2011;17:5680–5690. doi: 10.1002/chem.201002876. [DOI] [PubMed] [Google Scholar]
  • 77.Ramirez-Gualito K., Alonso-Rios R., Quiroz-Garcia B., Rojas-Aguilar A., Diaz D., Jimenez-Barbero J., Cuevas G. enthalpic nature of the CH/pi interaction involved in the recognition of carbohydrates by aromatic compounds, confirmed by a novel interplay of NMR, calorimetry, and theoretical calculations. J. Am. Chem. Soc. 2009;131:18129–18138. doi: 10.1021/ja903950t. [DOI] [PubMed] [Google Scholar]
  • 78.Raju R.K., Hillier I.H., Burton N.A., Vincent M.A., Doudou S., Bryce R.A. The effects of perfluorination on carbohydrate-π interactions: computational studies of the interaction of benzene and hexafluorobenzene with fucose and cyclodextrin. Phys. Chem. Chem. Phys. 2010;12:7959–7967. doi: 10.1039/c002058c. [DOI] [PubMed] [Google Scholar]
  • 79.Bautista-Ibanez L., Ramirez-Gualito K., Quiroz-Garcia B., Rojas-Aguilar A., Cuevas G. Calorimetric measurement of the CH/pi interaction involved in the molecular recognition of saccharides by aromatic compounds. J. Org. Chem. 2008;73:849–857. doi: 10.1021/jo701926r. [DOI] [PubMed] [Google Scholar]
  • 80.Raju R.K., Ramraj A., Vincent M.A., Hillier I.H., Burton N.A. Carbohydrate-protein recognition probed by density functional theory and Ab initio calculations including dispersive interactions. Phys. Chem. Chem. Phys. 2008;10:6500–6508. doi: 10.1039/b809164a. [DOI] [PubMed] [Google Scholar]
  • 81.Su Z., Cocinero E.J., Stanca-Kaposta E.C., Davis B.G., Simons J.P. Carbohydrate-aromatic interactions: a computational and IR spectroscopic investigation of the complex, methyl alpha-L-fucopyranoside·toluene, isolated in the gas phase. Chem. Phys. Lett. 2009;471:17–21. [Google Scholar]
  • 82.Stanca-Kaposta E.C., Gamblin D.P., Screen J., Liu B., Snoek L.C., Davis B.G., Simons J.P. Carbohydrate molecular recognition: a spectroscopic investigation of carbohydrate-aromatic interactions. Phys. Chem. Chem. Phys. 2007;9:4444–4451. doi: 10.1039/b704792d. [DOI] [PubMed] [Google Scholar]
  • 83.Tsuzuki S., Uchimaru T., Mikami M. Magnitude and nature of carbohydrate–aromatic interactions in fucose–phenol and fucose–indole complexes: CCSD(T) level interaction energy calculations. J. Phys. Chem. A. 2011;115:11256–11262. doi: 10.1021/jp2045756. [DOI] [PubMed] [Google Scholar]
  • 84.Mohamed M.N.A., Watts H.D., Guo J., Catchmark J.M., Kubicki J.D. MP2, density functional theory, and molecular mechanical calculations of C–H···π and hydrogen bond interactions in a cellulose-binding Module–Cellulose Model System. Carbohydr. Res. 2010;345:1741–1751. doi: 10.1016/j.carres.2010.05.021. [DOI] [PubMed] [Google Scholar]
  • 85.Sujatha M.S., Sasidhar Y.U., Balaji P.V. Insights into the role of the aromatic residue in galactose-binding sites: MP2/6–311G++** study on galactose- and glucose-aromatic residue analogue complexes. Biochemistry. 2005;44:8554–8562. doi: 10.1021/bi050298b. [DOI] [PubMed] [Google Scholar]
  • 86.Kozmon S., Matuška R., Spiwok V., Koča J. Dispersion interactions of carbohydrates with condensate aromatic moieties: theoretical study on the CH–π interaction additive properties. Phys. Chem. Chem. Phys. 2011;13:14215–14222. doi: 10.1039/c1cp21071h. [DOI] [PubMed] [Google Scholar]
  • 87.Lucas R., Gomez-Pinto I., Avino A., Reina J.J., Eritja R., Gonzalez C., Morales J.C. Highly polar carbohydrates stack onto DNA duplexes via CH/pi interactions. J. Am. Chem. Soc. 2011;133:1909–1916. doi: 10.1021/ja108962j. [DOI] [PubMed] [Google Scholar]
  • 88.Maresca M., Derghal A., Carravagna C., Dudin S., Fantini J. Controlled aggregation of adenine by sugars: physicochemical studies, molecular modelling simulations of sugar-aromatic CH-pi stacking interactions, and biological significance. Phys. Chem. Chem. Phys. 2008;10:2792–2800. doi: 10.1039/b802594k. [DOI] [PubMed] [Google Scholar]
  • 89.Gómez-Pinto I., Vengut-Climent E., Lucas R., Aviñó A., Eritja R., González C., Morales J.C. Carbohydrate–DNA interactions at G-quadruplexes: folding and stability changes by attaching sugars at the 5′-End. Chem. Eur. J. 2013;19:1920–1927. doi: 10.1002/chem.201203902. [DOI] [PubMed] [Google Scholar]
  • 90.Lucas R., Vengut-Climent E., Gomez-Pinto I., Avino A., Eritja R., Gonzalez C., Morales J.C. Apolar carbohydrates as DNA capping agents. Chem. Commun. 2012;48:2991–2993. doi: 10.1039/c2cc17093k. [DOI] [PubMed] [Google Scholar]
  • 91.Schrodinger L. The PyMOL molecular graphics system. 2010. Version 1.3r1.
  • 92.Macrae C.F., Bruno I.J., Chisholm J.A., Edgington P.R., McCabe P., Pidcock E., Rodriguez-Monge L., Taylor R., van de Streek J., Wood P.A. Mercury CSD 2.0 - new features for the visualization and investigation of crystal structures. J. Appl. Crystallogr. 2008;41:466–470. [Google Scholar]
  • 93.Edgcomb S.P., Murphy K.P. Variability in the pKa of histidine side-chains correlates with burial within proteins. Proteins: Struct. Funct. Bioinform. 2002;49:1–6. doi: 10.1002/prot.10177. [DOI] [PubMed] [Google Scholar]
  • 94.HyperChem(TM). Professional 7.5 ed. Gainesville, Florida: Hypercube, Inc.; [Google Scholar]
  • 95.Cerny J., Hobza P. Non-covalent interactions in biomacromolecules. Phys. Chem. Chem. Phys. 2007;9:5291–5303. doi: 10.1039/b704781a. [DOI] [PubMed] [Google Scholar]
  • 96.Raju R.K., Ramraj A., Hillier I.H., Vincent M.A., Burton N.A. Carbohydrate-aromatic π interactions: a test of density functionals and the DFT-D method. Phys. Chem. Chem. Phys. 2009;11:3411–3416. doi: 10.1039/b822877a. [DOI] [PubMed] [Google Scholar]
  • 97.Hujo W., Grimme S. Performance of non-local and atom-pairwise dispersion corrections to DFT for structural parameters of molecules with noncovalent interactions. J. Chem. Theory Comput. 2012;9:308–315. doi: 10.1021/ct300813c. [DOI] [PubMed] [Google Scholar]
  • 98.Burns L.A., Vazquez-Mayagoitia A., Sumpter B.G., Sherrill C.D. Density-functional approaches to noncovalent interactions: a comparison of dispersion corrections (DFT-D), exchange-hole dipole moment (XDM) theory, and specialized functionals. J. Chem. Phys. 2011;134:084107. doi: 10.1063/1.3545971. [DOI] [PubMed] [Google Scholar]
  • 99.Halkier A., Helgaker T., Jorgensen P., Klopper W., Koch H., Olsen J., Wilson A.K. Basis-set convergence in correlated calculations on Ne, N-2, and H2O. Chem. Phys. Lett. 1998;286:243–252. [Google Scholar]
  • 100.Halkier A., Helgaker T., Jorgensen P., Klopper W., Olsen J. Basis-set convergence of the energy in molecular hartree-fock calculations. Chem. Phys. Lett. 1999;302:437–446. [Google Scholar]
  • 101.Jurecka P., Sponer J., Hobza P. Potential energy surface of the cytosine dimer: MP2 complete basis set limit interaction energies, CCSD(T) correction term, and comparison with the AMBER force field. J. Phys. Chem. B. 2004;108:5466–5471. [Google Scholar]
  • 102.Sponer J., Jurecka P., Marchan I., Luque F.J., Orozco M., Hobza P. Nature of base stacking: reference quantum-chemical stacking energies in ten unique B-DNA base-pair steps. Chem.-Eur. J. 2006;12:2854–2865. doi: 10.1002/chem.200501239. [DOI] [PubMed] [Google Scholar]
  • 103.Sponer J., Riley K.E., Hobza P. Nature and magnitude of aromatic stacking of nucleic acid bases. Phys. Chem. Chem. Phys. 2008;10:2595–2610. doi: 10.1039/b719370j. [DOI] [PubMed] [Google Scholar]
  • 104.Morgado C.A., Jurecka P., Svozil D., Hobza P., Sponer J. Reference MP2/CBS and CCSD(T) quantum-chemical calculations on stacked adenine dimers. Comparison with DFT-D, MP2.5, SCS(MI)-MP2, M06–2X, CBS(SCS-D) and force field descriptions. Phys. Chem. Chem. Phys. 2010;12:3522–3534. doi: 10.1039/b924461a. [DOI] [PubMed] [Google Scholar]
  • 105.Mládek A., Banáš P., Jurečka P., Otyepka M., Zgarbová M., Šponer J. Energies and 2′-Hydroxyl group orientations of RNA backbone conformations. Benchmark CCSD(T)/CBS database, electronic analysis, and assessment of DFT methods and MD simulations. J. Chem. Theory Comput. 2013;10:463–480. doi: 10.1021/ct400837p. [DOI] [PubMed] [Google Scholar]
  • 106.Mladek A., Krepl M., Svozil D., Cech P., Otyepka M., Banas P., Zgarbova M., Jurecka P., Sponer J. Benchmark quantum-chemical calculations on a complete set of rotameric families of the DNA sugar-phosphate backbone and their comparison with modern density functional theory. Phys. Chem. Chem. Phys. 2013;15:7295–7310. doi: 10.1039/c3cp44383c. [DOI] [PubMed] [Google Scholar]
  • 107.Jurecka P., Sponer J., Cerny J., Hobza P. Benchmark database of accurate (MP2 and CCSD(T) complete basis set limit) interaction energies of small model complexes, DNA base pairs, and amino acid pairs. Phys. Chem. Chem. Phys. 2006;8:1985–1993. doi: 10.1039/b600027d. [DOI] [PubMed] [Google Scholar]
  • 108.Riley K.E., Platts J.A., Řezáč J., Hobza P., Hill J.G. Assessment of the performance of MP2 and MP2 variants for the treatment of noncovalent interactions. J. Phys. Chem. A. 2012;116:4159–4169. doi: 10.1021/jp211997b. [DOI] [PubMed] [Google Scholar]
  • 109.Frisch M.J., Trucks G.W., Schlegel H.B., Scuseria G.E., Robb M.A., Cheeseman J.R., Scalmani G., Barone V., Mennucci B., Petersson G.A., et al. Revision A.02 ed. Wallingford CT: Gaussian, Inc.; 2009. [Google Scholar]
  • 110.Shao Y., Molnar L.F., Jung Y., Kussmann J., Ochsenfeld C., Brown S.T., Gilbert A.T.B., Slipchenko L.V., Levchenko S.V., O’Neill D.P., et al. Advances in methods and algorithms in a modern quantum chemistry program package. Phys. Chem. Chem. Phys. 2006;8:3172–3191. doi: 10.1039/b517914a. [DOI] [PubMed] [Google Scholar]
  • 111.Singh N.J., Min S.K., Kim D.Y., Kim K.S. Comprehensive energy analysis for various types of π-interaction. J. Chem. Theory Comput. 2009;5:515–529. doi: 10.1021/ct800471b. [DOI] [PubMed] [Google Scholar]
  • 112.Rutledge L.R., Wetmore S.D. Modeling the chemical step utilized by human alkyladenine DNA glycosylase: a concerted mechanism AIDS in selectively excising damaged purines. J. Am. Chem. Soc. 2011;133:16258–16269. doi: 10.1021/ja207181c. [DOI] [PubMed] [Google Scholar]
  • 113.Kellie J.L., Navarro-Whyte L., Carvey M.T., Wetmore S.D. Combined effects of π–π stacking and hydrogen bonding on the (N1) acidity of uracil and hydrolysis of 2’-deoxyuridine. J. Phys. Chem. B. 2012;116:2622–2632. doi: 10.1021/jp2121627. [DOI] [PubMed] [Google Scholar]
  • 114.Przybylski J.L., Wetmore S.D. A QM/QM investigation of the hUNG2 reaction surface: the untold tale of a catalytic residue. Biochemistry. 2011;50:4218–4227. doi: 10.1021/bi2003394. [DOI] [PubMed] [Google Scholar]
  • 115.Calvaresi M., Bottoni A., Garavelli M. Computational clues for a new mechanism in the glycosylase activity of the human DNA repair protein hOGG1. A generalized paradigm for purine-repairing systems. J. Phys. Chem. B. 2007;111:6557–6570. doi: 10.1021/jp071581i. [DOI] [PubMed] [Google Scholar]
  • 116.Bruner S.D., Norman D.P.G., Verdine G.L. Structural basis for recognition and repair of the endogenous mutagen 8-oxoguanine in DNA. Nature. 2000;403:859–866. doi: 10.1038/35002510. [DOI] [PubMed] [Google Scholar]
  • 117.Goedecke K., Pignot M., Goody R.S., Scheidig A.J., Weinhold E. Structure of the N6-adenine DNA methyltransferase MTaqI in complex with DNA and a cofactor analog. Nat. Struct. Mol. Biol. 2001;8:121–125. doi: 10.1038/84104. [DOI] [PubMed] [Google Scholar]
  • 118.Wang W., Wu E.Y., Hellinga H.W., Beese L.S. Structural factors that determine selectivity of a high fidelity DNA polymerase for deoxy-, dideoxy-, and ribonucleotides. J. Biol. Chem. 2012;287:28215–28226. doi: 10.1074/jbc.M112.366609. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 119.Brown J.A., Suo Z. Unlocking the sugar “steric gate” of DNA polymerases. Biochemistry. 2011;50:1135–1142. doi: 10.1021/bi101915z. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 120.DeLucia A.M., Grindley N.D.F., Joyce C.M. An error-prone family Y DNA polymerase (DinB Homolog from sulfolobus solfataricus) uses a ‘steric gate’ residue for discrimination against ribonucleotides. Nucleic Acids Res. 2003;31:4129–4137. doi: 10.1093/nar/gkg417. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 121.Kirouac K.N., Suo Z., Ling H. Structural mechanism of ribonucleotide discrimination by a Y-family DNA polymerase. J. Mol. Biol. 2011;407:382–390. doi: 10.1016/j.jmb.2011.01.037. [DOI] [PubMed] [Google Scholar]
  • 122.Yang G., Franklin M., Li J., Lin T.C., Konigsberg W. A conserved Tyr residue is required for sugar selectivity in a Pol α DNA polymerase. Biochemistry. 2002;41:10256–10261. doi: 10.1021/bi0202171. [DOI] [PubMed] [Google Scholar]
  • 123.Astatke M., Ng K., Grindley N.D., Joyce C.M. A single side chain prevents escherichia coli DNA polymerase I (klenow fragment) from incorporating ribonucleotides. Proc. Natl. Acad. Sci. U.S.A. 1998;95:3402–3407. doi: 10.1073/pnas.95.7.3402. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 124.Bonnin A., Lázaro J.M., Blanco L., Salas M. A single tyrosine prevents insertion of ribonucleotides in the eukaryotic-type φ29 DNA polymerase. J. Mol. Biol. 1999;290:241–251. doi: 10.1006/jmbi.1999.2900. [DOI] [PubMed] [Google Scholar]
  • 125.Gao G., Orlova M., Georgiadis M.M., Hendrickson W.A., Goff S.P. Conferring RNA polymerase activity to a DNA polymerase: a single residue in reverse transcriptase controls substrate selection. Proc. Natl. Acad. Sci. U.S.A. 1997;94:407–411. doi: 10.1073/pnas.94.2.407. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 126.Lindahl T., Wood R.D. Quality control by DNA repair. Science. 1999;286:1897–1905. doi: 10.1126/science.286.5446.1897. [DOI] [PubMed] [Google Scholar]
  • 127.Sale J.E., Lehmann A.R., Woodgate R. Y-family DNA polymerases and their role in tolerance of cellular DNA damage. Nat. Rev. Mol. Cell. Biol. 2012;13:141–152. doi: 10.1038/nrm3289. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 128.Loregian A., Marsden H.S., Palù G. Protein–protein interactions as targets for antiviral chemotherapy. Rev. Med. Virol. 2002;12:239–262. doi: 10.1002/rmv.356. [DOI] [PubMed] [Google Scholar]
  • 129.Park S.-H., Raines R.T. Genetic selection for dissociative inhibitors of designated protein-protein interactions. Nat. Biotech. 2000;18:847–851. doi: 10.1038/78451. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 130.De Clercq E. Strategies in the design of antiviral drugs. Nat. Rev. Drug Discov. 2002;1:13–25. doi: 10.1038/nrd703. [DOI] [PubMed] [Google Scholar]

Associated Data

This section collects any data citations, data availability statements, or supplementary materials included in this article.

Supplementary Materials

SUPPLEMENTARY DATA

Articles from Nucleic Acids Research are provided here courtesy of Oxford University Press

RESOURCES