Author manuscript; available in PMC: 2011 Oct 13.
Published in final edited form as: Structure. 2010 Aug 26;18(10):1321–1331. doi: 10.1016/j.str.2010.07.006

Folding, DNA recognition and function of GIY-YIG endonucleases: crystal structures of R.Eco29kI

Amanda Nga-Sze Mak 1, Abigail R Lambert 1,2, Barry L Stoddard 1,3
PMCID: PMC2955809  NIHMSID: NIHMS228753  PMID: 20800503

Abstract

The GIY-YIG endonuclease family comprises hundreds of diverse proteins and a multitude of functions; none have been visualized bound to DNA. The structure of the GIY-YIG restriction endonuclease R.Eco29kI has been solved both alone and bound to its target site. The protein displays a domain-swapped homodimeric structure with several extended surface loops encircling the DNA. Only three side chains from each protein subunit contact DNA bases, two directly and one via a bridging solvent molecule. Both tyrosine residues within the GIY-YIG motif are positioned in the catalytic center near a putative nucleophilic water; the remainder of the active site resembles the HNH endonuclease family. The structure illustrates how the GIY-YIG scaffold has been adapted for the highly specific recognition of a DNA restriction site, in contrast to nonspecific DNA cleavage by GIY-YIG domains in homing endonucleases or structure-specific cleavage by DNA repair enzymes such as UvrC.


Enzymatic catalysis of phosphodiester bond hydrolysis and ligation is a fundamental requirement for virtually all forms of nucleic acid modification, rearrangement, and repair. A relatively small number of protein folds are found to encompass many of the enzymes that make and break phosphodiester bonds. This includes the PD-(D/E)xK endonuclease family (Kosinski et al., 2005) (which dominates the known restriction endonucleases and is broadly distributed across many additional biological functions) and the HNH family (Mehta et al., 2004) (which is found in an equally large array of nuclease families, including nonspecific bacterial colicins, restriction endonucleases and mobile homing endonucleases).

Proteins harboring the ‘GIY-YIG’ catalytic motif are also ubiquitously distributed across a broad range of biological hosts and functions (Dunin-Horkawicz et al., 2006). Many of the homing endonucleases found in mobile introns within bacteriophage contain GIY-YIG catalytic domains (Kowalski et al., 1999). Similar free-standing GIY-YIG phage proteins, such as the Seg endonucleases in T4 phage, also display gene invasion behaviors (Belle et al., 2002). As well, the Penelope-like non-LTR retroelements contain a GIY-YIG protein domain (Pyatkov et al., 2004).

GIY-YIG endonucleases are also employed for maintenance and repair of DNA (Aravind et al., 1999). In bacteria, the UvrC endonuclease (which is involved in nucleotide excision repair, or NER), uses its N-terminal GIY-YIG domain to cleave the phosphodiester bond on the 3′ side of a damaged DNA base (Verhoeven et al., 2000). A different NER endonuclease known as ‘Cho’ uses a GIY-YIG domain to cut the 3′ phosphodiester bond that flanks a damaged base; this enzyme can accommodate bulky lesions that interfere with UvrC activity (Moolenaar et al., 2002). A GIY-YIG endonuclease domain appears to be involved in DNA proofreading coupled to the activity of DNA polymerase III and its corresponding exonuclease (Van Houten et al., 2002). In yeast, a structure-specific protein complex that contains the Slx1 GIY-YIG endonuclease maintains rDNA copy number by generating a double-strand break and inducing targeted recombination at arrested replication forks (Coulon et al., 2004). A large family of prokaryotic homologues of Slx-1 is thought to be involved in DNA repair (Aravind and Koonin, 2001).

Many bacterial restriction endonucleases are known to harbor GIY-YIG catalytic motifs (Bujnicki et al., 2001). In a reversal of this mechanism, nonspecific GIY-YIG endonucleases encoded within phage (such as T4 endonuclease II) degrade bacterial DNA and thereby allow the phage to scavenge host nucleotides for its own DNA synthesis (Carlson and Wiberg, 1983). At the same time, the phage genome is protected from cleavage by chemical modification (usually hydroxymethylation and/or glucosylation of cytosine bases).

An even broader collection of GIY-YIG proteins is distributed throughout all domains of life, and these proteins are often tethered to a wide variety of additional protein domains (Dunin-Horkawicz et al., 2006). Some of these proteins have known biological activities, while many others lack any clear functional annotation. No DNA-bound structures have been determined for the catalytic domain of any of these enzymes. However, the structures of several representative isolated GIY-YIG domains, most removed from their full-length protein chains, have been solved in the absence of DNA. These include the nuclease domain of the I-TevI homing endonuclease (PDB code 1ln0; 1mk0) (VanRoey et al., 2002), the nuclease domain of the UvrC NER enzyme (1ycz; 1yd0 through 1yd6) (Karakas et al., 2007; Truglio et al., 2005), the Slx1 endonuclease (1ywl; 1zg2) (Swapna, 2005) and full length T4 endonuclease II (2wsh) (Andersson et al., 2010). These structures collectively demonstrate that the GIY-YIG domain corresponds to an α/β-sandwich topology with a central three-stranded antiparallel β sheet. The conserved tyrosine residues of the GIY-YIG motif are located on a common surface of this fold. The GIY-YIG endonuclease fold also extends to noncatalytic nucleic acid binding domains, such as the L9 ribosomal protein (2hbb) (Hoffman et al., 1996) and the N-terminal domain of yeast RNase H1 (1qhk) (Evans and Bycroft, 1999), possibly indicating a common origin for these functionally diverse proteins.

R.Eco29kI is encoded on plasmid pECO29 in the E. coli strain 29K (Pertzev et al., 1992) where its reading frame is separated by two basepairs from its cognate methyltransferase (M.Eco29kI) (Zakharova et al., 1998). The protein was hypothesized to be a member of the GIY-YIG endonuclease family on the basis of sequence comparisons with I-TevI (Bujnicki et al., 2001); this conclusion was validated through mutagenic and kinetic analyses (Ibryashkina et al., 2007). The enzyme recognizes a palindromic 5′-CCGC/GG-3′ target and cleaves between C4 and G5 on each strand, generating two-nucleotide 3′ GC overhangs. The protein is composed of two identical subunits of 214 amino acids, each of which contains one GIY-YIG motif that extends from residue G47 through G78 (the precise sequence at each end of the motif in R.Eco29kI is G47-V48-Y49 and Y76-V77-G78). Biochemical studies of the enzyme previously determined that it acts as a dimer during DNA binding and cleavage (Ibryashkina et al., 2009), and demonstrated the importance of several conserved residues for catalysis (Ibryashkina et al., 2007). R.Eco29kI possesses several internal peptide insertions within and near its recognizable GIY-YIG motif, which have been presumed to play a role in DNA recognition (Dunin-Horkawicz et al., 2006; Ibryashkina et al., 2009).
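The cleavage geometry described above can be checked with a short sketch. The function names below are hypothetical helpers introduced only for illustration; the only inputs taken from the text are the site sequence (CCGCGG) and the cut position (after the fourth base on each strand).

```python
from typing import Tuple

# Watson-Crick complements for the four DNA bases.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (read 5' to 3')."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def is_palindromic(site: str) -> bool:
    """A site is palindromic when it equals its own reverse complement."""
    return site == reverse_complement(site)


def cut_top_strand(site: str, cut_after: int) -> Tuple[str, str]:
    """Split the top strand after the given 1-based base position."""
    return site[:cut_after], site[cut_after:]


def three_prime_overhang(site: str, cut_after: int) -> str:
    """Single-stranded 3' overhang left by a symmetric cut on both strands.

    In top-strand coordinates the bottom strand is cut at the mirror
    position len(site) - cut_after; the bases between the two cuts are
    left unpaired at the 3' end of each product.
    """
    mirror = len(site) - cut_after
    if cut_after <= mirror:
        raise ValueError("this cut geometry does not leave a 3' overhang")
    return site[mirror:cut_after]


SITE = "CCGCGG"   # recognition sequence from the text
CUT_AFTER = 4     # cleavage between C4 and G5

print(is_palindromic(SITE))
print(cut_top_strand(SITE, CUT_AFTER))
print(three_prime_overhang(SITE, CUT_AFTER))
```

Because the site is palindromic, the same cut position applies to both strands when each is read 5′ to 3′, which is why a single `cut_after` value is enough to derive the overhang.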

To visualize the structural basis for DNA recognition and cleavage by a GIY-YIG endonuclease, we have determined the crystal structure of the R.Eco29kI restriction endonuclease both alone and bound to its cognate DNA target, at 2.3 and 2.5 Å resolution, respectively. This study defines the position and putative role of the catalytic residues found in the GIY-YIG endonucleases for DNA strand cleavage, describes the mechanism of DNA recognition by R.Eco29kI, and illustrates the structural relationships between sequence-specific, structure-specific, and nonspecific GIY-YIG endonuclease domains.

Results and Discussion

The structure of R.Eco29kI

A synthetic gene encoding an R.Eco29kI construct that contains a catalytically inactivating E142Q mutation was used to express and purify the protein (Supplementary Figure S1). Prior studies had demonstrated that mutation of E142 prevents DNA cleavage while still allowing near wild-type DNA binding affinity (Ibryashkina et al., 2007). The structure of the endonuclease was determined in the presence and absence of its bound DNA target site as described in the experimental methods section and supplementary material. Data and refinement statistics are provided in Table 1. Sample electron density for the initial structural modeling and refinement of the apoenzyme is shown in Supplementary Figure S2.

Table 1. Crystallographic data and refinement statistics *

DATA COLLECTION
Dataset | REco29KI apo (Se-met peak) | REco29KI apo (Se-met inflection) | REco29KI apo (Se-met remote) | REco29KI apo (unlabeled) | REco29KI-DNA (unlabeled)
X-ray source | ALS 5.0.2 | ALS 5.0.2 | ALS 5.0.2 | ALS 5.0.1 | RAXIS IV ++
Wavelength (Å) | 0.9801 | 0.9802 | 0.975 | 1.0 | 1.5418
Space group | C222₁ | C222₁ | C222₁ | C222₁ | P2₁
Unit cell (Å) | a=87.5, b=91.2, c=88.9 | a=87.5, b=91.2, c=88.8 | a=87.5, b=91.1, c=88.8 | a=87.9, b=90.1, c=88.8 | a=99.9, b=101.5, c=144.4
Resolution (Å)a | 50–3.0 (3.11–3.0) | 50–3.0 (3.11–3.0) | 50–3.0 (3.11–3.0) | 50–2.3 (2.38–2.3) | 50–2.5 (2.59–2.5)
Rmerge (%) | 25.5 (73.8) | 18.0 (51.3) | 14.9 (37.3) | 7.9 (31.1) | 6.4 (26.4)
I/σ(I) | 9.64 (1.92) | 17.39 (5.4) | 18.38 (8.0) | 18.35 (4.2) | 19.3 (6.3)
Redundancy | 10.7 (6.9) | 13.8 (11.7) | 12.5 (12.5) | 4.9 (4.5) | 6.6 (6.5)
Completeness (%) | 99.5 (96.2) | 100 (100) | 100 (100) | 98.9 (94.2) | 99.6 (100)
Unique reflections | 7156 | 7392 | 7438 | 15875 | 93336

REFINEMENT
Dataset | REco29KI apo (unlabeled) | REco29KI-DNA (unlabeled)
Rwork | 0.183 | 0.211
Rfree | 0.228 | 0.274
Asymmetric unit | 1 × protein subunit | 4 × (dimer + DNA duplex)
Protein atoms | 1640 | 13302
Solvent | 170 | 496
DNA atoms | - | 3608
Rmsd bonds (Å) | 0.023 | 0.018
Rmsd angles (°) | 1.95 | 1.987
Average B factor (Å²) | 19.71 | 23.60
Ramachandran (% core, allowed, generous, disallowed) | 86.7%, 11.6%, 1.7%, 0% | 85.2%, 12.7%, 2.0%, 0.1%

The R.Eco29kI endonuclease forms a domain-swapped homodimeric structure, with the N-terminal region (residues 1 to 14) extending across the protein interface and forming an extended series of contacts with the surface of the opposing subunit (Figure 1). The enzyme dimer’s overall dimensions in the absence of bound DNA are approximately 80 Å × 50 Å × 50 Å. Early studies of the enzyme indicated that the unbound protein might be a monomer in solution and then dimerize as part of its DNA-binding and cleavage mechanism (Pertzev et al., 1992). However, the total buried surface area in the protein-protein interface of the unbound protein is calculated by the PISA webserver (Krissinel and Henrick, 2007) to be approximately 1700 Å², and the interface is found to involve up to 24 hydrogen bonds and salt bridges between the two subunits, which would appear to indicate that the protein forms a stable homodimer in solution. Size exclusion analyses of the purified protein used in these studies, after removal of affinity purification tags, indicate a molecular mass corresponding to a protein homodimer (data not shown).

Figure 1. The structure of the unbound R.Eco29kI restriction endonuclease.

Panel a: The enzyme homodimer. Subunit 1 is colored by secondary structure; subunit 2 is colored light blue. The GIY-YIG domain of each subunit is an α/β-sandwich fold containing a 3-stranded antiparallel β-sheet and two α-helices. In R.Eco29kI this core is interrupted by a short additional helix (α2) located between the first and second β-strand (residues 48 to 148; core topology β1–α2–β2–α3–β3–α4). This region is also interrupted by a poorly ordered surface loop (L1, residues 81 to 98) that packs against the center of the DNA recognition site in its minor groove. The C-terminal region (residues 149 to 214) contributes two additional exposed loops (L2 and L3) to the DNA-binding interface, and a final helix (α5) at the protein carboxy-terminus. Panel b: Topology diagram of an R.Eco29kI subunit. The secondary structure is colored and labeled as in panel a. The L1, L2 and L3 loops are colored blue and labeled. The residues of the GIY-YIG motif and additional active site residues are indicated in red font. Three residues involved in base-specific DNA recognition are indicated in blue font. The domain-swapped N-terminal extension (residues 1 to 14) is indicated in green. Panel c: The protein dimer viewed from the top, to illustrate the structural basis of protein dimerization. The domain-swapped N-terminal extensions of each subunit (residues 1 to 14) are blue and green. The α1 helix continues from that extension, and is bundled against its α1’ symmetry mate. These two helices lie across one face of the α4-α4’ helical bundle to form the majority of the homodimeric protein-protein interface. See also Supplementary Figure S1.

Each protein subunit contains a single folded domain (spanning residues 48 to 149) that contains a GIY-YIG structural core and corresponding active site. This domain displays a mixed α/β topology (β1–α2–β2–α3–β3–α4), built around a central three-stranded antiparallel β-sheet. The first two strands of this β-sheet contain the residues of the catalytic GIY-YIG motif (G47-V48-Y49 and Y76-V77-G78). The core domain of the endonuclease is further elaborated by long N- and C-terminal extensions (comprising residues 1 to 48 and 150 to 214), as well as a single extended, intervening surface loop (residues 81 to 99; labeled ‘L1’ in Figure 1) that is located between β2 and α3.

The domain-swapped N-terminal region described above packs against its symmetry-related counterpart to augment the contacts that participate in protein dimerization (Figure 1c). In contrast, the extended C-terminal region of each protein subunit contains two additional long exposed loops (residues 161 to 175 and 185 to 196; labeled L2 and L3 in Figure 1). Together with residues 81 to 99 (the L1 loop within the middle of the GIY-YIG domain), these form a series of three long exposed peptide regions that extend downward from a common surface of the protein dimer. The three extended loops are quite hydrophilic overall, with a large number of basic and polar residues that are capable of interacting with bound DNA. Three residues found within these loops (R86, H163 and R169) from each subunit are involved in readout of the DNA sequence, as described below. A final α-helix (α5) is found at the C-terminus of the protein structure.

The structure of the DNA-bound endonuclease was then solved using a construct that contained an additional point mutation (L69K) on the protein surface, distant from the DNA binding region, that significantly increased protein solubility (Supplementary information and Figure S3). Upon binding DNA, the enzyme homodimer undergoes a significant closure around the target site (Figure 2), with the individual protein subunits swinging toward each other by approximately 20°. This movement results in a reduction in the longest dimension of the dimer by over 10 Å. The motion is accommodated by torsional rotations of the protein backbone around a well-defined hinge point spanning residues 14 and 15 in the N-terminal region of each protein subunit (Figure 2b). The rmsd between the backbone atoms in the bound and unbound protein subunits (excluding residues 1 to 15) is approximately 1.7 Å. Although the individual subunits of the protein dimer move as rigid bodies and display relatively small structural changes during binding, the closure of the dimer around the DNA is accompanied by an extensive reorganization of the noncovalent interactions between the protein subunits (Supplementary Figure S4 and Table S2). Of the 24 hydrogen bonds and salt bridges modeled in the apoenzyme protein interface, only 12 are maintained in the DNA-bound structure. The 12 polar protein-protein interactions that are lost upon DNA binding are replaced by as many as 18 new interactions that are unique to that structure.
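The backbone rmsd quoted above is a root-mean-square of distances between paired atoms. The following is a minimal sketch of that calculation, assuming the two coordinate sets have already been superposed and contain the same atoms in the same order; the function name and the toy coordinates are illustrative and are not taken from the deposited structures.

```python
import math
from typing import Sequence, Tuple

Coord = Tuple[float, float, float]


def backbone_rmsd(a: Sequence[Coord], b: Sequence[Coord]) -> float:
    """Root-mean-square deviation (in the input units, e.g. Å) between two
    equally long lists of paired atomic coordinates.

    No superposition is performed here: the caller is assumed to have
    aligned the two structures beforehand.
    """
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    squared = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(a, b):
        squared += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(squared / len(a))


# Toy example: every atom displaced by exactly 1 Å along z.
unbound = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0)]
bound = [(0.0, 0.0, 1.0), (1.5, 0.0, 1.0), (3.0, 0.0, 1.0)]
print(backbone_rmsd(unbound, bound))
```

In practice the superposition step (for example a least-squares fit over the core domain) determines which atoms contribute, which is why the excluded residues are stated alongside the rmsd value.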

Figure 2. Structure of the DNA-bound R.Eco29kI endonuclease.

Panel a: The bound enzyme. The two subunits rotate by ~ 20° towards one another to engage the DNA duplex; the six basepair restriction site is completely encircled by the exposed loops of the enzyme homodimer. The dimensions of the enzyme are reduced by ~ 10 Å in the longest dimension. The DNA duplex is almost completely unperturbed from canonical B-form duplex conformation as a result of binding. Panel b: Superposition of a bound and unbound R.Eco29kI subunit (superposition calculated using the GIY-YIG core domain described above); the rmsd for backbone atoms across this domain is ~1.7 Å. The domain closure shown in panel b is accommodated by a hinge movement in the N-terminal region of the protein at residues 14/15 (arrow). In addition, the L1 loop in the GIY-YIG domain reorganizes to establish contacts to the minor groove at the center of the restriction site (see Figure 3). See also Supplementary Figure S3.

The two extended surface loops from the C-terminal region of the protein (loops L2 and L3) encircle the DNA target; residues from the L2 loop make base-specific and nonspecific contacts to the target site within and near the major groove. These loops display relatively small conformational differences relative to the same residues in the unbound structure. Although their average conformation is relatively unchanged as a result of DNA binding, these residues display a significant reduction in their crystallographic B-factor values in the structure of the protein-DNA complex (changing from an average of ~38 Å² in the unbound structure to ~25 Å² for the main chain atoms), indicating that DNA binding results in structural ordering of those residues around the DNA target. In contrast, the backbone conformation of the L1 loop is significantly altered in the DNA-bound structure, allowing it to pack into the minor groove of the target site and make additional contacts (described in detail below).

GIY-YIG folds: diversification of form and function

The catalytic core of a GIY-YIG endonuclease consists of a “β–β–α–β–α” topology (Dunin-Horkawicz et al., 2006), where the first two β-strands (numbered β1 and β2 in the R.Eco29kI structure) harbor the residues of the conserved namesake motif (Figures 1 and 3). The length of this entire region ranges from as few as 70 residues (as observed for the Slx-1 endonuclease, which has no significant insertions within the core fold) to over 100 residues. The longer GIY-YIG endonuclease structures harbor significant insertions between these core secondary structural elements (Figure 3). UvrC, I-TevI and T4 endonuclease II all contain an additional α-helix prior to the final β3 strand, and T4 endonuclease II harbors another helix after the same β3 strand. In contrast, R.Eco29kI (and presumably its closest homologues) displays an extended DNA-binding loop (L1 in Figure 1) immediately after the β2 strand, as well as a unique α-helix (α2 in Figures 1 and 3) inserted between the β1 and β2 strands. This latter helix is found on the surface of the protein, distant from both the active site and the bound DNA, and appears to play a purely structural role in the protein fold. Five conserved catalytic residues (corresponding, in R.Eco29kI, to Y49 and Y76 from the β1 and β2 strands of the GIY-YIG motif, H108 and R104 from α3, and E142 from α4) are all found within this core domain (H108 is only partially conserved; it is replaced by a tyrosine in UvrC and T4 endonuclease II). A sixth active site residue (N154) is found outside the catalytic core region of all the visualized endonucleases, except for Slx1. The positions and roles of these residues are described in more detail below.

Figure 3. Diversity of GIY-YIG endonucleases.

Panel a: Structure-based sequence alignment of five GIY-YIG endonucleases that have been visualized using X-ray crystallography (R.Eco29kI, UvrC, I-TevI and T4 Endonuclease II) or solution NMR (Slx-1). The sequences of two additional GIY-YIG restriction endonucleases that are both isoschizomers of R.Eco29kI, but that act as tetramers instead of homodimers, are also shown (R.NgoMIII and R.Cfr42I). The secondary structure of R.Eco29kI is shown above the alignment. Black residues correspond to the core GIY-YIG fold (spanning β1, β2, α3, β3 and α4). Additional secondary structural elements that are unique to one or more members of the alignment are green; the DNA-binding loops of R.Eco29kI are blue, and the six residues of the GIY-YIG motif are red and boxed. Residues found in the R.Eco29kI active site are denoted with bold font and asterisks; all are conserved across all the enzymes shown except for N154 (which is missing in Slx-1). Residues that make sequence-specific contacts to DNA bases are indicated with bold font and dots; all are conserved across the three restriction endonucleases. Panel b: Structural alignment of the endonuclease domains shown in panel a above. All alignments were performed using the β1, β2 and β3 strands at the core of the GIY-YIG domain as the structural basis for superpositions. The overall alignment of the core domain of all five endonucleases is shown in the box; all of the core folds with the exception of Slx-1 superimpose with rmsd for backbone atoms of approximately 2.7 Å. The latter fold (colored light cyan for uniqueness) is considerably more divergent (rmsd ~ 3.6 Å). Within the core fold, there is one unique inserted secondary structural element (α2 in R.Eco29kI, which is inserted between β1 and β2). As well, the α3 helix of Slx-1 occupies a significantly different position relative to the equivalent region in the related proteins. The additional ribbon diagrams display the individual complete subunits or domains of each endonuclease from the superposition. The coloring of secondary structural elements is identical to that used in panel a above. See also Supplementary Figure S6.

The overall sequence identities between the catalytic core domain of R.Eco29kI and each of the corresponding individual nuclease domains from Slx1, I-TevI, UvrC and T4 endonuclease II are between 11% and 13% (Figure 3a). The structure superposition (Figure 3b; Supplementary Table S1) of the core GIY-YIG domain of the Eco29kI restriction endonuclease against the equivalent regions of UvrC, I-TevI and T4 endonuclease II gives similar rmsd values (ranging from 2.7 Å to 3.0 Å for backbone atoms in the GIY-YIG core region), whereas alignment with Slx1 indicates a more dissimilar core fold (rmsd ~ 3.6 Å, with a significantly lower Z-score in the DALI pairwise alignment algorithm (Zhang and Skolnick, 2005)). The core structure of Slx-1, while representing a ‘minimal’ GIY-YIG endonuclease, appears significantly different from other enzymes in the superfamily that have been visualized to date.
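For readers who want to reproduce an identity figure of this kind from an alignment, a minimal sketch follows. It assumes one common convention (identical columns divided by columns in which neither sequence has a gap); other denominators are also in use, and the convention behind the 11% to 13% values is not stated in the text. The function name and the toy aligned strings are invented for illustration.

```python
def percent_identity(aln_a: str, aln_b: str, gap: str = "-") -> float:
    """Percent identity over the gap-free columns of a pairwise alignment.

    Both strings must come from the same alignment and therefore have the
    same length. Columns where either sequence has a gap are skipped
    (an assumed convention, see the note above).
    """
    if len(aln_a) != len(aln_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for res_a, res_b in zip(aln_a, aln_b):
        if res_a == gap or res_b == gap:
            continue  # gapped column: not counted
        compared += 1
        if res_a.upper() == res_b.upper():
            matches += 1
    if compared == 0:
        raise ValueError("no gap-free columns to compare")
    return 100.0 * matches / compared


# Toy alignment, not real protein sequence data.
print(percent_identity("GVY-KA", "GIYAK-"))
```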

Beyond the direct comparison of the GIY-YIG proteins with known structures, a sequence homology search of R.Eco29kI using the NCBI BLAST server (Altschul et al., 1990) indicates the presence of a large number of broadly distributed bacterial proteins that are highly homologous to R.Eco29kI. Included in this list are previously characterized restriction endonucleases NgoMIII and Cfr42I (64% and 31% sequence identity, respectively), which are isoschizomers of R.Eco29kI (Ibryashkina et al., 2009), and an extensive distribution of putative restriction endonucleases that are mostly found in β- and γ-proteobacteria. Unlike R.Eco29kI, Cfr42I is known to assemble and act as a tetramer, binding and cleaving two target sites in a cooperative manner (Gasiunas et al., 2008). A comparison of the sequence of this endonuclease to R.Eco29kI, relative to the crystal structure of the latter enzyme, indicates that the determinants of oligomeric structure probably correspond to exposed residues on the α1 helix (corresponding to amino acids N15 to F29 in R.Eco29kI). This region of the two enzymes corresponds to one of the most significantly diverged regions of sequence between the two aligned enzymes (Figure 3a).

DNA recognition

The structure of the protein-DNA complex consists of four independently packed copies of the enzyme homodimer bound to its full-length target site that are found within the asymmetric unit, thus providing four independent views of that complex. The structures of these four complexes were built and refined without noncrystallographic symmetry (NCS) constraints. While these four views of the protein-DNA complex are very consistent with one another, the most well-ordered of these structures corresponds to a homodimer containing protein chains D and F, which are the basis for the description and corresponding figures in the remainder of the results and discussion.

DNA binding leads to a closure of the two protein subunits around the target site as described above (Figure 2). In this structure, the L2 and L2’ loops from the two protein subunits wrap into the major groove of each corresponding DNA half-site, while the L1 and L1’ loops contact the minor groove on the opposite side of the DNA duplex (Figure 4a). From the L2 and L2’ loops, two protein side chains (His 163/163’ and Arg 169/169’) and two neighboring backbone carbonyl oxygen atoms (from residues 164/164’ and 165/165’) make direct contacts to a single DNA base at each position of the restriction site (Figure 4b). A single residue from the L1 and L1’ loops (Arg 86 and 86’) augments these contacts by establishing an additional water-mediated contact in the minor groove, to Cytosine 2 in each half-site (Figure 4b; middle panel).

Figure 4. DNA recognition and binding by R.Eco29kI.

Panel a: The endonuclease homodimer completely encircles the DNA target, as also illustrated in Figure 1b. Residues from loops L1 and L2 in each subunit (R86, H163 and R169), as well as backbone atoms from neighboring residues 164 and 165 in L2, form sequence-specific contacts to DNA bases. The contacts to bases in each DNA half-site correspond to residues from individual protein subunits (right side of panel a). Panel b: Illustration of DNA recognition viewed looking down the axis of the DNA duplex. Left: The endonuclease homodimer in the bound conformation (viewed without the DNA), with residues from the L1 loop (R86 and R86’) and the L2 loop (H163 and H163’; R169 and R169’) shown and labeled. R86 and its L1 loop interact with each DNA half-site in the minor groove; H163 and R169 and the L2 loop interact with the major groove. Right: Contacts made to basepairs -1, -2 and -3, shown in the same orientation as the protein in the left side of the same panel. The numbering of individual bases, as well as the center of the palindromic site and the positions of cleavage on each strand, is shown to the left. Readout of the individual bases involves (1) direct contacts between R169 and Gua3 in the major groove for basepairs +/−1; (2) direct contacts between backbone oxygens of residues 164 and 165 and Cyt 2 in the major groove for basepairs +/−2 (augmented by a water-mediated contact between the same base and R86 in the minor groove); and (3) a direct contact between H163 and Gua6 in the major groove for basepairs +/−3. See also Supplementary Figure S4.

All three of the residues described above (Arg 86, His 163 and Arg 169) are conserved between the isoschizomeric enzymes R.Eco29kI, R.NgoMIII and R.Cfr42I (Figure 3a). Prior mutagenesis studies of R.Eco29kI indicated that alteration of R86 has little effect on the function of the enzyme (Ibryashkina et al., 2007), indicating that this contact may make a relatively minor contribution to DNA binding and/or specificity. The use of a histidine residue to recognize a distal guanine residue in a restriction endonuclease active site has been observed previously for the R.NotI enzyme (Lambert et al., 2008); that interaction appears to promote pH-dependent star activity (reduced fidelity) at that position.

In the protein-DNA interface described above, a total of five direct hydrogen bonds are made to DNA bases in each half-site, by two amino acid side chains and two main-chain carbonyl oxygens from each protein subunit. This degree of direct sequence readout is rather minimal for a high-fidelity restriction endonuclease, particularly as compared to structures from the PD-(D/E)xK protein family, which are often observed to saturate all possible hydrogen-bond donor and acceptor groups in their corresponding DNA restriction site. The HNH restriction endonuclease R.PacI, which also is folded around a nonspecific endonuclease active site motif, has also been shown to make a small number of direct contacts to the DNA bases of its restriction site (Shen et al., 2010). However, DNA recognition by R.PacI involves significant disruption of the DNA duplex, whereas the protein-DNA interface of R.Eco29kI is notable for its lack of DNA bending or basepair disruption. As well, additional van der Waals packing interactions around the target site basepairs (which might add to specificity through steric readout of the shape of each basepair) are not particularly extensive (Supplementary Figure S6). Although the sequence-specific contacts made by R.Eco29kI to its restriction site are few in number, they appear to be sufficient to ensure the fidelity required of a restriction-modification system.

The cocrystal structure exhibits subtle asymmetries that are noticeable upon close inspection. This asymmetry is reproduced in all four protein-DNA complexes that are found in the asymmetric unit, and is also observed in crystals containing either calcium or manganese. Most notably, Arg 169 and Arg 169’ in the DNA interface (which contact Gua 3 in each DNA half-site) are found in slightly different rotameric conformations, such that one guanidino group makes a single observable contact to the purine ring, while the other makes two contacts to the same base in the opposite DNA half-site. As described in the following section, this asymmetry is also visible when comparing the two endonuclease active sites, where the catalytic groups in one subunit appear slightly more optimized for hydrolysis than in the opposing subunit and DNA strand, and where the two tyrosines of the GIY-YIG motif are each implicated as possible general bases in the hydrolysis reaction.

There are two possible explanations for this observation. The first is that the E142Q mutation (which appears to prevent binding of a metal ion in the active site, as discussed in the next section) causes a perturbation of the binding interactions for one or both subunits. However, previously described biochemical studies indicate that mutation of E142, and/or absence of metal ions from the protein buffer, have minimal effect on the DNA binding affinity of R.Eco29kI. The second explanation, which is more interesting mechanistically, is that the structure of the enzyme dimer, and the spacing between the DNA-binding surface of the two subunits, is not perfectly optimal for simultaneous, identical binding contacts to both DNA half-sites. If true, this might cause the enzyme to cleave the two DNA strands in a sequential manner, rather than in a single concerted step: this is a well-known property of many restriction endonucleases (Gowers et al., 2004).

To further address this question, the DNA-bound structure of a different point mutant of R.Eco29kI (containing Y49F, rather than E142Q) was also determined (Supplementary Table S3). The structure of that point mutant, including the asymmetry described above, is very similar to E142Q.

The GIY-YIG active site and catalysis

The organization and architecture of the DNA-bound R.Eco29kI active site (Figure 5) resembles, at least superficially, the catalytic arrangement observed in the ββα-metal nuclease active sites of enzymes such as bacterial colicins and HNH endonucleases (Mehta et al., 2004), with the exception that the protein fold topologies in the region of the active sites are quite different (Figure 5c), and that the tyrosine residues found in the GIY-YIG motif are positioned near the postulated water nucleophile for DNA hydrolysis (rather than a histidine side chain). The imidazole side chain of histidine 108 is located within hydrogen-bonding distance of tyrosines 49 and 76, and thereby might facilitate proton abstraction. The side chain of glutamine 142 (which is a glutamate in the wild-type enzyme) is located at the same position, relative to the scissile phosphate, as acidic metal binding residues in the active sites of a variety of HNH endonucleases. Another active site residue (N154) located outside the core fold is also found at the same relative position as a metal-binding asparagine residue that is found in almost all HNH endonucleases. As mentioned above, this final residue is not absolutely conserved in the GIY-YIG endonucleases (being absent in the Slx-1 lineage); the corresponding Asn residue is also only partially conserved in the HNH endonucleases. With the exception of N154, the active site architecture is very strongly conserved across the GIY-YIG endonucleases that have been visualized (Figure 6a).

Figure 5. The R.Eco29kI active site.

Cleavage occurs between Cyt4 and Gua5 on each strand of the restriction site. Panel a: The scissile phosphates are located 15 Å apart, entirely within separate active sites. All residues in each active site are provided entirely by individual protein subunits. Panel b: The active site of one protein subunit, shown in two different orientations (rotated by ~90° relative to one another). In the left image the view is looking across the DNA strand; in the right image the view is looking down the same DNA strand (from the approximate position of the incoming water nucleophile). The residues shown and labeled correspond to those conserved residues also found in the active sites of UvrC, I-TevI, and T4 endonuclease II (Slx-1 is missing N154). E142Q denotes the inactivating point mutation included in the construct of R.Eco29kI used for these structural studies. A single bound water molecule is clearly visible in-line with the scissile phosphate, giving a ‘through phosphorus’ angle to the 3′ oxygen leaving group of ~180°; the distance from this water to the phosphorus and to the phenolate oxygen of Y49 is approximately 3 Å. H108 is not located close enough to act directly on the water molecule, but is positioned appropriately to assist in acid-base catalysis by establishing an H-bond to Y49. E142 and N154 are found at positions that could bind a divalent metal ion in coordination with a nonbridging oxygen of the scissile phosphate. Panel c: Comparison of the GIY-YIG endonuclease active site of R.Eco29kI (left) with the HNH active site of R.Hpy99I (right), viewed from the same relative orientation of the scissile phosphate and flanking bases. The proposed mechanism and overall catalytic architecture are largely the same (with the exception of the substitution of a tyrosine for a histidine), but are grafted onto two completely separate protein fold topologies. The sphere in the R.Hpy99I structure is a bound magnesium ion; the open circle in the R.Eco29kI active site represents the equivalent position, which would allow coordination by N154, E142Q and the scissile phosphate. See also Supplementary Figure S5.
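The in-line geometry described in this legend (a water to phosphorus distance of about 3 Å and a ‘through phosphorus’ angle of about 180° to the 3′ oxygen) can be checked from atomic coordinates with a few lines of code. The following is a minimal sketch only: the coordinates are hypothetical placeholders rather than values taken from the deposited structures, and the cutoffs used in `is_inline` (3.5 Å and 160°) are illustrative assumptions, not criteria used in this study.

```python
import math

def distance(a, b):
    """Euclidean distance between two 3D points (in Å)."""
    return math.dist(a, b)

def angle_deg(a, vertex, c):
    """Angle a-vertex-c in degrees, using atan2 for stability near 180°."""
    v1 = [p - q for p, q in zip(a, vertex)]
    v2 = [p - q for p, q in zip(c, vertex)]
    cross = (
        v1[1] * v2[2] - v1[2] * v2[1],
        v1[2] * v2[0] - v1[0] * v2[2],
        v1[0] * v2[1] - v1[1] * v2[0],
    )
    dot = sum(p * q for p, q in zip(v1, v2))
    return math.degrees(math.atan2(math.hypot(*cross), dot))

def is_inline(water, phosphorus, o3_prime, max_dist=3.5, min_angle=160.0):
    """True if the water is close to P and roughly opposite the O3' leaving group.

    max_dist and min_angle are illustrative assumptions, not published cutoffs.
    """
    return (distance(water, phosphorus) <= max_dist
            and angle_deg(water, phosphorus, o3_prime) >= min_angle)

# Hypothetical coordinates (Å), chosen only to illustrate the calculation.
phosphorus = (0.0, 0.0, 0.0)
o3_prime = (0.0, 0.0, -1.6)      # leaving-group oxygen
water_near = (0.0, 0.0, 3.0)     # water opposite O3', 3 Å from P
water_far = (0.0, 0.0, 4.2)      # same direction, but more than 4 Å from P

print(angle_deg(water_near, phosphorus, o3_prime))
print(is_inline(water_near, phosphorus, o3_prime))
print(is_inline(water_far, phosphorus, o3_prime))
```

With real data, the three points would be the water oxygen, the phosphorus of the scissile phosphate, and the O3′ atom read from a coordinate file.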

Figure 6. The GIY-YIG active site and proposed mechanism.

Panel a: Structure superposition of the crystallized GIY-YIG endonucleases, color coded as shown in the table of homologous residues. The scissile phosphate and flanking bases from R.Eco29kI are shown for orientation purposes. The active sites from both subunits of R.Eco29kI are shown (dark and light blue). The positions and identity of four residues (corresponding to Y49, Y76, E142 and N154) are well conserved (except for the absence of the Asn residue in Slx1). In contrast, the conformation and/or identity of residues corresponding to H108 and R104 are more divergent. Panel b: Proposed mechanism of R.Eco29kI. The structure indicates that strand cleavage in each active site involves a single bound metal ion (coordinated by at least E142, and possibly by N154) and the activation of a water molecule by a tyrosine residue. In this figure, we indicate Y49 as the likely general base. However, the slightly different positions and relative angles of the active site tyrosines, waters and phosphate in the separately visualized active sites in R.Eco29kI do not currently allow us to unambiguously assign this role to Y49, Y76, or a combination of the two residues.

In the structure of DNA-bound R.Eco29kI, a bound metal ion is absent from the active site, which implies that the likely reason for the inactivating effect of the E142Q mutation is loss of a bound metal ion that is required for stabilization of the reaction transition state. When a bound metal ion is modeled near these side chains at a position similar to that in typical HNH endonucleases (corresponding to coordination by E142, N154 and a nonbridging oxygen of the scissile phosphate), a mechanism for DNA hydrolysis can be postulated (Figure 6b) that involves the activation of a water nucleophile by an active site tyrosine residue (which is rendered a more effective base via a hydrogen bond to His 108) and the simultaneous stabilization of the phosphoanion transition state and the 3′ leaving group by a single bound divalent metal ion. This mechanism is consistent with previously published biochemical and kinetic studies of mutations in the R.Eco29kI active site (Ibryashkina et al., 2009; Nikitin et al., 2003) and is also in agreement with the previously postulated mechanisms for the I-TevI homing endonuclease (VanRoey et al., 2002) and the UvrC endonuclease (Karakas et al., 2007; Truglio et al., 2005).

As described in the prior section for DNA sequence recognition, the active sites of the two subunits differ slightly in the measured distances and interactions between the DNA phosphate and neighboring DNA backbone atoms and the corresponding protein atoms. In particular, Arg 104 and 104’ are found in different rotameric conformations, such that in the first subunit the side chain interacts with the scissile phosphate, whereas in the second it is in close contact with the neighboring ribose in the DNA backbone.

In the active site conformation in subunit 1 of R.Eco29kI (chain D in the corresponding PDB file), depicted in Figures 5 and 6, the distances and angles from Y49, through the ordered water molecule in the active site, to the scissile phosphate appear nearly ideal for an in-line hydrolytic attack and formation of 5′ phosphate and 3′ hydroxyl product ends. A mechanism that relies upon Y49 as the general base for the reaction is in agreement with mutagenesis studies of Eco29kI (Ibryashkina et al., 2007) and I-TevI (Kowalski et al., 1999) (where mutation of this residue or its homologous counterpart in I-TevI resulted in complete loss of activity). However, in the active site in subunit 2 of R.Eco29kI (chain F in the corresponding PDB file), the positioning of Y49 and Y76 relative to the water and phosphate groups is more ambiguous; in this active site the distance from the water to the phosphate is longer (over 4 Å) and Y76 could reasonably be assigned the role of primary general base. Given this observation, as well as the results of mutational analyses in UvrC (Truglio et al., 2005) (where mutation of the counterpart of Y76 resulted in complete loss of activity), it is formally possible that either of the two GIY-YIG tyrosine residues in R.Eco29kI (Y49 or Y76) might act directly or indirectly as a general base in the reaction, or that one residue might ‘back up’ the other when one of them is mutated. Such a result (catalytic redundancy for an active site general base in an endonuclease) has been reported for the HNH/His-Cys box enzyme I-PpoI (Eastberg et al., 2007).

The differing evolutionary fates of endonuclease superfamilies

In nature, a relatively small number of protein families are found to encompass the vast majority of enzymes that make and break phosphodiester bonds. In particular, the PD-(D/E)xK, HNH and GIY-YIG domains (in addition to the LAGLIDADG endonuclease family and the less common phospholipase D and half-pipe folds) are each found in enzymes involved in similar processes, including phage restriction, transposition and homing, Holliday junction resolution and recombination (Orlowski and Bujnicki, 2008). A notable difference between the PD-(D/E)xK and GIY-YIG endonucleases is observed in the extent of structural conservation of their active sites. The GIY-YIG endonuclease domains that have been crystallized participate in highly disparate biological functions that include genomic degradation, invasion, defense and repair. They also reside within protein scaffolds and architectures ranging from single domain endonucleases, to single chain fusions of catalytic domains and DNA binding domains, to assemblages of multiple protein subunits. Given this level of functional and structural diversity, the conservation of their active sites is striking: of the position and identity of six catalytically important residues, only one (N154) is not absolutely or strongly conserved.

In contrast, the PD-(D/E)xK endonucleases, which are equally divergent in function and oligomeric structure, display a striking diversity in their active site architectures. As a result, PD-(D/E)xK enzymes do not appear to display a single uniform reaction mechanism: across various members of this protein family, different numbers of metal ions can be required for DNA cleavage, the position of metal binding sites may be shifted, and a variety of residues and chemistries can be used for proton transfer and transition state stabilization (Yang et al., 2006).

The strict conservation of the GIY-YIG active site implies that this catalytic geometry was optimized and strongly fixed before the divergence of the endonuclease family lineage from its last common ancestor, and that the GIY-YIG active site is significantly less capable of mutational reorganization, without a catastrophic loss of activity that cannot easily be regained, than its PD-(D/E)xK counterpart. This might be caused by the combined importance of the GIY-YIG active site residues both for catalysis and for structural stabilization (which might impose exceptionally strong evolutionary constraints that cause their almost absolute fixation and conservation).

These various endonuclease families have each experienced distinct areas of particular success and expansion, in terms of adopting various biological functions. The PD-(D/E)xK family is clearly the dominant catalytic motif involved in restriction-modification systems. In combination with the HNH endonucleases, the PD-(D/E)xK family also displays an additional broad repertoire of DNA modification and repair functions and activities. In contrast, “the GIY-YIG domain (appears to have been) less successful than several other nuclease superfamilies in spreading to new loci, parasitizing different organisms and adopting different functions” (Dunin-Horkawicz et al., 2006). It is quite possible that this might reflect a reduced ability of the GIY-YIG fold to readily adapt to the demands of structural oligomerization or sequence-specific DNA recognition (Dunin-Horkawicz et al., 2006). However, the analysis of R.Eco29kI presented here indicates that the formation of an active and highly specific restriction endonuclease, using a GIY-YIG catalytic core, does not involve any more complex elaboration upon its catalytic core than is observed for either PD-(D/E)xK or HNH nucleases. Therefore, while it is entirely reasonable to hypothesize that the GIY-YIG endonuclease family persists even though it possesses suboptimal biophysical properties, as genomes have become increasingly complex, an undetermined constraint or selective pressure may have influenced the outcome of competition between endonuclease superfamilies.

Experimental Procedures

Detailed materials and methods are provided in supplementary material. Briefly, a synthetic, codon-optimized gene encoding full-length R.Eco29kI, harboring a single inactivating point mutation (E142Q) was subcloned into an in-house variant of a commercially available pET vector (Novagen, Inc.) that incorporates a thrombin-cleavable his-tag onto the N-terminal end of the protein (Supplementary Figure S1). The protein was overexpressed by IPTG induction overnight at 37°C and purified by metal affinity chromatography. The protein was poorly expressed (~ 0.5 mg per 6 liters of culture) and could only be concentrated after purification to ~ 1 mg/mL. The protein could be resolubilized under extremely alkaline pH (~12 to 13) and was then crystallized out of this condition in the absence of bound DNA. The crystals were in space group C2221; the asymmetric unit corresponded to a single protein subunit.

The structure of the apoenzyme was determined by selenomethionyl derivatization and subsequent multiple anomalous dispersion (MAD) data collection and phasing, using beamline 5.0.2 at the Advanced Light Source synchrotron facility (Lawrence Berkeley National Laboratory). The structure was modeled using COOT (Emsley and Cowtan, 2004) and refined to 2.3 Å resolution using REFMAC (Winn et al., 2003) (Rwork/Rfree = 0.176/0.225). All residues of the protein were visible and modeled in the electron density maps.
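For readers unfamiliar with the Rwork/Rfree statistics quoted here, both are the standard crystallographic residual R = Σ| |Fobs| − |Fcalc| | / Σ|Fobs|, evaluated on the reflections used in refinement (Rwork) and on a held-out test set (Rfree). The sketch below illustrates only that formula: the amplitudes and test-set flags are invented for the example, the helper names are hypothetical, and the scaling between observed and calculated amplitudes that refinement programs apply is omitted.

```python
def r_factor(f_obs, f_calc):
    """Crystallographic residual: sum||Fobs| - |Fcalc|| / sum|Fobs|."""
    numerator = sum(abs(abs(o) - abs(c)) for o, c in zip(f_obs, f_calc))
    denominator = sum(abs(o) for o in f_obs)
    return numerator / denominator

def r_work_and_r_free(f_obs, f_calc, is_free):
    """Split reflections by a held-out flag and compute both residuals."""
    work = [(o, c) for o, c, free in zip(f_obs, f_calc, is_free) if not free]
    free = [(o, c) for o, c, free in zip(f_obs, f_calc, is_free) if free]
    r_work = r_factor([o for o, _ in work], [c for _, c in work])
    r_free = r_factor([o for o, _ in free], [c for _, c in free])
    return r_work, r_free

# Invented structure factor amplitudes, for illustration only.
f_obs = [100, 50, 25, 25, 80, 20]
f_calc = [90, 55, 30, 20, 60, 25]
is_free = [False, False, False, False, True, True]  # last two held out

r_work, r_free = r_work_and_r_free(f_obs, f_calc, is_free)
print(f"Rwork = {r_work:.3f}, Rfree = {r_free:.3f}")
```

Because the test-set reflections are never used to adjust the model, Rfree is normally somewhat higher than Rwork, as in the values reported above.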

The structure of the unbound protein was used to identify a point mutation on the protein surface (L69K) that greatly improved protein expression and solubility. Expression and purification were conducted as described above, leading to a yield of approximately 36 mg per liter of culture. This protein construct was cocrystallized in the presence of an 18 basepair DNA duplex containing the endonuclease restriction site. The structure was determined by molecular replacement using the program PHASER (McCoy et al., 2007) and modeled and refined as described above (Rwork/Rfree = 0.209/0.272). A second inactive variant (Y49F/E142Q) was also expressed and purified using a similar protocol, and was crystallized bound to DNA and refined to 2.8 Å resolution (Rwork/Rfree = 0.206/0.279).
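The size of the improvement from the L69K substitution can be estimated from the two approximate yields reported in this section (~0.5 mg per 6 liters of culture for the original construct, and ~36 mg per liter for L69K). The short calculation below is a back-of-the-envelope sketch that treats both reported values as exact, which they are not; the variable names are illustrative.

```python
# Reported approximate yields (treated as exact for this estimate).
original_yield_mg = 0.5        # mg of protein obtained
original_culture_l = 6.0       # liters of culture used
l69k_yield_mg_per_l = 36.0     # mg per liter for the L69K construct

# Normalize the original yield to mg per liter, then take the ratio.
original_yield_mg_per_l = original_yield_mg / original_culture_l
fold_improvement = l69k_yield_mg_per_l / original_yield_mg_per_l

print(f"Original construct: {original_yield_mg_per_l:.3f} mg/L")
print(f"Approximate fold improvement with L69K: {fold_improvement:.0f}x")
```

On these numbers the surface mutation corresponds to an improvement in yield of roughly 400-fold per liter of culture.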

Coordinates and Data Deposition

The X-ray structure factor amplitudes and corresponding refined coordinates for the R.Eco29kI structures have been deposited in the RCSB database (PDB ID codes 3MX1, 3MX4 and 3NIC).

Acknowledgments

X-ray data was collected at the Advanced Light Source (ALS) synchrotron facility at the Lawrence Berkeley National Laboratory (University of California) on beamlines 5.0.1 and 5.0.2 with the assistance of ALS staff. We thank members of the laboratories of Roland Strong and Adrian Ferre-D’Amare for advice and assistance during structure determination, and in particular valuable assistance from Dr. Betty W. Shen. This work was supported by funding from the NIH to BLS (R01 GM49857 and RL1 CA133833) and by the Hutchinson Center Division of Basic Sciences.

References

  1. Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ. Basic local alignment search tool. J Mol Biol. 1990;215:403–410. doi: 10.1016/S0022-2836(05)80360-2.
  2. Andersson CE, Lagerback P, Carlson K. Structure of bacteriophage T4 endonuclease II mutant E118A, a tetrameric GIY-YIG enzyme. J Mol Biol. 2010;397(4):1003–16. doi: 10.1016/j.jmb.2010.01.076.
  3. Aravind L, Koonin EV. Prokaryotic homologs of the eukaryotic DNA-end-binding protein Ku, novel domains in the Ku protein and prediction of a prokaryotic double-strand break repair system. Genome Res. 2001;11:1365–1374. doi: 10.1101/gr.181001.
  4. Aravind L, Walker DR, Koonin EV. Conserved domains in DNA repair proteins and evolution of repair systems. Nucleic Acids Res. 1999;27:1223–1242. doi: 10.1093/nar/27.5.1223.
  5. Belle A, Landthaler M, Shub DA. Intronless homing: site-specific endonuclease SegF of bacteriophage T4 mediates localized marker exclusion analogous to homing endonucleases of group I introns. Genes and Dev. 2002;16:351–362. doi: 10.1101/gad.960302.
  6. Bujnicki JM, Radlinska M, Rychlewski L. Polyphyletic evolution of type II restriction enzymes revisited: two independent sources of second-hand folds revealed. Trends Biochem Sci. 2001;26:9–11. doi: 10.1016/s0968-0004(00)01690-x.
  7. Carlson K, Wiberg JS. In vivo cleavage of cytosine-containing bacteriophage T4 DNA to genetically distinct, discretely sized fragments. J Virol. 1983;48:18–30. doi: 10.1128/jvi.48.1.18-30.1983.
  8. Coulon S, Gaillard PH, Chahwan C, McDonald WH, Yates JR, 3rd, Russell P. Slx1-Slx4 are subunits of a structure-specific endonuclease that maintains ribosomal DNA in fission yeast. Mol Biol Cell. 2004;15:71–80. doi: 10.1091/mbc.E03-08-0586.
  9. Dunin-Horkawicz S, Feder M, Bujnicki JM. Phylogenomic analysis of the GIY-YIG nuclease superfamily. BMC Genomics. 2006;7:98. doi: 10.1186/1471-2164-7-98.
  10. Eastberg JH, Eklund J, Monnat R, Jr, Stoddard BL. Mutability of an HNH nuclease imidazole general base and exchange of a deprotonation mechanism. Biochemistry. 2007;46:7215–7225. doi: 10.1021/bi700418d.
  11. Emsley P, Cowtan K. Coot: model-building tools for molecular graphics. Acta Crystallogr D Biol Crystallogr. 2004;60:2126–2132. doi: 10.1107/S0907444904019158.
  12. Evans SP, Bycroft M. NMR structure of the N-terminal domain of Saccharomyces cerevisiae RNase HI reveals a fold with a strong resemblance to the N-terminal domain of ribosomal protein L9. J Mol Biol. 1999;291:661–669. doi: 10.1006/jmbi.1999.2971.
  13. Gasiunas G, Sasnauskas G, Tamulaitis G, Urbanke C, Razaniene D, Siksnys V. Tetrameric restriction enzymes: expansion to the GIY-YIG nuclease family. Nucleic Acids Res. 2008;36:938–949. doi: 10.1093/nar/gkm1090.
  14. Gowers DM, Bellamy SR, Halford SE. One recognition sequence, seven restriction enzymes, five reaction mechanisms. Nucleic Acids Res. 2004;32:3469–3479. doi: 10.1093/nar/gkh685.
  15. Hoffman DW, Cameron CS, Davies C, White SW, Ramakrishnan V. Ribosomal protein L9: a structure determination by the combined use of X-ray crystallography and NMR spectroscopy. J Mol Biol. 1996;264:1058–1071. doi: 10.1006/jmbi.1996.0696.
  16. Ibryashkina EM, Sasnauskas G, Solonin AS, Zakharova MV, Siksnys V. Oligomeric structure diversity within the GIY-YIG nuclease family. J Mol Biol. 2009;387:10–16. doi: 10.1016/j.jmb.2009.01.048.
  17. Ibryashkina EM, Zakharova MV, Baskunov VB, Bogdanova ES, Nagornykh MO, Den’mukhamedov MM, Melnik BS, Kolinski A, Gront D, Feder M, et al. Type II restriction endonuclease R.Eco29kI is a member of the GIY-YIG nuclease superfamily. BMC Struct Biol. 2007;7:48. doi: 10.1186/1472-6807-7-48.
  18. Karakas E, Truglio JJ, Croteau D, Rhau B, Wang L, Van Houten B, Kisker C. Structure of the C-terminal half of UvrC reveals an RNase H endonuclease domain with an Argonaute-like catalytic triad. Embo J. 2007;26:613–622. doi: 10.1038/sj.emboj.7601497.
  19. Kosinski J, Feder M, Bujnicki JM. The PD-(D/E)XK superfamily revisited: identification of new members among proteins involved in DNA metabolism and functional predictions for domains of (hitherto) unknown function. BMC Bioinformatics. 2005;6:172. doi: 10.1186/1471-2105-6-172.
  20. Kowalski JC, Belfort M, Stapleton MA, Holpert M, Dansereau JT, Pietrokovski S, Baxter SM, Derbyshire V. Configuration of the catalytic GIY-YIG domain of intron endonuclease I-TevI: coincidence of computational and molecular findings. Nucleic Acids Research. 1999;27:2115–2125. doi: 10.1093/nar/27.10.2115.
  21. Krissinel E, Henrick K. Inference of macromolecular assemblies from crystalline state. J Mol Biol. 2007;372:774–797. doi: 10.1016/j.jmb.2007.05.022.
  22. Lambert AR, Sussman D, Shen B, Maunus R, Nix J, Samuelson J, Xu SY, Stoddard BL. Structures of the rare-cutting restriction endonuclease NotI reveal a unique metal binding fold involved in DNA binding. Structure. 2008;16:558–569. doi: 10.1016/j.str.2008.01.017.
  23. McCoy AJ, Grosse-Kunstleve RW, Adams PD, Winn MD, Storoni LC, Read RJ. Phaser crystallographic software. J Appl Cryst. 2007;40:658–674. doi: 10.1107/S0021889807021206.
  24. Mehta P, Katta K, Krishnaswamy S. HNH family subclassification leads to identification of commonality in the His-Me endonuclease superfamily. Protein Science. 2004;13:295–300. doi: 10.1110/ps.03115604.
  25. Moolenaar GF, van Rossum-Fikkert S, van Kesteren M, Goosen N. Cho, a second endonuclease involved in Escherichia coli nucleotide excision repair. Proc Natl Acad Sci U S A. 2002;99:1467–1472. doi: 10.1073/pnas.032584099.
  26. Nikitin D, Mokrishcheva M, Denjmukhametov M, Pertzev A, Zakharova M, Solonin A. Construction of an overproducing strain, purification, and biochemical characterization of the 6His-Eco29kI restriction endonuclease. Protein Expr Purif. 2003;30:26–31. doi: 10.1016/s1046-5928(03)00072-x.
  27. Orlowski J, Bujnicki JM. Structural and evolutionary classification of Type II restriction enzymes based on theoretical and experimental analyses. Nucleic Acids Res. 2008;36:1–13. doi: 10.1093/nar/gkn175.
  28. Pertzev AV, Ruban NM, Zakharova MV, Beletzkaja IV, Petrov SI, Kravetz AN, Solonin AS. Eco29kI, a novel plasmid encoded restriction endonuclease from Escherichia coli. Nucleic Acids Res. 1992;20:1991. doi: 10.1093/nar/20.8.1991.
  29. Pyatkov KI, Arkhipova IR, Malkova NV, Finnegan DJ, Evgen’ev MB. Reverse transcriptase and endonuclease activities encoded by Penelope-like retroelements. Proc Natl Acad Sci U S A. 2004;101:14719–14724. doi: 10.1073/pnas.0406281101.
  30. Shen BW, Heiter DF, Chan S-H, Wang H, Xu S-Y, Morgan RD, Wilson GG, Stoddard BL. Unusual target site disruption by the rare-cutting HNH restriction endonuclease PacI. Structure. 2010 doi: 10.1016/j.str.2010.03.009. in press.
  31. Swapna GVT, Bhattacharya A, Aramini JM, Acton TB, Ma L, Xiao R, Shastry R, Shih L, Cunningham KE, Montelione GT. Solution NMR structure of the protein EF2693 from E. faecalis: Northeast Structural Genomics Consortium target EFR36. 2005 doi: 10.2210/pdb1ywl/pdb.
  32. Truglio JJ, Rhau B, Croteau DL, Wang L, Skorvaga M, Karakas E, DellaVecchia MJ, Wang H, Van Houten B, Kisker C. Structural insights into the first incision reaction during nucleotide excision repair. Embo J. 2005;24:885–894. doi: 10.1038/sj.emboj.7600568.
  33. Van Houten B, Eisen JA, Hanawalt PC. A cut above: discovery of an alternative excision repair pathway in bacteria. Proc Natl Acad Sci U S A. 2002;99:2581–2583. doi: 10.1073/pnas.062062599.
  34. VanRoey P, Meehan L, Kowalski JC, Belfort M, Derbyshire V. Catalytic domain structure and hypothesis for function of GIY-YIG intron endonuclease I-TevI. Nat Struct Biol. 2002;9:806–811. doi: 10.1038/nsb853.
  35. Verhoeven EE, van Kesteren M, Moolenaar GF, Visse R, Goosen N. Catalytic sites for 3′ and 5′ incision of Escherichia coli nucleotide excision repair are both located in UvrC. J Biol Chem. 2000;275:5120–5123. doi: 10.1074/jbc.275.7.5120.
  36. Winn MD, Murshudov GN, Papiz MZ. Macromolecular TLS refinement in REFMAC at moderate resolutions. Methods Enzymol. 2003;374:300–321. doi: 10.1016/S0076-6879(03)74014-2.
  37. Yang W, Lee JY, Nowotny M. Making and breaking nucleic acids: two-Mg2+-ion catalysis and substrate specificity. Mol Cell. 2006;22:5–13. doi: 10.1016/j.molcel.2006.03.013.
  38. Zakharova MV, Beletskaya IV, Kravetz AN, Pertzev AV, Mayorov SG, Shlyapnikov MG, Solonin AS. Cloning and sequence analysis of the plasmid-borne genes encoding the Eco29kI restriction and modification enzymes. Gene. 1998;208:177–182. doi: 10.1016/s0378-1119(97)00637-9.
  39. Zhang Y, Skolnick J. TM-align: a protein structure alignment algorithm based on the TM-score. Nucleic Acids Res. 2005;33:2302–2309. doi: 10.1093/nar/gki524.
