Proceedings of the National Academy of Sciences of the United States of America. 2005 Jul 11;102(29):10082–10087. doi: 10.1073/pnas.0504556102

Evolution of highly active enzymes by homology-independent recombination

Karl E Griswold, Yasuaki Kawarasaki, Nada Ghoneim, Stephen J Benkovic, Brent L Iverson, George Georgiou
PMCID: PMC1177412  PMID: 16009931

Abstract

The theta-class GST enzymes hGSTT1-1 (human GSTθ-1-1) and rGSTT2-2 (rat GSTθ-2-2) share 54.3% amino acid identity and exhibit different substrate specificities. Homology-independent techniques [incremental truncation for the creation of hybrid enzymes (ITCHY) and SCRATCHY] and low-homology techniques (recombination-dependent exponential amplification PCR) were used to create libraries of chimeric enzymes containing crossovers (C/Os) at positions not accessible by DNA family shuffling. High-throughput flow cytometric screening using the fluorogenic rGSTT2-2-specific substrate 7-amino-4-chloromethyl coumarin led to the isolation of active variants with either one or two C/Os. One of these enzymes, SCR23 (83% identity to hGSTT1-1), was encoded by a gene that exchanged helices 4 and 5 of hGSTT1-1 with the corresponding sequence from rGSTT2-2. Compared with either parent, this variant was found to have an improved kcat with the selection substrate and also exhibited activity for the conjugation of glutathione to ethacrynic acid, a compound that is not recognized by either parental enzyme. These results highlight the power of combinatorial homology-independent and low-homology recombination methods for the generation of unique, highly active enzymes and also suggest a possible means of enzyme “humanization.”

Keywords: enzyme engineering, enzyme humanization, GST, ITCHY, high-throughput screening


GSTs play a crucial role in cellular detoxification by conjugating reactive electrophilic compounds to the tripeptide glutathione (GSH) (1). Mammals encode at least seven distinct classes of GSTs. The theta enzymes represent the oldest class of GSTs and, as such, are produced by a surprisingly diverse array of animals, plants, algae, and bacteria (2). Although theta-class enzymes retain the highly conserved GST 3D fold, they possess a characteristic C-terminal α-helical extension, which, unlike that of some alpha-class enzymes, covers both the electrophile and GSH-binding sites, thus clearly differentiating the theta class from other members of the soluble GST superfamily (3).

The rat GSTθ-2-2 (rGSTT2-2, 244 aa) and human GSTθ-1-1 (hGSTT1-1, 240 aa) enzymes exhibit 54.3% overall amino acid identity. The GSH-binding domain, or G site, of the two enzymes (residues ≈1-77) is conserved (79.2% amino acid identity), and the 5′ regions that encode this domain (nucleotides 1-231) exhibit 74.5% DNA sequence identity. In contrast, the electrophilic substrate binding domain, or H site (amino acid ≈89 to the C terminus), of the rat and human enzymes (encoded by nucleotides ≈265-732 in rGSTT2-2 and nucleotides ≈265-720 in hGSTT1-1) exhibits only 41.4% amino acid identity and 57.9% DNA sequence identity. The sequence divergence of the two enzymes in the C-terminal domain is manifest in their respective electrophilic substrate selectivities. The rat enzyme preferentially catalyzes the GSH conjugation of 1-menaphthyl sulfate, whereas hGSTT1-1 exhibits a characteristic reactivity with dichloromethane (2). Additionally, we recently showed that rGSTT2-2 exhibits much higher activity than hGSTT1-1 with 7-amino-4-chloromethyl coumarin (CMAC), which, upon GSH conjugation, gives rise to a fluorescent, cytoplasmically retained product (4). The amino acid sequence determinants that dictate the difference in substrate selectivity within the mammalian theta-class GSTs are not understood. Earlier attempts to isolate C-terminal rGSTT2-2/hGSTT1-1 chimeras with rat-like specificity from libraries generated by homologous recombination were unsuccessful due to the low degree of homology in the sequences encoding the H site (4, 5).

Site-specific, homology-independent recombination has been accomplished by means of visual inspection of sequence alignments and structural alignments, by computational methods, and by exon shuffling (6-14). Recently, several techniques for the combinatorial generation of protein chimeras in regions of low homology have been developed. These techniques include nonhomologous random recombination (15), sequence homology-independent protein recombination (16), and incremental truncation for the creation of hybrid enzymes (ITCHY) and SCRATCHY (17, 18). However, random nonhomologous recombination results in the creation of libraries containing a large number of out-of-frame or otherwise inactive clones. The theoretical size of such libraries renders them unsuitable for manual screening because a vast number of clones must be interrogated to identify highly active enzymes. In part because of this fact, the isolation of enzymes containing one or more crossovers (C/Os) in low-homology regions and exhibiting high levels of catalytic competency or new specificities has not yet been demonstrated.

Here, we report a combination of homology-dependent and homology-independent methods for the construction of chimeric theta-class GSTs. This superfamily represents an ideal model system because the highly conserved GST 3D fold lends itself to homology-independent recombination and the enzymes themselves have biomedical relevance. High-throughput flow cytometric screening of chimeric libraries using the rGSTT2-2-specific fluorogenic substrate CMAC yielded enzymes exhibiting kcat values >3-fold higher than the parental rat protein. Notably, one of the isolated clones (SCR23) substituted two H site helices from rGSTT2-2 into the hGSTT1-1 framework and resulted in an enzyme that possessed 3.5-fold and 300-fold higher kcat values for the selection substrate relative to rGSTT2-2 and hGSTT1-1, respectively. Additionally, SCR23 catalyzed the GSH conjugation of ethacrynic acid, an activity absent in both parental proteins. These results demonstrate that unique yet highly active enzymes can be isolated from libraries containing a substantial fraction of homology-independent C/Os.

The construction of highly active chimeras with human and nonhuman parents also suggests a potential route to practical therapeutic enzymes. Protein humanization has been used extensively for the generation of therapeutic antibodies having reduced immunogenicity (19-21). Unfortunately, unlike antibodies, enzymes do not possess a singular modular structure that readily allows the rational grafting of functional exogenous loops onto an analogous human framework. The isolation of a chimeric GST possessing only 38 unique rat residues (15.9% of the sequence) yet exhibiting full rGSTT2-2 activity indicates that homology-independent recombination of human genes with those from other species may provide a means of engineering humanized enzymes (i.e., containing a small fraction of nonhuman sequence) that retain a desired exogenous activity and might be useful as clinical therapeutics (22).

Experimental Protocol

Library Construction. The construction of the rat-human (r-h) and human-rat (h-r) GSTθ (GSTT) ITCHY libraries is described in ref. 4. For the construction of the multiple C/O library eSCR-A (Fig. 1), the hGSTT1 (720 nt) and rGSTT2 (732 nt) genes were divided into five overlapping sections as follows: section 1, nucleotides 1-149; section 2, nucleotides 130-261; section 3, nucleotides 242-460; section 4, nucleotides 441-602; section 5, nucleotides 585-720 (numbering was based on hGSTT1). Chimeric fragment generation for sections 1-4 was performed by recombination-dependent exponential amplification PCR (23) with skewed primer sets using the authentic hGSTT1 and rGSTT2 genes as templates. Reaction conditions were as follows: 94°C for 60 s; 40 cycles of 94°C for 30 s and 50°C for 3 s; 10 cycles of 94°C for 30 s, 50°C for 30 s, and 72°C for 30 s. Chimeras within section 5 were prepared by amplifying the single C/O chimeric genes from the h-r and r-hGSTT ITCHY libraries (4) with skewed primers. Gene fragments for each of the five sections were gel-purified, combined, and subjected to an overlap extension PCR. PCR products of ≈720 bp were gel-purified, ligated into pGEM-T vector (Promega), and used to transform electrocompetent Escherichia coli Jude-1 cells (DH10B harboring the F′ factor derived from XL1-Blue). Plasmid DNA was isolated from ≈1 × 10⁵ transformants, digested with NcoI/BamHI, and ligated into pET-28a (Novagen). After transformation, plasmid DNA was extracted, GSTT genes were PCR-amplified, and the PCR products were subjected to DNA shuffling as described in ref. 24. The reassembled products were digested with NcoI/BamHI, ligated into pET-28a, and transformed into E. coli BL21(DE3) [F⁻ ompT hsdSB(rB⁻ mB⁻) gal dcm (DE3)]. The resulting library, eSCR-A, contained ≈5 × 10⁵ independent transformants.
Approximately 700 clones exhibiting high fluorescence after incubation with 10 μM CMAC were isolated by flow cytometry, and the respective GSTT genes were PCR-amplified. The amplified fragments were gel-purified, mixed with hGSTT1 DNA in a 9:1 ratio, and shuffled by DNase I digestion followed by a reassembly PCR (backcrossing) (24). The backcrossed chimeras were digested with NcoI/BamHI, ligated into pET-28a, and used to transform E. coli Jude-1 cells by electroporation, yielding library eSCR-B. Plasmid DNA was isolated and used to transform the expression host BL21(DE3) before screening.
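The five overlapping sections listed under Library Construction can be checked with a short sketch. The coordinates are taken from the text (hGSTT1 numbering); the helper names are illustrative only and are not part of the published protocol. The sketch reports the overlap available for overlap extension PCR between consecutive sections and confirms that the sections tile the 720-nt gene.

```python
# Section boundaries (1-based, inclusive; hGSTT1 numbering) as listed in the text.
SECTIONS = {
    1: (1, 149),
    2: (130, 261),
    3: (242, 460),
    4: (441, 602),
    5: (585, 720),
}


def overlap_lengths(sections):
    """Return {(i, j): overlap in nt} for each pair of consecutive sections."""
    ordered = sorted(sections.items())
    overlaps = {}
    for (i, (_, end_i)), (j, (start_j, _)) in zip(ordered, ordered[1:]):
        overlaps[(i, j)] = max(0, end_i - start_j + 1)
    return overlaps


def covers_gene(sections, gene_length):
    """True if the sections cover positions 1..gene_length without gaps."""
    ordered = sorted(sections.values())
    if ordered[0][0] != 1:
        return False
    reach = ordered[0][1]
    for start, end in ordered[1:]:
        if start > reach + 1:
            return False
        reach = max(reach, end)
    return reach >= gene_length


OVERLAPS = overlap_lengths(SECTIONS)
print(OVERLAPS)
print(covers_gene(SECTIONS, 720))
```

With these coordinates, adjacent sections share 18 to 20 nt, which is the region available to prime the overlap extension step.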

Fig. 1.

A schematic representation of chimeric GSTT library construction. hGSTT1 (red) and rGSTT2 (blue) are used as parental donors for ITCHY library construction (A), resulting in two complementary single C/O libraries from which RH-A5, RH-F2, HR-216, and HR-25 were isolated. A multiple C/O SCRATCHY library, eSCR-A, is generated by using a combination of enhanced C/O SCRATCHY (B1) and recombination-dependent exponential amplification PCR (B2). The library eSCR-A is sorted on a flow cytometer (C), selecting for clones with high CMAC activity. Genes from the selected clones are pooled and combined with the parental hGSTT1-1 gene, and the mixture is subjected to a shuffling step (D), resulting in the humanized library eSCR-B. Library eSCR-B is sorted on a flow cytometer (E), selecting for clones with high CMAC activity. Individual clones are selected for detailed analysis based on whole-cell CMAC activity, restriction fragment length polymorphism analysis, and DNA sequencing (F).

Flow Cytometric Screening. High-throughput screening of libraries for members with CMAC activity was carried out as described in ref. 4, with the exception that events were triggered with a 514.5-nm argon ion laser (100 mW) detected through a 530/40 bandpass filter.

Enzyme Purification and Kinetics. To facilitate purification, genes of interest were subcloned into the NdeI/BamHI sites of pET-28a, installing an N-terminal His-6 tag. Cultures (500 ml) expressing the tagged constructs were induced in LB (Becton Dickinson) supplemented with 50 μg/ml kanamycin and 1 mM IPTG at 25°C for 20 h. Cells were harvested by centrifugation at 4,420 × g for 20 min, resuspended in 15 ml of IMAC buffer (10 mM Tris/500 mM NaCl/15 mM imidazole, pH 8.0), and lysed in a French pressure cell. After lysis, insoluble material was removed by centrifugation at 39,200 × g for 45 min at 4°C. The supernatant was clarified by syringe filtration [with a Whatman 0.2-μm poly(vinylidene difluoride) filter disk], combined with 1 ml of Ni-nitrilotriacetic acid resin (Qiagen, Valencia, CA), and gently agitated for 45 min at 25°C. The resin was separated by gravity filtration and washed four times with 10 ml of IMAC buffer, and GSTT protein was desorbed by adding 30 ml of 10 mM Tris, 500 mM NaCl, and 215 mM imidazole (pH 8.0). The resulting solution was dialyzed four times against 1 liter of storage buffer [10 mM Tris/500 mM NaCl/10% (vol/vol) glycerol, pH 8.0] at 4°C. Yields were 5-100 mg per liter depending on the enzyme, and preparations were typically >90% pure as analyzed by SDS/PAGE (data not shown).

The kinetics of CMAC-GSH conjugation were determined with freshly purified enzymes. Stock solutions of CMAC were made in dimethylformamide, and GSH stocks were made daily in assay buffer (100 mM phosphate, pH 6.5). Reactions were carried out in a total volume of 100 μl, and measurements of initial rates were taken on a Synergy HT fluorescence plate reader (Bio-Tek, Burlington, VT) by using a 360/40 excitation filter and a 485/20 emission filter. CMAC concentrations were varied from 5 to 150 μM (10 mM GSH), and GSH was varied from 0.5 to 10 mM (150 μM CMAC). CMAC-GSH conjugate concentration was correlated to fluorescent signal by HPLC analysis. Prism 4 software (GraphPad, San Diego) was used for determination of kcat and Km values.
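As a rough illustration of the concentration ranges above, the following sketch assumes simple Michaelis-Menten behavior for each substrate at a fixed concentration of the other, which is an assumption made here and not a statement of the fitting model used in Prism. It estimates how close the fixed concentrations (150 μM CMAC, 10 mM GSH) come to saturating the two parental enzymes, using the apparent Km values reported in Table 1. Function names are illustrative.

```python
def michaelis_menten_rate(kcat, km, substrate, enzyme=1.0):
    """Initial rate v = kcat * [E] * [S] / (Km + [S]); Km and [S] share units."""
    return kcat * enzyme * substrate / (km + substrate)


def fractional_saturation(km, substrate):
    """Fraction of Vmax reached at a given substrate concentration."""
    return substrate / (km + substrate)


# Apparent Km values (uM) from Table 1 for the two parental enzymes.
KM_CMAC = {"rGSTT2-2": 11, "hGSTT1-1": 55}
KM_GSH = {"rGSTT2-2": 420, "hGSTT1-1": 6400}

for name in KM_CMAC:
    sat_cmac = fractional_saturation(KM_CMAC[name], 150)    # 150 uM CMAC
    sat_gsh = fractional_saturation(KM_GSH[name], 10_000)   # 10 mM GSH
    print(f"{name}: CMAC saturation {sat_cmac:.2f}, GSH saturation {sat_gsh:.2f}")
```

Under this assumption the fixed substrate is close to saturating for the rat enzyme but less so for the human enzyme, so the reported constants are best read as apparent values at the stated co-substrate concentration.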

Specific activities with 1-chloro-2,4-dinitrobenzene (CDNB), phenethyl isothiocyanate (PEITC), and ethacrynic acid were measured in 100 mM phosphate (pH 6.5) with 10 mM GSH. Initial reaction rates were measured by monitoring the absorbance change at 340 nm for CDNB or 270 nm for PEITC and ethacrynic acid. Reactions were performed in a 1-cm quartz cuvette, and measurements were taken on a Lambda 35 UV/Vis spectrophotometer (PerkinElmer). Conjugate concentrations were correlated to absorbance measurements by using the equation

$$[C] = \frac{A_T - \varepsilon_U \, b \, [T]}{b \, (\varepsilon_C - \varepsilon_U)}$$

where $[C]$ is conjugate concentration, $A_T$ is total absorbance, $[T]$ is the concentration of substrate added to the reaction, $\varepsilon_U$ is the extinction coefficient of unconjugated substrate, $\varepsilon_C$ is the extinction coefficient of conjugated substrate, and $b$ is path length.

The extinction coefficients for the unconjugated substrates were measured in quadruplicate and are as follows: CDNB = 0.5 mM⁻¹·cm⁻¹, PEITC = 0.1 mM⁻¹·cm⁻¹, and ethacrynic acid = 3.4 mM⁻¹·cm⁻¹. The extinction coefficients for the GSH conjugates were determined by spectroscopic and HPLC analysis of equilibrated reaction mixtures and were calculated to be CDNB-GSH = 1.5 mM⁻¹·cm⁻¹, PEITC-GSH = 9.1 mM⁻¹·cm⁻¹, and ethacrynic acid-GSH = 7.9 mM⁻¹·cm⁻¹.
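The conversion from absorbance to conjugate concentration can be sketched as follows. The sketch assumes that the total absorbance is the Beer-Lambert sum of unconjugated substrate and conjugate, A_T = b(ε_U([T] − [C]) + ε_C[C]), which rearranges to [C] = (A_T/b − ε_U[T]) / (ε_C − ε_U). The extinction coefficients are those given in the text; the substrate concentration in the example is hypothetical, chosen only to exercise the function.

```python
# Extinction coefficients from the text, in mM^-1 cm^-1:
# substrate -> (unconjugated, GSH conjugate).
EXTINCTION = {
    "CDNB": (0.5, 1.5),
    "PEITC": (0.1, 9.1),
    "ethacrynic acid": (3.4, 7.9),
}


def total_absorbance(conjugate_mM, substrate_mM, eps_u, eps_c, path_cm=1.0):
    """Beer-Lambert sum of remaining substrate and formed conjugate."""
    return path_cm * (eps_u * (substrate_mM - conjugate_mM) + eps_c * conjugate_mM)


def conjugate_concentration(a_total, substrate_mM, eps_u, eps_c, path_cm=1.0):
    """Invert the additive absorbance model to recover [C] in mM."""
    if eps_c == eps_u:
        raise ValueError("conjugate and substrate must differ in extinction coefficient")
    return (a_total / path_cm - eps_u * substrate_mM) / (eps_c - eps_u)


# Hypothetical example: 1.0 mM CDNB added, 0.10 mM converted to conjugate.
eps_u, eps_c = EXTINCTION["CDNB"]
a_t = total_absorbance(0.10, 1.0, eps_u, eps_c)
print(round(a_t, 3))
print(round(conjugate_concentration(a_t, 1.0, eps_u, eps_c), 3))
```

The round trip recovers the conjugate concentration that was used to generate the absorbance, which is a quick check that the inversion matches the additive model.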

Results

Library Construction and Screening. Initial attempts to create GST chimeras by using DNA family shuffling failed to yield sequences containing C/Os within the H site where rGSTT2-2 and hGSTT1-1 exhibit the highest degree of sequence divergence (4, 5). These results were not surprising, because homology-based techniques are heavily biased toward the generation of C/Os in long stretches of identical sequences (>7 nt) (25, 26), which are conspicuously rare in the 3′ ends of the rGSTT2-2 and hGSTT1-1 genes.
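The bias described above can be made concrete with a small sketch that scans a pairwise nucleotide alignment for stretches of identity longer than 7 nt, the regions where homology-dependent crossovers are expected to concentrate. The two sequences below are short made-up placeholders, not the GSTT genes, and the function name is illustrative.

```python
def identical_runs(seq_a, seq_b, min_len=8):
    """Return (start, length) for runs of identical aligned bases of at least min_len.

    Both inputs must be aligned to the same length; '-' marks a gap.
    Start positions are 0-based.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    runs = []
    run_start = None
    for i, (a, b) in enumerate(zip(seq_a, seq_b)):
        same = a == b and a != "-"
        if same and run_start is None:
            run_start = i
        elif not same and run_start is not None:
            if i - run_start >= min_len:
                runs.append((run_start, i - run_start))
            run_start = None
    if run_start is not None and len(seq_a) - run_start >= min_len:
        runs.append((run_start, len(seq_a) - run_start))
    return runs


# Placeholder aligned sequences (hypothetical, for illustration only).
TOY_A = "ATGGCCTTAGACCTGTATTC"
TOY_B = "ATGGCCTTAGTCCAGTTTTC"
print(identical_runs(TOY_A, TOY_B))
```

In the toy alignment only the first ten positions qualify; the shorter identical stretches further along fall below the threshold, mirroring the situation in the 3′ ends of the two genes.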

To circumvent this problem, r-hGSTT and h-rGSTT single C/O chimeric libraries were first constructed by using ITCHY (4) (Fig. 1). Two libraries consisting of 13,000 and 1,000 transformants were obtained, respectively. The chimeric genes were cloned downstream from the T7 promoter, and both libraries were screened for activity with the rGSTT2-2-specific CMAC substrate by flow cytometry. The rGSTT2-2 enzyme possesses a 400-fold higher kcat/Km for CMAC compared with hGSTT1-1 (Table 1), and cells expressing the former enzyme exhibit a 10-fold higher fluorescence (Fig. 2). After incubation with 10 μM CMAC, 4 × 10⁶ cells from the r-hGSTT library and 3 × 10⁵ cells from the h-rGSTT library were sorted by flow cytometry, and highly fluorescent events from each were isolated directly onto selective LB plates. For the r-hGSTT library, 20 of 176 resulting clones were selected at random, grown in liquid media, and rank-ordered with respect to CMAC fluorescence. Similarly, 30 of 219 clones from the selected h-rGSTT library were analyzed as monoclonal populations. The most fluorescent clones from the r-hGSTT library, RH-A5 and RH-F2, exhibited mean flow cytometric signal intensities that were 30% and 50% lower compared with cells expressing the parental rat enzyme. On the other hand, clones HR-216 and HR-25 from the h-rGSTT library displayed 15% lower and equivalent mean fluorescence, respectively.

Table 1. Kinetic parameters for GSTT enzymes.

| Enzyme | Km (CMAC), μM | Km (GSH), μM | kcat, min⁻¹ | kcat/Km (CMAC), mM⁻¹·min⁻¹ | kcat/Km (GSH), mM⁻¹·min⁻¹ |
|---|---|---|---|---|---|
| hGSTT1-1 | 55 ± 5 | 6,400 ± 700 | 0.089 ± 0.004 | 1.6 ± 0.2 | 0.014 ± 0.002 |
| rGSTT2-2 | 11 ± 1 | 420 ± 70 | 7.2 ± 0.2 | 650 ± 90 | 17 ± 4 |
| RH-A5 | 51 ± 6 | 4,330 ± 80 | 2.9 ± 0.1 | 60 ± 10 | 0.67 ± 0.04 |
| RH-F2 | 67 ± 9 | 4,340 ± 40 | 1.32 ± 0.07 | 20 ± 4 | 0.30 ± 0.02 |
| HR-25 | 55 ± 2 | 3,200 ± 600 | 9.5 ± 0.6 | 170 ± 20 | 3.0 ± 0.9 |
| HR-216 | 19 ± 2 | 1,600 ± 200 | 23 ± 1 | 1,200 ± 200 | 14 ± 3 |
| SCR9 | 53 ± 3 | 2,300 ± 100 | 27.0 ± 0.5 | 510 ± 40 | 11.7 ± 0.8 |
| SCR23 | 46 ± 7 | 2,600 ± 200 | 27 ± 1 | 600 ± 100 | 10 ± 1 |
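The catalytic efficiencies in Table 1 follow directly from the kcat and Km columns once Km is converted from μM to mM. The sketch below recomputes them from the mean values in the table (uncertainties are ignored) and derives kcat ratios relative to the parental enzymes; the dictionary layout and function names are illustrative.

```python
# Table 1 mean values: enzyme -> (Km CMAC in uM, Km GSH in uM, kcat in min^-1).
TABLE1 = {
    "hGSTT1-1": (55, 6400, 0.089),
    "rGSTT2-2": (11, 420, 7.2),
    "RH-A5": (51, 4330, 2.9),
    "RH-F2": (67, 4340, 1.32),
    "HR-25": (55, 3200, 9.5),
    "HR-216": (19, 1600, 23),
    "SCR9": (53, 2300, 27.0),
    "SCR23": (46, 2600, 27),
}


def efficiency(kcat_per_min, km_uM):
    """Return kcat/Km in mM^-1 min^-1; Km is converted from uM to mM."""
    return kcat_per_min / (km_uM / 1000.0)


def kcat_fold(enzyme, reference):
    """Ratio of kcat values (enzyme over reference), both taken from TABLE1."""
    return TABLE1[enzyme][2] / TABLE1[reference][2]


for name, (km_cmac, km_gsh, kcat) in TABLE1.items():
    print(f"{name:9s} kcat/Km(CMAC) = {efficiency(kcat, km_cmac):7.1f}  "
          f"kcat/Km(GSH) = {efficiency(kcat, km_gsh):7.3f}")

print(f"SCR23 vs rGSTT2-2 kcat: {kcat_fold('SCR23', 'rGSTT2-2'):.2f}-fold")
print(f"SCR23 vs hGSTT1-1 kcat: {kcat_fold('SCR23', 'hGSTT1-1'):.0f}-fold")
```

The recomputed efficiencies agree with the tabulated ones to within rounding, for example about 655 versus 650 mM⁻¹·min⁻¹ for rGSTT2-2 with CMAC.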

Fig. 2.

Whole-cell CMAC activity measured by flow cytometry and depicted as fluorescence histograms of GSTT-expressing E. coli. Data for monoclonal hGSTT1-1-expressing cells are shown in foreground (black), and data for rGSTT2-2-expressing cells are shown in background (white). Library eSCR-B is represented as gray histograms before sorting, after one round, and after two rounds. Mean fluorescence of populations is based on 10,000 events (M).

DNA sequencing revealed that RH-A5 has a single in-frame C/O at the distal 3′ end of the gene (Fig. 3A), such that amino acids 1-231 are derived from the rat enzyme, whereas the final 13 amino acids correspond to residues 228-240 of the human GST. Modeling based on the homologous theta-class hGSTT2-2 crystal structure (27) indicated that the majority of the C-terminal helix of RH-A5 is derived from hGSTT1-1. Similarly, ITCHY clone RH-F2 contained a single 3′ C/O (Fig. 3A) such that amino acids 1-221 were derived from rGSTT2-2 and amino acids 222-240 were derived from hGSTT1-1. RH-F2 thus contains a human C-terminal α-helix preceded by a human flexible loop connecting it to the rat-derived helix 8. The rest of the highly active clones from the r-hGSTT ITCHY library contained similar C/Os exchanging various portions of the rat C-terminal α-helix for that of hGSTT1-1 or were truncated within or adjacent to this helix (data not shown).

Fig. 3.

Gene and protein structures of GSTT enzymes. (A) Schematic representation of selected chimeric genes. Sequences inherited from hGSTT1-1 are shown as red bars, and segments from rGSTT2-2 are shown as blue bars. Insertions not derived from either parent are represented as green bars. Positioning of the progeny segments corresponds to their origin in the parental genes (depicted at top). Point mutations are represented as white (silent) or green (encoding for amino acid substitution) stars. Sequences encoding the G and H sites are noted. (B) Mapping of SCR9 amino acid sequence onto the crystal structure of the hGSTT2 monomer. Human-derived sequence is in red, rat sequence is in blue, and identical amino acids at fusion points are in magenta. The location of the active site is marked with a black star. (C) Mapping of SCR23 amino acid sequence onto the hGSTT2 structure. Point mutations are shown in green.

The clones isolated from the complementary h-rGSTT ITCHY library all contained single C/Os within or adjacent to the conserved N-terminal domain. Specifically, the HR-216 gene (Fig. 3A) encodes a chimeric enzyme with human amino acids 1-83, residues 84-86 common to both parents, and amino acids 87-244 originating from the rat protein. As a result, the fusion junction is located in the flexible loop connecting the N- and C-terminal domains. The second highly fluorescent clone, HR-25, comprises amino acids 1-77 of hGSTT1-1 followed by a six-residue segment (ARDIRS, amino acids 78-83) that has no homology to either parent. The C-terminal end of this chimera is derived from residues 79-244 of rGSTT2-2. Structurally, it appears that the entire N-terminal domain of hGSTT1-1 is fused to the connecting flexible loop and the C-terminal domain of rGSTT2-2 through a 6-aa linker of unknown origin. All other highly active h-rGSTT chimeras obtained from this library showed a similar fusion pattern with the N-terminal domain of hGSTT1-1 being linked in-frame to the C-terminal domain of rGSTT2-2 (data not shown).

The screening and analysis of the two complementary ITCHY libraries described above suggested that no single C/O event within the core of the H site (between amino acids 89 and 221) is capable of generating chimeras with a significant fraction of the parental rGSTT2-2 CMAC activity. We wondered whether CMAC specificity is critically determined by the near complete sequence of the rat H site (excluding the C-terminal helix) or whether the generation of multiple C/Os within the C-terminal domain might yield enzymes with CMAC activity comparable to the rGSTT2-2 protein. A combination of low-homology and homology-independent recombination techniques was used to force the creation of C/Os within the 3′ region encoding the H site. Briefly, the genes for the GST enzymes were divided into five segments, the first four of which were recombined by using recombination-dependent exponential amplification PCR, a library construction technique that allows the facile generation of low-homology recombination events. C/Os in the fifth segment encoding the C-terminal α-helix, which initially appeared to be critical in conferring parental levels of rat catalytic activity, were generated by enhanced C/O SCRATCHY (4). The chimeric DNA fragments from each section were assembled by overlap extension PCR and further diversified by DNA family shuffling, and the gene products were ligated into pET-28a to yield library eSCR-A (Fig. 1). This multiple C/O library (5 × 10⁵ clones) was incubated with 10 μM CMAC followed by flow cytometric isolation of high fluorescence intensity cells. Chimeric genes from the resulting 700 active clones were isolated, pooled, and backcrossed with a 1/10 molar fraction of the parental hGSTT1 gene by DNA shuffling, yielding the “humanized” library eSCR-B (Fig. 1).

Library eSCR-B was subjected to two rounds of flow cytometric cell sorting. Fig. 2 shows the fluorescence histograms of the pooled clones after sorts 1 and 2. A total of 96 clones was randomly chosen from the sort 2 population, and AluI restriction fragment length polymorphism was used to detect the presence of multiple C/Os. Based on this analysis, 16 putative chimeric clones were grown in liquid media, and their whole-cell fluorescence with CMAC was determined. The two most fluorescent clones, SCR23 and SCR9, exhibited 90% and 100%, respectively, of the signal obtained with cells expressing the parental rat enzyme.

Sequence analysis of SCR9 revealed three C/Os at the DNA level (Fig. 3A). At the amino acid level, residues 1-96 are derived from hGSTT1-1, followed by 97-99 (at C/O-2), which are identical in both parental enzymes (Fig. 3B). Residues 100-226 are from rGSTT2-2, whereas 227-248 are the same as hGSTT1-1 amino acids 219-240. Thus, SCR9 has an overall human-rat-human sandwich structure and combines features analogous to the complementary ITCHY clones HR-216 and RH-F2 (i.e., a human G site and C-terminal α-helix).

The second highly fluorescent chimera, SCR23, contained two DNA C/Os (Fig. 3A). This gene encodes amino acids 1-87 of hGSTT1-1, amino acids 88-93, which are identical in both parent enzymes, amino acids 94-153 from rGSTT2-2, and residues 154-239, which correspond to 155-240 of hGSTT1-1 (Fig. 3C). SCR23 also contained two point mutations, L113P and W234R, the latter of which is known to play a role in substrate selectivity (28). It is noteworthy that SCR23 contains a low-homology C/O near the middle of the electrophile-determining H site, indicating that exchange of parental C-terminal subdomains can give rise to enzymes with high catalytic activity.
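The segment boundaries reported for SCR9 and SCR23 can be captured in a small lookup structure. The residue ranges below are copied from the text; the labels ("human", "rat", "shared") and helper names are illustrative. Note that the length of the rat-derived segment is not the same as the number of residues unique to the rat parent, because many positions inside that segment are identical in both parents.

```python
# Segment maps from the text: (first residue, last residue, origin), 1-based inclusive.
CHIMERAS = {
    "SCR9": [(1, 96, "human"), (97, 99, "shared"), (100, 226, "rat"), (227, 248, "human")],
    "SCR23": [(1, 87, "human"), (88, 93, "shared"), (94, 153, "rat"), (154, 239, "human")],
}


def composition(segments):
    """Count residues per origin, checking that segments are contiguous from residue 1."""
    counts = {}
    expected_start = 1
    for start, end, origin in segments:
        if start != expected_start:
            raise ValueError(f"gap or overlap before residue {start}")
        counts[origin] = counts.get(origin, 0) + (end - start + 1)
        expected_start = end + 1
    return counts


def origin_of(segments, residue):
    """Return the origin label of a given residue number."""
    for start, end, origin in segments:
        if start <= residue <= end:
            return origin
    raise ValueError(f"residue {residue} is outside the chimera")


for name, segments in CHIMERAS.items():
    counts = composition(segments)
    print(name, counts, "total", sum(counts.values()))

# The two SCR23 point mutations named in the text fall in different segments.
print(origin_of(CHIMERAS["SCR23"], 113), origin_of(CHIMERAS["SCR23"], 234))
```

For SCR23 the map gives 239 residues in total with a 60-residue rat-derived segment, consistent with the 38 unique rat residues (15.9% of 239) cited in the introduction being a subset of that segment.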

Enzyme Characterization. The two parental GSTs and the six chimeras described above were fused to an N-terminal His-6 tag and purified to near homogeneity by immobilized metal affinity chromatography. Fusion of a His-6 tag to the C terminus resulted in enzymes with low specific activities (data not shown). The kcat and Km values for the conjugation of CMAC to GSH are listed in Table 1. Although the purified enzymes displayed a wide range of activities, even hGSTT1-1, which is a poor catalyst with the selection substrate, exhibited a relatively large catalytic rate enhancement compared with the uncatalyzed reaction (second-order rate constant k2 = 7.6 × 10⁻⁵ ± 9 × 10⁻⁶ mM⁻¹·min⁻¹). Exchange of the C-terminal helix of the rat enzyme with the corresponding sequence from the human protein in clones RH-A5 and RH-F2 produced proteins that displayed 30- and 10-fold rate enhancements, respectively, relative to hGSTT1-1 but were nonetheless substantially less active than rGSTT2-2. Such reductions in catalytic rate relative to the most active parent are typical of chimeric enzymes derived from homology-independent recombination (15, 17, 29).
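One way to quantify the rate enhancement mentioned above is to divide the enzymatic second-order constant kcat/Km (CMAC) by the uncatalyzed second-order rate constant k2, since both carry units of mM⁻¹·min⁻¹. This particular ratio is an assumption made here for illustration; the text does not state which comparison was intended. The inputs are the Table 1 efficiencies and the k2 value quoted above.

```python
K2_UNCATALYZED = 7.6e-5  # mM^-1 min^-1, uncatalyzed CMAC-GSH reaction (from the text)

# kcat/Km (CMAC) in mM^-1 min^-1, mean values from Table 1.
EFFICIENCY_CMAC = {"hGSTT1-1": 1.6, "rGSTT2-2": 650}


def rate_enhancement(kcat_over_km, k2=K2_UNCATALYZED):
    """Dimensionless ratio of enzymatic to uncatalyzed second-order rate constants."""
    return kcat_over_km / k2


for name, value in EFFICIENCY_CMAC.items():
    print(f"{name}: {rate_enhancement(value):.2e}")
```

On this measure even the human enzyme accelerates the reaction by roughly four orders of magnitude, and the rat enzyme by nearly seven.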

By contrast, the h-r chimeras HR-25 and HR-216 exhibited kcat values of 9.5 min⁻¹ and 23 min⁻¹, approximately 1.3-fold and 3.2-fold that of the rat enzyme (7.2 min⁻¹; Table 1), respectively. In both HR-25 and HR-216, the G site of the rat enzyme had been entirely substituted with the highly homologous domain from hGSTT1-1 without any sequence exchange in the electrophile-specifying H site. Inspection of the kcat/Km values of these enzymes for GSH revealed a comparable or lower efficiency of GSH turnover relative to the rat parent. These results indicate that the increased kcat exhibited by HR-216 is not likely to be due to better binding/activation of GSH by the human-derived G site; instead, the presence of the human N-terminal domain appears to be indirectly affecting catalytic efficiency with respect to the electrophile.

As noted above, SCR9 incorporated the human structural aspects of both HR-216 and RH-F2. SCR9 displayed a turnover number (27.0 min⁻¹) 3.75-fold that of rGSTT2-2, similar to HR-216. As was seen for RH-F2, the Km values for CMAC and GSH were greater than those of the rat enzyme and approached those of hGSTT1-1, resulting in an overall catalytic efficiency that was equivalent to rGSTT2-2. SCR23 was found to have CMAC catalytic parameters that were essentially indistinguishable from those of SCR9. However, 60% of the C-terminal H site of SCR23 is derived from hGSTT1-1 compared with only 21% for SCR9.

To evaluate the range of substrate selectivities exhibited by the purified enzymes, their specific activities were determined with CDNB, PEITC, and ethacrynic acid (Fig. 4). Clones RH-A5 and RH-F2 have elevated activity toward CDNB compared with the inactive hGSTT1-1, but their specific activities with this substrate (7 and 5 μmol·min⁻¹·mg⁻¹, respectively) are only moderate when compared with the other isolated GSTs. Additionally, these two ITCHY clones are effectively inactive with PEITC and ethacrynic acid. Exchange of the N-terminal domains in clones HR-25 and HR-216 resulted in selectivity profiles similar to the rGSTT2-2 enzyme. However, it is worth noting that fusion of the human G site to the rat H site in HR-216 led to a 300% and 280% increase in activity with PEITC and CDNB, respectively. These findings support the hypothesis that the human N-terminal domain is indirectly enhancing rat-like catalytic efficiency.

Fig. 4.

GSTT activities. (A) kcat/Km values with CMAC. (B-D) Specific activities with alternate electrophiles CDNB (B), PEITC (C), and ethacrynic acid (D).

Unlike the ITCHY variants, the multiple C/O enzymes SCR9 and SCR23 exhibit unique selectivity profiles relative to one another (Fig. 4). SCR9, but not SCR23, possesses high CDNB activity (30 μmol·min⁻¹·mg⁻¹). Both SCR9 and SCR23 have PEITC activities elevated relative to those of the two parental enzymes. Interestingly, SCR23 functions as a catalyst for the conjugation of ethacrynic acid to GSH (0.30 μmol·min⁻¹·mg⁻¹), an activity possessed by neither parent (rGSTT2-2 and hGSTT1-1 < 0.03 μmol·min⁻¹·mg⁻¹).

Discussion

The substrate selectivities of GSTs vary dramatically both between enzyme classes and within members of a given class (1). The hGSTT1-1 and rGSTT2-2 enzymes are examples of the widely varied electrophile preferences that can be found within structurally related members of a single GST class (2). Although the natural substrates for which these enzymes have evolved are currently unknown, their distinct selectivities with synthetic compounds can be traced to the divergence in the amino acid compositions of their H sites. A particularly interesting question is the extent to which the C-terminal α-helix, which covers the active site and is characteristic of the GST theta class, influences substrate selectivity.

It has previously been found that homology-dependent recombination of the two parental GSTs based on family shuffling does not yield active enzymes having C/Os within the C-terminal domain that is responsible for electrophile selectivity. All active clones isolated from such libraries were found to contain single C/Os within the first half of the N-terminal domain (5).

Screening of an r-h ITCHY library containing single homology-independent recombination events for activity with the rGSTT2-2-specific CMAC substrate also yielded chimeras in which the vast majority of the H sites, with the exception of the C-terminal helix, had originated from the rat enzyme (4). This swapping of the C-terminal helix led to enzymes in which >90% of the sequence had been derived from rGSTT2-2. Although these enzymes were functional, they displayed considerably lower kcat/Km values for CMAC relative to the parental rat protein (Table 1). Nonetheless, they were found to be significantly better catalysts than the human parental enzyme with respect to CMAC conjugation, indicating that the specificity determinants for this substrate do not reside exclusively in the C-terminal α-helix. Screening of the complementary h-r ITCHY library also resulted in clones having rGSTT2-2 H sites. Specifically, HR-25 and HR-216 possessed full-length human N-terminal G sites fused to intact rat C-terminal H sites by means of junctions in the connecting flexible loop. Relative to rGSTT2-2, HR-216 exhibited a 3-fold higher kcat for CMAC as well as improved specific activities for both CDNB and PEITC (Table 1 and Fig. 4). A likely yet counterintuitive explanation for the enhanced activity of HR-216 is that replacement of the rGSTT2-2 G site with its hGSTT1-1 counterpart resulted in optimal interactions between the two domains.

Because the ITCHY methodology generates C/Os in a homology-independent manner, these results revealed that no single recombination event within the H site is capable of creating chimeras that retain high levels of CMAC activity comparable to the rat parental enzyme. However, highly active enzymes with chimeric H sites were created by forcing the generation of multiple C/Os. Clone SCR23, in particular, shares 82.9% amino acid identity with hGSTT1-1, yet it has CMAC catalytic efficiency comparable to rGSTT2-2. Homology modeling suggested that SCR23 possessed only two rGSTT2-2 α-helices, derived from the first two structural subdomains of this parent's H site (amino acids 94-153; Fig. 3C). Importantly, not only did SCR23 reconstitute the CMAC activity of rGSTT2-2, it also led to a new specificity for ethacrynic acid that is absent in both parents. This activity is equivalent to that exhibited by hGSTT2-2, the only other known theta-class GST capable of catalyzing this reaction (30). The evolution of this activity in the absence of an associated selection pressure suggests a potential role for low-homology recombination in the development of promiscuous protein functions, a topic that has been explored in detail for random mutagenesis and conventional family shuffling (31).

The second clone from the multiple C/O library, SCR9, resembles the single C/O clones RH-A5 and RH-F2 in that it contains a short segment of hGSTT1-1 inserted at the C terminus (Fig. 3B). However, in contrast to the two ITCHY clones, the presence of hGSTT1-1 amino acid residues 1-96 complements the detrimental effect of the C-terminal helix substitution restoring rat-like catalytic activity with CMAC. SCR9 closely mimics the rGSTT2-2 selectivity profile with the alternate substrates, differing primarily in its 230% increase in specific activity with CDNB (Fig. 4).

Although SCR23 and SCR9 are equally efficient catalysts for GSH-CMAC conjugation, reactivity with the alternate substrates clearly distinguishes these multiple C/O clones from each other and the two parents. Clone SCR9 and rGSTT2-2 efficiently conjugate GSH to CDNB, whereas SCR23 is the only isolated chimera that closely mimics the exceptionally low hGSTT1-1 activity with this substrate. Structural determination of SCR9 and SCR23 is ongoing and should yield valuable insights into the determinants of substrate specificity in theta GSTs. However, the current results indicate that the rat enzyme's promiscuous nature is not due to a “loose” active site that indiscriminately accommodates aromatic compounds with reactive benzyl halides; rather, it derives from the presence of distinct determinants for various halogenated target molecules. Sequence alignment of SCR9, SCR23, and the two parents revealed that SCR9 and rGSTT2-2 share the sequence SVLGQAAKKTLP at positions 215-226, whereas hGSTT1-1 and SCR23 exhibit the amino acids KAK followed by a nine-residue deletion in this section. These results suggest that the 12-aa stretch found in SCR9 and the rat enzyme may be critical for CDNB activity but unrelated to CMAC selectivity.

In SCR23, a large portion of the sequence (82.9%) is derived from the human protein, yet the enzyme displays the full CMAC activity of the nonhuman parent. In other words, low-homology recombination and screening effectively led to the creation of a humanized enzyme. Deimmunization of therapeutic proteins by computational methods aimed at identifying class II MHC agretopes is the basis for several commercial technologies (32). Using the class II MHC peptide-binding prediction program ProPred (33), the top 153 potential T cell epitopes (3 each for 51 MHC alleles) were identified for rGSTT2-2, hGSTT1-1, and SCR23. The rat and human proteins differed in 139 of these 153 candidates, illustrating the substantially different immunogenic potential of the two parents. SCR23 possesses 120 peptides that are identical to the human targets for the same allele and shares only 10 unique candidates with rGSTT2-2. Thus, it would seem that the immunogenic potential of SCR23 is drastically reduced relative to the rat enzyme with which it has comparable CMAC activity. These results indicate that homology-independent and low-homology recombination methods, coupled with high-throughput screening for catalytic activity, could represent an attractive method for reducing the immunogenicity of enzymes for therapeutic purposes.
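The bookkeeping behind the epitope comparison can be sketched as a per-allele set comparison. The counts (51 alleles, 3 candidates each, 120 shared with human, 10 shared only with rat) come from the text; the allele labels and peptide strings in the toy example are placeholders invented for illustration and are not ProPred output, and the function name is hypothetical.

```python
ALLELES = 51
TOP_PER_ALLELE = 3
TOTAL_CANDIDATES = ALLELES * TOP_PER_ALLELE  # 153, as stated in the text


def shared_candidates(query, reference):
    """Count peptides in `query` that also appear in `reference` for the same allele."""
    shared = 0
    for allele, peptides in query.items():
        ref = set(reference.get(allele, ()))
        shared += sum(1 for peptide in peptides if peptide in ref)
    return shared


# Placeholder predictions for two made-up alleles (NOT real data).
toy_human = {
    "allele_A": ["AAAAAAAAA", "CCCCCCCCC", "DDDDDDDDD"],
    "allele_B": ["EEEEEEEEE", "FFFFFFFFF", "GGGGGGGGG"],
}
toy_chimera = {
    "allele_A": ["AAAAAAAAA", "CCCCCCCCC", "HHHHHHHHH"],
    "allele_B": ["EEEEEEEEE", "KKKKKKKKK", "GGGGGGGGG"],
}
print(shared_candidates(toy_chimera, toy_human))

# Fractions implied by the counts reported in the text for SCR23.
print(f"human-identical: {120 / TOTAL_CANDIDATES:.1%}")
print(f"rat-only shared: {10 / TOTAL_CANDIDATES:.1%}")
```

Applied to the reported counts, about 78% of the SCR23 candidates match the human protein for the same allele and under 7% are shared only with the rat protein.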

Acknowledgments

We thank Navin Varadarajan for many helpful discussions and comments on the manuscript. The hGSTT1 and rGSTT2 genes were a kind gift from Prof. Bengt Mannervik (Uppsala University, Uppsala). This work was supported by National Institutes of Health Grants R01 GM065551 and R01 GM073089.

Abbreviations: GSH, glutathione; GSTT, GSTθ; rGSTT2-2, rat GSTT2-2; hGSTT1-1, human GSTT1-1; CMAC, 7-amino-4-chloromethyl coumarin; ITCHY, incremental truncation for the creation of hybrid enzymes; C/O, crossover; r-h, rat-human; h-r, human-rat; CDNB, 1-chloro-2,4-dinitrobenzene; PEITC, phenethyl isothiocyanate.
