Abstract
The helminth parasite Fasciola hepatica secretes cysteine proteases to facilitate tissue invasion, migration, and development within the mammalian host. The major proteases cathepsin L1 (FheCL1) and cathepsin L2 (FheCL2) were recombinantly produced and biochemically characterized. By using site-directed mutagenesis, we show that residues at positions 67 and 205, which lie within the S2 pocket of the active site, are critical in determining the substrate and inhibitor specificity. FheCL1 exhibits a broader specificity and a higher substrate turnover rate compared with FheCL2. However, FheCL2 can efficiently cleave substrates with a Pro in the P2 position and degrade collagen within the triple helices at physiological pH, an activity that among cysteine proteases has only been reported for human cathepsin K. The 1.4-Å three-dimensional structure of FheCL1 was determined by x-ray crystallography, and the three-dimensional structure of FheCL2 was constructed via homology-based modeling. Analysis and comparison of these structures and our biochemical data with those of human cathepsins L and K provided an interpretation of the substrate-recognition mechanisms of these major parasite proteases. Furthermore, our studies suggest that a configuration involving residue 67 and the “gatekeeper” residues 157 and 158 situated at the entrance of the active site pocket creates a topology that endows FheCL2 with its unusual collagenolytic activity. The emergence of a specialized collagenolytic function in Fasciola likely contributes to the success of this tissue-invasive parasite.
Clan CA papain-like cysteine peptidases, such as cathepsins B and L (1), are ubiquitous in helminth (worm) parasites of human and veterinary importance. These peptidases are involved in a variety of pathogen-specific functions, including penetration and migration through host tissues, catabolism of host proteins to peptides and amino acids, and modulation or suppression of host immune defenses by cleaving immunoglobulin or altering the activity of immune effector cells (2–4). The central role of Clan CA proteases in the survival of helminth parasites has positioned them as lead targets for the development of new chemotherapies and vaccines (5–7).
Fasciola hepatica is a helminth parasite that causes liver fluke disease (fasciolosis) in cattle and sheep worldwide. It is most prevalent in Europe with infection rates increasing because of the emergence of drug-resistant parasites and possibly as a result of climate change (8, 9). Human fasciolosis has recently emerged as a major zoonosis in rural areas of South America (particularly Bolivia, Peru, and Ecuador), Egypt, and Iran where organized farm management practices are poor. It is estimated that worldwide over 2.4 million people are infected with F. hepatica and about 180 million are at risk of infection (10, 11).
Secretion of cysteine proteases is associated with the virulence of F. hepatica and its capacity to infect a wide range of mammalian hosts (4, 6, 12–14). Cathepsin L1 (FheCL1) and cathepsin L2 (FheCL2) are the two major peptidases secreted by the infective larvae that traverse the host intestinal wall, by the migratory stages that penetrate the liver tissues, and by the mature adult parasites that reside in the bile ducts and feed on host blood, which they ingest through the punctured bile duct wall (4, 6, 15). Experiments using purified native enzymes demonstrated that FheCL1 and FheCL2 efficiently degrade host hemoglobin, immunoglobulin, and interstitial matrix proteins such as fibronectin, laminin, and native collagen (6, 16, 17). Although FheCL1 and FheCL2 exhibited similar substrate specificities, FheCL2 showed a greater affinity for peptides containing Pro residues in the P2 position (18–20). We proposed that by producing proteases with overlapping specificity the parasite could digest these host macromolecules more efficiently, and therefore more effectively penetrate host organs (6, 16).
The F. hepatica cathepsin Ls belong to a lineage that eventually gave rise to the mammalian cathepsin Ls from which the mammalian cathepsin Ks diverged (2). Mammalian cathepsin L is ubiquitously expressed in tissues and performs a housekeeping function in protein turnover, but it also plays a part in more specialized functions such as antigen processing and presentation, hormone and protease activation, and extracellular matrix turnover (21). Cathepsin K, on the other hand, exhibits a more restricted expression profile being predominantly found in osteoclasts but also in multinucleated giant cells, macrophages, and lung epithelial cells (22, 23). A specific role for cathepsin K in bone resorption by osteoclasts has been related to the ability of the protease to cleave the covalently linked triple helices of native collagen, a unique property among the mammalian papain-like cysteine proteases (24). This unusual property was attributed to the presence of a tyrosine residue at position 67 within the S2 subsite of cathepsin K that interacts with proline in the P2 of substrates, including the Gly-Pro-Xaa repeat sequence (where Xaa is mainly proline or 4-trans-l-hydroxyproline) found in collagen. A parallel therefore exists between mammalian cathepsin K and the F. hepatica FheCL2 as the latter can also cleave substrates with a P2 proline and possesses a tyrosine residue at the corresponding position 67.
To understand the role of the major secreted cathepsin L proteases of F. hepatica in the virulence of the parasite and its adaptation to various hosts, it is important to elucidate their biochemical properties and relate these to structure and function. Therefore, in this study, we have characterized the substrate specificity of active recombinant forms of FheCL1 and FheCL2. These properties were further explored by preparing variants of FheCL1 in which specific substitutions were made within the S2 subsite of the active site (positions 67 and 205) to simulate those residues present in human cathepsins L and K. In addition, the 1.4-Å three-dimensional structure of a variant FheCL1 zymogen, in which the active site Cys was replaced by a Gly (FheproCL1Gly25), has been determined by x-ray crystallography. For FheCL2, the three-dimensional structure has been constructed via homology-based modeling. Analysis and comparison of these major parasite proteases with the human cathepsins L and K provide a structural interpretation of the substrate-recognition mechanisms.
EXPERIMENTAL PROCEDURES
Materials—Z-Phe-Arg-NHMec, Z-Leu-Arg-NHMec, Z-Pro-Arg-NHMec, Z-Val-Pro-Arg-NHMec, Z-Gly-Pro-Arg-NHMec, Z-Ala-Gly-Pro-Arg-NHMec, Z-Phe-Arg-NHMec, Z-Gly-Pro-Lys-NHMec, and Z-Phe-Ala-CHN2 were obtained from Bachem (St. Helens, UK). Z-Leu-Arg-NHMec was purchased from Peptide Institute Inc. (Japan). E-64, DTT, and EDTA were obtained from Sigma. Cathepsin K inhibitor II was purchased from BD Biosciences. Prestained molecular weight markers and the AvrII and SnaBI restriction enzymes were obtained from New England Biolabs. Primers were obtained from Sigma-Genosys. The pPIC9K vector and Pichia pastoris strain GS115 were obtained from Invitrogen. Nickel-nitrilotriacetic acid-agarose and columns were obtained from Qiagen (Crawley, UK). Collagen, calf skin, was purchased from Calbiochem. Pre-cast 4–20% gradient SDS-polyacrylamide gels were purchased from Gradipore (Australia).
Expression and Purification of Recombinant Cathepsin L Zymogens in Yeast—F. hepatica procathepsin L1 (FheCL1) and procathepsin L2 (FheCL2) were amplified by PCR from the pAAH5 Saccharomyces cerevisiae expression vector into which the full-length cDNA had been cloned previously in our laboratory (12, 25). FheCL1 variants (FheCL1 L67Y and FheCL1 L205A) were synthesized and incorporated an SnaBI restriction site at the 5′ end of the gene and an AvrII restriction site and His6 tag sequence at the 3′ end (Geneart, Regensburg, Germany). The 980-bp fragments were ligated into pCR-Script cloning vector (Stratagene), which were transformed into competent Escherichia coli for amplification. Inserts were digested from plasmid preparations with AvrII and SnaBI and inserted in-frame with the yeast α-factor at the AvrII/SnaBI site of P. pastoris expression vector pPIC9K (Invitrogen). Plasmids were linearized with SacI and then transformed into chemically competent GS115 cells (Invitrogen) as described previously (12). All inserts were sequenced to ensure congruence with original cDNAs.
P. pastoris yeast transformants were cultured in 500 ml of buffered glycerol complex medium broth, buffered to pH 8.0, in 5-liter baffled flasks at 30 °C until an A600 of 2–6 was reached (12). Cells were harvested by centrifugation at 2000 × g for 5 min, and protein expression was induced by resuspending in 100 ml of buffered minimal methanol medium broth, buffered at pH 6.0 containing 1% methanol (20). Recombinant proteins were purified from yeast medium by affinity chromatography using nickel-nitrilotriacetic acid-agarose (Qiagen) (12, 26). Purified recombinant zymogens were dialyzed against phosphate-buffered saline (PBS) and stored at −20 °C. The 37-kDa zymogens were autocatalytically activated and processed to 24.5-kDa mature enzymes by incubation for 2 h at 37 °C in 0.1 m sodium citrate buffer, pH 5.0, containing 2 mm DTT and 2.5 mm EDTA. The mixture was then dialyzed against PBS, pH 7.3. The proportion of functionally active recombinant protein in these preparations was determined by titration against E-64.
P1–P4 Specificity Using a Positional Scanning Synthetic Combinatorial Library—The substrate specificities of FheCL1, FheCL1 L67Y, and FheCL1 L205A and FheCL2 were determined using a complete diverse positional scanning synthetic combinatorial library (PS-SCL) (27). Screens were performed at 25 °C in 0.1 m sodium acetate, 0.1 m NaCl, 0.01 m DTT, 0.001 m EDTA, 0.01% Brij-35, 1% Me2SO (from the substrates), pH 5.5. Aliquots of 25 nmol in 1 μl from each of 20 sub-libraries of the P1, P2, P3, and P4 libraries were added to the wells of a 96-well Microfluor-1 U-bottom plate (Dynex Technologies). The final concentration of each compound of the 8000 compounds per well was 31.25 nm in a 100-μl final reaction volume. The assays were initiated by addition of preactivated enzyme, and the reaction was monitored with a SpectraMax Gemini fluorescence spectrometer (Molecular Devices) with excitation at 380 nm, emission at 460 nm, and cutoff at 435 nm. Screens were performed in duplicate and triplicate for wild type and mutated enzymes, respectively.
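The per-compound concentration quoted above follows from simple dilution arithmetic. The sketch below reproduces it, assuming that the 25-nmol aliquot is shared equally among the 8000 compounds of a sub-library; the function name is illustrative and not part of the original protocol.

```python
def per_compound_concentration_nM(total_nmol, n_compounds, volume_ul):
    """Concentration (nM) of each compound when total_nmol is split equally."""
    nmol_each = total_nmol / n_compounds      # nmol of each individual compound
    volume_l = volume_ul * 1e-6               # microlitres -> litres
    molar = nmol_each * 1e-9 / volume_l       # mol per litre
    return molar * 1e9                        # M -> nM


# Values taken from the screening conditions described in the text.
conc = per_compound_concentration_nM(total_nmol=25, n_compounds=8000, volume_ul=100)
print(f"{conc:.2f} nM per compound")
```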
Enzyme Assays and Kinetics with Fluorogenic Peptide Substrates—Initial rates of hydrolysis of the fluorogenic dipeptide substrates were measured by monitoring the release of the fluorogenic leaving group, NHMec, at an excitation wavelength of 380 nm and an emission wavelength of 460 nm using a Bio-Tek KC4 microfluorometer. kcat and Km values were determined using nonlinear regression analysis. Initial rates were obtained at 37 °C over a range of substrate concentrations spanning Km values (0.2–200 μm) and at fixed enzyme concentrations (0.5–5 nm). Assays were performed in PBS, pH 7.3, and 100 mm sodium acetate buffer, pH 5.5, each containing 2.5 mm DTT and 2.5 mm EDTA.
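The text does not name the regression routine used to obtain kcat and Km. As an illustration only, the following sketch fits the Michaelis-Menten equation by least squares in pure Python: for any trial Km the best Vmax has a closed form, so a one-dimensional golden-section search over Km is sufficient. The data are synthetic and noise-free, generated from the FheCL1 values reported for Z-Phe-Arg-NHMec, and the 1 nm enzyme concentration is an assumption within the stated 0.5–5 nm range.

```python
def mm_rate(s, vmax, km):
    """Michaelis-Menten initial rate."""
    return vmax * s / (km + s)


def best_vmax(km, s_vals, v_vals):
    """Closed-form least-squares Vmax for a fixed Km (model is linear in Vmax)."""
    x = [s / (km + s) for s in s_vals]
    return sum(xi * vi for xi, vi in zip(x, v_vals)) / sum(xi * xi for xi in x)


def sse(km, s_vals, v_vals):
    """Sum of squared residuals at a trial Km, using the best Vmax for that Km."""
    vmax = best_vmax(km, s_vals, v_vals)
    return sum((v - mm_rate(s, vmax, km)) ** 2 for s, v in zip(s_vals, v_vals))


def fit_michaelis_menten(s_vals, v_vals, km_lo=0.01, km_hi=1000.0, iters=200):
    """Golden-section search for the Km that minimises the residuals."""
    g = (5 ** 0.5 - 1) / 2
    a, b = km_lo, km_hi
    for _ in range(iters):
        c = b - g * (b - a)
        d = a + g * (b - a)
        if sse(c, s_vals, v_vals) < sse(d, s_vals, v_vals):
            b = d
        else:
            a = c
    km = (a + b) / 2
    return best_vmax(km, s_vals, v_vals), km


# Synthetic example (hypothetical data): 1 nM enzyme, substrate in micromolar.
enzyme_uM = 0.001
true_kcat, true_km = 24.69, 24.18
substrate_uM = [0.2, 0.5, 1, 2, 5, 10, 20, 50, 100, 200]
rates = [mm_rate(s, true_kcat * enzyme_uM, true_km) for s in substrate_uM]

vmax_fit, km_fit = fit_michaelis_menten(substrate_uM, rates)
kcat_fit = vmax_fit / enzyme_uM
print(f"kcat = {kcat_fit:.2f} s^-1, Km = {km_fit:.2f} uM")
```

With real, noisy data a weighted fit and parameter standard errors would normally be reported as well; this sketch returns point estimates only.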
Rate constants for the inactivation of enzyme by Z-Phe-Ala-CHN2 and cathepsin K inhibitor II were determined from progress curves in the presence of substrate (28, 29). When substrate and inhibitor bind to enzyme in rapid equilibrium and the substrate concentration does not change significantly during the course of the assay, the concentration of product, [P], at time t after the start of the reaction is given by Equation 1,
$$[P] = \frac{v_0}{k_{obs}}\left(1 - e^{-k_{obs}t}\right) + A_0 \qquad \text{(Eq. 1)}$$
where v0 is the initial rate of reaction; kobs is the rate of inactivation, and A0 is the background fluorescence. kobs is related to the inhibitor concentration by Equation 2,
$$k_{obs} = \frac{k_{inact}[I]}{K_i\left(1 + [S]/K_m\right) + [I]} \qquad \text{(Eq. 2)}$$
When [I] ≪ Ki, plots of kobs versus [I] were linear, with a slope equal to the apparent second-order rate constant kobs/[I]. This value was then corrected for substrate concentration and the Michaelis constant to determine the true second-order rate constant kinact/Ki.
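A minimal sketch of the progress-curve model and of the correction step is given here. It assumes the standard form for time-dependent inactivation, in which product rises exponentially to a plateau, and a competitive correction factor of (1 + [S]/Km); the function names and the numerical values are illustrative and are not taken from the study.

```python
import math


def progress_curve(t, v0, k_obs, a0=0.0):
    """Product signal at time t for an enzyme undergoing first-order inactivation."""
    return (v0 / k_obs) * (1.0 - math.exp(-k_obs * t)) + a0


def true_second_order_constant(kobs_over_i, s, km):
    """Correct an apparent kobs/[I] for competition by substrate (assumed form)."""
    return kobs_over_i * (1.0 + s / km)


# Hypothetical numbers: with [S] = Km the correction factor is exactly 2.
apparent = 10_000.0                      # M^-1 s^-1, illustrative only
corrected = true_second_order_constant(apparent, s=10.0, km=10.0)
print(corrected)
```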
The initial rate v0 is related to inhibitor concentration by Equation 3,
$$v_0 = \frac{V_{max}[S]}{K_m\left(1 + [I]/K_i\right) + [S]} \qquad \text{(Eq. 3)}$$
Because the inactivation was carried out with [S] = Km, Equation 3 reduces to Equation 4,
$$v_0 = \frac{V_{max}}{2 + [I]/K_i} \qquad \text{(Eq. 4)}$$
An apparent inhibition constant Ki(app) for the formation of the initial reversible enzyme-inhibitor complex prior to inactivation was determined by plotting v0 against [I] and fitting to Equation 4.
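As an illustration of how Ki(app) can be extracted from a plot of v0 against [I], the sketch below assumes that Equation 4 takes the competitive form v0 = Vmax/(2 + [I]/Ki), which is what the standard competitive rate law gives when [S] = Km. It uses a linearized stand-in for the fit rather than the direct nonlinear fit described in the text: 1/v0 is a straight line in [I] with intercept 2/Vmax and slope 1/(Vmax·Ki). The data are synthetic and the function names are hypothetical.

```python
def v0_at_s_equals_km(i, vmax, ki):
    """Initial rate with [S] = Km under the assumed competitive model."""
    return vmax / (2.0 + i / ki)


def fit_ki_app(i_vals, v0_vals):
    """Least-squares line through ([I], 1/v0); returns (Vmax, Ki(app))."""
    y = [1.0 / v for v in v0_vals]
    n = len(i_vals)
    mean_x = sum(i_vals) / n
    mean_y = sum(y) / n
    sxx = sum((x - mean_x) ** 2 for x in i_vals)
    sxy = sum((x - mean_x) * (yy - mean_y) for x, yy in zip(i_vals, y))
    slope = sxy / sxx                       # equals 1 / (Vmax * Ki)
    intercept = mean_y - slope * mean_x     # equals 2 / Vmax
    return 2.0 / intercept, intercept / (2.0 * slope)


# Synthetic, noise-free data (inhibitor in nM; rate in arbitrary units).
inhibitor_nM = [0.0, 25.0, 50.0, 100.0, 200.0, 400.0]
rates = [v0_at_s_equals_km(i, vmax=1.0, ki=93.6) for i in inhibitor_nM]

vmax_fit, ki_fit = fit_ki_app(inhibitor_nM, rates)
print(f"Vmax = {vmax_fit:.3f}, Ki(app) = {ki_fit:.1f} nM")
```

Taking reciprocals distorts the error structure of noisy data, so a direct nonlinear fit to the rate equation is generally preferable in practice.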
Collagen Digestion—Calf skin collagen type 1 was solubilized in 0.2 m acetic acid at a concentration of 2 mg/ml and dialyzed for 2 days against 0.1 m sodium acetate, pH 4.0, 0.1 m sodium acetate, pH 5.5, or PBS, pH 7.3. Reactions contained 10 μg of dialyzed collagen type 1, 1 mm DTT, and 2 mm EDTA and 5.47 μm activated peptidase in a final volume of 100 μl of one of the above buffers. Reactions were performed at 28 °C for 3 and 20 h or at 37 °C for 30 min. All reactions were stopped by the addition of 10 μm E-64. Collagen digests were analyzed by 4–20% gradient SDS-PAGE under reducing conditions and stained with Coomassie Brilliant Blue R-250.
Production of Inactive Variant FheproCL1 Gly25—For the purpose of obtaining a high resolution three-dimensional structure of FheCL1, an inactive enzyme was produced by replacing the active site Cys residue at position 25 in the mature domain by a Gly (12, 26). This FheproCL1 Gly25 enzyme migrated as a single protein of 37 kDa on reducing 12% SDS-PAGE, which represents the full zymogen containing a prosegment and mature enzyme domain (data not shown).
Data Collection, Structure Solution, and Crystallographic Refinement of FheproCL1 Gly25—Initial crystallization screening experiments were performed at the Hauptman-Woodward Institute high throughput crystallization laboratory. A total of 1536 conditions were tested using a nanoscale microbatch-under-oil method, resulting in several preliminary hits that suggested a route to diffraction-quality crystals (30). Ultimately, high quality crystals were grown in-house via vapor diffusion in sitting drops. One μl of 10 mg/ml FheproCL1 Gly25 enzyme was mixed with 1 μl of the precipitating agent, 0.2 m sodium thiocyanate in 20% polyethylene glycol 3350, and allowed to equilibrate at 23 °C over a 100-μl reservoir of precipitating agent. Crystalline plates formed within 2 days; however, full-size growth to plates greater than 75 μm in thickness took nearly 2 months.
Diffraction data were collected at the Advanced Light Source, beam line 8.3.1, using monochromatic (Si-111) radiation of 1.11588 Å (31). An ADSC Quantum 210 2 × 2 CCD array detector was used with low temperature conditions of 100 K at the crystal position. Crystals of the single mutant protein were flash-cooled in liquid nitrogen after being soaked for ∼1.5 min in a cryoprotectant solution of crystal growth solution plus 50% 2-methyl-2,4-pentane diol. High and low resolution data-sets were collected from the same crystal. Data processing was completed with MOSFLM (32) and SCALA. The structure was solved via molecular replacement using the MOLREP program of the CCP4 suite (33) with a polyserine search model derived from the 1.8 Å structure of 1CS8 (human procathepsin L). The topmost solution had an R-factor value of 0.535 and correlation coefficient of 0.288, each several σ levels above the next best solution, which had corresponding statistics of 0.604 and 0.083, respectively. One unique solution was found with one molecule in the asymmetric unit and a starting Rfactor of 0.526. The initial molecular replacement solution was improved using ARP/wARP as implemented in the CCP4 program suite (34) resulting in a model that was better than 85% complete. Iterative rounds of visualization and manual model building and refinement were completed with QUANTA (Accelrys, San Diego) and Refmac5 with anisotropic atomic displacement parameters (35), respectively. Water molecules were added automatically using ARPwaters in CCP4 (36) and were manually verified. In the final stages of refinement, XPLEO (37) was used to improve the fit of two areas of ambiguous density in the structure. Final visualization and manual adjustments to the structure as well as final assessment of water molecules were completed with COOT (38). 
Crystallographic parameters and statistics are summarized in Table 1, and final atomic coordinates have been deposited with the Protein Data Bank, accession ID 2O6X (RCSB040763).
TABLE 1.
Crystallographic parameters: data collection and refinement statistics
Values given in parentheses under Data Collection are for the highest resolution bin (1.48–1.40 Å).
| Data Collection | |
|---|---|
| Space group | P2₁2₁2 |
| Unit cell parameters | |
| a | 57.17 Å |
| b | 105.78 Å |
| c | 49.11 Å |
| Wavelength | 1.1159 Å |
| Temperature | 100 K |
| Resolution | 1.4 Å |
| Total no. of reflections | 333,582 (19,545) |
| Total unique reflections | 59,595 (8465) |
| Completeness | 99.7% (98.7%) |
| Redundancy | 5.6 (2.3) |
| Rmerge | 0.080 (0.549) |
| Rp.i.m. | 0.029 (0.431) |
| 〈I〉/〈σI〉 | 14.6 (1.8) |
| Refinement | |
| Resolution range (Å) | 52.93–1.40 |
| No. of reflections | 56,141 |
| Rfactor | 0.128 |
| Rfree | 0.165 |
| Free reflections | 5.05% |
| Average B factor (Å²) | |
| Protein | 15.39 |
| Water | 34.09 |
| Root mean square deviation from ideal | |
| Bond lengths | 0.020 Å |
| Bond angles | 1.79° |
| Ramachandran Plot | |
| Residues in most favored regions | 238 (90.5%) |
| Residues in additional allowed regions | 25 (9.5%) |
| Residues in generously allowed regions | 0 (0.0%) |
| Residues in disallowed regions | 0 (0.0%) |
Homology-based Molecular Modeling—A model structure of the mature domain of FheCL2 was built using Modeler (release 8, version 1), a program for protein structure modeling (39–41). The 1.8 Å structure of human procathepsin L (PDB code 1CS8), the 2.2 Å structure of human cathepsin K (PDB code 1ATK), and our 1.4 Å solved structure of FheproCL1 Gly25 were used as three-dimensional templates of related fold. Generated models were visualized and compared with COOT (38) and with PyMOL (42).
Sequence Analysis—F. hepatica cathepsin L protein sequences were aligned using Clustal X 1.81. Phylogenetic trees were generated from the alignment by the boot-strapped (1000-trial) neighbor-joining method using MEGA (43).
RESULTS
Active Site Residues Involved in Substrate Specificity of FheCL1 and FheCL2—Residues that make up the S2 pocket of FheCL1 and FheCL2 were determined using the three-dimensional x-ray crystal structure of FheCL1 and the homology-based model of FheCL2, respectively (see below). Their comparison with the structures of human cathepsin L (PDB code 1CS8) and cathepsin K (PDB code 1ATK) is shown in Table 2 (see also Fig. 1 and Fig. 7A; papain numbering is used). Most variation between papain-like cysteine proteases occurs at residues 67 and 205, and studies with human cathepsin L and cathepsin K demonstrated that the differences at residues 67 (Leu and Tyr, respectively) and 205 (Ala and Leu, respectively) underlie the striking difference in the substrate specificity of these two enzymes; for example, human cathepsin L exhibits a broad specificity and favors both aromatic and aliphatic P2 residues but will not accept proline, whereas cathepsin K prefers only aliphatic residues and most particularly proline. Indeed, the acceptance of a P2 Pro residue confers on cathepsin K its unique ability to cleave native type I and type II collagens, proteins that contain repeated Gly-Pro-X motifs (44). Like human cathepsin L, FheCL1 possesses a Leu at position 67; however, unlike cathepsin L it possesses a Leu at position 205 rather than an Ala. The Leu at position 205 is the same as in cathepsin K, and thus FheCL1 exhibits hybrid character in the S2 subsite. By contrast, FheCL2 possesses a Tyr at position 67 and a Leu at position 205 and hence is identical to cathepsin K at both sites. It has been suggested by us (2) and others (44, 45) that the accommodation of Pro in the P2 position of peptide substrates by FheCL2 may be related to the presence of Tyr67, analogous to the cathepsin K scenario.
TABLE 2.
Residues contributing to substrate binding in the S2 subsite of human cathepsin L, human cathepsin K, F. hepatica cathepsin L1 (FheCL1), F. hepatica cathepsin L1 L205A variant (FheCL1 L205A), F. hepatica cathepsin L1 L67Y variant (FheCL1 L67Y), and F. hepatica cathepsin L2 (FheCL2)
| Residue | 67 | 68 | 133 | 157 | 158 | 160 | 205 |
|---|---|---|---|---|---|---|---|
| Human cathepsin L | Leu | Met | Ala | Met | Asp | Gly | Ala |
| Human cathepsin K | Tyr | Met | Ala | Leu | Asn | Ala | Leu |
| FheCL1 | Leu | Met | Ala | Val | Asn | Ala | Leu |
| FheCL1 L205A | Leu | Met | Ala | Val | Asn | Ala | Ala |
| FheCL1 L67Y | Tyr | Met | Ala | Val | Asn | Ala | Leu |
| FheCL2 | Tyr | Met | Ala | Leu | Thr | Ala | Leu |
FIGURE 1.
Sequence alignment of the mature F. hepatica cathepsin L1 (FheCL1) and cathepsin L2 (FheCL2) with human cathepsin L (hCatL), human cathepsin K (hCatK), and papain was performed with ClustalW (EBI, EMBL). Residues within the S2 subsite of the active site involved in determining substrate specificity are indicated with arrows and numbers (see also Table 2).
FIGURE 7.
A, surface representation of the active site region of FheproCL1 Gly25. The S2 pocket of the enzyme is highlighted in pink, and key residues implicated in substrate preference, including the gatekeeper residues (Val157 and Asn158), are noted. B, representation of the active site region of FheproCL1 Gly25 and FheCL2. The S2 pockets of both enzymes are highlighted. Gatekeeper residues of FheproCL1 Gly25 are further indicated in pink stick representation and those of the modeled FheCL2 in yellow. Figures were created with PyMOL (42).
To address the relationship between the residues present at positions 67 and 205 and the substrate specificity and function of FheCL1 and FheCL2, we prepared variants of FheCL1 as shown in Table 2. The FheCL1 L67Y variant has a single amino acid change making the S2 subsite similar to FheCL2 and cathepsin K at positions 67 and 205. The FheCL1 L205A variant has a single amino acid change designed to make the S2 subsite similar to that of human cathepsin L. The wild type and variant F. hepatica cathepsin L peptidases were recombinantly expressed in the methylotrophic yeast P. pastoris, purified, and activated as described under “Experimental Procedures.” All enzymes were expressed as 37-kDa zymogens that autocatalytically processed at pH 4.5 to produce 24.5-kDa mature enzymes, which was confirmed by N-terminal sequencing (Fig. 2). Enzymatic assays showed that none of the substitutions made in the S2 subsite of the FheCL1 active site altered its pH profile for activity against the fluorogenic substrate Z-Phe-Arg-NHMec; both the wild type and variant free enzymes exhibited a Gaussian bell-shaped pH profile with an optimum for activity in the region pH 6.5 to 7.0 (pKI = 3.87 ± 0.07 and pKII = 8.14 ± 0.08).
FIGURE 2.
Activation of purified recombinant FheproCL1 and FheproCL2. The 37-kDa zymogens were autocatalytically activated and processed to 24.5-kDa mature enzymes by incubation for 2 h at 37 °C in 0.1 m sodium citrate buffer, pH 5.0, containing 2 mm DTT and 2.5 mm EDTA. Reaction samples were analyzed by 4–20% SDS-PAGE; lanes 1–4, activation reaction of FheCL1 at 0, 30, 60, and 120 min; lanes 5–8, activation reaction of FheCL2 at 0, 30, 60, and 120 min. Similar results were obtained with variant peptidases (not shown). MW, molecular mass markers.
Substrate Specificity Profiling Using a PS-SCL Reveals Unique and Distinct Activities of FheCL1 and FheCL2—Wild type FheCL1 and FheCL2 exhibited similar preferences for amino acids at P1. As expected for papain-like cysteine proteases, both enzymes had a clear preference for Arg at P1, but other residues accommodated in this position included Lys, Glu, Thr, and Met (Fig. 3, P1 panel), and these were all cleaved at relative rates similar to those observed for human cathepsin L and cathepsin K (44). Similar results were obtained for the variants FheCL1 L67Y and FheCL1 L205A, which was expected because the introduced substitutions do not affect the S1 active site pocket (not shown).
FIGURE 3.
Profiling of the P1–P4 substrate specificity of FheCL1 and FheCL2 using positional scanning synthetic combinatorial libraries. The y axis represents activity against the substrates relative to the highest activity of the library, whereas the x axis presents the amino acids as represented by the one-letter code (n = norleucine).
A P1-Arg fixed library was then used to explore P2–P4 specificities of FheCL1 and FheCL2. The enzymes show a distinct preference for hydrophobic amino acids in the P2; both favored Leu. Interestingly, however, the positional scanning method did not identify Phe as a suitable P2 residue even though our kinetic studies demonstrate that both FheCL1 and FheCL2, like other papain cysteine proteases, cleave fluorogenic substrates with a P2 Phe efficiently (see Table 3). The most striking observation was the distinct preference for Pro residues by FheCL2, particularly when compared with FheCL1 that did not accommodate this residue (Fig. 3, P2 panel). The unusual preference for a P2 Pro exhibited by FheCL2 is similar to that observed for human cathepsin K using the same methodology (44). However, whereas human cathepsin K favored equally Ile and Leu at P2 (44), both Fasciola cathepsins were more similar to human cathepsin L by preferring Leu over Ile.
TABLE 3.
Kinetic parameters for hydrolysis of peptidyl-NHMec substrates by recombinant wild type FheCL1, variants FheCL1 L205A, FheCL1 L67Y, and wild type FheCL2
| Enzyme | Substrate | kcat (s−1) | Km (μm) | kcat/Km (m−1 s−1) |
|---|---|---|---|---|
| FheCL1 | Z-FR-NMec | 24.69 ± 1.30 | 24.18 ± 3.92 | 1,021,092 |
| FheCL1 L205A | Z-FR-NMec | 29.60 ± 1.05 | 19.21 ± 3.78 | 1,540,864 |
| FheCL1 L67Y | Z-FR-NMec | 3.58 ± 0.11 | 8.16 ± 0.86 | 438,725 |
| FheCL2 | Z-FR-NMec | 1.70 ± 0.07 | 39.94 ± 5.12 | 42,564 |
| FheCL1 | Z-LR-NMec | 36.52 ± 0.63 | 4.35 ± 0.21 | 8,395,402 |
| FheCL1 L205A | Z-LR-NMec | 9.15 ± 0.24 | 2.75 ± 0.61 | 3,327,273 |
| FheCL1 L67Y | Z-LR-NMec | 1.73 ± 0.38 | 0.38 ± 0.05 | 4,552,632 |
| FheCL2 | Z-LR-NMec | 1.62 ± 0.08 | 1.39 ± 0.20 | 1,165,468 |
| FheCL1 | Z-PR-NMec | 1.03 ± 0.04 | 191.21 ± 16.90 | 5,387 |
| FheCL1 L205A | Z-PR-NMec | 0.122 ± 0.005 | 48.41 ± 4.45 | 2,479 |
| FheCL1 L67Y | Z-PR-NMec | 0.62 ± 0.04 | 136.98 ± 12.03 | 4,526 |
| FheCL2 | Z-PR-NMec | 2.64 ± 0.13 | 84.03 ± 11.28 | 31,417 |
| FheCL1 | Tos-GPR-NMec | 0.36 ± 0.03 | 10.02 ± 1.20 | 35,928 |
| FheCL1 L205A | Tos-GPR-NMec | 0.113 ± 0.003 | 20.35 ± 1.00 | 5,405 |
| FheCL1 L67Y | Tos-GPR-NMec | 0.26 ± 0.01 | 6.96 ± 1.08 | 37,069 |
| FheCL2 | Tos-GPR-NMec | 1.17 ± 0.08 | 15.33 ± 2.62 | 76,321 |
| FheCL1 | Boc-AGPR-NMec | 0.48 ± 0.03 | 10.43 ± 2.63 | 46,021 |
| FheCL1 L205A | Boc-AGPR-NMec | 0.20 ± 0.04 | 21.57 ± 8.85 | 9,179 |
| FheCL1 L67Y | Boc-AGPR-NMec | 0.93 ± 0.03 | 11.13 ± 1.37 | 83,378 |
| FheCL2 | Boc-AGPR-NMec | 2.48 ± 0.08 | 33.69 ± 0.99 | 73,731 |
The replacement of Leu for Ala at residue 205 (FheCL1 L205A) markedly altered the activity profile from wild type enzyme (Fig. 4). This variant exhibited a broader substrate specificity by accepting Phe, Trp, and Tyr at P2, residues that were not accepted by wild type FheCL1. The same residues are also accommodated by human cathepsin L (44), thus demonstrating that the replacement of Leu205 for Ala in FheCL1 generates an enzyme more similar to the human orthologue. By contrast, the FheCL1 L67Y variant did not show a significant change in the P2 preference to wild type FheCL1; in particular, this substitution did not alter the activity of the enzyme toward Pro in the P2 position (Fig. 4). This was a surprising result as we expected that the FheCL1 L67Y variant would behave similarly to FheCL2 and cathepsin K given that the residues at positions 67 and 205 were identical.
FIGURE 4.
Comparison of the P2 specificities of recombinant wild type FheCL1, variant FheCL1 L205A (L205A), variant FheCL1 L67Y (L67Y), and wild type FheCL2 using positional scanning synthetic combinatorial libraries.
As anticipated, the P3 and P4 specificities for FheCL1 and FheCL2 were similar, and like human cathepsin L and cathepsin K, the Fasciola enzymes accepted a broad range of residues in these positions. The P3–P4 specificity of FheCL1 was unaffected by the P2 substitutions present in the variant proteases (not shown).
Wild type and Variant Protease Specificities against Fluorogenic Peptide Substrates Correlate with Residues at Positions 67 and 205—To support and extend the data derived from the positional scanning libraries, and to determine substrate kinetic parameters (Km, kcat, and kcat/Km) for wild type FheCL1, the variants FheCL1 L67Y and FheCL1 L205A, and wild type FheCL2, we examined their hydrolytic activity against various fluorogenic di- and tripeptides (Table 3). FheCL1 efficiently cleaved both Z-Phe-Arg-NHMec (kcat/Km = 1,021,092 m−1 s−1) and Z-Leu-Arg-NHMec (kcat/Km = 8,395,402 m−1 s−1); the enzyme cleaved the latter substrate over eight times more efficiently, largely because its Km value for this substrate is much lower than for the former. Although the substrates Z-Pro-Arg-NHMec (kcat/Km = 5,387 m−1 s−1), Tos-Gly-Pro-Arg-NHMec (kcat/Km = 35,928 m−1 s−1), and Boc-Ala-Gly-Pro-Arg-NHMec (kcat/Km = 46,021 m−1 s−1) were cleaved relatively poorly, the data nevertheless indicate that FheCL1 can accommodate proline residues in the P2 position.
In comparison with FheCL1, FheCL2 is much less efficient at cleaving substrates with Phe and Leu in the P2 position; the kcat/Km values for Z-Phe-Arg-NHMec and Z-Leu-Arg-NHMec with this enzyme are 24- and 7-fold lower than for FheCL1, respectively. However, its ability to cleave Z-Pro-Arg-NHMec, Tos-Gly-Pro-Arg-NHMec, and Boc-Ala-Gly-Pro-Arg-NHMec is 6-, 2-, and 2-fold higher than that of FheCL1 (Table 3). The kinetic data show that the S2 subsite of FheCL2 is able to accommodate proline residues more readily than the S2 subsite of FheCL1 and is in agreement with the data obtained by PS-SCL (Figs. 3 and 4).
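The fold differences quoted above can be checked directly against the kcat and Km values in Table 3. The short sketch below recomputes kcat/Km (converting Km from micromolar to molar) and the FheCL1:FheCL2 ratios for three of the substrates; the dictionary layout and function names are illustrative only.

```python
# (kcat in s^-1, Km in micromolar), copied from Table 3.
TABLE3 = {
    ("FheCL1", "Z-Phe-Arg"): (24.69, 24.18),
    ("FheCL2", "Z-Phe-Arg"): (1.70, 39.94),
    ("FheCL1", "Z-Leu-Arg"): (36.52, 4.35),
    ("FheCL2", "Z-Leu-Arg"): (1.62, 1.39),
    ("FheCL1", "Z-Pro-Arg"): (1.03, 191.21),
    ("FheCL2", "Z-Pro-Arg"): (2.64, 84.03),
}


def efficiency(enzyme, substrate):
    """Catalytic efficiency kcat/Km in M^-1 s^-1."""
    kcat, km_um = TABLE3[(enzyme, substrate)]
    return kcat / (km_um * 1e-6)


def fold_cl1_over_cl2(substrate):
    """How many times more efficient FheCL1 is than FheCL2 on a substrate."""
    return efficiency("FheCL1", substrate) / efficiency("FheCL2", substrate)


fold_phe = fold_cl1_over_cl2("Z-Phe-Arg")
fold_leu = fold_cl1_over_cl2("Z-Leu-Arg")
fold_pro = 1.0 / fold_cl1_over_cl2("Z-Pro-Arg")   # FheCL2 relative to FheCL1

print(f"P2 Phe: FheCL1 is {fold_phe:.1f}-fold more efficient")
print(f"P2 Leu: FheCL1 is {fold_leu:.1f}-fold more efficient")
print(f"P2 Pro: FheCL2 is {fold_pro:.1f}-fold more efficient")
```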
Substitution at position 205 of FheCL1 to generate variant FheCL1 L205A had a significant impact on the substrate specificity of the enzyme by increasing its ability to cleave Z-Phe-Arg-NHMec, while reducing its effectiveness on Z-Leu-Arg-NHMec, Z-Pro-Arg-NHMec, Tos-Gly-Pro-Arg-NHMec, and Boc-Ala-Gly-Pro-Arg-NHMec (Table 3). Substitution at position 67 to give the variant FheCL1 L67Y reduced the efficiency of the enzyme for both Z-Phe-Arg-NHMec and Z-Leu-Arg-NHMec about 2-fold, which was reflected in a reduction of both kcat and Km values for each substrate. This substitution did not significantly alter the specificity of the enzyme for the substrate Z-Pro-Arg-NHMec or Tos-Gly-Pro-Arg-NHMec, although it almost doubled its efficiency on Boc-Ala-Gly-Pro-Arg-NHMec (Table 3).
Kinetic Analyses of Wild Type and Variant Proteases with Specific Inhibitors—Peptidyl diazomethyl ketones are irreversible inhibitors of cysteine proteases (46). Changes in rates of inactivation by these inhibitors have highlighted different specificities at subsites of cysteine proteases such as cathepsin L and cathepsin B (47). In this study, rates of inactivation of FheCL1, FheCL1 L205A, FheCL1 L67Y, and FheCL2 by the cathepsin inhibitor Z-Phe-Ala-CHN2 have been measured. Wild type FheCL1 and FheCL2 had second-order rate constants of 20,838 and 11,899 m−1 s−1, respectively, showing that both enzymes were rapidly inactivated by Z-Phe-Ala-CHN2 (Table 4). The 2-fold greater rate of inactivation of FheCL1 compared with FheCL2 is further evidence that FheCL1 accommodates hydrophobic P2 residues more effectively than FheCL2.
TABLE 4.
Inactivation of recombinant wild type FheCL1, variants FheCL1 L205A, FheCL1 L67Y, and wild type FheCL2 by the diazomethyl ketone inhibitor Z-Phe-Ala-CHN2
| Enzyme | kobs/[I] (M−1 s−1) | Ki(app) (nM) |
|---|---|---|
| FheCL1 | 20,838 ± 589 | 1,125 ± 213 |
| FheCL1 L205A | 492,727 ± 12,592 | 15.4 ± 3.44 |
| FheCL1 L67Y | 53,704 ± 4,331 | 93.6 ± 9.1 |
| FheCL2 | 11,899 ± 477 | 1,670.7 ± 791 |
The rate of inactivation of FheCL1 L205A was 24-fold greater than that of wild type, indicating that an Ala at residue 205 in the S2 subsite binds a P2 Phe more effectively than does a Leu, which is consistent with our substrate kinetics studies (Table 3) and data derived from our tetrameric peptide library. These data highlighted further the major impact that Ala at position 205 has on binding P2 residues, and it is interesting to note that the second-order rate constant of 492,727 M−1 s−1 (Table 4) for the inactivation of FheCL1 L205A by Z-Phe-Ala-CHN2 is similar to the value of 660,000 M−1 s−1 for the inactivation of mammalian cathepsin L by the same inhibitor (47, 48). By contrast, the FheCL1 L67Y variant had a kobs/[I] value of 53,704 M−1 s−1 (Table 4), which is only 2.5-fold higher than wild type FheCL1 and 5-fold greater than wild type FheCL2; therefore, this substitution does not have such a major influence on the binding of Phe in the S2 pocket.
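The fold differences quoted above follow directly from the kobs/[I] values in Table 4. The short sketch below recomputes them; the dictionary and function names are illustrative only, and the mean values are used without their reported errors.

```python
# Mean second-order inactivation rate constants (kobs/[I], M^-1 s^-1) for
# Z-Phe-Ala-CHN2, copied from Table 4. Reported errors are ignored here.
K_INACT_ZFA = {
    "FheCL1": 20838,
    "FheCL1 L205A": 492727,
    "FheCL1 L67Y": 53704,
    "FheCL2": 11899,
}


def fold_change(numerator: str, denominator: str) -> float:
    """Return the ratio of two Table 4 rate constants (hypothetical helper)."""
    return K_INACT_ZFA[numerator] / K_INACT_ZFA[denominator]


if __name__ == "__main__":
    comparisons = [
        ("FheCL1", "FheCL2"),        # wild type enzymes compared
        ("FheCL1 L205A", "FheCL1"),  # effect of the L205A substitution
        ("FheCL1 L67Y", "FheCL1"),   # effect of the L67Y substitution
        ("FheCL1 L67Y", "FheCL2"),
    ]
    for top, bottom in comparisons:
        print(f"{top} / {bottom}: {fold_change(top, bottom):.1f}-fold")
```

The unrounded ratios are close to, but not identical with, the rounded fold values given in the text.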
The inhibitor known as cathepsin K Inhibitor II (Z-LNHNH-CONHNHLF-Boc, CKII) is a potent time-dependent inhibitor of human cathepsin K; its selectivity for this enzyme is largely because of the effectiveness by which leucine occupies the S2 subsite (49). FheCL1 and FheCL2 were both potently inhibited by cathepsin K inhibitor II with kobs/[I] values of 397,237 and 269,447 M−1 s−1, respectively, which are 20 times higher than that observed for the peptidyl diazomethyl ketone Z-Phe-Ala-CHN2 (compare Tables 4 and 5). These values are similar to the value of 590,000 M−1 s−1 reported for the inactivation of cathepsin K by Wang et al. (49). The data are consistent with the kinetic data for hydrolysis of peptidyl fluorogenic substrates as both enzymes had highest kcat/Km values for Z-Leu-Arg-NHMec.
TABLE 5.
Inhibition values for recombinant wild type FheCL1, variants FheCL1 L205A, FheCL1 L67Y, and wild type FheCL2 by cathepsin K inhibitor II
| Enzyme | kobs/[I] (M−1 s−1) | Ki(app) (nM) |
|---|---|---|
| FheCL1 | 397,237 ± 55,370 | 10.80 ± 0.48 |
| FheCL1 L205A | 58,212 ± 5,306 | 116.33 ± 24.40 |
| FheCL1 L67Y | 113,182 ± 8,220 | 13.27 ± 0.87 |
| FheCL2 | 269,447 ± 4,611 | 23.14 ± 3.59 |
| Cathepsin Kᵃ | 590,000 ± 1,200 | 6.0 |
| Cathepsin Lᵃ | 11,000 ± 560 | |

ᵃ Values are from Wang et al. (49).
The rate of inactivation of FheCL1 L205A by cathepsin K inhibitor II was 7-fold lower than wild type. The Ki(app) increased 11-fold, demonstrating that this variant cannot accommodate leucine in the S2 subsite to the same extent as wild type FheCL1. Because the rate of inactivation of human cathepsin L by cathepsin K Inhibitor II was 53-fold lower than that for human cathepsin K (Table 5), these data indicate that the FheCL1 L205A variant has S2 specificity more characteristic of human cathepsin L. The rate of inactivation of FheCL1 L67Y by cathepsin K inhibitor II was 3.5-fold lower than wild type, although the Ki(app) did not change significantly, again showing that the Tyr substitution at this position exerts a relatively lower effect on P2 binding.
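As a cross-check on the comparisons drawn from Table 5, the following sketch tabulates the rate and Ki(app) ratios for cathepsin K inhibitor II. It is only an illustration: the names are hypothetical, the mean values are taken from Table 5 without their errors, and cathepsin L is omitted from the Ki(app) dictionary because no value is listed for it.

```python
# Mean values for cathepsin K inhibitor II, copied from Table 5.
RATE = {  # kobs/[I], M^-1 s^-1
    "FheCL1": 397237,
    "FheCL1 L205A": 58212,
    "FheCL1 L67Y": 113182,
    "FheCL2": 269447,
    "cathepsin K": 590000,
    "cathepsin L": 11000,
}
KI_APP = {  # Ki(app), nM; no value is reported for cathepsin L
    "FheCL1": 10.80,
    "FheCL1 L205A": 116.33,
    "FheCL1 L67Y": 13.27,
    "FheCL2": 23.14,
    "cathepsin K": 6.0,
}


def summarize() -> dict:
    """Ratios discussed in the text, returned as unrounded floats."""
    return {
        "rate, wild type / L205A": RATE["FheCL1"] / RATE["FheCL1 L205A"],
        "Ki(app), L205A / wild type": KI_APP["FheCL1 L205A"] / KI_APP["FheCL1"],
        "rate, cathepsin K / cathepsin L": RATE["cathepsin K"] / RATE["cathepsin L"],
        "rate, wild type / L67Y": RATE["FheCL1"] / RATE["FheCL1 L67Y"],
        "Ki(app), L67Y / wild type": KI_APP["FheCL1 L67Y"] / KI_APP["FheCL1"],
    }


if __name__ == "__main__":
    for label, value in summarize().items():
        print(f"{label}: {value:.1f}")
```

Laying the ratios side by side makes the contrast explicit: the L205A substitution shifts both the rate and the Ki(app) substantially, whereas L67Y lowers the rate while leaving the Ki(app) close to the wild type value.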
Wild Type FheCL2 but Not Wild Type FheCL1 or Its Variants Cleaves Native Collagen Type 1—FheCL1 and FheCL2 degraded type 1 collagen at pH 4.0 and 5.5 in reactions held at 28 °C, but the activity of FheCL1 was much less and was limited to the β and γ chains, whereas the α1 and α2 chains remained intact. Moreover, whereas FheCL1 produced clear degradation fragments, FheCL2 degraded the collagen completely, particularly at pH 4.0, indicating that only the latter cleaves efficiently within the helical structures (Fig. 5A). Because low pH may cause some structural unraveling of the collagen, additional studies were performed at neutral pH. FheCL1 and FheCL2 both exhibit optimum activity against fluorogenic substrates in the neutral pH range. However, FheCL1 exhibited minimal activity against type 1 collagen in PBS, pH 7.3, whereas FheCL2 cleaved within all collagen chains (Fig. 5B).
FIGURE 5.
Comparison of the collagen cleaving activities of wild type FheCL1, variant FheCL1 L205A, variant FheCL1 L67Y, and wild type FheCL2. A, type I collagen was incubated with FheCL1 and FheCL2 at pH 4.0 and 5.5 and at 28 °C for 3 h, and the reaction was analyzed by 4–20% SDS-PAGE: lane 1, collagen alone; lane 2, collagen plus FheCL1, pH 4.0; lane 3, collagen plus FheCL2, pH 4.0; lane 4, collagen alone; lane 5, collagen plus FheCL1, pH 5.5; lane 6, collagen plus FheCL2, pH 5.5. B, type I collagen was incubated with active recombinant peptidase (5.47 μM) at 28 °C at neutral pH (PBS, pH 7.3) for 20 h, and the reactions were analyzed as above: lane 1, collagen alone (i.e. no peptidase added); lane 2, collagen plus FheCL1; lane 3, collagen plus FheCL1 L205A; lane 4, collagen plus FheCL1 L67Y; lane 5, collagen plus FheCL2. Molecular mass standards are indicated on the right, and collagen chains (α1, α2, β 11, β 12, and γ, see Ref. 44) are indicated on the left.
Like the wild type FheCL1 enzyme, FheCL1 L205A and FheCL1 L67Y variants cleaved collagen but were unable to cleave within the tightly wound helices. Although in all experiments FheCL1 L205A appeared to cleave collagen more efficiently than the wild type enzyme, the pattern of digested fragments was similar (Fig. 5B). Nevertheless, the greater efficiency of cleavage of collagen is consistent with this variant’s enhanced activity against fluorogenic substrates (Table 3). The inability of the FheCL1 L67Y variant to cleave within the helices of collagen is also consistent with the PS-SCL studies and substrate kinetics studies, because this enzyme did not show any increase in preference for P2 Pro compared with wild type FheCL1.
The FheproCL1Gly25 Structure in Comparison to Other Clan CA Cysteine Proteases—The experimentally determined structure of FheproCL1Gly25 is quite similar to that of previously described mammalian cathepsins. Although the x-ray crystal structure of FheCL1 presented here is that of an inactive zymogen mutant in which active site Cys25 has been mutated to glycine, the remainder of the active site machinery is intact, and the key specificity determinant, the S2 pocket, has not been altered.
The molecule, FheproCL1Gly25, is similar in tertiary structure to human cathepsin L1. Electron density is clear, connected, and easily traceable for the entirety of the main chain of the mature domain of FheproCL1Gly25. The mature domain is bi-lobed, with a substrate-binding cleft running between the two lobes of the enzyme, which is characteristic of the papain superfamily of cysteine proteases (Fig. 6A). With the exception of the mutated catalytic cysteine at position 25, the expected catalytic machinery, highlighted in pink in Fig. 6B, is present in the area of the substrate-binding cleft. The left-hand lobe of the mature domain (Fig. 6A) is predominantly helical in composition. The second domain contains several elements of β-sheet. Similar to other members of the papain superfamily of enzymes, there are three disulfide bonds in the mature domain of FheCL1Gly25. These connect Cys22–Cys63, Cys56–Cys95, and Cys153–Cys200, respectively. Superimposition of the α-carbons of the mature domain of FheproCL1 Gly25 on those of active papain (PDB code 9PAP) yields a root mean square deviation of 1.085 Å (the primary structure of FheCL1 exhibits 32.8% identity and 62% similarity to papain). Superimposition of the main chain atoms of the mature domain of FheproCL1 Gly25 on those of the mature domain of human cathepsin L yields a root mean square deviation of 0.780 Å, whereas superimposition of the two full-length molecules (PDB code 1CJL) yields 1.115 Å (the primary structure of FheCL1 exhibits 42.8% identity and 71% similarity to human cathepsin L). These structural comparisons are indicative of the high degree of overall fold and shape similarity that exists among the family of papain-like cysteine proteases (1).
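For readers less familiar with the metric, the root mean square deviation quoted above is the square root of the mean squared distance between equivalent atoms after the two structures have been superimposed. The sketch below shows only that final step on made-up coordinates; it does not perform the superposition itself, it is not the software used in this study, and the function name and toy data are assumptions for illustration.

```python
import math


def rmsd(coords_a, coords_b):
    """Root mean square deviation (same units as the input) between two
    equally long lists of (x, y, z) tuples that are ALREADY superimposed."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared_sum = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        squared_sum += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(squared_sum / len(coords_a))


if __name__ == "__main__":
    # Toy coordinates in Å, not taken from any deposited structure.
    reference = [(0.0, 0.0, 0.0), (1.0, 2.0, 3.0), (4.0, 0.0, 1.0)]
    shifted = [(x + 1.0, y, z) for x, y, z in reference]  # every atom moved 1 Å in x
    print(rmsd(reference, reference))  # identical sets
    print(rmsd(reference, shifted))    # uniform 1 Å displacement
```

In practice the two atom sets must first be paired residue by residue and optimally rotated and translated onto each other, which is why the value depends on which atoms (α-carbons or all main chain atoms) and which segments (mature domain or full-length zymogen) are included.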
FIGURE 6.
A, bi-lobed mature FheproCL1 Gly25 is shown as a schematic. The predominantly helical domain is at left, and the predominantly sheet domain is at right. The mutated active site residue Gly25 lies in the cleft between the two domains and is indicated in red. B, structure of full-length FheproCL1 Gly25 zymogen is shown, with the mature segment surface illustrated in blue and the prosegment as a schematic. The extended C-terminal portion of the prosegment runs through the active site cleft. The catalytic machinery of the enzyme is highlighted in pink. Figures were created with PyMol (42).
The prosegment of FheproCL1 Gly25 folds in a manner very similar to human cathepsin L, as indicated by the value given for superimposition above, although it does show more divergence than is observed in the mature domain. In general, there is a globular region and an extended C-terminal portion, as illustrated in Fig. 6B, that connects the prosegment to the mature domain. As was described for human procathepsin L, the globular portion of the prosegment is fairly well structured and is comprised of distinct components of helix and β-strand (50). One notable change in the structure as compared with the human zymogen structure is as follows. In FheproCL1Gly25, a stretch of β-strand extends from 79P through 84P, which is then followed by a very short helical turn from 85P to 88P. In the human enzyme, this final helical turn is absent, and this segment as well as the remainder of the prodomain is made up of β-strand only. The final visible residues of FheproCL1Gly25, 89P through 96P are β-strand. It should be noted that the two species show the greatest structural divergence from residues 85P to 96P, with the chains carving a somewhat different path through three-dimensional space. The final four residues (97P, 98P, 99P, and 100P) of the prosegment of FheproCL1 Gly25 are not visible in experimental electron density, which suggests that they are disordered and subject to motion within the crystal. Most of the prosegment of FheproCL1 Gly25 sits adjacent to one side of the mature domain, in the region of a loop that extends from approximately residues 138–155 of the mature domain. The corresponding area of contact in the prosegment is residues 55P through 68P. The extended C-terminal tether of the prosegment that links the two domains lies across the active site cleft of the mature domain (Fig. 6B).
Significant Differences Exist in the Active Site Clefts of FheCL1 and FheCL2—The composition of the active site cleft, particularly the deep and well defined S2 pocket, is a key determinant of the substrate specificity of the papain family of cysteine proteases (44). The ability to accept or exclude particular substrate moieties is highly dependent upon the size, shape, and volume of the available pocket, as well as the presence or absence of stabilizing interactions such as charge-charge pairs, hydrogen bonding, and hydrophobic interactions. The S2 pocket in FheproCL1 Gly25 is lined with several residues that extend into the active site space. These include Leu67, Met68, Ala133, Val157, Ala160, and Leu205 (Table 2 and Fig. 7A). Leu67 and Val157 are situated at the entrance to the pocket and act as “gatekeepers.” Met68, Ala133, and Ala160 sit below them, deeper into the pocket, whereas Leu205 lines the floor of the pocket (illustrated in Fig. 7B). Sequence alignment and homology-based modeling of FheCL2 place the following residues within the S2 pocket: Tyr67, Met68, Ala133, Leu157, Ala160, and Leu205 (Table 2), all at locations corresponding to those observed in the structure of FheCL1 Gly25. The differences between the S2 pockets in these similar enzymes include the presence of the dramatically larger Tyr in the “gatekeeping” position 67 at the opening of the S2 pocket, and the somewhat larger Leu157 in the opposing position at the entrance to the pocket. Although tyrosine is much larger and bulkier than leucine, it is conformationally able to rotate somewhat freely based on the availability of an unrestrained torsion angle about the Cα–Cβ bond, and its presence at the top of the pocket does not necessarily preclude the entry of P2 substrate residues (Fig. 7).
In one of the variants of FheCL1 constructed for this study (FheCL1 L67Y), a substitution of Tyr was made for Leu67 at the opening of the pocket, which renders the entrance to the S2 pocket somewhat more similar to that found in FheCL2 and human cathepsin K (Table 2). Human cathepsin K is similar to the model constructed for FheCL2, sharing the presence of a larger Tyr residue at the entrance to the pocket. In the structure of human cathepsin K (PDB code 1ATK), the Tyr residue does not preclude access to the pocket and is positioned such that an inhibitor (E-64) is able to bind with a P2 Leu-like moiety just within the top of the S2 pocket. In the second FheCL1 variant constructed (FheCL1 L205A), a substitution of Ala was made in the Leu205 position at the base of the pocket that changed this site to be similar to that found in human cathepsin L (Table 2). As mentioned above, the overall structure of the human enzyme is very similar to FheCL1 (50).
DISCUSSION
Substrate Specificity of FheCL1 and FheCL2—A comparison of the substrate specificity between the F. hepatica cathepsin L peptidases (wild type and variants) and human cathepsin L and cathepsin K is shown in Fig. 8, and helps to summarize the findings of our substrate specificity analyses and the effect active site substitutions have on this. First, it is clear that both FheCL1 and FheCL2 are similar to cathepsin K with regard to their preference for a P2 Leu over Phe. Second, both enzymes can accommodate Pro in the P2 position, but this is more readily accepted by FheCL2 compared with FheCL1; neither enzyme, however, cleaves substrates with this residue in the P2 position as readily as human cathepsin K. Third, substituting Tyr for Leu at residue 67 (variant FheCL1 L67Y) to make the S2 subsite of FheCL1 more like that of human cathepsin K did not significantly enhance its ability to cleave substrates with Pro in the P2 position; this was confirmed using three fluorogenic substrates as shown in Table 3, and by PS-SCL as shown in Fig. 4. Finally, substitution of Leu205 with Ala (variant FheCL1 L205A) increased the relative activity of the peptidase for substrates with Phe in the P2 position, but this increase was not sufficiently dramatic as to reverse its preference for Leu over Phe as observed for human cathepsin L; thus, compared with wild type FheCL1, FheCL1 L205A is more similar to human cathepsin L but is not identical in its substrate specificity.
FIGURE 8.
Comparison of the substrate specificity of human cathepsin L (CL), human cathepsin K (CK), recombinant FheCL1, FheCL1 L205A (L205A), FheCL1 L67Y (L67Y), and FheCL2. Data shown as relative kcat/Km for the hydrolysis of the substrates Z-Phe-Arg-NHMec, Z-Leu-Arg-NHMec, and Tos-Gly-Pro-Arg-NHMec. Asterisk indicates data for human CL and CK are taken from Lecaille et al. (44).
Our results using inhibitors are consistent with our data derived from the PS-SCL and substrate specificity studies. FheCL1 accommodates the hydrophobic P2 residue of the diazomethyl ketone Z-Phe-Ala-CHN2 more effectively than FheCL2, and therefore its inhibition by this reagent was 2-fold greater. Replacement of Leu205 by Ala, however, created an S2 pocket that accepted the P2 Phe more readily, and hence the rate of inactivation of the FheCL1 L205A variant by Z-Phe-Ala-CHN2 was 24-fold greater than for the wild type enzyme. The cathepsin K inhibitor II, on the other hand, was 20 times more potent than Z-Phe-Ala-CHN2 against both FheCL1 and FheCL2 and exhibited similar kinetics to that reported for human cathepsin K by Wang et al. (49). By contrast, it was 7-fold less potent against the FheCL1 L205A variant, which is interesting because it is 53-fold less effective against human cathepsin L, which possesses an Ala at position 205, compared with cathepsin K (49). Similar to our observations for substrate binding, the replacement of Leu67 with Tyr did not have a dramatic effect on the binding of either inhibitor.
Differences in the substrate specificities of human cathepsin L and cathepsin K can be exquisitely demonstrated using collagen type 1 as a substrate. Because of its acceptance of a P2 proline, cathepsin K can completely degrade collagen by cleaving within the repeated Gly-Pro-Xaa motif in the helices of the tightly wound triple helical structure. Human cathepsin L, on the other hand, cleaves within the nonhelical telopeptide regions but does not possess intrahelical activity (24, 44). Although both FheCL1 and FheCL2 could cleave native collagen, only FheCL2 cleaved this substrate within the helical structures. Most strikingly, FheCL2 cleaved native collagen even at neutral pH, which suggests that the enzyme could perform this function in vivo to facilitate parasite tissue migration. Collectively, these results support the idea that the ability of FheCL2 to accommodate proline in the P2 position of substrates confers the enzyme with collagenase-like activity, similar to that observed for cathepsin K. Although FheCL1 exhibited low activity against fluorogenic substrates with a P2 Pro, this was insufficient to endow this enzyme with the ability to cleave the helices within native collagen. Replacement of Leu67 in FheCL1 with Tyr (FheCL1 L67Y) to make the S2 subsite of this enzyme similar to FheCL2 and cathepsin K did not enhance its ability to accept substrates with Pro residues in the P2 position, nor did it confer the enzyme with collagenase-like activity, suggesting that other S2 residue(s) besides that at position 67 are essential for this activity (see below).
The Amino Acid at Position 205 Lies at the Bottom of the S2 Pocket and Has a Major Impact on Substrate Specificity—Our PS-SCL, substrate binding, and inhibitor studies showed that the Leu at position 205 is indeed a determinant of substrate turnover and inhibitor specificity in FheCL1. The wild type enzyme is able to cleave substrates with Phe in the P2 position; however, substrates with the much smaller Leu moiety in the P2 position were cleaved more than 8-fold more rapidly. In comparison with the wild type FheCL1, replacing the Leu with an Ala (FheCL1 L205A) not only enhanced the ability of the enzyme to accept P2 Phe much more readily, but broadened the overall substrate specificity of the enzyme such that larger P2 residues such as Tyr and Trp were also accepted (see Table 3 and Fig. 4).
It has been observed in another papain family member, cruzain, from the protozoan parasite Trypanosoma cruzi, that the character (i.e. size, charge, and torsion-based flexibility) of the residue Glu205 is crucial for determining which P2 residues can be accommodated in the S2 pocket (51). Examination of structures of cruzain bound to inhibitors containing Phe at P2 shows that this residue is flexible and can rotate in or out of the pocket depending upon the substrate residue entering the pocket (51). In several inhibitor-bound structures of cruzain where the P2 residue is a Phe, Glu205 is swung out into the solvent to accommodate the size of the phenylalanine (52). In FheCL1, however, the Leu at position 205 is shorter by one carbon-carbon bond than Glu and therefore cannot rotate its C-δ1 and C-δ2 out of the pocket. Although the Leu residue of FheCL1 is shorter than cruzain’s Glu by ∼1.5 Å, it is conformationally flexible. It can therefore position itself to make the most space possible available to an incoming P2 Phe. Nonetheless, it clearly prefers the smaller Leu. The side chain of the wild type Leu, as determined in the x-ray crystal structure of FheproCL1Gly25, extends 3.98 Å from the bottom of the pocket, filling a volume of 166.7 Å³ (53) and exposing a surface area of 170 Å² (54). The observed broader specificity of FheCL1 L205A, i.e. the ability to minimally accommodate Tyr and Trp, can thus be understood by comparing the space available in the base of the S2 pocket. An Ala variant at 205 would extend only as far as the C-β of Leu, or to a distance of 1.51 Å and a corresponding volume of 88.6 Å³ and surface area of 115 Å², leaving considerable additional space for larger substrate peptide residues to be accommodated.
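The scale of the change at the base of the pocket can be made concrete by subtracting the Ala values from the Leu values given above. The sketch below does only that arithmetic; treating the difference in residue volume as a rough proxy for the space freed in the pocket is a simplifying assumption made here, and the variable and function names are illustrative.

```python
# Side-chain geometry at position 205, as quoted in the text.
LEU205 = {"extension_A": 3.98, "volume_A3": 166.7, "surface_A2": 170.0}
ALA205 = {"extension_A": 1.51, "volume_A3": 88.6, "surface_A2": 115.0}


def freed_space(wild_type: dict, variant: dict) -> dict:
    """Per-quantity difference (wild type minus variant), rounded to 2 decimals."""
    return {key: round(wild_type[key] - variant[key], 2) for key in wild_type}


if __name__ == "__main__":
    for quantity, delta in freed_space(LEU205, ALA205).items():
        print(f"{quantity}: {delta}")
```

On these numbers the L205A substitution shortens the side chain by about 2.5 Å and removes roughly 78 Å³ of residue volume and 55 Å² of surface area.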
Ability to Cleave Proline at P2 Is Influenced by Residue 67 and the Surrounding Gatekeeper Residues—By engineering a variant cathepsin K with a Leu replacing the Tyr at position 67, Lecaille et al. (44) demonstrated an important role for Tyr67 in determining the P2 Pro activity of cathepsin K and in its unique collagen cleaving activity. In this study, we found that replacing the Leu67 of FheCL1 with a Tyr to engineer the S2 pocket of the active site of this enzyme to mimic FheCL2 and human cathepsin K did not significantly alter its S2 subsite specificity and, most particularly, did not enhance the ability of the enzyme to accept a P2 Pro. The three-dimensional structure of FheCL1 was therefore analyzed and compared with cathepsin K to explain these observations and to determine what additional factors within the S2 pocket of FheCL1 may influence the acceptance of a P2 Pro residue (Fig. 7).
The S2 pockets of human cathepsin K and FheCL2 are very similar, as evidenced by our modeling data and that presented in Table 2; however, there are some noteworthy differences within a 5 Å radius of this site. First, residue 133 is an alanine in the FheCL2 enzyme, but is a serine in human cathepsin K. The impact of this difference on substrate preference may be minimal, however, because this residue is more peripheral to the outer edge of the base of the S2 pocket (Fig. 7). More significant is residue 158, which is adjacent to the gatekeeping residue 157 and sits just above the upper lip of the pocket. Based on its position, this residue, which is Asn in cathepsin K and Thr in FheCL2, appears to have some secondary influence on accessibility of the opening of the pocket. In the published structure of human cathepsin K (PDB code 1ATK), Asn158 is swung out of the way of the entrance to the pocket and does not preclude the entrance of any incoming P2 moiety. However, the Thr158 of FheCL2, which is shorter by one carbon than asparagine, cannot move completely out of the way and would possibly have either a carbon atom or an oxygen atom pointing in toward the top of the S2 opening. Based on spatial constraints, a proline would be accommodated in the S2 area of FheCL2, but this would not be as readily accepted as in human cathepsin K. Our analysis suggests, however, that acceptance of proline in this pocket is not achieved by offering a topology of easy deep penetration access but rather by providing opportunities for stabilizing interactions with the 5-membered proline ring of the substrate at the entrance to the pocket, and that such stabilization involves interactions between the aromatic ring of Tyr67 and the P2 proline. The location and positioning of Leu157 in the structure of human cathepsin K and FheCL2 suggest its availability to further stabilize the presence of a P2 proline, perhaps with constructive aliphatic interactions.
By comparison, there are greater differences in the gatekeeping positions at the top entrance to the S2 pocket of FheCL1 compared with FheCL2 and cathepsin K (Fig. 7). Residue 67 is the smaller Leu in FheCL1, and its terminal carbons, C-δ1 and C-δ2, extend only as far as the corresponding C-δ1 and C-δ2 of Tyr, which stretches its terminal oxygen nearly 3.7 Å further from the protein main chain along the edge of the pocket. On the opposing side of the pocket entrance, residue 158 of FheCL1 is Asn, as in human cathepsin K, and is swung away from the entrance to the pocket, offering unimpeded access. However, residue 157 is a Val in FheCL1 and is one carbon shorter than the Leu found in FheCL2 and human cathepsin K, and accordingly its extension into the pocket is ∼1.5 Å less; this does not allow it to extend far enough into the available space to participate in aliphatic interactions. Thus, the absence of both stabilizing Tyr and Leu residues would account for the reduced preference for P2 proline by FheCL1. On the other hand, the composition of the S2 pocket in FheCL1, being more open and accessible to deeper penetration, would more readily favor processing of longer amino acid moieties, such as Leu and Phe, as we have observed.
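The residue-by-residue comparison developed in this section can be summarized as a small lookup table. The sketch below records only the identities stated in the text (positions not mentioned for human cathepsin K are left out rather than guessed) and lists the positions at which two enzymes differ; the data structure and function name are illustrative.

```python
# S2 pocket and gatekeeper residues as stated in the text.
# Cathepsin K positions not mentioned in this section are deliberately omitted.
S2_RESIDUES = {
    "FheCL1": {67: "Leu", 68: "Met", 133: "Ala", 157: "Val",
               158: "Asn", 160: "Ala", 205: "Leu"},
    "FheCL2": {67: "Tyr", 68: "Met", 133: "Ala", 157: "Leu",
               158: "Thr", 160: "Ala", 205: "Leu"},
    "human cathepsin K": {67: "Tyr", 133: "Ser", 157: "Leu", 158: "Asn"},
}


def differences(enzyme_a: str, enzyme_b: str) -> dict:
    """Positions listed for both enzymes at which the residues differ."""
    a, b = S2_RESIDUES[enzyme_a], S2_RESIDUES[enzyme_b]
    shared = sorted(set(a) & set(b))
    return {pos: (a[pos], b[pos]) for pos in shared if a[pos] != b[pos]}


if __name__ == "__main__":
    for pair in [("FheCL1", "FheCL2"),
                 ("FheCL2", "human cathepsin K"),
                 ("FheCL1", "human cathepsin K")]:
        print(pair, differences(*pair))
```

Viewed this way, FheCL1 differs from FheCL2 at positions 67, 157, and 158, whereas FheCL2 differs from human cathepsin K only at 133 and 158 among the positions listed.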
In summary, previous studies with human cathepsins K and L have shown that residues at positions 67 and 205 are essential in dictating substrate specificity (44, 55). Much attention has been given to the importance of Tyr67 in conferring on cathepsin K the ability to accept P2 Pro residues in the corresponding S2 subsite of the enzyme and the capacity to degrade native collagens. However, this study using FheCL1 and a recent study by Lecaille et al. (56) using human cathepsin L show that mutations that replace Leu67 with Tyr67 in these enzymes are not sufficient alone to accommodate proline and thus to endow collagenolytic activity. Therefore, other residues at the opening of the active site pocket, namely the gatekeeper residues identified here that occupy sites 157 and 158, combine with Tyr to generate these specialized properties, and hence, we have set the groundwork for future mutational studies. It is important to note that glycosaminoglycans such as chondroitin sulfate are known to enhance the collagenolytic activity of human cathepsin K by binding to a site other than the active site (57). However, these do not influence the activity of FheCL2 (data not shown), which points to further intriguing differences between the parasite and mammalian enzymes.
Given that collagen is a major interstitial matrix protein that is highly resistant to proteolysis, our data showing that FheCL2 can degrade native collagen within the helical regions at physiological pH would suggest that this protease enabled Fasciola spp. to become proficient tissue-degrading pathogens. It is important to note that native collagenase-like activity is restricted to very few enzymes. These include the bacterial collagenases, matrix-metalloproteinases, and cathepsin K (24), and therefore the evolution and maintenance of such an activity in Fasciola are significant. By extension, the emergence of this enzyme group may have been essential to the adaptation of the parasites to the wide variety of mammalian species they infect (13).
Acknowledgments
We thank Dr. J. Holton for technical assistance with x-ray data collection and processing and Dr. T. Stout for assistance with structure refinement and analysis. Portions of this research were carried out at the Advanced Light Source, a national user facility supported by the Director, Office of Science, Office of Basic Energy Sciences, of the United States Department of Energy under Contract DE-AC02-05CH11231. Work at The Sandler Center was supported by National Institutes of Health Grant AI-053247.
Footnotes
The abbreviations used are: Z-Phe-Arg-NHMec, benzyloxycarbonyl-L-phenylalanyl-L-arginine-4-methylcoumarinyl-7-amide; DTT, dithiothreitol; E-64, trans-epoxysuccinyl-L-leucylamido(4-guanidino)butane; PS-SCL, positional scanning synthetic combinatorial library; Z-Phe-Ala-CHN2, benzyloxycarbonyl-L-phenylalanyl-L-alanine-diazomethyl ketone; Tos, tosyl; PBS, phosphate-buffered saline; Boc, t-butoxycarbonyl; PDB, Protein Data Bank.
This work was supported in part by The Sandler Family Supporting Foundation. The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked “advertisement” in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
The atomic coordinates and structure factors (code 2O6X) have been deposited in the Protein Data Bank, Research Collaboratory for Structural Bioinformatics, Rutgers University, New Brunswick, NJ (http://www.rcsb.org/).
REFERENCES
- 1.Rawlings ND, Morton FR, Barrett AJ. Nucleic Acids Res. 2006;34:D270–D272. doi: 10.1093/nar/gkj089. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 2.Tort J, Brindley PJ, Knox D, Wolfe KH, Dalton JP. Adv Parasitol. 1999;43:161–266. doi: 10.1016/s0065-308x(08)60243-2. [DOI] [PubMed] [Google Scholar]
- 3.Sajid M, Mckerrow JH. Mol Biochem Parasitol. 2002;120:1–21. doi: 10.1016/s0166-6851(01)00438-8. [DOI] [PubMed] [Google Scholar]
- 4.Dalton JP, Caffrey CR, Sajid M, Stack C, Donnelly S, Loukas A, Don T, McKerrow J, Halton DW, Brindley PJ. In: Parasitic Flatworms: Molecular Biology, Biochemistry, Immunology, and Physiology. Maule AG, Marks NJ, editors. CAB International; Wallingford, Oxon, UK: 2006. pp. 1–36. [Google Scholar]
- 5.Wasilewski MM, Lim KC, Philips J, McKerrow JH. Mol Biochem Parasitol. 1996;81:179–189. doi: 10.1016/0166-6851(96)02703-x. [DOI] [PubMed] [Google Scholar]
- 6.Dalton JP, O’Neill SM, Stack C, Collins P, Walsh A, Sekiya M, Doyle S, Mulcahy G, Hoyle D, Khaznadji E, Moire N, Brennan G, Mousley A, Kreshchenko N, Maule A, Donnelly S. Int Parasitol. 2003;33:1173–1181. doi: 10.1016/s0020-7519(03)00171-1. [DOI] [PubMed] [Google Scholar]
- 7.Abdulla MH, Lim KC, Sajid M, McKerrow JH, Caffrey CR. Plos Med. 2007;4:e14. doi: 10.1371/journal.pmed.0040014. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 8.Mitchell GB, Maris L, Bonniwell MA. Vet Rec. 1998;143:399. [PubMed] [Google Scholar]
- 9.Borgsteed FH, Moll L, Vellema P, Gaasenbeek CP. Vet Rec. 2005;156:350–351. doi: 10.1136/vr.156.11.350. [DOI] [PubMed] [Google Scholar]
- 10.Mas-Coma S, Bargues MD, Valero MA. Int J Parasitol. 2005;35:1255–1278. doi: 10.1016/j.ijpara.2005.07.010. [DOI] [PubMed] [Google Scholar]
- 11.MacManus DP, Dalton JD. Parasitology. 2006;133:S43–S61. doi: 10.1017/S0031182006001806. [DOI] [PubMed] [Google Scholar]
- 12.Collins PR, Stack CM, O’Neill SM, Doyle S, Ryan T, Brennan GP, Mousley A, Stewart M, Maule AG, Dalton JP, Donnelly S. J Biol Chem. 2004;279:17038–17046. doi: 10.1074/jbc.M308831200. [DOI] [PubMed] [Google Scholar]
- 13.Irving JA, Spithill TW, Pike RN, Whisstock JC, Smooker PM. J Mol Evol. 2003;57:1–15. doi: 10.1007/s00239-002-2434-x. [DOI] [PubMed] [Google Scholar]
- 14.Beckham SA, Law RH, Smooker PM, Quinsey NS, Caffrey CR, McKerrow JH, Pike RN, Spithill TW. Biol Chem. 2006;387:1053–1061. doi: 10.1515/BC.2006.130. [DOI] [PubMed] [Google Scholar]
- 15.Hanna RE, Trudgett AG. Parasite Immunol. 1983;5:409–425. doi: 10.1111/j.1365-3024.1983.tb00756.x. [DOI] [PubMed] [Google Scholar]
- 16.Berasain P, Goni F, McGonigle S, Dowd A, Dalton JP, Frangione B, Carmona C. J Parasitol. 1997;83:1–5. [PubMed] [Google Scholar]
- 17.Berasain P, Carmona C, Frangione B, Dalton JP, Goni F. Exp Parasitol. 2000;94:99–110. doi: 10.1006/expr.1999.4479. [DOI] [PubMed] [Google Scholar]
- 18.Dowd AJ, Smith AM, McGonigle S, Dalton JP. Eur J Biochem. 1994;223:91–98. doi: 10.1111/j.1432-1033.1994.tb18969.x. [DOI] [PubMed] [Google Scholar]
- 19.Dowd AJ, McGonigle S, Dalton JP. Eur J Biochem. 1995;232:241–260. doi: 10.1111/j.1432-1033.1995.tb20805.x. [DOI] [PubMed] [Google Scholar]
- 20.Dowd AJ, Tort J, Roche L, Ryan T, Dalton JP. Mol Biochem Parasitol. 1997;88:163–174. doi: 10.1016/s0166-6851(97)00090-x. [DOI] [PubMed] [Google Scholar]
- 21.Ishidoh K, Kominami E. Biol Chem. 1998;379:131–135. doi: 10.1515/bchm.1998.379.2.131. [DOI] [PubMed] [Google Scholar]
- 22.Drake FH, Dodds RA, James IE, Connor JR, Debouck C, Richardson S, Lee-Rykaczewski E, Coleman L, Rieman D, Barthlow R, Hastings G, Gowen M. J Biol Chem. 1996;271:12511–12516. doi: 10.1074/jbc.271.21.12511. [DOI] [PubMed] [Google Scholar]
- 23.Buhling F, Reisenauer A, Gerber A, Kruger S, Weber E, Brömme D, Roessner A, Ansorge S, Welte T, Rocken C. J Pathol. 2001;195:375–382. doi: 10.1002/path.959. [DOI] [PubMed] [Google Scholar]
- 24.Atley LM, Mort JS, Lalumiere M, Eyre DR. Bone (Elmsford) 2000;26:241–247. doi: 10.1016/s8756-3282(99)00270-7. [DOI] [PubMed] [Google Scholar]
- 25. Roche L, Dowd AJ, Tort J, McGonigle S, MacSweeney A, Curley GP, Ryan T, Dalton JP. Eur J Biochem. 1997;245:373–380. doi: 10.1111/j.1432-1033.1997.t01-1-00373.x.
- 26. Stack CM, Donnelly S, Lowther J, Xu W, Collins PR, Brinen LS, Dalton JP. J Biol Chem. 2007;282:16532–16543. doi: 10.1074/jbc.M611501200.
- 27. Choe Y, Leonetti F, Greenbaum DC, Lecaille F, Bogyo M, Brömme D, Ellman JA, Craik CS. J Biol Chem. 2006;281:12824–12832. doi: 10.1074/jbc.M513331200.
- 28. Morrison JF, Walsh CT. Adv Enzymol. 1988;61:201–301. doi: 10.1002/9780470123072.ch5.
- 29. Tian WX, Tsou CL. Biochemistry. 1982;21:1028–1032. doi: 10.1021/bi00534a031.
- 30. Luft JR, Collins RJ, Fehrman NA, Lauricella AM, Veatch CK, DeTitta GT. J Struct Biol. 2003;142:170–179. doi: 10.1016/s1047-8477(03)00048-0.
- 31. MacDowell AA, Celestre RS, Howells M, McKinney W, Krupnick J, Cambie D, Domning EE, Duarte RM, Kelez N, Plate DW, Cork CW, Earnest TN, Dickert J, Meigs G, Ralston C, Holton JM, Alber T, Berger JM, Agard DA, Padmore HA. J Synchrotron Radiat. 2004;11:447–455. doi: 10.1107/S0909049504024835.
- 32. Leslie AG. Acta Crystallogr Sect D Biol Crystallogr. 2006;62:48–57. doi: 10.1107/S0907444905039107.
- 33. Collaborative Computational Project, No. 4. Acta Crystallogr Sect D Biol Crystallogr. 1994;50:760–763.
- 34. Lamzin VS, Perrakis A, Wilson KS. In: Crystallography of Biological Macromolecules. Rossman FMG, Arnold E, editors. Kluwer Academic Publishing; Dordrecht, Netherlands: 2001. pp. 720–722.
- 35. Murshudov GN, Vagin A, Dodson E. Acta Crystallogr Sect D Biol Crystallogr. 1997;53:240–255. doi: 10.1107/S0907444996012255.
- 36. Perrakis A, Sixma TK, Wilson KS, Lamzin VS. Acta Crystallogr Sect D Biol Crystallogr. 1997;53:448–455. doi: 10.1107/S0907444997005696.
- 37. van den Bedem H, Lotan IJ, Latombe CL, Deacon AM. Acta Crystallogr Sect D Biol Crystallogr. 2005;61:2–13. doi: 10.1107/S0907444904025697.
- 38. Emsley P, Cowtan K. Acta Crystallogr Sect D Biol Crystallogr. 2004;60:2126–2132. doi: 10.1107/S0907444904019158.
- 39. Marti-Renom MA, Stuart A, Fiser A, Sanchez R, Melo F, Sali A. Annu Rev Biophys Biomol Struct. 2000;29:291–325. doi: 10.1146/annurev.biophys.29.1.291.
- 40. Sali A, Blundell TL. J Mol Biol. 1993;234:779–815. doi: 10.1006/jmbi.1993.1626.
- 41. Fiser A, Do RK, Sali A. Protein Sci. 2000;9:1753–1773. doi: 10.1110/ps.9.9.1753.
- 42. DeLano WL. The PyMOL Molecular Graphics System. Delano Scientific; San Carlos, CA: 2002. Version 1.0.
- 43. Kumar S, Tamura K, Jakobsen IB, Nei M. Bioinformatics (Oxf) 2001;17:1244–1245. doi: 10.1093/bioinformatics/17.12.1244.
- 44. Lecaille F, Choe Y, Brandt W, Li Z, Craik CS, Brömme D. Biochemistry. 2002;41:8447–8454. doi: 10.1021/bi025638x.
- 45. Smooker PM, Whisstock JC, Irving JA, Siyaguna S, Spithill TW, Pike RN. Protein Sci. 2000;9:2567–2572. doi: 10.1110/ps.9.12.2567.
- 46. Green GD, Shaw E. J Biol Chem. 1981;256:1923–1928.
- 47. Kirschke H, Shaw E. Biochem Biophys Res Commun. 1981;101:454–458. doi: 10.1016/0006-291x(81)91281-x.
- 48. Kirschke H, Wikstrom P, Shaw E. FEBS Lett. 1988;228:128–130. doi: 10.1016/0014-5793(88)80600-8.
- 49. Wang D, Pechar M, Li W, Kopečková P, Brömme D, Kopeček J. Biochemistry. 2002;41:8849–8859. doi: 10.1021/bi0257080.
- 50. Coulombe R, Grochulski P, Sivaraman J, Ménard R, Mort JS, Cygler M. EMBO J. 1996;15:5492–5503.
- 51. Gillmor SA, Craik CS, Fletterick RJ. Protein Sci. 1997;6:1603–1611. doi: 10.1002/pro.5560060801.
- 52. Brinen LS, Hansell E, Cheng J, Roush WR, McKerrow JH, Fletterick RJ. Structure (Lond) 2000;8:831–840. doi: 10.1016/s0969-2126(00)00173-8.
- 53. Zamyatin H. Prog Biophys Mol Biol. 1972;24:107–123. doi: 10.1016/0079-6107(72)90005-3.
- 54. Chothia C. Nature. 1974;248:338–339. doi: 10.1038/248338a0.
- 55. Brömme D, Bonneau PR, Lachance P, Storer AC. J Biol Chem. 1994;269:30238–30242.
- 56. Lecaille F, Chowdhury S, Purisima E, Brömme D, Lalmanach G. Protein Sci. 2007;16:662–670. doi: 10.1110/ps.062666607.
- 57. Li Z, Yasuda Y, Li W, Bogyo M, Katz N, Gordon RE, Fields GB, Brömme D. J Biol Chem. 2004;279:5470–5479. doi: 10.1074/jbc.M310349200.








