Abstract
Small-insert metagenomic libraries from four samples were constructed by a topoisomerase-based and a T4 DNA ligase-based approach. Direct comparison of both approaches revealed that application of the topoisomerase-based method resulted in a higher number of insert-containing clones per μg of environmental DNA used for cloning and a larger average insert size. Subsequently, the constructed libraries were partially screened for the presence of genes conferring proteolytic activity. The function-driven screen was based on the ability of the library-containing Escherichia coli clones to form halos on skim milk-containing agar plates. The screening of 80,000 E. coli clones yielded four positive clones. Two of the plasmids (pTW2 and pTW3) recovered from positive clones conferred strong proteolytic activity and were studied further. Analysis of the entire insert sequences of pTW2 (28,113 bp) and pTW3 (19,956 bp) suggested that the DNA fragments were derived from members of the genus Xanthomonas. Each of the plasmids harbored one gene (2,589 bp) encoding a metalloprotease (mprA, pTW2; mprB, pTW3). Sequence and biochemical analyses revealed that MprA and MprB are similar extracellular proteases belonging to the M4 family of metallopeptidases (thermolysin-like family). Both enzymes possessed a unique modular structure and consisted of four regions: the signal sequence, the N-terminal proregion, the protease region, and the C-terminal extension. The architecture of the latter region, which was characterized by the presence of two prepeptidase C-terminal domains and one proprotein convertase P domain, is novel for bacterial metalloproteases. Studies with derivatives of MprA and MprB revealed that the C-terminal extension is not essential for protease activity. The optimum pH and temperature of both proteases were 8.0 and 65°C, respectively, when casein was used as substrate.
Proteolytic enzymes catalyze the hydrolytic cleavage of peptide bonds. These enzymes are present in all living organisms and are essential for cell growth and differentiation. Microorganisms produce a variety of intracellular and/or extracellular proteases. Intracellular microbial proteases are highly specific and are involved in several cellular and metabolic processes such as activation of inactive precursors, maintenance of the cellular protein pool, and sporulation. Extracellular proteases degrade proteins in cell-free environments. The resulting hydrolytic products (small peptides and amino acids) can be transported into the cells and utilized as carbon or nitrogen sources (13, 39). Extracellular proteases in particular are of industrial importance and are used as cleaning agents and as food and feed additives.
Proteases can be divided into various groups with respect to the functional group present at the active site and the pH profile of the activity. Microbial alkaline proteases, which are defined as being active in a neutral to alkaline pH range (13), either possess a serine center (serine proteases) or are of the metal type (metalloproteases). Extracellular serine proteases are one of the most important groups of industrial enzymes (13, 23). In addition, many extracellular metalloproteases play an important role in pathogenesis (36). Both types of proteases have been cloned and sequenced from many different individual microorganisms, e.g., Bacillus subtilis (56), Enterobacter sakazakii (22), Listeria monocytogenes (4), and Alteromonas sp. (33, 34).
In the present study, we sought to isolate proteases by a metagenomic approach. Functional metagenomics, comprising the isolation of DNA from environmental samples without prior enrichment of individual microorganisms, the construction of libraries from the recovered DNA, and function-driven screening of the generated libraries, has led to the identification and characterization of a variety of novel enzymes (7, 14, 43), e.g., amylases (44, 58), oxidoreductases (21), lipolytic enzymes (9, 16, 44), monooxygenases (53), and proteins conferring nickel resistance (32). Despite the wealth of different biotechnologically important enzyme activities derived from metagenomes, little information on the characteristics of metagenome-derived proteases is available. One metagenome-derived fibrinolytic metalloprotease has been recovered (26). In addition, one serine protease has been cited in reviews (13, 29). In other metagenome projects, activity-based screening was unsuccessful (19, 44), or the proteolytic clones recovered were not characterized (47).
We report here on the identification and characterization of two metagenome-derived metalloproteases. We constructed metagenomic DNA libraries from environmental samples by a T4 DNA ligase-based and a topoisomerase-based cloning method. The suitability of both approaches for the construction of small-insert metagenomic libraries was compared. The resulting library-containing Escherichia coli clones were screened for proteolytic activity. The plasmids were recovered from the positive E. coli strains, and the insert sequences of two plasmids (pTW2 and pTW3), which conferred strong proteolytic activity, were sequenced. The targeted protease-encoding genes and the corresponding gene products were characterized.
MATERIALS AND METHODS
Bacterial strains and plasmids.
The plasmids used in the present study are shown in Table 1. The E. coli strains TOP10 and BL21(DE3) (Invitrogen, Karlsruhe, Germany) were used as hosts for the cloning and for production of the proteases, respectively.
TABLE 1.
Plasmids examined in this study
| Plasmid | Relevant characteristics [a] | Source |
|---|---|---|
| pBluescript SKII+ (pSKII+) | Apr, lac promoter, pMB1 replicon | Stratagene |
| pCR2.1-TOPO | Apr Kanr, lac promoter, pMB1 replicon | Invitrogen |
| pET101/D | Apr, T7 promoter, pMB1 replicon | Invitrogen |
| pTW2 | pSKII+: 28,113-bp fragment of cloned metagenomic DNA | This study |
| pTW3 | pSKII+: 19,956-bp fragment of cloned metagenomic DNA | This study |
| pMPRA | pET101/D: 2,586-bp fragment containing the entire mprA gene under the control of the T7 promoter | This study |
| pMPRA.1 | pET101/D: 1,635-bp fragment containing bases 1 to 1635 of the mprA gene under the control of the T7 promoter | This study |
| pMPRB | pET101/D: 2,586-bp fragment containing the entire mprB gene under the control of the T7 promoter | This study |
| pMPRB.1 | pET101/D: 1,638-bp fragment containing bases 1 to 1638 of the mprB gene under the control of the T7 promoter | This study |
| pMPRB.2 | pET101/D: 1,932-bp fragment containing bases 655 to 2586 of the mprB gene under the control of the T7 promoter | This study |
| pMPRB.3 | pET101/D: 984-bp fragment containing bases 655 to 1638 of the mprB gene under the control of the T7 promoter | This study |
| pMPRB.4 | pET101/D: 2,505-bp fragment containing bases 82 to 2586 of the mprB gene under the control of the T7 promoter | This study |
[a] Apr, ampicillin resistance; Kanr, kanamycin resistance.
Media and growth conditions.
E. coli was routinely grown in LB medium at 30°C (2). For activity-based screening of metagenomic libraries, recombinant E. coli strains were grown under aerobic conditions in LB medium, which was supplemented with skim milk (2% [wt/vol]) and solidified with agar (15 g/liter). Colonies showing clear zones (halos) against the creamy background were regarded as protease producing. To determine the substrate specificity of the protease-producing clones, skim milk was replaced by 0.3% (wt/vol) azoalbumin or azocasein. The formation of halos around the colonies was indicative of proteolytic cleavage of these substrates.
For the production of the metalloproteases and derivatives of these proteins, recombinant E. coli strains were grown under aerobic conditions at 37°C in basal medium containing the following per liter: Na2HPO4, 6.0 g; KH2PO4, 3.0 g; NH4Cl, 3.0 g; MgSO4·7H2O, 0.05 g; CaCl2·12H2O, 0.02 g; yeast extract, 0.2 g; and trace element solution SL4 (37), 1 ml (pH 7.4). The medium was supplemented with 20 mM glucose and, to increase the yield of secreted recombinant proteins, 0.3 M L-arginine. For the induction of heterologous gene expression, 0.5 mM (final concentration) IPTG (isopropyl-β-D-thiogalactopyranoside) was added to cultures with an A600 of ∼0.8. Subsequently, the cultures were incubated for 15 h and then harvested. All growth media for E. coli strains harboring plasmids contained 100 μg of ampicillin/ml.
General molecular procedures.
All manipulations of DNA, PCR, and transformation of plasmids into E. coli were done according to routine procedures (2) unless otherwise specified. The Göttingen Genomics Laboratory (Göttingen, Germany) determined the DNA sequences.
Isolation of environmental DNA and library construction.
The genomic DNA was isolated from the samples by direct lysis of the microorganisms present without prior removal of environmental compounds as described previously (17). The purified DNA was partially digested with Bsp142I and, in order to avoid cloning of very small DNA fragments, size fractionated by sucrose density centrifugation (10 to 40% [wt/vol]). To construct small-insert libraries, fractions containing DNA fragments of >2 kb were pooled and used for library construction. One library was constructed in pCR2.1-TOPO and one was constructed in pBluescript SKII+ (pSKII+) from the size-fractionated DNA derived from each of the samples described below (Table 2). The quality of the different environmental libraries produced was controlled by determination of the average insert size and the number of insert-containing clones.
TABLE 2.
Characterization of constructed metagenomic libraries and screening of these libraries for genes conferring proteolytic activity on E. coli [a]
| Library | Sample site | Vector | No. of insert-containing E. coli clones | Avg insert size (kb) [b] | Estimated library size (Mb) | No. of proteolytic E. coli clones with a stable phenotype (designation) |
|---|---|---|---|---|---|---|
| SK1 | Mixed sample A | pSKII+ | 52,000 | 4.2 | 218 | 0 |
| CR1 | Mixed sample A | pCR2.1 | 78,000 | 5.3 | 413 | 1 (pTW1) |
| SK2 | Mixed sample B | pSKII+ | 55,000 | 3.9 | 215 | 2 (pTW2, pTW3) |
| CR2 | Mixed sample B | pCR2.1 | 65,000 | 4.0 | 260 | 0 |
| SK3 | Mining shaft | pSKII+ | 10,000 | 3.9 | 39 | 0 |
| CR3 | Mining shaft | pCR2.1 | 37,000 | 4.6 | 170 | 1 (pTW4) |
| SK4 | Compost soil | pSKII+ | 32,000 | 2.4 | 77 | 0 |
| CR4 | Compost soil | pCR2.1 | 60,000 | 3.5 | 210 | 0 |
[a] In each case, 5 μg of isolated metagenomic DNA was used as starting material for library construction. Approximately 10,000 E. coli clones of each library were screened.
[b] The average insert size was determined by analysis of 50 insert-containing recombinant plasmids.
In the case of pSKII+, 5 μg of size-fractionated metagenomic DNA was inserted into BamHI-digested and dephosphorylated vector by ligation using T4 DNA ligase (MBI Fermentas, St. Leon-Rot, Germany) in the reaction mixture recommended by the manufacturer. Size-fractionated metagenomic DNA was mixed with the vector at a molar ratio of 1:1. Ligation reactions were incubated overnight at 16°C.
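The 1:1 molar ratio described above can be turned into a vector mass with a short calculation. The following is a minimal sketch only: the vector length (2,961 bp, the commonly quoted size of pBluescript II SK+) and the mean insert length (4,000 bp) are assumptions for illustration and are not stated in the text.

```python
# Sketch: vector mass needed for a given vector:insert molar ratio.
# Assumptions (not from the text): vector length of 2,961 bp and a mean
# insert length of 4,000 bp for the size-fractionated metagenomic DNA.

def vector_mass_ug(insert_mass_ug, insert_bp, vector_bp, vector_per_insert=1.0):
    """Return the vector mass (in ug) that gives the requested molar ratio.

    For double-stranded DNA, moles are proportional to mass / length,
    so the mass per base pair cancels out of the ratio.
    """
    return insert_mass_ug * (vector_bp / insert_bp) * vector_per_insert


mass = vector_mass_ug(insert_mass_ug=5.0, insert_bp=4000, vector_bp=2961)
print(f"{mass:.2f} ug of vector for 5 ug of insert at a 1:1 molar ratio")
```

Because the metagenomic DNA is a mixture of fragment sizes, any such figure is only as good as the assumed mean insert length.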
To clone the metagenomic DNA into pCR2.1-TOPO, a TOPO TA cloning method (Invitrogen) was used. For this purpose, 5 μg of the size-fractionated metagenomic DNA was subjected to blunt-end polishing by using T4 DNA polymerase (MBI Fermentas) as suggested by the manufacturer. The reaction was performed in a total volume of 50 μl at 25°C for 1 h. Subsequently, the DNA was purified by using a QIAquick PCR purification kit (Qiagen, Hilden, Germany) as recommended by the manufacturer. The volume of the purified DNA solution was 50 μl. The next step comprised addition of deoxyadenosine to the 3′ termini of the DNA, which is required to facilitate cloning by the TA method. For this purpose, 6 μl of dATP solution (2 mM), 8 μl of 10-fold MgCl2-containing Taq DNA polymerase buffer (MBI Fermentas), 2 μl of Taq DNA polymerase (2 U), and 14 μl of H2O were mixed with the DNA solution. The mixture was incubated at 72°C for 20 min, and the DNA was then purified by using a QIAquick PCR purification kit. The resulting DNA solution (50 μl) was dephosphorylated by using 2 U of calf intestine alkaline phosphatase (MBI Fermentas) as recommended by the manufacturer. Subsequently, the DNA was purified by using a QIAquick PCR purification kit. Finally, the recovered DNA solution (30 μl) was inserted into pCR2.1-TOPO by using the TOPO TA cloning kit (Invitrogen) according to the manufacturer's instructions. All constructed libraries were used to transform E. coli TOP10 (Invitrogen) as recommended by the manufacturer.
Four different samples were used to construct metagenomic libraries. The characteristics and the designations of the constructed libraries are given in Table 2. Two libraries (SK3 and CR3) were constructed from a soil sample derived from the mining shaft Fortuna (Harz Mountains, Germany), and two libraries (SK4 and CR4) were constructed from compost soil. The other four libraries were generated from DNA derived from two different mixed samples, designated mixed sample A and mixed sample B. Mixed samples were prepared by merging 0.5 g of each of the different environmental samples. Mixed sample A (libraries SK1 and CR1) contained sediment from a digestion tower of a sewage plant (Göttingen, Germany), sediment from the Gulf of Eilat (Israel), and sediment from the Leine River (Göttingen, Germany). Mixed sample B (libraries SK2 and CR2) contained garden soil (Varmissen, Germany), sediment from the Wadden Sea (Germany), sediment from the Leine River, and sediment from the Solar Lake (Egypt).
Cloning of the protease-encoding genes and derivatives of these genes into expression vector pET101/D.
In order to construct derivatives of protease-encoding genes mprA and mprB and to achieve high-level expression of the mprA and mprB gene products and derivatives of these gene products, the plasmids pMPRA, pMPRA.1, pMPRB, and pMPRB.1 to pMPRB.4 were constructed by a PCR-based approach (Table 1). For this purpose, the coding regions of mprA and mprB or parts of these regions were amplified from recombinant plasmids pTW2 and pTW3, respectively, by PCR using the following sets of primers with synthetic sites (underlined) that allowed a directional cloning into pET101/D and translation of the gene regions by using the pET101/D directional TOPO expression kit (Invitrogen): pMPRA, 5′-CACCATGCATTCCAGCAGTCG-3′ (primer mprA-for) and 5′-GAAGGTGACGCTCCAGCTG-3′ (primer mprA/mprB-rev); pMPRA.1, primer mprA-for and 5′-CGATGCCGACTGGTTGGTC-3′ (primer mprA.1-rev); pMPRB, 5′-CACCATGCAGTCCAGCAGTCGT-3′ (primer mprB-for) and primer mprA/mprB-rev; pMPRB.1, primer mprB-for and 5′-GCCGGTGCTGGCGGAGATG-3′ (primer mprB.1-rev); pMPRB.2, 5′-CACCATGGCCGATATCGGCACC-3′ (primer mprB.2-for) and primer mprA/mprB-rev; pMPRB.3, primer mprB.2-for and primer mprB.1-rev; and pMPRB.4, 5′-CACCATGGCCACCCGGGTCGACCTG-3′ (primer mprB.4-for) and primer mprA/mprB-rev. Each PCR contained, in a total volume of 50 μl, 1-fold Mg-free polymerase buffer (MBI Fermentas), 200 μM concentrations of each of the four deoxynucleoside triphosphates, 1.5 mM MgCl2, 2 μM concentrations of each of the primers, 1 U of Pfu DNA polymerase (MBI Fermentas), and 0.1 μg of pTW2 or pTW3 as a template. The reactions were initiated at 95°C (2 min), followed by 30 cycles of 95°C (1 min), a temperature gradient ranging from 37 to 57°C (1 min), and 72°C (1 min per 500 bp), and ended with incubation at 72°C for at least 10 min. Finally, all obtained PCR products were purified and cloned into pET101/D as recommended by the manufacturer (Invitrogen).
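The fragment sizes listed in Table 1 and the extension step of 1 min per 500 bp can be cross-checked from the base coordinates of each construct. The sketch below uses only the coordinates given in Table 1. Rounding the extension time up to whole minutes is an assumption, and the lengths refer to the gene-derived portion only (the extra primer-encoded nucleotides are ignored).

```python
import math

# Base coordinates within mprA or mprB for each construct (from Table 1).
constructs = {
    "pMPRA": (1, 2586),
    "pMPRA.1": (1, 1635),
    "pMPRB": (1, 2586),
    "pMPRB.1": (1, 1638),
    "pMPRB.2": (655, 2586),
    "pMPRB.3": (655, 1638),
    "pMPRB.4": (82, 2586),
}


def fragment_length(start, end):
    """Length in bp of a fragment given inclusive 1-based coordinates."""
    return end - start + 1


def extension_minutes(length_bp, bp_per_min=500):
    """Extension time at 1 min per 500 bp, rounded up (assumed rounding)."""
    return math.ceil(length_bp / bp_per_min)


plan = {}
for name, (start, end) in constructs.items():
    length = fragment_length(start, end)
    plan[name] = (length, extension_minutes(length))

for name, (length, minutes) in plan.items():
    print(f"{name}: {length} bp, {minutes} min extension")
```

The computed lengths match the fragment sizes given in Table 1 for all seven constructs.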
All coding regions were placed under the control of the IPTG-inducible T7 promoter by cloning in the expression vector pET101/D. In addition, sequences encoding a His6 tag and a V5 epitope provided by the vector were added to the 3′ end of the coding regions.
To detect V5 epitope-tagged proteins by Western blot analysis, samples of cell extracts were separated by analytical sodium dodecyl sulfate-polyacrylamide gel electrophoresis using the procedure of Laemmli (24) and then transferred to Hybond-P membranes (GE Healthcare, Freiburg, Germany) as recommended by the manufacturer. Incubation of membranes with alkaline phosphatase-conjugated anti-V5 antibody (Invitrogen) and development of signals by using BCIP (5-bromo-4-chloro-3-indolylphosphate)/nitroblue tetrazolium color development reagent (Promega) were performed according to the manufacturer's instructions.
Sequence analysis.
The initial prediction of open reading frames (ORFs) located on the inserts of pTW2 and pTW3 was accomplished by using the Artemis program (http://www.sanger.ac.uk/Software/Artemis) (46). The results were verified and improved manually by using criteria such as the presence of a ribosome-binding site, GC frame plot analysis, and similarity to known protein-encoding sequences. Initial annotation of the deduced proteins was performed by searching the amino acid sequences against the public GenBank database by using the BLAST program (1). All predictions were verified and modified manually by comparing the protein sequences with the Swiss-Prot, ProDom, COG, and Prosite public databases. All coding sequences were searched for similarities to protein families and domains using searches against the Pfam (10) and CDD databases (31). Signal peptides of proteins were predicted by using the SignalP 3.0 server (http://www.cbs.dtu.dk/services/SignalP/) (3). Sequence analysis and classification of the identified proteases were performed by comparing the sequences to the MEROPS peptidase database (http://merops.sanger.ac.uk) (42).
Preparation of cell-free culture supernatants and cell extracts.
The cells and the culture supernatants from 500-ml cultures were separated by centrifugation at 16,300 × g and 4°C for 15 min. To prepare cell extracts, the resulting cell pellets were washed once with 100 mM potassium phosphate buffer (pH 8.0) and resuspended in 2 to 3 ml of the same buffer. The cells were disrupted by using a French press (1.38 × 10⁸ Pa). Subsequently, the extract was cleared by centrifugation at 32,000 × g and 4°C for 30 min.
The culture supernatant was concentrated 20-fold by ultrafiltration using a Vivaflow 200 module (exclusion limit 10,000 Da; Sartorius AG, Göttingen, Germany) as recommended by the manufacturer. The module was also used to change buffer systems. The concentrated culture supernatants were used for protein purification as described below. To characterize the protease activity directly in the culture supernatant, it was further concentrated by using a Vivaspin 6 concentrator (Sartorius AG) as recommended by the manufacturer. In total, the culture supernatant was concentrated 100-fold by the two ultrafiltration steps.
Enzyme assays.
Protease activity was determined by assaying the amount of liberated amino acids using casein as a substrate. To 0.5 ml of 0.65% (wt/vol) casein solution in 50 mM potassium phosphate buffer (pH 7.5), 0.1 ml of enzyme solution was added. After incubation for 1 h at 55°C, the reaction was terminated by the addition of 0.5 ml of 0.11 M trichloroacetic acid solution. For the preparation of blanks, the enzyme solutions were added after termination of the reaction by trichloroacetic acid. Subsequently, the terminated reaction mixtures were incubated at 55°C for 30 min, and the precipitate was removed by centrifugation at 14,000 × g for 5 min. An aliquot (0.4 ml) of the remaining supernatant was mixed with 1 ml of 0.5 M Na2CO3 solution and 0.2 ml of Folin-Ciocalteu reagent (Sigma Chemie, Deisenhofen, Germany), which was diluted fivefold prior to use. After incubation at 37°C for 30 min, the solution was cleared by centrifugation, and the absorbance was measured at 660 nm. A standard curve was generated using solutions of 0 to 80 mg of tyrosine/liter. One unit of protease activity was defined as the amount of enzyme that liberated 1 μmol of tyrosine in 1 min. All measured absorbance values were corrected for the absorbance of the blanks. Protein concentrations were determined by the method of Bradford (5) with bovine serum albumin as the standard.
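The conversion from a blank-corrected absorbance to units of protease activity can be sketched as follows. This is an illustration under stated assumptions, not the authors' calculation: the standard-curve readings and sample absorbances are hypothetical, and the tyrosine standards are assumed to represent the tyrosine concentration in the 1.1-ml terminated reaction mixture (0.5 ml of casein solution, 0.1 ml of enzyme solution, and 0.5 ml of trichloroacetic acid solution).

```python
TYROSINE_G_PER_MOL = 181.19  # molar mass of tyrosine


def fit_line(xs, ys):
    """Least-squares slope and intercept for a linear standard curve."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def protease_units_per_ml(a660_sample, a660_blank, slope, intercept,
                          mixture_ml=1.1, enzyme_ml=0.1, minutes=60.0):
    """Units (umol of tyrosine per min) per ml of enzyme solution."""
    corrected = a660_sample - a660_blank
    tyr_mg_per_l = (corrected - intercept) / slope
    # mg/liter equals ug/ml; ug divided by g/mol (= ug/umol) gives umol.
    tyr_umol = tyr_mg_per_l * mixture_ml / TYROSINE_G_PER_MOL
    return tyr_umol / minutes / enzyme_ml


# Hypothetical standard curve covering 0 to 80 mg of tyrosine/liter.
standards_mg_per_l = [0, 20, 40, 60, 80]
standards_a660 = [0.00, 0.20, 0.40, 0.60, 0.80]
slope, intercept = fit_line(standards_mg_per_l, standards_a660)

# Hypothetical sample and blank readings.
activity = protease_units_per_ml(0.50, 0.10, slope, intercept)
print(f"{activity:.4f} U per ml of enzyme solution")
```

Dividing the result by the protein concentration from the Bradford assay would give a specific activity in units per mg of protein.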
The optimum temperature for enzyme activity was measured in the range of 15 to 85°C. The optimum pH was determined by incubating the enzyme with casein as the substrate in 50 mM potassium phosphate buffer (pH 6 to 9) or 50 mM sodium carbonate buffer (pH 9 to 11.5).
Effect of inhibitors.
The effects of inhibitors on proteolytic activity were determined by incubation of the proteases with 10 mM ortho-phenanthroline, 10 mM phenylmethylsulfonyl fluoride, or 10 mM EDTA for 1 h at 25°C before the enzyme activity was measured.
Purification of proteases.
The proteases were purified from the concentrated culture supernatants by hydrophobic interaction chromatography using a computer-controlled ÄKTA FPLC system (GE Healthcare) as recommended by the manufacturer. Protease solutions (13.7 to 16.5 mg of protein) containing MprA, MprB, MprA.1, or MprB.1 were applied to a prepacked Phenyl-Sepharose HP XK 16/10 column (GE Healthcare) that had been equilibrated with 1 M (NH4)2SO4 in 50 mM Tris buffer (pH 8). Subsequently, the column was washed with a negatively sloped gradient of ammonium sulfate [200 ml of 1 to 0 M (NH4)2SO4 in 50 mM Tris buffer; pH 8]. The proteases desorbed from the column when the ammonium sulfate concentration dropped below 100 mM. The flow rate was 1.5 ml/min, and 6-ml fractions were collected. Subsequently, active fractions were pooled, and the pools of the different proteases (0.7 to 2.4 mg of protein) were used for characterization of the enzymes.
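The point in the gradient at which the ammonium sulfate concentration falls below 100 mM can be estimated from the gradient volume, flow rate, and fraction size given above. The sketch assumes a strictly linear gradient and counts volume from the start of the gradient. Both are simplifications, and column dead volume is ignored.

```python
def gradient_concentration(volume_ml, start_m=1.0, end_m=0.0, gradient_ml=200.0):
    """Ammonium sulfate concentration (M) after a given gradient volume."""
    fraction_done = min(max(volume_ml / gradient_ml, 0.0), 1.0)
    return start_m + (end_m - start_m) * fraction_done


def volume_at_concentration(target_m, start_m=1.0, end_m=0.0, gradient_ml=200.0):
    """Gradient volume (ml) at which the target concentration is reached."""
    return (start_m - target_m) / (start_m - end_m) * gradient_ml


FLOW_ML_PER_MIN = 1.5
FRACTION_ML = 6.0

elution_ml = volume_at_concentration(0.1)      # volume at 100 mM
elution_min = elution_ml / FLOW_ML_PER_MIN     # time since gradient start
fractions_before = elution_ml / FRACTION_ML    # fractions collected by then

print(f"100 mM reached after {elution_ml:.0f} ml "
      f"({elution_min:.0f} min, {fractions_before:.0f} fractions)")
```

Under these assumptions, desorption would be expected in the last tenth of the gradient.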
Nucleotide sequence accession numbers.
The nucleotide sequences of the inserts of pTW2 and pTW3 have been deposited in the GenBank database under accession numbers EU333168 and EU333169, respectively.
RESULTS
Construction of metagenomic DNA libraries.
Four different samples (soil from a mining shaft, compost soil, mixed sample A, and mixed sample B) were used for the construction of metagenomic DNA libraries. The yields of the DNA isolated from the four different samples ranged from approximately 48 to 73 μg of DNA per g of sample (data not shown). These yields were in the same range as described for the isolation of DNA from other environmental samples (7).
Two different types of small-insert metagenomic libraries were constructed from each of the four environmental samples. One type was generated by using pSKII+ as the vector and a standard ligation reaction with T4 DNA ligase to clone the metagenomic DNA. The other type was constructed by using pCR2.1-TOPO as the vector and a TOPO TA cloning approach. This approach is based on the ligase activity of topoisomerase I, which is bound to the linearized vector. In order to compare both approaches, 5-μg portions of the isolated and size-fractionated DNA from each of the four environmental samples were used as starting material for library construction. The mixed samples were used to study the influence of different environmental matrix substances on library construction.
The four libraries in the high-copy vector pSKII+ (libraries SK1 to SK4) and the four libraries in vector pCR2.1-TOPO (libraries CR1 to CR4) contained 389,000 insert-bearing E. coli clones, which harbored approximately 1.6 Gb of cloned environmental DNA (Table 2). The number of clones per μg of DNA used for library construction and the average insert sizes ranged from 2,000 to 15,600 and from 2.4 to 5.3 kb, respectively. Direct comparison of the two types of libraries generated from each of the four samples revealed that use of the TOPO TA-based cloning method results in more insert-containing clones per μg of DNA and in larger average insert sizes than the application of the standard cloning method involving T4 DNA ligase (Table 2).
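The library statistics discussed above follow directly from the clone counts and average insert sizes in Table 2. The short sketch below recomputes them. It assumes, as stated in the Table 2 footnote, that 5 μg of DNA was used per library; the variable and function names are illustrative.

```python
DNA_INPUT_UG = 5.0  # metagenomic DNA used per library (Table 2 footnote)

# (library, sample, vector, insert-containing clones, avg insert size in kb)
libraries = [
    ("SK1", "Mixed sample A", "pSKII+", 52000, 4.2),
    ("CR1", "Mixed sample A", "pCR2.1", 78000, 5.3),
    ("SK2", "Mixed sample B", "pSKII+", 55000, 3.9),
    ("CR2", "Mixed sample B", "pCR2.1", 65000, 4.0),
    ("SK3", "Mining shaft", "pSKII+", 10000, 3.9),
    ("CR3", "Mining shaft", "pCR2.1", 37000, 4.6),
    ("SK4", "Compost soil", "pSKII+", 32000, 2.4),
    ("CR4", "Compost soil", "pCR2.1", 60000, 3.5),
]


def library_size_mb(clones, avg_insert_kb):
    """Estimated cloned DNA in Mb: clones x average insert size."""
    return clones * avg_insert_kb / 1000.0


total_clones = sum(row[3] for row in libraries)
total_mb = sum(library_size_mb(row[3], row[4]) for row in libraries)
clones_per_ug = {row[0]: row[3] / DNA_INPUT_UG for row in libraries}

# Ratio of TOPO-based to ligase-based clone yield for each sample.
by_sample = {}
for name, sample, vector, clones, _ in libraries:
    by_sample.setdefault(sample, {})[vector] = clones
topo_vs_ligase = {s: v["pCR2.1"] / v["pSKII+"] for s, v in by_sample.items()}

print(total_clones, round(total_mb, 1))
print(min(clones_per_ug.values()), max(clones_per_ug.values()))
print(topo_vs_ligase)
```

The per-library products reproduce the "Estimated library size" column of Table 2 after rounding, and the yield ratio is greater than 1 for every sample.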
Activity-based screening of the constructed libraries.
The screen for genes conferring proteolytic activity was based on the ability of the library-containing E. coli clones to form halos when grown on indicator agar medium containing skim milk. Halo formation is caused by hydrolysis of the milk proteins. The screening was initiated by the transfer of approximately 10,000 clones of each library to indicator agar medium. Thus, ∼318 Mb of environmental DNA were screened for the presence of genes encoding proteolytic activity. After incubation for 48 h at 37°C, positive E. coli clones were collected. In order to confirm that the proteolytic activity of the positive clones was plasmid encoded, the recombinant plasmids were isolated and used to transform E. coli. The resulting E. coli strains were screened again on indicator agar. Four different recombinant plasmids designated pTW1 to pTW4 conferred a stable proteolytic phenotype. Two of these plasmids were derived from library SK2, one from library CR1, and one from library CR3 (Table 2). All four plasmids also conferred a proteolytic phenotype on indicator agar containing other protease substrates such as azoalbumin and azocasein. E. coli strains harboring pTW2 or pTW3 showed the largest halos and the fastest appearance of the proteolytic phenotype of all positive clones on indicator agar. In addition, prolonged incubation of these recombinant E. coli strains resulted in complete clearing of the entire skim milk-containing indicator agar medium (25 ml) placed in a standard petri dish (data not shown). Since these results indicated a strong proteolytic activity, the plasmids pTW2 and pTW3 and the corresponding E. coli clones (E. coli TOP10/pTW2 and E. coli TOP10/pTW3) were studied further.
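The amount of DNA covered by the screen and the resulting hit frequency can be derived from the numbers above and the average insert sizes in Table 2. This is a back-of-the-envelope sketch that treats "approximately 10,000 clones" as exactly 10,000 per library.

```python
CLONES_SCREENED_PER_LIBRARY = 10000
POSITIVE_CLONES = 4  # pTW1 to pTW4

# Average insert sizes in kb for each library (Table 2).
avg_insert_kb = {
    "SK1": 4.2, "CR1": 5.3, "SK2": 3.9, "CR2": 4.0,
    "SK3": 3.9, "CR3": 4.6, "SK4": 2.4, "CR4": 3.5,
}

clones_screened = CLONES_SCREENED_PER_LIBRARY * len(avg_insert_kb)
screened_mb = sum(
    CLONES_SCREENED_PER_LIBRARY * kb / 1000.0 for kb in avg_insert_kb.values()
)
clones_per_hit = clones_screened / POSITIVE_CLONES
mb_per_hit = screened_mb / POSITIVE_CLONES

print(f"{clones_screened} clones, {screened_mb:.0f} Mb screened")
print(f"one positive clone per {clones_per_hit:.0f} clones "
      f"or per {mb_per_hit:.1f} Mb")
```

The totals agree with the 80,000 clones and ∼318 Mb quoted in the text.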
Sequence analysis of pTW2 and pTW3.
The inserts of pTW2 (28,113 bp) and pTW3 (19,956 bp) were sequenced, analyzed, and compared to the sequences in the National Center for Biotechnology Information databases (see Tables S1 and S2, respectively, in the supplemental material). The apparent gene organizations are shown in Fig. 1A and B, respectively. The similar G+C content of the sequenced inserts of pTW2 (68.3%) and pTW3 (69.5%) suggested a similar phylogenetic affiliation of the cloned DNA fragments. Each of the inserts contained one ORF that encoded a putative protease (mprA [pTW2] and mprB [pTW3]; see below).
FIG. 1.
Comparison of the inserts of pTW2 (A) and pTW3 (B) with correlating genomic regions of X. campestris pv. vesicatoria strain 85-10 (C), X. campestris pv. campestris strain ATCC 33913 (D), X. campestris pv. campestris strain 8004 (E), X. axonopodis pv. citri strain 306 (F), X. oryzae pv. oryzae KACC 10331 (G), and X. oryzae pv. oryzae MAFF 311018 (H). Arrows and arrowheads indicate the lengths, locations, and orientations of potential genes. Genes sharing the same predicted functions are marked with identical colors. Gray arrows and open arrows indicate conserved hypothetical proteins and hypothetical proteins, respectively. Genes encoding putative proteases are boxed in yellow. ORFs that are found in pTW2 and pTW3 and not in one of the six Xanthomonas strains are marked by the red triangles. Genes of the Xanthomonas strains that are not present in the insert sequences of pTW2 and pTW3 are indicated by the black rectangles. Numbers below the arrows indicate the ORF position. The predicted functions of the gene products from the six Xanthomonas strains and detailed comparisons with pTW2 and pTW3 are given in Tables S4 to S9 in the supplemental material. References were as follows: pTW2, EU333168 (the present study); pTW3, EU333169 (the present study); X. campestris pv. vesicatoria strain 85-10 (base 1118626 to base 1074579), AM039952; X. campestris pv. campestris strain ATCC 33913 (base 1040326 to base 995188), AE008922; X. campestris pv. campestris strain 8004 (base 4000216 to base 4045359), CP000050; X. axonopodis pv. citri strain 306 (base 1119791 to base 1071729), AE008923; X. oryzae pv. oryzae KACC 10331 (base 3859493 to base 3902413), AE013598; and X. oryzae pv. oryzae MAFF 311018 (base 3858820 to base 3899102), AP008229.
The insert sequence of pTW2 harbored 32 predicted protein-encoding ORFs. Twenty-four of the proteins deduced from these ORFs were significantly similar (49 to 92% amino acid sequence identity) to database entries from various members of the family Xanthomonadaceae, such as Xanthomonas axonopodis pv. citri strain 306, X. campestris pv. campestris strain ATCC 33913, X. campestris pv. campestris strain 8004, X. campestris pv. vesicatoria strain 85-10, X. oryzae pv. oryzae strain KACC 10331, X. oryzae pv. oryzae strain MAFF 311018, Stenotrophomonas maltophilia R551-3, and Xylella fastidiosa 9a5c (see Table S1 in the supplemental material). Four of the remaining eight predicted protein-encoding ORFs exhibited no significant sequence similarity to known protein sequences. The other four presumptive genes (ORFs 15, 18, 27, and 29) encoded proteins that exhibited significant similarities to database entries from bacteria that do not belong to the Xanthomonadaceae. These included the ORF encoding the protease (ORF 29).
The insert of pTW3 contained 22 predicted protein-encoding ORFs. Two ORFs (ORFs 1 and 9) were unrelated to any known genes, and 20 ORFs possessed similarities to known bacterial genes. Seventeen of the proteins deduced from the latter ORFs revealed high amino acid sequence identities (33 to 89%) to database entries of the above-mentioned members of the Xanthomonadaceae. The gene products deduced from the remaining three ORFs (ORFs 2, 8, and 22), including mprB, were most similar to gene products from organisms belonging to other families (see Table S2 in the supplemental material).
Comparative sequence analysis of pTW2 and pTW3 revealed that the inserts contained similar but not identical overlapping gene regions (Fig. 1A and B). These regions comprise ORFs 18 to 32 of pTW2 and ORFs 1 to 12 of pTW3. The amino acid identities of correlating proteins deduced from pTW2 and pTW3 in this region ranged from 68 to 96% (see Table S3 in the supplemental material). These high similarities, together with the similar gene organization of the overlapping regions and the similar G+C content of the inserts, suggested that the cloned DNA fragments of pTW2 and pTW3 were derived from closely related but different organisms.
The high similarities of the majority of the gene products deduced from the inserts of pTW2 and pTW3 to gene products from members of the Xanthomonadaceae family suggested a phylogenetic affiliation of the organisms from which the inserts of pTW2 or pTW3 were derived to the Xanthomonadaceae. To further analyze this, the sequences and the gene organization of the inserts of pTW2 and pTW3 were compared to complete genome sequences of members of the Xanthomonadaceae. The gene organization of both inserts revealed striking similarities to the organization of genomic regions of members of the genus Xanthomonas, such as X. axonopodis pv. citri strain 306, X. campestris pv. campestris strain ATCC 33913, X. campestris pv. campestris strain 8004, X. campestris pv. vesicatoria strain 85-10, X. oryzae pv. oryzae strain KACC 10331, and X. oryzae pv. oryzae strain MAFF 311018 (Fig. 1). In each case, at least 68% of the deduced gene products of pTW2 or pTW3 shared high amino acid sequence identity with the correlating gene products of the different Xanthomonas species (for detailed comparisons, see Tables S4 to S9 in the supplemental material). In addition, the G+C contents of pTW2 and pTW3 and of the Xanthomonas species (63 to 67%) (25) are in the same range.
Most of the differences between the insert sequences of pTW2 and pTW3 and the genomic regions of the Xanthomonas species were located in the neighborhood of the putative protease-encoding genes. The six Xanthomonas strains possessed one or three putative serine protease-encoding genes in the corresponding region, but none of the protein sequences deduced from these genes showed high similarity to MprA and MprB. The amino acid identities ranged from 6 to 22% (see Tables S4 to S9 in the supplemental material).
Sequence analysis of the protease-encoding genes.
The presumptive genes encoding the proteases MprA (pTW2) and MprB (pTW3) were of equal size (2,589 bp) and exhibited 90% nucleotide sequence identity and 92% amino acid sequence identity (see Fig. S1 in the supplemental material). A potential signal peptide of 27 amino acids was predicted at the N terminus of MprA and MprB by using the SignalP program (3). The amino acid sequences of both putative signal peptides showed the typical orientation of signal peptides with three distinct parts (N, H, and C domains) (38). The protein sequences of MprA and MprB contained the highly conserved zinc-binding motif HEXXH (residues H362 to H366) and the third zinc ligand motif GXXNEXXSD (residues G382 to D390) (Fig. 2). The presence of these conserved motifs is typical for metalloproteases belonging to the M4 family of metallopeptidases, which is also known as the thermolysin-like family (40). Most of the members are secreted bacterial enzymes that degrade extracellular proteins. This is in accordance with the presence of putative signal peptides in the N-terminal regions of MprA and MprB.
FIG. 2.
Alignment of the active-site regions of MprA and MprB with those of other metalloproteases belonging to the M4 family. The conserved HEXXH and GXXNEXXSD motifs are shaded. Identical amino acid residues are indicated by bold letters. References were as follows: MprA and MprB (the present study); MprI from Pseudoalteromonas piscicida (34); EmpI from Pseudoalteromonas sp. strain A28 (27); vibriolysin from Vibrio vulnificus (6); and thermolysin from Bacillus thermoproteolyticus (CAA01492).
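The two conserved motifs can be located in a protein sequence with a simple pattern search. The sketch below is illustrative only and is not the method used in this study: the regular expressions are a direct transcription of HEXXH and GXXNEXXSD with X taken as any residue, the function names are hypothetical, and the toy sequence is invented (it is not MprA or MprB). The last lines apply the same offset calculation to the residue numbers reported in the text.

```python
import re

# X in the motif notation is treated as "any single residue" (assumption).
MOTIFS = {
    "HEXXH": re.compile(r"HE..H"),
    "GXXNEXXSD": re.compile(r"G..NE..SD"),
}


def find_motifs(protein: str) -> dict:
    """Return 1-based inclusive (start, end) of the first hit per motif, or None."""
    hits = {}
    for name, pattern in MOTIFS.items():
        match = pattern.search(protein)
        hits[name] = (match.start() + 1, match.end()) if match else None
    return hits


def third_ligand_offset(hits: dict) -> int:
    """Distance from the last His of HEXXH to the Glu of GXXNEXXSD."""
    hexxh_end = hits["HEXXH"][1]
    glu_position = hits["GXXNEXXSD"][0] + 4  # Glu is the 5th residue of the motif
    return glu_position - hexxh_end


# Hypothetical toy sequence built to mimic the motif spacing.
toy = "MKT" + "HEAGH" + "A" * 15 + "GAYNEAFSD" + "LLK"
hits = find_motifs(toy)

# Residue ranges reported in the text for MprA and MprB.
reported = {"HEXXH": (362, 366), "GXXNEXXSD": (382, 390)}
reported_offset = third_ligand_offset(reported)
```

With the reported ranges, the glutamate of the second motif falls at position 386, i.e., 20 residues after H366.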
Further comparison of the deduced amino acid sequences with the National Center for Biotechnology Information databases and the MEROPS database revealed that MprA and MprB were modular enzymes consisting of four regions: the signal sequence (27 amino acids), the N-terminal proregion (198 amino acids), the protease region (296 amino acids), and the C-terminal extension (341 amino acids) (Fig. 3 and Table 3). The fungalysin/thermolysin propeptide motif typical for M4 metalloproteases is part of the N-terminal proregion (51). The protease region consists of the putative catalytic domain and the alpha-helical domain. This region showed high similarities to metalloproteases belonging to the M4 family, such as MprI from Pseudoalteromonas piscicida (formerly Alteromonas sp. strain O-7) (34), EmpI from Pseudoalteromonas sp. strain A28 (27), thermolysin from Bacillus thermoproteolyticus (52), and vibriolysin from Vibrio vulnificus (6). The C-terminal extension harbors two putative bacterial prepeptidase C-terminal (PPC) domains that are also present in some M4 metalloproteases and one proprotein convertase P domain at the C terminus (Table 3, Fig. 3). The presence of the latter domain is unusual for bacterial proteases and is not common for members of the M4 family of proteases. Usually, P domains are associated with eukaryotic intracellular serine endopeptidases (59). Amino acid sequence similarity searches revealed that MprA and MprB were most similar to the zinc metalloproteases MprI from Pseudoalteromonas piscicida (49 and 49% identity, respectively) (34) and EmpI from Pseudoalteromonas sp. strain A28 (50 and 47% identity, respectively) (27), but not over the entire length. EmpI and MprI possess a modular structure similar to that of MprA and MprB, but both enzymes lack a P domain at the C terminus.
FIG. 3.
Domain structure of MprA (A) and MprB (B) and localization of the constructed derivatives of both proteases. For amino acid positions and similarities of domains and motifs, see Table 3. Abbreviations: SP, signal peptide; FTP, fungalysin/thermolysin propeptide motif; M4, peptidase M4 catalytic domain; M4_C, peptidase M4 alpha-helical domain; P_propr, proprotein convertase P domain.
TABLE 3.
Localization and similarities of domains and motifs in the amino acid sequences of MprA and MprB
| Domain or motif | MprA aa range(s) | MprA E value(s) | MprB aa range(s) | MprB E value(s) |
|---|---|---|---|---|
| Signal peptide | 1-27 | NA | 1-27 | NA |
| FTP, fungalysin/thermolysin propeptide motif, pfam07504 | 73-123 | 3e−07 | 73-123 | 1e−08 |
| Peptidase_M4, thermolysin metallopeptidase, catalytic domain, pfam01447 | 226-371 | 2e−28 | 226-371 | 3e−26 |
| Peptidase_M4_C, thermolysin metallopeptidase, alpha-helical domain, pfam02868 | 373-521 | 4e−33 | 373-521 | 5e−33 |
| PPC, bacterial prepeptidase, C-terminal domain, pfam01451 | 546-617, 658-728 | 6e−13, 2e−120 | 547-617, 658-728 | 2e−13, 1e−12 |
| P_proprotein, proprotein convertase P-domain, pfam01483 | 780-862 | 4e−14 | 780-862 | 4e−14 |
Localization of signal peptides was predicted by using the SignalP 3.0 server (3). The localization and similarities of the other domains and motifs were determined by searches against the CDD database (31). The apparent domain structures of MprA and MprB are given in Fig. 3. aa, amino acids. NA, not applicable.
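The region lengths given in the text can be checked against the gene size and the domain ranges in Table 3 with a little arithmetic. The sketch below makes two assumptions that are not stated explicitly in the source: the four regions are contiguous and non-overlapping, and the reported gene size of 2,589 bp includes one stop codon. The function and variable names are hypothetical.

```python
# Region lengths (amino acids) as given in the text for MprA and MprB.
REGION_LENGTHS = [
    ("signal sequence", 27),
    ("N-terminal proregion", 198),
    ("protease region", 296),
    ("C-terminal extension", 341),
]
GENE_LENGTH_BP = 2589  # reported size of mprA and mprB


def region_boundaries(lengths):
    """Convert consecutive region lengths into 1-based inclusive (start, end) ranges.

    Assumption: regions are contiguous and do not overlap.
    """
    bounds, start = {}, 1
    for name, n_residues in lengths:
        bounds[name] = (start, start + n_residues - 1)
        start += n_residues
    return bounds


bounds = region_boundaries(REGION_LENGTHS)
protein_length = sum(n for _, n in REGION_LENGTHS)

# Assumption: the gene size includes one stop codon, which is not translated.
codons_without_stop = GENE_LENGTH_BP // 3 - 1
```

Under these assumptions the four regions add up to 862 residues, matching 863 codons minus the stop codon, and the derived protease region (226 to 521) and C-terminal extension (522 to 862) enclose the Peptidase_M4, Peptidase_M4_C, PPC, and P domain ranges listed in Table 3.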
In summary, molecular analysis indicated that MprA and MprB are extracellular proteases belonging to the M4 family of metallopeptidases. In addition, both enzymes exhibited a modular structure, which is not known from other members of the M4 family.
Subcloning, expression, and characterization of mprA and mprB and of derivatives.
To show that mprA and mprB encode extracellular proteases and to unravel the importance of the different domains for enzyme activity, the coding regions and derivatives of these regions that lack one or more of the putative domains (Fig. 3) were amplified by PCR. Since mprA and mprB are highly similar, most of the derivatives were constructed from the coding region of one gene (mprB). The PCR products were cloned into the expression vector pET101/D, thereby placing the genes under the control of the IPTG-inducible T7 promoter and adding sequences encoding a His6 tag and a V5 epitope. In addition, to facilitate protein production of derivatives that lack the N-terminal region, a start codon (ATG) was added to the 5′ end of the coding region. The fidelity of the PCR products and the cloning steps was confirmed by sequencing of the resulting constructs pMPRA, pMPRA.1, pMPRB, and pMPRB.1 to pMPRB.4 (Table 1 and Fig. 3). The plasmids were transformed into E. coli BL21. Subsequently, the corresponding E. coli strains BL21/pMPRA, BL21/pMPRA.1, BL21/pMPRB, and BL21/pMPRB.1 to BL21/pMPRB.4 were grown on indicator agar medium containing skim milk and trace amounts of IPTG (Fig. 4). Proteolytic activity was detectable for the E. coli strains BL21/pMPRA, BL21/pMPRA.1, BL21/pMPRB, and BL21/pMPRB.1. These strains harbored full-length protease-encoding genes or derivatives encoding the signal peptide, the proregion, and the protease region. A proteolytic phenotype was also recorded for the positive controls (E. coli TOP10/pTW2 and E. coli TOP10/pTW3). The other recombinant E. coli strains, including the strains containing pET101/D or pSKII+ (negative controls), showed no proteolytic activity (Fig. 4). All of the active E. coli clones mentioned above also showed a proteolytic phenotype on indicator agar containing azoalbumin or azocasein as protease substrate.
FIG. 4.
Proteolytic activity of recombinant E. coli strains containing pTW2, pTW3, pMPRA, pMPRA.1, pMPRB, and pMPRB.1 to pMPRB.4 on skim milk-containing agar plates. The host for the plasmids pTW2 and pTW3 and the negative control pSKII+ was E. coli TOP10. The plasmids pMPRA, pMPRA.1, pMPRB, and pMPRB.1 to pMPRB.4 and the negative control pET101/D were maintained in E. coli BL21. The recombinant E. coli strains were cultured on LB agar containing 2% skim milk. Halo formation indicated proteolytic activity. The gene regions cloned in pMPRA, pMPRA.1, pMPRB, and pMPRB.1 to pMPRB.4 are shown in Fig. 3.
To further analyze the proteolytic activity, all recombinant E. coli strains, including the negative control, were grown in basal medium. After the induction of gene expression by the addition of 0.5 mM IPTG and incubation for 15 h, the cells and culture supernatant were separated by centrifugation. The production of MprA and MprB and of all derivatives of these gene products in the cell extracts was confirmed by Western blot analysis with antibodies against the V5 epitope (data not shown). Subsequently, cell extracts and the cell-free culture supernatants were analyzed for the presence of proteolytic activity. The culture supernatants were concentrated prior to analysis because of the high dilution of the proteins in the supernatants. Protease activity was detected in the concentrated culture supernatants of the E. coli strains BL21/pMPRA, BL21/pMPRA.1, BL21/pMPRB, and BL21/pMPRB.1 (0.110 to 0.161 U/mg) (Table 4), which also revealed proteolytic activity on indicator agar (Fig. 4). The protease activities in cell extracts of these strains and all of the other E. coli clones harboring derivatives of mprA and mprB (0.01 to 0.02 U/mg) were in the range of the negative control E. coli BL21/pET101/D (0.02 U/mg). Thus, these results confirmed that mprA and mprB encode proteolytic enzymes and indicated that the C-terminal extension is not required for proteolytic activity. Since the sequence encoding the putative signal peptide was present in all gene regions coding for active proteases and the entire protease activity was found in the cell-free supernatants, we conclude that mprA and mprB encode extracellular proteases.
TABLE 4.
Proteolytic activity in culture supernatants of recombinant E. coli BL21 strains containing pMPRA, pMPRA.1, pMPRB, pMPRB.1, pMPRB.2, pMPRB.3, or pMPRB.4
| E. coli strain | Sp act (U/mg), concentrated supernatant | Sp act (U/mg), after hydrophobic interaction chromatography | Purification (fold) |
|---|---|---|---|
| BL21/pMPRA | 0.161 | 1.514 | 9.4 |
| BL21/pMPRA.1 | 0.110 | 1.194 | 10.9 |
| BL21/pMPRB | 0.140 | 1.278 | 9.1 |
| BL21/pMPRB.1 | 0.121 | 0.859 | 7.1 |
| BL21/pMPRB.2 | 0.001 | NA | NA |
| BL21/pMPRB.3 | 0.002 | NA | NA |
| BL21/pMPRB.4 | 0.002 | NA | NA |
| BL21/pET101/D | 0.002 | NA | NA |
Proteolytic activity was assayed as described in Materials and Methods with casein as the substrate. The supernatants were concentrated 100-fold prior to measurement. The concentrated culture supernatants of E. coli strains BL21/pMPRA, BL21/pMPRA.1, BL21/pMPRB, and BL21/pMPRB.1 were subjected to hydrophobic interaction chromatography using a phenyl Sepharose column. NA, not applicable.
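The purification factors listed in Table 4 follow from dividing the specific activity after hydrophobic interaction chromatography by the specific activity of the concentrated supernatant. The short sketch below reproduces that arithmetic from the tabulated values; the function and variable names are hypothetical, and rounding to one decimal place is an assumption based on how the factors are reported.

```python
# Specific activities (U/mg) copied from Table 4:
# (concentrated supernatant, after hydrophobic interaction chromatography)
SPECIFIC_ACTIVITY = {
    "BL21/pMPRA": (0.161, 1.514),
    "BL21/pMPRA.1": (0.110, 1.194),
    "BL21/pMPRB": (0.140, 1.278),
    "BL21/pMPRB.1": (0.121, 0.859),
}


def purification_fold(before: float, after: float) -> float:
    """Fold increase in specific activity, rounded to one decimal place."""
    if before <= 0:
        raise ValueError("specific activity before purification must be positive")
    return round(after / before, 1)


folds = {strain: purification_fold(before, after)
         for strain, (before, after) in SPECIFIC_ACTIVITY.items()}
```

The computed factors agree with the purification column of Table 4 for all four active strains.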
To increase the specific activity of MprA, MprA.1, MprB, and MprB.1, the corresponding culture supernatants were subjected to hydrophobic interaction chromatography. In this way, the specific activities increased 9.4-, 10.9-, 9.1-, and 7.1-fold, respectively (Table 4). To confirm that MprA and MprB are metalloproteases, the effect of protease inhibitors on enzyme activity was examined. The proteolytic activity of both enzymes was completely inhibited by the metalloprotease inhibitors 1,10-phenanthroline and EDTA but was not affected by the serine protease inhibitor phenylmethylsulfonyl fluoride. The response to the inhibitors was the same for MprA.1 and MprB.1. The biochemical characteristics with respect to optimum pH and temperature for casein hydrolysis were identical for MprA and MprB. Both enzymes were most active at 65°C and pH 8. The pH optimum of the derivatives MprA.1 and MprB.1 was also pH 8, but the temperature optimum of both derivatives was slightly lower (60°C). All proteases were active without significant loss of activity under the above-mentioned optimal conditions for at least 30 min.
DISCUSSION
In this study, we isolated novel proteases by construction and function-driven screening of metagenomic libraries. Metagenomics based on this approach has proven to be a powerful tool for the identification of novel biocatalysts (7, 9, 14, 43, 44, 53). Many of the genes and gene products encountered in this way were derived from screens of small-insert metagenomic libraries, which have been generated by direct isolation and cloning of DNA from environmental samples (12, 16, 30, 32, 53, 58). Usually, the environmental DNA is inserted in a high-copy plasmid by using a T4 DNA ligase reaction. We used this standard approach and a TOPO TA cloning method for the construction of small-insert metagenomic libraries from four different samples. In all cases, the latter approach, which relies on ligase activity of topoisomerase I, performed better than the T4 DNA ligase-based method with respect to the average insert size and number of insert-containing recombinant plasmids. A high suitability of the TOPO TA-based approach for cloning of metagenomic DNA has also been indicated by the results of Wilkinson et al. (54). These researchers constructed a small-insert environmental library from geothermal sediments by using a direct lysis approach for DNA isolation and the TOPO TA cloning method. Cloning of DNA by other methods failed. Recently, it has been reported that topoisomerase cloning is appropriate for construction of small-insert libraries from small quantities of metagenomic DNA (48).
In the present study, the isolation of metagenomic DNA was achieved by a direct lysis approach. This approach often results in coextraction of environmental matrix substances, which remain in the purified environmental DNA and interfere with the restriction digestion or the ligation reaction (7). The better performance of the TOPO TA approach for cloning of environmental DNA may be due to a lack of susceptibility of topoisomerase I toward remaining matrix substances. In addition, the TOPO TA method involves blunt-end polishing using T4 DNA polymerase, which exhibits proofreading activity. This activity may have contributed to an improvement of the overall quality of the environmental DNA used for cloning. In conclusion, the TOPO TA-based method is advisable for the generation of small-insert libraries from environmental DNA that has been isolated by direct lysis approaches.
The function-driven screening and the identification of clones exhibiting proteolytic activity was based on a well-established screen on skim milk-containing agar. This screen has been used to identify protease activity of individual microorganisms (22, 55) and recombinant E. coli strains that harbor gene libraries from single microorganisms (27) or metagenomic libraries (13, 26, 47). The partial screening of the constructed metagenomic libraries for genes encoding proteolytic enzymes and characterization of the recombinant plasmids recovered from positive clones resulted in the identification of mprA and mprB. Besides MprA and MprB, only one other active metalloprotease has been obtained from a metagenome project (26). In addition, a serine protease has been mentioned in reviews, but the original work was not published (13, 29). In other cases, screening of metagenomic libraries for proteolytic activity was unsuccessful (19, 44). Jones et al. (19) found glycoside hydrolases instead of proteases during activity-based screening of the human gut microbiome on skim milk-containing LB medium. These authors concluded that this screen is not entirely suitable for detection of proteolytic enzymes. Thus, the use of this screen for the identification of proteolytic clones might be one explanation for the current deficit of metagenome-derived proteases. Since the above-mentioned activity-based screen is the standard screen for identification of proteolytic enzymes, this cannot be the only explanation. Proteases degrade or modify enzymes (13, 39). Thus, the detection and recovery of proteases might have been hindered by deleterious effects of foreign protease activity on the host cells. Other possible explanations are associated with the general drawbacks of activity-based screening of metagenomic libraries. 
Although function-driven screens result in the identification of functional gene products, one limitation of this approach is its dependence on the expression of the cloned genes and synthesis of the corresponding gene products in a foreign host (7). In the case of extracellular proteases, secretion and processing of the foreign protein by the host strain is also required. Therefore, the inability to recover active proteases encoded by metagenomic DNA during function-driven screening might be due to the fact that many genes and gene products are not expressed, not translated, or not active in the host strain. In the present study, the E. coli host was able to synthesize and secrete active proteases derived from a foreign organism.
We were able to deduce the origin of the metagenomic DNA fragments that harbor mprA and mprB. The high sequence similarities of most of the gene products encoded by the inserts of pTW2 and pTW3 to gene products from different members of the Xanthomonadaceae (see Tables S1 and S2 in the supplemental material), and the similar organization of the cloned metagenomic gene regions and genomic regions of different Xanthomonas strains (Fig. 1) suggested that the inserts of pTW2 and pTW3 were derived from members of the genus Xanthomonas. Xanthomonas and E. coli are both gram negative and belong to the Gammaproteobacteria. This phylogenetic relationship and the similar type of cell wall might have contributed to the ability of E. coli to produce a foreign extracellular protease in an active form. Interestingly, the Xanthomonas strains contained one or more serine protease-encoding genes in a position similar to that of mprA and mprB (Fig. 1). Therefore, the localization of a protease-encoding gene in this genomic region appears to be important. The six Xanthomonas strains depicted in Fig. 1 are phytopathogenic organisms, and protease activity plays an important role in virulence (15, 28). However, an involvement of the proteases located in this region in pathogenesis has not been shown. In addition, MprA and MprB showed no significant sequence similarity to the serine proteases derived from the Xanthomonas species. Nevertheless, MprA and MprB belong to the zinc-containing metalloproteases of which many are involved in pathogenesis (4, 6, 15, 20, 35). Furthermore, the amino acid sequences of MprA and MprB showed the highest similarities (ca. 50% identity) to the zinc-containing metalloprotease EmpI from Pseudoalteromonas sp. strain A28, which exhibits potent algicidal activity (27). Thus, it cannot be excluded that MprA and MprB have a function in pathogenesis in the organisms from which these enzymes originate.
Analysis of MprA and MprB revealed that both enzymes are members of the M4 family of metallopeptidases. Both metalloproteases exhibited a unique modular structure and consisted of four parts (Fig. 3). The protease region showed high similarity to that of various zinc-containing metalloproteases belonging to the M4 family. It has been shown for thermolysin, the prototype member of the M4 family, that the two histidine residues of the highly conserved zinc-binding motif HEXXH (Fig. 2) are zinc ligands and the glutamate residue is an active-site residue (41). The glutamic acid residue of the other conserved motif, GXXNEXXSD, is the third zinc ligand. This residue is located 20 residues downstream of the zinc-binding motif of MprA and MprB (Fig. 2), as is typical for M4 family members (18). All members of this family bind a single zinc ion that is coordinated by the above-mentioned residues and an activated water molecule. Thermolysin-like metalloproteases are synthesized as inactive preproenzymes, which are converted to the mature protein by proteolytic cleavage (4, 41, 50). SignalP analysis revealed that the first 27 amino acids of MprA and MprB are typical signal peptides. The signal peptides may be removed during the passage through the inner membrane by endogenous signal peptidases provided by the host, as observed for other zinc-metalloproteases (8, 20). Since the entire protease activity was found in the cell-free culture supernatant and the sequences encoding the signal peptide had to be present in the gene regions to obtain protease activity (Fig. 4), the signal peptides of MprA and MprB appear to be functional. In addition to the signal peptide and the protease region, the presence of the N-terminal proregion was also essential for the activity of MprA and MprB. Processing of the proregion of thermolysin-like proteases is required to yield the active protein.
The proregion serves as a chaperone and as an inhibitor of catalysis, which prevents premature activation of the enzyme (50, 51). The C-terminal extension of MprA and MprB containing two PPC domains and one P domain is unique for M4 metalloproteases. Usually, the C-terminal extension is not essential for enzyme activity and is not present in the active proteases (57). Some metalloproteases, such as thermolysin and elastase, have no C-terminal extension. It has been suggested that C-terminal extensions containing PPC domains are involved in adhesion to protein substrates (35) or in protein secretion through the outer membrane of gram-negative bacteria (11). With respect to the presence of the two PPC domains, the C-terminal extension of MprA and MprB resembled those of other M4 metalloproteases such as EmpI (27) and MprI (34), but the latter enzymes lack an additional P domain. P domains are primarily associated with subtilisin-like proprotein convertases of eukaryotes, which are serine endopeptidases. The P domain of these enzymes is located immediately downstream of the catalytic domain. It has been reported that the P domain is required for the folding and regulation of pH dependence of the catalytic domain (49, 59). In addition, P domains contain the amino acid sequence RGD. The integrity of this sequence is necessary for zymogen and C-terminal processing and cellular trafficking of proprotein convertase (45). The amino acid sequences of MprA and MprB contained an RGD sequence (R794 to D796). Our studies with the derivatives lacking the PPC domains and the P domain (MprA.1 and MprB.1) revealed that the domains of the C-terminal extension are not crucial for enzyme activity, localization of the enzymes, or the pH optimum of enzyme activity. Taking into account that mprA and mprB were expressed in a foreign host, this could be different in the organisms from which mprA and mprB originated.
In conclusion, the applied function-driven metagenomic approach resulted in identification of two extracellular metalloproteases with a novel domain structure. The function of the unique C-terminal extension, especially the role of the P domain, will be examined in the future. Since the recovered DNA fragments containing the protease genes were probably derived from members of the genus Xanthomonas, we will also use genetically accessible Xanthomonas strains as hosts for mprA and mprB.
Acknowledgments
We thank the Bundesministerium für Bildung und Forschung for generous support.
Footnotes
Published ahead of print on 13 February 2009.
Supplemental material for this article may be found at http://aem.asm.org/.
REFERENCES
- 1.Altschul, S. F., W. Gish, W. Miller, E. W. Myers, and D. J. Lipman. 1990. Basic local alignment search tool. J. Mol. Biol. 215:403-410.
- 2.Ausubel, F. M., R. Brent, R. E. Kingston, D. D. Moore, J. G. Seidman, J. A. Smith, and K. Struhl. 1987. Current protocols in molecular biology. John Wiley & Sons, Inc., New York, NY.
- 3.Bendtsen, J. D., H. Nielsen, G. von Heijne, and S. Brunak. 2004. Improved prediction of signal peptides: SignalP 3.0. J. Mol. Biol. 340:783-795.
- 4.Bitar, A. P., M. Cao, and H. Marquis. 2008. The metalloprotease of Listeria monocytogenes is activated by intramolecular autocatalysis. J. Bacteriol. 272:217-228.
- 5.Bradford, M. M. 1976. A rapid and sensitive method for the quantitation of microgram quantities of protein utilizing the principle of protein-dye binding. Anal. Biochem. 72:248-254.
- 6.Chuang, Y. C., T. M. Chang, and M. C. Chang. 1997. Cloning and characterization of the gene (empV) encoding extracellular metalloprotease from Vibrio vulnificus. Gene 189:163-168.
- 7.Daniel, R. 2005. The metagenomics of soil. Nat. Rev. Microbiol. 3:470-478.
- 8.David, V. A., A. H. Deutch, A. Sloma, A. Pawlyk, A. Ally, and D. R. Durham. 1992. Cloning, sequencing and expression of the gene encoding the extracellular neutral protease, vibriolysin, of Vibrio proteolyticus. Gene 112:107-112.
- 9.Elend, C., C. Schmeisser, C. Leggewie, P. Babiak, J. D. Carballeira, H. L. Steele, J.-L. Reymond, K.-E. Jaeger, and W. R. Streit. 2006. Isolation and biochemical characterization of two novel metagenome-derived esterases. Appl. Environ. Microbiol. 72:3637-3645.
- 10.Finn, R. D., J. Mistry, B. Schuster-Böckler, S. Griffiths-Jones, V. Hollich, T. Lassmann, S. Moxon, M. Marshall, A. Khanna, R. Durbin, S. R. Eddy, E. L. L. Sonnhammer, and A. Bateman. 2006. Pfam: clans, web tools and services. Nucleic Acids Res. 34:D247-D251.
- 11.Gray, L., N. Mackman, J. M. Nicaud, and I. B. Holland. 1986. The carboxy-terminal region of haemolysin 2001 is required for secretion of the toxin from Escherichia coli. Mol. Gen. Genet. 205:127-133.
- 12.Guan, C., J. Ju, B. R. Borlee, L. L. Williamson, B. Shen, K. F. Raffa, and J. Handelsman. 2007. Signal mimics derived from a metagenomic analysis of the gypsy moth gut microbiota. Appl. Environ. Microbiol. 73:3669-3676.
- 13.Gupta, R., Q. K. Beg, and P. Lorenz. 2002. Bacterial alkaline proteases: molecular approaches and industrial applications. Appl. Microbiol. Biotechnol. 59:15-32.
- 14.Handelsman, J. 2004. Metagenomics: application of genomics to uncultured microorganisms. Microbiol. Mol. Biol. Rev. 68:669-685.
- 15.Häse, C. C., and R. A. Finkelstein. 1993. Bacterial extracellular zinc-containing metalloproteases. Microbiol. Rev. 57:823-837.
- 16.Henne, A., R. A. Schmitz, M. Bömeke, G. Gottschalk, and R. Daniel. 2000. Screening of environmental DNA libraries for the presence of genes conferring lipolytic activity on Escherichia coli. Appl. Environ. Microbiol. 66:3113-3116.
- 17.Henne, A., R. Daniel, R. A. Schmitz, and G. Gottschalk. 1999. Construction of environmental DNA libraries in Escherichia coli and screening for the presence of genes conferring utilization of 4-hydroxybutyrate. Appl. Environ. Microbiol. 65:3901-3907.
- 18.Holmes, M. A., and B. W. Matthews. 1982. Structure of thermolysin refined at 1.6 Å resolution. J. Mol. Biol. 160:623-639.
- 19.Jones, B. V., F. Sun, and J. R. Marchesi. 2007. Using skimmed milk agar to functionally screen a gut metagenomic library for proteases may lead to false positives. Lett. Appl. Microbiol. 45:418-420.
- 20.Kim, S.-K., J.-Y. Yang, and J. Cha. 2002. Cloning and sequence analysis of a novel metalloprotease gene of Vibrio parahaemolyticus. Gene 282:277-286.
- 21.Knietsch, A., T. Waschkowitz, S. Bowien, A. Henne, and R. Daniel. 2003. Construction and screening of metagenomic libraries derived from enrichment cultures: generation of a gene bank for genes conferring alcohol oxidoreductase activity on Escherichia coli. Appl. Environ. Microbiol. 69:1408-1416.
- 22.Kothary, M. H., B. A. McCardell, C. D. Frazar, D. Deer, and B. D. Tall. 2007. Characterization of the zinc-containing metalloprotease encoded by zpx and development of a species-specific detection method for Enterobacter sakazakii. Appl. Environ. Microbiol. 73:4142-4151.
- 23.Kumar, C. G., and H. Takagi. 1999. Microbial alkaline proteases: from a bioindustrial viewpoint. Biotechnol. Adv. 17:561-594.
- 24.Laemmli, U. K. 1970. Cleavage of structural proteins during the assembly of the head of bacteriophage T4. Nature 227:680-685.
- 25.Lee, B.-M., Y.-J. Park, D.-S. Park, H.-W. Kang, J.-G. Kim, E.-S. Song, I.-C. Park, U.-H. Yoon, J.-H. Hahn, B.-S. Koo, G.-B. Lee, H. Kim, H.-S. Park, K.-O. Yoon, J.-H. Kim, C. Jung, N.-H. Koh, J.-S. Seo, and S.-J. Go. 2005. The genome sequence of Xanthomonas oryzae pathovar oryzae KACC 10331, the bacterial blight pathogen. Nucleic Acids Res. 33:577-586.
- 26.Lee, D.-G., J. H. Jeon, M. K. Jang, N. Y. Kim, J. H. Lee, J.-H. Lee, S.-J. Kim, G.-D. Kim, and S.-H. Lee. 2007. Screening and characterization of a novel fibrinolytic metalloprotease from a metagenomic library. Biotechnol. Lett. 29:465-472.
- 27.Lee, S.-O., J. Kato, K. Nakashima, A. Kuroda, T. Ikeda, N. Takiguchi, and H. Ohtake. 2002. Cloning and characterization of extracellular metal protease gene of the algicidal marine bacterium Pseudoalteromonas sp. strain A28. Biosci. Biotechnol. Biochem. 66:1366-1369.
- 28.Leyns, F., M. de Cleene, J. Swings, and J. de Ley. 1984. The host range of the genus Xanthomonas. Bot. Rev. 50:305-355.
- 29.Lorenz, P. J., and J. Eck. 2005. Metagenomics and industrial applications. Nat. Rev. Microbiol. 3:510-516.
- 30.Majernik, A., G. Gottschalk, and R. Daniel. 2001. Screening of environmental DNA libraries for the presence of genes conferring Na+(Li+)/H+ antiporter activity on Escherichia coli: characterization of the recovered genes and the corresponding gene products. J. Bacteriol. 183:6645-6653.
- 31.Marchler-Bauer, A., J. B. Anderson, M. K. Derbyshire, C. DeWeese-Scott, N. R. Gonzales, M. Gwadz, L. Hao, S. He, D. I. Hurwitz, J. D. Jackson, Z. Ke, D. Krylov, C. J. Lanczycki, C. A. Liebert, C. Liu, F. Lu, S. Lu, G. H. Marchler, M. Mullokandov, J. S. Song, N. Thanki, R. A. Yamashita, J. J. Yin, D. Zhang, and S. H. Bryant. 2007. CDD: a conserved domain database for interactive domain family analysis. Nucleic Acids Res. 35:D237-D240.
- 32.Mirete, S., C. G. de Figueras, and J. E. González-Pastor. 2007. Novel nickel-resistance genes from the rhizosphere metagenome of acid mine drainage-adapted plants. Appl. Environ. Microbiol. 73:6001-6011.
- 33.Miyamoto, K., E. Nukui, M. Hirose, F. Nagai, T. Sato, Y. Inamori, and H. Tsujibo. 2002. A metalloprotease (MprIII) involved in the chitinolytic system of a marine bacterium, Alteromonas sp. strain O-7. Appl. Environ. Microbiol. 68:5563-5570.
- 34.Miyamoto, K., H. Tsujibo, E. Nukui, H. Itoh, Y. Kaidzu, and Y. Inamori. 2002. Isolation and characterization of the genes encoding two metalloproteases (MprI and MprII) from a marine bacterium, Alteromonas sp. strain O-7. Biosci. Biotechnol. Biochem. 66:416-421.
- 35.Miyoshi, S., K. Kawata, K. Tomochika, S. Shinoda, and S. Yamamoto. 2001. The C-terminal domain promotes the hemorrhagic damage caused by Vibrio vulnificus metalloprotease. Toxicon 39:1883-1886.
- 36.Miyoshi, S., and S. Shinoda. 2000. Microbial metalloproteases and pathogenesis. Microbes Infect. 2:91-98.
- 37.Pfennig, N., and K. D. Lippert. 1966. Über das Vitamin B12-Bedürfnis phototropher Schwefelbakterien. Arch. Microbiol. 55:245-256.
- 38.Pugsley, A. P. 1993. The complete general secretory pathway in gram-negative bacteria. Microbiol. Rev. 57:50-108.
- 39.Rao, M. B., A. M. Tanksale, M. S. Ghatge, and V. V. Deshpande. 1998. Molecular and biotechnological aspects of microbial proteases. Microbiol. Mol. Biol. Rev. 62:597-635.
- 40. Rawlings, N. D., and A. J. Barrett. 1995. Evolutionary families of metallopeptidases. Methods Enzymol. 248:183-228.
- 41. Rawlings, N. D., and A. J. Barrett. 2004. Introduction: metallopeptidases and their clans, p. 231-267. In A. J. Barrett, N. D. Rawlings, and J. F. Woessner (ed.), Handbook of proteolytic enzymes, 2nd ed. Elsevier, London, United Kingdom.
- 42. Rawlings, N. D., F. R. Morton, and A. J. Barrett. 2006. MEROPS: the peptidase database. Nucleic Acids Res. 34:D270-D272.
- 43. Riesenfeld, C. S., P. D. Schloss, and J. Handelsman. 2004. Metagenomics: genomic analysis of microbial communities. Annu. Rev. Genet. 38:525-552.
- 44. Rondon, M. R., P. R. August, A. D. Bettermann, S. F. Brady, T. H. Grossman, M. R. Liles, K. A. Loiacono, B. A. Lynch, L. A. MacNeil, C. Minor, C. L. Tiong, M. Gilman, M. S. Osburne, J. Clardy, J. Handelsman, and R. M. Goodman. 2000. Cloning the soil metagenome: a strategy for accessing the genetic and functional diversity of uncultured microorganisms. Appl. Environ. Microbiol. 66:2541-2547.
- 45. Rovere, C., J. Luis, J. C. Lissitzky, A. Basak, J. Marvaldi, M. Chretien, and N. G. Seidah. 1999. The RGD motif and the C-terminal segment of proprotein convertase 1 are critical for its cellular trafficking but not for its intracellular binding to integrin α5β1. J. Biol. Chem. 274:12461-12467.
- 46. Rutherford, K., J. Parkhill, J. Crook, T. Horsnell, P. Rice, M. A. Rajandream, and B. Barrell. 2000. Artemis: sequence visualization and annotation. Bioinformatics 16:944-945.
- 47. Santosa, D. A. 2001. Rapid extraction and purification of environmental DNA for molecular cloning applications and molecular diversity studies. Mol. Biotechnol. 17:59-64.
- 48. Schmitz, J. E., A. Daniel, M. Collin, R. Schuch, and V. A. Fischetti. 2008. Rapid DNA library construction for functional genomic and metagenomic screening. Appl. Environ. Microbiol. 74:1649-1652.
- 49. Seidah, N. G., and M. Chretien. 1997. Eukaryotic protein processing: endoproteolysis of precursor proteins. Curr. Opin. Biotechnol. 8:602-607.
- 50. Shinde, U., and M. Inouye. 2000. Intramolecular chaperones: polypeptide extensions that modulate protein folding. Semin. Cell Dev. Biol. 11:35-44.
- 51. Tang, B., S. Nirasawa, M. Kitaoka, C. Marie-Claire, and K. Hayashi. 2003. General function of N-terminal propeptide on assisting protein folding and inhibiting catalytic activity based on observations with a chimeric thermolysin-like protease. Biochem. Biophys. Res. Commun. 301:1093-1098.
- 52. Titani, K., M. A. Hermodson, L. H. Ericsson, K. A. Walsh, and H. Neurath. 1972. Amino acid sequence of thermolysin. Nat. New Biol. 238:35-37.
- 53. Van Hellemond, E. W., D. B. Janssen, and M. W. Fraaije. 2007. Discovery of a novel styrene monooxygenase originating from the metagenome. Appl. Environ. Microbiol. 73:5832-5839.
- 54. Wilkinson, D. E., T. Jeanicke, and D. A. Cowan. 2002. Efficient molecular cloning of environmental DNA from geothermal sediments. Biotechnol. Lett. 24:155-161.
- 55. Xiong, H., L. Song, Y. Xu, M.-Y. Tsoi, S. Dobretsov, and P.-Y. Qian. 2007. Characterization of proteolytic bacteria from the Aleutian deep-sea and their proteases. J. Ind. Microbiol. Biotechnol. 34:63-71.
- 56. Yamagata, Y., R. Abe, Y. Fujita, and E. Ichishima. 1995. Molecular cloning and nucleotide sequence of the 90K serine protease gene, hspK, from Bacillus subtilis (natto) no. 16. Curr. Microbiol. 31:340-344.
- 57. Yeats, C., S. Bentley, and A. Bateman. 2003. New knowledge from old: in silico discovery of novel protein domains in Streptomyces coelicolor. BMC Microbiol. 3:3.
- 58. Yun, J., S. Kang, S. Park, H. Yoon, M.-J. Kim, S. Heu, and S. Ryu. 2004. Characterization of a novel amylolytic enzyme encoded by a gene from a soil-derived metagenomic library. Appl. Environ. Microbiol. 70:7229-7235.
- 59. Zhou, A., S. Martin, G. Lipkind, J. La Mendola, and D. F. Steiner. 1998. Regulatory roles of the P domain of the subtilisin-like prohormone convertases. J. Biol. Chem. 273:11107-11114.