Abstract
The nucleotide sequence of the Clostridium thermocellum F7 cbhA gene, coding for the cellobiohydrolase CbhA, has been determined. An open reading frame encoding a protein of 1,230 amino acids was identified. Removal of a putative signal peptide yields a mature protein of 1,203 amino acids with a molecular weight of 135,139. Sequence analysis of CbhA reveals a multidomain structure of unusual complexity consisting of an N-terminal cellulose binding domain (CBD) homologous to CBD family IV, an immunoglobulin-like β-barrel domain, a catalytic domain homologous to cellulase family E1, a duplicated domain similar to fibronectin type III (Fn3) modules, a CBD homologous to family III, a highly acidic linker region, and a C-terminal dockerin domain. The cellulosomal localization of CbhA was confirmed by Western blot analysis employing polyclonal antibodies raised against a truncated enzymatically active version of CbhA. CbhA was identified as cellulosomal subunit S3 by partial amino acid sequence analysis. Comparison of the multidomain structures indicates striking similarities between CbhA and a group of cellulases from actinomycetes. Average linkage cluster analysis suggests a coevolution of the N-terminal CBD and the catalytic domain and its spread by horizontal gene transfer among gram-positive cellulolytic bacteria.
Numerous proteins of higher organisms have a multidomain architecture consisting of strings of mobile modules (10). Many of the modules identified so far have defined binding functions, but some may just act as simple spacer elements required only to arrange binding surfaces in space. Common types of constituent modules found in extracellular mosaic proteins are the fibronectin type III (Fn3) domain and the variants of the immunoglobulin (Ig) domain. These modules have very similar three-dimensional folds that form a sandwich of two antiparallel β-sheets with slightly different strand topologies (7, 25). The broad distribution of these modules in animal proteins is often regarded as evidence for exon shuffling. Many modules are found in multiple copies resulting from several gene duplications after the original shuffling event.
Large mosaic proteins are conspicuously absent in plants and fungi but appear to be widespread among bacteria. Thus, cellulases and other glycohydrolases from diverse bacteria have multidomain structures containing, in addition to their catalytic domains, several noncatalytic domains involved in substrate binding or specific protein interactions (46). A particularly interesting example is the cellulosome of Clostridium thermocellum, a cellulolytic multienzyme complex located at the cell surface and consisting of numerous catalytic components, including β-1,4-endoglucanases, cellobiohydrolases, and hemicellulases, attached to the cellulosome-integrating protein (scaffoldin) CipA (3, 4). This attachment is mediated by the conserved dockerin domain of the catalytic subunits and the iterated cohesin domains of CipA (43). Targeting of the cellulosome to its cellulose substrate is accomplished primarily by the cellulose-binding domain (CBD) of CipA.
In this paper, we report the structure of the C. thermocellum F7 cellobiohydrolase gene cbhA and the encoded cellulase, CbhA. This enzyme, formerly designated CBH3, has been characterized as a cellobiohydrolase by its ability to hydrolyze crystalline cellulose, yielding cellobiose as the only degradation product (36, 44, 49). It will be shown that CbhA has a highly complex multidomain structure containing, in addition to an Ig-like domain and a catalytic domain homologous to cellulase family E1, two distinct CBDs, a duplicated Fn3-like module, and a dockerin domain. Evidence identifying CbhA as cellulosomal component S3 is presented.
MATERIALS AND METHODS
Bacterial strains and growth conditions.
Escherichia coli TG1 harboring recombinant plasmid pCU303 or pCU304 (49) was aerated at 37°C in Luria broth supplemented with ampicillin (0.1 mg/ml). C. thermocellum F7 (obtained from the Institute of Microbial Biochemistry and Physiology, RAS, Puschino, Moscow Region, Russia) was grown under strict anaerobiosis at 60°C in GS-2 medium (21).
Sequence analysis.
The DNA sequence was determined from supercoiled double-stranded plasmid DNA for both strands by using the Sequenase kit (Pharmacia) for extension of 5′ biotinylated primers. DNA fragments were detected with a GATC 1500 Direct-Blotting-Electrophoresis apparatus (GATC, Konstanz, Germany) using streptavidin-conjugated alkaline phosphatase and nitroblue tetrazolium–5-bromo-4-chloro-3-indolylphosphate (Serva) as the chromogenic substrate. Sequence data were analyzed with the DNASIS software package (Hitachi Software Engineering). Multiple sequence alignments were carried out by the CLUSTAL procedure (19).
Hydrophobic cluster analysis (HCA) was performed by the method of Gaboriaud et al., employing a simplified two-dimensional sequence representation (18). To define hydrophobic clusters, F, I, L, M, V, W, and Y were considered hydrophobic amino acids. Alanine was considered hydrophobic only within a hydrophobic cluster. To evaluate the correspondence between two HCA patterns, a matching score [(2 CR × 100)/(RC1 + RC2)] was calculated, where RC1 and RC2 are the numbers of aligned hydrophobic residues in sequences 1 and 2, respectively, and CR is the total number of matching residues.
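The matching score defined above can be illustrated with a minimal Python sketch. The function names, the toy aligned sequences, and the counts are hypothetical. Two simplifications are assumptions of the sketch rather than part of the published procedure: alanine is never counted as hydrophobic, and a "matching residue" is taken to be an alignment column in which both sequences carry a hydrophobic residue.

```python
# Hydrophobic residues as listed in the text (alanine is ignored in this sketch).
HYDROPHOBIC = set("FILMVWY")


def hca_counts(aligned1: str, aligned2: str) -> tuple[int, int, int]:
    """Return (CR, RC1, RC2) for two aligned sequences of equal length."""
    if len(aligned1) != len(aligned2):
        raise ValueError("aligned sequences must have equal length")
    rc1 = sum(a in HYDROPHOBIC for a in aligned1)
    rc2 = sum(b in HYDROPHOBIC for b in aligned2)
    # Assumption: a match is a column that is hydrophobic in both sequences.
    cr = sum(a in HYDROPHOBIC and b in HYDROPHOBIC
             for a, b in zip(aligned1, aligned2))
    return cr, rc1, rc2


def hca_matching_score(cr: int, rc1: int, rc2: int) -> float:
    """Matching score in percent: (2 * CR * 100) / (RC1 + RC2)."""
    if rc1 + rc2 == 0:
        raise ValueError("RC1 + RC2 must be positive")
    return (2 * cr * 100) / (rc1 + rc2)


# Hypothetical counts chosen only for illustration.
print(f"{hca_matching_score(cr=16, rc1=20, rc2=20):.1f}%")  # prints 80.0%
```

With real data, the counts would come from the HCA plots themselves, where cluster shape and not only column position decides whether two residues correspond.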
Purification of truncated CbhA protein.
A 5-liter culture of E. coli TG1(pCU304) was harvested by centrifugation, washed with 50 mM phosphate-citrate (PC) buffer (pH 6.3), suspended in 200 ml of buffer containing 2 mM phenylmethylsulfonyl fluoride, and sonicated by using an ultrasonic disintegrator (MSE). Cell extracts were heated for 30 min at 60°C and centrifuged (10,000 × g, 20 min). The cleared crude extract was precipitated with ammonium sulfate (60% saturation). The precipitate was collected by centrifugation and dissolved in 50 ml of PC buffer.
Column chromatography was performed at room temperature with a fast-performance liquid chromatography system (Pharmacia). Aliquots (1.5 ml) were loaded on a 20- by 900-mm Toyopearl HW-60 column (Toyo-Soda, Shinanyo, Japan) equilibrated with PC buffer and eluted with the same buffer at a flow rate of 0.7 ml/min. Pooled fractions with cellobiohydrolase activity were applied to a MonoQ HR 10/10 column. Elution was performed with a linear NaCl gradient (0.0 to 0.4 M) in PC buffer. Fractions containing CbhA, which eluted at 0.3 M NaCl, were dialyzed, concentrated, and purified to electrophoretic homogeneity by gel filtration on a Superose-12 HR column (16 by 500 mm).
Enzyme assay.
Cellobiohydrolase activity was assayed at 60°C for 10 min in PC buffer (pH 6.0) by using p-nitrophenyl-β-d-cellobioside (1 mM) as the substrate. Reactions were terminated by the addition of 1 M Na2CO3. One enzyme unit corresponds to the release of 1 μmol of p-nitrophenol per min.
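The unit definition amounts to a simple rate calculation, sketched below in Python. The function names and the example quantities are hypothetical, and the conversion from absorbance to micromoles of p-nitrophenol (which depends on a calibration not given here) is deliberately left out.

```python
def enzyme_units(umol_p_nitrophenol: float, minutes: float) -> float:
    """Enzyme units: micromoles of p-nitrophenol released per minute."""
    if minutes <= 0:
        raise ValueError("assay time must be positive")
    return umol_p_nitrophenol / minutes


def specific_activity(units: float, mg_protein: float) -> float:
    """Specific activity in units per milligram of protein."""
    if mg_protein <= 0:
        raise ValueError("protein amount must be positive")
    return units / mg_protein


# Hypothetical assay: 0.5 umol released during the 10-min incubation.
units = enzyme_units(0.5, 10)
print(units, specific_activity(units, 0.02))
```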
Preparation of cellulosomes.
A 0.5-liter culture of C. thermocellum F7 was grown for 36 h in GS-2 medium containing filter paper as a sole carbon source. Cells were harvested by centrifugation, washed six times with 250 ml of deionized water, and resuspended in 30 ml of 100 mM acetate buffer (pH 5.7) containing 10 mM CaCl2, 2 mM EDTA, and 5 mM dithiothreitol. The suspension was sonicated for 3 min in an MSE ultrasonic disintegrator and dialyzed at 50°C against acetate buffer to completely hydrolyze the remaining cellulose fibers (37). After centrifugation (30,000 × g, 30 min), the supernatant was concentrated by ultrafiltration (XM300 membrane; Amicon) to 1 ml and applied to a Superose 6 HR 10/30 column (Pharmacia) equilibrated with 50 mM Tris-HCl (pH 7.5). The purified cellulosomes eluted near the void volume of the column.
Immunological methods.
Polyclonal antibodies were raised in white rabbits by injection of 0.25 mg of recombinant CbhA protein in Freund’s complete adjuvant (Amersham). Booster injections were given after 7 days, and bleeding was performed after 14 days. The serum was purified by using a serum IgG column and checked for specificity. For Western blot analysis, sodium dodecyl sulfate (SDS)–12% polyacrylamide gel electrophoresis (PAGE) slabs of purified cellulosomes were blotted onto a nitrocellulose membrane. The replicates were incubated with anti-CbhA rabbit serum and subjected to immunostaining using donkey anti-rabbit serum conjugated to horseradish peroxidase (Amersham) and 4-chloro-1-naphthol as a chromogenic substrate.
Protein cleavage, isolation of peptides, and sequencing of peptides and N termini.
Cellulosomal proteins (100 μg) were separated by SDS–10% PAGE and stained with Coomassie blue. The band corresponding to subunit S3 was cut out and incubated with 2 μg of endoproteinase LysC (Boehringer) in 200 μl of 0.1 M Tris-HCl (pH 8.5) for 6 h at 37°C. The peptide mixture was separated by reversed-phase high-performance liquid chromatography on a Supersphere 60 RP select B column (Merck) at a flow rate of 0.3 ml/min. Solvent A was 0.1% trifluoroacetic acid, and solvent B was 0.1% trifluoroacetic acid in acetonitrile. The gradient of 0 to 70% solvent B was run in 70 min. Selected peptide-containing fractions were subjected to automated sequencing. N-terminal amino acid sequences were determined by Edman degradation using a Procise 492 protein sequencer (Applied Biosystems). The phenylthiohydantoin derivatives were identified by reversed-phase high-performance liquid chromatography.
Nucleotide sequence accession number.
The nucleotide and amino acid sequences reported in this study have been submitted to GenBank under accession no. X80993.
RESULTS
Nucleotide sequence of the cbhA gene.
The recombinant plasmid pCU303 carries a 10.7-kb insert of C. thermocellum including the cellobiohydrolase gene cbhA. EcoRI digestion of pCU303 resulted in the deletion of a 7.3-kb DNA segment, yielding plasmid pCU304 (49). Sequencing the insert of pCU304 revealed that EcoRI cleavage had removed the 5′-end portion of the cbhA gene, leading to the production of a truncated enzyme species still exhibiting cellobiohydrolase activity. Therefore, the cbhA sequence was completed by sequencing the corresponding region of pCU303 by using specific oligonucleotide primers.
The sequenced region (4,183 bp) contained only one long open reading frame (ORF) of 3,690 nucleotides encoding a protein of 1,230 amino acids (Fig. 1). The putative initiation codon ATG was preceded at a spacing of 6 bp by a potential ribosome-binding site with a calculated free energy of Shine-Dalgarno base pairing of −66.5 kJ/mol. The ochre stop codon at position 4129 is followed by another in-frame ochre stop codon at position 4153. As observed previously for other C. thermocellum genes (1), the coding sequence and its flanking regions differed markedly in their G+C contents (43.0 and 29.7%, respectively). A palindromic sequence with a free energy for RNA hairpin formation of −81.2 kJ/mol is located immediately downstream of the ORF. This dyad symmetry element, which is followed by a run of 5 T’s, might function as a factor-independent transcription terminator. Sequence inspection did not reveal any consensus promoter sequence recognized by bacterial RNA polymerases.
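The relationship between ORF length and protein length, and the G+C comparison, can be reproduced with a short Python sketch. The function names and the toy sequence are hypothetical. The sketch assumes that the stated ORF length of 3,690 nucleotides excludes the stop codon, which is what the figures in the text imply (3,690 / 3 = 1,230).

```python
def orf_to_residues(orf_nt: int) -> int:
    """Number of encoded amino acids for an ORF length that excludes the stop codon."""
    if orf_nt <= 0 or orf_nt % 3 != 0:
        raise ValueError("ORF length must be a positive multiple of 3")
    return orf_nt // 3


def gc_content(seq: str) -> float:
    """Percentage of G and C bases in a DNA sequence (case-insensitive)."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return 100.0 * sum(base in "GC" for base in seq) / len(seq)


print(orf_to_residues(3690))   # prints 1230
print(gc_content("GGCCAATT"))  # toy sequence, not taken from cbhA
```

Applied separately to the coding sequence and to the flanking regions of the deposited sequence, `gc_content` would give the two percentages compared in the text.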
FIG. 1.
Nucleotide and deduced amino acid sequences of the cbhA gene. The potential ribosome-binding site (SD) is in boldface type and underlined. A palindrome is indicated by arrows facing each other. The putative leader sequence is indicated by italic type. The segments encoding the different regions of CbhA are indicated by boxes of different patterns: ▨, CBD family IV; , Ig-like domain; ▪, catalytic domain; ▤, Fn3-like domain; ▧, CBD family III; ▥, dockerin domain. The underlined amino acids were determined for cellulosomal protein S3 by liquid-phase sequencing.
Multidomain structure of CbhA.
Analysis of the amino acid sequence of CbhA derived from the nucleotide sequence revealed a multidomain structure of unexpected complexity (Fig. 1). Most structural elements could be readily identified by sequence comparison. Thus, the N-terminal sequence exhibits the typical features of a bacterial signal peptide required for protein secretion (50) with a predicted cleavage site between position 27 (Ala) and position 28 (Leu). Removal of the signal peptide yields a mature protein of 1,203 amino acids with a molecular weight of 135,139.
The central region of CbhA contains the catalytic domain, which is homologous to cellulase subfamily E1 (46). It exhibits 38 to 40% sequence identity with the catalytic domains of a carboxymethylcellulase from Pseudomonas fluorescens (14) and a group of cellulases from gram-positive bacteria with high G+C contents, including endoglucanase E1 from Thermomonospora fusca (26), endoglucanase CenC from Cellulomonas fimi (9), and endoglucanase Cel1 from Streptomyces reticuli (42). On the other hand, only 20 to 22% sequence identity was observed between the catalytic domain of CbhA and the C. thermocellum cellulases CelD (24) and CelJ (1), two other members of cellulase subfamily E1. As observed for all enzymes of this subfamily, the catalytic domain of CbhA is preceded by an Ig-like β-barrel domain of unknown function (27).
The catalytic core region of CbhA is flanked by two distinct CBDs. The N-terminal domain is homologous to family IV substrate-binding domains of bacterial cellulases and endo-1,3-β-glucanases (Fig. 2), whereas the C-terminal domain is a member of CBD family III (Fig. 3). Both domains consist of two antiparallel β-sheets with the topology of a jelly roll β-sandwich (23, 48). Substrate binding is mediated by a strip of highly conserved aromatic residues flanked by polar hydrogen-bonding groups. The family III CBD of the C. thermocellum scaffoldin CipA also contains a Ca2+ binding site (48), which seems to be present in all members of this family.
FIG. 2.
Alignment of amino acid sequences of family IV CBDs of bacterial cellulases and endo-1,3-β-glucanases. Abbreviations and accession numbers: Cth-LicA, C. thermocellum LicA, X89732; Tne-LamA, Thermotoga neapolitana LamA (54), Z47974; Cth-CbhA, C. thermocellum CbhA; Cce-CelE, Clostridium cellulolyticum CelE (2), Q46002; Cfi-CenC, Cellulomonas fimi CenC (9), P14090; Sre-Cel1, S. reticuli Cel1 (42), Q05156; Tfu-E1, Thermomonospora fusca E1 (26), Q08166. Shaded boxes highlight positions where residues are conserved in five or more family members, including those of CbhA. The conserved aromatic residues are indicated by asterisks. All sequences are numbered from Met-1.
FIG. 3.
Alignment of amino acid sequences of selected CBDs from family III. Abbreviations and accession numbers: Cth-CbhA, C. thermocellum CbhA; Cth-CipA, C. thermocellum CipA (12), X67506; Csa-CelB, Caldicellulosiruptor saccharolyticus CelB (41), X13602; Cst-CelZ, Clostridium stercorarium CelZ (20), X55299; Cth-CelI, C. thermocellum CelI (17), L04735; Bla-CelA, Bacillus lautus CelA (15), M76588; Eca-CelV, Erwinia carotovora CelV (33), X79241. Shaded boxes highlight positions where residues are conserved in four or more family members, including those of CbhA. The conserved aromatic residues and the residues that are implicated in Ca2+ binding are indicated by asterisks and solid triangles, respectively. All sequences are numbered from Met-1.
Identification of a novel Fn3-like domain.
Inspection of the protein segment joining the CbhA catalytic domain and the family III CBD by HCA (13, 29) revealed the presence of a repeated domain (Fig. 4). Although the aligned sequences exhibit only 26% identity, their HCA matching score is 80%, which is considered strong evidence for sequence homology (13). The duplicated domain showed no obvious homology to other noncatalytic domains but appeared to be distantly related to the Fn3-like domains of T. fusca endoglucanase E1 and T. fusca exoglucanase E4. Although the similarity is barely detectable at the amino acid sequence level, high-accuracy secondary-structure prediction (11) suggests a β-sheet topology strikingly similar to that of Fn3 modules (Fig. 5). The duplicated CbhA domain also resembles Fn3-like domains in amino acid composition, exhibiting an increased content of valine and hydroxylated aliphatic amino acids (data not shown).
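For comparison with the HCA matching score, percent identity between two aligned sequences can be computed as in the following Python sketch. The function name and the toy alignment are hypothetical. The choice of denominator (columns in which neither sequence has a gap) is an assumption, since the text does not state which convention was used.

```python
def percent_identity(aligned1: str, aligned2: str, gap: str = "-") -> float:
    """Percent identity over alignment columns where neither sequence has a gap."""
    if len(aligned1) != len(aligned2):
        raise ValueError("aligned sequences must have equal length")
    # Keep only columns that are ungapped in both sequences.
    pairs = [(a, b)
             for a, b in zip(aligned1.upper(), aligned2.upper())
             if a != gap and b != gap]
    if not pairs:
        raise ValueError("no ungapped columns to compare")
    matches = sum(a == b for a, b in pairs)
    return 100.0 * matches / len(pairs)


# Toy alignment, not taken from CbhA: 4 identical residues in 5 ungapped columns.
print(percent_identity("ACDE-F", "ACGEKF"))  # prints 80.0
```

Because identity counts only exact residue matches, two repeats can score low here while still sharing most of their hydrophobic cluster positions, which is the situation described for the duplicated CbhA domain.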
FIG. 4.
HCA plots of CbhA amino acid positions 825 to 912 (A) and 914 to 1000 (B). Hydrophobic amino acids are shown as gray circles with conserved positions highlighted in dark gray. Proline residues are shown as black circles, and other helix-breaking amino acids (D, G, S, and N) found predominantly in loop regions are shown as white circles.
FIG. 5.
Secondary-structure prediction of Fn3-like modules. Abbreviations and accession numbers: Tfu-E4, Thermomonospora fusca exoglucanase E4 (26), L20093; Hum-Fib, human fibronectin (34), P02751. Other abbreviations are as described in the legend to Fig. 2. Secondary-structure states of amino acids were predicted by the PREDATOR program (11) and are represented by an “E” (extended or sheet) and a dash (coil). The seven antiparallel β-strands of the 10th Fn3 module of human fibronectin are designated by the letters A to G (34) at the bottom of the figure. All sequences are numbered from Met-1.
Cellulosomal localization.
The C-terminal segment of CbhA is made up of the highly conserved duplicated sequence of 24 amino acids constituting the cellulosomal dockerin module. This domain is separated from the family III CBD by an acidic 14-amino-acid linker sequence consisting of repeats of the tripeptide Pro-Glu-Glu (Fig. 1). The presence of a dockerin domain strongly suggests that CbhA is a cellulosome constituent. To confirm this conjecture, polyclonal antibodies were raised against the truncated CbhA protein expressed by pCU304. It should be pointed out that the cloned insert terminates at the EcoRI site at nucleotide 3688 and thus lacks the C-terminal portion of CbhA including the dockerin domain. The truncated protein is further processed upon expression in E. coli, yielding an enzymatically active protein of 80 kDa, which presumably consists only of the catalytic domain and the flanking Ig- and Fn3-like domains. Western blot analysis indicated that two cellulosomal proteins, S3 and S5, with apparent molecular masses of 150 and 98 kDa, respectively, reacted strongly with anti-CbhA antibodies (Fig. 6). The comparison of molecular masses indicates that CbhA might correspond to subunit S3. The identity of CbhA and S3 was established by amino acid sequence analysis. Due to blockage of the N terminus, partial sequences were determined upon cleavage of S3 with endoproteinase LysC. The two peptide sequences obtained (see Fig. 1) were fully consistent with the deduced amino acid sequence of CbhA.
FIG. 6.
Detection of CbhA in the cellulosome of C. thermocellum. (Left panel) Western blot analysis of cellulosomal proteins detected with a polyclonal antibody raised against truncated CbhA; (right panel) SDS-PAGE of cellulosomal proteins stained with Coomassie brilliant blue. Cellulosomal subunits S1 to S14 are indicated with corresponding molecular masses.
DISCUSSION
Although numerous cellulolytic and hemicellulolytic C. thermocellum enzymes are considered cellulosome constituents due to the presence of a dockerin domain, only a few have been correlated with cellulosomal subunits (1, 8, 16, 38, 53). Western blot analysis and amino acid sequence determination clearly demonstrate that CbhA is identical to cellulosomal protein S3. It should be noted that the molecular mass of S3 (150 kDa) determined by SDS-PAGE is considerably larger than the mass of CbhA (135 kDa) deduced from the DNA sequence. This difference might be due to an atypical electrophoretic mobility of CbhA possibly caused by the highly acidic linker sequence (positions 1150 to 1162). It was observed previously that the presence of linker regions rich in glutamic acid residues can retard migration of multidomain proteins in SDS-PAGE (31).
The immunological data suggest that S5 is either a structurally related protein or a proteolytic degradation product of CbhA. Formation of S5 by proteolytic cleavage of CbhA is consistent with the N-terminal sequence of S5 from C. thermocellum JW20 (8). The reported sequence LEDKS(S)KLPDYKNDL(L)YE is nearly identical to the N terminus of mature CbhA predicted from the sequence data (Fig. 1). Minor sequence variations could reflect differences between C. thermocellum JW20 and F7. The size of S5 (98 kDa) indicates that proteolytic cleavage might have occurred between the two Fn3-like modules of CbhA. Truncation of the C-terminal dockerin domain during cellulosome dissociation has recently been reported for subunit S8, which corresponds to cellobiohydrolase CelS (8).
The identification of CbhA and CelS as cellulosomal constituents S3 and S8, respectively, implies that the cellulosome contains at least two exoglucanases and refutes the early concept that the cellulosome consists entirely of endoglucanase activities (35). Both exoglucanases have been characterized as cellobiohydrolases (28, 36, 44, 49) but belong to different cellulase families. CbhA is a member of cellulase family E1, whereas CelS belongs to family L (46). The two enzymes also differ strikingly in their domain structures. CelS is less complex and consists of a catalytic domain and a C-terminal dockerin domain (51). Due to its lack of CBDs, CelS requires the presence of CipA for the efficient hydrolysis of crystalline cellulose. It has been proposed that both proteins interact synergistically in an enzyme (CelS)-anchor (CipA) manner (32, 52).
The multidomain structure of CbhA was unexpected, considering that the cellulosome is mainly an assembly of catalytic subunits, which are organized for concerted action and targeted to the insoluble substrate by the CipA protein (4). In particular, the presence of both an N-terminal and a C-terminal CBD is apparently redundant. However, it should be kept in mind that family IV and family III CBDs differ strikingly in their substrate specificity. Whereas family III domains bind specifically to crystalline cellulose, family IV domains bind with approximately equal affinities to amorphous cellulose, cellooligopentaose, and mixed-linkage β-glucans (22, 47). Conceivably, this binding site could participate directly in cellulose degradation by keeping the amorphous region in a noncrystalline state suitable for enzymatic hydrolysis. On the other hand, the C-terminal family III CBD might assist CipA in attaching the cellulosome to crystalline cellulose fibers.
The role of the other noncatalytic domains of CbhA is less obvious. It should be noted that the Ig-like β-barrel domain has so far been found only in members of cellulase family E1, where it is always positioned at the N terminus of the catalytic domain (see Fig. 7). It might therefore be specifically involved in the folding and/or stabilization of the catalytic α6/α6-barrel domain of this cellulase subfamily. In contrast, Fn3-like domains are found in various unrelated prokaryotic depolymerases in widely different arrangements (30). It is therefore likely that these domains have a similar function in prokaryotic and in eukaryotic exoproteins, namely, adhesion to cell surface receptors. In the case of CbhA, this original function became redundant upon integration of the enzyme into the cellulosomal complex. On the other hand, duplication of the module might be required for correct positioning of the C-terminal CBD with respect to the catalytic domain. Structure analysis has shown that such module pairs do not simply function as flexible spacer elements but adopt defined relative orientations stabilized by specific intermodule interactions (7, 39). This change of function could explain the sequence divergence from other prokaryotic Fn3-like modules.
FIG. 7.
Comparison of the domain structure of cellulases of subfamily E1. Domains and regions showing significant similarity are indicated by the same pattern. Abbreviations and accession numbers: Cth-CelJ, C. thermocellum CelJ (1), D83704; Cth-CelD, C. thermocellum CelD (24), X04584; Pfl-EglA, P. fluorescens EglA (14), X12570; Fsu-EgB, Fibrobacter succinogenes EgB (6), L14436; Bfi-CelD, Butyrivibrio fibrisolvens CelD (5), X55732; aa, amino acids. Other abbreviations for the enzymes are described in the legend to Fig. 2.
Comparison of the domain structures of various other cellulases of subfamily E1 indicates striking similarities between CbhA and a group of enzymes from gram-positive bacteria with high G+C content (Fig. 7). In particular, it is obvious that the endoglucanase E1 of T. fusca has a similar functional design consisting of an N-terminal and central catalytic region involved in cellulose hydrolysis and a C-terminal portion involved in substrate and cell surface adherence. Average linkage cluster analysis of the N-terminal family IV CBD and the catalytic domains suggests coevolution of these two domains (Fig. 8). Apparently, this domain array arose by a rare recombination event and spread by horizontal transfer among gram-positive cellulolytic bacteria. Contrary to the proposed rearrangements of eukaryotic multidomain proteins due to exon shuffling, such domain arrays appear to be remarkably stable in bacteria, reflecting a fundamental difference in gene structure.
FIG. 8.
Average linkage cluster analysis. Similar amino acids were grouped by the classification of Risler et al. (40). The dendrogram was derived from pairwise similarity scores by the UPGMA (unweighted pair group method with arithmetic mean) procedure (45). Abbreviations for enzymes are described in the legends to Fig. 2 and 7.
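For readers unfamiliar with average linkage clustering, the following Python sketch shows the UPGMA merging rule on a toy distance table. It is an illustration under stated assumptions, not the software used for Fig. 8: the function name, the labels A, B, and C, and the distances are hypothetical, and the sketch operates on distances, so similarity scores would first have to be converted (for example, as 100 minus percent similarity, which is itself an assumption).

```python
from itertools import combinations


def upgma(names, dist):
    """Average linkage (UPGMA) clustering.

    names: list of leaf labels.
    dist:  dict mapping frozenset({a, b}) to the distance between leaves a and b.
    Returns (tree, merges): a nested-tuple tree and a list of
    (cluster1, cluster2, height) tuples in merge order.
    """
    # Each cluster key (a label or a nested tuple) maps to its member leaves.
    clusters = {name: [name] for name in names}
    merges = []

    def average_distance(c1, c2):
        # Unweighted mean over all leaf pairs drawn from the two clusters.
        m1, m2 = clusters[c1], clusters[c2]
        total = sum(dist[frozenset((a, b))] for a in m1 for b in m2)
        return total / (len(m1) * len(m2))

    while len(clusters) > 1:
        # Merge the pair of clusters with the smallest average distance.
        c1, c2 = min(combinations(clusters, 2),
                     key=lambda pair: average_distance(*pair))
        height = average_distance(c1, c2) / 2
        merges.append((c1, c2, height))
        members = clusters.pop(c1) + clusters.pop(c2)
        clusters[(c1, c2)] = members

    (tree,) = clusters
    return tree, merges


# Hypothetical distances between three sequences.
toy = {
    frozenset(("A", "B")): 2.0,
    frozenset(("A", "C")): 6.0,
    frozenset(("B", "C")): 6.0,
}
tree, merges = upgma(["A", "B", "C"], toy)
print(tree)
print(merges)
```

In this toy case A and B are joined first at height 1.0, and C joins the pair at height 3.0. Running the same procedure separately on the family IV CBDs and on the catalytic domains and comparing the resulting tree shapes is the kind of comparison on which the coevolution argument rests.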
ACKNOWLEDGMENTS
This work was supported in part by a grant from the Deutsche Forschungsgemeinschaft (SFB 145), by a NATO Collaborative Research grant (HTECH. CRG 930993), by a grant from the Volkswagenstiftung, and by a grant from the Russian Foundation of Basic Research.
REFERENCES
- 1.Ahsan M M, Kimura T, Karita S, Sakka K, Ohmiya K. Cloning, DNA sequencing, and expression of the gene encoding Clostridium thermocellumcellulase CelJ, the largest catalytic component of the cellulosome. J Bacteriol. 1996;178:5732–5740. doi: 10.1128/jb.178.19.5732-5740.1996. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 2.Bagnara-Tardif C, Gaudin C, Belaich A, Hoest P, Citard T, Belaich J P. Sequence analysis of a gene cluster encoding cellulases from Clostridium cellulolyticum. Gene. 1992;119:17–28. doi: 10.1016/0378-1119(92)90062-t. [DOI] [PubMed] [Google Scholar]
- 3.Bayer E A, Morag E, Lamed R. The cellulosome—a treasure-trove for biotechnology. Trends Biotechnol. 1994;12:379–386. doi: 10.1016/0167-7799(94)90039-6. [DOI] [PubMed] [Google Scholar]
- 4.Béguin P, Lemaire M. The cellulosome: an exocellular, multiprotein complex specialized in cellulose degradation. Crit Rev Biochem Mol Biol. 1996;31:201–236. doi: 10.3109/10409239609106584. [DOI] [PubMed] [Google Scholar]
- 5.Berger E, Jones W A, Jones D T, Woods D R. Sequencing and expression of a cellodextrinase (ced1) gene from Butyrivibrio fibrisolvens H17c cloned in Escherichia coli. Mol Gen Genet. 1990;223:310–318. doi: 10.1007/BF00265068. [DOI] [PubMed] [Google Scholar]
- 6.Broussolle V, Forano E, Gaudet G, Ribot Y. Gene sequence and analysis of protein domains of EGB, a novel family E endoglucanase from Fibrobacter succinogenesS58. FEMS Microbiol Lett. 1994;124:439–447. doi: 10.1111/j.1574-6968.1994.tb07321.x. [DOI] [PubMed] [Google Scholar]
- 7.Campbell I D, Spitzfaden C. Building proteins with fibronectin type III modules. Structure. 1994;2:333–337. doi: 10.1016/s0969-2126(00)00034-4. [DOI] [PubMed] [Google Scholar]
- 8.Choi S K, Ljungdahl L G. Dissociation of the cellulosome of Clostridium thermocellumin the presence of ethylenediaminetetraacetic acid occurs with the formation of truncated polypeptides. Biochemistry. 1996;35:4897–4905. doi: 10.1021/bi9524629. [DOI] [PubMed] [Google Scholar]
- 9.Coutinho J B, Moser B, Kilburn D G, Warren R A J, Miller R C. Nucleotide sequence of the endoglucanase C gene (cenC) of Cellulomonas fimi, its high-level expression in Escherichia coli, and characterization of its products. Mol Microbiol. 1991;5:1221–1233. doi: 10.1111/j.1365-2958.1991.tb01896.x. [DOI] [PubMed] [Google Scholar]
- 10.Doolittle R F. The multiplicity of domains in proteins. Annu Rev Biochem. 1995;64:287–314. doi: 10.1146/annurev.bi.64.070195.001443. [DOI] [PubMed] [Google Scholar]
- 11.Frishman D, Argos P. Seventy-five percent accuracy in protein secondary structure prediction. Proteins Struct Funct Genet. 1997;27:329–335. doi: 10.1002/(sici)1097-0134(199703)27:3<329::aid-prot1>3.0.co;2-8. [DOI] [PubMed] [Google Scholar]
- 12.Fujino T, Beguin P, Aubert J P. Organization of a Clostridium thermocellumgene cluster encoding the cellulosomal scaffolding protein CipA and a protein possibly involved in attachment of the cellulosome to the cell surface. J Bacteriol. 1993;175:1891–1899. doi: 10.1128/jb.175.7.1891-1899.1993. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 13.Gaboriaud C, Bissery V, Benchetrit T, Mornon J P. Hydrophobic cluster analysis: an efficient new way to compare and analyse amino acid sequences. FEBS Lett. 1987;224:149–155. doi: 10.1016/0014-5793(87)80439-8. [DOI] [PubMed] [Google Scholar]
- 14.Hall J, Gilbert H J. The nucleotide sequence of a carboxycellulase gene from Pseudomonas fluorescens subsp. cellulosa. Mol Gen Genet. 1988;213:112–117. doi: 10.1007/BF00333406. [DOI] [PubMed] [Google Scholar]
- 15.Hansen C K, Diderichsen B, Jorgensen P L. celA from Bacillus lautusPL236 encodes a novel cellulose-binding endo-β-1,4-glucanase. J Bacteriol. 1992;174:3522–3531. doi: 10.1128/jb.174.11.3522-3531.1992. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 16.Hayashi H, Takagi K-I, Fukumura M, Kimura T, Karita S, Sakka K, Ohmiya K. Sequence of xynC and properties of XynC, a major component of the Clostridium thermocellumcellulosome. J Bacteriol. 1997;179:4246–4253. doi: 10.1128/jb.179.13.4246-4253.1997. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 17.Hazlewood G P, Davidson K, Laurie J I, Huskisson N S, Gilbert H J. Gene sequence and properties of CelI, a family E endoglucanase from Clostridium thermocellum. J Gen Microbiol. 1993;139:307–316. doi: 10.1099/00221287-139-2-307. [DOI] [PubMed] [Google Scholar]
- 18.Henrissat B, Popineau Y, Kader Y. Hydrophobic cluster analysis of plant protein sequences. A domain homology between storage and lipid transfer proteins. Biochem J. 1988;255:901–905. doi: 10.1042/bj2550901. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 19.Higgins D G, Thompson J D, Bibson T J. Using CLUSTAL for multiple sequence alignments. Methods Enzymol. 1996;266:388–402. doi: 10.1016/s0076-6879(96)66024-8. [DOI] [PubMed] [Google Scholar]
- 20.Jauris S, Ruecknagel K P, Schwarz W H, Kratzsch P, Bronnenmeier K, Staudenbauer W L. Sequence analysis of the Clostridium stercorarium celZgene encoding a thermoactive cellulase (Avicelase I): identification of catalytic and cellulose-binding domains. Mol Gen Genet. 1990;223:258–267. doi: 10.1007/BF00265062. [DOI] [PubMed] [Google Scholar]
- 21.Johnson E A, Madia A, Demain A L. Chemically defined minimal medium for growth of the anaerobic cellulolytic thermophile Clostridium thermocellum. Appl Environ Microbiol. 1981;41:1060–1062. doi: 10.1128/aem.41.4.1060-1062.1981. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 22.Johnson P E, Tomme P, Joshi M D, McIntosh L P. Interaction of soluble cellooligosaccharides with the N-terminal cellulose-binding domain of Cellulomonas fimiCenC. 2. NMR and ultraviolet absorption spectroscopy. Biochemistry. 1996;35:13895–13906. doi: 10.1021/bi961186a. [DOI] [PubMed] [Google Scholar]
- 23.Johnson P E, Joshi M D, Tomme P, Kilburn D G, McIntosh L P. Structure of the N-terminal cellulose-binding domain of Cellulomonas fimiCenC determined by nuclear magnetic resonance spectroscopy. Biochemistry. 1996;35:14381–14394. doi: 10.1021/bi961612s. [DOI] [PubMed] [Google Scholar]
- 24.Joliff G, Béguin P, Aubert J P. Nucleotide sequence of the cellulase gene celD encoding endoglucanase D of Clostridium thermocellum. Nucleic Acids Res. 1986;14:8605–8613. doi: 10.1093/nar/14.21.8605.
- 25.Jones E Y. The immunoglobulin superfamily. Curr Opin Struct Biol. 1993;3:846–852.
- 26.Jung E D, Lao G, Irwin D, Barr B K, Benjamin A, Wilson D B. DNA sequences and expression in Streptomyces lividans of an exoglucanase gene and an endoglucanase gene from Thermomonospora fusca. Appl Environ Microbiol. 1993;59:3032–3043. doi: 10.1128/aem.59.9.3032-3043.1993.
- 27.Juy M, Amit A G, Alzari P M, Poljak R J, Claeyssens M, Béguin P, Aubert J P. Three-dimensional structure of a thermostable bacterial cellulase. Nature. 1992;357:89–91.
- 28.Kruus K, Wang W K, Wu J H D. Exoglucanase activities of the recombinant Clostridium thermocellum CelS, a major cellulosome component. J Bacteriol. 1995;177:1641–1644. doi: 10.1128/jb.177.6.1641-1644.1995.
- 29.Lemesle-Varloot L, Henrissat B, Gaboriaud C, Bissery V, Morgat A, Mornon J P. Hydrophobic cluster analysis: procedures to derive structural and functional information from 2-D-representation of protein sequences. Biochimie. 1990;72:555–574. doi: 10.1016/0300-9084(90)90120-6.
- 30.Little E, Bork P, Doolittle R F. Tracing the spread of fibronectin type III domains in bacterial glycohydrolases. J Mol Evol. 1994;39:631–643. doi: 10.1007/BF00160409.
- 31.Lück A, D’Haese J, Hinssen H. A gelsolin-related protein from lobster muscle: cloning, sequence analysis and expression. Biochem J. 1995;305:767–775. doi: 10.1042/bj3050767.
- 32.Lytle B, Myers C, Kruus K, Wu J H D. Interactions of the CelS binding ligand with various receptor domains of the Clostridium thermocellum cellulosomal scaffolding protein, CipA. J Bacteriol. 1996;178:1200–1203. doi: 10.1128/jb.178.4.1200-1203.1996.
- 33.Mae A, Heikinheimo R, Palva E T. Structure and regulation of the Erwinia carotovora subspecies carotovora SCC3193 cellulase gene celV1 and the role of cellulase in phytopathogenicity. Mol Gen Genet. 1995;247:17–26. doi: 10.1007/BF00425817.
- 34.Main L M, Harvey T S, Baron M, Boyd J, Campbell I D. The three-dimensional structure of the tenth type III module of fibronectin: an insight into RGD-mediated interactions. Cell. 1992;71:671–678. doi: 10.1016/0092-8674(92)90600-h.
- 35.Mayer F, Coughlan M P, Mori Y, Ljungdahl L G. Macromolecular organization of the cellulolytic enzyme complex of Clostridium thermocellum as revealed by electron microscopy. Appl Environ Microbiol. 1987;53:2785–2792. doi: 10.1128/aem.53.12.2785-2792.1987.
- 36.Mel’nik M S, Rabinovich M L, Voznyi I V. Cellobiohydrolase from Clostridium thermocellum, synthesized by a recombinant E. coli strain. Biokhimiya. 1991;56:1787–1797.
- 37.Morag E, Bayer E A, Lamed R. Affinity digestion for the near-total recovery of purified cellulosome from Clostridium thermocellum. Enzyme Microb Technol. 1992;14:289–292.
- 38.Morag E, Bayer E A, Hazlewood G P, Gilbert H J, Lamed R. Cellulase SS (CelS) is synonymous with the major cellobiohydrolase (subunit S8) from the cellulosome of Clostridium thermocellum. Appl Biochem Biotechnol. 1993;43:147–151. doi: 10.1007/BF02916439.
- 39.Potts J R, Campbell I D. Structure and function of fibronectin modules. Matrix Biol. 1996;15:313–320. doi: 10.1016/s0945-053x(96)90133-x.
- 40.Risler J L, Delorme M O, Delacroix H, Henaut A. Amino acid substitutions in structurally related proteins. A pattern recognition approach. Determination of a new and efficient scoring matrix. J Mol Biol. 1988;204:1019–1029. doi: 10.1016/0022-2836(88)90058-7.
- 41.Saul D J, Williams L C, Grayling R A, Chamley L W, Love D R, Bergquist P L. celB, a gene coding for a bifunctional cellulase from the extreme thermophile Caldocellum saccharolyticum. Appl Environ Microbiol. 1990;56:3117–3124. doi: 10.1128/aem.56.10.3117-3124.1990.
- 42.Schlochtermeier A, Walter S, Schröder J, Moorman M, Schrempf H. The gene encoding the cellulase (Avicelase) Cel1 from Streptomyces reticuli and analysis of protein domains. Mol Microbiol. 1992;6:3611–3621. doi: 10.1111/j.1365-2958.1992.tb01797.x.
- 43.Shimon L J W, Bayer E A, Morag E, Lamed R, Yaron S, Shoham Y, Frolow F. A cohesin domain from Clostridium thermocellum: the crystal structure provides new insights into cellulosome assembly. Structure. 1997;5:381–390. doi: 10.1016/s0969-2126(97)00195-0.
- 44.Singh R N, Akimenko V K. Isolation of a cellobiohydrolase of Clostridium thermocellum capable of degrading natural crystalline substrates. Biochem Biophys Res Commun. 1993;192:1123–1130. doi: 10.1006/bbrc.1993.1533.
- 45.Sokal R R, Sneath P H A. Principles of numerical taxonomy. San Francisco, Calif: Freeman; 1963.
- 46.Tomme P, Warren R A J, Gilkes N R. Cellulose hydrolysis by bacteria and fungi. Adv Microb Physiol. 1995;37:1–81. doi: 10.1016/s0065-2911(08)60143-5.
- 47.Tomme P, Creagh L, Kilburn D, Haynes C. Interaction of polysaccharides with the N-terminal cellulose-binding domain of Cellulomonas fimi CenC. 1. Binding specificity and calorimetric analysis. Biochemistry. 1996;35:13885–13894. doi: 10.1021/bi961185i.
- 48.Tormo J, Lamed R, Chirino A J, Morag E, Bayer E A, Shoham Y, Steitz T A. Crystal structure of a bacterial family-III cellulose-binding domain: a general mechanism for attachment to cellulose. EMBO J. 1996;15:5739–5751.
- 49.Tuka K, Zverlov V V, Bumazkin B K, Velikodvorskaya G A, Strongin A Y. Cloning and expression of Clostridium thermocellum genes coding for thermostable exoglucanases (cellobiohydrolases) in Escherichia coli cells. Biochem Biophys Res Commun. 1990;169:1055–1060. doi: 10.1016/0006-291x(90)92001-g.
- 50.von Heijne G. Signal sequences. The limits of variation. J Mol Biol. 1985;184:99–105. doi: 10.1016/0022-2836(85)90046-4.
- 51.Wang W K, Kruus K, Wu J H D. Cloning and DNA sequence of the gene coding for Clostridium thermocellum cellulase SS (CelS), a major cellulosome component. J Bacteriol. 1993;175:1293–1302. doi: 10.1128/jb.175.5.1293-1302.1993.
- 52.Wu J H D, Orme-Johnson W H, Demain A L. Two components of an extracellular protein aggregate of Clostridium thermocellum together degrade crystalline cellulose. Biochemistry. 1988;27:1703–1709.
- 53.Zverlov V V, Fuchs K P, Schwarz W H, Velikodvorskaya G. Purification and cellulosomal localization of Clostridium thermocellum mixed linkage β-glucanase LicB (1,3-1,4-β-D-glucanase). Biotechnol Lett. 1994;16:29–34.
- 54.Zverlov V V, Volkov I Y, Velikodvorskaya T V, Schwarz W H. Highly thermostable endo-1,3-β-glucanase (laminarinase) LamA from Thermotoga neapolitana: nucleotide sequence of the gene and characterization of the recombinant gene product. Microbiology. 1997;143:1701–1708. doi: 10.1099/00221287-143-5-1701.