Abstract
Carbohydrate-binding modules (CBMs) are appended to glycoside hydrolases and can contribute to the degradation of complex recalcitrant substrates such as the plant cell wall. For application in bioethanol production, novel enzymes with high catalytic activity against recalcitrant lignocellulosic material are being explored and developed. In this work, we report the functional and structural study of CBM_E1, which was discovered through a metagenomics approach and is the founding member of a novel CBM family, CBM81. CBM_E1, which is linked to an endoglucanase, displayed affinity for mixed linked β1,3-β1,4-glucans, xyloglucan, Avicel, and cellooligosaccharides. The crystal structure of CBM_E1 in complex with cellopentaose displayed a canonical β-sandwich fold comprising two β-sheets. The planar ligand binding site, observed in a parallel orientation with the β-strands, is a typical feature of type A CBMs, although the expected affinity for bacterial crystalline cellulose was not detected. Conversely, the binding to soluble glucans was enthalpically driven, which is typical of type B modules. These unique properties of CBM_E1 are at the interface between type A and type B CBMs.
Keywords: carbohydrate-binding protein, cellulose, glycoside hydrolase, metagenomics, X-ray crystallography
Introduction
Carbohydrate-binding modules (CBMs) are functionally and structurally discrete units that are linked to a range of carbohydrate active enzymes (CAZymes), although primarily glycoside hydrolases (GHs) (1). CBM-containing GHs target insoluble polysaccharides exemplified by the plant cell wall (2). The major industrial interest in these enzymes is for the production of lignocellulosic-derived ethanol, which is a promising alternative to environmentally damaging and finite fossil fuels. The study and development of GHs and their accessory modules, primarily CBMs, are crucial to overcome accessibility issues regarding the deconstruction of plant cell wall polysaccharides into their monomeric fermentable units (3, 4).
CBMs are capable of binding to different carbohydrates (5). Although non-catalytic, CBMs can increase the enzymatic efficiency of their associated catalytic modules against insoluble substrates through proximity effects (6).
Currently, CBMs are classified into 80 sequence-based families according to the CAZy database (7). In addition, it is proposed that CBM classification should also be based on the structure and binding specificity of these modules: “surface-binding to crystalline ligands” or type A; “endo-single chain glycan binding” or type B; and “exo-binding” or type C (1). Type A CBMs bind to insoluble crystalline carbohydrates, crystalline cellulose, and/or chitin and have a planar binding site, usually composed of three aromatic amino acids, displaying little or no affinity for soluble carbohydrates. Type B CBMs, on the other hand, have a binding site shaped as a cleft and bind to a wide variety of glycans such as xylans, mannans, galactans, glucans of mixed linkage, and non-crystalline cellulose. Aromatic amino acids also play an important role in ligand binding in type B CBMs (2). The binding sites of type C CBMs display a pocket topology explaining why they recognize the non-reducing termini of glycans.
Here we report the comprehensive structural and biochemical characterization of a founding member of a new CBM family, CBM_E1. The CBM comprises the C-terminal region of a GH family 5 (GH5) endoglucanase derived from the sugar cane soil metagenomic library (4). CBM_E1 shows biophysical properties that are characteristic of type B CBMs but contains a planar binding site, typical of type A CBMs. Thus, CBM_E1 displays unique properties that are at the interface between type A and type B CBMs.
Results
CBM_E1 Is the C-terminal Domain of CelE1, an Endoglucanase Derived from Sugar Cane Soil Metagenome
Previous functional screening of a metagenomic library using carboxymethyl cellulose as substrate identified a clone encoding a 427-amino acid polypeptide (4). The full-length CelE1 (accession number KF498957) contains a putative N-terminal signal peptide (prediction performed on Signal-Blast) (8), a catalytic module belonging to glycoside hydrolase family 5 (GH5), predicted on the dbCAN web server (9), a serine-rich linker sequence, and a C-terminal region of unknown function, which was defined here as CBM_E1 (Fig. 1; accession number KJ917170). Based on BLASTp sequence comparison, the most similar protein is the C-terminal region (unknown function) of another GH5 cellulase from Pseudomonas sp. ND137 (GenBank™ accession number BAB79288.1), presenting 31% identity to CBM_E1.
FIGURE 1.
A, domain architecture of CelE1. B, SDS-PAGE analysis of purified CBM_E1 (11.8 kDa). MW, molecular weight marker (PageRuler Unstained Protein Ladder); E1–E4, elutions from affinity chromatography; A23–A26, fractions from size exclusion chromatography using a Superdex 75 10/300 GL column.
Ligand Binding Properties of CBM_E1
To evaluate whether CBM_E1 binds to polysaccharides, the protein module, comprising residues 335–427 of full-length CelE1, was expressed in soluble form in Escherichia coli and purified to electrophoretic homogeneity. Isothermal titration calorimetry (ITC) was used to explore binding to potential soluble ligands. The data reported in Table 1 and Fig. 2 show that CBM_E1 bound with highest affinity to barley β-glucan, with a Ka of ∼10⁴ M⁻¹, whereas the interaction with xyloglucan was almost 10-fold weaker, and the protein displayed no recognition of birchwood xylan. With respect to oligosaccharides, CBM_E1 displayed affinity for cellohexaose (C6), with a Ka of 1.2 × 10⁴ M⁻¹, and cellopentaose (C5), with a Ka of 4.3 × 10³ M⁻¹. The calculated thermodynamic parameters indicate that protein binding to oligosaccharides is enthalpically driven, which is typical for type B CBMs (10). Binding to mannohexaose (M6), xylohexaose (X6), cellotetraose (C4), and XXXG (heptasaccharide derived from xyloglucan, where the X stands for a glucose decorated with xylose, and G indicates an undecorated glucose) was not detected.
TABLE 1.
Affinity and thermodynamic parameters of CBM_E1 binding to polysaccharides and oligosaccharides
| Ligand | Ka × 10⁴ (M⁻¹) | ΔG (kcal mol⁻¹) | ΔH (kcal mol⁻¹) | TΔS (kcal mol⁻¹) | nᵃ |
|---|---|---|---|---|---|
| Cellohexaose | 1.23 ± 0.21 | −5.57 | −8.24 ± 1.64 | −2.67 | 0.995 ± 0.172 |
| Cellopentaose | 0.433 ± 0.06 | −4.95 | −5.07 ± 1.66 | −0.12 | 0.955 ± 0.285 |
| Cellotetraose | NBᵇ | | | | |
| Mannohexaose | NB | | | | |
| Xylohexaose | NB | | | | |
| β-Glucan | 1.37 ± 0.473 | −5.63 | −3.31 ± 1.49 | 2.32 | 1.00 ± 0.4 |
| Xyloglucan | 0.5 ± 0.1 | −5.04 | −5.55 ± 2.76 | −0.51 | 1.03 ± 0.474 |
| XXXG | NB | | | | |
| Xylan birchwood | NB | | | | |
a The ITC data were fitted to a single site binding model for all ligands. For polysaccharide ligands in which the molar concentration of binding sites is unknown, the n value was iteratively fitted as close as possible to 1, by adjusting the molar concentration of the ligand (41).
b NB, no binding.
FIGURE 2.
Representative ITC data of CBM_E1 binding to soluble ligands. A, 5 mM C6; B, 5 mM C5; C, 5 mM C4; D, 1% β-glucan; E, 1% xyloglucan; F, 1% xylan birchwood. The ligand in the syringe was titrated into CBM_E1 (100 μM) in the cell. The top half of each panel shows the raw ITC heats; the bottom half displays the integrated peak areas fitted to a single-site binding model using MicroCal Origin software. ITC was carried out in 50 mM Na-HEPES, pH 8.0, at 25 °C.
A pulldown assay was performed using Avicel or bacterial microcrystalline cellulose (BMCC) to determine whether CBM_E1 binds to insoluble and crystalline forms of cellulose. As observed in Fig. 3, CBM_E1 bound to Avicel (the protein was present only in the insoluble fraction), but not to BMCC (CBM_E1 appeared only in the soluble fraction). Considering that BMCC has a high degree of crystallinity, ∼95% (11), and Avicel contains ∼40% of amorphous regions (12), the results obtained suggest that CBM_E1 does not bind to crystalline cellulose but targets disordered regions of the polysaccharide.
FIGURE 3.

A, pulldown assay for CBM_E1 with Avicel. B, pulldown assay with BMCC. The large arrows indicate the protein bound to Avicel (insoluble fraction), and the thin arrows indicate the protein in the soluble fraction, indicating that CBM_E1 does not bind to BMCC. MW, molecular weight marker (PageRuler Unstained Protein Ladder); E1, purified CBM_E1; I1, insoluble fraction from the assay with 10 μg of CBM_E1; S1, soluble fraction from the assay with 10 μg of CBM_E1; I2, insoluble fraction from the assay with 20 μg of CBM_E1; S2, soluble fraction from the assay with 20 μg of CBM_E1.
The Structure of CBM_E1
The native crystal structure of CBM_E1 was solved in the absence and presence of the ligand cellopentaose at 1.74 and 1.5 Å resolution, respectively. The initial phases were found by a single-wavelength anomalous dispersion method using the anomalous scattering of gadolinium. The derivative data set presented 12 heavy atoms with clear electron densities (data not shown). The statistics of data collection and refinement are described in Table 2.
TABLE 2.
Data collection and processing statistics
| | CBM_E1/C5 | CBM_E1/Apo | CBM_E1/Gd |
|---|---|---|---|
| Data collection | |||
| Wavelength (Å) | 0.96 | 1.46 | 1.48 |
| Space group | I2₁3 | I2₁ | I2₁3 |
| Cell dimensions | |||
| a, b, c (Å) | 88.8 | 28.4, 26.5, 99.3 | 88.41 |
| β (°) | | 92.8 | |
| Resolution (Å) | 44.4–1.5 (1.53–1.5) | 27.62–1.74 (1.78–1.74) | 31.26–1.8 (1.84–1.8) |
| Total reflections | 114,680 | 51,818 | 113,830 |
| Unique reflections | 18,747 | 7,638 | 10,827 |
| Average mosaicity | 0.23 | 0.43 | 0.85 |
| Rmerge | 0.08 (0.84) | 0.08 (0.97) | 0.13 (0.81) |
| Rpim | 0.05 (0.52) | 0.05 (0.58) | 0.06 (0.38) |
| I/σI | 17.9 (2.2) | 14.8 (1.9) | 15.1 (4.0) |
| Completeness (%) | 99.5 (90.7) | 98.9 (81.0) | 100 (100) |
| Redundancy | 6.1 (5.6) | 6.8 (5.5) | 10.5 (10.3) |
| CC1/2 (%) | 99.9 (70.4) | 99.8 (32.9) | 99.7 (90.6) |
| Refinement | |||
| Rwork/Rfree (%) | 15.50/18.29 | 15.39/20.73 | |
| No. atoms | |||
| Protein | 713 | 676 | |
| Ligand/ion | 56 | ||
| Water | 179 | 117 | |
| B-factors (Å²) | |||
| Protein | 13.5 | 14.9 | |
| Ligand/ion | 20.7 | ||
| Water | 31.1 | 29.6 | |
| Root mean square deviations | |||
| Bond lengths (Å) | 0.022 | 0.006 | |
| Bond angles (°) | 1.72 | 0.81 | |
| Ramachandran | |||
| Favored | 97.9 | 95.6 | |
| Allowed | 2.1 | 4.4 | |
| Outliers | 0 | 0 | |
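
As a quick consistency check on Table 2, the redundancy (multiplicity) of each data set should be close to the total number of reflections divided by the number of unique reflections. The sketch below uses only the counts listed in the table; the dictionary and function names are illustrative and are not part of the original work.

```python
# Reflection counts copied from Table 2: (total, unique, reported redundancy).
DATASETS = {
    "CBM_E1/C5": (114_680, 18_747, 6.1),
    "CBM_E1/Apo": (51_818, 7_638, 6.8),
    "CBM_E1/Gd": (113_830, 10_827, 10.5),
}


def multiplicity(total: int, unique: int) -> float:
    """Average number of observations per unique reflection."""
    return total / unique


if __name__ == "__main__":
    for name, (total, unique, reported) in DATASETS.items():
        computed = multiplicity(total, unique)
        print(f"{name}: computed {computed:.1f}, reported {reported}")
```

For all three data sets the computed ratio rounds to the overall redundancy reported in the table.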
The final structures presented a monomer in the asymmetric unit, and all residues were built with the exception of the vector encoded N-terminal two and four residues from ligand-complexed CBM_E1 and apo CBM_E1, respectively, because of poor electron density. The structural alignment of apo and ligand-complexed protein resulted in a root mean square deviation of 0.19 Å, demonstrating that they are almost identical (Fig. 4), and thus ligand binding did not induce conformational changes in the protein. Reflecting the high degree of structural similarity, only the cellopentaose-CBM_E1 complex was considered further.
FIGURE 4.

Crystal structures of CBM_E1 apo and holo overlaid. Cartoon depicting the three-dimensional structure of CBM_E1 apo (orange) and holo (cyan) showing the two β-sheets formed by five and four β-strands and the disulfide bond represented in spheres. The only significant difference between them is observed in the first three N-terminal amino acids (N).
Characteristic of CBMs (13), CBM_E1 has a globular β-sandwich fold composed of two antiparallel β-sheets with four and five β-strands connected by loops (Fig. 4). β-Sheets 1 and 2 comprised strands β2, β4, β7, and β9 and strands β1, β3, β5, β6, and β8, respectively. A striking feature of CBM_E1 is a completely solvent-exposed planar surface that is orientated parallel and close to the strands β4, β7, and β9 in β-sheet 2. This planar surface is dominated by three tryptophan residues: Trp375, Trp398, and Trp427. The side chains of Trp375 and Trp427 are orientated parallel with β-sheet 2, whereas the indole ring of Trp398 is positioned perpendicular to this secondary structural element. The planar surface is reminiscent of the binding sites of CBMs from type A families such as CBM2 (14), CBM3 (15), CBM5 (16), and CBM10 (17, 18).
The ligand-complexed structure has an electron density that clearly represents the co-crystallized cellopentaose (Fig. 5E), present in two conformations orientated 180° with respect to each other, perpendicularly to the rotational axis (Fig. 5). This electron density is positioned at the interface in a 2-fold rotational symmetry axis, over the planar surface of the CBM. The conformation of the bound cellopentaose adopts a perfect 2-fold screw axis in which adjacent glucose molecules are orientated 180° with respect to each other. Thus, the ligand displays the conformation adopted by cellulose chains in the crystalline polysaccharide and not the twisted helical structures displayed by cellooligosaccharides and β-glucans in solution. It is evident that the binding sites of both molecules are packing the cellopentaose as a sandwich, and each CBM_E1 is able to bind to cellopentaose in both orientations (Fig. 5).
FIGURE 5.
Structure of CBM_E1 in complex with cellopentaose. A, model of CBM_E1 composed of the monomer in the asymmetric unit (cyan) and another symmetric related molecule (green). The monomers bind to cellopentaose that is located perpendicularly to the 2-fold rotational symmetry axis. Consequently, cellopentaose is presented in two orientations: orientation 1 in cyan and orientation 2 in green. B, details of the ligand-binding site illustrating the interactions between amino acids and ligand. C and D, 90° rotated view (C) and 45° rotated view (D) between images in B and C, revealing the parallel alignment between the amino acids composing the binding site with the ligand and the perpendicular orientation between ligand and rotational symmetry axis. E, cellopentaose, co-crystallized with CBM_E1, present at the binding site in two conformations (∼50% occupancy each one) orientated 180° with respect to each other (orientation 1 in cyan and orientation 2 in green). The 2Fo − Fc electron density map contoured at 1 σ confirmed the presence of the double conformation. The symmetry axis is indicated by a dashed line or the centered black symbol, where the axis is perpendicular to the figure.
The ligand makes extensive parallel hydrophobic contacts (CH-π interactions) with Trp375 and Trp427 and hydrogen bonds with Trp398 and Lys423 (Fig. 5B). In one orientation (orientation 1), Trp375 and Trp427 interact with G2 and G4, respectively (G1 is the non-reducing glucose of cellopentaose, and G5 is the reducing terminal glucose), the indole nitrogen of Trp398 makes a hydrogen bond with the glycosidic oxygen between G2 and G3, and the N-ε of Lys423 makes hydrogen bonds with oxygens O2 and O3 of G2. In the opposite orientation (orientation 2), Trp375 and Trp427 interact with G4 and G2, respectively, Trp398 hydrogen bonds with the glycosidic O between G3 and G4, and Lys423 interacts with G4. Because CBM_E1 interacts only with symmetrical regions of the ligand (does not interact with O6 or the endocyclic O), it is not entirely clear which orientation is biologically significant or, indeed, whether the module can interact with both orientations of cellopentaose in vivo. In orientation 1, Trp375 and Trp427 interact with the α-faces of G2 and G4, respectively, which appear to be more hydrophobic by virtue of both axial H-5 and H-3 hydrogens and the aliphatic C5-C6 bond. H-5 “points” into the π-electron cloud of the aromatic ring, hinting at “ring current” hydrogen bonding possibilities. Indeed, in all CBM-ligand complexes reported to date, aromatic residues make apolar interactions exclusively with the α-face of sugar rings.
Although CBM_E1 has a fold typical of such protein modules, it does not have significant structural identity with any other CBM in the PDB, based on a similarity search using the PDBeFold server. The most similar structures are a collagenase from Clostridium histolyticum, PDB code 1NQJ (19), with a Q score of 0.42, and a DNA-binding kinetochore protein from Saccharomyces cerevisiae, PDB code 2VPV (20), with a Q score of 0.33. The low structural and sequence similarity with known CBM families indicates that CBM_E1 is the founding member of a novel CBM family, defined as CBM81.
Mutation of Tryptophans Abolishes Protein-Substrate Interaction
To evaluate the importance of tryptophan residues (Trp375, Trp398, and Trp427) and Lys423 in substrate binding, these amino acids were mutated to alanine. The pulldown assay with Avicel showed that substitution of any of the three tryptophans abolished cellulose binding (Fig. 6). Although mutation of Lys423 reduced binding, it did not abolish interaction, as can be seen in Fig. 6, where the K423A mutant is found in soluble and insoluble fractions.
FIGURE 6.
Pulldown assay of CBM_E1 wild type and mutants against Avicel. The thin arrows show that the wild type interacts with Avicel. The thick arrows highlight that the mutants W375A, W398A, and W427A do not interact with Avicel. The mutant K423A interacts partially with the substrate, because the protein can be found both in insoluble (thin arrow) and soluble (thick arrow) fractions. MW, molecular weight marker; C, control (recombinant protein); I, insoluble fraction; S, soluble fraction; WT, wild type CBM_E1; W375, W375A mutant; W398, W398A mutant; W427, W427A mutant; K423, K423A mutant.
For further evaluation of the mutants, intrinsic tryptophan fluorescence experiments were carried out (Table 3 and Fig. 7). The data revealed that wild type CBM_E1 bound to cellohexaose and cellopentaose (Ka APP = 4.3 × 10³ and 2.3 × 10³ M⁻¹, respectively), but not to cellotetraose, as observed in ITC experiments (shown above). The mutant K423A interacts only with cellohexaose (Ka APP = 5.1 × 10³ M⁻¹), and the other mutants did not interact with any of the ligands. According to pulldown assays and intrinsic tryptophan fluorescence, Lys423 influences substrate binding but is not essential; however, all three tryptophan residues are necessary for ligand recognition.
TABLE 3.
Apparent affinity parameters of CBM_E1 and mutants binding to oligosaccharides
| CBM_E1 | Ligand | Ka APP (M⁻¹) |
|---|---|---|
| WT | Cellohexaose | 4.3 × 10³ |
| W375A | Cellohexaose | No binding |
| W398A | Cellohexaose | No binding |
| W427A | Cellohexaose | No binding |
| K423A | Cellohexaose | 5.1 × 10³ |
| WT | Cellopentaose | 2.3 × 10³ |
| W375A | Cellopentaose | No binding |
| W398A | Cellopentaose | No binding |
| W427A | Cellopentaose | No binding |
| K423A | Cellopentaose | No binding |
| WT | Cellotetraose | No binding |
| W375A | Cellotetraose | No binding |
| W398A | Cellotetraose | No binding |
| W427A | Cellotetraose | No binding |
| K423A | Cellotetraose | No binding |
FIGURE 7.

Intrinsic tryptophan fluorescence assay of CBM_E1 wild type and mutant K423A. A, CBM_E1 wild type assay. The calculated 1/<λ> signals as a function of cellohexaose concentration present a consistent blue shift. This result suggests that the interaction of CBM_E1 wild type with cellohexaose involves Trp375, Trp398, and Trp427. B, CBM_E1 K423A assay. The calculated 1/<λ> signals as a function of cellohexaose concentration also present a consistent blue shift. This result suggests that mutation of the residue Lys423 does not affect the tryptophan surface or the protein-ligand interaction.
CBM_E1 Is a Monomer in Solution
The oligomeric state of CBM_E1 in solution was assessed by dynamic light scattering and analytical size exclusion chromatography. Both techniques indicated that CBM_E1 is a monomer in solution. Analytical size exclusion chromatography suggests a protein of 7.7 kDa with a 1.3-nm radius (Fig. 8A), and the dynamic light scattering data indicated an 11.3-kDa protein with a 1.6-nm radius (Fig. 8B). Both experiments presented a single peak and agreed with the radius of gyration of the monomer in the crystallographic structure (1.18 nm) and calculated molecular mass (11.8 kDa). A ligand-mediated crystallographic CBM dimer has been observed in a type A CBM, an integrated part of an expansin from Bacillus subtilis (10). According to our data, the dimerization observed in CBM_E1 structures is probably due to crystal packing.
FIGURE 8.
Oligomeric analysis of CBM_E1. A, analytical gel filtration performed with the Superdex 75 10/300 GL chromatographic column. The run was performed with the calibration kit, composed of mix A (black line: conalbumin (C), carbonic anhydrase (CA), ribonuclease A (R), and aprotinin (Ap)) and mix B (large dashes: ovalbumin (O), ribonuclease A (R), and aprotinin (Ap)); short dashes, CBM_E1 at 0.5 mg/ml. B, dynamic light scattering performed with CBM_E1 at 0.75 mg/ml.
Discussion
This report describes the characterization of CBM_E1, which is the founding member of a novel CBM family. Despite the typical β-sandwich fold, CBM_E1 displays no structural similarity to any CBM family in the CAZy database. The three-dimensional structure of CBM_E1 was solved in the presence and absence of cellopentaose, and the data indicated that the CBM underwent no obvious conformational change upon ligand binding.
Regarding the structural properties of CBM_E1, the first striking characteristic is the orientation of the binding site, which is parallel to the β-strands. To the best of our knowledge, only type A CBMs display this characteristic (Fig. 9). Type A CBMs present a β-sandwich fold, like a chitin-binding domain from Pyrococcus furiosus (21), whose binding site is almost parallel (with a small slope) in relation to the β-strands. The binding site of type B CBMs often consists of the variable loops site, which connects the β-strands at one end of the β-sandwich, or comprises the concave face site, whose β-strands are perpendicular to the ligand chain (13).
FIGURE 9.

A, the substrate-interacting tryptophans and the β-strands that compose the binding site of CBM_E1 have a distinct parallel orientation. B, type A CBMs, represented here by CBM2 from P. furiosus, have an almost parallel (with a small slope) orientation between substrate-interacting aromatic residues and the β-strands. Conversely, type B CBMs have a perpendicular orientation between substrate and β-strands, independent of the binding site position: C, concave face site, represented by a CBM6 from Clostridium stercorarium. D, variable loops site, represented by a CBM15 from Cellvibrio japonicus. Geometric figures represent the orientations of substrates (green), substrate-interacting aromatic residues (cyan), and β-strands (red).
The second striking feature of CBM_E1 is the completely solvent-exposed planar conformation of the binding site, which is typical of type A CBMs (2). The ligand-binding site of CBM_E1 is composed of three tryptophans and a lysine residue, and mutation of any of these aromatic residues abolished ligand binding. In type A CBM binding sites, the side chains of the three aromatic residues lie in the same plane, resulting in CH-π interactions with the glucose rings of the ligand. In contrast, only two tryptophan residues from CBM_E1 (Trp375 and Trp427) adopt a planar conformation with the ligand. The third tryptophan and the lysine residue form hydrogen bonds with cellopentaose, which can help to explain the enthalpically driven interaction of CBM_E1 according to ITC data. This finding corroborates the classification of CBM_E1 as a type B CBM, in contrast to ligand binding by type A CBMs, which is entropically driven (10).
Although CBM_E1 has a typical type A planar binding site, its classification as a type B CBM was defined based on its binding properties. CBM_E1 binds to Avicel, which is composed of 60% crystalline cellulose (12), but not to BMCC, which has a high content of crystalline regions, ∼95% (11). Consequently, CBM_E1 probably binds to the amorphous region of Avicel and is not able to bind to crystalline cellulose. CBM_E1 also displays significant binding to barley β-glucan and xyloglucan, as well as to cellohexaose and cellopentaose, which are ligands compatible with type B CBMs (1). The inability of CBM_E1 to bind to BMCC (crystalline cellulose model) may reflect the presence of only two tryptophan residues that make CH-π interactions with cellulose, whereas in type A CBMs three aromatic residues are required to bind to crystalline ligands (17, 22). Indeed, mutating any of the three aromatic residues in type A CBMs greatly reduces and in some cases completely abolishes ligand binding (14).
In conclusion, we characterized the structure and binding properties of CBM_E1, a novel CBM derived from soil metagenomics that represents the new CBM81 family. Although it is classified as a type B CBM, because of its affinity for soluble carbohydrates and the enthalpically driven binding, CBM_E1 presents a binding site structure that resembles type A. It is possible that CBM_E1 binds to regions of paracrystalline cellulose and thus targets the cognate enzyme to areas of the substrate at the interface between crystalline and amorphous structures. Indeed, within type A and type B groups, different CBMs recognize distinct regions of crystalline and amorphous forms of cellulose (23, 24). Thus, the specificity of CBM_E1 may contribute to the multiple CBM targeting roles required to fully deconstruct the myriad of structures present in plant cell wall cellulose.
Materials and Methods
Sequence Analysis of CBM_E1 Gene
The CBM_E1 gene nucleotide sequence was deposited in the GenBank™ database (accession number KJ917170). Physical and chemical parameters were predicted using the ProtParam tool (25). The amino acid sequence encoded by the CBM_E1 gene was aligned with reference sequences from the non-redundant NCBI database using the ClustalX 1.83 program (26).
Protein Production and Purification
The gene encoding CBM_E1 (KJ917170) was amplified by PCR using full-length CelE1, retrieved from a sugarcane soil metagenomics library, as a template (4). The cloning in pET28a is described in Ref. 27. The CBM_E1 gene was also cloned in pET41a using the CBM_E1 pET28a clone as the template and the forward (5′-CTCGCGGGATCCAGCGCATCATGCGGTAGC-3′) and reverse (5′-CGCGAGCTCGAGTTACCAGTTATCGAACTTCAC-3′) primers, which contain BamHI (GGATCC) and XhoI (CTCGAG) restriction sites, respectively. The 282-bp product and the expression vector were digested with BamHI and XhoI restriction enzymes. The ligation mixture was transformed into E. coli DH5α competent cells, and cloning was verified by PCR. The final construct in pET41a encodes CBM_E1 fused to both N-terminal GST and His tags.
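The primer design described above can be checked with a few lines of code. The sketch below locates the BamHI and XhoI recognition sequences in the two primers, reverse-complements the reverse primer to read the 3′ end of the coding strand, and computes an expected coding length. The helper names are illustrative. Reading the 282-bp figure as 93 codons (residues 335–427) plus one stop codon is an assumption, not something the text states.

```python
# Primer sequences copied from the text (5'->3').
FORWARD = "CTCGCGGGATCCAGCGCATCATGCGGTAGC"
REVERSE = "CGCGAGCTCGAGTTACCAGTTATCGAACTTCAC"

# Standard recognition sequences of the two restriction enzymes.
SITES = {"BamHI": "GGATCC", "XhoI": "CTCGAG"}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an unambiguous DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def site_position(primer: str, enzyme: str) -> int:
    """0-based index of the enzyme's site in the primer, or -1 if absent."""
    return primer.find(SITES[enzyme])


def expected_coding_bp(first_residue: int, last_residue: int, stop_codon: bool = True) -> int:
    """Length in bp of the codons for an inclusive residue range (assumed reading)."""
    codons = last_residue - first_residue + 1
    return 3 * (codons + (1 if stop_codon else 0))


if __name__ == "__main__":
    print("BamHI site index in forward primer:", site_position(FORWARD, "BamHI"))
    print("XhoI site index in reverse primer:", site_position(REVERSE, "XhoI"))
    print("Coding-strand 3' end:", reverse_complement(REVERSE))
    print("Expected coding length (bp):", expected_coding_bp(335, 427))
```

On the coding strand, the gene-specific part of the reverse primer ends in a TGG codon followed by TAA. That is consistent with a C-terminal tryptophan (Trp427) followed by a stop codon, although this reading is an inference from the sequence rather than a statement in the text.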
Recombinant His-tagged CBM_E1 was expressed in E. coli strain Origami 2 (DE3) (Novagen) and purified as described previously (27). The recombinant GST-tagged CBM_E1 protein was expressed in E. coli strain BL21 (DE3). A single colony was used to inoculate a 10-ml LB starter culture supplemented with kanamycin (50 mg/ml), and this was used to inoculate 1 liter of LB medium. The bacteria were cultured at 37 °C and 250 rpm until the A600 nm reached 0.6, followed by induction with 1 mM isopropyl β-d-1-thiogalactopyranoside for 16 h at 16 °C. The cells were harvested by centrifugation at 5,800 × g for 30 min, suspended in binding buffer (20 mM Tris-HCl, pH 8.0, 100 mM NaCl), and sonicated. CBM_E1 was purified from lysed cells by immobilized metal ion affinity chromatography using Talon resin (Clontech). After incubation, the beads were washed once with wash buffer A (20 mM Tris-HCl, pH 8.0, 100 mM NaCl, 10 mM imidazole), and the protein was eluted with elution buffer (20 mM Tris-HCl, pH 8.0, 100 mM NaCl, 100 mM imidazole). Purified GST-tagged CBM_E1 in 20 mM Tris-HCl, pH 8.0, 100 mM NaCl, 100 mM imidazole and His-tagged CBM_E1 in 20 mM sodium phosphate, pH 7.2, 50 mM NaCl were stored at 4 °C.
Crystallization and Data Collection
Crystallization experiments were performed manually (His-tagged CBM_E1 with the tag removed) using the hanging drop vapor diffusion method at 18 °C. The crystals of apo-CBM_E1 (9 mg/ml) were obtained in 0.1 M CAPS, pH 10.5, 0.2 M lithium sulfate, 2 M ammonium sulfate. CBM_E1 in complex with C5 (molar ratio of 1 CBM_E1: 2 C5) was crystallized at 6 mg/ml in 4 M sodium formate. The crystals were soaked in a cryoprotection solution (15% glycerol and crystallization solution) and flash cooled in a stream of gaseous nitrogen at 100 K. For derivatization, a CBM_E1-C5 crystal was soaked in a cryoprotection solution containing 1 M gadolinium sulfate. The x-ray diffraction data were collected at the MX2 Beamline (28) of the Brazilian Synchrotron Light Laboratory (Laboratório Nacional de Luz Sincrotron, Campinas-SP) using a MAR 225 detector.
The collected data were processed with iMOSFLM (29) or XDS (30) and AIMLESS (31). The structure was solved by the single-wavelength anomalous dispersion method using AUTOSOL (32) from PHENIX (33). A single solution for space group I2₁3 was obtained for the derivatized data and for CBM_E1-C5. For the Apo structure, the space group I2₁ was found. The models were refined using REFMAC5 (34) interspersed with model adjustment in COOT (35) to give the final model to a resolution of 1.5 Å for CBM_E1-C5. The Apo-CBM_E1 structure was solved by molecular replacement with PHASER MR (36), using CBM_E1-C5 as the search model. The final structures were deposited in the Protein Data Bank with the following codes: CBM_E1/Apo PDB code 5KLC, CBM_E1/C5 PDB code 5KLE, and CBM_E1/Gd PDB code 5KLF.
Site-directed Mutagenesis
Site-directed mutagenesis was carried out employing the PCR-based Q5® site-directed mutagenesis kit (New England Biolabs) according to the manufacturer's instructions, using CBM_E1 cloned in pET28a as the template. Trp375, Trp398, Lys423, and Trp427 were replaced by Ala. The mutated DNA sequences were sequenced to ensure that only the appropriate mutations had been incorporated into the amplified DNA. The mutant proteins W375A, W427A, and K423A were expressed as His6 tag fusions and purified as described (27). W398A mutant protein was expressed in E. coli strain Origami 2 (DE3) (Novagen). The bacteria were cultured at 23 °C and 200 rpm for 16 h, followed by induction with 0.5 mM isopropyl β-d-1-thiogalactopyranoside for 6 h at 18 °C. W398A protein purification followed the protocol described previously (27).
BMCC Preparation
Bacterial cellulose (BC) membranes were produced as described previously (37). Briefly, cultures of Gluconacetobacter hansenii (strain ATCC 23769) were incubated for 96 h at 28 °C in trays measuring 30 × 50 cm, using a static culture liquid medium (38) composed of 50 g/liter glucose, 4 g/liter yeast extract, 0.73 g/liter MgSO₄·7H₂O, 2 g/liter KH₂PO₄, 20 g/liter ethanol, and distilled water. The BC membranes obtained were then washed in 1 wt% aqueous NaOH at 70 °C to remove bacteria and then several times in water, until reaching a neutral pH.
Pulldown Assay
The experiments were based on the protocol described previously (39), with some modifications. 10–20 μg of purified proteins (WT or mutants) were incubated with 200 μl of a suspension containing 35 mg/ml BMCC or Avicel (PH-101; Fluka Analytical) in 25 mM ammonium acetate, pH 5.0, for 20 min at 8 °C with agitation at 1000 rpm. The mixture was centrifuged at 14,000 rpm for 15 min. The soluble fraction was collected, concentrated, and mixed with Laemmli buffer. The insoluble fraction was washed three times with 25 mM ammonium acetate, pH 5.0, 1 M NaCl. After centrifugation, the pellet was resuspended in 100 μl of SDS sample buffer. Soluble and insoluble fractions were analyzed by SDS-PAGE.
Isothermal Titration Calorimetry (ITC)
Thermodynamic parameters of binding of GST-tagged CBM_E1 to soluble polysaccharides and oligosaccharides were determined by ITC, using a VP-ITC calorimeter (Microcal, Northampton, MA). Titrations consisted of 10-μl injections of 5 mM oligosaccharides or 10 mg/ml of polysaccharides in 50 mM Na-HEPES buffer, pH 8.0, into the cell containing 100 μM CBM_E1 dialyzed into the Na-HEPES buffer, at 25 °C. The recorded data were analyzed using the Microcal Origin 7.0 software to derive the n, Ka, and ΔH values. ΔS was calculated using the standard thermodynamic equation, −RT ln Ka = ΔG = ΔH − TΔS. All soluble polysaccharides and oligosaccharides were purchased from Megazyme International (Bray, Ireland), except for Xylan Birchwood, purchased from Sigma.
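The standard thermodynamic relation above can be applied directly to the fitted Ka and ΔH values in Table 1. The sketch below is a minimal illustration. It assumes R = 1.987 × 10⁻³ kcal mol⁻¹ K⁻¹ and T = 298.15 K (25 °C), and the function names are not taken from the original analysis software.

```python
import math

R_KCAL = 1.987e-3   # gas constant in kcal mol^-1 K^-1 (assumed value)
T_KELVIN = 298.15   # 25 °C, the ITC temperature given in the text


def delta_g(ka: float) -> float:
    """Free energy of binding (kcal/mol) from the association constant (M^-1)."""
    return -R_KCAL * T_KELVIN * math.log(ka)


def t_delta_s(ka: float, delta_h: float) -> float:
    """Entropic term TΔS (kcal/mol), from ΔG = ΔH − TΔS."""
    return delta_h - delta_g(ka)


if __name__ == "__main__":
    # (Ka in M^-1, ΔH in kcal/mol) copied from Table 1.
    ligands = {
        "Cellohexaose": (1.23e4, -8.24),
        "Cellopentaose": (0.433e4, -5.07),
        "β-Glucan": (1.37e4, -3.31),
        "Xyloglucan": (0.5e4, -5.55),
    }
    for name, (ka, dh) in ligands.items():
        print(f"{name}: ΔG = {delta_g(ka):.2f}, TΔS = {t_delta_s(ka, dh):.2f} kcal/mol")
```

With these constants the computed ΔG and TΔS values agree with Table 1 to within about 0.02 kcal mol⁻¹. The small differences are presumably due to rounding of Ka and of the constants.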
Intrinsic Fluorescence Emission
The intrinsic fluorescence emission measurements were performed in a Cary Eclipse fluorescence spectrophotometer (Varian) using a 10-mm-path length cell with CBM_E1 wild type or mutants (10 μM) in 20 mM sodium phosphate, pH 7.4, 50 mM NaCl buffer at room temperature. The excitation wavelength (λ) was set to 295 nm with a bandpass of 5 nm, and emission was measured from 305 to 550 nm with a bandpass of 5 nm. Titration with cellotetraose, cellopentaose, or cellohexaose was performed by adding from 0 to 400 μM of the ligand to the protein. Fluorescence was monitored immediately after the ligand was added. The spectra were concentration normalized, and the data were analyzed using the spectral center of mass (<λ>), where <λ> = ΣλiFi/ΣFi (41) and Fi is the fluorescence intensity at wavelength λi. The Ka APP was obtained by fitting the center of mass (<λ>) data versus ligand concentration with a hyperbolic model according to the following equation: y = P1 * x/(P2 + x).
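The two calculations described above, the spectral center of mass and the hyperbolic fit, can be sketched as follows. This is a minimal illustration under stated assumptions rather than the analysis pipeline used here: the function names are hypothetical, the data are synthetic, the fitted y is assumed to be the change in the center of mass relative to the ligand-free protein (the model passes through the origin), and the fit uses a simple golden-section search over P2 instead of any particular fitting package.

```python
import math

def center_of_mass(wavelengths_nm, intensities):
    """Spectral center of mass: sum(lambda_i * F_i) / sum(F_i)."""
    weighted = sum(w * f for w, f in zip(wavelengths_nm, intensities))
    return weighted / sum(intensities)

def _best_p1(p2, xs, ys):
    """For a fixed P2 the model is linear in P1.

    Returns the least-squares P1 and the sum of squared errors.
    """
    g = [x / (p2 + x) for x in xs]
    p1 = sum(y * gi for y, gi in zip(ys, g)) / sum(gi * gi for gi in g)
    sse = sum((y - p1 * gi) ** 2 for y, gi in zip(ys, g))
    return p1, sse

def fit_hyperbola(xs, ys, log_lo=-3.0, log_hi=4.0, iters=100):
    """Fit y = P1 * x / (P2 + x) by golden-section search on log10(P2)."""
    inv_phi = (math.sqrt(5) - 1) / 2
    a, b = log_lo, log_hi
    for _ in range(iters):
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
        if _best_p1(10 ** c, xs, ys)[1] < _best_p1(10 ** d, xs, ys)[1]:
            b = d  # minimum lies in [a, d]
        else:
            a = c  # minimum lies in [c, b]
    p2 = 10 ** ((a + b) / 2)
    p1, _ = _best_p1(p2, xs, ys)
    return p1, p2

# Synthetic titration (hypothetical numbers): ligand in uM, shift in nm.
ligand_um = [0, 25, 50, 100, 200, 400]
shift_nm = [5.0 * x / (50.0 + x) for x in ligand_um]
p1, p2 = fit_hyperbola(ligand_um, shift_nm)
print(p1, p2)
```

Under a simple one-site interpretation, which is an assumption here, P2 is the ligand concentration giving half of the maximal change P1, and its reciprocal would correspond to an apparent association constant.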
Dynamic Light Scattering
Dynamic light scattering was measured with a Zetasizer Nano Series dynamic light scattering instrument (Malvern) at 20 °C. A sample of CBM_E1 in 20 mM Tris-HCl, pH 8.0, 150 mM NaCl, and 2% glycerol buffer was used immediately after size exclusion chromatography at a concentration of 0.75 mg/ml. The data were analyzed with the software provided with the instrument.
Analytical Size Exclusion Chromatography
Analytical size exclusion chromatography of CBM_E1 was performed at room temperature in a Superdex 75 10/300 GL column with an AKTA instrument (GE Healthcare). Absorbance was recorded at 280 nm. The system was calibrated with the following compact globular proteins (3 mg/ml each) of known hydrodynamic radius, from a GE calibration kit: conalbumin (75 kDa, 40.4 Å), ovalbumin (44 kDa, 30.5 Å), carbonic anhydrase (29 kDa, 20.1 Å), ribonuclease A (13.7 kDa, 16.4 Å), and aprotinin (6.5 kDa, 13.5 Å). The chromatography was performed at 0.5 ml/min with 20 mM Tris-HCl, pH 8.0, 150 mM NaCl, 2% glycerol buffer. The protein was injected at 0.5 mg/ml. After the run, the elution volume (Ve) was determined for each protein, and the void volume (Vo) was determined with Blue Dextran 2000.
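One common way to turn Ve and Vo into a hydrodynamic radius estimate, not necessarily the exact procedure followed here, is to compute the partition coefficient Kav = (Ve − Vo)/(Vt − Vo) and regress the known radii of the standards against √(−log Kav). The sketch below assumes this approach. The function names, the elution volumes, Vo, and the total column volume Vt are hypothetical placeholders; only the four standard radii are taken from the text.

```python
import math

def kav(ve_ml, vo_ml, vt_ml):
    """Partition coefficient Kav = (Ve - Vo) / (Vt - Vo)."""
    return (ve_ml - vo_ml) / (vt_ml - vo_ml)

def linear_fit(xs, ys):
    """Ordinary least-squares line; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def calibrate(standards, vo_ml, vt_ml):
    """Fit Rh = slope * sqrt(-log10(Kav)) + intercept.

    `standards` is a list of (elution volume in ml, radius in angstrom).
    """
    xs = [math.sqrt(-math.log10(kav(ve, vo_ml, vt_ml))) for ve, _ in standards]
    ys = [rh for _, rh in standards]
    return linear_fit(xs, ys)

def estimate_radius(ve_ml, vo_ml, vt_ml, slope, intercept):
    """Estimate the hydrodynamic radius of a sample from its Ve."""
    x = math.sqrt(-math.log10(kav(ve_ml, vo_ml, vt_ml)))
    return slope * x + intercept

# Placeholder volumes (ml); NOT measured values from this study.
VO_ML, VT_ML = 8.0, 24.0
STANDARDS = [  # (hypothetical Ve, radius quoted in the text)
    (9.5, 40.4),   # conalbumin
    (10.6, 30.5),  # ovalbumin
    (12.0, 20.1),  # carbonic anhydrase
    (13.3, 16.4),  # ribonuclease A
]
slope, intercept = calibrate(STANDARDS, VO_ML, VT_ML)
sample_rh = estimate_radius(12.5, VO_ML, VT_ML, slope, intercept)
print(slope, intercept, sample_rh)
```

A sample eluting between two standards is assigned a radius between their fitted values, which is the property such a calibration curve relies on.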
Author Contributions
B. M. C., M. V. L., T. M. A., and G. C. E. performed the experiments. B. M. C., M. V. L., T. M. A., and L. M. Z. analyzed the data. B. M. C., M. V. L., T. M. A., G. C. E., L. M. Z., H. B., I. P., R. R., A. C. d. M. Z., and H. J. G. contributed reagents, materials, or analysis. B. M. C., M. V. L., T. M. A., H. J. G., and F. M. S. wrote the paper. All authors read and agreed with the submitted version of the paper.
Acknowledgments
We gratefully acknowledge the time provided on the MX2 Beamline (Laboratório Nacional de Luz Sincrotron, Brazilian Synchrotron Light Laboratory), LAM (Laboratório Nacional de Ciência e Tecnologia do Bioetanol), and Robolab (Laboratório Nacional de Biociências, Brazilian Biosciences National Laboratory) at the National Center for Research in Energy and Materials (Campinas, Brazil).
This work was funded by grants from Fundação de Amparo à Pesquisa do Estado de São Paulo Process Grants 2013/06336-0, 2014/04105-4, 2010/11469-1, and 2008/58037-9 and by the Conselho Nacional de Desenvolvimento Científico e Tecnológico, Brazil. The authors declare that they have no conflicts of interest with the contents of this article.
The abbreviations used are:
- CBM: carbohydrate-binding module
- GH: glycoside hydrolase
- CAZy: carbohydrate-active enzymes database
- ITC: isothermal titration calorimetry
- C6: cellohexaose
- C5: cellopentaose
- C4: cellotetraose
- BMCC: bacterial microcrystalline cellulose
- PDB: Protein Data Bank
- CAPS: 3-(cyclohexylamino)-1-propanesulfonic acid
References
- 1. Boraston A. B., Bolam D. N., Gilbert H. J., and Davies G. J. (2004) Carbohydrate-binding modules: fine-tuning polysaccharide recognition. Biochem. J. 382, 769–781
- 2. Gilbert H. J., Knox J. P., and Boraston A. B. (2013) Advances in understanding the molecular basis of plant cell wall polysaccharide recognition by carbohydrate-binding modules. Curr. Opin. Struct. Biol. 23, 669–677
- 3. Yang B., and Wyman C. E. (2006) BSA treatment to enhance enzymatic hydrolysis of cellulose in lignin containing substrates. Biotechnol. Bioeng. 94, 611–617
- 4. Alvarez T. M., Paiva J. H., Ruiz D. M., Cairo J. P., Pereira I. O., Paixão D. A., de Almeida R. F., Tonoli C. C., Ruller R., Santos C. R., Squina F. M., and Murakami M. T. (2013) Structure and function of a novel cellulase 5 from sugarcane soil metagenome. PLoS One 8, e83635
- 5. Hashimoto H. (2006) Recent structural studies of carbohydrate-binding modules. Cell. Mol. Life Sci. 63, 2954–2967
- 6. Hervé C., Rogowski A., Blake A. W., Marcus S. E., Gilbert H. J., and Knox J. P. (2010) Carbohydrate-binding modules promote the enzymatic deconstruction of intact plant cell walls by targeting and proximity effects. Proc. Natl. Acad. Sci. U.S.A. 107, 15293–15298
- 7. Lombard V., Golaconda Ramulu H., Drula E., Coutinho P. M., and Henrissat B. (2014) The carbohydrate-active enzyme database (CAZy) in 2013. Nucleic Acids Res. 42, D490–D495
- 8. Frank K., and Sippl M. J. (2008) High performance signal peptide prediction based on sequence alignment. Bioinformatics 24, 2172–2176
- 9. Yin Y., Mao X., Yang J., Chen X., Mao F., and Xu Y. (2012) dbCAN: a web resource for automated carbohydrate-active enzyme annotation. Nucleic Acids Res. 40, W445–W451
- 10. Georgelis N., Yennawar N. H., and Cosgrove D. J. (2012) Structural basis for entropy-driven cellulose binding by a Type A cellulose-binding module (CBM) and bacterial expansin. Proc. Natl. Acad. Sci. U.S.A. 109, 14830–14835
- 11. Park S., Baker J. O., Himmel M. E., Parilla P. A., and Johnson D. K. (2010) Cellulose crystallinity index: measurement techniques and their impact on interpreting cellulase performance. Biotechnol. Biofuels 3, 10
- 12. Hall M., Bansal P., Lee J. H., Realff M. J., and Bommarius A. S. (2010) Cellulose crystallinity: a key predictor of the enzymatic hydrolysis rate. FEBS J. 277, 1571–1582
- 13. Abbott D. W., and van Bueren A. L. (2014) Using structure to inform carbohydrate binding module function. Curr. Opin. Struct. Biol. 28, 32–40
- 14. Simpson P. J., Xie H., Bolam D. N., Gilbert H. J., and Williamson M. P. (2000) The structural basis for the ligand specificity of family 2 carbohydrate-binding modules. J. Biol. Chem. 275, 41137–41142
- 15. Petkun S., Rozman Grinberg I., Lamed R., Jindou S., Burstein T., Yaniv O., Shoham Y., Shimon L. J. W., Bayer E. A., and Frolow F. (2015) Reassembly and co-crystallization of a family 9 processive endoglucanase from its component parts: structural and functional significance of the intermodular linker. PeerJ 3, e1126
- 16. Malecki P. H., Raczynska J. E., Vorgias C. E., and Rypniewski W. (2013) Structure of a complete four-domain chitinase from Moritella marina, a marine psychrophilic bacterium. Acta Crystallogr. D Biol. Crystallogr. 69, 821–829
- 17. Raghothama S., Simpson P. J., Szabó L., Nagy T., Gilbert H. J., and Williamson M. P. (2000) Solution structure of the CBM10 cellulose binding module from Pseudomonas xylanase A. Biochemistry 39, 978–984
- 18. Ponyi T., Szabó L., Nagy T., Orosz L., Simpson P. J., Williamson M. P., and Gilbert H. J. (2000) Trp22, Trp24, and Tyr8 play a pivotal role in the binding of the family 10 cellulose-binding module from Pseudomonas xylanase A to insoluble ligands. Biochemistry 39, 985–991
- 19. Wilson J. J., Matsushita O., Okabe A., and Sakon J. (2003) A bacterial collagen-binding domain with novel calcium-binding motif controls domain orientation. EMBO J. 22, 1743–1752
- 20. Cohen R. L., Espelin C. W., De Wulf P., Sorger P. K., Harrison S. C., and Simons K. T. (2008) Structural and functional dissection of Mif2P, a conserved DNA-binding kinetochore protein. Mol. Biol. Cell 19, 4480–4491
- 21. Nakamura T., Mine S., Hagihara Y., Ishikawa K., Ikegami T., and Uegaki K. (2008) Tertiary structure and carbohydrate recognition by the chitin-binding domain of a hyperthermophilic chitinase from Pyrococcus furiosus. J. Mol. Biol. 381, 670–680
- 22. McLean B. W., Bray M. R., Boraston A. B., Gilkes N. R., Haynes C. A., and Kilburn D. G. (2000) Analysis of binding of the family 2a carbohydrate-binding module from Cellulomonas fimi xylanase 10a to cellulose: specificity and identification of functionally important amino acid residues. Protein Eng. 13, 801–809
- 23. Blake A. W., McCartney L., Flint J. E., Bolam D. N., Boraston A. B., Gilbert H. J., and Knox J. P. (2006) Understanding the biological rationale for the diversity of cellulose-directed carbohydrate-binding modules in prokaryotic enzymes. J. Biol. Chem. 281, 29321–29329
- 24. McLean B. W., Boraston A. B., Brouwer D., Sanaie N., Fyfe C. A., Warren R. A., Kilburn D. G., and Haynes C. A. (2002) Carbohydrate-binding modules recognize fine substructures of cellulose. J. Biol. Chem. 277, 50245–50254
- 25. Gasteiger E., Hoogland C., Gattiker A., Duvaud S., Wilkins M. R., Appel R. D., and Bairoch A. (2005) Protein identification and analysis tools on the ExPASy server. In The Proteomics Protocols Handbook, pp. 571–607, Humana Press, Totowa, NJ
- 26. Thompson J. D., Gibson T. J., Plewniak F., Jeanmougin F., and Higgins D. G. (1997) The CLUSTAL_X windows interface: flexible strategies for multiple sequence alignment aided by quality analysis tools. Nucleic Acids Res. 25, 4876–4882
- 27. Campos B. M., Alvarez T. M., Liberato M. V., Polikarpov I., Gilbert H. J., Zeri A. C., and Squina F. M. (2014) Cloning, purification, crystallization and preliminary x-ray studies of a carbohydrate-binding module (CBM_E1) derived from sugarcane soil metagenome. Acta Crystallogr. F Struct. Biol. Commun. 70, 1232–1235
- 28. Guimarães B. G., Sanfelici L., Neuenschwander R. T., Rodrigues F., Grizolli W. C., Raulik M. A., Piton J. R., Meyer B. C., Nascimento A. S., and Polikarpov I. (2009) The MX2 macromolecular crystallography beamline: a wiggler x-ray source at the LNLS. J. Synchrotron Radiat. 16, 69–75
- 29. Leslie A. G. W. (1992) Recent changes to the MOSFLM package for processing film and image plate data. Joint CCP4 + ESF-EAMCB Newsletter Prot. Crystallogr. 26, 27–33
- 30. Kabsch W. (2010) XDS. Acta Crystallogr. D Biol. Crystallogr. 66, 125–132
- 31. Evans P. (2006) Scaling and assessment of data quality. Acta Crystallogr. D Biol. Crystallogr. 62, 72–82
- 32. Terwilliger T. C., Adams P. D., Read R. J., McCoy A. J., Moriarty N. W., Grosse-Kunstleve R. W., Afonine P. V., Zwart P. H., and Hung L. W. (2009) Decision-making in structure solution using Bayesian estimates of map quality: the PHENIX AutoSol wizard. Acta Crystallogr. D Biol. Crystallogr. 65, 582–601
- 33. Adams P. D., Afonine P. V., Bunkóczi G., Chen V. B., Davis I. W., Echols N., Headd J. J., Hung L. W., Kapral G. J., Grosse-Kunstleve R. W., McCoy A. J., Moriarty N. W., Oeffner R., Read R. J., Richardson D. C., et al. (2010) PHENIX: a comprehensive Python-based system for macromolecular structure solution. Acta Crystallogr. D Biol. Crystallogr. 66, 213–221
- 34. Murshudov G. N., Vagin A. A., and Dodson E. J. (1997) Refinement of macromolecular structures by the maximum-likelihood method. Acta Crystallogr. D Biol. Crystallogr. 53, 240–255
- 35. Emsley P., Lohkamp B., Scott W. G., and Cowtan K. (2010) Features and development of COOT. Acta Crystallogr. D Biol. Crystallogr. 66, 486–501
- 36. McCoy A. J., Grosse-Kunstleve R. W., Adams P. D., Winn M. D., Storoni L. C., and Read R. J. (2007) Phaser crystallographic software. J. Appl. Crystallogr. 40, 658–674
- 37. Pinto E. R. P., Barud H. S., Silva R. R., Palmieri M., Polito W. L., Calil V. L., Cremona M., Ribeiro S. J. L., and Messaddeq Y. (2015) Transparent composites prepared from bacterial cellulose and castor oil based polyurethane as substrates for flexible OLEDs. J. Mater. Chem. C 3, 11581–11588
- 38. Hestrin S., and Schramm M. (1954) Synthesis of cellulose by Acetobacter xylinum. II. Preparation of freeze-dried cells capable of polymerizing glucose to cellulose. Biochem. J. 58, 345–352
- 39. Okazaki F., Tamaru Y., Hashikawa S., Li Y. T., and Araki T. (2002) Novel carbohydrate-binding module of β-1,3-xylanase from marine bacterium, Alcaligenes sp. strain XY-234. J. Bacteriol. 184, 2399–2403
- 40. Borges J. C., and Ramos C. H. (2006) Spectroscopic and thermodynamic measurements of nucleotide-induced changes in the human 70-kDa heat shock cognate protein. Arch. Biochem. Biophys. 452, 46–54
- 41. Szabo L., Jamal S., Xie H., Charnock S. J., Bolam D. N., Gilbert H. J., and Davies G. J. (2001) Structure of a family 15 carbohydrate-binding module in complex with xylopentaose. J. Biol. Chem. 276, 49061–49065





