Author manuscript; available in PMC: 2021 Mar 1.
Published in final edited form as: FEBS Lett. 2019 Nov 13;594(5):799–812. doi: 10.1002/1873-3468.13652

Evolutionary coupling saturation mutagenesis: Coevolution-guided identification of distant sites influencing Bacillus naganoensis pullulanase activity

Xinye Wang 1, Xiaoran Jing 1, Yi Yao Deng 1, Fei Xu 1, Yan Xu 1,2, Yi-Lei Zhao 3, John F Hunt 4, Gaetano T Montelione 5,6,7, Thomas Szyperski 8
PMCID: PMC7194205  NIHMSID: NIHMS1057031  PMID: 31665817

Abstract

Pullulanases are well-known debranching enzymes hydrolyzing α-1,6-glycosidic linkages. To date, engineering of pullulanase has mainly focused on catalytic pocket or domain tailoring based on structure/sequence information. Directed evolution involving saturation mutagenesis is, however, limited by the low number of mutational sites compatible with combinatorial libraries of feasible size. Using Bacillus naganoensis pullulanase as a target protein, here we introduce the “evolutionary coupling saturation mutagenesis” (ECSM) approach: residue pair co-variances are calculated to identify residues for saturation mutagenesis, focusing directed evolution on residue pairs playing important roles in natural evolution. Evolutionary coupling (EC) analysis identified seven residue pairs as evolutionary mutational hotspots. Subsequent saturation mutagenesis yielded variants with enhanced catalytic activity. The functional pairs apparently represent distant sites affecting enzyme activity.

Keywords: evolutionary information, coevolving residues, directed evolution, saturation mutagenesis, pullulanase, activity



Introduction

Pullulanase (EC 3.2.1.41), generally with a large molecular weight and a multi-domain structure, is an important member of the glycoside hydrolase family [1]. Pullulanase specifically catalyzes the hydrolysis of α-1,6-glycosidic linkages of unmodified substrates and is widely used in medicine, materials, organometallic chemistry, and the saccharification industry [2]. However, the catalytic efficiency of pullulanase needs to be further improved for industrial application. To date, engineering of pullulanase has mainly focused on the catalytic pocket or domain tailoring based on structure/sequence information [3–8].

Directed evolution, a pioneering technology for engineering enzymes, represents an important approach to create enzymes with desired properties such as enhanced catalytic activities [9, 10]. To this end, random mutagenesis-based protein engineering methods, including error-prone polymerase chain reaction and DNA shuffling, have been widely used [9, 11–13]. This approach requires the construction and subsequent screening of large libraries containing mostly negative variants. To narrow the search in the mutation library and reduce screening efforts, structure information-guided saturation mutagenesis methods, such as triple code saturation mutagenesis [14], have been explored to evolve stereoselective and regioselective enzymes with optimized catalytic profiles [15]. In structure-based saturation mutagenesis methods, the amino acid residues selected for substitution are mostly confined to specific spatial regions in the protein, such as an active center or substrate-binding domain, and rarely include sites outside these functional regions that may be involved in distal effects or residue-residue networks modulating enzyme function [16–22]. This highlights the need for new approaches that expand the selection of mutational hotspots for saturation mutagenesis to the entire protein molecule, while keeping the mutation libraries focused on a comparatively small number of residues.

Proteins from diverse organisms can be identified as homologues by sequence comparison, because evolutionary constraints limit substitutions at particular positions. The evolutionary record encoded in multiple sequence alignments of homologous proteins also provides information on residue pair co-variance. A long-standing goal of bioinformatics has been to utilize such sequence co-variation to provide information about residue pair functional correlations and/or contacts [23–32]. A key challenge in this analysis is created by transitive correlations, or relay effects; i.e., to distinguish A-B covariation due to A->B interactions from A-C covariation due to relayed A->B->C interactions. Recently, methods have been developed that can distinguish direct evolutionary couplings from such transitive correlations, allowing reliable analysis of direct evolutionary residue-residue couplings [33–38]. Such evolutionary couplings (ECs) can provide accurate information about residue pair contacts [33–35]. Generally, the highest scoring ECs are between residues that indeed contact one another in the 3D structure [33–35, 37, 39–47].
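To make the notion of residue pair co-variance concrete, the following minimal sketch computes the mutual information between two columns of a toy alignment. Mutual information is used here only as a simple stand-in for co-variance: it is not the DI or PLM method discussed in this paper, and it does not separate direct couplings from transitive correlations. The alignment `toy_msa` and the function name are hypothetical.

```python
from collections import Counter
from math import log

def mutual_information(msa, i, j):
    """Mutual information (in nats) between alignment columns i and j."""
    n = len(msa)
    freq_i = Counter(seq[i] for seq in msa)             # single-column counts
    freq_j = Counter(seq[j] for seq in msa)
    joint = Counter((seq[i], seq[j]) for seq in msa)    # pair counts
    mi = 0.0
    for (a, b), count in joint.items():
        p_ab = count / n
        mi += p_ab * log(p_ab / ((freq_i[a] / n) * (freq_j[b] / n)))
    return mi

# Hypothetical toy alignment: columns 0 and 1 always change together,
# while column 2 varies independently of column 0.
toy_msa = ["ADK", "ADR", "GEK", "GER"]

mi_coupled = mutual_information(toy_msa, 0, 1)    # co-varying columns
mi_uncoupled = mutual_information(toy_msa, 0, 2)  # independent columns
```

In this toy case the co-varying column pair scores ln 2 and the independent pair scores zero. Real alignments contain chains of such correlations, which is why global models such as DI and PLM are needed to isolate direct couplings.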

Global modeling approaches, which treat correlated pairs of residues as dependent on each other, generate high coupling scores only for residue pairs that are likely to be direct correlations, thereby minimizing transitive sequence covariance effects. Mean-field direct coupling analysis (DI), based purely on sequence information, computes residue couplings in a maximum entropy model [33–35, 38, 48]. The resulting couplings can be used to generate distance restraints for modeling 3D structures of proteins, RNA, and complexes [34, 35]. More recently, an alternative pseudo-likelihood maximization direct coupling analysis (PLM) has been described [36, 37]. PLM was shown to outperform DI on a set of protein-domain families with a large number of sequences [36]. Both the DI and PLM methods are implemented in the EVfold (http://www.EVfold.org/) contact prediction server [34], and the PLM method is implemented in the GREMLIN (http://gremlin.bakerlab.org/) contact prediction server [37].

So far, these algorithms, predicting residue-residue contacts from sequence covariation data, have been used mostly for protein and nucleic acid 3D structure prediction [34, 35, 37, 39–47], or for protein structure determination in combination with sparse experimental data [49], providing insights into protein structure-function relationships. In this regard, the strength of the inferred couplings, especially the top-scoring residue couplings, has been effectively used to predict co-evolving residue pairs and inferred residue-residue contacts, which have been applied in modeling 3D structures with remarkable accuracy. These co-varying residues include interactions in the protein structure which have been probed in the course of evolution.

An alternative application is to exploit ECs for the information they provide on the naturally occurring co-variation of residues in the course of protein evolution, which could potentially reveal functional diversity that has been selected for by different evolutionary pressures.

Based on these properties of ECs, we hypothesized that coevolution analysis of protein sequences and prediction of evolutionary couplings could be effective for identifying productive contacts across the entire enzyme molecule for protein engineering, and for creating a focused library [50] providing positive variants with desired properties, even for large proteins without known 3D structures. Here, using the industrially important 926-residue pullulanase from Bacillus naganoensis as a target protein, we describe a new approach called “evolutionary coupling saturation mutagenesis” (ECSM), to engineer the enzyme guided by coevolution analysis. Although approaches using sequence co-variance to target sites for mutagenesis have been previously described [22, 28, 30], the recent advances in bioinformatic analysis outlined above provide more reliable identification of co-evolution due to direct residue contacts, minimizing co-variance due to transitive correlations, and enhancing the effectiveness of this strategy. Moreover, in order to validate the current view [51, 52] that co-evolution of two residues most often relies on a first mutation at one residue site that reduces fitness, followed by a second mutation at a contacting site that increases fitness, we performed saturation mutagenesis for the residues forming an evolutionary coupled pair in a sequential manner.

Results and Discussion

The ECSM method is schematically described in Fig. 1a. For a target protein, the residue pairs for saturation mutagenesis are determined using improved methods for coevolution analysis [43, 53–55]. Then, one can construct saturation mutagenesis libraries for selected EC pairs and screen them to select mutants showing good enzymatic performance. Based on this ECSM method, the target protein can be efficiently engineered towards a desired purpose.

Figure 1.


Evolutionary coupling saturation mutagenesis (ECSM) strategy (a) using pullulanase from Bacillus naganoensis (b) as a target protein. (a) Schematic representation of ECSM strategy. (b) Domain structure of pullulanase from B. naganoensis: CBM, carbohydrate-binding module; X, domain of unknown function.

To verify the feasibility and efficiency of the proposed ECSM method, pullulanase from Bacillus naganoensis (BnPUL), an enzyme with a large molecular weight (~101 kDa) and a multi-domain structure (Fig. 1b), was selected as a model system [6]. BnPUL specifically catalyzes the hydrolysis of α-1,6-glucosidic linkages of unmodified substrate and has been widely applied in drug formulation, processing of polymeric materials, organometallic chemistry, and industrial saccharification (i.e., carbohydrate degradation to soluble sugars) [56]. Previous efforts to engineer a homologous pullulanase through semi-rational design increased catalytic efficiency twofold [8]. Since BnPUL is a highly active naturally occurring enzyme, it represents a very challenging target for further improvement.

Prediction of evolutionary couplings and selection of mutation sites

The EVcouplings server (http://evfold.org) calculates ECs using a maximum entropy model [33, 57], constrained by the statistics of the multiple sequence alignment (MSA), to infer residue pair couplings [33, 34]. The EVcouplings server uses two alternative scoring methods to identify evolutionary coupled residues, the PLM and DI methods, which use different mathematical algorithms to infer the parameters of the maximum entropy model describing the couplings [34, 36]. For very large sequence alignments, PLM becomes asymptotically equivalent to full maximum-likelihood inference, whereas DI remains intrinsically approximate [34, 36]. Therefore, PLM is expected to consistently outperform DI, as assessed across a number of large protein domain families [34, 36]. Thus, in this work, the EVcouplings-PLM method was used to generate ECs from sequences spanning individual domains and sequential pairs of domains within the protein. In addition, EVcomplex (https://evcomplex.hms.harvard.edu), which is designed to calculate ECs between different domain families, was used to calculate additional inter-domain ECs within BnPUL.

For the catalytic domain alone (473–787), the polypeptide segment spanning the catalytic and CBM48 domains (315–787), and the polypeptide segment spanning the catalytic domain and the C-terminal segment (473–926), the EVcouplings server identified 47,650 (151 sequences/residue), 29,241 (61 sequences/residue), and 51,574 (113 sequences/residue) non-redundant sequences, respectively. For the combination of the CBM48 and catalytic domains without the linker sequence connecting them, the EVcomplex server identified 21,488 non-redundant sequences. The large number of sequences per residue used for these analyses promises to generate accurate evolutionary coupling information.
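As a quick check of the alignment depth quoted above, the sketch below divides each sequence count by the length of the corresponding segment. The segment boundaries and sequence counts are taken from the text; the use of floor division is an assumption, chosen because it reproduces the reported values.

```python
# Segment boundaries (inclusive residue numbers) and non-redundant sequence
# counts as stated in the text.
segments = {
    "catalytic domain": (473, 787, 47650),
    "CBM48 + catalytic domain": (315, 787, 29241),
    "catalytic domain + C-terminal segment": (473, 926, 51574),
}

def sequences_per_residue(start, end, n_sequences):
    """Alignment depth per residue; floor rounding is an assumption."""
    length = end - start + 1          # inclusive segment length
    return n_sequences // length

depth = {name: sequences_per_residue(*values) for name, values in segments.items()}
```

With these inputs the three segments have lengths of 315, 473, and 454 residues, giving 151, 61, and 113 sequences per residue, respectively.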

Tables 1 and 2 show residue pairs with scores larger than the significance threshold of 0.8 suggested by Hopf et al. [41], which served as candidates for coupled hotspots used for subsequent saturation mutagenesis. The identified pairs of ECs are summarized in Table 1 (calculated by EVcouplings) and Table 2 (calculated by EVcomplex). Seven residue pairs were selected from the top third of coupled sites, i.e., D614/H539, E530/T520, D541/D473, E777/T730, and K631/Q597 pairs with EVcouplings scores above 1.0, and V328/I565 and Y392/Y571 pairs with inter-domain EVcomplex scores above 1.0. Pairs within ten residues of a higher scoring pair were excluded. All of the pairs selected from the EVcouplings-PLM analysis also gave high-ranking EC scores using the GREMLIN [37, 43] server (Table S1). Table 3 (third column) provides the spatial distance in the 3D structure between co-varying residues for each of the seven pairs selected for saturation mutagenesis. As is usually observed for evolutionary coupled residues [45], six out of the seven residue pairs are in close spatial proximity (≤ 6 Å; Table 3). For three of those pairs, saturation mutagenesis yielded enzyme mutants with improved catalytic activity (Table 3; Fig. 2). Significantly, the one residue pair (Y392/Y571) separated by ~15 Å, corresponding to residues which are not in direct contact, did not yield any mutants with improved activity (Table 3).
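The selection of EVcouplings pairs can be illustrated with a small filtering sketch. The scores are the Table 1 entries above 1.0. The proximity rule implemented here is only one possible reading of "within ten residues of a higher scoring pair": a pair is dropped when both of its residues lie within ten sequence positions of the residues of an already selected pair. The function names are hypothetical. This reading reproduces the five EVcouplings pairs named in the text but is not confirmed by the source, and it is not applied to the EVcomplex pairs.

```python
# (residue 1, residue 2, PLM score): Table 1 entries scoring above 1.0.
table1_pairs = [
    ("D614", "H539", 1.4898), ("E610", "K532", 1.2174), ("E777", "T730", 1.0485),
    ("D541", "D473", 1.267),
    ("D614", "H539", 1.3426), ("E530", "T520", 1.2768),
    ("E610", "K532", 1.0537), ("K631", "Q597", 1.0308),
]

def position(residue):
    """Sequence position of a residue label such as 'D614'."""
    return int(residue[1:])

def is_neighbour(pair, kept, window=10):
    """Assumed rule: both residues lie within `window` positions of a kept pair."""
    a, b = position(pair[0]), position(pair[1])
    c, d = position(kept[0]), position(kept[1])
    direct = abs(a - c) <= window and abs(b - d) <= window
    swapped = abs(a - d) <= window and abs(b - c) <= window
    return direct or swapped

def select_pairs(pairs, threshold=1.0, window=10):
    best = {}
    for r1, r2, score in pairs:                    # keep best score per pair
        best[(r1, r2)] = max(score, best.get((r1, r2), 0.0))
    ranked = sorted(best, key=best.get, reverse=True)
    selected = []
    for pair in ranked:                            # greedy, highest score first
        if best[pair] <= threshold:
            continue
        if not any(is_neighbour(pair, kept, window) for kept in selected):
            selected.append(pair)
    return selected

selected = select_pairs(table1_pairs)
```

Under this reading, E610/K532 is dropped because both residues sit close to the higher scoring D614/H539 pair, while the remaining five pairs are retained.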

Table 1.

Evolutionary couplings generated from analysis with EVcouplings (PLM score > 0.8).

Catalytic domain

| Residue 1 | Residue 2 | PLM (a) score |
|---|---|---|
| D614 (c) | H539 | 1.4898 |
| E610 | K532 | 1.2174 |
| E777 | T730 | 1.0485 |
| V545 | L538 | 0.99965 |
| K631 | Q597 | 0.87149 |

CBM48 and catalytic domain

| Residue 1 | Residue 2 | PLM score | Residue position |
|---|---|---|---|
| D541 (b) | D473 | 1.267 | Catalytic domain |
| D541 | K476 | 0.98858 | Catalytic domain |
| E530 | T520 | 0.98639 | Catalytic domain |
| E533 | L455 | 0.98564 | Catalytic domain/linker |
| E530 | L455 | 0.90629 | Catalytic domain/linker |
| K631 | Q597 | 0.88181 | Catalytic domain |
| K451 | D443 | 0.85737 | Linker |
| A364 | M354 | 0.84801 | CBM48 |
| E610 | K532 | 0.83898 | Catalytic domain |
| L538 | V483 | 0.83182 | Catalytic domain |

Catalytic domain and C-terminal segment

| Residue 1 | Residue 2 | PLM score | Residue position |
|---|---|---|---|
| D614 | H539 | 1.3426 | Catalytic domain |
| E530 | T520 | 1.2768 | Catalytic domain |
| E610 | K532 | 1.0537 | Catalytic domain |
| K631 | Q597 | 1.0308 | Catalytic domain |
| Y814 | Q761 | 0.94657 | C-terminal segment/Catalytic domain |
| V545 | L538 | 0.93468 | Catalytic domain |
| E777 | T730 | 0.90179 | Catalytic domain |
| Y611 | K532 | 0.90062 | Catalytic domain |
| T861 | D855 | 0.85811 | C-terminal segment |
| Y851 | T766 | 0.85629 | C-terminal segment/Catalytic domain |

a PLM means pseudo-likelihood maximization.

b Residue pairs in bold resulted in enzymes with improved catalytic activity (Fig. 3).

c Residue pairs in italic were selected for saturation mutagenesis.

Table 2.

Evolutionary couplings generated from analysis using EVcomplex (score > 0.8).

| Residue 1 | Residue 2 | PLM (a) score | EVcomplex score |
|---|---|---|---|
| V328 (b, c) | I565 | 0.624 | 3.140 |
| V379 | I565 | 0.438 | 2.204 |
| Y392 | Y571 | 0.373 | 1.880 |
| T332 | E568 | 0.303 | 1.526 |
| W329 | M595 | 0.280 | 1.408 |
| T332 | V566 | 0.274 | 1.380 |
| V336 | I565 | 0.257 | 1.292 |
| R386 | Q558 | 0.255 | 1.282 |
| T380 | E568 | 0.215 | 1.084 |
| Y392 | D602 | 0.205 | 1.032 |
| A388 | I565 | 0.203 | 1.024 |
| T332 | Y569 | 0.173 | 0.870 |
| S384 | Q558 | 0.172 | 0.866 |
| W329 | Y569 | 0.160 | 0.807 |

a PLM means pseudo-likelihood maximization.

b Residue pairs in bold resulted in enzymes with improved catalytic activity (Fig. 3).

c Residue pairs in italic were selected for saturation mutagenesis.

Table 3.

Summary of results from saturation mutagenesis of strongest EC-coupled site pairs.

| Method | EC pair | Inter-residue distance (b) (Å) | Distance from active site (c) (Å) | Number of active mutants at 1st site | Residues at 1st site in active single mutants (d) | Number of double mutants increasing activity vs. WT (e) | Mutations in double mutants with increased activity | kcat ratio vs. WT | kcat/Km ratio vs. WT | 1st site inactive mutants screened at 2nd site (f) | Percentage of double mutants restoring activity (g) |
|---|---|---|---|---|---|---|---|---|---|---|---|
| EVC-PLM (a) | D541/D473 | 6.0 | 26 | 8 | A, I, L, V, Y, E, M, R | 3 | D541I/D473E; D541I/D473I; D541I/D473Q | 2.3; 1.7; 1.5 | 2.9; 2.6; 2.9 | N/A | N/A |
| EVC-PLM | K631/Q597 | 2.9 | 16 | 10 | A, V, I, L, S, T, E, P, R, H | 2 | K631V/Q597K; K631V/Q597S | 1.8; 2.3 | 4.0; 5.6 | N/A | N/A |
| EVC-PLM | D614/H539 | 2.8 | 17 | 7 | A, I, L, V, E, T, S | 0 | | | | N, W, M, G, Q | 6–11% |
| EVC-PLM | E530/T520 | 2.6 | 24 | N/A | N/A | 0 | | | | N/A | N/A |
| EVC-PLM | E777/T730 | 2.7 | 8 | 8 | D, Q, Y, N, A, S, G, V | 0 | | | | N/A | N/A |
| EVcomplex | V328/I565 | 4.0 | 25 | 8 | A, G, I, L, M, T, W, D | 2 | V328L/I565L; V328L/I565V | 2.4; 2.4 | 3.0; 4.2 | H, S, Q, E, Y | 7–16% |
| EVcomplex | Y392/Y571 | 15 | 16 | 7 | A, V, S, F, W, L, I | 0 | | | | N/A | N/A |

a EVC means EVcouplings and PLM means pseudo-likelihood maximization.

b Minimum distance between non-hydrogen atoms of residue pair.

c Minimum distance between C-α atoms in residue pair and catalytic triad (D619, E648, D733).

d Activity per unit culture volume: 46–113% of wild type.

e Activity per unit culture volume: D541/D473: 127–175%; K631/Q597: 152–177%; V328/I565: 154–160%.

f Activity per unit culture volume: 9–11% of wild type.

g Activity per unit culture volume: 43–65% of wild type.

Figure 2.


The spatial location in the homology model of BnPUL of the three residue pairs (K631/Q597, V328/I565, D541/D473) identified as mutational hotspots based on EC analysis resulting in mutant enzymes with improved catalytic activity. (a, c) Backbone representation with the X25 and X45 domains shown in teal, the CBM48 domain shown in blue, and the catalytic domain and C-terminal segment shown in gray. The residues in the catalytic triad in the active site of the enzyme (D619, E648, and D733) are shown in purple space-filling representation. The three residue pairs yielding improved catalytic activity are also shown in space-filling representation, with K631/Q597 in green, V328/I565 in yellow, and D541/D473 in red. The view in panel c is rotated by 180° from that in panel a. (b, d) Surface representation of BnPUL in the same orientations and with the same color-coding as in panels a and c. These images were all produced using the program PyMOL.

Construction and screening of ECSM pullulanase libraries

Enzyme engineering was performed using a system in which full-length pullulanase from B. naganoensis fused to an N-terminal PelB secretion signal is expressed under a T7 promoter from plasmid pET-28a-PelB in E. coli strain BL21(DE3). We used NNK degenerate primers, which encode 32 distinct codons while covering all 20 amino acids. When one amino acid residue is targeted by NNK, 96 colonies (~3 times 32) are necessary to cover 95% of the variants [58–60].
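The relation between 96 colonies and roughly 95% coverage can be reproduced with a short calculation. The sketch below enumerates the NNK codons (N = A/C/G/T, K = G/T) and computes the probability that any given codon appears at least once among a number of picked colonies. It assumes that all 32 codons are equally likely and that colonies are independent, which is an idealization; the function names are hypothetical.

```python
from itertools import product
from math import ceil, log

# NNK degeneracy: N is any of A/C/G/T, K is G or T.
nnk_codons = ["".join(codon) for codon in product("ACGT", "ACGT", "GT")]
n_codons = len(nnk_codons)

def codon_coverage(n_colonies, n_variants=n_codons):
    """Probability that a given codon is sampled at least once
    (assumes equally likely codons and independent colonies)."""
    return 1.0 - (1.0 - 1.0 / n_variants) ** n_colonies

def colonies_needed(target, n_variants=n_codons):
    """Smallest number of colonies reaching the target per-codon coverage."""
    return ceil(log(1.0 - target) / log(1.0 - 1.0 / n_variants))

coverage_96 = codon_coverage(96)          # coverage for one 96-colony pick
min_colonies_95 = colonies_needed(0.95)   # colonies for 95% coverage
```

Under these assumptions, 96 colonies give a per-codon coverage slightly above 95%, consistent with the threefold oversampling rule quoted above.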

In order to efficiently identify mutants with largely decreased enzyme activity, we used an effective high-throughput screening method, the red-pullulan plate assay (Fig. 3). For every single mutant, we picked 96 colonies after transformation and spotted them at 45 positions on plates. After mutation at the first site, we selected only variants showing enzyme activity in the assay for saturation mutagenesis at the second site. For the seven pairs, a total of 66 inactive single mutants (Table 3) were discarded prior to the generation of 1,056 double mutants. Out of those, 154 double mutants showing at least wild-type activity in the plate assay were overexpressed and the enzyme activity per unit volume (which reflects both enzyme activity and overexpression yield) was assayed in the supernatant obtained after cell lysis. This resulted in the selection of seven double mutants (Table 3) corresponding to three residue pairs from the EC co-variance analysis (Fig. S2; residue pairs are in bold in Tables 1 and 2), which were then purified in order to accurately measure their Michaelis-Menten parameters (Fig. 4; Table S2). The fact that no mutants with improved catalytic activity could be identified for four of the residue pairs indicates that the covariation of these pairs does not provide guidance for enhancing catalytic performance.

Figure 3.


Flowchart of the two-step saturation mutagenesis performed for two of the seven residue pairs (Table 3). The red plate pullulanase assay is performed after saturation mutagenesis of the first site. Left branch: Compensatory 2nd site mutations identified for 1st site mutations with greatly reduced activity (< ~10% of WT). Right branch: 2nd site mutations for 1st site mutations retaining ~WT activity with significantly increased enzyme activity (Table 3).

Figure 4.


Michaelis-Menten parameters of the seven double mutants (a) and the eight quadruple mutants (b) derived from the evolutionary coupling saturation mutagenesis libraries. The values were measured in triplicate and estimates of the standard deviations are indicated as error bars.

To explore the potential recovery of enzyme activity in single mutants with reduced activity by introducing a compensatory mutation at the second position, we selected five such single mutants for each of the two pairs D614/H539 and V328/I565 (Table 3; rightmost section) and employed saturation mutagenesis at the second position. Intriguingly, we identified at least one compensatory mutation bringing the activity back to 43–65% of the wild type (Fig. 3; Table 3) in each case. This finding reveals that interdependent evolutionary plasticity at the coupled sites is required for co-evolution of residue pairs with strong ECs [51, 52].

Catalytic efficiency of engineered enzymes

The seven double mutants derived from our ECSM libraries exhibit 1.5–2.4-fold increases of kcat and 2.6–5.6-fold increases of kcat/Km relative to the WT enzyme (Fig. 4a; Table S2), with K631V/Q597S being the best-performing double mutant. Such increases in catalytic performance are impressive given that WT BnPUL is a highly active enzyme originating from an organism which evolved in a carbohydrate-rich environment [61]. Moreover, the seven double mutants with enhanced catalytic efficiency exhibit no detectable differences in terms of temperature and pH dependence of their activities (Fig. S3). For comparison, we performed saturation mutagenesis for two residue pairs with low PLM scores (V361/K327, PLM score = 0.41021; S721/E428, PLM score = 0.084614) and found that the resulting mutations did not significantly affect enzymatic activity (Table S3).

In order to evaluate potential synergy arising from combining the double mutants and to allow further directed evolution, the five best-performing ones (Fig. S2; Table S2) were selected to create a total of eight quadruple mutants (Fig. S4; Table S4). Construction of quadruple mutants did not bring a significant improvement of activity for most of the double mutants (Fig. S4). In terms of catalytic performance, on the other hand, the eight quadruple mutants exhibit 2.5–3.0-fold increases of kcat and 3.8–6.3-fold increases of kcat/Km relative to the WT enzyme (Fig. 4b; Table S4). As shown in Tables S2 and S4, both double and quadruple mutations clearly lower the Km values of the enzyme, while the quadruple mutants exhibit Km values similar to those of the double mutants; thus, the increases of kcat/Km after combination of positive double mutants are mostly due to increases of kcat. The best-performing quadruple mutant, K631V/Q597K/D541I/D473E, shows a 3.0-fold increase of kcat and a 6.3-fold increase of kcat/Km relative to the WT enzyme. As observed for the double mutants, all quadruple mutants exhibit the same temperature and pH dependence of activity as WT BnPUL (Fig. S5). In addition, measurements of the half-lives of the variants at 55 °C showed that the mutations at K631/Q597 and/or V328/I565 did not substantially change the stability of the enzyme, whereas the mutations at D541/D473 reduced it, suggesting that the residue pair D541/D473 might be functionally related to the stability of the pullulanase (Table S5; Table S6).
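The statement that the mutations lower Km can be cross-checked against the ratios in Table 3. Because the kcat/Km ratio equals the kcat ratio divided by the Km ratio, the Km of a mutant relative to WT follows as the kcat ratio divided by the kcat/Km ratio. The sketch below applies this to the seven double mutants; the inputs are the rounded values from Table 3, so the results are approximate, and the function name is hypothetical.

```python
# (kcat ratio vs. WT, kcat/Km ratio vs. WT) for the double mutants in Table 3.
double_mutants = {
    "D541I/D473E": (2.3, 2.9),
    "D541I/D473I": (1.7, 2.6),
    "D541I/D473Q": (1.5, 2.9),
    "K631V/Q597K": (1.8, 4.0),
    "K631V/Q597S": (2.3, 5.6),
    "V328L/I565L": (2.4, 3.0),
    "V328L/I565V": (2.4, 4.2),
}

def implied_km_ratio(kcat_ratio, efficiency_ratio):
    """Km(mutant)/Km(WT) implied by the kcat and kcat/Km ratios."""
    return kcat_ratio / efficiency_ratio

km_ratios = {
    name: round(implied_km_ratio(kcat, eff), 2)
    for name, (kcat, eff) in double_mutants.items()
}
```

All implied Km ratios fall below 1, in line with the text: the gain in kcat/Km exceeds the gain in kcat for every double mutant, so part of the improvement comes from a lower Km.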

Hence, the ECSM approach compares very favorably with site-directed mutagenesis [7, 8, 62, 63] and regional truncation [3, 5, 64] methods, which increased the catalytic efficiency only about twofold (Table S7).

Spatial location of mutational hotspots

Fig. 2 shows the spatial location in the homology model of BnPUL of the six residues (none of which is highly conserved; Fig. S6) forming the three residue pairs (K631/Q597, V328/I565, D541/D473) identified as mutational hotspots based on EC analysis resulting in mutant enzymes with improved catalytic activity (Fig. 4). For all three pairs, residues forming the pair are in close spatial proximity (< 6 Å; Table 3), as is usually observed for evolutionary coupled residues [45]. Interestingly, however, all six residues are located more than 14 Å away from each of the three residues forming the catalytic triad of BnPUL. Hence, although the four residues (K631, Q597, D541, D473) identified using EVcouplings are largely buried and located on the catalytic domain opposite to the catalytic site, it might be that they are of importance for substrate, intermediate or product binding. Alternatively [19, 20], these residues may affect [16–18, 65–67] the structure and/or dynamics of the active site to improve catalytic function. In contrast, the two residues (V328, I565) identified using EVcomplex are largely buried in the catalytic-CBM48 domain interface. Hence, it is very likely that those residues enhance catalytic performance through distal effects.

Molecular dynamics simulations of the pre-reaction state

To further understand the role of the identified residues and the corresponding mutations in the catalytic performance of the enzyme, molecular dynamics simulations were performed to analyze the structural characteristics of the pre-reaction state. According to a previous QM/MM calculation, the rate-determining step of α-amylase is believed to be the nucleophilic attack of the key aspartic acid residue on the C1 atom [68]. Fig. 5a shows the accessibility of the two carboxyl oxygen atoms of Asp619 toward the C1 anomeric carbon of the targeted α-1,6-glucosidic bond in the pre-reaction state, in which the carboxylate attacks the sp3 carbon and breaks the C-O bond in the rate-determining step; afterwards, the anomeric carbon would be neutralized by a water molecule under general acid/base catalysis by Glu648 and Asp733. It is essential for the carboxylate oxygen, the anomeric carbon, and the leaving glycosyl group (OR) to be in good linear alignment in the rate-determining transition state. Surprisingly, the simulations showed that the Asp619 OD1 atom in the most efficient mutant (K631V/Q597K/D541I/D473E) points toward the C1 atom, statistically and dynamically, much better than in the WT enzyme (Fig. 5b). Thus, we conclude that the distal mutations can significantly promote the reactive population of the pre-reaction state, and that global effects might be very important in enzyme engineering [69–71].

Figure 5.


Interaction between the active residue Asp619 and the targeted α-1,6-glucosidic bond in the pre-reaction state (a) and conformation maps of the most efficient mutant and the WT enzyme in the pre-reaction state simulations (b).

Conclusions

Using ECSM, pullulanase mutants were created with significantly enhanced catalytic activity when compared with WT BnPUL (kcat and kcat/Km were increased by factors of ~3 and ~6, respectively). These gains of catalytic activity are remarkable for a highly evolved naturally occurring enzyme, and they compare very favorably with previously engineered variants [3, 5, 7, 8, 62–64]. Since high thermostability and high enzyme activity at comparatively low pH are key for industrial applications of pullulanases [62], it is important to observe that the mutant enzymes exhibit virtually the same dependence of activity on temperature and pH as the WT, making these mutants suitable for industrial applications. Importantly, none of the six residues (i.e., three pairs) identified from calculations of ECs is located in or close to the active site. This finding indicates that mutations made at these residue pairs propagate through the enzyme structure to affect enzyme activity.

We showed that the calculation of evolutionary couplings is effective for identifying residue pairs for protein engineering by saturation mutagenesis. Even for large proteins without known 3D structures, the ECSM approach allows one to focus on a small number of top scoring residue pairs in order to generate small saturation mutagenesis libraries [50]. Importantly, coevolution analysis enables the identification of (peripheral) distant sites influencing enzyme activity, i.e., information which cannot be derived with structure-based engineering protocols focusing on the active site and its vicinity. Finally, our study showed that residues with strong ECs are very tolerant to mutations, and that a second mutation following a mutation which results in largely reduced activity can restore activity to nearly WT levels. Overall, our findings support the current view [50, 51] that co-evolution of two residues most often relies on a first mutation resulting in unchanged or reduced fitness, which is then followed by a second mutation resulting in increased fitness.

Overall, the newly developed ECSM approach, guided by natural sequence co-variance captured in the evolutionary record, promises to become a valuable tool for future enzyme engineering synergizing strongly with established techniques [72].

Materials and Methods

EVcouplings and EVcomplex analysis

Full-length 926-residue pullulanase from B. naganoensis contains five domains identified using the Conserved Domain Database at NCBI: CBM41, residues 5–100 (PF03714); X45, residues 109–163; X25, residues 162–250; CBM48, residues 315–393 (PF02922); and the catalytic domain, residues 473–787 (PF00128). The Jackhmmer algorithm [73] was used to search the UniProt [74] database to generate multiple sequence alignments as input for EC calculations. Using the EVcouplings server (http://evfold.org) with default parameters, ECs were determined with EVcouplings-PLM from sequences comprising the catalytic domain (473–787), the catalytic and CBM48 domains (315–787), and the catalytic domain and the C-terminal segment (473–926). ECs with PLM scores higher than 1.0 were selected for saturation mutagenesis, excluding pairs within ten residues of a higher scoring pair (Table 1). Next, additional inter-domain ECs between the catalytic and CBM48 domains were identified using the EVcomplex server (https://evcomplex.hms.harvard.edu) with default parameters. Inter-domain ECs with EVcomplex scores higher than 1.0 were selected for saturation mutagenesis, excluding pairs within ten residues of a higher scoring pair (Table 2).
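The domain boundaries listed above can be encoded as a simple lookup, which is convenient for annotating coupled residues with their location, as in the "Residue position" column of Table 1. In this sketch the five domain ranges come from the text, whereas the "linker" (394–472) and "C-terminal segment" (788–926) ranges are assumptions inferred from the neighbouring boundaries and the protein length. The function name is hypothetical.

```python
# Inclusive residue ranges. The five domains are from the text; the linker and
# C-terminal segment ranges are inferred from adjacent boundaries (assumption).
DOMAINS = [
    ("CBM41", 5, 100),
    ("X45", 109, 163),
    ("X25", 162, 250),
    ("CBM48", 315, 393),
    ("linker", 394, 472),               # assumed: between CBM48 and catalytic domain
    ("catalytic domain", 473, 787),
    ("C-terminal segment", 788, 926),   # assumed: after the catalytic domain
]

def domain_of(position):
    """Return the first listed region containing the residue position."""
    for name, start, end in DOMAINS:
        if start <= position <= end:
            return name
    return "unassigned"

# Locations of the inter-domain pair V328/I565 selected via EVcomplex.
v328_domain = domain_of(328)
i565_domain = domain_of(565)
```

Because the X45 and X25 ranges overlap at residues 162–163, the lookup returns the first match for those positions.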

Materials

A QuikChange Lightning Site-Directed Mutagenesis Kit was purchased from Stratagene (La Jolla, CA, USA). Oligonucleotides were synthesized by Talen-bio Technology (Shanghai, China). Plasmid preparation kit I (D6943–02) was purchased from Omega Bio-tek (Norcross, GA, USA). DNA sequencing was conducted by Talen-Bio Scientific (Shanghai, China). All commercial chemicals were purchased from Sigma (Saint Louis, MO, USA), Takara (Otsu, Japan), or Sinopharm Chemical Reagent (Shanghai, China). Pullulan was purchased from Tokyo Chemical Industry (Chuo-ku, Tokyo), and red pullulan was ordered from Megazyme (Wicklow, Ireland).

Primer design and library construction

The full-length enzyme was expressed with a 31-residue N-terminal extension in a pET vector (pET-28a-PelB) using lactose induction of T7 polymerase in E. coli strain BL21 (DE3). The full sequence of the N-terminal extension is MKYLLPTAAAGLLLLAAQPAMAMDIGINSDP. The first 22 residues in this sequence (MKYLLPTAAAGLLLLAAQPAMA) encode the signal sequence of the PelB protein from E. carotovora, while the final nine residues arise from the expression vector. The list of primers is shown in Table S8. The primers contained a random sampling of the nucleotides at each of the three positions in the codon for the mutated amino acid. For the double mutant T520/E530, a single primer containing both saturation mutation sites was designed. For the other double mutants, expression systems were generated in two steps. First, primers for saturation mutagenesis at one of the two mutation sites were used with the WT sequence as the template. After digestion of the PCR products with Dpn I to remove the WT template plasmid, 2 μL of the reaction mixture was used to transform E. coli BL21 (DE3). After plating, individual colonies were picked, and based on sequencing results, plasmids were selected representing the full set of 19 amino-acid substitutions at the target site. The single mutants were screened for enzyme activity as described in the next paragraph, and inactive mutants were discarded. Next, each of the plasmids encoding active single mutants was used as a template for saturation mutagenesis at the second mutation site following the same protocol.

Screening

Colonies of cells containing mutant plasmids were picked and spotted at 45 positions on plates containing solid Studier auto-induction medium [75] (recipe given below) with 2% agar, 1% red pullulan, and 50 μg·mL−1 kanamycin, and the plates were incubated overnight at 37 °C (Fig. S1). Colonies forming transparent zones on the screening plates were picked for protein overexpression and used to inoculate 5 mL LB medium in the presence of kanamycin (50 μg·mL−1). After incubation for 10 h at 37 °C with shaking at 200 rpm, 1 mL of the culture was transferred to a 250 mL flask containing 50 mL auto-induction medium (10 g·L−1 α-lactose, 1.0 g·L−1 glucose, 5.0 g·L−1 glycerol, 6.8 g·L−1 KH2PO4, 0.25 g·L−1 MgSO4, 10 g·L−1 tryptone, 5.0 g·L−1 yeast extract, 7.1 g·L−1 Na2HPO4, 0.71 g·L−1 Na2SO4, and 2.67 g·L−1 NH4Cl, pH 7.5) supplemented with kanamycin (50 μg·mL−1). After cultivation at 37 °C with shaking at 200 rpm for the first 2–3 h, the culture was incubated at 17 °C with shaking at 200 rpm for another 60 h to express the target protein.

Subsequently, the cells were centrifuged and lysed by sonication. The supernatant obtained before and after cell lysis was assayed for pullulanase activity by measuring aldehyde release from a reaction mixture containing 0.2 mL of 2% (w/v) pullulan in 0.1 M sodium acetate buffer (pH 4.5) and 0.2 mL enzyme solution diluted with 0.1 M sodium acetate buffer (pH 4.5), which was incubated at 60 °C for 20 min. Next, the amount of released aldehyde was determined using dinitrosalicylic acid (DNS) and measuring the absorbance at 540 nm spectrophotometrically. One unit of pullulanase activity is defined as the amount of pullulanase that releases 1 μmol of aldehyde per min under these reaction conditions [76]. The reported enzyme activity (Fig. S2 and S4) corresponded to the intracellular (supernatant after cell lysis) activities. All assays were performed in triplicate. Double-mutants exhibiting enzyme activity greater than that of WT pullulanase (cut-off 580 U·mL−1) were selected for protein purification [6] to accurately measure the Michaelis-Menten parameters.
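
The unit definition above can be turned into a small calculation. The sketch below converts a DNS absorbance reading into micromoles of aldehyde through a linear standard curve and then into U·mL−1 for the 20 min reaction with 0.2 mL of enzyme solution. The function names, the standard-curve slope and intercept, and the dilution factor are hypothetical placeholders, since the text does not report a standard curve.

```python
def aldehyde_from_a540(a540, slope, intercept=0.0):
    """Micromoles of aldehyde from absorbance at 540 nm.

    `slope` (absorbance per micromole) and `intercept` come from a
    user-supplied linear standard curve; they are not given in the text.
    """
    if slope <= 0:
        raise ValueError("slope must be positive")
    return (a540 - intercept) / slope


def activity_u_per_ml(aldehyde_umol, minutes=20.0, enzyme_ml=0.2, dilution=1.0):
    """Pullulanase activity in U per mL of undiluted enzyme solution.

    One unit releases 1 micromole of aldehyde per minute, so activity is
    micromoles / minutes / mL of enzyme, scaled by the dilution factor.
    """
    if minutes <= 0 or enzyme_ml <= 0:
        raise ValueError("minutes and enzyme_ml must be positive")
    return aldehyde_umol / minutes / enzyme_ml * dilution


def exceeds_cutoff(activity, cutoff=580.0):
    """True if the activity is above the WT cut-off used for selection."""
    return activity > cutoff


if __name__ == "__main__":
    umol = aldehyde_from_a540(0.5, slope=0.25)    # hypothetical reading
    act = activity_u_per_ml(umol, dilution=100.0)  # hypothetical dilution
    print(f"{act:.1f} U/mL, above cut-off: {exceeds_cutoff(act)}")
```

The default arguments mirror the assay conditions stated in the text (20 min, 0.2 mL enzyme solution, 580 U·mL−1 cut-off); everything else must be supplied from the actual measurement.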

Colonies from the D614/H539 and V328/I565 libraries that showed no transparent zones on the red pullulan screening plates were picked for sequencing. Five single mutants were selected for protein overexpression, their enzyme activity per unit culture volume was measured, and second-site saturation mutagenesis was performed. For the resulting double mutants that showed transparent zones on the plates, enzyme activity per unit culture volume was measured.

Determination of Michaelis-Menten parameters

Km and kcat of the purified enzymes were determined based on a previously described method [77] with measurements at the following pullulan concentrations: 0.1, 0.25, 0.5, 0.75, 1.0, 1.25, 1.5, 1.75, 2.0, 2.25, 2.5, 2.75, 3.0, 4.0, 5.0, 6.0, 8.0, 10.0, 15.0, and 20.0 mg·mL−1. The kcat and Km values were obtained by fitting the initial rate data to the Michaelis-Menten equation using nonlinear regression in GraphPad Prism [78]. Values are shown as means from three replicates with standard deviations.
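
For readers without GraphPad Prism, the fit can be approximated in a few lines of plain Python. The sketch below is not the authors' procedure. It fits v = Vmax·S/(Km + S) by least squares, solving Vmax in closed form for each trial Km and searching Km on a log grid followed by golden-section refinement. The function names, the search bounds, and the synthetic rates in the usage example are assumptions; kcat then follows as Vmax divided by the molar enzyme concentration, which has to be supplied separately.

```python
import math

# Pullulan concentrations used in the assay (mg/mL), as listed in the text.
PULLULAN_MG_PER_ML = [0.1, 0.25, 0.5, 0.75, 1.0, 1.25, 1.5, 1.75, 2.0, 2.25,
                      2.5, 2.75, 3.0, 4.0, 5.0, 6.0, 8.0, 10.0, 15.0, 20.0]


def mm_rate(s, vmax, km):
    """Michaelis-Menten initial rate at substrate concentration s."""
    return vmax * s / (km + s)


def _vmax_and_sse(km, s_values, v_values):
    """Best-fit Vmax for a fixed Km (linear least squares) and its SSE."""
    f = [s / (km + s) for s in s_values]
    vmax = sum(v * fi for v, fi in zip(v_values, f)) / sum(fi * fi for fi in f)
    sse = sum((v - vmax * fi) ** 2 for v, fi in zip(v_values, f))
    return vmax, sse


def fit_michaelis_menten(s_values, v_values, km_low=1e-3, km_high=1e3):
    """Return (vmax, km) minimizing the sum of squared residuals."""
    def sse(log_km):
        return _vmax_and_sse(10.0 ** log_km, s_values, v_values)[1]

    # Coarse search on a log10(Km) grid to bracket the minimum.
    n_grid = 121
    lo, hi = math.log10(km_low), math.log10(km_high)
    grid = [lo + (hi - lo) * i / (n_grid - 1) for i in range(n_grid)]
    best = min(range(n_grid), key=lambda i: sse(grid[i]))
    a = grid[max(best - 1, 0)]
    b = grid[min(best + 1, n_grid - 1)]

    # Golden-section refinement inside the bracket.
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    for _ in range(100):
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
        if sse(c) < sse(d):
            b = d
        else:
            a = c
    km = 10.0 ** ((a + b) / 2.0)
    vmax, _ = _vmax_and_sse(km, s_values, v_values)
    return vmax, km


if __name__ == "__main__":
    # Synthetic, noise-free rates generated from made-up parameters.
    rates = [mm_rate(s, vmax=100.0, km=2.0) for s in PULLULAN_MG_PER_ML]
    print(fit_michaelis_menten(PULLULAN_MG_PER_ML, rates))
```

Because Km is fitted in the units of the substrate axis, the value returned here is in mg·mL−1 when the concentrations above are used.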

Molecular dynamics simulations of the pre-reaction state

The initial protein structure used in the pre-reaction state analysis was constructed by homology modeling [79], using a crystal structure (PDB ID: 2WAN) of pullulanase from B. acidopullulyticus as the template (sequence identity: 63.67%). The substrate was placed in the enzyme by reference to the similar triad (D-E-D) in human α-glucosidase maltase-glucoamylase [80]. The enzyme-substrate complexes were further optimized with the CHARMm force field in the Discovery Studio 3.5 software package. The saccharide substrate was structurally optimized with the semi-empirical AM1 method and parameterized for the subsequent molecular dynamics simulations with the antechamber program, using the Gaussian16 [81] and AMBER12 [82] software packages. Water molecules were described with the TIP3P model, and the general Amber force field (GAFF) and the ff03.r1 force field were applied in the classical molecular dynamics simulations. The complexes were placed in a truncated octahedral box of water molecules extending 10.0 Å along each dimension. Cl− counterions were added to neutralize the system. The MD systems were first minimized by 1000 steps of steepest descent followed by 9000 steps of conjugate gradient minimization, then heated from 0 to 300 K at constant volume over 50 ps, and equilibrated for another 50 ps without any restraints. In the molecular dynamics simulations, the Particle Mesh Ewald (PME) method was employed for long-range electrostatic interactions. Finally, multiple 10 ns trajectories were collected for further pre-reaction state analysis [69–71].
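
The two-stage minimization described above maps onto the standard AMBER minimization input, where `imin=1` requests minimization, `maxcyc` is the total number of cycles, and `ncyc` is the cycle at which the minimizer switches from steepest descent to conjugate gradient. The Python sketch below only writes such an input file. The function name and title line are assumptions, and other settings the authors may have used (cutoffs, output frequency, choice of engine) are not stated in the text and are therefore omitted.

```python
def minimization_mdin(sd_steps=1000, cg_steps=9000):
    """Build an AMBER minimization input as a string.

    sd_steps steepest-descent cycles are followed by cg_steps
    conjugate-gradient cycles, so maxcyc = sd_steps + cg_steps and
    ncyc = sd_steps. Only these settings are taken from the text.
    """
    if sd_steps < 0 or cg_steps < 0:
        raise ValueError("step counts must be non-negative")
    total = sd_steps + cg_steps
    lines = [
        "Minimization: steepest descent, then conjugate gradient",
        " &cntrl",
        "  imin=1,",             # run energy minimization
        f"  maxcyc={total},",    # total minimization cycles
        f"  ncyc={sd_steps},",   # switch from SD to CG after this cycle
        " /",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(minimization_mdin(), end="")
```

With the default arguments this yields `maxcyc=10000` and `ncyc=1000`, matching the 1000 + 9000 step scheme in the text.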

Supplementary Material

Supplementary Material Revised

Acknowledgments

Financial support from the National Natural Science Foundation of China (NSFC) (31872891, 21676120), the 111 Project (111–2-06), the High-end Foreign Experts Recruitment Program (G20190010083), the Program for Advanced Talents within Six Industries of Jiangsu Province (2015-NY-007), the National Program for Support of Top-notch Young Professionals, the Fundamental Research Funds for the Central Universities (JUSRP51504), the Project Funded by the Priority Academic Program Development of Jiangsu Higher Education Institutions, the Jiangsu province “Collaborative Innovation Center for Advanced Industrial Fermentation” industry development program, and the National First-Class Discipline Program of Light Industry Technology and Engineering (LITE2018–09) is greatly appreciated. G.T.M.’s contributions to this work were supported in part by NIH NIGMS grant 1R01-GM120574.

Abbreviations

EC, evolutionary coupling

PLM, pseudo-likelihood maximization direct coupling analysis

DI, mean-field direct coupling analysis

SM, saturation mutagenesis

BnPUL, pullulanase from Bacillus naganoensis

MSA, multiple sequence alignment

pullulanase, EC 3.2.1.41

Footnotes

Supporting Information

Supplementary results of ECSM of low-covariance residue pairs. Auto-induction red pullulan plate (Fig. S1); enzyme activity and specific activity of the WT and double mutants (Fig. S2); effects of pH and temperature on enzyme activity of the WT and double mutants (Fig. S3); enzyme activity and specific activity of the WT and quadruple mutants (Fig. S4); effects of pH and temperature on enzyme activity of the WT and quadruple mutants (Fig. S5); graphical representation of sequence conservation (Fig. S6); enzyme activity and specific activity of the WT and double mutants from low-covariance residue pairs (Fig. S7); effects of pH and temperature on enzyme activity of the WT and double mutants from low-covariance residue pairs (Fig. S8); evolutionary couplings generated from domain sequence analysis by GREMLIN (Table S1); kinetic parameters of the WT and double mutants (Table S2); kinetic parameters of the WT and double mutants from low-covariance residue pairs (Table S3); kinetic parameters of the WT and quadruple mutants (Table S4); half-lives of the WT and the ECSM double mutants (Table S5); half-lives of the WT and the ECSM quadruple mutants (Table S6); comparison of mutation results between the ECSM method and other protein engineering methods for improving pullulanase catalytic performance (Table S7); oligonucleotides for saturation mutagenesis (Table S8).

References

  • 1. Henrissat B & Bairoch A (1993) New families in the classification of glycosyl hydrolases based on amino acid sequence similarities, Biochem J 293, 781–788.
  • 2. Bertoldo C & Antranikian G (2002) Starch-hydrolyzing enzymes from thermophilic archaea and bacteria, Curr Opin Chem Biol 6, 151–160.
  • 3. Duan X & Wu J (2015) Enhancing the secretion efficiency and thermostability of a Bacillus deramificans pullulanase mutant (D437H/D503Y) by N-terminal domain truncation, Appl Environ Microbiol 81, 1926–1931.
  • 4. Nisha M & Satyanarayana T (2015) The role of N1 domain on the activity, stability, substrate specificity and raw starch binding of amylopullulanase of the extreme thermophile Geobacillus thermoleovorans, Appl Microbiol Biotechnol 99, 5461–74.
  • 5. Chen A, Sun Y, Zhang W, Peng F, Zhan C, Liu M, Yang Y & Bai Z (2016) Downsizing a pullulanase to a small molecule with improved soluble expression and secretion efficiency in Escherichia coli, Microb Cell Fact 15, 9.
  • 6. Wang X, Nie Y, Mu X, Xu Y & Xiao R (2016) Disorder prediction-based construct optimization improves activity and catalytic efficiency of Bacillus naganoensis pullulanase, Sci Rep 6, 24574.
  • 7. Wang X, Nie Y & Xu Y (2018) Improvement of the activity and stability of starch-debranching pullulanase from Bacillus naganoensis via tailoring of the active sites lining the catalytic pocket, J Agric Food Chem 66, 13236–13242.
  • 8. Duan X, Chen J & Wu J (2013) Improving the thermostability and catalytic efficiency of Bacillus deramificans pullulanase by site-directed mutagenesis, Appl Environ Microbiol 79, 4072–7.
  • 9. Brustad EM & Arnold FH (2011) Optimizing non-natural protein function with directed evolution, Curr Opin Chem Biol 15, 201–210.
  • 10. Turner NJ (2009) Directed evolution drives the next generation of biocatalysts, Nat Chem Biol 5, 568–574.
  • 11. Denard CA, Ren H & Zhao H (2015) Improving and repurposing biocatalysts via directed evolution, Curr Opin Chem Biol 25, 55–64.
  • 12. Reetz MT (2011) Laboratory evolution of stereoselective enzymes: A prolific source of catalysts for asymmetric reactions, Angew Chem Int Ed 50, 138–174.
  • 13. Kazlauskas RJ & Bornscheuer UT (2009) Finding better protein engineering strategies, Nat Chem Biol 5, 526–529.
  • 14. Sun Z, Lonsdale R, Wu L, Li G, Li A, Wang J, Zhou J & Reetz MT (2016) Structure-guided triple-code saturation mutagenesis: Efficient tuning of the stereoselectivity of an epoxide hydrolase, ACS Catal 6, 1590–1597.
  • 15. Wijma HJ, Floor RJ & Janssen DB (2013) Structure-and sequence-analysis inspired engineering of proteins for enhanced thermostability, Curr Opin Struct Biol 23, 588–94.
  • 16. Keszei AFA & Sicheri F (2017) Mechanism of catalysis, E2 recognition, and autoinhibition for the IpaH family of bacterial E3 ubiquitin ligases, Proc Natl Acad Sci USA 114, 1311–1316.
  • 17. Schuetz AK & Kay LE (2016) A dynamic molecular basis for malfunction in disease mutants of p97/VCP, eLife 5, e20143.
  • 18. Gershenson A, Schauerte JA, Giver L & Arnold FH (2000) Tryptophan phosphorescence study of enzyme flexibility and unfolding in laboratory-evolved thermostable esterases, Biochemistry 39, 4658–4665.
  • 19. Suel GM, Lockless SW, Wall MA & Ranganathan R (2003) Evolutionarily conserved networks of residues mediate allosteric communication in proteins, Nat Struct Biol 10, 59–69.
  • 20. Motlagh HN, Wrabl JO, Li J & Hilser VJ (2014) The ensemble nature of allostery, Nature 508, 331–339.
  • 21. Sumbalova L, Stourac J, Martinek T, Bednar D & Damborsky J (2018) HotSpot Wizard 3.0: web server for automated design of mutations and smart libraries based on sequence input information, Nucleic Acids Res 46, W356–W362.
  • 22. Suplatov D, Sharapova Y, Timonina D, Kopylov K & Svedas V (2018) The visualCMAT: A web-server to select and interpret correlated mutations/co-evolving residues in protein families, J Bioinf Comput Biol 16, 1840005.
  • 23. Gobel U, Sander C, Schneider R & Valencia A (1994) Correlated mutations and residue contacts in proteins, Proteins 18, 309–317.
  • 24. Neher E (1994) How frequent are correlated changes in families of protein sequences, Proc Natl Acad Sci USA 91, 98–102.
  • 25. Taylor WR & Hatrick K (1994) Compensating changes in protein multiple sequence alignments, Protein Eng 7, 341–348.
  • 26. Shindyalov IN, Kolchanov NA & Sander C (1994) Can 3-dimensional contacts in protein structures be predicted by analysis of correlated mutations, Protein Eng 7, 349–358.
  • 27. Thomas DJ, Casari G & Sander C (1996) The prediction of protein contacts from multiple sequence alignments, Protein Eng 9, 941–948.
  • 28. Lee J, Natarajan M, Nashine VC, Socolich M, Vo T, Russ WP, Benkovic SJ & Ranganathan R (2008) Surface sites for engineering allosteric control in proteins, Science 322, 438–442.
  • 29. Halabi N, Rivoire O, Leibler S & Ranganathan R (2009) Protein sectors: evolutionary units of three-dimensional structure, Cell 138, 774–786.
  • 30. Reynolds KA, McLaughlin RN & Ranganathan R (2011) Hot spots for allosteric regulation on protein surfaces, Cell 147, 1564–1575.
  • 31. Teşileanu T, Colwell LJ & Leibler S (2015) Protein sectors: statistical coupling analysis versus conservation, PLoS Comput Biol 11, e1004091.
  • 32. Steipe B, Schiller B, Plückthun A & Steinbacher S (1994) Sequence statistics reliably predict stabilizing mutations in a protein domain, J Mol Biol 240, 188–192.
  • 33. Morcos F, Pagnani A, Lunt B, Bertolino A, Marks DS, Sander C, Zecchina R, Onuchic JN, Hwa T & Weigt M (2011) Direct-coupling analysis of residue coevolution captures native contacts across many protein families, Proc Natl Acad Sci USA 108, E1293–E1301.
  • 34. Marks DS, Colwell LJ, Sheridan R, Hopf TA, Pagnani A, Zecchina R & Sander C (2011) Protein 3D structure computed from evolutionary sequence variation, PLoS One 6, e28766.
  • 35. Sulkowska JI, Morcos F, Weigt M, Hwa T & Onuchic JN (2012) Genomics-aided structure prediction, Proc Natl Acad Sci USA 109, 10340–10345.
  • 36. Ekeberg M, Lovkvist C, Lan Y, Weigt M & Aurell E (2013) Improved contact prediction in proteins: Using pseudolikelihoods to infer Potts models, Phys Rev E Stat Nonlinear Soft Matter Phys 87, 012707.
  • 37. Kamisetty H, Ovchinnikov S & Baker D (2013) Assessing the utility of coevolution-based residue-residue contact predictions in a sequence-and structure-rich era, Proc Natl Acad Sci USA 110, 15674–15679.
  • 38. Lapedes A, Giraud B & Jarzynski C (2012) Using sequence alignments to predict protein structure and stability with high accuracy, arXiv, 1207.2484.
  • 39. Hopf TA, Colwell LJ, Sheridan R, Rost B, Sander C & Marks DS (2012) Three-dimensional structures of membrane proteins from genomic sequencing, Cell 149, 1607–1621.
  • 40. Marks DS, Hopf TA & Sander C (2012) Protein structure prediction from sequence variation, Nat Biotechnol 30, 1072–80.
  • 41. Hopf TA, Schärfe CPI, Rodrigues JPGLM, Green AG, Kohlbacher O, Sander C, Bonvin AMJJ & Marks DS (2014) Sequence co-evolution gives 3D contacts and structures of protein complexes, eLife 3, e03430.
  • 42. Michel M, Hayat S, Skwark MJ, Sander C, Marks DS & Elofsson A (2014) PconsFold: improved contact predictions improve protein models, Bioinformatics 30, 482–488.
  • 43. Ovchinnikov S, Kamisetty H & Baker D (2014) Robust and accurate prediction of residue-residue interactions across protein interfaces using evolutionary information, eLife 3, e02030.
  • 44. Ovchinnikov S, Kim DE, Wang RY-R, Liu Y, DiMaio F & Baker D (2016) Improved de novo structure prediction in CASP11 by incorporating coevolution information into Rosetta, Proteins 84, 67–75.
  • 45. Anishchenko I, Ovchinnikov S, Kamisetty H & Baker D (2017) Origins of coevolution between residues distant in protein 3D structures, Proc Natl Acad Sci USA 114, 9122–9127.
  • 46. Ovchinnikov S, Park H, Varghese N, Huang P-S, Pavlopoulos GA, Kim DE, Kamisetty H, Kyrpides NC & Baker D (2017) Protein structure determination using metagenome sequence data, Science 355, 294–297.
  • 47. Simkovic F, Ovchinnikov S, Baker D & Rigden DJ (2017) Applications of contact predictions to structural biology, IUCrJ 4, 291–300.
  • 48. Cheng RR, Morcos F, Levine H & Onuchic JN (2014) Toward rationally redesigning bacterial two-component signaling systems using coevolutionary information, Proc Natl Acad Sci USA 111, E563–E571.
  • 49. Tang Y, Huang YJ, Hopf TA, Sander C, Marks DS & Montelione GT (2015) Protein structure determination by combining sparse NMR data with evolutionary couplings, Nat Methods 12, 751–754.
  • 50. Lutz S, Williams E & Muthu P (2017) Engineering Therapeutic Enzymes. In Directed Enzyme Evolution: Advances and Applications (Alcalde M, ed), pp. 17–67. Springer International Publishing, AG, Switzerland.
  • 51. Tamaki FK, Textor LC, Polikarpov I & Marana SR (2014) Sets of covariant residues modulate the activity and thermal stability of GH1 beta-glucosidases, PLoS One 9.
  • 52. Aakre CD, Herrou J, Phung TN, Perchuk BS, Crosson S & Laub MT (2015) Evolving new protein-protein interaction specificity through promiscuous intermediates, Cell 163, 594–606.
  • 53. Sheridan R, Fieldhouse RJ, Hayat S, Sun Y, Antipin Y, Yang L, Hopf T, Marks DS & Sander C (2015) EVfold.org: Evolutionary Couplings and Protein 3D Structure Prediction, bioRxiv.
  • 54. Stein RR, Marks DS & Sander C (2015) Inferring Pairwise Interactions from Biological Data Using Maximum-Entropy Probability Models, PLoS Comput Biol 11.
  • 55. Jones DT, Buchan DWA, Cozzetto D & Pontil M (2012) PSICOV: precise structural contact prediction using sparse inverse covariance estimation on large multiple sequence alignments, Bioinformatics 28, 184–190.
  • 56. Nisha M & Satyanarayana T (2016) Characteristics, protein engineering and applications of microbial thermostable pullulanases and pullulan hydrolases, Appl Microbiol Biotechnol 100, 5661–5679.
  • 57. Schneidman E, Berry MJ, Segev R & Bialek W (2006) Weak pairwise correlations imply strongly correlated network states in a neural population, Nature 440, 1007–1012.
  • 58. Kille S, Acevedo-Rocha CG, Parra LP, Zhang ZG, Opperman DJ, Reetz MT & Acevedo JP (2013) Reducing codon redundancy and screening effort of combinatorial protein libraries created by saturation mutagenesis, ACS Synth Biol 2, 83–92.
  • 59. Reetz MT, Bocola M, Carballeira JD, Zha DX & Vogel A (2005) Expanding the range of substrate acceptance of enzymes: Combinatorial active-site saturation test, Angew Chem Int Ed 44, 4192–4196.
  • 60. Bosley AD & Ostermeier M (2005) Mathematical expressions useful in the construction, description and evaluation of protein libraries, Biomol Eng 22, 57–61.
  • 61. Brown SH, Costantino HR & Kelly RM (1990) Characterization of amylolytic enzyme activities associated with the hyperthermophilic archaebacterium Pyrococcus furiosus, Appl Environ Microbiol 56, 1985–91.
  • 62. Chang M, Chu X, Lv J, Li Q, Tian J & Wu N (2016) Improving the thermostability of acidic pullulanase from Bacillus naganoensis by rational design, PLoS One 11, e0165006.
  • 63. Wang J, Liu Z & Zhou Z (2018) The N-terminal domain of pullulanase from Anoxybacillus sp WB42 modulates enzyme specificity and thermostability, ChemBioChem 19, 949–955.
  • 64. Mesbah NM & Wiegel J (2018) Biochemical characterization of halophilic, alkalithermophilic amylopullulanase PulD7 and truncated amylopullulanases PulD7DeltaN and PulD7DeltaC, Int J Biol Macromol 111, 632–638.
  • 65. Oelschlaeger P, Schmid RD & Pleiss J (2003) Modeling domino effects in enzymes: Molecular basis of the substrate specificity of the bacterial metallo-beta-lactamases IMP-1 and IMP-6, Biochemistry 42, 8945–8956.
  • 66. Oelschlaeger P & Pleiss J (2007) Hydroxyl groups in the beta beta sandwich of metallo-beta-lactamases favor enzyme activity: Tyr218 and Ser262 pull down the lid, J Mol Biol 366, 316–329.
  • 67. Agniswamy J, Louis JM, Roche J, Harrison RW & Weber IT (2016) Structural studies of a rationally selected multi-drug resistant HIV-1 protease reveal synergistic effect of distal mutations on flap dynamics, PLoS One 11, e0168616.
  • 68. Pinto GP, Bras NF, Perez MAS, Fernandes PA, Russo N, Ramos MJ & Toscano M (2015) Establishing the Catalytic Mechanism of Human Pancreatic alpha-Amylase with QM/MM Methods, J Chem Theory Comput 11, 2508–2516.
  • 69. Chen X-P, Shi T, Wang X-L, Wang J, Chen Q, Bai L & Zhao Y-L (2016) Theoretical Studies on the Mechanism of Thioesterase-Catalyzed Macrocyclization in Erythromycin Biosynthesis, ACS Catal 6, 4369–4378.
  • 70. Shi T, Liu L, Tao W, Luo S, Fan S, Wang X-L, Bai L & Zhao Y-L (2018) Theoretical Studies on the Catalytic Mechanism and Substrate Diversity for Macrocyclization of Pikromycin Thioesterase, ACS Catal 8, 4323–4332.
  • 71. Zhou J, Wang Y, Xu G, Wu L, Han R, Schwaneberg U, Rao Y, Zhao Y-L, Zhou J & Ni Y (2018) Structural Insight into Enantioselective Inversion of an Alcohol Dehydrogenase Reveals a “Polar Gate” in Stereorecognition of Diaryl Ketones, J Am Chem Soc 140, 12645–12654.
  • 72. Bornscheuer UT, Huisman GW, Kazlauskas RJ, Lutz S, Moore JC & Robins K (2012) Engineering the third wave of biocatalysis, Nature 485, 185–194.
  • 73. Johnson LS, Eddy SR & Portugaly E (2010) Hidden Markov model speed heuristic and iterative HMM search procedure, BMC Bioinf 11, 431.
  • 74. Apweiler R, Bairoch A, Wu CH, Barker WC, Boeckmann B, Ferro S, Gasteiger E, Huang HZ, Lopez R, Magrane M, Martin MJ, Natale DA, O’Donovan C, Redaschi N & Yeh LSL (2004) UniProt: the Universal Protein knowledgebase, Nucleic Acids Res 32, D115–D119.
  • 75. Studier FW (2005) Protein production by auto-induction in high-density shaking cultures, Protein Expression Purif 41, 207–234.
  • 76. Nie Y, Yan W, Xu Y, Chen WB, Mu XQ, Wang X & Xiao R (2013) High-level expression of Bacillus naganoensis pullulanase from recombinant Escherichia coli with auto-induction: effect of lac operator, PLoS One 8, e78416.
  • 77. Malle D, Itoh T, Hashimoto W, Murata K, Utsumi S & Mikami B (2006) Overexpression, purification and preliminary X-ray analysis of pullulanase from Bacillus subtilis strain 168, Acta Crystallogr Sect F Struct Biol Cryst Commun 62, 381–4.
  • 78. Swift ML (1997) GraphPad prism, data analysis, and scientific graphing, J Chem Inf Comput Sci 37, 411–412.
  • 79. Waterhouse A, Bertoni M, Bienert S, Studer G, Tauriello G, Gumienny R, Heer FT, de Beer TAP, Rempfer C, Bordoli L, Lepore R & Schwede T (2018) SWISS-MODEL: homology modelling of protein structures and complexes, Nucleic Acids Res 46, W296–W303.
  • 80. Bras NF, Santos-Martins D, Fernandes PA & Ramos MJ (2018) Mechanistic Pathway on Human alpha-Glucosidase Maltase-Glucoamylase Unveiled by QM/MM Calculations, J Phys Chem B 122, 3889–3899.
  • 81. Frisch MJ, Trucks GW, Schlegel HB, Scuseria GE, Robb MA, Cheeseman JR, Scalmani G, Barone V, Mennucci B, Petersson GA, Nakatsuji H, Caricato M, Li X, Hratchian HP, Izmaylov AF, Bloino J, Zheng G, Sonnenberg JL, Hada M, Ehara M, Toyota K, Fukuda R, Hasegawa J, Ishida M, Nakajima T, Honda Y, Kitao O, Nakai H, Vreven T, Montgomery JA Jr., Peralta JE, Ogliaro F, Bearpark M, Heyd JJ, Brothers E, Kudin KN, Staroverov VN, Kobayashi R, Normand J, Raghavachari K, Rendell A, Burant JC, Iyengar SS, Tomasi J, Cossi M, Rega N, Millam JM, Klene M, Knox JE, Cross JB, Bakken V, Adamo C, Jaramillo J, Gomperts R, Stratmann RE, Yazyev O, Austin AJ, Cammi R, Pomelli C, Ochterski JW, Martin RL, Morokuma K, Zakrzewski VG, Voth GA, Salvador P, Dannenberg JJ, Dapprich S, Daniels AD, Farkas O, Foresman JB, Ortiz JV, Cioslowski J & Fox DJ (2016) Gaussian 16, Revision B.01, Gaussian, Inc, Wallingford CT.
  • 82. Case DA, Ben-Shalom IY, Brozell SR, Cerutti DS, Cheatham TE III, Cruzeiro VWD, Darden TA, Duke RE, Ghoreishi D, Gilson MK, Gohlke H, Goetz AW, Greene D, Harris R, Homeyer N, Izadi S, Kovalenko A, Kurtzman T, Lee TS, LeGrand S, Li P, Lin C, Liu J, Luchko T, Luo R, Mermelstein DJ, Merz KM, Miao Y, Monard G, Nguyen C, Nguyen H, Omelyan I, Onufriev A, Pan F, Qi R, Roe DR, Roitberg A, Sagui C, Schott-Verdugo S, Shen J, Simmerling CL, Smith J, Salomon-Ferrer R, Swails J, Walker RC, Wang J, Wei H, Wolf RM, Wu X, Xiao L, York DM & Kollman PA (2012) AMBER 12, University of California, San Francisco.
