Nature Communications. 2018 Sep 26;9:3935. doi: 10.1038/s41467-018-06403-x

Mapping protein selectivity landscapes using multi-target selective screening and next-generation sequencing of combinatorial libraries

Si Naftaly 1,#, Itay Cohen 1,#, Anat Shahar 2, Alexandra Hockla 3, Evette S Radisky 3, Niv Papo 1
PMCID: PMC6158287  PMID: 30258049

Abstract

Characterizing the binding selectivity landscape of interacting proteins is crucial both for elucidating the underlying mechanisms of their interaction and for developing selective inhibitors. However, current mapping methods are laborious and cannot provide a sufficiently comprehensive description of the landscape. Here, we introduce a novel and efficient strategy for comprehensively mapping the binding landscape of proteins using a combination of experimental multi-target selective library screening and in silico next-generation sequencing analysis. We map the binding landscape of a non-selective trypsin inhibitor, the amyloid protein precursor inhibitor (APPI), to each of the four human serine proteases (kallikrein-6, mesotrypsin, and anionic and cationic trypsins). We then use this map to dissect and improve the affinity and selectivity of APPI variants toward each of the four proteases. Our strategy can be used as a platform for the development of a new generation of target-selective probes and therapeutic agents based on selective protein–protein interactions.


Characterizing the binding selectivity landscape of interacting proteins is crucial in protein engineering. Here the authors use multi-target selective library screening and in silico next-generation sequencing to map the binding landscape of proteins and produce improved protease inhibitors.

Introduction

A defining characteristic of protein–protein interactions (PPIs) is the binding selectivity landscape of the interacting proteins1–3, which relates the amino acid sequence to the affinity of a protein toward its target. Comprehensively mapping this landscape is crucial both for understanding the mechanisms and evolutionary origins of selective PPIs and for protein engineering purposes, e.g., for designing selective binders and/or inhibitors for target proteins4–7. The binding selectivity landscape of each protein in a certain PPI is characterized by the interfacial residues of the protein, such that point mutating these residues can help determine the contribution of each residue to target selectivity or, in a protein with a broad selectivity spectrum, to the selectivity of the protein toward each of its putative targets individually. The binding selectivity landscape usually comprises four types of key interface residues. Hot-spot residues are a few8 interface residues that are highly relevant for a specific PPI, i.e., they contribute almost 75% of the total free energy of binding (ΔΔGbind) of the protein to its partner9–11. Mutating hot-spot residues therefore decreases the affinity of the protein to a specific partner, but not necessarily to others. Cold-spot residues1,12–14 are interface residues occupied by suboptimal amino acids, such that mutating them increases the binding affinity of the protein to a specific partner. Selectivity-switch residues15,16 are interface residues in which a point mutation simultaneously decreases the affinity of the protein to one partner and increases its affinity to another. Finally, correlated-selectivity residues17,18 are interface residues that work together to increase the selectivity of the protein to one specific partner. Such residues are especially difficult to characterize with conventional methods because only a double mutation (one mutation in each residue) can change the affinity of the protein to a certain partner.

Methods for mapping protein selectivity landscapes typically include mutating candidate residues and testing the resulting changes in affinity4,19. Despite considerable advancements in recent years8,20–22, currently available methods still have several caveats that hinder our ability to develop, inter alia, selective inhibitors for clinically important proteins. For instance, alanine scanning and similar classical approaches23–29 can test only a subset of all possible mutants, are time-consuming and laborious, require protein purification and binding affinity measurements for each mutant, and, most importantly, focus chiefly on hot-spot residues. Modern approaches can overcome some of these caveats by employing protein library display and sorting technologies, which rapidly explore all possible (hot- and cold-spot) mutations and qualitatively map the contribution of each residue to the affinity of a protein toward its target30–37. However, as only several hundred clones of the sorted libraries are ultimately sequenced, these methods do not comprehensively characterize the entire library and, to date, they cannot identify correlated-selectivity residues. A more recent approach employed next-generation sequencing (NGS) to guide protein and synthetic small-molecule optimization4,7,38–42, effectively improving the binding affinity and selectivity, and generating binding epitopes de novo6,7,19,40,41,43,44. However, this and most other currently available approaches generate high-affinity (but not necessarily selective) binders, and, in the few studies designed to generate selective binders7,40,41,43,44, the methodology was limited to improving discrimination between only two target proteins that have different binding epitopes. Significantly, some broad-spectrum proteins may have many potential targets with binding epitopes that have high sequence homology and structural similarity.

In the current study, we present a novel, single-step approach for comprehensively mapping the binding selectivity landscape of proteins (including hot-spot, cold-spot, selectivity-switch, and correlated-selectivity residues) using a combination of experimental multi-target selective library screening and in silico NGS analysis. To test our approach in a real-life context, we chose to map the binding selectivity landscape of a broad-spectrum trypsin inhibitor, namely, the human amyloid protein precursor inhibitor (APPI; a member of the human Kunitz-domain family of serine protease inhibitors45), to each of four human serine proteases (kallikrein-6 (KLK6), mesotrypsin, anionic trypsin, and cationic trypsin), all of which share high sequence homology and structural similarity. Then, we used this landscape to improve both the selectivity and affinity of APPI variants to each protease, which we evaluated through inhibition studies using the purified proteins.

We recently used a yeast-surface display (YSD) platform, a powerful directed evolution protein engineering technology30,46–50, to generate APPI-3M51: a triple-mutant APPI (M17G/I18F/F34V) whose affinity to mesotrypsin, anionic trypsin, and cationic trypsin is comparable [Ki = 89.8 ± 0.23 pM, 1.47 ± 0.02 pM, and 4.96 ± 0.25 pM for mesotrypsin, anionic trypsin, and cationic trypsin, respectively51], whereas its affinity to KLK6 is lower by three orders of magnitude [Ki = 1.09 ± 0.12 nM51]. These features render APPI-3M an optimal model scaffold for engineering binding selectivity; its lack of selectivity toward mesotrypsin, anionic trypsin, and cationic trypsin is a good starting point for manipulating its relative selectivity, while its lower selectivity toward KLK6 makes it a good target for engineering selectivity switches.

Our study design is demonstrated in Supplementary Figure 1. We began by generating a yeast-displayed APPI-3M library including clones with single-residue random mutations in the binding interface (i.e., in the APPI binding loop) and clones with multiple-residue random mutations both within and beyond the binding interface (i.e., in the APPI scaffold and binding loop). Then, we divided the four proteases into combinatorial pairs (six combinations) and sorted the YSD APPI-3M library for variants with differential selectivity toward each protease in each pair. We then used NGS to sequence these fractions and analyzed them computationally. Consequently, the sorted APPI-3M mutant library fractions were rich in affinity- and selectivity-enhancing mutations; of these, we identified the most highly selective APPI mutations based on their ability to inhibit—as soluble proteins—each of the four proteases. To the best of our knowledge, this is the first report of a platform that can provide such a rich PPI binding selectivity landscape.

Results

Selecting APPI variants with improved selectivity

We began by generating a library of APPI-3M clones using both site-directed random mutagenesis of the APPI-3M binding loop (residues 11–18, except invariant Cys-14) and error-prone PCR amplification of the entire coding sequence. This design yielded a ‘naive’ APPI-3M library of 3.5 million variants, each with 0–2 amino acid mutations. Then, using our YSD system, we expressed each of these variants on the surface of yeast cells and used fluorescence-activated cell sorting (FACS) to quantify their binding to each of the four (soluble) serine proteases (mesotrypsin, KLK6, anionic trypsin, and cationic trypsin). We introduced the yeast-displayed naive library to pairs of serine proteases, each labeled with a different fluorescent dye (Alexa Fluor-650 or Alexa Fluor-488; i.e., a pairwise selective screen, Fig. 1a), at concentrations optimized for each pair to achieve an equivalent distribution of staining intensities (Fig. 1b). The library was sorted to isolate ∼1 million variants per sorted fraction (sorting gate), with increased selectivity toward each of the four serine protease targets versus its paired protease. Subsequent FACS analyses showed clear enrichment of the binding population for each individual protease (Supplementary Figure 2), confirming selectivity improvements of the sorted library fractions.

Fig. 1.


Yeast-surface display of APPI-3M. a Schematic drawing of the pairwise selective screen using the YSD system. A naive library of mutated APPI-3M variants was displayed on the yeast cell surface and presented to pairs of proteases. Each protease in the pair (denoted A or B) was labeled with a different fluorescent dye, Alexa Fluor-650 or Alexa Fluor-488 (represented by green and blue stars, respectively). b Pairwise selective screen. Flow cytometry sorting was used to screen the library to isolate APPI-3M variants with enhanced selectivity toward each of the four serine proteases (Meso: mesotrypsin; KLK6; Anionic: anionic trypsin; Cationic: cationic trypsin). In each sort, two variant populations were collected inside the black gates, yielding sorted library populations of protease-selective APPI-3M variants. Green and blue colors represent high and low cell densities, respectively

Mapping hot and cold spots and selectivity switches

To map the binding selectivity landscape of APPI-3M to each of the four serine proteases, we used Illumina MiSeq to perform high-throughput sequencing of APPI-3M gene fragments from the sorted and naive libraries. We then used these sequencing data to identify single and double amino acid substitutions in the APPI-3M sequence that had modulated its selectivity toward each of the four serine proteases. In each sorted fraction, the average number of read pairs per sequenced library was 1 million; of these, 95% of the sequenced read pairs passed quality filtering and integration and were thus translated to amino acid sequences and aligned to the sequence of APPI-3M. Because we were only interested in amino acid substitutions (and not in insertions or deletions), we analyzed only sequences of the same length as that of APPI-3M and set a threshold value of 100 reads for variants with a single-amino acid substitution and 10 reads for variants with a double-amino acid substitution. To relate the abundance of a variant to its target selectivity, we determined the enrichment ratio of each variant, which we defined as the frequency of a certain mutation in the sorted library fraction divided by the frequency of that mutation in the naive library. Thus, we assumed that mutations that increase the selectivity of each variant to its putative target (mesotrypsin, KLK6, anionic trypsin, or cationic trypsin) will be more abundant in the sorted library fraction than in the naive library (enrichment ratio >1), while mutations that decrease selectivity will be less abundant in the sorted library than in the naive library (enrichment ratio <1).
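The enrichment ratio defined above can be sketched in a few lines of Python. This is a minimal illustration and not the analysis pipeline used in the study: the function name, the variant labels, and the read counts are hypothetical. Applying the read threshold to both the sorted and the naive counts is also an assumption, because the text does not specify which count the threshold refers to.

```python
from collections import Counter


def enrichment_ratios(sorted_counts, naive_counts, min_reads=100):
    """Return {variant: enrichment ratio} for variants that pass the read threshold.

    Enrichment ratio = frequency in the sorted fraction / frequency in the naive library.
    """
    sorted_total = sum(sorted_counts.values())
    naive_total = sum(naive_counts.values())
    ratios = {}
    for variant, n_sorted in sorted_counts.items():
        n_naive = naive_counts.get(variant, 0)
        # Assumption: a variant must reach the threshold in both libraries.
        if n_sorted < min_reads or n_naive < min_reads:
            continue
        ratios[variant] = (n_sorted / sorted_total) / (n_naive / naive_total)
    return ratios


# Made-up read counts, for illustration only (not data from the study).
naive = Counter({"mutA": 200, "mutB": 400, "mutC": 50, "parent": 9350})
sorted_fraction = Counter({"mutA": 1000, "mutB": 100, "mutC": 300, "parent": 8600})

ratios = enrichment_ratios(sorted_fraction, naive)
for variant, ratio in sorted(ratios.items()):
    print(f"{variant}: enrichment ratio = {ratio:.2f}")
```

With `min_reads=10`, the same function would mirror the lower threshold described for double-amino acid substitutions.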

We first characterized the effect of single-amino acid substitutions on target selectivity. To this end, we created a heatmap for each sorted library fraction (Fig. 2 and Supplementary Figure 3), using the enrichment ratio as a measure of binding selectivity. Then, we used this map to identify (i) hot spots, defined as APPI-3M residues in which most mutations decreased the binding selectivity to one target protease versus another; (ii) cold spots, in which most mutations increased the binding selectivity; and (iii) selectivity switches, in which a single mutation decreased the selectivity to one target protease and increased the selectivity toward another.
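The three definitions above can be expressed as simple rules on the enrichment ratios. The sketch below is illustrative only: the function names are hypothetical, "most mutations" is interpreted as more than half of the observed substitutions (an assumed cutoff), and the per-position ratios are made up. The switch example uses the T11I enrichment ratios reported in Table 1.

```python
import math


def classify_position(enrichment_by_substitution, majority=0.5):
    """Label one position from the enrichment ratios of its substitutions.

    Assumption: 'most mutations' means more than `majority` of the
    substitutions observed at that position.
    """
    log2_values = [math.log2(r) for r in enrichment_by_substitution.values()]
    n = len(log2_values)
    frac_decreased = sum(v < 0 for v in log2_values) / n
    frac_increased = sum(v > 0 for v in log2_values) / n
    if frac_decreased > majority:
        return "hot spot"   # most substitutions reduce selectivity
    if frac_increased > majority:
        return "cold spot"  # most substitutions improve selectivity
    return "mixed"


def is_selectivity_switch(ratio_target_a, ratio_target_b):
    """True if one substitution is enriched for one target and depleted for the other."""
    return (ratio_target_a > 1 > ratio_target_b) or (ratio_target_b > 1 > ratio_target_a)


# Made-up ratios for two hypothetical positions.
print(classify_position({"A": 0.2, "D": 0.5, "K": 1.3}))  # mostly depleted
print(classify_position({"A": 3.0, "S": 2.1, "W": 0.7}))  # mostly enriched
# T11I enrichment ratios from Table 1 (mesotrypsin 5.91, anionic trypsin 0.01).
print(is_selectivity_switch(5.91, 0.01))
```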

Fig. 2.


Single mutation selectivity landscape of APPI-3M. The colors in the heatmaps indicate the enrichment ratio (defined as the frequency of a certain mutation in the sorted library divided by its frequency in the naive library) and represent the effect of a single-amino acid substitution on APPI-3M selectivity toward one serine protease (a: mesotrypsin, b: KLK6, c: cationic trypsin, and d: anionic trypsin) versus the other three. The different colors of the heatmaps correspond to the scale shown on the right of each panel and indicate log2 of the enrichment ratio [yellow and red: positive (increased selectivity); green: negative (decreased selectivity)]. The position of the substituted amino acid is shown on the X axis and the substituting amino acid is shown on the Y axis. Meso: mesotrypsin; Cationic: cationic trypsin; Anionic: anionic trypsin. See Supplementary Figure 3 for further details

Our analysis revealed that residue 15 in APPI-3M is a general hot spot for binding human serine proteases, as all mutations in this residue, except a substitution to Lys, decreased its binding affinity toward all four proteases (Fig. 2 and Supplementary Figure 3). The analysis also revealed two clear cold spots: most mutations in residue 13 increased binding selectivity toward mesotrypsin versus all other proteases (Fig. 2a and Supplementary Figure 3A, dashed line), while most mutations in residue 17 increased binding selectivity toward KLK6 versus all other proteases (Fig. 2b and Supplementary Figure 3B, dashed line). These two selectivity-switch residues (13 and 17) enable a selectivity shift from three proteases toward a single, different protease (either mesotrypsin or KLK6). In addition, most mutations in residue 17 increased the selectivity of APPI-3M toward anionic trypsin and cationic trypsin as compared with mesotrypsin (Fig. 2c, d and Supplementary Figure 3C and D, lower dashed line), while most mutations in residues 11 and 18 increased the selectivity toward anionic trypsin and cationic trypsin as compared with KLK6 (Fig. 2c, d and Supplementary Figure 3C and D, upper dashed line). For example, we found that residues 11 and 17 are selectivity switches for mesotrypsin and KLK6, respectively (Fig. 2, Supplementary Figure 3 and Table 1), as mutating the residue in position 11 (originally Thr) from His to Ile (Fig. 2a, d, white arrows) switched the selectivity from anionic trypsin to mesotrypsin by a factor of 69 × 10^3 and mutating the residue in position 17 (originally Gly) from Glu to Arg (Fig. 2b, c, black arrows) switched the selectivity from cationic trypsin to KLK6 by a factor of 7 × 10^3.

Table 1.

Selectivity of APPI-3M variants with mutations at selectivity-switch residues toward human serine proteases

| Mutation | Target A | Target B | Enrichment ratio, target A | Enrichment ratio, target B | Selectivity^a |
| --- | --- | --- | --- | --- | --- |
| T11H | Mesotrypsin | Anionic trypsin | 0.07 | 7.39 | 1 |
| T11I | Mesotrypsin | Anionic trypsin | 5.91 | 0.01 | 69.22 × 10^3 |
| G17E | KLK6 | Cationic trypsin | 0.12 | 4.88 | 1 |
| G17R | KLK6 | Cationic trypsin | 12.50 | 0.08 | 6.65 × 10^3 |

^a Selectivity is defined as the fold change in the enrichment ratio for target A, divided by the fold change in the enrichment ratio for target B
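As a worked illustration of the selectivity definition in the Table 1 footnote, the sketch below recomputes the two selectivity values from the tabulated enrichment ratios. It assumes that the fold change of each mutation is taken relative to the other listed mutation at the same position (the row with selectivity 1); the function name is hypothetical. Because the tabulated enrichment ratios are rounded to two decimals, the results only approximate the reported 69.22 × 10^3 and 6.65 × 10^3.

```python
def relative_selectivity(er_a, er_b, ref_er_a, ref_er_b):
    """Fold change in enrichment for target A divided by that for target B.

    Assumption: fold changes are taken relative to a reference mutation at the
    same position (the Table 1 row whose selectivity is set to 1).
    """
    fold_change_a = er_a / ref_er_a
    fold_change_b = er_b / ref_er_b
    return fold_change_a / fold_change_b


# T11I relative to T11H (target A: mesotrypsin, target B: anionic trypsin).
t11i = relative_selectivity(5.91, 0.01, ref_er_a=0.07, ref_er_b=7.39)
# G17R relative to G17E (target A: KLK6, target B: cationic trypsin).
g17r = relative_selectivity(12.50, 0.08, ref_er_a=0.12, ref_er_b=4.88)

print(f"T11I: {t11i:.3g}")
print(f"G17R: {g17r:.3g}")
```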

Mapping correlated-selectivity residues

Next, we turned to identifying the effects of double-amino acid substitutions in APPI-3M on the selectivity toward each of the four serine proteases. The first steps in this process (quality filtration and integration, translation, alignment, and enrichment ratio calculations) were similar to those described above for single-amino acid analyses. Most double-mutant APPI-3M variants that increased the selectivity toward one serine protease versus all others increased the selectivity toward KLK6 [note that the affinity of the parental APPI-3M to KLK6 was two orders of magnitude lower than to anionic and cationic trypsin and one order of magnitude lower than to mesotrypsin51], and these variants had mutations in residues 11 and 17 (Supplementary Table 1). To elucidate the effects of correlated residues, and of residues 11 and 17 in particular (Fig. 3c), on the selectivity toward KLK6, we predicted the total effect of each pair of mutated residues (i.e., the effect of all mutations in these two residues; see Methods) and illustrated the results as heatmaps (Fig. 3a and Supplementary Figure 4). Variants in which both residues 11 and 17 were mutated demonstrated an increased selectivity (enrichment ratio >1) toward KLK6 versus the three other proteases. Therefore, we generated additional heatmaps to estimate the effect of specific residue pairs (all pair combinations of residues 11 and 17, Fig. 3b). These heatmaps (Fig. 3b) revealed that many combinations of double-amino acid substitutions increased the selectivity toward KLK6 versus the three other proteases, including a combination of either Val, Ala, Pro, or Ser at residue 11 with either Ala, Arg, or Ser at residue 17 (enrichment ratio >1). For instance, the combination of Val at residue 11 and Arg at residue 17 increased the total selectivity of APPI-3M toward KLK6 by a factor of 4 × 10^9 (calculated as the product of the three relative selectivities: ~59 × 10^3-fold versus mesotrypsin, ~364-fold versus cationic trypsin, and ~170-fold versus anionic trypsin; Supplementary Table 2). Similarly, the combination of Ser at residue 11 and Arg at residue 17 increased the total selectivity by a factor of 7 × 10^7 (~37 × 10^3-fold versus mesotrypsin, ~24-fold versus cationic trypsin, and ~85-fold versus anionic trypsin).
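The total selectivity quoted above is the product of the three pairwise relative selectivities. The short sketch below reproduces that arithmetic from the approximate fold values given in the text; the variable names are illustrative.

```python
import math

# Approximate pairwise selectivities toward KLK6 quoted in the text.
t11v_g17r = {"mesotrypsin": 59e3, "cationic trypsin": 364, "anionic trypsin": 170}
t11s_g17r = {"mesotrypsin": 37e3, "cationic trypsin": 24, "anionic trypsin": 85}

# Total selectivity = product of the three pairwise values.
total_t11v_g17r = math.prod(t11v_g17r.values())
total_t11s_g17r = math.prod(t11s_g17r.values())

print(f"T11V/G17R total selectivity: {total_t11v_g17r:.1e}")
print(f"T11S/G17R total selectivity: {total_t11s_g17r:.1e}")
```

Both products agree, to one significant figure, with the factors of 4 × 10^9 and 7 × 10^7 stated above.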

Fig. 3.


Double mutation selectivity landscape of APPI-3M. Heatmaps demonstrating the effect of double-amino acid mutations in APPI-3M on the selectivity toward KLK6 versus the three other serine proteases. a The effect of different pairs of mutated residues on selectivity is illustrated by the colors of the heatmaps (red = increased selectivity, enrichment ratio >1; blue = decreased selectivity, enrichment ratio <1). The contribution of each double mutation to selectivity was summed and the maps demonstrate the overall effect. The X and Y axes indicate the position of the substituted amino acid residues. See Supplementary Figure 4 for further details. b The effect of different amino acid mutations at residues 11 and 17 of APPI-3M on its selectivity toward KLK6, illustrated by the colors in the heatmaps. The X axis indicates amino acids mutated at residue 17 and the Y axis indicates amino acids mutated at residue 11. c Crystal structure of APPI-3M (PDB ID: 5C67). Cartoon representation of APPI-3M illustrating the positions of correlated residues Thr-11 and Gly-17 (red) within the APPI binding loop (green, positions 11–18). The APPI scaffold is shown in gray

Validating the selectivity changes using soluble inhibitors

To validate the results of the NGS computational analysis, we generated and purified the soluble forms of APPI-3M variants in which the mutation was located at selectivity-switch residues on the APPI loop (Supplementary Figure 5). These variants included the mutations T11I (for which the NGS analysis predicted a selectivity switch from anionic trypsin to mesotrypsin), T11H (predicted switch from mesotrypsin to anionic trypsin), G17R (predicted switch from cationic trypsin to KLK6), and G17E (predicted switch from KLK6 to cationic trypsin) (Table 1). Then, we evaluated the affinity of these four purified APPI-3M variants to each serine protease by measuring the degree to which they inhibited the ability of each protease to hydrolyze its substrate [benzyloxycarbonyl–Gly–Pro–Arg–p–nitroanilide (Z-GPR-pNA) for mesotrypsin, anionic trypsin, and cationic trypsin, and BOC–FSR–MSC for KLK6]. We determined the inhibition constant (Ki) of each of these interactions by quantifying the slow, tight-binding behavior (Supplementary Figure 6). The experimental results indeed correlated well with those of the NGS analysis (Table 2).

Table 2.

Changes in the selectivity of APPI-3M variants with mutations at selectivity-switch residues toward human serine proteases

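| Mutation | Target A | Target B | Predicted switch^a | Ki [pM], target A^b | Ki [pM], target B^b | Switch ratio^c |
| --- | --- | --- | --- | --- | --- | --- |
| T11H | Mesotrypsin | Anionic trypsin | A → B | 302 ± 24 | 4.45 ± 0.23 | 1.6 |
| T11I | Mesotrypsin | Anionic trypsin | B → A | 61.2 ± 2.6^a | 1.44 ± 0.14 | |
| Ki fold change^d | — | — | — | 4.93 | 3.09 | — |
| G17E | KLK6 | Cationic trypsin | A → B | 464 ± 39 | 85.8 ± 5.9 | 5.7 |
| G17R | KLK6 | Cationic trypsin | B → A | 77.4 ± 2.6 | 8.12 ± 0.08 | |
| Ki fold change^e | — | — | — | 59.95 | 10.57 | — |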

^a As predicted by the NGS analysis

^b Results (means ± SD) were obtained from three independent experiments

^c Calculated as the Ki fold change for target A divided by the Ki fold change for target B:

$$\text{Switch ratio} = \frac{K_i\ \text{fold change for target A}}{K_i\ \text{fold change for target B}}$$

^d Ki (T11H)/Ki (T11I)

^e Ki (G17E)/Ki (G17R)
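As a worked example of the switch ratio defined in the Table 2 footnotes, the sketch below recomputes the values for the T11H/T11I pair from the mean Ki values in the table (standard deviations are ignored). The variable names are illustrative.

```python
# Mean Ki values (pM) from Table 2 for the T11H/T11I pair.
ki_t11h = {"mesotrypsin": 302.0, "anionic trypsin": 4.45}
ki_t11i = {"mesotrypsin": 61.2, "anionic trypsin": 1.44}

# Ki fold change = Ki(T11H) / Ki(T11I), computed for each target.
fold_target_a = ki_t11h["mesotrypsin"] / ki_t11i["mesotrypsin"]          # target A
fold_target_b = ki_t11h["anionic trypsin"] / ki_t11i["anionic trypsin"]  # target B

# Switch ratio = fold change for target A / fold change for target B.
switch_ratio = fold_target_a / fold_target_b

print(f"Ki fold change, target A: {fold_target_a:.2f}")
print(f"Ki fold change, target B: {fold_target_b:.2f}")
print(f"Switch ratio: {switch_ratio:.1f}")
```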

Positions 11 and 17 in the APPI-3M sequence are correlated

As residues in positions 11 and 17 of the APPI-3M sequence increased the selectivity of APPI-3M toward KLK6, we elucidated the interactions between different amino acids at these positions by generating and purifying representative single- and double-mutant APPI-3M variants. We chose the KLK6-selective T11V/G17R and T11S/G17R double-mutant variants (see Fig. 3), and their corresponding single-mutant selectivity-switch variants T11V, T11S, and G17R (see Tables 1 and 2). We tested the affinity of the soluble forms of these five variants to each of the four serine proteases in a competitive inhibition assay (Table 3) and, based on the extracted Ki values, we determined the selectivity of each variant toward KLK6 and compared it with the selectivity of the unmodified APPI-3M (Table 4).

Table 3.

Ki constants of human serine proteases inhibited by various APPI-3M variants

| Mutant | Ki [pM]^a, mesotrypsin | Ki [pM]^a, anionic trypsin | Ki [pM]^a, cationic trypsin | Ki [pM]^a, KLK6 |
| --- | --- | --- | --- | --- |
| Unmodified APPI-3M | 98.0 ± 1.0 | 2.26 ± 0.08 | 22.5 ± 0.6 | 362 ± 10 |
| T11V/G17R | 494 ± 28 | 0.92 ± 0.07 | 2.37 ± 0.17 | 16.4 ± 0.9 |
| T11S/G17R | 1060 ± 30 | 2.98 ± 0.19 | 7.25 ± 0.5 | 124 ± 13 |
| T11S | 581 ± 7 | 1.16 ± 0.09 | 7.63 ± 0.55 | 1000 ± 60 |
| T11V | 65.0 ± 1.0 | 3.72 ± 0.21 | 14.1 ± 0.5 | 378 ± 9 |
| G17R | 676 ± 8 | 3.58 ± 0.16 | 8.12 ± 0.08 | 77.4 ± 2.6 |

^a Results (means ± SD) were obtained from three independent experiments

Table 4.

The selectivity of APPI-3M variants (normalized to the unmodified APPI-3M) toward KLK6 versus the three other proteases

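| Mutant | vs. mesotrypsin | vs. anionic trypsin | vs. cationic trypsin | Calculated KLK6 total selectivity^a | Expected KLK6 total selectivity^b |
| --- | --- | --- | --- | --- | --- |
| Unmodified APPI-3M | 1 | 1 | 1 | 1 | |
| T11V/G17R | 111.10 | 8.94 | 2.32 | 2304.13 | 242.45 |
| T11S/G17R | 31.62 | 3.87 | 0.94 | 115.53 | 32.33 |
| T11S | 2.14 | 0.30 | 0.12 | 0.08 | |
| T11V | 0.64 | 1.58 | 0.60 | 0.60 | |
| G17R | 32.26 | 7.42 | 1.69 | 404.09 | |

^a Calculated selectivity:

$$\text{Calculated selectivity} = \frac{K_i^{\text{WT}}(\text{KLK6}) \,/\, K_i^{\text{mutant}}(\text{KLK6})}{K_i^{\text{WT}}(\text{other protease in the pair}) \,/\, K_i^{\text{mutant}}(\text{other protease in the pair})}$$

^b Expected selectivity of double-mutant AB = calculated selectivity of A × calculated selectivity of B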
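The Table 4 quantities can be illustrated from the mean Ki values in Table 3. The sketch below is not the authors' script: the function names are hypothetical and it assumes that "WT" in the footnote refers to the unmodified APPI-3M. It recomputes the pairwise and total KLK6 selectivities for T11V/G17R, together with the expected total selectivity from the single mutants T11V and G17R. Small deviations from the tabulated values reflect rounding of the published Ki values.

```python
import math

# Mean Ki values (pM) from Table 3.
KI_PM = {
    "APPI-3M":   {"mesotrypsin": 98.0,  "anionic trypsin": 2.26, "cationic trypsin": 22.5, "KLK6": 362.0},
    "T11V/G17R": {"mesotrypsin": 494.0, "anionic trypsin": 0.92, "cationic trypsin": 2.37, "KLK6": 16.4},
    "T11V":      {"mesotrypsin": 65.0,  "anionic trypsin": 3.72, "cationic trypsin": 14.1, "KLK6": 378.0},
    "G17R":      {"mesotrypsin": 676.0, "anionic trypsin": 3.58, "cationic trypsin": 8.12, "KLK6": 77.4},
}
OTHER_PROTEASES = ("mesotrypsin", "anionic trypsin", "cationic trypsin")


def klk6_selectivity(mutant, other, parent="APPI-3M"):
    """(Ki parent / Ki mutant for KLK6) divided by the same ratio for `other`."""
    gain_klk6 = KI_PM[parent]["KLK6"] / KI_PM[mutant]["KLK6"]
    gain_other = KI_PM[parent][other] / KI_PM[mutant][other]
    return gain_klk6 / gain_other


def total_klk6_selectivity(mutant):
    """Product of the three pairwise KLK6 selectivities."""
    return math.prod(klk6_selectivity(mutant, other) for other in OTHER_PROTEASES)


calculated = total_klk6_selectivity("T11V/G17R")
# Expected value if the two single mutations contributed independently.
expected = total_klk6_selectivity("T11V") * total_klk6_selectivity("G17R")

print(f"Calculated total selectivity, T11V/G17R: {calculated:.0f}")
print(f"Expected from T11V x G17R: {expected:.0f}")
```

A calculated value well above the expected one is what the text interprets as cooperative behavior of residues 11 and 17.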

The amino acid substitution that most increased the total selectivity of APPI-3M toward KLK6 was T11V/G17R, followed by G17R and, finally, T11S/G17R. The individual substitutions T11V and T11S did not improve the selectivity toward KLK6; rather, they somewhat decreased it (Table 4). These results suggest that residues 11 and 17 are correlated-selectivity residues, which act together to increase target selectivity. To further test this hypothesis, we conducted a double-mutant cycle analysis52, in which we used the selectivity values of KLK6 with the two double-mutant variants and their single variants (T11V/G17R, T11S/G17R, T11V, T11S, and G17R, Table 4) to calculate the selectivity strength between two mutated residues (i.e., the coupling energy, ΔΔGint; Supplementary Figure 7). Indeed, in both double mutations, the ΔΔGint values were non-zero, indicating that residues 11 and 17 interact with each other to cooperatively affect the selectivity toward KLK6.
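The exact expression used for ΔΔGint is given in Supplementary Figure 7 and is not reproduced here. As a rough illustration only, one common formulation of a double-mutant cycle compares the double-mutant value with the product of the single-mutant values on a free-energy scale. The sketch below applies that assumed formulation to the total selectivities in Table 4; the function name, the temperature of 298.15 K, and the sign convention are all assumptions.

```python
import math

R_KCAL_PER_MOL_K = 1.987e-3  # gas constant in kcal/(mol*K)


def coupling_energy(s_double, s_single_a, s_single_b, temperature_k=298.15):
    """Assumed form: ddG_int = -RT * ln(S_AB / (S_A * S_B)).

    With this sign convention, a negative value means the double mutant is
    more selective than expected from independent single mutations.
    """
    rt = R_KCAL_PER_MOL_K * temperature_k
    return -rt * math.log(s_double / (s_single_a * s_single_b))


# Total KLK6 selectivities from Table 4.
ddg_t11v_g17r = coupling_energy(2304.13, s_single_a=0.60, s_single_b=404.09)
ddg_t11s_g17r = coupling_energy(115.53, s_single_a=0.08, s_single_b=404.09)

print(f"T11V/G17R coupling energy: {ddg_t11v_g17r:.2f} kcal/mol")
print(f"T11S/G17R coupling energy: {ddg_t11s_g17r:.2f} kcal/mol")
```

Under these assumptions both values are non-zero, in line with the statement above, although the magnitudes need not match those in Supplementary Figure 7.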

To gain insight into the structural basis of the observed selectivity changes, we attempted to crystallize the APPI-3M-T11V/G17R variant in complex with the increased-selectivity target KLK6 and the reduced-selectivity target mesotrypsin. We were able to obtain a high-resolution crystal structure of the APPI-3M-T11V/G17R variant bound to mesotrypsin (PDB ID: 6GFI; Supplementary Table 3). A structural analysis of this complex revealed that a deleterious steric interaction between the APPI Arg-17 mutation and mesotrypsin Arg-193 pushes Arg-193 into a more buried conformation (Supplementary Figure 8), as previously found in the structures of mesotrypsin bound to wild-type APPI or BPTI Kunitz-type inhibitors53,54. The steric clash and the restriction of Arg-193 to a single buried conformation can explain the reduction in affinity toward mesotrypsin, which is consistent with our prior structure and mutagenesis studies51,55. The corresponding amino acid that occupies position 193 in KLK6 is Gly (PDB ID: 4D8N); therefore, the lack of a side chain in position 193 of KLK6 is probably more energetically favored (upon binding to APPI-3M-G17R) than that of mesotrypsin Arg-193 (due to the steric clash and the restriction of Arg-193). Efforts to crystallize the APPI-3M-T11V/G17R complex with KLK6 were unsuccessful, and thus the basis for selectivity improvements toward this alternative target, and for cooperativity between APPI residues 11 and 17, remains a subject for future investigations.

Selective screens are superior to affinity screens

A significant advantage of our pairwise selectivity screen approach over the traditional sequential affinity screen (a commonly used method, in which the library is sorted against each enzyme separately in a sequential manner40,43) is the ability of our approach to identify, in a single screening step (rather than two sequential affinity screen steps), the top ~5% of clones that are more selective toward one target versus another, even if the absolute affinity of these clones toward both targets is lower than that of the parent variant (in the current study, APPI-3M). To demonstrate that the traditional sequential affinity screens are unable to detect the clones obtained by our pairwise selectivity screens (namely, those with improved selectivity and low affinity), we performed two separate sequential affinity screens, one toward KLK6 and another toward cationic trypsin (Supplementary Figure 9A, B, D, E). As expected, both the sequential affinity and the pairwise selectivity screen approaches were able to identify the G17R mutation as a KLK6 selectivity-improving mutation (Supplementary Table 4), which is consistent with the 1.7-fold improvement in the selectivity toward KLK6 versus cationic trypsin, measured by the enzymatic assay (Supplementary Table 5). In contrast, we were unable to identify the selective G17E mutation by using the sequential affinity approach (Supplementary Table 4), although it was clearly identified using the pairwise selectivity screen between KLK6 and cationic trypsin (Supplementary Table 4), demonstrating a 3.4-fold improved selectivity toward cationic trypsin, as measured by the enzymatic assay (Supplementary Table 5). This discrepancy between the two approaches stems from the fact that the G17E mutation was not in the top ~5% binders in the cationic trypsin and KLK6 sorts due to its weakened affinity toward cationic trypsin and KLK6 relative to the parental molecule APPI-3M (by ~4-fold and ~10-fold, respectively, Supplementary Table 5).

Upscaling

Our selective pairwise screening approach can be easily scaled up for multiple target proteins per screen, such that a library can be screened against a target of interest (labeled with one type of fluorophore) versus a mixture of competitors (all labeled with the same fluorophore, which is different from the one used for the target of interest). Such an approach is especially useful in the case where there is a single primary target of interest, since it will be completed through only a single sort. To demonstrate the feasibility of such an approach, we performed a competitive sort, in which KLK6 was the primary target of interest (labeled with Alexa Fluor-650) and cationic trypsin, anionic trypsin, and mesotrypsin (each labeled with Alexa Fluor-488) were the competitors (Supplementary Figure 9C, F), and compared the enrichment values to those of our pairwise comparisons. The enrichment ratios of the competitive multi-target screen were highly correlated with those of the pairwise selective screen; in both setups, the top-rated selectivity-improving clones were similar (Supplementary Table 6), both for single mutations (e.g., G17R) and for double mutations (e.g., T11S/G17R and T11V/G17R). In addition, this analysis revealed a clear selectivity cold spot, in which most mutations in residue 17 increased the binding selectivity toward KLK6 versus all other proteases (Supplementary Figure 3E). This finding is consistent with those obtained using the pairwise screening approach (Supplementary Figure 3B).

Discussion

We describe a novel strategy for mapping the binding selectivity landscapes of proteins through a combination of experimental multi-target selective library screening and in silico next-generation sequencing analysis. Employing the APPI/serine protease system as a model PPI, we show that our strategy can be used to map, in a rapid, single-step, cost-effective process, several crucial aspects of the selectivity landscape, including hot-spot residues, selectivity switch residues, and correlated-selectivity residues. The latter are of special importance, as characterizing correlated-selectivity mutations and analyzing their effects (both individually and combined) on target affinity and selectivity is challenging with currently available approaches27,56.

Several previous studies have combined selective screening of a protein library and NGS analyses to map the binding landscape of various proteins, including influenza inhibitors (HB36.4, HB80.3)7, the human Yes Associated Protein 65 (hYAP65) WW domain19, and an anti-VEGF antibody40. However, these approaches employed either libraries of clones with only single mutations or library screens that were performed against only a single target. Therefore, in these previous studies, it was difficult to identify mutations that change target selectivity or that work in concert to affect target affinity and selectivity in a correlated manner. Thus, a major advantage of our approach is its ability to identify correlated-selectivity mutations. For example, we found that the mutations T11V and G17R, when combined, yield a highly potent and selective inhibitor for KLK6, while combining the mutations T11S and G17R yields only a moderately potent and partially selective inhibitor for KLK6. These findings may suggest that a small and hydrophobic amino acid (e.g., Val in position 11) exerts a stronger effect on selectivity toward KLK6 than a small and polar amino acid (Ser in position 11).

Another advantage of our approach lies in using a pairwise selectivity screen, rather than the sequential affinity screen that is commonly used in other approaches40,43, to increase selectivity. This advantage is especially noticeable for the identification of clones that are selective but have distinct affinities toward both targets that are lower than that of the parent variant (in the current study, APPI-3M), as demonstrated in Supplementary Table 4. In addition, the pairwise screening approach can be easily scaled up for multiple target proteins per screen, such that a library can be screened against a target of interest versus a mixture of competitors. Such an approach is especially useful where there is a single primary target of interest, since it will be completed with only one sort, as demonstrated in Supplementary Table 6.

We chose the serine protease family as an ideal group of targets to demonstrate our strategy mainly because inhibiting the human serine proteases is of clinical value: both KLK6 and mesotrypsin are involved in cancer progression57–59, while anionic and cationic trypsins are involved in the etiology of pancreatitis60,61. However, the development of inhibitors capable of discriminating among trypsin-like proteases has been challenging. We and others have previously used X-ray crystallography to explore the structures of these proteases, in some cases identifying the distinguishing features that suggest the potential for developing highly selective inhibitors54,62,63. For example, several adaptive mutations have been shown to shape the active site of mesotrypsin for distinct substrate and inhibitor-binding selectivity54,63–65. Nevertheless, the development of truly selective inhibitors has yet to be achieved, and we anticipate that our novel approach, which is capable of rapidly and efficiently screening large libraries to comprehensively map selectivity, will enable the development of selective probes and therapeutic agents.

APPI has attracted our interest as a scaffold for engineering selective serine protease inhibitors due to the marked sequence diversity among Kunitz family members, which possess canonical binding loops that are highly tolerant to substitution or incorporation of additional amino acids66,67. Because the sequence of the canonical binding loop and neighboring residues largely determines the affinity and selectivity of the inhibitor to its targets53,68, using APPI as a scaffold offers a unique opportunity to optimize target affinity and selectivity without compromising stability. In addition, the affinities of the complexes between APPI and mesotrypsin, anionic trypsin, and cationic trypsin are similar, which facilitated the identification of cold spots, whereas the affinity of the APPI/KLK6 complex is three orders of magnitude lower than that of the other complexes, thus allowing us to identify selectivity-switch residues.

As a validation of the utility of our platform, we show that the results obtained using NGS of the selected APPI clones typically correlate well with the binding selectivity of the purified protein variants in solution (as measured by competitive inhibition studies), but at different scales (Supplementary Table 7). For example, the selectivity values of 13 combinations of enzyme–inhibitor variants (out of a total of 15 possible combinations examined), calculated using NGS, are well correlated (whether the selectivity was improved or damaged) with those obtained in the enzymatic assay. Of note, in all 15 combinations, a clear correlation was found between the rankings of the selectivity values calculated by each method (ranking is according to the level of selectivity improvement within each method for each enzyme, with the greatest improvement ranked as one; see example in bold boxes in Supplementary Table 7). As shown in Table 1, the NGS analysis predicted a selectivity increase of ~7 × 10³-fold from cationic trypsin to KLK6 for G17R compared with G17E, and of ~70 × 10³-fold from anionic trypsin to mesotrypsin for T11I compared with T11H; both these findings are in qualitative agreement with the increase in selectivity determined from the Ki values of the soluble proteins, namely, an increase of ~5.7-fold and ~1.6-fold, respectively (Table 2). However, no correlation was found between the magnitudes of the improvements: the ~7 × 10³-fold improvement calculated by NGS corresponded to a ~5.7-fold improvement in the enzymatic assay, while the ~70 × 10³-fold improvement calculated by NGS corresponded to only a ~1.6-fold improvement in the enzymatic assay. Therefore, the selectivity increase values calculated by NGS cannot be directly compared with those of the competitive inhibition studies; values can be compared between experiments within each method, but not between the two methods. Nevertheless, the results shown in Tables 1 and 2 confirm that our approach can predict the positions that can change target selectivity, and that our approach is sufficiently sensitive to detect small affinity changes, whereas other currently available approaches can typically identify only greater changes in the interactions between proteins69.
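The distinction drawn above, agreement in direction and rank but not in magnitude, can be illustrated with a short sketch. This is not the authors' analysis: the function names are hypothetical, the only values taken from the text are the four fold changes quoted above, and the threshold of 1 for "improved" versus "damaged" selectivity is an assumption.

```python
# Illustrative sketch only; function names are hypothetical and not from the paper.

def same_direction(ngs_fold, assay_fold):
    """True if both methods agree on whether selectivity improved (fold > 1)
    or was damaged (fold <= 1). The cut-off of 1 is an assumption."""
    return (ngs_fold > 1) == (assay_fold > 1)


def rank_descending(fold_changes):
    """Rank fold changes within one method, greatest improvement ranked as 1."""
    order = sorted(range(len(fold_changes)),
                   key=lambda i: fold_changes[i], reverse=True)
    ranks = [0] * len(fold_changes)
    for rank, idx in enumerate(order, start=1):
        ranks[idx] = rank
    return ranks


# Fold changes quoted in the text: (NGS estimate, enzymatic assay estimate).
g17r_vs_g17e = (7e3, 5.7)    # cationic trypsin -> KLK6
t11i_vs_t11h = (70e3, 1.6)   # anionic trypsin -> mesotrypsin

# Both pairs agree in direction, but the NGS/assay scale factor differs widely,
# which is why magnitudes cannot be compared across the two methods.
scale_factors = [ngs / assay for ngs, assay in (g17r_vs_g17e, t11i_vs_t11h)]
```

Here `scale_factors` evaluates to roughly 1.2 × 10³ and 4.4 × 10⁴, so no single conversion factor maps one scale onto the other, whereas `same_direction` returns `True` for both pairs.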

In further validation of our strategy, we identified most previously described mutations that affect the binding affinity and selectivity of APPI to serine proteases, as well as some novel mutations. For example, we identified residue 15 as a hot spot for all four human serine proteases, as all amino acid mutations in this residue (except R15K) reduced the binding affinity of APPI-3M to each of the four proteases (Fig. 2 and Supplementary Figure 3). Indeed, residue 15 had previously been identified as a hot spot in Kunitz-domain inhibitors in studies with BPTI70,71. In addition, our data identified, for the first time, to the best of our knowledge, that residue 13 is a selectivity cold-spot for mesotrypsin, as most of the mutations in this residue improved selectivity toward mesotrypsin versus all other proteases. On the other hand, mutating the residue in position 11 switched the selectivity from anionic trypsin to mesotrypsin. Therefore, the difference between residues 13 and 11 is that the former facilitates a selectivity switch from three proteases to a specific protease, while the latter enables a selectivity switch from one protease to one other protease.

The use of NGS covered the entire library and provided a comprehensive map of the binding interface. However, generating the library by using a combination of site-specific saturation mutagenesis on the APPI loop, and random mutations also on other parts of the gene, limited our ability to analyze residues that are distant from the interaction site. We attribute this limitation to technical aspects of our library design, as the random mutations generated by error-prone PCR were represented to a lower extent than mutations generated by site-saturation libraries. Nevertheless, the residues that we found to improve the selectivity of APPI toward the four serine proteases can help explain the basis for target selectivity of inhibitors toward serine proteases. These selectivity-improving mutations can also be beneficial for designing targeted therapeutics for cancer and other diseases, as they can potentially inhibit the desired serine protease in a selective manner, so as to minimize toxic effects. This study also serves as an example of the general utility of our new platform, as many PPI mediators and disease targets belong to large families of related proteins, making target selectivity a highly desirable but challenging goal in drug development. Thus, our approach for simply and efficiently mapping PPI selectivity landscapes offers great promise for designing novel target-selective therapeutics.

Methods

YSD and flow cytometry cell sorting

The yeast-displayed APPI-3M library was constructed as described in Supplementary Methods. To display the APPI-3M library on the surface of the yeast, the library was grown in an SDCAA selective medium (2% dextrose, 0.67% Difco yeast nitrogen base, 0.5% Bacto casamino acids, 0.52% Na2HPO4, and 0.856% NaH2PO4∙H2O) and induced for expression with a galactose medium (as for SDCAA, but with 2% galactose instead of dextrose) according to an established protocol72. Inactive forms of mesotrypsin, anionic trypsin, and cationic trypsin containing the mutation S195A were used as a precaution against enzymatic cleavage during the experiments51. The four serine proteases were labeled with Alexa Fluor dyes (Invitrogen, Carlsbad, CA) and used to detect binding. For the pairwise selectivity screen, ~1 × 10⁸ yeast cells were incubated with different Alexa Fluor-labeled serine proteases in a binding buffer (100 mM Tris, pH 8.0, 1 mM CaCl2, 1% BSA) for 1.5 h at room temperature. Then, the cells were washed with the binding buffer and sorted for highly selective variants by conducting several independent sorts, using a FACSAria [the Ilse Katz Institute for Nanoscale Science and Technology, Ben-Gurion University of the Negev (BGU), Israel]. The complexes included the following pairs and concentrations: mesotrypsin/KLK6 [25 nM/7 nM], mesotrypsin/cationic trypsin [25 nM/8 nM], mesotrypsin/anionic trypsin [25 nM/12.5 nM], anionic trypsin/KLK6 [250 nM/7 nM], cationic trypsin/KLK6 [100 nM/7 nM], and anionic trypsin/cationic trypsin [12.5 nM/100 nM]. APPI-3M variants that showed a high binding affinity (top 5% of the entire population) toward one serine protease in the pair and a low binding affinity toward the other were selected. Dual-color flow cytometry (BD Accuri C6, Piscataway, NJ) was used to test the selective binding of each sorted library to one serine protease in the pair in the presence of the other.

Quality filtration and integration of sequences

Sequencing data from each library were treated identically and evaluated in triplicate, and Spearman’s rank correlation coefficient73 was calculated to be above 95%. An average Illumina quality score was calculated for each read in a given set of paired-end reads, and read pairs in which either read had an average quality score lower than 20 (i.e., less than 99% accuracy) were discarded. The remaining read pairs were merged into a single sequence using the fast length adjustment of short reads (FLASH) software74. DNA sequences and their amino acid translations were aligned to the sequence of APPI-3M; sequences of different lengths and sequences containing stop codons were discarded.
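The filtering rules above can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' pipeline: the function names are hypothetical, quality strings are assumed to use the Phred+33 ASCII encoding, and a stop codon is assumed to be written as `*` in the translated sequence. Read merging with FLASH is not reproduced here.

```python
# Minimal sketch of the quality filter; names and encodings are assumptions.

def mean_phred(quality_string, offset=33):
    """Average Phred score of one read (Phred+33 encoding assumed)."""
    if not quality_string:
        return 0.0
    scores = [ord(ch) - offset for ch in quality_string]
    return sum(scores) / len(scores)


def phred_to_accuracy(q):
    """Base-call accuracy implied by a Phred score: Q20 corresponds to 99%."""
    return 1.0 - 10 ** (-q / 10.0)


def keep_read_pair(qual_fwd, qual_rev, min_mean_q=20):
    """Discard the pair if either read has a mean quality below the cut-off."""
    return (mean_phred(qual_fwd) >= min_mean_q
            and mean_phred(qual_rev) >= min_mean_q)


def keep_translation(protein, reference_length):
    """Keep only translations of the reference length without stop codons."""
    return len(protein) == reference_length and "*" not in protein
```

For example, `keep_read_pair("IIII", "4444")` returns `False`, because `4` encodes a Phred score of 19 under the assumed Phred+33 scheme.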

Computational analysis of high-throughput sequencing results

The analysis was performed in MATLAB, version R2016a. Variants with one amino acid mutation and variants with multiple amino acid mutations were analyzed separately. First, the number of reads of each variant from each library was counted. Then, to avoid variants with a low number of reads (which can yield noisy frequencies and enrichment ratios), we set a threshold value of 100 reads for variants with a single amino acid mutation and 10 reads for variants with double amino acid mutations. Variants with read numbers below the threshold in both the naive and sorted library fractions were discarded; a variant with a read number below the threshold in only one of the two libraries was assigned the threshold value in that library, provided that its read number in the other library was above the threshold.
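The read-count threshold rule can be written compactly. The sketch below is in Python rather than the MATLAB used by the authors, the function and constant names are hypothetical, and it assumes that a count exactly equal to the threshold counts as passing it (the text does not say how ties are handled).

```python
# Sketch of the read-count threshold rule; names and tie handling are assumptions.

SINGLE_MUTANT_THRESHOLD = 100   # reads, variants with one amino acid mutation
DOUBLE_MUTANT_THRESHOLD = 10    # reads, variants with two amino acid mutations


def apply_read_threshold(naive_reads, sorted_reads, threshold):
    """Return adjusted (naive, sorted) read counts, or None to discard.

    - below the threshold in both libraries: discard the variant;
    - below the threshold in one library only: raise that count to the threshold;
    - otherwise: keep both counts unchanged.
    """
    if naive_reads < threshold and sorted_reads < threshold:
        return None
    return max(naive_reads, threshold), max(sorted_reads, threshold)
```

For instance, a single mutant with 5 naive reads and 500 sorted reads is kept as `(100, 500)`, which caps its apparent enrichment instead of letting a handful of naive reads inflate it.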

Next, the frequency of each remaining variant, $v$, from each library was computed as $F_v = \frac{\mathrm{Reads}_v}{\sum_{u} \mathrm{Reads}_u}$, where $\mathrm{Reads}_v$ is the number of times that this variant appeared in the library and the sum in the denominator runs over all variants in that library. Based on its frequency, the enrichment ratio of each variant from each sorted library was calculated. The enrichment ratio for a given variant, $v$, was calculated as $ER_v = \frac{F_{v,\mathrm{sorted}}}{F_{v,\mathrm{naive}}}$, where $F_{v,\mathrm{sorted}}$ is the frequency of the variant in the sorted library and $F_{v,\mathrm{naive}}$ is the frequency of the same variant in the naive (pre-sorted) library. Finally, for single amino acid substitutions, heat maps were created based on the enrichment ratio5; for double amino acid substitutions, we summed the enrichment ratios of similar double-mutation variants that have mutations in the same residues ($ER_{x,y} = ER_1 + ER_2 + \cdots + ER_N$, where $x$ and $y$ are the mutated residues and $N$ is the number of substitutions at the $x$ and $y$ residues). We illustrated these results as heat maps.
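The frequency, enrichment-ratio, and pairwise-sum calculations can be sketched as follows. This is an illustration in Python, not the authors' MATLAB code: the function names and the data layout (a double mutant keyed by two `(position, amino_acid)` tuples) are assumptions made for the example.

```python
# Sketch of frequency, enrichment ratio (ER) and summed pairwise ER.
# Function names and data layout are hypothetical, not taken from the paper.
from collections import defaultdict


def frequencies(read_counts):
    """F_v = Reads_v / (sum of reads over all variants in the library)."""
    total = sum(read_counts.values())
    return {variant: n / total for variant, n in read_counts.items()}


def enrichment_ratios(naive_counts, sorted_counts):
    """ER_v = F_v,sorted / F_v,naive for variants present in both libraries."""
    f_naive = frequencies(naive_counts)
    f_sorted = frequencies(sorted_counts)
    return {v: f_sorted[v] / f_naive[v] for v in f_sorted if v in f_naive}


def summed_pair_enrichment(double_mutant_er):
    """ER_x,y = sum of ERs of all double mutants mutated at residues x and y.

    Keys of `double_mutant_er` are pairs of (position, amino_acid) tuples,
    e.g. ((11, "V"), (17, "R")).
    """
    totals = defaultdict(float)
    for (mut_a, mut_b), er in double_mutant_er.items():
        pair = tuple(sorted((mut_a[0], mut_b[0])))
        totals[pair] += er
    return dict(totals)
```

With made-up counts `{"A": 100, "B": 300}` in the naive library and `{"A": 300, "B": 100}` in the sorted library, variant `A` rises from a frequency of 0.25 to 0.75, giving an enrichment ratio of 3. The counts passed in are assumed to have already gone through the read-count threshold step described above.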

Electronic supplementary material

Peer Review File (369.8KB, pdf)

Acknowledgements

The authors thank Vered Caspi (BGU), Matan Shemer (BGU), and Jonathan Barlev (Weizmann Institute of Science, Israel) for their helpful discussions. We thank Dr. Uzi Hadad for his technical assistance. FACS experiments were performed at the Ilse Katz Institute for Nanoscale Science & Technology, BGU. N.P. acknowledges support from the European Research Council “Ideas program” ERC-2013-StG (contract grant number: 336041). N.P. and E.S.R. acknowledge support from the US-Israel Binational Science Foundation (BSF). E.S.R. acknowledges support from the United States National Institutes of Health grant number R01CA154387. The structural studies were performed on beamline ID30-B at the European Synchrotron Radiation Facility (ESRF), Grenoble, France. We are grateful to Christoph Mueller-Dieckmann for providing assistance in using this beamline. We would like to thank Prof. Kay Diederichs and Dr. Ronan Keegan for their help and contribution in the structure determination during the 1st CCP4/BGU Structure Solution Workshop, which took place at Ben-Gurion University of the Negev during February 2018.

Author contributions

S.N. and I.C. made an equal contribution as first authors; S.N., I.C., and N.P. designed the research; S.N., I.C., A.H., and E.S.R. generated the proteins; S.N. and I.C. performed the research; S.N., I.C., A.S., E.S.R., and N.P. analyzed the data; S.N., I.C., and N.P. wrote the paper. All authors edited the manuscript and approved the final version.

Data availability

All relevant data are available from the authors. The coordinates and structure factors for the complex of APPI-3M-T11V/G17R variant bound to mesotrypsin have been submitted to the Protein Data Bank (PDB) under the accession code 6GFI. The crystal structure of APPI-3M is available in the PDB under the accession code 5C67. The crystal structure of KLK6 is available in the PDB under the accession code 4D8N. The crystal structure of the mesotrypsin/BPTI complex is available in the PDB under the accession code 2R9P.

Competing interests

The authors declare no competing interests.

Footnotes

Publisher's note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

These authors contributed equally: Si Naftaly, Itay Cohen.

Electronic supplementary material

Supplementary Information accompanies this paper at 10.1038/s41467-018-06403-x.

References

  • 1. Aizner Y, et al. Mapping of the binding landscape for a picomolar protein-protein complex through computation and experiment. Structure. 2014;22:636–645. doi: 10.1016/j.str.2014.01.012.
  • 2. Gfeller D, et al. The multiple-specificity landscape of modular peptide recognition domains. Mol. Syst. Biol. 2011;7:484. doi: 10.1038/msb.2011.18.
  • 3. Sharabi O, et al. Affinity- and specificity-enhancing mutations are frequent in multispecific interactions between TIMP2 and MMPs. PLoS ONE. 2014;9:e93712. doi: 10.1371/journal.pone.0093712.
  • 4. Fowler DM, Fields S. Deep mutational scanning: a new style of protein science. Nat. Methods. 2014;11:801–807. doi: 10.1038/nmeth.3027.
  • 5. Fowler DM, Stephany JJ, Fields S. Measuring the activity of protein variants on a large scale using deep mutational scanning. Nat. Protoc. 2014;9:2267–2284. doi: 10.1038/nprot.2014.153.
  • 6. Kowalsky CA, et al. High-resolution sequence-function mapping of full-length proteins. PLoS ONE. 2015;10:e0118193. doi: 10.1371/journal.pone.0118193.
  • 7. Whitehead TA, et al. Optimization of affinity, specificity and function of designed influenza inhibitors using deep sequencing. Nat. Biotechnol. 2012;30:543–548. doi: 10.1038/nbt.2214.
  • 8. Moreira IS, Fernandes PA, Ramos MJ. Hot spots—a review of the protein-protein interface determinant amino-acid residues. Proteins. 2007;68:803–812. doi: 10.1002/prot.21396.
  • 9. Kortemme T, Baker D. A simple physical model for binding energy hot spots in protein-protein complexes. Proc. Natl Acad. Sci. USA. 2002;99:14116–14121. doi: 10.1073/pnas.202485799.
  • 10. Chen J, Sawyer N, Regan L. Protein-protein interactions: general trends in the relationship between binding affinity and interfacial buried surface area. Protein Sci. 2013;22:510–515. doi: 10.1002/pro.2230.
  • 11. Lin J, et al. Factors that affect the computational prediction of hot spots in protein-protein complexes. Comput. Mol. Biosci. 2012;2:23. doi: 10.4236/cmb.2012.21003.
  • 12. Han J, et al. Structure-based rational design of a Toll-like receptor 4 (TLR4) decoy receptor with high binding affinity for a target protein. PLoS ONE. 2012;7:e30929. doi: 10.1371/journal.pone.0030929.
  • 13. Meenan NA, et al. The structural and energetic basis for high selectivity in a high-affinity protein-protein interaction. Proc. Natl Acad. Sci. USA. 2010;107:10080–10085. doi: 10.1073/pnas.0910756107.
  • 14. Karanicolas J, et al. A de novo protein binding pair by computational design and directed evolution. Mol. Cell. 2011;42:250–260. doi: 10.1016/j.molcel.2011.03.010.
  • 15. Shirian J, Sharabi O, Shifman JM. Cold spots in protein binding. Trends Biochem. Sci. 2016;41:739–745. doi: 10.1016/j.tibs.2016.07.002.
  • 16. Sio CF, Otten LG, Cool RH, Quax WJ. Analysis of a substrate specificity switch residue of cephalosporin acylase. Biochem. Biophys. Res. Commun. 2003;312:755–760. doi: 10.1016/j.bbrc.2003.10.180.
  • 17. Gobel U, Sander C, Schneider R, Valencia A. Correlated mutations and residue contacts in proteins. Proteins. 1994;18:309–317. doi: 10.1002/prot.340180402.
  • 18. Pazos F, Helmer-Citterich M, Ausiello G, Valencia A. Correlated mutations contain information about protein-protein interaction. J. Mol. Biol. 1997;271:511–523. doi: 10.1006/jmbi.1997.1198.
  • 19. Fowler DM, et al. High-resolution mapping of protein sequence-function relationships. Nat. Methods. 2010;7:741–746. doi: 10.1038/nmeth.1492.
  • 20. Araya CL, Fowler DM. Deep mutational scanning: assessing protein function on a massive scale. Trends Biotechnol. 2011;29:435–442. doi: 10.1016/j.tibtech.2011.04.003.
  • 21. Hietpas RT, Jensen JD, Bolon DN. Experimental illumination of a fitness landscape. Proc. Natl Acad. Sci. USA. 2011;108:7896–7901. doi: 10.1073/pnas.1016024108.
  • 22. Siloto RM, Weselake RJ. Site saturation mutagenesis: methods and applications in protein engineering. Biocatal. Agric. Biotechnol. 2012;1:181–189.
  • 23. Ashkenazi A, et al. Mapping the CD4 binding site for human immunodeficiency virus by alanine-scanning mutagenesis. Proc. Natl Acad. Sci. USA. 1990;87:7150–7154. doi: 10.1073/pnas.87.18.7150.
  • 24. Cunningham BC, Wells JA. High-resolution epitope mapping of hGH-receptor interactions by alanine-scanning mutagenesis. Science. 1989;244:1081–1085. doi: 10.1126/science.2471267.
  • 25. Hietpas RT, Bank C, Jensen JD, Bolon DNA. Shifting fitness landscapes in response to altered environments. Evolution. 2013;67:3512–3522. doi: 10.1111/evo.12207.
  • 26. Kortemme T, Kim DE, Baker D. Computational alanine scanning of protein-protein interfaces. Sci. STKE. 2004;2004:pl2. doi: 10.1126/stke.2192004pl2.
  • 27. Kristensen C, et al. Alanine scanning mutagenesis of insulin. J. Biol. Chem. 1997;272:12978–12983. doi: 10.1074/jbc.272.20.12978.
  • 28. Weiss GA, Watanabe CK, Zhong A, Goddard A, Sidhu SS. Rapid mapping of protein functional epitopes by combinatorial alanine scanning. Proc. Natl Acad. Sci. USA. 2000;97:8950–8954. doi: 10.1073/pnas.160252097.
  • 29. Xu P, et al. Design of specific serine protease inhibitors based on a versatile peptide scaffold: conversion of a urokinase inhibitor to a plasma kallikrein inhibitor. J. Med. Chem. 2015;58:8868–8876. doi: 10.1021/acs.jmedchem.5b01128.
  • 30. Boder ET, Wittrup KD. Yeast surface display for screening combinatorial polypeptide libraries. Nat. Biotechnol. 1997;15:553–557. doi: 10.1038/nbt0697-553.
  • 31. Cortese R, et al. Epitope discovery using peptide libraries displayed on phage. Trends Biotechnol. 1994;12:262–267. doi: 10.1016/0167-7799(94)90137-6.
  • 32. Fack F, et al. Epitope mapping by phage display: random versus gene-fragment libraries. J. Immunol. Methods. 1997;206:43–52. doi: 10.1016/S0022-1759(97)00083-5.
  • 33. Gai SA, Wittrup KD. Yeast surface display for protein engineering and characterization. Curr. Opin. Struct. Biol. 2007;17:467–473. doi: 10.1016/j.sbi.2007.08.012.
  • 34. Pal G, Kouadio JL, Artis DR, Kossiakoff AA, Sidhu SS. Comprehensive and quantitative mapping of energy landscapes for protein-protein interactions by rapid combinatorial scanning. J. Biol. Chem. 2006;281:22378–22385. doi: 10.1074/jbc.M603826200.
  • 35. Rabinovich E, et al. Identifying residues that determine SCF molecular-level interactions through a combination of experimental and in silico analyses. J. Mol. Biol. 2017;429:97–114. doi: 10.1016/j.jmb.2016.11.018.
  • 36. Rosenfeld L, Heyne M, Shifman JM, Papo N. Protein engineering by combined computational and in vitro evolution approaches. Trends Biochem. Sci. 2016;41:421–433. doi: 10.1016/j.tibs.2016.03.002.
  • 37. Rosenfeld L, et al. Combinatorial and computational approaches to identify interactions of macrophage colony-stimulating factor (M-CSF) and its receptor c-FMS. J. Biol. Chem. 2015;290:26180–26193. doi: 10.1074/jbc.M115.671271.
  • 38. Cakar ZP, Turanli-Yildiz B, Alkim C, Yilmaz U. Evolutionary engineering of Saccharomyces cerevisiae for improved industrially important properties. FEMS Yeast Res. 2012;12:171–182. doi: 10.1111/j.1567-1364.2011.00775.x.
  • 39. Fowler DM, Araya CL, Gerard W, Fields S. Enrich: software for analysis of protein function by enrichment and depletion of variants. Bioinformatics. 2011;27:3430–3431. doi: 10.1093/bioinformatics/btr577.
  • 40. Koenig P, et al. Deep sequencing-guided design of a high affinity dual specificity antibody to target two angiogenic factors in neovascular age-related macular degeneration. J. Biol. Chem. 2015;290:21773–21786. doi: 10.1074/jbc.M115.662783.
  • 41. Cohen-Khait R, Schreiber G. Low-stringency selection of TEM1 for BLIP shows interface plasticity and selection for faster binders. Proc. Natl Acad. Sci. USA. 2016;113:14982–14987. doi: 10.1073/pnas.1613122113.
  • 42. Mendes KR, et al. High-throughput identification of DNA-encoded IgG ligands that distinguish active and latent Mycobacterium tuberculosis infections. ACS Chem. Biol. 2017;12:234–243. doi: 10.1021/acschembio.6b00855.
  • 43. Jardine JG, et al. HIV-1 broadly neutralizing antibody precursor B cells revealed by germline-targeting immunogen. Science. 2016;351:1458–1463. doi: 10.1126/science.aad9195.
  • 44. Wang X, et al. Fine epitope mapping of two antibodies neutralizing the Bordetella adenylate cyclase toxin. Biochemistry. 2017;56:1324–1336. doi: 10.1021/acs.biochem.6b01163.
  • 45. Salameh MA, et al. The amyloid precursor protein/protease nexin 2 Kunitz inhibitor domain is a highly specific substrate of mesotrypsin. J. Biol. Chem. 2010;285:1939–1949. doi: 10.1074/jbc.M109.057216.
  • 46. Boder ET, Midelfort KS, Wittrup KD. Directed evolution of antibody fragments with monovalent femtomolar antigen-binding affinity. Proc. Natl Acad. Sci. USA. 2000;97:10701–10705. doi: 10.1073/pnas.170297297.
  • 47. Graff CP, Chester K, Begent R, Wittrup KD. Directed evolution of an anti-carcinoembryonic antigen scFv with a 4-day monovalent dissociation half-time at 37 degrees C. Protein Eng. Des. Sel. 2004;17:293–304. doi: 10.1093/protein/gzh038.
  • 48. Kieke MC, et al. Selection of functional T cell receptor mutants from a yeast surface-display library. Proc. Natl Acad. Sci. USA. 1999;96:5651–5656. doi: 10.1073/pnas.96.10.5651.
  • 49. Kim YS, Bhandari R, Cochran JR, Kuriyan J, Wittrup KD. Directed evolution of the epidermal growth factor receptor extracellular domain for expression in yeast. Proteins. 2006;62:1026–1035. doi: 10.1002/prot.20618.
  • 50. Shusta EV, Holler PD, Kieke MC, Kranz DM, Wittrup KD. Directed evolution of a stable scaffold for T-cell receptor engineering. Nat. Biotechnol. 2000;18:754–759. doi: 10.1038/77325.
  • 51. Cohen I, et al. Combinatorial protein engineering of proteolytically resistant mesotrypsin inhibitors as candidates for cancer therapy. Biochem. J. 2016;473:1329–1341. doi: 10.1042/BJ20151410.
  • 52. Horovitz A. Double-mutant cycles: a powerful tool for analyzing protein structure and function. Fold. Des. 1996;1:R121–R126. doi: 10.1016/S1359-0278(96)00056-9.
  • 53. Salameh MA, et al. Determinants of affinity and proteolytic stability in interactions of Kunitz family protease inhibitors with mesotrypsin. J. Biol. Chem. 2010;285:36884–36896. doi: 10.1074/jbc.M110.171348.
  • 54. Salameh MA, Soares AS, Hockla A, Radisky ES. Structural basis for accelerated cleavage of bovine pancreatic trypsin inhibitor (BPTI) by human mesotrypsin. J. Biol. Chem. 2008;283:4115–4123. doi: 10.1074/jbc.M708268200.
  • 55. Salameh MA, Soares AS, Hockla A, Radisky DC, Radisky ES. The P(2)’ residue is a key determinant of mesotrypsin specificity: engineering a high-affinity inhibitor with anticancer activity. Biochem. J. 2011;440:95–105. doi: 10.1042/BJ20110788.
  • 56. Chao G, Cochran JR, Wittrup KD. Fine epitope mapping of anti-epidermal growth factor receptor antibodies through random mutagenesis and yeast surface display. J. Mol. Biol. 2004;342:539–550. doi: 10.1016/j.jmb.2004.07.053.
  • 57. Hockla A, et al. PRSS3/mesotrypsin is a therapeutic target for metastatic prostate cancer. Mol. Cancer Res. 2012;10:1555–1566. doi: 10.1158/1541-7786.MCR-12-0314.
  • 58. Hockla A, Radisky DC, Radisky ES. Mesotrypsin promotes malignant growth of breast cancer cells through shedding of CD109. Breast Cancer Res. Treat. 2010;124:27–38. doi: 10.1007/s10549-009-0699-0.
  • 59. Jiang G, et al. PRSS3 promotes tumour growth and metastasis of human pancreatic cancer. Gut. 2010;59:1535–1544. doi: 10.1136/gut.2009.200105.
  • 60. Lopez-Otin C, Matrisian LM. Emerging roles of proteases in tumour suppression. Nat. Rev. Cancer. 2007;7:800–808. doi: 10.1038/nrc2228.
  • 61. Kukor Z, Toth M, Sahin-Toth M. Human anionic trypsinogen: properties of autocatalytic activation and degradation and implications in pancreatic diseases. Eur. J. Biochem. 2003;270:2047–2058. doi: 10.1046/j.1432-1033.2003.03581.x.
  • 62. Bernett MJ, et al. Crystal structure and biochemical characterization of human kallikrein 6 reveals that a trypsin-like kallikrein is expressed in the central nervous system. J. Biol. Chem. 2002;277:24562–24570. doi: 10.1074/jbc.M202392200.
  • 63. Katona G, Berglund GI, Hajdu J, Graf L, Szilagyi L. Crystal structure reveals basis for the inhibitor resistance of human brain trypsin. J. Mol. Biol. 2002;315:1209–1218. doi: 10.1006/jmbi.2001.5305.
  • 64. Alloy AP, et al. Mesotrypsin has evolved four unique residues to cleave trypsin inhibitors as substrates. J. Biol. Chem. 2015;290:21523–21535. doi: 10.1074/jbc.M115.662429.
  • 65. Salameh MA, Soares AS, Alloy A, Radisky ES. Presence versus absence of hydrogen bond donor Tyr-39 influences interactions of cationic trypsin and mesotrypsin with protein protease inhibitors. Protein Sci. 2012;21:1103–1112. doi: 10.1002/pro.2097.
  • 66. Dennis MS, Herzka A, Lazarus RA. Potent and selective Kunitz domain inhibitors of plasma kallikrein designed by phage display. J. Biol. Chem. 1995;270:25411–25417. doi: 10.1074/jbc.270.43.25411.
  • 67. Dennis MS, Lazarus RA. Kunitz domain inhibitors of tissue factor-factor VIIa. I. Potent inhibitors selected from libraries by phage display. J. Biol. Chem. 1994;269:22129–22136.
  • 68. Krowarsch D, Cierpicki T, Jelen F, Otlewski J. Canonical protein inhibitors of serine proteases. Cell. Mol. Life Sci. 2003;60:2427–2444. doi: 10.1007/s00018-003-3120-x.
  • 69. Iffland A, Gendreizig S, Tafelmeyer P, Johnsson K. Changing the substrate specificity of cytochrome c peroxidase using directed evolution. Biochem. Biophys. Res. Commun. 2001;286:126–132. doi: 10.1006/bbrc.2001.5366.
  • 70. Buczek O, Koscielska-Kasprzak K, Krowarsch D, Dadlez M, Otlewski J. Analysis of serine proteinase-inhibitor interaction by alanine shaving. Protein Sci. 2002;11:806–819. doi: 10.1110/ps.3510102.
  • 71. Castro MJ, Anderson S. Alanine point-mutations in the reactive region of bovine pancreatic trypsin inhibitor: effects on the kinetics and thermodynamics of binding to beta-trypsin and alpha-chymotrypsin. Biochemistry. 1996;35:11435–11446. doi: 10.1021/bi960515w.
  • 72. Chao G, et al. Isolating and engineering human antibodies using yeast surface display. Nat. Protoc. 2006;1:755–768. doi: 10.1038/nprot.2006.94.
  • 73. Sedgwick P. Spearman’s rank correlation coefficient. BMJ. 2014;349:g7327. doi: 10.1136/bmj.g7327.
  • 74. Magoc T, Salzberg SL. FLASH: fast length adjustment of short reads to improve genome assemblies. Bioinformatics. 2011;27:2957–2963. doi: 10.1093/bioinformatics/btr507.



Articles from Nature Communications are provided here courtesy of Nature Publishing Group
