Author manuscript; available in PMC: 2021 May 15.
Published in final edited form as: Biochem J. 2020 May 15;477(9):1701–1719. doi: 10.1042/BCJ20200188

Quantitative Mapping of Binding Specificity Landscapes for Homologous Targets by using a High-Throughput Method

Lidan Aharon 1, Shay-Lee Aharoni 1, Evette S Radisky 2, Niv Papo 1
PMCID: PMC7376575  NIHMSID: NIHMS1610398  PMID: 32296833

Abstract

To facilitate investigations of protein-protein interactions (PPIs), we developed a novel platform for quantitative mapping of protein binding specificity landscapes, which combines multi-target screening of a mutagenesis library into high- and low-affinity populations with sophisticated next-generation sequencing analysis. Importantly, this method generates accurate models to predict affinity and specificity values for any mutation within a protein complex, and requires only a small number of experimental binding affinity measurements using purified proteins for calibration. We demonstrated the utility of the approach by mapping quantitative landscapes for interactions between the N-terminal domain of the tissue inhibitor of metalloproteinase 2 (N-TIMP2) and three matrix metalloproteinases (MMPs) having homologous structures but different affinities (MMP-1, MMP-3 and MMP-14). The binding landscapes for N-TIMP2/MMP-1 and N-TIMP2/MMP-3 showed the PPIs to be almost fully optimized, with most single mutations giving a loss of affinity. In contrast, the non-optimized PPI for N-TIMP2/MMP-14 was reflected in a wide range of binding affinities, where single mutations exhibited a far more attenuated effect on the PPI. Our new platform reliably and comprehensively identified not only hot- and cold-spot residues, but also specificity-switch mutations that shape target affinity and specificity. Thus, our approach provides a methodology giving an unprecedentedly rich quantitative analysis of the binding specificity landscape, which will broaden the understanding of the mechanisms and evolutionary origins of specific PPIs and facilitate the rational design of specific inhibitors for structurally similar target proteins.

Keywords: protein-protein interactions (PPIs), protein engineering, next generation sequencing (NGS), matrix metalloproteinases (MMPs), protease inhibitors

Summary statement:

Aharon et al. describe a novel strategy for generating quantitative affinity and specificity landscapes for any protein-protein interaction regardless of its KD (or Ki).

Introduction

Protein fitness – the ability of a protein to perform its main function – is delicately balanced against other protein properties, including solubility and stability to unfolding, aggregation, and degradation [1,2]. Despite evolutionary fine-tuning, protein fitness is very often not perfect [3], and thus to enhance protein functions for desired applications, we may need to apply high-throughput screening and selection of mutagenized protein libraries [4–7]. In such an endeavor, it is necessary to have in hand a reliable tool for evaluating the functional potency of the mutagenized proteins, for example, for determining the binding affinities of mutants with altered protein-protein interactions (PPIs). This type of evaluation can take the form of mapping fitness landscapes, which relate the effects of all possible mutations to protein fitness [8–19].

In investigations of PPIs, protein fitness landscapes find particular utility in the study of pairs of proteins with similar physicochemical properties and similar binding interface sizes but markedly different binding affinities (>12 orders of magnitude) [20]. Determinants of binding affinity include ‘hot spots’ [21,22], namely, residues at critical positions in the binding interface for which mutations cause a significant reduction in affinity (>3 orders of magnitude); the binding landscapes of such PPIs are referred to here as ‘steep’ [8,9,23,24]. Mutations at other positions in the PPI interfaces, known as ‘cold spots,’ will not greatly affect – or may even enhance [25] – the binding affinity, producing landscapes referred to here as ‘shallow.’ In situations in which it is desirable to block natural PPIs for therapeutic applications, hot spots – being energetically favorable residues – make the greatest contribution to the stability of protein-protein complexes and may thus represent druggable sites on target interfaces [26]. Cold spots, which are believed to play a major role in the evolution of PPIs [27], are particularly important for optimization in affinity maturation experiments aimed at improving the binding of protein-based therapeutics. For both these strategies, there remains a need to bridge knowledge gaps regarding hot and cold spots, particularly their exact positions and frequency in the binding interface.

To date, the techniques used in studies of PPIs range from alanine scanning mutagenesis to yeast surface display (YSD) in combination with next-generation sequencing (NGS) [9,28–31]. However, these and most other currently available approaches generate affinity – but not necessarily specificity – landscapes, and, in the few studies that were designed to generate specificity landscapes [29,32–34], the methodology was confined to discrimination between only two target proteins that have different binding epitopes. This constitutes a limitation because broad-spectrum binding proteins may have many potential targets with binding epitopes that have high sequence homology and structural similarity. The current study thus aims to address the need for better methods to comprehensively and accurately quantify the impact of individual mutations on the binding specificity of a protein toward homologous targets exhibiting a broad range of affinities.

In our initial efforts to tackle this problem, our laboratory previously developed a qualitative approach for mapping protein specificity landscapes by combining experimental multi-target selective screening of YSD libraries with NGS analysis [35]. This approach enabled us to identify both hot-spot and cold-spot residues, as well as specificity-switch and correlated mutations that act in tandem to shape specificity; nonetheless, there remained a need for similarly facile methods to obtain quantitative binding landscapes. To address this need, we used our previously developed reliable and comprehensive platform as the starting point to construct a quantitative generic methodology to map protein specificity landscapes, regardless of the KD (or Ki) values of the PPIs, as shown schematically in Fig. 1. The major advance offered by this methodology for accurately quantifying the target specificity for thousands of protein mutants lies in the need to determine the inhibition constants (Ki) for only a small number of selected purified mutants after the steps of protein sequence randomization, fractional library screening using YSD into low- and high-affinity PPIs, and NGS for a large number of mutants.

Fig. 1.

Overview of our research methodology. (A) Structure of the multispecific ligand N-TIMP2 [Protein Data Bank (PDB) ID code 1BUV], with 7 mutagenized interface positions shown as red spheres. Our N-TIMP2 library contained variants with a single-site random mutation in each clone. (B) Following library preparation, transformation, and expression in the yeast surface display system, the N-TIMP2 library was sequentially screened against three MMP targets (denoted MMP-A, MMP-B and MMP-C) to obtain populations with high and low affinities for each target. The DNA from the sorted library fractions was extracted and subjected to next-generation sequencing (NGS) to determine the frequency of each variant in each subpopulation. NGS affinity and specificity scores were calculated from the collected NGS data, providing insights into the contribution of each amino acid substitution in the N-TIMP2 binding interface to the target binding affinity and specificity, relative to wild-type N-TIMP2. (C) Calculated NGS fitness scores were used to construct qualitative fitness landscapes for each MMP target. As the first step to obtaining quantitative fitness landscapes, several mutants (shown in dotted yellow circles) were chosen for empirical assessment, with focus on mutants that spanned the entire fitness scoring spectrum so as to obtain an optimal representation of the library’s diversity. Following the completion of complementary functional assays with the purified mutants, Ki affinity and specificity scores were calculated and compared to their corresponding NGS scores. (D) The NGS and Ki scores were fitted to linear regression models. The resulting regression equations were then used to ‘scale’ the entire NGS data set through interpolation and to predict the effect of single amino acid substitutions on target affinity and specificity. 
(E) The scaled scores were finally summarized into definitive binding landscape heat maps, which facilitated the quantitative visualization of the predicted effects of amino acid substitutions on target binding affinity and specificity and the identification of hot and cold spots and specificity switch residues in the N-TIMP2/MMP binding interface. The color scale bars on the right of panels C and E range from blue, representing increases, to red, representing decreases, in the estimated affinity or specificity relative to N-TIMP2WT. Grey indicates cases for which there was insufficient data to determine the score. This lack of data was a result of a data point being excluded from the analysis if the mutant’s frequency was below the minimum frequency threshold in more than one subpopulation, as explained in the Results section under Scaling of NGS scores by linear regression vs. Ki scores.

As model systems for this study, we chose three homologous protein-protein complexes, each composed of the N-terminal domain of the tissue inhibitor of metalloproteinase 2 (N-TIMP2) and the catalytic domain of a matrix metalloproteinase (MMP). As the MMP partner, we chose to focus on MMPs that represent three different functional groups (collagenases, stromelysins, and membrane-type MMPs), namely, MMP-1, MMP-3, and MMP-14, respectively. Importantly, despite their high structural homology (known from X-ray structures), these three MMPs bind N-TIMP2 with different affinities, spanning two orders of magnitude [36–38]. This range of affinities renders N-TIMP2 an optimal model through which to develop and test our novel approach, in that its lack of discrimination between MMP-3 and MMP-14 offers a good starting point for manipulating relative specificity, while its higher preexisting selectivity toward MMP-1 offers an opportunity for engineering specificity switches. In addition, the binding interface residues of N-TIMP2 have been shown to tolerate substitution or incorporation of additional amino acids with only a minimal impact on protein stability [36–38]. Equally importantly, the three model MMPs selected represent potential targets of clinical value, since MMP-1 and MMP-14 are oncogenic [39–44], while MMP-3 plays important roles in tissue regeneration and wound healing [45,46]. To date, there has been very limited progress toward development of specific inhibitors – natural or synthetic – targeting these or other MMP family members, probably due to the high similarity in the sequences and structures of MMPs, which share nearly identical active sites in their catalytic domains [47,48].

We thus applied our novel platform to explore all possible single mutations in the N-TIMP2 binding interface of the three complexes, N-TIMP2/MMP-1, N-TIMP2/MMP-3 and N-TIMP2/MMP-14. In contrast to the affinity landscapes obtained in previous studies that generated qualitative information, the quantitative binding landscapes obtained here for the three different N-TIMP2/MMP complexes enabled us to dissect out the contribution of each interface position to binding and hence to quantitatively analyze the effects of all single mutations on affinity and, most importantly, on specificity, without the need to purify all the mutants and test them separately.

Materials and Methods

MMP expression and purification.

The MMPs used in our experiments were the catalytic domains of the proteins known as MMP-1, MMP-3 and MMP-14. The catalytic domains of MMP-1 and MMP-3 were purified as previously described [49,50]. For the MMP-14 catalytic domain, the gene [positions 112–292 [51]] fused to a hexahistidine tag at its C-terminus in a pET3a vector was expressed in BL21(DE3) Escherichia coli cells and was purified as described previously [52]. The concentrations of the purified proteins were determined by UV-Visible absorbance at 280 nm [with extinction coefficients (ε280) of 25,440, 28,420 and 35,410 M⁻¹cm⁻¹ for MMP-1, MMP-3 and MMP-14, respectively] using a NanoDrop Spectrophotometer (Thermo Scientific, USA). The purity of the proteins was determined by SDS-PAGE. In experiments for which labeled MMP-14 was required, biotinylation of the purified protein was performed with EZ-Link sulfo-NHS-LC-LC-biotin according to the manufacturer’s instructions (Pierce, Rockford, IL, USA), followed by purification of the labeled protein on a size-exclusion Superdex 75 column.

Preparation of a focused N-TIMP2 library in a YSD system

A single-mutation N-TIMP2 library was engineered on the basis of seven mutagenesis-tolerant [53] residues in the binding interface of the protein. Briefly, the library, purchased from GenScript (Piscataway, NJ), was constructed using NNS degenerate codons (where N represents A, C, T, or G nucleotides, and S represents C or G) at positions 4, 35, 38, 68, 71, 97, and 99 on the basis of the N-TIMP2WT gene (PDB ID code 1BUV), giving random mutations at a single position for each clone. This single-position N-TIMP2 mutagenesis library was expressed in a YSD system with the S. cerevisiae EBY100 yeast strain according to an established protocol [54]. For the construction of a yeast-displayed N-TIMP2 library with a free N-terminus, the pCHA-VRC01-scFv vector (obtained from Dane Wittrup, Massachusetts Institute of Technology) was used [55]. In this construct, the C-terminus of the N-TIMP2 library was fused to the N-terminus of Aga-2, leaving the N-terminus of the displayed N-TIMP2 protein variants exposed to the solvent [56]. The pCHA vector was linearized with the restriction enzymes BamHI-HF and NheI-HF (New England Biolabs, Ipswich, MA) and, together with the N-TIMP2 library, was transformed by homologous recombination into a competent EBY100 S. cerevisiae yeast strain using a MicroPulser electroporator (Bio-Rad, CA, USA), as previously described [36]. The transformed yeast was grown in SDCAA selective medium (2% dextrose, 0.67% yeast nitrogen base, 0.5% Bacto casamino acids, 1.47% sodium citrate, and 0.429% citric acid monohydrate, adjusted to pH 4.5) overnight at 30 °C to an OD600 of 10.0 (10⁸ cells/ml). A library size of 5.45×10⁶ transformants was obtained, as determined by plating serial dilutions on SDCAA plates (2% dextrose, 0.67% yeast nitrogen base, 0.5% Bacto casamino acids, 1.54% Na2HPO4, 1.856% Na2HPOH2O, 18.2% sorbitol, and 1.5% agar).
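
The coverage of the NNS scheme described above can be checked by enumeration. The following is a minimal sketch, not the library design procedure used by the vendor; the function names are hypothetical, and the codon table is the standard genetic code written in the usual T, C, A, G order.

```python
from itertools import product

# Standard genetic code with bases ordered T, C, A, G at each codon position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def nns_codons():
    """Return every codon matching NNS (N = A/C/G/T, S = C/G)."""
    return ["".join(c) for c in product("ACGT", "ACGT", "CG")]


def summarize_nns():
    """Count NNS codons, the amino acids they encode, and list stop codons."""
    codons = nns_codons()
    encoded = {CODON_TABLE[c] for c in codons}
    stops = [c for c in codons if CODON_TABLE[c] == "*"]
    return len(codons), len(encoded - {"*"}), stops


if __name__ == "__main__":
    n_codons, n_amino_acids, stops = summarize_nns()
    print(n_codons, n_amino_acids, stops)
```

Because NNS covers all 20 amino acids, randomizing one of the seven positions per clone gives 7 × 19 = 133 possible single-substitution protein variants, in addition to clones that re-encode the wild-type residue or carry the one stop codon that NNS permits.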

Library sorting and analysis by flow cytometry

The yeast-displayed N-TIMP2 library that had been grown in SDCAA medium overnight was transferred into SGCAA medium (2% galactose, 0.67% yeast nitrogen base, 0.5% Bacto casamino acids, 1.47% sodium citrate, and 0.429% citric acid monohydrate, adjusted to pH 4.5) for overnight growth at 30 °C to induce expression of the library and to display it on the surface of the yeast cells [54]. For screening against the MMP targets, ~5×10⁷ yeast cells expressing the library were collected and washed with a binding buffer (50 mM Tris, pH 7.5, 100 mM NaCl, 5 mM CaCl2, and 1% BSA). Thereafter, the cells were incubated for 1 h at room temperature (RT) with the primary antibody, mouse anti-c-Myc antibody (Abcam, Cambridge, UK), at a 1:50 ratio together with soluble biotinylated MMP-1, MMP-3 or MMP-14 at final concentrations in the binding buffer of 0.78 nM, 12.5 nM or 12.5 nM, respectively. The cells were washed with the binding buffer and incubated with streptavidin conjugated to Alexa Fluor® 647 (Invitrogen, IL) at a 1:200 ratio together with sheep anti-mouse antibody conjugated to phycoerythrin (PE) (Sigma-Aldrich, MO) at a 1:50 ratio in the binding buffer in the dark for 30 min on ice. Thereafter, the cells were washed and sorted into two gates to differentiate between high- and low-affinity clones by using a FACSAria (Ilse Katz Institute for Nanoscale Science and Technology, BGU, Israel). Dual-color flow cytometry was used for analysis of the expression and binding of the N-TIMP2 library and selected individual clones to the different MMP targets using an Accuri C6 flow cytometer (BD Biosciences, San Jose, CA).

For the YSD-based affinity titration curves, the same MMP labeling and N-TIMP2WT/MMP binding reaction conditions as those described above were used. The binding of N-TIMP2WT to MMP-1, MMP-3 and MMP-14 was determined by incubating the yeast cells displaying N-TIMP2WT with different concentrations of purified MMP-1 (0.05 nM, 0.1 nM, 0.2 nM, 0.5 nM, 1 nM, 2 nM, 5 nM, 8 nM, 10 nM, 20 nM, 75 nM, 100 nM and 250 nM) and MMP-3 and MMP-14 (1 nM, 5 nM, 10 nM, 20 nM, 50 nM, 100 nM, 500 nM, 750 nM and 1 μM). To obtain a plot of the normalized binding for the different MMP concentrations, each affinity measurement was normalized to the highest binding against the same MMP target, which was set as 1. A nonlinear binding fit was implemented using GraphPad Prism (GraphPad software, CA, USA) for the N-TIMP2WT/MMP-1, N-TIMP2WT/MMP-3 and N-TIMP2WT/MMP-14 interactions. The values of the apparent KD for each interaction were calculated using the ‘binding saturation – one site total’ fit.

DNA Sequencing of the sorted N-TIMP2 library fractions.

The sorted N-TIMP2 library fractions (high and low affinity) were isolated from the yeast cells using the E.Z.N.A. Yeast Plasmid Mini Kit (Omega Bio-tek, Norcross, GA, USA) according to the manufacturer’s protocol. The kit products were run on a 1% agarose gel and were then purified using HiYield Gel/PCR Fragments Extraction Kit (RBC Bioscience, Taiwan). Thereafter, ~1 ng of plasmid DNA was used for gene amplification in a PCR reaction, containing 2% DNA template, 5% forward primer (10 μM), 5% reverse primer (10 μM) (Table S2), 20% Q5 reaction buffer, 2% dNTPs, and 1% Q5 HF polymerase (New England Biolabs, Ipswich, MA, USA) in doubly distilled water. The PCR conditions were as follows: 98 °C for 30 s, followed by 30 cycles of 98 °C for 10 s, 56 °C for 30 s and 72 °C for 1 min; the reaction mixture was then left to stand at 72 °C for 10 min. The PCR products were then sent for sequencing (Hylabs, Rehovot, Israel), and a second PCR reaction was performed using the Fluidigm Access Array primers to add the adaptors and barcodes. Then, the samples were purified with AmpureXP beads (Beckman Coulter, CA, USA), and their DNA concentrations were measured in a DNA high sensitivity assay performed in a Qubit fluorometer (Thermo Fisher Scientific). Thereafter, the samples were run on a TapeStation (Agilent, CA, USA) to verify the size of the PCR product. As a final quality test, the samples underwent a qRT-PCR reaction to determine the concentration of the DNA that could be sequenced. The pools were then loaded for sequencing on an Illumina MiSeq, using the 500v2 kit.

Data filtration and integration of high-fidelity reads.

The data from all the sequencing runs for the N-TIMP2 library fractions was analyzed in the same manner. An average Illumina quality score was calculated for each read in a given set of paired-end reads, and read pairs in which either read had an average quality score lower than 20 (i.e., less than 99% accuracy) were discarded. The paired end reads were combined together into a single sequence by Fast Length Adjustment of Short reads (FLASH) software [57] [The Center for Computational Biology (CCB), Johns Hopkins University, Maryland, USA].
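
The read-pair filter described above can be sketched as follows. This is an illustration only, not the authors' pipeline: the function names are hypothetical, and the Phred+33 quality encoding is an assumption (it is the usual encoding for current Illumina FASTQ files, but the text does not state it).

```python
def mean_phred(quality_string, offset=33):
    """Average Phred score of one read, decoded from its FASTQ quality string.

    Assumes Phred+33 encoding (offset=33); this is not stated in the text.
    """
    scores = [ord(ch) - offset for ch in quality_string]
    return sum(scores) / len(scores)


def keep_pair(qual_r1, qual_r2, min_mean_q=20):
    """Keep a read pair only if neither mate has a mean quality below the cutoff."""
    return mean_phred(qual_r1) >= min_mean_q and mean_phred(qual_r2) >= min_mean_q


def phred_to_error_probability(q):
    """Convert a Phred score to the probability of an incorrect base call."""
    return 10 ** (-q / 10)


if __name__ == "__main__":
    # Q20 corresponds to a 1% error probability, i.e. 99% base-call accuracy.
    print(phred_to_error_probability(20))
    print(keep_pair("IIIIIIII", "++++++++"))
```

Pairs that pass such a filter would then be merged into single sequences, which the authors did with FLASH.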

Analysis of the high-throughput sequencing data

Translation and analysis of the DNA sequences were performed with MATLAB software, version R2016a. The DNA sequences of the sorted N-TIMP2 library fractions were translated to their corresponding amino acid sequences and aligned to the N-TIMP2WT sequence (PDB ID code 1BUV). Sequences that were longer or shorter than the WT, or that contained an internal stop codon, were filtered out. Thereafter, a minimum threshold value was set in the form of a minimum frequency (fmin), rather than an absolute mutant count, for two reasons: to avoid inflation of the data in cases of mutants with low read counts and because the library size of the sorted library fractions was not consistent (~5×10⁵–9×10⁵ reads). The fmin value (SI Appendix, Fig. S7) was set to 2⁻¹² and was based on the distribution of mutants in the pre-sorting library. Then, the numbers of appearances of each mutant in each library fraction were summed, and the frequency of each mutant was calculated as:

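$$f_{\text{mut},i} = \frac{\#\text{reads}_{\text{mut},i}}{\sum \#\text{reads}_{\text{mut},i}} \quad \text{(Eq. 1)}$$

where $\#\text{reads}_{\text{mut},i}$ is the number of reads of a mutant in library fraction i, and $\sum \#\text{reads}_{\text{mut},i}$ is the sum of all the reads for all mutants in library fraction i.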

Next, to compare the frequencies of each mutant to that of the WT in the same library fraction, we derived a parameter that we termed the normalized frequency (NF), which is the ratio between the frequency of a given mutant and the frequency of the WT in library fraction i, i.e.:

$$NF_{\text{mut},i} = \frac{f_{\text{mut},i}}{f_{\text{WT},i}} \quad \text{(Eq. 2)}$$

For mutants with low frequencies, i.e., below $f_{\text{min}}$, the NF value was calculated with $f_{\text{mut},i}$ set to the $f_{\text{min}}$ value.
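
Eqs. 1 and 2, together with the minimum-frequency rule, can be sketched as below. The authors performed this analysis in MATLAB; this Python version is only an illustration, and the function names, the dictionary layout, and the `floored` flag are hypothetical choices rather than part of the published method.

```python
F_MIN = 2 ** -12  # minimum frequency threshold given in the text


def frequencies(read_counts):
    """Eq. 1: reads of each variant divided by all reads in the library fraction."""
    total = sum(read_counts.values())
    return {variant: n / total for variant, n in read_counts.items()}


def normalized_frequencies(read_counts, wt_key="WT", f_min=F_MIN):
    """Eq. 2, with mutant frequencies below f_min replaced by f_min.

    Returns {variant: (NF, floored)}, where `floored` records whether the
    f_min value had to be substituted for the observed frequency.
    """
    freq = frequencies(read_counts)
    f_wt = freq[wt_key]
    result = {}
    for variant, f in freq.items():
        if variant == wt_key:
            continue
        floored = f < f_min
        result[variant] = (max(f, f_min) / f_wt, floored)
    return result


if __name__ == "__main__":
    # Illustrative read counts only; these are not data from the study.
    counts = {"WT": 5000, "mutA": 2500, "mutB": 1, "mutC": 2499}
    for variant, (nf, floored) in normalized_frequencies(counts).items():
        print(variant, nf, floored)
```

Keeping track of which NF values were floored is convenient because the scoring step needs to know how many floored terms enter each score.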

Based on the normalized frequency, we determined the following NGS scores:

  1. $$\text{NGS affinity score} = \frac{NF_{\text{mut, High Affinity}}}{NF_{\text{mut, Low Affinity}}} \quad \text{(Eq. 3)}$$
    where $NF_{\text{mut, High Affinity}}$ and $NF_{\text{mut, Low Affinity}}$ are the normalized frequencies of a mutant in the high- and low-affinity library fractions for a given target, respectively; a high score implies a high affinity and vice versa.
  2. $$\text{Paired NGS specificity score}_{A/B} = \frac{NF_{\text{mut, MMP-A}}}{NF_{\text{mut, MMP-B}}} \quad \text{(Eq. 4)}$$
    where $NF_{\text{mut, MMP-A}}$ and $NF_{\text{mut, MMP-B}}$ are the normalized frequencies of a mutant in the high-affinity library fractions of MMP-A and MMP-B, respectively; this score indicates the extent of specificity of a mutant towards MMP-A in preference to MMP-B.
  3. $$\text{Total NGS specificity score}_{A} = \frac{NF_{\text{mut, MMP-A}}}{NF_{\text{mut, MMP-B}}} \times \frac{NF_{\text{mut, MMP-A}}}{NF_{\text{mut, MMP-C}}} \quad \text{(Eq. 5)}$$
    which is the product of all the paired NGS specificity scores for target A vs. the other two targets. In all the scoring calculations above, a score was discarded if more than one of the NF parameters entering it had been calculated with the $f_{\text{min}}$ value.
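
The three scores and the discard rule can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: each NF value is passed as a hypothetical `(NF, floored)` pair, and for the total specificity score the rule is interpreted as counting the three distinct NF parameters (targets A, B and C) once each.

```python
import math


def ratio_score(numerator, denominator):
    """NF ratio used for Eq. 3 (high/low affinity) and Eq. 4 (MMP-A/MMP-B).

    Each argument is an (NF, floored) pair. Returns None when both terms
    were calculated with the f_min value, i.e. the score is discarded.
    """
    (nf_num, floored_num), (nf_den, floored_den) = numerator, denominator
    if floored_num + floored_den > 1:
        return None
    return nf_num / nf_den


def total_specificity(nf_a, nf_b, nf_c):
    """Eq. 5: product of the paired specificity scores A/B and A/C.

    Assumption: the score is discarded when more than one of the three
    distinct NF parameters was calculated with f_min.
    """
    if sum(floored for _, floored in (nf_a, nf_b, nf_c)) > 1:
        return None
    a, b, c = nf_a[0], nf_b[0], nf_c[0]
    return (a / b) * (a / c)


def log2_or_none(score):
    """log2 transform used for the heat maps; discarded scores stay None."""
    return None if score is None else math.log2(score)


if __name__ == "__main__":
    # Illustrative NF values only; these are not data from the study.
    high, low = (2.0, False), (0.5, False)
    print(log2_or_none(ratio_score(high, low)))
```

Discarded scores correspond to the grey cells of the heat maps, for which there were insufficient data.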

We then presented these NGS scores (affinity scores, paired specificity scores and total specificity scores) in the form of qualitative heat maps presenting the log₂ transformation of the scores listed above (see Eqs. 2–5), in which blue indicates an increase and red a decrease in the estimated affinity or specificity relative to N-TIMP2WT.

Based on the above-mentioned NGS results, several mutants whose affinity and specificity scores fell within a wide spectrum (i.e., from low to high) were chosen for further production and purification, followed by functional evaluation, i.e., Ki determination, with the aim to achieve an optimal representation of each mutant population for the validation and ‘scaling’ of the NGS results.

Engineering N-TIMP2 variants by using site-directed mutagenesis.

Based on our NGS results, N-TIMP2WT and several variants were selected for further purification and characterization. For purification, we used the pPICZαA construct, containing both the N-TIMP2WT gene [36] and the Zeocin resistance gene, and also the AOX1 promoter at its N terminus and a hexahistidine tag at its C terminus. The plasmid was propagated in DH5α E. coli cells and then purified using a HiYield plasmid mini kit (RBC Bioscience, Taiwan). Thereafter, the site-directed mutagenesis procedure for the selected N-TIMP2 variants was carried out in a PCR reaction using specific primers (Table S2) containing the codon for the desired amino acid in the middle, flanked on each side by 15 bp complementary to the template WT DNA sequence. The PCR reaction mixture comprised: 2% DNA plasmid template (~50 ng), 5% forward primer (10 μM), 5% reverse primer (10 μM), 20% Phusion HF buffer, 2% dNTPs, and 1% Phusion HF polymerase in doubly distilled water. The PCR conditions were: 98 °C for 3 min, followed by 25 cycles of 98 °C for 10 s, 65 °C for 30 s and 72 °C for 10 min; the reaction mixture was then left to stand at 72 °C for 10 min. Next, the PCR reaction products were loaded on a diagnostic 1% agarose gel to verify the procedure’s success, and then transformed into competent DH10β E. coli cells. The transformed bacteria were plated on LB agar plates containing 50 μg/ml Zeocin (Invitrogen, NY, USA). The plasmid was extracted from several bacterial colonies, and the correct sequence of the desired mutation was verified (Genetics Unit, NIBN, BGU, Israel).

Production of selected N-TIMP2 variants in Pichia pastoris.

To purify the selected N-TIMP2 variants, the yeast P. pastoris strain X-33, which upon induction secretes proteins to the growth medium, was used according to the pPICZα protocol (Invitrogen, CA, USA) with minor modifications. In brief, for preparing a sufficient amount of plasmid from each N-TIMP2 variant, the transformed DH10β E. coli cells containing the plasmid were grown overnight at 37 °C in 500 ml of LB medium containing 50 μg/ml Zeocin (Invitrogen, Grand Island, NY, USA), and the plasmid was extracted using MaxiPrep (Geneaid, New Taipei City, Taiwan). Thereafter, ~100 μg of plasmid from each variant was linearized with the restriction enzyme SacI-HF (New England Biolabs, Ipswich, MA, USA) and then transformed into electro-competent P. pastoris X-33 according to the pPICZα protocol (Invitrogen, CA, USA). The transformed yeasts were grown on YPDS plates (18.2% sorbitol, 2% peptone, 2% d-glucose, 2% agar, 1% yeast extract and 50 μg/ml Zeocin) for 72 h at 30 °C.

Purification of N-TIMP2 variants was performed as previously described [36]. For each N-TIMP2 variant, 5 colonies were selected and grown overnight in 5 ml of BMGY medium (2% peptone, 1% yeast extract, 0.23% K2H(PO4), 1.181% KH2(PO4), 1.34% yeast nitrogen base, 4×10⁻⁵% biotin, 1% glycerol) at 30 °C, and then transferred into 5 ml of BMMY medium (2% peptone, 1% yeast extract, 0.23% K2H(PO4), 1.181% KH2(PO4), 1.34% yeast nitrogen base, 4×10⁻⁵% biotin, 0.5% methanol) for protein induction of 72 h, with the addition of 1% methanol once a day in the last two days. Overexpression of the secreted proteins was determined by western blot, using a 1:3000 dilution of mouse anti-6×His primary antibody (Abcam, Cambridge, UK), followed by a 1:5000 dilution of anti-mouse secondary antibody conjugated to alkaline phosphatase (Jackson ImmunoResearch, West Grove, PA, USA), and detection by incubation in 2 ml of 5-bromo-4-chloro-3-indolyl phosphate reagent (Sigma-Aldrich, USA). Large-scale production of the proteins was performed by growth of the N-TIMP2-expressing yeast exhibiting the highest protein overexpression in 50 ml of BMGY medium overnight, followed by 72 h of growth in BMMY medium, with daily additions of 1% methanol. The proteins were purified by centrifugation of the yeast cell suspension at 3800 g for 10 min and filtration of the supernatant, followed by addition of 500 mM NaCl and 10 mM imidazole at pH 8.0. The supernatant was incubated for 1 h at 4 °C, centrifuged at 3800 g for 10 min, filtered, and then loaded on nickel-nitrilotriacetic acid-Sepharose beads (Invitrogen, USA), washed with 50 mM Tris, pH 7.5, 100 mM NaCl, and 10 mM imidazole, and eluted with 20 ml of 50 mM Tris, pH 7.5, 100 mM NaCl, 300 mM imidazole, and 5 mM CaCl2. The elution fraction was concentrated using a Vivaspin centrifugal concentrator with a 3-kDa cutoff (GE Healthcare Life Sciences, USA). 
The proteins were further purified using a Superdex 75 column (GE Healthcare Life Sciences, USA) with elution buffer (50 mM Tris, pH 7.5, 100 mM NaCl and 5 mM CaCl2) in an ÄKTA pure instrument (GE Healthcare Life Sciences, USA). SDS-PAGE analysis on a 15% polyacrylamide gel under reducing conditions for the purified proteins was then performed. Bands were visualized by staining with Instant Blue (CBS Scientific, CA, USA). Protein samples were concentrated using a Vivaspin centrifugal concentrator with a 3-kDa cutoff and subjected to mass spectrometry analysis (Ilse Katz Institute for Nanoscale Science and Technology, BGU, Israel). Protein concentrations were determined by UV-Vis absorbance at 280 nm, using a NanoDrop Spectrophotometer (Thermo Scientific, USA), with an extinction coefficient (ε280) of 13,500 M⁻¹cm⁻¹ for N-TIMP2WT and all its variants except for N-TIMP2V71W with ε280 = 18,825 M⁻¹cm⁻¹.

Catalytic activity and inhibition assays

Catalytic activity and inhibition assays were performed as previously described with minor modifications [37]. N-TIMP2WT and its variants were tested for inhibitory activity by incubating them with the three different MMPs in the following concentrations: 0.25 nM MMP-1 with 0.156–10 nM N-TIMP2WT or with 0.156–62.5 nM N-TIMP2 variants; 0.25 nM MMP-3 with 0.625–40 nM N-TIMP2WT or with 0.313–250 nM of the N-TIMP2 variants; or 0.0075 nM MMP-14 with 0.625–40 nM N-TIMP2WT or with 0.156–80 nM N-TIMP2 variants. The incubations were performed in TCNB buffer (50 mM Tris, pH 7.5, 100 mM NaCl, 5 mM CaCl2, and 0.05% Brij) for 1 h at 37 °C. Next, the fluorogenic substrate Mca-Pro-Leu-Gly-Leu-Dpa-Ala-Arg-NH2·TFA [where Mca is (7-methoxycoumarin-4-yl)acetyl, Dpa is N-3-(2,4-dinitrophenyl)-l-2,3-diaminopropionyl and TFA is trifluoroacetic acid] (Merck Millipore, CA) was added to the reaction mixture at a final concentration of 12.5 μM for MMP-1 and MMP-3 or 15 μM for MMP-14, and the fluorescence was monitored (with 340/30 excitation and 400/30 emission filters) using a Synergy 2 plate reader (BioTek, Winooski, VT, USA) at 37 °C. Reactions were followed spectroscopically for 60 min, and initial rates were determined from the linear portion of the increase in fluorescence signal caused by the cleavage of the fluorescent substrate. Data were globally fitted by multiple regression to Morrison’s tight binding inhibition equation (see Eq. 6) using GraphPad Prism 7 (San Diego, CA, USA). The inhibition constant, Ki, was calculated by plotting the initial velocities against seven different concentrations of the inhibitors. Reported Ki values are the averages of three independent experiments ± standard error of the mean. Calculations were performed using Km values of 3.607 ± 0.598 μM for MMP-1, 3.771 ± 0.428 μM for MMP-3 and 7.960 ± 2.230 μM for MMP-14.

$$\frac{V_i}{V_0} = 1 - \frac{\left([E] + [I] + K_i^{app}\right) - \sqrt{\left([E] + [I] + K_i^{app}\right)^2 - 4[E][I]}}{2[E]} \quad \text{(Eq. 6)}$$

where $V_i$ and $V_0$ are the enzyme (MMP) velocities in the presence and absence of the relevant N-TIMP2 inhibitor, respectively; [E] and [I] are the concentrations of enzyme and inhibitor, respectively; $K_m$ is the Michaelis-Menten constant; and $K_i^{app}$ is the apparent inhibition constant, which is given by $K_i^{app} = K_i\left(1 + \frac{[S]}{K_m}\right)$, where [S] is the substrate concentration.
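
Eq. 6 and the relation between Ki and the apparent Ki can be written as two small functions. This is a sketch for evaluating the model at given concentrations, not the global fitting procedure, which the authors carried out in GraphPad Prism; the function names are hypothetical, and all concentrations are assumed to share the same units.

```python
import math


def apparent_ki(ki, substrate, km):
    """Kiapp = Ki * (1 + [S]/Km); [S] and Km must share units."""
    return ki * (1 + substrate / km)


def morrison_fractional_velocity(enzyme, inhibitor, ki_app):
    """Eq. 6: Vi/V0 for a tight-binding inhibitor.

    enzyme, inhibitor and ki_app must all be in the same concentration units.
    """
    s = enzyme + inhibitor + ki_app
    # Concentration of the enzyme-inhibitor complex (smaller quadratic root).
    bound = (s - math.sqrt(s * s - 4 * enzyme * inhibitor)) / 2
    return 1 - bound / enzyme


if __name__ == "__main__":
    # [S] = 12.5 uM and Km = 3.607 uM are the MMP-1 values given in the text;
    # the Ki of 1 nM is an arbitrary illustrative value, not a measured one.
    ki_app = apparent_ki(ki=1.0, substrate=12.5, km=3.607)
    for inhibitor_nm in (0.0, 1.0, 10.0, 100.0):
        print(inhibitor_nm, morrison_fractional_velocity(0.25, inhibitor_nm, ki_app))
```

Unlike the simple hyperbolic inhibition model, this form does not assume that the free inhibitor concentration equals the total, which matters when the enzyme concentration is comparable to Ki.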

Next, the Ki affinity score of each mutant for each target MMP (Ki fold) was evaluated by calculating the fold change in its inhibition constant relative to the inhibition constant of the WT, as:

$$K_i \text{ affinity score} = \frac{K_i^{\text{WT}}}{K_i^{\text{mut}}} \quad \text{(Eq. 7)}$$

Thereafter, the Ki specificity scores were evaluated in the same manner as for the NGS specificity scores (see Eqs. 3–5):

$$\text{Paired } K_i \text{ specificity score}_{A/B} = \frac{K_i\,\text{fold}_{\text{mut, MMP-A}}}{K_i\,\text{fold}_{\text{mut, MMP-B}}} \quad \text{(Eq. 8)}$$

was used to calculate the specificity of a mutant towards target A vs. target B by dividing their respective Ki fold values, and

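$$\text{Total } K_i \text{ specificity score}_{A} = \frac{K_i\,\text{fold}_{\text{mut, MMP-A}}}{K_i\,\text{fold}_{\text{mut, MMP-B}}} \times \frac{K_i\,\text{fold}_{\text{mut, MMP-A}}}{K_i\,\text{fold}_{\text{mut, MMP-C}}} \quad \text{(Eq. 9)}$$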

was used to evaluate the extent of specificity of a mutant towards target A vs. the other targets, by multiplying the paired Ki specificity scores for target A vs. all the other targets.
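The three Ki scores (Eqs. 7–9) can be sketched in a few lines of Python. The function names are illustrative assumptions; the input Ki values (nM) are those reported for N-TIMP2WT and N-TIMP2T99M in Table 1.

```python
def ki_fold(ki_wt, ki_mut):
    """Ki affinity score (Eq. 7): fold change relative to the WT."""
    return ki_wt / ki_mut

def paired_specificity(fold_a, fold_b):
    """Paired Ki specificity score for target A vs. target B (Eq. 8)."""
    return fold_a / fold_b

def total_specificity(fold_a, other_folds):
    """Total Ki specificity score (Eq. 9): product of the paired scores."""
    total = 1.0
    for fold_other in other_folds:
        total *= paired_specificity(fold_a, fold_other)
    return total

# Ki values in nM, taken from Table 1
ki_wt = {"MMP-1": 0.054, "MMP-3": 1.34, "MMP-14": 0.726}
ki_t99m = {"MMP-1": 0.085, "MMP-3": 8.29, "MMP-14": 2.05}

folds = {target: ki_fold(ki_wt[target], ki_t99m[target]) for target in ki_wt}
paired_1_vs_3 = paired_specificity(folds["MMP-1"], folds["MMP-3"])
paired_1_vs_14 = paired_specificity(folds["MMP-1"], folds["MMP-14"])
total_mmp1 = total_specificity(folds["MMP-1"], [folds["MMP-3"], folds["MMP-14"]])
print(folds, paired_1_vs_3, paired_1_vs_14, total_mmp1)
```

Computed from the tabulated Ki values, the paired scores for T99M come out at roughly 3.9 (MMP-1 vs. MMP-3) and 1.8 (MMP-1 vs. MMP-14); small differences from values derived from the rounded Ki fold entries are due to rounding.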

‘Scaling’ of the binding landscapes

After determining the Ki affinity and specificity scores, we performed log2 transformations of the NGS and Ki scores and then fitted the NGS and Ki results to a linear regression model. The same data were also analyzed using a leave-one-out cross-validation approach, in which each data point was predicted without the enrichment information for that particular data point. The obtained regression lines allowed us to place the NGS scores of other, unpurified, variants in the regression formulas and to calculate their corrected corresponding affinity and specificity scores via interpolation. The calculation was carried out for all the single mutations in the seven N-TIMP2 interface positions, which then enabled us to ‘scale’ our initial qualitative (‘apparent’) binding landscapes and transform them into quantitative heat maps that predicted the outcomes of single amino acid substitutions on target binding affinity and specificity.
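The scaling procedure can be illustrated with a short Python sketch: an ordinary least-squares line is fitted to log2-transformed scores, a leave-one-out loop refits the line with each calibrator omitted, and the fitted line is then used to convert the NGS score of an unpurified variant into a predicted Ki fold value. The calibrator values below are made-up numbers for illustration only, and the function names are assumptions rather than the code used in this study.

```python
import math

def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def leave_one_out(xs, ys):
    """Predict each point from a line fitted to all the other points."""
    predictions = []
    for i in range(len(xs)):
        slope, intercept = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        predictions.append(slope * xs[i] + intercept)
    return predictions

def predict_ki_fold(ngs_score, slope, intercept):
    """Scale an NGS score to a Ki fold value via the log2-log2 regression."""
    return 2.0 ** (slope * math.log2(ngs_score) + intercept)

# Hypothetical calibrator data (NOT measured values)
ngs_scores = [0.05, 0.2, 0.5, 1.0, 2.0, 4.0]
ki_folds = [0.01, 0.05, 0.2, 0.4, 1.0, 3.0]
log_ngs = [math.log2(v) for v in ngs_scores]
log_ki = [math.log2(v) for v in ki_folds]

slope, intercept = fit_line(log_ngs, log_ki)
loo_predictions = leave_one_out(log_ngs, log_ki)
print(slope, intercept)
print(predict_ki_fold(1.5, slope, intercept))
```

Because the prediction is an interpolation, it is only meaningful for NGS scores that lie within the range spanned by the calibrators.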

Statistical analysis

To validate the correlation between the NGS scores and the Ki scores, we used Pearson correlation (R), p-value, Spearman’s correlation (ρ) and Kendall correlation (τ); the analysis was performed in the Partek Genomics Suite [58,59].
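For readers without access to the Partek Genomics Suite, the three correlation coefficients can be computed with a few lines of plain Python. This is a minimal sketch, not the software used in this study: p-values are omitted, Kendall's coefficient is computed in its simplest form (tau-a, with no tie correction), and the function names are illustrative assumptions.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient R."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def ranks(values):
    """1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result

def spearman(xs, ys):
    """Spearman's rho: Pearson correlation of the ranks."""
    return pearson(ranks(xs), ranks(ys))

def kendall_tau_a(xs, ys):
    """Kendall's tau-a: (concordant - discordant) / number of pairs."""
    n = len(xs)
    concordant = discordant = 0
    for i in range(n):
        for j in range(i + 1, n):
            sign = (xs[i] - xs[j]) * (ys[i] - ys[j])
            if sign > 0:
                concordant += 1
            elif sign < 0:
                discordant += 1
    return (concordant - discordant) / (n * (n - 1) / 2)
```

Applied to the log2-transformed NGS and Ki scores, these functions return values between −1 and +1, with +1 indicating perfect agreement in value (Pearson) or in rank order (Spearman, Kendall).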

Results

Fractional sorting of the N-TIMP2 mutagenesis library for high and low affinity to MMP-1, MMP-3 and MMP-14

As the first step to comprehensively mapping the affinity and specificity landscapes of the interactions of N-TIMP2 with the three different MMPs, i.e., to determine the contribution of each position and of each specific mutation to the affinity and specificity of N-TIMP2 for the target MMP, we generated a single-mutation N-TIMP2 library with mutations in seven key positions, i.e., 4, 35, 38, 68, 71, 97 and 99 (Fig. 1, step A). These positions were chosen for their known importance in MMP binding, their close proximity (within 4 Å) to the MMP interface and to the catalytic zinc in the N-TIMP2/MMP-14 complex structure [53,60], and their structural tolerance to mutagenesis [36,53]. Although previous experiments have evaluated the combined effects of certain mutations in these positions, the contributions of each position and mutation to the affinity and, particularly, to the specificity for the target, i.e., as cold spots or hot spots, have not previously been determined.

We used a YSD platform to select N-TIMP2 variants that bind to MMP-1, MMP-3 and MMP-14 with different affinities, as follows. We cloned the coding region of the N-TIMP2 variants into the YSD plasmid pCHA (which allows the N-terminus of N-TIMP2 to be freely exposed) for presentation of the proteins on the Saccharomyces cerevisiae yeast surface as fusion proteins with the Aga2p/Aga1p system (SI Appendix, Fig. S1A). The N-TIMP2 library, expressed in the YSD system (SI Appendix, Fig. S1A), was subjected to fluorescence-activated cell sorting (FACS) for separate screening against each of the three MMP targets, namely, the catalytic domains of MMP-1, MMP-3 and MMP-14 (Fig. 1, step B). Target concentrations were chosen to give maximal scattering in the affinity signal of the sorted populations and hence to facilitate identification of, and differentiation between, high- and low-affinity populations, where ‘high’ is > wild type (WT) and ‘low’ is < WT. We note that to compensate for differences between target concentrations used for sorting (in the FACS experiment) and the Ki of the relevant N-TIMP2/MMP complex, the results were later calibrated as described below in the section Scaling of NGS scores by linear regression vs. Ki scores. For each sort, high- and low-affinity fractions were collected by applying diagonal sorting gates, which facilitated the normalization of the binding signal to the expression for each N-TIMP2 clone (SI Appendix, Fig. S1B). Further post-sort flow cytometry analysis of the fractionated sub-populations verified the differences in the binding signal between the sorted high- and low-affinity fractions (SI Appendix, Fig. S1C). A comparison between the scattering pattern of the pre-sorted library population and that of each of the individual clones verified that the range of scattering for a single mutant was significantly narrower than that of the library (SI Appendix, Fig. S2).

Generating qualitative affinity and specificity heat maps based on NGS analysis of the sorted N-TIMP2 library fractions

Following the isolation of the high- and low-affinity fractions for each target, the plasmid DNA of each fraction was extracted and subjected to high-throughput NGS analysis. The quality of the sequencing output was verified, with ~90% of the sequenced paired end reads undergoing successful quality filtration and integration. Thereafter, the DNA sequences of the N-TIMP2 library fractions were translated into their respective amino acid sequences and aligned to the WT N-TIMP2 sequence (N-TIMP2WT; PDB ID code 1BUV). The numbers of reads for each mutant in each sorted fraction were summed, and the frequency of the mutant in each sorted fraction was determined, where frequency was defined as the ratio of number of reads for a single mutant normalized to the total number of reads for a given sorted fraction (Eq. 1; for Eqs. 1–6, see Analysis of the high-throughput sequencing data in Materials and Methods). Next, to estimate the contribution of each amino acid substitution in the N-TIMP2 binding interface to the change in binding affinity and specificity for each MMP target, we defined a parameter, which we termed the normalized frequency (NF), as the ratio between the frequency of a single mutant and the frequency of N-TIMP2WT in the same sorted fraction (Eq. 2; see Analysis of the high-throughput sequencing data). This parameter allowed us to compare the affinity of each N-TIMP2 mutant for a specific MMP to that of the WT N-TIMP2 for the same MMP in a manner that was not influenced by either the size of the pooled libraries or the frequency of the mutant in the original (pre-sorting) mutated library. This approach complements previous studies that evaluated changes in target affinities on the basis of enrichment ratios comparing the frequency of a particular mutant in a sorted library to its frequency in the original library [29,61].

To enable us to treat the raw NGS data – and ultimately to quantify our findings for binding affinity and specificity – we then defined three different scores, as follows:

  1. the ‘NGS affinity score’ (Eq. 3; see Analysis of the high-throughput sequencing data) as the ratio between the mutant’s NF parameters in the high- and low-affinity fractions of the same MMP target, our assumption being that if a mutant’s NGS affinity score exceeds 1, then its affinity towards the desired target would be higher than that of N-TIMP2WT, and vice versa;

  2. the ‘paired NGS specificity score’ (Eq. 4; see Analysis of the high-throughput sequencing data) as the NF value obtained from a high-affinity fraction for a desired target divided by the NF value of the same high-affinity fraction for a different target, thereby quantifying the specificity of a mutant towards a given target over another target (a score > 1 representing a higher specificity of a mutant towards target A over B); and

  3. the ‘total NGS specificity score’ (Eq. 5; see Analysis of the high-throughput sequencing data) as the product of all the paired NGS specificity scores for a certain target vs. all the other targets, thereby giving a score that determines the extent of specificity of a mutant towards a particular target in a group of homologous targets.
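The definitions above (frequency, NF and the three NGS scores) translate directly into code. The Python sketch below follows the verbal definitions given in this section; the read counts are hypothetical numbers chosen for illustration, and the data layout and function names are assumptions rather than the pipeline used in this study.

```python
# Hypothetical read counts per sorted fraction, keyed by (target, gate)
counts = {
    ("MMP-1", "high"): {"total": 100000, "WT": 2000, "T99M": 3000},
    ("MMP-1", "low"): {"total": 120000, "WT": 2400, "T99M": 1200},
    ("MMP-3", "high"): {"total": 90000, "WT": 1800, "T99M": 900},
    ("MMP-14", "high"): {"total": 110000, "WT": 2200, "T99M": 1100},
}

def frequency(variant, target, gate):
    """Reads for a variant divided by total reads in that sorted fraction."""
    fraction = counts[(target, gate)]
    return fraction[variant] / fraction["total"]

def normalized_frequency(variant, target, gate):
    """NF: frequency of the variant relative to the WT in the same fraction."""
    return frequency(variant, target, gate) / frequency("WT", target, gate)

def ngs_affinity_score(variant, target):
    """NF in the high-affinity gate divided by NF in the low-affinity gate."""
    return (normalized_frequency(variant, target, "high")
            / normalized_frequency(variant, target, "low"))

def paired_ngs_specificity(variant, target_a, target_b):
    """High-gate NF for target A divided by high-gate NF for target B."""
    return (normalized_frequency(variant, target_a, "high")
            / normalized_frequency(variant, target_b, "high"))

def total_ngs_specificity(variant, target_a, other_targets):
    """Product of the paired specificity scores of A vs. every other target."""
    total = 1.0
    for other in other_targets:
        total *= paired_ngs_specificity(variant, target_a, other)
    return total

print(ngs_affinity_score("T99M", "MMP-1"))
print(paired_ngs_specificity("T99M", "MMP-1", "MMP-3"))
print(total_ngs_specificity("T99M", "MMP-1", ["MMP-3", "MMP-14"]))
```

Because NF is a ratio of two frequencies from the same fraction, the total read count of that fraction cancels out, which is why the scores do not depend on the size of the pooled libraries.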

The information obtained from the NGS analysis and scoring calculations provided qualitative insight into the relative contribution of each amino acid substitution in the N-TIMP2 binding interface to the binding affinity and specificity for a particular target. The construction of quantitative binding landscapes then required further adjustments and ‘scaling’ to complementary experimental affinity determinations of representative purified mutants. To facilitate the optimal selection of a number of representative variants for binding/inhibition measurements, we first summarized the initial NGS results (expressed as log2 of the affinity, paired specificity and total specificity NGS scores) into heat maps (SI Appendix, Fig. S3); these heat maps enabled the qualitative visualization of the apparent effects of each amino acid substitution on the affinity and specificity (vs. a single different MMP or multiple MMPs) towards the MMP targets (Fig. 1, step C).

Empirical assessment of affinity and specificity of purified calibrator mutants for scaling of the NGS data

To ‘scale’ the apparent affinity and specificity scores obtained from the NGS data, we selected eight mutants (covering the entire spectrum of NGS affinity and specificity scores; see below in this section) and purified them for further empirical binding analysis (SI Appendix, Fig. S4). This analysis of the interactions between free proteins (solution KD) was required because solution KD values were ~ 50-fold lower than apparent KD values obtained for the same PPI when using the yeast display format (SI Appendix, Fig. S5 and Table 1). By creating linear regression equations, we were able to correlate the NGS scores with the Ki binding results obtained for these purified ‘calibrator’ mutants, and subsequently to use these equations to interpolate the affinity and specificity of each of the mutants in the N-TIMP2 mutated library (i.e., for each mutation in the binding interface) to the selected MMP targets. To optimally represent the complete range of affinity and specificity of mutants present within the library, the selection of calibrator mutants for the binding analysis was aimed to cover the entire spectrum of NGS affinity and specificity scores, as follows. For each of the three MMP targets, two mutants exhibiting high specificity scores were chosen, as follows: N-TIMP2T99M and N-TIMP2T99Q selective for MMP-1, N-TIMP2S68M and N-TIMP2V71W selective for MMP-3, and N-TIMP2H97R and N-TIMP2T99G selective for MMP-14. In addition, we chose one mutant that exhibited high-affinity scores towards all three targets, namely, N-TIMP2S68E, and one mutant with low affinity scores to all three targets, namely, N-TIMP2V71R. We then examined the affinity and specificity of each of the purified calibrator mutants for the selected MMP targets by performing catalytic inhibition assays. 
In these assays, Ki for the binding between each MMP target and N-TIMP2WT or one of the selected N-TIMP2 mutants was evaluated by monitoring the cleavage of a fluorescent MMP substrate over time in the presence of increasing concentrations of inhibitors and fitting the acquired data by multiple regression to Morrison’s tight binding inhibition equation (Eq. 6; see Catalytic activity and inhibition assays in Materials and Methods), as shown in Fig. 2 and Table 1. Utilizing the above-measured inhibition constants, we then determined the following Ki affinity and specificity scores for each mutant:

  1. Ki affinity score, namely, Ki fold, as the ratio between the Ki of the WT and the Ki of the mutant, which represents the fold change in affinity as a result of a given amino acid substitution, where Ki fold > 1 thus means an increase in affinity relative to the WT, and vice versa (Eq. 7; see Catalytic activity and inhibition assays);

  2. paired Ki specificity score, as a ratio between the Ki fold change of a mutant towards one target and the Ki fold change towards a second target, which indicates the extent of specificity awarded by a single mutation towards a given target over another (Eq. 8; see Catalytic activity and inhibition assays); and

  3. total Ki specificity score, as the product of all the paired Ki specificity scores for a certain target vs. the rest of the targets, which aids in quantifying the degree of specificity of a mutant towards a single target vs. all the other targets (Eq. 9; see Catalytic activity and inhibition assays).

Table 1. Ki and Ki fold values of the N-TIMP2 variants against the three MMP targets.

| Clone | Ki (nM)*, MMP-1 | Ki (nM)*, MMP-3 | Ki (nM)*, MMP-14 | Ki fold**, MMP-1 | Ki fold**, MMP-3 | Ki fold**, MMP-14 |
|-------|-----------------|-----------------|------------------|------------------|------------------|-------------------|
| WT    | 0.054 ± 0.003   | 1.34 ± 0.05     | 0.726 ± 0.030    | 1                | 1                | 1                 |
| S68E  | 0.249 ± 0.015   | 4.71 ± 0.10     | 1.42 ± 0.10      | 0.22             | 0.28             | 0.51              |
| S68M  | 0.181 ± 0.005   | 3.29 ± 0.08     | 2.60 ± 0.12      | 0.30             | 0.41             | 0.28              |
| V71R  | 5.59 ± 0.10     | 86.2 ± 1.9      | ND               | 0.0097           | 0.016            | ND                |
| V71W  | 2.99 ± 0.06     | 26.4 ± 1.8      | 34.3 ± 2.1       | 0.018            | 0.051            | 0.021             |
| H97R  | 0.031 ± 0.002   | 1.31 ± 0.03     | 0.125 ± 0.010    | 1.72             | 1.02             | 5.81              |
| T99M  | 0.085 ± 0.003   | 8.29 ± 0.54     | 2.05 ± 0.08      | 0.64             | 0.16             | 0.35              |
| T99Q  | 0.099 ± 0.006   | 3.72 ± 0.24     | 3.92 ± 0.27      | 0.55             | 0.36             | 0.19              |
| T99G  | 0.080 ± 0.003   | 4.24 ± 0.20     | 0.534 ± 0.040    | 0.67             | 0.32             | 1.36              |

* Data shown are the averages of independent triplicate experiments ± SEM.

** Ki fold represents the ratio between the Ki value of the wild type and the Ki of the mutant.

ND: could not be determined.

Fig. 2.

MMP inhibition by N-TIMP2WT and selected N-TIMP2 mutants. (A&B) MMP-1 inhibition; (C&D) MMP-3 inhibition; and (E&F) MMP-14 inhibition. Cleavage of the fluorescent MMP substrate Mca-Pro-Leu-Gly-Leu-Dpa-Ala-Arg-NH2·TFA was measured over time, and the initial reaction velocities at each inhibitor concentration were determined. To obtain the inhibition constants (Ki), data was fitted by multiple regression to Morrison’s tight binding inhibition equation (Eq. 6; see Materials and Methods). Data shown is the average of independent triplicate experiments, and error bars represent the standard deviation.

The Ki values for the PPI of N-TIMP2WT with MMP-1, MMP-3 and MMP-14 were 0.054 ± 0.003, 1.34 ± 0.05 and 0.726 ± 0.030 nM, respectively, values that are consistent with those reported in previous studies [36–38]. The Ki values of the purified mutants interacting with the MMP targets indicated a wide range of affinities, with most of the purified mutants showing weakened affinities to the different MMPs relative to the WT: the Ki fold values lay in the range 0.0097–1.72 for MMP-1, 0.016–1.02 for MMP-3 and 0.021–5.81 for MMP-14 (Table 1).

Comparison of the experimentally obtained quantitative Ki scores (i.e., Ki affinity score, paired Ki specificity score and total Ki specificity score) showed good agreement with the qualitative trends indicated by the respective NGS-based scores. For example, the N-TIMP2 variant that had the highest NGS affinity scores for MMP-1 and MMP-14, namely N-TIMP2H97R, did indeed exhibit the highest increase in affinity towards those targets, with Ki fold values of 1.72 and 5.81 for MMP-1 and MMP-14, respectively (Table 1). Similarly, variant N-TIMP2V71R exhibited the lowest NGS affinity scores for MMP-1 and MMP-3 and the most deleterious effects on binding affinity, with Ki fold values of 0.0097 and 0.016 for MMP-1 and MMP-3, respectively (Table 1). The results for specificity showed similar correlations between NGS scores and experimentally obtained Ki scores for all the mutants. For example, indications from the NGS scores that N-TIMP2T99M would be specific for MMP-1 were confirmed by the experimentally obtained 4.0- and 1.8-fold enhancements in specificity for MMP-1 vs. MMP-3 and MMP-14, respectively. The results for all the mutants are summarized in the SI Appendix, Table S1.

Scaling of NGS scores by linear regression vs. Ki scores

Due to the differences between the target MMP concentrations used for the N-TIMP2 library sorts vs. the Ki values for the respective purified N-TIMP2WT/MMP complexes, we calibrated the NGS scores by linear regression vs. the Ki scores (see Materials and Methods). Log2 transformations of the Ki affinity and specificity scores of the purified mutants were plotted against the corresponding NGS scores, and the plots were fitted to linear regression models (Fig. 1, step D; SI Appendix, Fig. S6). The affinity plots showed good agreement between the experimentally obtained Ki results and the NGS scores (SI Appendix, Fig. S6A–C, Table S3), with Pearson’s R-values of 0.9803, 0.8849 and 0.7662 for MMP-1, MMP-3 and MMP-14, respectively. The affinity correlation for MMP-14 (SI Appendix, Fig. S6C) included 11 additional experimental data points that were obtained from a previous study [53], thereby strengthening the reliability of our findings. In addition, the specificity scores for the pairs MMP-1 vs. MMP-3, MMP-1 vs. MMP-14, and MMP-3 vs. MMP-14 showed good correlations (SI Appendix, Fig. S6D–F, Table S3), with Pearson’s R-values of 0.8963, 0.9112 and 0.9645, respectively. Likewise, good correlations were obtained for the total specificity scores assigned to each MMP target individually vs. all the MMPs, the resulting Pearson’s R-values being 0.8731, 0.9320 and 0.9462 for MMP-1, MMP-3 and MMP-14, respectively (SI Appendix, Fig. S6G–I, Table S3). The same data were also analyzed using a leave-one-out cross-validation approach [62], in which each data point was predicted without the enrichment information for that particular data point. Good correlations were obtained for the affinity, specificity and total specificity scores, with Pearson’s R-values of 0.87, 0.87 and 0.88, respectively (SI Appendix, Fig. S7A–C).
In all the NGS scoring calculations, a data point was excluded from the analysis if the mutant’s frequency was below the minimum frequency threshold in more than one subpopulation (i.e., both high- and low-affinity gates in the affinity score calculations, or both high-affinity gates of two different targets in the specificity score calculations, see Materials and Methods for further explanations). This approach enabled the scaling of the NGS scores to the Ki data, and hence further refinement, fine-tuning and quantification of the affinity and specificity landscapes in a precise way.
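The exclusion rule can be expressed as a simple predicate: a data point is dropped only when the mutant's frequency falls below the threshold in both of the subpopulations that enter the score. The sketch below is an illustration under stated assumptions; in particular, the threshold value `MIN_FREQ` is a hypothetical placeholder, since the actual minimum frequency is specified in Materials and Methods and not in this section.

```python
MIN_FREQ = 1e-5  # hypothetical placeholder, not the threshold used in the study

def keep_data_point(freq_first, freq_second, min_freq=MIN_FREQ):
    """Return False only if BOTH frequencies are below the threshold.

    For an affinity score the two frequencies come from the high- and
    low-affinity gates of one target; for a specificity score they come
    from the high-affinity gates of two different targets.
    """
    return not (freq_first < min_freq and freq_second < min_freq)

print(keep_data_point(1e-6, 1e-3))  # below threshold in one gate only: kept
print(keep_data_point(1e-6, 2e-6))  # below threshold in both gates: excluded
```

Excluded data points correspond to the cells for which no score can be reported in the resulting heat maps.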

Quantitative N-TIMP2 affinity and specificity landscapes

Use of the regression equations linking the NGS and Ki scores for as few as eight individual calibrant variants allowed us to scale our entire set of NGS results through interpolation and thereby to predict the effects of numerous other mutants, including almost all the single amino acid substitutions in our library, with regard to affinity and specificity towards the MMP targets. Heat maps prepared on the basis of the scaled affinity and specificity scores constitute quantitative binding landscapes for N-TIMP2 and the three MMPs (Fig. 1, step E), as presented in Fig. 3.

Fig. 3.

Scaled quantitative N-TIMP2 binding landscapes for (A) MMP-1, (B) MMP-3, and (C) MMP-14, which show the predicted effects of single amino acid substitutions on target affinity and specificity, as determined from the scaling regression equations. The color scale bars on the right of each panel range from blue, representing increases, to red, representing decreases, in the estimated affinity or specificity relative to N-TIMP2WT. Grey indicates cases for which there was insufficient data to determine the score. This lack of data was a result of a data point being excluded from the analysis if the mutant’s frequency was below the minimum frequency threshold in more than one subpopulation, as explained in the Results section under Scaling of NGS scores by linear regression vs. Ki scores. Wild-type residues are shown as black letters inside the matrix; the substituting amino acid is shown on the X-axis; and the identity and position of the original substituted amino acid are shown in the Y-axis.

Examination of the scaled binding landscapes revealed the effects of the various mutations in the N-TIMP2 binding interface on target binding affinity and specificity and showed clearly that some N-TIMP2/MMP complexes were more tolerant to mutations than others. The binding affinity landscape for MMP-1 (Fig. 3A) showed that most of the tested mutations led to a substantial decrease in binding affinity. Nonetheless, among the seven positions that were examined, mutations in positions 97 and 99 were less deleterious or sometimes even affinity enhancing vs those in the other five positions, e.g., mutations H97R and H97L in position 97 increased affinity towards MMP-1. The affinity landscape for MMP-3 (Fig. 3B) showed that all the mutations led to decreased affinity, with mutations in position 71 being the most deleterious. In contrast, the binding affinity landscape for MMP-14 suggested that the binding affinity of N-TIMP2 was not impaired dramatically by the mutations, and that many mutations have the potential to increase the binding affinity to MMP-14 (Fig. 3C): Positions 97 and 99 were identified as potentially affinity-enhancing positions for MMP-14, as most of the amino acid substitutions in those positions led to an increase in affinity. Notably, our analysis revealed a subtle shift in specificity from one target to another; for example, while most of the mutations in position 68 led to an increase in specificity for MMP-14 over MMP-1, mutation S68G led to higher specificity for MMP-1 over MMP-14. In addition, a decrease in the total specificity for MMP-3 was observed in most of the mutations in position 97, except for the mutation to Glu (Fig. 3A), while an increase in total specificity for MMP-14 was revealed at position 99, except for the mutation to Gln (Fig. 3C).

Discussion

We report here a generic platform for quantitatively predicting the binding affinity and specificity for all protein variants with a single mutation in the binding interface for a particular PPI. Such predictions can be made on the basis of a small data set of experimentally determined Ki values for a small number of purified variants and scaled NGS data. Our methodology differs from previous approaches in which correlations between NGS results and Ki (or KD) could not necessarily be generalized, since those studies required library screening using high-affinity sorts of mutant populations and a target protein concentration similar to the Ki (or KD) of the PPI, and could thus predict affinities lying only in a narrow range. Furthermore, such previous approaches could not predict affinities for very high affinity (low Ki) complexes due to technical limitations, as the required low target protein concentrations would fall below the detection levels of the screening process. In contrast, we are able to predict the affinity and specificity of PPIs spanning a broad range of values, including very high affinity interactions, by analysis of NGS data from low- and high-affinity sorts of mutant populations.

As expected, for the three tight binding N-TIMP2/MMP complexes investigated here as model systems for quantitative affinity landscapes, the correlation curves for the NGS results vs. the Ki values were shifted towards the low values of our NGS affinity scores (i.e., for negative NGS scores, positive Ki scores were obtained). Nevertheless, we were able to correct for the bias imposed by the differences between the Ki values and the concentrations used in the YSD library sorting (by scaling the NGS affinity scores to the Ki affinity values) and to obtain strong correlations between NGS and Ki values for these interactions, even though the Ki values of the three complexes were approximately 10-fold lower than the protein concentrations used in the YSD library sorting. Indeed, the resulting binding landscapes correlated very well with experimental data obtained in this paper and with the findings of previous studies [36,53].

To obtain the linear regression curves for the affinity predictions and for generating the quantitative affinity landscape maps, it was necessary to use the more accurate solution KD – and not the apparent KD of the PPI – to normalize the NGS scores. The differences between the solution KD values (0.05 nM, 1.34 nM and 0.73 nM for MMP-1, MMP-3 and MMP-14, respectively) and the ~50-fold higher apparent KD values (3.22 nM, 36.20 nM and 48.25 nM for MMP-1, MMP-3 and MMP-14, respectively) for the N-TIMP2–MMP PPI may be attributed to the modifications to the two PPI partners that were required by the experimental protocol, as follows. Detection of the target MMP by binding to streptavidin required biotinylation of the Lys residues, of which two, six, and four lie in the binding interface for MMP-1 (PDB: 3SHI), MMP-3 (PDB: 1UEA) and MMP-14 (PDB: 1BUV), respectively. For detection of N-TIMP2WT with an anti-c-Myc antibody, it was necessary to express the protein with a c-Myc tag. It is likely that the resultant changes to the highly complementary electrostatic potentials of N-TIMP2WT and MMP, particularly those caused by the biotinylation of the target MMP, reduced the strength of the PPI. Support for this idea may be drawn from our findings that the MMP-TIMP PPI is highly optimized (especially for MMP-1 and MMP-3), giving affinities in the low to sub-nanomolar range, and that even a single mutation can easily lead to a drastic decrease in affinity; hence, modifications of the proteins that are used for detection in the YSD setup could result in drastic changes to KD values. Since the same modifications to both proteins in the N-TIMP2/MMP PPI are used in all the sorting experiments, they are likely to reduce the apparent affinity by the same amount for all the different variants. This premise is corroborated by the high correlation that we obtained between the NGS-derived enrichments and the in vitro KD values.
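The size of the gap between the two sets of KD values can be checked with simple arithmetic on the numbers quoted above. The following Python sketch (variable names are illustrative) computes the ratio of apparent to solution KD for each target and the mean of the three ratios.

```python
# KD values in nM, as quoted in the text
solution_kd = {"MMP-1": 0.05, "MMP-3": 1.34, "MMP-14": 0.73}
apparent_kd = {"MMP-1": 3.22, "MMP-3": 36.20, "MMP-14": 48.25}

# Ratio of apparent (yeast-display) KD to solution KD for each target
fold_difference = {
    target: apparent_kd[target] / solution_kd[target] for target in solution_kd
}
mean_fold = sum(fold_difference.values()) / len(fold_difference)

for target, fold in fold_difference.items():
    print(f"{target}: {fold:.1f}-fold")
print(f"mean: {mean_fold:.1f}-fold")
```

The per-target ratios are roughly 64, 27 and 66 for MMP-1, MMP-3 and MMP-14, respectively, averaging about 52, which is in line with the approximately 50-fold difference stated in the text.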

For any mutant, the correlations between the experimentally obtained Ki scores and the NGS scores for all three N-TIMP2/MMP complexes allow the accurate prediction of Ki values and of both affinity and specificity, without the need to purify that mutant or to perform a functional assay to determine its activity. In order to show how well the linear regression models could predict a Ki measurement for a mutant that was not included when building the model, we used a computational cross-validation approach. Using this approach, involving sequential recalculation of regression models with omission of one mutant at a time and then using those models to predict Ki for the omitted mutant, we demonstrated that the predicted Ki values are still very close to the measured ones. Indeed, we were able to predict five affinity-enhancing mutations that were also identified in an N-TIMP2 variant that was previously evolved by our group for high affinity towards MMP-14 in preference to other MMPs [36]. In fact, three out of five of these mutations showed the highest affinity enhancing scores (vis-à-vis other mutations) for each mutated position. We note that the high correlations between the experimentally obtained Ki scores and the NGS scores for N-TIMP2/MMP-1 and N-TIMP2/MMP-3 (Pearson’s R-values of 0.9803 and 0.8849, respectively) may be attributed to the broad range of Ki values that covered both affinity-enhancing and affinity-reducing mutations, whereas the lower correlation (R = 0.7662) obtained for the N-TIMP2/MMP-14 complex was probably the outcome of using data for proteins purified and tested in different laboratories (our data were supplemented with data from the literature).

Multigate sorting approaches similar – but not identical – to ours for mapping binding landscapes have been reported by the group of Keating [30,63] and the group of Kinney [64], with the consensus of results pointing to the potential utility of exploiting binding landscapes in studies designed to improve the affinity and specificity of PPIs. A key difference between our approach and the approaches taken in these prior studies is that whereas the groups of Keating and Kinney correlated NGS data to apparent dissociation constants derived from flow analysis in the artificial yeast surface display context, we have used actual in vitro affinity measurements of purified proteins for normalization. We attribute the superior correlations between our Ki predictions and actual in vitro measurements (R=0.76–0.98), over a much broader range than reported in prior studies, to our use of the more accurate solution KD values to normalize the NGS scores. Furthermore, if we restrict our analysis to the data generated in our own laboratory, thereby removing the variability introduced by incorporating data points from a previous report [53], the correlations obtained with our approach are even further improved (R values ranging from 0.87 to 0.98). This approach has enabled us to accurately model binding affinities extending to the far extremes of the KD range. An additional advantage of our new approach derives from its practical simplicity, as it requires only two sorting gates, and thus implementation should be straightforward for application of this approach to other systems in other laboratories.

One of the most important attributes of the quantitative binding landscapes is that they allow us to compare structurally similar PPIs with different binding affinities, for which the landscapes are significantly different in terms of average mutational effects. The quantitative binding landscapes for N-TIMP2/MMP-1 and N-TIMP2/MMP-3 illustrate the almost completely optimal nature of these two PPIs, since the binding landscapes are narrow (i.e., have a small range of binding affinities) with many single mutations that can cause a large decrease in affinity and only a few single mutations that can cause affinity enhancement. For example, for N-TIMP2/MMP-1 (the highest affinity complex), most single mutations (91.1%) caused a loss of affinity (i.e., NGS affinity score < 1), whereas only a few single mutations increased (8.0%) or preserved (0.9%) the initial affinity of N-TIMP2 for MMP-1 (NGS affinity scores of > 1 or lying between –1 and +1, respectively).

In contrast, the quantitative binding landscape for N-TIMP2/MMP-14 is much broader (having the highest range of binding affinities), with single mutations enhancing (21.7%), reducing (17.4%) or not affecting (60.9%) affinity. Most importantly, the PPI for N-TIMP2/MMP-14 is far less well optimized than the PPIs for N-TIMP2/MMP-1 and N-TIMP2/MMP-3, for which there are larger proportions of affinity-reducing mutants (91.1% and 89.9% for MMP-1 and MMP-3, respectively). These differences between MMP-14, on the one hand, and MMP-1 and MMP-3, on the other, can be seen both in the experimentally measured affinities using the calibrant mutants and in the NGS affinity scores.

In addition, we confirmed experimentally that the NGS predictions for mutant H97R to be slightly affinity enhancing and for V71R and V71W to be affinity reducing were indeed correct. Similar agreement between the NGS predictions and the experimental Ki measurements was observed for the T99M, T99Q and T99G mutants, all of which were affinity reducing towards MMP-1, MMP-3 and MMP-14, except for T99G, which was affinity enhancing towards MMP-14. The results for specificity also showed similar correlations between NGS scores and experimentally obtained Ki scores for all the mutants. For example, indications from the NGS scores that the mutants would be specific for a certain MMP were confirmed by the experimentally obtained enhancements in specificity for the different MMPs. These include N-TIMP2T99M and N-TIMP2T99Q obtaining enhancements in specificity for MMP-1 vs. MMP-3 and MMP-14, N-TIMP2S68M and N-TIMP2V71W obtaining enhancements in specificity for MMP-3 vs. MMP-1 and MMP-14, and N-TIMP2H97R and N-TIMP2T99G obtaining enhancements in specificity for MMP-14 vs. MMP-1 and MMP-3.

Since our setup uses sequential screens against three different targets, a comparison between affinity scores obtained from individual screens against each target enables the identification of specificity-enhancing mutations. This method can be easily applied for multiple target proteins by comparing the high-affinity library fractions screened against each target separately and calculating the paired specificity scores for each target pair. This type of information cannot be obtained from a competitive multi-target screening approach using multiple target proteins per single screen, where a library is screened against a target of interest (labeled with one type of fluorophore) versus a mixture of competitors (all labeled with a second, different type of fluorophore) [65]. Such a competitive multi-target screening approach is useful only for a single primary target of interest but not when specificity between multiple target pairs is sought. A different approach utilizes single-step pairwise selective screens in which a library is sorted against two targets simultaneously for variants with differential selectivity toward each target in the pair [35]. NGS is then used to sequence and analyze these fractions. However, this method, too, cannot be scaled up for multiple target proteins, as the selective screens are performed only against one pair at a time.

A limitation inherent in our platform is that very large changes in binding affinity may not be properly estimated for some PPIs, because the NGS-based scores are log2-linear with the Ki binding scores only within a specific range. Expanding this range, and possibly improving the calculated binding affinities, would require sorting the mutants with the highest and the lowest affinities at different target concentrations [66]. Another direction yet to be explored is the comprehensive investigation of sequences with more than one mutation [67,68], which remains technically very challenging and must therefore await progress in methodologies for library construction and NGS analysis.
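
The range limitation can be made concrete with a small calibration sketch. It fits a straight line between NGS-based scores and log2-transformed changes in Ki for a few calibration mutants, and flags any prediction whose score lies outside the calibrated interval, where the log2-linear relationship is not guaranteed to hold. The data, the function names and the sign convention (a higher score meaning tighter binding) are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def predict(score, slope, intercept, calibrated_range):
    """Return (predicted log2 Ki change, True if score is inside the range)."""
    lo, hi = calibrated_range
    return slope * score + intercept, lo <= score <= hi


# Made-up calibration set: NGS score vs. log2(Ki mutant / Ki WT)
calib_scores = [-2.0, -1.0, 0.0, 1.0]
calib_log2_ki = [3.0, 1.5, 0.0, -1.5]

slope, intercept = fit_line(calib_scores, calib_log2_ki)
calibrated_range = (min(calib_scores), max(calib_scores))

inside = predict(0.5, slope, intercept, calibrated_range)   # interpolation
outside = predict(4.0, slope, intercept, calibrated_range)  # extrapolation

for label, (value, ok) in (("inside", inside), ("outside", outside)):
    note = "within calibrated range" if ok else "extrapolated, treat with caution"
    print(f"{label}: predicted log2 Ki change = {value:+.2f} ({note})")
```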

The above notwithstanding, our platform for mapping binding landscapes is sufficiently mature to facilitate the study of many additional PPIs with different functions and binding affinities. In particular, the platform is applicable to studies of multiple mutations, for which it can provide comprehensive information on PPI evolution. In future modeling and protein engineering studies, our approach will allow us to study how protein function influences the evolution of binding interface sequences and how high-affinity and high-specificity PPIs differ from short-lived and unnatural PPIs. It will also facilitate the rational design of specific inhibitors for structurally similar target proteins.

Supplementary Material

SI

Acknowledgments

The authors thank Michael Heyne, Gal Yosef, Naama Shafir and Vered Caspi (BGU) for helpful discussions. We thank Dr. Uzi Hadad for his technical assistance and Ms Inez Mureinik for editing the manuscript. FACS experiments were performed at the Ilse Katz Institute for Nanoscale Science & Technology. N.P. acknowledges support from the European Research Council “Ideas program” ERC-2013-StG (contract grant number: 336041). N.P. and E.S.R. acknowledge support from the US-Israel Binational Science Foundation (BSF). E.S.R. acknowledges support from United States National Institutes of Health grant R01CA154387.

Abbreviations used:

FACS: fluorescence-activated cell sorting

MMPs: matrix metalloproteinases

NF: normalized frequency

NGS: next-generation sequencing

PPIs: protein-protein interactions

TIMP: tissue inhibitor of matrix metalloproteinases

WT: wild type

YSD: yeast surface display

Footnotes

CONFLICT OF INTEREST

The authors declare that they have no conflict of interest with respect to the publication of this paper.

References

  • 1. Klesmith JR, Bacik J-P, Wrenbeck EE, Michalczyk R and Whitehead TA (2017) Trade-offs between enzyme fitness and solubility illuminated by deep mutational scanning. Proc. Natl. Acad. Sci 114, 2265–2270.
  • 2. Mavor D, Barlow K, Thompson S, Barad BA, Bonny AR, Cario CL, Gaskins G, Liu Z, Deming L, Axen SD, et al. (2016) Determination of ubiquitin fitness landscapes under different chemical stresses in a classroom setting. Elife 5, 1–23.
  • 3. Harms MJ and Thornton JW (2013) Evolutionary biochemistry: revealing the historical and physical causes of protein properties. Nat. Rev. Genet 14, 559–571.
  • 4. Cho S, Swaminathan CP, Yang J, Kerzic MC, Guan R, Kieke MC, Kranz DM, Mariuzza RA and Sundberg EJ (2005) Structural basis of affinity maturation and intramolecular cooperativity in a protein-protein interaction. Structure 13, 1775–1787.
  • 5. Bloom JD, Romero PA, Lu Z and Arnold FH (2007) Neutral genetic drift can alter promiscuous protein functions, potentially aiding functional evolution. Biol. Direct 2, 7–10.
  • 6. Ernst A, Avvakumov G, Tong J, Fan Y, Zhao Y, Alberts P, Persaud A, Walker JR, Neculai A-M, Neculai D, et al. (2013) A strategy for modulation of enzymes in the ubiquitin system. Science 339, 590–5.
  • 7. Goldsmith M and Tawfik DS (2012) Directed enzyme evolution: Beyond the low-hanging fruit. Curr. Opin. Struct. Biol 22, 406–412.
  • 8. Lu SM, Lu W, Qasim MA, Anderson S, Apostol I, Ardelt W, Bigler T, Chiang YW, Cook J, James MNG, et al. (2012) Predicting the reactivity of proteins from their sequence alone: Kazal family of protein inhibitors of serine proteinases. Proc. Natl. Acad. Sci 98, 1410–1415.
  • 9. Leung I, Dekel A, Shifman JM and Sidhu SS (2016) Saturation scanning of ubiquitin variants reveals a common hot spot for binding to USP2 and USP21. Proc. Natl. Acad. Sci 113, 8705–8710.
  • 10. Klesmith JR, Bacik JP, Michalczyk R and Whitehead TA (2015) Comprehensive Sequence-Flux Mapping of a Levoglucosan Utilization Pathway in E. coli. ACS Synth. Biol 4, 1235–1243.
  • 11. Firnberg E, Labonte JW, Gray JJ and Ostermeier M (2014) A Comprehensive, High-Resolution Map of a Gene’s Fitness Landscape. Mol. Biol. Evol 31, 1581–1592.
  • 12. Rockah-Shmuel L, Tóth-Petróczy Á and Tawfik DS (2015) Systematic Mapping of Protein Mutational Space by Prolonged Drift Reveals the Deleterious Effects of Seemingly Neutral Mutations. PLoS Comput. Biol 11, 1–28.
  • 13. Eyre-Walker A and Keightley PD (2007) The distribution of fitness effects of new mutations. Nat. Rev. Genet 8, 610–618.
  • 14. Kowalsky CA and Whitehead TA (2016) Determination of binding affinity upon mutation for type I dockerin-cohesin complexes from Clostridium thermocellum and Clostridium cellulolyticum using deep sequencing. Proteins Struct. Funct. Bioinforma 84, 1914–1928.
  • 15. Hietpas RT, Jensen JD and Bolon DNA (2011) Experimental illumination of a fitness landscape. Proc. Natl. Acad. Sci. U. S. A 108, 7896–7901.
  • 16. Roscoe BP, Thayer KM, Zeldovich KB, Fushman D and Bolon DNA (2013) Analyses of the effects of all ubiquitin point mutants on yeast growth rate. J. Mol. Biol 425, 1363–1377.
  • 17. Starita LM, Pruneda JN, Lo RS, Fowler DM, Kim HJ, Hiatt JB, Shendure J, Brzovic PS, Fields S and Klevit RE (2013) Activity-enhancing mutations in an E3 ubiquitin ligase identified by high-throughput mutagenesis. Proc. Natl. Acad. Sci. U. S. A 110, E1263–72.
  • 18. Melnikov A, Rogov P, Wang L, Gnirke A and Mikkelsen TS (2014) Comprehensive mutational scanning of a kinase in vivo reveals substrate-dependent fitness landscapes. Nucleic Acids Res. 42, 1–8.
  • 19. Sarkisyan KS, Bolotin DA, Meer MV, Usmanova DR, Mishin AS, Sharonov GV, Ivankov DN, Bozhanova NG, Baranov MS, Soylemez O, et al. (2016) Local fitness landscape of the green fluorescent protein. Nature 533, 397–401.
  • 20. Erijman A, Rosenthal E and Shifman JM (2014) How structure defines affinity in protein-protein interactions. PLoS One 9.
  • 21. Bass SH, Mulkerrin MG and Wells JA (1991) A systematic mutational analysis of hormone-binding determinants in the human growth hormone receptor. Proc. Natl. Acad. Sci. U. S. A 88, 4498–4502.
  • 22. Delano WL Unraveling hot spots in binding interfaces: progress and challenges 14–20.
  • 23. Castro MJM and Anderson S (1996) Alanine Point-Mutations in the Reactive Region of Bovine Pancreatic Trypsin Inhibitor: Effects on the Kinetics and Thermodynamics of Binding to β-Trypsin and α-Chymotrypsin. Biochemistry 35, 11435–11446.
  • 24. Meenan NAG, Sharma A, Fleishman SJ, MacDonald CJ, Morel B, Boetzel R, Moore GR, Baker D and Kleanthous C (2010) The structural and energetic basis for high selectivity in a high-affinity protein-protein interaction. Proc. Natl. Acad. Sci 107, 10080–10085.
  • 25. Shirian J, Sharabi O and Shifman JM (2016) Cold Spots in Protein Binding. Trends Biochem. Sci 41, 739–745.
  • 26. Cukuroglu E, Engin HB, Gursoy A and Keskin O (2014) Hot spots in protein–protein interfaces: Towards drug discovery. Prog. Biophys. Mol. Biol 116, 165–173.
  • 27. Campbell EC, Correy GJ, Mabbitt PD, Buckle AM, Tokuriki N and Jackson CJ (2018) Laboratory evolution of protein conformational dynamics. Curr. Opin. Struct. Biol 50, 49–57.
  • 28. Weiss GA, Watanabe CK, Zhong A, Goddard A and Sidhu SS (2000) Rapid mapping of protein functional epitopes by combinatorial alanine scanning. Proc. Natl. Acad. Sci. U. S. A 97, 8950–8954.
  • 29. Whitehead TA, Chevalier A, Song Y, Dreyfus C, Fleishman SJ, De Mattos C, Myers CA, Kamisetty H, Blair P, Wilson IA, et al. (2012) Optimization of affinity, specificity and function of designed influenza inhibitors using deep sequencing. Nat. Biotechnol 30, 543–548.
  • 30. Jenson JM, Xue V, Stretz L, Mandal T, Reich L. “Luther” and Keating AE (2018) Peptide design by optimization on a data-parameterized protein interaction landscape. Proc. Natl. Acad. Sci 115, E10342–E10351.
  • 31. Fleishman SJ, Whitehead TA, Ekiert DC, Dreyfus C, Corn JE, Strauch E-M, Wilson IA and Baker D (2011) Computational design of proteins targeting the conserved stem region of influenza hemagglutinin. Science 332, 816–21.
  • 32. Koenig P, Lee CV, Sanowar S, Wu P, Stinson J, Harris SF and Fuh G (2015) Deep sequencing-guided design of a high affinity dual specificity antibody to target two angiogenic factors in neovascular age-related macular degeneration. J. Biol. Chem 290, 21773–21786.
  • 33. Cohen-Khait R and Schreiber G (2016) Low-stringency selection of TEM1 for BLIP shows interface plasticity and selection for faster binders. Proc. Natl. Acad. Sci 113, 14982–14987.
  • 34. Mendes KR, Malone ML, Ndungu JM, Suponitsky-Kroyter I, Cavett VJ, McEnaney PJ, MacConnell AB, Doran TDM, Ronacher K, Stanley K, et al. (2017) High-throughput identification of DNA-encoded IgG ligands that distinguish active and latent mycobacterium tuberculosis infections. ACS Chem. Biol 12, 234–243.
  • 35. Naftaly S, Cohen I, Shahar A, Hockla A, Radisky ES and Papo N (2018) Mapping protein selectivity landscapes using multi-target selective screening and next-generation sequencing of combinatorial libraries. Nat. Commun 9, 1–10.
  • 36. Arkadash V, Yosef G, Shirian J, Cohen I, Horev Y, Grossman M, Sagi I, Radisky ES, Shifman JM and Papo N (2017) Development of high affinity and high specificity inhibitors of matrix metalloproteinase 14 through computational design and directed evolution. J. Biol. Chem 292, 3481–3495.
  • 37. Arkadash V, Radisky ES and Papo N (2018) Combinatorial engineering of N-TIMP2 variants that selectively inhibit MMP9 and MMP14 function in the cell. Oncotarget 9, 32036–32053.
  • 38. Shirian J, Arkadash V, Cohen I, Sapir T, Radisky ES, Papo N and Shifman JM (2018) Converting a broad matrix metalloproteinase family inhibitor into a specific inhibitor of MMP-9 and MMP-14. FEBS Lett. 592, 1122–1134.
  • 39. Overall CM and Kleifeld O (2006) Validating matrix metalloproteinases as drug targets and anti-targets for cancer therapy. Nat. Rev. Cancer 6, 227–239.
  • 40. Liu M, Hu Y, Zhang M-F, Luo K-J, Xie X-Y, Wen J, Fu J-H and Yang H (2016) MMP1 promotes tumor growth and metastasis in esophageal squamous cell carcinoma. Cancer Lett 377, 97–104.
  • 41. Ozden F, Saygin C, Uzunaslan D, Onal B, Durak H and Aki H (2013) Expression of MMP-1, MMP-9 and TIMP-2 in prostate carcinoma and their influence on prognosis and survival. J. Cancer Res. Clin. Oncol 139, 1373–1382.
  • 42. Têtu B, Brisson J, Wang CS, Lapointe H, Beaudry G, Blanchette C and Trudel D (2006) The influence of MMP-14, TIMP-2 and MMP-2 expression on breast cancer prognosis. Breast Cancer Res. 8, 1–9.
  • 43. He L, Chu D, Li X, Zheng J, Liu S, Li J, Zhao Q and Ji G (2013) Matrix metalloproteinase-14 is a negative prognostic marker for patients with gastric cancer. Dig. Dis. Sci 58, 1264–1270.
  • 44. Wu K, Li Q, Lin F, Li J, Wu L, Li W and Yang Q (2014) MT1-MMP is not a good prognosticator of cancer survival: evidence from 11 studies. Tumor Biol 35, 12489–12495.
  • 45. Martins VL, Caley M and O’Toole EA (2013) Matrix metalloproteinases and epidermal wound repair. Cell Tissue Res. 351, 255–268.
  • 46. Bullard KM, Lund L, Mudgett JS, Mellin TN, Hunt TK, Murphy B, Ronan J, Werb Z and Banda MJ (1999) Impaired wound contraction in stromelysin-1-deficient mice. Ann. Surg 230, 260–5.
  • 47. Brinckerhoff CE and Matrisian LM (2002) Matrix metalloproteinases: a tail of a frog that became a prince. Nat. Rev. Mol. Cell Biol 3, 207–214.
  • 48. Bahudhanapati H, Zhang Y, Sidhu SS and Brew K (2011) Phage display of tissue inhibitor of metalloproteinases-2 (TIMP-2): Identification of selective inhibitors of collagenase-1 (metalloproteinase 1 (MMP-1)). J. Biol. Chem 286, 31761–31770.
  • 49. Batra J, Soares AS, Mehner C and Radisky ES (2013) Matrix Metalloproteinase-10/TIMP-2 Structure and Analyses Define Conserved Core Interactions and Diverse Exosite Interactions in MMP/TIMP Complexes. PLoS One 8.
  • 50. Suzuki K, Kan CC, Hung W, Gehring MR, Brew K and Nagase H (1998) Expression of human pro-matrix metalloproteinase 3 that lacks the N-terminal 34 residues in Escherichia coli autoactivation and interaction with tissue inhibitor of metalloproteinase 1 (TIMP-1). Biol. Chem 379, 185–191.
  • 51. Ogata H, Decaneto E, Grossman M, Havenith M, Sagi I, Lubitz W and Knipp M (2014) Crystallization and preliminary X-ray crystallographic analysis of the catalytic domain of membrane type 1 matrix metalloproteinase. Acta Crystallogr. Sect. F Structural Biol. Commun 70, 232–235.
  • 52. Grossman M, Tworowski D, Dym O, Lee MH, Levy Y, Murphy G and Sagi I (2010) The intrinsic protein flexibility of endogenous protease inhibitor TIMP-1 controls its binding interface and affects its function. Biochemistry 49, 6184–6192.
  • 53. Sharabi O, Shirian J, Grossman M, Lebendiker M, Sagi I and Shifman J (2014) Affinity- and specificity-enhancing mutations are frequent in multispecific interactions between TIMP2 and MMPs. PLoS One 9.
  • 54. Chao G, Lau WL, Hackel BJ, Sazinsky SL, Lippow SM and Wittrup KD (2006) Isolating and engineering human antibodies using yeast surface display. Nat. Protoc 1, 755–768.
  • 55. Mata-Fink J, Kriegsman B, Yu HX, Zhu H, Hanson MC, Irvine DJ and Wittrup KD (2013) Rapid conformational epitope mapping of anti-gp120 antibodies with a designed mutant panel displayed on yeast. J. Mol. Biol 425, 444–456.
  • 56. Angelini A, Chen TF, de Picciotto S, Yang NJ, Tzeng A, Santos MS, Van Deventer JA, Traxlmayr MW and Wittrup KD (2015) Protein Engineering and Selection Using Yeast Surface Display. Yeast Surf. Disp. Methods, Protoc. Appl
  • 57. Magoč T and Salzberg SL (2011) FLASH: Fast length adjustment of short reads to improve genome assemblies. Bioinformatics 27, 2957–2963.
  • 58. Mukaka MM (2012) Statistics Corner: A guide to appropriate use of Correlation coefficient in medical research 24, 69–71.
  • 59. Gallagher EJ (1999) p < 0.05: Threshold for Decerebrate Genuflection. Acad. Emerg. Med 6, 1084–1087.
  • 60. Fernandez-Catalan C, Bode W, Huber R, Turk D, Calvete JJ, Lichte A, Tschesche H and Maskos K (1998) Crystal structure of the complex formed by the membrane type 1-matrix metalloproteinase with the tissue inhibitor of metalloproteinases-2, the soluble progelatinase A receptor. EMBO J 17, 5238–48.
  • 61. Tinberg CE, Khare SD, Dou J, Doyle L, Nelson JW, Schena A, Jankowski W, Kalodimos CG, Johnsson K, Stoddard BL, et al. (2013) Computational design of ligand-binding proteins with high affinity and selectivity. Nature 501, 212–216.
  • 62. Vehtari A, Gelman A and Gabry J (2015) Efficient implementation of leave-one-out cross-validation and WAIC for evaluating fitted Bayesian models. Stat. Comput 27, 1413–1432.
  • 63. Dutta S, Ryan J, Chen TS, Kougentakis C, Letai A and Keating AE (2015) Potent and specific peptide inhibitors of human pro-survival protein Bcl-xL. J. Mol. Biol 427, 1241–1253.
  • 64. Adams RM, Mora T, Walczak AM and Kinney JB (2016) Measuring the sequence-affinity landscape of antibodies with massively parallel titration curves. Elife 5, 1–27.
  • 65. Cohen I, Naftaly S, Ben-Zeev E, Hockla A, Radisky ES and Papo N (2018) Pre-equilibrium competitive library screening for tuning inhibitor association rate and specificity toward serine proteases. Biochem. J 475, 1335–1352.
  • 66. Chevalier A, Silva DA, Rocklin GJ, Hicks DR, Vergara R, Murapa P, Bernard SM, Zhang L, Lam KH, Yao G, et al. (2017) Massively parallel de novo protein design for targeted therapeutics. Nature 550, 74–79.
  • 67. Canale AS, Cote-Hammarlof PA, Flynn JM and Bolon DN (2018) Evolutionary mechanisms studied through protein fitness landscapes. Curr. Opin. Struct. Biol 48, 141–148.
  • 68. Wrenbeck EE, Faber MS and Whitehead TA (2017) Deep sequencing methods for protein engineering and design. Curr. Opin. Struct. Biol 45, 36–44.
