Abstract
Binding ligands empower molecular therapeutics and diagnostics. Despite an array of protein scaffolds engineered for binding, the biophysical elements that drive developability and evolvability are not fully understood. In particular, engineering novel function while maintaining biophysical integrity within the context of small, single-domain proteins is challenged by integration of the structural framework and the evolved binding site. Miniproteins present a challenge to our limits of protein engineering capability and provide advantages in physiological targeting, modularity for multi-functional constructs, and unique binding modes. Herein, we evaluate the ability of hyperstable synthetic miniproteins, originally designed for foldedness, to function as binding scaffolds. We synthesized 45 combinatorial libraries, with 10⁹ variants, systematically varied across two topologies, each with five starting frameworks and four or five diverse, structurally distinct paratopes, to elucidate their impact on evolvability and developability. We evaluated evolvability with yeast display binding selections against four targets. High-throughput assays – stability via yeast display and soluble expression via split-GFP in E. coli – measured developability. The comprehensive, robust dataset demonstrates how protein topology, parental framework, and paratope structure and location all impact scaffold performance. A hyperstable framework and localized diversity are not sufficient for an effective scaffold, but several designs of these elements within synthetic miniproteins designed solely for stability result in scaffold libraries with effective evolvability and developability. Engineered variants were well-folded, thermally stable, and bound target with single-digit nanomolar affinity. Thus, hyperstable synthetic miniproteins can serve as precursors to developable, evolvable mini-scaffolds with unique potential for physiological transport, modularity, and binding modes.
Keywords: protein engineering, molecular targeting, topology, innovability, biophysical robustness
Graphical Abstract

Introduction
Engineered proteins empower biotechnologies in industrial, medical, and agricultural settings. Specifically, molecular recognition binding ligands are integral for targeted therapy and diagnostic applications.1 A common approach to engineer high-affinity, specific binding entails scaffold proteins2,3, which consist of a conserved framework to provide structural integrity (for stability and reduced entropic cost of binding4) while accommodating a variable active site which can be engineered to provide new binding functions (a property called evolvability or innovability5). To be effective solutions within these applications, proteins must also exhibit “developability”6–8, i.e., remain stable and soluble in complex environments as well as be efficiently produced, often in multifunctional conjugates. Yet, mutations required to change protein function are generally destabilizing resulting in a trade-off between developability and evolvability.9–13 To complicate matters further, proteins do not come with ideal properties because they evolved to fit environmental pressures rather than for our specific needs.10 Despite extensive engineering of binding ligands across various topologies2,14,15, the factors that dictate performance remain incompletely understood5,16, thereby motivating fundamental elucidation of the impact of protein topology, parental framework sequence, paratope structure, and paratope sequence on scaffold developability and evolvability. Moreover, current scaffolds all present various liabilities, inspiring identification of ligand scaffolds with an improved balance of evolvability and developability.
Nature’s primary solution for evolvable binding proteins, the antibody, is the dominant scaffold for clinical and biotechnological applications.17,18 Yet large size inhibits effective tissue penetration which can decrease therapeutic efficacy19,20, multi-domain structures hinder modularity for multifunctional applications21, and variable developability hinders utility2,7,22. To overcome these limitations, small, single domain scaffolds have been developed23 – including affibodies24,25, fibronectin domains26, knottins27,28, and DARPins29 – in a wide variety of topologies with varied binding paratope structures22, including loops (both flexible21,26 and constrained30–32), α helices24, and β strands33. Yet, the evolvability and developability of these scaffolds vary significantly, which renders the ideal scaffold uncertain. Maintaining developability is non-trivial because introducing new functions requires varying an appreciable fraction of these small domains34. Yet, decoupling the engineered paratope from the framework – a beneficial approach to evolvability5,11 – is challenging in small domains. Stability of the parental molecule promotes evolvability thereby motivating selection of a highly stable framework sequence.10,12,13 Rocklin and colleagues computationally designed and experimentally validated a collection of proteins that fold into an array of small topologies with very high stability35. Yet, the parental sequence must also tolerate a chemically diverse set of paratope sequences to enable introduction of new function. Though these sequences were designed for stability and topology without evolvability as a design criterion, these proteins are compelling ligand scaffolds because of their small size, hyperstability, and variety of potential paratope structures. 
We seek to answer two core questions: Can synthetic proteins, designed for wild-type protease stability35, behave as ligand scaffolds (i.e., be robust enough to tolerate mutations to exhibit evolvability while maintaining high developability)? Which combinations of topology, paratope, and framework are most effective for overall scaffold developability and evolvability?
To answer these questions, we developed a full-factorial experimental design by constructing 45 combinatorial libraries systematically varying in topology, framework sequence, and paratope locations. We measured evolvability via binder discovery to four diverse targets and developability via four high-throughput assays. Through deep sequencing analysis validated by statistical testing, we measured an array of evolvability and developability across design space to identify drivers of performance. Although a hyperstable framework and localized diversity are insufficient criteria for an effective scaffold, select synthetic miniproteins, originally designed solely for foldedness, indeed serve as developable, evolvable scaffolds. Furthermore, a single round of affinity maturation of VEGF binders yielded well-folded, highly stable clones with single-digit nanomolar affinity.
Results and Discussion
Systematic library design and construction
To evaluate the impact of topology, framework sequence, paratope structure, and paratope sequence on ligand scaffold evolvability and developability, we systematically varied each of these elements in a set of small (43 amino acids) synthetic scaffolds and measured protein performance. We chose two topologies – a triple-helix bundle (α3) and another with two pairs of anti-parallel beta-strands surrounding a central alpha helix (β2αβ2) – to provide a range of structural paratope options (Figure 1). Five framework sequences (Table 1) with high stability were chosen for each topology from a set of synthetic designs35. Five distinct paratope locations were chosen for β2αβ2: the α-helical surface, exterior sites on β-strands 1 and 2, or 3 and 4, or 1–4, or two loops; four distinct paratopes were chosen for α3: the surface of either helix 1, 2, or 2 and 3, or a single loop. Thus, we created forty-five combinatorial libraries in total (Figure 1). For each paratope, sites were diversified to all 20 natural amino acids with the exception that areas connecting secondary structures were moderately constrained (Table 1).
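The library count follows directly from the factorial layout described above. As a minimal sketch, the design can be enumerated as follows; the framework letters (A–E and F–J) and paratope labels are inferred from names used elsewhere in the text and should be read as illustrative placeholders, not as the authors' exact identifiers.

```python
from itertools import product

# Framework letters and paratope labels are inferred from names used in the text;
# they are illustrative placeholders, not confirmed identifiers.
DESIGN = {
    "β2αβ2": {"frameworks": "ABCDE", "paratopes": ["α", "β1,2", "β3,4", "β1-4", "L2"]},
    "α3": {"frameworks": "FGHIJ", "paratopes": ["α1", "α2", "α2,3", "L1"]},
}


def enumerate_libraries(design):
    """Return one 'topology+framework:paratope' name per combinatorial library."""
    names = []
    for topology, spec in design.items():
        for framework, paratope in product(spec["frameworks"], spec["paratopes"]):
            names.append(f"{topology}{framework}:{paratope}")
    return names


libraries = enumerate_libraries(DESIGN)
print(len(libraries))  # 5 x 5 + 5 x 4 combinations
```

The two topologies contribute 25 and 20 framework/paratope combinations, respectively, which accounts for the 45 libraries.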
Figure 1. Overall synthetic ligand scaffold design.

Design can be viewed as a three-dimensional factorial design with two topologies (β2αβ2 and α3), each with five frameworks (different colors), and four or five paratope structures. Sites of paratope diversification are displayed as gray spheres. Sequences are detailed in Table 1.
Table 1. Synthetic scaffold sequence designs.
The following letters represent the degenerate codons used in each framework/paratope combination. X: NNK (all 20 amino acids); X’: RGY (G, S); X”: KYR (A, L, S, V); X: DVY (A, C, D, G, N, S, T, Y); X: VRN (D, E, G, H, K, N, Q, R, S). Consensus indicates the most common amino acid across parents at each site; in cases of equal frequencies, homology to the fifth parent is considered. Parentheticals indicate parent origin from Rocklin et al.35
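The residue sets listed in the Table 1 legend can be checked by expanding each degenerate codon against the standard genetic code. The sketch below is a generic expansion using IUPAC ambiguity codes and is not taken from the authors' pipeline; note that NNK also encodes the TAG stop codon in addition to all 20 amino acids.

```python
from itertools import product

# IUPAC nucleotide ambiguity codes needed for the codons in the legend.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "K": "GT", "D": "AGT", "V": "ACG", "N": "ACGT",
}

# Standard genetic code with bases ordered T, C, A, G at each codon position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def encoded_residues(degenerate_codon):
    """Return the sorted residues ('*' = stop) encoded by a degenerate codon."""
    options = [IUPAC[base] for base in degenerate_codon]
    residues = {CODON_TABLE["".join(codon)] for codon in product(*options)}
    return "".join(sorted(residues))


for codon in ("NNK", "RGY", "KYR", "DVY", "VRN"):
    print(codon, encoded_residues(codon))
```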
We synthesized 45 combinatorial libraries across these topologies, frameworks, and paratope sites. Genetic libraries were transformed into a yeast surface display system36 to achieve a genotype-phenotype linkage for high-throughput selection for binding function (i.e., evolvability) and evaluation of developability37. Transformation into yeast yielded 23 ± 5 million variants per sublibrary for a total merged library size of 1.0 billion variants. Deep sequencing revealed that the constructed libraries matched the designs, with a 0.8% median absolute deviation of amino acid frequency at each site (Fig. S1).
Designed synthetic topologies – with the proper combination of topology, framework, and paratope – are evolvable as ligand scaffolds
To identify functional scaffolds and evaluate how the choice of topology, framework, and paratope location impacts scaffold evolvability, we pooled all 45 libraries and sorted for binders to four distinct targets: programmed cell death protein 1 (PD-1), cytotoxic T-lymphocyte-associated protein 4 (CTLA-4), vascular endothelial growth factor (VEGF), and insulin-like growth factor receptor 1 (IGF1R). These biologically and clinically relevant targets were chosen to provide differences in structure and size to aid in assessing the generality of the overall library evolvability. We performed three rounds of selections to identify binders to each target immobilized on magnetic beads while depleting non-specific binders to beads coated in streptavidin or immunoglobulin G. Enrichment of target-specific binding was observed in all four target campaigns (Fig. S2). The resulting populations were deep sequenced to assess the ability of each topology, framework, and paratope to yield specific binders within the yeast display context. Evolvability was scored based on the quantity and quality of functional variants, i.e., number of unique binders and binding strength of the variants, quantified as the logarithmic enrichment of binder frequency compared to the unsorted population (detailed in Materials and Methods). Several scaffold library designs exhibited high evolvability with numerous binders to multiple targets (Figure 2). The most evolvable library design comprised diversification of 8 sites in the loop connecting helices 2 and 3 (Figure 1, paratope L1) in framework J of the α3 bundle, which enabled discovery of 122 – 1,204 binders to each target. Other paratope locations were also evolvable in this framework, as evolution of helix 1, helix 2, or a combination of helices 2 and 3 all yielded specific binders to all four targets (Figure 2A).
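The enrichment component of the evolvability score can be illustrated with a short calculation. This is a minimal sketch under stated assumptions: the base-2 logarithm, the pseudocount, and the per-design read counts are all hypothetical choices for illustration, and the authors' exact scoring is described in their Materials and Methods.

```python
import math


def log_enrichment(sorted_counts, unsorted_counts, pseudocount=1.0):
    """log2 of each design's read frequency after sorting relative to before sorting.

    The pseudocount (an assumption) avoids division by zero for unobserved designs.
    """
    designs = sorted(set(sorted_counts) | set(unsorted_counts))
    total_sorted = sum(sorted_counts.get(d, 0) + pseudocount for d in designs)
    total_unsorted = sum(unsorted_counts.get(d, 0) + pseudocount for d in designs)
    scores = {}
    for d in designs:
        f_sorted = (sorted_counts.get(d, 0) + pseudocount) / total_sorted
        f_unsorted = (unsorted_counts.get(d, 0) + pseudocount) / total_unsorted
        scores[d] = math.log2(f_sorted / f_unsorted)
    return scores


# Made-up read counts for three hypothetical designs.
unsorted = {"design_1": 1000, "design_2": 1000, "design_3": 1000}
binders = {"design_1": 2400, "design_2": 500, "design_3": 100}
scores = log_enrichment(binders, unsorted)
```

Positive scores indicate designs that are over-represented among binders relative to the unsorted library; negative scores indicate depletion.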
The β2αβ2 topology was also capable of high evolvability, as diversification of 12 sites on the surface of the four β-strands (termed β1–4) of the β2αβ2A framework yielded binders to all four targets. This full β-sheet paratope was the most broadly effective across frameworks, as all five β2αβ2 frameworks exhibited high evolvability to all four targets with this paratope location (Figure 2B and Figure 3C).
Figure 2. Select synthetic scaffold designs exhibit high evolvability.

The merged library of all 45 designs was sorted for specific binders to each indicated target. Deep sequencing of specific binding populations quantified enrichment of each design relative to the unsorted library. The resultant evolvability scores across all four binding targets are presented in descending order by median (shown as a black horizontal bar) for the (A) α3 topology and (B) β2αβ2 topology.
Figure 3. Evolvability varies by topology, framework, and paratope.

The median evolvability for binding to four distinct targets (Figure 2) is shown for (A) all combinations of framework and paratope, (B) grouped by framework, or (C) grouped by paratope location.
Yet, hyperstable frameworks in these two topologies were not sufficient criteria for evolvability, as many combinations of frameworks and paratopes did not enable effective evolution of specific binders (Figure 2). Across the frameworks, paratopes, and targets tested, the β2αβ2 topology was more evolvable than α3 (Figure 3A). A driver of this outcome was differential performance of framework sequences. Four of five β2αβ2 frameworks were enriched, in contrast to only one of five α3 frameworks; two α3 frameworks (F and G) exhibited minimal evolvability (Figure 3B). Paratope structures exhibited a range of evolvability (Figure 3C). Diversification of surface sites on all four β strands was highly effective within the β2αβ2 topology, while diversifying only pairs of β strands was also effective. The loop paratope was the least evolvable location in the β2αβ2 topology and was also depleted in the α3 topology. The first helix was the only enriched paratope in the α3 topology. The variable evolvability of the two-helix paratope (α2,3) is notable in comparison to the efficacy of the affibody24,25, which is an effective ligand scaffold diversifying two adjacent helices in a three-helix bundle. Yet, the precise bundle topology, including helical length and orientation, is distinct between the scaffolds, as are the framework sequence (notably, α3J:α2,3 is highly evolvable) and diversity design (which has been optimized for affibody24). Strikingly, diversification of the second helix (α2 paratope) of the α3 bundles yielded minimal binders in the F and G frameworks, as well as for three of four targets in the I framework, but was strongly enriched in the J framework with 83 – 552 binders to each target.
This contextual dependence of evolutionary performance extended to molecular targets. The impact of topology was target-dependent: CTLA-4 and PD-1 were 2.1- and 1.7-fold preferentially targeted by libraries within the β2αβ2 topology, whereas α3 libraries were 2.0- and 1.5-fold more enriched for binding IGF1R and VEGF (Fig. S3). Future experiments will be required to fully explain the underlying factors driving this phenomenon.
Scaffold topology, framework, and paratope location impact developability
Along with primary function (i.e., binding), scaffolds must also exhibit biophysical robustness. The pooled libraries were evaluated for multiple metrics of developability using a set of library-scale assays. These flow cytometric assays measure protease resistance and thermal stability in a yeast display format and soluble expression in a split-GFP (green fluorescent protein) format, which have demonstrated utility to proxy for protein developability38. Protease resistance is a valuable property for therapeutics and in vivo diagnostics and also proxies for other metrics of developability35,38. The yeast-displayed scaffold variants are exposed to the relatively non-specific proteases proteinase K or thermolysin at 37 °C or 55 °C. The presence of epitope tags at the N- and C-termini of the scaffolds is then quantified for each cell via flow cytometry with anti-epitope antibodies. Protease-resistant variants exhibit a high C-terminal:N-terminal signal ratio, whereas proteolysis of less stable variants reduces the presence of C-terminal epitopes, thereby decreasing the ratio. Library variants were sorted via FACS into eight tiered gates to stratify fitness (Fig. S4A–C). The split-GFP assay measures stable, soluble protein expression via genetic fusion of the scaffold variant to the 11th beta-strand of GFP. Co-expression of the remainder of GFP enables protein recombination and fluorescent signal proportional to scaffold-GFP11 expression. Library variants were sorted via FACS into six tiered gates to stratify fitness (Fig. S4D). Scores were calculated using a gate-weighted dot product, normalized to an unsorted population, using deep sequencing data from gated populations.
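One plausible reading of the gate-weighted dot product can be sketched as follows. The formulation is an assumption rather than the authors' stated method: the variant's frequency in each gate is converted, via the fraction of cells collected in that gate and the variant's unsorted frequency, into the probability that the variant lands in each gate, and that distribution is then weighted by a per-gate value. The evenly spaced weights from −2 to 2 are likewise hypothetical.

```python
def linear_weights(n_gates, low=-2.0, high=2.0):
    """Evenly spaced gate weights from least to most fit (range is an assumption)."""
    step = (high - low) / (n_gates - 1)
    return [low + i * step for i in range(n_gates)]


def gate_weighted_score(gate_freqs, gate_fractions, unsorted_freq, weights):
    """Dot product of gate weights with a variant's estimated distribution across gates.

    gate_freqs: variant read frequency within each gate's sequenced population.
    gate_fractions: fraction of all sorted cells collected in each gate.
    unsorted_freq: variant read frequency in the unsorted population.
    """
    if not (len(gate_freqs) == len(gate_fractions) == len(weights)):
        raise ValueError("need one frequency, fraction, and weight per gate")
    # P(gate | variant) = P(variant | gate) * P(gate) / P(variant)
    per_gate = [f * p / unsorted_freq for f, p in zip(gate_freqs, gate_fractions)]
    return sum(w * q for w, q in zip(weights, per_gate))


# Hypothetical example: eight gates holding equal fractions of cells.
weights = linear_weights(8)
equal_gates = [1 / 8] * 8
neutral = gate_weighted_score([1e-4] * 8, equal_gates, 1e-4, weights)
top_only = gate_weighted_score([0.0] * 7 + [8e-4], equal_gates, 1e-4, weights)
```

Under this formulation, a variant spread evenly across gates scores near zero, while a variant found only in the top gate scores at the top weight.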
We evaluated the impact of the framework sequence on protease stability across several paratope libraries with diverse sets of sequences and observed a range of performance. Two frameworks, one within each topology, were the most protease stable: α3H and β2αβ2B (Figure 4A). Aggregated across paratopes, α3H exhibited stability 1.6 standard deviations above the mean and was significantly more stable (p < 7 × 10⁻⁵) than 8 of 9 other frameworks at 37 °C and all 9 other frameworks (p < 3 × 10⁻⁴) at 55 °C (Fig. S5A). β2αβ2B exhibited stability 1.0 standard deviation above the mean and was significantly more stable to proteinase K than all 7 frameworks (aside from α3H) at 37 °C (p < 1 × 10⁻⁴) and 5 of 7 frameworks at 55 °C. The reduced relative stability of β2αβ2B at 55 °C suggests that this framework is more protease stable than thermally stable relative to other frameworks (Fig. S5A). α3G was the least proteinase K-resistant framework at 37 °C (p < 9 × 10⁻¹⁰). The other seven frameworks exhibited near-neutral scores.
Figure 4. Developability varies by topology, framework, and paratope.

(A, B) Yeast displaying ligand variants from each sublibrary were exposed to proteinase K at 37 °C and sorted into eight tiers based on maintenance of full-length protein. Protease stability of libraries is shown across (A) frameworks (grouping paratopes) and (B) paratopes (grouping frameworks). Each box represents the performance of an individual combinatorial library after experiments run in triplicate. (*: p < 0.05; #: p < 0.01) (C, D) The merged libraries were evaluated for GFP expression across (C) frameworks and (D) paratopes. (*: p < 0.05; #: p < 0.01) (E) The average developability scores for each library design are presented across all assays.
We evaluated the impact of paratope location on protease stability across multiple frameworks for each paratope type (Figure 4B). Diversification of the surface of the third and fourth strands of β2αβ2 (β2αβ2:β3,4) yields the highest stability; further diversification of the first and second strands nominally decreases stability, but the resultant variants remain enriched. The two paratopes that result in the least protease stability are the surface of the α helix (β2αβ2:α) and the loops (β2αβ2:L2) of β2αβ2. Notably, the highly protease stable outliers in both topologies come from the α3:H and β2αβ2:B frameworks. There is a high correlation between stability to proteinase K at 37 °C and 55 °C, which is consistent with minimal differential thermal stress in this temperature range (Fig. S6A). Notably, α3 scaffolds consistently exhibit higher thermal stability (ratio of stability at 55 °C vs. 37 °C) than β2αβ2 scaffolds.
We used thermolysin, another broad-spectrum protease, to investigate the generalizability of framework/paratope impact on protease stability. The most stable framework in the proteinase K assay, α3H, proved to be the most stable against thermolysin (Fig. S5C). The second-most stable framework in the proteinase K context, β2αβ2B, was moderately resistant against thermolysin. Other frameworks consistently exhibited moderate stability to both proteases. The paratope most resistant to proteinase K, β2αβ2:β3,4, was moderately stable to thermolysin. Of the two paratopes least resistant to proteinase K, both within the β2αβ2 topology, one (the α paratope) was also least stable to thermolysin whereas the other (L2 paratope) exhibited moderate thermolysin stability. Overall, proteinase K vs. thermolysin results exhibit moderate correlation (ρ = 0.37), indicative of a related ability to proxy for conformational stability while also providing independent utility (Fig. S6B).
To expand the developability assessment, we measured soluble bacterial expression of 10⁵ variants from each of the 45 libraries via a split-GFP assay39. Clones collected in the top gate (score of 2; Figure 4C–D) are 100-fold more expressible than clones collected in the bottom gate (score of −2). Sequence analysis indicated that framework sequences exhibited significantly different soluble expression (Figure 4C). α3F, the highest expresser when averaging across all paratope diversification libraries, exhibited 10- to 15-fold higher expression than α3J, the least expressed framework. Each framework exhibited a multimodal distribution of soluble expression across different paratope diversifications, which highlights the importance of paratope location in assessment of soluble expression. Indeed, when aggregating across framework sequences, libraries varying in their paratope location vary widely in their soluble expression (Figure 4D). Loop diversification is effectively tolerated in the β2αβ2 topology, especially in the C and D frameworks. Conversely, α-helix diversifications result in very poor soluble expression in the β2αβ2 topology. Combined with poor stability, the results indicate that the α-helix is critical to the overall developability of this topology.
Collectively, the protease stability assays, with the two proteases and thermal stress, and the split-GFP soluble expression assay provide multiple metrics on protein developability for the topologies, frameworks, and paratopes evaluated. As these metrics are distinct, differential performance is achieved in different assays for many combinations. Yet, the α3H framework is consistently strong with the highest protease stability across conditions and a moderately high soluble expression (Figure 4E). The α3F and β2αβ2D frameworks are relatively robust with moderate protease stability across conditions and the highest soluble expression. At the other end of the spectrum, diversification of the α-helix in the β2αβ2 topology consistently yields the poorest performance.
Select scaffold designs provide evolvable and developable library designs
As useful proteins must be developable and functional, we evaluated the ability to achieve both desirable outcomes within a single design. We combine all the developability and evolvability metrics into one score for each by taking the mean of the four protease, thermal, and expression assays and median of the four target evolvabilities. Previous studies have demonstrated a tradeoff between function and stability in single molecules9–13. The combined performance is similarly challenging within the context of diversified distributions as only 13% of sublibrary designs (6 of 45) exhibit mutual enrichment for evolvability and developability (Figure 5). Yet, several combinations of topology, framework, and paratope exhibit a balance of good developability and evolvability and thereby serve as promising scaffold libraries. Scaffold α3J:α2 exhibits the second highest evolvability of all 45 designs along with a developability score in the top 20%. β2αβ2B:β1,2 is another promising scaffold with balanced developability and evolvability in the top 16% in both categories. Multiple other combinations of topology, framework, and paratope provide compelling opportunities including two additional paratope diversities within the α3J framework, multiple additional β2αβ2 designs with balanced developability and evolvability, and four highly developable α3H designs, including one with elevated evolvability. Interestingly, while stability has been shown to aid clonal evolution10,12,13, relying solely on the hyperstability of these four parental frameworks does not ensure the designed library will provide a substantial number of high-performing variants, further substantiating the difficulty of evolving novel function while maintaining robustness within small domains. Within this challenging context, the discovery of performant scaffolds was achieved by the systematic evaluation across topologies, frameworks, and paratopes.
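The aggregation described above can be sketched in a few lines. The per-design values below are made up for illustration, and treating a combined score above zero as "enriched" is an assumption about the threshold rather than the authors' stated criterion.

```python
from statistics import mean, median


def combined_scores(developability, evolvability):
    """Mean of the developability assay scores and median of the per-target evolvabilities."""
    return mean(developability), median(evolvability)


def mutually_enriched(designs, threshold=0.0):
    """Names of designs whose combined scores both exceed the threshold (assumed to be 0)."""
    hits = []
    for name, (developability, evolvability) in designs.items():
        dev, evo = combined_scores(developability, evolvability)
        if dev > threshold and evo > threshold:
            hits.append(name)
    return hits


# Hypothetical designs: (four developability scores, four evolvability scores).
designs = {
    "design_1": ([0.8, 0.5, 0.2, 0.3], [1.2, 0.4, 0.9, 2.0]),
    "design_2": ([1.1, 0.9, 0.7, -0.1], [-0.5, -1.0, 0.2, -0.3]),
    "design_3": ([-0.6, -0.2, -0.9, 0.1], [0.3, 0.8, -0.1, 1.5]),
}
hits = mutually_enriched(designs)
fraction_reported = 6 / 45  # the 6 of 45 designs reported as mutually enriched
```

Using the median for evolvability limits the influence of a single strongly enriched target, whereas the mean for developability lets each assay contribute equally.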
Figure 5. Select scaffold designs exhibit high developability and evolvability while many others exhibit reduced performance.

The median evolvability across four targets (Figure 2) is plotted versus the mean developability across four assays (Figure 4) for all 45 library designs.
We evaluated whether any site-specific amino acid preferences dictated developability and evolvability in these scaffolds. Such preferences would elucidate performance and empower constrained library design24,40. We computed sitewise amino acid enrichment across the four evolutionary campaigns and four developability assays for the two most promising scaffold libraries, α3J:α2 and β2αβ2B:β1,2 (Figure 6, Fig. S7). While many site/amino acid combinations had conflicting impacts on evolvability and developability (Figure 6A), there were also multiple examples that consistently aid performance (N19, Y22, M27 in α3J:α2 and W2, V4, P7 in β2αβ2B:β1,2) or consistently hinder performance (K19, M22, T23 in α3J:α2 and G4 in β2αβ2B:β1,2).
Figure 6. Amino acid preferences within scaffolds.

(A) Residue frequency enrichment across each site within the α3J:α2 scaffold across all eight assays. Strong red indicates high enrichment, while strong blue indicates strong depletion. Black box indicates wildtype residue. (B) The slope of amino acid frequency versus developability across all libraries is presented as a heatmap. Strong red colors represent a high positive slope while strong blues represent high negative slopes.
We extended the analysis to evaluate amino acids’ impact on developability independent of site. Proline, glycine, alanine, asparagine, and aspartic acid exhibit a strong negative correlation with protease stability (Figure 6B). Proline provides a unique conformational constraint, which could enrich destabilized conformations. Conversely, glycine could provide excessive flexibility thereby enabling protease accessibility. Hydrophobic amino acids generally aid stability and soluble expression in these scaffolds and show a positive correlation.
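The site-independent analysis rests on a slope of amino acid frequency versus developability. A minimal sketch using ordinary least squares is shown below; the choice of developability as the independent variable, the use of an unweighted fit, and the example values are assumptions for illustration.

```python
def least_squares_slope(x, y):
    """Ordinary least-squares slope of y regressed on x."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have equal length of at least 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    if sxx == 0:
        raise ValueError("x values must not all be identical")
    return sxy / sxx


# Hypothetical data: developability scores of three variant pools and the
# frequency of one amino acid (e.g., glycine) within each pool.
developability = [-1.0, 0.0, 1.0]
glycine_frequency = [0.06, 0.05, 0.04]
slope = least_squares_slope(developability, glycine_frequency)
```

A negative slope, as in this made-up example, would correspond to an amino acid that becomes rarer as developability increases.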
Synthetic scaffolds yield strongly developable and evolvable binders to VEGF
VEGF is a crucial signalling protein for blood vessel development, contributing to cancer tumor growth and progression, while also being associated with coronary heart disease, solidifying VEGF as a prime target for protein therapeutics.41–43 We aimed to assess the developability and functionality – particularly recombinant yield, melting temperature, and binding affinity – readily achievable within the synthetic scaffolds. We performed random mutagenesis on the variants selected for VEGF binding and sorted the resultant mutants via magnetic bead and flow cytometric selections to identify the strongest binding clones. The binding population was then subjected to proteinase K at 55 °C and sorted via flow cytometry to collect the top 5% of cells to find the most developable clones. Deep sequencing revealed multiple highly enriched variants from three distinct libraries (Table S1).
Twenty-four clones were expressed in unoptimized shake flask E. coli culture and purified from the soluble lysate fraction. Recombinant yields were 2.6 ± 1.9 mg/L (range: 0 – 6 mg/L, Figure 7A). The 15 clones with highest yield were evaluated by circular dichroism and AlphaFold244 to assess structure. Circular dichroism revealed these clones to exhibit secondary structure consistent with their parental topology (Fig. S9). AlphaFold2 predicts tertiary structures consistent with the parental topology (Figure 8A, Figs. S11D, S12). Thermal treatment indicates midpoints of thermal denaturation of 69 °C to >95 °C (Figures 7B, S11) as well as a return to the original structure when cooled to 25 °C after 95 °C incubation, indicating reversible denaturation (Figure 8B, C). Lastly, we measured the binding affinities of four clones via VEGF titrations. In addition to being very stable, the molecules had strong binding affinities, with the strongest binder exhibiting a sub-nanomolar KD (Figure 8D, Fig. S11).
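For readers unfamiliar with how a KD is extracted from a titration, the sketch below fits a simple 1:1 equilibrium binding model to normalized signal by a least-squares search over a logarithmic grid. The binding model, the normalization, the grid search, and the example concentrations are all assumptions for illustration; the fitting procedure actually used for Figure 8D is not specified in this section.

```python
def fraction_bound(conc_nm, kd_nm):
    """Equilibrium fraction bound for a 1:1 interaction (ligand in excess)."""
    return conc_nm / (kd_nm + conc_nm)


def fit_kd(concs_nm, signals, log10_min=-2.0, log10_max=2.0, steps=401):
    """Return the KD (nM) on a log-spaced grid that minimizes squared error."""
    best_kd, best_sse = None, float("inf")
    for i in range(steps):
        kd = 10 ** (log10_min + i * (log10_max - log10_min) / (steps - 1))
        sse = sum((s - fraction_bound(c, kd)) ** 2 for c, s in zip(concs_nm, signals))
        if sse < best_sse:
            best_kd, best_sse = kd, sse
    return best_kd


# Hypothetical noise-free titration generated with a 2 nM KD.
concs = [0.1, 0.3, 1, 3, 10, 30, 100]
signals = [fraction_bound(c, 2.0) for c in concs]
kd_estimate = fit_kd(concs, signals)
```

With real data, the signal minimum and maximum would typically be fit alongside the KD, and a confidence interval would be estimated from replicate titrations.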
Figure 7. Characterizing the recombinant yield and melting temperature of the top VEGF clones.

(A) The 24 selected VEGF-binding clones exhibit a distribution of moderate recombinant yields. (B) Distribution of melting temperatures for 15 clones calculated from circular dichroism experiments.
Figure 8. Characterization of clone α3J:α2:6 reveals a well-folded, thermally stable, and strong VEGF binder.

(A) Structural alignment of the α3J framework (black) with the α3J:α2:6 clone (orange) predicted by AlphaFold2. Mutations away from the parental sequence are shown as spheres. (B) Circular dichroism spectra at 25 °C (red) prior to raising the temperature to 95 °C (black), then cooling back down to 25 °C (blue). MRE: mean residue ellipticity. (C) Melting curve of α3J:α2:6 displaying the change in fraction folded as a function of temperature. (D) Affinity titration data fit to a binding curve to calculate the KD. The 68% confidence interval is shown in shaded blue.
Conclusion
We systematically evaluated the impact of topology, framework, and paratope on developability and evolvability of synthetic miniproteins originally designed for stability. Starting parental framework stability and a localized diverse paratope are not sufficient criteria for high-performance scaffolds as most designs yielded limited developability and/or evolvability. Yet, we discovered multiple designs whose variant libraries exhibited high stability, moderate expression, and numerous selective binders across multiple targets. The observed performance is especially striking given the moderate library size for each sublibrary (23 million). Moreover, the lead molecules were performant without affinity maturation, and a single round of random mutagenesis yielded subnanomolar binding while maintaining thermal stabilities >95 °C. The solid performance will likely be further enhanced with second-generation libraries that [1] focus on the best topology/framework/paratope structure designs (Figure 5), [2] use sitewise amino acid constraint based on developable and evolvable variants (Figure 6), and [3] scale up library diversity ~100-fold to levels readily achievable in the implemented selections. At that stage, comparison with previous scaffolds will be valuable as well. The lead molecules and designed libraries demonstrate and empower discovery of molecular targeting agents with unique capabilities for physiological targeting, synthetic modularity, and structural binding modes. Fundamentally, the results advance understanding of the structural elements that dictate function and robustness within the integrated framework and paratope of miniproteins.
Materials and Methods
Library construction
All 45 libraries were independently synthesized via overlap extension polymerase chain reaction (PCR) of oligonucleotides (Integrated DNA Technologies). The 129 bp product was extracted from a 2% agarose gel and purified via silica spin column (Epoch Life Sciences) per manufacturer’s instructions. The constructed libraries were amplified via PCR, concentrated via ethanol precipitation, and resuspended in 30 μL of 0.5 M sorbitol and 0.5 mM calcium chloride in dH2O. A yeast display vector pCT with a V5 epitope tag was digested with BamHI and NheI and ethanol precipitated. Library DNAs that shared the same framework sequence were pooled in equal concentrations to perform a total of ten electroporation transformations. Scaffold gene variants and linearized vector (Aga2 – HA – GS linker – library – V5) were reassembled via homologous recombination upon electroporation45,46 into S. cerevisiae yeast (EBY10036). Dilutions of the transformed cells plated on agar were used to calculate the resulting library diversities. Yeast were propagated in SD-CAA medium (16.8 g sodium citrate dihydrate, 3.9 g citric acid, 20.0 g dextrose, 6.7 g yeast nitrogen base, 5.0 g casamino acids, filled to 1 L with dH2O) at 30 °C with agitation at 250 rpm. Proteins were displayed on the yeast surface by switching to galactose-containing SG-CAA medium (10.2 g sodium phosphate dibasic heptahydrate (Na2HPO4·7H2O), 8.6 g sodium phosphate monobasic monohydrate (NaH2PO4·H2O), 19.0 g galactose, 1.0 g dextrose, 6.7 g yeast nitrogen base, 5.0 g casamino acids, filled to 1 L with dH2O) and growing overnight at 30 °C, 250 rpm.
On-yeast protease assay
Proteinase K (P8107S, New England Biolabs) was diluted in phosphate-buffered saline with 0.1% albumin (PBSA). Thermolysin (V4001, Promega) was reconstituted to 1 mg/mL in 50 mM Tris at pH 8 with 0.5 mM calcium chloride and diluted with PBSA. Dilutions of 1:16000 for proteinase K and 1:200 for thermolysin were prepared on ice the day of the experiment. These dilutions were chosen from preliminary titrations to obtain a broad distribution of cells across all FACS gates.
A total of 10 million yeast expressing a collection of all libraries were centrifuged at 12,000 g for 1 minute, aspirated, resuspended in 1 mL cold PBSA, centrifuged, resuspended in 50 μL of PBSA, and placed on ice. A total of 50 μL diluted enzyme was added to the cells and mixed via pipetting. The 100 μL of cell/enzyme solution was placed in a PCR block pre-chilled to 4 °C, where a preset program heated the mixture to 37 °C or 55 °C (proteinase K) or 55 °C or 75 °C (thermolysin) for 10 minutes and then returned it to 4 °C. The cell/enzyme mixture was diluted with 1 mL of PBSA, centrifuged at 12,000 g for 1 minute, and aspirated. The cells were resuspended in PBSA containing chicken–anti–HA antibody (ab9111, Abcam) and mouse–anti–V5 (MCA1360, AbD Serotec), and rotated for 30 min at room temperature. Cells were pelleted, washed, and resuspended in PBSA containing goat–anti–chicken AlexaFluor488 antibody (A11039, Invitrogen) and goat–anti–mouse AlexaFluor647 antibody (A21235, Invitrogen) and incubated at 4 °C for 20 min protected from light. Finally, cells were pelleted, washed, and stored until sorting.
Cells were separated via FACS into eight populations based on the V5 to HA ratio. Gate boundaries were set by the location of the libraries in a no-enzyme control. The boundary for the least protease-resistant gate (lowest V5:HA ratio) was drawn to include only 1% of the control population in which the primary anti-V5 labeling antibody was omitted. All eight gates were drawn to contain equal percentages of the collected cells. Collected cells were centrifuged and immediately prepared for Illumina sequencing. FACS was performed at the University of Minnesota Flow Cytometry Resource facilities.
Split GFP assay
Scaffold variant gene libraries were digested with NheI-HF and BamHI-HF, ligated into pET-gfp11 plasmid [37], electroporated into NEB 10-beta electro-competent E. coli, and extracted via miniprep. E. coli cells pre-transformed with pBAD-gfp1–10 [37] were transformed, via electroporation, with scaffold-GFP11 plasmids, aliquoted, and stored at −80 °C to be used on the day of the experiment. Frozen aliquots of cells were thawed, grown in 5 mL LB + ampicillin + kanamycin overnight, diluted to an optical density at 600 nm (OD600) of 0.1, and grown for 90 minutes. Scaffold-GFP11 production was induced by adding 5 μL of 0.5 mM IPTG, and cells were grown for two hours at 37 °C. GFP1–10 production was then induced by adding 500 μL of 2 mg/mL arabinose, and cells were grown for a further two hours at 37 °C. Afterwards, cells were centrifuged at 3200 g for 3 minutes at 4 °C, resuspended in 1 mL of PBSA, and stored on wet ice. FACS was used to separate cells based upon the GFP signal into six gates equally spaced on a log scale. An average of 2.4 million cells were collected, centrifuged at 3200 g for 10 minutes, and prepared for Illumina deep sequencing without further growth.
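Gates that are equally spaced on a log scale can be illustrated with a short sketch. The signal range used here is hypothetical; the actual gate boundaries were drawn on the cytometer and are not reported in this section.

```python
import numpy as np


def log_spaced_gate_edges(signal_min, signal_max, n_gates=6):
    """Return n_gates + 1 boundaries equally spaced in log10(signal)."""
    return np.logspace(np.log10(signal_min), np.log10(signal_max), n_gates + 1)


def assign_gate(signal, edges):
    """Map each signal value to a gate number from 1 (dimmest) to n_gates.

    Only interior edges are passed to np.digitize, so values below the
    first edge fall in gate 1 and values above the last edge in the top gate.
    """
    return np.digitize(signal, edges[1:-1]) + 1


# Hypothetical GFP range spanning six decades, so each gate covers one decade.
edges = log_spaced_gate_edges(1e1, 1e7, n_gates=6)
gates = assign_gate(np.array([50.0, 5e3, 5e6]), edges)
```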
Magnetic-activated cell sorting
Yeast were grown at 30 °C and induced to display proteins on the cell surface following the yeast surface display protocol. For each campaign, 33 pmol of biotinylated target protein was added to 100 μL of PBSA + 10 μL of Biotin Binder Dynabeads (Invitrogen) and incubated at 4 °C for 2 hours. Beads were washed with 1 mL PBSA via magnetic capture. Yeast numbering 15-fold the library diversity were centrifuged, washed with PBSA, combined with 10 μL of unconjugated Dynabeads, and incubated at 4 °C on a rotator for 2 hours. The cell/bead mixture was placed on a magnet for 5 minutes before collecting unbound cells. Unbound cells were placed in a new 2 mL tube along with 10 μL of IgG-conjugated beads (as a negative control) and incubated at 4 °C on a rotator for 2 hours. The cell/bead mixture was placed on the magnet; unbound cells were collected and mixed with 10 μL of target-conjugated beads for another 2 hours. The cell/bead mixture was placed on a magnet for 5 minutes before collecting bound cells and discarding unbound cells. Beads from all steps were subsequently washed twice using cold PBSA and placed in 100 mL SD-CAA to grow overnight. Cells bound to bare beads and to the IgG negative control were also grown to determine population enrichment of binders. Dilutions of 100x and 2,000x were plated on YPD for diversity calculations. Collected target-bound yeast were grown and induced for additional sorting. Three sorts were performed before cells were prepared for Illumina deep sequencing.
Library preparation for Illumina deep sequencing
Cells were centrifuged at 3220 g for 3 minutes and resuspended in 200 μL Zymo solution 1 + 2 μL β-mercaptoethanol + 10 μL Zymolyase (5 U/μL; Zymo Research). The solution was incubated at 37 °C for 1 hour. Afterwards, 200 μL of MX2 and 300 μL of MX3 (Epoch) were added, and the solution was mixed and centrifuged at 12,000 g for 8 minutes. The supernatant was transferred to an Epoch DNA column and centrifuged at 8000 g for 1 minute, followed by two washes with WS and elution. A mixture of 15 μL of eluted DNA + 2 μL ExoI + 1 μL Lambda Exo + 2 μL Lambda Buffer was incubated on a PCR block at 30 °C for 90 minutes followed by 80 °C for 20 minutes to inactivate the enzymes. A standard PCR protocol was followed using Q5 High-fidelity polymerase (New England Biolabs), followed by an ExoI cleanup: 2 μL of a 1:10 dilution of ExoI was added directly to the PCR tubes and incubated at 37 °C for 30 minutes followed by 80 °C for 20 minutes to inactivate the enzyme. 1 μL from the PCR tube was used for a subsequent PCR with the Illumina-specific forward primer and reverse index primers. The resulting PCR products were purified by gel electrophoresis. Illumina iSeq sequencing was performed by the University of Minnesota Genomics Center.
Protein sequence and library developability score calculation
We used the Usearch [47] algorithms to merge, align, filter, and identify unique sequences from raw fasta files. We then wrote custom Python scripts for data preprocessing: removing sequences with stop codons, assigning each sequence to its respective library, and calculating library scores. Library scores are calculated as the dot product of the number of reads found in each gate and the respective gate weights (from top gate 8 to bottom gate 1). Each library score is then normalized by subtracting the overall library mean and dividing by the standard deviation (i.e., computing the z-score). Independent t-tests, computed with the open-source SciPy package, were used to determine statistical significance between library scores.
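The scoring steps above can be sketched as follows. This is an illustrative reading of the description, not the authors' script: the linear weights 1 through 8, the division by total reads (so that scores do not scale with sequencing depth), and all function names are assumptions.

```python
import numpy as np
from scipy import stats

# Assumed linear weights: gate 1 (bottom) through gate 8 (top).
GATE_WEIGHTS = np.arange(1, 9)


def gate_weighted_score(read_counts, weights=GATE_WEIGHTS):
    """Dot product of per-gate read counts and gate weights.

    Dividing by total reads is an assumption made here so that the score
    is a read-weighted mean gate rather than a depth-dependent sum.
    """
    read_counts = np.asarray(read_counts, dtype=float)
    total = read_counts.sum()
    if total == 0:
        return float("nan")
    return float(read_counts @ weights) / total


def z_normalize(scores):
    """Subtract the mean and divide by the standard deviation (z-score)."""
    scores = np.asarray(scores, dtype=float)
    return (scores - scores.mean()) / scores.std()


def compare_libraries(scores_a, scores_b):
    """Independent two-sample t-test p-value between two sets of scores."""
    return stats.ttest_ind(scores_a, scores_b).pvalue
```

For example, a variant found only in the top gate scores 8.0 under these assumptions, and one split evenly between the bottom and top gates scores 4.5.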
Protein sequence and library evolvability score calculation
Evolvability is assessed via two components: diversity (number of unique binders) and binding strength (relative abundance of each clone). The evolvability score of each variant is the fourth root of its number of sequencing reads, which dampens the influence of dominant clones. A library's share of a population is the sum of variant scores from that library divided by the sum of all variant scores in that population. Library evolvability scores are computed as the log2 of the library's share in the sorted population minus the log2 of its share in the initial unsorted population.
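The evolvability calculation can be sketched as below. The data structures (dictionaries keyed by sequence) and function names are hypothetical; only the fourth-root dampening and the difference of log2 ratios come from the description above.

```python
import math
from collections import defaultdict


def variant_score(reads):
    """Fourth root of the read count, which dampens dominant clones."""
    return reads ** 0.25


def library_shares(read_counts, library_of):
    """Fraction of the summed variant scores contributed by each library.

    read_counts -- dict mapping sequence -> number of reads
    library_of  -- dict mapping sequence -> library label
    """
    totals = defaultdict(float)
    for seq, reads in read_counts.items():
        totals[library_of[seq]] += variant_score(reads)
    grand_total = sum(totals.values())
    return {lib: total / grand_total for lib, total in totals.items()}


def evolvability_scores(sorted_reads, unsorted_reads, library_of):
    """log2(share after sorting) - log2(share in the unsorted population)."""
    sorted_share = library_shares(sorted_reads, library_of)
    naive_share = library_shares(unsorted_reads, library_of)
    return {
        lib: math.log2(sorted_share[lib]) - math.log2(naive_share[lib])
        for lib in sorted_share
        if lib in naive_share
    }
```

A positive score means the library makes up a larger share of the binding population than of the starting population; libraries absent from either population are omitted here rather than assigned a value.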
Calculation of residue enrichment scores
To investigate the impact of each residue’s presence within the paratope on overall library developability, we calculated the frequency of every residue within the given paratope for every sequence in the library. A line of best fit was calculated for stability as a function of frequency for each residue. The slope of the linear regression is used as a proxy for the relationship between a residue’s frequency and the resulting library’s stability. Sitewise residue enrichment scores were calculated as the log2 of the frequency at which a residue was observed at a site in the top gate, normalized to the sitewise frequency of that residue in the unsorted population.
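A minimal sketch of the two calculations follows, assuming equal-length aligned sequences and zero-based site indices. The function names are hypothetical, and residues absent from either population are skipped rather than given a pseudocount, which the source does not specify.

```python
import math
from collections import Counter

import numpy as np


def sitewise_frequencies(sequences, site):
    """Frequency of each residue at one position across the sequences."""
    counts = Counter(seq[site] for seq in sequences)
    total = sum(counts.values())
    return {aa: n / total for aa, n in counts.items()}


def sitewise_enrichment(top_gate_seqs, unsorted_seqs, site):
    """log2(frequency in top gate / frequency in unsorted) per residue."""
    top = sitewise_frequencies(top_gate_seqs, site)
    base = sitewise_frequencies(unsorted_seqs, site)
    return {aa: math.log2(top[aa] / base[aa]) for aa in top if aa in base}


def stability_slope(residue_frequencies, stability_scores):
    """Slope of the least-squares line of stability versus residue frequency."""
    slope, _intercept = np.polyfit(residue_frequencies, stability_scores, 1)
    return float(slope)
```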
Affinity maturation
DNA from the naïve VEGF-binding population was extracted via Zymoprep, as detailed above. Genes were amplified using error-prone PCR in 50 μL reactions (5 μL NEB ThermoPol reaction buffer, 2.5 μL 10 μM forward primer, 2.5 μL 10 μM reverse primer, 1 μL 10 mM dNTPs, 10 μL 8-oxo-dGTP + dPTP at 10 μM each, 0.25 μL Taq polymerase, filled to 50 μL with ddH2O), concentrated via ethanol precipitation, and resuspended in 30 μL of 0.5 M sorbitol and 0.5 mM calcium chloride in dH2O. Diversified genes were shuttled into the yeast display system as detailed above for original library creation. Variants were iteratively sorted thrice via MACS against biotinylated human VEGF121 protein (Acro Biosystems; VE1-H82E7). A more stringent monovalent MACS sort was then conducted at 100 nM VEGF, followed by selection of the top 1% of binders via FACS at 10 nM VEGF. The resulting binding population was subjected to proteinase K at 55 °C, and the top 5% of stable binders were sorted via FACS. The final binding population was then deep sequenced.
Protein production and purification protocol
The genes for 24 clones affinity matured for binding to VEGF were synthesized by IDT, ligated into a pET vector with a C-terminal His6 tag, and transformed into NEB T7 Express competent E. coli. Single colonies were grown in 5 mL of LB + kanamycin at 37 °C, 250 rpm overnight, diluted in 100 mL of LB + kanamycin to yield an OD600 of 0.1, and incubated at 37 °C, 250 rpm until OD600 ~ 1.0. Protein production was induced by adding 0.5 mM IPTG, and cultures were incubated for 2 hours at 37 °C, 250 rpm. Cells were then pelleted at 3200 g for 15 minutes, aspirated, resuspended in 1 mL of lysis buffer (2.07 g sodium phosphate monobasic monohydrate (NaH2PO4·H2O), 9.38 g sodium phosphate dibasic heptahydrate (Na2HPO4·7H2O), 29.2 g sodium chloride, 50 mL glycerol, 3.1 g CHAPS, 1.7 g imidazole, filled to 1 L with filtered ddH2O) with protease inhibitors (1 tablet per 10 mL lysis buffer), and freeze-thawed thrice at −80 °C for mechanical disruption. Lysate was centrifuged at 12,000 g for 10 minutes at 4 °C, filtered with a 0.22 μm filter, and rotated with 100 μL of Cobalt HisPur resin at 4 °C for 5 minutes. The lysate and cobalt resin were transferred to a column and washed with 1 mL of wash buffer (PBS with 30 mM imidazole). Ligand was eluted with 100 μL of elution buffer (PBS with 300 mM imidazole) via centrifugation at 700 g for 1 minute and flash frozen for long-term storage.
Quantifying recombinant protein yield via SDS-PAGE
12 μL of the protein sample (or lysozyme calibrant), 4 μL of NuPAGE LDS buffer (4x), and 1 μL of β-mercaptoethanol were combined, heated at 99 °C for 10 minutes, loaded into a 1.5 mm NuPAGE 4–12% Bis-Tris gel (Invitrogen), and run at 180 V for 30 minutes. The gel was stained with SimplyBlue SafeStain, imaged, and analyzed using ImageJ to quantify the protein concentration relative to the lysozyme concentration controls.
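Quantification against lysozyme calibrants amounts to a standard curve. The sketch below assumes that band intensity responds linearly to the amount of protein loaded, which the source does not state, and uses hypothetical function names and values.

```python
import numpy as np


def fit_standard_curve(standard_amounts_ug, standard_intensities):
    """Least-squares line through calibrant band intensity versus amount."""
    slope, intercept = np.polyfit(standard_amounts_ug, standard_intensities, 1)
    return float(slope), float(intercept)


def estimate_amount(sample_intensity, slope, intercept):
    """Invert the standard curve to estimate the amount in a sample band."""
    return (sample_intensity - intercept) / slope


# Hypothetical lysozyme calibrants (micrograms loaded) and band intensities.
slope, intercept = fit_standard_curve([0.5, 1.0, 2.0], [100.0, 200.0, 400.0])
sample_ug = estimate_amount(300.0, slope, intercept)
```

Estimates are only meaningful for sample bands whose intensity falls within the range spanned by the calibrants.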
Measuring secondary structure and melting temperature using circular dichroism
Protein samples were dialyzed in Slide-A-Lyzer MINI Dialysis 10K MWCO cups against 10 mM sodium phosphate buffer (1.01 g sodium phosphate dibasic heptahydrate (Na2HPO4·7H2O), 169.7 mg sodium phosphate monobasic monohydrate (NaH2PO4·H2O), up to 500 mL ddH2O, pH 7.4). A minimum of four buffer exchanges were performed to ensure maximum removal of imidazole. Protein samples were diluted to 0.1 – 0.2 mg/mL in 200 μL of 10 mM sodium phosphate buffer and loaded into the Picoland Jasco J815 Circular Dichroism Spectropolarimeter. Default system parameters were used. The temperature sweep sampled every 1 °C between 15 and 95 °C at a ramp rate of 3 °C/min with a wait time of 6 seconds, monitoring at 218 nm.
Affinity titration experiments via yeast surface display
Genes were shuttled to the yeast display plasmid vector pCT via digestion/ligation with NheI and BamHI, and transformed into EBY100 yeast using the Frozen-EZ yeast transformation kit (Zymo Research) following manufacturer’s protocols. Yeast were propagated in SD-CAA medium at 30 °C, 250 rpm, and induced to display protein by switching to galactose-containing SG-CAA medium and growing overnight at 30 °C, 250 rpm. For each affinity titration sample, 1–2 million cells were washed with PBSA and incubated with biotinylated human VEGF121 (0.1 – 250 nM, Acro Biosystems, Cat. No. VE1-H82E7) until equilibrium was approached. Cells were washed with PBSA, resuspended in PBSA containing mouse–anti–V5 (MCA1360, AbD Serotec), and rotated for 30 min at room temperature. Cells were pelleted, washed, and resuspended in PBSA containing AlexaFluor488 streptavidin conjugate (S32354, Invitrogen) and goat–anti–mouse AlexaFluor647 antibody (A21235, Invitrogen) and incubated at 4 °C for 20 min shielded from light. Cells were pelleted, washed, and analyzed on a BD Accuri C6 Plus flow cytometer. The median AlexaFluor 488 fluorescence of the cells displaying AlexaFluor 647 signal was recorded, corrected against a no-target control, and normalized relative to the maximum within the replicate. Affinity was computed via a global fit of a 1:1 binding model using Excel’s GRG nonlinear optimization solver.
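A 1:1 binding model predicts a signal proportional to [L] / ([L] + Kd), where [L] is the target concentration. The sketch below fits one shared Kd across replicates with a separate amplitude for each replicate. It uses SciPy's `least_squares` in place of the Excel GRG solver named above, and the per-replicate amplitude, the log-space parameterization of Kd, and all names are assumptions for illustration.

```python
import numpy as np
from scipy.optimize import least_squares


def one_to_one(conc_nM, kd_nM, fmax):
    """Equilibrium signal for 1:1 binding: fmax * [L] / ([L] + Kd)."""
    conc_nM = np.asarray(conc_nM, dtype=float)
    return fmax * conc_nM / (conc_nM + kd_nM)


def global_fit_kd(conc_nM, replicates):
    """Fit one shared Kd and one amplitude per replicate.

    conc_nM    -- target concentrations shared by all replicates
    replicates -- list of signal arrays, one per replicate
    Kd is fitted as log10(Kd) so that it stays positive.
    """
    conc = np.asarray(conc_nM, dtype=float)
    data = [np.asarray(r, dtype=float) for r in replicates]

    def residuals(params):
        kd = 10.0 ** params[0]
        return np.concatenate(
            [one_to_one(conc, kd, fmax) - d for fmax, d in zip(params[1:], data)]
        )

    # Start at the median concentration and each replicate's maximum signal.
    x0 = np.concatenate([[np.log10(np.median(conc))], [d.max() for d in data]])
    fit = least_squares(residuals, x0)
    return 10.0 ** fit.x[0], fit.x[1:]
```

Fitting all replicates together constrains Kd with every data point while letting each replicate carry its own signal scale.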
Supplementary Material
Highlights
- Engineering function and developability is difficult in small, single-domain proteins
- Systematically varied scaffold topology, framework, and paratope in miniproteins
- Evaluated evolvability and developability of 1 billion variants in 45 library designs
- Advanced understanding of molecular features that impact performance
- Discovered developable, evolvable miniprotein scaffolds for targeting ligands
Acknowledgements
This research was supported by a grant from the National Institutes of Health (R01 GM146372). We appreciate assistance from Alex Golinski, Adam Boeckermann, and Mark Sipahimalani, and from the University of Minnesota Genomics Center.
Abbreviations
- CTLA-4: cytotoxic T-lymphocyte-associated protein 4
- PBSA: phosphate-buffered saline with 0.1% bovine serum albumin
- PCR: polymerase chain reaction
- PD-1: programmed cell death protein 1
- VEGF: vascular endothelial growth factor
- IGF1R: insulin-like growth factor receptor 1
Footnotes
Declaration of Interests
A.M. and B.J.H. have patent applications related to the proteins described in the study.
Credit Statement
Adam McConnell: Conceptualization, Methodology, Software, Formal Analysis, Investigation, Writing: Original Draft, Writing: Review and Editing; Sun Li Batten: Investigation, Writing: Review and Editing; Benjamin Hackel: Conceptualization, Methodology, Formal Analysis, Writing: Original Draft, Writing: Review and Editing, Supervision, Funding Acquisition
References
- 1. Kintzing JR, Filsinger Interrante MV, Cochran JR. Emerging Strategies for Developing Next-Generation Protein Therapeutics for Cancer Treatment. Trends Pharmacol Sci. 2016;37(12):993–1008. doi: 10.1016/j.tips.2016.10.005
- 2. Škrlec K, Štrukelj B, Berlec A. Non-immunoglobulin scaffolds: A focus on their targets. Trends Biotechnol. 2015;33(7):408–418. doi: 10.1016/j.tibtech.2015.03.012
- 3. Stern LA, Case BA, Hackel BJ. Alternative non-antibody protein scaffolds for molecular imaging of cancer. Curr Opin Chem Eng. 2013;2(4):425–432.
- 4. Searle MS, Williams DH. The cost of conformational order: entropy changes in molecular associations. J Am Chem Soc. 1992;114(27):10690–10697. doi: 10.1021/ja00053a002
- 5. Dellus-Gur E, Toth-Petroczy A, Elias M, Tawfik DS. What Makes a Protein Fold Amenable to Functional Innovation? Fold Polarity and Stability Trade-offs. J Mol Biol. 2013;425(14):2609–2621. doi: 10.1016/j.jmb.2013.03.033
- 6. Xu Y, Wang D, Mason B, et al. Structure, heterogeneity and developability assessment of therapeutic antibodies. mAbs. 2019;11(2):239–264. doi: 10.1080/19420862.2018.1553476
- 7. Jain T, Sun T, Durand S, et al. Biophysical properties of the clinical-stage antibody landscape. Proc Natl Acad Sci. 2017;114(5):201616408. doi: 10.1073/pnas.1616408114
- 8. Yang X, Xu W, Dukleska S, et al. Developability studies before initiation of process development: Improving manufacturability of monoclonal antibodies. mAbs. 2013;5(5):787–794. doi: 10.4161/mabs.25269
- 9. Tokuriki N, Stricher F, Serrano L, Tawfik DS. How protein stability and new functions trade off. PLoS Comput Biol. 2008;4(2):35–37. doi: 10.1371/journal.pcbi.1000002
- 10. Tokuriki N, Tawfik DS. Stability effects of mutations and protein evolvability. Curr Opin Struct Biol. 2009;19(5):596–604.
- 11. Tóth-Petróczy Á, Tawfik DS. The robustness and innovability of protein folds. Curr Opin Struct Biol. 2014;26(1):131–138. doi: 10.1016/j.sbi.2014.06.007
- 12. Bloom JD, Wilke CO, Arnold FH, Adami C. Stability and the Evolvability of Function in a Model Protein. Biophys J. 2004;86(5):2758–2764. doi: 10.1016/S0006-3495(04)74329-5
- 13. Bloom JD, Labthavikul ST, Otey CR, Arnold FH. Protein stability promotes evolvability. Proc Natl Acad Sci U S A. 2006;103(15):5869–5874.
- 14. Sha F, Salzman G, Gupta A, Koide S. Monobodies and other synthetic binding proteins for expanding protein science. Protein Sci. 2017;26:910–924. doi: 10.1002/pro.3148
- 15. Hober S, Lindbo S, Nilvebrant J. Bispecific applications of non-immunoglobulin scaffold binders. Methods. 2019;154:143–152. doi: 10.1016/j.ymeth.2018.09.010
- 16. Golinski AW, Holec PV, Mischler KM, Hackel BJ. Biophysical Characterization Platform Informs Protein Scaffold Evolvability. Published online 2019. doi: 10.1021/acscombsci.8b00182
- 17. Holliger P, Hudson PJ. Engineered antibody fragments and the rise of single domains. Nat Biotechnol. 2005;23(9):1126–1136. doi: 10.1038/nbt1142
- 18. Mullard A. FDA approves 100th monoclonal antibody product. Nat Rev Drug Discov. 2021;20(7):491–495. doi: 10.1038/d41573-021-00079-7
- 19. Schmidt MM, Wittrup KD. A modeling analysis of the effects of molecular size and binding affinity on tumor targeting. Mol Cancer Ther. 2009;8(10):2861–2871.
- 20. Wittrup KD, Thurber G, Schmidt MM, Rhoden J. Practical theoretical guidance for the design of tumor-targeting agents. Methods Enzymol. 2012;503:255–268. doi: 10.1016/B978-0-12-396962-0.00010-0
- 21. Gebauer M, Skerra A. Engineered Protein Scaffolds as Next-Generation Therapeutics. Published online 2020.
- 22. Gebauer M, Skerra A. Engineered Protein Scaffolds as Next-Generation Antibody Therapeutics. Curr Opin Chem Biol. 2009;13(3):245–255. doi: 10.1016/j.cbpa.2009.04.627
- 23. Löfblom J, Frejd FY, Ståhl S. Non-immunoglobulin based protein scaffolds. Curr Opin Biotechnol. 2011;22(6):843–848. doi: 10.1016/j.copbio.2011.06.002
- 24. Woldring DR, Holec PV, Stern LA, Du Y, Hackel BJ. A gradient of sitewise diversity promotes evolutionary fitness for binder discovery in a three-helix bundle protein scaffold. Biochemistry. 2017;56(11):1656–1671.
- 25. Löfblom J, Feldwisch J, Tolmachev V, Carlsson J, Ståhl S, Frejd FY. Affibody molecules: engineered proteins for therapeutic, diagnostic and biotechnological applications. FEBS Lett. 2010;584(12):2670–2680.
- 26. Koide A, Bailey CW, Huang X, Koide S. The fibronectin type III domain as a scaffold for novel binding proteins. J Mol Biol. 1998;284(4):1141–1151. doi: 10.1006/jmbi.1998.2238
- 27. Moore SJ, Hayden Gephart MG, Bergen JM, et al. Engineered knottin peptide enables noninvasive optical imaging of intracranial medulloblastoma. Proc Natl Acad Sci. 2013;110(36):14598–14603. doi: 10.1073/pnas.1311333110
- 28. Kintzing JR, Cochran JR. Engineered knottin peptides as diagnostics, therapeutics, and drug delivery vehicles. Curr Opin Chem Biol. 2016;34:143–150. doi: 10.1016/j.cbpa.2016.08.022
- 29. Simeon RA, Zeng Y, Chonira V, et al. Protease-stable DARPins as promising oral therapeutics. Published online 2021:1–11.
- 30. Kruziki MA, Sarma V, Hackel BJ. Constrained Combinatorial Libraries of Gp2 Proteins Enhance Discovery of PD-L1 Binders. ACS Comb Sci. 2018;20(7):423–435. doi: 10.1021/acscombsci.8b00010
- 31. Getz JA, Rice JJ, Daugherty PS. Protease-resistant peptide ligands from a knottin scaffold library. ACS Chem Biol. 2011;6(8):837–844. doi: 10.1021/cb200039s
- 32. Hosse RJ, Rothe A, Power BE. A new generation of protein display scaffolds for molecular recognition. Protein Sci. 2006;15(1):14–27. doi: 10.1110/ps.051817606
- 33. Koide A, Wojcik J, Gilbreth RN, Hoey RJ, Koide S. Teaching an Old Scaffold New Tricks: Monobodies Constructed Using Alternative Surfaces of the FN3 Scaffold. J Mol Biol. 2012;415(2):393–405.
- 34. Bogan AA, Thorn KS. Anatomy of hot spots in protein interfaces. J Mol Biol. 1998;280(1):1–9. doi: 10.1006/jmbi.1998.1843
- 35. Rocklin GJ, Chidyausiku TM, Goreshnik I, et al. Global analysis of protein folding using massively parallel design, synthesis, and testing. Science. 2017;357(6347):168–175. doi: 10.1126/science.aan0693
- 36. Boder ET, Wittrup KD. Yeast Surface Display for Screening Combinatorial Polypeptide Libraries. Nat Biotechnol. 1997;15:553–557.
- 37. Golinski AW, Mischler KM, Laxminarayan S, et al. High-throughput developability assays enable library-scale identification of producible protein scaffold variants. Proc Natl Acad Sci U S A. 2021;118(23):1–11. doi: 10.1073/pnas.2026658118
- 38. Golinski AW, Mischler KM, Laxminarayan S, et al. High-throughput developability assays enable library-scale identification of producible protein scaffold variants. bioRxiv. Published online 2020:1–11. doi: 10.1101/2020.12.14.422755
- 39. Cabantous S, Waldo GS. In vivo and in vitro protein solubility assays using split GFP. Nat Methods. 2006;3(10):845–854. doi: 10.1038/nmeth932
- 40. Woldring DR, Holec PV, Zhou H, Hackel BJ. High-Throughput Ligand Discovery Reveals a Sitewise Gradient of Diversity in Broadly Evolved Hydrophilic Fibronectin Domains. PLoS ONE. 2015;10(9):e0138956. doi: 10.1371/journal.pone.0138956
- 41. Ferrara N, Gerber HP, LeCouter J. The biology of VEGF and its receptors. Nat Med. 2003;9(6):669–676.
- 42. Zhou Y, Zhu X, Cui H, Shi J, Yuan G, Shi S. The Role of the VEGF Family in Coronary Heart Disease. Front Cardiovasc Med. 2021;8(August):1–16. doi: 10.3389/fcvm.2021.738325
- 43. Jain RK, Duda DG, Clark JW, Loeffler JS. Lessons from phase III clinical trials on anti-VEGF therapy for cancer. Nat Rev Clin Oncol. 2006;3:24–40.
- 44. Jumper J, Evans R, Pritzel A, et al. Highly accurate protein structure prediction with AlphaFold. Nature. Published online 2021. doi: 10.1038/s41586-021-03819-2
- 45. Benatuil L, Perez JM, Belk J, Hsieh CM. An improved yeast transformation method for the generation of very large human antibody libraries. Protein Eng Des Sel. 2010;23(4):155–159.
- 46. Woldring DR, Holec PV, Zhou H, Hackel BJ. High-throughput ligand discovery reveals a sitewise gradient of diversity in broadly evolved hydrophilic fibronectin domains. PLoS ONE. 2015;10(9). doi: 10.1371/journal.pone.0138956
- 47. Edgar RC. Search and clustering orders of magnitude faster than BLAST. Bioinformatics. 2010;26(19):2460–2461. doi: 10.1093/bioinformatics/btq461
