Extended Data Figure 22:
Computational validation of the structure-based combinatorial assembly strategy
Structure-based design permits the rational combinatorial assembly of heavy and light chains, in which chains are combined only when they come from structurally similar designs. A) Fine-tuned RoseTTAFold (left) and AlphaFold3 (right) validate that pairing heavy and light chains from structurally similar (i.e., high pairwise TM score) designs yields scFvs that are more likely to be predicted to bind with high confidence (RF2 pBind, left; AF3 ipTM, right) than pairing heavy and light chains from structurally dissimilar (low pairwise TM score) designs. Note that the extremely high pBind distribution of the “designed pairings” (rightmost bar of left plot) is an artifact of those designs having been specifically filtered for high pBind scores. B-C) Combinatorial assembly leads to dramatically larger library sizes. Plots show the number of clusters (pink) at different TM score thresholds for TcdB (left) and Phox2b (right) scFvs. For the amplification strategy to work, each “cluster” becomes a PCR subpool, and each subpool requires its own independent PCR reactions (3 per subpool). Hence, we limit ourselves to large subpools (≥ 100 designs), which maximises the combinatorial amplification gained per unit of additional library assembly work. We additionally plot the theoretical library size for each target (blue), calculated as number_of_clusters × cluster_size^2. Gray lines indicate the TM threshold chosen for library assembly, where library sizes approximately match the transfection efficiency of yeast (10^7; ref. 58).
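The library-size arithmetic in panels B-C can be illustrated with a minimal sketch. It assumes that every heavy chain in a subpool is paired with every light chain in the same subpool, so a cluster of n designs contributes n^2 pairings; summing n^2 over clusters reduces to number_of_clusters × cluster_size^2 when all clusters are the same size. The function name, the dictionary keys, and the example cluster labels are hypothetical and are not taken from the authors' pipeline; only the size cutoff (≥ 100 designs) and the 3 PCR reactions per subpool come from the legend.

```python
from collections import Counter

MIN_SUBPOOL_SIZE = 100  # from the legend: only subpools with >= 100 designs are used
PCR_PER_SUBPOOL = 3     # from the legend: 3 PCR reactions per subpool


def library_summary(cluster_labels, min_size=MIN_SUBPOOL_SIZE):
    """Summarise a clustering of designs (hypothetical helper).

    cluster_labels: one cluster label per design, e.g. the output of
    clustering at a given TM score threshold.
    """
    sizes = Counter(cluster_labels)
    # Keep only clusters large enough to be worth a dedicated PCR subpool.
    kept = {label: n for label, n in sizes.items() if n >= min_size}
    # Assumption: all heavy x light pairings within a subpool, i.e. n^2 each.
    library_size = sum(n * n for n in kept.values())
    return {
        "n_subpools": len(kept),
        "pcr_reactions": PCR_PER_SUBPOOL * len(kept),
        "designs_used": sum(kept.values()),
        "library_size": library_size,
    }


if __name__ == "__main__":
    # Toy clustering: two large clusters and one below the size cutoff.
    labels = ["a"] * 150 + ["b"] * 100 + ["c"] * 40
    print(library_summary(labels))
```

In this toy case, 250 retained designs expand to 32,500 theoretical pairings at the cost of 6 PCR reactions, which shows why fewer, larger subpools give the most amplification per reaction.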
