APPLIED BIOLOGICAL SCIENCES, COMPUTER SCIENCES
Correction for “Machine learning-assisted directed protein evolution with combinatorial libraries,” by Zachary Wu, S. B. Jennifer Kan, Russell D. Lewis, Bruce J. Wittmann, and Frances H. Arnold, which was first published April 12, 2019; 10.1073/pnas.1901979116 (Proc. Natl. Acad. Sci. U.S.A. 116, 8852–8858).
The authors note that Fig. 4 appeared incorrectly. The corrected figure and its legend appear below.
Fig. 4.
(A) Structural homology model of Rma NOD, made by SWISS-MODEL (47), with the positions of mutated residues. Set I positions 32, 46, 56, and 97 are shown in red, and set II positions 49, 51, and 53 are shown in blue. (B) Evolutionary lineage of the two rounds of evolution. (C) Summary statistics for each round, including the number of sequences obtained to train each model, the fraction of the total library represented in the input variants, each model’s leave-one-out cross-validation (CV) Pearson correlation, and the number of predicted sequences tested.
The authors also note that Table 2 appeared incorrectly. The corrected table appears below.
Table 2.
Summary of the most (S)- and (R)-selective variants in the input and predicted libraries in position set II (P49, R51, I53)
| Variant | Residue 49 | Residue 51 | Residue 53 | Selectivity, % ee (enantiomer) | Cellular activity increase over KFLL |
|---|---|---|---|---|---|
| Input variants | | | | | |
| From VCHV | P* | R* | I* | 86 (S) | _ |
| | Y | V | F | 86 (S) | _ |
| | N | D | V | 75 (S) | _ |
| From GSSG | P† | R† | I† | 62 (R) | _ |
| | Y | F | F | 57 (R) | _ |
| | C | V | N | 52 (R) | _ |
| Predicted variants | | | | | |
| From VCHV | Y | V | V | 93 (S) | 2.8-fold |
| | P | V | I | 93 (S) | 3.2-fold |
| | P | V | V | 92 (S) | 3.1-fold |
| From GSSG | P | R | L | 79 (R) | 2.2-fold |
| | P | G | L | 75 (R) | 2.1-fold |
| | P | F | F | 70 (R) | 2.2-fold |
Mutations that improve selectivity for the (S)-enantiomer appear in the background of [32V, 46C, 56H, 97V (VCHV)] and for the (R)-enantiomer are in [32G, 46S, 56S, 97G (GSSG)]. Activity increase over the starting variant, 32K, 46F, 56L, 97L (KFLL), is shown for the final variants.
*Parent sequence used for set II for (S)-selectivity.
†Parent sequence used for set II for (R)-selectivity.
Lastly, the authors also note that Table S4 in the SI Appendix appeared incorrectly. The SI Appendix has been corrected online.

