Fig. 1. Experimental workflow for generating pre- and post-packaged AAV5–7-mer library data for ML-based library design.
N_i, number of reads for each unique insertion sequence i. Experimental data were used to build a supervised regression model where the target variable reflects the packaging success of each insertion sequence. The predictive model was then systematically inverted to design libraries that trace out an optimal trade-off curve between diversity and packaging fitness. Schematic illustration created with BioRender.com.