Figure 3: MultiSTEP reveals biochemical constraints on secretion.

a. Predicted signal peptide regions for WT FIX from SignalP 6.0 (top). Heatmap shows FIX heavy chain secretion scores for signal peptide variants (bottom, n = 3 replicates). Heatmap color indicates secretion score. Black dots indicate the WT amino acid. Missing scores are gray. N: N-region; H: H-region; C: C-region. b. Comparison of MultiSTEP secretion scores with SignalP 6.0 (SP6) functional classification, grouped by signal peptide region. Violin plots show the distribution of points, with inset box plots representing the 25th, 50th, and 75th percentiles. Whiskers span the range of the data. The dashed horizontal line is the 5th percentile of the synonymous secretion score distribution. The number of variants in each class is labeled above each violin plot. c. FIX cysteine positions colored by domain architecture (top). Sig: signal peptide; Gla: Gla domain; EGF1: epidermal growth factor-like 1 domain; EGF2: epidermal growth factor-like 2 domain; Protease: serine protease domain. Disulfide bridges in WT FIX are denoted by black connecting lines (refs. 9, 108–110). Heatmap of FIX heavy chain secretion scores for loss-of-cysteine substitutions, colored as in (b) (bottom, n = 3 replicates). d. Mean (point) and standard error (error bars) of variant effect scores (FIX: MultiSTEP; all others: VAMP-seq) for all loss-of-cysteine substitutions in different proteins (refs. 22–24, 34; n = 1,031 variants). Bonferroni-corrected pairwise two-sided t-test p values are shown. e. Box plots representing the 25th, 50th, and 75th percentiles of secretion scores for all missense variants across all positions with the indicated WT amino acid (n = 8,528 variants). Whiskers span the range of the data. f. Mean (point) and standard error (error bars) of variant effect scores (FIX: MultiSTEP; all others: VAMP-seq) for all gain-of-cysteine substitutions in different proteins (n = 1,404 variants). Bonferroni-corrected pairwise two-sided t-test p values are shown. g. Box plots representing the 5th, 25th, 50th, 75th, and 95th percentiles of secretion scores for all missense substitutions to the indicated variant amino acid across all positions (n = 8,528 variants). Whiskers span the range of the data.
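
The summary statistics named in this caption (the 5th-percentile line in b, the mean and standard error in d and f, and the Bonferroni correction) can be illustrated with a minimal Python sketch. This is not the authors' analysis code: the function names, the linear-interpolation percentile rule, and all score values are assumptions made for illustration only.

```python
import math
import statistics


def percentile(values, q):
    """q-th percentile (0-100), linearly interpolating between order statistics.

    The interpolation rule is an assumption; the caption does not specify one.
    """
    ordered = sorted(values)
    if not ordered:
        raise ValueError("values must be non-empty")
    rank = (len(ordered) - 1) * q / 100.0
    lo, hi = math.floor(rank), math.ceil(rank)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (rank - lo)


def mean_sem(values):
    """Mean and standard error of the mean (sample SD divided by sqrt(n))."""
    return statistics.mean(values), statistics.stdev(values) / math.sqrt(len(values))


def bonferroni(p_values):
    """Bonferroni correction: multiply each p value by the number of tests, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Hypothetical synonymous-variant secretion scores (not data from the paper).
synonymous_scores = [0.90, 0.95, 1.00, 1.02, 1.05, 1.10, 0.98, 1.01, 0.97, 1.03, 0.99]
dashed_line = percentile(synonymous_scores, 5)

# Hypothetical per-variant scores for one protein, summarized as in panels d and f.
point, error_bar = mean_sem([1.0, 2.0, 3.0])

# Hypothetical raw p values from three pairwise comparisons.
corrected = bonferroni([0.01, 0.04, 0.5])
```

Only the Python standard library is used here; the two-sided t-tests themselves are omitted because they need a t-distribution CDF, which the standard library does not provide.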