2025 Jul 8;5(8):635–647. doi: 10.1038/s43588-025-00823-8

Fig. 3. Considerations for experimental ΔΔG dataset generation, with respect to ML predictiveness.

a,b, Model performance with varying training plus validation dataset size (datasets: Synthetic_FoldX_ΔΔG_{580-450000}, Supplementary Table 1) (a) and dataset diversity (datasets: Synthetic_FoldX_ΔΔG_100000_randomly_sampled, Synthetic_FoldX_ΔΔG_100000_{sequence/substitution_type/substitution_distribution}_{min/max}; Supplementary Table 1) (b). For b, we considered diversity in antibody CDR sequence identity, amino acid substitution type frequency and the distribution of mutated positions in the complex.