2025 Jan 24;26(1):bbaf023. doi: 10.1093/bib/bbaf023

Table 1b.

Levenshtein distances of in-silico generated antibody sequences from their closest training antibody sequences in the control and training datasets.

| Dataset | L-dist (a) | ScFv | VH | VL | HCDR1 | HCDR2 | HCDR3 | LCDR1 | LCDR2 | LCDR3 |
|---|---|---|---|---|---|---|---|---|---|---|
| 71 283 sequences in control dataset | # of 0 L-dist | 0 | 231 | 0 | 78 851 | 56 156 | 9031 | 79 854 | 96 170 | 67 842 |
| | Mean ± std | 24.8 ± 5.1 | 11.7 ± 4.5 | 7.0 ± 2.1 | 0.3 ± 0.6 | 0.9 ± 1.2 | 4.1 ± 2.4 | 0.2 ± 0.5 | 0.1 ± 0.2 | 0.4 ± 0.6 |
| | Range (min–max) | 4–46 | 0–31 | 1–16 | 0–4 | 0–8 | 0–13 | 0–3 | 0–2 | 0–3 |
| 31 416 sequences in training dataset | # of 0 L-dist | 9 | 1184 | 1464 | 74 609 | 48 517 | 11 169 | 78 326 | 91 163 | 68 081 |
| | Mean ± std | 22.7 ± 6.2 | 11.1 ± 5.0 | 5.4 ± 2.3 | 0.3 ± 0.6 | 0.9 ± 1.1 | 4.1 ± 2.6 | 0.2 ± 0.5 | 0.1 ± 0.3 | 0.4 ± 0.6 |
| | Range (min–max) | 0–46 | 0–31 | 0–15 | 0–4 | 0–6 | 0–14 | 0–3 | 0–4 | 0–3 |
(a) L-dist stands for Levenshtein distance.
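
The statistics reported in the table (number of zero distances, mean ± std, and range) can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the toy sequences, and the use of the population standard deviation are all assumptions made for illustration, and the table does not state which standard-deviation convention was used. For each generated sequence, the sketch takes the smallest Levenshtein distance to any sequence in a reference set and then summarizes those minima.

```python
from statistics import mean, pstdev


def levenshtein(a: str, b: str) -> int:
    """Edit distance (insertions, deletions, substitutions) between two strings."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def closest_distance(query: str, references: list[str]) -> int:
    """Distance from a query sequence to its nearest reference sequence."""
    return min(levenshtein(query, ref) for ref in references)


def summarize(generated: list[str], references: list[str]) -> dict:
    """Summary statistics analogous to one column of the table."""
    dists = [closest_distance(g, references) for g in generated]
    return {
        "n_zero": sum(d == 0 for d in dists),  # exact matches to a reference
        "mean": mean(dists),
        "std": pstdev(dists),  # assumption: population std; the source does not say
        "min": min(dists),
        "max": max(dists),
    }


# Hypothetical toy sequences, not taken from the study's datasets.
references = ["GFTFSSYA", "GYTFTSYG"]
generated = ["GFTFSSYA", "GFTFSDYA", "GYSFTGYG"]
summary = summarize(generated, references)
```

The brute-force nearest-neighbor search compares every generated sequence with every reference sequence, which is adequate for a small example but would likely need a faster distance library or an indexing scheme at the dataset sizes listed in the table.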