Table 3.
Model | Germline heavy FWR | Germline heavy CDR1/2 | Germline light FWR | Germline light CDR1/2 | Nongermline heavy FWR | Nongermline heavy CDR1/2 | Nongermline heavy CDR3 | Nongermline light FWR | Nongermline light CDR1/2 | Nongermline light CDR3 |
---|---|---|---|---|---|---|---|---|---|---|
ESM-2 | 1.91 | 4.12 | 2.54 | 6.11 | 32.03 | 24.36 | 20.85 | 23.20 | 19.37 | 24.29 |
AntiBERTy | 1.05 | 1.10 | 1.17 | 1.28 | 29.64 | 21.51 | 18.44 | 40.14 | 21.75 | 16.95 |
AbLang-1 | 1.03 | 1.08 | 1.07 | 1.16 | 25.80 | 17.73 | 14.47 | 52.14 | 25.72 | 16.75 |
Ab-Unpaired | 1.02 | 1.07 | 1.01 | 1.05 | 26.81 | 18.95 | 14.42 | 37.60 | 19.37 | 17.25 |
Ab-Paired | 1.02 | 1.06 | 1.02 | 1.05 | 27.24 | 18.70 | 14.23 | 38.95 | 19.25 | 16.98 |
Ab-FL | 1.10 | 1.17 | 1.09 | 1.16 | 10.33 | 11.18 | 12.69 | 10.82 | 10.24 | 11.04 |
Ab-ModMask | 1.11 | 1.18 | 1.09 | 1.17 | 10.26 | 11.13 | 13.18 | 10.78 | 10.19 | 11.42 |
Ab-FT | 1.11 | 1.18 | 1.10 | 1.18 | 10.88 | 11.91 | 13.67 | 11.25 | 10.63 | 12.29 |
AbLang-2 | 1.10 | 1.17 | 1.09 | 1.16 | 9.92 | 11.13 | 12.47 | 10.09 | 9.54 | 10.77 |
While most of the models are near perfect at predicting masked germline residues, their perplexities for nongermline (NGL) residues are substantially higher. For ESM-2, AntiBERTy, AbLang-1, Ab-Unpaired, and Ab-Paired, NGL perplexities are close to or worse than those of a random prediction. The largest improvement in NGL prediction came from switching to focal loss. Scaling up the model also improved performance, as seen in AbLang-2’s lower NGL perplexities compared to Ab-FT. The best perplexity for each region is shown in bold.
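To make the "random prediction" reference point concrete, the following is a minimal sketch of how a per-region perplexity could be computed. It assumes the standard definition (the exponential of the mean negative log-probability assigned to the true residue at each masked position) and a 20-letter amino acid vocabulary; the source does not spell out either, and the function name and input values are hypothetical.

```python
import math


def perplexity(true_residue_probs):
    """Return exp(mean negative log-probability) over masked positions.

    `true_residue_probs` holds, for each masked position, the probability
    the model assigned to the correct residue (hypothetical input format).
    """
    if not true_residue_probs:
        raise ValueError("need at least one probability")
    nll = [-math.log(p) for p in true_residue_probs]
    return math.exp(sum(nll) / len(nll))


# Assumption: a uniform guess over the 20 standard amino acids.
uniform_guess = [1 / 20] * 5
# Illustrative values only, not taken from the table.
confident_guess = [0.98, 0.99, 0.97]

print(round(perplexity(uniform_guess), 2))
print(round(perplexity(confident_guess), 2))
```

Under these assumptions a uniform guess yields a perplexity of 20, which is the sense in which nongermline values in the high teens and above can be read as close to or worse than random, while values near 1 correspond to almost always assigning high probability to the correct residue.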