Bioinformatics. 2024 Oct 26;40(11):btae618. doi: 10.1093/bioinformatics/btae618

Table 3.

Perplexity comparison between the protein language model (LM) ESM-2 (Lin et al. 2023), the antibody-specific LMs AntiBERTy (Ruffolo et al. 2021) and AbLang-1 (Olsen et al. 2022b), and our new selection of antibody-specific LMs (see Section 2.4).a

GL, germline; NGL, nongermline; H, heavy chain; L, light chain.

| Model | GL H FWR | GL H CDR1/2 | GL L FWR | GL L CDR1/2 | NGL H FWR | NGL H CDR1/2 | NGL H CDR3 | NGL L FWR | NGL L CDR1/2 | NGL L CDR3 |
|---|---|---|---|---|---|---|---|---|---|---|
| ESM-2 | 1.91 | 4.12 | 2.54 | 6.11 | 32.03 | 24.36 | 20.85 | 23.20 | 19.37 | 24.29 |
| AntiBERTy | 1.05 | 1.10 | 1.17 | 1.28 | 29.64 | 21.51 | 18.44 | 40.14 | 21.75 | 16.95 |
| AbLang-1 | 1.03 | 1.08 | 1.07 | 1.16 | 25.80 | 17.73 | 14.47 | 52.14 | 25.72 | 16.75 |
| Ab-Unpaired | **1.02** | 1.07 | **1.01** | **1.05** | 26.81 | 18.95 | 14.42 | 37.60 | 19.37 | 17.25 |
| Ab-Paired | **1.02** | **1.06** | 1.02 | **1.05** | 27.24 | 18.70 | 14.23 | 38.95 | 19.25 | 16.98 |
| Ab-FL | 1.10 | 1.17 | 1.09 | 1.16 | 10.33 | 11.18 | 12.69 | 10.82 | 10.24 | 11.04 |
| Ab-ModMask | 1.11 | 1.18 | 1.09 | 1.17 | 10.26 | **11.13** | 13.18 | 10.78 | 10.19 | 11.42 |
| Ab-FT | 1.11 | 1.18 | 1.10 | 1.18 | 10.88 | 11.91 | 13.67 | 11.25 | 10.63 | 12.29 |
| AbLang-2 | 1.10 | 1.17 | 1.09 | 1.16 | **9.92** | **11.13** | **12.47** | **10.09** | **9.54** | **10.77** |
a While most of the models are near perfect at predicting masked germline residues, predictions for nongermline (NGL) residues show substantially higher perplexities. For ESM-2, AntiBERTy, AbLang-1, Ab-Unpaired, and Ab-Paired, NGL perplexities are close to or worse than a random prediction (a uniform guess over the 20 standard amino acids has a perplexity of exactly 20). The largest improvement in NGL prediction came from switching to focal loss. Scaling up the model also improved performance, e.g. as seen by comparing AbLang-2's performance with Ab-FT's. The best perplexity for each region is shown in bold.
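
For reference, the perplexity reported here is, in the standard definition for masked language models, the exponential of the mean cross-entropy over masked positions; this is what makes 20 the random baseline for the 20 standard amino acids. A minimal sketch (not the authors' code; the function name and input format are illustrative):

```python
import math

def perplexity(log_probs_true):
    """Perplexity from the model's log-probabilities (natural log) of the
    true residue at each masked position: exp(mean cross-entropy)."""
    return math.exp(-sum(log_probs_true) / len(log_probs_true))

# A uniform guess over the 20 standard amino acids assigns log(1/20) to
# every position, giving a perplexity of exactly 20 -- the "random
# prediction" baseline referenced in the footnote above.
uniform = [math.log(1 / 20)] * 100
assert round(perplexity(uniform), 6) == 20.0
```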
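The focal loss credited above with the largest NGL improvement is, in its standard formulation (Lin et al. 2017), cross-entropy scaled by (1 − p_t)^γ, which down-weights positions the model already predicts confidently. The sketch below is an illustration of that standard form, not the paper's implementation; the gamma value shown is a common default, not necessarily the authors' setting.

```python
import torch
import torch.nn.functional as F

def focal_loss(logits, targets, gamma=2.0):
    """Focal loss over masked positions.
    logits: (N, vocab) scores at masked positions; targets: (N,) residue ids."""
    log_p = F.log_softmax(logits, dim=-1)
    # log-probability of the true residue at each position
    log_pt = log_p.gather(-1, targets.unsqueeze(-1)).squeeze(-1)
    pt = log_pt.exp()
    # (1 - pt)^gamma down-weights easy, high-confidence positions
    # (e.g. germline residues), focusing training on hard NGL positions.
    return (-(1 - pt) ** gamma * log_pt).mean()

# Usage with dummy data: 8 masked positions, 20-residue vocabulary.
loss = focal_loss(torch.randn(8, 20), torch.randint(0, 20, (8,)))
```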