2024 Aug 23;25(5):bbae404. doi: 10.1093/bib/bbae404

Table 1. Characteristics of the PLM embeddings used in this study

| Embedding | Language model | Layers | Parameters | Training database |
| --- | --- | --- | --- | --- |
| ProteinBERT | Modified BERT | 6 | 16M | UniRef90 (106M seqs) |
| esm2_t30_150M_UR50D | BERT | 30 | 150M | UniRef50D (2021_04) (50M seqs) |
| esm2_t33_650M_UR50D | BERT | 33 | 650M | UniRef50D (2021_04) (50M seqs) |
| ProtT5-XL | T5 | 24 | 3B | BFD100 (2B seqs) + UniRef50 (45M seqs) |
| ProtBert | BERT | 30 | 420M | BFD100 (2B seqs) |
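The characteristics in Table 1 can be captured as a small lookup structure, which is convenient when selecting an embedding model by size or training corpus. This is a minimal illustrative sketch, not code from the study; the dictionary name and helper function are hypothetical, and values are transcribed from the table above.

```python
# Table 1 transcribed as a lookup table (sequence counts omitted for brevity).
# PLM_EMBEDDINGS and smallest_embedding are illustrative names, not from the paper.
PLM_EMBEDDINGS = {
    "ProteinBERT":         {"model": "Modified BERT", "layers": 6,  "params": 16e6,  "training": "UniRef90"},
    "esm2_t30_150M_UR50D": {"model": "BERT",          "layers": 30, "params": 150e6, "training": "UniRef50D (2021_04)"},
    "esm2_t33_650M_UR50D": {"model": "BERT",          "layers": 33, "params": 650e6, "training": "UniRef50D (2021_04)"},
    "ProtT5-XL":           {"model": "T5",            "layers": 24, "params": 3e9,   "training": "BFD100 + UniRef50"},
    "ProtBert":            {"model": "BERT",          "layers": 30, "params": 420e6, "training": "BFD100"},
}

def smallest_embedding() -> str:
    """Return the name of the embedding model with the fewest parameters."""
    return min(PLM_EMBEDDINGS, key=lambda name: PLM_EMBEDDINGS[name]["params"])

print(smallest_embedding())  # ProteinBERT
```

Note the roughly 200-fold spread in model size, from ProteinBERT (16M parameters) to ProtT5-XL (3B parameters), which matters when trading embedding quality against compute cost.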