2025 Jan 24;26(1):bbaf023. doi: 10.1093/bib/bbaf023

Table 3.

Incidence of non-canonical Cys residues and N-linked glycosylation motifs among the CDRs of the 100 000 antibody sequences generated in silico from the control dataset and from the training dataset. [a]

Values are the number of sequences, with the percentage of the 100 000 generated sequences in parentheses.

| CDR name | Reference sequences with non-canonical Cys residues | Main sequences with non-canonical Cys residues | Reference sequences with N-linked glycosylation motifs | Main sequences with N-linked glycosylation motifs |
|---|---|---|---|---|
| HCDR1 | 56 (0.05%) | 19 (0.02%) | 91 (0.09%) | 490 (0.49%) |
| HCDR2 | 110 (0.11%) | 9 (0.01%) | 1279 (1.28%) | 2567 (2.6%) |
| HCDR3 | 1344 (1.34%) | 410 (0.41%) | 1340 (1.34%) | 744 (0.74%) |
| All HCDRs | 1508 (1.5%) | 438 (0.44%) | 2694 (2.69%) | 3775 (3.77%) |
| LCDR1 | 82 (0.08%) | 9 (0.01%) | 749 (0.75%) | 3706 (3.71%) |
| LCDR2 | 387 (0.39%) | 12 (0.01%) | 25 (0.02%) | 13 (0.01%) |
| LCDR3 | 12 (0.01%) | 21 (0.02%) | 3246 (3.24%) | 478 (0.48%) |
| All LCDRs | 481 (0.48%) | 42 (0.04%) | 3967 (3.96%) | 4197 (4.20%) |
| All CDRs | 1980 (1.98%) | 480 (0.48%) | 6556 (6.54%) | 7816 (7.81%) |
[a] The 100 000 antibody sequences generated in silico using the control dataset are referred to as reference sequences (second and fourth columns of this table). The 100 000 antibody sequences generated in silico using the training dataset are referred to as main sequences (third and fifth columns). Note that the number of paired antibody sequences containing unpaired Cys or N-linked glycosylation motifs in All HCDRs, All LCDRs and All CDRs may not equal the sum over the individual CDRs, because a given sequence may contain these liabilities in more than one CDR and is then counted once in the aggregate row.
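
The counting rule described in the footnote can be illustrated with a minimal sketch. It is not the authors' pipeline. It assumes the standard N-linked glycosylation sequon N-X-S/T (X being any residue except proline) and assumes that any Cys inside a CDR counts as non-canonical. The function names, the dictionary layout and the toy sequences are all hypothetical.

```python
import re

# Standard N-linked glycosylation sequon: N, then any residue except P, then S or T.
# The lookahead lets overlapping motifs be detected as well.
SEQUON = re.compile(r"N(?=[^P][ST])")


def has_sequon(cdr: str) -> bool:
    """True if the CDR contains at least one N-X-S/T motif (X != P)."""
    return SEQUON.search(cdr) is not None


def has_cys(cdr: str) -> bool:
    """True if the CDR contains a Cys (assumed non-canonical inside a CDR)."""
    return "C" in cdr


def count_liabilities(antibodies, cdr_names, predicate):
    """Return (per-CDR counts, number of antibodies flagged in any CDR)."""
    per_cdr = {name: 0 for name in cdr_names}
    any_cdr = 0
    for ab in antibodies:
        hits = [name for name in cdr_names if predicate(ab[name])]
        for name in hits:
            per_cdr[name] += 1
        if hits:
            any_cdr += 1  # counted once, however many CDRs are affected
    return per_cdr, any_cdr


# Toy heavy-chain CDRs, invented purely for illustration.
toy = [
    {"HCDR1": "GFTFSNYS", "HCDR2": "ISNGSGGT", "HCDR3": "ARDYW"},    # motif in two CDRs
    {"HCDR1": "GYTFTSYW", "HCDR2": "INPSNGGT", "HCDR3": "ARCGDYW"},  # N-P-S excluded; Cys in HCDR3
    {"HCDR1": "GFTFSSYA", "HCDR2": "ISGSGGST", "HCDR3": "AKNASDYW"}, # motif in HCDR3
]
HCDRS = ["HCDR1", "HCDR2", "HCDR3"]

glyco_per_cdr, glyco_any = count_liabilities(toy, HCDRS, has_sequon)
cys_per_cdr, cys_any = count_liabilities(toy, HCDRS, has_cys)

print(glyco_per_cdr, glyco_any)
print(cys_per_cdr, cys_any)
```

In this toy set the first antibody carries a motif in both HCDR1 and HCDR2, so the per-CDR counts sum to 3 while only 2 antibodies are flagged overall. This is the same effect that makes the aggregate rows of the table smaller than the sum of the individual CDR rows.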