Bioinformatics. 2023 Dec 9;39(12):btad743. doi: 10.1093/bioinformatics/btad743

Table 1.

Effect of input features.a

| Input features | TPP2 AUROC | TPP2 AP | TPP3 AUROC | TPP3 AP |
| --- | --- | --- | --- | --- |
| 1. αβ (CDR3) | 0.830 ± 0.000 | 0.574 ± 0.000 | 0.513 ± 0.008 | 0.179 ± 0.003 |
| 2. αβ (CDR3) + VJ | 0.891 ± 0.000 | 0.665 ± 0.001 | 0.548 ± 0.007 | 0.192 ± 0.004 |
| 3. αβ (CDR3) + MHC | 0.837 ± 0.000 | 0.583 ± 0.000 | 0.611 ± 0.002 | 0.243 ± 0.002 |
| 4. αβ (CDR3) + VJ + MHC | 0.897 ± 0.000 | 0.676 ± 0.000 | 0.692 ± 0.007 | 0.289 ± 0.006 |
| 5. αβ (long) | 0.888 ± 0.000 | 0.663 ± 0.000 | 0.528 ± 0.008 | 0.191 ± 0.004 |
| 6. αβ (long) + MHC | 0.893 ± 0.000 | 0.674 ± 0.000 | 0.682 ± 0.010 | 0.284 ± 0.007 |
| 7. αβ (long) + VJ + MHC | 0.906 ± 0.000 | 0.698 ± 0.000 | 0.691 ± 0.008 | 0.291 ± 0.005 |
| 8. αβ (long) + VJ + MHC [Dαβ,α,β] | 0.906 ± 0.000 | 0.691 ± 0.001 | 0.693 ± 0.008 | 0.294 ± 0.007 |
a The model was trained on Dαβ,β using different subsets of the input features. Here, CDR3 and long in parentheses denote the context used for the ProtBERT embeddings, and VJ and MHC denote whether the respective categorical features were used. We also evaluated the model on Dαβ,α,β, which additionally contains data points that have only the α chain and not the β chain (row 8). Reported values are the means over five 10-fold cross-validation runs, together with the standard errors. Values corresponding to the best-performing configurations are bolded.
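
The "mean ± standard error" entries in the table can be illustrated with a minimal sketch. It assumes that each of the five cross-validation runs yields one score and that the standard error is the sample standard deviation of those scores divided by the square root of the number of runs; the footnote does not state the exact aggregation, so this is one plausible reading rather than the authors' procedure. The function name and the example scores are hypothetical and are not taken from the paper.

```python
import math
import statistics


def mean_and_standard_error(run_scores):
    """Return (mean, standard error) for a list of per-run scores.

    Assumption: one score per cross-validation run, with the standard
    error computed as sample standard deviation / sqrt(number of runs).
    """
    n = len(run_scores)
    if n < 2:
        raise ValueError("need at least two runs to estimate a standard error")
    mean = statistics.fmean(run_scores)
    # statistics.stdev uses the n - 1 denominator (sample standard deviation).
    sem = statistics.stdev(run_scores) / math.sqrt(n)
    return mean, sem


# Hypothetical AUROC values from five repeated 10-fold runs (illustrative only).
example_runs = [0.70, 0.68, 0.69, 0.71, 0.67]
mean, sem = mean_and_standard_error(example_runs)
print(f"{mean:.3f} ± {sem:.3f}")
```

Under this reading, an entry such as 0.906 ± 0.000 indicates that the five runs agreed to within the three reported decimals, whereas the larger standard errors in the TPP3 columns reflect more run-to-run variation.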