Table 1.
Configuration | TPP2 AUROC | TPP2 AP | TPP3 AUROC | TPP3 AP |
---|---|---|---|---|
1. (CDR3) | 0.830 ± 0.000 | 0.574 ± 0.000 | 0.513 ± 0.008 | 0.179 ± 0.003 |
2. (CDR3) + VJ | 0.891 ± 0.000 | 0.665 ± 0.001 | 0.548 ± 0.007 | 0.192 ± 0.004 |
3. (CDR3) + MHC | 0.837 ± 0.000 | 0.583 ± 0.000 | 0.611 ± 0.002 | 0.243 ± 0.002 |
4. (CDR3) + VJ + MHC | 0.897 ± 0.000 | 0.676 ± 0.000 | 0.692 ± 0.007 | 0.289 ± 0.006 |
5. (long) | 0.888 ± 0.000 | 0.663 ± 0.000 | 0.528 ± 0.008 | 0.191 ± 0.004 |
6. (long) + MHC | 0.893 ± 0.000 | 0.674 ± 0.000 | 0.682 ± 0.010 | 0.284 ± 0.007 |
7. (long) + VJ + MHC | 0.906 ± 0.000 | 0.698 ± 0.000 | 0.691 ± 0.008 | 0.291 ± 0.005 |
8. (long) + VJ + MHC [] | 0.906 ± 0.000 | 0.691 ± 0.001 | 0.693 ± 0.008 | 0.294 ± 0.007 |
The model was trained using different subsets of the input features. Here, CDR3 and long in parentheses denote the context used for the ProtBERT embeddings, and VJ and MHC denote whether the respective categorical features were used. We also evaluated the model on data that additionally include data points with only one of the two chains (row 8). Reported values are the mean over five 10-fold cross-validation runs together with the standard error. The values corresponding to the best-performing configurations are bolded.
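
As an illustration of how a table entry of this form can be computed, the following is a minimal sketch, assuming that each cross-validation run yields one aggregate score and that the standard error is the sample standard deviation of those scores divided by the square root of the number of runs. The function name `mean_and_standard_error` and the example scores are hypothetical and are not taken from the experiments reported here.

```python
import math
import statistics


def mean_and_standard_error(run_scores):
    """Return (mean, standard error) for a list of per-run scores."""
    n = len(run_scores)
    if n < 2:
        raise ValueError("need at least two runs to estimate a standard error")
    mean = statistics.fmean(run_scores)
    # Sample standard deviation (n - 1 denominator) divided by sqrt(n).
    sem = statistics.stdev(run_scores) / math.sqrt(n)
    return mean, sem


# Hypothetical AUROC values, one per cross-validation run (illustrative only).
example_runs = [0.69, 0.70, 0.68, 0.71, 0.67]
mean, sem = mean_and_standard_error(example_runs)
print(f"{mean:.3f} ± {sem:.3f}")
```

Whether the standard error in the table is taken over the five run-level means or over all fold-level scores is not stated in the caption; the sketch assumes the former.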