TABLE 5.
| Model | Dataset | ROC AUC | APS | # of data records | Pos. rate |
|---|---|---|---|---|---|
| Cross-TCR-interpreter | Recent data test set | 0.5362 | 0.1855 | 33,360 | 0.1667 |
| | Recent data test set of the new peptide subset | 0.5085 | 0.1707 | 28,422 | 0.1662 |
| | Recent data test set of the known peptide subset | 0.6598 | 0.3318 | 4,938 | 0.1692 |
| | Recent data test set of the new CDR3 subset | 0.5355 | 0.1844 | 33,335 | 0.1660 |
| NetTCR-2.0 | Recent data test set | 0.5274 | 0.1808 | 33,360 | 0.1667 |
| | Recent data test set of the new peptide subset | 0.5113 | 0.1705 | 28,422 | 0.1662 |
| | Recent data test set of the known peptide subset | 0.6327 | 0.3008 | 4,938 | 0.1692 |
| | Recent data test set of the new CDR3 subset | 0.5267 | 0.1798 | 33,335 | 0.1660 |
| PanPep (a) | Recent data test set | 0.5337 | 0.1897 | 30,221 | 0.1745 |
| | Recent data test set of the new peptide subset | 0.5359 | 0.1908 | 25,661 | 0.1739 |
| | Recent data test set of the known peptide subset | 0.5199 | 0.1852 | 4,560 | 0.1779 |
| | Recent data test set of the new CDR3 subset | 0.5374 | 0.1923 | 29,145 | 0.1752 |
Scores for the test subset comprising only known CDR3s could not be computed because all of its data records are positive.
At a decision threshold of 0.5, however, our model achieves a recall of 0.56 on this subset, compared with 0.44 for NetTCR-2.0 and 0.59 for PanPep.
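The note above can be illustrated with a minimal Python sketch. It shows why a ranking metric such as ROC AUC is undefined on an all-positive subset while recall at a fixed threshold remains computable. The helper names `roc_auc` and `recall_at_threshold`, the toy scores, and the choice to count a score equal to the threshold as a positive prediction are assumptions made for illustration, not details taken from the evaluated models.

```python
def roc_auc(labels, scores):
    """Rank-based ROC AUC; raises if only one class is present."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        # With no negatives (or no positives) there are no pairs to rank.
        raise ValueError("ROC AUC is undefined when only one class is present")
    # Count positive/negative pairs ranked correctly; ties count as half.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def recall_at_threshold(labels, scores, threshold=0.5):
    """Fraction of positive records scored at or above the threshold."""
    pos_scores = [s for y, s in zip(labels, scores) if y == 1]
    if not pos_scores:
        raise ValueError("recall is undefined without positive records")
    return sum(s >= threshold for s in pos_scores) / len(pos_scores)


# Toy all-positive subset (made-up scores, not the data behind Table 5).
labels = [1, 1, 1, 1, 1]
scores = [0.9, 0.7, 0.4, 0.6, 0.2]

try:
    roc_auc(labels, scores)
    auc_defined = True
except ValueError:
    auc_defined = False

recall = recall_at_threshold(labels, scores)
print(auc_defined, recall)
```

Recall depends only on the positive records, which is why it can still be reported for a subset with no negatives.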
(a) Our model and NetTCR-2.0 were evaluated on identical datasets. The dataset used for PanPep differed because PanPep uses only the CDR3 of the beta chain. Removing the resulting duplicate beta-chain CDR3 records from the test set left fewer data records than were used for our model and NetTCR-2.0.
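The reduction in record count described above can be sketched as follows. This is a minimal illustration, assuming that two records count as duplicates when they share the same beta-chain CDR3 and the same peptide; the exact deduplication key is not stated in the source. The field names (`cdr3a`, `cdr3b`, `peptide`, `label`), the helper `dedupe_beta_only`, and the sequences are hypothetical.

```python
def dedupe_beta_only(records):
    """Keep the first record for each (beta-chain CDR3, peptide) pair.

    Assumption: the alpha chain is ignored, so records that differ only in
    their alpha-chain CDR3 collapse into a single record.
    """
    seen = set()
    kept = []
    for rec in records:
        key = (rec["cdr3b"], rec["peptide"])
        if key not in seen:
            seen.add(key)
            kept.append(rec)
    return kept


# Made-up toy records; the first two differ only in the alpha chain.
records = [
    {"cdr3a": "CAVAAAF", "cdr3b": "CASSBBBF", "peptide": "PEPTIDEA", "label": 1},
    {"cdr3a": "CAVCCCF", "cdr3b": "CASSBBBF", "peptide": "PEPTIDEA", "label": 1},
    {"cdr3a": "CAVDDDF", "cdr3b": "CASSEEEF", "peptide": "PEPTIDEB", "label": 0},
]

deduped = dedupe_beta_only(records)
print(len(records), len(deduped))
```

A model that takes both chains as input treats the first two records as distinct, whereas a beta-only model sees them as one, which is consistent with the smaller PanPep record counts in the table.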