. 2023 Dec 18;3:1274599. doi: 10.3389/fbinf.2023.1274599

TABLE 5.

Results on the recent data test set. APS stands for average precision score.

| Model | Dataset | ROC AUC | APS | No. of data records | Positive rate |
|---|---|---|---|---|---|
| Cross-TCR-interpreter | Recent data test set | 0.5362 | 0.1855 | 33,360 | 0.1667 |
| | Recent data test set of the new peptide subset | 0.5085 | 0.1707 | 28,422 | 0.1662 |
| | Recent data test set of the known peptide subset | 0.6598 | 0.3318 | 4,938 | 0.1692 |
| | Recent data test set of the new CDR3 subset | 0.5355 | 0.1844 | 33,335 | 0.1660 |
| NetTCR-2.0 | Recent data test set | 0.5274 | 0.1808 | 33,360 | 0.1667 |
| | Recent data test set of the new peptide subset | 0.5113 | 0.1705 | 28,422 | 0.1662 |
| | Recent data test set of the known peptide subset | 0.6327 | 0.3008 | 4,938 | 0.1692 |
| | Recent data test set of the new CDR3 subset | 0.5267 | 0.1798 | 33,335 | 0.1660 |
| PanPepᵃ | Recent data test set | 0.5337 | 0.1897 | 30,221 | 0.1745 |
| | Recent data test set of the new peptide subset | 0.5359 | 0.1908 | 25,661 | 0.1739 |
| | Recent data test set of the known peptide subset | 0.5199 | 0.1852 | 4,560 | 0.1779 |
| | Recent data test set of the new CDR3 subset | 0.5374 | 0.1923 | 29,145 | 0.1752 |

The scores for the test set comprising only known CDR3s could not be computed as all the data records are positive.

However, at a classification threshold of 0.5, our model achieves a recall of 0.56, compared with 0.44 for NetTCR-2.0 and 0.59 for PanPep.
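The metrics in this table can be illustrated with a minimal sketch in plain Python. It is not the authors' evaluation code: the function names, the toy labels and scores, and the choice to treat `score >= threshold` as a positive prediction are all assumptions made for illustration. The sketch also shows why ROC AUC has no value when a subset contains only positive records, while recall at a fixed threshold can still be computed.

```python
from typing import Optional, Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> Optional[float]:
    """Rank-based ROC AUC; returns None when only one class is present."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        return None  # undefined: there are no positive/negative pairs to rank
    # Fraction of (positive, negative) pairs ranked correctly; ties count half.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(labels: Sequence[int], scores: Sequence[float]) -> Optional[float]:
    """Mean of precision@k over the ranks k of the positives (assumes no tied scores)."""
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits = 0
    precisions = []
    for k, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            precisions.append(hits / k)
    return sum(precisions) / len(precisions) if precisions else None


def recall_at(labels: Sequence[int], scores: Sequence[float], threshold: float = 0.5) -> Optional[float]:
    """Recall when records with score >= threshold are predicted positive (assumed convention)."""
    pos_scores = [s for y, s in zip(labels, scores) if y == 1]
    if not pos_scores:
        return None
    return sum(s >= threshold for s in pos_scores) / len(pos_scores)


# Toy data, invented for illustration only.
labels = [1, 0, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.2]
auc = roc_auc(labels, scores)
aps = average_precision(labels, scores)
rec = recall_at(labels, scores, threshold=0.5)

# A subset in which every record is positive: AUC is undefined, recall is not.
all_pos_auc = roc_auc([1, 1, 1], [0.7, 0.4, 0.6])
all_pos_recall = recall_at([1, 1, 1], [0.7, 0.4, 0.6], threshold=0.5)
```

When reading the APS column, it helps to remember that a ranking no better than chance has an expected average precision close to the positive rate, so the positive-rate column serves as a rough baseline for each row.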

ᵃ The datasets used for our model and NetTCR-2.0 were identical. The dataset used for PanPep differed because PanPep uses only the beta-chain CDR3. Removing duplicate beta-chain CDR3 entries from the test set therefore left fewer data records than were used for our model and NetTCR-2.0.
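The effect described in this footnote can be sketched as follows. The record layout (`cdr3a`, `cdr3b`, `peptide`, `label`), the function name, and the toy sequences are hypothetical, and the deduplication key is an assumption: the source does not state whether duplicates were identified per peptide or which copy was kept. Here, records that share a peptide and a beta-chain CDR3 are collapsed to the first occurrence once the alpha chain is dropped.

```python
from typing import Dict, List

Record = Dict[str, object]


def to_beta_only(records: List[Record]) -> List[Record]:
    """Drop the alpha chain and keep one record per (peptide, beta-chain CDR3)."""
    seen = set()
    kept: List[Record] = []
    for rec in records:
        key = (rec["peptide"], rec["cdr3b"])  # assumed deduplication key
        if key in seen:
            continue  # same beta chain and peptide, differing only in alpha chain
        seen.add(key)
        kept.append({"cdr3b": rec["cdr3b"], "peptide": rec["peptide"], "label": rec["label"]})
    return kept


# Toy records with invented placeholder sequences.
paired = [
    {"cdr3a": "CAVA", "cdr3b": "CASSA", "peptide": "PEP1", "label": 1},
    {"cdr3a": "CAVB", "cdr3b": "CASSA", "peptide": "PEP1", "label": 1},  # duplicate beta chain
    {"cdr3a": "CAVC", "cdr3b": "CASSB", "peptide": "PEP1", "label": 0},
    {"cdr3a": "CAVA", "cdr3b": "CASSA", "peptide": "PEP2", "label": 0},  # different peptide
]
beta_only = to_beta_only(paired)
pos_rate = sum(r["label"] for r in beta_only) / len(beta_only)
```

Because deduplication removes records unevenly across classes, both the record count and the positive rate can shift, which is consistent with the PanPep rows listing fewer records and a slightly different positive rate than the other two models.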