2023 Dec 18;3:1274599. doi: 10.3389/fbinf.2023.1274599

TABLE 5.

Results on the recent data test set. APS stands for average precision score.

Model	Dataset	ROC AUC	APS	No. of data records	Positive rate
Cross-TCR-interpreter	Recent data test set	0.5362	0.1855	33,360	0.1667
	Recent data test set of the new peptide subset	0.5085	0.1707	28,422	0.1662
	Recent data test set of the known peptide subset	0.6598	0.3318	4,938	0.1692
	Recent data test set of the new CDR3 subset	0.5355	0.1844	33,335	0.1660
NetTCR-2.0	Recent data test set	0.5274	0.1808	33,360	0.1667
	Recent data test set of the new peptide subset	0.5113	0.1705	28,422	0.1662
	Recent data test set of the known peptide subset	0.6327	0.3008	4,938	0.1692
	Recent data test set of the new CDR3 subset	0.5267	0.1798	33,335	0.1660
PanPep ^a	Recent data test set	0.5337	0.1897	30,221	0.1745
	Recent data test set of the new peptide subset	0.5359	0.1908	25,661	0.1739
	Recent data test set of the known peptide subset	0.5199	0.1852	4,560	0.1779
	Recent data test set of the new CDR3 subset	0.5374	0.1923	29,145	0.1752

Scores for the test set comprising only known CDR3s could not be computed because all of its data records are positive.

However, at a threshold of 0.5, our model achieves a recall of 0.56, compared with 0.44 for NetTCR-2.0 and 0.59 for PanPep.
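The distinction drawn in the two notes above can be illustrated with a minimal sketch. ROC AUC compares the scores of positive records against the scores of negative records, so it is undefined for a subset with no negatives, whereas recall at a fixed threshold needs only the positives. The function names, the toy scores, and the choice to count a score equal to the threshold as a positive prediction are assumptions made for illustration; none of them come from the three models in the table.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """ROC AUC as the fraction of (positive, negative) pairs ranked correctly.

    Tied pairs count as half. Raises ValueError when only one class is
    present, because there are then no pairs to compare.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("ROC AUC is undefined when only one class is present")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def recall_at(labels: Sequence[int], scores: Sequence[float],
              threshold: float = 0.5) -> float:
    """Fraction of positive records whose score reaches the threshold.

    Assumption: a score equal to the threshold counts as a positive call.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    if not pos:
        raise ValueError("recall is undefined without positive records")
    return sum(s >= threshold for s in pos) / len(pos)


# Toy subset in which every record is positive (hypothetical scores).
all_positive_labels = [1, 1, 1, 1, 1]
all_positive_scores = [0.9, 0.7, 0.4, 0.2, 0.6]

# Recall is still defined here: three of the five scores reach 0.5.
toy_recall = recall_at(all_positive_labels, all_positive_scores)

# ROC AUC is not defined for the same subset.
try:
    roc_auc(all_positive_labels, all_positive_scores)
    auc_defined = True
except ValueError:
    auc_defined = False
```

With both classes present, `roc_auc` returns a value as usual; the single-class case is the only one that fails.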

^a The datasets used for our model and NetTCR-2.0 were identical. The dataset used for PanPep differed because PanPep uses only the CDR3 beta chain. Consequently, removing duplicate beta chain CDR3s from the test set left fewer data records than were used for our model and NetTCR-2.0.
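The reduction in record count described in this note can be sketched as follows. This is only an illustration under stated assumptions: the record layout, the field names (`cdr3a`, `cdr3b`, `peptide`, `label`), the invented sequences, and the choice of deduplication key are all hypothetical, since the source states only that duplicates of the beta chain CDR3 were removed.

```python
from typing import Dict, List, Sequence

Record = Dict[str, object]


def drop_beta_duplicates(
    records: Sequence[Record],
    key_fields: Sequence[str] = ("cdr3b", "peptide"),  # assumed key, not from the source
) -> List[Record]:
    """Keep the first record for each key once the alpha chain is ignored."""
    seen = set()
    kept: List[Record] = []
    for rec in records:
        key = tuple(rec[field] for field in key_fields)
        if key not in seen:
            seen.add(key)
            kept.append(rec)
    return kept


# Invented toy records: the first two differ only in the alpha chain.
toy_records: List[Record] = [
    {"cdr3a": "CAVSA", "cdr3b": "CASSL", "peptide": "PEPA", "label": 1},
    {"cdr3a": "CAVRD", "cdr3b": "CASSL", "peptide": "PEPA", "label": 1},
    {"cdr3a": "CAVSA", "cdr3b": "CASSQ", "peptide": "PEPA", "label": 0},
]

# A model that sees only the beta chain cannot tell the first two apart,
# so one of them is dropped and the test set shrinks.
deduplicated = drop_beta_duplicates(toy_records)
```

Passing `key_fields=("cdr3b",)` instead would deduplicate on the beta chain alone, which removes more records; which key the authors used is not stated in the note.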