2024 Mar 20;40(4):btae157. doi: 10.1093/bioinformatics/btae157

Table 1.

Datasets of protein sequences that were used in the development of the tool.

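| Dataset | Subset | All sequences | Class 0 sequences | Class 1 sequences | Max. length |
|---|---|---|---|---|---|
| TemStaPro-Minor-30 | Cross-validation | 239 629 | 146 657 | 92 972 | 1022 |
| TemStaPro-Minor-30 | Testing | 41 930 | 25 580 | 16 350 | 1022 |
| TemStaPro-Major-30 | Training | 943 605 | 879 799 | 63 806 | 9841 |
| TemStaPro-Major-30 | Validation | 230 985 | 215 167 | 15 818 | 9387 |
| TemStaPro-Major-30 | Testing | 210 137 | 196 072 | 14 065 | 13 394 |
| SAPPHIRE | Testing | 742 | 371 | 371 | 1643 |
| iThermo | Testing | 562 | 289 | 273 | 3567 |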
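
The counts in Table 1 can be sanity-checked and summarized with a few lines of Python. The sketch below is not part of the tool itself. It transcribes the table rows by hand, confirms that the two class counts add up to the "All sequences" column, and reports the share of class 1 sequences in each subset. The names `ROWS` and `class1_fraction` are hypothetical and introduced only for this illustration.

```python
# Rows transcribed by hand from Table 1:
# (dataset, subset, all sequences, class 0 sequences, class 1 sequences)
ROWS = [
    ("TemStaPro-Minor-30", "Cross-validation", 239629, 146657, 92972),
    ("TemStaPro-Minor-30", "Testing", 41930, 25580, 16350),
    ("TemStaPro-Major-30", "Training", 943605, 879799, 63806),
    ("TemStaPro-Major-30", "Validation", 230985, 215167, 15818),
    ("TemStaPro-Major-30", "Testing", 210137, 196072, 14065),
    ("SAPPHIRE", "Testing", 742, 371, 371),
    ("iThermo", "Testing", 562, 289, 273),
]


def class1_fraction(row):
    """Return the share of class 1 sequences in one subset (hypothetical helper)."""
    _, _, total, class0, class1 = row
    # The two class counts should add up to the "All sequences" column.
    if class0 + class1 != total:
        raise ValueError(f"class counts do not sum to total for {row[0]} / {row[1]}")
    return class1 / total


if __name__ == "__main__":
    for row in ROWS:
        dataset, subset = row[0], row[1]
        print(f"{dataset:<20} {subset:<17} {class1_fraction(row):.3f}")
```

Running the script prints one line per subset, which makes the difference in class balance between the subsets easy to compare at a glance.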