Table A1. Single-dataset classifiers, no Q.
Training data ↓ / Test data → | s Acc. | s AUC | xl Acc. | xl AUC | s-k Acc. | s-k AUC | xl-k Acc. | xl-k AUC | GPT3 Acc. | GPT3 AUC | Grover Acc. | Grover AUC |
---|---|---|---|---|---|---|---|---|---|---|---|---|
s | 0.894 | 0.962 | 0.729 | 0.838 | 0.486 | 0.312 | 0.471 | 0.281 | 0.512 | 0.491 | 0.484 | 0.451 |
xl | 0.867 | 0.957 | 0.777 | 0.864 | 0.443 | 0.311 | 0.427 | 0.289 | 0.410 | 0.415 | 0.462 | 0.449 |
s-k | 0.492 | 0.275 | 0.486 | 0.335 | 0.917 | 0.972 | 0.800 | 0.903 | 0.617 | 0.775 | 0.574 | 0.732 |
xl-k | 0.454 | 0.174 | 0.457 | 0.277 | 0.887 | 0.959 | 0.837 | 0.917 | 0.622 | 0.724 | 0.566 | 0.684 |
GPT3 | 0.445 | 0.266 | 0.458 | 0.350 | 0.703 | 0.791 | 0.624 | 0.705 | 0.739 | 0.828 | 0.585 | 0.629 |
Grover | 0.386 | 0.265 | 0.444 | 0.404 | 0.705 | 0.755 | 0.675 | 0.719 | 0.537 | 0.526 | 0.683 | 0.760 |