Table 1.
The impact of fine-tuning and initialization on downstream model performance.
| | Remote Homology | | | Fluorescence | | | Stability | | |
|---|---|---|---|---|---|---|---|---|---|
| | Resnet | LSTM | Trans | Resnet | LSTM | Trans | Resnet | LSTM | Trans |
| Pre+Fix | 0.27 | 0.37 | 0.27 | 0.23 | 0.74 | 0.48 | 0.65 | 0.70 | 0.62 |
| Pre+Fin | 0.17 | 0.26 | 0.21 | 0.21 | 0.67 | 0.68 | 0.73 | 0.69 | 0.73 |
| Rng+Fix | 0.03 | 0.10 | 0.04 | 0.25 | 0.63 | 0.14 | 0.21 | 0.61 | – |
| Rng+Fin | 0.10 | 0.12 | 0.09 | −0.28 | 0.21 | 0.22 | 0.61 | 0.28 | −0.06 |
| Baseline | 0.09 (Accuracy) | | | 0.14 (Correlation) | | | 0.19 (Correlation) | | |
The embedding models were either randomly initialized (Rng) or pre-trained (Pre), and subsequently either fixed (Fix) or fine-tuned to the task (Fin). The baseline is a simple one-hot encoding of the sequence. Although fine-tuning is beneficial on some task/model combinations, we see clear signs of overfitting in the majority of cases (best results in bold).
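
To make the baseline concrete, here is a minimal sketch of a one-hot encoding of a protein sequence. The 20-letter amino-acid alphabet, the all-zero treatment of unknown residues, and the names `AMINO_ACIDS`, `AA_INDEX`, and `one_hot_encode` are illustrative assumptions; the source does not specify the alphabet, padding, or how the encoding is fed to the downstream model.

```python
# Assumed alphabet: the 20 standard amino acids (not specified in the source).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
AA_INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}


def one_hot_encode(sequence):
    """Return one 0/1 row of length 20 per residue in the sequence."""
    encoded = []
    for residue in sequence:
        row = [0] * len(AMINO_ACIDS)
        index = AA_INDEX.get(residue)
        if index is not None:
            row[index] = 1
        # Assumption: residues outside the alphabet are left as all-zero rows.
        encoded.append(row)
    return encoded


# Hypothetical three-residue sequence, used only for illustration.
features = one_hot_encode("MKV")
```

Each residue maps to a vector with a single non-zero entry, so this representation carries no learned information about similarity between amino acids, which is what makes it a useful reference point for the learned embeddings.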