Table 2. Comparison of SPIRED-Fitness with other zero-shot predictors on ProteinGym single-mutation data
| Method | ρ | NDCG | Recall | AUC | MCC |
|---|---|---|---|---|---|
| *Structure-based* | | | | | |
| ProteinMPNN | 0.17 ± 0.12 | 0.69 ± 0.14 | 0.14 ± 0.06 | 0.59 ± 0.07 | 0.13 ± 0.08 |
| ESM-IF1 | 0.37 ± 0.18 | 0.72 ± 0.18 | 0.21 ± 0.12 | 0.70 ± 0.09 | 0.29 ± 0.14 |
| ProtSSN (k30_h1280) | 0.40 ± 0.17 | 0.74 ± 0.15 | 0.21 ± 0.10 | 0.72 ± 0.09 | 0.31 ± 0.14 |
| MIF-ST | 0.40 ± 0.14 | 0.76 ± 0.15 | 0.22 ± 0.11 | 0.72 ± 0.08 | 0.31 ± 0.12 |
| SaProt (650M_AF2) | 0.40 ± 0.18 | 0.74 ± 0.17 | 0.20 ± 0.10 | 0.73 ± 0.10 | 0.31 ± 0.15 |
| ProtSSN (ensemble) | 0.42 ± 0.16 | 0.75 ± 0.15 | 0.22 ± 0.12 | 0.73 ± 0.10 | 0.32 ± 0.14 |
| GVP-MSA | 0.44 ± 0.14 | 0.77 ± 0.15 | 0.20 ± 0.09 | 0.74 ± 0.07 | 0.33 ± 0.11 |
| *MSA-based* | | | | | |
| WaveNet | 0.33 ± 0.20 | 0.76 ± 0.14 | 0.19 ± 0.09 | 0.68 ± 0.11 | 0.25 ± 0.15 |
| Site-independent | 0.35 ± 0.16 | 0.74 ± 0.15 | 0.18 ± 0.09 | 0.70 ± 0.09 | 0.29 ± 0.14 |
| DeepSequence (single) | 0.40 ± 0.16 | 0.78 ± 0.14 | 0.22 ± 0.11 | 0.72 ± 0.08 | 0.31 ± 0.13 |
| EVmutation | 0.41 ± 0.14 | 0.78 ± 0.15 | 0.21 ± 0.09 | 0.72 ± 0.08 | 0.31 ± 0.13 |
| DeepSequence (ensemble) | 0.42 ± 0.16 | 0.78 ± 0.15 | 0.21 ± 0.10 | 0.73 ± 0.08 | 0.33 ± 0.13 |
| MSA transformer (single) | 0.42 ± 0.16 | 0.78 ± 0.14 | 0.20 ± 0.11 | 0.73 ± 0.09 | 0.32 ± 0.13 |
| MSA transformer (ensemble) | 0.43 ± 0.16 | 0.78 ± 0.14 | 0.22 ± 0.11 | 0.74 ± 0.09 | 0.33 ± 0.12 |
| EVE (single) | 0.44 ± 0.15 | 0.79 ± 0.15 | 0.22 ± 0.10 | 0.74 ± 0.08 | 0.34 ± 0.13 |
| Tranception L | 0.44 ± 0.14 | 0.79 ± 0.13 | 0.21 ± 0.10 | 0.74 ± 0.08 | 0.34 ± 0.12 |
| EVE (ensemble) | 0.44 ± 0.16 | 0.79 ± 0.15 | 0.22 ± 0.09 | 0.74 ± 0.08 | 0.35 ± 0.14 |
| GEMME | 0.46 ± 0.15 | 0.79 ± 0.15 | 0.20 ± 0.09 | 0.75 ± 0.08 | 0.35 ± 0.13 |
| TranceptEVE L | 0.46 ± 0.15 | 0.79 ± 0.15 | 0.22 ± 0.11 | 0.75 ± 0.08 | 0.35 ± 0.12 |
| *Single sequence-based* | | | | | |
| UniRep | 0.15 ± 0.19 | 0.64 ± 0.17 | 0.13 ± 0.07 | 0.59 ± 0.11 | 0.12 ± 0.14 |
| ProtGPT2 | 0.16 ± 0.14 | 0.66 ± 0.17 | 0.13 ± 0.05 | 0.59 ± 0.08 | 0.13 ± 0.11 |
| ESM-1b | 0.35 ± 0.21 | 0.73 ± 0.18 | 0.17 ± 0.08 | 0.70 ± 0.11 | 0.28 ± 0.16 |
| CARP (640M) | 0.36 ± 0.20 | 0.74 ± 0.16 | 0.19 ± 0.11 | 0.70 ± 0.11 | 0.28 ± 0.16 |
| RITA XL | 0.36 ± 0.17 | 0.75 ± 0.15 | 0.19 ± 0.09 | 0.69 ± 0.11 | 0.28 ± 0.14 |
| ESM-1v (single) | 0.37 ± 0.21 | 0.73 ± 0.18 | 0.19 ± 0.09 | 0.70 ± 0.11 | 0.29 ± 0.15 |
| ESM-1v (ensemble) | 0.39 ± 0.20 | 0.75 ± 0.16 | 0.21 ± 0.11 | 0.72 ± 0.11 | 0.31 ± 0.15 |
| ProGen2 XL | 0.39 ± 0.16 | 0.77 ± 0.15 | 0.19 ± 0.09 | 0.72 ± 0.08 | 0.30 ± 0.12 |
| ESM-2 (15B) | 0.41 ± 0.20 | 0.76 ± 0.16 | 0.20 ± 0.09 | 0.72 ± 0.11 | 0.31 ± 0.16 |
| VESPA | 0.42 ± 0.15 | 0.77 ± 0.15 | 0.19 ± 0.09 | 0.73 ± 0.08 | 0.33 ± 0.12 |
| SPIRED-Fitness | 0.45 ± 0.16 | 0.77 ± 0.15 | 0.20 ± 0.10 | 0.75 ± 0.09 | 0.35 ± 0.13 |
All metrics are calculated within each protein and then reported as mean ± standard deviation over all tested proteins. The prediction methods are grouped into categories (shown in italic), and for each metric the method with the best mean performance within each category is highlighted in bold. Here, ρ denotes the Spearman correlation coefficient and "Recall" denotes the top 10% recall. All evaluation metrics are defined in the "Methods" section. Results for GVP-MSA were calculated using the zero-shot model provided in the paper [44], whereas results for the other methods were obtained directly from the official ProteinGym website.
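The per-protein aggregation described in the note can be illustrated with a minimal sketch. It is not the authors' evaluation code: the function names, the toy scores, the use of the population standard deviation, and the reading of "top 10% recall" as the overlap between the predicted and measured top deciles are all assumptions made for illustration. The definitions actually used are those in the "Methods" section.

```python
from statistics import mean, pstdev


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over a run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(pred, true):
    """Spearman rho, computed as the Pearson correlation of the rank vectors."""
    rp, rt = average_ranks(pred), average_ranks(true)
    mp, mt = mean(rp), mean(rt)
    cov = sum((a - mp) * (b - mt) for a, b in zip(rp, rt))
    var_p = sum((a - mp) ** 2 for a in rp)
    var_t = sum((b - mt) ** 2 for b in rt)
    return cov / (var_p * var_t) ** 0.5


def top_recall(pred, true, frac=0.10):
    """Share of the measured top-`frac` variants also in the predicted top-`frac`.

    Assumed definition; see the paper's Methods for the one actually used.
    """
    k = max(1, int(len(true) * frac))
    top_true = set(sorted(range(len(true)), key=lambda i: true[i], reverse=True)[:k])
    top_pred = set(sorted(range(len(pred)), key=lambda i: pred[i], reverse=True)[:k])
    return len(top_true & top_pred) / k


def summarize(per_protein_values):
    """Mean and (population, by assumption) standard deviation over proteins."""
    return mean(per_protein_values), pstdev(per_protein_values)


# Made-up (predicted, measured) scores for two hypothetical proteins;
# these are NOT ProteinGym data.
assays = {
    "protein_1": ([0.9, 0.1, 0.5, 0.7, 0.3], [1.2, 0.2, 0.4, 0.9, 0.6]),
    "protein_2": ([0.2, 0.8, 0.6, 0.4, 0.1], [0.1, 0.7, 0.9, 0.3, 0.2]),
}

# Step 1: compute the metric within each protein.
rhos = [spearman(pred, true) for pred, true in assays.values()]
# Step 2: report mean ± standard deviation over proteins.
rho_mean, rho_sd = summarize(rhos)
print(f"rho = {rho_mean:.2f} ± {rho_sd:.2f}")
```

Computing each metric per protein before averaging gives every protein equal weight regardless of how many variants its assay contains, which is why the reported spread reflects variation across proteins rather than across individual mutations.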