Table 1.
Dataset | Total Variants (Proteins) | Stabilizing Variants (Proteins) | Destabilizing Variants (Proteins) | Additional Details |
---|---|---|---|---|
Broom2017 [16] | 605 (58) | 147 (37) | 458 (54) | Unique Variants/Background Variants |
Cao Test [17] | 276 (37) | 79 (21) | 197 (35) | Replicated Variants |
Cao Train [17] | 5,444 (204) | 1,233 (150) | 4,211 (185) | Replicated Variants |
Fold-X [18] | 964 (38) | 110 (25) | 854 (36) | Unique Variants |
Myoglobin [19] | 134 (1) | 36 (1) | 98 (1) | One Protein |
p53 [20] | 42 (1) | 11 (1) | 31 (1) | One Protein |
PTmul [21] | 914 (90) | 310 (57) | 604 (68) | Unique Variants/Multiple Variants |
Q3421 [22] | 3,421 (148) | 763 (114) | 2,658 (131) | Unique Variants/Averaged ΔΔG |
S1615 [23] | 1,615 (41) | 449 (35) | 1,166 (35) | Unique Variants |
S1676 [24] | 1,676 (95) | 453 (53) | 1,223 (62) | Unique Variants/Averaged ΔΔG |
S1859 [25] | 1,859 (64) | 583 (48) | 1,276 (55) | Replicated Variants/Averaged ΔΔG |
S1925 [20] | 1,925 (55) | 582 (42) | 1,343 (48) | Replicated Variants |
S1948 [26] | 1,948 (58) | 592 (45) | 1,356 (50) | Replicated Variants |
S2156 [27] | 2,156 (84) | 472 (61) | 1,684 (68) | Unique Variants/Averaged ΔΔG |
S238 [28] | 238 (25) | 45 (16) | 193 (20) | Unique Variants/Subset of S1948 |
S2648 [28] | 2,648 (131) | 602 (96) | 2,046 (118) | Unique Variants/Averaged ΔΔG |
S3366 [29] | 3,366 (130) | 836 (103) | 2,530 (110) | Unique/Single and Multiple Variants |
S350 [28] | 350 (67) | 90 (35) | 260 (57) | Unique Variants/Subset of S2648 |
S388 [23] | 388 (17) | 48 (12) | 340 (15) | Unique Variants/Physiological Conditions |
S3568 [30] | 3,568 (154) | 947 (110) | 2,621 (138) | Replicated Variants |
S630 [30] | 630 (39) | 467 (26) | 163 (32) | Replicated Variants |
Ssym* [11] | 342 (15) | 90 (10) | 251 (13) | Unique/Symmetric Variants |
VariBench [31] | 1,564 (89) | 436 (70) | 1,128 (78) | Unique Variants |
VariBench3D [31] | 1,423 (79) | 382 (60) | 1,041 (68) | Variants with available structures from [31] |
Unique Variants: Only one ΔΔG value is included for each variant. Replicated Variants: Multiple measurements for the same variant are included. Averaged ΔΔG: Multiple ΔΔG values for the same variant are replaced with their average. Multiple Variants: The dataset includes data for proteins with multiple-site variants. Background Variants: The initial protein used as a reference for calculating the ΔGs is different from the wild-type. Physiological Conditions: Temperature 20–40 °C and pH 6–8. *Reported data only for direct variants.
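
As an illustration of the "Averaged ΔΔG" convention defined above, the following minimal Python sketch collapses replicated measurements into one mean value per variant. The function name `average_replicates`, the `(protein_id, variant, ddg)` record layout, and the example values are hypothetical and are not taken from any of the listed datasets.

```python
from collections import defaultdict
from statistics import mean


def average_replicates(records):
    """Collapse replicated measurements to one averaged ddG per variant.

    records: iterable of (protein_id, variant, ddg) tuples (assumed layout).
    Returns a dict mapping (protein_id, variant) to the mean ddG.
    """
    grouped = defaultdict(list)
    for protein_id, variant, ddg in records:
        # Replicates share the same protein and the same variant label.
        grouped[(protein_id, variant)].append(ddg)
    return {key: mean(values) for key, values in grouped.items()}


# Hypothetical example: two replicates of one variant plus one unique variant.
records = [
    ("P1", "A10G", -1.0),
    ("P1", "A10G", -2.0),
    ("P1", "L25V", 0.5),
]
averaged = average_replicates(records)
```

A dataset processed this way has the "Unique Variants/Averaged ΔΔG" form: each variant appears once, and its value is the mean of the original replicates.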