Table 1:

| Split | Max. #states | Model | GNN | Max. train length | Perplexity (↓) | Native seq. recovery (↑) | 2D self-consistency (EternaFold): scMCC (↑) | 3D self-consistency (RhoFold): scRMSD (↓) | 3D self-consistency (RhoFold): scTM-score (↑) | 3D self-consistency (RhoFold): scGDT_TS (↑) |
|---|---|---|---|---|---|---|---|---|---|---|
| Single-state split | 1 | AR | Equiv | 500 | 1.77±0.07 | 0.438±0.01 | 0.624±0.07 | 13.01±1.18 (0.5%) | 0.21±0.0 (14.3%) | 0.22±0.0 (12.7%) |
| | 1 | AR | Equiv | 1000 | 1.73±0.08 | 0.453±0.01 | 0.648±0.01 | 13.10±0.58 (1.0%) | 0.20±0.0 (10.8%) | 0.21±0.0 (10.6%) |
| | 1 | AR | Equiv | 2500 | 1.41±0.01 | 0.493±0.01 | 0.633±0.03 | 11.76±0.91 (1.4%) | 0.27±0.0 (28.8%) | 0.27±0.0 (28.0%) |
| | 1 | AR | Equiv | 5000 | 1.29±0.02 | 0.530±0.01 | 0.585±0.03 | 11.70±0.56 (1.3%) | 0.26±0.0 (24.8%) | 0.25±0.0 (20.1%) |
| | 1 | AR | Inv | 5000 | 1.32±0.04 | 0.549±0.00 | 0.612±0.02 | 11.50±0.64 (1.9%) | 0.28±0.0 (32.1%) | 0.28±0.0 (26.2%) |
| | 1 | NAR | Inv | 5000 | 1.54±0.04 | 0.571±0.00 | 0.430±0.02 | 14.26±0.51 (1.3%) | 0.19±0.0 (15.9%) | 0.18±0.0 (12.7%) |
| | 1 | NAR | Equiv | 5000 | 1.46±0.06 | 0.584±0.00 | 0.473±0.02 | 13.04±0.88 (1.3%) | 0.23±0.0 (24.0%) | 0.22±0.0 (17.9%) |
| | 3 | AR | Equiv | 5000 | 1.23±0.05 | 0.539±0.01 | 0.620±0.01 | 11.47±1.05 (2.5%) | 0.28±0.0 (31.4%) | 0.28±0.0 (27.2%) |
| | 5 | AR | Equiv | 5000 | 1.25±0.01 | 0.539±0.02 | 0.596±0.03 | 11.90±1.00 (2.9%) | 0.27±0.0 (31.6%) | 0.26±0.0 (26.4%) |
| | | Groundtruth sequence prediction baseline | | | - | 1.000±0.00 | 0.686±0.00 | 5.23±0.07 (27.9%) | 0.56±0.0 (68.7%) | 0.55±0.0 (68.7%) |
| | | Random sequence prediction baseline | | | - | 0.251±0.00 | 0.012±0.00 | 24.40±0.34 (0.0%) | 0.04±0.0 (0.0%) | 0.02±0.0 (0.0%) |
| | | ViennaRNA 2D-only baseline | | | - | 0.259±0.00 | 0.611±0.00 | 20.34±0.10 (0.0%) | 0.07±0.0 (0.6%) | 0.07±0.0 (1.1%) |
| Multi-state split | 1 | AR | Equiv | 500 | 1.87±0.06 | 0.445±0.01 | 0.603±0.03 | 13.08±0.20 (3.5%) | 0.10±0.0 (1.2%) | 0.25±0.0 (20.7%) |
| | 1 | AR | Equiv | 1000 | 1.84±0.01 | 0.447±0.01 | 0.580±0.01 | 13.02±0.56 (2.3%) | 0.09±0.0 (0.9%) | 0.25±0.0 (20.4%) |
| | 1 | AR | Equiv | 2500 | 1.73±0.04 | 0.480±0.02 | 0.567±0.01 | 12.83±0.05 (3.4%) | 0.10±0.0 (1.9%) | 0.26±0.0 (21.2%) |
| | 1 | AR | Equiv | 5000 | 1.68±0.03 | 0.455±0.01 | 0.569±0.02 | 12.88±0.20 (4.1%) | 0.11±0.0 (1.6%) | 0.26±0.0 (22.6%) |
| | 1 | AR | Inv | 5000 | 1.72±0.01 | 0.463±0.01 | 0.559±0.03 | 13.09±0.27 (4.1%) | 0.10±0.0 (2.2%) | 0.27±0.0 (23.0%) |
| | 1 | NAR | Inv | 5000 | 2.01±0.04 | 0.457±0.01 | 0.461±0.01 | 14.06±0.23 (3.2%) | 0.08±0.0 (1.7%) | 0.23±0.0 (16.5%) |
| | 1 | NAR | Equiv | 5000 | 1.89±0.06 | 0.432±0.01 | 0.423±0.01 | 13.63±0.27 (3.6%) | 0.09±0.0 (1.2%) | 0.24±0.0 (18.3%) |
| | 3 | AR | Equiv | 5000 | 1.60±0.03 | 0.467±0.03 | 0.561±0.03 | 13.31±0.38 (3.4%) | 0.10±0.0 (2.6%) | 0.24±0.0 (19.0%) |
| | 5 | AR | Equiv | 5000 | 1.55±0.04 | 0.473±0.01 | 0.549±0.03 | 13.48±0.79 (3.3%) | 0.10±0.0 (3.0%) | 0.24±0.0 (20.2%) |
| | | Groundtruth sequence prediction baseline | | | - | 1.000±0.00 | 0.570±0.01 | 9.78±0.13 (10.3%) | 0.16±0.0 (11.7%) | 0.36±0.0 (36.7%) |
| | | Random sequence prediction baseline | | | - | 0.249±0.00 | 0.128±0.00 | 21.15±0.21 (0.9%) | 0.02±0.0 (0.0%) | 0.09±0.0 (3.3%) |
| | | ViennaRNA 2D-only baseline | | | - | 0.258±0.00 | 0.601±0.00 | 15.47±0.20 (2.4%) | 0.05±0.0 (0.2%) | 0.19±0.0 (15.2%) |
| All data | 1 | AR | Equiv | 5000 | 1.23±0.01 | 0.733±0.00 | 0.627±0.02 | 8.10±0.28 (20.7%) | 0.42±0.0 (46.1%) | 0.41±0.0 (43.0%) |
| | 2 | AR | Equiv | 5000 | 1.21±0.01 | 0.783±0.01 | 0.629±0.03 | 8.40±0.09 (19.1%) | 0.42±0.0 (47.8%) | 0.41±0.0 (41.7%) |
| | 3 | AR | Equiv | 5000 | 1.19±0.01 | 0.787±0.01 | 0.606±0.02 | 7.88±0.68 (20.5%) | 0.43±0.0 (47.4%) | 0.42±0.0 (44.0%) |
| | 5 | AR | Equiv | 5000 | 1.15±0.01 | 0.811±0.01 | 0.617±0.02 | 7.51±0.30 (20.7%) | 0.45±0.0 (50.2%) | 0.44±0.0 (46.7%) |
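
To make the "Perplexity" and "Native seq. recovery" columns concrete, here is a minimal sketch that assumes the standard definitions: recovery as the fraction of positions where the designed nucleotide matches the native one, and perplexity as the exponential of the mean negative log-probability assigned to the native nucleotides. The table does not show the evaluation code, so the function names and toy inputs below are hypothetical, not the authors' implementation.

```python
import math


def sequence_recovery(designed: str, native: str) -> float:
    """Fraction of positions where the designed nucleotide equals the native one.

    Assumed (standard) definition; not taken from the source.
    """
    if not native or len(designed) != len(native):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(d == n for d, n in zip(designed, native))
    return matches / len(native)


def perplexity(native_probs: list[float]) -> float:
    """exp(mean negative log-probability) over the native nucleotides.

    native_probs[i] is the probability a model assigns to the native
    nucleotide at position i (hypothetical input format).
    """
    if not native_probs:
        raise ValueError("need at least one probability")
    mean_nll = -sum(math.log(p) for p in native_probs) / len(native_probs)
    return math.exp(mean_nll)


# Toy example: 6 of 8 positions match the native sequence.
native = "GGCAUCGA"
designed = "GGCAACGU"
recovery = sequence_recovery(designed, native)

# A uniform guess over the four nucleotides (A, C, G, U) has perplexity 4.
uniform_ppl = perplexity([0.25] * len(native))
```

Under these definitions a uniform guess over four nucleotides has an expected recovery of 0.25 and a perplexity of 4, which gives a reference scale for the random-sequence baseline rows (recovery of about 0.25) and for model perplexities between 1 and 2.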