[Preprint]. 2024 May 25:2024.03.31.587283. [Version 2] doi: 10.1101/2024.03.31.587283

Table 1:

Ablation study and aggregated benchmark results for gRNAde. We report metrics averaged over 100 test set samples, with standard deviations across 3 consistent random seeds. The percentages in brackets next to the 3D self-consistency scores give the share of designed samples that fall within the ‘designability’ thresholds (scRMSD≤2Å, scTM≥0.45, scGDT≥0.5).

Self-consistency metrics: scMCC is 2D (EternaFold); scRMSD, scTM-score and scGDT_TS are 3D (RhoFold).

Split | Max. #states | Model | GNN | Max. train length | Perplexity (↓) | Native seq. recovery (↑) | scMCC (↑) | scRMSD (Å, ↓) | scTM-score (↑) | scGDT_TS (↑)
--- | --- | --- | --- | --- | --- | --- | --- | --- | --- | ---
Single-state split | 1 | AR | Equiv | 500 | 1.77±0.07 | 0.438±0.01 | 0.624±0.07 | 13.01±1.18 (0.5%) | 0.21±0.0 (14.3%) | 0.22±0.0 (12.7%)
Single-state split | 1 | AR | Equiv | 1000 | 1.73±0.08 | 0.453±0.01 | 0.648±0.01 | 13.10±0.58 (1.0%) | 0.20±0.0 (10.8%) | 0.21±0.0 (10.6%)
Single-state split | 1 | AR | Equiv | 2500 | 1.41±0.01 | 0.493±0.01 | 0.633±0.03 | 11.76±0.91 (1.4%) | 0.27±0.0 (28.8%) | 0.27±0.0 (28.0%)
Single-state split | 1 | AR | Equiv | 5000 | 1.29±0.02 | 0.530±0.01 | 0.585±0.03 | 11.70±0.56 (1.3%) | 0.26±0.0 (24.8%) | 0.25±0.0 (20.1%)
Single-state split | 1 | AR | Inv | 5000 | 1.32±0.04 | 0.549±0.00 | 0.612±0.02 | 11.50±0.64 (1.9%) | 0.28±0.0 (32.1%) | 0.28±0.0 (26.2%)
Single-state split | 1 | NAR | Inv | 5000 | 1.54±0.04 | 0.571±0.00 | 0.430±0.02 | 14.26±0.51 (1.3%) | 0.19±0.0 (15.9%) | 0.18±0.0 (12.7%)
Single-state split | 1 | NAR | Equiv | 5000 | 1.46±0.06 | 0.584±0.00 | 0.473±0.02 | 13.04±0.88 (1.3%) | 0.23±0.0 (24.0%) | 0.22±0.0 (17.9%)
Single-state split | 3 | AR | Equiv | 5000 | 1.23±0.05 | 0.539±0.01 | 0.620±0.01 | 11.47±1.05 (2.5%) | 0.28±0.0 (31.4%) | 0.28±0.0 (27.2%)
Single-state split | 5 | AR | Equiv | 5000 | 1.25±0.01 | 0.539±0.02 | 0.596±0.03 | 11.90±1.00 (2.9%) | 0.27±0.0 (31.6%) | 0.26±0.0 (26.4%)
Single-state split | Groundtruth sequence prediction baseline | | | | - | 1.000±0.00 | 0.686±0.00 | 5.23±0.07 (27.9%) | 0.56±0.0 (68.7%) | 0.55±0.0 (68.7%)
Single-state split | Random sequence prediction baseline | | | | - | 0.251±0.00 | 0.012±0.00 | 24.40±0.34 (0.0%) | 0.04±0.0 (0.0%) | 0.02±0.0 (0.0%)
Single-state split | ViennaRNA 2D-only baseline | | | | - | 0.259±0.00 | 0.611±0.00 | 20.34±0.10 (0.0%) | 0.07±0.0 (0.6%) | 0.07±0.0 (1.1%)
Multi-state split | 1 | AR | Equiv | 500 | 1.87±0.06 | 0.445±0.01 | 0.603±0.03 | 13.08±0.20 (3.5%) | 0.10±0.0 (1.2%) | 0.25±0.0 (20.7%)
Multi-state split | 1 | AR | Equiv | 1000 | 1.84±0.01 | 0.447±0.01 | 0.580±0.01 | 13.02±0.56 (2.3%) | 0.09±0.0 (0.9%) | 0.25±0.0 (20.4%)
Multi-state split | 1 | AR | Equiv | 2500 | 1.73±0.04 | 0.480±0.02 | 0.567±0.01 | 12.83±0.05 (3.4%) | 0.10±0.0 (1.9%) | 0.26±0.0 (21.2%)
Multi-state split | 1 | AR | Equiv | 5000 | 1.68±0.03 | 0.455±0.01 | 0.569±0.02 | 12.88±0.20 (4.1%) | 0.11±0.0 (1.6%) | 0.26±0.0 (22.6%)
Multi-state split | 1 | AR | Inv | 5000 | 1.72±0.01 | 0.463±0.01 | 0.559±0.03 | 13.09±0.27 (4.1%) | 0.10±0.0 (2.2%) | 0.27±0.0 (23.0%)
Multi-state split | 1 | NAR | Inv | 5000 | 2.01±0.04 | 0.457±0.01 | 0.461±0.01 | 14.06±0.23 (3.2%) | 0.08±0.0 (1.7%) | 0.23±0.0 (16.5%)
Multi-state split | 1 | NAR | Equiv | 5000 | 1.89±0.06 | 0.432±0.01 | 0.423±0.01 | 13.63±0.27 (3.6%) | 0.09±0.0 (1.2%) | 0.24±0.0 (18.3%)
Multi-state split | 3 | AR | Equiv | 5000 | 1.60±0.03 | 0.467±0.03 | 0.561±0.03 | 13.31±0.38 (3.4%) | 0.10±0.0 (2.6%) | 0.24±0.0 (19.0%)
Multi-state split | 5 | AR | Equiv | 5000 | 1.55±0.04 | 0.473±0.01 | 0.549±0.03 | 13.48±0.79 (3.3%) | 0.10±0.0 (3.0%) | 0.24±0.0 (20.2%)
Multi-state split | Groundtruth sequence prediction baseline | | | | - | 1.000±0.00 | 0.570±0.01 | 9.78±0.13 (10.3%) | 0.16±0.0 (11.7%) | 0.36±0.0 (36.7%)
Multi-state split | Random sequence prediction baseline | | | | - | 0.249±0.00 | 0.128±0.00 | 21.15±0.21 (0.9%) | 0.02±0.0 (0.0%) | 0.09±0.0 (3.3%)
Multi-state split | ViennaRNA 2D-only baseline | | | | - | 0.258±0.00 | 0.601±0.00 | 15.47±0.20 (2.4%) | 0.05±0.0 (0.2%) | 0.19±0.0 (15.2%)
All data | 1 | AR | Equiv | 5000 | 1.23±0.01 | 0.733±0.00 | 0.627±0.02 | 8.10±0.28 (20.7%) | 0.42±0.0 (46.1%) | 0.41±0.0 (43.0%)
All data | 2 | AR | Equiv | 5000 | 1.21±0.01 | 0.783±0.01 | 0.629±0.03 | 8.40±0.09 (19.1%) | 0.42±0.0 (47.8%) | 0.41±0.0 (41.7%)
All data | 3 | AR | Equiv | 5000 | 1.19±0.01 | 0.787±0.01 | 0.606±0.02 | 7.88±0.68 (20.5%) | 0.43±0.0 (47.4%) | 0.42±0.0 (44.0%)
All data | 5 | AR | Equiv | 5000 | 1.15±0.01 | 0.811±0.01 | 0.617±0.02 | 7.51±0.30 (20.7%) | 0.45±0.0 (50.2%) | 0.44±0.0 (46.7%)
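
The caption's "mean ± standard deviation (percentage within threshold)" format can be illustrated with a minimal sketch. This is not the authors' evaluation code. The function names, the use of the population standard deviation across seeds, and the averaging of per-seed percentages are all assumptions made for illustration, and the toy scores are invented values, not results from the table. Only the three thresholds come from the caption.

```python
from statistics import mean, pstdev

# Designability thresholds quoted in the Table 1 caption.
THRESHOLDS = {"scRMSD": 2.0, "scTM": 0.45, "scGDT": 0.5}


def within_threshold(metric, value):
    """Return True if a score meets the designability threshold for its metric."""
    if metric == "scRMSD":
        # scRMSD is in angstroms and lower is better.
        return value <= THRESHOLDS[metric]
    # scTM and scGDT are higher-is-better scores.
    return value >= THRESHOLDS[metric]


def designable_percentage(metric, scores):
    """Percentage of designed samples whose score is within the threshold."""
    hits = sum(within_threshold(metric, s) for s in scores)
    return 100.0 * hits / len(scores)


def summarize_over_seeds(metric, per_seed_scores):
    """Summarize one metric as (mean, std across seeds, mean % designable).

    Assumption: each inner list holds the scores from one random seed; the
    mean is taken per seed first, and the spread is the population standard
    deviation of those per-seed means.
    """
    seed_means = [mean(scores) for scores in per_seed_scores]
    seed_pcts = [designable_percentage(metric, scores) for scores in per_seed_scores]
    return mean(seed_means), pstdev(seed_means), mean(seed_pcts)


# Invented toy scRMSD values for 3 seeds (illustration only, not paper data).
toy_scrmsd = [
    [1.5, 3.0, 2.0, 10.0],
    [1.0, 4.0, 2.5, 8.5],
    [2.0, 2.0, 6.0, 6.0],
]
avg, std, pct = summarize_over_seeds("scRMSD", toy_scrmsd)
print(f"{avg:.2f}±{std:.2f} ({pct:.1f}%)")
```

Under these assumptions a cell such as 11.70±0.56 (1.3%) would read as: mean scRMSD of 11.70 Å, a spread of 0.56 Å across seeds, and 1.3% of designed samples at or below 2 Å.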