Top-K single-step retrosynthesis results on USPTO-50K (top), USPTO-MIT (middle), and USPTO-FULL (bottom) datasets.
| Category | Model | K = 1 | 3 | 5 | 10 | 20 | 50 |
|---|---|---|---|---|---|---|---|
| USPTO-50K top-K accuracy (%) | | | | | | | |
| Template-based | Retrosim<sup>5</sup> | 37.3 | 54.7 | 63.3 | 74.1 | 82.0 | 85.3 |
| | Neuralsym<sup>6</sup> | 44.4 | 65.3 | 72.4 | 78.9 | 82.2 | 83.1 |
| | GLN<sup>7</sup> | 52.5 | 69.0 | 75.6 | 83.7 | 89.0 | 92.4 |
| | LocalRetro<sup>8</sup> | 53.4 | 77.5 | 85.9 | 92.4 | — | 97.7 |
| Semi-template | G2Gs<sup>20</sup> | 48.9 | 67.6 | 72.5 | 75.5 | — | — |
| | GraphRetro<sup>21</sup> | 53.7 | 68.3 | 72.2 | 75.5 | — | — |
| | RetroXpert<sup>15</sup> | 50.4 | 61.1 | 62.3 | 63.4 | 63.9 | 64.0 |
| | RetroPrime<sup>16</sup> | 51.4 | 70.8 | 74.0 | 76.1 | — | — |
| | Ours<sup>a</sup> | 49.1 ± 0.42 | 68.4 ± 0.53 | 75.8 ± 0.62 | 82.2 ± 0.72 | 85.1 ± 0.81 | 88.7 ± 0.88 |
| Template-free | Liu's Seq2seq<sup>11</sup> | 37.4 | 52.4 | 57.0 | 61.7 | 65.9 | 70.7 |
| | Levenshtein<sup>39</sup> | 41.5 | 48.1 | 50.0 | 51.4 | — | — |
| | GTA<sup>18</sup> | 51.1 ± 0.29 | 67.6 ± 0.22 | 74.8 ± 0.36 | 81.6 ± 0.22 | — | — |
| | Dual-TF<sup>33</sup> | 53.3 | 69.7 | 73.0 | 75.0 | — | — |
| | MEGAN<sup>22</sup> | 48.1 | 70.7 | 78.4 | 86.1 | 90.3 | 93.2 |
| | Tied transformer<sup>19</sup> | 47.1 | 67.2 | 73.5 | 78.5 | — | — |
| | AT<sup>17</sup> | 53.5 | — | 81.0 | 85.7 | — | — |
| | Ours<sup>b</sup> | 56.3 ± 0.15 | 79.2 ± 0.28 | 86.2 ± 0.34 | 91.0 ± 0.46 | 93.1 ± 0.48 | 94.6 ± 0.56 |
| | MEGAN<sup>22</sup> (MaxFrag) | 54.2 | 75.7 | 83.1 | 89.2 | 92.7 | 95.1 |
| | Tied transformer<sup>19</sup> (MaxFrag) | 51.8 | 72.5 | 78.2 | 82.4 | — | — |
| | AT<sup>17</sup> (MaxFrag) | 58.5 | — | 85.4 | 90.0 | — | — |
| | Ours<sup>b</sup> (MaxFrag) | 61.0 ± 0.14 | 82.5 ± 0.26 | 88.5 ± 0.30 | 92.8 ± 0.35 | 94.6 ± 0.45 | 95.7 ± 0.53 |
| USPTO-MIT top-K accuracy (%) | | | | | | | |
| Template-based | Neuralsym<sup>6</sup> | 47.8 | 67.6 | 74.1 | 80.2 | — | — |
| | LocalRetro<sup>8</sup> | 54.1 | 73.7 | 79.4 | 84.4 | — | 90.4 |
| Template-free | Liu's Seq2seq<sup>11</sup> | 46.9 | 61.6 | 66.3 | 70.8 | — | — |
| | AutoSynRoute<sup>14</sup> | 54.1 | 71.8 | 76.9 | 81.8 | — | — |
| | RetroTRAE<sup>40</sup> | 58.3 | — | — | — | — | — |
| | Ours<sup>b</sup> | 60.3 ± 0.22 | 78.2 ± 0.28 | 83.2 ± 0.36 | 87.3 ± 0.38 | 89.7 ± 0.35 | 91.6 ± 0.44 |
| USPTO-FULL top-K accuracy (%) | | | | | | | |
| Template-based | Retrosim<sup>5</sup> | 32.8 | — | — | 56.1 | — | — |
| | Neuralsym<sup>6</sup> | 35.8 | — | — | 60.8 | — | — |
| | GLN<sup>7</sup> | 39.3 | — | — | 63.7 | — | — |
| | LocalRetro<sup>8,c</sup> | 39.1 | 53.3 | 58.4 | 63.7 | 67.5 | 70.7 |
| Semi-template | RetroPrime<sup>16</sup> | 44.1 | — | — | 68.5 | — | — |
| Template-free | MEGAN<sup>22</sup> | 33.6 | — | — | 63.9 | — | 74.1 |
| | GTA<sup>18</sup> | 46.6 ± 0.20 | — | — | 70.4 ± 0.15 | — | — |
| | AT<sup>17</sup> | 46.2 | — | — | 73.3 | — | — |
| | Ours<sup>b</sup> | 48.9 ± 0.18 | 66.6 ± 0.24 | 72.0 ± 0.34 | 76.4 ± 0.40 | 80.4 ± 0.45 | 83.1 ± 0.52 |
<sup>a</sup> Our product-to-synthon-to-reactant variant.
<sup>b</sup> Our product-to-reactant variant.
<sup>c</sup> Result obtained by running the open-source code with well-tuned hyperparameters.
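
For readers unfamiliar with the metric, the following is a minimal sketch of how a top-K accuracy of the kind reported in the table can be computed. It assumes that each test product comes with a ranked list of predicted reactant strings and one ground-truth string, and that both have already been canonicalized so that exact string comparison is meaningful. The function name `top_k_accuracy` and the toy strings are illustrative only and are not taken from any of the cited implementations.

```python
from typing import Dict, List, Sequence


def top_k_accuracy(
    ranked_predictions: Sequence[Sequence[str]],
    targets: Sequence[str],
    ks: Sequence[int] = (1, 3, 5, 10, 20, 50),
) -> Dict[int, float]:
    """Return, for each K, the percentage of samples whose target
    appears among the first K ranked predictions.

    Assumption: predictions and targets are pre-canonicalized strings,
    so a prediction counts as correct only on an exact match.
    """
    if len(ranked_predictions) != len(targets):
        raise ValueError("need exactly one ranked list per target")
    if len(targets) == 0:
        raise ValueError("cannot compute accuracy on an empty test set")

    accuracy: Dict[int, float] = {}
    for k in ks:
        # A sample is a hit if its target is within the top-k predictions.
        hits = sum(
            target in predictions[:k]
            for predictions, target in zip(ranked_predictions, targets)
        )
        accuracy[k] = 100.0 * hits / len(targets)
    return accuracy


# Toy example with placeholder strings standing in for canonical SMILES.
toy_predictions: List[List[str]] = [
    ["A", "B", "C"],  # target "A" is ranked first
    ["X", "Y", "Z"],  # target "Y" is ranked second
    ["P", "Q", "R"],  # target "M" is never predicted
]
toy_targets: List[str] = ["A", "Y", "M"]
toy_result = top_k_accuracy(toy_predictions, toy_targets, ks=(1, 2, 3))
```

Because the top-K candidates always contain the top-(K-1) candidates, the value computed this way can never decrease as K grows, which is a useful sanity check when reading any row of such a table.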