Table 4.
Comparison of recently published methods for direct synthesis prediction on the USPTO-MIT set.
| Model | Top-1 (separated) | Top-1 (mixed) | Top-2 (separated) | Top-2 (mixed) | Top-5 (separated) | Top-5 (mixed) | Ref. |
|---|---|---|---|---|---|---|---|
| Transformer (single model) | 90.4 | 88.6 | 93.7 | 92.4 | 95.3 | 94.2 | 18 |
| Transformer (ensemble of models) | 91 | | 94.3 | | 95.8 | | 18 |
| Seq2Seq | 80.3 | | | | 87.5 | | 11 |
| WLDN | 79.6 | | | | 89.2 | | 32 |
| GTPN | 83.2 | | | | 86.5 | | 40 |
| WLDN5 | 85.6 | | | | 93.4 | | 23 |
| AT, this work^a | 91.9 | 90.4 | 95.4 | 94.6 | 97 | 96.5 | |
| AT trained with the same training set as in ref. 22 | 92 | 90.6 | 95.4 | 94.4 | 97 | 96.1 | |
^a Results of the models applied to the x100 augmented dataset using beam size = 10. The model was trained on a set of 439 k reactions, which combines the training set of 400 k and the validation set of 39 k from ref. 22. For the last row, the model was trained on the 400 k training set only, to allow a closer comparison with the performance of previous models.
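
The Top-1, Top-2 and Top-5 columns report the percentage of test reactions for which the recorded product appears among the first k ranked predictions. A minimal sketch of that calculation follows. The function name `top_k_accuracy` and the toy SMILES strings are hypothetical illustrations, not code or data from this work, and the sketch assumes that predictions and targets are already written in the same canonical form, so that plain string equality is a valid match.

```python
from typing import Sequence


def top_k_accuracy(ranked_predictions: Sequence[Sequence[str]],
                   targets: Sequence[str],
                   k: int) -> float:
    """Percentage of reactions whose target is among the first k ranked predictions."""
    if len(ranked_predictions) != len(targets):
        raise ValueError("need exactly one ranked prediction list per target")
    if not targets:
        raise ValueError("empty evaluation set")
    # A reaction counts as a hit if its target occurs anywhere in the top-k slice.
    hits = sum(target in preds[:k]
               for preds, target in zip(ranked_predictions, targets))
    return 100.0 * hits / len(targets)


# Toy data (hypothetical, not taken from USPTO-MIT): each inner list is
# ordered from most to least likely, as a beam search would return it.
predictions = [
    ["CCO", "CCN", "CCC"],
    ["c1ccccc1", "CC(=O)O", "CO"],
    ["CN", "CO", "CC"],
    ["CCCl", "CCBr", "CCI"],
]
targets = ["CCO", "CC(=O)O", "CC", "CCF"]

for k in (1, 2, 3):
    print(f"Top-{k}: {top_k_accuracy(predictions, targets, k):.1f}%")
```

On the toy data this prints 25.0%, 50.0% and 75.0% for k = 1, 2 and 3. Top-k accuracy can only stay constant or grow with k, which is why the Top-5 columns are never lower than the Top-1 columns for the same model.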