Table 2. Test Performance with Molecular Transformer on Forward Prediction and Retrosynthesis (%)a.
| test sets | random split from USPTO | | | non-USPTO | | |
|---|---|---|---|---|---|---|
| tasks | invalid SMILES | accuracy (with SC) | accuracy (w/o SC) | invalid SMILES | accuracy (with SC) | accuracy (w/o SC) |
| forward (separated) | 0.34 | 83.86 | 85.84 | 0.40 | 66.10 | 66.92 |
| forward (mixed) | 0.36 | 81.96 | 83.99 | 0.27 | 84.12 | 85.20 |
| retrosynthesis | 0.21 | 51.28 | 52.30 | 0.27 | 37.22 | 37.42 |
For each test set, the first column shows the percentage of invalid SMILES strings produced by the transformer (lower is better), while the second and third columns show the top-1 accuracy with and without consideration of stereochemistry (SC), respectively (higher is better). Accuracy on non-USPTO test data for the retrosynthesis and forward (separated) tasks is markedly lower than on USPTO data, which is due to failure of reactant/agent separation.
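As a rough illustration of how the three reported quantities relate, the following minimal sketch computes the invalid-SMILES rate and the top-1 accuracy with and without stereochemistry from paired predictions and targets. It is not the authors' evaluation code. The function `score_top1` and both toy canonicalizers are hypothetical names introduced here, and the sketch assumes that invalid predictions count as misses and that all percentages share the full test set as denominator. A real evaluation would presumably use a cheminformatics toolkit for canonicalization instead of the string-based stand-ins.

```python
from typing import Callable, Optional, Sequence, Tuple

# Hypothetical interface: return a canonical string, or None if the SMILES is invalid.
Canonicalizer = Callable[[str], Optional[str]]


def score_top1(
    predictions: Sequence[str],
    targets: Sequence[str],
    canon_with_sc: Canonicalizer,
    canon_without_sc: Canonicalizer,
) -> Tuple[float, float, float]:
    """Return (invalid %, top-1 accuracy with SC %, top-1 accuracy w/o SC %)."""
    if not predictions or len(predictions) != len(targets):
        raise ValueError("predictions and targets must be non-empty and equally long")

    invalid = hits_sc = hits_no_sc = 0
    for pred, target in zip(predictions, targets):
        if canon_with_sc(pred) is None:
            # Assumption: an invalid prediction is counted as a miss for both accuracies.
            invalid += 1
            continue
        if canon_with_sc(pred) == canon_with_sc(target):
            hits_sc += 1
        if canon_without_sc(pred) == canon_without_sc(target):
            hits_no_sc += 1

    n = len(predictions)
    return (
        round(100.0 * invalid / n, 2),
        round(100.0 * hits_sc / n, 2),
        round(100.0 * hits_no_sc / n, 2),
    )


def toy_canon(smiles: str) -> Optional[str]:
    # Toy stand-in for a real parser: only rejects empty strings and
    # unbalanced parentheses, and performs no actual canonicalization.
    if not smiles or smiles.count("(") != smiles.count(")"):
        return None
    return smiles


def toy_canon_no_sc(smiles: str) -> Optional[str]:
    canon = toy_canon(smiles)
    if canon is None:
        return None
    # Crude stereo stripping: drop chirality (@) and bond-direction (/, \) markers.
    return canon.replace("@", "").replace("/", "").replace("\\", "")


predictions = ["C[C@H](N)O", "C[C@@H](N)O", "CC(=O", "CCO"]
targets = ["C[C@H](N)O", "C[C@H](N)O", "CC(=O)O", "CCN"]
result = score_top1(predictions, targets, toy_canon, toy_canon_no_sc)
print(result)  # (25.0, 25.0, 50.0)
```

In this toy run the second prediction differs from its target only in the chirality marker, so it counts as correct without SC but not with SC. Under these assumptions, accuracy without SC can never be lower than accuracy with SC, which matches the pattern in every row of the table.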