2022 Jul 12;13(31):9023–9034. doi: 10.1039/d2sc02763a

Top-K single-step retrosynthesis results on USPTO-50K (top), USPTO-MIT (middle), and USPTO-FULL (bottom) datasets.

Category	Model	K = 1	3	5	10	20	50
USPTO-50K top-K accuracy (%)
Template-based	Retrosim⁵	37.3	54.7	63.3	74.1	82.0	85.3
	Neuralsym⁶	44.4	65.3	72.4	78.9	82.2	83.1
	GLN⁷	52.5	69.0	75.6	83.7	89.0	92.4
	LocalRetro⁸	53.4	77.5	85.9	92.4	—	97.7
Semi-template	G2Gs²⁰	48.9	67.6	72.5	75.5	—	—
	GraphRetro²¹	53.7	68.3	72.2	75.5	—	—
	RetroXpert¹⁵	50.4	61.1	62.3	63.4	63.9	64.0
	RetroPrime¹⁶	51.4	70.8	74.0	76.1	—	—
	Ours^a	49.1 ± 0.42	68.4 ± 0.53	75.8 ± 0.62	82.2 ± 0.72	85.1 ± 0.81	88.7 ± 0.88
Template-free	Liu's Seq2seq¹¹	37.4	52.4	57.0	61.7	65.9	70.7
	Levenshtein³⁹	41.5	48.1	50.0	51.4	—	—
	GTA¹⁸	51.1 ± 0.29	67.6 ± 0.22	74.8 ± 0.36	81.6 ± 0.22	—	—
	Dual-TF³³	53.3	69.7	73.0	75.0	—	—
	MEGAN²²	48.1	70.7	78.4	86.1	90.3	93.2
	Tied transformer¹⁹	47.1	67.2	73.5	78.5	—	—
	AT¹⁷	53.5	—	81.0	85.7	—	—
	Ours^b	56.3 ± 0.15	79.2 ± 0.28	86.2 ± 0.34	91.0 ± 0.46	93.1 ± 0.48	94.6 ± 0.56
	MEGAN²² (MaxFrag)	54.2	75.7	83.1	89.2	92.7	95.1
	Tied transformer¹⁹ (MaxFrag)	51.8	72.5	78.2	82.4	—	—
	AT¹⁷ (MaxFrag)	58.5	—	85.4	90.0	—	—
	Ours^b (MaxFrag)	61.0 ± 0.14	82.5 ± 0.26	88.5 ± 0.30	92.8 ± 0.35	94.6 ± 0.45	95.7 ± 0.53

USPTO-MIT top-K accuracy (%)
Template-based	Neuralsym⁶	47.8	67.6	74.1	80.2	—	—
	LocalRetro⁸	54.1	73.7	79.4	84.4	—	90.4
Template-free	Liu's Seq2seq¹¹	46.9	61.6	66.3	70.8	—	—
	AutoSynRoute¹⁴	54.1	71.8	76.9	81.8	—	—
	RetroTRAE⁴⁰	58.3	—	—	—	—	—
	Ours^b	60.3 ± 0.22	78.2 ± 0.28	83.2 ± 0.36	87.3 ± 0.38	89.7 ± 0.35	91.6 ± 0.44

USPTO-FULL top-K accuracy (%)
Template-based	Retrosim⁵	32.8	—	—	56.1	—	—
	Neuralsym⁶	35.8	—	—	60.8	—	—
	GLN⁷	39.3	—	—	63.7	—	—
	LocalRetro⁸^c	39.1	53.3	58.4	63.7	67.5	70.7
Semi-template	RetroPrime¹⁶	44.1	—	—	68.5	—	—
Template-free	MEGAN²²	33.6	—	—	63.9	—	74.1
	GTA¹⁸	46.6 ± 0.20	—	—	70.4 ± 0.15	—	—
	AT¹⁷	46.2	—	—	73.3	—	—
	Ours^b	48.9 ± 0.18	66.6 ± 0.24	72.0 ± 0.34	76.4 ± 0.40	80.4 ± 0.45	83.1 ± 0.52

^a Our product-to-synthon-to-reactant variant.

^b Our product-to-reactant variant.

^c Result obtained by running the open-source code with well-tuned hyperparameters.
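
For readers unfamiliar with the metric, the sketch below illustrates how a top-K accuracy of the kind tabulated above is commonly computed: a test example counts as a hit at K if the ground-truth reactant set appears among the K highest-ranked predictions. This is a minimal illustration, not the authors' evaluation script. It assumes that predictions and ground truths are already canonicalized SMILES strings, so exact string comparison is sufficient, and that the ± values are sample standard deviations over repeated runs; the table itself confirms neither assumption. The function names `top_k_accuracy` and `summarize_runs` are hypothetical.

```python
from statistics import mean, stdev


def top_k_accuracy(predictions, targets, ks=(1, 3, 5, 10, 20, 50)):
    """Return {K: accuracy in %} for ranked predictions.

    predictions: list of ranked candidate lists (best candidate first).
    targets: list of ground-truth strings, one per test example.
    Assumption: all strings are already canonicalized, so `in` is a
    valid exact-match test.
    """
    if len(predictions) != len(targets):
        raise ValueError("predictions and targets must have equal length")
    hits = {k: 0 for k in ks}
    for ranked, truth in zip(predictions, targets):
        for k in ks:
            # A hit at K means the truth is among the first K candidates.
            if truth in ranked[:k]:
                hits[k] += 1
    n = len(targets)
    return {k: 100.0 * hits[k] / n for k in ks}


def summarize_runs(per_run):
    """Aggregate several runs into {K: (mean, sample std)}.

    per_run: list of dicts as returned by top_k_accuracy (at least two).
    Assumption: the spread is reported as the sample standard deviation.
    """
    ks = per_run[0].keys()
    return {
        k: (mean(run[k] for run in per_run), stdev(run[k] for run in per_run))
        for k in ks
    }
```

A possible usage, with placeholder strings standing in for canonical SMILES:

```python
preds = [["A", "B"], ["C", "D"], ["E"]]
truth = ["A", "D", "X"]
acc = top_k_accuracy(preds, truth, ks=(1, 3))
```

Because a hit at K is also a hit at every larger K, the resulting accuracies are non-decreasing in K, which is a quick sanity check for any row of the table.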