Figure 1.
The objective of the
retrosynthesis game is to synthesize the target
product m0 from available substrates by
way of a synthesis tree that minimizes the cost function. Molecules
and reactions are illustrated by circles and squares, respectively.
Starting from the target, a reaction r0 (yellow) that links m0 with
precursors m1, m2, m3 is selected according to a policy π(r0|m0).
The gray squares leading to m0 illustrate the other potential reactions
available for m0. The game continues one move at a time,
reducing intermediate molecules (blue) until there are only substrates
remaining, or until a maximum depth of 10 is reached. Dead-end molecules
(green), for which no reactions are possible, are assigned a cost
penalty of 100, while molecules at maximum depth (purple) are assigned
a cost penalty of 10. Commercially available substrates (red) are
assigned zero cost. The synthesis cost of the product may be computed
according to eq 1 only
on completion of the game. Here, the sampled pathway leading to the
target (red arrows) has a cost of 5.
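
The scoring rules in this caption can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: eq 1 is not reproduced in the caption, so the sketch assumes the total cost is the sum of a per-reaction cost and the leaf penalties. The value `REACTION_COST = 1.0`, the `Molecule` class, the `synthesis_cost` function, the depth convention (target at depth 0), and the order in which the leaf conditions are checked are all hypothetical choices made for the example. Only the penalties of 100, 10, and 0 and the maximum depth of 10 come from the caption.

```python
from dataclasses import dataclass, field
from typing import List

# Values stated in the caption.
DEAD_END_PENALTY = 100.0   # molecule with no possible reactions
MAX_DEPTH_PENALTY = 10.0   # molecule still unresolved at maximum depth
SUBSTRATE_COST = 0.0       # commercially available substrate
MAX_DEPTH = 10

# Assumption: a fixed cost per reaction step; not given in the caption.
REACTION_COST = 1.0


@dataclass
class Molecule:
    """Hypothetical node of a sampled synthesis tree."""
    name: str
    is_substrate: bool = False
    # Precursors of the single reaction selected for this molecule.
    # An empty list on a non-substrate is treated as a dead end.
    precursors: List["Molecule"] = field(default_factory=list)


def synthesis_cost(mol: Molecule, depth: int = 0) -> float:
    """Cost of a completed game, summed recursively over the tree."""
    if mol.is_substrate:
        return SUBSTRATE_COST
    if depth >= MAX_DEPTH:
        return MAX_DEPTH_PENALTY
    if not mol.precursors:
        return DEAD_END_PENALTY
    return REACTION_COST + sum(
        synthesis_cost(p, depth + 1) for p in mol.precursors
    )


# A small tree: m0 is made from m1, m2, m3; m2 needs one more reaction.
m4 = Molecule("m4", is_substrate=True)
m2 = Molecule("m2", precursors=[m4])
m0 = Molecule(
    "m0",
    precursors=[
        Molecule("m1", is_substrate=True),
        m2,
        Molecule("m3", is_substrate=True),
    ],
)
solved_cost = synthesis_cost(m0)  # two reactions, all leaves are substrates

# Same tree, but m2 now also requires a molecule with no known reactions.
m2.precursors.append(Molecule("m5"))
dead_end_cost = synthesis_cost(m0)
```

Under these assumptions, the fully resolved tree costs 2 (two reactions, zero-cost substrates), while a single dead-end leaf raises the cost to 102. The recursion only returns a number once every branch has terminated, which is why the cost is available only on completion of the game.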