Fig. 2. Performance comparison on the USPTO-50K dataset with Tanimoto similarity splits.
The sub-figures show the top-k accuracies (k = 1, 3, 5, 10) of RetroExplainer and existing methods on the USPTO-50K dataset. Results are reported under three Tanimoto similarity thresholds for the input molecules (0.4, 0.5, and 0.6) and three splitting ratios (0.2, 0.25, and 0.3), where the splitting ratio is the combined proportion of the validation and test sets. Each result was derived from three repeated experiments conducted with distinct random seeds. In each box, the lower whisker, upper whisker, and central line mark the minimum, maximum, and median of the three data points, respectively. Source data are provided as a Source Data file.
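
As a minimal sketch of how each box is defined in this caption, the three per-seed accuracies reduce to a minimum, a median, and a maximum. The function name `box_summary` and the numbers below are hypothetical placeholders for illustration; they are not values from the Source Data file.

```python
from statistics import median


def box_summary(accuracies):
    """Return (lower whisker, central line, upper whisker) for repeated runs.

    Following the caption: whiskers are the minimum and maximum of the
    repeated experiments, and the central line is their median.
    """
    if not accuracies:
        raise ValueError("need at least one accuracy value")
    return min(accuracies), median(accuracies), max(accuracies)


# Placeholder top-k accuracies from three seeds (illustrative only,
# not results reported in the paper).
runs = [0.50, 0.70, 0.60]
low, mid, high = box_summary(runs)
print(low, mid, high)
```

With only three data points per box, the median is simply the middle value, so each box conveys all three runs directly.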
