Table 1.
Data splitting | Yield-BERT | DRFP | MFF | SEMG-MIGNN |
---|---|---|---|---|
Random 90/10 | 5.20 ± 0.500 | 5.09 ± 0.500 | 6.34 ± 0.500 | **4.79 ± 0.500** |
Random 70/30 | 5.82 ± 0.400 | 6.28 ± 0.300 | 6.77 ± 0.300 | **4.81 ± 0.400** |
Random 50/50 | 7.62 ± 0.500 | 7.36 ± 0.300 | 8.55 ± 0.300 | **6.83 ± 0.500** |
Random 30/70 | 9.41 ± 0.500 | **8.67 ± 0.500** | 10.09 ± 0.500 | 8.79 ± 0.700 |
Aryl Halide<sup>a</sup> | 26.04 ± 0.300 | 26.19 ± 0.200 | 22.04 ± 0.200 | **19.34 ± 0.400** |
Additive<sup>a</sup> | 21.29 ± 0.200 | 22.43 ± 0.200 | 21.66 ± 0.200 | **10.36 ± 0.200** |
Ligand<sup>a</sup> | 20.04 ± 0.200 | 18.35 ± 0.200 | 18.85 ± 0.200 | **11.02 ± 0.200** |
Base<sup>a</sup> | 19.40 ± 0.200 | 19.90 ± 0.200 | 20.66 ± 0.200 | **14.52 ± 0.200** |

Note: The best performance of each task is shown in bold. <sup>a</sup>These data splitting tasks refer to the extrapolative predictions based on the scaffold splitting of the reaction components. Details are elaborated in Supplementary Fig. 20. RMSEs are in %.
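
To illustrate what a scaffold-based, extrapolative split and the RMSE metric in the table involve, the following is a minimal Python sketch. It is not the authors' procedure, which is described in Supplementary Fig. 20. The function names `scaffold_split` and `rmse`, the `scaffold` dictionary key, and the greedy group assignment are all assumptions made for illustration, and scaffold labels are assumed to have been computed beforehand with a cheminformatics toolkit.

```python
import math
from collections import defaultdict


def scaffold_split(records, key="scaffold", test_fraction=0.3):
    """Split record indices so that no scaffold appears in both train and test.

    `records` is a list of dicts; `key` names the (hypothetical) field holding
    a precomputed scaffold label for the reaction component being held out.
    """
    groups = defaultdict(list)
    for idx, rec in enumerate(records):
        groups[rec[key]].append(idx)

    # Visit the largest scaffold groups first. A group joins the training set
    # only if it fits within the training budget; otherwise the whole group
    # goes to the test set, so test scaffolds are unseen during training.
    ordered = sorted(groups.values(), key=len, reverse=True)
    n_train_target = len(records) - round(len(records) * test_fraction)

    train, test = [], []
    for members in ordered:
        if len(train) + len(members) <= n_train_target:
            train.extend(members)
        else:
            test.extend(members)
    return train, test


def rmse(y_true, y_pred):
    """Root-mean-square error, in the same units as the yields (here, %)."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


# Toy data: 10 reactions whose additive falls into three scaffold groups.
toy = [{"scaffold": s} for s in "AAAABBBBCC"]
train_idx, test_idx = scaffold_split(toy, test_fraction=0.3)
train_scaffolds = {toy[i]["scaffold"] for i in train_idx}
test_scaffolds = {toy[i]["scaffold"] for i in test_idx}
```

Because whole groups are assigned together, the realised test fraction can differ from the requested one. That is the price of guaranteeing that the held-out scaffolds are absent from training, which is what makes such a split a harder, out-of-sample test than the random splits in the upper half of the table.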