Table 4.
Approach | Model | Ames | Tox21 |
---|---|---|---|
Single task | RF | 0.520 (0.517–0.523) | 0.406 (0.404–0.412) |
Single task | XGB | 0.540 (0.537–0.543) | 0.415 (0.412–0.421) |
Single task | ST-DNN | 0.500 (0.495–0.507) | 0.415 (0.406–0.419) |
Multi-task | XGB-FN | 0.677 (0.675–0.682) | **0.521 (0.516–0.525)** |
Multi-task | MT-DNN | 0.676 (0.667–0.688) | 0.503 (0.493–0.512) |
Multi-task | Macau | **0.679 (0.677–0.681)** | 0.385 (0.379–0.388) |
Values are median scores, with interquartile ranges in parentheses, for each technique and dataset on the test set across 20 different random seeds. For each run, the score was first averaged across the different assays; the median was then taken over the runs. Single-task models are included as a benchmark. The best model for each dataset is in bold.
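The aggregation described in the caption can be sketched as follows. This is a minimal illustration, not the authors' code: the function names `summarize_runs` and `format_cell` are hypothetical, the input is assumed to be a complete runs-by-assays array of scores, and the quartiles use NumPy's default linear interpolation, which may differ from the method used for the table.

```python
import numpy as np


def summarize_runs(scores):
    """Return (median, q1, q3) of per-run mean scores.

    scores: array-like of shape (n_runs, n_assays), one row per random seed.
    Assumes every assay has a score in every run (hypothetical input layout).
    """
    scores = np.asarray(scores, dtype=float)
    # Step 1: average across assays within each run.
    per_run = scores.mean(axis=1)
    # Step 2: median and quartiles across runs (default linear interpolation).
    q1, median, q3 = np.percentile(per_run, [25, 50, 75])
    return median, q1, q3


def format_cell(median, q1, q3):
    """Format a summary as 'median (q1–q3)' with three decimals."""
    return f"{median:.3f} ({q1:.3f}–{q3:.3f})"


# Usage with synthetic scores: 20 seeds, 12 assays (illustrative numbers only).
rng = np.random.default_rng(0)
synthetic = rng.uniform(0.4, 0.6, size=(20, 12))
print(format_cell(*summarize_runs(synthetic)))
```

Averaging within a run before taking the median means the interquartile range reflects variation between seeds, not variation between assays.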