. 2022 Jun 8;10:852893. doi: 10.3389/fchem.2022.852893

TABLE 1.

Summary of the applied datasets.

| Dataset | Original dataset: size | Original dataset: active molecules | Original dataset: % of actives | Final dataset: size | Final dataset: internal set | Final dataset: external set | Reference |
|---|---|---|---|---|---|---|---|
| Mutagenicity | 6,512 | 3,503 | 54 | 6,190 | 4,952 | 1,238 | Hansen et al. (2009) |
| P-glycoprotein | 1,275 | 666 | 52 | 1,180 | 944 | 236 | Broccatelli et al. (2011) |
| hERG | 4,787 | 2,749 | 57 | 4,612 | 3,690 | 922 | Alves et al. (2018) |
| Hepatotoxicity | 2,476 | 619 | 25 | 2,414 | 1,932 | 482 | Wu et al. (2019) |
| BBB | 1,864 | 1,438 | 77 | 1,750 | 1,400 | 350 | Roy et al. (2019) |
| CYP 2C9 | 12,776 | 5,800 | 45 | 12,379 | 9,904 | 2,475 | PubChem (2021) |
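
The figures in Table 1 can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only and is not part of the original study: the numbers are copied from the table, while the names `ROWS` and `summarize` are hypothetical. It recomputes the percentage of actives in each original dataset, confirms that the internal and external sets add up to the final dataset size, and reports the internal-set fraction. That fraction comes out close to 0.8 for every dataset, which suggests (but the table itself does not state) an approximately 80/20 internal/external split.

```python
# Values copied from Table 1:
# (name, original size, active molecules, final size, internal set, external set)
ROWS = [
    ("Mutagenicity", 6512, 3503, 6190, 4952, 1238),
    ("P-glycoprotein", 1275, 666, 1180, 944, 236),
    ("hERG", 4787, 2749, 4612, 3690, 922),
    ("Hepatotoxicity", 2476, 619, 2414, 1932, 482),
    ("BBB", 1864, 1438, 1750, 1400, 350),
    ("CYP 2C9", 12776, 5800, 12379, 9904, 2475),
]


def summarize(rows):
    """Recompute the derived quantities of Table 1 (hypothetical helper)."""
    summary = {}
    for name, original, actives, final, internal, external in rows:
        summary[name] = {
            # Share of active molecules in the original dataset, in whole percent.
            "pct_actives": round(100 * actives / original),
            # The internal and external sets should partition the final dataset.
            "split_adds_up": internal + external == final,
            # Fraction of the final dataset assigned to the internal set.
            "internal_fraction": round(internal / final, 2),
            # Difference between original and final dataset sizes.
            "size_difference": original - final,
        }
    return summary


if __name__ == "__main__":
    for name, stats in summarize(ROWS).items():
        print(name, stats)
```

Rounding the internal-set fraction to two decimals hides small deviations from an exact 80/20 split caused by integer set sizes.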