Table 1. Regression and Classification Task Data Sets Used in the Research.ᵃ
database | n | min | max | units | reference |
---|---|---|---|---|---|
log P | 5574 (8199) | –4.64 | 8.27 | | (26) |
BACE-1 pIC50 | 285 (1513) | 2.699 | 10.523 | μM (log) | (27,28) |
solubility | 803 (1128) | –11.6 | 1.58 | mol/L (log) | (28) |
lipophilicity | 2237 (4200) | –1.50 | 4.48 | | (28) |
ionization energy | 1575 (2147) | 1.04 | 13.94 | eV | (29) |
melting pointᵇ | 7316 (10765) | –196.0 | 492.5 | °C | (10,30) |

database | n | class ratio | reference |
---|---|---|---|
BBBP | 1089 (2053) | 161/928 | (31) |
Ames mutagenicity | 3801 (6512) | 1779/2022 | (32) |
hERG | 2452 (6795) | 1068/1385 | (33) |
ClinTox | 707 (1478) | 56/651 | (28) |

ᵃ The number of entries (n) is the number of molecules that lie within the applicability domain. The number in parentheses is the original number of molecules in the database.
ᵇ The melting point database is a combination of two other databases.
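
As a rough illustration of how the two counts in the n column relate, the share of each database retained inside the applicability domain can be computed directly from them. The sketch below is not part of the original workflow: the dictionary `counts` and the helper `retained_fraction` are hypothetical names introduced here, and the values are simply transcribed from Table 1.

```python
# Counts transcribed from Table 1:
# name -> (molecules inside the applicability domain, original database size)
counts = {
    "log P": (5574, 8199),
    "BACE-1 pIC50": (285, 1513),
    "solubility": (803, 1128),
    "lipophilicity": (2237, 4200),
    "ionization energy": (1575, 2147),
    "melting point": (7316, 10765),
    "BBBP": (1089, 2053),
    "Ames mutagenicity": (3801, 6512),
    "hERG": (2452, 6795),
    "ClinTox": (707, 1478),
}


def retained_fraction(n_in_domain, n_original):
    """Fraction of the original database that lies inside the applicability domain."""
    return n_in_domain / n_original


# Hypothetical helper applied to every row of the table.
retained = {name: retained_fraction(*pair) for name, pair in counts.items()}

# Print the databases from the smallest to the largest retained share.
for name, frac in sorted(retained.items(), key=lambda kv: kv[1]):
    print(f"{name:20s} {frac:6.1%}")
```

Sorting by the retained share makes it easy to see which databases lose the most molecules to the applicability-domain filter.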