Table 1.
Section | Column | Description | Data type |
---|---|---|---|
Chemical | Compound name | Generated in ChemBioDraw | string |
Chemical | Empirical formula | Generated in ChemBioDraw | string |
Chemical | CAS | Chemical Abstracts Service number (where available) | string |
Chemical | SMILES | Simplified Molecular Input Line Entry System, a chemical notation that encodes the structure of a molecule as a string of symbols. Regular SMILES can have multiple valid representations for the same molecule. | string |
Chemical | Canonical SMILES | A unique or standardized SMILES representation ensuring that each molecule is assigned a distinct SMILES string. The specific output, however, depends on the canonicalization algorithm employed, which may influence the final representation. | string |
Chemical | Mw | Molecular weight | numeric |
Biological | IC50, EC50, CC50* | Half-maximal inhibitory, effective, or cytotoxicity concentration | numeric |
Biological | Range of IC50, EC50, or CC50 | Confidence intervals, standard deviations, standard errors of the mean, etc., for the cytotoxicity values provided in the previous field | numeric |
Biological | Incubation time | Time period for which cells were exposed to a given IL | numeric |
Biological | Cell line | Name of the cell culture used in a particular study | string |
Biological | Method | Assay used for measuring IC50, EC50, or CC50 | string |
Biological | Notes | Additional information on biological activity of a particular IL, if provided | string |
Bibliographic | Reference | Author, year, and journal | text |
Bibliographic | DOI | Digital Object Identifier | text |
* In the papers annotated for the dataset, the designation of the value – IC50, EC50, or CC50 – depended mainly on the authors of a particular study. Here, the assay determines the type of value, so values obtained with the same assay can be compared across studies. Thus, in most cases, we did not record in the dataset the particular designation used by the authors.
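
As an illustration of how the fields in Table 1 might be handled programmatically, the following minimal Python sketch converts rows of text into the listed data types and groups records by assay, in line with the note above that values obtained by the same assay can be compared. It is not part of the dataset tooling: the names `SCHEMA`, `coerce_record`, and `group_by_method` are hypothetical, and the sketch assumes that the file headers match the column names in Table 1 exactly and that empty cells denote missing values.

```python
from collections import defaultdict

# Column names and data types as listed in Table 1.
# Assumption: the headers in the distributed file match these names exactly.
SCHEMA = {
    "Compound name": str,
    "Empirical formula": str,
    "CAS": str,
    "SMILES": str,
    "Canonical SMILES": str,
    "Mw": float,
    "IC50, EC50, CC50": float,
    "Range of IC50, EC50, or CC50": float,
    "Incubation time": float,
    "Cell line": str,
    "Method": str,
    "Notes": str,
    "Reference": str,
    "DOI": str,
}


def coerce_record(raw):
    """Convert a row of strings to the types in SCHEMA.

    Missing or empty cells become None (an assumption about how
    absent values are encoded).
    """
    record = {}
    for column, kind in SCHEMA.items():
        value = (raw.get(column) or "").strip()
        if value == "":
            record[column] = None
        elif kind is float:
            # Raises ValueError if a numeric field holds non-numeric text.
            record[column] = float(value)
        else:
            record[column] = value
    return record


def group_by_method(records):
    """Group records by the assay named in the Method column."""
    groups = defaultdict(list)
    for record in records:
        groups[record["Method"]].append(record)
    return dict(groups)


# Placeholder rows for illustration only; these are not values from the dataset.
example_rows = [
    {"Compound name": "compound A", "Mw": "100.5", "Method": "assay-1",
     "IC50, EC50, CC50": "12"},
    {"Compound name": "compound B", "Method": "assay-1",
     "IC50, EC50, CC50": ""},
    {"Compound name": "compound C", "Method": "assay-2"},
]
example_records = [coerce_record(row) for row in example_rows]
example_groups = group_by_method(example_records)
```

Rows read with `csv.DictReader` from the Python standard library could be passed to `coerce_record` directly, since each row is then a mapping from header to text.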