Table 1.
Dataset | Data type | Platform | Samples |
---|---|---|---|
GDSC | Gene expression | Affymetrix Human Genome U219 array | 49 cell lines × 14,770 genes |
GDSC | CNV | Affymetrix SNP6 array | 49 cell lines × 3037 genes |
GDSC | Mutation | Whole exome sequencing | 49 cell lines × 8849 genes |
GDSC | Drug response | IC50 | 49 cell lines × 220 drugs |
CCLE | Gene expression | Illumina Hiseq 2000 | 28 cell lines × 14,770 genes |
CCLE | Drug response | IC50 | 28 cell lines × 13 drugs |
TCGA | Gene expression | Illumina Hiseq 2000 | 1100 tumors × 14,770 genes |
TCGA | Drug response | RECIST response categories a | 110 tumors × 5 drugs |
Drugs | Chemical structure | rcdk b | 220 drugs × 1024 fingerprints |
Drugs | Target | Curated | 220 drugs × 272 targets |
MSigDB | Canonical pathways | Curated | 1329 pathway gene sets |
a Response Evaluation Criteria in Solid Tumours (RECIST), a standard way to categorize the treatment response of a cancer patient as complete response, partial response, stable disease, or progressive disease
b An R package that takes the SMILES string of a drug as input and outputs the fingerprints and the 1D and 2D structures of the drug