Figure 1.
pdCSM-cancer workflow. The developed approach has four major stages. (1) In data curation, small-molecule activity data (in terms of GI50%) were obtained from DTP of NCI23 for nine different tumor types (74 cancer cell lines); (2) in feature engineering, two types of features were calculated: (i) graph-based signatures, which represent the chemical geometry and physicochemical properties of small molecules, and (ii) compound general properties and pharmacophores; (3) these features were then employed to train and test predictive models using supervised learning, and model optimization was carried out via greedy feature selection; (4) finally, the models with the best performance were made available through a user-friendly web interface.
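
The greedy feature selection named in stage (3) can be illustrated with a minimal sketch of generic forward selection. This is not the authors' implementation: the function name `greedy_forward_selection`, the `min_gain` stopping rule, and the toy scoring function are all assumptions made for illustration. In practice the score would typically be a cross-validated performance metric of a model trained on the chosen feature subset.

```python
from typing import Callable, List, Sequence


def greedy_forward_selection(
    n_features: int,
    score: Callable[[Sequence[int]], float],
    min_gain: float = 1e-9,
) -> List[int]:
    """Add one feature at a time, keeping the addition that most improves `score`.

    `score` is a hypothetical callback that evaluates a feature subset
    (higher is better). Selection stops when no candidate improves the
    current best score by more than `min_gain`.
    """
    selected: List[int] = []
    best_score = float("-inf")
    remaining = set(range(n_features))
    while remaining:
        # Score every remaining feature when added to the current subset.
        candidate_scores = {f: score(selected + [f]) for f in sorted(remaining)}
        best_feature = max(candidate_scores, key=candidate_scores.get)
        if candidate_scores[best_feature] - best_score <= min_gain:
            break  # no candidate helps enough; stop growing the subset
        selected.append(best_feature)
        best_score = candidate_scores[best_feature]
        remaining.remove(best_feature)
    return selected


def toy_score(subset: Sequence[int]) -> float:
    """Illustrative score: features 0 and 2 are informative, each feature has a small cost."""
    useful = {0: 0.5, 2: 0.3}
    return sum(useful.get(f, 0.0) for f in subset) - 0.01 * len(subset)


chosen = greedy_forward_selection(4, toy_score)
print(chosen)
```

With the toy score, the procedure picks the two informative features and stops once adding any further feature would lower the score.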
