2017 May 31;13(5):154–159. doi: 10.6026/97320630013154

Figure 1.

Workflow adopted for the current study. The initial dataset is in SDF format. Descriptors are calculated, and preprocessing-I is applied regardless of the data and the Machine Learning (ML) method used. The preprocessed data are then subjected to Recursive Feature Elimination (RFE)-based feature selection to obtain the best feature subset for model building. The input data are prepared according to the selected feature set, and preprocessing-II is applied, which depends solely on the best practices suggested by the caret package for the underlying ML method. The model building step includes hyperparameter optimisation, cross-validation and best model selection. The output is a model file that can be used to predict unlabelled compound libraries. The preprocessing and model building steps were carried out using R and the caret package.
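The RFE step in this workflow can be illustrated with a minimal, language-neutral sketch. It is written in plain Python rather than R and is not the caret implementation used in the study. Everything in it is an assumption made for illustration: the synthetic data (two informative columns and four noise columns standing in for a descriptor table), the hypothetical helper names `make_data`, `feature_scores`, `cv_accuracy` and `rfe_sketch`, the class-mean-gap ranking, and the nearest-centroid classifier. The sketch repeatedly drops the weakest-ranked feature, scores each remaining subset by k-fold cross-validation, and keeps the best-scoring subset.

```python
import random
import statistics


def make_data(n=60, seed=0):
    """Synthetic stand-in for a descriptor table (assumption, not study data)."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = i % 2
        row = [3.0 * label + rng.gauss(0, 1),    # informative column 0
               -3.0 * label + rng.gauss(0, 1)]   # informative column 1
        row += [rng.gauss(0, 1) for _ in range(4)]  # noise columns 2..5
        X.append(row)
        y.append(label)
    return X, y


def feature_scores(X, y, cols):
    """Rank columns by |difference of class means| / overall standard deviation."""
    scores = {}
    for c in cols:
        a = [row[c] for row, t in zip(X, y) if t == 0]
        b = [row[c] for row, t in zip(X, y) if t == 1]
        spread = statistics.pstdev(a + b) or 1.0
        scores[c] = abs(statistics.mean(a) - statistics.mean(b)) / spread
    return scores


def cv_accuracy(X, y, cols, k=5):
    """k-fold cross-validated accuracy of a nearest-centroid classifier."""
    n = len(X)
    correct = 0
    for fold in range(k):
        test_idx = set(range(fold, n, k))
        train = [(X[i], y[i]) for i in range(n) if i not in test_idx]
        centroids = {}
        for label in (0, 1):
            rows = [row for row, t in train if t == label]
            centroids[label] = [statistics.mean(r[c] for r in rows) for c in cols]
        for i in test_idx:
            dist = {label: sum((X[i][c] - m) ** 2 for c, m in zip(cols, cen))
                    for label, cen in centroids.items()}
            correct += min(dist, key=dist.get) == y[i]
    return correct / n


def rfe_sketch(X, y, min_features=1):
    """Drop the weakest feature one at a time; keep the best-scoring subset.

    Simplification: features are ranked on the full data set, whereas a
    careful procedure would re-rank inside each resampling iteration.
    """
    cols = list(range(len(X[0])))
    history = []
    while True:
        history.append((cv_accuracy(X, y, cols), list(cols)))
        if len(cols) <= min_features:
            break
        scores = feature_scores(X, y, cols)
        cols.remove(min(cols, key=scores.get))
    # Highest accuracy wins; ties go to the smaller subset.
    best_acc, best_cols = max(history, key=lambda h: (h[0], -len(h[1])))
    return best_cols, best_acc, history


if __name__ == "__main__":
    X, y = make_data()
    best_cols, best_acc, history = rfe_sketch(X, y)
    for acc, subset in history:
        print(len(subset), "features:", subset, "accuracy", round(acc, 3))
    print("selected:", best_cols, "accuracy", round(best_acc, 3))
```

In the workflow described in the caption, the selected columns would then feed preprocessing-II and the hyperparameter search; the ranking function and classifier here are placeholders for whatever the chosen ML method provides.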