Figure - PMC

Author manuscript; available in PMC: 2018 Apr 1.

Published in final edited form as: Mol Inform. 2016 Dec 21;36(4):10.1002/minf.201600126. doi: 10.1002/minf.201600126

Flowchart of data preparation, training and test set selection, and model construction using two different under-sampling consensus approaches. The cleansed hERG dataset contains 3,024 non-redundant drug-like molecules, of which 15.97% are hERG-positive. The dataset was randomly split into a training set comprising two thirds of the compounds and a test set comprising the remaining one third. All experiments were repeated ten times.
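
The split-and-repeat protocol described in the caption can be sketched roughly as follows. This is a minimal illustration, not the authors' code: the labels are synthetic placeholders (483 positives out of 3,024, which matches the stated 15.97%), the function names `split_two_thirds` and `undersample` are hypothetical, and plain random under-sampling of the majority class stands in for the two consensus approaches, which the caption does not specify.

```python
import random

def split_two_thirds(labels, seed):
    """Randomly split compound indices into a 2/3 training set and a 1/3 test set."""
    rng = random.Random(seed)
    idx = list(range(len(labels)))
    rng.shuffle(idx)
    n_train = (2 * len(idx)) // 3
    return idx[:n_train], idx[n_train:]

def undersample(train_idx, labels, seed):
    """Assumed stand-in: randomly drop negatives until they match the positives."""
    rng = random.Random(seed)
    pos = [i for i in train_idx if labels[i] == 1]
    neg = [i for i in train_idx if labels[i] == 0]
    return pos + rng.sample(neg, min(len(pos), len(neg)))

# Synthetic placeholder labels: 1 = hERG-positive, 0 = hERG-negative.
labels = [1] * 483 + [0] * (3024 - 483)

# Repeat the random split ten times, each with its own seed.
splits = []
for repeat in range(10):
    train_idx, test_idx = split_two_thirds(labels, seed=repeat)
    balanced_train = undersample(train_idx, labels, seed=repeat)
    splits.append((train_idx, test_idx, balanced_train))
```

Each repeat yields a disjoint training and test set plus a class-balanced subset of the training set; a model would be fitted on the balanced subset and evaluated on the untouched test set.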