Flowchart of data preparation, training and test set selection, and model construction using two different under-sampling consensus approaches. The cleansed hERG dataset contains 3,024 non-redundant drug-like molecules, of which 15.97% are hERG-positive. The dataset was randomly split into a training set comprising two thirds of the compounds and a test set comprising the remaining third. All experiments were repeated ten times.
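The splitting protocol can be illustrated with a minimal sketch. The dataset size, positive fraction, split ratio, and number of repeats come from the caption; everything else is an assumption made for illustration: the function names `random_split` and `undersample`, the placeholder labels, the use of the repeat index as a seed, and the plain random under-sampling of the majority class. This is not the authors' implementation, and it does not reproduce either of the two consensus approaches, whose details are not given here.

```python
import random

N_COMPOUNDS = 3024          # from the caption
POSITIVE_FRACTION = 0.1597  # from the caption
N_REPEATS = 10              # from the caption


def random_split(n_items, seed):
    """Randomly split item indices into two thirds training, one third test."""
    rng = random.Random(seed)
    indices = list(range(n_items))
    rng.shuffle(indices)
    n_train = 2 * n_items // 3  # integer arithmetic avoids float rounding
    return indices[:n_train], indices[n_train:]


def undersample(train_idx, labels, seed):
    """Generic random under-sampling (an assumption, not the paper's method):
    keep all positives and an equally sized random subset of negatives."""
    rng = random.Random(seed)
    pos = [i for i in train_idx if labels[i] == 1]
    neg = [i for i in train_idx if labels[i] == 0]
    return pos + rng.sample(neg, min(len(pos), len(neg)))


# Placeholder labels with the reported class ratio (hypothetical data,
# standing in for the real hERG activity labels).
n_pos = round(N_COMPOUNDS * POSITIVE_FRACTION)
labels = [1] * n_pos + [0] * (N_COMPOUNDS - n_pos)

# One independent random split per repeat, seeded by the repeat index.
splits = [random_split(N_COMPOUNDS, seed=r) for r in range(N_REPEATS)]

for repeat, (train_idx, test_idx) in enumerate(splits):
    balanced = undersample(train_idx, labels, seed=repeat)
    print(repeat, len(train_idx), len(test_idx), len(balanced))
```

With 3,024 compounds, each repeat yields 2,016 training and 1,008 test compounds. Because the split is random rather than stratified, the number of positives in each training set varies slightly between repeats, so the size of the balanced subset varies as well.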