. 2017 Nov 22;7:16023. doi: 10.1038/s41598-017-16397-z

Figure 2.


Algorithm flow chart. The training and testing sets were constructed by expanding the features of BV 3.0, BV 4.0 and BV 5.0. A subset of the negative class was combined with the positive class to generate a balanced training dataset; the negative subset was set to the same size as the positive class, namely 2676. The balanced training set was used to train one of the four classification models described earlier, and the tuning process is described in the Supplementary material. Each trained model was then used to predict the testing set, and the estimated probabilities of all models on the testing set were collected. We compared ensemble methods that combine these probabilities by their mean, median or weighted mean; the results are shown in Fig. 3. The weight of each model was the number of dimers it predicted correctly on the training set. A dimer was regarded as correctly predicted if the top 20 residue pairs selected by the final result contained at least one interacting residue pair.
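
The three steps in this caption (balanced subsampling, combining model probabilities, and the top-20 correctness criterion) can be illustrated with a minimal Python sketch. This is not the authors' code: the function names `balanced_subset`, `combine` and `dimer_is_correct`, the array layout (one row of probabilities per model, one column per residue pair) and the use of NumPy are all assumptions made for illustration.

```python
import numpy as np


def balanced_subset(pos_idx, neg_idx, rng):
    """Return indices of all positives plus an equally sized random negative subset.

    Hypothetical helper: sampling without replacement is an assumption.
    """
    neg_sample = rng.choice(neg_idx, size=len(pos_idx), replace=False)
    return np.concatenate([pos_idx, neg_sample])


def combine(probs, method="mean", weights=None):
    """Combine per-model probabilities, shape (n_models, n_pairs), into one score per pair."""
    probs = np.asarray(probs, dtype=float)
    if method == "mean":
        return probs.mean(axis=0)
    if method == "median":
        return np.median(probs, axis=0)
    if method == "weighted":
        # Weights are assumed to be one non-negative number per model,
        # e.g. the count of correctly predicted training dimers.
        return np.average(probs, axis=0, weights=weights)
    raise ValueError(f"unknown method: {method}")


def dimer_is_correct(scores, is_interacting, top_k=20):
    """True if any of the top_k highest-scoring residue pairs is an interacting pair."""
    top = np.argsort(scores)[::-1][:top_k]
    return bool(np.any(np.asarray(is_interacting)[top]))


# Toy example: three models, three residue pairs (made-up numbers).
probs = [[0.9, 0.2, 0.4],
         [0.5, 0.4, 0.6],
         [0.1, 0.6, 0.2]]
weights = [3, 2, 1]  # made-up per-model counts of correct training dimers

mean_scores = combine(probs, "mean")
median_scores = combine(probs, "median")
weighted_scores = combine(probs, "weighted", weights)
```

With these toy numbers the mean and median coincide, while the weighted mean shifts each score toward the model with the largest weight. In practice `probs` would hold the predicted probabilities of each trained classifier on the testing set.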