. 2019 Jun 14;20:334. doi: 10.1186/s12859-019-2871-9

Fig. 5.

Distributions of percent missing for high and low quality peaks in the training set. Using (a) box plots and (b) density plots of the percent missing values for high and low quality features in the training set, we chose a filtering cutoff of 68% missing, which is the median percent missing for the low quality features in the training set. The plots make it possible to compare the modes and percentiles of the two distributions (low and high quality features). The median of the distribution for low quality features is greater than the mode, and even the 90th percentile, of the distribution for high quality features in the training set, making it an appropriate cutoff.
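
The cutoff selection described in this caption can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names (`percent_missing`, `median_cutoff`), the toy intensity matrix, the use of NaN to encode missing values, and the choice to retain features whose percent missing is at or below the cutoff are all hypothetical.

```python
import numpy as np

def percent_missing(intensities):
    """Percent of samples with a missing (NaN) value, per feature (row)."""
    return 100.0 * np.isnan(intensities).mean(axis=1)

def median_cutoff(pct_missing, is_low_quality):
    """Median percent missing among features labeled low quality."""
    return float(np.median(pct_missing[is_low_quality]))

# Hypothetical toy training set: 4 features (rows) x 5 samples (columns).
nan = np.nan
train = np.array([
    [1.0, 2.0, 3.0, 4.0, 5.0],  # high quality,  0% missing
    [1.0, nan, 3.0, 4.0, 5.0],  # high quality, 20% missing
    [nan, nan, nan, 4.0, 5.0],  # low quality,  60% missing
    [nan, nan, nan, nan, 5.0],  # low quality,  80% missing
])
is_low_quality = np.array([False, False, True, True])

pct = percent_missing(train)
cutoff = median_cutoff(pct, is_low_quality)

# Sanity check from the caption: the cutoff should exceed the
# 90th percentile of percent missing among high quality features.
high_p90 = float(np.percentile(pct[~is_low_quality], 90))

# Assumption: features at or below the cutoff are retained.
keep = pct <= cutoff
```

On real data, `train` would be the feature-by-sample intensity matrix for the manually labeled training peaks, and the resulting `cutoff` would then be applied to the percent missing of every feature in the full dataset.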