2018 Oct 24;9:4418. doi: 10.1038/s41467-018-06735-8

Fig. 3.


Preprocessing leads to more accurate predictions. a Schematic representation of the analysis of the participating teams’ writeups to identify methodological steps associated with more accurate prediction of symptoms. First, the writeups were manually inspected to identify the preprocessing, feature selection and predictive modeling methods used by each team. Second, the methods were regrouped into general categories across teams. Third, each general method was assessed for its association with predictive model accuracy on the leaderboard test set and the independent test set. On the boxplot, the lower whisker, the lower hinge, the mid hinge, the upper hinge and the upper whisker correspond to −1.5× IQR from the first quartile, the first quartile, the median, the third quartile and 1.5× IQR from the third quartile of the AUROC, respectively. b Heatmap showing the association of each general method with prediction ability, i.e. AUROC for SC2 (prediction of symptom presence) and Pearson’s correlation coefficient for SC3 (prediction of symptom severity). For each general method, a Wilcoxon rank-sum test was used to assess the association between using the method (coded as a binary variable) and prediction ability.
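
The per-method test described in panel b can be illustrated with a minimal sketch. It is not the authors' code. It assumes a two-sided Wilcoxon rank-sum test with a normal approximation, tie correction and no continuity correction, whereas statistical packages often compute exact p-values for small samples. The function names (`average_ranks`, `rank_sum_test`) and the AUROC values are hypothetical and chosen only for illustration.

```python
import math
from statistics import NormalDist


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # mean of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def rank_sum_test(scores, used_method):
    """Two-sided rank-sum test (normal approximation, assumed here).

    scores: one prediction score per team (e.g. AUROC).
    used_method: binary flag per team, truthy if the team used the method.
    Returns (rank sum of the teams using the method, z statistic, p-value).
    """
    n1 = sum(1 for u in used_method if u)
    n2 = len(scores) - n1
    if n1 == 0 or n2 == 0:
        raise ValueError("both groups must contain at least one team")
    n = n1 + n2
    ranks = average_ranks(scores)
    w = sum(r for r, u in zip(ranks, used_method) if u)
    mean_w = n1 * (n + 1) / 2
    # Tie correction: each group of t tied scores reduces the variance.
    counts = {}
    for s in scores:
        counts[s] = counts.get(s, 0) + 1
    tie_term = sum(t ** 3 - t for t in counts.values())
    var_w = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_w == 0:
        raise ValueError("all scores are identical; the test is undefined")
    z = (w - mean_w) / math.sqrt(var_w)
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return w, z, p


# Hypothetical AUROCs for eight teams; the first four used the method.
example_scores = [0.71, 0.68, 0.74, 0.66, 0.62, 0.60, 0.64, 0.58]
example_flags = [1, 1, 1, 1, 0, 0, 0, 0]
w, z, p = rank_sum_test(example_scores, example_flags)
print(f"rank sum = {w}, z = {z:.3f}, p = {p:.4f}")
```

Repeating this test for each general method and each sub-challenge would give one association value per heatmap cell. How the values are signed or colored in the heatmap is not specified in the caption.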