. 2020 Nov 30;18:375. doi: 10.1186/s12916-020-01823-3

Fig. 7.

Classification of haematological parameters using random forest shows that patient age and sampling location do not affect the ML models. Three models were generated. (a) A model for all UM and nMI cases (n = 1681), which show a significant difference in patient age; (b) the impurity-based feature importance of this model. (c) A model for UM and nMI cases from Kintampo only (n = 756), which show no significant difference in patient age; (d) the feature importance of this model. (e) A model for Kintampo cases aged under 4 years only (n = 416), in which there was no significant difference between nMI and UM; (f) the feature importance of this model. For each model, the samples were split 80% for training and 20% for testing. The accuracies of the three models were 0.806, 0.767, and 0.768, respectively. The most important features across the three models were platelet and RBC counts.
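
As a rough illustration of what an impurity-based feature importance measures, the sketch below computes the Gini impurity decrease of the best single split for each feature and normalises the results to sum to 1. It is a simplification under stated assumptions, not the authors' pipeline: a random forest averages such decreases over every node of many trees, whereas this uses one split per feature. The function names, the `feature_x` column, and all numeric values are hypothetical toy data, not measurements from the study.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 for an empty list)."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = Counter(labels)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def impurity_decrease(values, labels, threshold):
    """Weighted Gini decrease obtained by splitting on values <= threshold."""
    left = [y for x, y in zip(values, labels) if x <= threshold]
    right = [y for x, y in zip(values, labels) if x > threshold]
    n = len(labels)
    weighted_children = (len(left) / n) * gini(left) + (len(right) / n) * gini(right)
    return gini(labels) - weighted_children


def best_decrease(values, labels):
    """Largest decrease over midpoints between consecutive distinct values."""
    distinct = sorted(set(values))
    thresholds = [(a + b) / 2 for a, b in zip(distinct, distinct[1:])]
    return max((impurity_decrease(values, labels, t) for t in thresholds), default=0.0)


# Hypothetical toy data: made-up values, NOT taken from the study.
labels = ["UM", "UM", "UM", "UM", "nMI", "nMI", "nMI", "nMI"]
features = {
    "platelets": [90, 110, 120, 140, 210, 230, 250, 300],  # separates the classes
    "feature_x": [5, 9, 6, 8, 5, 9, 6, 8],                 # carries no class signal
}

# Raw decrease per feature, then normalised so the importances sum to 1.
raw = {name: best_decrease(vals, labels) for name, vals in features.items()}
total = sum(raw.values())
importance = {name: (v / total if total else 0.0) for name, v in raw.items()}

for name, score in sorted(importance.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {score:.3f}")
```

In this toy setup the feature that separates the two classes cleanly receives all of the importance. With real data the scores are spread across features, and impurity-based importances can favour features with many distinct values.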