2021 Apr 29;22:221. doi: 10.1186/s12859-021-04138-z

Table 3. Summary of the real datasets

Scenario	Dataset	Correlation	Maximum missing (%)	Complete row	Predictors	Outcome variable	Sample size, total (n)	Sample size, train (n)	Sample size, test (n)
1	I	± 0.52	10	Yes	27	Percentage of unhealthy days	2596	1432	1164
2	I	± 0.52	20	Yes	30	Percentage of unhealthy days	2596	1571	1025
3	I	± 0.52	30	Yes	32	Percentage of unhealthy days	2596	1793	803
4	I	± 0.52	10	No	27	Percentage of unhealthy days	2596	267	2329
5	I	± 0.52	20	No	30	Percentage of unhealthy days	2596	546	2050
6	I	± 0.52	30	No	32	Percentage of unhealthy days	2596	990	1606
7	II	± 0.52	10	Yes	5	Body Mass Index	1947	1000	947
8	II	± 0.52	20	Yes	20	Body Mass Index	1947	1162	785
9	II	± 0.52	30	Yes	21	Body Mass Index	1947	1242	705
10	II	± 0.52	10	No	5	Body Mass Index	1947	52	1895
11	II	± 0.52	20	No	20	Body Mass Index	1947	376	1571
12	II	± 0.52	30	No	21	Body Mass Index	1947	536	1411