Table 4.
Summary of the data mining methods used in the reviewed studies.
Data mining method and techniques or algorithms | Data size | Independent variable | Dependent variable
Correlation analysis
Pearson correlation [20] | Preliminary study: 24 users over 20 days; final study: 19 users over 21 days | TST^a and contextual factors | SOL^b, NAWK^c, and sleep rating
Spearman correlation [16] | 12 users over 2 weeks | Contextual factors | TST, WASO^d, NAWK, SOL, and SE^e
Repeated measures correlation [19] | 10 users over 2 weeks | Bedtime, TIB^f, and contextual factors | SE, SOL, NAWK, restlessness, TIB, and LSEQ^g
Regression analysis
Piecewise fixed effects regression [34] | 31,793 users over 18 months; all American users | Time of day, time after waking up, and sleep duration | Cognitive performance
Simple linear regression [43] | Approximately 20,000 users over 4 months | Contextual factors | SOL, NAWK, and SE
Linear mixed effects model [35] | 557 users over 1 year | Bedtime regularity | Resting heart rate
Rule induction
Apriori algorithm [33] | 1 user over 180 days; 4 users over 2 weeks | Contextual factors | SE
Learn from Examples using Rough Sets [44] | 280 users over 1 month; only data from male users were used | Contextual factors | Sleep ratio
Event mining (+causal inference) [42] | 1 user over 800 days | Contextual factors | SOL, WASO, NAWK, and SE
Causal inference
Stratified propensity score analysis [43] | Approximately 20,000 users over 4 months | Contextual factors | SOL, NAWK, and SE
Bayesian network analysis [36] | 5200 users over 6 months | Contextual factors and bedtime | Contextual factors and bedtime
Time series analysis
Anomaly detection [37] | 1 user over 35 days | Fitbit-measured intraday time series, TST, WASO, NAWK, and bedtime | Permutation entropy of sleep time series
SAX^h-based motif matching and principle optimization [38] | 100 users over 10 weeks | Heart rate time series data | PSQI^i
Statistical test
Unpaired 2-sample Wilcoxon test [40] | 271 users over 8 months | Contextual factors | Statistical differences between good and poor sleep
Decision tree
J4.8 Classifier [31] | 400 users over 15 months | Contextual factors | PSQI
^a TST: total sleep time.
^b SOL: sleep onset latency.
^c NAWK: number of awakenings.
^d WASO: wake after sleep onset.
^e SE: sleep efficiency.
^f TIB: time in bed.
^g LSEQ: Leeds Sleep Evaluation Questionnaire.
^h SAX: Symbolic Aggregate Approximation.
^i PSQI: Pittsburgh Sleep Quality Index.
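
As an illustration of the correlation analyses listed in Table 4, the following minimal sketch computes a Spearman rank correlation between one contextual factor and sleep efficiency (SE). It is not taken from any of the reviewed studies. The function names and the sample values are hypothetical, and the formula SE = 100 × TST / TIB is the common convention, assumed here because the table does not define it.

```python
from math import sqrt


def sleep_efficiency(tst_min, tib_min):
    """SE as a percentage of time in bed (assumed convention: 100 * TST / TIB)."""
    return 100.0 * tst_min / tib_min


def average_ranks(values):
    """Rank values from 1 to n; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def spearman(x, y):
    """Spearman correlation: Pearson correlation applied to the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical nightly records for one user: (caffeine cups, TST min, TIB min).
nights = [(0, 450, 480), (1, 430, 480), (2, 400, 480), (3, 380, 480), (4, 330, 480)]
caffeine = [cups for cups, _, _ in nights]
se = [sleep_efficiency(tst, tib) for _, tst, tib in nights]
rho = spearman(caffeine, se)
print(f"Spearman rho between caffeine and SE: {rho:.2f}")
```

Because Spearman correlation works on ranks, it captures any monotonic relationship and is less sensitive to outlying nights than Pearson correlation, which may be one reason to prefer it for small samples such as 12 users over 2 weeks.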