. 2023 Jun 28;11:e42750. doi: 10.2196/42750

Table 4.

Summary of the data mining methods used in the reviewed studies.

| Data mining method | Technique or algorithm | Data size | Independent variable | Dependent variable |
| --- | --- | --- | --- | --- |
| Correlation analysis | Pearson correlation [20] | Preliminary study: 24 users over 20 days; final study: 19 users over 21 days | TST^a and contextual factors | SOL^b, NAWK^c, and sleep rating |
| Correlation analysis | Spearman correlation [16] | 12 users over 2 weeks | Contextual factors | TST, WASO^d, NAWK, SOL, and SE^e |
| Correlation analysis | Repeated measures correlation [19] | 10 users over 2 weeks | Bedtime, TIB^f, and contextual factors | SE, SOL, NAWK, restlessness, TIB, and LSEQ^g |
| Regression analysis | Piecewise fixed effects regression [34] | 31,793 users over 18 months; all American users | Time of day, time after waking up, and sleep duration | Cognitive performance |
| Regression analysis | Simple linear regression [43] | Approximately 20,000 users over 4 months | Contextual factors | SOL, NAWK, and SE |
| Regression analysis | Linear mixed effects model [35] | 557 users over 1 year | Bedtime regularity | Resting heart rate |
| Rule induction | Apriori algorithm [33] | 1 user over 180 days; 4 users over 2 weeks | Contextual factors | SE |
| Rule induction | Learn from Examples using Rough Sets [44] | 280 users over 1 month; only the data of males were used | Contextual factors | Sleep ratio |
| Rule induction | Event mining (+causal inference) [42] | 1 user over 800 days | Contextual factors | SOL, WASO, NAWK, and SE |
| Causal inference | Stratified propensity score analysis [43] | Approximately 20,000 users over 4 months | Contextual factors | SOL, NAWK, and SE |
| Causal inference | Bayesian network analysis [36] | 5200 users over 6 months | Contextual factors and bedtime | Contextual factors and bedtime |
| Time series analysis | Anomaly detection [37] | 1 user over 35 days | Fitbit-measured intraday time series, TST, WASO, NAWK, and bedtime | Permutation entropy of sleep time series |
| Time series analysis | SAX^h-based motif matching and principle optimization [38] | 100 users over 10 weeks | Heart rate time series data | PSQI^i |
| Statistical test | Unpaired 2-samples Wilcoxon test [40] | 271 users over 8 months | Contextual factors | Statistical differences between good and poor sleep |
| Decision tree | J4.8 Classifier [31] | 400 users over 15 months | Contextual factors | PSQI |

^a TST: total sleep time.

^b SOL: sleep onset latency.

^c NAWK: number of awakenings.

^d WASO: wake after sleep onset.

^e SE: sleep efficiency.

^f TIB: time in bed.

^g LSEQ: Leeds Sleep Evaluation Questionnaire.

^h SAX: Symbolic Aggregate Approximation.

^i PSQI: Pittsburgh Sleep Quality Index.
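
As an illustration of the correlation analysis row in Table 4, the following minimal Python sketch contrasts Pearson and Spearman correlation between one contextual factor and sleep onset latency. It is not taken from any of the reviewed studies: the function names (`pearson`, `average_ranks`, `spearman`) and the sample values (`caffeine_cups`, `sol_minutes`) are hypothetical and invented for demonstration only.

```python
from math import sqrt


def pearson(x, y):
    """Pearson product-moment correlation between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / sqrt(sxx * syy)


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Made-up illustrative values, not data from any reviewed study.
caffeine_cups = [0, 1, 2, 3, 4]      # hypothetical contextual factor
sol_minutes = [10, 12, 15, 30, 60]   # hypothetical sleep onset latency

r_pearson = pearson(caffeine_cups, sol_minutes)
r_spearman = spearman(caffeine_cups, sol_minutes)
print(f"Pearson r = {r_pearson:.3f}, Spearman rho = {r_spearman:.3f}")
```

In this sketch the relationship is monotonic but not linear, so the rank-based Spearman coefficient is higher than the Pearson coefficient. This is one reason a study with skewed or ordinal sleep metrics might prefer a rank-based measure.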