Table 3.
Comparison of cross-validation methods and performance metrics (columns) for each model using a single sensor (rows).
Sensors | Model | k-fold Acc | k-fold F1 | leave-recordings-out Acc | leave-recordings-out F1 | leave-one-subject-out Acc | leave-one-subject-out F1 |
---|---|---|---|---|---|---|---|
{LW} | CNN-LSTM | 0.36 (±0.03) | 0.31 (±0.04) | 0.36 (±0.04) | 0.30 (±0.03) | 0.36 (±0.04) | 0.31 (±0.02) |
{LW} | ResNet | 0.34 (±0.02) | 0.32 (±0.03) | 0.33 (±0.06) | 0.30 (±0.07) | 0.35 (±0.04) | 0.33 (±0.05) |
{LW} | DeepConvLSTM | 0.30 (±0.02) | 0.18 (±0.04) | 0.30 (±0.02) | 0.19 (±0.02) | 0.29 (±0.03) | 0.18 (±0.03) |
{RW} | CNN-LSTM | 0.35 (±0.04) | 0.28 (±0.05) | 0.35 (±0.04) | 0.28 (±0.05) | 0.36 (±0.03) | 0.28 (±0.03) |
{RW} | ResNet | 0.34 (±0.07) | 0.32 (±0.07) | 0.38 (±0.07) | 0.36 (±0.07) | 0.33 (±0.05) | 0.32 (±0.04) |
{RW} | DeepConvLSTM | 0.27 (±0.02) | 0.16 (±0.01) | 0.29 (±0.04) | 0.16 (±0.04) | 0.29 (±0.04) | 0.17 (±0.03) |
{ST} | CNN-LSTM | 0.32 (±0.04) | 0.25 (±0.03) | 0.30 (±0.03) | 0.23 (±0.04) | 0.32 (±0.03) | 0.24 (±0.04) |
{ST} | ResNet | 0.26 (±0.05) | 0.24 (±0.05) | 0.26 (±0.06) | 0.24 (±0.05) | 0.29 (±0.06) | 0.27 (±0.06) |
{ST} | DeepConvLSTM | 0.28 (±0.06) | 0.19 (±0.05) | 0.27 (±0.05) | 0.16 (±0.05) | 0.28 (±0.06) | 0.18 (±0.07) |
{LF} | CNN-LSTM | 0.29 (±0.04) | 0.23 (±0.06) | 0.29 (±0.03) | 0.23 (±0.04) | 0.29 (±0.02) | 0.23 (±0.03) |
{LF} | ResNet | 0.21 (±0.03) | 0.19 (±0.04) | 0.22 (±0.03) | 0.20 (±0.03) | 0.20 (±0.02) | 0.18 (±0.02) |
{LF} | DeepConvLSTM | 0.24 (±0.03) | 0.15 (±0.01) | 0.21 (±0.03) | 0.13 (±0.02) | 0.26 (±0.05) | 0.17 (±0.05) |
{RF} | CNN-LSTM | 0.32 (±0.02) | 0.25 (±0.03) | 0.30 (±0.04) | 0.24 (±0.05) | 0.29 (±0.04) | 0.23 (±0.04) |
{RF} | ResNet | 0.23 (±0.06) | 0.19 (±0.04) | 0.25 (±0.05) | 0.22 (±0.04) | 0.25 (±0.05) | 0.22 (±0.05) |
{RF} | DeepConvLSTM | 0.25 (±0.06) | 0.16 (±0.06) | 0.27 (±0.04) | 0.18 (±0.04) | 0.26 (±0.03) | 0.17 (±0.03) |
The best-performing model for each sensor and validation setup is highlighted in bold.
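To illustrate how the three validation setups in Table 3 differ, the following pure-Python sketch generates train/test index splits for each one. It is a minimal sketch, not the procedure used to produce the table. It assumes that leave-recordings-out holds out whole recordings and, for simplicity, holds out one recording per split. The helper names (`k_fold_splits`, `leave_one_group_out_splits`) and the toy `subjects` and `recordings` labels are hypothetical.

```python
from collections import defaultdict


def k_fold_splits(n_samples, k):
    """Yield (train_idx, test_idx) for plain k-fold over contiguous blocks (no shuffling)."""
    indices = list(range(n_samples))
    # Distribute the remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size


def leave_one_group_out_splits(groups):
    """Yield (held_out, train_idx, test_idx), holding out one group label per split."""
    by_group = defaultdict(list)
    for idx, label in enumerate(groups):
        by_group[label].append(idx)
    for held_out in sorted(by_group):
        test = by_group[held_out]
        train = sorted(i for label, idxs in by_group.items() if label != held_out for i in idxs)
        yield held_out, train, test


# Hypothetical metadata: one label per windowed sample (3 subjects, 2 recordings each).
subjects = ["s1"] * 4 + ["s2"] * 4 + ["s3"] * 4
recordings = ["r1", "r1", "r2", "r2"] * 3
# Assumption: a recording is identified by its (subject, recording) pair.
recording_ids = [f"{s}_{r}" for s, r in zip(subjects, recordings)]

# k-fold ignores grouping, so windows of one subject can land in both train and test.
kfold = list(k_fold_splits(len(subjects), k=3))

# Leave-one-subject-out: every window of the held-out subject goes to the test set.
loso = list(leave_one_group_out_splits(subjects))

# Leave-recordings-out (assumed): every window of the held-out recording goes to the test set.
lro = list(leave_one_group_out_splits(recording_ids))

for held_out, train, test in loso:
    # The held-out subject never contributes a training window.
    assert all(subjects[i] != held_out for i in train)
```

With grouped splits, windows from the same subject or recording never appear on both sides of a split, which is the usual motivation for reporting them alongside plain k-fold.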