Table 3. Model network structural parameters.
| Index | Layer | Weight parameters | Trainable parameters |
|---|---|---|---|
| 1 | Conv2d(4, 8, kernel_size=(3, 1), stride=(2, 1)) | 4 × 8 × 3 × 1 = 96 | 96 |
| 2 | Conv2d(8, 16, kernel_size=(3, 1), stride=(2, 1)) | 8 × 16 × 3 × 1 = 384 | 384 |
| 3 | Conv2d(16, 64, kernel_size=(3, 1), stride=(2, 1)) | 16 × 64 × 3 × 1 = 3,072 | 3,072 |
| 4 | Linear(in_features=64, out_features=64) | 64 × 64 = 4,096 | 4,096 |
| 5 | MultiheadAttention(out_proj=(64, 64)) | 64 × 64 = 4,096 | 4,096 |
| 6 | Linear(in_features=64, out_features=64) | 64 × 64 = 4,096 | 4,096 |
| 7 | MultiheadAttention(out_proj=(64, 64)) | 64 × 64 = 4,096 | 4,096 |
| 8 | Linear(dense_soh: in_features=64, out_features=1) | 64 × 1 = 64 | 64 |
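
The per-layer counts in Table 3 can be checked with a few lines of arithmetic. The sketch below is only a restatement of the table, not the authors' code: the layer names (`conv1`, `attn1_out_proj`, and so on) are hypothetical labels, and the shapes are copied from the table in the order (in, out, kernel height, kernel width). Like the table, it counts weight tensors only and assumes that biases and any parameters not listed in the table are excluded.

```python
from math import prod

# Weight-tensor dimensions as listed in Table 3.
# Layer names are hypothetical labels introduced for this sketch.
# Assumption: biases and any unlisted parameters are not counted,
# matching the table.
LAYERS = [
    ("conv1", (4, 8, 3, 1)),        # Conv2d(4, 8, kernel_size=(3, 1))
    ("conv2", (8, 16, 3, 1)),       # Conv2d(8, 16, kernel_size=(3, 1))
    ("conv3", (16, 64, 3, 1)),      # Conv2d(16, 64, kernel_size=(3, 1))
    ("linear1", (64, 64)),          # Linear(64, 64)
    ("attn1_out_proj", (64, 64)),   # MultiheadAttention out_proj
    ("linear2", (64, 64)),          # Linear(64, 64)
    ("attn2_out_proj", (64, 64)),   # MultiheadAttention out_proj
    ("dense_soh", (64, 1)),         # Linear(64, 1)
]


def weight_count(shape):
    """Number of entries in a weight tensor with the given dimensions."""
    return prod(shape)


counts = {name: weight_count(shape) for name, shape in LAYERS}
total = sum(counts.values())

for name, n in counts.items():
    print(f"{name:>15}: {n:,}")
print(f"{'total':>15}: {total:,}")
```

Summing the eight rows in this way gives the total number of weights listed in the table; a count obtained from a framework may be larger if it also includes bias terms or other projection matrices.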