Table 8. SepFormer architecture.
Layer | Output shape |
---|---|
Input | (1, 16,000) |
Embedding: Conv1d(1, 256, kernel_size = (16,), stride = (8,), bias = False) | (256, 1,999) |
ReLU_1() | (256, 1,999) |
Conv1d(256, 256, kernel_size = (1,), stride = (1,), bias = False) | (256, 1,999) |
Encoder_1 | (18, 250, 256) |
Encoder_2 | (18, 250, 256) |
… | … |
Encoder_31 | (18, 250, 256) |
Encoder_32 | (18, 250, 256) |
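
The output shapes in Table 8 can be checked with a few lines of arithmetic. The sketch below is not taken from the original implementation. `conv1d_out_len` uses the standard output-length formula for an unpadded 1-D convolution. `num_chunks` is a hypothetical helper that assumes the 1,999 frames are split into chunks of 250 with 50% overlap and dual-path-style zero padding. That padding scheme is an assumption, although it reproduces the 18 chunks listed in the table.

```python
def conv1d_out_len(n_in, kernel_size, stride, padding=0, dilation=1):
    """Output length of a 1-D convolution (standard formula)."""
    return (n_in + 2 * padding - dilation * (kernel_size - 1) - 1) // stride + 1


def num_chunks(n_frames, chunk_size):
    """Number of 50%-overlapping chunks after zero padding.

    ASSUMPTION: the sequence is padded at the end so that it tiles evenly,
    then by half a chunk on both sides, as in common dual-path segmentation
    code. The table itself does not state the padding scheme.
    """
    hop = chunk_size // 2
    gap = chunk_size - (hop + n_frames % chunk_size) % chunk_size
    padded = n_frames + gap + 2 * hop
    return (padded - chunk_size) // hop + 1


# Embedding: Conv1d(1, 256, kernel_size=16, stride=8) on 16,000 samples.
frames = conv1d_out_len(16000, kernel_size=16, stride=8)

# The following 1x1 convolution with stride 1 leaves the length unchanged.
frames_after_1x1 = conv1d_out_len(frames, kernel_size=1, stride=1)

# Chunking into segments of 250 frames (assumed 50% overlap).
chunks = num_chunks(frames_after_1x1, chunk_size=250)

print(frames, frames_after_1x1, chunks)
```

Under these assumptions the script gives 1,999 frames after both convolutions and 18 chunks, matching the (256, 1,999) and (18, 250, 256) entries above.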