Table 1:
Model | Conv. skip | Trans. skip | Parameters (M) |
---|---|---|---|
w/o conv. skip | ✓ | - | 46.70 |
w/o Trans. skip | - | ✓ | 41.55 |
w/o positional embedding | ✓ | ✓ | 46.77 |
w/ shuffling | ✓ | ✓ | 46.77 |
w/ relative positional bias | ✓ | ✓ | 46.77 |
w/ learned positional embedding | ✓ | ✓ | 63.63 |
w/ sinusoidal positional embedding | ✓ | ✓ | 46.77 |