Table 1.
Model selection for the best 5 models on the development set.
| n_block | p_drop | d_model | BLEU score |
|---|---|---|---|
| 4 | 0.3 | 512 | 12.11 |
| 6 | 0.3 | 512 | 12.04 |
| 8 | 0.3 | 512 | 11.59 |
| 6 | 0.3 | 256 | 11.62 |
| 8 | 0.3 | 256 | 11.38 |
Here, n_block is the number of layers used in the Transformer.