eLife. 2021 Jul 14;10:e66410. doi: 10.7554/eLife.66410

Table 7. Model validation for hyperparameters selection.

The table lists losses of models with different hyperparameter values. N is the number of layers in the transformer architecture, and d_emb is the dimension of the embedding space. The loss shown is the average cross-entropy loss evaluated on a held-out validation set.

N \ d_emb    32       64       128
4            83.1%    88.4%    90.7%
6            86.3%    94.6%    96.8%
8            90.5%    96.4%    96.8%
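
The caption names average cross-entropy on a held-out validation set as the reported metric. As a minimal sketch of how such an average is commonly computed, the snippet below takes the mean negative log-probability that a model assigns to the true class. The function name, the clamping constant, and the toy predictions are assumptions made for illustration; they are not taken from the paper and do not reproduce the values in the table.

```python
import math


def average_cross_entropy(probs, targets, eps=1e-12):
    """Mean negative log-probability assigned to the true class.

    probs   -- list of per-example probability lists (hypothetical model output)
    targets -- list of true class indices, one per example
    eps     -- assumed floor that avoids log(0) for zero probabilities
    """
    if len(probs) != len(targets):
        raise ValueError("probs and targets must have the same length")
    if not probs:
        raise ValueError("at least one example is required")
    total = 0.0
    for p, t in zip(probs, targets):
        # Clamp the true-class probability before taking the log.
        total += -math.log(max(p[t], eps))
    return total / len(targets)


# Hypothetical held-out predictions for two examples and two classes.
val_probs = [[0.5, 0.5], [0.25, 0.75]]
val_targets = [0, 1]
loss = average_cross_entropy(val_probs, val_targets)
```

In a hyperparameter sweep of the kind tabulated above, a function like this would be evaluated once per (N, d_emb) configuration on the same held-out set so that the configurations are compared on equal footing.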