2024 Jun 5;630(8018):841–846. doi: 10.1038/s41586-024-07335-x

Table 2.

Improvements from EOM and CL

	eng_Latn-xx				xx-eng_Latn				xx-yy	Average
	All	High	Low	Very low	All	High	Low	Very low	All	All
(1) Baseline MoE	44.8	54.3	41.4	39.0	56.2	64.0	53.4	52.5	41.9	47.6
(2) Baseline MoE + CL	45.2	54.7	41.8	39.5	57.6	64.5	55.1	55.4	42.7	48.5
(3) Baseline MoE + CL + EOM	45.4	52.9	41.6	41.2	57.2	61.4	55.1	56.4	44.9	51.0

We report chrF++ scores on the FLORES-200 dev set for different types of language pairs. For eng_Latn-xx and xx-eng_Latn, we included all 199 pairs. For xx-yy, we randomly chose 200 directions. We observe that combining EOM and CL is particularly helpful for low-resource and very low-resource languages. A language is defined as very low-resource if it has fewer than 100,000 samples across all pairings with any other language in our dataset. The highest score in each column is shown in bold.
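The very low-resource criterion in the caption can be illustrated with a minimal sketch. It assumes that a language's sample count is the sum of bitext samples over every pair in which it appears, on either side. The function names, the `hyp_A` and `hyp_B` language codes, and all counts below are hypothetical and are not taken from the dataset.

```python
from collections import defaultdict

# Threshold stated in the caption: fewer than 100,000 samples.
VERY_LOW_RESOURCE_THRESHOLD = 100_000


def total_samples_per_language(pair_counts):
    """Sum samples for each language over all pairs it takes part in.

    pair_counts maps (source, target) tuples to a sample count.
    Assumption: each pair is listed once, and its samples count
    toward both languages in the pair.
    """
    totals = defaultdict(int)
    for (src, tgt), n_samples in pair_counts.items():
        totals[src] += n_samples
        totals[tgt] += n_samples
    return dict(totals)


def is_very_low_resource(lang, pair_counts,
                         threshold=VERY_LOW_RESOURCE_THRESHOLD):
    """Return True if lang has fewer than `threshold` samples in total.

    A language missing from pair_counts has zero samples and is
    therefore reported as very low-resource.
    """
    totals = total_samples_per_language(pair_counts)
    return totals.get(lang, 0) < threshold


# Hypothetical counts, for illustration only.
example_counts = {
    ("eng_Latn", "hyp_A"): 5_000_000,
    ("eng_Latn", "hyp_B"): 40_000,
    ("hyp_A", "hyp_B"): 30_000,
}

for lang in ("eng_Latn", "hyp_A", "hyp_B"):
    print(lang, is_very_low_resource(lang, example_counts))
```

In this toy data only `hyp_B` falls below the threshold, because its two pairings together contribute 70,000 samples.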