2019 May 23;14(5):e0216922. doi: 10.1371/journal.pone.0216922

Table 2. Topic model evaluation (NPMI and MTA) for different training conditions and different numbers of topics (K).

The two “MT” settings use a full machine translation system, while the “word replacement” approach approximates machine translation by simply replacing the words with entries in a bilingual dictionary.

Training data	K	NPMI Mean	NPMI SD	NPMI Max	MTA Mean	MTA SD	MTA Max
MT (all tweets)	10	.185	.063	.291	3.15	1.85	6.00
MT (translations only)	10	.101	.058	.202	7.45	1.72	9.00
Word replacement	10	.130	.075	.289	3.30	2.07	6.50
MT (all tweets)	25	.112	.082	.295	3.30	2.44	7.50
MT (translations only)	25	.111	.062	.247	7.44	1.41	9.50
Word replacement	25	.126	.086	.327	3.28	2.13	7.50
MT (all tweets)	50	.096	.097	.342	3.22	2.05	8.50
MT (translations only)	50	.098	.070	.306	7.37	1.42	10.00
Word replacement	50	.126	.085	.361	3.39	2.12	8.50
MT (all tweets)	75	.150	.086	.424	2.77	1.89	9.00
MT (translations only)	75	.089	.076	.346	7.12	1.51	10.00
Word replacement	75	.124	.085	.363	3.61	2.08	9.00
MT (all tweets)	100	.127	.082	.404	2.26	1.58	7.00
MT (translations only)	100	.078	.068	.327	6.58	1.47	10.00
Word replacement	100	.113	.086	.384	3.49	1.94	9.00
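
The “word replacement” condition described in the table note can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `word_replace`, the toy dictionary, and the choice to leave out-of-dictionary words unchanged are all assumptions made for illustration.

```python
def word_replace(tokens, bilingual_dict):
    """Approximate translation by per-word dictionary lookup.

    Assumption: words missing from the dictionary are kept as-is;
    the source does not say how such words are handled.
    """
    return [bilingual_dict.get(tok.lower(), tok) for tok in tokens]


# Hypothetical toy dictionary (romanized Hindi -> English), illustration only.
toy_dict = {"bukhar": "fever", "dawa": "medicine"}

tweet = "mujhe bukhar hai dawa chahiye".split()
translated = word_replace(tweet, toy_dict)
print(" ".join(translated))
```

Unlike a full machine translation system, this lookup ignores word order, context, and morphology, which is why the note calls it an approximation.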