Table 10.
No. of Tokens | Vocabulary | Inference Time (s) | Hellinger Distance (Unlemmatized) | Jaccard Distance (Unlemmatized) | Hellinger Distance (Lemmatized) | Jaccard Distance (Lemmatized) |
---|---|---|---|---|---|---|
604,389 | 85,463 | 33.14 | 0.476 | 0.970 | 0.491 | 0.993 |
561,648 | 42,722 | 29.63 | 0.495 | 0.968 | 0.546 | 0.998 |
531,870 | 27,533 | 26.77 | 0.481 | 0.982 | 0.520 | 0.996 |
512,085 | 21,238 | 22.15 | 0.489 | 0.982 | 0.517 | 0.999 |
496,373 | 17,310 | 18.92 | 0.495 | 0.982 | 0.528 | 1.000 |
483,108 | 14,657 | 16.55 | 0.492 | 0.983 | 0.526 | 0.999 |