Skip to main content
[Preprint]. 2024 Sep 24:2024.09.18.612131. [Version 2] doi: 10.1101/2024.09.18.612131

Fig. 5:

Fig. 5:

Model prediction accuracy of the four tokenization schemes. We did not include the node-ID-based method in character level accuracy figure because it is vague to define it when the sequence lengths varies too much represented by different tokens.