Robuta

https://aclanthology.org/2024.emnlp-main.672/
Marion Di Marco, Alexander Fraser. Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing. 2024.
subword segmentation LLMs looking inflection
https://github.com/MeLeLBGU/SaGe
Code for SaGe subword tokenizer (EACL 2023).
GitHub SaGe code subword tokenizer