Jiří Balhar

Mitigating Language Barriers in Education: Developing Multilingual Digital Learning Materials with Machine Translation

Sep 11, 2025
Via arXiv

Tokenization Impacts Multilingual Language Modeling: Assessing Vocabulary Allocation and Overlap Across Languages

May 26, 2023
Via arXiv