Yuval Pinter

Ben-Gurion University of the Negev

Universal NER v2: Towards a Massively Multilingual Named Entity Recognition Benchmark

Apr 14, 2026

Faster Superword Tokenization

Apr 06, 2026

The Degree of Language Diacriticity and Its Effect on Tasks

Mar 29, 2026

The Effect of Scripts and Formats on LLM Numeracy

Jan 21, 2026

Which Pieces Does Unigram Tokenization Really Need?

Dec 14, 2025

Hebrew Diacritics Restoration using Visual Representation

Oct 30, 2025

Probing Subphonemes in Morphology Models

May 16, 2025

Boundless Byte Pair Encoding: Breaking the Pre-tokenization Barrier

Mar 31, 2025

Splintering Nonconcatenative Languages for Better Tokenization

Mar 18, 2025

Token-Level Privacy in Large Language Models

Mar 05, 2025