Yuval Pinter

Ben-Gurion University of the Negev

The Effect of Scripts and Formats on LLM Numeracy

Jan 21, 2026
Via arXiv

Which Pieces Does Unigram Tokenization Really Need?

Dec 14, 2025
Via arXiv

Hebrew Diacritics Restoration using Visual Representation

Oct 30, 2025
Via arXiv

Probing Subphonemes in Morphology Models

May 16, 2025
Via arXiv

Boundless Byte Pair Encoding: Breaking the Pre-tokenization Barrier

Mar 31, 2025
Via arXiv

Splintering Nonconcatenative Languages for Better Tokenization

Mar 18, 2025
Via arXiv

Token-Level Privacy in Large Language Models

Mar 05, 2025
Via arXiv

How Much is Enough? The Diminishing Returns of Tokenization Training Data

Feb 27, 2025
Via arXiv

Information Types in Product Reviews

Feb 20, 2025
Via arXiv

Don't Touch My Diacritics

Oct 31, 2024
Via arXiv