

Power Reinforcement Post-Training of Text-to-Image Models with Super-Linear Advantage Shaping

May 11, 2026

Grounded or Guessing? LVLM Confidence Estimation via Blind-Image Contrastive Ranking

May 11, 2026

Count Anything at Any Granularity

May 11, 2026

BabelDOC: Better Layout-Preserving PDF Translation via Intermediate Representation

May 11, 2026

Transcoda: End-to-End Zero-Shot Optical Music Recognition via Data-Centric Synthetic Training

May 11, 2026

Probing Cross-modal Information Hubs in Audio-Visual LLMs

May 11, 2026

Likelihood scoring for continuations of mathematical text: a self-supervised benchmark with tests for shortcut vulnerabilities

May 11, 2026

The Last Word Often Wins: A Format Confound in Chain-of-Thought Corruption Studies

May 11, 2026

TrajPrism: A Multi-Task Benchmark for Language-Grounded Urban Trajectory Understanding

May 11, 2026

Beyond the Last Layer: Multi-Layer Representation Fusion for Visual Tokenization

May 11, 2026