Ran He

Beyond GRPO and On-Policy Distillation: An Empirical Sparse-to-Dense Reward Principle for Language-Model Post-Training

May 14, 2026
Via arXiv

UniPrefill: Universal Long-Context Prefill Acceleration via Block-wise Dynamic Sparsification

May 07, 2026
Via arXiv

Understanding and Mitigating Spurious Signal Amplification in Test-Time Reinforcement Learning for Math Reasoning

Apr 23, 2026
Via arXiv

SpeechParaling-Bench: A Comprehensive Benchmark for Paralinguistic-Aware Speech Generation

Apr 22, 2026
Via arXiv

Advancing Vision Transformer with Enhanced Spatial Priors

Apr 20, 2026
Via arXiv

TIP: Token Importance in On-Policy Distillation

Apr 15, 2026
Via arXiv

OmniUMI: Towards Physically Grounded Robot Learning via Human-Aligned Multimodal Interaction

Apr 12, 2026
Via arXiv

Are GUI Agents Focused Enough? Automated Distraction via Semantic-level UI Element Injection

Apr 09, 2026
Via arXiv

Video-MME-v2: Towards the Next Stage in Benchmarks for Comprehensive Video Understanding

Apr 06, 2026
Via arXiv

MVPBench: A Multi-Video Perception Evaluation Benchmark for Multi-Modal Video Understanding

Mar 24, 2026
Via arXiv