Picture for Zhengyuan Yang

Zhengyuan Yang

ViCrit: A Verifiable Reinforcement Learning Proxy Task for Visual Perception in VLMs

Add code
Jun 11, 2025
Viaarxiv icon

What makes Reasoning Models Different? Follow the Reasoning Leader for Efficient Decoding

Add code
Jun 08, 2025
Viaarxiv icon

Audio-Aware Large Language Models as Judges for Speaking Styles

Add code
Jun 06, 2025
Viaarxiv icon

Are Unified Vision-Language Models Necessary: Generalization Across Understanding and Generation

Add code
May 29, 2025
Viaarxiv icon

Point-RFT: Improving Multimodal Reasoning with Visually Grounded Reinforcement Finetuning

Add code
May 26, 2025
Viaarxiv icon

OpenThinkIMG: Learning to Think with Images via Visual Tool Reinforcement Learning

Add code
May 13, 2025
Viaarxiv icon

SITE: towards Spatial Intelligence Thorough Evaluation

Add code
May 08, 2025
Viaarxiv icon

RAGEN: Understanding Self-Evolution in LLM Agents via Multi-Turn Reinforcement Learning

Add code
Apr 24, 2025
Viaarxiv icon

SoTA with Less: MCTS-Guided Sample Selection for Data-Efficient Visual Reasoning Self-Improvement

Add code
Apr 10, 2025
Viaarxiv icon

V-MAGE: A Game Evaluation Framework for Assessing Visual-Centric Capabilities in Multimodal Large Language Models

Add code
Apr 08, 2025
Viaarxiv icon