Yue Wang

Unsupervised Post-Training for Multi-Modal LLM Reasoning via GRPO

May 28, 2025
Via arXiv

Advancing Multimodal Reasoning via Reinforcement Learning with Cold Start

May 28, 2025
Via arXiv

Pessimism Principle Can Be Effective: Towards a Framework for Zero-Shot Transfer Reinforcement Learning

May 24, 2025
Via arXiv

U2-BENCH: Benchmarking Large Vision-Language Models on Ultrasound Understanding

May 23, 2025
Via arXiv

Two Experts Are All You Need for Steering Thinking: Reinforcing Cognitive Effort in MoE Reasoning Models Without Additional Training

May 20, 2025
Via arXiv

DGRO: Enhancing LLM Reasoning via Exploration-Exploitation Control and Reward Variance Management

May 19, 2025
Via arXiv

A Finite-Sample Analysis of Distributionally Robust Average-Reward Reinforcement Learning

May 18, 2025
Via arXiv

MSCI: Addressing CLIP's Inherent Limitations for Compositional Zero-Shot Learning

May 15, 2025
Via arXiv

ManipBench: Benchmarking Vision-Language Models for Low-Level Robot Manipulation

May 14, 2025
Via arXiv

Sentient Agent as a Judge: Evaluating Higher-Order Social Cognition in Large Language Models

May 01, 2025
Via arXiv