Xiaoye Qu

Advancing Multimodal Reasoning: From Optimized Cold Start to Staged Reinforcement Learning

Jun 04, 2025
Via arXiv

Divide and Conquer: Grounding LLMs as Efficient Decision-Making Agents via Offline Hierarchical Reinforcement Learning

May 26, 2025
Via arXiv

Benchmarking Multimodal Knowledge Conflict for Large Multimodal Models

May 26, 2025
Via arXiv

Step-level Reward for Free in RL-based T2I Diffusion Model Fine-tuning

May 25, 2025
Via arXiv

SATORI-R1: Incentivizing Multimodal Reasoning with Spatial Grounding and Verifiable Rewards

May 25, 2025
Via arXiv

Scaling Reasoning, Losing Control: Evaluating Instruction Following in Large Reasoning Models

May 20, 2025
Via arXiv

OpenThinkIMG: Learning to Think with Images via Visual Tool Reinforcement Learning

May 13, 2025
Via arXiv

Learning to Reason under Off-Policy Guidance

Apr 22, 2025
Via arXiv

SEE: Continual Fine-tuning with Sequential Ensemble of Experts

Apr 09, 2025
Via arXiv

A Survey of Efficient Reasoning for Large Reasoning Models: Language, Multimodality, and Beyond

Mar 27, 2025
Via arXiv