Xiaoye Qu

Advancing Multimodal Reasoning: From Optimized Cold Start to Staged Reinforcement Learning

Jun 04, 2025
Via arXiv

Benchmarking Multimodal Knowledge Conflict for Large Multimodal Models

May 26, 2025
Via arXiv

Divide and Conquer: Grounding LLMs as Efficient Decision-Making Agents via Offline Hierarchical Reinforcement Learning

May 26, 2025
Via arXiv

Step-level Reward for Free in RL-based T2I Diffusion Model Fine-tuning

May 25, 2025
Via arXiv

SATORI-R1: Incentivizing Multimodal Reasoning with Spatial Grounding and Verifiable Rewards

May 25, 2025
Via arXiv

Scaling Reasoning, Losing Control: Evaluating Instruction Following in Large Reasoning Models

May 20, 2025
Via arXiv

OpenThinkIMG: Learning to Think with Images via Visual Tool Reinforcement Learning

May 13, 2025
Via arXiv

Learning to Reason under Off-Policy Guidance

Apr 22, 2025
Via arXiv

SEE: Continual Fine-tuning with Sequential Ensemble of Experts

Apr 09, 2025
Via arXiv

A Survey of Efficient Reasoning for Large Reasoning Models: Language, Multimodality, and Beyond

Mar 27, 2025
Via arXiv