Qiaosheng Zhang

PerPilot: Personalizing VLM-based Mobile Agents via Memory and Exploration

Aug 25, 2025
Via arXiv

SafeWork-R1: Coevolving Safety and Intelligence under the AI-45$^{\circ}$ Law

Jul 24, 2025
Via arXiv

Unsupervised Skill Discovery through Skill Regions Differentiation

Jun 17, 2025
Via arXiv

The Avengers: A Simple Recipe for Uniting Smaller Language Models to Challenge Proprietary Giants

May 26, 2025
Via arXiv

MM-PRM: Enhancing Multimodal Mathematical Reasoning with Scalable Step-Level Supervision

May 19, 2025
Via arXiv

CPGD: Toward Stable Rule-based Reinforcement Learning for Language Models

May 18, 2025
Via arXiv

Do We Truly Need So Many Samples? Multi-LLM Repeated Sampling Efficiently Scales Test-Time Compute

Apr 02, 2025
Via arXiv

MM-Eureka: Exploring Visual Aha Moment with Rule-based Large-scale Reinforcement Learning

Mar 10, 2025
Via arXiv

If Multi-Agent Debate is the Answer, What is the Question?

Feb 12, 2025
Via arXiv

Graph Feedback Bandits on Similar Arms: With and Without Graph Structures

Jan 24, 2025
Via arXiv