Qiaosheng Zhang

PerPilot: Personalizing VLM-based Mobile Agents via Memory and Exploration

Aug 25, 2025
Via arXiv

SafeWork-R1: Coevolving Safety and Intelligence under the AI-45$^{\circ}$ Law

Jul 24, 2025
Via arXiv

Unsupervised Skill Discovery through Skill Regions Differentiation

Jun 17, 2025
Via arXiv

The Avengers: A Simple Recipe for Uniting Smaller Language Models to Challenge Proprietary Giants

May 26, 2025
Via arXiv

MM-PRM: Enhancing Multimodal Mathematical Reasoning with Scalable Step-Level Supervision

May 19, 2025
Via arXiv

CPGD: Toward Stable Rule-based Reinforcement Learning for Language Models

May 18, 2025
Via arXiv

Do We Truly Need So Many Samples? Multi-LLM Repeated Sampling Efficiently Scales Test-Time Compute

Apr 02, 2025
Via arXiv

MM-Eureka: Exploring Visual Aha Moment with Rule-based Large-scale Reinforcement Learning

Mar 10, 2025
Via arXiv

If Multi-Agent Debate is the Answer, What is the Question?

Feb 12, 2025
Via arXiv

Graph Feedback Bandits on Similar Arms: With and Without Graph Structures

Jan 24, 2025
Via arXiv