Qiushi Sun

ScaleCUA: Scaling Open-Source Computer Use Agents with Cross-Platform Data

Sep 18, 2025
Via arXiv

CODA: Coordinating the Cerebrum and Cerebellum for a Dual-Brain Computer Use Agent with Decoupled Reinforcement Learning

Aug 27, 2025
Via arXiv

Dynamic and Generalizable Process Reward Modeling

Jul 23, 2025
Via arXiv

ScienceBoard: Evaluating Multimodal Autonomous Agents in Realistic Scientific Workflows

May 26, 2025
Via arXiv

Breaking the Data Barrier -- Building GUI Agents Through Task Generalization

Apr 15, 2025
Via arXiv

Genius: A Generalizable and Purely Unsupervised Self-Training Framework For Advanced Reasoning

Apr 11, 2025
Via arXiv

CapArena: Benchmarking and Analyzing Detailed Image Captioning in the LLM Era

Mar 16, 2025
Via arXiv

OS-Genesis: Automating GUI Agent Trajectory Construction via Reverse Task Synthesis

Dec 27, 2024
Via arXiv

OS-ATLAS: A Foundation Action Model for Generalist GUI Agents

Oct 30, 2024
Via arXiv

AgentStore: Scalable Integration of Heterogeneous Agents As Specialized Generalist Computer Assistant

Oct 24, 2024
Via arXiv