Zhaoxiang Zhang

Ocean4D: Generative Underwater 4D Reconstruction via Medium-Aware Video Diffusion

Jun 22, 2026
Via arXiv

CLI-Universe: Towards Verifiable Task Synthesis Engine for Terminal Agents

Jun 22, 2026
Via arXiv

World Pilot: Steering Vision-Language-Action Models with World-Action Priors

Jun 10, 2026
Via arXiv

TVIR: Building Deep Research Agents Towards Text–Visual Interleaved Report Generation

Jun 01, 2026
Via arXiv

MobileGym: A Verifiable and Highly Parallel Simulation Platform for Mobile GUI Agent Research

May 27, 2026
Via arXiv

GoClick: Lightweight Element Grounding Model for Autonomous GUI Interaction

Apr 27, 2026
Via arXiv

AutoGUI-v2: A Comprehensive Multi-Modal GUI Functionality Understanding Benchmark

Apr 27, 2026
Via arXiv

WebCompass: Towards Multimodal Web Coding Evaluation for Code Language Models

Apr 20, 2026
Via arXiv

CodeTracer: Towards Traceable Agent States

Apr 14, 2026
Via arXiv

ReinDriveGen: Reinforcement Post-Training for Out-of-Distribution Driving Scene Generation

Apr 01, 2026
Via arXiv