Bofei Zhang

TongUI: Building Generalized GUI Agents by Learning from Multimodal Web Tutorials

Apr 17, 2025
Via arXiv

Multi-modal Agent Tuning: Building a VLM-Driven Agent for Efficient Tool Usage

Dec 20, 2024
Via arXiv

FIRE: A Dataset for Feedback Integration and Refinement Evaluation of Multimodal Models

Jul 16, 2024
Via arXiv