Ran Xu

WALT: Web Agents that Learn Tools

Oct 01, 2025
Via arXiv

SCUBA: Salesforce Computer Use Benchmark

Sep 30, 2025
Via arXiv

GP3: A 3D Geometry-Aware Policy with Multi-View Images for Robotic Manipulation

Sep 19, 2025
Via arXiv

CoAct-1: Computer-using Agents with Coding as Actions

Aug 05, 2025
Via arXiv

RAG in the Wild: On the (In)effectiveness of LLMs with Mixture-of-Knowledge Retrieval Augmentation

Jul 26, 2025
Via arXiv

MedAgentGym: Training LLM Agents for Code-Based Medical Reasoning at Scale

Jun 04, 2025
Via arXiv

BLIP3-o: A Family of Fully Open Unified Multimodal Models-Architecture, Training and Dataset

May 14, 2025
Via arXiv

DyMU: Dynamic Merging and Virtual Unmerging for Efficient VLMs

Apr 23, 2025
Via arXiv

Collab-RAG: Boosting Retrieval-Augmented Generation for Complex Question Answering via White-Box and Black-Box LLM Collaboration

Apr 07, 2025
Via arXiv

Could AI Trace and Explain the Origins of AI-Generated Images and Text?

Apr 05, 2025
Via arXiv