Picture for Caiming Xiong

Caiming Xiong

Salesforce AI Research

ToolLibGen: Scalable Automatic Tool Creation and Aggregation for LLM Reasoning

Add code
Oct 09, 2025
Viaarxiv icon

WALT: Web Agents that Learn Tools

Add code
Oct 01, 2025
Viaarxiv icon

GUI-KV: Efficient GUI Agents via KV Cache with Spatio-Temporal Awareness

Add code
Oct 01, 2025
Viaarxiv icon

SCUBA: Salesforce Computer Use Benchmark

Add code
Sep 30, 2025
Viaarxiv icon

LoCoBench: A Benchmark for Long-Context Large Language Models in Complex Software Engineering

Add code
Sep 11, 2025
Viaarxiv icon

Strefer: Empowering Video LLMs with Space-Time Referring and Reasoning via Synthetic Instruction Data

Add code
Sep 03, 2025
Viaarxiv icon

MCP-Universe: Benchmarking Large Language Models with Real-World Model Context Protocol Servers

Add code
Aug 20, 2025
Viaarxiv icon

CoAct-1: Computer-using Agents with Coding as Actions

Add code
Aug 05, 2025
Viaarxiv icon

UserBench: An Interactive Gym Environment for User-Centric Agents

Add code
Jul 29, 2025
Viaarxiv icon

Beyond Accuracy: Dissecting Mathematical Reasoning for LLMs Under Reinforcement Learning

Add code
Jun 05, 2025
Viaarxiv icon