Xin Eric Wang

Auditing Agent Harness Safety

May 14, 2026
Via arXiv

Stateful Reasoning via Insight Replay

May 14, 2026
Via arXiv

EnactToM: An Evolving Benchmark for Functional Theory of Mind in Embodied Agents

May 11, 2026
Via arXiv

On the Reliability of Computer Use Agents

Apr 20, 2026
Via arXiv

Proactive Agent Research Environment: Simulating Active Users to Evaluate Proactive Assistants

Apr 01, 2026
Via arXiv

Context Bootstrapped Reinforcement Learning

Mar 19, 2026
Via arXiv

Learning Situated Awareness in the Real World

Feb 18, 2026
Via arXiv

CM2: Reinforcement Learning with Checklist Rewards for Multi-Turn and Multi-Step Agentic Tool Use

Feb 12, 2026
Via arXiv

Group-Evolving Agents: Open-Ended Self-Improvement via Experience Sharing

Feb 04, 2026
Via arXiv

SafeGround: Know When to Trust GUI Grounding Models via Uncertainty Calibration

Feb 03, 2026
Via arXiv