Xuandong Zhao

VIMPO: Value-Implicit Policy Optimization for LLMs

Jun 18, 2026
Via arXiv

Reward Hacking in Language Model Agents: Revisiting AI Safety Gridworlds

Jun 13, 2026
Via arXiv

MemFail: Stress-Testing Failure Modes of LLM Memory Systems

May 26, 2026
Via arXiv

Auditing Agent Harness Safety

May 14, 2026
Via arXiv

SkillsBench: Benchmarking How Well Agent Skills Work Across Diverse Tasks

Feb 13, 2026
Via arXiv

Making Bias Non-Predictive: Training Robust LLM Judges via Reinforcement Learning

Feb 02, 2026
Via arXiv

Clipping-Free Policy Optimization for Large Language Models

Jan 30, 2026
Via arXiv

Terminal-Bench: Benchmarking Agents on Hard, Realistic Tasks in Command Line Interfaces

Jan 17, 2026
Via arXiv

InfoSynth: Information-Guided Benchmark Synthesis for LLMs

Jan 02, 2026
Via arXiv

Machine Bullshit: Characterizing the Emergent Disregard for Truth in Large Language Models

Jul 10, 2025
Via arXiv