Zhiheng Xi

ChartE$^{3}$: A Comprehensive Benchmark for End-to-End Chart Editing

Jan 29, 2026
Via arXiv

AgentDoG: A Diagnostic Guardrail Framework for AI Agent Safety and Security

Jan 26, 2026
Via arXiv

Which Reasoning Trajectories Teach Students to Reason Better? A Simple Metric of Informative Alignment

Jan 20, 2026
Via arXiv

Locate, Steer, and Improve: A Practical Survey of Actionable Mechanistic Interpretability in Large Language Models

Jan 20, 2026
Via arXiv

FRoM-W1: Towards General Humanoid Whole-Body Control with Language Instructions

Jan 19, 2026
Via arXiv

ABC-Bench: Benchmarking Agentic Backend Coding in Real-World Development

Jan 16, 2026
Via arXiv

OpenNovelty: An LLM-powered Agentic System for Verifiable Scholarly Novelty Assessment

Jan 04, 2026
Via arXiv

Memory in the Age of AI Agents

Dec 15, 2025
Via arXiv

AgentPRM: Process Reward Models for LLM Agents via Step-Wise Promise and Progress

Nov 11, 2025
Via arXiv

Counteracting Matthew Effect in Self-Improvement of LVLMs through Head-Tail Re-balancing

Oct 30, 2025
Via arXiv