Weinan Zhang

EBench: Elemental Diagnosis of Generalist Mobile Manipulation Policies

Jun 16, 2026
Via arXiv

BALTO: Balanced Token-Level Policy Optimization for Hallucination Mitigation

Jun 14, 2026
Via arXiv

Retrospective Progress-Aware Self-Refinement for LLM Agent Training

Jun 12, 2026
Via arXiv

Communication Policy Evolution for Proactive LLM Agents

Jun 12, 2026
Via arXiv

SkillJuror: Measuring How Agent Skill Organization Changes Runtime Behavior

Jun 10, 2026
Via arXiv

DiffCold: A Diffusion-based Generative Model for Cold-Start Item Recommendation

Jun 10, 2026
Via arXiv

AHA-WAM: Asynchronous Horizon-Adaptive World-Action Modeling with Observation-Guided Context Routing

Jun 08, 2026
Via arXiv

SubtleMemory: A Benchmark for Fine-Grained Relational Memory Discrimination in Long-Horizon AI Agents

Jun 04, 2026
Via arXiv

Autoregressive Diffusion World Models for Off-Policy Evaluation of LLM Agents

Jun 04, 2026
Via arXiv

LatentSkill: From In-Context Textual Skills to In-Weight Latent Skills for LLM Agents

Jun 04, 2026
Via arXiv