Rongpeng Zhu

DGPO: Distribution Guided Policy Optimization for Fine Grained Credit Assignment

May 05, 2026
Via arXiv

HiMAC: Hierarchical Macro-Micro Learning for Long-Horizon LLM Agents

Mar 01, 2026
Via arXiv