Lizhou Cai

TRACE: A Unified Rollout Budget Allocation Framework for Efficient Agentic Reinforcement Learning

Jun 09, 2026
Via arXiv

Listwise Policy Optimization: Group-based RLVR as Target-Projection on the LLM Response Simplex

May 07, 2026
Via arXiv