sun zhe

From Correctness to Preference: A Framework for Personalized Agentic Reinforcement Learning

May 22, 2026
Via arXiv