Ning Lv

On the Implicit Reward Overfitting and the Low-rank Dynamics in RLVR

May 07, 2026
Via arXiv