Xiaoyan Han

StaRPO: Stability-Augmented Reinforcement Policy Optimization

Apr 10, 2026
Via arXiv