Gengsheng Li

Unifying Group-Relative and Self-Distillation Policy Optimization via Sample Routing

Apr 02, 2026
Via arXiv

R-Diverse: Mitigating Diversity Illusion in Self-Play LLM Training

Feb 16, 2026
Via arXiv