Chongyang He

Learning What to Learn: Stage-Specific Data Sets for SFT-then-RL in Small Language Model Reasoning

Jun 03, 2026
Via arXiv