Jiwon Jeon

Be My Tutor: On-Policy Co-Distillation for Mutual LLM Improvement via Peer Feedback

Jun 12, 2026
Via arXiv

Rebellious Student: Reversing Teacher Signals for Reasoning Exploration with Self-Distilled RLVR

May 11, 2026
Via arXiv

Why Does Self-Distillation (Sometimes) Degrade the Reasoning Capability of LLMs?

Mar 25, 2026
Via arXiv

STAIRS-Former: Spatio-Temporal Attention with Interleaved Recursive Structure Transformer for Offline Multi-task Multi-agent Reinforcement Learning

Mar 12, 2026
Via arXiv