Liuchen Liao

Preference-Based Self-Distillation: Beyond KL Matching via Reward Regularization

May 06, 2026
Via arXiv