Miao Chunyan

When In-Distribution Gains Fail: Evaluating Weak-to-Strong Reward Models under Preference Shift

May 26, 2026
Via arXiv