Guowei Rong

Mitigating Reward Hacking in RLHF via Bayesian Non-negative Reward Modeling

Feb 11, 2026
Via arXiv

Merging Smarter, Generalizing Better: Enhancing Model Merging on OOD Data

Jun 10, 2025
Via arXiv