Confronting Reward Model Overoptimization with Constrained RLHF

Oct 06, 2023
Ted Moskovitz, Aaditya K. Singh, DJ Strouse, Tuomas Sandholm, Ruslan Salakhutdinov, Anca D. Dragan, Stephen McAleer
