The Trickle-down Impact of Reward (In-)consistency on RLHF

Sep 28, 2023
Lingfeng Shen, Sihao Chen, Linfeng Song, Lifeng Jin, Baolin Peng, Haitao Mi, Daniel Khashabi, Dong Yu

View paper on arXiv
