Secrets of RLHF in Large Language Models Part I: PPO

Jul 11, 2023
Rui Zheng, Shihan Dou, Songyang Gao, Wei Shen, Binghai Wang, Yan Liu, Senjie Jin, Qin Liu, Limao Xiong, Lu Chen, Zhiheng Xi, Yuhao Zhou, Nuo Xu, Wenbin Lai, Minghao Zhu, Rongxiang Weng, Wensen Cheng, Cheng Chang, Zhangyue Yin, Yuan Hua, Haoran Huang, Tianxiang Sun, Hang Yan, Tao Gui, Qi Zhang, Xipeng Qiu, Xuanjing Huang
