Chandler Zhou

Aligning Language Models with Offline Reinforcement Learning from Human Feedback

Aug 23, 2023
Jian Hu, Li Tao, June Yang, Chandler Zhou

Figures 1-4 for Aligning Language Models with Offline Reinforcement Learning from Human Feedback
Via arXiv