Principled Reinforcement Learning with Human Feedback from Pairwise or $K$-wise Comparisons

Jan 26, 2023
Banghua Zhu, Jiantao Jiao, Michael I. Jordan


View paper on arXiv
