On the Exploitability of Reinforcement Learning with Human Feedback for Large Language Models

Nov 16, 2023
Jiongxiao Wang, Junlin Wu, Muhao Chen, Yevgeniy Vorobeychik, Chaowei Xiao
