COPR: Continual Human Preference Learning via Optimal Policy Regularization

Feb 22, 2024
Han Zhang, Lin Gui, Yu Lei, Yuanzhao Zhai, Yehong Zhang, Yulan He, Hui Wang, Yue Yu, Kam-Fai Wong, Bin Liang, Ruifeng Xu
