Aligning Large Language Models with Human Preferences through Representation Engineering

Dec 26, 2023
Wenhao Liu, Xiaohua Wang, Muling Wu, Tianlong Li, Changze Lv, Zixuan Ling, Jianhao Zhu, Cenyuan Zhang, Xiaoqing Zheng, Xuanjing Huang
