DESTEIN: Navigating Detoxification of Language Models via Universal Steering Pairs and Head-wise Activation Fusion

Apr 16, 2024
Yu Li, Zhihua Wei, Han Jiang, Chuanyang Gong
