SteerLM: Attribute Conditioned SFT as an (User-Steerable) Alternative to RLHF

Oct 09, 2023
Yi Dong, Zhilin Wang, Makesh Narsimhan Sreedhar, Xianchao Wu, Oleksii Kuchaiev
