CorrSteer: Steering Improves Task Performance and Safety in LLMs through Correlation-based Sparse Autoencoder Feature Selection

Add code
Aug 18, 2025

Share this with someone who'll enjoy it:

View paper onarxiv icon

Share this with someone who'll enjoy it: