Ayse Arslan

Scaling laws for activation steering with Llama 2 models and refusal mechanisms

Jul 15, 2025
Via arXiv