Diogo de Lucena

Endogenous Resistance to Activation Steering in Language Models

Feb 06, 2026
Via arXiv