Helena Casademunt

Censored LLMs as a Natural Testbed for Secret Knowledge Elicitation

Mar 05, 2026
Via arXiv

Steering Out-of-Distribution Generalization with Concept Ablation Fine-Tuning

Jul 22, 2025
Via arXiv