Keenan Pepper

Endogenous Resistance to Activation Steering in Language Models

Feb 06, 2026
Via arXiv

Self-Ablating Transformers: More Interpretability, Less Sparsity

May 01, 2025
Via arXiv