Yoav Gur-Arieh

Precise In-Parameter Concept Erasure in Large Language Models

May 28, 2025
Via arXiv

Enhancing Automated Interpretability with Output-Centric Feature Descriptions

Jan 14, 2025
Via arXiv