Clément Dumas

Activation Oracles: Training and Evaluating LLMs as General-Purpose Activation Explainers

Dec 17, 2025
Via arXiv

nnterp: A Standardized Interface for Mechanistic Interpretability of Transformers

Nov 18, 2025
Via arXiv

Separating Tongue from Thought: Activation Patching Reveals Language-Agnostic Concept Representations in Transformers

Nov 13, 2024
Via arXiv