Arnab Sen Sharma

Activation Oracles: Training and Evaluating LLMs as General-Purpose Activation Explainers

Dec 17, 2025
Via arXiv

LLMs Process Lists With General Filter Heads

Oct 30, 2025
Via arXiv

Language Models use Lookbacks to Track Beliefs

May 20, 2025
Via arXiv

Elucidating Mechanisms of Demographic Bias in LLMs for Healthcare

Feb 18, 2025
Via arXiv

The Quest for the Right Mediator: A History, Survey, and Theoretical Grounding of Causal Interpretability

Aug 02, 2024
Via arXiv

NNsight and NDIF: Democratizing Access to Foundation Model Internals

Jul 18, 2024
Via arXiv

Locating and Editing Factual Associations in Mamba

Apr 04, 2024
Via arXiv

Function Vectors in Large Language Models

Oct 23, 2023
Via arXiv

Linearity of Relation Decoding in Transformer Language Models

Aug 17, 2023
Via arXiv

Mass-Editing Memory in a Transformer

Oct 13, 2022
Via arXiv