Dana Arad

Same Task, Different Circuits: Disentangling Modality-Specific Mechanisms in VLMs

Jun 11, 2025
Via arXiv

SAEs Are Good for Steering -- If You Select the Right Features

May 26, 2025
Via arXiv

MIB: A Mechanistic Interpretability Benchmark

Apr 17, 2025
Via arXiv

Diffusion Lens: Interpreting Text Encoders in Text-to-Image Pipelines

Mar 09, 2024
Via arXiv

ReFACT: Updating Text-to-Image Models by Editing the Text Encoder

Jun 01, 2023
Via arXiv