Sonia Joseph

From Noise to Narrative: Tracing the Origins of Hallucinations in Transformers

Sep 08, 2025
Via arXiv

How Visual Representations Map to Language Feature Space in Multimodal LLMs

Jun 13, 2025
Via arXiv

Prisma: An Open Source Toolkit for Mechanistic Interpretability in Vision and Video

Apr 28, 2025
Via arXiv

Decoding Vision Transformers: the Diffusion Steering Lens

Apr 18, 2025
Via arXiv

Steering CLIP's vision transformer with sparse autoencoders

Apr 11, 2025
Via arXiv

Interpretability in Action: Exploratory Analysis of VPT, a Minecraft Agent

Jul 16, 2024
Via arXiv