Javier Ferrando

Language Models Can Explain Visual Features via Steering

Mar 25, 2026
Via arXiv

Weight space Detection of Backdoors in LoRA Adapters

Feb 16, 2026
Via arXiv

Putting a Face to Forgetting: Continual Learning meets Mechanistic Interpretability

Jan 29, 2026
Via arXiv

Do I Know This Entity? Knowledge Awareness and Hallucinations in Language Models

Nov 21, 2024
Via arXiv

On the Similarity of Circuits across Languages: a Case Study on the Subject-verb Agreement Task

Oct 09, 2024
Via arXiv

A Primer on the Inner Workings of Transformer-based Language Models

May 02, 2024
Via arXiv

LM Transparency Tool: Interactive Tool for Analyzing Transformer Language Models

Apr 10, 2024
Via arXiv

Information Flow Routes: Automatically Interpreting Language Models at Scale

Feb 27, 2024
Via arXiv

Neurons in Large Language Models: Dead, N-gram, Positional

Sep 09, 2023
Via arXiv

Automating Behavioral Testing in Machine Translation

Sep 07, 2023
Via arXiv