Aaron Mueller

BabyLM Turns 4 and Goes Multilingual: Call for Papers for the 2026 BabyLM Workshop

Feb 24, 2026
Via arXiv

Causality is Key for Interpretability Claims to Generalise

Feb 18, 2026
Via arXiv

Mechanisms of AI Protein Folding in ESMFold

Feb 05, 2026
Via arXiv

Measuring Mechanistic Independence: Can Bias Be Removed Without Erasing Demographics?

Dec 23, 2025
Via arXiv

From Isolation to Entanglement: When Do Interpretability Methods Identify and Disentangle Known Concepts?

Dec 17, 2025
Via arXiv

BabyVLM-V2: Toward Developmentally Grounded Pretraining and Benchmarking of Vision Foundation Models

Dec 11, 2025
Via arXiv

In-Context Learning Without Copying

Nov 07, 2025
Via arXiv

Crosscoding Through Time: Tracking Emergence & Consolidation Of Linguistic Representations Throughout LLM Pretraining

Sep 05, 2025
Via arXiv

How to Improve the Robustness of Closed-Source Models on NLI

May 26, 2025
Via arXiv

SAEs Are Good for Steering -- If You Select the Right Features

May 26, 2025
Via arXiv