David Bau

Discovering Forbidden Topics in Language Models

May 26, 2025
Via arXiv

When Are Concepts Erased From Diffusion Models?

May 22, 2025
Via arXiv

Language Models use Lookbacks to Track Beliefs

May 20, 2025
Via arXiv

Leveraging AI for Productive and Trustworthy HPC Software: Challenges and Research Directions

May 13, 2025
Via arXiv

MIB: A Mechanistic Interpretability Benchmark

Apr 17, 2025
Via arXiv

Distilling Diversity and Control in Diffusion Models

Mar 13, 2025
Via arXiv

Elucidating Mechanisms of Demographic Bias in LLMs for Healthcare

Feb 18, 2025
Via arXiv

Position-aware Automatic Circuit Discovery

Feb 07, 2025
Via arXiv

SliderSpace: Decomposing the Visual Capabilities of Diffusion Models

Feb 03, 2025
Via arXiv

Open Problems in Mechanistic Interpretability

Jan 27, 2025
Via arXiv