Atticus Geiger

Stories in Space: In-Context Learning Trajectories in Conceptual Belief Space

May 12, 2026

Manifold Steering Reveals the Shared Geometry of Neural Network Representation and Behavior

May 06, 2026

Bucketing the Good Apples: A Method for Diagnosing and Improving Causal Abstraction

May 04, 2026

Reasoning Theater: Disentangling Model Beliefs from Chain-of-Thought

Mar 05, 2026

Surgical Activation Steering via Generative Causal Mediation

Feb 17, 2026

From Directions to Regions: Decomposing Activations in Language Models via Local Geometry

Feb 02, 2026

The Shape of Beliefs: Geometry, Dynamics, and Interventions along Representation Manifolds of Language Models' Posteriors

Feb 02, 2026

Are language models aware of the road not taken? Token-level uncertainty and hidden state dynamics

Nov 06, 2025

Decomposing MLP Activations into Interpretable Features via Semi-Nonnegative Matrix Factorization

Jun 12, 2025

How Do Transformers Learn Variable Binding in Symbolic Programs?

May 27, 2025