Daniel Murfet

Studying Small Language Models with Susceptibilities

Apr 25, 2025

Modes of Sequence Models and Learning Coefficients

Apr 25, 2025

Programs as Singularities

Apr 10, 2025

You Are What You Eat -- AI Alignment Requires Understanding How Data Shapes Structure and Generalisation

Feb 08, 2025

Dynamics of Transient Structure in In-Context Linear Regression Transformers

Jan 31, 2025

Open Problems in Mechanistic Interpretability

Jan 27, 2025

Differentiation and Specialization of Attention Heads via the Refined Local Learning Coefficient

Oct 03, 2024

The Developmental Landscape of In-Context Learning

Feb 04, 2024

Dynamical versus Bayesian Phase Transitions in a Toy Model of Superposition

Oct 10, 2023

Quantifying degeneracy in singular models via the learning coefficient

Aug 23, 2023