Daniel Murfet

Patterning: The Dual of Interpretability

Jan 20, 2026

Towards Spectroscopy: Susceptibility Clusters in Language Models

Jan 19, 2026

Stagewise Reinforcement Learning and the Geometry of the Regret Landscape

Jan 12, 2026

Embryology of a Language Model

Aug 01, 2025

Studying Small Language Models with Susceptibilities

Apr 25, 2025

Modes of Sequence Models and Learning Coefficients

Apr 25, 2025

Programs as Singularities

Apr 10, 2025

You Are What You Eat -- AI Alignment Requires Understanding How Data Shapes Structure and Generalisation

Feb 08, 2025

Dynamics of Transient Structure in In-Context Linear Regression Transformers

Jan 31, 2025

Open Problems in Mechanistic Interpretability

Jan 27, 2025