Andrew M. Saxe

What needs to go right for an induction head? A mechanistic study of in-context learning circuits and their formation
Apr 10, 2024
Aaditya K. Singh, Ted Moskovitz, Felix Hill, Stephanie C. Y. Chan, Andrew M. Saxe

When Representations Align: Universality in Representation Learning Dynamics
Feb 14, 2024
Loek van Rossem, Andrew M. Saxe

The Transient Nature of Emergent In-Context Learning in Transformers
Nov 15, 2023
Aaditya K. Singh, Stephanie C. Y. Chan, Ted Moskovitz, Erin Grant, Andrew M. Saxe, Felix Hill

Meta-Learning Strategies through Value Maximization in Neural Networks
Oct 30, 2023
Rodrigo Carrasco-Davis, Javier Masís, Andrew M. Saxe

Regularised neural networks mimic human insight
Feb 22, 2023
Anika T. Löwe, Léo Touzo, Paul S. Muhle-Karbe, Andrew M. Saxe, Christopher Summerfield, Nicolas W. Schuck

The Neural Race Reduction: Dynamics of Abstraction in Gated Networks
Jul 21, 2022
Andrew M. Saxe, Shagun Sodhani, Sam Lewallen

Dynamics of stochastic gradient descent for two-layer neural networks in the teacher-student setup
Jun 18, 2019
Sebastian Goldt, Madhu S. Advani, Andrew M. Saxe, Florent Krzakala, Lenka Zdeborová

Generalisation dynamics of online learning in over-parameterised neural networks
Jan 25, 2019
Sebastian Goldt, Madhu S. Advani, Andrew M. Saxe, Florent Krzakala, Lenka Zdeborová

A mathematical theory of semantic development in deep neural networks
Oct 23, 2018
Andrew M. Saxe, James L. McClelland, Surya Ganguli

Energy-entropy competition and the effectiveness of stochastic gradient descent in machine learning
Mar 05, 2018
Yao Zhang, Andrew M. Saxe, Madhu S. Advani, Alpha A. Lee