Nicolas Zucchet

Uncovering mesa-optimization algorithms in Transformers

Sep 11, 2023
Johannes von Oswald, Eyvind Niklasson, Maximilian Schlegel, Seijin Kobayashi, Nicolas Zucchet, Nino Scherrer, Nolan Miller, Mark Sandler, Blaise Agüera y Arcas, Max Vladymyrov, Razvan Pascanu, João Sacramento

Via arXiv

Gated recurrent neural networks discover attention

Sep 04, 2023
Nicolas Zucchet, Seijin Kobayashi, Yassir Akram, Johannes von Oswald, Maxime Larcher, Angelika Steger, João Sacramento

Via arXiv

Online learning of long-range dependencies

May 25, 2023
Nicolas Zucchet, Robert Meier, Simon Schug, Asier Mujika, João Sacramento

Via arXiv

Random initialisations performing above chance and how to find them

Sep 15, 2022
Frederik Benzing, Simon Schug, Robert Meier, Johannes von Oswald, Yassir Akram, Nicolas Zucchet, Laurence Aitchison, Angelika Steger

Via arXiv

The least-control principle for learning at equilibrium

Jul 04, 2022
Alexander Meulemans, Nicolas Zucchet, Seijin Kobayashi, Johannes von Oswald, João Sacramento

Via arXiv

Beyond backpropagation: implicit gradients for bilevel optimization

May 06, 2022
Nicolas Zucchet, João Sacramento

Via arXiv

Learning where to learn: Gradient sparsity in meta and continual learning

Oct 27, 2021
Johannes von Oswald, Dominic Zhao, Seijin Kobayashi, Simon Schug, Massimo Caccia, Nicolas Zucchet, João Sacramento

Via arXiv

A contrastive rule for meta-learning

Apr 19, 2021
Nicolas Zucchet, Simon Schug, Johannes von Oswald, Dominic Zhao, João Sacramento

Via arXiv