Lorenzo Noci
Why do Learning Rates Transfer? Reconciling Optimization and Scaling Limits for Deep Learning

Feb 27, 2024
Lorenzo Noci, Alexandru Meterez, Thomas Hofmann, Antonio Orvieto


How Good is a Single Basin?

Feb 05, 2024
Kai Lion, Lorenzo Noci, Thomas Hofmann, Gregor Bachmann


Disentangling Linear Mode-Connectivity

Dec 15, 2023
Gul Sena Altintas, Gregor Bachmann, Lorenzo Noci, Thomas Hofmann


Depthwise Hyperparameter Transfer in Residual Networks: Dynamics and Scaling Limit

Sep 28, 2023
Blake Bordelon, Lorenzo Noci, Mufan Bill Li, Boris Hanin, Cengiz Pehlevan


The Shaped Transformer: Attention Models in the Infinite Depth-and-Width Limit

Jun 30, 2023
Lorenzo Noci, Chuning Li, Mufan Bill Li, Bobby He, Thomas Hofmann, Chris Maddison, Daniel M. Roy


Dynamic Context Pruning for Efficient and Interpretable Autoregressive Transformers

May 25, 2023
Sotiris Anagnostidis, Dario Pavllo, Luca Biggio, Lorenzo Noci, Aurelien Lucchi, Thomas Hofmann


Achieving a Better Stability-Plasticity Trade-off via Auxiliary Networks in Continual Learning

Mar 31, 2023
Sanghwan Kim, Lorenzo Noci, Antonio Orvieto, Thomas Hofmann


The Curious Case of Benign Memorization

Oct 25, 2022
Sotiris Anagnostidis, Gregor Bachmann, Lorenzo Noci, Thomas Hofmann


Signal Propagation in Transformers: Theoretical Perspectives and the Role of Rank Collapse

Jun 07, 2022
Lorenzo Noci, Sotiris Anagnostidis, Luca Biggio, Antonio Orvieto, Sidak Pal Singh, Aurelien Lucchi
