Samy Jelassi

DMA, CIMS

Universal Length Generalization with Turing Programs

Jul 03, 2024
Via arXiv

How Does Overparameterization Affect Features?

Jul 01, 2024
Via arXiv

Q-Probe: A Lightweight Approach to Reward Maximization for Language Models

Feb 22, 2024
Via arXiv

Repeat After Me: Transformers are Better than State Space Models at Copying

Feb 01, 2024
Via arXiv

Length Generalization in Arithmetic Transformers

Jun 27, 2023
Via arXiv

Depth Dependence of $μ$P Learning Rates in ReLU MLPs

May 13, 2023
Via arXiv

Vision Transformers provably learn spatial structure

Oct 13, 2022
Via arXiv

Dissecting adaptive methods in GANs

Oct 09, 2022
Via arXiv

Towards understanding how momentum improves generalization in deep learning

Jul 13, 2022
Via arXiv

Depth separation beyond radial functions

Feb 03, 2021
Via arXiv