Samy Jelassi

Q-Probe: A Lightweight Approach to Reward Maximization for Language Models

Feb 22, 2024
Kenneth Li, Samy Jelassi, Hugh Zhang, Sham Kakade, Martin Wattenberg, David Brandfonbrener

Repeat After Me: Transformers are Better than State Space Models at Copying

Feb 01, 2024
Samy Jelassi, David Brandfonbrener, Sham M. Kakade, Eran Malach

Length Generalization in Arithmetic Transformers

Jun 27, 2023
Samy Jelassi, Stéphane d'Ascoli, Carles Domingo-Enrich, Yuhuai Wu, Yuanzhi Li, François Charton

Depth Dependence of μP Learning Rates in ReLU MLPs

May 13, 2023
Samy Jelassi, Boris Hanin, Ziwei Ji, Sashank J. Reddi, Srinadh Bhojanapalli, Sanjiv Kumar

Vision Transformers provably learn spatial structure

Oct 13, 2022
Samy Jelassi, Michael E. Sander, Yuanzhi Li

Dissecting adaptive methods in GANs

Oct 09, 2022
Samy Jelassi, David Dobre, Arthur Mensch, Yuanzhi Li, Gauthier Gidel

Towards understanding how momentum improves generalization in deep learning

Jul 13, 2022
Samy Jelassi, Yuanzhi Li

Depth separation beyond radial functions

Feb 03, 2021
Luca Venturi, Samy Jelassi, Tristan Ozuch, Joan Bruna

Adaptivity without Compromise: A Momentumized, Adaptive, Dual Averaged Gradient Method for Stochastic Optimization

Jan 26, 2021
Aaron Defazio, Samy Jelassi
