Alert button
Picture for Omid Saremi

Omid Saremi

Alert button

LiDAR: Sensing Linear Probing Performance in Joint Embedding SSL Architectures

Add code
Bookmark button
Alert button
Dec 07, 2023
Vimal Thilak, Chen Huang, Omid Saremi, Laurent Dinh, Hanlin Goh, Preetum Nakkiran, Joshua M. Susskind, Etai Littwin

Viaarxiv icon

Vanishing Gradients in Reinforcement Finetuning of Language Models

Add code
Bookmark button
Alert button
Oct 31, 2023
Noam Razin, Hattie Zhou, Omid Saremi, Vimal Thilak, Arwen Bradley, Preetum Nakkiran, Joshua Susskind, Etai Littwin

Viaarxiv icon

What Algorithms can Transformers Learn? A Study in Length Generalization

Add code
Bookmark button
Alert button
Oct 24, 2023
Hattie Zhou, Arwen Bradley, Etai Littwin, Noam Razin, Omid Saremi, Josh Susskind, Samy Bengio, Preetum Nakkiran

Viaarxiv icon

When can transformers reason with abstract symbols?

Add code
Bookmark button
Alert button
Oct 15, 2023
Enric Boix-Adsera, Omid Saremi, Emmanuel Abbe, Samy Bengio, Etai Littwin, Joshua Susskind

Figure 1 for When can transformers reason with abstract symbols?
Figure 2 for When can transformers reason with abstract symbols?
Figure 3 for When can transformers reason with abstract symbols?
Figure 4 for When can transformers reason with abstract symbols?
Viaarxiv icon

Adaptivity and Modularity for Efficient Generalization Over Task Complexity

Add code
Bookmark button
Alert button
Oct 13, 2023
Samira Abnar, Omid Saremi, Laurent Dinh, Shantel Wilson, Miguel Angel Bautista, Chen Huang, Vimal Thilak, Etai Littwin, Jiatao Gu, Josh Susskind, Samy Bengio

Figure 1 for Adaptivity and Modularity for Efficient Generalization Over Task Complexity
Figure 2 for Adaptivity and Modularity for Efficient Generalization Over Task Complexity
Figure 3 for Adaptivity and Modularity for Efficient Generalization Over Task Complexity
Figure 4 for Adaptivity and Modularity for Efficient Generalization Over Task Complexity
Viaarxiv icon

The Slingshot Mechanism: An Empirical Study of Adaptive Optimizers and the Grokking Phenomenon

Add code
Bookmark button
Alert button
Jun 13, 2022
Vimal Thilak, Etai Littwin, Shuangfei Zhai, Omid Saremi, Roni Paiss, Joshua Susskind

Figure 1 for The Slingshot Mechanism: An Empirical Study of Adaptive Optimizers and the Grokking Phenomenon
Figure 2 for The Slingshot Mechanism: An Empirical Study of Adaptive Optimizers and the Grokking Phenomenon
Figure 3 for The Slingshot Mechanism: An Empirical Study of Adaptive Optimizers and the Grokking Phenomenon
Figure 4 for The Slingshot Mechanism: An Empirical Study of Adaptive Optimizers and the Grokking Phenomenon
Viaarxiv icon

Implicit Greedy Rank Learning in Autoencoders via Overparameterized Linear Networks

Add code
Bookmark button
Alert button
Jul 02, 2021
Shih-Yu Sun, Vimal Thilak, Etai Littwin, Omid Saremi, Joshua M. Susskind

Figure 1 for Implicit Greedy Rank Learning in Autoencoders via Overparameterized Linear Networks
Figure 2 for Implicit Greedy Rank Learning in Autoencoders via Overparameterized Linear Networks
Figure 3 for Implicit Greedy Rank Learning in Autoencoders via Overparameterized Linear Networks
Figure 4 for Implicit Greedy Rank Learning in Autoencoders via Overparameterized Linear Networks
Viaarxiv icon

Implicit Acceleration and Feature Learning in Infinitely Wide Neural Networks with Bottlenecks

Add code
Bookmark button
Alert button
Jul 02, 2021
Etai Littwin, Omid Saremi, Shuangfei Zhai, Vimal Thilak, Hanlin Goh, Joshua M. Susskind, Greg Yang

Figure 1 for Implicit Acceleration and Feature Learning in Infinitely Wide Neural Networks with Bottlenecks
Figure 2 for Implicit Acceleration and Feature Learning in Infinitely Wide Neural Networks with Bottlenecks
Figure 3 for Implicit Acceleration and Feature Learning in Infinitely Wide Neural Networks with Bottlenecks
Figure 4 for Implicit Acceleration and Feature Learning in Infinitely Wide Neural Networks with Bottlenecks
Viaarxiv icon