Emmanuel Abbe

When can transformers reason with abstract symbols?

Oct 15, 2023
Enric Boix-Adsera, Omid Saremi, Emmanuel Abbe, Samy Bengio, Etai Littwin, Joshua Susskind

Provable Advantage of Curriculum Learning on Parity Targets with Mixed Inputs

Jun 29, 2023
Emmanuel Abbe, Elisabetta Cornacchia, Aryo Lotfi

Transformers learn through gradual rank increase

Jun 12, 2023
Enric Boix-Adsera, Etai Littwin, Emmanuel Abbe, Samy Bengio, Joshua Susskind

SGD learning on neural networks: leap complexity and saddle-to-saddle dynamics

Feb 21, 2023
Emmanuel Abbe, Enric Boix-Adsera, Theodor Misiakiewicz

Generalization on the Unseen, Logic Reasoning and Degree Curriculum

Jan 30, 2023
Emmanuel Abbe, Samy Bengio, Aryo Lotfi, Kevin Rizk

On the non-universality of deep learning: quantifying the cost of symmetry

Aug 05, 2022
Emmanuel Abbe, Enric Boix-Adsera

Learning to Reason with Neural Networks: Generalization, Unseen Data and Boolean Measures

May 26, 2022
Emmanuel Abbe, Samy Bengio, Elisabetta Cornacchia, Jon Kleinberg, Aryo Lotfi, Maithra Raghu, Chiyuan Zhang

An initial alignment between neural network and target is needed for gradient descent to learn

Feb 25, 2022
Emmanuel Abbe, Elisabetta Cornacchia, Jan Hązła, Christopher Marquis

The merged-staircase property: a necessary and nearly sufficient condition for SGD learning of sparse functions on two-layer neural networks

Feb 17, 2022
Emmanuel Abbe, Enric Boix-Adsera, Theodor Misiakiewicz

Binary perceptron: efficient algorithms can find solutions in a rare well-connected cluster

Nov 04, 2021
Emmanuel Abbe, Shuangping Li, Allan Sly