Emmanuel Abbe

When can transformers reason with abstract symbols?

Oct 15, 2023

Provable Advantage of Curriculum Learning on Parity Targets with Mixed Inputs

Jun 29, 2023

Transformers learn through gradual rank increase

Jun 12, 2023

SGD learning on neural networks: leap complexity and saddle-to-saddle dynamics

Feb 21, 2023

Generalization on the Unseen, Logic Reasoning and Degree Curriculum

Jan 30, 2023

On the non-universality of deep learning: quantifying the cost of symmetry

Aug 05, 2022

Learning to Reason with Neural Networks: Generalization, Unseen Data and Boolean Measures

May 26, 2022

An initial alignment between neural network and target is needed for gradient descent to learn

Feb 25, 2022

The merged-staircase property: a necessary and nearly sufficient condition for SGD learning of sparse functions on two-layer neural networks

Feb 17, 2022

Binary perceptron: efficient algorithms can find solutions in a rare well-connected cluster

Nov 04, 2021