Yassir Akram

Scale leads to compositional generalization

Jul 09, 2025
Via arXiv

Weight decay induces low-rank attention layers

Oct 31, 2024
Via arXiv

Learning Randomized Algorithms with Transformers

Aug 20, 2024
Via arXiv

When can transformers compositionally generalize in-context?

Jul 17, 2024
Via arXiv

Attention as a Hypernetwork

Jun 09, 2024
Via arXiv

Discovering modular solutions that generalize compositionally

Dec 22, 2023
Via arXiv

Gated recurrent neural networks discover attention

Sep 04, 2023
Via arXiv

Random initialisations performing above chance and how to find them

Sep 15, 2022
Via arXiv