
Razvan Pascanu

Google DeepMind

Retrieval-Augmented Decision Transformer: External Memory for In-context RL
Oct 09, 2024

Round and Round We Go! What makes Rotary Positional Encodings useful?
Oct 08, 2024

softmax is not enough (for sharp out-of-distribution)
Oct 01, 2024

When can transformers compositionally generalize in-context?
Jul 17, 2024

Investigating Low-Rank Training in Transformer Language Models: Efficiency and Scaling Analysis
Jul 13, 2024

Normalization and effective learning rates in reinforcement learning
Jul 01, 2024

Building on Efficient Foundations: Effectively Training LLMs with Structured Feedforward Layers
Jun 24, 2024

Transformers meet Neural Algorithmic Reasoners
Jun 13, 2024

State Soup: In-Context Skill Learning, Retrieval and Mixing
Jun 12, 2024

Attention as a Hypernetwork
Jun 09, 2024