Alberto Bietti

Learning Compositional Functions with Transformers from Easy-to-Hard Data

May 29, 2025
Via arXiv

BAnG: Bidirectional Anchored Generation for Conditional RNA Design

Feb 28, 2025
Via arXiv

In-context denoising with one-layer transformers: connections between attention and associative memory retrieval

Feb 07, 2025
Via arXiv

Understanding Factual Recall in Transformers via Associative Memories

Dec 09, 2024
Via arXiv

How Truncating Weights Improves Reasoning in Language Models

Jun 05, 2024
Via arXiv

Contextual Counting: A Mechanistic Study of Transformers on a Quantitative Task

May 30, 2024
Via arXiv

Level Set Teleportation: An Optimization Perspective

Mar 05, 2024
Via arXiv

Heavy-Tailed Class Imbalance and Why Adam Outperforms Gradient Descent on Language Models

Feb 29, 2024
Via arXiv

Learning Associative Memories with Gradient Descent

Feb 28, 2024
Via arXiv

On Learning Gaussian Multi-index Models with Gradient Flow

Nov 02, 2023
Via arXiv