Albert Gu

Machine Learning Department, Carnegie Mellon University

Dynamic Chunking for End-to-End Hierarchical Sequence Modeling

Jul 10, 2025
Via arXiv

Understanding and Improving Length Generalization in Recurrent Models

Jul 03, 2025
Via arXiv

Understanding the Skill Gap in Recurrent Language Models: The Role of the Gather-and-Aggregate Mechanism

Apr 22, 2025
Via arXiv

Lyra: An Efficient and Expressive Subquadratic Architecture for Modeling Biological Sequences

Mar 20, 2025
Via arXiv

Thinking Slow, Fast: Scaling Inference Compute with Distilled Reasoners

Feb 27, 2025
Via arXiv

Llamba: Scaling Distilled Recurrent Models for Efficient Language Processing

Feb 23, 2025
Via arXiv

On the Benefits of Memory for Modeling Time-Dependent PDEs

Sep 03, 2024
Via arXiv

Transformers to SSMs: Distilling Quadratic Knowledge to Subquadratic Models

Aug 19, 2024
Via arXiv

Hydra: Bidirectional State Space Models Through Generalized Matrix Mixers

Jul 13, 2024
Via arXiv

An Empirical Study of Mamba-based Language Models

Jun 12, 2024
Via arXiv