Albert Gu

Machine Learning Department, Carnegie Mellon University

Mamba-3: Improved Sequence Modeling using State Space Principles

Mar 16, 2026
Via arXiv

dnaHNet: A Scalable and Hierarchical Foundation Model for Genomic Sequence Learning

Feb 14, 2026
Via arXiv

Retrieval-Aware Distillation for Transformer-SSM Hybrids

Feb 11, 2026
Via arXiv

Autoregressive Universal Video Segmentation Model

Aug 26, 2025
Via arXiv

Dynamic Chunking for End-to-End Hierarchical Sequence Modeling

Jul 10, 2025
Via arXiv

Understanding and Improving Length Generalization in Recurrent Models

Jul 03, 2025
Via arXiv

Understanding the Skill Gap in Recurrent Language Models: The Role of the Gather-and-Aggregate Mechanism

Apr 22, 2025
Via arXiv

Lyra: An Efficient and Expressive Subquadratic Architecture for Modeling Biological Sequences

Mar 20, 2025
Via arXiv

Thinking Slow, Fast: Scaling Inference Compute with Distilled Reasoners

Feb 27, 2025
Via arXiv

Llamba: Scaling Distilled Recurrent Models for Efficient Language Processing

Feb 23, 2025
Via arXiv