Atri Rudra

Simple linear attention language models balance the recall-throughput tradeoff

Feb 28, 2024
Simran Arora, Sabri Eyuboglu, Michael Zhang, Aman Timalsina, Silas Alberti, Dylan Zinsley, James Zou, Atri Rudra, Christopher Ré

Zoology: Measuring and Improving Recall in Efficient Language Models

Dec 08, 2023
Simran Arora, Sabri Eyuboglu, Aman Timalsina, Isys Johnson, Michael Poli, James Zou, Atri Rudra, Christopher Ré

Laughing Hyena Distillery: Extracting Compact Recurrences From Convolutions

Oct 28, 2023
Stefano Massaroli, Michael Poli, Daniel Y. Fu, Hermann Kumbong, Rom N. Parnichkun, Aman Timalsina, David W. Romero, Quinn McIntyre, Beidi Chen, Atri Rudra, Ce Zhang, Christopher Ré, Stefano Ermon, Yoshua Bengio

Monarch Mixer: A Simple Sub-Quadratic GEMM-Based Architecture

Oct 18, 2023
Daniel Y. Fu, Simran Arora, Jessica Grogan, Isys Johnson, Sabri Eyuboglu, Armin W. Thomas, Benjamin Spector, Michael Poli, Atri Rudra, Christopher Ré

Simple Hardware-Efficient Long Convolutions for Sequence Modeling

Feb 13, 2023
Daniel Y. Fu, Elliot L. Epstein, Eric Nguyen, Armin W. Thomas, Michael Zhang, Tri Dao, Atri Rudra, Christopher Ré

Hungry Hungry Hippos: Towards Language Modeling with State Space Models

Dec 28, 2022
Tri Dao, Daniel Y. Fu, Khaled K. Saab, Armin W. Thomas, Atri Rudra, Christopher Ré

Arithmetic Circuits, Structured Matrices and (not so) Deep Learning

Jun 24, 2022
Atri Rudra

How to Train Your HiPPO: State Space Models with Generalized Orthogonal Basis Projections

Jun 24, 2022
Albert Gu, Isys Johnson, Aman Timalsina, Atri Rudra, Christopher Ré

FlashAttention: Fast and Memory-Efficient Exact Attention with IO-Awareness

May 27, 2022
Tri Dao, Daniel Y. Fu, Stefano Ermon, Atri Rudra, Christopher Ré

Monarch: Expressive Structured Matrices for Efficient and Accurate Training

Apr 01, 2022
Tri Dao, Beidi Chen, Nimit Sohoni, Arjun Desai, Michael Poli, Jessica Grogan, Alexander Liu, Aniruddh Rao, Atri Rudra, Christopher Ré
