
Tri Dao

BitDelta: Your Fine-Tune May Only Be Worth One Bit

Feb 15, 2024
James Liu, Guangxuan Xiao, Kai Li, Jason D. Lee, Song Han, Tri Dao, Tianle Cai

Medusa: Simple LLM Inference Acceleration Framework with Multiple Decoding Heads

Jan 19, 2024
Tianle Cai, Yuhong Li, Zhengyang Geng, Hongwu Peng, Jason D. Lee, Deming Chen, Tri Dao

Mamba: Linear-Time Sequence Modeling with Selective State Spaces

Dec 01, 2023
Albert Gu, Tri Dao


Deja Vu: Contextual Sparsity for Efficient LLMs at Inference Time

Oct 26, 2023
Zichang Liu, Jue Wang, Tri Dao, Tianyi Zhou, Binhang Yuan, Zhao Song, Anshumali Shrivastava, Ce Zhang, Yuandong Tian, Christopher Ré, Beidi Chen


FlashAttention-2: Faster Attention with Better Parallelism and Work Partitioning

Jul 17, 2023
Tri Dao


StarCoder: may the source be with you!

May 09, 2023
Raymond Li, Loubna Ben Allal, Yangtian Zi, Niklas Muennighoff, Denis Kocetkov, Chenghao Mou, Marc Marone, Christopher Akiki, Jia Li, Jenny Chim, Qian Liu, Evgenii Zheltonozhskii, Terry Yue Zhuo, Thomas Wang, Olivier Dehaene, Mishig Davaadorj, Joel Lamy-Poirier, João Monteiro, Oleh Shliazhko, Nicolas Gontier, Nicholas Meade, Armel Zebaze, Ming-Ho Yee, Logesh Kumar Umapathi, Jian Zhu, Benjamin Lipkin, Muhtasham Oblokulov, Zhiruo Wang, Rudra Murthy, Jason Stillerman, Siva Sankalp Patel, Dmitry Abulkhanov, Marco Zocca, Manan Dey, Zhihan Zhang, Nour Fahmy, Urvashi Bhattacharyya, Wenhao Yu, Swayam Singh, Sasha Luccioni, Paulo Villegas, Maxim Kunakov, Fedor Zhdanov, Manuel Romero, Tony Lee, Nadav Timor, Jennifer Ding, Claire Schlesinger, Hailey Schoelkopf, Jan Ebert, Tri Dao, Mayank Mishra, Alex Gu, Jennifer Robinson, Carolyn Jane Anderson, Brendan Dolan-Gavitt, Danish Contractor, Siva Reddy, Daniel Fried, Dzmitry Bahdanau, Yacine Jernite, Carlos Muñoz Ferrandis, Sean Hughes, Thomas Wolf, Arjun Guha, Leandro von Werra, Harm de Vries


Effectively Modeling Time Series with Simple Discrete State Spaces

Mar 16, 2023
Michael Zhang, Khaled K. Saab, Michael Poli, Tri Dao, Karan Goel, Christopher Ré


Hyena Hierarchy: Towards Larger Convolutional Language Models

Mar 06, 2023
Michael Poli, Stefano Massaroli, Eric Nguyen, Daniel Y. Fu, Tri Dao, Stephen Baccus, Yoshua Bengio, Stefano Ermon, Christopher Ré


Simple Hardware-Efficient Long Convolutions for Sequence Modeling

Feb 13, 2023
Daniel Y. Fu, Elliot L. Epstein, Eric Nguyen, Armin W. Thomas, Michael Zhang, Tri Dao, Atri Rudra, Christopher Ré
