Alert button
Picture for Tri Dao

Tri Dao

Alert button

Caduceus: Bi-Directional Equivariant Long-Range DNA Sequence Modeling

Add code
Bookmark button
Alert button
Mar 05, 2024
Yair Schiff, Chia-Hsiang Kao, Aaron Gokaslan, Tri Dao, Albert Gu, Volodymyr Kuleshov

Figure 1 for Caduceus: Bi-Directional Equivariant Long-Range DNA Sequence Modeling
Figure 2 for Caduceus: Bi-Directional Equivariant Long-Range DNA Sequence Modeling
Figure 3 for Caduceus: Bi-Directional Equivariant Long-Range DNA Sequence Modeling
Figure 4 for Caduceus: Bi-Directional Equivariant Long-Range DNA Sequence Modeling
Viaarxiv icon

StarCoder 2 and The Stack v2: The Next Generation

Add code
Bookmark button
Alert button
Feb 29, 2024
Anton Lozhkov, Raymond Li, Loubna Ben Allal, Federico Cassano, Joel Lamy-Poirier, Nouamane Tazi, Ao Tang, Dmytro Pykhtar, Jiawei Liu, Yuxiang Wei, Tianyang Liu, Max Tian, Denis Kocetkov, Arthur Zucker, Younes Belkada, Zijian Wang, Qian Liu, Dmitry Abulkhanov, Indraneil Paul, Zhuang Li, Wen-Ding Li, Megan Risdal, Jia Li, Jian Zhu, Terry Yue Zhuo, Evgenii Zheltonozhskii, Nii Osae Osae Dade, Wenhao Yu, Lucas Krauß, Naman Jain, Yixuan Su, Xuanli He, Manan Dey, Edoardo Abati, Yekun Chai, Niklas Muennighoff, Xiangru Tang, Muhtasham Oblokulov, Christopher Akiki, Marc Marone, Chenghao Mou, Mayank Mishra, Alex Gu, Binyuan Hui, Tri Dao, Armel Zebaze, Olivier Dehaene, Nicolas Patry, Canwen Xu, Julian McAuley, Han Hu, Torsten Scholak, Sebastien Paquet, Jennifer Robinson, Carolyn Jane Anderson, Nicolas Chapados, Mostofa Patwary, Nima Tajbakhsh, Yacine Jernite, Carlos Muñoz Ferrandis, Lingming Zhang, Sean Hughes, Thomas Wolf, Arjun Guha, Leandro von Werra, Harm de Vries

Viaarxiv icon

BitDelta: Your Fine-Tune May Only Be Worth One Bit

Add code
Bookmark button
Alert button
Feb 28, 2024
James Liu, Guangxuan Xiao, Kai Li, Jason D. Lee, Song Han, Tri Dao, Tianle Cai

Viaarxiv icon

Medusa: Simple LLM Inference Acceleration Framework with Multiple Decoding Heads

Add code
Bookmark button
Alert button
Jan 19, 2024
Tianle Cai, Yuhong Li, Zhengyang Geng, Hongwu Peng, Jason D. Lee, Deming Chen, Tri Dao

Viaarxiv icon

Mamba: Linear-Time Sequence Modeling with Selective State Spaces

Add code
Bookmark button
Alert button
Dec 01, 2023
Albert Gu, Tri Dao

Figure 1 for Mamba: Linear-Time Sequence Modeling with Selective State Spaces
Figure 2 for Mamba: Linear-Time Sequence Modeling with Selective State Spaces
Figure 3 for Mamba: Linear-Time Sequence Modeling with Selective State Spaces
Figure 4 for Mamba: Linear-Time Sequence Modeling with Selective State Spaces
Viaarxiv icon

Deja Vu: Contextual Sparsity for Efficient LLMs at Inference Time

Add code
Bookmark button
Alert button
Oct 26, 2023
Zichang Liu, Jue Wang, Tri Dao, Tianyi Zhou, Binhang Yuan, Zhao Song, Anshumali Shrivastava, Ce Zhang, Yuandong Tian, Christopher Re, Beidi Chen

Figure 1 for Deja Vu: Contextual Sparsity for Efficient LLMs at Inference Time
Figure 2 for Deja Vu: Contextual Sparsity for Efficient LLMs at Inference Time
Figure 3 for Deja Vu: Contextual Sparsity for Efficient LLMs at Inference Time
Figure 4 for Deja Vu: Contextual Sparsity for Efficient LLMs at Inference Time
Viaarxiv icon

FlashAttention-2: Faster Attention with Better Parallelism and Work Partitioning

Add code
Bookmark button
Alert button
Jul 17, 2023
Tri Dao

Figure 1 for FlashAttention-2: Faster Attention with Better Parallelism and Work Partitioning
Figure 2 for FlashAttention-2: Faster Attention with Better Parallelism and Work Partitioning
Figure 3 for FlashAttention-2: Faster Attention with Better Parallelism and Work Partitioning
Figure 4 for FlashAttention-2: Faster Attention with Better Parallelism and Work Partitioning
Viaarxiv icon

StarCoder: may the source be with you!

Add code
Bookmark button
Alert button
May 09, 2023
Raymond Li, Loubna Ben Allal, Yangtian Zi, Niklas Muennighoff, Denis Kocetkov, Chenghao Mou, Marc Marone, Christopher Akiki, Jia Li, Jenny Chim, Qian Liu, Evgenii Zheltonozhskii, Terry Yue Zhuo, Thomas Wang, Olivier Dehaene, Mishig Davaadorj, Joel Lamy-Poirier, João Monteiro, Oleh Shliazhko, Nicolas Gontier, Nicholas Meade, Armel Zebaze, Ming-Ho Yee, Logesh Kumar Umapathi, Jian Zhu, Benjamin Lipkin, Muhtasham Oblokulov, Zhiruo Wang, Rudra Murthy, Jason Stillerman, Siva Sankalp Patel, Dmitry Abulkhanov, Marco Zocca, Manan Dey, Zhihan Zhang, Nour Fahmy, Urvashi Bhattacharyya, Wenhao Yu, Swayam Singh, Sasha Luccioni, Paulo Villegas, Maxim Kunakov, Fedor Zhdanov, Manuel Romero, Tony Lee, Nadav Timor, Jennifer Ding, Claire Schlesinger, Hailey Schoelkopf, Jan Ebert, Tri Dao, Mayank Mishra, Alex Gu, Jennifer Robinson, Carolyn Jane Anderson, Brendan Dolan-Gavitt, Danish Contractor, Siva Reddy, Daniel Fried, Dzmitry Bahdanau, Yacine Jernite, Carlos Muñoz Ferrandis, Sean Hughes, Thomas Wolf, Arjun Guha, Leandro von Werra, Harm de Vries

Figure 1 for StarCoder: may the source be with you!
Figure 2 for StarCoder: may the source be with you!
Figure 3 for StarCoder: may the source be with you!
Figure 4 for StarCoder: may the source be with you!
Viaarxiv icon

Effectively Modeling Time Series with Simple Discrete State Spaces

Add code
Bookmark button
Alert button
Mar 16, 2023
Michael Zhang, Khaled K. Saab, Michael Poli, Tri Dao, Karan Goel, Christopher Ré

Figure 1 for Effectively Modeling Time Series with Simple Discrete State Spaces
Figure 2 for Effectively Modeling Time Series with Simple Discrete State Spaces
Figure 3 for Effectively Modeling Time Series with Simple Discrete State Spaces
Figure 4 for Effectively Modeling Time Series with Simple Discrete State Spaces
Viaarxiv icon