
Beidi Chen


LLM Inference Unveiled: Survey and Roofline Model Insights

Mar 11, 2024
Zhihang Yuan, Yuzhang Shang, Yang Zhou, Zhen Dong, Zhe Zhou, Chenhao Xue, Bingzhe Wu, Zhikai Li, Qingyi Gu, Yong Jae Lee, Yan Yan, Beidi Chen, Guangyu Sun, Kurt Keutzer


GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection

Mar 06, 2024
Jiawei Zhao, Zhenyu Zhang, Beidi Chen, Zhangyang Wang, Anima Anandkumar, Yuandong Tian


Found in the Middle: How Language Models Use Long Contexts Better via Plug-and-Play Positional Encoding

Mar 05, 2024
Zhenyu Zhang, Runjin Chen, Shiwei Liu, Zhewei Yao, Olatunji Ruwase, Beidi Chen, Xiaoxia Wu, Zhangyang Wang


Sequoia: Scalable, Robust, and Hardware-aware Speculative Decoding

Feb 29, 2024
Zhuoming Chen, Avner May, Ruslan Svirschevski, Yuhsun Huang, Max Ryabinin, Zhihao Jia, Beidi Chen


Get More with LESS: Synthesizing Recurrence with KV Cache Compression for Efficient LLM Inference

Feb 14, 2024
Harry Dong, Xinyu Yang, Zhenyu Zhang, Zhangyang Wang, Yuejie Chi, Beidi Chen


Learn To be Efficient: Build Structured Sparsity in Large Language Models

Feb 13, 2024
Haizhong Zheng, Xiaoyan Bai, Beidi Chen, Fan Lai, Atul Prakash


KIVI: A Tuning-Free Asymmetric 2bit Quantization for KV Cache

Feb 05, 2024
Zirui Liu, Jiayi Yuan, Hongye Jin, Shaochen Zhong, Zhaozhuo Xu, Vladimir Braverman, Beidi Chen, Xia Hu
