Zhaozhuo Xu

NoMAD-Attention: Efficient LLM Inference on CPUs Through Multiply-add-free Attention

Mar 02, 2024
Tianyi Zhang, Jonah Wonkyu Yi, Bowen Yao, Zhaozhuo Xu, Anshumali Shrivastava

LLM Multi-Agent Systems: Challenges and Open Problems

Feb 05, 2024
Shanshan Han, Qifan Zhang, Yuhang Yao, Weizhao Jin, Zhaozhuo Xu, Chaoyang He

KIVI: A Tuning-Free Asymmetric 2bit Quantization for KV Cache

Feb 05, 2024
Zirui Liu, Jiayi Yuan, Hongye Jin, Shaochen Zhong, Zhaozhuo Xu, Vladimir Braverman, Beidi Chen, Xia Hu

LETA: Learning Transferable Attribution for Generic Vision Explainer

Dec 23, 2023
Guanchu Wang, Yu-Neng Chuang, Fan Yang, Mengnan Du, Chia-Yuan Chang, Shaochen Zhong, Zirui Liu, Zhaozhuo Xu, Kaixiong Zhou, Xuanting Cai, Xia Hu

Zen: Near-Optimal Sparse Tensor Synchronization for Distributed DNN Training

Sep 23, 2023
Zhuang Wang, Zhaozhuo Xu, Anshumali Shrivastava, T. S. Eugene Ng

Scissorhands: Exploiting the Persistence of Importance Hypothesis for LLM KV Cache Compression at Test Time

May 26, 2023
Zichang Liu, Aditya Desai, Fangshuo Liao, Weitao Wang, Victor Xie, Zhaozhuo Xu, Anastasios Kyrillidis, Anshumali Shrivastava

Winner-Take-All Column Row Sampling for Memory Efficient Adaptation of Language Model

May 24, 2023
Zirui Liu, Guanchu Wang, Shaochen Zhong, Zhaozhuo Xu, Daochen Zha, Ruixiang Tang, Zhimeng Jiang, Kaixiong Zhou, Vipin Chaudhary, Shuai Xu, Xia Hu

Compress, Then Prompt: Improving Accuracy-Efficiency Trade-off of LLM Inference with Transferable Prompt

May 17, 2023
Zhaozhuo Xu, Zirui Liu, Beidi Chen, Yuxin Tang, Jue Wang, Kaixiong Zhou, Xia Hu, Anshumali Shrivastava

A Theoretical Analysis Of Nearest Neighbor Search On Approximate Near Neighbor Graph

Mar 10, 2023
Anshumali Shrivastava, Zhao Song, Zhaozhuo Xu

Adaptive and Dynamic Multi-Resolution Hashing for Pairwise Summations

Dec 21, 2022
Lianke Qin, Aravind Reddy, Zhao Song, Zhaozhuo Xu, Danyang Zhuo