Guangyu Sun

LLM Inference Unveiled: Survey and Roofline Model Insights

Mar 11, 2024
Zhihang Yuan, Yuzhang Shang, Yang Zhou, Zhen Dong, Zhe Zhou, Chenhao Xue, Bingzhe Wu, Zhikai Li, Qingyi Gu, Yong Jae Lee, Yan Yan, Beidi Chen, Guangyu Sun, Kurt Keutzer

Via arXiv

ASVD: Activation-aware Singular Value Decomposition for Compressing Large Language Models

Dec 10, 2023
Zhihang Yuan, Yuzhang Shang, Yue Song, Qiang Wu, Yan Yan, Guangyu Sun

Via arXiv

FedPerfix: Towards Partial Model Personalization of Vision Transformers in Federated Learning

Aug 17, 2023
Guangyu Sun, Matias Mendieta, Jun Luo, Shandong Wu, Chen Chen

Via arXiv

RPTQ: Reorder-based Post-training Quantization for Large Language Models

Apr 25, 2023
Zhihang Yuan, Lin Niu, Jiawei Liu, Wenyu Liu, Xinggang Wang, Yuzhang Shang, Guangyu Sun, Qiang Wu, Jiaxiang Wu, Bingzhe Wu

Via arXiv

Benchmarking the Reliability of Post-training Quantization: a Particular Focus on Worst-case Performance

Mar 23, 2023
Zhihang Yuan, Jiawei Liu, Jiaxiang Wu, Dawei Yang, Qiang Wu, Guangyu Sun, Wenyu Liu, Xinggang Wang, Bingzhe Wu

Via arXiv

Latency-aware Spatial-wise Dynamic Networks

Oct 12, 2022
Yizeng Han, Zhihang Yuan, Yifan Pu, Chenhao Xue, Shiji Song, Guangyu Sun, Gao Huang

Via arXiv