Stephen Youn

FP6-LLM: Efficiently Serving Large Language Models Through FP6-Centric Algorithm-System Co-Design

Jan 25, 2024
Haojun Xia, Zhen Zheng, Xiaoxia Wu, Shiyang Chen, Zhewei Yao, Stephen Youn, Arash Bakhtiari, Michael Wyatt, Donglin Zhuang, Zhongzhu Zhou, Olatunji Ruwase, Yuxiong He, Shuaiwen Leon Song

Via arXiv

ZeroQuant(4+2): Redefining LLMs Quantization with a New FP6-Centric Strategy for Diverse Generative Tasks

Dec 18, 2023
Xiaoxia Wu, Haojun Xia, Stephen Youn, Zhen Zheng, Shiyang Chen, Arash Bakhtiari, Michael Wyatt, Reza Yazdani Aminabadi, Yuxiong He, Olatunji Ruwase, Leon Song, Zhewei Yao

Via arXiv

ZeroQuant-HERO: Hardware-Enhanced Robust Optimized Post-Training Quantization Framework for W8A8 Transformers

Oct 26, 2023
Zhewei Yao, Reza Yazdani Aminabadi, Stephen Youn, Xiaoxia Wu, Elton Zheng, Yuxiong He

Via arXiv

A Comprehensive Study on Post-Training Quantization for Large Language Models

Mar 16, 2023
Zhewei Yao, Cheng Li, Xiaoxia Wu, Stephen Youn, Yuxiong He

Via arXiv