Alert button
Picture for Shiyang Chen

Shiyang Chen

Alert button

FP6-LLM: Efficiently Serving Large Language Models Through FP6-Centric Algorithm-System Co-Design

Add code
Bookmark button
Alert button
Jan 25, 2024
Haojun Xia, Zhen Zheng, Xiaoxia Wu, Shiyang Chen, Zhewei Yao, Stephen Youn, Arash Bakhtiari, Michael Wyatt, Donglin Zhuang, Zhongzhu Zhou, Olatunji Ruwase, Yuxiong He, Shuaiwen Leon Song

Viaarxiv icon

ZeroQuant(4+2): Redefining LLMs Quantization with a New FP6-Centric Strategy for Diverse Generative Tasks

Add code
Bookmark button
Alert button
Dec 18, 2023
Xiaoxia Wu, Haojun Xia, Stephen Youn, Zhen Zheng, Shiyang Chen, Arash Bakhtiari, Michael Wyatt, Reza Yazdani Aminabadi, Yuxiong He, Olatunji Ruwase, Leon Song, Zhewei Yao

Figure 1 for ZeroQuant(4+2): Redefining LLMs Quantization with a New FP6-Centric Strategy for Diverse Generative Tasks
Figure 2 for ZeroQuant(4+2): Redefining LLMs Quantization with a New FP6-Centric Strategy for Diverse Generative Tasks
Figure 3 for ZeroQuant(4+2): Redefining LLMs Quantization with a New FP6-Centric Strategy for Diverse Generative Tasks
Figure 4 for ZeroQuant(4+2): Redefining LLMs Quantization with a New FP6-Centric Strategy for Diverse Generative Tasks
Viaarxiv icon

DeepSpeed4Science Initiative: Enabling Large-Scale Scientific Discovery through Sophisticated AI System Technologies

Add code
Bookmark button
Alert button
Oct 11, 2023
Shuaiwen Leon Song, Bonnie Kruft, Minjia Zhang, Conglong Li, Shiyang Chen, Chengming Zhang, Masahiro Tanaka, Xiaoxia Wu, Jeff Rasley, Ammar Ahmad Awan, Connor Holmes, Martin Cai, Adam Ghanem, Zhongzhu Zhou, Yuxiong He, Pete Luferenko, Divya Kumar, Jonathan Weyn, Ruixiong Zhang, Sylwester Klocek, Volodymyr Vragov, Mohammed AlQuraishi, Gustaf Ahdritz, Christina Floristean, Cristina Negri, Rao Kotamarthi, Venkatram Vishwanath, Arvind Ramanathan, Sam Foreman, Kyle Hippe, Troy Arcomano, Romit Maulik, Maxim Zvyagin, Alexander Brace, Bin Zhang, Cindy Orozco Bohorquez, Austin Clyde, Bharat Kale, Danilo Perez-Rivera, Heng Ma, Carla M. Mann, Michael Irvin, J. Gregory Pauloski, Logan Ward, Valerie Hayot, Murali Emani, Zhen Xie, Diangen Lin, Maulik Shukla, Ian Foster, James J. Davis, Michael E. Papka, Thomas Brettin, Prasanna Balaprakash, Gina Tourassi, John Gounley, Heidi Hanson, Thomas E Potok, Massimiliano Lupo Pasini, Kate Evans, Dan Lu, Dalton Lunga, Junqi Yin, Sajal Dash, Feiyi Wang, Mallikarjun Shankar, Isaac Lyngaas, Xiao Wang, Guojing Cong, Pei Zhang, Ming Fan, Siyan Liu, Adolfy Hoisie, Shinjae Yoo, Yihui Ren, William Tang, Kyle Felker, Alexey Svyatkovskiy, Hang Liu, Ashwin Aji, Angela Dalton, Michael Schulte, Karl Schulz, Yuntian Deng, Weili Nie, Josh Romero, Christian Dallago, Arash Vahdat, Chaowei Xiao, Thomas Gibbs, Anima Anandkumar, Rick Stevens

Figure 1 for DeepSpeed4Science Initiative: Enabling Large-Scale Scientific Discovery through Sophisticated AI System Technologies
Figure 2 for DeepSpeed4Science Initiative: Enabling Large-Scale Scientific Discovery through Sophisticated AI System Technologies
Figure 3 for DeepSpeed4Science Initiative: Enabling Large-Scale Scientific Discovery through Sophisticated AI System Technologies
Figure 4 for DeepSpeed4Science Initiative: Enabling Large-Scale Scientific Discovery through Sophisticated AI System Technologies
Viaarxiv icon

Tango: rethinking quantization for graph neural network training on GPUs

Add code
Bookmark button
Alert button
Aug 02, 2023
Shiyang Chen, Da Zheng, Caiwen Ding, Chengying Huan, Yuede Ji, Hang Liu

Viaarxiv icon

Motif-based Graph Representation Learning with Application to Chemical Molecules

Add code
Bookmark button
Alert button
Aug 09, 2022
Yifei Wang, Shiyang Chen, Guobin Chen, Ethan Shurberg, Hang Liu, Pengyu Hong

Figure 1 for Motif-based Graph Representation Learning with Application to Chemical Molecules
Figure 2 for Motif-based Graph Representation Learning with Application to Chemical Molecules
Figure 3 for Motif-based Graph Representation Learning with Application to Chemical Molecules
Figure 4 for Motif-based Graph Representation Learning with Application to Chemical Molecules
Viaarxiv icon

A Length Adaptive Algorithm-Hardware Co-design of Transformer on FPGA Through Sparse Attention and Dynamic Pipelining

Add code
Bookmark button
Alert button
Aug 07, 2022
Hongwu Peng, Shaoyi Huang, Shiyang Chen, Bingbing Li, Tong Geng, Ang Li, Weiwen Jiang, Wujie Wen, Jinbo Bi, Hang Liu, Caiwen Ding

Figure 1 for A Length Adaptive Algorithm-Hardware Co-design of Transformer on FPGA Through Sparse Attention and Dynamic Pipelining
Figure 2 for A Length Adaptive Algorithm-Hardware Co-design of Transformer on FPGA Through Sparse Attention and Dynamic Pipelining
Figure 3 for A Length Adaptive Algorithm-Hardware Co-design of Transformer on FPGA Through Sparse Attention and Dynamic Pipelining
Figure 4 for A Length Adaptive Algorithm-Hardware Co-design of Transformer on FPGA Through Sparse Attention and Dynamic Pipelining
Viaarxiv icon

Transfer learning of phase transitions in percolation and directed percolation

Add code
Bookmark button
Alert button
Jan 06, 2022
Jianmin Shen, Feiyi Liu, Shiyang Chen, Dian Xu, Xiangna Chen, Shengfeng Deng, Wei Li, Gabor Papp, Chunbin Yang

Figure 1 for Transfer learning of phase transitions in percolation and directed percolation
Figure 2 for Transfer learning of phase transitions in percolation and directed percolation
Figure 3 for Transfer learning of phase transitions in percolation and directed percolation
Figure 4 for Transfer learning of phase transitions in percolation and directed percolation
Viaarxiv icon

Sparse Progressive Distillation: Resolving Overfitting under Pretrain-and-Finetune Paradigm

Add code
Bookmark button
Alert button
Oct 18, 2021
Shaoyi Huang, Dongkuan Xu, Ian E. H. Yen, Sung-en Chang, Bingbing Li, Shiyang Chen, Mimi Xie, Hang Liu, Caiwen Ding

Figure 1 for Sparse Progressive Distillation: Resolving Overfitting under Pretrain-and-Finetune Paradigm
Figure 2 for Sparse Progressive Distillation: Resolving Overfitting under Pretrain-and-Finetune Paradigm
Figure 3 for Sparse Progressive Distillation: Resolving Overfitting under Pretrain-and-Finetune Paradigm
Figure 4 for Sparse Progressive Distillation: Resolving Overfitting under Pretrain-and-Finetune Paradigm
Viaarxiv icon