Xiaoying Jia

MegaScale: Scaling Large Language Model Training to More Than 10,000 GPUs

Feb 23, 2024
Ziheng Jiang, Haibin Lin, Yinmin Zhong, Qi Huang, Yangrui Chen, Zhi Zhang, Yanghua Peng, Xiang Li, Cong Xie, Shibiao Nong, Yulu Jia, Sun He, Hongmin Chen, Zhihao Bai, Qi Hou, Shipeng Yan, Ding Zhou, Yiyao Sheng, Zhuo Jiang, Haohan Xu, Haoran Wei, Zhang Zhang, Pengfei Nie, Leqi Zou, Sida Zhao, Liang Xiang, Zherui Liu, Zhe Li, Xiaoying Jia, Jianxi Ye, Xin Jin, Xin Liu

Via arXiv

ByteTransformer: A High-Performance Transformer Boosted for Variable-Length Inputs

Oct 06, 2022
Yujia Zhai, Chengquan Jiang, Leyuan Wang, Xiaoying Jia, Shang Zhang, Zizhong Chen, Xin Liu, Yibo Zhu

Via arXiv

Accelerating Sparse DNN Models without Hardware-Support via Tile-Wise Sparsity

Aug 29, 2020
Cong Guo, Bo Yang Hsueh, Jingwen Leng, Yuxian Qiu, Yue Guan, Zehuan Wang, Xiaoying Jia, Xipeng Li, Minyi Guo, Yuhao Zhu

Via arXiv