Weigao Sun

HGRN2: Gated Linear RNNs with State Expansion

Apr 11, 2024
Zhen Qin, Songlin Yang, Weixuan Sun, Xuyang Shen, Dong Li, Weigao Sun, Yiran Zhong

Linear Attention Sequence Parallelism

Apr 03, 2024
Weigao Sun, Zhen Qin, Dong Li, Xuyang Shen, Yu Qiao, Yiran Zhong

MS-Net: A Multi-Path Sparse Model for Motion Prediction in Multi-Scenes

Mar 01, 2024
Xiaqiang Tang, Weigao Sun, Siyuan Hu, Yiyang Sun, Yafeng Guo

CO2: Efficient Distributed Training with Full Communication-Computation Overlap

Jan 29, 2024
Weigao Sun, Zhen Qin, Weixuan Sun, Shidi Li, Dong Li, Xuyang Shen, Yu Qiao, Yiran Zhong

Lightning Attention-2: A Free Lunch for Handling Unlimited Sequence Lengths in Large Language Models

Jan 15, 2024
Zhen Qin, Weigao Sun, Dong Li, Xuyang Shen, Weixuan Sun, Yiran Zhong

Scaling TransNormer to 175 Billion Parameters

Jul 27, 2023
Zhen Qin, Dong Li, Weigao Sun, Weixuan Sun, Xuyang Shen, Xiaodong Han, Yunshen Wei, Baohong Lv, Fei Yuan, Xiao Luo, Yu Qiao, Yiran Zhong
