Shengnan Wang

BurstAttention: An Efficient Distributed Attention Framework for Extremely Long Sequences

Mar 14, 2024
Sun Ao, Weilin Zhao, Xu Han, Cheng Yang, Zhiyuan Liu, Chuan Shi, Maosong Sun, Shengnan Wang, Teng Su

Via arXiv

Digital twin-assisted three-dimensional electrical capacitance tomography for multiphase flow imaging

Dec 22, 2023
Shengnan Wang, Yi Li, Zhou Chen, Yunjie Yang

Via arXiv

Progressively Stacking 2.0: A Multi-stage Layerwise Training Method for BERT Training Speedup

Nov 27, 2020
Cheng Yang, Shengnan Wang, Chao Yang, Yuechuan Li, Ru He, Jingqiao Zhang

Via arXiv

CoRe: An Efficient Coarse-refined Training Framework for BERT

Nov 27, 2020
Cheng Yang, Shengnan Wang, Yuechuan Li, Chao Yang, Ming Yan, Jingqiao Zhang, Fangquan Lin

Via arXiv