Alert button

BurstAttention: An Efficient Distributed Attention Framework for Extremely Long Sequences

Mar 14, 2024
Sun Ao, Weilin Zhao, Xu Han, Cheng Yang, Zhiyuan Liu, Chuan Shi, Maosong Sun, Shengnan Wang, Teng Su

Figure 1 for BurstAttention: An Efficient Distributed Attention Framework for Extremely Long Sequences
Figure 2 for BurstAttention: An Efficient Distributed Attention Framework for Extremely Long Sequences
Figure 3 for BurstAttention: An Efficient Distributed Attention Framework for Extremely Long Sequences
Figure 4 for BurstAttention: An Efficient Distributed Attention Framework for Extremely Long Sequences

Share this with someone who'll enjoy it:

View paper onarxiv icon

Share this with someone who'll enjoy it: