Alert button

LightSeq: Sequence Level Parallelism for Distributed Training of Long Context Transformers

Add code
Bookmark button
Alert button
Oct 05, 2023
Dacheng Li, Rulin Shao, Anze Xie, Eric P. Xing, Joseph E. Gonzalez, Ion Stoica, Xuezhe Ma, Hao Zhang

Figure 1 for LightSeq: Sequence Level Parallelism for Distributed Training of Long Context Transformers
Figure 2 for LightSeq: Sequence Level Parallelism for Distributed Training of Long Context Transformers
Figure 3 for LightSeq: Sequence Level Parallelism for Distributed Training of Long Context Transformers
Figure 4 for LightSeq: Sequence Level Parallelism for Distributed Training of Long Context Transformers

Share this with someone who'll enjoy it:

View paper onarxiv icon

Share this with someone who'll enjoy it: