DeepSpeed Ulysses: System Optimizations for Enabling Training of Extreme Long Sequence Transformer Models

Sep 25, 2023
Sam Ade Jacobs, Masahiro Tanaka, Chengming Zhang, Minjia Zhang, Leon Song, Samyam Rajbhandari, Yuxiong He
