Alert button

DeepSpeed Inference: Enabling Efficient Inference of Transformer Models at Unprecedented Scale

Add code
Bookmark button
Alert button
Jun 30, 2022
Reza Yazdani Aminabadi, Samyam Rajbhandari, Minjia Zhang, Ammar Ahmad Awan, Cheng Li, Du Li, Elton Zheng, Jeff Rasley, Shaden Smith, Olatunji Ruwase, Yuxiong He

Figure 1 for DeepSpeed Inference: Enabling Efficient Inference of Transformer Models at Unprecedented Scale
Figure 2 for DeepSpeed Inference: Enabling Efficient Inference of Transformer Models at Unprecedented Scale
Figure 3 for DeepSpeed Inference: Enabling Efficient Inference of Transformer Models at Unprecedented Scale
Figure 4 for DeepSpeed Inference: Enabling Efficient Inference of Transformer Models at Unprecedented Scale

Share this with someone who'll enjoy it:

View paper onarxiv icon

Share this with someone who'll enjoy it: