Galvatron: Efficient Transformer Training over Multiple GPUs Using Automatic Parallelism

Nov 25, 2022
Xupeng Miao, Yujie Wang, Youhe Jiang, Chunan Shi, Xiaonan Nie, Hailin Zhang, Bin Cui

View paper on arXiv