Alert button

Reducing Activation Recomputation in Large Transformer Models

Add code
Bookmark button
Alert button
May 10, 2022
Vijay Korthikanti, Jared Casper, Sangkug Lym, Lawrence McAfee, Michael Andersch, Mohammad Shoeybi, Bryan Catanzaro

Figure 1 for Reducing Activation Recomputation in Large Transformer Models
Figure 2 for Reducing Activation Recomputation in Large Transformer Models
Figure 3 for Reducing Activation Recomputation in Large Transformer Models
Figure 4 for Reducing Activation Recomputation in Large Transformer Models

Share this with someone who'll enjoy it:

View paper onarxiv icon

Share this with someone who'll enjoy it: