Memory-efficient Transformers via Top-$k$ Attention

Jun 13, 2021
Ankit Gupta, Guy Dar, Shaya Goodman, David Ciprut, Jonathan Berant

