Linear attention is (maybe) all you need (to understand transformer optimization)

Oct 02, 2023
Kwangjun Ahn, Xiang Cheng, Minhak Song, Chulhee Yun, Ali Jadbabaie, Suvrit Sra

