Which transformer architecture fits my data? A vocabulary bottleneck in self-attention

May 09, 2021
Noam Wies, Yoav Levine, Daniel Jannai, Amnon Shashua

Figures 1-4 for Which transformer architecture fits my data? A vocabulary bottleneck in self-attention

View paper on arXiv
