Informer: Transformer Likes Informed Attention

Dec 21, 2020
Ruining He, Anirudh Ravula, Bhargav Kanagal, Joshua Ainslie

The Transformer is the backbone of modern NLP models. In this paper, we propose Informer, a simple architecture that significantly outperforms canonical Transformers on a spectrum of tasks including Masked Language Modeling, GLUE, and SQuAD. Qualitatively, Informer is easy to implement and requires minimal hyper-parameter tuning. It also stabilizes training and leads to models with sparser attention. Code will be open-sourced upon paper acceptance.

* 13 pages, 8 figures 
