OPT: Open Pre-trained Transformer Language Models


May 05, 2022
Susan Zhang, Stephen Roller, Naman Goyal, Mikel Artetxe, Moya Chen, Shuohui Chen, Christopher Dewan, Mona Diab, Xian Li, Xi Victoria Lin, Todor Mihaylov, Myle Ott, Sam Shleifer, Kurt Shuster, Daniel Simig, Punit Singh Koura, Anjali Sridhar, Tianlu Wang, Luke Zettlemoyer



Efficient Language Modeling with Sparse all-MLP


Mar 16, 2022
Ping Yu, Mikel Artetxe, Myle Ott, Sam Shleifer, Hongyu Gong, Ves Stoyanov, Xian Li



Efficient Large Scale Language Modeling with Mixtures of Experts


Dec 20, 2021
Mikel Artetxe, Shruti Bhosale, Naman Goyal, Todor Mihaylov, Myle Ott, Sam Shleifer, Xi Victoria Lin, Jingfei Du, Srinivasan Iyer, Ramakanth Pasunuru, Giri Anantharaman, Xian Li, Shuohui Chen, Halil Akin, Mandeep Baines, Louis Martin, Xing Zhou, Punit Singh Koura, Brian O'Horo, Jeff Wang, Luke Zettlemoyer, Mona Diab, Zornitsa Kozareva, Ves Stoyanov



Few-shot Learning with Multilingual Language Models


Dec 20, 2021
Xi Victoria Lin, Todor Mihaylov, Mikel Artetxe, Tianlu Wang, Shuohui Chen, Daniel Simig, Myle Ott, Naman Goyal, Shruti Bhosale, Jingfei Du, Ramakanth Pasunuru, Sam Shleifer, Punit Singh Koura, Vishrav Chaudhary, Brian O'Horo, Jeff Wang, Luke Zettlemoyer, Zornitsa Kozareva, Mona Diab, Veselin Stoyanov, Xian Li

* 36 pages 


NormFormer: Improved Transformer Pretraining with Extra Normalization


Nov 01, 2021
Sam Shleifer, Jason Weston, Myle Ott



Sustainable AI: Environmental Implications, Challenges and Opportunities


Oct 30, 2021
Carole-Jean Wu, Ramya Raghavendra, Udit Gupta, Bilge Acun, Newsha Ardalani, Kiwan Maeng, Gloria Chang, Fiona Aga Behram, James Huang, Charles Bai, Michael Gschwind, Anurag Gupta, Myle Ott, Anastasia Melnikov, Salvatore Candido, David Brooks, Geeta Chauhan, Benjamin Lee, Hsien-Hsin S. Lee, Bugra Akyildiz, Maximilian Balandat, Joe Spisak, Ravi Jain, Mike Rabbat, Kim Hazelwood



On Anytime Learning at Macroscale


Jun 17, 2021
Lucas Caccia, Jing Xu, Myle Ott, Marc'Aurelio Ranzato, Ludovic Denoyer



Larger-Scale Transformers for Multilingual Masked Language Modeling


May 02, 2021
Naman Goyal, Jingfei Du, Myle Ott, Giri Anantharaman, Alexis Conneau

* 4 pages 
