Get our free extension to see links to code for papers anywhere online!

Chrome logo  Add to Chrome

Firefox logo Add to Firefox

Building Machine Translation Systems for the Next Thousand Languages


May 16, 2022
Ankur Bapna, Isaac Caswell, Julia Kreutzer, Orhan Firat, Daan van Esch, Aditya Siddhant, Mengmeng Niu, Pallavi Baljekar, Xavier Garcia, Wolfgang Macherey, Theresa Breiner, Vera Axelrod, Jason Riesa, Yuan Cao, Mia Xu Chen, Klaus Macherey, Maxim Krikun, Pidong Wang, Alexander Gutkin, Apurva Shah, Yanping Huang, Zhifeng Chen, Yonghui Wu, Macduff Hughes

* V2: updated with some details from 24-language Google Translate launch in May 2022 

   Access Paper or Ask Questions

  • Share via Twitter
  • Share via Facebook
  • Share via LinkedIn
  • Share via Whatsapp
  • Share via Messenger
  • Share via Email

LaMDA: Language Models for Dialog Applications


Feb 10, 2022
Romal Thoppilan, Daniel De Freitas, Jamie Hall, Noam Shazeer, Apoorv Kulshreshtha, Heng-Tze Cheng, Alicia Jin, Taylor Bos, Leslie Baker, Yu Du, YaGuang Li, Hongrae Lee, Huaixiu Steven Zheng, Amin Ghafouri, Marcelo Menegali, Yanping Huang, Maxim Krikun, Dmitry Lepikhin, James Qin, Dehao Chen, Yuanzhong Xu, Zhifeng Chen, Adam Roberts, Maarten Bosma, Vincent Zhao, Yanqi Zhou, Chung-Ching Chang, Igor Krivokon, Will Rusch, Marc Pickett, Pranesh Srinivasan, Laichee Man, Kathleen Meier-Hellstern, Meredith Ringel Morris, Tulsee Doshi, Renelito Delos Santos, Toju Duke, Johnny Soraker, Ben Zevenbergen, Vinodkumar Prabhakaran, Mark Diaz, Ben Hutchinson, Kristen Olson, Alejandra Molina, Erin Hoffman-John, Josh Lee, Lora Aroyo, Ravi Rajakumar, Alena Butryna, Matthew Lamm, Viktoriya Kuzmina, Joe Fenton, Aaron Cohen, Rachel Bernstein, Ray Kurzweil, Blaise Aguera-Arcas, Claire Cui, Marian Croak, Ed Chi, Quoc Le


   Access Paper or Ask Questions

  • Share via Twitter
  • Share via Facebook
  • Share via LinkedIn
  • Share via Whatsapp
  • Share via Messenger
  • Share via Email

Data Scaling Laws in NMT: The Effect of Noise and Architecture


Feb 04, 2022
Yamini Bansal, Behrooz Ghorbani, Ankush Garg, Biao Zhang, Maxim Krikun, Colin Cherry, Behnam Neyshabur, Orhan Firat


   Access Paper or Ask Questions

  • Share via Twitter
  • Share via Facebook
  • Share via LinkedIn
  • Share via Whatsapp
  • Share via Messenger
  • Share via Email

GLaM: Efficient Scaling of Language Models with Mixture-of-Experts


Dec 13, 2021
Nan Du, Yanping Huang, Andrew M. Dai, Simon Tong, Dmitry Lepikhin, Yuanzhong Xu, Maxim Krikun, Yanqi Zhou, Adams Wei Yu, Orhan Firat, Barret Zoph, Liam Fedus, Maarten Bosma, Zongwei Zhou, Tao Wang, Yu Emma Wang, Kellie Webster, Marie Pellat, Kevin Robinson, Kathy Meier-Hellstern, Toju Duke, Lucas Dixon, Kun Zhang, Quoc V Le, Yonghui Wu, Zhifeng Chen, Claire Cui


   Access Paper or Ask Questions

  • Share via Twitter
  • Share via Facebook
  • Share via LinkedIn
  • Share via Whatsapp
  • Share via Messenger
  • Share via Email

Beyond Distillation: Task-level Mixture-of-Experts for Efficient Inference


Sep 24, 2021
Sneha Kudugunta, Yanping Huang, Ankur Bapna, Maxim Krikun, Dmitry Lepikhin, Minh-Thang Luong, Orhan Firat

* EMNLP Findings 2021 

   Access Paper or Ask Questions

  • Share via Twitter
  • Share via Facebook
  • Share via LinkedIn
  • Share via Whatsapp
  • Share via Messenger
  • Share via Email

Scaling Laws for Neural Machine Translation


Sep 16, 2021
Behrooz Ghorbani, Orhan Firat, Markus Freitag, Ankur Bapna, Maxim Krikun, Xavier Garcia, Ciprian Chelba, Colin Cherry

* 31 pages, 23 figures 

   Access Paper or Ask Questions

  • Share via Twitter
  • Share via Facebook
  • Share via LinkedIn
  • Share via Whatsapp
  • Share via Messenger
  • Share via Email

GSPMD: General and Scalable Parallelization for ML Computation Graphs


May 10, 2021
Yuanzhong Xu, HyoukJoong Lee, Dehao Chen, Blake Hechtman, Yanping Huang, Rahul Joshi, Maxim Krikun, Dmitry Lepikhin, Andy Ly, Marcello Maggioni, Ruoming Pang, Noam Shazeer, Shibo Wang, Tao Wang, Yonghui Wu, Zhifeng Chen


   Access Paper or Ask Questions

  • Share via Twitter
  • Share via Facebook
  • Share via LinkedIn
  • Share via Whatsapp
  • Share via Messenger
  • Share via Email

GShard: Scaling Giant Models with Conditional Computation and Automatic Sharding


Jun 30, 2020
Dmitry Lepikhin, HyoukJoong Lee, Yuanzhong Xu, Dehao Chen, Orhan Firat, Yanping Huang, Maxim Krikun, Noam Shazeer, Zhifeng Chen


   Access Paper or Ask Questions

  • Share via Twitter
  • Share via Facebook
  • Share via LinkedIn
  • Share via Whatsapp
  • Share via Messenger
  • Share via Email

Massively Multilingual Neural Machine Translation in the Wild: Findings and Challenges


Jul 11, 2019
Naveen Arivazhagan, Ankur Bapna, Orhan Firat, Dmitry Lepikhin, Melvin Johnson, Maxim Krikun, Mia Xu Chen, Yuan Cao, George Foster, Colin Cherry, Wolfgang Macherey, Zhifeng Chen, Yonghui Wu


   Access Paper or Ask Questions

  • Share via Twitter
  • Share via Facebook
  • Share via LinkedIn
  • Share via Whatsapp
  • Share via Messenger
  • Share via Email
1
2
>>