Get our free extension to see links to code for papers anywhere online!

Chrome logo  Add to Chrome

Firefox logo Add to Firefox

Layerwise Bregman Representation Learning with Applications to Knowledge Distillation


Sep 15, 2022
Ehsan Amid, Rohan Anil, Christopher Fifty, Manfred K. Warmuth


   Access Paper or Ask Questions

  • Share via Twitter
  • Share via Facebook
  • Share via LinkedIn
  • Share via Whatsapp
  • Share via Messenger
  • Share via Email

On the Factory Floor: ML Engineering for Industrial-Scale Ads Recommendation Models


Sep 12, 2022
Rohan Anil, Sandra Gadanho, Da Huang, Nijith Jacob, Zhuoshu Li, Dong Lin, Todd Phillips, Cristina Pop, Kevin Regan, Gil I. Shamir, Rakesh Shivanna, Qiqi Yan

* ORSUM - ACM RecSys, September 23, 2022, Seattle, WA 

   Access Paper or Ask Questions

  • Share via Twitter
  • Share via Facebook
  • Share via LinkedIn
  • Share via Whatsapp
  • Share via Messenger
  • Share via Email

N-Grammer: Augmenting Transformers with latent n-grams


Jul 13, 2022
Aurko Roy, Rohan Anil, Guangda Lai, Benjamin Lee, Jeffrey Zhao, Shuyuan Zhang, Shibo Wang, Ye Zhang, Shen Wu, Rigel Swavely, Tao, Yu, Phuong Dao, Christopher Fifty, Zhifeng Chen, Yonghui Wu

* 8 pages, 2 figures 

   Access Paper or Ask Questions

  • Share via Twitter
  • Share via Facebook
  • Share via LinkedIn
  • Share via Whatsapp
  • Share via Messenger
  • Share via Email

Learning from Randomly Initialized Neural Network Features


Feb 13, 2022
Ehsan Amid, Rohan Anil, Wojciech Kotłowski, Manfred K. Warmuth


   Access Paper or Ask Questions

  • Share via Twitter
  • Share via Facebook
  • Share via LinkedIn
  • Share via Whatsapp
  • Share via Messenger
  • Share via Email

Step-size Adaptation Using Exponentiated Gradient Updates


Jan 31, 2022
Ehsan Amid, Rohan Anil, Christopher Fifty, Manfred K. Warmuth


   Access Paper or Ask Questions

  • Share via Twitter
  • Share via Facebook
  • Share via LinkedIn
  • Share via Whatsapp
  • Share via Messenger
  • Share via Email

Efficiently Identifying Task Groupings for Multi-Task Learning


Sep 10, 2021
Christopher Fifty, Ehsan Amid, Zhe Zhao, Tianhe Yu, Rohan Anil, Chelsea Finn


   Access Paper or Ask Questions

  • Share via Twitter
  • Share via Facebook
  • Share via LinkedIn
  • Share via Whatsapp
  • Share via Messenger
  • Share via Email

Large-Scale Differentially Private BERT


Aug 03, 2021
Rohan Anil, Badih Ghazi, Vineet Gupta, Ravi Kumar, Pasin Manurangsi

* 12 pages, 6 figures 

   Access Paper or Ask Questions

  • Share via Twitter
  • Share via Facebook
  • Share via LinkedIn
  • Share via Whatsapp
  • Share via Messenger
  • Share via Email

LocoProp: Enhancing BackProp via Local Loss Optimization


Jun 11, 2021
Ehsan Amid, Rohan Anil, Manfred K. Warmuth


   Access Paper or Ask Questions

  • Share via Twitter
  • Share via Facebook
  • Share via LinkedIn
  • Share via Whatsapp
  • Share via Messenger
  • Share via Email

Knowledge distillation: A good teacher is patient and consistent


Jun 09, 2021
Lucas Beyer, Xiaohua Zhai, Amélie Royer, Larisa Markeeva, Rohan Anil, Alexander Kolesnikov

* Lucas, Xiaohua, Am\'elie, Larisa, and Alex contributed equally 

   Access Paper or Ask Questions

  • Share via Twitter
  • Share via Facebook
  • Share via LinkedIn
  • Share via Whatsapp
  • Share via Messenger
  • Share via Email

A Large Batch Optimizer Reality Check: Traditional, Generic Optimizers Suffice Across Batch Sizes


Feb 16, 2021
Zachary Nado, Justin M. Gilmer, Christopher J. Shallue, Rohan Anil, George E. Dahl


   Access Paper or Ask Questions

  • Share via Twitter
  • Share via Facebook
  • Share via LinkedIn
  • Share via Whatsapp
  • Share via Messenger
  • Share via Email
1
2
>>