TPU-KNN: K Nearest Neighbor Search at Peak FLOP/s



Felix Chern, Blake Hechtman, Andy Davis, Ruiqi Guo, David Majnemer, Sanjiv Kumar



Unified Scaling Laws for Routed Language Models



Aidan Clark, Diego de las Casas, Aurelia Guy, Arthur Mensch, Michela Paganini, Jordan Hoffmann, Bogdan Damoc, Blake Hechtman, Trevor Cai, Sebastian Borgeaud, George van den Driessche, Eliza Rutherford, Tom Hennigan, Matthew Johnson, Katie Millican, Albin Cassirer, Chris Jones, Elena Buchatskaya, David Budden, Laurent Sifre, Simon Osindero, Oriol Vinyals, Jack Rae, Erich Elsen, Koray Kavukcuoglu, Karen Simonyan

* Fixing typos and affiliation clarity 


Scaling Language Models: Methods, Analysis & Insights from Training Gopher



Jack W. Rae, Sebastian Borgeaud, Trevor Cai, Katie Millican, Jordan Hoffmann, Francis Song, John Aslanides, Sarah Henderson, Roman Ring, Susannah Young, Eliza Rutherford, Tom Hennigan, Jacob Menick, Albin Cassirer, Richard Powell, George van den Driessche, Lisa Anne Hendricks, Maribeth Rauh, Po-Sen Huang, Amelia Glaese, Johannes Welbl, Sumanth Dathathri, Saffron Huang, Jonathan Uesato, John Mellor, Irina Higgins, Antonia Creswell, Nat McAleese, Amy Wu, Erich Elsen, Siddhant Jayakumar, Elena Buchatskaya, David Budden, Esme Sutherland, Karen Simonyan, Michela Paganini, Laurent Sifre, Lena Martens, Xiang Lorraine Li, Adhiguna Kuncoro, Aida Nematzadeh, Elena Gribovskaya, Domenic Donato, Angeliki Lazaridou, Arthur Mensch, Jean-Baptiste Lespiau, Maria Tsimpoukelli, Nikolai Grigorev, Doug Fritz, Thibault Sottiaux, Mantas Pajarskas, Toby Pohlen, Zhitao Gong, Daniel Toyama, Cyprien de Masson d'Autume, Yujia Li, Tayfun Terzi, Vladimir Mikulik, Igor Babuschkin, Aidan Clark, Diego de Las Casas, Aurelia Guy, Chris Jones, James Bradbury, Matthew Johnson, Blake Hechtman, Laura Weidinger, Iason Gabriel, William Isaac, Ed Lockhart, Simon Osindero, Laura Rimell, Chris Dyer, Oriol Vinyals, Kareem Ayoub, Jeff Stanway, Lorrayne Bennett, Demis Hassabis, Koray Kavukcuoglu, Geoffrey Irving

* 118 pages 


GSPMD: General and Scalable Parallelization for ML Computation Graphs



Yuanzhong Xu, HyoukJoong Lee, Dehao Chen, Blake Hechtman, Yanping Huang, Rahul Joshi, Maxim Krikun, Dmitry Lepikhin, Andy Ly, Marcello Maggioni, Ruoming Pang, Noam Shazeer, Shibo Wang, Tao Wang, Yonghui Wu, Zhifeng Chen



Scaling Local Self-Attention for Parameter Efficient Visual Backbones



Ashish Vaswani, Prajit Ramachandran, Aravind Srinivas, Niki Parmar, Blake Hechtman, Jonathon Shlens

* CVPR 2021 Oral 


Exploring the limits of Concurrency in ML Training on Google TPUs



Sameer Kumar, James Bradbury, Cliff Young, Yu Emma Wang, Anselm Levskaya, Blake Hechtman, Dehao Chen, HyoukJoong Lee, Mehmet Deveci, Naveen Kumar, Pankaj Kanwar, Shibo Wang, Skye Wanderman-Milne, Steve Lacy, Tao Wang, Tayo Oguntebi, Yazhou Zu, Yuanzhong Xu, Andy Swing



Automatic Cross-Replica Sharding of Weight Update in Data-Parallel Training



Yuanzhong Xu, HyoukJoong Lee, Dehao Chen, Hongjun Choi, Blake Hechtman, Shibo Wang

* 12 pages, 23 figures, 1 table 


Scale MLPerf-0.6 models on Google TPU-v3 Pods



Sameer Kumar, Victor Bitorff, Dehao Chen, Chiachen Chou, Blake Hechtman, HyoukJoong Lee, Naveen Kumar, Peter Mattson, Shibo Wang, Tao Wang, Yuanzhong Xu, Zongwei Zhou


