Training Compute-Optimal Large Language Models



Jordan Hoffmann, Sebastian Borgeaud, Arthur Mensch, Elena Buchatskaya, Trevor Cai, Eliza Rutherford, Diego de Las Casas, Lisa Anne Hendricks, Johannes Welbl, Aidan Clark, Tom Hennigan, Eric Noland, Katie Millican, George van den Driessche, Bogdan Damoc, Aurelia Guy, Simon Osindero, Karen Simonyan, Erich Elsen, Jack W. Rae, Oriol Vinyals, Laurent Sifre



Unified Scaling Laws for Routed Language Models



Aidan Clark, Diego de las Casas, Aurelia Guy, Arthur Mensch, Michela Paganini, Jordan Hoffmann, Bogdan Damoc, Blake Hechtman, Trevor Cai, Sebastian Borgeaud, George van den Driessche, Eliza Rutherford, Tom Hennigan, Matthew Johnson, Katie Millican, Albin Cassirer, Chris Jones, Elena Buchatskaya, David Budden, Laurent Sifre, Simon Osindero, Oriol Vinyals, Jack Rae, Erich Elsen, Koray Kavukcuoglu, Karen Simonyan

* Fixing typos and affiliation clarity 


Scaling Language Models: Methods, Analysis & Insights from Training Gopher



Jack W. Rae, Sebastian Borgeaud, Trevor Cai, Katie Millican, Jordan Hoffmann, Francis Song, John Aslanides, Sarah Henderson, Roman Ring, Susannah Young, Eliza Rutherford, Tom Hennigan, Jacob Menick, Albin Cassirer, Richard Powell, George van den Driessche, Lisa Anne Hendricks, Maribeth Rauh, Po-Sen Huang, Amelia Glaese, Johannes Welbl, Sumanth Dathathri, Saffron Huang, Jonathan Uesato, John Mellor, Irina Higgins, Antonia Creswell, Nat McAleese, Amy Wu, Erich Elsen, Siddhant Jayakumar, Elena Buchatskaya, David Budden, Esme Sutherland, Karen Simonyan, Michela Paganini, Laurent Sifre, Lena Martens, Xiang Lorraine Li, Adhiguna Kuncoro, Aida Nematzadeh, Elena Gribovskaya, Domenic Donato, Angeliki Lazaridou, Arthur Mensch, Jean-Baptiste Lespiau, Maria Tsimpoukelli, Nikolai Grigorev, Doug Fritz, Thibault Sottiaux, Mantas Pajarskas, Toby Pohlen, Zhitao Gong, Daniel Toyama, Cyprien de Masson d'Autume, Yujia Li, Tayfun Terzi, Vladimir Mikulik, Igor Babuschkin, Aidan Clark, Diego de Las Casas, Aurelia Guy, Chris Jones, James Bradbury, Matthew Johnson, Blake Hechtman, Laura Weidinger, Iason Gabriel, William Isaac, Ed Lockhart, Simon Osindero, Laura Rimell, Chris Dyer, Oriol Vinyals, Kareem Ayoub, Jeff Stanway, Lorrayne Bennett, Demis Hassabis, Koray Kavukcuoglu, Geoffrey Irving

* 118 pages 


Bootstrap Your Own Latent: A New Approach to Self-Supervised Learning



Jean-Bastien Grill, Florian Strub, Florent Altché, Corentin Tallec, Pierre H. Richemond, Elena Buchatskaya, Carl Doersch, Bernardo Avila Pires, Zhaohan Daniel Guo, Mohammad Gheshlaghi Azar, Bilal Piot, Koray Kavukcuoglu, Rémi Munos, Michal Valko

