Training Compute-Optimal Large Language Models



Jordan Hoffmann, Sebastian Borgeaud, Arthur Mensch, Elena Buchatskaya, Trevor Cai, Eliza Rutherford, Diego de Las Casas, Lisa Anne Hendricks, Johannes Welbl, Aidan Clark, Tom Hennigan, Eric Noland, Katie Millican, George van den Driessche, Bogdan Damoc, Aurelia Guy, Simon Osindero, Karen Simonyan, Erich Elsen, Jack W. Rae, Oriol Vinyals, Laurent Sifre



Unified Scaling Laws for Routed Language Models



Aidan Clark, Diego de las Casas, Aurelia Guy, Arthur Mensch, Michela Paganini, Jordan Hoffmann, Bogdan Damoc, Blake Hechtman, Trevor Cai, Sebastian Borgeaud, George van den Driessche, Eliza Rutherford, Tom Hennigan, Matthew Johnson, Katie Millican, Albin Cassirer, Chris Jones, Elena Buchatskaya, David Budden, Laurent Sifre, Simon Osindero, Oriol Vinyals, Jack Rae, Erich Elsen, Koray Kavukcuoglu, Karen Simonyan

* Fixing typos and affiliation clarity 


Improving language models by retrieving from trillions of tokens



Sebastian Borgeaud, Arthur Mensch, Jordan Hoffmann, Trevor Cai, Eliza Rutherford, Katie Millican, George van den Driessche, Jean-Baptiste Lespiau, Bogdan Damoc, Aidan Clark, Diego de Las Casas, Aurelia Guy, Jacob Menick, Roman Ring, Tom Hennigan, Saffron Huang, Loren Maggiore, Chris Jones, Albin Cassirer, Andy Brock, Michela Paganini, Geoffrey Irving, Oriol Vinyals, Simon Osindero, Karen Simonyan, Jack W. Rae, Erich Elsen, Laurent Sifre

* Add missing references. Fix some typos 


Scaling Language Models: Methods, Analysis & Insights from Training Gopher



Jack W. Rae, Sebastian Borgeaud, Trevor Cai, Katie Millican, Jordan Hoffmann, Francis Song, John Aslanides, Sarah Henderson, Roman Ring, Susannah Young, Eliza Rutherford, Tom Hennigan, Jacob Menick, Albin Cassirer, Richard Powell, George van den Driessche, Lisa Anne Hendricks, Maribeth Rauh, Po-Sen Huang, Amelia Glaese, Johannes Welbl, Sumanth Dathathri, Saffron Huang, Jonathan Uesato, John Mellor, Irina Higgins, Antonia Creswell, Nat McAleese, Amy Wu, Erich Elsen, Siddhant Jayakumar, Elena Buchatskaya, David Budden, Esme Sutherland, Karen Simonyan, Michela Paganini, Laurent Sifre, Lena Martens, Xiang Lorraine Li, Adhiguna Kuncoro, Aida Nematzadeh, Elena Gribovskaya, Domenic Donato, Angeliki Lazaridou, Arthur Mensch, Jean-Baptiste Lespiau, Maria Tsimpoukelli, Nikolai Grigorev, Doug Fritz, Thibault Sottiaux, Mantas Pajarskas, Toby Pohlen, Zhitao Gong, Daniel Toyama, Cyprien de Masson d'Autume, Yujia Li, Tayfun Terzi, Vladimir Mikulik, Igor Babuschkin, Aidan Clark, Diego de Las Casas, Aurelia Guy, Chris Jones, James Bradbury, Matthew Johnson, Blake Hechtman, Laura Weidinger, Iason Gabriel, William Isaac, Ed Lockhart, Simon Osindero, Laura Rimell, Chris Dyer, Oriol Vinyals, Kareem Ayoub, Jeff Stanway, Lorrayne Bennett, Demis Hassabis, Koray Kavukcuoglu, Geoffrey Irving

* 118 pages 


Podracer architectures for scalable Reinforcement Learning



Matteo Hessel, Manuel Kroiss, Aidan Clark, Iurii Kemaev, John Quan, Thomas Keck, Fabio Viola, Hado van Hasselt



Skillful Precipitation Nowcasting using Deep Generative Models of Radar



Suman Ravuri, Karel Lenc, Matthew Willson, Dmitry Kangin, Remi Lam, Piotr Mirowski, Megan Fitzsimons, Maria Athanassiadou, Sheleem Kashem, Sam Madge, Rachel Prudden, Amol Mandhane, Aidan Clark, Andrew Brock, Karen Simonyan, Raia Hadsell, Niall Robinson, Ellen Clancy, Alberto Arribas, Shakir Mohamed

* 46 pages, 17 figures, 2 tables 


Transformation-based Adversarial Video Prediction on Large-Scale Data



Pauline Luc, Aidan Clark, Sander Dieleman, Diego de Las Casas, Yotam Doron, Albin Cassirer, Karen Simonyan



Stabilizing Transformers for Reinforcement Learning



Emilio Parisotto, H. Francis Song, Jack W. Rae, Razvan Pascanu, Caglar Gulcehre, Siddhant M. Jayakumar, Max Jaderberg, Raphael Lopez Kaufman, Aidan Clark, Seb Noury, Matthew M. Botvinick, Nicolas Heess, Raia Hadsell



High Fidelity Speech Synthesis with Adversarial Networks



Mikołaj Bińkowski, Jeff Donahue, Sander Dieleman, Aidan Clark, Erich Elsen, Norman Casagrande, Luis C. Cobo, Karen Simonyan


