Unified Scaling Laws for Routed Language Models



Aidan Clark , Diego de las Casas , Aurelia Guy , Arthur Mensch , Michela Paganini , Jordan Hoffmann , Bogdan Damoc , Blake Hechtman , Trevor Cai , Sebastian Borgeaud , George van den Driessche , Eliza Rutherford , Tom Hennigan , Matthew Johnson , Katie Millican , Albin Cassirer , Chris Jones , Elena Buchatskaya , David Budden , Laurent Sifre , Simon Osindero , Oriol Vinyals , Jack Rae , Erich Elsen , Koray Kavukcuoglu , Karen Simonyan

* Fixing typos and affiliation clarity 

Improving language models by retrieving from trillions of tokens



Sebastian Borgeaud , Arthur Mensch , Jordan Hoffmann , Trevor Cai , Eliza Rutherford , Katie Millican , George van den Driessche , Jean-Baptiste Lespiau , Bogdan Damoc , Aidan Clark , Diego de Las Casas , Aurelia Guy , Jacob Menick , Roman Ring , Tom Hennigan , Saffron Huang , Loren Maggiore , Chris Jones , Albin Cassirer , Andy Brock , Michela Paganini , Geoffrey Irving , Oriol Vinyals , Simon Osindero , Karen Simonyan , Jack W. Rae , Erich Elsen , Laurent Sifre

* Add missing references. Fix some typos 

Scaling Language Models: Methods, Analysis & Insights from Training Gopher



Jack W. Rae , Sebastian Borgeaud , Trevor Cai , Katie Millican , Jordan Hoffmann , Francis Song , John Aslanides , Sarah Henderson , Roman Ring , Susannah Young , Eliza Rutherford , Tom Hennigan , Jacob Menick , Albin Cassirer , Richard Powell , George van den Driessche , Lisa Anne Hendricks , Maribeth Rauh , Po-Sen Huang , Amelia Glaese , Johannes Welbl , Sumanth Dathathri , Saffron Huang , Jonathan Uesato , John Mellor , Irina Higgins , Antonia Creswell , Nat McAleese , Amy Wu , Erich Elsen , Siddhant Jayakumar , Elena Buchatskaya , David Budden , Esme Sutherland , Karen Simonyan , Michela Paganini , Laurent Sifre , Lena Martens , Xiang Lorraine Li , Adhiguna Kuncoro , Aida Nematzadeh , Elena Gribovskaya , Domenic Donato , Angeliki Lazaridou , Arthur Mensch , Jean-Baptiste Lespiau , Maria Tsimpoukelli , Nikolai Grigorev , Doug Fritz , Thibault Sottiaux , Mantas Pajarskas , Toby Pohlen , Zhitao Gong , Daniel Toyama , Cyprien de Masson d'Autume , Yujia Li , Tayfun Terzi , Vladimir Mikulik , Igor Babuschkin , Aidan Clark , Diego de Las Casas , Aurelia Guy , Chris Jones , James Bradbury , Matthew Johnson , Blake Hechtman , Laura Weidinger , Iason Gabriel , William Isaac , Ed Lockhart , Simon Osindero , Laura Rimell , Chris Dyer , Oriol Vinyals , Kareem Ayoub , Jeff Stanway , Lorrayne Bennett , Demis Hassabis , Koray Kavukcuoglu , Geoffrey Irving

* 118 pages 

Prune Responsibly



Michela Paganini


Bespoke vs. Prêt-à-Porter Lottery Tickets: Exploiting Mask Similarity for Trainable Sub-Network Finding



Michela Paganini , Jessica Zosa Forde

* arXiv admin note: text overlap with arXiv:2001.05050 

dagger: A Python Framework for Reproducible Machine Learning Experiment Orchestration



Michela Paganini , Jessica Zosa Forde

* 4 pages, 3 code listings, 1 figure 

Streamlining Tensor and Network Pruning in PyTorch



Michela Paganini , Jessica Forde

* 5 pages, 1 figure, 5 code listings. Published as a workshop paper at ICLR 2020 

On Iterative Neural Network Pruning, Reinitialization, and the Similarity of Masks



Michela Paganini , Jessica Forde

* 8 pages, 8 figures, plus 5 appendices with additional figures and tables 

One ticket to win them all: generalizing lottery ticket initializations across datasets and optimizers



Ari S. Morcos , Haonan Yu , Michela Paganini , Yuandong Tian

