PaLM: Scaling Language Modeling with Pathways



Aakanksha Chowdhery, Sharan Narang, Jacob Devlin, Maarten Bosma, Gaurav Mishra, Adam Roberts, Paul Barham, Hyung Won Chung, Charles Sutton, Sebastian Gehrmann, Parker Schuh, Kensen Shi, Sasha Tsvyashchenko, Joshua Maynez, Abhishek Rao, Parker Barnes, Yi Tay, Noam Shazeer, Vinodkumar Prabhakaran, Emily Reif, Nan Du, Ben Hutchinson, Reiner Pope, James Bradbury, Jacob Austin, Michael Isard, Guy Gur-Ari, Pengcheng Yin, Toju Duke, Anselm Levskaya, Sanjay Ghemawat, Sunipa Dev, Henryk Michalewski, Xavier Garcia, Vedant Misra, Kevin Robinson, Liam Fedus, Denny Zhou, Daphne Ippolito, David Luan, Hyeontaek Lim, Barret Zoph, Alexander Spiridonov, Ryan Sepassi, David Dohan, Shivani Agrawal, Mark Omernick, Andrew M. Dai, Thanumalayan Sankaranarayana Pillai, Marie Pellat, Aitor Lewkowycz, Erica Moreira, Rewon Child, Oleksandr Polozov, Katherine Lee, Zongwei Zhou, Xuezhi Wang, Brennan Saeta, Mark Diaz, Orhan Firat, Michele Catasta, Jason Wei, Kathy Meier-Hellstern, Douglas Eck, Jeff Dean, Slav Petrov, Noah Fiedel



Self-Consistency Improves Chain of Thought Reasoning in Language Models



Xuezhi Wang, Jason Wei, Dale Schuurmans, Quoc Le, Ed Chi, Sharan Narang, Aakanksha Chowdhery, Denny Zhou

* V2: added PaLM based results 


Scaling Up Models and Data with t5x and seqio



Adam Roberts, Hyung Won Chung, Anselm Levskaya, Gaurav Mishra, James Bradbury, Daniel Andor, Sharan Narang, Brian Lester, Colin Gaffney, Afroz Mohiuddin, Curtis Hawthorne, Aitor Lewkowycz, Alex Salcianu, Marc van Zee, Jacob Austin, Sebastian Goodman, Livio Baldini Soares, Haitang Hu, Sasha Tsvyashchenko, Aakanksha Chowdhery, Jasmijn Bastings, Jannis Bulian, Xavier Garcia, Jianmo Ni, Andrew Chen, Kathleen Kenealy, Jonathan H. Clark, Stephan Lee, Dan Garrette, James Lee-Thorp, Colin Raffel, Noam Shazeer, Marvin Ritter, Maarten Bosma, Alexandre Passos, Jeremy Maitin-Shepard, Noah Fiedel, Mark Omernick, Brennan Saeta, Ryan Sepassi, Alexander Spiridonov, Joshua Newlan, Andrea Gesmundo



Scale Efficiently: Insights from Pre-training and Fine-tuning Transformers



Yi Tay, Mostafa Dehghani, Jinfeng Rao, William Fedus, Samira Abnar, Hyung Won Chung, Sharan Narang, Dani Yogatama, Ashish Vaswani, Donald Metzler



ByT5: Towards a token-free future with pre-trained byte-to-byte models



Linting Xue, Aditya Barua, Noah Constant, Rami Al-Rfou, Sharan Narang, Mihir Kale, Adam Roberts, Colin Raffel



Do Transformer Modifications Transfer Across Implementations and Applications?



Sharan Narang, Hyung Won Chung, Yi Tay, William Fedus, Thibault Fevry, Michael Matena, Karishma Malkan, Noah Fiedel, Noam Shazeer, Zhenzhong Lan, Yanqi Zhou, Wei Li, Nan Ding, Jake Marcus, Adam Roberts, Colin Raffel



On Task-Level Dialogue Composition of Generative Transformer Model



Prasanna Parthasarathi, Arvind Neelakantan, Sharan Narang

* 8 pages; Accepted at Workshop on Insights from Negative Results in NLP 


WT5?! Training Text-to-Text Models to Explain their Predictions



Sharan Narang, Colin Raffel, Katherine Lee, Adam Roberts, Noah Fiedel, Karishma Malkan



Neural Assistant: Joint Action Prediction, Response Generation, and Latent Knowledge Reasoning



Arvind Neelakantan, Semih Yavuz, Sharan Narang, Vishaal Prasad, Ben Goodrich, Daniel Duckworth, Chinnadhurai Sankar, Xifeng Yan


