Lukasz Kaiser

Training Verifiers to Solve Math Word Problems

Nov 18, 2021
Karl Cobbe, Vineet Kosaraju, Mohammad Bavarian, Mark Chen, Heewoo Jun, Lukasz Kaiser, Matthias Plappert, Jerry Tworek, Jacob Hilton, Reiichiro Nakano, Christopher Hesse, John Schulman

Via arXiv

Evaluating Large Language Models Trained on Code

Jul 14, 2021
Mark Chen, Jerry Tworek, Heewoo Jun, Qiming Yuan, Henrique Ponde de Oliveira Pinto, Jared Kaplan, Harri Edwards, Yuri Burda, Nicholas Joseph, Greg Brockman, Alex Ray, Raul Puri, Gretchen Krueger, Michael Petrov, Heidy Khlaaf, Girish Sastry, Pamela Mishkin, Brooke Chan, Scott Gray, Nick Ryder, Mikhail Pavlov, Alethea Power, Lukasz Kaiser, Mohammad Bavarian, Clemens Winter, Philippe Tillet, Felipe Petroski Such, Dave Cummings, Matthias Plappert, Fotios Chantzis, Elizabeth Barnes, Ariel Herbert-Voss, William Hebgen Guss, Alex Nichol, Alex Paino, Nikolas Tezak, Jie Tang, Igor Babuschkin, Suchir Balaji, Shantanu Jain, William Saunders, Christopher Hesse, Andrew N. Carr, Jan Leike, Josh Achiam, Vedant Misra, Evan Morikawa, Alec Radford, Matthew Knight, Miles Brundage, Mira Murati, Katie Mayer, Peter Welinder, Bob McGrew, Dario Amodei, Sam McCandlish, Ilya Sutskever, Wojciech Zaremba

Via arXiv

Rethinking Attention with Performers

Sep 30, 2020
Krzysztof Choromanski, Valerii Likhosherstov, David Dohan, Xingyou Song, Andreea Gane, Tamas Sarlos, Peter Hawkins, Jared Davis, Afroz Mohiuddin, Lukasz Kaiser, David Belanger, Lucy Colwell, Adrian Weller

Via arXiv

Parallel Scheduled Sampling

Jun 11, 2019
Daniel Duckworth, Arvind Neelakantan, Ben Goodrich, Lukasz Kaiser, Samy Bengio

Via arXiv

Sample Efficient Text Summarization Using a Single Pre-Trained Transformer

May 21, 2019
Urvashi Khandelwal, Kevin Clark, Dan Jurafsky, Lukasz Kaiser

Via arXiv

Model-Based Reinforcement Learning for Atari

Mar 05, 2019
Lukasz Kaiser, Mohammad Babaeizadeh, Piotr Milos, Blazej Osinski, Roy H Campbell, Konrad Czechowski, Dumitru Erhan, Chelsea Finn, Piotr Kozakowski, Sergey Levine, Ryan Sepassi, George Tucker, Henryk Michalewski

Via arXiv

Area Attention

Oct 30, 2018
Yang Li, Lukasz Kaiser, Samy Bengio, Si Si

Via arXiv

Generating Wikipedia by Summarizing Long Sequences

Jan 30, 2018
Peter J. Liu, Mohammad Saleh, Etienne Pot, Ben Goodrich, Ryan Sepassi, Lukasz Kaiser, Noam Shazeer

Via arXiv

Unsupervised Cipher Cracking Using Discrete GANs

Jan 15, 2018
Aidan N. Gomez, Sicong Huang, Ivan Zhang, Bryan M. Li, Muhammad Osama, Lukasz Kaiser

Via arXiv