Alert button
Picture for Bobak Shahriari

Bobak Shahriari

Alert button

Gemma: Open Models Based on Gemini Research and Technology

Add code
Bookmark button
Alert button
Mar 13, 2024
Gemma Team, Thomas Mesnard, Cassidy Hardin, Robert Dadashi, Surya Bhupatiraju, Shreya Pathak, Laurent Sifre, Morgane Rivière, Mihir Sanjay Kale, Juliette Love, Pouya Tafti, Léonard Hussenot, Aakanksha Chowdhery, Adam Roberts, Aditya Barua, Alex Botev, Alex Castro-Ros, Ambrose Slone, Amélie Héliou, Andrea Tacchetti, Anna Bulanova, Antonia Paterson, Beth Tsai, Bobak Shahriari, Charline Le Lan, Christopher A. Choquette-Choo, Clément Crepy, Daniel Cer, Daphne Ippolito, David Reid, Elena Buchatskaya, Eric Ni, Eric Noland, Geng Yan, George Tucker, George-Christian Muraru, Grigory Rozhdestvenskiy, Henryk Michalewski, Ian Tenney, Ivan Grishchenko, Jacob Austin, James Keeling, Jane Labanowski, Jean-Baptiste Lespiau, Jeff Stanway, Jenny Brennan, Jeremy Chen, Johan Ferret, Justin Chiu, Justin Mao-Jones, Katherine Lee, Kathy Yu, Katie Millican, Lars Lowe Sjoesund, Lisa Lee, Lucas Dixon, Machel Reid, Maciej Mikuła, Mateo Wirth, Michael Sharman, Nikolai Chinaev, Nithum Thain, Olivier Bachem, Oscar Chang, Oscar Wahltinez, Paige Bailey, Paul Michel, Petko Yotov, Pier Giuseppe Sessa, Rahma Chaabouni, Ramona Comanescu, Reena Jana, Rohan Anil, Ross McIlroy, Ruibo Liu, Ryan Mullins, Samuel L Smith, Sebastian Borgeaud, Sertan Girgin, Sholto Douglas, Shree Pandya, Siamak Shakeri, Soham De, Ted Klimenko, Tom Hennigan, Vlad Feinberg, Wojciech Stokowiec, Yu-hui Chen, Zafarali Ahmed, Zhitao Gong, Tris Warkentin, Ludovic Peran, Minh Giang, Clément Farabet, Oriol Vinyals, Jeff Dean, Koray Kavukcuoglu, Demis Hassabis, Zoubin Ghahramani, Douglas Eck, Joelle Barral, Fernando Pereira, Eli Collins, Armand Joulin, Noah Fiedel, Evan Senter, Alek Andreev, Kathleen Kenealy

Figure 1 for Gemma: Open Models Based on Gemini Research and Technology
Figure 2 for Gemma: Open Models Based on Gemini Research and Technology
Figure 3 for Gemma: Open Models Based on Gemini Research and Technology
Figure 4 for Gemma: Open Models Based on Gemini Research and Technology
Viaarxiv icon

Knowledge Transfer from Teachers to Learners in Growing-Batch Reinforcement Learning

Add code
Bookmark button
Alert button
May 09, 2023
Patrick Emedom-Nnamdi, Abram L. Friesen, Bobak Shahriari, Nando de Freitas, Matt W. Hoffman

Figure 1 for Knowledge Transfer from Teachers to Learners in Growing-Batch Reinforcement Learning
Figure 2 for Knowledge Transfer from Teachers to Learners in Growing-Batch Reinforcement Learning
Figure 3 for Knowledge Transfer from Teachers to Learners in Growing-Batch Reinforcement Learning
Figure 4 for Knowledge Transfer from Teachers to Learners in Growing-Batch Reinforcement Learning
Viaarxiv icon

Revisiting Gaussian mixture critics in off-policy reinforcement learning: a sample-based approach

Add code
Bookmark button
Alert button
Apr 22, 2022
Bobak Shahriari, Abbas Abdolmaleki, Arunkumar Byravan, Abe Friesen, Siqi Liu, Jost Tobias Springenberg, Nicolas Heess, Matt Hoffman, Martin Riedmiller

Figure 1 for Revisiting Gaussian mixture critics in off-policy reinforcement learning: a sample-based approach
Figure 2 for Revisiting Gaussian mixture critics in off-policy reinforcement learning: a sample-based approach
Figure 3 for Revisiting Gaussian mixture critics in off-policy reinforcement learning: a sample-based approach
Figure 4 for Revisiting Gaussian mixture critics in off-policy reinforcement learning: a sample-based approach
Viaarxiv icon

Revisiting Gaussian mixture critic in off-policy reinforcement learning: a sample-based approach

Add code
Bookmark button
Alert button
Apr 21, 2022
Bobak Shahriari, Abbas Abdolmaleki, Arunkumar Byravan, Abe Friesen, Siqi Liu, Jost Tobias Springenberg, Nicolas Heess, Matt Hoffman, Martin Riedmiller

Figure 1 for Revisiting Gaussian mixture critic in off-policy reinforcement learning: a sample-based approach
Figure 2 for Revisiting Gaussian mixture critic in off-policy reinforcement learning: a sample-based approach
Figure 3 for Revisiting Gaussian mixture critic in off-policy reinforcement learning: a sample-based approach
Figure 4 for Revisiting Gaussian mixture critic in off-policy reinforcement learning: a sample-based approach
Viaarxiv icon

On Multi-objective Policy Optimization as a Tool for Reinforcement Learning

Add code
Bookmark button
Alert button
Jun 15, 2021
Abbas Abdolmaleki, Sandy H. Huang, Giulia Vezzani, Bobak Shahriari, Jost Tobias Springenberg, Shruti Mishra, Dhruva TB, Arunkumar Byravan, Konstantinos Bousmalis, Andras Gyorgy, Csaba Szepesvari, Raia Hadsell, Nicolas Heess, Martin Riedmiller

Figure 1 for On Multi-objective Policy Optimization as a Tool for Reinforcement Learning
Figure 2 for On Multi-objective Policy Optimization as a Tool for Reinforcement Learning
Figure 3 for On Multi-objective Policy Optimization as a Tool for Reinforcement Learning
Figure 4 for On Multi-objective Policy Optimization as a Tool for Reinforcement Learning
Viaarxiv icon

Critic Regularized Regression

Add code
Bookmark button
Alert button
Jun 26, 2020
Ziyu Wang, Alexander Novikov, Konrad Żołna, Jost Tobias Springenberg, Scott Reed, Bobak Shahriari, Noah Siegel, Josh Merel, Caglar Gulcehre, Nicolas Heess, Nando de Freitas

Figure 1 for Critic Regularized Regression
Figure 2 for Critic Regularized Regression
Figure 3 for Critic Regularized Regression
Figure 4 for Critic Regularized Regression
Viaarxiv icon

Acme: A Research Framework for Distributed Reinforcement Learning

Add code
Bookmark button
Alert button
Jun 01, 2020
Matt Hoffman, Bobak Shahriari, John Aslanides, Gabriel Barth-Maron, Feryal Behbahani, Tamara Norman, Abbas Abdolmaleki, Albin Cassirer, Fan Yang, Kate Baumli, Sarah Henderson, Alex Novikov, Sergio Gómez Colmenarejo, Serkan Cabi, Caglar Gulcehre, Tom Le Paine, Andrew Cowie, Ziyu Wang, Bilal Piot, Nando de Freitas

Figure 1 for Acme: A Research Framework for Distributed Reinforcement Learning
Figure 2 for Acme: A Research Framework for Distributed Reinforcement Learning
Figure 3 for Acme: A Research Framework for Distributed Reinforcement Learning
Figure 4 for Acme: A Research Framework for Distributed Reinforcement Learning
Viaarxiv icon

Making Efficient Use of Demonstrations to Solve Hard Exploration Problems

Add code
Bookmark button
Alert button
Sep 03, 2019
Tom Le Paine, Caglar Gulcehre, Bobak Shahriari, Misha Denil, Matt Hoffman, Hubert Soyer, Richard Tanburn, Steven Kapturowski, Neil Rabinowitz, Duncan Williams, Gabriel Barth-Maron, Ziyu Wang, Nando de Freitas, Worlds Team

Figure 1 for Making Efficient Use of Demonstrations to Solve Hard Exploration Problems
Figure 2 for Making Efficient Use of Demonstrations to Solve Hard Exploration Problems
Figure 3 for Making Efficient Use of Demonstrations to Solve Hard Exploration Problems
Figure 4 for Making Efficient Use of Demonstrations to Solve Hard Exploration Problems
Viaarxiv icon

Which Learning Algorithms Can Generalize Identity-Based Rules to Novel Inputs?

Add code
Bookmark button
Alert button
May 12, 2016
Paul Tupper, Bobak Shahriari

Figure 1 for Which Learning Algorithms Can Generalize Identity-Based Rules to Novel Inputs?
Figure 2 for Which Learning Algorithms Can Generalize Identity-Based Rules to Novel Inputs?
Figure 3 for Which Learning Algorithms Can Generalize Identity-Based Rules to Novel Inputs?
Viaarxiv icon

Unbounded Bayesian Optimization via Regularization

Add code
Bookmark button
Alert button
Aug 14, 2015
Bobak Shahriari, Alexandre Bouchard-Côté, Nando de Freitas

Figure 1 for Unbounded Bayesian Optimization via Regularization
Figure 2 for Unbounded Bayesian Optimization via Regularization
Figure 3 for Unbounded Bayesian Optimization via Regularization
Figure 4 for Unbounded Bayesian Optimization via Regularization
Viaarxiv icon