Zhaohan Daniel Guo

Generalized Preference Optimization: A Unified Approach to Offline Alignment

Feb 08, 2024
Yunhao Tang, Zhaohan Daniel Guo, Zeyu Zheng, Daniele Calandriello, Rémi Munos, Mark Rowland, Pierre Harvey Richemond, Michal Valko, Bernardo Ávila Pires, Bilal Piot

Nash Learning from Human Feedback

Dec 06, 2023
Rémi Munos, Michal Valko, Daniele Calandriello, Mohammad Gheshlaghi Azar, Mark Rowland, Zhaohan Daniel Guo, Yunhao Tang, Matthieu Geist, Thomas Mesnard, Andrea Michi, Marco Selvi, Sertan Girgin, Nikola Momchev, Olivier Bachem, Daniel J. Mankowitz, Doina Precup, Bilal Piot

Representations and Exploration for Deep Reinforcement Learning using Singular Value Decomposition

May 02, 2023
Yash Chandak, Shantanu Thakoor, Zhaohan Daniel Guo, Yunhao Tang, Remi Munos, Will Dabney, Diana L Borsa

Understanding Self-Predictive Learning for Reinforcement Learning

Dec 06, 2022
Yunhao Tang, Zhaohan Daniel Guo, Pierre Harvey Richemond, Bernardo Ávila Pires, Yash Chandak, Rémi Munos, Mark Rowland, Mohammad Gheshlaghi Azar, Charline Le Lan, Clare Lyle, András György, Shantanu Thakoor, Will Dabney, Bilal Piot, Daniele Calandriello, Michal Valko

BYOL-Explore: Exploration by Bootstrapped Prediction

Jun 16, 2022
Zhaohan Daniel Guo, Shantanu Thakoor, Miruna Pîslar, Bernardo Avila Pires, Florent Altché, Corentin Tallec, Alaa Saade, Daniele Calandriello, Jean-Bastien Grill, Yunhao Tang, Michal Valko, Rémi Munos, Mohammad Gheshlaghi Azar, Bilal Piot

Geometric Entropic Exploration

Jan 07, 2021
Zhaohan Daniel Guo, Mohammad Gheshlaghi Azar, Alaa Saade, Shantanu Thakoor, Bilal Piot, Bernardo Avila Pires, Michal Valko, Thomas Mesnard, Tor Lattimore, Rémi Munos

Bootstrap Your Own Latent: A New Approach to Self-Supervised Learning

Jun 13, 2020
Jean-Bastien Grill, Florian Strub, Florent Altché, Corentin Tallec, Pierre H. Richemond, Elena Buchatskaya, Carl Doersch, Bernardo Avila Pires, Zhaohan Daniel Guo, Mohammad Gheshlaghi Azar, Bilal Piot, Koray Kavukcuoglu, Rémi Munos, Michal Valko

Directed Exploration for Reinforcement Learning

Jun 18, 2019
Zhaohan Daniel Guo, Emma Brunskill

Neural Predictive Belief Representations

Nov 15, 2018
Zhaohan Daniel Guo, Mohammad Gheshlaghi Azar, Bilal Piot, Bernardo A. Pires, Toby Pohlen, Rémi Munos
