Daniel Guo

Human Alignment of Large Language Models through Online Preference Optimisation

Mar 13, 2024
Daniele Calandriello, Daniel Guo, Rémi Munos, Mark Rowland, Yunhao Tang, Bernardo Avila Pires, Pierre Harvey Richemond, Charline Le Lan, Michal Valko, Tianqi Liu, Rishabh Joshi, Zeyu Zheng, Bilal Piot

Via arXiv

A General Theoretical Paradigm to Understand Learning from Human Preferences

Oct 18, 2023
Mohammad Gheshlaghi Azar, Mark Rowland, Bilal Piot, Daniel Guo, Daniele Calandriello, Michal Valko, Rémi Munos

Via arXiv

Bootstrap Latent-Predictive Representations for Multitask Reinforcement Learning

Apr 30, 2020
Daniel Guo, Bernardo Avila Pires, Bilal Piot, Jean-Bastien Grill, Florent Altché, Rémi Munos, Mohammad Gheshlaghi Azar

Via arXiv

Agent57: Outperforming the Atari Human Benchmark

Mar 30, 2020
Adrià Puigdomènech Badia, Bilal Piot, Steven Kapturowski, Pablo Sprechmann, Alex Vitvitskyi, Daniel Guo, Charles Blundell

Via arXiv

Never Give Up: Learning Directed Exploration Strategies

Feb 14, 2020
Adrià Puigdomènech Badia, Pablo Sprechmann, Alex Vitvitskyi, Daniel Guo, Bilal Piot, Steven Kapturowski, Olivier Tieleman, Martín Arjovsky, Alexander Pritzel, Andrew Bolt, Charles Blundell

Via arXiv