Avoiding Tampering Incentives in Deep RL via Decoupled Approval

Nov 17, 2020
Jonathan Uesato, Ramana Kumar, Victoria Krakovna, Tom Everitt, Richard Ngo, Shane Legg



REALab: An Embedded Perspective on Tampering

Nov 17, 2020
Ramana Kumar, Jonathan Uesato, Richard Ngo, Tom Everitt, Victoria Krakovna, Shane Legg



Algorithms for Causal Reasoning in Probability Trees

Nov 12, 2020
Tim Genewein, Tom McGrath, Grégoire Delétang, Vladimir Mikulik, Miljan Martic, Shane Legg, Pedro A. Ortega

* (2nd version with correction to algorithm) 11 pages, 8 figures, 5 algorithms. A companion Colaboratory tutorial is available at https://github.com/deepmind/deepmind-research/tree/master/causal_reasoning 


Meta-trained agents implement Bayes-optimal agents

Oct 21, 2020
Vladimir Mikulik, Grégoire Delétang, Tom McGrath, Tim Genewein, Miljan Martic, Shane Legg, Pedro A. Ortega

* Published at 34th Conference on Neural Information Processing Systems (NeurIPS 2020), Vancouver, Canada 


Avoiding Side Effects By Considering Future Tasks

Oct 15, 2020
Victoria Krakovna, Laurent Orseau, Richard Ngo, Miljan Martic, Shane Legg

* Published in NeurIPS 2020 


Quantifying Differences in Reward Functions

Jun 24, 2020
Adam Gleave, Michael Dennis, Shane Legg, Stuart Russell, Jan Leike

* 8 pages main paper, 29 pages total 


Pitfalls of learning a reward function online

Apr 28, 2020
Stuart Armstrong, Jan Leike, Laurent Orseau, Shane Legg



The Incentives that Shape Behaviour

Jan 20, 2020
Ryan Carey, Eric Langlois, Tom Everitt, Shane Legg

* 12 pages, 7 figures, accepted to SafeAI workshop at AAAI 


Learning Human Objectives by Evaluating Hypothetical Behavior

Dec 05, 2019
Siddharth Reddy, Anca D. Dragan, Sergey Levine, Shane Legg, Jan Leike



Modeling AGI Safety Frameworks with Causal Influence Diagrams

Jun 20, 2019
Tom Everitt, Ramana Kumar, Victoria Krakovna, Shane Legg

* IJCAI 2019 AI Safety Workshop 


Meta-learning of Sequential Strategies

May 08, 2019
Pedro A. Ortega, Jane X. Wang, Mark Rowland, Tim Genewein, Zeb Kurth-Nelson, Razvan Pascanu, Nicolas Heess, Joel Veness, Alex Pritzel, Pablo Sprechmann, Siddhant M. Jayakumar, Tom McGrath, Kevin Miller, Mohammad Azar, Ian Osband, Neil Rabinowitz, András György, Silvia Chiappa, Simon Osindero, Yee Whye Teh, Hado van Hasselt, Nando de Freitas, Matthew Botvinick, Shane Legg

* DeepMind Technical Report (15 pages, 6 figures) 


Understanding Agent Incentives using Causal Influence Diagrams. Part I: Single Action Settings

Mar 12, 2019
Tom Everitt, Pedro A. Ortega, Elizabeth Barnes, Shane Legg



Soft-Bayes: Prod for Mixtures of Experts with Log-Loss

Jan 08, 2019
Laurent Orseau, Tor Lattimore, Shane Legg

* Algorithmic Learning Theory 2017 


Scaling shared model governance via model splitting

Dec 14, 2018
Miljan Martic, Jan Leike, Andrew Trask, Matteo Hessel, Shane Legg, Pushmeet Kohli

* 9 pages 


Scalable agent alignment via reward modeling: a research direction

Nov 19, 2018
Jan Leike, David Krueger, Tom Everitt, Miljan Martic, Vishal Maini, Shane Legg



Reward learning from human preferences and demonstrations in Atari

Nov 15, 2018
Borja Ibarz, Jan Leike, Tobias Pohlen, Geoffrey Irving, Shane Legg, Dario Amodei

* NIPS 2018 


Modeling Friends and Foes

Jun 30, 2018
Pedro A. Ortega, Shane Legg

* 13 pages, 9 figures 


IMPALA: Scalable Distributed Deep-RL with Importance Weighted Actor-Learner Architectures

Jun 28, 2018
Lasse Espeholt, Hubert Soyer, Remi Munos, Karen Simonyan, Volodymyr Mnih, Tom Ward, Yotam Doron, Vlad Firoiu, Tim Harley, Iain Dunning, Shane Legg, Koray Kavukcuoglu



Measuring and avoiding side effects using relative reachability

Jun 04, 2018
Victoria Krakovna, Laurent Orseau, Miljan Martic, Shane Legg



Agents and Devices: A Relative Definition of Agency

May 31, 2018
Laurent Orseau, Simon McGregor McGill, Shane Legg



Noisy Networks for Exploration

Feb 15, 2018
Meire Fortunato, Mohammad Gheshlaghi Azar, Bilal Piot, Jacob Menick, Ian Osband, Alex Graves, Vlad Mnih, Remi Munos, Demis Hassabis, Olivier Pietquin, Charles Blundell, Shane Legg

* ICLR 2018 


Psychlab: A Psychology Laboratory for Deep Reinforcement Learning Agents

Feb 04, 2018
Joel Z. Leibo, Cyprien de Masson d'Autume, Daniel Zoran, David Amos, Charles Beattie, Keith Anderson, Antonio García Castañeda, Manuel Sanchez, Simon Green, Audrunas Gruslys, Shane Legg, Demis Hassabis, Matthew M. Botvinick

* 28 pages, 11 figures 


AI Safety Gridworlds

Nov 28, 2017
Jan Leike, Miljan Martic, Victoria Krakovna, Pedro A. Ortega, Tom Everitt, Andrew Lefrancq, Laurent Orseau, Shane Legg



Reinforcement Learning with a Corrupted Reward Channel

Aug 19, 2017
Tom Everitt, Victoria Krakovna, Laurent Orseau, Marcus Hutter, Shane Legg

* A shorter version of this report was accepted to IJCAI 2017 AI and Autonomy track 


Deep reinforcement learning from human preferences

Jul 13, 2017
Paul Christiano, Jan Leike, Tom B. Brown, Miljan Martic, Shane Legg, Dario Amodei



DeepMind Lab

Dec 13, 2016
Charles Beattie, Joel Z. Leibo, Denis Teplyashin, Tom Ward, Marcus Wainwright, Heinrich Küttler, Andrew Lefrancq, Simon Green, Víctor Valdés, Amir Sadik, Julian Schrittwieser, Keith Anderson, Sarah York, Max Cant, Adam Cain, Adrian Bolton, Stephen Gaffney, Helen King, Demis Hassabis, Shane Legg, Stig Petersen

* 11 pages, 8 figures 


Massively Parallel Methods for Deep Reinforcement Learning

Jul 16, 2015
Arun Nair, Praveen Srinivasan, Sam Blackwell, Cagdas Alcicek, Rory Fearon, Alessandro De Maria, Vedavyas Panneershelvam, Mustafa Suleyman, Charles Beattie, Stig Petersen, Shane Legg, Volodymyr Mnih, Koray Kavukcuoglu, David Silver

* Presented at the Deep Learning Workshop, International Conference on Machine Learning, Lille, France, 2015 


An Approximation of the Universal Intelligence Measure

Sep 29, 2011
Shane Legg, Joel Veness

* 14 pages 


Temporal Difference Updating without a Learning Rate

Oct 31, 2008
Marcus Hutter, Shane Legg

* Advances in Neural Information Processing Systems 20 (NIPS 2008) pages 705-712 
* 12 pages, 6 figures 
