Causal Analysis of Agent Behavior for AI Safety


Mar 05, 2021
Grégoire Delétang, Jordi Grau-Moya, Miljan Martic, Tim Genewein, Tom McGrath, Vladimir Mikulik, Markus Kunesch, Shane Legg, Pedro A. Ortega

* 16 pages, 16 figures, 6 tables 

Agent Incentives: A Causal Perspective


Feb 02, 2021
Tom Everitt, Ryan Carey, Eric Langlois, Pedro A Ortega, Shane Legg

* In Proceedings of the AAAI 2021 Conference. Supersedes arXiv:1902.09980, arXiv:2001.07118 

Avoiding Tampering Incentives in Deep RL via Decoupled Approval


Nov 17, 2020
Jonathan Uesato, Ramana Kumar, Victoria Krakovna, Tom Everitt, Richard Ngo, Shane Legg


REALab: An Embedded Perspective on Tampering


Nov 17, 2020
Ramana Kumar, Jonathan Uesato, Richard Ngo, Tom Everitt, Victoria Krakovna, Shane Legg


Algorithms for Causal Reasoning in Probability Trees


Nov 12, 2020
Tim Genewein, Tom McGrath, Grégoire Delétang, Vladimir Mikulik, Miljan Martic, Shane Legg, Pedro A. Ortega

* (2nd version with correction to algorithm) 11 pages, 8 figures, 5 algorithms. A companion Colaboratory tutorial is available at https://github.com/deepmind/deepmind-research/tree/master/causal_reasoning 

Meta-trained agents implement Bayes-optimal agents


Oct 21, 2020
Vladimir Mikulik, Grégoire Delétang, Tom McGrath, Tim Genewein, Miljan Martic, Shane Legg, Pedro A. Ortega

* Published at 34th Conference on Neural Information Processing Systems (NeurIPS 2020), Vancouver, Canada 

Avoiding Side Effects By Considering Future Tasks


Oct 15, 2020
Victoria Krakovna, Laurent Orseau, Richard Ngo, Miljan Martic, Shane Legg

* Published in NeurIPS 2020 

Quantifying Differences in Reward Functions


Jun 24, 2020
Adam Gleave, Michael Dennis, Shane Legg, Stuart Russell, Jan Leike

* 8 pages main paper, 29 pages total 

Pitfalls of learning a reward function online


Apr 28, 2020
Stuart Armstrong, Jan Leike, Laurent Orseau, Shane Legg


The Incentives that Shape Behaviour


Jan 20, 2020
Ryan Carey, Eric Langlois, Tom Everitt, Shane Legg

* 12 pages, 7 figures, accepted to SafeAI workshop at AAAI 

Learning Human Objectives by Evaluating Hypothetical Behavior


Dec 05, 2019
Siddharth Reddy, Anca D. Dragan, Sergey Levine, Shane Legg, Jan Leike


Modeling AGI Safety Frameworks with Causal Influence Diagrams


Jun 20, 2019
Tom Everitt, Ramana Kumar, Victoria Krakovna, Shane Legg

* IJCAI 2019 AI Safety Workshop 

Meta-learning of Sequential Strategies


May 08, 2019
Pedro A. Ortega, Jane X. Wang, Mark Rowland, Tim Genewein, Zeb Kurth-Nelson, Razvan Pascanu, Nicolas Heess, Joel Veness, Alex Pritzel, Pablo Sprechmann, Siddhant M. Jayakumar, Tom McGrath, Kevin Miller, Mohammad Azar, Ian Osband, Neil Rabinowitz, András György, Silvia Chiappa, Simon Osindero, Yee Whye Teh, Hado van Hasselt, Nando de Freitas, Matthew Botvinick, Shane Legg

* DeepMind Technical Report (15 pages, 6 figures) 

Understanding Agent Incentives using Causal Influence Diagrams. Part I: Single Action Settings


Mar 12, 2019
Tom Everitt, Pedro A. Ortega, Elizabeth Barnes, Shane Legg


Soft-Bayes: Prod for Mixtures of Experts with Log-Loss


Jan 08, 2019
Laurent Orseau, Tor Lattimore, Shane Legg

* Algorithmic Learning Theory 2017 

Scaling shared model governance via model splitting


Dec 14, 2018
Miljan Martic, Jan Leike, Andrew Trask, Matteo Hessel, Shane Legg, Pushmeet Kohli

* 9 pages 

Scalable agent alignment via reward modeling: a research direction


Nov 19, 2018
Jan Leike, David Krueger, Tom Everitt, Miljan Martic, Vishal Maini, Shane Legg


Reward learning from human preferences and demonstrations in Atari


Nov 15, 2018
Borja Ibarz, Jan Leike, Tobias Pohlen, Geoffrey Irving, Shane Legg, Dario Amodei

* NIPS 2018 

Modeling Friends and Foes


Jun 30, 2018
Pedro A. Ortega, Shane Legg

* 13 pages, 9 figures 

IMPALA: Scalable Distributed Deep-RL with Importance Weighted Actor-Learner Architectures


Jun 28, 2018
Lasse Espeholt, Hubert Soyer, Remi Munos, Karen Simonyan, Volodymyr Mnih, Tom Ward, Yotam Doron, Vlad Firoiu, Tim Harley, Iain Dunning, Shane Legg, Koray Kavukcuoglu


Measuring and avoiding side effects using relative reachability


Jun 04, 2018
Victoria Krakovna, Laurent Orseau, Miljan Martic, Shane Legg


Agents and Devices: A Relative Definition of Agency


May 31, 2018
Laurent Orseau, Simon McGregor McGill, Shane Legg


Noisy Networks for Exploration


Feb 15, 2018
Meire Fortunato, Mohammad Gheshlaghi Azar, Bilal Piot, Jacob Menick, Ian Osband, Alex Graves, Vlad Mnih, Remi Munos, Demis Hassabis, Olivier Pietquin, Charles Blundell, Shane Legg

* ICLR 2018 

Psychlab: A Psychology Laboratory for Deep Reinforcement Learning Agents


Feb 04, 2018
Joel Z. Leibo, Cyprien de Masson d'Autume, Daniel Zoran, David Amos, Charles Beattie, Keith Anderson, Antonio García Castañeda, Manuel Sanchez, Simon Green, Audrunas Gruslys, Shane Legg, Demis Hassabis, Matthew M. Botvinick

* 28 pages, 11 figures 

AI Safety Gridworlds


Nov 28, 2017
Jan Leike, Miljan Martic, Victoria Krakovna, Pedro A. Ortega, Tom Everitt, Andrew Lefrancq, Laurent Orseau, Shane Legg


Reinforcement Learning with a Corrupted Reward Channel


Aug 19, 2017
Tom Everitt, Victoria Krakovna, Laurent Orseau, Marcus Hutter, Shane Legg

* A shorter version of this report was accepted to IJCAI 2017 AI and Autonomy track 

Deep reinforcement learning from human preferences


Jul 13, 2017
Paul Christiano, Jan Leike, Tom B. Brown, Miljan Martic, Shane Legg, Dario Amodei


DeepMind Lab


Dec 13, 2016
Charles Beattie, Joel Z. Leibo, Denis Teplyashin, Tom Ward, Marcus Wainwright, Heinrich Küttler, Andrew Lefrancq, Simon Green, Víctor Valdés, Amir Sadik, Julian Schrittwieser, Keith Anderson, Sarah York, Max Cant, Adam Cain, Adrian Bolton, Stephen Gaffney, Helen King, Demis Hassabis, Shane Legg, Stig Petersen

* 11 pages, 8 figures 

Massively Parallel Methods for Deep Reinforcement Learning


Jul 16, 2015
Arun Nair, Praveen Srinivasan, Sam Blackwell, Cagdas Alcicek, Rory Fearon, Alessandro De Maria, Vedavyas Panneershelvam, Mustafa Suleyman, Charles Beattie, Stig Petersen, Shane Legg, Volodymyr Mnih, Koray Kavukcuoglu, David Silver

* Presented at the Deep Learning Workshop, International Conference on Machine Learning, Lille, France, 2015 
