Tom Everitt

Shaking the foundations: delusions in sequence models for interaction and control


Oct 20, 2021
Pedro A. Ortega, Markus Kunesch, Grégoire Delétang, Tim Genewein, Jordi Grau-Moya, Joel Veness, Jonas Buchli, Jonas Degrave, Bilal Piot, Julien Perolat, Tom Everitt, Corentin Tallec, Emilio Parisotto, Tom Erez, Yutian Chen, Scott Reed, Marcus Hutter, Nando de Freitas, Shane Legg

* DeepMind Tech Report, 16 pages, 4 figures 

Alignment of Language Agents


Mar 26, 2021
Zachary Kenton, Tom Everitt, Laura Weidinger, Iason Gabriel, Vladimir Mikulik, Geoffrey Irving


How RL Agents Behave When Their Actions Are Modified


Feb 15, 2021
Eric D. Langlois, Tom Everitt

* 10 pages (+6 appendix); 5 figures. Published in the AAAI 2021 Conference. Code is available at https://github.com/edlanglois/mamdp 

Equilibrium Refinements for Multi-Agent Influence Diagrams: Theory and Practice


Feb 09, 2021
Lewis Hammond, James Fox, Tom Everitt, Alessandro Abate, Michael Wooldridge

* Accepted to the 20th International Conference on Autonomous Agents and Multiagent Systems (AAMAS-21) 

Agent Incentives: A Causal Perspective


Feb 02, 2021
Tom Everitt, Ryan Carey, Eric Langlois, Pedro A. Ortega, Shane Legg

* In Proceedings of the AAAI 2021 Conference. Supersedes arXiv:1902.09980, arXiv:2001.07118 

Avoiding Tampering Incentives in Deep RL via Decoupled Approval


Nov 17, 2020
Jonathan Uesato, Ramana Kumar, Victoria Krakovna, Tom Everitt, Richard Ngo, Shane Legg


REALab: An Embedded Perspective on Tampering


Nov 17, 2020
Ramana Kumar, Jonathan Uesato, Richard Ngo, Tom Everitt, Victoria Krakovna, Shane Legg


The Incentives that Shape Behaviour


Jan 20, 2020
Ryan Carey, Eric Langlois, Tom Everitt, Shane Legg

* 12 pages, 7 figures, accepted to SafeAI workshop at AAAI 

Reward Tampering Problems and Solutions in Reinforcement Learning: A Causal Influence Diagram Perspective


Aug 20, 2019
Tom Everitt, Marcus Hutter


Modeling AGI Safety Frameworks with Causal Influence Diagrams


Jun 20, 2019
Tom Everitt, Ramana Kumar, Victoria Krakovna, Shane Legg

* IJCAI 2019 AI Safety Workshop 

Understanding Agent Incentives using Causal Influence Diagrams. Part I: Single Action Settings


Mar 12, 2019
Tom Everitt, Pedro A. Ortega, Elizabeth Barnes, Shane Legg


Scalable agent alignment via reward modeling: a research direction


Nov 19, 2018
Jan Leike, David Krueger, Tom Everitt, Miljan Martic, Vishal Maini, Shane Legg


AGI Safety Literature Review


May 21, 2018
Tom Everitt, Gary Lea, Marcus Hutter

* Published in International Joint Conference on Artificial Intelligence (IJCAI), 2018 

A Topological Approach to Meta-heuristics: Analytical Results on the BFS vs. DFS Algorithm Selection Problem


Apr 12, 2018
Tom Everitt, Marcus Hutter

* Main results published in 28th Australian Joint Conference on Artificial Intelligence, 2015 

AI Safety Gridworlds


Nov 28, 2017
Jan Leike, Miljan Martic, Victoria Krakovna, Pedro A. Ortega, Tom Everitt, Andrew Lefrancq, Laurent Orseau, Shane Legg


Reinforcement Learning with a Corrupted Reward Channel


Aug 19, 2017
Tom Everitt, Victoria Krakovna, Laurent Orseau, Marcus Hutter, Shane Legg

* A shorter version of this report was accepted to IJCAI 2017 AI and Autonomy track 

Count-Based Exploration in Feature Space for Reinforcement Learning


Jun 25, 2017
Jarryd Martin, Suraj Narayanan Sasikumar, Tom Everitt, Marcus Hutter

* Conference: Twenty-sixth International Joint Conference on Artificial Intelligence (IJCAI-17), 8 pages, 1 figure 

Free Lunch for Optimisation under the Universal Distribution


Aug 16, 2016
Tom Everitt, Tor Lattimore, Marcus Hutter

* Proceedings of 2014 IEEE Congress on Evolutionary Computation (CEC), July 6-11, 2014, Beijing, China, pp. 167-174 

Death and Suicide in Universal Artificial Intelligence


Jun 02, 2016
Jarryd Martin, Tom Everitt, Marcus Hutter

* Conference: Artificial General Intelligence (AGI) 2016, 13 pages, 2 figures

Avoiding Wireheading with Value Reinforcement Learning


May 10, 2016
Tom Everitt, Marcus Hutter

* Artificial General Intelligence (AGI) 2016 

Self-Modification of Policy and Utility Function in Rational Agents


May 10, 2016
Tom Everitt, Daniel Filan, Mayank Daswani, Marcus Hutter

* Artificial General Intelligence (AGI) 2016 

Sequential Extensions of Causal and Evidential Decision Theory


Jun 24, 2015
Tom Everitt, Jan Leike, Marcus Hutter

* ADT 2015 
