Model-Free Risk-Sensitive Reinforcement Learning



Grégoire Delétang , Jordi Grau-Moya , Markus Kunesch , Tim Genewein , Rob Brekelmans , Shane Legg , Pedro A. Ortega

* DeepMind Tech Report: 13 pages, 4 figures 


Shaking the foundations: delusions in sequence models for interaction and control



Pedro A. Ortega , Markus Kunesch , Grégoire Delétang , Tim Genewein , Jordi Grau-Moya , Joel Veness , Jonas Buchli , Jonas Degrave , Bilal Piot , Julien Perolat , Tom Everitt , Corentin Tallec , Emilio Parisotto , Tom Erez , Yutian Chen , Scott Reed , Marcus Hutter , Nando de Freitas , Shane Legg

* DeepMind Tech Report, 16 pages, 4 figures 


Causal Analysis of Agent Behavior for AI Safety



Grégoire Delétang , Jordi Grau-Moya , Miljan Martic , Tim Genewein , Tom McGrath , Vladimir Mikulik , Markus Kunesch , Shane Legg , Pedro A. Ortega

* 16 pages, 16 figures, 6 tables 


Algorithms for Causal Reasoning in Probability Trees



Tim Genewein , Tom McGrath , Grégoire Delétang , Vladimir Mikulik , Miljan Martic , Shane Legg , Pedro A. Ortega

* (2nd version with correction to algorithm) 11 pages, 8 figures, 5 algorithms. A companion Colaboratory tutorial is available at https://github.com/deepmind/deepmind-research/tree/master/causal_reasoning 


Meta-trained agents implement Bayes-optimal agents



Vladimir Mikulik , Grégoire Delétang , Tom McGrath , Tim Genewein , Miljan Martic , Shane Legg , Pedro A. Ortega

* Published at 34th Conference on Neural Information Processing Systems (NeurIPS 2020), Vancouver, Canada 


Action and Perception as Divergence Minimization



Danijar Hafner , Pedro A. Ortega , Jimmy Ba , Thomas Parr , Karl Friston , Nicolas Heess

* 14 pages, 10 figures, 2 tables 


Meta reinforcement learning as task inference



Jan Humplik , Alexandre Galashov , Leonard Hasenclever , Pedro A. Ortega , Yee Whye Teh , Nicolas Heess



Meta-learning of Sequential Strategies



Pedro A. Ortega , Jane X. Wang , Mark Rowland , Tim Genewein , Zeb Kurth-Nelson , Razvan Pascanu , Nicolas Heess , Joel Veness , Alex Pritzel , Pablo Sprechmann , Siddhant M. Jayakumar , Tom McGrath , Kevin Miller , Mohammad Azar , Ian Osband , Neil Rabinowitz , András György , Silvia Chiappa , Simon Osindero , Yee Whye Teh , Hado van Hasselt , Nando de Freitas , Matthew Botvinick , Shane Legg

* DeepMind Technical Report (15 pages, 6 figures) 

