Miljan Martic

Causal Analysis of Agent Behavior for AI Safety

Mar 05, 2021
Grégoire Delétang, Jordi Grau-Moya, Miljan Martic, Tim Genewein, Tom McGrath, Vladimir Mikulik, Markus Kunesch, Shane Legg, Pedro A. Ortega

* 16 pages, 16 figures, 6 tables 

Algorithms for Causal Reasoning in Probability Trees

Nov 12, 2020
Tim Genewein, Tom McGrath, Grégoire Delétang, Vladimir Mikulik, Miljan Martic, Shane Legg, Pedro A. Ortega

* (2nd version with correction to algorithm) 11 pages, 8 figures, 5 algorithms. A companion Colaboratory tutorial is available at 

Meta-trained agents implement Bayes-optimal agents

Oct 21, 2020
Vladimir Mikulik, Grégoire Delétang, Tom McGrath, Tim Genewein, Miljan Martic, Shane Legg, Pedro A. Ortega

* Published at 34th Conference on Neural Information Processing Systems (NeurIPS 2020), Vancouver, Canada 

Avoiding Side Effects By Considering Future Tasks

Oct 15, 2020
Victoria Krakovna, Laurent Orseau, Richard Ngo, Miljan Martic, Shane Legg

* Published in NeurIPS 2020 

Scaling shared model governance via model splitting

Dec 14, 2018
Miljan Martic, Jan Leike, Andrew Trask, Matteo Hessel, Shane Legg, Pushmeet Kohli

* 9 pages 

Scalable agent alignment via reward modeling: a research direction

Nov 19, 2018
Jan Leike, David Krueger, Tom Everitt, Miljan Martic, Vishal Maini, Shane Legg

Measuring and avoiding side effects using relative reachability

Jun 04, 2018
Victoria Krakovna, Laurent Orseau, Miljan Martic, Shane Legg

AI Safety Gridworlds

Nov 28, 2017
Jan Leike, Miljan Martic, Victoria Krakovna, Pedro A. Ortega, Tom Everitt, Andrew Lefrancq, Laurent Orseau, Shane Legg

Deep reinforcement learning from human preferences

Jul 13, 2017
Paul Christiano, Jan Leike, Tom B. Brown, Miljan Martic, Shane Legg, Dario Amodei
