
Tom Zahavy


Meta-Gradients in Non-Stationary Environments

Sep 13, 2022
Jelena Luketina, Sebastian Flennerhag, Yannick Schroecker, David Abel, Tom Zahavy, Satinder Singh


Discovering Policies with DOMiNO: Diversity Optimization Maintaining Near Optimality

May 26, 2022
Tom Zahavy, Yannick Schroecker, Feryal Behbahani, Kate Baumli, Sebastian Flennerhag, Shaobo Hou, Satinder Singh


Bootstrapped Meta-Learning

Sep 09, 2021
Sebastian Flennerhag, Yannick Schroecker, Tom Zahavy, Hado van Hasselt, David Silver, Satinder Singh


Emphatic Algorithms for Deep Reinforcement Learning

Jun 21, 2021
Ray Jiang, Tom Zahavy, Zhongwen Xu, Adam White, Matteo Hessel, Charles Blundell, Hado van Hasselt


Discovering Diverse Nearly Optimal Policies with Successor Features

Jun 01, 2021
Tom Zahavy, Brendan O'Donoghue, Andre Barreto, Volodymyr Mnih, Sebastian Flennerhag, Satinder Singh


Reward is enough for convex MDPs

Jun 01, 2021
Tom Zahavy, Brendan O'Donoghue, Guillaume Desjardins, Satinder Singh


Online Apprenticeship Learning

Feb 13, 2021
Lior Shani, Tom Zahavy, Shie Mannor


Discovery of Options via Meta-Learned Subgoals

Feb 12, 2021
Vivek Veeriah, Tom Zahavy, Matteo Hessel, Zhongwen Xu, Junhyuk Oh, Iurii Kemaev, Hado van Hasselt, David Silver, Satinder Singh


Discovering a set of policies for the worst case reward

Feb 08, 2021
Tom Zahavy, Andre Barreto, Daniel J Mankowitz, Shaobo Hou, Brendan O'Donoghue, Iurii Kemaev, Satinder Baveja Singh


Online Limited Memory Neural-Linear Bandits with Likelihood Matching

Feb 07, 2021
Ofir Nabati, Tom Zahavy, Shie Mannor
