Tom Schaul

Model-Value Inconsistency as a Signal for Epistemic Uncertainty

Dec 08, 2021

When should agents explore?

Aug 26, 2021

Return-based Scaling: Yet Another Normalisation Trick for Deep RL

May 11, 2021

Policy Evaluation Networks

Feb 26, 2020

Adapting Behaviour for Learning Progress

Dec 14, 2019

Conditional Importance Sampling for Off-Policy Learning

Oct 16, 2019

Non-Differentiable Supervised Learning with Evolution Strategies and Hybrid Methods

Jun 07, 2019

Ray Interference: a Source of Plateaus in Deep Reinforcement Learning

Apr 25, 2019

Transfer in Deep Reinforcement Learning Using Successor Features and Generalised Policy Improvement

Jan 30, 2019

Universal Successor Features Approximators

Dec 18, 2018