Tom Schaul

Open-Endedness is Essential for Artificial Superhuman Intelligence

Jun 06, 2024

Vision-Language Models as a Source of Rewards

Dec 14, 2023

Discovering Attention-Based Genetic Algorithms via Meta-Black-Box Optimization

Apr 08, 2023

Scaling Goal-based Exploration via Pruning Proto-goals

Feb 09, 2023

Discovering Evolution Strategies via Meta-Black-Box Optimization

Nov 25, 2022

The Phenomenon of Policy Churn

Jun 09, 2022

Model-Value Inconsistency as a Signal for Epistemic Uncertainty

Dec 08, 2021

When should agents explore?

Aug 26, 2021

Return-based Scaling: Yet Another Normalisation Trick for Deep RL

May 11, 2021

Policy Evaluation Networks

Feb 26, 2020