Sam Devlin

Efficient Offline Reinforcement Learning: The Critic is Critical

Jun 19, 2024
Via arXiv

Aligning Agents like Large Language Models

Jun 06, 2024
Via arXiv

Visual Encoders for Data-Efficient Imitation Learning in Modern Video Games

Dec 04, 2023
Via arXiv

Adaptive Scaffolding in Block-Based Programming via Synthesizing New Tasks as Pop Quizzes

Mar 28, 2023
Via arXiv

Navigates Like Me: Understanding How People Evaluate Human-Like AI in Video Games

Mar 02, 2023
Via arXiv

Trust-Region-Free Policy Optimization for Stochastic Policies

Feb 15, 2023
Via arXiv

Contrastive Meta-Learning for Partially Observable Few-Shot Learning

Jan 30, 2023
Via arXiv

Imitating Human Behaviour with Diffusion Models

Jan 25, 2023
Via arXiv

UniMASK: Unified Inference in Sequential Decision Problems

Nov 20, 2022
Via arXiv

Towards Flexible Inference in Sequential Decision Problems via Bidirectional Transformers

Apr 28, 2022
Via arXiv