Pieter Abbeel

UC Berkeley

High-Dimensional Continuous Control Using Generalized Advantage Estimation

Oct 20, 2018

Enabling Robots to Communicate their Objectives

Oct 18, 2018

Establishing Appropriate Trust via Critical States

Oct 18, 2018

ProMP: Proximal Meta-Policy Search

Oct 17, 2018

Composable Action-Conditioned Predictors: Flexible Off-Policy Learning for Robot Navigation

Oct 16, 2018

SFV: Reinforcement Learning of Physical Skills from Videos

Oct 15, 2018

Equivalence Between Policy Gradients and Soft Q-Learning

Oct 14, 2018

Model-Ensemble Trust-Region Policy Optimization

Oct 05, 2018

Variational Discriminator Bottleneck: Improving Imitation Learning, Inverse RL, and GANs by Constraining Information Flow

Oct 01, 2018

Learning with Opponent-Learning Awareness

Sep 19, 2018