Peter Welinder

Tony

Multi-Goal Reinforcement Learning: Challenging Robotics Environments and Request for Research

Mar 10, 2018

Matthias Plappert, Marcin Andrychowicz, Alex Ray, Bob McGrew, Bowen Baker, Glenn Powell, Jonas Schneider, Josh Tobin, Maciek Chociej, Peter Welinder(+2 more)

Abstract: The purpose of this technical report is two-fold. First of all, it introduces a suite of challenging continuous control tasks (integrated with OpenAI Gym) based on currently existing robotics hardware. The tasks include pushing, sliding and pick & place with a Fetch robotic arm as well as in-hand object manipulation with a Shadow Dexterous Hand. All tasks have sparse binary rewards and follow a Multi-Goal Reinforcement Learning (RL) framework in which an agent is told what to do using an additional input. The second part of the paper presents a set of concrete research ideas for improving RL algorithms, most of which are related to Multi-Goal RL and Hindsight Experience Replay.
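
As a rough illustration of what a sparse binary reward in a multi-goal setting can look like, the sketch below scores a step purely by whether the achieved goal lies close enough to the desired goal. The function name `sparse_reward`, the 0/-1 reward values, and the 0.05 distance threshold are assumptions made for this example; the abstract does not specify them.

```python
import math

def sparse_reward(achieved_goal, desired_goal, threshold=0.05):
    """Binary goal-conditioned reward (illustrative values, not from the abstract).

    Returns 0.0 when the achieved goal is within `threshold` of the desired
    goal (Euclidean distance), and -1.0 otherwise.
    """
    distance = math.dist(achieved_goal, desired_goal)
    return 0.0 if distance < threshold else -1.0

# The desired goal is the "additional input" that tells the agent what to do.
near = sparse_reward((0.0, 0.0, 0.0), (0.0, 0.0, 0.01))
far = sparse_reward((0.0, 0.0, 0.0), (1.0, 0.0, 0.0))
```

Because the reward carries no information about how far the agent is from the goal, most steps of an untrained policy look identical, which is what makes these tasks hard.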

Hindsight Experience Replay

Feb 23, 2018

Marcin Andrychowicz, Filip Wolski, Alex Ray, Jonas Schneider, Rachel Fong, Peter Welinder, Bob McGrew, Josh Tobin, Pieter Abbeel, Wojciech Zaremba

Abstract: Dealing with sparse rewards is one of the biggest challenges in Reinforcement Learning (RL). We present a novel technique called Hindsight Experience Replay which allows sample-efficient learning from rewards which are sparse and binary and therefore avoids the need for complicated reward engineering. It can be combined with an arbitrary off-policy RL algorithm and may be seen as a form of implicit curriculum. We demonstrate our approach on the task of manipulating objects with a robotic arm. In particular, we run experiments on three different tasks: pushing, sliding, and pick-and-place, in each case using only binary rewards indicating whether or not the task is completed. Our ablation studies show that Hindsight Experience Replay is a crucial ingredient which makes training possible in these challenging environments. We show that our policies trained on a physics simulation can be deployed on a physical robot and successfully complete the task.
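
The abstract does not spell out the mechanism, but the idea usually associated with the name is to replay an episode as if the outcome that was actually reached had been the goal all along, so that even a failed episode yields some successful transitions. The sketch below shows one simple variant of that relabeling, assuming the final achieved state is used as the substitute goal. The dictionary keys, the helper names, and the 0/-1 reward convention are hypothetical choices for this example, not the authors' implementation.

```python
import math

def binary_reward(achieved, goal, tol=0.05):
    """0.0 if the achieved state is within `tol` of the goal, else -1.0 (assumed convention)."""
    return 0.0 if math.dist(achieved, goal) < tol else -1.0

def relabel_with_final_goal(episode, tol=0.05):
    """Return a copy of `episode` whose goal is the final achieved state.

    `episode` is a list of dicts with keys "obs", "action", "achieved"
    (the state reached after the action) and "goal". Rewards are recomputed
    against the substituted goal; the input episode is left unchanged.
    """
    hindsight_goal = episode[-1]["achieved"]
    relabeled = []
    for step in episode:
        relabeled.append({
            "obs": step["obs"],
            "action": step["action"],
            "achieved": step["achieved"],
            "goal": hindsight_goal,
            "reward": binary_reward(step["achieved"], hindsight_goal, tol),
        })
    return relabeled

# An episode that never reaches its original goal (5.0, 5.0).
episode = [
    {"obs": (0.0, 0.0), "action": 1, "achieved": (0.5, 0.0), "goal": (5.0, 5.0)},
    {"obs": (0.5, 0.0), "action": 1, "achieved": (1.0, 0.0), "goal": (5.0, 5.0)},
]
hindsight_episode = relabel_with_final_goal(episode)
```

Both the original and the relabeled transitions would then go into the replay buffer of whatever off-policy algorithm is being used.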

Asymmetric Actor Critic for Image-Based Robot Learning

Oct 18, 2017

Lerrel Pinto, Marcin Andrychowicz, Peter Welinder, Wojciech Zaremba, Pieter Abbeel

Abstract: Deep reinforcement learning (RL) has proven a powerful technique in many sequential decision-making domains. However, robotics poses many challenges for RL. Most notably, training on a physical system can be expensive and dangerous, which has sparked significant interest in learning control policies using a physics simulator. While several recent works have shown promising results in transferring policies trained in simulation to the real world, they often do not fully utilize the advantage of working with a simulator. In this work, we exploit the full state observability in the simulator to train better policies which take as input only partial observations (RGBD images). We do this by employing an actor-critic training algorithm in which the critic is trained on full states while the actor (or policy) gets rendered images as input. We show experimentally on a range of simulated tasks that using these asymmetric inputs significantly improves performance. Finally, we combine this method with domain randomization and show real robot experiments for several tasks like picking, pushing, and moving a block. We achieve this simulation-to-real-world transfer without training on any real-world data.

* Videos of experiments can be found at http://www.goo.gl/b57WTs
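
To make the asymmetry concrete, here is a deliberately tiny sketch of which input each component receives: the actor sees only the rendered observation, while the critic sees the full simulator state together with the action. The linear models, the `Sample` container, and all numbers are placeholders invented for this example; the paper's networks and training loop are not reproduced here.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class Sample:
    full_state: List[float]  # simulator-only state, e.g. object poses (hypothetical)
    image: List[float]       # flattened rendered observation (hypothetical)

def dot(weights, values):
    """Plain dot product, standing in for a neural network."""
    return sum(w * v for w, v in zip(weights, values))

def actor(actor_weights, image):
    """Policy: receives only the image and returns a scalar action."""
    return dot(actor_weights, image)

def critic(critic_weights, full_state, action):
    """Q-function: receives the full state plus the chosen action."""
    return dot(critic_weights, full_state + [action])

sample = Sample(full_state=[0.1, -0.2], image=[0.0, 1.0, 0.5, 0.25])
action = actor([0.5, 0.5, 0.0, 0.0], sample.image)
q_value = critic([1.0, 1.0, 2.0], sample.full_state, action)
```

Only the actor is needed at deployment time, so the privileged state is used during training in simulation but is never required on the real robot.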

Semisupervised Classifier Evaluation and Recalibration

Oct 08, 2012

Peter Welinder, Max Welling, Pietro Perona

Abstract: How many labeled examples are needed to estimate a classifier's performance on a new dataset? We study the case where data is plentiful, but labels are expensive. We show that by making a few reasonable assumptions on the structure of the data, it is possible to estimate performance curves, with confidence bounds, using a small number of ground truth labels. Our approach, which we call Semisupervised Performance Evaluation (SPE), is based on a generative model for the classifier's confidence scores. In addition to estimating the performance of classifiers on new datasets, SPE can be used to recalibrate a classifier by re-estimating the class-conditional confidence distributions.
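
One way to see how a generative model of confidence scores leads to performance curves: once class-conditional score distributions and a class prior are available, precision and recall at any threshold follow in closed form. The sketch below assumes, purely for illustration, that each class's scores are Gaussian and that the parameters have already been fitted; the fitting step, the confidence bounds, and the distribution families used in the paper are not shown, and `precision_recall` is a hypothetical helper name.

```python
from statistics import NormalDist

def precision_recall(threshold, pos_prior, pos_scores, neg_scores):
    """Precision and recall at `threshold` under a two-class score model.

    `pos_scores` and `neg_scores` are the (assumed Gaussian) distributions of
    the classifier's confidence score for positives and negatives, and
    `pos_prior` is the fraction of positives in the data.
    """
    recall = 1.0 - pos_scores.cdf(threshold)          # P(score > t | positive)
    false_pos_rate = 1.0 - neg_scores.cdf(threshold)  # P(score > t | negative)
    true_pos_mass = pos_prior * recall
    false_pos_mass = (1.0 - pos_prior) * false_pos_rate
    predicted_pos = true_pos_mass + false_pos_mass
    precision = true_pos_mass / predicted_pos if predicted_pos > 0 else 1.0
    return precision, recall

# Sweep thresholds to trace an estimated precision-recall curve.
positives = NormalDist(mu=1.0, sigma=1.0)
negatives = NormalDist(mu=-1.0, sigma=1.0)
curve = [precision_recall(t / 2.0, 0.5, positives, negatives) for t in range(-4, 5)]
```

Recalibration in this picture amounts to re-estimating `positives`, `negatives`, and the prior on the new dataset and reading off updated scores or thresholds.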
