Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Matthias Plappert

Multi-Goal Reinforcement Learning: Challenging Robotics Environments and Request for Research

Mar 10, 2018
Matthias Plappert, Marcin Andrychowicz, Alex Ray, Bob McGrew, Bowen Baker, Glenn Powell, Jonas Schneider, Josh Tobin, Maciek Chociej, Peter Welinder, Vikash Kumar, Wojciech Zaremba

Figure 1 for Multi-Goal Reinforcement Learning: Challenging Robotics Environments and Request for Research

Figure 2 for Multi-Goal Reinforcement Learning: Challenging Robotics Environments and Request for Research

Figure 3 for Multi-Goal Reinforcement Learning: Challenging Robotics Environments and Request for Research

Figure 4 for Multi-Goal Reinforcement Learning: Challenging Robotics Environments and Request for Research

The purpose of this technical report is two-fold. First of all, it introduces a suite of challenging continuous control tasks (integrated with OpenAI Gym) based on currently existing robotics hardware. The tasks include pushing, sliding and pick & place with a Fetch robotic arm as well as in-hand object manipulation with a Shadow Dexterous Hand. All tasks have sparse binary rewards and follow a Multi-Goal Reinforcement Learning (RL) framework in which an agent is told what to do using an additional input. The second part of the paper presents a set of concrete research ideas for improving RL algorithms, most of which are related to Multi-Goal RL and Hindsight Experience Replay.

Via

Access Paper or Ask Questions

Parameter Space Noise for Exploration

Jan 31, 2018
Matthias Plappert, Rein Houthooft, Prafulla Dhariwal, Szymon Sidor, Richard Y. Chen, Xi Chen, Tamim Asfour, Pieter Abbeel, Marcin Andrychowicz

Figure 1 for Parameter Space Noise for Exploration

Figure 2 for Parameter Space Noise for Exploration

Figure 3 for Parameter Space Noise for Exploration

Figure 4 for Parameter Space Noise for Exploration

Deep reinforcement learning (RL) methods generally engage in exploratory behavior through noise injection in the action space. An alternative is to add noise directly to the agent's parameters, which can lead to more consistent exploration and a richer set of behaviors. Methods such as evolutionary strategies use parameter perturbations, but discard all temporal structure in the process and require significantly more samples. Combining parameter noise with traditional RL methods allows to combine the best of both worlds. We demonstrate that both off- and on-policy methods benefit from this approach through experimental comparison of DQN, DDPG, and TRPO on high-dimensional discrete action environments as well as continuous control tasks. Our results show that RL with parameter noise learns more efficiently than traditional RL with action space noise and evolutionary strategies individually.

* Updated to camera-ready ICLR submission

Via

Access Paper or Ask Questions

Classification of Human Whole-Body Motion using Hidden Markov Models

May 05, 2016
Matthias Plappert

Figure 1 for Classification of Human Whole-Body Motion using Hidden Markov Models

Figure 2 for Classification of Human Whole-Body Motion using Hidden Markov Models

Figure 3 for Classification of Human Whole-Body Motion using Hidden Markov Models

Figure 4 for Classification of Human Whole-Body Motion using Hidden Markov Models

Human motion plays an important role in many fields. Large databases exist that store and make available recordings of human motions. However, annotating each motion with multiple labels is a cumbersome and error-prone process. This bachelor's thesis presents different approaches to solve the multi-label classification problem using Hidden Markov Models (HMMs). First, different features that can be directly obtained from the raw data are introduced. Next, additional features are derived to improve classification performance. These features are then used to perform the multi-label classification using two different approaches. The first approach simply transforms the multi-label problem into a multi-class problem. The second, novel approach solves the same problem without the need to construct a transformation by predicting the labels directly from the likelihood scores. The second approach scales linearly with the number of labels whereas the first approach is subject to combinatorial explosion. All aspects of the classification process are evaluated on a data set that consists of 454 motions. System 1 achieves an accuracy of 98.02% and system 2 an accuracy of 93.39% on the test set.

Via

Access Paper or Ask Questions