Title: PlayVirtual: Augmenting Cycle-Consistent Virtual Trajectories for Reinforcement Learning

Jun 08, 2021

Tao Yu, Cuiling Lan, Wenjun Zeng, Mingxiao Feng, Zhibo Chen

Abstract: Learning good feature representations is important for deep reinforcement learning (RL). However, with limited experience, RL often suffers from data inefficiency during training. For unexperienced or less-experienced trajectories (i.e., state-action sequences), the lack of data limits their use for better feature learning. In this work, we propose a novel method, dubbed PlayVirtual, which augments cycle-consistent virtual trajectories to enhance the data efficiency of RL feature representation learning. Specifically, PlayVirtual predicts future states from the current state and action with a dynamics model, and then predicts the previous states with a backward dynamics model, which forms a trajectory cycle. Based on this, we augment the actions to generate a large number of virtual state-action trajectories. Since these virtual trajectories come without ground-truth state supervision, we enforce each trajectory to satisfy the cycle-consistency constraint, which can significantly enhance data efficiency. We validate the effectiveness of our designs on the Atari and DeepMind Control Suite benchmarks. Our method outperforms the current state-of-the-art methods by a large margin on both benchmarks.
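The trajectory cycle described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the toy additive dynamics, the one-hot action embeddings `E`, the squared-error distance, and all function names below are assumptions chosen only to make the idea concrete. A latent state is rolled forward under a sequence of sampled (virtual) actions, rolled back under the same actions in reverse order, and the distance between the recovered and original state serves as the loss, so no ground-truth future states are needed.

```python
import numpy as np

# Hypothetical toy setup: 3 discrete actions, 4-dimensional latent state.
n_actions, dim, K = 3, 4, 3
E = np.eye(n_actions, dim)          # assumed one-hot action embeddings
rng = np.random.default_rng(0)

def fwd(z, a):
    """Toy forward dynamics model: next latent state from (state, action)."""
    return z + E[a]

def bwd(z, a):
    """Toy backward dynamics model: previous latent state from (state, action)."""
    return z - E[a]

def bwd_mismatched(z, a):
    """A backward model that does not invert fwd, for comparison."""
    return z - 0.5 * E[a]

def cycle_consistency_loss(z0, actions, forward_model, backward_model):
    """Roll z0 forward through `actions`, then backward in reverse order,
    and return the mean squared error between the start and recovered state."""
    z = z0
    for a in actions:
        z = forward_model(z, a)
    for a in actions[::-1]:
        z = backward_model(z, a)
    return float(np.mean((z - z0) ** 2))

z0 = rng.standard_normal(dim)                       # current latent state
virtual_actions = rng.integers(0, n_actions, size=K)  # augmented (sampled) actions

loss_exact = cycle_consistency_loss(z0, virtual_actions, fwd, bwd)
loss_bad = cycle_consistency_loss(z0, virtual_actions, fwd, bwd_mismatched)
print(loss_exact, loss_bad)
```

In this toy case the loss is essentially zero when the backward model inverts the forward model and clearly positive when it does not. In a learned setting, both models would be trainable networks and this loss would be minimized over many sampled action sequences.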

View paper on OpenReview
