Title: The Information Geometry of Unsupervised Reinforcement Learning

Oct 06, 2021

Benjamin Eysenbach, Ruslan Salakhutdinov, Sergey Levine

Abstract: How can a reinforcement learning (RL) agent prepare to solve downstream tasks if those tasks are not known a priori? One approach is unsupervised skill discovery, a class of algorithms that learn a set of policies without access to a reward function. Such algorithms bear a close resemblance to representation learning algorithms (e.g., contrastive learning) in supervised learning, in that both are pretraining algorithms that maximize some approximation to a mutual information objective. While prior work has shown that the set of skills learned by such methods can accelerate downstream RL tasks, prior work offers little analysis into whether these skill learning algorithms are optimal, or even what notion of optimality would be appropriate to apply to them. In this work, we show that unsupervised skill discovery algorithms based on mutual information maximization do not learn skills that are optimal for every possible reward function. However, we show that the distribution over skills provides an optimal initialization minimizing regret against adversarially-chosen reward functions, assuming a certain type of adaptation procedure. Our analysis also provides a geometric perspective on these skill learning methods.
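To make the "mutual information objective" mentioned in the abstract concrete, here is a minimal tabular sketch. It is an illustration only, not the authors' implementation: the abstract does not say which variables the mutual information is taken over, so the choice of states S and skills Z, the function name `mutual_information`, and the array layout are all assumptions made for this example.

```python
import numpy as np

def mutual_information(p_z, p_s_given_z):
    """Return I(S; Z) in nats for a discrete skill prior and per-skill
    state distributions.

    Assumed (hypothetical) layout, not taken from the paper:
      p_z          -- shape [K], prior probability of each skill
      p_s_given_z  -- shape [K, N], row k is the state distribution of skill k
    """
    p_z = np.asarray(p_z, dtype=float)
    p_s_given_z = np.asarray(p_s_given_z, dtype=float)

    # Marginal state distribution: p(s) = sum_z p(z) p(s | z).
    p_s = p_z @ p_s_given_z

    mi = 0.0
    for k in range(p_z.shape[0]):
        for s in range(p_s_given_z.shape[1]):
            p = p_s_given_z[k, s]
            # Terms with zero probability contribute nothing (0 * log 0 = 0).
            if p_z[k] > 0.0 and p > 0.0:
                mi += p_z[k] * p * np.log(p / p_s[s])
    return mi
```

For example, two skills that visit disjoint states under a uniform prior, `mutual_information([0.5, 0.5], [[1, 0], [0, 1]])`, give log 2 nats, while two skills with identical state distributions give 0: the objective rewards skills that are distinguishable from the states they visit.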

View paper on OpenReview
