Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Mathieu Petitbois

Offline Learning of Controllable Diverse Behaviors

Apr 25, 2025

Mathieu Petitbois, Rémy Portelas, Sylvain Lamprier, Ludovic Denoyer

Abstract:Imitation Learning (IL) techniques aim to replicate human behaviors in specific tasks. While IL has gained prominence due to its effectiveness and efficiency, traditional methods often focus on datasets collected from experts to produce a single efficient policy. Recently, extensions have been proposed to handle datasets of diverse behaviors by mainly focusing on learning transition-level diverse policies or on performing entropy maximization at the trajectory level. While these methods may lead to diverse behaviors, they may not be sufficient to reproduce the actual diversity of demonstrations or to allow controlled trajectory generation. To overcome these drawbacks, we propose a different method based on two key features: a) Temporal Consistency that ensures consistent behaviors across entire episodes and not just at the transition level as well as b) Controllability obtained by constructing a latent space of behaviors that allows users to selectively activate specific behaviors based on their requirements. We compare our approach to state-of-the-art methods over a diverse set of tasks and environments. Project page: https://mathieu-petitbois.github.io/projects/swr/

* Generative Models for Robot Learning Workshop at ICLR 2025

Via

Access Paper or Ask Questions

Navigation with QPHIL: Quantizing Planner for Hierarchical Implicit Q-Learning

Nov 12, 2024

Alexi Canesse, Mathieu Petitbois, Ludovic Denoyer, Sylvain Lamprier, Rémy Portelas

Figure 1 for Navigation with QPHIL: Quantizing Planner for Hierarchical Implicit Q-Learning

Figure 2 for Navigation with QPHIL: Quantizing Planner for Hierarchical Implicit Q-Learning

Figure 3 for Navigation with QPHIL: Quantizing Planner for Hierarchical Implicit Q-Learning

Figure 4 for Navigation with QPHIL: Quantizing Planner for Hierarchical Implicit Q-Learning

Abstract:Offline Reinforcement Learning (RL) has emerged as a powerful alternative to imitation learning for behavior modeling in various domains, particularly in complex navigation tasks. An existing challenge with Offline RL is the signal-to-noise ratio, i.e. how to mitigate incorrect policy updates due to errors in value estimates. Towards this, multiple works have demonstrated the advantage of hierarchical offline RL methods, which decouples high-level path planning from low-level path following. In this work, we present a novel hierarchical transformer-based approach leveraging a learned quantizer of the space. This quantization enables the training of a simpler zone-conditioned low-level policy and simplifies planning, which is reduced to discrete autoregressive prediction. Among other benefits, zone-level reasoning in planning enables explicit trajectory stitching rather than implicit stitching based on noisy value function estimates. By combining this transformer-based planner with recent advancements in offline RL, our proposed approach achieves state-of-the-art results in complex long-distance navigation environments.

* Under review. Code will be released upon acceptance

Via

Access Paper or Ask Questions