Matin Moezzi

Data-Efficient Hierarchical Goal-Conditioned Reinforcement Learning via Normalizing Flows

Feb 11, 2026

Shaswat Garg, Matin Moezzi, Brandon Da Silva

Abstract: Hierarchical goal-conditioned reinforcement learning (H-GCRL) provides a powerful framework for tackling complex, long-horizon tasks by decomposing them into structured subgoals. However, its practical adoption is hindered by poor data efficiency and limited policy expressivity, especially in offline or data-scarce regimes. This work introduces normalizing flow-based hierarchical implicit Q-learning (NF-HIQL), a novel framework that replaces unimodal Gaussian policies with expressive normalizing flow policies at both the high and low levels of the hierarchy. This design enables tractable log-likelihood computation, efficient sampling, and the ability to model rich multimodal behaviors. New theoretical guarantees are derived, including explicit KL-divergence bounds for real-valued non-volume preserving (RealNVP) policies and PAC-style sample efficiency results, showing that NF-HIQL preserves stability while improving generalization. Empirically, NF-HIQL is evaluated across diverse long-horizon tasks in locomotion, ball-dribbling, and multi-step manipulation from OGBench. NF-HIQL consistently outperforms prior goal-conditioned and hierarchical baselines, demonstrating superior robustness under limited data and highlighting the potential of flow-based architectures for scalable, data-efficient hierarchical reinforcement learning.

* 9 pages, 3 figures, IEEE International Conference on Robotics and Automation 2026
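
The abstract's claim that flow policies give "tractable log-likelihood computation" and "efficient sampling" can be illustrated with a single RealNVP-style affine coupling layer. The following is a minimal NumPy sketch, not the authors' implementation: the class name `AffineCoupling`, the helper `log_prob`, and the use of single linear maps in place of learned scale and shift networks are all assumptions made for illustration, and the layer is unconditional (a policy would additionally condition on state and goal).

```python
import numpy as np


class AffineCoupling:
    """One RealNVP-style affine coupling layer (illustrative sketch only)."""

    def __init__(self, half_dim, rng):
        # Hypothetical stand-ins for the scale and shift networks:
        # a real policy would use learned neural networks here.
        self.W_s = 0.1 * rng.standard_normal((half_dim, half_dim))
        self.W_t = 0.1 * rng.standard_normal((half_dim, half_dim))

    def _scale_shift(self, first_half):
        s = np.tanh(first_half @ self.W_s)  # bounded log-scale
        t = first_half @ self.W_t           # shift
        return s, t

    def forward(self, z):
        """Sampling direction: latent z -> output x, plus log|det J|."""
        z1, z2 = np.split(z, 2, axis=-1)
        s, t = self._scale_shift(z1)
        x2 = z2 * np.exp(s) + t
        return np.concatenate([z1, x2], axis=-1), s.sum(axis=-1)

    def inverse(self, x):
        """Density direction: x -> z, plus log|det J| of the inverse map."""
        x1, x2 = np.split(x, 2, axis=-1)
        s, t = self._scale_shift(x1)  # x1 equals z1, so s and t match forward
        z2 = (x2 - t) * np.exp(-s)
        return np.concatenate([x1, z2], axis=-1), -s.sum(axis=-1)


def log_prob(layer, x):
    """Exact log-density of x under a standard normal base distribution."""
    z, log_det_inv = layer.inverse(x)
    dim = z.shape[-1]
    base = -0.5 * (z ** 2).sum(axis=-1) - 0.5 * dim * np.log(2.0 * np.pi)
    return base + log_det_inv  # change-of-variables formula


rng = np.random.default_rng(0)
layer = AffineCoupling(half_dim=2, rng=rng)
z = rng.standard_normal((5, 4))   # five latent samples
x, log_det = layer.forward(z)     # efficient sampling: one forward pass
print(log_prob(layer, x).shape)   # one log-likelihood per sample
```

Because the first half of the vector passes through unchanged, the Jacobian is triangular and its log-determinant is just the sum of the log-scales, which is why both sampling and likelihood evaluation need only a single pass.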

A Comparison of Classical and Deep Reinforcement Learning Methods for HVAC Control

Aug 10, 2023

Marshall Wang, John Willes, Thomas Jiralerspong, Matin Moezzi

Abstract: Reinforcement learning (RL) is a promising approach for optimizing HVAC control. RL offers a framework for improving system performance, reducing energy consumption, and enhancing cost efficiency. We benchmark a popular classical RL method and a popular deep RL method (Q-Learning and Deep Q-Networks, respectively) across multiple HVAC environments and explore the practical considerations of model hyper-parameter selection and reward tuning. The findings provide insight for configuring RL agents in HVAC systems, promoting energy-efficient and cost-effective operation.
