Bobak Hashemi

IQL-TD-MPC: Implicit Q-Learning for Hierarchical Model Predictive Control

Jun 01, 2023

Rohan Chitnis, Yingchen Xu, Bobak Hashemi, Lucas Lehnert, Urun Dogan, Zheqing Zhu, Olivier Delalleau

Abstract: Model-based reinforcement learning (RL) has shown great promise due to its sample efficiency, but still struggles with long-horizon sparse-reward tasks, especially in offline settings where the agent learns from a fixed dataset. We hypothesize that model-based RL agents struggle in these environments due to a lack of long-term planning capabilities, and that planning in a temporally abstract model of the environment can alleviate this issue. In this paper, we make two key contributions: 1) we introduce an offline model-based RL algorithm, IQL-TD-MPC, that extends the state-of-the-art Temporal Difference Learning for Model Predictive Control (TD-MPC) with Implicit Q-Learning (IQL); 2) we propose to use IQL-TD-MPC as a Manager in a hierarchical setting with any off-the-shelf offline RL algorithm as a Worker. More specifically, we pre-train a temporally abstract IQL-TD-MPC Manager to predict "intent embeddings", which roughly correspond to subgoals, via planning. We empirically show that augmenting state representations with intent embeddings generated by an IQL-TD-MPC Manager significantly improves off-the-shelf offline RL agents' performance on some of the most challenging D4RL benchmark tasks. For instance, the offline RL algorithms AWAC, TD3-BC, DT, and CQL all get zero or near-zero normalized evaluation scores on the medium and large antmaze tasks, while our modification gives an average score over 40.
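
As a rough illustration of "augmenting state representations with intent embeddings", the sketch below concatenates a state vector with an embedding before it would be passed to a Worker. The abstract does not say how the augmentation is done, so concatenation is an assumption here, and `toy_manager` and `augment_state` are hypothetical names: `toy_manager` is a fixed random projection standing in for a pre-trained Manager, not the IQL-TD-MPC planner.

```python
import numpy as np


def toy_manager(state: np.ndarray, embed_dim: int = 4) -> np.ndarray:
    """Hypothetical stand-in for a pre-trained Manager.

    Uses a fixed random projection; the real Manager predicts intent
    embeddings via planning, which is not reproduced here.
    """
    rng = np.random.default_rng(0)  # fixed seed so the projection is repeatable
    weights = rng.standard_normal((state.shape[-1], embed_dim))
    return np.tanh(state @ weights)


def augment_state(state: np.ndarray, intent: np.ndarray) -> np.ndarray:
    """Append the intent embedding to the raw state (assumed scheme)."""
    return np.concatenate([state, intent], axis=-1)


state = np.ones(8)                           # toy 8-dimensional observation
intent = toy_manager(state)                  # 4-dimensional intent embedding
worker_input = augment_state(state, intent)  # 12-dimensional Worker input
```

Under this assumption the Worker algorithm itself is unchanged; only its input dimension grows by the size of the embedding.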

LHC analysis-specific datasets with Generative Adversarial Networks

Jan 16, 2019

Bobak Hashemi, Nick Amin, Kaustuv Datta, Dominick Olivito, Maurizio Pierini

Abstract: Using generative adversarial networks (GANs), we investigate the possibility of creating large amounts of analysis-specific simulated LHC events at limited computing cost. This kind of generative model is analysis specific in the sense that it directly generates the high-level features used in the last stage of a given physics analysis, learning the N-dimensional distribution of relevant features in the context of a specific analysis selection. We apply this idea to the generation of muon four-momenta in $Z \to \mu\mu$ events at the LHC. We highlight how use-case specific issues emerge when the distributions of the considered quantities exhibit particular features. We show how substantial performance improvements and convergence speed-up can be obtained by including regression terms in the loss function of the generator. We develop an objective criterion to assess the generator performance in a quantitative way. With further development, a generalization of this approach could substantially reduce the needed amount of centrally produced fully simulated events in large particle physics experiments.

* 14 pages, 11 figures
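
To make "including regression terms in the loss function of the generator" concrete, here is a minimal sketch that adds a mean-squared-error penalty on the dimuon invariant mass to an adversarial loss. The abstract does not state which quantities are regressed or how the terms are weighted, so the choice of invariant mass, the `weight` parameter, and the function names are all assumptions made for illustration. The invariant mass formula itself, $m^2 = (E_1 + E_2)^2 - |\vec{p}_1 + \vec{p}_2|^2$ in natural units, is standard.

```python
import numpy as np


def invariant_mass(p1: np.ndarray, p2: np.ndarray) -> np.ndarray:
    """Invariant mass of two four-momenta given as (E, px, py, pz)."""
    total = p1 + p2
    m2 = total[..., 0] ** 2 - np.sum(total[..., 1:] ** 2, axis=-1)
    return np.sqrt(np.maximum(m2, 0.0))  # clip tiny negatives from rounding


def generator_loss(adv_loss: float, gen_mass: np.ndarray,
                   target_mass: np.ndarray, weight: float = 1.0) -> float:
    """Adversarial loss plus an assumed MSE regression term on the mass."""
    regression = np.mean((gen_mass - target_mass) ** 2)
    return float(adv_loss + weight * regression)


# Two back-to-back massless muons, energies in GeV.
mu_plus = np.array([45.6, 0.0, 0.0, 45.6])
mu_minus = np.array([45.6, 0.0, 0.0, -45.6])
mass = invariant_mass(mu_plus, mu_minus)

loss = generator_loss(adv_loss=0.5, gen_mass=np.array([mass]),
                      target_mass=np.array([90.2]), weight=2.0)
```

The regression term gives the generator a direct gradient toward a physically meaningful target, which is one plausible reading of why such terms could speed up convergence.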
