Title: Variable-Decision Frequency Option Critic

Dec 11, 2022

Amirmohammad Karimi, Jun Jin, Jun Luo, A. Rupam Mahmood, Martin Jagersand, Samuele Tosatto

Abstract: In classic reinforcement learning algorithms, agents make decisions at discrete, fixed time intervals. The physical duration between one decision and the next becomes a critical hyperparameter. When this duration is too short, the agent needs to make many decisions to achieve its goal, which makes the problem harder. When it is too long, the agent becomes incapable of controlling the system. Physical systems, however, do not need a constant control frequency. For learning agents, it is desirable to operate at low frequency when possible and at high frequency when necessary. We propose a framework called Continuous-Time Continuous-Options (CTCO), in which the agent chooses options as sub-policies of variable durations. Such options are time-continuous and can interact with the system at any desired frequency, providing a smooth change of actions. The empirical analysis shows that our algorithm is competitive with other time-abstraction techniques, such as classic option learning and action repetition, and in practice overcomes the difficult choice of the decision frequency.

* Submitted to the 2023 International Conference on Robotics and Automation (ICRA). Source code at https://github.com/amir-karimi96/continuous-time-continuous-option-policy-gradient.git
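
The abstract's central idea, an option of variable duration that can be queried at any control frequency, can be illustrated with a small sketch. This is not the authors' implementation: `LinearOption`, `action_at`, and `rollout` are hypothetical names, and the linear interpolation between two actions is an assumption chosen only to show a smooth, time-continuous sub-policy. The point is that the number of decisions (options chosen) stays the same while the interaction frequency changes.

```python
from dataclasses import dataclass


@dataclass
class LinearOption:
    """Hypothetical time-continuous option (assumption, not from the paper):
    interpolates linearly from start_action to end_action over `duration` seconds."""
    start_action: float
    end_action: float
    duration: float

    def action_at(self, t: float) -> float:
        # Clamp elapsed time to [0, duration] so the option is defined for any query time.
        frac = min(max(t / self.duration, 0.0), 1.0)
        return self.start_action + frac * (self.end_action - self.start_action)


def rollout(options, control_dt):
    """Query each option every `control_dt` seconds and return (time, action) samples."""
    samples = []
    t_global = 0.0
    for opt in options:
        steps = max(1, round(opt.duration / control_dt))
        for k in range(steps):
            t_local = k * control_dt
            samples.append((t_global + t_local, opt.action_at(t_local)))
        t_global += opt.duration
    return samples


# Two decisions: a long, slow option followed by a short one.
options = [LinearOption(0.0, 1.0, 2.0), LinearOption(1.0, 0.5, 0.5)]

low_rate = rollout(options, control_dt=0.5)   # coarse interaction with the system
high_rate = rollout(options, control_dt=0.1)  # fine interaction, same two decisions
```

In this sketch the agent makes two decisions in both rollouts; only the sampling rate of the resulting action trajectory differs, which is the separation between decision frequency and control frequency that the abstract describes.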
