Soft Q-network

Dec 20, 2019
Jingbin Liu, Xinyang Gu, Shuai Liu, Dexiang Zhang

When DeepMind announced DQN in 2013, its simplicity and promising results surprised the whole world, but the method's low efficiency and poor stability made it hard to apply to many problems. In the years since, people have proposed more and more complicated improvements, many of which use distributed deep RL and need a large number of cores to run the simulators. However, the basic idea behind many of these techniques is often just a modified DQN. So we ask a simple question: is there a more elegant way to improve the DQN model? Instead of adding more and more small fixes to it, we redesign the problem setting under the popular entropy regularization framework, which leads to better performance and a theoretical guarantee. Finally, we propose SQN, a new off-policy algorithm with better performance and stability.

* arXiv admin note: substantial text overlap with arXiv:1912.01557, arXiv:1908.11494; text overlap with arXiv:1812.05905, arXiv:1710.02298, arXiv:1801.01290 by other authors 
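
For readers unfamiliar with the entropy regularization framework the abstract refers to, the sketch below shows how a standard maximum-entropy (soft) Q-learning backup differs from the DQN backup: the hard max over next-state action values is replaced by a temperature-scaled log-sum-exp. This is a generic illustration and not necessarily the SQN update itself, which the abstract does not spell out. The function names and the parameters `alpha` (temperature) and `gamma` (discount factor) are assumptions chosen for illustration.

```python
import math

def soft_value(q_values, alpha):
    """Soft state value alpha * log(sum(exp(Q / alpha))), computed stably."""
    m = max(q_values)  # subtract the max before exponentiating to avoid overflow
    return m + alpha * math.log(sum(math.exp((q - m) / alpha) for q in q_values))

def soft_policy(q_values, alpha):
    """Boltzmann policy: pi(a|s) is proportional to exp(Q(s, a) / alpha)."""
    v = soft_value(q_values, alpha)
    return [math.exp((q - v) / alpha) for q in q_values]

def soft_td_target(reward, next_q_values, alpha, gamma, done):
    """One-step target r + gamma * V_soft(s').

    DQN would use max(next_q_values) in place of the soft value.
    """
    if done:
        return reward
    return reward + gamma * soft_value(next_q_values, alpha)

# Hypothetical Q-values for a state with three discrete actions.
q_next = [1.0, 2.0, 3.0]
probs = soft_policy(q_next, alpha=0.5)
target = soft_td_target(reward=1.0, next_q_values=q_next,
                        alpha=0.5, gamma=0.99, done=False)
```

As `alpha` shrinks toward zero, the soft value approaches the hard max and the target reduces to the usual DQN target; larger values of `alpha` spread probability across actions and encourage exploration.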
