Gradient Q$(σ, λ)$: A Unified Algorithm with Function Approximation for Reinforcement Learning

Sep 06, 2019
Long Yang, Yu Zhang, Qian Zheng, Pengfei Li, Gang Pan


FiDi-RL: Incorporating Deep Reinforcement Learning with Finite-Difference Policy Search for Efficient Learning of Continuous Control

Jul 10, 2019
Longxiang Shi, Shijian Li, Longbing Cao, Long Yang, Gang Zheng, Gang Pan

* 9 pages 

Policy Optimization with Stochastic Mirror Descent

Jun 25, 2019
Long Yang, Yu Zhang


Expected Sarsa($λ$) with Control Variate for Variance Reduction

Jun 25, 2019
Long Yang, Yu Zhang


TBQ($σ$): Improving Efficiency of Trace Utilization for Off-Policy Reinforcement Learning

May 17, 2019
Longxiang Shi, Shijian Li, Longbing Cao, Long Yang, Gang Pan

* 8 pages 

Beetle Swarm Optimization Algorithm: Theory and Application

Aug 01, 2018
Tiantian Wang, Long Yang, Qiang Liu


Qualitative Measurements of Policy Discrepancy for Return-based Deep Q-Network

Jul 08, 2018
Wenjia Meng, Qian Zheng, Long Yang, Pengfei Li, Gang Pan


A Unified Approach for Multi-step Temporal-Difference Learning with Eligibility Traces in Reinforcement Learning

Feb 09, 2018
Long Yang, Minhao Shi, Qian Zheng, Wenjia Meng, Gang Pan

