Long Yang

Penalized Proximal Policy Optimization for Safe Reinforcement Learning

May 24, 2022
Linrui Zhang, Li Shen, Long Yang, Shixiang Chen, Bo Yuan, Xueqian Wang, Dacheng Tao

A Review of Safe Reinforcement Learning: Methods, Theory and Applications

May 23, 2022
Shangding Gu, Long Yang, Yali Du, Guang Chen, Florian Walter, Jun Wang, Yaodong Yang, Alois Knoll

CUP: A Conservative Update Policy Algorithm for Safe Reinforcement Learning

Feb 15, 2022
Long Yang, Jiaming Ji, Juntao Dai, Yu Zhang, Pengfei Li, Gang Pan

Secure Hybrid Beamforming for IRS-Assisted Millimeter Wave Systems

Jan 09, 2022
Xuan Xue, Yongchao Wang, Long Yang, Jian Chen

Thompson Sampling for Unimodal Bandits

Jun 16, 2021
Long Yang, Zhao Li, Zehong Hu, Shasha Ruan, Shijian Li, Gang Pan, Hongyang Chen

On Convergence of Gradient Expected Sarsa($λ$)

Dec 14, 2020
Long Yang, Gang Zheng, Yu Zhang, Qian Zheng, Pengfei Li, Gang Pan

Sample Complexity of Policy Gradient Finding Second-Order Stationary Points

Dec 02, 2020
Long Yang, Qian Zheng, Gang Pan

Gradient Q$(σ, λ)$: A Unified Algorithm with Function Approximation for Reinforcement Learning

Sep 06, 2019
Long Yang, Yu Zhang, Qian Zheng, Pengfei Li, Gang Pan

FiDi-RL: Incorporating Deep Reinforcement Learning with Finite-Difference Policy Search for Efficient Learning of Continuous Control

Jul 10, 2019
Longxiang Shi, Shijian Li, Longbing Cao, Long Yang, Gang Zheng, Gang Pan
