Shangding Gu

TeaMs-RL: Teaching LLMs to Teach Themselves Better Instructions via Reinforcement Learning

Mar 13, 2024
Shangding Gu, Alois Knoll, Ming Jin

Mutual Enhancement of Large Language and Reinforcement Learning Models through Bi-Directional Feedback Mechanisms: A Case Study

Jan 12, 2024
Shangding Gu

Spreeze: High-Throughput Parallel Reinforcement Learning Framework

Dec 11, 2023
Jing Hou, Guang Chen, Ruiqi Zhang, Zhijun Li, Shangding Gu, Changjun Jiang

SCPO: Safe Reinforcement Learning with Safety Critic Policy Optimization

Nov 01, 2023
Jaafar Mhamed, Shangding Gu

A Human-Centered Safe Robot Reinforcement Learning Framework with Interactive Behaviors

Mar 02, 2023
Shangding Gu, Alap Kshirsagar, Yali Du, Guang Chen, Jan Peters, Alois Knoll

A Review of Safe Reinforcement Learning: Methods, Theory and Applications

May 23, 2022
Shangding Gu, Long Yang, Yali Du, Guang Chen, Florian Walter, Jun Wang, Yaodong Yang, Alois Knoll

A Circle Grid-based Approach for Obstacle Avoidance Motion Planning of Unmanned Surface Vehicles

Feb 09, 2022
Man Zhu, Changshi Xiao, Shangding Gu, Zhe Du, Yuanqiao Wen

Multi-Agent Constrained Policy Optimisation

Oct 06, 2021
Shangding Gu, Jakub Grudzien Kuba, Muning Wen, Ruiqing Chen, Ziyan Wang, Zheng Tian, Jun Wang, Alois Knoll, Yaodong Yang

Settling the Variance of Multi-Agent Policy Gradients

Aug 20, 2021
Jakub Grudzien Kuba, Muning Wen, Yaodong Yang, Linghui Meng, Shangding Gu, Haifeng Zhang, David Henry Mguni, Jun Wang
