Deheng Ye

Tencent Inc

RLTF: Reinforcement Learning from Unit Test Feedback

Jul 10, 2023

Future-conditioned Unsupervised Pretraining for Decision Transformer

May 26, 2023

Deploying Offline Reinforcement Learning with Human Feedback

Mar 13, 2023

Sample Dropout: A Simple yet Effective Variance Reduction Technique in Deep Policy Optimization

Feb 05, 2023

Revisiting Estimation Bias in Policy Gradients for Deep Reinforcement Learning

Jan 20, 2023

A Survey on Transformers in Reinforcement Learning

Jan 08, 2023

RLogist: Fast Observation Strategy on Whole-slide Images with Deep Reinforcement Learning

Dec 13, 2022

Pretraining in Deep Reinforcement Learning: A Survey

Nov 08, 2022

Curriculum-based Asymmetric Multi-task Reinforcement Learning

Nov 07, 2022

Robust Offline Reinforcement Learning with Gradient Penalty and Constraint Relaxation

Oct 19, 2022