Chenlu Ye

Towards Robust Model-Based Reinforcement Learning Against Adversarial Corruption

Feb 15, 2024

A Theoretical Analysis of Nash Learning from Human Feedback under General KL-Regularized Preference

Feb 11, 2024

Gibbs Sampling from Human Feedback: A Provable KL-constrained Framework for RLHF

Dec 18, 2023

Provably Efficient High-Dimensional Bandit Learning with Batched Feedbacks

Nov 24, 2023

Corruption-Robust Offline Reinforcement Learning with General Function Approximation

Oct 23, 2023

Optimal Sample Selection Through Uncertainty Estimation and Its Application in Deep Learning

Sep 05, 2023

Corruption-Robust Algorithms with Uncertainty Weighting for Nonlinear Contextual Bandits and Markov Decision Processes

Dec 12, 2022