Chenlu Ye

Daunce: Data Attribution through Uncertainty Estimation

May 29, 2025
Via arXiv

Self-rewarding correction for mathematical reasoning

Feb 26, 2025
Via arXiv

Logarithmic Regret for Online KL-Regularized Reinforcement Learning

Feb 11, 2025
Via arXiv

Catoni Contextual Bandits are Robust to Heavy-tailed Rewards

Feb 04, 2025
Via arXiv

Sharp Analysis for KL-Regularized Contextual Bandits and RLHF

Nov 07, 2024
Via arXiv

Towards Robust Model-Based Reinforcement Learning Against Adversarial Corruption

Feb 15, 2024
Via arXiv

A Theoretical Analysis of Nash Learning from Human Feedback under General KL-Regularized Preference

Feb 11, 2024
Via arXiv

Gibbs Sampling from Human Feedback: A Provable KL-constrained Framework for RLHF

Dec 18, 2023
Via arXiv

Provably Efficient High-Dimensional Bandit Learning with Batched Feedbacks

Nov 24, 2023
Via arXiv

Corruption-Robust Offline Reinforcement Learning with General Function Approximation

Oct 23, 2023
Via arXiv