Han Zhong

Sample-efficient Learning of Infinite-horizon Average-reward MDPs with General Function Approximation

Apr 19, 2024
Jianliang He, Han Zhong, Zhuoran Yang

Distributionally Robust Reinforcement Learning with Interactive Data Collection: Fundamental Hardness and Near-Optimal Algorithm

Apr 04, 2024
Miao Lu, Han Zhong, Tong Zhang, Jose Blanchet

Rewards-in-Context: Multi-objective Alignment of Foundation Models with Dynamic Preference Adjustment

Feb 25, 2024
Rui Yang, Xiaoman Pan, Feng Luo, Shuang Qiu, Han Zhong, Dong Yu, Jianshu Chen

Rethinking Model-based, Policy-based, and Value-based Reinforcement Learning via the Lens of Representation Complexity

Dec 28, 2023
Guhao Feng, Han Zhong

Gibbs Sampling from Human Feedback: A Provable KL-constrained Framework for RLHF

Dec 18, 2023
Wei Xiong, Hanze Dong, Chenlu Ye, Han Zhong, Nan Jiang, Tong Zhang

Horizon-Free and Instance-Dependent Regret Bounds for Reinforcement Learning with General Function Approximation

Dec 07, 2023
Jiayi Huang, Han Zhong, Liwei Wang, Lin F. Yang

Posterior Sampling for Competitive RL: Function Approximation and Partial Observation

Oct 30, 2023
Shuang Qiu, Ziyu Dai, Han Zhong, Zhaoran Wang, Zhuoran Yang, Tong Zhang

Towards Robust Offline Reinforcement Learning under Diverse Data Corruption

Oct 19, 2023
Rui Yang, Han Zhong, Jiawei Xu, Amy Zhang, Chongjie Zhang, Lei Han, Tong Zhang

Tackling Heavy-Tailed Rewards in Reinforcement Learning with Function Approximation: Minimax Optimal and Instance-Dependent Regret Bounds

Jun 12, 2023
Jiayi Huang, Han Zhong, Liwei Wang, Lin F. Yang
