
Heyang Zhao

Feel-Good Thompson Sampling for Contextual Dueling Bandits

Apr 09, 2024
Xuheng Li, Heyang Zhao, Quanquan Gu

A Nearly Optimal and Low-Switching Algorithm for Reinforcement Learning with General Function Approximation

Nov 26, 2023
Heyang Zhao, Jiafan He, Quanquan Gu

Pessimistic Nonlinear Least-Squares Value Iteration for Offline Reinforcement Learning

Oct 02, 2023
Qiwei Di, Heyang Zhao, Jiafan He, Quanquan Gu

Variance-Aware Regret Bounds for Stochastic Contextual Dueling Bandits

Oct 02, 2023
Qiwei Di, Tao Jin, Yue Wu, Heyang Zhao, Farzad Farnoud, Quanquan Gu

Variance-Dependent Regret Bounds for Linear Bandits and Reinforcement Learning: Adaptivity and Computational Efficiency

Feb 21, 2023
Heyang Zhao, Jiafan He, Dongruo Zhou, Tong Zhang, Quanquan Gu

Nearly Minimax Optimal Reinforcement Learning for Linear Markov Decision Processes

Dec 12, 2022
Jiafan He, Heyang Zhao, Dongruo Zhou, Quanquan Gu

Bandit Learning with General Function Classes: Heteroscedastic Noise and Variance-dependent Regret Bounds

Feb 28, 2022
Heyang Zhao, Dongruo Zhou, Jiafan He, Quanquan Gu

Linear Contextual Bandits with Adversarial Corruptions

Oct 25, 2021
Heyang Zhao, Dongruo Zhou, Quanquan Gu
