Yichun Hu

Practical Policy Optimization with Personalized Experimentation

Mar 30, 2023
Mia Garrard, Hanson Wang, Ben Letham, Shaun Singh, Abbas Kazerouni, Sarah Tan, Zehui Wang, Yin Huang, Yichun Hu, Chad Zhou, Norm Zhou, Eytan Bakshy

Via arXiv

Fast Rates for the Regret of Offline Reinforcement Learning

Jan 31, 2021
Yichun Hu, Nathan Kallus, Masatoshi Uehara

Via arXiv

Fast Rates for Contextual Linear Optimization

Nov 05, 2020
Yichun Hu, Nathan Kallus, Xiaojie Mao

Via arXiv

DTR Bandit: Learning to Make Response-Adaptive Decisions With Low Regret

Jun 05, 2020
Yichun Hu, Nathan Kallus

Via arXiv

Smooth Contextual Bandits: Bridging the Parametric and Non-differentiable Regret Regimes

Sep 05, 2019
Yichun Hu, Nathan Kallus, Xiaojie Mao

Via arXiv