Yichi Zhou

Stabilizing Policy Gradients for Stochastic Differential Equations via Consistency with Perturbation Process

Mar 07, 2024
Xiangxin Zhou, Liang Wang, Yichi Zhou

Simultaneously Learning Stochastic and Adversarial Bandits with General Graph Feedback

Jun 16, 2022
Fang Kong, Yichi Zhou, Shuai Li

Regularized OFU: an Efficient UCB Estimator for Non-linear Contextual Bandit

Jun 29, 2021
Yichi Zhou, Shihong Song, Huishuai Zhang, Jun Zhu, Wei Chen, Tie-Yan Liu

Lazy-CFR: a fast regret minimization algorithm for extensive games with imperfect information

Oct 10, 2018
Yichi Zhou, Tongzheng Ren, Jialian Li, Dong Yan, Jun Zhu

Label Aggregation via Finding Consensus Between Models

Jul 19, 2018
Chi Hong, Yichi Zhou

Racing Thompson: an Efficient Algorithm for Thompson Sampling with Non-conjugate Priors

Aug 16, 2017
Yichi Zhou, Jun Zhu, Jingwei Zhuo
