Zheqing Zhu

Pearl: A Production-ready Reinforcement Learning Agent

Dec 06, 2023
Zheqing Zhu, Rodrigo de Salvo Braz, Jalaj Bhandari, Daniel Jiang, Yi Wan, Yonathan Efroni, Liyuan Wang, Ruiyang Xu, Hongbo Guo, Alex Nikulkov, Dmytro Korenkevych, Urun Dogan, Frank Cheng, Zheng Wu, Wanqiao Xu

Via arXiv

Non-Stationary Contextual Bandit Learning via Neural Predictive Ensemble Sampling

Oct 14, 2023
Zheqing Zhu, Yueyang Liu, Xu Kuang, Benjamin Van Roy

Via arXiv

Offline Reinforcement Learning for Optimizing Production Bidding Policies

Oct 13, 2023
Dmytro Korenkevych, Frank Cheng, Artsiom Balakir, Alex Nikulkov, Lingnan Gao, Zhihao Cen, Zuobing Xu, Zheqing Zhu

Via arXiv

Scalable Neural Contextual Bandit for Recommender Systems

Jun 26, 2023
Zheqing Zhu, Benjamin Van Roy

Via arXiv

IQL-TD-MPC: Implicit Q-Learning for Hierarchical Model Predictive Control

Jun 01, 2023
Rohan Chitnis, Yingchen Xu, Bobak Hashemi, Lucas Lehnert, Urun Dogan, Zheqing Zhu, Olivier Delalleau

Via arXiv

Optimizing Long-term Value for Auction-Based Recommender Systems via On-Policy Reinforcement Learning

May 24, 2023
Ruiyang Xu, Jalaj Bhandari, Dmytro Korenkevych, Fan Liu, Yuchen He, Alex Nikulkov, Zheqing Zhu

Via arXiv

Optimism Based Exploration in Large-Scale Recommender Systems

Apr 05, 2023
Hongbo Guo, Ruben Naeff, Alex Nikulkov, Zheqing Zhu

Via arXiv

Deep Exploration for Recommendation Systems

Sep 26, 2021
Zheqing Zhu, Benjamin Van Roy

Via arXiv

Multi-Agent Safe Planning with Gaussian Processes

Aug 10, 2020
Zheqing Zhu, Erdem Bıyık, Dorsa Sadigh

Via arXiv