Jalaj Bhandari

Pearl: A Production-ready Reinforcement Learning Agent

Dec 06, 2023
Zheqing Zhu, Rodrigo de Salvo Braz, Jalaj Bhandari, Daniel Jiang, Yi Wan, Yonathan Efroni, Liyuan Wang, Ruiyang Xu, Hongbo Guo, Alex Nikulkov, Dmytro Korenkevych, Urun Dogan, Frank Cheng, Zheng Wu, Wanqiao Xu

Optimizing Long-term Value for Auction-Based Recommender Systems via On-Policy Reinforcement Learning

May 24, 2023
Ruiyang Xu, Jalaj Bhandari, Dmytro Korenkevych, Fan Liu, Yuchen He, Alex Nikulkov, Zheqing Zhu

A Note on the Linear Convergence of Policy Gradient Methods

Jul 21, 2020
Jalaj Bhandari, Daniel Russo

Global Optimality Guarantees For Policy Gradient Methods

Jun 05, 2019
Jalaj Bhandari, Daniel Russo

A Finite Time Analysis of Temporal Difference Learning With Linear Function Approximation

Nov 06, 2018
Jalaj Bhandari, Daniel Russo, Raghav Singal
