Wanqiao Xu

Pearl: A Production-ready Reinforcement Learning Agent

Dec 06, 2023
Zheqing Zhu, Rodrigo de Salvo Braz, Jalaj Bhandari, Daniel Jiang, Yi Wan, Yonathan Efroni, Liyuan Wang, Ruiyang Xu, Hongbo Guo, Alex Nikulkov, Dmytro Korenkevych, Urun Dogan, Frank Cheng, Zheng Wu, Wanqiao Xu

RLHF and IIA: Perverse Incentives

Dec 02, 2023
Wanqiao Xu, Shi Dong, Xiuyuan Lu, Grace Lam, Zheng Wen, Benjamin Van Roy

Shattering the Agent-Environment Interface for Fine-Tuning Inclusive Language Models

May 19, 2023
Wanqiao Xu, Shi Dong, Dilip Arumugam, Benjamin Van Roy

Posterior Sampling for Continuing Environments

Nov 29, 2022
Wanqiao Xu, Shi Dong, Benjamin Van Roy

Safely Bridging Offline and Online Reinforcement Learning

Oct 25, 2021
Wanqiao Xu, Kan Xu, Hamsa Bastani, Osbert Bastani
