Shi Dong

RLHF and IIA: Perverse Incentives

Dec 02, 2023
Wanqiao Xu, Shi Dong, Xiuyuan Lu, Grace Lam, Zheng Wen, Benjamin Van Roy

Via arXiv

Fine-Tuning Language Models with Advantage-Induced Policy Alignment

Jun 06, 2023
Banghua Zhu, Hiteshi Sharma, Felipe Vieira Frujeri, Shi Dong, Chenguang Zhu, Michael I. Jordan, Jiantao Jiao

Shattering the Agent-Environment Interface for Fine-Tuning Inclusive Language Models

May 19, 2023
Wanqiao Xu, Shi Dong, Dilip Arumugam, Benjamin Van Roy

Inclusive Artificial Intelligence

Dec 24, 2022
Dilip Arumugam, Shi Dong, Benjamin Van Roy

Posterior Sampling for Continuing Environments

Nov 29, 2022
Wanqiao Xu, Shi Dong, Benjamin Van Roy

A unified interpretable intelligent learning diagnosis framework for smart education

Jul 07, 2022
Zhifeng Wang, Wenxing Yan, Chunyan Zeng, Shi Dong

Simple Agent, Complex Environment: Efficient Reinforcement Learning with Agent State

Mar 08, 2021
Shi Dong, Benjamin Van Roy, Zhengyuan Zhou

Provably Efficient Reinforcement Learning with Aggregated States

Dec 13, 2019
Shi Dong, Benjamin Van Roy, Zhengyuan Zhou
