Shi Dong

RLHF and IIA: Perverse Incentives

Dec 02, 2023
Wanqiao Xu, Shi Dong, Xiuyuan Lu, Grace Lam, Zheng Wen, Benjamin Van Roy

Via arXiv

Fine-Tuning Language Models with Advantage-Induced Policy Alignment

Jun 06, 2023
Banghua Zhu, Hiteshi Sharma, Felipe Vieira Frujeri, Shi Dong, Chenguang Zhu, Michael I. Jordan, Jiantao Jiao

Via arXiv

Shattering the Agent-Environment Interface for Fine-Tuning Inclusive Language Models

May 19, 2023
Wanqiao Xu, Shi Dong, Dilip Arumugam, Benjamin Van Roy

Via arXiv

Inclusive Artificial Intelligence

Dec 24, 2022
Dilip Arumugam, Shi Dong, Benjamin Van Roy

Via arXiv

Posterior Sampling for Continuing Environments

Nov 29, 2022
Wanqiao Xu, Shi Dong, Benjamin Van Roy

Via arXiv

A unified interpretable intelligent learning diagnosis framework for smart education

Jul 07, 2022
Zhifeng Wang, Wenxing Yan, Chunyan Zeng, Shi Dong

Via arXiv

Simple Agent, Complex Environment: Efficient Reinforcement Learning with Agent State

Mar 08, 2021
Shi Dong, Benjamin Van Roy, Zhengyuan Zhou

Via arXiv

Provably Efficient Reinforcement Learning with Aggregated States

Dec 13, 2019
Shi Dong, Benjamin Van Roy, Zhengyuan Zhou

Via arXiv