Shentao Yang

Sequential Decision-Making for Inline Text Autocomplete

Mar 21, 2024
Rohan Chitnis, Shentao Yang, Alborz Geramifard

A Dense Reward View on Aligning Text-to-Image Diffusion with Preference

Feb 13, 2024
Shentao Yang, Tianqi Chen, Mingyuan Zhou

Preference-grounded Token-level Guidance for Language Model Fine-tuning

Jun 01, 2023
Shentao Yang, Shujian Zhang, Congying Xia, Yihao Feng, Caiming Xiong, Mingyuan Zhou

Fantastic Rewards and How to Tame Them: A Case Study on Reward Learning for Task-oriented Dialogue Systems

Feb 20, 2023
Yihao Feng, Shentao Yang, Shujian Zhang, Jianguo Zhang, Caiming Xiong, Mingyuan Zhou, Huan Wang

A Unified Framework for Alternating Offline Model Training and Policy Learning

Oct 12, 2022
Shentao Yang, Shujian Zhang, Yihao Feng, Mingyuan Zhou

Regularizing a Model-based Policy Stationary Distribution to Stabilize Offline Reinforcement Learning

Jun 14, 2022
Shentao Yang, Yihao Feng, Shujian Zhang, Mingyuan Zhou

A Regularized Implicit Policy for Offline Reinforcement Learning

Feb 19, 2022
Shentao Yang, Zhendong Wang, Huangjie Zheng, Yihao Feng, Mingyuan Zhou
