Dong Yan

Rethinking Information Structures in RLHF: Reward Generalization from a Graph Theory Perspective

Feb 20, 2024
Tianyi Qiu, Fanzhi Zeng, Jiaming Ji, Dong Yan, Kaile Wang, Jiayi Zhou, Han Yang, Josef Dai, Xuehai Pan, Yaodong Yang

Baichuan 2: Open Large-scale Language Models

Sep 20, 2023
Aiyuan Yang, Bin Xiao, Bingning Wang, Borong Zhang, Ce Bian, Chao Yin, Chenxu Lv, Da Pan, Dian Wang, Dong Yan, Fan Yang, Fei Deng, Feng Wang, Feng Liu, Guangwei Ai, Guosheng Dong, Haizhou Zhao, Hang Xu, Haoze Sun, Hongda Zhang, Hui Liu, Jiaming Ji, Jian Xie, JunTao Dai, Kun Fang, Lei Su, Liang Song, Lifeng Liu, Liyun Ru, Luyao Ma, Mang Wang, Mickel Liu, MingAn Lin, Nuolan Nie, Peidong Guo, Ruiyang Sun, Tao Zhang, Tianpeng Li, Tianyu Li, Wei Cheng, Weipeng Chen, Xiangrong Zeng, Xiaochuan Wang, Xiaoxi Chen, Xin Men, Xin Yu, Xuehai Pan, Yanjun Shen, Yiding Wang, Yiyu Li, Youxin Jiang, Yuchen Gao, Yupeng Zhang, Zenan Zhou, Zhiying Wu

Reward Informed Dreamer for Task Generalization in Reinforcement Learning

Mar 09, 2023
Chengyang Ying, Zhongkai Hao, Xinning Zhou, Hang Su, Songming Liu, Jialian Li, Dong Yan, Jun Zhu

Model-based Reinforcement Learning with a Hamiltonian Canonical ODE Network

Nov 02, 2022
Yao Feng, Yuhong Jiang, Hang Su, Dong Yan, Jun Zhu

On the Reuse Bias in Off-Policy Reinforcement Learning

Sep 15, 2022
Chengyang Ying, Zhongkai Hao, Xinning Zhou, Hang Su, Dong Yan, Jun Zhu

Towards Safe Reinforcement Learning via Constraining Conditional Value-at-Risk

Jun 09, 2022
Chengyang Ying, Xinning Zhou, Hang Su, Dong Yan, Ning Chen, Jun Zhu

Policy Learning for Robust Markov Decision Process with a Mismatched Generative Model

Mar 15, 2022
Jialian Li, Tongzheng Ren, Dong Yan, Hang Su, Jun Zhu

Policy Learning for Robust Markov Decision Process with a Mismatched Generative Model

Mar 13, 2022
Jialian Li, Tongzheng Ren, Dong Yan, Hang Su, Jun Zhu

Tianshou: a Highly Modularized Deep Reinforcement Learning Library

Jul 29, 2021
Jiayi Weng, Huayu Chen, Dong Yan, Kaichao You, Alexis Duburcq, Minghao Zhang, Hang Su, Jun Zhu