Yaodong Yang

Theoretically Guaranteed Policy Improvement Distilled from Model-Based Planning

Jul 24, 2023
Chuming Li, Ruonan Jia, Jie Liu, Yinmin Zhang, Yazhe Niu, Yaodong Yang, Yu Liu, Wanli Ouyang

Via arXiv

Safe DreamerV3: Safe Reinforcement Learning with World Models

Jul 14, 2023
Weidong Huang, Jiaming Ji, Borong Zhang, Chunhe Xia, Yaodong Yang

Via arXiv

BeaverTails: Towards Improved Safety Alignment of LLM via a Human-Preference Dataset

Jul 10, 2023
Jiaming Ji, Mickel Liu, Juntao Dai, Xuehai Pan, Chi Zhang, Ce Bian, Chi Zhang, Ruiyang Sun, Yizhou Wang, Yaodong Yang

Via arXiv

Policy Space Diversity for Non-Transitive Games

Jun 29, 2023
Jian Yao, Weiming Liu, Haobo Fu, Yaodong Yang, Stephen McAleer, Qiang Fu, Wei Yang

Via arXiv

Large Sequence Models for Sequential Decision-Making: A Survey

Jun 24, 2023
Muning Wen, Runji Lin, Hanjing Wang, Yaodong Yang, Ying Wen, Luo Mai, Jun Wang, Haifeng Zhang, Weinan Zhang

Via arXiv

Deep Reinforcement Learning with Multitask Episodic Memory Based on Task-Conditioned Hypernetwork

Jun 21, 2023
Yonggang Jin, Chenxu Wang, Liuyu Xiang, Yaodong Yang, Jie Fu, Zhaofeng He

Via arXiv

Maximum Entropy Heterogeneous-Agent Mirror Learning

Jun 19, 2023
Jiarong Liu, Yifan Zhong, Siyi Hu, Haobo Fu, Qiang Fu, Xiaojun Chang, Yaodong Yang

Via arXiv

Heterogeneous Value Evaluation for Large Language Models

Jun 01, 2023
Zhaowei Zhang, Nian Liu, Siyuan Qi, Ceyao Zhang, Ziqi Rong, Song-Chun Zhu, Shuguang Cui, Yaodong Yang

Via arXiv

OmniSafe: An Infrastructure for Accelerating Safe Reinforcement Learning Research

May 16, 2023
Jiaming Ji, Jiayi Zhou, Borong Zhang, Juntao Dai, Xuehai Pan, Ruiyang Sun, Weidong Huang, Yiran Geng, Mickel Liu, Yaodong Yang

Via arXiv

Heterogeneous-Agent Reinforcement Learning

Apr 19, 2023
Yifan Zhong, Jakub Grudzien Kuba, Siyi Hu, Jiaming Ji, Yaodong Yang

Via arXiv