Weilin Liu

Is DPO Superior to PPO for LLM Alignment? A Comprehensive Study

Apr 16, 2024
Shusheng Xu, Wei Fu, Jiaxuan Gao, Wenjie Ye, Weilin Liu, Zhiyu Mei, Guangju Wang, Chao Yu, Yi Wu

Learning Zero-Shot Cooperation with Humans, Assuming Humans Are Biased

Feb 03, 2023
Chao Yu, Jiaxuan Gao, Weilin Liu, Botian Xu, Hao Tang, Jiaqi Yang, Yu Wang, Yi Wu

Multi-Agent Vulnerability Discovery for Autonomous Driving with Hazard Arbitration Reward

Dec 12, 2021
Weilin Liu, Ye Mu, Chao Yu, Xuefei Ning, Zhong Cao, Yi Wu, Shuang Liang, Huazhong Yang, Yu Wang
